We aim to model an outcome measurement, our estimand: \(\color{green}{Y}\)
We have data on a number \((n)\) of observations \((i)\) (e.g. survey respondents, pupils, students, factory workers, events): \(i = 1, \dots, n\)
We assume that observations are independent of each other (e.g. different respondents randomly sampled from a population)
The outcome measurement has a grand mean across all observations \((\mu)\), and each observation \((i)\) has some deviation (error) from this mean \((e_i)\)
\[ y_i = \mu + \color{red}{e_i} \]
A brief review of single-level regression
Observed part: our observation, outcome, estimate, etc.; the left-hand-side of the model
Fixed part: this can be a simple sample mean \((\mu)\) of a single measurement as in our basic example (e.g. a social trust scale), but it can also be a regression equation containing several predictor variables, as we have seen in previous weeks (e.g. \(b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_p x_{pi}\) for a model with \(p\) predictors/independent variables; the coefficients are the same for every observation, only the predictor values vary with \(i\))
Random part: the deviations of the observations from the model mean
We also assume that the error term \((e)\) is Normally distributed around a mean of 0 and has some variance \((\sigma^2)\) that we are estimating
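The mean-plus-error model above can be illustrated with a small simulation. This is only a sketch: the values of `mu_true` and `sigma_true` and all object names are made up for illustration and are not taken from the survey data.

```r
# Simulate y_i = mu + e_i with e_i ~ Normal(0, sigma^2)
set.seed(1)
n <- 1000
mu_true <- 5      # hypothetical grand mean
sigma_true <- 2   # hypothetical error standard deviation

e <- rnorm(n, mean = 0, sd = sigma_true)
y <- mu_true + e

# An intercept-only regression estimates mu (intercept) and sigma (residual SD)
mod_mean <- lm(y ~ 1)
mu_hat <- unname(coef(mod_mean)[1])
sigma_hat <- sigma(mod_mean)

c(mu_hat = mu_hat, sigma_hat = sigma_hat)
```

In an intercept-only model the estimated intercept is the sample mean of `y` and the residual standard deviation equals the sample standard deviation, which is why this model is the natural starting point for the extensions that follow.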
# Import the data
osterman <- data_read("https://cgmoreh.github.io/HSS8005-24/data/osterman.dta")
cumulative European Social Survey (ESS) data, consisting of nine rounds from 2002 to 2018
data are weighted using ESS design weights (we will disregard this, so we can expect our results to differ somewhat)
follows the established approach of using a validated three-item scale to study generalised social trust
An applied example
The outcome variable of interest is an eleven-point scale measure of social trust:
The scale consists of the classic trust question, an item on whether people try to be fair, and an item on whether people are helpful:
Generally speaking, would you say that most people can be trusted, or that you can’t be too careful in dealing with people?
Do you think that most people would try to take advantage of you if they got the chance, or would they try to be fair?
Would you say that most of the time people try to be helpful or that they are mostly looking out for themselves?
All of the items may be answered on a scale from 0 to 10 (where 10 represents the highest level of trust) and the scale is calculated as the mean of the three items
The three-item scale improves measurement reliability and cross-country validity compared to using a single item, such as the classic trust question.
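As a rough sketch of how such a scale is computed, the row-wise mean of three items can be obtained with `rowMeans()`. The item names and values below are invented for illustration, and treating a respondent with any missing item as missing on the scale is an assumption, not necessarily the rule used in the original study.

```r
# Toy data: three 0-10 items for four hypothetical respondents
items <- data.frame(
  trust   = c(5, 7, NA, 2),
  fair    = c(6, 8, 4, 3),
  helpful = c(4, 9, 5, 1)
)

# Scale score = mean of the three items; NA if any item is missing (assumption)
items$trust_index <- rowMeans(items[, c("trust", "fair", "helpful")])

items
```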
Let’s start by fitting a single-level model of social trust as a function of education, age, gender, parental education and whether either parent was born abroad (i.e. the variable we computed earlier).
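A single-level model of this kind can be sketched with `lm()`. The example below runs on simulated data so that it is self-contained; the variable names (`education_years`, `parental_education`, `foreign_parent`, etc.) and the coefficients used to generate the data are hypothetical and do not correspond to the columns or results of the `osterman` dataset.

```r
# Simulated stand-in for the survey data (all names and values are hypothetical)
set.seed(123)
n <- 500
sim <- data.frame(
  education_years    = round(rnorm(n, mean = 13, sd = 3)),
  age                = sample(25:80, n, replace = TRUE),
  female             = rbinom(n, 1, 0.5),
  parental_education = rbinom(n, 1, 0.3),
  foreign_parent     = rbinom(n, 1, 0.1)
)
sim$trust <- 3 + 0.1 * sim$education_years + 0.01 * sim$age + rnorm(n, sd = 2)

# Single-level (ordinary least squares) regression
mod_single <- lm(
  trust ~ education_years + age + female + parental_education + foreign_parent,
  data = sim
)
summary(mod_single)
```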
We have interpreted this model in earlier weeks. Our interest now is in extending it to account for the nesting of cases within different countries.
Multilevel models
In our dataset we cannot assume that the observations are fully independent (or that the errors are independently distributed). We know that observations were sampled from within selected countries, so country is a clustering variable that may exert a group-level influence on the behaviour, opinions, conditions etc. of our individual observations.
A visual introduction to hierarchical models: http://mfviz.com/hierarchical-models/
With such data, it makes sense to allow regression coefficients to vary by group.
Such variation can already be achieved by simply including group indicators in a least squares regression framework.
In other words, we could extend our previous model by adding an indicator (dummy) variable for each country as a predictor.
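The group-indicator approach can be sketched as follows, again on simulated data. The country codes, the `country_shift` values and the variable names are assumptions made for illustration only.

```r
# Simulated data with five hypothetical countries
set.seed(2024)
countries <- c("AT", "BE", "CH", "DE", "DK")
n_per_country <- 40
sim <- data.frame(
  cntry           = factor(rep(countries, each = n_per_country)),
  education_years = round(rnorm(length(countries) * n_per_country, 13, 3))
)

# Hypothetical country-specific shifts in the intercept
country_shift <- c(AT = 0, BE = -0.3, CH = 0.8, DE = 0.2, DK = 1.5)
sim$trust <- 3 + 0.1 * sim$education_years +
  unname(country_shift[as.character(sim$cntry)]) +
  rnorm(nrow(sim), sd = 2)

# Least squares with country indicators: one dummy per country except the
# reference level (AT), so each country gets its own intercept
mod_indicators <- lm(trust ~ education_years + cntry, data = sim)
coef(mod_indicators)
```

Note that this specification only lets the intercept vary by country; letting a slope vary as well would require interaction terms such as `education_years * cntry`, which multiplies the number of parameters estimated from each country's data alone.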
Very often, simply including group indicators in a least squares regression gives unacceptably noisy estimates.
Instead, we use multilevel regression, a method of partially pooling varying coefficients. It is equivalent to a Bayesian regression in which the variation in the data is used to estimate the prior distribution of the varying intercepts and slopes.
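To make partial pooling more concrete, the sketch below compares raw group means (no pooling) with the group intercepts from a multilevel model fitted with `lmer()` from the `lme4` package. The data are simulated, and the group sizes, variances and object names are all assumptions chosen for illustration.

```r
library(lme4)

# Simulate 12 groups of very unequal size
set.seed(42)
n_groups <- 12
group_sizes <- rep(c(3, 10, 50), length.out = n_groups)
group <- factor(rep(seq_len(n_groups), times = group_sizes))
group_effects <- rnorm(n_groups, mean = 0, sd = 1)
y <- 5 + group_effects[as.integer(group)] + rnorm(length(group), sd = 2)
sim <- data.frame(y = y, group = group)

# No pooling: each group's own sample mean
no_pooling <- as.numeric(tapply(sim$y, sim$group, mean))

# Partial pooling: group intercepts from a varying-intercept model
mod_pp <- lmer(y ~ 1 + (1 | group), data = sim)
partial_pooling <- coef(mod_pp)$group[, "(Intercept)"]
grand_mean <- unname(fixef(mod_pp)[1])

comparison <- data.frame(
  n = group_sizes,
  no_pooling = no_pooling,
  partial_pooling = partial_pooling
)
comparison
```

The partially pooled estimates are pulled from the raw group means towards the overall mean, and the pull is strongest for the smallest groups, whose raw means are the noisiest.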
The terminology surrounding multilevel models can be confusing. Different disciplines use various names for them, for example:
Variance components
Random intercepts and slopes
Random effects
Random coefficients
Varying coefficients
Intercepts- and/or slopes-as-outcomes
Hierarchical linear models
Multilevel models (implies multiple levels of hierarchically clustered data)
Growth curve models (possibly Latent GCM)
Mixed effects models
Some of these terms might be more historical, others are more often seen in a specific discipline, others might refer to a certain data structure, and still others are special cases (e.g. null models with no explanatory variables).
Though you will hear many definitions, random effects are simply those specific to an observational unit, however defined. In our examples we will mostly encounter the case where the observational unit is the level of some grouping factor, but this is only one possibility.
Mixed effects - or simply mixed - models generally refer to a mixture of fixed and random effects. This is probably the most general term, with no specific data structure implied.
Fitting multilevel models
In R we can fit multilevel models using the lmer function from the lme4 package.
It is advisable to first fit some simple, preliminary models, in part to establish a baseline for evaluating larger models. Then, we can build toward a final model for description and inference by adding important covariates, centering certain variables, and checking model assumptions.
The standard first step is to model only the outcome measurement, without any predictors, to get a sense for the effect of the clusters; this is often called a random intercepts model or null model:
mod_null <- lmer(trustindex3 ~ 1 + (1 | cntry), data = osterman)
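One common way to summarise the effect of the clusters in a null model is the intraclass correlation (ICC), the share of the total variance that lies between groups: \(\sigma^2_u / (\sigma^2_u + \sigma^2_e)\), where \(\sigma^2_u\) is the between-country variance and \(\sigma^2_e\) the residual variance. The sketch below computes it from the variance components of a null model fitted to simulated data; the data-generating values and object names (`mod_null_sim`, `var_between`, etc.) are assumptions for illustration, not results from the `osterman` data.

```r
library(lme4)

# Simulate 20 hypothetical countries with 50 respondents each
set.seed(7)
n_countries <- 20
n_per_country <- 50
cntry <- factor(rep(seq_len(n_countries), each = n_per_country))
u <- rnorm(n_countries, sd = 1)   # country-level deviations
trust <- 5 + u[as.integer(cntry)] + rnorm(length(cntry), sd = 2)
sim <- data.frame(trust = trust, cntry = cntry)

# Null (random intercepts only) model
mod_null_sim <- lmer(trust ~ 1 + (1 | cntry), data = sim)

# Extract the variance components and compute the ICC
vc <- as.data.frame(VarCorr(mod_null_sim))
var_between <- vc$vcov[vc$grp == "cntry"]
var_within  <- vc$vcov[vc$grp == "Residual"]
icc <- var_between / (var_between + var_within)
icc
```

The same extraction should work on `mod_null` above, since the grouping factor there is also called `cntry`.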
The second step is then to fit the full covariate model: