# Just in case we get errors asking that the package repositories be explicitly set when installing new packages:
options(repos = list(CRAN = "http://cran.rstudio.com/"))
# Install and load required packages
# We can use the {pacman} package to easily install and load several packages:
# ({pacman} itself may need installing if not yet installed)
pacman::p_load(
  tidyverse, easystats, sjlabelled,
  ggformula, ggeffects, marginaleffects,
  modelsummary, gtsummary,
  survey, sandwich, lmtest, lme4) # new modelling packages
Week 6 Worksheet Exercises
Hierarchies: Multilevel models
Aims
By the end of the session you will learn how to:
- use survey weights in single-level linear regression models
- estimate variance in hierarchical/clustered data using robust standard errors in a single-level modelling framework
- fit random intercept models to hierarchical/clustered data
Setup
Create a worksheet document
In Week 1 you set up R and RStudio, and an RProject folder (we called it “HSS8005_labs”) with an .R script and a .qmd or .Rmd document in it (we called these “Lab_1”). Ideally, you saved this on a cloud drive so you can access it from any computer (e.g. OneDrive). You will be working in this folder. If it’s missing, complete Exercise 3 from the Week 1 Worksheet.
Create a new Quarto markdown file (.qmd) for this session (e.g. “Lab_6.qmd”) and work in it to complete the exercises and report on your final analysis.
R packages
Example 1. Österman (2021): polynomials, weights and clustered errors
In Worksheet 4 we fitted versions of Models 1-3 reported in Table 3 of Österman (2021) (see also Table A.3 in the Appendix for a more complete report of the models).
Comparing the model we fitted to the one reported by the author, our results were very close, but not totally equivalent on several coefficients and standard errors. The main reasons for the divergence had to do with two aspects of the published model that we had disregarded: (1) we did not include survey weights to correct for sampling errors, and (2) we did not allow for intra-group correlation in the standard errors among respondents from the same country (and the same age cohort). We will first implement these two additional steps and compare our final results again to those reported in Österman (2021). Then, we will refit the model in a multilevel/hierarchical framework.
Data management
As a first step, let’s import the osterman dataset that underpins the Österman (2021) article (see the Data page on the course website for information about the datasets available in this course). We can download the data to a local folder and load it from there, or we can load it directly from the web (if it works…):
# Import the data
osterman <- data_read("https://cgmoreh.github.io/HSS8005-24/data/osterman.dta")
It’s always a good idea to inspect the dataset after importing, to identify any issues. One option is to check a codebook, for example:
# `View` the codebook:
data_codebook(osterman) |> View()
We do notice a few strange values for some variables. For example, let’s inspect the ppltrst and dscrgrp variables in a little more detail:
osterman |> data_tabulate(c(ppltrst, dscrgrp))
Most people can be trusted or you can't be too careful (ppltrst) <numeric>
# total N=68796 valid N=68733
Value | N | Raw % | Valid % | Cumulative %
------+-------+-------+---------+-------------
0 | 3634 | 5.28 | 5.29 | 5.29
1 | 2374 | 3.45 | 3.45 | 8.74
2 | 4621 | 6.72 | 6.72 | 15.46
3 | 7170 | 10.42 | 10.43 | 25.90
4 | 6522 | 9.48 | 9.49 | 35.38
5 | 13962 | 20.29 | 20.31 | 55.70
6 | 7793 | 11.33 | 11.34 | 67.04
7 | 11067 | 16.09 | 16.10 | 83.14
8 | 8374 | 12.17 | 12.18 | 95.32
9 | 2049 | 2.98 | 2.98 | 98.30
10 | 1167 | 1.70 | 1.70 | 100.00
<NA> | 63 | 0.09 | <NA> | <NA>
Member of a group discriminated against in this country (dscrgrp) <numeric>
# total N=68796 valid N=68565
Value | N | Raw % | Valid % | Cumulative %
------+-------+-------+---------+-------------
1 | 3950 | 5.74 | 5.76 | 5.76
2 | 64615 | 93.92 | 94.24 | 100.00
<NA> | 231 | 0.34 | <NA> | <NA>
The results in the frequency tables look good, so it’s probably something awkward about the labelling. We can check the codebook again; to allow all levels of the ppltrst variable to be tabulated, we can increase the maximum number of values shown from the default 10 to something larger:
osterman |> data_codebook(c(ppltrst, dscrgrp), max_values = 20)
osterman (68796 rows and 30 variables, 2 shown)
ID | Name | Label | Type | Missings | Values | Value Labels | N
---+---------+---------------------------------------------------------+---------+------------+--------+----------------------------+--------------
4 | ppltrst | Most people can be trusted or you can't be too careful | numeric | 63 (0.1%) | 0 | You can't be too careful | 3634 ( 5.3%)
| | | | | 1 | 1 | 2374 ( 3.5%)
| | | | | 10 | Most people can be trusted | 1167 ( 1.7%)
| | | | | 2 | 2 | 4621 ( 6.7%)
| | | | | 3 | 3 | 7170 (10.4%)
| | | | | 4 | 4 | 6522 ( 9.5%)
| | | | | 5 | 5 | 13962 (20.3%)
| | | | | 6 | 6 | 7793 (11.3%)
| | | | | 7 | 7 | 11067 (16.1%)
| | | | | 8 | 8 | 8374 (12.2%)
| | | | | 9 | 9 | 2049 ( 3.0%)
| | | | | NA(a) | Refusal | 9 ( 0.0%)
| | | | | NA(b) | Don't know | 41 ( 0.1%)
| | | | | NA(c) | No answer | 13 ( 0.0%)
---+---------+---------------------------------------------------------+---------+------------+--------+----------------------------+--------------
7 | dscrgrp | Member of a group discriminated against in this country | numeric | 231 (0.3%) | 1 | Yes | 3950 ( 5.7%)
| | | | | 2 | No | 64615 (93.9%)
| | | | | NA(a) | Refusal | 29 ( 0.0%)
| | | | | NA(b) | Don't know | 179 ( 0.3%)
| | | | | NA(c) | No answer | 23 ( 0.0%)
---------------------------------------------------------------------------------------------------------------------------------------------------
As we see, the so-called “tagged” NA/missing values are causing the ordering of the values to be unexpected: because letters appear as part of a numeric value scale, the scale follows “alphabetic sorting” rather than “natural sorting”. R is not designed to handle “tagged” NA/missing values, but these are commonly used in other statistical software, and the original dataset was imported from a Stata/.dta format. In most cases this will not pose an issue, but it’s safer to convert all “tagged” missing values to standard NAs. We can easily do that with the zap_na_tags() function from the sjlabelled package:
osterman <- sjlabelled::zap_na_tags(osterman)
## testing the results:
osterman |> data_codebook(c(ppltrst, dscrgrp), max_values = 20)
osterman (68796 rows and 30 variables, 2 shown)
ID | Name | Label | Type | Missings | Values | Value Labels | N
---+---------+---------------------------------------------------------+---------+------------+--------+----------------------------+--------------
4 | ppltrst | Most people can be trusted or you can't be too careful | numeric | 63 (0.1%) | 0 | You can't be too careful | 3634 ( 5.3%)
| | | | | 1 | 1 | 2374 ( 3.5%)
| | | | | 2 | 2 | 4621 ( 6.7%)
| | | | | 3 | 3 | 7170 (10.4%)
| | | | | 4 | 4 | 6522 ( 9.5%)
| | | | | 5 | 5 | 13962 (20.3%)
| | | | | 6 | 6 | 7793 (11.3%)
| | | | | 7 | 7 | 11067 (16.1%)
| | | | | 8 | 8 | 8374 (12.2%)
| | | | | 9 | 9 | 2049 ( 3.0%)
| | | | | 10 | Most people can be trusted | 1167 ( 1.7%)
---+---------+---------------------------------------------------------+---------+------------+--------+----------------------------+--------------
7 | dscrgrp | Member of a group discriminated against in this country | numeric | 231 (0.3%) | 1 | Yes | 3950 ( 5.8%)
| | | | | 2 | No | 64615 (94.2%)
---------------------------------------------------------------------------------------------------------------------------------------------------
Polynomials
In the footnotes to Table 3 and Table A.3 in the Appendix, the author tells us:
All models include reform FEs [Fixed Effects], a general quadratic birth year trend, and reform-specific trends for birth year (linear) and age (quadratic). Additional controls: foreign-born parents and ESS round dummies.
The tables in the Appendix (Table A.3) “report all of the coefficients for the control variables, except for the age and birth year controls. Since the latter variables are interacted with all of the reform fixed effects, they are not interpretable as single coefficients”.
In the main text, the author further explains some of the reasoning behind their modelling choices:
One dilemma for the design is that there has been a trend of increasing educational attainment throughout the studied time period, which means that the reform-windows of treated and non-treated cohorts will also pick up the effects of this trend. To counter this, [the list of covariates] includes a general quadratic birth year trend, reform-specific linear controls for birth year and reform-specific quadratic age trends. The quadratic terms are included to allow sufficient flexibility in absorbing possible non-linear trends of increasing education within the reform-window of seven treated and seven untreated birth year cohorts. … The reform-fixed effects are also essential because they imply that only the within-reform-window variation is used to estimate the effects and between-reform differences are factored out, such as pre-reform differences in social trust. (Österman 2021:221–22)
Before we fit the model, some of the concepts in the quotation need unpacking. A quadratic term is a second-degree polynomial term: put simply, it’s the square of the variable concerned. The quadratic of a variable such as \(age\) is therefore \(age^2\), or \(age \times age\). In other words, it is like a variable’s “interaction” with itself. Because of this, there are several ways in which the quadratic terms to be included in a model can be specified:
- We could create the quadratic terms as new variables, and include those in the model. Effectively, we would be creating new variables/columns in the dataset and using those in the model. We could do that as shown below:
# Create new quadratic variables by multiplication with themselves and add them to the dataset, saving it as a new data object:
osterman <- osterman |>
mutate(agea_quad = agea*agea, # quadratic age variable
yrbrn_quad = yrbrn*yrbrn) |> # quadratic birth-year variable
sjlabelled::var_labels(agea_quad = "Age (quadratic)", # we can label the variable if we want
yrbrn_quad = "Birth year (quadratic)")
# We now have two additional variables in the dataset; we can fit a model using those:
m1_prequad <- lm(trustindex3 ~ reform1_7 + female + blgetmg_d + # main covariates reported in Table 3
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) + # additional controls for foreign-born parents and ESS Round dummies (we treat `round` as a factor for this)
agea + yrbrn + agea_quad + yrbrn_quad + # general quadratic birth year trend and quadratic age
factor(reform_id_num) + # reform fixed effects dummies
yrbrn:factor(reform_id_num) + # reform-specific birth year trend
agea:factor(reform_id_num) + agea_quad:factor(reform_id_num), # reform-specific quadratic age trend
data = osterman) # the new expanded dataset
- We can get the same results by creating the quadratic terms directly within the modelling function. The one thing to keep in mind is that if we want to include standard mathematical operations within a formula, we need to insulate the operation from R’s formula parsing code using the I() function, which returns the contents of I(...) “as is”. The model formula would then be:
# Create quadratic terms internally as part of the modelling function:
m1_funquad <- lm(trustindex3 ~ reform1_7 + female + blgetmg_d + # main covariates reported in Table 3
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) + # additional controls for foreign-born parents and ESS Round dummies (we treat `round` as a factor for this)
agea + yrbrn + I(agea^2) + I(yrbrn^2) + # general quadratic birth year trend and quadratic age
factor(reform_id_num) + # reform fixed effects dummies
yrbrn:factor(reform_id_num) + # reform-specific birth year trend
agea:factor(reform_id_num) + I(agea^2):factor(reform_id_num), # reform-specific quadratic age trend
data = osterman) # the original dataset
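As a quick aside, a toy simulation (hypothetical data, not the osterman dataset) shows why I() is needed: inside a model formula, ^ is formula syntax for crossing terms, not arithmetic, so a bare x^2 silently collapses to x.

```r
# Toy data for illustration only
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- d$x + d$x^2 + rnorm(100)

# Without I(), the formula parser reduces x^2 to x, so the quadratic is lost:
names(coef(lm(y ~ x + x^2, data = d)))    # "(Intercept)" "x"

# With I(), the square is computed arithmetically and enters as a term:
names(coef(lm(y ~ x + I(x^2), data = d))) # "(Intercept)" "x" "I(x^2)"
```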
- In the two previous options, the quadratic terms will be correlated with the original variables. To avoid this, we can rely on so-called orthogonal polynomials using the poly() function. We can also fit the same correlated polynomial model as the ones above by adding the raw = TRUE option to poly(). In the code below, we fit the correlated version first, then the orthogonal version (this StackOverflow discussion explains the difference between the two options in more detail):
m1_polyraw <- lm(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
poly(agea, 2, raw = TRUE) + poly(yrbrn, 2, raw = TRUE) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) +
agea:factor(reform_id_num) + poly(agea, 2, raw = TRUE):factor(reform_id_num),
data = osterman)
m1_polyorth <- lm(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
poly(agea, 2) + poly(yrbrn, 2) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) +
agea:factor(reform_id_num) + poly(agea, 2):factor(reform_id_num),
data = osterman)
For a side-by-side comparison of the models we have fitted so far, we can use one of the many model summary tabulation functions available in various packages. Some popular options are stargazer::stargazer(), jtools::export_summs(), sjPlot::tab_model(), modelsummary::modelsummary() or parameters::compare_models() from the easystats ecosystem. Below we will use modelsummary(), which produces publication-ready summary tables in HTML format that can easily be exported to other formats, such as Microsoft Word or PDF:
# It's cleaner to first make a list of the models we want to summarise; we can even name them:
models <- list(
"Pre-calculated quadratic" = m1_prequad,
"Within-function quadratic" = m1_funquad,
"poly() with raw coding" = m1_polyraw,
"poly() with default orthogonal coding" = m1_polyorth
)
# modelsummary table with stars for p-values added
modelsummary::modelsummary(models, stars = TRUE)
The results from modelsummary() are not shown here because the table is long and ugly, but it is useful to peruse for comparing results across the different models. We do notice some differences in the affected variables between the orthogonal-coded version and the other versions. It’s worth noting, however, that the Stata routine used by the author fitted correlated/raw-coded polynomials, so the orthogonal version is included only for comparison and we will not use it going forward. We generally want our transformed (polynomial) variables to be correlated with the original variables, as they are in fact measuring the same thing.
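To see concretely what poly() does, here is a small numeric sketch (a toy vector, unrelated to the dataset): the raw second column is literally the square of the variable, while the orthogonal columns are uncorrelated with each other by construction.

```r
x <- 1:10

raw_poly  <- poly(x, 2, raw = TRUE)  # columns: x and x^2
orth_poly <- poly(x, 2)              # orthogonal recoding of the same trend

# The raw quadratic column is just x squared:
all.equal(as.numeric(raw_poly[, 2]), x^2)  # TRUE

# Raw linear and quadratic terms are strongly correlated...
cor(raw_poly[, 1], raw_poly[, 2])          # ~0.97

# ...whereas the orthogonal columns are uncorrelated (up to rounding):
cor(orth_poly[, 1], orth_poly[, 2])
```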
For a cleaner table showing only the results included in Table A.3 in the Appendix to Österman (2021), we can use the coef_map or the coef_omit option in modelsummary() and only include m1_funquad, the polynomial fitting routine that we will use going forward:
# It's cleaner to first make a vector of the coefficients we wish to include; we can name the coefficients as they appear in Table A.3; note that we also leave out the Intercept, as in the published table:
coefs <- c("reform1_7" = "Reform",
"female" = "Female",
"blgetmg_d" = "Ethnic minority",
"fbrneur" = "Foreign-born father, Europe",
"mbrneur" = "Foreign-born mother, Europe",
"fnotbrneur" = "Foreign-born father, outside Europe",
"mnotbrneur" = "Foreign-born mother, outside Europe",
"factor(essround)2" = "ESS Round 2",
"factor(essround)3" = "ESS Round 3",
"factor(essround)4" = "ESS Round 4",
"factor(essround)5" = "ESS Round 5",
"factor(essround)6" = "ESS Round 6",
"factor(essround)7" = "ESS Round 7",
"factor(essround)8" = "ESS Round 8",
"factor(essround)9" = "ESS Round 9")
# Then we pass the vector to coef_map to select the coefficients to print
modelsummary::modelsummary(list(m1_funquad), stars = TRUE, coef_map = coefs)
|  | (1) |
|---|---|
Reform | 0.063* |
(0.027) | |
Female | 0.058*** |
(0.013) | |
Ethnic minority | −0.241*** |
(0.054) | |
Foreign-born father, Europe | −0.111** |
(0.042) | |
Foreign-born mother, Europe | −0.108* |
(0.044) | |
Foreign-born father, outside Europe | −0.065 |
(0.073) | |
Foreign-born mother, outside Europe | −0.110 |
(0.078) | |
ESS Round 2 | 0.059 |
(0.045) | |
ESS Round 3 | 0.162* |
(0.075) | |
ESS Round 4 | 0.243* |
(0.108) | |
ESS Round 5 | 0.360* |
(0.144) | |
ESS Round 6 | 0.397* |
(0.179) | |
ESS Round 7 | 0.449* |
(0.212) | |
ESS Round 8 | 0.655** |
(0.246) | |
ESS Round 9 | 0.816** |
(0.283) | |
Num.Obs. | 68796 |
R2 | 0.200 |
R2 Adj. | 0.198 |
AIC | 268913.8 |
BIC | 270056.1 |
Log.Lik. | −134331.877 |
RMSE | 1.71 |
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
We can compare our table to the published one; there will, of course, be some differences, as our model specification is still not the same as that of the original paper. One source of the inconsistencies is the use of “survey weights” in the original article.
Survey weights
One of the differences between our coefficients and the ones in the published table is due to the fact that we did not account for survey weights, while in Österman (2021), p. 220:
The data are weighted using ESS design weights
To understand in more detail what this means, we need some understanding of how weights are constructed in the European Social Survey (ESS). In a nutshell, ESS data are distributed containing 3 types of weights:
dweight: These are the so-called design weights. Quoting the ESS website: “the main purpose of the design weights is to correct for the fact that in some countries respondents have different probabilities to be part of the sample due to the sampling design used.” These weights have been built to correct for the coverage error, that is, the error created by the different chances that individuals from the target population are covered in the sample frame.
pspwght: These are the post-stratification weights. According to the ESS website, these “are a more sophisticated weighting strategy that uses auxiliary information to reduce the sampling error and potential non-response bias.” These weights are computed after the data has been collected, to correct for differences between population frequencies observed in the sample and the “true” population frequencies (i.e. those provided by the Labour Force Survey funded by the EU and available on Eurostat). Unlike the design weights, which are based on the probability of inclusion of different groups of individuals in the sample frames, these are calculated from variables present in the data, and are really an “adjustment” of the design weight made so that the observed frequencies match those of the target population.
pweight: These are the population size weights. Their purpose is to match the number of observations collected in each country to the country’s population size. They are to be used only when we calculate statistics on multiple countries (for instance, unemployment in Scandinavia). Their value is the same for all observations within the same country.
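The basic logic of design weights can be illustrated with a toy example (hypothetical numbers, not ESS data): an under-sampled respondent receives a larger weight, so they count for more when computing estimates.

```r
# Hypothetical outcome for four respondents; the last one represents an
# under-sampled group and carries a design weight of 3
y <- c(4, 5, 6, 9)
w <- c(1, 1, 1, 3)

mean(y)             # unweighted mean: 6
weighted.mean(y, w) # weighted mean: (4 + 5 + 6 + 3*9) / 6 = 7
```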
There is a lot to be said about survey weights and options for dealing with them, which we will not cover in more detail in this course. But as part of this exercise we will get to know some functions that can help with including survey weights and that can be extended to more complex design weights as well. Specifically, we will look at the survey package.
We start by creating a weighted data object using the svydesign() function from survey, which includes the dweight as used in the original article:
## Create weighted data
osterman_w <- svydesign(id = ~1, # specifying cluster IDs is needed; ~0 or ~1 means no clusters
weights = ~dweight, # we apply the design weights
data = osterman)
We then fit the model using the svyglm() function from the survey package and save the model object; note that we specify a design = option with the weighted data object we created earlier:
m1_w <- svyglm(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
agea + yrbrn + I(agea^2) + I(yrbrn^2) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) + agea:factor(reform_id_num) +
I(agea^2):factor(reform_id_num),
design = osterman_w, data = osterman)
To compare the results from the weighted model to the one we produced earlier, we can check them in a modelsummary() table; we can reuse the list of coefficients to display that we created earlier:
# List and name the models
models <- list(
"Unweighted model" = m1_funquad,
"Weighted model" = m1_w)
# Tabulate the models
modelsummary::modelsummary(models, stars = TRUE, coef_map = coefs)
|  | Unweighted model | Weighted model |
|---|---|---|
Reform | 0.063* | 0.063* |
(0.027) | (0.029) | |
Female | 0.058*** | 0.061*** |
(0.013) | (0.014) | |
Ethnic minority | −0.241*** | −0.261*** |
(0.054) | (0.066) | |
Foreign-born father, Europe | −0.111** | −0.090* |
(0.042) | (0.046) | |
Foreign-born mother, Europe | −0.108* | −0.092* |
(0.044) | (0.046) | |
Foreign-born father, outside Europe | −0.065 | −0.053 |
(0.073) | (0.079) | |
Foreign-born mother, outside Europe | −0.110 | −0.087 |
(0.078) | (0.082) | |
ESS Round 2 | 0.059 | 0.052 |
(0.045) | (0.047) | |
ESS Round 3 | 0.162* | 0.148+ |
(0.075) | (0.080) | |
ESS Round 4 | 0.243* | 0.253* |
(0.108) | (0.114) | |
ESS Round 5 | 0.360* | 0.364* |
(0.144) | (0.153) | |
ESS Round 6 | 0.397* | 0.403* |
(0.179) | (0.190) | |
ESS Round 7 | 0.449* | 0.458* |
(0.212) | (0.224) | |
ESS Round 8 | 0.655** | 0.671* |
(0.246) | (0.261) | |
ESS Round 9 | 0.816** | 0.835** |
(0.283) | (0.299) | |
Num.Obs. | 68796 | 68796 |
R2 | 0.200 | 0.206 |
R2 Adj. | 0.198 | 0.204 |
AIC | 268913.8 | |
BIC | 270056.1 | 305027.4 |
Log.Lik. | −134331.877 | −151817.519 |
RMSE | 1.71 | 1.71 |
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
We can check our results again against those published in Table A.3 in the Appendix to Österman (2021), and we will see that all the coefficients are now almost the same.
The remaining differences are in the standard error estimates, and those are due to the fact that we did not use robust standard errors to account for intra-group correlation.
Variance estimation using robust clustered errors
Österman (2021:222) writes that:
All models are estimated with OLS, … and apply country-by-birth year clustered robust standard errors. [Footnote: An alternative would be to cluster the standard errors on the country level but such an approach would risk to lead to biased standard errors because of too few clusters]
Applying cluster-robust standard errors is a more elementary and less flexible way to account for breaches of the iid assumption of OLS (that our observations are “independent and identically distributed”) than fitting a mixed-effects (multilevel, hierarchical) model. There is a vast literature assessing which approach is better suited in different contexts. To get a feel for the issues at stake and for a deeper understanding of how these methods are useful, compare the analysis of Cheah (2009), whose results “suggest that modeling the clustering of the data using a multilevel methods is a better approach than fixing the standard errors of the OLS estimate”, to that of McNeish, Stapleton, and Silverman (2017), who discuss a number of cases (focusing on the psychology literature) where cluster-robust standard errors may be more advantageous than multilevel/hierarchical modelling.
We will first fit our model using clustered robust standard errors, as done by Österman (2021), then we will check the results against those we would obtain from a multilevel model design to see how the risk related to the low number of country clusters affects the results.
The standard way of applying error corrections is by using the {sandwich} and {lmtest} packages, but they do not handle well the summary objects produced by the survey package for weighted estimates. Using the unweighted model we fitted earlier, we can do the following:
# extract variance-covariance matrix with clustered correction
vc_cl <- vcovCL(m1_funquad, type='HC1', cluster=~cntry_cohort)
# get coefficients
m1_funquad_cl <- coeftest(m1_funquad, vc_cl)
## Or, the two steps above can be combined into one call:
m1_funquad_cl2 <- coeftest(m1_funquad,
vcovCL, type='HC1', cluster = ~cntry_cohort)
# And we can check that the two have the same result:
all.equal(m1_funquad_cl, m1_funquad_cl2)
[1] TRUE
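To build intuition for what vcovCL() is doing, here is a self-contained simulation (toy data, not the ESS): when observations share a cluster-level shock, the default iid standard errors understate the uncertainty, and the cluster-robust ones are typically larger.

```r
library(sandwich)
library(lmtest)

set.seed(42)
g <- rep(1:20, each = 30)                  # 20 clusters of 30 observations each
x <- rnorm(20)[g] + rnorm(600)             # predictor with a cluster-level component
y <- 0.5 * x + rnorm(20)[g] + rnorm(600)   # outcome with a shared cluster shock
toy <- data.frame(y, x, g)

fit <- lm(y ~ x, data = toy)

# Default (iid) versus cluster-robust standard errors for the slope:
se_iid <- coeftest(fit)["x", "Std. Error"]
se_cl  <- coeftest(fit, vcovCL, cluster = ~g)["x", "Std. Error"]
c(iid = se_iid, clustered = se_cl)
```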
Let’s tabulate the results for comparison:
# List and name the models
models <- list(
"Unweighted model" = m1_funquad,
"Weighted model" = m1_w,
"Unweighted model with cluster-robust errors" = m1_funquad_cl)
# Tabulate the models
modelsummary::modelsummary(models, stars = TRUE, coef_map = coefs)
|  | Unweighted model | Weighted model | Unweighted model with cluster-robust errors |
|---|---|---|---|
Reform | 0.063* | 0.063* | 0.063* |
(0.027) | (0.029) | (0.029) | |
Female | 0.058*** | 0.061*** | 0.058*** |
(0.013) | (0.014) | (0.015) | |
Ethnic minority | −0.241*** | −0.261*** | −0.241*** |
(0.054) | (0.066) | (0.060) | |
Foreign-born father, Europe | −0.111** | −0.090* | −0.111* |
(0.042) | (0.046) | (0.046) | |
Foreign-born mother, Europe | −0.108* | −0.092* | −0.108* |
(0.044) | (0.046) | (0.047) | |
Foreign-born father, outside Europe | −0.065 | −0.053 | −0.065 |
(0.073) | (0.079) | (0.077) | |
Foreign-born mother, outside Europe | −0.110 | −0.087 | −0.110 |
(0.078) | (0.082) | (0.086) | |
ESS Round 2 | 0.059 | 0.052 | 0.059 |
(0.045) | (0.047) | (0.044) | |
ESS Round 3 | 0.162* | 0.148+ | 0.162* |
(0.075) | (0.080) | (0.079) | |
ESS Round 4 | 0.243* | 0.253* | 0.243* |
(0.108) | (0.114) | (0.115) | |
ESS Round 5 | 0.360* | 0.364* | 0.360* |
(0.144) | (0.153) | (0.154) | |
ESS Round 6 | 0.397* | 0.403* | 0.397* |
(0.179) | (0.190) | (0.190) | |
ESS Round 7 | 0.449* | 0.458* | 0.449* |
(0.212) | (0.224) | (0.224) | |
ESS Round 8 | 0.655** | 0.671* | 0.655* |
(0.246) | (0.261) | (0.262) | |
ESS Round 9 | 0.816** | 0.835** | 0.816** |
(0.283) | (0.299) | (0.298) | |
Num.Obs. | 68796 | 68796 | 68796 |
R2 | 0.200 | 0.206 | |
R2 Adj. | 0.198 | 0.204 | |
AIC | 268913.8 | 406007.8 | |
BIC | 270056.1 | 305027.4 | 1033594.4 |
Log.Lik. | −134331.877 | −151817.519 | |
RMSE | 1.71 | 1.71 | |
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
The error correction does not have an impact on the coefficients, but it did affect the standard errors. Notice that the standard errors came closer to those in the weighted model without any error correction, showing that the weighting procedure already applies an error correction, albeit not specifically on the cluster variable cntry_cohort.
An easy way to include information on the clustering directly in the svydesign() function is to add cluster IDs:
# Re-specify the survey design
osterman_w_cl <- svydesign(id = ~cntry_cohort, # This time we add cntry_cohort as id
weights = ~dweight,
data = osterman)
# Re-fit the model with the new survey design
m1_w_cl <- svyglm(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
agea + yrbrn + I(agea^2) + I(yrbrn^2) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) + agea:factor(reform_id_num) +
I(agea^2):factor(reform_id_num),
design = osterman_w_cl, data = osterman)
# Tabulate the models
# Add the latest model to the existing list using the append() function to save typing
models <- append(models,
list("Weighted with cluster IDs" = m1_w_cl)
)
# Tabulate the models
modelsummary::modelsummary(models, stars = TRUE, coef_map = coefs)
|  | Unweighted model | Weighted model | Unweighted model with cluster-robust errors | Weighted with cluster IDs |
|---|---|---|---|---|
Reform | 0.063* | 0.063* | 0.063* | 0.063* |
(0.027) | (0.029) | (0.029) | (0.030) | |
Female | 0.058*** | 0.061*** | 0.058*** | 0.061*** |
(0.013) | (0.014) | (0.015) | (0.015) | |
Ethnic minority | −0.241*** | −0.261*** | −0.241*** | −0.261*** |
(0.054) | (0.066) | (0.060) | (0.067) | |
Foreign-born father, Europe | −0.111** | −0.090* | −0.111* | −0.090+ |
(0.042) | (0.046) | (0.046) | (0.047) | |
Foreign-born mother, Europe | −0.108* | −0.092* | −0.108* | −0.092+ |
(0.044) | (0.046) | (0.047) | (0.047) | |
Foreign-born father, outside Europe | −0.065 | −0.053 | −0.065 | −0.053 |
(0.073) | (0.079) | (0.077) | (0.078) | |
Foreign-born mother, outside Europe | −0.110 | −0.087 | −0.110 | −0.087 |
(0.078) | (0.082) | (0.086) | (0.081) | |
ESS Round 2 | 0.059 | 0.052 | 0.059 | 0.052 |
(0.045) | (0.047) | (0.044) | (0.045) | |
ESS Round 3 | 0.162* | 0.148+ | 0.162* | 0.148+ |
(0.075) | (0.080) | (0.079) | (0.088) | |
ESS Round 4 | 0.243* | 0.253* | 0.243* | 0.253* |
(0.108) | (0.114) | (0.115) | (0.125) | |
ESS Round 5 | 0.360* | 0.364* | 0.360* | 0.364* |
(0.144) | (0.153) | (0.154) | (0.169) | |
ESS Round 6 | 0.397* | 0.403* | 0.397* | 0.403+ |
(0.179) | (0.190) | (0.190) | (0.209) | |
ESS Round 7 | 0.449* | 0.458* | 0.449* | 0.458+ |
(0.212) | (0.224) | (0.224) | (0.246) | |
ESS Round 8 | 0.655** | 0.671* | 0.655* | 0.671* |
(0.246) | (0.261) | (0.262) | (0.288) | |
ESS Round 9 | 0.816** | 0.835** | 0.816** | 0.835* |
(0.283) | (0.299) | (0.298) | (0.326) | |
Num.Obs. | 68796 | 68796 | 68796 | 68796 |
R2 | 0.200 | 0.206 | 0.206 | |
R2 Adj. | 0.198 | 0.204 | −209.204 | |
AIC | 268913.8 | 406007.8 | ||
BIC | 270056.1 | 305027.4 | 1033594.4 | 7093411.5 |
Log.Lik. | −134331.877 | −151817.519 | −3546009.588 | |
RMSE | 1.71 | 1.71 | 1.71 | |
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001 |
This final model takes us close to the model reported in Österman (2021).
Example 2. Österman (2021): Random intercept model
Österman (2021:222) writes that “An alternative [to the model above] would be to cluster the standard errors on the country level but such an approach would risk to lead to biased standard errors because of too few clusters”. The number of clusters is an important consideration when choosing a multilevel model. For our aims, we will disregard this warning and fit a multilevel model on the osterman data regardless, and we can think about what the differences in the results mean.
We can fit the model in a multilevel framework using the lme4 package and the lmer() function.
Below we begin by looking at the distribution of the cntry variable, which codes the countries where respondents are from. Since the European Social Survey is a cross-national survey, it employs a random sampling approach within each participating country, and the resulting cross-national dataset is therefore clustered at the country level.
data_tabulate(osterman, cntry)
Country (cntry) <character>
# total N=68796 valid N=68796
Value | N | Raw % | Valid % | Cumulative %
------+------+-------+---------+-------------
AT | 2387 | 3.47 | 3.47 | 3.47
BE | 5220 | 7.59 | 7.59 | 11.06
DE | 3035 | 4.41 | 4.41 | 15.47
DK | 3979 | 5.78 | 5.78 | 21.25
ES | 6062 | 8.81 | 8.81 | 30.06
FI | 2505 | 3.64 | 3.64 | 33.71
FR | 6942 | 10.09 | 10.09 | 43.80
GB | 5494 | 7.99 | 7.99 | 51.78
GR | 2065 | 3.00 | 3.00 | 54.78
HU | 3101 | 4.51 | 4.51 | 59.29
IE | 6086 | 8.85 | 8.85 | 68.14
IT | 1502 | 2.18 | 2.18 | 70.32
NL | 6063 | 8.81 | 8.81 | 79.13
PL | 5372 | 7.81 | 7.81 | 86.94
PT | 5987 | 8.70 | 8.70 | 95.65
SE | 2996 | 4.35 | 4.35 | 100.00
<NA> | 0 | 0.00 | <NA> | <NA>
There are 16 countries in the data.
Initially, it is advisable to fit some simple, preliminary models, in part to establish a baseline for evaluating larger models. Since in the multilevel framework we are interested in understanding the effect that the clusters have in the data, the simplest model we can fit is a so-called random intercepts model or null model, which models the outcome without any predictors. In the single-level framework, this is equivalent to estimating the overall mean of the outcome variable; in the hierarchical framework, we are modelling the mean of the outcome within each category/cluster of the grouping variable. With lmer() this means simply adding a specification of the form (1 | cluster) to the model code we are already familiar with, where cluster is the name of the clustering variable, in our case cntry:
mod_null <- lmer(trustindex3 ~ 1 + (1 | cntry), data = osterman)
The result is the following:
model_parameters(mod_null)
# Fixed Effects
Parameter | Coefficient | SE | 95% CI | t(68793) | p
-------------------------------------------------------------------
(Intercept) | 5.23 | 0.23 | [4.78, 5.69] | 22.47 | < .001
# Random Effects
Parameter | Coefficient
-----------------------------------
SD (Intercept: cntry) | 0.93
SD (Residual) | 1.71
random_parameters(mod_null)
# Random Effects
Within-Group Variance 2.94 (1.71)
Between-Group Variance
Random Intercept (cntry) 0.87 (0.93)
N (groups per factor)
cntry 16
Observations 68796
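From these variance components we can compute the intraclass correlation coefficient (ICC): the proportion of the total variance in trust that lies between countries. A minimal sketch, plugging in the rounded variances printed above; performance::icc() (part of {easystats}) computes the same quantity directly from the fitted model:

```r
# ICC = between-group variance / (between-group + within-group variance)
var_between <- 0.87  # Var(Intercept: cntry), from random_parameters() above
var_within  <- 2.94  # residual (within-group) variance
var_between / (var_between + var_within)  # roughly 0.23

# Equivalent, computed directly from the model object:
performance::icc(mod_null)
```

So around a quarter of the variation in the trust index is attributable to differences between countries, which is substantial enough to warrant a multilevel approach.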
Below we fit a random intercept model with cntry_cohort as the single grouping variable, disregarding survey weights:
# Fit a random intercept model with `cntry_cohort` as the grouping variable
m2 <- lmer(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
agea + yrbrn + I(agea^2) + I(yrbrn^2) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) + agea:factor(reform_id_num) +
I(agea^2):factor(reform_id_num) +
(1|cntry_cohort),
data = osterman)
# Fit a random intercept model with only `cntry` as the grouping variable
m2_cntry <- lmer(trustindex3 ~ reform1_7 + female + blgetmg_d +
fbrneur + mbrneur + fnotbrneur + mnotbrneur + factor(essround) +
agea + yrbrn + I(agea^2) + I(yrbrn^2) +
factor(reform_id_num) +
yrbrn:factor(reform_id_num) + agea:factor(reform_id_num) +
I(agea^2):factor(reform_id_num) +
(1|cntry),
data = osterman)
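Before tabulating, it can be useful to inspect the estimated random effects themselves. A quick sketch, assuming the models above fitted without problems:

```r
# Variance components of the two random-intercept models
VarCorr(m2)
VarCorr(m2_cntry)

# Estimated country-level deviations from the overall intercept
head(ranef(m2_cntry)$cntry)
```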
# Add the latest models to the existing list using the append() function
models <- append(models,
list("Random intercepts (cntry_cohort)" = m2,
"Random intercepts (cntry)" = m2_cntry)
)
# Tabulate and compare the models
modelsummary::modelsummary(models, stars = TRUE, coef_map = coefs)
| | Unweighted model | Weighted model | Unweighted model with cluster-robust errors | Weighted with cluster IDs | Random intercepts (cntry_cohort) | Random intercepts (cntry) |
|---|---|---|---|---|---|---|
| Reform | 0.063* | 0.063* | 0.063* | 0.063* | 0.064* | 0.063* |
| | (0.027) | (0.029) | (0.029) | (0.030) | (0.028) | (0.027) |
| Female | 0.058*** | 0.061*** | 0.058*** | 0.061*** | 0.058*** | 0.058*** |
| | (0.013) | (0.014) | (0.015) | (0.015) | (0.013) | (0.013) |
| Ethnic minority | −0.241*** | −0.261*** | −0.241*** | −0.261*** | −0.241*** | −0.241*** |
| | (0.054) | (0.066) | (0.060) | (0.067) | (0.054) | (0.054) |
| Foreign-born father, Europe | −0.111** | −0.090* | −0.111* | −0.090+ | −0.111** | −0.111** |
| | (0.042) | (0.046) | (0.046) | (0.047) | (0.042) | (0.042) |
| Foreign-born mother, Europe | −0.108* | −0.092* | −0.108* | −0.092+ | −0.108* | −0.108* |
| | (0.044) | (0.046) | (0.047) | (0.047) | (0.044) | (0.044) |
| Foreign-born father, outside Europe | −0.065 | −0.053 | −0.065 | −0.053 | −0.064 | −0.065 |
| | (0.073) | (0.079) | (0.077) | (0.078) | (0.073) | (0.073) |
| Foreign-born mother, outside Europe | −0.110 | −0.087 | −0.110 | −0.087 | −0.110 | −0.110 |
| | (0.078) | (0.082) | (0.086) | (0.081) | (0.078) | (0.078) |
| ESS Round 2 | 0.059 | 0.052 | 0.059 | 0.052 | 0.059 | 0.059 |
| | (0.045) | (0.047) | (0.044) | (0.045) | (0.045) | (0.045) |
| ESS Round 3 | 0.162* | 0.148+ | 0.162* | 0.148+ | 0.162* | 0.162* |
| | (0.075) | (0.080) | (0.079) | (0.088) | (0.075) | (0.075) |
| ESS Round 4 | 0.243* | 0.253* | 0.243* | 0.253* | 0.243* | 0.243* |
| | (0.108) | (0.114) | (0.115) | (0.125) | (0.108) | (0.108) |
| ESS Round 5 | 0.360* | 0.364* | 0.360* | 0.364* | 0.360* | 0.360* |
| | (0.144) | (0.153) | (0.154) | (0.169) | (0.144) | (0.144) |
| ESS Round 6 | 0.397* | 0.403* | 0.397* | 0.403+ | 0.396* | 0.397* |
| | (0.179) | (0.190) | (0.190) | (0.209) | (0.179) | (0.179) |
| ESS Round 7 | 0.449* | 0.458* | 0.449* | 0.458+ | 0.448* | 0.449* |
| | (0.212) | (0.224) | (0.224) | (0.246) | (0.212) | (0.212) |
| ESS Round 8 | 0.655** | 0.671* | 0.655* | 0.671* | 0.654** | 0.655** |
| | (0.246) | (0.261) | (0.262) | (0.288) | (0.246) | (0.246) |
| ESS Round 9 | 0.816** | 0.835** | 0.816** | 0.835* | 0.815** | 0.816** |
| | (0.283) | (0.299) | (0.298) | (0.326) | (0.283) | (0.283) |
| Num.Obs. | 68796 | 68796 | 68796 | 68796 | 68796 | 68796 |
| R2 | 0.200 | 0.206 | 0.206 | | | |
| R2 Adj. | 0.198 | 0.204 | −209.204 | | | |
| R2 Marg. | | | | | 0.199 | 0.126 |
| R2 Cond. | | | | | 0.200 | 0.494 |
| AIC | 268913.8 | 406007.8 | | | 269877.8 | 269878.2 |
| BIC | 270056.1 | 305027.4 | 1033594.4 | 7093411.5 | 271029.3 | 271029.7 |
| ICC | | | | | 0.0 | 0.4 |
| Log.Lik. | −134331.877 | −151817.519 | | −3546009.588 | | |
| RMSE | 1.71 | 1.71 | 1.71 | 1.70 | 1.71 | |

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001