Week 3 Worksheet Exercises

Categories: Logistic regression and other generalised linear models

Aims

This session introduces binary logistic regression models. These models belong to a broader class called generalised linear models, which are applicable when the outcome (“dependent”, “response”, “explained”, etc.) variable cannot be assumed to follow a Gaussian (i.e. “normal”) distribution, but is instead a bounded or discrete measurement. Examples include variables whose values cannot be negative (i.e. have a lower limit of 0) and variables that fall into discrete categories such as “yes”/“no”, “disagree”/“neither agree nor disagree”/“agree”, or “blue”/“green”/“black”/“brown”/“other”. Binary logistic regression is the simplest case, where the outcome can take only two values (hence “binary”). However, the logic that underpins it is similar to that of other generalised linear models.

By the end of the session you will learn how to:

Fit and summarise logistic regression models in R
Interpret results from logistic regression models
Manipulate the regression output to ease interpretation
Plot and visualise results from logistic regression models to aid interpretation
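
As a rough preview of these four steps, the sketch below fits a binary logistic regression to simulated data using base R only. The data and the variable names (`trust`, `education`) are invented for illustration and are not the lab dataset.

```r
# Simulated data: all values and variable names are hypothetical
set.seed(123)
n <- 500
education <- sample(0:8, n, replace = TRUE)      # made-up education score
p_trust <- plogis(-1.5 + 0.25 * education)       # true probability of the outcome
trust <- rbinom(n, size = 1, prob = p_trust)     # binary outcome coded 0/1
dat <- data.frame(trust, education)

# 1. Fit and summarise: binomial family uses the logit link by default
fit <- glm(trust ~ education, data = dat, family = binomial)
summary(fit)

# 2.-3. Coefficients are on the log-odds scale; exponentiating gives odds ratios
exp(coef(fit))

# Predicted probabilities at a few chosen education levels
new_dat <- data.frame(education = c(0, 4, 8))
pred <- predict(fit, newdata = new_dat, type = "response")
pred

# 4. Plot the predicted probability across the range of education
grid <- data.frame(education = seq(0, 8, by = 0.1))
grid$prob <- predict(fit, newdata = grid, type = "response")
plot(grid$education, grid$prob, type = "l",
     xlab = "Education", ylab = "Predicted probability of trust = 1")
```

Setting `type = "response"` asks `predict()` for probabilities rather than log-odds, which are usually easier to interpret and to plot.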

Setup

In Week 1 you set up R and RStudio, and an RProject folder (we called it “HSS8005_labs”) with an .R script and a .qmd or .Rmd document in it (we called these “Lab_1”). Ideally, you saved this on a cloud drive so you can access it from any computer (e.g. OneDrive). You will be working in this folder. If it’s missing, complete Exercise 3 from the Week 1 Worksheet.

Create a new Quarto markdown file (.qmd) for this session (e.g. “Lab_3.qmd”) and work in it to complete the exercises and report on your final analysis.

Exercise 0: Load (and install) R packages needed for this lab

Using functions learnt in Week 1, load (and install, if needed) the following R packages:
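
The package list for this lab is not reproduced here. As a minimal sketch of the load-and-install-if-needed pattern, the code below uses placeholder names (`stats` and `utils`, which ship with R) that you would replace with the packages required for the lab.

```r
# Placeholder package names (hypothetical): replace with the packages for this lab
pkgs <- c("stats", "utils")

# Check which of the packages are not yet installed
not_installed <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Install only the missing ones
if (length(not_installed) > 0) install.packages(not_installed)

# Attach every package to the current session
invisible(lapply(pkgs, library, character.only = TRUE))
```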

Exercise 2: Refit the model for two single countries

You will carry out this exercise on your own, and you’ll make two adjustments compared to the previous exercise. Instead of treating the entire dataset as undifferentiated, originating from one single population, we will acknowledge the fact that the data originate from various countries and that the local socio-cultural context has an impact on social behaviours and attitudes. To account for this, re-fit the logistic regression model from the previous exercise in two different ways:

1.  fit the same model as before, but add the *country* variable to the model as a covariate;
2.  select *two* countries of your choice, reduce the dataset to each country in turn, and fit the model separately on each subset.
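
The first approach might look something like the sketch below. The model from the previous exercise is not shown in this section, so the example assumes a binary outcome `trust`, a predictor `education` and a `country` factor, all simulated and all hypothetical names.

```r
# Simulated data with three made-up countries; names and values are hypothetical
set.seed(42)
n <- 900
country <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
education <- sample(0:8, n, replace = TRUE)
country_shift <- c(A = -0.5, B = 0, C = 0.8)[as.character(country)]
trust <- rbinom(n, size = 1, prob = plogis(-1.5 + 0.25 * education + country_shift))
dat <- data.frame(trust, education, country)

# Pooled model: ignores which country each respondent comes from
fit_pooled <- glm(trust ~ education, data = dat, family = binomial)

# Same model with country added as a covariate (a factor becomes dummy variables)
fit_country <- glm(trust ~ education + country, data = dat, family = binomial)
summary(fit_country)

# Odds ratios; country effects are relative to the reference level ("A" here)
exp(coef(fit_country))
```

Because `country` is a factor, R estimates one coefficient per country except the reference level, so each country coefficient is a contrast with that reference.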

To select countries from the data, you will need another function, filter(), which keeps only the rows (cases) that meet the criteria you specify.
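
A minimal sketch of the second approach follows, assuming that filter() here is the one from the dplyr package and that the data contain a `country` column. The data, the variable names and the country labels are all simulated and hypothetical.

```r
library(dplyr)

# Simulated data; variable names and country labels are hypothetical
set.seed(7)
n <- 600
dat <- data.frame(
  country = sample(c("A", "B", "C"), n, replace = TRUE),
  education = sample(0:8, n, replace = TRUE)
)
dat$trust <- rbinom(n, size = 1, prob = plogis(-1.5 + 0.25 * dat$education))

# Keep only the rows belonging to one country at a time
dat_a <- filter(dat, country == "A")
dat_b <- filter(dat, country == "B")

# Fit the same model separately on each country's subset
fit_a <- glm(trust ~ education, data = dat_a, family = binomial)
fit_b <- glm(trust ~ education, data = dat_b, family = binomial)

# Compare the education coefficient (log-odds scale) across the two countries
c(A = coef(fit_a)[["education"]], B = coef(fit_b)[["education"]])
```

The condition inside filter() is a logical test evaluated for every row; only rows where it is `TRUE` are retained.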

