Week 3 Curves
Logistic regression and other generalised linear models
Description
It wasn’t until the last quarter of the 20th century that a unified vision of statistical modelling emerged, allowing practitioners to see how the general linear model we have explored so far is only a specific case of a more general class of models. We could have had a fancy, memorable name for this class of models - as John Nelder, one of its inventors, acknowledged later in life (Senn 2003, 127) - but back then academics were not required to undertake marketing training on the tweetability-factor of the chosen names for their theories; so we ended up with “generalised linear models”. These models can be applied to explananda (“explained”, “response”, “outcome”, “dependent” etc. variables, our y’s) whose possible values have certain constraints (such as being limited by a lower bound or constrained to discrete choices) that make the parameters of the Gaussian (‘normal’) distribution inefficient in describing them. Instead, they follow some of the other “exponential distributions” (and not only the exponential: cf. Gelman, Hill, and Vehtari (2020, 264)), of which the Poisson, gamma, beta, binomial and multinomial are probably the most common in human and social sciences research. Their “generalised linear modelling” involves mapping them onto a linear model using a so-called “link function”. We will explore what all of this means in practice and how it can be applied to the data that interest us most in our respective fields of study.
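As a rough sketch of what a “link function” does, the generic form of a generalised linear model can be written as below, together with the two links most relevant to this week's readings. The notation (g for the link, p and λ for the expected outcome, β for the coefficients, x for the predictors) is an assumption made here for illustration, not necessarily the notation used in the readings. The ordinary linear model corresponds to the special case where g is the identity function.

```latex
% Generic GLM: the link function g maps the expected outcome onto a linear predictor
g\big(\mathrm{E}[y_i]\big) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik}

% Binary outcome (logistic regression): the logit link
\log\!\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik},
\qquad p_i = \Pr(y_i = 1)

% Count outcome (Poisson regression): the log link
\log(\lambda_i) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik},
\qquad \lambda_i = \mathrm{E}[y_i]
```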
Readings
Statistics
ROS: Chapters 13-15
Connelly, Roxanne, Vernon Gayle, and Paul S. Lambert. 2016. ‘Statistical Modelling of Key Variables in Social Survey Data Analysis’. Methodological Innovations 9:205979911663800. Library access
Coding
- TSD: Chapter 13
Application
Wu, Cary. 2021. ‘Education and Social Trust in Global Perspective’. Sociological Perspectives 64(6):1166–86. Library access
Dingemans, Ellen, and Erik Van Ingen. 2015. ‘Does Religion Breed Trust? A Cross-National Study of the Effects of Religious Involvement, Religious Faith, and Religious Context on Social Trust’. Journal for the Scientific Study of Religion 54(4):739–55. Library access
Poisson
Elgar, Frank J., Anna Stefaniak, and Michael J. A. Wohl. 2020. ‘The Trouble with Trust: Time-Series Analysis of Social Capital, Income Inequality, and COVID-19 Deaths in 84 Countries’. Social Science & Medicine 263:113365.
Weiss, Alexa, Corinna Michels, Pascal Burgmer, Thomas Mussweiler, Axel Ockenfels, and Wilhelm Hofmann. 2021. ‘Trust in Everyday Life’. Journal of Personality and Social Psychology 121:95–114. doi: 10.1037/pspi0000334 (preprint version available) Library access
Aims
This session introduces binary logistic regression models. These models belong to a broader class of models called generalised linear models, which are applicable when the outcome (“dependent”, “response”, “explained”, etc.) variable cannot be assumed to follow a Gaussian (i.e. “normal”) distribution, but is instead a bounded or discrete measurement (e.g. think of variables whose values cannot be negative - i.e. have a lower limit of 0 - or fall into discrete categories such as “yes”/“no”, “disagree”/“neither agree nor disagree”/“agree”, or “blue”/“green”/“black”/“brown”/“other”). Binary logistic regression is the simplest case, where the outcome can take only two values (therefore “binary”). However, the logic that underpins it is similar to that of other generalised linear models.
By the end of the session you will learn how to:
- Fit and summarise logistic regression models in R
- Interpret results from logistic regression models
- Manipulate the regression output to ease interpretation
- Plot and visualise results from logistic regression models to aid interpretation
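The four steps listed above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the session's own worked example: the variable names `trust` and `education`, the simulated coefficients, and the use of base R's `glm()` (rather than whichever modelling function the course materials adopt) are all assumptions made here.

```r
# Simulated data for illustration only (hypothetical variables, not from the readings)
set.seed(123)
n <- 500
education <- rnorm(n, mean = 12, sd = 3)       # hypothetical predictor: years of education
p_true <- plogis(-3 + 0.25 * education)        # assumed "true" probabilities (inverse logit)
trust <- rbinom(n, size = 1, prob = p_true)    # hypothetical binary outcome (0/1)
dat <- data.frame(trust, education)

# 1. Fit and summarise a binary logistic regression
fit <- glm(trust ~ education, family = binomial(link = "logit"), data = dat)
summary(fit)

# 2. Interpret: coefficients are on the log-odds (logit) scale
coef(fit)

# 3. Manipulate the output to ease interpretation
odds_ratios <- exp(coef(fit))                  # exponentiated coefficients are odds ratios
odds_ratios
coef(fit)["education"] / 4                     # "divide-by-4" rule: approximate upper bound on the
                                               # change in probability per one-unit change in x

# 4. Plot predicted probabilities across the observed range of the predictor
new_dat <- data.frame(
  education = seq(min(dat$education), max(dat$education), length.out = 100)
)
new_dat$p_hat <- predict(fit, newdata = new_dat, type = "response")

plot(dat$education, jitter(dat$trust, amount = 0.03),
     xlab = "Education (years)", ylab = "Pr(trust = 1)",
     pch = 16, col = "grey60")
lines(new_dat$education, new_dat$p_hat, lwd = 2)
```

Setting `type = "response"` in `predict()` returns predictions on the probability scale; the default (`type = "link"`) would return them on the log-odds scale instead.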