SOC2069 Quantitative Methods

Week 6 Worksheet

Learning outcomes

By the end of the session, you should be familiar with:

  • recoding/dichotomising categorical variables in JASP

  • interpreting inferential statistics in regression outputs

  • understanding and describing the uncertainty around statistical results

Intro

In this worksheet we will revisit the linear and logistic regression models created in the previous two workshops and reassess their results, focusing on the uncertainty around the estimated coefficients and the meaning behind them. You will also practise fitting regression models to address your chosen assignment question, and in the process we will learn some further data transformation procedures in JASP.

Exercise 6.1: Return to modelling social trust at national level as a function of inequality and Region

Task 6.1.1: Multiple linear regression

In Exercise 3 on Worksheet 4 you fitted a regression model of social trust as a function of inequality and Region using the trust_inequality.dta dataset. In this exercise, refit that multiple linear regression model and, in addition to what you did in Workshop 4, request Confidence intervals (95%) to be included in the output summary table:

The output Coefficients table should look something like this:

Questions

  • Using the lecture slides and the assigned readings, interpret the results focusing on the meaning of the “Standard Error”, “95% CI” and “p” columns
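
To build some intuition for how the three columns relate, here is a minimal Python sketch (outside JASP) that derives an approximate 95% confidence interval and p-value from a coefficient and its standard error. The numbers are hypothetical and are not taken from the trust_inequality output. The sketch uses a normal approximation, whereas JASP's linear regression reports a t statistic, so JASP's intervals will be slightly wider in small samples.

```python
from statistics import NormalDist

def summarise_coefficient(estimate, std_error, level=0.95):
    """Return (lower, upper, p_value) using a normal approximation."""
    # Critical value: about 1.96 for a 95% interval
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    # Test statistic for the null hypothesis that the coefficient is 0
    z = estimate / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return lower, upper, p_value

# Hypothetical coefficient and standard error, for illustration only
lower, upper, p_value = summarise_coefficient(estimate=-0.80, std_error=0.25)
print(f"95% CI: [{lower:.2f}, {upper:.2f}], p = {p_value:.4f}")
```

Notice that a larger standard error widens the interval and increases the p-value, and that an interval that excludes 0 goes together with p < 0.05.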

Task 6.1.2: Multiple logistic regression

In Exercise 5.1 - Task 5.1.4 (Worksheet 5) you fitted a binary logistic regression model of social trust as a function of inequality and Region, using as the dependent variable a dichotomised version of the trust_pct variable (trust_d) from the trust_inequality dataset. In this exercise, refit that multiple binary logistic regression model and, in addition to what you did in Workshop 5, request Confidence intervals (95%) to be included in the output summary table:

The output Coefficients table should look something like this:

Questions

  • Using the lecture slides and the assigned readings, interpret the results focusing on the meaning of the “Standard Error”, “95% Confidence Interval” and “p” columns
  • Do the coefficients on the Region variable(s) seem plausible, particularly when viewed through their confidence intervals? What do you think might be causing the very large and highly uncertain “effects”?
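
When judging plausibility it can help to move from the log-odds scale to odds ratios by exponentiating the estimate and both ends of its confidence interval. The Python sketch below assumes the estimates are on the log-odds scale; the numbers are hypothetical and are not taken from the trust_inequality output.

```python
import math

def to_odds_ratio(log_odds, ci_lower, ci_upper):
    """Exponentiate a log-odds estimate and its confidence limits."""
    return math.exp(log_odds), math.exp(ci_lower), math.exp(ci_upper)

# Hypothetical estimate with a very wide interval, for illustration only
odds_ratio, or_lower, or_upper = to_odds_ratio(4.0, -2.0, 10.0)
print(f"OR = {odds_ratio:.1f}, 95% CI: [{or_lower:.3f}, {or_upper:.0f}]")
```

An odds ratio interval that stretches from well below 1 to the thousands tells us the data say very little about the size, or even the direction, of the association.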

Tip: You may want to check a simple cross-tabulation (contingency table) of trust_d by Region to develop some intuition in answering the second point above.
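
As a rough illustration of what such a cross-tabulation does, the Python sketch below counts cases in each combination of region and outcome and flags empty cells. The data and the region labels ("Region A" etc.) are made up for the example and do not come from the trust_inequality dataset.

```python
from collections import Counter

# Hypothetical (region, trust_d) pairs; NOT the real trust_inequality data
rows = [
    ("Region A", 1), ("Region A", 1), ("Region A", 0), ("Region A", 1),
    ("Region B", 0), ("Region B", 0), ("Region B", 0),
    ("Region C", 0), ("Region C", 1), ("Region C", 0),
]

def crosstab(pairs, outcomes=(0, 1)):
    """Count cases for every region-by-outcome cell, including empty cells."""
    counts = Counter(pairs)
    regions = sorted({region for region, _ in pairs})
    return {r: {o: counts[(r, o)] for o in outcomes} for r in regions}

table = crosstab(rows)
for region, cells in table.items():
    print(region, cells)

# Cells with no cases at all are worth a closer look
empty_cells = [(r, o) for r, cells in table.items() for o, n in cells.items() if n == 0]
print("Empty cells:", empty_cells)
```

When inspecting the real table in JASP, pay attention to cells with zero or very few cases.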

Exercise 6.2. Recoding categorical variables

In Exercise 5.1 - Task 5.1.1 (Worksheet 5) we dichotomised the numeric scale-type variable trust_pct by cutting the scale in two around its mean value. In practical data analysis you will often need to recode categorical variables into fewer categories than the original scale has. This may happen in an assignment task when the variable chosen as the dependent variable in a planned regression model is multi-categorical (multinomial). There are special statistical models for such dependent variables, but this module does not cover these more advanced methods. Instead, you will want to transform such variables - if logically, conceptually or theoretically possible - into binary/binomial/dichotomous variables (i.e. variables with only two categories).

Such transformations are easy to implement in JASP, and the procedure amounts to manually recoding the values of the categories by giving several categories the same values.
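
The logic of giving several categories the same value can be sketched in a few lines of Python. The five-category agreement scale and the choice of which categories to merge are assumptions made for illustration; they are not variables from the data_transformation dataset, and in JASP you would do this through the recoding interface rather than in code.

```python
# Hypothetical recoding scheme: both "agree" categories become 1, the rest 0
recode_map = {
    "Strongly agree": 1,
    "Agree": 1,
    "Neither agree nor disagree": 0,
    "Disagree": 0,
    "Strongly disagree": 0,
}

def dichotomise(values, mapping):
    """Recode each category via the mapping; unmapped values become None (missing)."""
    return [mapping.get(value) for value in values]

# Hypothetical responses; "Don't know" is deliberately left out of the mapping
responses = ["Agree", "Disagree", "Strongly agree", "Don't know",
             "Neither agree nor disagree"]
recoded = dichotomise(responses, recode_map)
print(recoded)
```

Leaving a category such as "Don't know" out of the mapping treats it as missing, which is one way of handling redundant categories. Where the middle category should go is a conceptual decision that you need to justify.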

In this exercise, you can use the data_transformation dataset available to download from the module website’s Data page to replicate the transformations shown in the two videos below. The downloaded dataset is in .jasp format and can be opened directly in the software by double-clicking on the downloaded file.

The procedure for transforming categorical variables is demonstrated in the following YouTube videos:

Exercise 6.3. Assignment 1 analysis

You will find a detailed guide for Assignment 1 on Canvas. The guide contains detailed advice on all the steps involved in completing the assignment, including on the structure of the final report. In this exercise, you will focus on the analysis component of the assignment which you can then expand or change if your literature review and interpretation of the results necessitate it.

Once you have chosen your preferred research question from among the listed ones and identified the dataset and variables you will want to use to address the question, the guidance advises the following steps for your analysis:

  1. describe the source of the (original) data

  2. explain the motivation behind the variables you have chosen

  3. describe your chosen variables using summary descriptive statistics, visual plots and/or frequency tables, as best suited for the given variable type

  4. visualise or cross-tabulate the simple bi-variate association between your two main variables (i.e. the dependent and main independent variable that relate to the main concepts underpinning the chosen research question; e.g. “trust in the police” and “victimisation” in the case of question E)

  5. transform/clean up your variables for statistical analysis if needed (e.g. dichotomise a categorical dependent variable to make it usable in a logistic regression; remove redundant categories or set them as missing values; reorder variable levels; etc.)

  6. test the association statistically using a simple bi-variate regression model (linear or logistic, as required by your dependent variable) and interpret the results

  7. expand the simple regression model into a multiple regression by adding a small selection of control variables - making sure to explain your choices behind the included variables in step 2 above, and ideally tie your choices to existing literature on the topic of your research question - and interpret these results
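
For those curious about what the simple bi-variate linear regression in step 6 computes, the Python sketch below estimates the intercept, the slope and the slope's standard error by ordinary least squares. The x and y values are invented for illustration and do not come from any of the module datasets; JASP does all of this for you.

```python
from statistics import mean

def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope, se_slope)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares, used to estimate the slope's standard error
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = (rss / (len(x) - 2) / sxx) ** 0.5
    return intercept, slope, se_slope

# Hypothetical data: x could be an inequality score, y a trust percentage
x = [25, 30, 35, 40, 45, 50]
y = [60, 52, 47, 38, 33, 24]
intercept, slope, se_slope = simple_ols(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.3f}, SE(slope) = {se_slope:.3f}")
```

The slope is the expected change in the dependent variable for a one-unit increase in the independent variable, and its standard error is what the confidence interval and p-value are built from.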

You will perform these analysis steps in JASP, following which you will:

  • Write down your interpretation of your results in as much detail as possible, either (or both) in JASP notes and in the main Word/text editor document in which you are writing the report. It’s a good idea to start by saving your notes and interpretations alongside the analysis and outputs directly in JASP to make your work reproducible, then copy them over and adapt them in your main document.

  • Select which outputs (descriptive tables, graphs, statistical output tables) you will want to include in the main text of your report and copy them over from JASP into your Word document/text editor. To copy and paste outputs from JASP, you can click on the small black down-arrows next to the output and select “Copy”, then paste it into your text editor. (Keep in mind that tables will require some manual improvements in Word/text editor to make the content fit the page well).

  • Make sure to give a caption/title to your graphs and tables in your report.