HSS8005

Week 1 — Application

Tools: R, RStudio, Markdown, Quarto and other tools of the trade

Aims

By the end of the session, you will:

  • understand how to use the most important panels in the RStudio interface
  • create an RStudio Project to store your work throughout the course
  • begin using R scripts (.R) and Quarto notebooks (.qmd) to record and document your coding progress
  • understand data types and basic operations in the R language
  • understand the principles behind functions
  • know how to install, load and use functions from user-written packages
  • gain familiarity with some useful functions from packages included in the tidyverse ecosystem

Exercise 1: Install R and RStudio, and perform basic settings

To install R and RStudio on your personal computers, follow the steps outlined here based on your operating system.

Although you will only interact directly with RStudio in this module, R needs to be installed first so that RStudio can detect it and connect to it.

Once installed, open RStudio and explore its panes.

Make the following changes to the RStudio settings in the Global options:

  • set RStudio to never save your workspace as .RData upon exiting;
  • set RStudio to insert the native “pipe operator” (|>) when you press the Ctrl+Shift+M keyboard shortcut.
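
To see what the native pipe does once the shortcut is set up, here is a minimal sketch. It assumes R 4.1 or later, where `|>` was introduced; the object names are purely illustrative.

```r
# The native pipe passes the value on its left as the first argument
# of the function call on its right.
values <- c(1, 4, 9)

# Nested call: read from the inside out
nested <- sum(sqrt(values))

# Piped call: read from left to right
piped <- values |> sqrt() |> sum()

# Both forms give the same result
identical(nested, piped)
```

The piped form becomes more useful as the number of steps grows, because each step sits on its own line in the order in which it is carried out.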

Exercise 2: Create an RStudio Project containing a .R and a .qmd file

  1. Create a new folder set up as an R project; call the folder “HSS8005_labs”; when done, you should have an empty folder with a file called “HSS8005_labs.Rproj” in it
  2. Create a new R script (.R); once created, save it as “Lab_1.R” within the “HSS8005_labs” folder
  3. Create a new Quarto document (.qmd); once created, save it as “Lab_1.qmd” within the “HSS8005_labs” folder

You will work in each of these new documents in this lab to gain experience with them.

Exercise 3: Data-frame operations on built-in datasets

Type/copy the code below to the R script file you created earlier (“Lab_1.R”), and save it at the end for your records.

There are several toy data frames built into R, and we can have a look at one to see what it looks like.

  • Use the data() command to get a list of available built-in datasets;
  • Choose one of the available datasets and import it into the RStudio Environment;
  • Open the dataset in the Viewer to quickly inspect it visually
  • Check the following using the appropriate R functions:
    • How many cases (rows) are in the dataset?
    • How many variables (columns) are in the dataset?
    • What is the type of the first few variables in the dataset?
    • Print the first few and last few entries in the dataset.
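
One possible way to work through the checklist above is sketched below. It uses `mtcars` only as an example choice of dataset (any other built-in data frame would do), and the object names `n_rows`, `n_cols` and `first_types` are illustrative rather than required.

```r
# List the datasets that ship with R
data()

# Load one of them into the Environment; mtcars is just an example choice
data("mtcars")

# Open the data in the RStudio Viewer (only meaningful in an interactive session)
if (interactive()) View(mtcars)

n_rows <- nrow(mtcars)                     # number of cases (rows)
n_cols <- ncol(mtcars)                     # number of variables (columns)
first_types <- sapply(mtcars[1:3], class)  # type of the first three variables

n_rows
n_cols
first_types

head(mtcars)  # first six rows by default
tail(mtcars)  # last six rows by default
```

Functions such as `dim()` and `str()` report several of these details at once and are worth trying as alternatives.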

Exercise 4: Data frame operations in a Quarto document

In this task, let’s start using the other document we created, the .qmd file. This file format allows you to combine longer written text (such as detailed descriptions of your data analysis process or the main body of a report or journal article) with code chunks. To get you started using this file format, read Chapter 3.2. in TSD. Below we will focus only on the code chunks.

Unlike in the R script, in the main body of a Quarto document a # marks a heading level rather than a comment. To insert a code chunk, click on the +C tab in the upper-right corner of the .qmd document’s toolbar, or use the keyboard shortcut Ctrl+Alt+I. Inside a code chunk you write code exactly as you did in the R script (chunks are essentially mini-scripts), so within a code chunk the # still marks a comment.

To execute a command within a code chunk, you can either run each line/selection separately using Ctrl+Enter as in the R script, or you can run the entire content of the chunk with the green right-pointing triangle-arrow in the upper-right corner of the chunk.

Let’s continue doing some operations on the mtcars dataset we looked at earlier, this time using some useful tidyverse functions.

Let’s subset the data frame by selecting certain rows or columns. In the tidyverse, you can do this with the filter() function for selecting rows and the select() function for selecting columns. Here we pipe the selections into head() to show the first few rows. You could also use the dplyr::slice_head() function.

# select(), filter() and slice() come from dplyr, part of the tidyverse
library(dplyr)

mtcars |>
  select(mpg, wt) |>
  head()
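
As a sketch of the `dplyr::slice_head()` alternative mentioned above, the same selection can be written as follows. The value `n = 6` is chosen here to match the default number of rows shown by `head()`, and `first_rows` is an illustrative name.

```r
library(dplyr)

first_rows <- mtcars |>
  select(mpg, wt) |>
  slice_head(n = 6)  # keep the first six rows

first_rows
```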

To select the cars with eight cylinders:

mtcars |>
  filter(cyl == 8)
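
filter() also accepts more than one condition. The sketch below is an illustrative extension, not part of the original exercise: the threshold of 15 miles per gallon and the object names are arbitrary choices.

```r
library(dplyr)

# Conditions separated by commas are combined with AND
thirsty_v8 <- mtcars |>
  filter(cyl == 8, mpg < 15)

# Use | to combine conditions with OR
four_or_six <- mtcars |>
  filter(cyl == 4 | cyl == 6)

thirsty_v8
four_or_six
```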

To select rows by their position, we can use the slice() function. For example, to get the 5th through 10th rows:

mtcars |>
  slice(5:10)

If we pass a vector of integers to the select function, we will get the variables corresponding to those column positions. So to get the first through third columns:

mtcars |>
  select(1:3) |>
  head()
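
Selecting by position is compact, but it depends on the column order staying the same. As a hedged comparison, the sketch below shows that the first three columns of `mtcars` can equally be selected by name or as a named range; the object names are illustrative.

```r
library(dplyr)

# Check which columns sit in positions 1 to 3
names(mtcars)[1:3]

by_position <- mtcars |> select(1:3)
by_name     <- mtcars |> select(mpg, cyl, disp)
by_range    <- mtcars |> select(mpg:disp)  # every column from mpg to disp

# All three approaches return the same columns
all.equal(by_position, by_name)
all.equal(by_name, by_range)
```

Selecting by name is usually easier to read back later, since the code states which variables were intended.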

If you call summary() on a data frame, it applies the vector version of the summary() command to each column:

summary(mtcars)
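
To see what the "vector version" looks like on its own, summary() can be called on a single column. This is a small illustrative sketch; converting `cyl` to a factor is an assumption made here only to show how the output changes for categorical data.

```r
# For a numeric column, summary() reports the minimum, quartiles,
# median, mean and maximum
mpg_summary <- summary(mtcars$mpg)
mpg_summary

# For a factor, summary() reports the number of cases in each level
cyl_counts <- summary(factor(mtcars$cyl))
cyl_counts
```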

These few tasks should be enough to get you started with R and RStudio.

If this was your first encounter with R, you can also complete the R for Social Scientists online training at some point during the week.

From next week we will begin working actively with real data and address specific data management challenges that arise from there.

Those of you who have worked on the advanced user exercise can check some optional solutions below.

Exercise 5: Install and load R packages

Install and load the following R packages:

  • tidyverse
  • easystats
  • gtsummary
  • ggformula

Spend a bit of time reading about these packages on their documentation websites.
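
A minimal sketch of one way to do this is shown below. It assumes an internet connection and a CRAN mirror already configured (RStudio sets one by default); the `pkgs` and `missing_pkgs` names are illustrative.

```r
pkgs <- c("tidyverse", "easystats", "gtsummary", "ggformula")

# Install only the packages that are not already on this computer;
# installation is needed once per computer, not once per session
missing_pkgs <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing_pkgs) > 0) install.packages(missing_pkgs)

# Load the packages; this is needed in every new R session
library(tidyverse)
library(easystats)
library(gtsummary)
library(ggformula)
```

Installing and loading are separate steps: install.packages() downloads a package to your computer, while library() makes its functions available in the current session.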
