Day 4: Functions and objects - Exercises

Intro to R (ESS 2025)

Author: Johannes Karreth

Affiliation: Ursinus College

Published: July 3, 2025

Goals

What you should be able to do on your own after this exercise:

Write somewhat efficient code for iterative tasks
Use the most common object types in R
Extract useful information from a regression object

Setup

To begin this exercise, please quit and re-open RStudio on your computer.

Acknowledgement: Some of this exercise contains materials from the book R for Data Science, the core text for this course. For citations of the R packages used here, run citation("packagename") in R.

1. Look for help online

You’re invited to ask questions while you complete this exercise during our meeting, but I also recommend consulting the main resources for this course. These are:

Lecture slides (link on my website)
R for Data Science, the book
Cookbook for R
R Graphics Cookbook
RStudio cheatsheets
Stack Overflow, which will have answers to pretty much every question I can imagine

2. Create a directory for this exercise

Using your computer’s tools or the “Files” tab on the bottom right in RStudio, create a folder named “Exercise 4” within your course folder. You’ll need to download some data for this exercise. Place it in this folder so you can access it easily.

3. Create an R script

Create an empty R script. Save it as “IntroR_Day4_Exercise.R” in your “Exercise 4” working directory. Within the script, type or copy & paste the following code on the first line to set the working directory to the folder in which the script is located (this line only works when the script is run inside RStudio):

setwd(dirname(rstudioapi::getSourceEditorContext()$path))

4. Load packages

To start your script, load the following packages: “tidyverse”, “tidylog”, “rio”, “nycflights13”, “tmap”, “ggfortify”, “estimatr”, and “modelsummary”. If you haven’t yet, install the package(s) via the package manager in RStudio.¹

¹ “tmap” will take a while to install.
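
The loading step could look like the sketch below. It is one possible approach, not the only one: the helper names `pkgs`, `installed`, and `missing_pkgs` are made up for illustration, and the code only reports packages that are not installed rather than installing them for you.

```r
# Packages named in the exercise, in the order listed there
# (tidylog is loaded after tidyverse so that it can wrap the dplyr verbs)
pkgs <- c("tidyverse", "tidylog", "rio", "nycflights13",
          "tmap", "ggfortify", "estimatr", "modelsummary")

# Check which packages are available on this computer
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

# Report anything that still needs to be installed
if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
}

# Attach every package that is available
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

Writing one `library()` call per package works just as well and is easier to read in a short script.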

5. Writing functions

How would you write this line of code as a function in a clear, intelligible way?

mean(is.na(x))

Write your own function to compute the variance of a numeric vector. The variance is defined as the sum of the squared differences between each value and the mean of the vector, divided by the length of the vector minus 1.
Write a function that takes a vector as input and returns only the last value.
Write a function that takes a vector as input and returns all values except the last.
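
As a starting point, here is a minimal sketch of the first two tasks. The function names `prop_missing` and `my_variance` and the example vector are made up for illustration; other names and layouts are equally valid. Comparing a hand-written function against a built-in one (here `var()`) is a quick way to check that it does what you intend.

```r
# Share of missing values in a vector, with a name that says what it does
prop_missing <- function(x) {
  mean(is.na(x))
}

# Sample variance: sum of squared deviations from the mean,
# divided by the number of values minus 1
my_variance <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)
}

# Example vector (made up for illustration)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

my_variance(x)
all.equal(my_variance(x), var(x))  # compare with R's built-in var()

prop_missing(c(1, NA, 3, NA))
```

This sketch does not handle missing values inside `my_variance()`; adding an `na.rm` argument would be a natural extension.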

6. Iteration and for loops

Write a for loop to compute the median of every column in the mtcars dataset.
Write a for loop to determine the type of each column in nycflights13::flights.
Repeat the previous two exercises, but use the purrr map_*() functions (for example map_dbl() and map_chr()) instead.
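
The sketch below shows the general pattern on a closely related task, the mean (rather than the median) of every column in mtcars, so that you can adapt it to the exercises above. The object names `col_means` and `col_means_map` are made up for illustration.

```r
library(purrr)

# 1. Pre-allocate an output vector with one slot per column
col_means <- vector("double", ncol(mtcars))
names(col_means) <- names(mtcars)

# 2. Loop over the column positions and fill the output one slot at a time
for (i in seq_along(mtcars)) {
  col_means[[i]] <- mean(mtcars[[i]])
}
col_means

# The same task with purrr: map_dbl() applies mean() to each column
# and returns a named double vector
col_means_map <- map_dbl(mtcars, mean)

all.equal(col_means, col_means_map)
```

Pre-allocating the output with `vector()` avoids growing the vector inside the loop, which is the main efficiency point. When the function you apply returns text instead of numbers, `map_chr()` takes the place of `map_dbl()`.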

7. Regression

Return to yesterday’s exercise and load the QoG data into R. Keep only observations from 2019, and retain the following variables: cname, ccodecow, vdem_gender, vdem_polyarchy, wdi_gdpcapcon2015, wdi_lifeexp.
Estimate a regression model, predicting life expectancy with women’s political empowerment and GDP per capita.
Print the regression coefficients, but use the round() function to show only 4 digits after the decimal point.
Obtain the residuals and plot them along with country names to show for which countries the model’s predictions are close to the observed values, and for which they are far.
Now plot the residuals as a shaded world map so that darker colors indicate larger absolute residuals.
Create a standard regression diagnostic plot.
Re-estimate the model, but with HC-1 robust standard errors. Create a visually appealing, publication-ready table in Word, HTML, or PDF format summarizing the results.
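
The steps for extracting information from a regression object can be sketched as follows. This is an illustration under a loud assumption: it uses the built-in mtcars data as a stand-in, not the QoG data, and the object names (`fit`, `coefs`, `res`, `fit_robust`) are made up. The robust-standard-error part only runs if the estimatr package is installed.

```r
# Stand-in model: mtcars ships with R; the QoG variables are NOT used here
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficients, rounded to 4 digits after the decimal point
coefs <- round(coef(fit), 4)
print(coefs)

# Residuals paired with a label for each observation
# (row names here; country names in the exercise)
res <- data.frame(
  label    = rownames(mtcars),
  residual = residuals(fit)
)

# Sort so that the largest absolute residuals (worst fit) come first
res <- res[order(abs(res$residual), decreasing = TRUE), ]
head(res, 3)

# Same model with HC1 robust standard errors, if estimatr is available
if (requireNamespace("estimatr", quietly = TRUE)) {
  fit_robust <- estimatr::lm_robust(mpg ~ wt + hp, data = mtcars,
                                    se_type = "HC1")
  print(summary(fit_robust))
}
```

Two points to keep in mind when adapting this to the QoG data. First, `lm()` drops rows with missing values by default, so the residual vector can be shorter than the data frame; pairing residuals with country names requires using only the rows that entered the model. Second, for the final table, `modelsummary()` accepts a file path in its `output` argument (for example a name ending in `.docx` or `.html`) and writes the table in that format.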