What you should be able to do on your own after this exercise:

  • Write somewhat efficient code for iterative tasks
  • Use the most common object types in R
  • Extract useful information from a regression object


To begin this exercise, please quit and re-open RStudio on your computer.

Acknowledgement: Parts of this exercise draw on material from the book R for Data Science, the core text for this course. For citations of the R packages used here, please refer to citation("packagename").

1. Look for help online

You’re invited to ask questions while you complete this exercise during our meeting, but I also suggest consulting the main resources recommended in our course. These are:

2. Create a directory for this exercise

Using your computer’s tools or the “Files” Tab on the bottom right in RStudio, create a folder “Exercise 4” within your course folder. You’ll need to download some data for this exercise. Place it in this folder so you can access it easily.

3. Create an R script

Create an empty R script. Save it as “IntroR_Day4_Exercise.R” in your “Exercise 4” working directory. Within the script, type or copy & paste the following code in the first line to set the working directory to the same folder in which the script is located:


4. Load packages

To start your script, load the following packages. Install them if necessary.

5. Writing functions

  1. How would you write this line of code as a function in a clear, intelligible way?

  2. Write your own function to compute the variance of a numeric vector. The variance is defined as the sum of the squared differences between each value and the mean of the vector, divided by the length of the vector minus 1.

  3. Write a function that takes a vector as input and returns only the last value.

  4. Write a function that takes a vector as input and returns all values except the last one.
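One possible approach to the three function-writing tasks is sketched below. The function names (my_var, last_value, drop_last) are made up for illustration, and the sketch assumes numeric input without missing values; it is not the only reasonable solution.

```r
# Sample variance: sum of squared deviations from the mean, divided by n - 1.
# Assumes x is numeric, has at least two values, and contains no NAs.
my_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)
}

# Return only the last element of a vector.
last_value <- function(x) {
  x[length(x)]
}

# Return everything except the last element.
# head() with a negative n drops that many elements from the end.
drop_last <- function(x) {
  head(x, -1)
}

x <- c(2, 4, 4, 5, 10)
my_var(x)       # should agree with the built-in var(x)
last_value(x)
drop_last(x)
```

Comparing my_var(x) against var(x) on a few vectors is a quick way to check the hand-written version.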

6. Iteration and for loops

  1. Write a for loop to compute the median of every column in the mtcars dataset.

  2. Write a for loop to determine the type of each column in nycflights13::flights.

  3. Repeat the previous two exercises, but use the map_*() functions from the purrr package instead of for loops.
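The loop pattern behind these tasks (pre-allocate the output, loop over column positions, fill in one result per column) can be sketched as follows. To keep the sketch self-contained, the column-type loop uses the built-in iris data as a stand-in for nycflights13::flights; the same pattern should carry over. The purrr part runs only if purrr is installed.

```r
# Pre-allocate the output vector, then fill it one column at a time.
medians <- vector("double", ncol(mtcars))
names(medians) <- names(mtcars)
for (i in seq_along(mtcars)) {
  medians[[i]] <- median(mtcars[[i]])
}
medians

# Same pattern for column types, shown on iris
# (a stand-in; the exercise asks for nycflights13::flights).
types <- vector("character", ncol(iris))
names(types) <- names(iris)
for (i in seq_along(iris)) {
  types[[i]] <- typeof(iris[[i]])
}
types

# purrr equivalents: one call replaces the whole loop.
if (requireNamespace("purrr", quietly = TRUE)) {
  medians_map <- purrr::map_dbl(mtcars, median)
  types_map <- purrr::map_chr(iris, typeof)
}
```

Pre-allocating the output avoids growing a vector inside the loop, which tends to be slow in R. The map_dbl() and map_chr() variants also guarantee the type of the result.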

7. Regression

  1. Return to yesterday’s exercise and load the QoG data into R. Keep only observations from 2019, and retain the following variables: cname, ccodecow, vdem_gender, vdem_polyarchy, wdi_gdpcapcon2010, wdi_lifeexp.

  2. Estimate a regression model, predicting life expectancy with women’s political empowerment and GDP per capita.

  3. Print the regression coefficients, using the round() function to show only 4 digits after the decimal point.

  4. Obtain the residuals and plot them along with country names to show for which countries the model’s predictions are close to the observed values and for which they are far off.

  5. Create a standard regression diagnostic plot.

  6. Re-estimate the model, but with HC1 robust standard errors.
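The mechanics of steps 2 to 6 can be sketched on a built-in dataset. Since the QoG file is not available here, the sketch uses mtcars as a stand-in, with mpg, hp and wt as hypothetical substitutes for life expectancy, women’s political empowerment and GDP per capita. The HC1 variance is computed by hand in base R to show what the correction does; in practice, sandwich::vcovHC(fit, type = "HC1") together with lmtest::coeftest() is the more common route, assuming those packages are installed.

```r
# Stand-in data: mtcars replaces the QoG extract (an assumption of this sketch).
fit <- lm(mpg ~ hp + wt, data = mtcars)

# Coefficients rounded to 4 decimal places.
round(coef(fit), 4)

# Residuals, named by row (car models here, countries in the exercise).
res <- residuals(fit)
head(sort(abs(res), decreasing = TRUE))  # observations the model fits worst

# HC1 robust variance by hand:
# (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n / (n - k).
X <- model.matrix(fit)
n <- nrow(X)
k <- ncol(X)
bread <- solve(crossprod(X))
meat <- crossprod(X * res)  # multiplies row i of X by residual i, then X'X
vcov_hc1 <- (n / (n - k)) * bread %*% meat %*% bread
se_hc1 <- sqrt(diag(vcov_hc1))

# Point estimates are unchanged; only the standard errors differ.
round(cbind(estimate = coef(fit), robust_se = se_hc1), 4)
```

For step 5, calling plot(fit) on the fitted model produces R's standard set of regression diagnostic plots.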