What you should be able to do on your own after this exercise:
To begin this exercise, please quit and re-open RStudio on your computer.
Acknowledgement: Some of this exercise contains materials
from the book R for Data Science,
the core text for this course. For citations of the R packages used
here, please refer to
You’re invited to ask questions while you complete this exercise during our meeting, but I also suggest consulting the main resources recommended in this course. These are:
Using your computer’s tools or the “Files” Tab on the bottom right in RStudio, create a folder “Exercise 4” within your course folder. You’ll need to download some data for this exercise. Place it in this folder so you can access it easily.
Create an empty R script. Save it as “IntroR_Day4_Exercise.R” in your “Exercise 4” working directory. Within the script, type or copy & paste the following code in the first line to set the working directory to the same folder in which the script is located:
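One common way to set the working directory to the script's own folder in RStudio is sketched below. It assumes the script has been saved, is open in RStudio, and that the rstudioapi package is installed; it may differ from the snippet intended for this step.

```r
# Assumption: running inside RStudio with the 'rstudioapi' package installed.
# Outside RStudio (or for an unsaved script) the working directory is left unchanged.
if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
  script_path <- rstudioapi::getActiveDocumentContext()$path
  if (nzchar(script_path)) {
    setwd(dirname(script_path))
  }
}

# Check which folder R is currently using
getwd()
```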
To start your script, load the following packages. Install them if necessary.
Write your own function to compute the variance of a numeric vector. The variance is defined as the sum of the squared differences between each value and the mean of the vector, divided by the length of the vector minus 1.
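One possible solution is sketched below. The function name `my_var` and the example vector are made up for illustration; the result is compared against R's built-in `var()` as a sanity check.

```r
# Sample variance: sum of squared deviations from the mean, divided by n - 1
my_var <- function(x) {
  stopifnot(is.numeric(x), length(x) > 1)
  sum((x - mean(x))^2) / (length(x) - 1)
}

x <- c(2, 4, 4, 5, 7, 9)  # illustrative data
my_var(x)

# Should agree with the built-in function
all.equal(my_var(x), var(x))
```

Note that this sketch does not handle missing values; a fuller version might add an `na.rm` argument.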
Write a function that takes a vector as input and returns only the last value.
Write a function that takes a vector as input and returns all values except the last.
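A minimal sketch of both functions follows. The names `last_value` and `all_but_last` are illustrative choices, not names prescribed by the exercise.

```r
# Return the final element by indexing with the vector's length
last_value <- function(x) {
  x[length(x)]
}

# Return everything except the final element;
# head() with a negative n drops that many elements from the end
all_but_last <- function(x) {
  head(x, -1)
}

v <- c(3, 1, 4, 1, 5)
last_value(v)     # 5
all_but_last(v)   # 3 1 4 1
```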
Write a for loop to compute the median of every column in the
Write a for loop to determine the type of each column in
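The two loops could look roughly like the sketch below. Because the data set for these tasks is not named here, the built-in `mtcars` data frame is used as a stand-in (an assumption); swap in the data frame the exercise refers to.

```r
df <- mtcars  # stand-in data set (assumption)

# Pre-allocate the output vector, then fill it column by column
medians <- vector("double", ncol(df))
names(medians) <- names(df)
for (i in seq_along(df)) {
  medians[[i]] <- median(df[[i]])
}
medians

# Same pattern, but storing the type of each column as a string
types <- vector("character", ncol(df))
names(types) <- names(df)
for (i in seq_along(df)) {
  types[[i]] <- typeof(df[[i]])
}
types
```

Pre-allocating the output and looping over `seq_along(df)` follows the loop pattern taught in R for Data Science; it avoids growing a vector inside the loop and behaves sensibly for a data frame with zero columns.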
Repeat the previous two exercises, but use the
Return to yesterday’s exercise and load the QoG data into R. Keep
only observations from 2019, and retain the following variables:
Estimate a regression model, predicting life expectancy with women’s political empowerment and GDP per capita.
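The general shape of such a model is sketched below. The QoG variable names are not listed here, so the built-in `mtcars` data serves as a stand-in (an assumption): `mpg` plays the role of the outcome (life expectancy), and `wt` and `hp` play the roles of the two predictors.

```r
# Stand-in example: outcome ~ predictor1 + predictor2
# With the QoG data, replace mtcars and the variable names accordingly.
model <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficients, standard errors, and fit statistics
summary(model)
```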
Print the regression coefficients, but use the round() function to show only 4 digits after the decimal point.
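A short sketch of the rounding step, again using a stand-in model fitted to the built-in `mtcars` data (an assumption, since the QoG variables are not named here):

```r
# Stand-in model; substitute the model estimated from the QoG data
model <- lm(mpg ~ wt + hp, data = mtcars)

# coef() extracts the named vector of estimates; round() trims it to 4 decimals
rounded_coefs <- round(coef(model), 4)
rounded_coefs
```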
Obtain the residuals and plot them along with country names to show for which countries the model’s predictions are close to the observed values, and for which they are far.
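One way to approach this is sketched below with base R graphics. It uses the built-in `mtcars` data as a stand-in (an assumption), with car names taking the place of country names; a ggplot2 version would work equally well.

```r
# Stand-in model; with the QoG data, the labels would be country names
model <- lm(mpg ~ wt + hp, data = mtcars)

# Pair each residual with its label and sort from most negative to most positive
res <- data.frame(name = rownames(mtcars), residual = resid(model))
res <- res[order(res$residual), ]

# Dot chart: points far from the dashed zero line are poorly predicted cases
dotchart(res$residual, labels = res$name,
         xlab = "Residual (observed - predicted)", cex = 0.7)
abline(v = 0, lty = 2)
```

Sorting by residual makes it easy to read off the cases the model over-predicts (negative residuals) and under-predicts (positive residuals).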
Create a standard regression diagnostic plot.
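In base R, calling `plot()` on a fitted `lm` object produces the usual set of diagnostic plots. The sketch below uses a stand-in model on the built-in `mtcars` data (an assumption).

```r
# Stand-in model; substitute the model estimated from the QoG data
model <- lm(mpg ~ wt + hp, data = mtcars)

# Arrange the four default diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(model)
par(old_par)  # restore the previous plotting layout
```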
Re-estimate the model, but with HC-1 robust standard errors
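To make clear what the HC1 correction does, the sketch below computes it by hand in base R for a stand-in model on the built-in `mtcars` data (an assumption). In practice, a package route such as `sandwich::vcovHC(model, type = "HC1")` combined with `lmtest::coeftest()` is a common choice, provided those packages are installed; the exercise may intend a different tool.

```r
# Stand-in model; substitute the model estimated from the QoG data
model <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(model)  # design matrix, including the intercept column
e <- resid(model)         # OLS residuals
n <- nrow(X)
k <- ncol(X)

# Sandwich estimator: (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)  # each row of X is scaled by its residual

# HC1 applies the degrees-of-freedom factor n / (n - k)
vcov_hc1  <- (n / (n - k)) * bread %*% meat %*% bread
robust_se <- sqrt(diag(vcov_hc1))

# Point estimates are unchanged; only the standard errors differ
cbind(estimate  = coef(model),
      ols_se    = sqrt(diag(vcov(model))),
      robust_se = robust_se)
```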