The purpose of this tutorial is to show a complete workflow for estimating Bayesian models in R using the rstanarm package (Goodrich et al. 2019) as an interface to Stan and rstan (Stan Development Team 2019), as shown throughout this short course There is also a PDF version of this tutorial as well as an R script containing all code.

If you are new to R, please refer to the tutorial for Lab 1 first.
For suggestions for model presentation, processing MCMC output, or using Stan directly, stay tuned for the remainder of this course.

If you find any errors in this document or have suggestions for improvement, please email me.

How to use this tutorial

This tutorial focuses on using rstanarm for fitting Bayesian models via R. rstanarm is, in the words of its authors, “an R package that emulates other R model-fitting functions but uses Stan (via the rstan package) for the back-end estimation. The primary target audience is people who would be open to Bayesian inference if using Bayesian software were easier but would use frequentist software otherwise.”

There are other options (foremost using Stan directly, but also JAGS and WinBUGS) for fitting Bayesian models that we will briefly discuss during the workshop. The brms package is similar in its ease of use and particularly useful for multilevel models. You can find more information about all these packages at the on the website for my ICPSR workshop. Some sections on that website are relevant for Mac or Windows users only as indicated.

In this tutorial, R code that you would enter in your script file or in the command line shows up on shaded background, with output following after ##:

1 + 1

## [1] 2

Note that copying and pasting code from the PDF version of this tutorial may lead to errors when trying to execute code; please copy code from the HTML version or the R script underlying the tutorial.

What is not in this tutorial

This tutorial does not address the following topics, which will instead show up in future labs:

Convergence diagnostics ($\rightarrow$ Lab 3)
Processing and working with MCMC output ($\rightarrow$ Lab 4)
Writing a customized model directly in Stan ($\rightarrow$ Lab 5)
Model presentation ($\rightarrow$ Lab 6)

Using R as frontend

A convenient way to fit Bayesian models using Stan (or JAGS or WinBUGS or OpenBUGS) is to use R packages that function as frontends for Stan. These packages make it easy to do all your Bayesian data analysis in R, including:

importing and preparing the data
writing the empirical model
estimate the model using MCMC
process the output of Bayesian models
present output in publication-ready form

rstanarm allows you to estimate Bayesian models almost without any additional steps compared to estimating standard frequentist models in R.

Installing Stan and rstanarm

rstanarm relies on Stan/rstan for estimation, so you need to install rstan first. To do this, go to http://mc-stan.org/rstan/ and follow the instructions under RStan: Getting Started. Choose your operating system and be sure to install the additional tools needed before actually installing rstan.

Next, you can install rstanarm like any other R package:

install.packages("rstanarm")

Note that rstanarm is under active development, so be sure to (a) explicitly specify arguments and (b) take note which version of rstanarm you are using; packageVersion("rstanarm") will show this. At the time of writing this tutorial, I have version 2.21.1 installed. If needed, you can always reproduce findings at a later point by reverting to that version using the devtools::install_version command:

install_version("rstanarm", version = "2.21.1", repos = "http://cran.us.r-project.org")

Using rstanarm

The vignette for the rstanarm package is exceptionally detailed and well structured, so there is no need to reproduce it here. Please read it. Below, I show what I consider the key steps for fitting a Bayesian linear regression model and producing output and convergence diagnostics.

Preparing the data and model

rstanarm uses rectangular data frames as input, as do most other modeling functions in R (but not other Bayesian tools such as Stan or JAGS). For an example dataset, we simulate our own data in R. For this tutorial, we aim to fit a linear model, so we create a continuous outcome variable $y$ as a function of two predictors $x_1$ and $x_2$ and a disturbance term $e$. We simulate a dataset with 100 observations.

First, we create the predictors:

n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 2)

Next, we create the outcome $y$ based on coefficients $b_1$ and $b_2$ for the respective predictors and an intercept $a$:

b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e

Now, we combine the variables into one data frame for processing later:

sim.dat <- data.frame(y, x1, x2)

And we create and summarize a (frequentist) linear model (lm)¹ fit on these data:

freq.mod <- lm(y ~ x1 + x2, 
               data = sim.dat)
summary(freq.mod)

## 
## Call:
## lm(formula = y ~ x1 + x2, data = sim.dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6865 -1.3595 -0.2224  1.0733  6.4608 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.1990     0.5762   3.816 0.000239 ***
## x1            1.0702     0.1032  10.373  < 2e-16 ***
## x2           -3.0872     0.4130  -7.475 3.44e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.873 on 97 degrees of freedom
## Multiple R-squared:  0.6223, Adjusted R-squared:  0.6145 
## F-statistic:  79.9 on 2 and 97 DF,  p-value: < 2.2e-16

Fitting the Bayesian model with all default arguments

Now, we create a Bayesian linear model using rstanarm. Conveniently, the relevant function is simply called stan_glm() for generalized linear model.² This function allows fitting the model by specifying the prior distribution for coefficient estimates; all other arguments have default values (that we will modify later). You can find more about how to specify priors under ?rstanarm::priors. For this first take, we choose normal distributions as priors for the coefficients and choose a mean of 0 and standard deviation of 10 for the parameters of the prior distribution. For consistency reason, the prior() command uses location and scale as arguments (instead of mean and sd), just as it does for other distributions.

library("rstanarm")
bayes.mod <- stan_glm(y ~ x1 + x2, 
                     data = sim.dat,
                     family = "gaussian",
                     prior = normal(location = 0, scale = 10))

## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 7.9e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.79 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 0.041769 seconds (Warm-up)
## Chain 1:                0.04067 seconds (Sampling)
## Chain 1:                0.082439 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 1.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 0.036194 seconds (Warm-up)
## Chain 2:                0.036545 seconds (Sampling)
## Chain 2:                0.072739 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 1.5e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 0.037482 seconds (Warm-up)
## Chain 3:                0.036429 seconds (Sampling)
## Chain 3:                0.073911 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 1e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 0.038026 seconds (Warm-up)
## Chain 4:                0.038442 seconds (Sampling)
## Chain 4:                0.076468 seconds (Total)
## Chain 4:

summary(bayes.mod)

## 
## Model Info:
##  function:     stan_glm
##  family:       gaussian [identity]
##  formula:      y ~ x1 + x2
##  algorithm:    sampling
##  sample:       4000 (posterior sample size)
##  priors:       see help('prior_summary')
##  observations: 100
##  predictors:   3
## 
## Estimates:
##               mean   sd   10%   50%   90%
## (Intercept)  2.2    0.6  1.5   2.2   2.9 
## x1           1.1    0.1  0.9   1.1   1.2 
## x2          -3.1    0.4 -3.6  -3.1  -2.5 
## sigma        1.9    0.1  1.7   1.9   2.1 
## 
## Fit Diagnostics:
##            mean   sd   10%   50%   90%
## mean_PPD 6.8    0.3  6.5   6.9   7.2  
## 
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
## 
## MCMC diagnostics
##               mcse Rhat n_eff
## (Intercept)   0.0  1.0  4208 
## x1            0.0  1.0  4188 
## x2            0.0  1.0  5090 
## sigma         0.0  1.0  4407 
## mean_PPD      0.0  1.0  4065 
## log-posterior 0.0  1.0  1914 
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

You have now estimated your first Bayesian regression model! Note some more information in this output:

The posterior distribution contains 4000 draws - four chains with 1000 draws each.
The regression output contains quantiles of the posterior distribution for each parameter.
In addition to coefficients, the output also contains the posterior distribution for sigma (the (estimated) standard deviation of the outcome variable y) and mean_PPD (the mean of the Posterior Predictive Distribution of the outcome, here y; compare to mean(sim.dat$y)).
Beneath the summary of the posterior distribution, stan_glm provides basic convergence diagnostics - more on these later, in Lab 3.

In addition to this barebones call for stan_glm, you can (and should) explicitly specify a few more arguments for full transparency and reproducibility. A call to ?stan_glm and ?stan will reveal some of the additional arguments that you can specify in addition to formula, data, family, and prior as above. Here are the most important ones:

prior_intercept: a separate prior distribution for the intercept. This is warranted if the outcome variable’s scale will lead to an intercept with a (vastly) different scale than regression coefficients. For instance, when estimating a model with body height of babies in cm as an outcome variable and age (in days) as a predictor, the intercept will around a normal birth height and the coefficient will be around the additional height per day added in a baby’s life, probably somewhere just above 0. In this case, you would want to specify a different prior for the intercept than normal(0, 10) in order to avoid artificially pulling the intercept closer to 0.
prior: the prior distribution for the other regression coefficient(s). You can either specify the same prior for all coefficients (e.g. prior = normal(location = 0, scale = 2.5)) or separate priors for each coefficient by providing a vector for location and scale: prior - normal(location = c(0, 0), scale = c(2.5, 10))
chains is the number of Markov chains to be used; the default is currently 4.
iter is the number of iterations for each chain (including warmup). The default is currently 2000.
warmup is the number of warmup (aka burnin) iterations per chain. The default is currently 50% of iter, i.e. 1000 if iter is set to its default.
thin specifies the which iterations should be saved; 1 (the default) saves each iteration, 5 the draw from every 5th iteration, etc.
init specifies the starting values.
- The default is set to use random starting values, generated by Stan: init = "random". This is not always preferable, especially in more complex models.
- Instead, you can give specific starting values for each parameter by providing a list:
```
function() {list(x1 = 1, x2 = 1)}
```
or by providing the same starting value for all parameters: init = 1.
seed is the seed for random number generation; providing this explicitly is useful for reproducible estimates.
cores allows you to use multiple CPU cores on your computer, for parallel processing and therefore faster estimation.

With this information, you can re-specify the linear model above with explicit arguments:

bayes.mod <- stan_glm(y ~ x1 + x2, 
                     data = sim.dat,
                     family = "gaussian",
                     prior = normal(location = 0, scale = 5),
                     prior_intercept = normal(location = 0, scale = 10),
                     chains = 4,
                     iter = 5000,
                     warmup = 2500,
                     thin = 1,
                     init = 1,
                     seed = 123,
                     cores = 4)
summary(bayes.mod)

## 
## Model Info:
##  function:     stan_glm
##  family:       gaussian [identity]
##  formula:      y ~ x1 + x2
##  algorithm:    sampling
##  sample:       10000 (posterior sample size)
##  priors:       see help('prior_summary')
##  observations: 100
##  predictors:   3
## 
## Estimates:
##               mean   sd   10%   50%   90%
## (Intercept)  2.2    0.6  1.4   2.2   2.9 
## x1           1.1    0.1  0.9   1.1   1.2 
## x2          -3.1    0.4 -3.6  -3.1  -2.5 
## sigma        1.9    0.1  1.7   1.9   2.1 
## 
## Fit Diagnostics:
##            mean   sd   10%   50%   90%
## mean_PPD 6.8    0.3  6.5   6.8   7.2  
## 
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
## 
## MCMC diagnostics
##               mcse Rhat n_eff
## (Intercept)   0.0  1.0  11258
## x1            0.0  1.0  11416
## x2            0.0  1.0  13018
## sigma         0.0  1.0  11876
## mean_PPD      0.0  1.0  10141
## log-posterior 0.0  1.0   4382
## 
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

Creating a regression table

To summarize the key results from the model, you will usually create a table showing estimates for each parameter and a measure of uncertainty around them. With a few colleagues, I’ve built the R package “BayesPostEst” that contains two functions doing just that: creating a regression table for you.³ You have to install the package first, then load it:

install.packages("BayesPostEst")

library("BayesPostEst")

Now, you can use mcmcTab() to create your summary table:

mcmcTab(bayes.mod)

##      Variable Median    SD  Lower  Upper
## 1 (Intercept)  2.186 0.595  1.026  3.366
## 2          x1  1.072 0.104  0.862  1.277
## 3          x2 -3.062 0.426 -3.904 -2.229
## 4       sigma  1.886 0.135  1.646  2.174

You can also select specific parameters of interest only:

mcmcTab(bayes.mod, pars = c("x1", "x2"))

##   Variable Median    SD  Lower  Upper
## 1       x1  1.072 0.104  0.862  1.277
## 2       x2 -3.062 0.426 -3.904 -2.229

If you wish to print only some of the columns, you can select these, too:

mcmcTab(bayes.mod, pars = c("x1", "x2"))[, c("Median", "SD")]

##   Median    SD
## 1  1.072 0.104
## 2 -3.062 0.426

And you can export this table to RMarkdown using the knitr::kable() function:

bayes.tab <- mcmcTab(bayes.mod, pars = c("x1", "x2"))
library("knitr")
kable(bayes.tab, 
      digits = 2,
      caption = "Summary of posterior estimates.",
      col.names = c("Variable", "Median", "Std. Dev.", "Lower 95% CI", "Upper 95% CI"))

Summary of posterior estimates.
Variable	Median	Std. Dev.	Lower 95% CI	Upper 95% CI
x1	1.07	0.10	0.86	1.28
x2	-3.06	0.43	-3.90	-2.23

Alternatively, you can create a Word version using flextable::flextable. Install the flextable and officer packages first. The flextable vignette offers guidance on customization options. The following command will create the Word document bayes_m1_table.docx in your working directory:

library("flextable")
library("officer")
bayes.ft <- flextable(bayes.tab)
doc <- read_docx()
doc <- body_add_flextable(doc, value = bayes.ft)
print(doc, target = "bayes_m1_table.docx")

You can also export your table to LaTeX using the the xtable package. The following command will create the LaTeX table bayes_m1_table.tex in your working directory:

library("xtable")
bayes.xt <- xtable(bayes.tab)
print(bayes.xt, file = "bayes_m1_table.tex")

An alternative that’s more useful for displaying multiple models and conforming to journal style requirements is the mcmcReg() function, also from the BayesPostEst package. It works similar to the texreg() function from the package with the same name:

mcmcReg(mod = bayes.mod)

## 
## \begin{table}
## \begin{center}
## \begin{tabular}{l c}
## \hline
##  & Model 1 \\
## \hline
## (Intercept) & $2.19^{*}$        \\
##             & $ [ 1.03;  3.37]$ \\
## x1          & $1.07^{*}$        \\
##             & $ [ 0.86;  1.28]$ \\
## x2          & $-3.06^{*}$       \\
##             & $ [-3.90; -2.23]$ \\
## sigma       & $1.89^{*}$        \\
##             & $ [ 1.65;  2.17]$ \\
## \hline
## \multicolumn{2}{l}{\scriptsize{$^*$ 0 outside 95\% credible interval.}}
## \end{tabular}
## \caption{Statistical models}
## \label{table:coefficients}
## \end{center}
## \end{table}

mcmcReg() produces a table for use in a LaTeX file by default, but can also write an HTML file that can be opened and processed in Word and similar word processors.

mcmcReg(mod = bayes.mod, format = "html", 
        file = "bayes_m1_table2.docx")

For the many ways to customize a call to mcmcReg(), see the help file. This function is under active development and will likely see some enhancements soon. Leave us feature requests on Github - just open an issue!⁴

Visualizing results directly from the rstanarm object

rstanarm and rstan have a convenient suite of commands to summarize and visualize the results from a fitted model. You already saw the results from summary() above. In addition, you can plot a variety of quantities using plot (see ?plot.stanreg for more information).

A call to plot without arguments will produce a coefficient dotplot for all parameters:

plot(bayes.mod)

You can restrict the plot to parameters of interest using the pars argument:

plot(bayes.mod, 
     pars = c("x1", "x2"))

You can use the plotfun argument to produce other plots. My first step would be to look at a trace plot of the chains to get an quick look at convergence:

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_trace")

In addition, you can produce a multitude of plots to visualize the posterior distribution, including:

Autocorrelation plots for each parameter

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_acf")

Histograms

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_hist")

Density plots by parameter

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_dens")

Density plots distinguishing between chains

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_dens_overlay")

Violin plots distinguishing between chains

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_violin")

Density plots added to a coefficient dot plot setup

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_areas_ridges")

Density plots added to a coefficient dot plot setup, with credible intervals highlighted

plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_areas")

As with any use of ggplot2, you can create an object and then save the plot to your folder:

library("ggplot2")
p <- plot(bayes.mod, 
     pars = c("x1", "x2"),
     plotfun = "mcmc_areas_ridges")
ggsave(p, file = "bayes_m1_ridgeplot.pdf", width = 5, height = 2.5)

And because the rstan/rstanarm plotting function is based on ggplot2, you can customize the appearance of the plots:

plot(bayes.mod, 
     pars = c("x1", "x2")) + 
  theme_bw() + 
  xlab("Posterior estimate") + 
  labs(title = "Regression results", caption = "N = 100; estimates based on 4 chains with 2500 draws each.")

Visualizing the posterior distribution using shinystan

The shinystan package (Stan Development Team 2017) offers a convenient way to inspect the posterior distribution(s) of parameters in your web browser, using the R package shiny as a backend. This is a great tool for quick model checking, even though you will want to use the commands shown above to produce plots for a manuscript and/or appendix.

shinystan is installed automatically when you install rstanarm. To use it, simply call launch_shinystan on the model you fit with rstanarm:

launch_shinystan(bayes.mod)

After a few seconds, your browser should show the shinystan interface:

ShinyStan start-up screen

Click on any of the buttons DIAGNOSE, ESTIMATE, EXPLORE, or HELP to proceed.

References

Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2019. Rstanarm: Bayesian Applied Regression Modeling via Stan. http://mc-stan.org/.

Stan Development Team. 2017. Shinystan: Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models. http://mc-stan.org/.

———. 2019. RStan: The R Interface to Stan. http://mc-stan.org/.

This is equivalent to fitting a generalized linear model using glm with the Gaussian (normal) link function; see more below.↩︎
The results are equivalent to using the stan_lm() function, but stan_lm() only allows specifiying a prior distribution for the $R^2$ of the regression. For consistency with the course, where we begin with priors for parameters (coefficients), I prefer using stan_glm so we can specify priors for parameters.↩︎
For more information, see the package website or the companion article in the Journal of Open Source Software.↩︎
You can find the Github site for opening issues here: https://github.com/ShanaScogin/BayesPostEst/issues.↩︎

Lab 2: Using Stan via R

Johannes Karreth (Ursinus College)

Applied Bayesian Modeling (ICPSR Summer Program 2021)