The purpose of this tutorial is to show a complete workflow for estimating Bayesian models in R using JAGS or WinBUGS/OpenBUGS, as shown throughout this workshop. There is also a PDF version of this tutorial as well as an R script containing all code.

  • If you are new to R, please refer to the tutorial for Lab 1 first.
  • For suggestions for model presentation, processing MCMC output, or using Stan, please visit the subsequent labs during this workshop.

If you find any errors in this document or have suggestions for improvement, please email me.

How to use this tutorial

This tutorial focuses on using JAGS and WinBUGS/OpenBUGS for fitting Bayesian models via R. There are other options (foremost Stan) for fitting Bayesian models that we will briefly discuss during the workshop. You can find more information about them at the end of this tutorial and on my website. Some sections are relevant for Mac or Windows users only as indicated.

In this tutorial, R code that you would enter in your script file or at the command line appears on a shaded background, with output following after ##:

1 + 1
## [1] 2

Note that copying and pasting code from the PDF version of this tutorial may lead to errors when trying to execute code; please copy code from the HTML version or the R script underlying the tutorial.

Using R as frontend

A convenient way to fit Bayesian models using JAGS (or WinBUGS or OpenBUGS) is to use R packages that function as frontends for JAGS. These packages make it easy to do all your Bayesian data analysis in R, including:

  • importing and preparing the data
  • writing the empirical model
  • estimating the model using MCMC
  • processing the output of Bayesian models
  • presenting output in publication-ready form

In this tutorial, I focus on the R2jags and R2WinBUGS/R2OpenBUGS packages that you encounter in Gelman and Hill (2007), as well as a few other options.

What are JAGS, R2jags, …?

  • JAGS (Plummer 2011) is Just Another Gibbs Sampler that was mainly written by Martyn Plummer in order to provide a BUGS engine for Unix. It is a free, open-source program. More information can be found in the excellent JAGS manual at http://sourceforge.net/projects/mcmc-jags/.
  • WinBUGS (D. J. Spiegelhalter et al. 2003) is a Windows-only program for Bayesian estimation with a graphical user interface. It has been a very popular option for Bayesian modeling in the past 10–15 years. WinBUGS is available for free at http://www.mrc-bsu.cam.ac.uk/software/bugs/, but it is not under active development anymore.
  • OpenBUGS (D. Lunn et al. 2009) is an open-source and free version of WinBUGS. The frontend and most practical aspects of the program are virtually identical with WinBUGS. OpenBUGS is available as a Windows program at http://www.openbugs.net. Downloading OpenBUGS will give you access to all functionality of WinBUGS as far as our workshop is concerned.
  • R2jags (Su and Yajima 2012) is an R package that allows fitting JAGS models from within R. Almost all examples in Gelman and Hill’s Data Analysis Using Regression and Multilevel/Hierarchical Models (Gelman and Hill 2007) can be worked through equivalently in JAGS, using R2jags.
  • R2WinBUGS/R2OpenBUGS (Sturtz, Ligges, and Gelman 2005) are R packages similar to R2jags that allow controlling WinBUGS/OpenBUGS from within R. Almost all examples in Data Analysis Using Regression and Multilevel/Hierarchical Models (Gelman and Hill 2007) can be worked through equivalently in WinBUGS or OpenBUGS, using these packages.
  • rjags (Plummer 2013) is another R package that allows fitting JAGS models from within R. R2jags depends on it. Bayesian Analysis for the Social Sciences (Jackman 2009) provides many examples using rjags, and so does John Kruschke’s Doing Bayesian Data Analysis (Kruschke 2014).
  • runjags (Denwood 2016) allows some additional functionalities, including parallel computing.

Installing JAGS and R2jags

  • Follow the instructions at http://www.jkarreth.net/bayes-icpsr.html.
    • Windows users skip step 3.
  • You should now be able to run the following code in R, taken directly from the help file for the jags function:
library("R2jags")
## Loading required package: rjags
## Loading required package: coda
## Linked to JAGS 4.3.0
## Loaded modules: basemod,bugs
## 
## Attaching package: 'R2jags'
## The following object is masked from 'package:coda':
## 
##     traceplot
# An example model file is given in:
model.file <- system.file(package = "R2jags", "model", "schools.txt")
# data
J <- 8
y <- c(28.4, 7.9, -2.8, 6.8, -0.6, 0.6, 18.0, 12.2)
sd <- c(14.9, 10.2, 16.3, 11.0, 9.4, 11.4, 10.4, 17.6)
jags.data <- list("y", "sd", "J")
jags.params <- c("mu", "sigma", "theta")
jags.inits <- function() {
  list("mu" = rnorm(n = 1), "sigma" = runif(1), "theta" = rnorm(n = J))
}
# Fit the model
jagsfit <- jags(data = jags.data, inits = jags.inits,
                parameters.to.save = jags.params, n.iter = 10,
                model.file = model.file)
## module glm loaded
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 8
##    Unobserved stochastic nodes: 10
##    Total graph size: 41
## 
## Initializing model

This should take less than a second and you should see the output above in your R console.

Fitting Bayesian models using R2jags

The purpose of R2jags is to allow fitting JAGS models from within R, and to analyze convergence and perform other diagnostics right within R. A typical sequence1 of using R2jags could look like this:

Preparing the data and model

For an example dataset, we simulate our own data in R. For this tutorial, we aim to fit a linear model, so we create a continuous outcome variable \(y\) as a function of two predictors \(x_1\) and \(x_2\) and a disturbance term \(e\). We simulate a dataset with 100 observations.

First, we create the predictors:

n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)

Next, we create the outcome \(y\) based on coefficients \(b_1\) and \(b_2\) for the respective predictors and an intercept \(a\):

b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e

Now, we combine the variables into one dataframe for processing later:

sim.dat <- data.frame(y, x1, x2)

And we create and summarize a (frequentist) linear model fit on these data:

freq.mod <- lm(y ~ x1 + x2, data = sim.dat)
summary(freq.mod)
## 
## Call:
## lm(formula = y ~ x1 + x2, data = sim.dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3432 -0.6797 -0.1112  0.5367  3.2304 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.84949    0.28810    6.42 5.04e-09 ***
## x1           1.13511    0.05158   22.00  < 2e-16 ***
## x2          -3.09361    0.20650  -14.98  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9367 on 97 degrees of freedom
## Multiple R-squared:  0.8772, Adjusted R-squared:  0.8747 
## F-statistic: 346.5 on 2 and 97 DF,  p-value: < 2.2e-16

Now, we write a model for JAGS and save it as the text file "bayes.mod" in your working directory. (Do not paste this model straight into R yet.) You can set your working directory via:

setwd("/Users/johanneskarreth/Documents/Dropbox/Uni/9 - ICPSR/2019/Applied Bayes/Course materials/Labs/3 and 4 - JAGS-BUGS")

Be sure to provide the file path to your own working directory here when using this code.

The model looks just like the JAGS models shown throughout this course:

model {
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}

alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}

Instead of saving the model in your WD, you can also enter it in your R script:

bayes.mod <- function() {
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}
alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}

Now define the vectors of the data matrix for JAGS:

y <- sim.dat$y
x1 <- sim.dat$x1
x2 <- sim.dat$x2
N <- nrow(sim.dat)

Read in the data frame for JAGS:

sim.dat.jags <- list("y", "x1", "x2", "N")

(You could also do this more conveniently using the as.list() command on your data frame:)

sim.dat.jags <- as.list(sim.dat)

Note, though, that you will also need to specify any other variables not in the data, like in this case \(N\). So here, you would need to add:

sim.dat.jags$N <- nrow(sim.dat)

Define the parameters whose posterior distributions you are interested in summarizing later:

bayes.mod.params <- c("alpha", "beta1", "beta2")

Now, we need to define the starting values for JAGS. Per Gelman and Hill (2007, 370), you can use a function to do this. This function creates a list that contains one element for each parameter. Each parameter then gets assigned a random draw from a normal distribution as a starting value. This random draw is created using the rnorm function. The first argument of this function is the number of draws. If your parameters are not indexed in the model code, this argument will be 1. If your jags command below then specifies more than one chain, each chain will start at a different random value for each parameter.

bayes.mod.inits <- function(){
list("alpha" = rnorm(n = 1), "beta1" = rnorm(n = 1), "beta2" = rnorm(n = 1))
}

Alternatively, if you want to have control over which starting values are chosen, you can provide specific separate starting values for each chain:

inits1 <- list("alpha" = 0, "beta1" = 0, "beta2" = 0)
inits2 <- list("alpha" = 1, "beta1" = 1, "beta2" = 1)
inits3 <- list("alpha" = -1, "beta1" = -1, "beta2" = -1)
bayes.mod.inits <- list(inits1, inits2, inits3)

Note: Here, I did not specify a starting value for the node tau. This will lead JAGS (or BUGS) to generate a random number as a starting value for tau. In general, any node for which you do not explicitly generate starting values will receive a random starting value. This is not a problem computationally, but undesirable from a reproducibility perspective. (More on this later in this workshop.)
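If you want the starting points to be fully reproducible, one option (a sketch, extending the lists above; the specific values for tau are my own choice) is to add an explicit starting value for tau to each chain's list. Because tau is a precision with a dgamma prior, its starting value must be positive:

```r
# Explicit starting values for every stochastic node, including tau;
# tau is a precision and must be positive under its dgamma prior.
inits1 <- list("alpha" = 0, "beta1" = 0, "beta2" = 0, "tau" = 1)
inits2 <- list("alpha" = 1, "beta1" = 1, "beta2" = 1, "tau" = 0.5)
inits3 <- list("alpha" = -1, "beta1" = -1, "beta2" = -1, "tau" = 2)
bayes.mod.inits <- list(inits1, inits2, inits3)
```

With every node initialized explicitly, JAGS no longer has to draw any starting values at random.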

Before using R2jags the first time, you need to load the package, and you might need to set a random number seed. To do this, type

library("R2jags")
set.seed(123)

directly in R or in your R script. You can choose any number here. Setting a random seed before fitting a model is also good practice for making your estimates replicable. We will discuss replication in more detail in Weeks 3-4.

Fitting the model

Fit the model in JAGS. I’m giving the fit object the unwieldy name bayes.mod.fit.R2jags here to distinguish it from other objects later. When you work through this code, be sure the working directory is set to the folder in which your own bayes.mod is located.

bayes.mod.fit.R2jags <- jags(data = sim.dat.jags, inits = bayes.mod.inits,
parameters.to.save = bayes.mod.params, n.chains = 3, n.iter = 9000,
n.burnin = 1000,
model.file = "bayes.mod")
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 100
##    Unobserved stochastic nodes: 4
##    Total graph size: 511
## 
## Initializing model

Note: If you use as your model file the function you gave directly to R above, then remove the quotation marks:

bayes.mod.fit.R2jags <- jags(data = sim.dat.jags, inits = bayes.mod.inits,
parameters.to.save = bayes.mod.params, n.chains = 3, n.iter = 9000,
n.burnin = 1000, model.file = bayes.mod)
## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 100
##    Unobserved stochastic nodes: 4
##    Total graph size: 511
## 
## Initializing model

Update your model if necessary — e.g. if there is no/little convergence:

bayes.mod.fit.R2jags.upd <- update(bayes.mod.fit.R2jags, n.iter = 1000)
bayes.mod.fit.R2jags.upd <- autojags(bayes.mod.fit.R2jags)

This function will auto-update until convergence (as defined by the \(\hat{R}\) diagnostic approaching 1).
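You can also inspect \(\hat{R}\) yourself: R2jags stores a summary matrix inside the fit object, with one Rhat entry per monitored parameter. A sketch, assuming the bayes.mod.fit.R2jags object from above:

```r
# The BUGSoutput slot of an R2jags fit holds a summary matrix;
# its "Rhat" column contains the potential scale reduction factors.
rhats <- bayes.mod.fit.R2jags$BUGSoutput$summary[, "Rhat"]
# List any parameters whose chains may not have converged yet
# (values near 1 indicate convergence; 1.1 is a common informal cutoff):
rhats[rhats > 1.1]
```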

Results

Running JAGS (or BUGS) from R offers seamless, and therefore quick, access to results after fitting a model. See for yourself:

print(bayes.mod.fit.R2jags)
## Inference for Bugs model at "/var/folders/v2/m7yb7wr53bb5vb7273jbg8lh0000gn/T//RtmpNFGT2U/modelb2472c3c0ea.txt", fit using jags,
##  3 chains, each with 9000 iterations (first 1000 discarded), n.thin = 8
##  n.sims = 3000 iterations saved
##          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat
## alpha      1.846   0.292   1.279   1.650   1.849   2.046   2.409 1.001
## beta1      1.136   0.052   1.038   1.099   1.137   1.171   1.238 1.002
## beta2     -3.095   0.206  -3.516  -3.235  -3.088  -2.958  -2.699 1.001
## deviance 271.687   2.895 268.155 269.618 270.959 273.031 278.935 1.001
##          n.eff
## alpha     2900
## beta1     1100
## beta2     3000
## deviance  2000
## 
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
## 
## DIC info (using the rule, pD = var(deviance)/2)
## pD = 4.2 and DIC = 275.9
## DIC is an estimate of expected predictive error (lower deviance is better).
plot(bayes.mod.fit.R2jags)

traceplot(bayes.mod.fit.R2jags)

If you want to print and save the plot, you can use the following set of commands; adjust the file path and file name for your desired output:

pdf("bayes_trace.pdf")

… defines that the plot will be saved as a PDF file with the name "bayes_trace.pdf" in your working directory.2

traceplot(bayes.mod.fit.R2jags)

… creates the plot in the background (you will not see it).

dev.off()

… finishes the printing process and creates the PDF file of the plot. If successful, R will display the message "null device 1".
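Putting the three commands together, the whole sequence looks like this:

```r
pdf("bayes_trace.pdf")           # open the PDF device
traceplot(bayes.mod.fit.R2jags)  # draw the plot into the (invisible) device
dev.off()                        # close the device and write the file
```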

Fitting Bayesian models using runjags

Instead of R2jags, you can also use the runjags package to command JAGS via R. Here is an abbreviated version of the workflow for runjags. The workflow mimics what you saw for R2jags above:

library("runjags")

For an example dataset, we simulate our own data in R. For this tutorial, we aim to fit a linear model, so we create a continuous outcome variable \(y\) as a function of two predictors \(x_1\) and \(x_2\) and a disturbance term \(e\). We simulate 100 observations.

First, we create the predictors:

n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)

Next, we create the outcome \(y\) based on coefficients \(b_1\) and \(b_2\) for the respective predictors and an intercept a:

b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e

Now, we combine the variables into one dataframe for processing later:

sim.dat <- data.frame(y, x1, x2)

And we summarize a (frequentist) linear model fit on these data:

freq.mod <- lm(y ~ x1 + x2, data = sim.dat)
summary(freq.mod)
## 
## Call:
## lm(formula = y ~ x1 + x2, data = sim.dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3432 -0.6797 -0.1112  0.5367  3.2304 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.84949    0.28810    6.42 5.04e-09 ***
## x1           1.13511    0.05158   22.00  < 2e-16 ***
## x2          -3.09361    0.20650  -14.98  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9367 on 97 degrees of freedom
## Multiple R-squared:  0.8772, Adjusted R-squared:  0.8747 
## F-statistic: 346.5 on 2 and 97 DF,  p-value: < 2.2e-16

Create the model object:

bayes.mod <- "model{
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}
alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}"

Convert the data frame to a list:

sim.list <- as.list(sim.dat)

and add the number of observations:

sim.list$N <- nrow(sim.dat)

Convert the data for runjags:

sim.dat.runjags <- dump.format(sim.list)

Specify initial values:

inits1 <- list(alpha = 1, beta1 = 1, beta2 = 1)
inits2 <- list(alpha = 0, beta1 = 0, beta2 = 0)
inits3 <- list(alpha = -1, beta1 = -1, beta2 = -1)

Fit the model using run.jags():

bayes.mod.fit.runjags <- run.jags(model = bayes.mod, monitor = c("alpha", "beta1", "beta2"),
data = sim.dat.runjags, n.chains = 3, inits = list(inits1, inits2, inits3),
burnin = 1000, sample = 5000, keep.jags.files = TRUE)
## Calling the simulation...
## Welcome to JAGS 4.3.0 on Thu Jun 14 15:12:34 2018
## JAGS is free software and comes with ABSOLUTELY NO WARRANTY
## Loading module: basemod: ok
## Loading module: bugs: ok
## . . Reading data file data.txt
## . Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
## Graph information:
##    Observed stochastic nodes: 100
##    Unobserved stochastic nodes: 4
##    Total graph size: 511
## . Reading parameter file inits1.txt
## . Reading parameter file inits2.txt
## . Reading parameter file inits3.txt
## . Initializing model
## . Adapting 1000
## -------------------------------------------------| 1000
## ++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
## Adaptation successful
## . Updating 1000
## -------------------------------------------------| 1000
## ************************************************** 100%
## . . . . Updating 5000
## -------------------------------------------------| 5000
## ************************************************** 100%
## . . . . . . Updating 0
## . Deleting model
## . 
## Simulation complete.  Reading coda files...
## Coda files loaded successfully
## Calculating summary statistics...
## Calculating the Gelman-Rubin statistic for 3 variables....
## Finished running the simulation
## JAGS files were saved to the 'runjagsfiles' folder in your current
## working directory

Note a few things:

  • The model is read into R within quotation marks.
  • By setting keep.jags.files = TRUE, we get run.jags() to create a folder "runjagsfiles" in your WD. This folder will contain the same files you obtain from JAGS in the next step below. This can be useful for further processing.
  • The option method allows for parallel chains on separate cores of your computer and other higher-end computing options. See the help file at ?run.jags for more information.
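As a sketch of the method option (reusing the model, data, and inits objects from above; method = "parallel" runs one chain per core):

```r
# Fit the same model with the three chains running in parallel;
# see ?run.jags for other method options (e.g. "rjags", "background").
bayes.mod.fit.par <- run.jags(model = bayes.mod,
                              monitor = c("alpha", "beta1", "beta2"),
                              data = sim.dat.runjags, n.chains = 3,
                              inits = list(inits1, inits2, inits3),
                              burnin = 1000, sample = 5000,
                              method = "parallel")
```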

Finally, you can summarize the model:

print(bayes.mod.fit.runjags)
## 
## JAGS model summary statistics from 15000 samples (chains = 3; adapt+burnin = 2000):
##                                                                      
##       Lower95  Median Upper95    Mean       SD Mode     MCerr MC%ofSD
## alpha  1.2232  1.8433  2.3922  1.8377  0.29308   --  0.011521     3.9
## beta1  1.0333  1.1359  1.2426  1.1373 0.052392   -- 0.0020708       4
## beta2 -3.5099 -3.0948 -2.6872 -3.0955  0.20933   -- 0.0028238     1.3
##                             
##       SSeff     AC.10   psrf
## alpha   647   0.42412 1.0009
## beta1   640   0.43208 1.0012
## beta2  5496 0.0044403 1.0007
## 
## Total time taken: 1.7 seconds

Using JAGS via the command line

JAGS can also be operated straight from the command line—on Windows and Unix systems alike. This can be useful if you don’t want to have R busy fitting models that take a longer time. The most feasible way to do this is to write a script file with the following parts, and save it, for instance as bayes.jags. Before running this script, you will need to create some files (see below).

model clear
data clear
load dic
model in "bayes.mod"
data in "sim.dat"
compile, nchains(3)
inits in "bayes.mod.inits1.txt", chain(1)
inits in "bayes.mod.inits2.txt", chain(2)
inits in "bayes.mod.inits3.txt", chain(3)
initialize
update 2500, by(100)
monitor alpha, thin(2)
monitor beta1, thin(2)
monitor beta2, thin(2)
monitor deviance, thin(2)
update 2500, by(100)
coda *, stem(bayes_out)

The following instructions are for the Terminal on the Mac, but you can reproduce the same steps on Windows or Linux. You run this script file by

  • opening a Terminal window
  • changing to the working directory in which all the above files are located (adjust to your own WD)
cd "/Users/johanneskarreth/Documents/Dropbox/Uni/9 - ICPSR/2019/Applied Bayes/Course materials/Labs/3 and 4 - JAGS-BUGS"
  • and then telling JAGS to run the script:
jags bayes.jags

In your Terminal or console, you will then see progress output from JAGS as it works through the script.

In more detail, this is what each line of the script does:

model clear
data clear
load dic

Remove previous data and models (if applicable), load the dic module so you can monitor the model deviance later.

model in "bayes.mod"

Use the model bayes.mod, which is saved in your WD, and looks like a regular JAGS model. Make sure you use the exact and full name of the model file as it is in your working directory, otherwise JAGS will not find it. Look out for hidden file name extensions.3 You wrote this model earlier and saved it in your WD as bayes.mod:

model {
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}

alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}
data in "sim.dat"

Use the data sim.dat. These data can be saved as vectors in one file that you name sim.dat. This file will look something like this:

"y" <- c(6.94259729574987, 4.61661626624104, ...
"x1" <- c(3.87904870689558, 4.53964502103344, ...
"x2" <- c(0, 1, ...
"N" <- 100

You can create this data file by hand (inconvenient), or you can work with some R commands to do this automatically. If sim.dat is a data frame object in R, you can do the following:

sim.dat.list <- as.list(sim.dat)
sim.dat.list$N <- nrow(sim.dat)
dump("sim.dat.list", file = "sim.dat.dump")
bugs2jags("sim.dat.dump", "sim.dat")

The first two commands convert the data frame into a list and add the number of observations N; dump() then writes the data in BUGS format to a file in your working directory; finally, bugs2jags() (part of the coda package) translates that file into JAGS format. Both sim.dat.dump and sim.dat are written to your working directory, so be sure that you have set it in the beginning. Also be sure to visually inspect whether your data file was created correctly.

compile, nchains(3)

Compile the models and run three Markov chains.

inits in "bayes.mod.inits1.txt", chain(1)
inits in "bayes.mod.inits2.txt", chain(2)
inits in "bayes.mod.inits3.txt", chain(3)

Use the starting values you provide in three newly created text files bayes.mod.inits1.txt, bayes.mod.inits2.txt, and bayes.mod.inits3.txt, each of which could look like this:

"alpha" <- c(0)
"beta1" <- c(0)
"beta2" <- c(0)

This way, you can specify different starting values for each chain.

initialize

Initialize and run the model.

update 2500, by(100)

Specify 2500 updates before you start monitoring your parameters of interest.

monitor alpha, thin(2)
monitor beta1, thin(2)
monitor beta2, thin(2)
monitor deviance, thin(2)

Monitor the values for these parameters, in this case the three regression coefficients and the model deviance. thin(2) specifies that only each second draw from the chains will be used for your model output.

update 2500, by(100)

Let the sampler update for another 2500 iterations.

coda *, stem(bayes_out)

Tell JAGS to produce coda files that all begin with the stem bayes_out and are put in your WD. These can then be analyzed with the tools described later in this tutorial. Note that the script ends with an empty line to prevent errors parsing the code.

Installing WinBUGS/OpenBUGS (Windows only)

These instructions are for Windows users.4 Mac users wishing to install WinBUGS or OpenBUGS on a Mac computer: please visit my website for detailed instructions and come see me in office hours for more information.

WinBUGS

  1. Download and install WinBUGS14.exe from http://www.mrc-bsu.cam.ac.uk/software/bugs/the-bugs-project-winbugs/. Be sure to install WinBUGS in your user’s profile directory (e.g. C:/Users/Johannes/) to avoid problems with user rights.
  2. Download and install the patch for Version 1.4.3. You can do this simply by opening the file WinBUGS14_cumulative_patch_No3_06_08_07_RELEASE.txt from within WinBUGS and following the instructions at the top.
  3. Download and install the free key for unrestricted use, again by simply opening the file WinBUGS14_ immortality_key.txt from within WinBUGS and following the instructions at the top.

OpenBUGS

Download and install OpenBUGS323setup.exe from http://www.openbugs.net/w/Downloads. Be sure to install OpenBUGS in your user’s profile directory (e.g. C:/Users/Johannes/) to avoid problems with user rights.

Fitting Bayesian models using R2WinBUGS/R2OpenBUGS (Windows only)

These instructions are for Windows users. Mac users wishing to install WinBUGS or OpenBUGS on a Mac computer: please visit my website for detailed instructions and come see me in office hours for more information.

R2WinBUGS and R2OpenBUGS are nearly identical; I discuss them together in this section and point out differences where applicable. You should settle on one of the two packages; OpenBUGS and R2OpenBUGS are slightly more up to date, so I recommend going with that option.

Once you have installed WinBUGS or OpenBUGS on your computer (previous section), install the respective package from within R or RStudio, via the Package Installer, or by using

install.packages("R2WinBUGS", dependencies = TRUE, repos = "https://cloud.r-project.org")
install.packages("R2OpenBUGS", dependencies = TRUE, repos = "https://cloud.r-project.org")

Preparing the data and model

For an example dataset, we simulate our own data in R. For this tutorial, we aim to fit a linear model, so we create a continuous outcome variable \(y\) as a function of two predictors \(x_1\) and \(x_2\) and a disturbance term \(e\). We simulate a dataset with 100 observations.

First, we create the predictors:

n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)

Next, we create the outcome \(y\) based on coefficients \(b_1\) and \(b_2\) for the respective predictors and an intercept \(a\):

b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e

Now, we combine the variables into one dataframe for processing later:

sim.dat <- data.frame(y, x1, x2)

And we create and summarize a (frequentist) linear model fit on these data:

freq.mod <- lm(y ~ x1 + x2, data = sim.dat)
summary(freq.mod)

Now, we write a model for BUGS and save it as the text file "bayesmod.txt" in your working directory. (Do not paste this model straight into R yet.) You can set your working directory via:

setwd("C:/Users/Johannes/Bayes/Lab3")

The model looks just like the JAGS/BUGS models shown throughout this course:

model {
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}

alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}

Instead of writing the model in a text editor, you can also enter it in your R script:

sink("bayesmod.txt")
cat("
model{
for(i in 1:N){
y[i] ~ dnorm(mu[i], tau)
mu[i] <- alpha + beta1 * x1[i] + beta2 * x2[i]
}
alpha ~ dnorm(0, .01)
beta1 ~ dunif(-100, 100)
beta2 ~ dunif(-100, 100)
tau ~ dgamma(.01, .01)
}
", fill=TRUE)
sink()

Now define the vectors of the data matrix for BUGS:

y <- sim.dat$y
x1 <- sim.dat$x1
x2 <- sim.dat$x2
N <- nrow(sim.dat)

Read in the data frame for BUGS:

sim.dat.bugs <- list("y", "x1", "x2", "N")

(You could also do this more conveniently using the as.list command on your data frame:)

sim.dat.bugs <- as.list(sim.dat)

Note, though, that you will also need to specify any other variables not in the data, like in this case \(N\). So here, you would need to add:

sim.dat.bugs$N <- nrow(sim.dat)

Define the parameters whose posterior distributions you are interested in summarizing later:

bayes.mod.params <- c("alpha", "beta1", "beta2")

Now, you need to define the starting values for BUGS. Per Gelman and Hill (2007, 370), you can use a function to do this. This function creates a list that contains one element for each parameter. Each parameter then gets assigned a random draw from a normal distribution as a starting value. This random draw is created using the rnorm function. The first argument of this function is the number of draws. If your parameters are not indexed in the model code, this argument will be 1. If your bugs command below then specifies more than one chain, each chain will start at a different random value for each parameter.

bayes.mod.inits <- function(){
list("alpha" = rnorm(n = 1), "beta1" = rnorm(n = 1), "beta2" = rnorm(n = 1))
}

Note: Here, I did not specify a starting value for the node tau. This will lead JAGS (or BUGS) to generate a random number as a starting value for tau. In general, any node for which you do not explicitly generate starting values will receive a random starting value. This is not a problem computationally, but undesirable from a reproducibility perspective. (More on this later in this workshop.)

Before using R2WinBUGS the first time, you need to load the package, and you might need to set a random number seed. To do this, type

library("R2WinBUGS")
set.seed(123)

or alternatively

library("R2OpenBUGS")
set.seed(123)

directly in R or in your R script. You can choose any number here. Setting a random seed before fitting a model is also good practice for making your estimates replicable. We will discuss replication in more detail in Weeks 3-4.

Fitting the model

You are now ready to use the bugs() function, which calls WinBUGS or OpenBUGS, depending on which package is loaded.

R2WinBUGS

I’m giving the resulting object the unwieldy name bayes.mod.fit.R2WinBUGS here to distinguish it from other objects later. Additionally, you have to specify the location of the model file, the data, the parameters, the initial values, as well as how many chains you want to fit and how long you want to run them. Finally, you need to specify the location of the WinBUGS directory (where you installed WinBUGS):

bayes.mod.fit.R2WinBUGS <- bugs(model.file = "bayes.mod",
data = sim.dat.bugs,
parameters.to.save = bayes.mod.params,
inits = bayes.mod.inits,
n.chains = 3,
n.iter = 5000,
n.burnin = 1000,
n.thin = 1,
bugs.directory = "C:/Users/Johannes/WinBUGS14/")

Running WinBUGS from R offers seamless, and therefore quick, access to results after fitting a model. Try for yourself:

print(bayes.mod.fit.R2WinBUGS)
plot(bayes.mod.fit.R2WinBUGS)

The next section offers details on obtaining convergence diagnostics from this model output.

R2OpenBUGS

I’m giving the resulting object the unwieldy name bayes.mod.fit.R2OpenBUGS here to distinguish it from other objects later. Additionally, you have to specify the location of the model file, the data, the parameters, the initial values, as well as how many chains you want to fit and how long you want to run them. Finally, you need to specify the location of the OpenBUGS directory (where you installed OpenBUGS):

bayes.mod.fit.R2OpenBUGS <- bugs(model.file = "bayes.mod",
data = sim.dat.bugs,
parameters.to.save = bayes.mod.params,
inits = bayes.mod.inits,
n.chains = 3,
n.iter = 5000,
n.burnin = 1000,
n.thin = 1)

Running OpenBUGS from R offers seamless, and therefore quick, access to results after fitting a model. Try for yourself:

print(bayes.mod.fit.R2OpenBUGS)
plot(bayes.mod.fit.R2OpenBUGS)

The next section offers details on obtaining convergence diagnostics from this model output.

Convergence diagnostics

R offers a variety of solutions to obtain convergence diagnostics. As a best practice, you should convert your model output (whether it comes from JAGS, R2jags, runjags, R2WinBUGS, R2OpenBUGS, or other programs) into an MCMC object. MCMC objects are a separate class of R objects that contain one or more Markov chains and the respective information about iterations etc. that is needed to conduct convergence diagnostics.

You can convert most model outputs into an MCMC object as follows:

  • For objects from R2jags, use as.mcmc():
bayes.mod.fit.mcmc <- as.mcmc(bayes.mod.fit.R2jags)
summary(bayes.mod.fit.mcmc)
## 
## Iterations = 1001:8993
## Thinning interval = 8 
## Number of chains = 3 
## Sample size per chain = 1000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##             Mean      SD  Naive SE Time-series SE
## alpha      1.846 0.29211 0.0053333       0.009263
## beta1      1.136 0.05219 0.0009529       0.001646
## beta2     -3.095 0.20611 0.0037631       0.003764
## deviance 271.687 2.89458 0.0528476       0.056665
## 
## 2. Quantiles for each variable:
## 
##             2.5%     25%     50%     75%   97.5%
## alpha      1.279   1.650   1.849   2.046   2.409
## beta1      1.038   1.099   1.137   1.171   1.238
## beta2     -3.516  -3.235  -3.088  -2.958  -2.699
## deviance 268.155 269.618 270.959 273.031 278.935
  • For objects from runjags, R2WinBUGS, R2OpenBUGS, use as.mcmc.list():
bayes.mod.fit.mcmc <- as.mcmc.list(bayes.mod.fit.runjags)
summary(bayes.mod.fit.mcmc)
## 
## Iterations = 2001:7000
## Thinning interval = 1 
## Number of chains = 3 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean      SD  Naive SE Time-series SE
## alpha  1.838 0.29308 0.0023930       0.011519
## beta1  1.137 0.05239 0.0004278       0.002073
## beta2 -3.096 0.20933 0.0017092       0.002830
## 
## 2. Quantiles for each variable:
## 
##         2.5%    25%    50%    75%  97.5%
## alpha  1.235  1.646  1.843  2.037  2.406
## beta1  1.036  1.102  1.136  1.171  1.246
## beta2 -3.510 -3.235 -3.095 -2.957 -2.688
  • If you used JAGS or WinBUGS/OpenBUGS outside of R, the MCMC output is stored in text files and you need to read these text files into R.

    • Identify where your software saved the chain and index files (most likely in the working directory alongside the other components of your model: data, model, inits) and note their names.
    • In R, load the coda package:
    library("coda")
    • Read in your JAGS output. This requires that the chain and index files (see above) are in your working directory (be sure to use your own WD in the code below). Above, I gave these file names the stem bayes_out.
    chain1 <- read.coda(output.file = "bayes_outchain1.txt",
    index.file = "bayes_outindex.txt")
    ## Abstracting alpha ... 1250 valid values
    ## Abstracting beta1 ... 1250 valid values
    ## Abstracting beta2 ... 1250 valid values
    ## Abstracting deviance ... 1250 valid values
    chain2 <- read.coda(output.file = "bayes_outchain2.txt",
    index.file = "bayes_outindex.txt")
    ## Abstracting alpha ... 1250 valid values
    ## Abstracting beta1 ... 1250 valid values
    ## Abstracting beta2 ... 1250 valid values
    ## Abstracting deviance ... 1250 valid values
    chain3 <- read.coda(output.file = "bayes_outchain3.txt",
    index.file = "bayes_outindex.txt")
    ## Abstracting alpha ... 1250 valid values
    ## Abstracting beta1 ... 1250 valid values
    ## Abstracting beta2 ... 1250 valid values
    ## Abstracting deviance ... 1250 valid values
    bayes.chains <- as.mcmc.list(list(chain1, chain2, chain3))
    • Now you can proceed as before:
    summary(bayes.chains)
    ## 
    ## Iterations = 2501:4999
    ## Thinning interval = 2 
    ## Number of chains = 3 
    ## Sample size per chain = 1250 
    ## 
    ## 1. Empirical mean and standard deviation for each variable,
    ##    plus standard error of the mean:
    ## 
    ##             Mean      SD  Naive SE Time-series SE
    ## alpha      1.865 0.30367 0.0049589       0.016684
    ## beta1      1.132 0.05468 0.0008929       0.003079
    ## beta2     -3.095 0.21101 0.0034458       0.004138
    ## deviance 271.891 2.99669 0.0489358       0.081719
    ## 
    ## 2. Quantiles for each variable:
    ## 
    ##             2.5%     25%     50%     75%   97.5%
    ## alpha      1.254   1.661   1.871   2.067   2.461
    ## beta1      1.023   1.095   1.131   1.169   1.241
    ## beta2     -3.514  -3.236  -3.100  -2.955  -2.674
    ## deviance 268.169 269.659 271.226 273.477 278.991
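
The "Naive SE" column in these summaries is easy to verify by hand: it is the posterior standard deviation divided by the square root of the total number of draws across all chains (the "Time-series SE" then adjusts for autocorrelation in the chains). A quick sketch, using the beta2 row from the R2jags summary above:

```r
# Naive SE = SD / sqrt(total number of draws); here 3 chains x 1000 kept draws
post.sd <- 0.20611                  # posterior SD of beta2 from the summary above
n.draws <- 3 * 1000                 # chains x sample size per chain
naive.se <- post.sd / sqrt(n.draws)
naive.se                            # ~0.0037631, matching the summary
```
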

With an MCMC object, you can use a variety of commands for diagnostics and presentation using the coda package (Plummer et al. 2006):

library("lattice")
xyplot(bayes.mod.fit.mcmc)

You can customize the plot layout (other lattice options work here as well):

xyplot(bayes.mod.fit.mcmc, layout = c(1, 3), aspect = "fill")

Density plot:

densityplot(bayes.mod.fit.mcmc)

… with minor modifications for more effective display:

densityplot(bayes.mod.fit.mcmc, layout = c(1, 3), as.table = TRUE, aspect = "fill")

Trace and density plots in one figure, printed directly to your working directory:

pdf("bayes_fit_mcmc_plot.pdf")
plot(bayes.mod.fit.mcmc)
dev.off()
  • Autocorrelation plot, printed directly to your working directory (adjust to your WD when using this code):
pdf("bayes_fit_mcmc_autocorr.pdf")
autocorr.plot(bayes.mod.fit.mcmc, ask = FALSE)
dev.off()
  • Other diagnostics using CODA:
gelman.plot(bayes.mod.fit.mcmc)
geweke.diag(bayes.mod.fit.mcmc)
## [[1]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##   alpha   beta1   beta2 
##  2.1423 -2.1429 -0.2248 
## 
## 
## [[2]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##    alpha    beta1    beta2 
## -0.06411  0.01448  0.84289 
## 
## 
## [[3]]
## 
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5 
## 
##  alpha  beta1  beta2 
##  1.916 -1.977  1.440
raftery.diag(bayes.mod.fit.mcmc)
## [[1]]
## 
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95 
##                                              
##        Burn-in  Total Lower bound  Dependence
##        (M)      (N)   (Nmin)       factor (I)
##  alpha 24       26828 3746         7.16      
##  beta1 24       24507 3746         6.54      
##  beta2 8        8178  3746         2.18      
## 
## 
## [[2]]
## 
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95 
##                                              
##        Burn-in  Total Lower bound  Dependence
##        (M)      (N)   (Nmin)       factor (I)
##  alpha 16       18232 3746         4.87      
##  beta1 19       20303 3746         5.42      
##  beta2 6        6406  3746         1.71      
## 
## 
## [[3]]
## 
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95 
##                                              
##        Burn-in  Total Lower bound  Dependence
##        (M)      (N)   (Nmin)       factor (I)
##  alpha 18       19920 3746         5.32      
##  beta1 21       24489 3746         6.54      
##  beta2 6        6406  3746         1.71
heidel.diag(bayes.mod.fit.mcmc)
## [[1]]
##                                     
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.281  
## beta1 passed       1         0.223  
## beta2 passed       1         0.448  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.84 0.03793  
## beta1 passed     1.14 0.00666  
## beta2 passed    -3.10 0.01008  
## 
## [[2]]
##                                     
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.449  
## beta1 passed       1         0.461  
## beta2 passed       1         0.320  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.85 0.03956  
## beta1 passed     1.14 0.00737  
## beta2 passed    -3.09 0.00898  
## 
## [[3]]
##                                     
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.222  
## beta1 passed       1         0.196  
## beta2 passed       1         0.674  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.83 0.03979  
## beta1 passed     1.14 0.00706  
## beta2 passed    -3.10 0.00973
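
To see what the Geweke diagnostic is doing, here is a simplified base-R sketch: it compares the mean of the first 10% of a chain to the mean of the last 50% via a z-score. (coda's geweke.diag() estimates the window variances with spectral density methods rather than the naive variances used here, so its numbers will differ somewhat; this is only an illustration.)

```r
set.seed(42)
chain <- rnorm(5000)                    # a toy chain that has clearly converged
n <- length(chain)
first <- chain[1:floor(0.1 * n)]        # first 10% of iterations
last  <- chain[(floor(0.5 * n) + 1):n]  # last 50% of iterations
# z-score for the difference in window means (naive-variance version)
z <- (mean(first) - mean(last)) /
  sqrt(var(first) / length(first) + var(last) / length(last))
z  # for a converged chain, this should look like a draw from N(0, 1)
```
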

Quick convergence diagnostics: superdiag

A very convenient tool to analyze numerical convergence diagnostics in one sweep is the superdiag package (Tsai, Gill, and Rapkin 2012).

  • First, install the package:
install.packages("superdiag", dependencies = TRUE, repos = "https://cloud.r-project.org")
  • Then, load it:
library("superdiag")
  • Next, call the superdiag function and specify how many iterations of the chain you want to discard before analyzing for convergence:
superdiag(bayes.mod.fit.mcmc, burnin = 1000)
## Number of chains = 3 
## Number of iterations = 5000 per chain before discarding the burn-in period
## The burn-in period = 1000 per chain
## Sample size in total = 12000 
## 
## ********** The Geweke diagnostic: **********
## Z-scores:
##                        chain1    chain 2      chain 3
## alpha              0.07796324  -1.159461  0.003510968
## beta1             -0.03404077   4.391683 -0.092806768
## beta2             -0.67445676 -12.704165  0.785434910
## Window From Start  0.10000000   0.928650  0.451760000
## Window From Stop   0.50000000   0.000860  0.421810000
## 
## ********** The Gelman-Rubin diagnostic: **********
## Potential scale reduction factors:
## 
##       Point est. Upper C.I.
## alpha          1          1
## beta1          1          1
## beta2          1          1
## 
## Multivariate psrf
## 
## 1
## 
## ********** The Heidelberger-Welch diagnostic: **********
## 
## Chain 1, epsilon=0.1, alpha=0.05                                    
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.534  
## beta1 passed       1         0.440  
## beta2 passed       1         0.348  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.83 0.04263  
## beta1 passed     1.14 0.00753  
## beta2 passed    -3.10 0.01116  
## 
## Chain 2, epsilon=0.142, alpha=0.025                                    
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.136  
## beta1 passed       1         0.128  
## beta2 passed       1         0.348  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.85 0.04553  
## beta1 passed     1.13 0.00817  
## beta2 passed    -3.09 0.01056  
## 
## Chain 3, epsilon=0.121, alpha=0.025                                    
##       Stationarity start     p-value
##       test         iteration        
## alpha passed       1         0.839  
## beta1 passed       1         0.825  
## beta2 passed       1         0.545  
##                                
##       Halfwidth Mean  Halfwidth
##       test                     
## alpha passed     1.81 0.04541  
## beta1 passed     1.14 0.00809  
## beta2 passed    -3.10 0.01109  
## 
## ********** The Raftery-Lewis diagnostic: **********
## 
## Chain 1, converge.eps = 0.001
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95 
##                                              
##        Burn-in  Total Lower bound  Dependence
##        (M)      (N)   (Nmin)       factor (I)
##  alpha 22       22974 3746         6.13      
##  beta1 20       21942 3746         5.86      
##  beta2 8        8858  3746         2.36      
## 
## 
## Chain 2, converge.eps = 2e-04
## Quantile (q) = 0.05
## Accuracy (r) = +/- 5e-04
## Probability (s) = 0.975 
## 
## You need a sample size of at least 954539 with these values of q, r and s
## 
## Chain 3, converge.eps = 0.001
## Quantile (q) = 0.001
## Accuracy (r) = +/- 0.0025
## Probability (s) = 0.975 
##                                              
##        Burn-in  Total Lower bound  Dependence
##        (M)      (N)   (Nmin)       factor (I)
##  alpha 10       2417  804          3.01      
##  beta1 5        1342  804          1.67      
##  beta2 5        1342  804          1.67

Note that the R2jags object by default retains 1000 iterations (through thinning), hence the burn-in period you provide for superdiag must be less than 1000.
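
This 1000-draw default comes from how R2jags picks the thinning interval. The arithmetic below is a sketch, assuming R2jags's default rule of keeping roughly 1000 draws per chain; it reproduces the "Thinning interval = 8" seen in the as.mcmc() summary above for a run with n.iter = 9000 and n.burnin = 1000:

```r
# Assumed R2jags default: n.thin = max(1, floor((n.iter - n.burnin) / 1000))
n.iter   <- 9000
n.burnin <- 1000
n.thin <- max(1, floor((n.iter - n.burnin) / 1000))  # thinning interval
kept   <- floor((n.iter - n.burnin) / n.thin)        # draws kept per chain
c(n.thin = n.thin, kept = kept)                      # interval 8, ~1000 draws
```
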

Quick diagnostic plots: mcmcplots

A convenient way to obtain graphical diagnostics and results is using the mcmcplots package (Curtis 2012):

  • First, install the package:
install.packages("mcmcplots", dependencies = TRUE, repos = "https://cloud.r-project.org")
  • Then, load it:
library("mcmcplots")
  • For quick diagnostics, you can produce html files with trace, density, and autocorrelation plots all on one page. The files will be displayed in your default internet browser.
mcmcplot(bayes.mod.fit.mcmc)
  • Some commands for individual plots:
denplot(bayes.mod.fit.mcmc, parms = c("alpha", "beta1", "beta2"))

traplot(bayes.mod.fit.mcmc, parms = c("alpha", "beta1", "beta2"))

  • If you want to produce a coefficient dot plot with credible intervals, use caterplot():
caterplot(bayes.mod.fit.mcmc, parms = c("alpha", "beta1", "beta2"),
labels = c("alpha", "beta1", "beta2"))

As always, check the help files for options to customize these plots.
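
Under the hood, a caterpillar plot simply displays posterior quantiles for each parameter. You can compute these yourself from a matrix of draws with base R; here is a sketch with simulated "draws" standing in for the MCMC output:

```r
set.seed(123)
# fake posterior draws: one column per parameter
draws <- cbind(alpha = rnorm(3000, 1.85, 0.29),
               beta1 = rnorm(3000, 1.14, 0.05),
               beta2 = rnorm(3000, -3.10, 0.21))
# medians and 95% credible intervals, as displayed by caterplot
ci <- apply(draws, 2, quantile, probs = c(0.025, 0.5, 0.975))
ci
```
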

More plots: ggmcmc

Yet another convenient option for plotting output is the ggmcmc package (Fernández i Marín 2013):

  • First, install the package:
install.packages("ggmcmc", dependencies = TRUE, repos = "https://cloud.r-project.org")
  • Then, load it:
library("ggmcmc")
  • To use this package, you first need to convert your MCMC object into a ggs object using the ggs() function. This object can then be passed on to the various ggmcmc plotting commands:
bayes.mod.fit.gg <- ggs(bayes.mod.fit.mcmc)
ggs_histogram(bayes.mod.fit.gg)

ggs_density(bayes.mod.fit.gg)

ggs_traceplot(bayes.mod.fit.gg)

ggs_running(bayes.mod.fit.gg)

ggs_compare_partial(bayes.mod.fit.gg)

ggs_autocorrelation(bayes.mod.fit.gg)

ggs_geweke(bayes.mod.fit.gg)

ggs_caterpillar(bayes.mod.fit.gg)

  • Or, you can use the ggmcmc command to write a PDF file containing a variety of diagnostic plots to your working directory:
ggmcmc(bayes.mod.fit.gg,
file = "bayes_fit_ggmcmc.pdf")
## Plotting histograms
## Plotting density plots
## Plotting traceplots
## Plotting running means
## Plotting comparison of partial and full chain
## Plotting autocorrelation plots
## Plotting crosscorrelation plot
## Plotting Potential Scale Reduction Factors
## Plotting Geweke Diagnostic
## Plotting caterpillar plot
## Time taken to generate the report: 4.7 seconds.

MCMCpack

MCMCpack (Martin, Quinn, and Park 2011) is an R package that we will also briefly use in this workshop. It is as easy to use as the lm() command in R, produces the same kind of MCMC output you analyzed above, and is quite fast. One major downside of MCMCpack is that it does not offer as many options for model specification as writing your full model in JAGS or BUGS does. For more information on MCMCpack, see http://mcmcpack.wustl.edu/. Because MCMCpack uses an R-like model formula, it is straightforward to fit a Bayesian linear model.

library("MCMCpack")

For an example dataset, we again simulate our own data in R. We create a continuous outcome variable \(y\) as a function of two predictors \(x_1\) and \(x_2\) and a disturbance term \(e\). We simulate 100 observations. First, we create the predictors:

n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)

Next, we create the outcome \(y\) based on coefficients \(b_1\) and \(b_2\) for the respective predictors and an intercept \(a\):

b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e

Now, we combine the variables into one dataframe for processing later:

sim.dat <- data.frame(y, x1, x2)

And we summarize a (frequentist) linear model fit on these data:

freq.mod <- lm(y ~ x1 + x2, data = sim.dat)
summary(freq.mod)
## 
## Call:
## lm(formula = y ~ x1 + x2, data = sim.dat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.3432 -0.6797 -0.1112  0.5367  3.2304 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.84949    0.28810    6.42 5.04e-09 ***
## x1           1.13511    0.05158   22.00  < 2e-16 ***
## x2          -3.09361    0.20650  -14.98  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9367 on 97 degrees of freedom
## Multiple R-squared:  0.8772, Adjusted R-squared:  0.8747 
## F-statistic: 346.5 on 2 and 97 DF,  p-value: < 2.2e-16

MCMCpack requires only a single function call to specify and fit a model. For the linear model, we use MCMCregress(); here, b0 gives the prior mean and B0 the prior precision of the multivariate normal prior on the coefficients. For other models available in MCMCpack, see help(package = "MCMCpack").

bayes.mod.fit.MCMCpack <- MCMCregress(y ~ x1 + x2,
                                      data = sim.dat, burnin = 1000,
                                      mcmc = 5000, seed = 123,
                                      beta.start = c(0, 0, 0),
                                      b0 = c(0, 0, 0), B0 = c(0.1, 0.1, 0.1))
summary(bayes.mod.fit.MCMCpack)
## 
## Iterations = 1001:6000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##                Mean      SD  Naive SE Time-series SE
## (Intercept)  1.8366 0.28842 0.0040789       0.003941
## x1           1.1367 0.05165 0.0007305       0.000705
## x2          -3.0790 0.21073 0.0029801       0.002980
## sigma2       0.8969 0.13020 0.0018413       0.001853
## 
## 2. Quantiles for each variable:
## 
##                2.5%     25%     50%     75%  97.5%
## (Intercept)  1.2756  1.6407  1.8364  2.0274  2.414
## x1           1.0343  1.1022  1.1373  1.1722  1.236
## x2          -3.4835 -3.2204 -3.0802 -2.9389 -2.657
## sigma2       0.6789  0.8026  0.8859  0.9805  1.186
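
To see what the b0 (prior mean) and B0 (prior precision) arguments do, here is a base-R sketch of the conditional posterior mean for the regression coefficients under a normal prior, \((X'X/\sigma^2 + B_0)^{-1}(X'y/\sigma^2 + B_0 b_0)\). With the small prior precision of 0.1 used above, the posterior mean stays very close to the OLS estimate. (Plugging in a point estimate for sigma2 is a simplifying assumption; MCMCregress samples sigma2 as well.)

```r
set.seed(123)
n.sim <- 100
x1 <- rnorm(n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e  <- rnorm(n.sim, mean = 0, sd = 1)
y  <- 1.5 + 1.2 * x1 - 3.1 * x2 + e

X   <- cbind(1, x1, x2)
ols <- lm(y ~ x1 + x2)
s2  <- summary(ols)$sigma^2          # plug-in error variance (simplification)

b0 <- rep(0, 3)                      # prior mean, as in the MCMCregress call
B0 <- diag(0.1, 3)                   # prior precision, as in the MCMCregress call
post.prec <- crossprod(X) / s2 + B0
post.mean <- drop(solve(post.prec, crossprod(X, y) / s2 + B0 %*% b0))
cbind(ols = coef(ols), bayes = post.mean)  # nearly identical under this vague prior
```
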

The resulting object is already an MCMC object, so you can use the same set of commands for diagnostics and for accessing the fitted object that you explored above. For example, you may use mcmcplot and superdiag for diagnostics:

mcmcplot(bayes.mod.fit.MCMCpack)
superdiag(bayes.mod.fit.MCMCpack, burnin = 1000)

Following this course using JAGS

JAGS and BUGS code are very similar, with some minor exceptions. You can find modified (where necessary) JAGS code for all models presented in this workshop on my website.

Many JAGS-related questions are answered at http://stackoverflow.com/questions/tagged/jags and the discussion board at http://sourceforge.net/projects/mcmc-jags/forums/forum/610037.

Stan

Stan is a new and fast program that can be accessed via R (and other statistical packages). The documentation on the Stan website is very easy to follow, and offers tutorials on fitting some example models in Stan. We will show you how to use Stan in Week 4 of this course.

References

Curtis, S. McKay. 2012. mcmcplots: Create Plots from MCMC Output. http://CRAN.R-project.org/package=mcmcplots.

Denwood, Matthew J. 2016. “runjags: An R Package Providing Interface Utilities, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS.” Journal of Statistical Software 71 (9): 1–25. doi:10.18637/jss.v071.i09.

Fernández i Marín, Xavier. 2013. ggmcmc: Graphical Tools for Analyzing Markov Chain Monte Carlo Simulations from Bayesian Inference. http://xavier-fim.net/packages/ggmcmc.

Gelman, Andrew, and Jennifer Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York, NY: Cambridge University Press.

Jackman, Simon. 2009. Bayesian Analysis for the Social Sciences. Chichester: Wiley.

Kruschke, John. 2014. Doing Bayesian Data Analysis, Second Edition: A Tutorial with R, JAGS, and Stan. Oxford: Academic Press / Elsevier.

Lunn, David, David Spiegelhalter, Andrew Thomas, and Nicky Best. 2009. “The BUGS Project: Evolution, Critique and Future Directions.” Statistics in Medicine 28 (25): 3049–67. doi:10.1002/sim.3680.

Martin, Andrew D., Kevin M. Quinn, and Jong Hee Park. 2011. “MCMCpack: Markov Chain Monte Carlo in R.” Journal of Statistical Software 42 (9): 22. http://www.jstatsoft.org/v42/i09/.

Plummer, Martyn. 2011. “JAGS Version 3.1.0 User Manual.” http://sourceforge.net/projects/mcmc-jags/files/Manuals/3.x/.

———. 2013. rjags: Bayesian Graphical Models Using MCMC. http://CRAN.R-project.org/package=rjags.

Plummer, Martyn, Nicky Best, Kate Cowles, and Karen Vines. 2006. “CODA: Convergence Diagnosis and Output Analysis for MCMC.” R News 6 (1): 7–11. http://CRAN.R-project.org/doc/Rnews/.

Spiegelhalter, David J., Andrew Thomas, Nicky G. Best, and Dave Lunn. 2003. “WinBUGS Version 1.4 User Manual.” Cambridge: MRC Biostatistics Unit. http://voteview.org/manual14.pdf.

Sturtz, Sibylle, Uwe Ligges, and Andrew Gelman. 2005. “R2WinBUGS: A Package for Running WinBUGS from R.” Journal of Statistical Software 12 (3): 1–16. http://www.jstatsoft.org.

Su, Yu-Sung, and Masanao Yajima. 2012. R2jags: A Package for Running JAGS from R. http://CRAN.R-project.org/package=R2jags.

Tsai, Tsung-han, Jeff Gill, and Jonathan Rapkin. 2012. superdiag: R Code for Testing Markov Chain Nonconvergence. http://CRAN.R-project.org/package=superdiag.


  1. See Appendix C in Gelman and Hill (2007).

  2. LaTeX has trouble with graphics file names that contain extra periods, so if you use LaTeX and try to include the graphics file angell.trace, LaTeX will not compile your document.

  3. On the Mac, you can set the Finder to display all file extensions by going to Finder \(>\) Preferences \(>\) check “Show all filename extensions.”

  4. Some of this content borrows from a tutorial that Christopher Hare, former TA for this workshop, compiled in 2014.