The purpose of this tutorial is to show how to deploy the most common convergence diagnostics after fitting a Bayesian model in R (see the previous lab). There is also a PDF version of this tutorial as well as an R script containing all code.
If you find any errors in this document or have suggestions for improvement, please email me.
This tutorial provides a general approach to convergence diagnostics in R, independent of the software used to fit a Bayesian model. For consistency with Lab 2, we begin with a model fit using rstanarm, but the tutorial below also works with models fit with rstan, JAGS, or BUGS.
This tutorial should be read in conjunction with the background readings on convergence.
In addition, the Stan user’s guide, which you can find here, offers a detailed discussion of convergence-related issues and some tips in chapters 20 and 22.
We pick up where Lab 2 left off and fit a model after generating a “fake” dataset:
n.sim <- 100
set.seed(123)

# Simulate two predictors and an error term
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)

# True parameter values
b1 <- 1.2
b2 <- -3.1
a <- 1.5

# Generate the outcome and assemble the data frame
y <- a + b1 * x1 + b2 * x2 + e
sim.dat <- data.frame(y, x1, x2)

library("rstanarm")
bayes.mod <- stan_glm(y ~ x1 + x2,
                      data = sim.dat,
                      family = "gaussian",
                      prior = normal(location = 0, scale = 10))
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000121 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.21 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.039738 seconds (Warm-up)
## Chain 1: 0.040116 seconds (Sampling)
## Chain 1: 0.079854 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 1.2e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.042095 seconds (Warm-up)
## Chain 2: 0.037601 seconds (Sampling)
## Chain 2: 0.079696 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 1e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.040094 seconds (Warm-up)
## Chain 3: 0.037149 seconds (Sampling)
## Chain 3: 0.077243 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 1e-05 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.037369 seconds (Warm-up)
## Chain 4: 0.038307 seconds (Sampling)
## Chain 4: 0.075676 seconds (Total)
## Chain 4:
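By default, rstanarm runs four chains of 2,000 iterations each and discards the first half of each chain as warmup; this is where the 4,000 posterior draws reported in the summary below come from. If you ever need longer chains, the sampler settings can be passed through stan_glm. The call below is a sketch (not run here), using the chains, iter, and seed arguments that rstanarm forwards to rstan:

# Sketch (not run): request longer chains and a fixed seed.
bayes.mod.long <- stan_glm(y ~ x1 + x2,
                           data = sim.dat,
                           family = "gaussian",
                           prior = normal(location = 0, scale = 10),
                           chains = 4,    # number of Markov chains
                           iter = 4000,   # iterations per chain, incl. warmup
                           seed = 123)    # for reproducibility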
A first take at convergence is provided in the standard summary of the model object:
summary(bayes.mod)
##
## Model Info:
## function: stan_glm
## family: gaussian [identity]
## formula: y ~ x1 + x2
## algorithm: sampling
## sample: 4000 (posterior sample size)
## priors: see help('prior_summary')
## observations: 100
## predictors: 3
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 1.9 0.3 1.5 1.8 2.2
## x1 1.1 0.1 1.1 1.1 1.2
## x2 -3.1 0.2 -3.4 -3.1 -2.8
## sigma 0.9 0.1 0.9 0.9 1.0
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 6.8 0.1 6.7 6.8 7.0
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 5064
## x1 0.0 1.0 4776
## x2 0.0 1.0 4807
## sigma 0.0 1.0 3950
## mean_PPD 0.0 1.0 4739
## log-posterior 0.0 1.0 1696
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Here, the \(\hat{R}\) or Rhat measure (the “potential scale reduction factor”) compares the variance between chains with the variance within chains: it is the square root of the ratio of a pooled variance estimate, which combines within- and between-chain variance, to the average within-chain variance. When chains are mixing well, this ratio should be very close to 1. You can read more on this in Gelman & Shirley 2011, section 6.6 (available in the course folder).
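To make the definition concrete, here is a minimal sketch of how a basic (non-split) Rhat could be computed by hand from a matrix of draws. The function name rhat_basic is made up for illustration; Stan itself uses a refined version computed on split chains, as the summary output above notes.

# Minimal sketch of a basic Rhat (illustration only; rhat_basic is a
# hypothetical helper, not part of any package):
rhat_basic <- function(draws) {       # draws: iterations x chains matrix
  N <- nrow(draws)                    # iterations per chain
  B <- N * var(colMeans(draws))       # between-chain variance
  W <- mean(apply(draws, 2, var))     # average within-chain variance
  var_hat <- (N - 1) / N * W + B / N  # pooled variance estimate
  sqrt(var_hat / W)                   # close to 1 when chains mix well
}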
Regardless of what program is used for estimating the regression, the next step is to convert the posterior distribution into an R object of class mcmc.list for further analysis. An mcmc.list object is similar to a list of matrices, but it contains additional information about the number of chains and iterations that is necessary for convergence diagnostics.
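For illustration only, an mcmc.list can also be assembled by hand from matrices of draws, using constructors from the coda package introduced below; the fake chains here are plain random numbers standing in for real sampler output:

# Toy illustration (not part of this lab): building an mcmc.list by hand.
fake1 <- coda::mcmc(matrix(rnorm(2000), ncol = 2,
                           dimnames = list(NULL, c("a", "b"))))
fake2 <- coda::mcmc(matrix(rnorm(2000), ncol = 2,
                           dimnames = list(NULL, c("a", "b"))))
toy.mcmc <- coda::mcmc.list(fake1, fake2)
coda::niter(toy.mcmc)   # 1000 iterations per chain
coda::nchain(toy.mcmc)  # 2 chains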
To convert a model fit with rstanarm into an mcmc.list, we use the rstan::As.mcmc.list() function. This function takes two (relevant) arguments: an object of class stanfit, and the name(s) of the parameters for which we wish to inspect convergence. A stanfit object is what rstan produces by default. We, however, used rstanarm as a wrapper for rstan, and therefore have a different type of object:
class(bayes.mod)
## [1] "stanreg" "glm" "lm"
Upon closer inspection, though, we find that this object contains an object of class stanfit:
names(bayes.mod)
## [1] "coefficients" "ses" "fitted.values"
## [4] "linear.predictors" "residuals" "df.residual"
## [7] "covmat" "y" "model"
## [10] "data" "family" "offset"
## [13] "weights" "prior.weights" "contrasts"
## [16] "na.action" "formula" "terms"
## [19] "prior.info" "algorithm" "stan_summary"
## [22] "stanfit" "rstan_version" "call"
## [25] "stan_function" "compute_mean_PPD" "xlevels"
This makes it easy to supply the required object to rstan::As.mcmc.list(). We restrict the resulting object to parameters named “alpha” and “beta” because these are the internal names for the intercept (“alpha”) and the predictor coefficients (“beta”):
bayes.mcmc <- rstan::As.mcmc.list(bayes.mod$stanfit, pars = c("alpha", "beta"))
We can use the coda package (Plummer et al. 2006) to produce visual and numeric convergence diagnostics. This package should have been installed when you installed rstanarm. Load it with library("coda"); if that fails, install the package first:
install.packages("coda")
First, we can produce density- and traceplots of the MCMC output:
plot(bayes.mcmc)
Next, we can generate an autocorrelation plot for each chain; see the background readings for more explanation on these diagnostics:
autocorr.plot(bayes.mcmc, ask = FALSE)
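If you prefer numbers to plots, coda also reports the lag autocorrelations and effective sample sizes directly:

autocorr.diag(bayes.mcmc)   # autocorrelations at selected lags
effectiveSize(bayes.mcmc)   # effective sample size across all chains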
We can calculate other diagnostics:
gelman.plot(bayes.mcmc)
geweke.diag(bayes.mcmc)
## [[1]]
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## alpha[1] beta[1] beta[2]
## -1.1971 0.8257 1.0999
##
##
## [[2]]
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## alpha[1] beta[1] beta[2]
## 0.1356 -0.6124 0.3006
##
##
## [[3]]
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## alpha[1] beta[1] beta[2]
## -1.5749 1.7582 -0.6832
##
##
## [[4]]
##
## Fraction in 1st window = 0.1
## Fraction in 2nd window = 0.5
##
## alpha[1] beta[1] beta[2]
## -1.2324 0.4752 1.4356
raftery.diag(bayes.mcmc)
## [[1]]
##
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95
##
## You need a sample size of at least 3746 with these values of q, r and s
##
## [[2]]
##
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95
##
## You need a sample size of at least 3746 with these values of q, r and s
##
## [[3]]
##
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95
##
## You need a sample size of at least 3746 with these values of q, r and s
##
## [[4]]
##
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95
##
## You need a sample size of at least 3746 with these values of q, r and s
heidel.diag(bayes.mcmc)
## [[1]]
##
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.293
## beta[1] passed 1 0.311
## beta[2] passed 1 0.258
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.85 0.01575
## beta[1] passed 1.14 0.00275
## beta[2] passed -3.10 0.01246
##
## [[2]]
##
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.434
## beta[1] passed 1 0.207
## beta[2] passed 1 0.135
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.86 0.01736
## beta[1] passed 1.13 0.00272
## beta[2] passed -3.09 0.01248
##
## [[3]]
##
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.261
## beta[1] passed 1 0.294
## beta[2] passed 1 0.398
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.84 0.0168
## beta[1] passed 1.14 0.0029
## beta[2] passed -3.09 0.0114
##
## [[4]]
##
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.230
## beta[1] passed 1 0.398
## beta[2] passed 1 0.301
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.86 0.0184
## beta[1] passed 1.13 0.0036
## beta[2] passed -3.10 0.0112
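The numeric counterpart to gelman.plot above is gelman.diag, which prints the potential scale reduction factors directly:

gelman.diag(bayes.mcmc)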
superdiag
A convenient way to obtain numerical representations of all these diagnostics in one sweep is the superdiag package (Tsai, Gill, and Rapkin 2012).
First, install the package:
install.packages("superdiag", dependencies = TRUE, repos = "https://cloud.r-project.org")
Then, load it:
library("superdiag")
Next, call the superdiag
function and specify how many iterations of the chain you want to discard before analyzing for convergence:
superdiag(bayes.mcmc, burnin = 500)
## Number of chains = 4
## Number of iterations = 1000 per chain before discarding the burn-in period
## Burn-in period = 500 per chain
## Sample size in total = 2004
##
## ****************** The Geweke diagnostic: ******************
## Windows:
## chain 1 chain 2 chain 3 chain 4
## From start 0.1 0.0145 0.2219 0.8364
## From stop 0.5 0.8905 0.6656 0.0956
##
## Z-scores:
## chain 1 chain 2 chain 3 chain 4
## alpha[1] -0.3832 0.3590 0.9602 1.2209
## beta[1] 0.4786 0.5523 -1.1999 -0.8167
## beta[2] 0.8382 -0.7836 -1.7627 -0.2159
##
## *************** The Gelman-Rubin diagnostic: ***************
## Potential scale reduction factors:
## Point est. Upper C.I.
## alpha[1] 1.002 1.005
## beta[1] 1.002 1.004
## beta[2] 1.000 1.002
##
## Multivariate psrf: 1.0007
##
## ************* The Heidelberger-Welch diagnostic ************
## Chain 1:
## epsilon=0.1, alpha=0.05
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.2954
## beta[1] passed 1 0.1213
## beta[2] passed 1 0.3417
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.865 0.025055
## beta[1] passed 1.134 0.004027
## beta[2] passed -3.104 0.018964
##
## Chain 2:
## epsilon=0.117, alpha=0.1
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.2341
## beta[1] passed 1 0.3519
## beta[2] passed 1 0.5759
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.852 0.021371
## beta[1] passed 1.135 0.003768
## beta[2] passed -3.101 0.016706
##
## Chain 3:
## epsilon=0.023, alpha=0.1
## Stationarity start p-value
## test iteration
## alpha[1] passed 1 0.101108
## beta[1] passed 1 0.155161
## beta[2] failed NA 0.004056
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.838 0.024222
## beta[1] passed 1.137 0.004044
## beta[2] <NA> NA NA
##
## Chain 4:
## epsilon=0.121, alpha=0.05
## Stationarity start p-value
## test iteration
## alpha[1] passed 52 0.2610
## beta[1] passed 1 0.2290
## beta[2] passed 1 0.2521
##
## Halfwidth Mean Halfwidth
## test
## alpha[1] passed 1.862 0.020061
## beta[1] passed 1.132 0.003669
## beta[2] passed -3.094 0.014868
##
## *************** The Raftery-Lewis diagnostic ***************
## Chain 1:
## Convergence eps = 0.001
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.005
## Probability (s) = 0.95
##
## You need a sample size of at least 3746 with these values of q, r and s
##
## Chain 2:
## Convergence eps = 0.005
## Quantile (q) = 0.01
## Accuracy (r) = +/- 0.001
## Probability (s) = 0.95
##
## You need a sample size of at least 38031 with these values of q, r and s
##
## Chain 3:
## Convergence eps = 0.001
## Quantile (q) = 0.001
## Accuracy (r) = +/- 0.0025
## Probability (s) = 0.99
##
## You need a sample size of at least 1061 with these values of q, r and s
##
## Chain 4:
## Convergence eps = 5e-04
## Quantile (q) = 0.25
## Accuracy (r) = +/- 0.001
## Probability (s) = 0.999
##
## You need a sample size of at least 2030169 with these values of q, r and s
##
## ************* The Hellinger distance diagnostic ************
## Between chains:
## Min Max
## alpha[1] 0.05492 0.09572
## beta[1] 0.05182 0.08657
## beta[2] 0.04030 0.05922
##
## Within chain 1:
## 100 200 300 400
## alpha[1] 0.1415 0.10651 0.11649 0.09259
## beta[1] 0.1523 0.12082 0.13529 0.06390
## beta[2] 0.1193 0.08337 0.09957 0.11687
##
## Within chain 2:
## 100 200 300 400
## alpha[1] 0.13167 0.15644 0.12088 0.1199
## beta[1] 0.10411 0.21417 0.11155 0.1234
## beta[2] 0.06951 0.07329 0.09564 0.1200
##
## Within chain 3:
## 100 200 300 400
## alpha[1] 0.1041 0.05594 0.0965 0.09094
## beta[1] 0.1013 0.03789 0.1119 0.05297
## beta[2] 0.1587 0.11268 0.1738 0.12566
##
## Within chain 4:
## 100 200 300 400
## alpha[1] 0.07784 0.13319 0.10073 0.07097
## beta[1] 0.11501 0.09561 0.09025 0.10106
## beta[2] 0.07474 0.09507 0.06341 0.10099
Note that the rstanarm object by default retains 1,000 post-warmup iterations per chain; hence the burn-in period you provide to superdiag must be less than 1,000.
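Any burn-in below that threshold is valid; for instance (output omitted):

superdiag(bayes.mcmc, burnin = 250)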
mcmcplots
Similar to shinystan, but a tad more convenient, the mcmcplots package (Curtis 2012) collects graphical diagnostics and results on a single page.
First, install the package:
install.packages("mcmcplots", dependencies = TRUE, repos = "https://cloud.r-project.org")
Then, load it:
library("mcmcplots")
For quick diagnostics, you can produce HTML files with trace, density, and autocorrelation plots all on one page. The files will be displayed in your default web browser.
mcmcplot(bayes.mcmc)
A page will open in your browser, looking similar to the one below:
Page produced by mcmcplot()
Some commands for individual plots:
denplot(bayes.mcmc)
traplot(bayes.mcmc)
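Each of these functions accepts a parms argument to restrict the plot to parameters whose names match a given string (see the help files); for instance, to plot only the predictor coefficients:

denplot(bayes.mcmc, parms = "beta")
traplot(bayes.mcmc, parms = "beta")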
If you want to produce a coefficient dot plot with credible intervals, use caterplot:
caterplot(bayes.mcmc,
          labels = c("alpha", "beta1", "beta2"))
As always, check the help files for options to customize these plots.
Yet another convenient option for plotting output is the ggmcmc package (Fernández i Marín 2013):
First, install the package:
install.packages("ggmcmc", dependencies = TRUE, repos = "https://cloud.r-project.org")
Then, load it:
library("ggmcmc")
To use this package, you need to first convert your MCMC object into a ggs object, using the ggs function. This object can then be passed on to the various ggmcmc plotting commands:
bayes.gg <- ggs(bayes.mcmc)
ggs_histogram(bayes.gg)
ggs_density(bayes.gg)
ggs_traceplot(bayes.gg)
ggs_running(bayes.gg)
ggs_compare_partial(bayes.gg)
ggs_autocorrelation(bayes.gg)
ggs_geweke(bayes.gg)
ggs_caterpillar(bayes.gg)
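If you only want to inspect a subset of parameters, ggs accepts a family argument, a regular expression matched against parameter names; for example, to keep only the betas:

bayes.gg.beta <- ggs(bayes.mcmc, family = "beta")
ggs_traceplot(bayes.gg.beta)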
Or, you can use the ggmcmc command to create a PDF file in your working directory containing a variety of diagnostic plots:
ggmcmc(bayes.gg,
       file = "bayes_ggmcmc.pdf")
## Plotting histograms
## Plotting density plots
## Plotting traceplots
## Plotting running means
## Plotting comparison of partial and full chain
## Plotting autocorrelation plots
## Plotting crosscorrelation plot
## Plotting Potential Scale Reduction Factors
## Plotting shrinkage of Potential Scale Reduction Factors
## Plotting Number of effective independent draws
## Plotting Geweke Diagnostic
## Plotting caterpillar plot
## Time taken to generate the report: 5.2 seconds.
As a reminder, the shinystan package (Stan Development Team 2017) offers a convenient way to inspect the convergence of samples in your web browser. Just as you saw in Lab 2, simply call launch_shinystan on the model you fit with rstanarm:
launch_shinystan(bayes.mod)
After a few seconds, your browser should show the shinystan interface:
ShinyStan start-up screen
Click on any of the buttons DIAGNOSE, ESTIMATE, EXPLORE, or HELP to proceed.