Lab 3: A closer look at convergence

Applied Bayesian Modeling (ICPSR Summer Program 2025)
Affiliation: Ursinus College

Published: July 20, 2025

The purpose of this tutorial is to show how to deploy the most common convergence diagnostics after fitting a Bayesian model in R (see the previous lab).

If you find any errors in this document or have suggestions for improvement, please email me.

How to use this tutorial

This tutorial provides a general approach to convergence diagnostics in R, independent of the software used to fit a Bayesian model. For consistency with Lab 2, we begin with a model fit using rstanarm, but the tutorial below also works with models fit with rstan, JAGS, or BUGS.

This tutorial should be read in conjunction with the background readings on convergence.

In addition, the Stan reference manual offers a detailed discussion of convergence-related issues and some practical tips in chapter 16.

Fitting the model

We pick up from Lab 2 and fit a model after generating a “fake” dataset:

Code
n.sim <- 100; set.seed(123)
x1 <- rnorm(n = n.sim, mean = 5, sd = 2)
x2 <- rbinom(n.sim, size = 1, prob = 0.3)
e <- rnorm(n = n.sim, mean = 0, sd = 1)
b1 <- 1.2
b2 <- -3.1
a <- 1.5
y <- a + b1 * x1 + b2 * x2 + e
sim.dat <- data.frame(y, x1, x2)
library("rstanarm")
bayes.mod <- stan_glm(y ~ x1 + x2, 
                     data = sim.dat,
                     family = "gaussian",
                     prior = normal(location = 0, scale = 10))

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 3.7e-05 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.37 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 1: 
Chain 1:  Elapsed Time: 0.035 seconds (Warm-up)
Chain 1:                0.036 seconds (Sampling)
Chain 1:                0.071 seconds (Total)
Chain 1: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
Chain 2: 
Chain 2: Gradient evaluation took 5e-06 seconds
Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 2: Adjust your expectations accordingly!
Chain 2: 
Chain 2: 
Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 2: 
Chain 2:  Elapsed Time: 0.04 seconds (Warm-up)
Chain 2:                0.035 seconds (Sampling)
Chain 2:                0.075 seconds (Total)
Chain 2: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
Chain 3: 
Chain 3: Gradient evaluation took 8e-06 seconds
Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Chain 3: Adjust your expectations accordingly!
Chain 3: 
Chain 3: 
Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 3: 
Chain 3:  Elapsed Time: 0.036 seconds (Warm-up)
Chain 3:                0.036 seconds (Sampling)
Chain 3:                0.072 seconds (Total)
Chain 3: 

SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
Chain 4: 
Chain 4: Gradient evaluation took 5e-06 seconds
Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
Chain 4: Adjust your expectations accordingly!
Chain 4: 
Chain 4: 
Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
Chain 4: 
Chain 4:  Elapsed Time: 0.036 seconds (Warm-up)
Chain 4:                0.034 seconds (Sampling)
Chain 4:                0.07 seconds (Total)
Chain 4: 

A first take at convergence is provided in the standard summary of the model object:

Code
summary(bayes.mod)

Model Info:
 function:     stan_glm
 family:       gaussian [identity]
 formula:      y ~ x1 + x2
 algorithm:    sampling
 sample:       4000 (posterior sample size)
 priors:       see help('prior_summary')
 observations: 100
 predictors:   3

Estimates:
              mean   sd   10%   50%   90%
(Intercept)  1.9    0.3  1.5   1.8   2.2 
x1           1.1    0.1  1.1   1.1   1.2 
x2          -3.1    0.2 -3.4  -3.1  -2.8 
sigma        0.9    0.1  0.9   0.9   1.0 

Fit Diagnostics:
           mean   sd   10%   50%   90%
mean_PPD 6.8    0.1  6.7   6.8   7.0  

The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).

MCMC diagnostics
              mcse Rhat n_eff
(Intercept)   0.0  1.0  5064 
x1            0.0  1.0  4776 
x2            0.0  1.0  4807 
sigma         0.0  1.0  3950 
mean_PPD      0.0  1.0  4739 
log-posterior 0.0  1.0  1696 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

Here, the \(\hat{R}\) or Rhat measure (the “potential scale reduction factor”) is the square root of the ratio of the pooled variance across all chains to the average within-chain variance. When chains are mixing well, this ratio should be very close to 1. You can read more on this in Gelman and Shirley (2011), section 6.6 (available in the course folder).
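In a simplified (non-split) form, with \(M\) chains of \(N\) draws each, let \(W\) be the average of the within-chain variances and \(B\) be \(N\) times the variance of the chain means. Then the two MCMC diagnostics printed above can be written as

\[
\mathrm{mcse} \approx \frac{\mathrm{sd}}{\sqrt{n_{\mathrm{eff}}}},
\qquad
\hat{R} = \sqrt{\frac{\widehat{\mathrm{var}}^{+}}{W}},
\qquad
\widehat{\mathrm{var}}^{+} = \frac{N-1}{N}\,W + \frac{1}{N}\,B.
\]

At convergence, both \(\widehat{\mathrm{var}}^{+}\) and \(W\) estimate the posterior variance, so \(\hat{R}\) approaches 1. Note that this is the textbook version of the statistic; Stan's implementation splits each chain in half (and recent versions add rank normalization), so the printed value can differ slightly.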

Preparing the posterior output for further analysis

Regardless of what program is used for estimating the regression, the next step is to convert the posterior distribution into an R object of the class mcmc.list for further analysis. An mcmc.list object is similar to a list of matrices, but it contains some additional information about the number of chains and iterations that is necessary for convergence diagnostics.
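To make this structure concrete, here is a minimal sketch of an mcmc.list built by hand from simulated draws. The parameter names alpha and beta below are invented for illustration, but they mirror the internal names we will see in our model object:

```r
library("coda")

set.seed(123)
# Two "chains" of 1000 draws each for two invented parameters
chain1 <- mcmc(cbind(alpha = rnorm(1000), beta = rnorm(1000)))
chain2 <- mcmc(cbind(alpha = rnorm(1000), beta = rnorm(1000)))
toy.mcmc <- mcmc.list(chain1, chain2)

nchain(toy.mcmc)        # number of chains: 2
niter(toy.mcmc)         # iterations per chain: 1000
varnames(toy.mcmc)      # "alpha" "beta"
effectiveSize(toy.mcmc) # effective sample size, summed over chains
```

Because these toy chains are independent draws, every diagnostic below would pass trivially; the point is only to show the bookkeeping (chains, iterations, parameter names) that an mcmc.list carries.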

To convert a model fit with rstanarm into an mcmc.list, we use the rstan::As.mcmc.list() function. This function takes two (relevant) arguments: an object of class stanfit, and the name(s) of the parameters whose convergence we wish to inspect. A stanfit object is what rstan produces by default. We, however, used rstanarm as a wrapper around rstan, and therefore have a different type of object:

Code
class(bayes.mod)
[1] "stanreg" "glm"     "lm"     

Upon closer inspection, though, we find that this object contains an object of the class stanfit:

Code
names(bayes.mod)
 [1] "coefficients"      "ses"               "fitted.values"     "linear.predictors" "residuals"         "df.residual"       "covmat"            "y"                 "model"             "data"              "family"            "offset"            "weights"           "prior.weights"    
[15] "contrasts"         "na.action"         "formula"           "terms"             "prior.info"        "dropped_cols"      "algorithm"         "stan_summary"      "stanfit"           "rstan_version"     "call"              "stan_function"     "compute_mean_PPD"  "xlevels"          

This makes it easy to supply the required object to rstan::As.mcmc.list(). We restrict the output to the parameters named “alpha” and “beta”, the internal names rstanarm uses for the intercept (“alpha”) and the predictor coefficients (“beta”).

Code
bayes.mcmc <- rstan::As.mcmc.list(bayes.mod$stanfit, pars = c("alpha", "beta"))

Convergence diagnostics using the coda package

We can use the coda package (Plummer et al. 2006) to produce visual and numeric convergence diagnostics. This package should have been installed when you installed rstanarm. If

Code
library("coda")

fails, install the package:

Code
install.packages("coda")

First, we can produce density and trace plots of the MCMC output:

Code
plot(bayes.mcmc)

Next, we can generate an autocorrelation plot for each chain; see the background readings for more explanation on these diagnostics:

Code
autocorr.plot(bayes.mcmc, ask = FALSE)

We can also calculate several other numeric diagnostics, beginning with the Gelman-Rubin shrink factor plot:

Code
gelman.plot(bayes.mcmc)

Code
geweke.diag(bayes.mcmc)
[[1]]

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5 

alpha[1]  beta[1]  beta[2] 
  -1.197    0.826    1.100 


[[2]]

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5 

alpha[1]  beta[1]  beta[2] 
   0.136   -0.612    0.301 


[[3]]

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5 

alpha[1]  beta[1]  beta[2] 
  -1.575    1.758   -0.683 


[[4]]

Fraction in 1st window = 0.1
Fraction in 2nd window = 0.5 

alpha[1]  beta[1]  beta[2] 
  -1.232    0.475    1.436 
Code
raftery.diag(bayes.mcmc)
[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[4]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s
Code
heidel.diag(bayes.mcmc)
[[1]]
                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.293  
beta[1]  passed       1         0.311  
beta[2]  passed       1         0.258  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.85 0.01575  
beta[1]  passed     1.14 0.00275  
beta[2]  passed    -3.10 0.01246  

[[2]]
                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.434  
beta[1]  passed       1         0.207  
beta[2]  passed       1         0.135  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.86 0.01736  
beta[1]  passed     1.13 0.00272  
beta[2]  passed    -3.09 0.01248  

[[3]]
                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.261  
beta[1]  passed       1         0.294  
beta[2]  passed       1         0.398  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.84 0.0168   
beta[1]  passed     1.14 0.0029   
beta[2]  passed    -3.09 0.0114   

[[4]]
                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.230  
beta[1]  passed       1         0.398  
beta[2]  passed       1         0.301  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.86 0.0184   
beta[1]  passed     1.13 0.0036   
beta[2]  passed    -3.10 0.0112   

Quick convergence diagnostics: superdiag

A convenient way to analyze numerical convergence diagnostics in one sweep is the superdiag package (Tsai, Gill, and Rapkin 2012).

First, install the package:

Code
install.packages("superdiag", dependencies = TRUE, repos = "https://cloud.r-project.org")

Then, load it:

Code
library("superdiag")

Next, call the superdiag function and specify how many iterations of the chain you want to discard before analyzing for convergence:

Code
superdiag(bayes.mcmc, burnin = 500)
Number of chains = 4 
Number of iterations = 1000 per chain before discarding the burn-in period
Burn-in period = 500 per chain
Sample size in total = 2004 

****************** The Geweke diagnostic: ******************
Windows:
           chain 1 chain 2 chain 3 chain 4
From start     0.1   0.015   0.222   0.836
From stop      0.5   0.891   0.666   0.096

Z-scores:
         chain 1 chain 2 chain 3 chain 4
alpha[1]  -0.383   0.359    0.96   1.221
beta[1]    0.479   0.552   -1.20  -0.817
beta[2]    0.838  -0.784   -1.76  -0.216

*************** The Gelman-Rubin diagnostic: ***************
Potential scale reduction factors:
         Point est. Upper C.I.
alpha[1]          1          1
beta[1]           1          1
beta[2]           1          1

Multivariate psrf: 1.001

************* The Heidelberger-Welch diagnostic ************
Chain 1:
epsilon=0.1, alpha=0.05                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.295  
beta[1]  passed       1         0.121  
beta[2]  passed       1         0.342  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.86 0.02505  
beta[1]  passed     1.13 0.00403  
beta[2]  passed    -3.10 0.01896  

Chain 2:
epsilon=0.117, alpha=0.1                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       1         0.234  
beta[1]  passed       1         0.352  
beta[2]  passed       1         0.576  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.85 0.02137  
beta[1]  passed     1.14 0.00377  
beta[2]  passed    -3.10 0.01671  

Chain 3:
epsilon=0.023, alpha=0.1                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed        1        0.10111
beta[1]  passed        1        0.15516
beta[2]  failed       NA        0.00406
                                 
         Halfwidth Mean Halfwidth
         test                    
alpha[1] passed    1.84 0.02422  
beta[1]  passed    1.14 0.00404  
beta[2]  <NA>        NA      NA  

Chain 4:
epsilon=0.121, alpha=0.05                                       
         Stationarity start     p-value
         test         iteration        
alpha[1] passed       52        0.261  
beta[1]  passed        1        0.229  
beta[2]  passed        1        0.252  
                                  
         Halfwidth Mean  Halfwidth
         test                     
alpha[1] passed     1.86 0.02006  
beta[1]  passed     1.13 0.00367  
beta[2]  passed    -3.09 0.01487  

*************** The Raftery-Lewis diagnostic ***************
Chain 1:
Convergence eps = 0.001
Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

Chain 2:
Convergence eps = 0.005
Quantile (q) = 0.01
Accuracy (r) = +/- 0.001
Probability (s) = 0.95 

You need a sample size of at least 38031 with these values of q, r and s

Chain 3:
Convergence eps = 0.001
Quantile (q) = 0.001
Accuracy (r) = +/- 0.0025
Probability (s) = 0.99 

You need a sample size of at least 1061 with these values of q, r and s

Chain 4:
Convergence eps = 5e-04
Quantile (q) = 0.25
Accuracy (r) = +/- 0.001
Probability (s) = 0.999 

You need a sample size of at least 2030169 with these values of q, r and s

************* The Hellinger distance diagnostic ************
Between chains: 
            Min    Max
alpha[1] 0.0549 0.0957
beta[1]  0.0518 0.0866
beta[2]  0.0403 0.0592

Within chain 1:
           100    200    300    400
alpha[1] 0.141 0.1065 0.1165 0.0926
beta[1]  0.152 0.1208 0.1353 0.0639
beta[2]  0.119 0.0834 0.0995 0.1169

Within chain 2:
            100    200    300   400
alpha[1] 0.1317 0.1564 0.1209 0.120
beta[1]  0.1041 0.2141 0.1115 0.123
beta[2]  0.0696 0.0733 0.0956 0.120

Within chain 3:
           100    200    300   400
alpha[1] 0.104 0.0560 0.0965 0.091
beta[1]  0.101 0.0379 0.1119 0.053
beta[2]  0.159 0.1127 0.1738 0.126

Within chain 4:
            100    200    300   400
alpha[1] 0.0778 0.1332 0.1007 0.071
beta[1]  0.1150 0.0956 0.0902 0.101
beta[2]  0.0748 0.0951 0.0634 0.101

Note that the rstanarm object by default retains 1000 post-warmup iterations per chain, hence the burn-in period you supply to superdiag must be less than 1000.

Quick diagnostic plots: mcmcplots

Similar to shinystan, but a tad more convenient, the mcmcplots package (Curtis 2012) lets you obtain graphical diagnostics and results on one page:

First, install the package:

Code
install.packages("mcmcplots", dependencies = TRUE, repos = "https://cloud.r-project.org")

Then, load it:

Code
library("mcmcplots")

For quick diagnostics, you can produce HTML files with trace, density, and autocorrelation plots all on one page. The files will be displayed in your default web browser.

Code
mcmcplot(bayes.mcmc)

A page will open in your browser, looking similar to the one below:

Page produced by mcmcplot()

Some commands for individual plots:

Code
denplot(bayes.mcmc)

Code
traplot(bayes.mcmc)

If you want to produce a coefficient dot plot with credible intervals, use caterplot:

Code
caterplot(bayes.mcmc, 
  labels = c("alpha", "beta1", "beta2"))

As always, check the help files for options to customize these plots.

More plots: ggmcmc

Yet another convenient option for plotting output is the ggmcmc package (Fernández i Marín 2013):

First, install the package:

Code
install.packages("ggmcmc", dependencies = TRUE, repos = "https://cloud.r-project.org")

Then, load it:

Code
library("ggmcmc")

To use this package, you need to first convert your MCMC object into a ggs object, using the ggs function. This object can then be passed on to the various ggmcmc plotting commands:

Code
bayes.gg <- ggs(bayes.mcmc)
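Under the hood, ggs() returns a long-format data frame with one row per draw, chain, and parameter, which is what makes it compose so naturally with ggplot2. A quick sketch on a toy mcmc.list (the parameter name alpha and the column layout below reflect current ggmcmc versions, so treat this as an assumption to verify on your installation):

```r
library("coda")
library("ggmcmc")

set.seed(123)
# A toy mcmc.list, converted the same way as bayes.mcmc above
toy <- mcmc.list(mcmc(cbind(alpha = rnorm(100))),
                 mcmc(cbind(alpha = rnorm(100))))
toy.gg <- ggs(toy)
head(toy.gg)  # expect columns Iteration, Chain, Parameter, value
```

The same inspection works on bayes.gg; each ggs_* command below is simply a ggplot2 recipe applied to this long-format table.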
Code
ggs_histogram(bayes.gg)

Code
ggs_density(bayes.gg)

Code
ggs_traceplot(bayes.gg)

Code
ggs_running(bayes.gg)

Code
ggs_compare_partial(bayes.gg)

Code
ggs_autocorrelation(bayes.gg)

Code
ggs_geweke(bayes.gg)

Code
ggs_caterpillar(bayes.gg)

Or, you can use the ggmcmc command to create a PDF file containing a variety of diagnostic plots in your working directory:

Code
ggmcmc(bayes.gg, 
       file = "bayes_ggmcmc.pdf")
Plotting histograms
Plotting density plots
Plotting traceplots
Plotting running means
Plotting comparison of partial and full chain
Plotting autocorrelation plots
Plotting crosscorrelation plot
Plotting Potential Scale Reduction Factors
Plotting shrinkage of Potential Scale Reduction Factors
Plotting Number of effective independent draws
Plotting Geweke Diagnostic
Plotting caterpillar plot
Time taken to generate the report: 2.8 seconds.

Using shinystan

As a reminder, the shinystan package (Stan Development Team 2017) offers a convenient way to inspect the convergence of samples in your web browser. Just as you saw in Lab 2, simply call launch_shinystan on the model you fit with rstanarm:

Code
launch_shinystan(bayes.mod)

After a few seconds, your browser should show the shinystan interface:

ShinyStan start-up screen

Click on any of the buttons DIAGNOSE, ESTIMATE, EXPLORE, or HELP to proceed.

References

Cowles, Mary Kathryn, and Bradley P. Carlin. 1996. “Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review.” Journal of the American Statistical Association 91 (434): 883–904. http://www.jstor.org/stable/2291683.
Curtis, S. McKay. 2012. mcmcplots: Create Plots from MCMC Output. http://CRAN.R-project.org/package=mcmcplots.
Fernández i Marín, Xavier. 2013. ggmcmc: Graphical Tools for Analyzing Markov Chain Monte Carlo Simulations from Bayesian Inference. http://xavier-fim.net/packages/ggmcmc.
Gelman, Andrew, and Kenneth Shirley. 2011. “Inference from Simulations and Monitoring Convergence.” In Handbook of Markov Chain Monte Carlo, edited by Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng, 163–74. Chapman; Hall/CRC.
Plummer, Martyn, Nicky Best, Kate Cowles, and Karen Vines. 2006. “CODA: Convergence Diagnosis and Output Analysis for MCMC.” R News 6 (1): 7–11. http://CRAN.R-project.org/doc/Rnews/.
Robert, Christian, and George Casella. 2010. Introducing Monte Carlo Methods with R. New York, NY: Springer.
Stan Development Team. 2017. shinystan: Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models. http://mc-stan.org/.
Tsai, Tsung-han, Jeff Gill, and Jonathan Rapkin. 2012. superdiag: R Code for Testing Markov Chain Nonconvergence. http://CRAN.R-project.org/package=superdiag.