The purpose of this tutorial is to briefly show how to write models directly in Stan, rather than using the convenient rstanarm functions. Doing this opens up almost unlimited opportunities to model more complex data structures.
Please note that this tutorial is by no means exhaustive and offers only a quick glimpse of the Stan modeling language. For more details, I recommend the excellent collection of Stan examples accessible via the Stan website. You should also read the package vignette for a more detailed walkthrough of fitting a model using Stan: vignette(topic = "cmdstanr", package = "cmdstanr").
You can use CmdStanR as a backend for the brms package, without ever writing Stan code. Refer to our lecture for an example.
The Stan language is different from R. You will write a Stan model in a text file or a script in RStudio and save it on your computer before you can run Stan from within R. Your call to Stan (within R) will then reference your Stan model file. You can either store your model as a character object in R, or as a separate text file with the .stan extension. Although you can use a text file with any extension, the .stan extension enables syntax highlighting in RStudio. Try opening the included file angell.stan in RStudio to see for yourself!
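For instance, if you prefer keeping the model as a character object, cmdstanr's write_stan_file() function can turn it into a .stan file for you; a minimal sketch (the toy model and object names here are hypothetical, not part of this tutorial's model):

library("cmdstanr")

# hypothetical toy model stored as a character object
model_string <- "
data {
  int<lower=0> N;
}
parameters {
  real mu;
}
model {
  mu ~ normal(0, 1);
}
"

model_path <- write_stan_file(model_string)  # writes a .stan file and returns its path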
A Stan program consists of several blocks, each of which performs a specific function. Every model needs data, parameters, and model blocks, and programs can also contain the transformed parameters and generated quantities blocks.
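As a rough skeleton, the blocks appear in this order (a sketch of the structure, not a complete model):

data {
  // fixed quantities: data and dimensions
}
parameters {
  // random variables to be estimated
}
transformed parameters {
  // optional: deterministic functions of parameters
}
model {
  // priors and likelihood
}
generated quantities {
  // optional: predictions, residuals, posterior predictive draws
}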
The Stan language has a few quirks that you need to follow. Stan allows multi-line comments beginning with /* and ending with */, and single-line comments beginning with //. As before, we recommend commenting your code liberally to facilitate communication with coauthors and your future self.

The data block contains anything that is a fixed quantity. This includes your explanatory and response variables, as well as any indexing or dimensionality variables your model will use. Stan requires you to explicitly declare any object that you will be referencing in your model. This means that we have to assign specific data types to our objects. The two basic data types in Stan are int and real, which correspond to integer and numeric in R. While R doesn't care if you have a dichotomous response variable stored in a numeric object, Stan does, and trying to estimate a logit on a binary dependent variable stored as a real object will cause an error.
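For example, a binary response must be declared as an array of integers, while a continuous response can be a vector; a minimal sketch (y_bin and y_cont are hypothetical variable names, using the array syntax of recent Stan versions):

data {
  int<lower=0> N;                        // number of observations
  array[N] int<lower=0, upper=1> y_bin;  // binary outcome: must be int
  vector[N] y_cont;                      // continuous outcome: real values
}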
For the examples in this tutorial, we're returning to the Angell data (from the carData package) that you've seen before in this course.
data {
/* dimensionality of data and model */
int<lower=0> N; // number of observations
/* response and explanatory variables */
vector[N] moral; // response variable
vector[N] hetero; // x1
vector[N] mobility; // x2
}
This model will carry out a linear regression of the dependent variable \(y\) on the two predictors.
The parameters block contains anything that is a random variable you wish to estimate. Due to the way that Stan's sampler works, it cannot estimate discrete parameters. If you have a model with discrete parameters, you have to marginalize them out before writing the model in Stan. Practically, this means that your parameters will always be reals and never ints.
parameters {
/* parameters to be estimated by model */
vector[3] beta; // regression coefficients
real<lower=0> sigma; // standard deviation
}
This model will estimate three \(\beta\) coefficients (one intercept and two slope coefficients) and the error term \(\sigma\).
The model code below should look familiar: \(y \sim \mathcal{N}(\mu, \sigma)\), where \(\mu = \alpha + \beta_1 x_1 + \beta_2 x_2\). Here, I write the equation directly in the slot for \(\mu\): \(y \sim \mathcal{N}(\alpha + \beta_1 x_1 + \beta_2 x_2, \sigma)\), and I use beta[1], beta[2], and beta[3] instead of \(\alpha\), \(\beta_1\), and \(\beta_2\) to put my coefficients into a vector.
model {
  moral ~ normal(beta[1] + beta[2] * hetero + beta[3] * mobility,
    sigma); // response standard deviation
  /* priors and data */
  beta[1] ~ normal(0, 10); // prior on intercept
  for(i in 2:3){
    beta[i] ~ normal(0, 2.5); // prior on coefficients
  }
  sigma ~ normal(0, 10); // prior on standard deviation
}
This model has a \(\mathcal{N}(0, 2.5)\) prior on the coefficients and a \(\mathcal{N}(0, 10)\) prior on the intercept. If you don't specify a prior on a bounded parameter (e.g., sigma here), then Stan uses an implicit uniform prior over that parameter's support. You can refer to the "prior choice recommendations" by the Stan developers for suggestions. Note that although I'm specifying a normal distribution for \(\sigma\), the constraint real<lower=0> sigma in the parameters block above ensures that Stan will only sample positive values for \(\sigma\).
We can also include a transformed parameters block between the parameters and model blocks, and a generated quantities block after the model block. Transformed parameters are often used to speed up sampling, or to fix the values of specific elements within parameters in latent variable models as an identification restriction. Generated quantities can include values such as \(\hat{y}\)s and residuals, or samples from the posterior predictive distribution. Both of these blocks are executed every iteration, allowing us to monitor objects in them and obtain a full posterior distribution for quantities of interest. In this case, our model file has a generated quantities block that calculates \(\hat{y}\) and the residuals.
generated quantities {
  /* define predictions and residuals */
  vector[N] y_hat;
  vector[N] resid;
  /* calculate predictions and residuals */
  y_hat = beta[1] + beta[2] * hetero + beta[3] * mobility;
  resid = moral - y_hat;
}
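If we also wanted draws from the posterior predictive distribution, we could extend this block using Stan's _rng functions. A sketch of what that might look like (y_rep is a hypothetical addition, not part of angell.stan):

generated quantities {
  /* ... y_hat and resid as above ... */
  vector[N] y_rep;  // hypothetical: posterior predictive draws
  for (i in 1:N) {
    y_rep[i] = normal_rng(beta[1] + beta[2] * hetero[i] + beta[3] * mobility[i], sigma);
  }
}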
This completes the most basic setup of a Stan model: data, parameters, model, and (the optional) generated quantities block. Don’t forget the new line at the end of the file!
Once we have our model written out, we use R to prepare our data, decide which parameters we're interested in monitoring, and set up our sampler. This is done using the cmdstanr package. One advantage of Stan is that it natively supports parallel computing and makes it exceptionally easy to run multiple chains in parallel. The following code loads the cmdstanr package:
library("cmdstanr")
Let’s use the Angell data to estimate a linear regression model.
data <- rio::import("https://www.jkarreth.net/files/angell.csv")
head(data)
moral hetero mobility region city
1 19.0 20.6 15.0 E Rochester
2 17.0 15.6 20.2 E Syracuse
3 16.4 22.1 13.6 E Worcester
4 16.2 14.0 14.8 E Erie
5 15.8 17.4 17.6 MW Milwaukee
6 15.3 27.9 17.5 E Bridgeport
Stan requires variables in the form of a named list, where the names match the variable names declared in the data block.
angell_data_list <- list(moral = data$moral,
                         hetero = data$hetero,
                         mobility = data$mobility,
                         N = nrow(data))
str(angell_data_list)
List of 4
$ moral : num [1:43] 19 17 16.4 16.2 15.8 15.3 15.2 14.3 14.2 14.1 ...
$ hetero : num [1:43] 20.6 15.6 22.1 14 17.4 27.9 22.3 23.7 10.6 12.7 ...
$ mobility: num [1:43] 15 20.2 13.6 14.8 17.6 17.5 14.7 23.8 19.4 31.9 ...
$ N : int 43
We now read the model file into R as an object, which will compile the model:
angell_mod <- cmdstan_model("angell.stan")
Now that angell_mod is a model object, we can access it as follows. We can view the model using angell_mod$print():

angell_mod$print()
data {
/* dimensionality of data and model */
int<lower=0> N; // number of observations
/* response and explanatory variables */
vector[N] moral; // response variable
vector[N] hetero; // x1
vector[N] mobility; // x2
}
parameters {
/* parameters to be estimated by model */
vector[3] beta; // regression coefficients
real<lower=0> sigma; // standard deviation
}
model {
moral ~ normal(beta[1] + beta[2] * hetero + beta[3] * mobility,
sigma); // response standard deviation
/* priors and data */
beta[1] ~ normal(0, 10); // prior on intercept
for(i in 2:3){
beta[i] ~ normal(0, 2.5); // prior on coefficients
}
sigma ~ normal(0, 10); // prior on standard deviation
}
generated quantities {
/* define predictions and residuals */
vector[N] y_hat;
vector[N] resid;
/* calculate predictions and residuals */
y_hat = beta[1] + beta[2] * hetero + beta[3] * mobility;
resid = moral - y_hat;
}
The sample() command will then run the sampler for a specified number of iterations. That function has many arguments, but there are a few key ones:

- The data argument takes our named list of variables.
- The chains argument tells the function how many different chains to run, and parallel_chains defines how many of these chains should be run on separate cores.
- The num_samples argument tells the function how many iterations to run the sampler for.
- You can provide starting values via the init argument, similar to what you did using JAGS/BUGS. See the documentation under model-method-sample for the different options to provide starting values. If no starting values are specified, Stan generates random starting values. If you provide a seed for the RNG, the randomly generated starting values will be the same each time the command is initiated.

Stan refers to the initial discarded draws at the beginning of a sampling run as the warmup. Hamiltonian Monte Carlo only has advantages over Gibbs sampling when the auxiliary parameters are properly tuned, so the warmup phase in Stan always involves adjustments to the sampler. The samples drawn during the warmup phase can be saved (see the save_warmup argument), so we can diagnose exploration of the parameter space and check whether chains from different starting values converged to the same posterior mean. The default number of warmup iterations is one half of the total number of iterations (iter).
angell_stan_fit <- angell_mod$sample(
  data = angell_data_list,  # named list of data
  seed = 123,               # seed for random number generator
  chains = 4,               # number of Markov chains total
  parallel_chains = 4,      # number of Markov chains to run in parallel
  num_samples = 2000        # total number of iterations per chain
)
Warning in angell_mod$sample(data = angell_data_list, seed = 123, chains = 4, : 'num_samples' is deprecated. Please use 'iter_sampling' instead.
Running MCMC with 4 parallel chains...
Chain 1 Iteration: 1 / 3000 [ 0%] (Warmup)
Chain 1 Iteration:  100 / 3000 [  3%] (Warmup)
...
Chain 1 Iteration: 1000 / 3000 [ 33%] (Warmup)
Chain 1 Iteration: 1001 / 3000 [ 33%] (Sampling)
...
Chain 1 Iteration: 3000 / 3000 [100%] (Sampling)
[analogous progress output for chains 2-4 omitted]
Chain 1 finished in 0.2 seconds.
Chain 2 finished in 0.2 seconds.
Chain 3 finished in 0.2 seconds.
Chain 4 finished in 0.2 seconds.
All 4 chains finished successfully.
Mean chain execution time: 0.2 seconds.
Total execution time: 0.3 seconds.
You might get a lot of messages about C++ code while the model compiles, but you can ignore them. You might also get a few warning messages about rejected proposals, but unless there are a large number of these errors, you can ignore them as rejected proposals are a normal part of Metropolis and Hamiltonian Monte Carlo sampling.
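As the deprecation warning above indicates, newer versions of cmdstanr use iter_sampling (and iter_warmup) instead of num_samples. An equivalent call with the current argument names might look like this (a sketch, assuming a recent cmdstanr version):

angell_stan_fit <- angell_mod$sample(
  data = angell_data_list,  # named list of data
  seed = 123,               # seed for random number generator
  chains = 4,               # number of Markov chains total
  parallel_chains = 4,      # number of chains to run in parallel
  iter_warmup = 1000,       # warmup iterations per chain (the default)
  iter_sampling = 2000      # post-warmup sampling iterations per chain
)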
Once our sampler has finished, the samples are stored in a CmdStanFit object, which we can summarize by calling the summary() method. Note that in this example, the output will include posterior estimates of the predicted value y_hat for each observation, because we included that in our Stan model.
angell_stan_fit$summary()
# A tibble: 91 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 lp__ -58.0 -57.7 1.45 1.27 -60.8 -56.3 1.00 3426. 4236.
2 beta[1] 19.7 19.7 1.21 1.21 17.6 21.6 1.00 3263. 4038.
3 beta[2] -0.107 -0.107 0.0177 0.0176 -0.136 -0.0774 1.00 4876. 4900.
4 beta[3] -0.186 -0.186 0.0361 0.0348 -0.245 -0.126 1.00 3466. 3688.
5 sigma 2.32 2.29 0.268 0.261 1.92 2.81 1.00 4969. 4343.
6 y_hat[1] 14.7 14.7 0.612 0.610 13.6 15.7 1.00 3687. 5310.
7 y_hat[2] 14.2 14.2 0.530 0.530 13.4 15.1 1.00 4222. 5731.
8 y_hat[3] 14.8 14.8 0.643 0.636 13.7 15.8 1.00 3626. 5253.
9 y_hat[4] 15.4 15.4 0.667 0.664 14.3 16.5 1.00 3617. 4807.
10 y_hat[5] 14.5 14.5 0.569 0.565 13.6 15.5 1.00 3882. 5332.
# ℹ 81 more rows
You can access specific parameters by specifying the variables argument:
angell_stan_fit$summary(variables = c("beta", "sigma"))
# A tibble: 4 × 10
variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 beta[1] 19.7 19.7 1.21 1.21 17.6 21.6 1.00 3263. 4038.
2 beta[2] -0.107 -0.107 0.0177 0.0176 -0.136 -0.0774 1.00 4876. 4900.
3 beta[3] -0.186 -0.186 0.0361 0.0348 -0.245 -0.126 1.00 3466. 3688.
4 sigma 2.32 2.29 0.268 0.261 1.92 2.81 1.00 4969. 4343.
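If you want to work with the raw posterior draws directly, the $draws() method returns them in several formats, for example as a data frame (a sketch, assuming a reasonably recent cmdstanr/posterior installation):

beta_draws <- angell_stan_fit$draws(variables = "beta", format = "df")
head(beta_draws)  # one row per post-warmup draw, one column per coefficient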
The bayesplot
package includes a number of useful functions for graphically checking convergence:
library("bayesplot")
mcmc_hist(angell_stan_fit$draws("beta"))
mcmc_trace(angell_stan_fit$draws("beta"))
What if we want a plot of the potential scale reduction factor \(\hat{R}\) at each iteration of the sampler? Unfortunately, cmdstanr
doesn’t include a built-in function to produce this plot. Fortunately, it does provide an easy way to convert a CmdStanMCMC
object to a coda
object using the as_mcmc.list()
function.
coda::gelman.plot(postpack::post_subset(as_mcmc.list(angell_stan_fit),
                                        params = "beta"),
                  autoburnin = FALSE)
The shinystan package (Stan Development Team 2017) offers a convenient way to inspect the posterior distribution(s) of parameters in your web browser, using the R package shiny as a backend. This is a great tool for quick model checking, even though you will want to use the commands shown above to produce plots for a manuscript and/or appendix.
shinystan is installed automatically when you install rstanarm. To use it, simply call launch_shinystan on your fitted model object:
library("shinystan")
launch_shinystan(angell_stan_fit)
After a few seconds, your browser should show the shinystan interface.
Click on any of the buttons DIAGNOSE, ESTIMATE, EXPLORE, or MORE to proceed. shinystan can even produce a table in LaTeX format, and you can generate and export a number of figures without typing any code. (For reproducibility purposes, this is not part of your desired workflow, but shinystan does allow for a fast and comprehensive look at convergence diagnostics.)
Now that we're confident that our samples are actually drawn from a posterior distribution of converging chains, we need to communicate the information contained in them to our readers. Here, you can use the same approach we've used before, for instance by creating density plots of the coefficients. Refer to prior labs and lectures for more information.
As before, you can use the mcmcTab() function from the BayesPostEst package to create a table, as long as you convert the CmdStanR object into an mcmc.list using as_mcmc.list():
library("BayesPostEst")
mcmcTab(as_mcmc.list(angell_stan_fit),
pars = c("beta[1]", "beta[2]", "beta[3]"))
Variable Median SD Lower Upper
1 beta[1] 19.662 1.214 17.287 22.035
2 beta[2] -0.107 0.018 -0.142 -0.072
3 beta[3] -0.186 0.036 -0.258 -0.114
All the other functions you saw in Lab 2 should work for CmdStanR objects as well (again, once you convert them into an mcmc.list).
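Since several of these functions expect an mcmc.list, it can be convenient to convert once and reuse the result; a minimal sketch (angell_mcmc is a hypothetical object name):

angell_mcmc <- as_mcmc.list(angell_stan_fit)  # convert once, reuse below
summary(angell_mcmc)                          # standard coda summary
mcmcTab(angell_mcmc, pars = c("beta[1]", "beta[2]", "beta[3]"))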
You can use similar plotting functions as encountered in prior labs and lectures to visualize the posterior distribution(s) of parameters. You first need to load the bayesplot package to produce plots, and then convert the CmdStanR object into a matrix (see above) using a combination of as.matrix() and as_mcmc.list(). You can do this inside the call to the mcmc_areas() plotting function, or beforehand:
library("ggplot2")
mcmc_areas(as.matrix(as_mcmc.list(angell_stan_fit)),
pars = c("beta[2]", "beta[3]")) +
xlim(-0.4, 0.1) +
geom_vline(xintercept = 0) +
scale_y_discrete(labels = c("Heterogeneity", "Mobility"))
For more information on plotting Stan output, please read the excellent vignettes on the bayesplot website.
Stan is very well documented; check out the Stan website as a portal to a plethora of resources. There is also a Stan discussion forum with a lot of examples of estimation problems that the Stan developers helped solve.
There are a few excellent sources for Stan model code that make it easy to translate your skills from this workshop (and your JAGS/BUGS code) to Stan. And recall that you can generate the Stan code underlying a brms model with its stancode function!