simulate_mrf() generates observations from a Markov Random
Field using user-specified parameters. For ordinal and
Blume-Capel variables, observations are generated via Gibbs
sampling. For continuous variables (Gaussian graphical model),
observations are drawn directly from the multivariate normal
distribution implied by the precision matrix.
Usage
simulate_mrf(
num_states,
num_variables,
num_categories,
pairwise,
main,
variable_type = "ordinal",
baseline_category,
iter = 1000,
seed = NULL
)Arguments
- num_states
The number of observations to be generated.
- num_variables
The number of variables in the MRF.
- num_categories
Either a positive integer or a vector of positive integers of length
num_variables. The number of response categories on top of the base category:num_categories = 1generates binary states. Only used for ordinal and Blume-Capel variables; ignored whenvariable_type = "continuous".- pairwise
A symmetric
num_variablesbynum_variablesmatrix. For ordinal and Blume-Capel variables, this contains the pairwise interaction parameters; only the off-diagonal elements are used. For continuous variables, this is the precision matrix \(\Omega\) (including diagonal) and must be positive definite.- main
For ordinal and Blume-Capel variables: a
num_variablesbymax(num_categories)matrix of category thresholds. The elements in rowiindicate the thresholds of variablei. Ifnum_categoriesis a vector, only the firstnum_categories[i]elements are used in rowi. If the Blume-Capel model is used for the category thresholds for variablei, then rowirequires two values (details below); the first is \(\alpha\), the linear contribution of the Blume-Capel model and the second is \(\beta\), the quadratic contribution. For continuous variables: a numeric vector of lengthnum_variablescontaining the means \(\mu\) for each variable. Defaults to zeros if not supplied or if all values are zero.- variable_type
What kind of variables are simulated? Can be a single character string specifying the variable type of all
pvariables at once or a vector of character strings of lengthpspecifying the type for each variable separately. Currently, bgm supports"ordinal","blume-capel", and"continuous". Binary variables are automatically treated as"ordinal". Ordinal and Blume-Capel variables can be mixed freely, but continuous variables cannot be mixed with ordinal or Blume-Capel variables. Whenvariable_type = "continuous", the function simulates from a Gaussian graphical model. Defaults tovariable_type = "ordinal".- baseline_category
An integer vector of length
num_variablesspecifying the baseline_category category that is used for the Blume-Capel model (details below). Can be any integer value between0andnum_categories(ornum_categories[i]).- iter
The number of iterations used by the Gibbs sampler (ordinal/Blume-Capel variables only). The function provides the last state of the Gibbs sampler as output. Ignored for continuous variables. By default set to
1e3.- seed
Optional integer seed for reproducibility. If
NULL, a seed is generated from R's random number generator (soset.seed()can be used before calling this function).
Value
A num_states by num_variables matrix of simulated
observations. For ordinal/Blume-Capel variables, entries are non-negative
integers. For continuous variables, entries are real-valued.
Details
Ordinal / Blume-Capel variables: The Gibbs sampler is initiated with random values from the response options, after which it proceeds by simulating states for each variable from its full conditional distribution given the other variable states.
Continuous variables (GGM):
Observations are drawn from \(N(\mu, \Omega^{-1})\)
where \(\Omega\) is the precision matrix specified via
pairwise and \(\mu\) is the means vector specified via main.
No Gibbs sampling is needed; iter is ignored.
There are two modeling options for the category thresholds. The default option assumes that the category thresholds are free, except that the first threshold is set to zero for identification. The user then only needs to specify the thresholds for the remaining response categories. This option is useful for any type of ordinal variable and gives the user the most freedom in specifying their model.
The Blume-Capel option is specifically designed for ordinal variables that have a special type of baseline_category category, such as the neutral category in a Likert scale. The Blume-Capel model specifies the following quadratic model for the threshold parameters: $$\mu_{\text{c}} = \alpha (\text{c} - \text{r}) + \beta (\text{c} - \text{r})^2$$ where \(\mu_{\text{c}}\) is the threshold for category c (which now includes zero), \(\alpha\) offers a linear trend across categories (increasing threshold values if \(\alpha > 0\) and decreasing threshold values if \(\alpha <0\)), if \(\beta < 0\), it offers an increasing penalty for responding in a category further away from the baseline_category category r, while \(\beta > 0\) suggests a preference for responding in the baseline_category category.
See also
simulate.bgms for simulating from a fitted model.
Other prediction:
predict.bgmCompare(),
predict.bgms(),
simulate.bgmCompare(),
simulate.bgms()
Examples
# Generate responses from a network of five binary and ordinal variables.
num_variables = 5
num_categories = sample(1:5, size = num_variables, replace = TRUE)
Pairwise = matrix(0, nrow = num_variables, ncol = num_variables)
Pairwise[2, 1] = Pairwise[4, 1] = Pairwise[3, 2] =
Pairwise[5, 2] = Pairwise[5, 4] = .25
Pairwise = Pairwise + t(Pairwise)
Main = matrix(0, nrow = num_variables, ncol = max(num_categories))
x = simulate_mrf(
num_states = 1e3,
num_variables = num_variables,
num_categories = num_categories,
pairwise = Pairwise,
main = Main
)
#> Warning: The matrix ``main'' contains numeric values for variable 2 for category
#> (categories, i.e., columns) exceding the maximum of 3. These values will
#> be ignored.
#> Warning: The matrix ``main'' contains numeric values for variable 3 for category
#> (categories, i.e., columns) exceding the maximum of 2. These values will
#> be ignored.
#> Warning: The matrix ``main'' contains numeric values for variable 4 for category
#> (categories, i.e., columns) exceding the maximum of 1. These values will
#> be ignored.
# Generate responses from a network of 2 ordinal and 3 Blume-Capel variables.
num_variables = 5
num_categories = 4
Pairwise = matrix(0, nrow = num_variables, ncol = num_variables)
Pairwise[2, 1] = Pairwise[4, 1] = Pairwise[3, 2] =
Pairwise[5, 2] = Pairwise[5, 4] = .25
Pairwise = Pairwise + t(Pairwise)
Main = matrix(NA, num_variables, num_categories)
Main[, 1] = -1
Main[, 2] = -1
Main[3, ] = sort(-abs(rnorm(4)), decreasing = TRUE)
Main[5, ] = sort(-abs(rnorm(4)), decreasing = TRUE)
x = simulate_mrf(
num_states = 1e3,
num_variables = num_variables,
num_categories = num_categories,
pairwise = Pairwise,
main = Main,
variable_type = c("b", "b", "o", "b", "o"),
baseline_category = 2
)
# Generate responses from a Gaussian graphical model (GGM) with 4 variables.
num_variables = 4
# Precision matrix (symmetric, positive definite)
Omega = diag(c(1, 1.2, 0.8, 1.5))
Omega[2, 1] = Omega[1, 2] = 0.3
Omega[3, 1] = Omega[1, 3] = 0.3
Omega[4, 2] = Omega[2, 4] = -0.2
x = simulate_mrf(
num_states = 500,
num_variables = num_variables,
pairwise = Omega,
variable_type = "continuous"
)