Ordinal Markov Random Field

The ordinal Markov random field (MRF) is a graphical model for networks of binary and ordinal variables (Anderson & Vermunt, 2000; Suggala et al., 2017). It is the default model in bgms and forms the discrete component of the Mixed MRF. To fit an ordinal MRF, use the default variable_type = "ordinal".

The model

For \(p\) ordinal variables with observations \(\mathbf{x} = (x_1, \ldots, x_p)\), the ordinal MRF specifies the joint distribution as:

\[ p(\mathbf{x}) \propto \exp\!\left(\sum_{i=1}^{p} \sum_{c=1}^{m_i} \mathcal{I}(x_i = c)\,\mu_{ic} + \mathbf{x}^{\sf T}\boldsymbol{\Omega}\, \mathbf{x}\right) \]

where \(\mu_{ic}\) is the category threshold parameter for category \(c\) of variable \(i\) (one per category above the baseline \(c = 0\), which is fixed at \(\mu_{i0} = 0\)) and \(\boldsymbol{\Omega}\) is a symmetric interaction matrix whose off-diagonal entry \(\omega_{ij}\) is the partial association between variables \(i\) and \(j\).

Why the adjacent category model?

The ordinal MRF can be understood as a multivariate generalization of the adjacent category logit model for ordinal responses (Agresti, 2010).

The univariate case

Consider a single ordinal variable \(x \in \{0, \dots, m\}\), where the category index reflects only the ordering of the response labels. The adjacent category model characterizes the log-odds between successive categories:

\[ \log\!\left( \frac{p(x = c+1)}{p(x = c)} \right) = \eta_c, \quad c = 0, \dots, m-1 \]

with \(\eta_0 = 0\) for identifiability. The resulting probability mass function,

\[ p(x = c) = \frac{\exp\!\left( \sum_{k=c}^{m} \eta_k \right)}{1 + \sum_{j=1}^{m} \exp\!\left( \sum_{k=j}^{m} \eta_k \right)} \]

is a member of the exponential family with canonical parameters \(\eta_c\) and sufficient statistics \(\mathcal{I}(x \geq c)\).

The multivariate extension

For \(p\) ordinal variables, we regress each variable \(x_i\) on the others by modeling the canonical parameters as

\[ \eta_c = \alpha_c + \sum_{j \neq i} \beta_j\, x_j \]

where \(\alpha_c\) are category-specific intercepts and \(\beta_j\) are common slopes. Crucially, the model depends on the ordinal variables only through the sufficient statistics \(\mathcal{I}(x_i \geq c)\) and the interaction terms \(x_i\, x_j = \sum_c \sum_{c'} \mathcal{I}(x_i \geq c)\, \mathcal{I}(x_j \geq c')\). Because these terms are computed from cumulative indicators, the model captures only the ordering of the categories, without relying on any numeric interpretation or spacing of the labels. It is therefore invariant under monotonic recodings and does not assume interval-scale measurement.

Other ordinal regression models, such as the cumulative logit and continuation ratio models (Agresti, 2010), do not share this property. As shown by Suggala et al. (2017), these models do not correspond to full conditionals of a joint distribution, and are therefore incompatible with a multivariate MRF formulation.

The adjacent category model does yield consistent full conditionals, and the joint distribution that generates them is the ordinal MRF (Marsman et al., 2025).

Connection to the MRF framework

The Graphical Models page introduced the general pairwise MRF:

\[ p(\mathbf{x}) \propto \exp\!\left(\sum_{i} \mu_i(x_i) + \mathbf{x}^{\sf T}\boldsymbol{\Omega}\, \mathbf{x}\right) \]

The ordinal MRF is the discrete special case. The node potentials \(\mu_i(x_i)\) are the category thresholds \(\mu_{ic}\), and the interaction matrix \(\boldsymbol{\Omega}\) has off-diagonal entries \(\omega_{ij}\). The bilinear form \(\mathbf{x}^{\sf T}\boldsymbol{\Omega}\,\mathbf{x}\) reduces to \(2\sum_{i < j} \omega_{ij}\, x_i\, x_j\) because \(\boldsymbol{\Omega}\) is symmetric (the diagonal terms are not identifiable, and absorbed by the thresholds).

Partial associations

As described in Graphical Models, bgms reports all pairwise effects on a unified partial association scale. For the ordinal MRF, the partial association \(\omega_{ij}\) is half the log adjacent-category odds ratio. The adjacent-category odds ratio compares, for two variables \(x_i\) and \(x_j\), how the odds of being in category \(h+1\) versus \(h\) for one variable change when the other variable moves from category \(q\) to \(q+1\):

\[ \text{OR}_{ij} = \frac{P(x_i = h,\; x_j = q)\; P(x_i = h+1,\; x_j = q+1)}{P(x_i = h+1,\; x_j = q)\; P(x_i = h,\; x_j = q+1)} = \exp(2\omega_{ij}) \]

This ratio is constant across all category pairs \((h, q)\) and does not depend on the values of the remaining variables. The partial association is therefore:

\[ \omega_{ij} = \tfrac{1}{2}\,\log\text{OR}_{ij} \]

A positive \(\omega_{ij}\) means that the odds of observing one variable in a higher adjacent category increase when the other variable is also in a higher adjacent category.

To obtain the full log adjacent-category odds ratios, use extract_log_odds(fit), which returns \(2\omega_{ij}\) for each pair. The coef(fit)$pairwise method returns the partial associations \(\omega_{ij}\).

Category thresholds

The thresholds \(\mu_{ic}\) are the node potentials of the ordinal MRF. They control the marginal category distribution of each variable — how likely each response category is, apart from any associations with other variables.

Regular ordinal variables

By default, bgms treats ordinal variables as “regular”: each category above the lowest receives its own threshold parameter. This imposes no constraint on the shape of the category distribution. Variables are internally recoded to consecutive integers starting at 0.

For binary variables (two categories, coded 0 and 1), the model reduces to the Ising model (Ising, 1925), a classical MRF for binary data. The single threshold parameter controls the probability of category 1 relative to category 0.

Blume-Capel variables

The Blume-Capel model originates in statistical physics, where Blume (1966) and Capel (1966) independently developed it to study phase transitions in ternary spin systems. As an application to psychology, van der Maas et al. (2026) used the model to study attitudes, depression, and intelligence.

For ordinal variables with a natural baseline or reference category (e.g., a “neutral” response), the Blume-Capel parameterization models category thresholds as a quadratic function of distance from the baseline:

\[ \mu_c = \alpha \cdot (c - b) + \beta \cdot (c - b)^2 \]

where \(c\) is the response category, \(b\) is the baseline category, \(\alpha\) captures the linear trend across categories, and \(\beta\) captures the preference toward or away from the baseline:

  • \(\beta < 0\): the model favors responses near the baseline category
  • \(\beta > 0\): the model favors responses away from the baseline category

Partial associations between Blume-Capel variables use centered scores: the interaction term in the joint distribution becomes \(2(x_i - b_i)(x_j - b_j)\,\omega_{ij}\) instead of \(2\,x_i\,x_j\,\omega_{ij}\). The partial association \(\omega_{ij}\) remains the same; the centering is absorbed by the thresholds. For binary variables, the Blume-Capel parameterization reduces to the Ising model, just as the regular ordinal parameterization does.

To use Blume-Capel variables, set variable_type = "blume-capel" and provide the baseline_category argument:

fit = bgm(x = data, variable_type = "blume-capel",
          baseline_category = 2, seed = 1234)

The baseline_category can be a single integer (applied to all variables) or a vector of length \(p\).

The thresholds are nuisance parameters — they are estimated during MCMC sampling but are typically not of direct inferential interest. To extract them, use extract_main_effects(fit).

Pseudolikelihood

The normalizing constant of the ordinal MRF involves a sum over all possible configurations of \(p\) ordinal variables, which is intractable for all but the smallest networks. bgms uses a pseudolikelihood approximation (Marsman et al., 2025): the product of full conditional distributions replaces the joint distribution.

The full conditional distribution for variable \(i\) given all other variables is:

\[ p(x_i = c \mid \mathbf{x}_{-i}) = \frac{\exp\!\left(\mu_{ic} + c \cdot r_i\right)}{\sum_{k=0}^{m_i} \exp\!\left(\mu_{ik} + k \cdot r_i\right)} \]

where \(r_i = 2\sum_{j \neq i} x_j\, \omega_{ij}\) is the residual score — a sufficient reduction that summarizes the influence of all other variables on \(x_i\). The pseudolikelihood is the product of these conditionals across all observations and variables.

Edge selection

When edge_selection = TRUE (the default), bgms applies spike-and-slab priors to the partial associations \(\omega_{ij}\). The spike sets \(\omega_{ij} = 0\), enforcing the absence of the edge. The slab places a Cauchy prior on \(\omega_{ij}\), allowing the effect size to be estimated from the data. See Prior Basics for the spike-and-slab specification.

The posterior inclusion probability for each edge quantifies the evidence for a direct association, averaged across all possible configurations of the remaining edges. A posterior inclusion probability exceeding 0.5 shows evidence for the presence of the corresponding edge, while a probability below 0.5 shows evidence for the exclusion of the corresponding edge. See Edge Selection.

Missing data

Missing values can be handled via listwise deletion (the default) or imputed within the Gibbs sampler by setting na_action = "impute". For discrete data, missing entries are drawn from their full conditional categorical distribution at each MCMC iteration. See Missing Data for details on both approaches and recommendations.

Fitting an ordinal MRF

The ordinal MRF is the default model type. Call bgm() directly:

fit = bgm(x = data, seed = 1234)

For Blume-Capel variables:

fit = bgm(x = data, variable_type = "blume-capel",
          baseline_category = 3, seed = 1234)

The summary() and coef() methods work the same as for GGM and mixed models. coef(fit)$pairwise returns the posterior means of the partial associations \(\omega_{ij}\), and coef(fit)$indicator returns the posterior inclusion probabilities.

To obtain model-specific parameterizations, use the dedicated extractors:

  • extract_main_effects(fit) — returns the category threshold parameters.
  • extract_log_odds(fit) — returns the log adjacent-category odds ratios \(2\omega_{ij}\).

References

Agresti, A. (2010). Analysis of ordinal categorical data (2nd ed.). Wiley. https://doi.org/10.1002/9780470594001
Anderson, C. J., & Vermunt, J. K. (2000). Log-multiplicative association models as latent variable models for nominal and/or ordinal data. Sociological Methodology, 30(1), 81–121. https://doi.org/10.1111/0081-1750.00076
Blume, M. (1966). Theory of the first-order magnetic phase change in UO\(_2\). Physical Review, 141(2), 517–524. https://doi.org/10.1103/PhysRev.141.517
Capel, H. W. (1966). On the possibility of first-order phase transitions in Ising systems of triplet ions with zero-field splitting. Physica, 32(5), 966–988. https://doi.org/10.1016/0031-8914(66)90027-9
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31(1), 253–258. https://doi.org/10.1007/BF02980577
Marsman, M., van den Bergh, D., & Haslbeck, J. M. B. (2025). Bayesian analysis of the ordinal Markov random field. Psychometrika, 90(1), 146–182. https://doi.org/10.1017/psy.2024.4
Suggala, A. S., Yang, E., & Ravikumar, P. (2017). Ordinal graphical models: A tale of two approaches. In D. Precup & Y. W. Teh (Eds.), Proceedings of the 34th international conference on machine learning (Vol. 70, pp. 3260–3269). PMLR.
van der Maas, H. L. J., Borsboom, D., & Waldorp, L. (2026). The statistical physics of psychological networks: Zero matters. Psychological Review. https://doi.org/10.1037/rev0000611