OMRF
Implementation details of the ordinal Markov random field in bgms. Source: src/models/omrf/omrf_model.h and omrf_model.cpp.
Ordinal threshold parameterization
For an ordinal variable \(X_i\) with \(K_i + 1\) categories \((0, 1, \ldots, K_i)\), the model assigns \(K_i\) threshold parameters \(\mu_i(1), \ldots, \mu_i(K_i)\). Category \(c = 0\) serves as the reference with \(\mu_i(0) = 0\). The conditional probability of observing category \(c\) for person \(n\) is:
\[ P(X_{ni} = c \mid \text{rest}) = \frac{\exp\!\bigl(\mu_i(c) + c \cdot r_{ni}\bigr)}{\sum_{k=0}^{K_i} \exp\!\bigl(\mu_i(k) + k \cdot r_{ni}\bigr)} \]
where \(r_{ni} = 2\sum_{j \neq i} x_{nj}\, \omega_{ij}\) is the residual score summarizing the pairwise interactions with all other variables.
Blume-Capel model
For Blume-Capel variables, the threshold structure is replaced by two parameters: a linear effect \(\alpha_i\) and a quadratic effect \(\beta_i\). The category score is:
\[ \mu_i(c) = \alpha_i (c - b_i) + \beta_i (c - b_i)^2 \]
where \(b_i\) is the reference (baseline) category. This allows the model to capture non-monotone category effects, such as preferences for moderate responses.
Residual matrix
The model maintains a precomputed \(n \times p\) residual matrix where entry \((n, i)\) stores:
\[ r_{ni} = 2\sum_{j \neq i} x_{nj}\, \omega_{ij} \]
When a single interaction \(\omega_{ij}\) changes by \(\delta\), the residual matrix is updated incrementally:
- Column \(i\): \(r_{ni} \leftarrow r_{ni} + 2\delta \cdot x_{nj}\) for all \(n\)
- Column \(j\): \(r_{nj} \leftarrow r_{nj} + 2\delta \cdot x_{ni}\) for all \(n\)
This avoids recomputing the full matrix product after each parameter update.
Pseudolikelihood
The OMRF uses a pseudolikelihood approximation: the product of full conditional distributions rather than the intractable joint distribution. The log-pseudolikelihood is:
For a cross-model overview, see Pseudolikelihood.
\[ \text{PL}(\boldsymbol{\theta}) = \sum_{i=1}^{p} \sum_{n=1}^{N} \log P(X_{ni} = x_{ni} \mid \text{rest}) \]
The gradient of the log-pseudolikelihood with respect to each parameter involves the expected category values under the current conditionals, computed by the Variable Helpers.
Gradient computation
The logp_and_gradient() method computes both the log-pseudoposterior and its gradient in a single pass. The observed contribution to the gradient for a pairwise interaction \(\omega_{ij}\) is \(4 \sum_{n} x_{ni} x_{nj}\) (from the sufficient statistic \(\mathbf{X}^\top\mathbf{X}\)). The expected contribution accumulates over both variables:
\[ \frac{\partial \text{PL}}{\partial \omega_{ij}} = 4\sum_{n=1}^{N} x_{ni}\, x_{nj} - 2\sum_{n=1}^{N} \bigl(E[X_i \mid \text{rest}]\, x_{nj} + E[X_j \mid \text{rest}]\, x_{ni}\bigr) \]
The expected values \(E[X_i \mid \text{rest}]\) are computed from the category probabilities, which are obtained via the numerically stable FAST/SAFE paths in variable_helpers.cpp. The pairwise gradient is vectorized using BLAS matrix products: \(\mathbf{X}^\top \mathbf{E}\) where \(\mathbf{E}\) is the matrix of conditional expected values.
Edge indicator updates
Edge selection uses a Metropolis-Hastings sampler based on the mixtures of mutually singular (MoMS) distributions framework (Gottardo & Raftery, 2008; van den Bergh et al., 2026). All interaction parameters live in a fixed-dimensional space; excluded edges are represented as parameters fixed at zero rather than removed from the model.
When edge selection is active, the model performs a full scan over all \(p(p-1)/2\) edges after each gradient-based step. For each edge \((i, j)\):
Add move (\(\gamma_{ij} = 0 \to 1\)). Propose a new interaction value \(\omega_{ij}^\star \sim \mathcal{N}(\omega_{ij},\, \sigma_{ij}^2)\) and accept with probability \[ \alpha = \min\!\left\{1,\; \frac{p(\boldsymbol{\gamma}^\star, \boldsymbol{\omega}^\star \mid \mathbf{X})} {p(\boldsymbol{\gamma},\, \boldsymbol{\omega} \mid \mathbf{X})} \;\frac{1}{q(\omega_{ij}^\star \mid \omega_{ij})} \right\} \] where the posterior ratio includes the pseudolikelihood ratio, the Cauchy slab prior on \(\omega_{ij}^\star\), and the prior inclusion odds.
Delete move (\(\gamma_{ij} = 1 \to 0\)). Set \(\omega_{ij}^\star = 0\) and accept with probability \[ \alpha = \min\!\left\{1,\; \frac{p(\boldsymbol{\gamma}^\star, \boldsymbol{\omega}^\star \mid \mathbf{X})} {p(\boldsymbol{\gamma},\, \boldsymbol{\omega} \mid \mathbf{X})} \;\frac{q(\omega_{ij} \mid \omega_{ij}^\star)}{1} \right\} \]
The scan order is randomized at each iteration via prepare_iteration(), which calls arma_randperm().
Adaptive Metropolis fallback
Even when NUTS is the primary sampler, the OMRF implements element-wise adaptive Metropolis proposals for use during warmup stage 3b (proposal SD learning) and as a user-selectable alternative via update_method = "adaptive-metropolis". The proposal SDs are tuned by Robbins-Monro targeting 0.44 acceptance, clamped to \([0.001, 2.0]\).