bgmCompare Internals

How bgmCompare() works under the hood: multi-group posterior, contrast parameterization, and sampler implementation. Source: src/bgmCompare/ directory.

Differences from single-group bgm

The bgmCompare engine shares the same OMRF likelihood (pseudolikelihood over ordinal/Blume-Capel variables) and the same NUTS/Metropolis sampler options (HMC is deprecated), but differs in three key ways:

  1. Parameter decomposition — Group-specific parameters \(\boldsymbol{\omega}^{(g)}\) are decomposed into a shared baseline \(\boldsymbol{\omega}\) and contrast parameters \(\boldsymbol{\delta}^{(k)}\), linked through a contrast matrix.
  2. Separate sampler — The bgmCompare engine uses its own chain runner (run_gibbs_sampler_bgmCompare()) rather than the general run_mcmc_chain(), because the multi-group structure requires group-specific likelihood evaluations and contrast selection.
  3. Contrast selection — Instead of edge selection (which edges exist), bgmCompare performs contrast selection: for each parameter, the baseline component is always present, and the spike-and-slab mechanism decides whether contrast parameters are active or collapsed to zero.

Contrast-matrix parameterization

For \(G\) groups, the per-group parameter vector \(\boldsymbol{\omega}^{(g)}\) is expressed as:

\[ \omega_{ij}^{(g)} = \omega_{ij} + \sum_{k=1}^{G-1} c_{gk}\, \delta_{ij}^{(k)} \]

where \(\mathbf{C}\) is a \(G \times (G-1)\) projection (contrast) matrix. The baseline \(\omega_{ij}\) captures the shared structure; the contrast parameters \(\delta_{ij}^{(k)}\) are the projections onto the \(G-1\) orthogonal contrast directions. For two groups, the single contrast reduces to a scaled group difference. Category thresholds are decomposed in the same way.

The flat parameter vector (used by NUTS and HMC) groups by component type, not per effect:

[ overall_mains | overall_pairs | active_main_diffs | active_pair_diffs ]

Contrast parameters appear only when their inclusion indicator is 1; inactive contrasts are excluded from the vector entirely.

Multi-group log-posterior and gradient

The log-pseudoposterior sums over groups. For each group \(g\):

  1. Reconstruct group-specific parameters via the contrast matrix
  2. Compute the pseudolikelihood using group \(g\)’s data (sufficient statistics: category counts, Blume-Capel stats, pairwise stats)
  3. Evaluate the gradient of the pseudolikelihood with respect to the active parameters

The gradient() function combines three components:

  • Observed-data gradient (gradient_observed_active()) — projects group-level sufficient statistics into the active-parameter layout using the contrast matrix. Precomputed from counts_per_category, blume_capel_stats, and pairwise_stats.
  • Expected-data gradient — computed inline within gradient() by building group-specific parameters via the contrast projection, evaluating softmax probabilities, and subtracting expected counts.
  • Prior gradient — Cauchy penalties on both baseline and contrast pairwise parameters, with scale adjusted by the number of observations.

Sampler implementation

run_gibbs_sampler_bgmCompare() orchestrates one MCMC chain. Each iteration:

  1. Missing data imputation (if na_impute = TRUE) — Impute missing observations per group using the current parameters. Updates the observation matrix, category counts, Blume-Capel statistics, and pairwise statistics.
  2. Contrast selection (if active) — A Metropolis step decides whether each contrast parameter is included or collapsed to zero (update_indicator_differences_metropolis_bgmcompare()).
  3. Parameter updates — With adaptive Metropolis, main effects and pairwise effects are updated separately (update_main_effects_metropolis_bgmcompare(), then update_pairwise_effects_metropolis_bgmcompare()). With NUTS or HMC, all active parameters are updated jointly via the vectorized log-pseudoposterior.
  4. Inclusion probability update (Beta-Bernoulli prior only) — Draw new inclusion probability from the conjugate Beta posterior.

The warmup schedule follows the same five-stage structure as the single-group model (see Warmup Schedule).

Active parameter tracking

Because contrast selection can turn parameters on and off, the total_length() function recomputes the active parameter dimension at each iteration. Index maps (build_index_maps()) provide the mapping between the flat active-parameter vector and the structured (variable, group) parameter layout.

When a contrast parameter is “off” (indicator = 0), it is excluded from the NUTS block. The step size and mass matrix adapt to the current active dimension.

Output handling

The bgmCompareOutput struct stores:

Field Content
main_samples Main-effect posterior samples (iterations \(\times\) thresholds \(\times\) groups)
pairwise_samples Pairwise-effect posterior samples (iterations \(\times\) pairs \(\times\) groups)
indicator_samples Contrast inclusion indicators (iterations \(\times\) parameters)
treedepth_samples Tree depth per iteration (NUTS only)
divergent_samples Divergent transition flags (NUTS only)
energy_samples Hamiltonian energy per iteration (NUTS only)

The R output builder (build_output_compare()) splits the posterior means into:

  • $group_posterior_means — Posterior means of the group-level parameters \(\boldsymbol{\omega}^{(g)}\), reconstructed from the baseline \(\omega_{ij}\) and contrast parameters \(\delta_{ij}^{(k)}\)
  • $difference_posterior_means — Posterior means of the contrast parameters \(\delta_{ij}^{(k)}\)
  • $difference_inclusion_probabilities — Proportion of iterations where each contrast indicator was included