Mixed Markov Random Field

The mixed Markov random field (MRF) is a graphical model for networks containing both discrete (binary or ordinal) and continuous variables. It unites the Ordinal MRF and the GGM into a single joint distribution through the conditional Gaussian framework (Lauritzen, 1996; Lauritzen & Wermuth, 1989). To fit a mixed MRF, provide a variable_type vector that includes both discrete and continuous types.

The model

For $p$ discrete variables $\mathbf{x} = (x_1, \ldots, x_p)$ and $q$ continuous variables $\mathbf{y} = (y_1, \ldots, y_q)$, the mixed MRF specifies the joint distribution as:

\[ p(\mathbf{x}, \mathbf{y}) \propto \exp\!\left( \sum_{i=1}^{p} \sum_{c=1}^{m_i}\mathcal{I}(x_i = c)\,\mu_{ic} + \mathbf{x}^{\sf T}\boldsymbol{\Omega}_{xx}\,\mathbf{x} + (\mathbf{y} - \boldsymbol{\mu}_y)^{\sf T}\boldsymbol{\Omega}_{yy}\,(\mathbf{y} - \boldsymbol{\mu}_y) + 2\,\mathbf{x}^{\sf T}\boldsymbol{\Omega}_{xy}\,\mathbf{y}\right) \]

where:

$\mu_{ic}$ are category thresholds for discrete variable $i$, category $c$ (with baseline $\mu_{i0} = 0$), as in the Ordinal MRF
$\boldsymbol{\mu}_y$ is the $q$-vector of continuous means
$\boldsymbol{\Omega}_{xx}$ is a $p \times p$ symmetric matrix of discrete–discrete interactions
$\boldsymbol{\Omega}_{yy}$ is a $q \times q$ symmetric negative definite matrix, related to the (positive definite) precision matrix by $\boldsymbol{\Theta} = -2\,\boldsymbol{\Omega}_{yy}$
$\boldsymbol{\Omega}_{xy}$ is a $p \times q$ matrix of cross-type interactions

The three interaction blocks ($\boldsymbol{\Omega}_{xx}$, $\boldsymbol{\Omega}_{yy}$, $\boldsymbol{\Omega}_{xy}$) enter the density on the same scale.
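As a concrete reading of the formula above, the following base R sketch evaluates the unnormalized log density for a single observation. The function name `log_density_unnorm`, the list layout of the thresholds, and all parameter values are assumptions made for illustration; this is not how bgms stores or evaluates the model internally.

```r
# Unnormalized log density of the mixed MRF for one observation.
# x: integer scores (0..m_i) of the p discrete variables
# y: values of the q continuous variables
# thresholds: list of length p; thresholds[[i]][c] is mu_ic for c = 1..m_i
log_density_unnorm <- function(x, y, thresholds, mu_y,
                               Omega_xx, Omega_yy, Omega_xy) {
  # Threshold term: mu_ic for the observed category, 0 for the baseline
  main <- sum(mapply(function(mu_i, x_i) if (x_i == 0) 0 else mu_i[x_i],
                     thresholds, x))
  d <- y - mu_y
  quad_x <- drop(t(x) %*% Omega_xx %*% x)      # discrete-discrete block
  quad_y <- drop(t(d) %*% Omega_yy %*% d)      # continuous-continuous block
  cross  <- 2 * drop(t(x) %*% Omega_xy %*% y)  # cross-type block
  main + quad_x + quad_y + cross
}

# Toy parameters (illustrative only): p = 2 discrete, q = 2 continuous
thresholds <- list(c(0.5, -0.2), c(0.1))               # m_1 = 2, m_2 = 1
mu_y       <- c(0, 1)
Omega_xx   <- matrix(c(0, 0.3, 0.3, 0), 2, 2)
Omega_yy   <- matrix(c(-0.5, 0.1, 0.1, -0.5), 2, 2)    # negative definite
Omega_xy   <- matrix(c(0.2, 0, 0, -0.1), 2, 2)

val <- log_density_unnorm(c(2, 1), c(0.5, 1.5), thresholds, mu_y,
                          Omega_xx, Omega_yy, Omega_xy)
```

Only differences of this quantity between configurations are meaningful, because the normalizing constant is omitted.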

Conditional Gaussian structure

The mixed MRF belongs to the class of conditional Gaussian (CG) distributions (Lauritzen & Wermuth, 1989): given the discrete variables, the continuous variables follow a multivariate normal distribution:

\[ \mathbf{y} \mid \mathbf{x} \sim \mathcal{N}\!\left(\boldsymbol{\mu}_y + 2\,\boldsymbol{\Sigma}\,\boldsymbol{\Omega}_{xy}^{\sf T}\,\mathbf{x},\;\boldsymbol{\Sigma}\right) \]

where $\boldsymbol{\Sigma} = \boldsymbol{\Theta}^{-1} = (-2\,\boldsymbol{\Omega}_{yy})^{-1}$ is the conditional covariance. The conditional mean $\boldsymbol{\mu}_y + 2\,\boldsymbol{\Sigma}\,\boldsymbol{\Omega}_{xy}^{\sf T}\,\mathbf{x}$ shifts linearly with the discrete scores: a one-unit increase in $x_i$ moves the conditional mean vector by $2\,\boldsymbol{\Sigma}$ times row $i$ of $\boldsymbol{\Omega}_{xy}$, so the cross-type interactions $\omega_{ij}^{xy}$ determine the size of the shift (for $y_j$ it is exactly proportional to $\omega_{ij}^{xy}$ when $\boldsymbol{\Sigma}$ is diagonal). The conditional covariance $\boldsymbol{\Sigma}$ does not depend on $\mathbf{x}$: the discrete variables affect the location of the continuous distribution but not its spread.
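The conditional mean and covariance can be computed directly from the interaction blocks. The base R sketch below uses made-up parameter values and a hypothetical helper `cond_mean`; it is meant only to show the algebra, not the bgms implementation.

```r
# Toy parameters (illustrative only): p = 2 discrete, q = 2 continuous
Omega_yy <- matrix(c(-0.5, 0.1, 0.1, -0.5), 2, 2)
Omega_xy <- matrix(c(0.2, 0, 0.1, -0.1), 2, 2)   # rows: discrete, cols: continuous
mu_y     <- c(0, 1)

Theta <- -2 * Omega_yy     # precision matrix of y given x
Sigma <- solve(Theta)      # conditional covariance; the same for every x

# Conditional mean of y given the discrete scores x
cond_mean <- function(x) drop(mu_y + 2 * Sigma %*% t(Omega_xy) %*% x)

cond_mean(c(0, 0))   # all discrete scores at baseline: equals mu_y
cond_mean(c(1, 0))   # raising x_1 by one unit shifts the mean
```

Changing `x` changes only the output of `cond_mean`; `Sigma` is computed once and never revisited.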

In the pure GGM, bgms centers the continuous data so that the mean vector is zero and drops out of the model (see GGM). In the mixed MRF, this centering is not possible: the cross-type interactions couple the continuous means to the discrete variables, so $\boldsymbol{\mu}_y$ remains a free parameter.

The marginal distribution of $\mathbf{x}$, obtained by integrating out $\mathbf{y}$, is an ordinal MRF whose effective interactions absorb the indirect associations between discrete variables mediated through the continuous block.

Connection to the MRF framework

The Graphical Models page introduced the general pairwise MRF:

\[ p(\mathbf{z}) \propto \exp\!\left(\sum_{i} \mu_i(z_i) + \mathbf{z}^{\sf T}\boldsymbol{\Omega}\, \mathbf{z}\right) \]

The mixed MRF extends this framework to a mixed variable vector $\mathbf{z} = (\mathbf{x}, \mathbf{y})$. The interaction matrix $\boldsymbol{\Omega}$ decomposes into three blocks — $\boldsymbol{\Omega}_{xx}$, $\boldsymbol{\Omega}_{yy}$, and $\boldsymbol{\Omega}_{xy}$ — corresponding to the three types of edges in the graph. When all variables are discrete ($q = 0$), the continuous blocks vanish and the model reduces to the ordinal MRF. When all variables are continuous ($p = 0$), the discrete blocks vanish and the model reduces to the GGM with precision matrix $\boldsymbol{\Theta} = -2\,\boldsymbol{\Omega}_{yy}$.
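The block decomposition can be checked numerically. In the base R sketch below (toy values, with the continuous means set to zero for simplicity), the full interaction matrix is assembled from the three blocks and the quadratic form $\mathbf{z}^{\sf T}\boldsymbol{\Omega}\,\mathbf{z}$ is compared with the sum of the block terms; the factor 2 on the cross-type term arises because $\boldsymbol{\Omega}_{xy}$ appears in both off-diagonal positions.

```r
# Toy interaction blocks (illustrative only)
Omega_xx <- matrix(c(0, 0.3, 0.3, 0), 2, 2)
Omega_yy <- matrix(c(-0.5, 0.1, 0.1, -0.5), 2, 2)
Omega_xy <- matrix(c(0.2, 0, 0.1, -0.1), 2, 2)

# Full symmetric interaction matrix for z = (x, y)
Omega <- rbind(cbind(Omega_xx,    Omega_xy),
               cbind(t(Omega_xy), Omega_yy))

x <- c(2, 1)
y <- c(0.5, -1)
z <- c(x, y)

full   <- drop(t(z) %*% Omega %*% z)
blocks <- drop(t(x) %*% Omega_xx %*% x) +
          drop(t(y) %*% Omega_yy %*% y) +
          2 * drop(t(x) %*% Omega_xy %*% y)

all.equal(full, blocks)
```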

Partial associations

The mixed MRF has three types of partial associations, all reported on the same scale in coef(fit)$pairwise:

Discrete–discrete. The off-diagonal entries of $\boldsymbol{\Omega}_{xx}$ have the same interpretation as in the ordinal MRF: each $\omega_{ij}^{xx}$ is half the log adjacent-category odds ratio, constant across all category pairs. See Ordinal MRF.

Continuous–continuous. The off-diagonal entries of $\boldsymbol{\Omega}_{yy}$ are related to the precision element by $\omega_{ij}^{yy} = -\tfrac{1}{2}\,\theta_{ij}$, the same relationship as in the GGM. See GGM.

Cross-type. The entries of $\boldsymbol{\Omega}_{xy}$ capture the direct association between a discrete variable $x_i$ and a continuous variable $y_j$, after accounting for all other variables. A cross-type partial association enters both the full conditional of $x_i$ (through the rest score) and the conditional mean of $y_j$.

When $\omega_{ij} = 0$ for any of the three types, the two variables are conditionally independent and the corresponding edge is absent from the graph. The spike-and-slab prior operates on partial associations regardless of type, so edge selection works uniformly across all three blocks.

To obtain model-specific parameterizations: extract_precision(fit) returns $\boldsymbol{\Theta}$, extract_partial_correlations(fit) returns partial correlations for the continuous block, and extract_log_odds(fit) returns the log adjacent-category odds ratios $2\omega_{ij}^{xx}$ for the discrete block.
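The relationships behind these parameterizations can be written out by hand. The sketch below uses base R and toy values; it applies the standard definition of a partial correlation, $\rho_{ij} = -\theta_{ij} / \sqrt{\theta_{ii}\,\theta_{jj}}$, which is assumed here rather than taken from the bgms source, and it does not call the extractors themselves.

```r
# Toy continuous block (illustrative only)
Omega_yy <- matrix(c(-0.5, 0.1, 0.1, -0.5), 2, 2)

# Precision matrix: Theta = -2 * Omega_yy
Theta <- -2 * Omega_yy

# Partial correlations: rho_ij = -theta_ij / sqrt(theta_ii * theta_jj)
pcor <- -cov2cor(Theta)
diag(pcor) <- 1

# Discrete block: log adjacent-category odds ratio is twice the interaction
omega_xx_12 <- 0.3
log_odds_12 <- 2 * omega_xx_12
```

A positive $\omega_{ij}^{yy}$ corresponds to a negative precision element and therefore to a positive partial correlation.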

Pseudolikelihood

The continuous conditional $p(\mathbf{y} \mid \mathbf{x})$ is a multivariate normal with a closed-form likelihood. For $n$ observations:

\[ \log p(\mathbf{Y} \mid \mathbf{X}) = \frac{n}{2}\left(\log|\boldsymbol{\Theta}| - q\log(2\pi)\right) - \frac{1}{2}\sum_{i=1}^{n}(\mathbf{y}_i - \mathbf{M}_i)^{\sf T}\boldsymbol{\Theta}\,(\mathbf{y}_i - \mathbf{M}_i) \]

where $\mathbf{M}_i = \boldsymbol{\mu}_y + 2\,\boldsymbol{\Sigma}\,\boldsymbol{\Omega}_{xy}^{\sf T}\,\mathbf{x}_i$ is the conditional mean for observation $i$. This expression is computed exactly — no pseudolikelihood approximation is needed for the continuous block.
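A direct translation of this log-likelihood into base R might look as follows. The function name `cond_loglik` and the simulated inputs are assumptions for illustration; the sketch follows the formula above and makes no claim about how bgms organizes the computation.

```r
# Exact log-likelihood of the continuous block given the discrete block.
# Y: n x q matrix of continuous data, X: n x p matrix of discrete scores
cond_loglik <- function(Y, X, mu_y, Omega_yy, Omega_xy) {
  Theta <- -2 * Omega_yy
  Sigma <- solve(Theta)
  n <- nrow(Y)
  q <- ncol(Y)
  # Row i holds the conditional mean mu_y + 2 * Sigma %*% t(Omega_xy) %*% x_i
  cond_means <- sweep(2 * X %*% Omega_xy %*% Sigma, 2, mu_y, "+")
  R <- Y - cond_means
  log_det <- as.numeric(determinant(Theta, logarithm = TRUE)$modulus)
  n / 2 * (log_det - q * log(2 * pi)) - 0.5 * sum((R %*% Theta) * R)
}

# Toy data (simulated, illustrative only): n = 5, p = 2, q = 2
set.seed(1)
X <- matrix(sample(0:2, 10, replace = TRUE), 5, 2)
Y <- matrix(rnorm(10), 5, 2)
Omega_yy <- matrix(c(-0.5, 0.1, 0.1, -0.5), 2, 2)
Omega_xy <- matrix(c(0.2, 0, 0.1, -0.1), 2, 2)

ll <- cond_loglik(Y, X, mu_y = c(0, 1), Omega_yy, Omega_xy)
```

The expression `sum((R %*% Theta) * R)` evaluates all $n$ quadratic forms at once without an explicit loop.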

For the discrete block, bgms integrates out the continuous variables to obtain a marginal MRF on $\mathbf{x}$ and applies a pseudolikelihood approximation to that marginal MRF. After integration, the log marginal density is

\[ \log p(\mathbf{x}) = \text{const} + \sum_{i=1}^{p}\sum_{c=1}^{m_i}\mathcal{I}(x_i = c)\,\mu_{ic} + \mathbf{x}^{\sf T} \mathbf{M}\, \mathbf{x} + 2(\boldsymbol{\Omega}_{xy}\boldsymbol{\mu}_y)^{\sf T}\,\mathbf{x} \]

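with effective discrete–discrete interaction matrix $\mathbf{M}$ (a $p \times p$ matrix, not to be confused with the conditional mean vectors $\mathbf{M}_i$ above)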

\[ \mathbf{M} = \boldsymbol{\Omega}_{xx} + 2\,\boldsymbol{\Omega}_{xy}\,\boldsymbol{\Sigma}\,\boldsymbol{\Omega}_{xy}^{\sf T}. \]

The off-diagonal entries of $\mathbf{M}$ combine the direct discrete–discrete partial associations in $\boldsymbol{\Omega}_{xx}$ with the indirect associations induced through the continuous block. The linear term $2(\boldsymbol{\Omega}_{xy}\boldsymbol{\mu}_y)^{\sf T}\mathbf{x}$ shifts the discrete category bias by an amount that depends on the cross-type interactions and the continuous means.
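The closed form for $\mathbf{M}$ and the linear shift can be checked against numerical integration. The base R sketch below uses toy values with a single continuous variable ($q = 1$) so that `integrate()` suffices; thresholds are left out because the integration does not affect them. The helper names are hypothetical.

```r
# Toy parameters (illustrative only): p = 2 discrete, q = 1 continuous
Omega_xx <- matrix(c(0, 0.3, 0.3, 0), 2, 2)
Omega_xy <- matrix(c(0.2, -0.1), 2, 1)
Omega_yy <- matrix(-0.5, 1, 1)
mu_y     <- 0.4

Sigma <- solve(-2 * Omega_yy)
M     <- Omega_xx + 2 * Omega_xy %*% Sigma %*% t(Omega_xy)
shift <- 2 * drop(Omega_xy %*% mu_y)

# Closed-form log marginal kernel (up to an additive constant)
log_marg <- function(x) drop(t(x) %*% M %*% x) + sum(shift * x)

# Same quantity obtained by integrating the joint kernel over y
log_marg_num <- function(x) {
  kernel <- function(y) {
    exp(drop(t(x) %*% Omega_xx %*% x) +
        Omega_yy[1, 1] * (y - mu_y)^2 +
        2 * drop(t(x) %*% Omega_xy) * y)
  }
  log(integrate(kernel, -Inf, Inf)$value)
}

# Additive constants cancel in differences between configurations
xa <- c(1, 1)
xb <- c(0, 0)
closed  <- log_marg(xa) - log_marg(xb)
numeric <- log_marg_num(xa) - log_marg_num(xb)
```

The two differences should agree up to the accuracy of the numerical integration.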

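The full conditional for discrete variable $x_i$ in this marginal MRF has the same softmax form as in the Ordinal MRF, with one addition: the diagonal of $\mathbf{M}$ is generally nonzero, so the term $M_{ii}\,x_i^2$ in $\mathbf{x}^{\sf T}\mathbf{M}\,\mathbf{x}$ contributes a quadratic term in the category score (for a binary variable, $c^2 = c$ and this term acts as a shift of the threshold):

\[ p(x_i = c \mid \mathbf{x}_{-i}) = \frac{\exp\!\left(\mu_{ic} + c^2 M_{ii} + c \cdot r_i\right)}{\sum_{k=0}^{m_i}\exp\!\left(\mu_{ik} + k^2 M_{ii} + k \cdot r_i\right)} \]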

with rest score

\[ r_i = 2\sum_{j \neq i} M_{ij}\, x_j + 2\,(\boldsymbol{\Omega}_{xy}\boldsymbol{\mu}_y)_i. \]

The pseudolikelihood is the product of these full conditionals across all observations and discrete variables.
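One way to evaluate the discrete pseudolikelihood numerically, without relying on a particular closed form, is to evaluate the marginal log kernel at every category of variable $i$ while holding the other variables fixed, and then normalize. The base R sketch below does this with toy values; `M` and `shift` stand for the effective interaction matrix and $2\,\boldsymbol{\Omega}_{xy}\boldsymbol{\mu}_y$, and all function names are hypothetical rather than part of bgms.

```r
# Marginal log kernel of x (additive constant omitted)
log_kernel <- function(x, thresholds, M, shift) {
  main <- sum(mapply(function(mu_i, x_i) if (x_i == 0) 0 else mu_i[x_i],
                     thresholds, x))
  main + drop(t(x) %*% M %*% x) + sum(shift * x)
}

# log p(x_i = observed value | x_{-i}) by normalizing over categories 0..m_i
log_full_conditional <- function(x, i, thresholds, M, shift) {
  cats <- 0:length(thresholds[[i]])
  lk <- vapply(cats, function(k) {
    x[i] <- k
    log_kernel(x, thresholds, M, shift)
  }, numeric(1))
  # log-sum-exp for numerical stability
  lk[x[i] + 1] - (max(lk) + log(sum(exp(lk - max(lk)))))
}

# Log pseudolikelihood: sum over observations (rows) and discrete variables
log_pseudolik <- function(X, thresholds, M, shift) {
  total <- 0
  for (r in seq_len(nrow(X))) {
    for (i in seq_len(ncol(X))) {
      total <- total + log_full_conditional(X[r, ], i, thresholds, M, shift)
    }
  }
  total
}

# Toy inputs (illustrative only)
thresholds <- list(c(0.5, -0.2), c(0.1))   # m_1 = 2, m_2 = 1
M     <- matrix(c(0.08, 0.26, 0.26, 0.02), 2, 2)
shift <- c(0.16, -0.08)
X     <- rbind(c(2, 1), c(0, 0), c(1, 1))

lpl <- log_pseudolik(X, thresholds, M, shift)
```

This brute-force normalization costs one kernel evaluation per category, which is fine for illustration but slower than working with the rest score directly.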

Edge selection

When edge_selection = TRUE (the default), bgms applies spike-and-slab priors to all three types of partial associations. The spike sets $\omega_{ij} = 0$, enforcing the absence of the edge. The slab places a Cauchy prior on $\omega_{ij}$, allowing the effect size to be estimated from the data. See Prior Basics for the spike-and-slab specification.

The posterior inclusion probability for each edge quantifies the evidence for a direct association, averaged across all possible configurations of the remaining edges. Relative to a prior inclusion probability of 0.5, a posterior inclusion probability above 0.5 indicates evidence for the presence of the corresponding edge, and a probability below 0.5 indicates evidence for its absence. See Edge Selection.
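Posterior inclusion probabilities are often converted to inclusion Bayes factors, the ratio of posterior to prior inclusion odds. The sketch below assumes a prior inclusion probability of 0.5 unless told otherwise; `inclusion_bf` is a hypothetical helper written for this example, not a bgms function.

```r
# Inclusion Bayes factor: posterior inclusion odds divided by prior odds
inclusion_bf <- function(pip, prior_prob = 0.5) {
  (pip / (1 - pip)) / (prior_prob / (1 - prior_prob))
}

bf_strong <- inclusion_bf(0.9)   # posterior odds of about 9 to 1
bf_absent <- inclusion_bf(0.2)   # below 1: evidence favors exclusion
```

With a prior inclusion probability other than 0.5, the threshold at which the data start to favor inclusion moves accordingly, which is why the prior odds appear in the denominator.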

Missing data

Missing values can be handled via listwise deletion (the default) or imputed within the Gibbs sampler by setting na_action = "impute". For mixed data, missing discrete entries are drawn from their full conditional categorical distribution and missing continuous entries from their conditional normal distribution at each MCMC iteration. See Missing Data for details on both approaches and recommendations.

Fitting a mixed MRF

Each column in the data must have a corresponding entry in variable_type. Allowed types are "ordinal", "blume-capel", and "continuous":

library(bgms)

# `data` has five columns: three ordinal variables, then two continuous ones
fit = bgm(
  x = data,
  variable_type = c("ordinal", "ordinal", "ordinal",
                     "continuous", "continuous"),
  seed = 1234
)
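Because variable_type must line up with the columns of the data, it can help to build the vector programmatically and check its length. The helper below is a hypothetical convenience, not part of bgms: it labels numeric columns with only whole-number values as "ordinal" and everything else as "continuous". That heuristic is an assumption and will mislabel continuous variables that happen to be recorded as whole numbers, so the result should be inspected before use.

```r
# Hypothetical helper: guess a variable_type vector from numeric columns
guess_variable_type <- function(data) {
  is_whole <- vapply(
    data,
    function(col) all(col == round(col), na.rm = TRUE),
    logical(1)
  )
  unname(ifelse(is_whole, "ordinal", "continuous"))
}

# Toy data frame (illustrative only)
toy <- data.frame(
  a = c(0L, 1L, 2L),
  b = c(0, 1, 1),
  y = c(0.3, -1.2, 2.5)
)

variable_type <- guess_variable_type(toy)

# One entry per column, as required
stopifnot(length(variable_type) == ncol(toy))
```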

For Blume-Capel variables in the discrete block, provide baseline_category:

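# Two Blume-Capel variables followed by two continuous variables;
# baseline_category is NA for the continuous columns
fit = bgm(
  x = data,
  variable_type = c("blume-capel", "blume-capel",
                     "continuous", "continuous"),
  baseline_category = c(2, 2, NA, NA),
  seed = 1234
)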

The summary() and coef() methods work the same as for single-type models. coef(fit)$pairwise returns all partial associations on a unified scale, and coef(fit)$indicator returns the posterior inclusion probabilities.

To obtain model-specific parameterizations, use the dedicated extractors:

extract_main_effects(fit) — returns category thresholds (discrete) and continuous means.
extract_precision(fit) — returns the precision matrix $\boldsymbol{\Theta}$ for the continuous block.
extract_partial_correlations(fit) — returns partial correlations for the continuous block.
extract_log_odds(fit) — returns log adjacent-category odds ratios $2\omega_{ij}^{xx}$ for the discrete block.

Extractor requirements

Some extractors require at least two variables of the corresponding type:

extract_precision() and extract_partial_correlations() require at least two continuous variables.
extract_log_odds() requires at least two discrete variables (binary/ordinal or Blume-Capel).

If a model does not meet these requirements, these quantities are not defined for that fit.
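These requirements can be checked from the variable_type vector before calling an extractor. The function below is a hypothetical guard written for this example; it only counts variable types and does not call bgms.

```r
# Hypothetical guard: which extractors are defined for a given variable_type?
available_extractors <- function(variable_type) {
  n_cont <- sum(variable_type == "continuous")
  n_disc <- sum(variable_type %in% c("ordinal", "blume-capel"))
  c(
    extract_precision            = n_cont >= 2,
    extract_partial_correlations = n_cont >= 2,
    extract_log_odds             = n_disc >= 2
  )
}

# Three ordinal variables and one continuous variable:
# the continuous-block extractors are not defined for this model
avail <- available_extractors(
  c("ordinal", "ordinal", "ordinal", "continuous")
)
```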

References

Lauritzen, S. L. (1996). Graphical models. Oxford University Press.

Lauritzen, S. L., & Wermuth, N. (1989). Graphical models for associations between variables, some of which are qualitative and some quantitative. The Annals of Statistics, 17(1), 31–57. https://doi.org/10.1214/aos/1176347003