Graphical Models

A graphical model represents relationships among a set of variables as a graph. bgms estimates a class of undirected graphical models: pairwise Markov random fields.

Nodes and edges

In a graphical model, the \(p\) variables in a dataset are represented as nodes (or vertices) of a graph. An edge between two nodes indicates a direct statistical association between the corresponding variables, after accounting for all other variables in the model.

The absence of an edge between two nodes means the two variables are conditionally independent given the remaining variables. This is the key interpretive property of Markov random fields: the graph encodes the conditional independence structure of the joint distribution.

Conditional independence

Two variables \(X_i\) and \(X_j\) are conditionally independent given the rest if \(X_i\) provides no additional information about \(X_j\) once all other variables are observed. In the graph, this corresponds to the absence of a direct edge between nodes \(i\) and \(j\).
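
Stated formally, this is the standard definition of conditional independence, not anything specific to bgms. The notation \(\mathbf{x}_{-ij}\), for the values of all variables other than \(X_i\) and \(X_j\), is introduced here for illustration only:

\[ p(x_i, x_j \mid \mathbf{x}_{-ij}) = p(x_i \mid \mathbf{x}_{-ij})\, p(x_j \mid \mathbf{x}_{-ij}) \]

Equivalently, \(p(x_j \mid x_i, \mathbf{x}_{-ij}) = p(x_j \mid \mathbf{x}_{-ij})\): conditioning on \(x_i\) in addition to the rest does not change the distribution of \(x_j\).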

Conditional independence is distinct from marginal independence. Two variables may be marginally correlated, for example because both are associated with a third variable, yet conditionally independent once that third variable is accounted for. Because its edges encode conditional rather than marginal dependence, the Markov random field separates direct from indirect associations.
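
The distinction can be illustrated with a small simulation. The sketch below is not part of bgms; it assumes three hypothetical Gaussian variables in which `x2` drives both `x1` and `x3`, and it uses the textbook first-order partial correlation formula. For Gaussian data, a partial correlation of zero corresponds to conditional independence.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible
n = 20000

# Hypothetical data: x2 drives both x1 and x3; x1 and x3 share no direct link.
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [v + random.gauss(0, 1) for v in x2]
x3 = [v + random.gauss(0, 1) for v in x2]


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)

# First-order partial correlation of x1 and x3 given x2.
partial_13 = (r13 - r12 * r23) / math.sqrt((1 - r12**2) * (1 - r23**2))

print(f"marginal correlation of x1 and x3: {r13:.2f}")
print(f"partial correlation given x2:      {partial_13:.2f}")
```

Under these assumptions the marginal correlation is clearly positive (its population value is 0.5), while the partial correlation is close to zero. In graph terms, the edges 1–2 and 2–3 are present and the edge 1–3 is absent.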

Markov Random Fields

A Markov random field (MRF) is an undirected graphical model in which the joint distribution factorizes according to the graph structure. For \(p\) variables with observations \(\mathbf{x} = (x_1, \ldots, x_p)\), the MRF specifies:

\[ p(\mathbf{x}) \propto \exp\left(\sum_{i} \mu_i(x_i) + \mathbf{x}^{\sf T}\boldsymbol{\Omega}\, \mathbf{x}\right) \]

where \(\mu_i(x_i)\) are node potentials that capture the univariate properties of variable \(i\), and \(\boldsymbol{\Omega}\) is a symmetric pairwise interaction matrix. The off-diagonal entry \(\omega_{ij}\) is the partial association between variables \(x_i\) and \(x_j\). When \(\omega_{ij} = 0\), the two variables are conditionally independent given the remaining variables, and the corresponding edge is absent from the graph. The primary inferential targets are therefore the partial associations in \(\boldsymbol{\Omega}\) and the network structure they imply.
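
To make the link between \(\omega_{ij} = 0\) and conditional independence concrete, the sketch below enumerates the joint distribution of three binary (0/1) variables under the formula above. It is an illustration, not bgms code: the parameter values are hypothetical, the diagonal of \(\boldsymbol{\Omega}\) is set to zero, and the node potentials are assumed to take the simple form \(\mu_i(x_i) = \mu_i x_i\).

```python
import itertools
import math

# Hypothetical parameters; omega[0][2] = 0 means no edge between variables 1 and 3.
mu = [0.2, -0.1, 0.3]
omega = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]


def unnormalized(x):
    """exp(sum_i mu_i * x_i + x' Omega x) for one configuration x."""
    node = sum(m * xi for m, xi in zip(mu, x))
    pair = sum(x[i] * omega[i][j] * x[j] for i in range(3) for j in range(3))
    return math.exp(node + pair)


# Normalize by brute-force enumeration of all 2^3 configurations.
states = list(itertools.product([0, 1], repeat=3))
Z = sum(unnormalized(x) for x in states)
joint = {x: unnormalized(x) / Z for x in states}


def prob(fixed):
    """Probability that the variables indexed in `fixed` take the given values."""
    return sum(p for x, p in joint.items()
               if all(x[i] == v for i, v in fixed.items()))


# Marginally, variables 1 and 3 are dependent: P(1,1) differs from P(1)P(1).
marginal_gap = prob({0: 1, 2: 1}) - prob({0: 1}) * prob({2: 1})

# Given variable 2, they are independent: the same gap vanishes.
conditional_gaps = []
for x2 in (0, 1):
    p2 = prob({1: x2})
    both = prob({0: 1, 1: x2, 2: 1}) / p2
    first = prob({0: 1, 1: x2}) / p2
    third = prob({1: x2, 2: 1}) / p2
    conditional_gaps.append(both - first * third)

print("marginal gap:", marginal_gap)
print("conditional gaps:", conditional_gaps)
```

The marginal gap is positive because the association between variables 1 and 3 is carried through variable 2, while the conditional gaps are zero up to floating-point error. Brute-force enumeration is only feasible for a handful of variables; it is used here purely to keep the example self-contained.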

Model families in `bgms`

bgms estimates one general MRF. The variable types in the data determine the form of the node potentials \(\mu_i(x_i)\) and the interpretation of the pairwise interactions \(\omega_{ij}\):

- Ordinal MRF: all variables are binary or ordinal. Node potentials are category thresholds, and partial associations are proportional to log adjacent-category odds ratios. See Ordinal MRF.
- Gaussian graphical model (GGM): all variables are continuous. The partial associations are directly related to the precision matrix (inverse covariance matrix), whose standardized off-diagonal entries are partial correlations. See GGM.
- Mixed MRF: the data contain both discrete and continuous variables. The interaction matrix decomposes into blocks for discrete–discrete, continuous–continuous, and cross-type associations. See Mixed MRF.
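
For the GGM case, the relation between the precision matrix and partial correlations can be sketched with the standard identity \(\rho_{ij} = -k_{ij} / \sqrt{k_{ii} k_{jj}}\), where \(k_{ij}\) denotes an entry of the precision matrix. The matrix below is hypothetical, and the snippet makes no claim about how bgms parameterizes or reports these quantities.

```python
import math

# Hypothetical precision (inverse covariance) matrix for three continuous variables.
precision = [
    [2.0, -0.8, 0.0],
    [-0.8, 2.0, -0.6],
    [0.0, -0.6, 1.5],
]

p = len(precision)

# Standardize the off-diagonal entries: rho_ij = -k_ij / sqrt(k_ii * k_jj).
partial_corr = [
    [
        1.0 if i == j
        else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
        for j in range(p)
    ]
    for i in range(p)
]

for row in partial_corr:
    print([round(v, 3) for v in row])
```

The zero in position (1, 3) of the precision matrix carries over to a zero partial correlation, which is the GGM counterpart of an absent edge.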

Modeling conditional independence

The conditional independence structure, that is, which edges are present, is typically unknown. bgms treats edge inclusion as a Bayesian model selection problem: for each pair of variables, it compares \(\omega_{ij} = 0\) against \(\omega_{ij} \neq 0\) using spike-and-slab priors. This produces posterior inclusion probabilities that quantify the evidence for each edge. See Edge Selection.
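
As a rough sketch of how a posterior inclusion probability can be read, suppose a sampler returns a 0/1 inclusion indicator for one edge at each iteration. The draws and the prior inclusion probability below are hypothetical placeholders, not bgms output, and the inclusion Bayes factor is the usual ratio of posterior to prior inclusion odds rather than a quantity named in this section.

```python
# Hypothetical posterior draws of the inclusion indicator for a single edge
# (1 = edge present in that draw, 0 = edge absent).
indicator_draws = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

# Assumed prior inclusion probability for the edge.
prior_inclusion = 0.5

# Posterior inclusion probability: the share of draws that include the edge.
posterior_inclusion = sum(indicator_draws) / len(indicator_draws)

# Inclusion Bayes factor: posterior odds of inclusion divided by prior odds.
# This is undefined when every draw includes the edge (posterior_inclusion == 1).
posterior_odds = posterior_inclusion / (1 - posterior_inclusion)
prior_odds = prior_inclusion / (1 - prior_inclusion)
inclusion_bf = posterior_odds / prior_odds

print(f"posterior inclusion probability: {posterior_inclusion:.2f}")
print(f"inclusion Bayes factor: {inclusion_bf:.2f}")
```

With these placeholder draws, 8 of 10 iterations include the edge, so the posterior inclusion probability is 0.8 and the data shift the odds of inclusion by a factor of about 4 relative to the assumed prior.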
