NUTS Algorithm
The No-U-Turn Sampler (NUTS) is the primary MCMC algorithm used in bgms for continuous parameter sampling. It extends Hamiltonian Monte Carlo by adaptively selecting the trajectory length, eliminating the need to tune the number of leapfrog steps.
The implementation is in src/mcmc/algorithms/nuts.h and nuts.cpp, based on the NUTS algorithm described in Hoffman & Gelman (2014) with the generalized U-turn criterion from Betancourt (2017).
Leapfrog integration
Each elementary step of HMC is a leapfrog integrator that approximately preserves the Hamiltonian \(H(q, p) = -\log \pi(q) + \frac{1}{2} p^\top M^{-1} p\):
- Half-step momentum: \(p \leftarrow p + \frac{\epsilon}{2} \nabla \log \pi(q)\)
- Full-step position: \(q \leftarrow q + \epsilon\, M^{-1} p\)
- Half-step momentum: \(p \leftarrow p + \frac{\epsilon}{2} \nabla \log \pi(q)\)
The implementation uses a diagonal mass matrix \(M\), stored as the vector of inverse diagonal elements (inv_mass_diag).
Memoizer
The Memoizer class in leapfrog.h caches a single \((\log \pi, \nabla \log \pi)\) pair keyed by the position vector \(q\). During NUTS tree building, each leapfrog step produces a new unique \(q\), so the cache avoids redundant evaluations at tree-node boundaries where the same \(q\) appears as both the end of one subtree and the start of another.
When RATTLE is active, the SHAKE projection moves \(q\) to a new point on the constraint manifold, making the cached gradient stale. The Memoizer is explicitly invalidated after projection, forcing a fresh gradient evaluation at the projected position. The reversibility check in leapfrog_constrained_checked() uses a separate Memoizer instance for its backward verification step, so the caller’s cache remains pointed at the forward-step result.
Constrained leapfrog (RATTLE)
When a model has constraints (edge selection or a sparse graph in the mixed MRF), the standard leapfrog integrator cannot enforce zero entries for excluded edges. bgms uses a RATTLE-based constrained leapfrog scheme that projects position and momentum onto the constraint manifold at each step, with a runtime reversibility check that detects numerical failures.
The full algorithm, projection callbacks, reversibility check, and tolerance scaling are documented in Constrained Leapfrog (RATTLE).
Integrator dispatch
At each leaf of the NUTS tree (the base case of build_tree), the algorithm selects between two integrators:
- If the model provides projection callbacks (non-null
project_positionandproject_momentum), the constrained path runsleapfrog_constrained_checked()— the RATTLE integrator with reversibility verification. - Otherwise, the unconstrained path runs the standard Störmer–Verlet integrator with Memoizer gradient caching.
During sampling, a non-reversible RATTLE step terminates the NUTS tree (s_prime = 0), similar to a divergence. During warmup, non-reversible steps are recorded but do not terminate the tree.
Binary tree construction
NUTS builds a balanced binary tree by doubling the trajectory in a randomly chosen direction (forward or backward) at each depth level. The tree grows until one of these stopping conditions is met:
- U-turn — The trajectory starts bending back. Detected by checking whether the momentum sum across the subtree is no longer moving away from the endpoints (the generalized U-turn criterion from Betancourt, 2017).
- Divergence — The Hamiltonian error exceeds a threshold (default: \(\Delta H > 1000\)), indicating the integrator has entered a region of high curvature.
- Maximum depth — The tree reaches
max_tree_depth(default 10, corresponding to \(2^{10} = 1024\) leapfrog steps).
BuildTreeResult
Each subtree returns a BuildTreeResult struct containing:
| Field | Purpose |
|---|---|
theta_min, r_min |
Leftmost (backward) position and momentum |
theta_plus, r_plus |
Rightmost (forward) position and momentum |
theta_prime, r_prime |
Proposed sample from this subtree |
rho |
Running momentum sum (for U-turn check) |
p_sharp_beg, p_sharp_end |
Mass-preconditioned momentum \(M^{-1}p\) at subtree endpoints (used in the generalized U-turn criterion) |
p_beg, p_end |
Raw momentum at subtree endpoints (used in cross-subtree U-turn checks at merge time) |
n_prime |
Number of valid states (for multinomial sampling) |
s_prime |
Continuation flag (0 = stop) |
alpha, n_alpha |
Acceptance probability accumulator |
divergent |
Whether a divergence was detected |
non_reversible |
Whether a RATTLE reversibility check failed |
Proposal selection
Within a valid (non-U-turning) subtree, the proposal is selected via multinomial sampling weighted by the unnormalized target density. When two subtrees are merged, the proposed sample is drawn from the combined set with probability proportional to the number of valid states in each subtree.
Divergence detection
A divergence occurs when a single leapfrog step produces a Hamiltonian error \(\Delta H = H' - H_0 > 1000\). Divergences indicate that the step size is too large relative to the local curvature of the posterior. The number of divergent transitions is recorded in StepResult::NUTSDiagnostics and surfaced to the user via extract_nuts_diagnostics().
Warmup
NUTS adapts the step size \(\epsilon\) and the diagonal mass matrix \(M\) during a multi-stage warmup period before sampling begins. The three core stages (1, 2, 3a) follow Stan’s warmup scheme for NUTS. Stages 3b and 3c are bgms extensions that accommodate the spike-and-slab edge selection machinery.
The full warmup schedule — stage boundaries, budget allocation, adaptation algorithms, and warning system — is documented in Warmup Schedule.
NUTS diagnostics
Each NUTS step records four diagnostics:
- Tree depth — Number of doubling steps (0 to
max_tree_depth). Consistently hitting the maximum indicates the step size may be too small. - Divergent — Whether a divergence was detected in any leapfrog step during tree building.
- Non-reversible — Whether a constrained leapfrog step (RATTLE) failed the round-trip reversibility check. Only relevant for models that use the constrained integrator (e.g., edge selection with continuous variables). A non-reversible step terminates the NUTS tree, similar to a divergence.
- Energy — The Hamiltonian \(H = -\log \pi(q) + \frac{1}{2} p^\top M^{-1} p\). Used to compute E-BFMI (energy Bayesian fraction of missing information).
These diagnostics are collected per chain and accessible via extract_nuts_diagnostics() in R.