romtools.workflows.inverse.vi_drivers

Variational inference drivers for Gaussian posterior approximations.

This module provides derivative-free variational inference (VI) routines for black-box forward models. An approximate posterior is represented by a Gaussian variational family, and the optimizer updates the variational parameters to maximize the evidence lower bound (ELBO).

Theory

run_vi solves a Gaussian variational inference problem of the form

\[\max_{\zeta}\; \mathbb{E}_{\theta \sim q(\theta;\zeta)} \left[ \log p(y,\theta) - \log q(\theta;\zeta) \right],\]

where \(q(\theta;\zeta)\) is the variational search distribution and \(\zeta\) denotes its parameters. In this implementation, the variational family is Gaussian in either diagonal form or fixed-correlation multivariate form, with the optimizer state stored as the variational mean and log standard deviation.

For black-box forward models, the code uses the score-function estimator (also known as REINFORCE or the log-derivative trick) instead of reparameterization gradients through the PDE model:

\[\nabla_{\zeta} \mathbb{E}_{\theta \sim q(\theta;\zeta)} \left[\mathcal{L}(\theta;\zeta)\right] = \mathbb{E}_{\theta \sim q(\theta;\zeta)} \left[ \mathcal{L}(\theta;\zeta)\, \nabla_{\zeta}\log q(\theta;\zeta) \right],\]

with

\[\mathcal{L}(\theta;\zeta) = \log p(y,\theta) - \log q(\theta;\zeta).\]

The expectation is approximated with Monte Carlo or randomized quasi-Monte Carlo samples from the current variational distribution. Because only \(\nabla_{\zeta}\log q\) is needed, the forward model itself is treated as derivative-free.
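As a rough illustration of the estimator above, the following NumPy sketch computes a score-function gradient for a diagonal Gaussian parameterized by its mean and log standard deviation. The function names (log_q, score, score_function_gradient) and the log_joint callable are hypothetical and not part of romtools; plain Monte Carlo sampling is assumed.

```python
import numpy as np


def log_q(theta, mu, log_sigma):
    """Log-density of a diagonal Gaussian, summed over parameter dimensions."""
    z = (theta - mu) / np.exp(log_sigma)
    return np.sum(-0.5 * z**2 - log_sigma - 0.5 * np.log(2.0 * np.pi), axis=-1)


def score(theta, mu, log_sigma):
    """Gradient of log q with respect to (mu, log_sigma); shape (N, 2 * d)."""
    sigma = np.exp(log_sigma)
    z = (theta - mu) / sigma
    # d/d mu = z / sigma, d/d log_sigma = z**2 - 1
    return np.concatenate([z / sigma, z**2 - 1.0], axis=-1)


def score_function_gradient(log_joint, mu, log_sigma, n_samples, rng):
    """Monte Carlo estimate of the ELBO gradient and of the ELBO itself.

    log_joint is a hypothetical callable mapping samples of shape (N, d)
    to log p(y, theta) values of shape (N,). No derivative of it is used.
    """
    noise = rng.standard_normal((n_samples, mu.size))
    theta = mu + np.exp(log_sigma) * noise
    elbo_terms = log_joint(theta) - log_q(theta, mu, log_sigma)
    grad = np.mean(elbo_terms[:, None] * score(theta, mu, log_sigma), axis=0)
    return grad, elbo_terms.mean()
```

The returned gradient is ordered as (mean components, log standard deviation components). Only evaluations of log_joint are required, which is what keeps the forward model derivative-free.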

When optimizer_method="newton", the routine also forms a second-order score-function estimator for curvature:

\[\nabla_{\zeta}^2 \mathbb{E}_{\theta \sim q(\theta;\zeta)} \left[\mathcal{L}(\theta;\zeta)\right] = \mathbb{E}_{\theta \sim q(\theta;\zeta)} \left[ \mathcal{L}(\theta;\zeta) \left( \nabla_{\zeta}\log q\,\nabla_{\zeta}\log q^{\top} + \nabla_{\zeta}^2 \log q \right) \right].\]

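This is the curvature model used by the Newton update path. As written, it differentiates only the sampling density and holds the integrand \(\mathcal{L}\) fixed in \(\zeta\); differentiating the \(-\log q(\theta;\zeta)\) term inside \(\mathcal{L}\) as well would add \(-\mathbb{E}_{\theta \sim q(\theta;\zeta)}\left[\nabla_{\zeta}\log q\,\nabla_{\zeta}\log q^{\top}\right]\) to the right-hand side. The implementation supports a metric rescaling via newton_metric and either diagonal or full projected Hessian solves via newton_hessian_type.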
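A minimal sketch of the curvature expression, assuming a one-dimensional Gaussian with parameters ordered as (mean, log standard deviation). It evaluates the right-hand side exactly as written, using the sampled values of \(\mathcal{L}\) as weights. The function name second_order_score_estimate is hypothetical, and the metric rescaling and projected solves mentioned above are not modeled here.

```python
import numpy as np


def second_order_score_estimate(elbo_terms, theta, mu, log_sigma):
    """Sample average of L * (s s^T + Hess log q) for a 1-D Gaussian q.

    elbo_terms and theta are arrays of shape (N,); mu and log_sigma are
    floats. Returns a (2, 2) matrix in the ordering (mu, log_sigma).
    """
    sigma = np.exp(log_sigma)
    z = (theta - mu) / sigma

    # Score vector s = grad log q for each sample, shape (N, 2).
    s = np.stack([z / sigma, z**2 - 1.0], axis=1)
    outer = s[:, :, None] * s[:, None, :]

    # Analytic Hessian of log q for each sample, shape (N, 2, 2).
    hess = np.empty_like(outer)
    hess[:, 0, 0] = -1.0 / sigma**2
    hess[:, 0, 1] = hess[:, 1, 0] = -2.0 * z / sigma
    hess[:, 1, 1] = -2.0 * z**2

    return np.mean(elbo_terms[:, None, None] * (outer + hess), axis=0)
```

Because the summand contains high-order polynomials of the samples, this estimator tends to be much noisier than the gradient estimator at the same sample size.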

Variance Reduction

The gradient estimator optionally supports a baseline through baseline_method. In particular, the leave-one-out option subtracts from each sample a baseline computed from the other samples only. Because that baseline is independent of the sample's own score term, the estimator stays unbiased while its Monte Carlo variance is reduced:

\[\widehat g_{\mathrm{LOO}} = \frac{1}{N}\sum_{i=1}^{N} \left( \mathcal{L}(\theta^{(i)};\zeta) - b^{(-i)} \right) \nabla_{\zeta}\log q(\theta^{(i)};\zeta),\]

where \(b^{(-i)}\) is the mean of \(\mathcal{L}(\theta^{(j)};\zeta)\) over all samples \(j \neq i\).
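The leave-one-out formula can be sketched in a few lines of NumPy. The function name loo_gradient is hypothetical; it assumes the per-sample ELBO terms and score vectors have already been computed.

```python
import numpy as np


def loo_gradient(elbo_terms, scores):
    """Leave-one-out baseline gradient estimate.

    elbo_terms has shape (N,) and scores has shape (N, P), where P is the
    number of variational parameters. Requires N >= 2.
    """
    n = elbo_terms.shape[0]
    # b^(-i): mean of the ELBO terms over every sample except sample i.
    baselines = (elbo_terms.sum() - elbo_terms) / (n - 1)
    centered = elbo_terms - baselines
    return np.mean(centered[:, None] * scores, axis=0)
```

A side effect of this construction is that adding a constant to every ELBO term leaves the estimate unchanged, so unknown normalizing constants in the log-densities do not inflate the variance.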

Gaussian Structure

For Gaussian priors and Gaussian variational families, the joint log-density in the ELBO integrand decomposes into

\[\log p(y,\theta) = \log p(y\mid\theta) + \log p_0(\theta),\]

where \(p_0(\theta)\) is the prior density. The forward-model dependence therefore enters only through the log-likelihood term, while the prior and variational-density terms are handled analytically. This is why the routine can remain derivative-free with respect to the forward model while still using gradient- and Hessian-based updates in the variational parameters.
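The decomposition can be sketched as follows, assuming a Gaussian likelihood with the given observation covariance and an independent (diagonal) Gaussian prior. All function names and the forward_model callable are hypothetical stand-ins, not the romtools API.

```python
import numpy as np


def gaussian_log_likelihood(qoi, observations, observations_covariance):
    """log p(y | theta) for y ~ N(qoi, observations_covariance)."""
    residual = observations - qoi
    solved = np.linalg.solve(observations_covariance, residual)
    _, logdet = np.linalg.slogdet(observations_covariance)
    k = residual.size
    return -0.5 * (residual @ solved + logdet + k * np.log(2.0 * np.pi))


def gaussian_log_prior(theta, prior_mean, prior_std):
    """log p_0(theta) for an independent Gaussian prior."""
    z = (theta - prior_mean) / prior_std
    return np.sum(-0.5 * z**2 - np.log(prior_std) - 0.5 * np.log(2.0 * np.pi))


def log_joint(theta, forward_model, observations, observations_covariance,
              prior_mean, prior_std):
    """log p(y, theta) = log p(y | theta) + log p_0(theta)."""
    # The forward model is only evaluated, never differentiated.
    qoi = forward_model(theta)
    return (gaussian_log_likelihood(qoi, observations, observations_covariance)
            + gaussian_log_prior(theta, prior_mean, prior_std))
```

Only the first term calls the forward model; the prior term is a closed-form expression in the sampled parameters.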

Functions

run_vi(model, prior_parameter_space, ...[, ...])

Run Gaussian variational inference with score-function gradients.

romtools.workflows.inverse.vi_drivers.run_vi(model, prior_parameter_space, observations, observations_covariance, parameter_mins=None, parameter_maxes=None, initial_variational_parameter_space=None, restart_file=None, optimizer_method='gradient', optimizer_config=None, line_search_method='stochastic_nonmonotone', line_search_config=None, absolute_vi_directory='/home/runner/work/rom-tools-and-workflows/rom-tools-and-workflows/work/', sample_size=30, random_seed=1, sampling_method='mc', evaluation_concurrency=1, covariance_regularization=1e-08, restart_files_to_keep=10, elbo_scaling_factor='diag_mean', elbo_relative_tolerance=None, baseline_method=None, bounded_parameter_handling='transform', transform_interior_margin=1e-06, transform_map='sigmoid', min_physical_variational_std_fraction=1e-06)

Run Gaussian variational inference with score-function gradients.

This routine approximates the posterior with a Gaussian variational family (diagonal or fixed-correlation multivariate) and updates its mean and log standard deviation using the REINFORCE (score-function) gradient estimator. An optional baseline can be enabled to reduce the variance of the gradient estimator.

Parameters:
  • model (QoiModel) – QoiModel to evaluate at sampled parameters.

  • prior_parameter_space – Either GaussianParameterSpace (diagonal VI) or MultivariateGaussianParameterSpace (multivariate VI). Defines the Bayesian prior in physical parameter space.

  • observations (ndarray) – Observed QoI vector.

  • observations_covariance (ndarray) – Observation covariance matrix.

  • parameter_mins (ndarray) – Optional lower bounds on parameters.

  • parameter_maxes (ndarray) – Optional upper bounds on parameters.

  • initial_variational_parameter_space – Optional Gaussian initializer for the variational state in physical parameter space. Defaults to the prior moments.

  • restart_file (str) – Optional restart file path. Restart files written by this routine store variational_mean in physical coordinates.

  • optimizer_method (str) – Optimizer used for variational updates. Supported options are 'gradient' and 'newton'.

  • optimizer_config – Method-specific optimizer config. Expected types are VIGradientOptimizerConfig for optimizer_method='gradient' and VINewtonOptimizerConfig for optimizer_method='newton'.

  • line_search_method (str) – Line-search acceptance strategy. Supported options are 'legacy' and 'stochastic_nonmonotone'. Defaults to 'stochastic_nonmonotone'.

  • line_search_config – Method-specific line-search config. Expected types are VILegacyLineSearchConfig for line_search_method='legacy' and VIStochasticNonmonotoneLineSearchConfig for line_search_method='stochastic_nonmonotone'.

  • absolute_vi_directory (str) – Absolute path to the working directory for runs.

  • sample_size (int) – Number of samples drawn from the variational distribution per iteration (MC or RQMC, depending on sampling_method).

  • random_seed (int) – RNG seed for reproducibility.

  • sampling_method (str) – Sampling method for variational draws. Supported options are 'mc' (Monte Carlo) and 'rqmc' (randomized quasi-Monte Carlo).

  • evaluation_concurrency – Number of model evaluations to run concurrently within each iteration.

  • covariance_regularization (float) – Diagonal regularization for covariance inversion.

  • restart_files_to_keep (int) – Number of most-recent restart files to retain under absolute_vi_directory. Older restart files are removed.

  • elbo_scaling_factor – Positive scalar that multiplies the ELBO objective, or a string mode. Supported string modes are 'diag_mean' (or 'auto'), which uses mean(diag(observations_covariance)), and 'diag_trace', which uses sum(diag(observations_covariance)).

  • elbo_relative_tolerance (float) – Optional non-negative tolerance on the current ELBO relative to the ELBO at the initial variational guess. When set, VI stops once elbo_current / (elbo_initial + 1e-16) is less than or equal to this tolerance.

  • baseline_method (str) – Baseline used in REINFORCE gradient estimation. Supported options are 'none', 'loo', and 'optimal'.

  • bounded_parameter_handling (str) – Parameter bounds handling. Supported options are 'clip' and 'transform'.

  • transform_interior_margin (float) – Margin used by bounded_parameter_handling='transform' to keep mapped samples away from exact bounds.

  • transform_map (str) – Transform used by bounded_parameter_handling='transform'. Supported options are 'sigmoid' and 'arctan'.

  • min_physical_variational_std_fraction (float) – Minimum physical-space variational standard deviation as a fraction of each parameter range when bounded_parameter_handling='transform'.

Returns:

Tuple of (variational_mean, variational_std, parameter_samples, qois).
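
Two of the parameter descriptions above reduce to simple arithmetic, sketched below under stated assumptions. The helper names resolve_elbo_scaling_factor and relative_elbo_converged are hypothetical and only mirror the documented string modes and stopping ratio; they are not the routine's internal code.

```python
import numpy as np


def resolve_elbo_scaling_factor(elbo_scaling_factor, observations_covariance):
    """Turn a scalar or string mode into a positive scaling factor."""
    diag = np.diag(np.asarray(observations_covariance, dtype=float))
    if isinstance(elbo_scaling_factor, str):
        if elbo_scaling_factor in ("diag_mean", "auto"):
            return float(diag.mean())
        if elbo_scaling_factor == "diag_trace":
            return float(diag.sum())
        raise ValueError(f"Unknown scaling mode: {elbo_scaling_factor!r}")
    factor = float(elbo_scaling_factor)
    if factor <= 0.0:
        raise ValueError("elbo_scaling_factor must be positive")
    return factor


def relative_elbo_converged(elbo_current, elbo_initial, elbo_relative_tolerance):
    """Documented stopping test; always False when no tolerance is set."""
    if elbo_relative_tolerance is None:
        return False
    ratio = elbo_current / (elbo_initial + 1e-16)
    return ratio <= elbo_relative_tolerance
```

For example, with an observation covariance whose diagonal is (1, 3), the 'diag_mean' mode yields 2 and the 'diag_trace' mode yields 4.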