
Approximate distributions

When constructing a PosteriorEstimator, one must specify a parametric family of probability distributions used to approximate the posterior distribution. These families of distributions are implemented as subtypes of AbstractApproximateDistribution.

Distributions

NeuralEstimators.AbstractApproximateDistribution Type
julia
AbstractApproximateDistribution

An abstract supertype for approximate distributions used in conjunction with a PosteriorEstimator.

Subtypes A <: AbstractApproximateDistribution must implement the following methods:

  • _logdensity(q::A, θ::AbstractMatrix, t::AbstractMatrix)

    • Used during training and therefore must support automatic differentiation.

    • θ is a d × K matrix of parameter vectors.

    • t is a dstar × K matrix of learned summary statistics obtained by applying the neural network in the PosteriorEstimator to a collection of K data sets.

    • Should return a 1 × K matrix, where each entry is the log density log q(θₖ | tₖ) for the k-th data set evaluated at the k-th parameter vector θ[:, k].

  • sampleposterior(q::A, t::AbstractMatrix, N::Integer)

    • Used during inference and therefore does not need to be differentiable.

    • Should return a Vector of length K, where each element is a d × N matrix containing N samples from the approximate posterior q(θ | tₖ) for the k-th data set.
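The shapes required by the two methods can be sketched on a toy family. This is a standalone illustration that does not load NeuralEstimators: `ToyApproximateDistribution` and `ToyIsotropicGaussian` are hypothetical names, and the two functions are defined locally rather than extending the package's own methods. It assumes dstar = d, so that each column of t can serve directly as the mean of a unit-variance Gaussian.

```julia
using Random

# Hypothetical stand-in for AbstractApproximateDistribution, so that the sketch
# runs without the package installed.
abstract type ToyApproximateDistribution end

# Toy family: q(θ | t) = N(θ; t, I), which requires dstar == d.
struct ToyIsotropicGaussian <: ToyApproximateDistribution
    d::Int
end

# Log density at K parameter vectors: θ is d × K, t is d × K, result is 1 × K.
function _logdensity(q::ToyIsotropicGaussian, θ::AbstractMatrix, t::AbstractMatrix)
    sq = sum(abs2, θ .- t; dims = 1)          # 1 × K squared distances
    return -0.5 .* sq .- 0.5 * q.d * log(2π)
end

# N samples per data set: a Vector of length K holding d × N matrices.
function sampleposterior(q::ToyIsotropicGaussian, t::AbstractMatrix, N::Integer)
    return [t[:, k] .+ randn(q.d, N) for k in 1:size(t, 2)]
end

q = ToyIsotropicGaussian(2)
t = randn(2, 5)                               # K = 5 data sets
θ = randn(2, 5)
logq    = _logdensity(q, θ, t)                # 1 × 5 matrix
samples = sampleposterior(q, t, 100)          # 5 matrices, each 2 × 100
```

The log density uses only broadcasting and reductions, which keeps it friendly to automatic differentiation; the sampler is free to use random number generation because it is not differentiated.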

source
NeuralEstimators.Gaussian Type
julia
Gaussian <: AbstractApproximateDistribution
Gaussian(d::Integer, num_summaries::Integer; kwargs...)

A Gaussian distribution for amortised inference with a PosteriorEstimator, where d is the dimension of the parameter vector.

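The density of the distribution is:

$$
q(\boldsymbol{\theta}; \boldsymbol{\kappa}) = \mathcal{N}(\boldsymbol{\theta}; \boldsymbol{\mu}, \boldsymbol{\Sigma}),
$$

where the parameters $\boldsymbol{\kappa}$ comprise the mean vector $\boldsymbol{\mu}$ and the lower-triangular Cholesky factor $\boldsymbol{L}$ of the dense covariance matrix $\boldsymbol{\Sigma} = \boldsymbol{L}\boldsymbol{L}^\top$.

When using a Gaussian distribution as the approximate distribution of a PosteriorEstimator, the (learned) summary statistics are mapped to the distribution parameters using a multilayer perceptron (MLP) with appropriately chosen output activation functions (identity for $\boldsymbol{\mu}$ and the off-diagonal entries of $\boldsymbol{L}$, softplus for the diagonal entries of $\boldsymbol{L}$).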
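To make the role of the output activations concrete, here is a minimal sketch of turning an unconstrained output vector into a mean and a valid Cholesky factor, and then evaluating the Gaussian log density from that factor. The helpers `unpack_gaussian` and `gaussian_logdensity` are hypothetical, and the ordering of entries within the output vector is an assumption made for illustration, not necessarily the package's layout.

```julia
using LinearAlgebra

softplus(x) = log1p(exp(x))   # maps ℝ to (0, ∞); adequate for moderate x

# Split an unconstrained vector of length d + d(d+1)/2 into μ and lower-triangular L.
# Assumed layout (illustration only): mean first, then the lower triangle of L,
# column by column.
function unpack_gaussian(out::AbstractVector, d::Integer)
    μ = out[1:d]
    L = zeros(eltype(out), d, d)
    idx = d
    for j in 1:d, i in j:d
        idx += 1
        # softplus keeps the diagonal positive; off-diagonals use the identity
        L[i, j] = i == j ? softplus(out[idx]) : out[idx]
    end
    return μ, LowerTriangular(L)
end

# log N(θ; μ, LLᵀ) computed directly from the Cholesky factor
function gaussian_logdensity(θ::AbstractVector, μ::AbstractVector, L::LowerTriangular)
    z = L \ (θ - μ)                            # whitened residual
    return -0.5 * dot(z, z) - sum(log, diag(L)) - 0.5 * length(θ) * log(2π)
end

d = 3
out = randn(d + d * (d + 1) ÷ 2)               # pretend MLP output
μ, L = unpack_gaussian(out, d)
lp = gaussian_logdensity(μ, μ, L)              # log density at the mean
```

A positive diagonal guarantees that LLᵀ is positive definite, and the log determinant of the covariance reduces to twice the sum of the log diagonal entries of L.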

Keyword arguments

  • kwargs: additional keyword arguments passed to MLP.
source
NeuralEstimators.GaussianMixture Type
julia
GaussianMixture <: AbstractApproximateDistribution
GaussianMixture(d::Integer, num_summaries::Integer; num_components::Integer = 10, kwargs...)

A mixture of Gaussian distributions for amortised inference with a PosteriorEstimator, where d is the dimension of the parameter vector.

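The density of the distribution is:

$$
q(\boldsymbol{\theta}; \boldsymbol{\kappa}) = \sum_{j=1}^{J} \pi_j \, \mathcal{N}(\boldsymbol{\theta}; \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j),
$$

where $J$ is the number of components (num_components) and the parameters $\boldsymbol{\kappa}$ comprise the mixture weights $\pi_j \in [0, 1]$ subject to $\sum_{j=1}^{J} \pi_j = 1$, the mean vector $\boldsymbol{\mu}_j$ of each component, and the variance parameters of the diagonal covariance matrix $\boldsymbol{\Sigma}_j$ of each component.

When using a GaussianMixture as the approximate distribution of a PosteriorEstimator, the (learned) summary statistics are mapped to the mixture parameters using a multilayer perceptron (MLP) with appropriately chosen output activation functions (e.g., softmax for the mixture weights, softplus for the variance parameters).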
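The following sketch shows one way the log density of a diagonal-covariance Gaussian mixture can be evaluated stably once the weights and variances have been constrained. All function names here (`softmax`, `logsumexp`, `mixture_logdensity`) are defined locally for illustration and are not claimed to be the package's internals.

```julia
softplus(x) = log1p(exp(x))

# Numerically stable softmax for a vector
function softmax(x::AbstractVector)
    e = exp.(x .- maximum(x))
    return e ./ sum(e)
end

# Numerically stable log(sum(exp.(x)))
function logsumexp(x::AbstractVector)
    m = maximum(x)
    return m + log(sum(exp.(x .- m)))
end

# log q(θ) for a J-component mixture with diagonal covariances.
# w: J weights, M: d × J means, V: d × J variances
function mixture_logdensity(θ::AbstractVector, w::AbstractVector,
                            M::AbstractMatrix, V::AbstractMatrix)
    d, J = size(M)
    logcomp = map(1:J) do j
        r = θ .- M[:, j]
        -0.5 * sum(r .^ 2 ./ V[:, j]) - 0.5 * sum(log, V[:, j]) - 0.5 * d * log(2π)
    end
    return logsumexp(log.(w) .+ logcomp)
end

d, J = 2, 3
w = softmax(randn(J))            # weights in [0, 1] that sum to one
M = randn(d, J)                  # unconstrained means
V = softplus.(randn(d, J))       # strictly positive variances
lq = mixture_logdensity(zeros(d), w, M, V)
```

Working on the log scale and combining components with log-sum-exp avoids underflow when individual component densities are very small.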

Keyword arguments

  • num_components::Integer = 10: number of components in the mixture.

  • kwargs: additional keyword arguments passed to MLP.

source
NeuralEstimators.NormalisingFlow Type
julia
NormalisingFlow <: AbstractApproximateDistribution
NormalisingFlow(d::Integer, num_summaries::Integer; num_coupling_layer = 6, kwargs...)

A normalising flow for amortised inference with a PosteriorEstimator, where d is the dimension of the parameter vector and num_summaries is the dimension of the summary statistics for the data.

Normalising flows are diffeomorphisms (i.e., invertible, differentiable transformations with differentiable inverses) that map a simple base distribution (e.g., standard Gaussian) to a more complex target distribution (e.g., the posterior). They achieve this by applying a sequence of learned transformations, the forms of which are chosen to be invertible and allow for tractable density computation via the change of variables formula. This allows for efficient density evaluation during the training stage, and efficient sampling during the inference stage. For further details, see the reviews by Kobyzev et al. (2020) and Papamakarios et al. (2021).
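For reference, the change of variables formula can be written as follows. The notation is introduced here for illustration only: $f(\cdot\,; \boldsymbol{t})$ denotes the flow in its normalising direction (from the parameter $\boldsymbol{\theta}$ to the base variable $\boldsymbol{z}$), conditioned on the summary statistics $\boldsymbol{t}$, and $p_Z$ is the base density.

```latex
\log q(\boldsymbol{\theta} \mid \boldsymbol{t})
  = \log p_Z\big(f(\boldsymbol{\theta}; \boldsymbol{t})\big)
  + \log \left| \det \frac{\partial f(\boldsymbol{\theta}; \boldsymbol{t})}{\partial \boldsymbol{\theta}} \right|,
\qquad
\boldsymbol{\theta} = f^{-1}(\boldsymbol{z}; \boldsymbol{t}), \quad \boldsymbol{z} \sim p_Z .
```

The first expression is what is evaluated during training; the second describes sampling, which draws from the base distribution and applies the inverse map.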

NormalisingFlow uses affine coupling blocks (see AffineCouplingBlock), with optional activation normalisation (ActNorm; Kingma and Dhariwal, 2018) and permutations applied between each block via CouplingLayer. The base distribution is taken to be a standard multivariate Gaussian distribution.

When using a NormalisingFlow as the approximate distribution of a PosteriorEstimator, the (learned) summary statistics are used to condition the affine coupling blocks at each layer.

Note

To use NormalisingFlow with Enzyme.jl, set adtype = AutoEnzyme(mode = set_runtime_activity(Enzyme.Reverse)) in train.

Keyword arguments

  • num_coupling_layer = 6: number of coupling layers.

source

Methods

NeuralEstimators.numdistributionalparams Function
julia
numdistributionalparams(q::AbstractApproximateDistribution)
numdistributionalparams(estimator::PosteriorEstimator)

The number of distributional parameters (i.e., the dimension of the space of approximate-distribution parameters $\boldsymbol{\kappa}$).
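As a rough guide, the parameter counts implied by the Gaussian and Gaussian mixture parameterisations described in this section can be worked out by hand. The helper names below are hypothetical, and the counts are derived from those descriptions rather than taken from the package, so they may not match what numdistributionalparams returns.

```julia
# Mean vector (d entries) plus the lower triangle of the Cholesky factor.
gaussian_param_count(d::Integer) = d + d * (d + 1) ÷ 2

# J weights, J mean vectors of length d, and J variance vectors of length d.
mixture_param_count(d::Integer, J::Integer) = J + 2 * J * d

gaussian_param_count(3)      # 3 + 6 = 9
mixture_param_count(2, 10)   # 10 + 40 = 50
```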

source

Building blocks

NeuralEstimators.CouplingLayer Type
julia
CouplingLayer(d, num_summaries; use_act_norm = true, use_permutation = true, kwargs...)

A coupling layer used in a NormalisingFlow, combining two AffineCouplingBlocks with optional activation normalisation and permutation.

The layer splits its d-dimensional input into a lower half of dimension d₁ = div(d, 2) and an upper half of dimension d₂ = d - d₁. The two halves are then passed through a pair of affine coupling blocks in sequence: the first block transforms the lower half conditioned on the upper, and the second block transforms the upper half conditioned on the already-transformed lower half. This ensures every component is updated in a single forward pass, unlike a standard coupling layer where one half is left unchanged. When d = 1, the layer reduces to a single affine transformation of the one component conditioned on the summary statistics.
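The split and the order of the two updates can be sketched as follows. This is an illustration of the pattern only: `split_sizes`, `two_block_pass`, and the toy `shift` block are hypothetical, and the real blocks are affine coupling blocks rather than simple shifts.

```julia
# Split sizes from the description: lower half d₁, upper half d₂.
split_sizes(d::Integer) = (div(d, 2), d - div(d, 2))

# One pass of the two-block pattern with user-supplied block functions.
# block₁ updates the lower half given the upper half; block₂ then updates the
# upper half given the already-updated lower half. Both also see the summaries t.
function two_block_pass(θ::AbstractVector, t::AbstractVector, block₁, block₂)
    d₁, _ = split_sizes(length(θ))
    lower, upper = θ[1:d₁], θ[d₁+1:end]
    lower = block₁(lower, upper, t)
    upper = block₂(upper, lower, t)
    return vcat(lower, upper)
end

# Toy block: shift one half by the mean of the other half plus the sum of t.
shift(x, cond, t) = x .+ sum(cond) / length(cond) .+ sum(t)

θ = [1.0, 2.0, 3.0, 4.0, 5.0]
out = two_block_pass(θ, [0.0], shift, shift)   # every entry of θ has changed
```

With d = 5 the split is (2, 3); the sketch does not handle d = 1, where the lower half is empty and the layer is described as a single affine transformation.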

Optionally, activation normalisation (ActNorm) is applied before the coupling blocks, and a random Permutation is applied after.

The argument num_summaries is the dimension of the conditioning summary statistics (see PosteriorEstimator) and kwargs are passed to AffineCouplingBlock.

source
NeuralEstimators.AffineCouplingBlock Type
julia
AffineCouplingBlock(κ₁::MLP, κ₂::MLP)
AffineCouplingBlock(d₁::Integer, num_summaries::Integer, d₂; kwargs...)

An affine coupling block used in a NormalisingFlow.

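An affine coupling block splits its input $\boldsymbol{\theta}$ into two disjoint components, $\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$, with dimensions $d_1$ and $d_2$, respectively. The block then applies the following transformation:

$$
\begin{aligned}
\tilde{\boldsymbol{\theta}}_1 &= \boldsymbol{\theta}_1, \\
\tilde{\boldsymbol{\theta}}_2 &= \boldsymbol{\theta}_2 \odot \exp\{\boldsymbol{\kappa}_1(\tilde{\boldsymbol{\theta}}_1, \boldsymbol{t})\} + \boldsymbol{\kappa}_2(\tilde{\boldsymbol{\theta}}_1, \boldsymbol{t}),
\end{aligned}
$$

where $\boldsymbol{\kappa}_1(\cdot)$ and $\boldsymbol{\kappa}_2(\cdot)$ are generic, non-invertible multilayer perceptrons (MLPs) that are functions of both the (transformed) first input component $\tilde{\boldsymbol{\theta}}_1$ and the learned $d^*$-dimensional summary statistics $\boldsymbol{t}$ (see PosteriorEstimator).

To prevent numerical overflows and stabilise the training of the model, the scaling factors $\boldsymbol{\kappa}_1(\cdot)$ are clamped using the function

$$
f(\boldsymbol{s}) = \frac{2c}{\pi} \tan^{-1}\!\left(\frac{\boldsymbol{s}}{c}\right),
$$

where $c$ is a fixed clamping threshold. This transformation ensures that the scaling factors do not grow excessively large.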

Additional keyword arguments kwargs are passed to the MLP constructor when creating κ₁ and κ₂.
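A minimal sketch of an affine coupling step and its inverse is given below. It assumes that the second component is scaled by the exponential of a clamped log-scale and then shifted, and that the clamp is an arctan-based soft clamp; both are assumptions of this sketch. The MLPs are replaced by fixed linear maps, and `ToyCoupling`, `softclamp`, `forward`, and `inverse` are hypothetical names. The threshold value used is arbitrary.

```julia
# Soft clamp assumed in this sketch: maps ℝ smoothly into (-c, c).
softclamp(s, c) = (2c / π) * atan(s / c)

# Stand-ins for the two MLPs: fixed linear maps applied to [θ₁; t].
struct ToyCoupling
    W₁::Matrix{Float64}   # log-scale map, d₂ × (d₁ + dstar)
    W₂::Matrix{Float64}   # shift map,     d₂ × (d₁ + dstar)
    c::Float64            # clamping threshold (arbitrary value in this sketch)
end

# Forward pass: returns the transformed second component and log|det J|.
function forward(b::ToyCoupling, θ₁, θ₂, t)
    h = vcat(θ₁, t)
    s = softclamp.(b.W₁ * h, b.c)
    y₂ = θ₂ .* exp.(s) .+ b.W₂ * h
    return y₂, sum(s)      # the Jacobian is triangular with diagonal exp.(s)
end

# Inverse pass: recovers θ₂ without inverting the maps themselves.
function inverse(b::ToyCoupling, θ₁, y₂, t)
    h = vcat(θ₁, t)
    s = softclamp.(b.W₁ * h, b.c)
    return (y₂ .- b.W₂ * h) .* exp.(-s)
end

b = ToyCoupling(randn(2, 5), randn(2, 5), 2.0)
θ₁, θ₂, t = randn(2), randn(2), randn(3)
y₂, logdetJ = forward(b, θ₁, θ₂, t)
θ₂_rec = inverse(b, θ₁, y₂, t)     # matches θ₂ up to floating-point error
```

Because the first component passes through unchanged, the inverse can recompute the same scale and shift from it, which is why the networks themselves need not be invertible.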

source
NeuralEstimators.ActNorm Type
julia
ActNorm(d::Integer)

Activation normalisation layer (Kingma and Dhariwal, 2018) for an input of dimension d.

source
NeuralEstimators.Permutation Type
julia
Permutation(in::Integer)

A layer that permutes the inputs (of dimension in) entering a coupling block.

Note that a permutation layer is invertible with Jacobian determinant |J| = 1.
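The unit Jacobian determinant can be checked numerically by writing the permutation as a matrix. This sketch uses only the standard library and does not use the package's Permutation type.

```julia
using LinearAlgebra, Random

d = 5
p = randperm(d)                       # a random permutation of 1:d
P = Matrix{Float64}(I, d, d)[p, :]    # permutation matrix: P * x equals x[p]

x = randn(d)
y = x[p]                              # forward pass: reorder the entries
x_rec = y[invperm(p)]                 # inverse pass: undo the reordering

abs(det(P))                           # equals 1 up to floating-point error
```

Since the log of the absolute determinant is zero, a permutation contributes nothing to the log density in the change of variables formula.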

source