Approximate distributions
When constructing a PosteriorEstimator, one must choose an approximate distribution $q(\boldsymbol{\theta}; \boldsymbol{\kappa})$ for the posterior distribution $p(\boldsymbol{\theta} \mid \boldsymbol{Z})$. These distributions are implemented as subtypes of the abstract supertype ApproximateDistribution.
Distributions
NeuralEstimators.ApproximateDistribution — Type
ApproximateDistribution
An abstract supertype for approximate posterior distributions used in conjunction with a PosteriorEstimator.
Subtypes A <: ApproximateDistribution should provide methods logdensity(q::A, θ::AbstractMatrix, Z) and sampleposterior(q::A, Z, N::Integer).
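As a minimal sketch of this interface, the following defines a hypothetical subtype MyUniform whose approximate posterior is uniform on the unit hypercube regardless of the data (the type, its field, and the method bodies are invented for illustration; only the two required signatures come from the docstring above):

```julia
using NeuralEstimators

# Hypothetical subtype: uniform on [0, 1]^d, ignoring the data Z entirely
struct MyUniform <: ApproximateDistribution
    d::Int  # dimension of the parameter vector θ
end

# Log density of each d-dimensional parameter vector stored as a column of θ
function NeuralEstimators.logdensity(q::MyUniform, θ::AbstractMatrix, Z)
    # 0 inside the unit hypercube (density 1), -Inf outside
    [all(x -> 0 ≤ x ≤ 1, col) ? 0.0 : -Inf for col in eachcol(θ)]
end

# Draw N samples from the approximate posterior given data Z
function NeuralEstimators.sampleposterior(q::MyUniform, Z, N::Integer)
    rand(q.d, N)  # d × N matrix, one sample per column
end
```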
NeuralEstimators.GaussianDistribution — Type
GaussianDistribution <: ApproximateDistribution
GaussianDistribution(d::Integer)
A Gaussian distribution for amortised posterior inference, where d is the dimension of the parameter vector.
The approximate-distribution parameters $\boldsymbol{\kappa} = (\boldsymbol{\mu}', \textrm{vech}(\boldsymbol{L})')'$ consist of a $d$-dimensional mean parameter $\boldsymbol{\mu}$ and the $d(d+1)/2$ non-zero elements of the lower Cholesky factor $\boldsymbol{L}$ of a covariance matrix, where the half-vectorisation operator $\textrm{vech}(\cdot)$ vectorises the lower triangle of its matrix argument.
When using a GaussianDistribution as the approximate distribution of a PosteriorEstimator, the neural network of the PosteriorEstimator should be a mapping from the sample space to $\mathbb{R}^{|\mathcal{K}|}$, where $\mathcal{K}$ denotes the space of $\boldsymbol{\kappa}$. The dimension $|\mathcal{K}|$ can be determined from an object of type GaussianDistribution using numdistributionalparams(). Given the $|\mathcal{K}|$-dimensional real-valued outputs of the neural network, a valid covariance matrix is constructed internally using CovarianceMatrix.
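For concreteness, here is a sketch of the dimension bookkeeping (the Flux network is illustrative, and the PosteriorEstimator(q, network) call is an assumption about the constructor, not verbatim from this page):

```julia
using NeuralEstimators, Flux

d = 3                       # dimension of the parameter vector θ
q = GaussianDistribution(d)

# |K| = d + d(d+1)/2 = 3 + 6 = 9 distributional parameters κ
K = numdistributionalparams(q)

# The network must output |K| real values; here, a plain feed-forward
# network acting on hypothetical n-dimensional data vectors
n = 10
network = Chain(Dense(n, 64, relu), Dense(64, K))

# Assumed constructor: pair the approximate distribution with the network
estimator = PosteriorEstimator(q, network)
```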
NeuralEstimators.NormalisingFlow — Type
NormalisingFlow <: ApproximateDistribution
NormalisingFlow(d::Integer, dstar::Integer; num_coupling_layers::Integer = 6, kwargs...)
A normalising flow for amortised posterior inference (e.g., Ardizzone et al., 2019; Radev et al., 2022), where d is the dimension of the parameter vector and dstar is the dimension of the summary statistics for the data.
Normalising flows are diffeomorphisms (i.e., invertible, differentiable transformations with differentiable inverses) that map a simple base distribution (e.g., standard Gaussian) to a more complex target distribution (e.g., the posterior). They achieve this by applying a sequence of learned transformations, the forms of which are chosen to be invertible and allow for tractable density computation via the change of variables formula. This allows for efficient density evaluation during the training stage, and efficient sampling during the inference stage. For further details, see the reviews by Kobyzev et al. (2020) and Papamakarios et al. (2021).
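Concretely, writing the flow as a diffeomorphism $f$ that maps a base variable $\boldsymbol{u} \sim p_U$ to $\boldsymbol{\theta} = f(\boldsymbol{u})$, the change of variables formula gives the approximate posterior density as

\[q(\boldsymbol{\theta}; \boldsymbol{\kappa}) = p_U\{f^{-1}(\boldsymbol{\theta})\} \left|\det \frac{\partial f^{-1}(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}\right|,\]

so density evaluation requires only the inverse transformation and its Jacobian determinant, both of which are tractable by construction.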
NormalisingFlow uses affine coupling blocks (see AffineCouplingBlock), with activation normalisation (Kingma and Dhariwal, 2018) and permutations used between each block. The base distribution is taken to be a standard multivariate Gaussian distribution.
When using a NormalisingFlow as the approximate distribution of a PosteriorEstimator, the neural network of the PosteriorEstimator should be a mapping from the sample space to $\mathbb{R}^{d^*}$, where $d^*$ is an appropriate number of summary statistics for the given parameter vector (e.g., $d^* = d$). The summary statistics are then mapped to the parameters of the affine coupling blocks using conventional multilayer perceptrons (see AffineCouplingBlock).
Keyword arguments
num_coupling_layers::Integer = 6: number of coupling layers.
kwargs: additional keyword arguments passed to AffineCouplingBlock.
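A minimal construction sketch, using only the constructor signature shown above (the dimensions are illustrative):

```julia
using NeuralEstimators

d     = 2   # dimension of the parameter vector θ
dstar = 2   # dimension of the summary statistics T(Z)

# Flow with the default six affine coupling layers
q = NormalisingFlow(d, dstar)

# A shallower flow; further keyword arguments would be
# forwarded to AffineCouplingBlock
q = NormalisingFlow(d, dstar; num_coupling_layers = 4)
```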
Methods
NeuralEstimators.numdistributionalparams — Function
numdistributionalparams(q::ApproximateDistribution)
numdistributionalparams(estimator::PosteriorEstimator)
The number of distributional parameters (i.e., the dimension of the space $\mathcal{K}$ of approximate-distribution parameters $\boldsymbol{\kappa}$).
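As a quick check of the arithmetic for the Gaussian case described above (a minimal sketch; the commented values follow from the $d + d(d+1)/2$ formula):

```julia
using NeuralEstimators

numdistributionalparams(GaussianDistribution(2))  # 2 + 3  = 5
numdistributionalparams(GaussianDistribution(4))  # 4 + 10 = 14
```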
Building blocks
NeuralEstimators.AffineCouplingBlock — Type
AffineCouplingBlock(κ₁::MLP, κ₂::MLP)
AffineCouplingBlock(d₁::Integer, dstar::Integer, d₂; kwargs...)
An affine coupling block used in a NormalisingFlow.
An affine coupling block splits its input $\boldsymbol{\theta}$ into two disjoint components, $\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_2$, with dimensions $d_1$ and $d_2$, respectively. The block then applies the following transformation:
\[\begin{aligned} \tilde{\boldsymbol{\theta}}_1 &= \boldsymbol{\theta}_1,\\ \tilde{\boldsymbol{\theta}}_2 &= \boldsymbol{\theta}_2 \odot \exp\{\boldsymbol{\kappa}_{\boldsymbol{\gamma},1}(\tilde{\boldsymbol{\theta}}_1, \boldsymbol{T}(\boldsymbol{Z}))\} + \boldsymbol{\kappa}_{\boldsymbol{\gamma},2}(\tilde{\boldsymbol{\theta}}_1, \boldsymbol{T}(\boldsymbol{Z})), \end{aligned}\]
where $\boldsymbol{\kappa}_{\boldsymbol{\gamma},1}(\cdot)$ and $\boldsymbol{\kappa}_{\boldsymbol{\gamma},2}(\cdot)$ are generic, non-invertible multilayer perceptrons (MLPs) that are functions of both the (transformed) first input component $\tilde{\boldsymbol{\theta}}_1$ and the learned $d^*$-dimensional summary statistics $\boldsymbol{T}(\boldsymbol{Z})$ (see PosteriorEstimator).
Additional keyword arguments kwargs are passed to the MLP constructor when creating κ₁ and κ₂.
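To make the transformation concrete, here is a standalone sketch of the forward pass of one affine coupling block, with plain functions s and t standing in for the MLPs $\boldsymbol{\kappa}_{\boldsymbol{\gamma},1}(\cdot)$ and $\boldsymbol{\kappa}_{\boldsymbol{\gamma},2}(\cdot)$ (an illustration of the formula above, not the package's implementation):

```julia
# Forward pass of one affine coupling block (illustrative, not the package code).
# θ₁ passes through unchanged; θ₂ is scaled and shifted by quantities that
# depend on θ₁ and the summary statistics TZ = T(Z).
function affine_coupling_forward(θ₁, θ₂, TZ, s, t)
    logscale = s(vcat(θ₁, TZ))           # log-scales for θ₂ (length d₂)
    shift    = t(vcat(θ₁, TZ))           # shifts for θ₂ (length d₂)
    y₁ = θ₁                              # identity on the first component
    y₂ = θ₂ .* exp.(logscale) .+ shift   # elementwise affine transformation
    logdetJ = sum(logscale)              # log|det Jacobian| of the block
    return y₁, y₂, logdetJ
end

# The affine map is trivially inverted, which is what makes sampling cheap:
invert(y₂, logscale, shift) = (y₂ .- shift) .* exp.(-logscale)
```

Because the Jacobian of the block is triangular, its log determinant is simply the sum of the log-scales, which is what makes density evaluation via the change of variables formula efficient.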