Estimators
The package provides several classes of neural estimator, organised within a type hierarchy rooted at the abstract supertype NeuralEstimator.
NeuralEstimators.NeuralEstimator Type
NeuralEstimator
An abstract supertype for all neural estimators.
Posterior estimators
PosteriorEstimator approximates the posterior distribution, using a flexible, parametric family of distributions (see Approximate distributions).
NeuralEstimators.PosteriorEstimator Type
PosteriorEstimator <: NeuralEstimator
PosteriorEstimator(summary_network, q::ApproximateDistribution)
PosteriorEstimator(summary_network, num_parameters::Integer; num_summaries::Integer, q = NormalisingFlow, kwargs...)
A neural estimator that approximates the posterior distribution, based on a summary_network and an approximate distribution q (see the available in-built Approximate distributions).
The summary_network maps data to a vector of summary statistics, which are used to condition q. The precise way in which the summary statistics condition q depends on the choice of approximate distribution: for example, GaussianMixture uses an MLP to map the summary statistics to its mixture parameters, while NormalisingFlow uses them as conditioning variables in its transformations.
The convenience constructor builds q internally given num_parameters and num_summaries, with any additional keyword arguments passed to the constructor of q.
Keyword arguments
num_summaries::Integer: the number of summary statistics output by summary_network. Must match the output dimension of summary_network.
q::Type{<:ApproximateDistribution} = NormalisingFlow: the type of approximate distribution to use.
kwargs...: additional keyword arguments passed to the constructor of q.
Examples
using NeuralEstimators, Flux, CairoMakie
# Data Z|μ,σ ~ N(μ, σ²) with priors μ ~ N(0, 1) and σ ~ U(0, 1)
d, m = 2, 100 # dimension of θ and number of replicates
sampler(K) = NamedMatrix(μ = randn(K), σ = rand(K))
simulator(θ::AbstractVector) = θ["μ"] .+ θ["σ"] .* sort(randn(m))
simulator(θ::AbstractMatrix) = reduce(hcat, map(simulator, eachcol(θ)))
# Neural network
num_summaries = 3d
summary_network = Chain(Dense(m, 64, gelu), Dense(64, 64, gelu), Dense(64, num_summaries))
# Initialise the estimator, with q built internally
estimator = PosteriorEstimator(summary_network, d; num_summaries = num_summaries)
# Or, build q explicitly
q = NormalisingFlow(d; num_summaries = num_summaries)
estimator = PosteriorEstimator(summary_network, q)
# Train the estimator
estimator = train(estimator, sampler, simulator, K = 3000)
# Plot the risk history
plotrisk()
# Assess the estimator
θ_test = sampler(250)
Z_test = simulator(θ_test);
assessment = assess(estimator, θ_test, Z_test)
plot(assessment)
# Inference with observed data
θ = sampler(1)
Z = simulator(θ)
sampleposterior(estimator, Z) # posterior draws
posteriormean(estimator, Z) # point estimate
Ratio estimators
RatioEstimator approximates the likelihood-to-evidence ratio, enabling both frequentist and Bayesian inference through various downstream algorithms.
NeuralEstimators.RatioEstimator Type
RatioEstimator <: NeuralEstimator
RatioEstimator(summary_network, num_parameters; num_summaries, kwargs...)
A neural estimator that estimates the likelihood-to-evidence ratio r(Z, θ) ≡ p(Z ∣ θ)/p(Z), where p(Z ∣ θ) is the likelihood and p(Z) is the marginal likelihood (evidence).
The estimator summarises the data using summary_network and, separately, summarises the parameters using an internally constructed network that outputs 2 * num_parameters summaries by default.
For numerical stability, training is done on the log-scale using the relation log r(Z, θ) = logit(c(Z, θ)), where c(Z, θ) = r(Z, θ)/(1 + r(Z, θ)) is the output of the underlying binary classifier.
Given data Z and parameters θ, the estimated ratio can be obtained using logratio and can be used in various Bayesian (e.g., Hermans et al., 2020) or frequentist (e.g., Walchessen et al., 2024) inferential algorithms. For Bayesian inference, posterior samples can be obtained via simple grid-based sampling using sampleposterior.
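The log-scale relation between the classifier output and the ratio can be illustrated with a small base-Julia sketch (the numbers below are made up for illustration; this is not the package's internal implementation):

```julia
# A binary classifier trained to discriminate joint samples (Z, θ) from
# independent samples outputs c(Z, θ) = r / (1 + r), so the log-ratio is
# recovered stably via the logit transform: log r = logit(c).
logit(p) = log(p / (1 - p))

r = 3.7            # a hypothetical likelihood-to-evidence ratio
c = r / (1 + r)    # corresponding classifier output
logit(c)           # ≈ log(3.7), the log-ratio on a stable scale
```

This identity is why training and evaluation can proceed entirely on the log-scale without ever exponentiating potentially extreme ratios.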
Keyword arguments
num_summaries::Integer: the number of summaries output by summary_network. Must match the output dimension of summary_network.
num_summaries_θ::Integer = 2 * num_parameters: the number of summaries output by the parameter summary network.
summary_network_θ_kwargs::NamedTuple = (;): keyword arguments passed to the MLP constructor for the parameter summary network.
kwargs...: additional keyword arguments passed to the MLP constructor for the inference network.
Examples
using NeuralEstimators, Flux, CairoMakie
# Data Z|μ,σ ~ N(μ, σ²) with priors μ ~ U(0, 1) and σ ~ U(0, 1)
d, m = 2, 100 # dimension of θ and number of replicates
sampler(K) = NamedMatrix(μ = rand(K), σ = rand(K))
simulator(θ::AbstractVector) = θ["μ"] .+ θ["σ"] .* sort(randn(m))
simulator(θ::AbstractMatrix) = reduce(hcat, map(simulator, eachcol(θ)))
# Neural network
num_summaries = 3d
summary_network = Chain(Dense(m, 64, gelu), Dense(64, 64, gelu), Dense(64, num_summaries))
# Initialise the estimator
estimator = RatioEstimator(summary_network, d; num_summaries = num_summaries)
# Train the estimator
estimator = train(estimator, sampler, simulator, K = 1000)
# Plot the risk history
plotrisk()
# Assess the estimator
θ_test = sampler(250)
Z_test = simulator(θ_test);
θ_grid = expandgrid(0:0.01:1, 0:0.01:1)' # fine gridding of the parameter space
assessment = assess(estimator, θ_test, Z_test; θ_grid = θ_grid)
plot(assessment)
# Generate "observed" data
θ = sampler(1)
z = simulator(θ)
# Grid-based optimization and sampling
logratio(estimator, z; θ_grid = θ_grid) # log of likelihood-to-evidence ratios
posteriormode(estimator, z; θ_grid = θ_grid) # posterior mode
sampleposterior(estimator, z; θ_grid = θ_grid) # posterior samples
Bayes estimators
Neural Bayes estimators are implemented as subtypes of BayesEstimator. The general-purpose PointEstimator supports user-defined loss functions (see Loss functions). The types IntervalEstimator and its generalisation QuantileEstimator are designed for posterior quantile estimation based on user-specified probability levels, automatically configuring the quantile loss and enforcing non-crossing constraints.
NeuralEstimators.BayesEstimator Type
BayesEstimator <: NeuralEstimator
An abstract supertype for neural Bayes estimators.
NeuralEstimators.PointEstimator Type
PointEstimator <: BayesEstimator
PointEstimator(network)
PointEstimator(summary_network, inference_network)
PointEstimator(summary_network, num_parameters; num_summaries, kwargs...)
A neural point estimator mapping data to a point summary of the posterior distribution.
The neural network can be provided in two ways:
- As a single network that maps data directly to the parameter space.
- As a summary_network that maps data to a vector of summary statistics, with the inference_network constructed internally based on num_parameters and num_summaries.
Examples
using NeuralEstimators, Flux, CairoMakie
# Data Z|μ,σ ~ N(μ, σ²) with priors μ ~ N(0, 1) and σ ~ U(0, 1)
d, m = 2, 100 # dimension of θ and number of replicates
sampler(K) = NamedMatrix(μ = randn(K), σ = rand(K))
simulator(θ::AbstractVector) = θ["μ"] .+ θ["σ"] .* sort(randn(m))
simulator(θ::AbstractMatrix) = reduce(hcat, map(simulator, eachcol(θ)))
# Neural network, an MLP mapping m inputs into d outputs
network = Chain(Dense(m, 64, gelu), Dense(64, 64, gelu), Dense(64, d))
# Initialise a neural point estimator
estimator = PointEstimator(network)
# Train the estimator
estimator = train(estimator, sampler, simulator)
# Plot the risk history
plotrisk()
# Assess the estimator
θ_test = sampler(1000)
Z_test = simulator(θ_test)
assessment = assess(estimator, θ_test, Z_test)
bias(assessment)
rmse(assessment)
plot(assessment)
# Apply to observed data (here, simulated as a stand-in)
θ = sampler(1) # ground truth (not known in practice)
Z = simulator(θ) # stand-in for real observations
estimate(estimator, Z) # point estimate
NeuralEstimators.IntervalEstimator Type
IntervalEstimator <: BayesEstimator
IntervalEstimator(u, v = u, c::Union{Function, Compress} = identity; probs = [0.025, 0.975], g = exp)
IntervalEstimator(u, c::Union{Function, Compress}; probs = [0.025, 0.975], g = exp)
A neural estimator that jointly estimates marginal posterior credible intervals based on the probability levels probs (by default, 95% central credible intervals).
The estimator employs a representation that prevents quantile crossing. Specifically, given data Z, the network u estimates the lower bound of each marginal interval, and the corresponding upper bound is obtained by adding a strictly positive increment, computed by applying g (by default, the exponential function) elementwise to the output of v.
The function c may be a regular function or a Compress object, which can constrain the interval estimator's output to the prior support. If c is unspecified, it will be set to the identity function so that the range of the intervals will be unrestricted. If only a single neural-network architecture is provided, it will be used for both u and v.
The return value when applied to data using estimate() is a matrix with 2d rows, where d is the number of parameters: the first d rows contain the lower bounds and the remaining d rows the upper bounds. The function interval() can be used to format this output in a readable way.
See also QuantileEstimator.
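The non-crossing construction can be sketched in a few lines of base Julia (the network outputs below are made-up numbers, not outputs of the package's networks):

```julia
# Upper bound = lower bound + g(v), with the default g = exp: because
# exp is strictly positive, the upper bound always exceeds the lower
# bound, whatever real values the second network produces.
u_out = [-0.3, 1.2]    # hypothetical lower-bound outputs, one per parameter
v_out = [-5.0, 0.4]    # hypothetical outputs of the second network (any reals)

lower = u_out
upper = u_out .+ exp.(v_out)
all(upper .> lower)    # non-crossing holds by construction
</imports>
```

The choice g = exp is simply a convenient bijection from the reals to the positive reals; any such positive link function would enforce the same constraint.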
Examples
using NeuralEstimators, Flux, CairoMakie
# Data Z|μ,σ ~ N(μ, σ²) with priors μ ~ U(0, 1) and σ ~ U(0, 1)
d = 2 # dimension of the parameter vector θ
n = 1 # dimension of each independent replicate of Z
m = 100 # number of independent replicates
sample(K) = rand32(d, K)
simulate(θ, m) = [ϑ[1] .+ ϑ[2] .* randn(n, m) for ϑ in eachcol(θ)]
# Neural network
w = 128 # width of each hidden layer
ψ = Chain(Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, d))
u = DeepSet(ψ, ϕ)
# Initialise the estimator
estimator = IntervalEstimator(u)
# Train the estimator
estimator = train(estimator, sample, simulate, simulator_args = m, K = 3000)
# Plot the risk history
plotrisk()
# Assess the estimator
θ_test = sample(500)
Z_test = simulate(θ_test, m);
assessment = assess(estimator, θ_test, Z_test)
plot(assessment)
# Inference with "observed" data
θ = [0.8f0; 0.1f0]
Z = simulate(θ, m)
estimate(estimator, Z)
interval(estimator, Z)
NeuralEstimators.QuantileEstimator Type
QuantileEstimator <: BayesEstimator
QuantileEstimator(v; probs = [0.025, 0.5, 0.975], g = Flux.softplus, i = nothing)
A neural estimator that jointly estimates a fixed set of marginal posterior quantiles, with probability levels probs. This generalises IntervalEstimator to support an arbitrary number of probability levels.
Given data Z, by default the estimator approximates quantiles of the marginal posterior distribution of each parameter. Alternatively, with i set to a positive integer, the estimator approximates quantiles of the full conditional distribution of θᵢ given the data and the remaining parameters, where θ₋ᵢ denotes the parameter vector with its i-th element removed.
The estimator employs a representation that prevents quantile crossing: the quantile at the lowest probability level is modelled directly, and each subsequent quantile is obtained by adding a strictly positive increment, computed by applying g (by default, the softplus function) elementwise. If g = nothing, the quantiles are estimated independently and non-crossing is not enforced.
When the neural networks are DeepSet objects, two requirements must be met. First, the number of input neurons in the first layer of the outer network must equal the number of neurons in the final layer of the inner network plus the dimension of θ₋ᵢ (that is, d − 1 if i is specified, and zero otherwise). Second, the number of output neurons in the final layer of the outer network must equal d − 1 if i is specified, and d otherwise.
The return value is a matrix with one row for each combination of parameter and probability level.
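A minimal base-Julia sketch of the monotone representation (illustrative only; the raw values below stand in for network outputs and are not the package's internals):

```julia
# The first quantile is taken as-is and each subsequent quantile adds a
# softplus-transformed increment, so the estimated quantiles are
# non-decreasing in the probability level, regardless of the raw outputs.
softplus(x) = log1p(exp(x))

raw = [0.2, -1.5, 0.7]   # hypothetical raw outputs for three probability levels
quantiles = accumulate(+, [raw[1]; softplus.(raw[2:end])])
issorted(quantiles)      # non-crossing holds by construction
```

Setting g = nothing removes the cumulative structure, trading the non-crossing guarantee for unconstrained, independent estimates.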
Examples
using NeuralEstimators, Flux
# Data Z|μ,σ ~ N(μ, σ²) with priors μ ~ U(0, 1) and σ ~ U(0, 1)
d = 2 # dimension of the parameter vector θ
n = 1 # dimension of each independent replicate of Z
m = 30 # number of independent replicates in each data set
sample(K) = rand32(d, K)
simulate(θ, m) = [ϑ[1] .+ ϑ[2] .* randn32(n, m) for ϑ in eachcol(θ)]
# ---- Quantiles of θᵢ ∣ 𝐙, i = 1, …, d ----
# Neural network
w = 64 # width of each hidden layer
ψ = Chain(Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, d))
v = DeepSet(ψ, ϕ)
# Initialise the estimator
estimator = QuantileEstimator(v)
# Train the estimator
estimator = train(estimator, sample, simulate, simulator_args = m)
# Inference with "observed" data
θ = [0.8f0; 0.1f0]
Z = simulate(θ, m)
estimate(estimator, Z)
# ---- Quantiles of θᵢ ∣ 𝐙, θ₋ᵢ ----
# Neural network
w = 64 # width of each hidden layer
ψ = Chain(Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w + 1, w, relu), Dense(w, d - 1))
v = DeepSet(ψ, ϕ)
# Initialise estimators respectively targeting quantiles of μ∣Z,σ and σ∣Z,μ
q₁ = QuantileEstimator(v; i = 1)
q₂ = QuantileEstimator(v; i = 2)
# Train the estimators
q₁ = train(q₁, sample, simulate, simulator_args = m)
q₂ = train(q₂, sample, simulate, simulator_args = m)
# Estimate quantiles of μ∣Z,σ with σ = 0.5, for many data sets
θ₋ᵢ = 0.5f0
q₁(Z, θ₋ᵢ)
# Estimate quantiles of μ∣Z,σ with σ = 0.5 for a single data set
q₁(Z[1], θ₋ᵢ)
Ensembles
Ensemble combines multiple estimators, aggregating their individual estimates to improve accuracy.
NeuralEstimators.Ensemble Type
Ensemble <: NeuralEstimator
Ensemble(estimators)
Ensemble(architecture::Function, J::Integer)
(ensemble::Ensemble)(Z; aggr = mean)
Defines an ensemble of estimators which, when applied to data Z, returns the mean (or another summary defined by aggr) of the individual estimates (see, e.g., Sainsbury-Dale et al., 2025, Sec. S5).
The ensemble can be initialised with a collection of trained estimators and then applied immediately to observed data. Alternatively, the ensemble can be initialised with a collection of untrained estimators (or a function defining the architecture of each estimator, and the number of estimators in the ensemble), trained with train(), and then applied to observed data. In the latter case, where the ensemble is trained directly, if savepath is specified both the ensemble and component estimators will be saved.
Note that train() currently acts sequentially on the component estimators, using the Adam optimiser.
The ensemble components can be accessed by indexing the ensemble; the number of component estimators can be obtained using length().
See also Parallel, which can be used to mimic ensemble methods with an appropriately chosen connection.
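The aggregation step can be sketched with toy components in base Julia (the three closures below are stand-ins for trained estimators, not real package objects):

```julia
using Statistics  # for mean

# Each "estimator" maps the same data to a matrix of estimates; the
# ensemble returns the elementwise mean of those matrices.
components = [Z -> Z .+ 0.1, Z -> Z .- 0.1, Z -> 2 .* Z]  # toy estimators
Z = [1.0 2.0; 3.0 4.0]

estimates = [c(Z) for c in components]
ensemble_estimate = mean(estimates)   # elementwise mean across components
```

Any other summary (e.g., median) could be substituted for mean via the aggr keyword, at the cost of losing the simple averaging interpretation.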
Examples
using NeuralEstimators, Flux
# Data Z|θ ~ N(θ, 1) with θ ~ N(0, 1)
d = 1 # dimension of the parameter vector θ
n = 1 # dimension of each independent replicate of Z
m = 30 # number of independent replicates in each data set
sampler(K) = randn32(d, K)
simulator(θ, m) = [μ .+ randn32(n, m) for μ ∈ eachcol(θ)]
# Neural-network architecture of each ensemble component
function architecture()
ψ = Chain(Dense(n, 64, relu), Dense(64, 64, relu))
ϕ = Chain(Dense(64, 64, relu), Dense(64, d))
network = DeepSet(ψ, ϕ)
PointEstimator(network)
end
# Initialise ensemble with three component estimators
ensemble = Ensemble(architecture, 3)
ensemble[1] # access component estimators by indexing
ensemble[1:2] # indexing with an iterable collection returns the corresponding ensemble
length(ensemble) # number of component estimators
# Training
ensemble = train(ensemble, sampler, simulator, m = m, epochs = 5)
# Assessment
θ = sampler(1000)
Z = simulator(θ, m)
assessment = assess(ensemble, θ, Z)
rmse(assessment)
# Apply to data
ensemble(Z)
Helper functions
The following helper functions operate on an estimator to inspect its components or apply parts of it to data. For the main inference functions used post-training, see Inference with observed data.
NeuralEstimators.summarynetwork Function
summarynetwork(estimator::NeuralEstimator)
Returns the summary network of estimator.
See also summarystatistics.
NeuralEstimators.setsummarynetwork Function
setsummarynetwork(estimator::NeuralEstimator, network)
Returns a new estimator identical to estimator but with the summary network replaced by network. Useful for transfer learning.
Note that RatioEstimator has a second summary network for the parameters, accessible via estimator.summary_network_θ, which is not affected by this function.
See also summarynetwork.
NeuralEstimators.summarystatistics Function
summarystatistics(estimator::NeuralEstimator, Z; batchsize::Integer = 32, use_gpu::Bool = true)
Computes learned summary statistics by applying the summary network of estimator to data Z.
If Z is a DataSet object, the learned summary statistics are concatenated with the precomputed expert summary statistics stored in Z.S.
See also summarynetwork.
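The combination of learned and expert summaries amounts to a simple concatenation, sketched below with made-up values (the vectors are hypothetical, and the concatenation order is an assumption for illustration, not a statement about the package's internals):

```julia
# Learned summaries come from the summary network; expert summaries are
# precomputed and stored alongside the data. Downstream networks then
# condition on the combined vector.
learned = [0.5, -1.2, 3.0]   # hypothetical summary-network output
expert  = [7.0]              # hypothetical precomputed expert statistic
combined = vcat(learned, expert)
```

This is why the input dimension of any network consuming the summaries must account for both the learned and the expert statistics.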
NeuralEstimators.logdensity Function
logdensity(estimator::PosteriorEstimator, θ, Z)
Evaluates the log-density of the approximate posterior distribution implied by estimator at parameters θ given data Z.
θ should be a matrix of parameter vectors (one per column), and Z a collection of K data sets, where K is the number of parameter vectors.
See also sampleposterior.