Post-training assessment
The function assess can be used to assess a trained estimator. The resulting Assessment object contains ground-truth parameters, estimates, and other quantities that can be used to compute quantitative and qualitative diagnostics.
NeuralEstimators.assess Function
assess(estimator, θ, Z; ...)
assess(estimators::Vector, θ, Z; ...)
Assesses an estimator (or a collection of estimators) based on true parameters θ and corresponding simulated data Z.
The parameters θ should be given as a d × K matrix, where d is the dimension of the parameter vector and K is the number of parameter vectors.
When Z contains more simulated data sets than the number K of parameter vectors, θ will be recycled via horizontal concatenation: θ = repeat(θ, outer = (1, J)), where J = numobs(Z) ÷ K is the number of simulated data sets for each parameter vector. This allows assessment of the estimator's sampling distribution under fixed parameters.
The return value is of type Assessment.
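The recycling rule above can be illustrated with plain Julia arrays. This is a minimal sketch, not the package's internal code: the sizes and values are made up, length(Z) stands in for numobs(Z), and the k and j indices are simply those implied by the order of the horizontal concatenation.

```julia
# Hypothetical sizes: d = 2 parameters, K = 3 parameter vectors, 6 data sets.
θ = [1.0 2.0 3.0;
     0.1 0.2 0.3]                   # d × K matrix of true parameters
K = size(θ, 2)
Z = [randn(10) for _ in 1:6]        # placeholder simulated data sets
J = length(Z) ÷ K                   # stands in for numobs(Z) ÷ K

# Horizontal concatenation: the columns become θ¹, θ², θ³, θ¹, θ², θ³.
θ_rep = repeat(θ, outer = (1, J))   # d × (K * J) matrix

# Indices implied by that ordering (an assumption about how Z is arranged).
k = repeat(1:K, outer = J)          # parameter-vector index of each column
j = repeat(1:J, inner = K)          # data-set index of each column
```

Under this ordering, the first K data sets in Z are paired with θ¹,…,θᴷ, the next K with the same vectors again, and so on.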
Keyword arguments
estimator_name::String (or estimator_names::Vector{String} for multiple estimators): name(s) of the estimator(s) (sensible defaults provided).
parameter_names::Vector{String}: names of the parameters (sensible default provided).
use_gpu = true: Bool or Vector{Bool} with length equal to the number of estimators.
probs = nothing (applicable only to PointEstimator): probability levels taking values between 0 and 1. By default, no bootstrap uncertainty quantification is done; if probs is provided, it must be a two-element vector specifying the lower and upper probability levels for non-parametric bootstrap intervals (note that parametric bootstrap is not currently supported with assess()).
B::Integer = 400 (applicable only to PointEstimator): number of bootstrap samples.
pointsummary::Function = mean (applicable only to estimators that yield posterior samples): a function that summarises a vector of posterior samples into a single point estimate for each marginal; any function mapping a vector to a scalar is valid (e.g., median for the posterior median).
N::Integer = 1000 (applicable only to estimators that yield posterior samples): number of posterior samples drawn for each data set.
kwargs... (applicable only to estimators that yield posterior samples): additional keyword arguments passed to sampleposterior.
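To make the roles of probs and B concrete, the following sketch computes a non-parametric bootstrap percentile interval in plain Julia. It is an illustration under stated assumptions: the sample mean stands in for a trained PointEstimator, the data are simulated here, and how assess() carries out the resampling internally is not described in this section.

```julia
using Random, Statistics

rng = MersenneTwister(1)
z = 3.0 .+ randn(rng, 200)          # one hypothetical data set of 200 observations
estimator(data) = mean(data)        # stand-in for a trained point estimator

B = 400                             # number of bootstrap samples
probs = [0.025, 0.975]              # lower and upper probability levels

# Resample the observations with replacement and re-estimate B times.
n = length(z)
boot = [estimator(z[rand(rng, 1:n, n)]) for _ in 1:B]

# Percentile interval at the requested probability levels.
lower, upper = quantile(boot, probs)
```

A larger B reduces the Monte Carlo noise in the interval endpoints at the cost of more estimator evaluations.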
NeuralEstimators.Assessment Type
Assessment
A type for storing the output of assess(). The field runtime contains the total time taken for each estimator. The field estimates is a long-form DataFrame with columns:
parameter: the name of the parameter
truth: the true value of the parameter
estimate: the estimated value of the parameter
k: the index of the parameter vector
j: the index of the data set (only relevant in the case that multiple data sets are associated with each parameter vector)
If the estimator is a PosteriorEstimator or a RatioEstimator, in addition to the fields listed above, the field samples stores the posterior samples as a long-form DataFrame with the columns parameter, truth, k, j (as given above), as well as:
draw: the index of the draw within the posterior samples
value: the value of the posterior sample for a given parameter and draw.
If the estimator is an IntervalEstimator, the column estimate will be replaced by the columns lower and upper, containing the lower and upper bounds of the interval, respectively.
If the estimator is a QuantileEstimator, there will also be a column prob indicating the probability level of the corresponding quantile estimate.
Use merge() to combine assessments from multiple estimators of the same type or join() to combine assessments from a PointEstimator and an IntervalEstimator.
Makie.plot Method
plot(assessment::Assessment; prob = 0.99)
Visualise the performance of a neural estimator. Accepts the Assessment object returned by assess.
Extension
This function is defined in the NeuralEstimatorsPlottingMakieExt extension and requires CairoMakie (or another Makie backend) to be loaded.
The plot produced depends on the type of estimator being assessed:
PointEstimator: produces a scatter plot of estimates vs. true values, faceted by parameter.
IntervalEstimator: produces a plot of estimated credible intervals vs. true values, faceted by parameter. Each interval is drawn as a vertical line segment from lower to upper bound, with tick marks at the endpoints.
QuantileEstimator: produces a calibration plot of the empirical coverage probability vs. the nominal probability level τ, faceted by parameter. A well-calibrated estimator will follow the red diagonal line. Specifically, the diagnostic is constructed as follows:
For k = 1,…,K, sample pairs (θᵏ, Zᵏ) with θᵏ ∼ p(θ), Zᵏ ∼ p(Z ∣ θᵏ). This gives K "posterior draws", θᵏ ∼ p(θ ∣ Zᵏ).
For each k and each τ ∈ {τⱼ : j = 1,…,J}, estimate the posterior quantile Q(Zᵏ, τ).
For each τ, compute the proportion of quantiles Q(Zᵏ, τ) exceeding the corresponding θᵏ, and plot this proportion against τ.
PosteriorEstimator: produces a three-row figure:
Recovery plot: posterior mean vs. true value (scatter), with vertical line segments showing the 95% posterior credible interval, faceted by parameter.
ECDF plot: for each parameter, the empirical CDF of the fractional rank of the true value within the posterior samples, together with a simultaneous prob-level confidence band. A well-calibrated posterior yields an ECDF that stays within the band.
Z-score / contraction plot: posterior z-score (posterior mean − truth) / posterior SD vs. posterior contraction 1 − Var(posterior) / Var(prior), faceted by parameter. Ideally z-scores are centred near zero and contractions are near one.
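The three-step calibration diagnostic described for QuantileEstimator can be sketched without a neural network by using a model whose posterior quantiles are known in closed form. Everything model-specific here is an assumption made for illustration: a Pareto(scale 1, shape α) prior with Zᵏ ∣ θᵏ ∼ Uniform(0, θᵏ), for which the posterior is Pareto with scale max(1, Z) and shape α + 1. The exact quantile function Q stands in for a trained estimator.

```julia
using Random, Statistics

rng = MersenneTwister(1)
K  = 10_000                 # number of (θ, Z) pairs
α  = 2.0                    # hypothetical prior shape parameter
τs = 0.1:0.1:0.9            # nominal probability levels

# Step 1: θᵏ ~ Pareto(1, α) by inverse-CDF sampling, then Zᵏ ~ Uniform(0, θᵏ).
θ = (1 .- rand(rng, K)) .^ (-1 / α)
Z = rand(rng, K) .* θ

# Step 2: exact posterior quantile, standing in for a trained QuantileEstimator.
Q(z, τ) = max(1.0, z) * (1 - τ)^(-1 / (α + 1))

# Step 3: for each τ, the proportion of quantiles exceeding the true θᵏ.
empirical = [mean(Q.(Z, τ) .> θ) for τ in τs]
```

Because the quantiles are exact, empirical should track τs closely, which corresponds to points lying on the diagonal of the calibration plot; a miscalibrated estimator would deviate from it.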
Keyword arguments
prob = 0.99: nominal simultaneous coverage level for the simulation-based calibration (SBC) confidence band. Only used when assessment contains posterior samples.
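The quantities behind the PosteriorEstimator figure (fractional rank, z-score, and contraction) can be computed directly from posterior samples. The sketch below assumes a conjugate normal model, θ ∼ N(0, 1) and Z ∣ θ ∼ N(θ, σ²), so that exact posterior draws can stand in for a trained estimator; the variable names are hypothetical, and the prior variance is estimated from the sampled true values, which may differ from what the plotting code does.

```julia
using Random, Statistics

rng = MersenneTwister(1)
K, N = 2_000, 1_000                  # (θ, Z) pairs and posterior draws per data set
σ = 0.5                              # hypothetical known noise standard deviation

θ = randn(rng, K)                    # θᵏ ~ N(0, 1)
Z = θ .+ σ .* randn(rng, K)          # Zᵏ | θᵏ ~ N(θᵏ, σ²)

# Exact posterior N(Z / (1 + σ²), σ² / (1 + σ²)) in place of a neural posterior.
post_mean = Z ./ (1 + σ^2)
post_sd   = sqrt(σ^2 / (1 + σ^2))
samples   = post_mean' .+ post_sd .* randn(rng, N, K)   # N × K matrix of draws

m = vec(mean(samples, dims = 1))     # posterior means
s = vec(std(samples, dims = 1))      # posterior standard deviations

zscore      = (m .- θ) ./ s                        # (posterior mean − truth) / posterior SD
contraction = 1 .- s .^ 2 ./ var(θ)                # 1 − Var(posterior) / Var(prior)
frac_rank   = vec(mean(samples .< θ', dims = 1))   # fractional rank of the truth
```

For a calibrated posterior the fractional ranks are approximately uniform on [0, 1], which is what the ECDF band checks. In this model the contraction is about 1 / (1 + σ²) = 0.8.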
NeuralEstimators.risk Function
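risk(assessment::Assessment; ...)
Computes a Monte Carlo approximation of an estimator's Bayes risk,
$$r(\hat{\theta}(\cdot)) \approx \frac{1}{K} \sum_{k=1}^{K} L\big(\theta^{k}, \hat{\theta}(Z^{k})\big),$$
where L(·, ·) is the loss function, θᵏ, k = 1,…,K, are parameter vectors sampled from the prior, and Zᵏ are data simulated from the model conditional on θᵏ.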
If the Assessment object corresponds to an estimator with a self-defined loss (e.g., PosteriorEstimator), the precomputed risk is returned directly. Otherwise, the risk is computed from the estimates and true parameters using the provided loss function.
Keyword arguments
loss = (x, y) -> abs(x - y): a binary operator defining the loss function (default: absolute-error loss)
average_over_parameters::Bool = false: if true, the loss is averaged over all parameters; otherwise (default), it is computed separately for each parameter.
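As a rough illustration of the Monte Carlo risk and of what average_over_parameters changes, the computation can be written out by hand on small matrices. The numbers and variable names below are made up, and the code only mirrors the description above; it is not the package's implementation.

```julia
using Statistics

# Hypothetical truths and estimates: d = 2 parameters (rows), K = 3 vectors (columns).
θ     = [1.0 2.0 3.0;
         0.5 1.0 1.5]
θ_hat = [1.5 2.0 2.0;
         0.5 1.5 1.0]

loss = (x, y) -> abs(x - y)          # the default absolute-error loss

losses = loss.(θ_hat, θ)             # elementwise loss, d × K

# Analogue of average_over_parameters = false: one value per parameter.
risk_per_parameter = vec(mean(losses, dims = 2))   # ≈ [0.5, 0.333]

# Analogue of average_over_parameters = true: a single value.
risk_overall = mean(losses)                        # ≈ 0.417
```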
NeuralEstimators.bias Function
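bias(assessment::Assessment; average_over_parameters = false)
Computes a Monte Carlo approximation of an estimator's bias,
$$\mathrm{bias}(\hat{\theta}(\cdot)) \approx \frac{1}{K} \sum_{k=1}^{K} \big\{\hat{\theta}(Z^{k}) - \theta^{k}\big\},$$
where θᵏ, k = 1,…,K, are parameter vectors sampled from the prior and Zᵏ are data simulated from the model conditional on θᵏ.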
NeuralEstimators.rmse Function
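rmse(assessment::Assessment; average_over_parameters = false)
Computes a Monte Carlo approximation of an estimator's root-mean-squared error,
$$\mathrm{rmse}(\hat{\theta}(\cdot)) \approx \sqrt{\frac{1}{K} \sum_{k=1}^{K} \big(\hat{\theta}(Z^{k}) - \theta^{k}\big)^{2}},$$
where θᵏ, k = 1,…,K, are parameter vectors sampled from the prior and Zᵏ are data simulated from the model conditional on θᵏ.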
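The bias and RMSE approximations can be checked by hand in the same way. This sketch uses made-up truths and estimates and computes both quantities per parameter, corresponding to average_over_parameters = false; the variable names are hypothetical and the code is not taken from the package.

```julia
using Statistics

# Hypothetical truths and estimates: d = 2 parameters (rows), K = 3 vectors (columns).
θ     = [1.0 2.0 3.0;
         0.5 1.0 1.5]
θ_hat = [1.5 2.0 2.0;
         0.5 1.5 1.0]

err = θ_hat .- θ                                    # estimation errors, d × K

# Mean error per parameter: positive and negative errors can cancel.
bias_per_parameter = vec(mean(err, dims = 2))       # ≈ [-0.167, 0.0]

# Root of the mean squared error per parameter: errors cannot cancel.
rmse_per_parameter = vec(sqrt.(mean(err .^ 2, dims = 2)))   # ≈ [0.645, 0.408]
```

The second parameter shows why both diagnostics are useful: its bias is zero because the errors +0.5 and −0.5 cancel, while its RMSE is clearly non-zero.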
NeuralEstimators.coverage Function
coverage(assessment::Assessment; ...)
Computes a Monte Carlo approximation of an interval estimator's expected coverage, as defined in Hermans et al. (2022, Definition 2.1), and the proportion of parameters below and above the lower and upper bounds, respectively.
Keyword arguments
average_over_parameters::Bool = false: if true, the coverage is averaged over all parameters; otherwise (default), it is computed over each parameter separately.
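The three proportions described above can be sketched for a single parameter with plain vectors. The values are made up, and the treatment of truths that fall exactly on a bound (counted as covered here) is an assumption rather than something stated in this section.

```julia
using Statistics

# Hypothetical true values and interval bounds for one parameter, K = 4.
θ     = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 2.5, 5.0]

covered = (lower .<= θ) .& (θ .<= upper)   # truth inside the interval

coverage_rate = mean(covered)              # 0.5: the first and fourth truths are covered
below = mean(θ .< lower)                   # 0.25: the second truth is below its lower bound
above = mean(θ .> upper)                   # 0.25: the third truth is above its upper bound
```

The three proportions sum to one, so unequal values of below and above indicate intervals that are systematically shifted to one side.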