Loss functions

When training an estimator of type PointEstimator, a loss function must be specified that determines the Bayes estimator that will be approximated. In addition to the standard loss functions provided by Flux (e.g., mae, mse, which allow for the approximation of posterior medians and means, respectively), the following loss functions are provided with the package.

NeuralEstimators.tanhloss Function
julia
tanhloss(θ̂, θ, κ; joint::Bool = true, scale_by_parameter_dim::Bool = true)

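For κ > 0, computes the loss function given in Sainsbury-Dale et al. (2025; Eqn. 14), namely,

$$L(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = \tanh\big(\|\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}\|_1 / \kappa\big),$$

which yields the 0–1 loss function in the limit κ → 0.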

If joint = true (default), the L₁ norm is computed over each parameter vector, so that with κ close to zero, the resulting Bayes estimator approximates the mode of the joint posterior distribution. Otherwise, if joint = false, the loss function is computed as

$$L(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = \sum_{i=1}^{d} \tanh\big(|\hat{\theta}_i - \theta_i| / \kappa\big),$$

where d denotes the dimension of the parameter vector θ. In this case, with κ close to zero, the resulting Bayes estimator approximates the vector containing the modes of the marginal posterior distributions.

Compared with kpowerloss(), which may also be used as a continuous approximation of the 0–1 loss function, the gradient of this loss is bounded as ‖θ̂ − θ‖₁ → 0, which can improve numerical stability during training.
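The difference between the joint and marginal forms can be illustrated with a minimal sketch. It assumes the loss has the form tanh(‖θ̂ − θ‖₁/κ), that parameter vectors are stored as the columns of a d × K matrix, and that the per-vector losses are averaged. The name `tanhloss_sketch` is hypothetical, the `scale_by_parameter_dim` keyword is not modelled, and this is not the package's implementation.

```julia
using Statistics: mean

# Hypothetical helper (not the package's implementation): tanh loss for
# d × K matrices, where each column is one parameter vector.
function tanhloss_sketch(θ̂::AbstractMatrix, θ::AbstractMatrix, κ::Real; joint::Bool = true)
    κ > 0 || throw(ArgumentError("κ must be positive"))
    absdiff = abs.(θ̂ .- θ)
    if joint
        # L₁ norm over each column, then tanh: one value in [0, 1) per parameter vector
        losses = tanh.(sum(absdiff; dims = 1) ./ κ)
    else
        # tanh applied element-wise, then summed over the d parameters
        losses = sum(tanh.(absdiff ./ κ); dims = 1)
    end
    return mean(losses)  # averaging over the K columns is an assumption of this sketch
end

θ  = [0.0 0.0; 0.0 0.0]
θ̂  = [0.0 1.0; 0.0 1.0]   # first column exact, second column wrong in both components

joint_loss    = tanhloss_sketch(θ̂, θ, 0.01)                 # second vector counted once
marginal_loss = tanhloss_sketch(θ̂, θ, 0.01; joint = false)  # each wrong component counted
```

With a small κ, the joint form behaves like a 0–1 loss on whole parameter vectors, whereas the marginal form counts each mismatched component separately.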

NeuralEstimators.kpowerloss Function
julia
kpowerloss(θ̂, θ, κ; agg = mean, safeorigin = true, ϵ = 0.1)

For κ > 0, the κ-th power absolute-distance loss function,

$$L(\hat{\theta}, \theta) = |\hat{\theta} - \theta|^{\kappa},$$

contains the squared-error (κ = 2), absolute-error (κ = 1), and 0–1 (κ → 0) loss functions as special cases. It is Lipschitz continuous if κ = 1, convex if κ ≥ 1, and strictly convex if κ > 1. It is quasiconvex for all κ > 0.

If safeorigin = true, the loss function is modified to be piecewise: within the ϵ-interval surrounding the origin it is made linear, while remaining continuous, to avoid pathologies around the origin (for κ < 1, the gradient of |θ̂ − θ|^κ is unbounded as θ̂ → θ).
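The special cases of the κ-th power loss can be checked numerically with a minimal sketch. It assumes the element-wise form |θ̂ − θ|^κ aggregated with `agg`. The name `kpowerloss_sketch` is hypothetical, and the `safeorigin` modification is deliberately left out because its exact form is not given here.

```julia
using Statistics: mean

# Hypothetical helper: element-wise κ-th power absolute-distance loss, |θ̂ − θ|^κ,
# aggregated with `agg`. No safeguard around the origin is applied.
function kpowerloss_sketch(θ̂, θ, κ::Real; agg = mean)
    κ > 0 || throw(ArgumentError("κ must be positive"))
    return agg(abs.(θ̂ .- θ) .^ κ)
end

θ  = [0.0, 0.0, 0.0]
θ̂  = [0.0, 0.5, 2.0]

squared  = kpowerloss_sketch(θ̂, θ, 2)      # mean of [0, 0.25, 4]
absolute = kpowerloss_sketch(θ̂, θ, 1)      # mean of [0, 0.5, 2]
zero_one = kpowerloss_sketch(θ̂, θ, 1e-3)   # close to the fraction of nonzero errors, 2/3
```

As κ shrinks, every nonzero error contributes a value near 1 regardless of its size, which is how the loss approaches the 0–1 loss.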

See also tanhloss().

NeuralEstimators.quantileloss Function
julia
quantileloss(θ̂, θ, τ; agg = mean)
quantileloss(θ̂, θ, τ::Vector; agg = mean)

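The asymmetric quantile loss function,

$$L(\hat{\theta}, \theta; \tau) = (\hat{\theta} - \theta)\big(\mathbb{I}(\hat{\theta} - \theta > 0) - \tau\big),$$

where τ ∈ (0, 1) is a probability level and 𝕀(⋅) is the indicator function.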
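The asymmetry can be seen in a minimal sketch. It assumes the loss has the form (θ̂ − θ)(𝕀(θ̂ − θ > 0) − τ), averaged over elements. The name `quantileloss_sketch` is hypothetical, and the grid search at the end is only an illustration of which constant estimate minimises the average loss.

```julia
using Statistics: mean

# Hypothetical helper: (θ̂ − θ)(𝕀(θ̂ − θ > 0) − τ), aggregated with `agg`.
function quantileloss_sketch(θ̂, θ, τ::Real; agg = mean)
    0 < τ < 1 || throw(ArgumentError("τ must lie in (0, 1)"))
    d = θ̂ .- θ
    return agg(d .* ((d .> 0) .- τ))
end

# Over-estimation is weighted by 1 − τ and under-estimation by τ
over  = quantileloss_sketch([1.0], [0.0], 0.9)    # (1)(1 − 0.9)
under = quantileloss_sketch([-1.0], [0.0], 0.9)   # (−1)(0 − 0.9)

# The constant estimate minimising the average loss over a sample sits at
# the τ-quantile of that sample.
sample = collect(1.0:100.0)
grid   = 0.0:0.5:101.0
risks  = [quantileloss_sketch(fill(c, length(sample)), sample, 0.9) for c in grid]
best   = grid[argmin(risks)]   # lies between 90 and 91 for this sample
```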

NeuralEstimators.intervalscore Function
julia
intervalscore(l, u, θ, α; agg = mean)
intervalscore(θ̂, θ, α; agg = mean)
intervalscore(assessment::Assessment; average_over_parameters::Bool = false, average_over_sample_sizes::Bool = true)

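Given an interval [l, u] with nominal coverage 100×(1-α)% and true value θ, the interval score (Gneiting and Raftery, 2007) is defined as

$$S(l, u, \theta; \alpha) = (u - l) + \frac{2}{\alpha}(l - \theta)\,\mathbb{I}(\theta < l) + \frac{2}{\alpha}(\theta - u)\,\mathbb{I}(\theta > u),$$

where α ∈ (0, 1) and 𝕀(⋅) is the indicator function.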

The method that takes a single value θ̂ assumes that θ̂ is a matrix with 2d rows, where d is the dimension of the parameter vector to make inference on. The first and second sets of d rows will be used as l and u, respectively.
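Both calling conventions can be sketched as follows. The sketch assumes the Gneiting and Raftery (2007) form of the score: the interval width plus a penalty of 2/α times the distance by which θ falls outside the interval. It also assumes d × K matrices with parameter vectors as columns. The name `intervalscore_sketch` is hypothetical, and this is not the package's implementation.

```julia
using Statistics: mean

# Hypothetical helper: interval score for lower bounds l, upper bounds u, and true values θ.
function intervalscore_sketch(l, u, θ, α::Real; agg = mean)
    0 < α < 1 || throw(ArgumentError("α must lie in (0, 1)"))
    width = u .- l
    below = (2 / α) .* (l .- θ) .* (θ .< l)   # penalty when θ falls below the interval
    above = (2 / α) .* (θ .- u) .* (θ .> u)   # penalty when θ falls above the interval
    return agg(width .+ below .+ above)
end

# Stacked form: θ̂ has 2d rows; the first d rows are lower bounds, the last d rows upper bounds.
function intervalscore_sketch(θ̂::AbstractMatrix, θ::AbstractMatrix, α::Real; agg = mean)
    d = size(θ, 1)
    size(θ̂, 1) == 2d || throw(DimensionMismatch("θ̂ must have 2d rows"))
    return intervalscore_sketch(θ̂[1:d, :], θ̂[(d + 1):end, :], θ, α; agg = agg)
end

covered = intervalscore_sketch([0.0], [1.0], [0.5], 0.05)   # width only: 1
missed  = intervalscore_sketch([0.0], [1.0], [1.5], 0.05)   # 1 + (2/0.05)(0.5) = 21

θ̂ = [0.0 0.0; 1.0 1.0]   # d = 1, K = 2: the interval [0, 1] for both columns
θ = [0.5 1.5]
stacked = intervalscore_sketch(θ̂, θ, 0.05)                  # mean of [1, 21]
```

Narrow intervals score well only when they cover the true value; a miss is penalised in proportion to its distance from the interval, scaled by 2/α.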
