Loss functions

When training an estimator of type PointEstimator, a loss function must be specified that determines the Bayes estimator that will be approximated. In addition to the standard loss functions (e.g., mae, mse, which allow for the approximation of posterior medians and means, respectively), the following loss functions are provided with the package.
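The link between a loss function and the posterior summary it targets can be checked numerically. The following is a minimal sketch using only the Statistics standard library: the toy sample, the candidate grid, and the helper names `mse_risk` and `mae_risk` are illustrative assumptions and are not part of the package.

```julia
using Statistics

# Toy "posterior sample" for a scalar parameter (illustrative values only)
θ = [1.0, 2.0, 3.0, 4.0, 100.0]

# Empirical risk of a constant estimate c under each loss (hypothetical helpers)
mse_risk(c) = mean((c .- θ) .^ 2)
mae_risk(c) = mean(abs.(c .- θ))

# Grid search for the minimiser of each risk
candidates = 0.0:0.5:100.0
c_mse = candidates[argmin(mse_risk.(candidates))]
c_mae = candidates[argmin(mae_risk.(candidates))]

println((c_mse, mean(θ)))    # squared-error minimiser next to the sample mean
println((c_mae, median(θ)))  # absolute-error minimiser next to the sample median
```

On this sample the squared-error minimiser coincides with the mean, which the outlier pulls upwards, while the absolute-error minimiser coincides with the median.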

NeuralEstimators.tanhloss Function

julia

tanhloss(θ̂, θ, κ; joint::Bool = true, scale_by_parameter_dim::Bool = true)

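For κ > 0, computes the loss function given in Sainsbury-Dale et al. (2025; Eqn. 14), namely,

$$
L(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = \tanh\left(\frac{\|\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}\|_1}{\kappa}\right),
$$

which yields the 0–1 loss function in the limit κ → 0.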

If joint = true (default), the L₁ norm is computed over each parameter vector, so that, with κ close to zero, the resulting Bayes estimator approximates the mode of the joint posterior distribution. Otherwise, if joint = false, the loss function is computed as

$$
L(\hat{\boldsymbol{\theta}}, \boldsymbol{\theta}) = \sum_{i=1}^{d} \tanh\left(\frac{|\hat{\theta}_i - \theta_i|}{\kappa}\right),
$$

where d denotes the dimension of the parameter vector θ. In this case, with κ close to zero, the resulting Bayes estimator approximates the vector containing the modes of the marginal posterior distributions.
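The joint and marginal forms can be compared with a small sketch. The function `tanhloss_sketch` below is a hypothetical re-implementation for illustration only, not the package's code: it assumes θ̂ and θ are d × K matrices with one parameter vector per column, averages over the K columns, and ignores the `scale_by_parameter_dim` keyword.

```julia
using Statistics

# Hypothetical re-implementation for illustration; not the package's code.
# θ̂ and θ are d × K matrices with one parameter vector per column.
function tanhloss_sketch(θ̂, θ, κ; joint::Bool = true)
    Δ = abs.(θ̂ .- θ)
    if joint
        # L₁ norm over each parameter vector (column), then tanh
        per_vector = tanh.(sum(Δ, dims = 1) ./ κ)
    else
        # tanh applied elementwise, then summed over the d parameters
        per_vector = sum(tanh.(Δ ./ κ), dims = 1)
    end
    return mean(per_vector)  # average over the K columns (an assumption)
end

θ = [0.0 0.0; 0.0 0.0]
θ̂ = [0.0 1.0; 0.0 1.0]  # first column exact, second column wrong in both components

joint_loss = tanhloss_sketch(θ̂, θ, 0.01)
marginal_loss = tanhloss_sketch(θ̂, θ, 0.01; joint = false)
println((joint_loss, marginal_loss))
```

With κ this small, the joint form charges roughly one unit per incorrect parameter vector, whereas the marginal form charges roughly one unit per incorrect component.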

Unlike kpowerloss(), which may also be used as a continuous approximation of the 0–1 loss function, this loss has a gradient that remains bounded as ‖θ̂ − θ‖₁ → 0, which can improve numerical stability during training.
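The difference in gradient behaviour near the origin can be seen in the scalar case, writing r = |θ̂ − θ| for the absolute error. The helpers `dtanh` and `dpower` below are hypothetical names for the analytic derivatives of tanh(r/κ) and r^κ with respect to r, and the value of κ is chosen purely for illustration.

```julia
# Analytic derivatives with respect to the absolute error r = |θ̂ - θ| (scalar case)
dtanh(r, κ) = sech(r / κ)^2 / κ   # d/dr tanh(r/κ): never exceeds 1/κ
dpower(r, κ) = κ * r^(κ - 1)      # d/dr r^κ: grows without bound as r → 0 when κ < 1

κ = 0.1
for r in (1e-2, 1e-4, 1e-8)
    println("r = $r: tanh-loss slope = $(dtanh(r, κ)), κ-power slope = $(dpower(r, κ))")
end
```

The tanh slope levels off at 1/κ, while the κ-th power slope keeps increasing as r shrinks.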


NeuralEstimators.kpowerloss Function

julia

kpowerloss(θ̂, θ, κ; agg = mean, safeorigin = true, ϵ = 0.1)

For κ > 0, the κ-th power absolute-distance loss function,

$$
L(\hat{\theta}, \theta) = |\hat{\theta} - \theta|^{\kappa},
$$

contains the squared-error (κ = 2), absolute-error (κ = 1), and 0–1 (κ → 0) loss functions as special cases. It is Lipschitz continuous if κ = 1, convex if κ ≥ 1, and strictly convex if κ > 1. It is quasiconvex for all κ > 0.
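The special cases can be illustrated with a short sketch. The function `kpower_sketch` is a hypothetical stand-in that omits the safeorigin modification and assumes the loss is averaged over elements, in line with the default agg = mean.

```julia
using Statistics

# Hypothetical sketch of the κ-th power absolute-distance loss, without the
# safeorigin modification; averaging over elements is assumed (agg = mean).
kpower_sketch(θ̂, θ, κ) = mean(abs.(θ̂ .- θ) .^ κ)

θ = [0.0, 0.0, 0.0]
θ̂ = [0.0, 0.5, 2.0]

squared = kpower_sketch(θ̂, θ, 2)         # squared error: mean of [0, 0.25, 4]
absolute = kpower_sketch(θ̂, θ, 1)        # absolute error: mean of [0, 0.5, 2]
near_zero_one = kpower_sketch(θ̂, θ, 0.01) # close to 2/3, the fraction of non-zero errors
println((squared, absolute, near_zero_one))
```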

If safeorigin = true, the loss function is modified to be piecewise: within the ϵ-interval surrounding the origin it is continuous and linear, which avoids pathologies there (e.g., the unbounded gradient of |θ̂ − θ|^κ at zero when κ < 1).
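One way such a piecewise modification could look is sketched below. This is an assumption made for illustration, not the package's implementation: the hypothetical `safe_power` replaces r^κ on [0, ϵ] with a straight line through the origin whose slope is chosen so that the two pieces agree at r = ϵ.

```julia
# Hypothetical piecewise version of r ↦ r^κ, where r = |θ̂ - θ|. The linear
# piece on [0, ϵ] is chosen so that both pieces agree at r = ϵ; the package's
# actual modification may differ.
safe_power(r, κ; ϵ = 0.1) = r > ϵ ? r^κ : ϵ^(κ - 1) * r

κ = 0.5
at_boundary = safe_power(0.1, κ)           # agrees with 0.1^κ, so the pieces meet at r = ϵ
slope_near_origin = safe_power(1e-6, κ) / 1e-6  # finite slope ϵ^(κ - 1) near the origin
println((at_boundary, slope_near_origin))
```

Under this choice the slope near the origin is fixed at ϵ^(κ − 1), so it no longer diverges when κ < 1.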

See also tanhloss().


NeuralEstimators.quantileloss Function

julia

quantileloss(θ̂, θ, τ; agg = mean)
quantileloss(θ̂, θ, τ::Vector; agg = mean)

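The asymmetric quantile loss function,

$$
L(\hat{\theta}, \theta; \tau) = (\hat{\theta} - \theta)\big(\mathbb{I}(\hat{\theta} - \theta > 0) - \tau\big),
$$

where τ ∈ (0, 1) is a probability level and 𝕀(⋅) is the indicator function.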
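The asymmetry can be seen by comparing an over-estimate with an under-estimate of the same size. The function `quantileloss_sketch` is a hypothetical scalar-τ re-implementation for illustration, assuming the loss is averaged over elements (agg = mean).

```julia
using Statistics

# Hypothetical sketch of the quantile loss for a single probability level τ;
# averaging over elements is assumed (agg = mean).
function quantileloss_sketch(θ̂, θ, τ)
    δ = θ̂ .- θ
    return mean(δ .* ((δ .> 0) .- τ))
end

θ = [0.0]
over = quantileloss_sketch([1.0], θ, 0.9)    # over-estimate by 1: weight 1 − τ
under = quantileloss_sketch([-1.0], θ, 0.9)  # under-estimate by 1: weight τ
println((over, under))
```

With τ = 0.9, under-estimation is penalised nine times more heavily than over-estimation, which pushes the minimiser towards the upper tail of the posterior.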


NeuralEstimators.intervalscore Function

julia

intervalscore(l, u, θ, α; agg = mean)
intervalscore(θ̂, θ, α; agg = mean)
intervalscore(assessment::Assessment; average_over_parameters::Bool = false, average_over_sample_sizes::Bool = true)

Given an interval [l, u] with nominal coverage 100×(1-α)% and true value θ, the interval score (Gneiting and Raftery, 2007) is defined as

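$$
S(l, u; \alpha, \theta) = (u - l) + 2\alpha^{-1}(l - \theta)\,\mathbb{I}(\theta < l) + 2\alpha^{-1}(\theta - u)\,\mathbb{I}(\theta > u),
$$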

where α ∈ (0, 1) and 𝕀(⋅) is the indicator function.

The method that takes a single value θ̂ assumes that θ̂ is a matrix with 2d rows, where d is the dimension of the parameter vector to make inference on. The first and second sets of d rows will be used as l and u, respectively.
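Both calling conventions can be sketched as follows. The function `intervalscore_sketch` is a hypothetical re-implementation for illustration, assuming the score is averaged over elements (agg = mean) and that the columns of θ̂ and θ index replicates.

```julia
using Statistics

# Hypothetical sketch of the interval score; averaging over elements is assumed.
function intervalscore_sketch(l, u, θ, α)
    width = u .- l
    below = (2 / α) .* (l .- θ) .* (θ .< l)  # penalty when θ falls below l
    above = (2 / α) .* (θ .- u) .* (θ .> u)  # penalty when θ falls above u
    return mean(width .+ below .+ above)
end

# Single-argument form: θ̂ has 2d rows; the first d rows are l, the last d rows are u.
function intervalscore_sketch(θ̂, θ, α)
    d = size(θ̂, 1) ÷ 2
    l = θ̂[1:d, :]
    u = θ̂[(d + 1):end, :]
    return intervalscore_sketch(l, u, θ, α)
end

θ = [0.5 3.0]            # d = 1 parameter, two replicates
θ̂ = [0.0 0.0; 1.0 1.0]   # row 1: lower bounds, row 2: upper bounds
score = intervalscore_sketch(θ̂, θ, 0.1)
println(score)
```

In this example the first interval covers its true value and contributes only its width, while the second misses by 2 and incurs an additional penalty of 2 × 2 / α.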

