Loss functions
When training an estimator of type PointEstimator, a loss function must be specified that determines the Bayes estimator that will be approximated. In addition to the standard loss functions provided by Flux (e.g., mae, mse, which allow for the approximation of posterior medians and means, respectively), the following loss functions are provided with the package.
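As a small numerical illustration of the link between a loss function and the estimator it targets, the following sketch searches a grid for the constant that minimizes the average absolute error and the average squared error over a sample. The sample and the helper names (mae_risk, mse_risk) are hypothetical, and the code uses only Julia's standard library rather than Flux.

```julia
using Statistics

# Hypothetical skewed sample standing in for posterior draws (mean ≠ median)
θ = [0.1, 0.2, 0.3, 0.4, 5.0]

# Empirical risk of a constant estimate c under each loss
mae_risk(c) = mean(abs.(c .- θ))
mse_risk(c) = mean(abs2.(c .- θ))

# Grid search for the minimizers
grid = 0.0:0.01:5.0
c_mae = grid[argmin(mae_risk.(grid))]   # should sit near median(θ)
c_mse = grid[argmin(mse_risk.(grid))]   # should sit near mean(θ)

println((c_mae, median(θ)))
println((c_mse, mean(θ)))
```

Under these assumptions, the absolute-error minimizer lands on the sample median and the squared-error minimizer on the sample mean, up to the grid resolution.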
NeuralEstimators.tanhloss Function
tanhloss(θ̂, θ, κ; joint::Bool = true, scale_by_parameter_dim::Bool = true)
For κ > 0, computes the loss function given in Sainsbury-Dale et al. (2025; Eqn. 14), namely,
$$L(\hat{\theta}, \theta) = \tanh\left(\frac{\|\hat{\theta} - \theta\|_1}{\kappa}\right),$$
which yields the 0–1 loss function in the limit κ → 0.
If joint = true (default), the L₁ norm is computed over each parameter vector, so that, with κ close to zero, the resulting Bayes estimator approximates the mode of the joint posterior distribution. Otherwise, if joint = false, the loss function is computed as
$$L(\hat{\theta}, \theta) = \sum_{i=1}^{d} \tanh\left(\frac{|\hat{\theta}_i - \theta_i|}{\kappa}\right),$$
where d is the number of parameters, so that, with κ close to zero, the resulting Bayes estimator approximates the vector containing the modes of the marginal posterior distributions.
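Compared with kpowerloss(), which may also be used as a continuous approximation of the 0–1 loss function, the gradient of this loss is bounded as ‖θ̂ − θ‖₁ → 0 (its magnitude never exceeds 1/κ), whereas the gradient of the unmodified κ-th power loss diverges at the origin when κ < 1.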
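The joint and marginal variants described above can be sketched in plain Julia as follows. This is a minimal illustration and not the package's implementation: the name tanh_loss_sketch is hypothetical, the inputs are assumed to be d × K matrices with one parameter vector per column, averaging over the K columns is an assumption, and the scale_by_parameter_dim option is not modeled.

```julia
using Statistics

# Hypothetical helper; θ̂ and θ are d × K matrices (one parameter vector per column)
function tanh_loss_sketch(θ̂, θ, κ; joint::Bool = true)
    d = abs.(θ̂ .- θ)
    if joint
        # L₁ norm over each parameter vector (column), then tanh
        L = tanh.(sum(d, dims = 1) ./ κ)
    else
        # tanh applied to each element, then summed over the parameters
        L = sum(tanh.(d ./ κ), dims = 1)
    end
    return mean(L)   # average over the K parameter vectors (an assumption)
end

θ  = [0.0 1.0; 0.0 1.0]   # two parameter vectors, (0, 0) and (1, 1)
θ̂ = [0.0 1.5; 0.0 1.5]   # the first is estimated exactly, the second is not

joint_loss    = tanh_loss_sketch(θ̂, θ, 0.5)
marginal_loss = tanh_loss_sketch(θ̂, θ, 0.5; joint = false)
near_zero_one = tanh_loss_sketch(θ̂, θ, 1e-6)   # fraction of columns that differ

println((joint_loss, marginal_loss, near_zero_one))
```

With κ very small, the joint variant behaves like a 0–1 loss over whole parameter vectors: here one of the two columns is wrong, so the averaged loss is close to one half.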
NeuralEstimators.kpowerloss Function
kpowerloss(θ̂, θ, κ; agg = mean, safeorigin = true, ϵ = 0.1)
For κ > 0, the κ-th power absolute-distance loss function,
$$L(\hat{\theta}, \theta) = |\hat{\theta} - \theta|^{\kappa},$$
contains the squared-error (κ = 2), absolute-error (κ = 1), and 0–1 (κ → 0) loss functions as special cases. It is Lipschitz continuous if κ = 1, convex if κ ≥ 1, and strictly convex if κ > 1. It is quasiconvex for all κ > 0.
If safeorigin = true, the loss function is defined piecewise: within the ϵ-interval surrounding the origin it is replaced by a linear function, chosen so that the loss remains continuous, which avoids pathologies around the origin.
See also tanhloss().
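The special cases and the safeguard near the origin can be sketched as follows. The function name kpower_loss_sketch is hypothetical, and the particular linear piece used here, ϵ^(κ − 1)·|θ̂ − θ|, is an assumption: it is one simple choice that matches |θ̂ − θ|^κ at distance ϵ, and the package may use a different construction.

```julia
using Statistics

# Hypothetical helper illustrating the κ-th power loss with a linear piece near 0
function kpower_loss_sketch(θ̂, θ, κ; agg = mean, safeorigin::Bool = true, ϵ = 0.1)
    d = abs.(θ̂ .- θ)
    if safeorigin
        # Inside the ϵ-interval use the line ϵ^(κ - 1) * d, which equals d^κ at d = ϵ
        L = ifelse.(d .> ϵ, d .^ κ, ϵ^(κ - 1) .* d)
    else
        L = d .^ κ
    end
    return agg(L)
end

θ  = [0.0, 0.0]
θ̂ = [2.0, 0.05]

squared  = kpower_loss_sketch(θ̂, θ, 2; safeorigin = false)   # mean squared error
absolute = kpower_loss_sketch(θ̂, θ, 1; safeorigin = false)   # mean absolute error
boundary = kpower_loss_sketch([0.1], [0.0], 0.5)              # evaluated at d = ϵ

println((squared, absolute, boundary))
```

For κ < 1 the slope of |θ̂ − θ|^κ grows without bound as the distance shrinks, which is why a linear piece of fixed slope near the origin is a natural safeguard.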
NeuralEstimators.quantileloss Function
quantileloss(θ̂, θ, τ; agg = mean)
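quantileloss(θ̂, θ, τ::Vector; agg = mean)
The asymmetric quantile loss function,
$$L(\hat{\theta}, \theta; \tau) = (\hat{\theta} - \theta)\left(\mathbb{I}(\hat{\theta} - \theta > 0) - \tau\right),$$
where τ ∈ (0, 1) is a probability level and 𝕀(⋅) is the indicator function.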
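To make the asymmetry concrete, the sketch below evaluates the quantile loss for a scalar probability level. The name quantile_loss_sketch is hypothetical and the code is a plain-Julia illustration of the formula, not the package's implementation; the method that takes a vector τ is not covered.

```julia
using Statistics

# Hypothetical helper: (θ̂ - θ) * (𝕀(θ̂ - θ > 0) - τ), aggregated with agg
function quantile_loss_sketch(θ̂, θ, τ; agg = mean)
    e = θ̂ .- θ
    return agg(e .* ((e .> 0) .- τ))
end

θ = [1.0]
over  = quantile_loss_sketch([2.0], θ, 0.1)   # overestimate by 1
under = quantile_loss_sketch([0.0], θ, 0.1)   # underestimate by 1

# With τ = 0.5 the loss is half the absolute error
half_abs = quantile_loss_sketch([2.0, 0.0], [1.0, 1.0], 0.5)

println((over, under, half_abs))
```

With τ = 0.1, overestimating by one unit costs 0.9 while underestimating by the same amount costs 0.1, so minimizing the loss pulls the estimate toward a low quantile.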
NeuralEstimators.intervalscore Function
intervalscore(l, u, θ, α; agg = mean)
intervalscore(θ̂, θ, α; agg = mean)
intervalscore(assessment::Assessment; average_over_parameters::Bool = false, average_over_sample_sizes::Bool = true)
Given an interval [l, u] with nominal coverage 100×(1-α)% and true value θ, the interval score (Gneiting and Raftery, 2007) is defined as
$$S(l, u, \theta; \alpha) = (u - l) + \frac{2}{\alpha}(l - \theta)\,\mathbb{I}(\theta < l) + \frac{2}{\alpha}(\theta - u)\,\mathbb{I}(\theta > u),$$
where α ∈ (0, 1) and 𝕀(⋅) is the indicator function.
The method that takes a single argument θ̂ assumes that θ̂ is a matrix containing the lower bounds l and the upper bounds u, in that order.
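The interval score for explicit bounds l and u can be sketched as follows. The name interval_score_sketch is hypothetical and the code illustrates only the four-argument form, written in plain Julia rather than taken from the package.

```julia
using Statistics

# Hypothetical helper: interval width plus a penalty of 2/α per unit of miss
function interval_score_sketch(l, u, θ, α; agg = mean)
    S = (u .- l) .+ (2 / α) .* (l .- θ) .* (θ .< l) .+ (2 / α) .* (θ .- u) .* (θ .> u)
    return agg(S)
end

l, u, α = [0.0], [1.0], 0.1

covered = interval_score_sketch(l, u, [0.5], α)    # θ inside: score is the width
above   = interval_score_sketch(l, u, [1.2], α)    # θ exceeds u by 0.2
below   = interval_score_sketch(l, u, [-0.1], α)   # θ falls short of l by 0.1

println((covered, above, below))
```

A covered value contributes only the interval width, while a miss adds 2/α times the distance to the nearest bound, so narrow intervals are rewarded only when they still cover θ.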