Model-specific functions

Data simulators

The philosophy of NeuralEstimators is to cater for any model for which simulation is feasible, by allowing users to define their model implicitly through simulated data. However, the following functions are included because they may be helpful to users, and because their source code illustrates how one could write simulation code for their own model.

See also Distributions.jl for a range of distributions implemented in Julia, and the package RCall for calling R functions within Julia.

NeuralEstimators.simulategaussian — Function
simulategaussian(L::AbstractMatrix, m = 1)

Simulates m independent and identically distributed realisations from a mean-zero multivariate Gaussian random vector with associated lower Cholesky factor L.

If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, $n$; otherwise, the data are returned as an $n \times m$ matrix.

Examples

using NeuralEstimators, Distances, LinearAlgebra

n = 500                                  # number of spatial locations
ρ = 0.6                                  # range parameter
ν = 1.0                                  # smoothness parameter
S = rand(n, 2)                           # spatial locations
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, ρ, ν))          # Matérn covariance matrix
L = cholesky(Σ).L                        # lower Cholesky factor
simulategaussian(L)
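
To simulate multiple replicates, pass m explicitly; for example, continuing from the example above,

simulategaussian(L, 5)   # n × 5 matrix of five independent realisations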

NeuralEstimators.simulatepotts — Function
simulatepotts(grid::Matrix{Int}, β)
simulatepotts(grid::Matrix{Union{Int, Nothing}}, β)
simulatepotts(nrows::Int, ncols::Int, num_states::Int, β)

Chequerboard Gibbs sampling from a spatial Potts model with parameter β > 0 (see, e.g., Sainsbury-Dale et al., 2025, Sec. 3.3, and the references therein).

Approximately independent simulations can be obtained by setting nsims > 1 or num_iterations > burn. The degree to which the resulting simulations can be considered independent depends on the thinning factor (thin) and the burn-in (burn).

Keyword arguments

  • nsims = 1: number of approximately independent replicates.
  • num_iterations = 2000: number of MCMC iterations.
  • burn = num_iterations: burn-in.
  • thin = 10: thinning factor.

Examples

using NeuralEstimators 

## Marginal simulation 
β = 0.8
simulatepotts(10, 10, 5, β)

## Marginal simulation: approximately independent samples 
simulatepotts(10, 10, 5, β; nsims = 100, thin = 10)

## Conditional simulation 
β = 0.8
complete_grid   = simulatepotts(50, 50, 2, β)        # simulate marginally from the Ising model 
incomplete_grid = removedata(complete_grid, 0.1)     # remove 10% of the pixels at random  
imputed_grid    = simulatepotts(incomplete_grid, β)  # conditionally simulate over missing pixels

## Multiple conditional simulations 
imputed_grids   = simulatepotts(incomplete_grid, β; num_iterations = 2000, burn = 1000, thin = 10)

## Recreate Fig. 8.8 of Marin & Robert (2007) “Bayesian Core”
using Plots 
grids = [simulatepotts(100, 100, 2, β) for β ∈ 0.3:0.1:1.2]
heatmaps = heatmap.(grids, legend = false, aspect_ratio = 1)
Plots.plot(heatmaps...)

NeuralEstimators.simulateschlather — Function
simulateschlather(L::Matrix, m = 1; C = 3.5, Gumbel::Bool = false)

Simulates m independent and identically distributed realisations from Schlather's (2002) max-stable model given the lower Cholesky factor L of the covariance matrix of the underlying Gaussian process.

The function uses the algorithm for approximate simulation given by Schlather (2002).

If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, $n$; otherwise, the data are returned as an $n \times m$ matrix.

Keyword arguments

  • C = 3.5: a tuning parameter that controls the accuracy of the algorithm. Small C favours computational efficiency, while large C favours accuracy.
  • Gumbel = false: flag indicating whether the data should be log-transformed from the unit Fréchet scale to the Gumbel scale.

Examples

using NeuralEstimators, Distances, LinearAlgebra

n = 500                                  # number of spatial locations
ρ = 0.6                                  # range parameter
ν = 1.0                                  # smoothness parameter
S = rand(n, 2)                           # spatial locations
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, ρ, ν))          # covariance matrix of the underlying Gaussian process
L = cholesky(Σ).L                        # lower Cholesky factor
simulateschlather(L)
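
The keyword arguments can be supplied similarly; for example, continuing from the example above, the following returns five replicates on the Gumbel scale,

simulateschlather(L, 5; Gumbel = true)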

Spatial point processes

NeuralEstimators.maternclusterprocess — Function
maternclusterprocess(; λ=10, μ=10, r=0.1, xmin=0, xmax=1, ymin=0, ymax=1, unit_bounding_box=false)

Generates a realisation from a Matérn cluster process (e.g., Baddeley et al., 2015, Ch. 12).

The process is defined by a parent homogeneous Poisson point process with intensity λ > 0, a mean number of daughter points per parent μ > 0, and a cluster radius r > 0. The simulation is performed over a rectangular window defined by [xmin, xmax] × [ymin, ymax].

If unit_bounding_box = true, the simulated points will be scaled so that the longest side of their bounding box is equal to one (this may change the simulation window).

See also the R package spatstat, which provides functions for simulating from a range of point processes and which can be interfaced from Julia using RCall.

Examples

using NeuralEstimators

# Simulate a realisation from a Matérn cluster process
S = maternclusterprocess()

# Visualise realisation (requires UnicodePlots)
using UnicodePlots
scatterplot(S[:, 1], S[:, 2])

# Visualise realisations from the cluster process with varying parameters
n = 250
λ = [10, 25, 50, 90]
μ = n ./ λ
plots = map(eachindex(λ)) do i
	S = maternclusterprocess(λ = λ[i], μ = μ[i])
	scatterplot(S[:, 1], S[:, 2])
end
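
The following sketch illustrates the unit_bounding_box option described above (the parameter values here are arbitrary):

S = maternclusterprocess(λ = 25, μ = 10, unit_bounding_box = true)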

Covariance functions

The following covariance functions may be of use when constructing spatial models, such as those simulated by simulategaussian and simulateschlather above.

NeuralEstimators.matern — Function
matern(h, ρ, ν, σ² = 1)

Given distance $\|\boldsymbol{h}\|$ (h), computes the Matérn covariance function

\[C(\|\boldsymbol{h}\|) = \sigma^2 \frac{2^{1 - \nu}}{\Gamma(\nu)} \left(\frac{\|\boldsymbol{h}\|}{\rho}\right)^\nu K_\nu \left(\frac{\|\boldsymbol{h}\|}{\rho}\right),\]

where ρ is a range parameter, ν is a smoothness parameter, σ² is the marginal variance, $\Gamma(\cdot)$ is the gamma function, and $K_\nu(\cdot)$ is the modified Bessel function of the second kind of order $\nu$.
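
Examples

As a brief sketch (mirroring the simulation examples above), matern can be evaluated at a single distance or applied elementwise to a distance matrix to construct a covariance matrix:

using NeuralEstimators, Distances, LinearAlgebra

matern(0.3, 0.6, 1.0)        # covariance at distance 0.3, with range 0.6 and smoothness 1.0
matern(0.3, 0.6, 1.0, 2.0)   # as above, but with marginal variance 2.0

# Covariance matrix for randomly sampled spatial locations
S = rand(10, 2)                          # spatial locations
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, 0.6, 1.0))      # covariance matrix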

NeuralEstimators.paciorek — Function
paciorek(s, r, ω₁, ω₂, ρ, β)

Given spatial locations s and r, computes the nonstationary covariance function

\[C(\boldsymbol{s}, \boldsymbol{r}) = |\boldsymbol{\Sigma}(\boldsymbol{s})|^{1/4} |\boldsymbol{\Sigma}(\boldsymbol{r})|^{1/4} \left|\frac{\boldsymbol{\Sigma}(\boldsymbol{s}) + \boldsymbol{\Sigma}(\boldsymbol{r})}{2}\right|^{-1/2} C^0\big(\sqrt{Q(\boldsymbol{s}, \boldsymbol{r})}\big), \]

where $C^0(h) = \exp\{-(h/\rho)^{3/2}\}$ for range parameter $\rho > 0$, the matrix $\boldsymbol{\Sigma}(\boldsymbol{s}) = \exp(\beta\|\boldsymbol{s} - \boldsymbol{\omega}\|)\boldsymbol{I}$ is a kernel matrix (Paciorek and Schervish, 2006) with scale parameter $\beta > 0$ and reference point $\boldsymbol{\omega} \equiv (\omega_1, \omega_2)' \in \mathbb{R}^2$, and

\[Q(\boldsymbol{s}, \boldsymbol{r}) = (\boldsymbol{s} - \boldsymbol{r})' \left(\frac{\boldsymbol{\Sigma}(\boldsymbol{s}) + \boldsymbol{\Sigma}(\boldsymbol{r})}{2}\right)^{-1} (\boldsymbol{s} - \boldsymbol{r})\]

is the squared Mahalanobis distance between $\boldsymbol{s}$ and $\boldsymbol{r}$.

Note that, in practical applications, the reference point $\boldsymbol{\omega}$ is often taken to be an estimable parameter rather than fixed and known.
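
Examples

A minimal sketch, assuming that the spatial locations s and r are supplied as length-two vectors (the parameter values here are arbitrary):

using NeuralEstimators

s = [0.2, 0.3]        # first spatial location
r = [0.8, 0.6]        # second spatial location
ω₁ = 0.5; ω₂ = 0.5    # reference point
ρ = 0.3               # range parameter
β = 0.5               # scale parameter
paciorek(s, r, ω₁, ω₂, ρ, β)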


Density functions

Density functions are not needed in the workflow of NeuralEstimators. However, as part of a series of comparison studies between neural estimators and likelihood-based estimators given in various papers, we have developed the following functions for evaluating the density function of several popular distributions. We include these in NeuralEstimators in case they are of use in future comparison studies.

NeuralEstimators.gaussiandensity — Function
gaussiandensity(Z::V, L::LT) where {V <: AbstractVector, LT <: LowerTriangular}
gaussiandensity(Z::A, L::LT) where {A <: AbstractArray, LT <: LowerTriangular}
gaussiandensity(Z::A, Σ::M) where {A <: AbstractArray, M <: AbstractMatrix}

Efficiently computes the density function for Z ~ 𝑁(0, Σ), namely,

\[|2\pi\boldsymbol{\Sigma}|^{-1/2} \exp\{-\frac{1}{2}\boldsymbol{Z}^\top \boldsymbol{\Sigma}^{-1}\boldsymbol{Z}\},\]

for covariance matrix Σ, where L is the lower Cholesky factor of Σ.

The method gaussiandensity(Z::A, L::LT) assumes that the last dimension of Z contains independent and identically distributed replicates.

If logdensity = true (default), the log-density is returned.
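
Examples

A minimal usage sketch, reusing the Matérn-based construction from the simulation examples above:

using NeuralEstimators, Distances, LinearAlgebra

n = 500                                  # number of spatial locations
S = rand(n, 2)                           # spatial locations
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, 0.6, 1.0))      # covariance matrix
L = cholesky(Σ).L                        # lower Cholesky factor
Z = simulategaussian(L, 5)               # n × 5 matrix of five iid replicates
gaussiandensity(Z, L)                    # log-density (the default, logdensity = true)
gaussiandensity(Z, Σ)                    # method based on the covariance matrix directly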
