
Miscellaneous

Core

These functions can appear during the core workflow, and may need to be overloaded in some applications.

NeuralEstimators.numberreplicates Function
julia
numberreplicates(Z)

Generic function that returns the number of replicates in a given object. Default implementations are provided for commonly used data formats, namely, data stored as an Array or as a GNNGraph.
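For array data, the number of replicates is taken to be the size of the final dimension. A minimal usage sketch (the per-data-set behaviour shown in the comment is an assumption based on the description above):

```julia
using NeuralEstimators

# Two data sets, each with replicates stored along the final dimension
Z = [rand(16, 1, 5), rand(16, 1, 7)]
numberreplicates(Z)  # number of replicates in each data set, here 5 and 7
```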

source
NeuralEstimators.subsetreplicates Function
julia
subsetreplicates(Z::V, i) where {V <: AbstractArray{A}} where {A <: Any}
subsetreplicates(Z::A, i) where {A <: AbstractArray{T, N}} where {T, N}
subsetreplicates(Z::G, i) where {G <: GNNGraph}

Return replicate(s) i from each data set in Z.

If working with data that are not covered by the default methods, overload the function with the appropriate type for Z.

For graphical data, this function calls getgraph(), where the replicates are assumed to be stored as batched graphs. Since this can be slow, consider using a method of train() that does not require the data to be subsetted when working with graphical data (use numberreplicates() to check that the training and validation data sets are equally replicated, which prevents subsetting).

Examples

julia
using NeuralEstimators
using GraphNeuralNetworks
using Flux: batch

d = 1  # dimension of the response variable
n = 4  # number of observations in each realisation
m = 6  # number of replicates in each data set
K = 2  # number of data sets

# Array data
Z = [rand(n, d, m) for k ∈ 1:K]
subsetreplicates(Z, 2)   # extract second replicate from each data set
subsetreplicates(Z, 1:3) # extract first 3 replicates from each data set

# Graphical data
e = 8 # number of edges
Z = [batch([rand_graph(n, e, ndata = rand(d, n)) for _ ∈ 1:m]) for k ∈ 1:K]
subsetreplicates(Z, 2)   # extract second replicate from each data set
subsetreplicates(Z, 1:3) # extract first 3 replicates from each data set
source

Downstream-inference algorithms

NeuralEstimators.EM Type
julia
EM(simulateconditional::Function, MAP::Union{Function, NeuralEstimator}, θ₀ = nothing)

Implements a (Bayesian) Monte Carlo expectation-maximization (EM) algorithm for parameter estimation with missing data. The algorithm iteratively simulates missing data conditional on current parameter estimates, then updates parameters using a (neural) maximum a posteriori (MAP) estimator.

The l-th iteration is given by:

θ⁽ˡ⁾ = argmax_θ ∑ₕ₌₁ᴴ ℓ(θ; Z₁, Z₂⁽ˡʰ⁾) + H log π(θ),

where ℓ(⋅) denotes the complete-data log-likelihood function, Z ≡ (Z₁′, Z₂′)′ denotes the complete data with Z₁ and Z₂ the observed and missing components respectively, Z₂⁽ˡʰ⁾, h = 1, …, H, are simulated from the distribution of Z₂ ∣ Z₁, θ⁽ˡ⁻¹⁾, and π(θ) is the prior density (which can be viewed as a penalty function).

The algorithm monitors convergence by computing:

maxⱼ |θ̄ⱼ⁽ˡ⁾ − θ̄ⱼ⁽ˡ⁻¹⁾| / (|θ̄ⱼ⁽ˡ⁻¹⁾| + ϵ),

where θ̄⁽ˡ⁾ is the average of the parameter estimates from iteration burnin+1 to iteration l, and ϵ denotes machine precision. Convergence is declared when this quantity is less than tol for nconsecutive consecutive iterations (see keyword arguments below).
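As a hypothetical illustration (the function name and exact implementation below are ours, not the package's), the convergence criterion can be sketched as:

```julia
# Relative change in post-burnin averaged parameter estimates;
# machine precision ϵ guards against division by zero
function relative_change(current::Vector, previous::Vector)
    ϵ = eps(eltype(current))
    maximum(abs.(current .- previous) ./ (abs.(previous) .+ ϵ))
end

# Convergence is declared when this falls below tol for nconsecutive iterations
relative_change([1.001, 2.0], [1.0, 2.0]) < 0.01
```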

Fields

  • simulateconditional::Function: Function for simulating missing data conditional on observed data and current parameter estimates. Must have signature:

    julia
    simulateconditional(Z::AbstractArray{Union{Missing, T}}, θ; nsims = 1, kwargs...)

    Returns completed data in the format expected by MAP (e.g., 4D array for CNNs).

  • MAP::NeuralEstimator: MAP estimator applied to completed data.

  • θ₀: Optional initial parameter values (vector). Can also be provided when calling the EM object.

Methods

Once constructed, objects of type EM can be applied to data via the following methods (corresponding to single or multiple data sets, respectively):

(em::EM)(Z::A, θ₀::Union{Nothing, Vector} = nothing; ...) where {A <: AbstractArray{Union{Missing, T}, N}} where {T, N}
(em::EM)(Z::V, θ₀::Union{Nothing, Vector, Matrix} = nothing; ...) where {V <: AbstractVector{A}} where {A <: AbstractArray{Union{Missing, T}, N}} where {T, N}

where Z is the complete data containing the observed data and Missing values.

For multiple data sets, θ₀ can be a vector (same initial values for all data sets) or a matrix with one column per data set.

Keyword Arguments

  • niterations::Integer = 50: Maximum number of EM iterations.

  • nsims::Union{Integer, Vector{<:Integer}} = 1: Number of conditional simulations per iteration. Can be fixed (scalar) or varying (vector of length niterations).

  • burnin::Integer = 1: Number of initial iterations to discard before averaging parameter estimates for convergence assessment.

  • nconsecutive::Integer = 3: Number of consecutive iterations meeting the convergence criterion required to halt.

  • tol = 0.01: Convergence tolerance. Algorithm stops if the relative change in post-burnin averaged parameters is less than tol for nconsecutive iterations.

  • use_gpu::Bool = true: Whether to use a GPU (if available) for MAP estimation.

  • verbose::Bool = false: Whether to print iteration details.

  • kwargs...: Additional arguments passed to simulateconditional.

Returns

For a single data set, returns a named tuple containing:

  • estimate: Final parameter estimate (post-burnin average).

  • iterates: Matrix of all parameter estimates across iterations (each column is one iteration).

  • burnin: The burnin value used.

For multiple data sets, returns a matrix of estimates with one column per data set.

Notes

  • If Z contains no missing values, the MAP estimator is applied directly (after passing through simulateconditional to ensure correct format).

  • When using a GPU, data are moved to the GPU for MAP estimation and back to the CPU for conditional simulation.

Examples

julia
# Below we give a pseudocode example; see the "Missing data" section in "Advanced usage" for a concrete example.

# Define conditional simulation function
function sim_conditional(Z, θ; nsims = 1)
    # Your implementation here
    # Returns completed data in format suitable for MAP estimator
end

# Define or load MAP estimator
MAP_estimator = ... # Neural MAP estimator

# Create EM object
em = EM(sim_conditional, MAP_estimator, θ₀ = [1.0, 2.0])

# Apply to data with missing values
Z = ... # Array with Missing entries
result = em(Z, niterations = 100, nsims = 5, tol = 0.001, verbose = true)

# Access results
θ_final = result.estimate
θ_sequence = result.iterates

# Multiple datasets
Z_list = [Z1, Z2, Z3]
estimates = em(Z_list, θ₀ = [1.0, 2.0])  # Matrix with 3 columns
source

Utility functions

NeuralEstimators.adjacencymatrix Function
julia
adjacencymatrix(S::Matrix, k::Integer)
adjacencymatrix(S::Matrix, r::AbstractFloat)
adjacencymatrix(S::Matrix, r::AbstractFloat, k::Integer; random = true)
adjacencymatrix(M::Matrix; k, r, kwargs...)

Computes a spatially weighted adjacency matrix from spatial locations S based on either the k-nearest neighbours of each location; all nodes within a disc of fixed radius r; or, if both r and k are provided, a subset of k neighbours within a disc of fixed radius r.

If S is a square matrix, it is treated as a distance matrix; otherwise, it should be an n × d matrix, where n is the number of spatial locations and d is the spatial dimension (typically d = 2). In the latter case, the distance metric is taken to be the Euclidean distance.

Two subsampling strategies are implemented when choosing a subset of k neighbours within a disc of fixed radius r. If random = true (default), the neighbours are randomly selected from within the disc. If random = false, a deterministic algorithm is used that aims to preserve the distribution of distances within the neighbourhood set, by choosing those nodes with distances to the central node corresponding to the quantiles of the empirical distribution function of distances within the disc (this in fact yields up to k + 1 neighbours, since both the closest and furthest nodes are always included).

Consistent with the convention in GraphNeuralNetworks.jl, which is based on directed graphs, the neighbours of location i are stored in the column A[:, i], where A is the returned adjacency matrix. Therefore, the number of neighbours for each location is given by collect(mapslices(nnz, A; dims = 1)), and the number of times each node is a neighbour of another node is given by collect(mapslices(nnz, A; dims = 2)).

By convention, we do not consider a location to neighbour itself (i.e., the diagonal elements of the adjacency matrix are zero).
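The column convention and zero-diagonal convention above can be checked directly; a minimal sketch (the choice of n and k here is arbitrary):

```julia
using NeuralEstimators, SparseArrays

n = 100
S = rand(n, 2)
A = adjacencymatrix(S, 5)             # 5 nearest neighbours of each location
collect(mapslices(nnz, A; dims = 1))  # number of neighbours per location (5 each)
all(iszero, A[i, i] for i in 1:n)     # diagonal is zero: no self-neighbours
```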

Examples

julia
using NeuralEstimators, Distances, SparseArrays

n = 250
d = 2
S = rand(Float32, n, d)
k = 10
r = 0.10

# Memory efficient constructors
adjacencymatrix(S, k)
adjacencymatrix(S, r)
adjacencymatrix(S, r, k)
adjacencymatrix(S, r, k; random = false)

# Construct from full distance matrix D
D = pairwise(Euclidean(), S, dims = 1)
adjacencymatrix(D, k)
adjacencymatrix(D, r)
adjacencymatrix(D, r, k)
adjacencymatrix(D, r, k; random = false)
source
NeuralEstimators.containertype Function
julia
containertype(A::Type)
containertype(::Type{A}) where A <: SubArray
containertype(a::A) where A

Returns the container type of its argument.

If given a SubArray, returns the container type of the parent array.

Examples

julia
a = rand(3, 4)
containertype(a)
containertype(typeof(a))
[containertype(x) for x ∈ eachcol(a)]
source
NeuralEstimators.encodedata Function
julia
encodedata(Z::A; c::T = zero(T)) where {A <: AbstractArray{Union{Missing, T}, N}} where {T, N}

For data Z with missing entries, returns an encoded data set (U, W) where U is the original data Z with missing entries replaced by a fixed constant c, and W encodes the missingness pattern as an indicator array equal to one if the corresponding element of Z is observed and zero otherwise.

The behavior depends on the dimensionality of Z. If Z has 1 or 2 dimensions, the indicator array W is concatenated along the first dimension of Z. If Z has more than 2 dimensions, W is concatenated along the second-to-last dimension of Z.

Examples

julia
using NeuralEstimators

Z = rand(16, 16, 1, 1)
Z = removedata(Z, 0.25)	# remove 25% of the data at random
UW = encodedata(Z)
source
NeuralEstimators.expandgrid Function
julia
expandgrid(xs, ys)

Generates a grid of all possible combinations of the elements from two input vectors, xs and ys.

Same as expand.grid() in R, but currently caters for two dimensions only.
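A minimal usage sketch (the expected output shape in the comment is an assumption based on the behaviour of R's expand.grid()):

```julia
using NeuralEstimators

G = expandgrid(1:2, 1:3)  # matrix with 2 × 3 = 6 rows and two columns,
                          # one row per combination of x ∈ 1:2 and y ∈ 1:3
```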

source
NeuralEstimators.maternchols Function
julia
maternchols(D, ρ, ν, σ² = 1; stack = true)

Given a matrix D of distances, constructs the Cholesky factor of the covariance matrix under the Matérn covariance function with range parameter ρ, smoothness parameter ν, and marginal variance σ².

Providing vectors of parameters will yield a three-dimensional array of Cholesky factors (note that the vectors must be of the same length, but a mix of vectors and scalars is allowed). A vector of distance matrices D may also be provided.

If stack = true, the Cholesky factors will be "stacked" into a three-dimensional array (this is only possible if all distance matrices in D are the same size).

Examples

julia
using NeuralEstimators
using LinearAlgebra: norm
n  = 10
S  = rand(n, 2)
D  = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S), sⱼ ∈ eachrow(S)]
ρ  = [0.6, 0.5]
ν  = [0.7, 1.2]
σ² = [0.2, 0.4]
maternchols(D, ρ, ν)
maternchols([D], ρ, ν)
maternchols(D, ρ, ν, σ²; stack = false)

S̃  = rand(n, 2)
D̃  = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S̃), sⱼ ∈ eachrow(S̃)]
maternchols([D, D̃], ρ, ν, σ²)
maternchols([D, D̃], ρ, ν, σ²; stack = false)

S̃  = rand(2n, 2)
D̃  = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S̃), sⱼ ∈ eachrow(S̃)]
maternchols([D, D̃], ρ, ν, σ²; stack = false)
source
NeuralEstimators.removedata Function
julia
removedata(Z::Array, Iᵤ::Vector{T}) where T <: Union{Integer, CartesianIndex}
removedata(Z::Array, p::Union{Float64, Vector{Float64}}; prevent_complete_missing = true)
removedata(Z::Array, n::Integer; fixed_pattern = false, contiguous_pattern = false)

Replaces elements of Z with missing.

The simplest method accepts a vector Iᵤ that specifies the indices of the data to be removed.

Alternatively, there are two methods available to randomly generate missing data.

First, a vector p may be given that specifies the proportion of missingness for each element in the response vector. Hence, p should have length equal to the dimension of the response vector. If a single proportion is given, it will be replicated accordingly. If prevent_complete_missing = true, no replicates will contain 100% missingness (note that this can slightly alter the effective values of p).

Second, if an integer n is provided, all replicates will contain n observations after the data are removed. If fixed_pattern = true, the missingness pattern is fixed for all replicates. If contiguous_pattern = true, the data will be removed in a contiguous block based on a randomly selected starting index.

The return type is Array{Union{T, Missing}}.

Examples

julia
d = 5           # dimension of each replicate
m = 2000        # number of replicates
Z = rand(d, m)  # simulated data

# Passing a desired proportion of missingness
p = rand(d)
removedata(Z, p)

# Passing a desired final sample size
n = 3  # number of observed elements of each replicate: must have n <= d
removedata(Z, n)
source
NeuralEstimators.rowwisenorm Function
julia
rowwisenorm(A)

Computes the row-wise norm of a matrix A.
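A minimal usage sketch, assuming the norm in question is the Euclidean norm of each row:

```julia
using NeuralEstimators

A = [3.0 4.0; 5.0 12.0]
rowwisenorm(A)  # Euclidean norm of each row, here 5 and 13
```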

source
NeuralEstimators.spatialgraph Function
julia
spatialgraph(S)
spatialgraph(S, Z)
spatialgraph(g::GNNGraph, Z)

Given spatial data Z measured at spatial locations S, constructs a GNNGraph ready for use in a graph neural network that employs SpatialGraphConv layers.

When m independent replicates are collected over the same set of n spatial locations, {s₁, …, sₙ} ⊂ 𝒟, where 𝒟 denotes the spatial domain of interest, Z should be given as an n × m matrix and S should be given as an n × d matrix. Otherwise, when the replicates are collected over differing sets of spatial locations, Z should be given as an m-vector of nᵢ-vectors, and S should be given as an m-vector of nᵢ × d matrices.

The spatial information between neighbours is stored as an edge feature, with the specific information controlled by the keyword arguments stationary and isotropic. Specifically, the edge feature between node j and node j′ stores the spatial distance ‖sⱼ′ − sⱼ‖ (if isotropic), the spatial displacement sⱼ′ − sⱼ (if stationary), or the pair of locations (sⱼ′, sⱼ) (if !stationary).

Additional keyword arguments inherit from adjacencymatrix() to determine the neighbourhood of each node, with the default being a randomly selected set of k=30 neighbours within a disc of radius r=0.15 units.

Examples

julia
using NeuralEstimators, GraphNeuralNetworks

# Number of replicates and spatial dimension
m = 5  
d = 2  

# Spatial locations fixed for all replicates
n = 100
S = rand(n, d)
Z = rand(n, m)
g = spatialgraph(S, Z)

# Spatial locations varying between replicates
n = rand(50:100, m)
S = rand.(n, d)
Z = rand.(n)
g = spatialgraph(S, Z)
source
NeuralEstimators.stackarrays Function
julia
stackarrays(v::Vector{<:AbstractArray}; merge::Bool = true)

Stack a vector of arrays v into a single higher-dimensional array.

If all arrays have the same size along the last dimension, stacks along a new final dimension. Then, if merge = true, merges the last two dimensions into one.

Alternatively, if sizes differ along the last dimension, concatenates along the last dimension.

Examples

julia
# Vector containing arrays of the same size:
Z = [rand(2, 3, m) for m ∈ (1, 1)];
stackarrays(Z)
stackarrays(Z, merge = false)

# Vector containing arrays with differing final dimension size:
Z = [rand(2, 3, m) for m ∈ (1, 2)];
stackarrays(Z)
source
NeuralEstimators.vectotril Function
julia
vectotril(v; strict = false)
vectotriu(v; strict = false)

Converts a vector v of length d(d+1)÷2 (a triangular number) into a d × d lower or upper triangular matrix.

If strict = true, the matrix will be strictly lower or upper triangular, that is, a (d+1) × (d+1) triangular matrix with zero diagonal.

Note that the triangular matrix is constructed on the CPU, but the returned matrix will be a GPU array if v is a GPU array. Note also that the returned matrix is not a Triangular type (i.e., the zeros are materialised), since Triangular matrices are not always compatible with other GPU operations.

Examples

julia
using NeuralEstimators

d = 4
n = d*(d+1)÷2
v = collect(range(1, n))
vectotril(v)
vectotriu(v)
vectotril(v; strict = true)
vectotriu(v; strict = true)
source

Model-specific functions

Data simulators

The philosophy of NeuralEstimators is to cater for any model for which simulation is feasible by allowing users to define their model implicitly through simulated data. However, the following functions have been included as they may be helpful to others, and their source code illustrates how a user could formulate code for their own model.

See also Distributions.jl for a range of distributions implemented in Julia, and the package RCall for calling R functions within Julia.

NeuralEstimators.simulategaussian Function
julia
simulategaussian(L::AbstractMatrix, m = 1)

Simulates m independent and identically distributed realisations from a mean-zero multivariate Gaussian random vector with associated lower Cholesky factor L.

If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, n; otherwise, the data are returned as an n × m matrix.

Examples

julia
using NeuralEstimators, Distances, LinearAlgebra

n = 500
ρ = 0.6
ν = 1.0
S = rand(n, 2)
D = pairwise(Euclidean(), S, dims = 1)
Σ = Symmetric(matern.(D, ρ, ν))
L = cholesky(Σ).L
simulategaussian(L)
source
NeuralEstimators.simulatepotts Function
julia
simulatepotts(grid::Matrix{Int}, β)
simulatepotts(grid::Matrix{Union{Int, Nothing}}, β)
simulatepotts(nrows::Int, ncols::Int, num_states::Int, β)

Chequerboard Gibbs sampling from a spatial Potts model with parameter β>0 (see, e.g., Sainsbury-Dale et al., 2025, Sec. 3.3, and the references therein).

Approximately independent simulations can be obtained by setting nsims > 1 or num_iterations > burn. The degree to which the resulting simulations can be considered independent depends on the thinning factor (thin) and the burn-in (burn).

Keyword arguments

  • nsims = 1: number of approximately independent replicates.

  • num_iterations = 2000: number of MCMC iterations.

  • burn = num_iterations: burn-in.

  • thin = 10: thinning factor.

Examples

julia
using NeuralEstimators 

## Marginal simulation 
β = 0.8
simulatepotts(10, 10, 3, β)

## Marginal simulation: approximately independent samples 
simulatepotts(10, 10, 3, β; nsims = 100, thin = 10)

## Conditional simulation 
β = 0.8
complete_grid   = simulatepotts(100, 100, 3, β)      # simulate marginally 
incomplete_grid = removedata(complete_grid, 0.1)     # randomly remove 10% of the pixels 
imputed_grid    = simulatepotts(incomplete_grid, β)  # conditionally simulate over missing pixels

## Multiple conditional simulations 
imputed_grids   = simulatepotts(incomplete_grid, β; num_iterations = 2000, burn = 1000, thin = 10)

## Recreate Fig. 8.8 of Marin & Robert (2007) “Bayesian Core”
using Plots 
grids = [simulatepotts(100, 100, 2, β) for β ∈ 0.3:0.1:1.2]
heatmaps = heatmap.(grids, legend = false, aspect_ratio=1)
Plots.plot(heatmaps...)
source
NeuralEstimators.simulateschlather Function
julia
simulateschlather(L::Matrix, m = 1; C = 3.5, Gumbel::Bool = false)

Simulates m independent and identically distributed realisations from Schlather's (2002) max-stable model given the lower Cholesky factor L of the covariance matrix of the underlying Gaussian process.

The function uses the algorithm for approximate simulation given by Schlather (2002).

If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, n; otherwise, the data are returned as an n × m matrix.

Keyword arguments

  • C = 3.5: a tuning parameter that controls the accuracy of the algorithm. Small C favours computational efficiency, while large C favours accuracy.

  • Gumbel = false: flag indicating whether the data should be log-transformed from the unit Fréchet scale to the Gumbel scale.

Examples

julia
using NeuralEstimators, Distances, LinearAlgebra

n = 500
ρ = 0.6
ν = 1.0
S = rand(n, 2)
D = pairwise(Euclidean(), S, dims = 1)
Σ = Symmetric(matern.(D, ρ, ν))
L = cholesky(Σ).L
simulateschlather(L)
source

Spatial point processes

NeuralEstimators.maternclusterprocess Function
julia
maternclusterprocess(; λ=10, μ=10, r=0.1, xmin=0, xmax=1, ymin=0, ymax=1, unit_bounding_box=false)

Generates a realisation from a Matérn cluster process (e.g., Baddeley et al., 2015, Ch. 12).

The process is defined by a parent homogeneous Poisson point process with intensity λ > 0, a mean number of daughter points μ > 0, and a cluster radius r > 0. The simulation is performed over a rectangular window defined by [xmin, xmax] × [ymin, ymax].

If unit_bounding_box = true, the simulated points will be scaled so that the longest side of their bounding box is equal to one (this may change the simulation window).

See also the R package spatstat, which provides functions for simulating from a range of point processes and which can be interfaced from Julia using RCall.

Examples

julia
using NeuralEstimators

# Simulate a realisation from a Matérn cluster process
S = maternclusterprocess()

# Visualise realisation (requires UnicodePlots)
using UnicodePlots
scatterplot(S[:, 1], S[:, 2])

# Visualise realisations from the cluster process with varying parameters
n = 250
λ = [10, 25, 50, 90]
μ = n ./ λ
plots = map(eachindex(λ)) do i
	S = maternclusterprocess(λ = λ[i], μ = μ[i])
	scatterplot(S[:, 1], S[:, 2])
end
source

Covariance functions

These covariance functions may be of use for various models.

NeuralEstimators.matern Function
julia
matern(h, ρ, ν, σ² = 1)

Given a distance h, computes the Matérn covariance function

C(h) = σ² (2^(1−ν) / Γ(ν)) (h/ρ)^ν K_ν(h/ρ),

where ρ is a range parameter, ν is a smoothness parameter, σ² is the marginal variance, Γ(⋅) is the gamma function, and K_ν(⋅) is the modified Bessel function of the second kind of order ν.
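A minimal usage sketch (the parameter values below are arbitrary):

```julia
using NeuralEstimators

ρ = 0.6  # range parameter
ν = 1.0  # smoothness parameter
matern(0.3, ρ, ν)        # covariance at distance h = 0.3 with unit variance
matern(0.3, ρ, ν, 2.0)   # covariance with marginal variance σ² = 2
```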

source
NeuralEstimators.paciorek Function
julia
paciorek(s, r, ω₁, ω₂, ρ, β)

Given spatial locations s and r, computes the nonstationary covariance function

C(s, r) = |Σ(s)|^(1/4) |Σ(r)|^(1/4) |(Σ(s) + Σ(r))/2|^(−1/2) C⁰(√Q(s, r)),

where C⁰(⋅) is an isotropic correlation function with range parameter ρ, the matrix Σ(s) is a kernel matrix (Paciorek and Schervish, 2006) with scale parameter β and reference point (ω₁, ω₂)′, and

Q(s, r) = (s − r)′ ((Σ(s) + Σ(r))/2)⁻¹ (s − r)

is the squared Mahalanobis distance between s and r.

Note that, in practical applications, the reference point is often taken to be an estimable parameter rather than fixed and known.
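A minimal usage sketch (the argument values below are arbitrary):

```julia
using NeuralEstimators

s = [0.2, 0.4]  # first spatial location
r = [0.6, 0.8]  # second spatial location
paciorek(s, r, 0.5, 0.5, 1.0, 0.3)  # ω₁ = ω₂ = 0.5, ρ = 1.0, β = 0.3
```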

source

Density functions

Density functions are not needed in the workflow of NeuralEstimators. However, as part of a series of comparison studies between neural estimators and likelihood-based estimators given in various papers, we have developed the following functions for evaluating the density function of several popular distributions. We include these in NeuralEstimators in case they are of use in future comparison studies.

NeuralEstimators.gaussiandensity Function
julia
gaussiandensity(Z::V, L::LT) where {V <: AbstractVector, LT <: LowerTriangular}
gaussiandensity(Z::A, L::LT) where {A <: AbstractArray, LT <: LowerTriangular}
gaussiandensity(Z::A, Σ::M) where {A <: AbstractArray, M <: AbstractMatrix}

Efficiently computes the density function for Z ~ 𝑁(0, Σ), namely,

f(Z; Σ) = |2πΣ|^(−1/2) exp(−Z′Σ⁻¹Z / 2),

for covariance matrix Σ, where L is the lower Cholesky factor of Σ.

The method gaussiandensity(Z::A, L::LT) assumes that the last dimension of Z contains independent and identically distributed replicates.

If logdensity = true (default), the log-density is returned.
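A minimal usage sketch (the identity covariance matrix is chosen purely for illustration):

```julia
using NeuralEstimators, LinearAlgebra

n = 5
Z = randn(n)
Σ = Matrix{Float64}(I, n, n)  # identity covariance matrix
L = cholesky(Σ).L             # lower Cholesky factor of Σ
gaussiandensity(Z, L)         # log-density by default
gaussiandensity(Z, Σ)         # equivalent call from the covariance matrix
```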

source
NeuralEstimators.schlatherbivariatedensity Function
julia
schlatherbivariatedensity(z₁, z₂, ψ₁₂; logdensity = true)

The bivariate density function (see, e.g., Sainsbury-Dale et al., 2024, Sec. S6.2) for Schlather's (2002) max-stable model, where ψ₁₂ denotes the spatial correlation function evaluated at the locations of observations z₁ and z₂.

source