Miscellaneous
Core
These functions can appear during the core workflow, and may need to be overloaded in some applications.
NeuralEstimators.numberreplicates Function
numberreplicates(Z)Generic function that returns the number of replicates in a given object. Default implementations are provided for commonly used data formats, namely, data stored as an Array or as a GNNGraph.
NeuralEstimators.subsetreplicates Function
subsetreplicates(Z::V, i) where {V <: AbstractArray{A}} where {A <: Any}
subsetreplicates(Z::A, i) where {A <: AbstractArray{T, N}} where {T, N}
subsetreplicates(Z::G, i) where {G <: GNNGraph}Return replicate(s) i from each data set in Z.
If working with data that are not covered by the default methods, overload the function with the appropriate type for Z.
For graphical data, calls getgraph(), where the replicates are assumed be to stored as batched graphs. Since this can be slow, one should consider using a method of train() that does not require the data to be subsetted when working with graphical data (use numberreplicates() to check that the training and validation data sets are equally replicated, which prevents subsetting).
Examples
using NeuralEstimators
using GraphNeuralNetworks
using Flux: batch
d = 1 # dimension of the response variable
n = 4 # number of observations in each realisation
m = 6 # number of replicates in each data set
K = 2 # number of data sets
# Array data
Z = [rand(n, d, m) for k ∈ 1:K]
subsetreplicates(Z, 2) # extract second replicate from each data set
subsetreplicates(Z, 1:3) # extract first 3 replicates from each data set
# Graphical data
e = 8 # number of edges
Z = [batch([rand_graph(n, e, ndata = rand(d, n)) for _ ∈ 1:m]) for k ∈ 1:K]
subsetreplicates(Z, 2) # extract second replicate from each data set
subsetreplicates(Z, 1:3) # extract first 3 replicates from each data setDownstream-inference algorithms
NeuralEstimators.EM Type
EM(simulateconditional::Function, MAP::Union{Function, NeuralEstimator}, θ₀ = nothing)Implements a (Bayesian) Monte Carlo expectation-maximization (EM) algorithm for parameter estimation with missing data. The algorithm iteratively simulates missing data conditional on current parameter estimates, then updates parameters using a (neural) maximum a posteriori (MAP) estimator.
The
where
The algorithm monitors convergence by computing:
where burnin+1 to iteration tol for nconsecutive consecutive iterations (see keyword arguments below).
Fields
simulateconditional::Function: Function for simulating missing data conditional on observed data and current parameter estimates. Must have signature:juliasimulateconditional(Z::AbstractArray{Union{Missing, T}}, θ; nsims = 1, kwargs...)Returns completed data in the format expected by
MAP(e.g., 4D array for CNNs).MAP::NeuralEstimator: MAP estimator applied to completed data.θ₀: Optional initial parameter values (vector). Can also be provided when calling theEMobject.
Methods
Once constructed, objects of type EM can be applied to data via the following methods (corresponding to single or multiple data sets, respectively):
(em::EM)(Z::A, θ₀::Union{Nothing, Vector} = nothing; ...) where {A <: AbstractArray{Union{Missing, T}, N}} where {T, N}
(em::EM)(Z::V, θ₀::Union{Nothing, Vector, Matrix} = nothing; ...) where {V <: AbstractVector{A}} where {A <: AbstractArray{Union{Missing, T}, N}} where {T, N}where Z is the complete data containing the observed data and Missing values.
For multiple datasets, θ₀ can be a vector (same initial values for all) or a matrix with one column per dataset.
Keyword Arguments
niterations::Integer = 50: Maximum number of EM iterations.nsims::Union{Integer, Vector{<:Integer}} = 1: Number of conditional simulations per iteration. Can be fixed (scalar) or varying (vector of lengthniterations).burnin::Integer = 1: Number of initial iterations to discard before averaging parameter estimates for convergence assessment.nconsecutive::Integer = 3: Number of consecutive iterations meeting the convergence criterion required to halt.tol = 0.01: Convergence tolerance. Algorithm stops if the relative change in post-burnin averaged parameters is less thantolfornconsecutiveiterations.use_gpu::Bool = true: Whether to use a GPU (if available) for MAP estimation.verbose::Bool = false: Whether to print iteration details.kwargs...: Additional arguments passed tosimulateconditional.
Returns
For a single data set, returns a named tuple containing:
estimate: Final parameter estimate (post-burnin average).iterates: Matrix of all parameter estimates across iterations (each column is one iteration).burnin: The burnin value used.
For multiple data set, returns a matrix with one column per dataset.
Notes
If
Zcontains no missing values, the MAP estimator is applied directly (after passing throughsimulateconditionalto ensure correct format).When using a GPU, data are moved to the GPU for MAP estimation and back to the CPU for conditional simulation.
Examples
# Below we give a pseudocode example; see the "Missing data" section in "Advanced usage" for a concrete example.
# Define conditional simulation function
function sim_conditional(Z, θ; nsims = 1)
# Your implementation here
# Returns completed data in format suitable for MAP estimator
end
# Define or load MAP estimator
MAP_estimator = ... # Neural MAP estimator
# Create EM object
em = EM(sim_conditional, MAP_estimator, θ₀ = [1.0, 2.0])
# Apply to data with missing values
Z = ... # Array with Missing entries
result = em(Z, niterations = 100, nsims = 5, tol = 0.001, verbose = true)
# Access results
θ_final = result.estimate
θ_sequence = result.iterates
# Multiple datasets
Z_list = [Z1, Z2, Z3]
estimates = em(Z_list, θ₀ = [1.0, 2.0]) # Matrix with 3 columnsUtility functions
NeuralEstimators.adjacencymatrix Function
adjacencymatrix(S::Matrix, k::Integer)
adjacencymatrix(S::Matrix, r::AbstractFloat)
adjacencymatrix(S::Matrix, r::AbstractFloat, k::Integer; random = true)
adjacencymatrix(M::Matrix; k, r, kwargs...)Computes a spatially weighted adjacency matrix from spatial locations S based on either the k-nearest neighbours of each location; all nodes within a disc of fixed radius r; or, if both r and k are provided, a subset of k neighbours within a disc of fixed radius r.
If S is a square matrix, it is treated as a distance matrix; otherwise, it should be an
Two subsampling strategies are implemented when choosing a subset of k neighbours within a disc of fixed radius r. If random=true (default), the neighbours are randomly selected from within the disc. If random=false, a deterministic algorithm is used that aims to preserve the distribution of distances within the neighbourhood set, by choosing those nodes with distances to the central node corresponding to the
By convention with the functionality in GraphNeuralNetworks.jl which is based on directed graphs, the neighbours of location i are stored in the column A[:, i] where A is the returned adjacency matrix. Therefore, the number of neighbours for each location is given by collect(mapslices(nnz, A; dims = 1)), and the number of times each node is a neighbour of another node is given by collect(mapslices(nnz, A; dims = 2)).
By convention, we do not consider a location to neighbour itself (i.e., the diagonal elements of the adjacency matrix are zero).
Examples
using NeuralEstimators, Distances, SparseArrays
n = 250
d = 2
S = rand(Float32, n, d)
k = 10
r = 0.10
# Memory efficient constructors
adjacencymatrix(S, k)
adjacencymatrix(S, r)
adjacencymatrix(S, r, k)
adjacencymatrix(S, r, k; random = false)
# Construct from full distance matrix D
D = pairwise(Euclidean(), S, dims = 1)
adjacencymatrix(D, k)
adjacencymatrix(D, r)
adjacencymatrix(D, r, k)
adjacencymatrix(D, r, k; random = false)NeuralEstimators.containertype Function
containertype(A::Type)
containertype(::Type{A}) where A <: SubArray
containertype(a::A) where AReturns the container type of its argument.
If given a SubArray, returns the container type of the parent array.
Examples
a = rand(3, 4)
containertype(a)
containertype(typeof(a))
[containertype(x) for x ∈ eachcol(a)]NeuralEstimators.encodedata Function
encodedata(Z::A; c::T = zero(T)) where {A <: AbstractArray{Union{Missing, T}, N}} where T, NFor data Z with missing entries, returns an encoded data set (U, W) where U is the original data Z with missing entries replaced by a fixed constant c, and W encodes the missingness pattern as an indicator array equal to one if the corresponding element of Z is observed and zero otherwise.
The behavior depends on the dimensionality of Z. If Z has 1 or 2 dimensions, the indicator array W is concatenated along the first dimension of Z. If Z has more than 2 dimensions, W is concatenated along the second-to-last dimension of Z.
Examples
using NeuralEstimators
Z = rand(16, 16, 1, 1)
Z = removedata(Z, 0.25) # remove 25% of the data at random
UW = encodedata(Z)NeuralEstimators.expandgrid Function
expandgrid(xs, ys)Generates a grid of all possible combinations of the elements from two input vectors, xs and ys.
Same as expand.grid() in R, but currently caters for two dimensions only.
NeuralEstimators.maternchols Function
maternchols(D, ρ, ν, σ² = 1; stack = true)Given a matrix D of distances, constructs the Cholesky factor of the covariance matrix under the Matérn covariance function with range parameter ρ, smoothness parameter ν, and marginal variance σ².
Providing vectors of parameters will yield a three-dimensional array of Cholesky factors (note that the vectors must of the same length, but a mix of vectors and scalars is allowed). A vector of distance matrices D may also be provided.
If stack = true, the Cholesky factors will be "stacked" into a three-dimensional array (this is only possible if all distance matrices in D are the same size).
Examples
using NeuralEstimators
using LinearAlgebra: norm
n = 10
S = rand(n, 2)
D = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S), sⱼ ∈ eachrow(S)]
ρ = [0.6, 0.5]
ν = [0.7, 1.2]
σ² = [0.2, 0.4]
maternchols(D, ρ, ν)
maternchols([D], ρ, ν)
maternchols(D, ρ, ν, σ²; stack = false)
S̃ = rand(n, 2)
D̃ = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S̃), sⱼ ∈ eachrow(S̃)]
maternchols([D, D̃], ρ, ν, σ²)
maternchols([D, D̃], ρ, ν, σ²; stack = false)
S̃ = rand(2n, 2)
D̃ = [norm(sᵢ - sⱼ) for sᵢ ∈ eachrow(S̃), sⱼ ∈ eachrow(S̃)]
maternchols([D, D̃], ρ, ν, σ²; stack = false)NeuralEstimators.removedata Function
removedata(Z::Array, Iᵤ::Vector{T}) where T <: Union{Integer, CartesianIndex}
removedata(Z::Array, p::Union{Float, Vector{Float}}; prevent_complete_missing = true)
removedata(Z::Array, n::Integer; fixed_pattern = false, contiguous_pattern = false)Replaces elements of Z with missing.
The simplest method accepts a vector Iᵤ that specifes the indices of the data to be removed.
Alternatively, there are two methods available to randomly generate missing data.
First, a vector p may be given that specifies the proportion of missingness for each element in the response vector. Hence, p should have length equal to the dimension of the response vector. If a single proportion is given, it will be replicated accordingly. If prevent_complete_missing = true, no replicates will contain 100% missingness (note that this can slightly alter the effective values of p).
Second, if an integer n is provided, all replicates will contain n observations after the data are removed. If fixed_pattern = true, the missingness pattern is fixed for all replicates. If contiguous_pattern = true, the data will be removed in a contiguous block based on a randomly selected starting index.
The return type is Array{Union{T, Missing}}.
Examples
d = 5 # dimension of each replicate
m = 2000 # number of replicates
Z = rand(d, m) # simulated data
# Passing a desired proportion of missingness
p = rand(d)
removedata(Z, p)
# Passing a desired final sample size
n = 3 # number of observed elements of each replicate: must have n <= d
removedata(Z, n)NeuralEstimators.rowwisenorm Function
rowwisenorm(A)Computes the row-wise norm of a matrix A.
NeuralEstimators.spatialgraph Function
spatialgraph(S)
spatialgraph(S, Z)
spatialgraph(g::GNNGraph, Z)Given spatial data Z measured at spatial locations S, constructs a GNNGraph ready for use in a graph neural network that employs SpatialGraphConv layers.
When
where Z should be given as an S should be given as an
Z should be given as an S should be given as an
The spatial information between neighbours is stored as an edge feature, with the specific information controlled by the keyword arguments stationary and isotropic. Specifically, the edge feature between node isotropic), the spatial displacement stationary), or the matrix of locations !stationary).
Additional keyword arguments inherit from adjacencymatrix() to determine the neighbourhood of each node, with the default being a randomly selected set of k=30 neighbours within a disc of radius r=0.15 units.
Examples
using NeuralEstimators, GraphNeuralNetworks
# Number of replicates and spatial dimension
m = 5
d = 2
# Spatial locations fixed for all replicates
n = 100
S = rand(n, d)
Z = rand(n, m)
g = spatialgraph(S, Z)
# Spatial locations varying between replicates
n = rand(50:100, m)
S = rand.(n, d)
Z = rand.(n)
g = spatialgraph(S, Z)NeuralEstimators.stackarrays Function
stackarrays(v::Vector{<:AbstractArray}; merge::Bool = true)Stack a vector of arrays v into a single higher-dimensional array.
If all arrays have the same size along the last dimension, stacks along a new final dimension. Then, if merge = true, merges the last two dimensions into one.
Alternatively, if sizes differ along the last dimension, concatenates along the last dimension.
Examples
# Vector containing arrays of the same size:
Z = [rand(2, 3, m) for m ∈ (1, 1)];
stackarrays(Z)
stackarrays(Z, merge = false)
# Vector containing arrays with differing final dimension size:
Z = [rand(2, 3, m) for m ∈ (1, 2)];
stackarrays(Z)NeuralEstimators.vectotril Function
vectotril(v; strict = false)
vectotriu(v; strict = false)Converts a vector v of length
If strict = true, the matrix will be strictly lower or upper triangular, that is, a
Note that the triangular matrix is constructed on the CPU, but the returned matrix will be a GPU array if v is a GPU array. Note also that the return type is not of type Triangular matrix (i.e., the zeros are materialised) since Triangular matrices are not always compatible with other GPU operations.
Examples
using NeuralEstimators
d = 4
n = d*(d+1)÷2
v = collect(range(1, n))
vectotril(v)
vectotriu(v)
vectotril(v; strict = true)
vectotriu(v; strict = true)Model-specific functions
Data simulators
The philosophy of NeuralEstimators is to cater for any model for which simulation is feasible by allowing users to define their model implicitly through simulated data. However, the following functions have been included as they may be helpful to others, and their source code illustrates how a user could formulate code for their own model.
See also Distributions.jl for a range of distributions implemented in Julia, and the package RCall for calling R functions within Julia.
NeuralEstimators.simulategaussian Function
simulategaussian(L::AbstractMatrix, m = 1)Simulates m independent and identically distributed realisations from a mean-zero multivariate Gaussian random vector with associated lower Cholesky factor L.
If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, m matrix.
Examples
using NeuralEstimators, Distances, LinearAlgebra
n = 500
ρ = 0.6
ν = 1.0
S = rand(n, 2)
D = pairwise(Euclidean(), S, dims = 1)
Σ = Symmetric(matern.(D, ρ, ν))
L = cholesky(Σ).L
simulategaussian(L)NeuralEstimators.simulatepotts Function
simulatepotts(grid::Matrix{Int}, β)
simulatepotts(grid::Matrix{Union{Int, Nothing}}, β)
simulatepotts(nrows::Int, ncols::Int, num_states::Int, β)Chequerboard Gibbs sampling from a spatial Potts model with parameter β>0 (see, e.g., Sainsbury-Dale et al., 2025, Sec. 3.3, and the references therein).
Approximately independent simulations can be obtained by setting nsims > 1 or num_iterations > burn. The degree to which the resulting simulations can be considered independent depends on the thinning factor (thin) and the burn-in (burn).
Keyword arguments
nsims = 1: number of approximately independent replicates.num_iterations = 2000: number of MCMC iterations.burn = num_iterations: burn-in.thin = 10: thinning factor.
Examples
using NeuralEstimators
## Marginal simulation
β = 0.8
simulatepotts(10, 10, 3, β)
## Marginal simulation: approximately independent samples
simulatepotts(10, 10, 3, β; nsims = 100, thin = 10)
## Conditional simulation
β = 0.8
complete_grid = simulatepotts(100, 100, 3, β) # simulate marginally
incomplete_grid = removedata(complete_grid, 0.1) # randomly remove 10% of the pixels
imputed_grid = simulatepotts(incomplete_grid, β) # conditionally simulate over missing pixels
## Multiple conditional simulations
imputed_grids = simulatepotts(incomplete_grid, β; num_iterations = 2000, burn = 1000, thin = 10)
## Recreate Fig. 8.8 of Marin & Robert (2007) “Bayesian Core”
using Plots
grids = [simulatepotts(100, 100, 2, β) for β ∈ 0.3:0.1:1.2]
heatmaps = heatmap.(grids, legend = false, aspect_ratio=1)
Plots.plot(heatmaps...)NeuralEstimators.simulateschlather Function
simulateschlather(L::Matrix, m = 1; C = 3.5, Gumbel::Bool = false)Simulates m independent and identically distributed realisations from Schlather's (2002) max-stable model given the lower Cholesky factor L of the covariance matrix of the underlying Gaussian process.
The function uses the algorithm for approximate simulation given by Schlather (2002).
If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, m matrix.
Keyword arguments
C = 3.5: a tuning parameter that controls the accuracy of the algorithm. SmallCfavours computational efficiency, while largeCfavours accuracy.Gumbel = true: flag indicating whether the data should be log-transformed from the unit Fréchet scale to the Gumbel scale.
Examples
using NeuralEstimators, Distances, LinearAlgebra
n = 500
ρ = 0.6
ν = 1.0
S = rand(n, 2)
D = pairwise(Euclidean(), S, dims = 1)
Σ = Symmetric(matern.(D, ρ, ν))
L = cholesky(Σ).L
simulateschlather(L)Spatial point processes
NeuralEstimators.maternclusterprocess Function
maternclusterprocess(; λ=10, μ=10, r=0.1, xmin=0, xmax=1, ymin=0, ymax=1, unit_bounding_box=false)Generates a realisation from a Matérn cluster process (e.g., Baddeley et al., 2015, Ch. 12).
The process is defined by a parent homogenous Poisson point process with intensity λ > 0, a mean number of daughter points μ > 0, and a cluster radius r > 0. The simulation is performed over a rectangular window defined by [xmin, xmax] × [ymin, ymax].
If unit_bounding_box = true, the simulated points will be scaled so that the longest side of their bounding box is equal to one (this may change the simulation window).
See also the R package spatstat, which provides functions for simulating from a range of point processes and which can be interfaced from Julia using RCall.
Examples
using NeuralEstimators
# Simulate a realisation from a Matérn cluster process
S = maternclusterprocess()
# Visualise realisation (requires UnicodePlots)
using UnicodePlots
scatterplot(S[:, 1], S[:, 2])
# Visualise realisations from the cluster process with varying parameters
n = 250
λ = [10, 25, 50, 90]
μ = n ./ λ
plots = map(eachindex(λ)) do i
S = maternclusterprocess(λ = λ[i], μ = μ[i])
scatterplot(S[:, 1], S[:, 2])
endCovariance functions
These covariance functions may be of use for various models.
NeuralEstimators.matern Function
matern(h, ρ, ν, σ² = 1)Given distance h), computes the Matérn covariance function
where ρ is a range parameter, ν is a smoothness parameter, σ² is the marginal variance,
NeuralEstimators.paciorek Function
paciorek(s, r, ω₁, ω₂, ρ, β)Given spatial locations s and r, computes the nonstationary covariance function
where
is the squared Mahalanobis distance between
Note that, in practical applications, the reference point
Density functions
Density functions are not needed in the workflow of NeuralEstimators. However, as part of a series of comparison studies between neural estimators and likelihood-based estimators given in various paper, we have developed the following functions for evaluating the density function for several popular distributions. We include these in NeuralEstimators to cater for the possibility that they may be of use in future comparison studies.
NeuralEstimators.gaussiandensity Function
gaussiandensity(Z::V, L::LT) where {V <: AbstractVector, LT <: LowerTriangular}
gaussiandensity(Z::A, L::LT) where {A <: AbstractArray, LT <: LowerTriangular}
gaussiandensity(Z::A, Σ::M) where {A <: AbstractArray, M <: AbstractMatrix}Efficiently computes the density function for Z ~ 𝑁(0, Σ), namely,
for covariance matrix Σ, and where L is lower Cholesky factor of Σ.
The method gaussiandensity(Z::A, L::LT) assumes that the last dimension of Z contains independent and identically distributed replicates.
If logdensity = true (default), the log-density is returned.
NeuralEstimators.schlatherbivariatedensity Function
schlatherbivariatedensity(z₁, z₂, ψ₁₂; logdensity = true)The bivariate density function (see, e.g., Sainsbury-Dale et al., 2024, Sec. S6.2) for Schlather's (2002) max-stable model, where ψ₁₂ denotes the spatial correlation function evaluated at the locations of observations z₁ and z₂.