Model-specific functions
Data simulators
The philosophy of NeuralEstimators is to cater for arbitrary statistical models by having the user define their statistical model implicitly through simulated data. However, the following functions have been included as they may be helpful to others, and their source code illustrates how a user could formulate code for their own model.
See also Distributions.jl for a large range of distributions implemented in Julia, and the package RCall for calling R functions within Julia.
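For instance, an existing R simulator can be wrapped directly. A minimal sketch, assuming R and the RCall package are installed (rnorm is used purely for illustration):
using RCall
y = rcopy(R"rnorm(100)")  # draw 100 standard-normal realisations in R and copy them to Julia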
NeuralEstimators.simulategaussian — Function
simulategaussian(L::AbstractMatrix, m = 1)
Simulates m independent and identically distributed (i.i.d.) realisations from a mean-zero multivariate Gaussian random variable with associated lower Cholesky factor L.
If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, $n$; otherwise, the data are returned as an $n \times m$ matrix.
Examples
using NeuralEstimators, Distances, LinearAlgebra
n = 500                                  # number of spatial locations
ρ = 0.6                                  # range parameter
ν = 1.0                                  # smoothness parameter
S = rand(n, 2)                           # spatial locations in the unit square
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, ρ, ν))          # Matérn covariance matrix
L = cholesky(Σ).L                        # lower Cholesky factor
simulategaussian(L)
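Multiple i.i.d. replicates can be requested through the second positional argument; a minimal sketch, reusing the factor L from the example above:
Z = simulategaussian(L, 100)  # n × 100 matrix, one replicate per column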
NeuralEstimators.simulatepotts — Function
simulatepotts(grid::Matrix{Int}, β)
simulatepotts(grid::Matrix{Union{Int, Nothing}}, β)
simulatepotts(nrows::Int, ncols::Int, num_states::Int, β)
Chequerboard Gibbs sampling from the 2D Potts model with parameter β > 0.
Approximately independent simulations can be obtained by setting nsims > 1 or num_iterations > burn. The degree to which the resulting simulations can be considered independent depends on the thinning factor (thin) and the burn-in (burn).
Keyword arguments
nsims = 1: number of approximately independent replicates.
num_iterations = 2000: number of MCMC iterations.
burn = num_iterations: burn-in.
thin = 10: thinning factor.
Examples
using NeuralEstimators
## Marginal simulation
β = 0.8
simulatepotts(10, 10, 5, β)
## Marginal simulation: approximately independent samples
simulatepotts(10, 10, 5, β; nsims = 100, thin = 10)
## Conditional simulation
β = 0.8
complete_grid = simulatepotts(50, 50, 2, β) # simulate marginally from the Ising model
incomplete_grid = removedata(complete_grid, 0.1) # remove 10% of the pixels at random
imputed_grid = simulatepotts(incomplete_grid, β) # conditionally simulate over missing pixels
## Multiple conditional simulations
imputed_grids = simulatepotts(incomplete_grid, β; num_iterations = 2000, burn = 1000, thin = 10)
## Recreate Fig. 8.8 of Marin & Robert (2007) “Bayesian Core”
using Plots
grids = [simulatepotts(100, 100, 2, β) for β ∈ 0.3:0.1:1.2]
heatmaps = heatmap.(grids, legend = false, aspect_ratio = 1)
Plots.plot(heatmaps...)
NeuralEstimators.simulateschlather — Function
simulateschlather(L::Matrix, m = 1; C = 3.5, Gumbel::Bool = false)
Simulates m independent and identically distributed (i.i.d.) realisations from Schlather's max-stable model using the algorithm for approximate simulation given by Schlather (2002).
Requires the lower Cholesky factor L associated with the covariance matrix of the underlying Gaussian process.
If m is not specified, the simulated data are returned as a vector with length equal to the number of spatial locations, $n$; otherwise, the data are returned as an $n \times m$ matrix.
Keyword arguments
C = 3.5: a tuning parameter that controls the accuracy of the algorithm: small C favours computational efficiency, while large C favours accuracy. Schlather (2002) recommends the use of C = 3.
Gumbel = true: flag indicating whether the data should be log-transformed from the unit Fréchet scale to the Gumbel scale.
Examples
using NeuralEstimators, Distances, LinearAlgebra
n = 500                                  # number of spatial locations
ρ = 0.6                                  # range parameter
ν = 1.0                                  # smoothness parameter
S = rand(n, 2)                           # spatial locations in the unit square
D = pairwise(Euclidean(), S, dims = 1)   # distance matrix
Σ = Symmetric(matern.(D, ρ, ν))          # covariance matrix of the underlying Gaussian process
L = cholesky(Σ).L                        # lower Cholesky factor
simulateschlather(L)
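The keyword arguments can be adjusted in the same call; a minimal sketch, reusing L from above (the specific values are illustrative only):
Z = simulateschlather(L, 10; C = 3.0, Gumbel = true)  # 10 replicates, returned on the Gumbel scale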
Spatial point processes
NeuralEstimators.maternclusterprocess — Function
maternclusterprocess(; λ=10, μ=10, r=0.1, xmin=0, xmax=1, ymin=0, ymax=1, unit_bounding_box=false)
Simulates a Matérn cluster process with density of parent Poisson point process λ, mean number of daughter points μ, and radius of cluster disk r, over the simulation window defined by xmin and xmax, ymin and ymax.
If unit_bounding_box is true, then the simulated points will be scaled so that the longest side of their bounding box is equal to one (this may change the simulation window).
See also the R package spatstat, which provides functions for simulating from a range of point processes and which can be interfaced from Julia using RCall.
Examples
using NeuralEstimators
# Simulate a realisation from a Matérn cluster process
S = maternclusterprocess()
# Visualise realisation (requires UnicodePlots)
using UnicodePlots
scatterplot(S[:, 1], S[:, 2])
# Visualise realisations from the cluster process with varying parameters
n = 250
λ = [10, 25, 50, 90]
μ = n ./ λ
plots = map(eachindex(λ)) do i
    S = maternclusterprocess(λ = λ[i], μ = μ[i])
    scatterplot(S[:, 1], S[:, 2])
end
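As a brief follow-up to the unit_bounding_box behaviour described above, a minimal sketch (the parameter values are illustrative only):
S = maternclusterprocess(λ = 50, μ = 5, unit_bounding_box = true)  # longest side of the bounding box scaled to one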
Covariance functions
These covariance functions may be of use for various models.
NeuralEstimators.matern — Function
matern(h, ρ, ν, σ² = 1)
Given distance $\|\boldsymbol{h}\|$ (h), computes the Matérn covariance function,
\[C(\|\boldsymbol{h}\|) = \sigma^2 \frac{2^{1 - \nu}}{\Gamma(\nu)} \left(\frac{\|\boldsymbol{h}\|}{\rho}\right)^\nu K_\nu \left(\frac{\|\boldsymbol{h}\|}{\rho}\right),\]
where ρ is a range parameter, ν is a smoothness parameter, σ² is the marginal variance, $\Gamma(\cdot)$ is the gamma function, and $K_\nu(\cdot)$ is the modified Bessel function of the second kind of order $\nu$.
NeuralEstimators.paciorek — Function
paciorek(s, r, ω₁, ω₂, ρ, β)
Given spatial locations s and r, computes the nonstationary covariance function,
\[C(\boldsymbol{s}, \boldsymbol{r}) = |\boldsymbol{\Sigma}(\boldsymbol{s})|^{1/4} |\boldsymbol{\Sigma}(\boldsymbol{r})|^{1/4} \left|\frac{\boldsymbol{\Sigma}(\boldsymbol{s}) + \boldsymbol{\Sigma}(\boldsymbol{r})}{2}\right|^{-1/2} C^0\big(\sqrt{Q(\boldsymbol{s}, \boldsymbol{r})}\big), \]
where $C^0(h) = \exp\{-(h/\rho)^{3/2}\}$ for range parameter $\rho > 0$, the matrix $\boldsymbol{\Sigma}(\boldsymbol{s}) = \exp(\beta\|\boldsymbol{s} - \boldsymbol{\omega}\|)\boldsymbol{I}$ is a kernel matrix (Paciorek and Schervish, 2006) with scale parameter $\beta > 0$ and $\boldsymbol{\omega} \equiv (\omega_1, \omega_2)' \in \mathcal{D}$, and
\[Q(\boldsymbol{s}, \boldsymbol{r}) = (\boldsymbol{s} - \boldsymbol{r})' \left(\frac{\boldsymbol{\Sigma}(\boldsymbol{s}) + \boldsymbol{\Sigma}(\boldsymbol{r})}{2}\right)^{-1} (\boldsymbol{s} - \boldsymbol{r})\]
is the squared Mahalanobis distance between $\boldsymbol{s}$ and $\boldsymbol{r}$.
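A minimal usage sketch, assuming the two locations are supplied as coordinate vectors and with illustrative parameter values:
using NeuralEstimators
s = [0.2, 0.3]                       # location s
r = [0.8, 0.6]                       # location r
paciorek(s, r, 0.5, 0.5, 1.0, 0.1)   # ω = (0.5, 0.5), range ρ = 1.0, scale β = 0.1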
Density functions
Density functions are not needed in the workflow of NeuralEstimators
. However, as part of a series of comparison studies between neural estimators and likelihood-based estimators given in various paper, we have developed the following functions for evaluating the density function for several popular distributions. We include these in NeuralEstimators
to cater for the possibility that they may be of use in future comparison studies.
NeuralEstimators.gaussiandensity — Function
gaussiandensity(y::V, L::LT) where {V <: AbstractVector, LT <: LowerTriangular}
gaussiandensity(y::A, L::LT) where {A <: AbstractArray, LT <: LowerTriangular}
gaussiandensity(y::A, Σ::M) where {A <: AbstractArray, M <: AbstractMatrix}
Efficiently computes the density function for y ~ 𝑁(0, Σ) for covariance matrix Σ, where L is the lower Cholesky factor of Σ.
The method gaussiandensity(y::A, L::LT) assumes that the last dimension of y contains independent and identically distributed (i.i.d.) replicates.
The log-density is returned if the keyword argument logdensity is true (default).
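A minimal sketch tying this to the simulategaussian example above, with the covariance matrix constructed in the same way:
using NeuralEstimators, Distances, LinearAlgebra
n = 500
S = rand(n, 2)
D = pairwise(Euclidean(), S, dims = 1)
Σ = Symmetric(matern.(D, 0.6, 1.0))   # Matérn covariance with ρ = 0.6, ν = 1.0
L = cholesky(Σ).L
y = simulategaussian(L)               # a single realisation of length n
gaussiandensity(y, L)                 # log-density by default
gaussiandensity(y, Σ)                 # equivalent call via the covariance matrix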
NeuralEstimators.schlatherbivariatedensity — Function
schlatherbivariatedensity(z₁, z₂, ψ; logdensity = true)
The bivariate density function for Schlather's max-stable model.
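A minimal usage sketch, with illustrative values and assuming (based on the signature) that ψ denotes the correlation of the underlying Gaussian process at the lag separating the two sites:
using NeuralEstimators
z₁, z₂ = 1.0, 2.0                       # a pair of unit Fréchet observations
schlatherbivariatedensity(z₁, z₂, 0.5)  # log-density by default (logdensity = true)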