Skip to content

Time-series data

Here, we develop a neural estimator to infer parameters from a -dimensional VAR(1) process,

observed over time steps. For simplicity we take   and target the transition matrix entries and innovation standard deviations  , with  . We adopt independent priors on each entry of , truncated to the stationary region  , and independent priors on and .

Package dependencies

julia
using NeuralEstimators
using CairoMakie
using LinearAlgebra
using MatrixEquations: lyapd
using MLUtils: flatten

To improve computational efficiency, various GPU backends are supported. Once the relevant package is loaded and a compatible GPU is available, it will be used automatically:

julia
using CUDA, cuDNN
julia
using AMDGPU
julia
using Metal
julia
using oneAPI

Select a deep-learning backend:

julia
using Flux
julia
using Lux

# NB: for the most computationally efficient setup, pass device = reactant_device() to train()
using Reactant
Reactant.set_default_backend("gpu")

Sampling parameters

We first define a function to sample parameters from the prior distribution. Each column of the returned NamedMatrix holds a single draw: the four entries of and the two innovation standard deviations stacked as . We reject draws whose spectral radius   to ensure stationarity:

julia
function sampler(K)
    p = 2   # dimension of the VAR process
    A = Matrix{Float64}(undef, p^2, 0)
    while size(A, 2) < K
        candidates = 0.5 .* randn(p^2, 2K)   # oversample then filter
        valid = [ρ(reshape(candidates[:, k], p, p)) < 1 for k in 1:2K]
        A = hcat(A, candidates[:, valid])
    end
    NamedMatrix(
        A₁₁ = A[1, 1:K],
        A₁₂ = A[2, 1:K],
        A₂₁ = A[3, 1:K],
        A₂₂ = A[4, 1:K],
        σ₁  = abs.(0.5 .* randn(K)),
        σ₂  = abs.(0.5 .* randn(K)),
    )
end

ρ(A) = maximum(abs.(eigvals(A)))   # spectral radius

Simulating data

Next, we define the statistical model through simulation. Each data set is a single multivariate time series of length , stored as a   matrix, while a batch of data sets is stored as a    array.

julia
function simulator::AbstractVector, T::Integer; stationary_init::Bool = true)
    p = 2   # dimension of the VAR process
    A = reshape([θ["A₁₁"], θ["A₁₂"], θ["A₂₁"], θ["A₂₂"]], p, p)
    σ = [θ["σ₁"], θ["σ₂"]]
    Z = Matrix{Float64}(undef, p, T)
    Z[:, 1] = if stationary_init
        Σ = lyapd(A, Diagonal.^2)) # solve Σ = AΣAᵀ + diag(σ²)
        L = cholesky(Σ).L
        L * randn(p)
    else
        σ .* randn(p)
    end
    for t in 2:T
        Z[:, t] = A * Z[:, t-1] + σ .* randn(p)
    end
    return Z
end
simulator::AbstractMatrix, T::Integer) = stack([simulator(ϑ, T) for ϑ in eachcol(θ)])

Constructing the neural network

Because the data are a time series, the neural network should capture temporal dependencies. Two natural choices are a 1D CNN (which extracts local patterns via convolution over time) and an LSTM (which maintains a hidden state across the full sequence).

A common heuristic is to set the number of summary statistics to a multiple of , the number of unknown parameters.

julia
p = 2    # dimension of each observation
d = 6    # number of parameters (entries of A and innovation standard deviations)
num_summaries = 3d
julia
network = Chain(
    # 1D CNN: convolve over the time dimension
    x -> permutedims(x, (2, 1, 3)),    # CNN expects time-step along first dimension
    Conv((3,), p => 32, gelu; pad = 1),
    Conv((3,), 32 => 64, gelu; pad = 1),
    GlobalMeanPool(),
    flatten,
    Dense(64, 64, gelu),
    Dense(64, num_summaries, gelu)
)
julia
network = Chain(
    # LSTM: maps observation at each step to a hidden state
    LSTM(p => 64),
    x -> x[:, end, :],                # take the final hidden state
    Dense(64, 64, gelu),
    Dense(64, num_summaries, gelu)
)

Constructing the neural estimator

We now construct a neural estimator by wrapping the neural network in the subtype corresponding to the intended inferential method:

julia
estimator = PointEstimator(network, d; num_summaries = num_summaries)
julia
estimator = PosteriorEstimator(network, d; num_summaries = num_summaries)
julia
estimator = RatioEstimator(network, d; num_summaries = num_summaries)

Training the estimator

Next, we train the estimator using train:

julia
K = 10000 # number of parameter vectors in the training/validation sets
T = 100   # number of time steps in each data set
julia
estimator = train(estimator, sampler, simulator; K = 10000, simulator_args = T)
julia
θ_train   = sampler(K)
θ_val     = sampler(K)
estimator = train(estimator, θ_train, θ_val, simulator; simulator_args = T)
julia
θ_train   = sampler(K)
θ_val     = sampler(K)
Z_train   = simulator(θ_train, T)
Z_val     = simulator(θ_val, T)
estimator = train(estimator, θ_train, θ_val, Z_train, Z_val)

Assessing the estimator

The function assess can then be used to assess the trained estimator based on unseen test data simulated from the statistical model:

julia
θ_test = sampler(1000)
Z_test = simulator(θ_test, T)
assessment = assess(estimator, θ_test, Z_test)

The resulting Assessment object contains ground-truth parameters, estimates, and other quantities that can be used to compute quantitative and qualitative diagnostics:

julia
bias(assessment)
rmse(assessment)
plot(assessment)

Applying the estimator to observed data

Once an estimator is deemed to be well calibrated, it may be applied to observed data (below, we use simulated data as a stand-in for observed data):

julia
θ = sampler(1)                      # ground truth (not known in practice)
Z = simulator(θ, 100)               # stand-in for a real VAR(1) time series
julia
estimate(estimator, Z)             # point estimate of (A₁₁, A₁₂, A₂₁, A₂₂, σ₁, σ₂)
julia
sampleposterior(estimator, Z)      # posterior sample
julia
sampleposterior(estimator, Z)      # posterior sample