Archmodels.Jl: Estimating Arch Models in Julia

This paper introduces ARCHModels.jl, a package for the Julia programming language that implements a number of univariate and multivariate ARCH-type models. This model class is the workhorse tool for modelling the conditional volatility of financial assets. Their distinguishing feature is that they model the latent volatility as a (deterministic) function of past returns and volatilities. This recursive structure results in loop-heavy code which, due to its just-in-time compiler, Julia is well-equipped to handle. As such, the entire package is written in Julia, without any binary dependencies. We benchmark the performance of ARCHModels.jl against popular implementations in MATLAB, R, and Python, and illustrate its use in a detailed case study.


Introduction
Financial returns data at daily or higher frequency display a number of stylized facts, including volatility clustering (large (in absolute value) returns tend to cluster together), heavy tails, and statistical leverage, among others; see Figure 1 for an example. Modeling these lies at the heart of much of financial econometrics, because the volatility and conditional distribution of an asset (or a group of assets) are key ingredients in applications such as risk management, portfolio optimization, and derivative pricing. ARCH models (autoregressive conditional heteroskedasticity) form the most widely used class of models for capturing these features. In an ARCH-type model, the latent volatility σ 2 t of an asset is modeled in terms of past returns and volatilities. For example, the basic GARCH(1,1) model for a sample of daily asset returns {r t } t∈{1,...,T } , due to Bollerslev (1986), is (1.1) r t = σ t z t , z t ∼ N(0, 1), σ 2 t = ω + αr 2 t−1 + βσ 2 t−1 , ω, α, β > 0, α + β < 1.
The GARCH model extends the ARCH model (in which β = 0) of Engle (1982), who in 2003 was awarded a Nobel Memorial Prize in Economic Sciences for its development.  (van Rossum, 1995), and MATLAB's (The MathWorks Inc., 2019) Econometrics toolbox. These implementations all outsource the evaluation of the likelihood function to a compiled language, because the recursive nature of (1.1) defies any attempt at "vectorizing" it, a common recommendation for performant implementations in interpreted languages. Julia (Bezanson et al., 2017) is unique among these high-level languages because its just-in-time compiler allows us to keep even the tight loops in Julia itself without sacrificing performance; in fact, our implementation outperforms those mentioned above; see Section 5.2. This is partly due to the use of automatic differentiation for the computation of gradients in maximizing the likelihood, via Optim.jl (Mogensen and Riseth, 2018) and ForwardDiff.jl (Revels et al., 2016).
ARCHModels.jl is a registered Julia package and is easily installed with Julia's package manager using the commands using Pkg; Pkg.add("ARCHModels"). It implements estimation, model selection, simulation, and Value at Risk (VaR) calculations for a variety of univariate (ARCH, GARCH, TGARCH, EGARCH) and multivariate (CCC, DCC) ARCH models, for different choices of standardized innovation distributions (Gaussian, Student's t, Generalized error distribution). The conditional mean can be specified as either zero, an intercept, a linear regression model, or an ARMA(p, q) model. The package is designed to be easy to extend with other volatility specifications and distributions, and integrates with the relevant parts of the Julia ecosystem, such as DataFrames.jl (Harris et al., 2015), Distributions.jl (Lin et al., 2019b), GLM.jl (Bates and other contributors, 2012), HypothesisTests.jl (Kornblith and other contributors, 2012), and StatsBase.jl (Lin et al., 2019a).
The remainder of the paper is as follows: Section 2 defines the various models implemented in ARCH-Models.jl. Section 3 describes the design of the package, in particular the type hierarchy. Section 4 presents a case study detailing the use of the package. Section 5 offers a comparison with implementations in other languages. Section 6 concludes.

ARCH models
Consider a sample of daily asset returns {r t } t∈{1,...,T } . All models covered in this package share the same basic structure, in that they decompose the return into a conditional mean and a mean-zero innovation. In the univariate case, z t ≡ a t /σ t is identically and independently distributed according to some law with mean zero and unit variance, and {F t } is the natural filtration of {r t } (i.e., it encodes information about past returns). In the multivariate case, r t ∈ R d , and the general model structure is ARCH models specify the conditional volatility σ t (or in the multivariate case, the conditional covariance matrix Σ t ) in terms of past returns, conditional (co)variances, and potentially other variables.

Univariate type hierarchy
This package represents an ARCH model as an instance of either UnivariateARCHModel or MultivariateARCHModel.
These are subtypes of ARCHModel and implement the interface of StatisticalModel from StatsBase.
An instance of UnivariateARCHModel contains a vector of data (such as equity returns), and encapsulates information about the volatility specification (e.g., GARCH or EGARCH), the mean specification (e.g., whether an intercept is included), and the error distribution.

Volatility specifications
Volatility specifications describe the evolution of σ t .
They are modelled as subtypes of UnivariateVolatilitySpec. There is one type for each class of (G)ARCH model, parameterized by the number(s) of lags (e.g., p, q for a GARCH(p, q) model).
The simplest volatility specification is given by the ARCH(q) model, due to Engle (1982). It reads The corresponding type is ARCH{q}. The GARCH(p, q) model, due to Bollerslev (1986), generalizes the ARCH(q) model by including lagged values of the squared volatility on the right hand side of (3.1). This renders the conditional variance as It is available as GARCH{p, q}. The ARCH and GARCH models are special cases of a more general class of models, known as TGARCH (Threshold GARCH), due to Glosten et al. (1993). The TGARCH(o, p, q) model takes the form The TGARCH model allows the volatility to react differently (typically more strongly) to negative shocks, a feature known as the (statistical) leverage effect. Is available as TGARCH{o, p, q}. Finally, The EGARCH(o, p, q) volatility specification, due to Nelson (1991), is Like the TGARCH model, it can account for the leverage effect.
The corresponding type is EGARCH{o, p, q}.
The constructors for the volatility specifications take as input a coefficient vector, where the order of the parameters is such that all parameters pertaining to the first type parameter (p) appear before those pertaining to the second (q). For example, an EGARCH(1, 1, 1) model with ω = −0.003, γ 1 = −0.03, β 1 = 0.99 and α 2 = .2 is obtained with julia> using ARCHModels julia> EGARCH{1, 1, 1}([-0.003, -0.03, 0.99, 0.2]); Explicitly creating instances of volatility specifications is only necessary for simulation (see Section 4); for fitting, passing the type is sufficient.

Mean specifications
Mean specifications serve to specify µ t . They are modelled as subtypes of MeanSpec. They contain their parameters as (possibly empty) vectors. Convenience constructors are provided where appropriate, though as with volatility specifications, constructing them explicitly is only required for simulation, not for fitting. The following specifications are available: A zero mean, i.e., µ t = 0, is available as NoIntercept; An intercept (µ t = µ) is obtained as Intercept. An ARMA(p, q) model, which specifies the conditional mean as is available as ARMA{p, q}. The special cases of pure AR(p) and MA(q) models are also available, as AR{p} and MA{q}, respectively.
Finally, Regression allows one to specify a linear regression model, as in Unlike the other mean specifications, a regression requires external data, which the constructor expects as a matrix with observations in rows and variables in columns, as follows: julia> reg = Regression(ones(100, 1)); In this example, we created a regression model containing one regressor, given by a column of ones; this is equivalent to including an intercept in the model. The latter is however more memory efficient, as no design matrix needs to be stored. Another way to create a linear regression with ARCH errors is to pass a LinearModel or TableRegressionModel from GLM.jl, as described in Section 4.

Distributions
Different standardized (mean zero, variance one) distributions for z t are available as subtypes of StandardizedDistribution, which in turn subtypes Distribution{Univariate, Continuous} from Distributions.jl, though not the entire interface need necessarily be implemented. Instances of StandardizedDistribution again hold their parameters as vectors, but convenience constructors are provided. Available distributions include the standard normal (StdNormal), standardized Student's t (StdT), and standard generalized error distribution (StdGED). In addition, it is possible to wrap a continuous univariate distribution from the Distributions package in the Standardized wrapper type. Below, we reimplement the standardized normal distribution: julia> using Distributions julia> const MyStdNormal = Standardized{Normal}; MyStdNormal can be used whereever a built-in distribution could, albeit with a speed penalty. Note also that if the underlying distribution (such as Normal in the example above) contains location and/or scale parameters, then these are no longer identifiable, which implies that the estimated covariance matrix of the estimators will be singular. A final remark concerns the domain of the parameters: the estimation process relies on a starting value for the parameters of the distribution, say θ ≡ (θ 1 , . . . , θ p ) T . For a distribution wrapped in Standardized, the starting value for θ i is taken to be a small positive value, say ε. This will fail if ε is not in the domain of θ i ; as an example, the standardized Student's t distribution is only defined for degrees of freedom larger than 2, because a finite variance is required for standardization. In that case, it is necessary to define a method of the (non-exported) function startingvals that returns a feasible vector of starting values, as follows: julia> ARCHModels.startingvals(::Type{<:MyStdT}, data::Vector{T}) where T = T[3.]

Working with UnivariateARCHModels
The constructor for UnivariateARCHModel takes two mandatory arguments: an instance of a subtype of UnivariateVolatilitySpec, and a vector of returns. The mean specification and error distribution can be changed via the keyword arguments meanspec and dist, which respectively default to NoIntercept and StdNormal. For example, to construct a GARCH(1, 1) model with t-distributed errors, one would do julia> spec = GARCH{1, 1}([1., .9, .05]); julia> data = BG96; julia> am = UnivariateARCHModel(spec, data; dist=StdT(3.)); The model can then be fitted as follows: julia> fit!(am) ARCHModels.TGARCH{0,1,1} model with Student's t errors, T=1974. It should, however, rarely be necessary to construct a UnivariateARCHModel manually via its constructor; typically, instances of it are created by calling fit, selectmodel, or simulate. As discussed earlier, UnivariateARCHModel implements the interface of StatisticalModel from Stats-Base, so one may can call coef, coefnames, confint, dof, informationmatrix, isfitted, loglikelihood, nobs, score, stderror, vcov, etc. on its instances. Other useful methods include means, volatilities, and residuals.

Multivariate type hierarchy
Analogously to the univariate case, an instance of MultivariateARCHModel contains a matrix of data (with observations in rows and assets in columns), and encapsulates information about the covariance specification (e.g., CCC or DCC), the mean specification, and the error distribution.
MultivariateARCHModels support many of the same methods as UnivariateARCHModels, with a few noteworthy differences: the prediction targets for predict are :covariances and :correlations for predicting Σ t and R t , respectively, and the new functions covariances and correlations respectively return the in-sample estimates of Σ t and R t .

Covariance specifications
The main challenge in multivariate ARCH modelling is the curse of dimensionality: allowing each of the d(d + 1)/2 elements of Σ t to depend on the past returns of all d other assets requires O(d 4 ) parameters without imposing additional structure. ARCHModels.jl focusses on conditional correlation models, which approach this issue by decomposing Σ t as where R t is the conditional correlation matrix and D t is a diagonal matrix containing the volatilities of the individual assets, which are modelled as univariate ARCH processes. The dynamics of Σ t are modelled as subtypes of MultivariateVolatilitySpec. Two options are available: DCC The dynamic conditional correlation (DCC) model of Engle (2002) imposes a GARCH-type structure on the R t . In particular, for a DCC(p, q) model (with covariance targeting), where It is available as DCC{p, q}. The constructor takes as inputsQ, a vector of coefficients, and a vector of UnivariateARCHModels: julia> DCC{1, 1}([1. .5; .5 1.], [.9, .05], [GARCH{1, 1}([1., .9, .05]) for _ in 1:2]) DCC{1, 1, ARCHModels.TGARCH{0,1,1}} specification.
The DCC model is typically estimated in two steps, by first fitting univariate ARCH models to the individual assets and saving the standardized residuals { t }, and then estimating the DCC parameters from those. Engle (2002) provides the details and expressions for the standard errors. By default, this package employs an alternative estimator due to Engle et al. (2019), which is better suited to largedimensional problems. It achieves this by i) estimatingQ with a nonlinear shrinkage estimator instead of the sample covariance of t , and ii) estimating the DCC parameters by maximizing the sum of the pairwise log-likelihoods, rather than the joint log-likelihood over all assets, thereby avoiding the inversion of large matrices during the optimization. The estimation method is controlled by passing the method keyword to the constructor. Possible values are :largescale (the default), and :twostep.
CCC The CCC (constant conditional correlation) model of Bollerslev (1990) models R t = R as constant.

No estimable parameters.
As for the DCC model, the constructor accepts a method keyword argument, which takes the possible values :largescale (default) or :twostep that determines whether R will be estimated by nonlinear shrinkage or the sample correlation of the t .

Mean Specifications
The conditional mean of a MultivariateARCHModel is specified by a vector of univariate MeanSpecs.

Multivariate Standardized Distributions
Multivariate standardized distributions subtype MultivariateStandardizedDistribution. Currently, only MultivariateStdNormal is available. Note that under mild assumptions, the Gaussian (quasi-)MLE consistently estimates the (multivariate) ARCH parameters even if Gaussianity is violated.

Univariate Modelling
We will be using the data from Bollerslev and Ghysels (1996), available as the constant BG96. The data consist of daily German mark/British pound exchange rates (1974 observations) and are often used in evaluating implementations of (G)ARCH models (see, e.g., Brooks et al. (2001). We begin by convincing ourselves that the data exhibit ARCH effects; a quick and dirty way of doing this is to look at the sample autocorrelation function of the squared returns: julia> using ARCHModels julia> autocor(BG96.^2, 1:4, demean=true)' # re-exported from StatsBase 1*4 LinearAlgebra.Adjoint{Float64,Array{Float64,1}}: 0.222941 0.176632 0.14086 0.12632 Using a critical value of 1.96/ √ 1974 = 0.044, we see that there is indeed significant autocorrelation in the squared series.
A more formal test for the presence of volatility clustering is Engle's (1982) ARCH-LM test. The test statistic is given by LM ≡ T R 2 aux , where R 2 aux is the coefficient of determination in a regression of the squared returns on an intercept and p of their own lags. The test statistic follows a χ 2 p distribution under the null of no volatility clustering.  This returns an instance of UnivariateARCHModel, as described in Section 3.2. The parameters α 1 and β 1 in the volatility equation are highly significant, as one would expect from our preliminary testing. Note also that the fitted values are the same as those found by Bollerslev and Ghysels (1996) and Brooks et al. (2001) for the same dataset.
The fit method supports a number of keyword arguments; the full signature is fit( ::Type{<:UnivariateVolatilitySpec}, data::Vector; dist=StdNormal, meanspec=Intercept, algorithm=BFGS(), autodiff=:forward, kwargs... ) Here, dist is a subtype (not instance) of StandardizedDistribution; see Section 3.1.3. The mean specification is specified via meanspec and defaults to Intercept. It can be passed as either a subtype of MeanSpec or an instance thereof (for specifications that require additional data, such as Regression; see Section 3.1.2). If the mean specification in question has a notion of sample size (like Regression), then the sample size should match that of the data, or an error will be thrown. The remaining keyword arguments are passed on to the optimizer.
It is also possible to pass a LinearModel (or TableRegressionModel) from GLM.jl to fit instead of a data vector. This is equivalent to using a Regression as a mean specification. In the following example, we fit a linear model with GARCH (1, 1)  One of the issues in ARCH modeling is selecting the lag order. One possibility is to make this choice based on an information criterion. ARCHModels.jl can automate this procedure, via the selectmodel function. Given a class of model (i.e., a subtype of UnivariateVolatilitySpec), it will return a fitted UnivariateARCHModel, with the lag length parameters (i.e., p and q in the case of GARCH) chosen to minimize the desired criterion. The BIC is used by default. As an example, the following selects the optimal (minimum AIC) EGARCH(o, p, q) model, where o, p, q < 2, assuming t distributed errors.  Here, an EGARCH(1, 1, 2) model was selected. Passing the keyword argument show_trace=true will show the criterion for each model after it is estimated. Any unspecified lag length parameters in the mean specification (e.g., p and q for ARMA) will be optimized over as well. Note however that this can result in an explosion of the number of models that must be estimated; e.g., selecting the best model from the class of TGARCH(o, p, q)-ARMA(p, q) models results in 5 maxlags models being estimated. It may thus be preferable to fix the lag length of the mean specification: am = selectmodel(ARCH, BG96; meanspec=AR{1}) considers only ARCH(q)-AR(1) models. Similarly, one may restrict the lag length of the volatility specification and select only among different mean specifications. E.g., the following will select the best ARMA(p, q) specification with constant variance: julia> am = selectmodel(ARCH{0}, BG96; meanspec=ARMA); One of the primary uses of ARCH models is for estimating and forecasting Value at Risk. Basic insample estimates for the Value at Risk implied by an estimated UnivariateARCHModel can be obtained using VaRs: julia> am = fit(GARCH{1, 1}, BG96); julia> vars = VaRs(am, 0.05); julia> using Plots julia> plot(-BG96, legend=:none, xlabel="\$t\$", ylabel="\$-r_t\$"); julia> plot!(vars, color=:purple); julia> savefig(joinpath("img", "VaRplot.pdf")); nothing This produces the graph in Figure 1.
The predict(am::UnivariateARCHModel) method can be used to construct one-step ahead forecasts for a number of quantities. Its signature is predict(am::UnivariateARCHModel, what=:volatility; level=0.01) The keyword argument what controls which object is predicted; the choices are :volatility (the default), :variance, :return, and :VaR. The VaR level can be controlled with the keyword argument level.
One way to use predict is in a backtesting exercise. The following code snippet constructs out-ofsample VaR forecasts for the Bollerslev and Ghysels data by re-estimating the model in a rolling window fashion, and then tests the correctness of the VaR specification with the dynamic quantile test of Engle and Manganelli (2004)   Testing volatility models in general relies on the estimated conditional volatilitiesσ t and the standardized residualsẑ t ≡ (r t −μ t )/σ t , accessible via volatilities(::UnivariateARCHModel) and residuals(::UnivariateARCHModel), respectively. The non-standardized residualsû t ≡ r t −μ t can be obtained by passing standardized=false as a keyword argument to residuals.
One possibility to test a volatility specification is to apply the ARCH-LM test to the standardized residuals. This is achieved by calling ARCHLMTest on the estimated UnivariateARCHModel:  By default, the number of lags is chosen as the maximum order of the volatility specification (e.g., max(p, q) for a GARCH(p, q) model). Here, the test does not reject, indicating that a GARCH(1, 1) specification is sufficient for modelling the volatility clustering (a common finding).

Multivariate models
In this section, the percentage returns on 29 stocks from the DJIA from 03/19/2008 through 04/11/2019, available as DOW29, will be used.
Fitting a multivariate ARCH model proceeds similarly to the univariate case, by passing the type of the multivariate ARCH specification to fit. If the lag length (and in the case of the DCC model, the univariate specification) is left unspecified, then these default to 1 (and GARCH); i.e., the following is equivalent to both fit(DCC{1, 1}, DOW29) and fit(DCC{1, 1, GARCH{1, 1}}, DOW29): julia> m = fit(DCC, DOW29[:, 1:2]); The returned object is of type MultivariateARCHModel. Like UnivariateARCHModel, it implements most of the interface of StatisticalModel and hence behaves similarly, so this section documents only the major differences.
The standard errors are not calculated by default. They can be obtained with show(IOContext(stdout, :se=>true), m). Alternatively, stderror(m) can be used. As in the univariate case, fit supports a number of keyword arguments. The full signature is fit(spec, data: method=:largescale, dist=MultivariateStdNormal, meanspec=Intercept, algorithm=BFGS(), autodiff=:forward, kwargs...) Their meaning is similar to the univariate case. In particular, meanspec can be any univariate mean specification, as described in Section 3.1.2. Certain models support different estimation methods; in the case of the DCC model, these are :twostep and :largescale, which respectively refer to the methods of Engle (1982) and Engle et al. (2019). The latter sacrifices some amount of statistical efficiency for much-improved computational speed and is the default. Again paralleling the univariate case, one may also construct a MultivariateARCHModel by hand and then call fit or fit! on it, but this is rather cumbersome, as it requires specifying all parameters of the covariance specification. One-step ahead forecasts of the covariance or correlation matrix are obtained by respectively passing what=:covariance (the default) or what=:correlation to predict: julia> predict(m, what=:correlation) 2*2 Array{Float64,2}: 1.0 0.436513 0.436513 1.0 In the multivariate case, there are three types of residuals that can be considered: the unstandardized residuals, a t ; the devolatized residuals, t , where it ≡ a it /σ it ; and the decorrelated residuals z t ≡ Σ −1/2 t a t . When called on a MultivariateARCHModel, residuals returns {z t } by default. Passing decorrelated=false returns { t }, and passing standardized=false returns {a t } (decorrelated=true implies standardized=true).

Benchmarks
Next, we compare the different implementations in terms of speed.

Summary
We have introduced ARCHModels.jl, a package for estimating ARCH-type models in Julia. It provides an intuitive API for estimating, simulating, and testing univariate and multivariate ARCH-type models. Julia's properties, such as JIT compilation and the availability of advanced automatic differentiation packages, allow us to beat implementations in other languages by significant margins in terms of estimation speed, despite keeping the entire implementation in the high-level language.