7 Dynamic linear models (DLMs)
7.1 Introduction
The set up here is similar to the previous chapter. We have a "hidden" Markov chain model, but now the hidden state is continuous, and both the state evolution and the observation process are linear and Gaussian:
$$X_t = G_t X_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t),$$
$$Y_t = F_t X_t + \nu_t, \quad \nu_t \sim N(0, V_t),$$
where the noise terms $\omega_t$ and $\nu_t$ are mutually independent, and independent of the initial state $X_0$.
For a hidden state of dimension $p$ and an observation of dimension $d$, $G_t$ is a $p\times p$ matrix, $W_t$ a $p\times p$ covariance matrix, $F_t$ a $d\times p$ matrix, and $V_t$ a $d\times d$ covariance matrix. In the simplest case these matrices are time-invariant, and so do not depend on $t$.
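As a concrete illustration, here is a minimal sketch (not from the text) simulating data from a univariate random walk plus noise DLM, using parameter values chosen to resemble the SOI example later in the chapter; tsplot is from the astsa package used throughout.

set.seed(42)
n = 500
G = 1; W = 0.01^2; F = 1; V = 0.5^2        # assumed illustrative values
x = y = numeric(n)
x[1] = rnorm(1, 0, sqrt(W))                # initial hidden state
y[1] = rnorm(1, F * x[1], sqrt(V))         # first noisy observation
for (t in 2:n) {
    x[t] = rnorm(1, G * x[t-1], sqrt(W))   # state evolution
    y[t] = rnorm(1, F * x[t], sqrt(V))     # noisy observation of the state
}
tsplot(y, col=4); lines(x, col=2, lwd=2)   # data with the hidden state overlaid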
7.2 Filtering
The filtering problem is the computation of the distribution of the current hidden state given all of the observations received so far, $\pi(x_t\,|\,y_{1:t})$, sequentially for $t = 1, 2, \ldots$.
Since we know the filtering distribution at the previous time point, $X_{t-1}\,|\,y_{1:t-1} \sim N(m_{t-1}, C_{t-1})$ (starting from a normal prior $X_0 \sim N(m_0, C_0)$), and the model is linear and Gaussian, every distribution of interest is normal, so we only need to keep track of means and variances. The computation for each time point splits naturally into two parts: a predict step and an update step.
Predict step
Since $X_t = G_t X_{t-1} + \omega_t$, with $\omega_t$ independent of $X_{t-1}$ and of $y_{1:t-1}$, standard properties of linear transformations of normal random variables give
$$X_t\,|\,y_{1:t-1} \sim N(a_t, R_t), \quad\text{where}\quad a_t = G_t m_{t-1}, \quad R_t = G_t C_{t-1} G_t^\top + W_t.$$
Update step
Using the observation equation $Y_t = F_t X_t + \nu_t$, the one-step forecast distribution of the next observation is
$$Y_t\,|\,y_{1:t-1} \sim N(f_t, Q_t), \quad\text{where}\quad f_t = F_t a_t, \quad Q_t = F_t R_t F_t^\top + V_t.$$
Conditioning on the observed value $y_t$, using standard multivariate normal results, then gives the new filtering distribution
$$X_t\,|\,y_{1:t} \sim N(m_t, C_t), \quad\text{where}\quad m_t = a_t + K_t(y_t - f_t), \quad C_t = R_t - K_t Q_t K_t^\top,$$
and $K_t = R_t F_t^\top Q_t^{-1}$ is known as the Kalman gain.
7.2.1 The Kalman filter
Let us now summarise the steps of the Kalman filter. Our input at time $t$ is the filtering distribution from the previous time point, $X_{t-1}\,|\,y_{1:t-1} \sim N(m_{t-1}, C_{t-1})$, together with the new observation $y_t$. We then compute:

- Predict step: $a_t = G_t m_{t-1}$, $R_t = G_t C_{t-1} G_t^\top + W_t$.
- Update step: $f_t = F_t a_t$, $Q_t = F_t R_t F_t^\top + V_t$, $K_t = R_t F_t^\top Q_t^{-1}$, $m_t = a_t + K_t(y_t - f_t)$, $C_t = R_t - K_t Q_t K_t^\top$.

The output is the new filtering distribution, $X_t\,|\,y_{1:t} \sim N(m_t, C_t)$.
Note that the only inversion involves the $d\times d$ matrix $Q_t$, which has the (typically small) dimension of the observation rather than that of the hidden state, and in practice we solve a linear system involving $Q_t$ rather than explicitly forming its inverse.
Also note that it is easy to handle missing data within the Kalman filter by carrying out the predict step and skipping the update step. If the observation at time $t$ is missing, we simply set $m_t = a_t$ and $C_t = R_t$ and move on to the next time point.
7.2.2 Example
To keep things simple, we will just implement the Kalman filter for time-invariant $G$, $W$, $F$ and $V$.
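A minimal sketch of such a function (the kFilterML function of Section 7.4.1, given below, without the log-likelihood computation) looks like this.

kFilter = function(G, W, F, V)
    function(mC, y) {
        # predict step
        m = G %*% mC$m
        C = (G %*% mC$C %*% t(G)) + W
        # update step
        f = F %*% m
        Q = (F %*% C %*% t(F)) + V
        K = t(solve(Q, F %*% C))    # Kalman gain, K = C F' Q^{-1}
        m = m + (K %*% (y - f))
        C = C - (K %*% Q %*% t(K))
        list(m=m, C=C)
    }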
So, if we provide kFilter with the matrices defining our model, we will get back a function for carrying out one step of the Kalman filter. As a simple illustration, suppose that we are interested in the long-term trend in the Southern Oscillation Index (SOI). We suppose that a simple random walk model is appropriate for the (hidden) long-term trend, so that $G = 1$ with a small state evolution variance ($W = 0.01^2$), and that the observed SOI is a noisy measurement of this trend ($F = 1$, $V = 0.5^2$). We can then run the filter over the whole series using R's Reduce function, initialising with $m_0 = 0$ and $C_0 = 100$.
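The filter can be applied to the data with a call of the following form (an assumed invocation, mirroring the kFilterML call in Section 7.4.1), giving the output below.

advance = kFilter(1, 0.01^2, 1, 0.5^2)   # assumed call: G, W, F, V as described above
Reduce(advance, soi, list(m=0, C=100))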
$m
[,1]
[1,] -0.03453493
$C
[,1]
[1,] 0.00495025
This gives us the filtered distribution of the hidden state at the final time point. If instead we want to keep the filtered distribution at every time point, we can ask Reduce to accumulate the intermediate results, and then plot the filtered means over the data.
fs = Reduce(advance, soi, list(m=0, C=100), acc=TRUE)      # accumulate all filtered states
fsm = sapply(fs, function(s) s$m)                          # extract the filtered means
fsTs = ts(fsm[-1], start=start(soi), freq=frequency(soi))  # drop the initial state
tsplot(soi, col=4)
lines(fsTs, col=2, lwd=2)
We will look at more interesting DLM models in the next chapter.
7.3 Forecasting
Making forecasts from a DLM is straightforward, since it corresponds to the computation of predictions, and we have already examined this problem in the context of the predict step of a Kalman filter. Suppose that we have a time series of length $n$ and want to forecast $k$ steps ahead. Starting from the filtering distribution at time $n$, $X_n\,|\,y_{1:n} \sim N(m_n, C_n)$, we simply iterate the predict step $k$ times, without any update steps, to obtain the distribution of $X_{n+k}\,|\,y_{1:n}$, and from this the forecast distribution of the corresponding observation, $Y_{n+k}\,|\,y_{1:n}$.
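As a sketch of how this looks in code (not an implementation from the text), the following function iterates the predict step for a univariate, time-invariant DLM, returning the forecast mean and variance of the observation at each horizon.

kForecast = function(G, W, F, V, mC, k) {
    a = mC$m; R = mC$C
    out = matrix(0, nrow=k, ncol=2, dimnames=list(NULL, c("f", "Q")))
    for (i in 1:k) {
        a = G * a                          # predict step only, no update
        R = G^2 * R + W
        out[i, ] = c(F * a, F^2 * R + V)   # forecast mean and variance of the observation
    }
    out
}

kForecast(1, 0.01^2, 1, 0.5^2, list(m=-0.0345, C=0.00495), 12)   # filtered state from the SOI example above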
7.4 Marginal likelihood
For many applications (including parameter estimation) it is useful to know the marginal likelihood of the data. We can factorise this as
$$\pi(y_{1:n}) = \pi(y_1)\prod_{t=2}^{n} \pi(y_t\,|\,y_{1:t-1}),$$
and each term in the product is just the one-step forecast distribution computed during the update step of the Kalman filter, $Y_t\,|\,y_{1:t-1} \sim N(f_t, Q_t)$, so the (log) marginal likelihood can be accumulated as we sweep forwards through the data. We can use a library function (such as mvtnorm::dmvnorm with log=TRUE) to evaluate the log density of a multivariate normal, but if not, we can write it explicitly in the form
$$\log \pi(y_t\,|\,y_{1:t-1}) = -\frac{1}{2}\left[\, d\log(2\pi) + \log|Q_t| + (y_t - f_t)^\top Q_t^{-1} (y_t - f_t) \,\right],$$
where $d$ is the dimension of the observation.
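For instance, a hypothetical helper along the following lines (not part of the text) evaluates this expression directly, and should agree with mvtnorm::dmvnorm(y, f, Q, log=TRUE) when Q is supplied as a matrix.

ldmvn = function(y, f, Q) {
    e = as.vector(y - f)                          # forecast error
    -0.5 * (length(e) * log(2*pi) +
            as.numeric(determinant(Q)$modulus) +  # log |Q|
            sum(e * solve(Q, e)))                 # quadratic form e' Q^{-1} e
}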
7.4.1 Example
We can easily modify our Kalman filter function to compute the marginal likelihood, and apply it to our example problem as follows.
kFilterML = function(G, W, F, V)
    function(mCl, y) {
        # predict step
        m = G %*% mCl$m
        C = (G %*% mCl$C %*% t(G)) + W
        # one-step forecast and its log-likelihood contribution
        f = F %*% m
        Q = (F %*% C %*% t(F)) + V
        ll = mvtnorm::dmvnorm(y, f, Q, log=TRUE)
        # update step
        K = t(solve(Q, F %*% C))
        m = m + (K %*% (y - f))
        C = C - (K %*% Q %*% t(K))
        list(m=m, C=C, ll=mCl$ll+ll)
    }
advance = kFilterML(1, 0.01^2, 1, 0.5^2)
Reduce(advance, soi, list(m=0, C=100, ll=0))
$m
[,1]
[1,] -0.03453493
$C
[,1]
[1,] 0.00495025
$ll
[1] -237.2907
7.5 Smoothing
For the smoothing problem, we want to calculate the marginal distribution of the hidden state at each time given all of the data, $\pi(x_t\,|\,y_{1:n})$, for $t = 1, \ldots, n$.
Since the model is linear and Gaussian, these smoothed marginals are again normal, $X_t\,|\,y_{1:n} \sim N(s_t, S_t)$, say. The key observation is that, conditional on $X_{t+1}$, the state $X_t$ is independent of all observations after time $t$, and so the smoothed distribution at time $t$ can be built from the filtering distribution at time $t$ together with the smoothed distribution at time $t+1$. This gives a backward recursion, run after a forward pass of the Kalman filter, usually known as the Rauch-Tung-Striebel (RTS) smoother.
7.5.1 The RTS smoother
To summarise, we receive as input the filtered moments $m_t, C_t$ and the predicted moments $a_t, R_t$ ($t = 1, \ldots, n$) from a forward pass of the Kalman filter. We initialise with $s_n = m_n$, $S_n = C_n$, and then for $t = n-1, n-2, \ldots, 1$ compute
$$J_t = C_t G_{t+1}^\top R_{t+1}^{-1},$$
$$s_t = m_t + J_t (s_{t+1} - a_{t+1}),$$
$$S_t = C_t + J_t (S_{t+1} - R_{t+1}) J_t^\top,$$
giving the smoothed distributions $X_t\,|\,y_{1:n} \sim N(s_t, S_t)$.
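As a sketch of how this can be implemented (not an implementation from the text), here is a scalar, time-invariant version that runs a forward Kalman filter pass, storing the predicted and filtered moments, and then applies the backward recursion.

rtsSmooth = function(G, W, F, V, y, m0=0, C0=100) {
    n = length(y)
    a = R = m = C = numeric(n)
    for (t in 1:n) {                            # forward pass (Kalman filter)
        a[t] = G * (if (t == 1) m0 else m[t-1])
        R[t] = G^2 * (if (t == 1) C0 else C[t-1]) + W
        Q = F^2 * R[t] + V
        K = R[t] * F / Q
        m[t] = a[t] + K * (y[t] - F * a[t])
        C[t] = R[t] - K^2 * Q
    }
    s = m; S = C                                # backward pass (RTS recursions)
    for (t in (n-1):1) {
        J = C[t] * G / R[t+1]                   # smoother gain
        s[t] = m[t] + J * (s[t+1] - a[t+1])
        S[t] = C[t] + J^2 * (S[t+1] - R[t+1])
    }
    list(a=a, R=R, m=m, C=C, s=s, S=S)
}

sm = rtsSmooth(1, 0.01^2, 1, 0.5^2, soi)        # smoothed means in sm$s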
7.6 Sampling
If we want to understand the joint distribution of the hidden states given the data, generating samples from it will be useful. So, we want to generate exact samples from the joint smoothing distribution, $\pi(x_{1:n}\,|\,y_{1:n})$. This can be done with a backward sampling pass following a forward pass of the Kalman filter, an approach often referred to as forward filtering, backward sampling (FFBS). We first sample $x_n \sim N(m_n, C_n)$, and then, for $t = n-1, \ldots, 1$, sample from
$$X_t\,|\,x_{t+1}, y_{1:t} \sim N\big(m_t + J_t(x_{t+1} - a_{t+1}),\; C_t - J_t R_{t+1} J_t^\top\big),$$
where $J_t = C_t G_{t+1}^\top R_{t+1}^{-1}$ is the same gain as used by the RTS smoother.
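A minimal sketch of the backward sampling pass (again, not an implementation from the text), for the scalar time-invariant model, re-using the quantities stored by the rtsSmooth sketch above:

ffbs = function(G, a, R, m, C) {
    n = length(m)
    x = numeric(n)
    x[n] = rnorm(1, m[n], sqrt(C[n]))            # sample the final state
    for (t in (n-1):1) {                         # backward sampling pass
        J = C[t] * G / R[t+1]                    # same gain as the RTS smoother
        x[t] = rnorm(1, m[t] + J * (x[t+1] - a[t+1]),
                     sqrt(C[t] - J^2 * R[t+1]))
    }
    x
}

fit = rtsSmooth(1, 0.01^2, 1, 0.5^2, soi)
xsamp = ffbs(1, fit$a, fit$R, fit$m, fit$C)      # one draw from pi(x_{1:n} | y_{1:n})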
7.7 Parameter estimation
Just as for HMMs, there are a wide variety of Bayesian and likelihood-based approaches to parameter estimation. Here, again, we will keep things simple and illustrate basic numerical optimisation of the marginal log-likelihood of the data.
7.7.1 Example
For our SOI running example, suppose that we are happy with the model structure, but uncertain about the variance parameters $W$ and $V$. We can easily wrap our filter in a function that evaluates the marginal log-likelihood of the data as a function of these two parameters.
mll = function(wv) {              # wv = c(W, V)
    advance = kFilterML(1, wv[1], 1, wv[2])
    Reduce(advance, soi, list(m=0, C=100, ll=0))$ll
}
mll(c(0.01^2, 0.5^2))
[1] -237.2907
We can see that it reproduces the same log-likelihood as we got before when we evaluate it at our previous guess for the two variance parameters. Let us now see what optim can do.
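A call along the following lines (the starting values and options are assumptions, not taken from the text), maximising the log-likelihood via fnscale=-1, produces output of the kind shown below.

optim(c(0.01^2, 0.5^2), mll, control=list(fnscale=-1))   # assumed call: start from the previous guess and maximise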
$par
[1] 0.05696905 0.03029240
$value
[1] -144.0333
$counts
function gradient
91 NA
$convergence
[1] 0
$message
NULL
So optim has found parameters with a much higher likelihood, corresponding to a much better fitting model. Note that these optimised parameters apportion the noise more equally between the hidden states and the observation process, and so lead to smoothing the hidden state much less. This may correspond to a better fit, but perhaps not to the fit that we really want. In the context of Bayesian inference, a strong prior on $W$, keeping the state evolution variance small, could be used to encourage a smoother estimate of the underlying trend.
7.8 R package for DLMs
We have seen that it is quite simple to implement the Kalman filter and related computations such as the marginal log-likelihood of the data. However, a very naive implementation such as we have given may not be the most efficient, nor the most numerically stable. Many different approaches to implementing the Kalman filter have been developed over the years, which are mathematically equivalent to the basic Kalman filter recursions, but are more efficient and/or more numerically stable than a naive approach. The dlm R package is an implementation based on the singular value decomposition (SVD), which is significantly more numerically stable than a naive implementation. We will use this package for the study of more interesting DLMs in Chapter 8. More information about this package can be obtained with help(package="dlm") and vignette("dlm", package="dlm"). It is described more fully in Petris, Petrone, and Campagnoli (2009). We can fit our simple DLM to the SOI data as follows.
library(dlm)
mod = dlm(FF=1, GG=1, V=0.5^2, W=0.01^2, m0=0, C0=100)
fs = dlmFilter(soi, mod)
tsplot(soi, col=4, main="Filtered states")
lines(fs$m, col=2, lwd=2)
We can also compute the smoothed states.
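The code for this step is analogous to that used for the optimised model at the end of the chapter; a call of the following (assumed) form computes and plots the smoothed states.

ss = dlmSmooth(soi, mod)
tsplot(soi, col=4, main="Smoothed states")
lines(ss$s, col=2, lwd=2)   # assumed plotting call, mirroring the optimised-model example below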
These are very smooth. We can also optimise the parameters, similar to the approach we have already seen. But note that the dlmMLE optimiser requires that the vector of parameters being optimised is unconstrained, so we work with the logs of the two variances.
buildMod = function(lwv)
    dlm(FF=1, GG=1, V=exp(lwv[2]), W=exp(lwv[1]),
        m0=0, C0=100)

opt = dlmMLE(soi, parm=c(log(0.5^2), log(0.01^2)),
             build=buildMod)
opt
$par
[1] -2.865240 -3.496717
$value
[1] -272.2459
$counts
function gradient
35 35
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
exp(opt$par)
[1] 0.05696943 0.03029668
After mapping back to the constrained space, we see that we get the same optimised values as before (the log-likelihood is different, but that is just due to dropping of unimportant constant terms). We can check the smoothed states for this optimised model as follows.
mod = buildMod(opt$par)
ss = dlmSmooth(soi, mod)
tsplot(soi, col=4, main="Smoothed states")
lines(ss$s, col=2, lwd=1.5)
This confirms that the smoothed states for the optimised model are not very smooth. This suggests that our simple random walk model for the data is perhaps not very good.