Kalman filter is a Bayesian method to update and improve parameter estimates from observations with uncertainty. It can be applied to time series formulated as a Linear Gaussian State Space Model. The kalman filter method was used by NASA in the Apollo Project to estimate trajectories of the manned spacecraft landing on the Moon and back (wow!).

In this and the next post, I will apply Kalman filter to two time series datasets: (1) simulated GPS locations (in honour of the Apollo 11 Lunar Module), and (2) a power market electricity price dataset.

Problem Formulation

Assumption: all noises are Gaussian, in which case the Kalman filter minimises the mean square error of estimated parameters. If the noises are not Gaussian, given only the mean and standard deviation, the Kalman filter is the best linear estimator.

The State Space Model, with observations Yt and hidden states Xt:

Yt=HtXt+ϵt,ϵtN(0,Rt)Xt+1=AtXt+ηt,ηtN(0,Qt)

where Rt is the measurement error covariance matrix and Qt is the process error covariance matrix.

Using conclusions from the marginal and conditional distributions of multivariate Gaussians, and let Xt+1|t be the predicted latent state with predicted mean μt+1|t and predicted covariance Σt+1|t, the Kalman filter algorithm:

  • At time 0, initial guess is X0(μ0|0,Σ0|0)
  • At time t,
    • Prediction updates:

    μt+1|t=Atμt|tΣt+1|t=AtΣt|tAtT+Qt

    • Measurement updates:

    Kt+1=Σt+1|tHt+1T(Ht+1Σt+1|tHt+1T+Rt+1)1, the Kalman gainμt+1|t+1=μt+1|t+Kt+1(Yt+1Ht+1μt+1|t)=(IKt+1Ht+1)μt+1|t+Kt+1Yt+1, the InnovationΣt+1|t+1=(IKt+1Ht+1)Σt+1|t

Simulated Race Car Location Data

Suppose a race car is on a track with the following equations:

x=2cos(t),y=sin(3t),t0

Suppose unfortunately the car jerks when accelerating (such that the acceleration is not constant), we consider the third order derivatives of the car’s position as Gaussian variables axN(0,σx2),ayN(0,σy2).

By the Newton’s law of motion,

x(t+δt)=x(t)+δtx˙(t)+δt22x¨(t)+δt36axx˙(t+δt)=x˙(t)+δtx¨(t)+δt22axx¨(t+δt)=x¨(t)+δtax

The dynamics are similar for dimension y.

Let the car carry a GPS device to observe its current physical position, with measurement error following ϵN(0,σGPS2) on both dimensions. The observations might look like this:

Applying Newton dynamics to Kalman Filter, the state and observation vectors contain up to second order derivatives of the position (note x,y are position on the 2D plane, and X,Y are the hidden state and observation of the car’s GPS position):

X=[xx˙x¨yy˙y¨]TY=[xobsyobs]T At=[1δtδt2200001δt0000010000001δtδt2200001δt000001]

The process error matrix covariance Q is

Q=Var([δt36axδt22axδtaxδt36ayδt22ayδtay]T)=σx2uuT+σy2vvT

where u=[δt36axδt22axδtax000]T,v=[000δt36ayδt22ayδtay]T

The GPS observes hidden states directly, so

H=[100000000100]

and the measurement error covariance matrix R is

R=[σGPS200σGPS2]

At time 0, suppose the car starts at the rightmost race track location (2,0) with known velocity and acceleration, and the hidden state covariance is a relatively large value due to uncertainty.

X0=[202030]TΣ0=0.5I6×6

Here’re the results. The Kalman Filter algorithm updates the car’s predicted position as each observation comes in, which effectiveliy reduces noises and smoothes the raw GPS observations.