Reading Notes | Optimal Estimation of Dynamic Systems (Crassidis, 2011)

John L. Crassidis, and John L. Junkins, Optimal Estimation of Dynamic Systems, CRC Press, 2011.

Corrections to the book can be found at here.

Chapter 2 Probability Concepts in Least Squares

2.5. Maximum Likelihood Estimation

2.6. Properties of Maximum Likelihood Estimation

Invariance Principle

Consistent Estimator

Asymptotically Gaussian Property

Asymptotically Efficient Property

2.7. Bayesian Estimation

2.7.1. MAP Estimation

Maximum a posteriori estimation (MAP)

  • if the a priori distribution p(x^)p(\hat{\bm{x}}) is uniform, then MAP estimation is equivalent to maximum likelihood (MLL) estimation.
  • asymptotic consistency and efficiency
  • MAP estimator converges to MLL estimator for large samples
  • obeys the invariance principle

2.7.2. Minimum Risk Estimation

assume x\bm{x} is distributed by the a posteriori function p(xy~)p(\bm{x}|\tilde{\bm{y}})

assign a cost to any loss suffered due to errors in the estimate

The risk funtion is

JMR(x)=c(xx)p(xy~)dx(2.171)J_{\rm MR}(\bm{x}^*) = \int_{-\infty}^{\infty} c(\bm{x}^*|\bm{x}) p(\bm{x}|\tilde{\bm{y}}) d\bm{x} \tag{2.171}


For these reasons (see p97), minimum risk approaches are often avoided in practical estimation problems, although the relationship between decision theory and optimal estimation is very interesting.

2.8. Advanced Topics (not read yet)

2.8.1. Nonuniqueness of the Weight Matrix

2.8.2. Analysis of Covariance Errors

2.8.3. Ridge Estimation

2.8.4. Total Least Squares

3. Sequential State EStimation

3.2. Full-Order Estimators

Ackermann’s formula

3.2.1. Discrete-Time Estimators

Discrete state-space representation

xk+1=Φxk+Γukyk=Hxk+Duk(A.122)\begin{aligned} \bm{x}_{k+1} &= \Phi \bm{x}_k + \Gamma \bm{u}_k\\ \bm{y}_k &= H \bm{x}_k + D \bm{u}_k \end{aligned} \tag{A.122}

x^k+1=Φx^k++Γukx^k+=x^k+K[y~Hx^k](3.20)\begin{aligned} \hat{\bm{x}}_{k+1}^- &= \Phi \hat{\bm{x}}_k^+ + \Gamma\bm{u}_k\\ \hat{\bm{x}}_k^+ &= \hat{\bm{x}}_k^- + K [\tilde{\bm{y}}-H\hat{\bm{x}}_k^-] \end{aligned} \tag{3.20}

3.3. The Discrete-Time Kalman Filter

The Kalman filter provides a rigorous theoretical approach to “place” the poles of the estimator, based upon stochastic processes for the measurement error and model error.

  • both discrete-time dynamic models and measurements
  • both continuous-time dynamic models and measurements
  • continuous-time dynamic models with discrete-time measurements

3.3.1. Kalman Filter Derivation


xk+1=Φkxk+Γkuk+Υkwk(3.27a)\bm{x}_{k+1} = \Phi_k \bm{x}_k + \Gamma_k \bm{u}_k + \Upsilon_k \bm{w}_k \tag{3.27a}

y~k=Hkxk+vk(3.27b)\tilde{\bm{y}}_k = H_k \bm{x}_k + \bm{v}_k \tag{3.27b}

assuming the errors are not correlated forward or backward in time, so that

E{vkvjT}={0kjRkk=j(3.28)E\{ \bm{v}_k \bm{v}_j^T \} = \begin{cases} 0 & k \neq j \\ R_k & k = j \end{cases} \tag{3.28}


E{wkwjT}={0kjQkk=j(3.29)E\{ \bm{w}_k \bm{w}_j^T \} = \begin{cases} 0 & k \neq j \\ Q_k & k = j \end{cases} \tag{3.29}

x~k+1=Φkx~k+Υkwk(3.33)\tilde{\bm{x}}_{k+1}^- = \Phi_k \tilde{\bm{x}}_k^+ - \Upsilon_k \bm{w}_k \tag{3.33}

Pk+1E{x~k+1x~k+1T}=E{Φkx~k+x~k+TΦkT}+E{ΥkwkwkTΥkT}=E{Φkx~k+wkTΥkT}E{Υkwkx~k+TΦkT}(3.34)\newcommand{\xm}[1]{\tilde{\bm{x}}_{ #1}^{-}} \newcommand{\xmT}[1]{\tilde{\bm{x}}_{ #1}^{-T}} \newcommand{\xp}[1]{\tilde{\bm{x}}_{ #1}^{+}} \newcommand{\xpT}[1]{\tilde{\bm{x}}_{ #1}^{+T}} \newcommand{\wk}{\bm{w}_k} \newcommand{\wkT}{\bm{w}_k^T} \newcommand{\E}[1]{E\{ #1 \}} \begin{aligned} P_{k+1}^- &\equiv E\{ \xm{k+1} \xmT{k+1} \} \\ &= \E{ \Phi_k \xp{k}\xpT{k} \Phi_k^T } + \E{\Upsilon_k\wk\wkT\Upsilon_k^T} \\ &\phantom{=} - \E{\Phi_k\xp{k}\wkT\Upsilon_k^T} - \E{\Upsilon_k\wk\xpT{k}\Phi_k^T} \\ \end{aligned} \tag{3.34}

where wk+\bm{w}_k^+ and x~k+\tilde{\bm{x}}_k^+ are uncorrelated since x~k+1\tilde{\bm{x}}_{k+1}^- (not x~k\tilde{\bm{x}}_k) directly depends on wk\bm{w}_k.

Pk+1=ΦkPk+ΦkTΥkQkΥkT(3.35)P_{k+1}^- = \Phi_k P_k^+ \Phi_k^T - \Upsilon_k Q_k \Upsilon_k^T \tag{3.35}

x~k+=(IKkHk)x~k+Kkvk(3.37)\tilde{\bm{x}}_k^+ = (I - K_k H_k) \tilde{\bm{x}}_k^- + K_k \bm{v}_k \tag{3.37}

Pk+E{x~k+x~k+T}=E{(IKkHk)x~kx~kT(IKkHk)T}+E{KkvkvkTKkT}=+E{(IKkHk)x~kvkTKkT}+E{Kkvkx~kT(IKkHk)T}(3.38)\newcommand{\xm}[1]{\tilde{\bm{x}}_{ #1}^{-}} \newcommand{\xmT}[1]{\tilde{\bm{x}}_{ #1}^{-T}} \newcommand{\xp}[1]{\tilde{\bm{x}}_{ #1}^{+}} \newcommand{\xpT}[1]{\tilde{\bm{x}}_{ #1}^{+T}} \newcommand{\wk}{\bm{w}_k} \newcommand{\wkT}{\bm{w}_k^T} \newcommand{\E}[1]{E\{ #1 \}} \begin{aligned} P_k^+ &\equiv \E{ \xp{k}\xpT{k} } \\ &= \E{ (I - K_k H_k) \xm{k} \xmT{k} (I - K_k H_k)^T } + \E{ K_k\bm{v}_k\bm{v}_k^TK_k^T } \\ &\phantom{=} + \E{(I - K_k H_k) \xm{k} \bm{v}_k^T K_k^T} + \E{ K_k \bm{v}_k \xmT{k} (I - K_k H_k)^T } \end{aligned} \tag{3.38}

3.3.2 Stability and Joseph’s Form

Another form for Pk+1P_{k+1} that ensures to be positive definite:


Pk+=[IKkHk]Pk[IkkHk]T+KkRkKkT(3.39)P_k^+ = [I-K_kH_k]P_k^-[I-k_kH_k]^T + K_kR_kK_k^T \tag{3.39}

instead of

Pk+=[IKkHk]Pk(3.44)P_k^+ = [I-K_kH_k]P_k^- \tag{3.44}

Proof: see p149–151.

3.3.3. Information Filter and Sequential Processing

handle large measurement vectors

The information filter may be more computationally efficient than the traditional Kalman filter when the size of the measurement vector is much larger than the size of the state vector.

Another more commonly used approach is to use sequential processing.
processing one measurement at a time, then the tain and covariance are updated until all measurements at each sampling instant have been processed.

3.3.4. Steady-State Kalman Filter

For time-invariant systems the error covariance PP reaches a steady-state value very quickly.
Therefore, a constant gain KK can be pre-computed using the steady-state covariance.
The solution is suboptimal in the strictest sense but can save a lot in computations.

3.3.5. Relationship to Least Squares Estimation

The Kalman filter can be derived using a least squares type loss function

J=12(x^0x0)TPa0(x^0x0)+12i=1k(y~Hix^i)TRi1(y~Hix^i)(3.109)J = \frac{1}{2}(\hat{\bm{x}}_0-\bm{x}_0)^T\mathcal{Pa}_0(\hat{\bm{x}}_0-\bm{x}_0) + \frac{1}{2} \sum_{i=1}^k (\tilde{\bm{y}}-H_i\hat{\bm{x}}_i)^T R_i^{-1} (\tilde{\bm{y}}-H_i\hat{\bm{x}}_i) \tag{3.109}

subject to the constraint

x^i+1=Φ(i+1,i)x^i,i=1,2,,k1(3.110)\hat{\bm{x}}_{i+1} = \Phi(i+1,i)\hat{\bm{x}}_i,\quad i=1,2,\dots,k-1 \tag{3.110}

3.3.6. Correlated Measurement and Process Noise

Consider the correlation between wk1\bm{w}_{k-1} and vk\bm{v}_k

E{wk1vkT}=Sk(3.127)E\{ \bm{w}_{k-1} \bm{v}_k^T \} = S_k \tag{3.127}

because only wk1\bm{w}_{k-1} and vk\bm{v}_k affect the measurement at the same time (3.27a substituted into 3.27b), as shown below,

y~k=Hkxk+vk=Hk(Φk1xk1+Γk1uk1+Υk1wk1)+vk.\begin{aligned} \tilde{\bm{y}}_k &= H_k \bm{x}_k + \bm{v}_k \\ &= H_k \left( \Phi_{k-1} \bm{x}_{k-1} + \Gamma_{k-1} \bm{u}_{k-1} + \Upsilon_{k-1} \bm{w}_{k-1} \right) + \bm{v}_k. \end{aligned}

In Eq. (3.38),

Pk+E{x~k+x~k+T}=E{(IKkHk)x~kx~kT(IKkHk)T}+E{KkvkvkTKkT}=+E{(IKkHk)x~kvkTKkT}+E{Kkvkx~kT(IKkHk)T},(3.38)\newcommand{\xm}[1]{\tilde{\bm{x}}_{ #1}^{-}} \newcommand{\xmT}[1]{\tilde{\bm{x}}_{ #1}^{-T}} \newcommand{\xp}[1]{\tilde{\bm{x}}_{ #1}^{+}} \newcommand{\xpT}[1]{\tilde{\bm{x}}_{ #1}^{+T}} \newcommand{\wk}{\bm{w}_k} \newcommand{\wkT}{\bm{w}_k^T} \newcommand{\E}[1]{E\{ #1 \}} \begin{aligned} P_k^+ &\equiv \E{ \xp{k}\xpT{k} } \\ &= \E{ (I - K_k H_k) \xm{k} \xmT{k} (I - K_k H_k)^T } + \E{ K_k\bm{v}_k\bm{v}_k^TK_k^T } \\ &\phantom{=} + \E{(I - K_k H_k) \xm{k} \bm{v}_k^T K_k^T} + \E{ K_k \bm{v}_k \xmT{k} (I - K_k H_k)^T }, \end{aligned} \tag{3.38}

the following expectation is not zero:

E{x~kvkT}=E{(Φk1x~k1+Υk1wk1)vkT}=E{Υk1wk1vkT}=Υk1Sk(3.128)E\{\tilde{\bm{x}}_k^-\bm{v}_k^T\} = E\{ \left( \Phi_{k-1}\tilde{\bm{x}}_{k-1}^+ - \Upsilon_{k-1}\bm{w}_{k-1} \right) \bm{v}_k^T \} = - E\{ \Upsilon_{k-1}\bm{w}_{k-1} \bm{v}_k^T \} = - \Upsilon_{k-1} S_k \tag{3.128}

This leads to different covariance update

Pk+=(Only use its trace to derive Kk)(3.129)P_{k}^+ = \dots \quad \color{red} (\text{Only use its trace to derive } K_k) \tag{3.129}

and Kalman gain (still by minimizing the above Pk+P_{k}^+)

Kk=(3.130)K_k = \dots \tag{3.130}

which could simplify Eq. (3.129) to the final covariance update formula

Pk+=(Use this one for propagation)P_k^+ = \dots \quad \color{red} (\text{Use this one for propagation})

(AN APPLICATION EXAMPLE, refined by me) an aircraft flying through a field of random turbulence. The effect of turbulence in the aircraft’s acceleration is modeled as process noise on wk1\bm{w}_{k-1}. Since any sensor mounted on an aircraft is also corrupted by turbulence, the measurement error vk\bm{v}_k is correlated with the process noise wk1\bm{w}_{k-1}.

3.3.7. Cramer-Rao Lower Bound

see wikipedia:
In its simplest form, the bound states that the variance of any unbiased estimator is at least as high as the inverse of the Fisher information.\

3.3.8. Orthogonality Principle

the orthogonality of the estimate and its error

E{x^k+x~k+T}=0(3.152)E\{\hat{\bm{x}}_k^+\tilde{\bm{x}}_k^{+T}\} = 0 \tag{3.152}


x~k+x^k+xk(3.32)\tilde{\bm{x}}_k^+ \equiv \hat{\bm{x}}_k^+ - \bm{x}_k \tag{3.32}

This orthogonality principle is extremely important in the derivation of the linear quadratic-Gaussian controller of Sec. 8.6.

Example 3.3: derive the process noise

For the real system (where there is always a random process term):

Continuous system:

θ˙=ω~βηvβ˙=ηu\begin{aligned} \dot{\theta} &= \tilde{\omega} - \beta - \eta_v \\ \dot{\beta} &= \eta_u \end{aligned}

Discrete system:

Let Δt=tk+1tk\Delta t = t_{k+1} - t_{k}.

θk+1θk=tktk+1[ω~βηv]dτ=tktk+1ω~dτtktk+1β(τ)dτtktk+1ηvdτ=tktk+1ω~dτ[βkΔt+tktk+1(tk+1τ)ηu(τ)dτ]tktk+1ηvdτ[see Eq. eg.3.3.x1]=tktk+1ω~dτβkΔttktk+1[ηv(τ)+(tk+1τ)ηu(τ)]dτβk+1βk=tktk+1ηudτβ(τ)=βk+tkτηu(ν)dν\begin{aligned} \theta_{k+1} - \theta_{k} &= \int_{t_k}^{t_{k+1}} [\tilde{\omega} - \beta - \eta_v] d\tau \\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau - \int_{t_k}^{t_{k+1}} \textcolor{red}{\beta(\tau)} d\tau - \int_{t_k}^{t_{k+1}} \eta_v d\tau\\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau - \left[ \beta_k \Delta t + \int_{t_k}^{t_{k+1}} (t_{k+1} - \tau) \eta_u(\tau)d\tau \right] - \int_{t_k}^{t_{k+1}} \eta_v d\tau \quad \text{[see Eq. eg.3.3.x1]}\\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau - \beta_k \Delta t - \int_{t_k}^{t_{k+1}} \left[ \eta_v(\tau) + (t_{k+1} - \tau)\eta_u(\tau) \right] d\tau \\ \\ \beta_{k+1} - \beta_{k} &= \int_{t_k}^{t_{k+1}} \eta_u d\tau \quad\Rightarrow\quad \textcolor{red}{\beta(\tau) = \beta_k + \int_{t_k}^{\tau} \eta_u(\nu) d\nu} \\ \end{aligned}

tktk+1β(τ)dτ=tktk+1[βk+tkτηu(ν)dν]dτ=tktk+1βkdτ+tktk+1tkτηu(τ)dτdτ=βkΔt+tktk+1tkτηu(ν)dνdτ=βkΔt+tktk+1(tk+1τ)ηu(τ)dτ[see Eq. eg.3.3.x2](eg.3.3.x1)\begin{aligned} \int_{t_k}^{t_{k+1}} \textcolor{red}{\beta(\tau)} d\tau &= \int_{t_k}^{t_{k+1}} \left[ \textcolor{red}{\beta_k + \int_{t_k}^{\tau} \eta_u(\nu) d\nu} \right] d\tau \\ &= \int_{t_k}^{t_{k+1}} \textcolor{red}{\beta_k} d\tau + \int_{t_k}^{t_{k+1}} \textcolor{red}{\int_{t_k}^{\tau} \eta_u(\tau) d\tau} d\tau \\ &= \textcolor{red}{\beta_k} \Delta t + \textcolor{magenta}{\int_{t_k}^{t_{k+1}} \int_{t_k}^{\tau} \eta_u(\nu) d\nu d\tau }\\ &= \beta_k \Delta t + \textcolor{magenta}{\int_{t_k}^{t_{k+1}} (t_{k+1} - \tau) \eta_u(\tau) d\tau } \quad \text{[see Eq. eg.3.3.x2]}\\ \end{aligned} \tag{eg.3.3.x1}

tktk+1[tkτηu(ν)dν]dτ=tktk+1[[τtkτηu(ν)dν]τηu(τ)]dτ[see Eq. eg.3.3.x3]=tktk+1[τtkτηu(ν)dν]dτtktk+1τηu(τ)dτ=tk+1tktk+1ηu(ν)dνtktktkηu(ν)dνtktk+1τηu(τ)dτ=tktk+1(tk+1τ)ηu(τ)dτ(eg.3.3.x2)\begin{aligned} \textcolor{magenta}{\int_{t_k}^{t_{k+1}} \left[ \int_{t_k}^{\tau} \eta_u(\nu) d\nu \right] d\tau} &= \int_{t_k}^{t_{k+1}} \left[ \left[ \tau \int_{t_k}^\tau \eta_u(\nu)d\nu \right]' - \tau \eta_u(\tau) \right] d\tau \quad \text{[see Eq. eg.3.3.x3]}\\ &= \int_{t_k}^{t_{k+1}} \left[ \tau \int_{t_k}^\tau \eta_u(\nu)d\nu \right]' d\tau - \int_{t_k}^{t_{k+1}} \tau \eta_u(\tau) d\tau \\ &= t_{k+1} \int_{t_k}^{t_{k+1}} \eta_u(\nu)d\nu - \cancel{t_k \int_{t_k}^{t_k} \eta_u(\nu)d\nu} - \int_{t_k}^{t_{k+1}} \tau \eta_u(\tau) d\tau \\ &= \textcolor{magenta}{\int_{t_k}^{t_{k+1}} (t_{k+1} - \tau) \eta_u(\tau) d\tau} \end{aligned} \tag{eg.3.3.x2}

[τtkτηu(ν)dν]=tkτηu(ν)dν+τηu(τ)[integration by parts](eg.3.3.x3)\left[ \tau \int_{t_k}^\tau \eta_u(\nu)d\nu \right]' = \textcolor{red}{\int_{t_k}^\tau \eta_u(\nu)d\nu} + \tau \eta_u(\tau) \quad \text{[integration by parts]} \tag{eg.3.3.x3}

[Keep for records. Only after using the following series derivation, I realized I should have used integration by parts.]

Assuming {τ1,τ2,,τn}\{ \tau_1, \tau_2, \cdots, \tau_n \}, where τ1=tk\tau_1 = t_k and τn=tk+1\tau_n = t_{k+1}, and νi=τi\nu_i=\tau_i.

tktk+1tkτηu(ν)dνdτ=limni=1nj=1iηu(νj)δνδτ=limn{ηu(ν1)=+ηu(ν1)+ηu(ν2)=+ηu(ν1)+ηu(ν2)+ηu(ν3)=+=+ηu(ν1)+ηu(ν2)+ηu(ν3)++ηu(νn)}δνδτ=limni=1n(ni+1)ηu(νi)δνδτ=limni=1n(ni+1)δτηu(νi)δν=limni=1n(tk+1τi)ηu(νi)δν=tktk+1(tk+1τ)ηu(τ)dτ\begin{aligned} \int_{t_k}^{t_{k+1}} \int_{t_k}^{\tau} \eta_u(\nu) d\nu d\tau &= \lim_{n\rightarrow\infin} \sum_{i = 1}^{n} \sum_{j=1}^{i} \eta_u(\nu_j) \delta\nu \delta\tau \\ &= \lim_{n\rightarrow\infin} \{ \eta_u(\nu_1) \\ &\phantom{=}+ \eta_u(\nu_1) + \eta_u(\nu_2) \\ &\phantom{=}+ \eta_u(\nu_1) + \eta_u(\nu_2) + \eta_u(\nu_3) \\ &\phantom{=}+ \cdots \\ &\phantom{=}+ \eta_u(\nu_1) + \eta_u(\nu_2) + \eta_u(\nu_3) + \cdots + \eta_u(\nu_n) \} \cdot \delta\nu \delta\tau \\ &= \lim_{n\rightarrow\infin} \sum_{i=1}^{n} (n-i+1) \eta_u(\nu_i) \delta\nu \delta\tau \\ &= \lim_{n\rightarrow\infin} \sum_{i=1}^{n} (n-i+1)\delta\tau \eta_u(\nu_i)\delta\nu \\ &= \lim_{n\rightarrow\infin} \sum_{i=1}^{n} (t_{k+1}-\tau_i) \eta_u(\nu_i)\delta\nu \\ &= \int_{t_k}^{t_{k+1}} (t_{k+1}-\tau) \eta_u(\tau) d\tau \end{aligned}

For the estimation system:

Continuous system:

θ^˙=ω~β^β^˙=0\begin{aligned} \dot{\hat{\theta}} &= \tilde{\omega} - \hat{\beta} \\ \dot{\hat{\beta}} &= 0 \end{aligned}

Discrete system:

θ^k+1θ^k=tktk+1θ^˙(τ)dτ=tktk+1[ω~(τ)β^(τ)]dτ=tktk+1ω~(τ)dτβ^kΔtβ^k+1β^k=0\begin{aligned} \hat{\theta}_{k+1} - \hat{\theta}_k &= \int_{t_k}^{t_{k+1}} \dot{\hat{\theta}}(\tau) d\tau \\ &= \int_{t_k}^{t_{k+1}} [\tilde{\omega}(\tau) - \hat{\beta}(\tau)] d\tau \\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega}(\tau) d\tau - \hat{\beta}_k \Delta t \\ \hat{\beta}_{k+1} - \hat{\beta}_k &= 0\\ \end{aligned}

Then, we have the final discrete-time error propagation equation

θk+1θ^k+1=θkθ^k=+{tktk+1ω~dτβkΔttktk+1[ηv(τ)+(tk+1τ)ηu(τ)]dτ}={tktk+1ω~dτβ^kΔt}=θkθ^k(βkβ^k)Δt+tktk+1[ηv(τ)(tk+1τ)ηu(τ)]dτθkθ^k(βkβ^k)Δt+pkβk+1β^k+1=βkβ^k+tktk+1ηudτβkβ^k+qk\begin{aligned} \theta_{k+1} - \hat{\theta}_{k+1} &= \theta_k - \hat{\theta}_k \\ &\phantom{=} + \left\{ \cancel{\int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau} - \beta_k \Delta t - \int_{t_k}^{t_{k+1}} \left[ \eta_v(\tau) + (t_{k+1} - \tau)\eta_u(\tau) \right] d\tau \right\}\\ &\phantom{=} - \left\{ \cancel{\int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau} - \hat{\beta}_k \Delta t \right\} \\ &= \theta_k - \hat{\theta}_k - (\beta_k - \hat{\beta}_k) \Delta t + \textcolor{magenta}{\int_{t_k}^{t_{k+1}} \left[ - \eta_v(\tau) - (t_{k+1} - \tau)\eta_u(\tau) \right] d\tau} \\ &\equiv \theta_k - \hat{\theta}_k - (\beta_k - \hat{\beta}_k) \Delta t + \textcolor{magenta}{p_k} \\ \beta_{k+1} - \hat{\beta}_{k+1} &= \beta_{k} - \hat{\beta}_{k} + \textcolor{magenta}{\int_{t_k}^{t_{k+1}} \eta_u d\tau} \\ &\equiv \beta_{k} - \hat{\beta}_{k} + \textcolor{magenta}{q_k} \end{aligned}

The discrete process noise is

Q=[Q11Q12Q21Q22][E{pk2}E{pkqk}E{qkpk}E{qk2}]\newcommand{\E}[1]{E\left\{ #1\right\}} Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix} \equiv \begin{bmatrix} \E{p_k^2} & \E{p_kq_k} \\ \E{q_kp_k} & \E{q_k^2} \end{bmatrix}

E{pk2}=E{(tktk+1[ηv(τ)(tk+1τ)ηu(τ)]dτ)2}=E{tktk+1tktk+1[ηv(ν)(tk+1ν)ηu(ν)][ηv(τ)(tk+1τ)ηu(τ)]dνdτ}(Paused) (Expand the red integrand)[ηv(ν)+(tk+1ν)ηu(ν)][ηv(τ)+(tk+1τ)ηu(τ)]=[ηv(ν)+(tk+1ν)ηu(ν)]ηv(τ)+[ηv(ν)+(tk+1ν)ηu(ν)](tk+1τ)ηu(τ)=ηv(ν)ηv(τ)+(tk+1ν)ηv(τ)ηu(ν)+(tk+1τ)ηv(ν)ηu(τ)+(tk+1ν)(tk+1τ)ηu(ν)ηu(τ)(Integrate each expanded integrand)tktk+1ηv(ν)ηv(τ)dν=ηv2(τ)tktk+1(tk+1ν)ηv(τ)ηu(ν)dν=(tk+1τ)ηv(τ)ηu(τ)=0[σv and σu uncorrelated?]tktk+1(tk+1τ)ηv(ν)ηu(τ)dν=(tk+1τ)ηv(τ)ηu(τ)=0[σv and σu uncorrelated?]tktk+1(tk+1ν)(tk+1τ)ηu(ν)ηu(τ)dν=(tk+1τ)2νu2(τ)(continued)=tktk+1E{ηv2(τ)}dτ+tktk+1(tk+1τ)2E{ηu2(τ)}dτ=tktk+1E{ηv2(τ)}dτ+σu2tktk+1(tk+1τ)2dτ=σv2Δt+σu2(tk+1τ)33tktk+1=σv2Δt+σu2Δt33\newcommand{\E}[1]{E\left\{ #1\right\}} \newcommand{\I}{\int_{t_k}^{t_{k+1}}} \begin{aligned} \E{p_k^2} &= \E{ \left( \I \left[ - \eta_v(\tau) - (t_{k+1} - \tau)\eta_u(\tau) \right] d\tau \right)^2 } \\ &= \E{ \I \textcolor{red}{\I \left[ - \eta_v(\nu) - (t_{k+1} - \nu)\eta_u(\nu) \right] \left[ - \eta_v(\tau) - (t_{k+1} - \tau)\eta_u(\tau) \right] d\nu} d\tau } \\ &-------- \text{(Paused) (Expand the red integrand)} \\ & \left[ \eta_v(\nu) +(t_{k+1} - \nu)\eta_u(\nu) \right] \left[ \eta_v(\tau) + (t_{k+1} - \tau)\eta_u(\tau) \right] \\ &= \left[ \eta_v(\nu) +(t_{k+1} - \nu)\eta_u(\nu) \right] \eta_v(\tau) + \left[ \eta_v(\nu) +(t_{k+1} - \nu)\eta_u(\nu) \right] (t_{k+1} - \tau)\eta_u(\tau)\\ &= \eta_v(\nu)\eta_v(\tau) + (t_{k+1} - \nu)\eta_v(\tau)\eta_u(\nu) + (t_{k+1} - \tau)\eta_v(\nu)\eta_u(\tau) + (t_{k+1} - \nu)(t_{k+1} - \tau)\eta_u(\nu)\eta_u(\tau) \\ &-------- \text{(Integrate each expanded integrand)} \\ &\I \eta_v(\nu)\eta_v(\tau) d\nu = \eta_v^2(\tau)\\ &\I (t_{k+1} - \nu)\eta_v(\tau)\eta_u(\nu) d\nu = (t_{k+1} - \tau)\eta_v(\tau)\eta_u(\tau) = 0 \quad[\sigma_v \text{ and } \sigma_u \text{ uncorrelated?}] \\ &\I (t_{k+1} - \tau)\eta_v(\nu)\eta_u(\tau) d\nu = (t_{k+1} - \tau)\eta_v(\tau)\eta_u(\tau) = 0 \quad[\sigma_v \text{ and } \sigma_u \text{ uncorrelated?}]\\ &\I (t_{k+1} - \nu)(t_{k+1} - \tau)\eta_u(\nu)\eta_u(\tau) d\nu = (t_{k+1} - \tau)^2\nu_u^2(\tau) \\ &-------- \text{(continued)}\\ &= \I \E{ \eta_v^2(\tau) } d\tau + \I (t_{k+1}-\tau)^2\E{ \eta_u^2(\tau) } d\tau \\ &= \I \E{ \eta_v^2(\tau) } d\tau + \sigma_u^2 \I (t_{k+1}-\tau)^2 d\tau \\ &= \sigma_v^2\Delta t + \sigma_u^2 \left. \frac{-(t_{k+1}-\tau)^3}{3} \right|_{t_k}^{t_{k+1}} \\ &= \sigma_v^2\Delta t + \sigma_u^2\frac{\Delta t^3}{3} \\ \end{aligned}

E{pkqk}=E{(tktk+1[ηv(τ)(tk+1τ)ηu(τ)]dτ)(tktk+1ηudτ)}=E{tktk+1tktk+1[ηv(τ)(tk+1τ)ηu(τ)]ηu(ν)dνdτ}=E{tktk+1tktk+1[ηv(τ)ηu(ν)+(tk+1τ)ηu(τ)ηu(ν)]dνdτ}=E{tktk+1ηv(τ)ηu(τ)dτ+tktk+1(tk+1τ)ηu2(τ)dτ}=0tktk+1(tk+1τ)E{ηu2(τ)}dτ=σu2(tk+1τ)22tktk+1=σu2Δt22\newcommand{\E}[1]{E\left\{ #1\right\}} \newcommand{\I}{\int_{t_k}^{t_{k+1}}} \begin{aligned} \E{p_k q_k} &= \E{ \left( \I \left[ - \eta_v(\tau) - (t_{k+1} - \tau)\eta_u(\tau) \right] d\tau \right) \left( \int_{t_k}^{t_{k+1}} \eta_u d\tau \right) } \\ &= \E{ \I \I \left[ - \eta_v(\tau) - (t_{k+1} - \tau)\eta_u(\tau) \right] \eta_u(\nu) d\nu d\tau } \\ &= - \E{ \I \I \left[\eta_v(\tau)\eta_u(\nu) + (t_{k+1} - \tau)\eta_u(\tau)\eta_u(\nu) \right] d\nu d\tau } \\ &= -\E{ \I \eta_v(\tau)\eta_u(\tau) d\tau + \I (t_{k+1} - \tau)\eta_u^2(\tau) d\tau } \\ &= 0 - \I (t_{k+1} - \tau)\E{\eta_u^2(\tau)} d\tau \\ &= \sigma_u^2 \left. \frac{(t_{k+1} - \tau)^2}{2} \right|_{t_k}^{t_{k+1}} = - \sigma_u^2 \frac{\Delta t^2}{2} \end{aligned}

E{qk2}=E{(tktk+1ηudτ)2}=tktk+1E{ηu2}dτ=σu2Δt\newcommand{\E}[1]{E\left\{ #1\right\}} \newcommand{\I}{\int_{t_k}^{t_{k+1}}} \begin{aligned} \E{q_k^2} &= \E{ \left( \int_{t_k}^{t_{k+1}} \eta_u d\tau \right)^2 } \\ &= \I \E{\eta_u^2} d\tau = \sigma_u^2 \Delta t \end{aligned}

MY-QUESTION-1: If bias is not estimated (Not in the book, double check this!)

For the real system:

Continuous system:

θ˙=ω~βMLηv\begin{aligned} \dot{\theta} &= \tilde{\omega} - \beta_\text{ML} - \eta_v \\ \end{aligned}

Discrete system:

Let Δt=tk+1tk\Delta t = t_{k+1} - t_{k}.

θk+1θk=tktk+1[ω~βMLηv]dτ=tktk+1ω~dτtktk+1βML(τ)dτtktk+1ηvdτ=tktk+1ω~dτβML,kΔttktk+1ηvdτ\begin{aligned} \theta_{k+1} - \theta_{k} &= \int_{t_k}^{t_{k+1}} [\tilde{\omega} - \beta_\text{ML} - \eta_v] d\tau \\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau - \int_{t_k}^{t_{k+1}} \beta_\text{ML}(\tau) d\tau - \int_{t_k}^{t_{k+1}} \eta_v d\tau\\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau - \beta_{\text{ML},k} \Delta t - \int_{t_k}^{t_{k+1}} \eta_v d\tau \end{aligned}

For the estimation system:

Continuous system:

θ^˙=ω~βML\dot{\hat{\theta}} = \tilde{\omega} - \beta_\text{ML}

Discrete system:

θ^k+1θ^k=tktk+1θ^˙(τ)dτ=tktk+1[ω~(τ)βML(τ)]dτ=tktk+1ω~(τ)dτβML,kΔt\begin{aligned} \hat{\theta}_{k+1} - \hat{\theta}_k &= \int_{t_k}^{t_{k+1}} \dot{\hat{\theta}}(\tau) d\tau \\ &= \int_{t_k}^{t_{k+1}} \left[ \tilde{\omega}(\tau) - \beta_\text{ML}(\tau) \right] d\tau \\ &= \int_{t_k}^{t_{k+1}} \tilde{\omega}(\tau) d\tau - \beta_{\text{ML},k} \Delta t \end{aligned}

Then, we have the final discrete-time error propagation equation

θk+1θ^k+1=θkθ^k=+{tktk+1ω~dτβML,kΔttktk+1ηv(τ)dτ}={tktk+1ω~dτβML,kΔt}=θkθ^k+tktk+1ηv(τ)dτθkθ^k+pk\begin{aligned} \theta_{k+1} - \hat{\theta}_{k+1} &= \theta_k - \hat{\theta}_k \\ &\phantom{=} + \left\{ \cancel{\int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau} - \bcancel{\beta_{\text{ML},k} \Delta t} - \int_{t_k}^{t_{k+1}} \eta_v(\tau) d\tau \right\}\\ &\phantom{=} - \left\{ \cancel{\int_{t_k}^{t_{k+1}} \tilde{\omega} d\tau} - \bcancel{\beta_{\text{ML},k} \Delta t} \right\} \\ &= \theta_k - \hat{\theta}_k + \textcolor{magenta}{\int_{t_k}^{t_{k+1}} - \eta_v(\tau) d\tau} \\ &\equiv \theta_k - \hat{\theta}_k + \textcolor{magenta}{p_k} \end{aligned}

E{pk2}=E{(tktk+1ηv(τ)dτ)2}=E{tktk+1tktk+1ηv(ν)ηv(τ)dνdτ}=E{tktk+1ηv2(τ)dτ}=tktk+1E{ηv2(τ)}dτ=σv2Δt\newcommand{\E}[1]{E\left\{ #1\right\}} \newcommand{\I}{\int_{t_k}^{t_{k+1}}} \begin{aligned} \E{p_k^2} &= \E{ \left( \I - \eta_v(\tau) d\tau \right)^2 } \\ &= \E{ \I \textcolor{red}{\I \eta_v(\nu) \eta_v(\tau) d\nu} d\tau } \\ &= \E{ \I \textcolor{red}{\eta_v^2(\tau)} d\tau } \\ &= \I \E{ \eta_v^2(\tau) } d\tau \\ &= \sigma_v^2\Delta t \end{aligned}

MY-QUESTION-2: Other process noise?

The above derivations only considered the process noise due to

  1. the difference between the real system model and the estimation system model;
  2. the time discretization.

So, it doesn’t consider the process noise due to other differences to the reality (not the real system model). If so, do we need an additional term to compensate for additional uncertainties?
(Is my speculation correct?) Maybe not, because the real system model is also a model of the reality.

Can I just simply increase σv\sigma_v and σu\sigma_u to reflect other model errors? Because the real system model is an approximation of the reality and the last term ηv\bm{\eta}_v should include or absorb all the differences to reality. Similar for σu\sigma_u.
(Is this speculation correct?)

3.4. The Continuous-Time Kalman Filter

3.5. The Continuous-Discrete Kalman Filter

x˙(t)=F(t)x(t)+B(t)u(t)+G(t)w(t)y~k=Hkxk+vk\begin{aligned} \dot{\bm{x}}(t) &= F(t)\bm{x}(t) + B(t)\bm{u}(t) + G(t)\bm{w}(t)\\ \tilde{\bm{y}}_k &= H_k\bm{x}_k + \bm{v}_k \end{aligned}


The continuous-time propagation model equation does not involve the measurement directly. The covariance propagation follows a continuous-time Lyapunov differential equation.

the sample times of the measurements need not occur in regular intervals. In fact, different measurement sets can be spread out over various time intervals.

The real beauty of the continuous-discreteKalman filter is that it can handle different scattered measurement sets quite easily.


  • During the propagation, how is G(t)Q(t)G(t)TG(t)Q(t)G(t)^T handled?
    • This is just a normal term.
    • G(t)wtG(t)\bm{w}_t shall be handled as a random term.
  • How is it handled in Orekit?

3.6. Extended Kalman Filter

The problem with this nonlinear model is that a Gaussian input does not necessarily produce a Gaussian output (unlike the linear case).

The fundamental concept of this filter involves the notion that the true state is sufficiently close to the estimated state. Therefore, the error dynamics can be represented fairly accurately by a linearized first-order Taylor series expansion.


A steady-state gain cannot be found, because F(t)F(t) and H(t)H(t) in Table 3.8 will not be constant in general.

Another approach involves linearizing about the normal (a priori) state vector xˉ(t)\bar{\bm{x}}(t) instead of the current estiamte x^(t)\hat{\bm{x}}(t).

This gives the linearized Kalman filter. In general, the linearized Kalman filter is less accurate than the extended Kalman filter since xˉ(t)\bar{\bm{x}}(t) is usually not as close to the truth as is x^(t)\hat{\bm{x}}(t). However, since the nominal state is known a priori the gain K(t)K(t) can be pre-computed and stored, which reduces the on-line computational burden.


Proving convergence in the extended Kalman filter is difficult (if not impossible!) even for simple systems where the initial condition is not well known.
But it is often robust to initial condition errors, which can often be verified through simulation.

iterated extended Kalman filter:
local iteration at step kk, until no longer improved.

nonlinear smoother back to time tk1t_{k-1} (see Sec. 5.1.3):
The reference trajectory over [tk1,tk)[t_{k-1},t_k) can also be improved once the measurement y~k\tilde{\bm{y}}_k is given.

3.7. Unscented Filtering (not needed for now)

3.8. Constrained Filtering (not needed for now)

3.9. Summary

All important formulas are summarized here.

4. Advanced Topics in Sequential State Estimation ()

5. Batch State Estimation (to read later)

Smoothing ??


5.1.3. (utilized in Sec. 7.4., to read)

6. Parameter Estimation: Applications (to read now)

6.2. Global positioning System Navigation

6.4. Orbit Determination

Gaussian Least Squares Differential Correction (GLSDC)

??(6.53)?? \tag{6.53}

??(6.54)?? \tag{6.54}

Ffx=[03×3I3×3F2103×3](6.57)F \equiv \frac{\partial\bm{f}}{\partial\bm{x}} = \begin{bmatrix}0_{3\times3}&I_{3\times3}\\F_{21}&0_{3\times3}\end{bmatrix} \tag{6.57}

hx=[H1103×3]\frac{\partial \bm{h}}{\partial\bm{x}} = \begin{bmatrix}H_{11}&0_{3\times3}\end{bmatrix}

7. Estimation of Dynamic Systems: Applications

7.1. Attitude Estimation

q˙=12Ω(ω)q(7.1)\dot{\mathbf{q}} = \frac{1}{2} \Omega(\bm{\omega}) \mathbf{q} \tag{7.1}

7.1.2. Discrete-Time Attitude Estimation

The following approach is named “power series approach”, which is used to derive a discrete approximation to the continuous propagation. Here, the example in the book is for the discrete quaternion propagation of the kinematic equation

q˙=12Ω(ω)q(7.1)\dot{\mathbf{q}} = \frac{1}{2} \Omega(\bm{\omega}) \mathbf{q} \tag{7.1}

Ω(ω^)=[[ω^×]ω^ω^T0]=[0w3w2w1w30w1w2w2w10w3w1w2w30]\newcommand{\ww}[0]{\hat{\bm{\omega}}} \newcommand{\IwI}[0]{\|\hat{\bm{\omega}}\|} \newcommand{\OO}[0]{\Omega} \OO(\ww) = \begin{bmatrix} -[\ww\times] & \ww \\ -\ww^T & 0 \end{bmatrix} = \begin{bmatrix} 0 & -w3 & w2 & w1 \\ w3 & 0 & -w1 & w2 \\ -w2 & w1 & 0 & w3 \\ -w1 & -w2 & -w3 & 0 \end{bmatrix}

Ω2(ω^)=ω^2I4×4Ω3(ω^)=ω^2Ω(ω^)Ω2k(ω^)=(1)kω^2kI4×4(k0,Ω0=I4×4 does satisfy.)Ω2k+1(ω^)=(1)kω^2kΩ(ω^)(7.36)\newcommand{\ww}[0]{\hat{\bm{\omega}}} \newcommand{\IwI}[0]{\|\hat{\bm{\omega}}\|} \newcommand{\OO}[0]{\Omega} \begin{aligned} \OO^2(\ww) &= -\IwI^2 I_{4\times4} \\ \OO^3(\ww) &= -\IwI^2 \OO(\ww) \\ &\cdots \\ \OO^{2k}(\ww) &= (-1)^k\IwI^{2k} I_{4\times4} \quad(k\geq0, \textcolor{blue}{\OO^0=I_{4\times4} \text{ does satisfy.}})\\ \OO^{2k+1}(\ww) &= (-1)^k\IwI^{2k} \OO(\ww) \\ \end{aligned} \tag{7.36}

exp[12Ω(ω^)t]=j=0[12Ω(ω^)t]jj!=k=0[12t](2k)(1)kω^2kI4×4(2k)!+k=0[12t](2k+1)(1)kω^2kΩ(ω^)(2k+1)!=I4×4k=0(1)k[12ω^t](2k)(2k)!+Ω(ω^)ω^k=0(1)k[12ω^t](2k+1)(2k+1)!=I4×4cos(12ω^t)+Ω(ω^)ω^sin(12ω^t)(7.35+37+38)\newcommand{\ww}[0]{\hat{\bm{\omega}}} \newcommand{\IwI}[0]{\|\hat{\bm{\omega}}\|} \newcommand{\OO}[0]{\Omega} \begin{aligned} \exp\left[ \frac{1}{2}\OO(\ww)t \right] &= \sum_{j=0}^{\infty} \frac{\left[ \frac{1}{2}\OO(\ww)t \right]^j}{j!} \\ &= \sum_{k=0}^{\infty} \frac{\left[ \frac{1}{2} t \right]^{(2k)} (-1)^k\IwI^{2k} I_{4\times4} }{(2k)!} + \sum_{k=0}^{\infty} \frac{\left[ \frac{1}{2} t \right]^{(2k+1)} (-1)^k\IwI^{2k} \OO(\ww) }{(2k+1)!} \\ &= I_{4\times4} \textcolor{red}{ \sum_{k=0}^{\infty} \frac{(-1)^k \left[ \frac{1}