
John L. Crassidis and John L. Junkins, Optimal Estimation of Dynamic Systems, CRC Press, 2011.

Corrections to the book can be found here.

# Chapter 2 Probability Concepts in Least Squares

## 2.5. Maximum Likelihood Estimation

Jonathan Ko, “Gaussian Process for Dynamic Systems”, PhD Thesis, University of Washington, 2011.

The Bayes filter equation in Eq. 4.1 (p.34) has a typo: it should be $\propto$, not $=$.

$p(x_t|z_{1:t},u_{1:t-1}) \propto p(z_t|x_t) \int \textcolor{red}{p(x_t|x_{t-1},u_{t-1})} \textcolor{green}{p(x_{t-1}|z_{1:t-1},u_{1:t-2})} dx_{t-1}$

• The $\textcolor{red}{\text{red}}$ part is the dynamics model, describing how the state $x$ evolves in time based on the control input $u$ (p.34).
• The $\textcolor{green}{\text{green}}$ part is the observation model, describing the likelihood of making an observation $z$ given the state $x$.
• The GP-BayesFilter improves these two parts by replacing them with learned Gaussian process models.
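To make the predict/update structure of this recursion concrete, here is a minimal discrete-state sketch (the 3-state transition matrix and likelihood below are made up for illustration; they are not from the thesis):

```python
import numpy as np

# Hypothetical 3-state example of the Bayes filter recursion.
# T[i, j] = p(x_t = j | x_{t-1} = i)  (dynamics model, control input fixed)
T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
# L[j] = p(z_t | x_t = j) for one particular observation z_t
L = np.array([0.9, 0.1, 0.1])

belief = np.array([1/3, 1/3, 1/3])   # prior p(x_{t-1} | z_{1:t-1})

# Predict: integrate (here, sum) over x_{t-1}
predicted = belief @ T
# Update: multiply by the likelihood, then normalize (the proportionality)
belief = L * predicted
belief /= belief.sum()
print(belief)
```

The normalization step is exactly why the equation uses $\propto$ rather than $=$.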

The dynamics model maps the state and control $(x_t,u_t)$ to the state transition $\Delta x_t = x_{t+1} - x_t$. So, the training data is

$D_p = \langle (X,U),\, X' \rangle$

The observation model maps from the state $x_t$ to the observation $z_t$. So, the training data is

$D_o = \langle X, Z \rangle$

The resulting GP dynamics and observation models are (p.44)

$p(x_t|x_{t-1},u_{t-1}) \approx \mathcal{N}(\text{GP}_\mu([x_{t-1},u_{t-1}],D_p), \text{GP}_\Sigma([x_{t-1},u_{t-1}],D_p))$

and

$p(z_t|x_t) \approx \mathcal{N}(\text{GP}_\mu(x_t,D_o), \text{GP}_\Sigma(x_t,D_o))$
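Here $\text{GP}_\mu$ and $\text{GP}_\Sigma$ are the standard Gaussian process posterior mean and variance given the training data. As a reminder of what those compute, a minimal numpy sketch of GP regression with an RBF kernel on toy 1D data (the function names are mine, not from the thesis):

```python
import numpy as np

def rbf(A, B, ell=1.0, sf=1.0):
    """Squared-exponential kernel between row-wise inputs A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(x_star, X, y, noise=1e-2):
    """GP posterior mean (GP_mu) and variance (GP_Sigma) at test inputs."""
    K = rbf(X, X) + noise * np.eye(len(X))      # training covariance + noise
    Ks = rbf(x_star, X)                         # test-vs-training covariance
    mu = Ks @ np.linalg.solve(K, y)             # posterior mean
    cov = rbf(x_star, x_star) - Ks @ np.linalg.solve(K, Ks.T)
    return mu, np.diag(cov)                     # pointwise variances

# Toy training set standing in for D_o: states X, observations y
X = np.linspace(0, 3, 10)[:, None]
y = np.sin(X).ravel()
mu, var = gp_predict(np.array([[1.5]]), X, y)
```

For the dynamics model the inputs would be the stacked $[x_{t-1}, u_{t-1}]$ and the targets the transitions $\Delta x_t$.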

Ibrahim Almosallam, Heteroscedastic Gaussian Processes for Uncertain and Incomplete Data, PhD Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:6a3b600d-5759-456a-b785-5f89cf4ede6d

If you are looking at this post, it means you are also pretty much a newbie to TensorFlow, like me as of 2020-07-29.

# tensorflow.keras

Keras is already part of TensorFlow, so use `from tensorflow.keras import ...`, not `from keras import ...`.


## Early stopping

Use the `EarlyStopping` callback from `tensorflow.keras.callbacks`:

`model.fit(..., callbacks=[EarlyStopping(monitor='val_loss', patience=5, verbose=1, mode='min', restore_best_weights=True)], ...)`
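To be clear about what `patience` and `restore_best_weights` mean here, a plain-Python sketch of the stopping logic (my own simplification, not Keras's actual implementation):

```python
# Sketch of EarlyStopping(monitor='val_loss', patience=5, mode='min',
# restore_best_weights=True) semantics, operating on a list of val_loss values.
def train_with_early_stopping(val_losses, patience=5):
    """Return (stop_epoch, best_epoch) for a sequence of val_loss values."""
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:                  # improvement: remember it, reset counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1                    # no improvement this epoch
            if wait >= patience:         # patience exhausted: stop training
                return epoch, best_epoch # weights come from best_epoch
    return len(val_losses) - 1, best_epoch

stop, best = train_with_early_stopping([1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75])
```

With `patience=5`, training runs five more epochs after the last improvement (epoch 2) and then restores the epoch-2 weights.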

# Reproducibility of results

TL;DR:
• Set all random seeds.
• Use `tensorflow.keras` instead of standalone `keras`.
• Use `model.predict_on_batch(x).numpy()` for faster prediction.
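A seed-setting helper along these lines (the function name is mine; the `try/except` keeps it runnable even without TensorFlow installed):

```python
import os
import random
import numpy as np

def set_all_seeds(seed: int = 42):
    """Seed every RNG in sight for reproducible runs."""
    # Note: PYTHONHASHSEED only takes effect if set before the interpreter starts.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)        # Python's built-in RNG
    np.random.seed(seed)     # numpy's global RNG
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TensorFlow's global RNG
    except ImportError:
        pass
```

Calling `set_all_seeds(0)` twice and drawing the same numbers should give identical results.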

I use CNN for time series prediction (1D), not for image works (2D or 3D).

# Learning Materials

• How to Develop 1D Convolutional Neural Network Models for Human Activity Recognition
• time series classification
• two 1D CNN layers, followed by a dropout layer for regularization, then a pooling layer. Why this arrangement?
• "It is common to define CNN layers in groups of two in order to give the model a good chance of learning features from the input data." Why?
• CNNs learn very quickly, so the dropout layer is intended to help slow down the learning process
• The pooling layer … consolidating them to only the most essential elements.
• After the CNN and pooling layers, the learned features are flattened to one long vector
• a standard configuration of 64 parallel feature maps and a kernel size of 3 (where does this "standard" configuration come from?)
• a multi-headed model, where each head of the model reads the input time steps using a different-sized kernel.
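To see where the "one long vector" comes from, here is the shape bookkeeping for the architecture above, assuming a hypothetical input window of 128 time steps, 'valid' padding, and stride 1:

```python
# Shape bookkeeping for: Conv1D -> Conv1D -> Dropout -> MaxPooling1D -> Flatten.
def conv1d_len(n, kernel):
    """Output length of a 'valid' 1D convolution with stride 1."""
    return n - kernel + 1

n, channels = 128, 64        # 64 parallel feature maps, kernel size 3
n = conv1d_len(n, 3)         # first Conv1D  -> 126 time steps
n = conv1d_len(n, 3)         # second Conv1D -> 124 time steps
# dropout changes no shapes
n = n // 2                   # MaxPooling1D(pool_size=2) -> 62 time steps
flat = n * channels          # flatten to one long vector: 62 * 64 = 3968
print(flat)
```

The pooling halves the time dimension before flattening, which keeps the following dense layer from being enormous.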

I ran across this document page of pytransform3d, and it claims:

> There are two different quaternion conventions: Hamilton's convention defines $ijk = -1$ and the JPL convention (from NASA's Jet Propulsion Laboratory, JPL) defines $ijk = 1$. We use Hamilton's convention.

It's not news to me that different definitions exist (mostly the ordering of the components differs), but what is this $ijk = 1$ definition? It was the first time I had heard of it.

So I continued digging into the reference sources it provided.

Only then did I find that the difference is not just the ordering of the components but something more fundamental. So I put down this summary for my future reference.

# $(q_0, q_1, q_2, q_3)$ or $(q_1, q_2, q_3, q_4)$ ?

The answer is that it doesn't matter much: this is not a mathematical or fundamental difference.

Equations can be easily converted, and code can be easily modified.
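For example, converting between the two storage orders is a pure reindexing (the helper names are mine):

```python
# Scalar-first (q0, q1, q2, q3) vs scalar-last (q1, q2, q3, q4) is just
# a matter of where the scalar part w is stored relative to (x, y, z).
def scalar_first_to_scalar_last(q):
    w, x, y, z = q
    return (x, y, z, w)

def scalar_last_to_scalar_first(q):
    x, y, z, w = q
    return (w, x, y, z)

q = (0.7071, 0.0, 0.7071, 0.0)  # a scalar-first example
assert scalar_last_to_scalar_first(scalar_first_to_scalar_last(q)) == q
```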

# $ij=k$ or $ij=-k$

1. Harold L. Hallock, Gary Welter, David G. Simpson, and Christopher Rouff, ACS without an attitude, London: Springer, 2017.
• (p.16) Alternatively, one could follow a different convention with quaternion multiplication. Many authors prefer a convention that, although not expressed as such, essentially redefines Hamilton's hyper-complex commutation relations (Eq. 1.5b above) into $ij = -k,\ jk = -i,\ ki = -j$.
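The practical consequence is that the quaternion product flips the sign of its cross-product term. A small sketch of both conventions in scalar-first storage (the helper names and the `sign` switch are mine):

```python
# Hamilton (ij = k) vs JPL (ij = -k) quaternion products, scalar-first
# (w, x, y, z) storage. The two differ only in the sign of the cross term.
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def qmul(p, q, sign=+1):
    """sign=+1: Hamilton convention; sign=-1: JPL convention."""
    pw, pv = p[0], p[1:]
    qw, qv = q[0], q[1:]
    w = pw*qw - sum(a*b for a, b in zip(pv, qv))          # scalar part
    v = tuple(pw*b + qw*a + sign*c                        # vector part
              for a, b, c in zip(pv, qv, cross(pv, qv)))
    return (w,) + v

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j, +1))   # Hamilton: i*j = k
print(qmul(i, j, -1))   # JPL:      i*j = -k
```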

The quaternion representation is one of the best characterizations, and this chapter will focus on this representation. The presentation in this chapter follows the style of [99, 205, 219].

# Which one is used in references?

Will keep updating as I read more references…

## Using $ij=k$ and $(q_0, q_1, q_2, q_3)$

1. Yaguang Yang, Spacecraft Modeling, Attitude Determination, and Control: Quaternion-based Approach, Boca Raton, FL: CRC Press, 2019. [Link]

## Using $ij=k$ and $(q_1, q_2, q_3, q_4)$

1. Harold L. Hallock, Gary Welter, David G. Simpson, and Christopher Rouff, ACS without an attitude, London: Springer, 2017.

## Using $ij=-k$ and $(q_1, q_2, q_3, q_4)$

1. F. Landis Markley, and John L. Crassidis, Fundamentals of Spacecraft Attitude Determination and Control, New York, NY: Springer New York, 2014.

2. Malcolm D. Shuster, “The nature of the quaternion”, The Journal of the Astronautical Sciences, vol. 56, Sep. 2008, pp. 359–373.

3. Hanspeter Schaub, and John L. Junkins, Analytical Mechanics of Space Systems (Second Edition), Reston, VA: American Institute of Aeronautics and Astronautics, 2009.
(p.107) It seems to implicitly adopt the convention whose multiplication order is consistent with that of rotation matrices, i.e. $ij=-k$.

# Book:

Probabilistic Programming & Bayesian Methods for Hackers (Version 0.1)

PyMC3 is a Python library for programming Bayesian analysis. It is a fast, well-maintained library. The only unfortunate part is that its documentation is lacking in certain areas, especially those that bridge the gap between beginner and hacker. One of this book's main goals is to solve that problem, and also to demonstrate why PyMC3 is so cool.

We assign them to PyMC3’s stochastic variables, so-called because they are treated by the back end as random number generators.

Excerpts of some information about the attitude subsystems of CubeSats.

I learned a few things about the attitude-estimation EKFs used in several books and papers; I note them here to clarify their relationships.

The only thing I'm sure about is:
• The quaternion attitude + gyro-bias estimator is widely used in practice.