
Jonathan Ko, “Gaussian Process for Dynamic Systems”, PhD Thesis, University of Washington, 2011.

The Bayes filter equation in Eq. 4.1 (p.34) has a typo (it should be $\propto$, not $=$):

$$p(x_t|z_{1:t},u_{1:t-1}) \propto \textcolor{green}{p(z_t|x_t)} \int \textcolor{red}{p(x_t|x_{t-1},u_{t-1})}\, p(x_{t-1}|z_{1:t-1},u_{1:t-2})\, dx_{t-1}$$

  • The $\textcolor{red}{\text{red}}$ part is the dynamics model, describing how the state $x$ evolves in time based on the control input $u$ (p.34)
  • The $\textcolor{green}{\text{green}}$ part is the observation model, describing the likelihood of making an observation $z$ given the state $x$
  • GP-BayesFilter improves these two parts by learning them as Gaussian processes.

The dynamics model maps the state and control $(x_t, u_t)$ to the state transition $\Delta x_t = x_{t+1} - x_t$. So the training data is

$$D_p = \langle (X, U), X' \rangle$$

The observation model maps from the state $x_t$ to the observation $z_t$. So the training data is

$$D_o = \langle X, Z \rangle$$

The resulting GP dynamics and observation models are (p.44)

$$p(x_t|x_{t-1},u_{t-1}) \approx \mathcal{N}\!\left(\text{GP}_\mu([x_{t-1},u_{t-1}],D_p),\ \text{GP}_\Sigma([x_{t-1},u_{t-1}],D_p)\right)$$

and

$$p(z_t|x_t) \approx \mathcal{N}\!\left(\text{GP}_\mu(x_t,D_o),\ \text{GP}_\Sigma(x_t,D_o)\right)$$
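To make this concrete, here is a minimal sketch (my own illustration, not from the thesis) of learning one output dimension of the dynamics model from $D_p$ with scikit-learn; all data and dimensions are hypothetical, and in practice one GP is trained per dimension of $\Delta x$.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical training data D_p = <(X, U), X'>:
# each row of XU stacks [x_{t-1}, u_{t-1}]; dx holds one component
# of the transition x_t - x_{t-1}.
rng = np.random.default_rng(0)
XU = rng.normal(size=(200, 4))                     # e.g., 3 state dims + 1 control dim
dx = np.sin(XU[:, 0]) + 0.1 * rng.normal(size=200)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(XU, dx)

# GP_mu and GP_Sigma above correspond to the predictive mean and
# predictive variance at a query input [x_{t-1}, u_{t-1}]:
xu_query = np.zeros((1, 4))
mu, std = gp.predict(xu_query, return_std=True)
```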


If you are looking at this post, you are probably also pretty much a newbie to TensorFlow, like me (as of 2020-07-29).

tensorflow.keras

Keras is already part of TensorFlow, so use `from tensorflow.keras import ...`, not `from keras import ...`.
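For example, a minimal sketch of the two import styles (the imported names here are just common examples):

```python
# Preferred: the Keras bundled with TensorFlow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Avoid the standalone package, which can drift out of sync with TensorFlow:
# from keras.models import Sequential
```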

TensorFlow backend

Early stopping

from tensorflow.keras.callbacks import EarlyStopping

model.fit(..., callbacks=[EarlyStopping(monitor='val_loss', patience=5, verbose=1, mode='min', restore_best_weights=True)], ...)
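For a fuller picture, here is a sketch of how this is typically wired up (the data variables are placeholders of mine); note that monitoring `'val_loss'` only works if validation data is passed to `fit`:

```python
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',  # requires validation data below
                           patience=5, verbose=1, mode='min',
                           restore_best_weights=True)

history = model.fit(x_train, y_train,                # placeholder data
                    validation_data=(x_val, y_val),  # provides 'val_loss'
                    epochs=100,
                    callbacks=[early_stop])
```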

Reproducibility of results

TL;DR
  • Set all random seeds (a sketch follows below).
  • Use tensorflow.keras instead of standalone keras.
  • Use model.predict_on_batch(x).numpy() for faster prediction.
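A minimal sketch of the seed-setting step (the helper name `set_global_seeds` is mine; depending on the setup, more sources of randomness may need pinning):

```python
import os
import random

import numpy as np
import tensorflow as tf

def set_global_seeds(seed=42):
    """Seed the RNGs that commonly affect tensorflow.keras results."""
    os.environ['PYTHONHASHSEED'] = str(seed)  # Python hash randomization
    random.seed(seed)                         # Python stdlib RNG
    np.random.seed(seed)                      # NumPy RNG
    tf.random.set_seed(seed)                  # TensorFlow RNG

set_global_seeds(42)
```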


I use CNNs for time series prediction (1D), not for image tasks (2D or 3D).

Learning Materials

  • How to Develop 1D Convolutional Neural Network Models for Human Activity Recognition
    • time series classification
    • two 1D CNN layers, followed by a dropout layer for regularization, then a pooling layer. Why this arrangement?
      • It is common to define CNN layers in groups of two in order to give the model a good chance of learning features from the input data. Again, why?
      • CNNs learn very quickly, so the dropout layer is intended to help slow down the learning process
      • The pooling layer … consolidating them to only the most essential elements.
    • After the CNN and pooling, the learned features are flattened to one long vector
    • a standard configuration of 64 parallel feature maps and a kernel size of 3 (where does this “standard” configuration come from?)
    • a multi-headed model, where each head of the model reads the input time steps using a different sized kernel (see the sketch after this list)
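A minimal sketch of the single-headed architecture described above, as I understand it from the tutorial; the shapes are placeholders to substitute with your own data:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, Dropout, MaxPooling1D, Flatten, Dense

# Placeholder shapes (e.g., 128 time steps, 9 channels, 6 classes).
n_timesteps, n_features, n_classes = 128, 9, 6

model = Sequential([
    # Two 1D CNN layers in a group, with the "standard" 64 maps / kernel size 3
    Conv1D(64, kernel_size=3, activation='relu',
           input_shape=(n_timesteps, n_features)),
    Conv1D(64, kernel_size=3, activation='relu'),
    Dropout(0.5),               # regularization: slow down the fast CNN learning
    MaxPooling1D(pool_size=2),  # keep only the most essential elements
    Flatten(),                  # flatten learned features into one long vector
    Dense(100, activation='relu'),
    Dense(n_classes, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
```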

I ran across this documentation page of pytransform3d, and it claims:

There are two different quaternion conventions: Hamilton’s convention defines ijk = -1 and the JPL convention (from NASA’s Jet Propulsion Laboratory, JPL) defines ijk = 1. We use Hamilton’s convention.

It’s nothing new that different definitions exist (mostly the component ordering differs), but what is this $ijk = 1$ definition? It was the first time I had heard of it.

Then I continued digging into the reference sources it provided.

Only after this did I find that the problem is not just about the ordering of the components, but about something more fundamental. So I put down this summary for my future reference.

$(q_0, q_1, q_2, q_3)$ or $(q_1, q_2, q_3, q_4)$?

The answer is that it doesn’t matter that much. This is not a mathematical or fundamental difference.

Equations can be easily converted, and code can be easily modified.
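For example, a one-line sketch of such a conversion (the function name is mine), assuming the two layouts differ only in where the scalar part sits:

```python
import numpy as np

def scalar_first_to_scalar_last(q):
    """(q0, q1, q2, q3), scalar first -> (q1, q2, q3, q4), scalar last."""
    return np.roll(np.asarray(q), -1)

# Identity rotation in both layouts:
print(scalar_first_to_scalar_last([1.0, 0.0, 0.0, 0.0]))  # [0. 0. 0. 1.]
```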

$ij=k$ or $ij=-k$

This is about math!

  1. Harold L. Hallock, Gary Welter, David G. Simpson, and Christopher Rouff, ACS without an attitude, London: Springer, 2017.
  • (p.16) Alternatively, one could follow a different convention with quaternion multiplication. Many authors prefer a convention that, although not expressed as such, essentially redefines Hamilton’s hyper-complex commutation relations (Eq. 1.5b above) into $ij = -k$, $jk = -i$, $ki = -j$
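A sanity check of my own (not from any of these books) on why this sign is substantive: writing quaternions as $p = p_0 + \vec{p}$ and $q = q_0 + \vec{q}$, the commutation relations fix the sign of the cross-product term in the quaternion product,

$$pq = p_0 q_0 - \vec{p}\cdot\vec{q} + p_0\vec{q} + q_0\vec{p} \pm \vec{p}\times\vec{q},$$

with $+$ under Hamilton’s $ij=k$ and $-$ under $ij=-k$ (check with $p=i$, $q=j$: the formula reduces to $ij = \pm\, i\times j = \pm k$). So the two conventions genuinely disagree on the product itself, not merely on notation.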

The quaternion representation is one of the best characterizations, and this chapter will focus on this representation. The presentation in this chapter follows the style of [99, 205, 219].

Which convention is used in which references?

I will keep updating this list as I read more references…

Using $ij=k$ and $(q_0, q_1, q_2, q_3)$

  1. Yaguang Yang, Spacecraft Modeling, Attitude Determination, and Control: Quaternion-based Approach, Boca Raton, FL: CRC Press, 2019. [Link]

Using $ij=k$ and $(q_1, q_2, q_3, q_4)$

  1. Harold L. Hallock, Gary Welter, David G. Simpson, and Christopher Rouff, ACS without an attitude, London: Springer, 2017.

Using $ij=-k$ and $(q_1, q_2, q_3, q_4)$

I still haven’t figured out why this amounts to redefining $ij=-k$.

  1. F. Landis Markley, and John L. Crassidis, Fundamentals of Spacecraft Attitude Determination and Control, New York, NY: Springer New York, 2014.

  2. Malcolm D. Shuster, “The nature of the quaternion”, The Journal of the Astronautical Sciences, vol. 56, Sep. 2008, pp. 359–373.

  3. Hanspeter Schaub, and John L. Junkins, Analytical Mechanics of Space Systems (Second Edition), Reston, VA: American Institute of Aeronautics and Astronautics, 2009.
    (p.107) It seems to implicitly adopt the convention consistent with the rotation matrix composition order, i.e., $ij=-k$.

Official documentation

Book:

Probabilistic Programming & Bayesian Methods for Hackers (Version 0.1)

PyMC3 is a Python library for programming Bayesian analysis [3]. It is a fast, well-maintained library. The only unfortunate part is that its documentation is lacking in certain areas, especially those that bridge the gap between beginner and hacker. One of this book’s main goals is to solve that problem, and also to demonstrate why PyMC3 is so cool.

We assign them to PyMC3’s stochastic variables, so-called because they are treated by the back end as random number generators.
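As a quick illustration of such stochastic variables (a minimal example of my own; the distribution choices are arbitrary):

```python
import pymc3 as pm

with pm.Model() as model:
    # Stochastic variables: the back end treats them as
    # random number generators.
    lambda_1 = pm.Exponential("lambda_1", lam=1.0)
    tau = pm.DiscreteUniform("tau", lower=0, upper=10)
```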