TensorFlow and Keras
If you are reading this post, chances are you are, like me (as of 2020-07-29), pretty much a newbie to TensorFlow.
tensorflow.keras
Keras is already part of TensorFlow, so use `from tensorflow.keras import ...`, not `from keras import ...`
TensorFlow backend
Early stopping
```python
model.fit(..., callbacks=[EarlyStopping(monitor='val_loss', patience=5, verbose=1, mode='min', restore_best_weights=True)], ...)
```
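To show the callback in context, here is a minimal runnable sketch. The toy data and tiny architecture are made up for illustration only; the `EarlyStopping` arguments are the ones from the snippet above.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping

# Toy regression data, purely illustrative.
x = np.random.rand(200, 4).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop when val_loss has not improved for 5 epochs,
# and roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=5, verbose=1,
                           mode="min", restore_best_weights=True)
history = model.fit(x, y, validation_split=0.2, epochs=100,
                    callbacks=[early_stop], verbose=0)
```

With `restore_best_weights=True`, the model ends up with the weights from the epoch that had the lowest `val_loss`, not the last epoch's weights.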
Reproducibility of results
TL;DR
- Set all random seeds
- Use `tensorflow.keras` instead of standalone `keras`
- Use `model.predict_on_batch(x).numpy()` for prediction speed
Putting this at the very beginning of your script should work:
```python
import os, random
```
Updating all the code to tf.keras SEEMS to have solved the reproducibility problem.
BUT, the speed was 10x slower than using standalone keras directly.
After some digging, I find a workaround:
- Use `model.predict_on_batch(x)` to do sequential predictions, because `model.predict()` triggers the same calculation path as `model.fit()`, including gradient computation or something I don't understand. See here for details.
- Also, calling `model(x)` directly for prediction seems to speed things up a lot.
- Using `model.compile(..., experimental_run_tf_function=False)` also seems to speed things up a lot.
- This causes another problem: the returned value should be an `ndarray`, but somehow I got a `tf` tensor. So I need to use `model.predict_on_batch(x).numpy()` to get the `ndarray` from the `tf` tensor explicitly.
- I guess this is a bug that will be fixed in the future, because the docs say `predict_on_batch()` always returns a NumPy array.
`predict()` vs. `predict_on_batch()`:
- `predict()` is used for training
- `predict_on_batch()` is used for pure predicting
- They have a huge speed difference on small testing data. I guess I'll never understand the underlying cause.
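The workaround above can be sketched end-to-end. The tiny model and data are made up for illustration; the point is that `predict()` always hands back an `ndarray`, while in some TF 2.x versions `predict_on_batch()` hands back an eager tensor that needs explicit conversion (`np.asarray` works whether or not the conversion is needed).

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")

x = np.random.rand(5, 3).astype("float32")

out_predict = model.predict(x, verbose=0)  # documented to return an ndarray
out_batch = model.predict_on_batch(x)      # may be an ndarray or a tf tensor
out_batch = np.asarray(out_batch)          # safe either way; .numpy() works only on tensors
```

On a tiny batch like this, `predict_on_batch()` skips the batching/callback machinery that `predict()` sets up, which is where the speed difference on small inputs comes from.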
About Pure TensorFlow
GradientTape is the new-style automatic differentiator.
TensorFlow Learning (4): GradientTape, Optimizer, and Loss Functions (losses)
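A minimal sketch of what GradientTape does, assuming TF 2.x: record operations on a watched variable, then ask the tape for the derivative. Here we compute d(x²)/dx at x = 3, which is 2x = 6.

```python
import tensorflow as tf

x = tf.Variable(3.0)            # variables are watched automatically
with tf.GradientTape() as tape:
    y = x * x                   # forward computation recorded on the tape
grad = tape.gradient(y, x)      # dy/dx = 2x = 6.0 at x = 3
```

An optimizer then consumes such gradients via `optimizer.apply_gradients(zip(grads, variables))`, which is the manual-training-loop counterpart of `model.fit()`.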
General Optimization
(Not read yet) An overview of gradient descent optimization algorithms