TensorFlow and Keras
If you are looking at this post, it means you are probably, like me (as of 2020-07-29), pretty much a newbie to TensorFlow.
tensorflow.keras
Keras is already part of TensorFlow, so use `from tensorflow.keras import ...`, not `from keras import ...` (standalone Keras with the TensorFlow backend).
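A minimal before/after of the import switch, assuming TensorFlow 2.x is installed; the toy model is just for illustration:

```python
import numpy as np

# Old style (standalone Keras) -- avoid mixing with tf.keras:
# from keras.models import Sequential
# from keras.layers import Dense

# New style: use the Keras bundled with TensorFlow.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(4,)),
                    Dense(8, activation="relu"),
                    Dense(1)])
out = model(np.zeros((2, 4), dtype="float32"))  # shape (2, 1)
```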
Early stopping
```python
from tensorflow.keras.callbacks import EarlyStopping

model.fit(..., callbacks=[EarlyStopping(monitor='val_loss', patience=5, verbose=1,
                                        mode='min', restore_best_weights=True)], ...)
```
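For context, a self-contained sketch of the callback in action; the tiny model and random data are purely illustrative:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.callbacks import EarlyStopping

# Toy regression data, only for demonstration.
x = np.random.rand(200, 4).astype("float32")
y = np.random.rand(200, 1).astype("float32")

model = Sequential([Input(shape=(4,)),
                    Dense(8, activation="relu"),
                    Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Stop once val_loss has not improved for 3 epochs, and roll the
# weights back to the best epoch seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=3, mode="min",
                           restore_best_weights=True)
history = model.fit(x, y, validation_split=0.2, epochs=20,
                    verbose=0, callbacks=[early_stop])
```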
Reproducibility of results
TL;DR:
- Set all random seeds.
- Use `tensorflow.keras` instead of standalone `keras`.
- Use `model.predict_on_batch(x).numpy()` for predicting speed.
Putting this at the very beginning should work:

```python
import os, random
```
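The snippet above is cut off; a sketch of the usual seed-setting routine it likely continued with (the helper name `set_global_seeds` and the seed value 42 are my own, not from the original):

```python
import os
import random

import numpy as np

def set_global_seeds(seed: int = 42) -> None:
    """Seed every RNG source that can affect training results."""
    os.environ["PYTHONHASHSEED"] = str(seed)  # hash randomization (subprocesses)
    random.seed(seed)                         # Python stdlib RNG
    np.random.seed(seed)                      # NumPy RNG
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)              # TensorFlow op-level seeds
    except ImportError:
        pass  # TensorFlow not installed; the other seeds still apply

set_global_seeds(42)
```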
Updating all code to `tf.keras` SEEMS to have solved the reproducibility problem. BUT the speed is 10x slower than using `keras` directly.
After some digging, I found a workaround:
- Use `model.predict_on_batch(x)` to do sequential predictions.
  - Because `model.predict()` triggers the same calculation path as `model.fit()`, including gradient computation or something I don't understand. See here for details.
  - Also, using `model(x)` for predicting seems to speed things up a lot.
  - Using `model.compile(..., experimental_run_tf_function=False)` also seems to speed things up a lot.
- This causes another problem: the returned value should be an `ndarray`, but somehow I got a `tf.Tensor`. So I need to use `model.predict_on_batch(x).numpy()` to get the `ndarray` from the `tf.Tensor` explicitly.
  - I guess this is a bug that will be fixed in the future, because the docs say `predict_on_batch()` always returns a NumPy array.
`predict()` vs. `predict_on_batch()`:
- `predict()` is used for training.
- `predict_on_batch()` is used for pure predicting.
- They have a huge speed difference on small testing data. I guess I will never understand the underlying causes.
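A rough, illustrative way to see the speed gap yourself; the toy model, loop count, and timings are my own setup, not from the original post:

```python
import time

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([Input(shape=(4,)),
                    Dense(8, activation="relu"),
                    Dense(1)])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(1, 4).astype("float32")

t0 = time.perf_counter()
for _ in range(20):
    model.predict(x, verbose=0)      # full prediction-loop machinery each call
t_predict = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(20):
    out = model.predict_on_batch(x)  # lightweight single-batch path
t_batch = time.perf_counter() - t0

print(f"predict: {t_predict:.3f}s, predict_on_batch: {t_batch:.3f}s")

# Depending on the TF version, predict_on_batch may hand back a
# tf.Tensor instead of an ndarray; np.asarray() covers both cases
# (equivalent to calling .numpy() on a tensor).
result = np.asarray(out)
```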
About Pure TensorFlow
GradientTape is the new-style automatic differentiator.
TensorFlow Learning (4): GradientTape, Optimizers, and Loss Functions
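As a minimal sketch of what GradientTape does, computing dy/dx for y = x² at x = 3:

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x               # y = x^2, recorded on the tape
grad = tape.gradient(y, x)  # dy/dx = 2x = 6.0
print(float(grad))
```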
General Optimization
(Not read yet) An overview of gradient descent optimization algorithms