Distracted Driver Day 4 — LRScheduler

Chiraag K V
3 min read · Jul 26, 2021

Hello! In my last blog, I created a baseline model using MobileNet. It had an accuracy of 52.78% on the validation dataset. Today, I tried to tune a very important hyperparameter of the model.

Faster Experiments

Hyperparameter tuning in TensorFlow is an experimental process. In order to reduce the duration of each experiment, I decided to temporarily shrink the size of the train and validation datasets. I sliced the file paths and labels down to the first 5000 samples to decrease the dataset size.
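A minimal sketch of that slicing step, assuming the file paths and labels from the earlier blogs live in plain Python lists (the variable names below are mine, not from the original code):

```python
# Hypothetical names: `filepaths` and `labels` stand in for the full
# lists built in the earlier blogs.
SMALL_SIZE = 5000

small_filepaths = filepaths[:SMALL_SIZE]  # first 5000 image paths
small_labels = labels[:SMALL_SIZE]        # their matching labels
```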

You may have noticed a small parameter I passed to the Adam optimizer in the last blog: the learning rate (lr). The learning rate controls how large a step the optimizer takes on each weight update. To find the ideal learning rate, TensorFlow has a callback called LearningRateScheduler (LRScheduler for short). The LRScheduler runs every epoch with a different learning rate. We can later plot the loss against the learning rate to find where it drops fastest. In this blog, I will tell you how I used this to get better results.

Creating the LRScheduler callback

The LRScheduler accepts a function as a parameter, which maps the epoch number to a learning rate. After I implemented the LRScheduler, the code looked something like this:
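The original snippet isn't reproduced here, so below is a minimal sketch. I'm assuming the usual exponential sweep (start tiny and multiply the learning rate by 10 every 20 epochs); the starting value, the epoch count, and the `model`, `train_data`, and `valid_data` names are my assumptions, not taken from the original post:

```python
import tensorflow as tf

# Sweep the learning rate upward over the run: start at 1e-4 and
# multiply it by 10 every 20 epochs.
lr_scheduler = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-4 * 10 ** (epoch / 20)
)

history = model.fit(
    train_data,
    validation_data=valid_data,
    epochs=100,                 # assumed epoch count
    callbacks=[lr_scheduler],
)
```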

I dropped the EarlyStopping callback for this run so that the model gets to try the full range of learning rates and the plot reflects the actual decrease in loss.

Plotting the fluctuations in loss with different LRs

I used Matplotlib’s semilogx plot (a plot where the x-axis is on a logarithmic, base-10 scale).
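A sketch of the plotting code, assuming the schedule from the sketch above (so the learning rate at each epoch can be reconstructed from the epoch index):

```python
import numpy as np
import matplotlib.pyplot as plt

# Reconstruct the learning rate used at each epoch from the same schedule,
# then plot the training loss against it on a log-scaled x-axis.
epochs = np.arange(len(history.history["loss"]))
lrs = 1e-4 * 10 ** (epochs / 20)

plt.semilogx(lrs, history.history["loss"])
plt.xlabel("Learning rate")
plt.ylabel("Loss")
plt.show()
```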

This outputted a curve of the loss against the learning rate.

We can zoom in further by setting the xlim of the plot:
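For example (the axis bounds below are illustrative, not the values from the original plot):

```python
plt.semilogx(lrs, history.history["loss"])
plt.xlim(1e-4, 1e-2)  # zoom in on the region around the loss minimum
plt.xlabel("Learning rate")
plt.ylabel("Loss")
plt.show()
```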

We can approximate our ideal learning rate to be 3 x 10^-4 = 0.0003.

Conclusion

In this blog, we found the ideal learning rate of our model. Tomorrow, I will be training the model again with this learning rate.

Okay, that’s it for today. Bye!
