News

This week, we are learning about optimization methods. We will start with Stochastic Gradient Descent (SGD). SGD has several design parameters that we can tweak, including learning rate, momentum, and ...