Back propagation

Calculate the loss between the output layer's final activations and the actual target values. Use the resulting loss to:

1. Calculate the gradients of the loss with respect to the parameters.
2. Update the parameters: $\text{parameters} \mathrel{-}= \text{learning rate} \cdot \text{gradient of parameters}$

Fine tuning

Example: ResNet-34. The final layer, i.e. the final weight matrix, of ResNet-34 has 1000 columns because the images can fall into one of 1000 different categories, i.e. one column per ImageNet class.
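The update step above can be sketched in a few lines of plain Python. This is a minimal illustration of the rule $p \mathrel{-}= \text{lr} \cdot \nabla p$, not any library's API; the parameter and gradient values are made up for the example.

```python
def sgd_step(parameters, gradients, learning_rate):
    """Update each parameter in place: p -= learning_rate * gradient."""
    for i in range(len(parameters)):
        parameters[i] -= learning_rate * gradients[i]
    return parameters

# Illustrative values only: three parameters and their gradients.
params = [0.5, -1.2, 3.0]
grads = [0.1, -0.4, 2.0]
sgd_step(params, grads, learning_rate=0.1)
# Each parameter moves a small step opposite its gradient.
```

In a real framework the gradients come from automatic differentiation over the whole network, but the arithmetic of the update is exactly this.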
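The shape of that final weight matrix is what fine tuning replaces: ResNet-34's head maps 512 features to 1000 ImageNet classes, and for a new task it is swapped for a freshly initialised matrix with one column per new category. The sketch below uses plain Python lists as stand-ins for real tensors; the 37-class target task is a hypothetical example, and `make_head` is an illustrative helper, not a library function.

```python
import random

def make_head(in_features, num_classes):
    """Return an (in_features x num_classes) weight matrix of small random values."""
    return [[random.uniform(-0.01, 0.01) for _ in range(num_classes)]
            for _ in range(in_features)]

imagenet_head = make_head(512, 1000)  # original head: one column per ImageNet class
new_task_head = make_head(512, 37)    # hypothetical new task with 37 categories

print(len(imagenet_head), len(imagenet_head[0]))  # 512 1000
print(len(new_task_head), len(new_task_head[0]))  # 512 37
```

Only the replaced head starts from scratch; the earlier layers keep their pretrained weights and are then trained further on the new data.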