Chapter 7: Building the Training Loop

Welcome to Chapter 7! In this chapter, we will build the training loop for our neural network. The training loop is the heart of any machine learning program: it is where the model actually learns from the data. We will start with an overview of how a neural network is trained, then dive into the details of the forward and backward passes, and finally walk through the Python code that implements the training loop, explaining each part in detail.

7.1: Training Neural Networks: An Overview

Training a neural network is like teaching a child to ride a bike. At first, the child doesn't know how to balance, pedal, and steer at the same time. But with practice, the child gets better and better. Similarly, our neural network starts off knowing nothing about our data. But with each pass through the training loop, it gets better at making predictions.

The training loop is an iterative process. In each iteration (here, one full pass over the training data, which is what makes it an epoch), the model makes predictions, measures how far off these predictions are from the actual values (the loss), and then adjusts its weights to reduce the loss. This process is repeated for a set number of epochs, until the model's predictions are satisfactory.
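To make this concrete before we dig into the details, here is a minimal, self-contained sketch of such a loop on toy data. The data and hyperparameters are illustrative; the full version, with diagnostics, appears in Section 7.5.

import numpy as np

# Toy dataset: learn W so that X @ W approximates Y (true relationship: Y = 3X).
X = np.array([[1.0], [2.0], [3.0]])
Y = 3 * X
W = np.array([[0.0]])  # the model starts off knowing nothing
lr = 0.05

for epoch in range(20):
    Y_pred = np.dot(X, W)                            # forward pass: predict
    current_loss = np.mean((Y - Y_pred) ** 2)        # how far off are we?
    grad = -2 * np.dot(X.T, (Y - Y_pred)) / len(X)   # backward pass: slope of the loss
    W -= lr * grad                                   # adjust weights against the gradient

print(W)  # close to [[3.0]]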

7.2: Forward Pass: Predictions and Loss Calculation

The first step in the training loop is the forward pass. This is where the model uses its current weights to make predictions.

Let's consider an example. Suppose we have an input of 0.5. Our model is a simple linear function: it multiplies the input by its current weight to produce an output. If the current weight is 1.0, the output is 0.5 × 1.0 = 0.5.
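In NumPy, that single-sample forward pass is just a dot product. A toy illustration with one weight:

import numpy as np

x = np.array([0.5])        # one input sample
weights = np.array([1.0])  # the model's current weight
y_pred = np.dot(x, weights)
print(y_pred)              # 0.5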

After making predictions, the model calculates the loss, which measures how far off the predictions are from the actual values. The lower the loss, the better the model's predictions.
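The training code in Section 7.5 calls two helpers, model and loss, without defining them. Here is one plausible pair of definitions, assuming a linear model and a mean squared error loss. This pairing matches the gradient formula used later in the loop, but treat the exact names and signatures as assumptions:

import numpy as np

def model(X, weights):
    # Linear model: each prediction is a weighted sum of the input features.
    return np.dot(X, weights)

def loss(Y, Y_pred):
    # Mean squared error: the average of the squared differences
    # between the actual and predicted values.
    return np.mean((Y - Y_pred) ** 2)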

7.3: Backward Pass: Gradient Calculation and Weights Update

After the forward pass comes the backward pass. This is where the model adjusts its weights to reduce the loss.

The model calculates the gradient of the loss with respect to the weights. The gradient is a fancy term for the rate of change or slope. In our case, it tells us how much the loss will change if we change the weights by a small amount.
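For the linear model and mean squared error loss sketched in Section 7.2, this gradient has a closed form: -2 * X.T @ (Y - Y_pred) / N, where N is the number of samples. As a sanity check, we can compare that formula against a finite-difference estimate of the slope:

import numpy as np

X = np.array([[0.5], [1.0], [1.5]])
Y = np.array([[1.0], [2.0], [3.0]])
W = np.array([[1.0]])

mse = lambda W: np.mean((Y - np.dot(X, W)) ** 2)

# Analytic gradient of the mean squared error for a linear model
grad = -2 * np.dot(X.T, (Y - np.dot(X, W))) / len(X)

# Finite-difference estimate: (loss(W + h) - loss(W - h)) / (2h)
h = 1e-5
grad_fd = (mse(W + h) - mse(W - h)) / (2 * h)

print(grad)     # [[-2.3333...]]
print(grad_fd)  # about -2.3333, matching the formula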

The model then adjusts its weights in the opposite direction of the gradient. This is because we want to decrease the loss, not increase it. If the gradient is positive, we decrease the weights. If the gradient is negative, we increase the weights.

The size of the adjustment is determined by the learning rate, a hyperparameter that we set before training. A smaller learning rate means smaller adjustments to the weights, and vice versa.
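Putting the update rule together on the same toy numbers, a single step moves the weight against the gradient and, as expected, lowers the loss:

import numpy as np

X = np.array([[0.5], [1.0], [1.5]])
Y = np.array([[1.0], [2.0], [3.0]])
W = np.array([[1.0]])
lr = 0.1

grad = -2 * np.dot(X.T, (Y - np.dot(X, W))) / len(X)  # about -2.333
W = W - lr * grad  # the gradient is negative, so the weight increases

print(W)                                  # about [[1.233]]
print(np.mean((Y - np.dot(X, W)) ** 2))   # about 0.686, down from about 1.167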

7.4: Training Loop Design: Iterative Process

The forward and backward passes together form one iteration of the training loop; since each iteration here processes the entire dataset, it is also one epoch. We repeat these passes for a certain number of epochs, until the model's predictions are satisfactory.

Think of the training loop as a feedback loop. The model makes predictions, gets feedback on how well it did (the loss), and uses this feedback to improve (adjust its weights). This process is repeated over and over, with the model getting better and better with each epoch.

7.5: Code Explanation: Training Loop Definition

Now, let's go through the Python code that implements the training loop. We'll explain each line to ensure you understand how the forward and backward passes work.

import numpy as np  # needed for the gradient calculation below

def train(X, Y, weights, lr, epochs):
    # Assumes model() and loss() are defined as in Section 7.2.
    for epoch in range(epochs):
        # forward pass: make predictions with the current weights
        Y_pred = model(X, weights)

        current_loss = loss(Y, Y_pred)
        print(f"Epoch {epoch}:")
        print(f"Predicted output for the first 5 samples: {Y_pred[:5].T}")
        print(f"Actual output for the first 5 samples: {Y[:5].T}")
        difference = Y - Y_pred
        print(f"Difference between actual and predicted: {difference[:5].T}")
        print(f"Current loss: {current_loss}")

        # backward pass: gradient of the mean squared error loss
        gradients = -2 * np.dot(X.T, (Y - Y_pred)) / len(X)
        print(f"Gradients for this epoch: {np.round(gradients, 3)}")  # rounded for display only

        # update weights: step against the gradient, scaled by the learning rate
        weights -= lr * gradients
        print(f"Updated weights after this epoch: {np.round(weights, 3)}")  # rounded for display only

        # compute new prediction and loss after the weight update (for logging)
        Y_pred = model(X, weights)
        new_loss = loss(Y, Y_pred)
        print(f"New prediction for the first 5 samples after weight update: {Y_pred[:5].T}")
        print(f"New loss after weight update: {new_loss}")

    return weights

This code defines a function called train that takes five arguments: X (the input data), Y (the actual values), weights (the initial weights), lr (the learning rate), and epochs (the number of epochs).

The function starts with a for loop that runs for the specified number of epochs. In each epoch, the function performs a forward pass, calculates the loss, calculates the gradients, updates the weights, and then computes the prediction and loss once more so we can see the immediate effect of the update (this second forward pass is purely for logging).
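Finally, here is a hypothetical way to call train on synthetic data. The dataset, shapes, and hyperparameters are illustrative, not the book's:

import numpy as np

# Synthetic data: 100 samples, 1 feature, true relationship Y = 2X
rng = np.random.default_rng(0)
X = rng.random((100, 1))
Y = 2 * X

weights = np.zeros((1, 1))  # start knowing nothing
weights = train(X, Y, weights, lr=0.1, epochs=100)
print(weights)              # converges toward [[2.0]]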

That's it for this chapter! You now understand how the training loop works and how to implement it in Python. In the next chapter, we will execute the training loop and evaluate our model. Stay tuned!