Chapter 9: Testing the Model

Welcome to Chapter 9! After training our model in the previous chapter, we're now ready to test it on unseen data. This step matters because it shows how well the model generalizes to data it did not see during training. In this chapter, we'll create a new synthetic dataset to test our model, plot the model's predictions against the ground truth, and make predictions for specific inputs. We'll also walk through the Python code for each step.

9.1: Creating an Unseen Synthetic Dataset for Testing the Model

The first step in testing our model is to create a new dataset that the model hasn't seen before. This is often referred to as the test set. We'll create this dataset in the same way we created our original dataset, using the same linear function and adding a bit of random noise. However, we'll generate new random inputs so that the model hasn't seen these exact data points before.

Here's the Python code to create the test set:

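# Step 6: Create a test set
import numpy as np  # skip if already imported in an earlier step

X_test = np.round(np.random.rand(100, 1), 3)  # 100 new inputs drawn uniformly from [0, 1)
Y_test = np.round(10 * X_test + 0.2 * np.random.randn(100, 1), 3) # corresponding outputs with added noise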

In this code, X_test is a 100x1 NumPy array of random numbers drawn uniformly from the interval [0, 1) and rounded to 3 decimal places. Y_test is a 100x1 NumPy array of outputs generated by multiplying X_test by 10 and adding random noise, also rounded to 3 decimal places. The noise comes from np.random.randn(100, 1), which creates a 100x1 NumPy array of samples from a standard normal distribution (mean 0, standard deviation 1). Multiplying it by 0.2 scales the noise down to a standard deviation of 0.2.
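Because the code above draws fresh random numbers on every run, the test loss will vary slightly from run to run. If we want a repeatable test set, one option is to use a seeded random generator. The following is a minimal sketch, not part of the original steps: the seed value 42 and the name rng are arbitrary choices made here for illustration.

```python
import numpy as np

# Assumption: a fixed seed (42 is arbitrary) so the same test set is produced on every run
rng = np.random.default_rng(42)

# 100 inputs drawn uniformly from [0, 1), rounded to 3 decimal places
X_test = np.round(rng.random((100, 1)), 3)

# Outputs from the same linear rule (slope 10) plus Gaussian noise with standard deviation 0.2
noise = 0.2 * rng.standard_normal((100, 1))
Y_test = np.round(10 * X_test + noise, 3)

print(X_test.shape, Y_test.shape)
```

Running this twice gives identical arrays, which makes it easier to compare test losses after changing the training code.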

9.2: Testing the Model

Now that we have our test set, we can use it to test our model. We do this by passing the test inputs X_test to our model function, which uses the final weights learned during training to make predictions. We then calculate the test loss, which measures how far off these predictions are from the actual test outputs Y_test.

Here's the Python code to test the model:

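# Step 7: Test the model
# Relies on model(), loss(), and weights from the earlier steps
def test(X, Y, weights):
    Y_pred = model(X, weights)    # predictions made with the trained weights
    test_loss = loss(Y, Y_pred)   # error against the true test outputs
    print(f"Test loss: {test_loss}")
    return Y_pred

# Run the test on the unseen data
Y_pred = test(X_test, Y_test, weights)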

In this code, the test function takes the test inputs X, the actual test outputs Y, and the final weights as arguments. It uses the model function to make predictions for X using the weights, calculates the test loss using the loss function, prints the test loss, and returns the predictions.

The line Y_pred = test(X_test, Y_test, weights) calls the test function with X_test, Y_test, and the final weights, and stores the predictions in Y_pred.

9.3: Graphing Predictions vs. Ground Truth

After testing our model, it's helpful to visualize the model's predictions against the ground truth. This can give us a better sense of how well our model is performing.

We can do this by creating a scatter plot with the test inputs on the x-axis and the outputs on the y-axis. We'll plot the actual test outputs in one color and the model's predictions for the same inputs in another color.

Here's the Python code to create this plot:

# Step 8: Graph the predictions vs true
import matplotlib.pyplot as plt  # skip if already imported in an earlier step

plt.figure(figsize=(8, 6))
plt.scatter(X_test, Y_test, color='blue', label='True values')
plt.scatter(X_test, Y_pred, color='red', label='Predicted values')
plt.legend()
plt.xlabel('Input')
plt.ylabel('Output')
plt.title('True vs Predicted Values')
plt.show()

In this code, plt.figure(figsize=(8, 6)) creates a new figure with a size of 8x6 inches. plt.scatter(X_test, Y_test, color='blue', label='True values') creates a scatter plot of the actual test outputs in blue. plt.scatter(X_test, Y_pred, color='red', label='Predicted values') creates a scatter plot of the model's predictions in red. plt.legend() adds a legend to the plot. plt.xlabel('Input') and plt.ylabel('Output') label the x-axis and y-axis, respectively. plt.title('True vs Predicted Values') adds a title to the plot. Finally, plt.show() displays the plot.

When reading the plot, keep in mind that the true values contain random noise, so the blue points scatter around the underlying linear trend. If the model trained well, the red points should run through the middle of the blue points.
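The plot gives a visual impression of the fit. As a complement, we can summarize the same comparison numerically by looking at the residuals, that is, the differences between the true and predicted outputs. This is a rough sketch rather than one of the original steps: the seed, the stand-in weight of 10.0, and the name residuals are assumptions made here so the snippet runs on its own.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

# Test set built the same way as in Section 9.1
X_test = np.round(rng.random((100, 1)), 3)
Y_test = np.round(10 * X_test + 0.2 * rng.standard_normal((100, 1)), 3)

# Hypothetical stand-in for the trained weights
weights = np.array([[10.0]])
Y_pred = np.dot(X_test, weights)

# Residuals: vertical distance between each blue point and its red point
residuals = Y_test - Y_pred

print(f"Mean residual: {residuals.mean():.3f}")
print(f"Residual standard deviation: {residuals.std():.3f}")
print(f"Largest absolute error: {np.abs(residuals).max():.3f}")
```

For a good fit on this data, we would expect the mean residual to be near 0 and the residual standard deviation to be near the noise level of 0.2.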

9.4: Predicting for a Specific Input

In addition to testing our model on a set of test data, we can also use our model to make predictions for specific inputs. This can be useful in many applications where we want to make predictions for new data points.

Here's the Python code to make a prediction for a specific input:

# Function for making predictions
def predict(X, weights):
    return np.dot(X, weights)

# Let's say we have a new input
new_input = np.array([[-100]])

# Use our trained weights to make a prediction
prediction = predict(new_input, weights)

print(f"Input: {new_input}")
print(f"Prediction: {prediction}")

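In this code, the predict function takes an input X and the final weights as arguments, and returns the model's prediction for X, computed as the dot product of X and the weights. The line new_input = np.array([[-100]]) creates a 1x1 array holding the input -100. The line prediction = predict(new_input, weights) calls the predict function with the new input and the final weights, and stores the result in prediction. The last two lines print the new input and the prediction.

Note that -100 lies far outside the range [0, 1) that our training and test inputs came from, so this prediction is an extrapolation. Because the data was generated with a slope of 10, a well-trained weight should be close to 10 and the prediction should be close to -1000.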
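The same predict function also works for several inputs at once, as long as each input sits in its own row. Here is a small sketch under stated assumptions: the weight value 10.0 is a hypothetical stand-in for the trained weights, and the names new_inputs and predictions are introduced only for this example.

```python
import numpy as np

def predict(X, weights):
    return np.dot(X, weights)

# Hypothetical stand-in for the trained weights
weights = np.array([[10.0]])

# One input per row: two inside the training range [0, 1) and one far outside it
new_inputs = np.array([[0.25], [0.5], [-100.0]])
predictions = predict(new_inputs, weights)

for x, y in zip(new_inputs.ravel(), predictions.ravel()):
    # Flag inputs that fall outside the range the model was trained on
    note = "" if 0.0 <= x < 1.0 else " (extrapolation)"
    print(f"Input: {x}, Prediction: {y}{note}")
```

Flagging out-of-range inputs like this is a design choice, not a requirement. It simply reminds us that the model was only checked on inputs between 0 and 1.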

That's it for this chapter! You now understand how to test a machine learning model on unseen data, visualize the model's predictions, and make predictions for specific inputs. In the next chapter, we'll wrap up by summarizing what we've learned and suggesting next steps for further learning. Stay tuned!