CHAPTER 09 Intermediate

Training and Evaluating Models in PyTorch

Updated: May 16, 2026

1. Introduction

In Keras/TensorFlow, training a model is a single line of code: model.fit(). Core PyTorch has no fit() function. You write the entire Training Loop yourself using standard Python for loops. While this seems daunting at first, it is the primary reason researchers love PyTorch: you have line-by-line control over exactly how the model learns. In this chapter, we will write the 5-step PyTorch Training Loop.

2. Learning Objectives

By the end of this chapter, you will be able to:

Define an Optimizer (like Adam or SGD).

Write a standard PyTorch Training Loop.

Understand the 5 critical steps of the training loop, including Backpropagation, in code.

Write an Evaluation Loop to test the model on unseen data.

Switch layer behavior with model.train() and model.eval(), and save memory during evaluation with torch.no_grad().

3. The Optimizer

Before looping, we must define the Optimizer. The Optimizer is the engine that looks at the gradients calculated by Autograd and physically updates the Weights of the network.

python

import torch
import torch.nn as nn
import torch.optim as optim

# Assume `model` is a previously defined nn.Module
# We pass the model's parameters (weights) to the optimizer so it knows what to update
# lr = Learning Rate (the size of the step it takes to fix the error)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Define the Loss Function (Criterion)
criterion = nn.CrossEntropyLoss()
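
To make "updates the Weights" concrete, here is a minimal sketch of a single optimizer step. It swaps Adam for plain SGD, because the SGD rule is simple enough to check by hand: new weight = old weight - lr * gradient. The toy model, the random data, and the learning rate of 0.1 are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical toy model: 4 input features, 3 output classes
model = nn.Linear(4, 3)
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

X = torch.randn(8, 4)            # 8 made-up samples
y = torch.randint(0, 3, (8,))    # 8 made-up class labels

loss = criterion(model(X), y)
optimizer.zero_grad()
loss.backward()                  # Autograd fills in model.weight.grad

# What plain SGD should produce: weight - lr * gradient
weight_before = model.weight.detach().clone()
expected_weight = weight_before - 0.1 * model.weight.grad

optimizer.step()                 # the optimizer applies the update in place

step_matches_formula = torch.allclose(model.weight.detach(), expected_weight)
print(step_matches_formula)
```

Adam follows the same pattern (read the gradients, then change the parameters in place), but it scales each update using running averages of past gradients, so its step cannot be checked with this one-line formula.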

4. The 5-Step Training Loop

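Nearly every PyTorch training loop repeats the same 5 steps for each batch of data. Memorize this sequence. The one flexible point is Step 3: the gradients can be zeroed at any point before the backward pass, as long as it happens once per iteration.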

python

# Before the loop: set the model to Training Mode
# (Dropout layers become active and BatchNorm layers use per-batch statistics)
model.train()

epochs = 10

for epoch in range(epochs):
    # Assume X_train and y_train are our Tensors
    
    # STEP 1: Forward Pass (Make a guess)
    predictions = model(X_train)
    
    # STEP 2: Calculate the Loss (How wrong was the guess?)
    loss = criterion(predictions, y_train)
    
    # STEP 3: Zero the Gradients
    # PyTorch accumulates gradients by default. We must clear them from the last loop!
    optimizer.zero_grad()
    
    # STEP 4: Backward Pass (Autograd calculates the calculus derivatives)
    loss.backward()
    
    # STEP 5: Optimizer Step (Update the weights based on the derivatives)
    optimizer.step()
    
    if (epoch+1) % 2 == 0:
        print(f"Epoch: {epoch+1} | Loss: {loss.item():.4f}")
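
The loop above assumes that `model`, `X_train`, and `y_train` already exist. The following is a self-contained sketch of the same 5 steps that can be run as is. The synthetic two-feature dataset, the small network, and the hyperparameters are assumptions chosen only so that the example finishes quickly.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Made-up, linearly separable data: class is 1 when the two features sum to > 0
X_train = torch.randn(200, 2)
y_train = (X_train.sum(dim=1) > 0).long()

# Hypothetical small classifier with 2 output classes
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

model.train()
losses = []
for epoch in range(100):
    predictions = model(X_train)            # STEP 1: forward pass
    loss = criterion(predictions, y_train)  # STEP 2: calculate the loss
    optimizer.zero_grad()                   # STEP 3: zero the gradients
    loss.backward()                         # STEP 4: backward pass
    optimizer.step()                        # STEP 5: optimizer step
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f} | last loss: {losses[-1]:.4f}")
```

Storing `loss.item()` rather than `loss` keeps only a plain Python float in the list, so the history does not hold on to tensors from earlier iterations.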

5. The Evaluation Loop (Testing)

If a student memorizes a textbook, they will score 100% on a practice quiz, but fail the real exam. To prevent this, we test the model on a "Test Set" of data it has never seen before.

During Evaluation, we do NOT want the model to learn. We do not want to calculate gradients or update weights.

python

# 1. Set the model to Evaluation Mode
# (This turns OFF Dropout and makes BatchNorm use its stored running statistics)
model.eval()

# 2. Turn off Autograd Engine to save massive amounts of RAM and speed up testing
with torch.no_grad():
    # Forward pass on the unseen test data
    test_predictions = model(X_test)
    test_loss = criterion(test_predictions, y_test)
    
    # Calculate Accuracy (Assuming classification)
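    # Get the index of the highest score for each sample (the raw outputs are logits,
    # and the largest logit is also the class with the highest probability)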
    _, predicted_classes = torch.max(test_predictions, dim=1)
    
    # Count how many predictions match the true labels
    correct = (predicted_classes == y_test).sum().item()
    total = y_test.size(0)
    accuracy = (correct / total) * 100
    
    print(f"Test Loss: {test_loss.item():.4f} | Test Accuracy: {accuracy:.2f}%")
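
The accuracy arithmetic is easier to follow on numbers small enough to check by hand. This sketch uses four made-up rows of logits and made-up labels. It also uses `torch.argmax`, which returns the same indices as the second value of `torch.max(..., dim=1)`.

```python
import torch

# Made-up raw scores (logits) for 4 samples and 3 classes
test_predictions = torch.tensor([
    [2.0, 0.5, 0.1],   # highest score at index 0
    [0.2, 1.5, 0.3],   # highest score at index 1
    [0.1, 0.2, 3.0],   # highest score at index 2
    [1.2, 0.9, 0.4],   # highest score at index 0
])
y_test = torch.tensor([0, 1, 2, 1])  # the last sample is misclassified

predicted_classes = torch.argmax(test_predictions, dim=1)
correct = (predicted_classes == y_test).sum().item()
accuracy = correct / y_test.size(0) * 100

# Softmax turns logits into probabilities but does not change the ranking
probabilities = torch.softmax(test_predictions, dim=1)
same_ranking = torch.equal(torch.argmax(probabilities, dim=1), predicted_classes)

print(predicted_classes.tolist(), accuracy, same_ranking)
```

Three of the four predictions match their labels, so the accuracy is 75.0.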

6. Mini Project: Putting it Together

A professional workflow runs a Training phase and an Evaluation phase *inside* the same Epoch loop, tracking the loss on held-out data (test_loss in the code below, often called val_loss) at every epoch to catch overfitting early.

python

# The Ultimate Training Loop Structure
epochs = 50

for epoch in range(epochs):
    
    ### TRAINING PHASE ###
    model.train()
    optimizer.zero_grad()
    train_preds = model(X_train)
    train_loss = criterion(train_preds, y_train)
    train_loss.backward()
    optimizer.step()
    
    ### EVALUATION PHASE ###
    model.eval()
    with torch.no_grad():
        test_preds = model(X_test)
        test_loss = criterion(test_preds, y_test)
        
    # Print progress
    if epoch % 10 == 0:
        print(f"Epoch {epoch} | Train Loss: {train_loss.item():.4f} | Test Loss: {test_loss.item():.4f}")
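
One possible way to act on the tracked losses is to record them and remember the epoch with the lowest held-out loss. The sketch below is a runnable variant of the structure above under stated assumptions: the data is synthetic, and the names `X_val`, `y_val`, `train_history`, `val_history`, `best_val_loss`, and `best_epoch` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Made-up data, split into a training part and a held-out part
X = torch.randn(300, 2)
y = (X[:, 0] * X[:, 1] > 0).long()
X_train, y_train = X[:200], y[:200]
X_val, y_val = X[200:], y[200:]

model = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

train_history, val_history = [], []
best_val_loss, best_epoch = float("inf"), -1

for epoch in range(50):
    # Training phase
    model.train()
    optimizer.zero_grad()
    train_loss = criterion(model(X_train), y_train)
    train_loss.backward()
    optimizer.step()

    # Evaluation phase: no gradients, no weight updates
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val), y_val).item()

    train_history.append(train_loss.item())
    val_history.append(val_loss)

    # Remember the epoch where the held-out loss was lowest
    if val_loss < best_val_loss:
        best_val_loss, best_epoch = val_loss, epoch

print(f"best held-out loss {best_val_loss:.4f} at epoch {best_epoch}")
```

If the training loss keeps falling while the held-out loss starts rising after `best_epoch`, that gap is the usual sign of overfitting.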

7. Common Mistakes

Forgetting optimizer.zero_grad(): If you forget this, PyTorch adds each new gradient to the ones left over from earlier iterations. The accumulated gradients no longer match the current loss, so the updates grow larger and less accurate, and the loss can diverge or even become NaN (Not a Number).
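
The accumulation is easy to observe on a single tensor. This small sketch uses a made-up two-element weight and a made-up loss whose gradient is always 3, so any other value in `.grad` must come from leftovers.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d(sum(3 * w))/dw = 3 for each element
(3 * w).sum().backward()
first_grad = w.grad.clone()

# Second backward pass WITHOUT clearing: the new gradient is added on top
(3 * w).sum().backward()
accumulated_grad = w.grad.clone()

# Clearing the gradient first gives the correct value again
w.grad.zero_()
(3 * w).sum().backward()
fresh_grad = w.grad.clone()

print(first_grad.tolist(), accumulated_grad.tolist(), fresh_grad.tolist())
```

The three printed lists are [3.0, 3.0], [6.0, 6.0], and [3.0, 3.0]. Calling the optimizer's zero_grad() method performs this clearing for every parameter the optimizer manages.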

Evaluating without with torch.no_grad(): The model will still output the correct prediction, but PyTorch will quietly build and hold the computational graph for every forward pass, which wastes memory and time. If you do this for a large test set, your program can crash with an "Out of Memory" (OOM) error.
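
Whether a graph was recorded can be checked through a tensor's `grad_fn` attribute. The sketch below compares the same forward pass with and without gradient tracking turned off. The tiny linear model and random input are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)       # hypothetical tiny model
X_test = torch.randn(5, 4)    # made-up test inputs

model.eval()

# Without no_grad: the output keeps a reference to the computational graph
tracked = model(X_test)
tracked_has_graph = tracked.grad_fn is not None

# With no_grad: no graph is recorded for the same forward pass
with torch.no_grad():
    untracked = model(X_test)
untracked_has_graph = untracked.grad_fn is not None

# The predictions themselves are the same either way
same_values = torch.allclose(tracked, untracked)
print(tracked_has_graph, untracked_has_graph, same_values)
```

Note that model.eval() alone does not stop graph recording. It only changes layer behavior, which is why the two calls are used together.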

8. Best Practices

Always call model.train() and model.eval(): Even if your simple model doesn't currently use Dropout or BatchNorm layers, call these methods anyway. It builds muscle memory for when you start building complex architectures where forgetting them will silently ruin your accuracy.
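
To see what "silently ruin" can look like, here is a minimal sketch with a model that contains only a Dropout layer. The layer, the dropout probability of 0.5, and the all-ones input are assumptions chosen to make the effect obvious.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model made of a single Dropout layer
model = nn.Sequential(nn.Dropout(p=0.5))
x = torch.ones(1000)

# Training mode: roughly half the values are zeroed, the rest are scaled by 1 / (1 - p)
model.train()
train_out = model(x)
zeros_in_train = (train_out == 0).sum().item()
largest_train_value = train_out.max().item()

# Evaluation mode: Dropout passes the input through unchanged
model.eval()
eval_out = model(x)
eval_is_identity = torch.equal(eval_out, x)

print(zeros_in_train, largest_train_value, eval_is_identity)
```

A model left in training mode at test time would therefore produce noisy, randomly varying predictions without raising any error.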

9. Exercises

1. Write the 5 steps of the PyTorch training loop in order.

2. What does loss.item() do, and why do we use it in our print statements instead of just printing loss?

10. MCQ Quiz with Answers

Question 1

What is the purpose of `torch.no_grad()` during the evaluation loop?

Question 2

Which step physically updates the weights and biases inside the neural network?

11. Interview Questions

Q: Explain the relationship between loss.backward() and optimizer.step() in PyTorch. What exactly is each function doing mathematically?

Q: If you notice your training loss is fluctuating wildly up and down instead of smoothly decreasing, what Hyperparameter in the Optimizer should you likely adjust? (Answer: The Learning Rate, which is probably too high.)
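
The effect of an oversized Learning Rate can be sketched on the simplest possible problem: minimizing loss = w**2 with plain SGD. The function name `run_sgd`, the starting weight, and both learning rates are hypothetical choices. With this loss, each step multiplies w by (1 - 2 * lr), so a small rate shrinks w toward the minimum while a rate above 1.0 makes w jump across the minimum and land farther away each time.

```python
import torch
import torch.optim as optim


def run_sgd(lr, steps=10):
    """Minimize loss = w**2 with plain SGD and return the loss at every step."""
    w = torch.tensor([1.0], requires_grad=True)
    optimizer = optim.SGD([w], lr=lr)
    history = []
    for _ in range(steps):
        loss = (w ** 2).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        history.append(loss.item())
    return history


small_lr_losses = run_sgd(lr=0.1)   # each step multiplies w by 0.8
large_lr_losses = run_sgd(lr=1.1)   # each step multiplies w by -1.2

print(f"lr=0.1: {small_lr_losses[0]:.2f} -> {small_lr_losses[-1]:.4f}")
print(f"lr=1.1: {large_lr_losses[0]:.2f} -> {large_lr_losses[-1]:.4f}")
```

On real networks the loss surface is not a clean parabola, so the same overshooting tends to show up as erratic jumps in the loss rather than steady growth.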

12. FAQs

Q: My training loop is incredibly slow. What's wrong? A: In our examples, we passed the *entire* X_train dataset into the model at once. If X_train is 50,000 images, your computer will choke. In the real world, we use "Batches." We cover this in the next chapter!
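
As a rough preview of the idea, batches can be taken by slicing the tensors by hand. This is only a sketch under assumptions: the data is random, the batch size of 128 is illustrative, and it does not use the data-loading tools from the next chapter.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

X_train = torch.randn(1000, 10)          # made-up features
y_train = torch.randint(0, 3, (1000,))   # made-up labels

model = nn.Linear(10, 3)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

batch_size = 128   # illustrative value
batches_seen = 0

model.train()
for start in range(0, X_train.size(0), batch_size):
    # Slice out one small chunk instead of using the whole dataset
    X_batch = X_train[start:start + batch_size]
    y_batch = y_train[start:start + batch_size]

    # The same 5 steps, applied to the chunk
    loss = criterion(model(X_batch), y_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    batches_seen += 1

print(batches_seen)
```

With 1,000 samples and a batch size of 128, one pass over the data takes 8 steps, and the last batch holds the remaining 104 samples.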

13. Summary

The PyTorch Training Loop is the beating heart of Deep Learning. By mastering the 5-step sequence (Forward Pass, Loss Calculation, Zeroing Gradients, Backpropagation, and Optimizer Step), you can train most neural networks with the same basic loop. Furthermore, by strictly separating Training and Evaluation modes and wrapping evaluation in torch.no_grad(), you keep your test results reliable and your memory usage low.

14. Next Chapter Recommendation

Passing the entire dataset into the model at once is mathematically sound, but practically impossible for large datasets. We need a way to spoon-feed data to the GPU in small "batches." In Chapter 10: PyTorch Datasets and DataLoaders, we will learn how to build enterprise-grade data pipelines.
