PyTorch Essentials
CHAPTER 07 Intermediate

Building Your First Neural Network in PyTorch

Updated: May 16, 2026
6 min read

1. Introduction

We have covered the theory of neurons and the math of Tensors. Now, it is time for the main event. We are going to build a functional Artificial Neural Network. In PyTorch, building a network requires an understanding of Object-Oriented Programming (Classes). We will subclass PyTorch's nn.Module to create our own custom "Brain." In this chapter, we will design the architecture for a model capable of recognizing patterns.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Subclass the nn.Module class to create a custom PyTorch model.
  • Define network layers (like nn.Linear) inside the __init__ method.
  • Define the data flow using the forward method.
  • Utilize the nn.Sequential API for simpler networks.
  • Pass a dummy tensor through the network to verify its architecture.

3. The nn.Module Class

In PyTorch, every neural network must inherit from a base class called nn.Module. This tells PyTorch: "Hey, treat this Python class as a Neural Network. Keep track of all its weights, and prepare it for Autograd."

Every custom PyTorch model defines two methods:

  1. __init__(self): This is where you define the layers (the building blocks).
  2. forward(self, x): This is where you define *how* the data (x) flows through the layers.
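
As a minimal sketch of that bookkeeping (the class name TinyNet and its layer sizes are made up for illustration), assigning an nn.Linear to an attribute inside __init__ appears to be all nn.Module needs in order to register the layer's weights:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # Assigning a layer to an attribute registers its parameters.
        self.layer = nn.Linear(in_features=4, out_features=2)

    def forward(self, x):
        return self.layer(x)

tiny = TinyNet()

# named_parameters() lists every tensor the module is tracking.
param_shapes = {name: tuple(p.shape) for name, p in tiny.named_parameters()}
print(param_shapes)  # {'layer.weight': (2, 4), 'layer.bias': (2,)}
```

Each tracked tensor also has requires_grad set to True by default, which is what makes it visible to Autograd.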

4. Step-by-Step: Building a Model

Let's build a network designed to take an image of a handwritten digit (28x28 pixels = 784 total pixels) and predict which of the 10 digits (0-9) it is.
python
import torch
import torch.nn as nn

# 1. Subclass nn.Module
class DigitClassifier(nn.Module):
    def __init__(self):
        super().__init__() # Must call the parent class initializer
        
        # 2. Define the layers
        # Input layer takes 784 pixels. The first hidden layer has 128 neurons.
        self.hidden1 = nn.Linear(in_features=784, out_features=128)
        
        # Second hidden layer takes 128 neurons, outputs 64.
        self.hidden2 = nn.Linear(in_features=128, out_features=64)
        
        # Output layer takes 64 neurons, outputs exactly 10 (one for each digit 0-9)
        self.output = nn.Linear(in_features=64, out_features=10)
        
        # We need activation functions (covered deeply in the next chapter)
        self.relu = nn.ReLU()

    # 3. Define the Forward Pass
    def forward(self, x):
        # Pass the data (x) through hidden layer 1, then apply ReLU
        x = self.hidden1(x)
        x = self.relu(x)
        
        # Pass through hidden layer 2, apply ReLU
        x = self.hidden2(x)
        x = self.relu(x)
        
        # Pass through output layer
        x = self.output(x)
        return x

# Instantiate the brain!
model = DigitClassifier()
print(model)
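
The layers above expect each image as a flat vector of 784 values, while image data usually arrives as a 28x28 grid. Here is a minimal sketch of the reshaping step, assuming a batch of grayscale images shaped (batch, 1, 28, 28). Random data stands in for real images:

```python
import torch
import torch.nn as nn

# Hypothetical batch: 32 grayscale images, 1 channel, 28x28 pixels each.
images = torch.randn(32, 1, 28, 28)

# nn.Flatten keeps the batch dimension and merges the rest: 1 * 28 * 28 = 784.
flatten = nn.Flatten()
flat_images = flatten(images)
print(flat_images.shape)  # torch.Size([32, 784])

# The flattened batch now matches in_features=784 of a first linear layer.
first_layer = nn.Linear(in_features=784, out_features=128)
hidden = first_layer(flat_images)
print(hidden.shape)  # torch.Size([32, 128])
```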

5. Passing Data Through the Model

To verify that our model works, let's create a "fake" image tensor and pass it through the network. Notice that we call model(dummy_image) directly; we do *not* call model.forward(dummy_image). Calling the model invokes nn.Module's __call__ method, which runs forward for us along with any registered hooks.
python
# Create a dummy tensor representing a batch of 1 image, with 784 pixels
dummy_image = torch.randn(1, 784)

# Pass it through the untrained network
predictions = model(dummy_image)

print("Output Shape:", predictions.shape)
# Output Shape: torch.Size([1, 10]) -> 1 image, 10 raw scores (logits), not yet probabilities
print("Raw Predictions:", predictions)
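
The ten output values are raw scores, often called logits, rather than probabilities. One common way to turn them into probabilities and a predicted digit is softmax followed by argmax. The sketch below uses a made-up tensor of scores in place of real model output so that it runs on its own:

```python
import torch

# Hypothetical raw scores for one image (shape [1, 10]), standing in for model output.
logits = torch.tensor([[0.1, -1.2, 0.3, 2.0, -0.5, 0.0, 0.7, -0.3, 1.1, -2.0]])

# Softmax over the class dimension turns the scores into values that sum to 1.
probabilities = torch.softmax(logits, dim=1)

# The index of the largest probability is the predicted digit.
predicted_digit = probabilities.argmax(dim=1)

print(probabilities.sum(dim=1))  # sums to 1 per row, up to floating-point error
print(predicted_digit)  # tensor([3])
```

With an untrained model the predicted digit is essentially arbitrary; the point is only the shape of the computation.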

6. Mini Project: The nn.Sequential Shortcut

If your network is just a straight line of layers (like the one above), writing out the forward method can feel repetitive. PyTorch provides a shortcut called nn.Sequential. It automatically chains the layers together.
python
import torch.nn as nn

# This creates the exact same architecture as the class above!
sequential_model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10)
)

print(sequential_model)
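
A quick way to check that the shortcut behaves like the class version is to push a dummy batch through it and count its parameters. The sketch below rebuilds the same stack so that it runs on its own; the batch size of 4 is an arbitrary choice:

```python
import torch
import torch.nn as nn

sequential_model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# A dummy batch of 4 flattened images.
dummy_batch = torch.randn(4, 784)
output = sequential_model(dummy_batch)
print(output.shape)  # torch.Size([4, 10])

# Layers inside nn.Sequential are addressed by position rather than by name.
print(sequential_model[0])

seq_params = sum(p.numel() for p in sequential_model.parameters())
print(seq_params)  # 109386
```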

*Why use classes at all then?* Because advanced models (like ResNets or Transformers) don't move in a straight line. Data splits, merges, and skips layers. nn.Sequential cannot handle complex branching logic, but nn.Module classes can do anything.
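
To make the "skips layers" idea concrete, here is a minimal sketch of a skip connection. SkipBlock is a hypothetical name and a simplified stand-in, not the actual ResNet block:

```python
import torch
import torch.nn as nn

class SkipBlock(nn.Module):  # hypothetical name, simplified illustration
    def __init__(self, features):
        super().__init__()
        self.linear = nn.Linear(features, features)
        self.relu = nn.ReLU()

    def forward(self, x):
        # The input takes two paths: through the layer, and around it.
        transformed = self.relu(self.linear(x))
        # Merge the two paths by adding the original input back in.
        return transformed + x

block = SkipBlock(features=64)
skip_input = torch.randn(2, 64)
skip_output = block(skip_input)
print(skip_output.shape)  # torch.Size([2, 64])
```

Because forward reuses x after it has already passed through a layer, this data flow is not a simple chain, which is why it is written as a class here.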

7. Understanding the Parameters

When you instantiate nn.Linear(784, 128), PyTorch automatically creates a weight tensor filled with random values and a bias tensor of size 128. The weight is stored with shape 128x784, that is (out_features, in_features). You can inspect them:
python
# View the random weights of the first layer
print("Weights shape:", model.hidden1.weight.shape)

# Count total parameters in the model
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters to train: {total_params}")
# Output: Total parameters to train: 109386
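
The total can be checked by hand: each linear layer holds in_features x out_features weights plus out_features biases. The sketch below uses three stand-alone nn.Linear layers with the same sizes as those in DigitClassifier, so that it runs without the class definition:

```python
import torch.nn as nn

# Stand-ins with the same sizes as the three layers in DigitClassifier.
layers = {
    "hidden1": nn.Linear(784, 128),
    "hidden2": nn.Linear(128, 64),
    "output": nn.Linear(64, 10),
}

counts = {}
for name, layer in layers.items():
    # weight is stored as (out_features, in_features); bias as (out_features,)
    counts[name] = layer.weight.numel() + layer.bias.numel()
    print(name, tuple(layer.weight.shape), tuple(layer.bias.shape), counts[name])

total = sum(counts.values())
print(total)  # 100480 + 8256 + 650 = 109386
```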

8. Common Mistakes

  • Forgetting super().__init__(): If you forget this line inside your __init__ method, nn.Module's internal bookkeeping is never set up, so the first layer assignment raises an AttributeError as soon as you instantiate the model.
  • Shape mismatch between layers: The out_features of Layer 1 MUST match the in_features of Layer 2. If Layer 1 outputs 128 and Layer 2 expects 100, PyTorch raises a RuntimeError about incompatible matrix shapes during the forward pass.
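
Both mistakes can be reproduced in a few lines. The sketch below uses a deliberately broken class (NoSuper is a hypothetical name) and a deliberately mismatched nn.Sequential, and catches the errors so that the script keeps running:

```python
import torch
import torch.nn as nn

class NoSuper(nn.Module):  # hypothetical, deliberately broken
    def __init__(self):
        # super().__init__() is missing on purpose
        self.layer = nn.Linear(4, 2)

try:
    NoSuper()
    super_error = None
except AttributeError as err:
    super_error = str(err)
print(super_error)

# Layer 1 outputs 128 features, but layer 2 expects 100.
mismatched = nn.Sequential(nn.Linear(784, 128), nn.Linear(100, 10))
try:
    mismatched(torch.randn(1, 784))
    shape_error = None
except RuntimeError as err:
    shape_error = str(err)
print(shape_error)
```

Note that the mismatched model is created without complaint; the error only appears once data actually flows through it.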

9. Best Practices

  • Define layers, don't define math: Inside __init__, you define the objects (e.g., self.layer1 = nn.Linear(784, 128)). Do not put data operations or math here. All math and data routing belongs inside the forward() method.

10. Exercises

  1. Modify the DigitClassifier class to add a third hidden layer with 32 neurons. Make sure you adjust the input features of the output layer to match!
  2. Write a 2-layer neural network using the nn.Sequential shortcut.

11. MCQ Quiz with Answers

Question 1

Every custom neural network class in PyTorch must inherit from which base class?

Answer: nn.Module.

Question 2

If you use nn.Sequential, what PyTorch method is automatically handled for you behind the scenes?

Answer: The forward method.

12. Interview Questions

  • Q: Explain the purpose of the __init__ method versus the forward method in a PyTorch nn.Module class.
  • Q: When would you choose to subclass nn.Module manually instead of using the nn.Sequential shortcut?

13. FAQs

Q: Do I need to write a backward() method? A: No! Because PyTorch uses the Autograd engine, you only write the forward pass. PyTorch automatically figures out the calculus for the backward pass based on how you routed the data.
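
As a small illustration of that answer, the sketch below runs a forward pass through a single layer, reduces the result to a made-up scalar "loss" (the mean of the outputs, chosen only for simplicity), and lets Autograd fill in the gradients:

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=3, out_features=1)
inputs = torch.randn(5, 3)

# Before any backward pass, no gradients are stored.
print(layer.weight.grad)  # None

# A made-up scalar "loss": the mean of the outputs.
loss = layer(inputs).mean()

# Autograd computes a gradient for every parameter; no backward() method was written.
loss.backward()

print(layer.weight.grad.shape)  # torch.Size([1, 3])
print(layer.bias.grad.shape)  # torch.Size([1])
```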

14. Summary

You just built your first Artificial Intelligence architecture! By subclassing nn.Module, defining our layers in __init__, and routing the data in forward(), we created a fully functional neural network graph. While nn.Sequential offers a great shortcut for simple models, understanding the class structure is critical for mastering PyTorch.

15. Next Chapter Recommendation

In our code, we used nn.ReLU() without explaining it. What does it actually do? If you pick the wrong activation or loss function, your model can fail to learn at all. In Chapter 8: Activation Functions and Loss Functions, we will demystify the math that makes learning possible.
