Building Your First Neural Network in PyTorch
# CHAPTER 7
1. Introduction
We have covered the theory of neurons and the math of Tensors. Now, it is time for the main event. We are going to build a functional Artificial Neural Network. In PyTorch, building a network requires an understanding of Object-Oriented Programming (Classes). We will subclass PyTorch's nn.Module to create our own custom "Brain." In this chapter, we will design the architecture for a model capable of recognizing patterns.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Subclass the nn.Module class to create a custom PyTorch model.
- Define network layers (like nn.Linear) inside the __init__ method.
- Define the data flow using the forward method.
- Utilize the nn.Sequential API for simpler networks.
- Pass a dummy tensor through the network to verify its architecture.
3. The nn.Module Class
In PyTorch, every neural network must inherit from a base class called nn.Module. This tells PyTorch: "Hey, treat this Python class as a Neural Network. Keep track of all its weights, and prepare it for Autograd."
A custom PyTorch model typically defines two methods:
   1. __init__(self): This is where you define the layers (the building blocks).
   2. forward(self, x): This is where you define *how* the data (x) flows through the layers.
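The two methods can be sketched in a minimal skeleton. The class name TinyNet, the attribute names, and the layer sizes (4, 8, 2) are made up for illustration and do not come from this chapter's model.

```python
import torch
from torch import nn


class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        # Let nn.Module set up its bookkeeping before any layer is assigned.
        super().__init__()
        # Layers are created here; no data flows through them yet.
        self.hidden = nn.Linear(4, 8)
        self.output = nn.Linear(8, 2)

    def forward(self, x):
        # Data routing happens here: input -> hidden -> ReLU -> output.
        x = torch.relu(self.hidden(x))
        return self.output(x)


model = TinyNet()
print(model)  # lists the layers that nn.Module registered
```

Printing the model is a quick way to confirm that the layers assigned in __init__ were picked up.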
4. Step-by-Step: Building a Model
Let's build a network designed to take an image of a handwritten digit (28x28 pixels = 784 total pixels) and predict which of the 10 digits (0-9) it is.
5. Passing Data Through the Model
To verify our model works, let's create a "fake" image tensor and pass it through the network. Notice that we just call model(dummy_image). We do *not* explicitly call model.forward(dummy_image). Calling the model goes through nn.Module's __call__ method, which runs any registered hooks and then calls forward for you.
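A sketch of the digit classifier and the dummy pass might look like the following. The input size (784) and output size (10) come from the text. The hidden sizes (128 and 64), the attribute names, and the exact layer arrangement are assumptions made for illustration.

```python
import torch
from torch import nn


class DigitClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # Hidden sizes 128 and 64 are assumed for illustration.
        self.layer1 = nn.Linear(784, 128)  # 28 * 28 = 784 input pixels
        self.layer2 = nn.Linear(128, 64)
        self.output = nn.Linear(64, 10)    # one score per digit 0-9
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.relu(self.layer2(x))
        return self.output(x)  # raw scores (logits), no softmax applied


model = DigitClassifier()

# A "fake" batch holding one flattened 28x28 image of random values.
dummy_image = torch.rand(1, 784)

# Call the model directly; nn.Module.__call__ dispatches to forward().
scores = model(dummy_image)
print(scores.shape)  # torch.Size([1, 10])
```

An output shape of (1, 10) confirms the architecture is wired correctly: one image in, ten scores out. The values themselves are meaningless until the model is trained.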
6. Mini Project: The nn.Sequential Shortcut
If your network is just a straight line of layers (like the one above), writing out the forward method can feel repetitive. PyTorch provides a shortcut called nn.Sequential. It automatically chains the layers together.
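As a rough sketch, a straight-line digit classifier could be written with nn.Sequential like this. The hidden sizes (128 and 64) and the variable names are assumptions for illustration.

```python
import torch
from torch import nn

# Layers run in the order listed; no forward() method is written by hand.
# Hidden sizes 128 and 64 are assumed for illustration.
seq_model = nn.Sequential(
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

dummy_image = torch.rand(1, 784)
print(seq_model(dummy_image).shape)  # torch.Size([1, 10])
```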
*Why use classes at all then?* Because advanced models (like ResNets or Transformers) don't move in a straight line. Data splits, merges, and skips layers. nn.Sequential cannot handle complex branching logic, but nn.Module classes can do anything.
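To make the branching point concrete, here is a minimal sketch of a skip connection, the basic idea behind ResNet-style blocks. The class name SkipBlock and its sizes are hypothetical. It is a simplified illustration, not the actual ResNet block.

```python
import torch
from torch import nn


class SkipBlock(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, features):
        super().__init__()
        self.linear1 = nn.Linear(features, features)
        self.linear2 = nn.Linear(features, features)

    def forward(self, x):
        # Main path: two linear layers with a ReLU in between.
        out = self.linear2(torch.relu(self.linear1(x)))
        # Skip path: the original input bypasses both layers and is added back.
        return torch.relu(out + x)


block = SkipBlock(16)
y = block(torch.rand(4, 16))
print(y.shape)  # torch.Size([4, 16])
```

The input x is used twice inside forward, which a plain chain of layers cannot express.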
7. Understanding the Parameters
When you instantiate nn.Linear(784, 128), PyTorch automatically generates a Tensor of random Weights and a Tensor of Biases (size 128). The weight Tensor is stored as (out_features, in_features), so its shape is 128x784, not 784x128. You can view them:
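One way to inspect these tensors is sketched below; the variable names are illustrative. Note that PyTorch stores the weight of nn.Linear as (out_features, in_features), so the printed weight shape is 128 by 784.

```python
from torch import nn

layer = nn.Linear(784, 128)

print(layer.weight.shape)  # torch.Size([128, 784])
print(layer.bias.shape)    # torch.Size([128])

# Total trainable values: 128 * 784 weights + 128 biases.
total = sum(p.numel() for p in layer.parameters())
print(total)  # 100480
```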
8. Common Mistakes
- Forgetting super().__init__(): If you forget this line inside your __init__ method, nn.Module has not yet set up the internal bookkeeping it uses to track layers and weights, and assigning your first layer raises an AttributeError.
- Shape mismatch between layers: The out_features of Layer 1 MUST match the in_features of Layer 2. If Layer 1 outputs 128 and Layer 2 expects 100, PyTorch raises a RuntimeError about incompatible matrix shapes during the forward pass.
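Both mistakes can be reproduced in a few lines. The sketch below uses deliberately broken, hypothetical classes and catches the errors so the script keeps running. The exact error messages may differ between PyTorch versions.

```python
import torch
from torch import nn


class NoSuperInit(nn.Module):  # hypothetical, deliberately broken
    def __init__(self):
        # super().__init__() is missing on purpose.
        self.layer1 = nn.Linear(784, 128)


try:
    NoSuperInit()
    init_error = None
except AttributeError as err:
    init_error = err
    print("Missing super().__init__():", err)

# Layer 1 outputs 128 features, but layer 2 expects 100.
# Building the model succeeds; the error only appears when data flows through.
mismatched = nn.Sequential(nn.Linear(784, 128), nn.Linear(100, 10))

try:
    mismatched(torch.rand(1, 784))
    shape_error = None
except RuntimeError as err:
    shape_error = err
    print("Shape mismatch:", err)
```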
9. Best Practices
- Define layers, don't define math: Inside __init__, you define the objects (e.g., self.layer1 = nn.Linear(...)). Do not put data operations or math here. All math and data routing belongs strictly inside the forward() method.
10. Exercises
   1. Modify the DigitClassifier class to add a third hidden layer with 32 neurons. Make sure you adjust the input features of the output layer to match!
   2. Write a 2-layer neural network using the nn.Sequential shortcut.
11. MCQ Quiz with Answers
   1. Every custom neural network class in PyTorch must inherit from which base class?
      Answer: nn.Module.
   2. If you use nn.Sequential, what PyTorch method is automatically handled for you behind the scenes?
      Answer: The forward method.
12. Interview Questions
- Q: Explain the purpose of the __init__ method versus the forward method in a PyTorch nn.Module class.
- Q: When would you choose to subclass nn.Module manually instead of using the nn.Sequential shortcut?
13. FAQs
Q: Do I need to write a backward() method?
A: No! Because PyTorch uses the Autograd engine, you only write the forward pass. PyTorch automatically figures out the calculus for the backward pass based on how you routed the data.
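A small sketch of this behavior, using a single nn.Linear layer and a hand-written mean squared error as stand-ins (both chosen for illustration only):

```python
import torch
from torch import nn

model = nn.Linear(3, 1)
x = torch.rand(5, 3)
target = torch.rand(5, 1)

# Forward pass only; no backward() method is defined anywhere.
loss = ((model(x) - target) ** 2).mean()

print(model.weight.grad)        # None: no gradients computed yet
loss.backward()                 # Autograd walks the recorded graph in reverse
print(model.weight.grad.shape)  # torch.Size([1, 3])
```

After loss.backward(), every parameter that contributed to the loss holds a gradient in its .grad attribute.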
14. Summary
You just built your first Artificial Intelligence architecture! By subclassing nn.Module, defining our layers in __init__, and routing the data in forward(), we created a fully functional neural network graph. While nn.Sequential offers a great shortcut for simple models, understanding the class structure is critical for mastering PyTorch.
15. Next Chapter Recommendation
In our code, we mysteriously used components like nn.ReLU(). What does this actually mean? If you pick the wrong activation or loss function, your model may fail to learn at all. In Chapter 8: Activation Functions and Loss Functions, we will demystify the math that makes learning possible.