CHAPTER 05
Beginner
Image Transformations and Filtering
Updated: May 14, 2026
20 min read
1. Introduction
When you take a picture with your phone, it rarely comes out perfectly level, and you almost always apply a filter to make it look better. In Computer Vision, we apply mathematical transformations to rotate, resize, and warp images. We also apply "Filters" (also known as Kernels) to blur backgrounds or sharpen edges. In this chapter, we will learn the math behind how an image is transformed without destroying the data.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Understand affine transformations (Scaling, Rotation, Translation).
- Explain what a Kernel/Convolution matrix is.
- Understand how Blurring (Smoothing) reduces high-frequency noise.
- Apply basic filters using OpenCV.
3. Beginner-Friendly Explanation
Imagine a piece of graph paper where every square is colored in to make a picture.
- Transformations: If you want to make the picture bigger (Scaling), you stretch the graph paper. Because you stretched it, you now have empty squares between your colors! The computer has to mathematically guess what color should fill the new empty squares.
- Filtering: Imagine looking at the graph paper through a tiny magnifying glass that only shows a 3x3 grid of squares. As you slide the magnifying glass across the paper, you mix the 9 colors together to create a new color. This "sliding window" is exactly how computers blur or sharpen an image!
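The guessing step described in the Transformations bullet above can be sketched on a single row of pixels. This is a minimal illustration using NumPy's `np.interp`, not how any particular library resizes images; the three brightness values are made up for the example.

```python
import numpy as np

# One row of three original pixel brightness values (made-up numbers).
row = np.array([10.0, 20.0, 40.0])

# Where the original pixels sit, and where the five pixels of the
# stretched row sit (same span, finer spacing).
old_positions = np.arange(len(row))                # 0, 1, 2
new_positions = np.linspace(0, len(row) - 1, 5)    # 0, 0.5, 1, 1.5, 2

# Linear interpolation: each new pixel is a weighted mix of its two
# nearest original neighbors.
stretched = np.interp(new_positions, old_positions, row)
print(stretched)  # [10. 15. 20. 30. 40.]
```

The two new squares (15 and 30) are the computer's guesses: each sits halfway between its original neighbors.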
4. Image Transformations (Affine)
Transformations physically move the pixels.
- 1. Translation (Shifting): Moving the entire image by a fixed offset, for example 50 pixels to the right.
- 2. Rotation: Spinning the image by an angle (e.g., 45 degrees). If the output canvas keeps the original width and height, the rotated corners fall outside it and get cut off, and the areas left uncovered are filled with black triangles!
- 3. Scaling (Resizing): Shrinking or enlarging. Either way, the new pixel grid no longer lines up with the old one, so OpenCV uses *Interpolation*: a mathematical estimate of each new pixel based on the colors of the surrounding original pixels.
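Each of the three transformations above can be written as a small 2x3 matrix that maps an old pixel coordinate to a new one. The sketch below uses plain NumPy and the standard textbook form of these matrices; the helper names (`translation`, `rotation`, `scaling`, `transform_point`) are illustrative, and the rotation here is about the origin, so sign and center conventions may differ from what a given library uses.

```python
import numpy as np

def translation(tx, ty):
    # Shift every point by (tx, ty).
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty]])

def rotation(angle_deg):
    # Rotate every point about the origin (0, 0) by angle_deg.
    t = np.deg2rad(angle_deg)
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0]])

def scaling(sx, sy):
    # Stretch x by sx and y by sy.
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0]])

def transform_point(matrix, x, y):
    # Append a 1 to the coordinate so the third column can add an offset.
    return matrix @ np.array([x, y, 1.0])

shifted = transform_point(translation(50, 0), 10, 20)   # x moves from 10 to 60
doubled = transform_point(scaling(2, 2), 10, 20)        # both coordinates double
turned = transform_point(rotation(90), 10, 0)           # lands at roughly (0, 10)
print(shifted, doubled, np.round(turned))
```

Applying the same matrix to every pixel coordinate moves the whole image; interpolation then fills in values wherever the moved coordinates do not land exactly on the pixel grid.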
5. What is a Kernel (Filter)?
A Kernel (or Convolution Matrix) is a tiny grid of numbers, usually 3x3 or 5x5. To apply a filter to an image, the computer takes this tiny grid and slides it across every single pixel in the entire image. At each step, it multiplies each pixel value under the grid by the matching number in the Kernel and adds the products together to calculate a brand new value for the center pixel. By simply changing the numbers inside this tiny grid, you can completely change what the filter does!
6. Blurring (Smoothing)
Blurring is used to remove noise (static) from an image. If you use a Mean Filter (Averaging Kernel), the 3x3 grid looks at a pixel and its 8 neighbors. It calculates the average color of all 9 pixels and assigns that average to the center pixel. By averaging everything, sharp edges and random specks of static are smoothed out into a blur.
- *Gaussian Blur:* A smarter, more natural-looking blur where the center pixel is given more mathematical "weight" than the outer pixels.
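The sliding-window calculation and the two blur kernels described above can be sketched in plain NumPy. This is a deliberately slow, loop-based illustration rather than a production routine: the function name `apply_kernel` is hypothetical, border pixels are simply skipped (so the output is smaller than the input), the kernel is not flipped (which makes no difference for symmetric kernels like these), and the Gaussian kernel is a commonly used 3x3 approximation.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide a square kernel over a 2-D grayscale image.

    Border pixels are skipped, so the result shrinks by (k - 1)
    in each dimension for a k x k kernel.
    """
    k = kernel.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + k, x:x + k]
            # Multiply matching cells, then add the products together.
            out[y, x] = np.sum(window * kernel)
    return out

# Mean filter: all 9 pixels count equally.
mean_kernel = np.ones((3, 3)) / 9.0

# Gaussian-style filter: the center pixel gets the most weight.
gaussian_kernel = np.array([[1, 2, 1],
                            [2, 4, 2],
                            [1, 2, 1]]) / 16.0

# A flat gray image with one bright speck of "static" in the middle.
image = np.full((5, 5), 10.0)
image[2, 2] = 100.0

mean_blurred = apply_kernel(image, mean_kernel)
gaussian_blurred = apply_kernel(image, gaussian_kernel)

print(mean_blurred)      # every value is 20.0: the speck is spread evenly
print(gaussian_blurred)  # 32.5 in the center, fading toward the corners
```

With the mean kernel the speck of 100 is diluted to 20 everywhere it touches. With the Gaussian kernel more of the original center value survives, which is why the result tends to look more natural.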
7. Sharpening
A sharpening Kernel does the exact opposite of a blur. It looks at the difference between a pixel and its neighbors and exaggerates that difference mathematically. If a dark pixel is next to a light pixel, the sharpening kernel makes the dark one darker and the light one lighter, creating a harsh, crisp edge.
8. Python Example: Applying Transformations
Let's rotate an image and then blur it using OpenCV.
9. Mini Project
Design the Pipeline: You are building an app that scans receipts. When users take a photo, the receipt is often tilted and slightly grainy from low light. What two operations should you apply before passing it to a text-reading AI? *(Answer: 1. A Rotation Transformation to level the text so it is horizontal. 2. A light Gaussian Blur to remove the camera grain/noise).*
10. Best Practices
- Odd-Sized Kernels: When defining a Kernel for blurring in OpenCV (like `(5, 5)` or `(11, 11)`), use odd dimensions. `cv2.GaussianBlur` raises an error for an even size such as `(4, 4)`, and although `cv2.blur` accepts one, only an odd-sized sliding window has a true "center" pixel to apply the calculation to!
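One way to guard against an accidental even size is a small helper that bumps it up to the next odd number before building the kernel tuple. `nearest_odd` is a hypothetical helper written for this sketch, not part of OpenCV.

```python
def nearest_odd(size):
    """Return size if it is odd, otherwise the next odd number up.

    Hypothetical helper: values below 1 are raised to 1 first.
    """
    size = max(1, int(size))
    return size if size % 2 == 1 else size + 1

k = nearest_odd(4)
kernel_size = (k, k)
print(kernel_size)      # (5, 5)
print(nearest_odd(11))  # 11
```

The resulting tuple can then be passed as the kernel size to a blur call such as `cv2.GaussianBlur(image, kernel_size, 0)`.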
11. Common Mistakes
- Applying blur *after* edge detection: If you are trying to find the sharp edges of a car, you must apply the blur *first* to remove the static, and *then* run the edge detector. If you run the edge detector and then blur it, you destroy the edges you just found!
12. Exercises
- 1. Explain why enlarging an image (Scaling up) produces a visibly softer or blockier result, while shrinking it usually still looks sharp.
13. Coding Challenges
Challenge 1: Write pseudocode to shrink a 4K image (3840x2160) down to a machine-learning friendly size of 224x224.
14. MCQs with Answers
Question 1
In image filtering, what is a Kernel?
Question 2
When rotating an image using Computer Vision software, what happens to the corners of the image if it is rotated 45 degrees within its original bounding box?
15. Interview Questions
- Q: Explain the mathematical concept of Convolution (sliding a Kernel over an image) and how it achieves a blur effect.
- Q: What is Image Interpolation, and why is it necessary when scaling an image to a larger size?