CHAPTER 07 Beginner

NumPy Broadcasting and Vectorization

Updated: May 18, 2026

5 min read

1. Chapter Introduction

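Broadcasting is NumPy's mechanism for performing arithmetic on arrays of different but compatible shapes. Vectorization replaces explicit Python loops with whole-array operations that run in compiled code, which often makes numeric code one to two orders of magnitude faster; the exact factor depends on the operation, the array size, and the hardware. These are the two most important performance concepts in NumPy.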

2. Vectorization vs Python Loops

python

import numpy as np
import time

n = 1_000_000
data = np.random.rand(n)

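# Python loop (slow): one interpreted multiply-add per element
start = time.perf_counter()
result_py = [x * 2 + 1 for x in data]
py_time = time.perf_counter() - start

# NumPy vectorized (fast): one whole-array expression in compiled code
start = time.perf_counter()
result_np = data * 2 + 1
np_time = time.perf_counter() - start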

print(f"Python loop: {py_time:.3f}s")
print(f"NumPy:       {np_time:.4f}s")
print(f"Speedup:     {py_time / np_time:.0f}x faster")
# Typical output: NumPy is ~50-200x faster
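
A single timed run can be noisy, so the comparison above can also be sketched with the standard-library `timeit` module. This is a minimal illustration, not a rigorous benchmark: the helper names `with_loop` and `vectorized`, the array size, and the repeat counts are all assumptions chosen for the example.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # seeded only so reruns use the same data
data = rng.random(200_000)


def with_loop(values):
    # One Python-level multiply and add per element
    return [x * 2 + 1 for x in values]


def vectorized(values):
    # One whole-array expression evaluated in compiled code
    return values * 2 + 1


# Both approaches must agree before their timings are worth comparing
assert np.allclose(with_loop(data), vectorized(data))

# Taking the best of several runs reduces noise from other processes
loop_time = min(timeit.repeat(lambda: with_loop(data), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: vectorized(data), number=10, repeat=3)) / 10

print(f"loop:       {loop_time:.4f}s")
print(f"vectorized: {vec_time:.6f}s")
print(f"speedup:    {loop_time / vec_time:.0f}x")
```

The printed speedup will vary from machine to machine, which is why the figures in this chapter are given as a range.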

3. Broadcasting Rules

text

Broadcasting Rules (shapes are compared dimension by dimension, starting from the rightmost):
1. If the arrays have different ndim, prepend 1s to the shape of the array with fewer dimensions
2. Two dimensions are compatible if they are equal OR one of them is 1
3. The output size in each dimension is the larger of the two sizes

Example:
arr (3, 4) + scalar ()
→ scalar is treated as (1, 1) and broadcasts to (3, 4)
→ Result: (3, 4)

arr (3, 4) + row (1, 4)
→ row broadcasts to (3, 4)
→ Result: (3, 4)

col (3, 1) + row (1, 4)
→ col broadcasts to (3, 4), row to (3, 4)
→ Result: (3, 4)

INCOMPATIBLE: (3, 4) + (3, 3) → Error! 4 ≠ 3 and neither is 1
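
The three rules can be written out as a small function and checked against NumPy itself. This is a sketch for illustration only: `broadcast_shape` is a hypothetical helper, not a NumPy API, and the comparison relies on `np.broadcast_shapes`, which assumes NumPy 1.20 or newer.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: left-pad the shorter shape with 1s
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        # Rule 2: sizes must match, or one of them must be 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # Rule 3: the size-1 side stretches, so the output takes the other size
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)


# Compare the hand-written rules with NumPy's own answer
for shapes in [((3, 4), ()), ((3, 4), (1, 4)), ((3, 1), (1, 4)), ((3, 4), (4,))]:
    assert broadcast_shape(*shapes) == np.broadcast_shapes(*shapes)
    print(shapes, "->", broadcast_shape(*shapes))

try:
    broadcast_shape((3, 4), (3, 3))
except ValueError as exc:
    print("incompatible:", exc)
```

Working on shapes alone is handy for checking a planned operation before allocating any large arrays.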

4. Broadcasting Examples

python

# Example 1: Scalar broadcasting
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(matrix + 10)     # Adds 10 to every element
print(matrix * 2)      # Doubles every element

# Example 2: Row vector broadcasting
row = np.array([1, 2, 3])          # shape (3,)
matrix = np.ones((4, 3))           # shape (4, 3)
result = matrix + row              # row broadcasts to (4,3)
print(result)

# Example 3: Column vector broadcasting
col = np.array([[1], [2], [3], [4]])   # shape (4, 1)
matrix = np.ones((4, 3))               # shape (4, 3)
result = matrix + col                   # col broadcasts to (4,3)
print(result)
# [[2,2,2],[3,3,3],[4,4,4],[5,5,5]]

# Example 4: Row vector + column vector → outer sum (table of pairwise sums)
x = np.array([1, 2, 3])     # shape (3,)
y = np.array([[10], [20], [30]])  # shape (3, 1)
outer = x + y   # (3,) + (3,1) → (3,3)
print(outer)
# [[11,12,13],[21,22,23],[31,32,33]]
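
Adding a column to a row produces a table of pairwise sums; replacing `+` with `*` produces the outer product. The sketch below builds the column with `np.newaxis` instead of typing nested brackets, and cross-checks both tables against `np.outer` and `np.add.outer`. The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])          # shape (3,)
col = x[:, np.newaxis]           # shape (3, 1): the same data viewed as a column

pairwise_sum = col + x           # (3, 1) + (3,) -> (3, 3)
pairwise_product = col * x       # same shapes, multiplication instead

print(pairwise_sum)
print(pairwise_product)

# np.outer is the built-in outer product; np.add.outer is the additive analogue
assert np.array_equal(pairwise_product, np.outer(x, x))
assert np.array_equal(pairwise_sum, np.add.outer(x, x))
```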

5. Practical Vectorization Patterns

python

# Pattern 1: Normalize data (z-score)
data = np.array([85, 92, 78, 96, 67])
mean = np.mean(data)
std = np.std(data)
normalized = (data - mean) / std    # Vectorized
print(normalized.round(3))

# Pattern 2: Min-Max scaling
scaled = (data - data.min()) / (data.max() - data.min())
print(scaled.round(3))

# Pattern 3: Distance of each point from the origin
points = np.array([[1, 2], [4, 6], [7, 1]])
# Euclidean norm of each row
distances = np.sqrt(np.sum(points**2, axis=1))
print(distances.round(2))    # [2.24 7.21 7.07]

# Pattern 4: Conditional vectorized assignment
sales = np.array([120, 340, 280, 510, 90, 450])
bonus = np.where(sales > 300, sales * 0.1, sales * 0.05)
print(bonus)   # 10% bonus if >300, else 5%
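
The distance computation in Pattern 3 measures each point against the origin. Distances between every pair of points can be sketched with broadcasting by inserting new axes so that the subtraction produces one difference vector per pair. This is one possible approach, and it builds an intermediate array of shape (n, n, 2), so it assumes n is small enough to fit in memory.

```python
import numpy as np

points = np.array([[1, 2], [4, 6], [7, 1]])        # shape (3, 2)

# (3, 1, 2) - (1, 3, 2) -> (3, 3, 2): coordinate differences for every pair
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
pairwise = np.sqrt(np.sum(diff ** 2, axis=-1))     # shape (3, 3)

print(pairwise.round(2))
```

Entry `[i, j]` holds the distance between point `i` and point `j`, so the diagonal is zero and the matrix is symmetric; for example, `pairwise[0, 1]` is 5.0, the distance between (1, 2) and (4, 6).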

6. Mini Project: Salary Calculator System

python

import numpy as np

# Employee data
employees = {
    'names': np.array(['Alice', 'Bob', 'Carol', 'David', 'Eve', 'Frank']),
    'base_salary': np.array([55000, 72000, 48000, 88000, 61000, 95000]),
    'performance': np.array([0.95, 0.82, 1.05, 0.78, 1.12, 0.91]),  # multiplier
    'years': np.array([3, 7, 2, 12, 5, 15]),
    'department': np.array(['Eng', 'Mkt', 'Sales', 'Eng', 'Mkt', 'Eng'])
}

# Vectorized calculations (no loops!)
base = employees['base_salary']
perf = employees['performance']
years = employees['years']

# Annual bonus: half of base salary times the performance score above 0.8
bonus = base * (perf - 0.8) * 0.5
bonus = np.maximum(bonus, 0)   # No negative bonus

# Seniority raise: 2% per year (capped at 20%)
seniority_pct = np.minimum(years * 0.02, 0.20)
seniority_raise = base * seniority_pct

# Total compensation
total = base + bonus + seniority_raise

# Tax brackets (vectorized)
tax_rate = np.where(total > 100000, 0.35,
           np.where(total > 75000, 0.28,
           np.where(total > 50000, 0.22, 0.15)))

net = total * (1 - tax_rate)

print(f"{&#039;Name':<8} {'Base':>8} {'Bonus':>7} {'Seniority':>10} {'Total':>9} {'Tax%':>5} {'Net':>9}")
print("-" * 60)
for i, name in enumerate(employees['names']):
    print(f"{name:<8} ${base[i]:>7,.0f} ${bonus[i]:>6,.0f} ${seniority_raise[i]:>9,.0f} ${total[i]:>8,.0f} {tax_rate[i]*100:>4.0f}% ${net[i]:>8,.0f}")

print(f"\nTeam total payroll: ${np.sum(total):,.0f}")
print(f"Average net salary: ${np.mean(net):,.0f}")
print(f"Highest earner: {employees['names'][np.argmax(net)]}")
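
Nested `np.where` calls become hard to read as brackets are added. As an alternative sketch, `np.select` takes a list of conditions and a matching list of values and returns the value for the first condition that holds. The `total` array below is hypothetical data chosen so that each bracket is hit once; the thresholds and rates are the ones used in the project.

```python
import numpy as np

# Hypothetical totals, one per tax bracket
total = np.array([42_000, 63_000, 80_000, 120_000])

conditions = [total > 100_000, total > 75_000, total > 50_000]
rates = [0.35, 0.28, 0.22]
# The first matching condition wins, so list the brackets from highest to lowest
tax_rate = np.select(conditions, rates, default=0.15)

# Same result as the nested np.where form
nested = np.where(total > 100_000, 0.35,
         np.where(total > 75_000, 0.28,
         np.where(total > 50_000, 0.22, 0.15)))
assert np.array_equal(tax_rate, nested)

print(tax_rate)   # [0.15 0.22 0.28 0.35]
```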

7. Common Mistakes

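Broadcasting incompatibility: (3, 4) + (3, 3) fails because the last dimensions are 4 and 3 and neither is 1. Shapes must be compatible dimension by dimension, compared from the right. When a 1D array of shape (n,) should line up with the rows of an (n, m) array, use reshape(-1, 1) to turn it into a column of shape (n, 1).

np.where vs Python if: np.where(condition, x, y) is vectorized and picks x or y element by element. A plain Python if on an array with more than one element raises a ValueError, because the truth value of the whole array is ambiguous.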
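
Both mistakes are easy to reproduce. The sketch below triggers each error on purpose and then shows the usual fix; the arrays and the "high"/"low" labels are made up for illustration.

```python
import numpy as np

a = np.ones((3, 4))
b = np.ones((3, 3))
try:
    a + b                                  # trailing dimensions 4 and 3 do not match
except ValueError as exc:
    print("shape error:", exc)

weights = np.array([1.0, 2.0, 3.0])        # shape (3,) does not line up with (3, 4)
scaled = a * weights.reshape(-1, 1)        # shape (3, 1) broadcasts across the columns
print(scaled.shape)                        # (3, 4)

sales = np.array([120, 340, 280])
try:
    if sales > 300:                        # ambiguous for more than one element
        pass
except ValueError as exc:
    print("truth-value error:", exc)

labels = np.where(sales > 300, "high", "low")   # element-wise alternative to if/else
print(labels)
```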

8. MCQs

Question 1

NumPy vectorization is faster because?

Question 2

Broadcasting allows?

Question 3

`(3,4) + (4,)` broadcasting result shape?

Question 4

`np.where(cond, x, y)` returns?

Question 5

Normalizing data means?

Question 6

`(3,1) + (1,4)` broadcasts to?

Question 7

`np.maximum(a, 0)` returns?

Question 8

Typical speedup of NumPy vs Python loop?

Question 9

Z-score normalization formula?

Question 10

`reshape(-1, 1)` converts 1D array to?

9. Interview Questions

Q: Explain NumPy broadcasting with an example.

Q: Why is vectorized code preferred over Python loops in data science?

10. Summary

Vectorization replaces Python loops with whole-array operations, typically giving speedups of around 50-200x for simple arithmetic like the benchmark in this chapter. Broadcasting stretches scalars and lower-dimensional arrays to match compatible array shapes automatically. np.where provides vectorized conditional logic. These patterns are essential for writing production-quality data science code.

11. Next Chapter Recommendation

In Chapter 8: NumPy Random Module, we generate random data for simulations, statistical sampling, and machine learning dataset creation.
