NumPy Integration in Jupyter

Updated: May 18, 2026

# CHAPTER 11

1. Chapter Introduction

Pandas is excellent for tables with different data types (e.g., text names, integer ages, boolean flags). But what if you have a massive matrix of pure numbers, like the pixels in an image or a neural network's weights? For high-performance mathematics, you need NumPy (Numerical Python). In fact, Pandas is built *on top* of NumPy. This chapter teaches you how to use NumPy arrays and why they are essential for data science.

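2. The NumPy Array (ndarray)

The core feature of NumPy is the ndarray (N-dimensional array). When printed it looks like a Python list, but it works differently under the hood: every element has the same data type, and the values are typically stored in one contiguous block of memory. That layout is what makes fast bulk operations possible.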

Cell 1:

python
import numpy as np
# 'np' is the standard industry alias

# Create a Python list
py_list = [1, 2, 3, 4, 5]

# Convert it to a NumPy array
np_array = np.array(py_list)

print("Python List:", py_list)
print("NumPy Array:", np_array)
print("Type:", type(np_array))
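
To see what sets an ndarray apart from a list, it can help to inspect the metadata every array carries. The following is a minimal sketch that reuses the values from Cell 1; the exact integer dtype shown depends on your platform.

```python
import numpy as np

np_array = np.array([1, 2, 3, 4, 5])

# Every ndarray records its element type, dimensionality, and shape
print("dtype:", np_array.dtype)   # a platform integer type, e.g. int64
print("ndim:", np_array.ndim)     # number of dimensions
print("shape:", np_array.shape)   # length along each dimension
print("size:", np_array.size)     # total number of elements
```

A plain Python list has none of these attributes, because its elements can be of any type.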

3. Why NumPy Arrays? (Vectorization)

If you have a list of numbers and want to multiply them all by 2, plain Python needs an explicit for loop (or a list comprehension) that handles one element at a time. NumPy uses vectorization: you write a single operation on the whole array, and the per-element loop runs in optimized, compiled C code instead of the Python interpreter. For large arrays this is usually much faster.

Cell 2:

python
# The Slow Python Way
prices_list = [10, 20, 30]
doubled_list = []
for price in prices_list:
    doubled_list.append(price * 2)
print("List Math:", doubled_list)

# The Fast NumPy Way (Vectorization)
prices_array = np.array([10, 20, 30])
doubled_array = prices_array * 2  # Multiplies every element in one vectorized operation
print("NumPy Math:", doubled_array)
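
Vectorization is not limited to multiplying by a single number. As a small sketch (the prices and quantities arrays are made-up illustration data), the same idea applies to arithmetic between two arrays of equal length and to comparisons:

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([3, 1, 2])

# Element-wise multiply: each price pairs with the quantity at the same index
line_totals = prices * quantities
print("Line totals:", line_totals)

# Comparisons are vectorized too and produce a boolean array
is_expensive = prices > 15
print("Over 15:", is_expensive)

# A boolean array can be used as a mask to select matching elements
expensive_prices = prices[is_expensive]
print("Expensive prices:", expensive_prices)
```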

4. Creating Arrays from Scratch

NumPy provides built-in functions to quickly generate arrays of numbers without typing them out.

Cell 3:

python
# 1. Array of zeros (useful for initializing matrices)
zeros = np.zeros(5)
print("Zeros:", zeros)

# 2. Array of ones
ones = np.ones(3)
print("Ones:", ones)

# 3. A sequence of numbers (similar to range())
seq = np.arange(0, 10, 2)  # Start at 0, stop before 10 (the stop value is excluded), step by 2
print("Sequence:", seq)

# 4. Evenly spaced numbers (useful for plotting graphs)
linspace = np.linspace(0, 1, 5)  # 5 evenly spaced numbers from 0 to 1, both endpoints included
print("Linspace:", linspace)
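
Two details of these functions are easy to miss: the shape can be a tuple, and arange and linspace treat the end value differently. A short sketch, with variable names chosen only for illustration:

```python
import numpy as np

# Passing a tuple as the shape creates a 2D array (2 rows, 3 columns)
grid = np.zeros((2, 3))
print("Grid shape:", grid.shape)

# The dtype argument sets the element type (zeros and ones default to float64)
int_ones = np.ones(4, dtype=int)
print("Integer ones:", int_ones)

# arange excludes the stop value; linspace includes it by default
stepped = np.arange(0, 10, 2)
spaced = np.linspace(0, 10, 6)
print("arange:", stepped)    # 0, 2, 4, 6, 8
print("linspace:", spaced)   # 0, 2, 4, 6, 8, 10
```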

5. Multi-Dimensional Arrays (Matrices)

Machine learning heavily relies on 2D matrices (rows and columns) and 3D tensors.

Cell 4:

python
# Create a 2D array (Matrix)
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

print("Matrix:\n", matrix)
print("\nShape (Rows, Cols):", matrix.shape) # Output: (2, 3)
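
Once you have a 2D array, you will usually want to pull out individual values, rows, or columns. The sketch below reuses the matrix from Cell 4 and shows basic indexing plus reshape, which rearranges the same elements into a different shape.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

# Index with [row, column]; both are zero-based
print("Row 0, Col 2:", matrix[0, 2])    # 3

# A colon selects everything along that axis
second_row = matrix[1, :]
first_col = matrix[:, 0]
print("Second row:", second_row)        # [4 5 6]
print("First column:", first_col)       # [1 4]

# reshape rearranges the same 6 elements into 3 rows and 2 columns
reshaped = matrix.reshape(3, 2)
print("Reshaped:\n", reshaped)
```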

6. Mathematical and Statistical Functions

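NumPy has hundreds of built-in mathematical functions that operate on whole arrays at once. Python's standard math module works on one number at a time, so applying it to a large dataset means looping in Python, which is much slower.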

Cell 5:

python
data = np.array([15, 22, 9, 31, 14])

# Statistics
print("Mean:", np.mean(data))
print("Max:", np.max(data))
print("Standard Deviation:", np.std(data))  # population standard deviation (ddof=0) by default

# Find the INDEX of the maximum value
max_index = np.argmax(data)
print(f"The highest value is at index {max_index}")
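
On 2D data, most of these functions accept an axis argument that controls whether you get one result per column or per row. The following sketch uses a made-up scores array; it also shows the ddof parameter of np.std, which switches between population and sample standard deviation.

```python
import numpy as np

scores = np.array([
    [15, 22, 9],
    [31, 14, 20]
])

# axis=0 collapses the rows, giving one result per column
col_means = np.mean(scores, axis=0)
print("Column means:", col_means)   # [23.  18.  14.5]

# axis=1 collapses the columns, giving one result per row
row_max = np.max(scores, axis=1)
print("Row maxima:", row_max)       # [22 31]

data = np.array([15, 22, 9, 31, 14])
# np.std divides by N by default (ddof=0); ddof=1 divides by N - 1
pop_std = np.std(data)
sample_std = np.std(data, ddof=1)
print("Population std:", pop_std)
print("Sample std:", sample_std)
```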

7. Mini Project: Performance Testing

Let's use a Jupyter magic command (%timeit) to measure how much faster a vectorized NumPy operation is than a plain Python list comprehension.

Cell 6:

python
import numpy as np

# Create 1 million sequential integers (0 to 999,999)
massive_list = list(range(1000000))
massive_array = np.arange(1000000)

print("Timing Python List (List Comprehension):")
# %timeit runs the code multiple times to get an accurate average speed
%timeit [x * 2 for x in massive_list]

print("\nTiming NumPy Array (Vectorization):")
%timeit massive_array * 2

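*If you run this in Jupyter, you will typically see NumPy finish many times faster, often by one to two orders of magnitude. The exact ratio depends on your machine, the array size, and the operation.*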
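
The %timeit magic only works inside IPython and Jupyter. If you want to run the same comparison as a plain Python script, one option is the standard library's timeit module. This is a rough sketch rather than a rigorous benchmark; the array size and the repeat count of 20 are arbitrary choices made to keep the run short.

```python
import timeit
import numpy as np

n = 100_000  # smaller than the notebook example so the script finishes quickly
numbers_list = list(range(n))
numbers_array = np.arange(n)

# timeit.timeit calls the function `number` times and returns the total seconds
list_seconds = timeit.timeit(lambda: [x * 2 for x in numbers_list], number=20)
array_seconds = timeit.timeit(lambda: numbers_array * 2, number=20)

print(f"List comprehension: {list_seconds:.4f} s for 20 runs")
print(f"NumPy vectorized:   {array_seconds:.4f} s for 20 runs")
print(f"Speedup: {list_seconds / array_seconds:.1f}x")
```

The printed speedup will vary from run to run and from machine to machine.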

8. Common Mistakes

  • Mixing Data Types: A Python list can hold [1, "Apple", True] as three different types. A standard NumPy array stores a single data type (usually float or int), so when you mix types NumPy silently converts everything to one common type. Integers mixed with floats become floats, and if any element is a string, every element becomes a string.
  • Using math.sqrt() on an array: The standard Python math module expects a single number, so passing it an array with more than one element raises a TypeError. Use np.sqrt(array) instead, which works element by element.
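
Both mistakes are easy to reproduce. The sketch below uses small illustrative arrays to show the type conversion and the math.sqrt failure; the exact wording of the error message may differ between NumPy versions.

```python
import math
import numpy as np

# Mixing ints and floats upcasts everything to float
int_float = np.array([1, 2.5])
print(int_float.dtype)                  # float64

# Adding a string turns every element into a string
mixed = np.array([1, "Apple", True])
print(mixed, mixed.dtype.kind)          # kind 'U' means Unicode string

values = np.array([4.0, 9.0, 16.0])

# math.sqrt expects a single number, so a multi-element array raises TypeError
try:
    math.sqrt(values)
except TypeError as err:
    print("math.sqrt failed:", err)

# np.sqrt works element-wise
roots = np.sqrt(values)
print("np.sqrt:", roots)
```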

9. MCQs

Question 1

What is the standard alias for importing NumPy?

Question 2

What is the core data structure in NumPy?

Question 3

Why are NumPy arrays faster than standard Python lists?

Question 4

If you execute np.array([10, 20, 30]) * 2, what is the result?

Question 5

How do you create an array of 5 zeros?

Question 6

What does matrix.shape return for a 2D array?

Question 7

Which function returns a sequence of numbers (e.g., from 0 to 10 step 2)?

Question 8

What Jupyter Magic Command can you put at the start of a line to benchmark its execution speed?

Question 9

Which function returns the INDEX of the highest value in an array?

Question 10

Can a NumPy array hold both integers and strings at the same time?

10. Interview Questions

  • Q: Explain the concept of Vectorization in NumPy. Why do data scientists use it instead of Python for loops?
  • Q: What is the difference between a Python List and a NumPy Array regarding data types?

11. Summary

NumPy is the foundational mathematical library for Python data science. Its core structure, the ndarray, stores elements of a single data type, which lets it run vectorized operations in highly optimized C code. You can do math on entire arrays and matrices in one expression without writing for loops. Jupyter's %timeit magic command is a convenient way to measure these performance gains.

12. Next Chapter Recommendation

In Chapter 12: Data Visualization in Jupyter, we will bring our data to life by using Matplotlib and Seaborn to draw charts and graphs directly inside the notebook interface.
