CHAPTER 20 Beginner

Advanced NumPy Concepts

Updated: May 18, 2026

1. Chapter Introduction

Beyond basic arrays, NumPy offers structured arrays for mixed-type data, memory-mapped arrays for datasets larger than RAM, advanced indexing tricks, and performance tools that underpin production data science workflows.

2. Advanced Indexing

```python
import numpy as np

# np.ix_: build an open mesh to select every (row, col) combination
matrix = np.arange(1, 26).reshape(5, 5)
rows = np.array([0, 2, 4])
cols = np.array([1, 3])
print(matrix[np.ix_(rows, cols)])    # 3x2 submatrix

# Boolean indexing with a 2D mask
mask = matrix > 15
print(matrix[mask])    # 1D array of the elements > 15
matrix[mask] = 0       # Set every element > 15 to 0 (in place)

# np.where: element-wise choice between two alternatives
data = np.array([10, -5, 8, -3, 12, -1, 7])
result = np.where(data > 0, data, data * -1)   # Same result as np.abs(data)
print(result)    # [10  5  8  3 12  1  7]

# np.select: pick from several choices; the first matching condition wins
conditions = [data < 0, data == 0, data > 0]
choices = ['negative', 'zero', 'positive']
# A string default keeps the dtype consistent; the implicit default of 0 (an int)
# cannot be combined with string choices in recent NumPy versions.
print(np.select(conditions, choices, default='unknown'))
```
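
To see why `np.ix_` matters, it helps to compare it with passing the two index arrays directly. The following is a small illustrative sketch (the names `grid`, `paired`, and `crossed` are chosen here and do not come from the chapter): plain fancy indexing pairs the indices element by element, while `np.ix_` selects every row/column combination.

```python
import numpy as np

grid = np.arange(1, 26).reshape(5, 5)
rows = np.array([0, 2])
cols = np.array([1, 3])

# Plain fancy indexing pairs the indices: positions (0, 1) and (2, 3)
paired = grid[rows, cols]

# np.ix_ builds an open mesh: rows {0, 2} crossed with columns {1, 3}
crossed = grid[np.ix_(rows, cols)]

print(paired)          # [ 2 14]
print(crossed.shape)   # (2, 2)
```

If the two index arrays have different lengths, the paired form fails to broadcast, whereas the `np.ix_` form still returns a len(rows) x len(cols) block.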

3. Structured Arrays

```python
import numpy as np

# Structured array: like a small database table with typed columns
employee_dtype = np.dtype([
    ('name', 'U20'),       # Unicode string, at most 20 characters
    ('age', 'i4'),         # 32-bit integer
    ('salary', 'f8'),      # 64-bit float
    ('active', 'bool')     # Boolean
])

employees = np.array([
    ('Alice', 28, 85000.0, True),
    ('Bob',   35, 72000.0, False),
    ('Carol', 31, 91000.0, True),
    ('David', 42, 55000.0, True)
], dtype=employee_dtype)

print(employees)
print(employees['name'])       # All names
print(employees['salary'])     # All salaries
print(employees[employees['active']])    # Active employees
print(employees[employees['salary'] > 75000]['name'])  # High earners
```
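
Beyond filtering, a structured array can be sorted by field and updated through a field. The sketch below uses a reduced, hypothetical `staff` table (three fields instead of the four above) to keep the example short; it is one possible illustration, not the only way to work with records.

```python
import numpy as np

staff_dtype = np.dtype([('name', 'U20'), ('age', 'i4'), ('salary', 'f8')])
staff = np.array([
    ('Alice', 28, 85000.0),
    ('Bob',   35, 72000.0),
    ('Carol', 31, 91000.0),
], dtype=staff_dtype)

# Sort records by one field; np.sort returns a sorted copy
by_salary = np.sort(staff, order='salary')
print(by_salary['name'])        # Names from lowest to highest salary

# Accessing a field returns a view, so in-place updates reach the records
staff['age'] += 1
print(staff['age'])

# Bytes per record: 20 characters * 4 bytes + 4 (i4) + 8 (f8)
print(staff.dtype.itemsize)
```

Note that the fixed-width `'U20'` field dominates the record size, which is worth keeping in mind when a table has many rows.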

4. Memory Layout and Strides

```python
import numpy as np

# C-order (row-major) vs F-order (column-major); int64 items are 8 bytes each
arr_c = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='C')  # Row-major (default)
arr_f = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='F')  # Column-major (Fortran)

print(arr_c.strides)   # (24, 8): 24 bytes to the next row, 8 to the next column
print(arr_f.strides)   # (8, 16): 8 bytes to the next row, 16 to the next column

# Check whether two arrays share memory
arr = np.arange(12)
view = arr[::2]          # View: shares memory with arr
copy = arr[::2].copy()   # Copy: independent buffer

print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, copy))   # False

# Contiguity check (affects performance)
print(arr_c.flags['C_CONTIGUOUS'])   # True
print(arr_f.flags['F_CONTIGUOUS'])   # True
arr_f_as_c = np.ascontiguousarray(arr_f)   # C-ordered copy of arr_f
```
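
The stride values can be checked by hand: the byte offset of element `[i, j]` is `i * strides[0] + j * strides[1]`. The helper below, `element_via_strides`, is a hypothetical name introduced only for this sketch, and it assumes a 2D array that is C- or F-contiguous.

```python
import numpy as np

def element_via_strides(arr, i, j):
    """Look up arr[i, j] by computing its byte offset from the strides.

    Assumes a 2D array that is C- or F-contiguous.
    """
    byte_offset = i * arr.strides[0] + j * arr.strides[1]
    memory_order = arr.ravel(order='K')   # Elements in the order they sit in memory
    return memory_order[byte_offset // arr.itemsize]

arr_c = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='C')
arr_f = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='F')

# Same logical element, different physical position in each layout
print(element_via_strides(arr_c, 1, 2), arr_c[1, 2])
print(element_via_strides(arr_f, 1, 2), arr_f[1, 2])

# Transposing only swaps the strides; no data is moved
print(arr_c.T.strides == arr_c.strides[::-1])
print(np.shares_memory(arr_c, arr_c.T))
```

In both layouts the element at `[1, 2]` sits 40 bytes into the buffer here, but the elements stored before it differ, which is why iteration order affects cache behavior.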

5. Memory-Efficient Techniques

```python
import numpy as np

# Use appropriate dtypes to save memory
arr_float64 = np.random.rand(1_000_000)
arr_float32 = arr_float64.astype(np.float32)
arr_int64   = np.arange(1_000_000, dtype=np.int64) % 1000   # Values 0..999
arr_int16   = arr_int64.astype(np.int16)  # Safe only because values fit in -32768..32767

# .nbytes reports the size of the data buffer itself
print(f"float64: {arr_float64.nbytes:,} bytes")   # 8,000,000
print(f"float32: {arr_float32.nbytes:,} bytes")   # 4,000,000 (50% saving)
print(f"int64:   {arr_int64.nbytes:,} bytes")     # 8,000,000
print(f"int16:   {arr_int16.nbytes:,} bytes")     # 2,000,000 (75% saving)

# Memory-mapped arrays: work with files larger than RAM
# np.memmap stores raw binary without a .npy header, hence the .dat extension
fp = np.memmap('large_data.dat', dtype='float32', mode='w+', shape=(10000, 1000))
fp[:] = np.random.rand(10000, 1000)
fp.flush()   # Write pending changes to disk
del fp

# Reopen read-only; data is read from disk only when it is accessed
fp_read = np.memmap('large_data.dat', dtype='float32', mode='r', shape=(10000, 1000))
print(f"First row mean: {fp_read[0].mean():.4f}")
```
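
A memory-mapped file pays off most when it is processed in pieces rather than all at once. The sketch below shows one way to do that; `chunked_column_sums`, the chunk size, and the temporary demo file are assumptions made for illustration and are not part of the chapter's code.

```python
import os
import tempfile
import numpy as np

def chunked_column_sums(path, shape, dtype=np.float32, chunk_rows=1000):
    """Sum each column of a raw binary file, reading chunk_rows rows at a time."""
    data = np.memmap(path, dtype=dtype, mode='r', shape=shape)
    totals = np.zeros(shape[1], dtype=np.float64)
    for start in range(0, shape[0], chunk_rows):
        chunk = data[start:start + chunk_rows]   # Only this slice is touched
        totals += chunk.sum(axis=0, dtype=np.float64)
    del data   # Drop the reference to the mapping
    return totals

# Small demo file in a temporary directory (hypothetical data: all ones)
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.dat')
demo_shape = (5000, 4)

writer = np.memmap(demo_path, dtype=np.float32, mode='w+', shape=demo_shape)
writer[:] = 1.0
writer.flush()
del writer

sums = chunked_column_sums(demo_path, demo_shape)
print(sums)   # Each column sums to 5000.0
```

Accumulating in float64 while the file stays float32 is a deliberate choice here: it keeps the on-disk footprint small without letting rounding error build up across chunks.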

6. Common Mistakes

Using float64 when float32 suffices: For many ML workloads, float32 halves memory use while keeping roughly 7 significant decimal digits, which is usually enough.
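
The trade-off can be measured directly. The following sketch uses random values in [0, 1) as stand-in data (an assumption; real data may behave differently at larger magnitudes) and compares buffer size and rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)
values64 = rng.random(100_000)            # float64 values in [0, 1)
values32 = values64.astype(np.float32)

print(values64.nbytes, values32.nbytes)   # 800000 400000

# Largest rounding error introduced by the conversion
max_error = np.abs(values64 - values32.astype(np.float64)).max()
print(max_error)

# Machine epsilon: relative spacing between adjacent representable values
print(np.finfo(np.float32).eps)   # about 1.19e-07
print(np.finfo(np.float64).eps)   # about 2.22e-16
```

Whether that error is acceptable depends on the computation; long accumulations and differences of nearly equal numbers are the usual cases where float64 is still worth its cost.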

Unintended views: arr[::2] returns a view, so modifying it also modifies the original. Use .copy() when you need an independent array.
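
A short demonstration of the view pitfall, with illustrative variable names chosen for this sketch:

```python
import numpy as np

original = np.arange(6)          # [0 1 2 3 4 5]

evens_view = original[::2]       # Basic slicing returns a view
evens_view[0] = 99               # Also changes original[0]

evens_copy = original[::2].copy()
evens_copy[1] = -1               # original is unaffected

print(original)                      # [99  1  2  3  4  5]
print(evens_view.base is original)   # True: the view borrows original's buffer
print(evens_copy.base is None)       # True: the copy owns its data
```

Checking `.base` (or calling `np.shares_memory`) is a quick way to confirm which of the two you are holding before writing to it.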

7. MCQs

Question 1

`np.ix_([0,2], [1,3])` creates?

Question 2

np.select(conditions, choices) selects?

Question 3

Structured array dtype 'U20' means?

Question 4

Strides tell NumPy?

Question 5

np.memmap is for?

Question 6

float32 vs float64 memory?

Question 7

`np.shares_memory(a, b)` returns?

Question 8

C-order array stores data?

Question 9

`.flags['C_CONTIGUOUS']` True means?

Question 10

Best dtype for age data (0-120)?

8. Interview Questions

Q: What is the difference between a NumPy view and a copy?

Q: How do you reduce memory usage when working with large NumPy arrays?
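
One common tactic for the memory question is to downcast integer data to the narrowest dtype that holds its value range. The helper `downcast_int` below is a hypothetical function written for this sketch; it covers signed integer types only and is a starting point rather than a complete answer.

```python
import numpy as np

def downcast_int(arr):
    """Return arr converted to the smallest signed integer dtype that fits its values."""
    lo, hi = int(arr.min()), int(arr.max())
    for candidate in (np.int8, np.int16, np.int32, np.int64):
        info = np.iinfo(candidate)
        if info.min <= lo and hi <= info.max:
            return arr.astype(candidate)
    return arr   # Values exceed int64; leave unchanged

ages = np.array([0, 35, 120], dtype=np.int64)
ids = np.arange(1_000_000, dtype=np.int64)

small_ages = downcast_int(ages)
small_ids = downcast_int(ids)

print(small_ages.dtype, small_ages.nbytes)   # int8 3
print(small_ids.dtype, small_ids.nbytes)     # int32 4000000
```

Checking the range before casting matters because `astype` wraps out-of-range integers silently instead of raising an error.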

9. Summary
Advanced NumPy: np.ix_ for cross-indexing, structured arrays for mixed-type tabular data, strides for understanding memory layout, memmap for out-of-core computation, and dtype selection for memory savings from 50% (float64 to float32) up to 87.5% (int64 to int8). These tools scale NumPy from exploration to production.

10. Next Chapter Recommendation

In Chapter 21: Advanced Pandas Operations, we master MultiIndex, window functions, categorical data, and advanced aggregation patterns.
