CHAPTER 08 Beginner

NumPy Random Module

Updated: May 18, 2026

1. Chapter Introduction

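Random number generation is essential for simulations, statistical sampling, data augmentation, and machine learning. NumPy's random module provides fast pseudorandom number generation with all major probability distributions. Its output is fully determined by the seed and is not cryptographically secure, so use Python's secrets module for passwords, tokens, and keys.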

2. Random Number Generation

python

import numpy as np

# Seed for reproducibility (same seed → same random numbers)
rng = np.random.default_rng(seed=42)  # Modern API (recommended)

# Legacy API (seeds the hidden global state; it does not affect rng above):
np.random.seed(42)

# Uniform random floats [0.0, 1.0)
print(rng.random(5))           # 5 random floats

# Random integers
print(rng.integers(1, 100, 5)) # 5 integers from 1 to 99 (upper bound 100 is exclusive)
print(rng.integers(1, 7, (3, 4)))  # 3x4 matrix, dice values 1-6

# Standard normal distribution (mean=0, std=1)
print(rng.standard_normal(5))  # 5 normally distributed values

# Normal with custom mean and std
print(rng.normal(loc=170, scale=10, size=5))  # Heights: mean 170cm, std 10
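The claim "same seed, same random numbers" is easy to check directly. The sketch below is illustrative only; the variable names are made up for the example. It builds generators from equal and different seeds and compares their output.

```python
import numpy as np

# Two generators built from the same seed produce identical streams.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)
draws_a = rng_a.random(5)
draws_b = rng_b.random(5)
print(np.array_equal(draws_a, draws_b))  # True

# A different seed gives a different stream.
rng_c = np.random.default_rng(seed=43)
draws_c = rng_c.random(5)
print(np.array_equal(draws_a, draws_c))  # False

# Each call advances the generator's internal state, so a second call
# on the same generator returns new values rather than repeating itself.
next_a = rng_a.random(5)
print(np.array_equal(draws_a, next_a))   # False
```

Reproducibility therefore depends on both the seed and the order of calls made on the generator.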

3. Probability Distributions

python

import numpy as np

rng = np.random.default_rng(42)

# Uniform distribution
uniform = rng.uniform(low=0, high=100, size=1000)

# Normal (Gaussian) distribution — bell curve
normal = rng.normal(loc=0, scale=1, size=1000)

# Binomial (n trials, probability p)
binomial = rng.binomial(n=10, p=0.5, size=1000)  # Coin flips

# Poisson (events per interval)
poisson = rng.poisson(lam=5, size=1000)  # 5 events per hour average

# Exponential (time between events)
exponential = rng.exponential(scale=2.0, size=1000)

# Beta (0-1, useful for probabilities)
beta = rng.beta(a=2, b=5, size=1000)

# Print distribution stats
for name, dist in [('Normal', normal), ('Binomial', binomial), ('Poisson', poisson)]:
    print(f"{name}: mean={np.mean(dist):.2f}, std={np.std(dist):.2f}")
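One way to sanity-check these distributions is to compare each sample mean with its textbook value. The sketch below assumes the same parameters as the listing above, with a larger sample size (an arbitrary choice for the example) so the means settle closer to theory.

```python
import numpy as np

rng = np.random.default_rng(42)
size = 100_000  # larger than the 1000 used above, chosen for illustration

# Map a label to (samples, theoretical mean) for each distribution.
checks = {
    "Uniform(0, 100)": (rng.uniform(low=0, high=100, size=size), (0 + 100) / 2),
    "Binomial(10, 0.5)": (rng.binomial(n=10, p=0.5, size=size), 10 * 0.5),
    "Poisson(5)": (rng.poisson(lam=5, size=size), 5.0),
    "Exponential(scale=2)": (rng.exponential(scale=2.0, size=size), 2.0),
    "Beta(2, 5)": (rng.beta(a=2, b=5, size=size), 2 / (2 + 5)),
}

for name, (samples, expected_mean) in checks.items():
    print(f"{name}: sample mean={samples.mean():.3f}, "
          f"theoretical mean={expected_mean:.3f}")
```

The theoretical means used here are (low + high) / 2 for the uniform, n * p for the binomial, lam for the Poisson, scale for the exponential, and a / (a + b) for the beta distribution.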

4. Sampling and Shuffling

python

import numpy as np

rng = np.random.default_rng(42)
data = np.arange(1, 21)   # [1, 2, ..., 20]

# Random choice (with replacement)
sample_with = rng.choice(data, size=5, replace=True)
print("With replacement:", sample_with)

# Random choice (without replacement)
sample_without = rng.choice(data, size=5, replace=False)
print("Without replacement:", sample_without)

# Shuffle in-place
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
rng.shuffle(arr)
print("Shuffled:", arr)

# Permutation (returns new shuffled array)
original = np.arange(10)
shuffled = rng.permutation(original)
print("Permuted:", shuffled)

# Weighted random choice
items = np.array(['apple', 'banana', 'cherry'])
weights = np.array([0.5, 0.3, 0.2])  # Probabilities (must sum to 1)
sample = rng.choice(items, size=10, p=weights)
print("Weighted:", np.unique(sample, return_counts=True))
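The difference between shuffle and permutation matters most when the original order must survive. The sketch below contrasts the two and then uses permutation for a hypothetical 80/20 train/test split of row indices; the row count and split ratio are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

# shuffle reorders its argument in place and returns None.
arr = np.arange(10)
result = rng.shuffle(arr)
print(result)    # None

# permutation leaves its input untouched and returns a shuffled copy.
original = np.arange(10)
shuffled = rng.permutation(original)
print(original)  # [0 1 2 3 4 5 6 7 8 9]

# Hypothetical use: split 20 row indices into 80% train and 20% test.
n_rows = 20
indices = rng.permutation(n_rows)  # an int argument permutes np.arange(n_rows)
split = int(0.8 * n_rows)
train_idx, test_idx = indices[:split], indices[split:]
print(len(train_idx), len(test_idx))  # 16 4
```

Because the split slices a single permutation, no index can land in both subsets.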

5. Practical: Simulations

python

import numpy as np

rng = np.random.default_rng(42)

# Simulation 1: Monte Carlo estimate of π
n_points = 1_000_000
x = rng.uniform(-1, 1, n_points)
y = rng.uniform(-1, 1, n_points)
inside_circle = (x**2 + y**2) <= 1
pi_estimate = 4 * np.sum(inside_circle) / n_points
print(f"π estimate: {pi_estimate:.4f}")  # ~3.1416

# Simulation 2: Stock price simulation (random walk)
n_days = 252  # Trading days in a year
daily_returns = rng.normal(loc=0.0005, scale=0.02, size=n_days)
price = 100 * np.cumprod(1 + daily_returns)
print(f"Final stock price: ${price[-1]:.2f}")
print(f"Return: {(price[-1]/100 - 1)*100:.1f}%")

# Simulation 3: Bootstrap confidence interval
data = np.array([34, 45, 67, 23, 78, 56, 89, 45, 34, 67])
n_bootstrap = 10000
bootstrap_means = [rng.choice(data, len(data), replace=True).mean()
                   for _ in range(n_bootstrap)]
bootstrap_means = np.array(bootstrap_means)
ci_lower = np.percentile(bootstrap_means, 2.5)
ci_upper = np.percentile(bootstrap_means, 97.5)
print(f"95% CI for mean: [{ci_lower:.1f}, {ci_upper:.1f}]")
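The bootstrap loop above can also be written without a Python-level loop. The sketch below is one possible vectorized variant, not the only way to do it: it draws all resamples in a single call by passing a 2-D size to rng.choice, then averages each row.

```python
import numpy as np

rng = np.random.default_rng(42)
data = np.array([34, 45, 67, 23, 78, 56, 89, 45, 34, 67])
n_bootstrap = 10_000

# Draw every resample at once: one row per bootstrap replicate.
resamples = rng.choice(data, size=(n_bootstrap, len(data)), replace=True)
bootstrap_means = resamples.mean(axis=1)

# Percentile interval: the middle 95% of the bootstrap means.
ci_lower, ci_upper = np.percentile(bootstrap_means, [2.5, 97.5])
print(f"Sample mean: {data.mean():.1f}")  # 53.8
print(f"95% CI for mean: [{ci_lower:.1f}, {ci_upper:.1f}]")
```

The trade-off is memory: the resamples array holds n_bootstrap * len(data) values at once, which is fine here but may be too large for big datasets.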

6. Common Mistakes

Not seeding for reproducibility: Machine learning experiments must be reproducible. Create the generator with np.random.default_rng(seed) (or call np.random.seed() in legacy code) and record the seed you used.
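A common way to keep code reproducible is to let each function accept a seed or generator instead of relying on hidden state. The helper below, simulate_dice, is a hypothetical function written for this example. It relies on the fact that np.random.default_rng accepts None, an integer seed, or an existing Generator, which it returns unchanged.

```python
import numpy as np

def simulate_dice(n_rolls, seed=None):
    """Roll a fair six-sided die n_rolls times (hypothetical helper)."""
    # default_rng accepts None, an int seed, or an existing Generator.
    rng = np.random.default_rng(seed)
    return rng.integers(1, 7, size=n_rolls)

# The same integer seed reproduces the same rolls.
run_1 = simulate_dice(10, seed=42)
run_2 = simulate_dice(10, seed=42)
print(np.array_equal(run_1, run_2))  # True

# A shared Generator is advanced by each call, so successive calls
# continue one stream instead of restarting it.
shared = np.random.default_rng(42)
first = simulate_dice(10, seed=shared)
second = simulate_dice(10, seed=shared)
print(np.array_equal(first, run_1))  # True
```

Passing one generator through a whole experiment means a single recorded seed is enough to reproduce every random step.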

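Old np.random.rand() vs new rng.random(): The legacy functions share a hidden global state and use the older Mersenne Twister (MT19937) bit generator. np.random.default_rng() returns an independent Generator object, currently backed by PCG64, which has better statistical properties and is faster. Prefer it for new code.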
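For readers migrating older code, the sketch below places a few legacy calls next to their Generator counterparts. It is a partial mapping for illustration, not a complete migration guide.

```python
import numpy as np

# Legacy API: functions on np.random share one hidden global state.
np.random.seed(42)
legacy_floats = np.random.rand(3)         # uniform floats in [0, 1)
legacy_ints = np.random.randint(1, 7, 3)  # integers in [1, 7)
legacy_normal = np.random.randn(3)        # standard normal

# Generator API: the state lives in an explicit object.
rng = np.random.default_rng(42)
new_floats = rng.random(3)                # replaces rand
new_ints = rng.integers(1, 7, 3)          # replaces randint
new_normal = rng.standard_normal(3)       # replaces randn

# Name of the bit generator behind the Generator object.
print(type(rng.bit_generator).__name__)
```

Even with the same seed, the two APIs produce different numbers because they use different underlying algorithms, so results will change when code is migrated.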

7. MCQs

Question 1

Purpose of setting random seed?

Question 2

`rng.normal(loc=0, scale=1)` generates?

Question 3

`rng.choice(arr, replace=False)` means?

Question 4

Monte Carlo methods use?

Question 5

`rng.integers(1, 7)` simulates?

Question 6

`rng.permutation(arr)` vs `rng.shuffle(arr)`?

Question 7

Binomial distribution models?

Question 8

Poisson distribution models?

Question 9

`np.cumprod([1.1, 1.2, 0.9])` returns?

Question 10

`default_rng(seed=42)` creates?

8. Interview Questions

Q: Why is reproducibility important in data science and how do you ensure it?

Q: What is the difference between uniform and normal distributions?

9. Summary

NumPy's random module provides all major distributions for simulation, sampling, and ML. Always seed with np.random.default_rng(seed) for reproducibility. Monte Carlo methods demonstrate the power of random simulation for estimation. Bootstrap sampling resamples the data with replacement to estimate confidence intervals.

10. Next Chapter Recommendation

In Chapter 9: Introduction to Pandas, we begin Pandas, the library for loading, cleaning, and analyzing tabular data.
