CHAPTER 02 Beginner

Installing Python, NumPy, and Pandas

Updated: May 18, 2026

1. Chapter Introduction

A solid environment is the foundation of data science work. This chapter sets up Python, NumPy, Pandas, and Jupyter Notebook, a widely used tool for interactive analysis.

2. Installation Methods

bash
# Option A: pip (clean installation)
python --version             # Verify Python 3.9+ (required by NumPy 1.26 / Pandas 2.1)
python -m venv ds_env        # Create virtual environment
ds_env\Scripts\activate      # Windows activation
source ds_env/bin/activate   # macOS/Linux activation
pip install numpy pandas matplotlib seaborn jupyter openpyxl

# Launch Jupyter Notebook
jupyter notebook             # Opens browser at localhost:8888 by default

bash
# Option B: Anaconda (recommended for beginners)
# Download from: https://www.anaconda.com/download
conda create -n datascience python=3.11
conda activate datascience
conda install numpy pandas matplotlib seaborn scikit-learn jupyter
jupyter notebook

text
# Option C: Google Colab (no installation!)
# Open: https://colab.research.google.com
# NumPy + Pandas already installed; free GPU runtimes available (subject to usage limits)
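
After activating an environment from Option A or Option B, it can help to confirm from inside Python which environment is actually in use. The sketch below is one possible check, not an official tool. It assumes that a venv is detectable by comparing `sys.prefix` with `sys.base_prefix`, and that an activated conda environment sets the `CONDA_DEFAULT_ENV` variable. The helper names are hypothetical.

```python
import os
import sys


def in_venv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def active_conda_env():
    """Return the active conda environment name, or None if conda is not active."""
    # Assumption: `conda activate` exports CONDA_DEFAULT_ENV in the shell.
    return os.environ.get("CONDA_DEFAULT_ENV")


def describe_environment() -> str:
    """Summarise which kind of environment the interpreter is running in."""
    if in_venv():
        return f"venv at {sys.prefix}"
    conda_env = active_conda_env()
    if conda_env:
        return f"conda environment '{conda_env}'"
    return "base interpreter (no isolated environment detected)"


print(describe_environment())
```

If the last message appears after activation, the shell is probably still using the system-wide Python rather than the new environment.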

3. Verify Installation

python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

print(f"NumPy:  {np.__version__}")    # e.g. 1.26.0
print(f"Pandas: {pd.__version__}")    # e.g. 2.1.0

# Quick test
arr = np.array([1, 2, 3])
df = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})
print("NumPy:", arr)         # [1 2 3]
print("Pandas:\n", df)
print("✅ Ready for data science!")
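
The check above stops at the first failed import. As a complementary sketch, the standard-library `importlib.metadata` module can report every package from the pip command in one pass without importing any of them. The package list and the `installed_versions` helper are assumptions for illustration. Under conda, some meta-packages may not expose metadata and would show as missing.

```python
from importlib import metadata

# Distribution names taken from the pip install command in Option A.
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "jupyter", "openpyxl"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name:<12} {status}")
```

Any line marked NOT INSTALLED points to a package that needs another `pip install` or `conda install` in the active environment.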

4. Jupyter Essentials

python
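# Key shortcuts (B, DD, and M work in command mode; press Esc first):
# Shift+Enter  → Run cell and move to next
# B            → Insert cell below
# DD           → Delete cell
# M            → Convert cell to Markdown

import numpy as np
import pandas as pd

# Magic commands (keep comments on their own line: a line magic receives
# the rest of the line as its argument, so a trailing comment can be misparsed)
# Benchmark a statement
%timeit np.sum(np.arange(1_000_000))
# Show plots in notebook
%matplotlib inline

# Pandas display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.2f}'.format)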
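
`pd.set_option` changes a setting for the rest of the session. When a format is only needed for one output, `pd.option_context` applies it temporarily and restores the previous value afterwards. The sketch below shows the effect on a small, made-up DataFrame. The column name and values are illustrative only.

```python
import pandas as pd

# Hypothetical sample data, used only to show the formatting effect.
df = pd.DataFrame({"price": [1234.5678, 0.5]})

# Default rendering, before any display option is changed.
plain = df.to_string()

# option_context applies the setting only inside the with-block,
# then restores the previous value automatically.
with pd.option_context("display.float_format", "{:.2f}".format):
    formatted = df.to_string()

print(plain)
print(formatted)
```

The second printout rounds every float to two decimals, while the data stored in `df` is unchanged.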

5. Standard Imports (use in every project)

python
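import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

# Optional: hides ALL warnings, including deprecation notices.
# Consider a narrower filter once a project matures.
warnings.filterwarnings('ignore')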

print("Data science environment ready!")
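
A blanket `warnings.filterwarnings('ignore')` also hides deprecation notices that signal upcoming breaking changes. As a hedged alternative, the sketch below silences a single warning category and lets the rest through. The `noisy_step` function is a hypothetical stand-in for library code, and `catch_warnings(record=True)` is used only so the result can be inspected.

```python
import warnings


def noisy_step():
    """Hypothetical helper that emits two kinds of warnings."""
    warnings.warn("this API will change", FutureWarning)
    warnings.warn("result may be imprecise", UserWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    # Start from a known state: report every warning.
    warnings.simplefilter("always")
    # Then silence one category only; later filters take precedence.
    warnings.filterwarnings("ignore", category=FutureWarning)
    value = noisy_step()

# Names of the warning categories that were still reported.
remaining = [w.category.__name__ for w in caught]
print(value, remaining)
```

Filtering by category keeps notebooks readable while still surfacing warnings that may point to a real problem in the data or the code.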

6. Common Mistakes

  • Python 2 instead of 3: Pandas 2.1 requires Python 3.9 or newer. Always check with python --version first.
  • No virtual environment: Create one environment per project to avoid dependency conflicts.
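
The first mistake can also be caught from inside Python, for example at the top of a notebook or script. This is a minimal sketch. The `MIN_PYTHON` floor of 3.9 is an assumption based on the NumPy 1.26 and Pandas 2.1 versions shown in this chapter, and the helper name is hypothetical.

```python
import sys

# Assumed floor: NumPy 1.26 and Pandas 2.1 both require Python 3.9+.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} is supported")
else:
    print(f"Python {current} is too old; need {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+")
```

Comparing tuples avoids the pitfalls of comparing version strings, where "3.10" would sort before "3.9".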

7. MCQs

Question 1

Recommended Python version?

Question 2

Anaconda advantage?

Question 3

Shift+Enter in Jupyter?

Question 4

%timeit does?

Question 5

Google Colab advantage?

Question 6

Virtual environment purpose?

Question 7

Standard Pandas alias?

Question 8

Standard NumPy alias?

Question 9

pd.__version__ shows?

Question 10

When does the order of pip install commands matter?

8. Interview Questions

  • Q: What is a virtual environment and why is it important?
  • Q: What is the difference between Anaconda and pip?

9. Summary

Use Python 3.11 + pip + venv for professional setups, or Anaconda for all-in-one convenience. Google Colab works instantly with zero setup. Jupyter Notebook is the go-to environment for interactive data analysis.

10. Next Chapter Recommendation

In Chapter 3: NumPy Arrays Basics, we master ndarray creation, shapes, dtypes, and the fundamentals of numerical computing.
