Python for Data Science
CHAPTER 02 Beginner

Installing Python and Data Science Environment

Updated: May 18, 2026

1. Chapter Introduction

Before you can analyze data, you need to set up your workshop. Data scientists rarely work in plain text editors like Notepad. We use specialized environments that can handle large datasets, execute code interactively, and manage many third-party libraries. This chapter guides you through installing Python via Anaconda, launching Jupyter Notebook, setting up VS Code, installing libraries with pip and conda, and creating virtual environments.

2. Installing Anaconda

Python on its own is just a bare-bones programming language. If you install standard Python from python.org, you will have to install Pandas, NumPy, and Jupyter yourself from the command line.

Anaconda is one of the most widely used Python distributions for Data Science. It is a large installer that bundles Python *and* many of the most popular data science libraries, including Pandas, NumPy, and Jupyter, so they are all installed in one step.

Step-by-step Installation:

  1. Go to anaconda.com/download.
  2. Download the installer for Windows, Mac, or Linux.
  3. Run the installer. Leave all settings on their defaults.
  4. Once installed, search your computer for Anaconda Navigator and open it.
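
Once the installer finishes, a quick sanity check can confirm that Python and the bundled libraries are reachable. The sketch below is a minimal example and not part of the Anaconda tooling: the function name `check_install` and the list of libraries it probes are illustrative choices. You can run it in any Python prompt.

```python
import importlib.util
import sys


def check_install(libraries=("pandas", "numpy", "notebook")):
    """Report the running interpreter and whether each library can be found.

    The default library names are an illustrative choice, not an official list.
    """
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "executable": sys.executable,  # full path of the interpreter in use
    }
    for name in libraries:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, value in check_install().items():
    print(f"{key}: {value}")
```

If the `executable` line points into your Anaconda folder and the libraries show `True`, the installation is most likely in good shape.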

3. Launching Jupyter Notebook

Jupyter Notebook is the primary tool you will use to write Python code in this course. It runs in your web browser but processes data locally on your computer.

  1. Open Anaconda Navigator.
  2. Find the tile labeled Jupyter Notebook.
  3. Click Launch.
  4. A black terminal window will open (do not close this!). Then, a new tab will open in your web browser (usually at localhost:8888). This is your Jupyter file browser.
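
If the browser tab does not appear, or you are unsure whether the server is still alive, one rough way to check is to see whether anything is listening on the usual port. The sketch below assumes the default address localhost:8888 mentioned above; the function name `jupyter_is_running` is hypothetical, and the check cannot tell Jupyter apart from any other program using that port.

```python
import socket


def jupyter_is_running(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    This only shows that a server is listening; it cannot confirm
    that the server is actually Jupyter.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host name could not be resolved
        return False


print("Server reachable:", jupyter_is_running())
```

Run this from a separate Python prompt, not from the notebook itself, since a notebook that answers is already proof that the server is up.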

4. VS Code: The Professional IDE

While Jupyter Notebook in the browser is great for beginners, most professional Data Scientists and Machine Learning Engineers eventually migrate to Visual Studio Code (VS Code). It is a powerful, free code editor built by Microsoft.

How to integrate VS Code with Data Science:

  1. Download VS Code from code.visualstudio.com.
  2. Open VS Code, go to the Extensions tab (square boxes icon on the left).
  3. Search for and install the Python extension.
  4. Search for and install the Jupyter extension.
  5. You can now create files ending in .ipynb and run Jupyter notebooks directly inside VS Code!
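
It can help to know that a .ipynb file is plain JSON text under the hood, which is why both the browser and VS Code can open the same notebook. The sketch below builds a minimal one-cell notebook by hand using only the standard library. The file name `hello.ipynb` and the cell contents are made up for illustration, and in practice you would let Jupyter or VS Code create the file for you.

```python
import json
import tempfile
from pathlib import Path

# A minimal notebook in the nbformat 4 layout: one code cell, no outputs yet.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello, data science!')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Write to a temporary folder so the example does not touch your own files.
notebook_path = Path(tempfile.mkdtemp()) / "hello.ipynb"
notebook_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")

# Reading it back shows that a notebook is just structured text.
loaded = json.loads(notebook_path.read_text(encoding="utf-8"))
print(notebook_path)
print(loaded["cells"][0]["source"])
```

A .py script, by contrast, contains only code; the notebook format also stores cell boundaries and, once cells have been run, their outputs.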

5. Managing Libraries with pip and conda

If you need a library that Anaconda didn't install, you must download it from the internet using a package manager. You have two options: pip (Python's default) and conda (Anaconda's default).

To install a package, open your Anaconda Prompt (Windows) or Terminal (Mac or Linux) and type:

```bash
# Using pip
pip install seaborn

# Using conda
conda install seaborn
```

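*Note: In Jupyter Notebook, you can run these commands directly in a code cell by adding an exclamation mark: !pip install seaborn. The %pip install seaborn magic is generally the safer form, because it installs into the environment of the kernel that is running your notebook.*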
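
After an install, you may want to confirm that the package actually landed in the Python you are running. The sketch below uses the standard library's `importlib.metadata` for that; the helper name `installed_version` is an illustrative choice, and the package names in the loop are just examples.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


for name in ("seaborn", "pandas"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If a package you just installed shows up as "not installed", the install most likely went into a different Python than the one running this code.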

6. Virtual Environments

Imagine Project A needs Pandas version 1.0, but Project B requires Pandas version 2.0. A single Python installation can hold only one version of a library at a time, so with one installation you can't satisfy both projects.

Virtual Environments solve this. They create an isolated, "fenced-off" copy of Python, with its own set of libraries, for each project.

Creating an environment with Conda:

```bash
# Create a new environment named 'my_ds_project' with Python 3.10
conda create --name my_ds_project python=3.10

# Activate (enter) the environment
conda activate my_ds_project

# Now any pip installs only affect THIS environment!
pip install pandas
```
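
To see which environment your code is really running in, you can ask Python itself. The sketch below is a minimal example: the function name `describe_environment` is hypothetical. It reads the `CONDA_DEFAULT_ENV` variable, which conda sets when an environment is activated, and `sys.prefix`, which is the folder holding the current interpreter and its libraries.

```python
import os
import sys


def describe_environment():
    """Collect a few facts about the Python environment running this code."""
    return {
        # Set by `conda activate`; None when no conda environment is active
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Folder that holds this environment's interpreter and libraries
        "prefix": sys.prefix,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

After `conda activate my_ds_project`, you would expect `conda_env` to show that name and `prefix` to point into that environment's folder.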

7. Common Mistakes

  • Installing Multiple Pythons: Beginners often install Python from python.org, then install Anaconda, then install Python from the Microsoft Store. Each copy keeps its own libraries, so a package installed into one may be invisible to the one your terminal or editor actually runs. Choose ONE ecosystem (this course uses Anaconda) and stick to it.
  • Closing the Jupyter Terminal: When you launch Jupyter, a scary-looking black terminal window stays open in the background. If you close it, your notebook will disconnect and stop working. Minimize it, don't close it!
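
If you suspect you have fallen into the first mistake above, a rough way to check is to list every Python executable your system can find through the PATH variable. The sketch below is an assumption-laden helper, not an official tool: the function name `find_pythons_on_path` is made up, and it only looks for the common executable names.

```python
import os


def find_pythons_on_path():
    """List distinct Python executables reachable through the PATH variable."""
    # Common executable names; Windows adds the .exe suffix
    names = ("python.exe", "python3.exe") if os.name == "nt" else ("python", "python3")
    found = []
    for folder in os.environ.get("PATH", "").split(os.pathsep):
        if not folder:
            continue
        for name in names:
            candidate = os.path.join(folder, name)
            # isfile returns False for missing or unreadable paths
            if os.path.isfile(candidate) and candidate not in found:
                found.append(candidate)
    return found


for path in find_pythons_on_path():
    print(path)
```

Several entries can be links to the same interpreter, so more than one line is not automatically a problem. Entries from clearly different sources (for example, one inside an Anaconda folder and one elsewhere) are the ones worth a closer look.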

8. MCQs

Question 1

What is Anaconda?

Question 2

Where does Jupyter Notebook natively open its interface?

Question 3

What must you install in VS Code to run data science notebooks?

Question 4

What command do you use to install a new Python library using the default Python package manager?

Question 5

Why do data scientists use Virtual Environments?

Question 6

If you want to run a terminal command (like pip install) directly inside a Jupyter Notebook cell, you prefix it with?

Question 7

What happens if you close the background terminal window that launched Jupyter Notebook?

Question 8

What file extension is used for Jupyter Notebooks?

Question 9

Which IDE is currently the most popular choice for professional Python developers?

Question 10

How do you activate a Conda virtual environment named 'project1'?

9. Interview Questions

  • Q: Explain the concept of Virtual Environments. Why are they a critical best practice in professional data science teams?
  • Q: What is the difference between writing code in a standard .py script versus a .ipynb Jupyter Notebook?

10. Summary

Setting up your environment correctly is critical. Use the Anaconda Distribution to get Python, Jupyter, and essential libraries installed simultaneously. For your code editor, start with Jupyter Notebook in the browser, and eventually transition to VS Code. When working on real projects, always use Virtual Environments (conda create) to keep your library dependencies organized and prevent version conflicts.

11. Next Chapter Recommendation

In Chapter 3: Python Basics for Data Science, we will write our first lines of Python code, learning about syntax, comments, variables, and how to print output to the screen.
