CHAPTER 09 Beginner

File Handling and Notebook Management

Updated: May 18, 2026
5 min read

1. Chapter Introduction

Much of data science comes down to processing files. You read data in from a file, analyze it, and write the results to a new file. This chapter covers the basics of Python file I/O (Input/Output), how file paths work within the Jupyter environment, and how Jupyter manages your notebook files using Checkpoints.

2. Understanding File Paths in Jupyter

When you launch Jupyter, it opens a specific directory (folder) on your computer. This is your Current Working Directory (CWD).

Cell 1:

python
import os

# Check where Jupyter is currently looking
print("Current Directory:", os.getcwd())

# List all files in this directory
print("Files:", os.listdir())

When you try to open a file, you must tell Jupyter where it is.

  • Relative Path: Looks for the file *relative* to the current working directory, which is normally the folder where the notebook is saved. (e.g., data.csv assumes the file is in the same folder as the notebook).
  • Absolute Path: The full path starting from the root of the file system or drive. (e.g., C:/Users/Name/Documents/data.csv).

*Best Practice:* Keep your data files in a folder named data right next to your notebook, and use relative paths like data/my_file.csv. That way your code keeps working when you send the whole project folder to a colleague, because nothing depends on the folder layout of your own computer.
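
The layout described above can be sketched with the standard-library pathlib module. This is only an illustration: the folder name data and the file name my_file.csv come from the example above, and the helper name data_path is made up for this sketch.

```python
from pathlib import Path

def data_path(filename, base="data"):
    """Build a relative path such as data/my_file.csv (hypothetical helper)."""
    return Path(base) / filename

relative = data_path("my_file.csv")
print("Relative:", relative)            # the path as you would write it in the notebook
print("Absolute:", relative.resolve())  # the same path expanded from the current working directory
print("Exists:", relative.exists())     # True only if the file is really there
```

Printing the resolved path is a quick way to see which file a relative path actually points to on your machine.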

3. Reading from Text Files

Python has a built-in open() function for reading and writing files. We use the with statement because it automatically closes the file when the block ends, even if an error occurs, which releases the file handle back to the operating system.

Cell 2: *(Assuming you have a file named sample.txt in the same folder)*

python
# 'r' stands for Read mode
with open('sample.txt', 'r') as file:
    # Read the entire file into a single string
    content = file.read()
    print(content)
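
To see the automatic closing in action, here is a small sketch that checks the file's closed attribute inside and after the with block. It creates its own throwaway file in a temporary folder (an assumption made so the example runs even if sample.txt does not exist) and reads it line by line instead of all at once.

```python
import tempfile
from pathlib import Path

# Create a throwaway file so the example does not depend on sample.txt existing
tmp_dir = Path(tempfile.mkdtemp())
sample = tmp_dir / "sample.txt"
sample.write_text("first line\nsecond line\n", encoding="utf-8")

with open(sample, "r", encoding="utf-8") as file:
    # Iterating over the file yields one line at a time
    lines = [line.rstrip("\n") for line in file]
    print("Closed inside the block?", file.closed)

print("Closed after the block?", file.closed)
print(lines)
```

Reading line by line is usually the better choice for large files, since read() loads the whole file into memory as a single string.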

4. Writing to Text Files

To save your analytical results, you write to a file.

Cell 3:

python
report = "Analysis Complete: 500 records processed."

# 'w' stands for Write mode. 
# WARNING: 'w' will overwrite the file if it already exists!
with open('report.txt', 'w') as file:
    file.write(report)
    
print("File saved successfully.")

# 'a' stands for Append mode. 
# It adds text to the end of an existing file.
with open('report.txt', 'a') as file:
    file.write("\nAppended note: No errors found.")
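
The difference between 'w' and 'a' is easiest to see by writing to the same file several times and reading it back. The sketch below does this in a temporary folder (an assumption, so it does not touch your real report.txt).

```python
import tempfile
from pathlib import Path

report_path = Path(tempfile.mkdtemp()) / "report.txt"

with open(report_path, "w") as file:
    file.write("first version\n")

with open(report_path, "w") as file:   # 'w' again: the first version is discarded
    file.write("second version\n")

with open(report_path, "a") as file:   # 'a': text is added after the existing content
    file.write("appended note\n")

with open(report_path, "r") as file:
    final_text = file.read()

print(final_text)
```

Only the second version and the appended note survive, which is why 'a' is the safer mode for log-style files.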

5. Jupyter Notebook Files (.ipynb)

When you save a notebook, it is saved as a .ipynb file. If you open this file in a normal text editor (like Notepad), you will see a large, hard-to-read JSON document. Avoid editing .ipynb files by hand in a standard text editor: a single misplaced bracket or comma makes the JSON invalid, and Jupyter will then fail to open the notebook. Open and edit them in a notebook-aware tool such as the Jupyter interface or VS Code.
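
Because a notebook is JSON, you can inspect one safely in read-only fashion with Python's json module. The sketch below is an illustration under assumptions: the notebook it reads is a hand-written, minimal structure created in a temporary folder, not a file produced by Jupyter, and it is not guaranteed to pass full notebook validation.

```python
import json
import tempfile
from pathlib import Path

# A hand-written, minimal notebook structure used only for this illustration
minimal_notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('hello')"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

notebook_path = Path(tempfile.mkdtemp()) / "example.ipynb"
notebook_path.write_text(json.dumps(minimal_notebook), encoding="utf-8")

# Read the notebook back as ordinary JSON, without modifying it
with open(notebook_path, "r", encoding="utf-8") as file:
    notebook = json.load(file)

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print("Format version:", notebook["nbformat"])
print("Cell types:", cell_types)
```

Each cell is a dictionary with a cell_type and its source text, which is why a stray edit to the raw file can break the whole structure.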

6. The Power of Checkpoints

By default, Jupyter auto-saves your notebook every 120 seconds. Separately, it has a feature called Checkpoints.

When you click the "Save" icon (or press Ctrl+S), Jupyter explicitly creates a Checkpoint. A checkpoint is a hidden backup of your file at that exact moment.

How to use Checkpoints: If you write some code that breaks your notebook, and you want to go back in time:

  1. Go to the top menu.
  2. Click File -> Revert to Checkpoint.
  3. Select the timestamp you want to revert to.

*Note:* Jupyter only keeps the *single most recent* checkpoint by default.
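
In a default installation, checkpoints are stored as files named like name-checkpoint.ipynb inside a hidden .ipynb_checkpoints folder next to the notebook. The sketch below lists them; the helper name find_checkpoints is made up, the demo folder only imitates that default layout, and the storage location can differ if your Jupyter setup has been configured otherwise.

```python
import tempfile
from pathlib import Path

def find_checkpoints(folder="."):
    """Return checkpoint files found under folder/.ipynb_checkpoints (hypothetical helper)."""
    checkpoint_dir = Path(folder) / ".ipynb_checkpoints"
    if not checkpoint_dir.is_dir():
        return []
    return sorted(checkpoint_dir.glob("*-checkpoint.ipynb"))

# Demonstration on a throwaway folder that imitates the default layout
demo_folder = Path(tempfile.mkdtemp())
(demo_folder / ".ipynb_checkpoints").mkdir()
(demo_folder / ".ipynb_checkpoints" / "analysis-checkpoint.ipynb").write_text("{}")

found = find_checkpoints(demo_folder)
print([path.name for path in found])
```

Calling find_checkpoints() with no argument looks in the current working directory, which is usually the folder your notebook lives in.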

7. Mini Project: File Analyzer

Let's build a quick notebook tool that reads a text file and counts the words.

Cell 4:

python
# First, let's create a dummy file to analyze
with open('dummy_data.txt', 'w') as f:
    f.write("Data science is the sexiest job of the 21st century. Data is everywhere.")

# Now, let's analyze it
def analyze_file(filepath):
    try:
        with open(filepath, 'r') as file:
            text = file.read()
            words = text.split()
            word_count = len(words)
            return f"The file has {word_count} words."
    except FileNotFoundError:
        return "Error: File does not exist!"

print(analyze_file('dummy_data.txt'))
print(analyze_file('missing_file.txt'))
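
As one possible extension of the analyzer, the sketch below reports the most frequent words using collections.Counter. The function name top_words and the choice to lowercase words and strip basic punctuation are assumptions made for this illustration; the demo file is written to a temporary folder.

```python
import tempfile
from collections import Counter
from pathlib import Path

def top_words(filepath, n=3):
    """Return the n most common words, ignoring case and surrounding punctuation."""
    try:
        with open(filepath, "r", encoding="utf-8") as file:
            words = [word.strip(".,!?;:").lower() for word in file.read().split()]
    except FileNotFoundError:
        return []
    # Skip tokens that were nothing but punctuation
    return Counter(word for word in words if word).most_common(n)

demo_file = Path(tempfile.mkdtemp()) / "dummy_data.txt"
demo_file.write_text(
    "Data science is the sexiest job of the 21st century. Data is everywhere.",
    encoding="utf-8",
)
print(top_words(demo_file))
```

Returning an empty list for a missing file keeps the function's return type consistent, so the caller can loop over the result without checking for an error string first.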

8. Common Mistakes

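  • FileNotFoundError: You type open('data.csv'), but you get an error. The most common cause is that the file is not in the notebook's working directory: for example, the CSV is in your "Downloads" folder while the notebook is saved in "Documents". Move the CSV next to the notebook (or into its data folder) and use the matching relative path.
  • Forgetting to close files: If you skip the with open() syntax and write f = open('file.txt') instead, the file stays open until you call f.close(). Until then, written data may not be flushed to disk, and on some systems (notably Windows) other programs may be unable to modify or delete the file.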
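
Both mistakes can be checked in code. The sketch below first reports where Python actually looked for a file, then shows how to close a file explicitly when a with block is not an option. The helper name explain_missing is made up for this illustration, and the second part writes to a temporary folder.

```python
import os
import tempfile
from pathlib import Path

def explain_missing(filename):
    """Describe where Python looked for filename (hypothetical helper)."""
    path = Path(filename)
    if path.exists():
        return f"Found: {path.resolve()}"
    return f"Not found: {path.resolve()} (current directory: {os.getcwd()})"

print(explain_missing("data.csv"))

# If you cannot use a with block, close the file yourself in a finally clause
log_path = Path(tempfile.mkdtemp()) / "file.txt"
f = open(log_path, "w")
try:
    f.write("closed explicitly")
finally:
    f.close()  # runs even if write() raises an error

print("Closed?", f.closed)
```

Comparing the resolved path in the message with the file's real location usually makes the cause of a FileNotFoundError obvious.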

9. MCQs

Question 1

What is the Current Working Directory (CWD) in Jupyter?

Question 2

Which path format is best for sharing projects with colleagues?

Question 3

What does the with keyword do when opening files in Python?

Question 4

If you open a file using mode 'w' (Write), what happens if the file already exists?

Question 5

Which mode should you use to add a new line of text to the end of an existing file?

Question 6

What format are Jupyter Notebook files saved in under the hood?

Question 7

Can you safely edit a .ipynb file using Windows Notepad?

a) Yes
b) No, modifying the raw JSON structure will likely corrupt the notebook

Answer: b

Question 8

What happens when you manually click the "Save" icon in Jupyter?

Question 9

How do you recover a notebook to its last saved state if you make a terrible mistake?

Question 10

What Python module allows you to check your current working directory?

10. Interview Questions

  • Q: Explain the difference between an Absolute Path and a Relative Path. Why are relative paths preferred in data science projects?
  • Q: Why is it dangerous to open a file in 'w' mode? What should you use instead if you want to keep historical log data?

11. Summary

File management in Jupyter revolves around understanding your Working Directory. Keep your data files next to your .ipynb notebook (ideally in a data subfolder), and use Relative Paths to access them. Use Python's with open(...) syntax to safely read and write data. Finally, use Jupyter's manual "Save" button to create Checkpoints, so you can roll back your notebook if you make a mistake.

12. Next Chapter Recommendation

In Chapter 10: Data Analysis with Pandas in Jupyter, we leave basic text files behind and introduce Pandas, the industry-standard library that transforms Jupyter into a powerful spreadsheet and database engine.
