File Handling in Python
# CHAPTER 9
1. Chapter Introduction
As a Data Scientist, your data rarely lives directly inside your Python script. It lives in files on your hard drive or in the cloud. You must know how to ingest (read) this data and export (write) your results. This chapter covers standard Python file I/O (Input/Output), focusing on plain text, CSVs (Comma Separated Values), and JSON files.
2. Reading Text Files
Python has a built-in open() function that returns a file object. We use the with statement because it automatically closes the file when the block finishes, even if an error occurs inside it. This prevents leaked file handles and files that stay locked by your program.
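The pattern described above can be sketched as follows. This is a minimal illustration, not the only way to read a file: the file name notes.txt and its contents are made up, and the tempfile module is used only so the example can run without touching your own folders.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary folder so the
# example is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

# Option 1: read the whole file into a single string.
with open(sample_path, "r", encoding="utf-8") as f:
    content = f.read()

# Option 2: loop over the file line by line.
# .strip() removes the trailing newline character from each line.
lines = []
with open(sample_path, "r", encoding="utf-8") as f:
    for line in f:
        lines.append(line.strip())

print(lines)
print(f.closed)  # True: the with block closed the file for us
```

Reading line by line is usually the safer choice for large files, because only one line is held in memory at a time.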
3. Writing Text Files
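To save your results, open a file in write ('w') or append ('a') mode. Both modes create the file if it does not exist; 'w' erases any existing content first, while 'a' adds new text to the end.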
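As a rough sketch of the two modes, the example below writes one line with 'w' and then adds a second with 'a'. The file name results.txt and the text written are placeholders chosen for illustration, and the temporary folder is only there to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical output file in a temporary folder.
results_path = os.path.join(tempfile.mkdtemp(), "results.txt")

# 'w' creates the file, or erases it first if it already exists.
with open(results_path, "w", encoding="utf-8") as f:
    f.write("run 1 complete\n")

# 'a' keeps the existing content and adds the new text at the end.
with open(results_path, "a", encoding="utf-8") as f:
    f.write("run 2 complete\n")

# Read the file back to confirm both lines are there.
with open(results_path, "r", encoding="utf-8") as f:
    saved = f.read()

print(saved)
```

Note that write() does not add a newline for you, so each line ends with an explicit \n.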
4. Working with CSV Files
While you will eventually use Pandas for CSVs, it is important to know how the built-in csv module works for basic scripting.
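A minimal sketch of the csv module in use is shown below. The file name sales.csv matches the example path used later in this chapter, but the column names and rows are invented sample data. csv.writer and csv.DictReader are only two of the tools the module offers.

```python
import csv
import os
import tempfile

# Hypothetical CSV file in a temporary folder.
csv_path = os.path.join(tempfile.mkdtemp(), "sales.csv")

# Made-up sample data: a header row followed by two data rows.
rows = [["product", "units"], ["apples", "3"], ["pears", "5"]]

# newline='' lets the csv module control line endings itself.
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

# DictReader uses the header row as keys for every following row.
with open(csv_path, "r", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    records = list(reader)

# Every value read from a CSV is a string, so convert before doing math.
total_units = sum(int(record["units"]) for record in records)
print(total_units)  # 8
```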
5. Working with JSON Files
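JSON (JavaScript Object Notation) is the standard format for web data. It looks very similar to nested Python dictionaries and lists, with small differences: JSON uses true, false, and null where Python uses True, False, and None, and its strings must use double quotes. The json module converts between the two: json.load() and json.loads() turn JSON from a file or a string into Python objects, and json.dump() and json.dumps() do the reverse.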
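The round trip between Python objects and JSON might look like the sketch below. The profile dictionary and the file name profile.json are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical JSON file in a temporary folder.
json_path = os.path.join(tempfile.mkdtemp(), "profile.json")

# Made-up sample data mixing a dictionary, a list, a boolean, and None.
profile = {"name": "Ada", "skills": ["python", "sql"], "active": True, "manager": None}

# json.dump() writes a Python object to an open file as JSON text.
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(profile, f, indent=2)

# json.load() reads JSON from an open file back into Python objects.
with open(json_path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

# json.dumps() and json.loads() do the same with strings instead of files.
text = json.dumps(profile)
parsed = json.loads('{"active": true, "manager": null}')

print(loaded == profile)  # True
print(parsed)             # {'active': True, 'manager': None}
```

The functions ending in "s" (dumps, loads) work with strings; the ones without it work with file objects.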
6. File Paths
You must tell Python where the file is.
- Relative Path: Resolved relative to the current working directory, which is usually the folder you run the script from (e.g., data/sales.csv). *This is preferred!*
- Absolute Path: The full path, starting from the root of the drive (e.g., C:/Users/Name/Documents/data/sales.csv).
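The difference between the two kinds of path can be checked with the os module, as in this small sketch. It only builds and inspects path strings; the file data/sales.csv does not need to exist for the code to run.

```python
import os

# Build a relative path in a way that works on any operating system.
relative_path = os.path.join("data", "sales.csv")

# Relative paths are resolved against the current working directory.
print(os.getcwd())

# os.path.abspath() shows the absolute path Python would actually use.
absolute_path = os.path.abspath(relative_path)
print(absolute_path)

print(os.path.isabs(relative_path))  # False
print(os.path.isabs(absolute_path))  # True
```

Using os.path.join() instead of typing slashes by hand is a design choice that keeps the same script working on Windows, macOS, and Linux.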
7. Mini Project: Notes Manager System
Let's build a simple script that logs daily notes and timestamps them.
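One possible version of such a script is sketched below. The function names add_note and read_notes, the timestamp format, and the sample notes are all assumptions made for this illustration. The notes file is placed in a temporary folder here; in a real script it could simply be a relative path such as notes.txt.

```python
import os
import tempfile
from datetime import datetime

# Hypothetical location of the notes file (temporary folder for the demo).
NOTES_FILE = os.path.join(tempfile.mkdtemp(), "notes.txt")


def add_note(text, path=NOTES_FILE):
    """Append one note to the file, prefixed with the current timestamp."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    # 'a' mode keeps earlier notes and adds the new one at the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"[{timestamp}] {text}\n")


def read_notes(path=NOTES_FILE):
    """Return all saved notes as a list of strings (empty if no file yet)."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f]


add_note("Cleaned the sales data")
add_note("Exported results to CSV")

for note in read_notes():
    print(note)
```

Append mode matters here: opening the file with 'w' instead would erase every earlier note each time a new one is added.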
8. Common Mistakes
- FileNotFoundError: You type open('data.csv'), but Python is running from a different folder than the one that contains the file. Always double-check your current working directory with os.getcwd().
- Overwriting Data: Opening an existing file in 'w' mode instantly wipes it clean. Use 'a' if you want to keep the historical data and add to it.
- Forgetting newline='' in CSVs: On Windows, writing CSVs without newline='' often results in blank rows between every line of data.
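To guard against the first mistake above, a script can check for the file before opening it, or catch the error if it occurs. The sketch below combines both; the helper name read_text_or_default and the missing file name are hypothetical, chosen only for this example.

```python
import os


def read_text_or_default(path, default=""):
    """Return the file's text, or a default value if the file is missing."""
    # Check first: os.path.exists() returns False for a missing path.
    if not os.path.exists(path):
        return default
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        # The file could be deleted between the check and the open call.
        return default


# A file name that is assumed not to exist in the working directory.
result = read_text_or_default("definitely_missing_file_12345.txt", default="no data")
print(result)
```

The try/except block alone would also work; the explicit check is included because it makes the intent easy to read.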
9. MCQs
What does the with keyword do when opening files?
Which mode should you use in open() to completely overwrite an existing file?
Which mode should you use to add text to the bottom of an existing file?
What built-in Python module is used to handle Comma Separated Values?
JSON data visually looks exactly like which Python data structure?
What does json.load() do?
What is a "Relative Path"?
How can you check if a file exists before opening it to avoid a crash?
When reading a file line-by-line using a for loop, what method removes the invisible newline character (\n)?
What happens if you forget to close a file (i.e., not using with)?
10. Interview Questions
- Q: Explain the difference between opening a file in 'w' mode versus 'a' mode.
- Q: When sharing code with a team, why is it better to use relative paths (data/file.csv) instead of absolute paths (C:/Users/Dave/data/file.csv)?
11. Summary
File I/O is a foundational skill. Use with open('file', 'mode') to ensure files are safely closed. Use 'r' to read, 'w' to overwrite, and 'a' to append. For structured data, use the built-in csv and json modules. Always design your scripts using relative paths so they work on any computer, and use os.path.exists() to write defensive code that doesn't crash if a file is missing.