CHAPTER 11
Beginner
Reading and Writing Data Files
Updated: May 18, 2026
5 min read
1. Chapter Introduction
Real data lives in files: CSVs, Excel sheets, JSON from APIs, databases. Pandas' IO functions are the gateway from raw files to analysis-ready DataFrames. They support 15+ formats, with options for column selection, type control, date parsing, and missing-value handling.
2. Reading CSV Files
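The following is a minimal sketch of read_csv() using the parameters this chapter highlights (index_col, usecols, nrows, parse_dates, dtype, na_values). The sample data, column names, and the use of io.StringIO in place of a real file path are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = """ID,Date,Region,Revenue,Notes
1,2024-01-05,North,1200.50,ok
2,2024-01-06,South,N/A,-
3,2024-01-07,East,980,ok
4,2024-01-08,West,1500,ok
"""

df = pd.read_csv(
    io.StringIO(csv_text),           # a path such as 'sales.csv' works the same way
    index_col="ID",                  # use the ID column as the row index
    usecols=["ID", "Date", "Region", "Revenue"],  # skip columns we do not need
    nrows=3,                         # read only the first 3 data rows
    parse_dates=["Date"],            # convert Date strings to datetime
    dtype={"Revenue": float},        # force Revenue to float
    na_values=["N/A", "-"],          # treat these strings as missing values
)

print(df)
print(df.dtypes)
```

Combining usecols and nrows is a common way to preview a large file cheaply before committing to a full load.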
3. Reading Excel Files
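As a rough illustration, the sketch below first writes a small two-sheet workbook with ExcelWriter so there is something to read, then loads one sheet by name and all sheets at once. The sheet names, file name, and values are invented, and it assumes an Excel engine such as openpyxl is installed.

```python
import os
import tempfile
import pandas as pd

# Hypothetical quarterly data used only to build a workbook for this example.
q1_data = pd.DataFrame({"Region": ["North", "South"], "Revenue": [1200.5, 980.0]})
q2_data = pd.DataFrame({"Region": ["North", "South"], "Revenue": [1350.0, 1010.25]})

path = os.path.join(tempfile.mkdtemp(), "report.xlsx")

# ExcelWriter as a context manager saves and closes the file on exit.
with pd.ExcelWriter(path) as writer:
    q1_data.to_excel(writer, sheet_name="Q1", index=False)
    q2_data.to_excel(writer, sheet_name="Q2", index=False)

# Read a single sheet by name.
q1 = pd.read_excel(path, sheet_name="Q1")

# sheet_name=None returns a dict mapping each sheet name to a DataFrame.
all_sheets = pd.read_excel(path, sheet_name=None)

print(q1)
print(list(all_sheets))
```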
4. Reading JSON
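A short sketch of the two usual JSON paths: read_json() for flat records and pd.json_normalize() for nested ones. The records and field names here are made up for the example, and the JSON text is wrapped in io.StringIO instead of being read from a file or API.

```python
import io
import pandas as pd

# Flat JSON: a list of simple records maps directly to rows and columns.
flat_json = '[{"id": 1, "city": "Pune"}, {"id": 2, "city": "Delhi"}]'
flat = pd.read_json(io.StringIO(flat_json))

# Nested JSON (hypothetical API-style records): json_normalize flattens
# inner dictionaries into dotted column names such as 'customer.city'.
records = [
    {"id": 1, "customer": {"name": "Asha", "city": "Pune"}, "total": 250.0},
    {"id": 2, "customer": {"name": "Ravi", "city": "Delhi"}, "total": 410.5},
]
nested = pd.json_normalize(records)

print(flat)
print(nested.columns.tolist())
```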
5. Writing Data Files
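The sketch below shows one way to write a DataFrame to CSV and JSON, and contrasts to_csv() with and without index=False by reading the result back. The DataFrame contents and the output file name are assumptions for illustration.

```python
import io
import os
import tempfile
import pandas as pd

# Hypothetical data for the example.
df = pd.DataFrame({"Product": ["Pen", "Book"], "Revenue": [120.0, 450.5]})

# Writing to a file: index=False keeps the row numbers out of the CSV.
out_path = os.path.join(tempfile.mkdtemp(), "products.csv")
df.to_csv(out_path, index=False)

# With no path, to_csv() returns the CSV text, which makes the
# difference between the two settings easy to inspect.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# Re-reading shows the extra column created when the index was written.
reread_with_index = pd.read_csv(io.StringIO(with_index))
reread_clean = pd.read_csv(io.StringIO(without_index))
print(reread_with_index.columns.tolist())
print(reread_clean.columns.tolist())

# JSON output: orient='records' produces a list of row objects.
json_text = df.to_json(orient="records")
print(json_text)
```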
6. Mini Project: CSV Sales Report Analyzer
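One possible shape for such an analyzer is sketched below: load a sales CSV, clean missing values, compute revenue, summarize by region and by month, and export the summary. The column names, sample rows, and the choice of summaries are assumptions, not the chapter's original project code.

```python
import io
import pandas as pd

# Hypothetical sales data; in practice this would be a path like 'sales.csv'.
raw = """Date,Region,Product,Units,UnitPrice
2024-01-05,North,Pen,10,1.50
2024-01-17,South,Book,3,12.00
2024-02-02,North,Book,5,12.00
2024-02-20,South,Pen,N/A,1.50
"""

# Load with date parsing and explicit missing-value markers.
sales = pd.read_csv(io.StringIO(raw), parse_dates=["Date"], na_values=["N/A", "-"])

# Drop rows where the unit count is missing, then compute revenue per row.
sales = sales.dropna(subset=["Units"])
sales["Revenue"] = sales["Units"] * sales["UnitPrice"]

# Total revenue per region.
by_region = sales.groupby("Region", as_index=False)["Revenue"].sum()

# Total revenue per calendar month (for example '2024-01').
by_month = (
    sales.assign(Month=sales["Date"].dt.to_period("M").astype(str))
    .groupby("Month", as_index=False)["Revenue"]
    .sum()
)

# Export the regional summary without the row index.
report_csv = by_region.to_csv(index=False)

print(by_region)
print(by_month)
print(report_csv)
```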
7. Common Mistakes
- Forgetting index=False in to_csv(): Without it, Pandas writes the row index as an extra column, which comes back as an unwanted "Unnamed: 0" column on re-read.
- Date columns not auto-parsed: Use parse_dates=['Date'] in read_csv() to get a proper datetime dtype instead of strings.
8. MCQs
Question 1
What does read_csv('file.csv', index_col='ID') do?
Question 2
What does to_csv(index=False) prevent?
Question 3
What does usecols=['A', 'B'] do in read_csv()?
Question 4
What does nrows=500 do in read_csv()?
Question 5
What does parse_dates=['Date'] convert?
Question 6
What kind of data does pd.json_normalize() handle?
Question 7
What does pd.read_excel() return when sheet_name=None?
Question 8
What is the ExcelWriter context manager for?
Question 9
What does na_values=['N/A', '-'] do in read_csv()?
Question 10
What does dtype={'Revenue': float} do in read_csv()?
9. Interview Questions
- Q: How do you load only specific columns from a large CSV in Pandas?
- Q: How do you write a DataFrame to multiple sheets in a single Excel file?
10. Summary
Pandas IO handles the most common data formats. Key params: index_col, usecols, nrows, parse_dates, dtype, na_values. Always use index=False when writing CSVs unless the index carries real data. Use ExcelWriter for multi-sheet Excel reports.
11. Next Chapter Recommendation
In Chapter 12: Data Selection and Filtering, we master loc[], iloc[], boolean filtering, and query syntax to extract exactly the data we need.