CHAPTER 26 Beginner

Data Visualization with Pandas

Updated: May 18, 2026
5 min read

1. Chapter Introduction

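Pandas has a built-in .plot() method that creates Matplotlib charts directly from DataFrames and Series, which makes it well suited to rapid exploratory data analysis (EDA) without writing chart code from scratch. This chapter covers Pandas plotting across the data analysis workflow: line, bar, and area charts first, then grouped and resampled views for EDA.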
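As a minimal sketch of what "directly from DataFrames" means, the snippet below plots a small made-up DataFrame and shows that .plot() hands back an ordinary Matplotlib Axes object that can be customized further. The data, the variable names (toy, n_lines), and the choice of the Agg backend are illustrative assumptions, not part of the chapter's dataset.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere; drop this for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, not the chapter's dataset
toy = pd.DataFrame({"A": [1, 3, 2], "B": [2, 1, 4]}, index=["x", "y", "z"])

# One call draws one line per numeric column, using the index for the x-axis
ax = toy.plot(title="Toy example")

# The return value is a regular Matplotlib Axes, so Matplotlib methods still apply
ax.set_ylabel("Value")
n_lines = len(ax.get_lines())

plt.close(ax.figure)
```

Because the result is a plain Axes, anything Matplotlib can do (formatters, annotations, saving) remains available after the one-line Pandas call.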

2. Pandas plot() API

python
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(42)
df = pd.DataFrame({
    'Month': pd.date_range('2024-01', periods=12, freq='MS'),
    'Revenue':   np.random.normal(80000, 15000, 12).clip(50000, 120000),
    'Cost':      np.random.normal(55000, 10000, 12).clip(35000, 80000),
    'Customers': np.random.randint(100, 300, 12)
}).set_index('Month')
df['Profit'] = df['Revenue'] - df['Cost']
df['Margin'] = df['Profit'] / df['Revenue'] * 100

plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (10, 5)

# Line chart
ax = df[['Revenue', 'Cost', 'Profit']].plot(
    title='Monthly Financials 2024',
    ylabel='Amount ($)',
    color=['#1565C0', '#E53935', '#4CAF50'])
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x/1000:.0f}K'))
plt.tight_layout()
plt.savefig('pandas_line.png', dpi=150)
plt.show()

# Bar chart
df['Revenue'].plot(kind='bar', title='Monthly Revenue', color='#1565C0', rot=30)
plt.tight_layout()
plt.savefig('pandas_bar.png', dpi=150)
plt.show()

# Stacked area (kind='area' stacks by default; pass stacked=False to overlay the series instead)
df[['Revenue', 'Cost']].plot(kind='area', alpha=0.5, title='Revenue vs Cost',
                               color=['#1565C0', '#E53935'])
plt.tight_layout()
plt.savefig('pandas_area.png', dpi=150)
plt.show()
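
The same API also covers distribution and relationship plots, which the listing above does not show. The following is a hedged sketch using a hypothetical DataFrame (demo) generated in the same spirit as the chapter's data: kind='hist' for a single column and kind='scatter', which needs explicit x and y column names.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
rng = np.random.default_rng(42)
demo = pd.DataFrame({
    "Revenue": rng.normal(80000, 15000, 12),
    "Customers": rng.integers(100, 300, 12),
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram of one numeric column
demo["Revenue"].plot(kind="hist", bins=6, ax=ax_hist, title="Revenue distribution")

# Scatter plots require both x and y to be named
demo.plot(kind="scatter", x="Customers", y="Revenue", ax=ax_scatter,
          title="Revenue vs Customers")

plt.tight_layout()
fig.savefig("pandas_hist_scatter.png", dpi=150)
```

Passing ax= places each chart in an existing subplot, which is the same mechanism the EDA dashboard in the next section relies on.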

3. GroupBy + Plot for EDA

python
# Grouped visual analysis
np.random.seed(42)
sales_df = pd.DataFrame({
    'Date':    pd.date_range('2024-01', periods=200, freq='D'),
    'Region':  np.random.choice(['North', 'South', 'East', 'West'], 200),
    'Product': np.random.choice(['Laptop', 'Phone', 'Monitor'], 200),
    'Revenue': np.random.normal(5000, 1500, 200).clip(1000, 10000),
    'Units':   np.random.randint(1, 20, 200)
})

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# GroupBy bar chart
sales_df.groupby('Region')['Revenue'].sum().sort_values().plot(
    kind='barh', ax=axes[0,0], color='#1565C0', title='Revenue by Region')

# Pivot + stacked bar
pivot = sales_df.groupby(['Region', 'Product'])['Revenue'].sum().unstack()
pivot.plot(kind='bar', ax=axes[0,1], stacked=True, title='Revenue by Region & Product', rot=30)

# Line chart: weekly trend (resample('W') sums revenue per week)
sales_df.set_index('Date')['Revenue'].resample('W').sum().plot(
    ax=axes[1,0], color='#4CAF50', title='Weekly Revenue Trend')

# Box plot comparison
sales_df.boxplot(column='Revenue', by='Region', ax=axes[1,1], grid=False)
axes[1,1].set_title('Revenue Distribution by Region')
axes[1,1].set_xlabel('Region')

for ax in axes.flatten():
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

plt.suptitle('Sales EDA Dashboard', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.savefig('pandas_eda.png', dpi=150, bbox_inches='tight')
plt.show()

4. Common Mistakes

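  • Forgetting that df.plot() ignores non-numeric columns: Pandas plotting silently drops non-numeric columns and draws every numeric one. Use column selection such as df[['A','B']].plot() to control what is plotted.
  • Mishandling the index after groupby: df.groupby('Region')['Revenue'].sum() returns a Series indexed by the group key, and .plot() uses that index for the axis labels. If you call .reset_index(), pass x= and y= explicitly so the axis shows the group names rather than row numbers.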

5. MCQs

Question 1

df.plot(kind='bar') uses?

Question 2

df[['A','B']].plot() plots?

Question 3

df.resample('W').sum().plot() shows?

Question 4

df.boxplot(by='Region') groups by?

Question 5

kind='area' creates?

Question 6

stacked=True in plot(kind='bar') creates?

Question 7

df.groupby('Region')['Revenue'].sum().sort_values() before plot?

Question 8

pivot.plot(kind='bar') on grouped pivot table?

Question 9

.unstack() before pivot plot?

Question 10

Pandas plot for EDA is best because?

6. Interview Questions

  • Q: How do you create a grouped bar chart using Pandas groupby?
  • Q: What is the difference between df.plot() and plt.plot()?
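
One possible way to approach both questions is sketched below; it is not the only valid answer. The data is hypothetical. The first part builds a grouped (side-by-side) bar chart from groupby plus unstack, and the second part draws the same lines once with DataFrame.plot() and once with Matplotlib's Axes.plot() to show what Pandas fills in automatically.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive sessions
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration only
sales = pd.DataFrame({
    "Region":  ["North", "North", "South", "South", "North", "South"],
    "Product": ["Laptop", "Phone", "Laptop", "Phone", "Laptop", "Phone"],
    "Revenue": [500.0, 300.0, 400.0, 350.0, 200.0, 150.0],
})

# Grouped bar chart: one row per Region, one column per Product
pivot = sales.groupby(["Region", "Product"])["Revenue"].sum().unstack()
ax_grouped = pivot.plot(kind="bar", rot=0, title="Revenue by Region and Product")
legend_labels = [t.get_text() for t in ax_grouped.get_legend().get_texts()]

# df.plot(): x comes from the index, legend entries from the column names
ax_pd = pivot.plot(kind="line", marker="o")

# plt / Axes.plot(): x, y, labels, and legend are all supplied by hand
fig, ax_mpl = plt.subplots()
for col in pivot.columns:
    ax_mpl.plot(pivot.index, pivot[col], marker="o", label=col)
ax_mpl.legend()

plt.close("all")
```

Without stacked=True the bars for each Product sit side by side within each Region, which is what "grouped" means here.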

7. Summary

Pandas .plot() wraps Matplotlib for one-line charts from DataFrames. Key kind values include line, bar, barh, area, hist, and box. Chain groupby().sum().plot() for a quick category comparison and resample().sum().plot() for time series EDA. Use Pandas plotting for exploration, then switch to Matplotlib or Seaborn for publication polish.

8. Next Chapter Recommendation

In Chapter 27: Advanced Visualization Techniques, we explore animations, 3D surfaces (correctly used), custom chart types, and large-scale visual analytics.
