This chapter compiles the most-asked data visualization interview questions from analytics, data science, and BI roles — covering visualization theory, chart selection, dashboard design, and practical coding challenges.
Q1. What is data visualization and why does it matter?
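Data visualization is the graphical representation of data, which lets people identify patterns, trends, and outliers faster than they could by reading tables. The frequently repeated figure that humans process visuals 60,000x faster than text has no traceable study behind it, so treat it as folklore rather than citing it in an interview.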
Q2. What is Edward Tufte's Data-Ink Ratio?
The ratio of ink encoding data to total ink used. Maximize it by removing chart junk (decorations, unnecessary gridlines, 3D effects) and keeping only data-encoding elements.
Q3. What are pre-attentive attributes?
Visual properties processed by the brain in <250ms before conscious attention: position, length, color hue, color value, size, shape, orientation. Position is most accurate for quantitative comparison.
Q4. When would you use a bar chart vs. a line chart?
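Bar charts: categorical comparison (products, regions). Line charts: continuous time-based trends. Rule of thumb: if the x-axis values are discrete categories with nothing meaningful in between, use bars; if they lie on a continuous scale (usually time), use lines, because the connecting line implies values between the points.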
Q5. What is the Gestalt closure principle in visualization?
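The brain mentally "closes" incomplete shapes and perceives them as whole. Area charts and closed polygons (radar charts) read as complete forms even when truncated, and a plot area is still seen as one region when its frame does not have all four borders.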
Q6. What is a choropleth map?
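A map where geographic regions are colored based on a statistical variable such as GDP, population, or sales. Color intensity encodes magnitude (sequential palette) or direction (diverging palette). Normalized measures (per capita, per unit area) usually work better than raw totals, because large or populous regions otherwise dominate the map.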
Q7. What is the 80/20 rule in data visualization?
The Pareto principle: ~80% of value comes from 20% of factors. Visualized with a Pareto chart (sorted bar + cumulative % line). Used in product analysis, defect analysis, sales attribution.
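The numbers behind a Pareto chart can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `pareto_table` and the defect counts are hypothetical, chosen only to show the sort-then-accumulate step.

```python
def pareto_table(counts):
    """Return (label, value, cumulative_percent) rows sorted by value, largest first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0
    for label, value in ordered:
        running += value
        rows.append((label, value, round(100 * running / total, 1)))
    return rows


# Hypothetical defect counts, for illustration only.
defects = {"stain": 6, "scratch": 50, "other": 4, "dent": 30, "misprint": 10}
table = pareto_table(defects)

for label, value, cumulative in table:
    print(f"{label:<10}{value:>4}{cumulative:>8}%")
```

The sorted values become the bars and the third column becomes the cumulative line. In this made-up data the top two categories account for 80% of defects.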
Q8. What is overplotting and how do you fix it?
Overplotting occurs when too many scatter points overlap, hiding density. Fixes: transparency (alpha), jitter, hexbin plot, 2D histogram, or sampling.
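The 2D-histogram fix amounts to counting points per grid cell and plotting the counts instead of the raw points. The sketch below shows only that binning step in plain Python; `bin_2d`, the cell size, and the sample points are assumptions made for illustration. In practice a plotting library does the binning for you, for example matplotlib's `hexbin` and `hist2d`.

```python
from collections import Counter


def bin_2d(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell."""
    counts = Counter()
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical points: three of them crowd into the same cell.
points = [(0.1, 0.2), (0.15, 0.25), (0.4, 0.1), (1.2, 0.3), (1.9, 1.1)]
density = bin_2d(points, cell_size=1.0)

for cell, n in sorted(density.items()):
    print(cell, n)
```

Coloring each cell by its count makes density visible where overlapping markers would hide it.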
Q9. Sequential vs. diverging vs. qualitative color palettes?
Sequential (Blues): ordered magnitude. Diverging (RdBu): positive/negative with meaningful midpoint. Qualitative (tab10): categorical — no implied order.
Q10. What is a Sankey diagram used for?
Visualizing flows and quantities between stages — customer journeys, energy flows, financial transfers. Width of flows encodes volume.
Q11. What is a violin plot?
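Combines a box plot's summary (median and quartiles) with a kernel density estimate (KDE) that shows the full shape of the distribution. It reveals where the data is dense and whether it has more than one peak, unlike a box plot, which shows only the five-number summary plus any outliers.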
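To see why the distribution shape matters, the sketch below builds two small samples that share the same five-number summary but differ in where the values cluster. The sample values and the helper `five_number_summary` are hypothetical, and quartiles are computed with the standard library's `statistics.quantiles` using the inclusive method, which is one of several quartile conventions.

```python
from collections import Counter
from statistics import quantiles


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))


# Hypothetical samples: one evenly spread, one clustered around 2 and 6.
evenly_spread = [0, 1, 2, 3, 4, 5, 6, 7, 8]
clustered = [0, 2, 2, 2, 4, 6, 6, 6, 8]

summary_a = five_number_summary(evenly_spread)
summary_b = five_number_summary(clustered)
print(summary_a == summary_b)  # the box plots would look identical

# Value counts expose the two clusters that the summary hides.
print(Counter(clustered).most_common(2))
```

A box plot would draw these two samples identically, while a violin plot's density outline would show two bulges for the clustered sample.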
Q12. What are small multiples (faceting)?
The same chart repeated for different data subsets (e.g., same line chart for each product). Enables easy comparison without overlapping data. Implemented with FacetGrid in Seaborn.
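The idea can be sketched without Seaborn: split the rows by the facet variable, then draw the same chart once per group on shared axes. The version below uses plain matplotlib subplots rather than FacetGrid, and it assumes matplotlib is installed. The helper names, column keys, and sales figures are all hypothetical.

```python
from collections import defaultdict


def group_by_facet(rows, facet_key):
    """Split a list of dict rows into one list per distinct facet value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[facet_key]].append(row)
    return dict(groups)


def plot_small_multiples(rows, facet_key, x_key, y_key):
    """Draw one line chart per facet value, side by side, with shared axes."""
    import matplotlib.pyplot as plt  # assumed to be installed

    groups = group_by_facet(rows, facet_key)
    fig, axes = plt.subplots(
        1, len(groups), sharex=True, sharey=True, squeeze=False
    )
    for ax, (name, items) in zip(axes[0], groups.items()):
        ax.plot([r[x_key] for r in items], [r[y_key] for r in items])
        ax.set_title(str(name))
    return fig


# Hypothetical monthly sales for two products.
sales = [
    {"product": "A", "month": 1, "units": 10},
    {"product": "A", "month": 2, "units": 14},
    {"product": "B", "month": 1, "units": 7},
    {"product": "B", "month": 2, "units": 5},
]
facets = group_by_facet(sales, "product")
```

Calling `plot_small_multiples(sales, "product", "month", "units")` would return a figure with one panel per product. Sharing both axes is what makes the panels directly comparable.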
Q13. How do you design for color-blind viewers?
Use colorblind-safe palettes (e.g., those from ColorBrewer) and don't rely on color alone: add shape, pattern, or direct labels. Roughly 8% of men have some form of red-green color vision deficiency.
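One way to avoid relying on color alone is to give every category both a color and a marker shape. The sketch below uses the Okabe-Ito palette, a widely used colorblind-safe set, in place of ColorBrewer so the hex values can be listed inline. The helper `assign_styles` and the category names are hypothetical, and the marker codes follow matplotlib's marker syntax.

```python
# Okabe-Ito colorblind-safe palette (eight colors).
OKABE_ITO = [
    "#E69F00", "#56B4E9", "#009E73", "#F0E442",
    "#0072B2", "#D55E00", "#CC79A7", "#000000",
]
# Distinct marker shapes, written as matplotlib marker codes.
MARKERS = ["o", "s", "^", "D", "v", "P", "X", "*"]


def assign_styles(categories):
    """Give each category a unique color AND a unique marker shape."""
    if len(categories) > len(OKABE_ITO):
        raise ValueError("too many categories for redundant encoding")
    return {
        category: {"color": OKABE_ITO[i], "marker": MARKERS[i]}
        for i, category in enumerate(categories)
    }


# Hypothetical categories, for illustration only.
styles = assign_styles(["North", "South", "East"])
for name, style in styles.items():
    print(name, style)
```

Because shape repeats the information carried by color, the series remain distinguishable in grayscale or for viewers who cannot separate the hues.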
Q14. What is a treemap?
Hierarchical visualization where nested rectangles encode value through area. Good for part-to-whole comparisons with many items (e.g., file system, product categories). As a rule of thumb, show at most 3 hierarchy levels so that labels stay readable.
Q15. Why is 3D visualization generally discouraged?
3D perspective distorts proportions — nearer bars appear larger. Viewers must mentally compensate for the 3D angle. Only use 3D for inherently 3D data (surfaces, terrain).
COMMON INTERVIEW TASK: "Critique this chart and redesign it"
CRITIQUE FRAMEWORK (CASE method):
C — Context: What data is shown? What question does it answer?
A — Accuracy: Are encodings accurate? Y-axis at zero? Correct chart type?
S — Story: What's the key message? Is it clear in <5 seconds?
E — Execute: How would you redesign it for maximum clarity?
Example critique response:
"This 3D pie chart has 12 slices. Issues:
1. 3D perspective distorts slice proportions
2. 12 slices are impossible to compare (angle perception)
3. No sorted order
4. Redundant legend
Redesign: Sorted horizontal bar chart, top 5 + 'Other' grouping,
value labels instead of legend, clean white background"
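The "top 5 + 'Other'" step of the redesign can be sketched as a small data-preparation function. The name `top_n_with_other` and the twelve slice values are hypothetical. The output is already sorted, so it can be passed straight to a horizontal bar chart.

```python
def top_n_with_other(values, n=5, other_label="Other"):
    """Keep the n largest items and fold the remainder into one 'Other' row."""
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    top = ordered[:n]
    remainder = sum(value for _, value in ordered[n:])
    if remainder:
        top.append((other_label, remainder))
    return top


# Hypothetical shares for the 12 slices of the original pie chart.
slices = {
    "A": 30, "B": 20, "C": 15, "D": 10, "E": 8, "F": 5,
    "G": 4, "H": 3, "I": 2, "J": 1, "K": 1, "L": 1,
}
bars = top_n_with_other(slices)

for label, value in bars:
    print(f"{label:<6}{value:>4}")
```

Keeping "Other" as the last row even when it exceeds some named categories is a design choice: it stops the catch-all bucket from being read as a real category.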
50 interview questions covering: theory (pre-attentive, Tufte, Gestalt), chart selection, dashboard design, and coding challenges (reusable functions, CASE critique). Master these patterns to confidently discuss visualization in data analyst, data scientist, and BI engineer interviews.
In Chapter 29: Performance Optimization for Large Datasets, we render millions of data points efficiently using sampling, aggregation, and specialized rendering tools.