CHAPTER 15
Beginner
Data Manipulation with dplyr
Updated: May 18, 2026
1. Chapter Introduction
dplyr is R's grammar of data manipulation: a consistent set of verbs that make data transformation readable, chainable, and fast. This chapter covers the core dplyr verbs through business data analysis examples.
2. Core dplyr Verbs
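As a minimal sketch of the core verbs, assuming a small invented `sales` table (the column names and values below are hypothetical, chosen only for illustration), each verb can be tried on its own:

```r
library(dplyr)

# Hypothetical sales data, invented for illustration only
sales <- tibble(
  order_id = 1:6,
  region   = c("North", "South", "North", "East", "South", "North"),
  revenue  = c(6200, 4100, 7500, 3000, 5600, 4800),
  cost     = c(4000, 3000, 5000, 2500, 3900, 3600)
)

# filter(): keep rows; comma-separated conditions are combined with AND
north_big <- sales %>%
  filter(revenue > 5000, region == "North")

# select(): keep columns; starts_with() matches on the name prefix
re_cols <- sales %>%
  select(starts_with("re"))

# mutate(): add a column while keeping every row
with_profit <- sales %>%
  mutate(profit = revenue - cost)

# arrange(): sort rows; desc() puts the largest value first
sorted <- with_profit %>%
  arrange(desc(revenue))

# group_by() + summarise(): collapse to one row per group
by_region <- with_profit %>%
  group_by(region) %>%
  summarise(
    orders        = n(),          # number of rows in each group
    total_revenue = sum(revenue),
    .groups       = "drop"        # return an ungrouped tibble
  )
```

Each step takes a data frame and returns a new one, which is what lets the pipe `%>%` chain them together.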
3. Advanced dplyr Operations
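The operations this chapter treats as advanced (joins, `case_when()`, cumulative sums, and reshaping) could be sketched as follows. The `orders`, `customers`, and `quarterly` tables are hypothetical, and the tier thresholds are arbitrary assumptions. Note that `pivot_longer()` comes from the tidyr package rather than dplyr itself.

```r
library(dplyr)
library(tidyr)

# Hypothetical tables, invented for illustration only
orders <- tibble(
  id      = c(1, 2, 3, 4),
  revenue = c(6200, 4100, 7500, 900)
)
customers <- tibble(
  id   = c(1, 2, 3),
  name = c("Acme", "Birch", "Cedar")
)

# left_join(): keeps every row of `orders`; ids with no match get NA in `name`
joined <- left_join(orders, customers, by = "id")

# case_when(): a vectorised if / else-if chain; the first TRUE condition wins
tiered <- joined %>%
  mutate(tier = case_when(
    revenue >= 6000 ~ "high",     # thresholds are arbitrary assumptions
    revenue >= 3000 ~ "medium",
    TRUE            ~ "low"
  ))

# cumsum(): running total, computed in the current row order
running <- tiered %>%
  arrange(id) %>%
  mutate(running_revenue = cumsum(revenue))

# pivot_longer(): turn the Q1..Q4 columns into (quarter, revenue) pairs
quarterly <- tibble(
  region = c("North", "South"),
  Q1 = c(10, 20), Q2 = c(11, 21), Q3 = c(12, 22), Q4 = c(13, 23)
)
long <- quarterly %>%
  pivot_longer(cols = Q1:Q4, names_to = "quarter", values_to = "revenue")
```

Sorting with `arrange()` before `cumsum()` matters here, because a running total depends on row order.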
4. Mini Project: Sales Analytics Dashboard
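One possible shape for such a project, offered as a sketch rather than a full dashboard, is a pair of summary tables built from a transactions table. The `transactions` data and its column names are hypothetical, invented for illustration.

```r
library(dplyr)

# Hypothetical transactions, invented for illustration only
transactions <- tibble(
  month   = c("2026-01", "2026-01", "2026-02", "2026-02", "2026-03", "2026-03"),
  region  = c("North", "South", "North", "South", "North", "South"),
  revenue = c(5000, 3000, 6000, 3500, 5500, 4500)
)

# Monthly totals plus a running total across months
monthly <- transactions %>%
  group_by(month) %>%
  summarise(revenue = sum(revenue), .groups = "drop") %>%
  arrange(month) %>%
  mutate(cumulative_revenue = cumsum(revenue))

# Regional totals with each region's share of overall revenue
regional <- transactions %>%
  group_by(region) %>%
  summarise(revenue = sum(revenue), .groups = "drop") %>%
  mutate(share = revenue / sum(revenue)) %>%
  arrange(desc(revenue))
```

Because both summaries use `.groups = "drop"`, the `sum(revenue)` inside the second `mutate()` runs over all regions rather than within each group.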
5. Common Mistakes
- Forgetting `.groups = "drop"` in `summarise()`: In dplyr 1.0+, `summarise()` drops only the last grouping variable by default, so a result that was grouped by several variables stays grouped by the rest. Omitting `.groups = "drop"` can therefore cause unexpected behavior in downstream operations.
- Using `filter()` with `|` instead of `%in%`: `filter(col == "A" | col == "B")` is verbose. Use `filter(col %in% c("A", "B"))` for cleaner code.
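Both mistakes can be reproduced on a small example. The `sales` table below is hypothetical, invented for illustration. The sketch shows how leftover grouping changes a later `mutate()`, and that the two `filter()` forms select the same rows.

```r
library(dplyr)

# Hypothetical data, invented for illustration only
sales <- tibble(
  region  = c("North", "North", "South", "South"),
  product = c("A", "B", "A", "C"),
  revenue = c(100, 200, 300, 400)
)

# Default behaviour: only the last grouping variable (product) is dropped,
# so the result is still grouped by region (dplyr prints a message about this)
still_grouped <- sales %>%
  group_by(region, product) %>%
  summarise(total = sum(revenue))

group_vars(still_grouped)  # "region"

# With .groups = "drop" the result carries no grouping
ungrouped <- sales %>%
  group_by(region, product) %>%
  summarise(total = sum(revenue), .groups = "drop")

# Downstream effect: sum(total) is per region in one case, overall in the other
share_grouped <- still_grouped %>% mutate(share = total / sum(total))
share_overall <- ungrouped %>% mutate(share = total / sum(total))

# The two filter() forms keep the same rows; %in% scales better to long lists
with_or <- sales %>% filter(product == "A" | product == "B")
with_in <- sales %>% filter(product %in% c("A", "B"))
```

In `share_grouped` the shares add up to 1 within each region, while in `share_overall` they add up to 1 across the whole table, which is the kind of silent difference the first mistake produces.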
6. MCQs
Question 1
filter(revenue > 5000, region == "North") keeps rows where?
Question 2
select(starts_with("re")) selects columns?
Question 3
mutate() vs summarise()?
Question 4
arrange(desc(revenue)) sorts?
Question 5
left_join(a, b, by="id") keeps?
Question 6
pivot_longer(cols=Q1:Q4) creates?
Question 7
case_when() is equivalent to?
Question 8
n() inside summarise() counts?
Question 9
ungroup() removes?
Question 10
cumsum(x) computes?
7. Interview Questions
- Q: What is the difference between `mutate()` and `summarise()` in dplyr?
- Q: How do you perform a left join in dplyr?
8. Summary
dplyr verbs: filter() (rows), select() (columns), mutate() (new columns), arrange() (sort), group_by() + summarise() (aggregate), left_join()/inner_join() (combine tables). The pipe %>% chains operations. Use case_when() for complex conditionals and tidyr's pivot_longer()/pivot_wider() for reshaping. Set .groups="drop" in summarise() whenever the result should be ungrouped.