
Advanced R Programming Concepts

Updated: May 18, 2026

# CHAPTER 27

1. Chapter Introduction

Advanced R features — functional programming with purrr, environments and closures, tidy evaluation, and data.table — make R code faster, more expressive, and production-ready. This chapter covers what separates intermediate from expert R programmers.

2. Functional Programming with purrr

r
library(purrr)
library(dplyr)

# map() — apply function to list/vector, return list
map(1:5, ~ .x^2)               # List of squares
map_dbl(1:5, ~ .x^2)           # Numeric vector
map_chr(1:5, ~ paste0("item_", .x))  # Character vector
map_lgl(1:5, ~ .x > 3)         # Logical vector

# map2() — iterate over two vectors simultaneously
x_vals <- c(1, 2, 3, 4, 5)
y_vals <- c(10, 20, 30, 40, 50)
map2_dbl(x_vals, y_vals, ~ .x + .y)  # 11 22 33 44 55

# pmap() — iterate over multiple inputs
params <- list(n=c(10,20,30), mean=c(0,5,10), sd=c(1,2,3))
pmap(params, rnorm) %>% map_dbl(mean)  # 3 means from 3 distributions

# reduce() — fold list with binary function
reduce(1:5, `+`)      # 1+2+3+4+5 = 15
reduce(list(c(1,2,3), c(4,5), c(6)), union)  # Set union

# keep/discard — filter list elements
nums <- list(1, -3, 5, -2, 8, -1)
keep(nums, ~ .x > 0)    # 1 5 8
discard(nums, ~ .x > 0) # -3 -2 -1

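# Safely handle errors: safely() returns list(result, error) for each call
safe_log <- safely(log)
results <- map(list(10, "a", 0, 5), safe_log)  # log("a") throws an error
map(results, "result")  # Results (NULL where the call failed)
map(results, "error")   # Error objects (NULL where the call succeeded)
# Note: log(-1) only warns and returns NaN, and log(0) returns -Inf; neither is an error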

# compose() chains functions; by default they apply right to left
process <- compose(
  partial(round, digits=2),  # 3. round the mean
  mean,                      # 2. average the remaining values
  ~ .x[!is.na(.x)]           # 1. remove NAs first
)
process(c(1, 2, NA, 4, 5))  # 3
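
When only a fallback value is needed, rather than the full result/error pair that safely() returns, purrr's possibly() is a lighter option. The sketch below is illustrative: the inputs list and the log_or_na name are assumptions, not part of the chapter's code.

```r
library(purrr)

# Hypothetical inputs: the character value makes log() throw an error.
inputs <- list(10, "a", 5)

# possibly() returns the fallback value instead of stopping on an error.
log_or_na <- possibly(log, otherwise = NA_real_)
values <- map_dbl(inputs, log_or_na)

# Positions of the inputs that failed.
failed <- which(is.na(values))
```

Because the fallback is a double, the results still fit in map_dbl(), so the pipeline stays type-stable even when some calls fail.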

3. Environments and Closures

r
# Closures — functions that remember their environment
make_counter <- function(start=0) {
  count <- start
  list(
    increment = function(by=1) { count <<- count + by; invisible(count) },
    reset     = function() { count <<- 0; invisible(count) },
    get       = function() count
  )
}

counter <- make_counter(10)
counter$increment()
counter$increment(5)
counter$get()    # 16
counter$reset()
counter$get()    # 0

# Factory functions — create customized functions
make_power <- function(exp) function(x) x^exp
square <- make_power(2)
cube   <- make_power(3)
square(4)  # 16
cube(3)    # 27

# Memoization — cache expensive function results
library(memoise)
slow_function <- function(n) { Sys.sleep(0.01); sum(1:n) }
fast_function <- memoise(slow_function)  # Cache results!

system.time(fast_function(1000))  # Slow first time
system.time(fast_function(1000))  # Instant (cached)
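
One subtlety with function factories is lazy evaluation: an argument is not evaluated until it is first used, so a factory can pick up a later value of the caller's variable. The sketch below contrasts a factory with and without force(); the names make_power_lazy and make_power_forced are hypothetical and only serve the comparison.

```r
# Without force(): exp is a promise that is evaluated on first use.
make_power_lazy <- function(exp) function(x) x^exp

# With force(): exp is evaluated when the factory is called.
make_power_forced <- function(exp) {
  force(exp)
  function(x) x^exp
}

e <- 2
lazy_fn   <- make_power_lazy(e)
forced_fn <- make_power_forced(e)
e <- 3  # change the variable before either function is first called

lazy_result   <- lazy_fn(4)    # 64: exp is evaluated only now, when e is 3
forced_result <- forced_fn(4)  # 16: exp was fixed at 2 when the factory ran
```

Calling force() at the top of a factory is a cheap way to make the captured value predictable.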

4. data.table (High Performance)

r
library(data.table)

# Convert to data.table (the diamonds dataset ships with ggplot2)
DT <- as.data.table(ggplot2::diamonds)

# data.table syntax: DT[rows, columns, by]
# Filter
DT[cut == "Ideal" & price < 1000]

# Select and compute
DT[, .(mean_price = mean(price), count = .N), by=cut]

# Add column (modifies in place — memory efficient)
DT[, price_per_carat := price / carat]

# Fast aggregation (often much faster than dplyr on large data)
DT[, .(avg_price=mean(price), sd=sd(price)), by=.(cut, color)]

# data.table's advantage is usually largest for:
# - Large datasets (> 100K rows)
# - Complex group-by operations
# - In-place modification (no copy)
# The actual speedup depends on the data, the operation, and the package versions.

# Benchmark
large_df <- data.frame(x=rnorm(1e6), g=sample(letters, 1e6, TRUE))
large_DT <- as.data.table(large_df)

system.time(aggregate(x ~ g, large_df, mean))    # base R
system.time(large_DT[, .(mean=mean(x)), by=g])  # data.table — much faster
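
To see how the three slots of DT[rows, columns, by] combine in a single call, here is a small sketch on a made-up table. The sales table and its column names are hypothetical.

```r
library(data.table)

# Hypothetical data: five sales records in two regions.
sales <- data.table(
  region = c("N", "N", "S", "S", "S"),
  amount = c(10, 20, 5, 15, 40)
)

# rows: keep amounts above 5; columns: total and row count; by: one row per region
by_region <- sales[amount > 5, .(total = sum(amount), n = .N), by = region]
by_region
#    region total n
# 1:      N    30 2
# 2:      S    55 2
```

The row filter runs first, so .N counts only the rows that survive it within each group.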

5. Tidy Evaluation (Advanced dplyr)

r
library(dplyr)
library(rlang)

# Problem: a function that takes column names as arguments.
# dplyr verbs use non-standard evaluation, so the naive version fails.
filter_and_select <- function(df, filter_col, filter_val, output_cols) {
  # Naive version fails: dplyr looks up the names filter_col and output_cols
  # themselves, not the columns the caller passed.
  # df %>% filter(filter_col == filter_val) %>% select(output_cols)

  # Correct: embrace the arguments with {{ }}, shorthand for enquo() followed by !!
  df %>%
    filter({{ filter_col }} == filter_val) %>%
    select({{ output_cols }})
}

# Using .data pronoun (simpler for string column names)
summarize_group <- function(df, group_col, value_col) {
  df %>%
    group_by(.data[[group_col]]) %>%
    summarise(mean = mean(.data[[value_col]]), n=n(), .groups="drop")
}
summarize_group(mtcars, "cyl", "mpg")
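
For comparison, the same kind of summary can take bare column names by embracing them with {{ }}. This is a minimal sketch: mean_by is a hypothetical helper, and the all_of() lines show one way to select from a character vector of names.

```r
library(dplyr)

# Hypothetical helper: group_col and value_col are passed as bare column names.
mean_by <- function(df, group_col, value_col) {
  df %>%
    group_by({{ group_col }}) %>%
    summarise(mean = mean({{ value_col }}), n = n(), .groups = "drop")
}

res <- mean_by(mtcars, cyl, mpg)  # one row per cyl value (4, 6, 8)

# For a character vector of column names, all_of() works inside select().
cols <- c("mpg", "cyl")
subset_df <- mtcars %>% select(all_of(cols))
```

Embracing suits functions called interactively with bare names, while .data[[name]] and all_of() suit column names that arrive as strings, for example from a config file.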

6. Common Mistakes

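  • <<- inside purrr functions: <<- assigns in an enclosing environment, searching parent scopes and creating a global variable if no existing binding is found. In functional pipelines, avoid such side effects: use reduce() or accumulate values through the return value instead.
  • data.table := modifies in place: DT[, x := x + 1] permanently modifies DT, so no assignment is needed. Beginners confused by this write DT2 <- DT[, x := x + 1], which makes DT2 another reference to the same table, not a new object. Use copy(DT) when an independent copy is needed.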
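
Both pitfalls can be sketched in a few lines. The example is an illustration under assumptions: the variable names (running, alias, snapshot) are made up, and accumulate() stands in for any loop that would otherwise update a counter with <<-.

```r
library(purrr)
library(data.table)

# Running total without <<-: accumulate() returns every intermediate sum.
running <- accumulate(c(1, 2, 3, 4), `+`)

# := changes the table in place, so plain assignment only adds a second name.
DT <- data.table(x = 1:3)
alias <- DT            # same table as DT
snapshot <- copy(DT)   # independent copy
DT[, x := x + 1L]

alias$x     # 2 3 4 (follows DT)
snapshot$x  # 1 2 3 (unchanged)
```

Taking a copy() before an in-place update is the usual way to keep the original values available.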

7. MCQs

Question 1

map_dbl(x, f) returns?

Question 2

Closure in R is?

Question 3

reduce(1:5, sum) returns?

Question 4

safely(f) wraps f to?

Question 5

data.table := operator does?

Question 6

pmap() iterates over?

Question 7

memoise(f) creates?

Question 8

.N in data.table means?

Question 9

{{ col }} in dplyr function uses?

Question 10

data.table is fastest for?

8. Interview Questions

  • Q: What is a closure in R and how does it enable factory functions?
  • Q: When would you use data.table instead of dplyr?

9. Summary

Functional programming: purrr::map_*() (type-stable), map2(), pmap(), reduce(), safely(), compose(). Closures: functions capture their lexical scope, which enables factory patterns and stateful objects. Memoization with memoise. data.table: often much faster than dplyr on large data, with DT[rows, cols, by] syntax and := for in-place modification. Tidy eval: {{ col }} for user-facing dplyr functions.

10. Next Chapter Recommendation

In Chapter 28: R Programming Interview Preparation, we compile 50 interview questions, coding challenges, and statistical problem sets.
