CHAPTER 12 Beginner

File Handling in R

Updated: May 18, 2026

1. Chapter Introduction

Most data analysis involves reading from and writing to files. This chapter covers R's core file handling tools: text files, CSVs, serialized R objects (RDS and RData files), and directory management. These are the building blocks of reproducible data pipelines.

2. Reading Files

r
# ─── READ TEXT FILES ──────────────────────────────────
# Read all lines into character vector
lines <- readLines("data.txt")
cat("Lines:", length(lines), "\n")
print(head(lines, 5))

# Read with connection (memory-efficient for large files)
con <- file("large_data.txt", "r")
while (TRUE) {
  line <- readLines(con, n=1)
  if (length(line) == 0) break
  # Process line here
}
close(con)

# ─── READ CSV FILES ───────────────────────────────────
df <- read.csv("sales.csv")              # Basic
df <- read.csv("sales.csv",
               header     = TRUE,        # First row is header
               sep        = ",",         # Delimiter
               na.strings = c("", "NA", "N/A", "NULL"),  # NA placeholders
               stringsAsFactors = FALSE, # Keep strings as character (the default since R 4.0.0)
               skip       = 2,           # Skip 2 lines before the header row
               nrows      = 1000)        # Read at most 1000 data rows

# Tab-separated
df <- read.delim("data.tsv", sep="\t")
# Any delimiter
df <- read.table("data.txt", sep="|", header=TRUE)

# ─── USING readr (faster, tidyverse) ──────────────────
library(readr)
df <- read_csv("sales.csv",
               col_types = cols(
                 date    = col_date("%Y-%m-%d"),
                 revenue = col_double(),
                 region  = col_character()
               ))
problems(df)  # Show parsing issues
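
The effect of na.strings is easiest to see on a small file. The following is a minimal sketch, assuming a made-up four-row dataset written to a temporary file; the column names and values are illustrative and not taken from sales.csv.

```r
# Hypothetical sample data, written to a temporary file so the sketch is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "region,revenue",
  "North,100.5",
  "South,N/A",
  "East,",
  "West,NULL"
), tmp)

# Treat empty fields, "NA", "N/A" and "NULL" as missing values
sales <- read.csv(tmp,
                  na.strings = c("", "NA", "N/A", "NULL"),
                  stringsAsFactors = FALSE)

str(sales)   # Inspect the column types that read.csv() inferred

missing_count <- sum(is.na(sales$revenue))
cat("Missing revenue values:", missing_count, "\n")

unlink(tmp)  # Remove the temporary file
```

Because every placeholder is mapped to NA, the revenue column can be parsed as numeric; without na.strings, the text "N/A" would force the whole column to character.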

3. Writing Files

r
# Write CSV
write.csv(df, "output.csv", row.names=FALSE)  # row.names=FALSE to avoid X column
write.csv2(df, "output_eu.csv", row.names=FALSE)  # European format (semicolon separator, comma as decimal mark)

# Tidyverse write (no row names by default)
library(readr)
write_csv(df, "output_tidy.csv")
write_tsv(df, "output.tsv")

# Write text file
cat("Report generated:", Sys.time(), "\n", file="report.txt")
writeLines(c("Line 1", "Line 2", "Line 3"), "output.txt")

# Append to existing file
cat("New data\n", file="log.txt", append=TRUE)

# Save/Load R objects
saveRDS(df, "my_data.rds")          # Save single R object
df_loaded <- readRDS("my_data.rds") # Load it back

save(df, model, file="workspace.RData")  # Save multiple objects (file= must be named; model is any other object in your workspace)
load("workspace.RData")             # Restores all saved objects
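
The practical difference between the two save formats is how names are handled. The following is a minimal sketch, assuming two throwaway objects (scores and settings, both hypothetical) and temporary file paths.

```r
# Hypothetical objects used only for this illustration
scores   <- data.frame(id = 1:3, score = c(90, 85, 77))
settings <- list(threshold = 80)

rds_path   <- tempfile(fileext = ".rds")
rdata_path <- tempfile(fileext = ".RData")

# saveRDS() stores one object without its name; readRDS() returns the object,
# so the caller decides which variable it is assigned to.
saveRDS(scores, rds_path)
scores_copy <- readRDS(rds_path)

# save() stores objects together with their names; load() recreates those
# names in the target environment and returns them as a character vector.
save(scores, settings, file = rdata_path)
restored_env   <- new.env()
restored_names <- load(rdata_path, envir = restored_env)

print(all.equal(scores, scores_copy))
print(sort(restored_names))

unlink(c(rds_path, rdata_path))  # Remove the temporary files
```

Loading into a separate environment, as done here, is one way to avoid load() silently overwriting objects that already exist under the same names.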

4. File System Operations

r
# File paths (use / or \\ for Windows)
path <- "data/sales/2024/q4_sales.csv"

# Check existence
file.exists("data.csv")    # TRUE/FALSE
dir.exists("output/")      # TRUE/FALSE

# Create directories
dir.create("output/charts", recursive=TRUE)   # Create nested dirs

# List files
list.files("data/")                  # All files
list.files("data/", pattern="\\.csv$")  # Only CSV files
list.files("data/", recursive=TRUE)   # Include subdirectories

# File operations
file.copy("source.csv", "backup.csv")
file.rename("old_name.csv", "new_name.csv")
file.remove("temp_file.csv")

# Get file info
file.info("data.csv")$size    # File size in bytes
file.info("data.csv")$mtime   # Last modified time
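
The calls above can be combined into a small end-to-end run. The following is a minimal sketch that works entirely inside a temporary directory; the directory and file names are made up for the illustration, and file.path(), basename() and tools::file_ext() are base R helpers not used elsewhere in this chapter.

```r
# Work inside a temporary location so nothing in the project is touched
base_dir  <- tempfile("fs_demo_")
chart_dir <- file.path(base_dir, "output", "charts")  # Portable path joining

# Create the directory only if it is missing (avoids a warning on re-runs)
if (!dir.exists(chart_dir)) dir.create(chart_dir, recursive = TRUE)

# Write one CSV and one text file, then list only the CSV
csv_path <- file.path(chart_dir, "summary.csv")
write.csv(data.frame(x = 1:3), csv_path, row.names = FALSE)
writeLines("notes", file.path(chart_dir, "readme.txt"))

csv_files <- list.files(chart_dir, pattern = "\\.csv$")
print(csv_files)

# file.copy() reports success as TRUE/FALSE and does not overwrite by default
copied <- file.copy(csv_path, file.path(chart_dir, "summary_backup.csv"))
print(copied)

print(basename(csv_path))         # File name without the directory part
print(tools::file_ext(csv_path))  # Extension without the dot

unlink(base_dir, recursive = TRUE)  # Remove the demo directory and its contents
```

Checking the logical return values of file.copy(), file.rename() and file.remove() is worthwhile in scripts, since these functions signal failure by returning FALSE rather than stopping with an error.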

5. Mini Project: Notes Management System

r
# ─── NOTES MANAGEMENT SYSTEM ─────────────────────────
notes_file <- "notes.txt"

add_note <- function(title, content) {
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M")
  note <- paste0(
    "─────────────────────────────────────\n",
    "TITLE: ", title, "\n",
    "DATE:  ", timestamp, "\n",
    "─────────────────────────────────────\n",
    content, "\n\n"
  )
  cat(note, file=notes_file, append=TRUE)
  cat("Note added:", title, "\n")
}

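list_notes <- function() {
  if (!file.exists(notes_file)) {
    cat("No notes found.\n"); return(invisible(NULL))  # invisible() keeps NULL from printing
  }
  lines <- readLines(notes_file)
  titles <- sub("^TITLE: ", "", grep("^TITLE:", lines, value=TRUE))
  cat("=== YOUR NOTES ===\n")
  # sprintf() avoids the extra spaces cat() inserts between its arguments
  for (i in seq_along(titles)) cat(sprintf("%d. %s\n", i, titles[i]))
}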

search_notes <- function(keyword) {
  if (!file.exists(notes_file)) { cat("No notes.\n"); return(invisible(NULL)) }
  lines <- readLines(notes_file)
  matches <- grep(keyword, lines, ignore.case=TRUE, value=TRUE)
  if (length(matches) == 0) cat("No matches for:", keyword, "\n")
  else { cat("Found", length(matches), "matches:\n"); print(matches) }
}

export_csv <- function(output_file = "notes_export.csv") {
  if (!file.exists(notes_file)) { cat("No notes.\n"); return(invisible(NULL)) }
  lines <- readLines(notes_file)
  titles <- sub("^TITLE: ", "", grep("^TITLE:", lines, value=TRUE))
  # add_note() writes two spaces after "DATE:", so strip any run of whitespace
  dates  <- sub("^DATE:\\s*", "", grep("^DATE:", lines, value=TRUE))
  df <- data.frame(Title=titles, Date=dates, stringsAsFactors=FALSE)
  write.csv(df, output_file, row.names=FALSE)  # Base R, so this project needs no extra packages
  cat("Exported", nrow(df), "notes to", output_file, "\n")
}

# Demo
if (file.exists(notes_file)) file.remove(notes_file)  # Start fresh (no warning if the file is absent)
add_note("R Basics", "R is a statistical computing language. Use <- for assignment.")
add_note("ggplot2", "Grammar of graphics: data + aesthetics + geometry.")
add_note("dplyr", "Key verbs: filter, select, mutate, group_by, summarise.")
list_notes()
search_notes("ggplot2")
export_csv()

6. Common Mistakes

  • write.csv() includes row names by default: Re-reading the file then yields an extra column named X. Pass row.names=FALSE unless the row names carry information.
  • Relative vs absolute paths: Relative paths are resolved against the working directory, so a script that calls setwd() breaks as soon as it runs on another machine or from another folder. Use R Projects + here::here() for reproducible paths.
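
The first mistake can be reproduced in a few lines. The following is a minimal sketch, assuming a made-up two-row data frame (df_demo) and temporary files.

```r
df_demo <- data.frame(a = 1:2, b = c("x", "y"))  # Hypothetical sample data

with_names    <- tempfile(fileext = ".csv")
without_names <- tempfile(fileext = ".csv")

write.csv(df_demo, with_names)                        # Default: row names are written
write.csv(df_demo, without_names, row.names = FALSE)  # Row names are left out

# Compare the column names after reading each file back
cols_with    <- names(read.csv(with_names))
cols_without <- names(read.csv(without_names))
print(cols_with)
print(cols_without)

unlink(c(with_names, without_names))  # Remove the temporary files
```

The file written with the default settings comes back with one more column than the original data frame; the file written with row.names = FALSE round-trips with the same columns.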

7. MCQs

Question 1

read.csv("file.csv", row.names=1) uses column 1 as?

Question 2

saveRDS(obj, "file.rds") saves?

Question 3

Does write.csv() include row names by default?

Question 4

list.files(pattern="\\.csv$") returns?

Question 5

file.exists() returns?

Question 6

What does append=TRUE do in cat(file=...)?

Question 7

readLines() reads file as?

Question 8

dir.create(recursive=TRUE) creates?

Question 9

How does read_csv() from readr differ from read.csv()?

Question 10

load("workspace.RData") restores?

8. Interview Questions

  • Q: What is the difference between saveRDS() and save() in R?
  • Q: Why should you use row.names=FALSE with write.csv()?

9. Summary

File reading: read.csv() (base), read_csv() (readr, faster). Specify stringsAsFactors=FALSE in R versions before 4.0.0, where the default was TRUE. Writing: write.csv(row.names=FALSE), write_csv(). Serialized R objects: saveRDS()/readRDS() (single object), save()/load() (multiple named objects). File system: file.exists(), dir.create(recursive=TRUE), list.files(pattern=). Use R Projects + here package for portable paths.

10. Next Chapter Recommendation

In Chapter 13: Data Import and Export, we import from Excel, JSON, databases, and export analysis results to professional formats.
