Importing, Exporting, Backup, and Recovery
# CHAPTER 24
Importing, Exporting, Backup, and Recovery
1. Introduction
All the advanced queries in the world mean nothing if your server's hard drive crashes and you lose the data. Data loss is a catastrophic event that destroys companies. As a database developer, you must know how to move data in and out of your system seamlessly, and more importantly, how to execute bulletproof Backup and Recovery protocols. In this chapter, we will step out of the SQL editor and into the Terminal to master the essential operational tools of PostgreSQL.2. Learning Objectives
By the end of this chapter, you will be able to:-
Use the
COPYcommand to import and export CSV files.
- Understand the difference between logical and physical backups.
-
Use
pgdumpto create a complete database backup.
-
Use
pgrestoreto rebuild a database from a backup file.
- Understand the 3-2-1 Backup Strategy.
3. Importing and Exporting CSV Data
Often, a client will hand you an Excel spreadsheet with 10,000 rows and say, "Put this in the database." You do not write 10,000INSERT statements. You save the spreadsheet as a .csv file and use the incredibly fast COPY command.
Exporting a Table to CSV:
Open psql (or the pgAdmin Query Tool) and run:
*(This takes the entire users table and writes it to a file. The HEADER keyword ensures the column names are printed on the first line).*
Importing a CSV into a Table: *Note: The table must already be created with the correct columns!*
4. Logical Backups: The pgdump Utility
The COPY command only moves raw data for a single table. It does not copy your Table Structures, Indexes, or Views.
To back up an entire database (Data + Architecture), PostgreSQL provides a terminal utility called pgdump. You run this from your operating system's terminal (Bash/PowerShell), NOT from inside the SQL editor!
Creating a plain-text SQL backup:
*(This generates a massive text file containing every CREATE TABLE and INSERT statement needed to magically recreate your database from scratch).*
5. Creating a Custom Format Backup (Recommended)
While.sql text files are great for reading, they are slow to restore. Professional DBAs use the "Custom Format" (-Fc). It is heavily compressed and allows you to restore individual tables later if needed.
6. Disaster Recovery: The pgrestore Utility
Your server crashed. You bought a new server, installed a fresh copy of PostgreSQL, and transferred your universitybackup.dump file to it. How do you revive the database?
First, you must create a blank database container:
Next, use the pg_restore terminal utility to unpack the backup into the empty container:
*(The -1 flag is brilliant: It wraps the entire restoration in a single Transaction. If the restore fails halfway through, it rolls back, preventing a corrupted partial restore!)*
7. Physical Backups and WAL (Advanced)
pgdump is a "Logical" backup. It extracts the logic (SQL commands).
For massive enterprise databases (Terabytes of data), logical backups are too slow. Instead, DBAs use "Physical Backups" (copying the actual binary .dat files off the hard drive) combined with WAL (Write-Ahead Logging).
WAL files record every single byte change in real-time. By continuously archiving WAL files to an AWS S3 bucket, enterprises can perform "Point-In-Time Recovery," rewinding the database to any exact millisecond before a crash occurred.
8. Mini Project: Automating Backups with Cron
A backup is useless if you forget to run it. In Linux, we usecron to schedule automated tasks.
Let's create a bash script (backup.sh) that runs pgdump and stamps the file with today's date.
*(You would then configure the Linux Cron daemon to execute this script every night at 3:00 AM).*
9. Common Mistakes
-
Never Testing the Recovery: Generating backup files every night is useless if the files are corrupted. A backup does not officially exist until you have successfully executed
pgrestoreon a test server and verified the data is intact.
-
Storing Backups on the Same Server: If you save your
pgdumpfile to the same hard drive that the database lives on, and that hard drive catches fire, you lose both the database and the backup. Always transfer backups off-site immediately.
10. Best Practices
- The 3-2-1 Backup Strategy:
- Keep 3 copies of your data (1 Primary, 2 Backups).
- Store them on 2 different types of media (e.g., Local Drive + Cloud Storage).
- Keep 1 copy completely off-site (e.g., in a different physical city or AWS region).
11. Exercises
- 1. What terminal utility is used to generate a complete logical backup of a PostgreSQL database?
-
2.
Why is the
-1flag highly recommended when executingpg_restore?
12. SQL Challenges
Write the SQL command (executed insidepsql) to export the entire products table into a CSV file located at C:\backups\products.csv, ensuring the column headers are included.
13. MCQ Quiz with Answers
When running the COPY command to import a massive CSV file into PostgreSQL, what must be true about the target table?
What is the primary difference between a Logical Backup (pgdump) and a Physical Backup?
14. Interview Questions
-
Q: Describe the architectural differences between
pgdumpand theCOPYcommand. When would you use one over the other?
- Q: Explain the concept of Write-Ahead Logging (WAL) in PostgreSQL. How does it facilitate Point-In-Time Recovery in enterprise disaster scenarios?
15. FAQs
Q: Does runningpgdump lock the database and prevent users from reading the website?
A: No! Because of PostgreSQL's brilliant MVCC architecture, pgdump takes a seamless snapshot. Your website will remain 100% online and functional while the massive backup is running in the background.
16. Summary
You are now equipped to protect your organization's most valuable asset. By utilizingCOPY for rapid data onboarding, and deploying automated pgdump and pgrestore workflows, you ensure that your data is highly portable, archivable, and perfectly insulated against catastrophic hardware failures.