Aggregate Functions in PostgreSQL
# CHAPTER 10
Aggregate Functions in PostgreSQL
1. Introduction
Databases are not just storage cabinets; they are extremely powerful calculators. When the CEO of an E-commerce company asks, "How much money did we make yesterday?" or "How many users registered this month?", you cannot run aSELECT * query and manually add up 50,000 rows on a calculator. Instead, you use Aggregate Functions. These functions take thousands or millions of individual rows, perform math on them internally, and return a single, summarized answer.
2. Learning Objectives
By the end of this chapter, you will be able to:-
Count the total number of rows in a table using
COUNT().
-
Add numerical values together using
SUM().
-
Calculate the mathematical average using
AVG().
-
Find the absolute highest and lowest values using
MAX()andMIN().
-
Combine aggregate functions with the
WHEREclause.
3. The COUNT() Function
The most commonly used aggregate function. It simply counts how many rows exist.
*(Notice the * inside COUNT. This tells PostgreSQL to count rows regardless of what data is in them).*
4. The SUM() Function
Used exclusively on numerical columns, SUM() adds all the numbers together.
5. The AVG() Function
Calculates the mean average of a numerical column.
*Note: AVG() often returns a massive decimal (e.g., 55333.333333333). In PostgreSQL, you can use the ROUND() function to clean it up: SELECT ROUND(AVG(salary), 2) FROM employees;*
6. The MAX() and MIN() Functions
Finds the absolute highest and absolute lowest values. These work on numbers, but they also work on dates and text!
7. Combining Aggregates
You can run multiple aggregate functions in a single query to generate a complete statistical dashboard in milliseconds.8. Mini Project: Sales Analytics Dashboard
Let's pretend we are building the backend API for a merchant's dashboard. They need their statistics for the month of October.9. Common Mistakes
-
Mixing Aggregates with Normal Columns: This is the #1 error beginners make. You CANNOT write
SELECT firstname, COUNT(*) FROM employees;. Think about it logically:COUNT(*)returns exactly ONE row (the total number).firstnamewants to return 1,000 rows (every single name). The database cannot smash 1,000 rows and 1 row into the same grid. It will crash. (We will learn the workaround for this in Chapter 11).
-
NULL values in
AVG(): Aggregate functions automatically ignoreNULLvalues. If 5 employees have a salary, and 5 haveNULL(missing) salaries,AVG(salary)will calculate the average of the 5 who have it, NOT divide by 10.
10. Best Practices
-
Always use Aliases: Aggregate output column names are incredibly ugly (e.g.,
sum,avg). Always useAS "Total"so your frontend application receives a clean JSON key.
11. Exercises
-
1.
Write a query to find the total combined inventory (
quantity) of all items in theproductstable.
- 2. Write a query to find the hire date of the newest (most recently hired) employee.
12. SQL Challenges
Write a query to find the total revenue (price * quantity) of all orders placed by user_id = 42.
13. MCQ Quiz with Answers
What is the fundamental restriction when using Aggregate Functions in a standard SELECT query?
How do Aggregate Functions (like SUM or AVG) handle NULL values in a column?
14. Interview Questions
-
Q: Explain why executing
SELECT department, SUM(salary) FROM employees;will result in a PostgreSQL syntax error.
- Q: In a massive E-commerce database, describe how you would calculate the Average Order Value (AOV) for a specific marketing campaign.
15. FAQs
Q: IsCOUNT(*) slow on massive tables?
A: Yes! In PostgreSQL, COUNT(*) forces the database to scan the table to ensure MVCC (concurrency) rules are followed. If you have 50 million rows, COUNT(*) can take several seconds.
16. Summary
Aggregate functions transform databases from static filing cabinets into powerful statistical engines. By utilizingCOUNT, SUM, AVG, MAX, and MIN, you can compress massive datasets into instant, actionable business metrics.