Skip to main content
MySQL Basics
CHAPTER 11 Beginner

GROUP BY and HAVING Clauses

Updated: May 16, 2026
6 min read

# CHAPTER 11

GROUP BY and HAVING Clauses

1. Introduction

In the last chapter, we learned that mixing a standard column (department) with an aggregate function (SUM(salary)) causes MySQL to crash. But what if your boss specifically asks: *"I want a list of every department, and the total salary cost next to each one."* To achieve this, you must instruct MySQL to categorize the data *before* running the math. In this chapter, we introduce the GROUP BY clause—the undisputed champion of data reporting and analytics.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Understand how GROUP BY splits data into categories.
  • Safely combine standard columns with aggregate functions.
  • Differentiate between the WHERE clause and the HAVING clause.
  • Write complex, multi-layered reporting queries.

3. The GROUP BY Clause

The GROUP BY clause tells MySQL: *"Before you do the SUM() math, group all the identical rows together into buckets. Then, run the math separately on each bucket!"*
sql
1234567891011
-- Calculate the total payroll for EACH department separately!
SELECT department, SUM(salary) AS total_payroll
FROM employees
GROUP BY department;

-- Output:
-- department | total_payroll
-- -----------|--------------
-- IT         | 250000.00
-- Sales      | 420000.00
-- HR         | 110000.00

*Why this works:* Because we specified GROUP BY department, MySQL knows it is safe to print the department column alongside the aggregate function, because the math is now directly tied to that specific category!

4. Grouping by Multiple Columns

You can create subgroups. Imagine you want to know how many employees exist in each department, broken down by whether they are a manager or not.
sql
123456789101112
-- Count employees by Department, AND by Manager status
SELECT department, is_manager, COUNT(*) AS total_people
FROM employees
GROUP BY department, is_manager;

-- Output:
-- department | is_manager | total_people
-- -----------|------------|-------------
-- IT         | 0          | 45
-- IT         | 1          | 5
-- Sales      | 0          | 120
-- Sales      | 1          | 10

5. Filtering Groups: The HAVING Clause

Now, the boss says: *"Give me the payroll per department, but ONLY show me departments whose payroll is greater than $300,000."* Your instinct is to use the WHERE clause: WHERE SUM(salary) > 300000 THIS WILL CRASH. The WHERE clause runs *before* the math happens. It filters raw, individual rows. It cannot filter aggregate totals because the totals haven't been calculated yet! To filter aggregated groups, you MUST use the HAVING clause.
sql
12345678910
-- Correct way to filter aggregated data
SELECT department, SUM(salary) AS total_payroll
FROM employees
GROUP BY department
HAVING SUM(salary) > 300000;

-- Output:
-- department | total_payroll
-- -----------|--------------
-- Sales      | 420000.00

6. Mixing WHERE and HAVING

You can use both in the same query! They just execute at different times.
  • WHERE filters the raw rows *before* grouping.
  • HAVING filters the math totals *after* grouping.
sql
123456789
-- Find total payroll by department, 
-- but ONLY calculate it for employees hired after 2020 (WHERE),
-- and ONLY show departments where the total exceeds $300k (HAVING).

SELECT department, SUM(salary) AS total_payroll
FROM employees
WHERE hire_date > '2020-01-01'
GROUP BY department
HAVING SUM(salary) > 300000;

7. Mini Project: E-commerce Reporting

You need to generate a report showing the best-selling product categories, but only for categories that have sold more than 50 total items.
sql
12345
SELECT category, SUM(quantity_sold) AS total_items_sold
FROM order_details
GROUP BY category
HAVING SUM(quantity_sold) > 50
ORDER BY total_items_sold DESC;

*(Notice the order: SELECT -> FROM -> WHERE -> GROUP BY -> HAVING -> ORDER BY).*

8. Common Mistakes

  • Selecting Ungrouped Columns: If you write SELECT department, firstname, SUM(salary) GROUP BY department;, it will fail. You grouped by department, but asked for firstname. MySQL has 50 different first names in the IT bucket; it doesn't know which one to print. Rule: Every column in your SELECT must either be inside an Aggregate Function, or listed in the GROUP BY clause.

9. Best Practices

  • Performance: GROUP BY requires MySQL to create temporary tables in memory to sort and calculate the buckets. On massive datasets, this can be slow. Always try to filter out as much data as possible using a WHERE clause *before* the GROUP BY executes!

10. Exercises

  1. 1. Write a query to find the average (AVG) salary for each department.
  1. 2. Why does the query SELECT department, COUNT(*) FROM employees WHERE COUNT(*) > 10 GROUP BY department; result in a syntax error?

11. MCQ Quiz with Answers

Question 1

What is the explicit purpose of the HAVING clause?

Question 2

If you want to list the total number of users registered in each country, what clause MUST you include to prevent a syntax error?

12. Interview Questions

  • Q: Explain the strict difference in execution order and purpose between the WHERE clause and the HAVING clause.
  • Q: What is the SQL rule regarding selecting non-aggregated columns alongside an aggregate function? How does GROUP BY resolve this?

13. FAQs

Q: Can I use Aliases in the HAVING clause? A: Yes! In modern MySQL, you can write HAVING total_payroll > 300000 instead of repeating HAVING SUM(salary) > 300000. However, you CANNOT use aliases in the WHERE clause, because WHERE executes before the alias is created!

14. Summary

The GROUP BY and HAVING clauses transform raw databases into sophisticated analytical engines. By bucketing rows into distinct categories and filtering the resulting mathematics, you can generate the exact high-level reports required by business intelligence tools and dashboards.

15. Next Chapter Recommendation

Up to this point, we have operated entirely within a single table. But true relational databases rely on linking dozens of tables together. Before we can link tables, we must understand how to uniquely identify every row flawlessly. In Chapter 12: Primary Keys and Auto Increment, we master the cornerstone of data integrity.

Finish this Chapter

Save your progress on your learning path and prepare for coding interview challenges.

Discussion

Join the discussion

Log in or create a free account to participate.

Sort: ·