CHAPTER 15 Beginner

Aggregation Framework Basics

Updated: May 16, 2026

15 min read

1. Introduction

The find() method is great for retrieving documents, but it is not built for analyzing them. If you want to calculate the total sales revenue for a specific month, find() can only hand you the matching documents; you would have to add them up yourself in application code. In SQL, you would use GROUP BY and SUM(). In MongoDB, you use the Aggregation Framework. In this chapter, we will learn how to build data "Pipelines" that filter, group, and summarize large collections directly on the database server.

2. Learning Objectives

By the end of this chapter, you will be able to:

Understand the concept of an Aggregation Pipeline.

Use the aggregate() method.

Filter data flowing into the pipeline using the $match stage.

Crunch data (Sum, Average, Count) using the $group stage.

Shape the final output using the $project stage.

3. What is an Aggregation Pipeline?

Think of the Aggregation Framework as an assembly line in a factory. You take a bucket of 1,000 raw documents and push them onto a conveyor belt (the Pipeline). The documents pass through different "Stages":

1. Stage 1: Filter out the bad ones. (Now we have 500 documents).

2. Stage 2: Group them by category and calculate the total price. (Now we have 5 summary documents).

3. Stage 3: Sort them from highest to lowest.

4. Output: The final analytics report!
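
To make the conveyor-belt picture concrete, here is a minimal sketch in plain JavaScript (no MongoDB involved) that treats each stage as a function from one array of documents to the next. The sample documents, the `runPipeline` helper, and the three stage functions are all hypothetical names made up for illustration; the real server does not work on in-memory arrays like this.

```javascript
// Hypothetical sample data: five raw "documents".
const rawDocs = [
  { category: "Books", price: 12, defective: false },
  { category: "Books", price: 8, defective: true },
  { category: "Toys", price: 30, defective: false },
  { category: "Books", price: 20, defective: false },
  { category: "Toys", price: 5, defective: false },
];

// Stage 1: filter out the bad ones.
const filterStage = (docs) => docs.filter((d) => !d.defective);

// Stage 2: group by category and total the price.
const groupStage = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.category, (totals.get(d.category) ?? 0) + d.price);
  }
  return [...totals].map(([category, total]) => ({ _id: category, total }));
};

// Stage 3: sort from highest to lowest.
const sortStage = (docs) => [...docs].sort((a, b) => b.total - a.total);

// The "pipeline" is an ordered list of stages; each stage's output feeds the next.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const report = runPipeline(rawDocs, [filterStage, groupStage, sortStage]);
console.log(report);
```

The key idea carried over to MongoDB is the ordering: every stage only ever sees what the previous stage let through.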

4. The `$match` Stage (The Filter)

The pipeline is executed using the aggregate() method, which accepts an array [] of stages. The $match stage takes the same query syntax as find(). It is usually the first stage, so that unwanted documents are filtered out before the heavy math begins.

javascript


// Pipeline Stage 1: Only allow "Completed" orders onto the conveyor belt
db.orders.aggregate([
    { $match: { status: "Completed" } }
])
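
Because $match takes an ordinary query document, comparison operators such as $gte work inside it too. The sketch below assumes the orders also carry a numeric `total` field and uses a made-up variable name, `matchPipeline`; neither comes from the example above. The `typeof db` guard is only there so the same file runs in mongosh and is harmless under plain Node.

```javascript
// A pipeline is just an array of plain objects, so it can be built and inspected first.
const matchPipeline = [
  {
    $match: {
      status: "Completed",  // equality test, as in the example above
      total: { $gte: 100 }, // assumed field: keep orders worth 100 or more
    },
  },
];

// `db` only exists inside mongosh; under plain Node this branch is skipped.
if (typeof db !== "undefined") {
  printjson(db.orders.aggregate(matchPipeline).toArray());
}
```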

5. The `$group` Stage (The Calculator)

This is where the magic happens. $group acts like SQL's GROUP BY. It groups the incoming documents by the expression given in the mandatory _id key and runs accumulator operators ($sum, $avg, $max, $min) over each group, producing one output document per group.

javascript


// Calculate Total Revenue grouped by User
db.orders.aggregate([
    // Stage 1: Group by the user_id (You must prefix the field with a '$' inside strings!)
    { $group: { 
        _id: "$user_id",               // Group by User
        total_spent: { $sum: "$price" } // Calculate the SUM of the price field
    }}
])

*(Notice the syntax: to reference the value of an existing field inside an aggregation expression, write the field name as a string prefixed with a dollar sign: "$price".)*
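
If the $group stage feels abstract, it may help to see roughly the same computation written by hand. This is a plain JavaScript sketch over a hypothetical in-memory `orders` array, with a made-up `groupByUser` function; it only imitates what the stage computes and says nothing about how the server implements it.

```javascript
// Hypothetical orders; in MongoDB these would live in db.orders.
const orders = [
  { user_id: "u1", price: 10 },
  { user_id: "u2", price: 25 },
  { user_id: "u1", price: 5 },
];

// Rough equivalent of:
//   { $group: { _id: "$user_id", total_spent: { $sum: "$price" } } }
function groupByUser(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc.user_id; // "$user_id" means: the VALUE of the user_id field
    totals.set(key, (totals.get(key) ?? 0) + doc.price); // $sum: "$price"
  }
  // One output document per distinct key, with the key stored in _id.
  return [...totals].map(([_id, total_spent]) => ({ _id, total_spent }));
}

const perUser = groupByUser(orders);
console.log(perUser);
```

One difference worth knowing: this sketch returns groups in first-seen order, whereas MongoDB does not promise any particular order for $group output unless a $sort stage follows.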

6. Chaining the Pipeline ($match + $group)

Let's combine them! We want to calculate the total sales revenue generated in the year 2023, grouped by Product Category.

javascript


db.orders.aggregate([
    // Stage 1: Filter. Only allow 2023 orders onto the belt.
    { $match: { year: 2023 } },
    
    // Stage 2: Math. Group the 2023 orders by Category, and sum their totals!
    { $group: {
        _id: "$category",
        annual_revenue: { $sum: "$total" },
        items_sold: { $sum: 1 } // Adds 1 per document, so despite the name this counts orders, not units sold
    }}
])
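
The difference between `{ $sum: "$total" }` and `{ $sum: 1 }` is easy to miss, so here is a hand-written imitation of the two stages in plain JavaScript. The `sampleOrders` array and its values are invented for illustration, and the loop is only a sketch of what the pipeline computes.

```javascript
// Hypothetical orders spanning two years.
const sampleOrders = [
  { year: 2023, category: "Books", total: 40 },
  { year: 2022, category: "Books", total: 99 },
  { year: 2023, category: "Toys", total: 15 },
  { year: 2023, category: "Books", total: 10 },
];

// Stage 1 equivalent: { $match: { year: 2023 } }
const matched = sampleOrders.filter((o) => o.year === 2023);

// Stage 2 equivalent: group by category with two accumulators.
const byCategory = {};
for (const o of matched) {
  if (!byCategory[o.category]) {
    byCategory[o.category] = { _id: o.category, annual_revenue: 0, items_sold: 0 };
  }
  byCategory[o.category].annual_revenue += o.total; // { $sum: "$total" } adds the field's value
  byCategory[o.category].items_sold += 1;           // { $sum: 1 } adds a constant 1 per document
}

const summary = Object.values(byCategory);
console.log(summary);
```

The 2022 order never reaches the grouping loop, which is exactly what placing $match first buys you.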

7. The `$project` Stage (The Formatter)

The final stage of a pipeline is often $project. It works like projection in find(): you choose which fields to include or exclude, and you can also compute new ones. Use it to rename fields, concatenate strings, or hide the _id field before sending the final report to the frontend dashboard.

javascript


db.users.aggregate([
    // Stage 1: Re-shape the output document
    { $project: {
        _id: 0, // Hide the ID
        full_name: { $concat: ["$first_name", " ", "$last_name"] }, // Combine strings!
        is_adult: { $gte: ["$age", 18] } // Returns a boolean True/False!
    }}
])
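
To see what shape of document comes out of that stage, here is a rough plain JavaScript imitation. The two `users` documents are invented for illustration, and `Array.prototype.map` stands in for $project only loosely: both produce exactly one output document per input document.

```javascript
// Hypothetical input documents.
const users = [
  { _id: 1, first_name: "Ada", last_name: "Lovelace", age: 36 },
  { _id: 2, first_name: "Tim", last_name: "Young", age: 12 },
];

// Rough equivalent of the $project stage above.
const projected = users.map((u) => ({
  // _id: 0 -> the _id field is simply left out of the output
  full_name: `${u.first_name} ${u.last_name}`, // $concat: ["$first_name", " ", "$last_name"]
  is_adult: u.age >= 18,                        // $gte: ["$age", 18]
}));

console.log(projected);
```

The imitation is not perfect: if a document has no last_name, the JavaScript template string yields the text "undefined", while MongoDB's $concat returns null whenever any argument is null or missing.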

8. Mini Project: E-Commerce Analytics Dashboard

The CEO wants a leaderboard showing the Top 3 highest grossing products, but only for "Electronics".

javascript


db.sales.aggregate([
    // 1. Filter out non-electronics
    { $match: { category: "Electronics" } },
    
    // 2. Group by product name and sum the revenue
    { $group: {
        _id: "$product_name",
        total_revenue: { $sum: "$price" }
    }},
    
    // 3. Sort by total_revenue descending (Highest first)
    { $sort: { total_revenue: -1 } },
    
    // 4. Limit to the Top 3
    { $limit: 3 }
])
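
Since a pipeline is just an array of objects, the leaderboard can be wrapped in a small function and reused for other categories. The helper name `topProductsPipeline` and its parameters are hypothetical; this is one possible way to organize the code, not a MongoDB feature.

```javascript
// Hypothetical helper: builds the leaderboard pipeline for any category and size.
function topProductsPipeline(category, limit = 3) {
  return [
    { $match: { category } },                                                 // 1. keep one category
    { $group: { _id: "$product_name", total_revenue: { $sum: "$price" } } },  // 2. revenue per product
    { $sort: { total_revenue: -1 } },                                         // 3. highest first
    { $limit: limit },                                                        // 4. top N only
  ];
}

const electronicsTop3 = topProductsPipeline("Electronics");

// `db` only exists inside mongosh; under plain Node this branch is skipped.
if (typeof db !== "undefined") {
  printjson(db.sales.aggregate(electronicsTop3).toArray());
}
```

Calling `topProductsPipeline("Books", 5)` would give the CEO a Top 5 for books without copying the pipeline.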

9. Common Mistakes

Forgetting the $ Prefix in Grouping: If you write { _id: "user_id" }, MongoDB will literally group everyone under the text string "user_id". You MUST write { _id: "$user_id" } to tell MongoDB to evaluate the *value* of the user_id field.

Putting $group before $match: If you group 5 million documents and only THEN filter on a field that was available from the start, you waste server CPU and memory. Always $match on raw fields as early as possible to reduce the pipeline volume. (A $match placed after $group is still the right tool when the filter depends on a computed value, such as a total.)
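
Both mistakes can be reproduced in miniature with plain JavaScript. Everything here is hypothetical (the `docs` array, the `sumByUser` function, the `groupedCount` counter), and counting loop iterations is only a stand-in for real server cost.

```javascript
// Hypothetical orders.
const docs = [
  { status: "Completed", user_id: "u1", price: 10 },
  { status: "Cancelled", user_id: "u1", price: 99 },
  { status: "Completed", user_id: "u2", price: 7 },
  { status: "Cancelled", user_id: "u3", price: 50 },
];

// Mistake 1 in miniature: a constant key (like _id: "user_id" without the $)
// puts every document into one single group.
const oneBigGroup = {};
for (const d of docs) {
  const key = "user_id"; // literal string, NOT d.user_id
  oneBigGroup[key] = (oneBigGroup[key] ?? 0) + d.price;
}
console.log(oneBigGroup);

// Mistake 2 in miniature: count how many documents the grouping step touches.
let groupedCount = 0;
function sumByUser(input) {
  const out = {};
  for (const d of input) {
    groupedCount += 1;
    out[d.user_id] = (out[d.user_id] ?? 0) + d.price;
  }
  return out;
}

// Filtering first means only the completed orders reach the grouping step.
const totals = sumByUser(docs.filter((d) => d.status === "Completed"));
console.log(totals, groupedCount);
```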

10. Best Practices

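Indexes work on $match: The $match stage can use an index when it sits at the start of the pipeline, just like a find() query. The optimizer can sometimes move a later $match forward for you, but do not rely on it; put the filter first yourself.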
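
One way to check this on your own data is to create an index on the filtered field and ask for the query plan. The sketch below assumes an orders collection with a status field, as in the earlier examples; the variable names `indexSpec` and `indexedPipeline` are made up. The database calls only run inside mongosh, and the guard makes the file harmless under plain Node.

```javascript
// Assumed index on the field used by the leading $match.
const indexSpec = { status: 1 };

const indexedPipeline = [
  { $match: { status: "Completed" } }, // first stage, so it is eligible to use the index
  { $group: { _id: "$user_id", total_spent: { $sum: "$price" } } },
];

// `db` only exists inside mongosh; under plain Node this branch is skipped.
if (typeof db !== "undefined") {
  db.orders.createIndex(indexSpec);
  // In the plan, an IXSCAN stage means the index was used; COLLSCAN means a full scan.
  printjson(db.orders.explain("executionStats").aggregate(indexedPipeline));
}
```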

11. Exercises

1. What MongoDB method is used to initiate an aggregation pipeline?

2. Inside a $group stage, what operator do you use to add up the values of a specific numeric field across all grouped documents?

12. MongoDB Challenges

Write an aggregation pipeline on the employees collection. Group them by $department, and calculate the Average ($avg) salary for each department.

javascript


db.employees.aggregate([
    { $group: {
        _id: "$department",
        average_salary: { $avg: "$salary" }
    }}
])

13. MCQ Quiz with Answers

Question 1

In the MongoDB Aggregation Framework, how is data processed?

Question 2

When defining the `_id` field inside a `$group` stage, why must you prepend the target field name with a dollar sign (e.g., `"$category"`)?

14. Interview Questions

Q: Explain the architectural concept of an Aggregation Pipeline. Why is the order of stages (specifically putting $match before $group and $sort) critical for database performance?

Q: Compare and contrast the purpose of the $project stage in an aggregation pipeline with the standard Projection argument used in a find() query.

15. FAQs

Q: Can I save the results of an aggregation directly into a new collection?

A: Yes! You can append the $out stage as the very last step in your pipeline. {$out: "annualreport"} writes the final documents into the annualreport collection. Be careful: if a collection with that name already exists, $out replaces its contents.
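
A short sketch of a pipeline that ends in $out is shown below. It reuses the year, category, and total fields from the chaining example, and the collection name from the answer; the `outIsLast` check is a hypothetical helper added only to make the "last stage" rule visible, not something MongoDB provides.

```javascript
const reportPipeline = [
  { $match: { year: 2023 } },
  { $group: { _id: "$category", annual_revenue: { $sum: "$total" } } },
  { $out: "annualreport" }, // must be the final stage
];

// Hypothetical sanity check: $out may appear only as the last stage.
function outIsLast(pipeline) {
  const idx = pipeline.findIndex((stage) => "$out" in stage);
  return idx === -1 || idx === pipeline.length - 1;
}

console.log(outIsLast(reportPipeline)); // true

// `db` only exists inside mongosh; under plain Node this branch is skipped.
if (typeof db !== "undefined") {
  db.orders.aggregate(reportPipeline).toArray(); // returns no documents; results go to the collection
  printjson(db.annualreport.find().toArray());
}
```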

16. Summary

The Aggregation Framework is what separates beginners from Senior Database Engineers. By constructing logical pipelines using $match to filter, $group to crunch math, and $project to format the final JSON, you can build enterprise-grade analytics dashboards directly on the database engine.

17. Next Chapter Recommendation

We know how to group data within a single collection. But what if we need to combine data from TWO different collections? In Chapter 16: Advanced Aggregation Pipelines, we will master the $lookup stage, MongoDB's counterpart to a SQL LEFT OUTER JOIN.
