DEV Community

Jacob Pelletier for Yankeedom.io

MongoDB Guide | Aggregation

by Jacob Pelletier
Contact me for suggestions, comments, or just to say hello at jacob@yankeedo.io! 👋

Follow me on my journey in learning MongoDB.

I highly recommend checking out Mongo University!


What will we be covering in this guide

  1. What aggregation is.
  2. What the stages are.

Introduction.

What is aggregation?
Aggregation is the analysis and summarization of data. An aggregation runs in one or more stages that process documents. Documents can be filtered, sorted, grouped, and transformed as they move through the pipeline, and the output of one stage becomes the input of the next. Aggregations do not modify the source collection unless the pipeline ends with a stage that writes results to a collection, such as $out.

db.collection.aggregate([
  { $stage_name: {<expression>} },
  { $stage_name: {<expression>} }
])

Each stage is a single operation on the data. Common operation stages are:

  1. $match - filters for data that match criteria.
  2. $group - groups documents based on criteria.
  3. $sort - puts documents in a specified order.

Each stage has its own syntax to carry out its operations.
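
To make the hand-off between stages concrete, here is a minimal sketch in plain JavaScript (it does not use the MongoDB driver). The sample listings array and the runPipeline helper are hypothetical. Each "stage" is modeled as a function that takes an array of documents and returns a new one, which mirrors how the output of one stage becomes the input of the next while the source data stays untouched.

```javascript
// Hypothetical sample documents standing in for a collection.
const listings = [
  { name: "Loft", state: "CT", beds: 2 },
  { name: "Cabin", state: "VT", beds: 6 },
  { name: "Cottage", state: "CT", beds: 5 },
];

// Each stage is modeled as a function: documents in, new documents out.
const matchCT = (docs) => docs.filter((d) => d.state === "CT");
const sortByBedsDesc = (docs) => [...docs].sort((a, b) => b.beds - a.beds);

// Hypothetical helper: feed the output of each stage into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const pipelineResult = runPipeline(listings, [matchCT, sortByBedsDesc]);
console.log(pipelineResult.map((d) => d.name)); // CT listings, most beds first
console.log(listings.length); // the source array is unchanged
```

Because every function returns a new array, swapping, adding, or removing stages never changes the original documents.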

A field name prefixed with a dollar sign inside an expression is called a "field path". It lets us refer to the value stored in that field.

For example, the code below reads the values of the first_name and last_name fields and stores their concatenation in a new field, defaultUsername.

{
  $set: {
    defaultUsername: {
      $concat: ["$first_name", "$last_name"]
    }
  }
}
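
As a rough illustration of what a field path does, the plain JavaScript sketch below resolves "$first_name" and "$last_name" against a document and joins the results. The sample document and the resolveOperand and concatOperands helpers are hypothetical, and the sketch ignores nested paths and null handling, so treat it as a simplified model rather than how the server evaluates expressions.

```javascript
// Hypothetical document; the field names follow the example above.
const user = { first_name: "Ada", last_name: "Lovelace" };

// A string starting with "$" is treated as a field path;
// anything else is used as a literal value.
function resolveOperand(doc, operand) {
  if (typeof operand === "string" && operand.startsWith("$")) {
    return doc[operand.slice(1)];
  }
  return operand;
}

// Simplified stand-in for $concat: resolve every operand, then join them.
function concatOperands(doc, operands) {
  return operands.map((op) => resolveOperand(doc, op)).join("");
}

// Simplified stand-in for $set: copy the document and add the new field.
const withUsername = {
  ...user,
  defaultUsername: concatOperands(user, ["$first_name", "$last_name"]),
};
console.log(withUsername.defaultUsername);
```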

An aggregation is a collection and summary of data. A stage is a built-in method that operates on the data without altering it. An aggregation pipeline is a series of stages applied to the data in order.

db.collection.aggregate([
    {
        $stage1: { <expression1> }
    },
    {
        $stage2: { <expression2> }
    }
])

Stage Syntax:
{$match: {<query>}}

Query Syntax:
{$expr: {<expression>}}

Method Syntax:
db.collection.aggregate([])
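
One place the $expr query syntax is useful is comparing two fields of the same document inside $match. The sketch below shows the idea in plain JavaScript. The sample documents and the beds and bedrooms field names are assumptions made for illustration, and the filter only imitates what the stage in the comment would select.

```javascript
// Hypothetical sample documents.
const rooms = [
  { name: "Loft", beds: 3, bedrooms: 1 },
  { name: "Cabin", beds: 2, bedrooms: 2 },
];

// The stage this sketch mirrors:
// { $match: { $expr: { $gt: ["$beds", "$bedrooms"] } } }
const moreBedsThanBedrooms = rooms.filter((doc) => doc.beds > doc.bedrooms);
console.log(moreBedsThanBedrooms.map((doc) => doc.name));
```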


Match and Group Stages

These are very common aggregation stages.

  1. $match: filters documents that match some criteria.
    a. takes one argument, a query document.
    b. uses the same query syntax as the find command.
    c. example: {$match: {"state": "CT"}}.
    d. place it as early as possible in the pipeline so it can use indexes.
    e. it reduces the number of documents, and thus the amount of processing required by later stages.

$match airbnb example
Find all airbnbs with 5 or more beds.

{ $match: { "beds": { $gte: 5 } } }
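
To show what the filter does to a set of documents, here is a small plain JavaScript sketch. The sample documents and the matchStage helper are hypothetical, and the helper understands only equality and $gte, so it is a simplified model of $match rather than a reimplementation.

```javascript
// Hypothetical sample documents; only the beds field matters here.
const airbnbs = [
  { name: "Studio", beds: 1 },
  { name: "Farmhouse", beds: 5 },
  { name: "Villa", beds: 8 },
];

// Simplified stand-in for a $match stage that supports equality and $gte only.
function matchStage(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, condition]) => {
      if (condition !== null && typeof condition === "object" && "$gte" in condition) {
        return doc[field] >= condition.$gte;
      }
      return doc[field] === condition; // plain equality, e.g. { state: "CT" }
    })
  );
}

const bigListings = matchStage(airbnbs, { beds: { $gte: 5 } });
console.log(bigListings.map((doc) => doc.name));
```

Only two of the three sample documents reach the next stage, which is why an early $match reduces the work done later in the pipeline.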

  2. $group: creates a single document for each distinct value of the group key.
    a. groups documents by a group key.
    b. the output is one document for each unique value of the group key.
    c. requires a group key, specified by _id, which is the field to group by.
    d. it may also include one or more fields with an accumulator.
    e. the accumulator specifies how to aggregate the information for each of the groups.

$group generic example:

{
  $group: {
    _id: <expression>, // group key
    <field>: { <accumulator>: <expression> }
  }
}

$group airbnb example
Find all airbnbs with 5 or more beds and return the number of listings for each bed count.

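db.listingsAndReviews.aggregate([
    // keep listings with 5 or more beds
    { $match: { "beds": { $gte: 5 } } },
    // one document per distinct beds value, with the number of listings in it
    // (the $count accumulator needs MongoDB 5.0+; { $sum: 1 } works on older versions)
    { $group: { _id: "$beds", count: { $count: {} } } },
    // largest groups first
    { $sort: { count: -1 } }
])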
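
The grouping step can be pictured as building a tally keyed by the group key. The plain JavaScript sketch below walks through the same three steps (filter, group with a count, sort) on a hypothetical array of documents. It models the behavior only and is not how MongoDB executes the pipeline.

```javascript
// Hypothetical sample documents.
const sampleListings = [
  { beds: 5 }, { beds: 6 }, { beds: 5 }, { beds: 2 },
  { beds: 5 }, { beds: 6 }, { beds: 9 },
];

// $match: keep listings with 5 or more beds.
const matched = sampleListings.filter((doc) => doc.beds >= 5);

// $group: one output document per distinct beds value, with a running count.
const countsByBeds = new Map();
for (const doc of matched) {
  countsByBeds.set(doc.beds, (countsByBeds.get(doc.beds) ?? 0) + 1);
}
const grouped = [...countsByBeds].map(([beds, count]) => ({ _id: beds, count }));

// $sort: descending by count.
grouped.sort((a, b) => b.count - a.count);
console.log(grouped);
```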


Sort and Limit Stages

I had already used sort earlier to make the results more readable.

  1. $sort: as demonstrated in the $group example, the $sort stage sorts documents by some field, such as the count field above.
    a. 1 represents ascending order, while -1 represents descending order.

  2. $limit: limits the number of documents that are passed to the next stage.
    a. only takes a positive integer

db.listingsAndReviews.aggregate([

    { $match: { "beds": { $gte: 5 } } },

    { $group: { _id: "$beds", count: { $count: {} } } },

    { $sort: { count: -1 } },    

    { $limit: 3 }

])

As you can see, the order of the stages matters. If we were to limit before we sorted, then we would get different results.
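
A quick way to see why the order matters is to apply the two steps in both orders to the same input. The sketch below uses plain JavaScript with hypothetical grouped documents, and the sortByCountDesc and limitTo helpers are illustrative stand-ins for the $sort and $limit stages.

```javascript
// Hypothetical grouped output, deliberately not ordered by count.
const bedCounts = [
  { _id: 9, count: 1 },
  { _id: 5, count: 3 },
  { _id: 7, count: 1 },
  { _id: 6, count: 2 },
  { _id: 8, count: 4 },
];

const sortByCountDesc = (docs) => [...docs].sort((a, b) => b.count - a.count);
const limitTo = (n) => (docs) => docs.slice(0, n);

// $sort then $limit: the three documents with the largest counts.
const sortThenLimit = limitTo(3)(sortByCountDesc(bedCounts));

// $limit then $sort: whichever three documents arrive first, then sorted.
const limitThenSort = sortByCountDesc(limitTo(3)(bedCounts));

console.log(sortThenLimit.map((d) => d._id));
console.log(limitThenSort.map((d) => d._id));
```

In the second ordering the document with the highest count never reaches the sort, because the limit has already discarded it.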


Project, Count, and Set Stages

  1. $project: determines the shape of the output documents.
    a. specifies the fields to return from the aggregation.
    b. similar to the projection in a find operation.
    c. should be used last if possible.
    d. chooses which fields to keep by either inclusion or exclusion.
    e. set the field to 1 to include it, <field> : 1
    f. set the field to 0 to exclude it, <field> : 0
    g. a new field can be added, or an existing field given a new value, with <field> : <new value>

Notice the differences in the project stages below.

[Image: Aggregation #1]

[Image: Aggregation #2]

[Image: Aggregation #3]
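
The three behaviors of $project listed above (inclusion, exclusion, and a new computed value) can be sketched in plain JavaScript as follows. The sample document and the pricePerBed field are hypothetical, and the comments show the kind of $project specification each step is meant to imitate.

```javascript
// Hypothetical sample document.
const listing = { _id: 1, name: "Villa", beds: 8, price: 400 };

// Inclusion, like { name: 1, beds: 1 }: keep only the listed fields.
// (In MongoDB, _id is also kept unless it is explicitly set to 0.)
const included = { _id: listing._id, name: listing.name, beds: listing.beds };

// Exclusion, like { price: 0 }: keep everything except the listed fields.
const { price, ...excluded } = listing;

// New value, like { name: 1, pricePerBed: { $divide: ["$price", "$beds"] } }.
const computed = {
  _id: listing._id,
  name: listing.name,
  pricePerBed: listing.price / listing.beds,
};

console.log(included, excluded, computed);
```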

  2. $count: counts the documents that reach this stage and outputs a single document holding the total in a field you name.

For example, if we wanted to count how many different bed counts above 5 there are, we could perform the following aggregation.
[Image: aggregation using the $count stage]
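
As a rough model of the $count stage, the plain JavaScript sketch below replaces a set of incoming documents with one document that holds their number. The input documents, the countStage helper, and the bedOptions field name are all hypothetical.

```javascript
// Hypothetical output of an earlier $group stage: one document per bed count.
const groupedBeds = [
  { _id: 5, count: 3 },
  { _id: 6, count: 2 },
  { _id: 9, count: 1 },
];

// Simplified stand-in for { $count: "bedOptions" }: a single document whose
// field (named by the caller) holds the number of incoming documents.
// With no incoming documents, nothing is returned.
function countStage(docs, fieldName) {
  return docs.length > 0 ? [{ [fieldName]: docs.length }] : [];
}

const counted = countStage(groupedBeds, "bedOptions");
console.log(counted);
```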

  3. $set: adds new fields to the documents in the pipeline or overwrites the values of existing fields.

In this example, we rename beds to _id.
[Image: pipeline with the $set stage]

The same pipeline without the $set stage.
[Image: pipeline without the $set stage]
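
To illustrate both uses of $set, adding a field and overwriting one, here is a small plain JavaScript sketch. The sample documents and the isLarge field are hypothetical, and the comment shows the kind of stage the mapping is meant to imitate.

```javascript
// Hypothetical sample documents.
const homes = [
  { name: "villa", beds: 8 },
  { name: "studio", beds: 1 },
];

// Mirrors:
// { $set: { name: { $toUpper: "$name" }, isLarge: { $gte: ["$beds", 5] } } }
const updatedHomes = homes.map((doc) => ({
  ...doc,
  name: doc.name.toUpperCase(), // existing field gets a new value
  isLarge: doc.beds >= 5,       // new field is added
}));

console.log(updatedHomes);
```

Fields that the stage does not mention, such as beds here, pass through unchanged.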


Out Stage

  1. $out: writes the documents returned by an aggregation pipeline into a collection.
    a. must be the last stage.
    b. creates a new collection if it does not already exist.
    c. if the collection exists, $out replaces the existing collection with the new data.

Store counts of airbnbs with greater than 5 beds.
[Image: aggregation ending with an $out stage]
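
The create-or-replace behavior of $out can be modeled with an in-memory map of collections. The sketch below is plain JavaScript. The database map, the collection names, and the outStage helper are hypothetical and only imitate the behavior described above.

```javascript
// Hypothetical in-memory "database": collection name -> array of documents.
const database = new Map([
  ["bedCounts", [{ _id: 1, count: 99 }]], // an existing collection with old data
]);

// Simplified stand-in for { $out: "<collection>" }: create the collection if it
// is missing, otherwise replace its contents entirely.
function outStage(db, collectionName, docs) {
  db.set(collectionName, docs.map((doc) => ({ ...doc })));
}

// Hypothetical pipeline results to store.
const results = [{ _id: 5, count: 3 }, { _id: 6, count: 2 }];

outStage(database, "bedCounts", results);     // replaces the old data
outStage(database, "bedCountsCopy", results); // creates a new collection
console.log(database.get("bedCounts"));
```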
