DEV Community

Rishabh Rawat
Rishabh Rawat

Posted on • Updated on • Originally published at rrawat.com

How to use project in MongoDB aggregation pipeline

If you have ever worked with MongoDB, chances are you've heard of or even come across the aggregation pipeline that comes with it. Aggregation pipeline in MongoDB is a very powerful feature to get insights out of your data that is not possible with simple find operation.

In this article, we're going to take at the project stage. We do have the projection support in find queries as well but it cannot achieve what project stage in the aggregation pipeline can. Today we're going to explore some of the common use-cases for the project stage:

  1. Projecting all fields
  2. Working with arrays
  3. Naming field in project stage
  4. Filtering fields
  5. Literal fields
  6. Conditionally removing fields

Establishing the data

Before we jump into the aggregation pipeline and the project stage, we would need some data to work with. I'm taking an example Restaurants collection for understanding the concept here. I'll be linking to a MongoDB playground for each query so that you can play around and see how it affects the output.

Here's the Restaurants collection with only 5 documents containing random data:

{
    name: 'Thai Restaurant',
    location: 'London',
    notOpenOnDays: ['Monday', 'Thursday'],
    entryFee: '$5',
    rating: '3',
},
{
    name: 'Indian.com',
    location: 'Broughton',
    notOpenOnDays: ['Wednesday'],
    entryFee: '$0',
    rating: '4',
},
{
    name: '23rd Street Pizza',
    location: 'Fallowfield',
    notOpenOnDays: ['Sunday'],
    entryFee: '$2',
    rating: '4',
},
{
    name: '7 STAR Pizza',
    location: 'Newcastle',
    notOpenOnDays: ['Saturday', 'Sunday'],
    entryFee: '$10',
    rating: '2',
},
{
    name: 'A Cake A Shake',
    location: 'London',
    notOpenOnDays: ['Monday'],
    entryFee: '$1',
    rating: '4',
},
Enter fullscreen mode Exit fullscreen mode

Projecting all fields

In the MongoDB aggregation pipeline, a pipeline stage has access to only those fields which are included in the previous stage.

So if you want to include all fields, you can include everything in the project stage so that it gets passed to the next one. Any field missed is not going to make it to the next stage of the pipeline.

MongoDB gives us a way to state which fields we want to include and exclude. So if you want to include only the name, the project stage would be:

{
    $project: { name: 1 }
}
Enter fullscreen mode Exit fullscreen mode

You can also put true or any number greater than 1 to show inclusion of a field but let's just stick to 1 for simplicity & conciseness.

Similarly, if you want to exclude the field you'd do the opposite:

{
    $project: { name: 0 }
}
Enter fullscreen mode Exit fullscreen mode

And, if you want to include everything in the Restaurants collection, here's what the project stage would look like:

{
    $project: {
        name: 1,
        location: 1,
        notOpenOnDays: 1,
        entryFee: 1,
        rating: 1,
    }
}
Enter fullscreen mode Exit fullscreen mode

NOTE: Including a field in the project stage implicitly means excluding all the other fields and vice-versa.

Working with arrays

Creating arrays

Creating a new array field is just as creating any non-array field, the only difference is that the field value is wrapped in [] to make it an array.

We can do it like this:

{
    $project: {
        review: [ "$rating" ]
    }
}
Enter fullscreen mode Exit fullscreen mode

Now review field will be an array with rating values in it.

Extracting element(s) at specific index

To get an array element at a specific index, we use use $arrayElemAt like shown below,

{
    $project: {
        fieldName: { $arrayElemAt: ["$arrayFieldName", index] }
    }
}
Enter fullscreen mode Exit fullscreen mode

With the latest MongoDB version 4.4, we now also have dedicated operators to get first and last elements in an array. The operators are $first and $last.

To get the first element,

{
    $project: {
        review: [ "$rating" ]
    }
},
{
    $project: {
        newReview: { $first: "$review" },
        _id: 0
    }
}
Enter fullscreen mode Exit fullscreen mode

We are putting the restaurant rating in an array named review and in the next project stage we are extracting the first element of the array created in the stage before. This gives us:

{ "newReview" : 3 }
{ "newReview" : 4 }
{ "newReview" : 4 }
{ "newReview" : 2 }
{ "newReview" : 4 }
Enter fullscreen mode Exit fullscreen mode

NOTE: We can chain as many project stages as we want in an aggregation pipeline.

To get the last element,

{
    $project: {
        review: [ "$rating" ]
    }
},
{
    $project: {
        newReview: { $last: "$review" },
        _id: 0
    }
}
Enter fullscreen mode Exit fullscreen mode

This will give the same output as there is only one element in the array currently, let's add some values to the reviews array.

{
    $project: {
        review: [ 1, 2, 3, "$rating" ]
    }
},
{
    $project: {
        newReview: { $last: "$review" },
        _id: 0
    }
}
Enter fullscreen mode Exit fullscreen mode

We would get the ratings as the output of this as we have it in the last position:

{ "newReview" : 3 }
{ "newReview" : 4 }
{ "newReview" : 4 }
{ "newReview" : 2 }
{ "newReview" : 4 }
Enter fullscreen mode Exit fullscreen mode

Filtering values

Filter lets you write custom filter which is applied to every element in the array and only the elements which pass the filter are passed as output to next stage.

  {
    $project: {
      _id: 0,
      openOnWeekend: {
        $filter: {
          input: "$notOpenOnDays",
          as: "notOpen",
          cond: {
            $and: [
              { $ne: ["$$notOpen", "Saturday"] },
              { $ne: ["$$notOpen","Sunday"] },
            ]
          }
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode

Output of the above aggregation query would only contain the restaurants which are open on weekends:

[
  {
    "openOnWeekend": [ "Monday", "Thursday" ]
  },
  {
    "openOnWeekend": [ "Wednesday" ]
  },
  {
    "openOnWeekend": []
  },
  {
    "openOnWeekend": []
  },
  {
    "openOnWeekend": [ "Monday" ]
  }
]
Enter fullscreen mode Exit fullscreen mode

NOTE: We use $ for values present in our document and $$ for fields that are introduced within a pipeline stage.

Transforming data

$map operator is just like the array.map() in javascript, it takes an input array, applies an expression on each item and return the transformed array.

Here you can directly transform data as compared to filter where you need to provide a condition to handle the data appropriately.

Let's transform the notOpenOnDays array and shorten the weekday names:

  {
    $project: {
      _id: 0,
      newPrice: {
        $map: {
          input: "$notOpenOnDays",
          as: "noOpenOn",
          in: {
            $substrCP: [ "$$noOpenOn", 0, 3 ]
          }
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode

We have the substring operator to get only first 3 characters and remove the rest:

[
  {
    "newPrice": [ "Mon", "Thu" ]
  },
  {
    "newPrice": [ "Wed" ]
  },
  {
    "newPrice": [ "Sun" ]
  },
  {
    "newPrice": [ "Sat", "Sun" ]
  },
  {
    "newPrice": [ "Mon" ]
  }
]
Enter fullscreen mode Exit fullscreen mode

There are a bunch of other array expression operators, you can refer them here.

Naming field in project stage

The aggregation pipeline provides great flexibility when it comes to naming fields. Any name you specify in the project stage acts as the "new" name for that field from that point forwards.

You give an existing field a new name which to effectively rename it:

{
    $project: {
        closedOn: "$notOpenOnDays"
    }
}
Enter fullscreen mode Exit fullscreen mode

Now the output will contain closedOn field with notOpenOnDays array as value.

Literal fields

By default, whatever you put in the project stage is treated as an expression and it is evaluated to a value. You can use $literal operator if you want MongoDB to treat any field's value like a constant or literal.

Let's say we want to find out the restaurants where the entry fee is exactly equal to "$1", we can leverage literal operator to achieve this:

  {
    "$project": {
      affordable: {
        $eq: [
          "$entryFee",
          {
            $literal: "$1"
          }
        ]
      }
    }
  },
  {
    "$project": {
      _id: 0,
      status: {
        "$cond": {
          "if": "$affordable",
          "then": "restaurant is affordable 🍕🍟",
          "else": "Not affordable 🙈"
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode

You can chain as many project stages as you need in the aggregation pipeline. Output of the above pipeline evaluates to 👇🏻

[
  {
    "status": "Not affordable 🙈"
  },
  {
    "status": "Not affordable 🙈"
  },
  {
    "status": "Not affordable 🙈"
  },
  {
    "status": "Not affordable 🙈"
  },
  {
    "status": "restaurant is affordable 🍕🍟"
  }
]
Enter fullscreen mode Exit fullscreen mode

Conditionally excluding fields

In the above example, we saw a bunch of redundant information in the output. We are only interested in knowing the affordable restaurants so it makes sense to not output the other ones.

With MongoDB 3.6, you can remove fields from the output conditionally. Let's modify our above project stage to only output affordable restaurants:

  {
    "$project": {
      name: 1,
      affordable: {
        $eq: [
          "$entryFee",
          {
            $literal: "$1"
          }
        ]
      }
    }
  },
  {
    "$project": {
      _id: 0,
      status: {
        "$cond": {
          "if": "$affordable",
          "then": "$name",
          "else": "$$REMOVE"
        }
      }
    }
  }
Enter fullscreen mode Exit fullscreen mode
  • We're now including the name in the first project stage to make sure it is available in the next stage.
  • $cond operator is updated to show the name of only the affordable restaurants.

Output:

[
  {},
  {},
  {},
  {},
  {
    "status": "A Cake A Shake"
  }
]
Enter fullscreen mode Exit fullscreen mode

Conclusion

So to summarize we have seen that the project stage in the MongoDB aggregation pipeline can do much more than just include & exclude fields. We can create arrays, modify its elements, perform filters, slice strings, conditionally remove fields and get data as per our needs.

If you think something is missing or have any questions, please post it down in the comments. I'd be happy to connect!

Top comments (0)