DEV Community

Cover image for Mastering MongoDB Aggregation Framework: Unraveling the Power of Pipelines
Riyad Hossain
Riyad Hossain

Posted on

Mastering MongoDB Aggregation Framework: Unraveling the Power of Pipelines

In the fast-paced world of software development, engineers are often tasked with solving complex problems efficiently. As a Backend Software Engineer, I face numerous challenges while building robust APIs and integrating them with powerful databases. One such scenario arises when I was assigned to create an API for displaying statistics on a service. In this blog, I will illustrate my journey to embark on a quest to implement a MongoDB Aggregation Framework for fetching complex calculated data.

Afew days ago, I was handed the responsibility of writing an API that displays statistics for a service. To obtain this data, I needed to fetch and process information stored in a MongoDB database in different collections. However, my knowledge of the MongoDB Aggregation Framework was initially limited. Nonetheless, I eagerly took on the challenge, realizing that this would be an excellent opportunity to enhance your expertise in MongoDB Aggregation.


Unveiling the Aggregation Pipeline: The MongoDB Aggregation Framework provides a powerful toolset for performing complex data operations on MongoDB collections. It allows you to process, transform, and analyze data through a series of stages known as the Aggregation Pipeline. Each stage in the pipeline takes input from the previous stage and produces output for the subsequent one. Let’s dive into the details of each stage in the Aggregation Framework implemented for your API.

Stage 1: Match Document:

{
  $match: {
    store: storeId,
  }
}
Enter fullscreen mode Exit fullscreen mode

At the beginning of the pipeline, the $match stage filters the documents based on a specific condition. In this case, it matches documents with a particular store ID.

Stage 2: Lookup OfferUsers:

       {
          $lookup: {
            from: "offerusers",
            localField: "_id",
            foreignField: "offer",
            as: "customers"
          }
        }
Enter fullscreen mode Exit fullscreen mode

In the $lookup stage, the Aggregation Framework performs a left outer join with the “offerusers” collection. It connects documents from the “Offers” collection with related documents from the “offerusers” collection using the “offer” and “_id” fields, respectively. The matching documents are stored in the “interestedCustomers” field.

Stage 3: Filter the interestedCustomer:

      {
          $addFields: {
            interestedCustomers: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.viewDate", startDate] },
                    { $lte: ["$$customer.viewDate", endDate] }
                  ]
                }
              }
            }
          }
        }
Enter fullscreen mode Exit fullscreen mode

The $addFields stage is used to filter the “interestedCustomers” based on a date range. The $filter operator selects only the customers whose “viewDate” falls within the specified start and end dates.

Stage 4: Add fields — footTraffic, sales:

{
          $addFields: {
            footTrafficStatus: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.footTrafficDate", startDate] },
                    { $lte: ["$$customer.footTrafficDate", endDate] }
                  ]
                }
              }
            },
            salesStatus: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.claimedAt", startDate] },
                    { $lte: ["$$customer.claimedAt", endDate] }
                  ]
                }
              }
            }
          }
        }
Enter fullscreen mode Exit fullscreen mode

The $addFields stage is used again to add two fields: “footTrafficStatus” and “salesStatus.” These fields contain filtered arrays of “interestedCustomers” based on specific date ranges.

Stage 5: Add fields — interestedUsers, footTraffic, salesCount:

{
          $addFields: {
            interestedUsers: {
              $size: "$interestedCustomers"
            },
            footTraffic: {
              $size: "$footTrafficStatus"
            },
            sales: {
              $size: "$salesStatus"
            },
          }
        }
Enter fullscreen mode Exit fullscreen mode

In this stage, the $addFields operator calculates the number of interested users, foot traffic, and the count of sales. The $size operator counts the number of elements in each array.

Stage 6: Add fields — amount, conversionRate, revenue:

{
          $addFields: {
            salesStatus: {
              $map: {
                input: "$salesStatus",
                as: "status",
                in: {
                  $mergeObjects: [
                    "$$status",
                    {
                      amount: {
                        $cond: {
                          if: { 
                          $eq: ['$__t', 'ParticularProduct'] 
                          },
                          then: "$discountedPrice",
                          else: "$minimumBillAmount",
                        }
                      }
                    }
                  ]
                }
              }
            },
            conversionRate: {
              $cond: {
                if: {
                  $eq: ["$interestedUsers", 0]
                },
                then: 0,
                else: {
                  $multiply: [
                    {
                      $divide: [
                        "$sales", {
                          $max: [1, "$interestedUsers"]
                        }
                      ]
                    },
                    100
                  ]
                }
              }
            },
            revenue: {
              $cond: {
                if: {
                  $eq: ["$sales", 0]
                },
                then: 0,
                else: {
                  $cond: {
                    if: { $eq: ['$__t', 'ParticularProduct'] },
                    then: { $multiply: ["$sales", "$discountedPrice"] },
                    else: { $multiply: ["$sales", "$minimumBillAmount"] }
                  }
                }
              }
            }
          }
        }
Enter fullscreen mode Exit fullscreen mode

Here, the $addFields stage adds additional fields to the “salesStatus” objects. It also calculates the conversion rate and revenue based on certain conditions.

Stage 7: Group by _id to merge salesStatus objects into an array:

{
          $group: {
            _id: null,
            interestedUsers: { $sum: "$interestedUsers" },
            footTraffic: { $sum: "$footTraffic" },
            sales: { $sum: "$sales" },
            revenue: { $sum: "$revenue" },
            conversionRate: { $avg: "$conversionRate" },
            interestedCustomers: { $push: "$interestedCustomers" },
            footTrafficStatus: { $push: "$footTrafficStatus" },
            salesStatus: { $push: "$salesStatus" }
          }
        }
Enter fullscreen mode Exit fullscreen mode

The $group stage groups the documents and merges the “salesStatus” objects into an array. It also calculates the sum of interested users, foot traffic, sales, and revenue. The conversion rate is averaged across all documents.

Stage 8: Add objects in interestedCustomers, footTraffic, salesStatus array to the root level:

{
          $addFields: {
            interestedCustomers: {
              $reduce: {
                input: "$interestedCustomers",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            },
            footTrafficStatus: {
              $reduce: {
                input: "$footTrafficStatus",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            },
            salesStatus: {
              $reduce: {
                input: "$salesStatus",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            }
          }
        }
Enter fullscreen mode Exit fullscreen mode

The $addFields stage extracts the arrays of “interestedCustomers,” “footTrafficStatus,” and “salesStatus” from the grouped documents. It flattens these arrays and brings them to the root level.

Stage 9: Add field — visitedStore, madePurchase in the interestedCustomers & madePurchase in the footTrafficStatus:

{
          $addFields: {
            interestedCustomers: {
              $map: {
                input: "$interestedCustomers",
                as: "status",
                in: {
                  $mergeObjects: [
                    "$$status",
                    {
                      visitedStore: {
                        $cond: {
                          if: { $ifNull: ["$$status.footTrafficDate", false] },
                          then: true,
                          else: false
                        }
                      },
                      madePurchase: {
                        $cond: {
                          if: { $ifNull: ["$$status.claimedAt", false] },
                          then: true,
                          else: false
                        }
                      }
                    }
                  ]
                }
              }
            },
            footTrafficStatus: {
              $map: {
                input: "$footTrafficStatus",
                as: "status",
                in: {
                  $cond: {
                    if: { $ifNull: ["$$status.claimedAt", false] },
                    then: {
                      $mergeObjects: [
                        "$$status",
                        { madePurchase: true }
                      ]
                    },
                    else: {
                      $mergeObjects: [
                        "$$status",
                        { madePurchase: false }
                      ]
                    }
                  }
                }
              }
            }
          }
        }
Enter fullscreen mode Exit fullscreen mode

In this stage, the $addFields operator adds two new fields, “visitedStore” and “madePurchase,” to the “interestedCustomers” array. Similarly, the “madePurchase” field is added to the “footTrafficStatus” array.

Stage 10: Exclude specific fields from the interestedCustomers, footTrafficStatus, and salesStatus array:

{
          $project: {
            validOffersCount: 1,
            _id: 1,
            interestedUsers: 1,
            footTraffic: 1,
            sales: 1,
            revenue: 1,
            conversionRate: 1,
            interestedCustomers: {
              $map: {
                input: "$interestedCustomers",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  timeOfClicking: "$$customer.viewDate",
                  visitedStore: "$$customer.visitedStore",
                  madePurchase: "$$customer.madePurchase",
                }
              }
            },
            footTrafficStatus: {
              $map: {
                input: "$footTrafficStatus",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  visitedTime: "$$customer.footTrafficDate",
                  madePurchase: "$$customer.madePurchase"
                }
              }
            },
            salesStatus: {
              $map: {
                input: "$salesStatus",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  purchasedAt: "$$customer.claimedAt",
                  amount: "$$customer.amount"
                }
              }
            },
          }
        }
Enter fullscreen mode Exit fullscreen mode

The $project stage excludes specific fields from the “interestedCustomers,” “footTrafficStatus” and “salesStatus” arrays, keeping only the desired fields related to user information, timestamps, and purchase details.

Full aggregation framework:

[
        // Stage 1: Match Document
        {
          $match: {
            store: storeId,
          }
        },

        // Stage 2: Lookup OfferUsers
        {
          $lookup: {
            from: "offerusers",
            localField: "_id",
            foreignField: "offer",
            as: "customers"
          }
        },

        // Stage 3: Filter the interestedCustomer
        {
          $addFields: {
            interestedCustomers: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.viewDate", startDate] },
                    { $lte: ["$$customer.viewDate", endDate] }
                  ]
                }
              }
            }
          }
        },

        // Stage 4: Add fields - footTraffic, sales
        {
          $addFields: {
            footTrafficStatus: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.footTrafficDate", startDate] },
                    { $lte: ["$$customer.footTrafficDate", endDate] }
                  ]
                }
              }
            },
            salesStatus: {
              $filter: {
                input: "$customers",
                as: "customer",
                cond: {
                  $and: [
                    { $gte: ["$$customer.claimedAt", startDate] },
                    { $lte: ["$$customer.claimedAt", endDate] }
                  ]
                }
              }
            }
          }
        },

        // Stage 5: Add fields - interestedUsers, footTraffic, salesCount
        {
          $addFields: {
            interestedUsers: {
              $size: "$interestedCustomers"
            },
            footTraffic: {
              $size: "$footTrafficStatus"
            },
            sales: {
              $size: "$salesStatus"
            },
          }
        },

        // Stage 6: Add fields - amount conversionRate, revenue
        {
          $addFields: {
            salesStatus: {
              $map: {
                input: "$salesStatus",
                as: "status",
                in: {
                  $mergeObjects: [
                    "$$status",
                    {
                      amount: {
                        $cond: {
                          if: { $eq: ['$__t', 'ParticularProduct'] },
                          then: "$discountedPrice",
                          else: "$minimumBillAmount",
                        }
                      }
                    }
                  ]
                }
              }
            },
            conversionRate: {
              $cond: {
                if: {
                  $eq: ["$interestedUsers", 0]
                },
                then: 0,
                else: {
                  $multiply: [
                    {
                      $divide: [
                        "$sales", {
                          $max: [1, "$interestedUsers"]
                        }
                      ]
                    },
                    100
                  ]
                }
              }
            },
            revenue: {
              $cond: {
                if: {
                  $eq: ["$sales", 0]
                },
                then: 0,
                else: {
                  $cond: {
                    if: { $eq: ['$__t', 'ParticularProduct'] },
                    then: { $multiply: ["$sales", "$discountedPrice"] },
                    else: { $multiply: ["$sales", "$minimumBillAmount"] }
                  }
                }
              }
            }
          }
        },
        // Stage 7: Group by _id to merge salesStatus objects into an array
        {
          $group: {
            _id: null,
            interestedUsers: { $sum: "$interestedUsers" },
            footTraffic: { $sum: "$footTraffic" },
            sales: { $sum: "$sales" },
            revenue: { $sum: "$revenue" },
            conversionRate: { $avg: "$conversionRate" },
            interestedCustomers: { $push: "$interestedCustomers" },
            footTrafficStatus: { $push: "$footTrafficStatus" },
            salesStatus: { $push: "$salesStatus" }
          }
        },

        // Stage 8: Add objects in interestedCustomers, footTraffic, salesStatus array to the root level
        {
          $addFields: {
            interestedCustomers: {
              $reduce: {
                input: "$interestedCustomers",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            },
            footTrafficStatus: {
              $reduce: {
                input: "$footTrafficStatus",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            },
            salesStatus: {
              $reduce: {
                input: "$salesStatus",
                initialValue: [],
                in: { $concatArrays: ["$$value", "$$this"] }
              }
            }
          }
        },

        // Stage 9: Add field - visitedStore, madePurchase in the interestedCustomers & madePurchase in the footTrafficStatus
        {
          $addFields: {
            interestedCustomers: {
              $map: {
                input: "$interestedCustomers",
                as: "status",
                in: {
                  $mergeObjects: [
                    "$$status",
                    {
                      visitedStore: {
                        $cond: {
                          if: { $ifNull: ["$$status.footTrafficDate", false] },
                          then: true,
                          else: false
                        }
                      },
                      madePurchase: {
                        $cond: {
                          if: { $ifNull: ["$$status.claimedAt", false] },
                          then: true,
                          else: false
                        }
                      }
                    }
                  ]
                }
              }
            },
            footTrafficStatus: {
              $map: {
                input: "$footTrafficStatus",
                as: "status",
                in: {
                  $cond: {
                    if: { $ifNull: ["$$status.claimedAt", false] },
                    then: {
                      $mergeObjects: [
                        "$$status",
                        { madePurchase: true }
                      ]
                    },
                    else: {
                      $mergeObjects: [
                        "$$status",
                        { madePurchase: false }
                      ]
                    }
                  }
                }
              }
            }
          }
        },

        // Stage 10: Exclude specific fields from the interestedCustomers, footTrafficStatus and salesStatus array
        {
          $project: {
            validOffersCount: 1,
            interestedUsers: 1,
            footTraffic: 1,
            sales: 1,
            revenue: 1,
            conversionRate: 1,
            interestedCustomers: {
              $map: {
                input: "$interestedCustomers",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  timeOfClicking: "$$customer.viewDate",
                  visitedStore: "$$customer.visitedStore",
                  madePurchase: "$$customer.madePurchase",
                }
              }
            },
            footTrafficStatus: {
              $map: {
                input: "$footTrafficStatus",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  visitedTime: "$$customer.footTrafficDate",
                  madePurchase: "$$customer.madePurchase"
                }
              }
            },
            salesStatus: {
              $map: {
                input: "$salesStatus",
                as: "customer",
                in: {
                  user: "$$customer.user",
                  purchasedAt: "$$customer.claimedAt",
                  amount: "$$customer.amount"
                }
              }
            },
          }
        }
      ]
Enter fullscreen mode Exit fullscreen mode

Conclusion:

Through this blog, I have demonstrated the journey of my implementation of a MongoDB Aggregation Framework to fetch complex calculated data for a statistics API. The Aggregation Pipeline, consisting of ten stages, provided the necessary tools to process, filter, and transform the data at each step. By gaining expertise in the MongoDB Aggregation Framework, I truly empowered myself to tackle similar challenges in the future. As a Backend Software Engineer, mastering powerful tools like MongoDB Aggregation enables me to unlock the full potential of backend APIs and deliver valuable insights to users.

Top comments (0)