In the fast-paced world of software development, engineers are often tasked with solving complex problems efficiently. As a Backend Software Engineer, I face numerous challenges while building robust APIs and integrating them with powerful databases. One such scenario arises when I was assigned to create an API for displaying statistics on a service. In this blog, I will illustrate my journey to embark on a quest to implement a MongoDB Aggregation Framework for fetching complex calculated data.
Afew days ago, I was handed the responsibility of writing an API that displays statistics for a service. To obtain this data, I needed to fetch and process information stored in a MongoDB database in different collections. However, my knowledge of the MongoDB Aggregation Framework was initially limited. Nonetheless, I eagerly took on the challenge, realizing that this would be an excellent opportunity to enhance your expertise in MongoDB Aggregation.
Unveiling the Aggregation Pipeline: The MongoDB Aggregation Framework provides a powerful toolset for performing complex data operations on MongoDB collections. It allows you to process, transform, and analyze data through a series of stages known as the Aggregation Pipeline. Each stage in the pipeline takes input from the previous stage and produces output for the subsequent one. Let’s dive into the details of each stage in the Aggregation Framework implemented for your API.
Stage 1: Match Document:
{
$match: {
store: storeId,
}
}
At the beginning of the pipeline, the $match stage filters the documents based on a specific condition. In this case, it matches documents with a particular store ID.
Stage 2: Lookup OfferUsers:
{
$lookup: {
from: "offerusers",
localField: "_id",
foreignField: "offer",
as: "customers"
}
}
In the $lookup stage, the Aggregation Framework performs a left outer join with the “offerusers” collection. It connects documents from the “Offers” collection with related documents from the “offerusers” collection using the “offer” and “_id” fields, respectively. The matching documents are stored in the “interestedCustomers” field.
Stage 3: Filter the interestedCustomer:
{
$addFields: {
interestedCustomers: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.viewDate", startDate] },
{ $lte: ["$$customer.viewDate", endDate] }
]
}
}
}
}
}
The $addFields stage is used to filter the “interestedCustomers” based on a date range. The $filter operator selects only the customers whose “viewDate” falls within the specified start and end dates.
Stage 4: Add fields — footTraffic, sales:
{
$addFields: {
footTrafficStatus: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.footTrafficDate", startDate] },
{ $lte: ["$$customer.footTrafficDate", endDate] }
]
}
}
},
salesStatus: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.claimedAt", startDate] },
{ $lte: ["$$customer.claimedAt", endDate] }
]
}
}
}
}
}
The $addFields stage is used again to add two fields: “footTrafficStatus” and “salesStatus.” These fields contain filtered arrays of “interestedCustomers” based on specific date ranges.
Stage 5: Add fields — interestedUsers, footTraffic, salesCount:
{
$addFields: {
interestedUsers: {
$size: "$interestedCustomers"
},
footTraffic: {
$size: "$footTrafficStatus"
},
sales: {
$size: "$salesStatus"
},
}
}
In this stage, the $addFields operator calculates the number of interested users, foot traffic, and the count of sales. The $size operator counts the number of elements in each array.
Stage 6: Add fields — amount, conversionRate, revenue:
{
$addFields: {
salesStatus: {
$map: {
input: "$salesStatus",
as: "status",
in: {
$mergeObjects: [
"$$status",
{
amount: {
$cond: {
if: {
$eq: ['$__t', 'ParticularProduct']
},
then: "$discountedPrice",
else: "$minimumBillAmount",
}
}
}
]
}
}
},
conversionRate: {
$cond: {
if: {
$eq: ["$interestedUsers", 0]
},
then: 0,
else: {
$multiply: [
{
$divide: [
"$sales", {
$max: [1, "$interestedUsers"]
}
]
},
100
]
}
}
},
revenue: {
$cond: {
if: {
$eq: ["$sales", 0]
},
then: 0,
else: {
$cond: {
if: { $eq: ['$__t', 'ParticularProduct'] },
then: { $multiply: ["$sales", "$discountedPrice"] },
else: { $multiply: ["$sales", "$minimumBillAmount"] }
}
}
}
}
}
}
Here, the $addFields stage adds additional fields to the “salesStatus” objects. It also calculates the conversion rate and revenue based on certain conditions.
Stage 7: Group by _id to merge salesStatus objects into an array:
{
$group: {
_id: null,
interestedUsers: { $sum: "$interestedUsers" },
footTraffic: { $sum: "$footTraffic" },
sales: { $sum: "$sales" },
revenue: { $sum: "$revenue" },
conversionRate: { $avg: "$conversionRate" },
interestedCustomers: { $push: "$interestedCustomers" },
footTrafficStatus: { $push: "$footTrafficStatus" },
salesStatus: { $push: "$salesStatus" }
}
}
The $group stage groups the documents and merges the “salesStatus” objects into an array. It also calculates the sum of interested users, foot traffic, sales, and revenue. The conversion rate is averaged across all documents.
Stage 8: Add objects in interestedCustomers, footTraffic, salesStatus array to the root level:
{
$addFields: {
interestedCustomers: {
$reduce: {
input: "$interestedCustomers",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
},
footTrafficStatus: {
$reduce: {
input: "$footTrafficStatus",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
},
salesStatus: {
$reduce: {
input: "$salesStatus",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
}
}
}
The $addFields stage extracts the arrays of “interestedCustomers,” “footTrafficStatus,” and “salesStatus” from the grouped documents. It flattens these arrays and brings them to the root level.
Stage 9: Add field — visitedStore, madePurchase in the interestedCustomers & madePurchase in the footTrafficStatus:
{
$addFields: {
interestedCustomers: {
$map: {
input: "$interestedCustomers",
as: "status",
in: {
$mergeObjects: [
"$$status",
{
visitedStore: {
$cond: {
if: { $ifNull: ["$$status.footTrafficDate", false] },
then: true,
else: false
}
},
madePurchase: {
$cond: {
if: { $ifNull: ["$$status.claimedAt", false] },
then: true,
else: false
}
}
}
]
}
}
},
footTrafficStatus: {
$map: {
input: "$footTrafficStatus",
as: "status",
in: {
$cond: {
if: { $ifNull: ["$$status.claimedAt", false] },
then: {
$mergeObjects: [
"$$status",
{ madePurchase: true }
]
},
else: {
$mergeObjects: [
"$$status",
{ madePurchase: false }
]
}
}
}
}
}
}
}
In this stage, the $addFields operator adds two new fields, “visitedStore” and “madePurchase,” to the “interestedCustomers” array. Similarly, the “madePurchase” field is added to the “footTrafficStatus” array.
Stage 10: Exclude specific fields from the interestedCustomers, footTrafficStatus, and salesStatus array:
{
$project: {
validOffersCount: 1,
_id: 1,
interestedUsers: 1,
footTraffic: 1,
sales: 1,
revenue: 1,
conversionRate: 1,
interestedCustomers: {
$map: {
input: "$interestedCustomers",
as: "customer",
in: {
user: "$$customer.user",
timeOfClicking: "$$customer.viewDate",
visitedStore: "$$customer.visitedStore",
madePurchase: "$$customer.madePurchase",
}
}
},
footTrafficStatus: {
$map: {
input: "$footTrafficStatus",
as: "customer",
in: {
user: "$$customer.user",
visitedTime: "$$customer.footTrafficDate",
madePurchase: "$$customer.madePurchase"
}
}
},
salesStatus: {
$map: {
input: "$salesStatus",
as: "customer",
in: {
user: "$$customer.user",
purchasedAt: "$$customer.claimedAt",
amount: "$$customer.amount"
}
}
},
}
}
The $project stage excludes specific fields from the “interestedCustomers,” “footTrafficStatus” and “salesStatus” arrays, keeping only the desired fields related to user information, timestamps, and purchase details.
Full aggregation framework:
[
// Stage 1: Match Document
{
$match: {
store: storeId,
}
},
// Stage 2: Lookup OfferUsers
{
$lookup: {
from: "offerusers",
localField: "_id",
foreignField: "offer",
as: "customers"
}
},
// Stage 3: Filter the interestedCustomer
{
$addFields: {
interestedCustomers: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.viewDate", startDate] },
{ $lte: ["$$customer.viewDate", endDate] }
]
}
}
}
}
},
// Stage 4: Add fields - footTraffic, sales
{
$addFields: {
footTrafficStatus: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.footTrafficDate", startDate] },
{ $lte: ["$$customer.footTrafficDate", endDate] }
]
}
}
},
salesStatus: {
$filter: {
input: "$customers",
as: "customer",
cond: {
$and: [
{ $gte: ["$$customer.claimedAt", startDate] },
{ $lte: ["$$customer.claimedAt", endDate] }
]
}
}
}
}
},
// Stage 5: Add fields - interestedUsers, footTraffic, salesCount
{
$addFields: {
interestedUsers: {
$size: "$interestedCustomers"
},
footTraffic: {
$size: "$footTrafficStatus"
},
sales: {
$size: "$salesStatus"
},
}
},
// Stage 6: Add fields - amount conversionRate, revenue
{
$addFields: {
salesStatus: {
$map: {
input: "$salesStatus",
as: "status",
in: {
$mergeObjects: [
"$$status",
{
amount: {
$cond: {
if: { $eq: ['$__t', 'ParticularProduct'] },
then: "$discountedPrice",
else: "$minimumBillAmount",
}
}
}
]
}
}
},
conversionRate: {
$cond: {
if: {
$eq: ["$interestedUsers", 0]
},
then: 0,
else: {
$multiply: [
{
$divide: [
"$sales", {
$max: [1, "$interestedUsers"]
}
]
},
100
]
}
}
},
revenue: {
$cond: {
if: {
$eq: ["$sales", 0]
},
then: 0,
else: {
$cond: {
if: { $eq: ['$__t', 'ParticularProduct'] },
then: { $multiply: ["$sales", "$discountedPrice"] },
else: { $multiply: ["$sales", "$minimumBillAmount"] }
}
}
}
}
}
},
// Stage 7: Group by _id to merge salesStatus objects into an array
{
$group: {
_id: null,
interestedUsers: { $sum: "$interestedUsers" },
footTraffic: { $sum: "$footTraffic" },
sales: { $sum: "$sales" },
revenue: { $sum: "$revenue" },
conversionRate: { $avg: "$conversionRate" },
interestedCustomers: { $push: "$interestedCustomers" },
footTrafficStatus: { $push: "$footTrafficStatus" },
salesStatus: { $push: "$salesStatus" }
}
},
// Stage 8: Add objects in interestedCustomers, footTraffic, salesStatus array to the root level
{
$addFields: {
interestedCustomers: {
$reduce: {
input: "$interestedCustomers",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
},
footTrafficStatus: {
$reduce: {
input: "$footTrafficStatus",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
},
salesStatus: {
$reduce: {
input: "$salesStatus",
initialValue: [],
in: { $concatArrays: ["$$value", "$$this"] }
}
}
}
},
// Stage 9: Add field - visitedStore, madePurchase in the interestedCustomers & madePurchase in the footTrafficStatus
{
$addFields: {
interestedCustomers: {
$map: {
input: "$interestedCustomers",
as: "status",
in: {
$mergeObjects: [
"$$status",
{
visitedStore: {
$cond: {
if: { $ifNull: ["$$status.footTrafficDate", false] },
then: true,
else: false
}
},
madePurchase: {
$cond: {
if: { $ifNull: ["$$status.claimedAt", false] },
then: true,
else: false
}
}
}
]
}
}
},
footTrafficStatus: {
$map: {
input: "$footTrafficStatus",
as: "status",
in: {
$cond: {
if: { $ifNull: ["$$status.claimedAt", false] },
then: {
$mergeObjects: [
"$$status",
{ madePurchase: true }
]
},
else: {
$mergeObjects: [
"$$status",
{ madePurchase: false }
]
}
}
}
}
}
}
},
// Stage 10: Exclude specific fields from the interestedCustomers, footTrafficStatus and salesStatus array
{
$project: {
validOffersCount: 1,
interestedUsers: 1,
footTraffic: 1,
sales: 1,
revenue: 1,
conversionRate: 1,
interestedCustomers: {
$map: {
input: "$interestedCustomers",
as: "customer",
in: {
user: "$$customer.user",
timeOfClicking: "$$customer.viewDate",
visitedStore: "$$customer.visitedStore",
madePurchase: "$$customer.madePurchase",
}
}
},
footTrafficStatus: {
$map: {
input: "$footTrafficStatus",
as: "customer",
in: {
user: "$$customer.user",
visitedTime: "$$customer.footTrafficDate",
madePurchase: "$$customer.madePurchase"
}
}
},
salesStatus: {
$map: {
input: "$salesStatus",
as: "customer",
in: {
user: "$$customer.user",
purchasedAt: "$$customer.claimedAt",
amount: "$$customer.amount"
}
}
},
}
}
]
Conclusion:
Through this blog, I have demonstrated the journey of my implementation of a MongoDB Aggregation Framework to fetch complex calculated data for a statistics API. The Aggregation Pipeline, consisting of ten stages, provided the necessary tools to process, filter, and transform the data at each step. By gaining expertise in the MongoDB Aggregation Framework, I truly empowered myself to tackle similar challenges in the future. As a Backend Software Engineer, mastering powerful tools like MongoDB Aggregation enables me to unlock the full potential of backend APIs and deliver valuable insights to users.
Top comments (0)