What I built
I have a pet project that I started some time ago, while studying programming. It's a Telegram bot for people learning German, called Dasbot.
I'm pretty proud of its daily audience of a few hundred users who have collectively answered more than 300k quiz questions 😎, but I must confess: until now its database has been residing in a Docker container 📦. Like, not even on a mounted volume 🤦♂️.
This hackathon motivated me to amend this gruesome mistake.
Also, now that I know about change streams, I can display some real-time stats on the bot's web page, yay!
Category Submission:
No idea! Just wanted to share some life lessons :)
App Link
You're welcome to use the bot and answer its questions! (Especially if you're struggling with German like I am.) If it annoys you, just ban it 😊
Screenshots
Description
The German language is difficult! Especially terrible are its grammatical genders, which defy any logic, so you just have to memorize them.
Dasbot actually helps you do this, with a simple spaced repetition algorithm.
It's written in Python, because I was studying Python at that time.
And it uses MongoDB for the database, because I didn't need much structure in my documents.
(There should be a photo of my desk here, covered with all the bureaucratic papers they send you twice a day here in Germany 📩).
In the database I keep everyone's scores needed for the repetition system. I also collect stats (user, word, answer, time) -- there could be some useful insights in there.
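To give an idea of what a "simple spaced repetition algorithm" driven by per-word scores can look like, here is a minimal Leitner-style sketch. It is not Dasbot's actual code: the record shape (`word`, `score`, `due`), the `MAX_SCORE` cap and the doubling intervals are all assumptions made for illustration.

```javascript
// Hypothetical Leitner-style scheduler; NOT Dasbot's actual algorithm.
// Each word carries a score: the higher the score, the longer the wait.
const MAX_SCORE = 5; // assumed cap
const DAY_MS = 24 * 60 * 60 * 1000;

// Review interval in days for scores 0..5: 1, 2, 4, 8, 16, 32.
function intervalDays(score) {
  return 2 ** score;
}

// Returns an updated copy of the record after the user answers.
function review(record, isCorrect, now = Date.now()) {
  const score = isCorrect
    ? Math.min(record.score + 1, MAX_SCORE)
    : 0; // a wrong answer sends the word back to the start
  return {
    ...record,
    score,
    due: now + intervalDays(score) * DAY_MS,
  };
}

// Picks the words that are due for review, most overdue first.
function dueWords(records, now = Date.now()) {
  return records
    .filter((r) => r.due <= now)
    .sort((a, b) => a.due - b.due);
}

// Usage: "Haus" was answered correctly, so it comes back in 2 days.
const start = Date.UTC(2022, 0, 1);
const haus = review({ word: "Haus", article: "das", score: 0, due: start }, true, start);
console.log(haus.score, (haus.due - start) / DAY_MS); // 1 2
```

Keeping `review` pure (it returns a new record instead of mutating) makes it easy to store the result back as one document per user and word.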
Link to Source Code
https://github.com/wetterkrank/dasbot -- main app
https://github.com/wetterkrank/dasbot-docs-live -- web app with the new /stats page
Permissive License
Background
So, I used Docker.
It's a great tool! And I guess it's ok for a study project to spawn a database in a container. But when you do it in "production", you start collecting some gotchas. Here's a couple of mine.
mongo:
  ports:
    - "0.0.0.0:27017:27017"

-- this was a part of my docker-compose.yml.
After the launch, everything worked fine for a few days, and then I found my database empty!
I checked the Mongo logs and found some dropDatabase calls coming from unknown IPs. Hacked! 🪓 But how!? I knew my ufw rules by heart! What I didn't know is that Docker writes its own iptables rules and will not be trammelled by a mere firewall.
So when you publish the port on 0.0.0.0, you share it with a world full of people with port scanners.
Fast forward to this November. I just updated a config setting and decided to restart the containers manually.
Then I pinged the bot and was slightly surprised that it didn't recognise me. So I looked at the db collections... interesting... 0 documents... 😰
After scrolling up the shell history, I noticed that I had typed docker-compose down (which removes the containers rather than just stopping them) instead of docker-compose stop. There goes my data! Luckily, I had a backup 😅.
How I built it
As for the move to Atlas: this was simple!
I would have loved to use the live migration service, but I decided to start with an M0 cluster, so I didn't have the opportunity and just used mongorestore instead:
DB_CONTAINER="dasbot_db"
RESTORE_URI="mongodb+srv://$DB_USERNAME:$DB_PASSWORD@mydb.smth.mongodb.net/"
echo "Piping mongodump to mongorestore with Atlas as destination..."
docker exec $DB_CONTAINER mongodump --db=dasbot --archive | mongorestore --archive --drop --uri="$RESTORE_URI"
One notable hiccup was the speed of mongorestore -- a pitiful 50 MB of data took several minutes to load! However, increasing the number of workers (numInsertionWorkersPerCollection) helped.
For the change streams (real-time stats) exercise I had to refresh my knowledge of aggregation pipelines and write some JS code. I already mentioned the stats collection above; it can be used to build all kinds of reports.
So I've added a couple of triggers which are responsible for aggregating this data and publishing the updates to a separate database, and an Atlas app that lets users access this database anonymously.
// Scheduled to run twice per day
// Updates correct / incorrect counters in answers_total
exports = function() {
  const mongodb = context.services.get("DasbotData");
  const collection = mongodb.db("dasbot").collection("stats");
  const pipeline = [
    {
      $group: {
        _id: { $cond: [ { $eq: ["$correct", true] }, 'correct', 'incorrect' ] },
        count: { "$sum": 1 }
      }
    },
    {
      $out: { db: "dasbot-meta", coll: "answers_total" }
    }
  ];
  collection.aggregate(pipeline);
};
// This runs on every `stats` insert and updates the aggregated results
exports = function(changeEvent) {
  const db = context.services.get("DasbotData").db("dasbot-meta");
  const answers_total = db.collection("answers_total");
  const fullDocument = changeEvent.fullDocument;
  const key = fullDocument.correct ? "correct" : "incorrect";
  const options = { "upsert": true };
  // Documents in answers_total look like { _id: <key>, count: <number> }
  answers_total.updateOne( { "_id": key }, { "$inc": { "count": 1 } }, options);
};
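To make the division of labour between the two triggers concrete, here is a plain JavaScript model of what they compute, with no database involved. The `sampleStats` array is made-up data, and `recount` and `applyInsert` are illustrative names rather than anything from the Atlas API.

```javascript
// Plain-JS model of the two triggers; `sampleStats` is made-up data.
const sampleStats = [
  { user: 1, word: "Haus", correct: true },
  { user: 2, word: "Tisch", correct: false },
  { user: 1, word: "Lampe", correct: true },
];

// Mirrors the scheduled trigger: recount everything from scratch,
// grouping on "correct is exactly true" like the $cond / $eq stage.
function recount(stats) {
  const totals = { correct: 0, incorrect: 0 };
  for (const doc of stats) {
    totals[doc.correct === true ? "correct" : "incorrect"] += 1;
  }
  return totals;
}

// Mirrors the insert trigger: bump one counter, creating it when it
// does not exist yet (the effect of $inc combined with upsert).
function applyInsert(totals, fullDocument) {
  const key = fullDocument.correct ? "correct" : "incorrect";
  return { ...totals, [key]: (totals[key] ?? 0) + 1 };
}

const full = recount(sampleStats);
const incremental = sampleStats.reduce(applyInsert, {});
console.log(full);        // { correct: 2, incorrect: 1 }
console.log(incremental); // { correct: 2, incorrect: 1 }
```

Both paths end up with the same totals. The incremental one is what makes the page feel live, while the periodic recount rebuilds the numbers from the source collection, so any drift between the two gets corrected.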
To display the data, I made a simple React app that uses the Realm Web SDK. Now, when someone answers the bot's question, you can immediately see it ⚡.
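The client side of this can be sketched roughly as follows. This is not the code from the dasbot-docs-live repo: the `applyChange` and `watchTotals` names are made up, the app ID is whatever your Atlas app is called, and I'm assuming that change events for updates arrive with `fullDocument` attached (events without it are simply skipped). The "DasbotData" service name is the one used in the functions above.

```javascript
// Pure helper: fold one change event into the totals shown on the page.
function applyChange(totals, change) {
  const doc = change.fullDocument;
  if (!doc) return totals; // e.g. delete events carry no document
  return { ...totals, [doc._id]: doc.count };
}

// Logs in anonymously, loads the current totals, then follows the
// change stream. `appId` is a placeholder for your own Atlas app ID.
async function watchTotals(appId, onUpdate) {
  // Loaded lazily so that applyChange can be used without the SDK.
  const Realm = await import("realm-web");
  const app = new Realm.App({ id: appId });
  const user = await app.logIn(Realm.Credentials.anonymous());
  const collection = user
    .mongoClient("DasbotData")
    .db("dasbot-meta")
    .collection("answers_total");

  // Initial snapshot: one document per counter.
  let totals = {};
  for (const doc of await collection.find()) {
    totals[doc._id] = doc.count;
  }
  onUpdate(totals);

  // watch() yields change events for as long as the stream stays open.
  for await (const change of collection.watch()) {
    totals = applyChange(totals, change);
    onUpdate(totals);
  }
}
```

In a React component, `watchTotals` would typically be started from an effect, with a state setter passed in as `onUpdate`.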
Additional Resources/Info
This tutorial was quite handy!