DEV Community

Eugene Cheah for Uilicious

Posted on • Edited on • Originally published at uilicious.com

Why we migrated open-source 😼inboxkitten (77 million serverless requests) from 🔥Firebase to ☁️Cloudflare Workers & 🐑CommonsHost

Get Mail Nyow! 📩

Since our original launch of inboxkitten.com, a free, open-source, serverless project for creating a disposable email service...

The 🐈 kittens have multiplied and gotten out of hand...

February Bill Shock

Gamercat summer sale

Ok cat team, we need a plan to protect the summer sale! 💳

One of the key limitations of existing AWS Lambda or GCP/Firebase cloud functions is that each application instance is limited to handling 1 request at any point in time.

That is awesome if you are compressing images or doing other CPU-heavy work.

The InboxKitten API's only role is to make HTTP requests (with the API key) to Mailgun, which holds the actual emails, and echo the result back.

As a result, our resource consumption is well below 1% of a CPU and under 10MB of RAM per request, far below the billed minimum of 128MB of RAM and a dedicated CPU per request.
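That forwarding role is small enough to sketch as a single function. This is illustrative only: the Mailgun URL shape and parameter names below are assumptions rather than the project's actual code, and `fetchImpl` is injected so the logic stays environment-agnostic.

```javascript
// Sketch of the InboxKitten proxy role: build a Mailgun API URL, forward
// the request with the API key, and echo the JSON result back.
// The endpoint shape below is a guess for illustration only.
function buildMailgunUrl(domain, inbox) {
  return `https://api.mailgun.net/v3/${domain}/events?recipient=${encodeURIComponent(inbox)}@${domain}`;
}

async function listInbox(fetchImpl, apiKey, domain, inbox) {
  const res = await fetchImpl(buildMailgunUrl(domain, inbox), {
    // Mailgun uses HTTP basic auth with "api" as the username
    // (inside a Worker, btoa() would replace Buffer)
    headers: { Authorization: 'Basic ' + Buffer.from('api:' + apiKey).toString('base64') }
  });
  // Nearly all wall-clock time is spent in the await above, while actual
  // CPU time on our function stays tiny
  return res.json();
}
```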

So our original plan, once these kittens settled into a fairly constant load, was to toss the whole thing onto a $5/month Linode Nanode 1GB instance.

However, the downside is that it wouldn't "scale" automatically at peak workload, or if traffic ever grows to a gazillion requests a month, beyond what one instance can handle. (Not that we would ever need to support such a workload.)

So in keeping to the spirit of serverless fun (which made this project popular), we would put that option aside and ask...

What if there was a serverless platform with an alternative billing model? What if it billed only per request, or only for the amount of CPU and RAM used? What if we ...

Cat workers

Use cat Cloudflare workers ☁️

Cloudflare Workers is part of the growing trend of "edge" serverless computing. It also happens to simplify serverless billing down to a single metric.

Turning what was previously a convoluted, hard-to-decipher bill from GCP...

| Billable Item | Usage | Unit | Total Cost | Avg price (per million invocations) |
| --- | --- | --- | --- | --- |
| Invocation | 77,418,914 | invocation | $30.17 | $0.390 |
| CPU Time | 4,677,797 | GHZ-second | $44.78 | $0.578 |
| Outgress | 215.12 | Gibibyte | $25.21 | $0.326 |
| Memory Time | 2,923,623 | Gibibyte-second | $6.31 | $0.082 |
| Log Volume | 61.16 | Gibibyte | $5.58 | $0.072 |
| **Total** | | | $112.05 | $1.447 |
Like seriously, trying to guess in advance how much your API will cost for a million requests is practically impossible on GCP / AWS without historical data.

Into something much more easily understood, and cheaper overall per request ...

| Billable Item | Usage | Unit | Total Cost | Avg price (per million invocations) |
| --- | --- | --- | --- | --- |
| Invocation | 77,418,914 | invocation | $39 | $0.5 |
| **Total** | | | $39 | $0.5 |
Note that there is a minimum charge of $5/month.
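Sanity-checking that single metric is simple arithmetic (the $0.50/million rate and the request count are taken from the table above):

```javascript
// Cloudflare Workers billing collapses to one metric: price per request.
const requests = 77418914;    // invocations from the bill above
const pricePerMillion = 0.5;  // $0.50 per million requests
const cost = (requests / 1e6) * pricePerMillion;
console.log('$' + cost.toFixed(2)); // prints "$38.71", rounded to ~$39 on the bill
```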

😸 That's enough net savings for 7 more $9.99 games during summer sale!

And it has the bonus benefit of lower-latency edge computing!

Ready for summer cat

But, there are catches ...

1) <5ms CPU time limit

Each request is limited to under 5ms of CPU time. This differs greatly from request time (also known as wall-clock time), in that it only counts the time the CPU spent on our function itself, ignoring all of its sleep/wait time.

This is as opposed to how GCP or AWS measure the time taken from the start to the end of the function, including all sleep/wait time.

Under such a setup, serverless functions that fold DNA strands or classify cat images, with large CPU or RAM consumption, will find Cloudflare unusable.

However, our serverless function spends 99.99% of its time waiting for the Mailgun API to respond, making the <5ms limit purrfect.

Also, if we ever need to raise this to <10ms, it is available as an optional add-on.

Cat captain

This is your captain speaking: Jumping ship is a go!

2) Incompatibility with express.js (as it uses web workers pattern)

Another thing to note is that Cloudflare Workers are based on the web workers model, hooking onto the "fetch" event in Cloudflare like an interceptor function.

So instead of the following in express.js (as we currently do for Firebase):

const express = require('express')
const app = express()

app.get('/', (req, res) => res.send('Hello World!'))

const port = 3000
app.listen(port, () => console.log(`Example app listening on port ${port}!`))

In Cloudflare, it would be the following

addEventListener('fetch', event => {
  event.respondWith(fetchAndApply(event.request))
})

async function fetchAndApply(request) {
  return new Response('hello world')
}

This is a fundamental change in code structure. Despite the extremely close similarity, it can be a major no-go for larger existing projects, due to the sheer amount of rewrite work involved.

Perhaps one day someone will come up with a Cloudflare to express.js adapter?

But for us, due to the simplicity of the project, it was just a simple rewrite. You can compare the difference between the express.js version here and the Cloudflare version here on GitHub.
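An adapter is not hard to imagine, either. Here is a toy sketch of an express-like route table dispatched from a single fetch-style handler; it is an illustration of the idea, not a drop-in replacement for express middleware, and the route/handler names are made up.

```javascript
// Toy express-like router for the web-workers "fetch" model.
// Routes are registered as (method, path) pairs, and one dispatch
// function sits where the "fetch" event handler would be.
const routes = [];
function get(path, handler) { routes.push({ method: 'GET', path, handler }); }

function dispatch(method, path) {
  const route = routes.find(r => r.method === method && r.path === path);
  return route ? route.handler() : { status: 404, body: 'Not Found' };
}

// Registration looks almost identical to express...
get('/', () => ({ status: 200, body: 'Hello World!' }));

// ...and in a real worker the glue would be roughly:
// addEventListener('fetch', event => {
//   const url = new URL(event.request.url);
//   event.respondWith(new Response(dispatch(event.request.method, url.pathname).body));
// });
```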

And while workers follow an open standard, with Cloudflare being the only major provider, it is currently also a form of vendor lock-in.

cat squeezing inside a box

I can't squeeze an express cat into a Cloudflare worker, out of the box

3) one script limitation per domain (unless you are on enterprise)

While not a deal breaker for inboxkitten, this can be one for many commercial / production workloads.

Cloudflare serverless scripts cannot be broken down into smaller packages for individual subdomains and/or URI routes.

This greatly complicates things, making it impossible to separate testing and production code on a single domain, among many other more complicated setups.

However in our use case, as this is a hobby project.... it doesn't matter...
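Within that constraint, one partial workaround is to dispatch internally on the request hostname, keeping staging and production handlers inside the same script. The hostnames below are made up for illustration, and this still ships both codebases in one bundle, which is exactly the complication described above.

```javascript
// Sketch: staging and production handlers living inside the one allowed
// worker script, selected by hostname. Names are illustrative only.
function pickHandler(hostname) {
  const handlers = {
    'staging.inboxkitten.example': () => 'staging handler',
    'inboxkitten.example': () => 'production handler',
  };
  // Unknown hostnames fall through to production
  return handlers[hostname] || handlers['inboxkitten.example'];
}
```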

cat sharing a box

Sadly we can't share multiple different cats across a single box domain easily

The scratches in summary...

  1. <5ms CPU time limit
  2. Incompatible with express.js
  3. one script limitation per domain

cool cat

3 problems with Cloudflare Workers, and we do not care!

And all it took was a quick one-day rewrite by @jmtiong, and we were done.


That's cool, but why would I even need Inboxkitten disposable email again?

One of the key use cases, and why we built this project, is to perform email validation as part of our automated test scripts, such as the following...


// Lets goto inbox kitten
I.goTo("https://inboxkitten.com");
I.see("Open-Source Disposable Email");

// Go to a random inbox
I.fill("email", SAMPLE.id(22));
I.click("Get Mail Nyow!");

// Check that it's empty
I.see("There for no messages for this kitten :(");

// Testing for regular email
// (sent using a jenkins periodic build)
I.goTo("https://inboxkitten.com");
I.see("Open-Source Disposable Email");
I.fill("email", "ik-reciever-f7s1g28");
I.click("Get Mail Nyow!");

// See an email we expect, nyow
I.see("Testing inboxkitten subject");


With sharable test results such as

uilicious demo

Plus it's simple, cool, and fun to tinker around with.


cat and sheep as friends

cat and sheep can be friends!

Hey, how about the $31 firebase static file hosting?

The easiest free solution would be to throw the entire website onto GitHub Pages.

However in the spirit of opensource...

We are gonna give a shout-out to our friends at 🐑 commonshost.com, an open-source static site hosting platform being built out of Singapore 🇸🇬

And help them push their network with a real production workload test across the 22+ servers of their global network.

As to why Commons Host instead of GitHub... because it's cool, and being an underdog of the testing world ourselves, we want to support the underdog of the CDN world.

Oops, did I say underdog? I mean undercat 😼

What's next?

Due to the project's rather unique simplicity (quick to rewrite) and its heavy production load, I am thinking of potentially expanding its deployment options to as many serverless platforms as possible, or even docker-based deployments.

Exploring the various trade-offs with actual 24/7 production load.

Done

  • GCP/Firebase: function and hosting
  • Cloudflare workers
  • Commonshost hosting

Todo

  • Docker container deployment
    • ECS Fargate
    • Digital Ocean Kubernetes
  • AWS lambda
  • Others??

Let's see where this cat ship will sail next... Till then we will be taking a cat nap

Cat sleeping


Happy Shipping 🖖🏼🚀

Top comments (7)

Mikael D

Nice writeup and great usage of CF Workers. Hot tip: you can use the CF key-value database (Workers KV) to store the static files!

Eugene Cheah • Edited

Thanks, @tmikaeld! We did actually consider using KV, or alternatively the Cache API, though it would very likely double our bill.

For a more commercial production use case, having such edge like performance could be well worth the cost. For a hobby project, not as much haha 😅

Currently, we even limit the workers strictly to the required endpoints, to cut out random API scans from bots, etc. From the Firebase logs, we realized that traffic can be quite sizable, due to the site's popularity in Russia, China, and the USA.

That being said, caching the email response body is something we definitely want to do next! Less for performance or cost... more for fun!

Mikael D

Yeah, I realized after I wrote it that you'd still hit the Workers and incur a cost for it, so normal hosting would be cheaper.

Eugene Cheah • Edited

Though if you get creative, there might be some Rube Goldberg-level cost-cutting possible. Something one should never do in production, for code maintainability's sake (maybe?).

In general, our API works in 2 major steps, listing emails and reading email content, with separate API calls. And due to the workflow, they always happen in that sequence.

From the documentation of Cloudflare VCL and Cache API

What we can do is split the API routing into

  • listing on Cloudflare worker
  • reading of email body on firebase (or something else) with standard Cloudflare in front of it

So that during the initial "listing of emails" - what we can do is to fill in the Cloudflare cache for the subsequent "reading of email" API call. For the top 10 emails in the list (if any).

And when the subsequent "reading of email" is called on the user click, it would hopefully be reading from cache, saving a single call hit... And if it misses, it would simply fall back to firebase.
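A minimal sketch of that prefill step could look like the following. The cache and fetch are injected stand-ins (the real Cloudflare Cache API works with Request/Response objects), and the endpoint path is hypothetical.

```javascript
// Sketch of the cache-prefill idea: while serving the "list emails" call,
// warm a cache with the bodies of the top N emails, so the later
// "read email" call can hopefully be served from cache instead of
// a second billed worker hit.
// `cache` and `fetchImpl` are injected stand-ins for Cloudflare's Cache API.
async function prewarmEmailBodies(cache, fetchImpl, emailIds, limit = 10) {
  const top = emailIds.slice(0, limit);
  await Promise.all(top.map(async id => {
    // Hypothetical "read email body" route, for illustration only
    const body = await fetchImpl(`/api/v1/mail/getHtml?id=${id}`);
    await cache.put(id, body);
  }));
  return top.length;
}
```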


But like I said, it's a crazy Rube Goldberg. And it would only technically work because there is a reasonable chance of hitting the same Cloudflare node, thanks to the magic of HTTP/2.

And that's assuming it works lol - would be amused to see someone experiment with such setups.

Nicolas Grilly

You mention DigitalOcean Managed Kubernetes: have you tried it?

Eugene Cheah

Nope, though I am interested in doing so, as using it with autoscaling (like Google's or AWS's Kubernetes) is kinda "close to serverless".

Plus they have very bandwidth friendly pricing 😼

ashanita tabihany

Please fix inboxkitten: if we put a space in the username it shows all emails. It's a serious problem, bro.