Erik Lieben for Effectory

Posted on Sep 20, 2019 • Edited on Sep 23, 2019

Building a serverless blog site on Azure

#typescript #node #serverless #dotnet

This article is part of #ServerlessSeptember. You'll find other helpful articles, detailed tutorials, and videos in this all-things-Serverless content collection. New articles are published every day — that's right, every day — from community members and cloud advocates in the month of September.

Find out more about how Microsoft Azure enables your Serverless functions at https://docs.microsoft.com/azure/azure-functions/.

Introduction

In this blog post, I want to take you through the story of a serverless application and show you how to build one that runs at minimal cost while remaining scalable. I hope to inspire you to try, play, and gain experience with serverless ideas and implementations, and so build up knowledge of serverless scenarios.

We will build an application that allows us to post articles in markdown and renders them to static HTML pages, so they are easy to consume even without JavaScript enabled (as is the case for search engines). Later on, we will look at ways to enhance the site for visitors who do have JavaScript enabled.

This article takes you through the story and gives a global overview of the application with some code samples, but is in no way meant as a copy and paste example for a full application. I will go more in-depth into the specific topics in follow-up blog posts looking at each of the parts separately.

Architecture / Helicopter view

The application can be divided into a few sections:

the hosting of the static files (below the green bar)
the API for performing modifications to content (below the red bar)
the processing/generation part (below the purple bar)

The goal of serverless in our case is to remove as much of the idle CPU processing parts as possible, while still allowing us to be able to scale out to handle traffic or processes.

The hosting of the static files (below the green bar)

In the first section, we host the files/content of the blog on Azure Storage and serve files to clients using Azure CDN. This allows us to pay only for the storage of files and transfer of files from Azure Blob Storage to the Azure CDN. We won't require anything that is potentially wasting CPU cycles (idle VM or App Services). The CDN allows us to scale and deliver content rapidly to our clients, and we again only pay for the usage of the CDN (no idle machine if there is no traffic).

The API for performing modifications to content (below the red bar)

The second part consists of Azure Functions that we can run as part of the consumption plan. This allows us to remove the need for a machine that is spinning (adding to our costs) and waiting for requests from clients. With Azure Functions in the consumption plan, we only pay for the startup of a function and the amount of CPU/memory it uses during execution. So when no one is writing blog posts (retrieving and storing), the system is, in a sense, turned off and not generating costs. One of the downsides of running your code in this manner is that it takes a bit of time for functions to wake up or cold start. For now, we accept that we sometimes need to wait a few seconds to save or retrieve our content when editing.

Processing/generation part (below the purple bar)

The last part of the application is a set of Azure Functions that handle generating static content that can be consumed by clients. This allows us to serve our content quickly and to all clients (also clients that don't have JavaScript enabled, like search engines) without the need to render static content on each request.

Infrastructure

The part of our application that most consumers visit consists of the static files (either the JavaScript app/bundles or the generated static blog articles). To serve those to the consumers, we require only a small portion of the services Azure offers: Azure Blob Storage and the Azure CDN service.

Static file hosting using Azure Blob static website hosting

Azure Blob Storage supports static website hosting, a feature that lets us pay only for the storage of our files and the traffic/transfer, which fits perfectly into the serverless story. It also allows us to define an index and error document path, which is very useful for single-page applications using pushState.

You can set up a custom domain name for blob storage, but it won't allow you to use a custom SSL certificate for your domain name. So if you want to serve files over HTTPS, it will give you a warning about an incorrect SSL certificate, because it serves the certificate for blob.core.windows.net instead of the one you need for your custom domain. This can be resolved by using the Azure CDN service, which has the option to generate or use a custom certificate for your domain.

Azure Content Delivery Network

Azure CDN is a distributed network of servers managed by Azure that allows us to cache our content close to the end-users to minimize latency. The CDN has worldwide POP (point of presence) locations to provide content as quickly as possible to anyone, anywhere in the world, at any load.

As mentioned above, it also resolves our issue with the SSL certificate, because we can either upload our own SSL certificate or get one for free for our domain.

The CDN on top of Azure Blob Storage gives us the scalability and performance we are after, because the Azure CDN service supports much higher egress limits than a single storage account.

Costs

Calculating costs is difficult if we don't know the exact usage patterns of a site, but we can come up with some quick estimates that give us an idea of the bill that we could get at the end of the month.

Azure Storage

Locally redundant storage (LRS), which is sufficient for our use case, costs €0.0166 per GB per month. Transaction prices are a bit more specific, but if we generalize them, they cost €0.0456 per 10,000 transactions. The first 5 GB per month of outbound data transfer is free; after that, we pay €0.074 per GB.

The static files we store don't amount to gigabytes of data; the total is most likely below 1 GB, which means €0.0166. Let's say we do 50,000 operations (which is a lot, but let's say our authors save their work often): that's €0.228. Add 1 GB of data transfer at €0.074 per GB, and we arrive at roughly 32 euro cents to host all the content for a month. That is nearly free, and our actual usage will probably be lower because the Azure CDN does most of the data transfer.
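The estimate above can be written down as a small calculation. This is only a sketch: the prices are the ones quoted in this post, the function name is made up for illustration, and like the text it ignores the free 5 GB of outbound transfer.

```typescript
// Prices quoted in this post (EUR, at the time of writing); they will change over time.
const STORAGE_PER_GB = 0.0166;        // LRS storage, per GB per month
const PER_10K_TRANSACTIONS = 0.0456;  // generalized transaction price
const EGRESS_PER_GB = 0.074;          // outbound data transfer

// Hypothetical helper: adds up the three cost components for one month.
function estimateStorageCost(storedGb: number, transactions: number, egressGb: number): number {
  const storage = storedGb * STORAGE_PER_GB;
  const operations = (transactions / 10000) * PER_10K_TRANSACTIONS;
  const transfer = egressGb * EGRESS_PER_GB;
  return storage + operations + transfer;
}

// 1 GB stored, 50,000 operations, 1 GB transferred
const monthlyStorageCost = estimateStorageCost(1, 50000, 1);
console.log(monthlyStorageCost.toFixed(2)); // prints "0.32"
```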

Azure CDN

The costs for Azure CDN are what we pay for transfer to clients, because they will most likely hit one of the CDN edge locations. We will use Azure CDN Premium from Verizon, which is a bit more expensive than the standard one (but supports HTTP to HTTPS redirect rules).

Each zone has a different price, but if we take the most expensive one, €0.3930 per GB, and estimate 5 GB of transfer, we end up with a total cost of around €2.

Zone	Area	Price per GB/month
Zone 1	North America, Europe, Middle East and Africa	€0.1333
Zone 2	Asia Pacific (including Japan)	€0.1965
Zone 3	South America	€0.3930
Zone 4	Australia	€0.2202
Zone 5	India	€0.2674
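If your visitors are spread over several zones, the same estimate can be made per zone. A minimal sketch using the prices from the table above; the zone keys and the function name are my own, not anything Azure defines.

```typescript
// Per-GB prices from the table above (EUR, at the time of writing).
const CDN_PRICE_PER_GB = new Map<string, number>([
  ["zone1", 0.1333],
  ["zone2", 0.1965],
  ["zone3", 0.3930],
  ["zone4", 0.2202],
  ["zone5", 0.2674],
]);

interface ZoneTraffic {
  zone: string; // hypothetical key, e.g. "zone1"
  gb: number;   // estimated transfer in GB for the month
}

// Hypothetical helper: sums price * traffic over all zones.
function estimateCdnCost(traffic: ZoneTraffic[]): number {
  let total = 0;
  for (const { zone, gb } of traffic) {
    const price = CDN_PRICE_PER_GB.get(zone);
    if (price === undefined) {
      throw new Error(`Unknown zone: ${zone}`);
    }
    total += price * gb;
  }
  return total;
}

// Worst case from the text: all 5 GB served from the most expensive zone.
const worstCaseCdnCost = estimateCdnCost([{ zone: "zone3", gb: 5 }]);
// A friendlier mix: 4 GB in zone 1 and 1 GB in zone 2.
const mixedCdnCost = estimateCdnCost([{ zone: "zone1", gb: 4 }, { zone: "zone2", gb: 1 }]);
```

The worst case comes out just under €2, which matches the estimate in the text; a mostly European or North American audience would cost well under half of that.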

Setup Azure Blob Storage hosting

Azure blob storage can be set up for hosting static content quite easily. Once your storage account is created, go to the 'Static website' section under Settings and enable it using the toggle.

There are two options to configure: the 'Index document name' and the 'Error document name'. If you want to host a SPA with 'pushState' enabled, set both options to 'index.html' (or the root document of your SPA). This makes it possible for the SPA to activate on routes deeper than the base route, so visitors can deep link into your application.
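To see why pointing both options at 'index.html' enables deep links, it helps to model how a request path is matched to a blob. The sketch below is a simplified mental model, not Azure's implementation; the type and function names are made up, and the 404 status for the error document reflects my understanding of the service's behaviour.

```typescript
interface StaticWebsiteConfig {
  indexDocument: string; // 'Index document name'
  errorDocument: string; // 'Error document name'
}

interface ResolvedResponse {
  status: number;
  blob: string;
}

// Simplified model (assumption): decide which blob in $web answers a request path.
function resolveRequest(path: string, blobs: Set<string>, config: StaticWebsiteConfig): ResolvedResponse {
  const trimmed = path.replace(/^\/+/, "");
  // A directory-style path ("" or "my-post/") maps to the index document in that directory.
  const candidate = trimmed === "" || trimmed.endsWith("/") ? trimmed + config.indexDocument : trimmed;
  if (blobs.has(candidate)) {
    return { status: 200, blob: candidate };
  }
  // Unknown path: the error document is returned instead, with a 404 status.
  return { status: 404, blob: config.errorDocument };
}

const spaConfig: StaticWebsiteConfig = { indexDocument: "index.html", errorDocument: "index.html" };
const webContainer = new Set(["index.html", "main.js", "my-post/index.html"]);

const root = resolveRequest("/", webContainer, spaConfig);
const generatedPost = resolveRequest("/my-post/", webContainer, spaConfig);
const deepLink = resolveRequest("/some/spa/route", webContainer, spaConfig);
```

In this model the deep link still receives index.html, so the SPA can start and route on the client, while a generated post is served directly from its own folder.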

Setup Azure CDN

We can now create a new Azure CDN Profile and point the endpoint to our newly created Azure Storage static site URL. You can find the URL for your static site on the same screen where you enabled static site hosting. It's the 'Primary endpoint'. When creating the Azure CDN Profile, check the box before 'Create a new CDN endpoint now' and supply the name you want to use. Select 'Custom origin' from the dropdown box 'Origin type' and paste the 'Primary endpoint' URL into the textbox named 'Origin hostname'. Be sure to remove the leading 'https://' to make it valid.
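If you script this step instead of clicking through the portal, the hostname can be derived from the primary endpoint. A minimal sketch; the function name and the example account name are hypothetical.

```typescript
// Hypothetical helper: turn the 'Primary endpoint' URL into a value for 'Origin hostname'.
function toOriginHostname(primaryEndpoint: string): string {
  return primaryEndpoint
    .trim()
    .replace(/^https?:\/\//i, "") // drop the leading scheme
    .replace(/\/+$/, "");         // drop any trailing slash
}

// Example endpoint with a made-up storage account name.
const originHostname = toOriginHostname("https://mystorageaccount.z6.web.core.windows.net/");
console.log(originHostname); // prints "mystorageaccount.z6.web.core.windows.net"
```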

Adding a custom domain name

If you own a domain name, you can set it up to point to the CDN endpoint.

Enable HTTPS

Once you've added your custom domain name, you can click on it to set up HTTPS for the custom domain. You can either buy your own SSL certificate or get one for free from Microsoft Azure by using the option 'CDN managed'.

API

The editor needs a way to access blog articles that are still unpublished, and a secure way to save and publish a blog article.

Secure API (Azure Function with HTTP trigger) with Azure AD

As we don't want just anyone to be able to modify our blog posts, we need to limit access to the Azure Functions with HTTP endpoints.

The Azure Functions team created a very easy to use option to accomplish this. We can simply add a provider that takes care of it in the 'Platform features' tab of the 'Functions App' in the section 'Networking' under 'Authentication/ Authorization' without making any modifications to our code.

There are a lot of different authentication providers. For now, I will use 'Azure Active Directory' as the authentication provider and create a user in AD with 2-factor authentication enabled. This will add an additional cost of around €1,- to our overall costs (for a user that has 2-factor authentication enabled).

Azure Functions C-sharp

Our REST API is used by the admin interface and takes care of serving and saving our blog articles. Using the input and output bindings of Azure Functions allows us to build our REST API without a lot of code to write and maintain.

Get blog post

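        [FunctionName(nameof(Get))]
        public async Task<IActionResult> Get(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = null)] HttpRequest req,
            [Blob("posts", FileAccess.Read, Connection = "connection")] CloudBlobContainer container)
        {
            string slug = req.Query["slug"];
            if (string.IsNullOrWhiteSpace(slug))
            {
                return new BadRequestObjectResult("slug cannot be empty");
            }

            // Each post is stored as <slug>.md in the posts container
            var blobRef = container.GetBlockBlobReference(slug + ".md");
            if (!await blobRef.ExistsAsync())
            {
                return new NotFoundResult();
            }
            string markdownText = await blobRef.DownloadTextAsync();
            return new OkObjectResult(markdownText);
        }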

Save blog post

        [FunctionName(nameof(Save))]
        public async Task<IActionResult> Save(
            [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = null)] HttpRequest req,
            [Blob("posts", FileAccess.ReadWrite, Connection = "connection")] CloudBlobContainer container,
            [Queue("get-markdown-metadata", Connection = "blogeriklieben")]CloudQueue outputQueue)
        {
            string slug = req.Query["slug"];
            if (string.IsNullOrWhiteSpace(slug))
            {
                return new BadRequestObjectResult("slug cannot be empty");
            }

            var blobRef = container.GetBlockBlobReference(slug + ".md");

            await blobRef.UploadFromStreamAsync(req.Body);
            blobRef.Properties.ContentType = "text/markdown";
            await blobRef.SetPropertiesAsync();

            // request update to the index file
            await outputQueue.AddMessageAsync(new CloudQueueMessage(slug));

            return new OkObjectResult(slug);
        }

List markdown files

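        [FunctionName(nameof(List))]
        public IActionResult List(
            [HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = null)] HttpRequest req,
            [Blob("posts/index.json", FileAccess.Read, Connection = "connection")] string index)
        {
            // index already holds JSON text; return it as-is instead of serializing the string a second time
            return new ContentResult { Content = index, ContentType = "application/json", StatusCode = 200 };
        }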
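On the editor side, calling these three functions comes down to building the right request. The sketch below assumes the default route prefix 'api' and that each function is reachable under its function name (which is what Route = null gives you); the base URL, type, and helper names are placeholders of my own.

```typescript
interface ApiRequest {
  url: string;
  method: "GET" | "POST";
  body?: string;
}

// Hypothetical helpers that describe the calls the admin interface makes.
function getPostRequest(baseUrl: string, slug: string): ApiRequest {
  return { url: `${baseUrl}/api/Get?slug=${encodeURIComponent(slug)}`, method: "GET" };
}

function savePostRequest(baseUrl: string, slug: string, markdown: string): ApiRequest {
  if (slug.trim() === "") {
    // Mirrors the check in the Save function, so we fail before making a round trip.
    throw new Error("slug cannot be empty");
  }
  return { url: `${baseUrl}/api/Save?slug=${encodeURIComponent(slug)}`, method: "POST", body: markdown };
}

function listPostsRequest(baseUrl: string): ApiRequest {
  return { url: `${baseUrl}/api/List`, method: "GET" };
}

// Placeholder base URL for illustration only.
const saveRequest = savePostRequest("https://example.azurewebsites.net", "my-first-post", "# Hello");
```

Each description can then be handed to fetch, for example fetch(saveRequest.url, { method: saveRequest.method, body: saveRequest.body }), together with whatever credentials your authentication provider requires.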

Azure Functions TypeScript

The great thing about Azure Functions is that you can make small functions that handle a single responsibility and pass the result on to the next function for further processing. That function doesn't even need to be written in the same programming language; you can use the language that best fits the use case.

In our case, we will use TypeScript/JavaScript to render markdown files using markdown-it. This is the same markdown to HTML transformer we will use in our client-side editor. Markdown-it is a JavaScript library for generating HTML from markdown with a rich set of plugins/extensions.

This way, we don't need to find a C# framework or a port of markdown-it that does precisely the same, we can rather use the same logic in a small function and pass it back to our C# functions.

So even if you don't feel like you have a lot of experience or knowledge of JavaScript, at least you can use a small part of JavaScript code and don't need to worry about gaining the knowledge to host it as a service along with other concerns one might have to keep it running during the lifespan of our application.

In this case, I will use two TypeScript functions; one for gathering metadata and one for generating out static content using Aurelia.

Read markdown metadata

In our editor, we can provide metadata for a blog post by adding the following key/value pairs to the top of our markdown text:

---
title: 'amazing blog post'
publishDate: 2019-09-09
state: published
tags: amazing, awesome, superb
---

The only way to get this metadata out of our blog post is by processing the markdown file itself. What we will do is listen to modifications to markdown files stored in our blob storage account.

Once a markdown file is saved, we need to process the markdown metadata to check whether the blog post is in the published state, in which case we need to queue it for publication. We also need to update the blog post index file that we keep in blob storage with the latest information.

The function code index.ts:

const MarkdownIt = require('markdown-it');

module.exports = async function (context, markdownFilePath, markdownFile) {

    context.log('Processing metadata for markdown file: ', markdownFilePath);  

    const md = new MarkdownIt();
    md.use(require('markdown-it-meta'));
    const html = md.render(markdownFile);

    const meta = md.meta;
    meta.fileName = markdownFilePath;
    return JSON.stringify(meta);
};

As you can see this isn't much code and it's still easy to understand and maintain.

The function imports the markdown library and creates an instance of it. The next line imports the markdown-it-meta plugin for parsing the metadata and tells markdown-it to use the plugin/extension. It then renders the markdown to HTML and saves the metadata in a separate property on the markdown instance. This is the data we need for further processing; we extend it with a fileName property set to markdownFilePath and return the object serialized as JSON.

Now, if you don't want to use a SPA for rendering out the static HTML, you could just as well use the HTML variable in the above code snippet and combine that with your template HTML, and write it out to blob storage as an .HTML file.
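That template route could look roughly like the sketch below. It is not part of the application in this post; the placeholder syntax ({{title}}, {{content}}) and the function names are assumptions made for illustration.

```typescript
interface PostMeta {
  title: string;
}

const HTML_ESCAPES: Record<string, string> = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };

// Escape text that ends up inside HTML, such as the title from the metadata.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"]/g, ch => HTML_ESCAPES[ch] || ch);
}

// Hypothetical helper: fill a template with the metadata and the rendered markdown.
function applyTemplate(template: string, meta: PostMeta, bodyHtml: string): string {
  // Replacer functions are used so that "$" sequences in the content are inserted literally.
  return template
    .replace("{{title}}", () => escapeHtml(meta.title))
    .replace("{{content}}", () => bodyHtml);
}

const page = applyTemplate(
  "<title>{{title}}</title><main>{{content}}</main>",
  { title: "Fish & chips" },
  "<p>Costs $1 only</p>"
);
```

The resulting string is what you would write to blob storage as the .html file for the post.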

A part of the magic of the above code sample is in the bindings. The Azure Functions runtime injects the trigger message and the blob content into our function as parameters. To let the runtime inject these, we define the following function.json file with binding definitions:

{
  "bindings": [
    {
      "name": "markdownFilePath",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "get-markdown-metadata",
      "connection": "ConnectionString_STORAGE"
    },
    {
      "name": "markdownFile",
      "type": "blob",
      "path": "{queueTrigger}",
      "connection": "ConnectionString_STORAGE",
      "direction": "in",
      "dataType": "string"
    },
    {
      "name": "$return",
      "type": "queue",
      "direction": "out",
      "queueName": "markdown-metadata",
      "connection": "ConnectionString_STORAGE"
    }
  ]
}

The first binding is a trigger that activates as soon as a new message arrives in the storage queue, named get-markdown-metadata. The message content is the filename of the modified markdown file.

The second binding provides us with the content of the markdown file. To get the path of the markdown file we use the dynamic variable {queueTrigger} to get the message content from the queue that activated the Azure Function.

The last binding is the binding on the return value of the function and writes out the return value in a different storage queue named markdown-metadata.
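The function that picks up messages from the markdown-metadata queue is not shown in this post, but its core decision can be sketched as a pure function. The shape of the index entries and all names below are assumptions; only the 'published' state and the fileName property come from the text above.

```typescript
interface PostMetadata {
  fileName: string;
  title: string;
  state: string;
}

interface IndexUpdate {
  index: PostMetadata[];  // new content for the index file
  shouldRender: boolean;  // true when the post must be queued for publication
}

// Hypothetical helper: merge one metadata message into the index.
function applyMetadata(index: PostMetadata[], message: string): IndexUpdate {
  const meta = JSON.parse(message) as PostMetadata;
  // Replace any existing entry for the same file so the index holds the latest information.
  const others = index.filter(entry => entry.fileName !== meta.fileName);
  return {
    index: [...others, meta],
    shouldRender: meta.state === "published",
  };
}

const currentIndex: PostMetadata[] = [{ fileName: "hello.md", title: "Hello", state: "draft" }];
const published = applyMetadata(
  currentIndex,
  JSON.stringify({ fileName: "hello.md", title: "Hello world", state: "published" })
);
```

Keeping this logic free of bindings makes it easy to test; the surrounding function only has to read the index blob, call the helper, write the index back, and push the slug to the render queue when shouldRender is true.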

Generate static files

I want to enhance my blog later on to become more dynamic, and I will use a SPA (single-page application) framework to do this. For now, generating static files using a SPA framework might look a bit odd, but it will be instrumental, as will be revealed in a future blog post :-).

One of the downsides of a SPA is that it is Client-Side Rendered by default, which isn't optimal for visitors that depend upon the static content and it also requires a little bit of time to initialize the SPA framework on the first load of the page. An example of a visitor that isn't starting up your SPA application is a search engine and it will miss out on most of your content. Luckily, there are a few options to mitigate the downsides.

Enhancing

With the enhance technique, you take a static (or server-side rendered) part of the site (rendered using another framework such as ASP.NET) and progressively enhance it using client-side code. This technique works well if the page has static content and doesn't need any dynamic content on each page load to render/understand the page. Content doesn't need to be static forever; the number of reads/views of the content just needs to exceed the number of writes/modifications to the content.

Examples of these might be one blog post, a product page, and the news section.

This technique works well in a serverless context because we only need CPU cycles to generate static content from time to time. You will need to think about the amount of content you have and the timeframe in which you require the static content to refresh. It does its job right if the number of views is higher than the number of times the content is regenerated.

Server-side rendering

With the SSR (Server-Side Rendering) technique, you run the framework on the server side on each request to dynamically generate the first view that the client will be presented with. Now, this doesn't feel like anything new, since we've been doing that for ages using ASP.NET.

The main difference with this technique is that you use the same SPA framework as on the client side and run it using Node.js on the server. This allows you to have one code base and let the framework handle the rehydration of the page from the static content.

An example of this might be a (very active) discussion on a discussion board. You want to present the latest discussions at page load, and let the client-side rendering handle the new posts that arrive after the first page load. Alternatively, if you have a profile page whose content changes every hour but that only receives a visitor once a week, SSR might also be a better fit.

You can use this technique in a serverless manner, but you will need to keep in mind that it will require CPU cycles for each request because you need to render on each request. This works great if you have a large amount of content and the change rate is higher than the read/visitor rate, or if you need to render pages with a 1-to-1 rate of writes/modifications to reads/visits.
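The rule of thumb from these two sections fits in a few lines. This is a deliberate oversimplification with made-up names; a real decision would also weigh render time, cost per execution, and how fresh the content needs to be.

```typescript
type RenderingStrategy = "generate-static-and-enhance" | "server-side-rendering";

// Hypothetical helper: compare how often content is read with how often it changes.
function chooseRenderingStrategy(readsPerDay: number, writesPerDay: number): RenderingStrategy {
  // Pre-generating pays off only when each generated page is viewed more often than it is regenerated.
  return readsPerDay > writesPerDay ? "generate-static-and-enhance" : "server-side-rendering";
}

// A blog post: edited once, read many times.
const blogPostStrategy = chooseRenderingStrategy(500, 1);
// The profile page from the example: changes every hour, one visitor a week.
const profilePageStrategy = chooseRenderingStrategy(1 / 7, 24);
```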

The implementation

The SPA framework I like to use is Aurelia, which has been around since late 2015. The framework consists of a set of libraries that can be used together as a robust framework. Because of this separation, and because the libraries can be used in many different use cases, the framework has provided high extensibility from the start of its development. One example of that is the PAL (platform abstraction library), which is used throughout the libraries to abstract away the dependency on an actual browser. This means we can use it with a 'virtual browser' implementation in Node.js. The next version of Aurelia, which I will use in this post, contains a similar implementation built on top of JSDOM in the library @aurelia/runtime-html-jsdom, which runs perfectly inside an Azure Function.

A small disclaimer: the next version of Aurelia (vNext or 2) is still under development, which means it might not be the best choice for production usage at the time of writing this blog, but for this blog post I accept that things might be different in the final release of the next version of Aurelia.

Side note: I will create a separate blog post later on that walks through the Aurelia application step by step. In this post, I will focus on rendering the static file content using Azure Functions and starting up the SPA app.

On my first attempt to generate static pages, I created code to start Aurelia and used @aurelia/runtime-html-jsdom, which worked smoothly for everything related to Aurelia. One of the things that didn't work as well was the webpack plugin style-loader, because I could not find a way to provide or inject a custom implementation of the DOM; it seems to have a hard dependency on objects in the browser. The easiest way around this was to load it inside the 'virtual browser' (created by JSDOM), where all the objects it requires exist.

Let's first look at the code required to render out the static page:

import { AzureFunction, Context } from "@azure/functions";
import * as jsdom from 'jsdom';
import * as fetch from 'node-fetch';

const queueTrigger: AzureFunction = async function (context: Context, slug: string): Promise<void> {

    context.log('Slug to render', slug);

    // Retrieve the SPA application html and javascript bundle
    const mainjs = await getFile('main.js');
    const indexhtml = await getFile('index.html');

    // Create a new JSDOM instance and use the index.html as the open document
    const dom = new jsdom.JSDOM(indexhtml, {
        contentType: "text/html",
        includeNodeLocations: true,
        pretendToBeVisual: true,
        storageQuota: 10000000,
        runScripts: "dangerously",
        resources: "usable"
    });

    // JSDOM has no default support for fetch, let's add it because we use fetch for performing calls to our API in our SPA app
    dom.window.fetch = fetch["default"];

    // Once JSDOM is done loading all the content (our index file), render the page.
    // The work is wrapped in a promise so this async function only completes after the file is saved.
    await new Promise<void>((resolve, reject) => {
        dom.window.document.addEventListener("DOMContentLoaded", async function () {
            try {
                // Tell JSDOM to load our webpack bundle and execute it
                dom.window.eval(mainjs);

                // Wait for the Aurelia application to start
                await dom.window.au.wait();

                // Change the url to let the aurelia-router open the component blog-post with the specified slug (the component will load the file from our get-post API)
                dom.window.location.hash = `blog-post(${slug})`;

                // Wait a second for the routing to complete
                await new Promise(wake => setTimeout(wake, 1000));

                // Serialize the state of the DOM to a string
                let result = dom.serialize();

                // Replace the bundle, so that the app doesn't directly startup when the page is loaded (we want to keep it static for now)
                result = result.replace('<script type="text/javascript" src="main.js"></script>', '');

                // Store the result; settling the promise tells Azure Functions we are done
                const errorCode = await saveFile(slug, result);
                if (errorCode) {
                    reject(new Error(`Saving the rendered page failed: ${errorCode}`));
                } else {
                    resolve();
                }
            } catch (error) {
                reject(error);
            }
        });
    });
};

export default queueTrigger;
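Removing the bundle with an exact string replace only works as long as the serialized script tag matches character for character. A slightly more tolerant variant is sketched below; it is still a regular expression and not an HTML parser, and the function name is my own.

```typescript
// Hypothetical helper: remove every <script> tag that loads the given file.
function stripScript(html: string, src: string): string {
  // Escape characters that have a special meaning in a regular expression.
  const escapedSrc = src.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<script\\b[^>]*\\bsrc=["']${escapedSrc}["'][^>]*>\\s*</script>`,
    "gi"
  );
  return html.replace(pattern, "");
}

const staticHtml = stripScript(
  '<body><p>post</p><script type="text/javascript" src="main.js"></script></body>',
  "main.js"
);
console.log(staticHtml); // prints "<body><p>post</p></body>"
```

This copes with a different attribute order, single quotes, or extra attributes such as defer, while leaving other script tags untouched.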

As you can see, in this case we don't use blob input or output bindings. This is because, at the time of writing this blog post, accessing blobs in the $web container (which Azure Blob Storage static site hosting uses as its root container) through bindings is still not supported, or I could not find a way to escape the $ character.

What we can do for the time being is use the Azure Blob Storage SDK to get and save the files ourselves. The functions getFile and saveFile in the code block below will do that for us. It's a bit less pleasant, but it also gives us insight into how much code we can save/remove by using the Azure Functions bindings :-)

import {
  Aborter,
  BlockBlobURL,
  ContainerURL,
  ServiceURL,
  SharedKeyCredential,
  StorageURL} from '@azure/storage-blob';

// credentials should not be in code, but just here to make it easier to read
const storageAccount = 'storage-account-name';
const pipeline = StorageURL.newPipeline(new SharedKeyCredential(storageAccount, 'key'));
const serviceURL = new ServiceURL(`https://${storageAccount}.blob.core.windows.net`, pipeline);
const containerURL = ContainerURL.fromServiceURL(serviceURL, '$web');

async function getFile(file) {   
    const blockBlobURL = BlockBlobURL.fromContainerURL(containerURL, file);
    const aborter = Aborter.timeout(30 * 1000);
    const downloadResponse = await blockBlobURL.download(aborter, 0);
    return await streamToString(downloadResponse.readableStreamBody);
}

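async function streamToString(readableStream: NodeJS.ReadableStream | undefined): Promise<string> {
    if (!readableStream) {
      throw new Error("Download response has no readable stream body");
    }
    return new Promise<string>((resolve, reject) => {
      // Collect raw buffers and decode once at the end, so multi-byte UTF-8
      // characters that are split across chunks are not corrupted.
      const chunks: Buffer[] = [];
      readableStream.on("data", data => {
        chunks.push(Buffer.isBuffer(data) ? data : Buffer.from(data));
      });
      readableStream.on("end", () => {
        resolve(Buffer.concat(chunks).toString("utf8"));
      });
      readableStream.on("error", reject);
  });
}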

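async function saveFile(slug: string, content: string) {

  // Blob names use forward slashes as virtual directory separators
  const blockBlobURL = BlockBlobURL.fromContainerURL(containerURL, `${slug}/index.html`);
  // The upload length is in bytes, which differs from content.length for non-ASCII text
  const uploadBlobResponse = await blockBlobURL.upload(Aborter.none, content, Buffer.byteLength(content), {
    blobHTTPHeaders: {
      // The charset belongs in the content type; Content-Encoding is meant for compression such as gzip
      blobContentType: "text/html; charset=utf-8",
    }
  });

  return uploadBlobResponse.errorCode;
}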

The only content left for the above function is the function.json file that contains our binding information.
As you can see we generate a new static page as soon as we get a new item in the render-static-page storage queue.
The slug we push into the queue is a short identifier for the blog post itself, mostly with dashes to create a readable URL.

{
  "bindings": [
    {
      "name": "slug",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "render-static-page",
      "connection": "connectionString_STORAGE"
    }
  ],
  "scriptFile": "../dist/RenderFile/index.js"
}
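Because the slug ends up both in the URL and in the blob path, it helps to derive it in one predictable way. The post does not show how slugs are created, so the function below is only a sketch of one common approach.

```typescript
// Hypothetical helper: turn a post title into a lowercase, dash-separated slug.
function toSlug(title: string): string {
  return title
    .toLowerCase()
    .normalize("NFKD")                // split accented characters into base letter + mark
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .replace(/[^a-z0-9]+/g, "-")      // collapse everything else into single dashes
    .replace(/^-+|-+$/g, "");         // trim dashes at both ends
}

const exampleSlug = toSlug("Building a serverless blog site on Azure");
console.log(exampleSlug); // prints "building-a-serverless-blog-site-on-azure"
```

A slug produced this way contains only letters, digits, and dashes, so it is safe to use as a folder name in the $web container.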

So what are our approximate monthly running costs?

€1.18 a month for an Active Directory user
~ €0.32 for hosting our content on Azure Storage
~ €2.00 for serving our content using the Azure CDN

So for about €3.50 a month, the price of a coffee or a beer at a café, we are able to serve our application in optimal conditions around the world.

Where can we go next?

There are a lot of different services in Azure that you can attach to your system, as well as external systems you can talk to using webhooks.

A few examples are:

Generate an audio version of a post using Azure Cognitive Services text to speech
Tweet new blog post created (Azure Function => twitter API)
Notify Microsoft Teams channel (Azure Function => Teams API)
Generate PDF/ EPUB (Azure Function)

I hope this article inspires you to think differently about the things you need to build, and shows that you don't always need an App Service or VM that costs money while it's idle.