Piloting Puppeteer with PureScript - Part 1

#webdev #javascript #functional

tl;dr Here's the GitHub repo showing all this in action.

Functional languages are rarely the off-the-shelf choice for I/O-intensive asynchronous tasks like piloting a headless browser. I find, though, that this is a place where functional programming shines. In addition to helping guarantee the correctness of the code (i.e., no pesky nulls or undefineds), it provides a step-by-step framework that helps you reason about what is going on.

In this series of articles, I'd like to show you how you can pilot Puppeteer on AWS Lambda using PureScript. I hope that, by the end, you'll see how functional programming can be a good fit for these sorts of tasks.

Comparing Puppeteer JS to Puppeteer PureScript

Below is a snippet showing how to use the chrome-aws-lambda Puppeteer plugin, copied from the README and edited a bit for clarity.

const chromium = require('chrome-aws-lambda');

exports.handler = async (event, context, callback) => {
  let result = null;
  let browser = null;

  try {
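    // Declared with const so it does not leak as an implicit global.
    const executablePath = await chromium.executablePath;
    // launchBrowser is a helper (not shown) that wraps the launch call from the README.
    browser = await launchBrowser(executablePath);
    const page = await browser.newPage();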
    await page.goto(event.url || 'https://example.com');
    result = await page.title();
  } catch (error) {
    return callback(error);
  } finally {
    if (browser !== null) {
      await browser.close();
    }
  }

  return callback(null, result);
};

Compare that to the PureScript version.

handler ::
  Foreign ->
  Foreign ->
  LambdaCallback ->
  Effect Unit
handler event context callback =
  launchAff_
    $ bracket
        (executablePath >>= launchBrowser)
        close
        ( \browser -> do
            page <- newPage browser
            goto page "https://example.com"
            title page
        )
    >>= liftEffect
    <<< resolveCallback callback

Comparing the two, we can see that there's not much difference in the basic flow:

An instance of a browser is created.
A new page is created.
The page navigates to example.com.
The lambda returns the title.

One immediate benefit of PureScript compared to vanilla JS is type safety: if you write goto page 42, the program won't compile. This is the case in TypeScript, Elm, and Reason as well. Strongly typed languages help prevent bugs where you accidentally pass an invalid value and have to sort through error logs later, when headless Chrome can't navigate to 42 and crashes with error code 127.

Aff

An additional benefit of PureScript, and the main focus of this article, is the Aff monad. Aff-s are asynchronous, fiber-based computations in a monadic context. This endows them with several superpowers, like the ability to be forked, joined, and spawned, all of which are clunky in JS/TS.
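To give a feel for why I call this clunky in JS, here is a minimal sketch of fork, join, and kill built by hand on top of promises. The names forkDelayed, join, kill, and demo are made up for this illustration and are not part of any library; Aff fibers give you the equivalent operations without the manual bookkeeping.

```javascript
// Hypothetical helper: "fork" a delayed computation and hand back a handle.
// Promises have no built-in cancellation, so the timer and the reject
// function have to be tracked by hand.
function forkDelayed(ms, value) {
  let timer = null;
  let rejectFn = null;
  const promise = new Promise((resolve, reject) => {
    rejectFn = reject;
    timer = setTimeout(() => resolve(value), ms);
  });
  return {
    // "join": wait for the forked computation to finish.
    join: () => promise,
    // "kill": stop the timer and fail anyone who is joining.
    kill: (reason) => {
      clearTimeout(timer);
      rejectFn(new Error(reason));
    },
  };
}

async function demo() {
  const fast = forkDelayed(10, 'fast');
  const slow = forkDelayed(10000, 'slow');

  const first = await fast.join();
  // We no longer need the slow computation, so cancel it explicitly.
  slow.kill('no longer needed');

  let slowOutcome;
  try {
    slowOutcome = await slow.join();
  } catch (error) {
    slowOutcome = error.message;
  }
  return { first, slowOutcome };
}

demo().then((outcome) => console.log(outcome));
```

Every new kind of computation would need its own cancellation plumbing like this, which is the part that a fiber-based runtime takes off your hands.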

Aff-s can also be used to reason about how resources are used - how they're allocated, how they're released, and what they're used to make. This is done with the function bracket. Let's take a look at its signature:

bracket :: forall a b. Aff a -> (a -> Aff Unit) -> (a -> Aff b) -> Aff b
bracket acquire release use = ...

acquire is where you create a resource, release is where you clean it up irrespective of what happens when it's used, and use is where a is used to create a b. This is a bit like try/catch/finally, but it has several advantages:

It forces us to write the cleanup code that would otherwise live in an optional finally block.
It distinguishes between failure in the use stage and failure in the acquire stage, whereas try clumps these two together.
It always returns an Aff of type b, which makes it easier to do the next step as a continuation - in this case, the lambda callback. Compare this to the JavaScript, where the only way to get result to callback is by making result mutable, which is an invitation for disaster.
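To make the comparison concrete, here is a rough JavaScript analogue of bracket. It is only a sketch: this bracket function and makeFakeBrowser are hypothetical names written for this example (the fake browser stands in for Puppeteer so the snippet runs on its own), and unlike the Aff version it knows nothing about fiber cancellation.

```javascript
// Hypothetical JS analogue of Aff's bracket; not part of any library.
async function bracket(acquire, release, use) {
  // If acquire fails there is nothing to release, so the error simply
  // propagates. This keeps acquire failures separate from use failures.
  const resource = await acquire();
  try {
    // The value produced by use becomes the value of the whole bracket.
    return await use(resource);
  } finally {
    // Runs whether use succeeded or threw.
    await release(resource);
  }
}

// Stand-in for a browser so the sketch runs without Puppeteer.
function makeFakeBrowser(log) {
  return {
    newPage: async () => ({
      goto: async (url) => { log.push(`goto ${url}`); },
      title: async () => 'Example Domain',
    }),
    close: async () => { log.push('close'); },
  };
}

async function main() {
  const log = [];
  // No mutable result variable: the title is the value of the bracket.
  const title = await bracket(
    async () => makeFakeBrowser(log),
    (browser) => browser.close(),
    async (browser) => {
      const page = await browser.newPage();
      await page.goto('https://example.com');
      return page.title();
    }
  );
  return { title, log };
}

main().then((outcome) => console.log(outcome));
```

The difference from the PureScript version is that nothing here stops you from skipping the helper and writing a bare try without cleanup, whereas the type of bracket demands all three pieces.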

I find that the last point is the most important one. When I write a lambda in JS or TS, it's hard to remember to call the callback, and doing so often requires passing the callback around to lots of internal functions. Here, by using Aff, the callback is always the last thing called and it is called with an immutable result (here, the outcome of bracket).
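The same discipline can be approximated in JS by keeping the callback out of the internal functions entirely. The sketch below is an assumption about how one might do that, loosely inspired by resolveCallback callback in the PureScript handler; this JS resolveCallback and the computeTitle stand-in are hypothetical and written only for this example.

```javascript
// Hypothetical helper: call the Lambda-style callback exactly once,
// after the promise settles, with either the value or the error.
function resolveCallback(callback, promise) {
  // The two-argument form of then keeps an exception thrown by the
  // success branch from triggering a second call to the callback.
  return promise.then(
    (value) => callback(null, value),
    (error) => callback(error)
  );
}

// Stand-in for the real work (launching a browser, reading the title).
// It never sees the callback.
async function computeTitle(event) {
  if (!event.url) {
    throw new Error('missing url');
  }
  return `Title of ${event.url}`;
}

// The handler is a single expression: do the work, then call back.
const handler = (event, context, callback) =>
  resolveCallback(callback, computeTitle(event));

handler({ url: 'https://example.com' }, {}, (error, value) => {
  console.log(error ? error.message : value);
});
```

Internal functions only return values or throw, and the callback shows up in one place at the very end, which is the shape the Aff version gives you by construction.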

Given all the stuff that can go wrong when running a headless browser on a serverless function executing on bare metal somewhere in Ireland, it's nice to know that the orchestration of acquiring and releasing assets in an asynchronous context is predictable thanks to a rock-solid type system. And not just nice for us - it's nice for our users as well! This helps guarantee that Meeshkan users have smooth tests and videos on the Meeshkan service, both of which are produced on headless Chrome on AWS Lambda.

In the next article, we'll look at how to use type classes in PureScript to enforce consistent patterns in the writing of asynchronous code.
