Carl Mungazi

Posted on Sep 19, 2019

Learn JavaScript by building a UI framework: Part 4 - Creating A Module Bundler

#bundler #node #javascript #npm

This article is the fourth in a series of deep dives into JavaScript. You can view previous articles by visiting the GitHub repository associated with this project.

This series does not comprehensively cover every JavaScript feature. Instead, features are covered as they crop up in solutions to various problems. Also, every post is based on tutorials and open source libraries produced by other developers, so like you, I am learning new things with each article.

In the last article we added the functionality for our framework to create and render DOM elements, created an example application and then successfully tested it. Today we will cover the steps necessary to run our app in a browser.

The first step

At the moment, if we install a server like http-server and spin it up in the folder housing our example application, this error shows up in the browser console: Uncaught ReferenceError: require is not defined. This is because the require function only exists in the Node environment. It provides a way of accessing code that exists in separate files. The easiest (and most painful) way to replicate this behaviour in the browser would be to use <script> tags.

Before the advent of ES Modules, developers used (and still do) either the CommonJS or AMD formats to tackle this problem. And this is where build tools such as Webpack or Parcel come in. Conceptually, their work is straightforward. They gather all the files needed to run an application, work out the dependencies of those files and then create one big JavaScript file which can run in a web browser. The complexity comes in the how of that process and various other cool tricks such as hot reloading (creating a new bundle each time you save changes to a file) and tree shaking (eliminating unused code).

The first step in creating the bundler will be creating a command line tool so we can use npm scripts to initiate everything. Our framework aprender already has a package.json file so we begin by adding the following command.

{
  "name": "aprender",
  "version": "1.0.0",
  "description": "",
  "main": "",
  "scripts": {
    "test": "node tests",
    "demo": "maleta demo/index.html --entry index.js"
  }
}

At this point it is worth exploring what happens when we type npm run demo in the terminal. Before running the command, we first create a symlink between aprender and our build tool, which will be called maleta. The symlink is created by:

Creating a folder called maleta on the same folder level as aprender
In the terminal, navigate to maleta and type npm link
Navigate to aprender and type npm link maleta

When npm run demo is executed, npm grabs the scripts object in aprender's package.json file and runs whatever command is assigned to the property demo. The first part of the demo command is referencing maleta, our module bundler. npm will process maleta's package.json file and look for an object called bin. It looks like this:

"bin": {
  "maleta": "bin/cli.js"
}

The bin field in a package's package.json maps command names to executable files. The value of maleta is the path to the file cli.js, which contains the following code:

#!/usr/bin/env node

const program = require('commander');
const version = require('../package.json').version;
const bundler = require('../src/bundler');

program.version(version);

program
  .command('serve <filename>')
  .description('serves the files')
  .option(
    '--entry <file>',
    'set the name of the entry JS file'
  )
  .action(bundle);

program
  .command('help [command]')
  .description('display help information for a command')
  .action(function(command) {
    let cmd = program.commands.find(c => c.name() === command) || program;
    cmd.help();
  });

const args = process.argv;

// Make serve the default command except for --help
if (args[2] === '--help' || args[2] === '-h') args[2] = 'help';
if (!args[2] || !program.commands.some(c => c.name() === args[2])) args.splice(2, 0, 'serve');

program.parse(process.argv);

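// The first argument is the <filename> passed to serve (the entry HTML file);
// the JavaScript entry file, if given, arrives via the --entry option.
function bundle (entryFile, command) {
  bundler(entryFile, {
    entryJsFile: command.entry
  });
}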

This file is executed by your operating system's shell. npm does this by using Node's child_process module. The shebang #!/usr/bin/env node at the top of the file tells your operating system which interpreter or application to use when executing the file (if you are using Windows, this will be slightly different). When the node process is launched, any arguments specified are made available in the process.argv array. The first two elements are the absolute pathname of the executable that started the process and the path to the JavaScript file being executed. Every element from index two onwards is available to whatever code is being executed.

Maleta's CLI tool is built using commander. Commander exposes an object with a number of methods. We can use the version method to print the bundler version by typing maleta -V or maleta --version. After that we use the command method to begin creating our commands. command takes one argument written in the following syntax: command <requiredArg> [optionalArg]. Our CLI tool has two commands, one to serve the app and another to print help text. The string specified via description is displayed when a user runs the help command. The action method is used to specify the callback function which runs when the command is executed. It receives the argument(s) passed in via the <> or [] brackets and the commander object, which will have the names of any specified options among its properties.

Taking inspiration from Parcel, we make serve the default command if no command has been passed and then use commander's parse method to add the arguments to the commander object. Finally, bundle calls the imported bundler function with the entry file.
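To make the default-command logic easier to follow, here is a minimal sketch that applies the same two checks to a plain array instead of the real process.argv. The helper name withDefaultCommand and the example paths are made up for illustration; cli.js itself mutates process.argv directly.

```javascript
// Hypothetical helper mirroring the default-command logic in cli.js.
// It works on a copy so the caller's array is left untouched.
function withDefaultCommand(argv, knownCommands) {
  const args = argv.slice();

  // Treat the help flags as the help command
  if (args[2] === '--help' || args[2] === '-h') args[2] = 'help';

  // If the first user argument is not a known command, insert serve before it
  if (!args[2] || !knownCommands.includes(args[2])) args.splice(2, 0, 'serve');

  return args;
}

// The first two entries stand in for the node executable and the script path
const argv = ['/usr/bin/node', '/usr/local/bin/maleta', 'demo/index.html', '--entry', 'index.js'];
const normalised = withDefaultCommand(argv, ['serve', 'help']);

console.log(normalised.slice(2));
// [ 'serve', 'demo/index.html', '--entry', 'index.js' ]
```

Because demo/index.html is not a known command name, serve is inserted in front of it, which is why maleta demo/index.html behaves like maleta serve demo/index.html.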

The bundler at work

Maleta borrows much of its structure from Minipack, a similar project written by Ronen Amiel that explains how bundlers work. The only differences are that Maleta bundles both ES and CommonJS modules, has a CLI tool and spins up a server to run the app. At the core of our bundler's work is the dependency graph. This lists all the files used in an application along with any dependencies. Before building that graph, we will use the entry file to create a rootAsset object with the following structure:

const rootAsset = {
  outDir: '', // the path of the directory where the bundle will be created
  content: '', // the code in the file
  entryJsFilePath: '', // the path of the entry JavaScript file
  rootDir: '', // the path of the directory where the entry file lives
  dependencyGraph: '', // the dependencies of the entry file
  ast: '' // an abstract syntax tree created from the code in the file
}

Bundlers should be able to handle JavaScript or HTML files as the entry file but for simplicity, Maleta will only accept HTML files as the starting point. The function which creates the rootAsset object is:

function createRootAssetFromEntryFile(file, config) {
  rootAsset.content = fs.readFileSync(file, 'utf-8');
  rootAsset.rootDir = getRootDir(file);
  rootAsset.outDir = path.resolve('dist');

  if (config.entryJsFile) {
    rootAsset.ast = htmlParser(rootAsset.content);
    rootAsset.entryJsFilePath = path.resolve(rootAsset.rootDir, config.entryJsFile);
  } else {
    extractEntryJSFilePathFromEntryFile(rootAsset);
  }

  rootAsset.dependencyGraph = createDependencyGraph(rootAsset.entryJsFilePath);

  return rootAsset;
}

It receives the arguments passed into the bundler function by the CLI tool. The only interesting activities occur in the htmlParser, extractEntryJSFilePathFromEntryFile and createDependencyGraph functions. fs and path are node modules which are covered in the Node.js documentation and getRootDir does what its name states. Note: Reading the file synchronously with fs.readFileSync is not very performant as it is a blocking call but we are not too worried about that at this moment.
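getRootDir is not shown in this article. As an assumption about what it might look like, a version built on the path module could be as small as this:

```javascript
const path = require('path');

// Hypothetical implementation: resolve the entry file against the current
// working directory, then return the directory that contains it.
function getRootDir(file) {
  return path.dirname(path.resolve(file));
}

const rootDir = getRootDir('demo/index.html');

// The exact value depends on where the command is run, but it always ends in demo
console.log(path.basename(rootDir));
```

With something like this, running npm run demo from the aprender folder would give the absolute path of aprender/demo as the root directory.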

When we call htmlParser it receives the following content from our demo app:

<html>
  <head>
    <title>Hello, World</title>
  </head>
  <body>
    <div id="app"></div>
    <script src="./index.js"></script>
  </body>
</html>

htmlParser refers to the module posthtml-parser, a tool for parsing and turning HTML into an abstract syntax tree (AST). Our npm command demo: maleta demo/index.html --entry index.js helps us easily find the path to the related entry JavaScript file. However, if the --entry option is missing, we call extractEntryJSFilePathFromEntryFile.

function extractEntryJSFilePathFromEntryFile(rootAsset) {
  const parsedHTML = htmlParser(rootAsset.content);

  rootAsset.ast = parsedHTML;
  parsedHTML.walk = walk;

  parsedHTML.walk(node => {
    if (node.tag === 'script') {
      // inline scripts have no src attribute, so guard before reading it
      const src = node.attrs && node.attrs.src;
      if (src && src.endsWith('/index.js')) {
        rootAsset.entryJsFilePath = path.resolve(rootAsset.rootDir, src);
      }
    }

    return node;
  });

  if (!rootAsset.entryJsFilePath) throw new Error('No JavaScript entry file has been provided or specified. Either specify an entry file or make sure the entry file is named \'index.js\'');
}

The only difference here is posthtml's walk method which we have attached to the AST. We use it to traverse the tree and ensure the HTML file has a link to a JavaScript file called index.js.
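To show what that traversal is doing without installing posthtml, here is a rough sketch that walks a hand-written tree. The tree shape (strings for text, objects with tag, attrs and content for elements) is an assumption modelled on posthtml-parser's output, and findEntryScript and the /projects/aprender/demo path are made up for this example.

```javascript
const path = require('path');

// Hand-written stand-in for the AST of the demo's index.html
const tree = [
  {
    tag: 'html',
    content: [
      { tag: 'head', content: [{ tag: 'title', content: ['Hello, World'] }] },
      {
        tag: 'body',
        content: [
          { tag: 'div', attrs: { id: 'app' } },
          { tag: 'script', attrs: { src: './index.js' } }
        ]
      }
    ]
  }
];

// Hypothetical helper: depth-first search for a script whose src ends in /index.js
function findEntryScript(nodes) {
  for (const node of nodes) {
    if (typeof node !== 'object' || node === null) continue; // text nodes are strings

    const src = node.tag === 'script' && node.attrs && node.attrs.src;
    if (src && src.endsWith('/index.js')) return src;

    if (Array.isArray(node.content)) {
      const found = findEntryScript(node.content);
      if (found) return found;
    }
  }
  return null;
}

const entrySrc = findEntryScript(tree);
const entryPath = path.resolve('/projects/aprender/demo', entrySrc);

console.log(entrySrc); // ./index.js
```

The real function relies on posthtml's walk to visit every node, but the check applied to each script tag is the same.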

Building the dependency graph

Our graph will be an array of objects listing every module in the application. Each object will have:

an id
the code from the module
the original filename
an array of the relative file paths of that module's dependencies
an object with the ids of those same dependencies.

The first thing createDependencyGraph does is create the main asset from the entry JavaScript file using this function:

function createJSAsset(filename) {
  const content = fs.readFileSync(filename, 'utf-8');
  const ast = babylon.parse(content, { sourceType: 'module' });

  const relativeFilePathsOfDependenciesArray = [];

  traverse(ast, {
    ImportDeclaration({ node }) {
      relativeFilePathsOfDependenciesArray.push(node.source.value)
    },
    CallExpression({ node }) {
      const { callee, arguments: args } = node;
      // only static calls such as require('./file') can be resolved ahead of time
      if (
        callee.name === 'require' &&
        args.length === 1 &&
        args[0].type === 'StringLiteral'
      ) {
        relativeFilePathsOfDependenciesArray.push(args[0].value);
      }
    }
  });

  const id = moduleID++;

  const { code } = transformFromAstSync(ast, null, {
    presets: ['@babel/env'],
    cwd: __dirname
  });

  return {
    id,
    code,
    filename,
    relativeFilePathsOfDependenciesArray,
    mapping: {}
  }
}

babylon is the JavaScript parser used by babel. Its parse method parses the given code as a JS program, and the options object passed as the second argument tells it whether it is dealing with a module or a script. Its output is an AST in the babel AST format. We use it with the babel plugin traverse (babel-traverse) to find all the dependency references. ImportDeclaration visits every ES module import whilst CallExpression visits every function call expression, which lets us check whether the function being called is require.
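The check inside the CallExpression visitor can be tried out on its own. The sketch below uses two hand-written nodes in the shape babel's parser produces for call expressions; getRequiredPath is a hypothetical helper, not part of Maleta.

```javascript
// AST node for: require('./render')
const requireCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'require' },
  arguments: [{ type: 'StringLiteral', value: './render' }]
};

// AST node for: require(name), where the path is only known at runtime
const dynamicCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'require' },
  arguments: [{ type: 'Identifier', name: 'name' }]
};

// Hypothetical helper: returns the required path, or null if the node
// is not a require call with a single string literal argument
function getRequiredPath(node) {
  const { callee, arguments: args } = node;
  const isStaticRequire =
    callee.name === 'require' &&
    args.length === 1 &&
    args[0].type === 'StringLiteral';

  return isStaticRequire ? args[0].value : null;
}

console.log(getRequiredPath(requireCall)); // ./render
console.log(getRequiredPath(dynamicCall)); // null
```

The second case is skipped because the bundler works before the code runs, so it can only follow paths written as string literals.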

The next task is to transform the JavaScript code in the file. transformFromAstSync is a method from the babel/core module and it turns our AST into the final code which will run in the browser. It can also create a source map, although we only use the generated code here. In the config object it is important to set the working directory to maleta otherwise any file paths will be resolved to whichever directory is running maleta, which in our case is aprender.

Once the main asset has been created from the entry JavaScript file, it is added to the assetQueue array for processing. This array is a queue which will eventually contain assets representing every JavaScript file in the application. The relationship between each asset and its dependencies is stored in an object called mapping. Each property on this object pairs the relative file path of a dependency with that dependency's id.
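The queue-based traversal described above might look roughly like the sketch below. This is not Maleta's actual createDependencyGraph, which is not shown in this article; it is loosely modelled on the approach used in Minipack. To keep it runnable without any dependencies, an in-memory files object stands in for the file system, a regular expression stands in for the babel traversal, and every dependency is assumed to be a .js file. All file names are made up.

```javascript
const path = require('path');

// In-memory stand-in for the file system
const files = {
  '/app/demo/index.js': "const aprender = require('../src/aprender');",
  '/app/src/aprender.js': "const render = require('./render'); module.exports = { render };",
  '/app/src/render.js': 'module.exports = function render() {};'
};

let moduleID = 0;

// Simplified createJSAsset: a regular expression replaces the babel traversal
function createJSAsset(filename) {
  const code = files[filename];
  const relativeFilePathsOfDependenciesArray = [];
  const requirePattern = /require\(['"](.+?)['"]\)/g;
  let match;

  while ((match = requirePattern.exec(code)) !== null) {
    relativeFilePathsOfDependenciesArray.push(match[1]);
  }

  return { id: moduleID++, code, filename, relativeFilePathsOfDependenciesArray, mapping: {} };
}

function createDependencyGraph(entryFile) {
  const assetQueue = [createJSAsset(entryFile)];

  // for...of keeps going as new assets are pushed onto the end of the queue
  for (const asset of assetQueue) {
    const dirname = path.posix.dirname(asset.filename);

    asset.relativeFilePathsOfDependenciesArray.forEach(relativePath => {
      const absolutePath = path.posix.join(dirname, relativePath) + '.js';
      const child = createJSAsset(absolutePath);

      asset.mapping[relativePath] = child.id; // relative path -> module id
      assetQueue.push(child);
    });
  }

  return assetQueue;
}

const graph = createDependencyGraph('/app/demo/index.js');

console.log(graph.map(asset => asset.mapping));
// [ { '../src/aprender': 1 }, { './render': 2 }, {} ]
```

One simplification worth noting: a file required from two different places would be added to the queue twice here, so a fuller version would need to keep track of the files it has already seen.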

Creating the bundle

function createBundle(entryFile, config) {
  let modules = '';
  let bundle;
  const rootAsset = createRootAssetFromEntryFile(entryFile, config);
  const bundlePath = path.resolve(rootAsset.outDir, 'index.js');
  const bundleHtml = htmlRender(rootAsset.ast);
  const bundleHtmlPath = path.resolve(rootAsset.outDir, 'index.html');

  // ...
}

createBundle is the function used by our CLI to kickstart the bundling process. createRootAssetFromEntryFile performs all the steps listed above and returns a rootAsset object. From that, we create the file paths for the output files. We also use htmlRender (which is actually posthtml-render) to turn the AST we grabbed from the entry HTML file back into an HTML string. The next step is to iterate over the dependency graph and create the bundled code like so:

function createBundle(entryFile, config) {
  // ...

  rootAsset.dependencyGraph.forEach(mod => {
    modules += `${mod.id}: [
      function (require, module, exports) {
        ${mod.code}
      },
      ${JSON.stringify(mod.mapping)},
    ],`;
  });

  bundle = `
    (function(modules) {
      function require(id) {
        const [fn, mapping] = modules[id];

        function localRequire(name) {
          return require(mapping[name]);
        }

        const module = { exports: {} };

        fn(localRequire, module, module.exports);

        return module.exports;
      }

      require(0);
    })({${modules}})
  `;

  // ...
}
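To see the template strings above produce something that actually runs, here is a small sketch that builds a bundle from a two-module graph and evaluates it with node's vm module instead of a browser. The graph contents and the result object are invented for the example; the string-building code is the same as in createBundle.

```javascript
const vm = require('vm');

// A tiny hand-written dependency graph (already "transpiled" code)
const graph = [
  { id: 0, code: "var add = require('./add'); result.value = add(2, 3);", mapping: { './add': 1 } },
  { id: 1, code: 'module.exports = function (a, b) { return a + b; };', mapping: {} }
];

// Build the modules object as source text, one entry per module
let modules = '';
graph.forEach(mod => {
  modules += `${mod.id}: [
    function (require, module, exports) {
      ${mod.code}
    },
    ${JSON.stringify(mod.mapping)},
  ],`;
});

// Wrap the modules in the same runtime used by createBundle
const bundle = `
  (function(modules) {
    function require(id) {
      const [fn, mapping] = modules[id];

      function localRequire(name) {
        return require(mapping[name]);
      }

      const module = { exports: {} };

      fn(localRequire, module, module.exports);

      return module.exports;
    }

    require(0);
  })({${modules}})
`;

// Run the bundle in a fresh context; result is the only global it can see
const sandbox = { result: {} };
vm.runInNewContext(bundle, sandbox);

console.log(sandbox.result.value); // 5
```

The bundle is just a string until something evaluates it, which is why the real bundler writes it to a file for the browser to load.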

The bundle explained

The bundle is an immediately invoked function expression (IIFE), a JavaScript function that runs as soon as it is defined. We build its source as a string, assign that string to the bundle variable and pass in the modules object as the argument. Each module is an array with a function that executes code for that module as its first element and the module/dependency relationship as its second element.

The first thing the IIFE does is create a require function which takes an id as its only argument. In this function, we destructure the array and access the function and mapping object of each module. The modules will have require() calls to relative file paths and some might make calls to the same file paths even though they are referring to different dependencies. We handle that by creating a dedicated local require function which turns file paths into module ids.

For example, in our demo application the require(0) call at the end of the IIFE results in the following:

function require(id) {
  const [fn, mapping] = modules[id];
  /* the value of fn */
    function (require, module, exports) {
      "use strict";
      var aprender = require('../src/aprender');
      var button = aprender.createElement('button', {
        children: ['Click Me!']
      });
      var component = aprender.createElement('div', {
        attrs: {
          id: 'root-component'
        },
        children: ['Hello, world!', button]
      });
      var app = aprender.render(component);
      aprender.mount(app, document.getElementById('app'));
    }
  /* the value of mapping */ 
  {"../src/aprender": 1}
}

require('../src/aprender'); is really localRequire('../src/aprender'). Internally, localRequire makes the recursive call require(mapping['../src/aprender']). mapping['../src/aprender'] returns the value 1, which is the id of the entry JavaScript file's only dependency. Calling require(1) then works with the following values:

function require(id) {
  const [fn, mapping] = modules[id];
  /* the value of fn */
    function (require, module, exports) {
      "use strict";
      var createElement = require('./createElement');
      var render = require('./render');
      var mount = require('./mount');
      module.exports = {
        createElement: createElement,
        render: render,
        mount: mount
      };
    }

  /* the value of mapping */
  {"./createElement":2,"./render":3,"./mount":4}
}

Each time the code in our dependencies makes a require call, it will be destructured in this way. The rest of the code in the bundler IIFE is:

function localRequire(name) {
  return require(mapping[name]);
}

const module = { exports: {} };

fn(localRequire, module, module.exports);

return module.exports;

localRequire wraps the recursive call we explained above and fn(localRequire, module, module.exports) executes the function we destructured at the beginning of the require function. All the exports from the dependencies of the module in question will be stored in the module object. In our demo application, createElement, render and mount all export functions and an object with all these exports is the value of the aprender module.

Serving the bundle

Once the bundle is ready, we create an output directory, create the index.js and index.html files for the demo application and then serve them using http and serve-static.

function createBundle(entryFile, config) {

  //...

  // create the output directory if it does not exist
  if (!fs.existsSync(rootAsset.outDir)) {
    fs.mkdirSync(rootAsset.outDir);
  }


  // create output html and js files
  fs.writeFileSync(bundlePath, bundle);
  fs.writeFileSync(bundleHtmlPath, bundleHtml);

  // create server and serve files
  const serve = serveStatic(rootAsset.outDir); 
  const server = http.createServer( function onRequest(req, res) {
    serve(req, res, finalhandler(req, res));
  });

  server.listen(3000);
  console.log(`${chalk.bold('Now serving the application on')} ${chalk.red('http://localhost:3000')}`);
}

Summary

The bundler we created is by no means perfect and no doubt contains many holes and candidates for improvement. However, it is functional and that is the most important thing. We have reached a stage in our project where we can view our application in a browser. In the next article, we will return to our UI framework and add the functionality which will allow us to create a more complicated demo application.
