DEV Community

Stackfindover


How to create code compressor in JavaScript | HTML Minifier

Hello guys, today I am going to show you how to create an HTML minifier using HTML, CSS and JavaScript. In this article, I will build a simple code minifier with a few lines of JavaScript.

HTML Minifier step by step

Step 1 — Creating a New Project

In this step, we need to create a new project folder and two files (index.html and style.css) for the HTML minifier. In the next step, you will start creating the structure of the webpage.


Step 2 — Setting Up the basic structure

In this step, we will add the HTML code to create the basic structure of the project.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>How to make html minifier</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>

</body>
</html> 

This is the base structure of most web pages that use HTML.
Add the following code inside the <body> tag:

<section class="codeminify">
        <textarea class="simplecode" placeholder="Paste or type your data here..."></textarea>
        <button id="htmlMinify">Minify HTML</button>
        <textarea placeholder="Output" class="minifycode"></textarea>
</section>

Step 3 — Adding Styles for the Classes

In this step, we will add styles for the section and its child elements inside the style.css file.

@import url('https://fonts.googleapis.com/css2?family=Poppins:wght@300&display=swap');
* {
    padding: 0;
    margin: 0;
    font-family: 'Poppins', sans-serif;
}
body {
    display: flex;
    align-items: center;
    justify-content: center;
    height: 100vh;
    width: 100vw;
    overflow: hidden;
}
.codeminify {
    display: grid;
    grid-template-columns: auto auto auto;
}
textarea {
    padding: 10px;
    min-width: 420px;
    min-height: 300px;
    font-size: 16px;
}
button#htmlMinify {
    display: block;
    width: 150px;
    height: 40px;
    font-size: 16px;
    font-weight: 600;
    background: #4b00ff;
    color: #fff;
    border: transparent;
    cursor: pointer;
    outline: 0;
    margin: 0 10px;
}

Step 4 — Adding some lines of JavaScript code

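In this step, we will add some JavaScript code to minify HTML code. Place the script just before the closing </body> tag, so that the button and textareas already exist when it runs.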

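<script>
        // Shorthand for looking up elements by tag name
        var $tag = function(tag) {
            return document.getElementsByTagName(tag);
        };

        // Reads HTML from the input textarea and writes the minified result to the output textarea
        function minify_html(input, output) {
            output.value = input.value
            // remove comments, but keep those starting with "[" or "?" (e.g. conditional comments)
            .replace(/<!--\s*?[^\s?\[][\s\S]*?-->/g, '')
            // remove whitespace between tags
            .replace(/>\s*</g, '><');
        }

        document.getElementById("htmlMinify").addEventListener("click", function() {
            minify_html($tag('textarea')[0], $tag('textarea')[1]);
        }, false);
</script>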
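
To try the two replacements outside the page, they can be wrapped in a function that works on plain strings. This is only a minimal sketch: `minifyHtmlString` and the sample markup are hypothetical, made up for illustration, while the regular expressions do the same job as the ones in `minify_html`.

```javascript
// Hypothetical helper (not part of the page above): the same two
// replacements as minify_html, but taking and returning plain strings.
function minifyHtmlString(html) {
  return html
    // remove comments, except those starting with "[" or "?"
    .replace(/<!--\s*?[^\s?\[][\s\S]*?-->/g, '')
    // remove whitespace between a ">" and the next "<"
    .replace(/>\s*</g, '><');
}

// Sample input, made up for this example
const sample = `<ul>
  <!-- navigation -->
  <li>Home</li>
  <li>About</li>
</ul>`;

console.log(minifyHtmlString(sample));
// <ul><li>Home</li><li>About</li></ul>
```

Keep in mind that the second replacement removes whitespace between any ">" and "<", including inside pre, textarea and script content, so this approach is best suited to simple markup.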

HTML Minifier final result

Top comments (7)

Piotr "MoroS" Mrożek • Edited

Why not simply parse the code into DOM nodes, remove the comment nodes and the whitespace-only text nodes, then output it back in text form? That's the first solution that came to my mind when I saw the title of this post. :)

Regex brings a level of uncertainty for nested content (HTML in a script, as described in evalPenny's comment above/below). DOM would treat it as a tag node with text nodes inside it. It would be easier to ignore those nodes than to write a regex that handles/ignores nested content.
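
A small sketch of the kind of nested content I mean. The two replacements do the same as the ones in the article; the `regexMinify` name and the sample inputs are made up for illustration.

```javascript
// Hypothetical wrapper around the article's two regex replacements.
function regexMinify(html) {
  return html
    .replace(/<!--\s*?[^\s?\[][\s\S]*?-->/g, '')
    .replace(/>\s*</g, '><');
}

// The spaces between the two <em> tags live inside a JavaScript string,
// so a minifier should not touch them.
const withMarkupInString = "<script>document.write('<em>a</em>   <em>b</em>');</script>";
console.log(regexMinify(withMarkupInString));
// <script>document.write('<em>a</em><em>b</em>');</script>

// Something that only looks like a comment inside a string is removed as well.
const withCommentInString = "<script>var s = '<!-- x -->';</script>";
console.log(regexMinify(withCommentInString));
// <script>var s = '';</script>
```

In both cases the regex changes what the script does, because it has no idea where the markup ends and the script text begins.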

Prabhu

Would you please explain a bit more?

Piotr "MoroS" Mrożek • Edited

But of course. :)

We can parse a string into a DOM Document with the DOMParser class. From there we can use a function to traverse the DOM and eliminate any comment nodes and whitespace-only text nodes (nodes have types assigned). This is going to be a bit lengthy:

Let's parse a sample document:

const dom = new DOMParser().parseFromString(`
<!doctype html>
<html>
    <head>
        <title>Test</title>
    </head>
    <body>
        <strong>Simple text<\/strong>
        <!-- comment -->
        <script>
            document.write('<em>This is not</em>      <em>a part of the document</em>');
            console.log('This is not as well');
        <\/script>
    </body>
</html>`, "text/html");

We have here a simple HTML document with new lines, tabs/spaces, a comment and a script block. I've had to escape the closing script tag or otherwise Firefox and VSCode were complaining (unterminated string).

Let's write a simple minify function (recursive - I'm lazy ;) ):

function minify(parent) {
  // childNodes is a live list that changes as nodes are removed,
  // so we take a snapshot of it before iterating
  const values = Array.from(parent.childNodes);
  for (const node of values) {
    if (node.nodeType === Node.COMMENT_NODE) {
      // remove comment node
      parent.removeChild(node);
    } else if (node.nodeType === Node.TEXT_NODE) {
      // test for pure whitespace node (not containing characters other than whitespace)
      if (!/\S/.test(node.nodeValue)) {
        // remove pure whitespace node
        parent.removeChild(node);
      }
    } else {
      // process child node recursively
      minify(node);
    }
  }
}

It's simple and won't turn into a mess once you try implementing corner cases (like preventing regex from parsing what's inside a script tag). It also gives more flexibility and control (as is the case with code vs regex).

Finally, let's use it:

console.log(`<!doctype ${dom.doctype.name}>\n${dom.childNodes[1].outerHTML}`); // original HTML
minify(dom);
console.log(`<!doctype ${dom.doctype.name}>${dom.childNodes[1].outerHTML}`); // minified HTML

Yes, I know doctypes are a bit more complex when you take pre-HTML5 document types into account, but for the sake of simplicity let's assume we're only dealing with the simple HTML5 doctype.
The first log will print the formatted HTML code generated from the unminified DOM Document. The second log will print it after minification (removal of unnecessary nodes). Outputs to compare below:

First logging - before minify:

<!doctype html>
<html><head>
        <title>Test</title>
    </head>
    <body>
        <strong>Simple text</strong>
        <!-- comment -->
        <script>
            document.write('<em>This is not</em>      <em>a part of the document</em>');
            console.log('This is not as well');
        </script>
</body></html>

Second logging - after minify:

<!doctype html><html><head><title>Test</title></head><body><strong>Simple text</strong><script>
            document.write('<em>This is not</em>      <em>a part of the document</em>');
            console.log('This is not as well');
        </script></body></html>

While the HTML document has been minified, the JavaScript code remained unchanged. In our minify function we could add another condition for detecting script tags and minifying them differently (e.g. compare node.nodeType === Node.ELEMENT_NODE and check if node.nodeName === 'SCRIPT').
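
That extra condition could look roughly like the sketch below. It is an assumption-laden variant, not a finished tool: the `minifySkippingRawContent` name and the list of skipped elements are my own picks for illustration, and the node type numbers are simply the values of Node.ELEMENT_NODE, Node.TEXT_NODE and Node.COMMENT_NODE written out.

```javascript
// Values of Node.ELEMENT_NODE, Node.TEXT_NODE and Node.COMMENT_NODE,
// spelled out so the function does not depend on a global Node object.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// Assumption: elements whose content should be left exactly as written.
const SKIPPED_ELEMENTS = new Set(['SCRIPT', 'STYLE', 'PRE', 'TEXTAREA']);

function minifySkippingRawContent(parent) {
  // snapshot the live child list before removing anything from it
  for (const node of Array.from(parent.childNodes)) {
    if (node.nodeType === COMMENT_NODE) {
      parent.removeChild(node);
    } else if (node.nodeType === TEXT_NODE) {
      // remove text nodes that contain nothing but whitespace
      if (!/\S/.test(node.nodeValue)) {
        parent.removeChild(node);
      }
    } else if (node.nodeType === ELEMENT_NODE && SKIPPED_ELEMENTS.has(node.nodeName)) {
      // leave the whole subtree alone; a JavaScript or CSS minifier
      // could be called here instead
      continue;
    } else {
      minifySkippingRawContent(node);
    }
  }
}

// Browser-only usage; skipped where DOMParser is not available.
if (typeof DOMParser !== 'undefined') {
  const doc = new DOMParser().parseFromString(
    '<pre>  keep   this  </pre> <p>text</p> <!-- gone -->', 'text/html');
  minifySkippingRawContent(doc);
  console.log(doc.body.innerHTML);
}
```

The uppercase names rely on nodeName being uppercase for HTML elements in an HTML document. For XML documents the comparison would need to change.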

It's just a simple example of how you could use DOM to minify your HTML. It could also be used as a parser for XML documents and such, among other use cases.
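
One limitation of the logging lines above is that `dom.doctype.name` and `dom.childNodes[1]` only work when the input has a doctype, so they fail on documents without one and on plain HTML chunks. A hedged sketch of a more tolerant output step follows; `serializeHtml` and its `isFragment` flag are hypothetical names, and the caller has to say whether the input was a chunk or a full document.

```javascript
// Hypothetical helper: turn a parsed (and already minified) document back
// into a string without assuming that a doctype is present.
function serializeHtml(doc, isFragment = false) {
  if (isFragment) {
    // DOMParser wraps chunks in html, head and body elements,
    // so only the body content is returned for them
    return doc.body.innerHTML;
  }
  // doc.doctype is null when the source had no doctype
  const doctype = doc.doctype ? `<!doctype ${doc.doctype.name}>` : '';
  return doctype + doc.documentElement.outerHTML;
}

// Browser-only usage; skipped where DOMParser is not available.
if (typeof DOMParser !== 'undefined') {
  const chunk = new DOMParser().parseFromString('<p>Just a chunk</p>', 'text/html');
  console.log(serializeHtml(chunk, true));
}
```

A caveat for chunks: elements that belong in the head (title, meta, link and so on) are moved there by the parser when they come first, so they would not appear in the body content.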

Cullan

I do like your answer and I think it has a great specific use case, but I am confused by the rigidity of this approach. For example, what if someone minifies a chunk of HTML that has no doctype, head or body? How would your code handle both full HTML files and HTML chunks? From my testing of your code, you can do one or the other but not both. Is there something I am missing? I do like your answer, but I am not sure it has the flexibility to minify any HTML you throw at it.

KieuDac201

What are nodes?

Piotr "MoroS" Mrożek

I was referring to Document Object Model nodes. More details in my answer to Prabhu's comment.

Stackfindover

Yes, this is only an HTML minifier for now. We will update it :)