In part 1 of this post, I presented to you my journey of writing a set of compilers to minify HTML/CSS/JS using PHP, along with the software I wrote to see how they stacked up against the competition.
These are the results.
I used Google, Packagist, and GitHub to find projects to go up against, picking the most widely used ones. Where the original project had been abandoned, I picked forks that maintained it, although some of those forks themselves hadn’t been updated in a while.
So here they are (Note stats may not be completely up to date):
The tests were run on a VPS running Ubuntu and PHP 8, with 2 Intel Xeon CPUs and 2GB RAM.
To understand the results, there are 3 metrics here that we need to look at:
- Compression Ratio – How much compression was achieved by each package (Gzipped and non-gzipped)
- Speed – How fast does each package perform the compression
- Reliability – Is the output valid
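As a rough sketch of the first metric, the compression ratio can be computed like this (the function names are mine for illustration, not from the actual benchmark code):

```php
<?php
// Illustrative sketch (not the benchmark's actual code): compression
// ratio as the percentage of bytes saved by minification.
function compressionRatio(string $original, string $minified): float
{
    return (1 - strlen($minified) / strlen($original)) * 100;
}

// The same metric measured after gzip, since servers usually
// compress responses before sending them over the wire.
function gzippedRatio(string $original, string $minified): float
{
    $originalGz = strlen(gzencode($original, 9));
    $minifiedGz = strlen(gzencode($minified, 9));
    return (1 - $minifiedGz / $originalGz) * 100;
}
```

Measuring the gzipped ratio separately matters because minification removes a lot of redundancy that gzip would otherwise have compressed away anyway, so the gzipped savings are usually smaller than the raw ones.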
The first two were fairly straightforward to calculate, but the third needed some extra programming to plug in packages or external services to validate the code, and there are some caveats:
- Not all the input is valid, so where the number of errors in the input equals the number of errors in the output, the result is considered valid
- The HTML and CSS minifiers both use the W3 validator service. For certain errors it doesn’t report the errors nested within that tree, so if the minifier fixes the error, the output may show more errors than the input when the tree in question has more errors underneath it, causing a false positive
Because of the false positives on the output, the software allows you to see the validation errors, and to view the output.
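The validity rule described in the first caveat can be sketched like this (a simplification for illustration; the real check plugs the code into the W3 validator service to get the error counts):

```php
<?php
// Sketch of the validity rule: since not all input is valid, output
// counts as valid when it reports the same number of validation
// errors as the input did (zero on both sides for clean input).
function isOutputValid(int $inputErrors, int $outputErrors): bool
{
    return $outputErrors === $inputErrors;
}
```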
For each metric, a score will be calculated like this:
This means that the slowest / lowest compression / least reliable will score 0 and the fastest / highest compression / most reliable will score 100, and other values will be graded between the highest and lowest.
To get the totals, the scores are added up, but since compression has two metrics (gzipped and non-gzipped), the reliability and speed scores are each multiplied by 2, so that the average of the two compression scores carries the same weight as each of the other metrics.
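My reading of the scoring can be sketched like this (the helper names are mine, not from the benchmark code): each metric is min-max scaled to 0–100, and the total double-weights speed and reliability so the two compression scores together count the same as each of the other metrics.

```php
<?php
// Sketch of the scoring (my reading of the description above):
// min-max scale a value so the worst result in the field scores 0
// and the best scores 100. For speed, pass the slowest time as
// $worst and the fastest as $best; the formula still works because
// both numerator and denominator flip sign.
function score(float $value, float $worst, float $best): float
{
    if ($best === $worst) {
        return 100.0; // all packages tied on this metric
    }
    return ($value - $worst) / ($best - $worst) * 100;
}

// Totals: speed and reliability are doubled so that the two
// compression scores (plain + gzipped) average into equal weight.
function totalScore(float $plain, float $gzip, float $speed, float $reliability): float
{
    return $plain + $gzip + 2 * $speed + 2 * $reliability;
}
```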
Before looking at the results, I want to note that all of the developers, myself included, have worked very hard on their creations, and many of the packages are well tested and used by millions of websites every day.
Currently my software is only used by a handful of websites.
The scores for the HTML test are as follows:
taufik-nurrohman’s minifier was the fastest, followed by mrclay/minify; both beat the next fastest, which was mine, by about 8x. Unsurprisingly, these were both RegExp-based. All the compilers were slower, the slowest being deruli/html-minifier, a PHP port of the Blink tokeniser. It was 4x slower than the fastest compiler (mine), but this is not surprising since the original code was written in C++, which is much faster than PHP.
My software gave the highest compression at 11.00%, 39% better than the next highest.
All the compilers performed better than their RegExp counterparts on the compression metric.
The worst performer was taufik-nurrohman’s at 0.20%. Note that only the results considered valid were taken into account, and because of the number of errors in the results, the compression was always going to be lower; strangely, though, most of the results that were valid seemed to produce bigger output than the input.
All the minifiers tested showed more errors in the output than in the input on at least one test, although after a brief analysis of the input and output, it looks like most of these are false positives. Certainly in my software, the errors captured by the validator were all present in the input code, but for some reason were not reported when the input was validated.
There was one exception to this: the Gist script by taufik-nurrohman, which showed output errors in over 90% of the test websites.
Even though there were false positives, I took the results as they came out for scoring.
Overall I am pretty happy with the performance of my software. Discounting the errors, which I think are all false positives, it was the fastest compiler, produced the best compression, and took the highest overall score.
The script from taufik-nurrohman did not perform well: whilst it was the fastest, not only did it have the most errors, it also had by far the lowest compression.
The most popular project, mrclay/minify, performed well; a few more errors were reported than for my software, but again I think these are all false positives.
The scores for the CSS test are as follows:
All the contenders performed well here; no minifier produced more validation errors than there were in the input.
The lack of errors here is likely a result of CSS being easier to parse than other languages, thanks to its fixed and predictable format.
The fastest minifier here was wikimedia/minify; along with websharks/css-minifier it blitzed the rest of the field, minifying all 12 test files in 0.047 seconds. The slowest was matthiasmullie/minify, which completed the tests in 8.4 seconds. Again, the RegExp-based minifiers were all faster than the compilers, with the exception of the above; the fastest compiler was natxet/cssmin at 0.94 seconds.
All the contenders were very close in their compression ratios; again this is likely due to the predictable format of CSS. The lowest compression was natxet/cssmin at 21.17% (16.70% gzipped). The highest compression ratio was achieved by my software at 24.63% (20.14% gzipped).
One anomaly here was matthiasmullie/minify: it mostly performed well on compression and would have completed the tests in good time, but for some reason it had trouble processing a couple of the test files. One had significantly lower compression than the others, and another took 6 of its 8.4 seconds overall to minify.
Here my software was not the fastest compiler, but it did achieve the highest compression. I ran these results on Ubuntu, but on my Windows laptop it was by far the fastest compiler; I guess the OS and platform implementation of PHP make a difference to the performance of the running code.
The scores for the JS test are as follows:
The only package to produce invalid output here was the script from taufik-nurrohman, which had 2/10 invalid results. It was also the fastest, completing the tests in 0.18 seconds; wikimedia/minify was the next fastest at 1.75 seconds.
After the RegExp-based minifiers, the Linear Consumers came next in the speed ranking, with my compiler and then matthiasmullie/minify bringing up the rear. My software had the highest compression ratio both non-gzipped and gzipped, at 33.97% and 26.67% respectively.
The slowest, and the lowest compression, was matthiasmullie/minify. Again there was an anomaly where it was unable to compress one of the test files to anywhere near the ratio achieved by all the others. This had a big impact on its overall compression ratio, which ended up at about half of the others (17.55%); it would have been on par without this problem. It also struggled in the speed department, clocking in at a total of 11.4 seconds. Interestingly, on Windows it was one of the fastest, whilst the most popular, mrclay/jsmin, was the slowest.
It is interesting how the structure of the input language determined the spread of speed and ratio, with the more predictably structured CSS showing the contenders much closer on all metrics. Also interesting is the difference in speed, ratio and reliability between the RegExp-based minifiers and the compilers: there was a clear gap in speed, with the RegExp software being faster, whilst also being slightly less reliable and achieving lower compression.
The results here are just an indication of the performance and quality of the software packages presented. With different inputs, or running on other OSes, the results will differ. Also, the scoring system weighs each metric evenly against the others, whereas real-world requirements may be different.
Quality is harder to measure than the performance benchmarked here; my tests didn't take into account metrics like test code coverage, or whether a project is actively maintained.
What do you think? Please tell me your thoughts in the comments.
As for the performance of my own software, I am definitely happy. Whilst none of my packages were the fastest overall, which I expected, they were also not the slowest, and they proved reliable in all tests (bar the false positives, which all the competitors seemed to suffer from in the HTML test).
I am also proud that my software achieved the highest compression across the board, helped by the software I wrote to compare it to the competition.
For my own process, writing and testing this software has been extremely challenging, but has enabled me to learn new concepts, implementation techniques, and overall improve my process. Before starting these projects, I hadn’t:
- Done any serious testing with PHPUnit
- Published anything on Packagist
- Used composer or published anything that required it
- Written a WordPress plugin
- Produced a code coverage report
- Used GitHub’s build system
- Published any articles on dev.to
To all developers who are starting out or who haven’t published their own project yet, I would highly recommend it. You will be pushed out of your comfort zone, and be required to learn new techniques, processes, platforms and systems to be able to get your software to a point where it is stable, tested, documented, and basically good enough for other developers to pull into their production projects.
If you want to run these tests yourself, or want to try out my software, all my projects are available on GitHub and other platforms with the following links: