I’ve been using the Hemingway App to try to improve my posts. At the same time I’ve been trying to find ideas for small projects. I came up with the idea of integrating a Hemingway-style editor into a Markdown editor. So I needed to find out how Hemingway worked!
Getting the Logic
When I started I had no idea how the app worked. It could have sent the text to a server to calculate the complexity of the writing, but I expected it to be calculated client side. Luckily, opening the developer tools in Chrome (Control + Shift + I or F12) and navigating to Sources confirmed this. In there I found the file I was looking for: hemingway3-web.js.
Minified file on the top, formatted file on the bottom. What a difference it makes!
This code is minified, which makes it a pain to read. To solve this I copied the file into VS Code and formatted the document (Control + Shift + I in VS Code on Linux; Shift + Alt + F on Windows). This turns a 3-line file into a 4859-line file with everything formatted nicely.
Exploring the Code
With the file formatted far more nicely, I started to look through it for anything that I could make sense of. The start of the file was a lot of immediately invoked function expressions that gave me very little idea of what was happening.
!function(e) {
    function t(r) {
        if (n[r])
            return n[r].exports;
        var o = n[r] = {
            exports: {},
            id: r,
            loaded: !1
        };
        ...
This continued for about 200 lines before I decided that I was probably reading the code that makes the page run (React?). I started skimming through the rest of the code until I found something I could understand. (I missed quite a lot that I would later find by following function calls back to their definitions.)
The first bit of code I understood was all the way at line 3496!
getTokens: function(e) {
    var t = this.getAdverbs(e),
        n = this.getQualifiers(e),
        r = this.getPassiveVoices(e),
        o = this.getComplexWords(e);
    return [].concat(t, n, r, o).sort(function(e, t) {
        return e.startIndex - t.startIndex
    })
}
And amazingly, all of these functions were defined right below. Now I knew how they defined adverbs, qualifiers, passive voice and complex words. Some of them are very simple. There are lists of qualifiers, complex words and passive voice phrases, and each word is checked against them. this.getAdverbs filters words based on whether they end in ‘ly’ and then checks that each one is not in their list of non-adverb words ending in ‘ly’.
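Here is a minimal sketch of that adverb check as I understand it, not the app’s actual code. The NOT_ADVERBS list, the word-matching regex and the token shape (apart from startIndex, which getTokens sorts on) are my own assumptions.

```javascript
// Hypothetical exclusion list: words ending in "ly" that are not adverbs.
// The real app ships its own list; these entries are only examples.
const NOT_ADVERBS = new Set(["family", "reply", "supply", "apply", "italy", "belly"]);

// Return one token per suspected adverb, with its position in the text.
function getAdverbs(text) {
  const tokens = [];
  const wordPattern = /[A-Za-z']+/g; // assumed definition of a "word"
  let match;
  while ((match = wordPattern.exec(text)) !== null) {
    const word = match[0].toLowerCase();
    if (word.endsWith("ly") && !NOT_ADVERBS.has(word)) {
      tokens.push({ type: "adverb", value: match[0], startIndex: match.index });
    }
  }
  return tokens;
}

// "family" is skipped by the exclusion list; "quickly" and "quietly" are flagged.
console.log(getAdverbs("My family quickly and quietly left."));
```

Keeping startIndex on each token is what would let a getTokens-style function merge the different highlight types and sort them by position.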
The next bit of useful code was where the word or sentence highlighting is implemented. In this code there is a line:
e.highlight.hardSentences += h
I then searched the file for ‘hardSentences’ and got 13 matches. This led to a line that calculated the readability stats:
n.stats.readability === i.default.readability.hard && (e.hardSentences += 1),
n.stats.readability === i.default.readability.veryHard && (e.veryHardSentences += 1)
Using this, I searched again for ‘readability’ and got 40 matches. I found the getReadabilityStyle function and found out how they grade your writing. They have 3 levels: normal, hard and very hard.
t = e.words;
n = e.readingLevel;
return t < 14
    ? i.default.readability.normal
    : n >= 10 && n < 14
        ? i.default.readability.hard
        : n >= 14
            ? i.default.readability.veryHard
            : i.default.readability.normal;
If there are fewer than 14 words then it’s normal. Otherwise, if the reading level is at least 10 but below 14 it’s hard, and if it’s 14 or more it’s very hard. Now to find out how the reading level is calculated.
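The same decision can be written as a small standalone function. This is a sketch rather than the app’s code: the readability object and its string values are placeholders I made up to stand in for i.default.readability.

```javascript
// Placeholder labels standing in for i.default.readability (values assumed).
const readability = { normal: "normal", hard: "hard", veryHard: "veryHard" };

// Same branching as the nested ternary, flattened into early returns.
function getReadabilityStyle(words, readingLevel) {
  if (words < 14) return readability.normal;          // short text is never flagged
  if (readingLevel >= 14) return readability.veryHard;
  if (readingLevel >= 10) return readability.hard;
  return readability.normal;
}

console.log(getReadabilityStyle(10, 20)); // short, so "normal" despite the level
console.log(getReadabilityStyle(20, 12)); // "hard"
console.log(getReadabilityStyle(20, 14)); // "veryHard"
```

Note that the word count is checked first, so a short piece of text stays normal no matter how high its reading level is.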
I spent a while here trying to find any notion of how to calculate the reading level. I found it 4 lines above the getReadabilityStyle function.
// e = letters in paragraph
// t = words in paragraph
// n = sentences in paragraph
getReadingLevel: function(e, t, n) {
    if (0 === t || 0 === n) return 0;
    var r = Math.round(4.71 * (e / t) + 0.5 * (t / n) - 21.43);
    return r <= 0 ? 0 : r;
}
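That means your score is 4.71 * average word length (in letters) + 0.5 * average sentence length (in words) - 21.43, rounded to the nearest whole number and never below 0. This is the formula for the Automated Readability Index.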
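To make the formula concrete, here is a small worked sketch. getReadingLevel mirrors the function quoted above with readable parameter names; countStats is my own helper, and its rules for what counts as a letter, a word or a sentence are assumptions, since I haven’t shown how the app counts them.

```javascript
// Same arithmetic as the app's getReadingLevel, with descriptive names.
function getReadingLevel(letters, words, sentences) {
  if (words === 0 || sentences === 0) return 0;
  const level = Math.round(4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43);
  return level <= 0 ? 0 : level;
}

// Hypothetical counting rules (assumed, not taken from the app):
// a word is a run of letters, digits or apostrophes,
// a sentence is a run of text ending in ".", "!" or "?".
function countStats(text) {
  const words = text.match(/[A-Za-z0-9']+/g) || [];
  const letters = words.join("").replace(/[^A-Za-z0-9]/g, "").length;
  const sentences = (text.match(/[^.!?]+[.!?]+/g) || []).length;
  return { letters, words: words.length, sentences };
}

// 17 letters, 6 words, 1 sentence: 4.71 * 2.83 + 0.5 * 6 - 21.43 is negative, so 0.
const stats = countStats("The cat sat on the mat.");
console.log(stats, getReadingLevel(stats.letters, stats.words, stats.sentences));

// 100 letters, 20 words, 1 sentence: 23.55 + 10 - 21.43 = 12.12, which rounds to 12.
console.log(getReadingLevel(100, 20, 1));
```

A single 20-word sentence with 5-letter words on average lands at 12, which falls in the “hard” band.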
Other Interesting Things I Found
The highlight commentary (information about your writing on the right-hand side) is a big switch statement. Ternary expressions are used to change the response depending on how well you’ve written.
The grading goes up to 16 before it’s classed as “Post-Graduate” level.
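As a rough illustration of that pattern, here is a sketch of a switch with ternaries. Everything in it is hypothetical: the function names, the highlight type strings and all of the message wording are invented, and I’m assuming that levels above 16 are the ones labelled “Post-Graduate”.

```javascript
// Hypothetical commentary builder: one case per highlight type,
// with ternaries picking the wording based on the count.
function getCommentary(type, count) {
  switch (type) {
    case "adverb":
      return count === 0
        ? "No adverbs. Nice work."
        : `${count} adverb${count === 1 ? "" : "s"}. Try cutting some.`;
    case "hardSentence":
      return count === 0
        ? "No hard sentences."
        : `${count} of your sentences ${count === 1 ? "is" : "are"} hard to read.`;
    default:
      return "";
  }
}

// Assumed cut-off: anything above 16 is shown as "Post-Graduate".
function gradeLabel(level) {
  return level > 16 ? "Post-Graduate" : `Grade ${level}`;
}

console.log(getCommentary("adverb", 2));
console.log(gradeLabel(17));
```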
What I’m going to do with this
I am planning to make a very basic website and apply what I’ve learnt from deconstructing the Hemingway app. I’ve built a Markdown previewer before so I want to see if I can integrate this highlighting and
What have you learnt from reverse engineering a website?
If you’ve ever done something similar, let me know in the comments. It’s great hearing about cool things that other developers have found.
Top comments (4)
There is nothing fun about trawling through minified code, even after formatting! You did a great job pulling out the interesting bits. It's fascinating to note that so much of the code is UI and the bits that do the calculations are just a few lines.
Yes, getting started was massively confusing.
I was amazed how simple the logic was as well. I think the logic for applying highlighting word by word is going to be the most difficult thing to replicate. Also managing the logic so that it only rescans the paragraph that's changed.
I bet. I can only imagine that getting something like this to perform well is the real task at hand here. I've enjoyed using Hemingway in the past and it's great to see the engineering that goes into it.
Thanks for sharing again!
This is a neat reverse-engineering/code investigation. Thanks for sharing!