Designing the ultimate (INCLUSIVE) writing tool. [Part 1 - a WYSIWYG in 745 *Bytes*! 😱]

InHuOfficial
Specialising in accessibility and website load speed / performance. If you have a question about [accessibility] or [page-speed-insights] ask away and I will help any way I can!
・9 min read

A WYSIWYG in just 745 bytes of JS (gzipped)? Check. A bonus JS syntax highlighter in 900 bytes of JS? Check. Combining the two? You bet! Things are about to get weird, but I do have a good reason for (most of) it!

In this article I will be introducing a new series all around creating the ultimate inclusive writing tool and the inspiration behind it.

And obviously, as promised, a super tiny WYSIWYG...you might be surprised how "full featured" it is!

Skip to the WYSIWYG(s)!

Can't be bothered reading about all the really interesting features I am building or what this series will be about? How rude!

But I understand you might be busy, so here is a shortcut to the first stage of the WYSIWYG....and the syntax highlighter...of course!

Introduction to this series and where it started

There was a really interesting article released by @michaeltharrington earlier this week on ableism and language choice.

Now it may have appeared from a very long comment I made that I did not agree that language choice is important.

It is. What I took issue with was my own lack of faith in being able to police it effectively, and the examples of "good substitutes" for potentially offensive words. Plus so much of ableist language is contextual.

It really got me thinking though...

Out of a simple article an idea was born.

The article prompted me to start having a think about how you could make it easier for people to write inclusively.

That means not making assumptions about the writer's culture, or about whether they have a limited vocabulary due to a disability, due to a lack of access to educational resources, or because English is a second or even third language etc.

Basically a piece of software that could steer people towards language that was suitable in a professional / public setting.

All without the need for a human to intervene, as no matter how well intentioned, you will never have enough information about the writer to know whether you are doing more harm than good.

A set of guidelines if you will, with the option to ignore them if you wish.

The one big advantage of this approach is that software is far less likely (although it can still happen) to make people feel that they are being criticised for their language choices.

It also scales so that thousands of people can benefit from guidance without the need for more and more human moderators.

Although ableist language was the catalyst for the idea, inclusive writing is about so much more!

Not just ableist language, far from it!

Inclusive writing includes keeping an eye on pronoun use, avoiding racist language, avoiding language that excludes non-binary individuals, not over-using swear words (as the occasional "fuck yeah" is obviously desirable 😉) and more I probably haven't thought of yet.

There are even more aspects to inclusive language, not just choice of words!

Passive voice vs active voice is one. I will explain passive voice and why to avoid it in a future article, when we build the part of the system that highlights passive voice and suggests alternatives that use active voice.

Headings structure, essential for people who use a screen reader and for helping everyone understand the relationships in the article etc.

Sentence length, as longer sentences are more difficult to process without a "mini break" provided by a full stop, comma etc.

Complex words and jargon should be avoided where possible. 1 in 5 people in the UK have the reading age expected of a 12 year old. This one is a big point!

Explaining abbreviations. One that we often don't think about. Just because you know what "SSR" means doesn't mean everyone does.

Does it mean "Strategic Scientific Reserve", "Same Sex Relationship" or "Sonic and the Secret Rings"? When writing about tech you probably mean "Server Side Rendering", but that may not be obvious to someone who does not know the term.

Paragraph length. This depends almost entirely on what you are writing and where.

However, this is a tool designed for writing on the web, so short paragraphs are much preferred over walls of text. In fact, most of the preferred ways of writing for the web would get you marked down in English classes!

Grammatical errors. I am not smart enough to write an application that corrects grammatical errors, so I won't be tackling that (initially; who knows, if this project grows I might attempt it!). There are plenty of services that do that already, so I think I can get away with shelving it for now.

Those are all the things to do with language I could think of.
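
Some of these checks are mechanical enough to sketch already. As a rough illustration only, flagging over-long sentences could look something like the following. The `findLongSentences` name and the 25 word default limit are assumptions made up for this example, not part of any finished tool.

```javascript
// Hypothetical helper: split text into sentences and report the long ones.
// The default limit of 25 words is an arbitrary assumption for this sketch.
function findLongSentences(text, maxWords = 25) {
  // Naive split: a run of characters ending in ., ! or ?
  // Abbreviations such as "e.g." will confuse it, which is fine for a first pass.
  const chunks = text.match(/[^.!?]+[.!?]*/g) || [];
  return chunks
    .map((chunk) => chunk.trim())
    .filter((sentence) => sentence.length > 0)
    .map((sentence) => ({ sentence, words: sentence.split(/\s+/).length }))
    .filter(({ words }) => words > maxWords);
}

// Usage: anything returned here could be underlined in the editor.
const flagged = findLongSentences(
  'Short one. This sentence has quite a few more words in it than that.',
  5
);
console.log(flagged.map((item) => item.sentence));
```

A word count is a blunt measure, so the result is best treated as a nudge for the writer rather than an error.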

Oh and it doesn't stop there

Now that I have decided to put a couple of hours a week aside for this there are loads of things I personally have wanted in a writing system.

So it might become much more than just an editor, it may have a whole system around it. A few things that I would like to see if I build this are as follows:

  • A research tool, where I can bookmark articles (at the relevant part of the page if necessary) and link them to an article I am researching.
  • A simple SEO tool that ensures that my first 200 words or so are optimised. Simple stats like word occurrence, semantically similar words etc. Nothing too heavy here as quality writing comes first, just a little nudge to help on-page SEO.
  • Templates for different article types.
  • A "scratch pad" for notes and ideas as the article is written (things I need to research further etc.)
  • Placeholders. For things like images that need sourcing (or screenshots I need to take / insert), links to future articles (with a way of adding them to a queue) or related articles not written yet, notes for myself, etc. Basically things that will not show up in the released article but can be searched and acted upon.
  • And heck, while we are at it, why not have an article release checklist that ensures that I have completed all the steps required to release quality content and see where I am up to when writing multiple articles simultaneously.

Oh, and as always with anything I do, load speed is essential and the thing needs to be as accessible as is humanly possible with current technology.

Stage 1 - building my first ever WYSIWYG

I have built an editor for dev.to in the past. However it wasn't a What You See Is What You Get (WYSIWYG) editor, it was a Markdown editor.

So I can't reuse any of that as I want this to be an actual WYSIWYG.

No, I am going to have to start from scratch and learn all about live editors on the web!

Some of you are thinking "You must like pain if you are going to build a WYSIWYG!"

For those of you who have been brave enough to try and write a WYSIWYG before, you are already wincing and know that what I have decided to tackle is a horrendous task!

WYSIWYGs are hard to build.

How do you let people edit a document while generating the underlying HTML on the fly and not upset / change their cursor position?

How do you keep track of opening and closing HTML tags when they start getting nested?

How do you account for deleting a word or phrase that has styling applied to only part of it and move the tags accordingly?

All sounds rather complicated. I don't like complicated so I think the only real answer would be to cheat!

Our cheat and why contenteditable is awesome.

A large number of you will have used, heard of or stumbled across contenteditable in your careers.

If you haven't, it is an attribute you can add to an HTML element that magically allows you to click the element and start changing the content.

The following fiddle demonstrates this in its simplest form.

Now that may not seem very impressive on its own. But it really is when you think about it.

It is much more than just a replacement for an <input>. Every change you make is directly updating the DOM and adjusting the HTML on the fly.

Still not impressed? Select some text and press Ctrl + B (Cmd + B on a Mac). The contenteditable <div> has just changed to include a <b> tag wrapped around your text.

It deals with all of the HTML tag management so we don't have to.
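
In code, the simplest form is roughly the following sketch. The `makeEditable` helper and the `.editor` class name are made up for this example. In plain HTML the same thing is just `<div contenteditable="true">Click me and start typing</div>`.

```javascript
// Turn any element into an editable region and report its HTML on change.
// `element` is whatever document.querySelector() returns in a browser.
function makeEditable(element, onChange) {
  element.contentEditable = 'true';
  // The browser rewrites the element's markup as the user types or presses
  // Ctrl + B, so reading innerHTML always gives the current HTML.
  element.oninput = () => onChange(element.innerHTML);
  return element;
}

// Browser usage (assumes a <div class="editor"> exists on the page).
if (typeof document !== 'undefined') {
  const editor = document.querySelector('.editor');
  if (editor) makeEditable(editor, (html) => console.log(html));
}
```

Typing into the element and watching the console is a quick way to see the tags the browser adds for you.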

But not only that, a contenteditable area has a super power. It exposes various JavaScript APIs so we can get and set the state of text!

Sure, it has loads of quirks, but I think it is pretty amazing how much functionality you get from one single attribute (even if it is a real pain to type correctly!).
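
As a rough sketch of those APIs, the two calls that do most of the work are `document.execCommand` (apply formatting to the current selection) and `document.queryCommandState` (ask whether formatting is active at the cursor). Both are deprecated, so treat this as an experiment rather than a recommendation. The `createCommands` wrapper and its method names are made up for this example.

```javascript
// Wrap the (deprecated) editing commands behind a small object so they can
// be swapped out later. `doc` is normally the global `document`.
function createCommands(doc) {
  // execCommand(commandName, showDefaultUI, value)
  const exec = (name, value = null) => doc.execCommand(name, false, value);
  return {
    bold: () => exec('bold'),
    italic: () => exec('italic'),
    heading: (level) => exec('formatBlock', `<h${level}>`),
    // True when the current selection already has this formatting,
    // handy for toggling an "active" class on a toolbar button.
    isActive: (name) => doc.queryCommandState(name),
  };
}

// Browser usage: wire a button to a command.
if (typeof document !== 'undefined') {
  const commands = createCommands(document);
  const button = document.querySelector('button.bold'); // hypothetical button
  if (button) button.onclick = () => commands.bold();
}
```

Passing `doc` in, instead of reaching for the global, keeps the deprecated calls in one place.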

A basic WYSIWYG

It is worth noting, there is still a load to do here. It has some accessibility issues (read that as a lot of accessibility issues) so it shouldn't be used in production, it is also missing loads of features etc.

However the aim here was to build a tiny WYSIWYG as a base.

At this point, it is a technical showpiece and a learning exercise for me on all the APIs I need to learn to interact with a contenteditable <div>, not the finished product.

Anyway, enough disclaimers, I know what you came here to see!

The following WYSIWYG is a total of 896 bytes of JS and CSS combined (when Gzipped).

How is that for tiny?

How about syntax highlighting?

Oh you thought the WYSIWYG was the showpiece?

No no no, I have been busy creating more tiny experiments.

A lot of them still need a lot of work but just for fun how about a super tiny JavaScript syntax highlighter?

That was another interesting learning exercise (luckily a lot of the regexes were available with a bit of research so I didn't have to write them, just tweak them!).

It is not perfect but the concept is there.

Now I was not intending to do anything else in this article...but I just had to try combining the two fiddles...

How about Syntax highlighting...in a WYSIWYG?

I created a monster! A weird WYSIWYG where you get syntax highlighting, but can still edit it like a normal document.

It can create some pretty interesting results I have to say....I don't think I will be using it as my day to day editor just yet!

It is full of bugs as this was obviously not intended etc. etc. but...why not have some fun?

Sadly you can't insert images, horizontal lines, links etc. as the input gets mangled...but you can still have a load of fun with formatting text!

It might not look right on your mobile so save this one for when you get to your PC!

I hope it makes you laugh (and cry at the same time) as much as it has me!

Back to the serious stuff!

Obviously, while this is all fun, the intention was never to have the WYSIWYG functions as part of the Code Editor.

The idea is to create a blocks system (similar to WordPress etc.) where you have a WYSIWYG block, then a code editor block, then back to a different block type etc.

However, there was one important thing (that you may not have noticed) that I was doing with the code editor and with combining the two.

I was seeing how I could create live highlighting as you typed.

It isn't as simple as you may think, so have a good look at the code in the last example to work out what the trick is. Don't worry if you don't spot it...I will explain all the tricks etc. in more detail in the next part of this series when I tidy up my sloppy code!

What is next?

OK so these were some fun experiments but not really useful.

In part two I am going to fix up the WYSIWYG to a stage where it is both usable and more easily extended, so we can start bolting in some of the features I listed earlier.

I am also going to use the knowledge gained from the silly WYSIWYG code editor combo to add highlighting to words that are not recommended etc.

So by the end of part two we should have a usable WYSIWYG that will allow us to highlight a given word, phrase etc. and have suggestions on alternatives. Who knows I might throw another couple of silly things in that article for you to play with too!

Conclusion

From one simple article a gigantic, all consuming project that is going to take me months has emerged.

That is the conclusion as far as I am concerned.

So do me a favour, give me a follow, bookmark the article, leave a comment or share this article with someone you don't like so you can put them through the pain of experiencing my WYSIWYG code editor monstrosity! 🤣

Have a great week and I hope you found this interesting, even if it wasn't useful (yet...that is what part two is for I hope!)

Discussion (20)

Alex Lohr

I once wrote the engine for a small syntax highlighter in less than 140 bytes JavaScript for a code golfing competition: gist.github.com/atk/1084980 - it ran on the page that presented the code golfs.

The RegEx required for js is not for those faint of heart, though. 😉

InHuOfficial Author

I always enjoy seeing people "Golf" things.

I used a great version of pong in an article on steganography that was something like 460 bytes fully functional with scoring etc. as the thing I was hiding in the image. Blew my mind!

Love your solution.

"The RegEx"...singular? You mean the 10 I have so far 😋🤣

Siddharth
window.j=(f,a)=>{var l=[{h:/'(.*?)'/g,r:"<span class='string'>'$1'</span>"},{h:/(\d+\.?\d+?)/g,r:'<span class="number">$1</span>'},{h:/(\/\/.*)/g,r:'<span class="comment">$1</span>'},{h:/\b(var|let|const|function|this|do|super|as|constructor|instanceof|default)\b/g,r:'<span class="js-keyword">$1</span>'},{h:/\b(typeof|try|catch|finally|delete|switch|case|in|of|if|else|import|from|as|export|extends|new|return|throw|for|while|break|continue|async|await)\b/g,r:'<span class="js-command">$1</span>'},
{h:/\b(true|false|null|undefined|NaN|Infinity|\$)(?=[^\w])/g,r:'<span class="js-literal">$1</span>'},{h:/([\b\s\[\{\(])([!=]=|[!=]==|\+\+?|--?|\*|\/|&&|\|\||!|<=?|>=?|>>|<<|\.\.\.)(?!span)([\b\s\w])/g,r:'<span class="js-operator">$1$2$3</span>'},{h:/\b(window|document|navigator|console|self|top|process|require|module|define|global|Promise|Array|Math|String|Number|Symbol|Function|Reflect|Proxy|Error)\b/g,r:'<span class="js-global">$1</span>'},{h:/(\w[A-Za-z0-9]*)(?=\()/g,r:'<span class="js-function">$1</span>'},
{h:/\b(getElementsBy(TagName|ClassName|Name)|getElementById|(get|set|remove)Attribute|querySelector(|All))(?=[^\w])/g,r:'<span class="js-dommethod">$1</span>'}];f=f.replace(/(&lt;!--(?:[^-]|-(?!-&gt;))*--&gt;)|(&lt;(?:(?!&gt;).)+&gt;)/g,function(h,k,m){if(null!=k)return'<span class="comment">'+k+"</span>";if(null!=m)return'<span class="tag">'+m.replace(/(\s[\w_-]+)+(?:(=)+("[a-z-\s]?")+)?/ig,"<i>$1</i>$2<u>$3</u>")+"</span>"});for(var c=f.split("\n"),d=0;d<c.length;d++){var g=c[d],b=document.createElement("div");
for(x=0;x<l.length;x++){var e=l[x];g=g.replace(e.h,e.r)}b.innerHTML=g;a.appendChild(b)}};
(function(f){function a(c,d){return document.execCommand(c,!1,d||null)}var l=[{icon:"Para",g:function(){return a("formatBlock","<p>")}},{icon:"<b>B</b>",state:"bold",g:function(){return a("bold")}},{icon:"<i>I</i>",state:"italic",g:function(){return a("italic")}},{icon:"<u>U</u>",state:"underline",g:function(){return a("underline")}},{icon:"<strike>S</strike>",state:"strikeThrough",g:function(){return a("strikeThrough")}},{icon:"<b>H1</b>",g:function(){return a("formatBlock","<h1>")}},{icon:"<b>H2</b>",
g:function(){return a("formatBlock","<h2>")}},{icon:"<b>H3</b>",g:function(){return a("formatBlock","<h3>")}},{icon:"&#8220; &#8221;",g:function(){return a("formatBlock","<blockquote>")}},{icon:"&#35;",g:function(){return a("insertOrderedList")}},{icon:"&#8226;",g:function(){return a("insertUnorderedList")}},{icon:"&lt;/&gt;",g:function(){return a("formatBlock","<pre>")}},{icon:"&#8213;",g:function(){return a("insertHorizontalRule")}},{icon:"&#128279;",g:function(){var c=f.prompt("Enter the link URL");
c&&a("createLink",c)}},{icon:"img",g:function(){var c=prompt("Enter the image URL");c&&a("insertImage",c)}}];return{i:function(c){var d=document.querySelector(".highlight");console.log(l);var g=document.createElement("div");g.className="bar";c.appendChild(g);var b=document.createElement("div");b.contentEditable=!0;b.className="content";b.innerHTML="<p>&nbsp;</p>";b.oninput=function(){console.log(b.innerHTML);var e=b.innerHTML;d.innerHTML="";f.j(e,d);d.height=b.height};c.appendChild(b);l.forEach(function(e){var h=
document.createElement("button");h.innerHTML=e.icon;h.onclick=function(){return e.g()&&b.focus()};if(e.state){var k=function(){var m=document.queryCommandState(e.state);return h.classList[m?"add":"remove"]("active")};b.addEventListener("keyup",k);b.addEventListener("mouseup",k);h.addEventListener("click",k)}g.appendChild(h)});a("defaultParagraphSeparator","p")}}})(window).i(document.querySelector(".editor"));

Your code, cut in half.

I can definitely decrease it more. But that's enough for a WYSIWYG.

InHuOfficial Author • Edited

And once you run mine through a compiler / minifier and gzip them both up you saved...39 bytes! (1569 bytes vs 1530 bytes).

Those were the numbers I was talking about anyway. I don't work in raw bytes as they are meaningless; gzipped and minified are the numbers that matter for network performance, and sometimes more raw bytes means fewer gzipped bytes!
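
For anyone who wants to check numbers like these themselves, here is a minimal sketch using Node's built-in `zlib` module. The `measure` helper is made up for this example, and the figures will vary a little with compression settings, so it will not necessarily reproduce the exact byte counts quoted above.

```javascript
const zlib = require('zlib');

// Report the raw and gzipped size of a source string, in bytes.
function measure(source) {
  const buffer = Buffer.from(source, 'utf8');
  return {
    raw: buffer.length,
    gzipped: zlib.gzipSync(buffer).length,
  };
}

// Repetitive code compresses very well, so raw size alone says little
// about what actually travels over the network.
console.log(measure('function a(){return 1}'.repeat(100)));
```

Run it on the minified output rather than the readable source, since that is what gets shipped.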

I am not going to share minified code with people if I expect them to be able to dig around in it.

To really golf this you would have to completely rewrite most of it, as the regexes are about 70% of the weight. Combining them into one gigantic RegEx and using capture groups would be one way.

Have a look at the page Alex linked: the engine he used for his syntax highlighter was 140 bytes!

function(c,r){return c.replace(r,function(f,i){for(i=7;~i*!arguments[i--];);return i?'<t class=f'+i+'>'+f.replace('<','&lt;')+'</t>':''})}

But yet again, once you add the RegExs back in...

var re = /(?![\d\w]\s*)(\/[^\/\*][^\n\/]*\/[gi])|(".*?"|'.*?')|(\/\/.*?\n|\/\*[\x00-\xff\u00\uffff]*?\*\/)|(?:\b)(abstract|boolean|break|byte|case|catch|char|class|const|continue|debugger|default|delete|do|double|else|enum|export|extends|false|final|finally|float|for|function|goto|if|implements|import|in|instanceof|int|interface|long|native|new|null|package|private|protected|public|return|short|static|super|switch|synchronized|this|throw|throws|transient|true|try|typeof|var|void|volatile|while|with)(?:\b)|(?:\b)(Array|Boolean|Date|Function|Math|Number|Object|RegExp|String|document|window|arguments)(?:\b)|(\d[\d\.eE]*)|([\x28-\x2b\x2d\x3a-\x3f\x5b\x5d\x5e\x7b-\x7e]+|\x2f|(?=\D)\.(?=\D))/g;

It quadruples the size! (The total is 562 bytes gzipped, damned impressive!)

Not sure if he fixed the problem in the comments where function(a,b){b/=2;return (a+b)/b;} is highlighted as a RegEx but mine can deal with that at least.

And above all....why are you golfing this silly monstrosity? I have no intention of making a WYSIWYG code editor combination in real life as I said 🤣 it lets you do things that are so so wrong like having a <h1>function(){</h1> - it hurts, it hurts so much! 😋

Hopefully both of those will give you some ideas on how to really "golf" them, but you would almost certainly be starting from scratch to get any meaningful gains over my code.

Alex Lohr

No, the one that is used for my version:

var re = /(\B\/(?:\\.|[^\n/*])(?:\\.|[^\n/])*\/[gim]*)|("(?:\\.|\\\r*\n|[^\n"])*"|'(?:\\.|\\\r*\n|[^\n'])*')|(\/\/[^\n]*\n|\/\*[\x00-\xff\u00\uffff]*?\*\/)|\b(abstract|boolean|break|byte|case|catch|char|class|const|continue|debugger|default|delete|do|double|else|enum|export|extends|false|final|finally|float|for|function|goto|if|implements|import|in|instanceof|int|interface|long|native|new|null|package|private|protected|public|return|short|static|super|switch|synchronized|this|throw|throws|transient|true|try|typeof|var|void|volatile|while|with)\b|\b(Array|Boolean|Date|Function|Math|Number|Object|RegExp|String|document|window|arguments)\b|(\d[\d\.eE]*)|([\x28-\x2b\x2d\x3a-\x3f\x5b\x5d\x5e\x7b-\x7e]+|\x2f|\.(?=\D))/g;

Yours is an example of readability by comparison. 😁

InHuOfficial Author

Yeah but I couldn’t work out how to also do different colours, unless I used match groups I suppose and increment the group in the loop...

Probably could be done but obviously the idea is this is a base for something useful so I don’t think I will even try it! 😜🤣

Alex Lohr

That's exactly what I did.

InHuOfficial Author

Ah yes, I see it now, being a non code golfer I didn't get that was a counter you had implemented!
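
For other non-golfers, here is a longer, readable sketch of the same general idea: one combined RegEx where each capture group stands for a token type, and the index of the group that matched picks the colour class. This is not Alex's code or the code from the article. The token list, the class names and the `highlightLine` function are all made up for this example.

```javascript
// One alternation with one capture group per token type. Order matters:
// comments first, then strings, keywords and numbers.
const TOKEN = /(\/\/.*)|('(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*")|\b(var|let|const|function|return|if|else)\b|\b(\d+(?:\.\d+)?)\b/g;
const CLASSES = ['comment', 'string', 'keyword', 'number'];

function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function highlightLine(line) {
  let html = '';
  let cursor = 0;
  for (const match of line.matchAll(TOKEN)) {
    // The "counter": find which capture group actually matched.
    const group = match.slice(1).findIndex((value) => value !== undefined);
    html += escapeHtml(line.slice(cursor, match.index));
    html += `<span class="${CLASSES[group]}">${escapeHtml(match[0])}</span>`;
    cursor = match.index + match[0].length;
  }
  // Whatever is left after the last token is plain text.
  return html + escapeHtml(line.slice(cursor));
}

console.log(highlightLine("var a = 'x'; // hi"));
```

Because the text is only scanned once, markup added for one token can never be re-matched by a later rule, which is a risk when running several RegExs one after another.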

InHuOfficial Author

Ironically this article rambles on a bit in places and isn't as well structured as I would like.

However, that is the whole point of this series, I will be using the tools I build to improve my own writing and take some of the thought process out of ensuring I use inclusive writing. I know my writing sucks sometimes! My apologies for that!

Over to you: if you have ever thought "I would like to see XXX" in a WYSIWYG to make your writing better or make it easier to create great content then let me know in the comments, all ideas are welcome (even slightly silly ones!).

Siddharth

Protip: change the syntax highlighter to highlight markdown, because no one is going to write JS in a WYSIWYG anyways

InHuOfficial Author • Edited

Nobody is going to write markdown in a WYSIWYG as the whole idea of a WYSIWYG is it looks like the end product and you don’t need to know Markdown, HTML etc 😜

Plus if you copy a code snippet in to a true WYSIWYG it is nice to see the syntax highlighting live!

I think I have made this all very confusing by combining the two for jokes! I think part two is going to need a lot of notes to cover this and the deprecation you mentioned in another comment!

However you have made a good point that Markdown import and export (that will be hard as some things aren’t possible in markdown that are in a HTML wysiwyg) needs adding to the spec list 👍

Mike Talbot

How about using pegjs to build a parser to do the syntax highlighting? Sure it's a bit more work, but I'm pretty sure there's a starter grammar out there that could help... like this one

InHuOfficial Author

Thanks for the suggestion, I will keep a bookmark on pegjs as if this evolves then it could be useful, but at the moment I want to understand every nut and bolt of what I am building.

I presume pegjs is purely a generator and I don't need it as a dependency once I have created a parser with it?

To be fair the point of the syntax highlighter was more to experiment with the headache that is live colouring on a WYSIWYG editor.

It is actually a really complex thing to deal with if you don't cheat as I did with the double stacked divs perfectly aligned and hiding the content in the contenteditable div that you actually write in.
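
A minimal sketch of that stacked-layer cheat might look like the following. It is an illustration of the idea rather than the code from the fiddle. The `linkLayers` name, the use of `textContent` and the CSS in the comment are assumptions for this example.

```javascript
// Keep a coloured "backdrop" layer in step with the layer you type in.
// `editable` and `backdrop` are two elements stacked on top of each other
// with identical font, padding and size (CSS not shown). The assumption is
// that the editable layer hides its own text, for example:
//   .content { color: transparent; caret-color: black; }
// so only the coloured backdrop text is visible.
function linkLayers(editable, backdrop, highlight) {
  const render = () => {
    backdrop.innerHTML = highlight(editable.textContent);
  };
  editable.oninput = render; // re-colour on every edit
  render();                  // and once for any initial content
  return render;
}
```

Only the backdrop's HTML is rewritten, so the element holding the cursor is never touched and the caret stays where the writer left it.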

I am experimenting as one idea for one part can then be applied to another part while I am still prototyping.

For example, one thing I didn't mention in the article is trying to guess if someone has added a "non English" phrase to a sentence like "c'est la vie" so that we can prompt them to add a <span lang="fr"> around the phrase so that screen readers can announce things properly.

The RegEx parsing method used in the syntax highlighter would work nicely on that scenario as I could just do a load of pipe separated words that are common in English and some basic occurrence counting to highlight potential other languages (or at least that is my first thought of how to do it...that could also change!).
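
As a first-thought sketch of that occurrence counting (the word list is tiny and hypothetical, and the `englishRatio` and `mightBeOtherLanguage` names and the 0.2 threshold are made up for this example):

```javascript
// A tiny, hypothetical list of very common English words, pipe separated.
// A real list would need to be far longer.
const COMMON_ENGLISH = /\b(?:the|a|an|and|is|it|to|of|in|that|as|you|i|was|for|on|with)\b/gi;

// Share (0 to 1) of the words in `phrase` that appear in the list.
function englishRatio(phrase) {
  const words = phrase.match(/[\p{L}']+/gu) || [];
  if (words.length === 0) return 0;
  const hits = phrase.match(COMMON_ENGLISH) || [];
  return hits.length / words.length;
}

// Flag a phrase as "possibly another language" when few words are known.
function mightBeOtherLanguage(phrase, threshold = 0.2) {
  return englishRatio(phrase) < threshold;
}

console.log(mightBeOtherLanguage("c'est la vie"));
console.log(mightBeOtherLanguage('it is in the box'));
```

A short English phrase with no common words would also be flagged, so the result is only good enough to prompt the writer, not to add a `lang` attribute automatically.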

Also just having an array of RegExs and corresponding <span> outputs seems like a really simple way to cover basic word lookup for swear words, racist language etc. that isn't sensitive to the context of the document.
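
Something like the following sketch, where every rule is a RegEx plus the name used for the `<span>` class. The word lists are mild placeholders, and the `RULES` and `flagWords` names are made up for this example.

```javascript
// Hypothetical rules: one pre-built, pipe separated RegEx per category.
const RULES = [
  { name: 'swearing', pattern: /\b(?:heck|darn)\b/gi },
  { name: 'jargon', pattern: /\b(?:synergy|leverage)\b/gi },
];

function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Wrap every flagged word in a <span> and escape everything else.
function flagWords(text, rules) {
  // Collect all hits on the plain text first, so that markup added for one
  // rule can never be matched by another rule.
  const hits = [];
  for (const rule of rules) {
    for (const match of text.matchAll(rule.pattern)) {
      hits.push({ start: match.index, end: match.index + match[0].length, name: rule.name });
    }
  }
  hits.sort((a, b) => a.start - b.start);

  let html = '';
  let cursor = 0;
  for (const hit of hits) {
    if (hit.start < cursor) continue; // skip overlapping hits
    html += escapeHtml(text.slice(cursor, hit.start));
    html += `<span class="flag-${hit.name}">${escapeHtml(text.slice(hit.start, hit.end))}</span>`;
    cursor = hit.end;
  }
  return html + escapeHtml(text.slice(cursor));
}

console.log(flagWords('Oh heck, more synergy <b>', RULES));
```

This knows nothing about context, so it only suits words that are worth flagging wherever they appear.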

One thing I learned in all this is apparently a prebuilt piped RegEx is really efficient using .match on a string when trying to match multiple needles to a haystack.

All very much a work in progress and suggestions like the above are always welcome!

Mike Talbot

I'm interested to see how you proceed with this for sure :) PegJS produces a parser that is a standalone JS file - the rules of Peg are very similar to regexes (though no backtracking etc) so that's what brought it to mind.

I'd wonder also about training a model to recognise sub sections of the document etc, though you are right, categorical searches for racist or swear words would also make sense.

InHuOfficial Author

Yeah the training a model bit is way beyond my current abilities (hence why I have shelved Grammar suggestions for now as trying to do that with simple algorithms seems very difficult!).

This project does provide some really interesting challenges and I don't think I can ever make a perfect solution. But who knows, maybe I can produce something useful enough that it can be turned into a paid service and I can pay someone smarter than me to handle the scary stuff! 😋🤣

Thanks once again for the suggestion, I will have a play with PegJS at some point in the future as it is interesting from a learning perspective, especially that parser example you linked to, still scratching my head on some of that!

Siddharth

Nice... I would have built this if document.execCommand was supported more, but it isn't.

InHuOfficial Author

Yeah that is a real issue. It isn't marked for removal in any browser yet, so we should get at least 2 years of use out of it.

But I agree, it is against good practice to use it in anything new!

I should perhaps make a point of that in the next article so people don't use it without knowing the risks, thanks for the great suggestion!

Siddharth

But if you don't use it, you don't get undoing capabilities anymore

InHuOfficial Author

Undo is not too bad as we can just store the exact state each time there is an update.

Or better yet we can use doc diffing and just store the differences. I have to admit I have no idea which is better!
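
A minimal sketch of the simpler option, storing the exact state on each update, could look like this. The `createHistory` name and the limit of 100 entries are assumptions for this example. A diff-based version would keep the same three methods but store changes instead of full HTML strings.

```javascript
// Snapshot-based undo: keep the editor's full HTML after each change.
function createHistory(initialHtml, limit = 100) {
  const past = [initialHtml]; // last entry is the current state
  const future = [];          // undone states, available for redo
  return {
    record(html) {
      if (html === past[past.length - 1]) return; // ignore no-op updates
      past.push(html);
      if (past.length > limit) past.shift();      // cap memory use
      future.length = 0;                          // a new edit clears redo
    },
    undo() {
      if (past.length > 1) future.push(past.pop());
      return past[past.length - 1];
    },
    redo() {
      if (future.length > 0) past.push(future.pop());
      return past[past.length - 1];
    },
  };
}

// Usage: call record(editor.innerHTML) on input, and set
// editor.innerHTML to the value returned by undo() or redo().
```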

I think the answer I have at this point is to use it, but build it in a way I can swap it out easily. 🤷‍♂️

Siddharth

Diffing is better if you care about space, which, to be honest, is not a big concern right now, unless you plan to persist them.