James Carlson

Posted on Feb 13, 2021

Towards a Pure Elm Text Editor

#latex #elm #editor

I'd like to report on progress with a text-editor project I've been working on for some time. It is written purely in Elm, my favorite language for building web apps. I needed such an editor to build an IDE for another project, MiniLaTeX, and had found no other alternatives which suited me. But of course I hope that the editor, when finished, can be of use to others as well. Here is a demo and here is the GitHub repo.

The image below gives an idea of what the editor can do. Below the image I'll discuss features, challenges, and some of the technical details, as well as a bit of the story of how the project came to be.

The Story

First the story and some acknowledgements. I had been using the Elm equivalent of text areas for user input, but found a serious problem when the text was modified programmatically - the cursor would jump to the bottom! Very bad. I needed, for this and other reasons, a real editor. I tried some drop-in solutions but these did not work out well, so I started dreaming of a pure Elm editor that I could fiddle with at the lowest possible level. A big task, which I repeatedly put off.

One day, I chanced upon a prototype by Sydney Nemzer. While the demo this link points to might not seem like much, the code is well thought-out, really solid, and beautiful. To be honest, I had no idea how to solve many basic problems that Nemzer had already solved, e.g., how to interact with the cursor. Over the next two months, I grew Nemzer's code into a prototype that could do more and integrated it with the MiniLaTeX system. Sadly, as I worked on longer and longer LaTeX documents, the editor slowed down to a point that I was no longer happy with it.

About that time I heard a podcast with Martin Janiczek, where, among other things, he discussed what turned out to be an earlier prototype that Nemzer had based his work on. The difference: in Janiczek's version, the fundamental data structure was an array of lines, whereas in Nemzer's it was a single string. The latter is in many ways easier to work with, but the former is more efficient. Over the course of a couple of weeks, I rewrote the code to use an array. That meant writing lots of primitives that work on arrays of lines to replace basic string operations. Finally: much better performance!
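
To make the difference between the two representations concrete, here is a minimal sketch of an array-of-lines buffer with one editing primitive. The module name, the type alias, and the function names are hypothetical and are not taken from the editor's code; the point is only that an edit rebuilds a single line rather than the whole document string.

```elm
module LineBuffer exposing (Buffer, fromString, insertAt, toString)

import Array exposing (Array)


{-| The document as an array of lines, one String per line.
(Hypothetical name, for illustration only.)
-}
type alias Buffer =
    Array String


fromString : String -> Buffer
fromString source =
    Array.fromList (String.lines source)


toString : Buffer -> String
toString buffer =
    String.join "\n" (Array.toList buffer)


{-| Insert `text` at the given row and column. Only the affected line
is rebuilt. An out-of-range row leaves the buffer unchanged. This
sketch assumes that `text` contains no newlines.
-}
insertAt : Int -> Int -> String -> Buffer -> Buffer
insertAt row column text buffer =
    case Array.get row buffer of
        Nothing ->
            buffer

        Just line ->
            Array.set row
                (String.left column line ++ text ++ String.dropLeft column line)
                buffer
```

For example, `fromString "ab\ncd" |> insertAt 1 1 "X" |> toString` evaluates to `"ab\ncXd"`. With a single-string representation, the same edit would have to split and rejoin the entire text.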

Another challenge

Well, the long file story was not over. Web browsers slow to a crawl when the number of DOM nodes grows too large, and of course that would be the case if the editor loads the entire text of a large document into the DOM so that it can be scrolled in an unfettered way.

I decided to address the problem head-on by downloading a copy of Darwin's The Origin of Species, weighing in at more than 15,000 lines. This was to be my test file. The strategy would be to maintain a "window" into the full array of lines, characterized by an offset and a "height": the number of lines in the window. I've been using 600 lines for the window, with the editor displaying just 33 lines. We'll call the 33-line visible display the viewport. When the upper or lower edge of the viewport gets too close to the upper or lower edge of the window, the offset is repositioned so that the viewport is well inside it. This solution worked really well, and was quite snappy when editing The Origin of Species.

Repositioning the viewport and window was slightly tricky - it should be unnoticeable to the user. Before solving that problem, of course, I had to redo the cursor-management code so as to take into account the offset.
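
The window and offset bookkeeping described in this section might be sketched as follows. This is an illustration under stated assumptions, not the editor's actual code: the `Window` record, the `margin` parameter, the recentering rule, and all function names are hypothetical.

```elm
module Window exposing (Window, reposition, toWindowRow, windowLines)

import Array exposing (Array)


{-| `offset` is the index of the first document line held in the window;
`height` is the number of lines in the window (600 in the text above).
-}
type alias Window =
    { offset : Int
    , height : Int
    }


{-| The slice of the document that is rendered into the DOM. -}
windowLines : Window -> Array String -> Array String
windowLines window lines =
    Array.slice window.offset (window.offset + window.height) lines


{-| Convert an absolute cursor row into a row relative to the window. -}
toWindowRow : Window -> Int -> Int
toWindowRow window absoluteRow =
    absoluteRow - window.offset


{-| Recenter the window around the viewport when the viewport comes
within `margin` lines of either window edge. `viewportTop` is an
absolute line index. The recentering rule is an assumption.
-}
reposition : { margin : Int, viewportHeight : Int, totalLines : Int } -> Int -> Window -> Window
reposition config viewportTop window =
    let
        viewportBottom =
            viewportTop + config.viewportHeight

        tooCloseToTop =
            viewportTop - window.offset < config.margin

        tooCloseToBottom =
            (window.offset + window.height) - viewportBottom < config.margin

        centered =
            viewportTop - (window.height - config.viewportHeight) // 2

        maxOffset =
            max 0 (config.totalLines - window.height)
    in
    if tooCloseToTop || tooCloseToBottom then
        { window | offset = clamp 0 maxOffset centered }

    else
        window
```

With a 600-line window at offset 0, a 33-line viewport whose top is at line 580, a margin of 50, and a 15,000-line document, `reposition` moves the offset to 297, so the viewport sits near the middle of the new window. Because cursor positions are stored in absolute coordinates in this sketch, only `toWindowRow` needs to know about the offset.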

On to new features!

This work done, I was finally ready to do what I had long wanted to do: implement LaTeX command completion. The idea is this. In LaTeX there are many constructions like

$$
\int x^n dx = \frac{x^{n+1}}{n+1}
$$

and

\begin{theorem}
There are infinitely many primes.
\end{theorem}

These are problematic in an app like minilatex.lamdera.app which updates the rendered LaTeX every 300 milliseconds or so. The reason is that if the user types \begin{theorem} without the closing \end{theorem}, the source text is in a particularly obnoxious error state that can extend to the end of the document. Ugh! One solution is to have a keystroke sequence like ESC th ESC that inserts the text below, placing the cursor just before the XXX:

\begin{theorem}
XXX
\end{theorem}

This way, the user can be in a mostly error-free state at all times. (By "mostly", I mean avoiding really bad errors that extend over huge parts of the document because of unmatched begin-end pairs and the like).
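
One way to model this kind of completion is a lookup from abbreviation to environment name, returning the lines to insert together with the cursor position. The sketch below is hypothetical: the `Expansion` record, the `environments` table, and the "eq" entry are assumptions for illustration; only the "th" abbreviation and the XXX placeholder come from the description above.

```elm
module Completion exposing (Expansion, expand)

import Dict exposing (Dict)


{-| `lines` are inserted at the cursor row. `cursorRow` and
`cursorColumn` are relative to the first inserted line.
-}
type alias Expansion =
    { lines : List String
    , cursorRow : Int
    , cursorColumn : Int
    }


{-| Abbreviation table. "th" is from the text; "eq" is an assumed
example entry.
-}
environments : Dict String String
environments =
    Dict.fromList
        [ ( "th", "theorem" )
        , ( "eq", "equation" )
        ]


{-| Expand an abbreviation into a matched begin-end pair, so the
document never contains an unmatched \begin. Unknown abbreviations
yield Nothing.
-}
expand : String -> Maybe Expansion
expand abbreviation =
    Dict.get abbreviation environments
        |> Maybe.map
            (\name ->
                { lines =
                    [ "\\begin{" ++ name ++ "}"
                    , "XXX"
                    , "\\end{" ++ name ++ "}"
                    ]
                , cursorRow = 1
                , cursorColumn = 0
                }
            )
```

Here `expand "th"` yields the three theorem lines shown above with the cursor at row 1, column 0, that is, just before the XXX. Because both delimiters are inserted in one step, the renderer never sees the half-typed state.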

The Developer Experience

In a word, it was good. Elm's type system helps me to think about what needs to be done and to translate ideas into code in a friction-free way. I've also found that it helps when I come back to a project after a long break, as I did several times with this one. In a very short time, I'm able to get the code "back into my head" and start forging ahead.

The fact that Elm is a compiled language is a huge advantage. A problem I faced in my previous life as a web developer was extreme FOR anxiety (Fear Of Refactoring): don't even think about the code, much less touch it, because it might break! With Elm, by contrast, refactoring is a pleasure. Many times now I've changed a fundamental data structure deep in the core of the code, followed the compiler messages, and come out the other end an hour or two or three later with a fully functioning app - and the confidence that it is indeed free of lurking runtime bugs.

One last comment. Some years ago, I watched Rich Hickey's talk, Hammock Driven Development. It is both entertaining and insightful. I used this methodology in the project several times with good results. Highly recommended!

Further technical notes

The app posted here was compiled using Matt Griffith's elm-optimize-level-2, then minified using uglify-js, resulting in a 307 KB js file. This js file includes code for the editor, the MiniLaTeX-to-Html compiler, and a custom Markdown compiler, all written in Elm. Using elm make --optimize, the js file is slightly larger: 319 KB.

The small asset size is due to the fact that elm make --optimize performs dead-code elimination on a per-function basis. If a module consists of one hundred functions, and only three (with their dependencies) are called by the code, then only those three survive compilation to JavaScript.

I'll have to run some benchmarks on large LaTeX files to see to what extent elm-optimize-level-2 speeds things up.

Tooling

I've been using Umberto Pepato's Velociraptor for running all my development scripts ever since I learned about it on dev.to. Simple, light-weight, and perfect for the job. I love it! Velociraptor is a script runner for Deno.
