DEV Community

Cover image for A Mermaid to the rescue - visualizing complex problems
Ryan Brown
Ryan Brown

Posted on

A Mermaid to the rescue - visualizing complex problems

The Setup

As a developer there are fewer things more frightening than an ask or requirement phrased similar to the following:

I have this excel spreadsheet, but we need it as an application.

Invariably the spreadsheet has a convoluted sequence of functions and back-references, look-ups and unexplainable inconsistencies in format and display.

Ideally sometime before a developer is handed a spreadsheet and the the above “requirement” there could be a detailed analysis of business requirements including data sources, and all transformations expected of that data including test cases and example data. However in a fast paced modern working environment that is a lofty luxury indeed.

So what is our hard-working, indefatigable developer to do? Trying to suss out the hierarchy of column formulas extending up to and possibly column AX, some referencing sums of other formula derived columns, is a head scratchier indeed.

The Situation

In a recent exercise the (nearly) exact situation described above was encountered. The resulting artifact was to be in the form of a Database View. The data was (or would be) available in tables in the target database so that was at least covered. Still, I needed to map out the column inter-relationships and devise which structure of nested select statements would be appropriate to capture the same functionality of the Excel spreadsheet.

I first considered mapping out the column relationships on paper, but quickly realized that doing so would need a substantial amount of erasing as I moved on to the next column and had to re-draw a portion of my map. Clearly a more automatic solution was needed and being a developer... I had an idea for a tool to use.

The Search

A few months ago I stumbled across a diagramming tool that could take diagram descriptions in a markup language and produce decent quality graphical outputs. This was an amazing discovery as I’ve always struggled with the visual documentation tools such as draw.io or Visio. I much prefer to construct things like this in a declarative way rather than using visual tools. For me the benefits of declarative ways include being able to copy-paste easier, check into source control and even programmatically create the declarative statements.

The tooling I found is called Mermaid and it has a really nifty live editor for quick work and learning: Since discovering Mermaid, I’ve used it for some light documentation of processes, data-flows and... some unorthodox things.

The Solution

Back to our problem of trying to get a handle on the rat’s nest of excel formulas. Let’s look at one...

an excel formula

Doesn’t look too bad... bet let’s look again.

$AA$1 is a “constant” to be provided from fairly static table. Not doing too bad so far.

P7 is another formula=SUM(M7:O7) OK so now referencing another formula. Now we're getting more complicated.

$P$19 is =SUM(P6:P18) Now things are interesting... a sum of the set of data.

The whole sheet is littered with things like this. Very often the results of one formula are used in later or earlier formulas.

I figured I could use a type diagram to help me understand these relationships.flowchart
So to start with....



flowchart TD
    AA7 --> $AA$1 & P7 & $P$19


Enter fullscreen mode Exit fullscreen mode

Line 1 Flowchart tells Mermaid that this is a type diagram.

Line 2 is indented for readability. Let’s take it piece by piece
AA7 This is the first node in the diagram. This is the simplest way to define a node, just an identifier.

--> This is a line, in this case an arrow line. It indicates a directional relationship. In this case, that the identifier (column) to the left is related (references) the identifiers (columns) the right.

$AA$1 The next identifier (column) that is the referenced item of the relationship.

& P7 We can indicate that more than 1 thing is related, we could have put the relationship on a separate line as AA7 -> P7 if we wanted to, but I like this one line per referencing column format for clarity.

& $P$19 The next column referenced.

So far we’ve defined a flowchart that has a node AA7 that references $AA$1, P7 and $P$19

Let’s see what it looks like:

Mermaid Diagram showing 4 nodes

We can clearly see that AA7 references the other three columns.

let’s expand on that...

P7 is =SUM(M7:O7) and becomes a line in our diagram P7 --> M7 & N7 & O7 Mermaid doesn’t understand Excel range notation, so I manually unrolled this.

$AA$1 is a value so we don’t need to expand it anymore.

$P$19 is =SUM(P6:P18) I’m really interested at a line-by-line model of this so I can just say that it references back to P7 This becomes a line in the diagram $P$19 --> P7

With our full (so far) diagram markup of



flowchart
    AA7 --> $AA$1 & P7 & $P$19
    P7 --> M7 & N7 & O7
    $P$19 --> P7


Enter fullscreen mode Exit fullscreen mode

The resulting diagram is

Mermaid diagram showing 7 nodes

Now we can more clearly see the hierarchy of our Excel formulas.

Using this method I created a flowchart of all the formulas for one row of the “spreadsheet that is a requirement”. Here is what the full markup looks like:



flowchart BT
    H6 --> F6 & G6
    I6 --> AT6
    J6 --> H6 & I6
    KG --> AU6
    P6 --> M6 & N6 & O6
    T6 --> R6 & S6
    AA6 --> P6
    AA6 --> P19_SUM_P6
        P19_SUM_P6 --> P6
    AB6 --> Y6 & Y19_SUM_Y6 & W6 & W19_SUM_W6 & T6 & T19_SUM_T6 & V6 & V19_SUM_V6
        Y19_SUM_Y6 --> Y19
        W19_SUM_W6 --> W6
        T19_SUM_T6 --> T6
        V19_SUM_V6 --> V6
    AC6 --> P6 & T6 & V6 & W6 & Y6 & AA6
    AD6 --> P6 & T6 & V6 & W6 & Y6 & AB6
    AE6 --> AL6
    AF7 --> AL6
    AJ6 --> AG6
    AK6 --> AH6
    AL6 --> AJ6 & AK6
    AM6 --> P6
    AN6 --> AD6 & AM6
    AO6 --> AN6 & AL6
    AP6 --> AO6 & AO19_SUM_AO6
        AO19_SUM_AO6 --> AO6
    AQ6 --> AO6 & AP6 & AN6
    AR6 --> AN6 & H6
    AS6 --> AQ6 & AQ19_SUM_AQ19
    AT6 --> AS6 & AQ6
        AQ19_SUM_AQ19 --> AQ6
    AU6 --> F6 & J6
    AV6 --> J6 & H6
    AW6 --> AU6
    AX6 --> AV6 & AW6


Enter fullscreen mode Exit fullscreen mode

It’s big.

I did change the direction of the flowchart by adding BT after the flowchart definition. This stands for Bottom-to-Top. Other directions supported are LR and RL in addition to TD the default for Top-Down.

Here is the resulting diagram:

Mermaid diagram showing MANY nodes

Is your mind blown yet?

This diagram clearly shows me the conceptual layers of the SQL View I’ll be creating. I could now see which columns I needed to have ready for
each following layer.

At initial count I see 18 layers. I count by horizontal groupings. This a naive count of layers as many of the smaller functions can be in-lined into a sibling layer. In practice the very large functions that include checks for divide by zero were not rolled together with other layers and... the summing columns could not be rolled in without some correlated-sub-query mess. It was much cleaner to have these as separate layers.

The Summary

Had I not used Mermaid to help me diagram the requirements, it would have taken me multiple times longer than it did with using the diagram to assist in understanding the chain of data formulas.

That I don’t need to manually adjust the positions of nodes or connect and draw lines myself, saved me a ton of time. Being able to save the
markup to a text file for later editing or adding to this very same blog-post... priceless (insert appropriate credit-card commercial here💳).

I plan on using Mermaid diagrams in this manner in the future to better capture the true nature of requirements.

I strongly encourage you to check out the Live Editor and Documentation to learn more about how you can use Mermaid diagrams.

Of note, in this task we used just ONE type of Mermaid diagram out of many... and more are being added. Mermaid is open-source and being constantly improved on.

Here are the diagram types that Mermaid has as of this writing:

  • Flowchart
  • Sequence Diagram
  • Class Diagram
  • State Diagram
  • Entity Relationship Diagram
  • User Journey
  • Gantt
  • Pie Chart
  • Gitgraph (Git) Diagram
  • C4C Diagram (Context) Diagram
  • Mindmaps
  • Timeline

Top comments (0)