Graphing RPG source code with Neo4j

#ibmi #neo4j #clojure

As I learned Clojure, I had a lot of crazy, over-ambitious thoughts. I find that it is really good to try hard things. Software documentation and training materials seem mighty dry and cryptic until you have built "Buckets" to put the knowledge into. But once you have tried and failed, or tried and succeeded inefficiently, when you get training, you think, "If I had known that, I would have been able to write my program that way." It is much easier to retain knowledge once you know how and why it is useful. You have hooks to hang it on.

One such experiment stemmed from my attempt to "mass document" our software. Legacy RPG code is highly structured. It contains 'F specs', which define the files used for input and output. I had been working on writing documentation, and thought, "That would be useful to have." So I wrote a Clojure program to read through all 8000+ files we have and build maps out of the F specs contained within each of them.
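As a rough illustration of what "building maps out of F specs" can look like, here is a minimal sketch in Python (not the Clojure the original project used). It assumes the RPG IV fixed-format layout, with the form type in column 6, the file name in columns 7-16, and the file type in column 17; older RPG III source uses different columns, so the constants would need adjusting. The function name and the sample lines are hypothetical.

```python
# Assumed 1-based column positions for RPG IV fixed-format F specs.
# Older RPG III source lays these out differently; adjust as needed.
SPEC_COL = 6           # form type: 'F' marks a file spec
NAME_COLS = (7, 16)    # file name
TYPE_COL = 17          # I = input, O = output, U = update, C = combined

FILE_TYPES = {"I": "input", "O": "output", "U": "update", "C": "combined"}


def parse_f_specs(source_lines):
    """Return a list of {'file': ..., 'usage': ...} maps, one per F spec."""
    specs = []
    for line in source_lines:
        # Pad short lines so the column lookups below never fail.
        line = line.rstrip("\n").ljust(TYPE_COL)
        if line[SPEC_COL - 1].upper() != "F":
            continue
        if line[SPEC_COL] == "*":  # '*' in column 7 marks a comment line
            continue
        name = line[NAME_COLS[0] - 1:NAME_COLS[1]].strip().upper()
        if not name:  # continuation line without a file name
            continue
        usage = FILE_TYPES.get(line[TYPE_COL - 1].upper(), "unknown")
        specs.append({"file": name, "usage": usage})
    return specs


# Illustrative source lines only; not taken from any real program.
sample = [
    "     F" + "GLDET".ljust(10) + "IF   E           K DISK",
    "     F" + "GLDT".ljust(10) + "O    E             DISK",
    "     C                   EVAL      X = 1",
]

for spec in parse_f_specs(sample):
    print(spec)
```

Keeping the column positions as named constants makes it easy to point the same scan at source written to a different spec layout.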

As I scanned the code, I also searched for the most common patterns our programmers used to have one program call another, and added those to the map. Many programs are controlled by constants questions, so I searched the code for references to those as well.
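The call-pattern scan can be sketched with a couple of regular expressions. The two patterns below are assumptions chosen for illustration (an RPG `CALL 'PGM'` opcode and a CL `CALL PGM(...)` command); the patterns actually used in the project are not shown in the post, and real code bases usually need more of them.

```python
import re

# Hypothetical patterns; a real scan would likely need several more.
CALL_PATTERNS = [
    # RPG style: CALL 'PGMNAME'
    re.compile(r"\bCALL\s+'([A-Z0-9_#$@]+)'", re.IGNORECASE),
    # CL style: CALL PGM(PGMNAME) or CALL PGM(LIB/PGMNAME)
    re.compile(r"\bCALL\s+PGM\((?:[A-Z0-9_#$@*]+/)?([A-Z0-9_#$@]+)\)", re.IGNORECASE),
]


def find_calls(source_lines):
    """Return the sorted, de-duplicated program names the source appears to call."""
    called = set()
    for line in source_lines:
        for pattern in CALL_PATTERNS:
            for match in pattern.finditer(line):
                called.add(match.group(1).upper())
    return sorted(called)


# Illustrative lines only.
call_sample = [
    "     C                   CALL      'GRGLUP'",
    "             CALL       PGM(GRGLPR)",
    "     C* no call on this line",
]

print(find_calls(call_sample))
```

A plain text scan like this cannot see program names held in variables, which is one reason such a map is only approximately complete.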

It was a fun project, but eventually I just had a big map (2.9 MB!). I was unsure how to use that map effectively. It seemed too unwieldy to use directly. It got filed away as a cool learning experiment.

A year or so passed, and I started to become interested in graph databases. I downloaded and installed Neo4j. I tinkered with it enough to suspect it could do the job. Once I had "Buckets built" (questions that I needed to solve), I enrolled in a one-day Neo4j class.

I quickly wrote another Clojure program to load my map into a Neo4j DB. I created nodes for all the programs, files, and relationships between them. Once I got the data into Neo4j, I was able to start to write queries. I added labels indicating which programs users can call from the menus.
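One way to picture the load step is to turn the map into Cypher `MERGE` statements. The sketch below is in Python rather than Clojure, and everything about the data model is an assumption: the shape of `program_map`, the `Program` and `File` labels, and the `CALLS`, `WRITES`, and `READ_BY` relationship types are all hypothetical. It only builds query text and parameters, so it runs without a database.

```python
# Hypothetical map shape: program name -> programs it calls and files it uses.
program_map = {
    "GRJRNL_C": {"calls": ["GRGLUP"], "files": []},
    "GRGLUP": {"calls": [], "files": [{"file": "GLDET", "usage": "output"}]},
    "GRGLPR": {
        "calls": [],
        "files": [
            {"file": "GLDET", "usage": "input"},
            {"file": "GLDT", "usage": "output"},
        ],
    },
}


def to_cypher(program_map):
    """Return (query, params) pairs that MERGE nodes and relationships."""
    statements = []
    for program, info in sorted(program_map.items()):
        statements.append(
            ("MERGE (:Program {name: $program})", {"program": program})
        )
        for called in info.get("calls", []):
            statements.append((
                "MERGE (a:Program {name: $program}) "
                "MERGE (b:Program {name: $called}) "
                "MERGE (a)-[:CALLS]->(b)",
                {"program": program, "called": called},
            ))
        for spec in info.get("files", []):
            # Edges point in the direction the data flows (an assumption).
            if spec["usage"] == "input":
                edge = "MERGE (f)-[:READ_BY]->(p)"
            else:
                edge = "MERGE (p)-[:WRITES]->(f)"
            statements.append((
                "MERGE (p:Program {name: $program}) "
                "MERGE (f:File {name: $file}) " + edge,
                {"program": program, "file": spec["file"]},
            ))
    return statements


for query, params in to_cypher(program_map):
    print(query, params)
```

With the official Neo4j Python driver, each pair could be passed to `session.run(query, params)`. `MERGE` is used instead of `CREATE` so that a file referenced by many programs ends up as a single node. Relationship types are written into the query text because Cypher does not allow them to be passed as parameters.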

So now I can write a Cypher query that gets all of the possible paths that branch off of a menu item. A customer may call and say "I ran this program, and this report is wrong" -- and I can quickly identify the actual program that wrote the file in question (sometimes the third or fourth in a sequence of events). This gives me a more certain starting point for troubleshooting. In the map above, for example, the customer calls GRJRNL_C -> program GRGLUP -> file GLDET -> program GRGLPR -> file GLDT. It is much easier to look at the bubble graph than it is to read through 3 programs' worth of source code to figure that out.
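To make the "all paths from a menu item" idea concrete, here is a small Python sketch. The Cypher string shows what such a variable-length path query might look like; the `MenuItem` label and the six-hop limit are assumptions, not the query from the post. The function below it walks the example chain from this paragraph in memory, so the traversal can be tried without a database.

```python
# Hypothetical Cypher for reference: label names and hop limit are assumptions.
PATHS_FROM_MENU = """
MATCH path = (m:Program:MenuItem {name: $name})-[*1..6]->(n)
RETURN path
"""

# The example chain from the text, as a plain adjacency map.
edges = {
    "GRJRNL_C": ["GRGLUP"],
    "GRGLUP": ["GLDET"],
    "GLDET": ["GRGLPR"],
    "GRGLPR": ["GLDT"],
}


def paths_from(start, edges):
    """Return every cycle-free path of at least one hop that begins at start."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in edges.get(path[-1], []):
            if nxt in path:  # do not walk around a cycle
                continue
            new_path = path + [nxt]
            paths.append(new_path)
            stack.append(new_path)
    return paths


longest = max(paths_from("GRJRNL_C", edges), key=len)
print(" -> ".join(longest))
```

The traversal returns every prefix of the chain as its own path, much as a variable-length pattern in Cypher returns one row per matching path length.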

It is still far from perfect. Programmers have changed a lot over the last 35 years, and I have identified most of their patterns, but not all. The CL code can also override the files to redirect input or output to use another file. I suspect my automated documentation is about 80-85% accurate, but if you take it with that grain of salt in mind, it is still pretty handy as a starting point.

And I learned a lot about RPG, about Clojure, and about Neo4j -- so even if it comes to nothing, I've got that!

Top comments (3)

meanFlatPrep • Oct 10 '18 • Edited

:) ... well done. Sometimes I find myself prototyping an idea in JavaScript with mindmup (in the end I download the file with Outline - Tab Indented Text - Export - Local to preview it). But what would be really useful would be to attach a Neo4j DB or any other graph DB (the ones that run in the browser come to mind) to some sort of IDE in order to automatically create, organize, and assemble the code snippets. By the way, I tried to do this kind of prototyping with Xmind (the free open-source edition), but it didn't allow me to save the text file the way I wanted.