Discussion on: Consider SQL when writing your next processing pipeline

IBM has had such a tool since the late '90s: AppConnect (earlier called "WebSphere Message Broker" and then "IBM Integration Bus"). It's an application integration tool with a bunch of data transformation engines. Its first and still most relevant transformation engine (the "compute node") uses an SQL dialect called ESQL. Its main benefit is that it abstracts the data structure from its serialised representation. That way you are not bound to databases: you can use ESQL to transform any kind of data, e.g. XML to CSV, or you can join JSON with tables in different databases. A pipeline is called a "flow" in this tool.
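To illustrate what "abstracting the data structure from its serialised representation" means, here is a minimal sketch in plain Python. It is not ESQL and does not use any AppConnect API. The sample data and the names `parse_xml` and `to_csv` are made up for illustration. The point is only that the filter in the middle works on the logical rows and does not care that the input was XML and the output is CSV.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input; any parser producing the same rows would do.
XML_IN = """<orders>
  <order><id>1</id><customer>alice</customer><total>20.50</total></order>
  <order><id>2</id><customer>bob</customer><total>5.00</total></order>
</orders>"""


def parse_xml(text):
    """Turn the XML wire format into a format-neutral list of dicts."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in order}
            for order in root.findall("order")]


def to_csv(rows, fields):
    """Serialise the same logical rows as CSV."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = parse_xml(XML_IN)
# The transformation only sees the logical structure, not XML or CSV.
big = [r for r in rows if float(r["total"]) > 10]
csv_out = to_csv(big, ["id", "customer", "total"])
print(csv_out)
```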

But Marcel is right - 1200 lines of SQL code are hard to grasp. You are better off using something different for the heavy algorithmic lifting and limiting the use of ESQL to joining and filtering data from different sources. You can use a Java node in AppConnect for such a scenario and build a flow like this:

+--------+
| CSV in +--+
+--------+  |
            |    +--------+     +--------+    +----------+
            +--->+  ESQL  +---->+  Java  +--->+ REST out |
+--------+  |    +--------+     +--------+    +----------+
| DB in  +--+
+--------+
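
The same division of labour can be sketched outside AppConnect. The following is only an analogy under stated assumptions: SQLite stands in for the ESQL node, a plain Python function stands in for the Java node, and the "REST out" step just builds the JSON body instead of sending it. All table names, sample data, and function names are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical CSV input (the "CSV in" box).
CSV_IN = "customer_id,amount\n1,20.0\n2,5.0\n1,7.5\n3,9.0\n"


def load_sources(conn, csv_text):
    """Fill an in-memory database: one table plays "DB in", one holds the CSV rows."""
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "alice", 1), (2, "bob", 0), (3, "carol", 1)])
    conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?)",
        [(int(r["customer_id"]), float(r["amount"])) for r in reader])


def sql_step(conn):
    """SQL does what it is good at: joining and filtering the two sources."""
    cur = conn.execute(
        """
        SELECT c.name, p.amount
        FROM payments p
        JOIN customers c ON c.id = p.customer_id
        WHERE c.active = 1
        ORDER BY c.name, p.amount
        """)
    return cur.fetchall()


def code_step(rows):
    """Procedural logic lives in ordinary code (the "Java" box in the diagram)."""
    totals = {}
    for name, amount in rows:
        totals[name] = totals.get(name, 0.0) + amount
    return [
        {"customer": name, "total": total,
         "tier": "gold" if total >= 20 else "standard"}
        for name, total in sorted(totals.items())
    ]


def rest_out(records):
    """Stand-in for "REST out": only builds the JSON payload, sends nothing."""
    return json.dumps(records)


conn = sqlite3.connect(":memory:")
load_sources(conn, CSV_IN)
result = code_step(sql_step(conn))
payload = rest_out(result)
print(payload)
```

The SQL stays a few lines long because it only joins and filters. Everything that would otherwise turn into nested subqueries or window tricks sits in `code_step`, where it can be unit-tested on its own.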