These days, there’s been an urgent demand for COBOL programmers across a number of US states. Given that we’re talking about a programming language designed in the late 50s, some 60 years ago, this demand is quite unprecedented.
Let’s start at the top. COBOL is a programming language that has been used on mainframe computers since the 60s, primarily in business systems such as those of banks and public administration. These computer systems, some of which date back to the 70s, are still in use. According to a 2017 Reuters report, over 40% of US banks still use systems built on COBOL. Over 80% of in-person transactions use COBOL, and 95% of ATMs rely on COBOL code. We’re talking about 220 billion lines of COBOL code that are still in use, in the US alone.
COBOL was designed a couple of years after FORTRAN, the language used mainly by engineers. These two languages have a lot in common (which is also a sort of window into that early age of software development). The idea behind COBOL was sound: to create a standardised business computer language for a wide range of computers, i.e. a portable programming language for data processing. That’s where its name comes from: COmmon Business-Oriented Language.
The language was designed by representatives from corporations, as well as the US Department of Defense. It was conceived as a temporary solution, a sort of stopgap. However, the Defense Department passive-aggressively forced computer manufacturers to provide support for COBOL. They’d refuse to rent or buy any system without a COBOL compiler, unless it could be proven that COBOL was somehow disrupting system performance. In only a year, COBOL set off on a journey to meet its destiny. It became the industry standard and quite possibly the most used programming language of all time.
Shortly after, COBOL got its first upgrades, such as the ability to create reports. This made the language even more popular. In the late 60s, ANSI created the first COBOL standard. Since then, the standard has been revised and amended several times, roughly every ten years. The last revision was in 2014, so in fact not so long ago.
COBOL is one of the first high-level computer languages. A programme is written in code that looks more like spoken (English) language than machine commands. At the time of its conception, this was both novel and valuable. COBOL also has robust support for data processing, built into the language itself. This way, COBOL replaced a large amount of data handling, which at that time had to be done manually. Mainframe computer manufacturers promoted the language, with IBM at the forefront. To this day, the company keeps producing new versions of its mainframe solutions. This helped COBOL become an integral part of numerous computer systems in later decades, some of which perform vital business or government functions.
COBOL is a simple programming language. Its simplicity is what keeps all those billions of lines of code working: it makes the code clear and, I dare say, more resilient to mishaps. Let me be clear: people make mistakes in COBOL as much as in any modern programming language. But the lack of complex concepts makes the code easier to understand and maintain. As COBOL evolved, it started including more complex concepts. For example, it got its object-oriented features in the 2002 version.
From today’s perspective, programs written in COBOL look crude, even painful at times. The syntax is very "talkative" and the documentation is incomplete. It’s far from being cool; in fact, COBOL’s popularity over the past 20 or so years has been pretty low. It remains the language of mainframe computers, tucked under the heap of modern computer languages and solutions that have overwhelmed us throughout the years. Software development has moved to the web and cloud. Processors are far more powerful than before. All this calls for new concepts and brings about different programming languages.
As a programming language, COBOL lives on. While it didn’t exist as open-source software back in the 80s and 90s, things have changed. Today there’s a GNU version of the compiler. You can find environments and add-ons for COBOL development on all operating systems. There are online communities that still nurture the skill of coding in COBOL.
So why is COBOL still in use? This is the question that’s troubling the US business world.
Let’s take a step back first. There is certainly awareness that these are legacy solutions. 92 of the top 100 US banks still use mainframe computers. 70% of these banks are Fortune 500 companies. If we accept that COBOL has become a remnant of the past and that mainframe computers should give way to smaller servers and the cloud, how come this hasn’t happened?
Interestingly, there’s a body in the US that evaluates system reliability: the Government Accountability Office. The GAO has reported that some government-owned systems are in desperate need of a thorough update. For example, the Department of Education still uses a 1973 system to process student data. It’s maintained by 18 contractors and requires special hardware, so it’s hard to integrate with modern software. The GAO considers COBOL a legacy programming language, which reflects the related problem of finding and hiring new programmers. This shortage in the workforce has made COBOL specialists extremely expensive.
Now, let’s go back to the question: why don’t we abandon COBOL and migrate to new, modern computer systems and solutions?
Because it would cost us dearly. Let me emphasize that: it’s ridiculously expensive. The Commonwealth Bank of Australia replaced their COBOL platform in 2012. It took them five years and the whole endeavour cost $750 million. There’s another example where the migration of a similar system to a Java platform has been going on for four years and still hasn’t been completed.
So the conclusion is self-evident: migration from a legacy platform is neither trivial nor cheap. While the business world might be able to afford it, many public institutions (in the US) simply can’t.
The Department of Homeland Security, for instance, uses a 2008 IBM z10 mainframe system which runs COBOL programmes. The Social Security Administration uses some 60 million lines of COBOL code. The situation is similar in other state institutions across the US, some of which rely on systems that are over 40 years old.
It would seem that COBOL is here to stay for quite some time yet.
At times of crisis, the amount of data that needs processing increases significantly. For example, as a growing number of people lost their jobs, the number of applications for unemployment in New Jersey soared by 1,600%, with 580,000 people filing claims in a short amount of time.
Legacy systems simply can’t support this change in traffic. No system can, unless it was designed from the get-go to be able to support it. On the other hand, designing robust systems is in itself a complex feat, so their development takes more time and costs more. Unfortunately, in a world where we want everything done right away, software development quality isn’t improving. But that’s a topic that deserves a blog post of its own.
All of this means that in extraordinary circumstances, existing systems have to be maintained: monitored, repaired, or sped up. The lack of programmers has led to a public outcry for COBOL programmers from users affected by increased traffic, primarily state institutions. Online COBOL courses have recently started to spring up, with IBM at the forefront, obviously looking to increase its user base.
The last time that COBOL was popular was at the turn of the millennium, due to the so-called Y2K problem. The industry had time to prepare for that event, as it had been working on the transition since the 80s. With a crisis such as the COVID-19 pandemic, there was no time to prepare.
What we can do is learn something from this.
We programmers have been obsessed with the shiny objects of software novelty. "Programmers are like children", I was once told by the CTO of a Singapore bank. "They run around like crazy (look at this, look at that), drawn to tech novelties, but they easily bruise in the process and break things", he added, answering my question on adopting new technologies in their banking system. And he wasn’t wrong. The industry doesn’t care whether your code is written this or that way. It cares whether your code will continue to work in 20 years and whether someone will understand it. It is, after all, the industry that gives value to the biggest part of code. For example, Java is the new COBOL, and (I’ll admit this only once) Oracle does a great job of maintaining compatibility with older versions and is super cautious when introducing novelties. I’m the first one to badmouth Java whenever I get the chance and call it the dullest modern language in the world, but it’s the (dull) coder in me who’s speaking, not the engineer.
That’s why we need to be careful, very careful, with the technologies that we adopt and that are becoming the norm. I’m not sure whether the organic approach, where we adopt a thing for the sake of its popularity, is the right one. Take Python for instance. Imagine just how much it would cost the industry to migrate all the programmes from version 2 to version 3 if they chose to use Python as the language of their computer systems. Or AngularJS, a failed concept that was later jettisoned. That’s why I repeat, we need to be careful with what we adopt. Sometimes more isn’t better.
There’s one more thing we can learn from this and it concerns our everyday work. If nothing else, we who develop software must not forget two equally important values of every computer system: robustness and sustainability. Write code that outlives you. On the other hand, robustness and quality are something we need to invest in. "Haste makes waste", as wise COBOL programmers would say.
GnuCOBOL is easily installed on OSX:
    brew install gnu-cobol
You can use Sublime or VS Code as the editor. Both of these support COBOL; for the latter, the support is more advanced than simple recognition of the syntax.
That’s it, now prepare to feel old.
Let’s have a look at a COBOL programme:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> PIC symbols:
       *>   9 - numeric
       *>   A - alphabetic
       *>   X - alphanumeric
       *>   V - decimal
       *>   S - sign
       01 NUM-VAR PIC S9(3)V9(2).
       01 COUNT1 PIC 9(2) VALUE 0.
       01 NUM PIC 9(9).
       01 TEXT-VAR PIC X(8) VALUE 'OBLAC.RS'.
       01 STR1 PIC X(8).
       01 STR2 PIC X(8).
       *> Group variable: BROJ means "number", NAZIV means "name".
       01 GROUP-VAR.
           05 BROJ PIC 9(3) VALUE 173.
           05 NAZIV PIC X(15) VALUE 'LALALAND'.
       *> Condition names (level 88) attached to CHECK-VAL.
       01 CHECK-VAL PIC 9(9).
           88 PASS VALUES ARE 044 THRU 100.
           88 FAIL VALUES ARE 000 THRU 43.

       PROCEDURE DIVISION.
           DISPLAY 'CIAO COBOL'.
           MOVE 2.1 TO NUM-VAR.
           DISPLAY "NUM VAR : " NUM-VAR.
           DISPLAY "TEXT VAR : " TEXT-VAR.
           DISPLAY "GROUP VAR : " GROUP-VAR.
           COMPUTE NUM = (NUM-VAR * NUM-VAR).
           DISPLAY "MUL : " NUM.
           IF NUM > 3 AND NUM LESS THAN 100 THEN
               DISPLAY "Yes!"
           END-IF
           MOVE NUM TO CHECK-VAL
           IF FAIL
               DISPLAY "Oops"
           END-IF
           *> Count the characters in TEXT-VAR.
           INSPECT TEXT-VAR TALLYING COUNT1 FOR CHARACTERS.
           DISPLAY "Count : " COUNT1.
           INSPECT TEXT-VAR REPLACING ALL 'C' BY 'K'.
           UNSTRING TEXT-VAR DELIMITED BY '.'
               INTO STR1, STR2
           END-UNSTRING.
           DISPLAY STR1
           *> Call the paragraph FN until COUNT1 reaches zero.
           PERFORM FN WITH TEST AFTER UNTIL COUNT1 = 0.
           STOP RUN.

       FN.
           DISPLAY 'Hi!'.
           SUBTRACT 1 FROM COUNT1.
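First of all, this is no mistake: the first six columns of each line are reserved (historically for sequence numbers) and are not used when writing code. The seventh column is also special: it is an indicator column, where an asterisk, for example, marks the whole line as a comment. The code itself is written from the eighth column. Everything is written in capital letters, so Caps Lock will come in handy. :)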
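To make the column layout concrete, here is a minimal sketch of a fixed-format programme that actually uses the reserved columns. The sequence numbers and the programme name COLDEMO are made up for illustration, and the sketch assumes a compiler running in fixed-format mode.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. COLDEMO.
000300* COLUMNS 1-6 HOLD AN OPTIONAL SEQUENCE NUMBER.
000400* AN ASTERISK IN COLUMN 7 MAKES THE WHOLE LINE A COMMENT.
000500 PROCEDURE DIVISION.
000600     DISPLAY 'FIXED FORMAT'.
000700     STOP RUN.
```

Division headers and paragraph names start in column 8, while statements such as DISPLAY are conventionally indented to column 12.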
A COBOL programme is divided into DIVISIONs. In the DATA DIVISION, we declare the variables. There are no predefined types; instead, each variable’s type is described by its PIC clause when it is declared. You might also find group variables interesting (they look like structures), as well as some kind of enum variable (
The programme logic itself is written in the PROCEDURE DIVISION. The programme above executes some basic computing actions and tasks. Working with strings can be particularly tedious; there’s a lot you need to write to carry out simple manipulations. The last thing you can see in our programme is how a procedure is called several times, in a loop.
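The enum-like behaviour comes from level 88 entries, which give names to ranges of values of the variable they are attached to. Here is a minimal sketch of that idea on its own; the programme name GRADECHK, the variable EXAM-SCORE and both condition names are hypothetical and not part of the programme above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRADECHK.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       *> Each level 88 entry names a range of values of EXAM-SCORE.
       01 EXAM-SCORE PIC 9(3) VALUE 0.
           88 SCORE-PASSED VALUES ARE 50 THRU 100.
           88 SCORE-FAILED VALUES ARE 0 THRU 49.

       PROCEDURE DIVISION.
           MOVE 72 TO EXAM-SCORE.
           *> Testing the condition name replaces an explicit
           *> comparison such as EXAM-SCORE >= 50.
           IF SCORE-PASSED
               DISPLAY 'PASSED'
           ELSE
               DISPLAY 'FAILED'
           END-IF.
           STOP RUN.
```

The condition names hold no data of their own. They are only true or false depending on the current value of EXAM-SCORE, which is why a single MOVE is enough to change which one holds.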
What COBOL was made to do is data processing, which means working with files. The language has built-in capabilities to work with files and records. Here’s what that looks like:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILES.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT ZAPISI ASSIGN TO 'file.txt'
               ORGANIZATION IS SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       *> ZAPISI means "records"; this is the layout of one record.
       FD ZAPISI.
       01 ZAPISI-STRUCT.
           02 UID PIC 9(6).
           02 NOTE PIC X(30).
           02 ACCOUNT.
               03 AMOUNT PIC 9(6)V9(2).
               03 BALANCE PIC 9(6)V9(2).
           02 ACCOUNT-ID PIC 9(7).
           02 ACCOUNT-OWNER PIC A(50).

       WORKING-STORAGE SECTION.
       *> The record to write, with the same layout as above.
       01 ZAPISI-RECORD.
           02 UID PIC 9(6) VALUE 123456.
           02 NOTE PIC X(30) VALUE 'TESTING'.
           02 ACCOUNT.
               03 AMOUNT PIC 9(6)V9(2) VALUE 000173.98.
               03 BALANCE PIC 9(6)V9(2) VALUE 000173.12.
           02 ACCOUNT-ID PIC 9(7).
           02 ACCOUNT-OWNER PIC A(50).

       PROCEDURE DIVISION.
           DISPLAY 'WRITING RECORD: ' ZAPISI-RECORD.
           OPEN OUTPUT ZAPISI
           WRITE ZAPISI-STRUCT FROM ZAPISI-RECORD
           CLOSE ZAPISI
           STOP RUN.
The programme above writes content into a file. The files you are working with here are simple, structured textual files; when you open them, I guarantee you’ll get the 80s vibe (if you’re old enough to remember them).
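Reading the records back follows the same pattern. The following is only a sketch, assuming that file.txt was produced by the programme above and that the record layout is copied unchanged; the programme name LISTRECS, the flag WS-END-FLAG and the condition NO-MORE-RECORDS are made up for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LISTRECS.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT ZAPISI ASSIGN TO 'file.txt'
               ORGANIZATION IS SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       *> Same record layout as in the writing programme.
       FD ZAPISI.
       01 ZAPISI-STRUCT.
           02 UID PIC 9(6).
           02 NOTE PIC X(30).
           02 ACCOUNT.
               03 AMOUNT PIC 9(6)V9(2).
               03 BALANCE PIC 9(6)V9(2).
           02 ACCOUNT-ID PIC 9(7).
           02 ACCOUNT-OWNER PIC A(50).

       WORKING-STORAGE SECTION.
       *> Hypothetical end-of-file flag with a condition name.
       01 WS-END-FLAG PIC X VALUE 'N'.
           88 NO-MORE-RECORDS VALUE 'Y'.

       PROCEDURE DIVISION.
           OPEN INPUT ZAPISI.
           *> Read one record per iteration until the file ends.
           PERFORM UNTIL NO-MORE-RECORDS
               READ ZAPISI
                   AT END
                       SET NO-MORE-RECORDS TO TRUE
                   NOT AT END
                       DISPLAY 'UID: ' UID ' AMOUNT: ' AMOUNT
               END-READ
           END-PERFORM.
           CLOSE ZAPISI.
           STOP RUN.
```

The sketch does no error handling: if file.txt is missing, the OPEN fails at run time. A more careful version would add a FILE STATUS clause to the SELECT entry and check it after each operation.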
The programmes above are compiled with
    cobc -x <name>.cob
which gives you the executable programme.