Developing critical software: Intro to formal methods and theorem provers

#programming #computerscience #development

The amount and complexity of software developed and used has grown tremendously. There are many applications where high reliability must be ensured, because failures are costly (with respect to human lives, environmental issues, or money).
There are more safety-critical software systems than you realize, but they don’t get nearly as much attention as consumer-focused software, and it can be found in almost all areas, e.g., aviation, (nuclear) power plants, medicine, transportation, space technologies, process control, and banking.
Whereas the crash of a word processor or your favorite game may be a nuisance, a malfunction in software controlling a nuclear plant could possibly cost thousands of human lives, and could put millions more at risk; recent actual examples of critical systems failing would be Boeing’s 737 MAX groundings, and the “Meltdown and Spectre” Vulnerabilities.

Safety first

In most companies, software developers are primarily concerned with shipping more features, and delivering more value. But software developers involved in the creation of safety-critical systems must be concerned primarily with creating safe systems. And that is a very different task.
Most safety-critical software appears to be developed using the waterfall or spiral models. NASA specifically recommends against using Agile methods for the safety-critical components in its software guidebook (page 87)

defects per thousand lines of code vs. dependability of multiple systems — Defects per thousand lines of code vs. dependability of multiple systems, the goal here is to show how "different" critical systems are

The problem with testing:

The most common way to check for faults and bugs is to use testing, and human review. During and after the implementation of the system, the system’s functionality is usually tested to ensure that the resulting program satisfies the requirements and that no errors or bugs are present. Testing large and complex systems can be very time consuming and, due to the size of the system and the amount of code, an exhaustive testing is practically impossible. Nevertheless, when the system is safety and security critical, correct functionality must be guaranteed, which requires either exhaustive testing or a way of proving that the code correctly implements the specification.

Formal methods:

In computer science, formal methods are mathematically rigorous techniques and tools for the specification, design and verification of software and hardware Systems. Mathematically rigorous means that the specification consists of Well-formed statements using mathematical logic and that a formal verification consists of rigorous deductions in that logic. A formal specification is precise and there is no risk of misinterpretations. Also, if there is a proof that the implementation abides by the specification, then one can be sure that the programmers have implemented what is described in the specification, which means complete verification of the entire state space of the system.

a basic banking system specification using the "Z" formal specification language — A basic banking system specification using the "Z" formal specification language

When formal methods cannot be used through the entire development process (due to the complexity of the system, lack of tools or other reasons), they can still be used successfully on parts of the system, for example on the most safety or security critical components alone.

Formal methods are used in several ways:

To assure the software after-the-fact.
To assure the software in parallel.
To develop the software.

“After the fact” software verification can increase the confidence in a safety-critical system. When the regular software development is completed, then the formal specification and verification begin. The software assurance engineer converts the “human readable” requirements into a formal specification and proves properties about the specification. The code that implements the system may also be formally verified in order to comply with the formal specification. With this approach, two separate development activities occur, increasing costs and schedule length. In addition, problems found at this late stage are costly to fix.

“In parallel” software verification still uses two separate teams (software development and FM verification), but they operate in parallel during the whole process. The development team uses the regular practices of good software development. At the same time the FM team writes and verifies the formal system specifications. While still costly, this method of assuring the software allows for quicker development. Software errors are detected earlier in the development cycle when they are less expensive to correct. However, communication between the two teams is crucial for this approach to work.

Instead of having two teams working in parallel, the software can be developed using FM exclusively. This is an integrated approach. Requirements and design are written in a formal language. The design is formally verified before code is generated. This method is the least costly of the three, though the developers must be trained in FM for it to work.

Most formal methods are considered to be too clumsy and time-consuming in their application. They usually require a long learning curve for the software engineer and are often incompatible with the "as-we-always-did" software development procedures. Together with schedules, which are generally very tight, an application of formal methods in an industrial environment is often not accepted by the project team. This is where theorem provers can help.

Interactive theorem provers

By interactive theorem proving, we mean some arrangement where the machine and a human user work together interactively to produce a formal proof. There is a wide spectrum of possibilities. At one extreme, the computer may act merely as a checker on a detailed formal proof produced by a human; at the other the prover may be highly automated and powerful, while nevertheless being subject to some degree of human guidance.
ITPs can be customized to specific application domains by providing
libraries of tactics, thus increasing the amount of automatic processing.
On the other hand, interactive theorem provers have several serious disadvantages:

they can sometimes become “too interactive”, Even the smallest steps in the proof must be carried out by hand.
Large and complicated proofs take weeks to months to be completed.
most interactive theorem provers can only be used by people who have a lot of knowledge and experience in the application domain, the calculus, and sometimes even the implementation language of the prover, Therefore, the use of interactive theorem provers requires specially trained and experienced people.

Automated theorem provers

Automated theorem provers are programs which, given a formula, try to evaluate whether the formula is universally valid. Currently, most ATPs are like racing cars: although very fast and powerful, they can’t be used for everyday traffic, because essential things (like headlights) are missing. The classical architecture of ATPs must be extended to several directions in order to be useful for real applications.
Automated theorem provers are not a universal cure for every application and every proof obligation, even if they could be processed by the prover; In many cases, user interactions will always be necessary to perform complex core proof operations, However, the goal is to relieve the user from tedious low-level detail-work. Ideally, automated theorem provers can play a role in software engineering equivalent to a pocket calculator in classical engineering: simple tasks are done automatically, but there is still work to be done by the engineer.
Another problem with automated theorem provers is that, in most cases, there is no user interface, and no possibility of debugging a formula. Current implementations expect a syntactically correct formula, and they provide no feedback if no proof can be found.

The theorem prover museum is an initiative to conserve the sources of theorem prover systems for future analysis, since they are important scientific artifacts. It has the sources of many of the theorem provers.

examples of practical use:

Formal methods have been successfully used on NASA, military, and commercial systems that were considered safety-critical applications, and interactive theorem provers such as Isabelle are being used successfully in verification of compilers, protocols, hardware, and algorithms.

The seL4 micro-kernel has been formally verified from its abstract specification down to its C implementation using machine checking. The correctness of a very detailed, low-level design is shown, and the C implementation is formally verified.
An abstract specification of the system was created as an operational model of the system. Then the Haskell programming language was used to implement a prototype of the kernel. This prototype could be automatically translated into the theorem prover Isabelle to form an executable, design-level specification of the kernel. Then the mode was manually implemented in C to form a high-performance C implementation of seL4.
AMD, Intel and others use automated theorem proving to verify that division and other operations are correctly implemented in their processors.
Lightweight Java programming language was designed for academic purposes within the Computer Laboratory of university of Cambridge. it was proven type-sound using Isabelle.