Best practices for PHP exception handling

#php #exceptions #oop #softwaredesign

Handling errors or other non-’happy path’ situations is essential when creating robust PHP applications. While errors were the main construct to do so in PHP 4, exceptions have been around since PHP 5. They should nowadays be considered the main mechanism for handling alternative or exceptional paths. It seems that these alternative paths still don’t always get the attention they deserve, though.

Proper exception handling takes quite some effort, but will eventually result in a much more stable application. A sensible exception handling strategy makes it clear what exceptions should be expected (and thus handled!) at a given point in the code. Moreover, it will maintain the encapsulation and abstraction you carefully applied to your object-oriented design. Last but not least, it should make debugging a breeze.

In this post, I would like to introduce you to the set of best practices we have adopted at Moxio over the years. We have found these to work very well for us, but keep in mind that they are our best practices; your mileage may vary. The following guidelines are aimed at PHP code, but the basic principles behind them will also (with some translations) work for similar languages.

Types of exceptions

We make a distinction between the two top-level types of exceptions that the Standard PHP Library (SPL) defines: LogicException and RuntimeException. The interpretation of these two types varies in the literature. We, however, attach the following meaning to them:

A LogicException is an exception that should require a fix or change in code. With 'code' we do not only mean source code, but any content managed by a developer. This includes configuration and database contents that are not maintained by an end user of the system. A LogicException is mainly an internal guard or assertion. In perfectly written and wired code one should never occur.
A RuntimeException is an exception that might also occur in code that is written and configured 'perfectly'. Such an exception may be caused by input from the end user, or an error in (the communication with) an external system. When a RuntimeException is thrown, it should not require a fix in the code. If the exception ends up uncaught however, we should add some code to handle it. This may mean logging the error, using a fallback strategy, reporting the error to the user, or a combination thereof.

Note that this interpretation differs from the common perception in Java, where a RuntimeException is what we would call a LogicException. Furthermore, in our standards we classify every exception into one of these two categories. We do not create exceptions as a direct subclass of Exception.
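As a minimal sketch of this distinction, consider the two functions below. Both function names and their inputs are hypothetical and only serve as illustration: one validates a value that comes from an end user, the other receives a value that a developer chose in code or configuration.

```php
<?php
declare(strict_types=1);

// Hypothetical example: $input is typed in by an end user.
function parsePercentage(string $input): int
{
    if (preg_match('/\A\d{1,3}\z/', $input) !== 1 || (int) $input > 100) {
        // Can happen in perfectly written code: the user entered something invalid.
        throw new RuntimeException("Not a valid percentage: '$input'");
    }

    return (int) $input;
}

// Hypothetical example: $unit is chosen by a developer in code or configuration.
function secondsPerUnit(string $unit): int
{
    switch ($unit) {
        case 'minute':
            return 60;
        case 'hour':
            return 3600;
        default:
            // Only reachable when a developer passes an unsupported unit: fix the caller.
            throw new LogicException("Unsupported unit: '$unit'");
    }
}
```

The first failure calls for handling (for example, feedback to the user); the second calls for a change in the calling code.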

Exceptions as part of the function signature

We see the possible variants of RuntimeException that can be thrown or bubbled up from a function as part of the contract of that function. That means such exceptions should be annotated on the function using @throws.

If the function is part of the implementation of an interface, that interface should specify that (a supertype of) the exception in question could be thrown. Annotating such an exception on the implementation as well is not necessary. A function that implements an interface should never throw a runtime exception not declared on the interface. Such a case would be a violation of the Liskov Substitution Principle.
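The following sketch shows what this could look like. MailerInterface, MailDeliveryException and InMemoryMailer are hypothetical names invented for this example, and the code assumes PHP 7.1 or later for the void return type.

```php
<?php
declare(strict_types=1);

// Hypothetical runtime exception that is part of the mailer's contract.
class MailDeliveryException extends RuntimeException {}

interface MailerInterface
{
    /**
     * @throws MailDeliveryException when the message cannot be delivered
     */
    public function send(string $recipient, string $body): void;
}

class InMemoryMailer implements MailerInterface
{
    /** @var string[][] list of [recipient, body] pairs */
    public $sent = [];

    // No @throws here: the contract is already declared on the interface.
    public function send(string $recipient, string $body): void
    {
        if (strpos($recipient, '@') === false) {
            throw new MailDeliveryException("Undeliverable address: '$recipient'");
        }

        $this->sent[] = [$recipient, $body];
    }
}
```

Code that depends on MailerInterface only needs to read the interface to know which runtime exceptions to expect, whatever implementation is injected.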

In contrast, we do not consider the different types of LogicException to be part of the contract of a function. Of course we assume a perfect implementation, but at the same time we know there can always be an error somewhere. Therefore a LogicException can always be expected and unexpected at the same time. Hence annotating it using @throws is not desirable. See also 'catching exceptions' below.

Creating exception subclasses

Under these guidelines, creating a hierarchy of subtypes under RuntimeException is very desirable. The more specific the exceptions we throw (and annotate as part of our contract), the more granularly we can handle them. Subclasses of LogicException, however, are not necessary, as these are not part of the contract of a function anyway.
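To illustrate the benefit of such a hierarchy, here is a small sketch. The import scenario, the exception names and both functions are assumptions made up for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical hierarchy: one base type per component, specific subtypes below it.
class ImportException extends RuntimeException {}
class MalformedImportFileException extends ImportException {}
class ImportFileTooLargeException extends ImportException {}

/**
 * Parses lines of the form "name,email".
 *
 * @throws MalformedImportFileException
 * @throws ImportFileTooLargeException
 */
function importContacts(string $contents, int $maxBytes = 1024): array
{
    if (strlen($contents) > $maxBytes) {
        throw new ImportFileTooLargeException("File exceeds $maxBytes bytes");
    }

    $rows = [];
    foreach (explode("\n", trim($contents)) as $line) {
        $fields = explode(',', $line);
        if (count($fields) !== 2) {
            throw new MalformedImportFileException("Expected 2 fields, got: '$line'");
        }
        $rows[] = $fields;
    }

    return $rows;
}

function importWithFeedback(string $contents): string
{
    try {
        return count(importContacts($contents)) . ' rows imported';
    } catch (ImportFileTooLargeException $e) {
        // Specific subtype: we can give targeted advice.
        return 'Please split the file into smaller parts.';
    } catch (ImportException $e) {
        // Any other import failure: generic handling.
        return 'The file could not be imported.';
    }
}
```

The caller decides how fine-grained the handling should be: it can react to one subtype specifically and fall back to the base type for everything else.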

Catching exceptions

To us, a RuntimeException is a checked exception. When such an exception can be thrown from a function, the calling function needs to either catch that exception or declare it as a possible exception from itself using @throws. Catching a runtime exception is a good idea when the calling code can sensibly handle the exception, or when it can re-throw it at a better level of abstraction (see further down). At the very least, it is important to think about the handling of these exceptions when calling a function that might throw them.

A logic exception should never be caught, at least not based on the type LogicException or a subclass thereof. These exceptions should never occur in a correct implementation. It therefore does not make sense to try to handle them along the way. Instead, the logic that triggered the exception should be fixed.

At some point it may be necessary to catch all exceptions, including both variants of RuntimeException and LogicException. As we require PHP 7 at minimum, we do this using Throwable in our catch-clause, not Exception. Constructs like this are exclusively meant for component entry points, and should never make assumptions about the specific cause or character of the exception. Such a construct will therefore never contain specific error-handling logic. Instead, it is a generic catch-all for logging or reporting the error, or giving the end user feedback that something went wrong.
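A generic catch-all of this kind might look like the sketch below. The function runEntryPoint and its two callable parameters are hypothetical; a real entry point would typically be a front controller, queue worker or CLI command.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical component entry point.
 *
 * @param callable $handler produces the response body as a string
 * @param callable $log     receives a log message as a string
 */
function runEntryPoint(callable $handler, callable $log): string
{
    try {
        return $handler();
    } catch (Throwable $e) {
        // No assumptions about the cause: just record it and give generic feedback.
        $log(get_class($e) . ': ' . $e->getMessage());

        return 'Something went wrong. Please try again later.';
    }
}
```

Because Throwable also covers PHP's internal Error types, this catches both kinds of exceptions described above as well as engine-level errors such as a TypeError.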

Sometimes a catch-all like this is also used for cleaning up resources or closing open connections. A finally-block is often better in such situations, especially if the exception is re-thrown at the end of the catch-block.
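As a small illustration of cleanup with a finally-block, the hypothetical function below closes its file handle whether reading succeeds or an exception is thrown, without catching anything.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper that returns the first line of a file.
 *
 * @throws RuntimeException when the file cannot be opened or is empty
 */
function readFirstLine(string $path): string
{
    $handle = @fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open '$path'");
    }

    try {
        $line = fgets($handle);
        if ($line === false) {
            throw new RuntimeException("'$path' is empty");
        }

        return rtrim($line, "\r\n");
    } finally {
        // Runs on both the return path and the exception path.
        fclose($handle);
    }
}
```

Compared to a catch-all that cleans up and re-throws, the exception passes through untouched and the cleanup code exists in only one place.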

Special case: debug info

One valid reason to catch a LogicException anyway is to augment it with extra debugging information that was not available deeper in the call stack. In such a case we directly throw a new LogicException with the extra data, and the original exception in $previous. We preferably catch such an exception based on the most generic type possible. This may be a common base class or marker interface for all LogicExceptions that can occur in the given piece of code.

The alternative for such a catch would be to pass the debug information on to the callees that eventually produce the error. We prefer not to do so if such information is only used for passing it into an exception.
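The catch-and-augment approach could be sketched as follows. The template scenario, the marker interface TemplateLogicExceptionInterface and all function names are assumptions invented for this example.

```php
<?php
declare(strict_types=1);

// Hypothetical marker interface for all logic exceptions of this component.
interface TemplateLogicExceptionInterface extends Throwable {}

class UnknownPlaceholderException extends LogicException implements TemplateLogicExceptionInterface {}

// Deep in the call stack: knows the placeholder, but not the template file.
function renderPlaceholder(string $name, array $values): string
{
    if (!array_key_exists($name, $values)) {
        throw new UnknownPlaceholderException("Unknown placeholder '$name'");
    }

    return (string) $values[$name];
}

function renderTemplateFile(string $file, array $placeholders, array $values): string
{
    try {
        $parts = [];
        foreach ($placeholders as $name) {
            $parts[] = renderPlaceholder($name, $values);
        }

        return implode(' ', $parts);
    } catch (TemplateLogicExceptionInterface $e) {
        // Only this level knows which file was being rendered: add it and chain the original.
        throw new LogicException("Failed to render '$file': " . $e->getMessage(), 0, $e);
    }
}
```

The file name never has to be passed down to renderPlaceholder just so that it can end up in an error message.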

Throwing a new exception after catching

After catching an exception, it is of course possible to throw a new exception. Indeed, for sensible exception handling this is needed more often than one may expect. Just make sure to always set the $previous parameter of the new exception to the original (caught) exception. This ensures that the full cause of the exception can still be derived. Because of the order of the Exception constructor parameters it is then often necessary to specify a $code. We don't use this parameter, so we just set it to 0.
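A minimal sketch of chaining through $previous is shown below. InvalidSettingsException, parseSettings and describeChain are hypothetical names, and the example assumes PHP 7.3 or later because it relies on the JSON_THROW_ON_ERROR flag and JsonException.

```php
<?php
declare(strict_types=1);

class InvalidSettingsException extends RuntimeException {}

/**
 * @throws InvalidSettingsException
 */
function parseSettings(string $json): array
{
    try {
        $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        // Message first, then the unused $code (0), then the original exception.
        throw new InvalidSettingsException('Settings are not valid JSON', 0, $e);
    }

    if (!is_array($data)) {
        throw new InvalidSettingsException('Settings must be a JSON object or array');
    }

    return $data;
}

// Walks the chain of causes, outermost exception first.
function describeChain(Throwable $e): array
{
    $classes = [];
    for ($current = $e; $current !== null; $current = $current->getPrevious()) {
        $classes[] = get_class($current);
    }

    return $classes;
}
```

A logger that walks the chain like describeChain does can report both the high-level failure and the low-level cause behind it.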

Translation to the correct level of abstraction

Catching-and-throwing is often necessary to ensure that an exception manifests itself at a suitable level of abstraction. Suppose we have a UserRepositoryInterface that is implemented by a DatabaseUserRepository. The latter stores users in the database, which raises a DuplicateDatabaseKeyException if a user with the given username already exists. According to the rules described earlier we should use @throws to annotate this exception on the interface, but that rightly feels a bit strange. Why would a generic interface, meant to abstract the storage mechanism away, know about a type of exception specific to a database? The solution is to catch the DuplicateDatabaseKeyException within DatabaseUserRepository and throw something like a UserAlreadyExistsException in its place. This exception matches the level of abstraction of UserRepositoryInterface: it knows about users, but not how they are persisted. It can therefore be added to the signature of that interface without issues.
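The repository example can be sketched roughly as follows. The class InMemoryTable is a hypothetical stand-in for a real database table with a unique key, and the method names add and insert are likewise assumptions; only the interface, repository and exception names come from the description above.

```php
<?php
declare(strict_types=1);

class DuplicateDatabaseKeyException extends RuntimeException {}
class UserAlreadyExistsException extends RuntimeException {}

// Hypothetical stand-in for a database table with a unique key.
class InMemoryTable
{
    private $rows = [];

    /**
     * @throws DuplicateDatabaseKeyException
     */
    public function insert(string $key, array $row): void
    {
        if (isset($this->rows[$key])) {
            throw new DuplicateDatabaseKeyException("Duplicate key '$key'");
        }
        $this->rows[$key] = $row;
    }
}

interface UserRepositoryInterface
{
    /**
     * @throws UserAlreadyExistsException
     */
    public function add(string $username, string $email): void;
}

class DatabaseUserRepository implements UserRepositoryInterface
{
    private $table;

    public function __construct(InMemoryTable $table)
    {
        $this->table = $table;
    }

    public function add(string $username, string $email): void
    {
        try {
            $this->table->insert($username, ['email' => $email]);
        } catch (DuplicateDatabaseKeyException $e) {
            // Translate the storage-level failure into a domain-level one.
            throw new UserAlreadyExistsException("User '$username' already exists", 0, $e);
        }
    }
}
```

Consumers of UserRepositoryInterface never see the database exception directly, yet it remains available through getPrevious() for debugging.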

From RuntimeException to LogicException

It is very well possible that an exception that was a RuntimeException is at some point converted into a LogicException. This has to do with specific knowledge we have at that point, from which we know that the given exception should not be possible. That knowledge was not available deeper in the call stack.

To illustrate this, assume we have an XML reader with a method getUniqueTagContents. This method reads the contents of one unique tag from an XML file based on the tag name. A lot of things can go wrong inside such a method: the XML file may be malformed, the given tag may not be present, or it may occur multiple times. These are all examples of a RuntimeException. Without extra knowledge about the origin of the XML (which may be uploaded by a user) and the tag name, they can also occur in a perfectly programmed application. But it is possible that we use this method on a piece of XML that we just validated against an XML schema that enforces the existence and uniqueness of the given tag. In such a situation we know that getUniqueTagContents should not fail. The same applies when we use the method to read a configuration file that we put in VCS ourselves and which we thus fully control.

In such situations we still have to catch the ‘impossible’ runtime exceptions, as we consider them checked. In this catch-block we then throw a LogicException: this situation should never happen. Of course we save the original exception through the $previous parameter.
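The conversion could look like the sketch below. Note the assumptions: getUniqueTagContents is reduced here to a deliberately naive, regex-based stand-in that does no real XML parsing, and TagNotUniqueException, BUNDLED_CONFIG and readBundledLocale are hypothetical names.

```php
<?php
declare(strict_types=1);

class TagNotUniqueException extends RuntimeException {}

/**
 * Naive stand-in for a real XML reader; it only counts simple <tag>...</tag> pairs.
 *
 * @throws TagNotUniqueException when the tag is missing or occurs more than once
 */
function getUniqueTagContents(string $xml, string $tag): string
{
    $quoted = preg_quote($tag, '~');
    $count = preg_match_all('~<' . $quoted . '>(.*?)</' . $quoted . '>~s', $xml, $matches);
    if ($count !== 1) {
        throw new TagNotUniqueException("Expected exactly one <$tag>, found " . (int) $count);
    }

    return $matches[1][0];
}

// Stands in for a configuration file that lives in version control.
const BUNDLED_CONFIG = '<config><locale>nl_NL</locale></config>';

// No @throws: for developer-controlled configuration a failure is a bug, not part of the contract.
function readBundledLocale(string $configXml = BUNDLED_CONFIG): string
{
    try {
        return getUniqueTagContents($configXml, 'locale');
    } catch (TagNotUniqueException $e) {
        // A checked exception that should be impossible here: convert it.
        throw new LogicException('Bundled configuration is broken', 0, $e);
    }
}
```

If the same reader were used on a user upload, the TagNotUniqueException would instead be handled or declared with @throws, since there it remains a legitimate runtime situation.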

This is a pattern that occurs often. Deep in the call stack, where the bigger picture is not available, many faults are a RuntimeException. As the exception bubbles up (whether translated to another level of abstraction or not), it reaches a point where we know the error should be impossible. At that point it becomes a LogicException. Note that the inverse is not possible: an unexpected error cannot suddenly be expected.

A grey area

The distinction between a RuntimeException and a LogicException is not always 100% clear. There is a grey area where the correct type of exception depends on interpretation and the semantic contract of a function. A few examples to illustrate this:

Syntax error in a query

A method executeQuery to execute a database query can fail due to a syntax error in that query. At first sight this looks like a RuntimeException: we don't know where the query comes from and thus cannot guarantee its syntactical correctness. On the other hand it would be very strange if user input (or input from another source beyond our control) could lead to a syntax error. That smells of SQL injection. It is therefore very reasonable to state that the code calling executeQuery is responsible for the syntactical correctness of the query. That makes the exception a LogicException. An exception to this rule would be an application like Adminer or phpMyAdmin, where we should expect errors in user-entered SQL queries.

Cache item not found

Suppose we have a cache class with a method get($key) to retrieve a cache item. Of course it can happen that get is called with a key that does not exist. We assume that we have chosen to communicate this via an exception (alternatives would be through the return value or a by-reference parameter). Would such an exception be a runtime or a logic exception?

The answer probably depends on the other methods on the cache class and how we expect to use these. If the cache has a has method we could demand that a consumer uses that method to check whether a cache item exists before retrieving it with get. In that case a LogicException is reasonable. Depending on the implementation of the cache there may be a tiny chance that a cache item is deleted between the calls to has and get. In such cases even a perfect implementation (which checks existence first) cannot fully prevent the error. The 'correct' exception category is not very clear here. For now we tend to use a LogicException for situations with rare edge cases like this.
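A sketch of the has-then-get contract, using a hypothetical ArrayCache class and a hypothetical getOrCompute helper, might look like this:

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory cache whose contract requires calling has() before get().
class ArrayCache
{
    private $items = [];

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    public function has(string $key): bool
    {
        return array_key_exists($key, $this->items);
    }

    public function get(string $key)
    {
        if (!$this->has($key)) {
            // The consumer skipped the has() check: a bug in the calling code.
            throw new LogicException("No cache item for '$key'; check has() first");
        }

        return $this->items[$key];
    }
}

// A consumer that honours the contract.
function getOrCompute(ArrayCache $cache, string $key, callable $compute)
{
    if (!$cache->has($key)) {
        $cache->set($key, $compute());
    }

    return $cache->get($key);
}
```

With an in-memory array there is no window between has and get; the race described above only arises with a shared or external cache backend.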

Corrupt data in the database

Another grey area is formed by errors that can only occur in case of a corrupted database. From a purist point of view this is an example of a RuntimeException, the database being an external factor. On the other hand it is undesirable to have to take the possibility of database corruption into account everywhere in the code. This applies all the more if only the application and occasionally a developer or sysadmin write to the database. A more pragmatic approach is to consider database corruption a LogicException. Chances are that the corrupt data was caused by an implementation error in the application. In any case, developer or sysadmin intervention is required to manually fix the data.

Recap

We distinguish checked exceptions that represent inherently ‘unfixable’ situations from unchecked exceptions that represent programming errors. Therefore we know at every point in the code what exceptions we should expect, and thus handle. By catching and re-throwing exceptions we make sure that they are at the proper level of abstraction and thus do not break encapsulation. Chaining the original exception using the $previous parameter ensures that no debugging information is lost.

Are you trying out these practices in your project? Let us know if they work for you, if there are obstacles you run into, or if you have improvements to these guidelines. We’d love to get your feedback!

DEV Community