
Don't return associative arrays!

Aleksi Kauppila ・3 min read

I hate dealing with associative arrays when I'm writing client code. The problem with arrays is that there is no context. There is no special knowledge. They are usually just data packaged in an inconvenient format. The worst part is that they tie me to a particular implementation.

Now, I don't mean that you shouldn't return arrays that hold uniform data. Returning an array of a certain type of object is fine(ish), although you still cannot promise through your API that it will contain only that type of data.
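One way to make that promise is a small typed collection. The sketch below assumes that the FileCollection used later in this post is such a class; the File class and the constructor shape are hypothetical and only there to make the example run.

```php
<?php

// Hypothetical stand-in for the File type used in the examples below.
final class File
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function path(): string
    {
        return $this->path;
    }
}

// A collection that can only ever hold File objects: the variadic
// type hint rejects anything else with a TypeError at construction.
final class FileCollection implements IteratorAggregate, Countable
{
    private $files;

    public function __construct(File ...$files)
    {
        $this->files = $files;
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->files);
    }

    public function count(): int
    {
        return count($this->files);
    }
}
```

A method that returns FileCollection instead of array tells the client what is inside without the client having to read the implementation.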

Arrays may work as an internal representation of data for some class, but that is the way it should stay - internal.

What I'm talking about are those nasty multidimensional arrays with hardcoded strings as keys.

Consider this:

class AcmeFileSender implements FileSender
{
    public function sendFiles(FileCollection $files): array
    {
        $sent = 0;
        $errors = [];
        foreach ($files as $file) {
            try {
                $this->send($file);
                $sent++;
            } catch (SenderException $exception) {
                $errors[] = $exception->getMessage();
            }
        }
        return [
            "sent" => $sent,
            "failed" => count($errors),
            "errors" => $errors
        ];
    }

    public function send(File $file): void { /* ... */ }
}

Doesn't look that bad, right? It's just three keyed values in an associative array.

The problem in this case is that the array cannot promise us anything about its structure. This means that when we write our client code, we are dependent on the implementation of sendFiles(). We have to know exactly what the keys are and what sort of data is behind those keys.
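To make the coupling concrete, here is a sketch of what the client code ends up looking like. The summarize() function and the sample array are hypothetical; only the three keys come from the example above.

```php
<?php

// Hypothetical client code. Every key below is a string the client
// had to learn by reading the body of sendFiles().
function summarize(array $result): string
{
    // A typo such as $result["send"] would not fail loudly here; it
    // would yield null plus a notice or warning at runtime.
    $summary = sprintf("%d sent, %d failed", $result["sent"], $result["failed"]);

    foreach ($result["errors"] as $error) {
        $summary .= "\n - " . $error;
    }

    return $summary;
}

// The shape sendFiles() happens to return today.
echo summarize([
    "sent" => 2,
    "failed" => 1,
    "errors" => ["Connection timed out"],
]);
```

If sendFiles() renames a key or changes the type behind it, nothing in the method signature changes, and this client breaks silently.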

In OOP we should not be concerned about implementation details of other classes. We care only about the interfaces.

Lately I've been bumping into something that might be an even more serious offense:

public function processStuff($stuff): array
{
    //...
    return [$someArray, $someObject];
}

//... in client code
list($someArray, $someObject) = $this->processor->processStuff($stuff);

Now, it's one thing to return this kind of abomination from private methods. But when you start designing your public API like this, all hope is lost. This is pretty much a dead giveaway that you are dealing with a procedural code base. In procedural code bases everything is data and can (and will) be manipulated.

In the case above the issue is the lack of context. What does it mean when you receive someArray and someObject together? Why are they returned together? It's clear that the implementation of processStuff() knows how it is used, so it's not so much designed as glued together with some client code. The client code is very tightly coupled to the implementation of processStuff().

In many cases there's some confusion about how to define responsibilities. Code does something and returns some indication of its success to the client. There is not much trust between the client and the server. The client is paranoid about the server's quality of work.

What can we do to make the first example better? Value objects. sendFiles() would be a lot more expressive by using value objects.

interface SenderReport
{
    public function sent(): int;
    public function failures(): int;
    public function errors(): Iterator;
    public function print(): string;
}

We will update FileSender and its implementations.

class AcmeFileSender implements FileSender
{
    //...
    public function sendFiles(FileCollection $files): SenderReport
    {
        //...
        return new FileSenderReport($sent, count($errors), $errors);
    }
}

Now we can rely on the interface of SenderReport instead of being painfully aware of the implementation of FileSender::sendFiles(). Now we have context. Also, we can change the implementations of FileSender freely.
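As a rough sketch of how the pieces could fit together, here is one possible FileSenderReport along with a client that only knows the interface. The class body, the output format of print() and the logReport() function are assumptions; the post itself only shows the interface and the constructor call.

```php
<?php

// Interface as declared in the article.
interface SenderReport
{
    public function sent(): int;
    public function failures(): int;
    public function errors(): Iterator;
    public function print(): string;
}

// Hypothetical implementation; the constructor arguments follow the
// call new FileSenderReport($sent, count($errors), $errors) above.
final class FileSenderReport implements SenderReport
{
    private $sent;
    private $failures;
    private $errors;

    public function __construct(int $sent, int $failures, array $errors)
    {
        $this->sent = $sent;
        $this->failures = $failures;
        $this->errors = $errors;
    }

    public function sent(): int
    {
        return $this->sent;
    }

    public function failures(): int
    {
        return $this->failures;
    }

    public function errors(): Iterator
    {
        // The array stays internal; clients only see an Iterator.
        return new ArrayIterator($this->errors);
    }

    public function print(): string
    {
        return sprintf("%d sent, %d failed", $this->sent, $this->failures);
    }
}

// Hypothetical client code: it depends on the interface only.
function logReport(SenderReport $report): string
{
    $lines = [$report->print()];
    foreach ($report->errors() as $error) {
        $lines[] = " - " . $error;
    }
    return implode("\n", $lines);
}
```

The array of error messages still exists, but only as an internal detail of the report object.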

--

Don't return associative arrays from the public API. Evaluate if you really even have to return a value. And, if you must, return something that has an interface that you and users of your code can rely on.


Discussion


I'm an advocate of value objects by experience and totally support your post!
Another nice side effect is that you get code suggestions for FileSenderReport, which you wouldn't have if you returned an array.

 

Thanks! Value objects FTW!

 

Don't return errors. Use exceptions.

 

IRL I would probably design a FileSender with only the send() method. If we just look at send(), it's a void method that throws SenderException. However, sendFiles() is supposed to send multiple files.

Regarding design we have two alternatives.

  1. stop execution on first failure and get information about the one file that couldn't be sent.

  2. continue execution and return SenderReport with information about every failed file.
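The two alternatives can be sketched roughly as follows. The function names and the callable $send parameter are hypothetical; only SenderException comes from the article.

```php
<?php

// Stand-in for the SenderException used in the article.
class SenderException extends RuntimeException {}

// Alternative 1: stop on the first failure. The function returns
// nothing; the exception tells the client which file could not be sent.
function sendUntilFailure(iterable $files, callable $send): void
{
    foreach ($files as $file) {
        $send($file); // a SenderException propagates to the caller
    }
}

// Alternative 2: keep going and collect every failure.
function sendAndCollect(iterable $files, callable $send): array
{
    $errors = [];
    foreach ($files as $file) {
        try {
            $send($file);
        } catch (SenderException $exception) {
            $errors[] = $exception->getMessage();
        }
    }
    // In the article's design this list would be wrapped in a SenderReport.
    return $errors;
}
```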

Overall I agree that void is a perfectly fine "return type" and that the method should inform clients about failures with exceptions. No true/false for success/failure. Command Query separation works.

 

I totally agree with the concept, but I wouldn't say SenderReport is a value object, but a DTO.
Value objects represent one value (a date, a currency, a language... Even others specific to your domain, like PaymentMethod).
In this case the SenderReport is used to exchange data between the service and the consumer while keeping immutability and type safety, so it looks like a DTO.
It's sometimes easy to mix those up.

Anyway, your statement remains, and as I said, I totally agree with it.

 

I've used DTOs mainly in the context of dealing with databases or REST APIs. But your reasoning is very solid, and I agree. I think it's not important what the service does, but what kind of relationship client code has to the server.

 

I get where you're coming from, but this just shifts the coupling from the code to the data. Which is totally fine if you're in a situation where you have complete control over all of your datasources and there aren't any aberrations from the model that was defined at inception, but you can fail pretty hard on what could be valid data if you go too far down this road.

I think there's definitely merit into asking yourself whether or not an associative array is appropriate for your use case before forging ahead, but taking it as a maxim is just gonna line you up for all sorts of different trouble.

Having strictly defined interfaces solely for data in a language that's (I've been assuming by design) not strictly typed seems like trying to make a sledgehammer into a bandsaw by yelling at it to be sharper. Why not build your test suite around standardized return data if that's a requirement instead of trying to make PHP into Java cough*zend*cough.

 

Thanks for your comment!

Frankly, I'm not really sure it's good design to return these sorts of reports. To me that means the service knows that there is some sort of UI where those values are somewhat important. I left myself a backdoor by giving SenderReport a method to print itself, which gives the object some behaviour.

I do, however, think that it's useful to give data some context by encapsulating it in an object. It allows us to fall back on sane defaults and handle other data-related issues in a very limited scope. After all, you can't tell data to do anything. With objects you can.

I can't say I have much experience with this, but I have a hunch that a system can be more robust when the focus is not on what kind of data is processed but on what kind of services objects provide for each other.

Always consider that others could be involved in it some day. That helps to reflect your own work and will at least make it easier for you in the end if you have to look at that code after some months.

I personally always set the goal of making a project available as open source on GitHub. That means that others will see this piece of code and will want to understand what I'm doing.

Regarding method signatures I think PHP is doing a good job. AFAIK you can pass null to a method in Java instead of an object. That is very weird and stupid imo. PHP throws a TypeError if the type doesn't match.

 

I think typings (especially when paired with unit tests) help alleviate some of the tension caused by this idea of passing around associative arrays. You could always guarantee and have bulletproof signatures for your APIs through documentation. In that sense, even if you're passing around descriptive data objects and have poor documentation, typings, or tests, you're doomed.

 

Thanks. I definitely recommend using type hinting as it really lets you know quickly if your methods are not receiving or returning the type of data you expect.

 

Hey, thanks for the article.
I really enjoyed reading it. As I am currently writing my first server backend (in Python, though I guess some concepts aren't bound by language), I am tripping over my own code and styles of returning values all the time.
A lot of things just went 'click' inside my head on how I can go about streamlining my code.

 

Glad you liked it! The Python equivalent would be to return dictionaries. Although they are actual objects and have some methods, they are too generic to promise anything useful.

 

I have a specific test case... let's make an article out of it and give general advice to anyone. Well, now imagine, that in another case this would actually help? Just try that. Need an example? How about Amazon AWS APIs?

 

You can use an anonymous class; it saves you from creating an interface and a class.

 

But wouldn't that make client code as tightly coupled to implementation of sendFiles() as using an array?

 

Sometimes minimal abstraction is enough, but if you want to expand your logic, of course it's better to use an interface-based approach.
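For reference, a minimal sketch of the anonymous-class idea from this thread. The function name sendFilesAnonymous() and its parameters are hypothetical, and the sending logic is left out; the point is only the shape of the return value.

```php
<?php

// Hypothetical variant of sendFiles() without SenderReport: the result
// is an anonymous class, so the only return type left to declare is
// the generic "object" (PHP 7.2+).
function sendFilesAnonymous(int $sent, array $errors): object
{
    return new class($sent, $errors) {
        private $sent;
        private $errors;

        public function __construct(int $sent, array $errors)
        {
            $this->sent = $sent;
            $this->errors = $errors;
        }

        public function sent(): int
        {
            return $this->sent;
        }

        public function failures(): int
        {
            return count($this->errors);
        }
    };
}

$report = sendFilesAnonymous(2, ["Connection timed out"]);
echo $report->sent() . " sent, " . $report->failures() . " failed";
```

The methods are real, but nothing in the signature tells the client that they exist, so the client still has to read the implementation to use the result.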

 

I moved from PHP to TypeScript/Node.js and am totally happy, since the latter doesn't have these problems.

 

When you are building REST APIs, why isn't this an issue for you? Do you have some examples where you can't use value objects over arrays?

Well, you can pass null to a method in PHP too. php.net/manual/en/migration71.new-...

True. With an important note that the param has to be explicitly defined as nullable.
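A minimal sketch of the behaviour discussed in this thread, using a hypothetical File class and hypothetical send() and sendOptional() functions:

```php
<?php

// Hypothetical File class and functions, for illustration only.
final class File {}

// Plain type hint: null is not an accepted argument.
function send(File $file): string
{
    return "sent";
}

// The leading ? (PHP 7.1+) makes null an accepted argument.
function sendOptional(?File $file): string
{
    return $file === null ? "nothing to send" : "sent";
}

try {
    send(null);
} catch (TypeError $error) {
    echo "send(null) was rejected with a TypeError\n";
}

echo sendOptional(null), "\n";
```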

 

I said so many nopes reading your post.

 

Okay. Let's discuss.