Close to two hundred AI-specific hardware accelerators are coming to market in the next few years. Let that sink in. Just last month TheNextPlatform dove into the current state of the market with ‘Deep divides between AI chip startups, developers.’ In her piece, Nicole Hemsoth addresses the AI hardware ecosystem, its stakeholders, and its struggle to drive developer adoption. It’s a harsh wake-up call:
The gap in understanding is not a general gripe about the lack of tools or time or resources. There are some deep-seated, fundamental holes in how AI chips are designed (and by whom), how developers experiment with new architectures, and how the market/more developers can adopt them if they ever do manage to find a way into production.
‘Deep divides between AI chip startups, developers’ — TheNextPlatform
I was engineering machine learning on DARPA programs long before there was such a thing as a “machine learning engineer” and, as such, I had to be a jack-of-all-trades. But we’re now at an inflection point in the software and hardware industries where differentiation and specialization in engineering roles are happening at a dizzying pace. To echo the message of Deep divides, if you are building or betting on hardware acceleration it is critical to understand:
How market forces, technology constraints, and the developer’s individual incentives create a clear path of least resistance leading away from hardware acceleration.
That the path to mass adoption of hardware acceleration isn’t paved with free chips. The only way to bridge the huge divide with developers is to show your commitment to the developer experience.
Pattern matching matters
Having worked on dozens of AI applications and reviewed many times more architectures during my career, I’ve found that every single one has been “hardware-constrained” in one way or another. They either did benefit or could have benefited from hardware acceleration.
Through those fifteen-odd years, I have seen just three successful team-building patterns:
The “Mastermind” pattern — Find one or two people who are 10x performance engineers. Ideally, they’ve built your application at least once before and are already familiar with both the algorithms and target hardware. The mastermind’s key skill is in the art of tuning the software to get every last drop of performance out of the hardware.
The “Hardware-first” pattern — Set the hardware target at the start of the project and only hire people that are experienced with it. Enshrine the performance requirements and demand performance compliance on every git push.
The “Algorithms-first” pattern — Use the availability of scientific papers and open source codes to derisk the application’s accuracy requirements. Worry about the rest later.
I’ve written about the “Mastermind” pattern elsewhere, and Deep divides flirts with it a bit:
…. hire people that seem (sic) the complexity of the [acceleration] problem in all of its forms … that understand [acceleration] and what it means from the developer point of view. It’s no small undertaking to offload the right bits of a workload to the right device and it can take a year or more to establish that on a new architecture.
‘Deep divides between AI chip startups, developers’ — TheNextPlatform
If an application venture can find a “mastermind” — big if — this teaming pattern can be effective until the business needs to scale. Unfortunately, for all the reasons that Deep divides covers, there simply isn’t enough godmode to go around these days. Consequently this teaming pattern is relatively rare in the wild.
Adopting the “hardware-first” strategy works for application ventures that already have capital and/or plenty of firmware engineers in-house. Can those same engineers learn ML? Probably, but while individual hardware engineers can be agile, hardware-focused organizations tend to be fatally sluggish around AI. Incubating an everything-on-the-device culture creates fundamental rate limits in the R&D process. Consider: While data scientists employed by “algorithms-first” ventures are training and testing hundreds of models a day in the cloud, “hardware-first” engineers are bottlenecked by the clock speed of the one or two devices on their desk.
“Algorithms-first” ventures typically seed where a business case and a line to novel data intersect. These organizations blossom in the relative availability of academic ML papers and open source codes. ML-savvy data scientists are generally among the first, more junior hires. Because of that early inertia, the modus operandi is to focus on derisking accuracy requirements above all else.
Who is the hardware acceleration customer here? There simply aren’t enough masterminds out there to build a business around. The hardware-first customer is a good one, but they may know you too well; you will face competition from in-house teams and your technology is unlikely to transform their business. That leaves the algorithms crowd. It’s universally understood by software folks that swapping in new hardware is the cheapest way to improve application performance. But is using your technology as turnkey as upgrading from an i5 to an i7? Unlikely. So how do you make the sale?
The developer experience
Deep divides tells us about a mastermind, circa 2009, who needs a sample chip and twelve months to determine whether or not your hardware acceleration technology is useful. If that sounds painful to you, imagine how it sounds to your developer customer.
The culture of software development has changed dramatically in the last decade, and the focus in 2019 is squarely on speed and agility. Thanks to the cloud, software engineers, especially those that are data-facing, are now organized around the idea of hardware as an abstraction. This is doubly true for algorithms-first ventures. When an algorithms-first developer gets excited about hardware acceleration, the last thing you want them to have to do is leave that comfy bubble of abstraction. Let me spell it out for you:
Dear hardware acceleration vendor, your go-to-market strategy must turn your technology into the Wheaties that your opportunity customer eats for breakfast, namely software and data. — the author
But before I do the big reveal here and give away all my best tactics for becoming developer Wheaties, let me assure you of the following: developers are NOT going to talk to you. When targeting the developer market, the burden of proof is on the vendor. Don’t expect your customers to give you feedback about your technology until the day they can pick it up and use it. Then, expect bug reports — lots of them. With that in mind….
CI > fabrication. Waterfall development required lots of talking, and developers are done with that now. Do you know what the agile software community has settled on instead of talking? Continuous integration. Whether it’s CircleCI, Jenkins, or good ole Travis, developers have enshrined CI as the lynchpin of the engineering process. If you want your hardware acceleration product to be a part of the conversation before you go to fabrication, then find ways to support the customer’s process during your R&D phase. At a minimum, you’ll need a software interface (Wheaties!) that developers can start integrating against RIGHT NOW and a way to show them the expected delta (more Wheaties!) in key performance metrics. As long as you treat those two touch points as contracts and hit some intermediate milestones, you will win adopters. You might not even need working hardware to do it.
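Here is a rough sketch of what those two contracts could look like from the developer's side of a CI job. Everything in it is an assumption made for illustration: `ReferenceBackend`, `PerfContract`, and `ci_check` are hypothetical names, not any vendor's API, and the 8x speedup is a placeholder for whatever projection a vendor would actually publish. The CPU stand-in exposes the call surface the device is meant to offer, and the check fails the build if the projected latency drifts over budget.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfContract:
    """Hypothetical vendor-published projection for one kernel."""
    kernel: str
    projected_speedup: float  # assumed ratio: CPU reference time / device time

    def projected_seconds(self, cpu_seconds: float) -> float:
        return cpu_seconds / self.projected_speedup


class ReferenceBackend:
    """CPU stand-in exposing the same call surface the device is meant to offer."""

    def matmul(self, a, b):
        cols = list(zip(*b))
        return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def time_call(fn, *args, repeats=3):
    """Best-of-N wall-clock time in seconds for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def ci_check(backend, contract, budget_seconds):
    """Return (passed, projected_seconds); intended to be called from a CI job."""
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    # Contract 1: the software interface returns the agreed result.
    assert backend.matmul(a, b) == [[19.0, 22.0], [43.0, 50.0]]
    # Contract 2: the projected device latency stays inside the budget.
    projected = contract.projected_seconds(time_call(backend.matmul, a, b))
    return projected <= budget_seconds, projected


if __name__ == "__main__":
    contract = PerfContract(kernel="matmul", projected_speedup=8.0)  # placeholder value
    passed, projected = ci_check(ReferenceBackend(), contract, budget_seconds=0.5)
    print(f"{contract.kernel}: projected {projected:.6f}s, within budget: {passed}")
```

When real silicon arrives, the idea is that only the backend changes; the test the customer already runs on every push stays the same.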
hello_world time > run time. What’s the difference between a begrudging adopter and a self-described “happy customer”? Delight. And what does it take to delight a developer looking for a way to accelerate their code? Plenty of great hardware companies have gained reputations for bad software because they get this question wrong. They’re hyper-focused on the “acceleration” part and refuse to think about the end-to-end developer experience. Latency reduction is certainly important, but it comes at the end of a lengthy integration process. If you don’t want to be in the “bad software” camp, the first thing you should try to minimize (before latency, before power consumption) is the adopter’s hello_world time. This is a tactical no-brainer: Don't drag the customer through the smouldering hell that is every vendor-specific installer ever. Figure out which package manager they're using, and support it. If the OS requires additional setup, provide a bootstrap script. Done.
Packages > algorithms. If it wasn’t obvious from the previous suggestion, toolchains that may appear ancillary to your core value are nonetheless extremely important to your success. This is because software development is now dominated as much by “frameworks” and associated tooling as it is by programming languages. It’s no longer sufficient to provide a handful of accelerated algorithms in a language and declare it “supported.” To drive adoption, the data structures under the hood of those algorithms need to play nicely with the core data structures of key application frameworks. Figure out what those are. Then work with framework maintainers to provide interop. Do even a little of this and you’ll have sealed your reputation for caring about the developer experience.
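To make the interop point concrete, here is a minimal sketch in Python using only the standard library. `DeviceResult` is a hypothetical name standing in for whatever an accelerated routine returns; it is not a real vendor API. The assumption is that results land in one contiguous host-side buffer, which the class hands out as a zero-copy `memoryview` so that buffer-aware consumers can wrap it instead of copying it.

```python
from array import array


class DeviceResult:
    """Hypothetical container for the output of an accelerated routine."""

    def __init__(self, values, rows, cols):
        if len(values) != rows * cols:
            raise ValueError("values do not fill a rows x cols matrix")
        # Assumption: results land in one contiguous block of float64 values.
        self.data = array("d", values)
        self.shape = (rows, cols)

    def view(self):
        """Return a zero-copy, 2-D memoryview over the underlying buffer."""
        flat = memoryview(self.data)
        # memoryview reshapes only through a byte format, hence the double cast.
        return flat.cast("B").cast("d", shape=list(self.shape))


result = DeviceResult([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], rows=2, cols=3)
matrix = result.view()
print(matrix.tolist())

# Changing the underlying buffer is visible through the view: nothing was copied.
result.data[5] = 60.0
print(matrix.tolist()[1][2])
```

Libraries that understand Python's buffer protocol can generally consume a view like this directly; NumPy's `asarray`, for instance, accepts buffer-exporting objects. Which frameworks matter, and which exchange mechanism they prefer, depends on your customers.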
Images > docs. Docker ruined docs. How? Repeatability. When a developer is trying to reproduce an artifact — a test condition, a trained model, whatever — docs leave too much room for error. In the age of Docker, things aren’t considered repeatable unless they’re containerized and runnable. This is great for developers — and for you. The first thing you should do is build a hello_world container where everything just works. Then, find a way to keep this thing up to date. Need PCI passthrough? Fine. Implementation details don't matter that much, so just use a VM. The point here is to make sure that the very first steps in your prospective customer's experience aren't a trip through kernel-compiling purgatory and dependency hell. Note: This is NOT an excuse for skimping on docs! They're still 100% necessary, but no longer sufficient.