Moesif’s Podcast Network: Providing actionable insights for API product managers and other API professionals
Joining us this week we have the well-known author and speaker Mike Amundsen. He’s a prolific writer on all things APIs and recently released his latest book entitled Design and Build Great Web APIs: Robust, Reliable, and Resilient. When he’s not writing, Mike helps companies capitalize on opportunities in APIs, Microservices, and Digital Transformation.
Mike shares his perspectives on why organizations think about APIs in three levels, how AWS’s Werner Vogels does deprecation, what the future holds for API automation tools, and many more knowledgeable insights.
Derric Gilling, Moesif’s CEO, is your host today.
Derric Gilling (Moesif): Welcome to another episode from Moesif’s APIs over IPAs podcast network. I’m Derric Gilling, your host today and the CEO of Moesif, the API Observability Platform. Joining me is Mike Amundsen, well-known author and speaker. He started the API Academy and was one of the first to be in APIs 20 years ago. He currently consults with companies to help them capitalize on APIs, microservices and digital transformation opportunities. Happy to have you here today.
Mike Amundsen (Author & Speaker): It’s great to be with you Derric. It’s great to talk with you.
Roy Fielding’s PhD dissertation from 2000 is most well known for introducing REST. But, it also helps solve scaling and customization problems when building SaaS products.
Derric: Awesome. You’ve been doing APIs for quite some time since discovering Roy Fielding’s PhD dissertation introducing REST. What are the biggest changes in the last 20 years, and how did you get into APIs?
Mike: Well, there’s a lot packed in the first question. I should say that actually I didn’t by myself start the API Academy. I want to make sure that I give a shout out to Matt McLarty and Ronnie Mitra. The three of us were the initial sort of founders of that group, and they had a lot to do with its initial design and success. So I want to shout out to them. The Academy’s just one of the steps along the way that I’ve experienced in this whole API world, and a reference to some of the earliest stuff that I had done.
I had been working on creating systems that were based on the web. I had been doing it beforehand using Microsoft tools, using SOAP services and remote procedures and all these other things. Things were working great, things were going fine, but I was having scaling problems. I was literally working on an early Software as a Service type implementation, but just wasn’t able to scale it the way I wanted. So I was looking around and searching desperately to find something that could work and I stumbled on Fielding’s dissertation. He had written it in 2000. He initially had written an article and showed some stuff to Microsoft in 1998, but I didn’t come to it until about 2003 or 2004. It helped me solve my immediate problems, which had to do with scale at first, and then eventually had to do with customization. So the way I found Fielding was out of desperation - I reached for him and he solved some problems for me, initially. And then as I read more and more, you’re reading a 175 page PhD dissertation (I don’t have a PhD, I don’t even have degrees in computing, so it was a challenge for me), but as I read more and more, it started to make more and more sense. And I was able to start to experiment. More and more of what he talked about helped solve my immediate problems. So that’s really how I got into the Fielding space - I was looking for solutions, it helped me solve some problems and then taught me more about how to look for ways to solve other scaling and customization problems - those are really the two big ones in my life.
API first is really about knowing what problems you’re going to solve for your customers — the developers. Know what their issues are, what data they’ll be sending, what data they’ll be passing through the interface, before you specify anything.
Derric: Really glad to hear that. There’s a lot of new things happening within the API space, both on the business side and also the engineering. What does it mean to be API first and API driven? What are the differences there?
Mike: Yeah, there’s again lots of stuff there. So, let’s start with that whole API-first notion. I mentioned this in the book that I just completed, Design and Build Great Web APIs: Robust, Reliable, and Resilient. I first learned about API first from Kas Thomas. Kas was writing about the notion that when you’re thinking about the interface first, you’re solving problems for people: understand who those people are and what their problems are, and that’s going to help you build the interface. And if I remember correctly, at that time, Kas was not really talking about web APIs the way we think of them, he was really thinking about the interface for a class library, for functions, for modules, for things that you would give developers. So he was also really thinking about developer first. He was thinking about this idea that it’s the developers that you’re trying to help solve problems for.
Now the other API-first version that I use quite often, that I think a lot more people are familiar with, comes from Kin Lane, the API Evangelist. I think Kin was just recently on this podcast, so I think you’ve talked to Kin about this already. But the notion is that before you go building an app, before you go building a website, before you go building a module, build the interface. Establish what the interface is going to do; what are the problems it’s solving, what are the data you’re sending, what are the data it’s passing. And then once you have that API first, that is a foundational element and then you can tack on things like mobile apps, or web apps, or desktop apps, without a lot of disruption, because they’re all acting on that interface.
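Editor's illustration (not part of the interview): the idea of establishing the interface first and then tacking clients onto it can be sketched in a few lines of Python. Everything named here (`OrderApi`, `Order`, `InMemoryOrderApi`, the two view functions) is hypothetical and chosen only to show the shape of the approach, assuming a simple order-lookup use case.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Order:
    """The data that passes through the interface (hypothetical shape)."""
    order_id: str
    total_cents: int


class OrderApi(Protocol):
    """The interface, agreed on before any client or backend is built."""

    def get_order(self, order_id: str) -> Order: ...


class InMemoryOrderApi:
    """One possible backend; it can be replaced without touching clients."""

    def __init__(self) -> None:
        self._orders = {"o-1": Order("o-1", 1999)}

    def get_order(self, order_id: str) -> Order:
        return self._orders[order_id]


def mobile_view(api: OrderApi, order_id: str) -> str:
    """A 'mobile app' client that depends only on the interface."""
    order = api.get_order(order_id)
    return f"Order {order.order_id}: ${order.total_cents / 100:.2f}"


def web_view(api: OrderApi, order_id: str) -> str:
    """A 'web app' client acting on the same interface."""
    order = api.get_order(order_id)
    return f"<p>Order {order.order_id} total: {order.total_cents} cents</p>"
```

Both clients are written against `OrderApi` rather than against a concrete backend, which is the sense in which new apps can be added later without much disruption.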
So I think those are sort of the two versions that come quickly to mind to me, and both of them for me relate to that notion of know your audience, know your target audience and know what their problems are - that’s what you’re solving. It’s sort of the reverse of you build it and they will come. Instead, understand their problems, build a great product for them, and I think that’s really a great way to think about API-first for me.
If you’re thinking of building an API platform, build the tools first. Build the tools people want, make sure the tools can interact, and you’ll get your platform along the way — don’t try to build a platform first, because you have to make too many decisions in the dark.
Derric: And for the organizations out there that are migrating towards an API-first strategy, do you have any tips for them on how to think about that, especially if they are new to the idea?
Mike: Yeah, the tip is to focus on the target audience. The way I talk to folks is that if you’re solving their problems, they’re going to love your product, they’re going to love your API. Now, who they are is super important. If it’s developers inside your own organization who understand your object model and your business model, and all have computer science degrees, you’re going to design that API very differently than if it’s somebody outside your organization that works in the field of, maybe, HR. They’re going to have a very different set of problems, a very different set of understandings, so you’re going to solve their problems. And I think that’s a really important element.
A lot of organizations think about APIs in levels, they think of them as like foundational, middleware and user interface. We’ve been taught that over a long period of time, but that works only so far. As long as that holds true, that means you have three different audiences. The audiences are the foundational team, the middleware team and the UI team. But when teams have to solve all of those things, which they often do, they have to have a UI and middleware, and they’re in charge of data, then the APIs are going to look different. So know your audience - start with your audience and I think that’s super important.
I’ll volunteer something else that I’ve just really started to think about. I recently read an article from the Association for Computing Machinery (ACM), an interview with Werner Vogels, who’s really considered sort of the architect or the master of AWS, Amazon Web Services. And he had this great insight that they started with early on - he said their goal was to build tools, not platforms, which I thought was really an insightful remark. So this is a challenge when I talk to enterprise companies. They want this notion of “We want a platform of services. We want a platform of plug and play elements,” and that’s a perfectly reasonable thing to want. But when you’re building it, you don’t build a platform first, you build the tools and you make the tools part of a set of interactive bits, and then the platform emerges. I think that’s another piece of advice that I would pass on to any enterprise, especially if you’re building internally: build the tools people want, make sure the tools can interact, and you’ll get your platform along the way. Don’t try to build a platform first, because you have to make too many decisions in the dark. So I think build tools, not platforms is going to be one of my new mantras.
Follow the reverse Conway approach when putting together your API team: focus on the problem at hand and build the team that you need to solve the problem that’s right up in front. Don’t think too far ahead.
Derric: Oh, that’s a really interesting way to look at it. I love it. I mean, looking at building for a single use case for this API, versus what does this entire platform do, well it looks like just a mix of a bunch of different things, but it doesn’t really solve a single use case for me. This also brings up a different point, for organizations that are structuring new API teams how do you set that up - is it one single platform team, or is it a bunch of individual product focused teams?
Mike: Yeah, that’s really good. And I wish I was better at this. My buddy Ronnie Mitra, who I mentioned was one of the co-founders of the API Academy, does this really well. He talks a lot about teams in his great talk called Programming the People Platform. It’s really this notion of understanding how people interact, and this goes to the Mel Conway thing about your software reflecting the organization you have, and all of these other elements.
But what I think is a really important way to think about it is, you want to figure out what your current team is doing and help solve those problems first. Don’t think too far ahead. Don’t try to organize or engineer too much about what your new teams are going to be. One of the things that’s really commonly spoken about is something called the reverse Conway, where you say, “well, I want this kind of software, so let me just go ahead and create teams.” That’s the new team to do this job. Teaming is actually a pretty difficult step. Most of us in this IT business, we’re great at hardware, we’re great at software, we’re not so good at wetware. I got into this business because I wasn’t so good at people, I thought that was me. So I try not to get too crazy about trying to engineer people. I try to figure out what’s your problem, what are you going to do.
So as you’re beginning this process, just keep focusing on the problem at hand. What’s right up in front. I think anybody who’s running a business, whether you’re at the startup stage or whether you’re somewhere in the middle, or you’re at some kind of maturity, focusing on the problem at hand is usually most of the challenge, most of the problem. And as you get large enough and mature enough, you can start to change your horizon — you can move that horizon out aways. I don’t know if this is what it’s like for you — you guys have been at this for a while, you’ve probably experienced some of these same things. But I think, focus on what your problem is and get the team you need at that time.
One of the things I’ve learned from working with startups is often when you’re at different stages of the organization you hire different people, because at this stage I need somebody who’s really good in engineering. At this stage, I need somebody who’s really good at product management. At this stage I need somebody who’s really good at growth management. Like you hire different people. So your teams will change. So don’t try to do too much engineering of your people in order to fit software, because that set of problems will keep changing all the time. I don’t know if that comes close to what we were originally talking about, but that’s what’s in my head right now.
Derric: That’s a really important thing when it comes to engineering, you can’t over-engineer stuff, especially at the startup stage.
Mike: And the same goes for tools, not platforms - don’t get too excited, just get the Minimum Viable Product (MVP), which is another AWS thing that I think is just so incredible. All the way back to S3 buckets, they really created the super simplest thing just to see if anybody was interested in it and to see if it would work. And then they built on that. It goes to that old adage: “complex systems that work are proven to have started from simple systems that work. If you build a complex system, you’ll never get it to work right. You always start from a simple system.” So I think that runs through the whole experience of business product, IT engineering, everything.
Get your product out the door with a focus on your core feature only — try to farm out as much as you can, for as long as it makes sense: use somebody else for your user management, billing solution and account management.
Derric: If I’m starting an API first company, what are those components that I really need to just get the MVP out the door? Do I need billing, gateways? There are a lot of different tools out there.
Mike: I’m working with some companies now and they’re struggling with the same thing. They’ve been engineering for quite a while. Their product has been proven a few times, but now they want to level up, they want to SaaS up. So all this pricing and billing management and user management is sort of hitting them and they realize, “we just spent a year and a half building this set of tools, now do we have to spend another year building a whole bunch of other ones before we actually bring it to scale?” So one of the things that’s really been impressing me lately is there are lots and lots of platforms to help you on those initial stages: user management platforms, pricing platforms, Stripe and all these other things, so I’m starting to see more and more, especially startups, say, “you know what, in the beginning, I’m just going to let somebody else handle that. Somebody else is going to do my user management, somebody else is going to do my billing, someone else is going to do my account management. And if it scales to the point where it’s important to me, then I can bring that in house.” I love that attitude. To me it reminds me of Tom and Mary Poppendieck, the people who brought lean to the software movement. One of the things that I’ve heard Mary say over and over again is “put off decisions to the last responsible moment.” Don’t decide ahead of time. Again, it’s like, don’t over engineer. So now I’m seeing these startup companies say, you know what, I’ll let somebody else handle billing, I’ll let somebody else handle user management, and I can decide later if I want to take that on. But don’t take it on right away. So I think again, focus on your core, focus on your product, and try to farm out as much as you can for as long as it makes sense. And when it no longer makes sense, now maybe you’ve got some time to invest in it. That’s what I’m seeing people doing.
Reduce Custom Builds with APIs
We’re edging closer to the stage where I can plop a service down anywhere in the value chain and everybody automatically works and it’s all about product management. API businesses are transforming from this custom-built engineering exercise, to this product-focused, consumer-driven enterprise. And that’s going to be a really powerful way to build businesses in the future.
Derric: Well, the same thing is true for us — when we think about our billing, we go straight to Stripe, who’s also using APIs. It’s just a huge transitive graph, which is interesting to think about.
Mike: That’s actually a really good way to think about it. It is this sort of constantly modifying, constantly growing graph.
One of the initial questions you had is what’s changed. So much of what APIs were about when I first started doing this was all craft. It was all custom bespoke everything. I’m going to create this API, you’re going to create a client for the API, we’re really happy. And as every year goes by, we’re starting to put ourselves at some distance from that custom build/custom build/custom build. Payment Management is a great example of that. We’ve had that as an automated service for 20 years or more. Before APIs were popular, ATMs were working just great. So that’s a network of systems that interact with each other, where you can plop an ATM down anywhere, plug it into the network, it already knows how to talk because all of the standards and all of the processes are already there, the business model is there for taking percentages, for imprest cash and all these other things. So that’s a very mature network, 40 years old possibly of network machinery, or maybe it’s from the 60s. So that’s a very mature network of services that have pre-built APIs that carry the right information.
We’re just getting to that stage, where we can say “I want to plug into Stripe”, or “I’m going to plug into somebody else” or “I’m going to plug into this user management piece.” We’re not quite at the stage where I can just plop a service down anywhere in the middle of the value chain and everybody automatically works and it’s all about product management. There’s still a lot of engineering, but I think we’re getting closer. And that, to me, is very exciting because that transforms this business from this custom-built engineering exercise, to this product-focused, consumer-driven enterprise. And I think that’s going to be a really, really powerful way to build businesses in the future.
Automate the fiddly engineering bits using low code/no code, integrations platforms, or similar, so that you can focus on the bits that can’t be automated and that bring more value, such as design - how we interact as humans.
Derric: Oh, really great to hear that. That could be a huge impact for APIs. For stuff like no code solutions, integration platforms, I’m really curious to see where that goes in the future. How do you think APIs are changing or driving that?
Mike: I hear a lot about low code/no code again, it kind of goes in cycles. If you’ve been in this long enough, you don’t see my gray hairs because I shaved, but you start to see sort of the same cycles again. I think I’m seeing RPA coming up again, that’s process automation stuff, RPA + AI, that’s sort of like RPA 3.0. So I think we go through these cycles over and over again, and to me it’s very encouraging, it’s not discouraging at all.
I think we discover: I can automate some more, I can safely automate another part of this. Automation brings the project into focus in terms of cost management and in terms of quality control. The whole idea of a lot of our automation processing that we do in Lean really comes from the desire to have TQM, total quality management — this notion of doing the same thing every time. The whole idea of DevOps, scripting all of this, is so that every time I do a build it’s going to be dependably exactly the same. So parts that we can automate help us focus on the parts that need creative intervention.
I don’t think we’ll ever do a good job of automating design, because design is where we interact as humans, whether it’s API first or developer first or consumer first. But we can automate an awful lot of the connectors, we can automate an awful lot of the value chain that delivers from A to B. And I think that’s where we’re going to see a lot of this.
Low code/no code is getting rid of the fiddly engineering bits, so I can focus on the consumer, so I can focus on inputs and outputs, targets, where I want this information to appear and how I want to modify this information for my target audience. And to me that’s super exciting.
Office automation has largely replaced typists and editors. If the way word processing worked was that I had to set up some configuration before I wrote the document, that’s kind of what APIs are now. We’re not quite there yet.
Derric: Oh definitely, even at enterprises we’re seeing a lot of non-engineers and non-technical people experiment with APIs. I think Postman released a report showing that a majority of folks now are non-technical, but they’re still familiar with APIs and are playing around with them. What’s preventing widespread adoption at an enterprise by business teams?
Mike: Yeah, I’m not really sure. That’s an excellent question. I think we’re sort of on the cusp of a lot of this automation, but it’s not easy. Think about the way the spreadsheet changed offices in the ‘80s and ‘90s. It changed offices in a fundamental way. Think about just the way word processing and all that other kind of stuff changed offices. We used to have lots of staff that were typists, editors and people answering the phone. I worked as an accountant for a while, we were counting numbers, we were bean counters, we were doing all these things manually. Spreadsheets and word processing and all that kind of stuff completely removed that and changed the office. We’re not at that sort of level of change yet for APIs. It’s not quite automating it yet. You definitely don’t have to have a CS degree anymore to actually consume an API. But, you still have to know an awful lot of programming. I don’t have to know any programming to do a spreadsheet, to do word processing, or anything like that.
Think of it this way, if the way word processing worked was that I had to set up some configuration before I wrote the document, that’s kind of what APIs are now. I have to do some configuration and set a few other things, and then I can finally maybe start working. But with word processors and spreadsheets and slideshows and stuff like that, I don’t have to do any of that, that’s all done for me - magic happens behind the scenes. So we’re not quite at the magic stage yet. So I think what’s holding us back is working through that process to continue to lower the bar, and make it lower and lower and lower, so that we can make productivity higher and higher and higher. That’s a super challenging thing, but if I think about what’s happening in the next few years, I think that’s really the focus.
Metrics or goals can be thought of as forming a triangle: machine infrastructure metrics for stability, software processing metrics for solving business problems, and real business metrics for managing classical business issues.
Derric: It’s definitely a really challenging task, and we are seeing a lot more product managers shipping APIs and thinking about it from both the technical standpoint, but also for the non-technical users - how do they get up and running with their API platform or API products as quickly as possible. But this brings up a different point, if I am shipping that MVP, that is an API product, what are some of the metrics or KPIs that we should be thinking about, and how does that change over time throughout the entire life cycle?
Mike: I talked about this in API Traffic Management, a little book I did for O’Reilly. You’ve got KPIs and OKRs, that kind of balance against each other. OKRs are another thing from Intel with this idea of “what are your key objectives, versus what are your key measurements.” So I think of objectives as the way you kind of measure as a triangle: you’ve got the machine infrastructure kinds of metrics (how do you manage and what are your objectives for just stability at the infrastructure level), you’ve got the software processing workflow kinds of aspects of things (how do I actually use infrastructure to solve immediate business problems, and there’s a different set of metrics for that), and then you’ve got the real business, independent of electronics (whether you’re shipping containers or shipping software you’ve got a real business). Each one of those has their own sort of goals and metrics.
So, from the infrastructure side, which I think we know the most, I’m really familiar with what Google calls LETS, an acronym for Latency, Errors, Traffic and Saturation, and that’s what Google talks about an awful lot in SRE programming and their engineering for site reliability. So you’re basically talking about latency (how long does it take), errors (how many errors am I producing or am I running on), traffic (what’s the overall traffic, where’s traffic good, where’s traffic bad), and saturation (what’s the saturation of my system: am I at a point where I’m going to break if I don’t do something more). So from an infrastructure point of view, I think those are probably some really good metrics and we’re really good at this. We’ve got all sorts of metrics built into every box, every operating system and every machine, where we can monitor that.
Now when you move to the business side or to the software side, the process side, we’re getting pretty darn good at that. This is like your company which is a great example of how we can start to say “what is my web traffic, what are my requests, what’s going on at the gateway” and all these other things like this. These are really key elements to the story. So now I want to figure out what are my entry points, what are my exit points, what are my dwell times, things that talk about my software. How’s my software doing, am I doing a lot of closed transactions, do I have a lot of abandoned shopping carts, things like that. These are sort of key metrics that I think most of us are familiar with as well. And they’re very custom to the software we build.
Finally, there’s the real business. In the business the metrics have been the same for centuries. We’ve got cash flow, we’ve got revenue, we’ve got productivity, we’ve got people. How long does it take me to get from A to B. How long does it take me to put a new feature into the system. That’s my lead time in terms of features, in terms of product management. So I think there’s cases for each one of those. For those of us who work in the middleware, that business process space, I have to know what’s important to helping the business succeed, helping their bottom line, helping the revenue, helping the productivity. From the business standpoint, I’ve got to make sure I communicate to everyone those key elements that are super, super important.
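Editor's illustration (not part of the interview): the four signals Mike lists can be computed from a window of request records. This is a minimal Python sketch under stated assumptions: the `Request` record, the `lets_summary` function, and the `capacity_rps` parameter are all hypothetical, an error is taken to mean an HTTP status of 500 or above, and saturation is approximated as observed traffic divided by an assumed capacity. Real systems usually measure saturation from resource utilization instead.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Request:
    """One observed request (hypothetical record shape)."""
    duration_ms: float
    status: int


def lets_summary(requests: List[Request], window_seconds: float,
                 capacity_rps: float) -> Dict[str, float]:
    """Summarize Latency, Errors, Traffic and Saturation for one window."""
    count = len(requests)
    # Assumption: only 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    traffic_rps = count / window_seconds
    return {
        "latency_ms_median": median(r.duration_ms for r in requests) if count else 0.0,
        "error_rate": errors / count if count else 0.0,
        "traffic_rps": traffic_rps,
        # Assumption: saturation as the fraction of an assumed capacity in use.
        "saturation": traffic_rps / capacity_rps,
    }


sample = [Request(100, 200), Request(200, 200), Request(300, 500), Request(400, 200)]
summary = lets_summary(sample, window_seconds=2, capacity_rps=4)
```

With the four sample requests over a two-second window, `summary` reports a median latency of 250 ms, an error rate of 0.25, traffic of 2 requests per second, and saturation of 0.5.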
So I think it really varies. We know about the infrastructure. We need to get very introspective, we need to add a lot of observability to that middleware side, to that software process side. And then we need to get business doing a better job of communicating that back to the rest of the group. Those are the challenges that I think I see in that space.
Observability is making it possible to infer things from the data you’re getting out, whilst monitoring is talking about that infrastructure element. Observability really needs to be leading to an action-oriented moment, where machines are acting upon the information, rather than people.
Derric: I really like how you’re looking at it from different use cases and different teams that might need to access somewhat similar information, but also different. When it comes to stuff like API visibility, we’ve seen a lot of different terminology recently, we have monitoring, observability, analytics, what is all of this and how would you compare them?
Mike: Yeah, we’ve had some interesting threads online about this. Observability and monitoring get these sort of answers like, well, that’s like asking the difference between data and information. It’s like everybody’s got these high-handed kind of approaches to it. I think really when it gets down to it, if you think about it, the whole history of observability comes from systems control, control systems. This really comes from the whole idea of so much of our technology has come out of what we were doing in the space program in the 60s, right. They were trying to figure out “how do I understand what’s going on inside a system, how do we understand what’s going on inside this complex software and this dynamic system that plays back and forth.” So, to me when I say observability, I’m really talking about the ability to observe, the ability to see. And that means I’m adding a lot of telemetrics, telemetry inside software, of the kind of stuff we were talking about - how many people hit this function, how many people have closed a cart, how many people have initialized a document, how many people have approved a meaningful thing. And then trying to observe that in some sort of consistent way.
There was a lot of talk about 50 or 60 years ago about check-list culture - the notion of “I learn through a checklist.” And a lot of what we do in DevOps today is really based on checklist culture - go through the list: you’re on red here, you’re on green here. What I’m seeing more and more of in the last several years is dashboard culture. So it’s not just that we’re doing things, we’re seeing things. So I see all these big dashboards. When I go into their office, their main room tells them exactly what’s going on. That’s observability. So observability is making it possible to kind of infer things from data you’re getting out. And I think that’s where that data and information thing is. Shorthand, my logic is that usually monitoring is talking about that sort of infrastructure element. Observability is often talking about what’s going on in that second group I mentioned, which is business process. But, especially for super-large organizations like Google, Facebook, Microsoft and AWS, they need a ton of observability on their infrastructure, because that’s what they’re selling. So they have to know the health of their infrastructure and what changes over time.
Now, what I like to see, and I’m seeing some of this with the new talk of AIOps, this idea of artificial intelligence ops, is turning the information you’re getting at observability levels, into action. So often the way we do monitoring, the way we do observability today is a light goes off and some human has to figure out what’s going on, start up another machine or solve a problem. What I want to see more of is automating the solution. So, when things get out of balance, you have this sort of autonomic nature of things that are kind of working in balance, and when they get out of balance the system automatically spins things up. We can do some of that today, that’s kind of what a lot of cluster management is all about - setting metrics and then letting people automatically spin things up and spin things down. So, I think observability for us really needs to be leading to that sort of action-oriented moment, where now you’re actually creating information that can be acted upon, and if possible, acted upon by machines, rather than people. That’s the way I think about it.
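Editor's illustration (not part of the interview): the "set a metric, let the system spin things up and down" idea can be sketched as a single scaling decision. The function name `desired_replicas` and its parameters are hypothetical. The proportional rule used here resembles the one documented for the Kubernetes Horizontal Pod Autoscaler, but this is a simplified sketch, not that implementation. Loads are given as whole-number percentages so the arithmetic stays exact.

```python
def desired_replicas(current: int, observed_pct: int, target_pct: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Pick a replica count that moves observed load toward the target.

    Proportional rule: wanted = ceil(current * observed / target),
    then clamped to the allowed range.
    """
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-(current * observed_pct) // target_pct)
    return max(min_replicas, min(max_replicas, wanted))


# Running hot at 90% against a 60% target: scale 4 replicas up to 6.
scale_up = desired_replicas(current=4, observed_pct=90, target_pct=60)
# Running cool at 30% against a 60% target: scale 4 replicas down to 2.
scale_down = desired_replicas(current=4, observed_pct=30, target_pct=60)
```

A control loop would call this on every metrics window and apply the result, which is the step where observability data is acted upon by a machine rather than by a person watching a dashboard.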
Designing a great API environment is really designing a system that keeps running even if people make mistakes along the way. And eventually you’ll actually start to naturally protect yourself against malicious actions - you’ll have a resilient, self-reliant system.
Derric: Really love that breakdown between monitoring and observability. For us, one thing we’ve been noticing a lot of is how do you attach that user component, that customer component to understand who is accessing it, how they are accessing your API when you think about stuff like number of transactions. But this brings up a whole different point which is around API ops and what it means to have a good API strategy. What are the various components? Is observability part of it? Is there something else?
Mike: You know, I’m not really sure. I’m going to turn this back and say “what does that mean to you, API ops?” What kind of world is that for you, because I’m just starting to understand this and I don’t really have a good sense of what’s going on. Do you guys have a kind of an API ops approach right now and, if so, what does that sort of mean for you?
Derric: It’s still, in our belief, an evolving category. Just as we talked about API products, what are all the different operational aspects required to deliver APIs at scale and also deprecate, or sunset an API. That’s where you start talking about stuff like API life cycle. When it comes to API ops itself, you have things like observability, API security, API management. You might add some other components around billing, how do you actually ensure folks are within their tier, or not exceeding rate limits. So many different components around API ops, just like when we think about a website, what does it mean to have DevOps or some type of IT infrastructure.
Mike: That actually makes a lot of sense. So for you and your organization at least, they’re all things around operating, managing, continuing, maintaining. It’s sort of the after-release kind of moment. There’s a whole environment, there’s a whole set of things that you’ve got to deal with after the API has been created, after it’s been successfully built, all the way to the idea of deprecation or modification or change over time.
I think this goes back to just plain software knowledge, SDLC. We all know the adage: “80% of the effort and cost on software is actually spent after release, not before, it’s in the maintenance.” And so I think this idea of API ops is probably trying to get a handle on what are these categories, what are these things that I have to deal with. And I think that APIs are a lot more challenging than most software that’s released, because often APIs are generating dependencies. There are people who depend on you. Your making changes, or having unhealthy aspects in the system, or lots of management choices can affect people that you’ve never met. So it’s a more complex, more dynamic system. Which I think leads back to that systems control theory and stuff that we were talking about earlier.
Derric: Definitely. And we’ve seen this, even within our own customer base or people we talked to, where if you’re deploying a website, there’s one way to login. Maybe you have a couple of different login methods. When you deploy an API, you really have no control over the different integrations and use cases they might use the API for. So how do you actually reduce that scope and ensure that you don’t have some crazy performance issue or something that’s going to bring down the entire system?
Mike: Yeah, I know. A phrase that I use quite a bit, especially in this API space, “essentially, you’re building something to be used by people you will never meet, to solve problems you’ve never thought of.” That’s kind of what an API is. People are going to use this in all sorts of crazy ways. It’s Hyrum’s Law from Hyrum Wright at Google: “no matter what promise you make, every single aspect of your API is going to be a fatal dependency for someone.” Like, even changing the things that you said might change is going to upset someone. So, there is a huge set of challenges in this space.
If I think about broader systems theory, Donella Meadows kind of thinking, what you do is you don’t prevent other people from doing things, you survive other people doing things. Complex systems keep running even in failure mode. Even when parts of them aren’t working properly they stay up and running, somehow. So designing a great API environment is really designing a system that keeps running even if people make mistakes along the way. And eventually, especially if you think about security, you think about the way credit card management and money management, money movement works, you actually start to protect yourself against malicious actions, or unexpected actions of some type. So you’ve got these various levels. So you really have to design in early this resiliency that if something bad happens, we’re still going to be okay. And you have to defend yourself against failures that you might see along the way. I love this book called Release It! by Michael Nygard. Michael has this approach about capability patterns and stability patterns. Capability patterns help you solve problems, stability patterns help you survive them. He’s got a whole series on the go, seven or eight of them.
So I think when you architect and design and start to build your own APIs you’re engaged in this process of creating a resilient, sort of self-reliant kind of model. Which, is again, super challenging, but I think we’ve seen people like AWS, Google and Microsoft do it, that’s what they’re doing and they do a pretty good job at it for the most part.
To build robust internal APIs - avoid interdependencies, build in separation and treat them like third-party companies.
Derric: Yeah, this actually brings up an interesting point. When we think about APIs, from rate limits to circuit breakers and throttling, sometimes it’s easier to deploy when your API is public, because it’s exposed to the internet. People might abuse it, whether it’s intentional or unintentional. What about internal APIs? Do you have the same processes, is it different? Sometimes you can create these artificial dependencies or think just because you have access to everything.
Mike: So again I’ll go back to the AWS example, because they talk a lot about this story about how Bezos told everybody you have to access through the API, you can’t get direct access to the database anymore. These are all attempts to sort of mimic that same distance that you have in the web, but for internal use. One of the things I tell people, especially in large enterprises, is “treat other APIs as third-party products, treat them like they’re an external company.” You don’t get to see inside, all you get to see is their interface. I have lots of organizations where they basically manage the interface through a repository like an OpenAPI spec repository, or an AsyncAPI repository, and you really can’t see past the interface. You don’t know what their data model is, you don’t know any of that. So that can help you quite a bit. When you’re on the internal side you can fall into this without realizing how much of a fatal dependency you have on somebody’s data model, or somebody’s processing model, and if they change it, how bad it can become.
So one of the first things I tell people when they want to get their organization to level up to be more of a microservicey, or more of an API kind of thing, is to start putting distance between all of these other parts of the internal system. Treat the data services like you don’t control them, you don’t own them. When you design a service, don’t design the data model, just say “I need the following,” and let somebody else, let the data layer, decide that for you (this goes back to our layers about foundations, middleware and UI). Whether it’s going to be in a relational database, an object database, a document database, whether it’s in one place, or 10, whether it’s hosted locally or externally, that’s their problem. This is what I need you to give me. If you start treating them all like externals, it turns out you start building in more of that resiliency and that self reliance early on. It’s not easy, because for so many of the organizations I’ve worked in, their key efficiency, not their effectiveness, but their key efficiency, is that they’ve already decided this decades ago and they don’t have to decide it every day. So there’s a new set of problems for them when they want to create this sort of interdependent platform. And that is that you’ve got to start adding some distance, even if it’s sort of like a fake distance.
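One way to picture "just say I need the following" in code is a consumer that depends only on an interface, never on how the data is stored. The Python sketch below is illustrative only: `CustomerDirectory`, `CustomerSummary`, and `InMemoryDirectory` are hypothetical names, not anything from the conversation.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CustomerSummary:
    """The only shape the consuming service asks for (hypothetical)."""
    customer_id: str
    display_name: str


class CustomerDirectory(Protocol):
    """'This is what I need you to give me': an interface, not a data model."""

    def get_summary(self, customer_id: str) -> CustomerSummary: ...


class InMemoryDirectory:
    """One possible provider; the consumer never sees how rows are stored."""

    def __init__(self, rows: dict[str, str]) -> None:
        self._rows = rows

    def get_summary(self, customer_id: str) -> CustomerSummary:
        return CustomerSummary(customer_id, self._rows[customer_id])


def greeting(directory: CustomerDirectory, customer_id: str) -> str:
    # Depends only on the interface, so the provider can move to a
    # relational, document, or remote store without this code changing.
    return f"Hello, {directory.get_summary(customer_id).display_name}"
```

Swapping `InMemoryDirectory` for a provider backed by a database or a remote service would leave `greeting` untouched, which is the "distance" being described.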
Werner Vogels, Amazon’s CTO, says that “an API is forever, the code is not.” So when deprecating an API, essentially you fork the code and give it up.
Derric: That’s a good point, where you can’t just go access the database, you really do need those true limits, just like it’s an external company. On a different segue around change management for these internal APIs, how do I handle change management? If I need to deprecate a service, get rid of something, are there any good practices that I should be following?
Mike: Um, yeah there’s a handful and most of them have been there for us to see through things like HTML, HTTP and TCP/IP. All of those have been around for 30, 40 and, in TCP/IP’s case, 50 years, and they still work. But there have been lots of changes to them. And those changes all operate on a few basic rules and those rules are one of the things I talk about in my book: you can’t take anything away (once it’s there, it’s there), you can’t change the meaning of something (you can’t say this field used to mean the total number of users but now it means the total number of pages), and every new change has to be done as an optional element. So you can’t make a new requirement later on down the road. And that’s the way HTTP, HTML and TCP have worked for decades. And that’s what we can do too.
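The three rules above (nothing taken away, no changed meanings, new things optional) lend themselves to a mechanical check. The following Python sketch assumes a made-up schema format, a dict mapping each field name to a `meaning` string and a `required` flag; it is an illustration of the rules, not a tool mentioned in the conversation.

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """Compare two versions of a hypothetical field schema.

    Each schema maps field name -> {"meaning": str, "required": bool}.
    Returns a list of rule violations; an empty list means compatible.
    """
    problems = []
    for name, old_spec in old.items():
        new_spec = new.get(name)
        if new_spec is None:
            # Rule 1: you can't take anything away.
            problems.append(f"removed: {name}")
            continue
        if new_spec["meaning"] != old_spec["meaning"]:
            # Rule 2: you can't change the meaning of something.
            problems.append(f"meaning changed: {name}")
        if new_spec.get("required", False) and not old_spec.get("required", False):
            # Rule 3: no new requirements on existing fields.
            problems.append(f"now required: {name}")
    for name, new_spec in new.items():
        if name not in old and new_spec.get("required", False):
            # Rule 3: every new element has to be optional.
            problems.append(f"new required field: {name}")
    return problems
```

A check like this could run whenever an interface description changes, so that a breaking change becomes a deliberate fork rather than an accident.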
So the first thing is, you can’t take it back. The way Werner Vogels talks about it is “the API is forever, the code is not.” We tend to think about it the other way around, at least I used to. But if you think of it as the API is forever, you’re going to do much better at it. But at the same time, you still need to do just what you talked about. You’re gonna have to deprecate this someday. You know what, there’s a reason we don’t use gopher as much as we used to. Nobody cares. So you’ve got to face it, and the way you do that is through deprecation, not by simply yanking somebody’s spleen out because you just stopped using it. So you have to have a kind of a process.
I did a short video that’s on my YouTube channel about that whole process. Essentially what you’re doing is: when you have to change something fundamental, when you’re going to have to break something (like there’s a new regulation, you’re working in a regulated industry), what you do is, you fork. You literally fork it. Now I’ve got this new thing that’s never going to be backwards compatible and it has to be managed separately. And so that’s your first step along the way. So now you’ve got a branch somewhere. And everybody knows in open source projects, you fork when you give up, you fork when you can’t make things work anymore. And that makes perfect sense. At some point, you’re going to have to deprecate this, get rid of this in some way and literally you’re going to have to get people to move from one form to another, or you’re going to have to get people to go somewhere else and solve their problem some other way. So you need to give people lead time, just like you do in any open source project. And lead time, depending on the scope, depending on the reach, depending on how many people are involved, lead time could be years.
Salesforce has this set process where they have a release every four months and they’re on release 42 or 43, and they support back to release 22 or 23, they support about 20 releases. So they support years and years of this as people move forward. Eventually you’ll get rid of it and what you do is, like any good open source project, you need to find somebody to take it over. You need to give up the code. If you don’t care about this anymore, give up the code to your customers and say “you know what, you can run this yourself.” If you can’t give up the code because there’s intellectual property involved and stuff like that, that gets a little messy, give up the interface. Document the interface, publish the OpenAPI spec and give it to everyone in the world and say “you know what, you could run one of these too.” And then eventually, if somebody has data that they’ve been storing with you, you’ve got to give them a chance to recover their data, give them some exit plan, some check-out option, some takeaway. And then finally put the interface up and just redirect it to 410 Gone. There’s a thing in HTTP called 410 Gone, which means not only is it not here, it’s never going to come back. So go find something else. So often, I’ll arrange a 410 Gone with a pointer to the open source project or a pointer to the interface or a pointer to the alternate service, or something like that. So there’s definitely a process and it’s just like any other product.
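The last step, a 410 Gone that points somewhere useful, can be sketched as a small Python function that builds the response a retired endpoint might return. The function name and the choice of the `successor-version` link relation are assumptions for illustration; the pointer could equally go in the body or in a different relation type.

```python
from http import HTTPStatus


def gone_response(successor_url: str) -> tuple[int, dict[str, str], str]:
    """Build a 410 Gone response that points callers somewhere else.

    The helper name and the Link relation are illustrative choices.
    """
    status = HTTPStatus.GONE  # 410: not here, and not coming back
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        # Machine-readable pointer to the replacement, spec, or open source fork.
        "Link": f'<{successor_url}>; rel="successor-version"',
    }
    # Human-readable pointer for whoever is debugging the broken client.
    body = f"This API has been retired. See {successor_url}\n"
    return int(status), headers, body
```

Returning both a header and a body pointer covers clients that only log the status line as well as people reading the raw response.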
Think about tires, think about wheels, think about auto parts, we have the same thing going on there. You can get an auto part for a 1957 Chevy. Somebody’s got it somewhere. They’re not making them anymore, but there are pointers to where you can find them, and that’s the way we have to do it in the IT space.
Derric: How do you make sure that you don’t have any issues? Since COVID hit we’ve seen a record pace of re-platforming and moving to new gateways and call solutions. But it’s easy to make a human error and turn off the service too early.
Mike: Yep. There’s definitely the challenge to do that. I think Twitter’s gone through this a couple times, where they would tell people “we’re going to shut off our service on the first of the year.” I think they did this about four years ago, I used to document it. And they get to the end of the year, and what they do is they sort of test, like turn it off for an hour, to see if anything breaks. That’s literally what you can be stuck doing. Sometimes you’re going to run into that. I can design my system to be resilient, but I can’t design your system to be resilient. So there are definitely problems, there are definitely cases just like the ’57 Chevy where I’m going to have to actually craft this part on my own, because nobody offers it anymore. You’re always going to run into those kinds of things, but I think this process of communicating “we’re going to deprecate this,” and then having these moments where people can sort of test out things, where they can split and go to another version, helps. If you’re running a gateway service, you know if you have traffic. You know if you have traffic on the endpoints you’re about to deprecate. So you are in full control, at least on the inbound side, of what’s going on. So you can start to communicate with others. Now it’s tougher in an anonymous world, like an open source project. But, there are steps along the way that you can do.
You’ll never be foolproof - fools are too damned ingenious, they always come up with new ways to mess us up. But you can give a lot of lead time and then you can follow that kind of responsible deprecation model, which at least gives people another option along the way. That’s what I suggest.
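The point about a gateway knowing its own inbound traffic can be made concrete with a small Python sketch that counts who is still calling endpoints slated for removal. The record shape (`path` and `consumer` keys) is a hypothetical log format assumed for this example, not any particular gateway's output.

```python
from collections import Counter


def remaining_callers(records: list[dict], deprecated_paths: set[str]) -> dict[str, int]:
    """Count requests per consumer that still hit deprecated endpoints.

    Each record is assumed to look like {"path": ..., "consumer": ...}.
    """
    counts: Counter = Counter()
    for rec in records:
        if rec["path"] in deprecated_paths:
            counts[rec["consumer"]] += 1
    return dict(counts)
```

An empty result suggests the shutdown is probably safe; a non-empty one is the list of consumers to contact before any test outage.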
More and more opportunities have emerged that lower the bar for creating and maintaining APIs. RPA 3.0, AIOps and others are increasingly automating many elements in the API lifecycle.
Derric: Love that, especially when it comes to deprecation. It’s a pain we’re very close to at Moesif, just thinking about that process and automating the entire flow. As my last question, what are some recommendations that you want to give to the latest generation of product managers in terms of what they should be reading, studying and following up on?
Mike: Wow. I can offer this. I can tell you what I’m reading and what I’m seeing, and I think if you’re a good product manager this will give you some ideas. I’m going to be honest with you, I am not a very good product manager. I often get lost in the weeds. I get lost in the technical details. But good product managers can usually find something in my weeds that can be powerful.
I’m seeing a lot of this focus on this new automation approach, and I think that’s going to affect us a lot. Whether you call it RPA 3.0, or AIOps or all these kinds of things. It’s been going on for quite a while. But again, to use this latest pandemic as an example, a lot of this has been accelerated over the last year. This idea of how can I automate more and put a distance between myself and the system. So people are going to be looking for these kinds of low code / no code. They’re going to be looking for the ability to link things together. They’re going to be looking for more and more opportunity to lower the bar.
I’m seeing some very creative work from the idea of actually monitoring your API traffic in order to create for you, things like OpenAPI documents and documentation automatically. Using that as a validator tool. You’ve got organizations that have very complex approval systems - you have to have an OpenAPI spec to get approved to get on to the gateway. Now they’re using these AI tools to actually see if your running service complies with that set of specs that you’ve submitted to the registry. So that’s an idea of using intelligence to kind of put some distance and create some more resilient and stable systems. There’s a company in the Czech Republic, I can’t remember the name right now, that’s actually working on client and server applications that actually negotiate in real time. That actually figure out, “oh you’re using this version of the API, here’s the OpenAPI spec. I’m going to adjust my calls from PUT to PATCH, you’re going to accept these inputs.” So this is again putting distance and creating more resilience in automation. So it isn’t just generating code, it’s actually in real time adjusting these negotiated elements. So I’m seeing more and more of that. So that’s automation on a different level.
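A much-simplified version of "does the running service match the spec it submitted" can be sketched in Python by comparing observed calls against the `paths` object of an OpenAPI document. This sketch matches paths exactly and ignores path templates, parameters, and schemas, so it is only a toy illustration; the function name is an assumption.

```python
def undeclared_operations(spec: dict, observed: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return observed (method, path) pairs that the spec does not declare.

    `spec` is a parsed OpenAPI document; only its `paths` object is used,
    and paths are compared literally (no template matching).
    """
    paths = spec.get("paths", {})
    return [
        (method, path)
        for method, path in observed
        # OpenAPI keys operations by lowercase HTTP method under each path.
        if method.lower() not in paths.get(path, {})
    ]
```

Anything this returns is traffic the registry does not know about, which is the kind of drift an approval process would want flagged.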
At the side of observability and management I’m seeing so much more focus and so many more questions on what’s observable and what’s not, and how to turn that into action, as we talked about before. So I think all of that’s going to be really, really key. And then finally, I think more than anything else, and you mentioned this in the discussion, people who are product managers, business leaders, trying to solve their own problems are looking more and more to pre-built software, no code or low code software, where I can link things together. And I’m starting to see more people talk in the API space about trying to make API construction and API release more of a visual or visceral experience. Not a typing experience anymore. So I’m seeing more visual tools, more connecting tools, and to me that’s another thing that’s going to be really, really important. When you start to lower the barrier for making connections, you start to increase the possibility of having more connections. Now I can have more customers. Now I can send more information over the wire for the same amount of money that I used to build one single bespoke app. So I think if anything else, I think the product space is going to go more towards this automation, more towards self healing or self reliance, and that means we’re getting closer to that kind of real value chain automation where I could plop down my SaaS product in the middle of a value chain, without disrupting it, add value and get some return in the exchange. And that would be a fantastic new set of products and services that you can just automatically plug and play. I would love to see it.
Derric: And then you can actually decide which API you want to go with, based off of pricing, maybe in real time. We’ll see.
Mike: Absolutely. That’s exactly right. Google has a system, their ad bidding system is totally hypermedia driven robots, like “I’ll bid this, I’ll bid that.” I’ll just give you an example. Years and years ago when I was working in managing telephone switches, there were all sorts of plans for doing what were called short haul services or hardline services between cities. And I could get different prices and different discounts based on time of day and distance. So I wrote a program that said “if somebody wants to call 100 miles away, I’m going to use this service, because I have this rate, because it’s a Saturday I’m going to use this service, but on the Friday I’m going to use this service.” So we’re going to do just what you said. Eventually we’re all going to have the same interfaces that work the same way. I’m going to be able to start to actually make connections based on the price and the value to me. So the exchange is going to be super, super important. And we can start focusing on quality delivery, convenience and all the other things, in all sorts of other markets, that’s going to be the same for APIs as well.
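The telephone-routing program described above amounts to choosing the cheapest provider for a given request when every provider exposes the same interface. A minimal Python sketch follows; the `Carrier` type and the idea of a per-minute rate function are invented for illustration and do not reflect the original program.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Carrier:
    """A provider with a uniform pricing interface (hypothetical)."""
    name: str
    # Price per minute as a function of distance in miles and day of week.
    rate: Callable[[int, str], float]


def cheapest(carriers: list[Carrier], miles: int, day: str) -> Carrier:
    """Pick the carrier offering the lowest rate for this particular call."""
    return min(carriers, key=lambda c: c.rate(miles, day))
```

Because every carrier answers the same question the same way, the selection logic is a one-liner; the hard part, as the conversation suggests, is getting providers onto a common interface in the first place.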
Derric: Well, really glad to hear. And thank you for joining us today on our podcast, Mike.
Mike: I had a great time talking with you, Derric. And we’ll be talking again soon. Thanks.
Derric: Thanks a lot.