Design-first approach to write APIs
This is the transcript of the talk by Phil Sturgeon presented at Git Commit Show 2020.
About the speaker
Phil Sturgeon builds API Design tools for Stoplight.io. He has written articles and books about pragmatic API design and systems architecture. He has been responsible in part or entirely for projects like the JSON Schema, OpenAPI, The League of Extraordinary Packages, PHP The Right Way, PHP-FIG, CodeIgniter, FuelPHP, PyroCMS, and a bunch of other stuff.
My name is Phil Sturgeon. I wrote a book a while back about APIs, and it has since become a kind of online community with a Slack group and a whole bunch of blog posts, not just the one book anymore. So there's a lot of content on there if you're interested in API design or API development, and if you're new to API design, I can introduce you to the topic.
For the last couple of years, a lot of people have been trying to come up with workflows for their API lifecycle. There are a lot of technologies involved: you hear about Swagger and OpenAPI and API Blueprint and RAML. So which one should you use? People wonder about using annotations in their code, or about creating machine-readable OpenAPI files by hand or some other way. Should you design first, or should you write the code first and then document it later? Are there any visual editors that help with this so you don't have to do a whole bunch of boring stuff by hand? How and when does documentation get involved in this process? When does mocking get involved, and do you even know what mocking is? And once we've got these API designs or API descriptions, how do we keep them in sync with the code that we've written? Is it possible, and how?
The first thing to figure out is which format to use. When I was looking at this years ago, that was Swagger, or OpenAPI v2 as it was renamed. I am very old now. OpenAPI v3 is out and it's great, while API Blueprint and RAML have mostly been abandoned. There's not all that much tooling for them, and the tooling that does exist is for slightly older versions. So if you want modern, up-to-date stuff, it's pretty simple: you go with OpenAPI v3. The industry has kind of agreed that this is the go-to standard.
I maintain a website as part of APIs You Won't Hate called OpenAPI.Tools. It's a community open source project where we just keep track of all the tools that are up to date. Anything that only supports v2, we generally don't put on there. There are other lists of tools around, but this is the one where the tools are probably good: if we don't think they're very good, we don't list them. It looks a little bit like this, so data validators are one category of tool, and you can see whether they support version two or not over there.
If you're more of a visual learner, it looks a bit more like this. The documentation often says one thing and the code does something else, so you have to make sure that doesn't happen or your documentation is lying. This looks complicated, but it's the exact design flow that I have seen, or the exact kind of flow where documentation is involved, for an API. People plan it somehow: they chat about it in person, grab a whiteboard or a napkin or something, and just start doodling on it. Then they say, "That napkin looks like a great design for our API, let's start writing the code." So they spend a long time writing all that code, making it work, writing tests or maybe not bothering yet. A couple of weeks later they hand that code over to the customer: the API consumer, the mobile developers, the web people, whoever. They ask, "Is this any good?" The customers might then play around with it. Maybe they spot some bugs, but mostly they're trying to give feedback on whether the API will satisfy their needs: whether all of the data they need is there, or whether they need to make a hundred different requests to get the information they need. They're just trying to find out if the API is useful and then give feedback. Then you have to write a bunch more code, or change that code and update all the controllers, models, and tests, and that feedback loop takes quite a long time. You can end up being forced to skip some improvements that you would have liked to make because you have to get it into production on time.
But at some point you deploy it and you think, "Wow, that was stressful, let's write some documentation for that pretty soon." Then feedback starts to come in, and with version 1.1 and version 1.2 or whatever, you have to keep adding improvements and fixing bugs, performance issues, whatever's going on. You get sucked into maintaining the API that you just deployed, and you think, "We'll get to that documentation."
We'll get to that documentation later, we don't have time right now. Then a new customer appears and you don't have any documentation to share with them: the docs either don't exist or they're bad. What ends up happening, in my experience, is that the developers then look at the code to try to figure out how it works, because they've forgotten what the contract should be. And the code is a bit confusing, because you've been hacking away at it for a while to make it support all the feedback you've been given. So you end up just creating a new global version, because you forgot how version one works and you need to get something working soon. This genuinely happens at large companies. They create versions 2, 3, and 15 just because they don't have time to figure out how the old code worked. Then you're back to planning new functionality somehow. A new endpoint needs to be made, or a new version needs to be made, so you just crack out another napkin and get going. This is how a lot of people have been developing APIs for a long time.
Recently a variation of this has popped up which seems to be quite popular. It's the same stuff: you write a bunch of code, then you annotate the code. Maybe you do that first, or maybe you wait for feedback, or whatever. Maybe you want to take the annotated code, share the documentation that comes out of it, and get customer feedback on that. But you still have this long stretch where you're writing a bunch of code before you can get the feedback, and that's the slow part, because changing code takes a lot longer than changing a bit of YAML. Again you deploy the code and deploy the docs. New functionality is requested and you go back to writing a bunch of code. So this is a slightly better variation of code first.
It's like a blueprint for a house. You could build a house first.
You could then run around and sketch it afterwards, but that's not the best use of anyone's time and it's not particularly scientific, safe, or useful. Either way you are producing a document, but if you produce that blueprint before you start doing the work, you already have it. So on day one after the house is complete, if somebody wants to see this document, you already have it. Otherwise you'd have to run around trying to make an accurate drawing of a thing that already exists.
So, the API design-first workflow. For several years this has been what people have been pitching, and it's not great. It ends up being a mixture of design first and, at some point, a handover to code first. It starts with design: you design with OpenAPI, and you get mocks and docs. Mocks are like a fake API generated from the metadata that you've created, so people can interact with that mock if they want and find out whether it works well for them, or whether they have to make lots of requests. They also get docs, so if they don't want to play with the mocks, they can look at the documentation to see if it looks like it will solve their problems. Then you get customer feedback in a much smaller, simpler feedback loop, because you're just tweaking some YAML.
Once it's agreed that the design is good and you're going to deploy, what people then do is one of two things. Either they generate a bunch of code, so you then have a second source of truth: you have these old design files that were used for the planning phase, but you've moved into code and the code is the source of truth. Or they throw away the OpenAPI and use annotations, which may be okay if you don't plan on ever changing your API. But when new functionality is requested, you have to follow the code-first model: write the code, then pass it over to the consumers for feedback, and you have that slower feedback loop again.
So the workflow that I've been suggesting for quite a while now, which is starting to catch on, has the same start: design with OpenAPI, mocks and docs, customer feedback. But then you take those machine-readable OpenAPI files and use them instead of code. Because it's a contract and it's machine-readable, you can use it to replace things where you used to write out a whole bunch of code, and avoid having two sources of truth, because you just have the one. You can then deploy that code and the documentation is immediately ready. When new functionality is requested, you still have this up-to-date blueprint of what the API looks like, what it does, what all the parameters are, and what all of the validations are. Then you can easily create a new endpoint or a new version or whatever you want to add, and you can use the mocks immediately because they're already up to date.
API life cycles are a very big, long, hard topic, and I'm not talking about the full life cycle of the API itself. But API design can help out with a lot of the different stages: design, mock, test, and it even reduces implementation work. Then it can help people consume and discover the API. So there are a lot of steps that API designs help out with, especially if you have those designs very early on.
While I was at my previous employer, WeWork (they're usually in the news, if you haven't heard of them before), we came up with this whole life cycle for all of the API design files. There's the development cycle and then a kind of aggregation stage. The development cycle is where you write OpenAPI in pull requests, which can trigger continuous integration that checks the descriptions are good, checks that they're accurate, and returns the result. Humans can review it too, so API design review is another concept that is made much easier with API design files.
Once the computers and the humans agree that it's a good change for the API, you can merge that to master. Then you have some sort of process or hosted platform. I built my own; now they exist, and there's a bunch of them on OpenAPI.Tools. It will look at all your repos, clone them down, take the description files, turn them into mock servers and doc servers, generate SDKs, and even mirror things to Postman. When I started telling people about this at WeWork, there were around 100 APIs there, and not everyone was interested in writing documentation for the sake of it. But various teams were like, "Oh, I could get a free SDK that stays up to date without having to maintain it manually." So different people get excited about different aspects of this workflow. They don't need to love all of it, they just need to like some of it.
The problem was that most people agreed that writing a whole bunch of YAML by hand was going to be terrible. I understand what he meant, but what was created was this, and this does not seem particularly easier. It's just a different complicated syntax that people need to learn and that tools need to convert to and from. It might be slightly fewer lines, but I don't know if it's any more readable than anything else. You can't even see it all on this screen. So DSLs are an option, and some look nicer than others, but again you have the problem where syntax highlighting might not work, and various other things.
So while I was still at WeWork, I was wondering where the visual editors were: where are the GUIs for this, to make things easier? Now there are loads of them. There didn't use to be a couple of years ago, but there are a few competing ones now, and I work for a company that makes them. Ours looks like this: you can edit all of your paths on the side and update all the information, URLs, and methods. You can just click your way through it, and you never really need to look at the OpenAPI.
You don't need to know what it looks like, because you can just work with files in your local file system or use the online hosted version. So all sorts of different people, including technical writers, can work with these files in a very simple way. That covers even complicated things like data types and allOf and anyOf, which lots of people who do know OpenAPI don't understand, and it includes things like using references to break descriptions down and spread them across multiple files. So if you've ever considered working with Swagger or OpenAPI and you've seen these 5,000-line YAML files, you don't need to worry about that, because you can split things up with $ref. But again, this isn't a tutorial about how to use OpenAPI. A lot of these tools will also mix OpenAPI description files with Markdown files, so you can write tutorials and guides as well as the very important reference documentation.
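To give a feel for how references keep a description small, here is a minimal Python sketch. It is not how any particular editor or tool implements this: it only resolves local pointers of the `#/components/...` kind inside a single in-memory document, and the `Pet` schema and `/pets` path are made up for illustration. Splitting across files follows the same idea, with a file path in front of the `#`.

```python
import copy

# A tiny, hypothetical OpenAPI-style description held as a Python dict.
description = {
    "openapi": "3.0.3",
    "info": {"title": "Pet Store (illustrative)", "version": "1.0.0"},
    "paths": {
        "/pets": {
            "get": {
                "responses": {
                    "200": {
                        "description": "A list of pets",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/Pet"},
                                }
                            }
                        },
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "Pet": {
                "type": "object",
                "required": ["id", "name"],
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                },
            }
        }
    },
}


def lookup(document, pointer):
    """Follow a local reference such as '#/components/schemas/Pet'."""
    if not pointer.startswith("#/"):
        raise ValueError("this sketch only handles local references: " + pointer)
    node = document
    for part in pointer[2:].split("/"):
        # JSON Pointer escapes: '~1' means '/', '~0' means '~'.
        node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


def resolve_refs(node, document):
    """Return a copy of node with every local $ref replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return resolve_refs(lookup(document, node["$ref"]), document)
        return {key: resolve_refs(value, document) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, document) for item in node]
    return copy.deepcopy(node)


resolved = resolve_refs(description, description)
items = resolved["paths"]["/pets"]["get"]["responses"]["200"]["content"][
    "application/json"]["schema"]["items"]
print(items["required"])
```

The sketch does not guard against circular references, which real schemas can contain, so treat it as an illustration of the idea rather than something to reuse as-is.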
Another awesome benefit built into a lot of these editors is the concept of linting. If your editor has a lint tool, or you're using one in the CLI, and it supports custom rule sets, you can enforce API standards and style guides before people start writing code. If you lint an API that someone has already built and then output some OpenAPI from, it might be too late to give them any feedback on that API, because it's already code and it's already in production. But if you're working with OpenAPI first, you can have these custom rule sets. I made one here that says every single error, so anything with a 4xx or 5xx response, should be in one of these two well-known error formats. We don't want custom error formats in our APIs that people might not know how to handle. Or you can have rules like "you must use kebab-case for URLs instead of underscores", or the entire company can discuss it and decide that all of our URLs are going to have underscores.
You can run these linters in the CLI so developers get real-time feedback, and you can do it in VS Code: Spectral exists as both a CLI and a VS Code extension. As I said, some editors have it built in, so you can be told how to write better OpenAPI as you go, which helps developers who don't know anything about OpenAPI to write better OpenAPI without realizing it. So linting is a really useful way to improve not only the quality of the API description but the quality of the underlying API itself.
So how and when do the mocks get involved? There's a bunch of tools, again on OpenAPI.Tools. You can have local mock servers and hosted mock servers. Here you run it against your API description and it will list all the URLs that are available, pets and pet stores and all this stuff, and it will run a fake server that you're able to talk to. It will even give you validation messages if you send an invalid request.
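As a rough illustration of the two custom lint rules described above (kebab-case URLs and well-known error formats), here is a small Python sketch. It is not Spectral's ruleset syntax, just plain code that walks a description held as a dict. The talk does not name the two error formats, so the media types for RFC 7807 problem details and JSON:API used below are assumptions, and the example paths are hypothetical.

```python
import re

# Assumption for illustration: the talk does not name the two well-known
# error formats, so RFC 7807 and JSON:API media types stand in here.
ALLOWED_ERROR_TYPES = {"application/problem+json", "application/vnd.api+json"}
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
KEBAB_SEGMENT = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def lint(description):
    """Return a list of human-readable problems found in the description."""
    problems = []
    for path, path_item in description.get("paths", {}).items():
        # Rule 1: every literal path segment must be kebab-case.
        for segment in path.strip("/").split("/"):
            is_parameter = segment.startswith("{") and segment.endswith("}")
            if segment and not is_parameter and not KEBAB_SEGMENT.match(segment):
                problems.append(f"{path}: segment '{segment}' is not kebab-case")
        # Rule 2: 4xx and 5xx responses must use an allowed error media type.
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue
            for status, response in operation.get("responses", {}).items():
                if not status.startswith(("4", "5")):
                    continue
                media_types = set(response.get("content", {}))
                if not media_types or not media_types <= ALLOWED_ERROR_TYPES:
                    problems.append(
                        f"{method.upper()} {path} {status}: "
                        "error response is not in a well-known error format"
                    )
    return problems


# Hypothetical description; the first path breaks both rules, the second passes.
example = {
    "paths": {
        "/pet_stores/{store_id}": {
            "get": {
                "responses": {
                    "200": {"description": "OK"},
                    "404": {
                        "description": "Not found",
                        "content": {"application/json": {}},
                    },
                }
            }
        },
        "/pet-stores": {
            "get": {
                "responses": {
                    "500": {
                        "description": "Server error",
                        "content": {"application/problem+json": {}},
                    }
                }
            }
        },
    }
}

for problem in lint(example):
    print(problem)
```

Because the rules run against the description rather than against deployed code, a check like this can sit in an editor or a continuous integration job and flag problems while the design is still cheap to change.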
So the person on the front-end team who is integrating against your fake API will get real feedback when they're not quite implementing it correctly, so that when you switch out the mock for the real API, there's much less work to be done. That increases the parallelization of the front end and the back end, with both working on their codebases at once. Instead of a month for the API to get built and then a month for the front end to get built, you can both work at the same time, and the overall time is one month instead of two. So this gets rid of the waterfall model for API development.
There are hosted mock servers too. Quite often it's just some-project.some-provider/endpoint, and when you make a GET request it gives you a relatively realistic fake of what the real API would return. So at some point you can just switch out the server and interact with the real API.
Then there's API documentation. You can just deploy it on S3, and some tools will read from your Git repo. Again, there's a list of tools on there, and you can have beautiful API reference documentation, some of it equivalent to Stripe's, which is the metric of quality in the API docs world.
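To make the mock server idea described above a little more concrete, here is a minimal Python sketch of its core: match an incoming request against the path templates in a description and answer with the example stored there. It is a simplified illustration, not how any of the listed mock servers work internally. The `description` dict, its paths, and its examples are hypothetical, and a real mock server would also validate requests and generate data from schemas.

```python
import re

# Hypothetical description: the paths and examples are made up for illustration.
description = {
    "paths": {
        "/pets": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": [{"id": 1, "name": "Rex"}]
                            }
                        }
                    }
                }
            }
        },
        "/pets/{petId}": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": {"id": 1, "name": "Rex"}
                            }
                        }
                    }
                }
            }
        },
    }
}


def template_to_regex(template):
    """Turn '/pets/{petId}' into a regex that matches '/pets/123'."""
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile("^" + pattern + "$")


def mock_response(description, method, request_path):
    """Return (status, body) using the first 2xx JSON example that is described."""
    for template, path_item in description["paths"].items():
        if not template_to_regex(template).match(request_path):
            continue
        operation = path_item.get(method.lower())
        if operation is None:
            return 405, {"title": "Method Not Allowed"}
        for status, response in sorted(operation["responses"].items()):
            media = response.get("content", {}).get("application/json", {})
            if status.isdigit() and status.startswith("2") and "example" in media:
                return int(status), media["example"]
        return 501, {"title": "No example described for this operation"}
    return 404, {"title": "Not Found"}


print(mock_response(description, "GET", "/pets/42"))
print(mock_response(description, "GET", "/owners"))
```

Since the fake responses come straight from the description, a front-end team can start building against them on day one, and the answers change as soon as the design does.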
Now it's time for questions.
Q. What is OpenAPI?
OpenAPI is an API description format that contains metadata about your API. If you're creating an API, you have URLs and headers and parameters and body parameters, with JSON coming back or maybe CSV going up, and multiple different combinations of that. So OpenAPI is a standardized way of describing the entire interface of everything an API could do. If you have callbacks and webhooks responding to requests that you make, you can describe them in OpenAPI too. JSON Schema is another similar, compatible standard. So OpenAPI wraps the service layer of all the endpoints and headers, and JSON Schema describes the payloads in the bodies of those requests and responses.
Q. I got it. Last time I tried using Swagger, I was very excited that yes, I will design first and do this. But very soon that excitement faded as I got into coding, and then it got more complex. There were more dynamic things that were changing, and it seemed to me that there was a gap. I will say there is a learning barrier, since you need to learn about this. Can you share some tips on how we can make getting started with design first easier? Are there any tools, apart from the ones you shared, or anything else that you would like to share?
Q. What would you say to developers or teams who already have APIs in production and now want to start using OpenAPI, both for their existing APIs and for new APIs they develop?
You need to develop a workflow that supports everybody. New APIs should be designed first, and that's the simple part of the equation, but how do you catch up with APIs that already exist? If they have annotations, you can export those annotations to a machine-readable file and then delete the annotations. You can also delete a bunch of your code and put server-side validation in that runs off the file, so that you're using that instead of the code. So you can slowly morph existing APIs towards becoming design first, especially for the next version of an API. Say we're on version 3 right now and we want to build version 4: we'll make v4 design first, and then when we deprecate the old versions, we've fixed it. So it's a game of catch-up. There are other things you can do too. If you have Postman collections, you can convert those to OpenAPI, and then you have to fill in a lot of the information, because Postman collections don't store as much information about the payloads.
Q. How would you recommend implementing contract testing, specifically given that Dredd does not seem to support OpenAPI 3 for now?
So Dredd has experimental support for OpenAPI 3, and I had to go a bit quick on that slide. I don't honestly recommend Dredd. It's more of a tool for testing that your documentation seems roughly correct than it is a tool for contract testing. For contract testing, like I mentioned, if you search Google for JSON Schema matchers, or just JSON Schema validators or OpenAPI validators (they're kind of interchangeable sometimes), you can use those tools in your existing test suite. You do a bunch of work, say you send a POST request, and then you assert that the response that comes back is valid compared to the JSON Schema file that you have. That's one way of doing it.
There are other ways of doing it. There's another tool that we made called Prism, which is a mock server. I demonstrated that a little bit with a screenshot, but it also has a proxy. You run a proxy server and pass it two arguments: the OpenAPI file and the server that you'd like it to forward to. Then you make requests to the Prism proxy, and it will validate that the requests you're sending and the responses you're receiving match the OpenAPI file. If you send an invalid request, or if the server sends you something it said it shouldn't send, you'll have an error blow up in the logs, in a header, or in the payload, so you can't miss it. Mobile developers can run that locally, or you can put it in your end-to-end test suite, so that when your tests go through your QA environment, or whatever you call it, they will all blow up if anyone makes a bad request. This replaces the concept of producer contract testing and consumer contract testing with just in-flight contract testing. So: Prism proxy, or any JSON Schema validator or OpenAPI validator wrapped up in a test.
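The first approach above, asserting inside an existing test that a response matches a schema, might look roughly like the following Python sketch. The `validate` function here is a hand-written stand-in that checks only a small subset of JSON Schema (`type`, `required`, `properties`, `items`). In a real suite you would use an established JSON Schema or OpenAPI validator instead. The `pet_schema` and the response bodies are hypothetical, and the bodies are hard-coded where a real test would call the API.

```python
# Maps JSON Schema type names to the Python types that json.loads produces.
JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def validate(instance, schema, path="$"):
    """Return a list of error strings; an empty list means the instance is valid.

    Only 'type', 'required', 'properties', and 'items' are checked, which is a
    small subset of JSON Schema chosen to keep the sketch short.
    """
    errors = []
    expected = schema.get("type")
    if expected is not None:
        matches = isinstance(instance, JSON_TYPES[expected])
        # In Python, bool is a subclass of int, so exclude it explicitly.
        if expected in ("integer", "number") and isinstance(instance, bool):
            matches = False
        if not matches:
            return [f"{path}: expected {expected}, got {type(instance).__name__}"]
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}: missing required property '{name}'")
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(validate(instance[name], subschema, f"{path}.{name}"))
    if isinstance(instance, list) and "items" in schema:
        for index, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


# Hypothetical schema for the response body of a "create pet" request.
pet_schema = {
    "type": "object",
    "required": ["id", "name"],
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
}

# In a real test these bodies would come back from the API under test.
good_response = {"id": 1, "name": "Rex"}
bad_response = {"id": "1"}

assert validate(good_response, pet_schema) == []
for error in validate(bad_response, pet_schema):
    print(error)
```

The same assertion can be dropped into any test that already makes the request, which is why this style needs no separate contract-testing tool.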
Q. One challenge I see in building APIs, or collaborating over APIs, is the documentation. Let's say that instead of diving deep into all of this, someone only wants to tackle the challenge of documenting the API. What are good open-source tools to do just that?
There's a tool called ReDoc and another one called Widdershins. They're funny names, but there are various tools out there that will just provide documentation, and again OpenAPI.Tools has a list of them. There are a lot of tools on there.
Q. Would you like to share any last thoughts with the audience?
I'm currently collating all those blog posts into another entire book, so there's a lot of stuff on there that will help you with whatever you're trying to do with APIs. So stick around and ask me questions, or find me on Twitter. I'm Phil Sturgeon. Enjoy the conference.
Thanks for running this, it's awesome!