When Everything is Not Actually 200 OK

indiana

If you are fortunate enough to have an API where a ping that verifies status codes can suffice for your uptime analytics, at least be sure the status codes are accurate.

Two of our customers ran their API programs for over six months, and during that time their monitoring systems reported 100% uptime. When they onboarded with us, the ugly truth was revealed. We quickly discovered that rather than replying with status codes that matched the actual outcome of each request, they used soft 404s and 500s.

A soft error code is what you get when an API always responds with 200 OK, even when an error happens. The error is then described only in the payload.
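
To make the distinction concrete, here is a minimal sketch of the same "not found" failure reported both ways. The function names and the shape of the `error` payload are hypothetical, chosen only for illustration; they are not taken from any particular API.

```python
import json

def soft_not_found(resource_id):
    # Soft 404: the status code claims success; the failure is only in the body.
    body = {"error": {"code": 404, "message": f"{resource_id} not found"}}
    return 200, json.dumps(body)

def hard_not_found(resource_id):
    # Hard 404: the status code itself reports the failure.
    body = {"error": {"code": 404, "message": f"{resource_id} not found"}}
    return 404, json.dumps(body)
```

Both functions return the same body. Only the status code differs, and that is the only thing a status-based ping looks at.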

This practice may be convenient in some situations, but it exposes the API owner and its partners to two risks.

  1. The API monitoring system responsible for reporting downtime might not be equipped with payload analysis and might rely solely on status codes. If that’s the case, then we have a false 100% uptime.
  2. The developer using the API may decide to cache certain payloads because the content is not subject to frequent changes. They would never cache a hard 404 or 500, but if it’s a 200…
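
The first risk can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real monitoring product: the sample responses are made up, and the convention that a soft error carries a top-level `error` key is hypothetical.

```python
import json

def status_only_ok(status_code, body):
    # What a plain ping sees: any 2xx counts as "up".
    return 200 <= status_code < 300

def payload_aware_ok(status_code, body):
    # Also require valid JSON with no top-level "error" key (assumed convention).
    if not 200 <= status_code < 300:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and "error" not in payload

def uptime_percent(samples, check):
    # Percentage of sampled responses that pass the given check.
    passed = sum(1 for status, body in samples if check(status, body))
    return 100.0 * passed / len(samples)

# Made-up monitoring samples: every response is a 200, half are soft errors.
samples = [
    (200, '{"data": {"id": 1}}'),
    (200, '{"error": {"code": 500, "message": "backend timeout"}}'),
    (200, '{"data": {"id": 2}}'),
    (200, '{"error": {"code": 404, "message": "not found"}}'),
]

print(uptime_percent(samples, status_only_ok))    # 100.0
print(uptime_percent(samples, payload_aware_ok))  # 50.0
```

The same four responses yield 100% or 50% depending on whether the monitor reads the payload.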

Using soft 404s and 500s is bad practice for an API program, and we strongly advise against it. That said, if you must do it, we have a few suggestions:

  1. Use an API monitoring tool with payload verification, and configure it so that the payload structure matches your expectations. We know a good one.
  2. Add cache headers to the API responses, and be clear in the documentation that any caching strategy should rely solely on them.
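
As a rough sketch of the second suggestion, the server can mark soft errors as non-cacheable with the standard `Cache-Control` header, and a client can decide what to cache from that header alone. The helper names, the top-level `error` key, and the default 300-second lifetime are all assumptions made for this example; the client rule is deliberately simplified and ignores most `Cache-Control` directives.

```python
import json

def cache_headers(body, max_age_seconds=300):
    # Assumed convention: soft errors carry a top-level "error" key.
    payload = json.loads(body)
    if isinstance(payload, dict) and "error" in payload:
        # Tell clients and intermediaries not to store an error payload.
        return {"Cache-Control": "no-store"}
    return {"Cache-Control": f"max-age={max_age_seconds}"}

def client_may_cache(headers):
    # Simplified client rule: cache only when the server did not say
    # "no-store" and gave an explicit, positive max-age.
    raw = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in raw.split(",") if d.strip()]
    if "no-store" in directives:
        return False
    for directive in directives:
        if directive.startswith("max-age="):
            value = directive.split("=", 1)[1]
            return value.isdigit() and int(value) > 0
    return False
```

With this in place, a client that follows the headers never caches a soft error, even though its status code is 200.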

You may be asking yourself, “who would do that!?” Expedia, for one, with an API that generates over $2 billion a year in revenue. Expedia’s is one of the best-designed APIs we have ever seen (we are actually going to write a blog post about it), but this is one small mark against it. Having a great API is not complicated; you just need to take advantage of the tools at your disposal.