A Bedtime Story about GitHub and the Silly DNS

Last night, I was watching yet another episode of House MD on TV. There’s always a good reason to watch House again, even though I know most of the lines by heart. (If you want to ruin the show for yourself: the correct diagnosis always arrives at minute 38, and it’s not lupus, except for once.) So I wasn’t paying close attention, and I couldn’t ignore that my phone was begging me to look at it.

It was API Fortress, letting me know that the GitHub tests were failing. Yes, we monitor the GitHub API ourselves, because we rely on it.

Now, GitHub occasionally runs into issues that last a couple of minutes, so I wasn’t worried enough to open the notification email. But the alerts kept coming, so I finally took a look and, to my great surprise, the error was a dreadful “Unknown host.”

As a fundamentally insecure person, I feared that API Fortress itself might be running into DNS problems, so I rushed to my computer to figure out what was going on.

The GitHub tests were running just fine from the US. The issue seemed to originate in the European datacenter in Dublin, and it was consistent. I logged into the machine and tried to ping api.github.com… unknown host. Then I tried to ping github.com… unknown host.
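For readers who want to script the same check instead of typing ping by hand, here is a minimal sketch in Python. It is not what API Fortress runs; the function name `resolves` and the injectable `resolver` parameter are assumptions made for illustration, and port 443 is just a convenient value to pass to the lookup.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    `resolver` is a hypothetical hook (not part of any product) that
    lets the lookup be swapped out, for example in tests.
    """
    try:
        # getaddrinfo raises socket.gaierror when the name cannot be
        # resolved, which is the "unknown host" case from the story.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        return False


if __name__ == "__main__":
    for host in ("api.github.com", "github.com"):
        print(host, "resolves" if resolves(host) else "unknown host")
```

Run from the affected machine, a script like this would report "unknown host" for both names while the DNS entries are broken, and "resolves" once they are back.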

Well, that’s weird.

The very next thing I did was check the AWS health page, searching for the classic anemic announcement of a DNS issue. Nothing.

Discomforted, I searched for answers from the world’s artillery store: Twitter. If a problem is going on somewhere, then someone is complaining about it, and this beautiful piece of irony came up as a first result:

Mystery solved. GitHub’s DNS entries were broken!

Four lessons to be learned from this incident:

  1. Even the dumbest monitor can catch a devastating issue. Please monitor!
  2. Your own API is something you need to monitor. An API you rely upon is also something you need to monitor.
  3. Monitoring is better done with a geographically distributed system. Errors take time to propagate from region to region, and so do their fixes.
  4. A little bit of extra self-confidence makes life less stressful.
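To make the third lesson concrete, here is a small sketch of how per-region results could be turned into a verdict. Everything in it is hypothetical: the function name `classify_outage`, the region labels, and the three verdict strings are assumptions for illustration, not the way API Fortress reports failures.

```python
def classify_outage(results):
    """Summarize per-region check results.

    `results` maps a region name to True if the check passed there.
    Returns a (verdict, failing_regions) tuple.
    """
    failing = sorted(region for region, ok in results.items() if not ok)
    if not failing:
        return "healthy", failing
    if len(failing) == len(results):
        # Every vantage point fails: the target is likely down for everyone.
        return "global outage", failing
    # Some regions pass and some fail, as in the Dublin-only failure above.
    return "regional outage", failing


# Hypothetical region labels mirroring the incident: fine in the US,
# failing in the European datacenter.
print(classify_outage({"us": True, "eu-dublin": False}))
# → ('regional outage', ['eu-dublin'])
```

A single vantage point in the US would have reported everything as healthy that night, which is the point of checking from more than one place.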

A FALSE SENSE OF SECURITY

Did you know that the API Fortress platform can reuse your data-driven, functional, integration, and load tests as holistic Functional Uptime Monitors?

For more information, read the eBook API Monitors: A False Sense of Security to learn why API monitors must go beyond uptime and performance.

API Monitoring from API Fortress

Monitor the performance, uptime, and functional uptime of live APIs. Deploy unlimited internal API monitors. Mass generate monitors for 3rd-party, partner, and public APIs.
