Horrible Bug
posted on 2017-06-10 16:01:00

I just figured out the most horrible bug! For nearly two months, I've been watching deploys of my Lisp webapp say:

Woo server is going to start.
Listening on localhost:5000.
Unhandled WOO.EV.CONDITION:OS-ERROR in thread #<SB-THREAD:THREAD "main thread" RUNNING
  getaddrinfo() failed (Code: -8)
Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {1005C85A43}>

over and over and over.

Since I only find a little time to work on this stuff, around my day job, it was really pissing me off.

When I finally figured out what the problem was, I was even more pissed off.

It took me forever to figure out what that -8 error code meant. At least I'd seen enough C code before to recognize it as a C library error code. I must have miscounted while looking at strange webpages that listed the possible errors for getaddrinfo(), and that alone threw me off for a month. Today, I finally decided there wasn't some quick way to solve this, and that I had to dig into the problem with due diligence. I wrote a quick C program to get the description of the error from the horse's mouth:

#include <netdb.h>
#include <stdio.h>

int main(void) {
  /* Ask the C library what getaddrinfo() error code -8 means. */
  const char *explanation = gai_strerror(-8);
  printf("Error -8 is %s\n", explanation);
  return 0;
}

And it informed me: Error -8 is Servname not supported for ai_socktype. A quick search led me to a Stack Overflow question, which had a comment that I include here for your convenience:

I had this problem with a Tornado/Python app. Apparently, this can be caused by the port being interpreted as a string instead of an integer. So in my case, I needed to change my startup script to force it to be interpreted as an integer.

This turned out to be precisely my problem.

I had written my development configuration as a plist which, being read by the Lisp reader, could contain actual numbers. In production, I was reading configuration out of environment variables, which are always strings. So the port reached the server as an integer in development and as a string in production. Pop in an if statement with a (parse-integer port) and blam, fixed.
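
For illustration, here is a minimal sketch of that kind of guard. It is written in C to match the snippet above, not in the Lisp the app actually uses, and the helper names parse_port and port_from_env are hypothetical. The idea is simply to turn the environment string into an integer once, at startup, and to reject anything that isn't a valid port:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a port given as text (which is how
   environment variables always arrive) into an integer.
   Returns -1 if the text is missing, has trailing junk, is not a
   number at all, or falls outside the range 1..65535. */
int parse_port(const char *text) {
  if (text == NULL || *text == '\0') {
    return -1;
  }
  char *end = NULL;
  errno = 0;
  long value = strtol(text, &end, 10);
  if (errno != 0 || *end != '\0' || value < 1 || value > 65535) {
    return -1;
  }
  return (int)value;
}

/* Hypothetical helper: read the port from the named environment
   variable, using the fallback when it is unset or invalid. */
int port_from_env(const char *name, int fallback) {
  int port = parse_port(getenv(name));
  return port > 0 ? port : fallback;
}
```

Calling port_from_env("PORT", 5000) would then hand the server an integer whether or not the variable is set. PORT is an assumed variable name, and 5000 is just the port from the log above.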

Two months of hobby time, down the toilet.

I'm gonna have to see if I can make a patch for Woo or something to make sure this doesn't happen again.

