On 21.07.2019, a beautiful sunday, we got some alerts from our production servers. Around 95% of the requests were getting 500 “Internal Server Errors” and no one modified or touched anything.
We started investigating the cause. Soon we realized that our system got a high load that triggered the autoscaling to start new instances. We don’t have a pre-made image so when our instances start, they install all the python libraries from scratch. One of the libraries that we use was dependent on another library called docutils and they didn’t freeze the version. So instead of requiring “docutils==0.14” they just required “docutils”.
Because of this, our new instances got the new version (0.15) that was just released on that day. And this version had some python 3.x code left in it which was giving a syntax error on our python 2.7 backend. It was sad to see this happening. I found a bug report about the issue and also left a comment.