Boosting Efficiency: Technical Improvements for Boomset – Part 1
In 2018, I embarked on a career in the events industry. To ensure that my technical expertise wasn’t lost over time, I decided to create a post where I could document my experiences and refer back to it as needed. The industry itself is mostly irrelevant here, since everything I’m going to share is purely technical.
The project was built using Django and Python 2.7, and hosted on AWS. Fortunately, it had been dockerized shortly before I joined, making it relatively straightforward to set up a development environment. During my first month, I tackled some minor tasks until the first major issue arose.
The Problem: Static Variables in Python Classes
There was a long-standing, unsolved issue with integrations, which customers created to sync data from a third party to their event. Seemingly at random, data got mixed up: an integration would disappear from one event and appear in a completely unrelated one. We had an engineer who was exclusively focused on fixing these sorts of problems by manually checking the database and updating integration and event IDs.
Based on my experience, it was immediately apparent that the problem was caused by improper use of static (class-level) variables. Upon further investigation, I discovered that this had been going on for nearly two years. It’s staggering to think about the amount of engineering resources that were wasted as a result.
The Solution
I’m not going to share the actual code (it’s probably illegal), but here’s the explanation: when you use static variables in Python and serve your project with uWSGI, those variables are shared between all request/response life cycles. This might sound obvious to some, but it is not the case in, for example, a PHP project. When you serve a PHP project, your static variables are created at the start of each request/response flow and destroyed at the end of it. A Python project served with uWSGI works differently: each worker process stays alive between requests, so the variables are shared by every request/response flow that the process handles. To visualize this, check out the following example:
class Integration(object):
    event_id = 123

    def do_something(self, event_id):
        Integration.event_id = event_id
This should be enough to show what was happening. Each time you call do_something with a different event_id, the event ID seen by every other request in the same process changes to the latest one. All of a sudden, your integration becomes the integration of an unrelated event when you call .save() somewhere. The fix was easy: move the static variables to instance variables.
class Integration(object):
    def __init__(self, event_id):
        self.event_id = event_id
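To make the difference concrete, here is a minimal sketch that puts the two versions side by side. It is not the real code: the class and function names (SharedIntegration, PerRequestIntegration, simulate_shared, simulate_per_request) are made up for illustration, and the two "requests" are simply handled one after another in a single process, which is roughly what a long-lived worker does, rather than actually running under uWSGI.

```python
# Illustrative sketch only: all names here are hypothetical, not from the
# real project. Two "requests" are simulated back to back in one process.


class SharedIntegration(object):
    # Class attribute: one value per Python process, shared by all instances.
    event_id = None

    def do_something(self, event_id):
        # Writes to the class, so every instance sees the new value.
        SharedIntegration.event_id = event_id


class PerRequestIntegration(object):
    def __init__(self, event_id):
        # Instance attribute: each object keeps its own value.
        self.event_id = event_id


def simulate_shared():
    first = SharedIntegration()
    first.do_something(1)    # request A works on event 1
    second = SharedIntegration()
    second.do_something(2)   # request B arrives for event 2
    # Request A now reads event 2, because both read the class attribute.
    return first.event_id, second.event_id


def simulate_per_request():
    first = PerRequestIntegration(1)
    second = PerRequestIntegration(2)
    # Each request keeps the event ID it was created with.
    return first.event_id, second.event_id


print(simulate_shared())       # (2, 2)
print(simulate_per_request())  # (1, 2)
```

In the shared version, the first object ends up pointing at event 2 even though nobody touched it directly. That is the same kind of mix-up we were seeing once .save() got involved.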
Conclusion
This is the end of part 1, where a seemingly simple issue had been consuming engineering resources for almost two years. Solving this problem not only improved efficiency but also provided a sense of accomplishment. In part 2, I’ll tell the story of the database transaction issues and how I was able to overcome them. Stay tuned 🙂