Clean Coding

Image from Dall-E: CS student writing messy code 🙂 It actually looks like a Japanese horror movie poster!

I was going to think and write about what clean coding meant to me, but I was struggling to do so. There’s no particular reason; I just couldn’t start writing. So I asked ChatGPT to do it for me. Here’s what ChatGPT wrote about clean coding:

Clean coding is a software development practice that emphasizes readability, maintainability, and simplicity. It is a way of writing code that makes it easy to understand and modify, even by someone who did not write it. Clean coding has many benefits, including improved software quality, faster development cycles, and increased developer productivity.

One of the main principles of clean coding is to write code that is easy to read and understand. This means using meaningful variable names, writing clear comments, and following a consistent coding style. When code is easy to read, it is also easier to debug and modify. This can save developers time and reduce the risk of introducing bugs into the code.

Another principle of clean coding is to write code that is maintainable. This means breaking code into small, reusable modules that can be easily modified and tested. Clean code also minimizes dependencies, making it easier to upgrade and maintain the code over time. When code is maintainable, it is also more scalable, meaning it can handle larger workloads and more complex functionality.

Clean coding can also affect performance. Writing clean code can improve the performance of an application by reducing the amount of unnecessary or redundant code. When code is clean, it is also easier to optimize, meaning developers can identify and fix performance issues more quickly. Clean code is also less likely to contain bugs, which can slow down an application or cause it to crash.

In addition to improving performance, clean coding can also have a positive impact on developer productivity. When code is clean, developers spend less time debugging and more time developing new features. Clean code is also easier to test, meaning developers can write automated tests to catch bugs before they make it into production.

In conclusion, clean coding is an essential practice for software development. It promotes readability, maintainability, and simplicity, which can lead to improved software quality, faster development cycles, and increased developer productivity. It also has a positive impact on performance by reducing unnecessary or redundant code and making it easier to optimize. Whether you are a seasoned developer or just starting out, writing clean code should be a top priority.

Well, why would anyone write about anything anymore unless it’s something very specific and unknown? From ChatGPT’s answer I see that clean coding is a well-established concept. I feel so because I agree with most of the things it says.

I want to comment on what it says about improving performance. I feel this might not hold in some cases and is really debatable. For example, in an interpreted language, when you divide your code into many classes and functions/methods to make it more readable, you also make it run slower, because the interpreter now has to follow classes and methods around instead of just interpreting a single huge function in a single file. So dirty code with the exact same flow will actually run faster. Here’s a very simple example demonstrating this:

# Apologies for the dirty code...
import time


def sum(x, y):  # note: shadows the built-in sum(); kept for this tiny demo
    return x + y


LOOP_COUNT = 1000000

# Time one million calls to the function.
start = time.perf_counter()
for i in range(LOOP_COUNT):
    my_sum = sum(1, 2)
end = time.perf_counter()
time_call_function = end - start
print(f"Call function sum: {time_call_function}")

# Time one million direct additions without a function call.
start = time.perf_counter()
for i in range(LOOP_COUNT):
    my_sum = 1 + 2
end = time.perf_counter()
time_direct_sum = end - start
print(f"Directly sum: {time_direct_sum}")
print(f"{time_call_function/time_direct_sum}")

The output of this code is:
Call function sum: 0.0413360595703125
Directly sum: 0.014580011367797852
2.835118473337367

As you can see, calling the function to do the same thing a million times is 2.8 times slower than summing the values directly.
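One caveat about the measurement above: CPython evaluates a constant expression such as 1 + 2 once at compile time, so the second loop mostly measures a plain assignment. A somewhat fairer comparison, sketched below under the assumption that both versions should add two variables, uses the standard timeit module. The names add and measure and the loop counts are illustrative only, and the exact ratio will differ by machine and Python version.

```python
import timeit


def add(x, y):
    """Same job as the function in the example above, under a non-shadowing name."""
    return x + y


def measure(loop_count=200_000, repeats=5):
    """Return (call_time, inline_time) in seconds, best of `repeats` runs."""
    namespace = {"add": add, "a": 1, "b": 2}
    # Adding variables (not literals) keeps the compiler from folding the sum.
    call_time = min(
        timeit.repeat("add(a, b)", globals=namespace, number=loop_count, repeat=repeats)
    )
    inline_time = min(
        timeit.repeat("a + b", globals=namespace, number=loop_count, repeat=repeats)
    )
    return call_time, inline_time


call_time, inline_time = measure()
print(f"Call function add: {call_time}")
print(f"Inline a + b:      {inline_time}")
print(f"Ratio:             {call_time / inline_time}")
```

Taking the minimum of several repeats reduces noise from whatever else the machine is doing at the time.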

The example above doesn’t always apply to a compiled C program. At compile time, the compiler decides whether to inline the function call. That means that if the compiler decides it’s worthwhile, it replaces the function call with the body of the function. If our Python code were compiled this way, the function call would automatically be converted to the second example and the code would run about 2.8 times faster, as if by magic.

The point is that practicing clean coding might actually reduce performance in most cases. If you are in a position where your code is the bottleneck of your system, you might want to consider doing some things that go against clean coding.

So now let’s ask ChatGPT how we can determine whether we have clean code or not.

Determining if a software code is clean code can be subjective, but there are several characteristics that are commonly agreed upon by software engineers as being indicative of clean code. Here are some guidelines that you can use to evaluate the cleanliness of a software code:

Readability: Clean code is easy to read, understand and follow. It should have clear and meaningful names for variables, functions and classes.

Simplicity: Clean code is simple and easy to understand. It follows the “Keep It Simple, Stupid” (KISS) principle, meaning that it is not overly complex or convoluted.

Modularity: Clean code is modular, meaning it is divided into small, self-contained modules that can be easily understood and tested.

Consistency: Clean code is consistent in style and structure. It follows a set of coding standards and conventions that make it easier to read and understand.

Testability: Clean code is designed to be easily testable, with clear and concise unit tests that cover all possible scenarios.

Minimalism: Clean code is minimalistic, meaning it avoids unnecessary code and follows the “Don’t Repeat Yourself” (DRY) principle.

Maintainability: Clean code is maintainable, meaning it can be easily modified and updated without introducing bugs or breaking existing functionality.

Documentation: Clean code is well-documented, with clear and concise comments and documentation that explain the purpose and functionality of each module and function.

By evaluating a software code based on these characteristics, you can get a good idea of how clean and maintainable it is. However, keep in mind that clean code is not an absolute concept and can vary depending on the specific requirements and context of the software project.

Again, I agree with almost everything it says. I might not agree 100% with the documentation part, where it says to comment the code, modules and functions. I think the code should be self-explanatory in most cases. When you have good readability, you automatically have good documentation within the code: it is easier to read and understand, which means it has built-in documentation. Of course, this won’t apply to some advanced areas such as gaming, graphics and scientific software, where you would expect to see a lot of advanced/complex algorithms.

In conclusion, we have some more time before AI completely takes over and we have no jobs anymore. It’s OK if the AI chooses to focus on unbelievably perfect performance with zero readability for humans. Until then, I strongly believe that focusing on clean code in a software project will pay off in a big way. Since we humans still design, plan, implement and maintain software it’s better to write code for humans to read instead of writing code for computers to run.

My Current State (March 2023) Calendar Organization and Vitamin D Upgrades

Calendar

At work, we use the calendar a lot. Meetings, 1-1s, lunch time, exercise time, your OOO time, everything is there. Everything is documented and organized. At the same time, my own life was not that organized. Outside of work my calendar was empty, so I decided to organize my own time too. I divided my days between the different projects that I’ve been working on in my free time.

Monday: work on project 1, walk,
Tuesday: work on project 2 and call people,
Wednesday: research how to invest and take action, swim,
Thursday: work on my blog, swim.

I tend to procrastinate a lot in my private life and I’m hoping this will help me fix that. Even if I don’t follow this all the time, it will be something like my past self reminding me to do things. If it only works 10% of the time, it’s still a gain.

Check-up

My wife and I decided to go to the hospital for a general check-up. Fortunately everything went well except for some vitamin D issues. The doctor told us that this is very common and most people don’t have enough vitamin D, especially in winter. So he prescribed some vitamin D supplements for us. I think being unmotivated, feeling tired all the time and things like these are mostly linked to some missing chemicals in your body. In this case, taking supplements really helped: I feel more motivated and I feel I have more energy to do stuff.

Conclusion

I am curious to see how the calendar + vitamin D supplements will work out in the long run. For now it’s been a couple of weeks and it feels good.

Things to do with ChatGPT

I tried ChatGPT some time after it was released and it really surprised me in a very positive way. I didn’t expect to be this amazed by it since I was already following OpenAI and the previous GPT models. I never truly interacted with the previous models, I just read about what they were capable of. I’m guessing the actual interaction itself and the quality of the dialog were what amazed me and millions of others around the world.

Some of the things I did with it so far are:

Asking things I usually ask Google

To me this immediately felt like a better experience than Google. You ask things and you get direct answers, like asking another human being who knows a lot of things. Of course you should not trust everything it says, but in my experience it was pretty accurate about many things.

Find bugs in pieces of code

There was an instance at work where I had to find bugs in code written by a customer. They were trying (and failing) to use our API and sent us some PHP code to debug. I looked at the code for some time, searched Google for some info on GuzzleHttp usage, found two problems in the code and wrote a message back in our Slack. I think this whole process took me thirty minutes or so. Right after this, I thought about ChatGPT and asked it to find bugs in that piece of code. It found the exact two things I had found and described the whole thing much better than I did, in 30 seconds or so. At this point I was very amazed. An LLM finding bugs in a piece of code? WOW.

Do my niece’s high school homework

My niece had homework where she had to write about a trip to Italy as a diary in Italian. I know this is bad but, just to try, we asked ChatGPT to write a diary for a trip to Italy. Amazingly, it wrote about a five-day trip to Italy, day by day, with very nice details including restaurants, archeological sites, museums and so on. It looked really authentic, and this was all done in Turkish. Then, of course, we asked it to translate the whole thing to Italian. The result, as you might have guessed, was amazing again. We had her homework done in a couple of minutes.

Again, I know the high school homework thing is somewhat bad, or maybe it’s not bad; we don’t really know. We had a discussion about this too and I believe time will tell.

Answer formal LinkedIn messages/emails

Sometimes I have a hard time answering formal LinkedIn messages or similar emails, so I started asking ChatGPT to answer them for me. The messages it writes feel perfect to me and it saves me some time too. For example, a recruiter sends a message asking me if I’m available for some position. I just copy that message and tell ChatGPT to write an answer stating that I won’t be available until September. Somehow, when I try to answer these messages myself I always have some doubts about the tone or about the words I use, and English being a second language for me does not help. ChatGPT was really useful for this.

Write better versions of a text

My wife works as an executive assistant and sometimes she has to write messages in English. This process usually has a couple of steps. First she writes the text, then she sends it to me to get some feedback; I check it, make some adjustments and send it back to her. Now I just tell her to ask ChatGPT. ChatGPT does a very good job in this scenario too. It does it so well that my wife told me the result was too good for her to use in some cases.

Find stuff you have trouble remembering

In my case I was trying to find a very old Commodore 64 game which I used to play with my sisters when I was 6-7 years old. It was a platform spaceship game where you battle space enemies. Before ChatGPT I used Google to look for the game. As everyone knows, when using Google you have to search for something and read through many different websites, watch videos, search images and so on… With ChatGPT you can just ask it to list Commodore 64 games where there’s a spaceship, you move from left to right and you kill enemies. Surprisingly, again, it did a very good job of listing such games. Even if you cannot find what you are looking for in its first answer, you can ask it to list more and it will. After a couple of messages I think I found what I was looking for, but I am still not one hundred percent sure since my memory from that time is not very strong.

There are probably many other things to do and this is just the beginning. I’m excited to see what will happen in the following years.

Is Targeting 100% Code Coverage Bad?

TL;DR: No, if supported by other tools. Yes, if that is your only target.

Image generated by Dall-E: “A software engineer thinking if “Targeting 100% Code Coverage is Bad” while working from home remotely in a lord of the rings house”

Recently I was in a discussion at my current company where a respected colleague told me that aiming for 100% code coverage is considered bad practice.

For context, my team is developing one of the services in an event-driven system. Feeling lucky that I got the chance to develop something from scratch, I set the minimum required code coverage percentage to 100 in our CI pipeline. We have never gone below 100 since.

My colleague told me that I could find many blog posts around the internet explaining why it is considered bad practice.

Previously, I worked on many projects where there was very little or no code coverage at all. My personal record (excluding personal projects) was 42% and even that project had zero coverage when I joined the company. I had to push and convince the management and the engineers to write tests and increase the coverage.

I know from experience how bad and how hard it is to work in an environment where there are no tests at all. Any change has the potential to break totally unrelated parts of the project and you might never find out until a customer stumbles upon that part of your product. This was so frustrating that when I got the chance to work on a new project from scratch, I set the minimum code coverage requirement to 100.

Going back to our conversation with the colleague; right after our talk I quickly searched for some blog posts explaining why aiming for 100 is considered bad. Now I will go through the process here once again and I’ll do my best to explain how this might actually be a good idea.

So I searched for “100 code coverage bad”. The search was in robot language but it did the job:

I’ve read all the links on the first page and all the answers on the Stack Overflow page.

First of all, no post is directly saying 100% code coverage is bad. Most of them say things like:

“It’s bad if done poorly”,

“It’s bad if you start with zero coverage in a big project”,

“Just aiming for 100% coverage is not enough by itself”.

The last post from the screenshot is even defending 100% code coverage and I am sure I would be able to find many other examples for both sides and possibly many other valid views.

I feel like I should point out that I am well aware that blindly aiming for 100% code coverage is bad and that it doesn’t guarantee anything. That’s why there are other tools you should integrate into your pipeline and/or development flow to help you understand, and later increase, the quality of your project.

Mutation Tests

Image generated by Dall-E from the sentence: “A scared blue cute mutant in an orange forest full of trees”.

Mutation testing is a way to help you determine whether your code coverage is good or bad. It measures how well your tests react to changes in your code, so it is useful even if you have 100% coverage. For example, let’s look at the following code and pretend this is our project:

# main.py
def sum(x, y):
    return x + y

# tests.py
from main import sum


def test_sum():
    assert 0 == sum(0, 0)

In this very simple example we have 100% code coverage. It is very obvious that the test is bad, but still, I cannot say that this won’t ever happen. I can almost guarantee that somewhere in the world some engineer will write a test equivalent (in uselessness) to this one. Maybe s/he was tired that day, no blame.

To protect ourselves against these kinds of situations we set up mutation tests. The mutation test run will analyze your code, change things around one by one, and run all your tests for each change (this takes a very long time, so it’s better to run mutations concurrently in many parallel pipelines). For this example the mutation test run will change the code to “x - y” and see if any of your tests fail. Our test_sum() function won’t fail because zero minus zero is still zero. Now we have what we call a mutant that managed to stay alive: our test wasn’t able to kill it. One out of one mutants is still alive, and this means our test just sucks. The mutation test run makes us aware that we should either fix our current test or add more tests to cover this specific mutant.

# tests.py
from main import sum


def test_zero_sum():
    assert 0 == sum(0, 0)


def test_sum():
    assert 5 == sum(2, 3)

In the above example we fixed the issue by adding another test. If we ran the same mutation test again, our “test_sum()” would fail and kill the mutant, giving us 100% mutation test coverage. Having 100% test coverage plus 100% mutation test coverage tells us that our code is better protected against simple changes. I am well aware that even with 100% test coverage and 100% mutation test coverage your complex logic might fail somewhere. This doesn’t mean all this coverage is useless; it just means you need another test to cover your high-complexity bug. Aiming for 100% code coverage together with a high percentage of mutation test coverage (above 80% maybe?) will help you sleep better and be more confident when merging new code.
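The surviving and killed mutants described above can be imitated by hand. This is only a toy sketch and not how mutmut works internally: the mutant is written manually, and the names mutant_sum, weak_test, strong_test and mutant_survives are made up for illustration.

```python
def original_sum(x, y):
    return x + y


def mutant_sum(x, y):
    # The mutation: "+" replaced with "-".
    return x - y


def weak_test(fn):
    assert 0 == fn(0, 0)


def strong_test(fn):
    assert 5 == fn(2, 3)


def mutant_survives(mutant, tests):
    """Return True if every test still passes when run against the mutant."""
    for test in tests:
        try:
            test(mutant)
        except AssertionError:
            return False  # a failing test kills the mutant
    return True


print(mutant_survives(mutant_sum, [weak_test]))
print(mutant_survives(mutant_sum, [weak_test, strong_test]))
```

The first call reports that the mutant survives the weak test alone; the second reports that it is killed once the stronger test is added.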

Code Quality

Image generated by Dall-E from the sentence: “A software engineer working on a project in a futuristic office in a dream”.

In my early days as a computer science student the only conventions that I was aware of were camel case vs snake case. If you are using Java, the convention is usually camelCase and if you are using Python you use snake_case. You should be consistent in your project by sticking to one of them, so your project doesn’t burn other developers’ brain power unnecessarily by making them switch from one convention to the other.

Turns out there is much more to this. There are programming styles with tools that check your code style on every commit. Tools that check your import statements (depending on your programming language) and automatically order them alphabetically. Tools to check the overall code quality of your project. There are measurements of your code complexity, like cyclomatic complexity and/or cognitive complexity. Tools that parse your code and warn you about duplicate code, long functions, long files and so on…
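To give a feel for what a complexity measurement does, here is a rough sketch that approximates cyclomatic complexity by counting branching points with Python’s standard ast module. It is a simplification for illustration and real tools count more constructs; the function name approximate_complexity and the sample code are made up.

```python
import ast

# Node types that add one extra path through the code.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in `source`."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2 == 0:
        return "negative even"
    for _ in range(3):
        if n == 0:
            return "zero"
    return "other"
'''

print(approximate_complexity(SAMPLE))
```

A quality tool would typically warn when a number like this goes above a configured limit for a single function.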

These tools help you keep things clean and when things are clean every aspect of your development process will be better. You will get more readable code hence less time to understand the code. You will get smaller low complexity functions which are much easier to test.

Engineers will be happier, more motivated and highly confident because everything is easier to do in these kinds of projects.

An Example

Let’s say we are working with the following tech stack:

  • A Python project.
  • Automated checks on every commit with git pre-commit hooks.
  • Code quality checks and minimum code coverage check on GitLab CI/CD.
  • Mutation tests with mutmut.

Python Project

For our project we can use the following tools:

  • Flake8 for style guide enforcement.
  • Black for automated code formatting.
  • isort to sort your imports automatically.
  • mypy for static type checking.

Set up pre-commit Hooks

The pre-commit project will save you huge amounts of time by automatically checking things you specify at every commit. Here is an example .pre-commit-config.yaml:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: debug-statements
      - id: check-ast
      - id: check-added-large-files
        exclude: ^static/
  - repo: local
    hooks:
      - id: black
        name: Black
        description: "Black: The uncompromising Python code formatter."
        entry: poetry run black
        exclude: ^.*\b(migrations|schema_registry)\b.*$
        types: [python]
        language: system

      - id: flake8
        name: Flake8
        description: "`flake8` is a command-line utility for enforcing style consistency across Python projects."
        entry: poetry run flake8
        types: [python]
        language: system

      - id: isort
        name: isort (python)
        description: "isort your imports, so you don't have to."
        entry: poetry run isort
        exclude: ^.*\b(migrations|schema_registry)\b.*$
        types: [python]
        language: system

CI Tools

I am not going into much detail here. The point is that you should set up a code quality analyzer in your CI pipeline. For example, GitLab offers built-in support for Code Climate. You set up some configuration files in your pipeline config and you are ready to go.

The code quality tool will automatically generate reports like these:

Image from gitlab.

You just have to set your team’s culture so that the team prioritizes code quality issues. This will help you keep your code clean and maintainable, thus greatly lowering your development costs.

Mutation Tests

For our Python example there is a library called mutmut. For me it did the job out of the box. I just had to add some ignores and other small optimizations for some mutations that didn’t really make sense in our case.

mutmut’s output.

Since every single mutation has to run your entire test suite, mutation tests take a very long time to complete so it’s better if you schedule a pipeline to run concurrently on your code repository of choice. Running pipelines costs some money so to keep things cheaper we run mutation tests once a month but running once a week would be fine if you have the bandwidth to handle them.
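A back-of-the-envelope calculation shows why scheduling and parallel pipelines matter. The sketch below is a pessimistic estimate that assumes every mutant runs the full test suite once; the function name and the numbers (2000 mutants, a 90-second suite) are hypothetical, not measurements from our project.

```python
import math


def estimate_mutation_run_hours(mutant_count, suite_seconds, parallel_jobs=1):
    """Rough wall-clock estimate: each mutant runs the whole suite once."""
    if parallel_jobs < 1:
        raise ValueError("parallel_jobs must be at least 1")
    # Mutants are spread over the jobs; the busiest job sets the wall-clock time.
    mutants_per_job = math.ceil(mutant_count / parallel_jobs)
    return mutants_per_job * suite_seconds / 3600


print(estimate_mutation_run_hours(2000, 90))
print(estimate_mutation_run_hours(2000, 90, parallel_jobs=10))
```

With these made-up numbers a single job would need about 50 hours, while ten parallel jobs bring that down to about 5.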

Conclusion

Like most subjects in our field, there is no definitive answer. It’s another “it depends on other things” case. 100% code coverage is good if you support it with other things such as the ones I described, and possibly much more. It is also better if you start your project from scratch and aim for the hundred from the very beginning. If you join a project after years of development and there is no coverage at all, then aiming for 100 doesn’t really make sense. In such cases it is probably best to cover as much as possible and aim for the most critical parts of the system.

I see and hear a lot of “80-85% is enough” statements but I don’t think that’s the case. You see, now you have to decide what stays in the untested 20%. Why wouldn’t you test that? What if there’s a bug in that code? Do you write a test or fix the bug and leave it like that? Since there’s no obligation to satisfy a percentage any engineer can leave or forget to write tests for a critical part of a feature. Code reviews might catch some of these but ultimately the code review process is just another human being trying to find bad stuff in your code.

My conclusion (these days) is: Aiming for 100% code coverage + 100% mutation test coverage + high quality code is the way to go.

Linked lists, pointer tricks and good taste

Here is a good example of clean coding and refactoring. Even for such a small piece of code, refactoring is always good (if done properly of course).

Refactoring and clean coding are not separate tasks that you should do when you have extra time or when your boss gives you permission to do it. They are a part of the software development (or coding) process. When you write code, you should always pay attention to clean coding and as you write the code, if something doesn’t feel OK, you should immediately refactor and get rid of the problem.

I’ve worked on methods with 400+ lines of code, crazy if statements that no one understands, code with indentation so deep that it doesn’t make sense anymore, and many unsolved mystery bugs. I witnessed some cases where the code did not do what it was supposed to do, with extras like no logging, no errors on Sentry, no nothing. This one time the team lead called me to fix the situation. The code was the kind I just described, and what was expected of me? Solve the problem just by looking at 400 lines of code. We didn’t have the slightest clue what was happening. Of course I knew that was not possible at that very moment. I told him we needed logging to figure it out, but the response was something like: “You are a senior developer, you should be able to solve it.”

I have no idea why the culture around coding is so limited and narrow for most coders. I think it is probably because the quality of software is so abstract and hard to measure. Anyone can code something that works, but most coders, educated or not, cannot write high quality code. It is something like learning to read and write your mother tongue. We all learn to read and write in elementary school, but most people, even in adulthood, have a hard time writing a good sentence. For example, let’s assume that I want to tell you that tonight I want to eat a hamburger. I can do this in many, many ways, but the most straightforward way is like this:

“Tonight I want to eat a hamburger.”

This is really easy to understand, very simple. Now imagine I am really really bad at expressing myself, like a really bad programmer. I will say something like this:

“After sun goes away today I want put in my stomach meat between bread.”

Here I am trying to express the same thing but it’s longer and takes more time to understand. I can go on like this as much as I want to. Here’s one more example:

“After 6 hours after hour 16:00 minus 1 hour I want to make my stomach full with cooked meat between two pieces of bread.”

So now imagine a program with 50000 lines of these. That’s what bad code looks like and that code needs refactoring.
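The same contrast shows up in code. As a small made-up illustration (the function names and the discount rule are hypothetical, not taken from any real project), both functions below return the same result, but the second one says it the way “Tonight I want to eat a hamburger.” does:

```python
def discount_nested(is_member, total):
    # Deep nesting: the reader has to track every branch to find the result.
    if is_member:
        if total > 0:
            if total >= 100:
                return total * 0.9
            else:
                return total
        else:
            return 0
    else:
        if total > 0:
            return total
        else:
            return 0


def discount_flat(is_member, total):
    # Guard clauses: handle the simple cases first, then state the rule once.
    if total <= 0:
        return 0
    if is_member and total >= 100:
        return total * 0.9
    return total
```

Refactoring from the first form to the second changes nothing about the behavior, only how long it takes the next person to understand it.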

Firestore proxy with Nginx (Use firestore.yourdomain.com instead of firestore.googleapis.com)

Important note: Thank you Jerome Meinke for letting me know that this solution is for clients using HTTP 1.1. If your client uses HTTP 2 this won’t work.

We use the Firestore cloud database to show data to the users on our virtual web platform and mobile applications. After many events we noticed a problem: some users are behind a corporate firewall or connected to their company’s VPN, and the API endpoint https://firestore.googleapis.com is somehow blocked, so they cannot use our system at all. Why would a company block *.googleapis.com? I have no idea, except maybe for China.

To overcome this, I’ve set up an Nginx proxy on an Amazon EC2 instance so that the web and mobile clients can send the firestore requests through our domain (firestore.yourdomain.com) instead of firestore.googleapis.com. Here’s the Nginx setup:

server {
    listen 80;

    root /var/www/html;
    index index.html;

    location /index.html {
        try_files $uri $uri/ =404;
    }

    location / {
        resolver 172.0.0.53 ipv6=off;
        proxy_pass https://firestore.googleapis.com;
        proxy_http_version 1.1;
        proxy_connect_timeout 120s;
        proxy_read_timeout 300s;
        proxy_send_timeout 100s;
        proxy_set_header Cache-Control no-cache;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_cache_bypass $http_upgrade;
        proxy_buffering off;
    }
}

This setup is behind an Amazon load balancer. That’s why it listens on port 80. The actual traffic from the client to the load balancer is secured and then mapped to port 80 of the EC2 instance. You cannot just proxy the TLS encrypted traffic from your domain to https://firestore.googleapis.com; you will get an SSL error because the certificate is registered to *.googleapis.com.

Supposing your system is on Amazon:

  • You’ve set up a load balancer with your wildcard certificate (*.yourdomain.com).
  • You mapped firestore.yourdomain.com to your load balancer.
  • You’ve created an instance with the nginx config above.
  • Now you can initialize your JS Firestore library like this:

import * as app from "firebase/app";
import "firebase/firestore";
import "firebase/database";
import "firebase/storage";
import "firebase/analytics";

const config = {
  apiKey: "yourkey",
  authDomain: "yourdomain.firebaseapp.com",
  databaseURL: "https://yourdatabase.firebaseio.com",
  projectId: "yourprojectId",
  messagingSenderId: "messagingSenderId",
  appId: "yourAppId",
  measurementId: "yourMeasurementId",
};

app.initializeApp(config);
app.analytics();

export const firebase = app;
export const db = app.firestore();
db.settings({
  "host": "firestore.yourdomain.com",
  "ssl": true
});

Now you just refresh your browser app and check your XHR requests. They will go through your server. I hope this helps someone, as I’ve spent a lot of time finding the right setup.

Happy proxying.

Why Linux is more secure than Windows

I stumbled upon a question on Quora. The question is: How complicated is the code for Microsoft Windows? The person who answered it shared a link to another question: Why windows is less secure than linux? and shared two images. These images are among the best examples of high complexity leading to less security. Here is the call trace for Apache on Linux:

This might seem already complex, but wait until you see the call trace for Microsoft IIS on Windows:

The images describe it all. More complexity usually leads to more bugs and more bugs lead to more security problems. Let’s all hope Windows developers refactor Windows so it becomes more like Linux: simple.

Design Patterns

I always read about design patterns. I read and read again and again and again… But the feeling that I have to learn more or understand more never goes away. Every time I dive into a pattern I feel like I’m learning something new about it. So this post is about me trying to grasp design patterns better (again). I started writing this post while spending time around this book: Design Patterns: Elements of Reusable Object-Oriented Software. As the book goes on, I will copy and paste some parts of the book here, while adding my opinions or questions; if I have any. I believe this will strengthen my perception of design patterns and at the same time, it may help some others.

Now, I will try to describe what a design pattern is, in my own (English) words, as I understand it now, at this very moment. Then I will go to Wikipedia and/or Google and copy and paste the description here and see the difference. I’m hoping there won’t be a huge difference :). Here is my definition of a design pattern:

A design pattern is a method to meet a recurring requirement.

So this is what I came up with. Actually, it took me a while. First I had to write it in Turkish on paper. Then I had to fix the Turkish version, then I had to translate it to English. That was the result. So now I’m googling it and here it is:

In software engineering, a design pattern is a general repeatable solution to a commonly occurring problem in software design.

The context of this article is already software engineering and software design so this definition can be shortened to this:

A design pattern is a general repeatable solution to a commonly occurring problem.

I think this sounds better than my definition. I especially like the “commonly occurring problem” part. It is much better than “recurring requirement”. I think I will change my definition after this to: “a method to meet a commonly occurring requirement.” Now the main difference is that my “method” is their “general repeatable solution” and my “commonly occurring requirement” is their “commonly occurring problem”. A method vs a solution and a requirement vs a problem. This can be discussed further, but both definitions are pretty close to each other in my opinion.

A much simpler definition can be found in the book:

[A design pattern] is a solution to a problem in a context.

Now that we know what a design pattern is, I will list the design patterns I encountered in the book I mentioned above. This part will be mostly copy/pasting the names and definitions. Here is an overview of 23 design patterns:

  • Abstract Factory: Provide an interface for creating families of related or dependent objects without specifying their concrete classes. For a long time I didn’t know the difference between factory pattern and abstract factory pattern.
  • Adapter: Convert the interface of a class into another interface clients expect.
  • Bridge: Decouple an abstraction from its implementation so that the two can vary independently.
  • Builder: Separate the construction of a complex object from its representation so that the same construction process can create different representations. For example, libraries like Doctrine use the builder pattern to build SQL queries.
  • Chain of Responsibility: Avoid coupling the sender of a request to its receiver by giving more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handles it. Thoughts: for example, this is how Django middleware works. It passes the request to the middleware objects one by one; either they all handle it or one of them stops the handling process.
  • Command: Encapsulate a request as an object, thereby letting you parameterize clients with different requests, queue or log requests, and support undoable operations. What are undoable operations? I don’t really understand this definition.
  • Composite: Compose objects into tree structures to represent part-whole hierarchies. Composite lets clients treat individual objects and compositions of objects uniformly.
  • Decorator: Attach additional responsibilities to an object dynamically. Decorators provide a flexible alternative to subclassing for extending functionality. Decorator pattern is among the most popular ones, at least for me.
  • Facade: Provide a unified interface to a set of interfaces in a subsystem. Facade defines a higher-level interface that makes the subsystem easier to use.
  • Factory Method: Define an interface for creating an object, but let subclasses decide which class to instantiate. Factory Method lets a class defer instantiation to subclasses.
  • Flyweight: Use sharing to support large numbers of fine-grained objects efficiently. I had a personal project once where I wanted to code a grid with Java Swing. I tried to create an object for each square in the grid and ended up with thousands of objects. Rendering was very, very slow and it killed the process. I’m not sure, but this pattern may be the solution to that.
  • Interpreter: Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language. I’m curious about this one.
  • Iterator: Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.
  • Mediator: Define an object that encapsulates how a set of objects interact. Mediator promotes loose coupling by keeping objects from referring to each other explicitly, and it lets you vary their interaction independently.
  • Memento: Without violating encapsulation, capture and externalize an object’s internal state so that the object can be restored to this state later.
  • Observer: Define a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically. In the Django framework, signals are an example of the observer pattern.
  • Prototype: Specify the kinds of objects to create using a prototypical instance, and create new objects by copying this prototype.
  • Proxy: Provide a surrogate or placeholder for another object to control access to it.
  • Singleton: Ensure a class only has one instance, and provide a global point of access to it.
  • State: Allow an object to alter its behavior when its internal state changes. The object will appear to change its class.
  • Strategy: Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it. I used this one recently where we had 6 different login scenarios. I divided the strategies into 6 classes, each with an execute(…) method. I had to create the appropriate strategy object from the HTTP POST data and call its execute method to log the user in.
  • Template Method: Define the skeleton of an algorithm in an operation, deferring some steps to subclasses. Template Method lets subclasses redefine certain steps of an algorithm without changing the algorithm’s structure.
  • Visitor: Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.
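To make the Strategy entry in the list a bit more concrete, here is a minimal Python sketch of the login case I described there. It is not the real code: the class names, the “login_type” field and the returned strings are all made up for illustration, and it only shows two scenarios instead of six.

```python
from abc import ABC, abstractmethod


class LoginStrategy(ABC):
    """Common interface: every login scenario exposes execute()."""

    @abstractmethod
    def execute(self, post_data: dict) -> str:
        ...


class PasswordLogin(LoginStrategy):
    # Hypothetical scenario: a real version would check the credentials.
    def execute(self, post_data: dict) -> str:
        return "password login for " + post_data["username"]


class TokenLogin(LoginStrategy):
    # Hypothetical scenario: a real version would validate the token.
    def execute(self, post_data: dict) -> str:
        return "token login with " + post_data["token"]


# Assumed mapping from a POST field value to the strategy class.
STRATEGIES = {
    "password": PasswordLogin,
    "token": TokenLogin,
}


def pick_strategy(post_data: dict) -> LoginStrategy:
    """Create the strategy object that matches the posted login type."""
    try:
        return STRATEGIES[post_data["login_type"]]()
    except KeyError:
        raise ValueError("unknown or missing login_type")


def login(post_data: dict) -> str:
    # The caller never needs to know which concrete strategy runs.
    return pick_strategy(post_data).execute(post_data)
```

The nice part is that adding a seventh scenario would mean one new class and one new entry in the mapping, and the login function itself stays untouched.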

After listing them, there is one last thing to do with this list of design patterns: classifying them. There are three main categories: Behavioral, Structural and Creational.

Behavioral patterns describe how the objects communicate with each other and tell us the responsibilities of the objects. These patterns are: Interpreter, Template Method, Chain of Responsibility, Command, Iterator, Mediator, Memento, Observer, State, Strategy, Visitor.

Structural patterns deal with the composition of classes or objects. These patterns are: Adapter, Bridge, Composite, Decorator, Facade, Flyweight, Proxy.

Creational patterns deal with the process of object creation. These patterns are: Factory Method, Abstract Factory, Builder, Prototype, Singleton.
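Going back to the Command entry in the list and my question about “undoable operations”: as far as I understand it, the idea is that if a request is an object, the same object can also carry the knowledge of how to reverse itself. Here is a small Python sketch of that reading. The AppendText and History classes are names I made up; they are not from the book.

```python
class AppendText:
    """A command object: it knows how to do one edit and how to reverse it."""

    def __init__(self, document: list, text: str):
        self.document = document
        self.text = text

    def execute(self) -> None:
        self.document.append(self.text)

    def undo(self) -> None:
        # Safe only because History undoes commands in reverse order.
        self.document.pop()


class History:
    """Runs commands and remembers them so the latest one can be undone."""

    def __init__(self):
        self._done = []

    def run(self, command) -> None:
        command.execute()
        self._done.append(command)

    def undo_last(self) -> None:
        if self._done:
            self._done.pop().undo()


document = []
history = History()
history.run(AppendText(document, "Hello"))
history.run(AppendText(document, "World"))
history.undo_last()  # reverses only the most recent command
```

After the last line, only the first piece of text is left in the document. The same list of executed commands could also be used for the “queue or log requests” part of the definition.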

Even now I feel much better about design patterns. My next quest will be one post for each design pattern. For that, I will probably read the book and support it with some online materials and create a post with the combination of both.

How docutils broke our production

On 21.07.2019, a beautiful Sunday, we got some alerts from our production servers. Around 95% of the requests were failing with 500 “Internal Server Error” responses, and no one had modified or touched anything.

We started investigating the cause. Soon we realized that our system had come under high load, which triggered autoscaling to start new instances. We don’t have a pre-made image, so when our instances start, they install all the Python libraries from scratch. One of the libraries that we use depended on another library called docutils, and it didn’t freeze the version. So instead of requiring “docutils==0.14”, it just required “docutils”.

Because of this, our new instances got the new version (0.15), which had just been released that day. This version had some Python 3.x code left in it, which raised a syntax error on our Python 2.7 backend. It was sad to see this happening. I found a bug report about the issue and also left a comment.

It was a Sunday…

We had to require docutils ourselves, freezing it to the previous working version, and everything went back to normal. I think this mistake caused a lot of other projects to fail, and it reminded me of the left-pad incident in the JavaScript world, even though it is not exactly the same thing.
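A check like the one below could have warned us earlier. It is only a rough Python sketch, not something we actually ran: the function name is made up, and it only looks at plain requirement lines, so options such as “-r other.txt” or URLs would be reported as unpinned too. It also only sees our own requirements list, not what other libraries require.

```python
def unpinned(requirement_lines):
    """Return the requirement lines that do not pin an exact version with '=='."""
    result = []
    for line in requirement_lines:
        spec = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if spec and "==" not in spec:
            result.append(spec)
    return result


requirements = [
    "docutils",        # floating: installs whatever is newest at install time
    "docutils==0.14",  # frozen: every new instance installs the same version
    "# a comment line",
]
print(unpinned(requirements))
```

Here only the bare “docutils” line is reported, which is exactly the kind of line that let 0.15 slip onto our new instances.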