Non-atomic assignments in Python

It’s no secret that many assignments are not atomic and that we can run into word tearing. These issues are mostly related to CPU word length, synchronization, concurrency, and so on.

However, things can be much worse when we’re dealing with interpreted languages or languages with no strict schema. In these languages, a “regular” assignment can also be non-atomic.

Let’s take Python. In Python, every object can be considered a dictionary of fields. This means that a single assignment may expand the aforementioned dictionary, which may cause issues for another thread that is reading the object at the same time. Let’s see an example:

We have a Sample class that has one field initially, namely big_property.

We have two different types of tasks. The first one, serializer, uses the jsonpickle library to serialize an object to a JSON string. The second one, result_setter, sets a field on the object. We then run sufficiently many tasks to observe the issue.
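A minimal sketch of such a setup could look like the code below. It is not the original listing: to stay free of third-party dependencies, the serializer here is a hand-written loop over the object's fields that stands in for the jsonpickle call, and the field name result, the helper run_once, and the number of runs are assumptions made for illustration. Whether a given run actually hits the race depends on thread scheduling, so the printed count varies from run to run and may well be zero.

```python
import threading


class Sample:
    def __init__(self):
        # The only field that exists right after construction.
        self.big_property = list(range(1000))


def serializer(obj, errors):
    # Stand-in for a jsonpickle call (assumption): walk the instance
    # dictionary field by field, as a generic serializer has to do.
    try:
        parts = []
        for name, value in vars(obj).items():
            parts.append(f"{name}={value!r}")
        return ", ".join(parts)
    except RuntimeError as exc:
        # Record the failure instead of letting the thread die silently.
        errors.append(exc)
        return None


def result_setter(obj):
    # A plain assignment, but "result" is a new key in the object's dictionary.
    obj.result = 42


def run_once(errors):
    # One fresh object, one task of each type, started together.
    sample = Sample()
    tasks = [
        threading.Thread(target=serializer, args=(sample, errors)),
        threading.Thread(target=result_setter, args=(sample,)),
    ]
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    return sample


if __name__ == "__main__":
    errors = []
    runs = 2000
    for _ in range(runs):
        run_once(errors)
    print(f"{len(errors)} of {runs} serializations failed")
```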

If we’re unlucky enough, we’ll hit the following race condition: the first task starts serializing the object, then the first task is paused and the second task kicks in. The second task sets a field on the object. Normally, we could think this assignment is “atomic” as we only set a reference into a field. However, since the Python object is a dictionary of fields, we need to add a new entry to the dictionary. Once the first task is resumed, it throws the following error:
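The same interleaving can be replayed deterministically in a single thread. The sketch below assumes CPython and assumes that the serializer walks the instance dictionary the way a plain iterator would. The manual iterator plays the role of the paused first task, and the field name result is hypothetical. It prints only the message CPython attaches to the exception, not a jsonpickle traceback.

```python
class Sample:
    def __init__(self):
        self.big_property = "x" * 100


sample = Sample()

# "Task 1" starts walking the fields and reads the first one.
fields = iter(vars(sample).items())
first_name, _ = next(fields)

# "Task 2" runs in the gap and assigns a brand-new attribute,
# which adds a key to the very dictionary being iterated.
sample.result = 42

# "Task 1" resumes and asks for the next field.
try:
    next(fields)
    message = None
except RuntimeError as exc:
    message = str(exc)

print(message)  # prints: dictionary changed size during iteration
```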

We can see that the dictionary of fields changed size because of the assignment. This would not be the case if we initialized the field in the constructor (i.e., if we only had to modify an existing field rather than add a new one to the object).
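To illustrate the difference, here is a small single-threaded sketch of the same interleaving with the field reserved in the constructor. The class name SampleWithResult and the field name result are assumptions for illustration. Overwriting an existing key leaves the dictionary size unchanged, so the iteration finishes normally.

```python
class SampleWithResult:
    def __init__(self):
        self.big_property = "x" * 100
        # Reserve the key up front so later assignments only overwrite it.
        self.result = None


sample = SampleWithResult()
size_before = len(vars(sample))

# The "serializer" reads the first field and is then paused.
fields = iter(vars(sample).items())
next(fields)

# The "setter" overwrites an existing key; no new entry is added.
sample.result = 42

# The "serializer" resumes and finishes without an error.
remaining = list(fields)
size_after = len(vars(sample))

print(size_before, size_after, remaining)  # prints: 2 2 [('result', 42)]
```

This only removes the size change. The serializer may still observe either the old or the new value of the field, so code that needs a consistent snapshot would typically guard both tasks with a lock.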