Python Behind the Scenes (Part 2)
Python is interpreted, everything in Python is an object, and Python is dynamically typed
In Part 1, we touched on (well, at least scratched the surface of) the first statement: Python is interpreted. In this post, I will attempt to cover the other two concepts, which are directly tied to how Python manages memory.
Everything in Python is an object
Almost everything in Python is treated as an object. Apart from syntax keywords like def, class, if and for, every value (numbers, strings, lists, dictionaries, tuples), every function, every class, and even the special constants None, True and False, whether built-in or user-defined, is an object.
An object in Python has –
· Memory – Objects must be placed somewhere in memory. How memory is managed in Python is discussed below.
· Type – Every object is associated with a type. The type is set when the object is created at run time, and the type information is stored in the object’s memory as metadata. Types in Python are themselves classes: int, float, str, list, dict and tuple are all classes.
· Reference count – This is a count of how many times the object is currently being referred to. When this count reaches 0, the object is removed from memory.
Note: Python provides the built-in functions id() and type(), plus sys.getrefcount() from the sys module, which return an object’s identity (its memory address in CPython), its type and its reference count respectively.
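As a small illustration of the three functions mentioned in the note, here is a minimal sketch. The names value and alias are made up for this example, and the exact numbers printed will differ from run to run and between Python implementations.

```python
import sys

value = [1, 2, 3]                  # any object will do; a list is used here

print(id(value))                   # the object's identity (its address in CPython)
print(type(value))                 # <class 'list'>

# getrefcount() usually reports one more than you might expect, because
# passing the object to the function creates a temporary extra reference.
count_before = sys.getrefcount(value)

alias = value                      # a second name bound to the same object
count_after = sys.getrefcount(value)

print(count_before, count_after)   # on CPython the second count is one higher
```

Binding a second name does not copy the list; it only adds one more reference to the same object, which is why the count goes up.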
I have used a lot of verbs here: ‘stored’, ‘removed’, ‘created’, ‘allocated’. But what is doing all this? The interpreter, that is, the Python virtual machine (see Part 1).
This was the memory view of objects.
Now here’s the fun part 😊
Since objects are instances of a class, and since a class is a blueprint describing attributes and operations, it follows that everything that gets treated as an object in Python not only has memory allocated for it but also has attributes and methods.
AND, since even functions, classes and types are objects themselves, it follows that not only do they have memory allocations, but they also each have attributes and methods of their own defined by the class they are instances of.
OK, hold on: classes are objects, which means they are instances of another class, right? Right. Python has a built-in metaclass called type, and every class is, by default, an instance of it. This notion is a bit tricky to grasp, since many of us are accustomed to viewing data types and class names as compile-time concepts. But Python is designed to be a dynamic language, so even functions and classes are treated as runtime objects. Feel free to explore this further by calling id() and type() on functions and classes in your Python code.
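To make this concrete, here is a minimal sketch that inspects a function and a class at run time. The names greet and Point are hypothetical and exist only for this illustration.

```python
def greet(name):
    """Return a greeting for name."""
    return f"Hello, {name}"


class Point:
    """An empty class, used only to inspect its type."""


print(type(greet))              # <class 'function'>
print(type(Point))              # <class 'type'>
print(type(int))                # <class 'type'>
print(type(type))               # <class 'type'>

# Functions are objects, so they carry attributes of their own.
print(greet.__name__)           # greet
print(greet.__doc__)            # Return a greeting for name.

# Classes are objects too: Point is an instance of type.
print(isinstance(Point, type))  # True
```

Note that type is an instance of itself, which is where the chain of "what is this an instance of?" ends.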
Python Memory Management
Let’s now delve into how Python stores its objects.
Let us visualize this –
In statically typed languages like C, two integer variables declared as follows –
int var1 = 10;
int var2 = 10;
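would result in two different integer locations, each a fixed-size memory container.
var1 = 20;
would overwrite the value stored in var1’s container with 20, leaving var2’s container untouched.
A new float variable, declared as float var3 = 3.14;, would get a separate container of its own.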
Throughout the existence of the program the types of these variables will never change. This is static typing.
In contrast, in Python –
var1 = 10
var2 = 10
would result in both var1 and var2 referring to the same object in memory. (CPython caches small integers such as 10, so both names share a single object.)
What happens when we do var1 = 20?
Now, var1 refers to a new object ‘20’ in memory. ‘10’ is still referred to by var2.
In Python, it is possible to simply re-assign a variable to a different typed object.
Until now var2 was referring to the object ‘10’ in memory. When we do var2 = 3.14, a new object of type float and value 3.14 gets created in memory and now var2 refers to this object.
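Once ‘10’ is no longer referred to by any variable name, it becomes eligible for deletion from memory (technically, its memory is freed for future allocations). One caveat: CPython keeps small integers like 10 cached for the life of the program, so this clean-up is easier to observe with larger or user-defined objects.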
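The sequence of assignments above can be replayed as a short sketch. The variables shared_at_start and shared_after_rebind are added here only to record the results. The first result depends on CPython’s small-integer cache, so treat it as implementation-specific behaviour rather than a language guarantee.

```python
var1 = 10
var2 = 10

# In CPython both names refer to the same cached integer object.
shared_at_start = var1 is var2
print(shared_at_start)

var1 = 20                        # var1 now refers to a different object
shared_after_rebind = var1 is var2
print(var2)                      # 10, because var2 still refers to the original object
print(shared_after_rebind)       # False

var2 = 3.14                      # rebinding to an object of another type is fine
print(type(var2))                # <class 'float'>
```

No object was modified at any point here; only the names were pointed at different objects.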
Key Points –
· Variables in Python are references to a location in memory. That location houses an object with value, type and reference count information.
· Hence, it becomes very easy to re-assign a variable with a different typed object since all we are doing is updating its reference. This is why Python is dynamically typed.
· The same object can be referenced by more than one variable name. This is part of Python’s memory optimization. When a variable is re-assigned, e.g. when we do var1 = var1 + 1 in the previous example, Python produces a new numeric object ‘21’ in memory and var1 now refers to it. If the object ‘20’ is not being used anywhere else in the program, it becomes eligible for deletion.
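The last key point, an object going away once nothing refers to it, can be observed with a weak reference, which watches an object without keeping it alive. This is a sketch under two assumptions: it uses a hypothetical Payload class because plain integers cannot be weakly referenced, and the immediate clean-up shown is CPython behaviour.

```python
import weakref


class Payload:
    """Hypothetical stand-in for any object that is not cached by the interpreter."""


first = Payload()
second = first                   # two names, one object
probe = weakref.ref(first)       # observes the object without adding a reference

first = Payload()                # rebind first; second still refers to the old object
alive_with_one_name = probe() is not None

second = None                    # the last reference to the old object is gone
alive_with_no_names = probe() is not None

print(alive_with_one_name, alive_with_no_names)   # True False on CPython
```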
Where does Python allocate memory for its objects from?
The answer is the heap. All objects are stored on the heap. At any time, the interpreter maintains dictionaries (namespaces) that map variable names to the objects they refer to on the heap. If we re-assign a name to a new value, its associated reference is simply updated. Therefore, there is no need for type declarations in Python. A reference update is all it takes to effectively change a variable’s type, so we can easily do something like –
var1 = 100
var1 = "Hello"
var1 = [1, 2, 3]
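Here is the same re-assignment as a runnable sketch that records the type seen after each step. The list seen_types is a helper added for this illustration. The last line peeks at the module namespace through globals(), assuming the code is run as a plain script.

```python
seen_types = []                  # helper list, only for this illustration

var1 = 100
seen_types.append(type(var1).__name__)

var1 = "Hello"
seen_types.append(type(var1).__name__)

var1 = [1, 2, 3]
seen_types.append(type(var1).__name__)

print(seen_types)                # ['int', 'str', 'list']

# Module-level names live in a dictionary that maps each name to an object.
print("var1" in globals())       # True when run as a script
```

The name var1 never had a type of its own; each time, type() reported the type of whichever object the name referred to at that moment.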
This is also why we can have mixed-type containers like lists and tuples in Python: every item is a separate object in memory, and only references to those objects are stored as the list or tuple elements. (This is a very simplistic explanation; technically there are multiple layers of referencing.)
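A quick way to see that containers hold references rather than copies is to compare identities. In this sketch the names item, mixed and pair are made up for the example.

```python
item = "shared"
mixed = [1, 2.5, item, [3, 4]]   # one list, four differently typed elements

print([type(element).__name__ for element in mixed])
# ['int', 'float', 'str', 'list']

# The list stores a reference to the very same string object, not a copy.
print(mixed[2] is item)          # True

pair = (item, mixed)             # tuples store references in the same way
print(pair[0] is item)           # True
print(pair[1] is mixed)          # True
```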
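Python tracks the reference count of every object during execution. In CPython, as soon as an object’s reference count drops to 0, its memory is freed for other allocations (this is necessary, or we would run out of heap memory!). In addition, a garbage collector runs periodically to find and free groups of objects that only refer to each other, which reference counting alone cannot reclaim.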
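Reference counts alone cannot free objects that refer to each other, because their counts never reach 0. In CPython that case is handled by the cyclic collector exposed through the gc module. The following is a minimal sketch of that behaviour, using a hypothetical Node class. Automatic collection is switched off for the duration so that the result does not depend on timing.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""

    def __init__(self):
        self.partner = None


gc.disable()                     # stop the collector from running on its own

a = Node()
b = Node()
a.partner = b
b.partner = a                    # a and b now refer to each other (a cycle)
probe = weakref.ref(a)           # observes a without keeping it alive

del a, b                         # the names are gone, but the cycle keeps the counts above 0
alive_before_collect = probe() is not None

gc.collect()                     # ask the cyclic collector to run now
alive_after_collect = probe() is not None

gc.enable()                      # restore the default behaviour

print(alive_before_collect, alive_after_collect)   # True False on CPython
```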
Knowing about Python’s memory management can be helpful in making optimal programming decisions.
I hope that, with this post, I was able to present a simplified view of Python’s memory management techniques. If you haven’t already, do check out Part 1 to get an inside view of interpreted execution.