Python, the versatile programming language, has a lot going on beneath the surface. As a developer, you may have wondered, "What happens under the hood when I write and run Python code?" In this post, we'll take a closer look at Python's inner workings to demystify some of the magic.
Python: An Interpreted Language
Python is usually described as an interpreted language: you don't run a separate build step to produce a native executable, as you would with C or C++. When you run a Python script, the interpreter first compiles your source code to bytecode and then executes that bytecode, working through the program's statements from top to bottom. Let's explore this with a simple example.
print("Hello, World!") # Hello, World!
When you run this code, the interpreter works through the program's statements from top to bottom. In this case, it encounters the print("Hello, World!") call and prints "Hello, World!" to the console. Each statement is executed in turn, in the order it appears.
Python's CPython Interpreter
Python has multiple implementations, with the most popular one being CPython. CPython is the reference implementation of Python and is written in C. It compiles your Python code into bytecode and then runs that bytecode on its own virtual machine, rather than translating it directly into native machine code.
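CPython first compiles source code to bytecode and then runs that bytecode, and the built-in compile() and exec() functions let you perform those two steps by hand. The following is only a minimal sketch: the source string and the "<example>" filename are made up for illustration.

```python
# Step 1: compile source text into a code object (this is where bytecode is produced).
source = "result = 2 + 3"
code_obj = compile(source, "<example>", "exec")

# The raw bytecode is stored on the code object as a bytes string.
bytecode = code_obj.co_code
print(type(code_obj).__name__)  # code
print(type(bytecode).__name__)  # bytes

# Step 2: execute the compiled code object in an explicit namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # 5
```

Running a script normally does the same thing implicitly, so you rarely need to call compile() yourself.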
Python: An Object-Centric Language
In Python, everything is an object. Unlike languages with primitive data types, Python treats even simple values as objects. When you assign a value to a variable, you're actually binding a name to an object. Python is dynamically and strongly typed: types belong to objects rather than to variables, so the same name can refer to objects of different types over the course of a program. Here's an example illustrating how variables are associated with objects in Python:
# Variables referencing objects
# Use hex(id(x)) to check reference id of each variable
x = 5 # x is a variable referencing an integer object
y = "Hello" # y is a variable referencing a string object
# Reassigning variables to different types of objects
x = "Python" # Now x references a string object
y = 3.14 # Now y references a float object
# Variables can reference objects of different types
z = [1, 2, 3] # z is a variable referencing a list object
w = {"name": "Alice", "age": 30} # w is a variable referencing a dictionary object
# You can check the type of an object using the type() function
print(type(x)) # <class 'str'>
print(type(y)) # <class 'float'>
print(type(z)) # <class 'list'>
print(type(w)) # <class 'dict'>
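To make the idea of names referencing objects more concrete, the short sketch below uses the is operator and id() to check whether two names point at the same object. The variable names a, b, and c are illustrative only.

```python
a = [1, 2, 3]
b = a            # b is bound to the same list object as a
c = [1, 2, 3]    # c is bound to a new list with equal contents

print(a is b)          # True: same object
print(a is c)          # False: different objects
print(a == c)          # True: equal values
print(id(a) == id(b))  # True

# Even simple values and built-in functions are objects.
print(isinstance(5, object))      # True
print(isinstance(print, object))  # True
print((5).bit_length())           # 3, integers have methods too

# Rebinding a name does not modify the object it used to reference.
b = "now a string"
print(a)  # [1, 2, 3]
```

The distinction between is (identity) and == (equality) is what makes the later discussion of copies meaningful.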
Memory Management
CPython keeps track of how many references exist for an object using a reference counter. You can read this counter with the sys.getrefcount() function from the sys module. When an object's reference count reaches zero, CPython automatically reclaims the memory used by that object.
import sys
clr = object()
print("Reference count for clr: ", sys.getrefcount(clr)) # Typically prints 2
The reference count includes both the variable itself and the temporary reference created by passing the object to getrefcount(). In contrast, when you create variables that reference small integers, the reference count may seem unexpectedly high:
import sys
x = 10
y = x
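print(sys.getrefcount(x)) # Prints a much higher count; the exact value depends on the Python version
You might expect the result to be 3 (one each for x, y, and the temporary argument), but CPython caches commonly used small integers and shares them across the whole interpreter, so many other references already exist. In recent CPython versions the reported count for such objects may be a very large number.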
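Because absolute counts vary between versions, it can be clearer to look at how the count changes. The sketch below, which assumes a standard CPython build, measures differences relative to a baseline and uses a weak reference to observe the moment the object is reclaimed. The Resource class is a hypothetical placeholder, not part of any library.

```python
import sys
import weakref


class Resource:
    """Hypothetical placeholder class used only for this demonstration."""


res = Resource()
baseline = sys.getrefcount(res)

# Storing the object in a list twice adds two references.
holders = [res, res]
after_hold = sys.getrefcount(res)
print(after_hold - baseline)  # 2

# Removing the list drops those references again.
del holders
print(sys.getrefcount(res) - baseline)  # 0

# A weak reference observes the object without keeping it alive.
watcher = weakref.ref(res)
print(watcher() is res)  # True

# Deleting the last strong reference lets CPython reclaim the object at once.
del res
print(watcher() is None)  # True on CPython
```

Other Python implementations may reclaim objects later, so the final check is specific to CPython's reference counting.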
Python Bytecode
When you run a Python script, the code is first compiled into bytecode. Bytecode is a low-level representation of your Python code that the interpreter can execute. You can see the bytecode of a Python function using the dis module:
import dis
def add_numbers(a, b):
    return a + b
dis.dis(add_numbers)
The dis module will display the bytecode instructions for the add_numbers function. These bytecode instructions are what the Python interpreter executes to perform the addition.
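If you want to work with the instructions programmatically rather than read printed output, dis.get_instructions() yields one object per instruction. The sketch below collects just the opcode names; no expected list is shown because the exact opcodes differ between Python versions (for example, the addition appears as BINARY_ADD in older releases and BINARY_OP in newer ones).

```python
import dis


def add_numbers(a, b):
    return a + b


# Collect opcode names instead of printing the full disassembly.
opnames = [instr.opname for instr in dis.get_instructions(add_numbers)]
print(opnames)

# The compiled bytecode itself lives on the function's code object.
code = add_numbers.__code__
print(code.co_varnames)       # ('a', 'b')
print(len(code.co_code) > 0)  # True
```

The code object also carries metadata such as argument names and constants, which the bytecode refers to by index.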
Shallow Copy
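A shallow copy creates a new outer object but fills it with references to the same nested objects as the original. Plain assignment is different and easy to confuse with it: assignment copies nothing and simply binds a second name to the same object. The example below shows both.
import copy
list_1 = [1, 2, 3, [6, 4, 7], 9]
alias = list_1 # Plain assignment: a second name for the same list
list_2 = copy.copy(list_1) # Shallow copy: new outer list, shared inner list
list_1.append(4) # Changes only the original outer list
list_1[3].append(8) # Changes the inner list that both lists share
print("List 1 =", list_1) # [1, 2, 3, [6, 4, 7, 8], 9, 4]
print("List 2 =", list_2) # [1, 2, 3, [6, 4, 7, 8], 9]
print(alias is list_1) # True
print(list_2 is list_1) # False
print(list_2[3] is list_1[3]) # True
Here, alias and list_1 are the same object, so every change made through list_1 is visible through alias. list_2 is a separate outer list, so the appended 4 does not appear in it, but it shares the nested list with list_1, so the appended 8 does. Sharing nested objects is efficient in terms of memory but may not always be the desired behavior.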
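For comparison with plain assignment, the sketch below builds actual shallow copies in four common ways. Each one produces a new outer list whose elements are still the same objects as in the original. The names original and copies are illustrative.

```python
import copy

original = [1, 2, [3, 4]]

copies = {
    "copy.copy": copy.copy(original),
    "list.copy": original.copy(),
    "slice": original[:],
    "list()": list(original),
}

for name, shallow in copies.items():
    # New outer list, but the nested list is the very same object.
    print(name, shallow is original, shallow[2] is original[2])

original.append(5)      # Not visible through the copies
original[2].append(99)  # Visible through every copy
print(copies["slice"])  # [1, 2, [3, 4, 99]]
```

Every line of the loop prints False followed by True, confirming that only the outer container was duplicated.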
Deep Copy
Deep copying, on the other hand, creates a new object and recursively duplicates nested objects. Any changes to the original object won't affect the deep copy.
import copy
deep_old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
deep_new_list = copy.deepcopy(deep_old_list)
deep_old_list[0][2] = "deep"
print("Old list (deep copy):", deep_old_list) # [[1, 1, 'deep'], [2, 2, 2], [3, 3, 3]]
print("New list (deep copy):", deep_new_list) # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Because a deep copy shares no mutable state with the original, changes to either one stay isolated from the other.
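To see the contrast side by side, the following sketch applies copy.copy() and copy.deepcopy() to the same nested structure and checks which inner objects are shared. It also shows that deepcopy copes with a structure that contains itself. The names nested and loop are illustrative.

```python
import copy

nested = {"scores": [1, 2, 3]}
shallow = copy.copy(nested)
deep = copy.deepcopy(nested)

print(shallow["scores"] is nested["scores"])  # True: shared inner list
print(deep["scores"] is nested["scores"])     # False: duplicated inner list

nested["scores"].append(4)
print(shallow["scores"])  # [1, 2, 3, 4]
print(deep["scores"])     # [1, 2, 3]

# deepcopy tracks objects it has already copied, so self-references are safe.
loop = [1]
loop.append(loop)
loop_copy = copy.deepcopy(loop)
print(loop_copy[1] is loop_copy)  # True
print(loop_copy is loop)          # False
```

The trade-off is cost: a deep copy visits every nested object, so it can be noticeably slower and use more memory than a shallow copy on large structures.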
Conclusion
Python's simplicity and elegance hide a rich and complex inner world. Understanding how Python works under the hood can help you write more efficient code and debug issues when they arise. Remember, Python's beauty isn't just in its syntax; it's also in the way it manages objects, executes code, and handles memory.