Friday, November 3, 2023

Python Under the Hood

Python, the versatile programming language, has a lot going on beneath the surface. As a developer, you may have wondered, "What happens under the hood when you write and run Python code?" In this post, we'll take a closer look at Python's inner workings to demystify some of the magic.

Python: An Interpreted Language

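Python is usually described as an interpreted language: you don't run a separate ahead-of-time compile step to produce a native executable, as you would with C++. When you run a Python script, the interpreter first compiles the source to bytecode and then executes that bytecode one instruction at a time, so statements take effect in order from top to bottom. Let's explore this with a simple example.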

                    print("Hello, World!")  # Hello, World!

When you run this code, the interpreter executes the script's statements from top to bottom. Here there is only one: the print("Hello, World!") call, which writes "Hello, World!" to the console.
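
A small experiment, sketched below, suggests that the whole source is processed before any of it runs. The two-line script held in the string source is hypothetical: its first line is valid and its second line contains a deliberate syntax error.

```python
# Hypothetical two-line script: line 1 is valid, line 2 has a syntax error.
source = 'print("first line")\nx = = 1\n'

executed = False
error_line = None
try:
    # compile() turns the entire source into a code object before anything runs.
    code = compile(source, "<example>", "exec")
    exec(code)
    executed = True
except SyntaxError as exc:
    # The error is reported for line 2, and line 1 never printed anything.
    error_line = exc.lineno
    print("SyntaxError on line", error_line, "before any statement ran")
```

If the source were truly read and run one line at a time, the first print would have produced output before the error appeared.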

Python's CPython Interpreter

Python has multiple implementations, the most popular being CPython. CPython is the reference implementation of Python and is written in C. It compiles your Python code to bytecode and runs that bytecode on its own virtual machine, rather than translating it directly into machine code.
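
If you are not sure which implementation you are running, the standard library can tell you. This is a minimal sketch; the values in the comments are examples and will differ from machine to machine.

```python
import platform
import sys

implementation = platform.python_implementation()  # e.g. "CPython" or "PyPy"
print("Implementation:", implementation)
print("Version:", platform.python_version())
print("sys.implementation.name:", sys.implementation.name)  # e.g. "cpython"
```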

Python: An Object-Centric Language

In Python, everything is an object. Unlike languages with primitive data types, Python treats even simple values as objects. When you assign a value to a variable, you're binding a name to an object. Python is dynamically and strongly typed: the type belongs to the object, not to the variable, so the same name can be rebound to objects of different types at runtime. Here's an example illustrating how variables are associated with objects in Python:

                    # Variables referencing objects
                    # Use hex(id(x)) to check reference id of each variable
                    x = 5  # x is a variable referencing an integer object
                    y = "Hello"  # y is a variable referencing a string object

                    # Reassigning variables to different types of objects
                    x = "Python"  # Now x references a string object
                    y = 3.14  # Now y references a float object

                    # Variables can reference objects of different types
                    z = [1, 2, 3]  # z is a variable referencing a list object
                    w = {"name": "Alice", "age": 30}  # w is a variable referencing a dictionary object

                    # You can check the type of an object using the type() function
                    print(type(x))  # <class 'str'>
                    print(type(y))  # <class 'float'>
                    print(type(z))  # <class 'list'>
                    print(type(w))  # <class 'dict'>
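
To make the "everything is an object" claim concrete, here is a short sketch. The function greet and the list values are illustrative names, not part of the example above.

```python
def greet():
    return "hi"


values = [5, 3.14, "Hello", [1, 2, 3], greet, int, None]

# Every value, including functions, classes, and None, is an instance of object.
all_objects = all(isinstance(v, object) for v in values)
print(all_objects)  # True

# Even a plain integer carries methods, just like any other object.
print((5).bit_length())  # 3, since 5 is 0b101

# Rebinding a name changes what it refers to; each object keeps its own type.
x = 5
first_type = type(x)
x = "Python"
second_type = type(x)
print(first_type.__name__, "->", second_type.__name__)  # int -> str
```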

Memory Management

CPython keeps track of how many references exist for an object using a reference counter. You can read this counter with sys.getrefcount() from the sys module. When an object's reference count reaches zero, CPython reclaims the memory used by that object immediately; a separate cyclic garbage collector handles objects that only reference each other.

                        import sys

                        clr = object()
                        print("Reference count for clr: ", sys.getrefcount(clr))  # Typically prints 2 on CPython

The reference count includes both the variable itself and the temporary reference created by passing the object to getrefcount(). For small integers and other objects that the interpreter itself shares, however, the count can be unexpectedly high:

                        import sys

                        x = 10
                        y = x
                        print(sys.getrefcount(x))  # Prints a higher count, e.g., 15; the exact value varies by version

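You might expect the result to be 3 (x, y, and the temporary argument), but CPython caches commonly used small integers and shares them throughout the interpreter, so many other references to the same object already exist. From Python 3.12 on, such objects are immortal and report a very large fixed count instead.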
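
To watch the counter move and the memory being reclaimed, the sketch below compares counts before and after adding references, then uses a weak reference to check whether the object still exists. Token is a hypothetical placeholder class (a plain object() cannot be weakly referenced), and the behavior shown is specific to CPython's reference counting.

```python
import sys
import weakref


class Token:
    """Hypothetical placeholder class used only for this demonstration."""


tok = Token()
base = sys.getrefcount(tok)

holders = [tok, tok]                 # two more references, stored in a list
delta_up = sys.getrefcount(tok) - base
print(delta_up)                      # 2 on CPython

holders.clear()                      # drop those two references again
delta_down = sys.getrefcount(tok) - base
print(delta_down)                    # 0 on CPython

probe = weakref.ref(tok)             # a weak reference does not raise the count
del tok                              # remove the last strong reference
print(probe() is None)               # True on CPython: the object is already gone
```

Comparing differences rather than absolute counts keeps the example independent of how many temporary references getrefcount() itself adds.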

Python Bytecode

When you run a Python script, the code is first compiled into bytecode. Bytecode is a low-level representation of your Python code that the interpreter can execute. You can see the bytecode of a Python function using the dis module:

                        import dis

                        def add_numbers(a, b):
                            return a + b

                        dis.dis(add_numbers)

The dis module will display the bytecode instructions for the add_numbers function. These bytecode instructions are what the Python interpreter executes to perform the addition.
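
The same information is also available programmatically. As a rough sketch, the function's code object exposes the raw bytecode, and dis.get_instructions() yields one entry per instruction. Instruction names change between Python versions, so the output is printed rather than assumed.

```python
import dis


def add_numbers(a, b):
    return a + b


code_obj = add_numbers.__code__      # the compiled code object behind the function
raw = code_obj.co_code               # the bytecode itself, as a bytes object
print(type(raw).__name__, len(raw))  # the length depends on the Python version

# Names differ between versions (for example BINARY_ADD vs BINARY_OP),
# so they are listed here instead of hard-coded.
opnames = [instr.opname for instr in dis.get_instructions(add_numbers)]
print(opnames)
```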

Shallow Copy

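Strictly speaking, plain assignment does not copy anything: it binds a second name to the same object, so a change made through one name is visible through the other. A shallow copy (copy.copy(), list.copy(), or a full slice) goes one step further and creates a new outer object, but the nested objects inside it are still shared with the original. The example below shows the assignment case.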

                        list_1 = [1, 2, 3, [6, 4, 7], 9]
                        list_2 = list_1
                        list_1.append(4)

                        print("List 1 =", list_1)  # [1, 2, 3, [6, 4, 7], 9, 4]
                        print("List 2 =", list_2)  # [1, 2, 3, [6, 4, 7], 9, 4]
                        print(hex(id(list_1)))  # e.g. 0x7faef2bc4b40 (varies per run)
                        print(hex(id(list_2)))  # same value as list_1

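Here, list_1 and list_2 are two names for the same list object, which is why their id() values match and the append shows up under both names. No memory is duplicated, but this aliasing is not always the desired behavior. A real shallow copy would give list_2 its own outer list while still sharing the nested [6, 4, 7] list.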
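
For comparison with the assignment above, here is a minimal sketch of an actual shallow copy made with copy.copy(). The names original and shallow are illustrative.

```python
import copy

original = [1, 2, 3, [6, 4, 7], 9]
shallow = copy.copy(original)        # same effect as original.copy() or original[:]

original.append(4)                   # changes only the outer list of `original`
original[3].append(8)                # changes the nested list that both share

print(original)                      # [1, 2, 3, [6, 4, 7, 8], 9, 4]
print(shallow)                       # [1, 2, 3, [6, 4, 7, 8], 9]
print(original is shallow)           # False: two distinct outer lists
print(original[3] is shallow[3])     # True: one shared nested list
```

The append to the outer list stays local to original, while the change to the nested list is visible through both, which is the defining trait of a shallow copy.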

Deep Copy

Deep copying, on the other hand, creates a new object and recursively duplicates nested objects. Any changes to the original object won't affect the deep copy.

                        import copy

                        deep_old_list = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
                        deep_new_list = copy.deepcopy(deep_old_list)

                        deep_old_list[0][2] = "deep"

                        print("Old list (deep copy):", deep_old_list)  # [[1, 1, 'deep'], [2, 2, 2], [3, 3, 3]]
                        print("New list (deep copy):", deep_new_list)  # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

With a deep copy, changes to the original, including changes to its nested objects, do not show up in the copy, so the two can be modified independently.
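
You can confirm that independence by comparing identities instead of printed values. The sketch below also includes a list that contains itself, to show that copy.deepcopy() keeps track of objects it has already copied. The names nested, clone, loop, and loop_copy are illustrative.

```python
import copy

nested = [[1, 1, 1], [2, 2, 2]]
clone = copy.deepcopy(nested)

print(clone == nested)               # True: equal contents
print(clone[0] is nested[0])         # False: the inner lists were duplicated too

# A list that contains itself; deepcopy remembers what it has already copied,
# so the copy points back to itself instead of recursing forever.
loop = [1]
loop.append(loop)
loop_copy = copy.deepcopy(loop)
print(loop_copy[1] is loop_copy)     # True
print(loop_copy[1] is loop)          # False
```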

Conclusion

Python's simplicity and elegance hide a rich and complex inner world. Understanding how Python works under the hood can help you write more efficient code and debug issues when they arise. Remember, Python's beauty isn't just in its syntax; it's also in the way it manages objects, executes code, and handles memory.