Every value in Python, from a simple integer to a complex function, is an object. In CPython, the reference implementation, each of these objects is backed by a C struct. In this lesson, we will peel back the abstraction layer to reveal how these objects are represented in memory and how the CPython runtime manages their lifecycle.
In the eyes of the CPython interpreter, every object begins with a C struct known as PyObject. This structure is the foundation of all Python data types and is defined in the source file Include/object.h. In a standard build, PyObject contains only two fields: ob_refcnt (the reference count) and ob_type (a pointer to the object's type object).
When you create an object, the interpreter allocates memory for this struct; a variable is just a name that refers to it. The ob_refcnt field is central to memory management, as it tracks how many references point to the object. When this count hits zero, the object is deallocated immediately and its memory is returned to the allocator. The ob_type pointer tells Python whether the object is an int, str, list, or a custom class, effectively defining what operations can be performed on the raw bytes stored in the object.
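The effect of ob_refcnt and ob_type can be observed from Python itself. The following is a minimal sketch, assuming CPython: sys.getrefcount reports the current count (including a temporary reference held by the call itself), so only the differences between readings are meaningful. The variable names are illustrative.

```python
import sys

data = []                          # a fresh list object
baseline = sys.getrefcount(data)   # absolute value includes the call's own temporary reference

holders = [data, data]             # two additional references to the same object
with_holders = sys.getrefcount(data)

holders.clear()                    # drop both additional references
after_clear = sys.getrefcount(data)

print(with_holders - baseline)     # references contributed by `holders`
print(after_clear == baseline)     # the count returns to its earlier value
print(type(data) is list)          # type() exposes what ob_type points to
```

Comparing deltas rather than absolute counts keeps the example independent of how many internal references the interpreter happens to hold.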
Not all data types have a fixed size. While a float has a predictable memory footprint, a container such as a list or tuple holds a variable number of items (and because Python integers are arbitrary-precision, even they are not fixed-size). For these cases, CPython uses the PyVarObject structure, which extends PyObject by adding an ob_size field. This field tracks the number of items in the object.
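The difference between fixed-size and variable-size objects is visible through the type attribute __itemsize__ and through sys.getsizeof. The sketch below assumes CPython and uses float, tuple, and bytes as examples; tuple and bytes store their items inline, so each extra item adds exactly __itemsize__ bytes.

```python
import sys

# Fixed-size type: no per-item storage, every float has the same footprint.
float_item_size = float.__itemsize__
same_float_size = sys.getsizeof(1.0) == sys.getsizeof(1e300)

# Variable-size types: total size depends on the number of items (ob_size).
tuple_growth = sys.getsizeof((1, 2, 3)) - sys.getsizeof((1, 2))
bytes_growth = sys.getsizeof(b"abcd") - sys.getsizeof(b"abc")

print(float_item_size, same_float_size)
print(tuple_growth == tuple.__itemsize__)   # one extra pointer slot per item
print(bytes_growth == bytes.__itemsize__)   # one extra byte per item
```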
Memory for these objects is managed by CPython's small-object allocator, implemented in Objects/obmalloc.c. Calling the standard C malloc for every tiny object would be inefficient and would lead to memory fragmentation, so CPython uses a multi-layered allocator instead. It pre-allocates pools of memory for small objects (up to 512 bytes) and falls back to the standard system allocator for larger ones. This design significantly speeds up the creation and destruction of small, short-lived objects.
Note: Because CPython relies on reference counting (ob_refcnt), objects in a circular reference (where object A points to B, and B points to A) never reach a count of zero on their own, so reference counting alone would leak them. Python solves this with a separate cyclic garbage collector that periodically detects and cleans up these unreachable islands of objects.
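A reference cycle and its cleanup can be sketched with the standard gc and weakref modules. This is an illustrative example, assuming CPython; Node and make_cycle are hypothetical names. Automatic collection is paused so that the cycle visibly survives until gc.collect() is called.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # A points to B, and B points to A
    return weakref.ref(a)        # a weak reference does not keep `a` alive


was_enabled = gc.isenabled()
gc.disable()                     # pause automatic cyclic collection

ref = make_cycle()               # the local names a and b are gone now
alive_before = ref() is not None # the cycle keeps both refcounts above zero

found = gc.collect()             # run the cycle detector explicitly
alive_after = ref() is not None  # the island has been reclaimed

if was_enabled:
    gc.enable()                  # restore the previous collector state

print(alive_before, alive_after, found >= 2)
```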
If a PyObject holds the data, where are the methods and class-level attributes stored? They reside in the type object. For a class, the type object contains a dictionary (tp_dict) of methods. When you call my_list.append(), Python follows the object's ob_type pointer to the list type object and looks up the append function in its dictionary, falling back to the dictionaries of its base types if the name is not found there.
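The lookup path can be retraced by hand. In this sketch, type(my_list).__dict__ is the read-only Python view of the type's tp_dict; calling the entry found there with the instance as the first argument has the same effect as the usual method call.

```python
my_list = [1, 2]

list_type = type(my_list)                    # follow ob_type to the type object
append_descr = list_type.__dict__["append"]  # look the name up in the type's dictionary

append_descr(my_list, 3)                     # call it with the instance passed explicitly
my_list.append(4)                            # the ordinary spelling of the same lookup

print(type(append_descr).__name__)           # kind of object stored in the dictionary
print(my_list)
```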
This lookup through the type object is what method resolution order builds on, and it underlies much of Python's dynamic behavior. Because an object's type is just a pointer, you can reassign an object's __class__ at runtime, pointing it to a different type object, provided the two types have compatible memory layouts (in practice, ordinary user-defined classes). This alters the behavior of the object mid-execution, though it is rarely done in production code because of the stability risks involved.
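Reassigning __class__ can be sketched with two small user-defined classes. Duck and Robot are hypothetical names chosen for illustration; the final part shows that CPython rejects the assignment for built-in immutable types such as int.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


pet = Duck()
before = pet.speak()        # resolved through Duck

pet.__class__ = Robot       # repoint the instance at a different type object
after = pet.speak()         # the same object now resolves through Robot

# Built-in immutable instances do not allow this.
number = 5
try:
    number.__class__ = float
    rejected = False
except TypeError:
    rejected = True

print(before, after, rejected)
```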
CPython employs two related optimizations: it caches small integers (typically -5 to 256) and interns certain strings. Because the small integers are used so frequently, CPython creates a pre-allocated array of these objects during startup.
When you define a = 10 and b = 10, Python does not create two separate objects. Instead, both variables point to the same object in the pre-allocated array. This saves memory and avoids allocating a new object each time such a value is produced. It also explains a common surprise: the is operator compares memory addresses rather than values, so a is b returns True for small integers but may return False for larger ones. To compare values, use ==.
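The following sketch makes the sharing visible. It assumes CPython and builds its values at runtime with int() and str.join(), because equal literals within a single piece of compiled code may be merged by the compiler regardless of caching. The result for the large integer is an implementation detail and is not guaranteed.

```python
import sys

# Small integers come from the shared pre-allocated array.
small_a = int("256")
small_b = int("256")
small_shared = small_a is small_b        # True on CPython

# Larger integers are normally separate objects with equal values.
large_a = int("1000000")
large_b = int("1000000")
large_equal = large_a == large_b         # values always compare equal
large_shared = large_a is large_b        # typically False on CPython

# Strings can be interned explicitly with sys.intern.
s1 = "".join(["cpython ", "internals"])
s2 = "".join(["cpython ", "internals"])
equal_values = s1 == s2
interned_shared = sys.intern(s1) is sys.intern(s2)

print(small_shared, large_equal, equal_values, interned_shared)
```

Because identity results depend on interpreter internals, code that compares values should rely on == and reserve is for singletons such as None.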
Every Python object is built on the PyObject structure, containing a reference count and a type pointer.
PyVarObject extends base objects to support variable sizing for types such as lists, tuples, and bytes.
Methods are stored in the tp_dict of their corresponding type object.