Python Internals Notes

Here are some things I have learned about Python that I found interesting and helpful (sometimes more interesting than helpful). Take the information here with a grain of salt because I am still new to Python VM internals and the information differs greatly between implementations (CPython vs. Jython vs. PyPy vs. IronPython). This is information about CPython, the standard implementation.

Call by Reference || Call by Value

Does Python send the actual objects when calling functions (reference) or does it send copies of the objects (value)? Neither. Both. A more correct (albeit confusing) definition is “call by value of the reference”, also known as call by sharing. Python always passes a reference to the object, never a copy, and the parameter becomes a new local name for that same object. If you pass in a mutable object like a list and change elements in that list, the calling function will see those changes, because both names point at the same list. If you instead assign a new value to your local alias, the calling function does not see that change, because assignment only rebinds the local name. With an immutable object like a string or number, rebinding is the only option: the data at the address of the object cannot be changed, so an operation like += builds a new object and points the local name at it. Thus Python can look like call by reference on mutable types and like call by value on immutable types, but the mechanism is the same in both cases.
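Here is a small sketch of that behavior. The function names (mutate, rebind, increment) are made up for illustration only.

```python
def mutate(items):
    # In-place mutation: the caller's list is the same object, so it changes too.
    items.append(4)


def rebind(items):
    # Assignment only points the local name at a new list; the caller is unaffected.
    items = [0]
    return items


def increment(n):
    # ints are immutable, so "n += 1" builds a new int and rebinds the local name.
    n += 1
    return n


numbers = [1, 2, 3]
count = 10

mutate(numbers)
rebind(numbers)
increment(count)

print(numbers)  # [1, 2, 3, 4]
print(count)    # 10
```

Note that rebind receives a mutable list and still has no effect on the caller, so the difference comes from mutating versus rebinding, not from the type alone.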

The Stack

Stacks grow upward. Elements are pushed onto and popped off the top, so the bottom element is the last thing to get popped. Internally, Python uses stacks to keep track of functions that should execute and their order; let’s call it the Interpreter Stack (technically there are multiple stacks keeping track of program execution data; read more here). Each function call creates a frame which gets pushed onto the Interpreter Stack. Parameters for that function call also get pushed onto the stack. Stacks are pretty efficient at pushing and popping, but even cheap operations add up when they are repeated needlessly. Reducing function calls reduces pushes and pops, which makes your program run faster. A good way to reduce function calls is to cache a result that does not change once, before the loop, instead of calling the function again on every iteration.
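As a rough sketch of caching a call result before iterating (normalize_slow and normalize_fast are hypothetical names, and the actual speedup depends on the workload, so measure with timeit before relying on it):

```python
import math


def normalize_slow(values):
    # math.sqrt(len(values)) is called once per element,
    # even though the result never changes inside the loop.
    return [v / math.sqrt(len(values)) for v in values]


def normalize_fast(values):
    # Compute the unchanging result once, before iterating.
    scale = math.sqrt(len(values))
    return [v / scale for v in values]


data = [1.0, 2.0, 3.0, 4.0]
print(normalize_slow(data) == normalize_fast(data))  # True
```

Both versions return the same list; the second just makes two function calls in total instead of two per element.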

Problems occur when the stack and heap meet (i.e. use up all the memory between them). The heap is used for dynamic memory allocation, so if you call a lot of functions which allocate a lot of memory then the stack will grow high and the heap will grow deep, and the two can potentially meet and generate a memory error. In practice CPython usually stops you first: once the call depth passes the recursion limit (sys.getrecursionlimit(), 1000 by default) it raises a RecursionError. This is one reason to avoid excessive function calls and long-lived data such as large global variables. Interesting note: recursive calls are treated like just another function call on the stack, but they grow the stack very high because they push a lot of frames onto the stack and only pop them all off at the end. If you get recursion or memory errors with a recursive function, consider refactoring to use iteration to ease the stack overloading. That being said, recursive calls are supported in Python.
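To make the recursion point concrete, here is a minimal sketch with two hypothetical functions that compute the same sum. It assumes CPython, where very deep recursion normally surfaces as a RecursionError once the depth passes sys.getrecursionlimit().

```python
import sys


def sum_to_recursive(n):
    # One frame per step: n frames are live at the deepest point.
    return 0 if n == 0 else n + sum_to_recursive(n - 1)


def sum_to_iterative(n):
    # A single frame, no matter how large n is.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


depth = sys.getrecursionlimit() + 100

try:
    sum_to_recursive(depth)
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)
print(sum_to_iterative(depth) == depth * (depth + 1) // 2)
```

The iterative version handles the same input without trouble because it never stacks more than one frame.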

Scope and Lookups

In addition to stack usage, the scope of functions and variables can impact performance. Inside a function, CPython decides at compile time which names are local. Local variables are stored in a fixed-size array on the frame and read by index, which is about as fast as a lookup gets. Any other name is looked up at run time in the module's global dictionary and, if that fails, in the builtins dictionary. So calling a builtin like len costs a failed global lookup plus a builtins lookup, and calling something like math.sqrt costs a global lookup plus an attribute lookup, every single time. The further out Python has to go to resolve a name, the slower the call is compared to a name found in local scope. Take advantage of this by making local aliases of out-of-scope functions before iterating.
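A sketch of the local-alias trick, using hypothetical function names. Treat the speedup as an assumption to verify with timeit: newer CPython releases cache some of these lookups, so the gap may be small.

```python
import math


def roots_global(values):
    result = []
    for v in values:
        # Every iteration resolves the global "math", then its attribute
        # "sqrt", then the attribute "append" on result.
        result.append(math.sqrt(v))
    return result


def roots_local(values):
    # Resolve the out-of-scope names once and keep them as locals.
    sqrt = math.sqrt
    result = []
    append = result.append
    for v in values:
        append(sqrt(v))
    return result


print(roots_global([0, 1, 4, 9]) == roots_local([0, 1, 4, 9]))  # True
```

The aliases trade a little readability for fewer lookups, so they are usually only worth it in hot loops.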

Interning of Objects

CPython allocates certain oft-used numbers and strings as singleton objects at the beginning of a program and does not deallocate/garbage-collect them for as long as the program runs. Every use of one of these values refers to the same object.

  • Integers in range(-5, 257), i.e. -5 through 256
  • Single-character strings (in CPython, characters in the Latin-1 range)
  • The empty string

This is a CPython implementation detail rather than a language guarantee, and it changes occasionally (the range of interned integers has changed several times). I just thought this was interesting.
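You can poke at this with the "is" operator, which compares object identity. The sketch below assumes a current CPython; the values are built at run time because the compiler may merge identical literals in the same piece of code, which would hide the effect. The results are implementation details, so I print them instead of promising a particular output.

```python
# Build values at run time so the compiler cannot merge identical literals.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")
char_a, char_b = chr(97), chr(97)
empty_a, empty_b = "".join([]), str()

small_shared = small_a is small_b  # inside the cached integer range
large_shared = large_a is large_b  # outside it, so usually two distinct objects
char_shared = char_a is char_b     # single-character string "a"
empty_shared = empty_a is empty_b  # the empty string

print(small_shared, large_shared, char_shared, empty_shared)
```

Whatever the identity checks say, large_a == large_b is always True, which is why value comparisons should use == and never "is".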
