What's in a virtualenv?

2021-05-04

A useful exercise for understanding software tools is to peel back a layer of abstraction and ask what the thing underneath is. For example:

On a computer, text is a sequence of numbers (and a system to interpret those numbers as letters).
An HTTP request is a blob of text with a particular format.
An interpreter (like the Python or Ruby interpreters) is a program, whose function is to execute other programs.
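
To make the first of these concrete, here is a minimal Python sketch. The sample string is my own choice for illustration; it shows the numbers behind a piece of text, and what happens when those numbers are read back with the wrong system.

```python
text = "naïve café"

# Each character corresponds to a number (its Unicode code point).
code_points = [ord(ch) for ch in text]

# An encoding turns those numbers into bytes for storage or transmission.
encoded = text.encode("utf-8")

# Decoding with the matching encoding recovers the original text...
right = encoded.decode("utf-8")

# ...while decoding with a different one silently produces garbage.
wrong = encoded.decode("latin-1")

print(code_points)
print(encoded)
print(right)
print(wrong)  # naÃ¯ve cafÃ©
```

The two accented letters each take two bytes in UTF-8, which is why the wrongly decoded version sprouts extra characters.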

Knowing this sort of thing is useful when the abstractions break (e.g. when you open a text file with the wrong encoding) or when thinking about the fundamental contours of a system's possibility space. For example, if you know that programs can be configured with environment variables and CLI flags, knowing that the Python interpreter is a program means you know where to start looking if you need to configure it.
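
As a rough sketch of that last point, the snippet below starts a fresh interpreter three ways and asks it whether it will write bytecode files. The -B flag and the PYTHONDONTWRITEBYTECODE environment variable are real CPython options; the helper name run_probe is just something I made up for this example.

```python
import os
import subprocess
import sys

# A tiny program for the child interpreter to run.
PROBE = "import sys; print(sys.dont_write_bytecode)"


def run_probe(extra_args=(), extra_env=None):
    """Start a new interpreter and report its sys.dont_write_bytecode."""
    # Start from a copy of our environment without the variable under test,
    # so the baseline isn't affected by the parent's settings.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONDONTWRITEBYTECODE"}
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print("default:     ", run_probe())
print("CLI flag -B: ", run_probe(extra_args=["-B"]))
print("env variable:", run_probe(extra_env={"PYTHONDONTWRITEBYTECODE": "1"}))
```

Both knobs change the same setting, which is about what you'd expect from a program that is configured like any other program.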

There are many, many articles on the internet explaining how and why to use a Python virtual environment (or "virtualenv"), but far less information about what one actually is. Luckily for us, it ends up not being very difficult to investigate.

Looking inside a virtualenv

Creating a virtualenv is the work of a moment. If you have Python installed, you can get one of your very own like so:

> python -m venv ./venv

This creates a directory called venv. Let's see what's in it!

> cd venv
> ls
...

Listing the directory's contents reveals three directories and a file:

Include/, which appears to be empty,
Lib/, which contains a directory called site-packages,
Scripts/, which contains a bunch of executables, including the Python interpreter and the activate/deactivate scripts for this virtualenv, and
pyvenv.cfg, a config file with the Python version and the path to my system Python interpreter.

(Note: This was run on a Windows machine; you'll get different-but-analogous results if you run it on Linux or macOS.)
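
If you'd rather poke at this from Python than from a shell, here is a minimal sketch using the standard library's venv module. The function name describe_venv and the throwaway temporary directory are my own choices for the example, and it skips installing pip to keep things quick.

```python
import tempfile
import venv
from pathlib import Path


def describe_venv(env_dir):
    """Create a virtualenv at env_dir and summarise what ends up inside it."""
    env_dir = Path(env_dir)
    venv.create(env_dir, with_pip=False)  # with_pip=False: skip bootstrapping pip

    # Top-level entries, with a trailing slash on directories.
    entries = sorted(
        p.name + ("/" if p.is_dir() else "") for p in env_dir.iterdir()
    )

    # pyvenv.cfg is a plain "key = value" text file.
    config = {}
    for line in (env_dir / "pyvenv.cfg").read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return entries, config


with tempfile.TemporaryDirectory() as tmp:
    entries, config = describe_venv(Path(tmp) / "venv")

print(entries)
for key, value in config.items():
    print(f"{key} = {value}")
```

On Linux or macOS you should see bin/ and lib/ where Windows has Scripts/ and Lib/, and the exact keys in pyvenv.cfg vary a little between Python versions.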

I know from prior experience that site-packages is where Python packages get installed, and that "include" is probably something to do with compiled extensions. It also makes sense that there would be a central place to put scripts, so that the virtualenv's activation script can add it to your shell's PATH. So a virtualenv is a directory full of things for Python to import or execute. But how does it work?

Understanding virtualenvs

The documentation for virtual environments is, again, very helpful if you want to understand how to use a virtual environment, but not particularly illuminating on the topic of its inner workings.

Instead, the answer is in the documentation for the site package: it turns out that when the Python interpreter starts up, it looks for a pyvenv.cfg file one directory above its own executable. If it finds one, it knows it's in a virtualenv and configures itself accordingly.
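
Here is a rough imitation of that lookup in Python. It is a simplification for illustration, not the interpreter's actual startup code (which handles more cases), and the function names and fake directory layout are my own. The comparison of sys.prefix with sys.base_prefix is the check the standard library documentation suggests for telling whether you're running inside a virtual environment.

```python
import os
import sys
import tempfile
from pathlib import Path


def find_pyvenv_cfg(executable):
    """Return the pyvenv.cfg one directory above `executable`, or None."""
    # Deliberately not resolving symlinks: on Linux/macOS the venv's python
    # is often a symlink, and following it would lead out of the virtualenv.
    exe_dir = Path(os.path.abspath(executable)).parent
    candidate = exe_dir.parent / "pyvenv.cfg"
    return candidate if candidate.is_file() else None


def in_virtualenv():
    """True if the running interpreter has configured itself for a virtualenv."""
    return sys.prefix != sys.base_prefix


# Demonstrate the lookup on a fake layout (hypothetical, empty "interpreter").
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fake_exe = root / "Scripts" / "python.exe"
    fake_exe.parent.mkdir()
    fake_exe.touch()

    before = find_pyvenv_cfg(fake_exe)  # no config file yet
    (root / "pyvenv.cfg").write_text("home = /hypothetical/python\n", encoding="utf-8")
    after = find_pyvenv_cfg(fake_exe)   # now the marker file is there

print("before:", before)
print("after: ", after)
print("this interpreter is in a virtualenv:", in_virtualenv())
```

The whole mechanism hinges on one small marker file sitting in the right place relative to the interpreter.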

Now we know a few things about virtualenvs:

You can use the Python interpreter at venv/Scripts/python without activating the virtualenv.
If you need to find the source code of a dependency (for debugging, say) you can find it in venv/Lib/site-packages/.
You can ruin your virtualenv by messing with the pyvenv.cfg file (or, possibly, un-mess it to fix a ruined virtualenv).
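
The first of these is easy to check for yourself. The sketch below, with helper names of my own invention, builds a throwaway virtualenv, runs its interpreter directly (no activation script involved), and asks it where it thinks it lives.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path to the interpreter inside a virtualenv (layout differs by OS)."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def prefix_reported_by(python):
    """Run the given interpreter and return the sys.prefix it reports."""
    result = subprocess.run(
        [str(python), "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return Path(result.stdout.strip())


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    # Symlink the interpreter where that's the norm (Linux/macOS), copy on Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))

    reported = prefix_reported_by(venv_python(env_dir))
    # Compare as filesystem locations, so differences in spelling don't matter.
    matches = os.path.samefile(reported, env_dir)

print("interpreter reports prefix:", reported)
print("matches the virtualenv directory:", matches)
```

No environment variables or PATH changes were needed: the interpreter worked out where it was from its own location.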

Nathaniel Knight

Reflections, diversions, and opinions from a progressive ex-physicist programmer dad with a sore back.
