A new version of
pip-tools was recently released and it brings with it a much cleaner work-flow to manage your Python virtual environments.
I’m assuming you are using virtual environments for your Python and Django projects. If not you should look up some of the excellent guides on the Internet on how to get started with virtualenv, virtualenvwrapper or my personal favorite virtualfish for Fish.
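If you just want something to get going with, here is a minimal sketch using plain virtualenv; the env directory name is only an example:

$ pip install virtualenv
$ virtualenv env              # create an isolated environment in ./env
$ source env/bin/activate     # activate it; your prompt should change
$ which python                # should now point inside ./env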
One of the problems you always encounter with Python development is package management, especially when you use Django and keep adding cool new apps to your project. Those applications will install dependencies, and those in turn install their own. So at some point you won’t be able to tell or remember what they all were anymore. Each of those packages adds to the size of your virtual environment. This is not a big deal if you are just developing locally, but once you start using hosting platforms you quickly find out that every megabyte of your project counts. For example, when you run a WSGI server that uses forking, chances are that each fork will load your Django environment into memory. Now multiply that by the number of workers you have set up and you see how a few megabytes more or less can make a big difference. If you author open-source projects, and you should, it’s also nice to not inadvertently list a bunch of unnecessary packages.
Vincent Driessen’s pip-tools package has been a long-time favorite of many Python developers. Its
pip-review and pip-dump commands were a great way of creating requirement files and keeping them up to date. Now, with version 1 finally released, it introduces two new commands,
pip-compile and pip-sync, which were previously only available as a preview in the future branch of the repository.
These two commands bring a new, and in my opinion a much improved, work-flow with them. I’ll take you through how I’ve been using it, so let’s start with installing pip-tools using pip:
$ pip install pip-tools --upgrade
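If you want to make sure the install worked, both new commands should now be available on your PATH and will print their usage when asked:

$ pip-compile --help
$ pip-sync --help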
First we create a
requirements.in file, which will list all the basic package requirements for our project. I will not insult you by explaining how to create and edit a file, but take care to use the correct extension
.in instead of
.txt. Here I’m just listing some standard packages I would use for most Django projects:
# requirements.in
django
django-extensions
django-debug-toolbar
django-redis
psycopg2
Now we run
$ pip-compile and it creates a
requirements.txt file for us:
# requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in requirements.in, then run this to update:
#
#    pip-compile requirements.in
#
django-debug-toolbar==1.3.2
django-extensions==1.5.5
django-redis==4.2.0
django==1.8.3
msgpack-python==0.4.6     # via django-redis
psycopg2==2.6.1
redis==2.10.3             # via django-redis
six==1.9.0                # via django-extensions
sqlparse==0.1.16          # via django-debug-toolbar
As you can see
pip-compile neatly pinned all the packages and their own requirements, marking every child dependency with its parent where needed. Now you know exactly why e.g.
sqlparse or
six is installed, and by which package. If you still wonder why you should be pinning your requirements like this, you can read about it in this article by the pip-tools author himself.
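By the way, you don’t have to take whatever the latest release happens to be. As far as I can tell pip-compile respects normal version specifiers in the .in file, so you can constrain a single package and let everything else float; the version range below is purely illustrative:

# requirements.in
django<1.9                    # illustrative: stay on the 1.8 series
django-extensions
django-debug-toolbar
django-redis
psycopg2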
Often I’ll use some extra packages for development that I don’t want included in the
requirements.txt file because they won’t be used in staging or production. However, I do switch working environments a lot, so it’s nice not to have to remember all the stuff I need to install to start working. So let’s create a dev-requirements.in file:
# dev-requirements.in
pytest
pytest-django
bpython
$ pip-compile dev-requirements.in will create our dev-requirements.txt file:
# dev-requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in dev-requirements.in, then run this to update:
#
#    pip-compile dev-requirements.in
#
blessings==1.6            # via curtsies
bpython==0.14.2
curtsies==0.1.19          # via bpython
greenlet==0.4.7           # via bpython
py==1.4.30                # via pytest
pygments==2.0.2           # via bpython
pytest-django==2.8.0
pytest==2.7.2
requests==2.7.0           # via bpython
six==1.9.0                # via bpython
Most software will only look for a
requirements.txt file to install dependencies, so this is a great way of keeping these packages out of your Heroku dynos or similar platforms while still managing your development environment.
Until now we haven’t actually installed anything in our virtualenv. We could simply do
$ pip install -r requirements.txt but what if there are already packages installed from before? The new
pip-sync command takes care of this. It not only installs your dependencies, it also removes anything that is not defined in your requirement files. You’ll soon find out if you forgot to list a crucial package in one of your .in files:
$ pip-sync requirements.txt dev-requirements.txt
You’ll see pip-tools make your environment reflect exactly what you defined in your requirements. No more extraneous mystery packages.
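If you want to convince yourself, compare the output of pip freeze with your compiled files after a sync; they should match, give or take pip-tools itself and its own dependencies:

$ pip-sync requirements.txt dev-requirements.txt
$ pip freeze                  # should roughly be the union of both .txt files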
Let us assume I’ve decided that
django-debug-toolbar is a great development package but I don’t want to use it on my servers. Normally you would run
$ pip uninstall django-debug-toolbar and remove the entry from your
requirements.txt file manually. Unfortunately this doesn’t remove the
sqlparse package, unless you remember to do so yourself, and months later you’ll be wondering why it is in your requirements file.
With pip-tools this process is a lot cleaner. All we do is remove the
django-debug-toolbar entry from the
requirements.in file and run
$ pip-compile again.
# requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in requirements.in, then run this to update:
#
#    pip-compile requirements.in
#
django-extensions==1.5.5
django-redis==4.2.0
django==1.8.3
msgpack-python==0.4.6     # via django-redis
psycopg2==2.6.1
redis==2.10.3             # via django-redis
six==1.9.0                # via django-extensions
As you can see both the toolbar and its sqlparse dependency are gone.
Now we run
$ pip-sync requirements.txt dev-requirements.txt to actually un-install those packages from our environment. That’s really all there is to it.
Every once in a while I will run
$ pip-compile again so it will pick up any updated packages that have been released since. It’s much the same way I used to run
pip-review, except that now I use
$ git diff to show me whether any packages have been updated. This is the only part that probably still needs a little love from the developers. If you’re not using version control to show you that something has changed, it can be very hard to tell whether any of your requirements have had an update, especially if all you wanted to do was remove a dependency. A little change report after a compile would make this a lot easier. It is still very early days for this new version though, and updates have been coming out frequently.
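So my update routine boils down to something like this; what the diff shows obviously depends on what has been released in the meantime:

$ pip-compile requirements.in
$ pip-compile dev-requirements.in
$ git diff requirements.txt dev-requirements.txt   # see which pins changed
$ pip-sync requirements.txt dev-requirements.txt   # apply them to the virtualenv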
Personally, I’m very excited that @nvie released these new commands. I think they should be a standard tool for anyone who does any kind of Python development and needs reproducible environments. Utilizing this work-flow has enabled me to reduce my project sizes by 10 to 20 percent. That’s quite significant in itself, but more importantly it makes me feel I’m in control of my Python environments again.