A new version of
pip-tools was recently released and it brings with it a much cleaner work-flow to manage your Python virtual environments.
I’m assuming you are using virtual environments for your Python and Django projects. If not you should look up some of the excellent guides on the Internet on how to get started with virtualenv, virtualenvwrapper or my personal favorite virtualfish for Fish.
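If you just want something to get going with, here is a minimal sketch using plain virtualenv; the env directory name is only an example:

$ pip install virtualenv
$ virtualenv env              # create an isolated environment in ./env
$ source env/bin/activate     # activate it; your prompt should change
$ which python                # should now point inside ./env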
One of the problems you always encounter with Python development is package management, especially when you use Django and keep adding cool new apps to your project. Those applications will install dependencies, and those in turn install their own. So at some point you won’t be able to tell or remember what they all were anymore. Each of those packages adds to the size of your virtual environment. This is not a big deal if you are just developing locally, but once you start using hosting platforms you quickly find out that every megabyte of your project counts. For example, when you run a WSGI server that uses forking, chances are that each fork will load your Django environment into memory. Now multiply that by the number of workers you have set up and you see how a few megabytes more or less can make a big difference. If you author open-source projects, and you should, it’s also nice to not inadvertently list a bunch of unnecessary packages.
Vincent Driessen’s pip-tools package has been a long-time favorite of many Python developers. Its
pip-review and pip-dump commands were a great way of creating requirement files and keeping them up to date. Now, with version 1 finally released, it introduces two new commands,
pip-compile and pip-sync, which were previously only available as a preview in the future branch of the repository.
These two commands bring a new, and in my opinion a much improved, work-flow with them. I’ll take you through how I’ve been using it, so let’s start with installing pip-tools using pip:
$ pip install pip-tools --upgrade
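If you want to make sure the install worked, both new commands should now be available on your PATH and will print their usage when asked:

$ pip-compile --help
$ pip-sync --help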
First we create a
requirements.in file, which will list all the basic package requirements for our project. I will not insult you by explaining how to create and edit a file, but take care to use the correct extension
.in instead of
.txt. Here I’m just listing some standard packages I would use for most Django projects:
# requirements.in
django
django-extensions
django-debug-toolbar
django-redis
psycopg2
Now we run
$ pip-compile and it creates a
requirements.txt file for us:
# requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in requirements.in, then run this to update:
#
#    pip-compile requirements.in
#
django-debug-toolbar==1.3.2
django-extensions==1.5.5
django-redis==4.2.0
django==1.8.3
msgpack-python==0.4.6     # via django-redis
psycopg2==2.6.1
redis==2.10.3             # via django-redis
six==1.9.0                # via django-extensions
sqlparse==0.1.16          # via django-debug-toolbar
As you can see
pip-compile neatly pinned all the packages and their own requirements, marking every child dependency with its parent where needed. Now you know exactly why e.g.
sqlparse or
six is installed, and by which package. If you still wonder why you should be pinning your requirements like this, you can read about it in this article by the pip-tools author himself.
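By the way, you don’t have to take whatever the latest release happens to be. As far as I can tell pip-compile respects normal version specifiers in the .in file, so you can constrain a single package and let everything else float; the version range below is purely illustrative:

# requirements.in
django<1.9                    # illustrative: stay on the 1.8 series
django-extensions
django-debug-toolbar
django-redis
psycopg2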
Often I’ll use some extra packages for development that I don’t want included in the
requirements.txt file because they won’t be used in staging or production. However, I do switch working environments a lot, so it’s nice not to have to remember all the stuff I need to install to start working. So let’s create a dev-requirements.in file:
# dev-requirements.in
pytest
pytest-django
bpython
$ pip-compile dev-requirements.in will create our dev-requirements.txt file:
# dev-requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in dev-requirements.in, then run this to update:
#
#    pip-compile dev-requirements.in
#
blessings==1.6            # via curtsies
bpython==0.14.2
curtsies==0.1.19          # via bpython
greenlet==0.4.7           # via bpython
py==1.4.30                # via pytest
pygments==2.0.2           # via bpython
pytest-django==2.8.0
pytest==2.7.2
requests==2.7.0           # via bpython
six==1.9.0                # via bpython
Most software will only look for a
requirements.txt file to install dependencies, so this is a great way of keeping these packages out of your Heroku dynos or similar platforms while still managing your development environment.
Until now we haven’t actually installed anything in our virtualenv. We could simply do
$ pip install -r requirements.txt but what if there are already packages installed from before? The new
pip-sync command takes care of this. It not only installs your dependencies, it also removes anything that is not defined in your requirement files. You’ll soon find out if you forgot to list a crucial package in one of your .in files:
$ pip-sync requirements.txt dev-requirements.txt
You’ll see pip-tools make your environment reflect exactly what you defined in your requirements. No more extraneous mystery packages.
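If you want to convince yourself, compare the output of pip freeze with your compiled files after a sync; they should match, give or take pip-tools itself and its own dependencies:

$ pip-sync requirements.txt dev-requirements.txt
$ pip freeze                  # should roughly be the union of both .txt files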
Let us assume I’ve decided that
django-debug-toolbar is a great development package but I don’t want to use it on my servers. Normally you would run
$ pip uninstall django-debug-toolbar and remove the entry from your
requirements.txt file manually. Unfortunately this doesn’t remove the
sqlparse package, unless you remember to do so yourself, and months later you’ll be wondering why it is in your requirements file.
With pip-tools this process is a lot cleaner. All we do is remove the
django-debug-toolbar entry from the
requirements.in file and run
$ pip-compile again.
# requirements.txt
#
# This file is autogenerated by pip-compile
# Make changes in requirements.in, then run this to update:
#
#    pip-compile requirements.in
#
django-extensions==1.5.5
django-redis==4.2.0
django==1.8.3
msgpack-python==0.4.6     # via django-redis
psycopg2==2.6.1
redis==2.10.3             # via django-redis
six==1.9.0                # via django-extensions
As you can see both the toolbar and its sqlparse dependency are gone.
Now we run
$ pip-sync requirements.txt dev-requirements.txt to actually un-install those packages from our environment. That’s really all there is to it.
Every once in a while I will run
$ pip-compile again so it will pick up any updated packages that have been released since. It’s much the same way I used to run
pip-review, except that now I use
$ git diff to show me whether any packages have been updated. This is the only part that probably still needs a little love from the developers. If you’re not using version control to show you that something has changed, it can be very hard to tell whether any of your requirements have had an update, especially if all you wanted to do was remove a dependency. A little change report after a compile would make this a lot easier. It is still very early days for this new version though, and updates have been coming out frequently.
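So my update routine boils down to something like this; what the diff shows obviously depends on what has been released in the meantime:

$ pip-compile requirements.in
$ pip-compile dev-requirements.in
$ git diff requirements.txt dev-requirements.txt   # see which pins changed
$ pip-sync requirements.txt dev-requirements.txt   # apply them to the virtualenv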
Personally, I’m very excited that @nvie released these new commands. I think they should be a standard tool for anyone who does any kind of Python development and needs reproducible environments. Utilizing this work-flow has enabled me to reduce my project sizes by 10 to 20 percent. That’s quite significant in itself, but more importantly it makes me feel I’m in control of my Python environments again.