Saturday, April 9, 2011

Managing Python packages: virtualenv, pip and yolk

I've recently been playing with the Python virtualenv package - along with pip and yolk - as a way of managing third-party packages. This post is my brief introduction to the basics of these three tools.

virtualenv lets you create isolated self-contained "virtual environments" which are separate from the system Python. You can then install and manage the specific Python packages that you need for a particular application - safe from potential problems due to version incompatibilities, and without needing superuser privileges - using the pip package installer. yolk provides an extra utility to keep track of what's installed.

1. virtualenv: building virtual Python environments

virtualenv can either be installed via your system's package manager (for example, synaptic on Ubuntu), or by using the easy_install tool, i.e.:

$ easy_install virtualenv

(If you don't have the setuptools package, which provides easy_install, then you can download its "bootstrap" install script, save it locally and run it using /path/to/python.)

Once virtualenv is installed you can create a new virtual environment (called "myenv" in this example) as follows:

$ virtualenv --no-site-packages myenv

This makes a new directory myenv in the current directory (which will contain bin, include and lib subdirectories) based on the system version of Python. The --no-site-packages option tells virtualenv not to include any third-party packages which might have been installed into the system Python (see the virtualenv documentation for details of other options).
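The same step can also be scripted from Python. The sketch below is not virtualenv itself: it assumes Python 3.3 or later and uses the standard library's venv module as a stand-in, with system_site_packages=False playing the role of --no-site-packages. The helper names create_env and scripts_dir are my own, purely for illustration.

```python
import os
import tempfile
import venv  # standard-library module (Python 3.3+), used here in place of virtualenv


def create_env(parent_dir, name="myenv"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # system_site_packages=False mirrors the --no-site-packages option;
    # with_pip=False skips bootstrapping pip so the sketch stays quick.
    venv.create(env_dir, system_site_packages=False, with_pip=False)
    return env_dir


def scripts_dir(env_dir):
    """Return the directory that holds the environment's executables."""
    return os.path.join(env_dir, "Scripts" if os.name == "nt" else "bin")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = create_env(tmp)
        # Show the top-level contents of the new environment directory.
        print(sorted(os.listdir(env)))
```

The example builds the environment in a temporary directory so that running it leaves nothing behind; in practice you'd pass the directory where you want "myenv" to live.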

To start using the new environment, run the environment's "activate" command e.g.:

$ source myenv/bin/activate

The shell command prompt will change from e.g. $ to (myenv)$, indicating that the "myenv" environment (and any packages installed in it) will be used instead of the system Python for applications run in this shell. (Note that the Python application code doesn't need to be inside the virtual environment directory; in fact this directory is just used for the packages associated with the virtual environment.)
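There's not much magic in activation: roughly speaking, the activate script sets a VIRTUAL_ENV variable, puts the environment's executables directory at the front of PATH and clears PYTHONHOME (it also changes the prompt, which I've left out). The sketch below approximates that for launching a subprocess from Python; activated_environ is a hypothetical helper of mine, not part of virtualenv.

```python
import os


def activated_environ(env_dir, base_environ=None):
    """Return a copy of the environment variables with env_dir 'activated'."""
    environ = dict(os.environ if base_environ is None else base_environ)
    env_dir = os.path.abspath(env_dir)
    bin_dir = os.path.join(env_dir, "Scripts" if os.name == "nt" else "bin")

    environ["VIRTUAL_ENV"] = env_dir
    # Putting the environment's executables first on PATH is what makes
    # "python" and "pip" resolve to the environment's copies.
    old_path = environ.get("PATH", "")
    environ["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    # A PYTHONHOME setting would override the environment, so drop it.
    environ.pop("PYTHONHOME", None)
    return environ
```

The result can be passed as the env argument to subprocess.call (or similar) to run a command as if the environment had been activated in that shell.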

Finally, when you've finished working with the virtual environment you can leave it by running the deactivate command, which the activate script makes available in the shell.

(On Windows you may have to specify the full path to the "Scripts" directory of your Python installation when invoking the easy_install and virtualenv commands above, e.g. C:\Python27\Scripts\virtualenv. Also, note that when a virtual environment is created it won't contain a "bin" directory - instead it's activated by invoking the Scripts\activate batch file in the virtual environment directory. Invoking the deactivate command exits the environment as before.)
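If you ever need to check from inside Python whether a virtual environment is in use, something like the following should do. It's a sketch based on my understanding that older virtualenv releases set sys.real_prefix, while the standard library's venv (and newer virtualenv releases) make sys.prefix differ from sys.base_prefix; the function name is my own.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter appears to run in a virtual environment."""
    # Older virtualenv releases record the system prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases leave sys.base_prefix pointing at the
    # system Python while sys.prefix points at the environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print(in_virtual_environment())
```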

2. pip: installing Python packages

Once you've created a virtual environment you can start to add packages (which is really the point of doing this in the first place). virtualenv automatically includes both easy_install and an alternative package installer called pip (at least, for virtualenv 1.4.1 and up; earlier versions only have easy_install, so you'll need to run easy_install pip within the virtual environment in order to get it).

Most packages that are easy_installable can also be installed using pip, and it's designed to work well with virtualenv. However, I think its main advantage is that it offers some useful functionality that's missing from easy_install - most significantly, the ability to uninstall previously installed packages. (Other useful features include the ability to explicitly control and export versions of third-party package dependencies via "requirements files" - see the pip documentation for more details.)
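To give a feel for requirements files: in their simplest form they're just text files with one package per line, optionally pinned with name==version (pip freeze prints the current environment in this form, and pip install -r reads such a file back in). The parser below is only an illustration of that simplest form. Real requirements files allow much more (other version specifiers, options, URLs), and parse_pinned_requirements is a made-up helper, not something pip provides.

```python
def parse_pinned_requirements(text):
    """Return {name: version} for simple 'name==version' lines.

    Unpinned names map to None. Anything beyond this simplest form
    (other specifiers, options, URLs) is deliberately not handled.
    """
    pins = {}
    for raw_line in text.splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
        else:
            pins[line] = None
    return pins


if __name__ == "__main__":
    example = """
    # pinned for Python 2 compatibility
    python-dateutil==1.5
    yolk
    """
    print(parse_pinned_requirements(example))
```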

Basic pip usage looks like this:

(myenv)$ pip install python-dateutil # install latest version of a package

(myenv)$ pip uninstall python-dateutil # remove package

(myenv)$ pip install python-dateutil==1.5 # install specific version

(As an aside, the python-dateutil package is illustrative of one of the advantages of using pip over easy_install: after installing the latest version of python-dateutil, I discovered that it's only compatible with Python 3 - an earlier 1.* version is required to work with Python 2. pip let me uninstall the newer version and reinstall the older one.)
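The three pip invocations above can also be driven from a script. The sketch below only builds the command lines; it assumes a pip recent enough to be run as "python -m pip", and pip_command is a hypothetical helper name. Running pip through sys.executable means the packages go into whichever environment the script itself is running in.

```python
import sys


def pip_command(action, package, version=None):
    """Build an argument list that runs pip for the current interpreter."""
    if action not in ("install", "uninstall"):
        raise ValueError("unsupported pip action: %r" % (action,))
    command = [sys.executable, "-m", "pip", action]
    if action == "uninstall":
        command.append("-y")  # skip pip's confirmation prompt
        command.append(package)  # uninstall only needs the package name
    elif version is None:
        command.append(package)  # latest available version
    else:
        command.append("%s==%s" % (package, version))  # specific version
    return command


if __name__ == "__main__":
    # Print the commands rather than running them.
    print(pip_command("install", "python-dateutil"))
    print(pip_command("uninstall", "python-dateutil"))
    print(pip_command("install", "python-dateutil", "1.5"))
```

To actually run one of these, pass the list to subprocess.check_call.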

3. yolk: checking Python packages installed on your system

The final utility I'd recommend is yolk, which provides a way of querying which packages (and versions) have been installed in the current environment. It also has options to query PyPI (the Python Package Index). Installing it is easy:

(myenv)$ pip install yolk

Running it with the -l option (for "list") then shows us which packages are installed:

(myenv)$ yolk -l
Python - 2.6.4 - active development (/usr/lib/python2.6/lib-dynload)
pip - 1.0 - active
python-dateutil - 1.5 - active
setuptools - 0.6c9 - active
wsgiref - 0.1.2 - active development (/usr/lib/python2.6)
yolk - 0.4.1 - active

(See the yolk documentation to learn more about its other features.)
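For comparison, a similar name-and-version listing can be produced from within Python. This is not how yolk works internally; it's a sketch that assumes Python 3.8 or later, where the standard library's importlib.metadata module can enumerate the distributions installed in the current environment. The installed_packages function is my own naming.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_packages():
    """Return sorted (name, version) pairs for the current environment."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no usable name
            found.add((name, dist.version))
    # Sort case-insensitively by package name.
    return sorted(found, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_packages():
        print("%s - %s" % (name, version))
```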


Obviously the above is just an introduction to the basics of virtualenv, pip and yolk for managing and working with third-party packages - but hopefully it's enough to get started. If you're interested in using virtualenv in practice then Doug Hellmann's article about working with multiple virtual environments (and his virtualenvwrapper project, which provides tools to help) is recommended as a starting point for further reading.
