4 Sharing and reusing code: environments, packages and containers
Being on github is a good step toward making your code available to the community, but it is not sufficient. You also need to make sure that the code is easy for other users to install, that it runs on systems that might differ from yours, and that it stands the test of time by registering its dependencies so it does not break with updates. This last point can be challenging in python, where everything changes constantly. Fortunately, many tools are available to help you with that, notably the famous environments and containers that we will see in this chapter. We will also see how to package your code so that it can be reused by others, and how to share it on package indexes like PyPI.
❯ Level 1
What are packages, and why do we need pip and conda?
Let’s start with the very basics. If you have spent at least a few weeks doing python, you have surely stumbled upon the pip and conda utilities, but their role might still be blurry. That is totally expected, and a complete understanding of these tools will only come after several paragraphs of this chapter, but we can start by outlining the very basics.
The main functionality of these tools is to install packages1. Packages are arguably the most important factor of python’s success. A package is fundamentally a collection of code - classes and functions - that can be reused by other python programmers: basically, those things you can import into your code. Some packages are present by default in python; these form the “standard library”, which includes for example math, os, or itertools (a complete list is here). But most packages, even very popular ones like numpy, pandas, scipy or matplotlib, are not present by default and have to be installed separately.
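To make this concrete, standard-library modules can be imported and used without installing anything:

```python
import math
import itertools

# math and itertools ship with every python installation
print(math.sqrt(16))  # 4.0

# all pairs from a sequence, without writing nested loops
print(list(itertools.combinations("abc", 2)))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```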
All these packages are crucial to python’s success: a large community develops and maintains them, and they allow you to do impressive things with very few lines of code. Imagine if you had to implement a plotting library or pandas functions yourself! On top of that, many packages are updated very frequently, often making novel algorithms available very quickly, which notably contributed to python’s success in machine learning.
But such power comes at a price, notably the frequent updates: since packages rely on each other, updates can break compatibilities. For example, pandas depends on numpy, but recent versions of pandas might not work with old versions of numpy. You can check that by looking at the pyproject.toml file of the pandas source code online, which at the time of writing contained these lines:
dependencies = [
"numpy>=1.23.5; python_version<'3.12'",
"numpy>=1.26.0; python_version>='3.12'",
...
]
meaning that the latest version of pandas requires numpy to be at least version 1.23.5 (and even higher for certain python versions). You might wonder: why not simply keep all packages at their latest versions? Here’s the catch: sometimes features get deprecated, or start behaving in a different way, because developers realize some choices were a mistake, or want to go in another direction. Hence, another library might rely on a feature that was removed in the latest version of numpy, and if you want to use it alongside pandas you will need to find some version of numpy that works for both. Imagine this problem with dozens of packages, and you get an idea of what people call “dependency hell”.
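To see why a constraint like numpy>=1.23.5 is meaningful, here is a toy sketch of how such a requirement can be checked, assuming simple X.Y.Z version strings (real tools follow the much richer PEP 440 rules, via the packaging library, instead of this simplification):

```python
def parse(version: str) -> tuple:
    """Turn '1.23.5' into (1, 23, 5) so versions compare component-wise."""
    return tuple(int(part) for part in version.split("."))

def satisfies(installed: str, minimum: str) -> bool:
    """Check a '>=' constraint by comparing version tuples."""
    return parse(installed) >= parse(minimum)

print(satisfies("1.26.0", "1.23.5"))  # True: recent enough
print(satisfies("1.22.4", "1.23.5"))  # False: too old
```

Note how the comparison is component by component, not lexicographic: "1.9.0" is older than "1.23.5" even though "9" > "2" as a string.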
This is where pip and conda come into play. They are package managers, meaning they can install python packages for you while trying to maintain compatibility between all your packages. When there are many packages, themselves dependent on many other packages, this task quickly becomes very complicated, and even with those tools you can often find people screaming at their screens and losing their minds as they are just trying to install a package. As an important side note, this is now much easier with the new generation of package managers, including poetry, uv and pixi that we will see at the end of this Level 1 section, but it is still early for this new generation, and still useful to learn pip and conda.
Basic use of pip
pip is the default package manager for python, and also the simplest one. It is installed by default since python 3.4, and its core commands are:
pip install <package> # install a package (and its dependencies)
pip uninstall <package> # uninstall a package
pip freeze # list all installed packages and their versions
# Other useful commands:
pip install -U <package> # upgrade a package
pip install -r requirements.txt # install all packages listed in a file (see later)
pip show <package> # show information about an installed package
Some magic happens in the pip install command: pip will not only look for the latest version of your package but also for all its dependencies, and the dependencies of those dependencies and so on, in a tree fashion, and will try to find a version of every package in the tree that is compatible with all the others2. It is also interesting to know where pip finds the packages it installs. As of now, they are hosted on the Python Package Index (PyPI), and you can search it at pypi.org. However, certain developers prefer to maintain versions of their packages elsewhere or simply host them on github, which is why pip also supports installation directly from a git repository, with something like (using the same url that you would use for cloning):
pip install "package@git+https://github.com/somerepo/somerepo.git"
Installing packages with conda
conda is an alternative package manager that will be installed by default if you installed python through the anaconda or miniconda distributions (which are quite useful because they come with a lot of scientific packages pre-installed). conda however does a bit more than pip because it can also manage virtual environments, which we will see in the next section. But for the package management part, it is quite similar to pip, and the main commands are:
conda install <package> # install a package
conda remove <package> # uninstall a package
conda list # list all installed packages and their versions
conda update <package> # upgrade a package
So in terms of installing packages, it all seems very similar to pip, with some key differences however:
- First, conda installs packages from the Anaconda repository, which is not the same as PyPI and might contain different packages or versions. In conda, one can also specify a --channel (or -c) option to install from an alternative package repository (also called a channel), including conda-forge, which has a larger collection, or bioconda for bioinformatics.
- conda also has a somewhat more rigorous dependency solver than pip, although it can take more time and sometimes seems to get stuck3.
- Finally, conda packages rely on a different format than PyPI ones, and in particular can contain “shared libraries”, which we will see in more detail at the end of Level 3; very shortly, these are non-python libraries that perform very fundamental operations (key examples include CUDA for GPU programming or BLAS for optimised linear algebra code). The consequence is that conda packages might be heavier, but they can solve some tricky issues that happen with pip when a shared library is missing.
It is also worth mentioning here the existence of the mamba project, which acts as a drop-in replacement for conda but is much faster because it is written in C++. You can check how to install it on their website, and then use it by replacing conda with mamba in any conda command (installations notably might be much faster since the dependency resolution code is accelerated). However, you might want to look at uv and pixi, which are also fast and have additional features, before settling on your package manager of choice.
What is (really) a virtual environment?
One of the annoyances of the python ecosystem comes when you are working on different projects in parallel, each having different dependencies. You might have a recent project using torch 2, and an older one still using torch 1, for example. For obvious reasons, python needs to have a single version of each package visible to it: otherwise it wouldn’t know what to do when you import torch. The solution to this dilemma is called virtual environments, a tool that will also allow us to more easily share code and reproduce experiments.
Before looking into virtual environments, we need to understand how python finds packages. When you type import torch, python will search for a file named torch.py or a directory named torch containing an __init__.py file on your computer. Obviously, it doesn’t look everywhere, but in a specific list of directories: the “module search path”. You can see it by typing:
import sys
print(sys.path)
Depending on how you installed python, you will see different results, but they usually include a certain site-packages directory. This is usually where pip or conda install packages by default. The trick of virtual environments is to create a new site-packages directory for each project (and sometimes even a new python executable), and to let you switch very easily from one to another.
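You can verify this mechanism yourself: a module only becomes importable once its directory appears on sys.path. The snippet below creates a throwaway module (named mymodule here purely for illustration) and shows the effect:

```python
import sys
import tempfile
from pathlib import Path

# write a one-line module into a fresh temporary directory
tmpdir = Path(tempfile.mkdtemp())
(tmpdir / "mymodule.py").write_text("ANSWER = 42\n")

# the directory is not on sys.path yet, so the import fails
try:
    import mymodule
except ImportError:
    print("mymodule not found")

# after adding the directory to the search path, the import works
sys.path.append(str(tmpdir))
import mymodule
print(mymodule.ANSWER)  # 42
```

This is exactly the knob that virtual environments turn: activating an environment changes which site-packages directory ends up on this search path.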
There is no scarcity of tools to handle virtual environments in python: virtualenv was a precursor, but now venv is the officially supported one, although conda also handles virtual environments, as well as pipenv or poetry and now uv… Phew! We will see how venv and conda handle them here because these are major tools and it will help us understand what happens behind the scenes, but we will also see the uv way in another section.
Let’s start with venv. You can go to a new folder and run the following commands:
python -m venv .venv # create a ".venv" directory containing the virtual environment
source .venv/bin/activate # activate the virtual environment
which python # see which python executable is used
pip install numpy==2.0.0 # install a certain numpy version in the virtual environment
python -c "import numpy; print(numpy.__version__)"
deactivate # deactivate the virtual environment
which python
python -c "import numpy; print(numpy.__version__)"
You will see that when the environment is activated, the path to the python executable is different than when it is not, and a package installed in the virtual environment is not available outside of it (maybe you really had numpy 2.0.0 in your base environment, but most likely the last line gives you an error or another version).
How does this work? The first line calls the venv standard library module which simulates an entire python installation in a newly created .venv directory. You can explore it and see that it has executables4 and a site-packages directory where packages can be installed. The second line calls a script that will “activate” the environment, meaning that it meddles with some python settings so that your shell will now only see the python installation in the .venv directory. While the environment is activated, it will be as if your initial python installation didn’t exist (for this terminal only, if you open another terminal it will still see your default installation). The deactivate command simply reverts those changes and sends you back to your default installation (which we will now call the “base environment”).
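A practical consequence of this design is that python itself can tell you whether it is running inside a virtual environment: sys.prefix points at the currently active environment, while sys.base_prefix always points at the base installation, so the two differ exactly when a venv is active:

```python
import sys

# inside a venv the two prefixes differ; in the base installation they match
in_venv = sys.prefix != sys.base_prefix
print("virtual environment active:", in_venv)
print("current environment:", sys.prefix)
```

This check is handy in scripts that should refuse to install things into the base environment by accident.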
The key feature is of course that this allows you to have completely different versions of the same packages in different environments, as anything you install in the virtual environment will not be seen in the base environment, and vice versa. This is also tremendously useful because it allows you to “break things” in terms of installing packages without too many consequences: if you find yourself in the middle of an installation catastrophe, you can simply delete the .venv directory and start again from scratch, without affecting your base environment or your other projects.
Note that if you activate an environment, and then open a new terminal in another window, the environment will not be active in the new window: the activation of an environment is specific to a terminal session, and thus you can have several terminals open with different environments activated. This is a very useful feature, as it allows you to work on several projects at the same time, each with its own environment.
Virtual environments with conda
Conda has the particularity of being both a package manager and a virtual environment manager, which makes it a bit confusing to beginners. Let’s see its commands related to virtual environments:
conda create -n myenv python=3.9 # create a new environment named "myenv" with python 3.9
conda activate myenv # activate the environment
# Here you are "in" the environment. If you install packages, they will be installed
# only in this environment.
pip install numpy # install numpy with pip (this works inside conda environments too)
conda install torch # install torch with conda
conda deactivate # deactivate the virtual environment
# Now you are back in your base environment. The two packages we just installed
# are not available anymore.
You can notice two big differences with venv: first, the python version can now be different from the base one, and you can choose a new python version every time you create a new environment, yay! This is because the whole python installation is duplicated this time, rather than symlinked as with venv. A second major difference is that environments are no longer tied to a specific directory that you choose. Environments are now identified by a name, which must be unique on your system; they are stored in a special .conda directory somewhere, and can be activated from anywhere simply by using their name. The advantage is that you can, for example, share a single environment between different projects. On the other hand, with venv we usually create the environment directory inside the project’s directory so that both are tied to one another, whereas with conda it is easier to forget which environment was used for which project without a little discipline.
Since conda stores all the environments together, you can easily list them or delete one with some additional commands:
conda env list # to list available environments
conda env remove --name myenv
Finally, a few notes about interoperability: it is possible to install pip packages in a conda environment without hassle, but installing a conda package in a venv environment is generally not possible. Also, since environments make complex modifications to your environment variables and to the way python finds its dependencies, it can be risky to mix conda environments and venv environments on the same system. If you find yourself in this situation, no need to panic: in general everything works out fine, but if you use conda, it is usually good practice to stick to it and avoid using venv too.
Modern package management with poetry, uv and pixi
After some time, it appeared that pip and conda had issues that made life difficult for python practitioners. Notably, managing environments was very repetitive, the environment was not tied to the project, and even with files like requirements.txt, an environment was not completely reproducible. To fix these issues, modern package managers like poetry, uv, and pixi work with two files:
- a pyproject.toml file, which we will study in detail in a Level 2 section. This file brings together what requirements.txt used to do, along with project metadata and instructions to “build” the project so users can install it very easily.
- a “lockfile”, which is intended to be read only by the package manager and records the exact versions of all packages used in your environment, to ensure full reproducibility. This file is called poetry.lock for poetry, uv.lock for uv, and pixi.lock for pixi. It is not meant to be edited by hand, but it is very useful to share with your code so that other people can install the exact same environment you used.
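To give you an idea of what a lockfile records, here is a simplified, hypothetical excerpt in the spirit of a uv.lock entry (the exact fields vary between tools and versions; real lockfiles also record file hashes and the full dependency graph):

```toml
# hypothetical, simplified lockfile entry
[[package]]
name = "numpy"
version = "2.0.0"
source = { registry = "https://pypi.org/simple" }
```

Contrast this with pyproject.toml, which would only say "numpy>=2.0.0": the lockfile pins the one exact version that was actually installed.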
Let’s see how you would manage a project with poetry (you can visit this page for installation instructions):
- First, a very useful command is poetry new: it creates a whole project structure that follows best practices, which helps standardise your code organisation. It will also create the pyproject.toml file in which you can fill metadata information (authors, etc.), a lockfile, and a virtual environment in a system cache directory (at the moment of writing, for Linux users, this was ~/.cache/pypoetry/virtualenvs/). You can create your new project with:
poetry new myproject
- The idea now is that you can use the virtual environment simply by running poetry commands. For example, to add a new dependency, run:
cd myproject
poetry add numpy==2.0.0
This will not only install the package in the virtual environment, but also update the pyproject.toml file with the new dependency, and record all changes in the poetry.lock lockfile.
- Once you have added all your dependencies, you can run a script of your project with:
poetry run python myscript.py
which will run the script in the virtual environment. If you prefer to have a shell with the virtual environment activated, as was the case with venv and conda, you can run:
poetry shell
- If you clone a repository that uses poetry, you can create an environment and install all the dependencies simply by running:
poetry install
This will use the lockfile if it is available, or the dependencies listed in pyproject.toml if not.
Note that when you are in the environment, you can still use pip commands as usual. However, these will not update the pyproject.toml file, and your dependencies may become inconsistent with your environment, so it is best to avoid doing this!
As you can see, poetry is very convenient because it removes some repetitive tasks, forces you to use better practices, and allows you to easily share your project simply by including the pyproject.toml and poetry.lock files in your repository. Given all these advantages, I hope you see there is virtually no reason to stick to pure pip- or conda-based management!
Despite all its advantages, people found that poetry was still slow (it is indeed written in python), which led to the development of fast alternatives like uv and pixi, both written in Rust. These two tools are very similar to poetry, so the reader can simply check their documentation as well as the python package management Rosetta Stone in the appendix of this book. Note that the main difference between uv and pixi is that uv focuses on PyPI packages (like pip and poetry) while pixi can also use conda packages, which can include compiled shared libraries. I hope this will help you choose the best package manager for your needs, but I can recommend starting with poetry if you are unsure.
Why is python package management so chaotic?
Let’s do a quick “therapy session” about python package management. In this section, I want to acknowledge your pain, and give you some more fundamental understanding of how the python package ecosystem was structured. This should help you understand why it looks so much like a Wild West, and how to navigate it better.
First, we need to acknowledge that the difficulties with handling project dependencies in python are the price to pay for a very dynamic open-source community. Some languages and communities are more stable, either because most packages are provided by a centralized provider, or because packages don’t change dramatically very often. In python, it is common to have open-source libraries providing an algorithm soon after it is published, much earlier than it would be available in other languages, and this is one of the main reasons why diverse scientific communities quickly adopted python. This also means that certain packages are sometimes completely rewritten without backwards compatibility, or stop being maintained, and the people using them have to deal with that. There is a trade-off between dynamism and stability, and python makes an aggressive choice towards the first option, so it falls to you, the python user, to stabilise your development environment. This is why package managers and virtual environments are so important in python, and seem messier than in other languages.
Here are the main components that exist to manage dependencies in python:
- package indexes: these are online services that store all the packages submitted to them, in all their versions. The main ones are PyPI and the Anaconda repository.
- package format: a python package is usually, as we said, a collection of python files, but it can also contain code in other languages (for example C or Rust) that is then compiled. When you install a package, you usually download an archive that contains python files, and sometimes compiled libraries. Different formats exist for these different situations: sdist is an archive format for source distributions (usually python-only), .whl (“wheel” format) is an archive that can contain python files as well as small compiled libraries, and .conda is a format specific to the conda world that can also contain large, “system-level” libraries (which also consist of compiled code but generally act at a more fundamental level, like linear algebra or GPU programming libraries; this is why conda was often used for high-performance scientific packages). .egg was an older format, now deprecated.
- package manager/installer: a tool that can download a package from online indexes, if necessary search for and download its dependencies, and put its files in all the correct places so that everything works when you try to import it. As we said, pip and conda, but also uv, poetry and pixi, can fulfill this role. Now you can understand that pip and conda differ both in the package index and in the package format that they use.
- virtual environment manager: as we saw earlier, virtual environments simulate a python installation, allowing several concurrent ones with different packages installed while avoiding conflicts. The main virtual environment manager offered by python is venv, and there have been alternatives including poetry, but note that venv does not allow multiple versions of the python executable itself to coexist. This is a problem solved notably by pyenv, but also by conda, uv, and pixi.
- python project metadata file: now we are getting to trickier concepts, for which I am making up names myself, although the concepts are very real. When you are working on a project, there is a lot of metadata you may need to share: most importantly the dependencies (with their versions), but also the minimally required python version, how to install the dependencies (is it via pip or conda?), and finally how to install your code itself as a library (which we can call “building” your project). To some extent, requirements.txt and environment.yml can perform this role (the second containing a bit more information), and traditionally, in the python world, the remaining information was scattered across many files: a setup.py file to explain how to build the project, sometimes a .python-version file, and additional config files for development tools like mypy or ruff. Now, all the metadata can be gathered in a single format, the pyproject.toml file, which we will cover in a following section.
- python project manager: a new concept we can use is that of a “python project manager”, meaning a full-fledged solution to manage your projects. Ideally, such a solution can create a virtual environment for your project, install all its dependencies in it, and record changes in the metadata file. Integrating all these tasks was already a goal of conda, but conda still lacks many useful features. The main complete project management solutions in python right now are poetry, uv and pixi.
- python build backend: building a project is needed if you want to import it from anywhere (see the section “A guide to the intricacies of python imports” in chapter 3 to understand the problem). We will see in detail how this works in Level 2, but what you need to know is that several tools can perform this task, called “build backends”: the historically famous one is setuptools, with modern alternatives like hatchling, scikit-build, mesonpy, and others. The important point is that these tools are now specified in the pyproject.toml file, which allows you to use them with any project manager, and thus separate the “build frontend” (the project manager) from the “build backend” (the tool that actually builds the project).
Ok, let’s summarize this in a little table:

| Tool | Package index | Package format | Handles virtual environments | Project metadata file | Project manager |
| - | - | - | - | - | - |
| pip | PyPI | .sdist, .whl | No (can be done with venv though) | requirements.txt | No |
| conda | Anaconda | .conda | Yes | environment.yml | No |
| poetry | PyPI | .sdist, .whl | Yes | pyproject.toml | Yes |
| uv | PyPI | .sdist, .whl | Yes | pyproject.toml | Yes |
| pixi | PyPI, conda | .sdist, .whl, .conda | Yes | pyproject.toml | Yes |
For a more complete comparison, check the Rosetta Stone in the appendix!
❯❯ Level 2
Understanding the pyproject.toml file
We have mentioned that tools like poetry, uv or pixi rely on a pyproject.toml file to gather all the metadata about a project. Let’s open an example file and break it down:
[project]
name = "myproject"
version = "0.1.0"
description = "A simple project to demonstrate the pyproject.toml file"
authors = [
{ name = "John Doe", email = "xxx" },
]
requires-python = ">=3.8"
dependencies = [
"numpy>=2.0.0",
"pandas>=2.0.0",
]
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"
[tool.poetry.scripts]
myproject-cli = "myproject.cli:main"
First, this file is written in the TOML format, a simple format for configuration files (like yaml, json or ini). It is composed of sections, delimited by headers in square brackets, and sets of key-value pairs (where the keys are strings and the values can be strings, numbers, lists, or dictionaries). I can recommend this page for a quick tour of the syntax.
As you can see, the first section of this file is project, which contains some metadata as well as essential information to build the project, notably the minimal python version and the main package dependencies. These dependencies will be added automatically by commands like poetry add or uv add, but as you can see they are neither exact nor complete (numpy actually depends on many other packages, but only the minimal information required to reconstruct the environment is given here; this is in contrast with the lockfile, which lists every package with its exact version).
The next section is build-system which contains instructions on how to build the project. If you make your project available on PyPI, these instructions will be used by pip or the equivalent tool to install your library. We will also see how to do it locally.
Finally, several tool sections can be added to this file, and will be read by the corresponding tool. In this case, we have a tool.poetry.scripts section which allows you to define command-line scripts that can be run from the terminal. We will detail these later, simply remember that many tools can add their own sections to the pyproject.toml file, which is a very convenient way to gather all your configuration in a centralized file. For example, uv can read from a tool.uv section, or ruff from a tool.ruff section, mypy from a tool.mypy section, and so on.
Packaging python code
We have seen how to share your dependencies, and how to let people reproduce your environment exactly with tools like poetry install. This will let people run your code, but it still doesn’t turn it into a library that can be used in other projects. You can see this from the mere fact that you cannot import your code outside of the project directory, as we saw in the section “A guide to the intricacies of python imports” in chapter 3.
To make your code usable in other projects, we need to “package” it, which means we will need to respect certain conventions regarding its structure, and use a build backend to turn it into one of the formats we saw earlier, like sdist or wheel. The complete rules for packaging python code are described in the Python Packaging User Guide, but we will see the main points here.
First, the project structure needs to look like this:
myproject/
├── source/                  # or src/, here all the importable code will go
│   └── myproject/           # this name will be the package name
│       ├── __init__.py
│       ├── module1.py
│       └── subpackage/
│           ├── __init__.py
│           └── module2.py
├── tests/                   # if you have tests
│   ├── test_module1.py
│   └── test_module2.py
├── pyproject.toml
└── README.md
Thankfully, this will be created automatically when you run a command like poetry new myproject, poetry init, or uv init --lib. One key point is that all the code you want to be importable should live in a directory bearing the package name, inside a source or src directory. This allows the build backend to find all the code. The tests are kept separate so they are not installed with the package, and you can likewise separate other directories that shouldn’t be installed, like documentation, tutorials, notebooks, or data.
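To see why this layout makes the code importable, you can simulate what an install does: an installed package is essentially the contents of source/ made visible on the module search path. The sketch below recreates a minimal version of the structure in a temporary directory (the greet function is invented for the example):

```python
import sys
import tempfile
from pathlib import Path

# recreate a minimal src-style layout on disk
root = Path(tempfile.mkdtemp())
pkg = root / "source" / "myproject"
pkg.mkdir(parents=True)
(pkg / "module1.py").write_text("def greet():\n    return 'hello'\n")
(pkg / "__init__.py").write_text("from myproject.module1 import greet\n")

# installing roughly amounts to putting source/ on the search path;
# build backends do this properly by copying or linking into site-packages
sys.path.append(str(root / "source"))
import myproject
print(myproject.greet())  # hello
```

Note that only the directory named after the package becomes importable: tests/ and README.md stay outside source/ precisely so they are not shipped to users.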
The second step of the process involves writing instructions to build the project. Traditionally, and until tools like poetry came along, this was done in a setup.py file, which you can still find in many projects and is still taught in many courses. This was a bit verbose and confusing, so it is now recommended to use the pyproject.toml file instead, though setup.py files are still perfectly usable. If you use poetry or uv to set up your project, they should write some default build instructions like the ones in the example above, and in most cases you can stick with them and will have nothing else to do! They typically only specify which build backend to use, and this backend will then read the rest of the pyproject.toml file to gather all the information it needs to build the project (dependencies, authors, etc.). If you are not using these tools, a good default would be:
[build-system]
requires = ["setuptools>=61.0"] # Minimum requirements for the build system itself
build-backend = "setuptools.build_meta"
which will use the setuptools build backend, the most common one.
Now, you can build your project locally, which means you will finally be able to import it from anywhere! First, you should know that if you are using poetry or uv, your project is actually already built and available in the virtual environment associated with your project. So if you open a shell with the environment activated, with poetry shell, navigate anywhere on your computer, and launch a python interpreter, you will be able to import myproject. However, this will only be true in this specific environment.
If you want to build your project and install it in another environment, or if you are not using poetry or uv, the simplest method is to run:
pip install -e .
when you are exactly in the root directory of your project (where the pyproject.toml file is). The -e option means “editable”: it installs the project such that any changes to the code in myproject are immediately available when you import it. Without this option, the whole code would be copied into the site-packages directory of your environment, and you would need to reinstall it every time you change the code.
At this point, the whole world can already install your project very easily! Indeed, if your code is on a public repo on github for example, anyone can simply run:
```shell
pip install git+https://github.com/yourusername/myproject.git
# Or
poetry add git+https://github.com/yourusername/myproject.git
# or uv add, etc.
```

With this, pip will download your code, build it, and install it in whichever environment is activated. The poetry version will even add the dependency with the github url to the pyproject.toml file. This is an amazing feature, don’t hesitate to use it! A word of caution however: with this technique your users are quite vulnerable to any changes you make to the code, and if you want to offer them more safety, you should consider publishing the project on PyPI with semantic versioning, which we will see next.
Distributing python code
At this point, you may have a robust project that can be built, is importable, is continuously tested, and well documented. You now want to share it more widely and make it installable through pip. For this, you will simply need to publish your package on PyPI. The good news is that with the pyproject.toml file we wrote in the previous sections, this will be very easy. The metadata in the project section of this file is critical, because it will be used to index and display your project on PyPI. Notably, the name should be unique, the authors should be listed, as well as the description, which will allow people to find your project. The version number is also mandatory, since it will allow your users to find specific versions of your project. PyPI will let you upload a package with a given version number only once! If it contains mistakes, these will be there forever, and you will need to increment the version number to upload a fixed version. Such is the hard law of software! We will see later how to easily increment your version numbers.
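As a reminder, the project section of the pyproject.toml file might look like the following sketch, where the name, email and description are of course placeholders to replace with your own:

```toml
[project]
name = "myproject"                # must be unique on PyPI
version = "0.1.0"
description = "A short, searchable one-line summary of the project"
authors = [{ name = "Your Name", email = "you@example.com" }]
readme = "README.md"
requires-python = ">=3.9"
```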
For now, to upload a given version of your project to PyPI, you first need to create an account here. Then you will need to first build your project, which will depend on the project management tool you are using. With poetry, the command will simply be:
```shell
poetry build
```

There is an equivalent command for uv and pixi. If you are using pip and the setuptools build backend, you will need to install a “build frontend” first. A simple and effective one is build, which you can install with pip install build. Then you can run:
```shell
python -m build
```

In all cases, this will create a dist/ directory containing the built package: a source distribution (sdist, a .tar.gz file) and/or a wheel (a .whl file). To upload your package, you will need another tool called twine (again, pip install twine will do), which can both check your package and upload it to PyPI. You can do these steps as follows:
```shell
twine check dist/*
twine upload dist/*
```

Important note: if you want to practice those steps, you can upload to the Test PyPI index, which is a sandbox version of PyPI. You can create an account there as well, and use twine upload -r testpypi dist/* to upload your package to the test index.
Semantic versioning and release workflows
You may notice that software projects often use three numbers to identify their versions, something like 1.0.3. This is done not simply because it looks cool, but mainly because it follows a convention called semantic versioning, which has its own website, and which is itself versioned! The convention explains in great detail how each of these three numbers should be incremented, but the idea is roughly the following:

- the first number is called the “major” version, and should be incremented once you change your code in a way that is not backwards compatible, for example by deleting functions or changing behavior in a way that alters usage. In practice, I have not always seen this rule followed very precisely, but roughly it is a way to indicate to users that they should be very careful when updating to a new major version.
- the second number is called the “minor” version, and it mostly indicates that new features have been added, but in a way that doesn’t fundamentally alter previous functionality. This is the number that will be incremented most often.
- finally, the third number is called the “patch” version. It is only really used to indicate that a bug has been fixed5.

There are more complex naming conventions for “pre-release” versions, but I will let you look at the official documentation for these.
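The incrementing rules above can be sketched in a few lines of python. The bump function below is a hypothetical helper, roughly mirroring what commands like poetry version do under the hood:

```python
def bump(version: str, part: str) -> str:
    """Increment the major, minor or patch component of a semver string."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"              # breaking changes: reset minor and patch
    if part == "minor":
        return f"{major}.{minor + 1}.0"        # new features: reset patch
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fixes only
    raise ValueError(f"unknown part: {part!r}")

print(bump("1.0.3", "minor"))  # -> 1.1.0
```

Note how bumping a component resets the ones to its right: a new major version starts again at x.0.0, and a new minor version at x.y.0.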
Let us now see how you would modify the version of your project before a new release. If you followed the previous steps, you normally have a version field in your pyproject.toml, set to 0.1.0 for example. Sometimes people also consider it good practice to have a __version__ variable in the __init__.py file of your main package, which should naturally be equal to the version in the pyproject.toml file6. The first step would thus be to update all the places where your version numbers appear, which you can do manually (dangerous), or with any of the numerous tools for this task. Typically, running poetry version major/minor/patch or uv version --bump major/minor/patch will update the version in the pyproject.toml file.
❯❯❯ Level 3
What is (really) docker?
🏗️ Work in progress
Basic use of docker
🏗️ Work in progress
Fundamentals of a dockerfile
🏗️ Work in progress
4.0.1 Inside a python package
🏗️ Work in progress
4.0.2 Packaging python with another language
🏗️ Work in progress
The term “package” is very closely related to the more general concept of “library” in programming, which simply refers to a collection of code elements that can be reused by other programs. In python, the word “package” is somewhat more precise, referring to exactly how a specific unit of code is organised, but no need to worry about these details here.↩︎
But it does not yet verify compatibility with your other installed packages, except at the end of the installation process, where it can give a warning. This is still a somewhat blurry situation, and again something handled more thoroughly by uv.↩︎
For extra speed, and if staying within the conda universe, it is possible to install mamba, which is a drop-in replacement for conda but written in C++ and much faster. Commands are all the same, just replace conda by mamba.↩︎
In the case of venv, you will see that the executables are actually symlinks (symbolic links) to the base python installation, which lightens the environments. Due to this, venv does not allow you to easily create a virtual environment with a different python version, unlike conda and uv.↩︎
Note that each of these numbers doesn’t have to be a single digit: 0.12.254 is a perfectly valid version number.↩︎
This is actually not always advised, and in fact something the poetry project has actively criticised, as it breaks the classical pattern of having a single source of truth. People liked having the __version__ variable because it allows you to know the version of an imported package from the interpreter: you would run import mypackage, and then mypackage.__version__ would give you the version number. However, there is also the possibility to use the importlib.metadata module as such: from importlib.metadata import version; version('mypackage'). In this case, importlib would use the version number indicated in the pyproject.toml. For this reason, running poetry version will not update the __version__ variable in your code if you have it, which has led to some quarrels, and the development of plugins like poetry-bumpversion. If you install this plugin and configure it, running poetry version will again update your __version__ variable.↩︎