python3 -m venv path-to-mynewenv
At least once a year I get a chance to dabble in Python. This time I’m involved in a project where we have developed outlier detection algorithms for district heating networks. To run the Python code and the application built on top of it I have to use a specific version of Python and some specific package versions.
So I started up my Python environment, eager to get going. But instead of investigating the code and the results from the algorithm, I got stuck on installing the correct package versions.
I have been down this road before. Usually it means I have to spend a day googling and following different tutorials to get everything to work. But I have never really understood why. So I thought I would do this a bit more thoroughly and write everything down this time.
Why write this post
I primarily use R to analyze data. After working out package management in Python I have realized that R users are very spoiled by the package management system that is built into R. When I install a package via install.packages("packagename") it just works. This is a very nice feature since most of my work is experimental. I mainly use R to investigate something: manipulate, visualize and model data. This means that the less time I need to spend on setup, the more productive I can be.
Virtual environments are not new to me
Of course, if I want my code to be used in production, for example to schedule a script, I want to make sure that the script doesn’t fail if I change my setup, like updating a package. So I’m familiar with the concept of virtual environments. I have used R packages for this, like renv and packrat, and also Docker. But I only use these when I need them.
Virtual environments in Python
In order to use Python productively, most developers will encourage you to use virtual environments. If you are a data scientist who uses a bundled Python distribution like Anaconda, you might be using a virtual environment without thinking too much about it.
People use virtual environments to isolate projects from each other. In other words, the packages (and package versions) you use in one project can differ from those in another project, and when you open a separate project it shouldn’t depend on the virtual environment of another project. I think of virtual environments as folders where you save all the packages that you use for a project in Python.
venv
The most basic way to create a virtual environment in Python is venv.
Using it is pretty straightforward: run python3 -m venv path-to-mynewenv
This creates a folder in your directory where the packages installed for your virtual environment will be saved.
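Since venv is part of the standard library, the same step can also be done from Python itself. This is only a minimal sketch: the helper name create_env and the throwaway temporary directory are my own assumptions for illustration, and pip is left out on purpose to keep the example quick.

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(parent: Path, name: str = "mynewenv") -> Path:
    """Create a virtual environment folder under `parent` and return its path."""
    env_dir = parent / name
    builder = venv.EnvBuilder(
        # Skip bootstrapping pip so the sketch stays fast; the command-line
        # `python3 -m venv` installs pip by default.
        with_pip=False,
        # Mirror the command-line default: symlinks everywhere except Windows.
        symlinks=(os.name != "nt"),
    )
    builder.create(env_dir)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_env(Path(tmp))
    # pyvenv.cfg records which base Python the environment sits on top of.
    cfg_text = (env_dir / "pyvenv.cfg").read_text()
    print(cfg_text)
```

The printed pyvenv.cfg is a small text file inside the environment folder, which fits the "folder of packages" way of thinking about it.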
What version are we using?
When you open the terminal and type python --version you can get different answers. I found this article really helpful in understanding why this happens. This affects our virtual environment, as venv will inherit your Python version. As it says in the Python documentation:
A virtual environment is created on top of an existing Python installation, known as the virtual environment’s “base”
In other words: venv cannot install or pick a Python version for you; the environment uses whichever Python you ran venv with. To switch between versions you will have to use something like pyenv.
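To see which Python you actually ended up with, you can ask the interpreter itself. A small sketch, where describe_interpreter is a hypothetical helper name of my own; it relies only on the standard sys module:

```python
import sys


def describe_interpreter() -> dict:
    """Collect a few facts about the Python that is running this code."""
    return {
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "executable": sys.executable,
        # Inside a venv, sys.prefix points at the environment folder while
        # sys.base_prefix still points at the base installation.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Note that the in_virtualenv check only recognises venv-style environments. A conda environment is a full Python installation of its own, so the two prefixes are equal there.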
Anyway, to use the environment we created we need to activate it:
source mynewenv/bin/activate
and then we can install packages into it:
pip install pandas numpy
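Since my original problem was getting the correct package versions, it helps to check what actually got installed. A minimal sketch using the standard importlib.metadata module (Python 3.8 or later); installed_version is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version of each package in the active environment,
# or None when the package is not installed there.
for name in ("pandas", "numpy"):
    print(name, installed_version(name))
```

Run inside the activated environment, this reports the versions in that environment rather than in the base installation.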
This works. But I usually want to use a virtual environment in many different projects and not have to install everything again when starting a new project. Also, I want to have control of the version of Python that I’m using.
Anaconda and conda
A popular distribution and platform for working with Python is Anaconda. Anaconda is not only for Python. On its website it says:
Package, dependency and environment management for any language—Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN
In other words, when you install Anaconda, you install a lot.
conda
conda is the package manager built into Anaconda. For Python you use it in much the same way as venv, but instead of creating a virtual environment for each project you can use conda to create environments that you can use across many projects.
It should be noted that I installed Anaconda and conda a long time ago, so I had to update it before getting it to work properly. First I ran conda update conda, then conda install anaconda (not sure why), and lastly I had to run conda update --all to get it to work the way I wanted. This took a while.
But as soon as it worked I was able to create an environment where I can also specify the Python version.
conda create -n pythonds python=3.8
To install packages into the conda environment you run: conda install -n pythonds numpy=1.19.2 pandas=1.2.3
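If you find yourself retyping these commands, one option is to keep the version pins in one place and build the command lines from them. This is only a sketch: the two helper functions are hypothetical names of my own, and the code just prints the commands rather than running conda.

```python
import shlex


def conda_create_command(env_name: str, python_version: str) -> list:
    """Build the argument list for creating an environment with a pinned Python."""
    return ["conda", "create", "-n", env_name, f"python={python_version}"]


def conda_install_command(env_name: str, pins: dict) -> list:
    """Build the argument list for installing pinned packages into an environment."""
    specs = [f"{package}={ver}" for package, ver in pins.items()]
    return ["conda", "install", "-n", env_name, *specs]


# The versions used in this post, kept in a single dictionary.
pins = {"numpy": "1.19.2", "pandas": "1.2.3"}

print(shlex.join(conda_create_command("pythonds", "3.8")))
print(shlex.join(conda_install_command("pythonds", pins)))
```

The lists could be handed to subprocess.run on a machine where conda is installed; printing them first makes it easy to check what would be executed.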
I wanted to use the environment in a Quarto document. To do this I also had to register it with ipykernel.
python -m ipykernel install --user --name=pythonds
The setup that now works for me
At some point we just want things to work. And right now this is what works for me when working with Python.
- I create and manage virtual environments with conda
- If I want to share my virtual environment I create a conda.yml file, which is the equivalent of a requirements.txt file, to specify dependencies
- Lastly, because I use Quarto, I have to register the environment to the Jupyter kernel
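To give an idea of what such a conda.yml file contains, here is a small sketch that generates one as plain text. The helper name environment_file is hypothetical and the pins are the ones used earlier in this post. In practice you would more likely let conda write the file with conda env export, and recreate the environment from it with conda env create -f conda.yml.

```python
def environment_file(name: str, python_version: str, pins: dict) -> str:
    """Return the text of a minimal conda environment file."""
    lines = [
        f"name: {name}",
        "dependencies:",
        f"  - python={python_version}",
    ]
    # One pinned entry per package, in the same name=version form conda uses.
    lines += [f"  - {package}={ver}" for package, ver in pins.items()]
    return "\n".join(lines) + "\n"


text = environment_file("pythonds", "3.8", {"numpy": "1.19.2", "pandas": "1.2.3"})
print(text)
```

A file exported by conda itself usually lists many more packages, and may include channels, since it records everything installed in the environment.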
This setup works for me right now. Crossing my fingers that it will work tomorrow.
Bonus tricks
miniconda
As I mentioned, conda is not restricted to Python. A smaller version of Anaconda, primarily made for Python, is Miniconda:
Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages, including pip, zlib and a few others. Use the conda install command to install 720+ additional conda packages from the Anaconda repository.
The good thing about Miniconda is that you can use conda in the same way. So to create a virtual environment you do exactly the same: conda create -n minienv python=3.8. If I were starting again I would probably restrict myself to Miniconda, but I haven’t really figured out how to run these things separately.
pyenv
With conda you can specify the Python version, but not with venv. If you want to switch Python versions most people will suggest pyenv. You can use pyenv in a similar fashion to venv.