Python Venv

This lab requires a Linux or Mac desktop with full administrator rights. You will be able to measure your success with simple terminal commands to verify if the correct version is installed. Open a text editor and paste code from the examples below. Simply change the environment name to match whatever you are experimenting with and try it for yourself. This guide will help demystify using command line tools and for most users these few commands will be all you need for your daily work. You should allocate about 30 minutes of time to replicate these commands. On your own device you can get familiar with them by installing some other Python versions and modules.

The value of using virtual environments was something I missed when first learning how to build software from scratch. When you perform research in both physical and abstract forms, you spend a lot of time finding out what does not work. Because of this, I wasted a lot of hours reinstalling operating systems, databases and research software. Once I began to understand the value of virtual environments, I reduced the time and frustration involved with setting up the next attempt. Simply delete the folder and make a copy of the last one that worked properly, then rename and activate the new one. The beauty of developing on a Linux desktop is clear in this scenario, because the same commands that work on your local device will also work when you connect to your data warehouse server via secure shell. Development environments should match the target deployment server precisely.

The day-to-day use of the Python programming language is greatly simplified by venv, the virtual environment module built into the standard library. Python is an interpreted language: there is no separate build step, and the interpreter runs your .py file directly. In the most basic setup you simply need Python and a text editor installed on your operating system. Save your code to a file, open a terminal shell, and navigate to the folder where you saved it. Type python, add a space, and then the name of the file you just saved, and it should execute. If you know other scripting languages you will learn to use it quickly.

The compile step happens at the last second, when Python turns the human-readable file into byte code (Lutz, 2013). This happens automatically whenever the source changes, so any future updates you push to end users will just produce another byte code file. This greatly simplifies your life if you are supporting a large number of end users, because modifications can be made by individual teams and do not require a system-wide update. There is no reason to clutter up the main app with stuff only 5% of the company uses.

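If you are using Python 3.2 or later, you should see a newly created ‘__pycache__’ folder as soon as your program imports one of your own modules; Python caches byte code for imported modules, not for the script you launch directly. For a module named example.py the folder will contain a file such as example.cpython-38.pyc, where the tag in the middle records the interpreter version and the ‘c’ in the extension indicates compiled code. The next time the module is imported Python skips the compile step, so startup is a little faster. If you modify the source code or change the Python installation, a new file is generated. This is the file you might send to the end user along with a desktop shortcut, as long as they run the same Python version. The byte code is then executed by the runtime engine known simply as the Python Virtual Machine (PVM).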
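To see the byte code cache without waiting for an import, the compile step can be triggered by hand. The sketch below uses the standard library modules py_compile and importlib.util; the file name example.py and the temporary folder are only placeholders for illustration.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_to_bytecode(source_path):
    """Compile one .py file and return the path of the .pyc file written."""
    # py_compile.compile returns the path it wrote to; by default this is
    # the same cache location the import system would use.
    return Path(py_compile.compile(str(source_path), doraise=True))


with tempfile.TemporaryDirectory() as work_dir:
    source = Path(work_dir) / "example.py"  # hypothetical file name
    source.write_text("print('hello from example')\n")

    pyc_path = compile_to_bytecode(source)
    # cache_from_source only computes the expected cache path, it writes nothing
    expected_path = Path(importlib.util.cache_from_source(str(source)))

    pyc_folder = pyc_path.parent.name
    pyc_name = pyc_path.name
    print("folder:", pyc_folder)
    print("file:", pyc_name)
    print("matches import cache location:", pyc_path == expected_path)
```

The file name printed here includes a tag for the interpreter that produced it, which is why a cached file from one Python version is not reused by another.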

I mostly use Python for exploratory data analysis and creating automations. I start each script from scratch and make sure it can launch unattended, because later a Debian server will trigger it with cron. This makes your applications highly versatile. A job can be set to launch 4 times per day, or to run only when there are new files in /data_raw.
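One way to handle the "only when there are new files" case is to let cron start the script on a schedule and have the script decide whether there is anything to do. This is a minimal sketch under that assumption; the function name new_files_since, the sample file name, and the idea of storing the last run time as a timestamp are all hypothetical choices, not part of the original workflow.

```python
import tempfile
import time
from pathlib import Path


def new_files_since(folder, last_run_timestamp):
    """Return files in folder modified after last_run_timestamp (epoch seconds)."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.stat().st_mtime > last_run_timestamp
    )


# A temporary folder stands in for /data_raw so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as data_raw:
    last_run = time.time() - 60  # pretend the previous run was a minute ago
    (Path(data_raw) / "batch_001.csv").write_text("id,value\n1,10\n")

    pending = new_files_since(data_raw, last_run)
    pending_names = [path.name for path in pending]
    print(pending_names)
```

If the list comes back empty the script can simply exit, so the cron schedule stays simple and the decision logic lives in Python.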

At the enterprise level of software development, it becomes common to have to support multiple versions of software at the same time. In the past I spent a lot of time looking for a bug in some code that I wrote, only to find out that it was just a conflict between modules at the installation level. The local development environment that you will be working with will not have all of Python’s modules installed by default, just the standard library. Eventually you will want to install some extra packages to check them out, and being a wizard with venv will make this easy. If you decide that the new package is not working out, just deactivate the environment and delete the folder. This keeps the core installation sitting on your operating system clean, and you can avoid burning time performing reinstalls.

For this lab we will build a development environment on a Linux workstation running Pop!_OS, which is an Ubuntu-based distribution. These instructions will work on any Debian-based distribution, though the available versions may be different. Apple macOS will have a different install procedure, but the Python commands will be the same in the default Z shell.

Open a terminal

#Find your current Python version  
micah@server1:~$ python -V
Python 3.8.10

#Find out what versions of Python are available in your current apt-cache.
#Quote the pattern so the shell does not try to expand it as a file name.
micah@server1:~$ apt-cache search --names-only '^python3\.[0-9]+$'

Normally a Linux distribution will only contain a couple of Python versions. When you need a version that is older (or newer) than those currently available, just add a repository that carries other Python versions. This one is called deadsnakes, which at first seems suspicious, but remember that this language is named after a famous comedy troupe.

#First we update apt, add the '-y' option to avoid having to 
#answer yes to the "are you sure?" prompt
micah@server1:~$ sudo apt update && sudo apt upgrade -y

#Once that is complete we can add the repository
micah@server1:~$ sudo add-apt-repository ppa:deadsnakes/ppa

#Now we can specify a Python version, e.g. Python 3.9
#remember to grab the matching venv package also
micah@server1:~$ sudo apt install python3.9 python3.9-venv

Your installed version of Python might be different, but for this lab just pull down the version one release before the default in your apt repository. For example, if you have 3.11, you will add 3.10 for this lab.

I create a lot of these environments, so I want to keep them organized in one hidden folder called ‘.venvs’. Later we can use this path in our shebangs and get our apps to automatically launch the .py file with a specific virtual environment. This avoids having to activate the environment in the shell scripts that launch the application. This is important and makes your life easier when creating and administering applications for teams of regular desktop users.
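As a sketch of the shebang idea, the script below names the interpreter of one specific environment on its first line. The path is an assumption that reuses the jupyter_lab_00 environment created later in this lab; a shebang needs the full absolute path, because ‘~’ is not expanded there. The function name describe_interpreter is hypothetical.

```python
#!/home/micah/.venvs/jupyter_lab_00/bin/python
# The shebang above is an assumed path; point it at the python binary inside
# whichever venv the script should run with.
import sys


def describe_interpreter():
    """Report which interpreter and environment are running this script."""
    return {"executable": sys.executable, "prefix": sys.prefix}


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

After marking the file executable with chmod +x, it can be started by its path alone, and the reported prefix should be the environment folder without any activate step.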

#make a .venvs folder
micah@server1:~$ mkdir ~/.venvs

Now we can finally start our experiments. Here, I want to install JupyterLab, which is one of my favorite Python packages. I will later install and test other packages, so I want the environment name to be descriptive so I can find it when I browse the folder later. Below I use my own nomenclature: two zeros indicate that the basic parts are still being researched. Once the first version of my app works I change this to 01, continuing up to 04, which means live production code. Later, when I have multiple versions open at the same time, I can match up the file version and the venv. The purpose is to make your own intuitive system that fits the coding style guide or security rules of your organization. After creating the environment we activate it, and the name of the active environment should then show in parentheses in front of your Bash command prompt.

#make a new venv with the version installed above.
micah@server1:~$ python3.9 -m venv ~/.venvs/jupyter_lab_00

#activate the new venv
micah@server1:~$ source ~/.venvs/jupyter_lab_00/bin/activate
(jupyter_lab_00) micah@server1:~$

#determine the Python version inside the venv:
(jupyter_lab_00) micah@server1:~$ python -V
Python 3.9.9

#determine the pip version inside your venv
(jupyter_lab_00) micah@server1:~$ pip3 --version
pip 21.2.4 from /home/micah/.venvs/jupyter_lab_00/lib/python
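The create step can also be scripted with the standard library venv module, which is handy once you build environments often. The sketch below is one possible approach, not the method used in this lab: it builds a throwaway environment in a temporary folder and reads back the pyvenv.cfg file that marks a folder as a virtual environment. The name demo_00 and the helper make_venv are hypothetical, and with_pip=False is chosen only to keep the example quick.

```python
import tempfile
import venv
from pathlib import Path


def make_venv(env_dir):
    """Create a bare virtual environment and return its pyvenv.cfg as a dict."""
    # with_pip=False skips installing pip; the command line form
    # `python3.9 -m venv` installs pip unless told otherwise.
    venv.create(env_dir, with_pip=False)
    config = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


with tempfile.TemporaryDirectory() as scratch:
    env_dir = Path(scratch) / "demo_00"  # hypothetical environment name
    settings = make_venv(env_dir)
    has_config = (env_dir / "pyvenv.cfg").is_file()
    print(sorted(settings))
```

The ‘home’ entry in that file records which Python installation the environment was built from, which is a quick way to audit old folders in ~/.venvs.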

Now try to deactivate the virtual environment. Let’s make sure it works after turning it off and on again. You get extra points if you reference the character Roy from The IT Crowd. If this works you should see the (jupyter_lab_00) in front of your user name go away in the command prompt.

(jupyter_lab_00) micah@server1:~$ deactivate

Now is a good time to determine if we somehow modified our core Python installation. Reuse the first command we typed; you can probably just hit the up arrow to cycle through your previous commands. Determine the version outside the new virtual environment by opening another terminal, where there should not be anything in parentheses in front of your user name at the command prompt. You should get the same result as the first command.
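The same question can be asked from inside Python, which is useful when a script needs to confirm where it is running. A minimal sketch, assuming only the standard library; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


print("python version:", sys.version.split()[0])
print("prefix:", sys.prefix)
print("inside a venv:", in_virtualenv())
```

Run it once with the environment activated and once in a fresh terminal; the prefix and the final line should change between the two runs.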

Begin installing modules

# Check your Python version
micah@server1:~$ python -V
Python 3.8.10

# Check out the pip version; outside the venv the path should point at the system Python, not at ~/.venvs
micah@server1:~$ pip --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)

#restart the environment
micah@server1:~$ source ~/.venvs/jupyter_lab_00/bin/activate

# initiate the Jupyter install
# you might need to type python3 in your terminal
(jupyter_lab_00) micah@server1:~$ python -m pip install jupyterlab matplotlib pandas

Here we installed JupyterLab, Matplotlib and pandas, but you could install whatever Python package you want to experiment with. The steps before this will become your install script, which will be separate from your launch script. A basic Bash script just chains these commands together in one file. This is also the time to consider how you want to set up your working directories; best practice is to at least separate the in-progress Python files from the data folders. I also add subfolders to data for /raw, /interim and /processed at the very least.
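The folder layout described above can be created in one pass with pathlib. This is only a sketch: the project name my_project and the notebooks folder for in-progress files are hypothetical, while the raw, interim and processed subfolders follow the text.

```python
import tempfile
from pathlib import Path

DATA_SUBFOLDERS = ("raw", "interim", "processed")


def make_project_layout(project_root):
    """Create the working folders and return them in creation order."""
    root = Path(project_root)
    folders = [root / "notebooks"]  # hypothetical home for in-progress files
    folders += [root / "data" / name for name in DATA_SUBFOLDERS]
    for folder in folders:
        # exist_ok=True makes the function safe to run again later
        folder.mkdir(parents=True, exist_ok=True)
    return folders


# A temporary folder keeps the sketch from touching your real home directory.
with tempfile.TemporaryDirectory() as scratch:
    created = make_project_layout(Path(scratch) / "my_project")
    layout = [folder.relative_to(scratch).as_posix() for folder in created]
    print(layout)
```

Keeping raw data in its own folder makes it easy to treat those files as read-only and regenerate everything in interim and processed from them.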

If you followed along with the above commands you should now be able to create and activate a new environment. Launch JupyterLab and specify the port number; this makes your life easier when you have multiple environments activated at the same time. A real world example would be testing if the same data visualization can be created with both Matplotlib and Seaborn.

#first launch jupyter with just matplotlib from before
(jupyter_lab_00) micah@server1:~$ jupyter-lab --port=9876

#in a separate terminal shell build a Seaborn version of the same app
micah@server1:~$ python3.9 -m venv ~/.venvs/jupyter_seaborn_00
micah@server1:~$ source ~/.venvs/jupyter_seaborn_00/bin/activate
(jupyter_seaborn_00) micah@server1:~$ python -m pip install jupyterlab matplotlib pandas seaborn
(jupyter_seaborn_00) micah@server1:~$ jupyter-lab --port=9877


You should now have two browser tabs open with different Python environments running at the same time. Try your own experiments locally: create a working folder and clone a notebook from my GitLab repository. If a notebook you want to check out has import statements at the beginning, you can now create a new environment and install just those specific modules. The terminal shell is a powerful tool, and with these basic commands you can start researching Python on your own.
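Finding out which modules a notebook imports can be automated, since a .ipynb file is plain JSON. The sketch below is a rough helper under a few assumptions: the function name imported_modules and the sample notebook are made up for illustration, lines starting with ‘%’ or ‘!’ are dropped because they are Jupyter magics rather than Python, and cells that still fail to parse are skipped.

```python
import ast
import json


def imported_modules(notebook_json):
    """Return the top-level module names imported by a notebook's code cells."""
    notebook = json.loads(notebook_json)
    modules = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows source as one string or a list of lines.
        if isinstance(source, str):
            lines = source.splitlines()
        else:
            lines = [line.rstrip("\n") for line in source]
        code = "\n".join(
            line for line in lines if not line.lstrip().startswith(("%", "!"))
        )
        try:
            tree = ast.parse(code)
        except SyntaxError:
            continue  # skip cells that are not plain Python
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module.split(".")[0])
    return sorted(modules)


# Hypothetical notebook content, built inline so the sketch is self-contained.
sample_notebook = json.dumps({"cells": [
    {"cell_type": "markdown", "source": ["# Demo"]},
    {"cell_type": "code", "source": [
        "%matplotlib inline\n",
        "import pandas as pd\n",
        "import matplotlib.pyplot as plt\n",
    ]},
    {"cell_type": "code", "source": "from seaborn import heatmap\nimport os"},
]})

found = imported_modules(sample_notebook)
print(found)
```

Treat the result as a starting point for pip install rather than a final list: standard library modules such as os need no install, and a few packages are published under a different name than the module you import.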

Lutz, Mark; “Learning Python”, 2013.
ISBN: 978-1-449-35573-9