Virtual Environments for your Python Projects

December 1, 2023 | March 19, 2024

Experienced programmers recommend creating a virtual environment for every Python project. Let’s see if we can get comfortable using them.

Python’s popularity in the analytics and data engineering domains stems from its vibrant developer community and the wealth of packages available. However, managing dependencies across projects with different requirements was difficult before Python virtual environments. This post explains why analytics engineers should use virtual environments for their Python projects and how to get started.

What is a Virtual Environment?

A Python virtual environment (venv) acts as an isolated container for Python installs and associated packages. It allows for the independent management of packages, decoupling your project from system-wide installations or other project dependencies. The virtual environment provides stability, reproducibility, and portability.

You can think of a Python virtual environment as the specific laboratory in which you conduct your experiments. Creating one ensures that anyone who comes after you knows how to reproduce your work.

The key mantra for analytics engineers: always use a virtual environment. This practice ensures a stable and reproducible project environment, granting control over package versions and update schedules. It also enables independence from the system-wide Python installation.
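
To make the isolation concrete, here is a minimal sketch of how a script can tell whether it is running inside a virtual environment. It relies on the standard-library attributes sys.prefix and sys.base_prefix; the function name in_virtual_environment is only an illustrative choice, not part of any library.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the Python installation it was created from.
    # Outside a venv the two values are identical.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

Running this before and after activating an environment should show the interpreter path changing from the system-wide installation to the one inside the environment directory.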

Creating a Virtual Environment

The workflow for creating and using a virtual environment in Python is a relatively straightforward process:

  1. Create the virtual environment
  2. Activate the virtual environment
  3. Install packages
  4. (Optional) Create a requirements file with pip freeze
  5. (Optional) Ignore the virtual environment in your git repo

[Figure: flowchart of the virtual environment workflow: CREATE with python -m venv, ACTIVATE with source, INSTALL packages with pip, FREEZE with pip, and IGNORE with .gitignore.]

Virtual Environment Creation

To create your virtual environment, invoke the Python module venv in your project directory. This will create a folder in your project directory that contains all the necessary files to run an isolated install of Python:

# Creates the my_venv directory inside your project
python -m venv my_venv
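
Environment creation can also be scripted from Python. The following is a minimal sketch using the standard-library venv module; it assumes the directory name my_venv used elsewhere in this post, and the function name create_environment is illustrative.

```python
import venv
from pathlib import Path


def create_environment(env_dir: str = "my_venv", with_pip: bool = True) -> Path:
    """Create a virtual environment in env_dir and return its path."""
    env_path = Path(env_dir)
    # venv.create is the standard-library convenience function for creating
    # an environment; with_pip=True also bootstraps pip inside it.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_environment() from the project directory should leave a my_venv folder containing a pyvenv.cfg file alongside the environment's own interpreter.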

Virtual Environment Activation

Once the virtual environment has been created, it needs to be activated so that when you invoke python, the interpreter inside the virtual environment is called rather than the system-wide one.

# MacOS / Unix
source my_venv/bin/activate

# Windows
my_venv\Scripts\activate
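
Activation is mainly a convenience: it adjusts your shell so that python resolves to the interpreter inside the environment. Scripts and schedulers can skip activation and call that interpreter by its path. The sketch below shows where it lives on each platform, assuming the my_venv directory from above; env_python is a hypothetical helper name.

```python
import os
from pathlib import Path


def env_python(env_dir: str = "my_venv") -> Path:
    """Return the path of the Python interpreter inside a virtual environment."""
    # venv places executables in Scripts\ on Windows and in bin/ elsewhere.
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


print(env_python())
```

Running a script with that interpreter directly uses the environment's packages without any activation step.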

Install your packages

Now that the virtual environment has been activated, you can install the modules you need for your project.

# MacOS / Unix / Windows
pip install useful_package
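
If you install packages from a script rather than the shell, it helps to invoke pip through a specific interpreter so there is no doubt about which environment receives the package. This is a minimal sketch under that assumption; pip_install_command and pip_install are illustrative names, and useful_package is the placeholder from the command above.

```python
import subprocess
import sys
from typing import List


def pip_install_command(package: str, python: str = sys.executable) -> List[str]:
    """Build the command that installs a package with a given interpreter's pip."""
    # "python -m pip" runs the pip that belongs to that interpreter, so the
    # package lands in the same environment the interpreter lives in.
    return [python, "-m", "pip", "install", package]


def pip_install(package: str, python: str = sys.executable) -> None:
    """Run the install and raise CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package, python), check=True)
```

With the environment activated, pip_install("useful_package") targets the active environment, because sys.executable is then the environment's interpreter.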

(Reproducibility) Create a requirements file

Creating a requirements file allows you to reproduce the Python environment with pip. It’s easy to create one, and just as easy to use pip to reinstall the same set of packages from it. This requirements file should be added to your version control system.

# Use > to send the results of pip freeze to a text file
pip freeze > requirements.txt

# Use the text file to install dependencies
pip install -r requirements.txt
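
To see roughly what ends up in such a file, the sketch below lists the packages visible to the current interpreter as name==version lines using the standard-library importlib.metadata module. It is only an approximation for illustration: pip freeze applies its own rules, so its output can differ, and frozen_requirements is a hypothetical function name.

```python
from importlib import metadata
from typing import List


def frozen_requirements() -> List[str]:
    """Return sorted 'name==version' lines for packages visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries with missing or broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in frozen_requirements():
    print(line)
```

Inside a fresh virtual environment the list stays short, which is exactly what makes the resulting requirements file specific to one project.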

(Version Control) Ignore the environment in your repo

If you’re working in a version-controlled git repo, you shouldn’t directly commit the contents of the environment. Fortunately, it’s extremely simple to ignore the directory entirely using a .gitignore file.

# Option 1: in the .gitignore at the root of the repo
my_venv/

# Option 2: in a .gitignore inside the venv directory
*
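
The second variant, a .gitignore placed inside the environment directory, is easy to automate. A minimal sketch, assuming the environment directory already exists and is named my_venv; ignore_environment is an illustrative name.

```python
from pathlib import Path


def ignore_environment(env_dir: str = "my_venv") -> Path:
    """Write a .gitignore containing '*' inside env_dir so git skips its contents."""
    # env_dir must already exist; the single '*' pattern matches every file
    # in the directory, including the .gitignore itself.
    gitignore = Path(env_dir) / ".gitignore"
    gitignore.write_text("*\n", encoding="utf-8")
    return gitignore
```

This keeps the ignore rule with the environment itself, so the repo's root .gitignore does not need to know what the directory is called.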

Conclusion

Python virtual environments are indispensable for analytics engineers seeking stability, reproducibility, and portability in their projects. By adopting this practice, you gain control over your dependencies and make it easier for your projects to run the same way on other machines.
