Requests to set up virtual environments (Python, containers, OS-level, for example) seem to be increasing in frequency. A workshop in which I recently participated required attendees to create a virtual environment for the interactive portion. Are there certain indicators that suggest a virtual environment would be beneficial for a given problem? And how does one determine the best type of virtual environment to launch?
What qualifies as a virtual environment, and when is it appropriate to create a virtual environment?
@toreliza We use virtual environments for Python and Perl. We provide a minimal set of common add-ons for Python (e.g., numpy, scipy), but our docs tell users to use the “virtualenv” command to create a world where they have additional packages and/or newer versions.
https://public.confluence.arizona.edu/display/UAHPC/Using+and+Installing+Python
covers much of the “how to”, and some of the why. We’ve had bad experiences updating packages, as newer versions sometimes break existing workflows or are just plain buggy. With virtualenv, User A can have the latest version of XXX she needs, while User B continues happily with the older version.
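To make the User A / User B idea concrete, here is a minimal sketch using Python's standard-library venv module rather than the "virtualenv" command our docs describe (the two behave similarly for this purpose, but this is a swap I am making for illustration). The function names create_env and pip_install and the directory layout are hypothetical, not something taken from the linked page.

```python
import os
import subprocess
import sys
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create an isolated environment in env_dir and return its interpreter path.

    create_env is a hypothetical helper name used only for this sketch.
    """
    # Mirror the command-line default: symlink the interpreter except on Windows.
    venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"


def pip_install(python: Path, *requirements: str) -> None:
    """Install packages into the environment that owns this interpreter."""
    # Calling pip through the environment's own interpreter keeps the install
    # inside that environment and away from the shared site-wide packages.
    subprocess.run([str(python), "-m", "pip", "install", *requirements], check=True)
```

Each user would call create_env on a directory of their own and then pip_install whatever version they need; nothing installed in one environment is visible from the other, which is why upgrading a package for one person cannot break someone else's workflow.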
We have a similar setup for perl (perl-virtualenv) documented at
https://public.confluence.arizona.edu/display/UAHPC/Using+and+Installing+Perl
I create and use virtual environments and containers liberally. Each project benefits from having a separate and managed software environment, in my opinion. Integrating a project with the complete global software stack can wait until it is needed at some point.
Disk space is cheap and ensuring that things work smoothly is worth the extra space that either a virtual environment or container will take.
I’d love to know what you use. Spack? EasyBuild? Apptainer? We’re in the process of exploring those (and more) but are having trouble determining when one would be better than another. And, not that any topic can be reduced to an extremely basic level, but the documentation and examples seem to expect a level of technical acumen that might overwhelm someone who just wants to experiment a bit. Any resources you found helpful would be welcome!
I’ve used pretty much all the variations at different times. I’m not actively supporting users at my institution currently (my role is more about connecting users with resources and filling in other gaps).
Generally, Spack is my slight preference, but I typically check which tool already supports the applications I need.
EasyBuild has experimental support for building container images, so it could be an excellent place to experiment with both parts.
I agree with the difficulty of finding intermediate tutorials and that it would be very good if that gap were addressed (it might even be something I should do to refresh my skills).
My advice is to start experimenting. Take an application you are familiar with building and try to deploy it in all the possible ways. A cool thing about EasyBuild and Spack is that you don’t have to deploy globally. You can configure them to build and deploy in directories of your choosing. For instance, when I am working on a cluster and I’m missing software, I use Spack to build the needed software stack in a directory with enough space. (NB: this is probably the first problem users will have if you don’t configure it for them, since the default is usually the home directory, which is small on most clusters.)
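On the "directory with enough space" point, a rough sketch of how one might check candidate locations before pointing a build tool at one of them. This is plain Python and is not part of Spack or EasyBuild; the function name pick_build_root and the threshold are hypothetical, and which directories to try is site-specific.

```python
import shutil
from pathlib import Path
from typing import Iterable, Optional


def pick_build_root(candidates: Iterable[str], min_free_gib: float) -> Optional[Path]:
    """Return the first existing candidate directory with enough free space.

    pick_build_root is a hypothetical helper; candidates are tried in order,
    so list the preferred location (e.g. scratch or project space) first.
    """
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if not path.is_dir():
            continue  # skip locations that do not exist on this cluster
        free_gib = shutil.disk_usage(path).free / 2**30
        if free_gib >= min_free_gib:
            return path
    return None  # nothing suitable; better to stop than fill the home directory
```

Listing a scratch or project directory ahead of "~" means the home directory is only used as a last resort, and a None result is a prompt to ask for more space before starting a long build.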