Ask.Cyberinfrastructure

Custom software stack environment for user (no sudo/root)

research-software
administering-hpc
#1

Does anybody have experience with package managers that can provide a user-managed software stack for HPC or a similar kind of environment (multi-user, multi-node)? Another scenario is a user who is given one or more VMs but cannot have root access, for whatever reason. Here are the key requirements:

  • This needs to be an environment that can be 100% controlled by the user, not by a sysadmin.
  • It needs to be a “package manager” (like apt, yum, fink, brew, pacman) with (mostly) simple one-command installation.
  • It needs to have a gentle learning curve for the user (who is also the “admin” of the software stack).
  • The tool must be able to install to a user-specified location and must NOT require superuser privileges (sudo/su) in the setup/build/install/config processes.

It can have a GUI, but CLI capabilities are much preferred.

Background: Under certain circumstances, a user may choose to create his/her own software stack. Is there a platform that can be recommended for this user? The user has some basic skill in installing and building software, but not much skill in troubleshooting it. Ideally the platform contains many tested recipes for up-to-date software versions, so that the user does not have much work to do.

I notice there are software distributions with a similar philosophy. For example, Conda/Anaconda for Python environments. Then there is also MSYS2, which seems to leverage Arch Linux's pacman (?). And Cygwin. But is there anything like this for a Linux environment? I came across Linuxbrew recently. Any input on your experience with this software management tool? How does it compare to Spack, etc.?

Wirawan

(Question clarified 2019-03-06 because the original wording lacked the term “package manager” and could therefore be interpreted in many ways.)


#2

Conda provides much more than just Python packages. I’ve used conda environments (with Miniconda specifically) to create custom software stacks for various projects across different environments, including multiple HPC clusters. Adding channels such as conda-forge and bioconda provides access to a wide variety of packages. And building your own isn’t too hard.
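As a rough illustration of the Miniconda approach described above, here is a minimal sketch of a root-free setup. The directory names (`MINICONDA_DIR`, `STACK_DIR`), the `run` and `install_conda_stack` helpers, and the choice of `samtools` and `openblas` as packages are assumptions made for the example, not something taken from this thread. By default the script only prints the commands it would run.

```shell
# Hypothetical locations; both live under $HOME, so no root is needed.
MINICONDA_DIR="${MINICONDA_DIR:-$HOME/miniconda3}"
STACK_DIR="${STACK_DIR:-$HOME/envs/mystack}"
INSTALLER="Miniconda3-latest-Linux-x86_64.sh"

# Print each command; actually execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

install_conda_stack() {
    # Download the installer, install in batch mode (-b) into a user prefix (-p),
    # then create an environment at an explicit path using two extra channels.
    run curl -fsSLO "https://repo.anaconda.com/miniconda/$INSTALLER" &&
    run bash "$INSTALLER" -b -p "$MINICONDA_DIR" &&
    run "$MINICONDA_DIR/bin/conda" create --yes --prefix "$STACK_DIR" \
        --channel conda-forge --channel bioconda samtools openblas
}

install_conda_stack
```

Running it with `DRY_RUN=0` performs the installation for real. The resulting environment can then be activated with `source "$MINICONDA_DIR/bin/activate" "$STACK_DIR"`.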


#3

Conda, interestingly, is unable to unload its entries from the $PATH variable on deactivate, which makes switching between environments a bit less convenient than it is with modules or Lmod.
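If a stale environment directory does linger in $PATH, it can be stripped by hand. The following is only a sketch: `path_remove` is a hypothetical helper written for this example, not something conda provides, and the directory used in the usage line is made up.

```shell
# path_remove DIR [PATHSTRING]
# Print PATHSTRING (default: $PATH) with every exact occurrence of DIR removed.
# The body runs in a subshell, so IFS and the globbing setting do not leak out.
path_remove() (
    dir="$1"
    paths="${2-$PATH}"
    result=""
    IFS=':'
    set -f                      # do not expand wildcards while splitting
    for entry in $paths; do
        if [ "$entry" = "$dir" ]; then
            continue            # drop the unwanted entry
        fi
        result="${result:+$result:}$entry"
    done
    printf '%s\n' "$result"
)
```

For example, `PATH="$(path_remove "$HOME/envs/mystack/bin")"` followed by `export PATH` would drop that one directory and leave the rest of the search path in its original order.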


#4

“Building your own [package spec] isn’t too hard”

This is what I was trying to avoid, actually. I added a clarifying statement to the original question, i.e. what packaging system can be easily deployed by end users in a shared HPC or VM environment? Many HPC/VM users probably don’t have the patience to build a “packaging spec” file (be it for Conda, for Linuxbrew, for Debian, etc.).


#5

@lparsons Can you say more about the packages you used in the past? Does Conda have packages for general (non-python) computing? Say, gcc, gdb, MPI, MKL or OpenBLAS, …?


#6

“Can you say more about the packages you used in the past? Does Conda have packages for general (non-python) computing? Say, gcc, gdb, MPI, MKL or OpenBLAS, …?”

I’ve used quite a few bioinformatics packages from bioconda, including samtools, STAR, featureCounts, DESeq2, etc. In addition to the biology-related bioconda channel, there is an extremely useful channel called conda-forge, a community-driven effort to provide conda recipes for all sorts of general-purpose computing tools, including things like OpenBLAS. Finally, there are many other channels, including the Anaconda channel, that offer many other packages.


#7

Users can always install their own packages and use something like modules.sf.net or Lmod ( https://www.tacc.utexas.edu/research-development/tacc-projects/lmod ) to control their environment. Often this is provided by the site, and users can create private modules so they don’t have to set up everything from scratch.
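To make the private-modules idea concrete, here is a minimal sketch that writes a Tcl modulefile into a user-owned directory and, if a `module` command is available in the shell, adds that directory to the module search path. The tool name `mytool`, its version, and the install location under `$HOME/sw` are placeholders invented for the example.

```shell
# Hypothetical directory for personal modulefiles.
MODROOT="${MODROOT:-$HOME/privatemodules}"
mkdir -p "$MODROOT/mytool"

# A small Tcl modulefile; Environment Modules and Lmod can both read this format.
cat > "$MODROOT/mytool/1.0" <<'EOF'
#%Module1.0
## Placeholder modulefile for a tool installed under $HOME/sw/mytool/1.0
set root $env(HOME)/sw/mytool/1.0
prepend-path PATH            $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
EOF

# Only touch the module system if it is actually available in this shell.
if command -v module >/dev/null 2>&1; then
    module use "$MODROOT"
    module load mytool/1.0
fi
```

Unlike editing $PATH by hand, `module unload mytool/1.0` reverses the changes, which is what makes switching between stacks convenient.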

To wrap everything in a conda-like environment and also get modules, you can use something like Spack ( https://spack.readthedocs.io/en/latest/ ). To get an idea of how to get started, look at how I used it to quickly bootstrap an environment for some cloud benchmarking (update it to point at your own Spack or the master tree): https://github.com/brockpalen/benchmark/blob/master/setup.sh

Spack will then set up most things for you, and afterwards you just need to source setup-env.sh to get everything going again.


#8

It’s not quite what you’re asking for, but you can give a non-root user access to Docker and users can then use Docker to install whatever packages are desired. A Berkeley data science class that I worked with has students install Jupyter to provide a Python environment. Then whatever packages are needed can be installed safely in the container.


#9

Spack.

Spack is so great that when I found out about it I teared up.

For the installer: get the code from GitHub, source a script, and type ‘spack install package’. To use a package, type ‘spack load package’.
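Spelled out, those steps might look roughly like the sketch below. The `bootstrap_spack` helper and the default checkout location `$HOME/spack` are assumptions made for the example, and `hdf5` in the usage line merely stands in for whatever package is wanted. The function is only defined here; nothing is downloaded until it is called.

```shell
# Hypothetical checkout location; any user-writable directory works.
SPACK_ROOT="${SPACK_ROOT:-$HOME/spack}"

# Clone Spack once (if not already present) and load its shell integration.
bootstrap_spack() {
    if [ ! -d "$SPACK_ROOT" ]; then
        git clone --depth=1 https://github.com/spack/spack.git "$SPACK_ROOT" || return 1
    fi
    # Sourcing setup-env.sh defines the `spack` command for the current shell.
    . "$SPACK_ROOT/share/spack/setup-env.sh"
}

# Usage (may take a while, since packages are built from source):
#   bootstrap_spack && spack install hdf5 && spack load hdf5
```

In a new login session only the `setup-env.sh` step needs to be repeated before `spack load` works again.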

All installs are self-contained, so if you upgrade the HDF5 it won’t break the IOAPI, NetCDF, CMAQ - whatever - in some other install.

It has all the major bioinformatics packages including all the dependencies in the snp-pipeline (just an example of something with like 12 dependencies.)

You can do permutations of libraries and compilers, although using something other than gcc is less magical. I also had a problem with something that needed a gcc higher than our system gcc.

You should modify the config files to use the system openssl. This is easy.
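One possible shape for that configuration is sketched below. It writes an example `packages.yaml` fragment to a scratch file rather than touching any real configuration. The key names (`externals`, `spec`, `prefix`, `buildable`) follow the layout used by recent Spack releases; older releases used a different layout, so check the documentation for the installed version. The `/usr` prefix and the fallback version string are assumptions.

```shell
# Write the fragment to a scratch file; merge it into ~/.spack/packages.yaml by hand.
OUT="${OUT:-./packages.yaml.example}"

# Ask the system openssl for its version; fall back to a placeholder if it is absent.
if command -v openssl >/dev/null 2>&1; then
    ver="$(openssl version | awk '{print $2}')"
else
    ver="1.1.1"   # placeholder only
fi

cat > "$OUT" <<EOF
packages:
  openssl:
    externals:
    - spec: openssl@${ver}
      prefix: /usr
    buildable: false
EOF
```

With `buildable: false`, Spack is told not to build its own openssl and to use the external one for anything that depends on it.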


#10

You can give a “non-root user” access to run Docker, but that user can easily become root from within the container. The user can also control a daemon process that runs as root and lacks a reasonable control plane for managing security ACLs. This may open up security concerns on many shared systems.
