Ask.Cyberinfrastructure

What support does your center provide for containers?

containers

#2

Yes, we use Singularity for containers, especially for TensorFlow and RStudio (Server). There are several containers in the “community” area that everybody can contribute to, and we direct users to Docker Hub and https://singularity-hub.org/ to find existing containers without needing to build their own. So far we have not really needed to help users build their own containers, but I think we will do some examples and webcasts explaining how to do it.


#1

How do you support containers at your research computing center?
Do you provide containers for users?
Do you help users create containers?


#3

Yes - we use Singularity. We have instructions on how to create a Singularity container from Docker Hub, and we pull Singularity images from the NVIDIA GPU Cloud for deep learning workflows such as TensorFlow and PyTorch. If you are curious, we have documentation at:
https://docs.hpc.arizona.edu/display/UAHPC/Containers
Chris
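
As an illustration of the Docker Hub route mentioned above, here is a minimal sketch. It is not taken from the Arizona documentation; the helper name pull_image and the DRY_RUN switch are hypothetical, and the `singularity pull [output file] <URI>` form assumes Singularity 3.x.

```shell
# Sketch: convert a Docker Hub image into a local Singularity image file.
# pull_image and DRY_RUN are illustrative names, not part of Singularity.
pull_image() {
    uri="$1"    # e.g. docker://tensorflow/tensorflow:latest-gpu
    out="$2"    # optional output file name

    # Default name: last path component with ':' replaced by '_', plus .sif
    if [ -z "$out" ]; then
        out="$(basename "$uri" | tr ':' '_').sif"
    fi

    # With DRY_RUN set, only print the command instead of running it.
    ${DRY_RUN:+echo} singularity pull "$out" "$uri"
}

# Example (requires Singularity on the host):
#   pull_image docker://tensorflow/tensorflow:latest-gpu
```

Setting DRY_RUN=1 before calling the function prints the command, which is a convenient way to check the image name before a long download.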


#4

We also use Singularity containers.
Although we allow users to “bring their own containers,” and have some limited documentation at
http://hpcc.umd.edu/help/software/singularity.html on how to do so, I am unaware of anyone actually doing that.

We have a handful of packages (mostly deep learning) which we only provide as containers.
We provide environment modules that load Singularity if needed, set a variable with the path to the container, and add some wrapper scripts to the PATH. The wrapper scripts invoke the appropriate command in the container and handle things like mounting the Lustre filesystem and, on GPU nodes, the NVIDIA drivers. So, for example, for TensorFlow all a user needs to do is:
module load tensorflow
tensorflow myscript.py
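
A wrapper of the kind described above could be sketched roughly as follows. This is not the actual UMD script: the function name, the TENSORFLOW_CONTAINER, GPU_DEVICE, LUSTRE_MOUNT and DRY_RUN variables, the default paths, and the choice to run python inside the container are all assumptions made for illustration. Only `singularity exec`, `--nv` and `--bind` are real Singularity options.

```shell
# Sketch of a "tensorflow" wrapper; all variable names and paths are assumptions.
tensorflow_wrapper() {
    image="${TENSORFLOW_CONTAINER:-/path/to/tensorflow.sif}"  # set by the module
    gpu_device="${GPU_DEVICE:-/dev/nvidia0}"                  # marks a GPU node
    lustre="${LUSTRE_MOUNT:-/lustre}"                         # Lustre mount point

    # Start from "image command args", then prepend options as needed.
    set -- "$image" python "$@"

    # Bind-mount Lustre into the container when it exists on this node.
    if [ -d "$lustre" ]; then
        set -- --bind "$lustre" "$@"
    fi

    # Expose the host NVIDIA driver only when a GPU device is present.
    if [ -e "$gpu_device" ]; then
        set -- --nv "$@"
    fi

    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} singularity exec "$@"
}

# In a real wrapper script the last line would be:
#   tensorflow_wrapper "$@"
```

Building the argument list before the final call keeps the GPU and filesystem handling in one place, so the user-facing command stays as short as the two lines shown above.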


#5

At Stanford, on most of our clusters (Sherlock, Farmshare, SCG4), we support Singularity containers and the “bring your own container” mentality. Our newly released documentation is here, and I also started a small effort called “containershare” that will allow a user to specify a shared space (my scratch in a share folder) via a tool called forward (linked in the instructions for the containershare repository).

I also single-handedly manage Singularity Hub, mentioned by @KrisP, so hopefully it is also useful for providing containers on the fly, built under version control from GitHub repos. I’ll also point others here to Singularity Registry, which is the open source version of Singularity Hub. It lets an institution deploy its own Singularity Hub and, instead of relying on GitHub, push containers directly to it using an authentication token. My hope is that institutions that want to provide containers for their users can run a server with their own Singularity Registry, with containers then also accessible through the shub:// URI. If we then add schema.org definitions for containers to these pages, it would be possible to have containers indexed by Google (akin to Google Dataset Search), and we could actually solve this discoverability problem. More info on that here --> https://vsoch.github.io/2018/schemaorg/

TLDR: let’s work together on both the open source registry server and standards to describe the containers so we can not only support containers, but also their provenance and discoverability. It should be a community effort to work on these tools (and associated documentation) together.


#6

Thank you for this question - we have been working our way through the best approach for supporting containers as well. At Northwestern we use Singularity on our HPC cluster. We provide documentation (https://kb.northwestern.edu/page.php?id=85614) and teach workshops; while interest is high from our user community, it’s still early days for broad adoption.

Singularity-hub is a great resource (thank you @vsoch!) especially for the bioinformatics community. Bioinformatics software has recently had an explosion of dependencies for installation - we’ve seen as many as 900 for a single package - so we point that community to containers whenever possible.

We do build containers in some cases - for example, if a software package doesn’t run on our system. We built a container from scratch for one very complex pipeline, and it was a learning experience. We realized that building containers from users’ home-grown code wasn’t a sustainable support model that we could offer. By providing documentation and teaching workshops, we hope our users can develop the skills to create their own containers.


#7

At UMD we also support Singularity containers on the HPC.
We are a mix of “bring-your-own” and containers built by systems staff; we would like to go more toward the former, but not all our users are comfortable building their own containers. We build some containers for packages that do not install well on our cluster. We typically provide wrapper scripts and setup modules to make things somewhat seamless (some users might not even realize they are using containers); e.g.
module load tensorflow
tensorflow my-script.py
This loads the tensorflow module, which loads Singularity, sets an environment variable with the name of the container image file, and adds a tensorflow script that launches TensorFlow in the container image and passes arguments to it.
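
The environment changes that such a module makes could be written in plain shell roughly as below. This is only a sketch of the effect, not the real modulefile: the function name, the TENSORFLOW_CONTAINER variable and the example paths are assumptions.

```shell
# Shell equivalent of what the module is described as doing (names are assumptions).
load_tensorflow_module() {
    image="$1"        # path to the container image file
    wrapper_dir="$2"  # directory holding the "tensorflow" wrapper script

    # The real module would also load Singularity here, e.g.:
    #   module load singularity

    # Tell the wrapper script which image to launch.
    export TENSORFLOW_CONTAINER="$image"

    # Prepend the wrapper directory once, so repeated loads do not grow PATH.
    case ":$PATH:" in
        *":$wrapper_dir:"*) ;;
        *) PATH="$wrapper_dir:$PATH"; export PATH ;;
    esac
}

# Example with placeholder paths:
#   load_tensorflow_module /path/to/tensorflow.sif /path/to/wrappers
```

Because the wrapper directory comes first in PATH, typing tensorflow resolves to the wrapper script, which then reads the image path from the environment.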