Ask.Cyberinfrastructure

How can I install Git LFS on a shared campus cluster running SL7?

data-management
git
git-lfs
compiling
missing-libraries
software-installation
#1

Our group uses the local campus cluster for research, with multiple students running data analysis on shared datasets. Even after the cluster's last major upgrade from SL6 to SL7, we still run into problems with what software is available on the system and how it works. So far we have found two:

With the python3.6 module loaded, scikit-learn cannot be used because the mkl.so and svx.so libraries are missing.

There seems to be no easy way to enable or install Git LFS.

We use it to distribute datasets. Is there a way for us to compile and install Git LFS ourselves? And since it could be useful to a wider audience than just our group, what would it take to install it system-wide on a shared campus cluster running SL7?
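
For anyone who wants to check what their own login node provides before asking for an install, here is a minimal sketch in Python. It assumes only that Git finds the LFS extension as a `git-lfs` executable on PATH; the function names are made up for illustration, and the `~/bin` location in the message is just an example of a per-user install directory, not something specific to our cluster.

```python
import shutil
import subprocess


def find_git_lfs():
    """Return the full path of the git-lfs executable on PATH, or None if absent."""
    return shutil.which("git-lfs")


def git_lfs_version():
    """Return the output of `git lfs version`, or None if git or git-lfs is unavailable."""
    if shutil.which("git") is None or find_git_lfs() is None:
        return None
    # stdout/stderr pipes plus universal_newlines keep this compatible with Python 3.6
    result = subprocess.run(
        ["git", "lfs", "version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_lfs_version()
    if version is None:
        # ~/bin is a hypothetical per-user location, used here only as an example
        print("git-lfs not found on PATH; a per-user install (e.g. under ~/bin) may be needed")
    else:
        print("found:", version)
```

If this reports that git-lfs is missing, that is the situation we are in on the cluster right now.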

#2

I would be careful about advocating for Git LFS. If your cluster has a hosted service, great, but for the average user “Git LFS” means the free tier offered by GitHub, Bitbucket, or GitLab. The free tiers limit not only the maximum file size but also the total storage and bandwidth that can be used. It can work for small amounts of large data, but it is not a good suggestion for the substantial datasets we are likely to see in HPC. I formed this opinion from my own usage, and only today stumbled on an article that articulates the same concerns.
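
To make the size concern concrete, here is a rough sketch of how one might check a dataset directory against a provider's limits before committing to Git LFS. The function names are hypothetical, and the limits are parameters on purpose: I am not quoting any provider's actual quotas, so look up the current numbers for whichever service you use.

```python
import os


def summarize_tree(root):
    """Return (total_bytes, largest_file_bytes) over all regular files under root."""
    total = 0
    largest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            total += size
            largest = max(largest, size)
    return total, largest


def fits_quota(root, max_file_bytes, max_total_bytes):
    """True if no single file and no total size exceeds the given (assumed) limits."""
    total, largest = summarize_tree(root)
    return largest <= max_file_bytes and total <= max_total_bytes


def estimated_transfer_bytes(root, clones):
    """Rough bandwidth estimate: every full clone downloads the whole dataset once."""
    total, _largest = summarize_tree(root)
    return total * clones
```

Note that the bandwidth estimate scales with the number of people cloning, so a group of students pulling the same dataset multiplies the transfer even when the storage limit is met.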

#3

@aculich this also seems like a minor issue with libraries, and you probably have a solution in mind and were posting the question to share. What did you learn?