What configurations best support sensitive data in a shared cluster environment?


#2

Setting up a dedicated enclave for computing on protected data seems to be the prevailing way to provide the required protection. This typically includes a separate network, encryption in transit and at rest, tight control of open ports, and 2FA / MFA…

It may be helpful to list current efforts in this area. I list the ones I am aware of purely in the spirit of being helpful; I have no affiliation or connection with these people. Here is the most recent workshop on HPC Security and Compliance, held at PEARC’18:

https://www.rc.ufl.edu/research/events/workshop-pearc18/

It is full of useful information. CTSC also has several presentations:

  • Aug 27, “NIST 800-171 Compliance Program at University of Connecticut”

#1

There is an increasing need to analyze sensitive data using shared cluster resources. Sensitive data is defined here as data that may fall under HIPAA, ITAR, FISMA, FERPA, or similar guidelines. What approaches are in use besides just setting up isolated environments?


#3

There are a few schools of thought on this besides enclaves.

  • Single-use nodes
    Nodes run only one job at a time, reload after each job, and use an encryption key generated at boot. Also use tools like pbs_pam so that non-admins cannot connect to the node.

For shared storage, monitor that users do not change permissions in a way that makes their files readable by anyone other than the owner. Set the umask to 077, etc.
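As a rough illustration of the permission monitoring described above, here is a minimal Python sketch that walks a directory tree and reports anything that grants access to group or other. The function name `find_exposed_paths` and the choice to flag any group/other bit (not just read) are my own assumptions, not something from a particular site's tooling; a real deployment would more likely run this from cron or use the filesystem's own policy engine.

```python
import os
import stat
from pathlib import Path

# Permission bits that grant any access to group or other.
# With a umask of 077, newly created files have none of these set.
GROUP_OTHER_BITS = stat.S_IRWXG | stat.S_IRWXO


def find_exposed_paths(root):
    """Return sorted paths under root whose mode grants group/other access.

    Hypothetical helper for illustration; the root directory itself
    is not checked, only the entries below it.
    """
    exposed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            try:
                mode = path.lstat().st_mode
            except OSError:
                continue  # entry vanished or cannot be inspected; skip it
            if stat.S_ISLNK(mode):
                continue  # symlink permission bits are not meaningful on Linux
            if mode & GROUP_OTHER_BITS:
                exposed.append(str(path))
    return sorted(exposed)
```

Calling `find_exposed_paths("/path/to/project")` (path is a placeholder) returns the list of offending files and directories, which can then be reported or fixed. Note that this only inspects POSIX mode bits; ACLs, if your filesystem supports them, would need a separate check.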

Aggressive off-boarding is also highly recommended.

Depending on your circle of trust, that might be enough, or you could get away with less. It depends on your risk appetite and your DUAs.