Research groups often hold data that is sensitive in some way: for example, gambling records, data on educational interventions (which concern minors), or medical data. A state agency may, for instance, contract a research group to carry out analyses of a regulated activity such as gambling. Even when de-identified (i.e. each person is assigned a unique random identifier by the regulating agency), such data is sensitive and needs to be protected, both in transit and at rest. Such data can also be very large, in excess of 1 TB, covering millions of subjects and billions of events, so the analysis may not fit on a single machine.
So here are some more concrete questions:
- What software do you use to encrypt data? What are the pros and cons of each package in terms of availability, cost, ease of use, and compatibility with analysis software?
- Are there encryption solutions at the hardware level? What are the pros and cons of hardware vs. software encryption in terms of cost, speed, etc.?