I have a 250 TB data set that I share with researchers across the country. What resources are available for storing and sharing my data set?
Hi Julie,
You might be able to leverage the Open Storage Network (assuming the data classification is appropriate). For 250TB you’ll need to contact them directly: https://www.openstoragenetwork.org/get-involved/get-an-allocation/
Stephen Oglesby (CSU)
(Edit): I forgot to include other XSEDE resources as an option. You can view these on this website: XSEDE User Portal | User Guides
Just to add to Oglesby's reply: OSN already lists Globus in its software stack. You can use Globus to share and manage data (very handy).
There are several resources available for storing and sharing large data sets like yours. Here are a few options you may want to consider:
- Cloud-based storage: Cloud storage services like Amazon S3, Microsoft Azure Blob Storage, and Google Cloud Storage provide scalable, cost-effective storage solutions for large data sets. These services also offer features like data encryption, access control, and high availability to ensure that your data is secure and easily accessible.
- Data sharing platforms: There are many data sharing platforms that allow researchers to share and collaborate on large data sets, such as Globus, Figshare, and Zenodo. These platforms often have features like version control, access control, and DOI assignment to make it easy to share and cite your data.
- Academic research networks: Many academic research networks offer data storage and sharing services for researchers, such as XSEDE, Internet2, and Eduroam. These networks often provide high-speed connectivity, security, and support for data-intensive research projects.
- National data repositories: There are several national data repositories that provide long-term storage and sharing of large data sets, such as the National Center for Biotechnology Information (NCBI) and the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Information (NCEI).
Sarya,
Please note that jma is asking about storing and sharing 250 TB of data (this is a large volume of data). Your point 1 is a viable storage solution, and if you couple it with Globus you have a way to store, manage, and share large data. As for the rest of your points, I am not quite sure how they address the question.
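
To give a rough sense of why volume matters here, a back-of-the-envelope estimate of how long it takes to move 250 TB over a network link may help. This is only a sketch: `transfer_hours` is a hypothetical helper, and the link speeds and the 70% efficiency factor are illustrative assumptions, not measurements of OSN, Globus, or any cloud provider.

```python
def transfer_hours(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move size_tb (decimal terabytes) over a link.

    link_gbps is the nominal link speed in gigabits per second.
    efficiency is the assumed fraction of that speed actually achieved (0 < e <= 1).
    """
    if size_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                       # terabytes -> bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed link speeds and efficiency, chosen only for illustration.
    for gbps in (1, 10, 100):
        hours = transfer_hours(250, gbps, efficiency=0.7)
        print(f"{gbps:>3} Gbps: {hours:8.1f} h ({hours / 24:.1f} days)")
```

Even on an ideal, fully used 10 Gbps link, one full copy of 250 TB takes about 55.6 hours, so it is worth choosing storage that sits close to a managed transfer service rather than expecting each collaborator to pull the whole data set ad hoc.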