Ask.Cyberinfrastructure

Q&A


Running R code in parallel in an HPC environment (3)
How do I request a node with a specific resource (like gpu) in Slurm? (2)
Starting a virtual machine image without knowing the user name or password (3)
How to determine if jobs are dying on their own or from the scheduler? (1)
How can I determine how much memory I need before submitting my GAMESS calculation? (2)
How do I check that all mount points are in place, and mark a node out of service if a mount point is missing before a job is scheduled in Slurm? (3)
How do I get the list of features and resources of each node in Slurm? (3)
Can the ArcGIS application run in an HPC cluster environment? (2)
What are some ways to organize the modules created to manage multiple versions and combinations of compilers, tools, drivers and libraries on a shared cluster? (3)
In a PBS Pro select statement, what's the difference between procs and mpiprocs? (2)
How do I use a Globus file transfer service? (3)
How can I improve the performance of my job that needs to perform many I/O operations with a very large text file? (2)
How do I use DMTCP to create a checkpoint and restart my program? (1)
InCommon Federation for HPC Clusters (3)
Cgroups with MPI process affinity (2)
"slurmstepd: error: Exceeded step memory limit at some point" (3)
How many MPI ranks should I request for a GAMESS calculation? (2)
Why is casting a double to an integer so slow? (5)
Why was Rocks software named "Rocks"? (3)
How can I use SLURM's sacct command to show memory usage statistics for a job that I am running? (6)
How do I install Python 2.7 on my Comet allocation and where can I store data long term? (3)
How can I restart my GAMESS calculation and resubmit? (2)
What are the relative benefits of a stateful vs. stateless cluster configuration? (4)
Transferring multiple TB-sized files between my local cluster and AWS S3 storage (2)
Is there a way to do startup and cleanup tasks with an SGE task array? (2)
I am exploring a parameter space, and need to launch several hundred variants of the same small job. What can I do to ensure the shortest completion time? (2)
How can I run Gaussian09 on a server? (2)
How do you configure SLURM groups to limit access to partitions? (2)
Duo for Multi-Factor Authentication using Bridges (1)
How can I use EasyBuild to manage building multiple versions of netCDF for different compiler and MPI versions? (2)