How do I run Apache Spark in an HPC environment?



Spark is typically run atop HDFS in a Hadoop cluster (with a name node, data nodes, …). How do I run Spark in a typical HPC environment, i.e., one equipped with a job scheduler, a parallel filesystem, and a high-performance network fabric?

CURATOR: Kristina Plazonic (KrisP)


There is a Remote Direct Memory Access (RDMA) package for Apache Spark, developed by the Network-Based Computing Laboratory at Ohio State University (OSU), that is designed to run Spark on HPC systems with high-performance interconnects such as InfiniBand. See the project's website for more details.
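Independently of which Spark build you use, a common pattern on HPC systems is to launch a Spark standalone cluster inside a batch allocation and submit the application to it from within the job. The sketch below assumes Slurm, a site-provided `spark` module, and Spark 3.x script names (`start-worker.sh`); the module name and paths are assumptions you would adapt to your site.

```shell
#!/bin/bash
#SBATCH --job-name=spark-standalone
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:30:00

# Hypothetical module name; adjust for your site, or set SPARK_HOME directly.
module load spark
export SPARK_HOME=${SPARK_HOME:-/opt/spark}

# Run the standalone master on the first allocated node.
MASTER_HOST=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export SPARK_MASTER_HOST="$MASTER_HOST"
"$SPARK_HOME/sbin/start-master.sh"

# Start one worker per allocated node, pointed at the master.
srun --ntasks="$SLURM_NNODES" --ntasks-per-node=1 \
    "$SPARK_HOME/sbin/start-worker.sh" "spark://$MASTER_HOST:7077" &
sleep 15   # give workers time to register with the master

# Submit a bundled example application to the standalone cluster.
"$SPARK_HOME/bin/spark-submit" \
    --master "spark://$MASTER_HOST:7077" \
    "$SPARK_HOME/examples/src/main/python/pi.py" 100

# Tear the cluster down before the allocation ends.
"$SPARK_HOME/sbin/stop-master.sh"
```

Because the parallel filesystem (e.g., Lustre or GPFS) is visible from every node, applications can read input with ordinary `file://` paths instead of HDFS.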