How do I run Apache Spark in an HPC environment?

KrisP · March 16, 2018, 1:09am

Spark is typically run on top of HDFS in a Hadoop cluster (with name node, data nodes, …). How do I run Spark in a typical HPC environment equipped with a job scheduler, a parallel filesystem, and a high-performance network fabric?

raminder · June 22, 2018, 10:59am

The Network-Based Computing Laboratory at OSU develops an RDMA (remote direct memory access) for Apache Spark package for running Spark on HPC systems. See http://hibd.cse.ohio-state.edu/ for details.
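
Separately from that package, one common pattern is to start Spark's standalone mode inside a scheduler allocation and read data from the parallel filesystem through file:// paths instead of HDFS. The sketch below only shows the planning step. It assumes you already have the expanded list of hostnames for your job (how you get it depends on your scheduler) and that the shared filesystem is mounted at the same path on every node. The helper names plan_standalone_cluster and shared_fs_uri, the host names, and the /scratch path are hypothetical; 7077 is the default standalone master port.

```python
def plan_standalone_cluster(hostnames, port=7077):
    """Pick the first allocated node as the Spark master and the rest as workers.

    `hostnames` is assumed to be the already-expanded node list of the job.
    With a single node, that node acts as both master and worker.
    """
    if not hostnames:
        raise ValueError("need at least one allocated node")
    master = hostnames[0]
    workers = list(hostnames[1:]) or [master]
    return f"spark://{master}:{port}", workers


def shared_fs_uri(path):
    """Build a file:// URI for an absolute path on the shared filesystem.

    Spark can read file:// paths as long as the same path is visible
    on every worker node, which a mounted parallel filesystem provides.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    return "file://" + path


if __name__ == "__main__":
    # Hypothetical node names standing in for a real job allocation.
    master_url, workers = plan_standalone_cluster(["node01", "node02", "node03"])
    print("master URL:", master_url)
    print("workers:   ", workers)
    print("input URI: ", shared_fs_uri("/scratch/myproject/input"))
```

You would then start the master and worker daemons on those nodes with the scripts in Spark's sbin directory, and pass the master URL to spark-submit via --master or to SparkSession.builder.master(...) in your application.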