Ask.Cyberinfrastructure

How do I run Apache Spark in an HPC environment?

spark

#1

Spark is typically run atop HDFS in a Hadoop cluster (with a name node, data nodes, …). How do I run Spark in a typical HPC environment equipped with a job scheduler, a parallel filesystem, and a high-performance network fabric?

CURATOR: Kristina Plazonic KrisP


#2

The Network-Based Computing Laboratory at Ohio State University has developed an RDMA (remote direct memory access) package for Apache Spark, designed to run Spark on HPC systems over high-performance interconnects such as InfiniBand. Visit http://hibd.cse.ohio-state.edu/ for more details.
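
A common pattern, with or without the RDMA package, is to start a standalone Spark cluster inside a scheduler allocation and read data from the parallel filesystem instead of HDFS. Below is a minimal sketch for Slurm; the module name, ports, paths, and application name are assumptions (site-specific), and the worker script is `start-worker.sh` in Spark 3.x (`start-slave.sh` in older releases).

```shell
#!/bin/bash
#SBATCH --job-name=spark-cluster
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --time=01:00:00

module load spark   # assumed module name; varies by site

# Start the Spark master on the first node of the allocation.
MASTER_HOST=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export SPARK_MASTER_HOST="$MASTER_HOST"
"$SPARK_HOME"/sbin/start-master.sh

# Start one worker per allocated node, pointing at the master
# (default standalone master port is 7077).
srun --ntasks="$SLURM_NNODES" --ntasks-per-node=1 \
    "$SPARK_HOME"/sbin/start-worker.sh "spark://${MASTER_HOST}:7077" &
sleep 10   # give the workers time to register

# Submit the application; input lives on the parallel filesystem
# (e.g. Lustre/GPFS). Application and path are hypothetical.
"$SPARK_HOME"/bin/spark-submit \
    --master "spark://${MASTER_HOST}:7077" \
    my_app.py "/scratch/$USER/input"
```

Some sites wrap this pattern in helper tools (e.g. "spark-on-slurm" style scripts), so check your center's documentation before rolling your own.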