Ask.Cyberinfrastructure

What are the biggest misconceptions about HPC?

I was reading this article https://www.hpe.com/us/en/insights/articles/8-ways-sci-fi-imagines-data-storage-1903.html about different ways sci-fi has imagined data storage, and it occurred to me that this would be an interesting question to pose to the HPC community. Just as print media misrepresents science, or film glamorizes certain aspects of it, what (do you think) are common misunderstandings or views that people have about HPC? For example, we can talk about a supercomputer generally:

  1. Media: A supercomputer is an infinitely massive, complex underground grid that goes on for miles, with flashing lights and complicated command sequences.

  2. Reality: A supercomputer is an air-conditioned server room with a bunch of smaller servers in racks, configured together with software that is installed just like any other software.

  3. Misconception: I have to be a genius programmer to use a compute cluster.

  4. Reality: I can get started with relatively little knowledge, maybe just a small tutorial with instructions to log in, look around, and submit a task.

A good way to think about it is: if there were a movie made about HPC, what would a person’s expectations be, what would they see in the movie, and then what would they read on Wikipedia (and be surprised by)?

A very common misconception I encounter with newcomers is that if they have a large computational problem to solve, all they need is access to HPC. Simply copying over their code and running it on HPC will make it run faster and better, with no additional effort needed. So in a sense, the misconception is that HPC is just a big computer: it is used just like one’s laptop or workstation, with the only difference being that it is infinitely more powerful just because it is big. Some of them are stunned when they see that serial performance on a large many-core machine can be slower than on their own computer. Some even feel cheated because of these false expectations.

A related issue is getting across the concept of nodes. Yes, you can ask for your program to run on four nodes, but if you run the same unmodified code that runs on your laptop, three of your nodes are going to be sitting idle. I usually end up telling people that each node is like your laptop, and if you ask for four nodes you have four laptops to run your code on 🙂
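To make the "four laptops" picture concrete, here is a minimal Python sketch of what it means for code to be node-aware. It is only an illustration under stated assumptions: `share_for_node`, `simulate`, `rank`, and `n_nodes` are hypothetical names, and on a real cluster the rank and node count would come from the scheduler or a library such as MPI rather than from a loop in one process.

```python
def share_for_node(tasks, rank, n_nodes):
    """Return the tasks that node `rank` (0-based) out of `n_nodes` should process."""
    # Round-robin split: each node takes every n_nodes-th task, starting at its rank.
    return tasks[rank::n_nodes]


def simulate(x):
    # Stand-in for one independent unit of real work (hypothetical).
    return x * x


tasks = list(range(10))
n_nodes = 4  # assumption: the number of nodes requested

# Laptop-style code: one process walks the whole list, so extra nodes add nothing.
serial = [simulate(x) for x in tasks]

# Node-aware code: each node would run the same script with its own rank.
# The four ranks are looped over here in a single process purely for illustration.
per_node = [
    [simulate(x) for x in share_for_node(tasks, rank, n_nodes)]
    for rank in range(n_nodes)
]

print(per_node)
```

The point is that nothing divides the work automatically: the program itself has to decide which part of the problem each "laptop" handles, and the results still have to be gathered afterwards.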
