Research Computing Infrastructure open positions at Northwestern

Northwestern is hiring for three full-time (hybrid) positions on our growing Research Computing Infrastructure (RCI) team:
• HPC Lead Systems Engineer
• HPC Senior Systems Engineer
• HPC Systems Engineer

These RCI roles work closely with Research Computing Services to support researchers using Northwestern’s High-Performance Computing (HPC) infrastructure. This suite of resources includes Quest, an HPC system with more than 50,000 cores that Northwestern researchers use to make cutting-edge discoveries through computational research and data science.

I am the hiring manager for these RCI roles, so please feel free to contact me directly with any questions.

Our team members come from a spectrum of backgrounds, and we are committed to creating a diverse and inclusive work environment. We offer competitive pay and provide extensive opportunities for skills and career development. At Northwestern, we are proud to provide high-quality health care plans, retirement benefits, and significant tuition discounts.

More information about the RCI roles:

HPC Lead Systems Engineer
As Lead HPC Systems Engineer, you will actively support Northwestern’s HPC systems while leading the RCI team in best practices for developing, implementing, maintaining, and securing HPC cluster systems and solutions for computational research requirements, including AI/ML and data science. As a Lead member of the RCI team, you will work closely with your teammates and the Research Computing Services team to develop a long-term strategy for the evolving Northwestern research enterprise.
Specific Responsibilities:
• Leads implementation of best practices for HPC design, operations, automation, maintenance, and security
• Leads the development of RCI team members’ technical skills through mentoring, instructing, directing, documenting, and coaching
• Leads troubleshooting and diagnostics
• Oversees the installation, maintenance, configuration, and integrity of Northwestern’s research infrastructure
• Develops strong working relationships with our stakeholders and partners, prioritizing excellent customer service
• Continually develops technical skills to meet the evolving needs of research computing, through both independent learning and attendance at conferences and workshops
• Owns relationships with partners who provide technical expertise for specific engagements, developing effective working relationships and overseeing their work
This role offers a hybrid work schedule and takes part in regular 24/7 on-call HPC support rotations.

HPC Senior Systems Engineer
As a Senior Systems Engineer, you will be a member of a team responsible for managing the university’s high-performance computing infrastructure and related platforms. In this role, you will also participate in an on-call rotation and in system architecture discussions, and you will respond to customer requirements and inquiries using the university’s IT services management platform. You will collaborate and partner with internal teams, including the Research Computing Services team, as well as external research-focused groups. Collaboration with partners and internal team members is essential to ensuring that service delivery to the community meets expectations. The role reports to the manager of the RCI team.
Specific Responsibilities:
• Collaborates in the planning, development, and coordination of systems-related operations and projects for current and future High Performance Computing Infrastructure initiatives within our group.
• Provides systems recommendations regarding High Performance Computing Infrastructure, operations, automation, and self-service enablement within Platform Services.
• Constructively participates in collaborative opportunities in support of systems improvements with internal and external organizations.
• Reviews current systems configurations and recommends enhancements.
• Facilitates coordination and a thorough understanding of systems requirements, attends project meetings, creates own meeting notes, creates appropriate ticketing, etc.
• Suggests systems and architecture guidance for IT operations and the broader Northwestern community (as needed) for research services delivered by our team, in alignment with NUIT policies and procedures.
• Maintains a secure access standard for High Performance Computing Infrastructure platforms.
This role offers a hybrid work schedule and takes part in regular 24/7 on-call HPC support rotations.

HPC Systems Engineer
As RCI’s HPC Systems Engineer, you will actively support HPC cluster systems and solutions for computational research requirements, including AI/ML and data science. As the successful candidate, you will bring a growth mindset to the HPC Systems Engineer role and will work closely with your teammates and the Research Computing Services team to provide excellent service and develop strategies for the evolving Northwestern research enterprise.
Specific Responsibilities:
• Implements best practices for HPC configuration, administration, maintenance and operations
• Performs and supports HPC system installation, management, monitoring, performance tuning, diagnostics and troubleshooting, and hardware support as needed
• Works closely with Research Computing Services to facilitate effective HPC support and provide excellent customer service
• Continually develops technical skills to meet the evolving needs of research computing, through both independent learning and attendance at conferences and workshops, then shares those skills with the team
• Develops strong working relationships with RCI’s stakeholders and partners
This role offers a hybrid work schedule and takes part in regular 24/7 on-call HPC support rotations.

Please feel free to pass these job opportunities on to anyone who might be interested. Thanks!