HPC Systems Administrator
Excelerate
Bruyères-le-Châtel, Essonne
Are you interested in high-performance computing (HPC) and looking to take your career to the next level?
Salary: €45,000 – €65,000 Dependant on experience
Location: Bruyères-le-Châtel, France
Employment Type: Permanent
We are seeking an experienced and highly motivated HPC System Administrator to join our clients dynamic team. As a European leader in High-Performance Computing, they deliver world-class solutions to solve the most complex scientific challenges. If you’re passionate about cutting-edge technology and are looking for an exciting opportunity to work in a collaborative, high-level environment, then we want to hear from you.
About the Role
In this position, you’ll play a key role in the administration of HPC systems, including supercomputers and storage solutions, contributing to groundbreaking projects. You'll work with a talented team of technicians and engineers specialized in HPC systems, ensuring the continuous optimization and reliability of our systems.
Key Responsibilities
- Administration of HPC systems, including supercomputers and associated storage systems.
- Install, configure, and optimize software across several thousand computing nodes.
- Perform regular software maintenance operations.
- Implement high availability solutions (HA, Pacemaker, Corosync).
- Develop automation procedures using scripts (Bash, Python).
- Write technical documentation, including operating procedures (Wiki).
- Analyze, diagnose, and resolve production incidents.
- Handle customer support tickets, either resolving or escalating to L3 support as necessary.
- Participate in technical escalation procedures with internal and partner teams.
- Provide Level 1 and Level 2 support for software stack based on CentOS.
- Participate in on-call rotations (approximately one week per month).
Desired Skills & Experience
- At least 5 years of experience in Linux system administration, particularly in HPC environments.
- Mastery of GNU/Linux HPC systems (RedHat, CentOS, or others).
- Experience with Lustre File System and related technologies.
- Familiarity with networking (InterConnect, Infiniband, Ethernet, RoCE).
- Experience with containers (Docker, OpenStack) and orchestration tools (Puppet, Ansible).
- Scripting skills (Shell, Python, Perl).
- Knowledge of key Linux services (DNS, DHCP, Web, FTP, authentication, deployment management).
- Familiarity with monitoring tools like Nagios.
- Hands-on experience with hardware components, including network switches, X86 servers, and storage (DDN, ClusterStor, etc.).
Languages:
- Fluent in English (French is a plus).
If you’re excited to join a high-performing team and want to make a real impact in the world of HPC, apply today.