RedLine Performance Solutions is seeking a Senior HPC Systems Administrator to manage and support both on-premise and cloud-based HPC clusters. The role requires extensive experience in Linux systems, HPC environments, and strong security practices.
RedLine Performance Solutions (RedLine) has been in the HPC solutions engineering services business for over 25 years and is consistently determined to keep the "bar of excellence" high for new hires. This enables RedLine to accomplish what other firms cannot and promotes a high level of staff retention. We offer services ranging from full life cycle HPC systems engineering to remote managed services to HPC program analysis.

RedLine is looking for a Senior High Performance Computing (HPC) Systems Administrator to join our team. The administrator will provide support and administration for a large on-premise HPC cluster and a small cloud-based HPC cluster. The Senior HPC Systems Administrator will be an experienced individual with a strong background in security, Linux, HPC, configuration management, systems automation, and networking.

Job Details:
• This position involves mission-critical monitoring and maintenance and will require off-hours support in a team rotation.
• U.S. citizenship and the ability to obtain a Public Trust clearance are required to apply.
• The preference is for the candidate to be in the Phoenix, AZ area; however, the position can be remote with the possibility of some travel.
• This full-time position includes a comprehensive benefits package featuring paid time off, a 401(k) match, health insurance, and a full range of additional benefits.
Job Responsibilities:
• Provide HPC cluster administration using technologies such as HPCM, Lustre, Slingshot, Cray OS, and Slurm
• Engage with the customer to identify the needs and user stories to build enhancements and upgrades for the HPC clusters
• Work with configuration management solutions to develop Ansible playbooks to support image generation and server support
• Work with version control systems to perform and review Git pull requests from the team to ensure that the cluster support follows best practices
• Update and expand existing systems monitoring capabilities
• Develop automation tools for cluster administration
• Participate in resource optimization and job scheduling software and policies
• Support HPE-based Cluster Management solutions
• Provide technical support to researchers using HPC resources, troubleshoot problems, and develop appropriate computational strategies

Job Requirements:
• Minimum of 7 years SLES, RedHat and CentOS Linux system administrator experience in an HPC environment
• Experience with schedulers/batch systems (e.g., SLURM, PBS, LSF)
• Experience with managing parallel and cluster file systems (e.g., GPFS, Lustre)
• Network management experience, including in an HPC context (e.g., InfiniBand, OmniPath)
• Demonstrated ability to configure, deploy, and manage a major system area such as batch system, network, data storage, backup system, database system, or distributed computing
• Scripting experience (e.g., bash, Python, Perl)

Preferred Skills:
• Experience supporting HPC cloud environments (e.g., Azure)
• Server provisioning and image management
• Experience with Lmod/Lua
• Experience with MPI technologies
• One of the ISC2 certifications (e.g., CISSP, SSCP) or Security+ certification
• Experience integrating applications with cloud provider software stack
KPMG is seeking a Temporary Systems Engineer, Systems & Infrastructure Administrator who is bilingual in English and Spanish to provide support for server infrastructure in Austin, Texas. This role involves administration, maintenance, and troubleshooting of enterprise applications running on Windows and Linux operating systems.
KPMG is seeking a Temporary Systems Engineer to provide administration and support for server infrastructure in a bilingual (English/Spanish) role. This position is based in Denver, Colorado, and requires experience with Windows and Linux operating systems.
The Systems Administrator II at Arandell is responsible for managing server technologies to ensure service availability and system performance. This on-site role requires strong analytical and communication skills, along with proficiency in various technologies.
KPMG is seeking a Temporary Systems Engineer, Systems & Infrastructure Administrator who is bilingual in English and Spanish to support server infrastructure for enterprise applications. This role involves administration, maintenance, and troubleshooting of systems in a large enterprise environment.
SAIC is seeking a Top Secret cleared Systems Engineer to serve as a Systems Administrator, focusing on the operation and maintenance of a US Navy enterprise transport network. The role requires expertise in IT networks and a commitment to ensuring 24/7 availability and reliability.