Paradyme Management is seeking a DevOps/Site Reliability Engineer (SRE) with Secret Clearance to manage and optimize Kubernetes clusters and cloud infrastructure. The role involves collaboration across teams to ensure reliability and scalability of AI solutions.
DevOps / Site Reliability Engineer (SRE) - Secret Clearance Required

Job Locations: US-VA-Tysons
Job ID: 2025-2479
Type: Full-Time

Overview

Paradyme, a CATHEXIS Company, is a rapidly growing government technology leader that puts service first for its customers, its team, and the communities it supports. We harness DevSecOps and Agile development processes to deliver exceptional results for digital transformations. Headquartered in Tysons Corner, VA, Paradyme is set apart by its award-winning culture and its team's deep commitment to service and collaboration with its customers, each other, and the community.

Responsibilities

Paradyme, a CATHEXIS Company, has partnered with an industry leader in enterprise Artificial Intelligence software and is seeking a highly skilled Site Reliability Engineer (SRE) to join our team to manage, monitor, and optimize our clusters on Kubernetes. Together we're accelerating our client's digital transformation by building and deploying data-driven, scalable AI solutions. The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure.

Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes clusters, deploying applications and services on Kubernetes.
Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance.
Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.).
Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins or Kubernetes clusters.
Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent.
Collaborate Across Teams: Work closely with development, services, and operations teams to ensure seamless integration between application development and infrastructure.
Security & Compliance: Ensure all systems follow security best practices and comply with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning.

Requirements

Bachelor's degree (or equivalent) in computer science or a related discipline.
A minimum of two (2) years of experience working with on-premises and off-premises cloud environments.
Experience with AWS, Azure, and/or GCP.
Ability to program (structured and OOP) using one or more high-level languages, such as Python, Java, C/C++, Ruby, and JavaScript.
Experience with distributed storage technologies such as NFS, HDFS, Ceph, and Amazon S3, as well as dynamic resource management frameworks (Apache Mesos, Kubernetes, Yarn).
Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
Agile/Scrum experience.

Physical Requirements

These are the essential physical requirements needed to successfully perform the job.
Sedentary work. Requires sitting up to 8 hours per day. May require lifting up to 5 pounds unassisted.
Fine repetitive motor skills with hands, wrists, and fingers in coordination with eyes.
Hearing, speaking, and vision: Adequate to perform job duties and communicate in person, via video, and by telephone. Includes reading information from printed sources and computer screens.
Other: Work may be performed in an office environment, which may involve frequent contact with staff and the public.
Work may be stressful at times.

EEO Statement

Paradyme, a CATHEXIS Company, is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law. If you are an individual with a disability and would like to request a reasonable accommodation as part of the employment selection process, please contact Paradyme.
Oracle is seeking a Site Reliability Engineer DevOps to join their new Oracle Health team, focusing on automation and product deployment. This remote role requires US citizenship and involves working on large-scale distributed systems.
Oracle is seeking a Senior Site Reliability Engineer / DevOps to join their new Oracle Health team, focusing on product deployment and automation in healthcare technology. This remote position requires US citizenship and offers a competitive salary range.
Capital One is seeking a Senior Software Engineer specializing in DevOps/SRE to enhance cloud operations resilience. The role involves collaborating with Agile teams to develop and support technical solutions in a fast-paced environment.
AutoRABIT is seeking a Senior Site Reliability/DevOps Engineer to enhance and manage their cloud services. This role focuses on automation, reliability, and security within a fast-paced SaaS environment.
Oracle is seeking a Senior Site Reliability Engineer / DevOps to enhance automation and reliability in their new Oracle Health organization. This remote role requires US citizenship and focuses on cloud services and large-scale distributed systems.