DRC Systems is seeking a DevOps Engineer with an SRE background to implement and manage observability solutions using Datadog. The role involves collaborating with teams to enhance monitoring across various environments and automating configurations.
Position: DevOps Engineer with SRE background

Key Responsibilities
• Implement and manage full-stack observability using Datadog, ensuring seamless monitoring across infrastructure, applications, and services.
• Instrument agents for on-premise, cloud, and hybrid environments to enable comprehensive monitoring.
• Design and deploy key service monitoring, including dashboards, monitor creation, SLA/SLO definitions, and anomaly detection with alert notifications.
• Configure and integrate Datadog with third-party services such as ServiceNow, SSO enablement, and other ITSM tools.

Core Responsibilities
• Design & Implement Solutions: Build and maintain comprehensive observability platforms that provide deep insights into complex systems, incorporating logs, metrics, and traces.
• System Instrumentation: Instrument applications, infrastructure, and services to collect telemetry data using frameworks like OpenTelemetry.
• Data Analysis & Visualization: Develop dashboards, reports, and alerts using tools like Prometheus, Grafana, and Splunk to visualize system performance and detect issues.
• Collaboration: Work with development, SRE, and DevOps teams to integrate observability best practices and align monitoring with business and operational goals.
• Automation: Develop scripts and use Infrastructure as Code (IaC) tools like Ansible and Terraform to automate monitoring configurations and telemetry collection.

Key Skills & Tools
• Observability Tools: Proficiency in monitoring, logging, and tracing tools, including Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, Datadog, New Relic, and cloud-native solutions like AWS CloudWatch.
• Programming Languages: Expertise in languages such as Python and Go for scripting and automation.
• Infrastructure & Cloud Platforms: Experience with cloud platforms (AWS, GCP, Azure) and container orchestration systems like Kubernetes.
• Infrastructure as Code (IaC): Familiarity with Terraform and Ansible for managing infrastructure and configurations.
• CI/CD & Automation: Experience with CI/CD pipelines and automation tools like Jenkins.
• System & Software Engineering: A strong background in both system operations and software development.
• Optimize cloud agent instrumentation, with cloud certifications being a plus.
• Datadog Fundamental, APM and Distributed Tracing Fundamentals & Datadog Demo Certification (Mandatory)
• Strong understanding of observability concepts (logs, metrics, tracing)
• Expertise in security & vulnerability management in observability
• 2 years of experience in cloud-based observability solutions, specializing in monitoring, logging, and tracing across AWS, Azure, and GCP environments.
Info Way Solutions LLC is seeking a skilled DevOps/SRE with expertise in Python automation to join their team in Seattle. The role involves migrating the observability stack and ensuring high availability and performance across infrastructure.
Robotics Technology LLC is seeking a DevOps SRE with extensive experience in Azure Cloud and cloud migration. The ideal candidate will have strong skills in Kubernetes, Azure DevOps, and performance tuning.
RNR IT Solutions, Inc. is seeking a DevOps / Site Reliability Engineer (SRE) in Dallas, Texas, to design and maintain CI/CD pipelines and manage cloud infrastructure. The ideal candidate will have extensive experience in DevOps practices and cloud technologies.
Aicadium is seeking a DevOps/SRE Engineer to operationalize and optimize AI-driven applications and infrastructure. This role requires collaboration with cross-functional teams to ensure high availability and performance of AI products.
Irish Life Group Services Limited is seeking a DevOps/SRE Engineer in New York, NY, to support their transformation towards cloud adoption and innovative solutions. The role involves designing, implementing, and maintaining applications while ensuring compliance with security standards.