PSR Associates is a consulting and talent solutions firm that connects qualified IT professionals with great opportunities. Whether you're looking for a contract or permanent position, we can help you find the right fit for your skills and experience. We have a team of experienced recruiters who know the IT industry inside and out, and we work with you every step of the way to ensure a smooth and successful transition.

PSR Connecting Talent, Crafting Success.

Site Reliability Engineer (Azure SRE)
Remote
Long-term, multi-year engagement

Description:
This position is for an Azure Site Reliability Engineer. We are looking for a skilled engineer who is willing to learn new technologies quickly and often. We are looking for expertise in setting up logging, monitoring, and alerts, and experience in writing IaC (Infrastructure as Code) and scripts. Must be willing to work as part of an on-call schedule and assist with troubleshooting/debugging VMs. This role will be part of a Hybrid Cloud Services team providing operational support for the Azure cloud platform. The role will implement hybrid cloud solutions using automated tools.

Important Note: Candidates MUST either possess an active US “Public Trust” clearance certification or be able to attain one within 60 days of starting this assignment.

Requirements:
This Azure SRE position requires 7+ years of experience with the following:
• **Programming & Scripting:**
  • Proficiency in languages like Python, Go, Java, or Ruby.
  • Scripting skills with Bash or PowerShell.
• **Systems Administration:**
  • Expertise in Linux/Unix systems.
  • Knowledge of Windows Server environments.
• **Cloud Platforms:**
  • Experience with cloud services such as IBM Cloud, AWS, Azure, or Google Cloud Platform.
  • Understanding of cloud-native concepts and services (e.g., Kubernetes, Docker).
• **Infrastructure Management:**
  • Skills in configuring and managing servers, databases, and networking components.
  • Experience with Infrastructure as Code (IaC) tools like Terraform or Ansible.
• **Monitoring & Observability:**
  • Familiarity with monitoring tools such as Prometheus, Grafana, or IBM Cloud Monitoring.
  • Experience with log management and analysis tools like ELK Stack or Splunk.
• **Performance Tuning:**
  • Expertise in optimizing system performance and reliability.
  • Knowledge of performance testing and tuning for applications and infrastructure.
• **Networking:**
  • Understanding of networking concepts (e.g., TCP/IP, DNS, HTTP/HTTPS).
  • Experience with load balancing, firewalls, and VPNs.
• **Security:**
  • Knowledge of security best practices and compliance standards.
  • Experience with vulnerability management and threat detection.

Additional Position Requirements:
• Bachelor’s Degree or higher.
• Minimum 7 years of work experience.
• Public Trust security clearance, or the ability to obtain one.
• Off-shift work that includes evenings, weekends, and on-call support.
• Proactively monitoring an Incident and Task Ticketing Queue.
• Ability to update tickets with appropriate technical detail and communicate details effectively.

Preferred Skills:
• Experience solutioning, implementing, and providing support for VMs in an Azure hybrid cloud environment.
• Experience with cloud native components such as Networking, Firewalls, Peering, Security Groups, Availability Zones, Storage, Serverless, Load Balancers, Containerization, System Administration, Backups, patching, etc.
• An understanding of how automation is integrated with a multi-cloud, native tool environment.
• Experience implementing, managing, and monitoring identity, governance, storage, compute and virtual networks within cloud platforms.
• Experience configuring Azure, AWS, and Google Cloud native monitoring tools and their integration with client application environments.
• Experience performing troubleshooting and root cause analysis to expedite incident resolution and acting as an escalation point for Tier 1 and Tier 2 support members.
• Familiarity with the implementation of highly available cloud services through Availability Zones and Auto Scaling.
• Experience with tools such as Azure CLI, PowerShell, AWS CLI, etc.
• Familiarity with DevOps Tools such as Puppet, Ansible, Terraform and/or familiarity with a programming language such as Python.
• Experience with VMware Virtualization and ServiceNow.

Nice to haves:
• Experience maintaining Gold Images using Azure Image Builder and patching using Azure’s native patching solution.
• Experience with Azure resources and services such as networking, Virtual Machines, security groups, peering, availability zones, storage, load balancers, backups, Azure DNS, etc.

Preferred Certifications:
• CompTIA Linux+
• RHCSA or RHCE
• Microsoft Azure Administrator

Other Skills/Requirements:
• Microsoft Excel
• Microsoft Word
• Excellent verbal and written communication skills
• Must be US Citizen
• Candidate must be able to obtain a current US Federal Public Trust clearance (or higher)
Job Type
Full-time role
Skills required
Azure, Python, Go, Java, Kubernetes
Location
Location not specified
Salary
No salary information was found.
Date Posted
October 10, 2024
PSR Associates is seeking an Azure Site Reliability Engineer (SRE) for a long-term remote engagement. The ideal candidate will have extensive experience in cloud services, scripting, and systems administration.