What You'll Do
- Team Leadership and Development
  - Lead and mentor 5+ direct managers and 20–30 indirect reports across cloud operations and systems engineering functions.
  - Build a team culture of accountability, urgency, and client ownership.
  - Support overall performance management and long-term career development practices.
  - Act as an escalation point for technical and operational blockers impacting delivery or customer satisfaction.
- Operational Excellence & Service Delivery
  - Drive improvements in incident response, ticket handling, change management, and patch compliance.
  - Standardize runbooks, monitoring, escalation paths, and documentation across client environments.
  - Identify and track key operational metrics such as MTTR, SLA adherence, and customer satisfaction.
  - Partner with internal teams to create more proactive service models that anticipate client issues before escalation.
- Strategic and Organizational Growth
  - Collaborate with leadership to expand technical capabilities and develop new professional service offerings.
  - Evaluate emerging technologies and trends to guide innovation within the team’s technical practices.
  - Support organizational growth by creating scalable frameworks for service delivery and team expansion.
  - Participate in strategic planning sessions to align technical direction with business objectives.
- Cross-Functional Collaboration
  - Collaborate with other departments to ensure alignment between professional services and broader business goals.
  - Partner with the Security Director on shared concerns such as incident containment, vulnerability remediation, and tooling integration.

What You'll Bring
- Proven leadership experience with technical operations teams in a managed services or MSP context.
- Deep knowledge of cloud infrastructure in AWS, Azure, and GCP environments.
- Familiarity with infrastructure-as-code tools like Terraform, Ansible, and GitHub/GitLab pipelines.
- Strong communication skills with the ability to manage both internal teams and client expectations.
- High emotional intelligence and situational awareness during client escalations and internal performance issues.
- Experience leading operational maturity or ITSM process rollouts (e.g., incident/change/problem management).
- Familiarity with SRE principles, but adaptable to operationally heavy environments.
- Experience with metric and KPI management.
- 8+ years of technical leadership experience, ideally within a managed services or multi-client environment.
- Proven success in scaling technical organizations and driving operational excellence in a professional services environment.
- Experience managing key operational metrics such as utilization, margins, and capacity.

Bonus Points
- Direct experience leading cloud-focused teams or organizations.
- Background in customer-facing roles, with experience in client escalations or high-level technical discussions.
- Experience leading operational maturity or ITSM process rollouts (e.g., incident/change/problem management).
- Familiarity with SRE principles, but adaptable to operationally heavy environments.
- Relevant certifications in cloud platforms (AWS, Azure, GCP) or IT frameworks (ITIL, TOGAF) are preferred.

Job Type
Remote role
Skills required
Azure
Location
United States
Date Posted
May 20, 2025
The Director of Site Reliability Engineering at Coalfire will lead a team responsible for managing client infrastructure across AWS, Azure, and GCP, focusing on operational excellence and team development. This role emphasizes technical leadership, service delivery improvements, and cross-functional collaboration.