Tredence Inc. is seeking an AI/MLOps Architect to lead AI Observability initiatives, focusing on the design and delivery of an Azure-based observability platform. This role requires strong leadership in managing a team of engineers and expertise in AI and cloud architecture.
Dear Candidate,

Job Title: AI Observability Architect / Delivery Lead
Location: Plano, Texas (Onsite Required)

Role Summary

We are seeking a visionary and execution-focused Architect / Delivery Lead to spearhead our AI Observability initiatives. This is a critical leadership role responsible for the design, development, and delivery of our next-generation, Azure-based observability platform. You will lead a talented team of engineers, leveraging Arize AI to build robust telemetry and monitoring solutions specifically designed for the unique challenges of modern AI systems. The ideal candidate is a seasoned technical leader with a passion for people management, a deep understanding of cloud architecture, and a forward-thinking perspective on the future of AI. A key focus of this role will be establishing thought leadership and pioneering solutions for the observability and monitoring of complex, Agentic AI systems.

Key Responsibilities

• People Leadership:
  • Lead, mentor, and grow a high-performing team of software and MLOps engineers, fostering a culture of innovation, collaboration, and accountability.
  • Conduct regular performance reviews, provide constructive feedback, and support the career development of your team members.
  • Manage team capacity, resource allocation, and day-to-day operational activities.
• Technical Architecture & Strategy:
  • Define the end-to-end technical architecture for the AI Observability platform on Microsoft Azure, ensuring it is scalable, reliable, and secure.
  • Serve as the subject matter expert on AI telemetry, leveraging Arize AI to its full potential for model performance monitoring, drift detection, and data quality validation.
  • Develop novel architectural patterns and best practices for monitoring and observing multi-step, autonomous Agentic AI systems, including tracking agent state, tool usage, and decision-making processes.
• Delivery & Backlog Management:
  • Own and manage the team's product backlog, working closely with product managers, data scientists, and other stakeholders to define requirements and user stories.
  • Prioritize features, plan sprints, and drive the execution of the product roadmap using Agile methodologies.
  • Identify and remove impediments, manage risks, and ensure the timely delivery of high-quality solutions.
• Thought Leadership & Innovation:
  • Stay at the forefront of industry trends in AI Observability, MLOps, and especially Agentic AI.
  • Champion new ideas and technologies, influencing the strategic direction of our AI platform.
  • Evangelize the importance and best practices of AI observability across the organization through presentations, documentation, and internal workshops.

Required Qualifications (Must-Haves)

• Experience: A minimum of 10 years of experience in software engineering, MLOps, or cloud architecture, with at least 4 years in a technical lead, architect, or management capacity.
• Location: Must be based in or willing to relocate to Plano, Texas. This is a full-time, onsite position.
• People Management: Proven experience in direct people management, including hiring, coaching, and performance management.
• Cloud Expertise: Deep, hands-on experience designing and delivering large-scale, cloud-native platforms on Microsoft Azure.
• AI/ML Acumen: Strong understanding of AI/ML concepts, the MLOps lifecycle, and the challenges associated with deploying and monitoring models in production.
• Agile Leadership: Expertise in backlog management, sprint planning, and leading development teams using Agile methodologies (Scrum/Kanban).
• Communication: Excellent verbal and written communication skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.

Preferred Qualifications (Nice-to-Haves)

• Direct Arize AI Experience: Hands-on experience implementing or using Arize AI or similar AI observability/monitoring platforms (e.g., WhyLabs, Fiddler, Arthur).
• Agentic AI Knowledge: Deep knowledge of and a passion for Agentic AI, Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG) patterns. Experience with frameworks like LangChain or LlamaIndex is a significant plus.
• Azure Services: Specific experience with Azure services such as Azure Machine Learning, Azure Kubernetes Service (AKS), Azure Functions, Cosmos DB, and Azure Monitor.
• DevOps/MLOps: Experience with CI/CD pipelines, Infrastructure as Code (e.g., Terraform, Bicep), and containerization technologies (Docker, Kubernetes).
• Data Engineering: Familiarity with data processing technologies like Apache Spark, Databricks, or Synapse Analytics.
Hallmark is looking for a Technical Solution Architect with expertise in Azure Cloud and AI/ML integration to lead architectural vision and technical strategy. This role involves designing cloud-native applications and mentoring cross-functional teams in a dynamic healthcare technology environment.
SimpleTalks is seeking an AIOps/MLOps Architect with extensive experience in AI/ML systems and cloud infrastructure. The role focuses on operationalizing AI models and ensuring robust systems for AI-driven applications.
Tredence Inc. is seeking an AI Observability Architect / Delivery Lead to lead AI observability initiatives and develop an Azure-based observability platform. The role requires strong leadership, cloud architecture expertise, and a focus on AI systems monitoring.
Conquest Consulting is seeking an experienced Enterprise Architect specializing in Cloud Platforms and AI/ML to lead architectural initiatives for the State of Texas. The role requires extensive IT experience and expertise in cloud-native architectures and AI/ML technologies.
Teladoc Health is seeking a Staff Data & AI Platform Engineer to lead the design and operationalization of scalable ML and LLM infrastructure. The role involves defining MLOps strategies and building CI/CD pipelines across various platforms.