Key Responsibilities - SaaS Platform Engineering:
- Design, build, and optimize a cloud-native, multi-tenant SaaS platform that scales with Enable’s rapid growth.
- Develop and maintain core infrastructure components, including compute, networking, observability, and CI/CD pipelines.
- Implement best practices for cloud cost optimization, security, and system reliability.
- Enhance API gateways, identity management, and service orchestration for seamless integration across services.

Key Responsibilities - MLOps & AI Infrastructure:
- Architect and manage scalable AI/ML pipelines for model training, deployment, and monitoring.
- Develop and maintain MLOps workflows using tools like Kubeflow and MLflow.
- Optimize ML model inference for real-time and batch processing in a production SaaS environment.
- Collaborate with data scientists and ML engineers to streamline model lifecycle management.

Key Responsibilities - Automation, Observability & Performance:
- Implement and refine Infrastructure-as-Code (IaC) practices using Terraform or Pulumi.
- Build self-healing, automated monitoring solutions for system health, application performance, and security.
- Improve CI/CD processes to support high-velocity engineering teams with minimal operational overhead.
- Establish robust logging, tracing, and metrics collection for visibility into SaaS application performance.

Key Responsibilities - Cross-Functional Collaboration & Leadership:
- Work closely with engineering teams to ensure platform capabilities support product innovation and reliability.
- Partner with security teams to ensure adherence to SaaS security and compliance standards and best practices.
- Implement monitoring, logging, and alerting solutions to track model and system performance, ensuring compliance with best practices.
- Define and document platform engineering best practices to elevate team-wide capabilities.
- Mentor junior and mid-level engineers, fostering a culture of technical excellence and continuous learning.

Qualifications:
- 8+ years of experience in platform, DevOps, or cloud engineering, with at least 3 years in SaaS environments.
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Expertise in architecting, deploying, and managing cloud-native applications on AWS, GCP, or Azure.
- Strong experience with Kubernetes, serverless computing, and container orchestration.
- Proficiency in modern Infrastructure-as-Code (Terraform, Pulumi) and CI/CD tools (GitHub Actions, ArgoCD).
- Experience in distributed systems, service mesh technologies (Istio, Linkerd), and event-driven architectures.
- Hands-on experience with databases (SQL, NoSQL, vector DBs), ensuring high availability and performance.
- Deep understanding of SaaS security principles, identity management, and compliance frameworks (SOC 2, ISO 27001).
- Strong programming and automation skills in Python, SQL, and Bash.
- Strong experience with microservices architecture and API development.
- Familiarity with MLOps frameworks and ML model operationalization.

Preferred Qualifications:
- Experience building and maintaining large-scale data processing pipelines.
- Expertise in observability tools such as Prometheus, Grafana, OpenTelemetry, or ELK stack.
- Familiarity with real-time data processing and messaging platforms (Kafka, Pub/Sub, Kinesis).
- Background in high-availability, globally distributed architectures.
- Work experience in cutting-edge AI and MLOps challenges at the intersection of ML, engineering, and cloud infrastructure.
- Certifications in cloud computing (e.g., AWS Certified DevOps Engineer, Azure DevOps Engineer).
- Contributions to open-source DevOps, platform, or MLOps tooling.

Job Type
Remote role
Location
Toronto, ON
Date Posted
March 20, 2025
Enable is seeking a Tech Lead Platform Engineer specializing in AI & MLOps to enhance their cloud-native SaaS platform. This role involves architecting scalable systems and collaborating across teams to drive innovation and efficiency.