Senior MLOps Engineer

Julien Moreau

Full-time · Senior · New York

About the role

We operate a real-time recommendation system that serves 800 million requests per day across 40+ models running in parallel on AWS, with specific workloads on GCP. The infrastructure works, but it is not mature enough. Our model deployment pipeline has too many manual steps, our rollback process takes 45 minutes when it should take five, and our experiment tracking is inconsistent enough that engineers sometimes cannot reproduce a training run from three months ago.

We are hiring a senior MLOps engineer to fix this: not to manage it, but to engineer better solutions. You will be hands-on in the code every day. You will also set engineering standards and review the work of three junior and mid-level engineers. If you have operated ML infrastructure at this scale and you care about the craft of building systems that work reliably when it matters most, we want to talk.

Responsibilities

Lead the redesign of our model deployment and rollback pipeline
Standardise experiment tracking across all ML teams
Mentor three junior and mid-level MLOps engineers and review their work
Define and enforce MLOps engineering standards and best practices
Drive the migration of manual deployment steps into automated, auditable processes

Requirements

6+ years in infrastructure or platform engineering with 3+ years focused on ML systems
Deep expertise in Kubernetes for model serving and scaling
Airflow pipeline design and operational experience at scale
MLflow or a comparable model registry for experiment tracking, versioning, and deployment
AWS-native infrastructure (EKS, SageMaker, S3, CloudWatch) is essential; GCP is a bonus
Databricks for large-scale feature engineering and training jobs
Mature CI/CD design practice for ML: canary deployments, shadow mode, automated rollback

Benefits

Technical leadership at real scale — 800M requests/day is not a slide deck number
Full remote across US time zones
$135,000 – $170,000 base salary + equity
Conference and research budget: $3,000 annually
Equipment refresh every two years

Job Type

Full-time

Level

Senior

Language

English

Salary Range

$135,000 – $170,000

AI Expertise

MLOps & AI Infrastructure
