ML Platform Engineer

Ryan Ashford

Full-time · Mid-level · Chicago

About the role

Role summary: ML Platform Engineer, mid-level.
Company: B2B SaaS, 85 employees, Series A.
Team: Platform Engineering (4 engineers + 1 manager).
Reporting to: Head of Platform.
Stack: Python, Kubernetes on AWS EKS, MLflow, Prefect, Docker, PostgreSQL.

What this role owns: model serving infrastructure, experiment tracking and model registry, CI/CD pipelines for model deployments, monitoring and alerting for model performance.

What this role does not own: model research and training (that's the ML team), product engineering (separate team), data pipelines upstream of model inputs (data engineering).

Current state: We run 12 production models. Deployment is semi-automated. MLflow is partially adopted. Monitoring is reactive rather than proactive. Rollback takes too long.

Target state: Fully automated deployment with canary releases. MLflow standardised across all teams. Proactive alerting on data drift and performance degradation. Rollback under five minutes.

Timeline to target: Two to three quarters.

What we need from you: Hands-on engineering capability, clear written communication, and an opinion on the right approach before we've told you what to build.

Responsibilities

  • Automate model deployment pipeline from registry trigger to canary release on EKS
  • Standardise MLflow adoption across ML and data science teams with clear documentation
  • Build proactive monitoring for model performance, data drift, and infrastructure health
  • Reduce rollback time from the current 40 minutes to under five minutes
  • Conduct platform onboarding sessions for new ML team members

Requirements

  • 4+ years in platform, DevOps, or MLOps engineering
  • Kubernetes — you design and debug pod configurations, resource limits, and scaling behaviour yourself
  • MLflow for experiment tracking, model registry, and serving, with production deployment experience
  • Python for scripting infrastructure, deployment automation, and monitoring integrations
  • AWS-native services: EKS, ECR, S3, CloudWatch, with hands-on production experience rather than theoretical knowledge
  • Prefect or Airflow for workflow orchestration
  • Docker — you write Dockerfiles and debug image issues without a tutorial

Benefits

  • Clear scope, defined target state, and direct path to ownership
  • Fully remote, US time zones preferred
  • $95,000 – $118,000 base salary + equity
  • Platform tooling budget: $1,500 quarterly
  • $2,000 conference and certification budget annually

Job Type

Full-time

Level

Mid-level

Language

English

Salary Range

$95,000 – $118,000

AI Expertise

MLOps & AI Infrastructure
