Senior MLOps Engineer

Nadia Kowalski

Full-time · Senior · New York

About the role

I want to be honest with you about what we are and what we're not. We are a 400-person company. We have a mature product, a real revenue base, and a data science team that has been shipping models for four years. We are not a startup. We do not move fast. We have compliance requirements, change management processes, and an infosec review that adds two weeks to any third-party integration. If you want to feel like a hero who single-handedly transforms a company in ninety days, this is probably not the right place for you.

What we do have is a genuinely hard infrastructure problem that has been partially solved, badly, by three different teams over three years, and a leadership team that is finally serious about fixing it properly.

We want someone experienced enough to diagnose what's actually wrong, not just what looks wrong, and patient enough to fix it in an organisation that moves at enterprise speed. If you've spent time inside a company like ours and understand why change is hard before you understand why it's necessary, please apply. That context is the most valuable thing you could bring.

Responsibilities

  • Produce a written architecture review of the current ML platform: gaps, risks, and a prioritised remediation plan
  • Lead the migration of model serving from our current ad-hoc ECS deployment to a standardised Kubernetes-based platform
  • Build automated retraining and deployment pipelines for our eight highest-priority production models
  • Define observability standards for model serving: latency SLOs, prediction monitoring, and alerting
  • Work across data science, platform, and infosec teams to drive technical decisions through proper approval processes

Requirements

  • 6+ years in data engineering, platform engineering, or MLOps, including at least three years at a company of 200+ employees
  • Kubernetes at production depth — cluster administration, resource governance, multi-tenant namespace design
  • MLflow or an equivalent model registry and experiment tracking system in a real production setting
  • AWS including EKS, S3, IAM boundary policies, and VPC design — you've navigated enterprise AWS, not just sandbox accounts
  • Terraform with cross-team module sharing and state management at scale
  • Experience working within change management and infosec review processes without circumventing them

Benefits

  • Stable, funded company with real engineering problems — no runway anxiety
  • Full remote with quarterly in-person team meetings (travel covered)
  • $130,000 – $155,000 base salary + annual bonus
  • $2,000 annual learning budget
  • Generous pension contribution — 6% employer match

Job Type

Full-time

Level

Senior

Language

English

Salary Range

$130,000 – $155,000

AI Expertise

MLOps & AI Infrastructure
