The numbers: 12 ML models in production, 6 of them retrained weekly, 3 teams with conflicting deployment processes, 0 shared model registries, and 1 MLOps engineer currently handling everything. We are hiring a second MLOps engineer to help build infrastructure that scales. This is not a glamorous greenfield role: it's fixing what exists, building standards, and making the setup survivable for a team that is growing fast. If you like turning operational chaos into working systems, come talk to us.
Responsibilities
Stabilise and expand our CI/CD pipelines for model training and deployment
Set up a shared model registry and enforce versioning standards across teams
Build monitoring for model performance and data drift
Document our infrastructure and create onboarding guides for new ML engineers
Work with data engineers on pipeline orchestration improvements
Requirements
2–4 years of MLOps experience, including at least one production ML system owned end to end
Hands-on experience with Airflow and MLflow (or comparable tools)
Docker and basic Kubernetes — comfortable deploying containerised services
Cloud experience (GCP, AWS, or Azure)
Able to write clear documentation and enforce standards across teams