
ML Engineer — Distributed Training Setup (Multi-GPU)

Yemi Adeyemi

Contract · Senior

About the role

We're training large transformer models (7B–30B parameters) and our current training setup is inefficient: single-machine multi-GPU only, with no FSDP, no ZeRO optimisation, and no gradient checkpointing. We need an ML engineer to set up proper distributed training across our cluster (16 × H100 on GKE). Deliverables: a working distributed training config using PyTorch FSDP, training-speed benchmarks before and after, and a runbook for our research team. Bonus if you can also set up experiment tracking integration (we use Weights & Biases).
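To give a sense of scope, a minimal sketch of the kind of FSDP setup this role would deliver (names like `build_model` are hypothetical, and this assumes a `torchrun` launch with an NCCL backend, not our actual config):

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import (
    FullyShardedDataParallel as FSDP,
    MixedPrecision,
    ShardingStrategy,
)

# Assumes launch via `torchrun --nproc_per_node=<gpus> train.py` on each node;
# `build_model()` stands in for whatever constructs the transformer.
def setup_fsdp(model: torch.nn.Module) -> FSDP:
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    return FSDP(
        model,
        # FULL_SHARD shards params, grads, and optimizer state (ZeRO-3-style)
        sharding_strategy=ShardingStrategy.FULL_SHARD,
        mixed_precision=MixedPrecision(
            param_dtype=torch.bfloat16,
            reduce_dtype=torch.bfloat16,
        ),
        device_id=torch.cuda.current_device(),
    )
```

The real deliverable would also cover auto-wrap policies for transformer blocks, activation (gradient) checkpointing, and checkpoint save/load across ranks.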

Contract Type

Hourly rate

Level

Senior

Budget Range

$90 – $140 / hour

Duration

3 months

AI Expertise

MLOps & AI Infrastructure
