Senior ML Research Scientist

Mia Johansson

Full-time · Senior · Los Angeles

About the role

Our research group focuses on efficient large language model inference: specifically, the problem of making capable models fast and cheap enough to operate in latency-sensitive production environments without unacceptable quality degradation. The work spans speculative decoding, quantisation, knowledge distillation, and architectural modifications that improve inference throughput without the regressions that naïve compression approaches introduce.

We are a team of six researchers and two engineers. Our publication bar is high and we do not publish for the sake of it. We are currently pursuing a research direction we believe has significant commercial implications and have not yet published.

We hire researchers who are motivated by hard problems, who write clearly and precisely, and whose experimental work is rigorous enough that a peer can reproduce it from the write-up alone without requesting additional clarification. If you are in a research role at a university, national lab, or industry group and are ready to work on problems where the output matters beyond citations, we welcome the conversation.

Responsibilities

  • Conduct original research on LLM inference efficiency with publication-quality rigour
  • Implement and evaluate novel algorithmic approaches in PyTorch with full reproducibility documentation
  • Collaborate with the engineering team to translate validated research into production-ready implementations
  • Produce internal research memos on experimental outcomes — positive and negative — within two weeks of completion
  • Present research progress at weekly internal seminars and represent the company at external conferences

Requirements

  • PhD in machine learning, computer science, or a closely related discipline, or equivalent research output
  • 4+ years of research experience with publications at NeurIPS, ICML, ICLR, ACL, or equivalent venues
  • Deep expertise in at least one of: LLM inference optimisation, model quantisation, knowledge distillation, speculative decoding
  • PyTorch for research implementation — you implement novel architectures and training procedures from scratch
  • Python for experimentation infrastructure, data processing, and rigorous evaluation pipelines
  • Experimental practice that would satisfy a peer reviewer: careful baselines, ablations, and statistically sound evaluation
  • Strong technical writing — both research papers and internal technical memos

Benefits

  • Research problems with direct commercial production implications — your findings get implemented, not filed
  • Full remote with optional onsite quarters in San Francisco
  • $150,000 – $190,000 base salary + equity + publication bonus
  • $5,000 annual research budget covering conferences, compute credits, and tooling
  • Pre-publication peer review process — your work is internally reviewed before external submission

Job Type

Full-time

Level

Senior

Language

English

Salary Range

$150,000 – $190,000

AI Expertise

AI & Machine Learning Engineers
