Context: We serve 1.2 million users, and our AI features handle 300,000 requests per day. Our P95 latency for LLM calls is currently 4.2 seconds; the target is 1.8 seconds. We have a hypothesis about where the bottleneck is, and we need an AI engineer who can test that hypothesis, implement the fix, and own the performance monitoring layer going forward. Direct impact. Measurable output. Clear success criteria from day one.
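For context on the numbers above: P95 latency is the value below which 95% of request latencies fall. A minimal nearest-rank sketch (the sample data below is illustrative, not our production numbers):

```python
import math

def p95(latencies_s):
    """Nearest-rank 95th percentile of a list of latency samples (seconds)."""
    s = sorted(latencies_s)
    idx = max(0, math.ceil(0.95 * len(s)) - 1)
    return s[idx]

# Illustrative: 100 evenly spread samples -> the 95th value
print(p95([x / 10 for x in range(1, 101)]))  # 9.5
```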
Responsibilities
Profile and diagnose current AI feature latency bottlenecks
Implement streaming responses and caching strategies
Build latency and cost dashboards for the AI features layer
Own the LLM API integration layer and keep it efficient
Document all performance improvements with before/after benchmarks
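To make the caching responsibility concrete, here is a minimal sketch of one common strategy: an in-process cache keyed on the full request parameters, applied only to deterministic (temperature 0) calls. `call_llm` is a hypothetical stand-in for the real LLM API client, and all names here are illustrative, not our actual integration layer:

```python
import hashlib
import json

# Hypothetical in-process cache; production would likely use Redis or similar.
_cache = {}

def _cache_key(prompt, model, temperature):
    # Key on every parameter that affects the output, serialized stably.
    payload = json.dumps(
        {"prompt": prompt, "model": model, "temperature": temperature},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(prompt, call_llm, model="example-model", temperature=0.0):
    """Return a completion, skipping the network round trip on a cache hit.

    Only temperature-0 calls are cached, since higher temperatures are
    intentionally non-deterministic.
    """
    key = _cache_key(prompt, model, temperature)
    if temperature == 0.0 and key in _cache:
        return _cache[key]            # cache hit: no API latency
    result = call_llm(prompt)         # cache miss: pay full latency once
    if temperature == 0.0:
        _cache[key] = result
    return result
```

Streaming is complementary: it does not reduce total generation time, but it cuts time-to-first-token, which is what users perceive.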
Requirements
2–4 years building AI features with LLM APIs in production