We research and productise reinforcement learning for robotic process automation. Specifically, we train agents to navigate complex digital workflows (browser-based processes, form-filling sequences, multi-step approval flows) using a combination of reinforcement learning and imitation learning from human demonstrations. Our codebase is Python and PyTorch. We publish occasionally. We ship continuously.
This is a junior research engineering role, which means you'll split your time between implementing ideas from recent RL papers and making sure those ideas actually work in our production environment. You should understand RL fundamentals (policy gradients, PPO or similar, reward shaping) from coursework, a research project, or something you built yourself. TensorFlow and Keras experience is a plus, since parts of our codebase haven't fully migrated to PyTorch yet.
Responsibilities
Implement RL algorithm variants from recent papers and benchmark them against our baselines
Build and maintain custom simulation environments for workflow automation tasks
Tune reward functions and training loops to improve sample efficiency
Write clear experiment logs documenting what you tested, why, and what you found
Contribute to the migration of legacy TensorFlow code to PyTorch
Requirements
Solid Python and PyTorch skills for implementing and experimenting with RL algorithms
Conceptual understanding of RL fundamentals: Q-learning, policy gradients, actor-critic methods
Familiarity with PPO, SAC, or a comparable modern algorithm
Working knowledge of NumPy for numerical computation in custom environments and reward functions
Any exposure to TensorFlow or Keras is helpful for working with legacy parts of the codebase
A project, thesis, or paper where you applied RL to a real problem is worth more than a list of courses
Benefits
Genuine research-adjacent work — papers, experiments, and production in equal measure
Full remote, flexible schedule
$60,000 – $78,000 base salary
Conference attendance budget: $1,500 annually
Small team — your contributions are visible and credited