Member of Technical Staff (Post Training)

Inherent

London, UK

Posted on May 29, 2026

At Inherent, we are on a mission to build AI that recursively self-improves to discover new knowledge. Scientific advances are the backbone of our economic, technological and societal prosperity, but ideas are getting harder to find and breakthroughs are becoming more expensive. We are building a new frontier lab dedicated to developing AI that explores “unknown unknowns” to uncover paradigm-shifting research contributions. Science is a social endeavour, and so our mission is inextricably a human-machine teaming problem. We’re starting by reinventing the AI research factory so that our own agents accelerate their own creation.

Inherent is a well-funded, fast-growing neo-lab backed by Tier 1 VCs who believe in our ethical stance. We are a team of operators with backgrounds at frontier labs who have done foundational work in recursive self-improvement, AI Scientists, world modelling, meta-RL and human-machine cooperation. Working in-person every day at our high-intensity London headquarters, we believe that Europe will lead the way in the coming paradigm of AI-enabled science, unlocking human potential across the globe.

About the role

We’re looking for Members of Technical Staff to lead work on post-training state-of-the-art foundation models for open-ended agentic capabilities in scientific research. You’ll be involved at every level of the post-training pipeline: sourcing and creating data, building autocurricula, devising and implementing SFT and RL algorithms, constructing tools and harnesses for foundation model self-improvement, analysing research results, and using information gained to devise future hypotheses. You will work closely with an experienced technical team of humans, and increasingly alongside the AI scientist collaborators we dogfood.

What you'd do

Design, implement, and tune SFT and RL algorithms to post-train models that autonomously perform state-of-the-art research.
Build the autocurricula, judges, harnesses and eval pipelines that turn open-ended research tasks into a reliable reward signal.
Run large-scale experiments on state-of-the-art hardware and analyse the results to determine the next hypotheses to test, in collaboration with our AI agents.
Close recursive loops so that AI agents drive their own post-training research.
Work closely with colleagues in the Infrastructure and AI for Science teams to optimise hardware and deliver remarkable performance in real scientific domains.

What we're looking for

3+ years of deep learning research experience.
Experience post-training large language, vision, video or multi-modal models.
Demonstrated track record of success in deep learning research, whether papers, model releases, open-source contributions, or other artefacts.
5+ years of software engineering experience, including deep familiarity with Python and at least one deep learning framework (e.g., PyTorch, JAX).
Experience using the latest coding agents, and opinions about optimal workflows.
Enthusiasm for experimental organisational design.
AI-pilled: adopting agents, keen to build a company where agents are front and centre.

Strong candidates may also have

A PhD in mathematics, computer science or a hard science discipline.
Hands-on experience training LLMs at scale with RL and related post-training methods (GRPO/PPO, DPO, distillation, and variants).
Familiarity with distributed and long-context training infrastructure.
A background in autocurricula, open-endedness, meta-learning, or recursive self-improvement.
Experience post-training frontier models at an industry lab (scale, infra, and iteration speed).

Why this is interesting

You'll shape the core research of a frontier AI lab from the beginning.
You'll work on genuine recursive self-improvement — training AI scientists that improve the very pipeline that trains them — not incremental benchmark-chasing.
You'll dogfood your own work: the agents you post-train accelerate the research that creates them.
Small team, high trust, no bureaucracy, and a genuinely technical culture.

Culture

We only select people with low ego, spiky skill profiles, commitment to societal benefit, unusual viewpoints, and a passion for "living in the experiment". We'll win because we're willing to try things that no incumbent would even think to do, let alone action.

We have really good lunch and dinner. Seriously. You've got to try it. We're based in King's Cross, London and believe in the pace and energy of working in person. We’re committed to having the most tasteful, and the weirdest, office of any AI lab: the environment shapes the agents within it.

If you believe in our mission and culture, and are qualified and motivated, we encourage you to apply, even if you don’t meet every one of the criteria above. We know that many of the most creative and talented people have had unusual career paths and backgrounds. Building a team with a diversity of thought is mission-critical, for plurality spurs curiosity, invention and collective experimentation.

It's your turn to create the future.