AI/ML Research Engineer - Synthetic Data
P-1 AI
> about P-1 AI
We are building an engineering AGI. We founded P-1 AI with the conviction that the greatest impact of artificial intelligence will be on the built world—helping mankind conquer nature and bend it to our will. Our first product is Archie, an AI engineer capable of quantitative and spatial reasoning over physical product domains that performs at the level of an entry-level design engineer. We aim to put an Archie on every engineering team at every industrial company on earth.
Our founding team includes top minds in deep learning, model-based engineering, and the industries we serve. We just closed a $23 million seed round led by Radical Ventures, with participation from a number of other AI and industrial luminaries (from OpenAI, DeepMind, etc.).
> about the role
As a Research Engineer (Synthetic Data) here, you will be responsible for building large-scale synthetic (and human-produced) datasets that we’ll use to mid- and post-train our AI systems (LLMs) and endow them with quantitative reasoning capability, so that they can perform previously impossible tasks or achieve unprecedented levels of performance in the domain of designing physical systems. We're looking for people who have solid engineering skills, write bug-free machine learning code, and build the science behind the systems employed (algorithms, data, evals). You will get exposure to the entire stack and will be expected to solve problems in, and take ownership of, components across it. You will be interfacing with simulation engineers and domain experts to deploy this technology on real-world problems.
> tech stack
Python
PyTorch
C++
> location
This is a hybrid role, principally based in San Francisco. Candidates are expected to be located in the Bay Area or open to relocation.
> we expect you to
have strong programming skills and a deep understanding of machine learning
have experience working with large distributed systems
be comfortable diving into a large ML codebase to debug
have a solid understanding of LLM architectures
have experience with LLM post-training on the synthetic data side
execute and analyze experiments autonomously and collaboratively
be excited about the prospect of building an engineering AGI
> ++
you keep up with state-of-the-art LLM / synthetic data research
you’ve built an impactful/popular open-source project
you’ve published/co-authored papers related to large-scale synthetic data generation
> you will thrive in this role if
you have a background in statistical machine learning, physics, mathematics, or another theoretically and empirically rigorous field
you love working in a fast-paced, dynamic startup environment
you are intellectually curious and quick to pick up concepts outside of your direct areas of expertise
> interview process
Initial screening - with Head of Talent (30 mins)
Hiring manager interview - with co-founder & Head of AI (30 mins)
Technical interview 1 (60 mins)
Technical interview 2 (60 mins)
Culture fit / Q&A (possibly in person) - with co-founder & CEO