Research Engineer - Midtraining
Periodic Labs
Location
Menlo Park, Remote
Employment Type
Full-time
Department
Bits: LLMs, machine learning, infra, etc.
About Periodic Labs
We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve problems without boundaries or bureaucracy. We eagerly learn new tools and new science to push forward our mission.
About the role
You will train frontier models to be highly knowledgeable scientific experts that serve as the foundation for reinforcement learning. You will develop methods for synthetic data generation, distillation, and continual learning at scale. You will work closely with RL researchers, physicists, and chemists to create evals that guide scientific data curation. You will collaborate with supercompute engineers to scale compute-efficient LLM training to thousands of GPUs. You will build high-performance tools for yourself to investigate how data shapes intelligence.
You might thrive in this role if you have experience with:
Training LLMs on curated mixes of trillions of tokens
Calculating scaling laws and compute-optimal hyperparameters
Generating billions of tokens of high-quality synthetic data
Building evals that correlate with downstream task performance