Job Description
The Arc Institute is a new scientific institution that conducts curiosity-driven basic science and technology development to understand and treat complex human diseases. Headquartered in Palo Alto, California, Arc is an independent research organization founded on the belief that many important research programs will be enabled by new institutional models. Arc operates in partnership with Stanford University, UCSF, and UC Berkeley.

Funding: Arc will fully fund Core Investigators' (PIs') research groups, liberating scientists from the typical constraints of project-based external grants.

Biomedical research has become increasingly dependent on complex tooling.

Support: Arc aims to provide first-class operational, financial, and scientific support that will enable scientists to pursue long-term, high-risk, high-reward research that can meaningfully advance progress toward cures for diseases including neurodegeneration, cancer, and immune dysfunction.

We aim to create a culture that is focused on scientific curiosity, a deep commitment to truth, broad ambition, and selfless collaboration. With $650M+ in committed funding and a state-of-the-art new lab facility in Palo Alto, Arc will continue to grow quickly in the coming years.

We are searching for an experienced and collaborative machine learning research engineer focused on advancing the frontiers of biological foundation models. This role will contribute to the development and application of Arc's frontier DNA foundation model (Evo); to Arc's Virtual Cell Initiative, which develops cell biological models capable of predicting the impact of perturbations and stimuli; and to other projects within Institute-wide machine learning efforts.

You are an innovative machine learning engineer with a deep understanding of ML principles, enabling you to design, modify, and critically evaluate model architectures, not just apply existing ones. You have significant experience in training large deep learning models.
You enjoy thinking from first principles, seeking to deeply understand the data and its underlying dynamics to drive effective and innovative modeling strategies.

Responsibilities:
Actively participate in the design, implementation, and refinement of state-of-the-art foundation models, developed in collaboration with other ML researchers and scientists at Arc, with the goal of understanding and designing complex biological systems.
Engineer large-scale distributed model pretraining and pipelines for efficient model inference.

Requirements:
Ph.D. in Computer Science, Machine Learning, or a related field.
Minimum of 5-8+ years of relevant experience in machine learning research or ML engineering in an academic or industry research lab.
Well-versed in machine learning frameworks such as PyTorch or JAX.
Experience developing with distributed training tools such as FSDP, DeepSpeed, or Megatron-LM.
Ability to communicate and collaborate successfully with biologists and software/infrastructure engineers.

The actual base compensation paid to any individual for this position may vary depending on factors such as experience, market conditions, education/training, skill level, and whether the compensation is internally equitable. It does not include bonuses, commissions, differential pay, other forms of compensation, or benefits. This position is also eligible to receive an annual discretionary bonus, with the amount dependent on individual and institute performance factors.