About the Role

We are seeking exceptional Senior Research Engineers to join our mission-critical team building the world's best oncology foundation models. We are an AI-driven drug development company, and these models are the engine that powers everything we do, from predicting patient survival to identifying novel therapeutic targets to optimizing clinical trial design.

In this role, you'll be at the intersection of cutting-edge AI research and real-world drug development. You'll work on foundation models that integrate diverse data modalities, including known cancer biology, tumor mechanisms, DNA/RNA sequencing, detailed medical notes, and examination results, to generate insights that directly inform our clinical-stage programs.

You'll participate in both pre-training and post-training of our foundation models, work that requires deep expertise in modern architectures and in post-training algorithms such as reinforcement learning. You may also operate at the CUDA level, building custom kernels and understanding performance at the hardware-software interface.

What You'll Do

Design, implement, and optimize large-scale oncology foundation models integrating genomic sequences, medical notes, lab results, imaging, and clinical outcomes
Build and experiment with modern architectures optimized for biomedical applications
Spearhead pre-training and post-training efforts, including RLHF, DPO, RLAIF, and other alignment techniques
Write and optimize custom CUDA kernels; profile and resolve performance bottlenecks across the hardware-software interface
Maintain and optimize our 1,000+ H200 GPU cluster for reliability, utilization, and performance
Build distributed training and inference pipelines, experiment tracking systems, and evaluation frameworks
Develop benchmarks that measure real progress on drug development-relevant tasks
Collaborate with oncologists, biologists, and clinical development teams to ground model development in real therapeutic questions
Contribute to publications in top-tier ML and biomedical venues (NeurIPS, ICML, ICLR, Nature, Cell, etc.)

What We're Looking For

Required

Ph.D. in Computer Science, Machine Learning, Computational Biology, or a related field, or an M.S. with 5+ years of relevant industry experience
Publication record in machine learning, including multiple first-author papers at top-tier venues
3 to 5 years of hands-on deep learning experience (PyTorch, JAX, or TensorFlow)
Strong command of modern architectures: Transformers, attention mechanisms, state-space models, mixture-of-experts
Hands-on experience with post-training techniques: RLHF, DPO, PPO, or similar
Expert-level GPU programming and CUDA, including custom kernel development and performance profiling
Practical experience training or fine-tuning large-scale models (multi-billion parameter) in distributed settings (DeepSpeed, FSDP, Megatron, or similar)
Experience managing GPU clusters and ML infrastructure (Kubernetes, SLURM, or equivalent)
Strong software engineering fundamentals in Python and C++/CUDA
Clear communicator, able to present complex technical work to both engineering and scientific audiences

Preferred

Background in oncology, cancer biology, or drug development
Experience with biomedical foundation models (AlphaGenome, Geneformer, Evo 2, etc.)
Deep knowledge of cancer genomics, tumor biology, or mechanisms of resistance
Contributions to ML systems frameworks (FlashAttention, Triton, xFormers, etc.)
Experience with multi-modal learning and cross-modal architectures
Familiarity with advanced training techniques: synthetic data generation, curriculum learning, data filtering
Familiarity with regulatory considerations in healthcare AI (FDA, HIPAA, GxP)
Open-source contributions to ML projects or frameworks

Location

This is a hybrid role requiring 3 to 4 days per week onsite at our NYC headquarters.