Research AI Engineer

Overview

Prof. Marinka Zitnik invites applications for a Research AI Engineer position at Harvard University.

We are seeking a talented Research AI Engineer to implement, test, and deploy customized biomedical AI tools. This role involves designing cutting-edge software solutions that accelerate biomedical discovery and support lab operations. The engineer will work closely with graduate students, postdoctoral fellows, biologists, and clinicians to develop transformative AI systems.

Responsibilities:

  • Develop model pipelines for LLMs and foundation models using multi-GPU training and inference.
  • Design and implement large-scale data pipelines optimized for precision, recall, and processing speed.
  • Train and orchestrate multi-agent LLM systems, including model fine-tuning and custom embedding creation.
  • Develop an experimentation platform for medical AI tools.
  • Test diverse AI system designs and optimize performance across benchmarks.
  • Generate new ideas and develop solutions to maximize performance on a broad range of benchmarks.

This position offers an opportunity to contribute to groundbreaking research, using state-of-the-art AI to address critical problems in science.

Interested candidates are encouraged to explore our recent publications and research directions before applying.

Qualifications

Required skills and experience:

  • Strong experience across the ML and LLM software stack, including feature engineering, model development, deployment, and validation.
  • Proficiency in deep learning frameworks such as PyTorch, with openness to learning new tools and technologies.
  • Experience in training embeddings with custom evaluation datasets.
  • Familiarity with custom agents and techniques such as retrieval-augmented generation.
  • Experience using distributed systems for large-scale training or inference.
  • Experience with LLMs for training, fine-tuning, or inference.

Strong candidates will have a keen desire to leverage advanced AI technologies to address scientific challenges. They will have excellent communication and collaboration skills to work effectively in a multidisciplinary, fast-paced environment.

Publications in machine learning or AI conferences, or scientific journals, are a strong plus.

Candidates should hold a Ph.D. in computer science, engineering, or a closely related field; exceptional candidates with a Master’s degree and significant relevant experience will also be considered.

Location

On the campus of Harvard Medical School, Boston, MA.

Application process

Interested applicants should email the following documents to Prof. Zitnik with the subject line “Research AI Engineer”:

  • Cover letter
  • Curriculum Vitae (include links to your academic webpage and GitHub repositories)
  • Three representative examples of past work
  • Three letters of recommendation (will be solicited after the initial review)

We are reviewing applications on a rolling basis. Interested candidates are encouraged to submit their applications early.

Advisor

Marinka Zitnik is an Assistant Professor at Harvard University with appointments in the Department of Biomedical Informatics, Kempner Institute for the Study of Natural and Artificial Intelligence, Broad Institute of MIT and Harvard, and Harvard Data Science. We investigate machine learning with a current focus on learning systems informed by geometry, structure, and symmetry and grounded in knowledge. This approach creates foundation models, including pre-trained, self-supervised, multi-purpose, and multi-modal models trained at scale to enable broad generalization. Our methods produce actionable outputs to advance biological problems past the state of the art and open up new opportunities.

Dr. Zitnik has published extensively in top ML venues, such as NeurIPS, ICLR, and ICML, and in leading scientific journals, including Nature, Nature Methods, Nature Communications, and PNAS. She has organized numerous workshops and tutorials at the nexus of AI, deep learning, AI4Science, and AI4Medicine at leading conferences, where she also serves on organizing committees.

Her research has received best paper and research awards from the International Society for Computational Biology and the International Conference on Machine Learning, as well as the Bayer Early Excellence in Science Award, Amazon Faculty Research Award, Google Faculty Research Scholar Award, Roche Alliance with Distinguished Scientists Award, Sanofi iDEA-iTECH Award, Rising Star Award in Electrical Engineering and Computer Science (EECS), and Next Generation Recognition in Biomedicine, making her the only young scientist with such recognition in both EECS and Biomedicine. Dr. Zitnik received the Kavli Fellowship from the US National Academy of Sciences and the Kaneb Fellowship award at Harvard Medical School. She also received the NSF CAREER Award.

Dr. Zitnik is an ELLIS Scholar in the European Laboratory for Learning and Intelligent Systems (ELLIS) Society. She is a member of the Science Working Group at NASA Space Biology. Dr. Zitnik co-founded Therapeutics Data Commons and is the faculty lead of the AI4Science initiative. Dr. Zitnik is the recipient of the 2022 Young Mentor Award at Harvard Medical School.


Harvard is an Equal Opportunity Employer.
