Generalized Protein Pocket Generation with Prior-Informed Flow Matching

Designing ligand-binding proteins, such as enzymes and biosensors, is essential in bioengineering and protein biology. A critical step in this process is designing the protein pocket, the region of the protein that binds the ligand. Current approaches to pocket generation often rely on time-intensive physical computations or template-based methods, and their generation quality suffers because they overlook domain knowledge. To tackle these challenges, we propose PocketFlow, a flow-matching generative model that incorporates protein-ligand interaction priors. During training, PocketFlow learns to model key types of protein-ligand interactions, such as hydrogen bonds. During sampling, PocketFlow uses multi-granularity guidance (overall binding affinity and interaction geometry constraints) to generate high-affinity, valid pockets. Experiments show that PocketFlow outperforms baselines on multiple benchmarks, e.g., achieving an average improvement of 1.29 in Vina Score and 0.05 in scRMSD. Moreover, modeling interactions makes PocketFlow a generalized generative model across multiple ligand modalities, including small molecules, peptides, and RNA.
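To make the two ideas in the abstract concrete (a flow-matching training target, and guidance applied during sampling), here is a minimal toy sketch. It is not the PocketFlow implementation: the function names (`cfm_pair`, `distance_guidance`, `guided_sample`), the straight-line probability path, the constant stand-in velocity field, the Euler integrator, and the 3.5 distance cutoff are all assumptions chosen for illustration. The sketch operates on bare 3D points rather than residues and covers only a simple distance constraint, not binding-affinity guidance.

```python
import math

def cfm_pair(x0, x1, t):
    """Training-time pair on a straight-line path (assumed path choice)."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]  # point between noise x0 and data x1
    u_t = [b - a for a, b in zip(x0, x1)]                  # velocity regression target
    return x_t, u_t

def distance_guidance(atom, ligand, d_max):
    """Negative gradient of 0.5 * max(d - d_max, 0)**2, with d the distance
    from `atom` to its nearest ligand atom (hypothetical geometry penalty)."""
    nearest = min(ligand, key=lambda l: math.dist(atom, l))
    d = math.dist(atom, nearest)
    excess = max(d - d_max, 0.0)
    if excess == 0.0:
        return [0.0, 0.0, 0.0]
    return [-excess * (a - l) / d for a, l in zip(atom, nearest)]

def guided_sample(x0, velocity_fn, ligand, n_steps=50, gamma=1.0, d_max=3.5):
    """Euler integration of dx/dt = v(x, t) + gamma * guidance(x)."""
    x = [list(atom) for atom in x0]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity_fn(x, t)
        for i, atom in enumerate(x):
            g = distance_guidance(atom, ligand, d_max)
            x[i] = [a + dt * (vi + gamma * gi) for a, vi, gi in zip(atom, v[i], g)]
    return x

# Toy setup: one ligand atom at the origin and two "pocket" points.
ligand = [[0.0, 0.0, 0.0]]
x0 = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]
x1_hat = [[6.0, 0.0, 0.0], [0.0, 3.0, 0.0]]  # where the unguided flow ends up

# Stand-in for a trained network: the constant straight-line velocity.
velocity_fn = lambda x, t: [[b - a for a, b in zip(p, q)] for p, q in zip(x0, x1_hat)]

unguided = guided_sample(x0, velocity_fn, ligand, gamma=0.0)
guided = guided_sample(x0, velocity_fn, ligand, gamma=1.0)
```

In this toy run the first point would end beyond the cutoff without guidance, so the guidance term pulls it back toward the ligand, while the second point never exceeds the cutoff and is left unchanged. A real model would replace `velocity_fn` with a learned network and would also handle residue orientations, residue types, and the other interaction types the abstract mentions.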


Publication

Generalized Protein Pocket Generation with Prior-Informed Flow Matching
Zaixi Zhang, Marinka Zitnik*, Qi Liu*
NeurIPS 2024 [NeurIPS Spotlight]

@article{zhang2024pocketflow,
  title={Generalized Protein Pocket Generation with Prior-Informed Flow Matching},
  author={Zhang, Zaixi and Zitnik, Marinka and Liu, Qi},
  journal={NeurIPS},
  url={https://arxiv.org/abs/2409.19520},
  year={2024}
}

Code Availability

A PyTorch implementation of PocketFlow is available in the GitHub repository.


Zitnik Lab  ·  Artificial Intelligence in Medicine and Science  ·  Harvard  ·  Department of Biomedical Informatics