Multimodal AI Predicts Clinical Outcomes of Drug Combinations from Preclinical Data

Predicting clinical outcomes from preclinical data is essential for identifying safe and effective drug combinations. Current models rely on structural or target-based features to identify high-efficacy, low-toxicity drug combinations. However, these approaches fail to incorporate the multimodal data necessary for accurate, clinically relevant predictions.

Here, we introduce MADRIGAL, a multimodal AI model that learns from structural, pathway, cell viability, and transcriptomic data to predict drug combination effects across 953 clinical outcomes and 21,842 compounds, including combinations of approved drugs and novel compounds in development. MADRIGAL uses a transformer bottleneck module to unify preclinical drug data modalities while handling missing data during training and inference, a major challenge in multimodal learning.
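To make the idea of fusing modalities while tolerating missing ones concrete, here is a minimal sketch in plain Python. It is not MADRIGAL's architecture: it assumes a single hypothetical bottleneck query vector that attention-pools whichever modality embeddings are present, and it masks out missing modalities (represented as None) before the softmax. The function name `fuse_modalities`, the modality keys, and the toy two-dimensional embeddings are all illustrative assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_modalities(
    query: Vector, modalities: Dict[str, Optional[Vector]]
) -> Tuple[Dict[str, float], Vector]:
    """Attention-pool the available modality embeddings with one query.

    Hypothetical illustration only: modalities whose embedding is None are
    masked out, so the softmax is taken over the present modalities alone.
    Returns the attention weights and the fused embedding.
    """
    present = {name: emb for name, emb in modalities.items() if emb is not None}
    if not present:
        raise ValueError("at least one modality must be available")

    # Scaled dot-product scores between the query and each present modality.
    scale = math.sqrt(len(query))
    scores = {name: dot(query, emb) / scale for name, emb in present.items()}

    # Numerically stable softmax over the present modalities only.
    max_score = max(scores.values())
    exp_scores = {name: math.exp(s - max_score) for name, s in scores.items()}
    total = sum(exp_scores.values())
    weights = {name: e / total for name, e in exp_scores.items()}

    # Weighted sum of the present embeddings gives the fused representation.
    fused = [
        sum(weights[name] * present[name][i] for name in present)
        for i in range(len(query))
    ]
    return weights, fused


# Toy drug with two of four modalities missing (values are made up).
drug = {
    "structure": [1.0, 0.0],
    "pathway": [0.0, 1.0],
    "cell_viability": None,
    "transcriptomics": None,
}
weights, fused = fuse_modalities([1.0, 1.0], drug)
print(weights, fused)
```

In this sketch the missing modalities receive no attention weight at all, so the fused vector is built only from the data that exists for a given compound. A real model would learn the query and the modality encoders, and would typically use several bottleneck tokens and stacked transformer layers.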

MADRIGAL outperforms single-modality methods and state-of-the-art models in predicting adverse drug interactions. It performs virtual screening of anticancer drug combinations and supports polypharmacy management for type II diabetes and metabolic dysfunction-associated steatohepatitis (MASH). It identifies transporter-mediated drug interactions. MADRIGAL predicts resmetirom, the first and only FDA-approved drug for MASH, among therapies with the most favorable safety profile. It supports personalized cancer therapy by integrating genomic profiles from cancer patients. Using primary acute myeloid leukemia samples and patient-derived xenograft models, it predicts the efficacy of personalized drug combinations. Integrating MADRIGAL with a large language model allows users to describe clinical outcomes in natural language, improving safety assessment by identifying potential adverse interactions and toxicity risks. MADRIGAL provides a multimodal approach for designing combination therapies with improved predictive accuracy and clinical relevance.

Publication

Multimodal AI predicts clinical outcomes of drug combinations from preclinical data
Yepeng Huang, Xiaorui Su, Varun Ullanat, Ivy Liang, Lindsay Clegg, Damilola Olabode, Nicholas Ho, Bino John, Megan Gibbs, Marinka Zitnik
In Review 2025 [arXiv]

@article{huang2025madrigal,
  title={Multimodal AI predicts clinical outcomes of drug combinations from preclinical data},
  author={Huang, Yepeng and Su, Xiaorui and Ullanat, Varun and Liang, Ivy and Clegg, Lindsay and Olabode, Damilola and Ho, Nicholas and John, Bino and Gibbs, Megan and Zitnik, Marinka},
  journal={arXiv:2503.02781},
  url={https://arxiv.org/abs/2503.02781},
  year={2025}
}

Code and Data Availability

A PyTorch implementation of MADRIGAL is available in the GitHub repository. Datasets are available in the Harvard Dataverse repository.

Zitnik Lab  ·  Artificial Intelligence in Medicine and Science  ·  Harvard  ·  Department of Biomedical Informatics