Focused Tutorials
Table of contents
- Overview
- Tutorial 1: Introduction to NLP in Medicine
- Tutorial 2: Generative AI in Medicine
- Tutorial 3: Medical Image Analysis
- Tutorial 4: Supervised Fine-Tuning and Reinforcement Learning for LLMs
- Tutorial 5: Radiology Report Generation with Multimodal LLMs
- Tutorial 6: Multimodal Learning with EHRs (Clinical Events + Notes, Optional Imaging)
Overview
In these tutorials, you will apply AI methods to healthcare problems using open-source datasets. You will build models, evaluate performance, and visualize results.
Each tutorial is designed to be easy to follow so you can start running the code quickly. The tutorials cover core areas such as natural language processing, medical image analysis, graph neural networks, generative models, LLMs, and biological and clinical foundation models. They complement the lecture material by showing how these methods behave on real data, including where they work well and where they fail.
Working with real-world datasets will strengthen your coding and data analysis skills and help you practice model evaluation. The tutorials also support the course project and provide a practical baseline you can reuse in future research.
Tutorial 1: Introduction to NLP in Medicine
Speaker
Datasets
- Clinical notes dataset (e.g., MIMIC-III or de-identified open-source clinical datasets).
Tasks
- Preprocessing Medical Text Data
- Tokenize text into sentences and words using NLP libraries like spaCy or NLTK.
- Normalize text by handling abbreviations, medical jargon, and common typos.
- Annotate datasets with relevant medical entities using Named Entity Recognition (NER).
- Fine-Tuning Pre-trained Transformers
- Use Hugging Face Transformers to fine-tune ClinicalBERT or similar models.
- Train the model on specific clinical tasks like disease classification or drug identification.
- Parameter-Efficient Fine-Tuning with LoRA (see the sketch after this task list)
- Apply Low-Rank Adaptation (LoRA) to reduce computational requirements.
- Compare model performance between full fine-tuning and LoRA.
- Text Generation with LLMs
- Use GPT-based models to summarize medical notes or match patients to clinical trials.
- Experiment with controlling output length and quality using temperature and sampling parameters.
- Advanced NLP Applications (Stretch Task)
- Implement a pipeline for de-identifying clinical notes using trained NER models.
- Use a GPT model to answer medical questions based on clinical scenarios.
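To ground the LoRA task, here is a minimal sketch, assuming Hugging Face's transformers and peft libraries; the checkpoint name, rank, and target modules are illustrative choices, not course requirements.

```python
# Minimal LoRA sketch: wrap a clinical BERT checkpoint with low-rank adapters
# for binary classification. Assumes: pip install transformers peft
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

# Bio_ClinicalBERT is one public clinical checkpoint; any BERT-style model works.
model = AutoModelForSequenceClassification.from_pretrained(
    "emilyalsentzer/Bio_ClinicalBERT", num_labels=2
)

lora = LoraConfig(
    r=8,                                # adapter rank (illustrative)
    lora_alpha=16,
    target_modules=["query", "value"],  # BERT attention projections
    lora_dropout=0.1,
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()      # prints the small trainable fraction
```

Comparing the printed trainable-parameter count against the full model is a quick way to quantify the efficiency gap the full fine-tuning vs. LoRA comparison asks about.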
Skills Developed
- Working knowledge of medical text preprocessing techniques.
- Practical experience fine-tuning transformers for clinical NLP tasks.
- Hands-on introduction to efficient NLP model training.
Tutorial 2: Generative AI in Medicine
Speaker
Datasets
- Medical imaging datasets (e.g., CheXpert or the Medical Segmentation Decathlon).
Tasks
- Building Variational Autoencoders (VAEs)
- Implement a VAE in PyTorch to generate synthetic medical images (see the sketch after this task list).
- Visualize the learned latent space to understand its representation of data variability.
- Building Generative Adversarial Networks (GANs)
- Train a GAN to generate synthetic medical images (e.g., X-rays or MRI scans).
- Evaluate the quality of generated images using Fréchet Inception Distance (FID).
- Exploring Generative Models for Text
- Use a GPT model to generate synthetic clinical records.
- Compare synthetic records to real notes and evaluate semantic coherence.
- Understanding Data Privacy in Generative AI
- Experiment with techniques to assess privacy risks, such as membership inference attacks.
- Discuss how synthetic data can mitigate privacy concerns in healthcare.
- Model Comparison (Stretch Task)
- Compare VAEs and GANs for their ability to generate realistic medical data.
- Explore how generated data can be used to augment training datasets for downstream tasks.
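To get started on the VAE task, here is a minimal sketch, assuming 64×64 single-channel images scaled to [0, 1]; the layer sizes and latent dimension are illustrative.

```python
# Minimal VAE sketch in PyTorch for single-channel 64x64 images in [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)      # mean of q(z|x)
        self.logvar = nn.Linear(512, latent_dim)  # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                 nn.Linear(512, 64 * 64), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z).view_as(x), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # ELBO: reconstruction term plus KL divergence to the standard normal prior.
    bce = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld
```

Decoding a grid of latent z values (for example, sweeping two latent dimensions) is a simple way to visualize what the learned latent space encodes.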
Skills Developed
- Building and evaluating generative models for healthcare applications.
- Understanding privacy implications and synthetic data generation.
Tutorial 3: Medical Image Analysis
Speaker
Datasets
- Datasets from the Medical Segmentation Decathlon (e.g., liver or brain scans).
Tasks
- Image Preprocessing
- Load, resize, and normalize medical images using Python libraries like Pillow or OpenCV.
- Visualize medical images and perform basic exploratory analysis.
- Image Classification and Regression
- Train a CNN to classify medical images into diagnostic categories (e.g., tumor or non-tumor).
- Perform regression analysis to predict risk scores based on image features.
- Image Segmentation with U-Net
- Build and train a U-Net model for segmenting anatomical regions in medical images.
- Evaluate segmentation performance using the Dice coefficient and Jaccard index (see the sketch after this task list).
- Experimentation and Model Tracking
- Use Weights & Biases to log model performance and compare experiments.
- Optimize model performance through hyperparameter tuning.
- Advanced Segmentation Tasks (Stretch Task)
- Perform multi-class segmentation for complex medical images (e.g., segmenting organs and tumors).
- Explore using pre-trained models for transfer learning in segmentation tasks.
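For the segmentation-evaluation step, here is a minimal sketch of both metrics on binary masks; a real pipeline would average over classes and batch elements.

```python
import torch

def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # pred and target are binary {0, 1} masks of the same shape.
    intersection = (pred * target).sum()
    return (2 * intersection + eps) / (pred.sum() + target.sum() + eps)

def jaccard_index(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return (intersection + eps) / (union + eps)

# Sanity check: identical masks should give Dice = Jaccard = 1.
mask = torch.tensor([[0.0, 1.0], [1.0, 1.0]])
print(dice_coefficient(mask, mask).item(), jaccard_index(mask, mask).item())
```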
Skills Developed
- Understanding medical image preprocessing, classification, and segmentation.
- Hands-on experience with CNNs and U-Net architectures.
Tutorial 4: Supervised Fine-Tuning and Reinforcement Learning for LLMs
Speaker
Description
This tutorial introduces fine-tuning of large language models for medical applications through hands-on exercises. Students will learn how to fine-tune models using full-parameter approaches, including reinforcement learning methods such as GRPO, PPO, and DAPO, as well as parameter-efficient techniques such as LoRA. The tutorial covers both the underlying principles and practical implementation, with an emphasis on code-level workflows.
In addition to methods, the tutorial reviews recent research on large language models in medical AI and ends with a discussion of open challenges and directions for future work.
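As a taste of the code-level workflow, here is a minimal supervised fine-tuning sketch using the TRL library; the toy dataset, base model, and hyperparameters are placeholders, and argument names can differ between TRL versions.

```python
# Minimal SFT sketch with TRL. Assumes: pip install trl datasets
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# A toy instruction-response pair stands in for a real medical SFT corpus.
train = Dataset.from_list([
    {"text": "Question: What does BP stand for? Answer: Blood pressure."},
])

trainer = SFTTrainer(
    model="gpt2",  # placeholder base model; a medical LLM would be used in practice
    train_dataset=train,
    args=SFTConfig(output_dir="sft-demo", max_steps=10),
)
trainer.train()
```

The reinforcement learning methods covered in the tutorial (PPO, GRPO, DAPO) typically start from a checkpoint produced by a step like this and then optimize against a reward signal.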
Tutorial 5: Radiology Report Generation with Multimodal LLMs
Speaker
Datasets
- Paired medical image–report datasets (e.g., chest X-ray datasets with free-text radiology reports).
Tasks
- Understanding the Report Generation Task
- Overview of radiology report structure (Findings vs. Impression)
- Common failure modes: omissions, hallucinations
- Why report generation differs from generic image captioning
- Baseline Models for Report Generation
- Implement an image-to-text model using a vision encoder and LLM-based decoder
- Using Hugging Face Transformers
- Learn the Hugging Face Transformers library for model loading, training, and inference
- Fine-tune pre-trained Transformer decoders for medical report generation
- Evaluation of Generated Reports
- Apply standard NLP metrics (BLEU, ROUGE, BERTScore; see the sketch after this task list)
- Analyze qualitative examples where automated metrics fail to reflect clinical quality
- Introduce impression-level and finding-level evaluation concepts
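Here is a minimal sketch of scoring generated reports with off-the-shelf metrics via the Hugging Face evaluate library; the example texts are invented, and BERTScore downloads a scoring model on first use.

```python
# Assumes: pip install evaluate rouge_score bert_score
import evaluate

predictions = ["No acute cardiopulmonary process."]
references = ["No acute cardiopulmonary abnormality."]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))

bertscore = evaluate.load("bertscore")
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```

Near-identical surface forms score highly here whether or not the wording difference matters clinically, which is exactly the gap the qualitative-analysis step is meant to expose.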
Skills Developed
- Practical use of the Hugging Face Transformers library for multimodal modeling
- Building and training radiology report-generation systems
- Evaluating medical text generation beyond surface-level NLP metrics
- Understanding clinical constraints in medical AI applications
Tutorial 6: Multimodal Learning with EHRs (Clinical Events + Notes, Optional Imaging)
Speaker
Datasets
- MIMIC-IV (structured EHR + de-identified free-text clinical notes)
- Optional imaging add-on: MIMIC-CXR (chest X-rays with free-text radiology reports) or MIMIC-CXR-JPG
Tasks
- Cohort + Prediction Setup (Warm-up)
- Define a simple cohort (for example, adult ICU stays) and a prediction target (for example, in-hospital mortality or prolonged length of stay) using structured MIMIC-IV tables
- Create a clean train/val/test split at the patient level, so no patient appears in more than one split (avoiding leakage)
- Modality 1: Structured EHR Feature Pipeline
- Build time-aligned features from vitals, labs, and interventions (first 24 hours, summary stats, trend features)
- Establish a strong baseline model (logistic regression or gradient boosted trees)
- Modality 2: Clinical Notes Representations
- Basic text cleaning and note selection (which note types, what time window)
- Create note embeddings (simple baseline: TF-IDF; stronger baseline: pretrained clinical transformer embeddings)
- Train a notes-only baseline model
- Multimodal Fusion Models (Core)
- Late fusion baseline: concatenate structured features + note embeddings and train a single classifier (see the sketch after this task list)
- Attention or transformer-based fusion: fuse structured time series representations with note representations
- Reproduce the key idea behind a multimodal transformer that combines structured EHR and notes for outcome prediction
- Evaluation + What Actually Changes with Multimodality
- Compare structured-only vs notes-only vs fused models (AUROC, AUPRC, calibration)
- Do a short error analysis: cases where notes add signal, cases where they add noise
- Practical discussion: missingness, note availability, and real-world deployment tradeoffs
- Advanced (Stretch Task): Add Imaging
- Use pretrained image embeddings from MIMIC-CXR (or use provided embeddings to save time) and fuse as a third modality
- Ablate: structured + notes vs structured + notes + imaging, and interpret when imaging helps
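Here is a minimal late-fusion sketch with synthetic stand-ins for the structured features and notes; real MIMIC-IV access requires PhysioNet credentialing, and the feature sizes and note texts below are invented.

```python
# Late-fusion baseline: concatenate structured features with TF-IDF note
# embeddings and train one classifier. Assumes: pip install scikit-learn numpy
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, 200)                      # toy outcome labels
structured = rng.normal(size=(200, 10))               # e.g., 24 h vital/lab summaries
notes = ["dyspnea and hypoxia on admission" if y else "routine admission, stable"
         for y in labels]                             # toy note text

text = TfidfVectorizer().fit_transform(notes).toarray()
X = np.hstack([structured, text])                     # late fusion: concatenate modalities

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```

Swapping the TF-IDF step for pretrained clinical transformer embeddings gives the stronger notes baseline with the fusion and evaluation code unchanged; on a real cohort, the split should be done at the patient level as in the warm-up task.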
Skills Developed
- Building an end-to-end multimodal clinical ML pipeline from raw EHR tables and text
- Representation learning for clinical notes and principled fusion with structured data
- Strong experimental practice: leakage control, ablations, calibration, and failure-mode analysis
- Optional exposure to tri-modal modeling with imaging using MIMIC-CXR