Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency

Interpreting time series models is uniquely challenging because it requires identifying both the location of the time series signals that drive model predictions and matching those signals to interpretable temporal patterns. While explainers from other modalities can be applied to time series, their inductive biases do not transfer well to time series, which are not inherently interpretable.

We present TimeX, a time series consistency model for training explainers. TimeX trains an interpretable surrogate to mimic the behavior of a pretrained time series model. It addresses the issue of model faithfulness by introducing model behavior consistency, a novel formulation that aligns relations in the latent space induced by the pretrained model with relations in the latent space induced by TimeX. TimeX provides discrete attribution maps and, unlike existing interpretability methods, it learns a latent space of explanations that can be used in various ways, such as to provide landmarks that visually aggregate similar explanations and make temporal patterns easy to recognize.

We evaluate TimeX on eight synthetic and real-world datasets and compare its performance against state-of-the-art interpretability methods. We also conduct case studies using physiological time series, demonstrating that TimeX's novel components show potential for training faithful, interpretable models that capture the behavior of pretrained time series models.

State-of-the-art time series models are high-capacity pretrained neural networks often seen as black boxes due to their internal complexity and lack of interpretability. However, practical use requires techniques for auditing and interrogating these models to rationalize their predictions. Interpreting time series models poses a distinct set of challenges because it must achieve two goals:

  • pinpointing the specific location of time series signals that influence the model’s predictions,
  • aligning those signals with interpretable temporal patterns.

Research in model understanding and interpretability has developed post hoc explainers that treat pretrained models as black boxes and do not need access to internal model parameters, activations, or gradients. Recent research, however, shows that such post hoc methods suffer from a lack of faithfulness and stability, among other issues. A model can also be understood by investigating which parts of the input it attends to through attention mapping and by measuring the impact of modifying individual computational steps within a model. Another major line of inquiry investigates internal mechanisms by asking what information the model contains. For example, it has been found that even when a language model is conditioned to output falsehoods, it may include a hidden state that represents the true answer internally. Such a gap between external failure modes and internal states can only be identified by probing model internals. Representation probing has been used to characterize the behaviors of language models, but leveraging these strategies to understand time series models has yet to be attempted.

These lines of inquiry drive the development of in-hoc explainers that build inherent interpretability into the model through architectural modifications or regularization. However, no in-hoc explainers have been developed for time series data. While explainers designed for other modalities can be adapted to time series, their inductive biases translate poorly to time series data, which are not inherently interpretable, and they can miss important temporal structures.

Explaining time series models is challenging for many reasons:

  1. Large time series datasets are not visually interpretable, unlike imaging or text datasets.
  2. Time series often exhibit dense informative features, in contrast to better-explored modalities such as imaging, where informative features are often sparse. In time series datasets, timestep-to-timestep transitions can be negligible, and temporal patterns emerge only when examining time segments and long-term trends. In contrast, in text datasets, word-to-word transitions are informative for language modeling and understanding. Time series interpretability involves understanding the dynamics of the model and identifying trends or patterns.
  3. Another key issue with applying prior methods is that they treat all time steps as separate features, ignoring potential time dependencies and contextual information; we need explanations that are temporally connected and visually digestible.
  4. While understanding predictions of individual samples is valuable, the ability to establish connections between explanations of various samples (for example, in an appropriate latent space) could help alleviate these challenges.

Overview of TimeX

TimeX is a novel time series in-hoc explainer that produces interpretable attribution masks as explanations over time series inputs. An essential contribution of TimeX is the introduction of model behavior consistency, a novel formulation that preserves relationships between the latent space induced by the pretrained model and the latent space induced by TimeX.
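The idea of preserving latent-space relations can be sketched as a simple training loss. The sketch below is illustrative, not TimeX's actual objective: it compares the pairwise cosine-similarity structure of a batch under the pretrained (reference) model with that under the explainer, and penalizes their divergence. The function name and similarity choice are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def behavior_consistency_loss(z_ref, z_exp):
    """Hypothetical sketch of a model-behavior-consistency loss.

    z_ref: batch embeddings from the pretrained (reference) model
    z_exp: embeddings of the same batch from the explainer surrogate
    The two latent spaces may have different dimensionalities; only
    their relational (pairwise-similarity) structure is compared.
    """
    # Pairwise cosine-similarity matrices capture each latent space's
    # relational structure over the batch.
    sim_ref = F.cosine_similarity(z_ref.unsqueeze(1), z_ref.unsqueeze(0), dim=-1)
    sim_exp = F.cosine_similarity(z_exp.unsqueeze(1), z_exp.unsqueeze(0), dim=-1)
    # Penalize divergence between the two relational structures.
    return F.mse_loss(sim_exp, sim_ref)

z_ref = torch.randn(8, 32)   # batch of 8, 32-dim reference embeddings
z_exp = torch.randn(8, 16)   # explainer embeddings, different dimension
loss = behavior_consistency_loss(z_ref, z_exp)
```

Because only pairwise relations are matched, the explainer is free to learn its own latent geometry as long as samples that the pretrained model treats as similar remain similar under the explainer.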

In addition to achieving model behavior consistency, TimeX offers interpretable attribution maps, valuable tools for interpreting the model's predictions. These maps are generated using discrete straight-through estimators, a type of gradient estimator that enables end-to-end training of TimeX models.
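The mechanics of a straight-through estimator can be shown in a few lines. This is a generic sketch of the technique, not TimeX's exact implementation: the forward pass hard-thresholds per-timestep probabilities into a discrete 0/1 mask, while the backward pass passes gradients through unchanged so the mask remains trainable.

```python
import torch

class StraightThroughThreshold(torch.autograd.Function):
    """Generic straight-through estimator: discrete in the forward pass,
    identity gradient in the backward pass."""

    @staticmethod
    def forward(ctx, probs):
        # Hard-threshold into a discrete 0/1 attribution mask.
        return (probs > 0.5).float()

    @staticmethod
    def backward(ctx, grad_output):
        # "Straight-through": pretend the threshold was the identity,
        # so gradients flow back to the underlying probabilities.
        return grad_output

probs = torch.rand(1, 50, requires_grad=True)  # per-timestep mask probabilities
mask = StraightThroughThreshold.apply(probs)   # discrete mask, still trainable
mask.sum().backward()                          # gradients reach `probs`
```

The discrete forward pass is what makes the resulting attribution maps crisp, while the identity backward pass is what keeps the whole pipeline trainable end to end.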

Unlike existing interpretability methods, TimeX goes further by learning a latent space of explanations. By incorporating model behavior consistency and leveraging a latent space of explanations, TimeX provides discrete attribution maps and visual summaries of similar explanations with interpretable temporal patterns.

Landmark Explanation Analysis Using an ECG Dataset

To demonstrate TimeX's landmarks, we show how landmarks serve as summaries of diverse patterns in the ECG dataset. The figure below visualizes the learned landmarks in the latent space of explanations. We select four representative landmarks. Each landmark occupies a distinct region of the latent space, capturing a different type of explanation generated by the model.

For the top two landmarks, we show the three nearest explanations by latent-space distance. Explanations ①, ②, and ③ are all similar to each other while distinctly different from ④, ⑤, and ⑥, both in attribution and in temporal structure. This visualization shows how landmarks can partition the latent space of explanations into interpretable temporal patterns.
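Retrieving the nearest explanations for each landmark reduces to a nearest-neighbor lookup in the explanation latent space. The sketch below is illustrative (the function name and cosine-similarity choice are assumptions, not TimeX's API): it normalizes embeddings so dot products equal cosine similarities, then takes the top-k most similar explanations per landmark.

```python
import torch
import torch.nn.functional as F

def nearest_explanations(landmarks, explanations, k=3):
    """Hypothetical sketch: indices of the k explanations closest to
    each landmark in the explanation latent space."""
    # Normalize so dot products equal cosine similarities.
    lm = F.normalize(landmarks, dim=-1)
    ex = F.normalize(explanations, dim=-1)
    sims = lm @ ex.T                      # (n_landmarks, n_explanations)
    return sims.topk(k, dim=-1).indices  # k nearest neighbors per landmark

landmarks = torch.randn(4, 16)      # four representative landmarks
explanations = torch.randn(100, 16) # latent embeddings of explanations
idx = nearest_explanations(landmarks, explanations, k=3)  # shape (4, 3)
```

Grouping explanations by their nearest landmark is what yields the partition of the latent space into interpretable temporal patterns described above.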

Publication

Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency
Owen Queen, Thomas Hartvigsen, Teddy Koker, Huan He, Theodoros Tsiligkaridis, Marinka Zitnik
Proceedings of Neural Information Processing Systems, NeurIPS 2023 [arXiv]

@inproceedings{queen2023encoding,
title = {Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency},
author = {Queen, Owen and Hartvigsen, Thomas and Koker, Teddy and He, Huan and Tsiligkaridis, Theodoros and Zitnik, Marinka},
booktitle = {Proceedings of Neural Information Processing Systems, NeurIPS},
year      = {2023}
}

Code

PyTorch implementation of TimeX is available in the GitHub repository.

Latest News

Dec 2024:   Unified Clinical Vocabulary Embeddings

New paper: A unified resource provides a new representation of clinical knowledge by unifying medical vocabularies, supporting (1) phenotype risk score analysis across 4.57 million patients and (2) inter-institutional clinician panels that evaluate alignment with clinical knowledge across 90 diseases and 3,000 clinical codes.

Dec 2024:   SPECTRA in Nature Machine Intelligence

Are biomedical AI models truly as smart as they seem? SPECTRA is a framework that evaluates models by considering the full spectrum of cross-split overlap: train-test similarity. SPECTRA reveals gaps in benchmarks for molecular sequence data across 19 models, including LLMs, GNNs, diffusion models, and conv nets.

Nov 2024:   Ayush Noori Selected as a Rhodes Scholar

Congratulations to Ayush Noori on being named a Rhodes Scholar! Such an incredible achievement!

Nov 2024:   PocketGen in Nature Machine Intelligence

Oct 2024:   Activity Cliffs in Molecular Properties

Oct 2024:   Knowledge Graph Agent for Medical Reasoning

Sep 2024:   Three Papers Accepted to NeurIPS

Exciting projects include a unified multi-task time series model, a flow-matching approach for generating protein pockets using geometric priors, and a tokenization method that produces invariant molecular representations for integration into large language models.

Sep 2024:   TxGNN Published in Nature Medicine

Aug 2024:   Graph AI in Medicine

Excited to share a new perspective on Graph Artificial Intelligence in Medicine in Annual Reviews.

Aug 2024:   How Proteins Behave in Context

Harvard Medicine News on our new AI tool that captures how proteins behave in context. Kempner Institute on how context matters for foundation models in biology.

Jul 2024:   PINNACLE in Nature Methods

PINNACLE contextual AI model is published in Nature Methods. Paper. Research Briefing. Project website.

Jul 2024:   Digital Twins as Global Health and Disease Models of Individuals

Paper on digital twins outlining strategies to leverage molecular and computational techniques to construct dynamic digital twins at scales ranging from populations to individuals.

Jul 2024:   Three Papers: TrialBench, 3D Structure Design, LLM Editing

Jun 2024:   TDC-2: Multimodal Foundation for Therapeutics

The Commons 2.0 (TDC-2) is an overhaul of Therapeutic Data Commons to catalyze research in multimodal models for drug discovery by unifying single-cell biology of diseases, biochemistry of molecules, and effects of drugs through multimodal datasets, AI-powered API endpoints, new tasks and benchmarks. Our paper.

May 2024:   Broad MIA: Protein Language Models

Apr 2024:   Biomedical AI Agents

Zitnik Lab  ·  Artificial Intelligence in Medicine and Science  ·  Harvard  ·  Department of Biomedical Informatics