Self-Supervised Contrastive Pre-Training For Time Series

Time series datasets present a unique challenge for pre-training because of the potential mismatch between pre-training and fine-tuning that can lead to poor performance. Domain adaptation methods can mitigate unwanted distribution shifts, including shifts in dynamics, varying trends, and long-range and short-range cyclic effects. To do so, these methods require access to data points in the target (fine-tuning) domain, meaning they must access target data points early on during pre-training, which is not possible in the real world.

To address this challenge, we put forward a principle: time-based and frequency-based representations of the same time series and its local augmentations must produce consistent latent representations and predictions. This generalizable property, called Time-Frequency Consistency (TF-C), is theoretically rooted in signal processing theory and leads to effective pre-training.

Time series are relevant for many areas, including medical diagnosis and healthcare settings. While representation learning has considerably advanced analysis of time series, learning broadly generalizable representations for time series remains a fundamentally challenging problem. Being able to generate such representations can directly improve pre-training. The central problem is transferring knowledge from a time series dataset to other different datasets during fine-tuning. The difficulty due to distribution shifts and other issues is compounded by the complexity of time series: large variations of temporal dynamics across datasets, varying semantic meaning, system factors, etc.

Supervised pre-training requires large annotated datasets, creating an obstacle to deep learning deployment. For example, in medical domains, experts often disagree on ground-truth labeling, which can introduce unintended biases (e.g., ECG signals falling outside of normal and abnormal rhythms).

We introduce a generalizable concept called Time-Frequency Consistency (TF-C) and use it to develop a broadly generalizable pre-training strategy.

Motivation for TF-C

The TF-C approach uses self-supervised contrastive learning to transfer knowledge across time series domains and pre-train models. The approach builds on the fundamental duality between time and frequency views of time signals. TF-C embeds time-based and frequency-based views learned from the same time series sample such that they are closer to each other in the joint time-frequency space; the embeddings are farther apart if they are associated with different time series samples.

The broad scope of the TF-C principle indicates that the approach promotes positive transfer across different time-series datasets. That is true even when datasets are complex in terms of considerable variations of temporal dynamics across datasets, varying semantic meaning, irregular sampling, and systemic factors.

The following figure gives an overview of the TF-C approach. Given a time series sample, the resulting time and frequency based embeddings are close to each other in a latent time-frequency space (panel a). TF-C expands the scope and applicability of pre-training; it can transfer pre-trained models to diverse scenarios (panel b).

TF-C approach

TF-C has four neural network components (as shown in the following figure):

  • Contrastive time encoder,
  • Contrastive frequency encoder, and
  • two cross-space projectors that map time-based and frequency-based representations to the same time-frequency space.

To position time-based and frequency-based embeddings close together, TF-C specifies the following constrastive learning objective:

  1. We first apply time-domain augmentations to the input time series sample. The embeddings of the original and the augmented views produced by the time-domain encoder are used to compute a term in the contrastive loss.

  2. We transform the input time series to its frequency spectrum. Then, similarly to before, we apply frequency-domain augmentations to the input time series sample, produce the embeddings, and compute another term in the contrastive loss.

  3. TF-C requires the consistency between time-domain and frequency-domain representations, which is achieved by a well-designed consistency loss item measured in the time-frequency space. As shown in the figure, our total contrastive loss is a weighted sum of these three terms.

  4. Finally, during the fine-tuning phase, we concatenate the time-domain and frequency-domain embeddings to form the sample embedding. This learned sample embedding can be fed into a downstream task such as classification.

Attractive properties of TF-C

  • A generalizable assumption (TF-C) for time series: It is grounded in the signal theory that a time series can be represented equivalently in either the time or frequency domain. Hence, our assumption is very reasonable that TF-C is invariant to different time-series datasets and can guide the development of effective pre-training models.

  • Self-supervised pretraining for diverse scenarios: The TF-C model can be used with pre-training dataset that are entirely unlabelled. Results show that TF-C provides effective transfer across four scenarios involving time series datasets with different kinds of sensors and measurements and still achieve strong performance when the domain gap is large (e.g., when pre-training and fine-tuning data points are given in different feature spaces or have different semantic meaning).


Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency
Xiang Zhang, Ziyuan Zhao, Theodoros Tsiligkaridis, and Marinka Zitnik
Proceedings of Neural Information Processing Systems, NeurIPS 2022 [arXiv]

title = {Self-Supervised Contrastive Pre-Training For Time Series via Time-Frequency Consistency},
author = {Zhang, Xiang and Zhao, Ziyuan and Tsiligkaridis, Theodoros and Zitnik, Marinka},
booktitle = {Proceedings of Neural Information Processing Systems, NeurIPS},
year      = {2022}


PyTorch implementation of TF-C and all datasets are available in the GitHub repository.


  • SleepEEG dataset of whole-night EEG recordings monitored by a sleep cassette

  • Epilepsy dataset of single-channel EEG measurements from 500 subjects

  • FD-A dataset of an electromechanical drive system monitoring the condition of rolling bearings

  • FD-B dataset of an electromechanical drive system monitoring the condition of rolling bearings

  • HAR dataset of human daily activities, including walking, walking upstaris, walking downstairs, sitting, standing, and laying

  • Gesture dataset of accelerometer measurements for eight hand gestures

  • ECG dataset of ECG measuruments across four underlying conditions of cardiac arrhythmias

  • EMG dataset of EMG recordings of muscular dystrophies and neuropathies


Latest News

Apr 2024:   Biomedical AI Agents

Mar 2024:   Efficient ML Seminar Series

We started a Harvard University Efficient ML Seminar Series. Congrats to Jonathan for spearheading this initiative. Harvard Magazine covered the first meeting focusing on LLMs.

Mar 2024:   UniTS - Unified Time Series Model

UniTS is a unified time series model that can process classification, forecasting, anomaly detection and imputation tasks within a single model with no task-specific modules. UniTS has zero-shot, few-shot, and prompt learning capabilities. Project website.

Mar 2024:   Weintraub Graduate Student Award

Michelle receives the 2024 Harold M. Weintraub Graduate Student Award. The award recognizes exceptional achievement in graduate studies in biological sciences. News Story. Congratulations!

Mar 2024:   PocketGen - Generating Full-Atom Ligand-Binding Protein Pockets

PocketGen is a deep generative model that generates residue sequence and full-atom structure of protein pockets, maximizing binding to ligands. Project website.

Feb 2024:   SPECTRA - Generalizability of Molecular AI

Feb 2024:   Kaneb Fellowship Award

The lab receives the John and Virginia Kaneb Fellowship Award at Harvard Medical School to enhance research progress in the lab.

Feb 2024:   NSF CAREER Award

The lab receives the NSF CAREER Award for our research in geometric deep learning to facilitate algorithmic and scientific advances in therapeutics.

Feb 2024:   Dean’s Innovation Award in AI

Jan 2024:   AI's Prospects in Nature Machine Intelligence

We discussed AI’s 2024 prospects with Nature Machine Intelligence, covering LLM progress, multimodal AI, multi-task agents, and how to bridge the digital divide across communities and world regions.

Jan 2024:   Combinatorial Therapeutic Perturbations

New paper introducing PDGrapher for combinatorial prediction of chemical and genetic perturbations using causally-inspired neural networks.

Nov 2023:   Next Generation of Therapeutics Commons

Oct 2023:   Structure-Based Drug Design

Geometric deep learning has emerged as a valuable tool for structure-based drug design, to generate and refine biomolecules by leveraging detailed three-dimensional geometric and molecular interaction information.

Oct 2023:   Graph AI in Medicine

Graph AI models in medicine integrate diverse data modalities through pre-training, facilitate interactive feedback loops, and foster human-AI collaboration, paving the way to clinically meaningful predictions.

Sep 2023:   New papers accepted at NeurIPS

Sep 2023:   Future Directions in Network Biology

Excited to share our perspectives on current and future directions in network biology.

Aug 2023:   Scientific Discovery in the Age of AI

Jul 2023:   PINNACLE - Contextual AI protein model

PINNACLE is a contextual AI model for protein understanding that dynamically adjusts its outputs based on biological contexts in which it operates. Project website.

Jun 2023:   Our Group is Joining the Kempner Institute

Excited to join Kempner’s inaugural cohort of associate faculty to advance Kempner’s mission of studying the intersection of natural and artificial intelligence.

Jun 2023:   Welcoming a New Postdoctoral Fellow

An enthusiastic welcome to Shanghua Gao who is joining our group as a postdoctoral research fellow.

Zitnik Lab  ·  Artificial Intelligence in Medicine and Science  ·  Harvard  ·  Department of Biomedical Informatics