RespDB – A Respiratory Signal Database

Authors: Lukas Wimmert, Frederic Madesta, Tobias Gauer, Rene Werner

This SQLite-based database contains 2,510 respiratory signals (total acquisition time > 90 hours) from 419 patients with thoracic lesions who were treated between February 2013 and May 2022 at the Clinic of Radiotherapy and Radiation Oncology of the University Medical Center Hamburg-Eppendorf.

We believe this comprehensive dataset is of high value to the radiotherapy community as well as to researchers working on time-series analysis tasks such as forecasting and classification. Open access to these retrospectively collected and anonymized respiratory signals was approved by the local ethics board, with the need for written informed consent waived [2023-300334-WF].

Usage

The corresponding GitHub repository provides functions for reading and preprocessing the data.
The provided Entity Relationship Diagram (ERD) documents the database structure. We recommend DB Browser for SQLite as a convenient tool for browsing the database.
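
As a complement to the ERD, the schema can also be inspected programmatically. The following is a minimal sketch that uses only Python's standard sqlite3 module. The file name respdb.db in the usage comment is a placeholder (an assumption, not the actual file name), and no table or column names are assumed, because they are read from the database itself.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [column_name, ...]} for every table in the database."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    tables = [row[0] for row in cur.fetchall()]
    schema = {}
    for table in tables:
        # Identifiers cannot be bound as SQL parameters, so quote the name manually.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


# Example usage (the path is a placeholder for the downloaded database file):
# conn = sqlite3.connect("respdb.db")
# for table, columns in describe_schema(conn).items():
#     print(table, columns)
```

For actual data loading, the reading functions in the GitHub repository remain the intended entry point; this sketch is only meant for a first look at the tables.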

Data

All respiratory signals were recorded in the course of radiotherapy treatment. Signals were acquired using the Varian RPM System during:

  • 4D CT imaging (sampling rate: 25 Hz)
  • 4D CBCT imaging (sampling rate: 66 Hz)
  • Dose delivery (sampling rate: 66 Hz)

The Varian RPM System uses an infrared camera to monitor an external marker block placed on the patient’s chest wall. Of the recorded marker block signal, only the one-dimensional component representing the vertical (anterior-posterior) displacement of the chest wall is considered, which yields univariate time series. All patients breathed freely during acquisition, without visual guidance or coaching. Please also refer to the provided acquisition images and example images for additional context.
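
Because the sampling rate differs between modalities, sample indices have to be converted to seconds with the rate of the respective modality. The following minimal sketch illustrates this under the assumption of uniform sampling. The dictionary keys are illustrative labels and may not match the values stored in the database.

```python
# Sampling rates in Hz as listed above, keyed by acquisition modality.
# The key strings are illustrative and may differ from the database values.
SAMPLING_RATE_HZ = {
    "4D CT": 25.0,
    "4D CBCT": 66.0,
    "dose delivery": 66.0,
}


def time_axis(n_samples, modality):
    """Return the time in seconds of each sample, starting at 0 (uniform sampling assumed)."""
    fs = SAMPLING_RATE_HZ[modality]
    return [i / fs for i in range(n_samples)]


def duration_s(n_samples, modality):
    """Return the signal duration in seconds: number of samples / sampling rate."""
    return n_samples / SAMPLING_RATE_HZ[modality]
```

For example, a 4D CBCT signal with 6,600 samples would span 100 s, whereas the same number of samples at 25 Hz would span 264 s.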

Overview

modality        # signals   # patients   mean signal length (s)
4D CT                 481          419                     98.1
4D CBCT               251           52                     59.6
dose delivery        1778          357                    145.6
total                2510          419                    129.3

Splitting

For typical machine learning or deep learning workflows, we provide a predefined split into training, validation, and testing sets. The split is made at the patient level, i.e., all signals from one patient belong to the same set. This information is stored in the deeplearningdataset table.

set          proportion of data   # signals   # patients
training                   50 %        1262          215
validation                 20 %         514           84
testing                    30 %         726          117
total                     100 %        2502          416

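Note: Eight signals from three patients were retrospectively classified as corrupted and removed from these splits, so the split totals (2,502 signals, 416 patients) are lower than the database totals (2,510 signals, 419 patients).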
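
A patient-level split means that no patient contributes signals to more than one set. The following minimal sketch shows how this property could be checked for any split assignment. The (patient, set) pairs are hypothetical illustration data, and how such pairs are read from the deeplearningdataset table depends on its actual column names, which are not assumed here.

```python
from collections import defaultdict


def find_leaking_patients(assignments):
    """Return {patient_id: set_names} for patients assigned to more than one set.

    `assignments` is an iterable of (patient_id, set_name) pairs, one per signal.
    An empty result means the split is consistent at the patient level.
    """
    sets_per_patient = defaultdict(set)
    for patient_id, set_name in assignments:
        sets_per_patient[patient_id].add(set_name)
    return {p: s for p, s in sets_per_patient.items() if len(s) > 1}


# Hypothetical example data, not taken from the database.
example = [
    ("patient_a", "training"),
    ("patient_a", "training"),
    ("patient_b", "validation"),
    ("patient_c", "testing"),
]
print(find_leaking_patients(example))
```

The same check is useful when defining a custom split, since signals from one patient in both training and testing sets would bias the evaluation.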

Citation

If you use this database, please also cite the underlying publication:

@article{wimmert2024benchmarking,
  doi={10.1002/mp.17038},
  title={Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT},
  author={Wimmert, Lukas and Nielsen, Maximilian and Madesta, Frederic and Gauer, Tobias and Hofmann, Christian and Werner, Rene},
  journal={Medical Physics},
  year={2024},
  publisher={Wiley Online Library}
}

License

MIT