Authors: Wimmert, Lukas and Madesta, Frederic and Gauer, Tobias and Werner, Rene
This SQLite-based database contains 2,510 respiratory signals (total acquisition time > 90 hours) from 419 patients with thoracic lesions who were treated between February 2013 and May 2022 at the Clinic of Radiotherapy and Radiation Oncology of the University Medical Center Hamburg-Eppendorf.
We believe this comprehensive dataset is of high value to the radiotherapy community as well as to researchers working on time-series analysis tasks such as forecasting and classification. Open access to these retrospectively collected and anonymized respiratory signals was approved by the local ethics board, with the need for written informed consent waived [2023-300334-WF].
Please visit the corresponding GitHub repository for data reading and preprocessing functionalities.
Additionally, review the provided Entity Relationship Diagram (ERD) to understand the database structure.
We recommend using DB Browser for SQLite as a convenient tool for browsing the database.
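For a programmatic first look at the database, the following minimal sketch lists every table and its columns using only Python's standard `sqlite3` module. It makes no assumptions about the schema. The file name in the usage comment is a hypothetical placeholder, and the reading and preprocessing functions in the GitHub repository remain the intended way to load the signals themselves.

```python
import sqlite3
from contextlib import closing


def list_tables_and_columns(db_path):
    """Return {table_name: [column_name, ...]} for every table in an SQLite file."""
    schema = {}
    # closing() makes sure the connection is released when the block exits.
    with closing(sqlite3.connect(db_path)) as conn:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        for (table,) in tables:
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
    return schema


# Usage (the file name below is a hypothetical placeholder, not the real one):
# for table, columns in list_tables_and_columns("respiratory_signals.db").items():
#     print(table, columns)
```

The printed table and column names can then be compared against the ERD before writing any queries.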
All respiratory signals were recorded in the course of radiotherapy treatment. Signals were acquired using the Varian RPM System during 4D CT imaging, 4D CBCT imaging, and dose delivery.
The Varian RPM System monitors an external marker block placed on the patient’s chest wall with an infrared camera. From the obtained marker block signal, only the one-dimensional component representing the vertical (anterior-posterior) displacement of the chest wall is considered, resulting in a univariate time series for each recording. All patients breathed freely during acquisition without visual guidance or coaching. Please also refer to the provided acquisition images and example images for additional context.
| modality | # signals | # patients | mean signal length (s) |
|---|---|---|---|
| 4D CT | 481 | 419 | 98.1 |
| 4D CBCT | 251 | 52 | 59.6 |
| dose delivery | 1778 | 357 | 145.6 |
| total | 2510 | 419 | 129.3 |
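
The totals row can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers copied from the table above. Because the mean signal length is rounded, the resulting acquisition time is approximate. The per-modality patient counts are not expected to sum to 419, since a patient can contribute signals to more than one modality.

```python
# Values copied from the modality table above.
signals_per_modality = {"4D CT": 481, "4D CBCT": 251, "dose delivery": 1778}
total_signals = 2510
mean_length_s = 129.3  # mean signal length over all signals, in seconds (rounded)

# The per-modality signal counts should add up to the reported total.
assert sum(signals_per_modality.values()) == total_signals

# Total acquisition time = number of signals x mean signal length.
total_hours = total_signals * mean_length_s / 3600
print(f"total acquisition time: {total_hours:.1f} h")
```

The result is slightly above 90 hours, consistent with the total acquisition time stated in the introduction.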
For typical machine or deep learning workflows, we provide a predefined data split into training, validation, and testing sets at the patient level (i.e., all signals from one patient belong to the same set). This information is stored in the deeplearningdataset table.
| set | proportion of data | # signals | # patients |
|---|---|---|---|
| training | 50 % | 1262 | 215 |
| validation | 20 % | 514 | 84 |
| testing | 30 % | 726 | 117 |
| total | 100 % | 2502 | 416 |
Note: 8 signals (from three patients) were retrospectively classified as corrupted and removed from these splits.
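When building a custom split or subsampling the predefined one, it can be useful to confirm that the split is still at the patient level. The following is a minimal sketch of such a check, not the authors' implementation. It operates on plain `(patient_id, set_name)` pairs, where both names and the toy data are hypothetical; the actual column names of the deeplearningdataset table should be taken from the ERD. The second part recomputes the signal shares from the counts in the table above.

```python
from collections import defaultdict


def find_leaking_patients(assignments):
    """Return patients that appear in more than one set.

    `assignments` is an iterable of (patient_id, set_name) pairs, one per signal.
    Both field names are hypothetical and not taken from the database schema.
    """
    sets_per_patient = defaultdict(set)
    for patient_id, set_name in assignments:
        sets_per_patient[patient_id].add(set_name)
    return {p: sorted(s) for p, s in sets_per_patient.items() if len(s) > 1}


# Hypothetical toy data: patient "p2" has signals in two different sets.
toy_assignments = [
    ("p1", "training"),
    ("p1", "training"),
    ("p2", "training"),
    ("p2", "testing"),
    ("p3", "validation"),
]
leaks = find_leaking_patients(toy_assignments)
print(leaks)  # a patient-level split should give an empty dict

# Signal counts copied from the split table above.
signals_per_set = {"training": 1262, "validation": 514, "testing": 726}
total = sum(signals_per_set.values())
shares = {name: round(100 * n / total, 1) for name, n in signals_per_set.items()}
print(total, shares)
```

The recomputed shares come out close to the nominal 50 %, 20 %, and 30 %. Small deviations are expected when patients, rather than individual signals, are assigned to sets.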
If you use this database, please also cite the underlying publication:
@article{wimmert2024benchmarking,
doi={10.1002/mp.17038},
title={Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT},
author={Wimmert, Lukas and Nielsen, Maximilian and Madesta, Frederic and Gauer, Tobias and Hofmann, Christian and Werner, Rene},
journal={Medical Physics},
year={2024},
publisher={Wiley Online Library}
}