Miscellaneous statistical topics
We collaborate closely with a number of groups and contribute by developing methodology for the scientific questions that emerge, which turns each collaboration into a field of statistical research. A selection of these research projects is described here.
Biostatistics for medical image analysis
AI algorithms for health imaging are frequently validated through large international competitions in which participating algorithms are ranked according to their performance. In a collaboration with the Division of Intelligent Medical Systems, we identified pitfalls in the evaluation of such competitions and introduced an evaluation toolkit that has been increasingly adopted by the community. Transferring standards from clinical trials into this area, we jointly created reporting guidelines. As a result of this cooperation, we have been involved in benchmarking machine learning algorithms and in analyzing various health imaging competitions. Typical challenges include hierarchical data structures (repeated measurements), the appropriate choice of performance measures, and the appropriate quantification of uncertainty.
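One way to quantify the uncertainty of a competition ranking is to bootstrap the test cases and record how often each algorithm ends up in first place. The following is a minimal sketch under stated assumptions, not the toolkit's actual interface: the function names, the "higher is better" metric, and ranking by mean score are all illustrative choices.

```python
import random
from statistics import mean


def rank_by_mean(scores):
    """Return algorithm names ordered best first by mean metric value.

    `scores` maps an algorithm name to its per-case metric values
    (assumption: higher is better, same case order for every algorithm).
    """
    return sorted(scores, key=lambda name: mean(scores[name]), reverse=True)


def bootstrap_first_place(scores, n_boot=1000, seed=0):
    """Estimate how often each algorithm ranks first under case resampling."""
    rng = random.Random(seed)
    names = list(scores)
    n_cases = len(scores[names[0]])
    wins = dict.fromkeys(names, 0)
    for _ in range(n_boot):
        # Resample cases with replacement; use the same cases for all algorithms.
        idx = [rng.randrange(n_cases) for _ in range(n_cases)]
        resampled = {name: [scores[name][i] for i in idx] for name in names}
        wins[rank_by_mean(resampled)[0]] += 1
    return {name: wins[name] / n_boot for name in names}


# Hypothetical per-case scores for two algorithms.
example_scores = {
    "A": [0.90, 0.80, 0.95, 0.85],
    "B": [0.50, 0.40, 0.60, 0.45],
}
first_place = bootstrap_first_place(example_scores, n_boot=200)
```

With hierarchical data, such as several images per patient, the resampling unit would presumably need to be the patient rather than the single case; the sketch does not handle this.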
- Antonelli M, Reinke A, Bakas S, Farahani K, Kopp-Schneider A, Landman BA, Litjens G, Menze B, Ronneberger O, Summers RM, van Ginneken B, Bilello M, Bilic P, Christ PF, Do RKG, Gollub MJ, Heckers SH, Huisman H, Jarnagin WR, McHugo MK, Napel S, Pernicka JSG, Rhode K, Tobon-Gomez C, Vorontsov E, Meakin JA, Ourselin S, Wiesenfarth M, Arbeláez P, Bae B, Chen S, Daza L, Feng J, He B, Isensee F, Ji Y, Jia F, Kim I, Maier-Hein K, Merhof D, Pai A, Park B, Perslev M, Rezaiifar R, Rippel O, Sarasua I, Shen W, Son J, Wachinger C, Wang L, Wang Y, Xia Y, Xu D, Xu Z, Zheng Y, Simpson AL, Maier-Hein L, Cardoso MJ. The Medical Segmentation Decathlon. Nat Commun. 2022 Jul 15;13(1):4128. doi: 10.1038/s41467-022-30695-9.
- Maier-Hein, L., Eisenmann, M., Reinke, A., Onogur, S., Stankovic, M., Scholz, P., ... & Kopp-Schneider, A. (2018). Why rankings of biomedical image analysis competitions should be interpreted with care. Nature communications, 9(1), 1-13.
- Maier-Hein L, Reinke A, Kozubek M, Martel AL, Arbel T, Eisenmann M, Hanbury A, Jannin P, Müller H, Onogur S, Saez-Rodriguez J, van Ginneken B, Kopp-Schneider A, Landman BA. BIAS: Transparent reporting of biomedical image analysis challenges. Med Image Anal. 2020 Dec;66:101796. doi: 10.1016/j.media.2020.101796.
- Maier-Hein, L., Reinke, A., Christodoulou, E., Glocker, B., Godau, P., Isensee, F., ... & Jäger, P. F. (2022). Metrics reloaded: Pitfalls and recommendations for image analysis validation. arXiv preprint arXiv:2206.01653.
- Schelb, P., Kohl, S., Radtke, J. P., Wiesenfarth, M., Kickingereder, P., Bickelhaupt, S., ... & Bonekamp, D. (2019). Classification of cancer at prostate MRI: deep learning versus clinical PI-RADS assessment. Radiology, 293(3), 607-617.
- Wiesenfarth M, Reinke A, Landman BA, Eisenmann M, Saiz LA, Cardoso MJ, Maier-Hein L, Kopp-Schneider A. Methods and open-source toolkit for analyzing and visualizing challenge results. Sci Rep. 2021 Jan 27;11(1):2369. doi: 10.1038/s41598-021-82017-6.
Biostatistics for molecular neuropathology research in meningioma
In a long-term collaboration with the University Hospital HD Neuropathology (Felix Sahm's group), we assessed the prognostic impact of molecular biomarkers and developed integrated risk stratifications using different types of histological and molecular factors (DNA-methylation classifier, CNVs, mutations) in meningioma patients. One of the aims is to differentiate between patients who require treatment and those who can be kept under surveillance. Measures of prediction accuracy and discrimination for time-to-event endpoints are commonly used to characterize and compare the performance of competing risk models. These analyses are based on retrospective and prospective multi-center data from international collaborations.
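As an illustration of a discrimination measure for time-to-event endpoints, the sketch below computes Harrell's concordance index for right-censored data. The text above does not name a specific measure, so choosing this index is an assumption made for the example, and the function name and toy data are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       observed follow-up times
    events:      1 if the event (e.g. recurrence) was observed, 0 if censored
    risk_scores: model output, where a higher score means higher predicted risk

    A pair (i, j) is comparable if subject i had an observed event strictly
    before the observed time of subject j. Tied risk scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
c_model_1 = harrell_c(toy_times, toy_events, [0.9, 0.7, 0.5, 0.1])
c_model_2 = harrell_c(toy_times, toy_events, [0.9, 0.1, 0.5, 0.7])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so two risk models can be compared on the same cohort by their index values.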
- Katz LM*, Hielscher T*, et al. Loss of histone H3K27me3 identifies a subset of meningiomas with increased risk of recurrence. Acta Neuropathol. 2018 Jun;135(6):955-963.
- Sievers P*, Hielscher T*, et al. CDKN2A/B homozygous deletion is associated with early recurrence in meningiomas. Acta Neuropathol. 2020 Sep;140(3):409-413.
- Maas SLN*, Stichel D*, Hielscher T*, Sievers P*, et al. Integrated Molecular-Morphologic Meningioma Classification: A Multicenter Retrospective Analysis, Retrospectively and Prospectively Validated. J Clin Oncol. 2021 Dec 1;39(34):3839-3852.
- Hielscher T, et al. Clinical implementation of integrated molecular-morphologic risk prediction for meningioma. Brain Pathol. 2022 Nov 14:e13132.
Biostatistics at large
Questions that come up repeatedly in collaborations occasionally motivate research on a statistical topic. Examples are given here.
In some situations, an expensive marker is to be measured in an existing clinical cohort, but only limited resources are available. We modified the nested case-control and the case-cohort design for the proportional hazards model to deal with such a scenario.
Often, DKFZ experiments result in multiple quantitative measurements per subject, in which case a mixed model analysis would be called for. To avoid this additional complexity, we propose to base the analysis on averaged observations per subject, an ad hoc approach that works well if the number of measurements per subject is not too unbalanced.
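The averaging approach can be sketched as follows: collapse the repeated measurements to one mean per subject, then apply a standard two-group comparison to those means. This is a minimal illustration assuming two groups and a Welch t statistic; the function names and the toy data are hypothetical and not taken from the publication.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def subject_means(records):
    """Collapse (subject_id, group, value) records to one mean per subject.

    Returns a dict mapping each group to the list of its subject means.
    """
    values = defaultdict(list)
    group_of = {}
    for subject, group, value in records:
        values[subject].append(value)
        group_of[subject] = group
    by_group = defaultdict(list)
    for subject, vals in values.items():
        by_group[group_of[subject]].append(mean(vals))
    return dict(by_group)


def welch_t(x, y):
    """Welch t statistic and approximate degrees of freedom for two samples."""
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical data: one or two measurements per subject, three subjects per group.
toy_records = [
    ("m1", "control", 1.0), ("m1", "control", 3.0),
    ("m2", "control", 4.0),
    ("m3", "control", 5.0), ("m3", "control", 7.0),
    ("m4", "treated", 7.0), ("m4", "treated", 9.0),
    ("m5", "treated", 10.0),
    ("m6", "treated", 11.0), ("m6", "treated", 13.0),
]
means = subject_means(toy_records)
t_stat, dof = welch_t(means["control"], means["treated"])
```

The test is carried out on three values per group, not on the five raw measurements, so the subject remains the unit of analysis.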
When a method comparison study is performed, e.g. in radiology, the aim is to evaluate the agreement of measurements of different methods. We presented the Bland–Altman plot with limits of agreement as the correct analysis methodology and discussed other scaled and unscaled indices of agreement as well as commonly used inappropriate approaches.
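The limits of agreement behind a Bland–Altman plot are computed from the paired differences between the two methods: the mean difference (bias) plus and minus 1.96 standard deviations of the differences. The sketch below shows only this calculation and leaves out the plot; the function name and the example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are bias +/- z * SD of the paired differences, which covers
    about 95% of differences if they are approximately normally distributed.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods must measure the same subjects")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical measurements of the same four subjects by two methods.
readings_a = [10.0, 12.0, 14.0, 16.0]
readings_b = [9.0, 12.0, 15.0, 15.0]
bias, lower, upper = bland_altman_limits(readings_a, readings_b)
```

In the plot itself, the differences are drawn against the pairwise means of the two methods, with horizontal lines at the bias and at both limits.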
- Edelmann D, Ohneberg K, Becker N, Benner A, Schumacher M. Which patients to sample in clinical cohort studies when the number of events is high and measurement of additional markers is constrained by limited resources. Cancer medicine 2020, 9:7398-7406.
- Holland-Letz T, Kopp-Schneider A: Drawing statistical conclusions from experiments with multiple quantitative measurements per subject. Radiotherapy and oncology 2020, 152:30-33.
- Kopp-Schneider A, Hielscher T. How to evaluate agreement between quantitative measurements. Radiotherapy and oncology 2019, 141:321-326.