Biomedical Image Analysis ChallengeS (BIAS) Initiative
The importance of data science techniques in almost all fields of biomedicine is increasing at an enormous pace. This holds particularly true for the field of biomedical image analysis, which plays a crucial role in many areas including tumor detection, classification, staging and progression modeling as well as automated analysis of cancer cell images acquired using microscopy.
While clinical trials are the state-of-the-art method for assessing the effect of new medication in a comparative manner, benchmarking in the field of image analysis is performed by so-called challenges. Challenges are international competitions, typically hosted by individual researchers, institutes, or societies, that aim to assess the performance of multiple algorithms on identical data sets and encourage benchmarking. They are often published in prestigious journals, are associated with significant amounts of prize money (up to €1 million on platforms like Kaggle) and receive a huge amount of attention, indicated by the number of downloads, citations and views. However, our recent comprehensive analysis of biomedical image analysis challenges (Maier-Hein et al., 2018), which involved 38 researchers from 30 institutes worldwide, revealed a huge discrepancy between the impact of a challenge and the quality (control) of its design and reporting standard. We showed that (1) "common practice related to challenge reporting is poor and does not allow for adequate interpretation and reproducibility of results", (2) "challenge design is very heterogeneous and lacks common standards, although these are requested by the community" and (3) "challenge rankings are sensitive to a range of challenge design parameters, such as the metric variant applied, the type of test case aggregation performed and the observer annotating the data" (Maier-Hein et al., 2018). We also showed that security holes in challenge design can potentially be exploited by both challenge organizers and participants to tune rankings, e.g. by selective test case submission (participants) or retrospective tuning of the ranking scheme (organizers) (Reinke et al., 2018). The conclusion from our studies was that "journal editors and reviewers should provide motivation to raise challenge quality by establishing a rigorous review process."
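To make the sensitivity of rankings to test case aggregation concrete, the following minimal Python sketch ranks three algorithms under three aggregation schemes. The scores are invented for illustration and are not taken from any challenge or from the cited studies; the function names `rank_by_aggregate` and `rank_then_aggregate` are hypothetical, and ties are not handled.

```python
from statistics import mean, median

# Hypothetical per-case scores (higher is better) for three algorithms
# on five test cases. These numbers are made up for illustration only.
scores = {
    "A": [0.90, 0.90, 0.90, 0.90, 0.10],  # strong, but fails on one case
    "B": [0.80, 0.80, 0.80, 0.80, 0.80],  # consistent
    "C": [0.85, 0.85, 0.70, 0.70, 0.70],  # mixed
}


def rank_by_aggregate(scores, aggregate):
    """Aggregate-then-rank: summarize each algorithm's scores, then sort."""
    summary = {name: aggregate(values) for name, values in scores.items()}
    return sorted(summary, key=summary.get, reverse=True)


def rank_then_aggregate(scores):
    """Rank-then-aggregate: rank per test case, then sort by mean rank."""
    names = list(scores)
    n_cases = len(next(iter(scores.values())))
    rank_sums = {name: 0 for name in names}
    for case in range(n_cases):
        # Rank 1 goes to the best score on this case (no tie handling).
        ordered = sorted(names, key=lambda n: scores[n][case], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            rank_sums[name] += rank
    mean_rank = {name: rank_sums[name] / n_cases for name in names}
    return sorted(mean_rank, key=mean_rank.get)


print("mean, then rank:  ", rank_by_aggregate(scores, mean))
print("median, then rank:", rank_by_aggregate(scores, median))
print("rank, then mean:  ", rank_then_aggregate(scores))
```

With these invented scores, algorithm A is last when the mean is used but first under the median and under per-case ranking, because its single failure case dominates the mean only.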
The Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network is a global initiative with the aim of improving the quality of research publications and research itself. A key mission in this context is to achieve accurate, complete and transparent reporting of health research studies to support reproducibility and usefulness. A core activity of the network is to assist in the development, dissemination and implementation of robust reporting guidelines, where a guideline is defined as "a checklist, flow diagram or structured text to guide authors in reporting a specific type of research" (The EQUATOR network, 2008). Between 2006 and 2019, more than 400 reporting guidelines were published under the umbrella of the EQUATOR network. A well-known guideline is the CONSORT statement developed for reporting of randomized controlled trials. Prominent journals, such as The Lancet, JAMA or the British Medical Journal, require the CONSORT checklist to be submitted along with the actual paper when reporting results of a randomized controlled trial.
Inspired by this success story, the Biomedical Image Analysis ChallengeS (BIAS) initiative was founded by the MICCAI board challenge working group, known since 2021 as the MICCAI Special Interest Group on Biomedical Image Analysis Challenges (http://www.miccai.org/special-interest-groups/challenges) and led by Prof. Dr. Lena Maier-Hein. Our goal is to bring biomedical image analysis challenges to the next level of quality.
As a first step towards better scientific practice, we presented a guideline (Maier-Hein et al., 2020) to standardize and facilitate the writing and reviewing process of biomedical image analysis challenges and to help readers of challenge reports interpret and reproduce results by making relevant information explicit. The guideline was used to enhance the quality of challenge designs and to simplify the challenge submission process for prestigious conferences (MICCAI 2018, 2019 and 2020, the IEEE International Symposium on Biomedical Imaging (ISBI) 2020 and the Conference on Medical Imaging with Deep Learning (MIDL) 2020), see https://www.biomedical-challenges.org/. In our most recent contribution (Wiesenfarth et al., 2021), we presented methodology along with an open-source framework to facilitate the visualization of challenge results.
A light version of the checklist, which can be used for journal submissions and reviews of challenge reports, can be downloaded here.
As stated above, common practice in biomedical challenge organization can be exploited to tune challenge rankings. To prevent such incidents and to improve the quality of challenges, the MICCAI board challenge working group and the MICCAI 2020 Satellite Event team decided to introduce the concept of challenge registration. Similar to how clinical trials have to be registered before starting, the complete design of accepted MICCAI challenges will be put online before the challenges take place. Changes to the design (e.g. to the metrics or ranking schemes applied) must be well justified and officially registered online (as a new version of the challenge design). Registered challenges are listed here: http://www.miccai.org/special-interest-groups/challenges/miccai-registered-challenges/.
The BIAS initiative furthermore aims to provide best practice recommendations for choosing the correct performance metrics for a specific biomedical research problem. Metrics are the key to objective, transparent and comparative performance assessment in the field of image analysis. Unfortunately, the field suffers from a lack of guidelines and standards for choosing appropriate performance metrics that reflect the task-specific needs, and practical pitfalls in metric usage are often overlooked (Reinke et al., 2021). To address this issue, the BIAS initiative founded a large consortium with more than 60 international experts (including researchers from Google Health, Harvard University, Imperial College, Radiological Society of North America (RSNA) and National Institutes of Health (NIH)) aiming to provide guidelines and tools for choosing performance metrics in a problem-aware manner. The resulting recommendation framework involves capturing the characteristics of the given research problem in a problem fingerprint, which comprises domain interest-related, target structure-related and test set-related properties. Based on this fingerprint, an image processing task-specific mapping is applied to match the task to a set of metrics that reflect the needs of the target domain.
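The idea of a problem fingerprint that is mapped to candidate metrics can be sketched in a few lines of Python. This is only an illustration of the concept under strong assumptions: the field names, the two supported tasks and the selection rules below are hypothetical and do not reproduce the consortium's actual fingerprint or mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemFingerprint:
    """Hypothetical fingerprint; every field is an illustrative assumption."""
    task: str                        # "segmentation" or "classification"
    boundary_accuracy_matters: bool  # domain interest-related property
    small_target_structures: bool    # target structure-related property
    class_imbalance: bool            # test set-related property


def suggest_metrics(fingerprint):
    """Map a fingerprint to candidate metrics using toy, task-specific rules."""
    if fingerprint.task == "segmentation":
        metrics = ["Dice similarity coefficient"]
        # Toy rule: add a boundary-based metric when contours or small
        # structures are of interest.
        if (fingerprint.boundary_accuracy_matters
                or fingerprint.small_target_structures):
            metrics.append("Normalized surface distance")
        return metrics
    if fingerprint.task == "classification":
        # Toy rule: prefer a prevalence-robust metric under class imbalance.
        if fingerprint.class_imbalance:
            return ["Balanced accuracy"]
        return ["Accuracy"]
    raise ValueError(f"Unsupported task: {fingerprint.task!r}")


example = ProblemFingerprint(
    task="segmentation",
    boundary_accuracy_matters=True,
    small_target_structures=False,
    class_imbalance=False,
)
print(suggest_metrics(example))
```

The point of the design is that the metric choice becomes an explicit function of stated problem properties rather than of habit, so the reasoning behind a selection can be inspected and reported.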
Members and collaborators:
- Lena Maier-Hein (Chair), Annika Reinke, Matthias Eisenmann, Sinan Onogur, Patrick Godau, Minu D. Tizabi, Tim Rädsch, Doreen Heckmann-Nötzel, Emre Kavur, Division of Computer Assisted Medical Interventions (CAMI), German Cancer Research Center (DKFZ)
- Annette Kopp-Schneider, Manuel Wiesenfarth, Division of Biostatistics, DKFZ
- Spyridon (Spyros) Bakas, Center for Biomedical Image Computing & Analytics (CBICA), Perelman School of Medicine, University of Pennsylvania
- Michal Kozubek, Centre for Biomedical Image Analysis, Masaryk University
- Bennett A. Landman, Electrical Engineering, Vanderbilt University
- Anne L. Martel, Physical Sciences, Sunnybrook Research Institute; Department of Medical Biophysics, University of Toronto
- Tal Arbel, Centre for Intelligent Machines, McGill University
- Allan Hanbury, Institute of Information Systems Engineering, Technische Universität (TU) Wien; Complexity Science Hub Vienna
- Pierre Jannin, Laboratoire Traitement du Signal et de l'Image (LTSI) - UMR_S 1099, Université de Rennes 1, Inserm
- Henning Müller, University of Applied Sciences Western Switzerland (HES-SO); Medical Faculty, University of Geneva
- Julio Saez-Rodriguez, Institute of Computational Biomedicine, Heidelberg University; Faculty of Medicine, Heidelberg University Hospital; Joint Research Centre for Computational Biomedicine, Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen
- Bram van Ginneken, Radboud University Medical Center; Fraunhofer MEVIS
- Paul Jäger, Fabian Isensee, Jens Petersen, Michael Baumgartner, Klaus Maier-Hein, Division of Medical Image Computing, DKFZ
- Carole H. Sudre, School of Biomedical Engineering and Imaging Science, King's College London (KCL); Centre for Medical Image Computing, University College London (UCL); MRC Unit for Lifelong Health and Ageing at UCL
- Laura Acion, Department of Psychiatry, University of Iowa; Instituto de Cálculo, Universidad de Buenos Aires – CONICET
- Michela Antonelli, Centre for Medical Image Computing, UCL; School of Biomedical Engineering and Imaging Science, KCL
- Peter Bankhead, University of Edinburgh
- Arriel Benis, Holon Institute of Technology (HIT)
- M. Jorge Cardoso, Department of Biomedical Engineering, School of Biomedical Engineering & Imaging Sciences, KCL; Department of Medical Physics and Biomedical Engineering, UCL
- Veronika Cheplygina, IT University of Copenhagen
- Beth A. Cimini, Broad Institute of MIT and Harvard
- Gary S. Collins, Director of the UK EQUATOR Centre; Centre for Statistics in Medicine, University of Oxford; Botnar Research Centre
- Keyvan Farahani, Center for Biomedical Informatics and Information Technology, National Cancer Institute (NIH)
- Ben Glocker, Biomedical Image Analysis Group, Department of Computing, Imperial College London
- Fred Hamprecht, Heidelberg Collaboratory for Image Processing (HCI), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University
- Daniel Hashimoto, Surgical Artificial Intelligence and Innovation Laboratory at the Massachusetts General Hospital
- Michael Hoffman, University Health Network, Toronto; Department of Medical Biophysics, Department of Computer Science, University of Toronto
- Merel Huisman, Department of Radiology, University Medical Center Utrecht
- Charles E. Kahn, Perelman School of Medicine, University of Pennsylvania
- Hannes Kenngott, Felix Nickel, Department of General, Visceral and Transplantation Surgery, Heidelberg University Hospital
- Alexandros Karargyris, IHU Strasbourg
- Alan Karthikesalingam, Google Health
- Jens Kleesiek, Translational Image-guided Oncology (TIO), Institute for AI in Medicine (IKIM), University Medicine Essen
- Anna Kreshuk, Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL)
- Tahsin Kurc, Stony Brook Cancer Center, Stony Brook University
- Geert Litjens, Department of Pathology, Radboud University Medical Center; Radboud University Medical Center, Radboud Institute for Health Sciences
- Amin Madani, The Institute for Education Research (TIER), University Health Network (UHN), University of Toronto
- Peter Mattson, Google
- Erik Meijering, School of Computer Science and Engineering, University of New South Wales
- Bjoern Menze, Department of Quantitative Biomedicine, University of Zurich
- David Moher, Centre for Journalology, Clinical Epidemiology Program, Ottawa Hospital Research Institute; School of Epidemiology and Public Health, Faculty of Medicine, University of Ottawa
- Carl Moons, Clinical Epidemiology, Julius Center, UMC Utrecht; Director Health Innovation Netherlands
- Nasir Rajpoot, Tissue Image Analytics Laboratory, Department of Computer Science, University of Warwick
- Mauricio Reyes, Healthcare Imaging A.I., Insel Data Science Center, Bern University Hospital
- Michael Riegler, SimulaMet
- Nicola Rieke, NVIDIA; Technical University of Munich (TUM)
- Clarisa Sánchez Gutiérrez, Informatics Institute, Faculty of Science, University of Amsterdam
- Shravya Shetty, Google Health
- Bram Stieltjes, Department of Radiology, University Hospital of Basel
- Ronald M. Summers, Radiology and Imaging Sciences, Clinical Center, NIH
- Aziz A. Taha, Research Studio Data Science, Research Studios Austria, Salzburg, Austria
- Sotirios A. Tsaftaris, School of Engineering, The University of Edinburgh
References
Maier-Hein, L., Eisenmann, M., Reinke, A., Onogur, S., Stankovic, M., Scholz, P., Arbel, T., Bogunovic, H., Bradley, A. P., Carass, A., Feldmann, C., Frangi, A. F., Full, P. M., van Ginneken, B., Hanbury, A., Honauer, K., Kozubek, M., Landman, B. A., März, K., ... Kopp-Schneider, A. (2018). Why rankings of biomedical image analysis competitions should be interpreted with care. Nature Communications, 9(1), 5217. https://doi.org/10.1038/s41467-018-07619-7
Maier-Hein, L., Reinke, A., Kozubek, M., Martel, A. L., Arbel, T., Eisenmann, M., Hanbury, A., Jannin, P., Müller, H., Onogur, S., Saez-Rodriguez, J., van Ginneken, B., Kopp-Schneider, A., & Landman, B. (2020). BIAS: Transparent reporting of biomedical image analysis challenges. Medical Image Analysis, 101796. https://doi.org/10.1016/j.media.2020.101796
Reinke, A., Eisenmann, M., Onogur, S., Stankovic, M., Scholz, P., Full, P. M., Bogunovic, H., Landman, B. A., Maier, O., Menze, B., Sharp, G. C., Sirinukunwattana, K., Speidel, S., van der Sommen, F., Zheng, G., Müller, H., Kozubek, M., Arbel, T., Bradley, A. P., ... Maier-Hein, L. (2018). How to Exploit Weaknesses in Biomedical Challenge Design and Organization. In A. F. Frangi, J. A. Schnabel, C. Davatzikos, C. Alberola-López, & G. Fichtinger (Eds.), Medical Image Computing and Computer Assisted Intervention – MICCAI 2018 (pp. 388–395). Springer International Publishing. https://doi.org/10.1007/978-3-030-00937-3_45
Wiesenfarth, M., Reinke, A., Landman, B. A., Eisenmann, M., Saiz, L. A., Cardoso, M. J., Maier-Hein, L., & Kopp-Schneider, A. (2021). Methods and open-source toolkit for analyzing and visualizing challenge results. Scientific Reports, 11(1), 2369. https://doi.org/10.1038/s41598-021-82017-6
Reinke, A., Eisenmann, M., Tizabi, M. D., Sudre, C. H., Rädsch, T., Antonelli, M., Arbel, T., Bakas, S., Cardoso, M. J., Cheplygina, V., Farahani, K., Glocker, B., Heckmann-Nötzel, D., Isensee, F., Jannin, P., Kahn, C. E., Kleesiek, J., Kurc, T., Kozubek, M., ... Maier-Hein, L. (2021). Common Limitations of Image Processing Metrics: A Picture Story. arXiv:2104.05642.
The EQUATOR network – Enhancing the QUAlity and Transparency Of health Research. http://www.equator-network.org, 2008. Accessed: 2019-09-12.