Abstract
Systematic evaluation has had a strong impact on many data analysis domains: examples include TREC and CLEF in information retrieval, ImageCLEF in image retrieval, and many challenges at conferences such as MICCAI for medical imaging and ICPR for pattern recognition. Kaggle, a platform for machine learning challenges, has also had significant success in crowdsourcing solutions. This shows the importance of systematically evaluating algorithms, and that the impact is far larger than that of evaluating a single system. Many of these challenges also showed the limits of the commonly used paradigm of preparing a data collection and tasks, distributing these, and then evaluating the participants’ submissions. Extremely large datasets are cumbersome to download, and shipping hard disks containing the data becomes impractical. Confidential data, such as medical data or data from company repositories, often cannot be shared. Real-time data will never be available via static data collections, because the data change over time and data preparation often takes a long time. The Evaluation-as-a-Service (EaaS) paradigm tries to find solutions for many of these problems and has been applied in the VISCERAL project. In EaaS, the data are not moved but remain on a central infrastructure. In the case of VISCERAL, all data were made available in a cloud environment. Participants were provided with virtual machines on which to install their algorithms. Only a small part of the data, the training data, was visible to participants. The major part of the data, the test data, was accessible only to the organizers, who ran the algorithms in the participants’ virtual machines on the test data to obtain impartial performance measures.
Acknowledgements
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement 318068 (VISCERAL).
Rights and permissions
Open Access This chapter is distributed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the work’s Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work’s Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Hanbury, A., Müller, H. (2017). VISCERAL: Evaluation-as-a-Service for Medical Imaging. In: Hanbury, A., Müller, H., Langs, G. (eds) Cloud-Based Benchmarking of Medical Image Analysis. Springer, Cham. https://doi.org/10.1007/978-3-319-49644-3_1
DOI: https://doi.org/10.1007/978-3-319-49644-3_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49642-9
Online ISBN: 978-3-319-49644-3
eBook Packages: Computer Science, Computer Science (R0)