Abstract
This chapter discusses typical approaches used to evaluate metalearning and AutoML systems. Evaluation helps us establish whether we can trust the recommendations provided by a particular system, and also provides a way of comparing competing approaches. As the performance of algorithms may vary substantially across tasks, it is often necessary to normalize the performance values first to make comparisons meaningful; this chapter discusses common normalization methods. Since a metalearning system often outputs a sequence of algorithms to test, we can also study how similar this sequence is to the ideal sequence, which can be determined by measuring the degree of correlation between the two. One common way of comparing systems is to consider the effect of selecting different algorithms (workflows) on base-level performance and to determine how that performance evolves with time. If the ideal performance is known, it is possible to calculate the performance loss. The loss curve shows how the loss evolves with time, or what its value is at the maximum available time (i.e., the time budget) given beforehand. This chapter also describes the methodology commonly used in comparisons involving several metalearning/AutoML systems, with recourse to statistical tests.
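The three evaluation ingredients mentioned in the abstract — normalizing performance across tasks, correlating a recommended sequence with the ideal one, and tracking a loss curve over the time budget — can be sketched as follows. This is a minimal illustration, not the chapter's own code; the min-max normalization, the use of Spearman's rank correlation, and all names and data (`true_perf`, `times`, the algorithm labels) are illustrative assumptions.

```python
def normalize(scores):
    """Min-max normalize performance values to [0, 1], so that results
    from tasks with very different score ranges become comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def spearman(rank_a, rank_b):
    """Spearman rank correlation between two rankings without ties:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def loss_curve(test_order, true_perf, times):
    """Loss curve for a recommended test sequence: after each algorithm is
    tested, the loss is the gap between the best performance found so far
    and the ideal (best achievable) performance."""
    ideal = max(true_perf.values())
    best_so_far, elapsed, curve = float("-inf"), 0.0, []
    for alg in test_order:
        elapsed += times[alg]
        best_so_far = max(best_so_far, true_perf[alg])
        curve.append((elapsed, ideal - best_so_far))
    return curve

# Illustrative data: true (base-level) performance and runtime per algorithm.
true_perf = {"A": 0.92, "B": 0.88, "C": 0.75}
times = {"A": 10.0, "B": 2.0, "C": 1.0}

# A system recommends testing B, then C, then A; the ideal order is A, B, C.
recommended = ["B", "C", "A"]
ideal_order = sorted(true_perf, key=true_perf.get, reverse=True)

# Rank of each algorithm in each sequence (1 = tested/ranked first).
rec_ranks = [recommended.index(a) + 1 for a in sorted(true_perf)]
ideal_ranks = [ideal_order.index(a) + 1 for a in sorted(true_perf)]

rho = spearman(rec_ranks, ideal_ranks)
curve = loss_curve(recommended, true_perf, times)
```

The loss curve here is a step function: it drops whenever a newly tested algorithm improves on the best performance found so far, and its value at the time budget summarizes how much performance the recommended sequence sacrificed relative to the ideal choice.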
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
Cite this chapter
Brazdil, P., van Rijn, J.N., Soares, C., Vanschoren, J. (2022). Evaluating Recommendations of Metalearning/AutoML Systems. In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_3
DOI: https://doi.org/10.1007/978-3-030-67024-5_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5
eBook Packages: Computer Science, Computer Science (R0)