Summary
This chapter discusses dataset characteristics that play a crucial role in many metalearning systems. Typically, they help to restrict the search in a given configuration space. The basic characteristic of the target variable, for instance, determines the choice of approach: if it is numeric, a suitable regression algorithm should be used, while if it is categorical, a classification algorithm should be used instead. This chapter provides an overview of different types of dataset characteristics, which are sometimes also referred to as metafeatures. These include so-called simple, statistical, information-theoretic, model-based, complexity-based, and performance-based metafeatures. The last group has the advantage that it can be easily defined in any domain. It includes, for instance, sampling landmarkers, which represent the performance of particular algorithms on samples of the data, and relative landmarkers, which capture differences or ratios of performance values and thereby provide estimates of performance gains. The final part of the chapter discusses the specific dataset characteristics used in different machine learning tasks, including classification, regression, time series, and clustering.
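The metafeature families named above can be illustrated with a small sketch (the helper names below are hypothetical, not from the chapter): simple and information-theoretic measures computed directly from the data, a decision-stump landmarker, and a relative landmarker formed as a ratio against a majority-class baseline.

```python
import numpy as np

def simple_metafeatures(X, y):
    """Simple, statistical, and information-theoretic metafeatures."""
    n, p = X.shape
    _, counts = np.unique(y, return_counts=True)
    probs = counts / n
    return {
        "n_instances": n,
        "n_features": p,
        "n_classes": len(counts),
        # information-theoretic: entropy of the class distribution (bits)
        "class_entropy": float(-np.sum(probs * np.log2(probs))),
        # statistical: average per-feature skewness
        "mean_skewness": float(np.mean(
            ((X - X.mean(0)) ** 3).mean(0) / (X.std(0) ** 3 + 1e-12))),
    }

def majority_landmarker(y):
    """Baseline landmarker: accuracy of always predicting the majority class."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / len(y)

def stump_landmarker(X, y):
    """Landmarker: training accuracy of the best median-threshold stump."""
    best = 0.0
    for j in range(X.shape[1]):
        mask = X[:, j] <= np.median(X[:, j])
        pred = np.empty_like(y)
        for m in (mask, ~mask):  # predict the majority class on each side
            if m.any():
                vals, cnts = np.unique(y[m], return_counts=True)
                pred[m] = vals[np.argmax(cnts)]
        best = max(best, float(np.mean(pred == y)))
    return best

# Toy dataset: class determined by the sign of the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)

mf = simple_metafeatures(X, y)
# Relative landmarker in ratio form: stump performance vs. the baseline.
rel = stump_landmarker(X, y) / majority_landmarker(y)
```

A value of `rel` well above 1 signals that even a trivial model exploits structure in the data, which is exactly the kind of cheap, domain-independent signal performance-based characterizations provide.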
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Brazdil, P., van Rijn, J.N., Soares, C., Vanschoren, J. (2022). Dataset Characteristics (Metafeatures). In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5