Abstract
This chapter focuses on metalearning approaches that have been applied to data streams. This area is important because much real-world data arrives as a stream of observations. We first review some important aspects of the data stream setting, which may involve online learning, non-stationarity, and concept drift.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Brazdil, P., van Rijn, J.N., Soares, C., Vanschoren, J. (2022). Algorithm Recommendation for Data Streams. In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_11
DOI: https://doi.org/10.1007/978-3-030-67024-5_11
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5
eBook Packages: Computer Science (R0)