Abstract
Deep neural networks have enabled major breakthroughs in domains ranging from image and speech recognition to automated medical diagnosis. However, these networks are notorious for requiring large amounts of training data, which limits their applicability in domains where data is scarce. Through metalearning, networks can learn how to learn, allowing them to learn new tasks from far fewer examples. In this chapter, we provide a detailed overview of metalearning for knowledge transfer in deep neural networks. We categorize the techniques into (i) metric-based, (ii) model-based, and (iii) optimization-based techniques, cover the key techniques per category, discuss open challenges, and provide directions for future research, such as performance evaluation on heterogeneous benchmarks.
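To give a concrete flavor of the optimization-based family mentioned above, the following is a minimal sketch of a Reptile-style meta-training loop on toy sine-wave regression tasks. The task distribution, network architecture, and hyperparameters are illustrative assumptions, not details taken from the chapter.

```python
# Minimal sketch of optimization-based metalearning (a Reptile-style update),
# assuming toy sine-wave regression tasks and a small NumPy MLP.
# All names and hyperparameters below are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """A task is a sine wave with a randomly drawn amplitude and phase."""
    amp, phase = rng.uniform(0.1, 5.0), rng.uniform(0, np.pi)
    return lambda x: amp * np.sin(x + phase)

def init_params(hidden=40):
    return {
        "w1": rng.normal(0, 0.1, (1, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0, 0.1, (hidden, 1)), "b2": np.zeros(1),
    }

def forward(p, x):
    h = np.tanh(x @ p["w1"] + p["b1"])
    return h @ p["w2"] + p["b2"], h

def sgd_step(p, x, y, lr):
    """One gradient step on (half) mean-squared error, gradients by hand."""
    pred, h = forward(p, x)
    err = (pred - y) / len(x)                 # error signal: grad of 1/2 * MSE
    grad_w2 = h.T @ err
    grad_b2 = err.sum(0)
    dh = (err @ p["w2"].T) * (1 - h ** 2)     # backprop through tanh
    grad_w1 = x.T @ dh
    grad_b1 = dh.sum(0)
    return {
        "w1": p["w1"] - lr * grad_w1, "b1": p["b1"] - lr * grad_b1,
        "w2": p["w2"] - lr * grad_w2, "b2": p["b2"] - lr * grad_b2,
    }

meta_params = init_params()
inner_lr, meta_lr, inner_steps, k_shot = 0.02, 0.1, 5, 10

for meta_iter in range(1000):
    task = sample_task()
    x = rng.uniform(-5, 5, (k_shot, 1))
    y = task(x)
    # Inner loop: adapt a copy of the shared initialization to the sampled task.
    adapted = dict(meta_params)
    for _ in range(inner_steps):
        adapted = sgd_step(adapted, x, y, inner_lr)
    # Outer loop: move the initialization toward the task-adapted weights,
    # so that future tasks can be learned from only a few examples.
    for k in meta_params:
        meta_params[k] = meta_params[k] + meta_lr * (adapted[k] - meta_params[k])
```

After meta-training, a few inner-loop steps on a handful of examples from a new sine wave should already fit it reasonably well, which is the point of learning the initialization rather than the task.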
Rights and permissions
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Huisman, M., van Rijn, J.N., Plaat, A. (2022). Metalearning for Deep Neural Networks. In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_13
DOI: https://doi.org/10.1007/978-3-030-67024-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5