Abstract
Deep neural networks have enabled major breakthroughs in domains ranging from image and speech recognition to automated medical diagnosis. However, these networks are notorious for requiring large amounts of training data, which limits their applicability in domains where data is scarce. Through metalearning, networks can learn how to learn, allowing them to learn new tasks from far fewer examples. In this chapter, we provide a detailed overview of metalearning for knowledge transfer in deep neural networks. We categorize the techniques into (i) metric-based, (ii) model-based, and (iii) optimization-based techniques, cover the key techniques per category, discuss open challenges, and provide directions for future research, such as performance evaluation on heterogeneous benchmarks.
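To give a concrete flavor of the optimization-based family mentioned above, the following is a minimal sketch of a Reptile-style meta-training loop on toy sine-wave regression tasks. The task distribution, network architecture, and hyperparameters are illustrative assumptions, not details taken from the chapter.

```python
# Minimal sketch of optimization-based metalearning (a Reptile-style update),
# assuming toy sine-wave regression tasks and a small NumPy MLP.
# All names and hyperparameters below are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """A task is a sine wave with a randomly drawn amplitude and phase."""
    amp, phase = rng.uniform(0.1, 5.0), rng.uniform(0, np.pi)
    return lambda x: amp * np.sin(x + phase)

def init_params(hidden=40):
    return {
        "w1": rng.normal(0, 0.1, (1, hidden)), "b1": np.zeros(hidden),
        "w2": rng.normal(0, 0.1, (hidden, 1)), "b2": np.zeros(1),
    }

def forward(p, x):
    h = np.tanh(x @ p["w1"] + p["b1"])
    return h @ p["w2"] + p["b2"], h

def sgd_step(p, x, y, lr):
    """One gradient step on (half) mean-squared error, gradients by hand."""
    pred, h = forward(p, x)
    err = (pred - y) / len(x)                 # error signal: grad of 1/2 * MSE
    grad_w2 = h.T @ err
    grad_b2 = err.sum(0)
    dh = (err @ p["w2"].T) * (1 - h ** 2)     # backprop through tanh
    grad_w1 = x.T @ dh
    grad_b1 = dh.sum(0)
    return {
        "w1": p["w1"] - lr * grad_w1, "b1": p["b1"] - lr * grad_b1,
        "w2": p["w2"] - lr * grad_w2, "b2": p["b2"] - lr * grad_b2,
    }

meta_params = init_params()
inner_lr, meta_lr, inner_steps, k_shot = 0.02, 0.1, 5, 10

for meta_iter in range(1000):
    task = sample_task()
    x = rng.uniform(-5, 5, (k_shot, 1))
    y = task(x)
    # Inner loop: adapt a copy of the shared initialization to the sampled task.
    adapted = dict(meta_params)
    for _ in range(inner_steps):
        adapted = sgd_step(adapted, x, y, inner_lr)
    # Outer loop: move the initialization toward the task-adapted weights,
    # so that future tasks can be learned from only a few examples.
    for k in meta_params:
        meta_params[k] = meta_params[k] + meta_lr * (adapted[k] - meta_params[k])
```

After meta-training, a few inner-loop steps on a handful of examples from a new sine wave should already fit it reasonably well, which is the point of learning the initialization rather than the task.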
Rights and permissions
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this chapter
Cite this chapter
Huisman, M., van Rijn, J.N., Plaat, A. (2022). Metalearning for Deep Neural Networks. In: Metalearning. Cognitive Technologies. Springer, Cham. https://doi.org/10.1007/978-3-030-67024-5_13
DOI: https://doi.org/10.1007/978-3-030-67024-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67023-8
Online ISBN: 978-3-030-67024-5