Abstract
Given the impressive performance of LLM-derived tools across a range of tasks that, until recently, were considered all but impossible for computers, the capabilities of LLMs can seem limitless. However, the current architecture of LLMs imposes some fundamental limitations on what they can do. I attempt to review the most notable of these limitations to give the reader an understanding of which architectural modifications would need to take place before a given problem can be solved. Specifically, I discuss counterfactual generation, private information leakage, reasoning, limited attention span, dependence on the training dataset, bias, and non-normative language.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
Cite this chapter
Kucharavy, A. (2024). Fundamental Limitations of Generative LLMs. In: Kucharavy, A., Plancherel, O., Mulder, V., Mermoud, A., Lenders, V. (eds) Large Language Models in Cybersecurity. Springer, Cham. https://doi.org/10.1007/978-3-031-54827-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54826-0
Online ISBN: 978-3-031-54827-7