
1 Introduction

Building a startup is a challenging endeavour, and the failure rate is notoriously high. Scarce resources and a lack of knowledge and experience in startup processes are among the key reasons why startups fail. Support for startups exists in different forms, such as mentoring, incubation, and digital tools. With the emergence of Generative Artificial Intelligence (GAI), particularly Large Language Models (LLMs), startups now have more opportunities to receive assistance in developing their innovative ideas. However, given the nascent nature of LLMs, how they can be utilized to support startups has yet to be investigated.

The emergent and enduring value of LLMs is fundamentally tied to effective prompt engineering [1]. A prompt refers to a set of text instructions crafted to program and customize LLMs for the desired interaction. It provides a scope or context for an LLM to act on [2]. In turn, prompt engineering is the means by which LLMs are programmed via prompts [3]. Prompt engineering skills are vital for fully leveraging LLMs, but they do not come naturally and need to be learned. This can pose a challenge to startup teams, who often operate on tight resources and have many other critical tasks to attend to [4]. We aim to facilitate startups in utilizing LLMs as AI assistants through prompt engineering. To this end, we propose the research question: How to apply prompt engineering to turn large language models into AI assistants for startups?
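As a minimal illustration of how a prompt programs an LLM, the sketch below supplies a prompt as a system-level instruction through the OpenAI Python client. The client setup, model name, and prompt wording are illustrative assumptions on our part; our study itself used the ChatGPT interface rather than the API.

```python
# A minimal sketch of programming an LLM via a prompt (illustrative only;
# the prompt text and model choice are assumptions, not part of our study).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The prompt scopes the model's behaviour before any user question is asked.
prompt = (
    "You are an assistant for early-stage startup founders. "
    "Keep your answers concrete and actionable."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "How do I find my first customers?"},
    ],
)
print(response.choices[0].message.content)
```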

As a first step in answering the research question, we investigated prompt patterns, which are codified summaries of effective prompting techniques. They provide a reusable approach to customizing the input and output of LLMs as well as the interactions with them. Applying prompt patterns may prove beneficial for startups in maximizing the potential of LLMs. We evaluated the prompt patterns proposed in [3] and identified a subset of them as more relevant in the startup context (see Sect. 2). We first tried these patterns in a simulated conversation with ChatGPT. Then we applied them in real-life conversations between ChatGPT and a group of students studying entrepreneurship. The preliminary results from this initial step helped us achieve a better understanding of the requirements for turning LLMs into effective AI assistants.

2 Background and Related Literature

The role of AI in transforming businesses is now well established. However, its utility in the context of startups has remained unclear [5]. While some startups use this technology to support their operations in data analysis, chatbots, and process automation, the vast majority are still unsure how to incorporate or leverage it in the development of their innovative ideas. The challenge becomes more significant in early-phase startup development, in which ideation and brainstorming activities are intensive. The existing literature has paid little attention to the role of AI and how to leverage its core values for startups [5]. The recent upsurge of generative AI, particularly LLMs, calls for more exploration of AI's potential for startups.

The development of the first GPT (Generative Pre-trained Transformer) model in 2018 [6] laid the foundation for LLMs. They gained significant attention soon after OpenAI released ChatGPT on November 30, 2022. Dimitrov [7] defines an LLM as a pre-trained model trained on a very large text dataset. LLMs show great potential in generating, summarizing, and translating text, as well as in performing various other natural language tasks [8]. They can sustain a series of conversations with a good conversational user experience [9]. However, despite the widespread adoption of LLMs in various industries, it is unclear how non-experts can design effective prompts to interact with these models and elicit the desired behaviour [9].

Dwivedi et al. [10] claim that the output of an LLM significantly depends on how a prompt is designed and provided to the model. Liu et al. [2] reach the same conclusion and propose experimenting with prompts to elicit the desired knowledge from LLMs. A similar view is presented in another study [7], whose author asserts that an LLM's response will only be effective if the prompt is good. Effective prompts are therefore essential for obtaining a desired answer and for shaping subsequent interactions. Similarly, according to [10], users need to be trained to write effective prompts, which will become an essential future skill. The discipline of designing and implementing prompts for LLMs is referred to as prompt engineering [3].

In this context, White et al. [3] propose a catalog of prompt patterns to enhance the output of LLMs. The authors propose 15 prompt patterns in their study, inspired by software design patterns. The patterns are designed to offer better, reusable ways of interacting with LLMs. From the catalog, we selected seven patterns for further investigation in our study; they are briefly described below, followed by a sketch of how some of them can be phrased as prompts.

  • Persona: In this pattern, a user asks an LLM to play a particular role. The LLM is given the role without fine details of what it entails. The pattern is applicable when users do not know the exact details required to process a request but do know the role of a person who would typically handle this kind of job.

  • Context Manager: This pattern helps users either introduce or remove a specific context while conversing with an LLM. Users can thereby steer the LLM to consider or ignore certain aspects when producing output; otherwise, LLMs tend to provide broad or generic answers to the questions users ask.

  • Flipped Interaction: In this pattern, the interaction between the user and the LLM is flipped: the LLM leads the interaction and asks questions in order to accomplish the user's goal. Communicating the goal to the LLM is a prerequisite. Because the LLM produces the content of the interaction according to the specified goal, it can generate more precise output by drawing on knowledge the user does not possess.

  • Cognitive Verifier: This pattern instructs the LLM to always decompose an original question into a series of sub-questions automatically. By combining the answers to the sub-questions, the LLM then produces the answer to the original question. The idea comes from a recent study [11], whose authors show that an LLM can reason more soundly if a key problem is divided into sub-problems that the model processes in sequence; this strategy is referred to as least-to-most prompting [11]. The primary goal of the pattern is to elicit a better answer to the original question through improved prompting.

  • Alternative Approaches: This pattern helps overcome cognitive biases. Humans naturally tend to stick with what they already see and think. The rationale behind this pattern is therefore to counter such biases by letting users ask for alternative ways of carrying out a particular task. Users can also ask for a comparison of the alternatives in terms of their pros and cons.

  • Question Refinement: The purpose of this pattern is to prompt an LLM to produce a refined version of the user's question so that more accurate information is produced. The LLM may need a few interactions with the user, as well as some context, to produce a better version of the question. The goal is thus to turn the user's question into a refined version through a series of interactions.

  • Template: This pattern is recommended when the user needs the LLM to follow a given use case and produce output in a format the user specifies. Since the LLM has no prior knowledge of the desired template, the user has to specify it when asking a question.
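To make the patterns concrete, the sketch below shows possible prompt formulations for three of them. The wording follows the pattern descriptions above; the exact phrasing and the startup scenario are our own illustrative assumptions, not verbatim prompts from White et al. [3] or from our study.

```python
# Illustrative prompt formulations for three of the selected patterns.
# The phrasing is an assumption based on the pattern descriptions above.

# Persona: assign a role without spelling out its fine details.
persona = (
    "Act as an experienced startup mentor in the healthcare domain, "
    "and answer my questions from that perspective."
)

# Flipped Interaction: state a goal and let the LLM lead with questions.
flipped_interaction = (
    "I want to validate the problem behind my startup idea. "
    "From now on, ask me questions, one at a time, until you have enough "
    "information to assess whether the problem is worth solving."
)

# Template: fix the output format before asking the question.
template = (
    "Whenever I ask about a customer segment, answer using this template:\n"
    "Segment: <name>\n"
    "Key need: <need>\n"
    "Willingness to pay: <high | medium | low>"
)
```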

3 Research Process

We conducted the study in two steps: 1) Apply the prompt patterns to a simulated conversation, and 2) Apply the prompt patterns to real-life conversations.

In the first step, we created a scenario in which a young entrepreneur is interested in building a startup in the healthcare domain. She did not have a clear startup idea to start with and turned to ChatGPT to understand how she could proceed. We designed the conversation following an example provided by an entrepreneurship educator in a YouTube video; the educator is an expert in entrepreneurship education. The conversation covers various aspects of the ideation phase, such as the problem and solution, the customer segment, and the first minimum viable product. We created two versions of the conversation: the first with naturally formulated questions, and the second with prompt-tuned questions using the identified prompt patterns. We asked an entrepreneurship educator to evaluate the two sets of answers and gathered her feedback. The evaluation session was designed as an unstructured interview with open questions. The materials evaluated by the entrepreneurship educator are available at https://figshare.com/s/0b5588735abbc484098c.

In the second step, we invited four students at our university who were working on various startup ideas as part of an entrepreneurship course. We set the context of the conversation to problem validation (the first phase of startup development according to the Lean Startup methodology [12]), which was the focus of their startup projects at the time of the study. Each student was asked to interact individually with ChatGPT within the defined context. They first used the questions they formulated naturally and intuitively. Then we helped them repeat the conversation, applying the prompt patterns to their questions where applicable. During these sessions, we observed how they formulated questions, interacted with ChatGPT, and reacted to the answers they received. Afterwards, we asked them to evaluate the two sets of answers by commenting freely on them. The documented conversations from these sessions can be found at https://figshare.com/s/0b5588735abbc484098c.
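As a hypothetical illustration of this rewriting step (not a verbatim transcript from our sessions), a student's naturally formulated question and its prompt-tuned counterpart might look as follows, here combining the Persona and Question Refinement patterns:

```python
# Hypothetical example of turning a natural question into a prompt-tuned one.
# The question text and pattern combination are illustrative assumptions.

natural_question = "Is my app idea for elderly care any good?"

prompt_tuned_question = (
    # Persona: have ChatGPT answer as a domain expert.
    "Act as a startup mentor experienced in problem validation. "
    # Question Refinement: ask for a better version of the question first.
    "Whenever I ask a question, first suggest a refined version of it, "
    "then answer the refined question. My question is: "
    + natural_question
)
```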

All collected ChatGPT conversations, feedback from the entrepreneurship educator and students, as well as our observations will be analyzed systematically using appropriate qualitative data analysis techniques. In Sect. 4, we report the preliminary results.

4 Preliminary Results

Table 1 lists the prompt patterns selected from [3] (as described in Sect. 2) and their application in the conversations described in Sect. 3. Some patterns are applied independently, at the beginning of a conversation, or before a group of questions. Others are integrated as part of the questions.

Table 1. Prompt patterns used in startup-related conversations with ChatGPT

Based on our initial analysis of the conversations, the prompt-engineered questions tend to elicit more elaborate answers or to enhance the conversational interaction and experience. The feedback from the entrepreneurship educator and the four students was generally positive toward the answers to the prompt-engineered questions. They commented that these answers were more specific, contained more details, and were in some cases less assertive than the answers to the original questions. However, in a few cases, both the educator and the students considered that the prompt-engineered questions produced worse answers. Further analysis is needed to understand whether this is because the applied prompt pattern was inappropriate.

Our initial observation of the interactions between the students and ChatGPT indicates that not all of them were equally comfortable or confident when asking questions. Some struggled to formulate appropriate questions or to ask follow-up questions. This may be partially linked to how familiar they were with conversing with LLMs, and partially attributable to how well they had learned problem validation as a startup topic. The students also revealed different attitudes towards ChatGPT and the answers to some prompt-engineered questions. An interesting comment was made by one student when he received the answer to a question prompt-tuned using the “Flipped Interaction” pattern. He commented that he expected ChatGPT to always provide clear answers rather than asking further questions; in his opinion, answering a question with one or more further questions was impolite.

5 Discussion

The study was conducted using conversations that are meaningful for early-stage startups. Brainstorming is one of the most intensive activities in the initial phase of a startup. If a startup team intends to use LLMs to support their brainstorming activities, they can apply prompt patterns that lead to divergent ideas as well as convergent answers. Several patterns examined in our study have good potential to stimulate divergent thinking. Flipped Interaction is a good pattern for this purpose: when it is used to tune questions, the startup team is not handed direct answers but is instead prompted to open their minds and ponder further questions relevant to their business idea. This style of interaction resembles the mentoring relationship between a startup team and their mentor, and it encourages the team to think more actively rather than looking for fast, simple answers. Another pattern, Cognitive Verifier, can produce a similar effect; it differs from Flipped Interaction in that it does provide a conclusive answer after decomposing a complex question into several sub-questions. A third pattern, Alternative Approaches, can also support divergent thinking: applying this type of prompt can return multiple answers or possibilities to a question for which a single answer is not desirable.

The preliminary results indicate that, without applying any prompt engineering techniques, startups may not obtain the desired benefits from LLMs, such as more creative ideas and new insights into their startup business. Using prompt patterns effectively can help startup teams compensate for both their lack of knowledge of startup processes and their deficient prompt engineering skills. The results also serve as a reminder that the successful application of prompt engineering to LLMs is not solely a technical matter: socio-cultural and human factors will significantly influence its effectiveness.

There are several limitations to our study. Firstly, we applied the prompt patterns to ChatGPT only; their applicability to other LLMs remains unclear, and applying and customizing prompt patterns for other LLMs is an interesting direction for future work. Another limitation is that we studied a small number of students who developed startup ideas in a university course setting. Further studies are needed to understand what questions real-world startups ask LLMs and whether the prompt patterns are equally applicable and useful for them. Lastly, our study focused on the initial phase of the startup process; whether the findings hold for more mature startups is yet to be understood.

6 Conclusion

The recent upsurge of generative AI, including LLMs, has opened up numerous exciting opportunities for startups. In this paper, we applied prompt engineering to help startup teams obtain better results when interacting with LLMs. We investigated the application of a set of prompt patterns to ChatGPT. The initial results show that some patterns are particularly suitable for brainstorming, a typical activity of early-stage startups. Prompt-tuned questions may lead to more specific and more detailed responses, but this is not guaranteed. In addition, human factors can play an important role in the effective application of prompt patterns.

The study presented in this paper is the first step toward building LLM-based AI assistants for startups. To eventually reach our goal, we will employ a design science research approach. Two main artifacts are envisioned: 1) a “startup prompt book”, a knowledge base in which prompt patterns are a main component and which contains a set of rules for transforming intuitively expressed questions and requests from startup teams into prompt-engineered input to LLMs; and 2) a “prompt engine” that can automate the prompt engineering process and choreograph conversations with LLMs. These two artifacts are at the core of an AI assistant that can become a valuable resource for a startup team, acting as a co-founder, a team member, or a mentor.
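As a very rough sketch of how the envisioned prompt engine might operate, a rule-based rewriter could map an intuitive question to a pattern-tuned prompt. The rule format, keywords, and pattern choices below are illustrative assumptions; the actual artifacts remain future work.

```python
# A minimal sketch of the envisioned "prompt engine": a rule-based rewriter
# that maps an intuitive startup question to a pattern-tuned prompt.
# The rule format, keywords, and pattern choices are illustrative assumptions.

PROMPT_BOOK = {
    # If the question concerns validation, apply Flipped Interaction.
    "validate": (
        "I want to achieve the following goal: {question} "
        "Ask me questions, one at a time, until you can give a "
        "well-founded answer."
    ),
    # If the question asks for options, apply Alternative Approaches.
    "alternative": (
        "{question} Suggest several alternative approaches and compare "
        "their pros and cons."
    ),
}


def engineer_prompt(question: str) -> str:
    """Return a pattern-tuned prompt for a raw startup question."""
    lowered = question.lower()
    for keyword, rule in PROMPT_BOOK.items():
        if keyword in lowered:
            return rule.format(question=question)
    return question  # no rule matched; pass the question through unchanged


print(engineer_prompt("How can I validate the problem my startup addresses?"))
```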

To address the limitations mentioned above, the presented study can be extended to include other LLMs, use a more diverse and representative sample of startups, and explore the applicability of our findings to startups at various stages of development. We believe that the results obtained in the startup context can generalize to established companies for endeavours such as product innovation (e.g., in the ideation phase); however, this generalisability needs to be validated by future research. Last but not least, the potential drawbacks and risks of using LLMs (such as privacy and ethical issues) are an important topic that needs to be investigated in future work.