Abstract
AI is becoming increasingly prevalent in creative fields that were thought to be exclusively human. Thus, it is not surprising that a negative bias toward AI-generated artwork has been claimed. However, results are mixed. Studies that have presented AI-generated and human-created images simultaneously have detected a bias, but most studies in which participants saw either AI-generated or human-created images have not. Therefore, we propose that the bias arises foremost in a competitive situation between AI and humans. In a sample of N = 952 participants, we show that different evaluations emerge only when AI-generated and human-created pieces of art are presented simultaneously. Importantly, we demonstrate that AI art is not devalued, but rather, human art is upvalued, indicating the existence of a positive bias toward humans, rather than a negative bias. Further, we show that attitudes toward AI and empathy partially explain the different valuations of AI and human art in competitive situations.
Public significance statement
Our study investigates the public's perception of art generated by Artificial Intelligence (AI) compared with human-created art. We found that biases in the evaluation of AI-created art arise mainly in scenarios where AI and human art are presented together, highlighting a competitive dynamic. Notably, our findings reveal that rather than a devaluation of AI art, there is an upvaluation of human art. This research is relevant because it challenges the notion of a negative bias against AI in art and provides insights into how both AI and human creativity are valued in contemporary society. We further find evidence that the positive bias can be traced back, in part, to the human capacity for empathy. As AI continues to expand its role in creative fields, understanding public perception of AI-generated art becomes crucial, especially in contexts where it competes with human art. These insights have broader implications for the evolving dynamics between technology and human creativity.
1 Introduction
Due to the development and wide availability of emerging technologies such as generative adversarial networks (GANs; Aggarwal et al. 2021) in recent years, the pervasiveness of AI in the field of (digital) art and creativity has greatly increased. Already today, AI writes poems, creates paintings, composes music, and choreographs dance routines (Darda and Cross 2023). Given these changes, it is necessary to understand how AI-generated pieces of art are evaluated by spectators and which individual differences could influence these evaluations.
Previous studies on the evaluation of AI-generated vs. human-created art have focused on two issues: whether humans are able to distinguish between AI-generated and human-made art, and whether humans hold a bias toward AI-generated artworks. Regarding the ability to distinguish artworks, most studies have reported that people could not consistently differentiate between human-made and AI-generated art (Chamberlain et al. 2018; Gangadharbatla 2022; Hitsuwari et al. 2023; Samo and Highhouse 2023). Concerning a bias toward AI-generated artworks, numerous studies have reported a negative bias toward AI art (Bellaiche et al. 2023; Chamberlain et al. 2018; Darda and Cross 2023; Hitsuwari et al. 2023; Hong 2018; Ragot et al. 2020; Samo and Highhouse 2023), a bias that persists even when the source of the art is nebulous to the participants. However, other studies could not confirm a bias against art created by AI (Gangadharbatla 2022; Hong and Curran 2019; Zlatkov et al. 2023). Hence, the conditions under which a negative bias toward AI-created art emerges remain unclear, and an empirical explanation would be valuable. In this study, we first reviewed the previous studies on the emergence of a bias against AI-created art and linked their methodological similarities and differences to the emergence of such a bias. Building on these findings, we conducted an empirical study that could account for possible influences of the study design on the emergence of a bias against AI-generated art. Specifically, our aim was to examine whether and how the simultaneous versus independent presentation of AI-generated and human-created art affects purchasing intentions and the perceived aesthetic value of these artworks.
RQ1: Is human-created art preferred over AI-created art between groups that are asked to evaluate either human-created or AI-created art? And is it preferred when human-created and AI-created art are presented simultaneously?
Bellaiche et al. (2023) argued that the preference for human-created art might be rooted in the experience that human-created art is a deeper communicative medium that transports a narrative and reflects the artist’s effort and time. Others have postulated that a threat to the anthropocentric worldview could be the reason for a bias against AI-created art (Millet et al. 2023). However, it is also possible that the perception that AI-created art competes with human artists’ jobs leads to a preference for the latter. The degree to which individuals act upon or even recognize threats toward others varies from person to person. Thus, to investigate this competition, we focus on traits that may be related to experiencing a stronger connection with or compassion for artists (i.e., altruism and empathy) and assess their influence on the perception of AI vs. human art.
RQ2: Do personality traits influence the extent of bias against AI-created art? Does altruism influence whether and to what extent human-created art is preferred over AI-created art? Does empathy influence whether and to what extent human-created art is preferred over AI-created art?
Finally, people’s attitudes toward AI might influence the appreciation of AI-generated art (Hong et al. 2021). Thus, having a positive attitude toward AI might counterbalance a bias against AI-generated art.
RQ3: Does a personal positive attitude toward AI lead to less bias against AI-created art?
2 Theory
2.1 AI artwork
When asked directly, humans generally judge images created by AI as “art” (Mikalonyté and Kneer 2021), and in many cases, for the layperson, AI-generated works of art are indistinguishable from human-made ones (Samo and Highhouse 2023). However, others argue that AI cannot be an artist and cannot create art, because creating art requires intention and mutual message sending and receiving (Hertzmann 2020; Hong 2018). Moreover, researchers argue that AI alone, without any human input such as a training database or the programming of specific art-producing algorithms, cannot create art or art-like products. In other words, humans are always involved in AI-produced artwork and a clear distinction between human-made and AI-generated art may be difficult (Epstein et al. 2020). Therefore, it is crucial for this study to define what we mean by AI-generated art and what the technological advances mean for the connection between artist and art.
We understand AI art as any digital artwork which is the output of an AI tool (e.g., Midjourney; Midjourney, Inc. 2022) that has been prompted by a human, usually via written language (e.g., Stable Diffusion [Stability AI 2022] or Dall-E [OpenAI 2021]). While AI art is often linked with visual media such as images and videos, it can extend to audio compositions such as music as well. The history of AI-assisted art dates back to 1973 (Garcia 2016), when Harold Cohen started his long-term project AARON, in which he translated components of visual decision-making into a small painting robot (Grba 2022). In these early stages, the interconnection between technology and the artist appears strong and direct. Over the last decades, however, the evolution of AI technologies, such as machine learning (ML), pattern recognition, and, more recently, generative adversarial networks (GANs) and text-to-image (T2I) diffusion models, has undoubtedly influenced artistic creation. These technologies provided artists with new tools to explore creative processes, leading to innovative artworks that challenge traditional notions of artistry and creativity. However, they have also strongly blurred the line between the artist and the tool (Mazzone and Elgammal 2019). Today's highly complex tools (i.e., generative AI) allow not only artists but anyone with access to them to create vivid images. Although operators still retain some control over tool functions, contemporary tools have increased degrees of freedom with regard to decisions in the design process. Consequently, attributing credit for the final artistic output has become more nebulous, as it is not immediately clear which entity (such as the AI tool developer, the artists whose work populates the training databases, or the tool operator) deserves what portion of recognition, and the connection between artwork and artist may be perceived as less direct.
Thus, generative AI “challenges conventional definitions of authorship, ownership, creative inspiration, sampling, and remixing” (Epstein et al. 2023, p. 1110). Furthermore, the emergence of new AI tools has increased the number of individuals who can produce digital art or art-like products with little effort (i.e., time and resources). Against this backdrop, note that when we henceforth speak of AI-created art, we do not imply that AI works independently of humans. Further, whether AI-created art should be considered art at all is not the focal issue of this study. In line with Epstein et al. (2023), we do not see generative AI as a general threat to art itself, but rather as a new medium whose impacts should nevertheless be studied. In the next section, we discuss the previous studies and their results concerning the comparison of human-made and AI-created art.
2.2 AI artwork vs. human artwork
Several studies have investigated whether the aesthetic appreciation and judgment of artworks are negatively biased toward AI-generated art, yielding mixed results (for an overview of studies that have examined this effect in the context of images, see Table 1). While some studies could not confirm any bias toward AI-generated art, others have reported a negative bias toward AI-made artworks. Table 1 shows that, among the previous studies on bias against AI-created images, those that applied a design in which AI- and human-generated art were both presented to the same participants were able to detect a bias, whereas most studies in which participants rated either AI- or human-generated art could not find such a bias. To the best of our knowledge, only two studies found a negative bias toward AI when the images were rated separately. In the study by Gu and Li (2022), however, art teachers and students were surveyed, which should have made the competition aspect inherently salient; importantly, in the same article, the bias did not manifest when non-experts were the respondents. The second study compared actual human- and AI-created images without displaying or varying any labelling that would give away the creator (human vs. AI) of each piece (Samo and Highhouse 2023). Therefore, we argue that a negative bias toward AI-generated art arises only when competition between AI-created and human-created art is salient. Regrettably, no previous study has incorporated both designs (between- and within-subject) in one experiment. Thus, to understand the previous mixed results and to determine whether they can partially be attributed to methodological differences, we incorporated an experimental design with both a between- and a within-subject component.
2.3 Considering individual differences: the roles of altruism, empathy, and attitudes toward AI
The perception, judgment, and appraisal of art are subjective matters, and individuals’ assessments do not always agree (Pelowski et al. 2017). Thus, the role of individual differences should be considered when investigating people’s appreciation and judgments of art. The extent to which psychological variables can explain differences in evaluations of AI-created vs. human-made art is particularly important. Only one study to date has examined the relations between interindividual psychological differences and differences in evaluations of AI-created and human-made art. Bellaiche et al. (2023) examined the influences of empathy, openness (Big Five), attitudes toward AI, and age on evaluations of both AI-created and human-made art but could not report any robust influences. Specifically, they did not find that any of the aforementioned variables significantly influenced the perceived worth, beauty, profundity, or subjective liking of art pieces, nor did these variables interact with the alleged creator (AI vs. human) of a piece of art. The authors found only that a personal positive attitude toward AI led to a higher appreciation of AI art compared with human-made art. They thus emphasized that further research is needed. Another study (Millet et al. 2023) examined the role of anthropocentric creativity beliefs in the emergence of a bias against AI-created art and found an interaction of this variable with the label (human vs. AI). We aim to build upon this research and extend the search for potential psychological explanations for differences in evaluations of AI vs. human art. However, our focus does not lie solely on identifying the psychological variables that are correlated with the overall perception of human-made and AI-generated art. Instead, we aim to identify variables that could potentially prompt individuals to recognize the competition that AI art might pose to human artists.
Altruistic behavior is motivated by a desire to benefit another person without expecting benefits for oneself (Feigin et al. 2014) and can be evoked by norms of appropriate behavior (Penner et al. 2005) or the experience of empathy (Batson 1987). Altruism includes peer punishment (PP), help-giving (HG), and moral courage (MC; Windmann et al. 2021). PP refers to individuals’ personal readiness to sacrifice their self-interest to impose punishment on those who violate social norms (e.g., fairness or reciprocity). The facet of HG entails the act of generously offering one's resources to individuals who are in need or who are deemed deserving (Windmann et al. 2021). Finally, MC signifies the readiness to protect personal ethical values and to uphold moral personal principles in the face of social threats, often in the context of a social power imbalance (Windmann et al. 2021). It has been argued that human-made art might be rated higher than AI-generated art due to the fear that AI might replace human creativity (Zlatkov et al. 2023), thereby threatening the identity of humans as the only entity capable of creativity (Millet et al. 2023). Due to a greater willingness to protect ethical values and moral principles and a desire to benefit those in need of protection (i.e., potentially human artists), the facets of MC and HG may be especially likely to play roles in the evaluations of human-made art in the presence of AI-generated art.
Empathy encompasses cognitive and emotional aspects and is considered a highly relevant psychological trait in examinations of aesthetic experiences in general (Wilkinson et al. 2021). Bellaiche et al. (2023) examined whether participants' judgments might be explained by their ability to empathize with other agents, including AI. Specifically, they investigated the extent to which human empathy could be transposed onto AI and whether any resulting disparities would manifest in the realm of art appreciation. As previously stated, they found no impact of empathy on the assessments of the presented images. The somewhat unexpected nature of these findings might be attributed to methodological factors. The utilization of a unidimensional empathy questionnaire (Spreng et al. 2009), as opposed to the more prevalent measurement tools that encompass multiple dimensions of empathy, might have played a pivotal role. Indeed, researchers have increasingly converged on a consensus that empathy is a multidimensional phenomenon that encompasses cognitive and emotional components (Davis 1996; Lima and Osório 2021; Malakcioglu 2022). Consequently, it is plausible that certain aspects of empathy exert an impact, whereas others do not. Cognitive empathy refers to the ability to adopt another person’s perspective and therefore describes an individual's capacity to spontaneously understand and perceive things from another person’s psychological standpoint (Davis 1983). This facet of empathy is thus often referred to as perspective taking (PT). Emotional factors include fantasy, personal distress, and empathic concern (Davis 1980, 1983). While fantasy (FS) captures the inclination to emotionally immerse oneself in the world of characters in novels or movies, personal distress (PD) evaluates personal feelings of anxiety and discomfort when confronted with the misfortunes of others (Davis 1980; Pulos et al. 2004).
Finally, empathic concern (EC) is utilized to gauge feelings directed toward others (e.g., compassion or concern for individuals experiencing distress). The presence of AI-generated art may be perceived as competition with human creativity and a threat to artists’ jobs. Therefore, individuals with greater feelings of compassion for human artists and a greater ability to understand this potential competition are particularly likely to be affected by the presence of AI-generated art (i.e., EC). Furthermore, individuals with a greater capacity to perceive things from a human artist’s point of view might also be more sensitive to the competition between AI and human art creators (i.e., PT). Consequently, for the evaluation of human-created art while one is exposed to AI-generated art, the facets of PT and EC might be important.
An additional variable that may be relevant to the evaluation of AI-created art is the attitude toward AI. While some people or cultures have strong concerns about the security of AI and fear that AI might have the potential to replace humans in workplaces (Bergdahl et al. 2023), others tend to be more open to AI and appreciate the advantages it offers to humans (Sindermann et al. 2021). The acceptance of AI and the evaluation of AI-generated products are likely to be influenced by individuals’ general attitudes toward AI. For example, Bellaiche et al. (2023) found that a positive attitude toward AI led to a higher perceived profundity and worth of a painting when it was labeled as created by AI rather than by a human.
2.4 Hypotheses
Based on the previous findings (see Table 1), we expected a bias in favor of human-made over AI-generated art, which may influence purchasing intentions and perceived aesthetic value. However, we expected this bias to arise only when the competition from AI-generated art is made salient, that is, when AI-generated art is compared with human-made art. More specifically, we hypothesized the following:
H1a: Purchasing intentions are higher for human-made digital art evaluated in the presence of AI-generated art than for AI-generated art evaluated in the presence of human-made art (competition condition).
H1b: There is no difference in purchasing intentions between independently rated human-made digital art and independently rated AI- generated digital art (control condition).
H1c: Subjectively perceived aesthetic value is higher for human-made digital art evaluated in the presence of AI-generated art than for AI-generated art evaluated in the presence of human-made art (competition condition).
H1d: There is no difference in perceived aesthetic value between independently rated human-made digital art and independently rated AI-generated digital art (control condition).
The presence of AI-generated art while evaluating human-made art might reinforce concerns about AI (partially) displacing jobs in creative sectors or threatening artists’ livelihoods. This reinforcement might increase intentions to purchase human-made art and lead to more positive aesthetic evaluations of it, especially among people with more pronounced altruism and empathy.
H2a–H2d: When human-made digital art is evaluated in the presence of AI-generated art, intentions to purchase human-made art are positively related to (a) empathic concern, (b) perspective taking (both of which measure empathy), (c) help-giving, and (d) moral courage (both of which are altruism variables).
H3a–H3d: When human-made digital art is evaluated in the presence of AI-created art, subjectively perceived aesthetic value is positively related to (a) empathic concern, (b) perspective taking (both of which measure empathy), (c) help-giving, and (d) moral courage (both of which are altruism variables).
Beliefs about AI’s ability to be creative and the acceptance of creative AI are positively related to the assessment of AI-created art (Hong and Curran 2019; Hong et al. 2021).
H4a: When AI-created digital art is evaluated in the presence of human-made art, attitudes toward AI are more positively related to intentions to purchase AI-created art than to intentions to purchase human-created art.
H4b: When AI-created digital art is evaluated in the presence of human-made art, attitudes toward AI are more positively related to the perceived aesthetic value of AI-created art than to the perceived aesthetic value of human-created art.
3 Methods
This study was approved by the Ethics Committee of the University of Hohenheim (Ref. No 2023/8_Neef) and adhered to the ethical guidelines of the American Psychological Association (American Psychological Association 2017). For an overview of the R packages used, see the Appendix. Further, all materials that are not cited and described in the following sections can be found in the Appendix. This study was preregistered at https://aspredicted.org/PBR_YZC.
3.1 Sample
We collected data from a representative sample with respect to age, gender, education, and income, of the German Internet-using population via a paid panel from May 10 to May 22, 2023. A total of N = 1179 participants completed our preregistered study (see preregistration for sample size determination). Adhering to our preregistered exclusion criteria, n = 60 participants were excluded for failing to answer the attention checks appropriately, n = 38 for implausibly fast response times (Leiner 2019), n = 127 for missing data in critical variables (more than 20% overall or less than 50% within one variable), and n = 2 for accurately identifying the research objective. Exclusions resulted in a final sample of N = 952 (Mage = 49.2, SDage = 16.0; 50.8% female participants). On average, participants needed M = 14.6, SD = 5.3 min to complete the questionnaire. All participants gave informed consent to participate.
3.2 Midjourney
All images used in this study were created with the generative AI program Midjourney (Midjourney, Inc. 2022). Midjourney employs a diffusion-based T2I model to convert textual descriptions (prompts) into visual outputs (Lu et al. 2023). In this process, a large language model (LLM) is utilized to extract the semantic meaning of a prompt, which is then encoded into a numerical vector that guides the subsequent image generation (Midjourney, Inc. 2022). The diffusion model itself functions by incrementally introducing stochastic noise into the dataset of training images. As the model undergoes training, it acquires the capability to reconstruct the original imagery by methodically reversing the introduced noise, and with sufficient iterations, the model can synthesize new images (Midjourney, Inc. 2022). The images were chosen from the Midjourney Community Showcase in January 2023, representing selections that received particularly favorable ratings from Midjourney users.
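Midjourney's implementation is proprietary, but the forward-noising step that diffusion models learn to reverse can be illustrated in closed form. The following is a minimal sketch, not Midjourney's actual code; the function name and parameters are illustrative, assuming the standard formulation in which a clean value x0 is blended with Gaussian noise eps according to a cumulative noise schedule alpha_bar_t.

```python
import math

def noisy_sample(x0: float, alpha_bar_t: float, eps: float) -> float:
    """Forward diffusion step in closed form: blend the clean value x0 with
    Gaussian noise eps according to the cumulative schedule alpha_bar_t.
    alpha_bar_t = 1.0 leaves x0 untouched; alpha_bar_t = 0.0 yields pure noise."""
    return math.sqrt(alpha_bar_t) * x0 + math.sqrt(1.0 - alpha_bar_t) * eps
```

During training, the model sees such noised samples at many schedule steps and learns to invert the corruption; generation then runs that inversion starting from pure noise.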
3.3 Procedure
The entire study was conducted in German. Initially, participants were presented with a standard message that provided information about the duration and setting (in a calm setting and using a computer), followed by inquiries about demographic details. The participants were randomly assigned to four groups (two control, two experimental). Subsequently, irrespective of their assigned group, all participants were requested to evaluate 22 digital art images based on aesthetic appeal and to provide their intentions to purchase each image (for an example image, see Fig. 1; all digital art images can be found in the Appendix). The first two images served only as attention checks to ensure participant engagement; the analysis included only images 3–22. We manipulated the alleged creator of the images. The first control group evaluated images that were labelled AI-generated, whereas the second control group evaluated images labelled human-artist-made. We refer to these groups as C-AI and C-Human. The other two (experimental) groups each rated ten images labelled AI-generated and ten labelled artist-made, thus creating an implicit competitive scenario (competition condition). From the set of 20 images, experimental group A evaluated the first set of images under the label AI-generated and the second set under the label artist-made; experimental group B evaluated the images with the opposite labelling (for the images in each set, refer to the Appendix). Henceforth, we refer to images labelled human-created amongst AI-generated images as Ex-Human and images labelled AI-generated amongst artist-created images as Ex-AI. It is important to note, however, that the actual order of the images was randomized to exclude order effects. As an example, a participant in experimental group A could have first evaluated one AI-labelled image from set 1, then two human-labelled images from set 2, then again an AI-labelled image from set 1, and so on.
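The counterbalanced labelling and randomized presentation order described above can be sketched as follows. This is an illustrative reconstruction, not the survey software actually used; all names (including the seed parameter) are hypothetical.

```python
import random

def build_trials(group: str, seed: int = 0) -> list[tuple[int, str]]:
    """Assign creator labels to the two image sets, counterbalanced across the
    two experimental groups, then shuffle presentation order. Returns
    (image_id, label) pairs for images 1..20 (attention checks omitted)."""
    set1, set2 = list(range(1, 11)), list(range(11, 21))
    if group == "A":      # group A: set 1 labelled AI, set 2 labelled human
        trials = [(i, "AI") for i in set1] + [(i, "human") for i in set2]
    else:                 # group B: opposite labelling
        trials = [(i, "human") for i in set1] + [(i, "AI") for i in set2]
    random.Random(seed).shuffle(trials)  # randomized order against order effects
    return trials
```

The shuffle yields interleaved sequences such as AI, human, human, AI, ... while each group still sees exactly ten images per label.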
The question assessing purchasing intentions was “Would I buy this piece of art?”, and aesthetic value was assessed via “How do you rate the attractiveness of this work of art?”. Please note that we are aware that purchase intention and aesthetic value do not fully capture the general appreciation of the respective art pieces. However, we used variables that have been used in previous studies (for a similar measure, see, e.g., Gu and Li 2022). Additionally, using only single items might further compromise the measurement (Neef et al. 2023). However, since our participants had to rate 22 images, using multi-item scales would have inflated the length of the survey considerably. Thus, we opted for a balance between measurement accuracy and participant burden.
The specific instructions for the groups can be found in the Appendix. After evaluating the images, participants were instructed to complete questionnaires that evaluated their empathy (Paulus 2009), altruism (Windmann et al. 2021), attitudes toward AI (Sindermann et al. 2021), and aesthetic responsiveness (Schlotz et al. 2021). Additionally, a second altruism questionnaire (Rushton et al. 1981) was included in the study, although its findings were not intended to be included in the current article (as preregistered). Finally, participants were given the opportunity to share their speculations regarding the purpose of the study before being provided with a detailed debriefing and thanked for their participation.
3.4 Independent measures
3.4.1 The Saarbrueck Personality Questionnaire on Empathy
The validated questionnaire utilized in this study is a German adaptation of the Interpersonal Reactivity Index (Davis 1983). The original version of this self-report inventory comprises 28 items divided into four subscales. The adapted German version (Paulus 2009) consists of 16 items. Four items are dedicated to each facet, and each item is assessed on a 5-point Likert scale (“never,” “seldom,” “sometimes,” “often,” “always”). The subscales consist of the fantasy (FS) facet (e.g., “I am good at imagining the feelings of a person in a novel”), the personal distress (PD) facet (e.g., “In emergency situations, I feel anxious and uncomfortable”), the perspective-taking (PT) facet (e.g., “I believe there are two sides to every problem and therefore try to take both into account”), and the empathic concern (EC) facet (e.g., “I have warm feelings for people who are less well off than I am”) (Cronin 2018; Paulus 2009). EC and PT were of interest for the current study. With Cronbach’s alpha (Cronbach 1951) values of αEC = 0.73 and αPT = 0.72, both subscales demonstrated acceptable internal consistencies (Taber 2018).
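The reported internal consistencies follow the standard Cronbach's alpha formula, α = k/(k−1) · (1 − Σσ²_item / σ²_total). A minimal sketch of this computation is shown below; it is illustrative only (the authors computed their coefficients in R), and population variances are used throughout for consistency.

```python
from statistics import pvariance

def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha from a persons-by-items score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var / total_var)
```

When all items are perfectly consistent (every person gives the same score on each item), the formula returns exactly 1.0.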
3.4.2 Facets of Altruistic Behaviors (FAB) scale
The FAB scale (Windmann et al. 2021) is a 15-item self-report measure for assessing three distinct facets of altruistic behavior traits, each represented by five items and assessed on a 5-point Likert scale (“fully disagree,” “disagree,” “undecided,” “agree,” “fully agree”). Specifically, help-giving (HG; e.g., “In a conflict, I prefer to turn to the weak than to the strong”), moral courage (MC; e.g., “It has already happened that I have offended people because of my moral convictions”), and peer punishment (PP; e.g., “If someone intentionally takes advantage of the community, I discreetly reciprocate in some way”) were measured. The facets of interest—HG and MC—both yielded high internal consistencies (αHG = 0.81; αMC = 0.79).
3.4.3 Attitude Towards Artificial Intelligence (ATAI) scale
The validated ATAI scale (Sindermann et al. 2021) consists of five items, which represent statements with which participants rate their agreement on a 5-point Likert scale (“fully disagree,” “disagree,” “undecided,” “agree,” “fully agree”). Three items on the questionnaire are designed to measure fear or apprehension toward AI (e.g., “I fear artificial intelligence”), whereas the remaining two items measure acceptance of AI (e.g., “Artificial intelligence will benefit humankind”). When reversed, these items can be combined, forming a single dimension measuring attitudes toward AI. In our sample, the scale yielded high internal consistency (α = 0.83).
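Reverse-coding and combining the ATAI items into a single attitude index can be sketched as follows. This is an illustrative reconstruction assuming items scored 1–5, not the authors' actual scoring code.

```python
from statistics import mean

def atai_score(fear_items: list[int], acceptance_items: list[int]) -> float:
    """Combine the ATAI items into one attitude index: reverse-code the three
    fear items on the 5-point scale (6 - x) so that higher values always mean
    a more positive attitude toward AI, then average all five items."""
    reversed_fear = [6 - x for x in fear_items]
    return mean(reversed_fear + acceptance_items)
```

A participant with maximal fear (all 5s) and minimal acceptance (all 1s) thus receives the lowest possible attitude score of 1.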
3.4.4 The Aesthetic Responsiveness Assessment (AReA) scale
The Aesthetic Responsiveness Assessment (Schlotz et al. 2021) is a validated screening tool consisting of 14 items, assessed on a 5-point Likert scale (“never,” “seldom,” “sometimes,” “often,” “very often”) used to evaluate individual differences in aesthetic responsiveness (e.g., “I notice beauty when I look at art”). It is designed to assess how individuals perceive and respond to aesthetic stimuli (e.g., art or designs). The internal consistency was α = 0.91. This variable was not included in the hypothesis analysis. It was used only to confirm that our groups did not differ in their general responsiveness to art.
3.5 Analysis approach
Initially, we computed person indices for all variables by summing the items measuring each variable and dividing the sum by the number of items answered (i.e., the person mean). Specifically, these variables were purchasing intentions, aesthetic value, AI attitude (Sindermann et al. 2021), aesthetic responsiveness (Schlotz et al. 2021), the two empathy subscales in question (Paulus 2009), and the two altruistic behavior facets in question (Windmann et al. 2021). We first ensured that our groups did not differ in aesthetic responsiveness via an ANOVA, F(1, 949) = 0.59, p = 0.441. Then, we verified that the two sets of pictures received similar evaluations when given the same label; set 1 and set 2 should not differ in purchase-intention and aesthetic evaluations. As intended, when given the same label, the two sets did not differ significantly from each other in the experimental conditions: for purchase intention, p = 0.200 and p = 0.552, and for aesthetic value, p = 0.065 and p = 0.482.
For Hypotheses 1a–d, we conducted a series of t tests. For Hypotheses 1b and 1d, we compared the control conditions (the means across all 20 images). For Hypotheses 1a and 1c, we compared the experimental conditions. However, given that the participants in the experimental conditions rated ten images labeled AI-generated and ten labeled human-created, we conducted two t tests for each of these hypotheses, i.e., set 1 from the first experimental group (Ex-Human) vs. set 2 from the second experimental group (Ex-AI), and vice versa. For Hypotheses 2–4, we computed two repeated-measures (multilevel) regression models (M. Kim et al. 2020), with the two sets of pictures representing two measurement points, the participants forming the grouping variable (random effect), and purchasing intentions or aesthetic value as the dependent variable. After z-standardizing all independent variables and the response variable, we used restricted maximum-likelihood estimation and calculated pseudo-R² statistics for the individual models following Nakagawa and Schielzeth (2013).
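The z-standardization and group comparisons can be sketched as follows. This is illustrative only: the authors worked in R, and the exact t-test variant is not reported, so a Welch-type statistic is shown here as one common choice.

```python
import math
from statistics import mean, stdev, variance

def z_standardize(x: list[float]) -> list[float]:
    """Center each value at the sample mean and scale by the sample SD."""
    m, s = mean(x), stdev(x)
    return [(v - m) / s for v in x]

def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for two independent groups (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se
```

Identical groups yield t = 0; a group with the higher mean in the first position yields a positive t.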
The model assumptions for all models were assessed after an initial calibration. We detected outliers in the model for purchasing intentions, based on a composite outlier score (Lüdecke et al. 2021) obtained by the joint application of multiple multivariate outlier detection methods, namely, z-scores (Iglewicz and Hoaglin 1997), Mahalanobis distance (Cabana et al. 2021), Robust Mahalanobis distance (Gnanadesikan and Kettenring 1972), Minimum Covariance Determinant (Leys et al. 2018), Invariant Coordinate Selection (Archimbaud et al. 2018), and Local Outlier Factor (Breunig et al. 2000). We excluded n = 7 participants who were classified as outliers by at least half (i.e., 3) of the methods used. We applied the same method to detect outliers for the aesthetic value model, detecting n = 6 participants. These participants were removed from the respective models, which were then re-estimated.
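A composite outlier score of the kind described (flagging a participant once at least half of several multivariate methods agree) might be sketched as follows; this simplified version combines only three of the six methods named above and runs on simulated data:

```python
import numpy as np
from scipy import stats
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[0] = [8, 8, 8]  # an obvious multivariate outlier (illustrative data)

votes = np.zeros(len(X), dtype=int)

# Method 1: univariate z-scores (flag if any |z| > 3.29)
votes += (np.abs(stats.zscore(X, axis=0)) > 3.29).any(axis=1)

# Method 2: Mahalanobis distance against a chi-square cutoff
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - X.mean(axis=0)
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
votes += d2 > stats.chi2.ppf(0.999, df=X.shape[1])

# Method 3: Local Outlier Factor (density-based; Breunig et al. 2000)
votes += LocalOutlierFactor(n_neighbors=20).fit_predict(X) == -1

# Flag participants marked by at least half of the methods (here 2 of 3).
outliers = np.flatnonzero(votes >= 2)
```

Majority voting across heterogeneous detectors makes the exclusion decision less sensitive to the assumptions of any single method.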
We further detected potential multicollinearity in both models (i.e., purchasing intentions and aesthetic value). We inspected the correlations between the predictors and found that the empathy and altruism predictors were positively correlated (range: r = 0.21–0.54; rMean = 0.39), indicating potential multicollinearity (see Appendix). Multicollinearity diminishes the precision of the estimated coefficients, consequently diminishing the power of the estimates themselves. Such a reduction in power would imply a reduced ability to identify significant effects as truly significant, as it becomes challenging to isolate the individual independent variables' respective impacts from one another (Voss 2005). However, multicollinearity typically becomes a problem only once the predictor correlations substantially surpass r = 0.50 (Vatcheva et al. 2016), with some arguing for a cut-off at r = 0.80 (Berry and Feldman 1985). Nevertheless, recognizing that multicollinearity may diminish the statistical significance of our findings (reducing the probability of rejecting the null hypothesis that a predictor is non-significant), we made the deliberate choice to present these results, which can be considered more conservative in nature.
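The correlation inspection, together with the usual complementary check of variance inflation factors, can be sketched like this (illustrative data; the cut-offs in the comments are common rules of thumb, not the study's):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(42)
n = 500
# Illustrative predictors with moderate shared variance, loosely mimicking
# the empathy (EC, PT) and altruism (MC, HG) scales.
base = rng.normal(size=n)
X = pd.DataFrame({
    "EC": base * 0.6 + rng.normal(size=n) * 0.8,
    "PT": base * 0.6 + rng.normal(size=n) * 0.8,
    "MC": base * 0.5 + rng.normal(size=n) * 0.9,
    "HG": base * 0.5 + rng.normal(size=n) * 0.9,
})

# Pairwise correlations: values well below r = .80 are usually tolerable.
corr = X.corr()
max_r = corr.where(~np.eye(4, dtype=bool)).abs().max().max()

# Variance inflation factors; VIF < 5 (some say < 10) is commonly acceptable.
Xc = X - X.mean()
Xc.insert(0, "const", 1.0)
vifs = [variance_inflation_factor(Xc.values, i) for i in range(1, 5)]
```

VIFs summarize how much each predictor is linearly explained by the others, which pairwise correlations alone can miss.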
Moreover, we compared the random intercept models with models that included both random intercepts and random slopes. However, model comparison did not reveal a significantly better fit for the more complex models (purchasing intentions: χ2 = 1.83, p = 0.400; aesthetic value: χ2 = 0.17, p = 0.921). Thus, we report only the results of the random intercept models for purchasing intentions and aesthetic value. Raw data and codebook are available here: https://doi.org/10.17605/OSF.IO/TS4V3. Data were analyzed using RStudio, version 2023.09.0 (Posit team 2023).
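A likelihood-ratio comparison of a random-intercept model against a random-intercept-and-slope model, as described above, might look as follows in statsmodels (simulated data with more trials per participant than in the study, purely so the richer model is well identified):

```python
import numpy as np
import pandas as pd
from scipy import stats
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 100
trials = 6  # illustrative; more observations per person than in the study
df = pd.DataFrame({
    "pid": np.repeat(np.arange(n), trials),
    "label_human": np.tile([0, 1, 0, 1, 0, 1], n),
})
df["rating"] = (0.25 * df["label_human"]
                + np.repeat(rng.normal(scale=0.5, size=n), trials)
                + rng.normal(scale=1.0, size=n * trials))

# The likelihood-ratio test requires ML fits (reml=False) of nested models.
m_int = smf.mixedlm("rating ~ label_human", df, groups=df["pid"]).fit(reml=False)
m_slope = smf.mixedlm("rating ~ label_human", df, groups=df["pid"],
                      re_formula="~label_human").fit(reml=False)

# Guard against tiny negative values from boundary/convergence issues.
chi2_stat = 2 * (m_slope.llf - m_int.llf)
p = stats.chi2.sf(max(chi2_stat, 0.0), df=2)  # slope variance + covariance
```

A non-significant p here would justify retaining the simpler random-intercept model, as the authors did.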
4 Results
To reiterate, images labeled as created by a human artist among the set of images labeled as AI-generated are referred to as Ex-Human (set 1 in Experimental group A and set 2 in Experimental group B). Conversely, images labeled as AI-generated among the set of images labeled as created by a human artist are referred to as Ex-AI (set 2 in Experimental group A and set 1 in Experimental group B). In the first control group, participants evaluated only AI-labeled images, identified as C-AI. In the second control group, participants assessed only artist-made images, designated as C-Human.
4.1 Hypothesis 1a
The first t test indicated that the experimental conditions (Ex-Human, M = 3.31, SD = 1.18; Ex-AI, M = 3.09, SD = 1.22) differed significantly in purchasing intentions, |t|(486) = 2.00, p = 0.046, |d| = 0.19, 95% CI [– 0.01; 0.36]. The second t test also indicated that the experimental conditions (Ex-Human, M = 3.45, SD = 1.32; Ex-AI, M = 3.15, SD = 1.21) differed significantly in purchasing intentions, |t|(486) = 2.59, p = 0.010, |d| = 0.23, 95% CI [– 0.41; – 0.06]. The effect sizes indicated a small effect (Cohen 1988). These results support H1a.
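For readers who wish to reproduce this kind of comparison from the OSF data, the t test and a pooled-SD Cohen's d can be computed as follows (shown with simulated ratings matching the reported group sizes, not the actual data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Illustrative ratings only; the study's raw data are available on OSF.
ex_human = rng.normal(loc=3.31, scale=1.18, size=244)
ex_ai = rng.normal(loc=3.09, scale=1.22, size=244)

# Independent-samples t test (df = 244 + 244 - 2 = 486)
t, p = stats.ttest_ind(ex_human, ex_ai)

# Cohen's d from the pooled standard deviation
n1, n2 = len(ex_human), len(ex_ai)
sp = np.sqrt(((n1 - 1) * ex_human.std(ddof=1) ** 2
              + (n2 - 1) * ex_ai.std(ddof=1) ** 2) / (n1 + n2 - 2))
d = (ex_human.mean() - ex_ai.mean()) / sp
```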
4.2 Hypothesis 1b
The t test indicated that the control conditions (C-Human, M = 3.17, SD = 1.14; C-AI, M = 3.07, SD = 1.18) were not significantly different in purchasing intentions, |t|(462) = 0.91, p = 0.363, |d|= 0.08, 95% CI [– 0.10; 0.27]. These results support H1b.
4.3 Hypothesis 1c
The first t test indicated that the experimental conditions (Ex-Human, M = 4.18, SD = 1.12; Ex-AI, M = 4.01, SD = 1.03) showed a marginally significant difference in aesthetic value, |t|(486) = 1.75, p = 0.080, |d| = 0.17, 95% CI [– 0.02; 0.34]. The second t test indicated that the experimental conditions (Ex-Human, M = 4.37, SD = 1.11; Ex-AI, M = 4.08, SD = 1.17) differed significantly in aesthetic value, |t|(486) = 2.82, p = 0.005, |d| = 0.26, 95% CI [– 0.43; – 0.08]. The effect sizes indicated a small effect (Cohen 1988). These results support H1c.
4.4 Hypothesis 1d
The t test indicated that the control conditions (C-Human, M = 3.98, SD = 1.19; C-AI, M = 3.91, SD = 1.20) were not significantly different in aesthetic value, |t|(462) = 0.70, p = 0.487, |d|= 0.06, 95% CI [– 0.12; 0.25]. These results support H1d. See Fig. 2 for plots of all tests.
4.5 Ad hoc tests
Interestingly, an ad hoc analysis showed that there were no differences between C-AI and Ex-AI in purchasing intentions (set 1: |t|(497) = 0.81, p = 0.416; set 2: |t|(483) = 0.09, p = 0.932) or aesthetic value (set 1: |t|(497) = 1.66, p = 0.101; set 2: |t|(483) = 0.96, p = 0.336). By contrast, ad hoc analyses revealed that the Ex-Human labels were rated significantly better than the C-Human labels across both sets with respect to both purchasing intentions (set 1: |t|(483) = 2.84, p = 0.005; set 2: |t|(497) = 2.66, p = 0.008) and aesthetic value (set 1: |t|(483) = 3.20, p = 0.001; set 2: |t|(497) = 3.72, p < 0.001). Thus, Ex-AI art was not devalued compared with C-AI art but remained about the same, whereas Ex-Human-labeled art was upvalued compared with the C-Human-labeled images.
4.6 Hypotheses 2a-d and 4a
On the basis of the above results, we wanted to determine whether differences in purchasing intentions between the labeling in the two experimental groups could be partially explained by our chosen psychological variables. To assess the effects of the empathy variables empathic concern (EC) and perspective taking (PT), the altruism variables moral courage (MC) and help-giving (HG), and attitudes toward AI on purchasing intentions, we used a (repeated-measures) multilevel regression model. Table 2 presents the results.
Our findings indicate significant positive main effects of labeling, attitudes toward AI, and HG and marginally significant main effects of EC and MC, suggesting a positive influence of these variables on overall purchasing intentions. In terms of labeling, when an image was labeled human-made, it had significantly higher purchasing intentions than when it was labeled AI-generated. Additionally, we observed a significant interaction between the artist label and PT, which was also the only significant interaction we found. Specifically, when the label indicated that an image was created by a human artist, PT had a more positive impact on purchasing intentions than when the label indicated that the image was generated by AI. However, the main effect of PT itself was not statistically significant. This finding can be explained by the significant interaction in which PT had a negative effect on purchasing intentions for AI-labeled images but a positive effect for human-labeled images, resulting in an overall neutral effect when the label was disregarded. Taken together, we found support only for H2b but not for H2a, H2c, H2d, or H4a.
4.7 Hypotheses 3a-d and 4b
On the basis of the above results, we wanted to determine whether the differences we found in perceived aesthetic value between the labeling in the two experimental groups could be partially explained by the psychological variables we tested. To assess the effects of the empathy variables EC and PT, the altruism variables MC and HG, and attitudes toward AI on perceived aesthetic value, we used a (repeated-measures) multilevel regression model. Table 3 presents the results.
The above findings indicate significant positive main effects of attitudes toward AI, labeling, HG, and PT and a marginally significant positive main effect of MC, suggesting positive influences of these variables on overall perceived aesthetic value. Additionally, we observed a (marginally) significant interaction between the label indicating the creator of the art and attitudes toward AI. Specifically, when the label indicated that an image was created by a human artist, attitudes toward AI had a less positive impact on perceived aesthetic value, compared with when the label indicated that the image was generated by AI. Nevertheless, the general effect remained positive, indicating that even for human-labeled images, a positive effect of attitudes toward AI was observed in both models. All other interactions were non-significant. Taken together, we found support only for H4b but not for H3a, H3b, H3c, or H3d.
5 Discussion
5.1 Research question 1
Inconclusive findings on the existence of a negative bias toward AI-generated art, in terms of purchasing intentions and aesthetic judgment, prompted us to take a closer look at these discrepancies (RQ1). Some studies found a negative bias toward AI-generated art (Bellaiche et al. 2023; Chiarella et al. 2022; Millet et al. 2023), whereas others did not (Hong and Curran 2019; Israfilzade 2020; Xu et al. 2020). Nevertheless, these studies often shared a common line of reasoning that led them to expect a negative bias toward AI-generated art. The literature on human perception of art has identified that perceived intentions (Jucker et al. 2014), meaning (Graf and Landwehr 2015), evoked emotion (Freedberg and Gallese 2007), and effort (Kruger et al. 2004) exert a considerable influence on the value that humans assign to specific works of art. The reasonable assumption of many previous studies has been that AI-generated art cannot satisfy these factors as well as human art can and that the unfavorable valuation of AI-generated art may stem from this divergence (Bellaiche et al. 2023; Chiarella et al. 2022; Darda and Cross 2023). However, if one were to adhere to the above reasoning, the divergent outcomes across studies might appear puzzling. Upon scrutinizing the methodologies employed in these studies, however, a crucial distinction emerges. It appears that the negative bias toward AI art emerges when AI-generated art and human-created art are appraised concurrently but does not materialize when separate groups of participants evaluate each type of art exclusively (see Table 1). And indeed, our results point in that direction. If digital artwork labeled as human-created (Ex-Human) and AI-labeled (Ex-AI) images are presented in a random mixed order (competition condition), the human-labeled images receive higher ratings on purchasing intentions and aesthetic value.
However, if the theoretical assumption that the lack of intention, meaning, emotion, and effort translates into lower aesthetic valuation and purchasing intentions were correct, a difference should also have been detected between the two control groups. Moreover, the human-labeled control group and the human-labeled treatment group (i.e., Ex-Human) should not differ. However, it turned out that the negative bias toward AI did not manifest in our control groups, each of which had assessed either human-labeled or AI-labeled art exclusively. Yet when contrasting the two human-labeled groups (i.e., human-labeled with and without AI competition), a significant difference appeared. Essentially, this finding means that the AI art had not been devalued; rather, only the human-made art in the competition condition had been upvalued. Therefore, instead of talking about a negative bias toward AI, it might be more appropriate to talk about a positive bias toward humans in the presence of AI.
To conclude, in our study, we replicated both the existence and the non-existence of a negative bias toward AI, or more precisely a positive bias toward humans, and we explained why previous findings have conflicted. Essentially, we demonstrated that the divergent results of previous studies likely did not come about by chance but can be justified methodologically and theoretically. Importantly, we demonstrated that creating a competition condition by concurrently displaying AI-generated and human-created art produced differences in purchasing intentions and aesthetic judgment. Millet et al. (2023) emphasize that AI elicits negative reactions because it shakes people's deeply rooted anthropocentric worldviews, which in turn lead to the lower valuation of AI-generated art. Conversely, one could also say that human-centered worldviews lead to a magnification of the perceived value of human art.
5.2 Research question 2
Our second research question asked whether the differences between our two experimental groups could be explained by variables linked to both artistic perception and perceptions of competition. For this purpose, we ran multilevel regression models. On a general level, the results indicate that, independent of the assumed creator of the piece, the intention to purchase a piece of art depended significantly on the two altruism variables (MC and HG) and the empathy variable EC. In terms of the perceived aesthetic value of the digital art pieces, the model also yielded main effects of HG, MC, and attitudes toward AI but not of EC; instead, PT was significant. These findings are in line with a recent study that showed that a prosocial (altruistic) personality predicted personal engagement in art across a 2-year period (van de Vyver and Abrams 2018). Another study showed that prosocial behaviors (e.g., help-giving, donating, and volunteering) were positively associated with general art consumption (Kou et al. 2020). It has also been argued that the subjectively perceived aesthetic value of a piece of art depends on a person's empathic ability (Crozier and Greenhalgh 1992), specifically their PT (Miller and Hübner 2023). Furthermore, both models demonstrated that the label had a significant main effect on purchasing intentions and aesthetic value. These results are essentially what was expected after our t test analyses found differences in the competition condition.
Regarding the score disparities between AI- and human-labeled images in the simultaneous presentation in relation to altruism and empathy, the findings were mixed. For purchasing intentions, we found that empathy in the form of PT interacted with the label, however without a significant main effect of PT. Upon closer inspection, the influence of PT was negative for AI-labeled images and positive for human-labeled images. Consequently, this difference indicates that AI-generated art is subject to devaluation, whereas the value of human-created images is enhanced for participants with strong PT. This outcome stands in contrast to the findings of Bellaiche et al. (2023), who found no interaction between empathy and the label. Nonetheless, as described in the introduction, this incongruity can potentially be attributed to their approach of defining empathy as a one-dimensional construct. Further, in consideration of the outcomes for Research Question 1, it seems plausible that PT prompted individuals with a higher expression of this empathy facet to perceive the competition between human and AI art, and their ratings were subsequently influenced by this competition. Interestingly, regarding the aesthetic value of the artwork, there was no significant interaction between the capacity for PT and the label attributed to the artwork. Consequently, it appears that, even though empathy enhances individuals' willingness to purchase artwork created by humans and to support artists financially, it does not influence the aesthetic value of the art itself.
Furthermore, the interactions of EC, MC, and HG with the type of label were also not significant in the aesthetic value model. This lack of significance indicates that these variables did not contribute to explaining the observed differences and that altruism, at least as we operationalized it, did not significantly contribute to explaining the differences at all, neither in the purchasing intentions model nor in the aesthetic value model. Regarding MC, this could have arisen from the fact that the concept is characterized as acting in accordance with one's personal convictions, but it also includes the anticipation of potential punishment or reprisal (Pianalto 2012). In our study, this second aspect was not very pronounced. Furthermore, hypothetical purchasing decisions and especially aesthetic evaluations were only indirect measures of personal willingness to actively help other people in need. This indirectness might further explain the lack of interaction between the creator label and HG. In addition, participants never had to choose between the images when indicating their purchasing intentions; instead, all images were evaluated serially. The images were thus in only slight, not direct, competition. If participants had been forced to choose a picture (see Millet et al. 2023), an effect might have been conceivable. Future studies could investigate this aspect. Additionally, the lack of significance of the interaction between EC and the type of label was very surprising to us. It is possible that participants did not yet consider the artists' situations threatening enough. Similar to the aforementioned points, it could be posited that a more pronounced competition might have yielded different outcomes. Nevertheless, it is essential to underscore that an excessively forceful manipulation could likely have generated an unrealistic scenario, thereby compromising the external validity of the experiment.
Finally, another explanation why empathy variables did not consistently interact with the creator-label could be that individuals with a high level of empathy can potentially experience empathy not only towards other humans, but also towards AI-agents under certain conditions (e.g., chatbots; W. B. Kim and Hur 2023). Whether this was the case in this study could be tested in future studies.
5.3 Research question 3
The third research question addressed whether participants' personal attitudes toward AI could explain the differences between the experimental groups. On a general level, attitudes toward AI had a significant influence on purchasing intentions and aesthetic value. This finding means that attitudes toward AI had an impact on both human-labeled and AI-labeled images. Interestingly, however, this effect showed the opposite pattern to PT: we found an interaction in the model for aesthetic value but not in the model for purchasing intentions. In consideration of the positive main effect of attitudes toward AI on aesthetic judgment, the following inference can be drawn: When participants believed that an image was generated by AI, the positive effect of their AI attitude was amplified; when they perceived the image to be created by a human, the influence was comparatively weaker but remained positive. This finding initially surprised us. Yet, if one comprehends attitudes toward AI as an integral facet of a broader affinity for technology, as proposed by certain scholars (Henrich et al. 2022), and considers that the art we presented was exclusively digital, this perspective could potentially explain these results. Further, the results are in line with the study by Bellaiche et al. (2023), in which participants rated the worth of an art piece slightly higher when the label was AI and the person's attitude toward AI was stronger.
5.4 Limitations and future research
As is true for every study, ours comes with certain limitations and potential for future research endeavors. In our study, all the images we used were actually AI-generated. It would have been wise to replicate the study using only traditional human-made art pieces (see Notes). Furthermore, we did not distinguish between different styles of art (e.g., representational, abstract). However, since we randomized the label attributions, we ensured that every participant saw different styles. Nevertheless, specifically when it comes to figurative images, it is conceivable that some participants were able to identify whether an image was AI-created and did not fall for our labeling. Research shows that individuals generally struggle to distinguish between AI and human art (Samo and Highhouse 2023), but we cannot be certain that this holds for all participants, especially those who are more familiar with digital art and art in general.
Additionally, we assessed purchase intention without actually setting a price for the art pieces. However, we were more interested in the appreciation our participants showed toward the art pieces (and thereby in detecting a bias); including a price would probably have influenced the participants' responses. Nevertheless, an interesting future study could investigate whether the bias we found depends on the price of an art piece and whether it is stable across price levels or changes with them. Further, we did not find any differences between the C-AI and the Ex-AI conditions. It must be stressed that the absence of an effect does not directly imply that the effect cannot be present; an equivalence analysis would need to be conducted (Lakens et al. 2018). However, since this effect was consistently not found for both sets, we see this as a strong indication. Additionally, when assessing purchase intentions, participants may contemplate who benefits financially from the hypothetical purchase. For art labeled as human-made, they might consider supporting the actual artist. In contrast, for art labeled as AI-generated, it is less apparent who would receive payment: the operator of the AI, the AI tool manufacturer, or perhaps all artists represented in the dataset receiving a share. This ambiguity could potentially lead to a higher purchase intention for human-made labeled art. However, if this were a significant issue, we would expect to see differences in our control groups, which we did not observe.
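The equivalence analysis mentioned here (Lakens et al. 2018) is typically run as two one-sided tests (TOST). A sketch with simulated ratings and arbitrarily chosen equivalence bounds of ±0.3 scale points (the bounds and sample values are assumptions for illustration):

```python
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

rng = np.random.default_rng(7)
# Illustrative ratings for two groups with (nearly) no true difference,
# loosely mimicking the C-AI and Ex-AI conditions.
c_ai = rng.normal(loc=3.07, scale=1.18, size=230)
ex_ai = rng.normal(loc=3.09, scale=1.22, size=240)

# Two one-sided tests (TOST): is the difference within +/- 0.3 scale points?
p_equiv, lower, upper = ttost_ind(ex_ai, c_ai, low=-0.3, upp=0.3)
# A p_equiv below .05 would support equivalence within the chosen bounds.
```

Unlike a non-significant t test, a significant TOST result positively supports the claim that any group difference is smaller than the pre-specified bounds.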
Further, our study was conducted in Germany, a Western country. Interestingly, Wu et al. (2020) found a positive bias toward human art in the US but not in China. It is possible that culturally determined differences in perceptions of AI exist, and these differences may reinforce the positive bias or make it disappear altogether. Germany in particular is a thoroughly technology- and AI-skeptical country (Gondlach and Regneri 2023). Therefore, a replication of this study in countries with different conditions and circumstances seems necessary for broader generalization. Moreover, although we constructed an implicit competition scenario, and the outcomes imply that perceptions of competition contributed in part to the positive bias toward human art, we did not inquire about perceptions of this competition directly. Future studies should directly ask participants whether they perceived a threat to human artists. There is a distinct possibility that the positive bias toward humans in the realm of the arts is transitory and that individuals will acclimate to the presence of AI. Conducting a subsequent replication would be advantageous for investigating the durability of these findings; incidentally, this point is relevant to all studies referenced in this article, since algorithmic systems are already prevalent in everyday life in various forms and are becoming ever more so (Zabel and Otto 2024).
5.5 Conclusion
In our study, we sought to get to the bottom of diverging research on a negative bias toward AI-generated art with respect to perceived monetary and aesthetic value. Furthermore, we tried to identify interindividual differences that could explain this bias. To the best of our knowledge, there are only two other studies on the latter (Bellaiche et al. 2023; Millet et al. 2023). Regarding the former aspect, we successfully demonstrated that the lower evaluation of AI-generated art arises only when a subtle scenario involving competition is introduced through a direct comparison of AI-generated and human-created art, as indicated by the absence of such differences in our control groups, along with the noteworthy distinction observed between the two groups labeled as being created by human artists. Thus, there was no devaluation of AI-generated art, and it is more accurate to speak of a positive bias toward humans—at least under the conditions and circumstances presented in our study. It should be emphasized that we did not actively create a competition scenario but that this competition came about only through the direct juxtaposition of the art forms. If we had additionally increased the salience of the competition, the differences in the competition condition may have been even greater. Nonetheless, it would be unwise to presume that the positive bias toward human-created art in terms of purchasing intentions and aesthetic judgement induced by a perception of competition will remain static. Over time, it is plausible that this perception might diminish and the co-existence of the two types of art will become the norm. Indeed, parallels could be drawn with the debate over photography at the end of the nineteenth century (Stieglitz 1892). Nowadays, photography has long been established as a form of art. AI-generated art may achieve a similar evolution. 
It is conceivable that this trend has already begun, that individuals with a penchant for art are exhibiting an apparent fascination with AI-generated art, and that the societal debate around AI-generated art is becoming less negative (Bran et al. 2023).
Finally, we were able to show that PT, one of the core facets of human empathy (Davis 1996), was partly responsible for the difference we found in the competition condition with respect to purchasing intentions. People with higher empathy valorized human-made images in the competition condition. Furthermore, attitudes toward AI had a significantly different effect, with respect to aesthetic judgment, on images with an AI label than on those supposedly created by a human artist. Here, more positive attitudes toward AI led to higher ratings of AI-labeled art, the reverse of the pattern observed for PT. Hence, it seems that the ability to adopt another's perspective has an effect only when it comes to the actual remuneration of the artists, whereas attitudes toward AI exert their greater positive influence only on the aesthetic evaluation of the art. The anticipated effects were not observed in the context of altruism.
5.6 Constraints of generality
This paragraph serves to explicitly define the target population of the present study (Simons et al. 2017). We conducted the research in Germany with a representative sample. This means that the results may be generalized to the broader German public and potentially to other contexts and cultures similar to Germany. However, cultures and societies diverging significantly from the German one (e.g., with more or less AI affinity) might yield different results. Further, we tested the general public and did not specifically test art and/or AI enthusiasts. People with a general interest in art and/or AI might respond differently.
Data Availability
Raw data and codebook are available at https://doi.org/10.17605/OSF.IO/TS4V3
Notes
Note: As stated in Sect. 2.1, we do not imply that AI-art can be entirely separated from human-made art, as humans are always involved in AI-art; however, the allocation of credit is much more nebulous with generative AI.
References
Aggarwal A, Mittal M, Battineni G (2021) Generative adversarial network: an overview of theory and applications. Intern J Inform Manag Data Insights 1(1):100004. https://doi.org/10.1016/j.jjimei.2020.10000
AICAN (2020) AICAN [Computer software]. https://www.aican.io/. Accessed 29 Apr 2024
American Psychological Association. (2017) Ethical Principles of Psychologists and Code of Conduct. American Psychological Association. https://www.apa.org/ethics/code/ethics-code-2017.pdf. Accessed 29 Apr 2024
Archimbaud A, Nordhausen K, Ruiz-Gazen A (2018) ICS for multivariate outlier detection with application to quality control. Comput Stat Data Anal 128:184–199. https://doi.org/10.1016/j.csda.2018.06.011
Arel-Bundock V (2022) modelsummary: data and model summaries in R. J Stat Softw. https://doi.org/10.18637/jss.v103.i01
Bartoń K (2023) MuMIn: Multi-Model Inference (Version 1.47.5) [Computer software]. https://CRAN.R-project.org/package=MuMIn. Accessed 29 Apr 2024
Bates DM, Mächler M, Bolker B, Walker S (2015) Fitting Linear Mixed-Effects Models Using lme4. J Stat Software 67(1):10. https://doi.org/10.18637/jss.v067.i01
Batson DC (1987) Prosocial motivation: is it ever truly altruistic? Adv Exp Soc Psychol 20(1):65–122
Bellaiche L, Shahi R, Turpin MH, Ragnhildstveit A, Sprockett S, Barr N, Christensen A, Seli P (2023) Humans versus AI: Whether and why we prefer human-created compared to AI-created artwork. Cognit Res Princ Implic. https://doi.org/10.1186/s41235-023-00499-6
Bennington J. S2ML [Computer software]. https://github.com/somewheresy/S2ML-Generators/blob/main/S2_VQGAN%2BCLIP_Classic.ipynb. Accessed 29 Apr 2024
Bergdahl J, Latikka R, Celuch M, Savolainen I, Soares Mantere E, Savela N, Oksanen A (2023) Self-determination and attitudes toward artificial intelligence: Cross-national and longitudinal perspectives. Telematics Inform 82:102013. https://doi.org/10.1016/j.tele.2023.102013
Berry W, Feldman S (1985) Multiple Regression in Practice. SAGE Publications, Berlin. https://doi.org/10.4135/9781412985208
Bran E, Rughiniş C, Nadoleanu G, Flaherty MG (2023). The emerging social status of generative AI: vocabularies of AI competence in public discourse. In: 2023 24th International Conference on control systems and computer science (CSCS) (pp. 391–398). IEEE. https://doi.org/10.1109/CSCS59211.2023.00068
Breunig MM, Kriegel H-P, Ng RT, Sander J (2000) LOF: identifying density-based local outliers. ACM SIGMOD Rec 29(2):93–104. https://doi.org/10.1145/335191.335388
Cabana E, Lillo RE, Laniado H (2021) Multivariate outlier detection based on a robust Mahalanobis distance with shrinkage estimators. Stat Pap 62(4):1583–1609. https://doi.org/10.1007/s00362-019-01148-1
Chamberlain R, Mullin CR, Scheerlinck B, Wagemans J (2018) Putting the art in artificial: aesthetic responses to computer-generated art. Psychol Aesthet Creat Arts 12(2):177–192. https://doi.org/10.1037/aca0000136
Chiarella SG, Torromino G, Gagliardi DM, Rossi D, Babiloni F, Cartocci G (2022) Investigating the negative bias towards artificial intelligence: effects of prior assignment of AI-authorship on the aesthetic appreciation of abstract paintings. Comput Hum Behav 137:107406. https://doi.org/10.1016/j.chb.2022.107406
Cohen J (1988) Statistical power analysis for the behavioral sciences. Routledge Academic
Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16(3):297–334. https://doi.org/10.1007/BF02310555
Cronin S (2018) Interpersonal reactivity index. In: Zeigler-Hill V, Shackelford TK (eds) Encyclopedia of personality and individual differences. Springer International Publishing, pp 1–3
Crowson K, Beaumont R, Abraham T, Whitaker J (2023) crowsonkb/k-diffusion: v0.1.1.post1 [Computer software]. Zenodo
Crozier WR, Greenhalgh P (1992) The empathy principle: towards a model for the psychology of art. J Theory Soc Behav 22(1):63–79. https://doi.org/10.1111/j.1468-5914.1992.tb00210.x
Darda KM, Cross ES (2023) The computer, A choreographer? Aesthetic responses to randomly-generated dance choreography by a computer. Heliyon 9(1):e12750. https://doi.org/10.1016/j.heliyon.2022.e12750
Davis MH (1980) A multidimensional approach to individual differences in empathy. JSAS Catalog Sel Doc Psychol 10:85
Davis MH (1983) Measuring individual differences in empathy: evidence for a multidimensional approach. J Pers Soc Psychol 44(1):113–126. https://doi.org/10.1037/0022-3514.44.1.113
Davis MH (1996) Empathy: a social psychology approach. Westview Press
de Lima FF, Osório FdL (2021) Empathy: assessment instruments and psychometric quality—a systematic literature review with a meta-analysis of the past ten years. Front Psychol 12:781346. https://doi.org/10.3389/fpsyg.2021.781346
Elgammal A, Liu B, Elhoseiny M, Mazzone M (2017) CAN: creative adversarial networks, generating “Art” by learning about styles and deviating from style norms. arXiv preprint. https://doi.org/10.48550/arxiv.1706.07068
Epstein Z, Levine S, Rand DG, Rahwan I (2020) Who gets credit for AI-generated art? Iscience 23(9):101515. https://doi.org/10.1016/j.isci.2020.101515
Epstein Z, Hertzmann A, Akten M, Farid H, Fjeld J, Frank MR, Groh M, Herman L, Leach N, Mahari R, Pentland AS, Russakovsky O, Schroeder H, Smith A (2023) Art and the science of generative AI. Science 380(6650):1110–1111. https://doi.org/10.1126/science.adh4451
Feigin S, Owens G, Goodyear-Smith F (2014) Theories of human altruism: a systematic review. J Psychiatry Brain Funct 1(5):10. https://doi.org/10.7243/2055-3447-1-5
Fox J, Weisberg S (2019) An R companion to applied regression, 3rd edn. Sage
Freedberg D, Gallese V (2007) Motion, emotion and empathy in esthetic experience. Trends Cogn Sci 11(5):197–203. https://doi.org/10.1016/j.tics.2007.02.003
Gangadharbatla H (2022) The role of AI attribution knowledge in the evaluation of artwork. Empir Stud Arts 40(2):125–142. https://doi.org/10.1177/0276237421994697
Garcia C (2016) Harold Cohen and AARON—a 40 year collaboration. https://computerhistory.org/blog/harold-cohen-and-aaron-a-40-year-collaboration/. Accessed 29 Apr 2024
Gnanadesikan R, Kettenring JR (1972) Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28(1):81. https://doi.org/10.2307/2528963
Gondlach KA, Regneri M (2023) The ghost of German Angst: are we too skeptical for AI development? In: Knappertsbusch I, Gondlach K (eds) Work and AI 2030. Springer Fachmedien Wiesbaden, pp 3–10
Google (2015) DeepDream [Computer software]. https://deepdreamgenerator.com/. Accessed 29 Apr 2024
Graf LKM, Landwehr JR (2015) A dual-process perspective on fluency-based aesthetics: the pleasure-interest model of aesthetic liking. Pers Soc Psychol Rev 19(4):395–410. https://doi.org/10.1177/1088868315574978
Granulo A, Fuchs C, Puntoni S (2021) Preference for human (vs. robotic) labor is stronger in symbolic consumption contexts. J Consum Psychol 31(1):72–80. https://doi.org/10.1002/jcpy.1181
Grba D (2022) Deep else: a critical framework for AI art. Digital 2(1):1–32. https://doi.org/10.3390/digital2010001
Gu L, Li Y (2022) Who made the paintings: Artists or artificial intelligence? The effects of identity on liking and purchase intention. Front Psychol 13:941163. https://doi.org/10.3389/fpsyg.2022.941163
Henrich M, Kleespies MW, Dierkes PW, Formella-Zimmermann S (2022) Inclusion of technology affinity in self scale: development and evaluation of a single item measurement instrument for technology affinity. Front Educ 7:970212. https://doi.org/10.3389/feduc.2022.970212
Hertzmann A (2020) Computers do not make art, people do. Commun ACM 63(5):45–48. https://doi.org/10.1145/3347092
Hitsuwari J, Ueda Y, Yun W, Nomura M (2023) Does human–AI collaboration lead to more creative art? Aesthetic evaluation of human-made and AI-generated haiku poetry. Comput Hum Behav 139:107502. https://doi.org/10.1016/j.chb.2022.107502
Hong J-W (2018) Bias in perception of art produced by artificial intelligence. In: Kurosu M (ed) Human-computer interaction. Interaction in context. HCI 2018. Lecture Notes in Computer Science, vol 10902. Springer, pp 290–303. https://doi.org/10.1007/978-3-319-91244-8_24
Hong J-W, Curran NM (2019) Artificial intelligence, artists, and art: attitudes toward artwork produced by humans vs artificial intelligence. ACM Trans Multimed Comput Commun Appl 15(2s):1–16. https://doi.org/10.1145/3326337
Hong J-W, Peng Q, Williams D (2021) Are you ready for artificial Mozart and Skrillex? An experiment testing expectancy violation theory and AI music. New Media Soc 23(7):1920–1935. https://doi.org/10.1177/1461444820925798
Iglewicz B, Hoaglin DC (1997) How to detect and handle outliers, vol 16. ASQC Quality Press
Israfilzade K (2020) What’s in a name? Experiment on the aesthetic judgments of art procured by Artificial Intelligence. J Arts 3(2):143–158. https://doi.org/10.31566/arts.3.011
Jucker J-L, Barrett JL, Wlodarski R (2014) “I just don’t get it”: perceived artists’ intentions affect art evaluations. Empir Stud Arts 32(2):149–182. https://doi.org/10.2190/em.32.2.c
Kassambara A (2023) rstatix: pipe-friendly framework for basic statistical tests (Version 0.7.2) [Computer software]. https://CRAN.R-project.org/package=rstatix. Accessed 29 Apr 2024
Kim WB, Hur HJ (2023) What makes people feel empathy for AI chatbots? Assessing the role of competence and warmth. Int J Human-Comput Interact. https://doi.org/10.1080/10447318.2023.2219961
Kim M, van Horn ML, Jaki T, Vermunt J, Feaster D, Lichstein KL, Taylor DJ, Riedel BW, Bush AJ (2020) Repeated measures regression mixture models. Behav Res Methods 52(2):591–606. https://doi.org/10.3758/s13428-019-01257-7
Kou X, Konrath S, Goldstein TR (2020) The relationship among different types of arts engagement, empathy, and prosocial behavior. Psychol Aesthet Creat Arts 14(4):481–492. https://doi.org/10.1037/aca0000269
Kruger J, Wirtz D, van Boven L, Altermatt TW (2004) The effort heuristic. J Exp Soc Psychol 40(1):91–98. https://doi.org/10.1016/s0022-1031(03)00065-9
Lakens D, Scheel AM, Isager PM (2018) Equivalence testing for psychological research: a tutorial. Adv Methods Pract Psychol Sci 1(2):259–269. https://doi.org/10.1177/2515245918770963
Leiner DJ (2019) Too fast, too straight, too weird: non-reactive indicators for meaningless data in internet surveys. Surv Res Methods 13(3):229–248. https://doi.org/10.18148/srm/2019.v13i3.7403
Lenth RV (2023) emmeans: Estimated Marginal Means, aka Least-Squares Means (Version 1.8.7) [Computer software]. https://CRAN.R-project.org/package=emmeans. Accessed 29 Apr 2024
Leys C, Klein O, Dominicy Y, Ley C (2018) Detecting multivariate outliers: use a robust variant of the Mahalanobis distance. J Exp Soc Psychol 74:150–156. https://doi.org/10.1016/j.jesp.2017.09.011
Long JA (2022) jtools: analysis and presentation of social scientific data (Version 2.2.0) [Computer software]. https://cran.r-project.org/package=jtools. Accessed 29 Apr 2024
Lu Y, Xu J, Li Y, Lu S, Wei X, Lu W (2023) The art of deception: black-box attack against text-to-image diffusion model. In: 2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS), pp 1270–1277. IEEE. https://doi.org/10.1109/ICPADS60453.2023.00183
Lüdecke D (2018) sjmisc: data and variable transformation functions. J Open Source Softw 3(26):754. https://doi.org/10.21105/joss.00754
Lüdecke D (2023) sjPlot: data visualization for statistics in social science (Version 2.8.14) [Computer software]. https://CRAN.R-project.org/package=sjPlot. Accessed 29 Apr 2024
Lüdecke D, Ben-Shachar M, Patil I, Waggoner P, Makowski D (2021) performance: an R package for assessment, comparison and testing of statistical models. J Open Source Softw 6(60):3139. https://doi.org/10.21105/joss.03139
Malakcioglu C (2022) Empathy assessment scale. North Clin Istanb 9(4):358–366. https://doi.org/10.14744/nci.2022.55649
Mazzone M, Elgammal A (2019) Art, creativity, and the potential of artificial intelligence. J Arts 8(1):26. https://doi.org/10.3390/arts8010026
Midjourney, Inc. (2022) Midjourney (Version 4) [Computer software]. https://docs.midjourney.com/docs/model-versions. Accessed 29 Apr 2024
Mikalonytė ES, Kneer M (2021) Can artificial intelligence make art? Folk intuitions as to whether AI-driven robots can be viewed as artists and produce art. SSRN Electron J. https://doi.org/10.2139/ssrn.3827314
Miller CA, Hübner R (2023) The Relations of empathy and gender to aesthetic response and aesthetic inference of visual artworks. Empir Stud Arts 41(1):188–215. https://doi.org/10.1177/02762374221095701
Millet K, Buehler F, Du G, Kokkoris MD (2023) Defending humankind: anthropocentric bias in the appreciation of AI art. Comput Hum Behav 143:107707. https://doi.org/10.1016/j.chb.2023.107707
Murdock R (2021) Aleph2Image. https://colab.research.google.com/drive/1Q-TbYvASMPRMXCOQjkxxf72CXYjR_8Vp. Accessed 29 Apr 2024
Murdock R, Wang P (2021) The Big Sleep [Computer software]. https://github.com/lucidrains/big-sleep. Accessed 29 Apr 2024
Nakagawa S, Schielzeth H (2013) A general and simple method for obtaining R² from generalized linear mixed-effects models. Methods Ecol Evol 4(2):133–142. https://doi.org/10.1111/j.2041-210x.2012.00261.x
Neef NE, Zabel S, Lauckner M, Otto S (2023) What is appropriate? On the assessment of human-robot proxemics for casual encounters in closed environments. Int J Soc Robot 15(6):953–967. https://doi.org/10.1007/s12369-023-01004-1
OpenAI (2021) DALL-E [Computer software]. https://openai.com/index/dall-e. Accessed 29 Apr 2024
Patil I (2021) Visualizations with statistical details: The “ggstatsplot” approach. J Open Source Softw 6(61):3167. https://doi.org/10.21105/joss.03167
Paulus C (2009) The Saarbrueck Personality Questionnaire on Empathy: Psychometric evaluation of the German version of the Interpersonal Reactivity Index. https://doi.org/10.23668/psycharchives.9249
Pelowski M, Gerger G, Chetouani Y, Markey PS, Leder H (2017) But is it really art? The classification of images as “Art”/“Not Art” and correlation with appraisal and viewer interpersonal differences. Front Psychol 8:1729. https://doi.org/10.3389/fpsyg.2017.01729
Penner LA, Dovidio JF, Piliavin JA, Schroeder DA (2005) Prosocial behavior: multilevel perspectives. Annu Rev Psychol 56(1):365–392. https://doi.org/10.1146/annurev.psych.56.091103.070141
Petrie A (2020) regclass: Tools for an introductory class in regression and modeling (Version 1.6) [Computer software]. https://CRAN.R-project.org/package=regclass. Accessed 29 Apr 2024
Pianalto M (2012) Moral courage and facing others. Int J Philos Stud 20(2):165–184. https://doi.org/10.1080/09672559.2012.668308
Pinheiro J, Bates DM (2000) Mixed-effects models in S and S-PLUS. Springer-Verlag. https://doi.org/10.1007/b98882
Pinheiro J, Bates DM, R Core Team (2023) nlme: linear and nonlinear mixed effects models (Version 3.1-162) [Computer software]. https://CRAN.R-project.org/package=nlme. Accessed 29 Apr 2024
Posit team (2023) RStudio (Version 2023.3.0.386) [Computer software]. http://www.posit.co/. Accessed 29 Apr 2024
Pruim R, Kaplan D, Horton N (2017) The mosaic package: helping students to think with data using R. R J 9(1):77. https://doi.org/10.32614/RJ-2017-024
Pulos S, Elison J, Lennon R (2004) The hierarchical structure of the Interpersonal Reactivity Index. Soc Behav Pers 32(4):355–360. https://doi.org/10.2224/sbp.2004.32.4.355
Ragot M, Martin N, Cojean S (2020) AI-generated vs. human artworks. A perception bias towards artificial intelligence? In: Bernhaupt R, Mueller F, Verweij D, Andres J, McGrenere J, Cockburn A, Avellino I, Goguey A, Bjørn P, Zhao S, Samson BP, Kocielnik R (eds) Extended abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, pp 1–10. ACM. https://doi.org/10.1145/3334480.3382892
Revelle W (2023) psych: procedures for psychological, psychometric, and personality research (Version 2.3.3) [Computer software]. https://CRAN.R-project.org/package=psych. Accessed 29 Apr 2024
Rushton JP, Chrisjohn RD, Cynthia Fekken G (1981) The altruistic personality and the self-report altruism scale. Pers Individ Differ 2(4):293–302. https://doi.org/10.1016/0191-8869(81)90084-2
Samo A, Highhouse S (2023) Artificial intelligence and art: identifying the aesthetic judgment factors that distinguish human & machine-generated artwork. Psychol Aesthet Creat Arts. Advance online publication. https://doi.org/10.1037/aca0000570
Schauberger P, Walker A (2023) openxlsx: Read, Write and Edit xlsx Files (Version 4.2.5.2) [Computer software]. https://CRAN.R-project.org/package=openxlsx. Accessed 29 Apr 2024
Schlotz W, Wallot S, Omigie D, Masucci MD, Hoelzmann SC, Vessel EA (2021) The Aesthetic Responsiveness Assessment (AReA): a screening tool to assess individual differences in responsiveness to art in English and German. Psychol Aesthet Creat Arts 15(4):682–696. https://doi.org/10.1037/aca0000348
Simon J, Studio Morphogen (2024) Artbreeder [Computer software]. https://www.artbreeder.com/. Accessed 29 Apr 2024
Simons DJ, Shoda Y, Lindsay DS (2017) Constraints on generality (COG): a proposed addition to all empirical papers. Perspect Psychol Sci J Assoc Psychol Sci 12(6):1123–1128. https://doi.org/10.1177/1745691617708630
Sindermann C, Sha P, Zhou M, Wernicke J, Schmitt HS, Li M, Sariyska R, Stavrou M, Becker B, Montag C (2021) Assessing the attitude towards artificial intelligence: introduction of a short measure in German, Chinese, and English language. KI – Künstliche Intelligenz 35(1):109–118. https://doi.org/10.1007/s13218-020-00689-0
Spreng RN, McKinnon MC, Mar RA, Levine B (2009) The Toronto Empathy Questionnaire: Scale development and initial validation of a factor-analytic solution to multiple empathy measures. J Pers Assess 91(1):62–71. https://doi.org/10.1080/00223890802484381
Stability AI (2022) Stable Diffusion [Computer software]. https://stability.ai/. Accessed 29 Apr 2024
Stieglitz A (1892) A plea for art photography in America. Photographic Mosaics 28: 135–137. https://www.nearbycafe.com/photocriticism/members/archivetexts/photocriticism/stieglitz/pf/stieglitzpleapf.html. Accessed 29 Apr 2024
Taber KS (2018) The use of Cronbach’s alpha when developing and reporting research instruments in science education. Res Sci Educ 48(6):1273–1296. https://doi.org/10.1007/s11165-016-9602-2
Torchiano M (2016) effsize: a package for efficient effect size computation [Computer software]. Zenodo
van de Vyver J, Abrams D (2018) The arts as a catalyst for human prosociality and cooperation. Soc Psychol Pers Sci 9(6):664–674. https://doi.org/10.1177/1948550617720275
Vatcheva KP, Lee M, McCormick JB, Rahbar MH (2016) Multicollinearity in regression analyses conducted in epidemiologic studies. Epidemiology (Sunnyvale, Calif). https://doi.org/10.4172/2161-1165.1000227
Voss DS (2005) Multicollinearity. In: Kempf-Leonard K (ed) Encyclopedia of social measurement. Elsevier, pp 759–770. https://doi.org/10.1016/B0-12-369398-5/00428-X
Wei T, Simko V (2021) R package ‘corrplot’: visualization of a correlation matrix (Version 0.92) [Computer software]. https://github.com/taiyun/corrplot. Accessed 29 Apr 2024
Wickham H (2009) ggplot2: elegant graphics for data analysis, 1st edn. Springer
Wickham H, Averick M, Bryan J, Chang W, McGowan L, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen T, Miller E, Bache S, Müller K, Ooms J, Robinson D, Seidel D, Spinu V, Yutani H (2019) Welcome to the Tidyverse. J Open Source Softw 4(43):1686. https://doi.org/10.21105/joss.01686
Wilkinson Z, Cunningham R, Elliott MA (2021) The influence of empathy on the perceptual response to visual art. Psychol Aesthet Creat Arts. Advance online publication. https://doi.org/10.1037/aca0000418
Windmann S, Binder L, Schultze M (2021) Constructing the facets of altruistic behaviors (FAB) scale. Soc Psychol 52(5):299–313. https://doi.org/10.1027/1864-9335/a000460
Wu Y, Mou Y, Li Z, Xu K (2020) Investigating American and Chinese Subjects’ explicit and implicit perceptions of AI-Generated artistic work. Comput Hum Behav 104:106186. https://doi.org/10.1016/j.chb.2019.106186
Xu K, Liu F, Mou Y, Wu Y, Zeng J, Schäfer MS (2020) Using machine learning to learn machines: a cross-cultural study of users’ responses to machine-generated artworks. J Broadcast Electron Media 64(4):566–591. https://doi.org/10.1080/08838151.2020.1835136
Zabel S, Otto S (2024) SustAInable: how values in the form of individual motivation shape algorithms’ outcomes. an example promoting ecological and social sustainability. In: Mueller FF, Kyburz P, Williamson JR, Sas C, Wilson ML, Dugas PT, Shklovski I (eds) Proceedings of the CHI Conference on human factors in computing systems, pp 1–11. ACM. https://doi.org/10.1145/3613904.3642404
Zlatkov D, Ens J, Pasquier P (2023) Searching for human bias against AI-composed music. In: Johnson C, Rodríguez-Fernández N, Rebelo SM (eds) Lecture Notes in Computer Science. Artificial Intelligence in Music, Sound, Art and Design, Vol. 13988. Springer Nature Switzerland, pp. 308–323. https://doi.org/10.1007/978-3-031-29956-8_20
Acknowledgements
We thank Jane Zagorski for her language support. Use of artificial intelligence: During the preparation of this work, the authors used ChatGPT for editing and Midjourney to create the stimulus material. After usage, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Funding
Open Access funding enabled and organized by Projekt DEAL. The authors thank the Stiftung Innovation in der Hochschullehre [Foundation for Innovation in Higher Education] for supporting our research, Grant No. FBM2020-EA-1670-01800. The data have not been presented or published before.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare. This study was preregistered at https://aspredicted.org/PBR_YZC. Data and codebook are available at https://doi.org/10.17605/OSF.IO/TS4V3.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 R-packages used
The R-packages we used are car (Fox and Weisberg 2019), corrplot (Wei and Simko 2021), effsize (Torchiano 2016), emmeans (Lenth 2023), ggplot2 (Wickham 2009), ggstatsplot (Patil 2021), jtools (Long 2022), lme4 (Bates et al. 2015), modelsummary (Arel-Bundock 2022), mosaic (Pruim et al. 2017), MuMIn (Bartoń 2023), nlme (Pinheiro and Bates 2000; Pinheiro et al. 2023), openxlsx (Schauberger and Walker 2023), performance (Lüdecke et al. 2021), psych (Revelle 2023), regclass (Petrie 2020), rstatix (Kassambara 2023), sjPlot (Lüdecke 2023), sjmisc (Lüdecke 2018), and tidyverse (Wickham et al. 2019).
1.2 Digital art images
See Fig. 3.
1.3 Bivariate correlations
See Table 4.
1.4 Instructions for participants
See Table 5.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Neef, N.E., Zabel, S., Papoli, M. et al. Drawing the full picture on diverging findings: adjusting the view on the perception of art created by artificial intelligence. AI & Soc (2024). https://doi.org/10.1007/s00146-024-02020-z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s00146-024-02020-z