Introduction

The idea of applying Darwinian thinking to archaeology is not new. The notion of evolution broadly conceived can be found in the works of many early practitioners. In fact, some of the early pioneers of professional archaeology working at the end of the 19th century directly invoked Charles Darwin himself when describing the ways in which artefacts changed over time (cf. Riede, 2006, 2010). Following in the footsteps of, for instance, Pitt Rivers (1875) and Montelius (1899), Kidder (1932, p. 8, cited in O’Brien, 2005, p. 27) noted that “the sooner we roll up our sleeves and begin comparative studies of axes and arrowheads and bone tools, make classifications, prepare accurate descriptions, draw distribution maps and, in general, persuade ourselves to do a vast deal of painstaking, unspectacular work, the sooner shall we be in a position to approach the problems of cultural evolution, the solving of which is, I take it, our ultimate goal”. As archaeology is fundamentally concerned with material culture change over time, cursory as well as specific mention of cultural evolution has been made in many works since then, both in anglophone archaeology (e.g. Sahlins and Service, 1965), and in German (Gandert, 1950), Brazilian (Araujo and Okumura, 2021), Spanish (Escacena Carrasco et al., 2010), as well as Italian and French traditions (see Plutniak, 2022 for a recent review of the latter). However, much of this either refers to non-Darwinian and often stage-progressive notions of social evolution (cf. Richard, 2012) or to merely vernacular forms of cultural evolution in the sense of ‘change over time’. Inspired initially by sociobiology, the 1980s saw the emergence of a new and more formal definition of evolutionary archaeology in primarily anglophone research environments, where artefacts were seen as the hard parts of the human phenotype akin to paleobiological fossils (Dunnell, 1980, 1988, 1989). While influential in many ways (cf. Riede et al., 2022), this overly rigid approach was swiftly superseded by a conceptualisation of cultural evolution rooted in dual-inheritance theory (cf. Marwick, 2006; Shennan, 2008, 2011).

Critically, the shift from a strictly selectionist approach that shoehorned cultural evolution into a genetics-like model to the much broader dual-inheritance framework was a major step forward. Early in the development of cultural evolutionary theory (CET) and its specific archaeological application, there was considerable debate with regard to the degree to which culture change could be formally understood as evolutionary or whether evolution was merely a loose and perhaps even ultimately misleading analogy (Boone and Smith, 1998; Temkin and Eldredge, 2007). As Creanza et al. (2017, p. 7782) point out, “cultural traits are likely to deviate from all three laws of Mendelian inheritance: segregation, independent assortment, and dominance”. Yet, many cultural traits behave sufficiently like coherent units of inheritance for culture to take on emergent evolutionary properties, albeit properties that demand specific theoretical and methodological treatment rather than a blanket transferral of a genetics-like approach (Collard et al., 2006; Henrich and Boyd, 2002; Henrich et al., 2008). Many aspects of culture can therefore be usefully studied as part of an evolutionary system within which strategic social learning decisions in a given population result in the passing down of information across generations and where cultural traits are under either some form of selection or drift.

Mesoudi et al. (2006) explicitly discussed the approaches and methods used in evolutionary biology in regard to possible analogues in the study of cultural evolution, and showed that these exist on micro- and macro-scales (Fig. 1). Against the canvas of this alignment, Mesoudi and colleagues proposed a programmatic synthesis of the field of cultural evolution that includes archaeology as the component specialising in long-term and macro-scale patterns and processes of culture change (cf. Mesoudi and O’Brien, 2009). More than a decade after this attempt at synthesis, Garvey (2018) agrees but also cautions that: “[a]rchaeology has much to contribute to the study of cultural evolution. Empirical data at archaeological timescales are uniquely well suited to tracking rates of cultural change, detecting phylogenetic signals among groups of artefacts, and recognising the long-run effects of distinct cultural transmission mechanisms. Nonetheless, these are still relatively infrequent subjects of archaeological analysis and archaeology’s potential to help advance our understanding of cultural evolution has thus far been largely unrealised”. While several reviews of CET in anglophone archaeology already exist (Feinman, 2000; Garvey, 2018; Prentiss, 2019; Riede, 2010; Walsh et al., 2019), a systematic bibliometric mapping of the cultural evolutionary research front in archaeology has hitherto not been attempted. Building on a recent bibliometric exploration of cultural evolution more broadly (Youngblood and Lahti, 2018), we here map in detail the development and emerging research front of CET in archaeology specifically. We do so on the basis of bibliographic data derived from the Web of Science Core Collection in the form of a bibliographic coupling network and with the aid of text mining. 
Beyond visualising a given research field, knowledge networks enable the discovery of new, meta-level knowledge and the identification of research gaps and latent structure in already existing knowledge landscapes (Egghe and Rousseau, 2002; Grant and Booth, 2009; Zhao and Strotmann, 2008). In this way, we identify key areas of ongoing debate and shifts in the application of CET in archaeology. In the spirit of Mesoudi et al.’s (2006) alignment of archaeology with palaeobiology, we pay special attention to the way in which methods originally developed in palaeobiology, namely phylogenetics and geometric morphometrics, have been applied in evolutionary archaeology.

Fig. 1: Comparison of major subdivisions within evolutionary biology (left side) and corresponding disciplines currently or potentially employed in the study of cultural evolution (right side).

Figure redrawn after Mesoudi et al. (2006, Fig. 1).

Our review thus offers a systematised exploration and visualisation of topical trends within anglophone evolutionary archaeology of the past decades, which we contextualise in a qualitative way. This review neither aims for a general summary of the history of evolutionary thought in archaeology since its inception (e.g. Sackett, 1991) nor claims to fully capture the use of evolutionary theory in archaeology beyond the anglophone literature. In the terminology of Grant and Booth (2009), ours is a mapping review with a specific focus on methodological trends.

In line with the finding of Youngblood and Lahti (2018), we see considerable impacts of computational methods on cultural evolutionary research in archaeology, and we also see major clusters and lineages of research traditions (cf. Hull, 1988). Interestingly, however, our review also identifies a decline in evolutionary archaeological publications applying specifically phylogenetic methods in favour of the application of (geometric) morphometric techniques. The latter are powerful tools for describing complex object shapes but can be difficult to integrate with phylogenetic approaches. Recent methodological developments in computational palaeobiology offer new and exciting perspectives for how artefact shape data may be seamlessly integrated into Bayesian phylogenetic analyses. Such an integration would bring evolutionary archaeological approaches based on cultural phylogenetics in line with current practice in evolutionary linguistics, and stimulate novel analyses that may bring archaeology closer to meeting its potential in regard to its contribution to the field of cultural evolution at large.

Data

For our data, we queried the Clarivate Web of Science Core Collection (WoS, www.webofscience.com). WoS is a subscription-based web service that provides access to citation databases including >82 million records (Clarivate, 2022). The natural sciences are particularly strongly represented in this database, while the humanities and social sciences are less robustly covered, owing to diverging publication practices (Archambault et al., 2006) and (as of now) less comprehensive digitisation. Publications from the latter domains are included only from 1975 onwards, and many cornerstone publications, especially books and book chapters, including those in archaeology and cultural evolution such as the ones listed in Riede (2010, Table 1), are not included. This has already been noted critically by Youngblood and Lahti (2018). Nevertheless, WoS, alongside much younger alternatives such as Scopus and Google Scholar, is a well-established source for bibliometric data (e.g. Schmidt and Marwick 2020; Youngblood and Lahti 2018), and we will contextualise our strictly bibliometric results with broader insights from the archaeological cultural evolution literature beyond the WoS below. We do this to counter the various shortcomings and biases such bibliometric databases introduce (cf. Bar-Ilan, 2010; Falagas et al., 2008; Jacso, 2005; Li et al., 2010) and caution that the quantitative results should only be seen as general trends.

Our search combined the terms ‘cultural evolution’ and ‘cultural transmission’ with the fields of ‘archaeology’ and ‘anthropology’ and covered the period until the end of 2021, which allows us to make statements about per-year trends (see the Supplementary Material for precise search terms). As of March 14, 2022, the query resulted in a total of 674 entries, with the earliest publication dating to 1981. All 674 entries were downloaded as a bibliographic document file in .bib format with attendant bibliographic information and including cited references. We chose to search the topics of archaeology and anthropology to capture publications following either European terminology or American terminology, in which archaeology is a subfield of anthropology. We limited the keywords to ‘cultural evolution’ and ‘cultural transmission’, as they are the two broadest terms to describe the field. For this article, only publications in English are studied, as this is the main language in which CET as we study it is negotiated. Besides that, and on top of the above-mentioned biases, publications in languages other than English are significantly underrepresented in bibliometric databases (Larivière and Macaluso, 2011).

Methods

Data preparation

The .bib file was converted with the bibliometrix package (Aria and Cuccurullo, 2017) to be used in R 4.2.1 (R Core Team, 2022). The names of the lead authors of the articles as well as of the lead authors of their cited references were automatically adjusted for inconsistencies in their initials.

Thesaurus

For all articles, the author-assigned keywords, the titles, and the abstracts were extracted. For the titles and abstracts, uni- and bi-grams were created and those that appeared in ≥1% of the articles were kept. The automatically assigned keywords from WoS were excluded, as random samples showed that they can be incorrect. The author-assigned keywords, uni- and bi-grams were used to create a thesaurus (see Supplementary Material), which combines them into first-order categories, which, in turn, were assigned to broader second-order categories (Table 1). This hierarchical thesaurus was then used to create a thesaurified occurrence matrix of first-order category keywords across our sample of articles. Each row of the matrix represents one individual article, and each column one of the thesaurified first-order keywords. The matrix fields contain the counts of each of the keywords as retrieved from the WoS data in relation to our thesaurus. The same occurrence matrix was also transformed into an incidence matrix, where instead of the count of keyword occurrences per article, the presence or absence of each thesaurified keyword in each article was noted.
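The step from extracted terms to the occurrence and incidence matrices can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the function names, the toy thesaurus, and the whitespace tokenisation are all assumptions made for illustration only.

```python
from collections import Counter

def ngrams(text, n):
    """Return the word n-grams of a lower-cased, whitespace-tokenised text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def occurrence_matrix(article_terms, thesaurus, categories):
    """Rows are articles, columns are first-order categories, cells are counts.

    `thesaurus` (hypothetical) maps a raw term to its first-order category;
    terms missing from the thesaurus are ignored.
    """
    rows = []
    for terms in article_terms:
        counts = Counter(thesaurus[t] for t in terms if t in thesaurus)
        rows.append([counts[c] for c in categories])
    return rows

def incidence_matrix(occurrence):
    """Replace every count by 1 (keyword present) or 0 (keyword absent)."""
    return [[1 if count > 0 else 0 for count in row] for row in occurrence]
```

For instance, an article whose terms map twice onto one category and once onto another yields the occurrence row `[2, 1]` and the incidence row `[1, 1]`.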

Table 1 Thesaurus categories (first-order thesaurified keywords) with their overarching second-order meta-categories.

To visually identify thematic trends over time, we calculated the per year fraction of articles containing keywords belonging to the second-order categories ‘Methods’ and ‘Topics’ (Fig. 5, Supplementary Fig. S3). Where suitable, a smoothed trend curve—using the loess method in ggplot2 (Wickham, 2016)—was applied over a span of 5 years.
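The per-year fraction underlying these trend plots can be sketched as follows. This is a simplified Python illustration under assumed inputs (a list of publication years and one incidence column); the loess smoothing applied in ggplot2 is not reproduced here, and `yearly_fraction` is a hypothetical helper name.

```python
from collections import defaultdict

def yearly_fraction(years, incidence_column):
    """Fraction of articles per year in which a keyword category is present.

    `years[i]` is the publication year of article i and
    `incidence_column[i]` is 1 if the category occurs in that article, else 0.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, present in zip(years, incidence_column):
        totals[year] += 1
        if present:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}
```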

Bibliographic coupling

To map the thematic field of CET in archaeology, we constructed a weighted, undirected, bibliographic-coupling network of individual research articles using the igraph package (Csardi and Nepusz, 2006). A bibliographic-coupling network is constructed by connecting two nodes, in this case, two articles, that share one or more cited references (Fano, 1956; Kessler, 1963). The rationale behind this approach is that articles citing the same references presumably treat a similar subject matter. It is the same method used in the WoS databases to find related publications (Ahlgren and Jarneving, 2008). As proxies for intellectual content, such citation fellowships may ultimately also trace the evolution of specific scientific fields (cf. Hull, 1988). Critics have stated that the method might not be as reliable as others, since two articles might cite completely unrelated matters in a reference (Martyn, 1964). We argue, however, that disregarding the content of a paper, the choice of references in itself is an active decision reflecting affiliations to certain schools of thought (Glänzel and Czerwon, 1996; Persson, 1994; Vladutz and Cook, 1984).
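The construction principle of a bibliographic-coupling network can be sketched in a few lines of Python. The sketch assumes each article is represented by the set of references it cites; the identifiers and the function name `coupling_edges` are illustrative and not taken from the study's igraph code.

```python
from itertools import combinations

def coupling_edges(references):
    """Link every pair of articles that shares at least one cited reference.

    `references` maps an article id to the set of reference ids it cites.
    Returns a dict mapping (article_a, article_b) to the number of shared
    references; pairs without any shared reference get no edge.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared > 0:
            edges[(a, b)] = shared
    return edges
```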

In our application of bibliographic coupling, we decided to weight the edges connecting articles by the cosine similarity, which was calculated on the basis of the total number of shared references. Cosine similarity is often used in text mining since it works well for long and sparse numeric data and emphasises the commonalities while ignoring zero-matches (Han et al., 2012; Tan et al., 2006). We did not select a subset of edges by setting a similarity threshold. While methods for selecting such a threshold have been proposed (e.g. Small, 2009), there exists no agreed-upon method and thresholds are often selected arbitrarily (e.g. Leydesdorff and Rafols, 2009). However, the main clusters we describe below remain coherent entities even if we introduce a threshold (Supplementary Fig. S6). We kept the largest connected, weighted graph, which consisted of 630 articles, and applied the Louvain community detection algorithm (Blondel et al., 2008) implemented in igraph to search for clusters. This method applies a modularity optimisation algorithm, where the relative density of node edges has to be higher inside of a given cluster compared to outside. Network density is defined as the ratio of existing edges to theoretically possible edges. In the case of bibliographic coupling, it provides insights into the overall level of shared citations and serves as a proxy for the conformity (high values) or diversity (low values) within the network and each cluster. Besides this theoretical reasoning for choosing the Louvain algorithm, it is also used for bibliometric analyses elsewhere (cf. Glänzel and Thijs 2017) and has been claimed to perform particularly well vis-à-vis social (cf. Tommasel and Godoy, 2019) and biological networks (cf. Rahiminejad et al., 2019). Like other community detection/clustering algorithms it does, though, require careful analysis of the obtained results and their limitations (Held, 2022).
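Two of the quantities described here, the cosine edge weight and the network density, can be written out as a short Python sketch. The exact cosine formulation used in the study is not spelled out in the text, so the variant below (shared references divided by the geometric mean of the two reference-list lengths, i.e. cosine similarity of binary citation vectors) is an assumption, as are the function names. Louvain clustering is not reproduced.

```python
import math

def cosine_weight(refs_a, refs_b):
    """Cosine similarity of two binary citation vectors.

    Assumed form: |A ∩ B| / sqrt(|A| * |B|), where A and B are the sets of
    references cited by the two articles.
    """
    if not refs_a or not refs_b:
        return 0.0
    return len(refs_a & refs_b) / math.sqrt(len(refs_a) * len(refs_b))

def density(n_nodes, n_edges):
    """Ratio of existing edges to all possible edges in an undirected graph."""
    if n_nodes < 2:
        return 0.0
    return n_edges / (n_nodes * (n_nodes - 1) / 2)
```

Two articles with identical reference lists receive a weight of 1, while articles without shared references receive 0 and therefore no edge.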

The igraph package was used to calculate the network-wide as well as within-cluster density measure. For each article within each of the retrieved clusters, we also calculated the closeness centrality measure using igraph. Closeness centrality is defined as the inverse of the average length of the shortest paths between the node of interest and all other nodes. For the bibliographic coupling network, this means that articles/clusters with a high closeness centrality constitute core elements of the network; conversely, articles/clusters characterised by lower centrality are more peripheral.
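A minimal Python sketch of closeness centrality, computed as the reciprocal of the mean shortest-path length so that higher values indicate more central nodes, is given below. It assumes an unweighted, connected graph stored as an adjacency dictionary; whether and how edge weights entered the study's igraph computation is not stated in the text, so this is an illustration of the measure rather than a reproduction.

```python
from collections import deque

def closeness(adjacency, source):
    """Closeness of `source`: (reachable nodes) / (sum of shortest-path lengths).

    `adjacency` maps each node to an iterable of its neighbours. Path lengths
    are counted in edges and found by breadth-first search.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total > 0 else 0.0
```

In a three-node chain, the middle node scores 1.0 and each end node scores 2/3, which matches the reading that high values mark the core of a network.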

We finally conducted a one-way ANOVA to test for significant differences between the mean closeness centrality values of the clusters derived from the Louvain community detection algorithm. The test was statistically significant (p < 0.001). We then applied two-sample t-tests to all pairs of clusters (using the base R function pairwise.t.test with Bonferroni adjustment for multiple comparisons) to evaluate exactly which pairs differ significantly in their mean centrality values.
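The two ingredients of this testing procedure can be sketched in plain Python: the F statistic of a one-way ANOVA and the Bonferroni adjustment of a set of p-values. The sketch does not compute p-values from the F or t distributions, and the function names are illustrative assumptions rather than the R functions used in the study.

```python
def anova_f(groups):
    """F statistic of a one-way ANOVA for a list of numeric groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With five clusters there are ten pairwise comparisons, so under this adjustment a raw p-value has to fall below 0.005 to remain below 0.05.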

Description of network clusters

To describe the network clusters based on all the derived keywords, a chi-squared test (base R function chisq.test) was applied to the matrix of summed, first-order thesaurified keyword incidences per article per bibliographic-coupling cluster. A chi-squared test examines whether there is a significant association between the rows (keywords) and columns (clusters) of a contingency table, with the null hypothesis (H0) that the rows and columns are independent, and the alternative hypothesis (H1) that they are not. For each keyword-cluster pair, the standardised residuals were retrieved. The standardised residuals of the chi-squared test highlight the divergence between the model and data, and so shed light on the positive or negative association between the clusters and the keywords. We extracted all keywords for each cluster that had positive standardised residuals of 2 or more (i.e. deviated by at least two standard deviations), meaning that they are a major contributor to the overall test result (Table 3). The keywords identified by the standardised residuals should not be interpreted as the sole descriptors of a cluster, but rather as the keywords that diverged the most from the baseline expectation of independence between keywords and clusters.
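How standardised residuals single out keywords can be sketched in Python. The formula below, (observed − expected) / sqrt(expected × (1 − row share) × (1 − column share)), is the usual definition of standardised residuals for a contingency table; the helper names, the toy inputs, and the assumption that no row or column total is zero are illustrative.

```python
import math

def standardised_residuals(table):
    """Standardised chi-squared residuals for a keyword-by-cluster table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out.append((observed - expected) / math.sqrt(variance))
        residuals.append(out)
    return residuals

def associated_keywords(table, keywords, cluster, threshold=2.0):
    """Keywords whose residual for column `cluster` is at least `threshold`."""
    residuals = standardised_residuals(table)
    return [k for k, row in zip(keywords, residuals) if row[cluster] >= threshold]
```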

As a second line of evidence, we applied the weighted log odds ratio algorithm as implemented in the tidylo package (Schnoebelen et al., 2022) using the method described by Monroe et al. (2008). In this case, weighted log odds were used to measure how first-order thesaurified keyword occurrence differs between clusters. The weighted log odds ratio works well for keywords that appear in all of the clusters yet, importantly, with varying frequencies. Taking this into account, the method then measures which keywords are more (high log odds ratio) or less (low/negative log odds ratio) likely to appear in each cluster. The results are visualised below (Fig. 2; Supplementary Fig. S1).
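The general idea of a weighted log odds ratio can be sketched in Python as a smoothed log odds ratio divided by its approximate standard deviation, in the spirit of Monroe et al. (2008). This is a simplified illustration and not the tidylo implementation: it compares each cluster against all remaining clusters and uses a uniform pseudo-count `alpha` as prior, both of which are assumptions, as is the function name.

```python
import math

def weighted_log_odds(counts, alpha=1.0):
    """z-scored log odds of each keyword in each group versus all other groups.

    `counts` maps a group name to a dict of keyword counts. Assumes at least
    two distinct keywords overall, so that all denominators stay positive.
    """
    vocab = sorted({w for group in counts.values() for w in group})
    alpha0 = alpha * len(vocab)  # total prior mass across the vocabulary
    totals = {g: sum(c.values()) for g, c in counts.items()}
    grand_total = sum(totals.values())
    word_totals = {w: sum(c.get(w, 0) for c in counts.values()) for w in vocab}
    scores = {}
    for g, c in counts.items():
        n_g = totals[g]
        n_rest = grand_total - n_g
        for w in vocab:
            y_g = c.get(w, 0)
            y_rest = word_totals[w] - y_g
            odds_g = (y_g + alpha) / (n_g + alpha0 - y_g - alpha)
            odds_rest = (y_rest + alpha) / (n_rest + alpha0 - y_rest - alpha)
            delta = math.log(odds_g) - math.log(odds_rest)
            variance = 1 / (y_g + alpha) + 1 / (y_rest + alpha)
            scores[(g, w)] = delta / math.sqrt(variance)
    return scores
```

A keyword used equally often in every cluster scores zero, while a keyword concentrated in one cluster scores positively there and negatively elsewhere.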

Fig. 2: Keyword importance based on the weighted log odds ratio for clusters 1–5.

The y-axis shows the weighted log odds ratio, whereas the x-axis shows the frequency of the respective keywords in each cluster. High values on the y-axis represent a strong cluster association.

Results

Data extraction

Our WoS query resulted in N = 674 publications published by 1402 authors. Around 40% (n = 271) are single-authored documents. The multi-authored articles have 2.89 co-authors on average. Around 39% of the publications have a first author affiliated with institutions in the United States, ~15% in the United Kingdom, ~6% in Germany, and ~5% in China. Other contributing countries are Australia (4.3%), Canada (3.9%), Spain (2.9%), France (2.7%), Israel (1.8%), and Italy (1.7%). The documents retrieved come from n = 246 different sources (journals, books, etc.), with Journal of Archaeological Science and Quaternary International being the most popular outlets (Table 2).

Table 2 Top 10 most common journal outlets and the number of articles within which cultural evolutionary theory appears coupled with archaeological topics.

Within the disciplinary domains of anthropology and archaeology, there has been a general growth in the use and elaboration of CET from the onset of the new millennium (Fig. 3). By the same token, it has been within the last decade that the vast majority of the whole corpus has been published. The bulk of articles in the data set mention no time-period-related keywords in their titles, abstracts, or author-given keywords (Fig. 4). The first period-related keyword retrieved from the data set stems from 1994 and belongs to the Palaeolithic, which, in this data set, is the first as well as the most common archaeological period in which CET is employed, followed by the Neolithic and the Bronze Age. As can be seen in Fig. 4, the fraction of articles focused on the Palaeolithic, while volatile, has increased ever since.

Fig. 3: Chronological distribution of articles used in this study.

The figure shows the number of yearly archaeological and anthropological publications within the field of cultural evolution and cultural transmission theory on the WoS database as retrieved by our search query.

Fig. 4: Number of archaeological period keyword occurrences per year.

The data is taken from the thesaurified keywords of each article.

Since the data set reaches a substantial size only from the year 2000 onwards, we focus on the period since then in what follows. As expected, keywords associated with cultural evolution and cultural transmission are the most frequent overall, with cultural evolution outweighing the latter (Supplementary Fig. S2). The third-most frequent keywords are associated with social learning, peaking around 2010 and 2015. Keywords associated with gene-culture co-evolution, and evolutionary psychology occurred mostly in the early 2000s. Likewise, memetics made a brief appearance around the same period, and again around 2010. Since then, however, this keyword has been absent, coinciding with the de facto disappearance of memetics as a mainstream domain of academic enquiry. The meme as a cultural analogue to the gene was introduced by the British evolutionary biologist Richard Dawkins (1976) and found a surprising amount of resonance in popular science (e.g. Blackmore, 1999). At the core of memetics was the insistence on an actual replicator akin to genes, which, however, turned out to be largely irrelevant to cultural evolution research (Boyd and Richerson, 2001; Laland and Brown, 2011). Memetics also failed to inspire reproducible quantitative models (Edmonds, 2005; Gil-White, 2005) and adopted an often unhelpful terminology and rhetoric (Kuper, 2000; Pigliucci, 2007).

Regarding the methods used, keywords connected to modelling and/or simulations have been the most frequent in the data set (Fig. 5). However, in the past five years, their relative frequency has declined. Phylogenetic/cladistic methods, while dynamic early on, have continuously declined in popularity as well. This trend, in particular, is worth noting. Phylogenetic methods constituted a major element of methodological innovation in evolutionary archaeology and were at the core of many early programmatic papers (e.g. Lyman and O’Brien, 2000; O’Brien et al., 2001, 2002) and foundational case studies (e.g. Tehrani and Collard, 2002; Jordan and Shennan, 2003; Jordan and O’Neill, 2010). In surveying evolutionary anthropology/archaeology as part of a major book review, Cronk (2006, p. 197) went so far as to suggest that “the phylogenetic method should be placed alongside cultural transmission theory, gene-culture co-evolution, signalling theory, experimental economic games, niche construction, and animal culture studies in our toolkit for studying culture from an evolutionary perspective”. Yet, while a constant stream of phylogenetic applications continues to characterise neighbouring domains of evolutionary linguistics and anthropology, the popularity of phylogenetic applications specifically in archaeology has stagnated or even declined. In contrast, (geometric) morphometric applications in CET contexts have, besides a few boosts of interest, largely remained constant in popularity within the last 10 years, and may even be on the rise, as has also been suggested by Wyatt-Spratt’s (2022) recent review of these approaches. Keywords related to theoretical approaches (e.g. “neo marxism”, “world-systems theory”, etc.; see thesaurus for details) are on a slight downward trend. A positive trend, albeit with much lower frequencies, is visible for keywords under the typological/taxonomic category.
While keywords associated with quantitative/statistical methods occurred in only every tenth article in the year 2000, their relative frequency had tripled by the year 2007. They are now present in at least every fourth article.

Fig. 5: Relative appearance of keywords related to methods used in the whole corpus per year.

The y-axis is scaled dynamically for each subplot. A comparison of trends can only be done within categories but not between categories.

Popular topics under discussion are climate and catastrophes, whose relative occurrence increased around the year 2016 to around 40%, as well as ecology/environment. Keywords related to the latter appear in every third article, and flora specifically appears in roughly every fifth paper (Supplementary Fig. S3). Fauna, on the other hand, is far less popular, with a representation of only around 10%. Topics like behaviour and cognition appear steadily in around 20% of the papers published per year, with behaviour peaking in 2004, 2014, and 2020. Keywords related to human evolution are rising in relative frequency, whereas keywords related to genetics are, somewhat surprisingly perhaps, declining. The graph further indicates that there is a greater focus on temporality than on spatiality. Moreover, much of CET research in archaeology relates to hunter-gatherers/foragers, mobility, subsistence, and demography. For post-Palaeolithic periods, metallurgy is an important topic.

Bibliographic coupling network

The overall network is a graph of articles linked by their shared bibliographies (Fig. 6), which consists of 630 articles and has a modularity score of ~0.155 and a density of ~0.224. The Louvain community detection algorithm retrieved seven clusters in total (n1 = 148, n2 = 92, n3 = 150, n4 = 154, n5 = 81, n6 = 3, n7 = 2). Because of their diminutive size, clusters 6 and 7 are not considered further here. The network is clearly structured, but the clusters are large and some are only weakly connected. This is because we did not filter out the weakest edges in the network using a threshold. The cluster with the highest density is cluster 3, indicating a larger number of shared cited references, followed by clusters 4, 2, and 1. Cluster 5 displays the lowest density, indicating a somewhat looser thematic coherence (d1 = 0.306, d2 = 0.306, d3 = 0.691, d4 = 0.496, d5 = 0.260). The pairwise comparisons of the closeness centrality measures (c1mean = 0.62, c2mean = 0.571, c3mean = 0.699, c4mean = 0.71, c5mean = 0.646, Fig. 7) using t-tests with pooled standard deviation found that cluster 4 has a significantly higher mean closeness centrality than all other clusters (p < 0.05; see Supplementary Table S1) except for cluster 3. On the basis of the mean centrality scores, we identify clusters 4 and 3 as the centre of the network. In contrast, cluster 2 has the lowest mean closeness centrality and thus represents the periphery.

Fig. 6: Bibliographic coupling network of the 630 connected articles.

The clusters retrieved by the Louvain algorithm are differentiated by colour. The graph was visualised using the Fruchterman–Reingold method in the ggraph package (Pedersen, 2021). As explained below, we characterised the first five clusters as follows: cluster 1: “Niche construction theory|psychology|gene-culture co-evolution” (n = 148), cluster 2: “Climate change|social adaptations|population density|Neolithic” (n = 92), cluster 3: “(Foundational) cultural evolutionary theory|methods” (n = 150), cluster 4: “Complex human behaviour|cultural transmission|Early Stone Age” (n = 154), cluster 5: “Ethnoarchaeology|cultural complexity|chiefdoms|early states” (n = 81).

Fig. 7: Boxplots of the scaled closeness centrality per retrieved bibliographic coupling cluster.

Closeness centrality is defined as the inverse of the average length of the shortest paths between the node of interest and all other nodes. Here, closeness centrality is used to identify core elements of the network.

As each article included in our analysis is time-stamped with its year of publication, we can follow the dynamics of thematic development and connectedness over time. With fewer shared references between articles and a simultaneous increase in published articles, the field of CET in archaeology as a whole has become more diverse within the last ten years, as indicated by a reduction in network density (d1981–2001 = 0.204, d2002–2006 = 0.261, d2007–2011 = 0.321, d2012–2016 = 0.304, d2017–2021 = 0.212; Supplementary Fig. S5).

Description of network clusters

To understand the network and its significance we attempted to derive a general, human-readable characterisation of each cluster. To this end, we combined multiple lines of evidence, starting with the ones introduced above: closeness centrality, which indicates how close each cluster is to the thematic centre of the network, and network density, measuring internal cluster coherence. We also applied Pearson’s chi-squared test. This test had a statistically significant result (χ² = 2394.7, df = 372, p-value < 0.001) for the whole network, indicating a general dependency between clusters and keyword occurrences. We extracted the keywords of each second-order category (Table 1) that had a significant positive association with a given cluster and could thus assess their semantic properties (Table 3).

Table 3 Results of the chi-squared test for each cluster and each meta-category.

In the following, we describe each cluster in descending order of its closeness to the centre of the network, combining the results of the chi-squared test with those of the weighted log odds ratio analysis. As noted above, the two clusters with the highest mean closeness centrality are clusters 4 and 3, which together represent the dense core of the network (Fig. 6).

Cluster 4: Complex human behaviour|cultural transmission|Early Stone Age

With the highest mean closeness centrality, cluster 4 (n4 = 154, c4mean = 0.71, d4 = 0.496) represents the centre of the network. It is very much focussed on the study of ‘becoming human’, i.e. hominins and complex human behaviour. The associated geographical areas of research are Africa and Europe and the periods covered are the Palaeolithic and African Stone Age respectively, where lithic tools and technology, mainly hand-axes, but also other types of material culture (e.g. ornaments) are studied. Cluster 4 can be considered the expression of a mature field, where CET is applied to answer specific research questions in case studies, and is the cluster containing the most publications of the past ten years (Fig. 8). Within cluster 4, there exist several thematic sub-clusters. The first one concerns the Early Stone Age, such as the Oldowan and Acheulean industrial complexes in Africa, where hand-axes are studied in the framework of behavioural ecology (for example, Plummer, 2004; Ambrose, 2001; Lycett and von Cramon-Taubadel, 2008; Lycett and Gowlett, 2008; Stout et al., 2010). Within the same setting, laboratory experiments on transmission processes are conducted (e.g. Lycett et al., 2016; Pargeter et al., 2019; Newman and Moore, 2013) with a strong focus on studying processes of cultural evolution and transmission. In the second sub-cluster, CET is applied to study cognitive evolution and behaviour through the use of ornaments, material culture, and symbolic communication, often in combination with the study of anatomically modern humans and their dispersal (e.g. Bar-Yosef and Belfer-Cohen, 2013; Coward and Gamble, 2008; d’Errico et al., 2009; Mackay et al., 2014; Vanhaeren and d’Errico, 2006). Cultural complexity, and the impact of population size and connectedness on it as well as on cultural evolution more generally, is also addressed (e.g. Kline and Boyd, 2010; Muthukrishna et al., 2014; Premo and Kuhn, 2010; Vaesen et al., 2016).

Fig. 8: Relation of network density to number of articles across time bins per cluster.

The number of publications per time bin per cluster is represented as columns. The bibliographic coupling network density (ratio of existing edges to all possible edges) for each time bin for each cluster is represented as point-and-line plots. The network density works as a proxy for the conformity/diversity within the individual clusters of the network.
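The density measure named in the caption can be illustrated with a minimal sketch. It assumes, for illustration only, that two articles are coupled when they share at least one cited reference; the toy corpus, the function names, and that threshold are hypothetical and are not taken from the analysis reported here.

```python
from itertools import combinations


def coupling_edges(references):
    """Return the pairs of papers that share at least one cited reference.

    `references` maps a paper id to the set of references it cites.
    The "at least one shared reference" rule is an assumption of this sketch.
    """
    edges = set()
    for a, b in combinations(sorted(references), 2):
        if references[a] & references[b]:
            edges.add((a, b))
    return edges


def network_density(n_nodes, n_edges):
    """Ratio of existing edges to all possible edges in an undirected graph."""
    if n_nodes < 2:
        return 0.0
    possible = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible


# Hypothetical toy corpus: paper id -> set of cited references.
toy_refs = {
    "paper_a": {"r1", "r2"},
    "paper_b": {"r2", "r3"},
    "paper_c": {"r3"},
    "paper_d": {"r4"},
}

edges = coupling_edges(toy_refs)
density = network_density(len(toy_refs), len(edges))
print(sorted(edges), round(density, 3))
```

In this toy case two of the six possible pairs are coupled, so the density is one third. A value close to 1 would indicate that nearly all articles in a cluster draw on overlapping literature, whereas a low value points to thematic diversity.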

Cluster 3: (Foundational) cultural evolutionary theory|methods

Cluster 3 (n3 = 150, c3mean = 0.699, d3 = 0.691) represents the theoretical core of CET in archaeology. Like cluster 4 described above, cluster 3 is large and can be divided into several sub-clusters. The first and oldest sub-cluster contains foundational articles in which the groundwork was laid for translating (palaeo-)biological theories and methods, such as phylogenetics/cladistics, into archaeology (e.g. Lyman and O’Brien, 1998; O’Brien et al., 2001). Chiefly focussing on cultural transmission and key processes such as copying error, drift, and innovation (e.g. Bentley et al., 2004; Bentley and Shennan, 2003; Eerkens and Lipo, 2005; Neiman, 1995), as well as on the influence of population size on cultural evolution (e.g. Shennan, 2000), models were developed and tested in case studies using a variety of different archaeological datasets. While these data are archaeological and the papers inherently concerned with archaeological questions, it is the methods and theory that are central in these papers. While not part of the corpus captured bibliometrically here, the contents of the papers in cluster 3 strongly echo the contents of major monographs and edited volumes published around the same time (cf. Riede, 2010:Table 1). The articles contained in the second sub-cluster are generally more recent and arguably reflect the maturation of cultural evolutionary approaches in archaeology. Here, the articles’ focus has shifted away from the programmatic development and testing of general models of cultural transmission, to their refinement and concrete application in order to answer specific questions. This research often focuses on material culture and the underlying transmission processes of social learning. In terms of methods, this sub-cluster contains articles employing metric and geometric morphometrics for the study of lithic tools and technology (e.g. Eren et al., 2015; Lycett and von Cramon-Taubadel, 2013; Okumura and Araujo, 2014; Schillinger et al., 2015). Within other, much smaller niches of cluster 3, modes of cultural transmission and social learning feature prominently, especially in regard to ceramic production (e.g. Hart and Brumbach, 2009; Stark, 2003; Tehrani and Riede, 2008). This includes ethnoarchaeological approaches (e.g. Roux, 2007), but also aspects of mobility, demography, and social complexity (e.g. Broodbank and Kiriatzi, 2007) whose association with formal CET is only loose. What we observe here is that cluster 3 also includes some recent papers that, akin to cluster 2 (see below), employ CET as a general framework to contextualise their research (e.g. Schmid, 2020), citing foundational publications and the references therein.

Cluster 5: Ethnoarchaeology|cultural complexity|chiefdoms|early states

The geographical areas positively associated with cluster 5 (n5 = 81, c5mean = 0.646, d5 = 0.26) are Meso- and South America, and the associated keyword categories for this cluster are complexity and cultural evolution. Thematically, the articles treat topics of social and cultural complexity in chiefdoms and early states. Within this domain, topics such as climate change and subsistence (e.g. Arnold, 1992; Kennett and Kennett, 2000; Raab and Larson, 1997), as well as intensification in connection with agriculture (e.g. Morgan, 2015; Morrison, 1996), state formation and organisation on local (e.g. Keegan, 2000; Parkinson and Galaty, 2007; Stanish, 2001) and global scales (e.g. Turchin et al., 2017) are discussed from an ethnoarchaeological point of view. This cluster aligns closely with notions of social evolution that are mostly distinct from evolutionary archaeology in the stricter sense (cf. Shennan, 2011).

Cluster 1: Niche construction theory|psychology|gene-culture co-evolution

While cluster 1 (n1 = 148, c1mean = 0.62, d1 = 0.306) undoubtedly operates within the CET framework, the case studies are to a large degree unrelated to archaeology per se, but, much like cluster 5, more ethnographical in nature. Publications within this cluster centre around subjects such as evolutionary psychology (e.g. Atran et al., 2005; Curtis et al., 2011), behavioural ecology and niche construction (e.g. Ellis, 2015), and social learning (e.g. Kline et al., 2013). Furthermore, topics such as cognition, ethics, morality, and religion, yet also linguistics, primatology, and economics are covered. Cluster 1 has one of the lowest density scores, indicating a loose interconnection and therefore high thematic variation and pronounced multidisciplinarity. This is underlined by the chi-squared test failing to result in any strong association of this cluster with any specific geographic area of research, time period, or domain of material culture. Interestingly, cluster 1 is the cluster with the most applications of cladistic/phylogenetic (comparative) methods to questions of cultural transmission, mobility, and ecological adaptation (e.g. Holden and Mace, 2003; Jordan et al., 2009; Mace and Holden, 2005).

Cluster 2: Climate change|social adaptations|population density|Neolithic

The articles in cluster 2 (n2 = 92, c2mean = 0.571, d2 = 0.306) are positively associated with the Neolithic first and foremost and treat topics such as climate and catastrophes, subsistence, and mobility. Geographically, the articles’ keywords most strongly relate to areas of research in East, Central, and Northern Asia, while heavily relying on methods and concepts from the geosciences and absolute dating. Absolute dating has become increasingly popular within the whole data set from around the 2010s (Fig. 5), when this cluster formed, a trend likely driven by recent methodological advances (cf. Crema, 2022) and the ever-increasing availability of large compilations of radiocarbon dates (e.g. Bird et al., 2022). Some of the articles study the spread of the Neolithic as a cultural transmission (diffusion) phenomenon (e.g. Fort, 2012, 2015; Isern et al., 2017). Others focus heavily on the study of population density through the use of radiocarbon dates as a proxy (e.g. Edinborough et al., 2017; Gayo et al., 2015; Timpson et al., 2014). This is often studied in combination with the impact of climatic change (e.g. Manning and Timpson, 2014; Woodbridge et al., 2014), and the results are then discussed within the bounds of cultural evolution and cultural dynamics, most likely due to the strong association of such studies with the foundational efforts of Stephen Shennan. The effects of rapid climate change events on material culture are also studied (e.g. Dong et al., 2012), often through the use of proxy measures such as paleoenvironmental data (e.g. Clarke et al., 2016; Kaniewski et al., 2008; Maher et al., 2011; Sharifi et al., 2015; Staubwasser and Weiss, 2006) and settlement distributions (e.g. Dong et al., 2013; Hosner et al., 2016). Throughout, the focus lies on paleoclimate and societal impacts. While the articles in this cluster are thematically and geographically coherent, the low mean closeness centrality indicates that this cluster is at the very periphery of the network, meaning that it operates in the field of CET only in the broadest sense and instead is mostly archaeological.

Discussion

The field of CET has been growing rapidly within the last two decades. Our network analysis visualises and tracks this development from the foundation-building period around the turn of the last millennium to the most recent radiations. Our bibliometric analysis captures the core corpus of evolutionary archaeology beyond its early, formational monographs, and also highlights the continuing relations that evolutionary archaeology sensu stricto has with semantically related notions of social evolution and vernacular uses of the term. Notably, our review also shows that archaeology, perhaps in poignant parallel to paleobiology, does remain relatively peripheral to CET at large. Evolutionary archaeological studies are not commonly published in core CET journals such as Evolutionary Human Sciences, Evolution and Human Behavior, or Behavioral and Brain Sciences. Instead, the bulk of relevant works appear in more general-scope journals that preferentially reach the archaeological rather than CET communities.

More than a decade ago, Mesoudi and O’Brien (2009) sketched a roadmap for how evolutionary archaeology fits within the overarching framework of cultural evolution, itself modelled on similar conceptual divisions within evolutionary biology. In this scheme, evolutionary archaeology sits within the branch of cultural macro-evolution (see also Mesoudi et al., 2006: Fig. 1), a view that aligns closely with that of many archaeologists publishing within this domain (e.g. Lyman and O’Brien, 2001; Prentiss et al., 2009). Calls for macro-archaeological approaches have been sounded loudly again recently (e.g. Perreault, 2019), yet our results indicate that evolutionary archaeology is invested in a range of CET domains that include both micro- and macro-scale processes. We observe three different classes of articles. The first encompasses articles where biological evolutionary theory is translated and modified to study cultural evolution. While using archaeological data, these articles are heavily focussed on theory-building and methodological exploration and use these data for simulations that demonstrate that the transmission and evolution of cultural traits can be studied empirically in a similar manner to genes. Many of these programmatic articles focus on basal mechanisms of CET and position archaeology within the broader field of CET (cf. Mesoudi et al., 2006). The second type represents articles that apply archaeology-tailored CET methods to study intrinsically archaeological questions, while tightly adhering to the CET framework. While an apparent dearth of empirical applications of CET in archaeology was once lamented to the degree that the paradigm’s usefulness was questioned (O’Brien and Lyman, 2000), our results show that such doubts can be put aside. The divide between theory and data can be bridged (Creanza et al., 2017), and the number of empirical studies has in fact grown a great deal in recent decades. This is likely correlated with a wider trend towards ‘data science’ and a ‘scientification’ of archaeology (Kristiansen, 2014). For the last type of archaeological studies considered here, CET is merely a general framework in which questions of a purely archaeological nature are addressed. We suggest that instead of reflecting a ‘normalisation’ of evolutionary archaeology to ‘just’ archaeology (Lycett, 2015), articles such as these indicate that ‘cultural evolution’ remains a vernacular synonym for culture change. While no scientific community can lay exclusive claim to certain terms, the continuing use of cultural evolution across multiple theoretical paradigms remains, we suggest, also a barrier to progress in evolutionary archaeology.

While microevolutionary approaches to archaeological material culture are clearly visible in this data set, and represented by articles studying the underlying mechanisms of cultural evolution through simulations and laboratory experiments, amongst others, the opposite is the case for macroevolutionary archaeology. This apparent lack is especially surprising, as evolutionary archaeology is the only explicitly named ‘archaeology’ in Mesoudi et al.’s (2006) taxonomy and is generally recognised to be the macroevolutionary counterpart to paleobiology, and therefore the key to understanding long-term changes in cultural evolution. The parallels between evolutionary archaeology and paleobiology have often been pointed out (Lyman and O’Brien, 1998; O’Brien and Lyman, 2000, p. 20), with an explicit call to employ paleobiological methods in archaeology. For example, Perreault (2019) suggests that the study of the range and duration of archaeological types, and the external and internal influences thereupon, as well as the pace of cultural change are key topics that would benefit from the application of such methods. Similarly, Garvey (2018) named the tracking of rates of cultural evolution, the detection of phylogenetic signals in artefact shapes and technology, and the study of long-term effects of cultural transmission as goals for macroevolutionary archaeology.

Of particular importance for studying all of these topics is the wider framework of the phylogenetic method. It has been described as “perhaps the most important” (Mesoudi and O’Brien, 2009, p. 22) paleobiological approach to studying macro-scale archaeological questions. But how often and to what extent has it been applied in archaeology? To answer this question, we consider it necessary to take a short excursus explicitly illustrating the applications, relevance and potential of the phylogenetic method for the broader field of CET.

Recent developments regarding the phylogenetic method in archaeology

Three decades ago, Mace and Pagel (1994) recommended the use of cultural phylogenies to identify the evolution of cultural change. Likewise, O’Brien and Lyman (2000) explicitly wrote about the tempo and mode of evolution, their separation, and manifestation in the archaeological record, highlighting just how parallel issues in paleobiology have been addressed using phylogenetic methods. Around the middle of the first decade of the new millennium, major efforts were made to review the state of the art in relation to phylogenetic methods in archaeology (e.g. Lipo et al., 2006; Lycett, 2015; Lycett et al., 2007; Mace et al., 2005) after which these approaches have, however, declined in popularity (Fig. 5). Potentially, and unlike for languages and genes, the lack of a universally agreed classification system for artefacts and a methodological conservatism stand behind this decline.

Within our publication data set, the application of (Bayesian) phylogenetic or cladistic methods in cultural evolutionary research is largely confined to topics of ethnographic/anthropological nature, and nearly all of these applications occur in cluster 1. This subset consists of 19 articles, of which 12 centre around language trees, while the others use material culture from ethnographic contexts or more abstract cultural traits. Within this subset, archaeological data is merely used to put the results into perspective (e.g. Teixidor-Toneu et al., 2021). Phylogenetic methods were applied to linguistic data very early on to test hypotheses about cultural evolution: for example, mode and rates of change, co-evolution of traits along the tree (e.g. Currie and Mace, 2011; Mace and Holden, 2005), or the effect of horizontal transmission on the robustness of Bayesian phylogenetic comparative methods (e.g. Greenhill et al., 2009; Currie et al., 2010). Furthermore, phylogenetic comparative methods were used to estimate language divergence dates (e.g. Gray et al., 2011; Kolipakam et al., 2018), to study cultural signatures in languages as the result of ancient population expansion and dispersal (e.g. da Silva and Tehrani, 2016), or to reconstruct the evolution of kinship systems and rules of inheritance in populations (e.g. Holden and Mace, 2003; Jordan et al., 2009; Mace and Jordan, 2011).

Phylogenetic methods have been applied to material culture as well. Cladistic methods have been used to reconstruct the prehistoric evolution and dispersal of Polynesian bark cloth to study human dispersal (Larsen, 2011), or to investigate different models of cultural evolution for Iranian tribal textile traditions using Bayesian phylogenetic analysis (Matthews et al., 2011). Prentiss et al. (2018) studied the early Thule culture using maximum parsimony analysis on a dataset describing the culture’s material culture assemblage, such as lithic tools and architectural features, to measure the effects of both cultural transmission and ecological context; Manem (2020) created phylogenetic trees on the basis of chaînes opératoires as discrete character data to study how cultural traits are transmitted and modified in the European Middle Bronze Age.

While language-based trees have long been inferred using Bayesian methods, the methods applied for the direct study of archaeological artefacts, mostly although not exclusively stone tools, have been lagging behind in regard to their methodological sophistication. O’Brien et al. (2001, 2002) provided the seminal work for the study of cultural evolution of archaeological artefacts where they derived homologous characters to describe stone tool shape and then analysed them within a cladistic framework. Since then, other studies followed, where either metric, shape, or technological trait information of archaeological stone tools have been discretized into homologous characters and then studied using a maximum parsimony approach (e.g. Lycett, 2009a, 2009b; Smith and Goebel, 2018). So far, only a few studies have used shape information derived from GMM directly to infer phylogenies. This is partially due to a lack of easily accessible software implementations, which allow for the use of continuous character traits, and partially due to a reservation against this kind of data in the past, which was considered not sufficiently informative (Zelditch et al., 1995).
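The maximum parsimony approach mentioned above can be illustrated with a minimal sketch of Fitch's small-parsimony count, which scores a given tree by the minimum number of state changes needed to explain a matrix of discrete, unordered characters. This is a textbook algorithm shown for orientation only, not the procedure or software used in the studies cited; the taxon names, the character matrix, and both candidate trees are hypothetical.

```python
def fitch_score(tree, states):
    """Return (possible_states, changes) for one character on a rooted binary tree.

    `tree` is either a leaf name (str) or a 2-tuple of subtrees.
    `states` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change is needed.
        return shared, left_cost + right_cost
    # Children disagree: one change must have occurred on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, matrix):
    """Sum the Fitch change counts over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        column = {taxon: row[i] for taxon, row in matrix.items()}
        total += fitch_score(tree, column)[1]
    return total


# Hypothetical artefact classes scored for three binary characters.
matrix = {"type_a": "110", "type_b": "100", "type_c": "011", "type_d": "001"}

# Two hypothetical candidate trees over the same four classes.
tree_1 = (("type_a", "type_b"), ("type_c", "type_d"))
tree_2 = (("type_a", "type_c"), ("type_b", "type_d"))

print(parsimony_length(tree_1, matrix), parsimony_length(tree_2, matrix))
```

Under the parsimony criterion the tree requiring fewer changes is preferred, here `tree_1`. Real analyses additionally search over many candidate trees and assess support, which this sketch does not attempt.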

However, as summarised by Parins-Fukuchi (2017), the use of continuous characters has already been discussed in the earliest stages of statistical phylogenetics (Cavalli-Sforza and Edwards, 1967; Felsenstein, 1973), and empirical studies have proven their phylogenetic informativeness (Goloboff et al., 2006; Smith and Hendricks, 2013). A software implementation was available early on with TNT (Goloboff et al., 2008, 2006), which has been applied to the study of morphometric stone tool data in a maximum parsimony framework (e.g. Muscio and Cardillo, 2019; Schillinger et al., 2016). Since the recent advent of model-based Bayesian phylogenetic analyses, the maximum parsimony method has become outdated. The use of Bayesian phylogenetic methods in cultural evolutionary research is becoming more popular in certain fields, and some authors (e.g. Lukas et al., 2021) proclaim that phylogenies have almost become obligatory when investigating cross-cultural variation. Recent work by Evans et al. (2021) provides broad guidelines and best practices for the inference of Bayesian phylogenetic trees and the use of phylogenetic comparative methods. Still, the application of phylogenetic methods—especially Bayesian ones—in archaeology remains rare and archaeology’s unique deep-time perspective therefore under-utilised in the context of CET.

The reasons for this lack of macro-archaeological phylogenetic analyses of cultural evolution are, we suggest, manifold. Firstly, it has to be appreciated that model-based, Bayesian phylogenetic analyses are a very recent development. Secondly, and despite the fact that Bayesian statistics are having a broad impact on the discipline through their common implementation in radiocarbon date calibration, the number of archaeologists who are familiar with Bayesian approaches to phylogenetics and are capable of implementing them remains small. Archaeology borrows these methods from paleobiology, where necessary advances for the work with chronologically stratified samples, i.e. the incorporation of age information and fossils (analogous to artefacts), have been made only within the past decade. This near-revolutionary advancement, the fossilised birth–death (FBD) process, which explicitly uses extinct samples for the inference of phylogenetic trees, was first described by Stadler (2010) and subsequently implemented by Heath et al. (2014). Its importance to paleobiology is hard to overestimate, and its potential usefulness in archaeology evident. Another recently overcome hurdle has been inferring Bayesian phylogenies from continuous characters, such as the ones one would use for the study of whole outline shapes of artefacts (e.g. Leplongeon et al., 2020; Matzig et al., 2021). Our bibliometric analysis shows that geometric morphometrics has become a popular way of quantitatively describing complex artefact shapes, but feeding these into downstream phylogenetic analyses has been cumbersome. Only recently have advances been made in studying continuous character evolution using Bayesian phylogenetic inference methods, now possible in the RevBayes (Höhna et al., 2016) and the BEAST2 (Bouckaert et al., 2014) software packages. Still, this method remains experimental and as of now, few case studies exist even in palaeobiology (e.g. Parins-Fukuchi, 2017).

Conclusion

In the spirit of Hull (1988), research communities themselves can be said to evolve, a process that can be tracked through citation practices. Our bibliometric analysis of evolutionary archaeology has mapped a vibrant field consisting of numerous intellectual clusters and research traditions. These are situated within the wider field of cultural evolutionary theory but also connect to research that falls outside of this community. While scholars within Cultural Evolution have gone to great pains to delineate their field of study from earlier and often critically flawed uses of evolutionary thinking in the social sciences and humanities, the notion of cultural evolution has evidently not been patented. Instead, it remains a common vernacular for change over time. While such laxity in core terminology can be a barrier to scientific communication, it may also be seen as a strength if the broader communicative affordances of the term are appropriately leveraged. In addition to this terminological aspect, our analysis also highlights the central role of quantitative approaches in evolutionary archaeology, albeit with shifting focus on different methods. Here we stress specifically how the early bloom in evolutionary archaeological applications was strongly borne by the application of phylogenetic methods to artefacts but how precisely this suite of methods has seen a decline in recent years. We have attempted to diagnose this stagnation and suggest that a lack of articulation between morphometric shape analysis and phylogenetics—until recently an unresolved analytical challenge also in palaeobiology—lies at its roots. Recent methodological breakthroughs in Bayesian phylogenetics allow the use of continuous (i.e. quantitative shape-based) characters directly in tree-building and so may open up new and exciting vistas for future applications.

Evolutionary archaeology does not always need to look towards palaeobiology as its supplier of methodological innovation. Yet the strong structural similarities between the two fields (a stratified, fragmented and incompletely sampled record; complex shapes reflecting equally complex underlying histories of transmission and adaptation) do make it the go-to scientific domain for methodological sparring. If further quantification is the aim, palaeobiology certainly continues to provide a rich seam of inspiration.