How Atypical Combinations of Scientific Ideas Are Related to Impact: The General Case and the Case of the Field of Geography

Mukherjee, Satyam; Uzzi, Brian; Jones, Benjamin F.; Stringer, Michael

doi:10.1007/978-3-319-45023-0_12

Part of the book series: Knowledge and Space (KNAS, volume 11)

Abstract

Novelty is an essential feature of creative ideas, yet the building blocks of new ideas are often embodied in existing knowledge. From this perspective, balancing atypical knowledge with conventional knowledge may be critical to the link between innovativeness and impact. The authors’ analysis of 17.9 million papers spanning all scientific fields suggests that science follows a nearly universal pattern: The highest-impact science is primarily grounded in exceptionally conventional combinations of prior work, yet simultaneously features an intrusion of unusual combinations. Papers of this type were twice as likely to be highly cited works. Novel combinations of prior work are rare, yet teams are 37.7 % more likely than solo authors to insert novel combinations into familiar knowledge domains.

Scientific enterprises are increasingly concerned that research within narrow boundaries is unlikely to be the source of the most fruitful ideas (National Academy of Sciences, National Academy of Engineering, & Institute of Medicine of the National Academies, 2004). Models of creativity emphasize that innovation is spurred by original combinations that spark new insights (Becker, 1982; Guimera, Uzzi, Spiro, & Amaral, 2005; Jones, 2009; Jones, Wuchty, & Uzzi, 2008; Schilling, 2005; Schumpeter, 1939; Usher, 1929/1998; Uzzi & Spiro, 2005; Weitzman, 1998). Current interest in team science and how scientists search for ideas is premised in part on the idea that teams can span scientific specialties, effectively combining knowledge that prompts scientific breakthroughs (Evans & Foster, 2011; Falk-Krzesinski et al., 2010; Fiore, 2008; Stokols, Hall, Taylor, & Moser, 2008; Wuchty, Jones, & Uzzi, 2007).

The production and consumption of boundary-spanning ideas can also raise well-known challenges (Azoulay, Zivin, & Manso, 2011; Collins, 1998; Einstein, 1949; Fleming, 2001; Henderson & Clark, 1990; Schilling & Green, 2011). If, as Einstein (1949) believed, individual scientists inevitably become narrower in their expertise as the body of scientific knowledge expands, then reaching effectively across boundaries may be increasingly challenging (Jones, 2009), especially given the difficulty of searching unfamiliar domains (Fleming, 2001; Schilling & Green, 2011). Moreover, novel ideas can be difficult to absorb (Henderson & Clark, 1990) and communicate, leading scientists to intentionally display conventionality. In his Principia, Newton presented his laws of gravitation using accepted geometry rather than his newly developed calculus, despite the latter’s importance in developing his insights (Whiteside, 1970). Similarly, Darwin devoted the first part of On the Origin of Species to conventional, well-accepted knowledge of the selective breeding of dogs, cattle, and birds. Given these tendencies, the balance between extending science with atypical combinations of knowledge while maintaining advantages of conventional domain-level thinking is critical to the link between innovativeness and impact. However, little is known about the composition of this balance or how scientists can achieve it. In this paper, our analysis of 17.9 million papers spanning all scientific fields suggests that science follows a nearly universal pattern: The highest-impact science is primarily grounded in exceptionally conventional combinations of prior work yet simultaneously features an intrusion of unusual combinations. Papers of this type were twice as likely to be highly cited works. Notably, novel combinations of prior work are rare, yet teams are 37.7 % more likely than solo authors to insert novel combinations into familiar knowledge domains.

Data and Methods

Data

We examined 17.9 million scientific publications across 15,613 journals, constituting all research articles indexed in the Thomson Reuters Web of Science (WOS) database that were published between 1950 and 2000. According to each journal’s subject area, the Institute for Scientific Information (ISI, a.k.a. Web of Science) currently defines three fields and constituent subfields: science and engineering (171 subfields), social sciences (54 subfields), and arts and humanities (27 subfields), with coverage for research publications in science and engineering since 1945, social sciences since 1956, and arts and humanities since 1975. For each paper, the WOS records the citations, number of authors, and citation links to other papers in the database.

Methods

We measured the relative conventionality and novelty of the prior work that a paper combines by examining the papers referenced in a paper’s bibliography (Small, 1973; Stringer, Sales-Pardo, & Amaral, 2010). This section first provides an overview of our methodology, followed by an illustrative example and further details.

Overview

Our basic measurement question is to assess how common or novel any pairwise combination of prior work is. To determine this, we want to know both the (i) observed frequency of any given pairing of references in the WOS and (ii) the frequency of that pairing that would have occurred by chance. Comparing the observed frequency to the frequency expected by chance creates a normalized z-score measure for whether any given pairing appears novel or conventional.

To measure the observed frequency of any given pairing in the WOS, we took the following five steps:

(1) Took the references listed in a given paper’s bibliography.
(2) Considered all pairwise combinations of the papers referenced in the bibliography of the paper.
(3) For each pairwise combination, recorded the two journals that were paired.
(4) Repeated steps (1–3) for every paper in the WOS.
(5) Counted the aggregate, population-wide frequency of each journal pairing for all referenced pairs from a given publication year.

Figure 12.1 presents a stylized example for steps (1–3), showing for a given paper how pairs of references are counted from that paper’s reference list. The algorithm repeats this counting process for every article in the WOS and aggregates the counts for each given publication year.
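The counting in steps (1–5) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors’ implementation: the function name `count_journal_pairs`, the input layout (a publication year plus the journal of each known reference), and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def count_journal_pairs(papers):
    """Count co-referenced journal pairs per publication year.

    papers: iterable of (year, journals), where journals lists the journal
    of each known reference in one paper's bibliography (assumed layout).
    Returns {year: Counter({(journal_a, journal_b): count})}.
    """
    counts = {}
    for year, ref_journals in papers:
        year_counts = counts.setdefault(year, Counter())
        # Steps 2-3: every pairwise combination of references, mapped to journals.
        for a, b in combinations(ref_journals, 2):
            # Sort so that (A, B) and (B, A) count as the same unordered pair.
            year_counts[tuple(sorted((a, b)))] += 1
    return counts


# Hypothetical toy input: two papers published in 1995.
toy_papers = [
    (1995, ["Nature", "Monthly Weather Review", "Monthly Weather Review"]),
    (1995, ["Nature", "Journal of Climate"]),
]
pair_counts = count_journal_pairs(toy_papers)
```

In the toy input, the first paper cites Monthly Weather Review twice, so the pair Nature and Monthly Weather Review is counted twice for that paper, and a same-journal pair is counted once.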

Our method counts specific journal pairings, using different journals as a proxy for different areas of knowledge. Journal-level analysis is well positioned to distinguish domains of knowledge while having precedence in the literature for being relatively transparent, interpretable, and computationally feasible (Bollen et al., 2006; Itzkovitz et al., 2003; Small, 1973; Stringer et al., 2010; see Note 1).

Having determined the observed frequency of each journal pairing, we considered the frequency distribution for each journal pairing that would have occurred by chance. The null model randomly reassigns the citation links between papers. As further detailed below, the method uses a variation of the Markov Chain Monte Carlo (MCMC) algorithm to randomly switch citation links among all 17.9 million papers, producing a synthetic network with 302 million citations (edges), that is, with the same number of papers and citations as the observed network. Note that this method preserves the detailed paper-level structure of the global citation network: The number of citations to and from each paper is preserved backward and forward in time.

Using this approach, we created 10 synthetic instances of the entire WOS, each with its own set of randomized citation links. For each instance of the WOS, we then repeated steps (1–5) above, calculating the frequency of each co-referenced journal pair. Looking at all 10 randomized cases of the WOS, we generated a distribution of frequencies for each journal pair. We could then evaluate the z-score for each observed journal pair relative to what was expected by chance:

$$ z=\left(\mathit{obs}-\mathit{exp}\right)/\sigma $$

where obs is the observed frequency of the journal pair in the actual WOS, exp is the mean frequency of that journal pair, and σ is the standard deviation of that frequency, across the 10 randomized simulations of the paper-to-paper citation network.
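As a minimal sketch of this normalization, the z-score of a single journal pair can be computed from its observed count and its counts in the randomized instances. The function name `pair_z_score` and the numbers below are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def pair_z_score(observed, randomized_counts):
    """z = (obs - exp) / sigma for one journal pair.

    observed: frequency of the pair in the real citation network.
    randomized_counts: frequency of the same pair in each randomized
    instance (10 instances in the text).
    """
    exp = mean(randomized_counts)
    # Assumption: sample standard deviation; the estimator is not stated.
    sigma = stdev(randomized_counts)
    if sigma == 0:
        raise ValueError("no variation across randomized instances")
    return (observed - exp) / sigma


# Hypothetical counts of one journal pair in 10 randomized networks.
null_counts = [100, 104, 96, 102, 98, 101, 99, 103, 97, 100]
z_conventional = pair_z_score(130, null_counts)  # observed more often than chance
z_novel = pair_z_score(70, null_counts)          # observed less often than chance
```

A pair observed more often than in the randomized networks gets a positive z-score, and a pair observed less often gets a negative one.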

Finally, returning to categorizing a paper’s prior work regarding novelty and conventionality, we could then assign a z-score to each of the journal pairs in that paper’s reference list. Each paper thus had a distribution of journal pairings, where any given pairing could be more or less common compared to chance. To summarize the information in this distribution, we took two primary summary statistics:

(i) The median z-score for that paper.
(ii) The 10th percentile z-score for that paper.

The first measure is a summary statistic for the central tendency of the combinations of journals that a paper cites. The larger the median z-score for a paper, the more common the main mass of journal combinations in that paper compared to chance. The second measure is a summary statistic for the left tail of combinations of journals that a paper cites—journal pairings that are relatively unusual, compared to chance, among the set of journal pairings in that paper’s reference list.
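The two summary statistics can be sketched as follows. The helper names and the example z-scores are hypothetical, and the percentile uses linear interpolation between order statistics, which is an assumption: the text does not specify a percentile convention.

```python
from statistics import median


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation (assumed convention)."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize_paper(pair_z_scores):
    """Return (median z-score, 10th percentile z-score) for one paper."""
    return median(pair_z_scores), percentile(pair_z_scores, 10)


# Hypothetical z-scores of the journal pairs in one paper's reference list.
z_scores = [-20.0, -5.0, 10.0, 40.0, 60.0, 80.0, 90.0, 120.0, 150.0, 200.0, 300.0]
median_z, tail_z = summarize_paper(z_scores)
```

For this toy paper the median is well above zero while the 10th percentile is below zero, the pattern the text describes as a conventional main mass with an unusual left tail.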

Illustrative Example of Methodology and Further Detail

To illustrate these procedures, consider the following example, based on a single paper in the field of geography.

Step 1. Take the references in the bibliography of a given paper. Consider the paper, “The Tropical Cyclone Hazard Over the South China Sea 1970–1989: Annual Spatial and Temporal Characteristics,” which was published in Applied Geography in 1995. This paper has 22 references, of which 10 are known references (Fig. 12.2).
Step 2. Consider all pairwise combinations of the papers referenced in the bibliography of that paper. As can be seen in Fig. 12.2, pairwise paper combinations include, for example, (i) Deser et al. 1992 with Black 1990, (ii) Deser et al. 1992 with Thompson 1987, and (iii) Thompson 1987 with Black 1990. With 10 known references, we have 45 (i.e., 10 choose 2) pairwise paper combinations.
Step 3. Map the observed paper pairs into observed journal pairs. The 45 paper pairs are mapped into 45 journal pairs, where some journal pairs in this list appear multiple times. For example, Nature and Monthly Weather Review are paired twice.
Step 4. Repeat steps (1–3) for every paper in the WOS. The above steps, shown for a single article, are now repeated for every paper in the WOS. References to materials outside the WOS (for example, books) are not included.
Step 5. Count the frequency of each observed journal pairing for a given publication year, using the referenced works of every paper published that year in the WOS. Information from the sample paper above would be counted as part of the year 1995. Hence, we allow journal pair frequencies to vary over time.
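
As a quick check on the count in Step 2, the number of unordered pairs that can be formed from n references is

$$ \binom{n}{2}=\frac{n\left(n-1\right)}{2},\qquad \binom{10}{2}=\frac{10\cdot 9}{2}=45 $$

so the 10 known references of the example paper yield 45 paper pairs, and hence 45 journal pairs after the mapping in Step 3.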

Having completed steps (1–5) for the observed papers in the WOS, we repeated them for each synthetic instance of the WOS, as created by the null model. Comparing the observed frequency of journal pairs under the real WOS with the frequency distribution that appears across instances of the null model, we computed a z-score for each journal pair. Continuing our illustrative example, the observed frequency, expected frequency, and z-score for several journal pairings that appear in the paper, “The Tropical Cyclone Hazard Over the South China Sea 1970–1989: Annual Spatial and Temporal Characteristics,” are presented in Table 12.1. As Table 12.1 demonstrates (for a subsample of journal pairs), each published paper has a distribution of journal pairs, some of which are highly conventional (such as Monthly Weather Review–Monthly Weather Review) while others are unusual compared to chance (such as Nature–Monthly Weather Review). Figure 12.4a presents the distribution of z-scores for this illustrative paper and indicates the median z-score and the 10th percentile z-score in that paper’s distribution.

Table 12.1 Examples of journal pair frequencies for illustrative paper

Table 12.1 further shows the importance of normalizing the observed frequencies. For example, compare the pairings (1) Nature and Journal of Climate and (2) Nature and Monthly Weather Review. Both have similar observed co-citation frequencies in the WOS: 2720 and 2226, respectively. However, compared to chance, the first pairing appears to have high conventionality while the second pairing seems to have high novelty.

Null Model Detail

The null model creates random synthetic instances of the WOS while incorporating realistic aspects of the data and its network structure. In particular, the null model incorporates two basic empirical facts about citation patterns:

Citation distributions are skewed. Some papers and journals are cited far more often than other papers and journals and consequently are referenced more frequently in bibliographies.
Citation counts are dynamic processes that vary by the journal (Stringer, Sales-Pardo, & Amaral, 2008), so that the rate at which papers accumulate citations is journal dependent.

Keeping these facts in mind, the null model preserves for each paper in the WOS the same number of references to past work, the same number of citations from subsequent papers, and the same distribution of these citations over time (Fig. 12.3, left panel and middle panel). The right panel of Fig. 12.3 shows the distributions of observed frequency and expected frequency of journal pairs for the example paper above.

Specifically, we used a variation of the Markov Chain Monte Carlo (MCMC) algorithm to construct randomized citation networks for all papers in the WOS database. The switching of endpoints of citation links was constrained to randomly chosen endpoints within the same class (Fig. 12.3), where the link classes are defined as having the same origin year and target year (Itzkovitz et al., 2003). One can think of each link class as a sub-graph of the global citation network, which can then be randomized in the usual way by performing Q*E switches, where E is the number of links in the subgraph. There is no proof for when the Markov Chain converges; however, it is suggested (Itzkovitz et al., 2003) to set Q at a safe value of 100. Since the citation network has 302 million edges, the scale of the computation is large, and we used a slightly less conservative value of Q = 2log(E) to reduce computational burden. As can be noted in the original paper on the MCMC switching algorithm (Itzkovitz et al., 2003), this value of Q is well within the region where correlations with the original network cannot be detected.
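The switching procedure can be sketched in Python on a toy network. This is a simplified illustration under stated assumptions, not the authors’ code: the function name `randomize_citations`, the data layout, and the toy network are hypothetical, the logarithm in Q = 2log(E) is taken to be the natural logarithm (the base is not given in the text), and rejecting swaps that would create duplicate links or self-citations is a common convention assumed here.

```python
import math
import random
from collections import defaultdict


def randomize_citations(edges, year, q=None, seed=0):
    """Randomize citation links by switching endpoints within link classes.

    edges: list of (citing, cited) pairs; year: dict paper -> publication year.
    A link class holds all links with the same (citing year, cited year).
    Each class with E links gets about Q * E attempted switches.
    """
    rng = random.Random(seed)
    classes = defaultdict(list)
    for citing, cited in edges:
        classes[(year[citing], year[cited])].append((citing, cited))

    randomized = []
    for links in classes.values():
        e = len(links)
        if e >= 2:
            present = set(links)
            # Assumption: natural logarithm in Q = 2 log(E).
            q_class = q if q is not None else 2 * math.log(e)
            for _ in range(math.ceil(q_class * e)):
                i, j = rng.sample(range(e), 2)
                (a, b), (c, d) = links[i], links[j]
                new1, new2 = (a, d), (c, b)  # swap the cited endpoints
                # Assumed convention: reject duplicates and self-citations.
                if new1 in present or new2 in present or a == d or c == b:
                    continue
                present -= {links[i], links[j]}
                present |= {new1, new2}
                links[i], links[j] = new1, new2
        randomized.extend(links)
    return randomized


# Hypothetical toy network: four citing papers, four cited papers.
toy_year = {"p1": 1995, "p2": 1995, "p3": 1995, "p4": 1996,
            "a": 1990, "b": 1990, "c": 1990, "d": 1990}
toy_edges = [("p1", "a"), ("p1", "b"), ("p2", "b"), ("p2", "c"),
             ("p3", "a"), ("p3", "c"), ("p4", "d")]
toy_randomized = randomize_citations(toy_edges, toy_year, seed=1)
```

Because each switch exchanges the cited endpoints of two links in the same class, every paper keeps its number of references and its number of received citations, and the year-by-year distribution of those citations is unchanged.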

Results

In this study, we examined 17.9 million research articles in the Web of Science (WOS) to see how prior work is combined. We present facts that inform (i) the extent to which scientific papers reference novel versus conventional combinations of prior work, (ii) the relative impact of papers based on the combinations they draw upon, and (iii) how (i) and (ii) are associated with collaboration.

We considered pairwise combinations of references in the bibliography of each paper (Small, 1973; Stringer et al., 2010). We counted the frequency of each co-citation pair across all papers published that year in the WOS and compared these observed frequencies to those expected by chance, using randomized citation networks. In the randomized citation networks, all citation links between all papers in the WOS were switched using a Monte Carlo algorithm. The switching algorithm preserves the total citation counts to and from each paper, and the distribution of these citation counts forward and backward in time, to ensure that a paper (or journal) with n citations in the observed network will have n citations in the randomized network. For both the observed and the randomized paper-to-paper citation networks, we aggregated counts of paper pairs into their respective journal pairs to focus on domain-level combinations (Itzkovitz et al., 2003; Stringer et al., 2008, 2010). In the data, there were over 122 million potential journal pairs created by the 15,613 journals indexed in the WOS.

Comparing the observed frequency with the frequency distribution created with the randomized citation networks, we generated a z-score for each journal pair. This normalized measure describes whether any given pair appeared novel or conventional. Z-scores above zero indicate pairs that appeared more often in the observed data than expected by chance, indicating relatively common or “conventional” pairings. Z-scores below zero indicate pairs that appear less often in the observed WOS than expected by chance, indicating relatively atypical or “novel” pairings. For example, in the year 1995, the pairing Nature and Journal of Climate had a high z-score (47.344) indicating a conventional pairing, while Nature paired with Monthly Weather Review had a negative z-score (−24.554) indicating a pairing more unusual than chance.

The above method assigns each paper a distribution of journal pair z-scores based on the paper’s reference list (Fig. 12.4a). To characterize a paper’s tendency to draw together conventional and novel combinations of prior work, we took two summary statistics. First, to characterize the central tendency of a paper’s combinations, we considered the paper’s median z-score. The median allowed us to characterize conventionality in the paper’s main mass of combinations. Second, we considered the paper’s 10th percentile z-score. The left tail allows us to characterize the paper’s more unusual journal combinations where novelty may reside.

We found that papers typically relied on very high degrees of conventionality. Figure 12.4b presents the distribution of papers’ median z-scores for the WOS in the indicated decades. Considering that a z-score below zero represents a journal pair that occurs less often than expected by chance, the analysis of median z-scores suggests very high degrees of conventionality. Half the papers had median z-scores exceeding 69.0 in the 1980s and 99.5 in the 1990s. Moreover, papers with a median z-score below zero were rare. In the 1980s only 3.54 % of papers had this feature, while in the 1990s the percentage fell to 2.67 %, indicating a persistent and prominent tendency for high conventionality.

Focusing on each paper’s left tail combinations, we found that even among the paper’s relatively unusual journal combinations, the majority of papers did not feature atypical journal pairs. Figure 12.4c shows that 40.8 % of the papers in the 1980s and 40.7 % in the 1990s had a 10th percentile z-score below zero. Overall, by these measures, science typically relies on highly conventional combinations and rarely incorporates journal pairs that are uncommon compared to chance.

Our next finding indicates a powerful relationship between combinations of prior work and ensuing impact. Figure 12.5 presents the probability of a “hit” paper conditional on the combination of its referenced journal pairs. Hit papers are operationalized as those in the upper 5th percentile of citations received across the whole dataset, as measured by total citations through 8 years after publication. The vertical axis shows the probability of a hit paper conditional on a 2 × 2 categorization indicating the paper’s (i) “median conventionality” (an indicator of whether the paper’s median z-score is in the upper or lower half of all median z-scores) and (ii) “tail novelty” (an indicator of whether the paper’s 10th percentile z-score is above or below zero).

Papers with “high median conventionality” and “high tail novelty” display a hit rate of 9.11 out of 100 papers, or nearly twice the background rate of 5 out of 100 papers. All other categories show significantly lower hit rates. Papers featuring high median conventionality but low tail novelty display hit rates of 5.82 out of 100 papers, while those featuring low median conventionality but high tail novelty display hit rates of 5.33 out of 100 papers. Finally, papers low on both dimensions have hit rates of just 2.05 out of 100.
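The 2 × 2 categorization and the conditional hit rates can be sketched as follows. The function names, the tie-handling at the cutoff, and the toy data are hypothetical choices made for illustration. The only definitions taken from the text are that high median conventionality means a median z-score in the upper half of all median z-scores, and that high tail novelty means a 10th percentile z-score below zero.

```python
from collections import defaultdict


def categorize(median_z, tail_z, median_cutoff):
    """Return (median conventionality, tail novelty), each 'high' or 'low'.

    median_cutoff: the median of all papers' median z-scores.
    Assumption: a paper exactly at the cutoff counts as 'low'.
    """
    conventionality = "high" if median_z > median_cutoff else "low"
    novelty = "high" if tail_z < 0 else "low"
    return conventionality, novelty


def hit_rates(papers, median_cutoff):
    """papers: iterable of (median_z, tail_z, is_hit); returns hit share per cell."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for median_z, tail_z, is_hit in papers:
        cell = categorize(median_z, tail_z, median_cutoff)
        totals[cell] += 1
        hits[cell] += int(is_hit)
    return {cell: hits[cell] / totals[cell] for cell in totals}


# Hypothetical toy sample of (median z, 10th percentile z, hit indicator).
toy_sample = [
    (120.0, -3.0, True), (110.0, -1.0, False), (130.0, 5.0, False),
    (20.0, -2.0, False), (10.0, 4.0, False), (15.0, 6.0, False),
]
toy_rates = hit_rates(toy_sample, median_cutoff=65.0)
```

Applied to the full data, the four cells of such a table would correspond to the four hit rates reported above (9.11, 5.82, 5.33, and 2.05 out of 100 papers).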

Further analyses suggest universality of these relationships for scientific work across time and fields. In Fig. 12.6, we show that the results hold (a) over five decades of data recorded in the WOS from 1950 to 2000 and (b) using the upper 1st or 10th percentiles of citation impact. In Fig. 12.7, we define the cutoff for high and low tail novelty at different percentiles of a paper’s z-score: The 1st, 5th, 20th, 30th, and 40th. Figure 12.7 shows that using the 1st, 5th, 10th, or 20th percentile captures significant positive associations between impact and tail novelty in the 1990s. Beyond the 30th percentile, the significant association between impact and tail novelty disappears. These patterns suggest that the concept of tail novelty is not sensitive to a single value and that beyond a precise focus on the 10th percentile the construct is related to impact so long as one continues to consider the left tail of the distribution.

Results by Subfields

The following analysis shows that the results presented in the main text for the whole of the WOS continue to appear quite broadly when examining patterns within individual subfields. By subfield, we present (1) the tendency for tail novelty and median conventionality, and (2) the relationship between novelty, conventionality, and hit papers. We examined all 243 subfields that appeared in the WOS over the 1990s.

To examine any field-specific relationships between novelty, conventionality, and hit papers, we calculated the subfield-specific probabilities of a “hit” by the four categories used in Fig. 12.5 and defined in the text. We then ranked these four categories in each subfield, where 1 indicates the highest probability of a hit, 2 indicates the second highest probability of a hit, and so on. Consistent with the main results, Table 12.2 shows that in 64.4 % of subfields, a paper’s likelihood of being a hit paper was greatest when combining prior work characterized by high tail novelty and high median conventionality. This category (Row 3 in Table 12.2) is ranked first or second in 86.3 % of subfields. Notably, to the extent that this category is not dominant within a subfield, the category featuring a more general shift toward novelty (Row 1 in Table 12.2) appears prominently, suggesting that tail novelty is an especially generic feature of the highest-impact papers. Conversely, the category featuring low tail novelty and low median conventionality (Row 4 in Table 12.2) ranks lowest in 70.4 % of subfields. Thus, novelty and conventionality are not opposing factors in the production of science; rather, papers with an injection of novelty into an otherwise exceptionally familiar mass of prior work are unusually likely to have high impact. Next, we focus on the effect of teams on novelty.

Table 12.2 Novelty, conventionality, and citation impact by field

Role of Teams in Production of Knowledge

Collaboration is often claimed to produce more novel combinations of ideas (Falk-Krzesinski et al., 2010; Fiore, 2008; Stokols et al., 2008; Uzzi & Spiro, 2005; Wuchty et al., 2007), but the extent to which teams incorporate novel combinations across the universe of fields is unknown. Team-authored papers were more likely to show atypical combinations than single or pair-authored papers. Figure 12.8a shows that the distribution of 10th percentile z-scores shifted significantly leftward as the number of authors increased (Kolmogorov-Smirnov [KS] tests indicate solo vs. pair p = 0.016, pair vs. team p = 0.001, team vs. solo p < 0.001). Papers written by one, two, or three or more authors showed high tail novelty in 36.1 %, 39.8 %, and 49.7 % of cases, respectively, indicating that papers with three or more authors showed an increased frequency of high tail novelty over the solo-author rate by 37.7 %.
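
The 37.7 % figure is a relative increase, which follows directly from the two reported shares of papers with high tail novelty:

$$ \frac{49.7}{36.1}-1\approx 0.377 $$

That is, the share for papers with three or more authors (49.7 %) is about 1.377 times the share for solo-authored papers (36.1 %).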

Teams were neither more nor less likely than single authors or pairs of authors to display high median conventionality. Figure 12.8b indicates no significant statistical difference in the median z-score distributions (KS tests indicate solo vs. pair p = 0.768, pair vs. team p = 0.417, team vs. solo p = 0.164). Teams thus achieve high tail novelty more often than solo authors, yet teams were not simply “more novel” but rather displayed a propensity to incorporate high tail novelty without giving up a central tendency for high conventionality.

Regression Methods

In our final analysis, we examined the interplay between citation, combination, and collaboration using regression methods (Fig. 12.9). Papers were binned into eleven equally sized categories of median conventionality. We used logistic regression to predict the probability of hit papers in the 1990s and ran these regressions in a flexible manner that avoided imposing functional forms on the data. In particular, we first divided papers into subsamples based on their median conventionality (11 categories, from least to greatest median conventionality, as defined in the main text) and the number of authors (3 categories, for solo authors, two-author pairs, and three or more authors). This created 33 distinct subsamples. We then ran a separate regression for each subsample.

For a given subsample, a regression takes the form

$$ \Pr\left(y_i\right)=f\left(\beta\,\mathit{Tail\_Novelty}_i+\sum_f \gamma_f\,\mathit{Field}_{fi}\right) $$

where $ y_i\in \left\{0,1\right\} $ is an indicator variable for a “hit” paper, and $ \mathit{Tail\_Novelty}_i\in \left\{0,1\right\} $ is an indicator variable for whether a paper’s 10th percentile z-score is below zero. The regression includes a full set of fixed effects for each of 243 subfields indexed by the WOS in the 1990s, where the indicator variables $ \mathit{Field}_{fi}\in \left\{0,1\right\} $ are equal to 1 if the paper i is in field f. The inclusion of these fixed effects accounts for any mean differences in hit probabilities and tail novelty across subfields. We further restricted the sample to papers with at least ten known references, which ensured that each paper in the sample had many pairwise combinations of prior work.
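
Because the text states that logistic regression was used, the link function f is presumably the standard logistic function. Under that assumption, the specification can be read in terms of odds:

$$ f(x)=\frac{1}{1+e^{-x}},\qquad \frac{\Pr\left(y_i=1\right)}{1-\Pr\left(y_i=1\right)}=\exp\left(\beta\,\mathit{Tail\_Novelty}_i+\sum_f \gamma_f\,\mathit{Field}_{fi}\right) $$

On this reading, $ e^{\beta} $ is the factor by which the odds of a hit differ between papers with and without tail novelty in the same subfield, so a positive β corresponds to a higher hit probability for papers with tail novelty.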

Figure 12.9 establishes a large positive relationship between tail novelty and hit papers, which appears independently in each of the 33 subsamples. The regressions further establish that the probability of hit papers increases with median conventionality, peaking at approximately the 85th percentile of median conventionality. These strong empirical regularities extend to alternative analyses. Figure 12.10 reconsiders these regressions defining hit papers to be in the top 1 % of citations received. The results for this higher threshold for a “hit” paper look extremely similar.

Novelty and Conventionality in Geography

In Fig. 12.11 we show the novel and conventional combinations in papers written in the field of geography. We do not observe any high degrees of conventionality when compared with fields like physics and economics. Figure 12.11a presents the distribution of papers’ median z-scores for papers published in geography in the indicated decades. Half the papers have median z-scores exceeding 16 in the 1980s and 1990s. Papers with a median z-score below zero are rare; only 4 % of papers displayed this feature in the 1980s and 1990s. Focusing on each paper’s left tail combinations, we found that 25 % of papers in the 1980s and 1990s have a 10th percentile z-score below 0 (Fig. 12.11b). Overall, by these measures, we observed that geography papers rarely draw atypical knowledge from prior works.

We investigated the role of team authorship in the production of knowledge in geography. Our findings support the previous observations in Fig. 12.8: Each team size shows similar distributions for median conventionality (Fig. 12.11c), and papers with team authors show greater novelty than solo-authored or pair-authored papers (Fig. 12.11d).

We plot the average hit citations by considering the top 5 % of highly cited papers written in the fields of geography, economics, and physics over time (Fig. 12.12). The average of hit citations for geography is significantly lower than that of economics and physics. For papers published between 1980 and 2000, half the papers in geography have median z-scores above 36. Thus papers in geography display less conventionality when compared to physics or economics, where half the papers have median z-scores above 145. Moreover, for geography, conventionality does not increase with time when compared with physics and economics (Fig. 12.13). This indicates that for geography, mixing novel and conventional combinations does not result in high impact work.

Discussion and Conclusion

There were three primary findings. First, high tail novelty papers had higher impact than low tail novelty papers, an impact advantage that occurred at any level of conventionality and regardless of authorship structure. Second, peak impact occurs in the 85–95th percentile of median conventionality, an exceptionally high level. This peak and its position appeared irrespective of tail novelty/no tail novelty or authorship structure. These generic features suggest fundamental underlying rules relating combinations of prior work to the highest impact science.

Finally, Fig. 12.9 indicates that for virtually all possible mixes of tail novelty and median conventionality, larger teams were associated with higher impact. Thus, while teams incorporated the highest impact mixes more frequently (Fig. 12.8), teams also tended to obtain higher impact for any particular mix (Fig. 12.9). Nonetheless, despite this advantage in citations across virtually all fields of science (Wuchty et al., 2007), even teams had low impact at low levels of median conventionality and tail novelty.

Our analysis of 17.9 million papers across all scientific fields suggests that the highest-impact science draws on primarily highly conventional combinations of prior work with an intrusion of combinations unlikely to have been joined before. These patterns suggest that novelty and conventionality are not factors in opposition; rather, papers that mix high tail novelty with high median conventionality have nearly twice the propensity to be unusually highly cited.

These findings have implications for theories about creativity and scientific progress. Combinations of existing material are centerpieces in theories of creativity, whether in the arts, the sciences, or commercial innovation (Becker, 1982; Collins, 1998; Guimera et al., 2005; Jones, 2009; Schilling & Phelps, 2007; Schumpeter, 1939; Usher, 1929/1998; Uzzi & Spiro, 2005; Weitzman, 1998). Across the sciences, the propensity for high impact work is sharply elevated when combinations of prior work are anchored in substantial conventionality while mixing in a left tail of combinations that are rarely seen together. In part, this pattern may reflect advantages to being within the mainstream of a research trajectory, where scientists are currently focused while being distinctive in one’s creativity. For example, as mentioned in the beginning of the chapter, Newton remained in the mainstream of traditional geometry and at the same time remained creative while communicating the laws of gravitation in Principia. Combinations of prior work also relate to “burden of knowledge” theory, which emphasizes the growing knowledge demands upon scientists (Einstein, 1949; Fleming, 2001; Jones, 2009). New articles indexed by the WOS now exceed 1.4 million per year across 251 fields, encouraging specialization and challenging scientists’ capacity to comprehend new thinking across domains. The finding that teams preserve high conventionality yet introduce tail novelty suggests that teams help meet the challenge of the burden of knowledge by balancing domain-level depth with a capacity for atypical combinations.

Our methodology considered paper and journal pairings but can be applied at the level of disciplines, papers, or topics within papers, allowing the examination of combinations of prior work at different resolutions in future studies of creativity and scientific impact. Beyond science, links between novelty and conventionality in successful innovation also appear. E-books retain page-flipping graphics to remind the reader of physical books, and blue jeans were designed with a familiar watch pocket to look like conventional trousers. From this viewpoint, the balance between extending technology with atypical combinations of prior ideas and embedding them in conventional knowledge frames may be critical to human progress in many domains.

Future research questions also arise from our findings. Science is dynamic, with research areas shifting and new fields arising. Although we find that the regularities relating novelty, conventionality, and impact persist across time and fields, how research trajectories shift and how new fields are born are questions that measures of novelty and convention may valuably inform.

At root, our work suggests that creativity in science appears to be a nearly universal phenomenon of two extremes. At one extreme is conventionality, and at the other is novelty. Curiously, notable advances in science appear most closely linked not with efforts along one boundary or the other but with efforts that reach toward both frontiers.

Notes

1.
Other operationalizations might consider lower resolution pairings using the ISI’s 252 subfield categories, text-based combinations, or conceptualizations for measuring novelty beyond combinatorial pairs (Rosenkopf & McGrath, 2011).

References

Azoulay, P., Zivin, J. G., & Manso, G. (2011). Incentives and creativity: Evidence from the academic life sciences. The RAND Journal of Economics, 42, 527–554. doi:10.1111/j.1756-2171.2011.00140.x
Becker, H. S. (1982). Art Worlds. Berkeley: University of California Press.
Bollen, J., Rodriguez, M. A., & van de Sompel, H. (2006). Journal status. Scientometrics, 69, 669–687. doi:10.1007/s11192-006-0176-z
Collins, R. (1998). The sociology of philosophies: A global theory of intellectual change. Cambridge, MA: Harvard University Press.
Einstein, A. (1949). The world as I see it. Secaucus: Citadel Press.
Evans, J., & Foster, J. (2011). Metaknowledge. Science, 331, 721–725. doi:10.1126/science.1201765
Falk-Krzesinski, H. J., Börner, K., Contractor, N., Fiore, S. M., Hall, K. L., Keyton, J., & Uzzi, B. (2010). Advancing the science of team science. CTS: Clinical and Translational Science Journal, 3, 263–266. doi:10.1111/j.1752-8062.2010.00223.x
Fiore, S. M. (2008). Interdisciplinarity as teamwork: How the science of teams can inform team science. Small Group Research, 39, 251–277. doi:10.1177/1046496408317797
Fleming, L. (2001). Recombinant uncertainty in technological search. Management Science, 47, 117–132. doi:10.1287/mnsc.47.1.117.10671
Guimera, R., Uzzi, B., Spiro, J., & Nunes Amaral, L. A. (2005). Team assembly mechanisms determine collaboration network structure and team performance. Science, 308, 697–702. doi:10.1126/science.1106340
Henderson, R. M., & Clark, K. B. (1990). Architectural innovation: The reconfiguration of existing technologies and the failure of established firms. Administrative Science Quarterly, 35, 9–30. doi:10.2307/2393549
Itzkovitz, S., Milo, R., Kashtan, N., Newman, M. E. J., & Alon, U. (2003). Subgraphs in random networks. Physical Review E, 68. doi:10.1103/PhysRevE.68.026127
Jones, B. (2009). The burden of knowledge and the death of the renaissance man: Is innovation getting harder? The Review of Economic Studies, 76, 283–317. doi:10.1111/j.1467-937X.2008.00531.x
Jones, B., Wuchty, S., & Uzzi, B. (2008). Multi-university research teams: Shifting impact, geography, and stratification in science. Science, 322, 1259–1262. doi:10.1126/science.1158357
National Academy of Sciences, National Academy of Engineering, & Institute of Medicine of the National Academies (2004). Facilitating interdisciplinary research. Washington, DC: National Academies Press.
Rosenkopf, L., & McGrath, P. (2011). Advancing the conceptualization and operationalization of novelty in organizational research. Organization Science, 22, 1297–1311. doi:10.1287/orsc.1100.0637
Schilling, M. A., & Phelps, C. C. (2007). Interfirm collaboration networks: The impact of small world connectivity on firm innovation. Management Science, 53, 1113–1126.
Schilling, M., & Green, E. (2011). Recombinant search and breakthrough idea generation: An analysis of high impact papers in the social sciences. Research Policy, 40, 1321–1331. doi:10.1016/j.respol.2011.06.009
Schumpeter, J. (1939). Business Cycles. New York: McGraw-Hill.
Small, H. (1973). Co-citation in scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24, 265–269. doi:10.1002/asi.4630240406
Stokols, D., Hall, K. L., Taylor, B. K., & Moser, R. P. (2008). The science of team science: Overview of the field and introduction to the supplement. American Journal of Preventive Medicine, 35, 77–89. doi:10.1016/j.amepre.2008.05.002
Stringer, M. J., Sales-Pardo, M., & Amaral, L. A. (2008). Effectiveness of journal ranking schemes as a tool for locating information. PLOS ONE, 3. doi:10.1371/journal.pone.00016
Stringer, M. J., Sales-Pardo, M., & Amaral, L. A. (2010). Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal. Journal of the American Society for Information Science and Technology, 61, 1377–1385. doi:10.1002/asi.21335
Usher, A. P. (1998). A History of mechanical invention. Cambridge, MA: Harvard University Press. (Original work published 1929)
Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013a). Atypical combinations and scientific impact. Science, 342, 468–472. doi:10.1126/science.1240474
Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013b). Supplementary materials for atypical combinations and scientific impact. Science, 342. Retrieved from www.sciencemag.org/content/342/6157/468/suppl/DC1
Uzzi, B., & Spiro, J. (2005). Collaboration and creativity: The small world problem. American Journal of Sociology, 111, 447–504. doi:10.1086/432782
Weitzman, M. L. (1998). Recombinant growth. The Quarterly Journal of Economics, 113, 331–360. doi:10.1162/003355398555595
Whiteside, D. T. (1970). The mathematical principles underlying Newton’s principia mathematica. Journal for the History of Astronomy, 1, 116–138. doi:10.1177/002182867000100203
Wuchty, S., Jones, B., & Uzzi, B. (2007). The increasing dominance of teams in the production of knowledge. Science, 316, 1036–1039. doi:10.1126/science.1136099

Acknowledgements

Sponsored by the Northwestern University Institute on Complex Systems (NICO), by the Army Research Laboratory under Cooperative Agreement Number W911NF-09-2-0053 and DARPA BAA-11-64, Social Media in Strategic Communication. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government.

Author information

Authors and Affiliations

Indian Institute of Management, Udaipur, Rajasthan, India
Satyam Mukherjee
Kellogg School of Management, Northwestern University, 2001 Sheridan Road, 60208, Evanston, IL, USA
Brian Uzzi
Department of Management and Strategy, Kellogg School of Management, Northwestern University, 2001 Sheridan Road, Jacobs Center 371, 60208-2001, Evanston/Chicago, IL, USA
Benjamin F. Jones
Northwestern Institute on Complex Systems, Northwestern University, 600 Foster, 60208, Evanston, IL, USA
Michael Stringer

Corresponding author

Correspondence to Brian Uzzi.

Editor information

Editors and Affiliations

Department of Geography, University of Heidelberg, Heidelberg, Germany
Johannes Glückler
Department of Sociology, Institut d’Études Politiques de Paris (SPC) and Centre for the Sociology of Organizations (CNRS), Paris, France
Emmanuel Lazega
Department of Geography, University of Heidelberg, Heidelberg, Germany
Ingmar Hammer

Rights and permissions

This chapter is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, duplication, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the work’s Creative Commons license, unless indicated otherwise in the credit line; if such material is not included in the work’s Creative Commons license and the respective action is not permitted by statutory regulation, users will need to obtain permission from the license holder to duplicate, adapt or reproduce the material.

About this chapter

Cite this chapter

Mukherjee, S., Uzzi, B., Jones, B.F., Stringer, M. (2017). How Atypical Combinations of Scientific Ideas Are Related to Impact: The General Case and the Case of the Field of Geography. In: Glückler, J., Lazega, E., Hammer, I. (eds) Knowledge and Networks. Knowledge and Space, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-319-45023-0_12

DOI: https://doi.org/10.1007/978-3-319-45023-0_12
Published: 06 January 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45022-3
Online ISBN: 978-3-319-45023-0
eBook Packages: Social Sciences, Social Sciences (R0)
