Introduction

During the COVID-19 pandemic, the world has faced three main crises: the medical crisis, the social crisis, and the economic crisis (Nicola et al., 2020; “syndemic”, Ryan, 2021). Dealing with these crises has required bundling the strengths of scientists from very different scientific disciplines (Moradian et al., 2021). Following Rafols and Meyer (2010), we define interdisciplinary research as research targeting scientific problems by integrating concepts, theories, techniques, instruments, and data from different scientific disciplines. Interdisciplinary collaborations drive breakthrough ideas and innovations (Jung et al., 2021; Schilling & Green, 2011) that were urgently needed in the COVID-19 pandemic (e.g., drugs and vaccines). Despite numerous benefits, interdisciplinary collaborations can be subject to many cognitive and institutional challenges (Leahey et al., 2017a), are highly risky in terms of the desired outcome, can take many years to become successful, and are particularly difficult to manage (Blackwell et al., 2009). Furthermore, due to multiple organizational and cultural barriers and disciplinary orientations, the willingness and ability to engage in interdisciplinary collaborations are only weakly developed among many actors (Amey & Brown, 2006; Kirby et al., 2019). At the same time, interdisciplinary work was made more difficult by lockdowns, canceled conferences, and working from home (Kastenhofer et al., 2023). Moreover, researchers emphasize the need for international collaboration to tackle COVID-19 (Mohamed et al., 2020). Even though multiple scholars have investigated the rapid development of a scientific response to the COVID-19 crisis based on bibliometric analyses (Haghani & Bliemer, 2020; Colavizza et al., 2020; Bonilla-Aldana et al., 2020; Joshua & Sivaprakasam, 2020), their explorative focus did not elaborate on the factors that induce or hinder interdisciplinary collaborations in COVID-19 research.
Little is known about the aspect of interdisciplinarity and, in particular, whether the pandemic has increased the degree of interdisciplinarity in scientific collaborations. This is an important issue regarding the guidance for policymakers’ decisions on the institutional environment for coping with similar crises in the future. In our study, we aim to focus on the actors and their characteristics in scientific networks to understand them as decision-makers for collaborations with distant disciplines.

Furthermore, other important research topics may have been displaced by the rapidly growing research field “COVID-19” (Woo Baidal et al., 2020; Watermeyer et al., 2021). Besides such a cannibalization of topics, research on other established topics may also have adapted to the new conditions of the pandemic. Topic-adjusting effects of the COVID-19 pandemic on scientific research have not yet been investigated. Adding knowledge about these potentially unintended effects may provide important insights for science management and science policy in future urgent global crises and in the course of future mission-oriented research policies.

We go beyond the findings by Zhao et al. (2022) and Coccia (2021) and investigate to what extent COVID-19 has had an impact on how research is conducted in terms of scientific collaborations and interwovenness of scientific topics. In particular, our research questions are:

  • Did the emergency of the COVID-19 pandemic lead researchers to focus on COVID-19 topics and thereby reduce their efforts on other established scientific topics? How does the scientific response to COVID-19 compare to the responses to other medical emergencies, such as the Zika virus, Ebola, and SARS?

  • Did research on COVID-19 increase the degree of interdisciplinarity, as measured by the established representation learning method DeepWalk applied to the network of bibliographic metadata?

  • What social network characteristics of the authors of COVID-19-relevant research are important for increasing the degree of interdisciplinarity?

Answering these questions requires a multi-faceted analysis. We first analyze the impact of COVID-19 crisis-driven research and compare it with other disease outbreaks. Then we conduct a similarity analysis of concepts (MeSH descriptors; see footnote 1) grounded on concept embeddings that are obtained with machine learning from graph-structured bibliographic data, including publications, concept annotations, and citations. Based on the cosine distances between the embeddings of the MeSH descriptors, we measure the (dis)similarity between scientific fields, which is the foundation for our interdisciplinarity indicator. We use this interdisciplinarity indicator to run a fine-grained analysis of the degree of interdisciplinarity of COVID-19 research in 2020. Moreover, we conduct a reference analysis of SARS-CoV-2/COVID-19-relevant scientific articles and preprints. Finally, we investigate the scientific social networks of scholars who have contributed to COVID-19 research to explore whether an author’s betweenness centrality or an author’s degree centrality facilitates the degree of interdisciplinarity at the researcher level.

As datasets, we rely on the ZB MED Knowledge Environment (ZB MED KE; see footnote 2) and a dedicated COVID-19 dataset called COVID-19++ (see Galke et al., 2021, for details on the dataset), which contains scholarly publications and preprints on COVID-19, enriched with works that are cited by them. The datasets are described in more detail in Appendix A.

In summary, the key contributions of this work are:

  • We run four analyses to investigate the dynamics of the scientific response to the COVID-19 outbreak from different angles: research volume, concept similarity, references (i.e., citations), and author networks.

  • We show how an incrementally trained DeepWalk model can be utilized to reflect the similarity of concept embeddings and how this can be used to conduct an analysis of research dynamics regarding the interdisciplinary work.

  • Our analyses of COVID-19 research dynamics reveal that, while the focus shifted noticeably to the topic of COVID-19 at the expense of other topics, there is no increase in interdisciplinarity.

This article is structured as follows. The literature on COVID-19 research dynamics is reviewed in Sect. 2. Section 3 analyses the shift in research volume by topic and compares this shift with other medical emergencies through a descriptive analysis. Section 4 analyses to what extent research topics change over time and how this change affects their proximity through a concept similarity analysis, carried out with representation learning in citation networks with bibliographic metadata. Subsequently, Sect. 5 investigates which insights researchers draw from other fields by analysing the relationship between COVID-19 research articles and their references. Lastly, Sect. 6 investigates the factors for interdisciplinarity (quantified as diversity in topics of publications) of authors through a social network analysis, before we discuss our results in Sect. 7.

State of research and literature gap

In the following, we describe how COVID-19 has affected academic research from multiple angles. During the pandemic, many research articles on COVID-19 have been published. Raynaud et al. (2021) show that the increase in the volume of COVID-19 publications was accompanied by a decrease in the share of non-COVID-19 publications. Dinis-Oliveira (2020) labels this a “paperdemic” and reminds us of the principles of scientific integrity. Within this large volume of research, we compile papers that take a scientometric perspective on how the pandemic has impacted research and what new trends can be inferred from bibliographic data.

Impact of the pandemic on research: a scientometric perspective

From the first months of the COVID-19 pandemic, scholars have attempted to understand the knowledge system behind SARS-CoV-2. For instance, Benjamens et al. (2021) analyze geographic sources of publications in medical journals. Although the authors note that the distribution is unexpected, they argue that medical journals should solicit articles from underrepresented countries for a more representative discourse, given the global impact of the crisis. Similar to our research interest, Coccia (2021) is concerned with research dynamics from a scientometric perspective. Using Scopus publication metadata, the author compares scientific output that is driven by a crisis (“crisis-driven”; e.g., COVID-19, MERS, ZIKA, HIV, H1N1) with that addressing a continuous problem (“problem-driven”; e.g., lung cancer, chronic obstructive pulmonary disease (COPD)). Crisis-driven research has a high publication rate, whereas the publication volume of problem-driven research is rather linear. While crisis-driven research is concentrated in a few scientific fields (3–5), a few journals, a few funding agents, and a few countries, which are associated with about 80% of the publication volume, problem-driven research is more widely diffused (Coccia, 2021). Coccia finds another difference with regard to open-access publications: crisis-driven research is more often published open access (78%), whereas only about 40% of the publications on lung cancer and COPD are open access.

Other authors investigate the degree of interdisciplinarity of COVID-19-related research and conclude that COVID-19-related publications are dominated by a few disciplines such as Medicine; Immunology and Microbiology; and Biochemistry, Genetics and Molecular Biology (Zhao et al., 2022). The authors found that before 2020 the number of disciplines involved in coronavirus research rose, while the balance and diversity of disciplines showed a falling trend from 1990 to 2019. They conclude that coronavirus research has not become more interdisciplinary, but it has been catalyzed by the COVID-19 pandemic (Zhao et al., 2022). Fassin (2021) also takes a bibliometric approach and investigates the impact of the COVID-19 publication explosion on bibliometric indicators. After providing an analysis of \(h_a\) and h indices, the paper stresses the salience of the topic, the magnitude of the problem, and its urgency as the key drivers for citations. Aviv-Reuven and Rosenfeld (2021) investigate changes in publishing patterns, particularly in the volume and in the average time to acceptance of both preprints and journal articles. The authors find a sharp increase in publication volume and a significantly shorter mean time to acceptance for COVID-19 papers.

Liu et al. (2022) argue that the pandemic is a catalyst of scientific novelty. The authors applied a BioBERT model, pretrained on 29 million PubMed articles, to 98,981 COVID-19-related papers and found that scientific novelty has increased, along with an increase in first-time collaborations, whereas international collaboration experienced a sudden decrease. Harsanto (2020) analyzes the bibliographic metadata of publications in the first three months after COVID-19 hit. The study finds that, on average, 150.33 documents were published every month, with medicine as the main field of research (62.04%). Shan et al. (2020) argue that it is becoming increasingly difficult to publish non-COVID-19 research.

Need for interdisciplinary collaboration during the COVID-19 crisis?

Due to the effective results, such as the development and successful testing of the coronavirus vaccine (Li et al., 2021), we expect to find interdisciplinary research dynamics in the scientific activities on COVID-19, since the multidimensional challenge of the pandemic requires collaborative efforts of multiple disciplines. The advantages of an interdisciplinary approach in the fight against the pandemic are seen by around 88% of respondents to a Delphi study conducted between February and April 2022: “The incorporation of research paradigms from diverse disciplines has greater potential to end COVID-19 as a public health threat [...]” (Lazarus et al., 2022). In particular, the need for cooperation between the social sciences and medicine is repeatedly emphasized (Corsi & Ryan, 2022). We adopt the definition of interdisciplinary research of the US National Academy of Sciences, Engineering, and Medicine, which defines it as follows: “(...) a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice.” (Sciences et al., 2005). The definition thus also covers interdisciplinarity between subgroups of disciplines (“bodies of knowledge”).

A new combination of technological knowledge and components may result in breakthroughs (Fleming, 2001), and a wide breadth of knowledge enables the development of radical innovations (Xu, 2015). A high level of interdisciplinarity can drive the creativity and innovativeness of researchers in academia (Leahey et al., 2017b), increase publication impact (Yegros-Yegros et al., 2015), and enhance the R&D performance of public research centers (Jung et al., 2021). A high degree of interdisciplinarity of researchers is positively related to their intention and ability to launch a company and license their patents to private firms (D’Este et al., 2019). Moreover, a solution-driven R&D approach also forces organizations to employ professionals from different science and technology fields who must work with their colleagues from other disciplines and, therefore, go beyond the boundaries of their disciplines. In this regard, the distance between knowledge fields in interdisciplinary R&D is especially large during the early phases of radical innovation, for instance, when developing a vaccine against the SARS-CoV-2 virus, and when new techno-scientific fields emerge (Meyer-Krahmer, 1997). Therefore, interdisciplinary collaborations could have facilitated the development of urgently required vaccines and medicines against the novel coronavirus. Nevertheless, studies covering the period before the COVID-19 pandemic show that there is little openness among scientists to implementing interdisciplinary approaches. This stands in contrast to the broad-based funding opportunities offered by third-party funders who want to support interdisciplinary approaches between STEM (science, technology, engineering and math) and non-STEM research (Uddin et al., 2021).

Research volume analysis

Methods

Comparison to other medical emergencies. To get a general sense of the publication volume related to the COVID-19 pandemic, we compared it to the occurrence of MeSH terms related to other, similar medical emergencies in the Medline database (He & Chen, 2018; Xiang et al., 2020). To this end, we retrieved the numbers of publications annotated with the specific MeSH terms related to the AIDS crisis (since the early 1980s), the SARS pandemic (2002–2003), the swine flu pandemic (2009), the MERS epidemic (2014), the Ebola epidemic (2014–2016), the Zika virus outbreak (2015–2016), as well as Marburgvirus with its last major outbreak in 2017 (Table 1 and Fig. 1).

Concept Drift Magnitude. To quantify the change in the concept annotations over time, we evaluate the drift magnitude of the concept distribution in our datasets. We follow Webb et al. (2018) and use the total variation distance. Formally, it is defined as:

$$\begin{aligned} \sigma _{t-1,t} = \frac{1}{2} \sum _{y \in {\text {dom}}(Y_{t-1} \cup Y_t)} \vert P_{t-1}(y) - P_{t}(y) \vert \end{aligned}$$

where \(t-1\) and t may be time points or time intervals that define the respective observed distributions \(P_{t-1}\) and \(P_t\) over concepts y. Figure 2 shows the results for the class drift magnitude on our COVID-19++ dataset. We use the total variation distance in two ways. First, we compare the drift magnitude of the data over time. Second, we quantify the topic shift between primary COVID-19 articles and referenced literature.
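
The drift magnitude defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy annotations are hypothetical, and we assume here that \(P_t(y)\) is the relative frequency of concept y among all concept annotations in period t, which the text does not spell out.

```python
from collections import Counter


def concept_distribution(annotations):
    """Relative frequency of each concept over all annotations of one period.

    `annotations` is a list of per-document concept lists (assumed input format).
    """
    counts = Counter(concept for doc in annotations for concept in doc)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def drift_magnitude(p_prev, p_curr):
    """Total variation distance, summed over the union of both supports."""
    support = set(p_prev) | set(p_curr)
    return 0.5 * sum(abs(p_prev.get(y, 0.0) - p_curr.get(y, 0.0)) for y in support)


# Toy annotations (made up for illustration, not taken from the datasets).
docs_prev = [["Heart Diseases"], ["Heart Diseases", "Zika Virus"]]
docs_curr = [["Betacoronavirus"], ["Betacoronavirus", "Heart Diseases"]]

sigma = drift_magnitude(concept_distribution(docs_prev),
                        concept_distribution(docs_curr))
print(round(sigma, 3))
```

A value of 0 means both periods have the same concept distribution; a value of 1 means the two periods share no concepts at all.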

We use the ZB MED Knowledge Environment (ZB MED KE) and a derived COVID-19-specific subset of the ZB MED KE, called COVID-19++ (Galke et al., 2021). The ZB MED KE is a heterogeneous database environment containing over 70 million records from over 70 specialized databases in the life sciences. We describe the datasets in more detail in Appendix A.

Table 1 MeSH terms for the comparison of the novel coronavirus to other medical emergencies’ publication volume

To get a broader impression of the overall publication volume, we also included the publication volume for the following MeSH terms of broader topics (Fig. 1, upper panel): Betacoronavirus (D000073640), Heart Diseases (D006331), Cardiovascular Diseases (D002318), Myocardial Ischemia (D017202), and Respiratory Tract Diseases (D012140).

Results

Fig. 1

Trending MeSH terms within health crisis publications in Medline. The upper panel shows the broad context of publication topics and their publication volume; the lower panel concentrates on individual pathogens with a more precise focus

Although no specific MeSH term existed in 2020 for either the SARS-CoV-2 pathogen or the disease it causes, we can assess the increasing number of publications on the topic from the development of the term Betacoronavirus (D000073640) (Fig. 1). Moreover, the steep rise of the curve for SARS, which is related to SARS-CoV-2, can be taken as an indication. Notably, the term COVID-19 (D000086382) was only introduced in the 2021 edition of MeSH. Fig. 1 shows that research on respiratory tract diseases also increased drastically in 2020, reflecting that the novel coronavirus predominantly attacks the human respiratory tract (Ali & Alharbi, 2020).

We took the growing number of publications over the last decades into account by normalizing the data as a percentage of the actual volume of research output. The outbreak of HIV in the early 1980s and the research on it seem to overshadow the research on the other virus-related topics analyzed. However, we can still observe a decline of research on HIV that appears to be related to other outbreaks, especially SARS, Influenza H1N1 Virus, and Zika Virus.

Similarly, in 2002–2003, when the SARS pandemic occurred, the data in Fig. 1 (lower panel) show an increase in research activities on the topic. Analogously, the swine flu pandemic of 2009, which was mainly caused by the H1N1 virus, resulted in an increase in the amount of research on the virus. The Ebola outbreak in 2014 induced a comparatively slight rise in scholarly publications on the topic. It may also have overlapped with the Zika Virus epidemic in 2015–2016, which led to many publications annotated with the respective MeSH term. All these curves appear to decline at the moment when Betacoronavirus and SARS research enter the scene again in 2020. Both terms are related to SARS-CoV-2 research, which lacked a specific term at that time. Compared to the other curves related to medical crises, the rise of scientific output on the MeSH terms Betacoronavirus and SARS during 2020 is much stronger. If we combine the curves of Betacoronavirus and SARS (keeping in mind that some publications are annotated with both MeSH terms, so the numbers cannot simply be added), the research dynamic can be compared to that of the HIV crisis rather than to other crises of recent decades. Notably, we observe a decline in the share of publications on other topics such as heart diseases and cardiovascular diseases.

Figure 2 shows the drift magnitude in the datasets, i.e., the change in topics over time, for the COVID-19++ subset and a 15M subset of the ZB MED Knowledge Environment (ZB MED KE) (filters: English language and availability of MeSH descriptors). Both COVID-19++ (N=122,886) and ZB MED KE (N=15,196,066) exhibit an unusually strong drift in 2020. The drift in the COVID-19++ dataset is consistently larger than the drift in ZB MED KE.

Fig. 2

Changes in the distribution of MeSH terms. Class drift magnitude between 2000 and 2020 for the 15M documents from the ZB MED KE compared to the COVID-19-specific dataset COVID-19++. COVID-19++ is a subset of ZB MED KE that contains articles on COVID-19, along with preprints, and the publications cited by those

To compare the publication process with that of other virus outbreaks, namely the SARS outbreak in 2002–2003, the influenza pandemic caused by the H1N1 virus in 2009–2010, and most recently the Zika virus epidemic in 2015–2016, we extracted publications from these periods of acute health crisis from the ZB MED Knowledge Environment. We supplemented the referenced publications using the open CrossRef database (see footnote 3). For identified references, we extracted MeSH descriptors if available. Afterwards, we mapped the MeSH descriptors to tree number level 1 of the MeSH hierarchy. We then used the MeSH descriptors to calculate the shift in topics between primary and referenced publications. The results are shown in Table 2. COVID-19 shows a similar topic shift to Zika Virus, whereas SARS and H1N1 have a higher topic shift.
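
The mapping to level 1 and the resulting topic shift can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the text: tree numbers are given as strings whose first letter is the level-1 domain (as in `C01...` for Diseases), each document counts a domain at most once, and the shift is the total variation distance between the two domain distributions. The function names and the tree numbers below are made up for illustration.

```python
from collections import Counter


def level1_domains(tree_numbers):
    """Map tree numbers such as 'C01.100' to their level-1 domain letter ('C')."""
    return {tn[0] for tn in tree_numbers if tn}


def domain_distribution(documents):
    """Share of each level-1 domain, counting a domain once per document."""
    counts = Counter()
    for tree_numbers in documents:
        counts.update(level1_domains(tree_numbers))
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()} if total else {}


def topic_shift(primary, referenced):
    """Total variation distance between primary and referenced distributions."""
    p, q = domain_distribution(primary), domain_distribution(referenced)
    return 0.5 * sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in set(p) | set(q))


# Illustrative tree numbers (hypothetical, not real MeSH entries).
primary_docs = [["C01.100", "B04.200"], ["C01.300"]]
referenced_docs = [["B04.200"], ["D12.400", "B04.500"]]

shift = topic_shift(primary_docs, referenced_docs)
print(round(shift, 3))
```

Counting descriptors instead of documents would be an equally plausible reading; only the body of `domain_distribution` would change.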

Moreover, we observe significant differences in the distribution of publications and references over the MeSH thesaurus level 1 domains between the MeSH descriptors Respiratory Tract Diseases and Cardiovascular Diseases, Heart Diseases, and Myocardial Ischemia (Fig. 3, upper panel). These differences are salient for the MeSH domains M (Named Groups) and N (Health Care). Further, the analysis reveals considerable differences in the distribution of publications and references over the MeSH thesaurus level 1 domains between the MeSH descriptors SARS and Influenza A Virus, H1N1 as well as Zika Virus for the MeSH domains A (Anatomy), B (Organisms), C (Diseases), D (Chemicals and Drugs), G (Phenomena and Processes), and N (Health Care) (Fig. 3, lower panel).

Table 2 Shift in topics between primary and referenced literature: a control group of MeSH terms related to virus disease outbreaks
Fig. 3

Shift of topics in primary publications and referenced publications for control groups

Consequently, our analyses show that COVID-19 research grew unprecedentedly fast in the analysis period of 2020 and has at least partially displaced research efforts on other viruses and diseases.

Concept similarity analysis

Methods

Dynamic concept embedding space learned by DeepWalk

Counting how often a concept is assigned to a document together with another concept is a common proxy for the similarity of these two concepts (Tijssen, 1992). However, such co-occurrence counts are insufficient to derive a notion of similarity between concepts. That is because two concepts may be very similar, which is not reflected in co-occurrence counts unless the articles are consistently annotated with all similar concepts. Moreover, the annotation of publications with concepts from a controlled vocabulary rather reflects the diversity of aspects of that particular publication. Thus, co-occurrence does not imply that the concepts are similar (Galke et al., 2018).

A crucial example is that the MeSH descriptor COVID-19 only appeared on 7 July 2020 (see footnote 4). Thus, research articles published before July had to resort to other MeSH terms, such as Betacoronavirus or Pneumonia, Viral. With our machine learning model, the similarity of such concepts (or between any other concepts) is learned and taken into account. We pursue an embedding approach (Perozzi et al., 2014; Grover & Leskovec, 2016) that exploits bibliographic metadata such as a publication’s authors, its journal, and its annotation with concepts from a controlled vocabulary.

In the following, we outline how concept embeddings can be learned from the graph of bibliographic metadata and how these embeddings facilitate a measure of concept similarity over time. For training this model, we use the COVID-19++ dataset (Galke et al., 2021). The dataset is described in more detail in Appendix A.

To transform bibliographic metadata into graphs, we consider each publication and each concept to be a node in the graph. A publication node is connected with a concept node via an edge when the publication is annotated with that particular concept. Similarly, we insert a node for each author and each journal into the graph, linked to a publication by an edge if the author has written the publication or if the publication is published in the journal, respectively. As a result, we obtain a graph \({\mathcal {G}}= (V, E)\) constructed from bibliographic metadata. The set of nodes V comprises document nodes \({\mathbb {P}}\) (original publications and their references), concept nodes \({\mathbb {C}}\), author nodes \({\mathbb {A}}\), and journal nodes \({\mathbb {J}}\). The undirected edges E represent authorship relations between authors and publications, annotation relations between publications and concepts, and venue relations between publications and journals. We use this graph as a basis for learning concept embeddings. Based on these concept embeddings, we compute the similarity between concepts.
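
As a rough illustration of this construction, the following Python sketch builds an undirected adjacency map from publication records. The record layout (keys `id`, `concepts`, `authors`, `journal`) and the function name are assumptions made for this example, not the format of the actual dataset. Citation links are left out because the paragraph above does not specify how they enter the edge set.

```python
from collections import defaultdict


def build_metadata_graph(publications):
    """Return an undirected adjacency map over typed nodes (kind, identifier)."""
    graph = defaultdict(set)

    def add_edge(u, v):
        graph[u].add(v)
        graph[v].add(u)

    for pub in publications:
        p = ("publication", pub["id"])
        graph[p]  # make sure the publication node exists even without metadata
        for concept in pub.get("concepts", []):
            add_edge(p, ("concept", concept))      # annotation relation
        for author in pub.get("authors", []):
            add_edge(p, ("author", author))        # authorship relation
        if pub.get("journal"):
            add_edge(p, ("journal", pub["journal"]))  # venue relation
    return dict(graph)


# Hypothetical records for illustration only.
records = [
    {"id": "p1", "concepts": ["Betacoronavirus"], "authors": ["A. Author"],
     "journal": "Journal X"},
    {"id": "p2", "concepts": ["Betacoronavirus", "Pandemics"]},
]
graph = build_metadata_graph(records)
print(sorted(graph[("concept", "Betacoronavirus")]))
```

Typing each node as a `(kind, identifier)` pair keeps an author and a journal with the same name apart, and makes it easy to select only the concept nodes later on.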

To learn the embedding function f, which maps each node to its embedding vector, we use the DeepWalk algorithm (Perozzi et al., 2014; Grover & Leskovec, 2016). DeepWalk is an established approach for learning node embeddings in a graph. The DeepWalk algorithm randomly initializes a node embedding \({\varvec{X}}\in \mathbb {R}^{n \times s}\), where \(n = \vert V \vert \) is the number of nodes and s is the embedding size. To update the embeddings of the nodes, DeepWalk samples random walks \((u_1,u_2,\cdots ,u_l)\) through the graph such that \(\forall i \in \{1,2,\ldots ,l\} : u_i \in V\) and \(\forall i \in \{1,2,\ldots ,l-1\}: (u_i, u_{i+1}) \in E\), where l is the length of the random walk. For each node on each random walk, its current embedding is used to predict neighboring nodes within a fixed context window size \(n_\text {cwnd}\) along the random walk. The respective node embedding is updated according to the error signal from the prediction. DeepWalk is connected to the Word2vec model from natural language processing (Mikolov et al., 2013) by considering random walks as sentences.
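
The sampling step can be sketched as follows. This is a simplified illustration, not the training code used for the analysis: it only shows how uniform random walks and the (center, context) pairs within a window are generated, and it omits the skip-gram update itself. The function names and the toy graph are hypothetical.

```python
import random


def random_walk(graph, start, length, rng):
    """Sample a uniform random walk with up to `length` nodes, beginning at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = sorted(graph[walk[-1]])  # sorted for reproducibility
        if not neighbors:
            break  # isolated node: the walk ends early
        walk.append(rng.choice(neighbors))
    return walk


def context_pairs(walk, window):
    """(center, context) pairs for all nodes at most `window` steps apart."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy graph: two publications sharing one concept (illustrative only).
toy_graph = {
    "pub1": ["concept1", "author1"],
    "pub2": ["concept1"],
    "concept1": ["pub1", "pub2"],
    "author1": ["pub1"],
}
walk = random_walk(toy_graph, "concept1", length=6, rng=random.Random(0))
pairs = context_pairs(walk, window=2)
print(walk)
print(len(pairs))
```

In a Word2vec-style setup, each walk plays the role of a sentence and each pair contributes one prediction task for the center node's embedding.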

In the present work, we intend to learn node embeddings not only on a single graph but also on multiple graph snapshots over time. DeepWalk relies on random initialization. Therefore, different runs may converge to different node embeddings. This is a critical factor as the node embedding at time \(t+1\) should be based on the embedding of the node at time t. In the original work on DeepWalk (Perozzi et al., 2014), the authors have suggested the possibility of online learning. In our snapshot-based setting, we run multiple epochs for time step t before advancing to time step \(t+1\). To stabilize training across time steps, we reuse the node embeddings from the previous time step to initialize the embeddings for the next time step. This is similar to the method for deep autoencoders proposed by Goyal et al. (2018, 2020). Thus, we expect the embeddings to remain similar but still account for the changes in the graph.
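
The warm start across snapshots can be sketched like this. The sketch covers only the initialization: vectors of known nodes are copied from the previous time step, and nodes that are new in the current snapshot receive a fresh random vector. The initialization range, the function name, and the snapshot contents are assumptions for illustration, and the training epochs are left out.

```python
import random


def init_embeddings(nodes, size, rng, previous=None):
    """Initialize embeddings for one snapshot, reusing vectors from the previous one."""
    previous = previous or {}
    embeddings = {}
    for node in nodes:
        if node in previous:
            # Known node: continue from its embedding at the previous time step.
            embeddings[node] = list(previous[node])
        else:
            # New node: small random vector (assumed Word2vec-style range).
            embeddings[node] = [rng.uniform(-0.5, 0.5) / size for _ in range(size)]
    return embeddings


rng = random.Random(0)
snapshots = [["c1", "c2"], ["c1", "c2", "c3"]]  # hypothetical node sets per month

embeddings = None
history = []
for nodes in snapshots:
    embeddings = init_embeddings(nodes, size=4, rng=rng, previous=embeddings)
    # Several training epochs on this snapshot would run here before advancing.
    history.append(embeddings)

print(history[0]["c1"] == history[1]["c1"])
```

Because no training happens in this sketch, the vector of `c1` is identical in both snapshots; with training, it would move only as far as the new data pushes it.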

Regarding hyperparameter choices, we use a walk length of \(l=80\), a context window size of \(n_\text {cwnd} = 10\), and an embedding size of \(s=128\). We sample 10 random walks per concept per month. We optimize the embedding with a learning rate of 0.025 that linearly decays to zero at each time step. The resulting embedding space is visualized by t-distributed Stochastic Neighbor Embedding (t-SNE) (van der Maaten & Hinton, 2008) in Fig. 4.

Fig. 4

A t-SNE visualization of the final embedding space produced by the DeepWalk algorithm after incrementally training on all available data. The 30 nearest neighbors (in original high-dimensional space) of the concept Pneumonia, Viral are highlighted. The three closest neighbors are Betacoronavirus, Pandemics, and Coronavirus Infections

To compare similarities, we first normalize the resulting concept embedding \({\varvec{X}}_{\mid \textrm{concepts}}\), whose i-th row holds the concept embedding \(f(c_i)\), in two steps. First, we center the embedding by subtracting the centroid. Then, we normalize the columns to unit \(L_2\)-norm. As a result, we have centered and normalized vectors for each concept \(c^{(t)}\) in each time step t.
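
The two normalization steps can be written out as a small Python sketch. It reads the description literally, i.e., it scales each column (embedding dimension) to unit \(L_2\)-norm after centering; if a per-concept (row-wise) normalization is intended instead, the norm would be taken over each row. The function name and the toy matrix are hypothetical.

```python
import math


def center_and_normalize(matrix):
    """Center rows on the centroid, then scale each column to unit L2 norm."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Step 1: subtract the centroid (column-wise mean) from every row.
    centroid = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    centered = [[row[j] - centroid[j] for j in range(n_cols)] for row in matrix]
    # Step 2: divide each column by its L2 norm (constant columns stay zero).
    norms = [math.sqrt(sum(row[j] ** 2 for row in centered)) for j in range(n_cols)]
    return [
        [row[j] / norms[j] if norms[j] > 0 else 0.0 for j in range(n_cols)]
        for row in centered
    ]


# Toy embedding with three concepts and two dimensions (illustrative values).
toy_embedding = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
normalized = center_and_normalize(toy_embedding)
for row in normalized:
    print([round(x, 3) for x in row])
```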

To compare a pair of concepts a and b at time t, we compute the cosine distance between their normalized and centered concept vectors \(a^{(t)}\) and \(b^{(t)}\):

$$\begin{aligned} d_{\cos }^{(t)}(a,b) = 1 - \frac{a^{(t)} \cdot b^{(t)}}{\vert \vert a^{(t)}\vert \vert \cdot \vert \vert b^{(t)} \vert \vert } \end{aligned}$$

The cosine distance ranges between 0 and 2, and a lower cosine distance indicates a higher degree of similarity between the two concepts.

We are interested in the relationship and dynamics of whole research fields rather than merely two concepts. For that purpose, we assume that a research field can be described by a set of concepts. We define a metric to compare two sets of concepts, given the learned concept embedding. This is connected to the average linkage metric in the context of clustering methods (Manning, Raghavan, & Schütze, 2008, Chapter 17: Hierarchical Clustering). We consider two concept sets \({\mathbb {A}}\) and \({\mathbb {B}}\). Then we compute the pairwise cosine distances between all the concepts of \({\mathbb {A}}\) and all the concepts of \({\mathbb {B}}\). We regard the mean of these pairwise distances as the distance between the two concept sets.

$$\begin{aligned} \overline{d_{\cos }^{(t)}({\mathbb {A}}, {\mathbb {B}})} = \frac{1}{\vert {\mathbb {A}}\vert \cdot \vert {\mathbb {B}}\vert } \sum _{a \in {\mathbb {A}}, b \in {\mathbb {B}}} d_{\cos }^{(t)}(a, b) \end{aligned}$$
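
Both distance definitions translate directly into code. The sketch below assumes that the embedding is available as a mapping from concept names to non-zero vectors; the function names and the toy vectors are illustrative only.

```python
import math
from itertools import product


def cosine_distance(a, b):
    """1 minus the cosine similarity of two non-zero vectors; lies in [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def concept_set_distance(embedding, set_a, set_b):
    """Mean pairwise cosine distance between two concept sets."""
    distances = [
        cosine_distance(embedding[a], embedding[b])
        for a, b in product(set_a, set_b)
    ]
    return sum(distances) / len(distances)


# Toy two-dimensional embedding (made-up vectors).
toy = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [-1.0, 0.0]}
print(concept_set_distance(toy, ["x"], ["y", "z"]))  # mean of 1.0 and 2.0
```

Orthogonal vectors give a distance of 1 and opposite vectors a distance of 2, which is why the set distance in this toy case is 1.5.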

We say that the concept sets \({\mathbb {A}}\) and \({\mathbb {B}}\) converge, i.e., become increasingly similar, during the time span \([t_\textrm{start}, t_\textrm{end}]\) if the distances are monotonically non-increasing, i.e., if the following condition holds, where \(\Delta t\) denotes one time step:

$$\begin{aligned} \forall t \in [t_\text {start}, t_\text {end}) : \overline{d_{\cos }^{(t+\Delta t)}({\mathbb {A}}, {\mathbb {B}})} \le \overline{d_{\cos }^{(t)}({\mathbb {A}}, {\mathbb {B}})} \end{aligned}$$
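
Given the sequence of set distances at consecutive time steps, the convergence condition reduces to a monotonicity check. A minimal sketch, with a hypothetical function name and made-up distance values:

```python
def is_converging(distances):
    """True if no distance is larger than the one at the preceding time step."""
    return all(later <= earlier for earlier, later in zip(distances, distances[1:]))


# Made-up monthly distances between two concept sets.
print(is_converging([0.90, 0.85, 0.85, 0.70]))  # never increases
print(is_converging([0.90, 0.95, 0.70]))        # increases once
```

Note that the condition is strict about every single step: one small increase between two time steps is enough for the check to fail, even if the overall trend is downward.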

With this model, we have generated temporal embeddings on our newly introduced COVID-19++ dataset, which we use for the analyses of research dynamics described next.

Identification of relevant scientific fields on the basis of MeSH terms

Further, we conducted a specific analysis of the MeSH terms (descriptors) in our dataset. After excluding particularly common and COVID-19-related MeSH descriptors, we identified the most frequent MeSH descriptors over the analysis period (January–December 2020) and allocated them according to the second level of the MeSH thesaurus hierarchy (which we define as MeSH subdomains). Afterward, we determined the 12 MeSH subdomains with the largest numbers of the most frequent MeSH descriptors. These MeSH subdomains, and especially their MeSH descriptors, represent the most important medical and social spheres related to or affected by the COVID-19 pandemic.

For instance, the MeSH descriptors Pneumonia, Pneumonia-Viral, Severe Acute Respiratory Syndrome, Asymptomatic Infections of the subdomain Infections (C01) reflect diseases caused by the novel coronavirus (Huang et al., 2020). The efforts of scientists to develop efficient medicines and vaccines are displayed in the subdomain Amino Acids, Peptides, and Proteins (D12), since its most frequent MeSH descriptors Antibodies, Protein S, C-Reactive Protein, Cytokines are indicative of the scientists’ considerations regarding the triggers of the immune responses to the novel coronavirus and describe different approaches to immune response mechanisms at medical and biochemical levels (Watanabe et al., 2020; Rogers et al., 2020; Kayser et al., 2020).

The MeSH descriptors Diagnosis, Viral Load, Tomography, Disk Diffusion Antimicrobial Tests, Respiratory Rate, and Diagnostic Imaging of the subdomain Diagnosis (E01) and the MeSH descriptors Hospitalization, Therapeutics, Respiration-Artificial, Intubation, and Vaccination of the subdomain Therapeutics (E02) reveal that scientists intensively investigated and employed the diagnostic instruments and therapy approaches for the novel COVID-19 disease that were used at that time in hospitals and health care (Huang et al., 2020).

Furthermore, the scope of the pandemic is reflected in the MeSH descriptors Incidence, Prevalence, Sensitivity and Specificity, Mortality, In Vitro Techniques, and Polymerase Chain Reaction of the subdomain Investigative Techniques (E05), which cover methods for identifying SARS-CoV-2 RNA and for analyzing the dissemination and transmission of SARS-CoV-2 infections.

The frequent usage of the MeSH descriptors Epidemics, Quarantine, Infection Control, and Disinfection of the subdomain Environment and Public Health (N06) underlines that SARS-CoV-2 is highly infectious and can spread very quickly without appropriate infection control measures (Ali & Alharbi, 2020).

Moreover, the frequent occurrence of the MeSH descriptors Anxiety, Stress-Psychological, Social Distance, Depression, and Fear of the subdomain Behavior and Behavior Mechanisms (F01) reveals that COVID-19 also triggers severe psychological issues (Cullen et al., 2020).

The geographical dissemination of COVID-19 can be followed by the Geographic Location (Z01) MeSH descriptors China, United States, Italy, United Kingdom, and India as countries that were most affected by or related to the coronavirus topic at the beginning of the pandemic (Ali & Alharbi, 2020).

To sum up, our results show that MeSH descriptors are a reliable instrument to reflect multiple facets of an emerging scientific topic in the medical field as well as societal issues related to this topic. Even though the prior literature uses a keyword co-occurrence analysis to cluster publications on coronavirus and COVID-19  (Haghani & Bliemer, 2020; Radanliev et al., 2020), our proposed method allows for a more fine-grained and validated analysis of the topics, since we use the established MeSH thesaurus for the identification of topics.

Results

We used all MeSH descriptors that we identified in our dataset and that were allocated to the twelve relevant subdomains mentioned above. We analyzed whether the topics represented by MeSH descriptors approach each other (converge) to unveil interdisciplinary collaborations. Since scientific convergence can be revealed by the usage of research results, methods, and techniques of one discipline by another (Curran & Leker, 2009, 2011), we constructed bibliographic graphs from the metadata of scientific publications, containing information on publications, authors, and MeSH descriptors as annotated concepts, and applied machine-learning algorithms to these graphs to train a model for the evaluation of the degree of interdisciplinary collaborations. Furthermore, we applied our measurement based on topic (concept) similarity, as described in Sect. 4.1. Fig. 5 shows the degree of convergence (growing interdisciplinarity) or divergence (growing specialization) of the relevant MeSH subdomains, measured as the normalized mean cosine distance.

The curves in Fig. 5 should be interpreted in the following way: a decrease of cosine distance over time implies that two relevant fields approach each other and start to converge, i.e., the degree of research interdisciplinarity increases; an increase in cosine distance over time means that the scientific fields depart from each other and suggests a scientific divergence pattern, i.e., specialization within scientific fields grows. In general, the lower a cosine distance is, the nearer two scientific fields are to each other.

Fig. 5

Cosine distance difference in DeepWalk embedding space between MeSH subdomain Infections C01 (a) and Investigative Techniques E05 (b) and other identified relevant MeSH subdomains in 2020. The values of the cosine distance curves of C01 and E05 were set to zero as baselines

To calculate the normalized cosine distance differences to other subdomains, we subtracted the normalized cosine distance of the MeSH descriptors of the MeSH subdomain Infections (C01) from the normalized cosine distance values of the other subdomains. In this way, the normalized mean cosine distance of the MeSH descriptors of the MeSH subdomain Infections (C01) was set to zero as a baseline.
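The baseline subtraction amounts to a month-by-month difference of curves. The following sketch uses made-up distance values for three months and hypothetical names; it only illustrates the arithmetic, not the values shown in Fig. 5.

```python
def difference_to_baseline(curves, baseline_key):
    """Subtract the baseline curve from every curve, month by month."""
    baseline = curves[baseline_key]
    return {
        name: [value - base for value, base in zip(values, baseline)]
        for name, values in curves.items()
    }

# Hypothetical normalized mean cosine distances for three consecutive months
curves = {
    "C01": [0.40, 0.35, 0.38],
    "N06": [0.55, 0.45, 0.60],
    "N02": [0.60, 0.58, 0.50],
}
relative = difference_to_baseline(curves, "C01")
```

By construction the baseline curve becomes zero everywhere, and a negative value for another subdomain means that its curve lies below the baseline in that month.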

As Fig. 5a shows, in the first month of 2020 the contextual distance between the MeSH subdomain Infections (C01) and the other selected subdomains decreased, implying that the research on COVID-19 in other subdomains was primarily associated with the infections and diseases that the coronavirus triggered. As such, the degree of research interdisciplinarity increased. From February to March 2020, the other subdomains delved deeper into their own research topics; thus, the cosine distance between Infections (C01) and other MeSH subdomains increased, and specialization prevailed. From March 2020, the cosine distance to the MeSH subdomain Infections (C01) again slightly decreased and remained at a stable level for most of the other analyzed subdomains, except the subdomain Environment and Public Health (N06). The subdomain N06, which contains MeSH descriptors such as Epidemics, Quarantine, Infection Control, Ventilation, and Hand Disinfection, had the lowest cosine distance to the subdomain Infections (C01) in February 2020, revealing the urgent necessity of preventive measures against the spread of COVID-19. Thus, we observe an increase in research interdisciplinarity between subdomains C01 and N06. However, after February 2020 the subdomain N06 evolved more independently of the subdomain Infections (C01). As such, the cosine distance between these two subdomains gradually increased, reaching the highest distance difference in September 2020. Meanwhile, the contextual closeness and relatedness of the subdomains Infections (C01) and Health Care Facilities, Manpower, and Services (N02), containing the MeSH descriptors Hospitals, Intensive Care Units, and Triage, started to grow from March 2020: the cosine distance fell from March 2020 onward and reached the lowest distance between the two subdomains in December 2020. This interrelatedness of the two subdomains can be explained by the drastically increasing need for hospitalization and intensive care for COVID-19 patients. Surprisingly, the cosine distance between Infections (C01) and Amino Acids, Peptides, and Proteins (D12) rose incrementally from the middle of the year until December 2020.

Figure 5b reveals that the cosine distance between the subdomain Investigative Techniques (E05) and other subdomains was substantially lower than for the Infections (C01) subdomain. In the first month of 2020, the cosine distance between Investigative Techniques (E05) and other subdomains decreased, implying rising interdisciplinarity. In March 2020, it rose for the subdomains Social Sciences (I01) and Behavior and Behavior Mechanisms (F01), underlining a weak relationship between E05 MeSH concepts and the MeSH concepts Socioeconomic Factors, Demography, Culture, Policy, and Economics of I01. Similarly, the cosine distance between E05 MeSH concepts and the concepts of the subdomains Health Care Facilities, Manpower, and Services (N02) and Amino Acids, Peptides, and Proteins (D12) increased from February to March 2020, revealing the disciplines’ specialization. Afterward, the cosine distance fell for all pairs between E05 and other subdomains. The decreasing cosine distance means that despite the differences between disciplines and subdomains, the contextual interrelatedness between the MeSH concepts of E05 and those of other subdomains increased. Thus, the degree of interdisciplinarity slightly increased. Strikingly, the cosine distance difference between E05 and the Information Science (L01) subdomain even dropped below zero (the baseline), highlighting the higher importance of L01 MeSH concepts for the Investigative Techniques subdomain. The L01 subdomain contains, among others, the MeSH concepts Computer Simulation, Software, Social Media, and Information Storage and Retrieval. The low cosine distance difference between E05 and L01 implies an amplified impact of information science technologies on conventional and novel medical analysis methods. Hence, we observe a high degree of interdisciplinarity between subdomains E05 and L01 in April and May 2020.

In total, our analysis of scientific topics related to COVID-19 (defined as MeSH subdomains) shows that most of the scientific subdomains and fields rely on their own research insights, techniques, and methods. This finding suggests that interdisciplinarity is less prevalent in COVID-19 research. However, the fine-grained analysis of pairwise topics shows that researchers produce knowledge on particular topics by integrating insights from both topics. We performed a similar analysis for the MeSH descriptor pairs denoting vaccine candidates. Appendix C shows the results.

Reference analysis

Methods

The analysis of research interdisciplinarity is usually performed by consulting the citation data of scientific publications (Yegros-Yegros et al., 2015; Rafols & Meyer, 2010). Hence, we conduct an analysis of research flows embedded in SARS-CoV-2/COVID-19 publications and preprints to evaluate the level of interdisciplinarity in the knowledge flows.

To determine what knowledge has been used to conduct research on COVID-19, we compared the annotations of publications on COVID-19 with the annotations of the papers cited in them. We examined the distribution of MeSH terms annotated to the primary journal publications and preprints on SARS-CoV-2 as well as to the referenced publications.

Specifically, we used the total variation distance to quantify the differences in the distributions of MeSH descriptors over the 15 MeSH thesaurus domains for the COVID19++ dataset between primary publications and the respective references. Section 3.1 contains details of the total variation distance calculation.
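For illustration, the total variation distance between two discrete distributions is half the sum of the absolute differences of their probabilities. The sketch below uses this standard definition with made-up annotation counts; the function names are hypothetical, and Section 3.1 remains the authoritative description of our calculation.

```python
def to_distribution(counts):
    """Turn raw annotation counts per MeSH domain into relative shares."""
    total = sum(counts.values())
    return {domain: count / total for domain, count in counts.items()}

def total_variation_distance(p, q):
    """Half the sum of absolute differences over all domains in p or q."""
    domains = set(p) | set(q)
    return 0.5 * sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in domains)

# Hypothetical annotation counts per MeSH domain
primary_counts = {"B": 30, "C": 50, "D": 20}
reference_counts = {"B": 15, "C": 50, "D": 20, "M": 15}

tvd = total_variation_distance(to_distribution(primary_counts),
                               to_distribution(reference_counts))
```

The distance is 0 for identical distributions and 1 for distributions that share no domain, which makes values from differently sized datasets comparable.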

Additionally, we contrasted the differences with random samples of publications and their references, which we call control datasets. For each control dataset, we randomly sampled 25,000 papers that appeared in 2020 and had MeSH terms from the ZB MED Knowledge Environment database. Afterward, we extracted the referenced works from the CrossRef database. We repeated this procedure 100 times.

Results

The results show that SARS-CoV-2 publications use knowledge from slightly differently oriented life science fields. In Fig. 6, the assigned MeSH terms of primary articles published on SARS-CoV-2 and their references are allocated to MeSH categories. On average, a lower share of MeSH terms is allocated to the referenced publications than to the primary articles and preprints. One reason for this could be that much of the literature cited is not in the field of life sciences. These articles from other fields may not carry MeSH terms. This explanation is further examined in Appendix B by the subject area of the journals in which the articles had been published.

A slight shift in topics can be observed in Fig. 6. The shift is indicated by the categories of MeSH terms assigned to primary publications and preprints compared to the categories of MeSH terms assigned to the cited papers. This way, we identify the subject areas on which the SARS-CoV-2 papers build. Both primary articles and referenced articles share a focus on topics that are annotated with MeSH concepts from the categories Organism (B), Diseases (C), Chemicals and Drugs (D), Analytical, Diagnostic, and Therapeutic Techniques, and Equipment (E), and Phenomena and Process (G). Both publication types have low numbers of annotations from the categories Disciplines and Occupations (H), Anthropology, Education, Sociology, and Social Phenomena (I), as well as Technology, Industry, and Agriculture (J).

For the categories Phenomena and Process (G) and Health Care (N), a more or less equal distribution can be observed. In contrast, in the categories Analytical, Diagnostic, and Therapeutic Techniques, and Equipment (E), Psychiatry and Psychology (F), Named Groups (M), and Publication Characteristics (V), we observe a surplus of MeSH terms annotated to references over primary literature. A clear preponderance of MeSH terms annotated to primary papers can be observed in the categories Organism (B) and Chemicals and Drugs (D).

The biggest differences between the subject fields of primary publications and references were found in the categories Organism (B; primary literature: 29.11%, references: 14.57%), Chemicals and Drugs (D; primary literature: 22.66%, references: 17.71%), Analytical, Diagnostic, and Therapeutic Techniques, and Equipment (E; primary literature: 11.79%, references: 15.78%), Psychiatry and Psychology (F; primary literature: 0%, references: 3.28%), Named Groups (M; primary literature: 0.79%, references: 8.3%), and Publication Characteristics (V; primary literature: 0%, references: 2.54%). The high values for the categories Organism (B) and Chemicals and Drugs (D) reflect the intensive search for a vaccine in this phase of research and publishing. The knowledge used for this research comes mainly from these fields and, in addition, from the fields that deal with therapeutic techniques (E), as well as with certain population groups (M). With SARS-CoV-2, the named groups discussed here include obese people, the elderly, and men as particular risk groups. The named groups of people, like the content on psychology, are far more prevalent in the cited works than in the publications on SARS-CoV-2 itself. This shows that conditional interdisciplinary exploitation of knowledge is taking place here.

Thus, our results show definite discrepancies between the primary publications on SARS-CoV-2 and the references of these publications in particular MeSH categories. The surplus of cited works in topic areas A, E, F, M, and V shows that publications on SARS-CoV-2 exploit information from these fields. The knowledge is used to research COVID-19, mainly in Organism (B), and Chemicals and Drugs (D). This implies that scholars also applied research results from categories other than their research field, which is one of the signs of interdisciplinarity. The results of further analysis of primary publications and their references are presented in Appendix B.

Fig. 6

Shift of topics. Distribution of MeSH terms related to primary publications on COVID-19 and the cited work. For the legend of MeSH research fields, see Appendix A.3

Regarding the comparison with the control datasets, Fig. 7 shows that the differences in the distributions between primary publications and references of the 100 control groups are in the range of ± 3% in all MeSH domains. The respective differences between primary publications and references of the COVID-19 dataset have a range of ± 9%. The one-way two-tailed t-test we conducted confirms that the difference values of the COVID-19 dataset differ significantly from those of the control groups.

Fig. 7

Differences of distributions of MeSH descriptors over the 15 MeSH thesaurus domains between primary publications (citing) and references (cited) of the COVID-19 data and 100 control groups. The results of control groups are displayed as boxplots. COVID-19 difference is shown as a dot
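One way to picture the comparison with the control groups is a t statistic computed per MeSH domain. The sketch below is an assumption on our part about the test setup (a one-sample comparison of 100 control differences against the single COVID-19 difference); the control values are synthetic, and 1.984 is the approximate two-tailed 5% critical value of the t distribution with 99 degrees of freedom.

```python
import math
import statistics

def one_sample_t(control_values, observed):
    """t statistic for H0: the mean of the control values equals `observed`."""
    n = len(control_values)
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)  # sample standard deviation
    return (mean - observed) / (sd / math.sqrt(n))

# Synthetic differences (primary share minus reference share) for one domain
# in 100 control groups, all within +/- 0.02
control_differences = [((-1) ** i) * 0.01 * (i % 3) for i in range(100)]
covid_difference = 0.09  # hypothetical difference for the COVID-19 dataset

t_value = one_sample_t(control_differences, covid_difference)
CRITICAL_T_DF99 = 1.984  # approximate two-tailed 5% critical value, df = 99
is_significant = abs(t_value) > CRITICAL_T_DF99
```

Repeating this per domain corresponds to comparing each dot in Fig. 7 with its boxplot.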

Network analysis

Since the results presented so far revealed, to a limited extent, characteristics of scientific collaboration, or at least the citation of knowledge from other research fields, we aimed to determine specific factors that can affect the interdisciplinarity of COVID-19 research. Thus, we investigated whether and how social network characteristics affect the interdisciplinarity of authors of publications at the individual level.

Methods

In our dataset of COVID-19 journal articles and preprints, we were able to identify particular authors unambiguously with the help of their ORCIDs. In order to focus on the authors who rapidly responded to the COVID-19 crisis in their research, we included only those authors who had published at least two journal articles or preprints. Our dataset contained 6,283 authors in total. For these authors, we identified all their COVID-19 journal articles and preprints and their co-authors in these journal articles and preprints. This data enabled us to create a scientific social network for the focal authors. In our scientific social network, authors represent nodes and joint publications (journal articles or preprints) are edges.

Each edge implies knowledge exchange and information linkage between two authors (Aboelela et al., 2007). Given the information on the nodes’ connectedness over the available edges, we calculated the degree centrality and betweenness centrality for each author (node) according to Freeman (1979). Afterward, for each author of the scientific social network, we identified all MeSH descriptors of their respective publications. To measure an author’s level of interdisciplinarity, we calculated the mean cosine distance between all MeSH descriptors of all of the author’s COVID-19 publications and preprints. The level of interdisciplinarity of an author is our dependent variable. Degree centrality and betweenness centrality are independent variables. Then we calculated the average number of different MeSH domains per publication and per author to control for the diversity of the focal author’s research. For the estimated statistical model, we also determined the total number of COVID-19 publications of an author as well as the number of co-authors per publication at the author level as control variables. The resulting dataset contained 4,814 observations. Since the dataset contained publications with very many authors, whose contribution to the paper might be limited, we excluded observations with extreme values below the 1st percentile and above the 99th percentile for the dependent variable and all explanatory and control variables. Thus, the analyzed dataset contained 4,760 observations.
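The construction of the network and of two of the variables can be sketched as follows. This is an illustration, not our implementation: the author labels and embedding vectors are hypothetical, the degree centrality is normalized by n - 1 (Freeman's relative measure, which is an assumption here), and the interdisciplinarity score averages the cosine distance over all pairs of an author's descriptor vectors.

```python
import math
from itertools import combinations

def build_coauthor_graph(publications):
    """Map each author to the set of authors they share a publication with."""
    graph = {}
    for authors in publications:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degree_centrality(graph):
    """Number of direct co-authors divided by (n - 1); needs n > 1 nodes."""
    n = len(graph)
    return {author: len(peers) / (n - 1) for author, peers in graph.items()}

def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def author_interdisciplinarity(descriptor_vectors):
    """Mean pairwise cosine distance between an author's descriptor vectors."""
    pairs = list(combinations(descriptor_vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

# Hypothetical publications, each given as its list of authors
publications = [["A", "B", "C"], ["A", "D"]]
graph = build_coauthor_graph(publications)
centrality = degree_centrality(graph)

# Hypothetical embeddings of the MeSH descriptors of one author's publications
score = author_interdisciplinarity([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

In the toy network, author A is directly linked to every other author and therefore has the maximum degree centrality of 1.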

Estimations. Since our dependent variable is a continuous nonnegative variable, we used ordinary least squares (OLS) estimations. After specifying the model, we found that the assumption of homoscedasticity of the residuals was not satisfied according to the Breusch-Pagan test (Breusch & Pagan, 1979). Hence, we used the Huber-White sandwich estimator to obtain heteroscedasticity-consistent standard errors (Wooldridge, 2010).
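To make the idea of heteroscedasticity-consistent standard errors concrete, the following sketch estimates a regression with a single regressor and computes the White (HC0) standard error of the slope. It is a deliberately reduced illustration: our model contains several regressors, the data below are made up, and the HC0 variant (without small-sample correction) is an assumption, since the variant is not specified above.

```python
import math

def ols_with_robust_se(x, y):
    """Simple OLS y = a + b*x with the HC0 (White) standard error of b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Sandwich form for the slope: sum((x_i - x_bar)^2 * e_i^2) / sxx^2
    robust_var = sum((xi - x_bar) ** 2 * e ** 2
                     for xi, e in zip(x, residuals)) / sxx ** 2
    return slope, intercept, math.sqrt(robust_var)

# Hypothetical data: centrality-like regressor and interdisciplinarity score
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.52, 0.55, 0.53, 0.60, 0.58, 0.66]
slope, intercept, robust_se = ols_with_robust_se(x, y)
```

Unlike the classical formula, the robust variance weights each squared residual by the squared deviation of its regressor value, so it stays valid when the residual variance differs across observations.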

Descriptive statistics and correlations for the variables appear in Tables 3 and 4. Due to the high correlation between the variables degree centrality and number of co-authors per publication (0.88) (Table 4) as well as the high VIF values of these variables (over 6) (Allison, 1999), we excluded the number of co-authors per publication from the model estimation. Correlation values among the other variables are low. The VIF values for the other variables are under 1.2 and do not reveal any multicollinearity issues (Allison, 1999).
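As a plausibility check on these values, recall the standard definition of the variance inflation factor of regressor \(j\), where \(R_j^2\) denotes the coefficient of determination obtained by regressing regressor \(j\) on all other regressors (this notation is introduced here only for illustration). Because \(R_j^2\) is at least as large as the squared pairwise correlation with any single other regressor, the correlation of 0.88 alone implies a lower bound:

$$\begin{aligned} \textrm{VIF}_j = \frac{1}{1 - R_j^2} \ge \frac{1}{1 - 0.88^2} = \frac{1}{0.2256} \approx 4.43 \end{aligned}$$

This bound is consistent with the VIF values above 6, since the remaining regressors can only increase \(R_j^2\).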

Table 3 Descriptive statistics
Table 4 Correlation matrix

Results

The results of the OLS estimations are displayed in Table 5. The results reveal that the position of an author in the scientific social network of COVID-19 research indeed determines the level of the author’s interdisciplinarity. The betweenness centrality determines the potential control of knowledge flows and communication within the network (Freeman, 1979). Thus, authors with a central position can control the information that they obtain from different and possibly unconnected parts of the network. However, even though this information can provide valuable insights for the authors, our results (Table 5) suggest that these insights are difficult to internalize completely in the short term, since the coefficient of betweenness centrality is negative and significant \((\beta = -8.28\cdot 10^{-6}, p < 0.05)\) when holding the degree centrality constant. Thus, these authors cannot profit from their betweenness centrality position in terms of increasing the level of interdisciplinarity of their publications. On the other hand, the OLS estimation results show that degree centrality in the scientific social network is beneficial for authors if they want to increase their level of interdisciplinarity (Table 5), since the coefficient of degree centrality is positive and significant \((\beta = 1.72\cdot 10^{-3}, p < 0.001)\). A central position in the network with regard to degree centrality is an indicator of enhanced communication activity and direct interactions of the focal author with other authors in the network. Through efficient, fast, and direct linkages (edges) to their peers, the focal authors gain valuable knowledge from their network colleagues at a lower cost in terms of time and access to specific information. This knowledge and information might come from different research fields, and through the direct communication that the focal authors have with their peers, the authors can absorb this knowledge more easily even if it lies outside their subject area.

Further, a higher number of publications is disadvantageous for the authors’ level of interdisciplinarity (Table 5, \(\beta = -8.27 \times 10^{-3}, p < 0.001\)). Authors who publish many articles might be very specialized and more efficient exclusively in their subject. In contrast, a higher number of research fields, measured as the number of MeSH domains per author, raises the level of interdisciplinarity (Table 5, \(\beta = 2.38 \times 10^{-2}, p < 0.001\)).

Table 5 OLS results for the level of interdisciplinarity of authors

Discussion

We comprehensively analyzed the dynamics of the scientific response to SARS-CoV-2 and COVID-19 with a focus on research interdisciplinarity and topic adjustment effects. For this purpose, we investigated the total research volume on SARS-CoV-2 and COVID-19, MeSH thesaurus descriptors as research topics, MeSH thesaurus domains as well as authors’ networks with regard to research interdisciplinarity and the increasing focus on COVID-19 topics at the cost of activities in other established topics. We ascertained that the research on the novel coronavirus and COVID-19 rose drastically in the first year of the pandemic and partially displaced other research topics. However, we did not find a complete displacement of particular research topics by the research on COVID-19. Consequently, our study suggests that crisis-driven research, such as research on SARS-CoV-2 and COVID-19, occupies valuable research resources, which is in line with the finding by Coccia (2021), but does not completely suppress the other research topics that scientists usually work on in times without extreme emergencies.

In agreement with the study by Zhao et al. (2022), we found that COVID-19-related research was only to some extent interdisciplinary. The finding that interdisciplinarity did not increase during the first year of the COVID-19 pandemic therefore stands in contrast to the expectations and the benefits of cooperation across scientific boundaries declared by scientists in the large-scale Delphi study (Lazarus et al., 2022) or the smaller study by Kastenhofer et al. (2023). Similar to our results, Coccia (2021) concluded that crisis-driven research is less interdisciplinary. In pursuit of an explanation, we conducted an additional analysis (see Appendix B), which confirms that the pandemic-induced research is indeed only partially interdisciplinary because COVID-19-related research is dominated by only a few disciplines (Clinical Medicine, Immunology, Molecular Biology, and Genetics as well as Microbiology), in line with Zhao et al. (2022). Harsanto (2020) also concluded that Medicine was the prevailing scientific field in the first months of the pandemic.

Additionally, we examined the factors that facilitate interdisciplinary collaborations. Although attempts to analyze the research development on SARS-CoV-2 and COVID-19 by the usage of MeSH descriptors were made (Stegmann, 2020), our paper provides a deeper explication of knowledge development on these topics. In particular, beyond the confirmation of the prior research (Stegmann, 2020; Colavizza et al., 2020; Haghani & Bliemer, 2020) that ascertained the rapid growth of scientific knowledge on SARS-CoV-2 and COVID-19, we investigated which patterns (scientific convergence or divergence) were prevailing in the research on COVID-19 and whether researchers from different scientific fields have been working together to cope with the global pandemic, implying an increasing level of research interdisciplinarity. For this purpose, we have applied our developed method for measuring research dynamics based on machine learning (Galke et al., 2019; Perozzi et al., 2014; Grover & Leskovec, 2016).

First, our analysis shows that due to the dramatic impact of the worldwide spread of COVID-19 and the related disease, scientific research on these topics has rapidly grown by exploiting the strengths of researchers from multiple research fields. This finding is in line with the study by Shan et al. (2020), which suggests that medical journals preferred to publish COVID-19-relevant research compared to other topics. Thus, owing to the emergency of the COVID-19 pandemic and the journal preference for COVID-19 topics, many scientists contributed to COVID-19-related matters in their fields, potentially at the expense of their main research interests and topics (assuming limited research capacity). Notably, the journal preference for COVID-19 topics could be regarded as a bias in our bibliographic analysis. However, the preference for COVID-19 topics is deeply intertwined with the crisis that needed to be addressed, such that we consider the effect as a part of the scientific response to the global pandemic, leading to a temporary displacement of topics. In particular, we also expect bounce-back effects, such that the share of COVID-19 publications will decrease again and other topics will increase their share of publications. In our analysis comparing COVID-19 with other medical emergencies, we have observed an annealing of the publication share of the emergencies’ topics. For instance, even though the research on HIV-1 and related AIDS topics decreased in the analyzed period, the actual disease and the 38 million people across the globe who lived with HIV-1/AIDS in 2019 (according to the Joint United Nations Program on HIV and AIDS (UNAIDS, 2021)) have not disappeared with the emergence of the novel coronavirus.

Second, MeSH descriptors provide an accurate evaluation of the emerging scientific topic “COVID-19”, even before the descriptor “COVID-19” was officially introduced into the classification system. The most frequent MeSH descriptors, allocated to their domains or subdomains at the higher level, provide a fast and precise overview of the topics related to COVID-19. Ranging from medical and biochemical research domains to social, psychological, and behavioral topics to medical device and engineering domains, MeSH descriptors in the respective MeSH thesaurus subdomains reproduce all facets of the COVID-19 pandemic. These insights are in line with the prior literature on coronavirus research and development, which investigated the distribution of papers on coronavirus over research categories (Haghani & Bliemer, 2020). Since our results overlap with the findings from the literature, we conclude that our measurement of research dynamics based on topic (concept) similarity is a valid method to determine scientific development trajectories and to measure the degree of interdisciplinarity of research. Our results suggest that at the beginning of the pandemic, scholars made multiple efforts to comprehensively analyze the diseases caused by SARS-CoV-2 in order to respond quickly with medication approaches. Besides mild symptoms and respiratory problems such as sore throat, fatigue, dry cough, and high fever, the SARS-CoV-2 infection also triggers much more serious complications and diseases such as severe pneumonia, kidney failure, cardiac injury, and even death (Huang et al., 2020). For people with other diseases such as heart diseases, lung diseases, cancer, or diabetes, the risk of complications and severe symptoms induced by the SARS-CoV-2 infection is higher (Ali & Alharbi, 2020).
Our findings show that in the first period (January–March 2020), researchers worked within their own fields of research, which led to the agglomeration of discipline-specific knowledge (specialization of knowledge in a particular medical indication). Thereby, researchers may have profited from their prior experience with related scientific topics such as other viruses, as also indicated by Laufs et al. (2024) for the development of SARS-CoV-2 vaccines and medications. This specialization resulted in a divergence pattern of research dynamics, reflected in a higher cosine distance between the MeSH subdomain Infections (C01) and other MeSH subdomains, since multiple organ systems were affected by the SARS-CoV-2 infection and had to be profoundly investigated. However, after the identification of major diseases and critical medical conditions, scientists started to apply the knowledge, research methods, and techniques of their colleagues from other fields of medical research. The bundling of knowledge has helped scientists to make advancements beyond the boundaries of their field of study. Moreover, scientists have started employing the methods and techniques of their colleagues from other fields, which resulted in the decrease of the normalized cosine distance between the MeSH subdomain Investigative Techniques (E05) and Information Science (L01) or Social Sciences (I01) in our study. Thus, we observe patterns of interdisciplinary research. Such dependence on knowledge from other scientific fields underlines the importance of joint interdisciplinary research in times of crises and emergencies.

Third, our analysis of references in COVID-19 publications provides further insights into research dynamics. Although clinical medicine research has been cited most by COVID-19 scholars, knowledge from MeSH categories other than Diseases (C) has also constituted the research basis for medical publications. Insights from the research fields of microbiology, immunology, molecular biology, and genetics were important for research on COVID-19 as well as on vaccines and medicines against the SARS-CoV-2 infection. Furthermore, the research on health care (MeSH category N) builds on publications from diverse MeSH categories.

Fourth, the author network analysis reveals that the network position of authors determines whether they can contribute to interdisciplinary research and, as such, to interdisciplinary inventions and innovations. Our results are in line with prior findings (Aboelela et al., 2007). Aboelela et al. (2007) found that researchers who work in interdisciplinary research centers and have a high degree and betweenness centrality in their local research networks are more productive, since they obtain more information from their colleagues and can collaborate more efficiently. However, our results show that only a high degree centrality in the collaborative network positively affects the level of research interdisciplinarity, suggesting that authors who can make use of their interdisciplinary scientific network profit in terms of a higher degree of interdisciplinarity of COVID-19 research. This might be a positive effect of short communication paths between authors in the scientific networks, which is characteristic of the dynamics of science under crisis (Coccia, 2021). In contrast to the degree centrality, the authors’ betweenness centrality hinders a high level of interdisciplinarity, implying that although authors gain diverse knowledge from possibly dispersed, unconnected subnetworks, they are not able to rapidly internalize the obtained knowledge and exploit it for interdisciplinary research.

We also conducted additional analyses to trace the research on critical vaccine development (see Appendix C). Our results suggest that research on vaccines against SARS-CoV-2 was dominated by conventional viral vector-based vaccines rather than novel vaccine development approaches (i.e., mRNA-based vaccines). At the same time, they indicate that novel mRNA-based vaccine technologies reached the late development stage faster than conventional vector-based vaccines, which is in line with prior literature (Li et al., 2021).

Our study has several limitations. First, we use MeSH descriptors as research topics, which can be defined broadly or at a fine-grained level. We also make use of preprints, which contained the most current knowledge and the latest research results on SARS-CoV-2 in the first year of the pandemic. However, since preprints have not undergone a strict review process, they are not published in journals at the initial stage and, as such, are not indexed with MeSH descriptors. We applied the dictionary lookup-based annotation tool ConceptMapper to index preprints, which provides results comparable with the MeSH on Demand system; nevertheless, human indexing might differ from our machine indexing. We compared manual annotation with our ConceptMapper-based annotation in previous work (Galke et al., 2021). Second, our temporal embedding similarity measure provides quantitative insights into research dynamics between two or more research topics. However, our method does not provide any insights into how interdisciplinary research alliances were formed or how scientists chose their collaborative partners from other research fields. These issues can be explored in future research.

Conclusion

To summarize, compared with other medical emergencies, the volume of publications related to the COVID-19 pandemic has grown unusually strongly. It is most comparable with the volume of publications related to HIV, which emerged in the early 1980s. However, the crisis-driven research on SARS-CoV-2/COVID-19 did not completely displace other research topics (problem-driven research). For the first year of the COVID-19 pandemic, the initial hypothesis that different scientific disciplines would increasingly collaborate on an interdisciplinary basis was not confirmed. Nevertheless, our results show that particular research topics rely on knowledge, methods, and techniques from other research fields, which is an indicator of interdisciplinary research. Our findings suggest that clinical medicine, accompanied by immunology, microbiology, molecular biology, and genetics, constitutes the foundation for SARS-CoV-2/COVID-19 research. These disciplines predominantly cite each other, which provides only limited further indication of interdisciplinary research. Further, our results suggest that researchers with a high degree centrality in scientific networks can achieve a higher degree of interdisciplinarity.

Overall, our results on research dynamics in the first year of the COVID-19 pandemic yield the following recommendations for research policymakers:

  • To foster research on the emerging crisis topic, but to leave enough room for problem-driven research.

  • To facilitate interdisciplinary collaborations of researchers from different scientific fields.

  • To encourage researchers to produce fewer but higher-quality publications on the emerging topic, since a high quantity of research publications hinders interdisciplinary research.

  • To develop scientific community networks that enable fast and efficient communication among researchers and thereby promote scientists’ degree centrality in such networks.

  • To encourage the development of novel medical technologies rather than relying solely on conventional technologies, since novel technologies lead to breakthrough innovations that help to overcome the crisis.