1 Background

Pandemics or epidemics have profound consequences for mental health [1], a state characterized by the performance of mental functions, resulting from an adjustment of the individual to their environment or society [2]. It encompasses various aspects such as emotional regulation, resilience, and the capacity to adapt to the changing circumstances brought about by global dynamics [2]. The ease of global travel, intensified international exchanges, and shifts in climate patterns have not only accelerated the spread of infectious diseases but also heightened awareness of mental health challenges during such times [3, 4]. COVID-19, first identified in Wuhan, China, in Dec 2019, has affected numerous regions, highlighting the necessity for a mental health perspective in public health strategies [5]. As of Sep 1, 2023, over 770 million confirmed cases and over 6.9 million deaths had been reported globally [6]. Extremely strict pandemic prevention measures, such as mandatory school closures and the suspension of nonessential production and business activities, as well as the spread of emotions through channels like social media, have seriously affected people’s daily lives and emotions [7, 8].

Pandemics or epidemics have resulted in widespread contagion and lockdown, signifying widespread emotional and mental impacts that require attention [9, 10]. Given the lack of specific treatments for the mental health effects of pandemics [11], some countries are adopting mitigation strategies to reduce the impact of the virus and alleviate the psychosocial distress arising from lockdown measures [12, 13]. However, these strategies, while aimed at protecting the most vulnerable, such as older adults or those with chronic or psychiatric illnesses [14], also introduce additional panic, stress, and fear, which are associated with anxiety, frustration, anger, depression, and several other mental disorders among the general public [15, 16]. Furthermore, certain groups, such as children [17], health-care workers [18], the unemployed, and those frequently exposed to social media or news [19], might report certain degrees of mental distress. A targeted approach is therefore required to address the consequent mental health challenges in both clinical settings and among the general public.

Our systematic search of the literature revealed several detailed reviews that focused on a specific pandemic, particular at-risk groups, or specific aspects of mental health [20, 21, 22, 23, 24]. To our knowledge, there has not been an encompassing systematic review that synthesizes the diverse and broader implications of pandemics or epidemics on mental health across various segments of the population. In order to identify the links between pandemics or epidemics and mental health, and to enable health-promoting responses, the scientific evidence base must be updated in a timely and regular manner. Two factors limit the effectiveness of such evidence reviews. First, research on pandemics or epidemics and mental health cuts across a variety of disciplines and presents a fragmented picture of niche discourses, which impedes efforts to synthesize key insights and identify trends and evidence gaps. Second, the exponential growth of literature means that traditional evidence synthesis methods, which usually require considerable human resources to manually collate and screen literature, may no longer be sufficient or feasible [25]. As a result, existing reviews have tended to narrow their scope, which further reduces the potential for broader insights across disciplines.

In the landscape of expansive literature, there is a pressing demand for innovative methodologies that can offer systematic and expeditious synthesis of the existing body of knowledge [26]. Traditional review methods may be constrained by the sheer volume of data, resulting in a less comprehensive understanding of the subject at hand. In stark contrast, natural language processing (NLP) can rapidly screen and encode potentially hundreds of thousands of articles, increasing the breadth and diversity of the literature [27]. Given the devastating impacts of coronavirus, evidence synthesis about mental health outcomes is needed to produce guidance for health care institutions and the public. The purpose of this review is to use NLP methods (supervised and unsupervised) to systematically synthesize evidence on pandemics or epidemics and mental health. We also aim to identify key indicators from our systematic synthesis that can inform the development of targeted preventative interventions. In doing so, we hope to provide the first comprehensive, semi-automated systematic map of the scientific literature on pandemics or epidemics, mental health impacts, and relevant coping strategies, ultimately contributing to timely public health responses and mental health interventions.

2 Methods

2.1 Search Strategy and Selection Criteria

NLP approaches were used to systematically synthesize evidence on the links between pandemics or epidemics and mental health globally [28]. The analysis involved five steps: defining the framework, database searching, screening, supervised and unsupervised machine learning, and generating topic maps and heatmaps.

We focused on literature from different geographical contexts, and assessed the distribution of literature based on country income status, using the 2020 World Bank income classification rankings to define low-income, lower middle-income, upper middle-income, and high-income nations [29].

We included all empirical study designs published and indexed in English between Jan 1, 2014, and Dec 31, 2023. We searched Embase, PubMed, Scopus, PsycInfo, and Web of Science Core Collection using title, abstract, and keywords only. Full texts were not retrieved, screened, or analyzed. Grey literature was not included. The search strings are described in detail in Supplementary File 1.

2.2 Screening

Screening was done based on titles and abstracts using a combination of manual assessment against a priori inclusion criteria and machine learning methods. For inclusion, documents had to meet the following criteria: (1) be indexed in English; (2) be published between Jan 1, 2014, and Dec 31, 2023; (3) provide a clear link to the actual, projected, or perceived impacts of pandemics or epidemics, or responses to reduce the impacts of pandemics or epidemics; (4) include a substantial focus on a perceived, experienced, or observed eligible mental health outcome; (5) present empirically driven research or a review of such research. Correspondingly, our exclusion criteria were: (1) non-English language documents; (2) documents outside the 2014–2023 publication range; (3) documents lacking a clear thematic focus on either pandemics or epidemics or mental health outcomes; (4) non-empirical or non-peer-reviewed documents, such as opinion pieces, editorials, and news articles.

We used a supervised NLP method to accelerate screening. This involves training a computer algorithm on a human-labelled dataset to learn human decisions on screening documents [30]. We did this through two separate classification tasks: one to predict whether a document is relevant to pandemics or epidemics, and the other to predict whether it is related to mental health. Since positive documents were expected to be sparse (about 5%), we first used unsupervised learning to select documents that were more likely to be positive for labelling [31]. We manually screened a training sample by title and abstract (500 unique documents), and coded each document as relevant to pandemics or epidemics, to mental health, or to both. This human-screened sample was used to train the classification models. For the remaining unlabelled documents, the two classification models were used to identify documents related to both pandemics or epidemics and mental health.
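The final step of this screening procedure, keeping only documents that both classifiers flag, can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes each classifier has already produced one relevance probability per document, and the function name `select_relevant`, the 0.5 cut-off, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def select_relevant(
    pandemic_probs: Sequence[float],
    mental_health_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not reported in the text
) -> List[int]:
    """Return indices of documents that both classifiers predict as relevant."""
    if len(pandemic_probs) != len(mental_health_probs):
        raise ValueError("both classifiers must score the same documents")
    return [
        i
        for i, (p, m) in enumerate(zip(pandemic_probs, mental_health_probs))
        if p >= threshold and m >= threshold
    ]


# Toy scores for four unlabelled documents (made up for illustration).
pandemic_scores = [0.92, 0.10, 0.85, 0.40]
mental_health_scores = [0.88, 0.95, 0.30, 0.20]
kept = select_relevant(pandemic_scores, mental_health_scores)
print(kept)  # [0]
```

Only the first toy document passes, because it is the only one scored above the cut-off by both models; a document that is clearly about mental health but not about a pandemic or epidemic (the second one) is dropped.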

To be exact, the task of labelling the dataset was entrusted to a team of experienced research assistants, under the supervision of principal investigators. The labelling process was performed according to pre-established criteria that we had meticulously defined. These criteria include keywords, research focus, and specific terminology associated with pandemics or epidemics and mental health. Training sessions were conducted to homogenize the understanding of these criteria among all labellers. Discrepancies were resolved through group discussions to reach consensual decisions. A consistent approach was ensured by pilot-testing the labelling process with a subset of the whole dataset, refining the criteria and protocols as needed. A common labelling guideline was established based on this pilot test, detailing the precise operational definitions for “pandemics or epidemics” and “mental health” in terms of document relevance.

We used the pre-trained BERT language model (base, uncased) as the backbone of the classifiers, and evaluated performance with the F1 score [32]. An F1 score between 0.8 and 0.9 is generally regarded as acceptable for classification tasks relating to complex phenomena, and a score above 0.9 as excellent. On the test set, the model achieved an F1 score of 0.94 for the pandemics or epidemics classification task and 0.84 for the mental health classification task. Both results meet these benchmarks, suggesting that the models deliver reliable classifications across both domains. For the documents predicted positive by both models, we manually reviewed 100 documents to recheck overall performance, and 91% of them met our inclusion and exclusion criteria.
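For readers less familiar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from confusion counts; the counts are hypothetical and are not taken from the study, and the last lines simply restate the manual recheck (91 of 100) as a precision estimate.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts for a held-out test set (not from the study).
example_f1 = f1_score(tp=80, fp=20, fn=20)
print(round(example_f1, 2))  # 0.8

# The manual recheck in the text corresponds to a precision estimate:
# 91 of 100 reviewed positives met the inclusion criteria.
manual_precision = 91 / 100
print(manual_precision)  # 0.91
```

Note that the manual recheck covers only documents predicted positive, so it estimates precision and says nothing about relevant documents the models may have missed.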

2.3 Data Extraction and Analysis

We extracted the bibliographic meta-data for all documents retrieved by search strings. To determine where studies were conducted, we used a pre-trained geoparser to identify geographical locations in abstracts and titles [33]. The detailed literature selection process is presented in Supplementary Fig. 1.

We used unsupervised NLP and topic modelling to support analysis of the literature included in this review. Topic modelling is a method that automatically identifies clusters of words that frequently occur together based on a pre-specified number of topics [34]. The themes generated by topic modelling are not based on any human labelling or tagging, but on structures found by the algorithm in the data itself. An article is typically associated with more than one topic. To find the most relevant and explicable topic model, we qualitatively compared several alternative topic models, including non-negative matrix factorization (NMF) [35], latent Dirichlet allocation (LDA) [36], guided LDA, and BERTopic, with 40 to 80 topics and various hyperparameters. NMF with 45 topics provided the best balance between detail and interpretability. We then implemented NMF using the scikit-learn package [37]. We iteratively reviewed the topics emerging from the topic models to identify and label key thematic clusters, which were then validated within the study team and with external experts to ensure that the final topics were aligned with the experts’ understanding of the literature. Each topic of the final topic model was assigned to one of three aggregated meta-topics: pandemics or epidemics, mental health risks & impacts, or coping & responses (Supplementary File 2).

We generated topic maps based on outcomes of the topic model and created heat maps to visualize the relative co-occurrence of topics. The topic map reflects a conceptual space where similar documents are placed closer together and dissimilar documents are farther apart. Thus, clusters of dots represent areas of literature with similar topic scores. We identified the most and least frequent topics from all topics in our topic model by the global region for key meta-topics. We generated geographical maps to visualize locations of studies globally. We used narrative synthesis to assess the frequency of key topics in literature on pandemics or epidemics and mental health, as well as the extent of co-occurrence of topics within the topic and heat maps. We assessed the extent to which trends in the literature differed by country income class.
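The co-occurrence counts behind such a heat map can be sketched in a few lines. This is an illustrative reading of the procedure, not the study's code: it assumes a document-topic score matrix is available and treats a topic as present in a document when its score exceeds a threshold. The value 0.015 mirrors the threshold reported in the Fig. 6 caption; the function name and the toy scores are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def cooccurrence_counts(
    doc_topic: List[List[float]],
    threshold: float,
) -> Dict[Tuple[int, int], int]:
    """Count documents in which both topics of a pair score above threshold."""
    counts: Dict[Tuple[int, int], int] = {}
    for scores in doc_topic:
        present = [k for k, s in enumerate(scores) if s > threshold]
        for a, b in combinations(present, 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


# Toy document-topic scores (rows: documents, columns: topics); invented values.
toy_scores = [
    [0.20, 0.05, 0.00],
    [0.10, 0.00, 0.30],
    [0.02, 0.04, 0.01],
    [0.00, 0.00, 0.50],
]
counts = cooccurrence_counts(toy_scores, threshold=0.015)
print(counts)  # {(0, 1): 2, (0, 2): 1}
```

Each key is an unordered topic pair and each value is the number of documents in which both topics are present; these counts would fill the cells of the heat map.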

3 Results

3.1 Literature Growth and Spatial Concentration

Our findings showed that literature on a broad spectrum of pandemics or epidemics and mental health has grown substantially since 2020 (Fig. 1A). Based on our criteria, we identified 77,915 studies in the field of pandemics or epidemics and mental health published by Dec 31, 2023 and indexed in Embase, PubMed, Scopus, PsycInfo, or Web of Science Core Collection. In 2023, 27,831 relevant studies were retrieved and included, around 4.5 times as many as in 2020. By contrast, fewer than 100 studies were retrieved per year before 2020.

31,358 (76.56%) of 40,957 studies on pandemics or epidemics and mental health with identified place names focused on high-income and upper middle-income nations (Fig. 1B). Publications on pandemics or epidemics and mental health showed a large gradient by national income group. The number of publications from high-income nations was 1.7 times greater than that in upper middle-income nations, and far exceeded that from low-income nations. When places mentioned were combined by country for each paper, we found that 14,511 (35.43%) of the 40,957 locations were in Asia, followed by 12,274 (29.97%) in Europe and 7,120 (17.38%) documents in North America. In terms of specific countries, 3,827 (11.25%) of the 34,008 locations were in China, 2,581 (7.59%) in the USA, and 2,314 (6.80%) in Great Britain.

Fig. 1

(A) Descriptive summary of included articles in 2014–2023 by publication year. (B) Descriptive summary of included articles by country income group, continent, and country

3.2 Topic Modelling and Topic Distribution

Through topic modelling, we mapped pandemics or epidemics, mental health risks & impacts, and coping & responses across the globe. Topics with their relative prevalence in the literature are shown in Fig. 2, grouped by pandemics or epidemics, mental health risks & impacts, and coping & responses. The axis is a normalized scale reflecting topic prevalence relative to the mean score (reference = 1). Topics were identified by the words used in titles, keywords, and abstracts, and could thus reflect several meanings. The Covid pandemic and SARS were the most frequently studied pandemics or epidemics, followed by HIV/AIDS. Anxiety and stress were the most frequently studied mental health outcomes. Social support and healthcare were the most common ways of coping. Patients, healthcare workers, students, and older adults were the populations identified as vulnerable to mental health problems caused by epidemics. Full lists of topics and the words associated with them are in Supplementary File 2.
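One plausible reading of the normalized prevalence scale is that each topic's aggregate score is divided by the mean across topics, so that an average topic sits at the reference value of 1. The sketch below illustrates that reading only; the exact normalization used for Fig. 2 is not specified in the text, and the function name and scores are invented.

```python
from statistics import mean
from typing import Dict


def relative_prevalence(topic_scores: Dict[str, float]) -> Dict[str, float]:
    """Scale each topic's score by the mean across topics (mean maps to 1)."""
    avg = mean(topic_scores.values())
    return {topic: score / avg for topic, score in topic_scores.items()}


# Invented aggregate topic scores, for illustration only.
toy = {"anxiety": 30.0, "stress": 20.0, "burnout": 10.0}
print(relative_prevalence(toy))  # {'anxiety': 1.5, 'stress': 1.0, 'burnout': 0.5}
```

Under this reading, values above 1 mark topics that are more prevalent than average and values below 1 mark topics that are less prevalent.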

Fig. 2

Prevalence of topics within included articles, organized by meta-topic

Figure 3A shows the global geographical distribution of included studies. For studies conducted at the national level, points appear in the geographic center of the region or country. The topics with the highest frequency by region (top three topics) are shown in Fig. 3B. The most frequently reported pandemics or epidemics were Covid (19,770 [48.27%]), SARS (7,499 [18.31%]), and HIV/AIDS (2,301 [5.62%]). The Covid pandemic was among the top three topics in all continents. Lockdown or quarantine was among the top three topics in all continents except North America. Literature published in North America and Africa reported HIV/AIDS more frequently than literature from other regions.

Fig. 3

(A) Global geographical distribution of included studies. (B) Most frequent topics by region and category

Various mental health outcomes were reported—including anxiety (5,431 [13.26%] articles), stress (4,427 [10.81%]), depression (3,236 [7.90%]), and psychological stress (3,510 [8.57%]) (Figs. 2 and 3). Anxiety was one of the three most frequent topics in all continents except North America. Depression was among the top three topics in all continents except Oceania. Alcohol consumption was the most frequent mental health topic in North America and Oceania. Oceania presented a different pattern, in that psychological distress was the most prevalent mental health topic.

In the existing coping & responses literature, patients (3,313 [8.09%] of all 40,957 articles) were the most frequently studied population. Studies also focused on students (2,502 [6.11%]), children and parents (1,618 [3.95%]), and healthcare workers or medical staff (1,311 [3.20%]), among others. Social support was the most common coping response (2,920 [7.13%] of all 40,957 articles), followed by healthcare (2,552 [6.23%]) and physical activities (1,682 [4.11%]). Patients, healthcare workers or medical staff, students, and older adults were the most common study populations in most continents. Healthcare or telehealth services were common ways of coping in all continents except South America. Social support was a frequent topic in Europe, Oceania, and South America.

3.3 Topic Aggregation and Dimensional Representation

We mapped all articles in the dataset based on the similarity of topics within documents, distinguishing documents by their aggregated meta-topic: pandemics or epidemics, mental health risks & impacts, or coping & responses (Fig. 4). The topic map showed a large number of topic clusters, offering a nuanced view of the complex interactions between pandemics, mental health, and societal coping mechanisms. These included several clusters focusing on specific mental health outcomes (e.g., anxiety, depression, loneliness, and stress) that were tightly grouped and separate from other topics, suggesting that the literature on these outcomes forms distinct bodies of work. Pandemics or epidemics-related topics, such as Covid, SARS, and HIV/AIDS, also showed distinctive clustering. Clusters associated with coping and responses, including themes like social support, resilience, and healthcare responses, were also evident.

Fig. 4

(A) Visualization of topics in the dataset. The graphic reflects a conceptual space where similar documents are placed closer together, and dissimilar documents are farther apart. Clusters of dots represent areas of literature that have similar topic scores, meaning that they use similar words and are presumed to be about related subjects. Labels show the most frequent topics. (B) Pandemics or epidemics and mental health research categories in the dataset. Topic map in which each dot represents a document, colored according to the categories of pandemics or epidemics, mental health risks & impacts, or coping & responses

In Fig. 5, with higher national income level, there was a lower frequency for depression, post-traumatic stress disorder (PTSD), and sexual violence, and an increasing frequency for loneliness, negative emotion, life satisfaction, suicide, and sleep disturbance. Literature from high-income regions mostly focused on the effects of pandemics or epidemics on psychological distress, depression, sleep disturbance, suicide, alcohol consumption, and burnout. In low-income regions, the literature on pandemics or epidemics and mental health was dominated by psychological distress and depression, with particular attention paid to research on PTSD, sexual violence and dementia.

Fig. 5

Frequency of mental health risk and impact topics for countries in different income classes. Data are from documents on mental health impacts per country income group, subdivided by aggregated topic as a percentage of the group total

3.4 Topic Co-occurrence and Demographic Implications

We assessed the co-occurrence of topics across the pandemics or epidemics, mental health risks & impacts, and coping & responses categories, with results shown as heat maps (Fig. 6). Co-occurrence of pandemics or epidemics and mental health topics was common, largely reflecting the dominant pathways linking pandemics or epidemics to health risks. Among specific topics, the Covid pandemic was frequently reported in articles on psychological distress, negative emotion, stress, and life satisfaction. SARS most frequently co-occurred with studies of depressive symptoms, psychiatric disorder, and anxiety. HIV/AIDS co-occurred frequently with sexual violence.

Fig. 6

Co-occurrence of pandemics or epidemics, coping & responses, and mental health risks & impacts. Number of documents with co-occurring topics above threshold 0.015

Healthcare workers or medical staff were the group most often linked to mental health problems, such as stress, anxiety, depression, psychological distress, and burnout. Patients and students were frequently linked to anxiety, depression, psychological distress, and psychiatric disorder during or after pandemics or epidemics. Older adults were frequently linked to loneliness, which was often mitigated by social support. Social support, healthcare services, and physical activities were common responses to mental health problems, reflecting their potential role in recovery from pandemics or epidemics.

4 Discussion

Infectious diseases such as SARS and the Covid pandemic have a major impact on global health. Existing evidence must be kept up to date to better respond to future outbreaks of infectious diseases. Available research revealed that 40,957 studies, mainly from high-income countries, explored the relationship between pandemics or epidemics and mental health. The Covid pandemic was the most common topic, followed by SARS and HIV/AIDS. Pandemics or epidemics can lead to mental health problems, such as anxiety, stress, fear, and depression. Social support, healthcare services, and physical activities were the most common ways of coping. Patients, healthcare workers, students, and older adults were the vulnerable groups most often studied. The findings provide an overview of existing research on pandemics or epidemics and mental health, and also highlight recovery responses.

The Covid pandemic has emerged as a principal theme from our analysis, with its profound and multifaceted reverberations on various societal strata [38]. This is closely followed by SARS and HIV/AIDS. They have not only shaped the global health narrative over the past two decades but also significantly influenced the mental landscape, bridging virulent pathogens with emotional upheaval [39, 40]. These conditions, being marked by their contagious and far-reaching impacts, evoke profound communal and individual anxiety, leading to a spectrum of mental health issues, including anxiety, stress, fear, loneliness, and depression [41]. Especially noteworthy is the precipitous rise in stress and fear, both acute responses to novel and life-threatening situations. If unmitigated, however, these responses can evolve into chronic mental conditions [42]. Simultaneously, the pervasive sense of uncertainty during pandemics often creates fertile ground in which depression takes root, fostering feelings of hopelessness and lethargy [43].

Our findings underscore the importance of social support, healthcare services, and physical activities as fundamental coping mechanisms against mental health adversities arising during pandemics. These elements were found to be key to recovery responses, providing significant support to the affected population. Specifically, social support stands out as a critical factor, not only in mitigating immediate distress but also in fostering a sense of community and resilience among individuals [44]. Social connections provide an essential platform for sharing experiences and solutions, which is crucial for mental health [45]. Furthermore, healthcare services, in traditional and telehealth formats, play an indispensable role in ensuring continuous care and monitoring for individuals’ mental health, especially for those in isolation or quarantine [46]. Physical activities also come to the forefront as a supportive element in combating mental health issues. By promoting physical health, these activities contribute to mental health by reducing symptoms of anxiety and depression and by enhancing mood [47].

The multifaceted approach to addressing mental health challenges highlights the need for a collaborative and community-oriented strategy, particularly during times of crisis. For the general population affected by epidemics or pandemics, a comprehensive assessment of risk factors leading to mental health problems is needed, such as pre-crisis mental health status, trauma, or bereavement [10]. Moreover, tailoring support to different population segments, especially mental health interventions for vulnerable and exposed groups, is imperative [48]. For patients and healthcare workers who are dealing with the direct health impacts of a pandemic, the focus might lean towards medical support integrated with mental health care [49, 50]. For students, disruptions in education and the loss of routine can be mitigated through online platforms and mental health support systems provided by educational institutions [51]. Older adults, who might face isolation more acutely, require specific attention both physically and emotionally [52]. The synthesis of our research points towards the necessity of a robust, adaptable, and inclusive framework for mental health support during times of pandemic.

Compared with existing work, our NLP-automated mapping study provided more comprehensive evidence on pandemics or epidemics and mental health. Traditional reviews cannot process the sheer volume of literature as efficiently as NLP methods, which can systematically and swiftly integrate a large number of existing studies, especially given advances in supervised algorithms that improve accuracy and relevance [53]. Merging the precision of human expertise with the scalability of machine learning algorithms is an essential step in training robust machine learning models that accurately reflect human decision-making [54]. At the same time, the automated procedure vastly outpaces traditional manual methods, leading to a more efficient identification of trends, patterns, and relationships across diverse studies [55]. Moreover, NLP can automate and improve the classification and theme extraction process, allowing the exploration of broader research scopes and thus facilitating a more comprehensive and precise synthesis of evidence [56]. It also mitigates the variability inherent in individual human assessments, especially over large datasets.

However, given the magnitude of the literature and the constraints of a high-level systematic synthesis using NLP, we only relied on abstracts, titles, and keywords for analyses. Not retrieving full texts may potentially lead to the exclusion of nuanced details and context-specific findings present in the full texts. In addition, we did not include grey literature because it is still difficult to integrate systematically using NLP methods. Finally, we limited our search to English-indexed studies. The resulting evidence map, for example, had little evidence in Oceania and South America, which may be partly related to language bias. Similarly, the over-representation of evidence in high-income and middle-income countries may be due to the focus on peer-reviewed publications.

5 Conclusions

Taken together, the evidence accumulated so far confirms that pandemics or epidemics have a significant impact on the mental health of individuals. Our NLP-driven insights offer an opportunity to inform clinicians and mental health professionals about potential risk and protective factors, as well as the strategies required to cope with public health emergencies such as pandemics. At the same time, our reliance on meta-data highlights a growing need for NLP techniques in research synthesis. Integrating the methodological rigor of systematic reviews with advanced NLP can accelerate the identification of cross-study patterns and relationships, a necessity in a rapidly growing literature space, and is thus essential for preparing for and mitigating the mental health impacts of pandemics or epidemics.