1 Introduction

Neonatal infections are responsible for 25% of infant deaths globally and 10% of deaths in children under 5 years old. In developing countries, they account for 99% of the world’s mortality due to bacterial infections [1]. Due to the uneven distribution of health services between urban and rural areas, developing countries such as India face a high Neonatal Mortality Rate (NMR; refer to Table 1 for abbreviations used in this paper) of 22 per 1000 live births [2] and the highest number of neonatal deaths within the first 24 h of life [3]. Specifically, the rural NMR is twice the urban NMR [4]. India accounts for one-fifth of global live births and over a quarter of neonatal deaths [4]. In 2013 alone, nearly 0.75 million neonates died in India, the highest number in the world [4]. Significantly, 70% of all infant deaths and over 50% of deaths in children under five occur in the neonatal period. Deaths within the first week of the neonatal period alone account for 45% of infant mortality [4]. Neonatal sepsis, along with antibiotic resistance, kills more than 0.6 million babies annually in India [5, 6]. The United Nations SDG target for NMR is 12 per 1000 live births [7]. However, India faces significant challenges due to limited infrastructure and resources, which result in delays in diagnosing and treating neonates. This difficulty is intensified because neonatal sepsis cannot be diagnosed from one or two clinical parameters alone.

Neonatal sepsis is the presence of symptoms of sepsis in the neonatal period combined with bacteriological isolation of an infectious agent from blood or CSF [8] and radiographic evidence of pneumonia [9]. About one million newborn babies across the world die due to neonatal infections during the neonatal period (from 0 to 28 days of birth) [1]. Neonatal sepsis includes serious infections, such as bacteremia, pneumonia, and meningitis in newborn babies. It manifests in various forms: very early-onset sepsis, which occurs within the first three days of life; EOS, which occurs within the first week; and LOS, which occurs after the first week of life until the end of the neonatal period (28 days of life) [10]. The daily risk of mortality is higher in the neonatal period (the first four weeks of life) than in any other period, and is 30-fold higher than in the post-neonatal period, from one month to 59 months of age [4].

Neonatal sepsis can be managed by early diagnosis and treatment [8]. The study established that 90% of the babies sampled had more than one factor causing sepsis. The critical challenge in diagnosing neonatal sepsis therefore arises from the nonspecific nature of its symptoms, which makes both diagnosis and treatment difficult. A blood culture test, the current gold standard of diagnosis, takes more than 2 days to generate results [11, 12]. Blood cultures are at times inadequate, tend to produce false negatives, and have not always demonstrated complete accuracy [13]. Moreover, in the absence of specific laboratory tests for neonatal sepsis, and given the protean, non-specific nature of its symptoms, accurate clinical diagnosis remains a challenge [9, 14], with delays in diagnosis and treatment common across the globe [15]. The blood culture test continues to be the gold standard even though its sensitivity is limited by an 18% false-negative rate [15, 16]. Additionally, the volume of blood obtained from a neonate is often less than 0.5 ml, while the recommended volume for detecting bacteria is between 0.5 and 1.0 ml [13]. To determine the volume of blood needed for culture tests in babies, it is essential to know the detection system’s sensitivity and the probability of finding at least one microorganism in the collected sample. Sampling small volumes of blood to detect low-density bacteraemia or fungemia risks not capturing an organism in the culture bottle [13]. Also, in critically ill or very low birth weight babies, it is very hard to draw a 1 ml blood sample, and the low sample volume is usually inadequate to diagnose neonatal sepsis [16, 17].
Especially in preterm babies, infection is hard to suspect clinically; as a result, antibiotic use is common and even leads to higher rates of mortality and morbidity among newborns when administered during the first seven days of life [18]. Failure to quickly diagnose neonatal sepsis, primarily due to its indefinite signs and symptoms, makes the disease more lethal and destructive. The blood culture test, the only method for confirmatory diagnosis of neonatal sepsis, has not effectively addressed neonatal morbidity and mortality, leading to rampant use of empirical antibiotics [15], which in turn contributes to yet another global health crisis, AMR.
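The dependence of culture yield on sample volume discussed above can be illustrated with a short calculation. This is a minimal sketch, assuming organisms are Poisson-distributed in blood; the function name `detection_probability` and the density of 1 CFU/ml are hypothetical choices for illustration and are not taken from the cited studies.

```python
import math


def detection_probability(cfu_per_ml, volume_ml):
    """Probability that a blood sample contains at least one organism.

    Assumption: organisms follow a Poisson distribution, so the chance
    of drawing none is exp(-density * volume).
    """
    return 1.0 - math.exp(-cfu_per_ml * volume_ml)


# Hypothetical low-density bacteraemia of 1 CFU/ml.
for volume in (0.25, 0.5, 1.0):
    print(volume, round(detection_probability(1.0, volume), 2))
```

Under these assumptions, a 0.5 ml draw would contain an organism roughly 39% of the time, against about 63% for a 1 ml draw, which is consistent with the concern that small neonatal samples can miss low-density infections.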

According to [19], antibiotic resistance arises from mutations in microbes and the selection pressure exerted by antibiotic use, which gives mutated strains a competitive advantage. Especially in LMICs, antibiotic use has gone up due to rising incomes, growing rates of hospitalization, and increasing prevalence of hospital infections [19]. NICUs in LMICs report between 15.2 and 62.0 infections per 1000 patient-days, up to nine times the rate in the USA, which averages about 6.9 infections per 1000 patient-days [16, 19]. The authors of [16] state that for every culture-positive sepsis result, on average 11–23 infants in developed countries receive antibiotic treatment, leading to antibiotic resistance and a rise in healthcare costs. On many occasions, despite negative blood cultures for 98% of patients, 47% of very low birth weight infants continued to receive antibiotics [16]. A review titled Antimicrobial Resistance: Tackling a crisis for the health and wealth of nations, commissioned by the government of the United Kingdom and published in 2014, estimated that 10 million people could die each year because of AMR by 2050 [20, 21]. A study published in the Lancet [21], which performed a comprehensive review of the burden of AMR across 204 countries, 23 bacterial pathogens, and 88 drug combinations, reports that the magnitude of AMR, especially that caused by bacterial pathogens, surpasses that of malaria or HIV. According to the Lancet study, one critical way to address this growing problem is to improve the quality of data collection, especially in low-income settings where surveillance is weak and data are sparse; better data could guide policymaking and pave the way for addressing this global health crisis. Research has also demonstrated that continuous use of sepsis screens is effective in reducing antibiotic usage among neonates [22].

In developing countries like India, resistance to WHO-recommended antibiotic regimens such as ampicillin and gentamicin is highly prevalent among pathogens causing neonatal infections in the first 28 days of life. A high proportion of these isolates, notably 71% of Klebsiella spp. and 50% of E. coli, are resistant to gentamicin, which is the first line of treatment for neonatal infections [19]. Other harmful pathogens causing neonatal infections, such as Enterobacteriaceae and Acinetobacter spp., are also associated with high antibiotic resistance and high mortality in neonatal nurseries [19]. In high-income countries, the culture-positive to culture-negative sepsis ratio ranges from 1:6 to 1:16 [23]. This ratio is likely skewed towards culture-negative sepsis in South Asia. Additionally, the pathogen profile in South Asia differs from that in high-income countries. In South Asia, Gram-negative pathogens are dominant (>60%), while the incidence of group B streptococci, which is typically high in high-income countries, is low. The predominance of Gram-negative pathogens in South Asia suggests that the region has not yet undergone the transition from Gram-negative to Gram-positive organisms such as group B streptococci that occurred about five to six decades ago in developed countries, a transition largely attributed to improved aseptic routines, including hand hygiene in neonatal intensive care units.

This disparity in pathogen profiles across geographies calls for algorithms localized to the developing-country context rather than ones built for developed countries, which would be ineffective in aiding health professionals; ‘one size fits all’ is not a workable solution in healthcare. Researchers have therefore stressed the need for novel technology-based approaches for fast prediction of neonatal sepsis [11] that could perform diagnosis as effectively as existing methods, or more so.

Predictive analytics methods have attracted researchers’ attention for their ability to diagnose neonatal sepsis early. Researchers have used statistical methods [8], machine learning (ML) [15], and Artificial Neural Networks (ANNs) [24] to predict the onset of sepsis in neonates. Predictive analytics has generated significant interest in healthcare because of its potential to transform economies, especially in primary health services [25]. Acknowledged as a non-medical intervention, predictive analytics promises to address health crises, build preparedness for next-generation epidemics, and aid in resilient recoveries during global health emergencies or calamities [26]. EHR alerts and CDSS all fall under the “classic” predictive analytics category [25], which, in recent times, has seen a significant rise in budgetary allocations.

Predictive analytics uses data to build algorithms capable of deriving insights and mimicking intelligent human decision-making, thereby enhancing the effectiveness of decisions [27]. Predictive analytics in healthcare broadly refers to ML, a subset of AI; DL, which is based on ANNs, is in turn a subset of ML. Widespread use cases of predictive analytics in healthcare include identifying patients at risk of developing certain conditions, estimating the likelihood of unexpected transfers to ICUs or emergency departments, and forecasting unplanned hospitalization expenses, among other scenarios [25].

While the health sector has pioneered predictive analytics [28], its adoption in health management focusing on neonatal sepsis detection has been uneven globally [27]. As a consequence, the predictive analytics adoption curve follows a non-linear trajectory owing to the differing stages of innovation across countries, resulting in varied success rates amid unaddressed legal, regulatory, ethical, and cultural challenges [29]. Thus, the current study fills a gap in the predictive analytics literature on the early prediction of neonatal sepsis by providing a pathway to improve its early diagnosis and screening and to address the related antibiotic resistance. It also builds upon theory to foster further research in predictive analytics applications.

Therefore, following a rigorous scientific research design, this study endeavors to holistically assimilate knowledge concerning predictive analytics models for early diagnosis of neonatal sepsis in the healthcare sector in the context of developing countries. A systematic literature review is considered a reliable method to synthesize the literature and is deemed appropriate to assimilate the fragmented literature and determine the current state of the art [30].

The objectives of the current review are as follows:

  1. To systematically review predictive analytics methods used for the early diagnosis of neonatal sepsis (see Sect. 2 and Sect. 3).

  2. To identify gaps in the current research efforts toward building advanced predictive analytics methods for neonatal sepsis diagnosis (see Sect. 4).

The review is up-to-date and explores different dimensions of neonatal sepsis diagnosis in the context of the nature of the disease (EOS/LOS), country of study, performance analysis of different predictive analytics methods, and reviews of prominent studies on the implementation of predictive analytics methods in neonatal sepsis diagnosis by different researchers. This review will benefit researchers in the domain of neonatal sepsis diagnosis and predictive analytics, providing an in-depth understanding of the different predictive analytics methods used for early diagnosis of neonatal sepsis.

The rest of the paper is organized as follows: Sect. 2 presents the method used for the systematic literature review. Sect. 3 presents the results and discusses the key findings, strengths, and limitations. Sect. 4 presents the gaps in the literature and directions for further research. Finally, the paper is concluded with acknowledgements.

2 Methods

The four-step process outlined in the PRISMA guidelines [31] was used to create a rigorous and repeatable review, as recommended by [32]. The search protocol is classified into three phases:

2.1 Phase I: Planning the review

Articles were searched in Scopus, PubMed, Google Scholar, etc. An exhaustive search was run using the Boolean operators “AND” and “OR” with the following keywords: “Neonatal Sepsis” AND “Neonatal Sepsis Diagnosis” AND “Predictive Analytics” OR “Artificial Intelligence” OR “Machine Learning”. The search covered research papers published from 2014 to 2024.

All articles were screened independently by all the researchers; only relevant articles with the main keywords in the abstract were considered for review.
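The keyword combination above can be assembled programmatically so that the same string is reused across databases. The following is a minimal sketch; the parenthesized OR group is an assumption about how the operators were combined, the helper name `build_query` is hypothetical, and individual databases may require their own field tags or syntax.

```python
def build_query(required_terms, alternative_terms):
    """Join required terms with AND and alternative terms with OR.

    Assumption: the three method keywords form a single OR group that is
    AND-ed with the sepsis keywords; the protocol does not state the
    grouping explicitly.
    """
    quoted_required = [f'"{term}"' for term in required_terms]
    quoted_alternatives = [f'"{term}"' for term in alternative_terms]
    or_group = "(" + " OR ".join(quoted_alternatives) + ")"
    return " AND ".join(quoted_required + [or_group])


query = build_query(
    ["Neonatal Sepsis", "Neonatal Sepsis Diagnosis"],
    ["Predictive Analytics", "Artificial Intelligence", "Machine Learning"],
)
print(query)
```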

Inclusion Criteria: Further, the articles were selected based on the inclusion criteria mentioned below:

  1. Neonates with positive blood culture before 72 h of life were considered early-onset neonatal sepsis cases. Neonates with positive blood culture reports after 72 h of life were considered late-onset neonatal sepsis cases.

  2. Both prospective and retrospective studies that used electronic health records and vital-sign measurement devices to predict neonatal sepsis.

  3. Clinical studies using various statistical methods for predicting neonatal sepsis were considered.
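The 72 h rule in the first criterion can be expressed as a small labelling function. This is a minimal sketch: the handling of a culture drawn at exactly 72 h is an assumption, since the criterion only says "before" and "after", and the function and constant names are hypothetical.

```python
EOS_CUTOFF_HOURS = 72            # threshold from the first inclusion criterion
NEONATAL_PERIOD_HOURS = 28 * 24  # neonatal period: first 28 days of life


def classify_onset(age_hours_at_positive_culture):
    """Label a culture-positive case as early- or late-onset sepsis.

    Assumption: a culture drawn at exactly 72 h is treated as late-onset.
    """
    if age_hours_at_positive_culture < 0:
        raise ValueError("age must be non-negative")
    if age_hours_at_positive_culture > NEONATAL_PERIOD_HOURS:
        return "outside neonatal period"
    if age_hours_at_positive_culture < EOS_CUTOFF_HOURS:
        return "early-onset"
    return "late-onset"
```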

Exclusion Criteria: The following are excluded from our study

  1. Non-English articles, conference papers, and editorial communication

  2. Systematic reviews of previously published studies

  3. Meta-analysis of studies

2.2 Phase II: Conducting the review

Based on the inclusion and exclusion criteria, 1690 articles were screened for document type and subjects: 72 from Scopus, 4 from PubMed, and 1612 from Google Scholar were considered for further evaluation. Of these, 994 articles were exported based on accessibility, of which 924 were excluded under the inclusion and exclusion criteria defined for the study. A final set of 48 full-text articles was identified for the literature review, and 16 articles in which predictive models were developed and validated were analyzed.

2.3 Phase III: Reporting

Table 1 presents the review’s results and the abbreviations used in the paper. Figure 1 depicts the PRISMA approach for choosing the final set of 16 relevant studies. Figure 2 presents the distribution of studies by year, journal, and country. These studies are briefly described as follows:

Mani et al. [16] proposed machine learning algorithms for early diagnosis of late-onset neonatal sepsis: SVM, NB and its variants TAN and AODE, a sample-based classifier (KNN), the decision tree classifiers CART and RF, LR, and LBR. The authors picked variables from a subset of NICU variables deemed relevant on the basis of clinical expertise and literature evidence. They included 781 (71 × 11) temporal (laboratory) variables and 30 nontemporal (demographic) variables. Missing values were addressed using the LOCF approach [33]. The Gaussian single imputation method is considered superior to simple mean, median, or mode imputation because it introduces less bias and reduces the missing values from 91 to 64%. Variables with more than 90% missing data were removed. The final dataset consisted of 93 variables and 299 instances. Feature selection was done using a 5-fold nested cross-validation procedure in MATLAB. Performance was measured using the AUC since it works well independent of class size and classification threshold [34]. Because the sample size was small relative to the number of variables, the authors used SVM algorithms to select a subset of features highly predictive of the outcome variable. SVM algorithms have qualitative and quantitative advantages and can pick informative features quite well [35]. The treatment specificity of the ML algorithms was higher than that of the physicians, whose criterion was measured as prescription of an antibiotic within 12 h of phlebotomy. The authors concluded that the treatment sensitivity and specificity of most ML algorithms exceeded those of the physicians.
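The LOCF (last observation carried forward) step mentioned above can be illustrated as follows. This is a minimal sketch of the general technique, not the authors' implementation; the function name `locf` and the heart-rate readings are hypothetical.

```python
def locf(series):
    """Fill missing values (None) with the last observed value.

    Leading gaps stay None because nothing has been observed yet.
    """
    filled = []
    last_seen = None
    for value in series:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical hourly readings with gaps.
heart_rate = [None, 142, None, None, 150, None]
print(locf(heart_rate))  # [None, 142, 142, 142, 150, 150]
```

Because leading gaps cannot be filled this way, a second imputation method is typically still needed for variables with no early observation.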

Thakur et al. [36] proposed two different prediction models for low-resource and developing countries using non-invasive and invasive parameters extracted from the MIMIC III database. The authors applied binary logistic regression to develop prediction models and used the Chi-square test and Fisher’s exact test as goodness-of-fit tests. They extracted a total of seven parameters within a time window of 12 h, including blood culture reports, temperature, CSF culture result, HR, systolic and diastolic pressure, white blood cell count, and bands. In the derivation and validation phase, 1472 neonates were divided into two datasets in the ratio of 70/30. Binary logistic regression was used to calculate the predictive power of invasive parameters (platelets, bands, and WBC) and non-invasive parameters such as temperature, blood pressure, and heart rate. The authors concluded that both prediction models derived from invasive and non-invasive parameters performed equally well, with an AUROC of 0.777 and 0.824 in the derivation dataset and an AUROC of 0.830 and 0.824 in the validation dataset.
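Since AUROC is the headline metric in this and most of the reviewed studies, a small worked example may help. The sketch below computes it by pairwise comparison of predicted scores; the function name `auroc` and the toy labels and scores are hypothetical and unrelated to the study data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as half).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

The pairwise form makes clear why the metric does not depend on a classification threshold or on class proportions: only the ranking of positives against negatives matters.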

Lopez-Martinez et al. [24] proposed a deep learning approach to predict early neonatal sepsis. They constructed a sequence of ANNs with increasing numbers of hidden layers and nodes, trained the networks over a set of hyperparameters, and normalized the data using Gaussian normalization. The network model achieved a sensitivity of 80.3%, a specificity of 90.4%, a precision of 83.1%, and an AUC of 92.5%. The authors concluded that the neural network model could be used as an inference system to aid in detecting neonatal sepsis.
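The normalization step can be sketched as follows, assuming that "Gaussian normalization" refers to standardizing each feature to zero mean and unit standard deviation; that reading is an assumption, and the function name `standardize` is hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
print(standardize([2.0, 4.0, 6.0]))
```

In practice the mean and standard deviation would be estimated on the training data only and then reused for validation data, so that no information leaks from the held-out set.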

Gomez et al. [37] proposed a noninvasive machine learning algorithm for early detection of neonatal sepsis using HRV monitoring features. The study included 79 newborn babies with less than 48 h of life and a gestational age between 36 and 41 weeks. Electrocardiogram signal data was recorded, and HRV parameters were calculated. The authors developed supervised machine learning algorithms and calculated key metrics such as sensitivity, specificity, positive predictive value, negative predictive value, and area under the ROC curve. The study results indicated an AUC of 0.94 using the adaptive boosting machine learning model, an AUC of 0.88 for the bagged trees model, and an AUC of 0.84 for random forests. The authors concluded that neonatal sepsis can be identified using HRV parameters with the use of the adaptive boosting algorithm, which promises better results.
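The metrics listed above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function names and the counts are hypothetical and do not come from the study.

```python
def _ratio(numerator, denominator):
    """Divide safely, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "ppv": _ratio(tp, tp + fp),          # positive predictive value (precision)
        "npv": _ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts for 30 neonates, 10 of them with sepsis.
print(diagnostic_metrics(tp=8, fp=2, tn=18, fn=2))
```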

Hu et al. [38] proposed an Artificial Neural Network model for detecting late-onset neonatal sepsis. The authors placed data collectors in NICUs to gather real-time vital sign data, including HR, RR, and blood oxygen saturation (SpO2). They transformed the vital sign data into images and labeled each sample as normal or sepsis. After extracting features, they developed a 14-layer CNN. The authors concluded that they had demonstrated the feasibility of using a CNN for sepsis detection from vital signs. However, they suggested further work using more powerful methods such as LSTM.

Thakur et al. [39] proposed a noninvasive prediction model to detect sepsis in neonates in developing countries and compared its performance with an invasive prediction model. The study used data extracted from the MIMIC III database, with 1447 samples filtered from a sample of 7875 EHR records. Two logistic regression models were developed for invasive and noninvasive parameters separately. The results showed that the area under the receiver operating characteristic curve for predicting sepsis in neonates using noninvasive parameters was 0.879 (95% CI), with a sensitivity of 74.81% (95% CI), specificity of 88.75% (95% CI), and PPV of 49.75% (95% CI). The authors concluded that both invasive and noninvasive models performed well and suggested that further improvements to these models could help frontline health workers and semi-skilled workers in the early prediction of neonatal sepsis, leading to a decrease in mortality rates in developing countries.
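A PPV near 50% alongside a specificity near 89% is expected when the condition is uncommon in the sample, because PPV depends on prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule using the sensitivity and specificity reported above; the prevalence values are illustrative assumptions, not figures from the study, and the function name is hypothetical.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Reported sensitivity and specificity; prevalence values are assumed.
for prevalence in (0.05, 0.13, 0.30):
    print(prevalence, round(ppv_from_prevalence(0.7481, 0.8875, prevalence), 3))
```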

Masino et al. [15] developed ML models for early recognition of sepsis in the NICU using electronic health records at least 4 h before clinical manifestation. The authors conducted a retrospective control study considering 36 features derived from electronic health records and developed 8 machine learning models using 10-fold nested cross-validation. The results show that 6 of the 8 models achieved a mean area under the ROC curve of 0.80–0.82; when both culture-positive and clinically positive sepsis cases were included, the same models achieved a mean area under the ROC curve of 0.85–0.87. The authors concluded that machine learning predictive models could identify neonates with sepsis before their clinical recognition.
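Nested cross-validation, used here and in Mani et al. [16], keeps model tuning inside each outer training fold so that the outer test fold gives an unbiased performance estimate. The following is a minimal sketch of how the index splits are organized; it omits shuffling and stratification, and the function names are hypothetical rather than taken from either study.

```python
def k_fold_indices(indices, k):
    """Split a list of indices into k (train, test) pairs."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    Inner splits are drawn only from the outer training set, so the outer
    test fold never influences feature selection or hyperparameter tuning.
    """
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        yield outer_train, outer_test, k_fold_indices(outer_train, inner_k)


for outer_train, outer_test, inner in nested_cv_splits(20, outer_k=5, inner_k=4):
    print(len(outer_train), len(outer_test), len(inner))
```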

Goldberg et al. [40] introduced a validated prediction model for early diagnosis of late-onset neonatal sepsis. The model incorporates the clinical evaluation of the NLR and CRP. The authors conducted a retrospective case–control study, extracting data from medical files, and utilized univariate and multivariate logistic regression methods to identify the risk of sepsis. The authors concluded that the combination of clinical evaluation with NLR and CRP values can be utilized to predict LOS in neonates.

Song et al. [12] conducted a study to develop a prediction model for LONS using noninvasive vital sign data and machine learning technology. The model was designed to detect clinical sepsis 48 h before it occurred. The model’s performance was comparable to those using invasive data, with key markers being blood pressure, oxygen saturation, and body temperature. The authors confirmed that a machine learning-based LONS prediction model could be developed using vital sign data regularly measured in clinical settings.

Alvi et al. [41] developed an ANN model for early detection of neonatal sepsis. The authors used under-sampling and oversampling techniques to balance the dataset and fed a set of hyperparameters to the ANN. They reported that the model achieved a true positive rate of 98.4%, a true negative rate of 98.1%, a precision of 96.8%, an AUC of 99.8%, and an accuracy of 98.2%. The better performance of the ANN model was attributed to selecting the correct set of features based on the strength score of the dependent variable. The authors also concluded that the developed ANN model helps detect positive and negative sepsis cases and can be used as a decision-support tool.
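Class balancing of the kind described above can be sketched with simple random oversampling. The specific resampling techniques used by the authors are not detailed here, so this is an illustration of one common option under that assumption; the function name and toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies only for classes below the majority size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: 2 sepsis cases (label 1) and 8 controls (label 0).
xs, ys = random_oversample(list(range(10)), [1, 1] + [0] * 8)
print(ys.count(0), ys.count(1))  # 8 8
```

Resampling is normally applied to the training portion only; duplicating cases before the train/test split would let copies of the same neonate appear on both sides and inflate the reported performance.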

Helguera Repetto et al. [42] introduced a predictive model by training and validating an ANN algorithm using a balanced dataset that included preterm and term neonates. The authors found that the ANN model’s performance was better than physician diagnosis based on traditional scoring systems. Additionally, the model performed well compared to other advanced methods that used maternal and neonatal variables.

Alvi et al. [43] proposed a deep learning model for predicting the early onset of neonatal sepsis using non-invasive data. Deep learning models are designed to work with time series, sequential, or image data. The authors built two neural networks, CNN and LSTM-RNN, to classify subjects and determine whether they are affected by sepsis. The study’s results indicated that, with identical precision and recall percentages, CNN can predict both positive and negative cases well. The recall of LSTM-RNN is better, at 1.0. LSTM-RNN is considered good at predicting the early onset of neonatal sepsis, with an F-measure of 98.04% and an accuracy of 99.40%. The authors concluded that although CNN’s performance is lower than that of ANN and RNN, it performed better than traditional ML algorithms. LSTM-RNN outperforms CNN and ANN.

Honore et al. [44] conducted a study investigating the predictive value of machine learning-assisted analysis of non-invasive, high-frequency monitoring data and demographic factors to detect neonatal sepsis. The study involved 325 infants with a total of 2866 hospitalization days. Personalized event timelines were created to monitor interventions and clinical findings. Time-domain features from heart rate, respiratory rate, and oxygen saturation values, as well as demographic factors, were analyzed. The Naive Bayes algorithm within a maximum a posteriori framework was used to predict sepsis up to 24 h before clinical suspicion. The study identified twenty cases of sepsis and found that combining multiple vital signs improved the algorithm’s performance compared to using only heart rate characteristics. This allowed for sepsis prediction with an area under the receiver operating characteristic curve of 0.82 up to 24 h before clinical suspicion. Additionally, the risk of sepsis increased 150-fold 10 h before clinical suspicion. The authors concluded that the non-invasive patient data used in the algorithm provides applicable predictive value for the early detection of neonatal sepsis.
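The combination of Naive Bayes with a maximum a posteriori decision can be sketched as follows. This is a generic Gaussian Naive Bayes classifier on raw values, not the authors' feature pipeline; the function names, feature choice (heart rate and SpO2), and all readings are hypothetical.

```python
import math
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels):
    """Estimate per-class priors and per-feature Gaussian parameters."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        columns = list(zip(*cls_rows))
        model[cls] = {
            "log_prior": math.log(len(cls_rows) / len(rows)),
            "means": [mean(c) for c in columns],
            # Small floor keeps the variance strictly positive.
            "variances": [max(pvariance(c), 1e-9) for c in columns],
        }
    return model


def predict_map(model, row):
    """Return the class with the highest posterior (MAP decision)."""
    def log_posterior(params):
        total = params["log_prior"]
        for x, mu, var in zip(row, params["means"], params["variances"]):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return total
    return max(model, key=lambda cls: log_posterior(model[cls]))


# Hypothetical (heart rate, SpO2) readings.
rows = [[140.0, 97.0], [138.0, 98.0], [145.0, 96.0], [142.0, 97.0],
        [165.0, 91.0], [170.0, 90.0], [160.0, 92.0]]
labels = ["control"] * 4 + ["sepsis"] * 3
model = fit_gaussian_nb(rows, labels)
print(predict_map(model, [168.0, 90.0]))
```

The class prior enters the decision through `log_prior`, which is what distinguishes the MAP rule from a pure maximum-likelihood comparison when sepsis cases are rare.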

Van den Berg et al. [45] conducted a study to develop a predictive algorithm to assist doctors in the early detection of LOS in very preterm infants. In a retrospective cohort study, all consecutively admitted preterm infants (GA < 32 weeks) from 2008 until 2019 were included. The subjects were classified as LOS or control according to blood culture results, the gold standard in the field. Routine and continuously measured oxygen saturation and heart rate data with a minute-by-minute sampling rate were extracted from electronic medical records to generate features. Machine learning techniques (generalized additive models, logistic regression, and XGBoost) were then employed to build a classification algorithm. Upon evaluation, the LR algorithm achieved an AUC of 0.73 (p < 0.05) at the moment of clinical suspicion (t = 0). In the longitudinal simulation, the algorithms detected LOS in at least 47% of the patients before clinical suspicion without exceeding the alarm fatigue threshold of 3 alarms per day. This suggests that the algorithm, based on routinely collected data, has the potential to accelerate clinical decisions in the early detection of LOS, even with limited inputs.

Robi et al. [46] applied a classification stacking model to neonatal diseases such as sepsis, a leading cause of neonatal deaths. The stacking model was compared to three related machine learning models: XGB, RF, and SVM. The proposed stacking model outperformed the other models, with an accuracy of 97.04%. This performance underscores the potential of the developed predictive models to contribute to the early detection and accurate diagnosis of neonatal diseases, particularly in resource-limited health facilities.

Kallonen et al. [47] conducted a study aimed at predicting late-onset sepsis (LOS) in newborns by analyzing noninvasive biosignal measurements from NICU monitors. The authors developed a machine learning algorithm based on a CNN structure, which achieved 83% sensitivity and 73% specificity for detecting LOS 44 h before clinical suspicion. The electrocardiogram and respiratory impedance signals were crucial in improving predictive accuracy. The authors concluded that this approach offers state-of-the-art prediction performance for LOS without requiring artifact removal.

Apart from the studies using machine learning and artificial neural networks, a few other studies have used traditional statistical methods to predict neonatal sepsis clinically; these are summarized below.

Selimovic et al. [48] developed a predictive score to diagnose early-onset sepsis using CBC and CRP levels. The authors concluded that the predictive score for EONS is useful in the diagnostic evaluation of neonates suspected of early-onset neonatal sepsis.

Okascharoen et al. [49] proposed a practical clinical prediction scoring model for the diagnosis of late-onset sepsis using weighted coefficients from Cox’s proportional hazards model and ROC analysis. The authors also validated the model, which showed good performance.

Kuzniewicz et al. [50] proposed multivariable prediction models to estimate the risk of early-onset sepsis in preterm as well as term neonates. The study was conducted on a cohort of infants born at Kaiser Permanente Northern California Hospital during the period 2010 to 2015 to examine antibiotic administration, blood culture tests, clinical outcomes, and readmissions to the hospital for early-onset sepsis. Segmented regression models were used to predict the onset of sepsis. The authors concluded that use of the early-onset sepsis calculator resulted in a decrease in blood culture tests from 14.5 to 4.9%, while empirical antibiotic administration in the first 24 h fell from 5.0 to 2.6%, and antibiotic administration between 24 and 72 h of birth fell from 0.5 to 0.4%. Their study indicated the importance of the multivariable risk prediction model in clinical care for reducing the proportion of neonates undergoing extensive laboratory testing and empirical antibiotic treatment without any adverse effects.

Goldberg et al. [40] proposed a validated prediction model for early diagnosis of late-onset neonatal sepsis with clinical evaluation of NLR and CRP. The authors conducted a retrospective case–control study with data extracted from medical files and applied univariate and multivariate logistic regression methods to identify the risk of sepsis. The authors concluded that a clinical evaluation in conjunction with NLR and CRP values could be used to predict late-onset sepsis in neonates.

Huang et al. [51] proposed using nomograms to predict late-onset sepsis in preterm infants. A development cohort of 1256 preterm infants and a validation cohort of 452 preterm infants were included, and the nomograms were built with and without thyroid function. The authors concluded that thyroid hypofunction in preterm infants may increase the incidence of sepsis.

Fig. 1 Systematic literature review using PRISMA methodology

Table 1 Analysis of 16 studies found in the literature

Fig. 2 Distribution of papers from the literature

3 Results and discussion

Of the 16 studies selected (as per Table 1), 15 were retrospective and one was conducted on prospective data. Most studies (70%) were based in developed countries such as the USA, Finland, the Netherlands, Australia, Israel, and Spain; few studies on neonatal sepsis diagnosis were conducted in developing countries. Late-onset neonatal sepsis diagnosis was of particular interest in studies conducted in developed countries. Most models (62.5%) were based on non-invasive parameters and involved no blood tests. ML models such as SVM, NB classifiers, AODE, sample-based classifiers, decision tree classifiers, CART, RF, LR, LBR, AB, BT, GB, GP, KNN, GAM, and XGBoost were used in the majority of the studies. ANNs, CNNs, and DL techniques were used to predict sepsis in 6 of the studies analyzed.

3.1 Key findings

Sepsis is a severe medical condition that requires prompt diagnosis and treatment. However, traditional diagnostic methods have inherent limitations that can delay recognition of the onset of sepsis. Machine learning algorithms have consistently demonstrated high effectiveness in predicting sepsis hours before it is clinically recognized [15]. The systematic review of various studies reaffirmed these results, showing better sensitivity and specificity than clinicians’ diagnosis of sepsis [42].
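For reference, sensitivity and specificity are computed from the counts of true and false positives and negatives. The sketch below shows the standard definitions; the counts passed in the example call are hypothetical and are not taken from any of the reviewed studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts.

    tp: septic neonates correctly flagged      fn: septic neonates missed
    tn: non-septic neonates correctly cleared  fp: non-septic neonates flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=180, fp=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A model is said to outperform clinical diagnosis on these measures when both values, computed on the same cohort, are higher than those of the clinicians.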

Invasive and non-invasive models were equally effective in predicting sepsis, with the latter providing useful predictive value and supporting clinical decision-making in the early diagnosis of late-onset neonatal sepsis [12, 36]. Additionally, prediction based on NLR and CRP can rapidly identify late-onset neonatal sepsis [40]. HRV was identified as one of the critical parameters for sepsis diagnosis. Among the machine learning algorithms used in the studies, AdaBoost was an effective tool for predicting sepsis [37].

As illustrated in a few articles, ANN and CNN models were effectively used as the inference engine of intelligent systems to enhance the detection of neonatal sepsis [24]. These models demonstrated high accuracy in predicting neonatal sepsis, with the CNN model capable of predicting its onset 24 h ahead [38]. Furthermore, ANNs can accurately predict neonatal sepsis and outperform physicians’ diagnoses [41]. The LSTM RNN was reported as the best-performing prediction model, with an accuracy of 99.40% [43]. When combined with PSD transformation, biosignals from NICU monitors provide excellent prediction performance for late-onset neonatal sepsis [47]. With its potential to support early diagnosis of neonatal sepsis, machine learning holds promise for healthcare, particularly in resource-limited facilities [46]. The performance of predictive models using non-invasive features is comparable to that of invasive models, which points towards more accessible and efficient healthcare [45]. Machine learning algorithms can improve patient outcomes by predicting sepsis hours before clinical recognition [16].

3.2 Strengths and limitations

3.2.1 Strengths

The systematic review analyzed an equal proportion of studies investigating early-onset and late-onset sepsis. It evaluated the use of both invasive and non-invasive parameters for developing machine learning models. The review covers articles published up to March 2024.

3.2.2 Limitations

Although comprehensive, this review was limited to studies published in English, which excluded a significant number of articles. Moreover, most of the included studies were conducted in developed countries, leaving a need for more research in developing countries. The potential impact of predictive models in diagnosing neonatal sepsis, especially in low-resource settings, is immense. More evidence is needed, and urgent and extensive research is required in the field of neonatal sepsis detection, particularly in the context of developing countries.

4 Literature review gaps and future directions

4.1 Gaps identified

Most existing ML studies on the early prediction of neonatal sepsis focus on LOS, come from developed countries, and are retrospective analyses of EHR data [15, 16, 18]. According to these studies, the principal causative organisms for EOS are Streptococcus agalactiae and Escherichia coli (in data from Western countries). Studies indicate that neonatal mortality is increasing due to a sporadic rise in AMR, and hence non-culture-based diagnosis needs to be encouraged. Moreover, in high-income countries, the culture-positive to culture-negative sepsis ratio ranges from 1:6 to 1:16 [23], and this ratio is likely skewed towards culture-negative sepsis in South Asia. Additionally, the pathogen profile in South Asia differs from that in high-income countries, which calls for localized algorithms.

Most research in predictive analytics is empirical in nature; therefore, a greater number of theoretical studies is also needed to develop the body of literature and advance academic research in the context of developing countries.

4.2 Future research directions

Research from developed countries states that early diagnosis and treatment of such critical medical conditions using predictive analytics methods reduces antibiotic use by at least 50% [52] and infections by 84% upon continuous use [23]. Additionally, due to resource constraints in LMICs, there is an imperative need for applications powered by predictive analytics methods to address the issue of low access to healthcare facilities.

However, a wider academic view is needed to understand how new technologies perform, or are accepted, in cultural contexts radically different from the well-tested American and European environments. Such a view should encompass the sociological milieu of gender, ethnicity, and other factors that influence the adoption of novel ways of addressing global health, especially among healthcare providers, facilitators, and their local and global partners. The field of AI thus calls for more studies to enhance the body of literature and for independent studies to address region-specific issues with global impact.

5 Conclusions

Predictive analytics-based models can predict the early manifestation of neonatal sepsis. They can be used as a rapid detection tool for both early-onset and late-onset neonatal sepsis. These methods can be valuable tools for clinicians to aid in effective decision-making and improved healthcare management of neonates, particularly in low-resource settings. Most of the prediction models were developed on clinical data sets from countries in North America and Europe or from Asian countries such as China, and their applicability to low-resource regions has seen subpar success rates. Therefore, it is imperative to develop prediction models with localized data sets and validate their adoption with healthcare practitioners.