Prediction of preterm birth in multiparous women using logistic regression and machine learning approaches

Belaghi, Reza Arabi

doi:10.1038/s41598-024-60097-4

Prediction of preterm birth in multiparous women using logistic regression and machine learning approaches

Article
Open access
Published: 20 September 2024

Volume 14, article number 21967, (2024)
Cite this article

Download PDF

You have full access to this open access article

Scientific Reports

Prediction of preterm birth in multiparous women using logistic regression and machine learning approaches

Download PDF

Reza Arabi Belaghi^1,2

1 Altmetric

Abstract

To predict preterm birth (PTB) in multiparous women, comparing machine learning approaches with traditional logistic regression. A population-based cohort study was conducted using data from the Ontario Better Outcomes Registry and Network (BORN). The cohort included all multiparous women who delivered a singleton birth at 20–42 weeks’ gestation in an Ontario hospital between April 1, 2012 and March 31, 2014. The primary outcome was PTB < 37 weeks, with spontaneous PTB analyzed as a secondary outcome. Stepwise logistic regression and the Boruta machine learning were used to select the important variables during the first and second trimester. For building prediction models, the whole data set were divided for the two independent parts: two-third for training the classifiers (Logistic regression, random forests, decision trees, and artificial neural networks) and one-third for model validation. Then, the training data set were balanced by random over sampling technique. The best hyper parameters were obtained by the tenfold cross validation. The performance of all models was evaluated by sensitivity, specificity, positive predictive value, negative predictive value, and the area under the receiver operating characteristics (AUC). The cohort included 145,846 births, of which 8125 (5.57%) were preterm. In first-trimester models, the strongest predictors of PTB were previous PTB, preexisting diabetes, and abnormal pregnancy‐associated plasma protein-A. In the testing data set, the highest predictive ability was seen for artificial neural networks, with an area under the receiver operating characteristic curve (AUC) of 68.8% (95% CI 67.6–70.1%). In second-trimester models, addition of infant sex, attendance at first-trimester appointment, medication exposure, and abnormal alpha-fetoprotein concentrations increased the AUC to 72.1% (95% CI 71.1–73.1%) with logistic regression. With the inclusion of the variable complications during pregnancy, the AUC increased to 80.5% (95% CI 79.6–81.5%) using logistic regression. For both overall and spontaneous PTB, during both the first and second trimesters, models yielded negative predictive values of 97%. Overall, machine learning and logistic regression produced similar performance for prediction of PTB. For overall and spontaneous PTB, both first- and second-trimester models provided negative predictive values of ~ 97%, higher than that of fetal fibronectin.

Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015

Article Open access 09 November 2022

Predicting risk of stillbirth and preterm pregnancies with machine learning

Article Open access 25 March 2020

Antenatal prediction models for outcomes of extremely and very preterm infants based on machine learning

Article 11 December 2022

Discover the latest articles, news and stories from top researchers in related subjects.

Introduction

An estimated 15 million infants are born preterm (less than 37 weeks of completed gestation) each year worldwide, of whom approximately 1 million do not survive the first year of life^1,2,3. Complications of preterm birth (PTB) can lead to many short- and long-term medical conditions in the child, including respiratory distress syndrome, chronic lung disease, cardiovascular disorders, asthma, and loss of hearing and vision⁴. Consequently, the economic costs and public health burden of PTB are substantial. Yearly costs have been estimated at £2.946 billion in the UK⁵ and $8 billion CAD in Canada⁶, where PTB has been linked to two-thirds of infant deaths⁷. Compared to the UK, Canada, and Western European countries, the US has a higher rate of PTB (10.02% in 2018)^8,9, with annual costs of $26.2 billion¹⁰.

In light of the potential to reduce infant and childhood morbidity and mortality, reduction of PTB is a major public health priority¹¹. Accurate and reliable prediction of PTB would enable preventative interventions to diminish morbidity and mortality¹², thus benefitting pregnant women, neonates, healthcare systems, and society as a whole. However, recent studies have shown that prediction of PTB is a challenging task, with prediction models to date yielding an area under the receiver operating characteristic curve (AUC) of around 70%¹³.

Machine learning techniques present the potential for improved prediction of PTB. Use of machine learning has grown rapidly in a wide range of medical disciplines including cancer prediction, cardiovascular diagnostics, image analysis^14,15, and maternal postpartum complications¹⁶, and machine learning was highlighted as one of the most important advances of 2019 for prenatal diagnosis¹⁷. The main advantage of machine learning methods over other methods such as linear regression is the possibility of relaxing assumptions regarding multicollinearity, additivity, and distribution¹⁸.

Machine learning can be conducted using multiple types of learning algorithms, among which decision trees, random forests, and artificial neural networks are commonly used in medical disciplines¹⁹. Decision trees constitute the most straightforward algorithm and provide a visual representation of the relation between predictors and outcome variables. However, the high variance in the results of the decision trees can in some cases be improved by using random forests, which aggregate results of randomly generated decision trees to produce a more accurate model²⁰. Artificial neural networks present a third option that has been broadly used in medical studies²¹ and performs well when complex and non-linear associations exist between variables²². Briefly, artificial neural networks use predictors as inputs and connect them to multiple hidden layer combinations by assigning suitable weights to predict the outcome²³.

This study identified predictors of PTB in a large cohort of multiparous women using multiple state-of-the-art machine learning algorithms and compared results obtained from these algorithms with those from traditional logistic regression analyses. Predictors of PTB that have not been commonly considered in prior prediction models, such as proteins used for screening for placental diseases and Down Syndrome, were included. Also the predictive models were developed using training datasets and then tested using validation data, as is recommended by guidelines^24,25.

Methods

Data and population

A population-based retrospective cohort study was performed using data from Ontario’s Better Outcomes Registry and Network (BORN)²⁶. A description of the reliability of BORN to provide accurate data is provided by Miao et al.²⁶ In the present study, multiparous women with a singleton birth at 20–42 weeks’ gestation delivered in an Ontario hospital between April 1, 2012, and March 31, 2014, were included. The primary outcome, PTB, was defined as gestational age at birth less than 37 weeks. Spontaneous PTB was also considered as a secondary outcome. Spontaneous PTB was identified by births categorized as not “induced”, not “caesarean section” and not “augmented labor”²⁷.

The potential predictors of PTB available in the BORN database were identified based on knowledge to date regarding the etiology of PTB^11,28. Potential predictors considered for first-trimester models included maternal age, height, pre-pregnancy body mass index (BMI), pre-existing physical health conditions (Table S1), pre-existing mental health conditions (Table S2), socioeconomic variables (Table S3), smoking status, alcohol consumption during pregnancy, folic acid use, gravidity, number of prior abortions (including miscarriages), number of prior PTBs, number of prior term births, number of prior vaginal births, number of prior caesarean births, history of vaginal birth following caesarean section, history of stillbirth, gestational weight gain during the first trimester, diabetes, use of assisted reproductive technologies, and antenatal health care provider. The impact of pregnancy-associated plasma protein-A concentration and free beta-subunit of human chorionic gonadotropin were also considered as potential markers of placental diseases including preeclampsia²⁹, as well as nuchal translucencies³⁰.

In second-trimester models, all of the variables from the first trimester were considered. Moreover, the protein concentrations of dimeric inhibin-A³², unconjugated estriol³³, human chorionic gonadotropin³⁴, and alpha-fetoprotein³⁵ were added. Furthermore, the intention to breastfeed, attendance at first-trimester appointments, hypertensive disorder of pregnancy, gestational diabetes, infection, medication exposure, fetal sex, pregnancy complications³¹ were included as potential predictors of PTB during the second trimester as well. Complications of pregnancy contained more than 600 categories, which were combined into three groups based on the expertise of our in-house maternal–fetal specialist: No complications, Moderate complications, and severe complications (including hypertensive, placental abnormalities, and maternal problems during the pregnancy, such as antepartum hemorrhaging). Also, biomarker concentrations and nuchal translucencies were categorized into three groups: normal, abnormal, and missing (see Table S4 for cutoffs).

Descriptive analyses

The associations between predictors and outcome variables were measured by applying chi-square and t-tests for categorical and continuous predictors respectively, with statistical significance defined as p < 0.05. Then crude odds ratios (ORs) between individual predictors and the primary and secondary outcomes were estimated using univariate logistic regression.

Model selection and evaluation

First the whole cleaned data set was divided into two portions: two-thirds of the data were used for model building (training), while the remaining one-third of the data were used for model testing (validation). For each of the algorithms, the tenfold cross-validation was used in the balanced (random oversampling^36,37) training data to identify the optimal model producing the highest area under the receiver operating characteristic curve (AUC). Finally the model performance is assessed in the validation data by sensitivity, specificity, positive predictive value, negative predictive value, and AUC³⁸.

Machine learning algorithms were executed using R software (version 3.5.2) and the caret³⁹ package. The receiver operating characteristic curves (ROCs) and AUCs were obtained by using the pROC package in R. Variable selection for the machine learning models was performed using the Boruta package⁴⁰. Boruta computes the importance of each variable as a measure of predictive ability. The higher the importance of a variable, the more predictive power in the model⁴¹. For missing values of prediction variables, ten multiple imputations were produced by chained equations⁴² using the R package MICE⁴³ and substituted missing values with the average and mode for continuous and categorical variables, respectively. Variable selection for the logistic regression analyses was performed using the stepAIC function in the MASS package.

Consent to participate

Informed consent was obtained from all subjects and/or their legal guardian(s).

Ethics approval

This study followed the guidelines for the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis²⁵. The Hamilton Integrated Research Ethics Board approved the study before its commencement (approval # 14-714-C).

Results

The cohort included 145,846 births, of which 8125 (5.57%) were preterm (Table 1). Nearly 30% of the study population was over 35 years of age. The mean maternal pre-pregnancy BMI was 25.9 ± 6.6 kg/m², and the mean maternal height was 163.6 cm ± 7.3 cm. Twenty-one percent of women had at least one medical condition and approximately 40% had at least one previous abortion (including miscarriages) (Table S5). Of those patients who had experienced prior abortions (including miscarriages), 4.9% had more than three. Overall, 2.0% of pregnant women had a history of stillbirth, while 8.6% had a history of PTB and 24.8% had a previous caesarean section.

Table 1 Distribution of preterm birth and maternal baseline demographic and clinical variables in a population-based cohort study for prediction of preterm birth in multiparous women.

Full size table

Univariable analysis

Twenty-nine potential first-trimester predictors of PTB were analyzed (Table S6). Patients who were older than 35 years of age, shorter than 160 cm, obese prior to pregnancy, smokers, patients who conceived using ovulation induction including IVF, and those who had prior medical conditions including diabetes or low pregnancy-associated plasma protein-A concentrations were more likely to experience PTB than patients without these conditions. Furthermore, patients with a history of abortion, PTB, or caesarean section had a higher likelihood of experiencing PTB than patients who did not have a history of such conditions. Results of the univariate analysis for second-trimester predictors are shown in Table S7. Patients who did not attend their first trimester clinical evaluation, failed to enroll in prenatal education programs, or had abnormal protein (pregnancy-associated plasma protein-A, dimeric inhibin-A, unconjugated estriol, human chorionic gonadotropin) levels were more likely to experience PTB than a term birth. Furthermore, patients pregnant with a male baby, having a hypertensive disorder, or experiencing severe complications during pregnancy were at a higher risk of PTB than term birth.

Multivariable analysis

Stepwise logistic regression identified 18 significant first-trimester predictors of PTB (Fig. S1). Among these, the highest adjusted OR was seen for prior PTB (aOR for one previous PTB: 3.91; 95% confidence interval (CI) 3.52–4.33; OR for two previous preterm births: 4.33; 95% CI 3.59–5.20; OR for three or more previous preterm births: 6.37; 95% CI 4.48–8.94). Diabetes and abnormal pregnancy-associated plasma protein-A concentrations had the second highest ORs. Overall, women were at increased risk of PTB if they were shorter than 160 cm, underweight (< 18.5 kg/m²), had a lower education level, a history of health concerns, history of miscarriage or caesarean section, or experienced excess gestational weight gain. In contrast, patients with a history of term birth were less likely to give birth preterm.

In the second-trimester model, stepwise logistic regression identified 23 significant predictors of PTB (Figs. S1, S2), including all 18 predictors identified in the first-trimester model. The five additional predictors identified were fetal sex, attendance at the first-trimester clinical evaluation, pregnancy complications, medication exposure, and abnormal alpha-fetoprotein concentration. Severe complications during pregnancy exhibited the highest adjusted OR (12.37; 95% CI 11.61–13.18) in the second-trimester model. Patients who were pregnant with a male fetus were also at a higher risk of experiencing PTB. Conversely, patients exposed to medications, including vitamins and herbal supplements, were less likely to experience PTB.

Machine learning (Boruta method) identified 24-predictors of PTB during the first trimester (Table S8). As with logistic regression analyses, machine learning methods found that a history of PTB was the strongest predictor of subsequent PTB. However, in second-trimester models, machine learning methods did not identify fetal sex, prior abortions (including miscarriages), or maternal height as important predictors of PTB (Table S9). As with logistic regression analyses, complications during pregnancy, hypertensive disorder of pregnancy, and prior PTB were the most important predictors identified by machine learning.

Performance of prediction models in the validation set

Among first-trimester models, the random forests model had the highest sensitivity, specificity, and AUC in training samples (Table S10). However, in the testing data, artificial neural networks yielded the highest AUC (68.8%, 95% CI 67.6–70.1%), sensitivity (54.5%, 95% CI 52.4–56.7%) and negative predictive value (96.6%, 95% CI 96.3–96.8%) (Fig. 1, Table 2).

Table 2 Predictive power and 95% confidence intervals of models for preterm birth in multiparous women during the first and second trimesters in testing data.

Full size table

The AUC for both machine learning methods and logistic regression models rose substantially with inclusion of second-trimester predictors, with logistic regression yielding the highest AUC (80.5%, 95% CI 79.6–81.5%). The notable rise in AUC between first- and second-trimester models stemmed largely from the addition of pregnancy complications. Accordingly, a sensitivity analysis without this variable was performed. This analysis yielded an AUC of 72.1% (95% CI 71.1–73.1%) (Fig. 2). Among the second-trimester models, artificial neural networks yielded the highest AUC, while also maintaining a high negative predictive value of 97.4% (95% CI 97.3–97.6%). However, both artificial neural networks and random forests exhibited overfitting (high performance in training data but low performance in testing data) issue.

Prediction of spontaneous PTB

In first-trimester models predicting spontaneous PTB, logistic regression and artificial neural networks yielded similar AUCs (71.0% and 70.9% respectively) (Fig. S3). However, in second-trimester models, the AUC was slightly higher for artificial neural networks (73.8%, 95% CI 72.3–75.3%) compared to logistic regression (70.8%, 95% CI 69.1–72.4%) and other methods (Fig. S4). Artificial neural networks and logistic regression both yielded negative predictive values for spontaneous preterm birth of ~ 97% in first- and second-trimester models (Table S11).

Discussion

Main findings

Using data from a population-based retrospective cohort of multiparous women, a set of prediction models for PTB developed and validated demonstrated moderate predictive power^44,45, with an AUC of 68.8% in the first trimester. This study identified 18 first-trimester predictor variables, among which history of PTB, diabetes, and abnormal pregnancy-associated plasma protein-A concentrations were the strongest predictors. In second-trimester models, four additional significant predictors were identified (infant sex, antenatal care in the first trimester, medication exposure, and abnormal alpha-fetoprotein concentrations), resulting in a maximum AUC of 72.1%, which is considered an acceptable prediction accuracy^38,45. Inclusion of data on complications during pregnancy yielded an AUC of 80.5%, which is indicative of an acceptable prediction model^39,46.

For both the first and second trimesters, my models for overall and spontaneous PTB yielded negative predictive values higher than that of fetal fibronectin, whose negative predictive value was identified as 93% in a recent systematic review⁴⁶. Moreover, my second-trimester models generated using artificial neural networks yielded sensitivity and specificity similar to the overall sensitivity (58%) and specificity (84%) of fetal fibronectin⁴⁶.

Strengths and limitations

This study has several important strengths. First, logistic regression was applied and three of the most commonly used machine learning approaches to predict PTB. Such an approach contrasts with that used in previous studies, which have primarily focused on one or two machine learning methods^{47,48,49,50,51,52,53}. Moreover, new prediction variables from both the first and second trimesters were considered, which have also not been consistently included in previous studies^{8,9,10,11,12,13,14,48,49,50,51,54,55,56,57,58,59}. The proposed models yielded higher negative predictive values than fetal fibronectin for the prediction of PTB⁴⁶. Accordingly, eventual use of these models in clinical practice could reduce non-essential hospitalization and other interventions, in turn reducing costs and hospital staff burden⁶⁰.

An additional strength of this study is the use of a relatively large cohort compared to previous studies^{8,9,10,11,12,13,14}. Also, the prediction variables that have not been considered by previous studies, including plasma proteins and first-trimester gestational weight gain, were examined in this study. As suggested by Courtney et al.⁶¹, prenatal factors such as maternal health behaviors and medical history were also included in our study. Finally, an additional drawback of previous studies is the lack of information provided to address the issue of imbalanced classes^{8,9,10,11,12,13,14} which was covered by random-oversampling method.

Despite using state-of-the-art machine learning algorithms, the predictive power of the proposed models was limited, and the AUCs varied from 69 to 73% when information on pregnancy complications was not included. The inclusion of information on fetal fibronectin and ultrasonography (cervical length), which are important predictors of PTB⁶², as well as data on preventative measures used by patients with high-risk pregnancies^63,64, may improve predictive power. However, it should be noted that progesterone was rarely administered in Ontario during the timeframe of the data used in our study⁶⁵.

Conclusion

In a large cohort of multiparous women, machine learning (artificial neural networks) and logistic regression methods yielded generally similar accuracy for the prediction of PTB. However, for spontaneous PTB, artificial neural networks provided slightly better results than logistic regression. For overall and spontaneous PTB, both first- and second-trimester models provided very high negative predictive values, higher than that of fetal fibronectin.

Data availability

Data can be available from corresponding author upon reasonable request.

References

Blencowe, H. et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: A systematic analysis and implications. Lancet 379(9832), 2162–2172 (2012).
Article PubMed Google Scholar
Chawanpaiboon, S. et al. Global, regional, and national estimates of levels of preterm birth in 2014: A systematic review and modelling analysis. Lancet Glob. Health 7(1), e37–e46 (2019).
Article PubMed Google Scholar
Liu, L. et al. Global, regional, and national causes of child mortality in 2000–13, with projections to inform post-2015 priorities: An updated systematic analysis. The Lancet 385(9966), 430–440 (2015).
Article Google Scholar
Saigal, S. & Doyle, L. W. An overview of mortality and sequelae of preterm birth from infancy to adulthood. Lancet 371(9608), 261–269 (2008).
Article PubMed Google Scholar
Mangham, L. J., Petrou, S., Doyle, L. W., Draper, E. S. & Marlow, N. The cost of preterm birth throughout childhood in England and Wales. Pediatrics 123(2), e312–e327 (2009).
Article PubMed Google Scholar
Lim, G. et al. CIHI survey: Hospital costs for preterm and small-for-gestational age babies in Canada. Healthc. Q. 12(4), 20–24 (2009).
Article PubMed Google Scholar
Government of Canada Invests in Better Health for Premature Babies. https://www.newswire.ca/news-releases/government-of-canada-invests-in-better-health-for-premature-babies-622089233.html (2019).
Bronstein, J. M., Wingate, M. S. & Brisendine, A. E. Why is the US preterm birth rate so much higher than the rates in Canada, Great Britain, and Western Europe? Int. J. Health Serv. 48(4), 622–640 (2018).
Article PubMed Google Scholar
Preterm Birth Rates are from the National Center for Health Statistics, 2018 Final Natality Data [Internet]. MARCH OF DIMES. https://www.marchofdimes.org/mission/reportcard.aspx (2019).
The Impact of Premature Birth on Society. https://www.marchofdimes.org/mission/the-economic-and-societal-costs.aspx (2020).
Frey, H. A. & Klebanoff, M. A. The epidemiology, etiology, and costs of preterm birth. Semin. Fetal Neonat. Med. 21(2), 68–73 (2016).
Article Google Scholar
Suff, N., Story, L. & Shennan, A. The prediction of preterm delivery: What is new? Semin. Fetal Neonat. Med. 24(1), 27–32 (2019).
Article Google Scholar
Ferrero, D. M. et al. Cross-country individual participant analysis of 4.1 million singleton births in 5 countries with very high human development index confirms known associations but provides no biologic explanation for 2/3 of all preterm births. PLoS ONE 11(9), e0162506 (2016).
Article PubMed PubMed Central Google Scholar
Jalali, A., Simpao, A. F., Gálvez, J. A., Licht, D. J. & Nataraj, C. Prediction of periventricular leukomalacia in neonates after cardiac surgery using machine learning algorithms. J. Med. Syst. 42(10), 177 (2018).
Article PubMed Google Scholar
Kourou, K., Exarchos, T. P., Exarchos, K. P., Karamouzis, M. V. & Fotiadis, D. I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 13, 8–17 (2015).
Article CAS PubMed Google Scholar
Betts, K. S., Kisely, S. & Alati, R. Predicting common maternal postpartum complications: Leveraging health administrative data and machine learning. BJOG 126(6), 702–709 (2019).
Article CAS PubMed Google Scholar
Chitty, L. S. et al. In case you missed it: The prenatal diagnosis editors bring you the most significant advances of 2019. Prenat. Diagn. 40(3), 287–293 (2020).
Article CAS PubMed Google Scholar
Hastie, T., Tibshirani, R. & Friedman, J. Elements of Statistical Learning: Data Mining, Inference, and Prediction 2nd edn. (Springer, 2009).
Book Google Scholar
Wiemken, T. L. & Kelley, R. R. Machine learning in epidemiology and health outcomes research. Annu. Rev. Public Health 41, 1 (2020).
Article Google Scholar
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning: With Applications in R 7th edn. (Springer, 2019).
Google Scholar
DeGregory, K. W. et al. A review of machine learning in obesity. Obes. Rev. 19(5), 668–685 (2018).
Article CAS PubMed PubMed Central Google Scholar
Hassanipour, S. et al. Comparison of artificial neural network and logistic regression models for prediction of outcomes in trauma patients: A systematic review and meta-analysis. Injury 50(2), 244–250 (2019).
Article PubMed Google Scholar
Lisboa, P. J. & Taktak, A. F. G. The use of artificial neural networks in decision support in cancer: A systematic review. Neural Netw. 19(4), 408–415 (2006).
Article PubMed Google Scholar
Grant, S. W., Collins, G. S. & Nashef, S. A. M. Statistical primer: Developing and validating a risk prediction model. Eur. J. Cardiothorac. Surg. 54(2), 203–208 (2018).
Article PubMed Google Scholar
Moons, K. G. M. et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): Explanation and elaboration. Ann. Intern. Med. 162(1), W1–W73 (2015).
Article PubMed Google Scholar
Miao, Q., Fell, D. B., Dunn, S. & Sprague, A. E. Agreement assessment of key maternal and newborn data elements between birth registry and Clinical Administrative Hospital Databases in Ontario, Canada. Arch. Gynecol. Obstet. 300(1), 135–143 (2019).
Article PubMed Google Scholar
Maghsoudlou, S., Beyene, J., Yu, Z. M. & McDonald, S. D. Phenotypic classification of preterm birth among multiparous women: A population-based cohort study. J. Obstet. Gynaecol. Can. 41(10), 1433–1443 (2019).
Article PubMed Google Scholar
Goldenberg, R. L., Culhane, J. F., Iams, J. D. & Romero, R. Epidemiology and causes of preterm birth. The Lancet 371(9606), 75–84 (2008).
Article Google Scholar
Pillay, P., Moodley, K., Moodley, J. & Mackraj, I. Placenta-derived exosomes: Potential biomarkers of preeclampsia. Int. J. Nanomed. 12, 8009–8023 (2017).
Article CAS Google Scholar
Bilagi, A. et al. Association of maternal serum PAPP-A levels, nuchal translucency and crown-rump length in first trimester with adverse pregnancy outcomes: Retrospective cohort study. Prenat. Diagn. 37(7), 705–711 (2017).
Article CAS PubMed Google Scholar
Vahanian, S. A., Lavery, J. A., Ananth, C. V. & Vintzileos, A. Placental implantation abnormalities and risk of preterm delivery: A systematic review and metaanalysis. Am. J. Obstetr. Gynecol. 213(4), S78–S90 (2015).
Article Google Scholar
Jelliffe-Pawlowski, L. L. et al. Association of early preterm birth with abnormal levels of routinely collected first and second trimester biomarkers. Am. J. Obstet. Gynecol. 208(6), 492 (2013).
Article PubMed Central Google Scholar
Olsen, R. N., Dunsmoor-Su, R., Capurro, D., McMahon, K. & Gravett, M. G. Correlation between spontaneous preterm birth and mid-trimester maternal serum estriol. J. Matern. Fetal Neonatal Med. 27(4), 376–380 (2014).
Article PubMed Google Scholar
Bernstein, P. S. et al. Beta-human chorionic gonadotropin in cervicovaginal secretions as a predictor of preterm delivery. Am. J. Obstet. Gynecol. 179(4), 870–873 (1998).
Article CAS PubMed Google Scholar
Waller, D. K., Lustig, L. S., Cunningham, G. C., Feuchtbaum, L. B. & Hook, E. B. The association between maternal serum alpha-fetoprotein and preterm birth, small for gestational age infants, preeclampsia, and placental complications. Obstet. Gynecol. 88(5), 816–822 (1996).
Article CAS PubMed Google Scholar
Fotouhi, S., Asadi, S. & Kattan, M. W. A comprehensive data level analysis for cancer diagnosis on imbalanced data. J. Biomed. Inform. 90, 103089 (2019).
Article PubMed Google Scholar
Sun, Y., Wong, A. K. C. & Kamel, M. S. Classification of imbalanced data: A review. Int. J. Pattern Recogn. Artif. Intell. 23(04), 687–719 (2009).
Article Google Scholar
Hosmer, D. W. & Lemeshow, S. Applied Logistic Regression 2nd edn. (Wiley, 2000).
Book Google Scholar
Wing, M. K. C. et al. caret: Classification and Regression Training. https://CRAN.R-project.org/package=caret (Accessed 30 September 2019) (2019).
Kursa, M. B. & Rudnicki, W. R. Boruta: Wrapper Algorithm for All Relevant Feature Selection. https://CRAN.R-project.org/package=Boruta (Accessed 1 October 2019) (2018).
Witold, R. & Rudnicki, M. B. K. Feature selection with the boruta package. J. Stat. Softw. 36(11), 1–13 (2010).
Google Scholar
Stuart, E. A., Azur, M., Frangakis, C. & Leaf, P. Multiple imputation with large data sets: A case study of the children’s mental health initiative. Am. J. Epidemiol. 169(9), 1133–1139 (2009).
Article PubMed PubMed Central Google Scholar
van Buuren, S. & Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45(1), 1–67 (2011).
Google Scholar
Fischer, J. E., Bachmann, L. M. & Jaeschke, R. A readers’ guide to the interpretation of diagnostic test properties: Clinical example of sepsis. Intens. Care Med. 29(7), 1043–1051 (2003).
Article Google Scholar
Akobeng, A. K. Understanding diagnostic tests 3: Receiver operating characteristic curves. Acta Paediatr. 96(5), 644–647 (2007).
Article PubMed Google Scholar
Melchor, J. C., Khalil, A., Wing, D., Schleussner, E. & Surbek, D. Prediction of preterm delivery in symptomatic women using PAMG-1, fetal fibronectin and phIGFBP-1 tests: Systematic review and meta-analysis. Ultrasound Obstet. Gynecol. 52(4), 442–451 (2018).
Article CAS PubMed Google Scholar
Grobman, W. A. et al. Prediction of spontaneous preterm birth among nulliparous women with a short cervix. J. Ultrasound Med. 35(6), 1293–1297 (2016).
Article PubMed PubMed Central Google Scholar
Fergus, P. et al. Prediction of preterm deliveries from EHG signals using machine learning. PLoS ONE 8(10), e77154 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Meertens, L. J. E. et al. Prediction models for the risk of spontaneous preterm birth based on maternal characteristics: A systematic review and independent external validation. Acta Obstetr. Gynecol. Scand. 97(8), 907–920 (2018).
Article Google Scholar
Lee, K. A. et al. A model for prediction of spontaneous preterm birth in asymptomatic women. J. Womens Health (Larchmt) 20(12), 1825–1831 (2011).
Article PubMed Google Scholar
Catley, C., Frize, M., Walker, R. C. & Petriu, D. C. Predicting high-risk preterm birth using artificial neural networks. IEEE Trans. Inf. Technol. Biomed. 10(3), 540–549 (2006).
Article PubMed Google Scholar
Podda, M. et al. A machine learning approach to estimating preterm infants survival: Development of the preterm infants survival assessment (PISA) predictor. Sci. Rep. 8(1), 13743 (2018).
Article ADS PubMed PubMed Central Google Scholar
Care, A. G. et al. Predicting preterm birth in women with previous preterm birth and cervical length ≥ 25 mm. Ultrasound Obstet. Gynecol. 43(6), 681–686 (2014).
Article CAS PubMed Google Scholar
Vovsha, I. et al. Using Kernel methods and model selection for prediction of preterm birth. In Machine Learning for Healthcare Conference 55–72. http://proceedings.mlr.press/v56/Vovsha16.html (Accessed 21 January 2019) (2016).
Weber, A. et al. Application of machine-learning to predict early spontaneous preterm birth among nulliparous non-Hispanic black and white women. Ann. Epidemiol. 28(11), 783–789 (2018).
Article PubMed Google Scholar
Podda, M. et al. A machine learning approach to estimating preterm infants survival: Development of the preterm infants survival assessment (PISA) predictor. Sci. Rep. 8, 213 (2018).
Article Google Scholar
Omani-Samani, R. et al. Cross-sectional study of associations between prior spontaneous abortions and preterm delivery. Int. J. Gynaecol. Obstet. 140(1), 81–86 (2018).
Article PubMed Google Scholar
Goodwin, L. K. et al. Data mining methods find demographic predictors of preterm birth. Nurs. Res. 50(6), 340–345 (2001).
Article CAS PubMed Google Scholar
Luo, W. Preterm Birth Prediction: Deriving Stable and Interpretable Rules from High Dimensional Data (2016).
Melchor, J. C. et al. Predictive performance of PAMG-1 vs fFN test for risk of spontaneous preterm birth in symptomatic women attending an emergency obstetric unit: Retrospective cohort study. Ultrasound Obstet. Gynecol. 51(5), 644–649 (2018).
Article CAS PubMed Google Scholar
Courtney, K. L., Stewart, S., Popescu, M. & Goodwin, L. K. Predictors of preterm birth in birth certificate data. Stud. Health Technol. Inform. 136, 555–560 (2008).
PubMed Google Scholar
Iams, J. D., Paraskos, J., Landon, M. B., Teteris, J. N. & Johnson, F. F. Cervical sonography in preterm labor. Obstet. Gynecol. 84(1), 40–46 (1994).
CAS PubMed Google Scholar
Jarde, A. et al. Effectiveness of progesterone, cerclage and pessary for preventing preterm birth in singleton pregnancies: A systematic review and network meta-analysis. BJOG 124(8), 1176–1189 (2017).
Article CAS PubMed Google Scholar
Jarde, A., Lutsiv, O., Beyene, J. & McDonald, S. D. Vaginal progesterone, oral progesterone, 17-OHPC, cerclage, and pessary for preventing preterm birth in at-risk singleton pregnancies: An updated systematic review and network meta-analysis. BJOG 126(5), 556–567 (2019).
Article CAS PubMed Google Scholar
Feng, Y. Y. et al. What interventions are being used to prevent preterm birth and when? J. Obstet. Gynaecol. Can. 40(5), 547–554 (2018).
Article PubMed Google Scholar

Download references

Acknowledgements

The author extends sincere gratitude to the editor and reviewers for their invaluable insights and constructive feedback, which greatly enhanced the quality of this academic paper. Additionally, the author would like to thank Dr. Sarah McDonald and Dr. Joseph Beyene for their valuable inputs and support in conducting this research.

Author information

Authors and Affiliations

Clinical Research Unit (CRU), CHEO Research Institute, University of Ottawa, Ottawa, Canada
Reza Arabi Belaghi
Department of Obstetrics and Gynecology, McMaster University, 1280 Main Street W, Hamilton, ON, L8S 4L8, Canada
Reza Arabi Belaghi

Authors

Reza Arabi Belaghi
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Reza Belaghi conducted the statistical analysis, interpretation of the results, manuscript writing, and submission.

Corresponding author

Correspondence to Reza Arabi Belaghi.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Belaghi, R.A. Prediction of preterm birth in multiparous women using logistic regression and machine learning approaches. Sci Rep 14, 21967 (2024). https://doi.org/10.1038/s41598-024-60097-4

Download citation

Received: 30 September 2023
Accepted: 18 April 2024
Published: 20 September 2024
DOI: https://doi.org/10.1038/s41598-024-60097-4
Springer Nature Limited

Prediction of preterm birth in multiparous women using logistic regression and machine learning approaches

Abstract

Similar content being viewed by others

Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015

Predicting risk of stillbirth and preterm pregnancies with machine learning

Antenatal prediction models for outcomes of extremely and very preterm infants based on machine learning

Introduction