A review of deep learning models and online healthcare databases for electronic health records and their use for health prediction

Nasarudin, Nurul Athirah; Al Jasmi, Fatma; Sinnott, Richard O.; Zaki, Nazar; Al Ashwal, Hany; Mohamed, Elfadil A.; Mohamad, Mohd Saberi

doi:10.1007/s10462-024-10876-2

Open access
Published: 13 August 2024

Volume 57, article number 249, (2024)

Artificial Intelligence Review

Nurul Athirah Nasarudin¹,
Fatma Al Jasmi¹^na1,
Richard O. Sinnott²^na1,
Nazar Zaki³^na1,
Hany Al Ashwal³^na1,
Elfadil A. Mohamed⁴^na1 &
…
Mohd Saberi Mohamad¹^na1

Abstract

A fundamental obstacle to healthcare transformation continues to be the acquisition of knowledge and insightful data from complex, high-dimensional, and heterogeneous biological data. As technology has improved, a wide variety of data sources, including omics data, imaging data, and health records, have become available for use in healthcare research. Electronic health records (EHRs), which are digitalized versions of medical records, have given researchers a significant opportunity to create computational methods for analyzing healthcare data. EHR systems typically keep track of all the data relating to a patient’s medical history, including clinical notes, demographic background, and diagnosis details. EHR data can offer valuable insights and support doctors in making better decisions related to disease and diagnostic forecasts. As a result, many researchers use deep learning to forecast diseases and track health trajectories in EHRs. Recent advances in deep learning technology have produced innovative and practical paradigms for building end-to-end learning models. However, scholars have limited access to online EHR databases, and there is an inherent need to address this issue. This research examines deep learning models, their architectures, and readily accessible online EHR databases. The goal of this paper is to examine how various architectures, models, and databases differ in terms of features and usability. It is anticipated that the outcomes of this review will lead to the development of more robust deep learning models that facilitate medical decision-making processes based on EHR data and inform efforts to support the selection of architectures, models, and databases for specific research purposes.

1 Introduction

With the advancement of technology, a substantial amount of health data has been accumulated in digital format. These data comprise both individual patient and population-level health information that is kept in electronic health records (EHRs). The goal of EHRs is to increase the effectiveness of a given healthcare system by managing patient medical records and minimizing pharmaceutical errors, with the underlying intention of providing better care (Kruse et al. 2016). Various data formats are captured in EHRs, including images, free text, integers, and symbols. These data can be divided into structured and unstructured categories. Structured data, such as patient demographics, blood pressure, and medication, are recorded using established metrics and are therefore simple to examine using conventional machine learning techniques. In contrast, unstructured data are narrative data that do not follow a structured format and are therefore more complex; they include medical images, pathology reports, and surgical notes (Sun et al. 2018). Unstructured data contain much more valuable information than structured data. However, as unstructured data in medical reports sometimes contain ambiguous information, it is challenging to extract meaningful information from them. Abbreviations, spelling mistakes, and grammatical errors are frequently present in clinical text, which makes the data challenging to analyze.

In recent years, an increasing number of hospitals have been using EHRs to store information related to patients’ medical histories. This has indirectly contributed to physicians’ use of computers for data records that contain a wide range of clinical data (Hornberger 2009). Clinical decision making can be supported by the vast amount of latent information that is present in EHRs. As such, there is an inherent need to create a comprehensive model for analyzing this type of data. Various artificial intelligence and machine learning models have been successfully used with EHRs over the past ten years to recognize, anticipate, evaluate, and categorize medical data (Maurya et al. 2021). Although numerous models have been created to accommodate EHR data, the heterogeneous nature of the data types (numerical data, date-time objects, free text, etc.) means that there are still issues that prevent the healthcare information from being fully leveraged (Goldstein et al. 2017). Furthermore, the majority of machine learning models have exhibited difficulties in identifying temporal patterns in data that contain numerous repeated measurements of the same variables. Some conventional models rely on reducing each time series to a single number, such as the mean, median, or another aggregate statistic (Xie et al. 2020). The failure to fully exploit the data’s temporal dynamics can result in the loss of valuable sequential information (Zhao et al. 2017).

Modern deep learning-based approaches have been proposed for the representation of temporal EHR data due to the limits of machine learning algorithms in addressing these issues. The temporal aspect of the EHR can be handled by these sequential deep learning models. Deep learning algorithms have proven to be more effective at modeling temporal EHR data in many applications due to their adaptability and generalizability (Yang et al. 2017; Hung et al. 2017; Reddy and Delen 2018; Park et al. 2020). Deep learning frameworks are designed to offer a complete system that learns from unprocessed data and autonomously completes predetermined tasks. Deep learning can be more advantageous than conventional machine learning because it can learn from the original data and has several hidden layers. It can analyze enormous amounts of data with excellent accuracy and performance. It can also learn abstract information based on input. As a result, it has been used in the medical profession by numerous academics. The use of deep learning to analyze generic EHR data has been detailed in a number of recent reviews (Solares et al. 2020; Shickel et al. 2017; Si et al. 2021; Xiao et al. 2018). However, a systematic and thorough overview of the technical difficulties and deep learning remedies for managing temporal EHR data is required.

The purpose of this research is to analyze current advancements in deep learning models and databases used for EHR analysis by outlining their associated features from the standpoint of key difficulties and the approaches adopted to overcome them. This paper is structured as follows: Sect. 2 provides a brief explanation of the typical deep learning architectures used for modeling EHR data, followed by a comparison table. Following that, Sect. 3 discusses a number of deep learning models used in EHR analysis for clinical prediction. The online databases that contain EHR data are examined in Sect. 4. Finally, the conclusions are discussed, and the future directions of deep learning modeling for EHR are highlighted by examining a number of crucial factors that require greater attention.

2 Deep learning architectures

Over the past few years, the amount of research applying deep learning to EHR analysis has been increasing rapidly. Due to the advancement of technology, a rich variety of deep learning architectures have been introduced. This section provides a brief explanation of the conventional deep learning architectures used for the modeling and analysis of EHRs. For a more thorough explanation, the basic equations underlying each architecture are also emphasized. A comparison of deep learning architectures is presented in Table 1.

2.1 Autoencoder

An autoencoder (AE) is an unsupervised deep learning architecture that learns patterns from high-dimensional data, such as EHR data (Zamanzadeh et al. 2021). An AE is constructed to perform dimensionality reduction based on a nonlinear transformation; the resulting encoding is known as the latent representation. Previous research has demonstrated that latent representation can extract useful clinical information from raw data (Beaulieu-Jones et al. 2016). Numerous forms of autoencoders have been developed, including variational autoencoders (VAE), sparse autoencoders (SAE), and denoising autoencoders (DAE). However, in general, all autoencoders have the same structure and functions, as shown in Fig. 1 (Vincent et al. 2010). As can be observed in Fig. 1, an AE is formed of three layers: an input layer, x; a single hidden layer, z; and a reconstruction layer, x̃. Data are reconstructed from x to x̃ via the encoding and decoding processes shown in (1) and (2). An AE constructs an encoder that converts the input to a latent representation in the hidden layer. It also develops a decoder that reconstructs the input from the latent representation. W and W′ are the respective encoding and decoding weights, b and b′ are the corresponding biases, and σ is the activation function. As this process serves to minimize the reconstruction error ||x − x̃||, the encoded representation z is deemed more reliable.

$$z = \sigma \left( {Wx + b} \right)$$

(1)

$$\tilde x = \sigma \left( {W'z + b'} \right)$$

(2)

An AE transforms the input data into a format in which only the most essential derived dimensions are stored. As such, they are comparable to standard dimensionality reduction techniques, like singular value decomposition (SVD) or principal component analysis (PCA). However, they can be more beneficial for solving complex problems due to nonlinear transformations via the activation function of each hidden layer.
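As an illustration of (1) and (2), the following is a minimal sketch of a single encoding and decoding pass in plain Python. The layer sizes, the random untrained weights, the sigmoid activation, and the helper names (`affine`, `encode`, `decode`, `reconstruction_error`) are assumptions made for this example and are not taken from any of the reviewed models; training, which would adjust the weights to reduce the reconstruction error, is omitted.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def affine(W, x, b):
    """Return W x + b for a matrix W (list of rows), a vector x and a bias b."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def encode(W, b, x):
    """Eq. (1): z = sigma(W x + b)."""
    return [sigmoid(a) for a in affine(W, x, b)]


def decode(W_prime, b_prime, z):
    """Eq. (2): x_tilde = sigma(W' z + b')."""
    return [sigmoid(a) for a in affine(W_prime, z, b_prime)]


def reconstruction_error(x, x_tilde):
    """Euclidean norm ||x - x_tilde||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_tilde)))


# Hypothetical sizes: 6 input features compressed to 2 latent dimensions.
rng = random.Random(0)
n_in, n_hidden = 6, 2
W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
W_prime = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
b_prime = [0.0] * n_in

x = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]  # e.g. a binary vector of recorded codes
z = encode(W, b, x)
x_tilde = decode(W_prime, b_prime, z)
print(len(z), len(x_tilde), reconstruction_error(x, x_tilde))
```

With untrained weights the reconstruction error is large; the point of the sketch is only the shape of the computation, in which a 6-dimensional input passes through a 2-dimensional latent vector z.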

2.2 Restricted Boltzmann machine (RBM)

Another unsupervised deep learning architecture is the restricted Boltzmann machine (RBM), which adopts a stochastic viewpoint and models the probability distribution of the input data. It can handle some complicated problems and lower the chance of overfitting. Additionally, since the RBM network is undirected, information can propagate in both directions across it (feed-forward and feedback modes). An RBM is made up of two layers: an input layer (visible units) that encodes the observables (such as the occurrence of diseases) and a latent layer (hidden units). The hidden units are used for downstream tasks, including disease diagnosis and the prediction of future risk (Tran et al. 2015). As the RBM is “restricted,” it is not possible for two nodes in the same layer to share a connection. The RBM architecture takes the form of an energy-based framework with visible binary units, v, hidden units, h, and a weight matrix, W, that connects the visible and hidden units; a and b are the bias vectors of the visible and hidden units, respectively. The energy function presented in (3) defines the interaction of variables. Figure 2 presents an overview of the RBM architecture.

$$E\left( {v,h} \right) = - \left( {{a^T}v + {b^T}h + {v^T}Wh} \right)$$

(3)

Stochastic techniques, like Gibbs sampling, are frequently used to train RBMs. Training yields the final form of h, which can be thought of as the learned representation of the original input data. To create a deep belief network (DBN) for supervised learning tasks, RBMs can be stacked hierarchically. By allocating weights to various transcript measurement categories, the DBN is frequently utilized in EHR word embedding as a feature-extraction strategy (Gupta et al. 2015). This design is further used to embed medical objects, like diagnosis codes, in a low-dimensional vector space (Tran et al. 2015).
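To make the energy function and the sampling step concrete, here is a minimal sketch of a binary RBM in plain Python. It reads the interaction term of the energy function as vᵀWh, with W of shape (visible × hidden), and uses the standard conditional probabilities of a binary RBM. The toy weights and the function names are assumptions made for illustration only, and the weight updates of a full training procedure are omitted.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def energy(v, h, a, b, W):
    """E(v, h) = -(a^T v + b^T h + v^T W h); W has shape (len(v), len(h))."""
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(v[i] * W[i][j] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return -(visible_term + hidden_term + interaction)


def p_hidden_given_visible(v, b, W):
    """P(h_j = 1 | v) = sigma(b_j + sum_i v_i W_ij)."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def p_visible_given_hidden(h, a, W):
    """P(v_i = 1 | h) = sigma(a_i + sum_j W_ij h_j)."""
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]


def sample_binary(probabilities, rng):
    return [1 if rng.random() < p else 0 for p in probabilities]


def gibbs_step(v, a, b, W, rng):
    """One alternating Gibbs step: sample h given v, then v given h."""
    h = sample_binary(p_hidden_given_visible(v, b, W), rng)
    v_new = sample_binary(p_visible_given_hidden(h, a, W), rng)
    return v_new, h


# Toy RBM: 3 visible units (e.g. presence of three diagnoses), 2 hidden units.
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]
W = [[1.0, -1.0],
     [0.5, 0.0],
     [0.0, 2.0]]
v = [1, 0, 1]
h = [1, 1]

print(energy(v, h, a, b, W))
print(p_hidden_given_visible(v, b, W))
print(gibbs_step(v, a, b, W, random.Random(0)))
```

Because no two units in the same layer are connected, each conditional factorizes over units, which is why a whole layer can be sampled in one pass.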

2.3 Convolutional neural network (CNN)

A convolutional neural network (CNN) is a deep neural network with a multilayer architecture and a topology that is frequently utilized for visual tasks (Yamashita et al. 2018). A CNN consists of an input layer, hidden layers, and an output layer. The hidden layers comprise convolutional layers, subsampling (pooling) layers, and fully connected layers. A convolutional layer consists of a collection of filters, called kernels, whose weights are the parameters learned during training; each filter computes a dot product between its weights and a local region of the input. A pooling (subsampling) layer is frequently used to aggregate the collected information after these convolutions. Figure 3 presents a sample CNN architecture with two convolutional layers followed by a pooling layer. Because the filters are typically much smaller than the input, each output unit is connected to only a small part of the input, so the interactions are sparse and the number of parameters is small. Since each filter is applied across the entire input during convolution, the parameters are also shared.

CNN is a distinctive architecture that excels at image classification and other deep learning application areas. In the EHR setting, CNN-based medical image analysis can produce good results for images from MRIs, mammograms, CT scans, etc. By analyzing the image as a set of local pixel patches, a CNN can reliably extract significant features.
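The convolution and pooling operations described above can be sketched in a few lines of plain Python. The toy 5 × 5 image, the hand-written edge-detecting kernel, and the function names are assumptions made for illustration; in a trained CNN the kernel weights are learned rather than fixed by hand.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is the dot product of the kernel with one local patch,
    and the same kernel weights are reused at every position.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Non-overlapping 2x2 max pooling; a trailing odd row or column is dropped."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]


# Toy image with a bright vertical stripe in columns 2 and 3.
image = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
kernel = [[-1, 1],
          [-1, 1]]  # responds to a dark-to-bright transition from left to right

features = relu(conv2d_valid(image, kernel))
pooled = max_pool2x2(features)
print(features)
print(pooled)
```

The 2 × 2 kernel has four weights regardless of the image size, which illustrates the parameter sharing and sparse connectivity mentioned above.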

2.4 Recurrent neural network (RNN)

A recurrent neural network (RNN) can encode time-stamped events from EHR data and handle sequentially ordered input, such as natural language (Chen et al. 2019). RNNs are also capable of handling long-range temporal dependencies. RNNs typically contain links, as depicted in Fig. 4, that feed each layer’s output back into itself. Because its hidden states and feedback loops can elegantly absorb and integrate prior knowledge about the patient, the RNN’s recurrent structure makes it suitable for processing EHR data (Ho et al. 2017). At each time step, the current hidden layer is updated from the current input and the preceding hidden layer. In this way, an RNN processes a complete sequence step by step, passing information from all of its preceding elements on to the last hidden layer.
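The recurrent update described above can be sketched as follows. The sketch assumes the common formulation h_t = tanh(W_x x_t + W_h h_{t−1} + b), which is not spelled out in the text; the toy weights, the three-visit sequence, and the function names are likewise illustrative assumptions.

```python
import math


def rnn_step(x_t, h_prev, W_x, W_h, b):
    """Update the hidden state from the current input and the previous state."""
    h_new = []
    for k in range(len(b)):
        activation = b[k]
        activation += sum(W_x[k][j] * x_t[j] for j in range(len(x_t)))
        activation += sum(W_h[k][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(activation))
    return h_new


def run_rnn(sequence, W_x, W_h, b):
    """Process a sequence in order and return the hidden state after each step."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states


# Hypothetical patient with three visits, each encoded as a 2-dimensional vector.
visits = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_x = [[0.5, -0.5],
       [0.3, 0.8]]
W_h = [[0.1, 0.0],
       [0.0, 0.1]]
b = [0.0, 0.0]

states = run_rnn(visits, W_x, W_h, b)
final_state = states[-1]  # summarizes the whole visit history
print(final_state)
```

Reversing the order of the visits generally changes `final_state`, which is the sense in which the hidden state carries sequential information forward.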

RNNs have recently been applied to medical tasks, like early heart failure diagnosis, predicting ICU mortality, and predicting patient decompensation (Shah et al. 2016; Choi et al. 2017a, b; Aczon et al. 2017). Additionally, a lot of research has used RNNs to build patient representations from EHRs using groups or sequences of clinical codes. Many medical applications process enormous quantities of text, including clinical notes and medical queries, by looking for keywords that relate to common clinical entities, like ICD and CPT codes.

Table 1 Advantages and disadvantages of deep learning architecture applied to electronic health record in clinical prediction

3 Deep learning models

In recent years, deep learning models have been created to analyze EHR data for managing chronic diseases, for example by predicting chronic diseases and identifying adverse medication events. The expansion of EHR data brings multiple difficulties, including data heterogeneity, irregularity, and sparsity. Advanced deep learning-based models for EHR data analysis have been proposed to address these issues. Additionally, owing to their proven capacity for learning, adaptability, and generalizability, these models have shown exceptional effectiveness in modeling EHR data. In this section, the current deep learning models are discussed. Tables 2 and 3 provide a summary of deep learning model frameworks.

Table 2 List of existing deep-learning models review

Table 3 Comparison of existing deep-learning models

3.1 Doctor AI

Doctor AI is a predictive model that forecasts disease diagnoses together with the associated timelines for pharmacological interventions (Choi et al. 2016a, b). The RNN architecture was used to create the Doctor AI model, which was subsequently applied to timestamped longitudinal EHR data collected over an eight-year period. To increase accuracy and speed, skip-gram embedding (Mikolov et al. 2013) was used to initialize the RNN. The model takes encounter records, such as diagnosis, procedure, or drug codes, as input for diagnostic prediction. Additionally, it can evaluate the patient’s medical history to make multilabel predictions for any type of medication or condition. The data used in this model were the records of primary care patients from Sutter Health Palo Alto Medical Foundation, where an Epic Systems Corporation EHR had been in use for more than ten years; the data covered 260 K patients and 2128 physicians. On the real-world EHR dataset, Doctor AI outperformed numerous baselines, obtaining 79.58% recall@30 and accuracy comparable to that of a doctor. It is interesting to note that this model also scored well on the open MIMIC dataset while maintaining high accuracy across different institutions’ coding systems. Finally, health professionals confirmed that Doctor AI could deliver valuable clinical information based on diagnostic results. However, the model can exhibit biases stemming from imbalanced datasets, overfitting, and gradient issues like vanishing or exploding gradients. These biases, including data and temporal bias, can result in skewed outputs and misinterpretation of data patterns, especially when the timing is atypical. Overfitting bias occurs when the model is trained on a limited amount of data, leading to excellent performance on training data but poor generalization. Additionally, biases may be exacerbated by embedding techniques like skip-gram, where stereotypical associations encoded in the training data influence downstream tasks. Addressing these biases necessitates careful dataset curation, vigilant model training, and the application of debiasing techniques to enhance model fairness and performance.
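The recall@30 figure quoted above is a top-k recall. As a minimal sketch, assuming the usual definition (the fraction of codes that actually occur in the next visit and appear among the k highest-scored predictions), it can be computed as follows. The codes, scores, and function name are hypothetical and are not taken from the Doctor AI evaluation.

```python
def recall_at_k(scores, true_codes, k):
    """Fraction of the true codes that appear among the k highest-scored codes.

    scores: dict mapping each candidate code to a predicted score.
    true_codes: set of codes that actually occurred in the next visit.
    """
    if not true_codes:
        raise ValueError("recall is undefined when there are no true codes")
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_k = set(ranked[:k])
    return len(top_k & set(true_codes)) / len(true_codes)


# Hypothetical model output for one patient (ICD-10-style labels used as keys).
scores = {"E11": 0.62, "I10": 0.55, "J45": 0.08, "N18": 0.30, "I50": 0.41}
true_next_visit = {"I10", "I50", "J45"}

print(recall_at_k(scores, true_next_visit, k=3))
```

Here the three highest-scored codes are E11, I10, and I50, so two of the three true codes are recovered. A reported recall@30 would typically be this quantity with k = 30, averaged over all evaluated visits.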

3.2 Deep patient

Deep Patient is an unsupervised deep learning model that can forecast patients’ future health issues based on a general-purpose patient representation derived from EHR data (Miotto et al. 2016). To capture hierarchical regularities and stable structures in EHR data, this framework was created using a stacked denoising autoencoder (SDA). By automatically combining the clinical descriptors, this technique created a representation that is more condensed, consistent, and non-redundant. Additionally, Deep Patient consistently delivered a lower-dimensional representation than the raw EHR data, improving the performance of clinical analytics engines. Data from 700,000 patients in the Mount Sinai data warehouse were used to train the SDA, and 76,214 test patients with 78 disorders were used for the evaluation of the deep representation. Deep Patient demonstrated superior performance in the prediction of several disease categories compared with representations based on unprocessed EHR data and on various other feature learning techniques. This shows that the learned features characterize patients in a broad and efficient manner that can be handled by automated systems in multiple fields. Personalized prescriptions, treatment suggestions, and the recruitment of clinical trial participants could all benefit from the deep patient representation derived from the EHR. However, the quality and completeness of the EHR data could be sources of potential bias in this model. If certain patient populations are underrepresented or if there are errors in the data, this could lead to biased predictions.

Another potential bias could be the selection of features used in the model. If certain features are more heavily weighted or if important features are missing, this could also lead to biased predictions.
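A denoising autoencoder differs from a plain autoencoder in that the input is corrupted before encoding and the network is trained to reconstruct the clean input. The sketch below shows only the corruption step, assuming masking noise (randomly setting a fraction of the features to zero), which is one common choice. The corruption rate, the toy patient vector, and the function name are illustrative assumptions rather than the settings used for Deep Patient.

```python
import random


def mask_inputs(x, corruption_rate, rng):
    """Set each feature to zero independently with probability corruption_rate."""
    return [0.0 if rng.random() < corruption_rate else value for value in x]


rng = random.Random(42)

# Hypothetical patient vector, e.g. counts of eight clinical descriptors.
patient = [3.0, 0.0, 1.0, 2.0, 0.0, 5.0, 1.0, 4.0]
corrupted = mask_inputs(patient, corruption_rate=0.25, rng=rng)

# A denoising autoencoder would encode `corrupted` and be trained so that
# its reconstruction matches the uncorrupted `patient` vector.
print(corrupted)
```

Training against the clean vector encourages the hidden layer to capture dependencies between features instead of copying its input, and stacking several such layers gives the stacked variant named above.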

3.3 RETAIN

The RETAIN (Reverse Time Attention) model was created to address the limitations of RNNs. It uses a two-level neural network for sequential input to provide a detailed interpretation of the prediction while maintaining prediction accuracy (Choi et al. 2016a, b). RETAIN imitates medical practice by attending to the existing EHR data in such a way that recent clinical visits receive more focus. The doctor’s response to the patient’s needs and the examination of the patient record, which proceeds from the present to the past and focuses on specific information, served as the inspiration for this model framework. Five steps make up the RETAIN algorithm: (1) embedding the input, (2) creating visit-level attention weights, (3) creating variable-level attention weights, (4) creating the context vector, and (5) making predictions. The model employs RNNs in steps 2 and 3 to recover the sequential information and imitate the behavior of doctors. RETAIN was tested using a sizable EHR dataset with 14 million visits completed by 263 K patients over an eight-year period. Additionally, RETAIN predicts heart failure more accurately and quickly than conventional machine learning techniques. The goal of RETAIN is to enhance prediction performance by keeping the representation learning component simple for interpretation while using a sophisticated attention generation technique. The authors intend to develop an interactive visualization system for RETAIN and to use the RETAIN paradigm for more healthcare applications in the future.
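Steps 2 to 5 of the algorithm can be sketched as follows. This is a simplified illustration, not the authors’ implementation: the hidden states `g` and `h`, which RETAIN obtains from two RNNs, are supplied here as fixed hypothetical values, the visit embeddings `v` are given directly, and all weights are made up. The sketch assumes a softmax over visits for the visit-level weights, a tanh for the variable-level weights, and a logistic output.

```python
import math


def softmax(values):
    largest = max(values)
    exps = [math.exp(value - largest) for value in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, w):
    return sum(a * b for a, b in zip(u, w))


def retain_context(v, g, h, w_alpha, b_alpha, W_beta, b_beta):
    # Step 2: one scalar attention weight per visit (the weights sum to 1).
    alpha = softmax([dot(w_alpha, g_i) + b_alpha for g_i in g])
    # Step 3: one attention vector per visit, one entry per embedding dimension.
    beta = [[math.tanh(dot(row, h_i) + bias) for row, bias in zip(W_beta, b_beta)]
            for h_i in h]
    # Step 4: context vector = sum over visits of alpha_i * (beta_i * v_i).
    context = [sum(alpha[i] * beta[i][d] * v[i][d] for i in range(len(v)))
               for d in range(len(v[0]))]
    return alpha, beta, context


def predict(context, w_out, b_out):
    # Step 5: logistic output, e.g. the probability of a heart failure diagnosis.
    return 1.0 / (1.0 + math.exp(-(dot(w_out, context) + b_out)))


# Three visits with 2-dimensional embeddings (step 1 assumed already done).
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [[0.2, 0.1], [0.4, -0.3], [0.9, 0.5]]   # stand-in for visit-level RNN states
h = [[0.1, 0.3], [-0.2, 0.6], [0.7, 0.7]]   # stand-in for variable-level RNN states

alpha, beta, context = retain_context(
    v, g, h,
    w_alpha=[1.0, 1.0], b_alpha=0.0,
    W_beta=[[1.0, 0.0], [0.0, 1.0]], b_beta=[0.0, 0.0],
)
risk = predict(context, w_out=[1.5, -0.5], b_out=0.0)
print(alpha, context, risk)
```

Because the context vector is a weighted sum of the visit embeddings, the contribution of each visit and each embedding dimension to the prediction can be read off from `alpha` and `beta`, which is the basis of the interpretability described above.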

3.4 T-LSTM

To analyze longitudinal patient records with irregular elapsed times, a new LSTM variant called Time-Aware LSTM (T-LSTM) was created (Baytas et al. 2017). Examples of sequences with irregular time intervals include patient records with variable sequence lengths and videos with missing frames. Generally speaking, T-LSTM is applied as a patient subtyping technique: it learns a strong single representation of each patient’s sequential records, and the k-means method is then used to cluster patients into clinical subtypes. This model is a variant of the common long short-term memory (LSTM) framework that adjusts the memory content of the unit by taking the interval between consecutive elements of a sequence into account. T-LSTM maintains the forget, input, and output gates of the standard LSTM and additionally learns a neural network that decomposes the cell memory into short- and long-term memories. The main component of this concept is a subspace decomposition applied to the memory from the previous time step. The amount of information retained from the previous memory is adjusted in a way that prevents loss of the patient’s overall profile. The memory discount is applied according to the elapsed time between successive elements, from which T-LSTM calculates the weight of the short-term memory content. The elapsed time is converted into a suitable weight using a non-increasing function of the elapsed time. Supervised and unsupervised experiments were carried out with T-LSTM on synthetic and real datasets. When dealing with irregular elapsed times, T-LSTM performs better than the standard LSTM. However, biases in T-LSTM might arise from how temporal data are integrated into the model. For instance, inadequate normalization of time information or ineffective modeling of temporal relationships within the architecture could introduce biases into the predictions. Furthermore, biases might result from specific design decisions within the T-LSTM model, such as selecting which time features to incorporate or determining how time interacts with the LSTM architecture.
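The memory adjustment described above can be sketched as follows: the previous cell memory is decomposed into a short-term and a long-term part, only the short-term part is discounted by a non-increasing function of the elapsed time, and the two parts are recombined. The specific decay function g(Δt) = 1/log(e + Δt), the toy weights, and the function names are assumptions made for this example; the adjusted memory would then enter the usual LSTM gate computations, which are omitted.

```python
import math


def time_decay(delta_t):
    """A non-increasing weight for the elapsed time (one possible choice)."""
    return 1.0 / math.log(math.e + delta_t)


def adjust_memory(c_prev, W_d, b_d, delta_t):
    """Discount only the short-term part of the previous cell memory."""
    # Short-term memory, extracted from the previous memory by a learned network.
    c_short = [math.tanh(sum(w * c for w, c in zip(row, c_prev)) + bias)
               for row, bias in zip(W_d, b_d)]
    # Long-term memory is what remains after removing the short-term part.
    c_long = [c - s for c, s in zip(c_prev, c_short)]
    # The longer the gap between records, the smaller the short-term weight.
    weight = time_decay(delta_t)
    c_short_discounted = [s * weight for s in c_short]
    # The adjusted previous memory keeps the long-term part untouched.
    return [l + s for l, s in zip(c_long, c_short_discounted)]


c_prev = [0.8, -0.4]
W_d = [[0.5, 0.0],
       [0.0, 0.5]]
b_d = [0.0, 0.0]

for gap_in_days in (0.0, 7.0, 365.0):
    print(gap_in_days, adjust_memory(c_prev, W_d, b_d, gap_in_days))
```

With a gap of zero the weight is 1 and the memory is unchanged; as the gap grows, only the short-term component shrinks, so the long-term component that represents the patient’s overall profile is preserved.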

3.5 Deep Diabetologist

Deep Diabetologist is a model that makes predictions for diabetes patients using sequential EHR data and an RNN architecture (Mei et al. 2018). This algorithm uses a hierarchical recurrent neural network (HRNN) framework to capture heterogeneous sequential information in EHR data. This system can handle a variety of diagnoses and clinical measurements and is suited to multi-resolution learning. Technically, the learning process occurs after the EHR data have been cleaned and imputed. The preprocessing procedure yields 481 clinical variables, including 350 three-digit ICD-10 codes, 124 lab tests, and 7 previously used drug classes. The LSTM and RNN were implemented with a Theano backend and trained on a GPU machine. The model was put to the test in two experiments, one with past treatments and one without, and was then compared with other models. Deep Diabetologist performs marginally better than other models and outperforms baseline logistic regression (LR). However, its overall performance remains limited, which can result in inadequate medication. Additionally, the model is still constrained by the clinical measurements available in its EHR repository, which limits its effectiveness.

3.6 DeepCare

DeepCare is a dynamic neural network designed to predict future medical outcomes based on medical records, illness histories, and present illness states (Pham et al. 2017). It is intended as a general, end-to-end predictive solution that can be applied to the EHR system of any healthcare practice. The LSTM architecture used in this model, which can handle irregularly timed events, directly models medical procedures. LSTM is used to model patients’ illness trajectories and healthcare processes in time-stamped order. The data extracted from an admission are the input to the LSTM, and the output is the illness state at the time of admission. Three main layers make up the DeepCare architecture, and each plays a specific role. In the bottom layer, a modified LSTM known as C-LSTM handles interventions and irregular timing. In the middle layer, the illness states are combined through multiscale weighted pooling. Finally, a neural network in the top layer uses the pooled states and other data to estimate the outcome probability. However, there is potential for bias: if certain features are weighted more heavily, this could lead to biased predictions. DeepCare can be used with current EMR systems. Further research is needed to perform thorough assessments across various cohorts, sites, and outcomes. Sharing parameters across numerous cohorts and hospitals makes room for domain adaptation in this situation.

3.7 GRNN-HA

A model for predicting mortality that is appropriate for clinical decision support systems was created using gated recurrent unit RNNs with hierarchical attention (GRNN-HA) (Sha and Wang 2017). The aim of this model was to address current issues with healthcare data, such as managing, modeling, and interpreting medical data. GRNN-HA uses a low-dimensional representation of medical codes, for instance Word2vec (Mikolov et al. 2013), to handle high-dimensional medical information. A bidirectional GRU is also used to encode temporal information from the medical data. To make the medical data easier to interpret, this approach uses a hierarchical organization that separates information at the visit and code levels and learns hierarchical attention weights on both levels in a dependent manner. The interpretability of the framework depends on these attention weights, which are allocated to specific diagnostic codes and hospital visits based on their relative significance for making predictions.

Because of this, the GRNN-HA model can visualize and interpret its predictions and has a higher prediction accuracy than baseline models. However, there are several limitations in this work that have an impact on performance. First, the dataset is of limited quality because only scant longitudinal medical data are available on patients. Second, the number of samples is insufficient, since deep learning models require a large training dataset in order to learn adequate attention weights and produce meaningful predictions.

3.8 Timeline

Timeline is a state-of-the-art deep learning model that predicts clinical events from previous visits while taking into account the interval between visits and two different variables pertaining to those visits (Bai et al. 2018). This model is capable of learning a time decay factor for each medical code. This capability enables Timeline to recognize that chronic diseases affect future visits more subtly than acute ones. Timeline uses an attention mechanism to enhance visit vector embedding (Vaswani et al. 2017). It is possible to analyze the predictions and comprehend how the risks of subsequent visits vary over time by scrutinizing Timeline’s attention weights and disease progression functions. Medical claims data from the SEER-Medicare Linked Database (Feng et al. 2017), which contain patients’ diagnosis and procedure billing codes, are used to evaluate the model. According to the experiments on two sizable real-world datasets, this method surpasses cutting-edge deep neural networks at predicting the primary diagnosis of a future hospitalization.

3.9 DMNC

To overcome the asynchronous sequential challenge, the Dual Memory Neural Computer (DMNC) was developed as a novel memory-augmented neural network (Le et al. 2018). DMNC combines three neural controllers with two external memories. The architecture includes two encoders that read from and write to the external memories to encode the input views. The two memory access modes in this model, early fusion and late fusion, correspond to early and late inter-view interactions. Early fusion uses a shared memory address space that both encoders can access and alter, which results in data exchange during encoding. In late fusion, by contrast, the memory spaces are independent. In both situations, the decoder combines the knowledge from the memories to make predictions. The performance of the DMNC model was compared to that of numerous earlier works in two different tasks. The DMNC consistently outperformed other deep learning models in the task of prescribing drugs. Additionally, the prediction accuracy of this model rose in correlation with the number of predictions, and it outperformed the DeepCare model in tasks involving disease progression.

3.10 KAME

The Knowledge-based Attention Model (KAME) was created to forecast patients’ future health information (Ma et al. 2018). To increase prediction accuracy, KAME makes use of general knowledge throughout the entire prediction process. Additionally, this approach learns how to represent medical codes from a specific medical ontology, such as the International Classification of Diseases (ICD) or Clinical Classifications Software (CCS). Each input visit is represented by its medical codes, encoded into a low-dimensional visit-level vector. This vector is subsequently fed into an RNN to provide a hidden state representation. The hidden state representation is used, together with the embeddings of ancestor codes from the knowledge graph, to calculate knowledge attention weights. The ancestor embeddings in this case carry broad knowledge about the medical code, such as higher-level knowledge from the medical graph. Using the pertinent high-level information and the accompanying knowledge attention weights, KAME creates a new knowledge vector. The effectiveness of the model is assessed using three real-world medical datasets. KAME generally outperformed GRAM (Choi et al. 2017a, b), Dipole (Ma et al. 2017), RNN+, and RNN on a variety of datasets. A further experiment found that the KAME model also outperformed baselines with both adequate and insufficient data.

3.11 COAM

Numerous deep learning models currently in use employ attention methods to analyze the data; nevertheless, the association between EHR data and the requirement for correction has been shown to cause inaccurate predictions. The CrossOver Attention Model (COAM) makes use of the connection between diagnosis and treatment data in a crossover attention mechanism to enhance prediction performance (Guo et al. 2019). To process medical data efficiently, the framework uses two RNNs: BRNNd and BRNNt. Each is a bidirectional recurrent neural network (BRNN), which processes the details of the input visits in both the forward and backward directions. BRNNd reads representations for diagnoses, whereas BRNNt reads representations for treatments. In general, COAM consists of five key processes: (1) embedding diagnosis and treatment information; (2) using BRNNd and BRNNt to process a patient’s previous diagnoses and treatments; (3) combining data and determining the weights of the diagnosis and treatment; (4) creating a context vector using the crossover attention mechanism; and (5) making the prediction using both context vectors together. COAM was found to achieve high prediction accuracy without employing any expert knowledge. Furthermore, thanks to its strong interpretability, the model can analyze disease situations and recommend efficient treatment approaches. As such, it can effectively lower a patient’s risk of sickness and enable doctors to provide precise medical care. However, COAM relies on learned representations of the data, which could be biased or incomplete. If the learned representations do not accurately capture the underlying structure of the data, this may lead to biased predictions or decisions.

3.12 VS-GRU

Variable sensitive GRU (VS-GRU) is a GRU-based framework that takes the differing missing rates of variables as input and learns the characteristics of each variable separately to decrease the influence of variables with high missing rates (Li and Xu 2019). In real life, every variable is monitored at a different frequency depending on its properties. This is significant in the healthcare industry, where the doctor chooses which variables to track. The model aims to analyze multivariate time series containing a large number of missing values. VS-GRU can also handle such time series without requiring a two-step method, and it can examine the features of the various variables automatically without incurring additional computational cost. However, the model’s structure needed to be improved to better handle complicated problems such as multi-task classification. As a result, the developers enhanced VS-GRU to create VS-GRU integration (VS-GRU-i), which consists of a two-layer GRU. Specifically, VS-GRU is the first layer, and a GRU that incorporates the features from the first layer is the second layer. A penalized method is applied to the second layer’s input to help identify variables that are either entirely absent or have a small number of observed values. Two open clinical datasets from PhysioNet and MIMIC-III were used to test the model. According to the experiments, VS-GRU and VS-GRU-i both performed well in single-label and multi-label classification tasks. The findings show that the model is capable of capturing the patterns of time series with significant missing data and is useful outside the healthcare industry.
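To make the notion of a per-variable missing rate concrete, the sketch below computes it for a toy multivariate time series. It only illustrates the quantities that a model of this kind takes as input; how VS-GRU uses them inside its gates is not reproduced here, and the function names and the example record are hypothetical.

```python
def missing_rates(series):
    """Return the fraction of missing (None) observations per variable.

    `series` is a list of time steps; each time step is a list with one
    entry per variable, and None marks a value that was not recorded.
    """
    n_steps = len(series)
    n_vars = len(series[0])
    return [sum(1 for step in series if step[v] is None) / n_steps
            for v in range(n_vars)]


def observation_mask(series):
    """Binary mask: 1 where a value was observed, 0 where it is missing."""
    return [[0 if x is None else 1 for x in step] for step in series]


# Toy record (hypothetical values): the first variable is measured at
# every step, the second only once.
record = [
    [80.0, None],
    [82.0, 1.9],
    [85.0, None],
    [79.0, None],
]
rates = missing_rates(record)
mask = observation_mask(record)
```

Here the second variable is missing in three of four time steps, which is the kind of imbalance the variable-sensitive design is meant to account for.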

3.13 Patient2vec

The three types of records included in the EHR are patient medical treatments, diagnosis codes, and physical symptoms. By combining two features, physical symptoms and medical treatments, Patient2vec can learn to represent EHR data in a bi-dimensional manner (Wang et al. 2019). Patient2vec also incorporates an RNN to discover the sequential, context-aware aspects of visits. The learned representations are fed to a classifier in order to predict diagnoses. The framework uses Word2Vec to embed diagnosis codes and build a vector representation using a dynamic window. Word2Vec, in turn, employs Skip-Gram to predict the words that appear in the vicinity of a target word. The effectiveness of the Patient2vec framework was tested using the public MIMIC-III dataset, with 61,532 visits by 46,520 patients. An experiment on multi-class diagnosis prediction was also conducted, and performance was evaluated against three baselines in order to corroborate the results. The studies revealed that the information hidden in physical symptoms and medical conditions plays a critical role in patient representation, resulting in up to a 76% increase in AUC for predicting the diagnosis of the entire illness and an 80% top-10 recall for the target disorders. Unfortunately, the quality and completeness of the EHR data used for training the model are potential sources of bias in prediction. If the EHR data is incomplete or contains errors, it could lead to biased patient embeddings and potentially affect the performance of downstream tasks such as patient similarity or clustering.
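The Skip-Gram idea mentioned above, predicting the items that surround a target item, rests on generating (target, context) training pairs. The following is a minimal sketch of that pair generation for a sequence of diagnosis codes. It assumes a fixed window for simplicity rather than the dynamic window mentioned in the text, and the function name and code strings are illustrative only.

```python
def skipgram_pairs(codes, window=2):
    """Return (target, context) pairs for codes within `window` positions."""
    pairs = []
    for i, target in enumerate(codes):
        lo = max(0, i - window)
        hi = min(len(codes), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a code is never its own context
                pairs.append((target, codes[j]))
    return pairs


# Hypothetical sequence of diagnosis codes from one patient record.
visit_codes = ["I10", "E11", "N18", "I50"]
pairs = skipgram_pairs(visit_codes, window=1)
```

An embedding model would then be trained so that each target code predicts its context codes; that training step is omitted here.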

3.14 ConvAE

ConvAE is a framework for unsupervised deep learning designed to quickly and accurately assess diverse EHR data (Landi et al. 2020). The model learns low-dimensional latent vectors for patients by combining word embeddings, convolutional neural networks, and autoencoders. ConvAE also enables efficient patient stratification with little effort. Using encodings learned from heterogeneous, domain-free EHRs, the model identifies subtypes of several complex illnesses and analyzes them qualitatively to establish their clinical relevance. ConvAE’s performance was evaluated using real EHR data from 1.6 million patients of the Mount Sinai Health System in New York. According to the test results, ConvAE significantly outperformed numerous baselines in clustering individuals with complex illnesses. This demonstrates that the model is able to recognize many clinically significant aspects of disease subtypes, such as disease progression and comorbidities, which are the main contributors to the clinical phenotypic heterogeneity of complex illnesses as measured by the EHR.

3.15 BEHRT

To forecast the likelihood of 301 conditions in a patient’s upcoming visits, the deep neural sequence model BEHRT was introduced for EHR (Li et al. 2020). The model takes its cues from BERT, a powerful Transformer-based NLP architecture. Feed-forward neural networks are used to model the temporal evolution of EHR data using a variety of sequential concepts. By taking the complete sequence into account at once and learning from the input in parallel rather than sequentially, BEHRT’s feed-forward structure avoids the exploding and vanishing gradient problems while still capturing temporal information. BEHRT can be pre-trained on a sizable dataset and, after some modest fine-tuning, deliver noteworthy performance in a variety of downstream tasks. The model’s effectiveness was demonstrated by training and testing it on CPRD, one of the largest linked primary care EHR systems, to predict the diseases likely to be present during a patient’s upcoming visits. According to the findings, BEHRT scored 8% better than the top deep EHR models described in the literature when predicting more than 300 diseases. Thanks to its scalability, higher accuracy, and flexible design, BEHRT can support personalized interpretation and incorporate numerous heterogeneous concepts. However, the model is also prone to potential biases such as selection bias, measurement bias, confounding bias, and information bias. These biases can affect the accuracy and validity of the results obtained through BEHRT analysis.
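Transformer-based models such as BERT process a whole sequence at once through scaled dot-product self-attention. The sketch below shows that operation on a toy sequence of visit vectors. It is a simplification and not BEHRT itself: it assumes queries, keys, and values are all equal to the input (no learned projections, no multiple heads, no positional or age embeddings), and all names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Every position attends to every other position, so the whole
    sequence is handled without a step-by-step recurrence.
    """
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in x]
        w = softmax(scores)
        out.append([sum(wj * v[i] for wj, v in zip(w, x)) for i in range(d)])
    return out


# Toy sequence of two visit vectors (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0]]
out = self_attention(x)
```

Because each output row depends on all input rows directly, no gradient has to be propagated through a long chain of recurrent steps.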

3.16 HiTANet

To employ time information more efficiently for risk prediction, a novel hierarchical time-aware attention network called HiTANet was developed to model how clinicians make risk prediction decisions (Luo et al. 2020). HiTANet models temporal information both locally and globally. For each visit, the local evaluation stage provides a local attention weight and embeds time information into the visit-level embedding. At the global synthesis stage, global weights are assigned to the various time steps using a time-aware key-query attention technique. To create the patient representation for subsequent risk prediction, the two types of attention weights are dynamically blended. The effectiveness of HiTANet was assessed using three real datasets, and the outcomes were compared with 12 competing baselines. According to the experiments, HiTANet performed better than state-of-the-art deep neural network models and showed consistent gains in risk prediction tasks on three sizable real-world disease cohorts. Compared with the best baseline results, HiTANet improved the F1 score by more than 7% on all datasets, demonstrating the model’s efficacy; its inference process is also readily interpretable for risk prediction.
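The blending of local and global attention weights can be pictured with a small sketch. This is an illustration only: in the published model the mixing is learned and depends on the data, whereas here it is a fixed, hypothetical constant, and the weights and visit vectors are made up.

```python
def blend_attention(local_w, global_w, mix):
    """Convex combination of two attention distributions over visits.

    `mix` in [0, 1] is the share given to the local weights. It is a
    fixed constant here; treating it as learned is left out of the sketch.
    """
    blended = [mix * l + (1.0 - mix) * g for l, g in zip(local_w, global_w)]
    total = sum(blended)
    return [b / total for b in blended]


def patient_representation(visit_vectors, weights):
    """Weighted sum of visit-level vectors."""
    dim = len(visit_vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, visit_vectors))
            for i in range(dim)]


# Hypothetical attention weights over three visits.
local_w = [0.2, 0.3, 0.5]
global_w = [0.6, 0.3, 0.1]
weights = blend_attention(local_w, global_w, mix=0.5)

# Hypothetical visit-level embeddings.
visits = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
patient = patient_representation(visits, weights)
```

The blended weights remain a probability distribution over visits, which is what allows them to be read as the relative importance of each visit.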

4 Online databases for EHR data

Historically, healthcare data were housed in disease repositories, where listings of disease-specific data were compiled and displayed manually. In more recent years, advances in technology have made digital health record systems more readily available in hospitals. Novel methods for presenting, comprehending, and interpreting healthcare data have emerged that leverage the capabilities of internet databases. An online healthcare database is a database that can be viewed and accessed online for healthcare purposes. Because the data is typically housed in a cloud database hosted on a website, practitioners may require a subscription to access the clinical information stored there. Online healthcare databases stand out primarily for the variety of patient-level data that is gathered automatically from EHRs. Studies of clinical interactions and decisions in disease processes use a large number of high-resolution features derived from numerous individuals. The primary goal of creating an online healthcare database is to organize clinical data so that researchers can assess it and derive useful knowledge from it quickly and easily. Because such data make it possible to crowdsource new machine learning approaches using open-source programming tools, there has been significant interest in studying massive amounts of health data (Sanchez-Pinto et al. 2018). Table 4 displays a comparison of online databases for EHR data.

Table 4 Comparison of existing online databases for EHR data

4.1 MIMIC

The Medical Information Mart for Intensive Care (MIMIC) (Johnson et al. 2016) is one of the most well-known and extensively used open-access clinical databases in the world. In 2013, the Massachusetts Institute of Technology (MIT) Laboratory for Computational Physiology, Philips Medical Systems, and Beth Israel Deaconess Medical Center collaborated to create MIMIC with support from the National Institute of Biomedical Imaging and Bioengineering. The main objective of MIMIC is to increase the speed, precision, and efficiency of clinical decision-making for patients in intensive care units. It was initially developed as a sophisticated system for ICU patient monitoring and decision support. For scholars interested in using the database, MIMIC also maintains data structure documentation and a public GitHub repository. By accessing the public code, new users can build on the work of others and are encouraged to contribute their own, enhancing and expanding the influence of MIMIC.

4.2 eICU-CRD

A further example of an open-access database is the eICU Collaborative Research Database (eICU-CRD) (Pollard et al. 2018). This undertaking was inspired by a Philips® Healthcare critical care telehealth program. The MIMIC team created the eICU-CRD, which features a unique patient pool from 208 ICUs across the United States, in 2014 and 2015. The eICU-CRD is a multi-center intensive care unit (ICU) database that contains high-granularity data on more than 200,000 admissions to ICUs across the United States that are monitored by eICU programs. The de-identified database stores measurements of vital signs, records of care plans, illness severity, diagnoses and treatments, and more. Practitioners can access the data after completing the registration, which entails finishing a course on conducting research with human subjects and signing a data use agreement mandating responsible treatment of the data and adherence to the collaborative research principle. The data can be beneficial in several efforts, such as the creation of machine learning algorithms and decision support systems, and in ongoing clinical research.

4.3 PhysioBank

PhysioBank is an enormous and constantly growing library of well-characterized digital recordings of physiological signals and associated data that is available to the biomedical research community (Goldberger et al. 2000). It currently has databases of multiparameter cardiopulmonary, neural, and other biomedical signals gathered from both healthy people and people with a variety of serious conditions that have an effect on public health, such as life-threatening arrhythmias, congestive heart failure, sleep apnea, neurological disorders, and aging. Spanning over 80 databases, PhysioBank comprises approximately 90,000 recordings, or over four terabytes, of digitized physiologic signals and time series. Time-series data from publicly funded studies, such as extensive multicenter clinical trials or physiological research carried out by the National Aeronautics and Space Administration (NASA), can be archived in PhysioBank as a final and permanent repository.

4.4 CPRD

Clinical Practice Research Datalink (CPRD) is a database of high-coverage anonymized medical records spanning 674 UK practices and 11.3 million patients (Herrett et al. 2015). The 4.4 million active (living and currently registered) patients who meet quality requirements account for around 6.9% of the UK’s total population. In terms of age, gender, and ethnicity, patients largely mirror the UK general population. As such, the CPRD primary care database provides a wealth of health-related data for research purposes, including demographics, symptoms, tests, diagnoses, treatments, health-related behaviors, and referrals to secondary care. For more than half of the patients, links with secondary care databases, disease-specific cohorts, and death records make a wider range of data available for study. Peer-reviewed journals have published details of more than 1000 research projects that have used the CPRD to conduct epidemiological research on a variety of health outcomes.

4.5 CERNER Health Facts*

Cerner Corporation has maintained Health Facts* (HF) since 2000. It represents the largest vendor-specific EHR database (DeShazo and Hoffman 2015) and contains 84 million patient records collected from more than 500 healthcare facilities across the United States over the past 20 years. To safeguard the privacy of both patients and organizations, HF data is de-identified and HIPAA-compliant. The database consists of longitudinal, de-identified EHR patient data that has been collected and organized to facilitate studies and reporting. The data types in HF include demographics, medications, test results, and pharmacy data. These records are often thorough and include 300 data elements, including information about consultations and laboratory results. The clinical data are mapped to the most prevalent standards; for instance, diagnoses and procedures are linked to their ICD (International Classification of Diseases) codes, medication information includes the national drug codes (NDCs), and laboratory tests are linked to their LOINC (Logical Observation Identifiers Names and Codes) codes (Rasmy et al. 2018). The Cerner Corporation (Kansas City, MO) publishes its Health Facts® data as a publicly accessible resource.

4.6 Healthcare cost and utilization project

The Healthcare Cost and Utilization Project (HCUP) is a valuable resource that combines data from the State Inpatient Databases (SID), the Nationwide Inpatient Sample (NIS), the Kids’ Inpatient Database (KID), and the outpatient databases State Ambulatory Surgery Data (SASD) and State Emergency Department Data (SEDD). The HCUP stores multistate inpatient and outpatient records for both insured and uninsured patients. The aim of the program is to create a multistate healthcare data system that benefits health services research and supports the creation of tools that aid administrators and researchers alike. To increase the value of the data, the database also offers a collection of software applications and other resources. The system can be accessed by individuals who have signed the data use agreement.

5 Discussion

Patient status information is extensively recorded within Electronic Health Records (EHRs). As a result, EHR data offers a practical way to monitor patient health details and enhance decision-making using data-driven technologies. In contrast to data found in clinical trials or other biomedical research, secondary data obtained from EHRs lack a predefined purpose to address a specific hypothesis. The emerging literature recognizes challenges in using EHR data for predictive modeling of health trajectories. Data quality is a persistent concern, with healthcare professionals citing issues in the healthcare environment, clinical documentation, and data tools affecting EHR accuracy. Researchers note the limited accuracy of diagnostic codes and the potential of free text fields to capture missed information. Moreover, several policy and business barriers can hinder the effective utilization of these datasets for such purposes. Some of these barriers include data privacy and security concerns, inconsistent data standards, lack of interoperability, and data quality and accuracy. To address these barriers, collaboration among healthcare stakeholders, policymakers, researchers, and technology vendors is crucial. Implementing clear data sharing agreements, improving data standards and interoperability, strengthening privacy safeguards, and providing incentives for data sharing are steps that can help overcome these obstacles and harness the potential of EHR datasets for predictive modeling of health trajectories.

Numerous investigations have undertaken predictive modeling using EHR data. This involves employing machine learning techniques to create a statistical model from EHR data, aimed at predicting a specific clinical outcome of interest (Wu et al. 2010). However, EHR data is uncurated, often of poor quality, high-dimensional, and multimodal, and this complexity makes it challenging to use raw EHR data directly in machine learning models for predictive modeling. An essential aspect of predictive modeling using EHR data involves proficiently converting patient information from its initial EHR structure into a machine-readable format. This conversion essentially translates patient data into a representation that algorithms can process. The success of predictive models in enhancing disease diagnosis, phenotyping, and prognosis depends heavily on the quality of this feature representation.
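As a concrete illustration of converting patient records into a machine-readable format, the sketch below applies multi-hot encoding to visits made up of diagnosis codes. This is one common choice and not necessarily the representation used by any model reviewed here; the function names, the patient data, and the code strings are hypothetical.

```python
def build_vocabulary(patients):
    """Map every distinct code to a column index, in sorted order."""
    codes = sorted({c for visits in patients for visit in visits for c in visit})
    return {code: i for i, code in enumerate(codes)}


def multi_hot(visit, vocab):
    """Encode one visit as a 0/1 vector with a 1 for each code present."""
    vec = [0] * len(vocab)
    for code in visit:
        vec[vocab[code]] = 1
    return vec


# Two hypothetical patients; each patient is a list of visits and each
# visit is a list of diagnosis codes.
patients = [
    [["E11", "I10"], ["I10", "N18"]],
    [["I50"]],
]
vocab = build_vocabulary(patients)
encoded = [[multi_hot(v, vocab) for v in visits] for visits in patients]
```

Each patient becomes a sequence of fixed-length vectors, a form that sequence models such as RNNs can consume directly, although such sparse encodings grow with the size of the code vocabulary.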

Recently, deep learning models have served as excellent instruments for identifying illnesses or predicting clinical events or outcomes (such as mortality or treatment response) using time series data like EEG or biosignals from the ICU, as well as imaging data. Nevertheless, despite the encouraging results shown by deep learning methods in numerous analytical tasks, several unresolved difficulties persist. One example is the transfer of both data and labels in transfer learning, which is hampered because deep models frequently fail to account explicitly for uncertainty. This deficiency diminishes the models’ resilience in adapting to shifts in the underlying data distribution. Consequently, there is a risk in deploying models whose future predictions could be compromised by real EHR data. This concern is particularly significant in healthcare environments. Furthermore, with respect to interpretability, current efforts (such as attention mechanisms, visualization, and explanations through examples) mostly strive to explain individual predictions. However, to apply deep models developed from EHR data effectively, users often need to understand the mechanisms underpinning the models’ operation. Achieving such a level of transparency remains a formidable challenge. Lastly, to achieve direct clinical impact, the deployment and automation of deep learning models demand consideration. For instance, substantial volumes of EHR data must be processed to create standardized inputs for training deep models. Addressing the challenge of obtaining extensive EHR datasets is essential for the integration of deep EHR models into practical EHR systems.

6 Conclusion

In recent years, there has been an increase in the use of deep learning architectures to evaluate EHR data, and this trend continues to be strong. Numerous computational models have been developed for use in the medical context thanks to efforts to combine EHRs with novel technologies. Medical tasks, including disease prediction, disease and patient classification, clinical decision support, and more, have recently been addressed using EHR data. This paper presented an overview of a number of deep learning architectures, models, and databases together with details of their applications within the field of healthcare. This review also explored the pros and cons of popular deep learning architectures. The results reveal that researchers prefer the RNN architecture to model EHR patient data because it can handle the challenges associated with EHR data. The RNN is effective at processing sequential data and can address the temporal structure of the EHR since it can map from the whole history of prior input data to each output (Solares et al. 2020).

Additionally, several deep learning models and EHR databases were reviewed. To give an overview of their range, the features and capabilities of each model and database were briefly outlined. Researchers can draw on the information presented to formulate fresh ideas for enhancing a model’s ability to generate meaningful insights for healthcare practitioners. Each model has unique capabilities and qualities that can benefit various clinical applications. The effectiveness and precision of prediction are affected by every aspect of the model. As a result, efficient research that can address both old and new difficulties in EHR data depends on appropriate model design and characteristics. The expanding EHR data and numerous advanced models, which will continuously supply insightful data on patient representation, will indirectly play a greater role in clinical prediction tasks.

This review also discussed six healthcare databases that contain EHR data. Before beginning their investigations and experiments, scientists must collect the necessary data. EHR data is confidential and cannot be made available to the public; thus, accessing and obtaining it remains difficult. In most cases, EHR records include identifying information, such as the patient’s ID and address. Therefore, EHR data is rarely uploaded to public databases and is difficult for researchers to access. All six databases are web-based and available online. Only three of them offer unrestricted free access and can be used on a regular basis. These databases contain several disease data categories and differ in how they deliver the data. The information given could be inconsistent. The benefits and drawbacks of each database are listed in Table 4. Few healthcare databases are currently available; as such, more comprehensive databases are required to support researchers in conducting larger-scale EHR-oriented experiments. Key to this is solving confidentiality problems on an ongoing basis, as the data needs to be regularly updated so that it can be accessed and used freely for research. Moreover, to address the data scarcity challenge in EHR systems, innovative strategies are essential to improve data availability and quality for diverse healthcare applications. These strategies encompass various approaches: (i) generating synthetic data through techniques like interpolation, extrapolation, and perturbation to expand the dataset without compromising patient privacy; (ii) forming partnerships among healthcare organizations to pool anonymized EHR data, employing privacy-preserving methods like federated learning for collective analysis; and (iii) combining EHR data with relevant sources such as wearable devices, genetic data, and social determinants for a more comprehensive understanding of patient health.
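The first strategy above, generating synthetic records by interpolation and perturbation, can be sketched as follows. This is an illustration under simple assumptions: records are short lists of numeric measurements, the noise is Gaussian, and all names and values are hypothetical. Adding noise in this way does not by itself provide a formal privacy guarantee.

```python
import random


def perturb(values, scale, rng):
    """Add zero-mean Gaussian noise to each numeric value."""
    return [v + rng.gauss(0.0, scale) for v in values]


def interpolate(a, b, t):
    """Linear interpolation between two records, with 0 <= t <= 1."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical records: systolic pressure, diastolic pressure, temperature.
record_a = [120.0, 80.0, 36.8]
record_b = [140.0, 90.0, 37.4]

synthetic_mid = interpolate(record_a, record_b, 0.5)
synthetic_noisy = perturb(record_a, scale=1.0, rng=rng)
```

Synthetic records produced this way stay close to the originals, so in practice they would need to be checked for both clinical plausibility and re-identification risk before being shared.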

Finally, although modern deep learning (DL) models, including transformers, hold immense potential, their promise often overshadows the practical challenges they face, particularly in real-world applications like clinical care, where privacy and the protection of Protected Health Information (PHI) are paramount. DL models, including transformers, often require vast amounts of data to train effectively, raising concerns about privacy breaches and unauthorized access to sensitive information. In healthcare, where patient confidentiality is sacrosanct, the deployment of DL models must navigate stringent regulations and ethical considerations to safeguard PHI. Ultimately, while the promise of modern DL techniques in healthcare is undeniable, their successful integration into real-world clinical care settings hinges on our ability to navigate and overcome the persistent challenges of privacy protection and PHI safeguarding.

References

Aczon M, Ledbetter D, Ho L, Gunny A, Flynn A, Williams J, Wetzel R (2017) Dynamic mortality risk predictions in pediatric critical care using recurrent neural networks. arXiv preprint arXiv:1701.06675
Bai T, Zhang S, Egleston BL, Vucetic S (2018) Interpretable representation learning for healthcare via capturing disease progression through time. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 43–51
Baytas IM, Xiao C, Zhang X, Wang F, Jain AK, Zhou J (2017) Patient subtyping via time-aware lstm networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 65–74
Beaulieu-Jones BK, Greene CS et al (2016) Semi-supervised learning of the electronic health record for phenotype stratification. J Biomed Inform 64:168–178
Chen R, Stewart WF, Sun J, Ng K, Yan X (2019) Recurrent neural networks for early detection of heart failure from longitudinal electronic health record data: implications for temporal modeling with respect to time before diagnosis, data density, data quantity, and data type. Circulation: Cardiovasc Qual Outcomes 12(10):005114
Choi E, Bahadori MT, Schuetz A, Stewart WF, Sun J (2016a) Doctor ai: Predicting clinical events via recurrent neural networks. In: Machine Learning for Healthcare Conference, pp. 301–318 PMLR
Choi E, Bahadori MT, Sun J, Kulas J, Schuetz A, Stewart W (2016b) Retain: an interpretable predictive model for healthcare using reverse time attention mechanism. Adv Neural Inf Process Syst 29
Choi E, Schuetz A, Stewart WF, Sun J (2017a) Using recurrent neural network models for early detection of heart failure onset. J Am Med Inform Assoc 24(2):361–370
Choi E, Bahadori MT, Song L, Stewart WF, Sun J (2017b) Gram: graph-based attention model for healthcare representation learning. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 787–795
DeShazo JP, Hoffman MA (2015) A comparison of a multistate inpatient ehr database to the hcup nationwide inpatient sample. BMC Health Serv Res 15(1):1–8
Feng Y, Min X, Chen N, Chen H, Xie X, Wang H, Chen T (2017) Patient outcome prediction via convolutional neural networks based on multi-granularity medical concept embedding. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 770–777 IEEE
Goldberger AL, Amaral LA, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE (2000) Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23):e215–e220
Goldstein BA, Navar AM, Pencina MJ, Ioannidis J (2017) Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J Am Med Inform Assoc 24(1):198–208
Guo W, Ge W, Cui L, Li H, Kong L (2019) An interpretable disease onset predictive model using crossover attention mechanism from electronic health records. IEEE Access 7:134236–134244
Gupta P, Sivalingam U, Pölsterl S, Navab N (2015) Identifying patients with diabetes using discriminative restricted boltzmann machines. Technical report, Technical University of Munich, Germany
Herrett E, Gallagher AM, Bhaskaran K, Forbes H, Mathur R, Van Staa T, Smeeth L (2015) Data resource profile: clinical practice research datalink (cprd). Int J Epidemiol 44(3):827–836
Ho LV, Ledbetter D, Aczon M, Wetzel R (2017) The dependence of machine learning on electronic medical record quality. In: AMIA Annual Symposium Proceedings, vol. 2017, p. 883. American Medical Informatics Association
Hornberger J (2009) Electronic health records: a guide for clinicians and administrators. JAMA 301(1):110–110
Hung C-Y, Chen W-C, Lai P-T, Lin C-H, Lee C-C (2017) Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 3110–3113 IEEE
Johnson AE, Pollard TJ, Shen L, Lehman L-wH, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, Mark RG (2016) Mimic-iii, a freely accessible critical care database. Sci Data 3(1):1–9
Kruse CS, Kristof C, Jones B, Mitchell E, Martinez A (2016) Barriers to electronic health record adoption: a systematic literature review. J Med Syst 40(12):1–7
Landi I, Glicksberg BS, Lee H-C, Cherng S, Landi G, Danieletto M, Dudley JT, Furlanello C, Miotto R (2020) Deep representation learning of electronic health records to unlock patient stratification at scale. NPJ Digit Med 3(1):1–11
Le H, Tran T, Venkatesh S (2018) Dual memory neural computer for asynchronous two-view sequential learning. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1637–1645
Li Q, Xu Y (2019) Vs-gru: a variable sensitive gated recurrent neural network for multivariate time series with massive missing values. Appl Sci 9(15):3041
Li Y, Rao S, Solares JRA, Hassaine A, Ramakrishnan R, Canoy D, Zhu Y, Rahimi K, Salimi-Khorshidi G (2020) Behrt: transformer for electronic health records. Sci Rep 10(1):1–12
Luo J, Ye M, Xiao C, Ma F (2020) Hitanet: Hierarchical time-aware attention networks for risk prediction on electronic health records. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 647–656
Ma F, Chitta R, Zhou J, You Q, Sun T, Gao J (2017) Dipole: Diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1903–1911
Ma F, You Q, Xiao H, Chitta R, Zhou J, Gao J (2018) KAME: Knowledge-based attention model for diagnosis prediction in healthcare. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 743–752
Maurya MR, Riyaz NU, Reddy M, Yalcin HC, Ouakad HM, Bahadur I, Al-Maadeed S, Sadasivuni KK (2021) A review of smart sensors coupled with internet of things and artificial intelligence approach for heart failure monitoring. Med Biol Eng Comput 59(11):2185–2203
Mei J, Zhao S, Jin F, Xia E, Liu H, Li X (2018) Deep diabetologist: learning to prescribe hyperglycemia medications with hierarchical recurrent neural networks. arXiv preprint arXiv:1810.07692
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781
Miotto R, Li L, Kidd BA, Dudley JT (2016) Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep 6(1):1–10
Park HJ, Jung DY, Ji W, Choi C-M (2020) Detection of bacteremia in surgical in-patients using recurrent neural network based on time series records: development and validation study. J Med Internet Res 22(8):e19512
Pham T, Tran T, Phung D, Venkatesh S (2017) Predicting healthcare trajectories from medical records: a deep learning approach. J Biomed Inform 69:218–229
Pollard TJ, Johnson AE, Raffa JD, Celi LA, Mark RG, Badawi O (2018) The eICU collaborative research database, a freely available multicenter database for critical care research. Sci Data 5(1):1–13
Rasmy L, Wu Y, Wang N, Geng X, Zheng WJ, Wang F, Wu H, Xu H, Zhi D (2018) A study of generalizability of recurrent neural network-based predictive models for heart failure onset risk using a large and heterogeneous EHR data set. J Biomed Inform 84:11–16
Reddy BK, Delen D (2018) Predicting hospital readmission for lupus patients: an RNN-LSTM-based deep-learning methodology. Comput Biol Med 101:199–209
Sanchez-Pinto LN, Luo Y, Churpek MM (2018) Big data and data science in critical care. Chest 154(5):1239–1248
Sha Y, Wang MD (2017) Interpretable predictions of clinical outcomes with an attention-based recurrent neural network. In: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 233–240
Shah S, Ledbetter D, Aczon M, Flynn A, Rubin S (2016) 2: early prediction of patient deterioration using machine learning techniques with time series data. Crit Care Med 44(12):87
Shickel B, Tighe PJ, Bihorac A, Rashidi P (2017) Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomedical Health Inf 22(5):1589–1604
Si Y, Du J, Li Z, Jiang X, Miller T, Wang F, Zheng WJ, Roberts K (2021) Deep representation learning of patient data from electronic health records (EHR): a systematic review. J Biomed Inform 115:103671
Solares JRA, Raimondi FED, Zhu Y, Rahimian F, Canoy D, Tran J, Gomes ACP, Payberah AH, Zottoli M, Nazarzadeh M et al (2020) Deep learning for electronic health records: a comparative review of multiple deep neural architectures. J Biomed Inform 101:103337
Sun W, Cai Z, Li Y, Liu F, Fang S, Wang G (2018) Data processing and text mining technologies on electronic medical records: a review. J Healthc Eng 2018
Tran T, Nguyen TD, Phung D, Venkatesh S (2015) Learning vector representation of medical objects via EMR-driven nonnegative restricted Boltzmann machines (eNRBM). J Biomed Inform 54:96–105
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA, Bottou L (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11(12)
Wang W, Guo C, Xu J, Liu A (2019) Bi-dimensional representation of patients for diagnosis prediction. In: 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), vol. 2, pp. 374–379. IEEE
Wu J, Roy J, Stewart WF (2010) Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches. Med Care, S106–S113
Xiao C, Choi E, Sun J (2018) Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J Am Med Inform Assoc 25(10):1419–1428
Xie F, Chakraborty B, Ong MEH, Goldstein BA, Liu N et al (2020) AutoScore: a machine learning–based automatic clinical score generator and its application to mortality prediction using electronic health records. JMIR Med Inf 8(10):e21798
Yamashita R, Nishio M, Do RKG, Togashi K (2018) Convolutional neural networks: an overview and application in radiology. Insights into Imaging 9(4):611–629
Yang Y, Fasching PA, Tresp V (2017) Predictive modeling of therapy decisions in metastatic breast cancer with recurrent neural network encoder and multinomial hierarchical regression decoder. In: 2017 IEEE International Conference on Healthcare Informatics (ICHI), pp. 46–55. IEEE
Zamanzadeh DJ, Petousis P, Davis TA, Nicholas SB, Norris KC, Tuttle KR, Bui AA, Sarrafzadeh M (2021) Autopopulus: A novel framework for autoencoder imputation on large clinical datasets. In: 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 2303–2309. IEEE
Zhao J, Papapetrou P, Asker L, Boström H (2017) Learning from heterogeneous temporal data in electronic health records. J Biomed Inform 65:105–119

Acknowledgements

This work was sponsored by the United Arab Emirates University through Strategic Research Program (Grant # 12R111) and the Research Start-up Program (Grant # 12M109). This research was also supported by ASPIRE, the technology program management pillar of Abu Dhabi’s Advanced Technology Research Council (ATRC), via the ASPIRE Precision Medicine Research Institute Abu Dhabi (ASPIREPMRIAD) award grant number VRI-20-10.

Author information

Fatma Al Jasmi, Richard O. Sinnott, Nazar Zaki, Hany Al Ashwal, Elfadil A. Mohamed and Mohd Saberi Mohamad contributed equally to this work.

Authors and Affiliations

Health Data Science Lab, Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, Abu Dhabi, 17666, United Arab Emirates
Nurul Athirah Nasarudin, Fatma Al Jasmi & Mohd Saberi Mohamad
School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, 3010, Australia
Richard O. Sinnott
Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain, Abu Dhabi, 17666, United Arab Emirates
Nazar Zaki & Hany Al Ashwal
Artificial Intelligence Research Center (AIRC), College of Engineering and Information Technology, Ajman University, Ajman, 346, United Arab Emirates
Elfadil A. Mohamed

Contributions

M.S.M and F.A.J conceptualized the manuscript. N.A.N drafted the manuscript with contributions from R.O.S, E.A.M, N.Z, and H.A.A. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mohd Saberi Mohamad.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Nasarudin, N.A., Al Jasmi, F., Sinnott, R.O. et al. A review of deep learning models and online healthcare databases for electronic health records and their use for health prediction. Artif Intell Rev 57, 249 (2024). https://doi.org/10.1007/s10462-024-10876-2

Accepted: 25 July 2024
Published: 13 August 2024
DOI: https://doi.org/10.1007/s10462-024-10876-2