Abstract
Purpose
Cardiovascular diseases, the world’s leading cause of death, are linked to changes in tissue mechanical and material properties that affect the signaling of cells in the damaged tissue. It is hard to predict the effect of altered physical cues on cell signaling though, due to the large number of molecules potentially involved. Our goal is to identify genes and molecular networks that mediate cellular response to cardiovascular disease and cardiovascular-related forces.
Methods
We used custom computer code, statistics, and bioinformatics tools to meta-analyze PubMed-indexed citations for mentions of genes and proteins.
Results
We identified the names and frequencies of genes studied in the context of mechanical cues (shear, strain, stiffness, and pressure) and major diseases (stroke, myocardial infarction, peripheral arterial disease, deep vein thrombosis). Using statistical and bioinformatics analyses of these biomolecules, we identified the cellular functions and molecular gene sets linked to cardiovascular diseases, biophysical cues, and the overlap between these topics. These gene sets formed independent molecular circuits that each related to different biological processes, including inflammation and extracellular matrix remodeling.
Conclusion
Computational analysis of cardiovascular and mechanobiology publication data can be used for discovery of evidence-based, data-rich gene networks suitable for future systems biology modeling of mechanosignaling.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
Lay Summary
Cardiovascular diseases are a leading cause of death and chronic disability worldwide. Part of the reason we cannot control the progression of these diseases, or promote repair and recovery, may result from a lack of understanding of how mechanical forces and material properties change cell signaling during health and disease. To help address this knowledge gap, we systematically analyze all articles in the PubMed citation database related to biophysics and, separately, cardiovascular diseases. We identify genes that have been studied before in these contexts and combine this information into signaling networks that may work together to regulate cell behavior outcomes.
Future Works
This study identified individual genes and linked gene sets that are strongly associated with both biophysical cues and cardiovascular diseases. Future work can integrate detailed information from the articles we identify here to build data-driven, quantitative models to help predict how biophysical changes will influence cell behavior and health.
Introduction
What intracellular networks control cell behavior? How are these biological circuits disrupted when disease strikes, and can regenerative medicine and tissue engineering therapies restore circuit and tissue function to a healthy state? The global effort to answer these basic questions is adding thousands of new articles every year to the scientific literature. This overlaps topics relevant to regenerative medicine and tissue engineering, including advanced materials science, stem cell research, the physical sciences, and developmental biology. Gaining more accurate predictive capability and control of cell signaling can have especially high impact in the area of cardiovascular regenerative medicine, since cardiovascular disease remains the world-leading cause of death and chronic disease [1]. Our goal is to identify biological networks important for cardiovascular health and disease, and whether they also help cells respond to local mechanical forces and material properties in the surrounding tissue.
Cardiovascular diseases impact hundreds of millions of patients and trigger more than $300 billion loss annually in the USA alone [2]. As the number of deaths and disability adjusted life years due to cardiovascular disease grimly increase, public health experts urge countries to reduce the disability and premature deaths [3]. Furthermore, some sectors of the population suffer at higher rates, including ethnic and racial minorities [4] and gender differences [5]. We focused our analysis on four main groups of conditions that drive morbidity and mortality in cardiology: (1) heart attacks, or myocardial infarctions, and coronary artery disease [6]; (2) stroke and other cerebrovascular diseases such as aneurysm and blocked arteries [7, 8]; (3) deep vein thrombosis and pulmonary embolism [2]; and (4) peripheral artery disease [9,10,11]. Each of these groups of cardiovascular diseases includes chronic and acute components, which differ in the cell signaling and tissue remodeling involved and treatment strategies needed [12].
Many tissue engineering and regenerative approaches have been proposed to treat heart and blood vessels disorders during the past three decades [13,14,15,16]. The localized nature of many cardiovascular conditions (e.g., heart attacks, blood clots, vessel narrowing) make them well-suited to benefit from tissue engineering approaches, compared to diseases that affect diffuse regions such as some neurodegenerative diseases. Furthermore, the systemic access of the cardiovascular system can support regenerative medicine approaches such as cell homing and systemic delivery [15]. These efforts have yielded insights from cell therapy clinical trials [17, 18], FDA pre-market approved products [19], and ongoing investment to develop new cardiovascular regenerative therapies [20]. However, regenerative medicine and tissue engineering have not yet successfully reduced the burden of cardiovascular disease on society [21, 22]. Problems have been identified with immature cell phenotypes, failure to integrate with local host tissue, and side effects of poorly controlled biologic agents [6]. Improved modeling of human disease may be possible via stem cell technology and analyses that integrate multiple types of information [23]. Such systems-oriented models can help explain how cardiovascular diseases — or their repair — result from multiple factors working in concert, not single genes (or cells, scaffolds, or drugs) [24].
Living tissues and the laboratory models for studying them feature cells grown within inherently physical environments, including mechanical forces and material properties. Biophysical cues in turn play essential roles in promoting cardiovascular homeostasis [25]. In the cardiovascular system, disruptions in physiologic levels of shear stress, pressure, tissue stiffness, and strain are linked directly and indirectly to triggering cardiovascular disease [2, 5, 25, 26]. This underscores the need for stem cells or other therapeutic cells to be able to respond in site-appropriate ways to the biophysical environment when used as part of a regenerative therapy [27]. This is challenging, since differentiated stem cells often show immature properties compared to their endogenous counterparts [6, 28]. Biophysical cues are one of several types of information to which cells detect and react during disease progression and tissue regeneration. Models that integrate the breadth of published information on biophysics, cellular responses, and disease states can help predict how molecular networks translate biophysical cues into cell signaling and, subsequently, how to engineer therapeutic repair.
Systems biology approaches to study cardiovascular mechanical forces have been used to integrate mechanosignaling transcriptome data at different timepoints [29], generate quantitative computational models of specific stiffness-sensitive signaling pathways like Yap [30], and qualitatively summarize the large number of canonical signaling pathways that appear to participate in shear stress mechanosensing [31]. To more broadly understand cellular response to materials, Darnell and colleagues recently systematically characterized cell responses to distinct combinations of biophysical parameters (substrate stiffness, stress relaxation, and adhesion ligand density) [32]. A complementary but less common approach to these examples of -omics measurement-based systems biology is systematic, high-throughput meta-analysis of existing literature to integrate otherwise siloed experimental data. Text mining for key signaling network information, such as gene and biomolecule names, is one such method that has been successfully applied to discover and study novel signaling networks in disciplines other than cardiovascular bioengineering, including autoimmune diseases [33], precision medicine [34], and pharmacology [35].
In this work, we sought to address a critical gap in identifying molecular circuits that mediate the mechanobiology of cardiovascular disease: identification of relevant molecules, or circuit nodes. We systematically mined the text of 8 custom citation databases, each representing one biophysical cue or major cardiovascular disease. Our high-throughput approach used statistical and bioinformatic tools to identify evidence-based molecular networks in each category. This systematic approach yielded novel and specific gene networks with known biological functions and immediate data-driven relevance to biophysics and cardiovascular disease.
Materials and Methods
Citation Database Generation
Search terms were selected manually with the intent to capture a broad collection of articles related to the topics (biophysical cue or cardiovascular disease), while minimizing the number of unrelated or false positive citations. Searches were performed on PubMed between 9 November 2020 and 18 May 2021. Search phrases, Boolean operators, and wild card variations were used to improve the comprehensive citation lists (see Supplemental Tables 1–3). Article searches and downloading of XML-formatted citations was performed using eDirect commands and custom python code [36]. XML files were converted to tables for downstream data processing. In total, 8 independent citation databases were generated for four biophysical cues (shear, strain, stiffness, and pressure) and four cardiovascular disease sets: coronary artery disease/myocardial infarction (CHD/MI), cerebrovascular disease/stroke (CVD/stroke), deep vein thrombosis/pulmonary embolism (DVT/PE), and peripheral artery disease (PAD).
Gene Frequency Analysis
We made the simplifying assumption in this analysis that genes mentioned in prominent portions of the text, such as the title or abstract, likely play an important role in understanding the article’s findings. Biomolecules were identified using third party-curated lists of human gene and protein official names, official symbols, alternative names and alternative symbols. Annotation files for biomolecule information were sourced online from HGNC [37] (15 June 2020 download) and UniProtKB/Swiss-Prot [38] (01 December 2020 download). Custom python scripts were used to integrate biomolecular annotation information from both sources into a single gene database, using the HGNC ID as the root identifier for each gene. Titles and abstracts for all articles in a given citation database were then screened for mention of any biomolecule’s identifying information in the gene name database (Supplemental Fig. 1). Gene counts were filtered to remove likely false positives, such as gene symbols that are words in the English language (e.g., REST) or common scientific abbreviations (e.g., AFM). The code generated tabular output showing (1) the individual genes (if any) mentioned in every article in the database and (2) total counts of each gene name mentioned in the literature database. Genes that appeared in fewer than 3 articles in the database were neglected, since they lacked sufficient representation in the literature to detect trends during downstream analysis.
In total, the HGNC annotation file included 144,407 molecule identifiers (names, symbols, and aliases) and the UniProt annotation file included 65,922 molecule identifiers (approved symbols, protein names and aliases, gene aliases). We searched the title and abstract text for these \(\sim 2.1*10^{5}\) established gene identifiers for every article in each of the 8 citation databases (\(\sim 1.2*10^{6}\) articles), equivalent to \(\sim 2.6*10^{11}\) bits of information.
Probability and Statistics
To determine whether two genes, x and y, were studied together, we calculated the absolute numbers of articles mentioning both genes, the Pearson correlation coefficient of the two genes, and the related conditional probabilities, P(x|y) = (# articles mentioning both gene x and gene y) / (# articles mentioning gene y). Means and standard deviation were calculated when noted in the text.
Network Analysis
Network Node Derivation
We chose a threshold of 0.2 as the level of conditional probability required to classify two genes as enriched in their mutual occurrence in the literature. The level of 0.2 was chosen to filter out noise, based on the distribution of conditional probability values for all pairwise conditional probabilities (Supplemental Fig. 2). We inferred that gene pairs that are enriched for mentions in the same articles may contribute to similar signaling networks and/or biological functions [39, 40]. We refer to these linked gene sets as molecular networks or circuits. Histogram analysis was performed to tally the number of genes with enriched mentions (sum of gene pairs that satisfy P(A|B) ≥ 0.2) for each gene, to quantify the degree of connectivity and information content of the gene.
Network Architecture
Network architecture was visualized using two approaches. First, we used Cytoscape software [41] to connect nodes (shown as gene symbols) with edges (connecting lines between nodes). Edge thickness and color were proportional to the absolute number of shared articles and the correlation coefficient between the nodes, respectively. Second, we used the software Ingenuity Pathways Analysis (IPA, Qiagen) to define a new pathway containing only nodes that were gene symbols that could be linked exclusively using the conditional probability enrichment criteria (P(A|B) ≥ 0.2) [42]. Using the “connect network” function and allowing only direct connections, IPA software populated the edges in these networks based exclusively on known biological relationships between molecules, without regard to article counts or correlation values. Indirect relationships were excluded since these were far too numerous and made it more difficult to visualize the signaling pathway architecture based on direct connections.
Computer Code
Computer code was written in Python 3. PubMed database interrogation via the command line used eDirect (NCBI) underlying commands executed via the Biopython library. Code output is deposited in the Dryad digital repository.
Results
Generation of Systematic Citation Databases
Interest in biophysical stimuli and cardiovascular diseases has grown steadily in the past 60 years, and approaching exponentially since 2000 (Fig. 1A). In 2020 alone, more than 25,000 new PubMed-indexed articles were published related to shear, stiffness, strain, or pressure. Nearly 60,000 new publications during the same year mentioned CHD/MI, CVD/ stroke, DVT/PE, or PAD. This vast data set is larger than what can be read and synthesized manually. Among the biophysical citation databases, pressure was studied the most prior to 2010. The past 10 years, however, show a rapid increase in annual publications related to stiffness, which is on track to exceed pressure as having the most total articles published since 1960. Among cardiovascular diseases, most articles published before 2010 focused on CHD/MI, compared to a greater focus more recently on CVD/stroke.
To systematically analyze these large datasets, we wrote custom code to perform text recognition in titles and abstracts of PubMed-indexed article citations (Supplemental Fig. 1A–B). Text that was screened initially only for official HGNC-curated gene symbols and gene names missed many relevant biomolecular mentions. This high rate of false negatives (genes present but not found when searching only for HGNC symbols) is because a diverse range of aliases are used in the literature. To solve the problem of false negatives, the algorithm was updated so that text was scanned for 6 types of officially recognized biomolecule names including 4 sets of gene symbols, names, symbol aliases, and name aliases from the HGNC database; 1 name list from the Reviewed UniProt (Swiss-Prot) database; and 1 optional category of user-defined names to handle names not documented in either the HGNC or UniProt database (such as “YAP” for the official gene symbol YAP1) (Fig. 1B). Accounting for biomolecule mentions improved the sensitivity of automatically detecting gene mentions in article citations, affecting 7 out of the top 10 most frequently studied genes in the shear biophysical cue database (Supplemental Fig. 1D). This showed that the type of biomolecule identifier used varied greatly depending on the subdiscipline (article dataset) and individual gene. To promote algorithm efficiency, the user-defined names, HGNC gene names, and HGNC gene name alias were searched with highest priority (Fig. 1A) since alternatives like HGNC gene symbols, HGNC symbol aliases, and UniProt names were more likely to overlap with unrelated acronyms.
We designed Boolean filter criteria to reduce the detection of spurious genes (false positives) and improve accuracy (Fig. 1B) of the text mining code. This accounted for a common scenario in which gene names could be confused with other terms (e.g., “CPP” could refer to the gene or to the term cerebral perfusion pressure). We validated this automated filter method for all genes mentioned in ≥ 10 articles in the shear biophysical cue database. The computational algorithm with Boolean predicted positive genes achieved levels of 88% precision, 94% sensitivity, 86% specificity, and 90% accuracy compared to ground truth positives, defined using manual verification (Fig 1C). We performed the same validation procedure for three other biophysical cues and obtained similar scores (not shown). The computational algorithm executed text mining for gene names and the Boolean prediction filter within minutes to hours, depending on the dataset size. In contrast, manual verification of individual true positive genes required much longer (days). For larger datasets or in case increased gene detection sensitivity is desired (e.g., for genes with article counts below 10), manual verification of hundreds of thousands of gene mentions in articles is not feasible. Manual verification also risks inaccuracy due to human error, while execution of computer code is expected to be reproducible and objective. These data demonstrate the need for computational solutions to systematically analyze published literature, and the ability of the algorithm to generate an accurate list of genes mentioned in a citation dataset.
Each of the eight citation databases can be described by output listing (a) genes mentioned in each article, (b) genes in ranked order according to the total number of articles mentioning each gene, and (c) gene mentions over time (Supplemental Tables; Dryad repository). These citation databases range in size from 37,706 articles for PAD to 394,607 articles for CHD/MI (Fig. 1B). Despite differences in dataset size, \(\sim\)25–40% of the articles in every database mentioned genes in prominent citation fields. Approximately 10–20% of the articles in each database mentioned genes predicted by the Boolean filters to be high confidence genes mentioned in the data set (Fig. 1B right, green bars) versus those detected with more generous filters as simply potential gene mentions (Fig. 1B right, yellow-only bars). This indicates that each citation dataset included 4000 to 80,000 individual articles containing high confidence gene mentions, suitable for future detailed inspection. In both the biophysical dataset and the cardiovascular disease set, the percentage of articles mentioning genes has risen since 1998 (Fig. 1D). Shear, strain, and PAD showed the greatest percent of articles mentioning genes between 1998 and 2010, while the percent of articles mentioning genes in the remaining topics has plateaued since 2010 (stiffness, pressure, CVD/stroke, DVT/PE) or increased (CHD/MI). Likewise, the number of genes being studied (Fig. 1E) has increased rapidly since the mid-1990s to the present. Of all the databases, stiffness and CVD/stroke show the fastest growth both in publication number and number of genes mentioned during the past 10 years.
Genes Associated with Cardiovascular-Relevant Biophysical Cues
From the set of > 40,000 genes screened, 1558 genes were mentioned in ≥ 3 articles for one or more of the biophysical cues (Fig. 2A). More than half of these (859 genes) have been reported in multiple biophysical contexts, and 21.6% (337 genes) appeared in the articles of all 4 stimuli. Strain and pressure resembled each other the most in terms of biomolecules studied, as shown by the dendrogram (Fig. 2A). Shear and stiffness had more unique profiles of genes studied (272 and 244 unique genes, respectively, versus 88 and 94 for strain and pressure).
In order to rapidly identify genes that are often studied in the same context, we used pairwise conditional probability to measure the likelihood for one gene to be mentioned in an article given the presence of the other gene in the pair. Genes were considered to be enriched in their association when the conditional probability of P(A|B) ≥ 0.2 (Supplemental Fig. 2). The conditional probabilities demonstrated consistent bias in the literature, with one gene in the pair usually driving the likelihood that other genes are studied. This can be seen by the global diagonal asymmetry in the conditional probability heat maps (Fig. 2B, blue heatmaps) and the asymmetry of gene connectedness shown by comparing histograms for the sums of P(A|B) versus P(B|A) (Fig. 2C). One specific example of this asymmetry from the shear dataset is the high conditional probability for genes Selectin -P, -E, and -L to be studied in any article in which Selectin-P Ligand was mentioned, but an absence of the reverse correlation. Articles mentioning the receptors Selectin -P, -E, or -L do not show increased likelihood for the ligand to be mentioned. Applying a filter to highlight conditional probability values above 0.2 (Fig. 2, outlined boxes) rapidly narrowed down the gene association space from the initial conditional probability list of \(\sim 1.6*10^{9}\) (40,000 × 40,000) possible gene pairs to a manageable set of < 100 enriched gene pairs. Individual genes within this list linked to a variable number of other genes, based on meeting the enrichment criteria of P(A|B) ≥ 0.2 (Fig. 2C). Individual genes that directly link to more genes based on the conditional probability threshold contain more information about the related molecular networks. These genes appear in the largest linked gene molecular circuits and indicate that molecular network architecture for biophysical cue sensing includes mainly highly connected genes, rather than genes connected in series. Mathematically linked gene sets could be visualized as molecular networks based on their known biological relatedness, as shown for the strain-sensitive network AGT-ACE-REN (Fig. 2F). Connecting this information with the number of articles overlapping for each gene pair enabled us to identify individual article subsets matching any gene, gene pair, or linked gene set (Fig. 2B, red heatmaps). We could also extract other information from either the citation, such as the molecule studies over time (Fig. 2G), or details from the full text of the articles.
Within the shear stress dataset (948 total genes found in ≥ 3 articles), 180 genes were discussed in ≥ 20 articles (Fig. 2B). Imposing a conditional probability threshold of 0.2 narrowed the list to 49 genes with 84 pairwise interactions. Linking the 49 genes together generated 5 linked gene (or co-occurrence) networks with 3 or more genes. We interpreted high outlier conditional probability values (close to 1.0) as a red flag for possible false positives. Upon inspection, this suspicion was borne out: NEUROD1 and TUBB4B shared the HGNC alias symbol Beta2, and DUSP2-SELP matches comes from the alias PAC1-SELP used in the literature, which in this subfield refers to an antibody that binds to activated α IIbβ3 and not the gene DUSP2 [43]. Subsequent conditional probability values at or near 1.0, while rare, were manually checked and removed if artifacts. The number of genes identified in the stiffness dataset was similar to the shear dataset: 77 pairwise interactions, shared among 44 genes, exceeded the conditional probability threshold. These 44 genes can be grouped into 4 linked gene networks. In the strain dataset, slightly fewer enriched gene interactions passed the conditional probability filter (62 gene interactions, 38 genes). These 38 strain genes contribute to form 4 linked gene networks. The pressure dataset included the fewest (46) gene interactions above conditional probabilities ≥ 0.2. These interactions were shared among 28 genes and could be grouped into 3 linked gene networks. For all four biophysical cues analyzed, the molecules included in each pair of connected genes and each linked gene network (i.e., those sets containing ≥ 3 connected genes) we identified are shown in detail in Fig. 4A.
We also analyzed interests shared by the biophysics and regenerative medicine communities in the biophysical cue citation databases. Publications related to stiffness were the most likely to discuss regenerative medicine and tissue engineering topics (Fig. 2D). Across all biophysical cues, articles mentioned scaffolds approximately twice as often as stem cells or cell therapy was mentioned, indicating a potential gap in the regenerative medicine literature describing the biomechanics and mechanosignaling properties of stem cells and other therapeutic cell types. Mesenchymal Stem Cells (MSC) were the most frequently studied stem cell type in the 4 biophysics datasets (67–84%), with more than 6 times the article counts compared to induced or pluripotent stem cells. MSC studies were also correlated primarily with stiffness studies, which could reflect adherence to a well-established model system in the stiffness mechanosensing community [44]. Pluripotent cells were studied more uniformly across biophysical cues. Among cardiovascular cell types (Fig. 2E), endothelial cells were studied most often and usually in the context of shear, while smooth muscle and cardiomyocytes were studied more uniformly across all cues (with the exception of cardiomyocytes and shear stress, a group containing fewer articles). This similar number of available articles across mechanical cues and material properties is somewhat surprising, given the larger role strain and pressure play in situ for cardiovascular muscle cells.
Genes Associated with Major Cardiovascular Diseases
Since cardiovascular health problems vary widely in their symptoms and causes, we also separated gene analysis of the most common cardiovascular diseases by anatomically themed categories. These 4 groups included (1) coronary heart disease and myocardial infarction (MI/CHD), (2) cerebrovascular disease and stroke (CVD/stroke), (3) deep vein thrombosis and pulmonary embolism (DVT/PE), and (4) peripheral artery disease (PAD). 3129 genes have been mentioned in ≥ 3 articles for at least one of the disease topics (Fig. 3B). 1742 genes were studied in more than 1 disease area and 299 highly studied genes (9.6%) appeared in the articles of all 4 topics. CHD/MI and CVD/stroke resembled each other the most, with substantial overlap in genes studied. Genes related to DVT/PE and PAD were more similar to each other than to either CHD/MI or CVD/stroke, as shown in the dendrogram (Fig. 3B). Both CHD/MI and CVD/stroke also had large gene sets uniquely studied in those diseases, 516 and 853 unique genes to CHD/MI and CVD/stroke respectively (Fig. 3B, green and blue bars). Very few genes were unique to either DVT/PE (14) or PAD (4). Article data size varied considerably across cardiovascular diseases, and resulted in differences in the average article counts mentioning a gene pair between the smallest article database, PAD (\(\bar {x}_{PAD}=8\)), and the largest database, CHD/MI (\(\bar {x}_{CHD/MI}=47\)) (Fig. 3A). To account for this data size variation of 40,000 to 400,000 articles, we could not use the absolute cut-off of genes appearing in at least 20 articles, since this was chosen for the biophysical conditional probability heat maps that all had dataset sizes of 7000–12,000 articles mentioning filtered genes. Instead, we used the equivalent percentage, requiring that genes listed on the conditional probability heatmap appear in ≥ 0.2% of gene-mentioning articles within the database (Fig. 3C).
In the CHD/MI dataset (2235 genes≥ 3 articles), 127 genes appeared in ≥ 0.2% articles (Fig. 3C) and 25 genes formed pairwise interactions that had conditional probabilities ≥ 0.2. On average those gene pairs were mentioned in 47 articles (Fig. 3A) and the percentage of color-filled cells (especially among more frequently mentioned genes) implied a high degree of connectivity. For example, PLG and SERPINE1 had the highest chance (P(PLG|SERPINE1) = 0.74) of appearing together in an article, and the pair was mentioned together in 682 articles. 35 gene pairs met the conditional probability threshold of ≥ 0.2, which identified 3 linked gene networks (Fig. 4B). One high conditional probability outlier was manually removed from subsequent analysis: the genes ATP12A and ATP4A share the same alias, “proton pump”, and could not be distinguished as an interacting gene pair.
Within the CVD/stroke dataset (containing 2580 total genes found in ≥ 3 articles), 147 genes appeared in ≥ 0.2% articles, and 40 genes had conditional probabilities ≥ 0.2. CVD/stroke included more gene pairs than CHD/MI, despite having a similar dataset size (Fig. 2C). Gene pairs studied in CVD/stroke (19 ± 54 articles) were documented in smaller article subsets than those in CHD/MI (47 ± 124 articles). The 68 gene pairs were connected to form 6 potential networks (Fig. 4B). High conditional probability values flagged the genes GABPA and NFE2L2 as a potential outlier interaction. Since they share the same alias NRF2, the trio network of GABPA-NFE2L2-HMOX1 could represented as a network pair interaction, NRF2-HMOX1.
In the DVT/PE dataset (464 total genes found in ≥ 3 articles), 79 genes appeared in ≥ 0.2% articles, and 26 gene interactions among 16 genes reflected enriched conditional probabilities ≥ 0.2 (Fig. 2C). Despite having fewer enriched gene pairs and a smaller overall number of citations in the DVT/PE database, the average number of articles supporting each gene pair was similar to the literature for CVD/stroke (24 vs 19 articles). This suggests the DVT/PE community has coalesced to study a more consistent set of molecular interactions. The 26 gene interactions in DVT/PE can be linked together to form 2 potential networks (Fig. 4B). In the PAD dataset (containing 559 genes appearing in at least 3 articles), 151 genes appeared in ≥ 0.2% articles, and 45 genes had conditional probabilities ≥ 0.2. Although the PAD dataset was the smallest of the 4 cardiovascular diseases, it included a broad range of gene interactions (80 pairs). Linking these genes pairs resulted in 3 molecular networks (Fig. 4).
The combined set of 8 citation databases represents a novel and contemporary snapshot of several important subdisciplines within mechanobiology and cardiovascular disease. While these databases can be used as a source for many systematic comparisons across the fields, our current effort focused on gene details relevant to underlying molecular control circuits. We quantified the overlap between the genes that have been studied in the context of both biophysical cues and cardiovascular diseases (Fig. 3D), as a first step to establish a systems-scale link between molecules that may contribute to both processes. We find that genes with biophysical relevance have also nearly always been studied in relation to cerebrovascular disease and stroke (shear: 84%, stiffness: 82%, strain: 85%, pressure: 90%), often also related to CHD/MI, and in 30–40% of cases in DVT/PE and PAD (Fig. 3D, top). These data imply that a rich data set is embedded in the literature that describes how biophysical-relevant genes may be affected during cardiovascular disease.
We also quantified the percent of genes studied in the context of disease for which articles also existed in the biophysical literature (Fig. 3D, bottom). Consistent with the smaller number of genes studied for DVT/PE and PAD (464 and 559, respectively), these were more likely to appear within articles in the biophysical databases, ranging from 55 to 69% representation. More genes have been studied in the context of CHD/MI and CVD/stroke (2235 and 2580, respectively), so even though a smaller fraction of them overlap with the biophysical literature (24–35%), this still amounts to sets of 600–800 genes that overlap between every combination of disease (CHD/MI or CVD/stroke) and biophysical cue (shear, stiffness, strain or pressure). Analyzing these overlapping genes sets in future may help establish the role of mechanosignaling in cardiovascular disease at a systems level, to complement the signaling pathway scale.
Network Analysis
To identify genes that may function together as molecular circuits, we identified all sets of genes, {x1,x2,..xi,...xn}, that could be associated together based on enriched probability of gene pairs to be studied in the same article citations (P(xi− 1|xi) ≥ 0.2). This approach reduced \(\sim 10^{2}\) enriched gene interactions to \(\sim 10^{1}\) molecular networks (≥ 2 linked genes/network), with an average of 9 ± 2 networks identified for each biophysical cue (Fig. 4A) and 8 ± 3 networks identified for each cardiovascular disease (Fig. 4B). A total of 68 individual networks were identified. The size of these networks had a bimodal distribution, with three-quarters of the networks (n = 52) consisting of two or three genes versus the remaining quarter of the networks, comprised of 8 larger networks each identified for biophysical cues and cardiovascular diseases. These larger biophysical cue networks contained more molecules per network on average than the large cardiovascular disease networks (median: 11 versus 5.5 molecules/network). This size difference between biophysical and disease networks may reflect greater consensus by authors in the biophysical cue community about choosing genes to study in parallel; be an artifact of the larger number of co-occurrence articles required to meet the conditional probability enrichment filter in case of large datasets like CHD/MI and CVD/stroke; and/or reflect underlying features of the biological networks themselves based on the literature to date. Among the smaller networks, antioxidant defense molecules CAT and SOD1 showed enriched co-mentions in every biophysical cue and all diseases except DVT/PE. The AP-1 family member FOS and JUN made frequent co-appearances in articles related to shear, strain, and pressure. In contrast to the biophysical cues however, the disease data sets included frequent co-appearance of molecules related to insulin, tumor growth, apoptosis, and thrombosis. Each biophysical stimulus and cardiovascular disease, except CHD/ MI, was also affiliated with a few cue-specific networks.
Our priority was to identify cell processes that may be targets of system-level regulation during biophysical or disease signaling, so we focused subsequent analyses on the larger networks. These networks readily clustered into a few even larger molecular networks formed by combining partially overlapping networks across biophysical cues or diseases. Molecules in these pan-biophysics or pan-disease networks related to non-overlapping cell functions, as noted (Fig. 4 tables, “Biological Process” column). Very intriguingly, the networks identified using this approach detected two key cell functions in all 8 citation databases: inflammation and immune cell adhesion (e.g., networks containing Interleukin and chemokine family members, TNF, Selectins, and CAMs) and extracellular matrix regulation (e.g., networks containing MMP, TIMP, and COL family members, as well as cartilage and bone associated genes). Genes involved in Inflammation/Immune Cell Adhesion and Extracellular Matrix were detected in all 8 datasets, but each field focused on a different subset of genes with varying degree of connections. Using the curated pathway analysis software IPA, we tested whether the linked gene sets that we had identified using only mathematical approaches could also be connected to one another physically via known biological interactions. All networks (n = 68) were highly connected when the software network building tool allowed consideration of direct and indirect interactions. These connections were so numerous that they made interpreting the signaling network diagram difficult, since they represent hundreds of edges connecting molecules and tens of thousands of embedded references collected by IPA that document these edge relationships. We observed a similar challenge building networks from correlation coefficients, in that the number of networks generated is too large to be informative compared to the sparse linked gene sets detected using enriched conditional (directional) probabilities.
To identify sparser networks more tractable for future experimental study, we reduced the signaling pathway connections to allow only direct interaction relationships between molecules within a linked gene set (Fig. 4, molecular network diagrams). No extra molecules were added. Thus, the networks we identified reflect a sparse set of genes, much sparser than the entire collection of inflammation, immune cell adhesion, or extracellular matrix genes known. The connection of these molecules to each other and their relevance to specific biophysical cues and cardiovascular diseases are thoroughly documented, by the custom citations databases created through this work and the biological references embedded within IPA. Furthermore, since networks nodes are only selected based on enriched conditional probabilities, these data-rich, sparse molecular networks are supported by many articles in the literature already that co-mention, and may co-study, multiple genes in a network in parallel. Locating and assimilating experimental observations taken in parallel is a valuable tool in quantitative model building to predict signaling pathway behavior. By using a computational approach to systematically and comprehensively review citation data, we have created a relatively simple tool to help accomplish an otherwise laborious and daunting task. The overlap in networks (Fig. 4A and B) reveals that most molecules in the networks important for cardiovascular disease also contribute to cellular mechanosignaling (Fig. 4B, overlapping genes denoted by solid blue stars). In future, this similarity could motivate integrating citations from different cue databases, to better understand how mechanics may alter cell signaling and impact cardiovascular disease. These networks also suggest a need for multi-factor studies in the pericellular and plasma membrane space, the subcellular location of the majority of genes identified in all networks. Some networks appear enriched for signaling in the plasma membrane (biophysics inflammation and cell adhesion, Fig. 4A left) versus the extracellular space (biophysics extracellular matrix, Fig. 4A right).
Physical Cue Sensing in Cardiovascular Diseases
In addition to mechanical forces and material properties, the cardiovascular system also relies on electrochemical communication to perform pacemaking and muscle contraction. We extended the approach used to study biophysical cues and cardiovascular diseases to analyze bioelectric articles and added 3 electrochemically related cardiovascular diseases: arrhythmia, bradycardia, and tachycardia (Supplemental Fig. 3). In the electric articles set, 2354 genes were mentioned in at least 3 articles, slightly less than the 3130 genes mentioned in ≥ 3 articles in any of the 4 common cardiovascular diseases (CAD/MI; CVD/Stroke; DVT/PE; PAD) but more than the 1235 genes mentioned in ≥ 3 articles in any of the 3 irregular heartbeat disorders (Fig. 5A). Most electric genes were documented in the 4 common cardiovascular disease-related articles (73% of the electric-related genes) and at least one biophysical citation dataset (49% of electric genes). In contrast, only 41% of the electric genes have been extensively studied in abnormal heartbeat-related articles, whereas 83% of the biophysical genes have been studied in at least one of the 4 common cardiovascular disease datasets. Approximately 700 genes have attracted attention from at least one citation database in all 4 subdisciplines (Fig. 5B). Furthermore, many genes (156 genes) have been studied in all 4 biomechanics-related diseases but not irregular heartbeat-related diseases, while a complementary set of genes (91 genes) has been studied in all 3 electric-related diseases but not all biomechanics-related diseases (Fig. 5C). Of the 70 genes present with high confidence in all of the 12 biophysics and cardiovascular disease databases (Fig. 5C-star and G), several of these contribute to the molecular networks derived in Fig. 4. Future studies may investigate the circuit activity and universality of these predicted molecular circuits in homeostasis and disease.
We identified gene overlap among biophysical cues (Fig. 5D; 5 cues), cardiovascular diseases (Fig. 5E, 7 diseases), and between each individual cardiovascular disease compared to all biophysical cues (Fig. 5F; 7 disease-biophysics comparisons). Approximately 300 molecules were mentioned in articles related to all 5 biophysical cues, with additional subsets of genes reported common to electric and between 1 and 3 mechanical/stiffness cue databases (Fig. 5D). All 7 diseases we studied included 100–300 genes that are candidate highly mechanosensitive molecules, defined here as genes that have been studied across all biophysical cues, including mechanical forces, stiffness, and electric cues (Fig. 5F). Since only 70 genes were studied across all biophysical cues and all diseases though (Fig. 5C, green star and Fig. 5G, gene table), this suggests that the remaining few hundred genes studied in all biophysical contexts may reflect mechanosensing observations specific to one or a few diseases. Data for the presence or absence of individual genes (4037 unique genes) within each of the 12 citation databases (biophysics or disease related) is shown in the Supplemental Table 6.
Imposing the same 0.2 conditional probability threshold and 0.2% article threshold (≥ 3 genes) used to generate co-occurrence networks for biophysical cues and cardiovascular diseases, we identified linked genes and molecular networks associated with electric-related articles (Supplemental Fig. 4A). These electric networks included smaller subsets of the Inflammation and Insulin networks identified in the cardiovascular disease datasets (Fig. 4), as well as new networks related to potassium channels (8-gene network), tight junction proteins (5 genes), and gap junction proteins (5 genes) (Supplemental Fig. 4C). Compared to the biophysical and disease datasets, the articles focused on any one of these gene sets tend to be more siloed from the other gene sets, as shown by clusters of high intensity surrounded by larger regions of white on the co-mention heatmap (Supplemental Fig. 4B). It is noteworthy that by relaxing the threshold criteria used to define molecular networks, it is possible to identify additional co-occurring gene sets. For example, by decreasing the conditional probability threshold from 0.2 to 0.1 and decreasing the article threshold from 0.2% of articles to 10 articles, we identified 48 co-occurrence networks and 32 gene pairs in the electric dataset instead of the more stringent 5 networks and 6 pairs.
Discussion
PubMed-indexed articles related to cardiovascular diseases and mechanosignaling, and their overlap with regenerative engineering, encompass more than 1 million articles. We hypothesized that this literature set included studies of non-random groups of genes that operate as effectively latent molecular circuits. We used bioinformatics and computational analysis to systematically meta-analyze these article citations to identify genes and linked gene networks mentioned most often in article citations. We observed the following key results from this high-throughput analysis.
-
Stiffness and cerebrovascular disease/stroke articles are accumulating at the fastest rates (Fig. 1).
-
Analyzing these 4 biophysical cues and 4 cardiovascular diseases gave molecular insight into each field, scalable from overall genome- and citation database-wide trends down to single gene- and single article-resolution.
-
Genes mentioned in biophysics citations (up to 90%) also appear in cardiovascular disease citations, defining a tight molecular coupling between fields (Figs. 2 and 3).
-
Conditional probability filtering efficiently constructed molecular networks from citation databases. Inflammation and immune cell adhesion, and extracellular matrix regulation, dominated the largest molecular networks found across disease and biophysics databases (Fig. 4).
-
The molecular circuits identified connect via direct interactions and are backed by large numbers of field-specific citations. The full text of articles from the custom citation lists, plus the ranked gene lists generated for each field, provides new source material for building quantitative predictive models. These can focus on mechanisms or empirical behavior of genes and networks involved in cardiovascular diseases and biophysical cues.
-
Modeling and follow up experiments can also help distinguish whether observed asymmetries in conditional probability reflect empirical bias in experiment design (e.g., if a receptor is studied in isolation, while its ligand is routinely studied in parallel with the receptor) or a true difference in mechanosignaling (e.g., in case a receptor is mechanosensitive, but not its ligand) (Figs. 2 and 3).
-
A subset of 70 genes have been studied by biophysics, electric, and cardiovascular disease communities (Fig. 5). Future investigation is needed to determine the role this broad interest gene set may play in promoting biophysical homeostasis in cardiovascular cells.
This computational study of cardiovascular mechanobiology included limitations that affected the molecular networks we identified. First, since only article citations are text mined for biomolecule names, these results overlook genes that may be studied in the article full text but not mentioned in the citation. Despite this coarse-graining of the molecular data, this approach had the advantage of prioritizing important signaling networks, since these molecules would likely be the ones to appear prominently in article titles and abstracts. A second option to tune the sensitivity of network discovery is to adjust the combinations of filters applied at each stage of the algorithm. We required at least 20 separate article mentions (or ≥ 0.2% of gene-mentioning articles) to consider that enough information existed in the citation database for the gene for us to draw conclusions about its enriched conditional probability. Different filter choices may help optimize the sensitivity of linked gene networks discovered for other applications. Finally, the comprehensive scope of our current analysis precludes us from discussing quantitative models for any single signaling pathway. Future work could use these citation databases and gene annotation outputs to build data-driven computer models that quantify molecular mechanisms or empirical behavior of signaling pathways and predict how disease or biophysical changes will affect signaling behavior.
Meta-analysis approaches have been used to enhance the impact of existing scholarly works in many fields [45, 46]. Here, we offer new perspective on the genes and molecular networks associated with biophysical cues and cardiovascular diseases. Our ability to identify evidence-based network summary lists for biophysical cues and cardiovascular pathologies (Fig. 4) is made possible thanks to decades of research in cardiovascular bioengineering in particular, a field known for early recognition of the importance of mechanics on cell signaling [47,48,49,50]. This work also builds on efforts to provide systematic and regularly curated consensus ontologies (e.g., HGNC, UniProt) in bioinformatics subfields like gene and protein nomenclature. Our results demonstrate that ontologies play crucial roles in systems biology analyses, with choices about their use affecting the genes and molecular networks subsequently discovered.
There have been previous efforts to use computational text mining to gain insight into gene associations and networks [51,52,53,54,55,56]. These tools provide rapid and computationally powered insight into user-selected individual molecules or articles. Our approach fills a complementary niche in this space, in which the user has less prior knowledge: they desire a broad overview of molecular knowledge in a specific rapidly evolving field plus high resolution data about potential genes of interest (e.g., highly ranked or highly correlated genes), backed by individual citations. Combining these computational tools can help researchers synthesize large amounts of molecular signaling data to help identify bottlenecks or refine strategies for cardiovascular regenerative therapies. This can complement parallel efforts to build quantitative predictive models describing specific aspects of cardiovascular physiology [57, 58]. The citation and ranked gene results reported here for 4 biophysical cues, electric, and 7 cardiovascular diseases paves the way for future data-driven studies of how these molecules alone, or acting together in molecular circuits, support cardiovascular health.
Data Availability
All materials available via supplemental data, the Dryad data repository, or by contacting the authors.
References
Centers for Disease Control. Global health protection and security: Cardiovascular diseases. https://www.cdc.gov/globalhealth/healthprotection/ncd/cardiovascular-diseases.html (2021). Accessed: 2021-06-20.
Virani SS, Alonso A, Aparicio HJ, et al. American heart association council on epidemiology and prevention statistics committee and stroke statistics subcommittee. Circulation. 2021;143(8):e254. https://doi.org/10.1161/CIR.0000000000000950.
Roth GA, Mensah GA, Johnson CO, et al. Global burden of cardiovascular diseases and risk factors, 1990--2019: Update from the GBD 2019 study. J Am Coll Cardiol. 2020;76(25):2982.
Graham G. Disparities in cardiovascular disease risk in the United States. Curr Cardiol Rev. 2015;11(3):238.
Vogel B, Acevedo M, Appelman Y, Bairey Merz CN, Chieffo A, Figtree GA, et al. The Lancet women and cardiovascular disease Commission: reducing the global burden by 2030. Lancet. 2021;397(10292):2385.
Liew LC, Ho BX, Soh BS. Mending a broken heart: current strategies and limitations of cell-based therapy. Stem Cell Res Ther. 2020;11(1):138.
Katan M, Luft A. Global Burden of Stroke. Semin Neurol. 2018;38:208.
Thomas Truelsen and Stephen Begg and Colin Mathers. World health organization, global burden of disease 2000: The global burden of cerebrovascular disease. https://www.who.int/healthinfo/statistics/bod_cerebrovasculardiseasestroke.pdf (2021). Accessed: 2021-06-20.
Kolls BJ, Sapp S, Rockhold FW, Jordan JD, Dombrowski KE, Fowkes FGR, et al. Stroke in patients with peripheral artery disease. Stroke 2019;50:1356.
Hess CN, Huang Z, Patel MR, Baumgartner I, Berger JS, Blomster JI, et al. Acute limb ischemia in peripheral artery disease. Circulation. 2019;140:556.
Fowkes FGR, Aboyans V, Fowkes FJI, McDermott MM, Sampson UKA, Criqui MH. Peripheral artery disease: epidemiology and global perspectives. Nat Rev Cardiol. 2017;14(3):156.
Millane T, Jackson G, Gibbs CR, Lip GY. ABC of heart failure. Acute and chronic management strategies. BMJ. 2000;320(7234):559.
Matsuda T, Akutsu T, Kira K, Matsumoto H. Development of hybrid compliant graft: rapid preparative method for reconstruction of a vascular wall. ASAIO Trans. 1989;35(3):553.
Dahl SLM, Kypson AP, Lawson JH, Blum JL, Strader JT, Li Y, et al. Readily available tissue-engineered vascular grafts. Sci Transl Med. 2011;3(68):68ra9.
Squillaro T, Peluso G, Galderisi U. Clinical trials with mesenchymal stem cells: An update. Cell Transplant. 2016;25(5):829.
Zhang YS, Arneri A, Bersini S, Shin SR, Zhu K, Goli-Malekabadi Z, et al. Bioprinting 3D microfibrous scaffolds for engineering endothelialized myocardium and heart-on-a-chip. Biomaterials. 2016;110:45.
Pagano F, Picchio V, Chimenti I, Sordano A, De Falco E, Peruzzi M, et al. On the road to regeneration: "Tools" and "Routes" towards efficient cardiac cell therapy for ischemic cardiomyopathy. Curr Cardiol Rep. 2019;21(11):133.
Suda S, Nito C, Yokobori S, Sakamoto Y, Nakajima M, Sowa K, et al. Recent advances in cell-based therapies for ischemic stroke. Int J Mol Sci. 2020;21(18)6718.
Food and Drug Administration. Public workshop: Scientific evidence in the development of human cells, tissues, and cellular and tissue-based products subject to premarket approval. (2021). https://www.fda.gov/vaccines-blood-biologics/workshops-meetings-conferences-biologics/public-workshop-scientic-evidence-development-human-cells-tissues-and-cellular-and-tissue-based. Accessed 20 Jun 2021.
Mao AS, Mooney DJ. Regenerative medicine: Current therapies and future directions. Proceedings of the National Academy of Sciences. 2015;112(47):14452.
Chien KR, Frisén J, Fritsche-Danielson R, Melton DA, Murry CE, Weissman IL. Regenerating the field of cardiovascular cell therapy. Nat Biotechnol. 2019;37(3):232.
Food and Drug Administration. Important patient and consumer information about regenerative medicine therapies. https://www.fda.gov/vaccines-blood-biologics/consumers-biologics/important-patient-and-consumer-information-about-regenerative-medicine-therapies (2021). Accessed: 2021-06-20.
Lee RT, Walsh K. The future of cardiovascular regenerative medicine. Circulation. 2016;133(25):2618.
Nerem RM, Levesque MJ, Cornhill JF. Social environment as a factor in diet-induced atherosclerosis. Science. 1980;208(4451):1475.
Yamashiro Y, Yanagisawa H. The molecular mechanism of mechanotransduction in vascular homeostasis and disease. Clin Sci (Lond). 2020;134(17):2399.
Malek AM, Alper SL, Izumo S. Hemodynamic shear stress and its role in atherosclerosis. JAMA. 1999;282(21):2035.
Aratyn-Schaus Y, Pasqualini FS, Yuan H, McCain ML, Ye GJC, Sheehy SP, et al. Coupling primary and stem cell-derived cardiomyocytes in an in vitro model of cardiac cell therapy. J Cell Biol. 2016;212(4):389.
Pasqualini FS, Sheehy SP, Agarwal A, Aratyn-Schaus Y, Parker KK. Structural phenotyping of stem cell-derived cardiomyocytes. Stem Cell Reports. 2015;4(3):340.
Maurya MR, Gupta S, Li JYS, Ajami NE, Chen ZB, Shyy JYJ, et al. Longitudinal shear stress response in human endothelial cells to atheroprone and atheroprotective conditions. Proc Natl Acad Sci USA. 2021;118(4)e2023236118.
Sun M, Spill F, Zaman MH. A Computational model of YAP/TAZ mechanosensing. Biophysical J. 2016;110(11):2540.
Souilhol C, Serbanovic-Canic J, Fragiadaki M, Chico TJ, Ridger V, Roddie H, et al. Endothelial responses to shear stress in atherosclerosis: a novel role for developmental genes. Nat Rev Cardiol. 2020;17(1):52.
Darnell M, O’Neil A, Mao A, Gu L, Rubin LL, Mooney DJ. Material microenvironmental properties couple to induce distinct transcriptional programs in mammalian stem cells. Proc Natl Acad Sci USA. 2018;115(36):E8368.
Lu G, Hao X, Chen WH, Mu S. GAAD: A gene and autoimmiune disease association database. Genom Proteom Bioinform. 2018;16(4):252.
Simmons M, Singhal A, Lu Z. Text Mining for Precision Medicine: Bringing Structure to EHRs and Biomedical Literature to Understand Genes and Health. Singapore: Springer Singapore; 2016, pp. 139–166.
Alexopoulos LG, Melas IN, Chairakaki AD, Saez-Rodriguez J, Mitsos A. Construction of signaling pathways and identification of drug effects on the liver cancer cell HepG2. Annu Int Conf IEEE Eng Med Biol Soc. 2010;2010:6717.
Entrez KJ. Direct: E-utilities on the unix command line. In: Entrez programming utilities help.
Tweedie S, Braschi B, Gray K, Jones TEM, Seal RL, Yates B, et al. Genenames.org: the HGNC and VGNC resources in 2021. Nucleic Acids Res. 2021;49:D939.
Uniprot Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 2021;49:D480.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25.
Gene Ontology Consortium. The Gene Ontology resource: enriching a GOld mine. Nucleic Acids Res. 2021;49:D325.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003;13(11):2498.
Qiagen. Qiagen ingenuity pathway analysis. https://digitalinsights.qiagen.com/products/ingenuity-pathway-analysis/.
Shattil SJ, Hoxie JA, Cunningham M, Brass LF. Changes in the platelet membrane glycoprotein IIb.IIIa complex during platelet activation. J Biological Chem. 1985;260:11107.
Engler AJ, Sen S, Sweeney HL, Discher DE. Matrix elasticity directs stem cell lineage specification. Cell. 2006;126(4):677.
Howren MB, Lamkin DM. Associations of depression with C-reactive protein, IL-1, and IL-6: a meta-analysis. J Suls, Psychosom Med. 2009;71(2):171.
Vlachopoulos C, Aznaouridis K, Stefanadis C. Prediction of cardiovascular events and all-cause mortality with arterial stiffness: a systematic review and meta-analysis. J Am Coll Cardiol. 2010;55(13):1318.
Wagenseil JE, Mecham RP. Vascular extracellular matrix and arterial mechanics. Physiol Rev. 2009;89(3):957.
Nishida K, Harrison DG, Navas JP, Fisher AA, Dockery SP, Uematsu M, et al. Molecular cloning and characterization of the constitutive bovine aortic endothelial cell nitric oxide synthase. J Clin Invest. 1992;90(5):2092.
Levesque MJ, Nerem RM. The elongation and orientation of cultured endothelial cells in response to shear stress. J Biomech Eng. 1985;107(4):341.
Ranade SS, Qiu Z, Woo SH, Hur SS, Murthy SE, Cahalan SM, et al. Piezo1, a mechanically activated ion channel, is required for vascular development in mice. Proc Natl Acad Sci USA. 2014;111(28):10347.
Biryukov M, Grouès V, Satagopam V, Schneider R. BioKB - text mining and semantic technologies for biomedical content discovery. SWAT4LS. 2017.
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47:D607.
National Center for Biotechnology Information. How to: Find genes associated with a phenotype or disease. https://www.ncbi.nlm.nih.gov/guide/howto/find-gen-phen/.
Mandloi S, Chakrabarti S. PALM-IST: Pathway assembly from literature mining--an information search tool. Sci Rep. 2015;5:10021.
Perez-Iratxeta C, Bork P, Andrade MA. XplorMed: a tool for exploring MEDLINE abstracts. Trends Biochem Sci. 2001;26(9):573.
Wilkinson DM, Huberman BA. A method for finding communities of related genes. Proc Natl Acad Sci USA 101 Suppl. 2004;1:5241.
Irons L, Humphrey JD. Cell signaling model for arterial mechanobiology. PLoS Comput Biol. 2020;16:e1008161
Escribano J, Chen MB, Moeendarbary E, Cao X, Shenoy V, Garcia-Aznar JM, et al. Balance of mechanical forces drives endothelial gap formation and may facilitate cancer and immune-cell extravasation. PLoS Comput Biol. 2019;15:e1006395.
Acknowledgements
The authors gratefully acknowledge technical support from Isabelle Sato (UCSB Physics), Sayali Kakade (UCSB Computer Science), and Samuel Berkun (UCSB RMP) during development of computer code for this analysis.
Funding
This work was funded by the UCSB College of Engineering, UCSB Research Mentorship Program, Regents Junior Faculty Fellowship (AD), and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC-2068 - 390729961 (AD).
Author information
Authors and Affiliations
Contributions
Conceptualization: AD. Acquisition of data: XY. Creation of new software, investigation, writing—original draft preparation, writing—reviewing and editing, and approval of final manuscript: All authors.
Corresponding author
Ethics declarations
Conflict of Interest
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: T.C. : Special Edition: In Memory of Professor Robert Nerem
Electronic supplementary material
Below is the link to the electronic supplementary material.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yu, X., Doyle, A. Molecular Circuit Discovery for Mechanobiology of Cardiovascular Disease. Regen. Eng. Transl. Med. 9, 108–124 (2023). https://doi.org/10.1007/s40883-022-00264-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s40883-022-00264-z