Prioritization of orphan disease-causing genes using topological feature and GO similarity between proteins in interaction networks

Li, Min; Li, Qi; Ganegoda, Gamage Upeksha; Wang, JianXin; Wu, FangXiang; Pan, Yi

doi:10.1007/s11427-014-4747-6

Research Paper
Open access
Published: 17 October 2014

Volume 57, pages 1064–1071, (2014)

Science China Life Sciences

Min Li¹, Qi Li¹, Gamage Upeksha Ganegoda¹, JianXin Wang¹, FangXiang Wu¹,² & Yi Pan¹,³

Abstract

Identifying disease-causing genes among a large number of candidates is a fundamental challenge in human disease studies, and determining the true disease-causing genes by biological experiments remains time-consuming and laborious. Advances in high-throughput techniques have produced a large number of protein-protein interactions, and several prioritization methods based on protein interaction networks have been proposed to address this issue. In this paper, we propose a shortest path-based algorithm, named SPranker, to prioritize disease-causing genes in protein interaction networks. Because diseases with similar phenotypes are generally caused by functionally related genes, we further propose an improved algorithm, SPGOranker, that integrates the semantic similarity of Gene Ontology (GO) annotations. SPGOranker considers not only the topological similarity between protein pairs in a protein interaction network but also their functional similarity. Both algorithms were applied to 1598 known orphan disease-causing genes from 172 orphan diseases and compared with three state-of-the-art approaches: ICN, VS, and RWR. The experimental results show that SPranker and SPGOranker outperform ICN, VS, and RWR in prioritizing orphan disease-causing genes. Importantly, in a case study of severe combined immunodeficiency, SPranker and SPGOranker predict several novel causal genes.
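The general idea described in the abstract, ranking candidate genes by their shortest-path proximity to known disease genes and then blending in functional similarity, can be illustrated with a small sketch. This is not the SPranker or SPGOranker scoring scheme, which the abstract does not specify. The scoring formula (mean of inverse hop distance), the mixing weight `alpha`, the use of Jaccard overlap between GO term sets as a stand-in for GO semantic similarity, and the toy gene labels `A` to `E` are all assumptions made for illustration.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) interaction pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Hop counts from source to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def jaccard(a, b):
    """Set overlap, used here as a simple stand-in for GO semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(graph, seeds, candidates, go_terms=None, alpha=0.5):
    """Rank candidates by closeness to seed genes, optionally mixed with GO overlap.

    Assumed scoring (illustrative only): closeness is the mean of 1/d over
    all seeds; if go_terms is given, the final score is
    alpha * closeness + (1 - alpha) * mean Jaccard overlap with the seeds.
    """
    seed_dist = {s: bfs_distances(graph, s) for s in seeds}
    scores = {}
    for c in candidates:
        closeness = 0.0
        for s in seeds:
            d = seed_dist[s].get(c)
            if d:  # skip unreachable pairs (None) and d == 0
                closeness += 1.0 / d
        closeness /= len(seeds)
        if go_terms is None:
            scores[c] = closeness
        else:
            c_terms = go_terms.get(c, set())
            functional = sum(
                jaccard(c_terms, go_terms.get(s, set())) for s in seeds
            ) / len(seeds)
            scores[c] = alpha * closeness + (1 - alpha) * functional
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy network with hypothetical gene labels.
graph = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")])
go_terms = {
    "A": {"g1", "g2"},
    "B": {"g1", "g2"},
    "C": {"g1"},
    "D": set(),
    "E": {"g2", "g3"},
}
topology_only = rank_candidates(graph, ["A"], ["B", "C", "D", "E"])
with_go = rank_candidates(graph, ["A"], ["B", "C", "D", "E"], go_terms)
print(topology_only)
print(with_go)
```

In this toy network, candidates `B` and `E` are both one hop from the seed `A` and tie on topology alone. Adding the GO overlap term separates them, because `B` shares both of its terms with `A` while `E` shares only one.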

Author information

Authors and Affiliations

School of Information Science and Engineering, Central South University, Changsha, 410083, China
Min Li, Qi Li, Gamage Upeksha Ganegoda, JianXin Wang, FangXiang Wu & Yi Pan
College of Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
FangXiang Wu
Department of Computer Science, Georgia State University, Atlanta, GA, 30302-4110, USA
Yi Pan

Corresponding author

Correspondence to JianXin Wang.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Li, M., Li, Q., Ganegoda, G.U. et al. Prioritization of orphan disease-causing genes using topological feature and GO similarity between proteins in interaction networks. Sci. China Life Sci. 57, 1064–1071 (2014). https://doi.org/10.1007/s11427-014-4747-6

Received: 23 May 2014
Accepted: 15 July 2014
Published: 17 October 2014
Issue Date: November 2014
DOI: https://doi.org/10.1007/s11427-014-4747-6
