1
da Silva Rosa SC, Barzegar Behrooz A, Guedes S, Vitorino R, Ghavami S. Prioritization of genes for translation: a computational approach. Expert Rev Proteomics 2024; 21:125-147. [PMID: 38563427 DOI: 10.1080/14789450.2024.2337004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 02/21/2024] [Indexed: 04/04/2024]
Abstract
INTRODUCTION Gene identification for genetic diseases is critical for the development of new diagnostic approaches and personalized treatment options. Prioritization of gene translation is an important consideration in the molecular biology field, allowing researchers to focus on the most promising candidates for further investigation. AREAS COVERED In this paper, we discussed different approaches to prioritize genes for translation, including the use of computational tools and machine learning algorithms, as well as experimental techniques such as knockdown and overexpression studies. We also explored the potential biases and limitations of these approaches and proposed strategies to improve the accuracy and reliability of gene prioritization methods. Although numerous computational methods have been developed for this purpose, there is a need for computational methods that incorporate tissue-specific information to enable more accurate prioritization of candidate genes. Such methods should provide tissue-specific predictions, insights into underlying disease mechanisms, and more accurate prioritization of genes. EXPERT OPINION Using advanced computational tools and machine learning algorithms to prioritize genes, we can identify potential targets for therapeutic intervention of complex diseases. This represents an up-and-coming method for drug development and personalized medicine.
Affiliation(s)
- Simone C da Silva Rosa
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
- Amir Barzegar Behrooz
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
  - Electrophysiology Research Center, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran
- Sofia Guedes
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
- Rui Vitorino
  - LAQV/REQUIMTE, Department of Chemistry, University of Aveiro, Aveiro, Portugal
  - Department of Medical Sciences, Institute of Biomedicine-iBiMED, University of Aveiro, Aveiro, Portugal
  - UnIC@RISE, Department of Surgery and Physiology, Faculty of Medicine of the University of Porto, Porto, Portugal
- Saeid Ghavami
  - Department of Human Anatomy and Cell Science, Max Rady College of Medicine, Rady Faculty of Health Science, University of Manitoba, Winnipeg, Canada
  - Faculty of Medicine in Zabrze, Academia of Silesia, Katowice, Poland
  - Research Institute of Oncology and Hematology, Cancer Care Manitoba, University of Manitoba, Winnipeg, Canada
2
Gravel B, Renaux A, Papadimitriou S, Smits G, Nowé A, Lenaerts T. Prioritization of oligogenic variant combinations in whole exomes. Bioinformatics 2024; 40:btae184. [PMID: 38603604 PMCID: PMC11037482 DOI: 10.1093/bioinformatics/btae184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 01/29/2024] [Accepted: 04/10/2024] [Indexed: 04/13/2024] Open
Abstract
MOTIVATION Whole exome sequencing (WES) has emerged as a powerful tool for genetic research, enabling the collection of a tremendous amount of data about human genetic variation. However, properly identifying which variants are causative of a genetic disease remains an important challenge, often due to the number of variants that need to be screened. Expanding the screening to combinations of variants in two or more genes, as would be required under the oligogenic inheritance model, simply blows this problem out of proportion. RESULTS We present here the High-throughput oligogenic prioritizer (Hop), a novel prioritization method that uses direct oligogenic information at the variant, gene and gene pair level to detect digenic variant combinations in WES data. This method leverages information from a knowledge graph, together with specialized pathogenicity predictions in order to effectively rank variant combinations based on how likely they are to explain the patient's phenotype. The performance of Hop is evaluated in cross-validation on 36 120 synthetic exomes for training and 14 280 additional synthetic exomes for independent testing. Whereas the known pathogenic variant combinations are found in the top 20 in approximately 60% of the cross-validation exomes, 71% are found in the same ranking range when considering the independent set. These results provide a significant improvement over alternative approaches that depend simply on a monogenic assessment of pathogenicity, including early attempts for digenic ranking using monogenic pathogenicity scores. AVAILABILITY AND IMPLEMENTATION Hop is available at https://github.com/oligogenic/HOP.
Affiliation(s)
- Barbara Gravel
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Alexandre Renaux
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Sofia Papadimitriou
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Brussels Interuniversity Genomics High Throughput core (BRIGHTcore), UZ Brussel, Vrije Universiteit Brussel (VUB) - Université Libre de Bruxelles (ULB), 1090 Brussels, Belgium
- Guillaume Smits
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Center of Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, 1070 Brussels, Belgium
- Ann Nowé
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Tom Lenaerts
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, 1050 Brussels, Belgium
  - Department of Computer Science, Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Department of Computer Science, Artificial Intelligence Laboratory, Vrije Universiteit Brussels, 1050 Brussels, Belgium
3
Voitalov I, Zhang L, Kilpatrick C, Withers JB, Saleh A, Akmaev VR, Ghiassian SD. The module triad: a novel network biology approach to utilize patients' multi-omics data for target discovery in ulcerative colitis. Sci Rep 2022; 12:21685. [PMID: 36522454 PMCID: PMC9755270 DOI: 10.1038/s41598-022-26276-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022] Open
Abstract
Tumor necrosis factor-α inhibitors (TNFi) have been a standard treatment in ulcerative colitis (UC) for nearly 20 years. However, insufficient response rate to TNFi therapies, along with concerns around their immunogenicity and the inconvenience of drug delivery through injections, calls for development of UC drugs targeting alternative proteins. Here, we propose a multi-omic network biology method for prioritization of protein targets for UC treatment. Our method identifies network modules on the Human Interactome (a network of protein-protein interactions in human cells) consisting of genes contributing to the predisposition to UC (Genotype module), genes whose expression needs to be modulated to achieve low disease activity (Response module), and proteins whose perturbation alters expression of the Response module genes to a healthy state (Treatment module). Targets are prioritized based on their topological relevance to the Genotype module and functional similarity to the Treatment module. We demonstrate utility of our method in UC and other complex diseases by efficiently recovering the protein targets associated with compounds in clinical trials and on the market. The proposed method may help to reduce cost and time of drug development by offering a computational screening tool for identification of novel and repurposing therapeutic opportunities in UC and other complex diseases.
Affiliation(s)
- Ivan Voitalov
  - Scipher Medicine Corporation, 221 Crescent St Suite 103A, Waltham, MA 02453 USA
- Lixia Zhang
  - Scipher Medicine Corporation, 221 Crescent St Suite 103A, Waltham, MA 02453 USA
- Casey Kilpatrick
  - Scipher Medicine Corporation, 221 Crescent St Suite 103A, Waltham, MA 02453 USA
- Johanna B. Withers
  - Scipher Medicine Corporation, 221 Crescent St Suite 103A, Waltham, MA 02453 USA
- Alif Saleh
  - Scipher Medicine Corporation, 221 Crescent St Suite 103A, Waltham, MA 02453 USA
4
Ravindran V, Wagoner J, Athanasiadis P, Den Hartigh AB, Sidorova JM, Ianevski A, Fink SL, Frigessi A, White J, Polyak SJ, Aittokallio T. Discovery of host-directed modulators of virus infection by probing the SARS-CoV-2-host protein-protein interaction network. Brief Bioinform 2022; 23:bbac456. [PMID: 36305426 PMCID: PMC9677461 DOI: 10.1093/bib/bbac456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 09/05/2022] [Accepted: 09/23/2022] [Indexed: 12/14/2022] Open
Abstract
The ongoing coronavirus disease 2019 (COVID-19) pandemic has highlighted the need to better understand virus-host interactions. We developed a network-based method that expands the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2)-host protein interaction network and identifies host targets that modulate viral infection. To disrupt the SARS-CoV-2 interactome, we systematically probed for potent compounds that selectively target the identified host proteins with high expression in cells relevant to COVID-19. We experimentally tested seven chemical inhibitors of the identified host proteins for modulation of SARS-CoV-2 infection in human cells that express ACE2 and TMPRSS2. Inhibition of the epigenetic regulators bromodomain-containing protein 4 (BRD4) and histone deacetylase 2 (HDAC2), along with ubiquitin-specific peptidase (USP10), enhanced SARS-CoV-2 infection. Such proviral effect was observed upon treatment with compounds JQ1, vorinostat, romidepsin and spautin-1, when measured by cytopathic effect and validated by viral RNA assays, suggesting that the host proteins HDAC2, BRD4 and USP10 have antiviral functions. We observed marked differences in antiviral effects across cell lines, which may have consequences for identification of selective modulators of viral infection or potential antiviral therapeutics. While network-based approaches enable systematic identification of host targets and selective compounds that may modulate the SARS-CoV-2 interactome, further developments are warranted to increase their accuracy and cell-context specificity.
Affiliation(s)
- Vandana Ravindran
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Oslo, Norway
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Oslo, Norway
- Jessica Wagoner
  - Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA, USA
- Paschalis Athanasiadis
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Oslo, Norway
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Oslo, Norway
- Andreas B Den Hartigh
  - Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA, USA
- Julia M Sidorova
  - Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA, USA
- Aleksandr Ianevski
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Susan L Fink
  - Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA, USA
- Arnoldo Frigessi
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Oslo, Norway
- Judith White
  - Department of Cell Biology and Department of Microbiology, University of Virginia, Charlottesville, VA, USA
- Stephen J Polyak
  - Department of Laboratory Medicine & Pathology, University of Washington, Seattle, WA, USA
- Tero Aittokallio
  - Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Oslo, Norway
  - Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Oslo, Norway
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
5
Azadifar S, Ahmadi A. A novel candidate disease gene prioritization method using deep graph convolutional networks and semi-supervised learning. BMC Bioinformatics 2022; 23:422. [PMID: 36241966 PMCID: PMC9563530 DOI: 10.1186/s12859-022-04954-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 09/20/2022] [Indexed: 11/18/2022] Open
Abstract
Background Selecting and prioritizing candidate disease genes is necessary before conducting laboratory studies, as identifying disease genes from a large number of candidate genes using laboratory methods is a very costly and time-consuming task. There are many machine learning-based gene prioritization methods. These methods differ in various aspects including the feature vectors of genes, the datasets used (which have different structures), and the learning model. Creating a suitable feature vector for genes and an appropriate learning model on a variety of data with different and non-Euclidean structures, including graphs, as well as the lack of negative data, are very important challenges of these methods. The use of graph neural networks has recently emerged in machine learning and other related fields, and they have demonstrated superior performance for a broad range of problems. Methods In this study, a new semi-supervised learning method based on graph convolutional networks is presented, using a novel way of constructing feature vectors for each gene. In the proposed method, first, we construct three feature vectors for each gene using terms from the Gene Ontology (GO) database. Then, we train a graph convolution network on these vectors using protein–protein interaction (PPI) network data to identify disease candidate genes. Our model discovers hidden layer representations encoding both local graph structure and features of nodes. This method is characterized by the simultaneous consideration of topological information of the biological network (e.g., PPI) and other sources of evidence. Finally, a validation has been done to demonstrate the efficiency of our method. Results Several experiments are performed on 16 diseases to evaluate the proposed method's performance. The experiments demonstrate that our proposed method achieves the best results, in terms of precision, the area under the ROC curve (AUCs), and F1-score values, when compared with eight state-of-the-art network and machine learning-based disease gene prioritization methods. Conclusion This study shows that the proposed semi-supervised learning method appropriately classifies and ranks candidate disease genes using a graph convolutional network and an innovative method to create three feature vectors for genes based on the molecular function, cellular component, and biological process terms from GO data.
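To make the idea of combining node features with PPI topology concrete, here is a minimal sketch of a single graph convolution step. It assumes the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W); the abstract does not state which rule the authors use, and the toy network, the `go_features` matrix and the identity weight matrix are hypothetical illustrations, not data from the paper.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each gene keeps part of its own feature vector.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(adj, features, weights):
    """One graph convolution: neighbourhood averaging, linear map, ReLU."""
    a_norm = normalized_adjacency(adj)
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical toy PPI network: gene 0 - gene 1 - gene 2 (a path).
ppi = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
# Hypothetical GO-derived features: one row per gene, one column per term.
go_features = [[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]]
# Identity weights, so the output shows the pure effect of propagation.
identity = [[1.0, 0.0],
            [0.0, 1.0]]

hidden = gcn_layer(ppi, go_features, identity)
for gene, row in enumerate(hidden):
    print(gene, [round(v, 3) for v in row])
```

In a trained model the weight matrix would be learned from the labelled disease genes, and several such layers would be stacked. The sketch only shows how each gene's representation becomes a degree-normalized mixture of its own features and those of its interaction partners.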
Affiliation(s)
- Saeid Azadifar
  - Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran
- Ali Ahmadi
  - Faculty of Computer Engineering, K. N. Toosi University of Technology, Tehran, Iran
6
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Rupa Bhowmick
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
- Chandrakala Meena
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Ram Rup Sarkar
  - Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
7
Demenkov PS, Oshchepkova EA, Demenkov PS, Ivanisenko TV, Ivanisenko VA. Prioritization of biological processes based on the reconstruction and analysis of associative gene networks describing the response of plants to adverse environmental factors. Vavilovskii Zhurnal Genet Selektsii 2021; 25:580-592. [PMID: 34723066 PMCID: PMC8543060 DOI: 10.18699/vj21.065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/14/2020] [Revised: 06/21/2021] [Accepted: 06/21/2021] [Indexed: 11/23/2022] Open
Abstract
Methods for prioritizing or ranking candidate genes according to their importance based on specific criteria via the analysis of gene networks are widely used in biomedicine to search for genes associated with diseases and to predict biomarkers, pharmacological targets and other clinically relevant molecules. These methods have also been used in other fields, particularly in crop production. This is largely due to the development of technologies to solve problems in marker-oriented and genomic selection, which requires knowledge of the molecular genetic mechanisms underlying the formation of agriculturally valuable traits. A new direction for the study of molecular genetic mechanisms is the prioritization of biological processes based on the analysis of associative gene networks. Associative gene networks are heterogeneous networks whose vertices can depict both molecular genetic objects (genes, proteins, metabolites, etc.) and the higher-level factors (biological processes, diseases, external environmental factors, etc.) related to regulatory, physicochemical or associative interactions. Using a previously developed method, biological processes involved in plant responses to increased cadmium content, saline stress and drought conditions were prioritized according to their degree of connection with the gene networks in the SOLANUM TUBEROSUM knowledge base. The prioritization results indicate that fundamental processes, such as gene expression, post-translational modifications, protein degradation, programmed cell death, photosynthesis, signal transmission and stress response play important roles in the common molecular genetic mechanisms for plant response to various adverse factors. On the other hand, a group of processes related to the development of seeds (“seeding development”) was revealed to be drought specific, while processes associated with ion transport (“ion transport”) were included in the list of responses specific to salt stress and processes associated with the metabolism of lipids were found to be involved specifically in the response to cadmium.
Affiliation(s)
- P S Demenkov
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia
- E A Oshchepkova
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- P S Demenkov
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Novosibirsk State University, Novosibirsk, Russia
- T V Ivanisenko
  - Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- V A Ivanisenko
  - Novosibirsk State University, Novosibirsk, Russia
  - Kurchatov Genomic Center of ICG SB RAS, Novosibirsk, Russia
8
Zhang H, Ferguson A, Robertson G, Jiang M, Zhang T, Sudlow C, Smith K, Rannikmae K, Wu H. Benchmarking network-based gene prioritization methods for cerebral small vessel disease. Brief Bioinform 2021; 22:bbab006. [PMID: 33634312 PMCID: PMC8425308 DOI: 10.1093/bib/bbab006] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/19/2020] [Revised: 12/31/2020] [Accepted: 01/04/2021] [Indexed: 12/25/2022] Open
Abstract
Network-based gene prioritization algorithms are designed to prioritize disease-associated genes based on known ones using biological networks of protein interactions, gene-disease associations (GDAs) and other relationships between biological entities. Various algorithms have been developed based on different mechanisms, but it is not obvious which algorithm is optimal for a specific disease. To address this issue, we benchmarked multiple algorithms for their application in cerebral small vessel disease (cSVD). We curated protein-gene interactions (PGIs) and GDAs from databases and assembled PGI networks and disease-gene heterogeneous networks. A screening of algorithms resulted in seven representative algorithms to be benchmarked. Performance of algorithms was assessed using both leave-one-out cross-validation (LOOCV) and external validation with MEGASTROKE genome-wide association study (GWAS). We found that random walk with restart on the heterogeneous network (RWRH) showed best LOOCV performance, with median LOOCV rediscovery rank of 185.5 (out of 19 463 genes). The GenePanda algorithm had most GWAS-confirmable genes in top 200 predictions, while RWRH had best ranks for small vessel stroke-associated genes confirmed in GWAS. In conclusion, RWRH has overall better performance for application in cSVD despite its susceptibility to bias caused by degree centrality. Choice of algorithms should be determined before applying to specific disease. Current pure network-based gene prioritization algorithms are unlikely to find novel disease-associated genes that are not associated with known ones. The tools for implementing and benchmarking algorithms have been made available and can be generalized for other diseases.
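As a rough illustration of the evaluation described above, the following sketch runs a random walk with restart on a single toy interaction network and computes leave-one-out rediscovery ranks. It is a simplification under stated assumptions: the study's best performer, RWRH, walks on a heterogeneous disease-gene network, whereas this version uses one homogeneous network. The restart probability of 0.7, the toy graph and the gene labels are hypothetical.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker restarting at the seeds.

    adj maps each node to the set of its neighbours (undirected network).
    Nodes without neighbours pass nothing on, so probability mass is only
    conserved when every node has at least one edge.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            if not adj[u]:
                continue
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


def loocv_ranks(adj, known):
    """Hide each known gene in turn and report the rank at which it returns."""
    ranks = {}
    for held_out in known:
        seeds = set(known) - {held_out}
        scores = random_walk_with_restart(adj, seeds)
        # Remaining seeds are excluded from the candidate ranking.
        candidates = [n for n in adj if n not in seeds]
        ordered = sorted(candidates, key=lambda n: (-scores[n], n))
        ranks[held_out] = ordered.index(held_out) + 1
    return ranks


# Hypothetical toy network; "A" and "B" play the role of known disease genes.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
known_genes = ["A", "B"]

scores = random_walk_with_restart(toy_network, set(known_genes))
ranks = loocv_ranks(toy_network, known_genes)
print(sorted(scores, key=scores.get, reverse=True))
print(ranks)
```

Even on this toy graph the highly connected node "C" outranks a held-out known gene, which mirrors the degree-centrality bias the abstract attributes to RWRH.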
Affiliation(s)
- Huayu Zhang
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Amy Ferguson
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Grant Robertson
  - Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Muchen Jiang
  - Edinburgh Medical School, University of Edinburgh, Edinburgh, United Kingdom
- Teng Zhang
  - Department of Orthopaedics and Traumatology, the University of Hong Kong, Hong Kong, China
- Cathie Sudlow
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
  - Health Data Research UK, London, United Kingdom
- Keith Smith
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
  - Health Data Research UK, London, United Kingdom
- Kristiina Rannikmae
  - Centre for Medical Informatics, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
  - Health Data Research UK, London, United Kingdom
- Honghan Wu
  - Health Data Research UK, London, United Kingdom
  - Institute of Health Informatics, University College London, London, United Kingdom
9
Collins TK, Houghten S. A centrality based multi-objective approach to disease gene association. Biosystems 2020; 193-194:104133. [PMID: 32243908 DOI: 10.1016/j.biosystems.2020.104133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/31/2019] [Revised: 02/27/2020] [Accepted: 03/23/2020] [Indexed: 01/11/2023]
Abstract
Disease Gene Association finds genes that are involved in the presentation of a given genetic disease. We present a hybrid approach which implements a multi-objective genetic algorithm, where input consists of centrality measures based on various relational biological evidence types merged into a complex network. Multiple objective settings and parameters are studied including the development of a new exchange methodology, safe dealer-based crossover. Successful results with respect to breast cancer and Parkinson's disease compared to previous techniques and popular known databases are shown. In addition, the newly developed methodology is also successfully applied to Alzheimer's disease, further demonstrating its flexibility. Across all three case studies the strongest results were produced by the shortest path-based measures stress and betweenness, either in a single objective parameter setting or when used in conjunction in a multi-objective environment. The new crossover technique achieved the best results when applied to Alzheimer's disease.
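The shortest path-based measures mentioned above can be illustrated with a small sketch of betweenness centrality, computed here with Brandes' algorithm on an unweighted, undirected network. This covers only the centrality input, not the multi-objective genetic algorithm or the crossover operator from the paper, and the toy network and its gene labels are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected network.

    adj maps each node to the set of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: score / 2.0 for v, score in bc.items()}


# Hypothetical toy network: a hub gene with three partners and one tail gene.
toy_network = {
    "HUB": {"G1", "G2", "G3"},
    "G1": {"HUB"},
    "G2": {"HUB"},
    "G3": {"HUB", "G4"},
    "G4": {"G3"},
}

centrality = betweenness_centrality(toy_network)
ranking = sorted(centrality, key=lambda g: (-centrality[g], g))
print(ranking)
```

Genes that sit on many shortest paths rise to the top of the ranking. In a multi-objective setting such a score would be one objective among several rather than the final answer.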
Affiliation(s)
- Tyler K Collins
  - Computer Science Department, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, Ontario L2S 3A1, Canada
- Sheridan Houghten
  - Computer Science Department, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, Ontario L2S 3A1, Canada
10
Zolotareva O, Kleine M. A Survey of Gene Prioritization Tools for Mendelian and Complex Human Diseases. J Integr Bioinform 2019; 16:/j/jib.ahead-of-print/jib-2018-0069/jib-2018-0069.xml. [PMID: 31494632 PMCID: PMC7074139 DOI: 10.1515/jib-2018-0069] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/06/2018] [Accepted: 07/12/2019] [Indexed: 12/16/2022] Open
Abstract
Modern high-throughput experiments provide us with numerous potential associations between genes and diseases. Experimental validation of all the discovered associations, let alone all the possible interactions between them, is time-consuming and expensive. To facilitate the discovery of causative genes, various approaches for prioritization of genes according to their relevance for a given disease have been developed. In this article, we explain the gene prioritization problem and provide an overview of computational tools for gene prioritization. Among about a hundred published gene prioritization tools, we select and briefly describe the 14 most up-to-date and user-friendly ones. We also discuss the advantages and disadvantages of existing tools, the challenges of their validation, and the directions for future research.
Affiliation(s)
- Olga Zolotareva
  - Bielefeld University, Faculty of Technology and Center for Biotechnology, International Research Training Group "Computational Methods for the Analysis of the Diversity and Dynamics of Genomes" and Genome Informatics, Universitätsstraße 25, Bielefeld, Germany
- Maren Kleine
  - Bielefeld University, Faculty of Technology, Bioinformatics/Medical Informatics Department, Universitätsstraße 25, Bielefeld, Germany
11