1
Cacheiro P, Lawson S, Van den Veyver IB, Marengo G, Zocche D, Murray SA, Duyzend M, Robinson PN, Smedley D. Lethal phenotypes in Mendelian disorders. Genet Med 2024; 26:101141. [PMID: 38629401 PMCID: PMC11232373 DOI: 10.1016/j.gim.2024.101141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 04/26/2024] Open
Abstract
PURPOSE Existing resources that characterize the essentiality status of genes are based on either proliferation assessment in human cell lines, viability evaluation in mouse knockouts, or constraint metrics derived from human population sequencing studies. Several repositories document phenotypic annotations for rare disorders; however, there is a lack of comprehensive reporting on lethal phenotypes. METHODS We queried Online Mendelian Inheritance in Man for terms related to lethality and classified all Mendelian genes according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. We characterized the genes across these lethality categories, examined the evidence on viability from mouse models and explored how this information could be used for novel gene discovery. RESULTS We developed the Lethal Phenotypes Portal to showcase this curated catalog of human essential genes. Differences in the mode of inheritance, physiological systems affected, and disease class were found for genes in different lethality categories, as well as discrepancies between the lethal phenotypes observed in mouse and human. CONCLUSION We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
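The classification step described in METHODS can be pictured with a minimal sketch. It assumes each disorder has already been assigned a lethality category; only "prenatal death" and "no reports of premature death" come from the abstract, while the intermediate category names, the gene labels, and the function name `classify_genes` are hypothetical placeholders, not the authors' actual scheme.

```python
from enum import IntEnum


class Lethality(IntEnum):
    """Ordered from earliest to latest age of death.

    Only PRENATAL and NONE_REPORTED are named in the abstract;
    the intermediate levels are hypothetical placeholders.
    """
    PRENATAL = 0
    NEONATAL = 1       # hypothetical
    INFANT = 2         # hypothetical
    CHILDHOOD = 3      # hypothetical
    NONE_REPORTED = 4


def classify_genes(disorder_records):
    """Map each gene to the earliest lethality category among its disorders.

    disorder_records: iterable of (gene_symbol, Lethality) pairs,
    one pair per gene-disorder association.
    """
    earliest = {}
    for gene, category in disorder_records:
        # Keep the smallest (earliest) category seen so far for this gene.
        if gene not in earliest or category < earliest[gene]:
            earliest[gene] = category
    return earliest


# Toy input with made-up gene labels.
records = [
    ("GENE_A", Lethality.INFANT),
    ("GENE_A", Lethality.PRENATAL),
    ("GENE_B", Lethality.NONE_REPORTED),
]
gene_categories = classify_genes(records)
```

Using an ordered enumeration makes "earliest age of death" a simple minimum over the categories attached to a gene.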
Affiliation(s)
- Pilar Cacheiro: William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- Samantha Lawson: ITS Research, Queen Mary University of London, London, United Kingdom
- Ignatia B Van den Veyver: Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX; Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, TX
- Gabriel Marengo: William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- David Zocche: North West Thames Regional Genetics Service, Northwick Park and St Mark's Hospitals, London, United Kingdom
- Michael Duyzend: Massachusetts General Hospital, Boston, MA; Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA; Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital and Harvard Medical School, Boston, MA
- Peter N Robinson: Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Damian Smedley: William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
2
Chao P, Zhang X, Zhang L, Yang A, Wang Y, Chen X. Integration of molecular docking and molecular dynamics simulations with subtractive proteomics approach to identify the novel drug targets and their inhibitors in Streptococcus gallolyticus. Sci Rep 2024; 14:14755. [PMID: 38926437 PMCID: PMC11208513 DOI: 10.1038/s41598-024-64769-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024] Open
Abstract
Streptococcus gallolyticus (Sg) is a non-motile, gram-positive bacterium that causes infective endocarditis (inflammation of the heart lining). Because Sg has gained resistance to existing antibiotics and no effective drug is currently available, developing anti-Sg drugs is critical. This study combined core proteomics with a subtractive proteomics technique to identify potential therapeutic targets for Sg. Several bioinformatics approaches were used to eliminate non-essential proteins and sequences homologous to human proteins from the bacterial proteome. Then, virulence, druggability, subcellular localization, and functional analyses were carried out to determine the role of significant bacterial proteins in various cellular processes. The pathogen's genome contained three druggable proteins, glucosamine-1-phosphate N-acetyltransferase (GlmU), RNA polymerase sigma factor (RpoD), and pantetheine-phosphate adenylyltransferase (PPAT), which could serve as effective targets for developing novel drugs. 3D structures of the target proteins were modeled with SWISS-MODEL. A natural product library containing 10,000 molecules from the LOTUS database was docked against the therapeutic target proteins. Following an evaluation of the docking results using the Glide GScore, the top 10 compounds docked against each protein receptor were chosen. Among these, LTS001632, LTS0243441, and LTS0236112 exhibited the highest binding affinities against GlmU, PPAT, and RpoD, respectively. To augment the docking data, molecular dynamics simulations and MM-GBSA binding free energy calculations were also performed. Although computational validation was employed in this study, further in vitro research is necessary to develop these candidate inhibitors into therapeutic drugs. This combination of computational techniques paves the way for targeted antibiotic development, which addresses the critical need for new therapeutic strategies against S. gallolyticus infections.
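The subtractive filtering idea in this abstract can be illustrated with a minimal sketch. It assumes that essentiality calls and human-homology hits have already been produced by upstream tools; the protein identifiers and the function name `subtractive_filter` are hypothetical, and the later virulence, druggability, and localization filters are not modeled.

```python
def subtractive_filter(proteome, essential, human_homologs):
    """Keep proteins that are essential to the pathogen and lack a human homolog.

    proteome: iterable of protein identifiers for the pathogen
    essential: set of identifiers judged essential (assumed precomputed)
    human_homologs: set of identifiers with a significant human hit
                    (assumed precomputed, e.g. from a sequence search)
    """
    return sorted(
        protein
        for protein in set(proteome)
        if protein in essential and protein not in human_homologs
    )


# Toy input with made-up identifiers; not data from the study.
proteome = ["prot1", "prot2", "prot3", "prot4"]
essential = {"prot1", "prot2", "prot4"}
human_homologs = {"prot2"}
candidates = subtractive_filter(proteome, essential, human_homologs)
```

Each subtraction narrows the candidate list, so the order of the set-based filters does not change the final result.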
Affiliation(s)
- Peng Chao: Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Xueqin Zhang: Department of Nephrology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Lei Zhang: Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Aiping Yang: Department of Traditional Chinese Medicine, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Yong Wang: Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
- Xiaoyang Chen: Department of Cardiology, People's Hospital of Xinjiang Uygur Autonomous Region, Urumqi, China
3
Cacheiro P, Lawson S, Van den Veyver IB, Marengo G, Zocche D, Murray SA, Duyzend M, Robinson PN, Smedley D. Lethal phenotypes in Mendelian disorders. medRxiv 2024:2024.01.12.24301168. [PMID: 38260283 PMCID: PMC10802756 DOI: 10.1101/2024.01.12.24301168] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/24/2024]
Abstract
Essential genes are those whose function is required for cell proliferation and/or organism survival. A gene's intolerance to loss-of-function can be allocated within a spectrum, as opposed to being considered a binary feature, since this function might be essential at different stages of development, genetic backgrounds or other contexts. Existing resources that collect and characterise the essentiality status of genes are based on proliferation assessment in human cell lines, embryonic and postnatal viability evaluation in different model organisms, or gene metrics such as intolerance to variation scores derived from human population sequencing studies. There are also several repositories available that document phenotypic annotations for rare disorders in humans such as the Online Mendelian Inheritance in Man (OMIM) and the Human Phenotype Ontology (HPO) knowledgebases. This raises the prospect of being able to use clinical data, including lethality as the most severe phenotypic manifestation, to further our characterisation of gene essentiality. Here we queried OMIM for terms related to lethality and classified all Mendelian genes into categories, according to the earliest age of death recorded for the associated disorders, from prenatal death to no reports of premature death. To showcase this curated catalogue of human essential genes, we developed the Lethal Phenotypes Portal (https://lethalphenotypes.research.its.qmul.ac.uk), where we also explore the relationships between these lethality categories, constraint metrics and viability in cell lines and mouse. Further analysis of the genes in these categories reveals differences in the mode of inheritance of the associated disorders, physiological systems affected and disease class. We highlight how the phenotypic similarity between genes in the same lethality category combined with gene family/group information can be used for novel disease gene discovery.
Finally, we explore the overlaps and discrepancies between the lethal phenotypes observed in mouse and human and discuss potential explanations that include differences in transcriptional regulation, functional compensation and molecular disease mechanisms. We anticipate that this resource will aid clinicians in the diagnosis of early lethal conditions and assist researchers in investigating the properties that make these genes essential for human development.
Affiliation(s)
- Pilar Cacheiro: William Harvey Research Institute, Queen Mary University of London, London, UK
- Ignatia B. Van den Veyver: Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA; Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, TX, USA
- Gabriel Marengo: William Harvey Research Institute, Queen Mary University of London, London, UK
- David Zocche: North West Thames Regional Genetics Service, Northwick Park & St Mark’s Hospitals, London, UK
- Peter N. Robinson: Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Damian Smedley: William Harvey Research Institute, Queen Mary University of London, London, UK
4
Cacheiro P, Smedley D. Essential genes: a cross-species perspective. Mamm Genome 2023; 34:357-363. [PMID: 36897351 PMCID: PMC10382395 DOI: 10.1007/s00335-023-09984-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Accepted: 02/17/2023] [Indexed: 03/11/2023]
Abstract
Protein-coding genes exhibit different degrees of intolerance to loss-of-function variation. The most intolerant genes, whose function is essential for cell and/or organism survival, inform on fundamental biological processes related to cell proliferation and organism development and provide a window on the molecular mechanisms of human disease. Here we present a brief overview of the resources and knowledge gathered around gene essentiality, from cancer cell lines to model organisms to human development. We outline the implications of using different sources of evidence and definitions to determine which genes are essential and highlight how information on the essentiality status of a gene can inform novel disease gene discovery and therapeutic target identification.
Affiliation(s)
- Pilar Cacheiro: William Harvey Research Institute, Queen Mary University of London, London, UK
- Damian Smedley: William Harvey Research Institute, Queen Mary University of London, London, UK
5
Rout RK, Umer S, Khandelwal M, Pati S, Mallik S, Balabantaray BK, Qin H. Identification of discriminant features from stationary pattern of nucleotide bases and their application to essential gene classification. Front Genet 2023; 14:1154120. [PMID: 37152988 PMCID: PMC10156977 DOI: 10.3389/fgene.2023.1154120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/30/2023] [Accepted: 04/04/2023] [Indexed: 05/09/2023] Open
Abstract
Introduction: Essential genes are required for the survival of various species. These genes are a family linked to critical cellular activities for species survival. They code for proteins that regulate central metabolism, gene translation, deoxyribonucleic acid replication, and fundamental cellular structure and facilitate intracellular and extracellular transport. Essential genes preserve crucial genomic information that may hold the key to a detailed knowledge of life and evolution. Essential gene studies have long been regarded as a vital topic in computational biology due to their relevance. An essential gene is composed of adenine, guanine, cytosine, and thymine and their various combinations. Methods: This paper presents a novel method of extracting information on the stationary patterns of nucleotides such as adenine, guanine, cytosine, and thymine in each gene. For this purpose, some co-occurrence matrices are derived that provide the statistical distribution of stationary patterns of nucleotides in the genes, which is helpful in establishing the relationship between the nucleotides. To extract discriminant features, energy, entropy, homogeneity, contrast, and dissimilarity are computed from each co-occurrence matrix and then concatenated across all matrices to form a feature vector representing each essential gene. Finally, supervised machine learning algorithms are applied for essential gene classification based on the extracted fixed-dimensional feature vectors. Results: For comparison, some existing state-of-the-art feature representation techniques such as Shannon entropy (SE), Hurst exponent (HE), fractal dimension (FD), and their combinations have been utilized. Discussion: An extensive experiment has been performed for classifying the essential genes of five species that shows the robustness and effectiveness of the proposed methodology.
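The feature extraction described in Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the co-occurrence matrix is taken to count ordered base pairs at a fixed offset, the five features use the standard texture-analysis (GLCM-style) definitions, and the base ordering A, C, G, T used for the index distance |i - j| is an arbitrary choice. The function names and the offsets (1, 2, 3) are hypothetical.

```python
import math

BASES = "ACGT"  # assumed ordering; it affects the distance-based features


def cooccurrence(seq, offset=1):
    """Normalized 4x4 matrix of base pairs (seq[k], seq[k + offset])."""
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0] * 4 for _ in range(4)]
    for a, b in zip(seq, seq[offset:]):
        if a in index and b in index:  # skip ambiguous symbols such as N
            counts[index[a]][index[b]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("sequence too short for this offset")
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Energy, entropy, homogeneity, contrast, dissimilarity of matrix p."""
    cells = [(i, j, p[i][j]) for i in range(4) for j in range(4)]
    return [
        sum(v * v for _, _, v in cells),                      # energy
        -sum(v * math.log2(v) for _, _, v in cells if v > 0),  # entropy
        sum(v / (1 + (i - j) ** 2) for i, j, v in cells),      # homogeneity
        sum(v * (i - j) ** 2 for i, j, v in cells),            # contrast
        sum(v * abs(i - j) for i, j, v in cells),              # dissimilarity
    ]


def feature_vector(seq, offsets=(1, 2, 3)):
    """Concatenate the five features over several offsets."""
    vector = []
    for offset in offsets:
        vector.extend(texture_features(cooccurrence(seq, offset)))
    return vector
```

Because every gene yields the same number of features regardless of its length, the resulting vectors can be passed directly to a standard supervised classifier.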
Affiliation(s)
- Ranjeet Kumar Rout: National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India
- Saiyed Umer: Aliah University, Kolkata, West Bengal, India
- Monika Khandelwal: National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India
- Smitarani Pati: Dr. B R Ambedkar National Institute of Technology Jalandhar, Jalandhar, Punjab, India
- Saurav Mallik: Harvard T H Chan School of Public Health, Boston, United States; Department of Pharmacology and Toxicology, University of Arizona, Tucson, AZ, United States
- Hong Qin: Department of Computer Science and Engineering, University of Tennessee at Chattanooga, Chattanooga, TN, United States
- *Correspondence: Saurav Mallik; Hong Qin
6
Manzo M, Giordano M, Maddalena L, Guarracino MR, Granata I. Novel Data Science Methodologies for Essential Genes Identification Based on Network Analysis. Studies in Computational Intelligence 2023:117-145. [DOI: 10.1007/978-3-031-24453-7_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
7
Siewert-Rocks KM, Kim SS, Yao DW, Shi H, Price AL. Leveraging gene co-regulation to identify gene sets enriched for disease heritability. Am J Hum Genet 2022; 109:393-404. [PMID: 35108496 PMCID: PMC8948163 DOI: 10.1016/j.ajhg.2022.01.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/23/2021] [Accepted: 01/04/2022] [Indexed: 12/15/2022] Open
Abstract
Identifying gene sets that are associated with disease can provide valuable biological knowledge, but a fundamental challenge of gene set analyses of GWAS data is linking disease-associated SNPs to genes. Transcriptome-wide association studies (TWASs) detect associations between the genetically predicted expression of a gene and disease risk, thus implicating candidate disease genes. However, causal disease genes at TWAS-associated loci generally remain unknown due to gene co-regulation, which leads to correlations across genes in predicted expression. We developed a method, gene co-regulation score (GCSC) regression, to identify gene sets that are enriched for disease heritability explained by predicted expression. GCSC regresses TWAS chi-square statistics on gene co-regulation scores reflecting correlations in predicted gene expression; a gene set is enriched for heritability if genes with high co-regulation to the set have higher TWAS chi-square statistics than genes with low co-regulation to the set, beyond what is expected based on co-regulation to all genes. We verified via simulations that GCSC is well calibrated and well powered. We applied GCSC to gene expression data from GTEx (48 tissues) and GWAS summary statistics for 43 independent diseases and complex traits, analyzing a broad set of biological pathways and specifically expressed gene sets. We identified many enriched sets, recapitulating known biology. For Alzheimer disease, we detected evidence of an immune basis, and specifically a role for antigen presentation, in analyses of both biological pathways and specifically expressed gene sets. Our results highlight the advantages of leveraging gene co-regulation within the TWAS framework to identify enriched gene sets.
Affiliation(s)
- Katherine M Siewert-Rocks: Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Samuel S Kim: Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Douglas W Yao: Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA 02138, USA
- Huwenbo Shi: Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alkes L Price: Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
8
Li-Leger E, Feichtinger R, Flibotte S, Holzkamp H, Schnabel R, Moerman DG. Identification of essential genes in Caenorhabditis elegans through whole genome sequencing of legacy mutant collections. G3 Genes Genomes Genetics 2021; 11:6373896. [PMID: 34550348 PMCID: PMC8664450 DOI: 10.1093/g3journal/jkab328] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/16/2021] [Accepted: 08/27/2021] [Indexed: 01/23/2023]
Abstract
It has been estimated that 15%–30% of the ∼20,000 genes in C. elegans are essential, yet many of these genes remain to be identified or characterized. With the goal of identifying unknown essential genes, we performed whole-genome sequencing on complementation pairs from legacy collections of maternal-effect lethal and sterile mutants. This approach uncovered maternal genes required for embryonic development and genes with apparent sperm-specific functions. In total, 58 putative essential genes were identified on chromosomes III–V, of which 52 genes are represented by novel alleles in this collection. Of these 52 genes, 19 (40 alleles) were selected for further functional characterization. The terminal phenotypes of embryos were examined, revealing defects in cell division, morphogenesis, and osmotic integrity of the eggshell. Mating assays with wild-type males revealed previously unknown male-expressed genes required for fertilization and embryonic development. The result of this study is a catalog of mutant alleles in essential genes that will serve as a resource to guide further study toward a more complete understanding of this important model organism. As many genes and developmental pathways in C. elegans are conserved and essential genes are often linked to human disease, uncovering the function of these genes may also provide insight to further our understanding of human biology.
Affiliation(s)
- Erica Li-Leger: Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z3
- Richard Feichtinger: Department of Developmental Genetics, Institute of Genetics, Technische Universität Braunschweig, 38106, Germany
- Stephane Flibotte: UBC/LSI Bioinformatics Facility, University of British Columbia, Vancouver, British Columbia, Canada
- Heinke Holzkamp: Department of Developmental Genetics, Institute of Genetics, Technische Universität Braunschweig, 38106, Germany
- Ralf Schnabel: Department of Developmental Genetics, Institute of Genetics, Technische Universität Braunschweig, 38106, Germany
- Donald G Moerman: Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z3
9
DELEAT: gene essentiality prediction and deletion design for bacterial genome reduction. BMC Bioinformatics 2021; 22:444. [PMID: 34537011 PMCID: PMC8449488 DOI: 10.1186/s12859-021-04348-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/12/2021] [Accepted: 08/26/2021] [Indexed: 11/10/2022] Open
Abstract
Background: The study of gene essentiality is fundamental to understanding the basic principles of life, as well as for applications in many fields. In recent decades, dozens of sets of essential genes have been determined using different experimental and bioinformatics approaches, and this information has been useful for genome reduction of model organisms. Multiple in silico strategies have been developed to predict gene essentiality, but no optimal algorithm or set of gene features has been found yet, especially for non-model organisms with incomplete functional annotation. Results: We have developed DELEAT v0.1 (DELetion design by Essentiality Analysis Tool), an easy-to-use bioinformatic tool which integrates an in silico gene essentiality classifier in a pipeline allowing automatic design of large-scale deletions in any bacterial genome. The essentiality classifier consists of a novel logistic regression model based on only six gene features which are not dependent on experimental data or functional annotation. As a proof of concept, we have applied this pipeline to the determination of dispensable regions in the genome of Bartonella quintana str. Toulouse. In this already reduced genome, 35 possible deletions have been delimited, spanning 29% of the genome. Conclusions: Built on in silico gene essentiality predictions, we have developed an analysis pipeline which assists researchers throughout multiple stages of bacterial genome reduction projects, and created a novel classifier which is simple, fast, and universally applicable to any bacterial organism with a GenBank annotation file. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04348-5.
10
Guo Z, Fu Y, Huang C, Zheng C, Wu Z, Chen X, Gao S, Ma Y, Shahen M, Li Y, Tu P, Zhu J, Wang Z, Xiao W, Wang Y. NOGEA: A Network-oriented Gene Entropy Approach for Dissecting Disease Comorbidity and Drug Repositioning. Genomics Proteomics Bioinformatics 2021; 19:549-564. [PMID: 33744433 PMCID: PMC9040018 DOI: 10.1016/j.gpb.2020.06.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2019] [Revised: 04/04/2020] [Accepted: 09/24/2020] [Indexed: 10/31/2022]
Abstract
Rapid development of high-throughput technologies has permitted the identification of an increasing number of disease-associated genes (DAGs), which are important for understanding disease initiation and developing precision therapeutics. However, DAGs often contain large amounts of redundant or false positive information, leading to difficulties in quantifying and prioritizing potential relationships between these DAGs and human diseases. In this study, a network-oriented gene entropy approach (NOGEA) is proposed for accurately inferring master genes that contribute to specific diseases by quantitatively calculating their perturbation abilities on directed disease-specific gene networks. In addition, we confirmed that the master genes identified by NOGEA have a high reliability for predicting disease-specific initiation events and progression risk. Master genes may also be used to extract the underlying information of different diseases, thus revealing mechanisms of disease comorbidity. More importantly, approved therapeutic targets are topologically localized in a small neighborhood of master genes on the interactome network, which provides a new way for predicting drug-disease associations. Through this method, 11 old drugs were newly identified and predicted to be effective for treating pancreatic cancer and then validated by in vitro experiments. Collectively, the NOGEA was useful for identifying master genes that control disease initiation and co-occurrence, thus providing a valuable strategy for drug efficacy screening and repositioning. NOGEA codes are publicly available at https://github.com/guozihuaa/NOGEA.
Affiliation(s)
- Zihu Guo: College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China
- Yingxue Fu: College of Life Science, Northwest A & F University, Yangling 712100, China
- Chao Huang: College of Life Science, Northwest A & F University, Yangling 712100, China
- Chunli Zheng: College of Life Science, Northwest University, Xi'an 710069, China
- Ziyin Wu: College of Life Science, Northwest A & F University, Yangling 712100, China
- Xuetong Chen: College of Life Science, Northwest A & F University, Yangling 712100, China
- Shuo Gao: College of Life Science, Northwest A & F University, Yangling 712100, China
- Yaohua Ma: College of Life Science, Northwest University, Xi'an 710069, China
- Mohamed Shahen: Zoology Department, Faculty of Science, Tanta University, Tanta 31527, Egypt
- Yan Li: Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Faculty of Chemical, Environmental and Biological Science and Technology, Dalian University of Technology, Dalian 116024, China
- Pengfei Tu: State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Jingbo Zhu: School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China
- Zhenzhong Wang: State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
- Wei Xiao: State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
- Yonghua Wang: College of Life Science, Northwest University, Xi'an 710069, China; College of Life Science, Northwest A & F University, Yangling 712100, China; State Key Laboratory of New-tech for Chinese Medicine Pharmaceutical Process, Lianyungang 222001, China
11
Le NQK, Do DT, Hung TNK, Lam LHT, Huynh TT, Nguyen NTK. A Computational Framework Based on Ensemble Deep Neural Networks for Essential Genes Identification. Int J Mol Sci 2020; 21:E9070. [PMID: 33260643 PMCID: PMC7730808 DOI: 10.3390/ijms21239070] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 10/04/2020] [Revised: 11/25/2020] [Accepted: 11/26/2020] [Indexed: 01/13/2023] Open
Abstract
Essential genes contain key information of genomes that could be the key to a comprehensive understanding of life and evolution. Because of their importance, studies of essential genes have been considered a crucial problem in computational biology. Computational methods for identifying essential genes have become increasingly popular to reduce the cost and time consumption of traditional experiments. A few models have addressed this problem, but performance is still not satisfactory because of high-dimensional features and the use of traditional machine learning algorithms. Thus, there is a need to create a novel model to improve the predictive performance on this problem from DNA sequence features. This study took advantage of a natural language processing (NLP) model in learning biological sequences by treating them as natural language words. To learn from the NLP features, a supervised learning model based on an ensemble deep neural network was subsequently employed. Our proposed method could identify essential genes with sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC), and area under the receiver operating characteristic curve (AUC) values of 60.2%, 84.6%, 76.3%, 0.449, and 0.814, respectively. The overall performance outperformed the single models without ensemble, as well as the state-of-the-art predictors on the same benchmark dataset. This indicated the effectiveness of the proposed method in determining essential genes, in particular, and other sequencing problems, in general.
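For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, accuracy, and MCC follow from a binary confusion matrix using their standard definitions. The function name `binary_metrics` is hypothetical and the counts are supplied by the caller; nothing here reproduces the paper's data. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    Requires at least one actual positive (tp + fn > 0) and
    one actual negative (tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

MCC accounts for all four cells of the confusion matrix, which is why it is often reported alongside accuracy when essential and non-essential genes are imbalanced.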
Affiliation(s)
- Nguyen Quoc Khanh Le: Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 106, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan
- Duyen Thi Do: Graduate Institute of Biomedical Informatics, Taipei Medical University, Taipei 106, Taiwan
- Truong Nguyen Khanh Hung: International Master/Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan; Department of Orthopedic and Trauma, Cho Ray Hospital, Ho Chi Minh 70000, Vietnam
- Luu Ho Thanh Lam: International Master/Ph.D. Program in Medicine, College of Medicine, Taipei Medical University, Taipei 110, Taiwan; Intensive Care Unit, Children’s Hospital 2, Ho Chi Minh 70000, Vietnam
- Tuan-Tu Huynh: Department of Electrical Engineering, Yuan Ze University, Taoyuan 320, Taiwan; Department of Electrical Electronic and Mechanical Engineering, Lac Hong University, Dong Nai 76120, Vietnam
- Ngan Thi Kim Nguyen: School of Nutrition and Health Sciences, Taipei Medical University, Taipei 110, Taiwan
12
Enciso-Ramírez M, Reyes-Castillo Z, Llamas-Covarrubias MA, Guerrero L, López-Espinoza A, Valdés-Miramontes EH. CD36 gene polymorphism -31118 G > A (rs1761667) is associated with overweight and obesity but not with fat preferences in Mexican children. Int J Vitam Nutr Res 2020; 91:513-521. [PMID: 32419652 DOI: 10.1024/0300-9831/a000656] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
Abstract
CD36 glycoprotein is a candidate receptor involved in the gustatory detection of lipids, and emerging evidence suggests that genetic variation in CD36 may modulate the oral perception threshold for fatty acids. Here, we analyzed the association of the -31118 G > A polymorphism in the CD36 gene with nutritional status and preferences for fatty foods in Mexican children. Genotyping of SNP rs1761667 was performed in school-age children (n = 63), together with sensory tests evaluating the preference and satisfaction scores assigned to oil-based sauces of different fatty acid composition. The G allele was associated with a high BMI z-score in children (OR = 2.43, 95% CI 1.02-5.99; p = 0.02), but CD36 genotypes (AA, GA, and GG) did not show a significant association with the preference and satisfaction scores assigned to oil-based sauces. The BMI z-score showed no association with the preference for oil-based sauces; however, children with normal weight gave higher satisfaction scores to sauces with a high content of unsaturated fatty acids than to sauces rich in saturated fatty acids (0.56 ± 1.26 vs. 0.06 ± 1.22; p = 0.02). Therefore, the G allele of the -31118 G > A SNP in the CD36 gene is associated with overweight and obesity in Mexican children but does not appear to modulate preference and satisfaction scores for fat.
Affiliation(s)
- Mayra Enciso-Ramírez
- Instituto de Investigaciones en Comportamiento Alimentario y Nutrición (IICAN), Centro Universitario del Sur, Universidad de Guadalajara, Ciudad Guzmán, Jalisco, México
- Zyanya Reyes-Castillo
- Instituto de Investigaciones en Comportamiento Alimentario y Nutrición (IICAN), Centro Universitario del Sur, Universidad de Guadalajara, Ciudad Guzmán, Jalisco, México
- Mara Anaís Llamas-Covarrubias
- Instituto de Investigación en Ciencias Biomédicas (IICB), Centro Universitario de Ciencias de la Salud, Universidad de Guadalajara, Guadalajara, Jalisco, México
- Luis Guerrero
- IRTA-Monells, Institut de Recerca i Tecnologia Agroalimentàries, Granja Camps i Armet, Monells, Girona, Spain
- Antonio López-Espinoza
- Instituto de Investigaciones en Comportamiento Alimentario y Nutrición (IICAN), Centro Universitario del Sur, Universidad de Guadalajara, Ciudad Guzmán, Jalisco, México
- Elia Herminia Valdés-Miramontes
- Instituto de Investigaciones en Comportamiento Alimentario y Nutrición (IICAN), Centro Universitario del Sur, Universidad de Guadalajara, Ciudad Guzmán, Jalisco, México
13
Pei J, Kinch LN, Otwinowski Z, Grishin NV. Mutation severity spectrum of rare alleles in the human genome is predictive of disease type. PLoS Comput Biol 2020; 16:e1007775. [PMID: 32413045 PMCID: PMC7255613 DOI: 10.1371/journal.pcbi.1007775] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/23/2019] [Revised: 05/28/2020] [Accepted: 03/06/2020] [Indexed: 12/19/2022] Open
Abstract
The human genome harbors a variety of genetic variations. Single-nucleotide changes that alter amino acids in protein-coding regions are one of the major causes of human phenotypic variation and diseases. These single-amino acid variations (SAVs) are routinely found in whole genome and exome sequencing. Evaluating the functional impact of such genomic alterations is crucial for diagnosis of genetic disorders. We developed DeepSAV, a deep-learning convolutional neural network to differentiate disease-causing and benign SAVs based on a variety of protein sequence, structural and functional properties. Our method outperforms most stand-alone programs, and the version incorporating population and gene-level information (DeepSAV+PG) has predictive power similar to that of some of the best available methods. We transformed DeepSAV scores of rare SAVs in the human population into a quantity termed "mutation severity measure" for each human protein-coding gene. It reflects a gene's tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. Genes implicated in cancer, autism, and viral interaction are identified by this measure as intolerant to mutations, while genes associated with a number of other diseases are scored as tolerant. Among known disease-associated genes, those that are mutation-intolerant are likely to function in development and signal transduction pathways, while those that are mutation-tolerant tend to encode metabolic and mitochondrial proteins.
Affiliation(s)
- Jimin Pei
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Lisa N. Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Zbyszek Otwinowski
- Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Nick V. Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
14
Cacheiro P, Muñoz-Fuentes V, Murray SA, Dickinson ME, Bucan M, Nutter LMJ, Peterson KA, Haselimashhadi H, Flenniken AM, Morgan H, Westerberg H, Konopka T, Hsu CW, Christiansen A, Lanza DG, Beaudet AL, Heaney JD, Fuchs H, Gailus-Durner V, Sorg T, Prochazka J, Novosadova V, Lelliott CJ, Wardle-Jones H, Wells S, Teboul L, Cater H, Stewart M, Hough T, Wurst W, Sedlacek R, Adams DJ, Seavitt JR, Tocchini-Valentini G, Mammano F, Braun RE, McKerlie C, Herault Y, de Angelis MH, Mallon AM, Lloyd KCK, Brown SDM, Parkinson H, Meehan TF, Smedley D. Human and mouse essentiality screens as a resource for disease gene discovery. Nat Commun 2020; 11:655. [PMID: 32005800 PMCID: PMC6994715 DOI: 10.1038/s41467-020-14284-2] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 05/23/2019] [Accepted: 12/12/2019] [Indexed: 12/31/2022] Open
Abstract
The identification of causal variants in sequencing studies remains a considerable challenge that can be partially addressed by new gene-specific knowledge. Here, we integrate measures of how essential a gene is to supporting life, as inferred from viability and phenotyping screens performed on knockout mice by the International Mouse Phenotyping Consortium and essentiality screens carried out on human cell lines. We propose a cross-species gene classification across the Full Spectrum of Intolerance to Loss-of-function (FUSIL) and demonstrate that genes in five mutually exclusive FUSIL categories have differing biological properties. Most notably, Mendelian disease genes, particularly those associated with developmental disorders, are highly overrepresented among genes non-essential for cell survival but required for organism development. After screening developmental disorder cases from three independent disease sequencing consortia, we identify potentially pathogenic variants in genes not previously associated with rare diseases. We therefore propose FUSIL as an efficient approach for disease gene discovery.
Grants
- UM1 HG008900 NHGRI NIH HHS
- UM1 HG006504 NHGRI NIH HHS
- MC_UP_1502/1 Medical Research Council
- UM1 HG006542 NHGRI NIH HHS
- UM1 OD023221 NIH HHS
- MC_U142684171 Medical Research Council
- MR/S006753/1 Medical Research Council
- UM1 HG006370 NHGRI NIH HHS
- UM1 HG006493 NHGRI NIH HHS
- U54 HG006370 NHGRI NIH HHS
- U54 HG006364 NHGRI NIH HHS
- MC_U142684172 Medical Research Council
- UM1 HG006348 NHGRI NIH HHS
- U42 OD011174 NIH HHS
- U42 OD011175 NIH HHS
- Wellcome Trust
- This work was supported by NIH grant U54 HG006370. IMPC-related mouse production and phenotyping was funded by the Government of Canada through Genome Canada and Ontario Genomics (OGI-051) for NorCOMM2 (C.M.) and the National Institutes of Health and OD, NCRR, NIDDK and NHLBI for KOMP and KOMP2 Projects U42 OD011175 and UM1 OD023221 (C.M., K.C.K.L), Infrafrontier grant 01KX1012, EU Horizon2020: IPAD-MD funding 653961 (M.H.d.A); EUCOMM: LSHM-CT-2005-018931, EUCOMMTOOLS: FP7-HEALTH-F4-2010-261492 (W.G.W). UM1 HG006348; U42 OD011174; U54 HG005348 (A.L.B), NIH U54706HG006364 (A.L.B). Wellcome Trust grants WT098051 and WT206194 (D.A). The French National Centre for Scientific Research (CNRS), the French National Institute of Health and Medical Research (INSERM), the University of Strasbourg and the “Centre Europeen de Recherche en Biomedecine”, and the French state funds through the “Agence Nationale de la Recherche” under the frame programme Investissements d’Avenir labelled (ANR-10-IDEX-0002-02, ANR-10-LABX-0030-INRT, ANR-10-INBS-07 PHENOMIN) (J.H.). This research was made possible through access to the data and findings generated by the 100,000 Genomes Project. The 100,000 Genomes Project is managed by Genomics England Limited (a wholly owned company of the Department of Health). The 100,000 Genomes Project is funded by the National Institute for Health Research and NHS England. The Wellcome Trust, Cancer Research UK and the Medical Research Council have also funded research infrastructure. The 100,000 Genomes Project uses data provided by patients and collected by the National Health Service as part of their care and support. We are also grateful for the data access provided by the DDD and CMG projects. The DDD study presents independent research commissioned by the Health Innovation Challenge Fund [grant number HICF-1009-003], a parallel funding partnership between Wellcome and the Department of Health, and the Wellcome Sanger Institute [grant number WT098051].
The views expressed in this publication are those of the author(s) and not necessarily those of Wellcome or the Department of Health. The study has UK Research Ethics Committee approval (10/H0305/83, granted by the Cambridge South REC, and GEN/284/12 granted by the Republic of Ireland REC). The research team acknowledges the support of the National Institute for Health Research, through the Comprehensive Clinical Research Network. The Centers for Mendelian Genomics are funded by the National Human Genome Research Institute, the National Heart, Lung, and Blood Institute, and the National Eye Institute. Broad Institute (UM1 HG008900), Johns Hopkins University School of Medicine/Baylor College of Medicine (UM1 HG006542), University of Washington (UM1 HG006493), Yale University (UM1 HG006504).
Affiliation(s)
- Pilar Cacheiro
- Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Violeta Muñoz-Fuentes
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mary E Dickinson
- Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Maja Bucan
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Lauryl M J Nutter
- The Centre for Phenogenomics, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
- Hamed Haselimashhadi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ann M Flenniken
- The Centre for Phenogenomics, Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON, M5T 3H7, Canada
- Hugh Morgan
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Henrik Westerberg
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Tomasz Konopka
- Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Chih-Wei Hsu
- Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Audrey Christiansen
- Departments of Molecular Physiology and Biophysics, Baylor College of Medicine, Houston, TX, 77030, USA
- Denise G Lanza
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Arthur L Beaudet
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jason D Heaney
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Helmut Fuchs
- German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Valerie Gailus-Durner
- German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Tania Sorg
- Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404, Illkirch, France
- Jan Prochazka
- Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- Vendula Novosadova
- Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- Sara Wells
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Lydia Teboul
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Heather Cater
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Michelle Stewart
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Tertius Hough
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Wolfgang Wurst
- Institute of Developmental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health GmbH, 85764, Neuherberg, Germany
- Department of Developmental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, 85764, Neuherberg, Germany
- Deutsches Institut für Neurodegenerative Erkrankungen (DZNE) Site Munich, Munich Cluster for Systems Neurology (SyNergy), Adolf-Butenandt-Institut, Ludwig-Maximilians-Universität München, 80336, Munich, Germany
- Radislav Sedlacek
- Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vestec, 252 50, Prague, Czech Republic
- David J Adams
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- John R Seavitt
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Glauco Tocchini-Valentini
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, 00015, Monterotondo Scalo, Italy
- Fabio Mammano
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, 00015, Monterotondo Scalo, Italy
- Colin McKerlie
- The Centre for Phenogenomics, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
- Translational Medicine, The Hospital for Sick Children, Toronto, ON, M5T 3H7, Canada
- Yann Herault
- Université de Strasbourg, CNRS, INSERM, Institut de Génétique, Biologie Moléculaire et Cellulaire, Institut Clinique de la Souris, IGBMC, PHENOMIN-ICS, 67404, Illkirch, France
- Martin Hrabě de Angelis
- German Mouse Clinic, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
- Department of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, 85354, Freising-Weihenstephan, Germany
- German Center for Diabetes Research (DZD), 85764, Neuherberg, Germany
- Ann-Marie Mallon
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- K C Kent Lloyd
- Mouse Biology Program, University of California, Davis, CA, 95618, USA
- Steve D M Brown
- Medical Research Council Harwell Institute (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, OX11 0RD, UK
- Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Terrence F Meehan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Damian Smedley
- Clinical Pharmacology, William Harvey Research Institute, School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
15
Thompson B, Katsanis N, Apostolopoulos N, Thompson DC, Nebert DW, Vasiliou V. Genetics and functions of the retinoic acid pathway, with special emphasis on the eye. Hum Genomics 2019; 13:61. [PMID: 31796115 PMCID: PMC6892198 DOI: 10.1186/s40246-019-0248-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/24/2019] [Accepted: 11/12/2019] [Indexed: 02/07/2023] Open
Abstract
Retinoic acid (RA) is a potent morphogen required for embryonic development. RA is formed in a multistep process from vitamin A (retinol); RA acts in a paracrine fashion to shape the developing eye and is essential for normal optic vesicle and anterior segment formation. Perturbation in RA-signaling can result in severe ocular developmental diseases—including microphthalmia, anophthalmia, and coloboma. RA-signaling is also essential for embryonic development and life, as indicated by the significant consequences of mutations in genes involved in RA-signaling. The requirement of RA-signaling for normal development is further supported by the manifestation of severe pathologies in animal models of RA deficiency—such as ventral lens rotation, failure of optic cup formation, and embryonic and postnatal lethality. In this review, we summarize RA-signaling, recent advances in our understanding of this pathway in eye development, and the requirement of RA-signaling for embryonic development (e.g., organogenesis and limb bud development) and life.
Affiliation(s)
- Brian Thompson
- Department of Environmental Health Sciences, Yale School of Public Health, 60 College St, New Haven, CT, 06520, USA
- Nicholas Katsanis
- Stanley Manne Research Institute, Lurie Children's Hospital, Chicago, IL, 60611, USA
- Departments of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Nicholas Apostolopoulos
- Department of Environmental Health Sciences, Yale School of Public Health, 60 College St, New Haven, CT, 06520, USA
- David C Thompson
- Department of Clinical Pharmacy, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Denver, Aurora, CO, 80045, USA
- Daniel W Nebert
- Department of Environmental Health and Center for Environmental Genetics, University Cincinnati Medical Center, Cincinnati, OH, 45267-0056, USA
- Vasilis Vasiliou
- Department of Environmental Health Sciences, Yale School of Public Health, 60 College St, New Haven, CT, 06520, USA
16
Mustafin ZS, Zamyatin VI, Konstantinov DK, Doroshkov AV, Lashin SA, Afonnikov DA. Phylostratigraphic Analysis Shows the Earliest Origination of the Abiotic Stress Associated Genes in A. thaliana. Genes (Basel) 2019; 10:963. [PMID: 31766757 PMCID: PMC6947294 DOI: 10.3390/genes10120963] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/28/2019] [Revised: 11/16/2019] [Accepted: 11/18/2019] [Indexed: 12/27/2022] Open
Abstract
Plants constantly contend with stress factors such as high or low temperature, drought, soil salinity, and flooding. Plants have evolved a set of stress response mechanisms, which involve physiological and biochemical changes that result in adaptive or morphological changes. At the molecular level, the stress response in plants is carried out by genetic networks, which themselves change in the course of evolution. Studying the structure and evolution of these networks may highlight the mechanisms of plant adaptation to adverse conditions and of their response to stress, and may help in the discovery and functional characterization of stress-related genes. We performed an analysis of Arabidopsis thaliana genes associated with several types of abiotic stress (heat, cold, water-related, light, osmotic, salt, and oxidative) at the network level using a phylostratigraphic approach. Our results show that a substantial fraction of the genes associated with various types of abiotic stress is of ancient origin and evolves under strong purifying selection. The interaction networks of genes associated with stress response have a modular structure, with a regulatory component being one of the largest for five of the seven stress types. We demonstrated a positive relationship between the number of interactions of a gene in the stress gene network and its age. Moreover, genes of the same age tend to be connected in stress gene networks. We also demonstrated that old stress-related genes usually participate in the response to various types of stress and are involved in numerous biological processes unrelated to stress. Our results demonstrate that stress response genes represent one of the ancient and fundamental molecular systems in plants.
Affiliation(s)
- Zakhar S. Mustafin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Vladimir I. Zamyatin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Dmitrii K. Konstantinov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Aleksej V. Doroshkov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Sergey A. Lashin
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Correspondence: S.A.L. and D.A.A.; Tel.: +7-383-363-49-63 (D.A.A.)
- Dmitry A. Afonnikov
- The Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences (IC & G SB RAS), 630090 Novosibirsk, Russia
- Kurchatov Genomics Center, Institute of Cytology and Genetics, SB RAS, 630090 Novosibirsk, Russia
- Faculty of Natural Sciences, Novosibirsk State University (NSU), 630090 Novosibirsk, Russia
- Correspondence: S.A.L. and D.A.A.; Tel.: +7-383-363-49-63 (D.A.A.)
17
Renschler G, Richard G, Valsecchi CIK, Toscano S, Arrigoni L, Ramírez F, Akhtar A. Hi-C guided assemblies reveal conserved regulatory topologies on X and autosomes despite extensive genome shuffling. Genes Dev 2019; 33:1591-1612. [PMID: 31601616 PMCID: PMC6824461 DOI: 10.1101/gad.328971.119] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/18/2019] [Accepted: 09/09/2019] [Indexed: 11/30/2022]
Abstract
In this study, Renschler et al. set out to analyze the impact of genomic rearrangements on genome topology using the Drosophila genus and X chromosome dosage compensation as a model. The authors developed a scaffolding algorithm and generated chromosome-length assemblies from Hi-C data for studying genome topology in three distantly related Drosophila species. Their data provide unique insights into genome topology evolution. Genome rearrangements that occur during evolution impose major challenges on regulatory mechanisms that rely on three-dimensional genome architecture. Here, we developed a scaffolding algorithm and generated chromosome-length assemblies from Hi-C data for studying genome topology in three distantly related Drosophila species. We observe extensive genome shuffling between these species, with one synteny breakpoint after approximately every six genes. A/B compartments, a set of large gene-dense topologically associating domains (TADs), and spatial contacts between high-affinity sites (HAS) located on the X chromosome are maintained over 40 million years, indicating architectural conservation at various hierarchies. Evolutionarily conserved genes cluster in the vicinity of HAS, while HAS locations appear evolutionarily flexible, thus uncoupling the functional requirement of dosage compensation from individual positions on the linear X chromosome. Therefore, 3D architecture is preserved even in scenarios of thousands of rearrangements, highlighting its relevance for essential processes such as dosage compensation of the X chromosome.
Affiliation(s)
- Gina Renschler
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Faculty of Biology, University of Freiburg, 79104 Freiburg, Germany
- Gautier Richard
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- IGEPP, INRA, Agrocampus Ouest, Université Rennes, 35600 Le Rheu, France
- Sarah Toscano
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Laura Arrigoni
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Fidel Ramírez
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
- Asifa Akhtar
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
18
Wen QF, Liu S, Dong C, Guo HX, Gao YZ, Guo FB. Geptop 2.0: An Updated, More Precise, and Faster Geptop Server for Identification of Prokaryotic Essential Genes. Front Microbiol 2019; 10:1236. [PMID: 31214154 PMCID: PMC6558110 DOI: 10.3389/fmicb.2019.01236] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/02/2019] [Accepted: 05/17/2019] [Indexed: 12/16/2022] Open
Abstract
Geptop has performed effectively in the identification of prokaryotic essential genes since its first release in 2013. It estimates gene essentiality for prokaryotes based on orthology and phylogeny. Genome-scale essentiality data are now available for more prokaryotic species, and this information has been collected in public essential gene repositories such as DEG and OGEE. A faster and more accurate toolkit is needed to keep pace with the growing volume of prokaryotic genome data. We updated Geptop by adding more validated essentiality data to the reference set (from 19 to 37 species) and by introducing multi-process technology to accelerate computation. Compared with Geptop 1.0 and other gene essentiality prediction models, Geptop 2.0 generates more stable predictions and finishes the computation in a shorter time. The software is available both as an online server and as a downloadable standalone application. We hope that the improved Geptop 2.0 will facilitate research on gene essentiality and the development of novel antibacterial drugs. The gene essentiality prediction tool is available at http://cefg.uestc.cn/geptop.
Affiliation(s)
- Qing-Feng Wen
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Shuo Liu
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Chuan Dong
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Hai-Xia Guo
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Yi-Zhou Gao
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Feng-Biao Guo
- School of Life Sciences and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
19
Ji X, Rajpal DK, Freudenberg JM. The essentiality of drug targets: an analysis of current literature and genomic databases. Drug Discov Today 2019; 24:544-550. [DOI: 10.1016/j.drudis.2018.11.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/15/2018] [Revised: 09/18/2018] [Accepted: 11/05/2018] [Indexed: 12/14/2022]
20
Tian D, Wenlock S, Kabir M, Tzotzos G, Doig AJ, Hentges KE. Identifying mouse developmental essential genes using machine learning. Dis Model Mech 2018; 11:dmm034546. [PMID: 30563825 PMCID: PMC6307915 DOI: 10.1242/dmm.034546] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/14/2018] [Accepted: 10/19/2018] [Indexed: 12/20/2022] Open
Abstract
The genes that are required for organismal survival are annotated as ‘essential genes’. Identifying all the essential genes of an animal species can reveal critical functions that are needed during the development of the organism. To inform studies on mouse development, we developed a supervised machine learning classifier based on phenotype data from mouse knockout experiments. We used this classifier to predict the essentiality of mouse genes lacking experimental data. Validation of our predictions against a blind test set of recent mouse knockout experimental data indicated a high level of accuracy (>80%). We also validated our predictions for other mouse mutagenesis methodologies, demonstrating that the predictions are accurate for lethal phenotypes isolated in random chemical mutagenesis screens and embryonic stem cell screens. The biological functions that are enriched in essential and non-essential genes have been identified, showing that essential genes tend to encode intracellular proteins that interact with nucleic acids. The genome distribution of predicted essential and non-essential genes was analysed, demonstrating that the density of essential genes varies throughout the genome. A comparison with human essential and non-essential genes was performed, revealing conservation between human and mouse gene essentiality status. Our genome-wide predictions of mouse essential genes will be of value for the planning of mouse knockout experiments and phenotyping assays, for understanding the functional processes required during mouse development, and for the prioritisation of disease candidate genes identified in human genome and exome sequence datasets. Summary: Here, we used computer-based machine learning methodology to predict which genes in the mouse genome are essential for development, and present a database of mouse essential and non-essential genes.
Affiliation(s)
- David Tian
- Division of Evolution and Genomic Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester M13 9PT, UK
- Stephanie Wenlock
- Division of Evolution and Genomic Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester M13 9PT, UK
- Mitra Kabir
- Division of Evolution and Genomic Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester M13 9PT, UK
- George Tzotzos
- Department of Agriculture, Food and Environmental Sciences, Marche Polytechnic University, Ancona 60121, Italy
- Andrew J Doig
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, UK; Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PT, UK
- Kathryn E Hentges
- Division of Evolution and Genomic Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, The University of Manchester, Oxford Road, Manchester M13 9PT, UK
21
Yu S, Zheng C, Zhou F, Baillie DL, Rose AM, Deng Z, Chu JSC. Genomic identification and functional analysis of essential genes in Caenorhabditis elegans. BMC Genomics 2018; 19:871. [PMID: 30514206 PMCID: PMC6278001 DOI: 10.1186/s12864-018-5251-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 02/21/2018] [Accepted: 11/14/2018] [Indexed: 11/27/2022] Open
Abstract
Background Essential genes are required for an organism’s viability and their functions can vary greatly, spreading across many pathways. Due to the importance of essential genes, large scale efforts have been undertaken to identify the complete set of essential genes and to understand their function. Studies of genome architecture and organization have found that genes are not randomly distributed in the genome. Results Using combined genetic mapping, Illumina sequencing, and bioinformatics analyses, we successfully identified 44 essential genes with 130 lethal mutations in genomic regions of C. elegans of around 7.3 Mb from Chromosome I (left). Of the 44 essential genes, six had not previously been characterized by mutant alleles: let-633/let-638 (B0261.1), let-128 (C53H9.2), let-511 (W09C3.4), let-162 (Y47G6A.18), let-510 (Y47G6A.19), and let-131 (Y71G12B.6). Examining essential genes with Hi-C data shows that they tend to cluster within TAD units rather than near TAD boundaries. We have also shown that essential genes in the left half of chromosome I in C. elegans function in enzyme and nucleic acid binding activities during fundamental processes, such as DNA replication, transcription, and translation. From protein-protein interaction networks, essential genes exhibit more protein connectivity than non-essential genes in the genome. Also, many of the essential genes show strong expression in embryos or early larval stages, indicating that they are important to early development. Conclusions Our results provide a more comprehensive picture of the essential genes and their functional characterization. These genetic resources will offer important tools for further health and disease research.
Affiliation(s)
- Shicheng Yu
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education, School of Pharmaceutical Sciences, Wuhan University, Wuhan, 430071, China; Wuhan Frasergen Bioinformatics, Wuhan East Lake High-tech Zone, Wuhan, 430075, China.
- Chaoran Zheng
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education, School of Pharmaceutical Sciences, Wuhan University, Wuhan, 430071, China
- Fan Zhou
- Wuhan Frasergen Bioinformatics, Wuhan East Lake High-tech Zone, Wuhan, 430075, China
- David L Baillie
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Ann M Rose
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Zixin Deng
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery, Ministry of Education, School of Pharmaceutical Sciences, Wuhan University, Wuhan, 430071, China.
22
Dong C, Jin YT, Hua HL, Wen QF, Luo S, Zheng WX, Guo FB. Comprehensive review of the identification of essential genes using computational methods: focusing on feature implementation and assessment. Brief Bioinform 2018; 21:171-181. [PMID: 30496347 DOI: 10.1093/bib/bby116] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/13/2018] [Revised: 11/01/2018] [Accepted: 11/02/2018] [Indexed: 02/06/2023] Open
Abstract
Essential genes have attracted increasing attention in recent years due to the important functions of these genes in organisms. Among the methods used to identify the essential genes, accurate and efficient computational methods can make up for the deficiencies of expensive and time-consuming experimental technologies. In this review, we have collected studies on essential gene prediction in prokaryotes and eukaryotes and summarized the five predominant types of features used in these studies. The five types of features include evolutionary conservation, domain information, network topology, sequence component and expression level. We have described how to implement the useful forms of these features and evaluated their performance based on the data of Escherichia coli MG1655, Bacillus subtilis 168 and human. The prerequisites and applicable range of these features are described. In addition, we have investigated the techniques used to weight features in various models. To assist researchers in the field, we refer to two freely accessible online tools that can be used directly to predict gene essentiality in prokaryotes and humans. This article provides a simple guide for the identification of essential genes in prokaryotes and eukaryotes.
Affiliation(s)
- Chuan Dong
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Yan-Ting Jin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hong-Li Hua
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Qing-Feng Wen
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Sen Luo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Wen-Xin Zheng
- School of Biomedical Engineering, Capital Medical University, Beijing, China
- Feng-Biao Guo
- School of Life Science and Technology, Center for Informational Biology, Intelligent Learning Institute for Science and Application, University of Electronic Science and Technology of China, Chengdu, China
23
Hao T, Wang Q, Zhao L, Wu D, Wang E, Sun J. Analyzing of Molecular Networks for Human Diseases and Drug Discovery. Curr Top Med Chem 2018; 18:1007-1014. [PMID: 30101711 PMCID: PMC6174636 DOI: 10.2174/1568026618666180813143408] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/01/2017] [Revised: 06/22/2018] [Accepted: 07/03/2018] [Indexed: 01/11/2023]
Abstract
Molecular networks represent the interactions and relations of genes/proteins, and also encode molecular mechanisms of biological processes, development and diseases. Among the molecular networks, protein-protein Interaction Networks (PINs) have become effective platforms for uncovering the molecular mechanisms of diseases and drug discovery. PINs have been constructed for various organisms and utilized to solve many biological problems. In human, most proteins present their complex functions by interactions with other proteins, and the sum of these interactions represents the human protein interactome. Especially in the research on human disease and drugs, as an emerging tool, the PIN provides a platform to systematically explore the molecular complexities of specific diseases and the references for drug design. In this review, we summarized the commonly used approaches to aid disease research and drug discovery with PINs, including the network topological analysis, identification of novel pathways, drug targets and sub-network biomarkers for diseases. With the development of bioinformatic techniques and biological networks, PINs will play an increasingly important role in human disease research and drug discovery.
Affiliation(s)
- Tong Hao
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Qian Wang
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Lingxuan Zhao
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Dan Wu
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China
- Edwin Wang
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China; University of Calgary Cumming School of Medicine, Calgary, Alberta T2N 4Z6, Canada
- Jinsheng Sun
- Tianjin Key Laboratory of Animal and Plant Resistance/College of Life Sciences, Tianjin Normal University, Tianjin 300387, China; Tianjin Bohai Fisheries Research Institute, Tianjin 300221, China
24
Ahmad S, Navid A, Akhtar AS, Azam SS, Wadood A, Pérez-Sánchez H. Subtractive Genomics, Molecular Docking and Molecular Dynamics Simulation Revealed LpxC as a Potential Drug Target Against Multi-Drug Resistant Klebsiella pneumoniae. Interdiscip Sci 2018; 11:508-526. [PMID: 29721784 DOI: 10.1007/s12539-018-0299-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/08/2017] [Revised: 04/11/2018] [Accepted: 04/24/2018] [Indexed: 12/17/2022]
Abstract
The emergence and dissemination of pan drug resistant clones of Klebsiella pneumoniae are a great threat to public health. In this regard, new therapeutic targets must be highlighted to pave the path for novel drug discovery and development. A subtractive proteomic pipeline brought forth UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine deacetylase (LpxC), a Zn2+-dependent cytoplasmic metalloprotein that catalyzes the rate-limiting deacetylation step of the lipid A biosynthesis pathway. Primary sequence analysis followed by 3-dimensional (3-D) structure elucidation of the protein led to the detection of a K. pneumoniae LpxC (KpLpxC) topology distinct from its orthologous counterparts in other bacterial species. A molecular docking study of the protein recognized receptor antagonist compound 106, a uridine-based LpxC inhibitory compound, as the ligand best able to fit the binding pocket, with a Gold Score of 67.53. Molecular dynamics simulation of docked KpLpxC revealed an alternate binding pattern of the ligand in the active site. The ligand tail exhibited preferred binding to the domain I residues as opposed to the substrate binding hydrophobic channel of subdomain II, usually targeted by inhibitory compounds. Comparison with the undocked KpLpxC system demonstrated ligand-induced high conformational changes in the hydrophobic channel of subdomain II in KpLpxC. Hence, the ligand exerted its inhibitory potential by rendering the channel unstable for substrate binding.
Affiliation(s)
- Sajjad Ahmad
- National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Afifa Navid
- National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Amina Saleem Akhtar
- National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Syed Sikander Azam
- National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan.
- Abdul Wadood
- Department of Biochemistry, Abdul Wali Khan University-Mardan, Shankar Campus, Mardan, Khyber Pukhtoonkhwa, Pakistan
- Horacio Pérez-Sánchez
- Structural Bioinformatics and High Performance Computing Research Group (BIO-HPC), Universidad Católica San Antonio de Murcia (UCAM), Murcia, Spain
25
Hernando-Rodríguez B, Erinjeri AP, Rodríguez-Palero MJ, Millar V, González-Hernández S, Olmedo M, Schulze B, Baumeister R, Muñoz MJ, Askjaer P, Artal-Sanz M. Combined flow cytometry and high-throughput image analysis for the study of essential genes in Caenorhabditis elegans. BMC Biol 2018; 16:36. [PMID: 29598825 PMCID: PMC5875015 DOI: 10.1186/s12915-018-0496-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 11/15/2017] [Accepted: 02/06/2018] [Indexed: 12/28/2022] Open
Abstract
Background Advances in automated image-based microscopy platforms coupled with high-throughput liquid workflows have facilitated the design of large-scale screens utilising multicellular model organisms such as Caenorhabditis elegans to identify genetic interactions, therapeutic drugs or disease modifiers. However, the analysis of essential genes has lagged behind because lethal or sterile mutations pose a bottleneck for high-throughput approaches, and a systematic way to analyse genetic interactions of essential genes in multicellular organisms has been lacking. Results In C. elegans, non-conditional lethal mutations can be maintained in heterozygosity using chromosome balancers, commonly expressing green fluorescent protein (GFP) in the pharynx. However, gene expression or function is typically monitored by the use of fluorescent reporters marked with the same fluorophore, presenting a challenge to sort worm populations of interest, particularly at early larval stages. Here, we develop a sorting strategy capable of selecting homozygous mutants carrying a GFP stress reporter from GFP-balanced animals at the second larval stage. Because sorting is not completely error-free, we develop an automated high-throughput image analysis protocol that identifies and discards animals carrying the chromosome balancer. We demonstrate the experimental usefulness of combining sorting of homozygous lethal mutants and automated image analysis in a functional genomic RNA interference (RNAi) screen for genes that genetically interact with mitochondrial prohibitin (PHB). Lack of PHB results in embryonic lethality, while homozygous PHB deletion mutants develop into sterile adults due to maternal contribution and strongly induce the mitochondrial unfolded protein response (UPRmt). In a chromosome-wide RNAi screen for C. elegans genes having human orthologues, we uncover both known and new PHB genetic interactors affecting the UPRmt and growth. 
Conclusions The method presented here allows the study of balanced lethal mutations in a high-throughput manner. It can be easily adapted depending on the user’s requirements and should serve as a useful resource for the C. elegans community for probing new biological aspects of essential nematode genes as well as the generation of more comprehensive genetic networks.
Affiliation(s)
- Blanca Hernando-Rodríguez
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain
- Annmary Paul Erinjeri
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain
- María Jesús Rodríguez-Palero
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain
- Val Millar
- GE Healthcare Life Sciences, Maynard Centre, Forest Farm, Whitchurch, Cardiff, UK; Present address: Target Discovery Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Sara González-Hernández
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain; Present address: Cell and Developmental Biology Area, Centro Nacional de Investigaciones Cardiovasculares Carlos III (CNIC), Madrid, Spain
- María Olmedo
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain; Present address: Department of Genetics, University of Seville, Seville, Spain
- Bettina Schulze
- Centre for Biological Signalling Studies (BIOSS), Laboratory for Bioinformatics and Molecular Genetics, Faculty of Biology, and ZBMZ Center for Biochemistry and Molecular Cell Biology (Faculty of Medicine), Albert-Ludwigs-University of Freiburg, Freiburg, Germany
- Ralf Baumeister
- Centre for Biological Signalling Studies (BIOSS), Laboratory for Bioinformatics and Molecular Genetics, Faculty of Biology, and ZBMZ Center for Biochemistry and Molecular Cell Biology (Faculty of Medicine), Albert-Ludwigs-University of Freiburg, Freiburg, Germany
- Manuel J Muñoz
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain
- Peter Askjaer
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain
- Marta Artal-Sanz
- Andalusian Center for Developmental Biology, Consejo Superior de Investigaciones Científicas/Junta de Andalucía/Universidad Pablo de Olavide, Seville, Spain; Department of Molecular Biology and Biochemical Engineering, Universidad Pablo de Olavide, Seville, Spain.
26
Amaral PP, Leonardi T, Han N, Viré E, Gascoigne DK, Arias-Carrasco R, Büscher M, Pandolfini L, Zhang A, Pluchino S, Maracaja-Coutinho V, Nakaya HI, Hemberg M, Shiekhattar R, Enright AJ, Kouzarides T. Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci. Genome Biol 2018; 19:32. [PMID: 29540241 PMCID: PMC5853149 DOI: 10.1186/s13059-018-1405-5] [Citation(s) in RCA: 91] [Impact Index Per Article: 15.2] [Received: 10/23/2017] [Accepted: 02/07/2018] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. RESULTS We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. CONCLUSIONS This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.
Affiliation(s)
- Paulo P. Amaral
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Tommaso Leonardi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Department of Clinical Neurosciences and NIHR Biomedical Research Centre, University of Cambridge, Cambridge, UK
- Namshik Han
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Present address: The Milner Therapeutics Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Emmanuelle Viré
- Present address: MRC Prion Unit, UCL Institute of Neurology, Queen Square House, Queen Square, London, WC1N 3BG UK
- Dennis K. Gascoigne
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Raúl Arias-Carrasco
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Magdalena Büscher
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Luca Pandolfini
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Anda Zhang
- University of Miami Miller School of Medicine, Sylvester Comprehensive Cancer Center, Department of Human Genetics, Biomedical Research Building, Miami, FL 33136 USA
- Stefano Pluchino
- Department of Clinical Neurosciences and NIHR Biomedical Research Centre, University of Cambridge, Cambridge, UK
- Vinicius Maracaja-Coutinho
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
- Advanced Center for Chronic Diseases (ACCDiS), Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Santiago, Chile
- Helder I. Nakaya
- School of Pharmaceutical Sciences, University of São Paulo, Av. Prof. Lineu Prestes 580, São Paulo, 05508 Brazil
- Martin Hemberg
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA UK
- Ramin Shiekhattar
- University of Miami Miller School of Medicine, Sylvester Comprehensive Cancer Center, Department of Human Genetics, Biomedical Research Building, Miami, FL 33136 USA
- Anton J. Enright
- Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QP UK
- Tony Kouzarides
- The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN UK
27
Genomic Identification and Functional Characterization of Essential Genes in Caenorhabditis elegans. G3-GENES GENOMES GENETICS 2018; 8:981-997. [PMID: 29339407 PMCID: PMC5844317 DOI: 10.1534/g3.117.300338] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 12/24/2022]
Abstract
Using combined genetic mapping, Illumina sequencing, bioinformatics analyses, and experimental validation, we identified 60 essential genes from 104 lethal mutations in two genomic regions of Caenorhabditis elegans totaling ∼14 Mb on chromosome III(mid) and chromosome V(left). Five of the 60 genes had not previously been shown to have lethal phenotypes by RNA interference depletion. By analyzing the regions around the lethal missense mutations, we identified four putative new protein functional domains. Furthermore, functional characterization of the identified essential genes shows that most are enzymes, including helicases, tRNA synthetases, and kinases in addition to ribosomal proteins. Gene Ontology analysis indicated that essential genes often encode enzymes that conduct nucleic acid binding activities during fundamental processes, such as intracellular DNA replication, transcription, and translation. Analysis of essential genes shows that they have fewer paralogs, encode proteins that are in protein interaction hubs, and are highly expressed relative to nonessential genes. All these essential gene traits in C. elegans are consistent with those of human disease genes. Most human orthologs (90%) of the essential genes in this study are related to human diseases. Therefore, functional characterization of essential genes underlines their importance as proxies for understanding the biological functions of human disease genes.
28
Krabbenhoft TJ, Turner TF. Comparative transcriptomics of cyprinid minnows and carp in a common wild setting: a resource for ecological genomics in freshwater communities. DNA Res 2018; 25:11-23. [PMID: 28985264 PMCID: PMC5824830 DOI: 10.1093/dnares/dsx034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/18/2017] [Accepted: 08/12/2017] [Indexed: 12/30/2022] Open
Abstract
Comparative transcriptomics can now be conducted on organisms in natural settings, which has greatly enhanced understanding of genome–environment interactions. Here, we demonstrate the utility and potential pitfalls of comparative transcriptomics of wild organisms, with an example from three cyprinid fish species (Teleostei:Cypriniformes). We present extensively filtered and annotated transcriptome assemblies that provide a valuable resource for studies of genome evolution (e.g. polyploidy), ecological and morphological diversification, speciation, and shared and unique responses to environmental variation in cyprinid fishes. Our results and analyses address the following points: (i) ‘essential developmental genes’ are shown to be ubiquitously expressed in a diverse suite of tissues across later ontogenetic stages (i.e. juveniles and adults), making these genes useful for assessing the quality of transcriptome assemblies, (ii) the influence of microbiomes and other exogenous DNA, (iii) potentially novel, species-specific genes, and (iv) genomic rearrangements (e.g. whole genome duplication). The data we present provide a resource for future comparative work in cypriniform fishes and other taxa across a variety of sub-disciplines, including stress response, morphological diversification, community ecology, ecotoxicology, and climate change.
Affiliation(s)
- Trevor J Krabbenhoft
- Department of Biology and Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Thomas F Turner
- Department of Biology and Museum of Southwestern Biology, University of New Mexico, Albuquerque, NM 87131, USA
29
Wang T, Bu CH, Hildebrand S, Jia G, Siggs OM, Lyon S, Pratt D, Scott L, Russell J, Ludwig S, Murray AR, Moresco EMY, Beutler B. Probability of phenotypically detectable protein damage by ENU-induced mutations in the Mutagenetix database. Nat Commun 2018; 9:441. [PMID: 29382827 PMCID: PMC5789985 DOI: 10.1038/s41467-017-02806-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 06/01/2017] [Accepted: 12/27/2017] [Indexed: 12/23/2022] Open
Abstract
Computational inference of mutation effects is necessary for genetic studies in which many mutations must be considered as etiologic candidates. Programs such as PolyPhen-2 predict the relative severity of damage caused by missense mutations, but not the actual probability that a mutation will reduce/eliminate protein function. Based on genotype and phenotype data for 116,330 ENU-induced mutations in the Mutagenetix database, we calculate that putative null mutations, and PolyPhen-2-classified “probably damaging”, “possibly damaging”, or “probably benign” mutations have, respectively, 61%, 17%, 9.8%, and 4.5% probabilities of causing phenotypically detectable damage in the homozygous state. We use these probabilities in the estimation of genome saturation and the probability that individual proteins have been adequately tested for function in specific genetic screens. We estimate the proportion of essential autosomal genes in Mus musculus (C57BL/6J) and show that viable mutations in essential genes are more likely to induce phenotype than mutations in non-essential genes. Programs such as PolyPhen-2 predict the relative severity of damage by missense mutations. Here, Wang et al estimate probabilities that putative null or missense alleles would reduce protein function to cause detectable phenotype by analyzing data from ENU-induced mouse mutations.
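As a rough illustration of how per-class probabilities like those quoted in this abstract could feed a "has this gene been adequately tested" estimate, the sketch below combines them under an independence assumption. This is not the authors' published procedure: the function name `prob_any_detectable_damage`, the class labels, and the independence assumption are all hypothetical choices made for this example; only the four probabilities come from the abstract.

```python
from math import prod

# Probabilities of phenotypically detectable damage in the homozygous state,
# as quoted in the abstract (null, then the three PolyPhen-2 classes).
P_DAMAGE = {
    "null": 0.61,
    "probably_damaging": 0.17,
    "possibly_damaging": 0.098,
    "probably_benign": 0.045,
}


def prob_any_detectable_damage(mutation_classes):
    """Probability that at least one listed mutation causes detectable damage.

    Assumption (ours, not the paper's): mutations act independently, so the
    chance that none is damaging is the product of the per-mutation
    complements (1 - p).
    """
    return 1.0 - prod(1.0 - P_DAMAGE[c] for c in mutation_classes)


# Hypothetical gene carrying one putative null allele and one
# "probably damaging" missense allele in the screened pedigrees.
coverage = prob_any_detectable_damage(["null", "probably_damaging"])
print(f"{coverage:.4f}")
```

Under these assumptions each additional screened allele raises the estimate, and a gene with no screened alleles scores zero.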
Affiliation(s)
- Tao Wang
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA; Quantitative Biomedical Research Center, Department of Clinical Science, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA; Kidney Cancer Program, Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
- Chun Hui Bu
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Sara Hildebrand
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Gaoxiang Jia
- Quantitative Biomedical Research Center, Department of Clinical Science, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA; Department of Statistical Science, Southern Methodist University, Dallas, TX, 75205, USA
| | - Owen M Siggs
- Immunology Division, Garvan Institute for Medical Research, Sydney, NSW, 2010, Australia
| | - Stephen Lyon
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - David Pratt
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Lindsay Scott
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Jamie Russell
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Sara Ludwig
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Anne R Murray
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Eva Marie Y Moresco
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
| | - Bruce Beutler
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
|
30
|
Eidsaa M, Stubbs L, Almaas E. Comparative analysis of weighted gene co-expression networks in human and mouse. PLoS One 2017; 12:e0187611. [PMID: 29161290 PMCID: PMC5697817 DOI: 10.1371/journal.pone.0187611] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/03/2017] [Accepted: 10/23/2017] [Indexed: 01/21/2023] Open
Abstract
The application of complex network modeling to analyze large co-expression data sets has gained traction during the last decade. In particular, the use of the weighted gene co-expression network analysis framework has allowed an unbiased and systems-level investigation of genotype-phenotype relationships in a wide range of systems. Since mouse is an important model organism for biomedical research on human disease, it is of great interest to identify similarities and differences in the functional roles of human and mouse orthologous genes. Here, we develop a novel network comparison approach which we demonstrate by comparing two gene-expression data sets from a large number of human and mouse tissues. The method uses weighted topological overlap alongside the recently developed network-decomposition method of s-core analysis, which is suitable for making gene-centrality rankings for weighted networks. The aim is to identify globally central genes separately in the human and mouse networks. By comparing the ranked gene lists, we identify genes that display conserved or diverged centrality-characteristics across the networks. This framework only assumes a single threshold value that is chosen from a statistical analysis, and it may be applied to arbitrary network structures and edge-weight distributions, also outside the context of biology. When conducting the comparative network analysis, both within and across the two species, we find a clear pattern of enrichment of transcription factors, for the homeobox domain in particular, among the globally central genes. We also perform gene-ontology term enrichment analysis and look at disease-related genes for the separate networks as well as the network comparisons. We find that gene ontology terms related to regulation and development are generally enriched across the networks. 
In particular, the genes FOXE3, RHO, RUNX2, ALX3 and RARA, which are disease genes in either human or mouse, are on the top-10 list of globally central genes in the human and mouse networks.
Affiliation(s)
- Marius Eidsaa
- Department of Biotechnology, NTNU - Norwegian University of Science and Technology, N-7491 Trondheim, Norway
| | - Lisa Stubbs
- Institute for Genomic Biology, Neuroscience Program, Cell and Developmental Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
| | - Eivind Almaas
- Department of Biotechnology, NTNU - Norwegian University of Science and Technology, N-7491 Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health and General Practice, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
|
31
|
Mobegi FM, Zomer A, de Jonge MI, van Hijum SAFT. Advances and perspectives in computational prediction of microbial gene essentiality. Brief Funct Genomics 2017; 16:70-79. [PMID: 26857942 DOI: 10.1093/bfgp/elv063] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 02/04/2023] Open
Abstract
The minimal subset of genes required for cellular growth, survival and viability of an organism are classified as essential genes. Knowledge of essential genes gives insight into the core structure and functioning of a cell. This might lead to more efficient antimicrobial drug discovery, to elucidation of the correlations between genotype and phenotype, and a better understanding of the minimal requirements for a (synthetic) cell. Traditionally, constructing a catalog of essential genes for a given microbe involved costly and time-consuming laboratory experiments. While experimental methods have produced abundant gene essentiality data for model organisms like Escherichia coli and Bacillus subtilis, the knowledge generated cannot automatically be extrapolated to predict essential genes in all bacteria. In addition, essential genes identified in the laboratory are by definition 'conditionally essential', as they are essential under the specified experimental conditions: these might not resemble conditions in the microorganisms' natural habitat(s). Also, large-scale experimental assaying for essential genes is not always feasible because of the time investment required to setup these assays. The ability to rapidly and precisely identify essential genes in silico is therefore important and has great potential for applications in medicine, biotechnology and basic biological research. Here, we review the advances made in the use of computational methods to predict microbial gene essentiality, perspectives for the future of these techniques and the possible practical applications of essential genes.
Affiliation(s)
- Fredrick M Mobegi
- Laboratory of Pediatric Infectious Diseases and Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, The Netherlands
| | - Aldert Zomer
- Radboud university medical center, Laboratory of Pediatric Infectious Diseases, Nijmegen, The Netherlands
- Radboud university medical center, Bacterial Genomics Group; Center for Molecular and Biomolecular Informatics, Nijmegen, The Netherlands
| | - Marien I de Jonge
- Laboratory of Pediatric Infectious Diseases, Department of Pediatrics, Radboudumc, Nijmegen, The Netherlands
| | - Sacha A F T van Hijum
- Radboud Institute for Molecular Life Sciences, Laboratory of Paediatric Infectious Diseases, Radboud University Medical Centre, Nijmegen, The Netherlands
|
32
|
|
33
|
Pengelly RJ, Vergara-Lope A, Alyousfi D, Jabalameli MR, Collins A. Understanding the disease genome: gene essentiality and the interplay of selection, recombination and mutation. Brief Bioinform 2017; 20:267-273. [DOI: 10.1093/bib/bbx110] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/06/2017] [Indexed: 12/24/2022] Open
Affiliation(s)
- Reuben J Pengelly
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
| | - Alejandra Vergara-Lope
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
| | - Dareen Alyousfi
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
| | - M Reza Jabalameli
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
| | - Andrew Collins
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
|
34
|
Sundberg JP, Dadras SS, Silva KA, Kennedy VE, Garland G, Murray SA, Sundberg BA, Schofield PN, Pratt CH. Systematic screening for skin, hair, and nail abnormalities in a large-scale knockout mouse program. PLoS One 2017; 12:e0180682. [PMID: 28700664 PMCID: PMC5503261 DOI: 10.1371/journal.pone.0180682] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/27/2017] [Accepted: 06/19/2017] [Indexed: 12/19/2022] Open
Abstract
The International Knockout Mouse Consortium was formed in 2007 to inactivate (“knockout”) all protein-coding genes in the mouse genome in embryonic stem cells. Production and characterization of these mice, now underway, has generated and phenotyped 3,100 strains with knockout alleles. Skin and adnexa diseases are best defined at the gross clinical level and by histopathology. Representative retired breeders had skin collected from the back, abdomen, eyelids, muzzle, ears, tail, and lower limbs including the nails. To date, 169 novel mutant lines were reviewed and of these, only one was found to have a relatively minor sebaceous gland abnormality associated with follicular dystrophy. The B6N(Cg)-Far2tm2b(KOMP)Wtsi/2J strain, had lesions affecting sebaceous glands with what appeared to be a secondary follicular dystrophy. A second line, B6N(Cg)-Ppp1r9btm1.1(KOMP)Vlcg/J, had follicular dystrophy limited to many but not all mystacial vibrissae in heterozygous but not homozygous mutant mice, suggesting that this was a nonspecific background lesion. We discuss potential reasons for the low frequency of skin and adnexal phenotypes in mice from this project in comparison to those seen in human Mendelian diseases, and suggest alternative approaches to identification of human disease-relevant models.
Affiliation(s)
- John P. Sundberg
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
| | - Soheil S. Dadras
- Departments of Dermatology and Pathology, University of Connecticut, Farmington, Connecticut, United States of America
| | | | | | - Gaven Garland
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
| | | | - Beth A. Sundberg
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
| | - Paul N. Schofield
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Department of Physiology Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom
| | - C. Herbert Pratt
- The Jackson Laboratory, Bar Harbor, Maine, United States of America
|
35
|
Santiago-Sim T, Burrage LC, Ebstein F, Tokita MJ, Miller M, Bi W, Braxton AA, Rosenfeld JA, Shahrour M, Lehmann A, Cogné B, Küry S, Besnard T, Isidor B, Bézieau S, Hazart I, Nagakura H, Immken LL, Littlejohn RO, Roeder E, Kara B, Hardies K, Weckhuysen S, May P, Lemke JR, Elpeleg O, Abu-Libdeh B, James KN, Silhavy JL, Issa MY, Zaki MS, Gleeson JG, Seavitt JR, Dickinson ME, Ljungberg MC, Wells S, Johnson SJ, Teboul L, Eng CM, Yang Y, Kloetzel PM, Heaney JD, Walkiewicz MA, Afawi Z, Balling R, Barisic N, Baulac S, Craiu D, De Jonghe P, Guerrero-Lopez R, Guerrini R, Helbig I, Hjalgrim H, Jähn J, Klein KM, Leguern E, Lerche H, Marini C, Muhle H, Rosenow F, Serratosa J, Sterbová K, Suls A, Moller RS, Striano P, Weber Y, Zara F. Biallelic Variants in OTUD6B Cause an Intellectual Disability Syndrome Associated with Seizures and Dysmorphic Features. Am J Hum Genet 2017; 100:676-688. [PMID: 28343629 DOI: 10.1016/j.ajhg.2017.03.001] [Citation(s) in RCA: 45] [Impact Index Per Article: 6.4] [Received: 11/23/2016] [Accepted: 02/21/2017] [Indexed: 10/19/2022] Open
Abstract
Ubiquitination is a posttranslational modification that regulates many cellular processes including protein degradation, intracellular trafficking, cell signaling, and protein-protein interactions. Deubiquitinating enzymes (DUBs), which reverse the process of ubiquitination, are important regulators of the ubiquitin system. OTUD6B encodes a member of the ovarian tumor domain (OTU)-containing subfamily of deubiquitinating enzymes. Herein, we report biallelic pathogenic variants in OTUD6B in 12 individuals from 6 independent families with an intellectual disability syndrome associated with seizures and dysmorphic features. In subjects with predicted loss-of-function alleles, additional features include global developmental delay, microcephaly, absent speech, hypotonia, growth retardation with prenatal onset, feeding difficulties, structural brain abnormalities, congenital malformations including congenital heart disease, and musculoskeletal features. Homozygous Otud6b knockout mice were subviable, smaller in size, and had congenital heart defects, consistent with the severity of loss-of-function variants in humans. Analysis of peripheral blood mononuclear cells from an affected subject showed reduced incorporation of 19S subunits into 26S proteasomes, decreased chymotrypsin-like activity, and accumulation of ubiquitin-protein conjugates. Our findings suggest a role for OTUD6B in proteasome function, establish that defective OTUD6B function underlies a multisystemic human disorder, and provide additional evidence for the emerging relationship between the ubiquitin system and human disease.
|
36
|
Effects of different kinds of essentiality on sequence evolution of human testis proteins. Sci Rep 2017; 7:43534. [PMID: 28272493 PMCID: PMC5341092 DOI: 10.1038/srep43534] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/25/2016] [Accepted: 01/25/2017] [Indexed: 11/17/2022] Open
Abstract
We asked if essentiality for either fertility or viability differentially affects sequence evolution of human testis proteins. Based on murine knockout data, we classified a set of 965 proteins expressed in human seminiferous tubules into three categories: proteins essential for prepubertal survival (“lethality proteins”), associated with male sub- or infertility (“male sub-/infertility proteins”), and nonessential proteins. In our testis protein dataset, lethality genes evolved significantly slower than nonessential and male sub-/infertility genes, which is in line with other authors’ findings. Using tissue specificity, connectivity in the protein-protein interaction (PPI) network, and multifunctionality as proxies for evolutionary constraints, we found that of the three categories, proteins linked to male sub- or infertility are least constrained. Lethality proteins, on the other hand, are characterized by broad expression, many PPI partners, and high multifunctionality, all of which points to strong evolutionary constraints. We conclude that compared with lethality proteins, those linked to male sub- or infertility are nonetheless indispensable, but evolve under more relaxed constraints. Finally, adaptive evolution in response to postmating sexual selection could further accelerate evolutionary rates of male sub- or infertility proteins expressed in human testis. These findings may become useful for in silico detection of human sub-/infertility genes.
|
37
|
Increased burden of deleterious variants in essential genes in autism spectrum disorder. Proc Natl Acad Sci U S A 2016; 113:15054-15059. [PMID: 27956632 DOI: 10.1073/pnas.1613195113] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 12/21/2022] Open
Abstract
Autism spectrum disorder (ASD) is a heterogeneous, highly heritable neurodevelopmental syndrome characterized by impaired social interaction, communication, and repetitive behavior. It is estimated that hundreds of genes contribute to ASD. We asked if genes with a strong effect on survival and fitness contribute to ASD risk. Human orthologs of genes with an essential role in pre- and postnatal development in the mouse [essential genes (EGs)] are enriched for disease genes and under strong purifying selection relative to human orthologs of mouse genes with a known nonlethal phenotype [nonessential genes (NEGs)]. This intolerance to deleterious mutations, commonly observed haploinsufficiency, and the importance of EGs in development suggest a possible cumulative effect of deleterious variants in EGs on complex neurodevelopmental disorders. With a comprehensive catalog of 3,915 mammalian EGs, we provide compelling evidence for a stronger contribution of EGs to ASD risk compared with NEGs. By examining the exonic de novo and inherited variants from 1,781 ASD quartet families, we show a significantly higher burden of damaging mutations in EGs in ASD probands compared with their non-ASD siblings. The analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD. Finally, we suggest a high-priority list of 29 EGs with potential ASD risk as targets for future functional and behavioral studies. Overall, we show that large-scale studies of gene function in model organisms provide a powerful approach for prioritization of genes and pathogenic variants identified by sequencing studies of human disease.
|
38
|
Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger WJ, Westerberg H, Adissu H, Baker CN, Bower L, Brown JM, Caddle LB, Chiani F, Clary D, Cleak J, Daly MJ, Denegre JM, Doe B, Dolan ME, Edie SM, Fuchs H, Gailus-Durner V, Galli A, Gambadoro A, Gallegos J, Guo S, Horner NR, Hsu CW, Johnson SJ, Kalaga S, Keith LC, Lanoue L, Lawson TN, Lek M, Mark M, Marschall S, Mason J, McElwee ML, Newbigging S, Nutter LM, Peterson KA, Ramirez-Solis R, Rowland DJ, Ryder E, Samocha KE, Seavitt JR, Selloum M, Szoke-Kovacs Z, Tamura M, Trainor AG, Tudose I, Wakana S, Warren J, Wendling O, West DB, Wong L, Yoshiki A, MacArthur DG, Tocchini-Valentini GP, Gao X, Flicek P, Bradley A, Skarnes WC, Justice MJ, Parkinson HE, Moore M, Wells S, Braun RE, Svenson KL, de Angelis MH, Herault Y, Mohun T, Mallon AM, Henkelman RM, Brown SD, Adams DJ, Lloyd KK, McKerlie C, Beaudet AL, Bucan M, Murray SA. High-throughput discovery of novel developmental phenotypes. Nature 2016; 537:508-514. [PMID: 27626380 PMCID: PMC5295821 DOI: 10.1038/nature19356] [Citation(s) in RCA: 796] [Impact Index Per Article: 99.5] [Received: 11/12/2015] [Accepted: 08/10/2016] [Indexed: 12/29/2022]
Abstract
Approximately one-third of all mammalian genes are essential for life. Phenotypes resulting from knockouts of these genes in mice have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5,000 knockout mouse lines, here we identify 410 lethal genes during the production of the first 1,751 unique gene knockouts. Using a standardized phenotyping platform that incorporates high-resolution 3D imaging, we identify phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes, thus providing a dataset that facilitates the prioritization and validation of mutations identified in clinical sequencing efforts.
Affiliation(s)
- Mary E. Dickinson
- Department of Molecular Physiology and Biophysics, Houston, Texas, USA
| | - Ann M. Flenniken
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Xiao Ji
- Genomics and Computational Biology Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA 19104
| | - Lydia Teboul
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Michael D. Wong
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- Mouse Imaging Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Jacqueline K. White
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Terrence F. Meehan
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Wolfgang J. Weninger
- Centre for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
| | - Henrik Westerberg
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Hibret Adissu
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- The Hospital for Sick Children, Toronto, Ontario, Canada
| | | | - Lynette Bower
- Mouse Biology Program, University of California, Davis
| | - James M. Brown
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | | | - Francesco Chiani
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
| | - Dave Clary
- Mouse Biology Program, University of California, Davis
| | - James Cleak
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Mark J. Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
- Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
| | | | - Brendan Doe
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | | | - Helmut Fuchs
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
| | - Valerie Gailus-Durner
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
| | - Antonella Galli
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Alessia Gambadoro
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
| | - Juan Gallegos
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
| | - Shiying Guo
- SKL of Pharmaceutical Biotechnology and Model Animal Research Center, Collaborative Innovation Center for Genetics and Development, Nanjing Biomedical Research Institute, Nanjing University, China
| | - Neil R. Horner
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Chih-wei Hsu
- Department of Molecular Physiology and Biophysics, Houston, Texas, USA
| | - Sara J. Johnson
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Sowmya Kalaga
- Department of Molecular Physiology and Biophysics, Houston, Texas, USA
| | - Lance C. Keith
- Department of Molecular Physiology and Biophysics, Houston, Texas, USA
| | - Louise Lanoue
- Mouse Biology Program, University of California, Davis
| | - Thomas N. Lawson
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - Monkol Lek
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
- Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
| | - Manuel Mark
- Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
| | - Susan Marschall
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
| | - Jeremy Mason
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Susan Newbigging
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Lauryl M.J. Nutter
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- The Hospital for Sick Children, Toronto, Ontario, Canada
| | | | - Ramiro Ramirez-Solis
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Edward Ryder
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Kaitlin E. Samocha
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
- Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
| | - John R. Seavitt
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
| | - Mohammed Selloum
- Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
| | - Zsombor Szoke-Kovacs
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | | | | | - Ilinca Tudose
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Jonathan Warren
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Olivia Wendling
- Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
| | - David B. West
- Children’s Hospital Oakland Research Institute, Oakland, CA 94609
| | - Leeyean Wong
- Department of Molecular Physiology and Biophysics, Houston, Texas, USA
| | | | | | - Daniel G. MacArthur
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
- Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
| | - Glauco P. Tocchini-Valentini
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
| | - Xiang Gao
- SKL of Pharmaceutical Biotechnology and Model Animal Research Center, Collaborative Innovation Center for Genetics and Development, Nanjing Biomedical Research Institute, Nanjing University, China
| | - Paul Flicek
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Allan Bradley
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - William C. Skarnes
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | - Monica J. Justice
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
- The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Helen E. Parkinson
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Sara Wells
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | | | | | - Martin Hrabe de Angelis
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
- Chair of Experimental Genetics, School of Life Science Weihenstephan, Technische Universität München, Freising
- German Center for Diabetes Research (DZD), Neuherberg, Germany
| | - Yann Herault
- Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
| | - Tim Mohun
- The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, UK
| | - Ann-Marie Mallon
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - R. Mark Henkelman
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- Mouse Imaging Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Steve D.M. Brown
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
| | - David J. Adams
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
| | | | - Colin McKerlie
- The Centre for Phenogenomics, Toronto, Ontario, Canada
- The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Arthur L. Beaudet
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
| | - Maja Bucan
- Departments of Genetics and Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA 19104
|
39
|
An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms. Biomed Res Int 2016; 2016:7639397. [PMID: 27660763 PMCID: PMC5021884 DOI: 10.1155/2016/7639397] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/30/2016] [Revised: 07/25/2016] [Accepted: 08/04/2016] [Indexed: 11/17/2022]
Abstract
Investigation of essential genes is significant to comprehend the minimal gene sets of cell and discover potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning method was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded the highest area under receiver operating characteristic (ROC) curve (AUC) of 0.9716 through tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency of predictions and known essential genes of target species. The highest AUC of 0.9552 and average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of predictions is 0.7855, which is higher than other methods. This research presents that features obtained by homology mapping uniquely can achieve quite great or even better results than those integrated features. Meanwhile, the work indicates that machine learning-based method can assign more efficient weight coefficients than using empirical formula based on biological knowledge.
40
Zhang X, Xiao W, Acencio ML, Lemke N, Wang X. An ensemble framework for identifying essential proteins. BMC Bioinformatics 2016; 17:322. [PMID: 27557880 PMCID: PMC4997703 DOI: 10.1186/s12859-016-1166-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/03/2016] [Accepted: 08/09/2016] [Indexed: 11/10/2022]
Abstract
Background: Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small. Results: In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins. Conclusions: This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.
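To make the centrality measures named in this abstract concrete, the sketch below computes two of them, degree centrality and closeness centrality, on a tiny undirected network and ranks proteins by degree. This is only an illustration of the basic measures, not the ensemble framework the authors propose. The protein names, the edge list, and the helper function names are all invented for the example.

```python
from collections import deque

# Hypothetical toy interaction network; protein names are placeholders.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]


def build_adjacency(edge_list):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(adj)
    return {node: (len(nbrs) / (n - 1) if n > 1 else 0.0)
            for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """(reachable nodes - 1) / sum of shortest-path distances, via BFS.
    On a connected graph this is the standard closeness centrality."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


adj = build_adjacency(edges)
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)

# Rank proteins by degree (ties broken alphabetically); the top-ranked
# proteins would be the candidates flagged as potentially essential.
ranked = sorted(adj, key=lambda p: (-degree[p], p))
print(ranked)
```

In this toy network the hub `A` ranks first under both measures. The abstract's point is that weighting interactions differently, for example with gene expression data, can change such rankings and recover low-degree essential proteins that plain degree-based ranking misses.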
Affiliation(s)
- Xue Zhang
- Systems Biology Core, NHLBI, NIH, 9000 Rockville Pike, Bethesda, MD, 20892, USA
- Wangxin Xiao
- Department of Computer Science, XiangNan University, Eastern Wangxian Park, Chenzhou, Hunan, 423000, China
- Marcio Luis Acencio
- Department of Physics and Biophysics, Institute of Biosciences of Botucatu, UNESP-São Paulo State University, CEP 18618-970, Botucatu, São Paulo, 510, Brazil; Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology (NTNU), P.B. 8905, N-7491, Trondheim, Norway
- Ney Lemke
- Department of Physics and Biophysics, Institute of Biosciences of Botucatu, UNESP-São Paulo State University, CEP 18618-970, Botucatu, São Paulo, 510, Brazil
- Xujing Wang
- Systems Biology Core, NHLBI, NIH, 9000 Rockville Pike, Bethesda, MD, 20892, USA
41
Tronick E, Hunter RG. Waddington, Dynamic Systems, and Epigenetics. Front Behav Neurosci 2016; 10:107. [PMID: 27375447 PMCID: PMC4901045 DOI: 10.3389/fnbeh.2016.00107] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 03/19/2016] [Accepted: 05/18/2016] [Indexed: 11/13/2022]
Abstract
Waddington coined the term “epigenetic” to attempt to explain the complex, dynamic interactions between the developmental environment and the genome that led to the production of phenotype. Waddington's thoughts on the importance of both adaptability and canalization of phenotypic development are worth recalling as well, as they emphasize the available range for epigenetic action and the importance of environmental feedback (or lack thereof) in the development of complex traits. We suggest that a dynamic systems view fits well with Waddington's conception of epigenetics in the developmental context, as well as shedding light on the study of the molecular epigenetic effects of the environment on brain and behavior. Further, the dynamic systems view emphasizes the importance of the multi-directional interchange between the organism, the genome and various aspects of the environment to the ultimate phenotype.
Affiliation(s)
- Ed Tronick
- Developmental and Brain Sciences, University of Massachusetts Boston, Psychology Boston, MA, USA
- Richard G Hunter
- Developmental and Brain Sciences, University of Massachusetts Boston, Psychology Boston, MA, USA
42
Wang P, Chen Y, Lü J, Wang Q, Yu X. Graphical Features of Functional Genes in Human Protein Interaction Network. IEEE Trans Biomed Circuits Syst 2016; 10:707-20. [PMID: 26841412 DOI: 10.1109/tbcas.2015.2487299] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 05/24/2023]
Abstract
With the completion of the human genome project, it is feasible to investigate the large-scale human protein interaction network (HPIN) with complex networks theory. Proteins are encoded by genes. Essential, viable, disease, conserved, housekeeping (HK) and tissue-enriched (TE) genes are functional genes, which are organized and function via interaction networks. Based on up-to-date data from various databases or literature, two large-scale HPINs and six subnetworks are constructed. We illustrate that the HPINs and most of the subnetworks are sparse, small-world, scale-free and disassortative, with hierarchical modularity. Among the six subnetworks, the essential, disease and HK subnetworks are more densely connected than the others. Statistical analysis of the topological structures of the HPIN reveals that the lethal, the conserved, the HK and the TE genes have hallmark graphical features. Receiver operating characteristic (ROC) curves indicate that the essential genes can be distinguished from the viable ones with accuracy as high as almost 70%. Closeness, semi-local and eigenvector centralities can distinguish the HK genes from the TE ones with accuracy around 82%. Furthermore, the Venn diagram, cluster dendrograms and classifications of disease genes reveal that some classes of disease genes have hallmark graphical features, especially cancer genes, HK disease genes and TE disease genes. The findings facilitate the identification of some functional genes via topological structures. The investigations shed some light on the characteristics of the complete interactome, which have potential implications in networked medicine and biological network control.
43
Banerjee S, Chakraborty S, De RK. Deciphering the cause of evolutionary variance within intrinsically disordered regions in human proteins. J Biomol Struct Dyn 2016; 35:233-249. [PMID: 26790343 DOI: 10.1080/07391102.2016.1143877] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023]
Abstract
Why intrinsically disordered regions evolve within the human proteome has been an interesting question for a decade. To date, it remains an unsolved yet intriguing issue why some disordered regions evolve rapidly while the rest are highly conserved across mammalian species. Identifying the key biological factors responsible for the variation in the conservation rate of different disordered regions within the human proteome may help to address this issue. We found that, among the biological features considered in our study (multifunctionality, gene essentiality, protein connectivity, number of unique domains, gene expression level and expression breadth), the number of unique protein domains acts as a strong determinant that negatively influences the conservation of disordered regions. In this context, we argue that proteins having fewer types of domains preferentially need to conserve their disordered regions to enhance their structural flexibility, which in turn facilitates their molecular interactions. In contrast, the selection pressure acting on stretches of disordered regions is not so strong in the case of multi-domain proteins. Therefore, we reasoned that the presence of conserved disordered stretches may compensate for the functions of multiple domains within a single-domain protein. Interestingly, we noticed that the unique domain number and expression level influence the evolution of disordered regions differently from that of well-structured ones.
Affiliation(s)
- Sanghita Banerjee
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
- Rajat K De
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India
44
Local Action with Global Impact: Highly Similar Infection Patterns of Human Viruses and Bacteriophages. mSystems 2016; 1:mSystems00030-15. [PMID: 27822522 PMCID: PMC5069743 DOI: 10.1128/msystems.00030-15] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 12/14/2015] [Accepted: 02/16/2016] [Indexed: 11/20/2022]
Abstract
The investigation of host-pathogen interaction interfaces and their constituent factors is crucial for our understanding of an organism's pathogenesis. Here, we explored the interactomes of HIV, hepatitis C virus, influenza A virus, human papillomavirus, herpes simplex virus, and vaccinia virus in a human host by analyzing the combined sets of virus targets and human genes that are required for viral infection. We also considered targets and required genes of bacteriophages lambda and T7 infection in Escherichia coli. We found that targeted proteins and their immediate network neighbors significantly pool with proteins required for infection and essential for cell growth, forming large connected components in both the human and E. coli protein interaction networks. The impact of both viruses and phages on their protein targets appears to extend to their network neighbors, as these are enriched with topologically central proteins that have a significant disruptive topological effect and connect different protein complexes. Moreover, viral and phage targets and network neighbors are enriched with transcription factors, methylases, and acetylases in human viruses, while such interactions are much less prominent in bacteriophages. IMPORTANCE While host-virus interaction interfaces have been previously investigated, relatively little is known about the indirect interactions of pathogen and host proteins required for viral infection and host cell function. Therefore, we investigated the topological relationships of human and bacterial viruses and how they interact with their hosts. We focused on those host proteins that are directly targeted by viruses, those that are required for infection, and those that are essential for both human and bacterial cells (here, E. coli). Generally, we observed that targeted, required, and essential proteins in both hosts interact in a highly intertwined fashion. 
While there exist highly similar topological patterns, we found that human viruses target transcription factors through methylases and acetylases, proteins that played no such role in bacteriophages.
45
Zhan T, Boutros M. Towards a compendium of essential genes - From model organisms to synthetic lethality in cancer cells. Crit Rev Biochem Mol Biol 2015; 51:74-85. [PMID: 26627871 PMCID: PMC4819810 DOI: 10.3109/10409238.2015.1117053] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 01/08/2023]
Abstract
Essential genes are defined by their requirement to sustain life in cells or whole organisms. The systematic identification of essential gene sets not only allows insights into the fundamental building blocks of life, but may also provide novel therapeutic targets in oncology. The discovery of essential genes has been tightly linked to the development and deployment of various screening technologies. Here, we describe how gene essentiality was addressed in different eukaryotic model organisms, covering a range of organisms from yeast to mouse. We describe how increasing knowledge of evolutionarily divergent genomes facilitates identification of gene essentiality across species. Finally, the impact of gene essentiality and synthetic lethality on cancer research and the clinical translation of screening results are highlighted.
Affiliation(s)
- Tianzuo Zhan
- Department of Cell and Molecular Biology, Division of Signaling and Functional Genomics, Medical Faculty Mannheim, German Cancer Research Center (DKFZ), Heidelberg University, Heidelberg, Germany; Department of Medicine II, Medical Faculty Mannheim, University Hospital Mannheim, Heidelberg University, Mannheim, Germany
- Michael Boutros
- Department of Cell and Molecular Biology, Division of Signaling and Functional Genomics, Medical Faculty Mannheim, German Cancer Research Center (DKFZ), Heidelberg University, Heidelberg, Germany
46
Profiling RNA editing in human tissues: towards the inosinome Atlas. Sci Rep 2015; 5:14941. [PMID: 26449202 PMCID: PMC4598827 DOI: 10.1038/srep14941] [Citation(s) in RCA: 163] [Impact Index Per Article: 18.1] [Received: 06/18/2015] [Accepted: 09/09/2015] [Indexed: 12/26/2022]
Abstract
Adenine to Inosine RNA editing is a widespread co- and post-transcriptional mechanism mediated by ADAR enzymes acting on double stranded RNA. It has a plethora of biological effects, appears to be particularly pervasive in humans with respect to other mammals, and is implicated in a number of diverse human pathologies. Here we present the first human inosinome atlas comprising 3,041,422 A-to-I events identified in six tissues from three healthy individuals. Matched directional total-RNA-Seq and whole genome sequence datasets were generated and analysed within a dedicated computational framework, also capable of detecting hyper-edited reads. Inosinome profiles are tissue specific and edited gene sets consistently show enrichment of genes involved in neurological disorders and cancer. Overall frequency of editing also varies, but is strongly correlated with ADAR expression levels. The inosinome database is available at: http://srv00.ibbe.cnr.it/editing/.
47
Marschall ALJ, Dübel S, Böldicke T. Specific in vivo knockdown of protein function by intrabodies. MAbs 2015; 7:1010-35. [PMID: 26252565 PMCID: PMC4966517 DOI: 10.1080/19420862.2015.1076601] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Received: 06/24/2015] [Revised: 07/19/2015] [Accepted: 07/20/2015] [Indexed: 01/02/2023]
Abstract
Intracellular antibodies (intrabodies) are recombinant antibody fragments that bind to target proteins expressed inside of the same living cell producing the antibodies. The molecules are commonly used to study the function of the target proteins (i.e., their antigens). The intrabody technology is an attractive alternative to the generation of gene-targeted knockout animals, and complements knockdown techniques such as RNAi, miRNA and small molecule inhibitors, by-passing various limitations and disadvantages of these methods. The advantages of intrabodies include very high specificity for the target, the possibility to knock down several protein isoforms by one intrabody and targeting of specific splice variants or even post-translational modifications. Different types of intrabodies must be designed to target proteins at different locations, typically either in the cytoplasm, in the nucleus or in the endoplasmic reticulum (ER). Most straightforward is the use of intrabodies retained in the ER (ER intrabodies) to knock down the function of proteins passing the ER, which disturbs the function of members of the membrane or plasma proteomes. More effort is needed to functionally knock down cytoplasmic or nuclear proteins because in this case antibodies need to provide an inhibitory effect and must be able to fold in the reducing milieu of the cytoplasm. In this review, we present a broad overview of intrabody technology, as well as applications both of ER and cytoplasmic intrabodies, which have yielded valuable insights in the biology of many targets relevant for drug development, including α-synuclein, TAU, BCR-ABL, ErbB-2, EGFR, HIV gp120, CCR5, IL-2, IL-6, β-amyloid protein and p75NTR. Strategies for the generation of intrabodies and various designs of their applications are also reviewed.
Affiliation(s)
- Andrea LJ Marschall
- Technische Universität Braunschweig, Institute of Biochemistry, Biotechnology and Bioinformatics; Braunschweig, Germany
- Stefan Dübel
- Technische Universität Braunschweig, Institute of Biochemistry, Biotechnology and Bioinformatics; Braunschweig, Germany
- Thomas Böldicke
- Helmholtz Centre for Infection Research, Recombinant Protein Expression/Intrabody Unit, Helmholtz Centre for Infection Research; Braunschweig, Germany
48
Gut P, Zweckstetter M, Banati RB. Lost in translocation: the functions of the 18-kD translocator protein. Trends Endocrinol Metab 2015; 26:349-56. [PMID: 26026242 PMCID: PMC5654500 DOI: 10.1016/j.tem.2015.04.001] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.8] [Received: 02/06/2015] [Revised: 03/31/2015] [Accepted: 04/21/2015] [Indexed: 01/29/2023]
Abstract
Research spanning nearly four decades has assigned to the translocator protein (18 kDa) (TSPO) a critical role, among others, in the mitochondrial import of cholesterol, the subsequent steps of (neuro)steroid production, and systemic endocrine regulation, with implications for the pathophysiology of immune, inflammatory, neurodegenerative, and psychiatric as well as neoplastic diseases. Recent knockout studies in mice unexpectedly report normal or latent phenotypes, raising doubts about the protein's role in steroidogenesis and other previously postulated functions and challenging the validity of earlier data on the selectivity of TSPO-binding drugs. Here we provide a synthesis of the current debate from a structural and molecular biology perspective, discuss the limits of inference in loss-of-function (gene knockout) studies, and suggest new functions of TSPO.
Affiliation(s)
- Philipp Gut
- Nestlé Institute of Health Sciences, EPFL Innovation Park, Bâtiment H, 1015 Lausanne, Switzerland
- Markus Zweckstetter
- Max-Planck-Institut für Biophysikalische Chemie, 37077 Göttingen, Germany; Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), 37077 Göttingen, Germany; Center for Nanoscale Microscopy and Molecular Physiology of the Brain, University Medical Center, 37073 Göttingen, Germany
- Richard B Banati
- Life Sciences, Australian Nuclear Science and Technology Organisation, Lucas Heights, NSW 2234, Australia; National Imaging Facility and Ramaciotti Centre for Brain Imaging, Brain and Mind Research Institute, Faculty of Health Sciences, University of Sydney, Sydney, NSW 2006, Australia
49
Li W, Freudenberg J, Oswald M. Principles for the organization of gene-sets. Comput Biol Chem 2015; 59 Pt B:139-49. [PMID: 26188561 DOI: 10.1016/j.compbiolchem.2015.04.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 02/24/2015] [Accepted: 04/08/2015] [Indexed: 12/23/2022]
Abstract
A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology in obtaining the gene-sets or a balanced number of gene-sets under a header). Here we aim at providing a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared label connection denotes shared membership in a group. Some extensions of the framework are also addressed such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs.
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
- Jan Freudenberg
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
- Michaela Oswald
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
50
Decoding the complex genetic causes of heart diseases using systems biology. Biophys Rev 2015; 7:141-159. [PMID: 28509974 DOI: 10.1007/s12551-014-0145-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2014] [Accepted: 11/10/2014] [Indexed: 10/24/2022]
Abstract
The pace of disease gene discovery is still much slower than expected, even with the use of cost-effective DNA sequencing and genotyping technologies. It is increasingly clear that many inherited heart diseases have a more complex polygenic aetiology than previously thought. Understanding the role of gene-gene interactions, epigenetics, and non-coding regulatory regions is becoming increasingly critical in predicting the functional consequences of genetic mutations identified by genome-wide association studies and whole-genome or exome sequencing. A systems biology approach is now being widely employed to systematically discover genes that are involved in heart diseases in humans or relevant animal models through bioinformatics. The overarching premise is that the integration of high-quality causal gene regulatory networks (GRNs), genomics, epigenomics, transcriptomics and other genome-wide data will greatly accelerate the discovery of the complex genetic causes of congenital and complex heart diseases. This review summarises state-of-the-art genomic and bioinformatics techniques that are used in accelerating the pace of disease gene discovery in heart diseases. Accompanying this review, we provide an interactive web-resource for systems biology analysis of mammalian heart development and diseases, CardiacCode ( http://CardiacCode.victorchang.edu.au/ ). CardiacCode features a dataset of over 700 pieces of manually curated genetic or molecular perturbation data, which enables the inference of a cardiac-specific GRN of 280 regulatory relationships between 33 regulator genes and 129 target genes. We believe this growing resource will fill an urgent unmet need to fully realise the true potential of predictive and personalised genomic medicine in tackling human heart disease.