1. García-Vaquero ML, Gama-Carvalho M, Pinto FR, De Las Rivas J. Biological interacting units identified in human protein networks reveal tissue-functional diversification and its impact on disease. Comput Struct Biotechnol J 2022; 20:3764-3778. [PMID: 35891788 PMCID: PMC9304429 DOI: 10.1016/j.csbj.2022.07.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/10/2022] [Revised: 07/04/2022] [Accepted: 07/04/2022] [Indexed: 12/29/2022]
Abstract
Biological processes are exerted by groups of physically interacting proteins. Proteins display variable biological roles depending on tissue-interactomic context. Tissue-specific protein-protein interaction networks reveal functional diversification. Most disease associated genes/proteins display tissue-specific phenotypes. Protein interaction network analysis is a valuable resource to identify disease genes.
Protein-protein interactions (PPI) play an essential role in the biological processes that occur in the cell. Therefore, the dissection of PPI networks becomes decisive to model functional coordination and predict pathological de-regulation. Cellular networks are dynamic and proteins display varying roles depending on the tissue-interactomic context. Thus, the use of centrality measures in individual proteins falls short to dissect the functional properties of the cell. For this reason, there is a need for more comprehensive, relational, and context-specific ways to analyze the multiple actions of proteins in different cells and identify specific functional assemblies within global biomolecular networks. Under this framework, we define Biological Interacting units (BioInt-U) as groups of proteins that interact physically and are enriched in a common Gene Ontology. A search strategy was applied on 33 tissue-specific (TS) PPI networks to generate BioInt libraries associated with each particular human tissue. The cross-tissue comparison showed that housekeeping assemblies incorporate different proteins and exhibit distinct network properties depending on the tissue. Furthermore, disease genes (DGs) of tissue-associated pathologies preferentially accumulate in units in the expected tissues, which in turn were more central in the TS networks. Overall, the study reveals a tissue-specific functional diversification based on the identification of specific protein units and suggests vulnerabilities specific to each tissue network, which can be applied to refine protein-disease association methods.
Key Words
- BiU, BioInt unit
- Biological function
- CO, CORUM complex
- DEg, Differentially expressed gene
- DG, Disease gene
- Disease gene
- GO-BP, Gene Ontology biological process
- HK, Housekeeping
- Housekeeping gene
- PPI network
- PPI, Protein-protein interaction
- Protein module
- SS, Simpson's similarity
- TE, Tissue enriched
- TS, Tissue-specific
- Tissue-specific gene
- UB, Ubiquitous
Affiliation(s)
- Marina L García-Vaquero
  - University of Lisboa, Faculty of Sciences, BioISI - Biosystems & Integrative Sciences Institute, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal
  - Cancer Research Center (CiC-IBMCC, CSIC/USAL and IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL) and Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca 37007, Spain
- Margarida Gama-Carvalho
  - University of Lisboa, Faculty of Sciences, BioISI - Biosystems & Integrative Sciences Institute, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal
- Francisco R Pinto
  - University of Lisboa, Faculty of Sciences, BioISI - Biosystems & Integrative Sciences Institute, Campo Grande, C8 bdg, Lisboa 1749-016, Portugal
- Javier De Las Rivas
  - Cancer Research Center (CiC-IBMCC, CSIC/USAL and IBSAL), Consejo Superior de Investigaciones Científicas (CSIC), University of Salamanca (USAL) and Instituto de Investigación Biomédica de Salamanca (IBSAL), Salamanca 37007, Spain
2. Leipart V, Ludvigsen J, Kent M, Sandve S, To TH, Árnyasi M, Kreibich CD, Dahle B, Amdam GV. Identification of 121 variants of honey bee Vitellogenin protein sequences with structural differences at functional sites. Protein Sci 2022; 31:e4369. [PMID: 35762708 PMCID: PMC9207902 DOI: 10.1002/pro.4369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 05/21/2022] [Indexed: 12/04/2022]
Abstract
Proteins are under selection to maintain central functions and to accommodate needs that arise in ever‐changing environments. The positive selection and neutral drift that preserve functions result in a diversity of protein variants. The amount of diversity differs between proteins: multifunctional or disease‐related proteins tend to have fewer variants than proteins involved in some aspects of immunity. Our work focuses on the extensively studied protein Vitellogenin (Vg), which in honey bees (Apis mellifera) is multifunctional and highly expressed and plays roles in immunity. Yet, almost nothing is known about the natural variation in the coding sequences of this protein or how amino acid‐altering variants might impact structure–function relationships. Here, we map out allelic variation in honey bee Vg using biological samples from 15 countries. The successful barcoded amplicon Nanopore sequencing of 543 bees revealed 121 protein variants, indicating a high level of diversity in Vg. We find that the distribution of non‐synonymous single nucleotide polymorphisms (nsSNPs) differs between protein regions with different functions; domains involved in DNA and protein–protein interactions contain fewer nsSNPs than the protein's lipid binding cavities. We outline how the central functions of the protein can be maintained in different variants and how the variation pattern may inform about selection from pathogens and nutrition.
Affiliation(s)
- Vilde Leipart
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
- Jane Ludvigsen
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
  - Fürst Medisinsk Laboratorium, Oslo, Norway
- Matthew Kent
  - Department of Animal and Aquacultural Sciences, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, Ås, Norway
- Simen Sandve
  - Department of Animal and Aquacultural Sciences, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, Ås, Norway
- Thu-Hien To
  - Department of Animal and Aquacultural Sciences, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, Ås, Norway
- Mariann Árnyasi
  - Department of Animal and Aquacultural Sciences, Centre for Integrative Genetics (CIGENE), Norwegian University of Life Sciences, Ås, Norway
- Claus D Kreibich
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
- Bjørn Dahle
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
  - Norwegian Beekeepers Association, Kløfta, Norway
- Gro V Amdam
  - Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
3. Deciphering the intrinsic properties of fungal proteases in optimizing phytopathogenic interaction. Gene 2019; 711:143934. [PMID: 31228540 DOI: 10.1016/j.gene.2019.06.024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/18/2019] [Revised: 06/05/2019] [Accepted: 06/11/2019] [Indexed: 11/23/2022]
Abstract
Phytopathogenic fungi secrete a wide range of enzymes to penetrate and colonize host tissues. Of them protease activity is reported to increase disease aggressiveness in the plant. With the aim to explore the reason of the higher infection potential of proteases, we have compared several genomic and proteomic attributes among different hydrolytic enzymes coded by five pathogenic fungal species which are the potent infectious agents of plant. Categorizing the enzymes into four major groups, namely protease, lipase, amylase and cell-wall degraders, we observed that proteases are evolutionary more conserved, have higher expression levels, contain more hydrophobic buried residues, short linear motifs and post-translational modified (PTM) sites than the other three groups of enzymes. Again, comparing these features of protease between pathogenic and non-pathogenic Aspergillus sps, we have hypothesized that protein structural properties could play significant roles in imposing infection potency to the fungal proteases.
4. Aguilar-Rodríguez J, Wagner A. Metabolic Determinants of Enzyme Evolution in a Genome-Scale Bacterial Metabolic Network. Genome Biol Evol 2018; 10:3076-3088. [PMID: 30351420 PMCID: PMC6257574 DOI: 10.1093/gbe/evy234] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Accepted: 10/22/2018] [Indexed: 11/12/2022]
Abstract
Different genes and proteins evolve at very different rates. To identify the factors that explain these differences is an important aspect of research in molecular evolution. One such factor is the role a protein plays in a large molecular network. Here, we analyze the evolutionary rates of enzyme-coding genes in the genome-scale metabolic network of Escherichia coli to find the evolutionary constraints imposed by the structure and function of this complex metabolic system. Central and highly connected enzymes appear to evolve more slowly than less connected enzymes, but we find that they do so as a by-product of their high abundance, and not because of their position in the metabolic network. In contrast, enzymes catalyzing reactions with high metabolic flux (high substrate to product conversion rates) evolve slowly even after we account for their abundance. Moreover, enzymes catalyzing reactions that are difficult to by-pass through alternative pathways, such that they are essential in many different genetic backgrounds, also evolve more slowly. Our analyses show that an enzyme's role in the function of a metabolic network affects its evolution more than its place in the network's structure. They highlight the value of a system-level perspective for studies of molecular evolution.
Affiliation(s)
- José Aguilar-Rodríguez
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Biology, Stanford University, Stanford, CA and Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA
- Andreas Wagner
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - The Santa Fe Institute, Santa Fe, New Mexico
5. Alvarez-Ponce D, Feyertag F, Chakraborty S. Position Matters: Network Centrality Considerably Impacts Rates of Protein Evolution in the Human Protein-Protein Interaction Network. Genome Biol Evol 2018; 9:1742-1756. [PMID: 28854629 PMCID: PMC5570066 DOI: 10.1093/gbe/evx117] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Accepted: 07/01/2017] [Indexed: 02/06/2023]
Abstract
The proteins of any organism evolve at disparate rates. A long list of factors affecting rates of protein evolution have been identified. However, the relative importance of each factor in determining rates of protein evolution remains unresolved. The prevailing view is that evolutionary rates are dominantly determined by gene expression, and that other factors such as network centrality have only a marginal effect, if any. However, this view is largely based on analyses in yeasts, and accurately measuring the importance of the determinants of rates of protein evolution is complicated by the fact that the different factors are often correlated with each other, and by the relatively poor quality of available functional genomics data sets. Here, we use correlation, partial correlation and principal component regression analyses to measure the contributions of several factors to the variability of the rates of evolution of human proteins. For this purpose, we analyzed the entire human protein–protein interaction data set and the human signal transduction network—a network data set of exceptionally high quality, obtained by manual curation, which is expected to be virtually free from false positives. In contrast with the prevailing view, we observe that network centrality (measured as the number of physical and nonphysical interactions, betweenness, and closeness) has a considerable impact on rates of protein evolution. Surprisingly, the impact of centrality on rates of protein evolution seems to be comparable, or even superior according to some analyses, to that of gene expression. Our observations seem to be independent of potentially confounding factors and from the limitations (biases and errors) of interactomic data sets.
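The partial correlation analysis mentioned in this abstract can be illustrated with a minimal sketch. It computes a first-order partial correlation between two variables while controlling for a third, using only the standard library. The variable names (`rate`, `centrality`, `expression`) and the toy values are hypothetical and chosen for illustration; they are not data or code from the cited study, which also uses principal component regression that is not shown here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy values: both "rate" and "centrality" are driven by
# "expression" plus independent noise, so they correlate with each other
# only through expression.
expression = [0, 0, 1, 1]
rate = [1, -1, 3, 1]
centrality = [1, -1, 1, 3]

raw = pearson(rate, centrality)
controlled = partial_corr(rate, centrality, expression)
print(f"raw correlation: {raw:.3f}, controlling for expression: {controlled:.3f}")
```

In this toy case the raw correlation is positive but vanishes once the third variable is controlled for, which is the kind of confounding that partial correlation is meant to expose.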
6. Schumacher J, Herlyn H. Correlates of evolutionary rates in the murine sperm proteome. BMC Evol Biol 2018; 18:35. [PMID: 29580206 PMCID: PMC5870804 DOI: 10.1186/s12862-018-1157-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/14/2017] [Accepted: 03/19/2018] [Indexed: 01/20/2023]
Abstract
Background: Protein-coding genes expressed in sperm evolve at different rates. To gain deeper insight into the factors underlying this heterogeneity we examined the relative importance of a diverse set of previously described rate correlates in determining the evolution of murine sperm proteins. Results: Using partial rank correlations we detected several major rate indicators: Phyletic gene age, numbers of protein-protein interactions, and survival essentiality emerged as particularly important rate correlates in murine sperm proteins. Tissue specificity, numbers of paralogs, and untranslated region lengths also correlate significantly with sperm genes’ evolutionary rates, albeit to a lesser extent. Multifunctionality, coding sequence or average intron lengths, and mean expression level have insignificant or virtually no independent effects on evolutionary rates in murine sperm genes. Gene ontology enrichment analyses of three equally sized murine sperm protein groups classified based on their evolutionary rates indicate strongest sperm-specific functional specialization in the most quickly evolving gene class. Conclusions: We propose a model according to which slowly evolving murine sperm proteins tend to be constrained by factors such as survival essentiality, network connectivity, and/or broad expression. In contrast, evolutionary change may arise especially in less constrained sperm proteins, which might, moreover, be prone to specialize to reproduction-related functions. Our results should be taken into account in future studies on rate variations of reproductive genes.
Affiliation(s)
- Julia Schumacher
  - Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany
- Holger Herlyn
  - Institute of Organismic and Molecular Evolution, Anthropology, Johannes Gutenberg University, Mainz, Germany
7. Lu YW, Chiu TS. Factors affecting synonymous codon usage of housekeeping genes in Drosophila melanogaster. Acta Biologica Hungarica 2018; 69:58-71. [PMID: 29575916 DOI: 10.1556/018.68.2018.1.5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Housekeeping genes (HK genes) are required for cell survival and the maintenance of basic cellular functions. The investigation of factors affecting codon usage patterns in HK genes of insects can help in understanding the molecular evolution of insects and aid the development of insect pest management strategies. In this study, we employed bioinformatics approaches to analyze the codon usage bias (CUB) of HK genes in the insect model organism, Drosophila melanogaster. A comparison of CUB between 1107 HK genes and 1084 high tissue specificity genes suggested that HK genes have higher CUB in D. melanogaster. In addition, we found that CUB inversely correlates with the non-synonymous substitution rate of HK genes. Therefore, we attempted to identify the factors that potentially influence the codon usage pattern of HK genes. Our results suggest that mutation pressure and natural selection highly correlate with CUB in the HK genes of D. melanogaster and that two topological properties of HK proteins (proportion of protein interacting length and protein connectivity) also correlate with CUB in the HK genes of D. melanogaster. This study provides insight into CUB in the HK genes of D. melanogaster, and the results can support future investigations of potential applications in agricultural and biomedical field.
Affiliation(s)
- Yi Wen Lu
  - Department of Life Science, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan
- Tai Sheng Chiu
  - Department of Life Science, National Taiwan University, No. 1, Sec. 4, Roosevelt Rd., Taipei 10617, Taiwan
8. Biswas K, Acharya D, Podder S, Ghosh TC. Evolutionary rate heterogeneity between multi- and single-interface hubs across human housekeeping and tissue-specific protein interaction network: Insights from proteins' and its partners' properties. Genomics 2017; 110:283-290. [PMID: 29198610 DOI: 10.1016/j.ygeno.2017.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/23/2017] [Revised: 11/10/2017] [Accepted: 11/29/2017] [Indexed: 12/12/2022]
Abstract
Integrating gene expression into protein-protein interaction network (PPIN) leads to the construction of tissue-specific (TS) and housekeeping (HK) sub-networks, with distinctive TS- and HK-hubs. All such hub proteins are divided into multi-interface (MI) hubs and single-interface (SI) hubs, where MI hubs evolve slower than SI hubs. Here we explored the evolutionary rate difference between MI and SI proteins within TS- and HK-PPIN and observed that this difference is present only in TS, but not in HK-class. Next, we explored whether proteins' own properties or its partners' properties are more influential in such evolutionary discrepancy. Statistical analyses revealed that this evolutionary rate correlates negatively with protein's own properties like expression level, miRNA count, conformational diversity and functional properties and with its partners' properties like protein disorder and tissue expression similarity. Moreover, partial correlation and regression analysis revealed that both proteins' and its partners' properties have independent effects on protein evolutionary rate.
Affiliation(s)
- Kakali Biswas
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Debarun Acharya
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Soumita Podder
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
  - Department of Microbiology, Raiganj University, Raiganj, Uttar Dinajpur 733134, India
- Tapash Chandra Ghosh
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
9. Effects of different kinds of essentiality on sequence evolution of human testis proteins. Sci Rep 2017; 7:43534. [PMID: 28272493 PMCID: PMC5341092 DOI: 10.1038/srep43534] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/25/2016] [Accepted: 01/25/2017] [Indexed: 11/17/2022]
Abstract
We asked if essentiality for either fertility or viability differentially affects sequence evolution of human testis proteins. Based on murine knockout data, we classified a set of 965 proteins expressed in human seminiferous tubules into three categories: proteins essential for prepubertal survival (“lethality proteins”), associated with male sub- or infertility (“male sub-/infertility proteins”), and nonessential proteins. In our testis protein dataset, lethality genes evolved significantly slower than nonessential and male sub-/infertility genes, which is in line with other authors’ findings. Using tissue specificity, connectivity in the protein-protein interaction (PPI) network, and multifunctionality as proxies for evolutionary constraints, we found that of the three categories, proteins linked to male sub- or infertility are least constrained. Lethality proteins, on the other hand, are characterized by broad expression, many PPI partners, and high multifunctionality, all of which points to strong evolutionary constraints. We conclude that compared with lethality proteins, those linked to male sub- or infertility are nonetheless indispensable, but evolve under more relaxed constraints. Finally, adaptive evolution in response to postmating sexual selection could further accelerate evolutionary rates of male sub- or infertility proteins expressed in human testis. These findings may become useful for in silico detection of human sub-/infertility genes.
10. Biswas K, Chakraborty S, Podder S, Ghosh TC. Insights into the dN/dS ratio heterogeneity between brain specific genes and widely expressed genes in species of different complexity. Genomics 2016; 108:11-7. [DOI: 10.1016/j.ygeno.2016.04.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/04/2015] [Revised: 04/22/2016] [Accepted: 04/23/2016] [Indexed: 01/07/2023]
11. Acharya D, Ghosh TC. Global analysis of human duplicated genes reveals the relative importance of whole-genome duplicates originated in the early vertebrate evolution. BMC Genomics 2016; 17:71. [PMID: 26801093 PMCID: PMC4724117 DOI: 10.1186/s12864-016-2392-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 09/28/2015] [Accepted: 01/13/2016] [Indexed: 12/13/2022]
Abstract
Background: Gene duplication is a genetic mutation that creates functionally redundant gene copies that are initially relieved from selective pressures and may adapt themselves to new functions with time. The levels of gene duplication may vary from small-scale duplication (SSD) to whole genome duplication (WGD). Studies with yeast revealed ample differences between these duplicates: Yeast WGD pairs were functionally more similar, less divergent in subcellular localization and contained a lesser proportion of essential genes. In this study, we explored the differences in evolutionary genomic properties of human SSD and WGD genes, with the identifiable human duplicates coming from the two rounds of whole genome duplication occurred early in vertebrate evolution. Results: We observed that these two groups of duplicates were also dissimilar in terms of their evolutionary and genomic properties. But interestingly, this is not like the same observed in yeast. The human WGDs were found to be functionally less similar, diverge more in subcellular level and contain a higher proportion of essential genes than the SSDs, all of which are opposite from yeast. Additionally, we explored that human WGDs were more divergent in their gene expression profile, have higher multifunctionality and are more often associated with disease, and are evolutionarily more conserved than human SSDs. Conclusions: Our study suggests that human WGD duplicates are more divergent and entails the adaptation of WGDs to novel and important functions that consequently lead to their evolutionary conservation in the course of evolution.
Affiliation(s)
- Debarun Acharya
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700054, West Bengal, India
- Tapash C Ghosh
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700054, West Bengal, India
12. Chakraborty S, Panda A, Ghosh TC. Exploring the evolutionary rate differences between human disease and non-disease genes. Genomics 2015; 108:18-24. [PMID: 26562439 DOI: 10.1016/j.ygeno.2015.11.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/27/2015] [Revised: 10/29/2015] [Accepted: 11/03/2015] [Indexed: 10/22/2022]
Abstract
Comparisons of evolutionary features between human disease and non-disease genes have a wide implication to understand the genetic basis of human disease genes. However, it has not yet been resolved whether disease genes evolve at slower or faster rate than the non-disease genes. To resolve this controversy, here we integrated human disease genes from several databases and compared their protein evolutionary rates with non-disease genes in both housekeeping and tissue-specific group. We noticed that in tissue specific group, disease genes evolve significantly at a slower rate than non-disease genes. However, we found no significant difference in evolutionary rates between disease and non-disease genes in housekeeping group. Tissue specific disease genes have a higher protein complex number, elevated gene expression level and are also associated with conserve biological processes. Finally, our regression analysis suggested that protein complex number followed by protein multifunctionality independently modulates the evolutionary rate of human disease genes.
Affiliation(s)
- Sandip Chakraborty
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Arup Panda
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Tapash Chandra Ghosh
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
13. Bush SJ, Kover PX, Urrutia AO. Lineage-specific sequence evolution and exon edge conservation partially explain the relationship between evolutionary rate and expression level in A. thaliana. Mol Ecol 2015; 24:3093-106. [PMID: 25930165 PMCID: PMC4480654 DOI: 10.1111/mec.13221] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 08/04/2014] [Revised: 04/21/2015] [Accepted: 04/28/2015] [Indexed: 02/06/2023]
Abstract
Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the ‘edges’ of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters.
Affiliation(s)
- Stephen J Bush
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Paula X Kover
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Araxi O Urrutia
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
14.
Abstract
The searching of human housekeeping (HK) genes has been a long quest since the emergence of transcriptomics, and is instrumental for us to understand the structure of genome and the fundamentals of biological processes. The resolved genes are frequently used in evolution studies and as normalization standards in quantitative gene-expression analysis. Within the past 20 years, more than a dozen HK-gene studies have been conducted, yet none of them sampled human tissues completely. We believe an integration of these results will help remove false positive genes owing to the inadequate sampling. Surprisingly, we only find one common gene across 15 examined HK-gene datasets comprising 187 different tissue and cell types. Our subsequent analyses suggest that it might not be appropriate to rigidly define HK genes as expressed in all tissue types that have diverse developmental, physiological, and pathological states. It might be beneficial to use more robustly identified HK functions for filtering criteria, in which the representing genes can be a subset of genome. These genes are not necessarily the same, and perhaps need not to be the same, everywhere in our body.
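The integration step described in this abstract, looking for genes shared across several housekeeping-gene datasets, can be sketched as a set intersection. The sketch below is illustrative only: the dataset labels and gene identifiers are placeholders, and the relaxed "present in at least k datasets" criterion is an assumption added here for contrast, not a method taken from the cited study.

```python
from collections import Counter


def common_genes(datasets):
    """Genes reported by every dataset (strict intersection)."""
    gene_sets = [set(genes) for genes in datasets.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


def genes_in_at_least(datasets, min_count):
    """Genes reported by at least `min_count` datasets (relaxed criterion)."""
    counts = Counter(g for genes in datasets.values() for g in set(genes))
    return {gene for gene, n in counts.items() if n >= min_count}


# Placeholder data: three hypothetical studies with made-up gene labels.
toy_datasets = {
    "study_1": ["GENE_A", "GENE_B", "GENE_C"],
    "study_2": ["GENE_A", "GENE_B"],
    "study_3": ["GENE_A", "GENE_D"],
}

strict = common_genes(toy_datasets)
relaxed = genes_in_at_least(toy_datasets, min_count=2)
print(sorted(strict), sorted(relaxed))
```

The strict intersection shrinks quickly as datasets are added, which is consistent with the abstract's observation that a rigid "expressed everywhere" definition leaves very few genes.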
Affiliation(s)
- Yijuan Zhang
  - Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Ding Li
  - Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Bingyun Sun
  - Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
15. Chakraborty S, Ghosh TC. Evolutionary rate heterogeneity of core and attachment proteins in yeast protein complexes. Genome Biol Evol 2013; 5:1366-75. [PMID: 23814130 PMCID: PMC3730348 DOI: 10.1093/gbe/evt096] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/24/2022]
Abstract
In general, proteins do not work alone; they form macromolecular complexes to play fundamental roles in diverse cellular functions. On the basis of their iterative clustering procedure and frequency of occurrence in the macromolecular complexes, the protein subunits have been categorized as core and attachment. Core protein subunits are the main functional elements, whereas attachment proteins act as modifiers or activators in protein complexes. In this article, using the current data set of yeast protein complexes, we found that core proteins are evolving at a faster rate than attachment proteins in spite of their functional importance. Interestingly, our investigation revealed that attachment proteins are present in a higher number of macromolecular complexes than core proteins. We also observed that the protein complex number (defined as the number of protein complexes in which a protein subunit belongs) has a stronger influence on gene/protein essentiality than multifunctionality. Finally, our results suggest that the observed differences in the rates of protein evolution between core and attachment proteins are due to differences in protein complex number and expression level. Moreover, we conclude that proteins which are present in higher numbers of macromolecular complexes enhance their overall expression level by increasing their transcription rate as well as translation rate, and thus the protein complex number imposes a strong selection pressure on the evolution of yeast proteome.
|
16
|
Choi SS, Hannenhalli S. Three independent determinants of protein evolutionary rate. J Mol Evol 2013; 76:98-111. [PMID: 23400388 DOI: 10.1007/s00239-013-9543-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 11/09/2012] [Accepted: 01/16/2013] [Indexed: 12/15/2022]
Abstract
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve slower than other regions, a reasonable outcome of which should be a slower evolutionary rate of the proteins with a higher density of functionally important sites. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness, protein-protein interactions, etc., have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of functional importance of a gene, and consider three factors-functional importance, expression level, and gene compactness, as independent determinants of evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables as well as the interrelationships among the variables.
Affiliation(s)
- Sun Shim Choi
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, South Korea.
|
17
|
Podder S, Ghosh TC. Evolutionary dynamics of human autoimmune disease genes and malfunctioned immunological genes. BMC Evol Biol 2012; 12:10. [PMID: 22276655 PMCID: PMC3347981 DOI: 10.1186/1471-2148-12-10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/29/2011] [Accepted: 01/25/2012] [Indexed: 02/01/2023] Open
Abstract
Background: One of the main issues of molecular evolution is to divulge the principles in dictating the evolutionary rate differences among various gene classes. Immunological genes have received considerable attention in evolutionary biology as candidates for local adaptation and for studying functionally important polymorphisms. The normal structure and function of immunological genes will be distorted when they experience mutations leading to immunological dysfunctions.
Results: Here, we examined the fundamental differences between the genes which on mutation give rise to autoimmune or other immune system related diseases and the immunological genes that do not cause any disease phenotypes. Although the disease genes examined are analogous to non-disease genes in product, expression, function, and pathway affiliation, a statistically significant decrease in evolutionary rate has been found in autoimmune disease genes relative to all other immune related diseases and non-disease genes. Possible ways of accumulation of mutation in the three steps of the central dogma (DNA-mRNA-Protein) have been studied to trace the mutational effects predisposed to disease consequence and acquiring higher selection pressure. Principal Component Analysis and Multivariate Regression Analysis have established the predominant role of single nucleotide polymorphisms in guiding the evolutionary rate of immunological disease and non-disease genes followed by m-RNA abundance, paralogs number, fraction of phosphorylation residue, alternatively spliced exon, protein residue burial and protein disorder.
Conclusions: Our study provides an empirical insight into the etiology of autoimmune disease genes and other immunological diseases. The immediate utility of our study is to help in disease gene identification and may also help in medicinal improvement of immune related disease.
|
18
|
Sen K, Podder S, Ghosh TC. On the quest for selective constraints shaping the expressivity of the genes casting retropseudogenes in human. BMC Genomics 2011; 12:401. [PMID: 21824418 PMCID: PMC3162935 DOI: 10.1186/1471-2164-12-401] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 12/31/2010] [Accepted: 08/08/2011] [Indexed: 02/04/2023] Open
Abstract
Background: Pseudogenes, the nonfunctional homologues of functional genes are now coming to light as important resources regarding the study of human protein evolution. Processed pseudogenes arising by reverse transcription and reinsertion can provide molecular record on the dynamics and evolution of genomes. Researches on the progenitors of human processed pseudogenes delved out their highly expressed and evolutionarily conserved characters. They are reported to be short and GC-poor indicating their high efficiency for retrotransposition. In this article we focused on their high expressivity and explored the factors contributing for that and their relevance in the milieu of protein sequence evolution.
Results: We here, analyzed the high expressivity of these genes configuring processed or retropseudogenes by their immense connectivity in protein-protein interaction network, an inclination towards alternative splicing mechanism, a lower rate of mRNA disintegration and a slower evolutionary rate. While the unusual trend of the upraised disorder in contrast with the high expressivity of the proteins encoded by processed pseudogene ancestors is accredited by a predominance of hub-protein encoding genes, a high propensity of repeat sequence containing genes, elevated protein stability and the functional constraint to perform the transcription regulatory jobs. Linear regression analysis demonstrates mRNA decay rate and protein intrinsic disorder as the influential factors controlling the expressivity of these retropseudogene ancestors while the latter one is found to have the most significant regulatory power.
Conclusions: Our findings imply that, the affluence of disordered regions elevating the network attachment to be involved in important cellular assignments and the stability in transcriptional level are acting as the prevailing forces behind the high expressivity of the human genes configuring processed pseudogenes.
Affiliation(s)
- Kamalika Sen
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
19
|
Chang CW, Cheng WC, Chen CR, Shu WY, Tsai ML, Huang CL, Hsu IC. Identification of human housekeeping genes and tissue-selective genes by microarray meta-analysis. PLoS One 2011; 6:e22859. [PMID: 21818400 PMCID: PMC3144958 DOI: 10.1371/journal.pone.0022859] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.6] [Received: 04/20/2011] [Accepted: 06/29/2011] [Indexed: 01/26/2023] Open
Abstract
Background: Categorizing protein-encoding transcriptomes of normal tissues into housekeeping genes and tissue-selective genes is a fundamental step toward studies of genetic functions and genetic associations to tissue-specific diseases. Previous studies have been mainly based on a few data sets with limited samples in each tissue, which restrained the representativeness of their identified genes, and resulted in low consensus among them.
Results: This study compiled 1,431 samples in 43 normal human tissues from 104 microarray data sets. We developed a new method to improve gene expression assessment, and showed that more than ten samples are needed to robustly identify the protein-encoding transcriptome of a tissue. We identified 2,064 housekeeping genes and 2,293 tissue-selective genes, and analyzed gene lists by functional enrichment analysis. The housekeeping genes are mainly involved in fundamental cellular functions, and the tissue-selective genes are strikingly related to functions and diseases corresponding to tissue-origin. We also compared agreements and related functions among our housekeeping genes and those of previous studies, and pointed out some reasons for the low consensuses.
Conclusions: The results indicate that sufficient samples have improved the identification of protein-encoding transcriptome of a tissue. Comprehensive meta-analysis has proved the high quality of our identified HK and TS genes. These results could offer a useful resource for future research on functional and genomic features of HK and TS genes.
Affiliation(s)
- Cheng-Wei Chang
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Wei-Chung Cheng
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Chaang-Ray Chen
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Wun-Yi Shu
- Institute of Statistics, National Tsing Hua University, Hsinchu, Taiwan
- Min-Lung Tsai
- Institute of Athletics, National Taiwan Sport University, Taichung, Taiwan
- Ching-Lung Huang
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
- Ian C. Hsu
- Department of Biomedical Engineering and Environmental Sciences, National Tsing Hua University, Hsinchu, Taiwan
|
20
|
Razeto-Barry P, Díaz J, Cotoras D, Vásquez RA. Molecular evolution, mutation size and gene pleiotropy: a geometric reexamination. Genetics 2011; 187:877-85. [PMID: 21196522 PMCID: PMC3048784 DOI: 10.1534/genetics.110.125195] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 11/14/2010] [Accepted: 12/22/2010] [Indexed: 01/15/2023] Open
Abstract
The influence of phenotypic effects of genetic mutations on molecular evolution is not well understood. Neutral and nearly neutral theories of molecular evolution predict a negative relationship between the evolutionary rate of proteins and their functional importance; nevertheless empirical studies seeking relationships between evolutionary rate and the phenotypic role of proteins have not produced conclusive results. In particular, previous studies have not found the expected negative correlation between evolutionary rate and gene pleiotropy. Here, we studied the effect of gene pleiotropy and the phenotypic size of mutations on the evolutionary rate of genes in a geometrical model, in which gene pleiotropy was characterized by n molecular phenotypes that affect organismal fitness. For a nearly neutral process, we found a negative relationship between evolutionary rate and mutation size but pleiotropy did not affect the evolutionary rate. Further, for a selection model, where most of the substitutions were fixed by natural selection in a randomly fluctuating environment, we also found a negative relationship between evolutionary rate and mutation size, but interestingly, gene pleiotropy increased the evolutionary rate as √n. These findings may explain part of the disagreement between empirical data and traditional expectations.
Affiliation(s)
- Pablo Razeto-Barry
- Instituto de Filosofía y Ciencias de la Complejidad, Santiago, Chile 7780192.
|
21
|
Chakraborty S, Kahali B, Ghosh TC. Protein complex forming ability is favored over the features of interacting partners in determining the evolutionary rates of proteins in the yeast protein-protein interaction networks. BMC Syst Biol 2010; 4:155. [PMID: 21073713 PMCID: PMC2998497 DOI: 10.1186/1752-0509-4-155] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 04/22/2010] [Accepted: 11/12/2010] [Indexed: 01/01/2023]
Abstract
Background: Evolutionary rates of proteins in a protein-protein interaction network are primarily governed by the protein connectivity and/or expression level. A recent study revealed the importance of the features of the interacting protein partners, viz., the coefficient of functionality and clustering coefficient in controlling the protein evolutionary rates in a protein-protein interaction (PPI) network.
Results: By multivariate regression analysis we found that the three parameters: probability of complex formation, expression level and degree of a protein independently guide the evolutionary rates of proteins in the PPI network. The contribution of the complex forming property of a protein and its expression level led to nearly 43% of the total variation as observed from the first principal component. We also found that for complex forming proteins in the network, those which have partners sharing the same functional class evolve faster than those having partners belonging to different functional classes. The proteins in the dense parts of the network evolve faster than their counterparts which are present in the sparse regions of the network. Taking into account the complex forming ability, we found that all the complex forming proteins considered in this study evolve slower than the non-complex forming proteins irrespective of their localization in the network or the affiliation of their partners to same/different functional classes.
Conclusions: We have shown here that the functionality and clustering coefficient correlated with the degree of the protein in the protein-protein interaction network. We have identified the significant relationship of the complex-forming property of proteins and their evolutionary rates even when they are classified according to the features of their interacting partners. Our study implies that the evolutionarily constrained proteins are actually members of a larger number of protein complexes and this justifies why they have enhanced expression levels.
Affiliation(s)
- Sandip Chakraborty
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
22
|
Begum T, Ghosh TC. Understanding the Effect of Secondary Structures and Aggregation on Human Protein Folding Class Evolution. J Mol Evol 2010; 71:60-9. [DOI: 10.1007/s00239-010-9364-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/18/2009] [Accepted: 06/23/2010] [Indexed: 12/01/2022]
|
23
|
Vinogradov AE. Human transcriptome nexuses: basic-eukaryotic and metazoan. Genomics 2010; 95:345-54. [PMID: 20298777 DOI: 10.1016/j.ygeno.2010.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/15/2009] [Revised: 03/01/2010] [Accepted: 03/08/2010] [Indexed: 01/10/2023]
Abstract
Using a new approach, I analysed human transcriptome coexpression network and revealed two large-scale nexuses. Besides gene coexpression, each nexus is characterized by a combination of gene evolutionary origin, function and among-tissues expression breadth. The first nexus contains mostly genes of pre-metazoan origin, which are widely expressed and have cell-centred functions. The second nexus is enriched in genes of metazoan origin, which are expressed more narrowly and have organism-centred functions. The revealed nexuses are supported by asymmetry in distribution of transcription factor targets between them. Within the metazoan nexus, there is a subnexus that is more pronounced in the nervous tissues and is enriched in gene regulatory complexity. It mostly contains genes related to nervous system, cell communication and multicellular organism processes and development. The revealed nexuses indicate a dichotomy in the transcriptional regulation and can provide a framework for further functional genomics studies.
|
24
|
Exploring the Differences in Evolutionary Rates between Monogenic and Polygenic Disease Genes in Human. Mol Biol Evol 2009; 27:934-41. [DOI: 10.1093/molbev/msp297] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 12/24/2022] Open
|