1
Cui J, Qiu T, Li L, Cui S. De novo full-length transcriptome analysis of two ecotypes of Phragmites australis (swamp reed and dune reed) provides new insights into the transcriptomic complexity of dune reed and its long-term adaptation to desert environments. BMC Genomics 2023; 24:180. [PMID: 37020272 PMCID: PMC10077656 DOI: 10.1186/s12864-023-09271-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2022] [Accepted: 03/23/2023] [Indexed: 04/07/2023]
Abstract
BACKGROUND: The extremely harsh desert environment changes dramatically from moment to moment, and a rapid, short-term adaptive stress response requires enormous energy expenditure to mobilize widespread regulatory networks, which is all the more detrimental to the survival of desert plants. The dune reed, which has adapted to desert environments with complex and variable ecological factors, is an ideal plant for studying the molecular mechanisms by which Gramineae plants respond to the combinatorial stress of the desert in their natural state. So far, however, genetic resources for reeds remain scarce, so most research has focused on ecology and physiology.
RESULTS: In this study, we obtained the first de novo non-redundant full-length non-chimeric (FLNC) transcriptome databases for swamp reeds (SR), dune reeds (DR), and Phragmites australis as a whole (All, merging the Iso-Seq data from SR and DR), using PacBio Iso-Seq technology combined with tools such as Iso-Seq3 and Cogent. We then identified and described long non-coding RNAs (lncRNAs), transcription factors (TFs), and alternative splicing (AS) events in reeds based on these transcriptome databases. We also identified and developed, for the first time, a large number of candidate expressed sequence tag-SSR (EST-SSR) markers in reeds based on UniTransModels. In addition, through differential gene expression analysis of wild-type and homogenous cultures, we found a large number of transcription factors that may be associated with desert stress tolerance in the dune reed, and revealed that members of the Lhc family play an important role in the long-term adaptation of dune reeds to desert environments.
CONCLUSIONS: Our results provide a usable genetic resource for Phragmites australis, a species with widespread adaptability and resistance, and a genetic database for subsequent reed genome annotation and functional genomic studies.
Affiliation(s)
- Jipeng Cui
  - College of Life Sciences, Capital Normal University, Haidian District, Beijing, 100048, China
  - Beijing Key Laboratory of Plant Gene Resources and Biotechnology for Carbon Reduction and Environmental Improvement, Haidian District, Beijing, 100048, China
- Tianhang Qiu
  - College of Life Sciences, Capital Normal University, Haidian District, Beijing, 100048, China
  - Beijing Key Laboratory of Plant Gene Resources and Biotechnology for Carbon Reduction and Environmental Improvement, Haidian District, Beijing, 100048, China
- Li Li
  - College of Life Sciences, Capital Normal University, Haidian District, Beijing, 100048, China
  - Beijing Key Laboratory of Plant Gene Resources and Biotechnology for Carbon Reduction and Environmental Improvement, Haidian District, Beijing, 100048, China
- Suxia Cui
  - College of Life Sciences, Capital Normal University, Haidian District, Beijing, 100048, China
  - Beijing Key Laboratory of Plant Gene Resources and Biotechnology for Carbon Reduction and Environmental Improvement, Haidian District, Beijing, 100048, China
2
Han Y, Wennersten SA, Wright JM, Ludwig RW, Lau E, Lam MPY. Proteogenomics reveals sex-biased aging genes and coordinated splicing in cardiac aging. Am J Physiol Heart Circ Physiol 2022; 323:H538-H558. [PMID: 35930447 PMCID: PMC9448281 DOI: 10.1152/ajpheart.00244.2022] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/19/2022] [Revised: 07/20/2022] [Accepted: 07/31/2022] [Indexed: 01/24/2023]
Abstract
The risks of heart diseases are significantly modulated by age and sex, but how these factors influence baseline cardiac gene expression remains incompletely understood. Here, we used RNA sequencing and mass spectrometry to compare gene expression in female and male young adult (4 mo) and early aging (20 mo) mouse hearts, identifying thousands of age- and sex-dependent gene expression signatures. Sexually dimorphic cardiac genes are broadly distributed, functioning in mitochondrial metabolism, translation, and other processes. In parallel, we found over 800 genes with differential aging responses between males and females, including genes in cAMP and PKA signaling. Analysis of the sex-adjusted aging cardiac transcriptome revealed a widespread remodeling of exon usage patterns that is largely independent of differential gene expression, concomitant with upstream changes in RNA-binding protein and splice factor transcripts. To evaluate the impact of the splicing events on cardiac proteoform composition, we applied an RNA-guided proteomics computational pipeline to analyze the mass spectrometry data and detected hundreds of putative splice variant proteins that have the potential to rewire the cardiac proteome. Taken together, the results here suggest that cardiac aging is associated with 1) widespread sex-biased aging genes and 2) a rewiring of RNA splicing programs, including sex- and age-dependent changes in exon usage and splice patterns that have the potential to influence cardiac protein structure and function. These changes contribute to the emerging evidence for considerable sexual dimorphism in the cardiac aging process that should be considered in the search for disease mechanisms.
NEW & NOTEWORTHY: Han et al. used proteogenomics to compare male and female mouse hearts at 4 and 20 mo. Sex-biased cardiac genes function in mitochondrial metabolism, translation, autophagy, and other processes. Hundreds of cardiac genes show sex-by-age interactions, that is, sex-biased aging genes. Cardiac aging is accompanied by a remodeling of exon usage in functionally coordinated genes, concomitant with differential expression of RNA-binding proteins and splice factors. These features represent an underinvestigated aspect of cardiac aging that may be relevant to the search for disease mechanisms.
Grants
- R21-HL150456 HHS | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R00-HL144829 HHS | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R00-HL127302 HHS | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- R03-OD032666 HHS | NIH | NIH Office of the Director (OD)
- R01-HL141278 HHS | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- F32-HL149191 HHS | NIH | National Heart, Lung, and Blood Institute (NHLBI)
- University of Colorado
- University of Colorado School of Medicine, Anschutz Medical Campus
Affiliation(s)
- Yu Han
  - Department of Medicine, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
- Sara A Wennersten
  - Department of Medicine, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
- Julianna M Wright
  - Department of Medicine, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
- R W Ludwig
  - Department of Medicine, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
- Maggie P Y Lam
  - Department of Medicine, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
  - Department of Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado School of Medicine, Aurora, Colorado
3
Kawachi T, Masuda A, Yamashita Y, Takeda JI, Ohkawara B, Ito M, Ohno K. Regulated splicing of large exons is linked to phase-separation of vertebrate transcription factors. EMBO J 2021; 40:e107485. [PMID: 34605568 DOI: 10.15252/embj.2020107485] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/10/2020] [Revised: 09/06/2021] [Accepted: 09/14/2021] [Indexed: 12/30/2022]
Abstract
Although large exons cannot be readily recognized by the spliceosome, many are evolutionarily conserved and constitutively spliced for inclusion in the processed transcript. Furthermore, whether large exons may be enriched in a certain subset of proteins, or mediate specific functions, has remained unclear. Here, we identify a set of nearly 3,000 SRSF3-dependent large constitutive exons (S3-LCEs) in human and mouse cells. These exons are enriched for cytidine-rich sequence motifs, which bind and recruit the splicing factors hnRNP K and SRSF3. We find that hnRNP K suppresses S3-LCE splicing, an effect that is mitigated by SRSF3 to thus achieve constitutive splicing of S3-LCEs. S3-LCEs are enriched in genes for components of transcription machineries, including mediator and BAF complexes, and frequently contain intrinsically disordered regions (IDRs). In a subset of analyzed S3-LCE-containing transcription factors, SRSF3 depletion leads to deletion of the IDRs due to S3-LCE exon skipping, thereby disrupting phase-separated assemblies of these factors. Cytidine enrichment in large exons introduces proline/serine codon bias in intrinsically disordered regions and appears to have been evolutionarily acquired in vertebrates. We propose that layered splicing regulation by hnRNP K and SRSF3 ensures proper phase-separation of these S3-LCE-containing transcription factors in vertebrates.
Affiliation(s)
- Toshihiko Kawachi
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Akio Masuda
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Yoshihiro Yamashita
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Jun-Ichi Takeda
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Bisei Ohkawara
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Mikako Ito
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Kinji Ohno
  - Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan
4
Exploring Potential Signals of Selection for Disordered Residues in Prokaryotic and Eukaryotic Proteins. Genomics Proteomics Bioinformatics 2020; 18:549-564. [PMID: 33346088 PMCID: PMC8377245 DOI: 10.1016/j.gpb.2020.06.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2019] [Revised: 03/29/2020] [Accepted: 06/10/2020] [Indexed: 11/22/2022]
Abstract
Intrinsically disordered proteins (IDPs) are an important class of proteins in all domains of life for their functional importance. However, how nature has shaped the disorder potential of prokaryotic and eukaryotic proteins is still not clearly known. Randomly generated sequences are free of any selective constraints, thus these sequences are commonly used as null models. Considering different types of random protein models, here we seek to understand how the disorder potential of natural eukaryotic and prokaryotic proteins differs from random sequences. Comparing proteome-wide disorder content between real and random sequences of 12 model organisms, we noticed that eukaryotic proteins are enriched in disordered regions compared to random sequences, but in prokaryotes such regions are depleted. By analyzing the position-wise disorder profile, we show that there is a generally higher disorder near the N- and C-terminal regions of eukaryotic proteins as compared to the random models; however, either no or a weak such trend was found in prokaryotic proteins. Moreover, here we show that this preference is not caused by the amino acid or nucleotide composition at the respective sites. Instead, these regions were found to be endowed with a higher fraction of protein–protein binding sites, suggesting their functional importance. We discuss several possible explanations for this pattern, such as improving the efficiency of protein–protein interaction, ribosome movement during translation, and post-translational modification. However, further studies are needed to clearly understand the biophysical mechanisms causing the trend.
5
Abrahams L, Hurst LD. A Depletion of Stop Codons in lincRNA is Owing to Transfer of Selective Constraint from Coding Sequences. Mol Biol Evol 2020; 37:1148-1164. [PMID: 31841162 PMCID: PMC7086181 DOI: 10.1093/molbev/msz299] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]
Abstract
Although the constraints on a gene’s sequence are often assumed to reflect the functioning of that gene, here we propose transfer selection, a constraint operating on one class of genes transferred to another, mediated by shared binding factors. We show that such transfer can explain an otherwise paradoxical depletion of stop codons in long intergenic noncoding RNAs (lincRNAs). Serine/arginine-rich proteins direct the splicing machinery by binding exonic splice enhancers (ESEs) in immature mRNA. As coding exons cannot contain stop codons in one reading frame, stop codons should be rare within ESEs. We confirm that the stop codon density (SCD) in ESE motifs is low, even accounting for nucleotide biases. Given that serine/arginine-rich proteins binding ESEs also facilitate lincRNA splicing, a low SCD could transfer to lincRNAs. As predicted, multiexon lincRNA exons are depleted in stop codons, a result not explained by open reading frame (ORF) contamination. Consistent with transfer selection, stop codon depletion in lincRNAs is most acute in exonic regions with the highest ESE density, disappears when ESEs are masked, is consistent with stop codon usage skews in ESEs, and is diminished in both single-exon lincRNAs and introns. Owing to low SCD, the maximum lengths of pseudo-ORFs frequently exceed null expectations. This has implications for ORF annotation and the evolution of de novo protein-coding genes from lincRNAs. We conclude that not all constraints operating on genes need be explained by the functioning of the gene but may instead be transferred owing to shared binding factors.
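As a rough illustration of the stop codon density (SCD) measure mentioned in this abstract, the sketch below counts stop codons per reading frame in a nucleotide sequence. The function names and the exact definition (stops divided by complete codons in a frame) are assumptions made for illustration; the paper's own definition and its nucleotide-bias controls may differ.

```python
# Illustrative sketch only: the SCD definition used here (stop codons divided
# by complete codons in a frame) is an assumption, not the authors' code.
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_codon_density(seq: str, frame: int) -> float:
    """Fraction of complete codons in the given frame (0, 1 or 2) that are stops."""
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(c in STOP_CODONS for c in codons) / len(codons)


def mean_scd(seq: str) -> float:
    """Average SCD over the three forward reading frames."""
    return sum(stop_codon_density(seq, f) for f in range(3)) / 3


if __name__ == "__main__":
    example = "ATGTAAGGG"  # hypothetical toy sequence
    print([stop_codon_density(example, f) for f in range(3)])
```

A comparison against a null model would additionally need shuffled or simulated sequences that match nucleotide composition, which this sketch leaves out.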
Affiliation(s)
- Liam Abrahams
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
6
Chaudhary S, Jabre I, Reddy ASN, Staiger D, Syed NH. Perspective on Alternative Splicing and Proteome Complexity in Plants. Trends Plant Sci 2019; 24:496-506. [PMID: 30852095 DOI: 10.1016/j.tplants.2019.02.006] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Received: 10/22/2018] [Revised: 01/28/2019] [Accepted: 02/08/2019] [Indexed: 05/02/2023]
Abstract
Alternative splicing (AS) generates multiple transcripts from the same gene; however, the contribution of AS to proteome complexity remains elusive in plants. AS is prevalent under stress conditions in plants, but it is counterintuitive why plants would invest in protein synthesis under declining energy supply. We propose that plants employ AS not only to potentially increase proteomic complexity, but also to buffer against the stress-responsive transcriptome to reduce the metabolic cost of translating all AS transcripts. To maximise efficiency under stress, plants may make fewer proteins with disordered domains via AS to diversify substrate specificity and maintain sufficient regulatory capacity. Furthermore, we suggest that chromatin state-dependent AS engenders short/long-term stress memory to mediate reproducible transcriptional response in the future.
Affiliation(s)
- Saurabh Chaudhary
  - School of Human and Life Sciences, Canterbury Christ Church University, Canterbury, CT1 1QU, UK; these authors contributed equally to this work
- Ibtissam Jabre
  - School of Human and Life Sciences, Canterbury Christ Church University, Canterbury, CT1 1QU, UK; these authors contributed equally to this work
- Anireddy S N Reddy
  - Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, CO 80523-1878, USA
- Dorothee Staiger
  - RNA Biology and Molecular Physiology, Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Naeem H Syed
  - School of Human and Life Sciences, Canterbury Christ Church University, Canterbury, CT1 1QU, UK
7
Fontrodona N, Aubé F, Claude JB, Polvèche H, Lemaire S, Tranchevent LC, Modolo L, Mortreux F, Bourgeois CF, Auboeuf D. Interplay between coding and exonic splicing regulatory sequences. Genome Res 2019; 29:711-722. [PMID: 30962178 PMCID: PMC6499313 DOI: 10.1101/gr.241315.118] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/02/2018] [Accepted: 03/28/2019] [Indexed: 01/24/2023]
Abstract
The inclusion of exons during the splicing process depends on the binding of splicing factors to short low-complexity regulatory sequences. The relationship between exonic splicing regulatory sequences and coding sequences is still poorly understood. We demonstrate that exons that are coregulated by any given splicing factor share a similar nucleotide composition bias and preferentially code for amino acids with similar physicochemical properties because of the nonrandomness of the genetic code. Indeed, amino acids sharing similar physicochemical properties correspond to codons that have the same nucleotide composition bias. In particular, we uncover that the TRA2A and TRA2B splicing factors that bind to adenine-rich motifs promote the inclusion of adenine-rich exons coding preferentially for hydrophilic amino acids that correspond to adenine-rich codons. SRSF2 that binds guanine/cytosine-rich motifs promotes the inclusion of GC-rich exons coding preferentially for small amino acids, whereas SRSF3 that binds cytosine-rich motifs promotes the inclusion of exons coding preferentially for uncharged amino acids, like serine and threonine that can be phosphorylated. Finally, coregulated exons encoding amino acids with similar physicochemical properties correspond to specific protein features. In conclusion, the regulation of an exon by a splicing factor that relies on the affinity of this factor for specific nucleotide(s) is tightly interconnected with the exon-encoded physicochemical properties. We therefore uncover an unanticipated bidirectional interplay between the splicing regulatory process and its biological functional outcome.
Affiliation(s)
- Nicolas Fontrodona
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Fabien Aubé
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Jean-Baptiste Claude
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Hélène Polvèche
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Sébastien Lemaire
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Léon-Charles Tranchevent
  - Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health (LIH), L-1445 Strassen, Luxembourg
- Laurent Modolo
  - LBMC Biocomputing Center, CNRS UMR 5239, INSERM U1210, F-69007, Lyon, France
- Franck Mortreux
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Cyril F Bourgeois
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
- Didier Auboeuf
  - Université Lyon, ENS de Lyon, Université Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, F-69007, Lyon, France
8
Wang X, Codreanu SG, Wen B, Li K, Chambers MC, Liebler DC, Zhang B. Detection of Proteome Diversity Resulted from Alternative Splicing is Limited by Trypsin Cleavage Specificity. Mol Cell Proteomics 2017; 17:422-430. [PMID: 29222161 PMCID: PMC5836368 DOI: 10.1074/mcp.ra117.000155] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 06/26/2017] [Revised: 11/05/2017] [Indexed: 12/11/2022]
Abstract
Alternative splicing dramatically increases transcriptome complexity but its contribution to proteome diversity remains controversial. Exon-exon junction spanning peptides provide direct evidence for the translation of specific splice isoforms and are critical for delineating protein isoform complexity. Here we found that junction-spanning peptides are underrepresented in publicly available mass spectrometry-based shotgun proteomics data sets. Further analysis showed that evolutionarily conserved preferential nucleotide usage at exon boundaries increases the occurrence of lysine- and arginine-coding triplets at the end of exons. Because both lysine and arginine residues are cleavage sites of trypsin, the nearly exclusive use of trypsin as the protein digestion enzyme in shotgun proteomic analyses hinders the detection of junction-spanning peptides. To study the impact of enzyme selection on splice junction detectability, we performed in-silico digestion of the human proteome using six proteases. The six enzymes created a total of 161,125 detectable junctions, and only 1,029 were common across all enzyme digestions. Chymotrypsin digestion provided the largest number of detectable junctions. Our experimental results further showed that combination of a chymotrypsin-based human proteome analysis with a trypsin-based analysis increased detection of junction-spanning peptides by 37% over the trypsin-only analysis and identified over a thousand junctions that were undetectable in fully tryptic digests. Our study demonstrates that detection of proteome diversity resulting from alternative splicing is limited by trypsin cleavage specificity, and that complementary digestion schemes will be essential to comprehensively analyze the translation of alternative splicing isoforms.
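The in-silico digestion idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the cleavage rules are simplified (cut C-terminal to K/R for trypsin and to F/W/Y for chymotrypsin, never before proline), missed cleavages are ignored, the 7-30 residue length window is an assumed detectability criterion, and the function names and example sequence are hypothetical.

```python
import re

# Simplified cleavage rules (assumptions): cut C-terminal to the listed
# residues unless the next residue is proline.
RULES = {
    "trypsin": r"(?<=[KR])(?!P)",
    "chymotrypsin": r"(?<=[FWY])(?!P)",
}


def digest(protein, enzyme):
    """Return half-open (start, end) spans of fully cleaved peptides."""
    cuts = [m.start() for m in re.finditer(RULES[enzyme], protein)]
    bounds = [0] + [c for c in cuts if 0 < c < len(protein)] + [len(protein)]
    return list(zip(bounds[:-1], bounds[1:]))


def junction_detectable(protein, junction, enzyme, min_len=7, max_len=30):
    """True if some peptide covers residues on both sides of the junction.

    `junction` is the index of the first residue encoded by the downstream exon.
    The length window is an assumed proxy for mass spectrometry detectability.
    """
    for start, end in digest(protein, enzyme):
        if start < junction < end and min_len <= end - start <= max_len:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical protein: the upstream exon ends in lysine at index 7,
    # so trypsin cuts exactly at the exon-exon junction (index 8).
    protein = "MSTAVLEK" + "GDLFAPWSAER"
    for enzyme in RULES:
        print(enzyme, digest(protein, enzyme), junction_detectable(protein, 8, enzyme))
```

In this toy case the junction is invisible to trypsin because the cut falls exactly at the exon boundary, while a chymotryptic peptide spans it, which mirrors the effect the abstract describes.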
Affiliation(s)
- Xiaojing Wang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
- Simona G Codreanu
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Bo Wen
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
- Kai Li
  - BGI-Shenzhen, Shenzhen, Guangdong 518083, China
- Matthew C Chambers
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
- Daniel C Liebler
  - Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee 37232
  - Jim Ayers Institute for Precancer Detection and Diagnosis, Vanderbilt-Ingram Cancer Center, Nashville, Tennessee 37232
- Bing Zhang
  - Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
9
Savisaar R, Hurst LD. Both Maintenance and Avoidance of RNA-Binding Protein Interactions Constrain Coding Sequence Evolution. Mol Biol Evol 2017; 34:1110-1126. [PMID: 28138077 PMCID: PMC5400389 DOI: 10.1093/molbev/msx061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/07/2023]
Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
10
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
11
Pancsa R, Tompa P. Coding Regions of Intrinsic Disorder Accommodate Parallel Functions. Trends Biochem Sci 2016; 41:898-906. [DOI: 10.1016/j.tibs.2016.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/19/2016] [Revised: 08/16/2016] [Accepted: 08/19/2016] [Indexed: 02/01/2023]
12
Homma K, Noguchi T, Fukuchi S. Codon usage is less optimized in eukaryotic gene segments encoding intrinsically disordered regions than in those encoding structural domains. Nucleic Acids Res 2016; 44:10051-10061. [PMID: 27915289 PMCID: PMC5137448 DOI: 10.1093/nar/gkw899] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/18/2016] [Revised: 09/15/2016] [Accepted: 09/29/2016] [Indexed: 12/14/2022]
Abstract
Codon usage tends to be optimized in highly expressed genes. A plausible explanation for this phenomenon is that translational accuracy is increased in highly expressed genes with infrequent use of rare codons. Besides structural domains (SDs), eukaryotic proteins generally have intrinsically disordered regions (IDRs) that by themselves do not assume unique three-dimensional structures. As IDRs are free from structural constraint, they can probably accommodate more translational errors than SDs can. Thus, codon usage in IDRs is likely to be less optimized than that in SDs. Codon usage in all the genes of seven eukaryotes was examined in terms of both tRNA adaptation index and codon adaptation index. Different amino acid compositions in different protein regions were taken into account in calculating expected adaptation indices, to which observed indices were compared. Codon usage is less optimized in gene regions encoding IDRs than in those corresponding to SDs. The finding does not depend on whether IDRs are located at the N-terminus, in the middle, or at the C-terminus of proteins. Furthermore, the observation remains unchanged in two different algorithms used to predict IDRs in proteins. The result is consistent with the idea that IDRs tolerate more translational errors than SDs.
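For readers unfamiliar with the codon adaptation index (CAI) mentioned in this abstract, the sketch below shows the usual textbook calculation: each codon gets a relative adaptiveness weight (its count in a reference gene set divided by the count of the most used synonymous codon), and the CAI of a sequence is the geometric mean of those weights. It is an illustration under assumptions, not the authors' method: function names are hypothetical, codons absent from the reference are skipped instead of receiving a pseudocount, and the expected-index correction for amino acid composition described in the abstract is not implemented.

```python
import math
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split a coding sequence into complete in-frame codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]


def relative_adaptiveness(reference_cdss):
    """Weight per codon: its count over the count of the top synonymous codon."""
    counts = Counter(
        c for cds in reference_cdss for c in split_codons(cds) if c in CODON_TABLE
    )
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best.get(aa, 0), n)
    return {codon: n / best[CODON_TABLE[codon]] for codon, n in counts.items()}


def cai(cds, weights):
    """Geometric mean of weights, ignoring Met, Trp, stops and unseen codons."""
    logs = [
        math.log(weights[c])
        for c in split_codons(cds)
        if c in weights and CODON_TABLE[c] not in "MW*"
    ]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


if __name__ == "__main__":
    weights = relative_adaptiveness(["GCTGCTGCTGCC"])  # hypothetical reference
    print(cai("GCTGCC", weights))
```

Comparing regions, as the abstract describes, would mean computing such an index separately for the segments encoding disordered regions and structural domains and contrasting each with its composition-matched expectation.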
Affiliation(s)
- Keiichi Homma
- Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-machi, Maebashi-shi 371-0816, Japan
- Tamotsu Noguchi
- Pharmaceutical Education Research Center, Meiji Pharmaceutical University, 2-522-1 Noshio, Kiyose, Tokyo 204-8588, Japan
- Satoshi Fukuchi
- Department of Life Science and Informatics, Maebashi Institute of Technology, 460-1 Kamisadori-machi, Maebashi-shi 371-0816, Japan
|
13
|
DeForte S, Uversky VN. Order, Disorder, and Everything in Between. Molecules 2016; 21:1090. [PMID: 27548131 PMCID: PMC6274243 DOI: 10.3390/molecules21081090] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Received: 07/21/2016] [Revised: 08/10/2016] [Accepted: 08/11/2016] [Indexed: 02/04/2023]
Abstract
In addition to the “traditional” proteins characterized by the unique crystal-like structures needed for unique functions, it is increasingly recognized that many proteins or protein regions (collectively known as intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs)), being biologically active, do not have a specific 3D-structure in their unbound states under physiological conditions. There are also subtler categories of disorder, such as conditional (or dormant) disorder and partial disorder. Both the ability of a protein/region to fold into a well-ordered functional unit or to stay intrinsically disordered but functional are encoded in the amino acid sequence. Structurally, IDPs/IDPRs are characterized by high spatiotemporal heterogeneity and exist as dynamic structural ensembles. It is important to remember, however, that although structure and disorder are often treated as binary states, they actually sit on a structural continuum.
Affiliation(s)
- Shelly DeForte
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
- Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia.
|
14
|
Smithers B, Oates ME, Tompa P, Gough J. Three reasons protein disorder analysis makes more sense in the light of collagen. Protein Sci 2016; 25:1030-6. [PMID: 26941008 PMCID: PMC4838654 DOI: 10.1002/pro.2913] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/08/2016] [Revised: 03/01/2016] [Accepted: 03/02/2016] [Indexed: 01/17/2023]
Abstract
We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen-encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder-encoding exons, still hold after considering collagen-containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix.
Affiliation(s)
- Ben Smithers
- Department of Computer Science, University of Bristol, Bristol BS8 1UB, United Kingdom
- Matt E. Oates
- Department of Computer Science, University of Bristol, Bristol BS8 1UB, United Kingdom
- Peter Tompa
- VIB Structural Biology Research Center (SBRC), Vrije Universiteit Brussel, Brussels 1050, Belgium
- Julian Gough
- Department of Computer Science, University of Bristol, Bristol BS8 1UB, United Kingdom
|