1
Anil AT, Pandian R, Mishra SK. Introns with branchpoint-distant 3' splice sites: Splicing mechanism and regulatory roles. Biophys Chem 2024; 314:107307. [PMID: 39173313 DOI: 10.1016/j.bpc.2024.107307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2024] [Revised: 07/26/2024] [Accepted: 08/07/2024] [Indexed: 08/24/2024]
Abstract
The two transesterification reactions of pre-mRNA splicing require highly complex yet well-controlled rearrangements of small nuclear RNAs and proteins (snRNP) in the spliceosome. The efficiency and accuracy of these reactions are critical for gene expression, as almost all human genes pass through pre-mRNA splicing. Key parameters that determine the splicing outcome are the length of the intron, the strengths of its splicing signals and gaps between them, and the presence of splicing controlling elements. In particular, the gap between the branchpoint (BP) and the 3' splice site (ss) of introns is a major determinant of the splicing efficiency. This distance falls within a small range across the introns of an organism. The constraints exist possibly because BP and 3'ss are recognized by BP-binding proteins, U2 snRNP and U2 accessory factors (U2AF) in a coordinated manner. Furthermore, varying distances between the two signals may also affect the second transesterification reaction since the intervening RNA needs to be accurately positioned within the complex RNP machinery. Splicing such pre-mRNAs requires cis-acting elements in the RNA and many trans-acting splicing regulators. Regulated pre-mRNA splicing with BP-distant 3'ss adds another layer of control to gene expression and promotes alternative splicing.
Affiliation(s)
- Anupa T Anil
  - Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306, Punjab, India
- Rakesh Pandian
  - Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306, Punjab, India
- Shravan Kumar Mishra
  - Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306, Punjab, India
2
Siachisumo C, Luzzi S, Aldalaqan S, Hysenaj G, Dalgliesh C, Cheung K, Gazzara MR, Yonchev ID, James K, Kheirollahi Chadegani M, Ehrmann IE, Smith GR, Cockell SJ, Munkley J, Wilson SA, Barash Y, Elliott DJ. An anciently diverged family of RNA binding proteins maintain correct splicing of a class of ultra-long exons through cryptic splice site repression. eLife 2024; 12:RP89705. [PMID: 39356106 PMCID: PMC11446547 DOI: 10.7554/elife.89705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/03/2024]
Abstract
Previously, we showed that the germ cell-specific nuclear protein RBMXL2 represses cryptic splicing patterns during meiosis and is required for male fertility (Ehrmann et al., 2019). Here, we show that in somatic cells the similar yet ubiquitously expressed RBMX protein has similar functions. RBMX regulates a distinct class of exons that exceed the median human exon size. RBMX protein-RNA interactions are enriched within ultra-long exons, particularly within genes involved in genome stability, and repress the selection of cryptic splice sites that would compromise gene function. The RBMX gene is silenced during male meiosis due to sex chromosome inactivation. To test whether RBMXL2 might replace the function of RBMX during meiosis we induced expression of RBMXL2 and the more distantly related RBMY protein in somatic cells, finding each could rescue aberrant patterns of RNA processing caused by RBMX depletion. The C-terminal disordered domain of RBMXL2 is sufficient to rescue proper splicing control after RBMX depletion. Our data indicate that RBMX and RBMXL2 have parallel roles in somatic tissues and the germline that must have been conserved for at least 200 million years of mammalian evolution. We propose RBMX family proteins are particularly important for the splicing inclusion of some ultra-long exons with increased intrinsic susceptibility to cryptic splice site selection.
Affiliation(s)
- Chileleko Siachisumo
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Sara Luzzi
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Saad Aldalaqan
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Gerald Hysenaj
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Caroline Dalgliesh
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Kathleen Cheung
  - Bioinformatics Support Unit, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Matthew R Gazzara
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
- Ivaylo D Yonchev
  - School of Biosciences, University of Sheffield, Sheffield, United Kingdom
- Katherine James
  - School of Computing, Newcastle University, Newcastle, United Kingdom
- Ingrid E Ehrmann
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Graham R Smith
  - Bioinformatics Support Unit, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Simon J Cockell
  - Bioinformatics Support Unit, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Jennifer Munkley
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
- Stuart A Wilson
  - School of Biosciences, University of Sheffield, Sheffield, United Kingdom
- Yoseph Barash
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
- David J Elliott
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle, United Kingdom
3
Senn KA, Lipinski KA, Zeps NJ, Griffin AF, Wilkinson ME, Hoskins AA. Control of 3' splice site selection by the yeast splicing factor Fyv6. bioRxiv 2024:2024.05.04.592262. [PMID: 38746449 PMCID: PMC11092753 DOI: 10.1101/2024.05.04.592262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Pre-mRNA splicing is catalyzed in two steps: 5' splice site (SS) cleavage and exon ligation. A number of proteins transiently associate with spliceosomes to specifically impact these steps (1st and 2nd step factors). We recently identified Fyv6 (FAM192A in humans) as a 2nd step factor in S. cerevisiae; however, we did not determine how widespread Fyv6's impact is on the transcriptome. To answer this question, we have used RNA-seq to analyze changes in splicing. These results show that loss of Fyv6 results in activation of non-consensus, branch point (BP) proximal 3' SS transcriptome-wide. To identify the molecular basis of these observations, we determined a high-resolution cryo-EM structure of a yeast product complex spliceosome containing Fyv6 at 2.3 Å. The structure reveals that Fyv6 is the only 2nd step factor that contacts the Prp22 ATPase and that Fyv6 binding is mutually exclusive with that of the 1st step factor Yju2. We then use this structure to dissect Fyv6 functional domains and interpret results of a genetic screen for fyv6Δ suppressor mutations. The combined transcriptomic, structural, and genetic studies allow us to propose a model in which Yju2/Fyv6 exchange facilitates exon ligation and Fyv6 promotes usage of consensus, BP distal 3' SS.
Affiliation(s)
- Katherine A. Senn
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
- Karli A. Lipinski
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
- Natalie J. Zeps
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
- Amory F. Griffin
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
- Max E. Wilkinson
  - MRC Laboratory of Molecular Biology, Cambridge CB2 0QH UK
  - Present Addresses: Broad Institute of MIT and Harvard, Cambridge MA 02142 USA and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA 02139 USA
- Aaron A. Hoskins
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706 USA
4
Huang AC, Su JY, Hung YJ, Chiang HL, Chen YT, Huang YT, Yu CHA, Lin HN, Lin CL. SpliceAPP: an interactive web server to predict splicing errors arising from human mutations. BMC Genomics 2024; 25:600. [PMID: 38877417 PMCID: PMC11179192 DOI: 10.1186/s12864-024-10512-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 06/07/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND Splicing variants are a major class of pathogenic mutations, with their severity equivalent to nonsense mutations. However, redundant and degenerate splicing signals hinder functional assessments of sequence variations within introns, particularly at branch sites. We have established a massively parallel splicing assay to assess the impact on splicing of 11,191 disease-relevant variants. Based on the experimental results, we then applied regression-based methods to identify factors determining splicing decisions and their respective weights. RESULTS Our statistical modeling is highly sensitive, accurately annotating the splicing defects of near-exon intronic variants, outperforming state-of-the-art predictive tools. We have incorporated the algorithm and branchpoint information into a web-based tool, SpliceAPP, to provide an interactive application. This user-friendly website allows users to upload any genetic variants with genome coordinates (e.g., chr15 74,687,208 A G), and the tool will output predictions for splicing error scores and evaluate the impact on nearby splice sites. Additionally, users can query branch site information within the region of interest. CONCLUSIONS In summary, SpliceAPP represents a pioneering approach to screening pathogenic intronic variants, contributing to the development of precision medicine. It also facilitates the annotation of splicing motifs. SpliceAPP is freely accessible using the link https://bc.imb.sinica.edu.tw/SpliceAPP . Source code can be downloaded at https://github.com/hsinnan75/SpliceAPP .
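The input format quoted in the abstract (chromosome, position with thousands separators, reference allele, alternate allele) can be normalized before a batch of variants is submitted. The following is a minimal sketch in Python, not SpliceAPP's own code: the `Variant` type and the `parse_variant` function are hypothetical names introduced here, and the four-field whitespace-separated layout is an assumption based only on the example `chr15 74,687,208 A G`.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical container for one genomic substitution."""
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant(line: str) -> Variant:
    """Parse 'chrom pos ref alt'; pos may contain thousands separators.

    Assumption: fields are whitespace-separated and alleles use only A/C/G/T.
    """
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    chrom, pos_text, ref, alt = fields
    pos = int(pos_text.replace(",", ""))  # '74,687,208' becomes 74687208
    ref, alt = ref.upper(), alt.upper()
    for allele in (ref, alt):
        if not re.fullmatch(r"[ACGT]+", allele):
            raise ValueError(f"unexpected allele {allele!r} in {line!r}")
    return Variant(chrom, pos, ref, alt)


if __name__ == "__main__":
    print(parse_variant("chr15 74,687,208 A G"))
```

Stripping the separators up front keeps the position usable as an integer coordinate in downstream steps; the genome build is not stated in the example and is left to the caller.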
Affiliation(s)
- Ang-Chu Huang
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
- Jia-Ying Su
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, International Graduate Program, Academia Sinica, Taipei, Taiwan
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Yu-Jen Hung
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
- Hung-Lun Chiang
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
- Yi-Ting Chen
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
- Yen-Tsung Huang
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
  - Bioinformatics Program, International Graduate Program, Academia Sinica, Taipei, Taiwan
- Chen-Hsin Albert Yu
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
- Hsin-Nan Lin
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
- Chien-Ling Lin
  - Institute of Molecular Biology, Academia Sinica, No. 128, Sec. 2, Academia Road, Nangang District, Taipei City, 115014, Taiwan
  - Genome and Systems Biology Degree Program, Academia Sinica and National Taiwan University, Taipei, Taiwan
  - Bioinformatics Program, International Graduate Program, Academia Sinica, Taipei, Taiwan
5
Martínez-Pizarro A, Álvarez M, Dembic M, Lindegaard CA, Castro M, Richard E, Andresen BS, Desviat LR. Splice-Switching Antisense Oligonucleotides Correct Phenylalanine Hydroxylase Exon 11 Skipping Defects and Rescue Enzyme Activity in Phenylketonuria. Nucleic Acid Ther 2024; 34:134-142. [PMID: 38591802 DOI: 10.1089/nat.2024.0014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2024]
Abstract
The PAH gene encodes the hepatic enzyme phenylalanine hydroxylase (PAH), and its deficiency, known as phenylketonuria (PKU), leads to neurotoxic high levels of phenylalanine. PAH exon 11 is weakly defined, and several missense and intronic variants identified in patients affect the splicing process. Recently, we identified a novel intron 11 splicing regulatory element where U1snRNP binds, participating in exon 11 definition. In this work, we describe the implementation of an antisense strategy targeting intron 11 sequences to correct the effect of PAH mis-splicing variants. We used an in vitro assay with minigenes and identified splice-switching antisense oligonucleotides (SSOs) that correct the exon skipping defect of PAH variants c.1199+17G>A, c.1199+20G>C, c.1144T>C, and c.1066-3C>T. To examine the functional rescue induced by the SSOs, we generated a hepatoma cell model with variant c.1199+17G>A using CRISPR/Cas9. The edited cell line reproduces the exon 11 skipping pattern observed from minigenes, leading to reduced PAH protein levels and activity. SSO transfection results in an increase in exon 11 inclusion and corrects PAH deficiency. Our results provide proof of concept of the potential therapeutic use of a single SSO for different exonic and intronic splicing variants causing PAH exon 11 skipping in PKU.
Affiliation(s)
- Ainhoa Martínez-Pizarro
  - Centro de Biología Molecular Severo Ochoa UAM-CSIC, IUBM, CIBERER, IdiPaz, Universidad Autónoma de Madrid, Madrid, Spain
- Mar Álvarez
  - Centro de Biología Molecular Severo Ochoa UAM-CSIC, IUBM, CIBERER, IdiPaz, Universidad Autónoma de Madrid, Madrid, Spain
- Maja Dembic
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Caroline A Lindegaard
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Margarita Castro
  - Centro de Diagnóstico de Enfermedades Moleculares (CEDEM), CIBERER, IdiPaz, Universidad Autónoma de Madrid, Madrid, Spain
- Eva Richard
  - Centro de Biología Molecular Severo Ochoa UAM-CSIC, IUBM, CIBERER, IdiPaz, Universidad Autónoma de Madrid, Madrid, Spain
- Brage S Andresen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Lourdes R Desviat
  - Centro de Biología Molecular Severo Ochoa UAM-CSIC, IUBM, CIBERER, IdiPaz, Universidad Autónoma de Madrid, Madrid, Spain
6
Liu X, Zhang H, Zeng Y, Zhu X, Zhu L, Fu J. DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks. Genes (Basel) 2024; 15:404. [PMID: 38674339 PMCID: PMC11048956 DOI: 10.3390/genes15040404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 04/28/2024]
Abstract
The precise identification of splice sites is essential for unraveling the structure and function of genes, constituting a pivotal step in the gene annotation process. In this study, we developed a novel deep learning model, DRANetSplicer, that integrates residual learning and attention mechanisms for enhanced accuracy in capturing the intricate features of splice sites. We constructed multiple datasets using the most recent versions of genomic data from three different organisms, Oryza sativa japonica, Arabidopsis thaliana and Homo sapiens. This approach allows us to train models with a richer set of high-quality data. DRANetSplicer outperformed benchmark methods on donor and acceptor splice site datasets, achieving an average accuracy of (96.57%, 95.82%) across the three organisms. Comparative analyses with benchmark methods, including SpliceFinder, Splice2Deep, Deep Splicer, EnsembleSplice, and DNABERT, revealed DRANetSplicer's superior predictive performance, resulting in at least a (4.2%, 11.6%) relative reduction in average error rate. We utilized the DRANetSplicer model trained on O. sativa japonica data to predict splice sites in A. thaliana, achieving accuracies for donor and acceptor sites of (94.89%, 94.25%). These results indicate that DRANetSplicer possesses excellent cross-organism predictive capabilities, with its performance in cross-organism predictions even surpassing that of benchmark methods in non-cross-organism predictions. Cross-organism validation showcased DRANetSplicer's excellence in predicting splice sites across similar organisms, supporting its applicability in gene annotation for understudied organisms. We employed multiple methods to visualize the decision-making process of the model. The visualization results indicate that DRANetSplicer can learn and interpret well-known biological features, further validating its overall performance. 
Our study systematically examined and confirmed the predictive ability of DRANetSplicer from various levels and perspectives, indicating that its practical application in gene annotation is justified.
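The abstract reports accuracies and a "relative reduction in average error rate" side by side; the relation between the two can be made explicit with a short calculation. The sketch below assumes the usual definitions, error = 100% minus accuracy and relative reduction = (baseline error minus model error) / baseline error. The function names are hypothetical, and the back-calculated baseline accuracy is an illustration of the arithmetic, not a figure reported by the paper.

```python
def error_rate(accuracy_pct: float) -> float:
    """Error rate in percent, assuming error = 100 - accuracy."""
    return 100.0 - accuracy_pct


def relative_error_reduction(model_acc: float, baseline_acc: float) -> float:
    """Fraction by which the model lowers the baseline's error rate."""
    e_model = error_rate(model_acc)
    e_base = error_rate(baseline_acc)
    return (e_base - e_model) / e_base


def implied_baseline_accuracy(model_acc: float, rel_reduction: float) -> float:
    """Baseline accuracy consistent with a given relative error reduction.

    Inverts the definition above: e_base = e_model / (1 - rel_reduction).
    """
    return 100.0 - error_rate(model_acc) / (1.0 - rel_reduction)


if __name__ == "__main__":
    # Average donor and acceptor accuracies and reductions quoted in the abstract.
    for label, acc, red in (("donor", 96.57, 0.042), ("acceptor", 95.82, 0.116)):
        base = implied_baseline_accuracy(acc, red)
        print(f"{label}: model error {error_rate(acc):.2f}%, "
              f"implied best-baseline accuracy about {base:.2f}%")
```

Because the reductions are stated as "at least", the implied values are upper bounds on the accuracy of the strongest benchmark under these assumptions.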
Affiliation(s)
- Xueyan Liu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Hongyan Zhang
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Ying Zeng
  - School of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
- Xinghui Zhu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Lei Zhu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
- Jiahui Fu
  - College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
7
Bakhtiar D, Vondraskova K, Pengelly RJ, Chivers M, Kralovicova J, Vorechovsky I. Exonic splicing code and coordination of divalent metals in proteins. Nucleic Acids Res 2024; 52:1090-1106. [PMID: 38055834 PMCID: PMC10853796 DOI: 10.1093/nar/gkad1161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 11/15/2023] [Accepted: 11/17/2023] [Indexed: 12/08/2023]
Abstract
Exonic sequences contain both protein-coding and RNA splicing information but the interplay of the protein and splicing code is complex and poorly understood. Here, we have studied traditional and auxiliary splicing codes of human exons that encode residues coordinating two essential divalent metals at the opposite ends of the Irving-Williams series, a universal order of relative stabilities of metal-organic complexes. We show that exons encoding Zn2+-coordinating amino acids are supported much less by the auxiliary splicing motifs than exons coordinating Ca2+. The handicap of the former is compensated by stronger splice sites and uridine-richer polypyrimidine tracts, except for position -3 relative to 3' splice junctions. However, both Ca2+ and Zn2+ exons exhibit close-to-constitutive splicing in multiple tissues, consistent with their critical importance for metalloprotein function and a relatively small fraction of expendable, alternatively spliced exons. These results indicate that constraints imposed by metal coordination spheres on RNA splicing have been efficiently overcome by the plasticity of exon-intron architecture to ensure adequate metalloprotein expression.
Affiliation(s)
- Dara Bakhtiar
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Katarina Vondraskova
  - Slovak Academy of Sciences, Centre of Biosciences, 840 05 Bratislava, Slovak Republic
- Reuben J Pengelly
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Martin Chivers
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
- Jana Kralovicova
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
  - Slovak Academy of Sciences, Centre of Biosciences, 840 05 Bratislava, Slovak Republic
- Igor Vorechovsky
  - University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
8
Nachtigall PG, Durham AM, Rokyta DR, Junqueira-de-Azevedo ILM. ToxCodAn-Genome: an automated pipeline for toxin-gene annotation in genome assembly of venomous lineages. Gigascience 2024; 13:giad116. [PMID: 38241143 PMCID: PMC10797961 DOI: 10.1093/gigascience/giad116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 10/19/2023] [Accepted: 12/18/2023] [Indexed: 01/21/2024]
Abstract
BACKGROUND The rapid development of sequencing technologies resulted in a wide expansion of genomics studies using venomous lineages. This facilitated research focusing on understanding the evolution of adaptive traits and the search for novel compounds that can be applied in agriculture and medicine. However, the toxin annotation of genomes is a laborious and time-consuming task, and no consensus pipeline is currently available. No computational tool currently exists to address the challenges specific to toxin annotation and to ensure the reproducibility of the process. RESULTS Here, we present ToxCodAn-Genome, the first software designed to perform automated toxin annotation in genomes of venomous lineages. This pipeline was designed to retrieve the full-length coding sequences of toxins and to allow the detection of novel truncated paralogs and pseudogenes. We tested ToxCodAn-Genome using 12 genomes of venomous lineages and achieved high performance on recovering their current toxin annotations. This tool can be easily customized to allow improvements in the final toxin annotation set and can be expanded to virtually any venomous lineage. ToxCodAn-Genome is fast, allowing it to run on any personal computer, but it can also be executed in multicore mode, taking advantage of large high-performance servers. In addition, we provide a guide to direct future research in the venomics field to ensure a confident toxin annotation in the genome being studied. As a case study, we sequenced and annotated the toxin repertoire of Bothrops alternatus, which may facilitate future evolutionary and biomedical studies using vipers as models. CONCLUSIONS ToxCodAn-Genome is suitable to perform toxin annotation in the genome of venomous species and may help to improve the reproducibility of further studies. ToxCodAn-Genome and the guide are freely available at https://github.com/pedronachtigall/ToxCodAn-Genome.
Affiliation(s)
- Pedro G Nachtigall
  - Laboratório de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, 05503-900 SP, Brazil
  - Department of Biological Science, Florida State University, Tallahassee, 32306-4295 FL, USA
- Alan M Durham
  - Departamento de Ciência da Computação, Instituto de Matemática e Estatística, Universidade de São Paulo (USP), São Paulo, 05508-090 SP, Brazil
- Darin R Rokyta
  - Department of Biological Science, Florida State University, Tallahassee, 32306-4295 FL, USA
9
Eswaramoorthy V, Kandasamy T, Thiyagarajan K, Chockalingam V, Jegadeesan S, Natesan S, Adhimoolam K, Prabhakaran J, Singh R, Muthurajan R. Characterization of terminal flowering cowpea (Vigna unguiculata (L.) Walp.) mutants obtained by induced mutagenesis digs out the loss-of-function of phosphatidylethanolamine-binding protein. PLoS One 2023; 18:e0295509. [PMID: 38096151 PMCID: PMC10721064 DOI: 10.1371/journal.pone.0295509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Accepted: 11/26/2023] [Indexed: 12/17/2023]
Abstract
Cowpea (Vigna unguiculata (L.) Walp.) is one of the major food legume crops grown extensively in arid and semi-arid regions of the world. The determinate habit of cowpea has many advantages over the indeterminate and is well adapted to modern farming systems. Mutation breeding is an active research area to develop the determinate habit of cowpea. The present study aimed to develop new determinate habit mutants with terminal flowering (TFL) in locally well-adapted genetic backgrounds. Consequently, the seeds of popular cowpea cv P152 were irradiated with doses of gamma rays (200, 250, and 300 Gy), and the M1 populations were grown. The M2 populations were produced from the M1 progenies and selected determinate mutants (TFLCM-1 and TFLCM-2) from the M2 generation (200 Gy) were forwarded up to the M5 generation to characterize the mutants and simultaneously they were crossed with P152 to develop a MutMap population. In the M5 generation, determinate mutants (80-81 days) were characterized by evaluating the TFL growth habit, longer peduncles (30.75-31.45 cm), erect pods (160°-200°), number of pods per cluster (4-5 nos.), and early maturity. Further, sequencing analysis of the VuTFL1 gene in the determinate mutants and MutMap population revealed a single nucleotide transversion (A-T at 1196 bp) in the fourth exon and asparagine (N) to tyrosine (Y) amino acid change at the 143rd position of phosphatidylethanolamine-binding protein (PEBP). Notably, the loss of function PEBP with a higher confidence level modification of anti-parallel beta-sheets and destabilization of the protein secondary structure was observed in the mutant lines. Quantitative real-time PCR (qRT-PCR) analysis showed that the VuTFL1 gene was downregulated at the flowering stage in TFL mutants. Collectively, the insights garnered from this study affirm the effectiveness of induced mutation in modifying the plant's ideotype. 
The TFL mutants developed during this investigation have the potential to serve as a valuable resource for fostering determinate traits in future cowpea breeding programs and pave the way for mechanical harvesting.
Affiliation(s)
- Vijayakumar Eswaramoorthy
  - Department of Plant Breeding and Genetics, Agricultural College and Research Institute, Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu, India
- Thangaraj Kandasamy
  - Department of Plant Breeding and Genetics, Agricultural College and Research Institute, Tamil Nadu Agricultural University, Madurai, Tamil Nadu, India
- Kalaimagal Thiyagarajan
  - Department of Plant Breeding and Genetics, Agricultural College and Research Institute, Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu, India
- Vanniarajan Chockalingam
  - Department of Plant Breeding and Genetics, Agricultural College and Research Institute, Tamil Nadu Agricultural University, Madurai, Tamil Nadu, India
- Souframanien Jegadeesan
  - Nuclear Agriculture & Biotechnology Division, Bhabha Atomic Research Centre, Trombay, Mumbai, India
- Senthil Natesan
  - Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, India
- Karthikeyan Adhimoolam
  - Subtropical Horticulture Research Institute, Jeju National University, Jeju, South Korea
- Jeyakumar Prabhakaran
  - Department of Crop Physiology, Agricultural College and Research Institute, Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu, India
- Ramji Singh
  - Department of Plant Pathology, College of Agriculture, Sardar Vallabhbhai Patel University of Agricultural Sciences and Technology, Meerut, Uttar Pradesh, India
- Raveendran Muthurajan
  - Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, India
10
Xie J, Wang L, Lin RJ. Variations of intronic branchpoint motif: identification and functional implications in splicing and disease. Commun Biol 2023; 6:1142. [PMID: 37949953 PMCID: PMC10638238 DOI: 10.1038/s42003-023-05513-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 10/26/2023] [Indexed: 11/12/2023]
Abstract
The branchpoint (BP) motif is an essential intronic element for spliceosomal pre-mRNA splicing. In mammals, its sequence composition, distance to the downstream exon, and number of BPs per 3' splice site are highly variable, unlike the GT/AG dinucleotides at the intron ends. These variations appear to provide evolutionary advantages for fostering alternative splicing, satisfying more diverse cellular contexts, and promoting resilience to genetic changes, thus contributing to an extra layer of complexity for gene regulation. Importantly, variants in the BP motif itself or in genes encoding BP-interacting factors cause human genetic diseases or cancers, highlighting the critical function of BP motif and the need to precisely identify functional BPs for faithful interpretation of their roles in splicing. In this perspective, we will succinctly summarize the major findings related to BP motif variations, discuss the relevant issues/challenges, and provide our insights.
Affiliation(s)
- Jiuyong Xie
  - Department of Physiology & Pathophysiology, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, R3E 0J9, Canada
- Lili Wang
  - Department of Systems Biology, Beckman Research Institute, City of Hope National Medical Center, Duarte, CA, 91010, USA
- Ren-Jang Lin
  - Center for RNA Biology & Therapeutics, Beckman Research Institute, City of Hope National Medical Center, Duarte, CA, 91010, USA
11
Rodríguez-Hidalgo M, de Bruijn SE, Corradi Z, Rodenburg K, Lara-López A, Valverde-Megías A, Ávila-Fernández A, Fernandez-Caballero L, Del Pozo-Valero M, Corominas J, Gilissen C, Irigoyen C, Cremers FPM, Ayuso C, Ruiz-Ederra J, Roosing S. ABCA4 c.6480-35A>G, a novel branchpoint variant associated with Stargardt disease. Front Genet 2023; 14:1234032. [PMID: 37779911 PMCID: PMC10539688 DOI: 10.3389/fgene.2023.1234032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Accepted: 08/15/2023] [Indexed: 10/03/2023]
Abstract
Introduction: Inherited retinal dystrophies (IRDs) can be caused by variants in more than 280 genes. The ATP-binding cassette transporter type A4 (ABCA4) gene is one of these and has been linked to Stargardt disease type 1 (STGD1), fundus flavimaculatus, cone-rod dystrophy (CRD), and pan-retinal CRD. Approximately 25% of the reported ABCA4 variants affect RNA splicing. In most cases, a functional assay is necessary to determine the effect of these variants. Methods: Whole genome sequencing (WGS) was performed in one Spanish proband with Stargardt disease. The putative pathogenicity of c.6480-35A>G on splicing was investigated both in silico and in vitro. The in silico approach was based on the deep-learning tool SpliceAI. For the in vitro approach we used a midigene splice assay in HEK293T cells, based on a previously established wild-type midigene (BA29) containing ABCA4 exons 46 to 48. Results: Through the analysis of WGS data, we identified two candidate variants in ABCA4 in one proband: a previously described deletion, c.699_768+342del (p.(Gln234Phefs*5)), and a novel branchpoint variant, c.6480-35A>G. Segregation analysis confirmed that the variants were in trans. For the branchpoint variant, SpliceAI predicted an acceptor gain with a high score (0.47) at position c.6480-47. A midigene splice assay in HEK293T cells revealed the inclusion of the last 47 nucleotides of intron 47, creating a premature stop codon, and allowed the variant to be categorized as moderately severe. Subsequent analysis revealed the presence of this variant as a second allele, besides c.1958G>A p.(Arg653His), in an additional Spanish proband in a large cohort of IRD cases. Conclusion: A splice-altering effect of the branchpoint variant, confirmed by the midigene splice assay, along with the identification of this variant in a second unrelated individual affected with STGD, provides sufficient evidence to classify the variant as likely pathogenic. In addition, this research highlights the importance of studying non-coding regions and performing functional assays to provide a conclusive molecular diagnosis.
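The reading-frame consequence of the 47-nucleotide inclusion reported in this abstract can be illustrated with simple arithmetic. The sketch below is not taken from the study: the function name is hypothetical, and it only tests whether the retained length is a multiple of three. An in-frame insertion can still carry a stop codon, so this kind of check complements a functional assay rather than replacing it.

```python
def retained_intron_effect(inserted_nt: int) -> str:
    """Classify an intronic inclusion by its effect on the reading frame.

    Hypothetical helper for illustration; it looks only at length,
    not at the inserted sequence itself.
    """
    if inserted_nt <= 0:
        raise ValueError("inserted_nt must be a positive number of nucleotides")
    remainder = inserted_nt % 3
    if remainder == 0:
        return "in-frame"
    return f"frameshift (+{remainder})"


# 47 retained nucleotides, as reported for c.6480-35A>G in the abstract above.
effect = retained_intron_effect(47)
print(effect)
```

Because 47 is not divisible by three, the downstream codons are read out of frame, which is consistent with the premature stop codon described in the abstract.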
Affiliation(s)
- María Rodríguez-Hidalgo
- Department of Neuroscience, Biodonostia Health Research Institute, Donostia-San Sebastián, Spain
- Department of Genetic, Physical Anthropology and Animal Physiology, University of the Basque Country UPV/EHU, Leioa, Spain
| | - Suzanne E. de Bruijn
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| | - Zelia Corradi
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| | - Kim Rodenburg
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| | | | | | - Almudena Ávila-Fernández
- Department of Genetics, Health Research Institute-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
| | - Lidia Fernandez-Caballero
- Department of Genetics, Health Research Institute-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
| | - Marta Del Pozo-Valero
- Department of Genetics, Health Research Institute-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
| | - Jordi Corominas
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Radboud Institute of Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Christian Gilissen
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
- Radboud Institute of Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
| | - Cristina Irigoyen
- Department of Neuroscience, Biodonostia Health Research Institute, Donostia-San Sebastián, Spain
- Ophthalmology Service, Donostia University Hospital, Donostia-San Sebastián, Spain
| | - Frans P. M. Cremers
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| | - Carmen Ayuso
- Department of Genetics, Health Research Institute-Fundación Jiménez Díaz University Hospital, Universidad Autónoma de Madrid (IIS-FJD, UAM), Madrid, Spain
- Center for Biomedical Network Research on Rare Diseases (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
| | - Javier Ruiz-Ederra
- Department of Neuroscience, Biodonostia Health Research Institute, Donostia-San Sebastián, Spain
- Department of Ophthalmology, University of the Basque Country (UPV/EHU), San Sebastián, Spain
| | - Susanne Roosing
- Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
| |
|
12
|
Hort Y, Sullivan P, Wedd L, Fowles L, Stevanovski I, Deveson I, Simons C, Mallett A, Patel C, Furlong T, Cowley MJ, Shine J, Mallawaarachchi A. Atypical splicing variants in PKD1 explain most undiagnosed typical familial ADPKD. NPJ Genom Med 2023; 8:16. [PMID: 37419908 DOI: 10.1038/s41525-023-00362-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 06/26/2023] [Indexed: 07/09/2023] Open
Abstract
Autosomal dominant polycystic kidney disease (ADPKD) is the most common monogenic cause of kidney failure and is primarily associated with PKD1 or PKD2. Approximately 10% of patients remain undiagnosed after standard genetic testing. We aimed to utilise short- and long-read genome sequencing and RNA studies to investigate undiagnosed families. Patients with a typical ADPKD phenotype who remained undiagnosed after genetic diagnostics were recruited. Probands underwent short-read genome sequencing, PKD1 and PKD2 coding and non-coding analyses, and then genome-wide analysis. Targeted RNA studies investigated variants suspected to impact splicing. Those still undiagnosed then underwent Oxford Nanopore Technologies long-read genome sequencing. From over 172 probands, 9 met inclusion criteria and consented. A genetic diagnosis was made in 8 of 9 (89%) families undiagnosed on prior genetic testing. Six had variants impacting splicing, five in non-coding regions of PKD1. Short-read genome sequencing identified novel branchpoint, AG-exclusion zone and missense variants generating cryptic splice sites, and a deletion causing critical intron shortening. Long-read sequencing confirmed the diagnosis in one family. Most undiagnosed families with typical ADPKD have splice-impacting variants in PKD1. We describe a pragmatic method for diagnostic laboratories to assess PKD1 and PKD2 non-coding regions and validate suspected splicing variants through targeted RNA studies.
Affiliation(s)
- Yvonne Hort
- Molecular Genetics of Inherited Kidney Disorders Laboratory, Garvan Institute of Medical Research, Sydney, Australia
| | - Patricia Sullivan
- Children's Cancer Institute, Lowy Cancer Centre, UNSW Sydney, Kensington, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Kensington, NSW, Australia
| | - Laura Wedd
- Molecular Genetics of Inherited Kidney Disorders Laboratory, Garvan Institute of Medical Research, Sydney, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
| | - Lindsay Fowles
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Igor Stevanovski
- Genomic Technologies, Garvan Institute of Medical Research, Sydney, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, Australia
| | - Ira Deveson
- Genomic Technologies, Garvan Institute of Medical Research, Sydney, Australia
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Sydney, Australia
| | - Cas Simons
- Centre for Population Genomics, Garvan Institute of Medical Research and UNSW Sydney, Sydney, NSW, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, VIC, Australia
| | - Andrew Mallett
- Department of Renal Medicine, Townsville University Hospital, Townsville, QLD, Australia
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- College of Medicine and Dentistry, James Cook University, Townsville, QLD, Australia
| | - Chirag Patel
- Genetic Health Queensland, Royal Brisbane and Women's Hospital, Herston, QLD, Australia
| | - Timothy Furlong
- Molecular Genetics of Inherited Kidney Disorders Laboratory, Garvan Institute of Medical Research, Sydney, Australia
| | - Mark J Cowley
- Children's Cancer Institute, Lowy Cancer Centre, UNSW Sydney, Kensington, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Kensington, NSW, Australia
| | - John Shine
- Molecular Genetics of Inherited Kidney Disorders Laboratory, Garvan Institute of Medical Research, Sydney, Australia
| | - Amali Mallawaarachchi
- Molecular Genetics of Inherited Kidney Disorders Laboratory, Garvan Institute of Medical Research, Sydney, Australia.
- Clinical Genetics Service, Institute of Precision Medicine and Bioinformatics, Royal Prince Alfred Hospital, Sydney, Australia.
| |
|
13
|
Zabardast A, Tamer EG, Son YA, Yılmaz A. An automated framework for evaluation of deep learning models for splice site predictions. Sci Rep 2023; 13:10221. [PMID: 37353532 PMCID: PMC10290104 DOI: 10.1038/s41598-023-34795-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 05/08/2023] [Indexed: 06/25/2023] Open
Abstract
A novel framework for the automated evaluation of various deep learning-based splice site detectors is presented. The framework eliminates time-consuming development and experimentation across different codebases, architectures, and configurations to obtain the best models for a given RNA splice site dataset. RNA splicing is a cellular process in which pre-mRNAs are processed into mature mRNAs and which is used to produce multiple mRNA transcripts from a single gene sequence. With the advancement of sequencing technologies, many splice site variants have been identified and associated with diseases. RNA splice site prediction is therefore essential for gene finding, genome annotation, detection of disease-causing variants, and identification of potential biomarkers. Recently, deep learning models have classified genomic signals with high accuracy. Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and its bidirectional version (BLSTM), and Gated Recurrent Unit (GRU) and its bidirectional version (BGRU) are promising models. In genomic data analysis, CNN's locality feature helps where each nucleotide correlates with other bases in its vicinity. In contrast, BLSTM can be trained bidirectionally, allowing sequential data to be processed in both forward and reverse directions. Therefore, it can process 1-D encoded genomic data effectively. Even though both methods have been used in the literature, a performance comparison was missing. To compare selected models under similar conditions, we have created a blueprint for a series of networks with five different levels. As a case study, we compared CNN and BLSTM models' learning capabilities as building blocks for RNA splice site prediction in two different datasets. Overall, CNN performed better with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement) in human splice site prediction. Likewise, superior performance with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement) is achieved in C. elegans splice site prediction. Overall, our results showed that CNN learns faster than BLSTM and BGRU. Moreover, CNN performs better at extracting sequence patterns than BLSTM and BGRU. To our knowledge, no other framework has been developed explicitly for evaluating splice detection models to decide the best possible model in an automated manner. The proposed framework and blueprint would therefore help in selecting among different deep learning models, such as CNN vs. BLSTM and BGRU, for splice site analysis or similar classification tasks in different problems.
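Of the three metrics named in this abstract, AUC-PR is the least self-explanatory, so a minimal sketch of one common way to compute it (average precision over a ranked list) may help. This is an illustration on toy data and not the framework's own evaluation code. The function name, labels, and scores are all assumptions, and the numeric values elided in the abstract are deliberately not reconstructed.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    Examples are ranked by descending score; precision is sampled at the
    rank of each true positive and averaged. Tied scores are not treated
    specially in this sketch.
    """
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / positives


# Toy labels (1 = true splice site) and hypothetical model scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
ap = average_precision(labels, scores)
print(f"AUC-PR (average precision): {ap:.3f}")
```

AUC-PR is often preferred over accuracy for splice site detection because true sites are rare relative to decoy positions, and the metric ignores the large pool of true negatives.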
Affiliation(s)
- Amin Zabardast
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Elif Güney Tamer
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Yeşim Aydın Son
- Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
| | - Arif Yılmaz
- Institute of Data Science, Maastricht University, Maastricht, The Netherlands.
| |
|
14
|
Sullivan PJ, Gayevskiy V, Davis RL, Wong M, Mayoh C, Mallawaarachchi A, Hort Y, McCabe MJ, Beecroft S, Jackson MR, Arts P, Dubowsky A, Laing N, Dinger ME, Scott HS, Oates E, Pinese M, Cowley MJ. Introme accurately predicts the impact of coding and noncoding variants on gene splicing, with clinical applications. Genome Biol 2023; 24:118. [PMID: 37198692 PMCID: PMC10190034 DOI: 10.1186/s13059-023-02936-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 04/10/2023] [Indexed: 05/19/2023] Open
Abstract
Predicting the impact of coding and noncoding variants on splicing is challenging, particularly in non-canonical splice sites, leading to missed diagnoses in patients. Existing splice prediction tools are complementary but knowing which to use for each splicing context remains difficult. Here, we describe Introme, which uses machine learning to integrate predictions from several splice detection tools, additional splicing rules, and gene architecture features to comprehensively evaluate the likelihood of a variant impacting splicing. Through extensive benchmarking across 21,000 splice-altering variants, Introme outperformed all tools (auPRC: 0.98) for the detection of clinically significant splice variants. Introme is available at https://github.com/CCICB/introme .
Affiliation(s)
- Patricia J Sullivan
- Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia
- University of New South Wales Centre for Childhood Cancer Research, UNSW Sydney, Sydney, NSW, Australia
| | - Velimir Gayevskiy
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, Australia
| | - Ryan L Davis
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, Australia
- Department of Neurogenetics, Kolling Institute, St. Leonards, NSW, Australia
- Sydney Medical School-Northern, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, Australia
| | - Marie Wong
- Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia
| | - Chelsea Mayoh
- Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia
| | - Amali Mallawaarachchi
- Division of Genomics and Epigenetics, Garvan Institute of Medical Research, Sydney, Australia
- Clinical Genetics Unit, Institute of Precision Medicine and Bioinformatics, Sydney Local Health District, Sydney, Australia
| | - Yvonne Hort
- Division of Genomics and Epigenetics, Garvan Institute of Medical Research, Sydney, Australia
| | - Mark J McCabe
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, Australia
| | - Sarah Beecroft
- Centre for Medical Research, University of Western Australia, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, WA, Australia
| | - Matilda R Jackson
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An Alliance Between SA Pathology and the University of South Australia, Adelaide, Australia
- Australian Genomics, Parkville, VIC, Australia
| | - Peer Arts
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An Alliance Between SA Pathology and the University of South Australia, Adelaide, Australia
| | - Andrew Dubowsky
- Department of Genetics and Molecular Pathology, SA Pathology, Adelaide, Australia
| | - Nigel Laing
- Centre for Medical Research, University of Western Australia, Harry Perkins Institute of Medical Research, QEII Medical Centre, Nedlands, WA, Australia
| | - Marcel E Dinger
- Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, Australia
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
| | - Hamish S Scott
- Department of Genetics and Molecular Pathology, Centre for Cancer Biology, An Alliance Between SA Pathology and the University of South Australia, Adelaide, Australia
- Australian Genomics, Parkville, VIC, Australia
- School of Medicine, University of Adelaide, Adelaide, SA, Australia
- ACRF Cancer Genomics Facility, Centre for Cancer Biology, An Alliance Between SA Pathology and the University of South Australia, Adelaide, SA, Australia
| | - Emily Oates
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, Australia
| | - Mark Pinese
- Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia
| | - Mark J Cowley
- Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia.
- School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia.
| |
|
15
|
Sapkota M, Pereira L, Wang Y, Zhang L, Topcu Y, Tieman D, van der Knaap E. Structural variation underlies functional diversity at methyl salicylate loci in tomato. PLoS Genet 2023; 19:e1010751. [PMID: 37141297 PMCID: PMC10187894 DOI: 10.1371/journal.pgen.1010751] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/16/2022] [Revised: 05/16/2023] [Accepted: 04/19/2023] [Indexed: 05/06/2023] Open
Abstract
Methyl salicylate is an important inter- and intra-plant signaling molecule, but is deemed undesirable by humans when it accumulates to high levels in ripe fruits. Balancing the tradeoff between consumer satisfaction and overall plant health is challenging as the mechanisms regulating volatile levels have not yet been fully elucidated. In this study, we investigated the accumulation of methyl salicylate in ripe fruits of tomatoes that belong to the red-fruited clade. We determine the genetic diversity and the interaction of four known loci controlling methyl salicylate levels in ripe fruits. In addition to Non-Smoky Glucosyl Transferase 1 (NSGT1), we uncovered extensive genome structural variation (SV) at the Methylesterase (MES) locus. This locus contains four tandemly duplicated Methylesterase genes and genome sequence investigations at the locus identified nine distinct haplotypes. Based on gene expression and results from biparental crosses, functional and non-functional haplotypes for MES were identified. The combination of the non-functional MES haplotype 2 and the non-functional NSGT1 haplotype IV or V in a GWAS panel showed high methyl salicylate levels in ripe fruits, particularly in accessions from Ecuador, demonstrating a strong interaction between these two loci and suggesting an ecological advantage. The genetic variation at the other two known loci, Salicylic Acid Methyl Transferase 1 (SAMT1) and tomato UDP Glycosyl Transferase 5 (SlUGT5), did not explain volatile variation in the red-fruited tomato germplasm, suggesting a minor role in methyl salicylate production in red-fruited tomato. Lastly, we found that most heirloom and modern tomato accessions carried a functional MES and a non-functional NSGT1 haplotype, ensuring acceptable levels of methyl salicylate in fruits. Yet, future selection of the functional NSGT1 allele could potentially improve flavor in the modern germplasm.
Affiliation(s)
- Manoj Sapkota
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| | - Lara Pereira
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| | - Yanbing Wang
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| | - Lei Zhang
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| | - Yasin Topcu
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| | - Denise Tieman
- Horticultural Sciences, University of Florida, Gainesville, Florida, United States of America
| | - Esther van der Knaap
- Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, United States of America
| |
|
16
|
Wang X, Liu Y, Li J, Wang G. StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy. BMC Bioinformatics 2022; 23:563. [PMID: 36575368 PMCID: PMC9793644 DOI: 10.1186/s12859-022-05118-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 12/20/2022] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: CircRNAs are essential for the regulation of post-transcriptional gene expression, including as miRNA sponges, and play an important role in disease development. Several computational tools have recently been proposed to predict circRNAs, but because each uses only one classifier, there is still considerable room to improve performance. RESULTS: We propose StackCirRNAPred, a stacking-based method for the computational classification of long circRNAs from other lncRNAs. To cope with the potential problem that a single feature might not distinguish circRNAs well from other lncRNAs, we first extracted features from different sources, including nucleic acid composition, sequence spatial features and physicochemical properties, Alu and tandem repeats. We then applied the stacking strategy to integrate the more advantageous classifiers RF, LightGBM, and XGBoost. This allows the model to incorporate these features more flexibly. StackCirRNAPred was found to be significantly better than other tools, with precision, accuracy, F1, recall and MCC of 0.843, 0.833, 0.831, 0.819 and 0.666, respectively. We also tested it directly on the mouse dataset, where StackCirRNAPred was still significantly better than other methods, with precision, accuracy, F1, recall and MCC of 0.837, 0.839, 0.839, 0.841 and 0.677, respectively. CONCLUSIONS: We proposed StackCirRNAPred, based on a stacking strategy, to distinguish long circRNAs from other lncRNAs. With the test results demonstrating the validity and robustness of StackCirRNAPred, we hope it will complement existing circRNA prediction methods and be helpful in downstream research.
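The metrics quoted in this abstract are related to one another, and a short sketch can make that concrete. F1 is the harmonic mean of precision and recall, so the reported F1 values can be cross-checked from the reported precision and recall. The MCC function is included for reference only: the confusion-matrix counts passed to it below are hypothetical, since the abstract does not give them.

```python
import math


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Precision and recall as reported in the abstract above.
human_f1 = round(f1_from_precision_recall(0.843, 0.819), 3)
mouse_f1 = round(f1_from_precision_recall(0.837, 0.841), 3)
print(human_f1, mouse_f1)

# Hypothetical counts, used only to show how MCC is computed.
example_mcc = mcc(tp=8, tn=8, fp=2, fn=2)
print(example_mcc)
```

The recomputed F1 values agree with the 0.831 (human) and 0.839 (mouse) figures in the abstract. MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside F1 for binary classifiers.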
Affiliation(s)
- Xin Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Yadong Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Jie Li
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| | - Guohua Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
| |
|
17
|
Maddi AMA, Kavousi K, Arabfard M, Ohadi H, Ohadi M. Tandem repeats ubiquitously flank and contribute to translation initiation sites. BMC Genom Data 2022; 23:59. [PMID: 35896982 PMCID: PMC9331589 DOI: 10.1186/s12863-022-01075-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/04/2022] [Accepted: 07/18/2022] [Indexed: 12/31/2022] Open
Abstract
Background: While the evolutionary divergence of cis-regulatory sequences impacts translation initiation sites (TISs), the implication of tandem repeats (TRs) in TIS selection remains largely elusive. Here, we employed the TIS homology concept to study a possible link between TRs of all core lengths and repeats with TISs. Methods: Human, as the reference sequence, and 83 other species were selected, and data were extracted on all protein-coding genes (n = 1,611,368) and transcripts (n = 2,730,515) annotated for those species in Ensembl 102. Following TIS identification, two different weighing vectors were employed to assign TIS homology, and the co-occurrence pattern of TISs with the upstream flanking TRs was studied in the selected species. The results were assessed in 10-fold cross-validation. Results: On average, every TIS was flanked by 1.19 TRs of various categories within its 120 bp upstream sequence, per species. We detected statistically significant enrichment of non-homologous human TISs co-occurring with human-specific TRs. In contrast, homologous human TISs co-occurred significantly with non-human-specific TRs. A total of 2991 human genes had at least one transcript whose TIS was flanked by a human-specific TR. Text mining of a number of the identified genes, such as CACNA1A, EIF5AL1, FOXK1, GABRB2, MYH2, SLC6A8, and TTN, yielded predominant expression and functions in the human brain and/or skeletal muscle. Conclusion: We conclude that TRs ubiquitously flank and contribute to TIS selection at the trans-species level. Future functional analyses, such as a combination of genome editing strategies and in vitro protein synthesis, may be employed to further investigate the impact of TRs on TIS selection. Supplementary Information: The online version contains supplementary material available at 10.1186/s12863-022-01075-5.
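The idea of scanning the 120 bp upstream of a TIS for tandem repeats can be sketched with a regular expression. This is a simplified illustration and not the pipeline used in the study. The function name, the minimum copy number of three, and the maximum core length of six are assumptions chosen for the example (the abstract refers to TRs of all core lengths), and the input is assumed to be uppercase DNA.

```python
import re
from typing import List, Tuple


def upstream_tandem_repeats(
    seq: str,
    tis_index: int,
    window: int = 120,
    max_core: int = 6,
    min_copies: int = 3,
) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats in the window upstream of a TIS.

    Returns (start position in seq, core motif, copy number) tuples.
    The lazy core quantifier makes the regex prefer the shortest core.
    """
    start = max(0, tis_index - window)
    upstream = seq[start:tis_index]
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_core, min_copies - 1))
    repeats = []
    for match in pattern.finditer(upstream):
        core = match.group(1)
        copies = len(match.group(0)) // len(core)
        repeats.append((start + match.start(), core, copies))
    return repeats


# Toy transcript: a (CAG)4 repeat sits upstream of the ATG start codon.
example = "GGCT" + "CAG" * 4 + "TTGACC" + "ATG" + "GCC"
tis = example.index("ATG")
hits = upstream_tandem_repeats(example, tis)
print(hits)
```

Counting such hits per TIS and averaging over transcripts would give a figure comparable in spirit to the per-TIS TR count in the abstract, although the study's own TR definitions may differ from the ones assumed here.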
|
18
|
Mechanism and modeling of human disease-associated near-exon intronic variants that perturb RNA splicing. Nat Struct Mol Biol 2022; 29:1043-1055. [PMID: 36303034 DOI: 10.1038/s41594-022-00844-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/06/2021] [Accepted: 08/23/2022] [Indexed: 12/24/2022]
Abstract
It is estimated that 10%-30% of disease-associated genetic variants affect splicing. Splicing variants may generate deleteriously altered gene products and are potential therapeutic targets. However, systematic diagnosis or prediction of splicing variants is yet to be established, especially for the near-exon intronic splice region. The major challenge lies in the redundant and ill-defined branch sites and other splicing motifs therein. Here, we carried out unbiased massively parallel splicing assays on 5,307 disease-associated variants that overlapped with branch sites and collected 5,884 variants across the 5' splice region. We found that strong splice sites and exonic features preserve splicing from intronic sequence variation. Whereas the splice-altering mechanism of the 3' intronic variants is complex, that of the 5' variants is mainly splice-site destruction. Statistical learning combined with these molecular features allows precise prediction of altered splicing from an intronic variant. This statistical model provides the identity and ranking of biological features that determine splicing, which serves as transferable knowledge, and it outperforms the benchmarking predictive tool. Moreover, we demonstrated that intronic splicing variants may associate with disease risks in the human population. Our study elucidates the mechanism of the splicing response to intronic variants, which helps classify disease-associated splicing variants for the promise of precision medicine.
|
19
|
Bryen SJ, Yuen M, Joshi H, Dawes R, Zhang K, Lu JK, Jones KJ, Liang C, Wong WK, Peduto AJ, Waddell LB, Evesson FJ, Cooper ST. Prevalence, parameters, and pathogenic mechanisms for splice-altering acceptor variants that disrupt the AG exclusion zone. HGG ADVANCES 2022; 3:100125. [PMID: 35847480 PMCID: PMC9284458 DOI: 10.1016/j.xhgg.2022.100125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 06/19/2022] [Indexed: 10/26/2022] Open
Abstract
Predicting the pathogenicity of acceptor splice-site variants outside the essential AG is challenging, due to the high sequence diversity of the extended splice-site region. Critical analysis of 24,445 intronic extended acceptor splice-site variants reported in ClinVar and the Leiden Open Variation Database (LOVD) demonstrates that 41.9% of pathogenic variants create an AG dinucleotide between the predicted branchpoint and acceptor (AG-creating variants in the AG exclusion zone), 28.4% result in loss of a pyrimidine at the -3 position, and 15.1% result in loss of one or more pyrimidines in the polypyrimidine tract. Pathogenicity of AG-creating variants was highly influenced by their position. We define a high-risk zone for pathogenicity: >6 nucleotides downstream of the predicted branchpoint and >5 nucleotides upstream from the acceptor, where 93.1% of pathogenic AG-creating variants arise and where naturally occurring AG dinucleotides are concordantly depleted (5.8% of natural AGs). SpliceAI effectively predicts pathogenicity of AG-creating variants, achieving 95% sensitivity and 69% specificity. We highlight clinical examples showing contrasting mechanisms for mis-splicing arising from AG variants: (1) cryptic acceptor created; (2) splicing silencer created: an introduced AG silences the acceptor, resulting in exon skipping, intron retention, and/or use of an alternative existing cryptic acceptor; and (3) splicing silencer disrupted: loss of a deep intronic AG activates inclusion of a pseudo-exon. In conclusion, we establish AG-creating variants as a common class of pathogenic extended acceptor variant and outline factors conferring critical risk for mis-splicing for AG-creating variants in the AG exclusion zone, between the branchpoint and acceptor.
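The high-risk zone defined in this abstract (more than 6 nucleotides downstream of the predicted branchpoint and more than 5 nucleotides upstream of the acceptor) lends itself to a simple positional check. The sketch below is an illustration under an assumed coordinate convention, not the authors' code. Positions are given as negative intronic offsets relative to the acceptor, in the style of c.N-35, and the function name and the exact way distances are counted are assumptions.

```python
def in_high_risk_zone(bp_offset: int, ag_offset: int) -> bool:
    """Check whether a created AG lies in the high-risk part of the AG exclusion zone.

    Both arguments are negative intronic offsets relative to the acceptor
    (assumed convention, e.g. -30 for a branchpoint 30 nt upstream).
    Thresholds follow the abstract: >6 nt downstream of the branchpoint
    and >5 nt upstream of the acceptor.
    """
    if not (bp_offset < ag_offset < 0):
        return False  # the AG is not between the branchpoint and the acceptor
    downstream_of_bp = ag_offset - bp_offset
    upstream_of_acceptor = -ag_offset
    return downstream_of_bp > 6 and upstream_of_acceptor > 5


# Hypothetical intron with a predicted branchpoint at -30.
for ag in (-26, -24, -23, -15, -4):
    print(ag, in_high_risk_zone(bp_offset=-30, ag_offset=ag))
```

With a branchpoint at -30, AGs created at -23 or -15 fall inside the zone, while those at -26 or -24 are too close to the branchpoint and one at -4 is too close to the acceptor. A real workflow would take the branchpoint position from a predictor, which adds its own uncertainty.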
Affiliation(s)
- Samantha J. Bryen
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
- Functional Neuromics, Children’s Medical Research Institute, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Michaela Yuen
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Himanshu Joshi
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Functional Neuromics, Children’s Medical Research Institute, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Ruebena Dawes
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Katharine Zhang
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Functional Neuromics, Children’s Medical Research Institute, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Jessica K. Lu
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Kristi J. Jones
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
- Department of Clinical Genetics, Children’s Hospital at Westmead, Westmead, NSW 2145, Australia
| | - Christina Liang
- Department of Neurology, Royal North Shore Hospital, St Leonards, NSW 2065, Australia
- Department of Neurogenetics, Northern Clinical School, Kolling Institute, University of Sydney, NSW 2065, Australia
| | - Wui-Kwan Wong
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Anthony J. Peduto
- Department of Radiology, Westmead Hospital, Western Clinical School, University of Sydney, Westmead, NSW 2145, Australia
| | - Leigh B. Waddell
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Frances J. Evesson
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Functional Neuromics, Children’s Medical Research Institute, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| | - Sandra T. Cooper
- Kids Neuroscience Centre, Kids Research, The Children’s Hospital at Westmead, Locked Bag 4001, Westmead, NSW 2145, Australia
- Discipline of Child and Adolescent Health, Faculty of Medicine and Health, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
- Functional Neuromics, Children’s Medical Research Institute, The University of Sydney, Locked Bag 4001, Westmead, NSW 2145, Australia
| |
Collapse
|
20
|
Trabuco Amaral D, Mitani Y, Aparecida Silva Bonatelli I, Cerri R, Ohmiya Y, Viviani V. Genome analysis of Phrixothrix hirtus (Phengodidae) railroad worm shows the expansion of odorant-binding gene families and positive selection on morphogenesis and sex determination genes. Gene X 2022; 850:146917. [PMID: 36174905 DOI: 10.1016/j.gene.2022.146917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 09/14/2022] [Accepted: 09/21/2022] [Indexed: 10/14/2022]
Abstract
Among bioluminescent beetles of the Elateroidea superfamily, Phengodidae is the third largest family, with 244 bioluminescent species distributed only in the Americas, but is still the least studied from the phylogenetic and evolutionary points of view. The railroad worm Phrixothrix hirtus is an essential biological model and symbolic species due to its bicolor bioluminescence, being the only organism that produces true red light among bioluminescent terrestrial species. Here, we performed partial genome assembly of P. hirtus, combining short and long reads generated with Illumina sequencing, providing the first source of genomic information and a framework for comparative analyses of the bioluminescent system in Elateroidea. This is the largest genome described in the Elateroidea superfamily, with an estimated size of ∼3.4 Gb, displaying 32 % GC content, and 67 % transposable elements. Comparative genomic analyses showed a positive selection of genes and gene family expansion events of growths and morphogenesis gene products, which could be associated with the atypical anatomical development and morphogenesis found in paedomorphic females and underdeveloped males. We also observed gene family expansion among distinct odorant-binding receptors, which could be associated with the pheromone communication system typical of these beetles, and retrotransposable elements. Common genes putatively regulating bioluminescence production and control, including two luciferase genes corresponding to lateral lanterns green-emitting and head lanterns red-emitting luciferases with 7 exons and 6 introns, and genes potentially involved in luciferin biosynthesis were found, indicating that there are no clear differences about the presence or absence of gene families associated with bioluminescence in Elateroidea.
Affiliation(s)
- Danilo Trabuco Amaral
- Programa de Pós-Graduação em Biotecnociência, Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
- Yasuo Mitani
- Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Sapporo, Japan
- Ricardo Cerri
- Department of Computational Science, Universidade Federal de São Carlos (UFSCar), São Carlos, Brazil
- Yoshihiro Ohmiya
- Biomedical Research Institute, AIST, Ikeda-Osaka, Japan; Osaka Institute of Technology, OIT, Osaka, Japan
- Vadim Viviani
- Graduate Program of Evolutive Genetics and Molecular Biology, Federal University of São Carlos (UFSCar), São Carlos, Brazil; Graduate Program of Biotechnology and Environmental Monitoring, Federal University of São Carlos (UFSCar), Sorocaba, Brazil

21
Anil AT, Choudhary K, Pandian R, Gupta P, Thakran P, Singh A, Sharma M, Mishra SK. Splicing of branchpoint-distant exons is promoted by Cactin, Tls1 and the ubiquitin-fold-activated Sde2. Nucleic Acids Res 2022; 50:10000-10014. [PMID: 36095128 PMCID: PMC9508853 DOI: 10.1093/nar/gkac769] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/21/2022] [Revised: 08/22/2022] [Accepted: 08/27/2022] [Indexed: 11/13/2022]
Abstract
Intron diversity facilitates regulated gene expression and alternative splicing. Spliceosomes excise introns after recognizing their splicing signals: the 5'-splice site (5'ss), branchpoint (BP) and 3'-splice site (3'ss). The latter two signals are recognized by U2 small nuclear ribonucleoprotein (snRNP) and its accessory factors (U2AFs), but longer spacings between them result in weaker splicing. Here, we show that excision of introns with a BP-distant 3'ss (e.g. rap1 intron 2) requires the ubiquitin-fold-activated splicing regulator Sde2 in Schizosaccharomyces pombe. By monitoring splicing-specific ura4 reporters in a collection of S. pombe mutants, Cay1 and Tls1 were identified as additional regulators of this process. The role of Sde2, Cay1 and Tls1 was further confirmed by increasing BP-3'ss spacings in a canonical tho5 intron. We also examined BP-distant exons spliced independently of these factors and observed that RNA secondary structures possibly bridged the gap between the two signals. These proteins may guide the 3'ss towards the spliceosome's catalytic centre by folding the RNA between the BP and 3'ss. Orthologues of Sde2, Cay1 and Tls1, although missing in the intron-poor Saccharomyces cerevisiae, are present in intron-rich eukaryotes, including humans. This type of intron-specific pre-mRNA splicing appears to have evolved for regulated gene expression and alternative splicing of key heterochromatin factors.
Affiliation(s)
- Anupa T Anil
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Karan Choudhary
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Rakesh Pandian
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Praver Gupta
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Poonam Thakran
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Arashdeep Singh
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Monika Sharma
- Department of Chemical Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India
- Shravan Kumar Mishra
- Department of Biological Sciences, Indian Institute of Science Education and Research (IISER) Mohali, Sector 81, 140306 Punjab, India

22
Chhabra R, Muthusamy V, Baveja A, Katral A, Mehta B, Zunjare RU, Hossain F. Allelic variation in shrunken2 gene affecting kernel sweetness in exotic-and indigenous-maize inbreds. PLoS One 2022; 17:e0274732. [PMID: 36136965 PMCID: PMC9498942 DOI: 10.1371/journal.pone.0274732] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/11/2022] [Accepted: 09/03/2022] [Indexed: 11/24/2022]
Abstract
Sweet corn has become a popular food worldwide. It possesses six-times more sugar than field corn due to the presence of recessive shrunken2 (sh2) gene. Despite availability of diverse sweet corn germplasm, comprehensive characterization of sh2 has not been undertaken so far. Here, entire Sh2 gene (7320 bp) among five field corn-(Sh2Sh2) and six sweet corn-(sh2sh2) inbreds was sequenced. A total of 686 SNPs and 372 InDels were identified, of which three SNPs differentiated the wild-(Sh2) and mutant-(sh2) allele. Ten InDel markers were developed to assess sh2 gene-based diversity among 23 sweet corn and 25 field corn lines. Twenty-five alleles and 47 haplotypes of sh2 were identified among 48 inbreds. Among markers, MGU-InDel-2, MGU-InDel-3, MGU-InDel-5 and MGU-InDel-8 had PIC > 0.5. Major allele frequency varied from 0.458 to 0.958. The gene sequence of these maize inbreds was compared with 25 orthologues of monocots. Sh2 gene possessed 15–18 exons with 6–225 bp among maize, while it was 6–21 exons with 30–441 bp among orthologues. While intron length across maize genotypes varied between 67 and 2069 bp, the same among orthologues was 57–2713 bp. Sh2-encoded AGPase domain was more conserved than NTP transferase domain. Nucleotide and protein sequences of sh2 in maize and orthologues revealed that rice orthologue was closer to maize than other monocots. The study also provided details of motifs and domains present in sh2 gene, physicochemical properties and secondary structure of SH2 protein in maize inbreds and orthologues. This study reports detailed characterization and diversity analysis in sh2 gene of maize and related orthologues in various monocots.
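The abstract reports markers with PIC > 0.5 without defining the statistic. A minimal sketch follows, assuming that PIC here is the standard polymorphism information content of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i are the allele frequencies at a marker. The function name and the example frequencies are hypothetical and are not data from the study.

```python
from itertools import combinations
from typing import Sequence


def polymorphism_information_content(freqs: Sequence[float]) -> float:
    """Compute PIC from the allele frequencies of one marker.

    Assumes the Botstein et al. (1980) definition; frequencies must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical frequencies, for illustration only.
print(polymorphism_information_content([0.5, 0.5]))                # biallelic
print(polymorphism_information_content([0.25, 0.25, 0.25, 0.25]))  # four alleles
```

Under this definition a biallelic marker cannot exceed 0.375, so a marker with PIC > 0.5 would need at least three alleles.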
Affiliation(s)
- Rashmi Chhabra
- ICAR-Indian Agricultural Research Institute, New Delhi, India
- Aanchal Baveja
- ICAR-Indian Agricultural Research Institute, New Delhi, India
- Brijesh Mehta
- ICAR-Indian Grassland and Fodder Research Institute, Jhansi, India
- Firoz Hossain
- ICAR-Indian Agricultural Research Institute, New Delhi, India

23
Glasser E, Maji D, Biancon G, Puthenpeedikakkal A, Cavender C, Tebaldi T, Jenkins J, Mathews D, Halene S, Kielkopf C. Pre-mRNA splicing factor U2AF2 recognizes distinct conformations of nucleotide variants at the center of the pre-mRNA splice site signal. Nucleic Acids Res 2022; 50:5299-5312. [PMID: 35524551 PMCID: PMC9128377 DOI: 10.1093/nar/gkac287] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/15/2021] [Revised: 04/03/2022] [Accepted: 04/12/2022] [Indexed: 11/24/2022]
Abstract
The essential pre-mRNA splicing factor U2AF2 (also called U2AF65) identifies polypyrimidine (Py) tract signals of nascent transcripts, despite length and sequence variations. Previous studies have shown that the U2AF2 RNA recognition motifs (RRM1 and RRM2) preferentially bind uridine-rich RNAs. Nonetheless, the specificity of the RRM1/RRM2 interface for the central Py tract nucleotide has yet to be investigated. We addressed this question by determining crystal structures of U2AF2 bound to a cytidine, guanosine, or adenosine at the central position of the Py tract, and compared U2AF2-bound uridine structures. Local movements of the RNA site accommodated the different nucleotides, whereas the polypeptide backbone remained similar among the structures. Accordingly, molecular dynamics simulations revealed flexible conformations of the central, U2AF2-bound nucleotide. The RNA binding affinities and splicing efficiencies of structure-guided mutants demonstrated that U2AF2 tolerates nucleotide substitutions at the central position of the Py tract. Moreover, enhanced UV-crosslinking and immunoprecipitation of endogenous U2AF2 in human erythroleukemia cells showed uridine-sensitive binding sites, with lower sequence conservation at the central nucleotide positions of otherwise uridine-rich, U2AF2-bound splice sites. Altogether, these results highlight the importance of RNA flexibility for protein recognition and take a step towards relating splice site motifs to pre-mRNA splicing efficiencies.
Affiliation(s)
- Eliezra Glasser
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Debanjana Maji
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Giulia Biancon
- Section of Hematology, Department of Internal Medicine and Yale Cancer Center, Yale University School of Medicine, New Haven, CT 06520, USA
- Chapin E Cavender
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Toma Tebaldi
- Section of Hematology, Department of Internal Medicine and Yale Cancer Center, Yale University School of Medicine, New Haven, CT 06520, USA
- Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
- Jermaine L Jenkins
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- David H Mathews
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Stephanie Halene
- Section of Hematology, Department of Internal Medicine and Yale Cancer Center, Yale University School of Medicine, New Haven, CT 06520, USA
- Yale Center for RNA Science and Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
- Department of Pathology, Yale University School of Medicine, New Haven, CT 06520, USA
- Clara L Kielkopf
- Department of Biochemistry and Biophysics, and the Center for RNA Biology, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Wilmot Cancer Institute, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA

24
Torrado M, Maneiro E, Lamounier Junior A, Fernández-Burriel M, Sánchez Giralt S, Martínez-Carapeto A, Cazón L, Santiago E, Ochoa JP, McKenna WJ, Santomé L, Monserrat L. Identification of an elusive spliceogenic MYBPC3 variant in an otherwise genotype-negative hypertrophic cardiomyopathy pedigree. Sci Rep 2022; 12:7284. [PMID: 35508642 PMCID: PMC9068804 DOI: 10.1038/s41598-022-11159-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/25/2021] [Accepted: 04/13/2022] [Indexed: 11/10/2022]
Abstract
The finding of a genotype-negative hypertrophic cardiomyopathy (HCM) pedigree with several affected members indicating a familial origin of the disease has driven this study to discover causative gene variants. Genetic testing of the proband and subsequent family screening revealed the presence of a rare variant in the MYBPC3 gene, c.3331−26T>G in intron 30, with evidence supporting cosegregation with the disease in the family. An analysis of potential splice-altering activity using several splicing algorithms consistently yielded low scores. Minigene expression analysis at the mRNA and protein levels revealed that c.3331−26T>G is a spliceogenic variant with major splice-altering activity leading to undetectable levels of properly spliced transcripts or the corresponding protein. Minigene and patient mRNA analyses indicated that this variant induces complete and partial retention of intron 30, which was expected to lead to haploinsufficiency in carrier patients. As most spliceogenic MYBPC3 variants, c.3331−26T>G appears to be non-recurrent, since it was identified in only two additional unrelated probands in our large HCM cohort. In fact, the frequency analysis of 46 known splice-altering MYBPC3 intronic nucleotide substitutions in our HCM cohort revealed 9 recurrent and 16 non-recurrent variants present in a few probands (≤ 4), while 21 were not detected. The identification of non-recurrent elusive MYBPC3 spliceogenic variants that escape detection by in silico algorithms represents a challenge for genetic diagnosis of HCM and contributes to solving a fraction of genotype-negative HCM cases.
Affiliation(s)
- Mario Torrado
- Cardiovascular Research Group, University of A Coruña, Campus de Oza, Building Fortín, 15006, A Coruña, Spain; Biomedical Research Institute of A Coruña, A Coruña, Spain
- Emilia Maneiro
- Biomedical Research Institute of A Coruña, A Coruña, Spain; Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain
- Arsonval Lamounier Junior
- Cardiovascular Research Group, University of A Coruña, Campus de Oza, Building Fortín, 15006, A Coruña, Spain; Biomedical Research Institute of A Coruña, A Coruña, Spain; Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain; Medical School, Universidade Vale do Rio Doce, Governador Valadares, MG, Brazil
- Laura Cazón
- Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain
- Elisa Santiago
- Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain
- Juan Pablo Ochoa
- Biomedical Research Institute of A Coruña, A Coruña, Spain; Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain
- William J McKenna
- Cardiovascular Research Group, University of A Coruña, Campus de Oza, Building Fortín, 15006, A Coruña, Spain; Biomedical Research Institute of A Coruña, A Coruña, Spain; Institute of Cardiovascular Science, University College London, London, UK
- Luis Santomé
- Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain
- Lorenzo Monserrat
- Biomedical Research Institute of A Coruña, A Coruña, Spain; Cardiovascular Genetics, Health in Code, Business Center Marineda, Avenida de Arteixo 43, Local 1A, 15008, A Coruña, Spain

25
Keegan NP, Wilton SD, Fletcher S. Analysis of Pathogenic Pseudoexons Reveals Novel Mechanisms Driving Cryptic Splicing. Front Genet 2022; 12:806946. [PMID: 35140743 PMCID: PMC8819188 DOI: 10.3389/fgene.2021.806946] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 11/01/2021] [Accepted: 12/09/2021] [Indexed: 12/16/2022]
Abstract
Understanding pre-mRNA splicing is crucial to accurately diagnosing and treating genetic diseases. However, mutations that alter splicing can exert highly diverse effects. Of all the known types of splicing mutations, perhaps the rarest and most difficult to predict are those that activate pseudoexons, sometimes also called cryptic exons. Unlike other splicing mutations that either destroy or redirect existing splice events, pseudoexon mutations appear to create entirely new exons within introns. Since exon definition in vertebrates requires coordinated arrangements of numerous RNA motifs, one might expect that pseudoexons would only arise when rearrangements of intronic DNA create novel exons by chance. Surprisingly, although such mutations do occur, a far more common cause of pseudoexons is deep-intronic single nucleotide variants, raising the question of why these latent exon-like tracts near the mutation sites have not already been purged from the genome by the evolutionary advantage of more efficient splicing. Possible answers may lie in deep intronic splicing processes such as recursive splicing or poison exon splicing. Because these processes utilize intronic motifs that benignly engage with the spliceosome, the regions involved may be more susceptible to exonization than other intronic regions would be. We speculated that a comprehensive study of reported pseudoexons might detect alignments with known deep intronic splice sites and could also permit the characterisation of novel pseudoexon categories. In this report, we present and analyse a catalogue of over 400 published pseudoexon splice events. In addition to confirming prior observations of the most common pseudoexon mutation types, the size of this catalogue also enabled us to suggest new categories for some of the rarer types of pseudoexon mutation. 
By comparing our catalogue against published datasets of non-canonical splice events, we also found that 15.7% of pseudoexons exhibit some splicing activity at one or both of their splice sites in non-mutant cells. Importantly, this included seven examples of experimentally confirmed recursive splice sites, confirming for the first time a long-suspected link between these two splicing phenomena. These findings have the potential to improve the fidelity of genetic diagnostics and reveal new targets for splice-modulating therapies.
Affiliation(s)
- Niall P. Keegan
- Centre for Molecular Medicine and Innovative Therapeutics, Health Futures Institute, Murdoch University, Perth, WA, Australia
- Centre for Neuromuscular and Neurological Disorders, Perron Institute for Neurological and Translational Science, The University of Western Australia, Perth, WA, Australia
- Steve D. Wilton
- Centre for Molecular Medicine and Innovative Therapeutics, Health Futures Institute, Murdoch University, Perth, WA, Australia
- Centre for Neuromuscular and Neurological Disorders, Perron Institute for Neurological and Translational Science, The University of Western Australia, Perth, WA, Australia
- Sue Fletcher
- Centre for Molecular Medicine and Innovative Therapeutics, Health Futures Institute, Murdoch University, Perth, WA, Australia
- Centre for Neuromuscular and Neurological Disorders, Perron Institute for Neurological and Translational Science, The University of Western Australia, Perth, WA, Australia

26
Kumari A, Sedehizadeh S, Brook JD, Kozlowski P, Wojciechowska M. Differential fates of introns in gene expression due to global alternative splicing. Hum Genet 2022; 141:31-47. [PMID: 34907472 PMCID: PMC8758631 DOI: 10.1007/s00439-021-02409-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/08/2021] [Accepted: 12/02/2021] [Indexed: 02/06/2023]
Abstract
The discovery of introns over four decades ago revealed a new vision of genes and their interrupted arrangement. Throughout the years, it has appeared that introns play essential roles in the regulation of gene expression. Unique processing of excised introns through the formation of lariats suggests a widespread role for these molecules in the structure and function of cells. In addition to rapid destruction, these lariats may linger on in the nucleus or may even be exported to the cytoplasm, where they remain stable circular RNAs (circRNAs). Alternative splicing (AS) is a source of diversity in mature transcripts harboring retained introns (RI-mRNAs). Such RNAs may contain one or more entire retained intron(s) (RIs), but they may also have intron fragments resulting from sequential excision of smaller subfragments via recursive splicing (RS), which is characteristic of long introns. There are many potential fates of RI-mRNAs, including their downregulation via nuclear and cytoplasmic surveillance systems and the generation of new protein isoforms with potentially different functions. Various reports have linked the presence of such unprocessed transcripts in mammals to important roles in normal development and in disease-related conditions. In certain human neurological-neuromuscular disorders, including myotonic dystrophy type 2 (DM2), frontotemporal dementia/amyotrophic lateral sclerosis (FTD/ALS) and Duchenne muscular dystrophy (DMD), peculiar processing of long introns has been identified and is associated with their pathogenic effects. In this review, we discuss different mechanisms involved in the processing of introns during AS and the functions of these large sections of the genome in our biology.
Affiliation(s)
- Anjani Kumari
- Queen's Medical Centre, School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
- Saam Sedehizadeh
- Queen's Medical Centre, School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
- John David Brook
- Queen's Medical Centre, School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
- Piotr Kozlowski
- Department of Molecular Genetics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland
- Marzena Wojciechowska
- Department of Molecular Genetics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland
- Department of Rare Human Diseases, Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704, Poznan, Poland

27
Scalzitti N, Kress A, Orhand R, Weber T, Moulinier L, Jeannin-Girardon A, Collet P, Poch O, Thompson JD. Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics 2021; 22:561. [PMID: 34814826 PMCID: PMC8609763 DOI: 10.1186/s12859-021-04471-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/04/2021] [Accepted: 11/09/2021] [Indexed: 12/14/2022]
Abstract
Background: Ab initio prediction of splice sites is an essential step in eukaryotic genome annotation. Recent predictors have exploited Deep Learning algorithms and reliable gene structures from model organisms. However, Deep Learning methods for non-model organisms are lacking. Results: We developed Spliceator to predict splice sites in a wide range of species, including model and non-model organisms. Spliceator uses a convolutional neural network and is trained on carefully validated data from over 100 organisms. We show that Spliceator achieves consistently high accuracy (89–92%) compared to existing methods on independent benchmarks from human, fish, fly, worm, plant and protist organisms. Conclusions: Spliceator is a new Deep Learning method trained on high-quality data, which can be used to predict splice sites in diverse organisms, ranging from human to protists, with consistently high accuracy. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04471-3.
Affiliation(s)
- Nicolas Scalzitti
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Arnaud Kress
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France; BiGEst-ICube Platform, ICube Laboratory, UMR7357, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Romain Orhand
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Thomas Weber
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Luc Moulinier
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France; BiGEst-ICube Platform, ICube Laboratory, UMR7357, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Anne Jeannin-Girardon
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Pierre Collet
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Olivier Poch
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- Julie D Thompson
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France

28
Filho JAF, Rosolen RR, Almeida DA, de Azevedo PHC, Motta MLL, Aono AH, dos Santos CA, Horta MAC, de Souza AP. Trends in biological data integration for the selection of enzymes and transcription factors related to cellulose and hemicellulose degradation in fungi. 3 Biotech 2021; 11:475. [PMID: 34777932 PMCID: PMC8548487 DOI: 10.1007/s13205-021-03032-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/11/2021] [Accepted: 10/15/2021] [Indexed: 12/13/2022]
Abstract
Fungi are key players in biotechnological applications. Although several studies focusing on fungal diversity and genetics have been performed, many details of fungal biology remain unknown, including how cellulolytic enzymes are modulated within these organisms to allow changes in the main plant cell wall compounds, cellulose and hemicellulose, and subsequent biomass conversion. With the advent and consolidation of DNA/RNA sequencing technology, different types of information can be generated at the genomic, structural and functional levels, including the gene expression profiles and regulatory mechanisms of these organisms under degradation-induced conditions. This increase in data generation made rapid computational development necessary to handle the resulting volume of data. In this context, the emergence of bioinformatics, a hybrid science integrating biological data with various techniques for information storage, distribution and analysis, was a fundamental step toward the current state of the art in the postgenomic era. The possibility of integrating biological big data has facilitated exciting discoveries, including identifying novel mechanisms and more efficient enzymes, increasing yields, reducing costs and expanding opportunities in the bioprocess field. In this review, we summarize the current status and trends of the integration of different types of biological data through bioinformatics approaches for biological data analysis and enzyme selection.
Affiliation(s)
- Jaire A. Ferreira Filho
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Rafaela R. Rosolen
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Deborah A. Almeida
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Paulo Henrique C. de Azevedo
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Maria Lorenza L. Motta
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Alexandre H. Aono
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
| | - Clelton A. dos Santos
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
- Brazilian Biorenewables National Laboratory (LNBR), Brazilian Center for Research in Energy and Materials (CNPEM), Campinas, SP Brazil
| | - Maria Augusta C. Horta
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
- Faculty of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, SP Brazil
| | - Anete P. de Souza
- Center for Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, SP Brazil
- Department of Plant Biology, Institute of Biology, UNICAMP, Universidade Estadual de Campinas, Campinas, SP 13083-875 Brazil
|
29
|
Zhou YN, Xie S, Chen JN, Wang ZH, Yang P, Zhou SC, Pang L, Li F, Shi M, Huang JH, Chen XX. Expression and functional characterization of odorant-binding protein genes in the endoparasitic wasp Cotesia vestalis. INSECT SCIENCE 2021; 28:1354-1368. [PMID: 32761881 DOI: 10.1111/1744-7917.12861] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/07/2020] [Revised: 07/13/2020] [Accepted: 07/22/2020] [Indexed: 06/11/2023]
Abstract
Odorant-binding proteins (OBPs) are crucial for insect olfactory perception: they participate in the initial step of transporting odorant molecules from the external environment to olfactory receptor neurons. To better understand the roles of OBPs in olfactory perception in Cotesia vestalis, a solitary larval endoparasitoid of the diamondback moth, Plutella xylostella, we comprehensively screened the genome of C. vestalis and obtained 20 CvesOBPs, including 18 classic OBPs and two minus-C OBPs. Motif-pattern analysis indicates that the motifs of C. vestalis OBPs are highly conserved in Hymenoptera. Tissue expression analysis shows that five OBPs (CvesOBP1/11/12/14/16) are highly expressed in male antennae, whereas six other OBP genes (CvesOBP7/8/13/17/18/19) are significantly transcriptionally enriched in female antennae. RNA interference experiments on the three most highly expressed OBP genes in female antennae (CvesOBP17/18/19) indicate that they are likely involved in the parasitic processes of female wasps, as the wasps take longer to target their hosts when these genes are knocked down.
Affiliation(s)
- Yue-Nan Zhou
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Shuang Xie
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Jia-Ni Chen
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Ze-Hua Wang
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Pei Yang
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Si-Cong Zhou
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Lan Pang
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Fei Li
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Min Shi
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Jian-Hua Huang
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
| | - Xue-Xin Chen
- Institute of Insect Sciences, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, 310058, China
- Ministry of Agriculture Key Lab of Molecular Biology of Crop Pathogens and Insect Pests, Zhejiang University, Hangzhou, 310058, China
- Key Laboratory of Biology of Crop Pathogens and Insects of Zhejiang Province, Zhejiang University, Hangzhou, 310058, China
- State Key Lab of Rice Biology, Zhejiang University, Hangzhou, 310058, China
|
30
|
Natural selection at the RASGEF1C (GGC) repeat in human and divergent genotypes in late-onset neurocognitive disorder. Sci Rep 2021; 11:19235. [PMID: 34584172 PMCID: PMC8479062 DOI: 10.1038/s41598-021-98725-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/29/2021] [Accepted: 09/14/2021] [Indexed: 12/17/2022] Open
Abstract
Expression dysregulation of the neuron-specific gene RASGEF1C (RasGEF Domain Family Member 1C) occurs in late-onset neurocognitive disorders (NCDs), such as Alzheimer's disease. This gene contains a (GGC)13 repeat spanning its core promoter and 5' untranslated region (RASGEF1C-201 ENST00000361132.9). Here we sequenced the (GGC)-repeat in a sample of human subjects (N = 269), consisting of late-onset NCD patients (N = 115) and controls (N = 154). We also studied the status of this STR across various primate and non-primate species based on Ensembl 103. The 6-repeat allele was the predominant allele in the controls (frequency = 0.85) and NCD patients (frequency = 0.78). The NCD genotype compartment contained an excess of genotypes that lacked the 6-repeat allele (divergent genotypes) (Mid-P exact = 0.004). A number of those genotypes were not detected in the control group (Mid-P exact = 0.007). The RASGEF1C (GGC)-repeat expanded beyond 2-repeats specifically in primates, and was at maximum length in humans. We conclude that there is natural selection for the 6-repeat allele of the RASGEF1C (GGC)-repeat in humans, and significant divergence from that allele in late-onset NCDs. STR alleles that are predominantly abundant and genotypes that deviate from those alleles are underappreciated features, which may have deep evolutionary and pathological consequences.
|
31
|
Conboy JG. Unannotated splicing regulatory elements in deep intron space. WILEY INTERDISCIPLINARY REVIEWS-RNA 2021; 12:e1656. [PMID: 33887804 DOI: 10.1002/wrna.1656] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/06/2021] [Revised: 03/14/2021] [Accepted: 03/23/2021] [Indexed: 12/21/2022]
Abstract
Deep intron space harbors a diverse array of splicing regulatory elements that cooperate with better-known exon-proximal elements to enforce proper tissue-specific and development-specific pre-mRNA processing. Many deep intron elements have been highly conserved through vertebrate evolution, yet remain poorly annotated in the human genome. Recursive splicing exons (RS-exons) and intraexons promote noncanonical, multistep resplicing pathways in long introns, involving transient intermediate structures that are greatly underrepresented in RNA-seq datasets. Decoy splice sites and decoy exons act at a distance to inhibit splicing catalysis at annotated splice sites, with functional consequences such as exon skipping and intron retention. RNA:RNA bridges can juxtapose distant sequences within or across introns to activate deep intron splicing enhancers and silencers, to loop out exons to be skipped, or to select one member of a mutually exclusive set of exons. Similarly, protein bridges mediated by interactions among transcript-bound RNA binding proteins (RBPs) can modulate splicing outcomes. Experimental disruption of deep intron elements serving any of these functions can abrogate normal splicing, strongly suggesting that natural mutations of deep intron elements can do likewise to cause human disease. Understanding noncanonical splicing pathways and discovering deep intron regulatory signals, many of which map hundreds to many thousands of nucleotides from annotated splice junctions, is of great academic interest for basic scientists studying alternative splicing mechanisms. Hopefully, this knowledge coupled with increased analysis of deep intron sequences will also have important medical applications, as better interpretation of deep intron mutations may reveal new disease mechanisms and suggest new therapies. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing.
Affiliation(s)
- John G Conboy
- Lawrence Berkeley National Laboratory, Biological Systems and Engineering Division, Berkeley, California, USA
|
32
|
Wimmer K, Schamschula E, Wernstedt A, Traunfellner P, Amberger A, Zschocke J, Kroisel P, Chen Y, Callens T, Messiaen L. AG-exclusion zone revisited: Lessons to learn from 91 intronic NF1 3' splice site mutations outside the canonical AG-dinucleotides. Hum Mutat 2020; 41:1145-1156. [PMID: 32126153 PMCID: PMC7317903 DOI: 10.1002/humu.24005] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/25/2019] [Revised: 01/26/2020] [Accepted: 02/24/2020] [Indexed: 12/17/2022]
Abstract
Uncovering frequent motives of action by which variants impair 3′ splice site (3′ss) recognition and selection is essential to improve our understanding of this complex process. Through several mini‐gene experiments, we demonstrate that the pyrimidine (Y) to purine (R) transversion NM_000267.3(NF1):c.1722‐11T>G, although expected to weaken the polypyrimidine tract, causes exon skipping primarily by introducing a novel AG in the AG‐exclusion zone (AGEZ) between the authentic 3′ss AG and the branch point. Evaluation of 90 additional noncanonical intronic NF1 3′ss mutations confirmed that 63% of all mutations and 89% (49/55) of the single‐nucleotide variants upstream of positions ‐3 interrupt the AGEZ. Of these AGEZ‐interrupting mutations, 24/49 lead to exon skipping suggesting that absence of AG in this region is necessary for accurate 3′ss selection already in the initial steps of splicing. The analysis of 91 noncanonical NF1 3′ss mutations also shows that 90% either introduce a novel AG in the AGEZ, cause a Y>R transversion at position ‐3 or remove ≥2 Ys in the AGEZ. We confirm in a validation cohort that these three motives distinguish spliceogenic from splice‐neutral variants with 85% accuracy and, therefore, are generally applicable to select among variants of unknown significance those likely to affect splicing.
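The first of the three criteria named in this abstract, a variant that creates a new AG inside the AGEZ, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the 0-based indexing, the toy sequence, and the simplification of treating the AGEZ as the stretch between a known branch point and the authentic 3'ss AG are all assumptions made for illustration.

```python
def ag_starts(seq, start, end):
    """Return the start indices of AG dinucleotides lying wholly in seq[start:end]."""
    return {j for j in range(start, end - 1) if seq[j:j + 2] == "AG"}


def creates_ag_in_agez(intron_tail, bp_index, var_index, alt):
    """Check whether a single-nucleotide substitution introduces a new AG in the AGEZ.

    intron_tail: 3' end of the intron (5'->3'), ending with the authentic AG.
    bp_index:    0-based index of the branch point nucleotide (assumed known).
    var_index:   0-based index of the substituted nucleotide.
    alt:         the alternative base.
    """
    seq = intron_tail.upper()
    if not seq.endswith("AG"):
        raise ValueError("intron_tail must end with the authentic 3'ss AG")
    # Assumed zone: everything after the branch point and before the authentic AG.
    zone_start, zone_end = bp_index + 1, len(seq) - 2
    if not zone_start <= var_index < zone_end:
        return False
    mutated = seq[:var_index] + alt.upper() + seq[var_index + 1:]
    new_ags = ag_starts(mutated, zone_start, zone_end) - ag_starts(seq, zone_start, zone_end)
    return bool(new_ags)


# Hypothetical intron tail: branch point A at index 3, authentic AG at the end.
toy_tail = "CTAACTTTCTTTTATTTCAG"
hits_agez = creates_ag_in_agez(toy_tail, 3, 14, "G")   # T>G directly after an A
misses_agez = creates_ag_in_agez(toy_tail, 3, 9, "G")  # T>G after a C, no AG formed
```

In this toy case both substitutions are T>G transversions within the pyrimidine-rich stretch, but only the one that follows an A creates an AG. The abstract draws the same distinction for c.1722-11T>G.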
Affiliation(s)
- Katharina Wimmer
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Esther Schamschula
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Annekatrin Wernstedt
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Pia Traunfellner
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Albert Amberger
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Johannes Zschocke
- Institute of Human Genetics, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Peter Kroisel
- Diagnostic & Research Institute of Human Genetics, Diagnostic & Research Center for Molecular BioMedicine, Medical University of Graz, Graz, Austria
| | - Yunjia Chen
- Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama
| | - Tom Callens
- Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama
| | - Ludwine Messiaen
- Department of Genetics, University of Alabama at Birmingham, Birmingham, Alabama
|
33
|
Leman R, Tubeuf H, Raad S, Tournier I, Derambure C, Lanos R, Gaildrat P, Castelain G, Hauchard J, Killian A, Baert-Desurmont S, Legros A, Goardon N, Quesnelle C, Ricou A, Castera L, Vaur D, Le Gac G, Ka C, Fichou Y, Bonnet-Dorion F, Sevenet N, Guillaud-Bataille M, Boutry-Kryza N, Schultz I, Caux-Moncoutier V, Rossing M, Walker LC, Spurdle AB, Houdayer C, Martins A, Krieger S. Assessment of branch point prediction tools to predict physiological branch points and their alteration by variants. BMC Genomics 2020; 21:86. [PMID: 31992191 PMCID: PMC6988378 DOI: 10.1186/s12864-020-6484-5] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 07/16/2019] [Accepted: 01/10/2020] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Branch points (BPs) map within short motifs upstream of acceptor splice sites (3'ss) and are essential for pre-mRNA splicing. Several BP-dedicated bioinformatics tools, including HSF, SVM-BPfinder, BPP, Branchpointer, LaBranchoR and RNABPS, were developed during the last decade. Here, we evaluated their capability to detect the position of BPs, and also to predict the impact on splicing of variants occurring upstream of 3'ss. RESULTS: We used a large set of constitutive and alternative human 3'ss collected from Ensembl (n = 264,787 3'ss) and from in-house RNAseq experiments (n = 51,986 3'ss). We also gathered an unprecedented collection of functional splicing data for 120 variants (62 unpublished) occurring in BP areas of disease-causing genes. Branchpointer performed best at detecting the relevant BPs upstream of constitutive and alternative 3'ss (99.48 and 65.84% accuracies, respectively). For variants occurring in a BP area, BPP performed best at predicting effects on mRNA splicing, with an accuracy of 89.17%. CONCLUSIONS: Our investigations revealed that Branchpointer was optimal for detecting BPs upstream of 3'ss, and that BPP was most relevant for predicting splicing alterations due to variants in the BP area.
Affiliation(s)
- Raphaël Leman
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France. .,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France. .,Université Caen-Normandie, Caen, France.
| | - Hélène Tubeuf
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France.,Interactive Biosoftware, Rouen, France
| | - Sabine Raad
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Isabelle Tournier
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Céline Derambure
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Raphaël Lanos
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Pascaline Gaildrat
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Gaia Castelain
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Julie Hauchard
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Audrey Killian
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Stéphanie Baert-Desurmont
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Angelina Legros
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France
| | - Nicolas Goardon
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France.,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Céline Quesnelle
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France
| | - Agathe Ricou
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France.,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Laurent Castera
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France.,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Dominique Vaur
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France.,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Gérald Le Gac
- Inserm UMR1078, Genetics, Functional Genomics and Biotechnology, Université de Bretagne Occidentale, Brest, France
| | - Chandran Ka
- Inserm UMR1078, Genetics, Functional Genomics and Biotechnology, Université de Bretagne Occidentale, Brest, France
| | - Yann Fichou
- Inserm UMR1078, Genetics, Functional Genomics and Biotechnology, Université de Bretagne Occidentale, Brest, France
| | - Françoise Bonnet-Dorion
- Inserm U916, Département de Pathologie, Laboratoire de Génétique Constitutionnelle, Institut Bergonié, Bordeaux, France
| | - Nicolas Sevenet
- Inserm U916, Département de Pathologie, Laboratoire de Génétique Constitutionnelle, Institut Bergonié, Bordeaux, France
| | | | - Nadia Boutry-Kryza
- Lyon Neuroscience Research Center-CRNL, Inserm U1028, CNRS UMR 5292, University of Lyon, Lyon, France
| | - Inès Schultz
- Laboratoire d'Oncogénétique, Centre Paul Strauss, Strasbourg, France
| | | | - Maria Rossing
- Centre for Genomic Medicine, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Logan C Walker
- Department of Pathology and Biomedical Science, University of Otago, Christchurch, New Zealand
| | - Amanda B Spurdle
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
| | - Claude Houdayer
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Alexandra Martins
- Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France
| | - Sophie Krieger
- Laboratoire de Biologie Clinique et Oncologique, Centre François Baclesse, Caen, France. .,Inserm U1245, Normandy Center for Genomic and Personalized Medicine, Rouen, UNIROUEN, Normandy University, Caen, France. .,Université Caen-Normandie, Caen, France. .,Present address: Laboratoire de biologie et génétique des cancers, Centre François Baclesse, Caen, France.
|
34
|
Královicová J, Ševcíková I, Stejskalová E, Obuca M, Hiller M, Stanek D, Vorechovský I. PUF60-activated exons uncover altered 3' splice-site selection by germline missense mutations in a single RRM. Nucleic Acids Res 2019; 46:6166-6187. [PMID: 29788428 PMCID: PMC6093180 DOI: 10.1093/nar/gky389] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 01/03/2018] [Accepted: 05/01/2018] [Indexed: 12/27/2022] Open
Abstract
PUF60 is a splicing factor that binds uridine (U)-rich tracts and facilitates association of the U2 small nuclear ribonucleoprotein with primary transcripts. PUF60 deficiency (PD) causes a developmental delay coupled with intellectual disability and spinal, cardiac, ocular and renal defects, but PD pathogenesis is not understood. Using RNA-Seq, we identify human PUF60-regulated exons and show that PUF60 preferentially acts as their activator. PUF60-activated internal exons are enriched for Us upstream of their 3′ splice sites (3′ss), are preceded by longer AG dinucleotide exclusion zones and more distant branch sites, with a higher probability of unpaired interactions across a typical branch site location as compared to control exons. In contrast, PUF60-repressed exons show U-depletion with lower estimates of RNA single-strandedness. We also describe PUF60-regulated, alternatively spliced isoforms encoding other U-bound splicing factors, including PUF60 partners, suggesting that they are co-regulated in the cell, and identify PUF60-regulated exons derived from transposed elements. PD-associated amino-acid substitutions, even within a single RNA recognition motif (RRM), altered selection of competing 3′ss and branch points of a PUF60-dependent exon and the 3′ss choice was also influenced by alternative splicing of PUF60. Finally, we propose that differential distribution of RNA processing steps detected in cells lacking PUF60 and the PUF60-paralog RBM39 is due to the RBM39 RS domain interactions. Together, these results provide new insights into regulation of exon usage by the 3′ss organization and reveal that germline mutation heterogeneity in RRMs can enhance phenotypic variability at the level of splice-site and branch-site selection.
Affiliation(s)
- Jana Královicová
- University of Southampton Faculty of Medicine, Southampton SO16 6YD, UK.,Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Ivana Ševcíková
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Eva Stejskalová
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Mina Obuca
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics and Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
| | - David Stanek
- Czech Academy of Sciences, Institute of Molecular Genetics, 142 20 Prague, Czech Republic
| | - Igor Vorechovský
- University of Southampton Faculty of Medicine, Southampton SO16 6YD, UK
|
35
|
Zhang X, Zhang Y, Wang T, Li Z, Cheng J, Ge H, Tang Q, Chen K, Liu L, Lu C, Guo J, Zheng B, Zheng Y. A Comprehensive Map of Intron Branchpoints and Lariat RNAs in Plants. THE PLANT CELL 2019; 31:956-973. [PMID: 30894459 PMCID: PMC6533014 DOI: 10.1105/tpc.18.00711] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/18/2018] [Revised: 01/29/2019] [Accepted: 03/14/2019] [Indexed: 05/20/2023]
Abstract
Lariats are formed by excised introns, when the 5' splice site joins with the branchpoint (BP) during splicing. Although lariat RNAs are usually degraded by RNA debranching enzyme 1, recent findings in animals detected many lariat RNAs under physiological conditions. By contrast, the features of BPs and to what extent lariat RNAs accumulate naturally are largely unexplored in plants. Here, we analyzed 948 RNA sequencing data sets to document plant BPs and lariat RNAs on a genome-wide scale. In total, we identified 13,872, 5199, 29,582, and 13,478 BPs in Arabidopsis (Arabidopsis thaliana), tomato (Solanum lycopersicum), rice (Oryza sativa), and maize (Zea mays), respectively. Features of plant BPs are highly similar to those in yeast and human, in that BPs are adenine-preferred and flanked by uracil-enriched sequences. Intriguingly, ∼20% of introns harbor multiple BPs, and BP usage is tissue-specific. Furthermore, 10,580 lariat RNAs accumulate in wild-type Arabidopsis plants, and most of these lariat RNAs originate from longer or retroelement-depleted introns. Moreover, the expression of these lariat RNAs is accompanied by the incidence of back-splicing of parent exons. Collectively, our results provide a comprehensive map of intron BPs and lariat RNAs in four plant species and uncover a link between lariat turnover and splicing.
Affiliation(s)
- Xiaotuo Zhang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Yong Zhang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Taiyun Wang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Ziwei Li
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Jinping Cheng
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Haoran Ge
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Qi Tang
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Kun Chen
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Li Liu
- Faculty of Life Science and Technology, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Chenyu Lu
- Yunnan Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Junqiang Guo
- Yunnan Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| | - Binglian Zheng
- State Key Laboratory of Genetic Engineering, Ministry of Education Key Laboratory of Biodiversity Sciences and Ecological Engineering, Institute of Plant Biology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Yun Zheng
- Yunnan Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
| |
|
36
|
Nguyen H, Xie J. Widespread Separation of the Polypyrimidine Tract From 3' AG by G Tracts in Association With Alternative Exons in Metazoa and Plants. Front Genet 2019; 9:741. [PMID: 30693020 PMCID: PMC6339879 DOI: 10.3389/fgene.2018.00741] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/04/2018] [Accepted: 12/22/2018] [Indexed: 12/23/2022] Open
Abstract
At the end of introns, the polypyrimidine tract (Py) is often close to the 3′ AG in a consensus (Y)20NCAGgt in humans. Interestingly, we have found that they could also be separated by purine-rich elements including G tracts in thousands of human genes. These regulatory elements between the Py and 3′ AG (REPA) mainly regulate alternative 3′ splice sites (3′ SS) and intron retention. Here we show their widespread distribution and special properties across kingdoms. The purine-rich 3′ SS are found in up to about 60% of the introns among more than 1,000 species/lineages by whole genome analysis, and up to 18% of these introns contain the REPA G-tracts (REPAG) in about 0.6 million 3′ SS in total. In particular, they are significantly enriched over their 3′ SS and genome backgrounds in metazoa and plants, and highly associated with alternative splicing of genes in diverse functional clusters. Cryptic splice sites harboring such G- and the other purine-triplets tend to be enriched (2- to 9-fold over the disrupted canonical 3′ SS) and aberrantly used in cancer patients carrying mutations of the SF3B1 or U2AF35, factors critical for branch point (BP) or 3′ AG recognition, respectively. Moreover, the REPAGs are significantly associated with reduced occurrences of BP motifs between the −24 and −4 positions, in particular absent between the −7 and −5 positions in several model organisms examined. The more distant BPs are associated with increased occurrences of alternative splicing in humans and zebrafish. The REPAGs appear to have evolved in a species- or phylum-specific way. Thus, there is widespread separation of the Py and 3′ AG by REPAGs that have evolved differentially. This special 3′ SS arrangement likely contributes to the generation of diverse transcript or protein isoforms in biological functions or diseases through alternative or aberrant splicing.
Affiliation(s)
- Hai Nguyen
- Department of Physiology and Pathophysiology, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Department of Applied Computer Sciences, University of Winnipeg, Winnipeg, MB, Canada
| | - Jiuyong Xie
- Department of Physiology and Pathophysiology, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
| |
|
37
|
Yadav RP, Ghatak S, Chakraborty P, Lalrohlui F, Kannan R, Kumar R, Pautu JL, Zomingthanga J, Chenkual S, Muthukumaran R, Senthil Kumar N. Lifestyle chemical carcinogens associated with mutations in cell cycle regulatory genes increases the susceptibility to gastric cancer risk. Environmental Science and Pollution Research International 2018; 25:31691-31704. [PMID: 30209766 DOI: 10.1007/s11356-018-3080-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/16/2018] [Accepted: 08/27/2018] [Indexed: 06/08/2023]
Abstract
In the present study, we correlated various lifestyle habits and their associated mutations in cell cycle (P21 and MDM2) and DNA damage repair (MLH1) genes to investigate their role in gastric cancer (GC). Multifactor dimensionality reduction (MDR) analysis revealed the two-factor model of oral snuff and smoked meat as the significant model for GC risk. The interaction analysis between identified mutations and the significant demographic factors predicted that oral snuff is significantly associated with P21 3'UTR mutations. A total of five mutations in the P21 gene, including three novel mutations in intron 2 (36651738G > A, 36651804A > T, 36651825G > T), were identified. In the MLH1 gene, two variants were identified, viz. one in exon 8 (37053568A > G; 219I > V) and a novel 37088831C > G in intron 16. Flow cytometric analysis predicted DNA aneuploidy in 7 (17.5%) and diploidy in 33 (82.5%) tumor samples. The G2/M phase was significantly arrested in aneuploid gastric tumor samples, whereas a high S-phase fraction was observed in all the gastric tumor samples. This study demonstrated that environmental chemical carcinogens, along with alterations in cell cycle regulatory (P21) and mismatch repair (MLH1) genes, may increase susceptibility to GC by abnormally altering the DNA content level in tumors in the Mizo ethnic population.
Affiliation(s)
- Ravi Prakash Yadav
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram, 796004, India
| | - Souvik Ghatak
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram, 796004, India
| | - Payel Chakraborty
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram, 796004, India
| | - Freda Lalrohlui
- Department of Biotechnology, Mizoram University, Aizawl, Mizoram, 796004, India
| | - Ravi Kannan
- Cachar Cancer Hospital and Research Centre, Silchar, Assam 788015, India
| | - Rajeev Kumar
- Cachar Cancer Hospital and Research Centre, Silchar, Assam 788015, India
| | - Jeremy L Pautu
- Mizoram State Cancer Institute, Zemabawk, Aizawl, Mizoram, 796017, India
| | - John Zomingthanga
- Department of Pathology, Civil Hospital, Aizawl, Mizoram, 796001, India
| | - Saia Chenkual
- Department of Surgery, Civil Hospital, Aizawl, Mizoram, 796001, India
| | | | | |
|
38
|
Zhang Q, Fan X, Wang Y, Sun MA, Shao J, Guo D. BPP: a sequence-based algorithm for branch point prediction. Bioinformatics 2018. [PMID: 28633445 DOI: 10.1093/bioinformatics/btx401] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 01/05/2023] Open
Abstract
Motivation: Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. Results: We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. Contact: djguo@cuhk.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qing Zhang
- School of Life Sciences and the State Key Laboratory of Agrobiotechnology
| | - Xiaodan Fan
- Department of Statistics, The Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR, China
| | - Yejun Wang
- Department of Cell Biology and Genetics, Shenzhen University Health Science Center, Shenzhen 518060, China
| | - Ming-An Sun
- School of Life Sciences and the State Key Laboratory of Agrobiotechnology
| | - Jianlin Shao
- First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Dianjing Guo
- School of Life Sciences and the State Key Laboratory of Agrobiotechnology
| |
|
39
|
Zuallaert J, Godin F, Kim M, Soete A, Saeys Y, De Neve W. SpliceRover: interpretable convolutional neural networks for improved splice site prediction. Bioinformatics 2018; 34:4180-4188. [DOI: 10.1093/bioinformatics/bty497] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 10/23/2017] [Accepted: 06/19/2018] [Indexed: 11/13/2022] Open
Affiliation(s)
- Jasper Zuallaert
- Center for Biotech Data Science, Department of Environmental Technology, Food Technology and Molecular Biotechnology, Ghent University Global Campus, Songdo, Incheon, South Korea
- IDLab, Department for Electronics and Information Systems, Ghent University, Ghent, Belgium
| | - Fréderic Godin
- IDLab, Department for Electronics and Information Systems, Ghent University, Ghent, Belgium
| | - Mijung Kim
- Center for Biotech Data Science, Department of Environmental Technology, Food Technology and Molecular Biotechnology, Ghent University Global Campus, Songdo, Incheon, South Korea
- IDLab, Department for Electronics and Information Systems, Ghent University, Ghent, Belgium
| | - Arne Soete
- Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
- Data Mining and Modeling for Biomedicine, VIB Inflammation Research Center, Ghent, Belgium
| | - Yvan Saeys
- Data Mining and Modeling for Biomedicine, VIB Inflammation Research Center, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Wesley De Neve
- Center for Biotech Data Science, Department of Environmental Technology, Food Technology and Molecular Biotechnology, Ghent University Global Campus, Songdo, Incheon, South Korea
- IDLab, Department for Electronics and Information Systems, Ghent University, Ghent, Belgium
| |
|
40
|
Legendre M, Rodriguez-Ballesteros M, Rossi M, Abadie V, Amiel J, Revencu N, Blanchet P, Brioude F, Delrue MA, Doubaj Y, Sefiani A, Francannet C, Holder-Espinasse M, Jouk PS, Julia S, Melki J, Mur S, Naudion S, Fabre-Teste J, Busa T, Stamm S, Lyonnet S, Attie-Bitach T, Kitzis A, Gilbert-Dussardier B, Bilan F. CHARGE syndrome: a recurrent hotspot of mutations in CHD7 IVS25 analyzed by bioinformatic tools and minigene assays. Eur J Hum Genet 2017; 26:287-292. [PMID: 29255276 DOI: 10.1038/s41431-017-0007-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/13/2016] [Revised: 08/17/2017] [Accepted: 08/29/2017] [Indexed: 11/09/2022] Open
Abstract
CHARGE syndrome is a rare genetic disorder mainly due to de novo and private truncating mutations of the CHD7 gene. Here we report an intriguing hot spot of intronic mutations (c.5405-7G > A, c.5405-13G > A, c.5405-17G > A and c.5405-18C > A) located in CHD7 IVS25. Combining computational in silico analysis, experimental branch-point determination and in vitro minigene assays, our study explains this mutation hot spot by a particular genomic context, including the weakness of the IVS25 natural acceptor site and an unconventional lariat sequence localized outside the common 40 bp window upstream of the acceptor splice site. For each of the mutations reported here, bioinformatic tools indicated a newly created 3' splice site, whose existence was confirmed using pSpliceExpress, an easy-to-use and reliable splicing reporter tool. Our study emphasizes the idea that combining these two complementary approaches could increase the efficiency of routine molecular diagnosis.
Affiliation(s)
- Marine Legendre
- Service de Génétique, Centre de Référence Anomalies du Développement de l'Ouest, CHU Poitiers, France
- EA3808 CiMoTheMA Université Poitiers, Poitiers, France
| | | | - Massimiliano Rossi
- Service de génétique, Centre de Référence Anomalies du Développement, Hospices Civils de Lyon et INSERM U1028, CNRS UMR5292, Centre de Recherche en Neurosciences de Lyon, GENDEV Team, Université Claude Bernard Lyon 1, Bron, France
| | - Véronique Abadie
- Service de Pédiatrie Générale, Hôpital Necker Enfants-Malades, AP-HP, Paris, France
| | - Jeanne Amiel
- Département de Génétique, Hôpital Universitaire Necker-Enfants Malades, AP-HP, Institut Imagine, UMR-1163 INSERM-Université Paris Descartes, Paris, France
| | - Nicole Revencu
- Center for Human Genetics, Cliniques universitaires St Luc, Université catholique de Louvain, Brussels, Belgium
| | - Patricia Blanchet
- Département de Génétique Médicale, Hôpital Arnaud de Villeneuve, CHU Montpellier, France
| | - Frédéric Brioude
- Sorbonne Universités, UPMC Univ Paris 06, INSERM UMR_S938, Centre de Recherche Saint Antoine and AP-HP, Hôpitaux Universitaires Paris Est, Hôpital Trousseau, Service d'Explorations Fonctionnelles Endocriniennes, Paris, France
| | | | - Yassamine Doubaj
- Département de Génétique Médicale, Institut National d'Hygiène, Centre de Génomique Humaine, Faculté de Médecine et de Pharmacie de Rabat, Mohammed V University in Rabat, Rabat, Morocco
| | - Abdelaziz Sefiani
- Département de Génétique Médicale, Institut National d'Hygiène, Centre de Génomique Humaine, Faculté de Médecine et de Pharmacie de Rabat, Mohammed V University in Rabat, Rabat, Morocco
| | | | | | - Pierre-Simon Jouk
- Département Génétique & Procréation, Hôpital Couple-Enfant, CHU Grenoble, France
| | - Sophie Julia
- Service de Génétique Médicale, Hôpital Purpan, CHU Toulouse, France
| | - Judith Melki
- CHU Bicêtre, Unité de Génétique Médicale and UMR-1169, Inserm, Le Kremlin Bicêtre, France
| | - Sébastien Mur
- Clinique de médecine néonatale, Hôpital Jeanne de Flandre, CHU Lille, France
| | - Sophie Naudion
- Service de Génétique Médicale, GH Pellegrin, CHU Bordeaux, France
| | | | - Tiffany Busa
- Département de Génétique Médicale, Hôpital d'enfants de la Timone, Marseille, France
| | - Stephen Stamm
- Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, USA
| | - Stanislas Lyonnet
- Département de Génétique, Hôpital Universitaire Necker-Enfants Malades, AP-HP, Institut Imagine, UMR-1163 INSERM-Université Paris Descartes, Paris, France
| | - Tania Attie-Bitach
- Département de Génétique, Hôpital Universitaire Necker-Enfants Malades, AP-HP, Institut Imagine, UMR-1163 INSERM-Université Paris Descartes, Paris, France
| | - Alain Kitzis
- Service de Génétique, Centre de Référence Anomalies du Développement de l'Ouest, CHU Poitiers, France
- EA3808 CiMoTheMA Université Poitiers, Poitiers, France
| | - Brigitte Gilbert-Dussardier
- Service de Génétique, Centre de Référence Anomalies du Développement de l'Ouest, CHU Poitiers, France
- EA3808 CiMoTheMA Université Poitiers, Poitiers, France
| | - Frédéric Bilan
- Service de Génétique, Centre de Référence Anomalies du Développement de l'Ouest, CHU Poitiers, France
- EA3808 CiMoTheMA Université Poitiers, Poitiers, France
| |
|
41
|
Zhang W, Zhu X, Fu Y, Tsuji J, Weng Z. Predicting human splicing branchpoints by combining sequence-derived features and multi-label learning methods. BMC Bioinformatics 2017; 18:464. [PMID: 29219070 PMCID: PMC5773893 DOI: 10.1186/s12859-017-1875-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022] Open
Abstract
Background: Alternative splicing is a critical process in the expression of a single gene, which removes introns and joins exons, and splicing branchpoints are indicators for alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order to guide wet experiments, we develop computational methods to predict human splicing branchpoints. Results: Considering the fact that an intron may have multiple branchpoints, we formulate branchpoint prediction as a multi-label learning problem, and attempt to predict branchpoint sites from intron sequences. First, we investigate a variety of intron sequence-derived features, such as sparse profile, dinucleotide profile, position weight matrix profile, Markov motif profile and polypyrimidine tract profile. Second, we consider several multi-label learning methods: partial least squares regression, canonical correlation analysis and regularized canonical correlation analysis, and use them as the basic classification engines. Third, we propose two ensemble learning schemes which integrate different features and different classifiers to build ensemble learning systems for branchpoint prediction. One is the genetic algorithm-based weighted average ensemble method; the other is the logistic regression-based ensemble method. Conclusions: In the computational experiments, the two ensemble learning methods outperform benchmark branchpoint prediction methods, and can produce high-accuracy results on the benchmark dataset.
Affiliation(s)
- Wen Zhang
- School of Computer, Wuhan University, Wuhan, 430072, China.
| | - Xiaopeng Zhu
- School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA, 15213, USA
| | - Yu Fu
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| | - Junko Tsuji
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA, 01605, USA
| |
|
42
|
Wen J, Wang J, Zhang Q, Guo D. A heuristic model for computational prediction of human branch point sequence. BMC Bioinformatics 2017; 18:459. [PMID: 29065858 PMCID: PMC5655975 DOI: 10.1186/s12859-017-1864-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 02/20/2017] [Accepted: 10/09/2017] [Indexed: 12/14/2022] Open
Abstract
Background: Pre-mRNA splicing is the removal of introns from precursor mRNAs (pre-mRNAs) and the concurrent ligation of the flanking exons to generate mature mRNA. This process is catalyzed by the spliceosome, where the splicing factor 1 (SF1) specifically recognizes the seven-nucleotide branch point sequence (BPS) and the U2 snRNP later displaces the SF1 and binds to the BPS. In mammals, the degeneracy of BPS motifs together with the lack of a large set of experimentally verified BPSs complicates the task of BPS prediction in silico. Results: In this paper, we develop a simple and yet efficient heuristic model for human BPS prediction based on a novel scoring scheme, which quantifies the splicing strength of putative BPSs. The candidate BPS is restricted exclusively within a defined BPS search region to avoid the influences of other elements in the intron and therefore the prediction accuracy is improved. Moreover, using two types of relative frequencies for human BPS prediction, we demonstrate our model outperformed other current implementations on experimentally verified human introns. Conclusion: We propose that the binding energy contributes to the molecular recognition involved in human pre-mRNA splicing. In addition, a genome-wide human BPS prediction is carried out. The characteristics of predicted BPSs are in accordance with experimentally verified human BPSs, and branch site positions relative to the 3'ss and the 5'end of the shortened AGEZ are consistent with the results of published papers. Meanwhile, a webserver for BPS predictor is freely available at http://biocomputer.bio.cuhk.edu.hk/BPS.
Affiliation(s)
- Jia Wen
- School of Life Science, State Key Laboratory of Agrobiotechnology and ShenZhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
| | - Jue Wang
- School of Life Science, State Key Laboratory of Agrobiotechnology and ShenZhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
| | - Qing Zhang
- School of Life Science, State Key Laboratory of Agrobiotechnology and ShenZhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
| | - Dianjing Guo
- School of Life Science, State Key Laboratory of Agrobiotechnology and ShenZhen Research Institute, The Chinese University of Hong Kong, Hong Kong, China
| |
|
43
|
Chen L, Weinmeister R, Kralovicova J, Eperon LP, Vorechovsky I, Hudson AJ, Eperon IC. Stoichiometries of U2AF35, U2AF65 and U2 snRNP reveal new early spliceosome assembly pathways. Nucleic Acids Res 2017; 45:2051-2067. [PMID: 27683217 PMCID: PMC5389562 DOI: 10.1093/nar/gkw860] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/06/2016] [Accepted: 09/16/2016] [Indexed: 12/24/2022] Open
Abstract
The selection of 3′ splice sites (3′ss) is an essential early step in mammalian RNA splicing reactions, but the processes involved are unknown. We have used single molecule methods to test whether the major components implicated in selection, the proteins U2AF35 and U2AF65 and the U2 snRNP, are able to recognize alternative candidate sites or are restricted to one pre-specified site. In the presence of adenosine triphosphate (ATP), all three components bind in a 1:1 stoichiometry with a 3′ss. Pre-mRNA molecules with two alternative 3′ss can be bound concurrently by two molecules of U2AF or two U2 snRNPs, so none of the components are restricted. However, concurrent occupancy inhibits splicing. Stoichiometric binding requires conditions consistent with coalescence of the 5′ and 3′ sites in a complex (I, initial), but if this cannot form the components show unrestricted and stochastic association. In the absence of ATP, when complex E forms, U2 snRNP association is unrestricted. However, if protein dephosphorylation is prevented, an I-like complex forms with stoichiometric association of U2 snRNPs and the U2 snRNA is base-paired to the pre-mRNA. Complex I differs from complex A in that the formation of complex A is associated with the loss of U2AF65 and 35.
Affiliation(s)
- Li Chen
- University of Leicester, Leicester Institute for Structural and Chemical Biology and Department of Molecular and Cell Biology, Leicester LE1 9HN, UK
| | - Robert Weinmeister
- University of Leicester, Leicester Institute for Structural and Chemical Biology and Department of Molecular and Cell Biology, Leicester LE1 9HN, UK
| | - Jana Kralovicova
- University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
| | - Lucy P Eperon
- University of Leicester, Leicester Institute for Structural and Chemical Biology and Department of Molecular and Cell Biology, Leicester LE1 9HN, UK
| | - Igor Vorechovsky
- University of Southampton, Faculty of Medicine, Southampton SO16 6YD, UK
| | - Andrew J Hudson
- University of Leicester, Leicester Institute for Structural and Chemical Biology and Department of Chemistry, Leicester LE1 7RH, UK
| | - Ian C Eperon
- University of Leicester, Leicester Institute for Structural and Chemical Biology and Department of Molecular and Cell Biology, Leicester LE1 9HN, UK
| |
|
44
|
Brady LK, Wang H, Radens CM, Bi Y, Radovich M, Maity A, Ivan C, Ivan M, Barash Y, Koumenis C. Transcriptome analysis of hypoxic cancer cells uncovers intron retention in EIF2B5 as a mechanism to inhibit translation. PLoS Biol 2017; 15:e2002623. [PMID: 28961236 PMCID: PMC5636171 DOI: 10.1371/journal.pbio.2002623] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 03/31/2017] [Revised: 10/11/2017] [Accepted: 09/07/2017] [Indexed: 01/09/2023] Open
Abstract
Cells adjust to hypoxic stress within the tumor microenvironment by downregulating energy-consuming processes including translation. To delineate mechanisms of cellular adaptation to hypoxia, we performed RNA-Seq of normoxic and hypoxic head and neck cancer cells. These data revealed a significant downregulation of genes known to regulate RNA processing and splicing. Exon-level analyses classified > 1,000 mRNAs as alternatively spliced under hypoxia and uncovered a unique retained intron (RI) in the master regulator of translation initiation, EIF2B5. Notably, this intron was expressed in solid tumors in a stage-dependent manner. We investigated the biological consequence of this RI and demonstrate that its inclusion creates a premature termination codon (PTC) that leads to a 65 kDa truncated protein isoform that opposes full-length eIF2Bε to inhibit global translation. Furthermore, expression of 65 kDa eIF2Bε led to increased survival of head and neck cancer cells under hypoxia, providing evidence that this isoform enables cells to adapt to conditions of low oxygen. Additional work to uncover cis- and trans-regulators of EIF2B5 splicing identified several factors that influence intron retention in EIF2B5: a weak splicing potential at the RI, hypoxia-induced expression and binding of the splicing factor SRSF3, and increased binding of total and phospho-Ser2 RNA polymerase II specifically at the intron retained under hypoxia. Altogether, these data reveal differential splicing as a previously uncharacterized mode of translational control under hypoxia and are supported by a model in which hypoxia-induced changes to cotranscriptional processing lead to selective retention of a PTC-containing intron in EIF2B5.
Affiliation(s)
- Lauren K. Brady
- Department of Radiation Oncology Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Cellular and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States of America
| | - Hejia Wang
- Department of Biochemistry and Molecular Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States of America
| | - Caleb M. Radens
- Cellular and Molecular Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States of America
| | - Yue Bi
- Department of Radiation Oncology Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Oncology Center, Zhujiang Hospital of Southern Medical University, Guangzhou, Guangdong, China
| | - Milan Radovich
- Indiana University Health Precision Genomics Program, Indianapolis, Indiana, United States of America
- Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, Indiana, United States of America
| | - Amit Maity
- Department of Radiation Oncology Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Cristina Ivan
- Center for RNA Interference and Non-coding RNAs, The University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
| | - Mircea Ivan
- Department of Medicine, Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, Indiana, United States of America
| | - Yoseph Barash
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States of America
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, United States of America
| | - Constantinos Koumenis
- Department of Radiation Oncology Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| |
|
45
|
Relatively frequent switching of transcription start sites during cerebellar development. BMC Genomics 2017; 18:461. [PMID: 28610618 PMCID: PMC5470264 DOI: 10.1186/s12864-017-3834-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/11/2017] [Accepted: 05/31/2017] [Indexed: 11/10/2022] Open
Abstract
Background: Alternative transcription start site (TSS) usage plays important roles in transcriptional control of mammalian gene expression. The growing interest in alternative TSSs and their role in genome diversification spawned many single-gene studies on differential usage of tissue-specific or temporal-specific alternative TSSs. However, genome-wide exploration of switching among alternative TSSs, especially in the central nervous system, is largely lacking. Results: In this study, we prepared a unique set of time-course data for the developing cerebellum, as part of the FANTOM5 consortium (http://fantom.gsc.riken.jp/5/), which uses its innovative capturing of 5' ends of all transcripts followed by Helicos next generation sequencing. We analyzed the usage of all transcription start sites (TSSs) at each time point during cerebellar development, which provided information on multiple RNA isoforms that emerged from the same gene. We developed a mathematical method that systematically compares the expression of different TSSs of a gene to identify temporal crossover and non-crossover switching events. We identified 48,489 novel TSS switching events in 5433 genes during cerebellar development. This includes 9767 crossover TSS switching events in 1511 genes, where the dominant TSS shifts over time. Conclusions: We observed a relatively high prevalence of TSS switching in cerebellar development, where the resulting temporally specific gene transcripts and protein products can play important regulatory and functional roles.
|
46
|
Luo Y, Ahmad FS, Shah SJ. Tensor Factorization for Precision Medicine in Heart Failure with Preserved Ejection Fraction. J Cardiovasc Transl Res 2017; 10:305-312. [PMID: 28116551 PMCID: PMC5515683 DOI: 10.1007/s12265-016-9727-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 11/21/2016] [Accepted: 12/23/2016] [Indexed: 02/07/2023]
Abstract
Heart failure with preserved ejection fraction (HFpEF) is a heterogeneous clinical syndrome that may benefit from improved subtyping in order to better characterize its pathophysiology and to develop novel targeted therapies. The United States Precision Medicine Initiative comes amid the rapid growth in quantity and modality of clinical data for HFpEF patients ranging from deep phenotypic to trans-omic data. Tensor factorization, a form of machine learning, allows for the integration of multiple data modalities to derive clinically relevant HFpEF subtypes that may have significant differences in underlying pathophysiology and differential response to therapies. Tensor factorization also allows for better interpretability by supporting dimensionality reduction and identifying latent groups of data for meaningful summarization of both features and disease outcomes. In this narrative review, we analyze the modest literature on the application of tensor factorization to related biomedical fields including genotyping and phenotyping. Based on the cited work including work of our own, we suggest multiple tensor factorization formulations capable of integrating the deep phenotypic and trans-omic modalities of data for HFpEF, or accounting for interactions between genetic variants at different omic hierarchies. We encourage extensive experimental studies to tackle challenges in applying tensor factorization for precision medicine in HFpEF, including effectively incorporating existing medical knowledge, properly accounting for uncertainty, and efficiently enforcing sparsity for better interpretability.
Affiliation(s)
- Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 11th Floor, Arthur Rubloff Building, 750 N. Lake Shore Drive, Chicago, IL, 60611, USA
- Faraz S Ahmad
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, 11th Floor, Arthur Rubloff Building, 750 N. Lake Shore Drive, Chicago, IL, 60611, USA
- Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Sanjiv J Shah
- Division of Cardiology, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
|
47
|
Tian C, Tan S, Bao L, Zeng Q, Liu S, Yang Y, Zhong X, Liu Z. DExD/H-box RNA helicase genes are differentially expressed between males and females during the critical period of male sex differentiation in channel catfish. Comp Biochem Physiol Part D Genomics Proteomics 2017; 22:109-119. [DOI: 10.1016/j.cbd.2017.02.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/08/2016] [Revised: 02/21/2017] [Accepted: 02/24/2017] [Indexed: 01/19/2023]
|
48
|
Zhang X, Wu Y, Cai F, Liu S, Bromley-Brits K, Xia K, Song W. A Novel Alzheimer-Associated SNP in Tmp21 Increases Amyloidogenesis. Mol Neurobiol 2017; 55:1862-1870. [PMID: 28233271 DOI: 10.1007/s12035-017-0459-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 01/05/2017] [Accepted: 02/13/2017] [Indexed: 10/20/2022]
Abstract
Recent studies suggest that TMP21 is a selective modulator of γ-secretase and that its dysregulation affects APP processing, leading to increased Aβ generation. However, the genetic association between Tmp21 and Alzheimer's disease (AD) remains elusive. In this study, we found that a novel single-nucleotide polymorphism (SNP), rs12435391 (IVS4-28T>C), in intron 4 of Tmp21 was genetically associated with AD. Allele C of rs12435391 did not affect splice site recognition, but it significantly increased TMP21 gene expression. Neither the stability of Tmp21 pre-mRNA nor the transcription of Tmp21 was affected by allele C. However, allele C significantly increased the splicing efficiency of Tmp21 pre-mRNA, leading to elevated levels of mature mRNA. Furthermore, allele C significantly reduced the C83 level and increased Aβ generation. Taken together, our study suggests that TMP21 is genetically associated with Alzheimer's disease, with the novel Tmp21 SNP as a risk factor for Alzheimer's pathogenesis.
Affiliation(s)
- Xiaojie Zhang
- Townsend Family Laboratories, Department of Psychiatry, The University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 1Z3, Canada
- Yili Wu
- Townsend Family Laboratories, Department of Psychiatry, The University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 1Z3, Canada
- Fang Cai
- Townsend Family Laboratories, Department of Psychiatry, The University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 1Z3, Canada
- Shengchun Liu
- Department of Surgery, The First Affiliated Hospital, Chongqing Medical University, 1 Friendship Road, Yuzhong District, Chongqing, 410006, China
- Kelley Bromley-Brits
- Townsend Family Laboratories, Department of Psychiatry, The University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 1Z3, Canada
- Kun Xia
- The State Key Lab of Medical Genetics of China, School of Life Sciences, Central South University, Changsha, 410000, China
- Weihong Song
- Townsend Family Laboratories, Department of Psychiatry, The University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 1Z3, Canada
|
49
|
Taggart AJ, Lin CL, Shrestha B, Heintzelman C, Kim S, Fairbrother WG. Large-scale analysis of branchpoint usage across species and cell lines. Genome Res 2017; 27:639-649. [PMID: 28119336 PMCID: PMC5378181 DOI: 10.1101/gr.202820.115] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 12/21/2015] [Accepted: 01/18/2017] [Indexed: 02/03/2023]
Abstract
The coding sequence of each human pre-mRNA is interrupted, on average, by 11 introns that must be spliced out for proper gene expression. Each intron contains three obligate signals: a 5′ splice site, a branch site, and a 3′ splice site. Splice site usage has been mapped exhaustively across different species, cell types, and cellular states. In contrast, only a small fraction of branch sites have been identified even once. The few reported branch site annotations are imprecise because reverse transcriptase skips several nucleotides while traversing a 2′–5′ linkage. Here, we report large-scale mapping of the branchpoints from deep sequencing data in three different species and in the SF3B1 K700E oncogenic mutant background. We have developed a novel method whereby raw lariat reads are refined by U2snRNP/pre-mRNA base-pairing models to return the largest current data set of branchpoint sequences with quality metrics. This analysis discovers novel modes of U2snRNA:pre-mRNA base-pairing conserved in yeast and provides insight into the biogenesis of intron circles. Finally, matching branch site usage with isoform selection across the extensive panel of ENCODE RNA-seq data sets offers insight into the mechanisms by which branchpoint usage drives alternative splicing.
Affiliation(s)
- Allison J Taggart
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- Chien-Ling Lin
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- Barsha Shrestha
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- Claire Heintzelman
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- Seongwon Kim
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- William G Fairbrother
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island 02912, USA
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912, USA
- Hassenfeld Child Health Innovation Institute of Brown University, Providence, Rhode Island 02912, USA
|
50
|
Alagappan U, Pramono ZAD, Chong WS. Ferrochelatase gene mutation in Singapore and a novel frame-shift mutation in an Asian boy with erythropoietic protoporphyria. Int J Dermatol 2017; 56:272-276. [PMID: 28054335 DOI: 10.1111/ijd.13418] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/17/2015] [Revised: 05/17/2016] [Accepted: 06/06/2016] [Indexed: 11/28/2022]
Abstract
BACKGROUND: Erythropoietic protoporphyria (EPP) is a rare inherited disorder of heme biosynthesis caused by decreased activity of the enzyme ferrochelatase (FECH). The frequency of the hypomorphic c.333-48C allele in a population directly contributes to the prevalence of EPP in the same population. This study sought to identify the molecular basis of EPP in a Chinese patient from Singapore and the c.333-48C allele frequency among the Chinese population in Singapore. MATERIALS AND METHODS: The FECH gene was screened for mutations in the patient's DNA sample by polymerase chain reaction amplification and DNA sequencing. To validate the identified mutation, the FECH region harboring the mutation was screened in DNA samples from all healthy controls. One patient and 46 ethnically matched healthy controls were included in the study. RESULTS: A novel c.474dupC variant, which leads to a frameshift and premature stop codon, was identified in one allele of the patient's FECH gene, while the other allele was shown to carry the c.333-48C and c.337C>T variants. The frequency of the c.333-48C hypomorphic allele is 27% among the Chinese population in Singapore. CONCLUSIONS: c.474dupC in one allele, in trans to the hypomorphic allele carrying c.333-48C and c.337C>T in the FECH gene, may be the underlying cause of clinical EPP in the studied patient. The FECH hypomorphic c.333-48C allele frequency in Singapore is lower than in the Han Chinese (41.3%) and Japanese (43%) populations but nearly the same as in the Southeast Asian (31%) population and higher than in the European (2.7-11%) population.
Affiliation(s)
- Uma Alagappan
- Department of Dermatology, Changi General Hospital, Singapore
|