1
Pham M, Hoffmann HH, Kurtti TJ, Chana R, Garcia-Cruz O, Aliabadi S, Gulia-Nuss M. Validation of heat-inducible Ixodes scapularis HSP70 and tick-specific 3xP3 promoters in ISE6 cells. iScience 2024; 27:110468. [PMID: 39139404 PMCID: PMC11321315 DOI: 10.1016/j.isci.2024.110468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/18/2024] [Accepted: 07/03/2024] [Indexed: 08/15/2024] Open
Abstract
Ixodes scapularis is an important vector of many pathogens, including the causative agent of Lyme disease. The gene function studies in I. scapularis and other ticks are hampered by the lack of genetic tools, including an inducible promoter for temporal control over transgene-encoding protein or double-stranded RNA. We characterized an intergenic sequence upstream of a heat shock protein 70 (HSP70) gene that can drive Renilla luciferase and mCherry expression in the I. scapularis cell line ISE6 (IsHSP70). In another construct, we replaced the Drosophila melanogaster minimal HSP70 promoter of the 3xP3 promoter with a minimal portion of IsHSP70 promoter and generated an I. scapularis-specific 3xP3 (Is3xP3) promoter. Both IsHSP70 and Is3xP3 have a heat-inducible expression of mCherry fluorescence in ISE6 cells with an approximately 10-fold increase in the percentage of fluorescent cells upon 2 h heat shock. These promoters described will be valuable tools for gene function studies.
Affiliation(s)
- Michael Pham
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Hans-Heinrich Hoffmann
- Laboratory of Virology and Infectious Diseases, Rockefeller University, New York City, NY, USA
- Randeep Chana
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Omar Garcia-Cruz
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Simindokht Aliabadi
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Monika Gulia-Nuss
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
2
Fayad MA, Charles S, Shelvy S, Sheeja TE, Sangeetha K, Angadi UB, Tandon G, Iquebal MA, Jaiswal S, Kumar D. Whole genome based identification of BAHD acyltransferase gene involved in piperine biosynthetic pathway in black pepper. J Biomol Struct Dyn 2024:1-13. [PMID: 38344997 DOI: 10.1080/07391102.2024.2313164] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/20/2023] [Accepted: 01/25/2024] [Indexed: 03/08/2025]
Abstract
Black pepper (Piper nigrum L.), a crop of the genus Piper, is an important spice that has both economic and ecological significance. It is widely regarded as the "King of Spices" because of its pungency, attributed to the presence of piperine. BAHD acyl transferase, the crucial enzyme involved in the final step in piperine biosynthesis was the focus of our study and the aim was to identify the candidate isoform involved in biosynthesis of piperine. Reference genome-based analysis of black pepper identified six BAHD-AT isoforms and mapping of these sequences revealed that the isoforms were situated on six distinct chromosomes. By using specific primers for each of these transcripts, qPCR analysis was done in different tissues as well as berry stages to obtain detectable amplification products. Expression profiles of isoforms from chromosome 6 correlated well with piperine content compared to other five isoforms, across tissues and was therefore assumed to be involved in biosynthesis of piperine. In addition to this, we could also identify the binding sites of MYB transcription factor in the cis-regulatory regions of the isoforms. We also used in-silico docking and molecular dynamics simulation to calculate the binding free energy of the ligand and confirmed that among all the isoforms, BAHD-AT from chromosome 6 had the lowest free binding energy and highest affinity towards the ligand. Our findings are expected to aid the identification of new genes connecting enzymes involved in the biosynthetic pathway of piperine, which will have major implications for future research in metabolic engineering.
Affiliation(s)
- M A Fayad
- ICAR - Indian Institute of Spices Research, Kozhikode, Kerala, India
- Sona Charles
- ICAR - Indian Institute of Spices Research, Kozhikode, Kerala, India
- S Shelvy
- ICAR - Indian Institute of Spices Research, Kozhikode, Kerala, India
- T E Sheeja
- ICAR - Indian Institute of Spices Research, Kozhikode, Kerala, India
- K Sangeetha
- ICAR - Indian Institute of Spices Research, Kozhikode, Kerala, India
- U B Angadi
- ICAR - Indian Agricultural Statistics Research Institute, New Delhi, India
- Gitanjali Tandon
- ICAR - Indian Agricultural Statistics Research Institute, New Delhi, India
- Mir Asif Iquebal
- ICAR - Indian Agricultural Statistics Research Institute, New Delhi, India
- Sarika Jaiswal
- ICAR - Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
- ICAR - Indian Agricultural Statistics Research Institute, New Delhi, India
3
Pham M, Hoffmann HH, Kurtti TJ, Chana R, Garcia-Cruz O, Aliabadi S, Gulia-Nuss M. Validation of a heat-inducible Ixodes scapularis HSP70 promoter and developing a tick-specific 3xP3 promoter sequence in ISE6 cells. bioRxiv 2023:2023.11.29.569248. [PMID: 38076872 PMCID: PMC10705397 DOI: 10.1101/2023.11.29.569248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/21/2023]
Abstract
Ixodes scapularis is an important vector of many pathogens, including the causative agent of Lyme disease, tick-borne encephalitis, and anaplasmosis. The study of gene function in I. scapularis and other ticks has been hampered by the lack of genetic tools, such as an inducible promoter to permit temporal control over transgenes encoding protein or double-stranded RNA expression. Studies of vector-pathogen relationships would also benefit from the capability to activate anti-pathogen genes at different times during pathogen infection and dissemination. We have characterized an intergenic sequence upstream of the heat shock protein 70 (HSP70) gene that can drive Renilla luciferase expression and mCherry fluorescence in the I. scapularis cell line ISE6. In another construct, we replaced the Drosophila melanogaster minimal HSP70 promoter in the synthetic 3xP3 promoter with a minimal portion of the I. scapularis HSP70 promoter and generated an I. scapularis specific 3xP3 (Is3xP3) promoter. Both promoter constructs, IsHSP70 and Is3xP3, allow for heat-inducible expression of mCherry fluorescence in ISE6 cells with an approximately 10-fold increase in the percentage of fluorescent positive cells upon exposure to a 2 h heat shock. These promoters described here will be valuable tools for gene function studies and temporal control of gene expression, including anti-pathogen genes.
Affiliation(s)
- Michael Pham
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, USA
- Randeep Chana
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, USA
- Omar Garcia-Cruz
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, USA
- Simindokht Aliabadi
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, USA
- Monika Gulia-Nuss
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, USA
4
Zaytsev K, Fedorov A, Korotkov E. Classification of Promoter Sequences from Human Genome. Int J Mol Sci 2023; 24:12561. [PMID: 37628742 PMCID: PMC10454140 DOI: 10.3390/ijms241612561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 07/28/2023] [Accepted: 08/03/2023] [Indexed: 08/27/2023] Open
Abstract
We have developed a new method for promoter sequence classification based on a genetic algorithm and the MAHDS sequence alignment method. We have created four classes of human promoters, combining 17,310 sequences out of the 29,598 present in the EPD database. We searched the human genome for potential promoter sequences (PPSs) using dynamic programming and position weight matrices representing each of the promoter sequence classes. A total of 3,065,317 potential promoter sequences were found. Only 1,241,206 of them were located in unannotated parts of the human genome. Every other PPS found intersected with either true promoters, transposable elements, or interspersed repeats. We found a strong intersection between PPSs and Alu elements as well as transcript start sites. The number of false positive PPSs is estimated to be 3 × 10⁻⁸ per nucleotide, which is several orders of magnitude lower than for any other promoter prediction method. The developed method can be used to search for PPSs in various eukaryotic genomes.
Affiliation(s)
- Konstantin Zaytsev
- Bach Institute of Biochemistry, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
- Alexey Fedorov
- Bach Institute of Biochemistry, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
- Eugene Korotkov
- Institute of Bioengineering, Federal Research Center of Biotechnology of the Russian Academy of Sciences, 119071 Moscow, Russia
5
Jia Y, Huang C, Mao Y, Zhou S, Deng Y. Screening and Constructing a Library of Promoter-5'-UTR Complexes with Gradient Strength in Pediococcus acidilactici. ACS Synth Biol 2023; 12:1794-1803. [PMID: 37172276 DOI: 10.1021/acssynbio.3c00067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2023]
Abstract
The GRAS (generally recognized as safe) strain Pediococcus acidilactici is well known for its antibacterial and probiotic functions. Furthermore, as P. acidilactici has excellent high temperature and salt resistance, it is an ideal host for the production of food enzymes, food additives, and pharmaceuticals. In this regard, it is desirable and feasible to enhance the production of these products through the metabolic engineering of P. acidilactici. However, the rare gene expression elements greatly obstruct the development of engineering P. acidilactici. In this study, we screened and constructed a library of promoter-5'-UTR (PUTR) complexes in P. acidilactici DY15 for regulating gene expression at the transcription and translation levels. In the post-log phase, the mRNA and protein expression level ranges of the 90 screened native PUTRs were 0.059-2010% and 0.77-245%, respectively, of the P32 promoter. Besides, several PUTRs exhibited great expression stability under high temperature, salt, and ethanol stress. We analyzed the structure of PUTRs and obtained the conserved regions of the promoter and 5'-UTR. Based on the identified core regions of PUTRs, we constructed a panel of combinatorial PUTRs with higher and stable protein expression levels. The strongest combinatorial PUTR was 853% of the P32 promoter in the protein expression level. Finally, the obtained PUTRs were applied to optimize the expression level of aminotransferase and improve the phenyllactic acid (PLA) production in P. acidilactici DY15. The achieved yield was 950.6 mg/L, which was 79.2% higher than the wild-type strain. These results indicated that the obtained PUTRs with gradient strength had great potential for precisely regulating gene expression to achieve various goals in P. acidilactici.
Affiliation(s)
- Yize Jia
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Chao Huang
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Yin Mao
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Shenghu Zhou
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Yu Deng
- National Engineering Research Center of Cereal Fermentation and Food Biomanufacturing, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
- Jiangsu Provincial Research Center for Bioactive Product Processing Technology, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
6
Lee M, Heo YB, Woo HM. Cytosine base editing in cyanobacteria by repressing archaic Type IV uracil-DNA glycosylase. Plant J 2023; 113:610-625. [PMID: 36565011 DOI: 10.1111/tpj.16074] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/19/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
Base editing enables precise gene editing without requiring donor DNA or double-stranded breaks. To facilitate base editing tools, a uracil DNA glycosylase inhibitor (UGI) was fused to cytidine deaminase-Cas nickase to inhibit uracil DNA glycosylase (UDG). Herein, we revealed that the bacteriophage PBS2-derived UGI of the cytosine base editor (CBE) could not inhibit archaic Type IV UDG in oligoploid cyanobacteria. To overcome the limitation of the CBE, dCas12a-assisted gene repression of the udg allowed base editing at the desired targets with up to 100% mutation frequencies, and yielded correct phenotypes of desired mutants in cyanobacteria. Compared with the original CBE (BE3), base editing was analyzed within a broader C4-C16 window with a strong TC-motif preference. Using multiplexed CyanoCBE, while udg was repressed, simultaneous base editing at two different sites was achieved with lower mutation frequencies than single CBE. Our discovery of a Type IV UDG that is not inhibited by the UGI of the CBE in cyanobacteria and the development of dCas12a-mediated base editing should facilitate the application of base editing not only in cyanobacteria, but also in archaea and green algae that possess Type IV UDGs. We revealed the bacteriophage-derived UGI of the base editor did not repress Type IV UDG in cyanobacteria. To overcome the limitation, orthogonal dCas12a interference was successfully applied to repress the UDG gene expression in cyanobacteria during base editing occurred, yielding a premature translational termination at desired targets. This study will open a new opportunity to perform base editing with Type IV UDGs in archaea and green algae.
Affiliation(s)
- Mieun Lee
- Department of Food Science and Biotechnology, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- BioFoundry Research Center, Institute of Biotechnology and Bioengineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- Yu Been Heo
- Department of Food Science and Biotechnology, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- BioFoundry Research Center, Institute of Biotechnology and Bioengineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- Han Min Woo
- Department of Food Science and Biotechnology, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- BioFoundry Research Center, Institute of Biotechnology and Bioengineering, Sungkyunkwan University (SKKU), 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
7
Motif and conserved module analysis in DNA (promoters, enhancers) and RNA (lncRNA, mRNA) using AIModules. Sci Rep 2022; 12:17588. [PMID: 36266399 PMCID: PMC9584888 DOI: 10.1038/s41598-022-21732-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 09/30/2022] [Indexed: 01/13/2023] Open
Abstract
Nucleic acid motifs consist of conserved and variable nucleotide regions. For functional action, several motifs are combined to modules. The tool AIModules allows identification of such motifs including combinations of them and conservation in several nucleic acid stretches. AIModules recognizes conserved motifs and combinations of motifs (modules) allowing a number of interesting biological applications such as analysis of promoter and transcription factor binding sites (TFBS), identification of conserved modules shared between several gene families, e.g. promoter regions, but also analysis of shared and conserved other DNA motifs such as enhancers and silencers, in mRNA (motifs or regulatory elements e.g. for polyadenylation) and lncRNAs. The tool AIModules presented here is an integrated solution for motif analysis, offered as a Web service as well as downloadable software. Several nucleotide sequences are queried for TFBSs using predefined matrices from the JASPAR DB or by using one's own matrices for diverse types of DNA or RNA motif discovery. Furthermore, AIModules can find TFBSs common to two or more sequences. Demanding high or low conservation, AIModules outperforms other solutions in speed and finds more modules (specific combinations of TFBS) than alternative available software. The application also searches RNA motifs such as polyadenylation site or RNA-protein binding motifs as well as DNA motifs such as enhancers as well as user-specified motif combinations (https://bioinfo-wuerz.de/aimodules/; alternative entry pages: https://aimodules.heinzelab.de or https://www.biozentrum.uni-wuerzburg.de/bioinfo/computing/aimodules). The application is free and open source whether used online, on-site, or locally.
8
Zhou J, Zhang B, Li H, Zhou L, Li Z, Long Y, Han W, Wang M, Cui H, Li J, Chen W, Gao X. Annotating TSSs in Multiple Cell Types Based on DNA Sequence and RNA-seq Data via DeeReCT-TSS. Genomics Proteomics Bioinformatics 2022; 20:959-973. [PMID: 36528241 PMCID: PMC10025762 DOI: 10.1016/j.gpb.2022.11.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/16/2022] [Revised: 10/21/2022] [Accepted: 11/24/2022] [Indexed: 12/23/2022]
Abstract
The accurate annotation of transcription start sites (TSSs) and their usage are critical for the mechanistic understanding of gene regulation in different biological contexts. To fulfill this, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner, and various computational tools have also been developed for in silico prediction of TSSs solely based on genomic sequences. Most of these computational tools cast the problem as a binary classification task on a balanced dataset, thus resulting in drastic false positive predictions when applied on the genome scale. Here, we present DeeReCT-TSS, a deep learning-based method that is capable of identifying TSSs across the whole genome based on both DNA sequence and conventional RNA sequencing data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotations on 10 cell types, which enables the identification of cell type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets by correlating our predicted TSSs with experimentally defined TSS chromatin states. The source code for DeeReCT-TSS is available at https://github.com/JoshuaChou2018/DeeReCT-TSS_release and https://ngdc.cncb.ac.cn/biocode/tools/BT007316.
Affiliation(s)
- Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Bin Zhang
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Haoyang Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Longxi Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Yongkang Long
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Mengran Wang
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Huanhuan Cui
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China
- Jingjing Li
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Wei Chen
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China
- Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
9
BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection. Comput Biol Chem 2022; 99:107732. [PMID: 35863177 DOI: 10.1016/j.compbiolchem.2022.107732] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 06/16/2022] [Accepted: 07/12/2022] [Indexed: 02/01/2023]
Abstract
A promoter is a sequence of DNA that initiates the process of transcription and regulates when and where genes are expressed in the organism. Because of its importance in molecular biology, identifying DNA promoters is a challenging task that can provide useful information about their functions and related diseases. Several computational models have been developed over the past decade to predict promoters from high-throughput sequencing data. Although some useful predictors have been proposed, shortfalls remain in those models, and there is an urgent need to enhance predictive performance to meet practical requirements. In this study, we proposed a novel architecture that incorporated transformer natural language processing (NLP) and explainable machine learning to address this problem. More specifically, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model was employed to encode DNA sequences, and SHapley Additive exPlanations (SHAP) analysis served as a feature selection step to identify the top-ranked BERT encodings. At the last stage, different machine learning classifiers were implemented to learn the top features and produce the prediction outcomes. This study predicted not only the DNA promoters but also their activities (strong or weak promoters). Overall, several experiments showed an accuracy of 85.5% and 76.9% for these two levels, respectively. Our performance was superior to previously published predictors on the same dataset in most measurement metrics. We named our predictor BERT-Promoter, and it is freely available at https://github.com/khanhlee/bert-promoter.
10
Zhang M, Jia C, Li F, Li C, Zhu Y, Akutsu T, Webb GI, Zou Q, Coin LJM, Song J. Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction. Brief Bioinform 2022; 23:6502561. [PMID: 35021193 PMCID: PMC8921625 DOI: 10.1093/bib/bbab551] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/09/2021] [Revised: 11/12/2021] [Accepted: 11/30/2021] [Indexed: 01/13/2023] Open
Abstract
Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning-based approaches generally outperformed scoring function-based approaches. Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.
Affiliation(s)
| | - Cangzhi Jia
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | | | | | | | | | - Geoffrey I Webb
- Department of Data Science and Artificial Intelligence, Monash University, Melbourne, VIC 3800, Australia,Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Quan Zou
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | - Lachlan J M Coin
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | - Jiangning Song
| |
|
11
|
Xia H, Yu B, Jiang Y, Cheng R, Lu X, Wu H, Zhu B. Psychrophilic phage VSW-3 RNA polymerase reduces both terminal and full-length dsRNA byproducts in in vitro transcription. RNA Biol 2022; 19:1130-1142. [PMID: 36299232 PMCID: PMC9624206 DOI: 10.1080/15476286.2022.2139113] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/03/2022] [Accepted: 10/17/2022] [Indexed: 10/31/2022] Open
Abstract
RNA research and applications are underpinned by in vitro transcription (IVT), but RNA impurities resulting from the enzymatic reagents severely impede downstream applications. To improve the stability and purity of synthesized RNA, we have characterized a novel single-subunit RNA polymerase (RNAP) encoded by the psychrophilic phage VSW-3 from a plateau lake. The VSW-3 RNAP is capable of carrying out in vitro RNA synthesis at low temperatures (4-25°C). Compared to routinely used T7 RNAP, VSW-3 RNAP provides a similar yield of transcripts but is insensitive to class II transcription terminators and synthesizes RNA without redundant 3'-cis extensions. More importantly, through dot-blot detection with the J2 monoclonal antibody, we found that the RNA products synthesized by VSW-3 RNAP contained a much lower amount of double-stranded RNA byproducts (dsRNA), which are produced by transcription from both directions and are significant in T7 RNAP IVT products. Taken together, the VSW-3 RNAP almost eliminates both terminal loop-back dsRNA and full-length dsRNA in IVT and thus is especially advantageous for producing RNA for in vivo use.
Affiliation(s)
- Heng Xia
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Bingbing Yu
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Yixin Jiang
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Rui Cheng
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Xueling Lu
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Hui Wu
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Bin Zhu
- Key Laboratory of Molecular Biophysics, the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Shenzhen Huazhong University of Science and Technology Research Institute, Shenzhen, China
| |
|
12
|
Abstract
Identification of promoter sequences in the eukaryotic genome by computational methods is an important task in bioinformatics. However, this problem remains unsolved, since the best algorithms have a false positive probability of 10⁻³–10⁻⁴ per nucleotide. As a result, a full-genome analysis may yield more false positives than annotated gene promoters. The false positive probability should be reduced to 10⁻⁶–10⁻⁸ to reduce the number of false positives and increase the reliability of the prediction. A method for multiple alignment of promoter sequences was developed. Then, mathematical methods were developed for calculating the statistically important classes of promoter sequences. Five promoter classes were created from the rice genome. We used these classes to search for potential promoter sequences in the rice genome with fewer than 10⁻⁸ false positives per nucleotide. The five classes of promoter sequences contain 1740, 222, 199, 167 and 130 promoters, respectively. A total of 145,277 potential promoter sequences (PPSs) were identified. Of these, 18,563 are promoters of known genes, 87,233 PPSs intersect with transposable elements, and 37,390 PPSs were found in previously unannotated sequences. The number of false positives for a randomly shuffled rice genome is less than 10⁻⁸ per nucleotide. The method developed for detecting PPSs was compared with some previously used approaches. The developed mathematical method can be used to search for genes, transposable elements, and transcription start sites in eukaryotic genomes.
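To make the per-nucleotide rates in this abstract concrete, the short Python sketch below converts a false positive probability into an expected count for a single pass over a genome. The genome length of 374 Mb is an assumption used only for illustration (it is not stated in this abstract), and the function name is hypothetical.

```python
def expected_false_positives(rate_per_nt: float, genome_length_nt: int) -> float:
    """Expected number of false positives when every position is tested once."""
    return rate_per_nt * genome_length_nt


# Assumed genome length for illustration only (not taken from this abstract).
GENOME_NT = 374_000_000

for rate in (1e-3, 1e-4, 1e-6, 1e-8):
    count = expected_false_positives(rate, GENOME_NT)
    print(f"rate {rate:.0e} per nt -> about {count:,.2f} expected false positives")
```

Under this assumed length, a rate of 10⁻³ per nucleotide corresponds to roughly 374,000 expected false positives, whereas 10⁻⁸ corresponds to fewer than four, which is why the lower rate is needed for genome-wide scans.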
|
13
|
Alvelos MI, Brüggemann M, Sutandy FXR, Juan-Mateu J, Colli ML, Busch A, Lopes M, Castela Â, Aartsma-Rus A, König J, Zarnack K, Eizirik DL. The RNA-binding profile of the splicing factor SRSF6 in immortalized human pancreatic β-cells. Life Sci Alliance 2021; 4:e202000825. [PMID: 33376132 PMCID: PMC7772782 DOI: 10.26508/lsa.202000825] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/22/2020] [Revised: 12/15/2020] [Accepted: 12/15/2020] [Indexed: 12/16/2022] Open
Abstract
In pancreatic β-cells, the expression of the splicing factor SRSF6 is regulated by GLIS3, a transcription factor encoded by a diabetes susceptibility gene. SRSF6 down-regulation promotes β-cell demise through splicing dysregulation of genes central to β-cell function and survival, but how RNAs are targeted by SRSF6 remains poorly understood. Here, we define the SRSF6 binding landscape in the human pancreatic β-cell line EndoC-βH1 by integrating individual-nucleotide resolution UV cross-linking and immunoprecipitation (iCLIP) under basal conditions with RNA sequencing after SRSF6 knockdown. We detect thousands of SRSF6 binding sites in coding sequences. Motif analyses suggest that SRSF6 specifically recognizes a purine-rich consensus motif consisting of GAA triplets and that the number of contiguous GAA triplets correlates with increasing binding site strength. The SRSF6 positioning determines the splicing fate. In line with its role in β-cell function, we identify SRSF6 binding sites on regulated exons in several diabetes susceptibility genes. In a proof-of-principle, the splicing of the susceptibility gene LMO7 is modulated by antisense oligonucleotides. Our present study unveils the splicing regulatory landscape of SRSF6 in immortalized human pancreatic β-cells.
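As a rough illustration of the motif feature described in this abstract, the following Python sketch finds the longest run of contiguous GAA triplets in a sequence. It is not the authors' iCLIP analysis; the function name and example sequences are hypothetical.

```python
import re


def longest_gaa_run(seq: str) -> int:
    """Return the largest number of back-to-back GAA triplets found in seq."""
    # Each match is one uninterrupted stretch of GAA repeats.
    runs = re.findall(r"(?:GAA)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical example sequences, not real binding sites.
for site in ("CCGAAGAAGAACC", "GAAGCGAAGAA", "CCCCUUUU"):
    print(site, longest_gaa_run(site))
```

A count like this could serve as one simple per-site covariate when comparing motif content with binding strength.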
Affiliation(s)
- Maria Inês Alvelos
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
| | - Mirko Brüggemann
- Buchman Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Frankfurt am Main, Germany
- Faculty of Biological Sciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | | | - Jonàs Juan-Mateu
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
- Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Maikel Luis Colli
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
| | - Anke Busch
- Institute of Molecular Biology gGmbH, Mainz, Germany
| | - Miguel Lopes
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
| | - Ângela Castela
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
| | | | - Julian König
- Institute of Molecular Biology gGmbH, Mainz, Germany
| | - Kathi Zarnack
- Buchman Institute for Molecular Life Sciences (BMLS), Goethe University Frankfurt, Frankfurt am Main, Germany
- Faculty of Biological Sciences, Goethe University Frankfurt, Frankfurt am Main, Germany
| | - Décio L Eizirik
- ULB Center for Diabetes Research, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
- Welbio, Medical Faculty, Université Libre de Bruxelles (ULB), Brussels, Belgium
- Indiana Biosciences Research Institute, Indianapolis, IN, USA
| |
|
14
|
Bhandari N, Khare S, Walambe R, Kotecha K. Comparison of machine learning and deep learning techniques in promoter prediction across diverse species. PeerJ Comput Sci 2021; 7:e365. [PMID: 33817015 PMCID: PMC7959599 DOI: 10.7717/peerj-cs.365] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/22/2020] [Accepted: 12/30/2020] [Indexed: 06/12/2023]
Abstract
Gene promoters are the key DNA regulatory elements positioned around the transcription start sites and are responsible for regulating the gene transcription process. Various alignment-based, signal-based and content-based approaches have been reported for the prediction of promoters. However, since not all promoter sequences show explicit features, the prediction performance of these techniques is poor. Therefore, many machine learning and deep learning models have been proposed for promoter prediction. In this work, we studied methods for vector encoding and promoter classification using genome sequences of three distinct higher eukaryotes, viz. yeast (Saccharomyces cerevisiae), A. thaliana (plant) and human (Homo sapiens). We compared the one-hot vector encoding method with frequency-based tokenization (FBT) for data pre-processing on a 1-D Convolutional Neural Network (CNN) model. We found that FBT gives a shorter input dimension, reducing the training time without affecting the sensitivity and specificity of classification. We employed deep learning techniques, mainly CNN and a recurrent neural network with Long Short Term Memory (LSTM), and a random forest (RF) classifier for promoter classification at k-mer sizes of 2, 4 and 8. We found CNN to be superior in classification of promoters from non-promoter sequences (binary classification) as well as species-specific classification of promoter sequences (multiclass classification). In summary, the contribution of this work lies in the use of a synthetic shuffled negative dataset and frequency-based tokenization for pre-processing. This study provides a comprehensive and generic framework for classification tasks in genomic applications and can be extended to various classification problems.
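The difference in input dimension between the two encodings mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: k-mers are taken without overlap, token ids are assigned by descending k-mer frequency, id 0 is reserved for unseen k-mers, and all function names are hypothetical.

```python
from collections import Counter

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode each base as a 4-element indicator vector (length-L sequence -> L x 4)."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


def kmers(seq, k):
    """Split seq into non-overlapping k-mers (an assumption made for this sketch)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]


def build_vocab(seqs, k):
    """Map k-mers to integer ids, most frequent first; ties are broken alphabetically."""
    counts = Counter(km for s in seqs for km in kmers(s, k))
    ranked = sorted(counts, key=lambda km: (-counts[km], km))
    return {km: idx + 1 for idx, km in enumerate(ranked)}


def tokenize(seq, k, vocab):
    """Replace each k-mer by its id; unseen k-mers map to 0."""
    return [vocab.get(km, 0) for km in kmers(seq, k)]


training = ["ACGTACGT", "ACGTTTTT"]  # toy sequences, not real promoters
vocab = build_vocab(training, k=4)
print(len(one_hot(training[0])) * 4)       # number of one-hot input values
print(len(tokenize(training[0], 4, vocab)))  # number of token input values
```

For the toy 8-nt sequence the one-hot input has 32 values while the token input has 2, which is the kind of reduction in input dimension the abstract credits for shorter training time.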
Affiliation(s)
- Nikita Bhandari
- Computer Science, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, MH, India
| | - Satyajeet Khare
- Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune, MH, India
| | - Rahee Walambe
- Symbiosis Centre for Applied Artificial Intelligence, Symbiosis International (Deemed University), Pune, Maharashtra, India
- Electronics and Telecommunication Dept, Symbiosis Institute of Technology, Pune, Maharashtra, India
| | - Ketan Kotecha
- Computer Science, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, MH, India
- Symbiosis Centre for Applied Artificial Intelligence, Symbiosis International (Deemed University), Pune, Maharashtra, India
| |
|
15
|
Pachganov S, Murtazalieva K, Zarubin A, Taran T, Chartier D, Tatarinova TV. Prediction of Rice Transcription Start Sites Using TransPrise: A Novel Machine Learning Approach. Methods Mol Biol 2021; 2238:261-274. [PMID: 33471337 DOI: 10.1007/978-1-0716-1068-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSS). In this paper, we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of TransPrise with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer with a graphics processing unit, the run time of TransPrise is 250 min on a 374 Mb genome. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all the necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and can be customized to predict TSS in any eukaryotic organism.
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
| | | | - Alexei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
| | | | - Duane Chartier
- International Center for Art Intelligence, Inc, Los Angeles, CA, USA
| | - Tatiana V Tatarinova
- Vavilov Institute of General Genetics, Moscow, Russia.
- Department of Biology, University of La Verne, La Verne, CA, USA.
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.
- Siberian Federal University, Krasnoyarsk, Russia.
| |
|
16
|
Bhattarai K, Bastola R, Baral B. Antibiotic drug discovery: Challenges and perspectives in the light of emerging antibiotic resistance. Advances in Genetics 2020; 105:229-292. [PMID: 32560788 DOI: 10.1016/bs.adgen.2019.12.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/26/2022]
Abstract
Amid a rising global threat of antimicrobial resistance, the returns on our huge investments and the high-throughput technologies deployed to rejuvenate the key therapeutic scaffolds against these rising superbugs have been diminishing severely. This has attracted worldwide attention, with increased consideration being given to the discovery of new chemical entities. Research has now shown that relatively tiny and simple microbes possess an enhanced capability of generating novel and diverse chemical constituents with substantial therapeutic potential. The use of these beneficial organisms could help in producing new chemical scaffolds that can suppress the spread of harmful superbugs. In this review, we focus on several appealing strategies employed for the generation of new chemical scaffolds. We also discuss some of the unresolved questions in the production of metabolites, metabolic profiling, and the serendipity of obtaining "hit molecules". However, because the biosynthetic pathways of the different classes of secondary metabolites are a vast topic, we have not discussed them here.
Affiliation(s)
- Keshab Bhattarai
- University of Tübingen, Tübingen, Germany; Center for Natural and Applied Sciences (CENAS), Kathmandu, Nepal
| | - Rina Bastola
- Spinal Cord Injury Association-Nepal (SCIAN), Pokhara, Nepal
| | - Bikash Baral
- Spinal Cord Injury Association-Nepal (SCIAN), Pokhara, Nepal.
| |
|
17
|
Gatherer D. Reflections on integrating bioinformatics into the undergraduate curriculum: The Lancaster experience. Biochemistry and Molecular Biology Education: A Bimonthly Publication of the International Union of Biochemistry and Molecular Biology 2020; 48:118-127. [PMID: 31793726 DOI: 10.1002/bmb.21320] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/30/2019] [Revised: 10/25/2019] [Accepted: 11/15/2019] [Indexed: 06/10/2023]
Abstract
Bioinformatics is an essential discipline for biologists. It also has a reputation of being difficult for those without a strong quantitative and computer science background. At Lancaster University, we have developed modules for the integration of bioinformatics skills training into our undergraduate biology degree portfolio. This article describes those modules, situating them in the context of the accumulated quarter century of literature on bioinformatics education. The constant evolution of bioinformatics as a discipline is emphasized, drawing attention to the continual necessity to revise and upgrade those skills being taught, even at undergraduate level. Our overarching aim is to equip students both with a portfolio of skills in the currently most essential bioinformatics tools and with the confidence to continue their own bioinformatics skills development at postgraduate or professional level.
Affiliation(s)
- Derek Gatherer
- Division of Biomedical and Life Sciences, Faculty of Health and Medicine, Lancaster University, Lancaster, UK
| |
|
18
|
Pachganov S, Murtazalieva K, Zarubin A, Sokolov D, Chartier DR, Tatarinova TV. TransPrise: a novel machine learning approach for eukaryotic promoter prediction. PeerJ 2019; 7:e7990. [PMID: 31695967 PMCID: PMC6827441 DOI: 10.7717/peerj.7990] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/09/2019] [Accepted: 10/04/2019] [Indexed: 02/01/2023] Open
Abstract
As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSS). In this paper we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. Our pipeline consists of two parts: a binary classifier runs first, and if a sequence is classified as TSS-containing, a regression step follows in which the precise location of the TSS is identified. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of the TransPrise classification and regression models with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer equipped with a graphics processing unit, the run time of TransPrise is 250 minutes on a 374 Mb genome. The Matthews correlation coefficient for TransPrise is 0.79, more than twice the 0.31 obtained for the TSSPlant classification models. This represents a high level of prediction accuracy. Additionally, the mean absolute error for the regression model is 29.19 nt, allowing for accurate prediction of TSS location. TransPrise was also tested in Homo sapiens, where the mean absolute error of the regression model was 47.986 nt. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and customizable to predict TSS in any eukaryotic organism.
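The two evaluation metrics quoted in this abstract are standard and can be computed as in the Python sketch below. It shows the usual definitions of the Matthews correlation coefficient (from a binary confusion matrix) and the mean absolute error (between predicted and annotated TSS positions); it is not the TransPrise evaluation code, and the example counts and positions are made up for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def mean_absolute_error(predicted, annotated) -> float:
    """Average absolute distance (in nt) between predicted and annotated positions."""
    return sum(abs(p - a) for p, a in zip(predicted, annotated)) / len(annotated)


# Hypothetical numbers, chosen only to exercise the functions.
print(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10))
print(mean_absolute_error([1010, 1985, 3050], [1000, 2000, 3000]))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect classification), which is the scale on which the reported 0.79 and 0.31 should be read.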
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
| | - Khalimat Murtazalieva
- Vavilov Institute for General Genetics, Moscow, Russia; Institute of Bioinformatics, Moscow, Russia
| | - Aleksei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
| | | | - Duane R Chartier
- International Center for Art Intelligence, Inc., Los Angeles, CA, United States of America
| | - Tatiana V Tatarinova
- Vavilov Institute for General Genetics, Moscow, Russia; Department of Biology, University of La Verne, La Verne, CA, United States of America; A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Siberian Federal University, Krasnoyarsk, Russia
| |
|
19
|
Promoter analysis and prediction in the human genome using sequence-based deep learning models. Bioinformatics 2019; 35:2730-2737. [DOI: 10.1093/bioinformatics/bty1068] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 10/30/2018] [Revised: 12/03/2018] [Accepted: 12/27/2018] [Indexed: 12/14/2022] Open
Abstract
Motivation
Computational identification of promoters is notoriously difficult, as human genes often have unique promoter sequences that regulate transcription and interact with the transcription initiation complex. While there have been many attempts to develop computational promoter identification methods, there is still no reliable tool for analyzing long genomic sequences.
Results
In this work, we further develop our deep learning approach, which was relatively successful at discriminating short promoter and non-promoter sequences. Instead of focusing on classification accuracy, we predict the exact positions of the transcription start site inside genomic sequences by testing every possible location. We studied human promoters to find effective regions for discrimination and built corresponding deep learning models. These models use an adaptively constructed negative set, which iteratively improves the model’s discriminative ability. Our method significantly outperforms previously developed promoter prediction programs by considerably reducing the number of false-positive predictions. We have achieved an error-per-1000-bp rate of 0.02 and 0.31 errors per correct prediction, which is significantly better than the results of other human promoter predictors.
Availability and implementation
The developed method is available as a web server at http://www.cbrc.kaust.edu.sa/PromID/.
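The two error figures reported in the Results can be read as simple ratios. The Python sketch below assumes that "errors" means false-positive TSS predictions, that the first figure normalizes them by the length of sequence scanned, and that the second divides them by the number of correct predictions; these definitions are an interpretation of the abstract, not taken from the paper, and the counts are invented for illustration.

```python
def errors_per_kb(false_positives: int, scanned_bp: int) -> float:
    """False-positive predictions per 1000 bp of scanned sequence."""
    return false_positives * 1000 / scanned_bp


def errors_per_correct(false_positives: int, true_positives: int) -> float:
    """False-positive predictions per correctly predicted site."""
    return false_positives / true_positives


# Invented counts, chosen only to show the arithmetic.
print(errors_per_kb(false_positives=2, scanned_bp=100_000))
print(errors_per_correct(false_positives=31, true_positives=100))
```

Reporting both ratios separates how often the scanner fires spuriously along the genome from how many spurious calls accompany each correct one.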
|
20
|
Honkanen S, Thamm A, Arteaga-Vazquez MA, Dolan L. Negative regulation of conserved RSL class I bHLH transcription factors evolved independently among land plants. eLife 2018; 7:38529. [PMID: 30136925 PMCID: PMC6141232 DOI: 10.7554/elife.38529] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/21/2018] [Accepted: 08/22/2018] [Indexed: 12/16/2022] Open
Abstract
Basic helix-loop-helix transcription factors encoded by RSL class I genes control a gene regulatory network that positively regulates the development of filamentous rooting cells – root hairs and rhizoids – in land plants. The GLABRA2 transcription factor negatively regulates these genes in the angiosperm Arabidopsis thaliana. To find negative regulators of RSL class I genes in early diverging land plants we conducted a mutant screen in the liverwort Marchantia polymorpha. This identified FEW RHIZOIDS1 (MpFRH1) microRNA (miRNA) that negatively regulates the RSL class I gene MpRSL1. The miRNA and its mRNA target constitute a feedback mechanism that controls epidermal cell differentiation. MpFRH1 miRNA target sites are conserved among liverwort RSL class I mRNAs but are not present in RSL class I mRNAs of other land plants. These findings indicate that while RSL class I genes are ancient and conserved, independent negative regulatory mechanisms evolved in different lineages during land plant evolution.
Plants colonised the land sometime more than 500 million years ago. The ancestors of the first land plants were algae that were most likely simple with a few different types of cell. Yet, when faced with the challenges of life on land, plants evolved new cell types and specialised structures with roles such as anchorage, nutrient uptake and gas exchange. Many of these specialised structures, including the root hairs and rhizoids that allow plants to collect water and minerals from the soil, first develop as outgrowths from cells in the outer layer of the plant. An ancient and conserved mechanism activates the development of these outgrowths via genes belonging to a group known as RSL class I. In the flowering plant Arabidopsis thaliana, a protein switches off RSL class I genes in a subset of these outer cells, to stop too many root hairs forming. To see whether this kind of negative regulation is also conserved among land plants, Honkanen et al. looked for regulators of RSL class I genes in liverworts. Small and without flowers, liverworts are a group of plants that first appeared during the earliest stages of land plant evolution. Honkanen et al. discovered that RSL class I genes in liverworts are negatively regulated by a molecule named FEW RHIZOIDS1 (or FRH1). However, rather than being a protein, FRH1 is a microRNA – a short strand of genetic code that reduces how much protein is produced from a given gene. The FRH1 microRNA is conserved among liverworts and most likely evolved very early in the history of these plants. The findings indicate that different groups of land plants have evolved different negative regulators to control the conserved genes behind some of the specialised structures crucial to life on land.
Affiliation(s)
- Suvi Honkanen
- Department of Plant Sciences, University of Oxford, Oxford, United Kingdom; Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Perth, Australia
| | - Anna Thamm
- Department of Plant Sciences, University of Oxford, Oxford, United Kingdom
| | - Mario A Arteaga-Vazquez
- Laboratory of Epigenetics and Developmental Biology, Instituto de Biotecnología y Ecología Aplicada, Universidad Veracruzana, Colonia Emiliano Zapata, Mexico
| | - Liam Dolan
- Department of Plant Sciences, University of Oxford, Oxford, United Kingdom
| |
|
21
|
Zhang F, Liu W, Xia J, Zeng J, Xiang L, Zhu S, Zheng Q, Xie H, Yang C, Chen M, Liao Z. Molecular Characterization of the 1-Deoxy-D-Xylulose 5-Phosphate Synthase Gene Family in Artemisia annua. Frontiers in Plant Science 2018; 9:952. [PMID: 30116250 PMCID: PMC6084332 DOI: 10.3389/fpls.2018.00952] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 01/15/2018] [Accepted: 06/13/2018] [Indexed: 05/27/2023]
Abstract
Artemisia annua produces artemisinin, an effective antimalarial drug. In recent decades, the later steps of artemisinin biosynthesis have been thoroughly investigated; however, little is known about the early steps. Comparative transcriptomics of glandular and filamentous trichomes and a ¹³CO₂ radioisotope study have shown that the 2-C-methyl-D-erythritol-4-phosphate (MEP) pathway, rather than the mevalonate pathway, plays an important role in artemisinin biosynthesis. In this study, we have cloned three 1-deoxy-D-xylulose 5-phosphate synthase (DXS) genes from A. annua (AaDXS1, AaDXS2, and AaDXS3); the DXS enzyme catalyzes the first and rate-limiting step of the MEP pathway. We analyzed the expression of these three genes in different tissues and in response to multiple treatments. Phylogenetic analysis revealed that each of the three DXS genes belonged to a distinct clade. Subcellular localization analysis indicated that all three AaDXS proteins are targeted to chloroplasts, which is consistent with the presence of plastid transit peptides in their N-terminal regions. Expression analyses revealed that the expression pattern of AaDXS2 in specific tissues and in response to different treatments, including methyl jasmonate, light, and low temperature, was similar to that of artemisinin biosynthesis genes. To further investigate the tissue-specific expression pattern of AaDXS2, the promoter of AaDXS2 was cloned upstream of the β-glucuronidase gene and introduced into Arabidopsis. Histochemical staining assays demonstrated that AaDXS2 was mainly expressed in the trichomes of Arabidopsis leaves. Together, these results suggest that AaDXS2 might be the only member of the DXS family in A. annua that is involved in artemisinin biosynthesis.
Affiliation(s)
- Fangyuan Zhang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Wanhong Liu
- School of Chemistry and Chemical Engineering, Chongqing University of Science and Technology, Chongqing, China
| | - Jing Xia
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Junlan Zeng
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Lien Xiang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Shunqin Zhu
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Qiumin Zheng
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - He Xie
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Key Laboratory of Tobacco Biotechnological Breeding, National Tobacco Genetic Engineering Research Center, Kunming, China
| | - Chunxian Yang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| | - Min Chen
- SWU-TAAHC Medicinal Plant Joint R&D Centre, College of Pharmaceutical Sciences, Southwest University, Chongqing, China
| | - Zhihua Liao
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education), Chongqing Key Laboratory of Plant Ecology and Resources Research in Three Gorges Reservoir Region, SWU-TAAHC Medicinal Plant Joint R&D Centre, School of Life Sciences, Southwest University, Chongqing, China
| |
|
22
|
Prihatna C, Larkan NJ, Barbetti MJ, Barker SJ. Tomato CYCLOPS/IPD3 is required for mycorrhizal symbiosis but not tolerance to Fusarium wilt in mycorrhiza-deficient tomato mutant rmc. MYCORRHIZA 2018; 28:495-507. [PMID: 29948410 DOI: 10.1007/s00572-018-0842-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/22/2018] [Accepted: 05/31/2018] [Indexed: 06/08/2023]
Abstract
Mycorrhizal symbiosis requires several common symbiosis genes including CYCLOPS/IPD3. The reduced mycorrhizal colonisation (rmc) tomato mutant has a deletion of five genes including CYCLOPS/IPD3, and rmc is more susceptible to Fusarium wilt than its wild-type parental line. This study investigated the genetic defects leading to both fungal interaction phenotypes and whether these were separable. Complementation was performed in rmc to test the requirement for CYCLOPS/IPD3 in mycorrhiza formation and Fusarium wilt tolerance. Promoter analysis via GFP expression in roots was conducted to determine the role of native regulatory elements in the proper functioning of CYCLOPS/IPD3. CYCLOPS/IPD3 regulated by its native promoter, but not a 2×35S promoter, restores mycorrhizal association in rmc. GFP regulated by the 2×35S promoter is not expressed in epidermal cells of roots, indicating that expression of CYCLOPS/IPD3 in these cells is required for colonisation by the fungi utilised in this research. CYCLOPS/IPD3 did not restore Fusarium wilt tolerance, however, showing that the genetic requirements for mycorrhizal association and Fusarium wilt tolerance are different. Our results confirm the expected role of CYCLOPS/IPD3 in mycorrhizal symbiosis and suggest that Fusarium tolerance is conferred by one of the other four genes affected by the deletion.
Affiliation(s)
- Cahya Prihatna
- School of Agriculture and Environment, Faculty of Science, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia, 6009, Australia.
- PT Wilmar Benih Indonesia, Jalan Jababeka X Blok F No. 9, Bekasi, Jawa Barat, 17530, Indonesia.
| | - Martin John Barbetti
- School of Agriculture and Environment, Faculty of Science, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia, 6009, Australia
- The UWA Institute of Agriculture, Faculty of Science, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia, 6009, Australia
| | - Susan Jane Barker
- School of Agriculture and Environment, Faculty of Science, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia, 6009, Australia
| |
|
23
|
Ferreira FV, Aguiar ERGR, Olmo RP, de Oliveira KPV, Silva EG, Sant'Anna MRV, Gontijo NDF, Kroon EG, Imler JL, Marques JT. The small non-coding RNA response to virus infection in the Leishmania vector Lutzomyia longipalpis. PLoS Negl Trop Dis 2018; 12:e0006569. [PMID: 29864168 PMCID: PMC6002125 DOI: 10.1371/journal.pntd.0006569] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/30/2017] [Revised: 06/14/2018] [Accepted: 05/30/2018] [Indexed: 12/30/2022] Open
Abstract
Sandflies are well known vectors for Leishmania but also transmit a number of arthropod-borne viruses (arboviruses). Few studies have addressed the interaction between sandflies and arboviruses. RNA interference (RNAi) mechanisms utilize small non-coding RNAs to regulate different aspects of host-pathogen interactions. The small interfering RNA (siRNA) pathway is a broad antiviral mechanism in insects. In addition, at least in mosquitoes, another RNAi mechanism mediated by PIWI interacting RNAs (piRNAs) is activated by viral infection. Finally, endogenous microRNAs (miRNA) may also regulate host immune responses. Here, we analyzed the small non-coding RNA response to Vesicular stomatitis virus (VSV) infection in the sandfly Lutzomyia longipalpis. We detected abundant production of virus-derived siRNAs after VSV infection in adult sandflies. However, there was no production of virus-derived piRNAs and only mild changes in the expression of vector miRNAs in response to infection. We also observed abundant production of virus-derived siRNAs against two other viruses in Lutzomyia Lulo cells. Together, our results suggest that the siRNA but not the piRNA pathway mediates an antiviral response in sandflies. In agreement with this hypothesis, pre-treatment of cells with dsRNA against VSV was able to inhibit viral replication while knock-down of the central siRNA component, Argonaute-2, led to increased virus levels. Our work begins to elucidate the role of RNAi mechanisms in the interaction between L. longipalpis and viruses and should also open the way for studies with other sandfly-borne pathogens.
Author summary
Sandflies are important insect vectors that transmit many species of Leishmania, bacteria and viruses. We know very little about how this insect vector responds to viral infection. RNA interference (RNAi) utilizes small non-coding RNAs to regulate different aspects of animal physiology, including immune responses. Small interfering RNAs (siRNAs) mediate a major antiviral response in insects. Virus-derived PIWI-interacting RNAs (piRNAs) can also be generated during infection, at least in some insects. Finally, endogenous microRNAs (miRNA) can regulate the host response to infection. Here we show that virus infection triggers activation of the siRNA pathway but not production of piRNAs in the sandfly Lutzomyia longipalpis. Furthermore, activation or inhibition of the siRNA pathway had a direct effect on viral replication. We also show that virus infection caused mild changes to the expression of endogenous miRNAs. Our work describes for the first time a model to study virus infection in sandflies and highlights the importance of the siRNA pathway for the control of virus infection in L. longipalpis. The framework described here can be used to explore other aspects of the vector-pathogen interactions.
Affiliation(s)
- Flávia Viana Ferreira
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Department of Microbiology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Eric Roberto Guimarães Rocha Aguiar
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Roenick Proveti Olmo
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Karla Pollyanna Vieira de Oliveira
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Emanuele Guimarães Silva
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Maurício Roberto Viana Sant'Anna
- Department of Parasitology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Nelder de Figueiredo Gontijo
- Department of Parasitology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Erna Geessien Kroon
- Department of Microbiology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Jean Luc Imler
- Université de Strasbourg, CNRS M3I/UPR9022, Inserm MIR/U1257, Strasbourg, France
| | - João Trindade Marques
- Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| |
|
24
|
Triska M, Solovyev V, Baranova A, Kel A, Tatarinova TV. Nucleotide patterns aiding in prediction of eukaryotic promoters. PLoS One 2017; 12:e0187243. [PMID: 29141011 PMCID: PMC5687710 DOI: 10.1371/journal.pone.0187243] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/02/2017] [Accepted: 09/05/2017] [Indexed: 01/09/2023] Open
Abstract
Computational analysis of promoters is hindered by the complexity of their architecture. In less studied genomes with complex organization, false positive promoter predictions are common. Accurate identification of transcription start sites and core promoter regions remains an unsolved problem. In this paper, we present a comprehensive analysis of genomic features associated with promoters and show that probabilistic integrative algorithms-driven models allow accurate classification of DNA sequence into “promoters” and “non-promoters” even in absence of the full-length cDNA sequences. These models may be built upon the maps of the distributions of sequence polymorphisms, RNA sequencing reads on genomic DNA, methylated nucleotides, transcription factor binding sites, as well as relative frequencies of nucleotides and their combinations. Positional clustering of binding sites shows that the cells of Oryza sativa utilize three distinct classes of transcription factors: those that bind preferentially to the [-500,0] region (188 “promoter-specific” transcription factors), those that bind preferentially to the [0,500] region (282 “5′ UTR-specific” TFs), and 207 of the “promiscuous” transcription factors with little or no location preference with respect to TSS. For the most informative motifs, their positional preferences are conserved between dicots and monocots.
Affiliation(s)
- Martin Triska
- Children’s Hospital Los Angeles, University of Southern California, Los Angeles, CA, United States of America
- Faculty of Advanced Technology, University of South Wales, Pontypridd, Wales, United Kingdom
| | - Ancha Baranova
- School of Systems Biology, George Mason University, Fairfax, VA, United States of America
- Research Centre for Medical Genetics, Moscow, Russia
| | - Alexander Kel
- geneXplain GmbH, Wolfenbuettel, Germany
- Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
| | - Tatiana V. Tatarinova
- School of Systems Biology, George Mason University, Fairfax, VA, United States of America
- Department of Biology, Division of Natural Sciences, University of La Verne, La Verne, CA, United States of America
- Bioinformatics Center, AA Kharkevich Institute for Information Transmission Problems RAS, Moscow, Russia
- Vavilov’s Institute for General Genetics, Moscow, Russia
| |
|
25
|
Wang H, Liu W, Qiu F, Chen Y, Zhang F, Lan X, Chen M, Zhang H, Liao Z. Molecular cloning and characterization of the promoter of aldehyde dehydrogenase gene from Artemisia annua. Biotechnol Appl Biochem 2017; 64:902-910. [DOI: 10.1002/bab.1520] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/21/2016] [Accepted: 06/05/2016] [Indexed: 11/10/2022]
Affiliation(s)
- Huanyan Wang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Wanhong Liu
- School of Chemistry and Chemical Engineering; Chongqing University of Science and Technology; Chongqing 401331 People's Republic of China
| | - Fei Qiu
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Yupei Chen
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Fangyuan Zhang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Xiaozhong Lan
- TAAHC-SWU Medicinal Plant Joint R&D Centre; Tibetan Collaborative Innovation Center of Agricultural and Animal Husbandry Resources; Agriculture and Animal Husbandry College; Tibet University; Nyingchi of Tibet 860000 People's Republic of China
| | - Min Chen
- SWU-TAAHC Medicinal Plant Joint R&D Centre; College of Pharmaceutical Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Haoxing Zhang
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| | - Zhihua Liao
- Key Laboratory of Eco-Environments in Three Gorges Reservoir Region (Ministry of Education); SWU-TAAHC Medicinal Plant Joint R&D Centre; School of Life Sciences; Southwest University; Chongqing 400715 People's Republic of China
| |
|
26
|
Zhou S, Du G, Kang Z, Li J, Chen J, Li H, Zhou J. The application of powerful promoters to enhance gene expression in industrial microorganisms. World J Microbiol Biotechnol 2017; 33:23. [DOI: 10.1007/s11274-016-2184-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 04/12/2016] [Accepted: 11/24/2016] [Indexed: 01/01/2023]
|
27
|
Li J, Meng H, Wang Y. Synbiological systems for complex natural products biosynthesis. Synth Syst Biotechnol 2016; 1:221-229. [PMID: 29062947 PMCID: PMC5625725 DOI: 10.1016/j.synbio.2016.08.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/17/2016] [Revised: 08/24/2016] [Accepted: 08/24/2016] [Indexed: 10/25/2022] Open
Abstract
Natural products (NPs) continue to play a pivotal role in drug discovery programs. The rapid development of synthetic biology has provided new strategies for NP production. Synthetic biology is a new engineering discipline that aims to produce desirable products by rationally programming biological parts and manipulating pathways. However, integrating a heterologous pathway into chassis cells for overproduction remains a challenge because of the limited number of characterized parts, module incompatibility, and limited cell tolerance to the product. Considerable effort has been devoted to these issues. This review covers progress in discovering novel natural biological parts and in the rational design of synthetic biological parts, together with advanced assembly technologies, pathway engineering, and pathway optimization guided by the global network. Future perspectives are also presented.
Affiliation(s)
- Jianhua Li
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| | - Hailin Meng
- Bioengineering Research Center, Guangzhou Institute of Advanced Technology, Chinese Academy of Sciences, Guangzhou 511458, China
| | - Yong Wang
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
| |
|
28
|
Muterko A, Kalendar R, Salina E. Novel alleles of the VERNALIZATION1 genes in wheat are associated with modulation of DNA curvature and flexibility in the promoter region. BMC PLANT BIOLOGY 2016; 16 Suppl 1:9. [PMID: 26822192 PMCID: PMC4895274 DOI: 10.1186/s12870-015-0691-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Indexed: 05/18/2023]
Abstract
BACKGROUND: In wheat, the vernalization requirement is mainly controlled by the VRN genes. Different species of hexaploid and tetraploid wheat are widely used as genetic source for new mutant variants and alleles for fundamental investigations and practical breeding programs. In this study, VRN-A1 and VRN-B1 were analysed for 178 accessions representing six tetraploid wheat species (Triticum dicoccoides, T. dicoccum, T. turgidum, T. polonicum, T. carthlicum, T. durum) and five hexaploid species (T. compactum, T. sphaerococcum, T. spelta, T. macha, T. vavilovii).
RESULTS: Novel allelic variants in the promoter region of VRN-A1 and VRN-B1 were identified based on the change in curvature and flexibility of the DNA molecules. The new variants of VRN-A1 (designated as Vrn-A1a.2, Vrn-A1b.2 - Vrn-A1b.6 and Vrn-A1i) were found to be widely distributed in hexaploid and tetraploid wheat, and in fact were predominant over the known VRN-A1 alleles. The greatest diversity of the new variants of VRN-B1 (designated as VRN-B1.f, VRN-B1.s and VRN-B1.m) was found in the tetraploid and some hexaploid wheat species. For the first time, minor differences within the sequence motif known as the VRN-box of VRN1 were correlated with wheat growth habit. Thus, vrn-A1b.3 and vrn-A1b.4 were revealed in winter wheat in contrast to Vrn-A1b.2, Vrn-A1b.5, Vrn-A1b.6 and Vrn-A1i. It was found that single nucleotide mutation in the VRN-box can influence the vernalization requirement and growth habit of wheat. Our data suggest that both the A-tract and C-rich segment within the VRN-box contribute to its functionality, and provide a new view of the hypothesised role of the VRN-box in regulating transcription of the VRN1 genes. Specifically, it is proposed that combination of mutations in this region can modulate vernalization sensitivity and flowering time of wheat.
CONCLUSIONS: New allelic variants of the VRN-A1 and VRN-B1 genes were identified in hexaploid and tetraploid wheat. Mutations in A-tract and C-rich segments within the VRN-box of VRN-A1 are associated with modulation of the vernalization requirement and flowering time. New allelic variants will be useful in fundamental investigations into the regulation of VRN1 expression, and provide a valuable genetic resource for practical breeding of wheat.
Affiliation(s)
- Alexandr Muterko
- Laboratory of Plant Molecular Genetics and Cytogenetics, The Federal Research Center Institute of Cytology and Genetics, Lavrentyeva Avenue 10, Novosibirsk, 630090, Russian Federation.
- Department of Common and Molecular Genetics, Plant Breeding and Genetics Institute - National Center of Seed and Cultivar Investigation, Ovidiopolskaya Road 3, Odessa, 65036, Ukraine.
| | - Ruslan Kalendar
- Laboratory of Plant Genomics and Bioinformatics, RSE "National Center for Biotechnology", Sh. Valikhanov 13/1, Astana, 010000, Kazakhstan
- University of Helsinki, Institute of Biotechnology, MTT Plant Genomics Laboratory, Biocentre 3, P.O. Box 65, Viikinkaari 1, Helsinki, 00014, Finland
| | - Elena Salina
- Laboratory of Plant Molecular Genetics and Cytogenetics, The Federal Research Center Institute of Cytology and Genetics, Lavrentyeva Avenue 10, Novosibirsk, 630090, Russian Federation
| |
|
29
|
Li Y, Tu L, Ye Z, Wang M, Gao W, Zhang X. A cotton fiber-preferential promoter, PGbEXPA2, is regulated by GA and ABA in Arabidopsis. PLANT CELL REPORTS 2015; 34:1539-49. [PMID: 26001998 DOI: 10.1007/s00299-015-1805-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/10/2015] [Revised: 03/30/2015] [Accepted: 05/06/2015] [Indexed: 05/24/2023]
Abstract
PGbEXPA2 (Promoter of GbEXPA2) was preferentially and strongly expressed during cotton fiber development, and the 461-bp PGbEXPA2 fragment was essential for responding to exogenous GA and ABA in Arabidopsis. Cotton fibers are highly elongated single-cell, unbranched and non-glandular seed trichomes. Previous studies have reported that the transcript level of GbEXPA2 is significantly up-regulated during fiber cell elongation, suggesting that GbEXPA2 has an important function in fiber development. In this study, the promoter of GbEXPA2 (839 bp) from the D(T) sub-genome was isolated from Gossypium barbadense 3-79. Consistent with the expression pattern of GbEXPA2, the promoter PGbEXPA2 was able to express GUS to high levels in elongating fibers, but not in the root, stem, or leaf. In Arabidopsis, GUS activity was only found in the rosette leaf trichomes and rosette leaf vascular tissue, indicating that the transcription factors which bind to PGbEXPA2 in the leaf trichomes of transgenic Arabidopsis were similar to those found in cotton fiber. A deletion analysis of PGbEXPA2 revealed that a 461-bp fragment was sufficient to drive GUS expression in cotton fibers and Arabidopsis rosette leaf trichomes. Exogenous phytohormonal treatments on transgenic Arabidopsis with different promoter lengths (P-839, P-705, P-588 and P-461) showed that GUS activity in Arabidopsis trichomes could be strongly up-regulated by GA and, in contrast, down-regulated by ABA.
Affiliation(s)
- Yang Li
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
| |
|
30
|
Mishra DR, Chaudhary S, Krishna BM, Mishra SK. Identification of Critical Elements for Regulation of Inorganic Pyrophosphatase (PPA1) in MCF7 Breast Cancer Cells. PLoS One 2015; 10:e0124864. [PMID: 25923237 PMCID: PMC4414593 DOI: 10.1371/journal.pone.0124864] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 08/20/2014] [Accepted: 03/11/2015] [Indexed: 12/21/2022] Open
Abstract
Cytosolic inorganic pyrophosphatase plays an important role in the cellular metabolism by hydrolyzing inorganic pyrophosphate (PPi) formed as a by-product of various metabolic reactions. Inorganic pyrophosphatases are known to be associated with important functions related to the growth and development of various organisms. In humans, the expression of inorganic pyrophosphatase (PPA1) is deregulated in different types of cancer and is involved in the migration and invasion of gastric cancer cells and proliferation of ovarian cancer cells. However, the transcriptional regulation of the gene encoding PPA1 is poorly understood. To gain insights into PPA1 gene regulation, a 1217 bp fragment of its 5'-flanking region was cloned and analyzed. The 5'-deletion analysis of the promoter revealed that a 266 bp proximal promoter region exhibits most of the transcriptional activity; sequence analysis identified three putative Sp1 binding sites in this region. Binding of Sp1 to the PPA1 promoter was confirmed by electrophoretic mobility shift assay (EMSA) and chromatin immunoprecipitation (ChIP) assay. The importance of these binding sites was verified by site-directed mutagenesis, and overexpression of Sp1 transactivated PPA1 promoter activity, upregulated protein expression and increased chromatin accessibility. p300 binds to the PPA1 promoter and stimulates Sp1-induced promoter activity. Trichostatin A (TSA), a histone deacetylase (HDAC) inhibitor, induces PPA1 promoter activity and protein expression, and the HAT activity of p300 was important in the regulation of PPA1 expression. These results demonstrate that PPA1 is positively regulated by Sp1, that p300 coactivates Sp1-induced PPA1 promoter activity, and that histone acetylation/deacetylation may contribute to local chromatin remodeling across the PPA1 promoter. Further, knockdown of PPA1 decreased colony formation and viability of MCF7 cells.
Affiliation(s)
- Dipti Ranjan Mishra
- Cancer Biology Laboratory, Gene function and regulation Group, Institute of Life Sciences, Bhubaneswar, Odisha, India
| | - Sanjib Chaudhary
- Cancer Biology Laboratory, Gene function and regulation Group, Institute of Life Sciences, Bhubaneswar, Odisha, India
| | - B. Madhu Krishna
- Cancer Biology Laboratory, Gene function and regulation Group, Institute of Life Sciences, Bhubaneswar, Odisha, India
| | - Sandip K. Mishra
- Cancer Biology Laboratory, Gene function and regulation Group, Institute of Life Sciences, Bhubaneswar, Odisha, India
| |
|
31
|
Yella VR, Bansal M. In silico Identification of Eukaryotic Promoters. SYSTEMS AND SYNTHETIC BIOLOGY 2015. [DOI: 10.1007/978-94-017-9514-2_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
|
32
|
Pereira ALA, Carazzolle MF, Abe VY, de Oliveira MLP, Domingues MN, Silva JC, Cernadas RA, Benedetti CE. Identification of putative TAL effector targets of the citrus canker pathogens shows functional convergence underlying disease development and defense response. BMC Genomics 2014; 15:157. [PMID: 24564253 PMCID: PMC4028880 DOI: 10.1186/1471-2164-15-157] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 10/21/2013] [Accepted: 02/18/2014] [Indexed: 11/25/2022] Open
Abstract
Background: Transcriptional activator-like (TAL) effectors, formerly known as the AvrBs3/PthA protein family, are DNA-binding effectors broadly found in Xanthomonas spp. that transactivate host genes upon injection via the bacterial type three-secretion system. Biologically relevant targets of TAL effectors, i.e. host genes whose induction is vital to establish a compatible interaction, have been reported for xanthomonads that colonize rice and pepper; however, citrus genes modulated by the TAL effectors PthA“s” and PthC“s” of the citrus canker bacteria Xanthomonas citri (Xc) and Xanthomonas aurantifolii pathotype C (XaC), respectively, are poorly characterized. Of particular interest, XaC causes canker disease in its host lemon (Citrus aurantifolia), but triggers a defense response in sweet orange.
Results: Based on, 1) the TAL effector-DNA binding code, 2) gene expression data of Xc and XaC-infiltrated sweet orange leaves, and 3) citrus hypocotyls transformed with PthA2, PthA4 or PthC1, we have identified a collection of Citrus sinensis genes potentially targeted by Xc and XaC TAL effectors. Our results suggest that similar with other strains of Xanthomonas TAL effectors, PthA2 and PthA4, and PthC1 to some extent, functionally converge. In particular, towards induction of genes involved in the auxin and gibberellin synthesis and response, cell division, and defense response. We also present evidence indicating that the TAL effectors act as transcriptional repressors and that the best scoring predicted DNA targets of PthA“s” and PthC“s” in citrus promoters predominantly overlap with or localize near to TATA boxes of core promoters, supporting the idea that TAL effectors interact with the host basal transcriptional machinery to recruit the RNA pol II and start transcription.
Conclusions: The identification of PthA“s” and PthC“s” targets, such as the LOB (LATERAL ORGAN BOUNDARY) and CCNBS genes that we report here, is key for the understanding of the canker symptoms development during host susceptibility, or the defenses of sweet orange against the canker bacteria. We have narrowed down candidate targets to a few, which pointed out the host metabolic pathways explored by the pathogens.
Affiliation(s)
- Celso E Benedetti
- Laboratório Nacional de Biociências, Centro Nacional de Pesquisa em Energia e Materiais, R, Giuseppe Máximo Scolfaro 10000, Campinas, SP 13083-970, Brazil.
| |
|
33
|
Hodgson K, Uher R, Crawford AA, Lewis G, O'Donovan MC, Keers R, Dernovsek MZ, Mors O, Hauser J, Souery D, Maier W, Henigsberg N, Rietschel M, Placentino A, Aitchison K, Farmer A, Davis O, McGuffin P. Genetic predictors of antidepressant side effects: a grouped candidate gene approach in the Genome-Based Therapeutic Drugs for Depression (GENDEP) study. J Psychopharmacol 2014; 28:142-50. [PMID: 24414086 DOI: 10.1177/0269881113517957] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
BACKGROUND: The unwanted side effects associated with antidepressants are key determinants of treatment adherence in depression; propensity to experience these adverse drug reactions (ADRs) may be influenced by genetic variation. However, previous work attempting to ascertain the genetic variants involved has had limited success, in part due to the range of ADRs reported with antidepressants.
METHOD: ADRs reported with antidepressant treatment were categorised using their likely pharmacological basis: adrenergic, cholinergic, serotonergic and histaminergic. To identify genetic predictors of susceptibility to each group of ADRs, a candidate gene analysis was performed with data from 431 depressed patients (from a total sample size of 811 patients) enrolled in the Genome-Based Therapeutic Drugs for Depression (GENDEP) project, who were randomly allocated to receive treatment with escitalopram or nortriptyline. Data from 474 patients treated with citalopram or reboxetine in the GenPod project (total sample of 601 patients) were used for replication of significant findings.
RESULTS: We found no significant predictors of presumed adrenergic, cholinergic and histaminergic ADRs. Putative serotonergic ADRs were significantly associated with variation in the gene encoding the serotonin 2C receptor (HTR2C, rs6644093, odds ratio (OR)=1.72, 95% confidence interval (CI)=1.31-2.25, p=7.43×10⁻⁵) in GENDEP. However, this finding was not replicated in GenPod.
CONCLUSIONS: The association between serotonergic side effects and variation in the HTR2C gene in the GENDEP sample supports the hypothesis that serotonin receptor-mediated mechanisms underlie these adverse reactions; however, this finding was not replicated in GenPod.
Affiliation(s)
- Karen Hodgson
- Social, Genetic and Developmental Psychiatry Centre, King's College London, London, UK
| |
|
34
|
Bcrp1 transcription in mouse testis is controlled by a promoter upstream of a novel first exon (E1U) regulated by steroidogenic factor-1. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2013; 1829:1288-99. [PMID: 24189494 DOI: 10.1016/j.bbagrm.2013.10.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/23/2013] [Revised: 10/07/2013] [Accepted: 10/28/2013] [Indexed: 01/06/2023]
Abstract
Alternative promoter usage is typically associated with mRNAs with differing first exons that contain or consist entirely of a 5' untranslated region. The murine Bcrp1 (Abcg2) transporter has three alternative promoters associated with mRNAs containing alternative untranslated first exons designated as E1A, E1B, and E1C. The E1B promoter regulates Bcrp1 transcription in mouse intestine. Here, we report the identification and characterization of a novel Bcrp1 promoter and first exon, E1U, located upstream from the other Bcrp1 promoters/first exons, which is the predominant alternative promoter utilized in murine testis. Using in silico analysis we identified a putative steroidogenic factor-1 (SF-1) response element that was unique to the Bcrp1 E1U alternative promoter. Overexpression of SF-1 in murine TM4 Sertoli cells enhanced Bcrp1 E1U mRNA expression and increased Bcrp1 E1U alternative promoter activity in a reporter assay, whereas mutation of the SF-1 binding site totally eliminated Bcrp1 E1U alternative promoter activity. Moreover, expression of Bcrp1 E1U and total mRNA and Bcrp1 protein was markedly diminished in the testes from adult Sertoli cell-specific SF-1 knockout mice, in comparison to the testes from wild-type mice. Binding of SF-1 to the SF-1 response element in the E1U promoter was demonstrated by chromatin immunoprecipitation assays. In conclusion, nuclear transcription factor SF-1 is involved with the regulation of a novel promoter of Bcrp1 that governs transcription of the E1U mRNA isoform in mice. The present study furthers understanding of the complex regulation of Bcrp1 expression in specific tissues of a mammalian model.
|
35
|
de Boer CG, van Bakel H, Tsui K, Li J, Morris QD, Nislow C, Greenblatt JF, Hughes TR. A unified model for yeast transcript definition. Genome Res 2013; 24:154-66. [PMID: 24170600 PMCID: PMC3875857 DOI: 10.1101/gr.164327.113] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features detectable by the yeast cell, which we integrate into a Unified Model (UM) that models transcription as a whole. The cis-elements that denote where transcription initiates function primarily through nucleosome depletion, and, using a synthetic promoter system, we show that most of these elements are sufficient to initiate transcription in vivo. Hrp1 binding sites are the major characteristic of terminators; these binding sites are often clustered in terminator regions and can terminate transcription bidirectionally. The UM predicts global transcript structure by modeling transcription of the genome using a hidden Markov model whose emissions are the outputs of the initiation and termination classifiers. We validated the novel predictions of the UM with available RNA-seq data and tested it further by directly comparing the transcript structure predicted by the model to the transcription generated by the cell for synthetic DNA segments of random design. We show that the UM identifies transcription start sites more accurately than the initiation classifier alone, indicating that the relative arrangement of promoter and terminator elements influences their function. Our model presents a concrete description of how the cell defines transcript units, explains the existence of nongenic transcripts, and provides insight into genome evolution.
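The hidden Markov model decoding step mentioned in the abstract can be illustrated with a minimal Viterbi sketch. This is a simplified two-state stand-in, not the Unified Model itself: the state names, the transition probabilities, and the single per-position "transcribed" probability (standing in for the classifier outputs) are all assumptions made for illustration.

```python
import math

def viterbi(log_emit, log_trans, log_start):
    """Return the most likely state path for per-position log emission scores.

    log_emit:  list of dicts, state -> log P(observation at t | state)
    log_trans: dict of dicts, prev_state -> state -> log P(state | prev_state)
    log_start: dict, state -> log P(state at t=0)
    """
    states = list(log_start)
    score = {s: log_start[s] + log_emit[0][s] for s in states}
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for em in log_emit[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda q: prev[q] + log_trans[q][s])
            score[s] = prev[best] + log_trans[best][s] + em[s]
            ptr[s] = best
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy input: hypothetical per-position probabilities of being transcribed.
probs = [0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1]
log_emit = [{"intergenic": math.log(1 - p), "transcribed": math.log(p)} for p in probs]
stay, switch = math.log(0.8), math.log(0.2)
log_trans = {
    "intergenic": {"intergenic": stay, "transcribed": switch},
    "transcribed": {"intergenic": switch, "transcribed": stay},
}
log_start = {"intergenic": math.log(0.9), "transcribed": math.log(0.1)}
path = viterbi(log_emit, log_trans, log_start)
print(path)
```

The transition penalties are what let the decoded path differ from a position-by-position threshold, which is one way to read the abstract's remark that the arrangement of elements influences their function.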
|
36
|
Lederer FL, Weinert U, Günther TJ, Raff J, Weiß S, Pollmann K. Identification of multiple putative S-layer genes partly expressed by Lysinibacillus sphaericus JG-B53. Microbiology (Reading) 2013; 159:1097-1108. [DOI: 10.1099/mic.0.065763-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Affiliation(s)
- Franziska L. Lederer
- Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Ulrike Weinert
- Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Tobias J. Günther
- Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Johannes Raff
- Institute of Resource Ecology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Stephan Weiß
- Institute of Resource Ecology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
- Katrin Pollmann
- Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, 01314 Dresden, Germany
|
37
|
Datta S, Mukhopadhyay S. A composite method based on formal grammar and DNA structural features in detecting human polymerase II promoter region. PLoS One 2013; 8:e54843. [PMID: 23437045 PMCID: PMC3577817 DOI: 10.1371/journal.pone.0054843] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2012] [Accepted: 12/17/2012] [Indexed: 11/25/2022] Open
Abstract
An important step in understanding gene regulation is to identify the promoter regions where the transcription factor binding takes place. Predicting a promoter region de novo has been a theoretical goal for many researchers for a long time. A number of in silico methods exist to predict the promoter region de novo, but most of these methods still suffer from various shortcomings, a major one being the selection of appropriate features of promoter regions that distinguish them from non-promoters. In this communication, we have proposed a new composite method that predicts promoter sequences based on the interrelationship between structural profiles of DNA and primary sequence elements of the promoter regions. We have shown that a Context Free Grammar (CFG) can formalize the relationships between different primary sequence features and by utilizing the CFG, we demonstrate that an efficient parser can be constructed for extracting these relationships from DNA sequences to distinguish the true promoter sequences from non-promoter sequences. Along with CFG, we have extracted the structural features of the promoter region to improve upon the efficiency of our prediction system. Extensive experiments performed on different datasets reveal that our method is effective in predicting promoter sequences on a genome-wide scale and performs satisfactorily as compared to other promoter prediction techniques.
Affiliation(s)
- Sutapa Datta
- Department of Biophysics, Molecular Biology and Bioinformatics and Distributed Information Centre for Bioinformatics, University of Calcutta, Kolkata, West Bengal, India.
|
38
|
De Franceschi P, Stegmeir T, Cabrera A, van der Knaap E, Rosyara UR, Sebolt AM, Dondini L, Dirlewanger E, Quero-Garcia J, Campoy JA, Iezzoni AF. Cell number regulator genes in Prunus provide candidate genes for the control of fruit size in sweet and sour cherry. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2013; 32:311-326. [PMID: 23976873 PMCID: PMC3748327 DOI: 10.1007/s11032-013-9872-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2013] [Accepted: 04/18/2013] [Indexed: 05/19/2023]
Abstract
Striking increases in fruit size distinguish cultivated descendants from small-fruited wild progenitors for fleshy fruited species such as Solanum lycopersicum (tomato) and Prunus spp. (peach, cherry, plum, and apricot). The first fruit weight gene identified as a result of domestication and selection was the tomato FW2.2 gene. Members of the FW2.2 gene family in corn (Zea mays) have been named CNR (Cell Number Regulator) and two of them exert their effect on organ size by modulating cell number. Due to the critical roles of FW2.2/CNR genes in regulating cell number and organ size, this family provides an excellent source of candidates for fruit size genes in other domesticated species, such as those found in the Prunus genus. A total of 23 FW2.2/CNR family members were identified in the peach genome, spanning the eight Prunus chromosomes. Two of these CNRs were located within confidence intervals of major quantitative trait loci (QTL) previously discovered on linkage groups 2 and 6 in sweet cherry (Prunus avium), named PavCNR12 and PavCNR20, respectively. An analysis of haplotype, sequence, segregation and association with fruit size strongly supports a role of PavCNR12 in the sweet cherry linkage group 2 fruit size QTL, and this QTL is also likely present in sour cherry (P. cerasus). The finding that the increase in fleshy fruit size in both tomato and cherry associated with domestication may be due to changes in members of a common ancestral gene family supports the notion that similar phenotypic changes exhibited by independently domesticated taxa may have a common genetic basis.
Affiliation(s)
- P. De Franceschi
- Dipartimento di Scienze Agrarie, Università degli Studi di Bologna, Bologna, Italy
- T. Stegmeir
- Michigan State University, East Lansing, MI USA
- A. Cabrera
- Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, OH USA
- E. van der Knaap
- Ohio Agricultural Research and Development Center, The Ohio State University, Wooster, OH USA
- L. Dondini
- Dipartimento di Scienze Agrarie, Università degli Studi di Bologna, Bologna, Italy
- E. Dirlewanger
- INRA, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
- University of Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
- J. Quero-Garcia
- INRA, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
- University of Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
- J. A. Campoy
- INRA, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
- University of Bordeaux, UMR 1332 de Biologie du Fruit et Pathologie, 33140 Villenave d’Ornon, France
|
39
|
Lee TY, Chang WC, Hsu JBK, Chang TH, Shien DM. GPMiner: an integrated system for mining combinatorial cis-regulatory elements in mammalian gene group. BMC Genomics 2012; 13 Suppl 1:S3. [PMID: 22369687 PMCID: PMC3587379 DOI: 10.1186/1471-2164-13-s1-s3] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Background Sequence features in promoter regions are involved in regulating gene transcription initiation. Although numerous computational methods have been developed for predicting transcriptional start sites (TSSs) or transcription factor (TF) binding sites (TFBSs), they do not consider some important regulatory features such as CpG islands, tandem repeats, the TATA box, CCAAT box, GC box, over-represented oligonucleotides, DNA stability, and GC content. Additionally, the combinatorial interaction of TFs regulates the gene group that is associated with the same expression pattern. To investigate gene transcriptional regulation, an integrated system that annotates regulatory features in a promoter sequence and detects co-regulation of TFs in a group of genes is needed. Results This work identifies TSSs and regulatory features in a promoter sequence, and recognizes co-occurrence of cis-regulatory elements in co-expressed genes using a novel system. Three well-known TSS prediction tools are incorporated with orthologous conserved features, such as CpG islands, nucleotide composition, over-represented hexamer nucleotides, and DNA stability, to construct the novel Gene Promoter Miner (GPMiner) using a support vector machine (SVM). According to five-fold cross-validation results, the predictive sensitivity and specificity are both roughly 80%. The proposed system allows users to input a group of gene names/symbols, enabling the co-occurrence of TFBSs to be determined. Additionally, an input sequence can also be analyzed for homogeneity of experimental mammalian promoter sequences, and conserved regulatory features between homologous promoters can be observed through cross-species analysis. After identifying promoter regions, regulatory features are visualized graphically to facilitate gene promoter observations. Conclusions The GPMiner, which has a user-friendly input/output interface, has numerous benefits in analyzing human and mouse promoters.
The proposed system is freely available at http://GPMiner.mbc.nctu.edu.tw/.
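For readers less familiar with the two metrics quoted in the abstract above, the sketch below shows how sensitivity and specificity are computed from binary labels (1 = promoter, 0 = non-promoter). It is a generic illustration with made-up labels; it is not GPMiner code, and the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # promoters found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # promoters missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-promoters rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical labels for one held-out fold.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

In a five-fold cross-validation, these values would be computed on each held-out fold in turn and then summarized across the five folds.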
Affiliation(s)
- Tzong-Yi Lee
- Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan.
|
40
|
Gan Y, Guan J, Zhou S. A comparison study on feature selection of DNA structural properties for promoter prediction. BMC Bioinformatics 2012; 13:4. [PMID: 22226192 PMCID: PMC3280155 DOI: 10.1186/1471-2105-13-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2011] [Accepted: 01/07/2012] [Indexed: 01/27/2023] Open
Abstract
Background Promoter prediction is an integral step for understanding gene regulation and annotating genomes. Traditional promoter analysis is mainly based on sequence compositional features. Recently, many kinds of structural features have been employed in promoter prediction. However, considering the high-dimensionality and overfitting problems, it is infeasible to utilize all available features for promoter prediction. Thus it is necessary to choose some appropriate features for the prediction task. Results This paper conducts an extensive comparison study on feature selection of DNA structural properties for promoter prediction. Firstly, to examine whether promoters possess some special structures, we carry out a systematic comparison among the profiles of thirteen structural features on promoter and non-promoter sequences. Secondly, we investigate the correlations between these structural features and promoter sequences. Thirdly, both filter and wrapper methods are utilized to select appropriate feature subsets from thirteen different kinds of structural features for promoter prediction, and the predictive power of the selected feature subsets is evaluated. Finally, we compare the prediction performance of the feature subsets selected in this paper with nine existing promoter prediction approaches. Conclusions Experimental results show that the structural features are differentially correlated to promoters. Specifically, DNA-bending stiffness, DNA denaturation and energy-related features are highly correlated with promoters. The predictive power for promoter sequences differs greatly among different structural features. Selecting the relevant features can significantly improve the accuracy of promoter prediction.
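A filter-style feature selection of the kind mentioned above can be sketched as ranking each feature by the absolute Pearson correlation between its values and the promoter/non-promoter labels. This is one common filter criterion chosen for illustration; the abstract does not say which filter the authors used, and the feature names and values below are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def rank_features(features, labels, top_k):
    """Return the top_k feature names ordered by |correlation with labels|."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], labels)),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical per-sequence feature values (1 = promoter, 0 = non-promoter).
features = {
    "stiffness": [1.0, 2.0, 3.0, 4.0],
    "noise": [1.0, -1.0, 1.0, -1.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
labels = [0, 0, 1, 1]
print(rank_features(features, labels, top_k=1))
```

A filter scores each feature independently of any classifier, whereas a wrapper method would instead retrain and evaluate a classifier on candidate feature subsets.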
Affiliation(s)
- Yanglan Gan
- Department of Computer Science and Technology, Tongji University, Shanghai, China
|
41
|
LIU YX, CHANG W, HAN YP, ZOU Q, GUO MZ, LI WB. In silico Detection of Novel MicroRNAs Genes in Soybean Genome. ACTA ACUST UNITED AC 2011. [DOI: 10.1016/s1671-2927(11)60126-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
|
42
|
Rangannan V, Bansal M. PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes. BMC Res Notes 2011; 4:257. [PMID: 21781326 PMCID: PMC3160392 DOI: 10.1186/1756-0500-4-257] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2011] [Accepted: 07/22/2011] [Indexed: 12/19/2022] Open
Abstract
Background As more and more genomes are being sequenced, an overview of their genomic features and annotation of their functional elements, which control the expression of each gene or transcription unit of the genome, is a fundamental challenge in genomics and bioinformatics. Findings Relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genomic sequences with GC-content ranging from 16.6% to 74.9%. Irrespective of the genome GC-content, the relative stability based promoter prediction method has already been proven to be robust in terms of recall and precision. The predicted promoter regions for the 913 microbial genomes have been accumulated in a database called PromBase. Promoter search can be carried out in PromBase either by specifying the gene name or the genomic position. Each predicted promoter region has been assigned to a reliability class (low, medium, high, very high and highest) based on the difference between its average free energy and that of the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome and various DNA sequence dependent structural properties (average free energy, curvature and bendability) in the vicinity of all annotated translation start sites (TLS). Conclusion PromBase is a database that contains predicted promoter regions and detailed analysis of various genomic features for 913 microbial genomes. PromBase can serve as a valuable resource for comparative genomics study and help the experimentalist to rapidly access detailed information on various genomic features and putative promoter regions in any given genome. This database is freely accessible for academic and non-academic users on the web at http://nucleix.mbu.iisc.ernet.in/prombase/.
Affiliation(s)
- Vetriselvi Rangannan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560 012, India.
|
43
|
Mishra H, Singh N, Misra K, Lahiri T. An ANN-GA model based promoter prediction in Arabidopsis thaliana using tilling microarray data. Bioinformation 2011; 6:240-3. [PMID: 21887014 PMCID: PMC3159145 DOI: 10.6026/97320630006240] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2011] [Accepted: 05/09/2011] [Indexed: 11/23/2022] Open
Abstract
Identification of promoter regions is an important part of gene annotation. Identification of promoters in eukaryotes is important as promoters modulate various metabolic functions and cellular stress responses. In this work, a novel approach utilizing intensity values of tiling microarray data for a model eukaryotic plant, Arabidopsis thaliana, was used to distinguish promoter regions from non-promoter regions. A feed-forward back propagation neural network model supported by genetic algorithm was employed to predict the class of data with a window size of 41. A dataset comprising 2992 data vectors representing both promoter and non-promoter regions, chosen randomly from probe intensity vectors for the whole genome of Arabidopsis thaliana generated through the tiling microarray technique, was used. The classifier model shows prediction accuracy of 69.73% and 65.36% on training and validation sets, respectively. Further, a concept of distance based class membership was used to validate reliability of the classifier, which showed promising results. The study shows the usability of microarray probe intensities to predict the promoter regions in eukaryotic genomes.
Affiliation(s)
- Hrishikesh Mishra
- Division of Applied Sciences and Indo-Russian Centre for Biotechnology, Indian Institute of Information Technology, Allahabad, India
|
44
|
Jin H, Kanthasamy A, Anantharam V, Rana A, Kanthasamy AG. Transcriptional regulation of pro-apoptotic protein kinase Cdelta: implications for oxidative stress-induced neuronal cell death. J Biol Chem 2011; 286:19840-59. [PMID: 21467032 DOI: 10.1074/jbc.m110.203687] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
Abstract
We previously demonstrated that protein kinase Cδ (PKCδ; PKC delta) is an oxidative stress-sensitive kinase that plays a causal role in apoptotic cell death in neuronal cells. Although PKCδ activation has been extensively studied, relatively little is known about the molecular mechanisms controlling PKCδ expression. To characterize the regulation of PKCδ expression, we cloned an ∼2-kbp 5'-promoter segment of the mouse Prkcd gene. Deletion analysis indicated that the noncoding exon 1 region contained multiple Sp sites, including four GC boxes and one CACCC box, which directed the highest levels of transcription in neuronal cells. In addition, an upstream regulatory region containing adjacent repressive and anti-repressive elements with opposing regulatory activities was identified within the region -712 to -560. Detailed mutagenesis studies revealed that each Sp site made a positive contribution to PKCδ promoter expression. Overexpression of Sp family proteins markedly stimulated PKCδ promoter activity without any synergistic transactivating effect. Furthermore, experiments in Sp-deficient SL2 cells indicated long isoform Sp3 as the essential activator of PKCδ transcription. Importantly, both PKCδ promoter activity and endogenous PKCδ expression in NIE115 cells and primary striatal cultures were inhibited by mithramycin A. The results from chromatin immunoprecipitation and gel shift assays further confirmed the functional binding of Sp proteins to the PKCδ promoter. Additionally, we demonstrated that overexpression of p300 or CREB-binding protein increases the PKCδ promoter activity. This stimulatory effect requires intact Sp-binding sites and is independent of p300 histone acetyltransferase activity. Finally, modulation of Sp transcriptional activity or protein level profoundly altered the cell death induced by oxidative insult, demonstrating the functional significance of Sp-dependent PKCδ gene expression. 
Collectively, our findings may have implications for development of new translational strategies against oxidative damage.
Affiliation(s)
- Huajun Jin
- Parkinson's Disorder Research Laboratory, Iowa Center for Advanced Neurotoxicology, Department of Biomedical Sciences, Iowa State University, Ames, Iowa 50011, USA
|
45
|
Schaefer U, Kodzius R, Kai C, Kawai J, Carninci P, Hayashizaki Y, Bajic VB. High sensitivity TSS prediction: estimates of locations where TSS cannot occur. PLoS One 2010; 5:e13934. [PMID: 21085627 PMCID: PMC2981523 DOI: 10.1371/journal.pone.0013934] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2010] [Accepted: 10/19/2010] [Indexed: 11/26/2022] Open
Abstract
Background Although transcription in mammalian genomes can initiate from various genomic positions (e.g., 3′UTR, coding exons, etc.), most locations on genomes are not prone to transcription initiation. It is of practical and theoretical interest to be able to estimate such collections of non-TSS locations (NTLs). The identification of large portions of NTLs can contribute to better focusing the search for TSS locations and thus contribute to promoter and gene finding. It can help in the assessment of 5′ completeness of expressed sequences, contribute to more successful experimental designs, as well as more accurate gene annotation. Methodology Using comprehensive collections of Cap Analysis of Gene Expression (CAGE) and other transcript data from mouse and human genomes, we developed a methodology that allows us, by performing computational TSS prediction with very high sensitivity, to annotate, with a high accuracy in a strand specific manner, locations of mammalian genomes that are highly unlikely to harbor transcription start sites (TSSs). The properties of the immediate genomic neighborhood of 98,682 accurately determined mouse and 113,814 human TSSs are used to determine features that distinguish genomic transcription initiation locations from those that are not likely to initiate transcription. In our algorithm we utilize various constraining properties of features identified in the upstream and downstream regions around TSSs, as well as statistical analyses of these surrounding regions. Conclusions Our analysis of human chromosomes 4, 21 and 22 estimates ∼46%, ∼41% and ∼27% of these chromosomes, respectively, as being NTLs. This suggests that on average more than 40% of the human genome can be expected to be highly unlikely to initiate transcription. Our method represents the first one that utilizes high-sensitivity TSS prediction to identify, with high accuracy, large portions of mammalian genomes as NTLs. 
The server with our algorithm implemented is available at http://cbrc.kaust.edu.sa/ddm/.
MESH Headings
- Algorithms
- Animals
- Base Sequence
- Chromosomes, Human, Pair 21/genetics
- Chromosomes, Human, Pair 22/genetics
- Chromosomes, Human, Pair 4/genetics
- Computational Biology/methods
- Genome/genetics
- Genome, Human/genetics
- Humans
- Internet
- Mice
- Molecular Sequence Data
- Promoter Regions, Genetic/genetics
- Receptors, Opioid, mu/genetics
- Reproducibility of Results
- Transcription Initiation Site
- Transcription, Genetic
Affiliation(s)
- Ulf Schaefer
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Rimantas Kodzius
- Division of Physical Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
- Chikatoshi Kai
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Jun Kawai
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Piero Carninci
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, Wako, Saitama, Japan
- Yoshihide Hayashizaki
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, Japan
- Vladimir B. Bajic
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Kingdom of Saudi Arabia
|
46
|
Liang H, Barakat A, Schlarbaum SE, Mandoli DF, Carlson JE. Comparison of gene order of GIGANTEA loci in yellow-poplar, monocots, and eudicots. Genome 2010; 53:533-44. [PMID: 20616875 DOI: 10.1139/g10-031] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Abstract
GIGANTEA plays an important role in the control of circadian rhythms and photoperiodic flowering. The GIGANTEA gene has been studied in various species, but not in basal angiosperms. Moreover, to the best of our knowledge, no study of the genome organization of a basal angiosperm has yet been published. In this study, we sequenced a bacterial artificial chromosome (BAC) harboring GIGANTEA from yellow-poplar (Liriodendron tulipifera L.) and compared the genomic organization of this gene in yellow-poplar with that in other species from various angiosperm clades. This is the first report on the gene structure and organization of a large contig in any basal angiosperm species. The BAC clone, covering a region of approximately 122 kb from the yellow-poplar genome, was sequenced and assembled by coupling the 454 pyrosequencing technology with ABI capillary sequencing. In addition to GIGANTEA, the gene RPS18.A (encoding ribosomal protein S18.A) was found in this segment of the genome. We found that gene content and order in this region of the yellow-poplar genome were similar to those in the corresponding region in eudicots but not in Oryza sativa and Sorghum bicolor, implying that clustering of the GIGANTEA and RPS18.A genes is ancestral and separation of the genes occurred after the phylogenetic split of monocots from dicots. Phylogenetic analysis of GIGANTEA amino acid sequences placed yellow-poplar closer to eudicots than to monocots. In addition, evidence for transposition and large insertions and duplications was found, suggesting multiple and complex mechanisms of basal angiosperm genome evolution.
Affiliation(s)
- Haiying Liang
- School of Forest Resources and Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.
|
47
|
Eukaryotic and prokaryotic promoter prediction using hybrid approach. Theory Biosci 2010; 130:91-100. [DOI: 10.1007/s12064-010-0114-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2010] [Accepted: 10/23/2010] [Indexed: 12/27/2022]
|
48
|
LIU YX, HAN YP, CHANG W, ZOU Q, GUO MZ, LI WB. Genomic Analysis of MicroRNA Promoters and Their Cis-Acting Elements in Soybean. ACTA ACUST UNITED AC 2010. [DOI: 10.1016/s1671-2927(09)60252-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
49
|
Rangannan V, Bansal M. High-quality annotation of promoter regions for 913 bacterial genomes. Bioinformatics 2010; 26:3043-50. [PMID: 20956245 DOI: 10.1093/bioinformatics/btq577] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
Abstract
MOTIVATION The number of bacterial genomes being sequenced is increasing very rapidly and hence, it is crucial to have procedures for rapid and reliable annotation of their functional elements such as promoter regions, which control the expression of each gene or each transcription unit of the genome. The present work addresses this requirement and presents a generic method applicable across organisms. RESULTS Relative stability of the DNA double helical sequences has been used to discriminate promoter regions from non-promoter regions. Based on the difference in stability between neighboring regions, an algorithm has been implemented to predict promoter regions on a large scale over 913 microbial genome sequences. The average free energy values for the promoter regions as well as their downstream regions are found to differ, depending on their GC content. Threshold values to identify promoter regions have been derived using sequences flanking a subset of translation start sites from all microbial genomes and then used to predict promoters over the complete genome sequences. An average recall value of 72% (which indicates the percentage of protein and RNA coding genes with predicted promoter regions assigned to them) and precision of 56% is achieved over the 913 microbial genome dataset. AVAILABILITY The binary executable for 'PromPredict' algorithm (implemented in PERL and supported on Linux and MS Windows) and the predicted promoter data for all 913 microbial genomes are available at http://nucleix.mbu.iisc.ernet.in/prombase/.
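The idea of comparing the stability of neighboring regions can be sketched with a sliding window. The code below is not the PromPredict algorithm: it uses GC fraction as a crude stand-in for the average free energy described in the abstract, and the window size, threshold, and function names are assumptions chosen only to make the example run.

```python
def gc_fraction(seq):
    """Fraction of G and C bases; used here as a crude proxy for duplex stability."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def candidate_positions(genome, window=50, threshold=0.2):
    """Return positions whose upstream window is less GC-rich than the downstream one.

    A position i is reported when
    gc(genome[i:i+window]) - gc(genome[i-window:i]) >= threshold,
    i.e. the upstream flank is (by this proxy) less stable than the downstream flank.
    """
    hits = []
    for i in range(window, len(genome) - window + 1):
        upstream = gc_fraction(genome[i - window:i])
        downstream = gc_fraction(genome[i:i + window])
        if downstream - upstream >= threshold:
            hits.append(i)
    return hits

# Toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "AT" * 10 + "GC" * 10
print(candidate_positions(toy, window=10, threshold=0.95))
```

A faithful implementation would replace `gc_fraction` with a free energy calculation and, as the abstract notes, derive thresholds that depend on the GC content of the genome.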
|
50
|
The CC-NB-LRR-type Rdg2a resistance gene confers immunity to the seed-borne barley leaf stripe pathogen in the absence of hypersensitive cell death. PLoS One 2010; 5. [PMID: 20844752 PMCID: PMC2937021 DOI: 10.1371/journal.pone.0012599] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Received: 05/26/2010] [Accepted: 08/12/2010] [Indexed: 01/04/2023] Open
Abstract
Background: Leaf stripe disease on barley (Hordeum vulgare) is caused by the seed-transmitted hemi-biotrophic fungus Pyrenophora graminea. Race-specific resistance to leaf stripe is controlled by two known Rdg (Resistance to Drechslera graminea) genes: the H. spontaneum-derived Rdg1a and Rdg2a, identified in H. vulgare. The aim of the present work was to isolate the Rdg2a leaf stripe resistance gene, to characterize the organization and evolution of the Rdg2a locus, and to elucidate the histological bases of Rdg2a-based leaf stripe resistance. Principal Findings: We describe here the positional cloning and functional characterization of the leaf stripe resistance gene Rdg2a. At the Rdg2a locus, three sequence-related genes encoding coiled-coil, nucleotide-binding site, and leucine-rich repeat (CC-NB-LRR) proteins were identified. Sequence comparisons suggested that paralogs of this resistance locus evolved through recent gene duplication and were subjected to frequent sequence exchange. Transformation of the leaf stripe-susceptible cv. Golden Promise with two Rdg2a candidates under the control of their native 5′ regulatory sequences identified a member of the CC-NB-LRR gene family that conferred resistance against the Dg2 leaf stripe isolate, against which the Rdg2a gene is effective. Histological analysis demonstrated that Rdg2a-mediated leaf stripe resistance involves autofluorescing cells and prevents pathogen colonization in the embryos without any detectable hypersensitive cell death response, supporting a cell wall reinforcement-based resistance mechanism. Conclusions: This work reports the cloning of a resistance gene effective against a seed-borne disease. We observed that Rdg2a was subjected to diversifying selection, which is consistent with a model in which the R gene co-evolves with a pathogen effector(s) gene. We propose that inducible responses giving rise to physical and chemical barriers to infection in the cell walls and intercellular spaces of the barley embryo tissues represent mechanisms by which the CC-NB-LRR-encoding Rdg2a gene mediates resistance to leaf stripe in the absence of hypersensitive cell death.
|