1
Vora DS, Kalakoti Y, Sundar D. Computational Methods and Deep Learning for Elucidating Protein Interaction Networks. Methods Mol Biol 2023; 2553:285-323. [PMID: 36227550 DOI: 10.1007/978-1-0716-2617-7_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Protein interactions play a critical role in all biological processes, but experimental identification of protein interactions is a time- and resource-intensive process. The advances in next-generation sequencing and multi-omics technologies have greatly benefited large-scale predictions of protein interactions using machine learning methods. A wide range of tools have been developed to predict protein-protein, protein-nucleic acid, and protein-drug interactions. Here, we discuss the applications, methods, and challenges faced when employing the various prediction methods. We also briefly describe ways to overcome the challenges and prospective future developments in the field of protein interaction biology.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Yogesh Kalakoti
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India
2
Randhawa V, Pathania S. Advancing from protein interactomes and gene co-expression networks towards multi-omics-based composite networks: approaches for predicting and extracting biological knowledge. Brief Funct Genomics 2020; 19:364-376. [PMID: 32678894 DOI: 10.1093/bfgp/elaa015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/16/2020] [Revised: 05/31/2020] [Accepted: 06/15/2020] [Indexed: 01/17/2023]
Abstract
Prediction of biological interaction networks from single-omics data has been extensively implemented to understand various aspects of biological systems. However, more recently, there is a growing interest in integrating multi-omics datasets for the prediction of interactomes that provide a global view of biological systems with higher descriptive capability, as compared to single omics. In this review, we have discussed various computational approaches implemented to infer and analyze two of the most important and well studied interactomes: protein-protein interaction networks and gene co-expression networks. We have explicitly focused on recent methods and pipelines implemented to infer and extract biologically important information from these interactomes, starting from utilizing single-omics data and then progressing towards multi-omics data. Accordingly, recent examples and case studies are also briefly discussed. Overall, this review will provide a proper understanding of the latest developments in protein and gene network modelling and will also help in extracting practical knowledge from them.
Affiliation(s)
- Vinay Randhawa
  - Department of Biochemistry, Panjab University, Chandigarh, 160014, India
- Shivalika Pathania
  - Department of Biotechnology, Panjab University, Chandigarh, 160014, India
3
Protective Roles of Cytosolic and Plastidal Proteasomes on Abiotic Stress and Pathogen Invasion. PLANTS 2020; 9:plants9070832. [PMID: 32630761 PMCID: PMC7412383 DOI: 10.3390/plants9070832] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/02/2020] [Revised: 06/30/2020] [Accepted: 06/30/2020] [Indexed: 01/18/2023]
Abstract
Protein malfunction is typically caused by abiotic stressors. To ensure cell survival during conditions of stress, it is important for plant cells to maintain proteins in their respective functional conformation. Self-compartmentalizing proteases, such as ATP-dependent Clp proteases and proteasomes are designed to act in the crowded cellular environment, and they are responsible for degradation of misfolded or damaged proteins within the cell. During different types of stress conditions, the levels of misfolded or orphaned proteins that are degraded by the 26S proteasome in the cytosol and nucleus and by the Clp proteases in the mitochondria and chloroplasts increase. This allows cells to uphold feedback regulations to cellular-level signals and adjust to altered environmental conditions. In this review, we summarize recent findings on plant proteolytic complexes with respect to their protective functions against abiotic and biotic stressors.
4
Vilela A, Bacelar E, Pinto T, Anjos R, Correia E, Gonçalves B, Cosme F. Beverage and Food Fragrance Biotechnology, Novel Applications, Sensory and Sensor Techniques: An Overview. Foods 2019; 8:E643. [PMID: 31817355 PMCID: PMC6963671 DOI: 10.3390/foods8120643] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/24/2019] [Revised: 11/29/2019] [Accepted: 12/03/2019] [Indexed: 12/19/2022]
Abstract
Flavours and fragrances are especially important for the beverage and food industries. Biosynthesis or extraction are the two main ways to obtain these important compounds that have many different chemical structures. Consequently, the search for new compounds is challenging for academic and industrial investigation. This overview aims to present the current state of art of beverage fragrance biotechnology, including recent advances in sensory and sensor methodologies and statistical techniques for data analysis. An overview of all the recent findings in beverage and food fragrance biotechnology, including those obtained from natural sources by extraction processes (natural plants as an important source of flavours) or using enzymatic precursor (hydrolytic enzymes), and those obtained by de novo synthesis (microorganisms' respiration/fermentation of simple substrates such as glucose and sucrose), are reviewed. Recent advances have been made in what concerns "beverage fragrances construction" as also in their application products. Moreover, novel sensory and sensor methodologies, primarily used for fragrances quality evaluation, have been developed, as have statistical techniques for sensory and sensors data treatments, allowing a rapid and objective analysis.
Affiliation(s)
- Alice Vilela
  - CQ-VR, Chemistry Research Centre, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
- Eunice Bacelar
  - CITAB, Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
- Teresa Pinto
  - CITAB, Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
- Rosário Anjos
  - CITAB, Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
- Elisete Correia
  - CQ-VR, Chemistry Research Centre, Department of Mathematics, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
  - Center for Computational and Stochastic Mathematics (CEMAT), Department of Mathematics, IST-UL, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
- Berta Gonçalves
  - CITAB, Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
- Fernanda Cosme
  - CQ-VR, Chemistry Research Centre, Department of Biology and Environment, School of Life Sciences and Environment, University of Trás-os-Montes and Alto Douro, P-5000-801 Vila Real, Portugal
5
Mahieu NG, Patti GJ. Systems-Level Annotation of a Metabolomics Data Set Reduces 25 000 Features to Fewer than 1000 Unique Metabolites. Anal Chem 2017; 89:10397-10406. [PMID: 28914531 PMCID: PMC6427824 DOI: 10.1021/acs.analchem.7b02380] [Citation(s) in RCA: 183] [Impact Index Per Article: 26.1] [Indexed: 12/13/2022]
Abstract
When using liquid chromatography/mass spectrometry (LC/MS) to perform untargeted metabolomics, it is now routine to detect tens of thousands of features from biological samples. Poor understanding of the data, however, has complicated interpretation and masked the number of unique metabolites actually being measured in an experiment. Here we place an upper bound on the number of unique metabolites detected in Escherichia coli samples analyzed with one untargeted metabolomics method. We first group multiple features arising from the same analyte, which we call "degenerate features", using a context-driven annotation approach. Surprisingly, this analysis revealed thousands of previously unreported degeneracies that reduced the number of unique analytes to ∼2961. We then applied an orthogonal approach to remove nonbiological features from the data using the 13C-based credentialing technology. This further reduced the number of unique analytes to less than 1000. Our 90% reduction in data is 5-fold greater than previously published studies. On the basis of the results, we propose an alternative approach to untargeted metabolomics that relies on thoroughly annotated reference data sets. To this end, we introduce the creDBle database ( http://creDBle.wustl.edu ), which contains accurate mass, retention time, and MS/MS fragmentation data as well as annotations of all credentialed features.
Affiliation(s)
- Nathaniel G. Mahieu
  - Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
- Gary J. Patti
  - Department of Chemistry, Washington University, St. Louis, Missouri 63130, United States
6
Robinson JL, Nielsen J. Integrative analysis of human omics data using biomolecular networks. MOLECULAR BIOSYSTEMS 2016; 12:2953-64. [PMID: 27510223 DOI: 10.1039/c6mb00476h] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 12/28/2022]
Abstract
High-throughput '-omics' technologies have given rise to an increasing abundance of genome-scale data detailing human biology at the molecular level. Although these datasets have already made substantial contributions to a more comprehensive understanding of human physiology and diseases, their interpretation becomes increasingly cryptic and nontrivial as they continue to expand in size and complexity. Systems biology networks offer a scaffold upon which omics data can be integrated, facilitating the extraction of new and physiologically relevant information from the data. Two of the most prevalent networks that have been used for such integrative analyses of omics data are genome-scale metabolic models (GEMs) and protein-protein interaction (PPI) networks, both of which have demonstrated success among many different omics and sample types. This integrative approach seeks to unite 'top-down' omics data with 'bottom-up' biological networks in a synergistic fashion that draws on the strengths of both strategies. As the volume and resolution of high-throughput omics data continue to grow, integrative network-based analyses are expected to play an increasingly important role in their interpretation.
Affiliation(s)
- Jonathan L Robinson
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE412 96 Gothenburg, Sweden.
7
Bens M, Sahm A, Groth M, Jahn N, Morhart M, Holtze S, Hildebrandt TB, Platzer M, Szafranski K. FRAMA: from RNA-seq data to annotated mRNA assemblies. BMC Genomics 2016; 17:54. [PMID: 26763976 PMCID: PMC4712544 DOI: 10.1186/s12864-015-2349-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 08/01/2015] [Accepted: 12/22/2015] [Indexed: 11/25/2022]
Abstract
Background Advances in second-generation sequencing of RNA made a near-complete characterization of transcriptomes affordable. However, the reconstruction of full-length mRNAs via de novo RNA-seq assembly is still difficult due to the complexity of eukaryote transcriptomes with highly similar paralogs and multiple alternative splice variants. Here, we present FRAMA, a genome-independent annotation tool for de novo mRNA assemblies that addresses several post-assembly tasks, such as reduction of contig redundancy, ortholog assignment, correction of misassembled transcripts, scaffolding of fragmented transcripts and coding sequence identification. Results We applied FRAMA to assemble and annotate the transcriptome of the naked mole-rat and assessed the quality of the obtained compilation of transcripts with the aid of publicly available naked mole-rat gene annotations. Based on a de novo transcriptome assembly (Trinity), FRAMA annotated 21,984 naked mole-rat mRNAs (12,100 full-length CDSs), corresponding to 16,887 genes. The scaffolding of 3488 genes increased the median sequence information 1.27-fold. In total, FRAMA detected and corrected 4774 misassembled genes, which were predominantly caused by fusion of genes. A comparison with three different sources of naked mole-rat transcripts reveals that FRAMA’s gene models are better supported by RNA-seq data than any other transcript set. Further, our results demonstrate the competitiveness of FRAMA to state-of-the-art genome-based transcript reconstruction approaches. Conclusion FRAMA realizes the de novo construction of a low-redundant transcript catalog for eukaryotes, including the extension and refinement of transcripts. Thereby, results delivered by FRAMA provide the basis for comprehensive downstream analyses like gene expression studies or comparative transcriptomics. FRAMA is available at https://github.com/gengit/FRAMA.
Affiliation(s)
- Martin Bens
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
- Arne Sahm
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
- Marco Groth
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
- Niels Jahn
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
- Michaela Morhart
  - Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, 10315, Berlin, Germany.
- Susanne Holtze
  - Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, 10315, Berlin, Germany.
- Thomas B Hildebrandt
  - Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Straße 17, 10315, Berlin, Germany.
- Matthias Platzer
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
- Karol Szafranski
  - Leibniz Institute on Ageing - Fritz Lipmann Institute, Beutenbergstr. 11, 07745, Jena, Germany.
8
Wan D, Wang X, Wu Q, Lin P, Pan Y, Sattar A, Huang L, Ahmad I, Zhang Y, Yuan Z. Integrated Transcriptional and Proteomic Analysis of Growth Hormone Suppression Mediated by Trichothecene T-2 Toxin in Rat GH3 Cells. Toxicol Sci 2015; 147:326-38. [PMID: 26141394 DOI: 10.1093/toxsci/kfv131] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 12/16/2023]
Abstract
Chronic exposure to trichothecenes is known to disturb insulin-like growth factor 1 and signaling of insulin and leptin hormones and causes considerable growth retardation in animals. However, limited information was available on mechanisms underlying trichothecene-induced growth retardation. In this study, we employed an integrated transcriptomics, proteomics, and RNA interference (RNAi) approach to study the molecular mechanisms underlying trichothecene cytotoxicity in rat pituitary adenoma GH3 cells. Our results showed that trichothecenes suppressed the synthesis of growth hormone 1 (Gh1) and inhibited the eukaryotic transcription and translation initiation by suppressing aminoacyl-tRNA synthetases transcription, inducing eukaryotic translation initiation factor 2-alpha kinase 2 (EIF2AK2) and reducing eukaryotic translation initiation factor 5a. The sulfhydryl oxidases, protein disulfide isomerase, and heat shock protein 90 were greatly reduced, which resulted in adverse regulation of protein processing and folding. Differential genes and proteins associated with a decline in energy metabolism and cell cycle arrest were also found in our study. However, use of RNAi to interfere with hemopoietic cell kinase (Hck) and EIF2AK2 transcriptions or use of chemical inhibitors of MAPK, p38, Ras, and JNK partially reversed the reduction of Gh1 levels induced by trichothecenes. This indicated that the activation of MAPKs, Hck, and EIF2AK2 was important for trichothecene-induced growth hormone suppression. Considering the potential hazards of exposure to trichothecenes, our findings could help to improve our understanding regarding human and animal health implications.
Affiliation(s)
- Dan Wan
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China; Research Center of Healthy Livestock Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
- Xu Wang
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
- Qinghua Wu
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; College of Life Science, Yangtze University, Jingzhou, Hubei, China; and Center for Basic and Applied Research, Faculty of Informatics and Management, University of Hradec Kralove, Hradec Kralove, Czech Republic
- Pingping Lin
  - MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
- Yuanhu Pan
  - MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
- Adeel Sattar
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
- Lingli Huang
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University
- Ijaz Ahmad
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
- Yuanyuan Zhang
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University
- Zonghui Yuan
  - National Reference Laboratory of Veterinary Drug Residues (HZAU) and MAO Key Laboratory for Detection of Veterinary Drug Residues; MOA Laboratory for Risk Assessment of Quality and Safety of Livestock and Poultry Products, Huazhong Agricultural University; Hubei Collaborative Innovation Center for Animal Nutrition and Feed Safety, Wuhan, Hubei, China
9
Yeger-Lotem E, Sharan R. Human protein interaction networks across tissues and diseases. Front Genet 2015; 6:257. [PMID: 26347769 PMCID: PMC4541328 DOI: 10.3389/fgene.2015.00257] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 05/28/2015] [Accepted: 07/17/2015] [Indexed: 11/13/2022]
Abstract
Protein interaction networks are an important framework for studying protein function, cellular processes, and genotype-to-phenotype relationships. While our view of the human interaction network is constantly expanding, less is known about networks that form in biologically important contexts such as within distinct tissues or in disease conditions. Here we review efforts to characterize these networks and to harness them to gain insights into the molecular mechanisms underlying human disease.
Affiliation(s)
- Esti Yeger-Lotem
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Roded Sharan
  - Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
10
Affiliation(s)
- Ratnesh Chandra Mishra
  - Department of Plant Molecular Biology, University of Delhi South Campus, New Delhi, India
- Anil Grover
  - Department of Plant Molecular Biology, University of Delhi South Campus, New Delhi, India
11
Wijayawardena BK, Minchella DJ, DeWoody JA. Horizontal gene transfer in schistosomes: A critical assessment. Mol Biochem Parasitol 2015; 201:57-65. [DOI: 10.1016/j.molbiopara.2015.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/19/2015] [Revised: 05/27/2015] [Accepted: 05/29/2015] [Indexed: 02/04/2023]
12
Kim T, Song HK, Hong SE, Kim DH. Meta-analysis of interspecies microarray sets of cardiac diseases revealed common and disease-specific signatures. Anim Cells Syst (Seoul) 2013. [DOI: 10.1080/19768354.2013.861868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
13
Up-regulation of SUMO1 pseudogene 3 (SUMO1P3) in gastric cancer and its clinical association. Med Oncol 2013; 30:709. [PMID: 23996296 DOI: 10.1007/s12032-013-0709-2] [Citation(s) in RCA: 100] [Impact Index Per Article: 9.1] [Received: 08/06/2013] [Accepted: 08/20/2013] [Indexed: 12/14/2022]
Abstract
Long noncoding RNAs (lncRNAs) play crucial roles during cancer occurrence and progression. The pseudogene-expressed lncRNA is one major type of lncRNA family. However, their association with cancers is largely unknown. In this study, we focused on small ubiquitin-like modifier (SUMO) 1 pseudogene 3, SUMO1P3. Gastric cancer tissues and adjacent nontumor tissues were collected from 96 patients with gastric cancer. The SUMO1P3 levels were detected by quantitative reverse transcription-polymerase chain reaction. Then, the association between the level of SUMO1P3 in gastric cancer tissues and the clinicopathological features of patients with gastric cancer was further analyzed. A receiver operating characteristic curve was constructed for differentiating patients with gastric cancer from patients with benign gastric diseases. The results showed that SUMO1P3 was significantly up-regulated in gastric cancer tissues compared with paired-adjacent nontumorous tissues (p < 0.01). Its expression level was significantly correlated with tumor size (p = 0.003), differentiation (p = 0.002), lymphatic metastasis (p = 0.001), and invasion (p = 0.039). The area under the ROC curve of SUMO1P3 was up to 0.666. These results indicated, for the first time, that pseudogene-expressed lncRNA SUMO1P3 may be a potential biomarker in the diagnosis of gastric cancer.
14
Bovolenta M, Erriquez D, Valli E, Brioschi S, Scotton C, Neri M, Falzarano MS, Gherardi S, Fabris M, Rimessi P, Gualandi F, Perini G, Ferlini A. The DMD locus harbours multiple long non-coding RNAs which orchestrate and control transcription of muscle dystrophin mRNA isoforms. PLoS One 2012; 7:e45328. [PMID: 23028937 PMCID: PMC3448672 DOI: 10.1371/journal.pone.0045328] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 03/04/2011] [Accepted: 08/20/2012] [Indexed: 11/18/2022]
Abstract
The 2.2 Mb long dystrophin (DMD) gene, the largest gene in the human genome, corresponds to roughly 0.1% of the entire human DNA sequence. Mutations in this gene cause Duchenne muscular dystrophy and other milder X-linked, recessive dystrophinopathies. Using a custom-made tiling array, specifically designed for the DMD locus, we identified a variety of novel long non-coding RNAs (lncRNAs), both sense and antisense oriented, whose expression profiles mirror that of the DMD gene. Importantly, these transcripts are intronic in origin and specifically localized to the nucleus and are transcribed contextually with dystrophin isoforms or primed by MyoD-induced myogenic differentiation. Furthermore, their forced ectopic expression in both human muscle and neuronal cells causes a specific and negative regulation of endogenous dystrophin full length isoforms and significantly down-regulates the activity of a luciferase reporter construct carrying the minimal promoter regions of the muscle dystrophin isoform. Consistent with this apparently repressive role, we found that, in muscle samples of dystrophinopathic female carriers, lncRNAs expression levels inversely correlate with those of muscle full length DMD isoforms. Overall these findings unveil an unprecedented complexity of the transcriptional pattern of the DMD locus and reveal that DMD lncRNAs may contribute to the orchestration and homeostasis of the muscle dystrophin expression pattern by selectively targeting and down-modulating the dystrophin promoter transcriptional activity.
Affiliation(s)
- Matteo Bovolenta
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Daniela Erriquez
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Emanuele Valli
  - Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Simona Brioschi
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Chiara Scotton
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Marcella Neri
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Maria Sofia Falzarano
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Samuele Gherardi
  - Department of Pharmacy and Biotechnology, Health Sciences and Technologies – Interdepartmental Center for Industrial Research (HST-ICIR), University of Bologna, Bologna, Italy
- Marina Fabris
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Paola Rimessi
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Francesca Gualandi
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
- Giovanni Perini
  - Department of Pharmacy and Biotechnology, Health Sciences and Technologies – Interdepartmental Center for Industrial Research (HST-ICIR), University of Bologna, Bologna, Italy
- Alessandra Ferlini
  - Department of Medical Science, Section of Medical Genetics, University of Ferrara, Ferrara, Italy
15
Peanut (Arachis hypogaea) Expressed Sequence Tag Project: Progress and Application. Comp Funct Genomics 2012; 2012:373768. [PMID: 22745594 PMCID: PMC3382957 DOI: 10.1155/2012/373768] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 03/21/2012] [Accepted: 04/26/2012] [Indexed: 12/12/2022]
Abstract
Many plant ESTs have been sequenced as an alternative to whole genome sequences, including peanut because of the genome size and complexity. The US peanut research community had the historic 2004 Atlanta Genomics Workshop and named the EST project as a main priority. As of August 2011, the peanut research community had deposited 252,832 ESTs in the public NCBI EST database, and this resource has been providing the community valuable tools and core foundations for various genome-scale experiments before the whole genome sequencing project. These EST resources have been used for marker development, gene cloning, microarray gene expression and genetic map construction. Certainly, the peanut EST sequence resources have been shown to have a wide range of applications and accomplished its essential role at the time of need. Then the EST project contributes to the second historic event, the Peanut Genome Project 2010 Inaugural Meeting also held in Atlanta where it was decided to sequence the entire peanut genome. After the completion of peanut whole genome sequencing, ESTs or transcriptome will continue to play an important role to fill in knowledge gaps, to identify particular genes and to explore gene function.
16
Zhou S, Ji G, Liu X, Li P, Moler J, Karro JE, Liang C. Pattern analysis approach reveals restriction enzyme cutting abnormalities and other cDNA library construction artifacts using raw EST data. BMC Biotechnol 2012; 12:16. [PMID: 22554190 PMCID: PMC3424822 DOI: 10.1186/1472-6750-12-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/18/2011] [Accepted: 03/15/2012] [Indexed: 11/12/2022]
Abstract
Background Expressed Sequence Tag (EST) sequences are widely used in applications such as genome annotation, gene discovery and gene expression studies. However, some of GenBank dbEST sequences have proven to be “unclean”. Identification of cDNA termini/ends and their structures in raw ESTs not only facilitates data quality control and accurate delineation of transcription ends, but also furthers our understanding of the potential sources of data abnormalities/errors present in the wet-lab procedures for cDNA library construction. Results After analyzing a total of 309,976 raw Pinus taeda ESTs, we uncovered many distinct variations of cDNA termini, some of which prove to be good indicators of wet-lab artifacts, and characterized each raw EST by its cDNA terminus structure patterns. In contrast to the expected patterns, many ESTs displayed complex and/or abnormal patterns that represent potential wet-lab errors such as: a failure of one or both of the restriction enzymes to cut the plasmid vector; a failure of the restriction enzymes to cut the vector at the correct positions; the insertion of two cDNA inserts into a single vector; the insertion of multiple and/or concatenated adapters/linkers; the presence of 3′-end terminal structures in designated 5′-end sequences or vice versa; and so on. With a close examination of these artifacts, many problematic ESTs that have been deposited into public databases by conventional bioinformatics pipelines or tools could be cleaned or filtered by our methodology. We developed a software tool for Abnormality Filtering and Sequence Trimming for ESTs (AFST, http://code.google.com/p/afst/) using a pattern analysis approach. To compare AFST with other pipelines that submitted ESTs into dbEST, we reprocessed 230,783 Pinus taeda and 38,709 Arachis hypogaea GenBank ESTs. We found 7.4% of Pinus taeda and 29.2% of Arachis hypogaea GenBank ESTs are “unclean” or abnormal, all of which could be cleaned or filtered by AFST. 
Conclusions cDNA terminal pattern analysis, as implemented in the AFST software tool, can be utilized to reveal wet-lab errors such as restriction enzyme cutting abnormities and chimeric EST sequences, detect various data abnormalities embedded in existing Sanger EST datasets, improve the accuracy of identifying and extracting bona fide cDNA inserts from raw ESTs, and therefore greatly benefit downstream EST-based applications.
Affiliation(s)
- Sun Zhou
- Department of Automation, Xiamen University, Fujian, China.
|
17
|
Chen YC, Chen YC, Lin WD, Hsiao CD, Chiu HW, Ho JM. Bio301: A Web-Based EST Annotation Pipeline That Facilitates Functional Comparison Studies. ISRN BIOINFORMATICS 2011; 2012:139842. [PMID: 25969743 PMCID: PMC4407203 DOI: 10.5402/2012/139842] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/25/2011] [Accepted: 09/05/2011] [Indexed: 11/23/2022]
Abstract
In this postgenomic era, a huge volume of information derived from expressed sequence tags (ESTs) has been constructed for functional description of gene expression profiles. Comparative studies have become more and more important to researchers of biology. In order to facilitate these comparative studies, we have constructed a user-friendly EST annotation pipeline with comparison tools on an integrated EST service website, Bio301. Bio301 includes regular EST preprocessing, BLAST similarity search, gene ontology (GO) annotation, statistics reporting, a graphical GO browsing interface, and microarray probe selection tools. In addition, Bio301 is equipped with statistical library comparison functions using multiple EST libraries based on GO annotations for mining meaningful biological information.
Affiliation(s)
- Yen-Chen Chen
- Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan
- Yun-Ching Chen
- Department of Biomedical Engineering, The Whitaker Biomedical Engineering Institute at Johns Hopkins University School of Medicine, 720 Rutland Avenue, Baltimore, MD 21205, USA
- Wen-Dar Lin
- Institute of Plant and Microbial Biology, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan
- Chung-Der Hsiao
- Department of Bioscience Technology, Chung Yuan Christian University, 200 Chung Pei Road, Chung Li City 32073, Taiwan
- Hung-Wen Chiu
- Graduate Institute of Biomedical Informatics, Taipei Medical University, 250 Wu-Hsing Street, Taipei City 110, Taiwan
- Jan-Ming Ho
- Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan
|
18
|
Wu H, Wang J, Deng R, Xing K, Xiong Y, Huang J, He X, Wang X. Benefits of random-priming: exhaustive survey of a cDNA library from lung tissue of a SARS patient. J Med Virol 2011; 83:574-86. [PMID: 21328370 PMCID: PMC7166665 DOI: 10.1002/jmv.22012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Severe acute respiratory syndrome (SARS) causes severe, multifactorial injury to the lungs, though the pathogenesis is still largely unclear. This paper describes analyses of the transcriptome of human lung tissue that was infected by SARS‐associated coronavirus (SARS‐CoV). Random primers were used to produce ESTs from total RNA samples of the lung tissue. The result showed a high diversity of the transcripts, covering much of the human genome, including loci which do not contain protein coding sequences. 10,801 ESTs were generated and assembled into 267 contigs plus 7,659 singletons. Sequences matching SARS‐CoV RNAs and other pneumonia‐related microbes were found. The transcripts were well classified by functional annotation. Among the 7,872 assembled sequences that were identified as originating from the human genome, 578 non‐coding genes were revealed by BLAST search. The transcripts were mapped to the human genome with the restriction of identity = 100%, which yielded a candidate pool of 448 novel transcriptional loci where no EST transcriptional signal had been found before. Among these, 13 loci had never been reported as transcribed by other detection methods such as gene chips, tiling arrays, and paired‐end ditags (PETs). The result showed that a random‐priming cDNA library is valid for the investigation of transcript diversity in virus‐infected tissue. The EST data could be a useful supplemental source for SARS pathology research.
Affiliation(s)
- Hongkai Wu
- State Key Laboratory of Biocontrol, Sun Yat-sen University, Xingangxi Road, Guangzhou, People's Republic of China
|
19
|
Lu S, Friesen TL, Faris JD. Molecular characterization and genomic mapping of the pathogenesis-related protein 1 (PR-1) gene family in hexaploid wheat (Triticum aestivum L.). Mol Genet Genomics 2011; 285:485-503. [PMID: 21516334 DOI: 10.1007/s00438-011-0618-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/20/2011] [Accepted: 03/30/2011] [Indexed: 12/30/2022]
Abstract
The group 1 pathogenesis-related (PR-1) proteins, known as hallmarks of defense pathways, are encoded by multigene families in plants as evidenced by the presence of 22 and 32 PR-1 genes in the finished Arabidopsis and rice genomes, respectively. Here, we report the initial characterization and mapping of 23 PR-1-like (TaPr-1) genes in hexaploid wheat (Triticum aestivum L.), which possesses one of the largest (>16,000 megabases) genomes among monocot crop plants. Sequence analysis revealed that the 23 TaPr-1 genes all contain intron-free open reading frames that encode a signal peptide at the N-terminus and a conserved PR-1-like domain. Phylogenetic analysis indicated that TaPr-1 genes form three major monophyletic groups along with their counterparts in other monocots; each group consists of genes encoding basic, basic with a C-terminal extension, and acidic PR-1 proteins, respectively, suggesting diversity and conservation of PR-1 gene functions in monocot plants. Mapping analysis assisted by untranslated region-specified discrimination (USD) markers and various cytogenetic stocks located the 23 TaPr-1 genes to seven different chromosomes, with the majority mapping to chromosomes of homoeologous groups 5 and 7. Reverse transcriptase (RT)-PCR analysis revealed that 12 TaPr-1 genes were induced or up-regulated upon pathogen challenge. Together, this study provides insights to the origin, evolution, homoeologous relationships, and expression patterns of the TaPr-1 genes. The data presented provide critical information for further genome-wide characterization of the wheat PR-1 gene family and the USD markers developed will facilitate genetic and functional analysis of PR-1 genes associated with plant defense and/or other important traits.
Affiliation(s)
- Shunwen Lu
- USDA-ARS, Cereal Crops Research Unit, Northern Crop Science Laboratory, Fargo, ND 58102-2765, USA.
|
20
|
Bioinformatic Approaches for Identification of A-to-I Editing Sites. Curr Top Microbiol Immunol 2011; 353:145-62. [DOI: 10.1007/82_2011_147] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 01/29/2023]
|
21
|
Understanding Vegetative Desiccation Tolerance Using Integrated Functional Genomics Approaches Within a Comparative Evolutionary Framework. PLANT DESICCATION TOLERANCE 2011. [DOI: 10.1007/978-3-642-19106-0_15] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 01/09/2023]
|
22
|
Li Y, Chia JM, Bartfai R, Christoffels A, Yue GH, Ding K, Ho MY, Hill JA, Stupka E, Orban L. Comparative analysis of the testis and ovary transcriptomes in zebrafish by combining experimental and computational tools. Comp Funct Genomics 2010; 5:403-18. [PMID: 18629171 PMCID: PMC2447462 DOI: 10.1002/cfg.418] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 05/04/2004] [Revised: 06/25/2004] [Accepted: 06/28/2004] [Indexed: 11/12/2022] Open
Abstract
Studies on the zebrafish model have contributed to our understanding of several important developmental processes, especially those that can be easily studied in the embryo. However, our knowledge on late events such as gonad differentiation in the zebrafish is still limited. Here we provide an analysis on the gene sets expressed in the adult zebrafish testis and ovary in an attempt to identify genes with potential role in (zebra)fish gonad development and function. We produced 10,533 expressed sequence tags (ESTs) from zebrafish testis or ovary and downloaded an additional 23,642 gonad-derived sequences from the zebrafish EST database. We clustered these sequences together with over 13,000 kidney-derived zebrafish ESTs to study partial transcriptomes for these three organs. We searched for genes with gonad-specific expression by screening macroarrays containing at least 2600 unique cDNA inserts with testis-, ovary- and kidney-derived cDNA probes. Clones hybridizing to only one of the two gonad probes were selected, and subsequently screened with computational tools to identify 72 genes with potentially testis-specific and 97 genes with potentially ovary-specific expression, respectively. PCR-amplification confirmed gonad-specificity for 21 of the 45 clones tested (all without known function). Our study, which involves over 47,000 EST sequences and specialized cDNA arrays, is the first analysis of adult organ transcriptomes of zebrafish at such a scale. The study of genes expressed in adult zebrafish testis and ovary will provide useful information on regulation of gene expression in teleost gonads and might also contribute to our understanding of the development and differentiation of reproductive organs in vertebrates.
Affiliation(s)
- Yang Li
- Reproductive Genomics Group, Temasek Lifesciences Laboratory, Singapore
|
23
|
Zaranek AW, Levanon EY, Zecharia T, Clegg T, Church GM. A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing. PLoS Genet 2010; 6:e1000954. [PMID: 20531933 PMCID: PMC2873906 DOI: 10.1371/journal.pgen.1000954] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Received: 08/20/2009] [Accepted: 04/15/2010] [Indexed: 11/19/2022] Open
Abstract
While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events is challenging. Here we use large genomic data sets, such as the two billion sequences in the NCBI Trace Archive, to look for clusters of mismatches of the same type, which are a hallmark of editing events caused by APOBEC3 and ADAR. We align 603,249,815 traces from the NCBI trace archive to their reference genomes. In clusters of mismatches of increasing size, at least one systematic sequencing error dominates the results (G-to-A). It is still present in mismatches with 99% accuracy and only vanishes in mismatches at 99.99% accuracy or higher. The error appears to have entered into about 1% of the HapMap, possibly affecting other users that rely on this resource. Further investigation, using stringent quality thresholds, uncovers thousands of mismatch clusters with no apparent defects in their chromatograms. These traces provide the first reported candidates of endogenous DNA editing in human, further elucidating RNA editing in human and mouse and also revealing, for the first time, extensive RNA editing in Xenopus tropicalis. We show that the NCBI Trace Archive provides a valuable resource for the investigation of the phenomena of DNA and RNA editing, as well as setting the stage for a comprehensive mapping of editing events in large-scale genomic datasets.
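The idea of looking for clusters of mismatches of the same type can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the gapless pairwise input, and the `min_cluster` threshold of 3 are assumptions made only for illustration. A-to-I RNA editing is read as A-to-G in sequence data, which is why the toy example uses that mismatch type.

```python
from collections import Counter

BASES = set("ACGT")

def mismatch_types(ref, read):
    """Count mismatches by type (e.g. 'A>G') in a gapless pairwise alignment.

    Assumes ref and read are already aligned and of equal length
    (an illustrative simplification; real traces need a proper aligner).
    """
    return Counter(
        f"{r}>{q}"
        for r, q in zip(ref, read)
        if r != q and r in BASES and q in BASES
    )

def editing_candidate(ref, read, min_cluster=3):
    """Return the mismatch type if the read carries a same-type cluster.

    A read is flagged only when every mismatch is of one type and there
    are at least min_cluster of them (hypothetical threshold).
    """
    counts = mismatch_types(ref, read)
    if not counts:
        return None
    kind, n = counts.most_common(1)[0]
    if n >= min_cluster and n == sum(counts.values()):
        return kind
    return None

# Toy example: three A>G mismatches and nothing else.
print(editing_candidate("ACGTACGTACGT", "GCGTGCGTGCGT"))
```

Requiring all mismatches to share one type is a deliberately strict choice for the sketch; a real analysis would also weigh base-quality scores, as the abstract's discussion of accuracy thresholds suggests.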
Affiliation(s)
- Alexander Wait Zaranek
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- Erez Y. Levanon
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
- Tom Clegg
- Scalable Computing Experts, Somerville, Massachusetts, United States of America
- George M. Church
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
|
24
|
Ramirez AB, Loch CM, Zhang Y, Liu Y, Wang X, Wayner EA, Sargent JE, Sibani S, Hainsworth E, Mendoza EA, Eugene R, Labaer J, Urban ND, McIntosh MW, Lampe PD. Use of a single-chain antibody library for ovarian cancer biomarker discovery. Mol Cell Proteomics 2010; 9:1449-60. [PMID: 20467042 DOI: 10.1074/mcp.m900496-mcp200] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 11/06/2022] Open
Abstract
The discovery of novel early detection biomarkers of disease could offer one of the best approaches to decrease the morbidity and mortality of ovarian and other cancers. We report on the use of a single-chain variable fragment antibody library for screening ovarian serum to find novel biomarkers for the detection of cancer. We alternately panned the library with ovarian cancer and disease-free control sera to make a sublibrary of antibodies that bind proteins differentially expressed in cancer. This sublibrary was printed on antibody microarrays that were incubated with labeled serum from multiple sets of cancer patients and controls. The antibodies that performed best at discriminating disease status were selected, and their cognate antigens were identified using a functional protein microarray. Overexpression of some of these antigens was observed in cancer serum, tumor proximal fluid, and cancer tissue via dot blot and immunohistochemical staining. Thus, our use of recombinant antibody microarrays for unbiased discovery found targets for ovarian cancer detection in multiple sample sets, supporting their further study for disease diagnosis.
Affiliation(s)
- Arturo B Ramirez
- Molecular Diagnostics Program, Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
|
25
|
Torres TT, Dolezal M, Schlötterer C, Ottenwälder B. Expression profiling of Drosophila mitochondrial genes via deep mRNA sequencing. Nucleic Acids Res 2010; 37:7509-18. [PMID: 19843606 PMCID: PMC2794191 DOI: 10.1093/nar/gkp856] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 12/12/2022] Open
Abstract
Mitochondria play an essential role in several cellular processes. Nevertheless, very little is known about patterns of gene expression of genes encoded by the mitochondrial DNA (mtDNA). In this study, we used next-generation sequencing (NGS) for transcription profiling of genes encoded in the mitochondrial genome of Drosophila melanogaster and D. pseudoobscura. The analysis of males and females in both species indicated that the expression pattern was conserved between the two species, but differed significantly between both sexes. Interestingly, mRNA levels were not only different among genes encoded by separate transcription units, but also showed significant differences among genes located in the same transcription unit. Hence, mRNA abundance of genes encoded by mtDNA seems to be heavily modulated by post-transcriptional regulation. Finally, we also identified several transcripts with a noncanonical structure, suggesting that processing of mitochondrial transcripts may be more complex than previously assumed.
|
26
|
Funari VA, Voevodski K, Leyfer D, Yerkes L, Cramer D, Tolan DR. Quantitative gene expression profiles in real time from expressed sequence tag databases. Gene Expr 2010; 14:321-36. [PMID: 20635574 PMCID: PMC2954622 DOI: 10.3727/105221610x12717040569820] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
An accumulation of expressed sequence tag (EST) data in the public domain and the availability of bioinformatic programs have made EST gene expression profiling a common practice. However, the utility and validity of using EST databases (e.g., dbEST) have been criticized, particularly for quantitative assessment of gene expression. Problems with EST sequencing errors, library construction, EST annotation, and multiple paralogs make generation of specific and sensitive qualitative and quantitative expression profiles a concern. In addition, most EST-derived expression data exist in previously assembled databases. The Virtual Northern Blot (VNB) (http://tlab.bu.edu/vnb.html) allows generation, evaluation, and optimization of expression profiles in real time, which is especially important for alternatively spliced, novel, or poorly characterized genes. Representative gene families with variable nucleotide sequence identity, tissue specificity, and levels of expression (bcl-xl, aldoA, and cyp2d9) are used to assess the quality of VNB's output. The profiles generated by VNB are more sensitive and specific than those constructed with ESTs listed in preindexed databases at UCSC and NCBI. Moreover, quantitative expression profiles produced by VNB are comparable to quantization obtained from Northern blots and qPCR. The VNB pipeline generates real-time gene expression profiles for single-gene queries that are both qualitatively and quantitatively reliable.
Affiliation(s)
- Dimitry Leyfer
- †Bioinformatics Program, Boston University, Boston, MA, USA
- Laura Yerkes
- *Biology Department, Boston University, Boston, MA, USA
- Donald Cramer
- *Biology Department, Boston University, Boston, MA, USA
- Dean R. Tolan
- *Biology Department, Boston University, Boston, MA, USA
- †Bioinformatics Program, Boston University, Boston, MA, USA
|
27
|
Bragg LM, Stone G. k-link EST clustering: evaluating error introduced by chimeric sequences under different degrees of linkage. Bioinformatics 2009; 25:2302-8. [PMID: 19570806 PMCID: PMC2735666 DOI: 10.1093/bioinformatics/btp410] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022] Open
Abstract
Motivation: The clustering of expressed sequence tags (ESTs) is a crucial step in many sequence analysis studies that require a high level of redundancy. Chimeric sequences, while uncommon, can make achieving the optimal EST clustering a challenge. Single-linkage algorithms are particularly vulnerable to the effects of chimeras. To avoid chimera-facilitated erroneous merges, researchers using single-linkage algorithms are forced to use stringent sequence-similarity thresholds. Such thresholds reduce the sensitivity of the clustering algorithm. Results: We introduce the concept of k-link clustering for EST data. We evaluate how clustering error rates vary over a range of linkage thresholds. Using k-link, we show that Type II error decreases in response to increasing the number of shared ESTs (i.e. links) required. We observe a base level of Type II error likely caused by the presence of unmasked low-complexity or repetitive sequence. We find that Type I error increases gradually with increased linkage. To minimize the Type I error introduced by increased linkage requirements, we propose an extension to k-link which modifies the required number of links with respect to the size of clusters being compared. Availability: k-link is licensed under the GNU General Public License, and can be downloaded from http://www.bioinformatics.csiro.au/products.shtml. k-link is written in C++. Contact: lauren.bragg@csiro.au Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lauren M Bragg
- CSIRO Mathematical and Information Sciences, North Ryde, NSW, Australia.
|
28
|
Barak M, Levanon EY, Eisenberg E, Paz N, Rechavi G, Church GM, Mehr R. Evidence for large diversity in the human transcriptome created by Alu RNA editing. Nucleic Acids Res 2009; 37:6905-15. [PMID: 19740767 PMCID: PMC2777429 DOI: 10.1093/nar/gkp729] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 12/03/2022] Open
Abstract
Adenosine-to-inosine (A-to-I) RNA editing alters the original genomic content of the human transcriptome and is essential for maintenance of normal life in mammals. A-to-I editing in Alu repeats is abundant in the human genome, with many thousands of expressed Alu sequences undergoing editing. Little is known so far about the contribution of Alu editing to transcriptome complexity. Transcripts derived from a single edited Alu sequence can be edited in multiple sites, and thus could theoretically generate a large number of different transcripts. Here we explored whether the combinatorial potential nature of edited Alu sequences is actually fulfilled in the human transcriptome. We analyzed datasets of editing sites and performed an analysis of a detailed transcript set of one edited Alu sequence. We found that editing appears at many more sites than detected by earlier genomic screens. To a large extent, editing of different sites within the same transcript is only weakly correlated. Thus, rather than finding a few versions of each transcript, a large number of edited variants arise, resulting in immense transcript diversity that eclipses alternative splicing as mechanism of transcriptome diversity, although with less impact on the proteome.
Affiliation(s)
- Michal Barak
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
|
29
|
Morozova O, Hirst M, Marra MA. Applications of new sequencing technologies for transcriptome analysis. Annu Rev Genomics Hum Genet 2009; 10:135-51. [PMID: 19715439 DOI: 10.1146/annurev-genom-082908-145957] [Citation(s) in RCA: 340] [Impact Index Per Article: 22.7] [Indexed: 11/09/2022]
Abstract
Transcriptome analysis has been a key area of biological inquiry for decades. Over the years, research in the field has progressed from candidate gene-based detection of RNAs using Northern blotting to high-throughput expression profiling driven by the advent of microarrays. Next-generation sequencing technologies have revolutionized transcriptomics by providing opportunities for multidimensional examinations of cellular transcriptomes in which high-throughput expression data are obtained at a single-base resolution.
Affiliation(s)
- Olena Morozova
- BC Cancer Agency, Genome Sciences Center, Vancouver, BC V5Z 4S6, Canada.
|
30
|
Xue Q, Itoh N, Schey KL, Cooper RK, La Peyre JF. Evidence indicating the existence of a novel family of serine protease inhibitors that may be involved in marine invertebrate immunity. FISH & SHELLFISH IMMUNOLOGY 2009; 27:250-259. [PMID: 19464375 DOI: 10.1016/j.fsi.2009.05.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 02/19/2009] [Revised: 04/20/2009] [Accepted: 05/12/2009] [Indexed: 05/27/2023]
Abstract
A new serine protease inhibitor, designated cvSI-2, was purified and characterized from the plasma of the eastern oyster, Crassostrea virginica. CvSI-2 inhibited the serine protease subtilisin A in a slow-tight binding manner, with an overall dissociation constant Ki* of 0.18 nM. It also inhibited perkinsin, the major extracellular protease of the oyster protozoan parasite Perkinsus marinus. Sequencing of cvSI-2 cloned cDNA revealed an open reading frame of 258 bp encoding a polypeptide of 85 amino acids, with the 18 N-terminal amino acids forming a signal peptide. The mature cvSI-2 molecule predicted consisted of 67 amino acids with 12 cysteine residues and a calculated molecular mass of 7202.96 Da. Overall 91% of the cvSI-2 amino acid sequence predicted from cDNA was confirmed by tandem mass spectrometry sequencing of purified cvSI-2. In addition, serine 43 and a threonine substitution at this position were observed. CvSI-2 amino acid sequence showed a 38% identity and 54% similarity with that of cvSI-1, the first protease inhibitor purified and characterized from a bivalve mollusc. Like cvSI-1, cvSI-2 gene was expressed in the basophil cells of digestive tubules. BLAST search found multiple ESTs from the eastern oyster, Pacific oyster, Mediterranean mussel, and sea vase, a tunicate, which could encode proteins with sequences similar to cvSI-1 and cvSI-2. Our findings indicate that cvSI-1 and cvSI-2 are members of a novel family of serine protease inhibitors in bivalve molluscs and perhaps other marine invertebrates, which share the characteristic cysteine array C-X(4-9)-C-X(4-6)-C-X(7)-C-X(4)-C-T-C-X(6-9)-C-X(5)-C-X(3-7)-C-X(6-10)-C-X(4)-C-X-C.
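The cysteine array quoted above is written in a PROSITE-like notation, so it can be turned into a regular expression for scanning translated ESTs or protein sequences. The following Python sketch shows one way to do that; the helper name `array_to_regex`, the choice to let `X` match any uppercase residue letter, and the synthetic test sequence are assumptions for illustration and are not taken from the paper.

```python
import re

# Cysteine array exactly as given in the abstract.
CYS_ARRAY = ("C-X(4-9)-C-X(4-6)-C-X(7)-C-X(4)-C-T-C-X(6-9)-C-X(5)"
             "-C-X(3-7)-C-X(6-10)-C-X(4)-C-X-C")

# One token = a residue letter, optionally followed by (n) or (m-n).
TOKEN = re.compile(r"([A-Z])(?:\((\d+)(?:-(\d+))?\))?")

def array_to_regex(pattern):
    """Convert a PROSITE-like pattern into a Python regex string.

    Assumption: 'X' stands for any residue, written here as [A-Z].
    """
    parts = []
    for letter, low, high in TOKEN.findall(pattern):
        unit = "[A-Z]" if letter == "X" else letter
        if low and high:
            unit += "{%s,%s}" % (low, high)   # ranged repeat, e.g. X(4-9)
        elif low:
            unit += "{%s}" % low              # fixed repeat, e.g. X(7)
        parts.append(unit)
    return "".join(parts)

CYS_REGEX = re.compile(array_to_regex(CYS_ARRAY))

# Synthetic sequence built to satisfy the minimum spacing of every gap.
toy = ("C" + "A" * 4 + "C" + "A" * 4 + "C" + "A" * 7 + "C" + "A" * 4 + "CTC"
       + "A" * 6 + "C" + "A" * 5 + "C" + "A" * 3 + "C" + "A" * 6 + "C"
       + "A" * 4 + "C" + "A" + "C")
print(bool(CYS_REGEX.search(toy)))
```

The array contains twelve cysteines, which agrees with the 12 cysteine residues reported for the mature cvSI-2 molecule. A regex hit only indicates that the spacing is compatible with the array; it is not evidence of inhibitor function on its own.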
Affiliation(s)
- Qinggang Xue
- Department of Veterinary Science, Louisiana State University Agricultural Center, Baton Rouge, LA 70830, USA
|
31
|
Obermeier C, Hosseini B, Friedt W, Snowdon R. Gene expression profiling via LongSAGE in a non-model plant species: a case study in seeds of Brassica napus. BMC Genomics 2009; 10:295. [PMID: 19575793 PMCID: PMC2719671 DOI: 10.1186/1471-2164-10-295] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 02/16/2009] [Accepted: 07/03/2009] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Serial analysis of gene expression (LongSAGE) was applied for gene expression profiling in seeds of oilseed rape (Brassica napus ssp. napus). The usefulness of this technique for detailed expression profiling in a non-model organism was demonstrated for the highly complex, neither fully sequenced nor annotated genome of B. napus by applying a tag-to-gene matching strategy based on Brassica ESTs and the annotated proteome of the closely related model crucifer A. thaliana. RESULTS Transcripts from 3,094 genes were detected at two time-points of seed development, 23 days and 35 days after pollination (DAP). Differential expression showed a shift from gene expression involved in diverse developmental processes including cell proliferation and seed coat formation at 23 DAP to more focussed metabolic processes including storage protein accumulation and lipid deposition at 35 DAP. The most abundant transcripts at 23 DAP were coding for diverse protease inhibitor proteins and proteases, including cysteine proteases involved in seed coat formation and a number of lipid transfer proteins involved in embryo pattern formation. At 35 DAP, transcripts encoding napin, cruciferin and oleosin storage proteins were most abundant. Over both time-points, 18.6% of the detected genes were matched by Brassica ESTs identified by LongSAGE tags in antisense orientation. This suggests a strong involvement of antisense transcript expression in regulatory processes during B. napus seed development. CONCLUSION This study underlines the potential of transcript tagging approaches for gene expression profiling in Brassica crop species via EST matching to annotated A. thaliana genes. Limits of tag detection for low-abundance transcripts can today be overcome by ultra-high throughput sequencing approaches, so that tag-based gene expression profiling may soon become the method of choice for global expression profiling in non-model species.
Affiliation(s)
- Christian Obermeier
- Justus Liebig University Giessen, Department of Plant Breeding, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany.
|
32
|
Expressed sequence tags: normalization and subtraction of cDNA libraries. Methods Mol Biol 2009. [PMID: 19277560 DOI: 10.1007/978-1-60327-136-3_6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6]
Abstract
Expressed Sequence Tags (ESTs) provide a rapid and efficient approach for gene discovery and analysis of gene expression in eukaryotes. ESTs have also become particularly important with recent expanded efforts in complete genome sequencing of understudied, nonmodel eukaryotes such as protists and algae. For these projects, ESTs provide an invaluable source of data for gene identification and prediction of exon-intron boundaries. The generation of EST data, although straightforward in concept, requires nonetheless great care to ensure the highest efficiency and return for the investment in time and funds. To this end, key steps in the process include generation of a normalized cDNA library to facilitate a high gene discovery rate followed by serial subtraction of normalized libraries to maintain the discovery rate. Here we describe in detail, protocols for normalization and subtraction of cDNA libraries followed by an example using the toxic dinoflagellate Alexandrium tamarense.
|
33
|
Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA. Next-generation tag sequencing for cancer gene expression profiling. Genome Res 2009; 19:1825-35. [PMID: 19541910 DOI: 10.1101/gr.094482.109] [Citation(s) in RCA: 277] [Impact Index Per Article: 18.5] [Indexed: 12/13/2022]
Abstract
We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.
|
34
|
Jiang SM, Yin WB, Hu J, Shi R, Zhou RN, Chen YH, Zhou GH, Wang RRC, Song LY, Hu ZM. Isolation of expressed sequences from a specific chromosome of Thinopyrum intermedium infected by BYDV. Genome 2009; 52:68-76. [PMID: 19132073 DOI: 10.1139/g08-108] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
To map important ESTs to specific chromosomes and (or) chromosomal regions is difficult in hexaploid wheat because of its large genome size and serious interference of homoeologous sequences. Large-scale EST sequencing and subsequent chromosome localization are both laborious and time-consuming. The wheat alien addition line TAi-27 contains a pair of chromosomes of Thinopyrum intermedium (Host) Barkworth & D.R. Dewey that carry the resistance gene against barley yellow dwarf virus. In this research, we developed a modified technique based on chromosome microdissection and hybridization-specific amplification to isolate expressed sequences from the alien chromosome of TAi-27 by hybridization between the DNA of the microdissected alien chromosome and cDNA of Th. intermedium infected by barley yellow dwarf virus. Twelve clones were selected, sequenced, and analyzed. Three of them were unknown genes without any hit in the GenBank database and the other nine were highly homologous with ESTs of wheat, barley, and (or) other plants in Gramineae induced by abiotic or biotic stress. The method used in this research to isolate expressed sequences from a specific chromosome has the following advantages: (i) the obtained expressed sequences are larger in size and have 3' end information and (ii) the operation is less complicated. It would be an efficient improved method for genomics and functional genomics research of polyploid plants, especially for EST development and mapping. The obtained expressed sequence data are also informative in understanding the resistance genes on the alien chromosome of TAi-27.
Affiliation(s)
- Shu-Mei Jiang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Datun Road, Beijing 100101, PR China
|
35
|
Shi BJ, Wang GL. Comparative study of genes expressed from rice fungus-resistant and susceptible lines during interactions with Magnaporthe oryzae. Gene 2008; 427:80-5. [DOI: 10.1016/j.gene.2008.09.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/11/2008] [Revised: 09/11/2008] [Accepted: 09/16/2008] [Indexed: 01/17/2023]
|
36
|
Jackson NE, Wang HW, Bryant KJ, McNeil HP, Husain A, Liu K, Tedla N, Thomas PS, King GC, Hettiaratchi A, Cairns J, Hunt JE. Alternate mRNA splicing in multiple human tryptase genes is predicted to regulate tetramer formation. J Biol Chem 2008; 283:34178-87. [PMID: 18854315 DOI: 10.1074/jbc.m807553200] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022] Open
Abstract
Tryptases are serine proteases that are thought to be uniquely and proteolytically active as tetramers. Crystallographic studies reveal that the active tetramer is a flat ring structure composed of four monomers, with their active sites arranged around a narrow central pore. This model explains why many of the preferred substrates of tryptase are short peptides; however, it does not explain how tryptase cleaves large protein substrates such as fibronectin, although a number of studies have reported in vitro mechanisms for generating active monomers that could digest larger substrates. Here we suggest that alternate mRNA splicing of human tryptase genes generates active tryptase monomers (or dimers). We have identified a conserved pattern of alternate splicing in four tryptase alleles (alphaII, betaI, betaIII, and deltaI), representing three distinct tryptase gene loci. When compared with their full-length counterparts, the splice variants use an alternate acceptor site within exon 4. This results in the deletion of 27 nucleotides within the central coding sequence and 9 amino acids from the translated protein product. Although modeling suggests that the deletion can be easily accommodated by the enzymes structurally, it is predicted to alter the specificity by enlarging the S1' or S2' binding pocket and results in the complete loss of the "47 loop," reported to be critical for the formation of tetramers. Although active monomers can be generated in vitro using a range of artificial conditions, we suggest that alternate splicing is the in vivo mechanism used to generate active tryptase that can cleave large protein substrates.
Affiliation(s)
- Nicole E Jackson
- Centre for Infection and Inflammation Research, School of Medical Sciences, Sydney, New South Wales 2052, Australia
|
37
|
Morozova O, Marra MA. Applications of next-generation sequencing technologies in functional genomics. Genomics 2008; 92:255-64. [PMID: 18703132 DOI: 10.1016/j.ygeno.2008.07.001] [Citation(s) in RCA: 665] [Impact Index Per Article: 41.6] [Received: 04/03/2008] [Revised: 07/04/2008] [Accepted: 07/09/2008] [Indexed: 12/17/2022]
Abstract
A new generation of sequencing technologies, from Illumina/Solexa, ABI/SOLiD, 454/Roche, and Helicos, has provided unprecedented opportunities for high-throughput functional genomic research. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing, discovery of transcription factor binding sites, and noncoding RNA expression profiling. This review discusses applications of next-generation sequencing technologies in functional genomics research and highlights the transforming potential these technologies offer.
Affiliation(s)
- Olena Morozova
- BC Cancer Agency Genome Sciences Centre, Vancouver, BC, Canada
|
38
|
Wisniewski M, Bassett C, Norelli J, Macarisin D, Artlip T, Gasic K, Korban S. Expressed sequence tag analysis of the response of apple (Malus x domestica 'Royal Gala') to low temperature and water deficit. Physiol Plant 2008; 133:298-317. [PMID: 18298416 DOI: 10.1111/j.1399-3054.2008.01063.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 05/08/2023]
Abstract
Leaf, bark, xylem and root tissues were used to make nine cDNA libraries from non-stressed (control) 'Royal Gala' apple trees, and from 'Royal Gala' trees exposed to either low temperature (5 °C for 24 h) or water deficit (45% of saturated pot mass for 2 weeks). Over 22 600 clones from the nine libraries were subjected to 5' single-pass sequencing, clustered and annotated using blastx. The number of clusters in the libraries ranged from 170 to 1430. Regarding annotation of the sequences, blastx analysis indicated that within the libraries 65-72% of the clones had a high similarity to genes of known function, 6-15% had no functional assignment and 15-26% were completely novel. The expressed sequence tags were combined into three classes (control, low-temperature and water deficit) and the annotated genes in each class were placed into 1 of 10 different functional categories. The percentage of genes falling into each category was then calculated. This analysis indicated a distinct downregulation of genes involved in general metabolism and photosynthesis, while a significant increase in defense/stress-related genes, protein metabolism and energy was observed. In particular, there was a three-fold increase in the number of stress genes observed in the water deficit libraries, indicating a major shift in gene expression in response to a chronic stress. The number of stress genes in response to low temperature, although elevated, was much lower than in the water deficit libraries, perhaps reflecting the shorter (24 h) exposure to stress. Genes with greater than five clones in any specific library were identified and, based on the number of clones obtained, the fold increase or decrease in expression in the libraries was calculated and verified by semiquantitative polymerase chain reaction. Of particular note, genes that code for the following proteins were overexpressed in the low-temperature libraries: dehydrin and metallothionein-like proteins, ubiquitin proteins, a dormancy-associated protein, a plasma membrane intrinsic protein and an RNA-binding protein. Genes that were upregulated in the water deficit libraries fell mainly into the functional categories of stress (heat shock proteins, dehydrins) and photosynthesis. With few exceptions, the overall differences in downregulated genes were nominal compared with differences in upregulated genes. The results of this apple study are similar to other global studies of plant response to stress but offer a more detailed analysis of specific tissue response (bark vs xylem vs leaf vs root) and a comparison between an acute stress (24-h exposure to low temperature) and a chronic stress (2 weeks of water deficit).
Affiliation(s)
- Michael Wisniewski
- United States Department of Agriculture - Agricultural Research Service (USDA-ARS), Appalachian Fruit Research Station, 2217 Wiltshire Road, Kearneysville, WV 25430, USA.
|
39
|
Helftenbein G, Koslowski M, Dhaene K, Seitz G, Sahin U, Türeci O. In silico strategy for detection of target candidates for antibody therapy of solid tumors. Gene 2008; 414:76-84. [PMID: 18358640 DOI: 10.1016/j.gene.2008.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2007] [Revised: 02/05/2008] [Accepted: 02/13/2008] [Indexed: 10/22/2022]
Abstract
In contrast to earlier attempts to identify target candidates suitable for monoclonal antibody (mAb)-based cancer therapies, we concentrated on highly selective lineage-specific genes that are additionally preserved or even overexpressed in orthotopic cancers. In a script-aided workflow, we reduced all human entries of the RefSeq mRNA database to those encoding transmembrane-domain-bearing gene products and subjected them to BLAST analysis against the human EST database. All BLAST results were validated in a gene-centric way, allowing two types of data curation prior to expression profiling of matching ESTs in selected healthy tissues: (i) exclusion of questionable ESTs arising, e.g., from genomic contamination and (ii) elimination of erroneously predicted mRNAs as well as transcripts with only weak EST coverage. The impact of such stringent input control on accuracy of prediction is underlined by RT-PCR confirmation of predicted tissue distribution patterns for a number of selected candidates.
|
40
|
Stein LD. Navigating public physical mapping databases. Curr Protoc Hum Genet 2008; Chapter 5:Unit 5.16. [PMID: 18428290 DOI: 10.1002/0471142905.hg0516s13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
This unit provides concise overviews of the many physical mapping resources available and relates them to the genetic and transcript maps. Useful information on resolution of the maps, how to access them, and how to interpret them is compiled and presented in a clear fashion. Especially useful is a set of detailed protocols describing how to construct an STS marker and how to map it by means of available yeast artificial chromosomes (YACs). An additional protocol describes accessing EST marker maps.
Affiliation(s)
- L D Stein
- Whitehead Institute/MIT Center for Genome Research, Cambridge, Massachusetts, USA
|
41
|
Zhou RN, Shi R, Jiang SM, Yin WB, Wang HH, Chen YH, Hu J, Wang RRC, Zhang XQ, Hu ZM. Rapid EST isolation from chromosome 1R of rye. BMC Plant Biol 2008; 8:28. [PMID: 18366673 PMCID: PMC2322994 DOI: 10.1186/1471-2229-8-28] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/01/2007] [Accepted: 03/18/2008] [Indexed: 05/26/2023]
Abstract
BACKGROUND To obtain important expressed sequence tags (ESTs) located on specific chromosomes is currently difficult. Construction of single-chromosome EST library could be an efficient strategy to isolate important ESTs located on specific chromosomes. In this research we developed a method to rapidly isolate ESTs from chromosome 1R of rye by combining the techniques of chromosome microdissection with hybrid specific amplification (HSA). RESULTS Chromosome 1R was isolated by a glass needle and digested with proteinase K (PK). The DNA of chromosome 1R was amplified by two rounds of PCR using a degenerated oligonucleotide 6-MW sequence with a Sau3AI digestion site as the primer. The PCR product was digested with Sau3AI and linked with adaptor HSA1, then hybridized with the Sau3AI digested cDNA with adaptor HSA2 of rye leaves with and without salicylic acid (SA) treatment, respectively. The hybridized DNA fragments were recovered by the HSA method and cloned into pMD18-T vector. The cloned inserts were released by PCR using the partial sequences in HSA1 and HSA2 as the primers and then sequenced. Of the 94 ESTs obtained and analyzed, 6 were known sequences located on rye chromosome 1R or on homologous group 1 chromosomes of wheat; all of them were highly homologous with ESTs of wheat, barley and/or other plants in Gramineae, some of which were induced by abiotic or biotic stresses. Isolated in this research were 22 ESTs with unknown functions, probably representing some new genes on rye chromosome 1R. CONCLUSION We developed a new method to rapidly clone chromosome-specific ESTs from chromosome 1R of rye. The information reported here should be useful for cloning and investigating the new genes found on chromosome 1R.
Affiliation(s)
- Ruo-Nan Zhou
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Graduate University of Chinese Academy of Sciences, Beijing 100049, P. R. China
- Rui Shi
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Forest Biotechnology Group, N.C. State University, Campus Box 7247, Raleigh, NC 27695-7247, USA
- Shu-Mei Jiang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, P. R. China
- Wei-Bo Yin
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Huang-Huang Wang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Yu-Hong Chen
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Jun Hu
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Richard RC Wang
- USDA-ARS, FRRL, Utah State University, Logan, UT 84322-6300, USA
- Xiang-Qi Zhang
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
- Zan-Min Hu
- Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, P. R. China
|
42
|
[Strategies for cloning of novel genes based on EST]. Yi Chuan = Hereditas 2008; 30:257-62. [PMID: 18331990 DOI: 10.3724/sp.j.1005.2008.00257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Expressed sequence tags (ESTs) are short, randomly selected, single-pass nucleotide sequence reads derived from cDNA libraries, each representing a small part of a gene. With the development of bioinformatics and genetic localization, ESTs have become a powerful tool for mapping, cloning, and expression profiling of genes. Recently, the rapid expansion of EST databases and the application of ESTs in gene mapping and cloning have led to a revolutionary change in strategies for cloning novel genes. Despite some shortcomings, ESTs have been shown to promote the discovery and study of novel genes. This article introduces ESTs and describes in detail EST-based strategies for cloning novel genes.
|
43
|
Abstract
The principal route to understanding the biological significance of the genome sequence comes from discovery and characterization of that portion of the genome that is transcribed into RNA products. We now know that this 'transcriptome' is unexpectedly complex and its precise definition in any one species requires multiple technical approaches and an ability to work on a very large scale. A key step is the development of technologies able to capture snapshots of the complexity of the various kinds of RNA generated by the genome. As the human, mouse and other model genome sequencing projects approach completion, considerable effort has been focused on identifying and annotating the protein-coding genes as the principal output of the genome. In pursuing this aim, several key technologies have been developed to generate large numbers and highly diverse sets of full-length cDNAs and their variants. However, the search has identified another hidden transcriptional universe comprising a wide variety of non-protein coding RNA transcripts. Despite initial scepticism, various experiments and complementary technologies have demonstrated that these RNAs are dynamically transcribed and a subset of them can act as sense-antisense RNAs, which influence the transcriptional output of the genome. Recent experimental evidence suggests that the list of non-protein coding RNAs is still largely incomplete and that transcription is substantially more complex even than currently thought.
Affiliation(s)
- Piero Carninci
- Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, Japan.
|
44
|
Siepel A, Diekhans M, Brejová B, Langton L, Stevens M, Comstock CLG, Davis C, Ewing B, Oommen S, Lau C, Yu HC, Li J, Roe BA, Green P, Gerhard DS, Temple G, Haussler D, Brent MR. Targeted discovery of novel human exons by comparative genomics. Genome Res 2007; 17:1763-73. [PMID: 17989246 PMCID: PMC2099585 DOI: 10.1101/gr.7128207] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 09/10/2007] [Accepted: 10/15/2007] [Indexed: 01/20/2023]
Abstract
A complete and accurate set of human protein-coding gene annotations is perhaps the single most important resource for genomic research after the human-genome sequence itself, yet the major gene catalogs remain incomplete and imperfect. Here we describe a genome-wide effort, carried out as part of the Mammalian Gene Collection (MGC) project, to identify human genes not yet in the gene catalogs. Our approach was to produce gene predictions by algorithms that rely on comparative sequence data but do not require direct cDNA evidence, then to test predicted novel genes by RT-PCR. We have identified 734 novel gene fragments (NGFs) containing 2188 exons with, at most, weak prior cDNA support. These NGFs correspond to an estimated 563 distinct genes, of which >160 are completely absent from the major gene catalogs, while hundreds of others represent significant extensions of known genes. The NGFs appear to be predominantly protein-coding genes rather than noncoding RNAs, unlike novel transcribed sequences identified by technologies such as tiling arrays and CAGE. They tend to be expressed at low levels and in a tissue-specific manner, and they are enriched for roles in motor activity, cell adhesion, connective tissue, and central nervous system development. Our results demonstrate that many important genes and gene fragments have been missed by traditional approaches to gene discovery but can be identified by their evolutionary signatures using comparative sequence data. However, they suggest that hundreds, not thousands, of protein-coding genes are completely missing from the current gene catalogs.
Affiliation(s)
- Adam Siepel
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA.
|
45
|
Najami N, Janda T, Barriah W, Kayam G, Tal M, Guy M, Volokita M. Ascorbate peroxidase gene family in tomato: its identification and characterization. Mol Genet Genomics 2007; 279:171-82. [PMID: 18026995 DOI: 10.1007/s00438-007-0305-2] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Received: 06/20/2007] [Accepted: 10/24/2007] [Indexed: 11/24/2022]
Abstract
The antioxidative response, in which ascorbate peroxidase (APX) is a key enzyme, is an integral part of the plant tolerance response to environmental stresses. As a first step towards the study of the physiological role and the regulation of the members of the Apx gene family, the orthologs of the stress-sensitive cultivated tomato Solanum lycopersicum cv. M82 (Slm) and of the wild salt-tolerant species S. pennellii acc. Atico (Spa) were identified by utilizing the tomato EST database, and characterized. A redundant list of 16 virtual Apx transcripts and four singleton ESTs was shown to correspond to seven genuine Apx genes. The complete tomato Apx gene family comprises genes encoding three cytosolic, two peroxisomal, and two chloroplastic APXs. These genes attained differential regulatory patterns in various Slm organs. A more detailed study of the Apx1 and Apx2 genes, which are the products of a recent gene duplication event, shows that they have already attained differential regulation within and between Slm and Spa under control and stress conditions. It is also suggested that, due to lineage-specific gene duplication and loss events, intricate phylogenetic relationships exist among the members of the Apx gene families.
Affiliation(s)
- Naim Najami
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel
|
46
|
Gorodkin J, Cirera S, Hedegaard J, Gilchrist MJ, Panitz F, Jørgensen C, Scheibye-Knudsen K, Arvin T, Lumholdt S, Sawera M, Green T, Nielsen BJ, Havgaard JH, Rosenkilde C, Wang J, Li H, Li R, Liu B, Hu S, Dong W, Li W, Yu J, Wang J, Stærfeldt HH, Wernersson R, Madsen LB, Thomsen B, Hornshøj H, Bujie Z, Wang X, Wang X, Bolund L, Brunak S, Yang H, Bendixen C, Fredholm M. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags. Genome Biol 2007; 8:R45. [PMID: 17407547 PMCID: PMC1895994 DOI: 10.1186/gb-2007-8-4-r45] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Received: 09/08/2006] [Revised: 01/18/2007] [Accepted: 04/02/2007] [Indexed: 12/05/2022] Open
Abstract
A resource consisting of one million porcine ESTs is described, providing an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies.
Affiliation(s)
- Jan Gorodkin
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Susanna Cirera
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jakob Hedegaard
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Michael J Gilchrist
- The Wellcome Trust/Cancer Research UK Gurdon Institute, Cambridge, CB2 1QN, UK
| | - Frank Panitz
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Claus Jørgensen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Karsten Scheibye-Knudsen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Troels Arvin
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Steen Lumholdt
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Milena Sawera
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Trine Green
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Bente J Nielsen
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jakob H Havgaard
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Carina Rosenkilde
- Division of Genetics and Bioinformatics, IBHV, Grønnegärdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| | - Jun Wang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campus Vej 55, DK-5230 Odense M, Denmark
| | - Heng Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
| | - Ruiqiang Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campus Vej 55, DK-5230 Odense M, Denmark
| | - Bin Liu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Songnian Hu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Wei Dong
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Wei Li
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Jun Yu
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Jian Wang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Hans-Henrik Stærfeldt
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Rasmus Wernersson
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Lone B Madsen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Bo Thomsen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Henrik Hornshøj
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Zhan Bujie
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Xuegang Wang
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Xuefei Wang
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Lars Bolund
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
- Institute of Human Genetics, University of Aarhus, Nordre Ringgade 1, DK-8000 Aarhus C, Denmark
| | - Søren Brunak
- Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, DK-2800 Lyngby, Denmark
| | - Huanming Yang
- Beijing Genomics Institute, The Airport Industrial Road, Beijing 101300, PR China
| | - Christian Bendixen
- Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, Blichers Alle, DK-8830 Tjele, Denmark
| | - Merete Fredholm
- Division of Genetics and Bioinformatics, IBHV, Grønnegårdsvej 3, The Royal Veterinary and Agricultural University, DK-1870 Frederiksberg C, Denmark
| |
|
47
|
Affiliation(s)
- David B Searls
- GlaxoSmithKline Pharmaceuticals, King of Prussia, Pennsylvania, United States of America.
| |
|
48
|
Murray D, Doran P, MacMathuna P, Moss AC. In silico gene expression analysis--an overview. Mol Cancer 2007; 6:50. [PMID: 17683638 PMCID: PMC1964762 DOI: 10.1186/1476-4598-6-50] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Received: 06/27/2007] [Accepted: 08/07/2007] [Indexed: 12/18/2022] Open
Abstract
Efforts aimed at deciphering the molecular basis of complex disease are underpinned by the availability of high throughput strategies for the identification of biomolecules that drive the disease process. The completion of the human genome-sequencing project, coupled to major technological developments, has afforded investigators myriad opportunities for multidimensional analysis of biological systems. Nowhere has this research explosion been more evident than in the field of transcriptomics. Affordable access and availability to the technology that supports such investigations has led to a significant increase in the amount of data generated. As most biological distinctions are now observed at a genomic level, a large amount of expression information is now openly available via public databases. Furthermore, numerous computational based methods have been developed to harness the power of these data. In this review we provide a brief overview of in silico methodologies for the analysis of differential gene expression such as Serial Analysis of Gene Expression and Digital Differential Display. The performance of these strategies, at both an operational and result/output level is assessed and compared. The key considerations that must be made when completing an in silico expression analysis are also presented as a roadmap to facilitate biologists. Furthermore, to highlight the importance of these in silico methodologies in contemporary biomedical research, examples of current studies using these approaches are discussed. The overriding goal of this review is to present the scientific community with a critical overview of these strategies, so that they can be effectively added to the tool box of biomedical researchers focused on identifying the molecular mechanisms of disease.
Affiliation(s)
- David Murray
- General Clinical Research Unit, UCD School of Medicine and Medical Sciences, Mater Misericordiae University Hospital, Dublin 7, Ireland
| | - Peter Doran
- General Clinical Research Unit, UCD School of Medicine and Medical Sciences, Mater Misericordiae University Hospital, Dublin 7, Ireland
| | - Padraic MacMathuna
- Gastrointestinal Unit, Mater Misericordiae University Hospital, Dublin 7, Ireland
| | - Alan C Moss
- Division of Gastroenterology, Beth Israel Deaconess Medical Center, 330 Brookline Ave, Boston, MA 02215, USA
| |
|
49
|
Wu X, Zhang Q, Tan K, Xie R, Fan J, Shu H, Wang S. Characterization of a new gene WX2 in Toxoplasma gondii. Acta Biochim Biophys Sin (Shanghai) 2007; 39:475-83. [PMID: 17627323 DOI: 10.1111/j.1745-7270.2007.00302.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022] Open
Abstract
Using hybridization techniques, we prepared the monoclonal antibody (Mab) 7C3-C3 against Toxoplasma gondii. The protection tests showed that the protein (Mab7C3-C3) inhibited the invasion and proliferation of T. gondii RH strain in HeLa cells. The passive transfer test indicated that the antibody significantly prolonged the survival time of the challenged mice. It was also shown that the antibody could be used for the detection of the circulating antigen of T. gondii. After immunoscreening the T. gondii tachyzoite cDNA library with Mab7C3-C3, a new gene wx2 of T. gondii was obtained. Immunofluorescence analysis showed that the WX2 protein was located on the membrane of the parasite. Nucleotide sequence comparison showed 28% identity to the calcium channel alpha-1E unit and shared with the surface antigen related sequence in some conservative residues. However, no match was found in protein databases. Therefore, it was an unknown gene in T. gondii encoding a functional protein on the membrane of T. gondii. Because it has been shown to have a partial protective effect against T. gondii infection and is released as a circulating antigen, it could be a candidate molecule for vaccine or a novel target for new drugs.
Affiliation(s)
- Xiang Wu
- Department of Parasitology, Xiangya Medical School, Central South University, Changsha 410078, China
| |
|
50
|
Liang C, Wang G, Liu L, Ji G, Fang L, Liu Y, Carter K, Webb JS, Dean JFD. ConiferEST: an integrated bioinformatics system for data reprocessing and mining of conifer expressed sequence tags (ESTs). BMC Genomics 2007; 8:134. [PMID: 17535431 PMCID: PMC1894976 DOI: 10.1186/1471-2164-8-134] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 11/02/2006] [Accepted: 05/29/2007] [Indexed: 11/30/2022] Open
Abstract
Background: With the advent of low-cost, high-throughput sequencing, the amount of public domain Expressed Sequence Tag (EST) sequence data available for both model and non-model organism is growing exponentially. While these data are widely used for characterizing various genomes, they also present a serious challenge for data quality control and validation due to their inherent deficiencies, particularly for species without genome sequences.
Description: ConiferEST is an integrated system for data reprocessing, visualization and mining of conifer ESTs. In its current release, Build 1.0, it houses 172,229 loblolly pine EST sequence reads, which were obtained from reprocessing raw DNA sequencer traces using our software – WebTraceMiner. The trace files were downloaded from NCBI Trace Archive. ConiferEST provides biologists unique, easy-to-use data visualization and mining tools for a variety of putative sequence features including cloning vector segments, adapter sequences, restriction endonuclease recognition sites, polyA and polyT runs, and their corresponding Phred quality values. Based on these putative features, verified sequence features such as 3' and/or 5' termini of cDNA inserts in either sense or non-sense strand have been identified in-silico. Interestingly, only 30.03% of the designated 3' ESTs were found to have an authenticated 5' terminus in the non-sense strand (i.e., polyT tails), while fewer than 5.34% of the designated 5' ESTs had a verified 5' terminus in the sense strand. Such previously ignored features provide valuable insight for data quality control and validation of error-prone ESTs, as well as the ability to identify novel functional motifs embedded in large EST datasets. We found that "double-termini adapters" were effective indicators of potential EST chimeras. For all sequences with in-silico verified termini/terminus, we used InterProScan to assign protein domain signatures, results of which are available for in-depth exploration using our biologist-friendly web interfaces.
Conclusion: ConiferEST represents a unique and complementary public resource for EST data integration and mining in conifers by reprocessing raw DNA traces, identifying putative sequence features and determining and annotating in-silico verified features. Seamlessly integrated with other public resources, ConiferEST provides biologists powerful tools to verify data, visualize abnormalities, including EST chimeras, and explore large EST datasets.
Affiliation(s)
- Chun Liang
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Gang Wang
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Lin Liu
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian, 361005, China
| | - Lin Fang
- Beijing Genomics Institute, Beijing 101300, China
| | - Yuansheng Liu
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Kikia Carter
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Jason S Webb
- Department of Botany, Miami University, Oxford, Ohio 45056, USA
| | - Jeffrey FD Dean
- Warnell School of Forestry and Natural Resources, University of Georgia, Athens, Georgia 30602, USA
| |
|