Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. Gene index analysis of the human genome estimates approximately 120,000 genes. Nat Genet 2000;25:239-40. [PMID: 10835646 DOI: 10.1038/76126] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.1] [Indexed: 11/08/2022]

Cited by Other Article(s)

Rodriguez JM, Abascal F, Cerdán-Vélez D, Gómez LM, Vázquez J, Tress ML. Evidence for widespread translation of 5' untranslated regions. Nucleic Acids Res 2024:gkae571. [PMID: 38953162 DOI: 10.1093/nar/gkae571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 06/07/2024] [Accepted: 06/19/2024] [Indexed: 07/03/2024] Open

Carrion SA, Michal JJ, Jiang Z. Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases. Genes (Basel) 2023;14:2051. [PMID: 38002994 PMCID: PMC10671453 DOI: 10.3390/genes14112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] Open

Barbagallo C, Stella M, Ferrara C, Caponnetto A, Battaglia R, Barbagallo D, Di Pietro C, Ragusa M. RNA-RNA competitive interactions: a molecular civil war ruling cell physiology and diseases. Exploration of Medicine 2023:504-540. [DOI: 10.37349/emed.2023.00159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 06/02/2023] [Indexed: 09/02/2023] Open

Guigó R. Genome annotation: From human genetics to biodiversity genomics. Cell Genomics 2023;3:100375. [PMID: 37601977 PMCID: PMC10435374 DOI: 10.1016/j.xgen.2023.100375] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/22/2023]

Longo G. From information to physics to biology. Progress in Biophysics and Molecular Biology 2023;177:202-206. [PMID: 36572284 DOI: 10.1016/j.pbiomolbio.2022.12.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/10/2021] [Revised: 12/15/2022] [Accepted: 12/16/2022] [Indexed: 12/24/2022]

Zhou Z, Cao Q, Diao Y, Wang Y, Long L, Wang S, Li P. Non-coding RNA-related antitumor mechanisms of marine-derived agents. Front Pharmacol 2022;13:1053556. [PMID: 36532760 PMCID: PMC9752855 DOI: 10.3389/fphar.2022.1053556] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/25/2022] [Accepted: 11/21/2022] [Indexed: 09/26/2023] Open

Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int J Mol Sci 2022;23:ijms232012272. [PMID: 36293133 PMCID: PMC9603455 DOI: 10.3390/ijms232012272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/03/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022] Open

The Road Traveled and Journey Ahead for the Genetics and Genomics of Tinnitus. Mol Diagn Ther 2022;26:129-136. [PMID: 35167110 PMCID: PMC8942952 DOI: 10.1007/s40291-022-00578-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/16/2022] [Indexed: 10/29/2022]

Malard F, Mackereth CD, Campagne S. Principles and correction of 5'-splice site selection. RNA Biol 2022;19:943-960. [PMID: 35866748 PMCID: PMC9311317 DOI: 10.1080/15476286.2022.2100971] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/04/2022] Open

Prensner JR, Enache OM, Luria V, Krug K, Clauser KR, Dempster JM, Karger A, Wang L, Stumbraite K, Wang VM, Botta G, Lyons NJ, Goodale A, Kalani Z, Fritchman B, Brown A, Alan D, Green T, Yang X, Jaffe JD, Roth JA, Piccioni F, Kirschner MW, Ji Z, Root DE, Golub TR. Noncanonical open reading frames encode functional proteins essential for cancer cell survival. Nat Biotechnol 2021;39:697-704. [PMID: 33510483 PMCID: PMC8195866 DOI: 10.1038/s41587-020-00806-2] [Citation(s) in RCA: 76] [Impact Index Per Article: 25.3] [Received: 02/18/2020] [Accepted: 12/16/2020] [Indexed: 01/30/2023]

Affiliation(s)

John R. Prensner Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115
Oana M. Enache Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Victor Luria Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
Karsten Krug Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Karl R. Clauser Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Joshua M. Dempster Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Amir Karger IT-Research Computing, Harvard Medical School, Boston, MA, 02115, USA
Li Wang Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Karolina Stumbraite Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Vickie M. Wang Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Ginevra Botta Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Nicholas J. Lyons Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Amy Goodale Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Zohra Kalani Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Briana Fritchman Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Adam Brown Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Douglas Alan Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Thomas Green Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Xiaoping Yang Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Jacob D. Jaffe Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Inzen Therapeutics, Cambridge, MA, 02139, USA
Jennifer A. Roth Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Federica Piccioni Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Present address: Merck Research Laboratories, Boston, MA, 02115, USA
Marc W. Kirschner Department of Systems Biology, Harvard Medical School, Boston, MA, 02115, USA
Zhe Ji Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611; Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, IL 60628
David E. Root Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
Todd R. Golub Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215; Division of Pediatric Hematology/Oncology, Boston Children’s Hospital, Boston, MA, 02115. Corresponding author: Todd R. Golub, MD, Chief Scientific Officer, Broad Institute of Harvard and MIT, Room 4013, 415 Main Street, Cambridge, MA, 02142, Phone: 617-714-7050

Brain Cytoplasmic RNAs in Neurons: From Biosynthesis to Function. Biomolecules 2020;10:biom10020313. [PMID: 32079202 PMCID: PMC7072442 DOI: 10.3390/biom10020313] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/16/2019] [Revised: 02/13/2020] [Accepted: 02/13/2020] [Indexed: 01/10/2023] Open

Hatje K, Mühlhausen S, Simm D, Kollmar M. The Protein-Coding Human Genome: Annotating High-Hanging Fruits. Bioessays 2019;41:e1900066. [PMID: 31544971 DOI: 10.1002/bies.201900066] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/15/2019] [Revised: 08/07/2019] [Indexed: 12/19/2022]

Qadir MI, Bukhat S, Rasul S, Manzoor H, Manzoor M. RNA therapeutics: Identification of novel targets leading to drug discovery. J Cell Biochem 2019;121:898-929. [DOI: 10.1002/jcb.29364] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/14/2019] [Accepted: 08/20/2019] [Indexed: 12/23/2022]

Abascal F, Juan D, Jungreis I, Kellis M, Martinez L, Rigau M, Rodriguez JM, Vazquez J, Tress ML. Loose ends: almost one in five human genes still have unresolved coding status. Nucleic Acids Res 2019;46:7070-7084. [PMID: 29982784 PMCID: PMC6101605 DOI: 10.1093/nar/gky587] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 05/15/2018] [Accepted: 06/18/2018] [Indexed: 12/16/2022] Open

Mai H, Zhou B, Liu L, Yang F, Conran C, Ji Y, Hou J, Jiang D. Molecular pattern of lncRNAs in hepatocellular carcinoma. Journal of Experimental & Clinical Cancer Research 2019;38:198. [PMID: 31097003 PMCID: PMC6524221 DOI: 10.1186/s13046-019-1213-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 03/21/2019] [Accepted: 05/07/2019] [Indexed: 02/07/2023]

Affiliation(s)

Haoming Mai State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institute of Liver Diseases Research of Guangdong Province, Guangzhou, China; Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
Bin Zhou State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institute of Liver Diseases Research of Guangdong Province, Guangzhou, China; Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
Li Liu State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institute of Liver Diseases Research of Guangdong Province, Guangzhou, China; Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
Fu Yang Department of Medical Genetics, Second Military Medical University, Shanghai, 200433, China
Carly Conran University of Illinois College of Medicine, Chicago, IL, 60612, USA
Yuan Ji Department of Public Health Sciences, University of Chicago, Chicago, IL, 60637, USA
Jinlin Hou State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institute of Liver Diseases Research of Guangdong Province, Guangzhou, China; Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China
Deke Jiang State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institute of Liver Diseases Research of Guangdong Province, Guangzhou, China; Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China

Pertea M, Shumate A, Pertea G, Varabyou A, Breitwieser FP, Chang YC, Madugundu AK, Pandey A, Salzberg SL. CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise. Genome Biol 2018;19:208. [PMID: 30486838 PMCID: PMC6260756 DOI: 10.1186/s13059-018-1590-2] [Citation(s) in RCA: 162] [Impact Index Per Article: 27.0] [Received: 06/02/2018] [Accepted: 11/16/2018] [Indexed: 01/06/2023] Open

Affiliation(s)

Mihaela Pertea Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Alaina Shumate Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Geo Pertea Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Ales Varabyou Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Florian P Breitwieser Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Yu-Chi Chang Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Anil K Madugundu McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Institute of Bioinformatics, International Technology Park, Bangalore, India; Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, India; Present address: Center for Individualized Medicine and Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
Akhilesh Pandey McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Departments of Biological Chemistry, Pathology, Neurology, and Oncology, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Present address: Center for Individualized Medicine and Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
Steven L Salzberg Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA; Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA

Michelini F, Jalihal AP, Francia S, Meers C, Neeb ZT, Rossiello F, Gioia U, Aguado J, Jones-Weinert C, Luke B, Biamonti G, Nowacki M, Storici F, Carninci P, Walter NG, d'Adda di Fagagna F. From "Cellular" RNA to "Smart" RNA: Multiple Roles of RNA in Genome Stability and Beyond. Chem Rev 2018;118:4365-4403. [PMID: 29600857 DOI: 10.1021/acs.chemrev.7b00487] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 12/18/2022]

Affiliation(s)

Flavia Michelini IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
Ameya P Jalihal Single Molecule Analysis Group and Center for RNA Biomedicine, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
Sofia Francia IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy; Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy
Chance Meers School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
Zachary T Neeb Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
Francesca Rossiello IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
Ubaldo Gioia IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
Julio Aguado IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
Corey Jones-Weinert IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy
Brian Luke Institute of Developmental Biology and Neurobiology, Johannes Gutenberg University, 55099 Mainz, Germany; Institute of Molecular Biology (IMB), 55128 Mainz, Germany
Giuseppe Biamonti Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy
Mariusz Nowacki Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland
Francesca Storici School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
Piero Carninci RIKEN Center for Life Science Technologies, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan
Nils G Walter Single Molecule Analysis Group and Center for RNA Biomedicine, Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109-1055, United States
Fabrizio d'Adda di Fagagna IFOM - The FIRC Institute of Molecular Oncology, Milan, 20139, Italy; Istituto di Genetica Molecolare, CNR - Consiglio Nazionale delle Ricerche, Pavia, 27100, Italy

Liu Z, Liang Y, Wang H, Lu Z, Chen J, Huang Q, Sheng L, Ma Y, Du H, Gong Q. LncRNA expression in the spinal cord modulated by minocycline in a mouse model of spared nerve injury. J Pain Res 2017;10:2503-2514. [PMID: 29123421 PMCID: PMC5661508 DOI: 10.2147/jpr.s147055] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 12/22/2022] Open

Moustafa K. Aberration of the Citation. Account Res 2017;23:230-44. [PMID: 26636372 DOI: 10.1080/08989621.2015.1127763] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 10/22/2022]

Abstract

Multiple inherent biases related to different citation practices (e.g., self-citations, negative citations, wrong citations, multi-authorship-biased citations, honorary citations, circumstantial citations, discriminatory citations, selective and arbitrary citations, etc.) make citation-based bibliometrics strongly flawed and defective measures. A paper can be highly cited for a while (e.g., under circumstantial or transitional knowledge), but years later it may appear that its findings, paradigms, or theories were untrue or are no longer valid. By contrast, a paper may remain shelved or overlooked for years or decades, but new studies or discoveries may revive its subject at any moment. As citation-based metrics are transformed into "commercial activities," the "citation credit" should be considered on a commercial basis too, in the sense that "citation credit" should be shared out as a "citation dividend" among shareholders (coauthors), equally or in proportion to their contributions, rather than fully appropriated by each of them. At equal numbers of citations, the greater the number of authors, the lower the "citation credit" should be, and vice versa. Overlooking the presence of distorted and subjective citation practices makes many people and administrators so "obsessed" with the number of citations that they run after "highly cited" authors and create specialized citation databases for commercial purposes. Citation-based bibliometrics, however, are unreliable and unscientific measures; citation counts do not mean that a more cited work is of a higher quality or accuracy than a less cited work, because citations do not measure quality or accuracy. Citations do not mean that a highly cited author or journal is more commendable than a less cited author or journal. Citations are nothing more than countable numbers: no more, no less.

Lu Z, Liu N, Wang F. Epigenetic Regulations in Diabetic Nephropathy. J Diabetes Res 2017;2017:7805058. [PMID: 28401169 PMCID: PMC5376412 DOI: 10.1155/2017/7805058] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 12/01/2016] [Revised: 02/06/2017] [Accepted: 02/09/2017] [Indexed: 01/10/2023] Open

Navarro E, Funtikova AN, Fíto M, Schröder H. Prenatal nutrition and the risk of adult obesity: Long-term effects of nutrition on epigenetic mechanisms regulating gene expression. J Nutr Biochem 2017;39:1-14. [DOI: 10.1016/j.jnutbio.2016.03.012] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 10/05/2015] [Revised: 03/23/2016] [Accepted: 03/27/2016] [Indexed: 12/19/2022]

Evans JR, Feng FY, Chinnaiyan AM. The bright side of dark matter: lncRNAs in cancer. J Clin Invest 2016;126:2775-82. [PMID: 27479746 DOI: 10.1172/jci84421] [Citation(s) in RCA: 338] [Impact Index Per Article: 42.3] [Indexed: 12/13/2022] Open

Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]

Yoshimoto R, Mayeda A, Yoshida M, Nakagawa S. MALAT1 long non-coding RNA in cancer. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2015;1859:192-9. [PMID: 26434412 DOI: 10.1016/j.bbagrm.2015.09.012] [Citation(s) in RCA: 163] [Impact Index Per Article: 18.1] [Received: 06/30/2015] [Revised: 09/24/2015] [Accepted: 09/28/2015] [Indexed: 02/09/2023]

Raabe CA, Brosius J. Does every transcript originate from a gene? Ann N Y Acad Sci 2015;1341:136-48. [PMID: 25847549 DOI: 10.1111/nyas.12741] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/15/2014] [Revised: 02/05/2015] [Accepted: 02/11/2015] [Indexed: 12/20/2022]

Milligan MJ, Lipovich L. Pseudogene-derived lncRNAs: emerging regulators of gene expression. Front Genet 2015;5:476. [PMID: 25699073 PMCID: PMC4316772 DOI: 10.3389/fgene.2014.00476] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 09/11/2014] [Accepted: 12/25/2014] [Indexed: 01/11/2023] Open

Richard JLC, Ogawa Y. Understanding the Complex Circuitry of lncRNAs at the X-inactivation Center and Its Implications in Disease Conditions. Curr Top Microbiol Immunol 2015;394:1-27. [PMID: 25982976 DOI: 10.1007/82_2015_443] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/21/2023]

Oteng-Pabi SK, Pardin C, Stoica M, Keillor JW. Site-specific protein labelling and immobilization mediated by microbial transglutaminase. Chem Commun (Camb) 2014;50:6604-6. [DOI: 10.1039/c4cc00994k] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/30/2022]

Chen G, Wang C, Shi L, Qu X, Chen J, Yang J, Shi C, Chen L, Zhou P, Ning B, Tong W, Shi T. Incorporating the human gene annotations in different databases significantly improved transcriptomic and genetic analyses. RNA (New York, N.Y.) 2013;19:479-89. [PMID: 23431329 PMCID: PMC3677258 DOI: 10.1261/rna.037473.112] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 11/23/2012] [Accepted: 01/14/2013] [Indexed: 05/18/2023]

Affiliation(s)

Geng Chen Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Charles Wang Functional Genomics Core, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, California 91010, USA
Leming Shi National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, USA
Xiongfei Qu Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Jiwei Chen Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Jianmin Yang Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Caiping Shi Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Long Chen Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Peiying Zhou Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
Baitang Ning National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, USA
Weida Tong National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas 72079, USA
Tieliu Shi Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China (corresponding author)

Bajetha G, Bhati J, Sarika, Iquebal MA, Rai A, Arora V, Kumar D. Analysis and functional annotation of expressed sequence tags of water buffalo. Anim Biotechnol 2013;24:25-30. [PMID: 23394367 DOI: 10.1080/10495398.2012.737884] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]

Prensner JR, Chinnaiyan AM. The emergence of lncRNAs in cancer biology. Cancer Discov 2012;1:391-407. [PMID: 22096659 DOI: 10.1158/2159-8290.cd-11-0209] [Citation(s) in RCA: 1437] [Impact Index Per Article: 119.8] [Indexed: 12/17/2022]

Multiple isoforms of the translation initiation factor eIF4GII are generated via use of alternative promoters, splice sites and a non-canonical initiation codon. Biochem J 2012;448:1-11. [DOI: 10.1042/bj20111765] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/05/2023]

Zhou S, Ji G, Liu X, Li P, Moler J, Karro JE, Liang C. Pattern analysis approach reveals restriction enzyme cutting abnormalities and other cDNA library construction artifacts using raw EST data. BMC Biotechnol 2012;12:16. [PMID: 22554190 PMCID: PMC3424822 DOI: 10.1186/1472-6750-12-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/18/2011] [Accepted: 03/15/2012] [Indexed: 11/12/2022] Open

Abstract

Background

Expressed Sequence Tag (EST) sequences are widely used in applications such as genome annotation, gene discovery and gene expression studies. However, some GenBank dbEST sequences have proven to be “unclean”. Identification of cDNA termini/ends and their structures in raw ESTs not only facilitates data quality control and accurate delineation of transcription ends, but also furthers our understanding of the potential sources of data abnormalities/errors present in the wet-lab procedures for cDNA library construction.

Results

After analyzing a total of 309,976 raw Pinus taeda ESTs, we uncovered many distinct variations of cDNA termini, some of which prove to be good indicators of wet-lab artifacts, and characterized each raw EST by its cDNA terminus structure patterns. In contrast to the expected patterns, many ESTs displayed complex and/or abnormal patterns that represent potential wet-lab errors such as: a failure of one or both of the restriction enzymes to cut the plasmid vector; a failure of the restriction enzymes to cut the vector at the correct positions; the insertion of two cDNA inserts into a single vector; the insertion of multiple and/or concatenated adapters/linkers; the presence of 3′-end terminal structures in designated 5′-end sequences or vice versa; and so on. With a close examination of these artifacts, many problematic ESTs that have been deposited into public databases by conventional bioinformatics pipelines or tools could be cleaned or filtered by our methodology. We developed a software tool for Abnormality Filtering and Sequence Trimming for ESTs (AFST, http://code.google.com/p/afst/) using a pattern analysis approach. To compare AFST with other pipelines that submitted ESTs into dbEST, we reprocessed 230,783 Pinus taeda and 38,709 Arachis hypogaea GenBank ESTs. We found 7.4% of Pinus taeda and 29.2% of Arachis hypogaea GenBank ESTs are “unclean” or abnormal, all of which could be cleaned or filtered by AFST.

Conclusions

cDNA terminal pattern analysis, as implemented in the AFST software tool, can be utilized to reveal wet-lab errors such as restriction enzyme cutting abnormalities and chimeric EST sequences, detect various data abnormalities embedded in existing Sanger EST datasets, improve the accuracy of identifying and extracting bona fide cDNA inserts from raw ESTs, and therefore greatly benefit downstream EST-based applications.

Bastepe M. The GNAS Locus: Quintessential Complex Gene Encoding Gsalpha, XLalphas, and other Imprinted Transcripts. Curr Genomics 2011;8:398-414. [PMID: 19412439 PMCID: PMC2671723 DOI: 10.2174/138920207783406488] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2007] [Revised: 09/22/2007] [Accepted: 09/28/2007] [Indexed: 12/14/2022] Open

Hegyi H, Kalmar L, Horvath T, Tompa P. Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder. Nucleic Acids Res 2010;39:1208-19. [PMID: 20972208 PMCID: PMC3045584 DOI: 10.1093/nar/gkq843] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open

Dunham I, Beare DM, Collins JE. The characteristics of human genes: analysis of human chromosome 22. Comp Funct Genomics 2010;4:635-46. [PMID: 18629020 PMCID: PMC2447302 DOI: 10.1002/cfg.335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2003] [Revised: 09/04/2003] [Accepted: 09/08/2003] [Indexed: 11/11/2022] Open

Simpson AJ, de Souza SJ, Camargo AA, Brentani RR. Definition of the gene content of the human genome: the need for deep experimental verification. Comp Funct Genomics 2010;2:169-75. [PMID: 18628909 PMCID: PMC2447206 DOI: 10.1002/cfg.81] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2001] [Accepted: 04/05/2001] [Indexed: 11/06/2022] Open

Pertea M, Salzberg SL. Between a chicken and a grape: estimating the number of human genes. Genome Biol 2010;11:206. [PMID: 20441615 PMCID: PMC2898077 DOI: 10.1186/gb-2010-11-5-206] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

An integrated mass-spectrometry pipeline identifies novel protein coding-regions in the human genome. PLoS One 2010;5:e8949. [PMID: 20126623 PMCID: PMC2812506 DOI: 10.1371/journal.pone.0008949] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2009] [Accepted: 01/06/2010] [Indexed: 01/28/2023] Open

Is sequencing enlightenment ending the dark age of the transcriptome? Nat Methods 2009;6:711-13. [PMID: 19953680 DOI: 10.1038/nmeth1009-711] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Yang X, Xie L, Li Y, Wei C. More than 9,000,000 unique genes in human gut bacterial community: estimating gene numbers inside a human body. PLoS One 2009;4:e6074. [PMID: 19562079 PMCID: PMC2699651 DOI: 10.1371/journal.pone.0006074] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2009] [Accepted: 05/29/2009] [Indexed: 01/17/2023] Open

Gu L, Guo R. Genome-wide detection and analysis of alternative splicing for nucleotide binding site-leucine-rich repeats sequences in rice. J Genet Genomics 2009;34:247-57. [PMID: 17498622 DOI: 10.1016/s1673-8527(07)60026-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2006] [Accepted: 08/03/2006] [Indexed: 11/20/2022]

Abstract

Alternative splicing is a major contributor to genomic complexity and proteome diversity, yet alternative splicing of sequences containing the nucleotide binding site and leucine-rich repeats (NBS-LRR) domain has not been explored in rice (Oryza sativa L.). Hidden Markov model (HMM) searches were performed for the NBS-LRR domain. 875 NBS-LRR-encoding sequences were obtained from the Institute for Genomic Research (TIGR). All of them were used in BLAST searches against the Knowledge-based Oryza Molecular Biological Encyclopaedia (KOME), the TIGR rice gene index (TGI), and the Universal Protein Resource (UniProt) to obtain homologous full-length cDNAs (FL-cDNAs), tentative consensus sequences, and protein sequences. Alternative splicing events were detected from genomic alignment of FL-cDNAs, tentative consensus sequences, and protein sequences, which provide valuable information on splice variants of genes. These sequences were aligned to the corresponding BAC sequences using the Spidey and Sim4 programs and each of the proteins was aligned by tBLASTn. Of the 875 NBS-LRR sequences, 119 (13.6%) sequences had alternative splicing where multiple FL-cDNAs, TGI sequences and proteins corresponded to the same gene. 71 intron retention events, 20 exon skipping events, 16 alternative termination events, 25 alternative initiation events, 12 alternative 5' splicing events, and 16 alternative 3' splicing events were identified. Most of these alternative splices were supported by two or more transcripts. The data sets are available at http://www.bioinfor.org. Furthermore, the bioinformatics analysis of splice boundaries showed that exon skipping and intron retention did not exhibit strong consensus. This implies a different regulation mechanism that guides the expression of splice isoforms. This article also presents the analysis of the effects of intron retention on proteins. The C-terminal regions of alternative proteins turned out to be more variable than the N-terminal regions. Finally, tissue distribution and protein localization of alternative splicing were explored. The largest categories of tissue distributions for alternative splicing were shoot and callus. More than one-third of the splice forms were localized to the plasma membrane and cytoplasm. All the NBS-LRR proteins for splice forms may have important function in disease resistance and activate downstream signaling pathways.
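The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration built from the numbers stated above; the variable and function names are hypothetical and are not taken from the cited article.

```python
# Counts quoted in the abstract above (names are illustrative only).
total_sequences = 875      # NBS-LRR-encoding sequences examined
spliced_sequences = 119    # sequences reported with alternative splicing

# Event counts by type, as listed in the abstract.
events = {
    "intron retention": 71,
    "exon skipping": 20,
    "alternative termination": 16,
    "alternative initiation": 25,
    "alternative 5' splicing": 12,
    "alternative 3' splicing": 16,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


spliced_percent = percent(spliced_sequences, total_sequences)
total_events = sum(events.values())

print(f"{spliced_percent}% of sequences show alternative splicing")
print(f"{total_events} events reported in total")
```

The reported 13.6% matches 119 out of 875. The event tally (160) is larger than the number of alternatively spliced sequences (119), which suggests that some sequences carry more than one event, although the abstract does not state this directly.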

Scheibye-Alsing K, Hoffmann S, Frankel A, Jensen P, Stadler PF, Mang Y, Tommerup N, Gilchrist MJ, Nygård AB, Cirera S, Jørgensen CB, Fredholm M, Gorodkin J. Sequence assembly. Comput Biol Chem 2008;33:121-36. [PMID: 19152793 DOI: 10.1016/j.compbiolchem.2008.11.003] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2008] [Revised: 11/28/2008] [Accepted: 11/28/2008] [Indexed: 01/20/2023]

Cao J, Wu X, Jin Y. Lower GC-content in editing exons: implications for regulation by molecular characteristics maintained by selection. Gene 2008;421:14-9. [PMID: 18632225 DOI: 10.1016/j.gene.2008.05.012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2007] [Revised: 03/01/2008] [Accepted: 05/21/2008] [Indexed: 01/26/2023]

Zou X, Chung T, Lin X, Malakhova ML, Pike HM, Brown RE. Human glycolipid transfer protein (GLTP) genes: organization, transcriptional status and evolution. BMC Genomics 2008;9:72. [PMID: 18261224 PMCID: PMC2262070 DOI: 10.1186/1471-2164-9-72] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2007] [Accepted: 02/08/2008] [Indexed: 12/31/2022] Open

Abstract

BACKGROUND

Glycolipid transfer protein is the prototypical and founding member of the new GLTP superfamily distinguished by a novel conformational fold and glycolipid binding motif. The present investigation provides the first insights into the organization, transcriptional status, phylogenetic/evolutionary relationships of GLTP genes.

RESULTS

In human cells, single-copy GLTP genes were found in chromosomes 11 and 12. The gene at locus 11p15.1 exhibited several features of a potentially active retrogene, including a highly homologous (approximately 94%), full-length coding sequence containing all key amino acid residues involved in glycolipid liganding. To establish the transcriptional activity of each human GLTP gene, in silico EST evaluations, RT-PCR amplifications of GLTP transcript(s), and methylation analyses of regulator CpG islands were performed using various human cells. Active transcription was found for 12q24.11 GLTP but 11p15.1 GLTP was transcriptionally silent. Heterologous expression and purification of the GLTP paralogs showed glycolipid intermembrane transfer activity only for 12q24.11 GLTP. Phylogenetic/evolutionary analyses indicated that the 5-exon/4-intron organizational pattern and encoded sequence of 12q24.11 GLTP were highly conserved in therian mammals and other vertebrates. Orthologs of the intronless GLTP gene were observed in primates but not in rodentiates, carnivorates, cetartiodactylates, or didelphimorphiates, consistent with recent evolutionary development.

CONCLUSION

The results identify and characterize the gene responsible for GLTP expression in humans and provide the first evidence for the existence of a GLTP pseudogene, while demonstrating the rigorous approach needed to unequivocally distinguish transcriptionally-active retrogenes from silent pseudogenes. The results also rectify errors in the Ensembl database regarding the organizational structure of the actively transcribed GLTP gene in Pan troglodytes and establish the intronless GLTP as a primate-specific, processed pseudogene marker. A solid foundation has been established for future identification of hereditary defects in human GLTP genes.

Carninci P. Constructing the landscape of the mammalian transcriptome. J Exp Biol 2008;210:1497-506. [PMID: 17449815 DOI: 10.1242/jeb.000406] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]

Quackenbush J. Extracting biology from high-dimensional biological data. J Exp Biol 2008;210:1507-17. [PMID: 17449816 DOI: 10.1242/jeb.004432] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Bioinformatics detection of alternative splicing. Methods Mol Biol 2008;452:179-97. [PMID: 18566765 DOI: 10.1007/978-1-60327-159-2_9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Dahinden C, Parmigiani G, Emerick MC, Bühlmann P. Penalized likelihood for sparse contingency tables with an application to full-length cDNA libraries. BMC Bioinformatics 2007;8:476. [PMID: 18072965 PMCID: PMC2233645 DOI: 10.1186/1471-2105-8-476] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2007] [Accepted: 12/11/2007] [Indexed: 11/10/2022] Open

Liang C, Wang G, Liu L, Ji G, Fang L, Liu Y, Carter K, Webb JS, Dean JFD. ConiferEST: an integrated bioinformatics system for data reprocessing and mining of conifer expressed sequence tags (ESTs). BMC Genomics 2007;8:134. [PMID: 17535431 PMCID: PMC1894976 DOI: 10.1186/1471-2164-8-134] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2006] [Accepted: 05/29/2007] [Indexed: 11/30/2022] Open

Abstract

Background

With the advent of low-cost, high-throughput sequencing, the amount of public domain Expressed Sequence Tag (EST) sequence data available for both model and non-model organisms is growing exponentially. While these data are widely used for characterizing various genomes, they also present a serious challenge for data quality control and validation due to their inherent deficiencies, particularly for species without genome sequences.

Description

ConiferEST is an integrated system for data reprocessing, visualization and mining of conifer ESTs. In its current release, Build 1.0, it houses 172,229 loblolly pine EST sequence reads, which were obtained from reprocessing raw DNA sequencer traces using our software – WebTraceMiner. The trace files were downloaded from NCBI Trace Archive. ConiferEST provides biologists unique, easy-to-use data visualization and mining tools for a variety of putative sequence features including cloning vector segments, adapter sequences, restriction endonuclease recognition sites, polyA and polyT runs, and their corresponding Phred quality values. Based on these putative features, verified sequence features such as 3' and/or 5' termini of cDNA inserts in either sense or non-sense strand have been identified in-silico. Interestingly, only 30.03% of the designated 3' ESTs were found to have an authenticated 5' terminus in the non-sense strand (i.e., polyT tails), while fewer than 5.34% of the designated 5' ESTs had a verified 5' terminus in the sense strand. Such previously ignored features provide valuable insight for data quality control and validation of error-prone ESTs, as well as the ability to identify novel functional motifs embedded in large EST datasets. We found that "double-termini adapters" were effective indicators of potential EST chimeras. For all sequences with in-silico verified termini/terminus, we used InterProScan to assign protein domain signatures, results of which are available for in-depth exploration using our biologist-friendly web interfaces.

Conclusion

ConiferEST represents a unique and complementary public resource for EST data integration and mining in conifers by reprocessing raw DNA traces, identifying putative sequence features and determining and annotating in-silico verified features. Seamlessly integrated with other public resources, ConiferEST provides biologists powerful tools to verify data, visualize abnormalities, including EST chimeras, and explore large EST datasets.
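The abstract lists polyA and polyT runs among the putative sequence features used to locate cDNA termini. As a rough illustration of that idea only, the sketch below scans a read for homopolymer runs with a regular expression. It is not the WebTraceMiner or ConiferEST implementation; the function name, the `min_len` threshold of 10, and the example read are all assumptions made for this example.

```python
import re


def find_homopolymer_runs(read, base, min_len=10):
    """Return (start, end) spans where `base` repeats at least `min_len` times.

    Hypothetical helper: the threshold is an assumption, not a value
    taken from the cited article.
    """
    # Builds a pattern such as "A{10,}" and matches case-insensitively.
    pattern = re.compile(f"{base}{{{min_len},}}", re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(read)]


# Made-up read: 7 bases of insert, a 12-base polyA run, then 4 trailing bases.
example_read = "GATTACG" + "A" * 12 + "GGCC"

poly_a_runs = find_homopolymer_runs(example_read, "A")
poly_t_runs = find_homopolymer_runs(example_read, "T")

print("polyA runs:", poly_a_runs)
print("polyT runs:", poly_t_runs)
```

A real pipeline would also weigh Phred quality values and the positions of vector and adapter segments, as the abstract describes; this sketch covers only the run-detection step.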
