Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Frith MC, Wan R, Horton P. Incorporating sequence quality data into alignment improves DNA read mapping. Nucleic Acids Res 2010;38:e100. [PMID: 20110255 PMCID: PMC2853142 DOI: 10.1093/nar/gkq010] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 11/13/2022] Open

Cited by Other Article(s)

Elrashedy A, Nayel M, Salama A, Zaghawa A, Abdelsalam NR, Hasan ME. Phylogenetic Analysis and Comparative Genomics of Brucella abortus and Brucella melitensis Strains in Egypt. J Mol Evol 2024;92:338-357. [PMID: 38809331 PMCID: PMC11169049 DOI: 10.1007/s00239-024-10173-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Accepted: 05/02/2024] [Indexed: 05/30/2024]

Oury N, Magalon H. Investigating the potential roles of intra-colonial genetic variability in Pocillopora corals using genomics. Sci Rep 2024;14:6437. [PMID: 38499737 PMCID: PMC10948807 DOI: 10.1038/s41598-024-57136-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 03/14/2024] [Indexed: 03/20/2024] Open

Xie S, Isaacs K, Becker G, Murdoch BM. A computational framework for improving genetic variants identification from 5,061 sheep sequencing data. J Anim Sci Biotechnol 2023;14:127. [PMID: 37779189 PMCID: PMC10544426 DOI: 10.1186/s40104-023-00923-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Accepted: 08/01/2023] [Indexed: 10/03/2023] Open

Abstract

BACKGROUND

Pan-genomics is a recently emerging strategy that can be utilized to provide a more comprehensive characterization of genetic variation. Joint calling is routinely used to combine identified variants across multiple related samples. However, the improvement in variant identification gained from mutual support information across multiple samples remains quite limited for population-scale genotyping.

RESULTS

In this study, we developed a computational framework for joint calling of genetic variants from 5,061 sheep by incorporating the sequencing error and optimizing mutual support information from multiple samples' data. The variants were accurately identified from multiple samples in four steps: (1) probabilities of variants from two widely used algorithms, GATK and Freebayes, were calculated with a Poisson model incorporating base sequencing error potential; (2) variants with high mapping quality, or consistently identified in at least two samples by GATK and Freebayes, were used to construct the raw high-confidence identification (rHID) variants database; (3) the high-confidence variants identified in a single sample were ordered by probability value and controlled by false discovery rate (FDR) using the rHID database; (4) to avoid eliminating potentially true variants from the rHID database, the variants that failed FDR were reexamined to rescue potentially true variants and ensure highly accurate variant identification. The results indicated that the percentage of concordant SNPs and indels from Freebayes and GATK improved significantly, by 12%-32%, after applying our new method compared with raw variants. The method also found low-frequency variants in individual sheep involved in several traits, including nipple number (GPC5), scrapie pathology (PAPSS2), seasonal reproduction and litter size (GRM1), coat color (RAB27A), and lentivirus susceptibility (TMEM154).

CONCLUSION

The new method uses a computational strategy to reduce the number of false positives while improving the identification of genetic variants. This strategy did not incur any extra cost, since it requires no additional samples or sequencing data, and it identified rare variants, which can be important for practical applications in animal breeding.
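
Step (3) of the abstract above ranks single-sample calls by probability and controls the false discovery rate against the rHID database. The abstract does not give the estimator, so the following is only a minimal sketch under a stated assumption: a call missing from the high-confidence set is counted as a putative false positive, and the longest top-ranked prefix whose estimated FDR stays within a threshold is kept. The function name, the variant tuple layout, and the default threshold are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def filter_by_empirical_fdr(
    scored: Iterable[Tuple[Variant, float]],
    high_confidence: Set[Variant],
    max_fdr: float = 0.05,
) -> List[Variant]:
    """Keep the longest top-ranked prefix whose estimated FDR is <= max_fdr.

    Assumption for illustration only: a call absent from `high_confidence`
    is treated as a putative false positive.
    """
    # Rank calls from most to least probable.
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    keep = 0
    unsupported = 0
    for rank, (variant, _prob) in enumerate(ranked, start=1):
        if variant not in high_confidence:
            unsupported += 1
        # Estimated FDR among the top `rank` calls.
        if unsupported / rank <= max_fdr:
            keep = rank
    return [variant for variant, _prob in ranked[:keep]]
```

Calls that fall outside the retained prefix would then be the candidates for the re-examination described in step (4); that rescue step is not sketched here because the abstract gives no detail on it.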

Weinstein JY, Martí-Gómez C, Lipsh-Sokolik R, Hoch SY, Liebermann D, Nevo R, Weissman H, Petrovich-Kopitman E, Margulies D, Ivankov D, McCandlish DM, Fleishman SJ. Designed active-site library reveals thousands of functional GFP variants. Nat Commun 2023;14:2890. [PMID: 37210560 PMCID: PMC10199939 DOI: 10.1038/s41467-023-38099-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 04/13/2023] [Indexed: 05/22/2023] Open

Luiza Atella A, Fatima Grossi-de-Sá M, Alves-Ferreira M. Cotton promoters for controlled gene expression. Electron J Biotechn 2023. [DOI: 10.1016/j.ejbt.2022.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023] Open

Wu Z, Che Y, Dang C, Zhang M, Zhang X, Sun Y, Li X, Zhang T, Xia Y. Nanopore-based long-read metagenomics uncover the resistome intrusion by antibiotic resistant bacteria from treated wastewater in receiving water body. Water Research 2022;226:119282. [PMID: 36332295 DOI: 10.1016/j.watres.2022.119282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 10/17/2022] [Accepted: 10/19/2022] [Indexed: 06/16/2023]

Affiliation(s)

Ziqi Wu School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; Section of Microbiology, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen, Denmark
You Che Environmental Microbiome Engineering and Biotechnology Laboratory, Department of Civil Engineering, The University of Hong Kong, Hong Kong SAR
Chenyuan Dang School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Miao Zhang School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Xuyang Zhang School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Yuhong Sun School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Xiang Li School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; State Environmental Protection Key Laboratory of Integrated Surface Water-Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China; Guangdong Provincial Key Laboratory of Soil and Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
Tong Zhang Environmental Microbiome Engineering and Biotechnology Laboratory, Department of Civil Engineering, The University of Hong Kong, Hong Kong SAR
Yu Xia School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China; State Environmental Protection Key Laboratory of Integrated Surface Water-Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China; Guangdong Provincial Key Laboratory of Soil and Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China.

Yim WC, Swain ML, Ma D, An H, Bird KA, Curdie DD, Wang S, Ham HD, Luzuriaga-Neira A, Kirkwood JS, Hur M, Solomon JKQ, Harper JF, Kosma DK, Alvarez-Ponce D, Cushman JC, Edger PP, Mason AS, Pires JC, Tang H, Zhang X. The final piece of the Triangle of U: Evolution of the tetraploid Brassica carinata genome. The Plant Cell 2022;34:4143-4172. [PMID: 35961044 PMCID: PMC9614464 DOI: 10.1093/plcell/koac249] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 01/03/2022] [Accepted: 06/24/2022] [Indexed: 05/05/2023]

Affiliation(s)

Won Cheol Yim Author for correspondence:
Mia L Swain Author for correspondence:
Dongna Ma Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization, Fujian Agriculture and Forestry University, Fuzhou, China
Hong An Division of Biological Sciences, University of Missouri, Columbia, Missouri 65201, USA
Kevin A Bird Department of Horticulture, Michigan State University, East Lansing, Michigan 48824, USA
David D Curdie Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
Samuel Wang Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
Hyun Don Ham Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
Agusto Luzuriaga-Neira Department of Biology, University of Nevada, Reno, Nevada 89557, USA
Jay S Kirkwood Metabolomics Core Facility, Institute for Integrative Genome Biology, University of California, Riverside, California 92521, USA
Manhoi Hur Metabolomics Core Facility, Institute for Integrative Genome Biology, University of California, Riverside, California 92521, USA
Juan K Q Solomon Department of Agriculture, Veterinary & Rangeland Sciences, University of Nevada, Reno, Nevada 89557, USA
Jeffrey F Harper Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
Dylan K Kosma Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
David Alvarez-Ponce Department of Biology, University of Nevada, Reno, Nevada 89557, USA
John C Cushman Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Nevada 89557, USA
Patrick P Edger Department of Horticulture, Michigan State University, East Lansing, Michigan 48824, USA
Annaliese S Mason Plant Breeding Department, INRES, The University of Bonn, Bonn 53115, Germany
J Chris Pires Division of Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211, USA
Haibao Tang Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization, Fujian Agriculture and Forestry University, Fuzhou, China
Xingtan Zhang Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Key Laboratory of Ministry of Education for Genetics, Breeding and Multiple Utilization of Crops, Key Laboratory of National Forestry and Grassland Administration for Orchid Conservation and Utilization, Fujian Agriculture and Forestry University, Fuzhou, China

Müller R, Nebel M. On the use of sequence-quality information in OTU clustering. PeerJ 2021;9:e11717. [PMID: 34458017 PMCID: PMC8375510 DOI: 10.7717/peerj.11717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2020] [Accepted: 06/11/2021] [Indexed: 11/20/2022] Open

Abstract

Background

High-throughput sequencing has become an essential technology in life science research. Despite continuous improvements in technology, the produced sequences are still not entirely accurate. Consequently, the sequences are usually equipped with error probabilities. The quality information is already employed to find better solutions to a number of bioinformatics problems (e.g. read mapping). Data processing pipelines benefit in particular (especially when incorporating the quality information early), since enhanced outcomes of one step can improve all subsequent ones. Preprocessing steps, thus, quite regularly consider the sequence quality to fix errors or discard low-quality data. Other steps, however, like clustering sequences into operational taxonomic units (OTUs), a common task in the analysis of microbial communities, are typically performed without making use of the available quality information.

Results

In this paper, we present quality-aware clustering methods inspired by quality-weighted alignments and model-based denoising, and explore their applicability to OTU clustering. We implemented the quality-aware methods in a revised version of our de novo clustering tool GeFaST and evaluated their clustering quality and performance on mock-community data sets. Quality-weighted alignments were able to improve the clustering quality of GeFaST by up to 10%. The examination of the model-supported methods provided a more diverse picture, hinting at a narrower applicability, but they were able to attain similar improvements. Considering the quality information enlarged both runtime and memory consumption, even though the increase of the former depended heavily on the applied method and clustering threshold.

Conclusions

The quality-aware methods expand the iterative, de novo clustering approach by new clustering and cluster refinement methods. Our results indicate that OTU clustering constitutes yet another analysis step benefiting from the integration of quality information. Beyond the shown potential, the quality-aware methods offer a range of opportunities for fine-tuning and further extensions.
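
The abstract above relies on the per-base error probabilities that accompany sequencing reads. As a small illustration of what that quality information looks like in practice, the sketch below converts Phred quality scores to error probabilities using the standard relation p = 10^(-Q/10). It assumes FASTQ quality strings in the common Phred+33 ASCII encoding; the function names are illustrative and are not part of GeFaST or any other tool named here.

```python
from typing import List


def phred_to_error_prob(q: int) -> float:
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality_string(qual: str, offset: int = 33) -> List[float]:
    """Decode a FASTQ quality string (assumed Phred+33) into error probabilities."""
    return [phred_to_error_prob(ord(ch) - offset) for ch in qual]


def expected_errors(qual: str, offset: int = 33) -> float:
    """Expected number of erroneous bases in a read: the sum of per-base error probabilities."""
    return sum(decode_quality_string(qual, offset))
```

A quality-aware method can weight each base by such a probability, for example by trusting a mismatch less when the base involved has a high error probability.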

Boatwright JL, Yeh CT, Hu HC, Susanna A, Soltis DE, Soltis PS, Schnable PS, Barbazuk WB. Trajectories of Homoeolog-Specific Expression in Allotetraploid Tragopogon castellanus Populations of Independent Origins. Frontiers in Plant Science 2021;12:679047. [PMID: 34249049 PMCID: PMC8261302 DOI: 10.3389/fpls.2021.679047] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/10/2021] [Accepted: 05/20/2021] [Indexed: 06/13/2023]

Abstract

Polyploidization can have a significant ecological and evolutionary impact by providing substantially more genetic material that may result in novel phenotypes upon which selection may act. While the effects of polyploidization are broadly reviewed across the plant tree of life, the reproducibility of these effects within naturally occurring, independently formed polyploids is poorly characterized. The flowering plant genus Tragopogon (Asteraceae) offers a rare glimpse into the intricacies of repeated allopolyploid formation with both nascent (< 90 years old) and more ancient (mesopolyploids) formations. Neo- and mesopolyploids in Tragopogon have formed repeatedly and have extant diploid progenitors that facilitate the comparison of genome evolution after polyploidization across a broad span of evolutionary time. Here, we examine four independently formed lineages of the mesopolyploid Tragopogon castellanus for homoeolog expression changes and fractionation after polyploidization. We show that expression changes are remarkably similar among these independently formed polyploid populations with large convergence among expressed loci, moderate convergence among loci lost, and stochastic silencing. We further compare and contrast these results for T. castellanus with two nascent Tragopogon allopolyploids. While homoeolog expression bias was balanced in both nascent polyploids and T. castellanus, the degree of additive expression was significantly different, with the mesopolyploid populations demonstrating more non-additive expression. We suggest that gene dosage and expression noise minimization may play a prominent role in regulating gene expression patterns immediately after allopolyploidization as well as deeper into time, and these patterns are conserved across independent polyploid lineages.

Fischer C, Koblmüller S, Börger C, Michelitsch G, Trajanoski S, Schlötterer C, Guelly C, Thallinger GG, Sturmbauer C. Genome sequences of Tropheus moorii and Petrochromis trewavasae, two eco-morphologically divergent cichlid fishes endemic to Lake Tanganyika. Sci Rep 2021;11:4309. [PMID: 33619328 PMCID: PMC7900123 DOI: 10.1038/s41598-021-81030-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/05/2020] [Accepted: 12/28/2020] [Indexed: 01/01/2023] Open

Frith MC. How sequence alignment scores correspond to probability models. Bioinformatics 2019;36:408-415. [PMID: 31329241 PMCID: PMC9883716 DOI: 10.1093/bioinformatics/btz576] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/27/2019] [Revised: 05/31/2019] [Accepted: 07/17/2019] [Indexed: 02/03/2023] Open

Liu T, Wang X, Wang G, Jia S, Liu G, Shan G, Chi S, Zhang J, Yu Y, Xue T, Yu J. Evolution of Complex Thallus Alga: Genome Sequencing of Saccharina japonica. Front Genet 2019;10:378. [PMID: 31118944 PMCID: PMC6507550 DOI: 10.3389/fgene.2019.00378] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/12/2018] [Accepted: 04/09/2019] [Indexed: 01/15/2023] Open

Díaz-Sánchez S, Hernández-Jarguín A, Torina A, de Mera IGF, Blanda V, Caracappa S, Gortazar C, de la Fuente J. Characterization of the bacterial microbiota in wild-caught Ixodes ventalloi. Ticks Tick Borne Dis 2018;10:336-343. [PMID: 30482513 DOI: 10.1016/j.ttbdis.2018.11.014] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/25/2018] [Revised: 10/10/2018] [Accepted: 11/15/2018] [Indexed: 11/24/2022]

Frith MC, Shrestha AMS. A Simplified Description of Child Tables for Sequence Similarity Search. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018;15:2067-2073. [PMID: 29994365 DOI: 10.1109/tcbb.2018.2796064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]

Abbas-Aghababazadeh F, Li Q, Fridley BL. Comparison of normalization approaches for gene expression studies completed with high-throughput sequencing. PLoS One 2018;13:e0206312. [PMID: 30379879 PMCID: PMC6209231 DOI: 10.1371/journal.pone.0206312] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 07/13/2018] [Accepted: 10/10/2018] [Indexed: 01/07/2023] Open

A Robust Methodology for Assessing Differential Homeolog Contributions to the Transcriptomes of Allopolyploids. Genetics 2018;210:883-894. [PMID: 30213855 DOI: 10.1534/genetics.118.301564] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/01/2018] [Accepted: 09/07/2018] [Indexed: 12/18/2022] Open

Suzuki A, Suzuki M, Mizushima-Sugano J, Frith MC, Makalowski W, Kohno T, Sugano S, Tsuchihara K, Suzuki Y. Sequencing and phasing cancer mutations in lung cancers using a long-read portable sequencer. DNA Res 2018;24:585-596. [PMID: 29117310 PMCID: PMC5726485 DOI: 10.1093/dnares/dsx027] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 03/05/2017] [Accepted: 05/29/2017] [Indexed: 01/18/2023] Open

Integrated metatranscriptomics and metaproteomics for the characterization of bacterial microbiota in unfed Ixodes ricinus. Ticks Tick Borne Dis 2018;9:1241-1251. [DOI: 10.1016/j.ttbdis.2018.04.020] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 01/18/2018] [Revised: 04/28/2018] [Accepted: 04/29/2018] [Indexed: 12/12/2022]

Lin HN, Hsu WL. Kart: a divide-and-conquer algorithm for NGS read alignment. Bioinformatics 2018;33:2281-2287. [PMID: 28379292 PMCID: PMC5860120 DOI: 10.1093/bioinformatics/btx189] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/08/2016] [Accepted: 04/05/2017] [Indexed: 02/02/2023] Open

Gan RC, Chen TW, Wu TH, Huang PJ, Lee CC, Yeh YM, Chiu CH, Huang HD, Tang P. PARRoT- a homology-based strategy to quantify and compare RNA-sequencing from non-model organisms. BMC Bioinformatics 2016;17:513. [PMID: 28155708 PMCID: PMC5260104 DOI: 10.1186/s12859-016-1366-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/23/2023] Open

Abstract

Background

Next-generation sequencing promises the de novo genomic and transcriptomic analysis of samples of interest. However, only a few organisms have reference genomic sequences, and even fewer have well-defined or curated annotations. For transcriptome studies focusing on organisms lacking proper reference genomes, the common strategy is de novo assembly followed by functional annotation. However, things become even more complicated when multiple transcriptomes are compared.

Results

Here, we propose a new analysis strategy and quantification methods for expression levels that not only generate a virtual reference from sequencing data but also enable comparisons between transcriptomes. First, all reads from the transcriptome datasets are pooled together for de novo assembly. The assembled contigs are searched against NCBI NR databases to find potential homolog sequences. Based on the search results, a set of virtual transcripts is generated and serves as a reference transcriptome. By using the same reference, normalized quantification values including RC (read counts), eRPKM (estimated RPKM) and eTPM (estimated TPM) can be obtained that are comparable across transcriptome datasets. In order to demonstrate the feasibility of our strategy, we implement it in the web service PARRoT. PARRoT stands for Pipeline for Analyzing RNA Reads of Transcriptomes. It analyzes gene expression profiles for two transcriptome sequencing datasets. For a better understanding of the biological meaning of the comparison among transcriptomes, PARRoT further links these virtual transcripts to their potential functions by showing best hits in SwissProt and the NR database and by assigning GO terms. Our demo datasets showed that PARRoT can analyze two paired-end transcriptomic datasets of approximately 100 million reads within just three hours.

Conclusions

In this study, we proposed and implemented a strategy to analyze transcriptomes from non-reference organisms, which offers the opportunity to quantify and compare transcriptome profiles through a homolog-based virtual transcriptome reference. By using the homolog-based reference, our strategy effectively avoids the problems that may arise from inconsistencies among transcriptomes. This strategy will shed light on the field of comparative genomics for non-model organisms. We have implemented PARRoT as a web service which is freely available at http://parrot.cgu.edu.tw.
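
The abstract above reports read counts (RC) together with estimated RPKM and TPM values computed against a shared virtual reference. It does not say how PARRoT derives its estimated variants, so the sketch below shows only the standard textbook definitions of RPKM and TPM, to illustrate why a common reference makes the values comparable across datasets. The function names and the dictionary-based inputs are hypothetical.

```python
from typing import Dict


def rpkm(counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in counts}
    # counts / (length / 1e3) / (total / 1e6) == counts * 1e9 / (length * total)
    return {t: counts[t] * 1e9 / (lengths[t] * total) for t in counts}


def tpm(counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {t: counts[t] / lengths[t] for t in counts}
    scale = sum(rates.values())
    if scale == 0:
        return {t: 0.0 for t in counts}
    return {t: rates[t] * 1e6 / scale for t in counts}
```

Because both functions key every value by transcript identifier, two datasets quantified against the same set of virtual transcripts can be compared transcript by transcript.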

Wattam AR, Davis JJ, Assaf R, Boisvert S, Brettin T, Bun C, Conrad N, Dietrich EM, Disz T, Gabbard JL, Gerdes S, Henry CS, Kenyon RW, Machi D, Mao C, Nordberg EK, Olsen GJ, Murphy-Olson DE, Olson R, Overbeek R, Parrello B, Pusch GD, Shukla M, Vonstein V, Warren A, Xia F, Yoo H, Stevens RL. Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center. Nucleic Acids Res 2016;45:D535-D542. [PMID: 27899627 PMCID: PMC5210524 DOI: 10.1093/nar/gkw1017] [Citation(s) in RCA: 1079] [Impact Index Per Article: 134.9] [Received: 09/02/2016] [Revised: 10/14/2016] [Accepted: 11/09/2016] [Indexed: 12/14/2022] Open

Affiliation(s)

Alice R Wattam Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
James J Davis Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Rida Assaf Department of Computer Science, University of Chicago, Chicago, IL 60637, USA
Sébastien Boisvert Gydle Inc., 101-1332 Chanoine Morel, Quebec, QC G1S 4B4, Canada
Thomas Brettin Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Christopher Bun Department of Computer Science, University of Chicago, Chicago, IL 60637, USA
Neal Conrad Computation Institute, University of Chicago, Chicago, IL 60637, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
Emily M Dietrich Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Terry Disz Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Joseph L Gabbard Grado Department of Industrial & Systems Engineering, Virginia Tech, Blacksburg, VA 24060, USA
Svetlana Gerdes Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Christopher S Henry Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
Ronald W Kenyon Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
Dustin Machi Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
Chunhong Mao Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
Eric K Nordberg Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
Gary J Olsen Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
Daniel E Murphy-Olson Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Robert Olson Computation Institute, University of Chicago, Chicago, IL 60637, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
Ross Overbeek Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA; Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Bruce Parrello Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA; Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Gordon D Pusch Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Maulik Shukla Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Veronika Vonstein Fellowship for Interpretation of Genomes, Burr Ridge, IL 60527, USA
Andrew Warren Biocomplexity Institute, Virginia Tech University, Blacksburg, VA 24060, USA
Fangfang Xia Computation Institute, University of Chicago, Chicago, IL 60637, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
Hyunseung Yoo Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA
Rick L Stevens Computation Institute, University of Chicago, Chicago, IL 60637, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Argonne, IL 60439, USA; Department of Computer Science, University of Chicago, Chicago, IL 60637, USA

Schmidt K, Mwaigwisya S, Crossman LC, Doumith M, Munroe D, Pires C, Khan AM, Woodford N, Saunders NJ, Wain J, O'Grady J, Livermore DM. Identification of bacterial pathogens and antimicrobial resistance directly from clinical urines by nanopore-based metagenomic sequencing. J Antimicrob Chemother 2016;72:104-114. [PMID: 27667325 DOI: 10.1093/jac/dkw397] [Citation(s) in RCA: 208] [Impact Index Per Article: 26.0] [Received: 05/09/2016] [Revised: 08/09/2016] [Accepted: 08/21/2016] [Indexed: 12/18/2022] Open

Buffering of Genetic Regulatory Networks in Drosophila melanogaster. Genetics 2016;203:1177-90. [PMID: 27194752 DOI: 10.1534/genetics.116.188797] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/02/2016] [Accepted: 05/17/2016] [Indexed: 01/01/2023] Open

Roy Chowdhury P, DeMaere M, Chapman T, Worden P, Charles IG, Darling AE, Djordjevic SP. Comparative genomic analysis of toxin-negative strains of Clostridium difficile from humans and animals with symptoms of gastrointestinal disease. BMC Microbiol 2016;16:41. [PMID: 26971047 PMCID: PMC4789261 DOI: 10.1186/s12866-016-0653-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/29/2015] [Accepted: 03/02/2016] [Indexed: 12/13/2022] Open

Abstract

Background

Clostridium difficile infections (CDI) are a significant health problem to humans and food animals. Clostridial toxins ToxA and ToxB encoded by genes tcdA and tcdB are located on a pathogenicity locus known as the PaLoc and are the major virulence factors of C. difficile. While toxin-negative strains of C. difficile are often isolated from faeces of animals and patients suffering from CDI, they are not considered to play a role in disease. Toxin-negative strains of C. difficile have been used successfully to treat recurring CDI but their propensity to acquire the PaLoc via lateral gene transfer and express clinically relevant levels of toxins has reinforced the need to characterise them genetically. In addition, further studies that examine the pathogenic potential of toxin-negative strains of C. difficile and the frequency by which toxin-negative strains may acquire the PaLoc are needed.

Results

We undertook a comparative genomic analysis of five Australian toxin-negative isolates of C. difficile that lack tcdA, tcdB and both binary toxin genes cdtA and cdtB that were recovered from humans and farm animals with symptoms of gastrointestinal disease. Our analyses show that the five C. difficile isolates cluster closely with virulent toxigenic strains of C. difficile belonging to the same sequence type (ST) and have virulence gene profiles akin to those in toxigenic strains. Furthermore, phage acquisition appears to have played a key role in the evolution of C. difficile.

Conclusions

Our results are consistent with the C. difficile global population structure comprising six clades each containing both toxin-positive and toxin-negative strains. Our data also suggest that toxin-negative strains of C. difficile encode a repertoire of putative virulence factors that are similar to those found in toxigenic strains of C. difficile, raising the possibility that acquisition of PaLoc by toxin-negative strains poses a threat to human health. Studies in appropriate animal models are needed to examine the pathogenic potential of toxin-negative strains of C. difficile and to determine the frequency by which toxin-negative strains may acquire the PaLoc.

Electronic supplementary material

The online version of this article (doi:10.1186/s12866-016-0653-3) contains supplementary material, which is available to authorized users.

Wei S, Williams Z. Rapid Short-Read Sequencing and Aneuploidy Detection Using MinION Nanopore Technology. Genetics 2016;202:37-44. [PMID: 26500254 PMCID: PMC4701100 DOI: 10.1534/genetics.115.182311] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/27/2015] [Accepted: 10/20/2015] [Indexed: 12/30/2022] Open

Torreno O, Trelles O. Breaking the computational barriers of pairwise genome comparison. BMC Bioinformatics 2015;16:250. [PMID: 26260162 PMCID: PMC4531504 DOI: 10.1186/s12859-015-0679-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2015] [Accepted: 07/20/2015] [Indexed: 11/25/2022] Open

Abstract

Background

Conventional pairwise sequence comparison software algorithms are being used to process much larger datasets than they were originally designed for. This can result in processing bottlenecks that limit software capabilities or prevent full use of the available hardware resources. Overcoming the barriers that limit the efficient computational analysis of large biological sequence datasets by retrofitting existing algorithms or by creating new applications represents a major challenge for the bioinformatics community.

Results

We have developed C libraries for pairwise sequence comparison within diverse architectures, ranging from commodity systems to high performance and cloud computing environments. Exhaustive tests were performed using different datasets of closely- and distantly-related sequences that span from small viral genomes to large mammalian chromosomes. The tests demonstrated that our solution is capable of generating high quality results with a linear-time response and controlled memory consumption, being comparable or faster than the current state-of-the-art methods.

Conclusions

We have addressed the problem of pairwise and all-versus-all comparison of large sequences in general, greatly increasing the limits on input data size. The approach described here is based on a modular out-of-core strategy that uses secondary storage to avoid reaching memory limits during the identification of High-scoring Segment Pairs (HSPs) between the sequences under comparison. Software engineering concepts were applied to avoid intermediate result re-calculation, to minimise the performance impact of input/output (I/O) operations and to modularise the process, thus enhancing application flexibility and extendibility. Our computationally-efficient approach allows tasks such as the massive comparison of complete genomes, evolutionary event detection, the identification of conserved synteny blocks and inter-genome distance calculations to be performed more effectively.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0679-9) contains supplementary material, which is available to authorized users.

Frith MC, Kawaguchi R. Split-alignment of genomes finds orthologies more accurately. Genome Biol 2015;16:106. [PMID: 25994148 PMCID: PMC4464727 DOI: 10.1186/s13059-015-0670-9] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2015] [Accepted: 05/08/2015] [Indexed: 04/29/2023] Open

Sosa OA, Gifford SM, Repeta DJ, DeLong EF. High molecular weight dissolved organic matter enrichment selects for methylotrophs in dilution to extinction cultures. ISME J 2015;9:2725-39. [PMID: 25978545 PMCID: PMC4817625 DOI: 10.1038/ismej.2015.68] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/17/2014] [Revised: 03/04/2015] [Accepted: 03/18/2015] [Indexed: 02/06/2023]

Bu D, Nan X, Wang F, Loor J, Wang J. Identification and characterization of microRNA sequences from bovine mammary epithelial cells. J Dairy Sci 2015;98:1696-705. [DOI: 10.3168/jds.2014-8217] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2014] [Accepted: 11/22/2014] [Indexed: 11/19/2022]

Jain M, Fiddes IT, Miga KH, Olsen HE, Paten B, Akeson M. Improved data analysis for the MinION nanopore sequencer. Nat Methods 2015;12:351-6. [PMID: 25686389 PMCID: PMC4907500 DOI: 10.1038/nmeth.3290] [Citation(s) in RCA: 377] [Impact Index Per Article: 41.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2014] [Accepted: 01/20/2015] [Indexed: 12/31/2022]

Zhao H, Chen J, Liu J, Han B. Transcriptome analysis reveals the oxidative stress response in Saccharomyces cerevisiae. RSC Adv 2015. [DOI: 10.1039/c4ra14600j] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

Bustin SA. The reproducibility of biomedical research: Sleepers awake! Biomol Detect Quantif 2014;2:35-42. [PMID: 27896142 PMCID: PMC5121206 DOI: 10.1016/j.bdq.2015.01.002] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 12/15/2014] [Revised: 01/08/2015] [Accepted: 01/12/2015] [Indexed: 01/03/2023]

León-Novelo LG, McIntyre LM, Fear JM, Graze RM. A flexible Bayesian method for detecting allelic imbalance in RNA-seq data. BMC Genomics 2014;15:920. [PMID: 25339465 PMCID: PMC4230747 DOI: 10.1186/1471-2164-15-920] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2014] [Accepted: 10/09/2014] [Indexed: 01/01/2023] Open

Abstract

Background

One method of identifying cis regulatory differences is to analyze allele-specific expression (ASE) and identify cases of allelic imbalance (AI). RNA-seq is the most common way to measure ASE and a binomial test is often applied to determine statistical significance of AI. This implicitly assumes that there is no bias in estimation of AI. However, bias has been found to result from multiple factors including: genome ambiguity, reference quality, the mapping algorithm, and biases in the sequencing process. Two alternative approaches have been developed to handle bias: adjusting for bias using a statistical model and filtering regions of the genome suspected of harboring bias. Existing statistical models which account for bias rely on information from DNA controls, which can be cost prohibitive for large intraspecific studies. In contrast, data filtering is inexpensive and straightforward, but necessarily involves sacrificing a portion of the data.

Results

Here we propose a flexible Bayesian model for analysis of AI, which accounts for bias and can be implemented without DNA controls. In lieu of DNA controls, this Poisson-Gamma (PG) model uses an estimate of bias from simulations. The proposed model always has a lower type I error rate compared to the binomial test. Consistent with prior studies, bias dramatically affects the type I error rate. All of the tested models are sensitive to misspecification of bias. The closer the estimate of bias is to the true underlying bias, the lower the type I error rate. Correct estimates of bias result in a level alpha test.

Conclusions

To improve the assessment of AI, some forms of systematic error (e.g., map bias) can be identified using simulation. The resulting estimates of bias can be used to correct for bias in the PG model, without data filtering. Other sources of bias (e.g., unidentified variant calls) can be easily captured by DNA controls, but are missed by common filtering approaches. Consequently, as variant identification improves, the need for DNA controls will be reduced. Filtering does not significantly improve performance and is not recommended, as information is sacrificed without a measurable gain. The PG model developed here performs well when bias is known, or slightly misspecified. The model is flexible and can accommodate differences in experimental design and bias estimation.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-920) contains supplementary material, which is available to authorized users.

Chong LC, Albuquerque MA, Harding NJ, Caloian C, Chan-Seng-Yue M, de Borja R, Fraser M, Denroche RE, Beck TA, van der Kwast T, Bristow RG, McPherson JD, Boutros PC. SeqControl: process control for DNA sequencing. Nat Methods 2014;11:1071-5. [PMID: 25173705 DOI: 10.1038/nmeth.3094] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2014] [Accepted: 07/27/2014] [Indexed: 12/15/2022]

Kerpedjiev P, Frellsen J, Lindgreen S, Krogh A. Adaptable probabilistic mapping of short reads using position specific scoring matrices. BMC Bioinformatics 2014;15:100. [PMID: 24717095 PMCID: PMC4021105 DOI: 10.1186/1471-2105-15-100] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2014] [Accepted: 03/28/2014] [Indexed: 11/10/2022] Open

Saito Y, Tsuji J, Mituyama T. Bisulfighter: accurate detection of methylated cytosines and differentially methylated regions. Nucleic Acids Res 2014;42:e45. [PMID: 24423865 PMCID: PMC3973284 DOI: 10.1093/nar/gkt1373] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022] Open

Hong C, Clement NL, Clement S, Hammoud SS, Carrell DT, Cairns BR, Snell Q, Clement MJ, Johnson WE. Probabilistic alignment leads to improved accuracy and read coverage for bisulfite sequencing data. BMC Bioinformatics 2013;14:337. [PMID: 24261665 PMCID: PMC3924334 DOI: 10.1186/1471-2105-14-337] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2013] [Accepted: 11/19/2013] [Indexed: 11/10/2022] Open

Dalton JE, Fear JM, Knott S, Baker BS, McIntyre LM, Arbeitman MN. Male-specific Fruitless isoforms have different regulatory roles conferred by distinct zinc finger DNA binding domains. BMC Genomics 2013;14:659. [PMID: 24074028 PMCID: PMC3852243 DOI: 10.1186/1471-2164-14-659] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2013] [Accepted: 09/20/2013] [Indexed: 11/25/2022] Open

Mahmud MP, Wiedenhoeft J, Schliep A. Indel-tolerant read mapping with trinucleotide frequencies using cache-oblivious kd-trees. Bioinformatics 2013;28:i325-i332. [PMID: 22962448 PMCID: PMC3436807 DOI: 10.1093/bioinformatics/bts380] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Umemura M, Koyama Y, Takeda I, Hagiwara H, Ikegami T, Koike H, Machida M. Fine de novo sequencing of a fungal genome using only SOLiD short read data: verification on Aspergillus oryzae RIB40. PLoS One 2013;8:e63673. [PMID: 23667655 PMCID: PMC3646829 DOI: 10.1371/journal.pone.0063673] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2012] [Accepted: 04/05/2013] [Indexed: 11/18/2022] Open

Shrestha AMS, Frith MC. An approximate Bayesian approach for mapping paired-end DNA reads to a reference genome. Bioinformatics 2013;29:965-72. [PMID: 23413433 PMCID: PMC3624798 DOI: 10.1093/bioinformatics/btt073] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]

Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model. BMC Bioinformatics 2012;13:303. [PMID: 23151247 PMCID: PMC3534400 DOI: 10.1186/1471-2105-13-303] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2012] [Accepted: 11/01/2012] [Indexed: 11/25/2022] Open

Umemura M, Koike H, Yamane N, Koyama Y, Satou Y, Kikuzato I, Teruya M, Tsukahara M, Imada Y, Wachi Y, Miwa Y, Yano S, Tamano K, Kawarabayasi Y, Fujimori KE, Machida M, Hirano T. Comparative genome analysis between Aspergillus oryzae strains reveals close relationship between sites of mutation localization and regions of highly divergent genes among Aspergillus species. DNA Res 2012;19:375-82. [PMID: 22912434 PMCID: PMC3473370 DOI: 10.1093/dnares/dss019] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open

Frith MC, Mori R, Asai K. A mostly traditional approach improves alignment of bisulfite-converted DNA. Nucleic Acids Res 2012;40:e100. [PMID: 22457070 PMCID: PMC3401460 DOI: 10.1093/nar/gks275] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023] Open

Peláez P, Trejo MS, Iñiguez LP, Estrada-Navarrete G, Covarrubias AA, Reyes JL, Sanchez F. Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing. BMC Genomics 2012;13:83. [PMID: 22394504 PMCID: PMC3359237 DOI: 10.1186/1471-2164-13-83] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2011] [Accepted: 03/06/2012] [Indexed: 12/16/2022] Open

Abstract

Background

MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris.

Results

Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs, 114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies.

Conclusions

This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes.

Graze RM, Novelo LL, Amin V, Fear JM, Casella G, Nuzhdin SV, McIntyre LM. Allelic imbalance in Drosophila hybrid heads: exons, isoforms, and evolution. Mol Biol Evol 2012;29:1521-32. [PMID: 22319150 DOI: 10.1093/molbev/msr318] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Chaparro C, Sabot F. Methods and software in NGS for TE analysis. Methods Mol Biol 2012;859:105-114. [PMID: 22367868 DOI: 10.1007/978-1-61779-603-6_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]

Yang Y, Graze RM, Walts BM, Lopez CM, Baker HV, Wayne ML, Nuzhdin SV, McIntyre LM. Partitioning transcript variation in Drosophila: abundance, isoforms, and alleles. G3 (Bethesda) 2011;1:427-36. [PMID: 22384353 PMCID: PMC3276160 DOI: 10.1534/g3.111.000596] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/15/2011] [Accepted: 09/11/2011] [Indexed: 12/25/2022]

Hamada M, Wijaya E, Frith MC, Asai K. Probabilistic alignments with quality scores: an application to short-read mapping toward accurate SNP/indel detection. Bioinformatics 2011;27:3085-92. [PMID: 21976422 DOI: 10.1093/bioinformatics/btr537] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]

Kazlauskas D, Venclovas C. Computational analysis of DNA replicases in double-stranded DNA viruses: relationship with the genome size. Nucleic Acids Res 2011;39:8291-305. [PMID: 21742758 PMCID: PMC3201878 DOI: 10.1093/nar/gkr564] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open