Reference Citation Analysis

For: Green ED. Strategies for the systematic sequencing of complex genomes. Nat Rev Genet 2001;2:573-83. [PMID: 11483982 DOI: 10.1038/35084503] [Citation(s) in RCA: 130] [Impact Index Per Article: 5.7] [Indexed: 11/09/2022]

Cited by Other Article(s)

Gautier L. Microbial forensics: what we've learned from Amerithrax and beyond. Biotechniques 2023;75:129-132. [PMID: 37800360 DOI: 10.2144/btn-2023-0084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2023] Open

Meumann EM, Krause VL, Baird R, Currie BJ. Using Genomics to Understand the Epidemiology of Infectious Diseases in the Northern Territory of Australia. Trop Med Infect Dis 2022;7:tropicalmed7080181. [PMID: 36006273 PMCID: PMC9413455 DOI: 10.3390/tropicalmed7080181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/16/2022] Open

Hagemeijer YP, Guryev V, Horvatovich P. Accurate Prediction of Protein Sequences for Proteogenomics Data Integration. Methods Mol Biol 2021;2420:233-260. [PMID: 34905178 DOI: 10.1007/978-1-0716-1936-0_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

Abdelrahman M, Hirata S, Mukae T, Yamada T, Sawada Y, El-Syaed M, Yamada Y, Sato M, Hirai MY, Shigyo M. Comprehensive Metabolite Profiling in Genetic Resources of Garlic (Allium sativum L.) Collected from Different Geographical Regions. Molecules 2021;26:1415. [PMID: 33807861 PMCID: PMC7962061 DOI: 10.3390/molecules26051415] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/02/2021] [Revised: 02/18/2021] [Accepted: 02/25/2021] [Indexed: 11/17/2022] Open

Abstract

Garlic (Allium sativum) is the second most important Allium crop that has been used as a vegetable and condiment from ancient times due to its characteristic flavor and taste. Although garlic is a sterile plant that reproduces vegetatively through cloves, garlic shows high biodiversity, as well as phenotypic plasticity and environmental adaptation capacity. To determine the possible mechanism underlying this phenomenon and to provide new genetic materials for the development of a novel garlic cultivar with useful agronomic traits, the metabolic profiles in the leaf tissue of 30 garlic accessions collected from different geographical regions, with a special focus on the Asian region, were investigated using LC/MS. In addition, the total saponin and fructan contents in the roots and cloves of the investigated garlic accessions were also evaluated. Total saponin and fructan contents did not separate the garlic accessions based on their geographical origin, implying that saponin and fructan contents were clone-specific and agroclimatic changes have affected the quantitative and qualitative levels of saponins in garlic over a long history of cultivation. Principal component analysis (PCA) and dendrogram clustering of the LC/MS-based metabolite profiling showed two major clusters. Specifically, many Japanese and Central Asia accessions were grouped in cluster I and showed high accumulations of flavonol glucosides, alliin, and methiin. On the other hand, garlic accessions grouped in cluster II exhibited a high accumulation of anthocyanin glucosides and amino acids. Although most of the accessions were not separated based on country of origin, the Central Asia accessions were clustered in one group, implying that these accessions exhibited distinct metabolic profiles. The present study provides useful information that can be used for germplasm selection and the development of new garlic varieties with beneficial biotic and abiotic stress-adaptive traits.

Muggia L, Ametrano CG, Sterflinger K, Tesei D. An Overview of Genomics, Phylogenomics and Proteomics Approaches in Ascomycota. Life (Basel) 2020;10:E356. [PMID: 33348904 PMCID: PMC7765829 DOI: 10.3390/life10120356] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/12/2020] [Revised: 12/10/2020] [Accepted: 12/12/2020] [Indexed: 12/26/2022] Open

Abstract

Fungi are among the most successful eukaryotes on Earth: they have evolved strategies to survive in the most diverse environments and stressful conditions and have been selected and exploited for multiple aims by humans. The characteristic features intrinsic to Fungi have required evolutionary changes and adaptations at deep molecular levels. Omics approaches, nowadays including genomics, metagenomics, phylogenomics, transcriptomics, metabolomics, and proteomics, have enormously advanced the understanding of fungal diversity at diverse taxonomic levels, under changeable conditions and in still under-investigated environments. These approaches can be applied both to environmental communities and to individual organisms, either in nature or in axenic culture, and have led traditional morphology-based fungal systematics to increasingly implement molecular-based approaches. The advent of next-generation sequencing technologies was key to boosting advances in fungal genomics and proteomics research. Much effort has also been directed towards the development of methodologies for optimal genomic DNA and protein extraction and separation. To date, the number of proteomics investigations in Ascomycetes exceeds that carried out in any other fungal group. This is primarily due to the preponderance of their involvement in plant and animal diseases and multiple industrial applications, and therefore the need to understand the biological basis of the infectious process to develop mechanisms for biologic control, as well as to detect key proteins with roles in stress survival. Here we present as comprehensive an overview as possible of the major advances, mainly of the past decade, in the fields of genomics (including phylogenomics) and proteomics of Ascomycota, focusing particularly on those reporting on opportunistic pathogenic, extremophilic, polyextremotolerant and lichenized fungi. We also present a review of the most widely used genome sequencing technologies and methods for DNA sequence and protein analyses applied so far to fungi.

Jin J, Zhang H, Li D, Jing Y, Sun Z, Feng J, Zhang H, Zhang Y, Cui T, Lei X, Zhang J, Cheng Q, Li E. Effectiveness of Xin Jia Xuan Bai Cheng Qi Decoction in treating acute exacerbation of chronic obstructive pulmonary disease: study protocol for a multicentre, randomised, controlled trial. BMJ Open 2019;9:e030249. [PMID: 31784433 PMCID: PMC6924718 DOI: 10.1136/bmjopen-2019-030249] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/11/2019] [Revised: 09/17/2019] [Accepted: 09/23/2019] [Indexed: 12/23/2022] Open

Abstract

INTRODUCTION

Acute exacerbation of chronic obstructive pulmonary disease (AECOPD) seriously affects patients' quality of life and has extremely high morbidity and mortality worldwide. Although many therapies are being developed to alleviate symptoms and reduce mortality, few studies have established which treatment method is best. Traditional Chinese medicine (TCM) has shown good potential in the prevention and treatment of AECOPD, especially as a supplement to Western medicine and as a means of reducing its dosage and adverse effects. The purpose of this study is to compare the effectiveness of a combination of TCM and Western medicine with conventional therapy alone for AECOPD, and to determine whether the combined therapy can reduce the use of systemic glucocorticoids in AECOPD without compromising efficacy.

METHODS AND ANALYSIS

A multicentre, randomised, double-blind, placebo-controlled study will be conducted, enrolling a total of 360 eligible patients who will be randomised into integrated Chinese and Western medicine groups A and B and Western standard medicine group C. After 5 days of intervention and 1 month of follow-up, the efficacy and safety of Xin Jia Xuan Bai Cheng Qi Decoction in patients with AECOPD will be assessed. The evaluation indicators include clinical symptoms; biochemical indicators such as blood gas analysis and inflammatory markers; hospitalisation time; TCM syndrome evaluation; and biological indicators such as airway and intestinal flora sequencing.

ETHICS AND DISSEMINATION

This trial has been approved by the Ethics Committee of China-Japan Friendship Hospital. The results will be disseminated in international peer-reviewed journals and presented at academic conferences. The results will also be disseminated to patients by telephone when enquiring about their poststudy health status during follow-up.

TRIAL REGISTRATION NUMBER

ChiCTR1800016915.

Harnessing genomic information for livestock improvement. Nat Rev Genet 2018;20:135-156. [DOI: 10.1038/s41576-018-0082-2] [Citation(s) in RCA: 154] [Impact Index Per Article: 25.7] [Indexed: 12/19/2022]

Egea LA, Mérida-García R, Kilian A, Hernandez P, Dorado G. Assessment of Genetic Diversity and Structure of Large Garlic (Allium sativum) Germplasm Bank, by Diversity Arrays Technology "Genotyping-by-Sequencing" Platform (DArTseq). Front Genet 2017;8:98. [PMID: 28775737 PMCID: PMC5517412 DOI: 10.3389/fgene.2017.00098] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 04/05/2017] [Accepted: 06/30/2017] [Indexed: 12/20/2022] Open

Abstract

Garlic (Allium sativum) is used worldwide in cooking and industry, including pharmacology/medicine and cosmetics, for its interesting properties. Identifying redundancies in germplasm banks to generate core collections is a major concern, mostly in large stocks, in order to reduce space and maintenance costs. Yet, the similar appearance and phenotypic plasticity of garlic varieties hinder their morphological classification. Molecular studies are challenging, due to the large and presumably complex genome of this species, with asexual reproduction. Classical molecular markers, like isozymes, RAPD, SSR, or AFLP, are not convenient for generating germplasm core collections for this species. The recent emergence of high-throughput genotyping-by-sequencing (GBS) approaches, like DArTseq, allows such limitations to be overcome in order to characterize and protect genetic diversity. Therefore, such technology was used in this work to: (i) assess genetic diversity and structure of a large garlic-germplasm bank (417 accessions); (ii) create a core collection; (iii) relate genotype to agronomical features; and (iv) describe a cost-effective method to manage genetic diversity in garlic-germplasm banks. Hierarchical-cluster analysis, principal-coordinates analysis and STRUCTURE showed general consistency, generating three main garlic-groups, mostly determined by variety and geographical origin. In addition, high-resolution genotyping identified 286 unique and 131 redundant accessions, used to select a reduced-size germplasm-bank core collection. This demonstrates that DArTseq is a cost-effective method to analyze species with large and presumably complex genomes, like garlic. To the best of our knowledge, this is the first report of high-throughput genotyping of a large garlic germplasm. This is particularly interesting for garlic adaptation and improvement, to fight biotic and abiotic stresses, in the current context of climate change and global warming.

Whole genome sequencing in clinical and public health microbiology. Pathology 2015;47:199-210. [PMID: 25730631 PMCID: PMC4389090 DOI: 10.1097/pat.0000000000000235] [Citation(s) in RCA: 175] [Impact Index Per Article: 19.4] [Indexed: 01/29/2023]

Structural and Computational Biology in the Design of Immunogenic Vaccine Antigens. J Immunol Res 2015;2015:156241. [PMID: 26526043 PMCID: PMC4615220 DOI: 10.1155/2015/156241] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 05/29/2015] [Accepted: 08/02/2015] [Indexed: 01/08/2023] Open

Meerzaman D, Dunn BK, Lee M, Chen Q, Yan C, Ross S. The promise of omics-based approaches to cancer prevention. Semin Oncol 2015;43:36-48. [PMID: 26970123 DOI: 10.1053/j.seminoncol.2015.09.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022]

Soto JC, Ortiz JF, Perlaza-Jiménez L, Vásquez AX, Lopez-Lavalle LAB, Mathew B, Léon J, Bernal AJ, Ballvora A, López CE. A genetic map of cassava (Manihot esculenta Crantz) with integrated physical mapping of immunity-related genes. BMC Genomics 2015;16:190. [PMID: 25887443 PMCID: PMC4417308 DOI: 10.1186/s12864-015-1397-4] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 10/03/2014] [Accepted: 02/24/2015] [Indexed: 03/19/2023] Open

Abstract

BACKGROUND

Cassava, Manihot esculenta Crantz, is one of the most important crops worldwide, representing staple food security for more than one billion people. The development of dense genetic and physical maps, as the basis for implementing genetic and molecular approaches to accelerate the rate of genetic gains in breeding programs, represents a significant challenge. A reference genome sequence for cassava has recently been made available and community efforts are underway to improve its quality. Cassava is threatened by several pathogens, but the mechanisms of defense are far from being understood. Moreover, there has been a lack of information about the number of genes related to immunity as well as their distribution and genomic organization in the cassava genome.

RESULTS

A high-density genetic map of cassava containing 2,141 SNPs has been constructed. Eighteen linkage groups were resolved with an overall size of 2,571 cM and an average distance of 1.26 cM between markers. More than half of the mapped SNPs (57.4%) are located in coding sequences. Physical mapping of scaffolds of the cassava whole-genome sequence draft using the mapped markers as anchors resulted in the orientation of 687 scaffolds covering 45.6% of the genome. One hundred eighty-nine new scaffolds are anchored to the cassava genetic map, extending the present cassava physical map by 30.7 Mb. Comparative analysis using anchor markers showed strong co-linearity with previously reported cassava genetic and physical maps. In silico searching for conserved domains allowed the annotation of a repertoire of 1,061 cassava genes coding for immunity-related proteins (IRPs). Based on the physical map of the corresponding sequencing scaffolds, unambiguous genetic localization was possible for 569 IRPs.

CONCLUSIONS

This is the first study reported so far of an integrated high-density genetic map using SNPs with integrated genetic and physical localization of newly annotated immunity-related genes in cassava. These data build a solid basis for future studies to map and associate markers with single loci or quantitative trait loci for agronomically important traits. The enrichment of the physical map with novel scaffolds is in line with the efforts of the cassava genome sequencing consortium.

Li W, Freudenberg J, Miramontes P. Diminishing return for increased Mappability with longer sequencing reads: implications of the k-mer distributions in the human genome. BMC Bioinformatics 2014;15:2. [PMID: 24386976 PMCID: PMC3927684 DOI: 10.1186/1471-2105-15-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 08/25/2013] [Accepted: 12/17/2013] [Indexed: 11/10/2022] Open

Abstract

Background

The amount of non-unique sequence (non-singletons) in a genome directly affects the difficulty of read alignment to a reference assembly for high throughput-sequencing data. Although a longer read is more likely to be uniquely mapped to the reference genome, a quantitative analysis of the influence of read lengths on mappability has been lacking. To address this question, we evaluate the k-mer distribution of the human reference genome. The k-mer frequency is determined for k ranging from 20 bp to 1000 bp.

Results

We observe that the proportion of non-singleton k-mers decreases slowly with increasing k, and can be fitted by piecewise power-law functions with different exponents at different ranges of k. A slower decay at greater values of k indicates more limited gains in mappability for read lengths between 200 bp and 1000 bp. The frequency distributions of k-mers exhibit long tails with a power-law-like trend, and rank frequency plots exhibit a concave Zipf’s curve. The most frequent 1000-mers comprise 172 regions, which include four large stretches on chromosomes 1 and X, containing genes of biomedical relevance. Comparison with other databases indicates that the 172 regions can be broadly classified into two types: those containing LINE transposable elements and those containing segmental duplications.

Conclusion

Read mappability, as measured by the proportion of singletons, increases steadily up to a length scale of around 200 bp. When read length increases above 200 bp, smaller gains in mappability are expected. Moreover, the proportion of non-singletons decreases with read length much more slowly than linearly. Even a read length of 1000 bp would not allow the unique alignment of reads for many coding regions of human genes. A mix of techniques will be needed for efficiently producing high-quality data that cover the complete human genome.
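
The singleton/non-singleton measure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `non_singleton_fraction`, the toy sequence, and the choice to count positions on the forward strand only are assumptions made here for illustration.

```python
from collections import Counter


def non_singleton_fraction(sequence: str, k: int) -> float:
    """Fraction of k-mer start positions whose k-mer occurs more than once.

    Assumption: forward strand only, position-based counting.
    """
    n_positions = len(sequence) - k + 1
    if n_positions <= 0:
        raise ValueError("k must not exceed the sequence length")
    # Count every k-mer occurrence along the sequence.
    counts = Counter(sequence[i:i + k] for i in range(n_positions))
    # A position is "non-singleton" if its k-mer appears at least twice.
    repeated_positions = sum(c for c in counts.values() if c > 1)
    return repeated_positions / n_positions


# Hypothetical toy sequence: a 12-bp unit repeated three times between unique flanks.
toy = "TTGACCA" + "ACGTACGGTCAA" * 3 + "GGATCCT"

for k in (4, 8, 16, 32):
    print(k, round(non_singleton_fraction(toy, k), 3))
```

On a real genome the same count would need a memory-efficient k-mer counter; the sketch only shows how longer k-mers turn repeated positions into unique ones.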

Lonardi S, Duma D, Alpert M, Cordero F, Beccuti M, Bhat PR, Wu Y, Ciardo G, Alsaihati B, Ma Y, Wanamaker S, Resnik J, Bozdag S, Luo MC, Close TJ. Combinatorial pooling enables selective sequencing of the barley gene space. PLoS Comput Biol 2013;9:e1003010. [PMID: 23592960 PMCID: PMC3617026 DOI: 10.1371/journal.pcbi.1003010] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 10/29/2012] [Accepted: 02/05/2013] [Indexed: 11/23/2022] Open

Abstract

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

The problem of obtaining the full genomic sequence of an organism has been solved either via a global brute-force approach (called whole-genome shotgun) or by a divide-and-conquer strategy (called clone-by-clone). Both approaches have advantages and disadvantages in terms of cost, manual labor, and the ability to deal with sequencing errors and highly repetitive regions of the genome. With the advent of second-generation sequencing instruments, the whole-genome shotgun approach has been the preferred choice. The clone-by-clone strategy is, however, still very relevant for large complex genomes. In fact, several research groups and international consortia have produced clone libraries and physical maps for many economically or ecologically important organisms and now are in a position to proceed with sequencing. In this manuscript, we demonstrate the feasibility of this approach on the gene-space of a large, very repetitive plant genome. The novelty of our approach is that, in order to take advantage of the throughput of the current generation of sequencing instruments, we pool hundreds of clones using a special type of “smart” pooling design that allows one to establish with high accuracy the source clone from the sequenced reads in a pool. Extensive simulations and experimental results support our claims.
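
The idea of pooling clones so that the set of pools in which a read appears identifies its source clone can be shown with a toy Python sketch. This is not the pooling design used in the paper: the signature scheme (each clone gets a distinct subset of pools), the function names, the pool counts, and the clone labels are all hypothetical choices made for illustration.

```python
from itertools import combinations


def assign_signatures(clones, n_pools, pools_per_clone):
    """Give each clone a distinct set of pools (its signature). Toy design only."""
    signatures = combinations(range(n_pools), pools_per_clone)
    design = {}
    for clone in clones:
        try:
            design[clone] = frozenset(next(signatures))
        except StopIteration:
            raise ValueError("not enough distinct signatures for all clones") from None
    return design


def deconvolve(observed_pools, design):
    """Return the clone whose signature equals the observed pool set, else None."""
    observed = frozenset(observed_pools)
    for clone, signature in design.items():
        if signature == observed:
            return clone
    return None


# Hypothetical example: 20 clones, 7 pools, each clone placed in 3 pools.
clones = ["BAC_%03d" % i for i in range(20)]
design = assign_signatures(clones, n_pools=7, pools_per_clone=3)

# A read observed in exactly the pools of BAC_005 is assigned back to BAC_005.
print(deconvolve(design["BAC_005"], design))
```

Here 20 clones are sequenced in 7 pools rather than 20 individually barcoded libraries. Exact signature matching breaks down when overlapping clones share sequence or when sequencing errors occur, which a practical design has to tolerate and this sketch does not.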

Bozdag S, Close TJ, Lonardi S. A graph-theoretical approach to the selection of the minimum tiling path from a physical map. IEEE/ACM Trans Comput Biol Bioinform 2013;10:352-360. [PMID: 23929859 DOI: 10.1109/tcbb.2013.26] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/02/2023]

Hastie AR, Dong L, Smith A, Finklestein J, Lam ET, Huo N, Cao H, Kwok PY, Deal KR, Dvorak J, Luo MC, Gu Y, Xiao M. Rapid genome mapping in nanochannel arrays for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome. PLoS One 2013;8:e55864. [PMID: 23405223 PMCID: PMC3566107 DOI: 10.1371/journal.pone.0055864] [Citation(s) in RCA: 123] [Impact Index Per Article: 11.2] [Received: 08/09/2012] [Accepted: 01/03/2013] [Indexed: 02/04/2023] Open

Coverage theories for metagenomic DNA sequencing based on a generalization of Stevens' theorem. J Math Biol 2012;67:1141-61. [PMID: 22965653 PMCID: PMC3795925 DOI: 10.1007/s00285-012-0586-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 09/06/2011] [Revised: 08/28/2012] [Indexed: 11/21/2022]

Abstract

Metagenomic project design has relied variously upon speculation, semi-empirical and ad hoc heuristic models, and elementary extensions of single-sample Lander–Waterman expectation theory, all of which are demonstrably inadequate. Here, we propose an approach based upon a generalization of Stevens’ Theorem for randomly covering a domain. We extend this result to account for the presence of multiple species, from which are derived useful probabilities for fully recovering a particular target microbe of interest and for average contig length. These show improved specificities compared to older measures and recommend deeper data generation than the levels chosen by some early studies, supporting the view that poor assemblies were due at least somewhat to insufficient data. We assess predictions empirically by generating roughly 4.5 Gb of sequence from a twelve member bacterial community, comparing coverage for two particular members, Selenomonas artemidis and Enterococcus faecium, which are the least (~3%) and most (~12%) abundant species, respectively. Agreement is reasonable, with differences likely attributable to coverage biases. We show that, in some cases, bias is simple in the sense that a small reduction in read length to simulate less efficient covering brings data and theory into essentially complete accord. Finally, we describe two applications of the theory. One plots coverage probability over the relevant parameter space, constructing essentially a “metagenomic design map” to enable straightforward analysis and design of future projects. The other gives an overview of the data requirements for various types of sequencing milestones, including a desired number of contact reads and contig length, for detection of a rare viral species.
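
For orientation, the elementary Lander–Waterman-style estimate that this abstract treats as an inadequate baseline can be written in a few lines of Python. This is not the authors' Stevens'-theorem result. The function name is hypothetical, abundance is assumed to mean the fraction of sequenced bases drawn from the target species, and the genome size in the example is a made-up placeholder, since the abstract gives none.

```python
import math


def lander_waterman_baseline(total_bases, abundance, genome_size):
    """Naive per-species estimate: (mean depth, expected fraction of genome covered).

    Assumes reads land uniformly at random and that `abundance` is the
    fraction of all sequenced bases that come from the target species.
    """
    depth = total_bases * abundance / genome_size
    # Under uniform random sampling, a given base is missed with probability exp(-depth).
    covered_fraction = 1.0 - math.exp(-depth)
    return depth, covered_fraction


# 4.5 Gb of sequence and ~3% / ~12% abundance are taken from the abstract;
# the 2.0 Mb genome size is a hypothetical placeholder.
for abundance in (0.03, 0.12):
    depth, covered = lander_waterman_baseline(4.5e9, abundance, 2.0e6)
    print(abundance, depth, covered)
```

The estimate gives only an expected covered fraction. It says nothing about the probability of recovering a target genome completely, which is the quantity the abstract's generalized theory addresses.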

Lim LS, Tay YL, Alias H, Wan KL, Dear PH. Insights into the genome structure and copy-number variation of Eimeria tenella. BMC Genomics 2012;13:389. [PMID: 22889016 PMCID: PMC3505466 DOI: 10.1186/1471-2164-13-389] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 02/17/2012] [Accepted: 08/01/2012] [Indexed: 12/25/2022] Open

Abstract

BACKGROUND

Eimeria is a genus of parasites in the same phylum (Apicomplexa) as human parasites such as Toxoplasma, Cryptosporidium and the malaria parasite Plasmodium. As an apicomplexan whose life-cycle involves a single host, Eimeria is a convenient model for understanding this group of organisms. Although the genomes of the Apicomplexa are diverse, that of Eimeria is unique in being composed of large alternating blocks of sequence with very different characteristics - an arrangement seen in no other organism. This arrangement has impeded efforts to fully sequence the genome of Eimeria, which remains the last of the major apicomplexans to be fully analyzed. In order to increase the value of the genome sequence data and aid in the effort to gain a better understanding of the Eimeria tenella genome, we constructed a whole genome map for the parasite.

RESULTS

A total of 1245 contigs representing 70.0% of the whole genome assembly sequences (Wellcome Trust Sanger Institute) were selected and subjected to marker selection. Subsequently, 2482 HAPPY markers were developed and typed. Of these, 795 were considered as usable markers, and utilized in the construction of a HAPPY map. Markers developed from chromosomally-assigned genes were then integrated into the HAPPY map and this aided the assignment of a number of linkage groups to their respective chromosomes. BAC-end sequences and contigs from whole genome sequencing were also integrated to improve and validate the HAPPY map. This resulted in an integrated HAPPY map consisting of 60 linkage groups that covers approximately half of the estimated 60 Mb genome. Further analysis suggests that the segmental organization first seen in Chromosome 1 is present throughout the genome, with repeat-poor (P) regions alternating with repeat-rich (R) regions. Evidence of copy-number variation between strains was also uncovered.

CONCLUSIONS

This paper describes the application of a whole genome mapping method to improve the assembly of the genome of E. tenella from shotgun data, and to help reveal its overall structure. A preliminary assessment of copy-number variation (extra or missing copies of genomic segments) between strains of E. tenella was also carried out. The emerging picture is of a very unusual genome architecture displaying inter-strain copy-number variation. We suggest that these features may be related to the known ability of this parasite to rapidly develop drug resistance.

Chen N, Bellott DW, Page DC, Clark AG. Identification of avian W-linked contigs by short-read sequencing. BMC Genomics 2012;13:183. [PMID: 22583744 PMCID: PMC3428670 DOI: 10.1186/1471-2164-13-183] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 12/12/2011] [Accepted: 04/25/2012] [Indexed: 11/16/2022] Open

Philippe R, Choulet F, Paux E, van Oeveren J, Tang J, Wittenberg AHJ, Janssen A, van Eijk MJT, Stormo K, Alberti A, Wincker P, Akhunov E, van der Vossen E, Feuillet C. Whole Genome Profiling provides a robust framework for physical mapping and sequencing in the highly complex and repetitive wheat genome. BMC Genomics 2012;13:47. [PMID: 22289472 PMCID: PMC3311077 DOI: 10.1186/1471-2164-13-47] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 09/26/2011] [Accepted: 01/30/2012] [Indexed: 01/28/2023] Open

Abstract

Background

Sequencing projects using a clone-by-clone approach require the availability of a robust physical map. The SNaPshot technology, based on pair-wise comparisons of restriction fragment sizes, has been used recently to build the first physical map of a wheat chromosome and to complete the maize physical map. However, restriction fragment sizes shared randomly between two non-overlapping BACs often lead to chimeric contigs and mis-assembled BACs in such large and repetitive genomes. Whole Genome Profiling (WGP™) was developed recently as a new sequence-based physical mapping technology and has the potential to limit this problem.

Results

A subset of the wheat 3B chromosome BAC library covering 230 Mb was used to establish a WGP physical map and to compare it to a map obtained with the SNaPshot technology. We first adapted the WGP-based assembly methodology to cope with the complexity of the wheat genome. The results then showed that the WGP map covers the same length as the SNaPshot map but with 30% fewer contigs and, more importantly, with 3.5 times fewer mis-assembled BACs. Finally, we evaluated the benefit of integrating WGP tags in different sequence assemblies obtained after Roche/454 sequencing of BAC pools. We showed that while WGP tag integration improves assemblies performed with unpaired reads and with paired-end reads at low coverage, it does not significantly improve sequence assemblies performed at high coverage (25x) with paired-end reads.

Conclusions

Our results demonstrate that, with a suitable assembly methodology, WGP builds more robust physical maps than the SNaPshot technology in wheat and that WGP can be adapted to any genome. Moreover, WGP tag integration in sequence assemblies improves low-quality assemblies. However, to achieve a high-quality draft sequence assembly, a sequencing depth of 25x paired-end reads is required, at which point WGP tag integration does not provide additional scaffolding value. Finally, we suggest that WGP tags can support the efficient sequencing of BAC pools by enabling reliable assignment of sequence scaffolds to their BAC of origin, a feature that is of great interest when using BAC pooling strategies to reduce the cost of sequencing large genomes.
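The pair-wise comparison of restriction fragment sizes that the abstract above attributes to fingerprint-based mapping can be illustrated with a minimal sketch. It is not the SNaPshot or WGP software: the function name `shared_fragments`, the `tolerance` parameter, and the fragment sizes are hypothetical, chosen only to show how two clones are scored by the number of fragment sizes they share, and why sizes that coincide by chance can suggest a false overlap.

```python
from typing import Sequence


def shared_fragments(a: Sequence[int], b: Sequence[int], tolerance: int = 1) -> int:
    """Count fragments of `a` that pair with a distinct fragment of `b`
    whose size differs by at most `tolerance` (greedy scan over sorted sizes)."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    i = j = matches = 0
    while i < len(a_sorted) and j < len(b_sorted):
        diff = a_sorted[i] - b_sorted[j]
        if abs(diff) <= tolerance:
            # Sizes agree within tolerance: count the pair and consume both.
            matches += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # fragment from `a` is too small to match anything left in `b`
        else:
            j += 1  # fragment from `b` is too small to match anything left in `a`
    return matches


# Hypothetical fingerprints (fragment sizes in bp) for two clones.
clone_x = [100, 250, 400, 730]
clone_y = [101, 251, 600, 729]
print(shared_fragments(clone_x, clone_y, tolerance=1))
```

A score of this kind only compares sizes, so two clones from unrelated regions of a repetitive genome can still share several fragments; sequence-based tags carry more information per fragment than a size alone.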

Wolfsberg TG. Using the NCBI Map Viewer to browse genomic sequence data. Curr Protoc Hum Genet 2011;Chapter 18:Unit 18.5. [PMID: 21480181 DOI: 10.1002/0471142905.hg1805s69] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]

Song G, Zhang L, Vinar T, Miller W. CAGE: Combinatorial Analysis of Gene-cluster Evolution. J Comput Biol 2011;17:1227-42. [PMID: 20874406 DOI: 10.1089/cmb.2010.0094] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Vinar T, Brejová B, Song G, Siepel A. Reconstructing histories of complex gene clusters on a phylogeny. J Comput Biol 2011;17:1267-79. [PMID: 20874408 DOI: 10.1089/cmb.2010.0090] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Feuillet C, Leach JE, Rogers J, Schnable PS, Eversole K. Crop genome sequencing: lessons and rationales. TRENDS IN PLANT SCIENCE 2011;16:77-88. [PMID: 21081278 DOI: 10.1016/j.tplants.2010.10.005] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/29/2010] [Revised: 10/09/2010] [Accepted: 10/16/2010] [Indexed: 05/06/2023]

Knudsen B, Forsberg R, Miyamoto MM. A computer simulator for assessing different challenges and strategies of de novo sequence assembly. Genes (Basel) 2010;1:263-82. [PMID: 24710045 PMCID: PMC3954094 DOI: 10.3390/genes1020263] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2010] [Revised: 08/18/2010] [Accepted: 08/31/2010] [Indexed: 11/16/2022] Open

Wolfsberg TG. Using the NCBI map viewer to browse genomic sequence data. Curr Protoc Bioinformatics 2010;Chapter 1:1.5.1-1.5.25. [PMID: 20205186 DOI: 10.1002/0471250953.bi0105s29] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]

Mir KU. Sequencing genomes: from individuals to populations. BRIEFINGS IN FUNCTIONAL GENOMICS AND PROTEOMICS 2010;8:367-78. [PMID: 19808932 DOI: 10.1093/bfgp/elp040] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Blakesley RW, Hansen NF, Gupta J, McDowell JC, Maskeri B, Barnabas BB, Brooks SY, Coleman H, Haghighi P, Ho SL, Schandler K, Stantripop S, Vogt JL, Thomas PJ, Bouffard GG, Green ED. Effort required to finish shotgun-generated genome sequences differs significantly among vertebrates. BMC Genomics 2010;11:21. [PMID: 20064230 PMCID: PMC2827409 DOI: 10.1186/1471-2164-11-21] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2009] [Accepted: 01/11/2010] [Indexed: 01/09/2023] Open

Abstract

Background

The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage.

Results

To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. Importantly, we found that the sequence assemblies generated for the same orthologous regions from various vertebrates show substantial variation with respect to misassemblies and, in particular, the frequency and characteristics of sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Application of the same standardized methods for finishing provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. It is important to note that many of the problems we have encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have been able to improve our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges still remain.

Conclusion

Our findings have important implications for the eventual finishing of the draft whole-genome sequences that have now been generated for a large number of vertebrates.

Zhang Y, Song G, Vinar T, Green ED, Siepel A, Miller W. Evolutionary history reconstruction for Mammalian complex gene clusters. J Comput Biol 2009;16:1051-70. [PMID: 19645598 DOI: 10.1089/cmb.2009.0040] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Bozdag S, Close TJ, Lonardi S. A compartmentalized approach to the assembly of physical maps. BMC Bioinformatics 2009;10:217. [PMID: 19604400 PMCID: PMC2717093 DOI: 10.1186/1471-2105-10-217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2008] [Accepted: 07/15/2009] [Indexed: 12/30/2022] Open

Vinař T, Brejová B, Song G, Siepel A. Reconstructing Histories of Complex Gene Clusters on a Phylogeny. COMPARATIVE GENOMICS 2009. [DOI: 10.1007/978-3-642-04744-2_13] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/29/2023]

Wolfsberg TG. Using the NCBI Map Viewer to browse genomic sequence data. Curr Protoc Bioinformatics 2008;Chapter 1:Unit 1.5. [PMID: 18428781 DOI: 10.1002/0471250953.bi0105s16] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Bertin PN, Médigue C, Normand P. Advances in environmental genomics: towards an integrated view of micro-organisms and ecosystems. MICROBIOLOGY-SGM 2008;154:347-359. [PMID: 18227239 DOI: 10.1099/mic.0.2007/011791-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]

Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes. Nat Rev Genet 2008;9:303-13. [PMID: 18347593 DOI: 10.1038/nrg2185] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Idol JR, Addington AM, Long RT, Rapoport JL, Green ED. Sequencing and Analyzing the t(1;7) Reciprocal Translocation Breakpoints Associated with a Case of Childhood-onset Schizophrenia/Autistic Disorder. J Autism Dev Disord 2007;38:668-77. [PMID: 17879154 DOI: 10.1007/s10803-007-0435-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2007] [Accepted: 07/24/2007] [Indexed: 11/30/2022]

Zhou S, Bechner MC, Place M, Churas CP, Pape L, Leong SA, Runnheim R, Forrest DK, Goldstein S, Livny M, Schwartz DC. Validation of rice genome sequence by optical mapping. BMC Genomics 2007;8:278. [PMID: 17697381 PMCID: PMC2048515 DOI: 10.1186/1471-2164-8-278] [Citation(s) in RCA: 103] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2007] [Accepted: 08/15/2007] [Indexed: 11/30/2022] Open

Abstract

Background

Rice feeds much of the world, and possesses the simplest genome analyzed to date within the grass family, making it an economically relevant model system for other cereal crops. Although the rice genome is sequenced, validation and gap closing efforts require purely independent means for accurate finishing of sequence build data.

Results

To facilitate ongoing sequence finishing and validation efforts, we have constructed a whole-genome SwaI optical restriction map of the rice genome. The physical map consists of 14 contigs, covering 12 chromosomes, with a total genome size of 382.17 Mb; this value is about 11% smaller than original estimates. Nine of the 14 optical map contigs are without gaps, covering chromosomes 1, 2, 3, 4, 5, 7, 8, 10, and 12 in their entirety, including centromeres and telomeres. Alignments between optical and in silico restriction maps constructed from IRGSP (International Rice Genome Sequencing Project) and TIGR (The Institute for Genomic Research) genome sequence sources are comprehensive and informative, evidenced by map coverage across virtually all published gaps, discovery of new ones, and characterization of sequence misassemblies; all totalling ~14 Mb. Furthermore, since optical maps are ordered restriction maps, identified discordances are pinpointed on a reliable physical scaffold providing an independent resource for closure of gaps and rectification of misassemblies.

Conclusion

Analysis of sequence and optical mapping data effectively validates genome sequence assemblies constructed from large, repeat-rich genomes. Given this conclusion, we envision new applications of such single-molecule analysis that will merge advantages offered by high-resolution optical maps with inexpensive but short sequence reads generated by emerging sequencing platforms. Lastly, the map construction techniques presented here point the way to new types of comparative genome analysis that would focus on discernment of structural differences revealed by optical maps constructed from a broad range of rice subspecies and varieties.
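The in silico restriction maps mentioned in the abstract above are ordered lists of fragment lengths predicted from a sequence. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `in_silico_digest` and the toy sequence are hypothetical. It assumes a linear molecule and uses the SwaI recognition site ATTTAAAT, which is cut in the middle to leave blunt ends; because the site is palindromic, scanning one strand is sufficient.

```python
from typing import List


def in_silico_digest(sequence: str, site: str = "ATTTAAAT", cut_offset: int = 4) -> List[int]:
    """Return ordered fragment lengths for a linear sequence cut at every `site`.

    `cut_offset` is the cut position inside the site (4 for SwaI: ATTT^AAAT).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Advance by one so that overlapping occurrences of the site are also found.
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]


# Toy sequence with two SwaI sites (hypothetical, for illustration only).
toy = "GG" + "ATTTAAAT" + "CCCCC" + "ATTTAAAT" + "G"
print(in_silico_digest(toy))
```

An ordered list of this kind can then be compared with the fragment order measured by optical mapping; places where the two disagree are candidates for gaps or misassemblies in the sequence.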

Affiliation(s)

Shiguo Zhou: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Michael C Bechner: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Michael Place: Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Chris P Churas: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Louise Pape: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Sally A Leong: USDA-ARS, CCRU, Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Rod Runnheim: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Dan K Forrest: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Steve Goldstein: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
Miron Livny: Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
David C Schwartz: Laboratory for Molecular and Computational Genomics, University of Wisconsin-Madison, UW Biotechnology Centre, 425 Henry Mall, Madison, Wisconsin 53706, USA; Department of Chemistry, Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA; Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA

Kalavacharla V, Hossain K, Gu Y, Riera-Lizarazu O, Vales MI, Bhamidimarri S, Gonzalez-Hernandez JL, Maan SS, Kianian SF. High-resolution radiation hybrid map of wheat chromosome 1D. Genetics 2006;173:1089-99. [PMID: 16624903 PMCID: PMC1526521 DOI: 10.1534/genetics.106.056481] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2006] [Accepted: 04/05/2006] [Indexed: 11/18/2022] Open

Luo M, Kim H, Kudrna D, Sisneros NB, Lee SJ, Mueller C, Collura K, Zuccolo A, Buckingham EB, Grim SM, Yanagiya K, Inoko H, Shiina T, Flajnik MF, Wing RA, Ohta Y. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library and a preliminary genome survey. BMC Genomics 2006;7:106. [PMID: 16672057 PMCID: PMC1513397 DOI: 10.1186/1471-2164-7-106] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2005] [Accepted: 05/03/2006] [Indexed: 01/12/2023] Open

Affiliation(s)

Meizhong Luo: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA; College of Life Sciences and Technology, Huazhong Agricultural University, Wuhan 430070, China
HyeRan Kim: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Dave Kudrna: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Nicholas B Sisneros: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
So-Jeong Lee: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Christopher Mueller: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Kristi Collura: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Andrea Zuccolo: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
E Bryan Buckingham: University of Maryland, Department of Microbiology and Immunology, 655 West Baltimore Street, BRB3-052, Baltimore, MD 21201, USA
Suzanne M Grim: University of Maryland, Department of Microbiology and Immunology, 655 West Baltimore Street, BRB3-052, Baltimore, MD 21201, USA
Kazuyo Yanagiya: Department of Molecular Life Science, Division of Basic Medical Science and Molecular Medicine, Tokai University School of Medicine, 143 Shimokasuya, Isehara, Kanagawa 259-1143, Japan
Hidetoshi Inoko: Department of Molecular Life Science, Division of Basic Medical Science and Molecular Medicine, Tokai University School of Medicine, 143 Shimokasuya, Isehara, Kanagawa 259-1143, Japan
Takashi Shiina: Department of Molecular Life Science, Division of Basic Medical Science and Molecular Medicine, Tokai University School of Medicine, 143 Shimokasuya, Isehara, Kanagawa 259-1143, Japan
Martin F Flajnik: University of Maryland, Department of Microbiology and Immunology, 655 West Baltimore Street, BRB3-052, Baltimore, MD 21201, USA
Rod A Wing: Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
Yuko Ohta: University of Maryland, Department of Microbiology and Immunology, 655 West Baltimore Street, BRB3-052, Baltimore, MD 21201, USA

Wendl MC. Occupancy modeling of coverage distribution for whole genome shotgun DNA sequencing. Bull Math Biol 2006;68:179-96. [PMID: 16794926 DOI: 10.1007/s11538-005-9021-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2004] [Accepted: 03/15/2005] [Indexed: 10/24/2022]

Paterson AH. Leafing through the genomes of our major crop plants: strategies for capturing unique information. Nat Rev Genet 2006;7:174-84. [PMID: 16485017 DOI: 10.1038/nrg1806] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]

Vij S, Gupta V, Kumar D, Vydianathan R, Raghuvanshi S, Khurana P, Khurana JP, Tyagi AK. Decoding the rice genome. Bioessays 2006;28:421-32. [PMID: 16547947 DOI: 10.1002/bies.20399] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Csűrös M, Miklós I. A Probabilistic Model for Gene Content Evolution with Duplication, Loss, and Horizontal Transfer. LECTURE NOTES IN COMPUTER SCIENCE 2006. [DOI: 10.1007/11732990_18] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Han Y, Ni P, Lü H, Ye J, Hu J, Chen C, Huang X, Cong L, Li G, Wang J, Gu X, Yu J, Li S. Applications of the double-barreled data in whole-genome shotgun sequence assembly and analysis. SCIENCE IN CHINA. SERIES C, LIFE SCIENCES 2005;48:300-6. [PMID: 16092764 DOI: 10.1007/bf03183625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]

Margulies EH, Vinson JP, Miller W, Jaffe DB, Lindblad-Toh K, Chang JL, Green ED, Lander ES, Mullikin JC, Clamp M. An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing. Proc Natl Acad Sci U S A 2005;102:4795-800. [PMID: 15778292 PMCID: PMC555705 DOI: 10.1073/pnas.0409882102] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023] Open

Blakesley RW, Hansen NF, Mullikin JC, Thomas PJ, McDowell JC, Maskeri B, Young AC, Benjamin B, Brooks SY, Coleman BI, Gupta J, Ho SL, Karlins EM, Maduro QL, Stantripop S, Tsurgeon C, Vogt JL, Walker MA, Masiello CA, Guan X, Bouffard GG, Green ED. An intermediate grade of finished genomic sequence suitable for comparative analyses. Genome Res 2004;14:2235-44. [PMID: 15479945 PMCID: PMC525681 DOI: 10.1101/gr.2648404] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2004] [Accepted: 08/16/2004] [Indexed: 11/25/2022]

Margulies EH, Green ED. Detecting highly conserved regions of the human genome by multispecies sequence comparisons. COLD SPRING HARBOR SYMPOSIA ON QUANTITATIVE BIOLOGY 2004;68:255-63. [PMID: 15338625 DOI: 10.1101/sqb.2003.68.255] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Yin TM, DiFazio SP, Gunter LE, Jawdy SS, Boerjan W, Tuskan GA. Genetic and physical mapping of Melampsora rust resistance genes in Populus and characterization of linkage disequilibrium and flanking genomic sequence. THE NEW PHYTOLOGIST 2004;164:95-105. [PMID: 33873470 DOI: 10.1111/j.1469-8137.2004.01161.x] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]

Angata T, Margulies EH, Green ED, Varki A. Large-scale sequencing of the CD33-related Siglec gene cluster in five mammalian species reveals rapid evolution by multiple mechanisms. Proc Natl Acad Sci U S A 2004;101:13251-6. [PMID: 15331780 PMCID: PMC516556 DOI: 10.1073/pnas.0404833101] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Abstract

Siglecs are a recently discovered family of animal lectins that belong to the Ig superfamily and recognize sialic acids (Sias). CD33-related Siglecs (CD33rSiglecs) are a subgroup with as-yet-unknown functions, characterized by sequence homology, expression on innate immune cells, conserved cytosolic tyrosine-based signaling motifs, and a clustered localization of their genes. To better understand the biology and evolution of CD33rSiglecs, we sequenced and compared the CD33rSiglec gene cluster from multiple mammalian species. Within the sequenced region, the segments containing CD33rSiglec genes showed a lower degree of sequence conservation. In contrast to the adjacent conserved kallikrein-like genes, the CD33rSiglec genes showed extensive species differences, including expansions of gene subsets; gene deletions, including one human-specific loss of a novel functional primate Siglec (Siglec-13); exon shuffling, generating hybrid genes; accelerated accumulation of nonsynonymous substitutions in the Sia-recognition domain; and multiple instances of mutations of an arginine residue essential for Sia recognition in otherwise intact Siglecs. Nonsynonymous differences between human and chimpanzee orthologs showed uneven distribution between the two beta sheets of the Sia-recognition domain, suggesting biased mutation accumulation. These data indicate that CD33rSiglec genes are undergoing rapid evolution via multiple genetic mechanisms, possibly due to an evolutionary "arms race" between hosts and pathogens involving Sia recognition. These studies, which reflect one of the most complete comparative sequence analyses of a rapidly evolving gene cluster, provide a clearer picture of the ortholog status of CD33rSiglecs among primates and rodents and also facilitate rational recommendations regarding their nomenclature.

Basu C, Halfhill MD, Mueller TC, Stewart CN. Weed genomics: new tools to understand weed biology. TRENDS IN PLANT SCIENCE 2004;9:391-8. [PMID: 15358270 DOI: 10.1016/j.tplants.2004.06.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2023]

Meyers BC, Scalabrin S, Morgante M. Mapping and sequencing complex genomes: let's get physical! Nat Rev Genet 2004;5:578-88. [PMID: 15266340 DOI: 10.1038/nrg1404] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]