1
Preising GA, Faber-Hammond JJ, Renn SCP. Correspondence of aCGH and long-read genome assembly for detection of copy number differences: A proof-of-concept with cichlid genomes. PLoS One 2021; 16:e0258193. [PMID: 34618847 PMCID: PMC8496808 DOI: 10.1371/journal.pone.0258193] [Received: 12/17/2020] [Accepted: 09/21/2021]
Abstract
Copy number variation is an important source of genetic variation, yet data are often lacking due to technical limitations for detection given the current genome assemblies. Our goal is to demonstrate the extent to which an array-based platform (aCGH) can identify genomic loci that are collapsed in genome assemblies that were built with short-read technology. Taking advantage of two cichlid species for which genome assemblies based on Illumina and PacBio are available, we show that inter-species aCGH log2 hybridization ratios correlate more strongly with inferred copy number differences based on PacBio-built genome assemblies than based on Illumina-built genome assemblies. With regard to inter-species copy number differences of specific genes identified by each platform, the set identified by aCGH intersects to a greater extent with the set identified by PacBio than with the set identified by Illumina. Gene function, according to Gene Ontology analysis, did not substantially differ among platforms, and platforms converged on functions associated with adaptive phenotypes. The results of the current study further demonstrate that aCGH is an effective platform for identifying copy number variable sequences, particularly those collapsed in short read genome assemblies.
Affiliation(s)
- Suzy C. P. Renn
- Department of Biology, Reed College, Portland, OR, United States of America
2
Suryawanshi V, Talke IN, Weber M, Eils R, Brors B, Clemens S, Krämer U. Between-species differences in gene copy number are enriched among functions critical for adaptive evolution in Arabidopsis halleri. BMC Genomics 2016; 17:1034. [PMID: 28155655 PMCID: PMC5259951 DOI: 10.1186/s12864-016-3319-5]
Abstract
Background Gene copy number divergence between species is a form of genetic polymorphism that contributes significantly to both genome size and phenotypic variation. In plants, copy number expansions of single genes were implicated in cultivar- or species-specific tolerance of high levels of soil boron, aluminium or calamine-type heavy metals, respectively. Arabidopsis halleri is a zinc- and cadmium-hyperaccumulating extremophile species capable of growing on heavy-metal contaminated, toxic soils. In contrast, its non-accumulating sister species A. lyrata and the closely related reference model species A. thaliana exhibit merely basal metal tolerance. Results For a genome-wide assessment of the role of copy number divergence (CND) in lineage-specific environmental adaptation, we conducted cross-species array comparative genome hybridizations of three plant species and developed a global signal scaling procedure to adjust for sequence divergence. In A. halleri, transition metal homeostasis functions are enriched twofold among the genes detected as copy number expanded. Moreover, biotic stress functions including mostly disease Resistance (R) gene-related genes are enriched twofold among genes detected as copy number reduced, when compared to the abundance of these functions among all genes. Conclusions Our results provide genome-wide support for a link between evolutionary adaptation and CND in A. halleri as shown previously for Heavy metal ATPase4. Moreover our results support the hypothesis that elemental defences, which result from the hyperaccumulation of toxic metals, allow the reduction of classical defences against biotic stress as a trade-off.
Affiliation(s)
- Vasantika Suryawanshi
- Department of Plant Physiology, Ruhr University Bochum, Universitätsstrasse 150, Bochum, 44801, Germany; BioQuant, University of Heidelberg, Im Neuenheimer Feld 267, Heidelberg, 69120, Germany
- Ina N Talke
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam, 14476, Germany
- Michael Weber
- Department of Plant Physiology, University of Bayreuth, Universitätsstrasse 30, Bayreuth, 95447, Germany
- Roland Eils
- Division of Theoretical Bioinformatics, DKFZ, Im Neuenheimer Feld 280, Heidelberg, 69121, Germany; BioQuant, University of Heidelberg, Im Neuenheimer Feld 267, Heidelberg, 69120, Germany; Institute of Pharmacy and Molecular Biotechnology, University of Heidelberg, Im Neuenheimer Feld 364, Heidelberg, 69120, Germany
- Benedikt Brors
- Division of Theoretical Bioinformatics, DKFZ, Im Neuenheimer Feld 280, Heidelberg, 69121, Germany
- Stephan Clemens
- Department of Plant Physiology, University of Bayreuth, Universitätsstrasse 30, Bayreuth, 95447, Germany
- Ute Krämer
- Department of Plant Physiology, Ruhr University Bochum, Universitätsstrasse 150, Bochum, 44801, Germany; BioQuant, University of Heidelberg, Im Neuenheimer Feld 267, Heidelberg, 69120, Germany
3
Machado HE, Jui G, Joyce DA, Reilly CRL, Lunt DH, Renn SCP. Gene duplication in an African cichlid adaptive radiation. BMC Genomics 2014; 15:161. [PMID: 24571567 PMCID: PMC3944005 DOI: 10.1186/1471-2164-15-161] [Received: 09/27/2013] [Accepted: 02/19/2014]
Abstract
BACKGROUND Gene duplication is a source of evolutionary innovation and can contribute to the divergence of lineages; however, the relative importance of this process remains to be determined. The explosive divergence of the African cichlid adaptive radiations provides both a model for studying the general role of gene duplication in the divergence of lineages and also an exciting foray into the identification of genomic features that underlie the dramatic phenotypic and ecological diversification in this particular lineage. We present the first genome-wide study of gene duplication in African cichlid fishes, identifying gene duplicates in three species belonging to the Lake Malawi adaptive radiation (Metriaclima estherae, Protomelas similis, Rhamphochromis "chilingali") and one closely related species from a non-radiated riverine lineage (Astatotilapia tweddlei). RESULTS Using Astatotilapia burtoni as reference, microarray comparative genomic hybridization analysis of 5689 genes reveals 134 duplicated genes among the four cichlid species tested. Between 51 and 55 genes were identified as duplicated in each of the three species from the Lake Malawi radiation, representing a 38%-49% increase in number of duplicated genes relative to the non-radiated lineage (37 genes). Duplicated genes include several that are involved in immune response, ATP metabolism and detoxification. CONCLUSIONS These results contribute to our understanding of the abundance and type of gene duplicates present in cichlid fish lineages. The duplicated genes identified in this study provide candidates for the analysis of functional relevance with regard to phenotype and divergence. Comparative sequence analysis of gene duplicates can address the role of positive selection and adaptive evolution by gene duplication, while further study across the phylogenetic range of cichlid radiations (and more generally in other adaptive radiations) will determine whether the patterns of gene duplication seen in this study consistently accompany rapid radiation.
Affiliation(s)
- Suzy C P Renn
- Department of Biology, Reed College, Portland, OR 97202, USA.
4
Aliyu OM, Seifert M, Corral JM, Fuchs J, Sharbel TF. Copy number variation in transcriptionally active regions of sexual and apomictic Boechera demonstrates independently derived apomictic lineages. The Plant Cell 2013; 25:3808-23. [PMID: 24170129 PMCID: PMC3877827 DOI: 10.1105/tpc.113.113860] [Received: 05/21/2013] [Revised: 09/11/2013] [Accepted: 10/15/2013]
Abstract
In asexual (apomictic) plants, the absence of meiosis and sex is expected to lead to mutation accumulation. To compare mutation accumulation in the transcribed genomic regions of sexual and apomictic plants, we performed a double-validated analysis of copy number variation (CNV) on 10 biological replicates each of diploid sexual and diploid apomictic Boechera, using a high-density (>700 K) custom microarray. The Boechera genome demonstrated higher levels of depleted CNV, compared with enriched CNV, irrespective of reproductive mode. Genome-wide patterns of CNV revealed four divergent lineages, three of which contain both sexual and apomictic genotypes. Hence genome-wide CNV reflects at least three independent origins (i.e., expression) of apomixis from different sexual genetic backgrounds. CNV distributions for different families of transposable elements were lineage specific, and the enrichment of LINE/L1 and long terminal repeat/Copia elements in lineage 3 apomicts is consistent with sex and meiosis being mechanisms for purging genomic parasites. We hypothesize that significant overrepresentation of specific gene ontology classes (e.g., pollen-pistil interaction) in apomicts implies that gene enrichment could be an adaptive mechanism for genome stability in diploid apomicts by providing a polyploid-like system for buffering the effects of deleterious mutations.
Affiliation(s)
- Olawale M. Aliyu
- Apomixis Research Group, Leibniz Institute of Plant Genetics and Crop Plant Research, D-06466 Gatersleben, Germany
- Michael Seifert
- Data Inspection Research Group, Leibniz Institute of Plant Genetics and Crop Plant Research, D-06466 Gatersleben, Germany
- Cellular Networks and Systems Biology, Biotechnology Center of the Technical University Dresden, D-01307 Dresden, Germany
- Innovative Methods of Computing, Center for Information Services and High Performance Computing, Technical University Dresden, D-01187 Dresden, Germany
- José M. Corral
- Apomixis Research Group, Leibniz Institute of Plant Genetics and Crop Plant Research, D-06466 Gatersleben, Germany
- Joerg Fuchs
- Karyotype Evolution Research Group, Leibniz Institute of Plant Genetics and Crop Plant Research, D-06466 Gatersleben, Germany
- Timothy F. Sharbel
- Apomixis Research Group, Leibniz Institute of Plant Genetics and Crop Plant Research, D-06466 Gatersleben, Germany
5
Darby BJ, Jones KL, Wheeler D, Herman MA. Normalization and centering of array-based heterologous genome hybridization based on divergent control probes. BMC Bioinformatics 2011; 12:183. [PMID: 21600029 PMCID: PMC3125262 DOI: 10.1186/1471-2105-12-183] [Received: 09/16/2010] [Accepted: 05/21/2011]
Abstract
Background Hybridization of heterologous (non-specific) nucleic acids onto arrays designed for model-organisms has been proposed as a viable genomic resource for estimating sequence variation and gene expression in non-model organisms. However, conventional methods of normalization that assume equivalent distributions (such as quantile normalization) are inappropriate when applied to non-specific (heterologous) hybridization. We propose an algorithm for normalizing and centering intensity data from heterologous hybridization that makes no prior assumptions of distribution, reduces the false appearance of homology, and provides a way for researchers to confirm whether heterologous hybridization is suitable. Results Data are normalized by adjusting for Gibbs free energy binding, and centered by adjusting for the median of a common set of control probes assumed to be equivalently dissimilar for all species. This procedure was compared to existing approaches and found to be as successful as Loess normalization at detecting sequence variations (deletions) and even more successful than quantile normalization at reducing the accumulation of false positive probe matches between two related nematode species, Caenorhabditis elegans and C. briggsae. Despite the improvements, we still found that probe fluorescence intensity was too poorly correlated with sequence similarity to result in reliable detection of matching probe sequence. Conclusions Cross-species hybridizations can be a way to adapt genome-enabled tools for closely related non-model organisms, but data must be appropriately normalized and centered in a way that accommodates hybridization of nucleic acids with diverged sequence. For short, 25-mer probes, hybridization intensity alone may be insufficiently correlated with sequence similarity to allow reliable inference of homology at the probe level.
Affiliation(s)
- Brian J Darby
- Ecological Genomics Institute, Division of Biology, Kansas State University, Manhattan, KS 66506, USA
6
Fourteen-genome comparison identifies DNA markers for severe-disease-associated strains of Clostridium difficile. J Clin Microbiol 2011; 49:2230-8. [PMID: 21508155 DOI: 10.1128/jcm.00391-11]
Abstract
Clostridium difficile is a common cause of infectious diarrhea in hospitalized patients. A severe and increased incidence of C. difficile infection (CDI) is associated predominantly with the NAP1 strain; however, the existence of other severe-disease-associated (SDA) strains and the extensive genetic diversity across C. difficile complicate reliable detection and diagnosis. Comparative genome analysis of 14 sequenced genomes, including those of a subset of NAP1 isolates, allowed the assessment of genetic diversity within and between strain types to identify DNA markers that are associated with severe disease. Comparative genome analysis of 14 isolates, including five publicly available strains, revealed that C. difficile has a core genome of 3.4 Mb, comprising ∼ 3,000 genes. Analysis of the core genome identified candidate DNA markers that were subsequently evaluated using a multistrain panel of 177 isolates, representing more than 50 pulsovars and 8 toxinotypes. A subset of 117 isolates from the panel had associated patient data that allowed assessment of an association between the DNA markers and severe CDI. We identified 20 candidate DNA markers for species-wide detection and 10,683 single nucleotide polymorphisms (SNPs) associated with the predominant SDA strain (NAP1). A species-wide detection candidate marker, the sspA gene, was found to be the same across 177 sequenced isolates and lacked significant similarity to those of other species. Candidate SNPs in genes CD1269 and CD1265 were found to associate more closely with disease severity than currently used diagnostic markers, as they were also present in the toxin A-negative and B-positive (A-B+) strain types. The genetic markers identified illustrate the potential of comparative genomics for the discovery of diagnostic DNA-based targets that are species specific or associated with multiple SDA strains.
7
Ogura A, Yoshida MA, Fukuzaki M, Sese J. In vitro homology search array comprehensively reveals highly conserved genes and their functional characteristics in non-sequenced species. BMC Genomics 2010; 11 Suppl 4:S9. [PMID: 21143818 PMCID: PMC3005928 DOI: 10.1186/1471-2164-11-s4-s9]
Abstract
BACKGROUND With the increase in genomic and transcriptomic data produced by the recent advancements in next generation sequencers and microarrays, it is now easier than ever to conduct large-scale comparative genomic studies for familiar species. However, there are more than ten million species on earth, and the study of all remaining species is not realistic in terms of cost and time. There have been a number of attempts at using microarrays for cross-species hybridization; however, those approaches only utilized the same probes for each species or different probes designed from orthologous genes. To establish easier and cheaper methods for the large-scale comparative genomic study of non-sequenced species, we developed an in vitro homology search array with the aid of a bioinformatic approach to probe design. RESULTS To perform large-scale genomic comparisons of non-sequenced species, we chose squid, one of the most intelligent species among Protostomes, for comparison with human genes. We designed a microarray using human single copy genes and conducted microarray experiments with mRNAs extracted from the squid. Multi-copy genes could not be detected using the microarray in this study because their sequence similarity caused cross-hybridization. A search for squid homologous genes among human genes revealed that 68% of the human probes tested showed the expression of squid homolog genes and 95 genes were confirmed to be expressed highly in squid. Functional classification analysis showed that these highly expressed genes comprise DNA binding proteins, which are under pressure of DNA level mutation and, consequently, show high similarity at the nucleotide level. CONCLUSIONS Our array could detect homologous genes in squids and humans in spite of the distant phylogenetic relationships between the species. This experimental method will be useful for identifying homologs in non-sequenced species, for the development of genetic resources and for the collection of information on biodiversity, particularly when using the genome of sibling or closely related species.
Affiliation(s)
- Atsushi Ogura
- Ochadai Academic Production, Ochanomizu University, Bunkyo, Tokyo, Japan.