1
Jiang G, Li Y, Cheng G, Jiang K, Zhou J, Xu C, Kong L, Yu H, Liu S, Li Q. Transcriptome Analysis of Reciprocal Hybrids Between Crassostrea gigas and C. angulata Reveals the Potential Mechanisms Underlying Thermo-Resistant Heterosis. Marine Biotechnology (New York, N.Y.) 2023; 25:235-246. [PMID: 36653591 DOI: 10.1007/s10126-023-10197-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/10/2022] [Accepted: 01/11/2023] [Indexed: 05/06/2023]
Abstract
Heterosis, also known as hybrid vigor, is widely used in aquaculture, but the molecular causes of this phenomenon remain obscure. Here, we conducted a transcriptome analysis to unveil the gene expression patterns and molecular bases underlying thermo-resistant heterosis in Crassostrea gigas ♀ × Crassostrea angulata ♂ (GA) and C. angulata ♀ × C. gigas ♂ (AG). About 505 million clean reads were obtained, and 38,210 genes were identified, of which 3779 genes were differentially expressed between the reciprocal hybrids and purebreds. In the reciprocal hybrids, global gene expression levels were biased toward the C. gigas genome. In GA and AG, 95.69% and 92.00% of the differentially expressed genes (DEGs) exhibited a non-additive expression pattern, respectively. We observed all gene expression modes, including additive, partial dominance, high and low dominance, and under- and over-dominance. Of these, 77.52% and 50.00% of the DEGs exhibited under- or over-dominance in GA and AG, respectively. The over-dominance DEGs common to reciprocal hybrids were significantly enriched in protein folding, protein refolding, and intrinsic apoptotic signaling pathway, while the under-dominance DEGs were significantly enriched in cell cycle. As possible candidate genes for thermo-resistant heterosis, GRP78, major egg antigen, BAG, Hsp70, and Hsp27 were over-dominantly expressed, while MCM6 and ANAPC4 were under-dominantly expressed. This study extends our understanding of the thermo-resistant heterosis in oysters.
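The expression modes named in this abstract (additive, partial dominance, high and low dominance, under- and over-dominance) can be illustrated with a minimal sketch that compares a hybrid's expression level with those of its two parents. The function name `classify_expression_mode` and the relative tolerance `tol` are hypothetical; the abstract does not state the thresholds or statistical tests the authors used, so this is an illustration of the categories, not the study's method.

```python
def classify_expression_mode(parent_a, parent_b, hybrid, tol=0.1):
    """Assign a hybrid expression level to one inheritance mode.

    `tol` is a hypothetical relative tolerance: two levels within
    tol * reference are treated as equal. A real analysis would
    rely on differential-expression tests instead.
    """
    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    mid_parent = (low + high) / 2.0

    def close(value, reference):
        return abs(value - reference) <= tol * max(abs(reference), 1e-12)

    # Outside the parental range: over- or under-dominance.
    if hybrid > high and not close(hybrid, high):
        return "over-dominance"
    if hybrid < low and not close(hybrid, low):
        return "under-dominance"
    # Indistinguishable from one parent: high or low dominance.
    if close(hybrid, high):
        return "high dominance"
    if close(hybrid, low):
        return "low dominance"
    # At the mid-parent value: additive; otherwise in between.
    if close(hybrid, mid_parent):
        return "additive"
    return "partial dominance"


if __name__ == "__main__":
    for level in (30, 20, 17, 15, 10, 5):
        print(level, classify_expression_mode(10, 20, level))
```

Under this scheme every mode except "additive" counts as non-additive, which is the grouping the abstract uses when it reports the share of non-additive DEGs.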
Affiliation(s)
- Gaowei Jiang
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Yin Li
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Geng Cheng
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Kunyin Jiang
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Jianmin Zhou
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Chengxun Xu
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Lingfeng Kong
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Hong Yu
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Shikai Liu
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Qi Li
- Key Laboratory of Mariculture, Ministry of Education, Ocean University of China, Qingdao, 266003, China
- Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266237, China
2
Günther T, Lampei C, Barilar I, Schmid KJ. Genomic and phenotypic differentiation of Arabidopsis thaliana along altitudinal gradients in the North Italian Alps. Mol Ecol 2016; 25:3574-92. [PMID: 27220345 DOI: 10.1111/mec.13705] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 07/22/2015] [Revised: 04/19/2016] [Accepted: 05/02/2016] [Indexed: 12/25/2022]
Abstract
Altitudinal gradients in mountain regions are short-range clines of different environmental parameters such as temperature or radiation. We investigated genomic and phenotypic signatures of adaptation to such gradients in five Arabidopsis thaliana populations from the North Italian Alps that originated from 580 to 2350 m altitude by resequencing pools of 19-29 individuals from each population. The sample includes two pairs of low- and high-altitude populations from two different valleys. High-altitude populations showed a lower nucleotide diversity and negative Tajima's D values and were more closely related to each other than to low-altitude populations from the same valley. Despite their close geographic proximity, demographic analysis revealed that low- and high-altitude populations split between 260 000 and 15 000 years before present. Single nucleotide polymorphisms whose allele frequencies were highly differentiated between low- and high-altitude populations identified genomic regions of up to 50 kb length where patterns of genetic diversity are consistent with signatures of local selective sweeps. These regions harbour multiple genes involved in stress response. Variation among populations in two putative adaptive phenotypic traits, frost tolerance and response to light/UV stress was not correlated with altitude. Taken together, the spatial distribution of genetic diversity reflects a potentially adaptive differentiation between low- and high-altitude populations, whereas the phenotypic differentiation in the two traits investigated does not. It may resemble an interaction between adaptation to the local microhabitat and demographic history influenced by historical glaciation cycles, recent seed dispersal and genetic drift in local populations.
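Because this abstract reports nucleotide diversity and Tajima's D, a short sketch of the classical statistic may help. It computes Tajima's D from a set of aligned haplotype sequences, using the mean number of pairwise differences and the number of segregating sites. This is the textbook estimator for individually sequenced samples, named plainly as a substitute: the study itself resequenced pools of individuals, which requires pool-specific estimators not shown here. The function name `tajimas_d` and the toy sequences are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Classical Tajima's D for aligned, equal-length sequences.

    Returns None when there are no segregating sites. At least four
    sequences are required, because the variance term is zero for
    smaller samples.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns with more than one allele.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )
    pi = total_diffs / n_pairs

    # Constants from Tajima's variance derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Only rare (singleton) variants: D is negative.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"]))
    # Only intermediate-frequency variants: D is positive.
    print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

An excess of rare variants pushes D below zero, which is the direction the abstract reports for the high-altitude populations.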
Affiliation(s)
- Torsten Günther
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Department of Evolutionary Biology, EBC, Uppsala University, Uppsala, Sweden
- Christian Lampei
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Ivan Barilar
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
- Karl J Schmid
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
3
Branham SE, Wright SJ, Reba A, Morrison GD, Linder CR. Genome-Wide Association Study in Arabidopsis thaliana of Natural Variation in Seed Oil Melting Point: A Widespread Adaptive Trait in Plants. J Hered 2016; 107:257-65. [PMID: 26865732 DOI: 10.1093/jhered/esw008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 07/16/2015] [Accepted: 01/27/2016] [Indexed: 12/29/2022] Open
Abstract
Seed oil melting point is an adaptive, quantitative trait determined by the relative proportions of the fatty acids that compose the oil. Micro- and macro-evolutionary evidence suggests selection has changed the melting point of seed oils to covary with germination temperatures because of a trade-off between total energy stores and the rate of energy acquisition during germination under competition. The seed oil compositions of 391 natural accessions of Arabidopsis thaliana, grown under common-garden conditions, were used to assess whether seed oil melting point within a species varied with germination temperature. In support of the adaptive explanation, long-term monthly spring and fall field temperatures of the accession collection sites significantly predicted their seed oil melting points. In addition, a genome-wide association study (GWAS) was performed to determine which genes were most likely responsible for the natural variation in seed oil melting point. The GWAS found a single highly significant association within the coding region of FAD2, which encodes a fatty acid desaturase central to the oil biosynthesis pathway. In a separate analysis of 15 a priori oil synthesis candidate genes, 2 (FAD2 and FATB) were located near significant SNPs associated with seed oil melting point. These results comport with others' molecular work showing that lines with alterations in these genes affect seed oil melting point as expected. Our results suggest natural selection has acted on a small number of loci to alter a quantitative trait in response to local environmental conditions.
Affiliation(s)
- Sandra E Branham
- US Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC 29414
- Integrative Biology Department, University of Texas at Austin, Austin, TX
- Sara J Wright
- Department of Biology, Washington University, St. Louis, MO
- Aaron Reba
- Integrative Biology Department, University of Texas at Austin, Austin, TX
- Ginnie D Morrison
- Division of Plant Sciences, University of Missouri, Columbia, MO
- C Randal Linder
- Integrative Biology Department, University of Texas at Austin, Austin, TX
4
Prince SJ, Song L, Qiu D, Maldonado Dos Santos JV, Chai C, Joshi T, Patil G, Valliyodan B, Vuong TD, Murphy M, Krampis K, Tucker DM, Biyashev R, Dorrance AE, Maroof MAS, Xu D, Shannon JG, Nguyen HT. Genetic variants in root architecture-related genes in a Glycine soja accession, a potential resource to improve cultivated soybean. BMC Genomics 2015; 16:132. [PMID: 25765991 PMCID: PMC4354765 DOI: 10.1186/s12864-015-1334-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 08/15/2014] [Accepted: 02/09/2015] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: Root system architecture is important for water and nutrient acquisition in all crops. In soybean breeding programs, wild soybean alleles have been used successfully to enhance yield and seed composition traits, but have never been investigated to improve root system architecture. Therefore, in this study, high-density single-feature polymorphic markers and simple sequence repeats were used to map quantitative trait loci (QTLs) governing root system architecture in an inter-specific soybean mapping population developed from a cross between Glycine max and Glycine soja. RESULTS: Wild and cultivated soybean both contributed alleles towards significant additive large-effect QTLs on chromosomes 6 and 7 for a longer total root length and root distribution, respectively. Epistatic-effect QTLs were also identified for taproot length, average diameter, and root distribution. These root traits will influence the water and nutrient uptake in soybean. Two cell division-related genes (D type cyclin and auxin efflux carrier protein) with insertion/deletion variations might contribute to the shorter root phenotypes observed in G. soja compared with cultivated soybean. Based on the location of the QTLs and sequence information from a second G. soja accession, three genes (slow anion channel associated 1 like, Auxin responsive NEDD8-activating complex and peroxidase), each with a non-synonymous single nucleotide polymorphism mutation, were identified, which may also contribute to changes in root architecture in the cultivated soybean. In addition, Apoptosis inhibitor 5-like on chromosome 7 and slow anion channel associated 1-like on chromosome 15 had epistatic interactions for taproot length QTLs in soybean. CONCLUSION: Rare alleles from a G. soja accession are expected to enhance our understanding of the genetic components involved in root architecture traits, and could be combined to improve root system and drought adaptation in soybean.
Affiliation(s)
- Silvas J Prince
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Li Song
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Dan Qiu
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Joao V Maldonado Dos Santos
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA.
- Chenglin Chai
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Trupti Joshi
- Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA.
- Department of Computer Science, University of Missouri, Columbia, MO, 65211, USA.
- Gunvant Patil
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Babu Valliyodan
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Tri D Vuong
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Mackensie Murphy
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Konstantinos Krampis
- Department of Crop and Soil Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA.
- Dominic M Tucker
- Department of Crop and Soil Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA.
- Ruslan Biyashev
- Department of Crop and Soil Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA.
- Anne E Dorrance
- Department of Plant Pathology, The Ohio State University, OARDC, Wooster, OH, 44691, USA.
- M A Saghai Maroof
- Department of Crop and Soil Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA.
- Dong Xu
- Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA.
- Department of Computer Science, University of Missouri, Columbia, MO, 65211, USA.
- J Grover Shannon
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Henry T Nguyen
- National Center for Soybean Biotechnology and Division of Plant Sciences, University of Missouri, Columbia, MO, 65211, USA.
- Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA.
5
Korkuć P, Schippers JH, Walther D. Characterization and identification of cis-regulatory elements in Arabidopsis based on single-nucleotide polymorphism information. Plant Physiology 2014; 164:181-200. [PMID: 24204023 PMCID: PMC3875800 DOI: 10.1104/pp.113.229716] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 10/03/2013] [Accepted: 11/06/2013] [Indexed: 05/19/2023]
Abstract
Identifying regulatory elements and revealing their role in gene expression regulation remains a central goal of plant genome research. We exploited the detailed genomic sequencing information of a large number of Arabidopsis (Arabidopsis thaliana) accessions to characterize known and to identify novel cis-regulatory elements in gene promoter regions of Arabidopsis by relying on conservation as the hallmark signal of functional relevance. Based on the genomic layout and the obtained density profiles of single-nucleotide polymorphisms (SNPs) in sequence regions upstream of transcription start sites, the average length of promoter regions in Arabidopsis could be established at 500 bp. Genes associated with high degrees of variability of their respective upstream regions are preferentially involved in environmental response and signaling processes, while low levels of promoter SNP density are common among housekeeping genes. Known cis-elements were found to exhibit a lower SNP density than sequence regions not associated with known motifs. For 15 known cis-element motifs, strong positional preferences relative to the transcription start site were detected based on their promoter SNP density profiles. Five novel candidate cis-element motifs were identified as consensus motifs of 17 sequence hexamers exhibiting increased sequence conservation combined with evidence of positional preferences, annotation information, and functional relevance for inducing correlated gene expression. Our study demonstrates that the currently available resolution of SNP data offers novel ways for the identification of functional genomic elements and the characterization of gene promoter sequences.
6
Settles ML, Coram T, Soule T, Robison BD. An improved algorithm for the detection of genomic variation using short oligonucleotide expression microarrays. Mol Ecol Resour 2012; 12:1079-89. [PMID: 22966828 DOI: 10.1111/1755-0998.12006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2012] [Revised: 07/30/2012] [Accepted: 08/01/2012] [Indexed: 11/30/2022]
Abstract
High-throughput microarray experiments often generate far more biological information than is required to test the experimental hypotheses. Many microarray analyses are considered finished after differential expression, and additional analyses are typically not performed, leaving biological information undiscovered. This is especially true if the microarray experiment is from an ecological study of multiple populations. Comparisons across populations may also contain important genomic polymorphisms, and a subset of these polymorphisms may be identified with microarrays using techniques for the detection of single feature polymorphisms (SFPs). SFPs are differences in microarray probe-level intensities caused by genetic polymorphisms, such as single-nucleotide polymorphisms and small insertions/deletions, and not by expression differences. In this study, we provide a new algorithm for the detection of SFPs, evaluate the algorithm using existing data from two publicly available Affymetrix Barley (Hordeum vulgare) microarray data sets and compare it to two previously published SFP detection algorithms. Results show that our algorithm provides more consistent and sensitive calling of SFPs with a lower false discovery rate. Simultaneous analysis of SFPs and differential expression is a low-cost method for the enhanced analysis of microarray data, enabling additional biological inferences to be made.
Affiliation(s)
- Matthew L Settles
- Institute for Bioinformatics and Evolutionary Studies, University of Idaho, Moscow, ID 83844-3051, USA.
7
Meyer RC, Witucka-Wall H, Becher M, Blacha A, Boudichevskaia A, Dörmann P, Fiehn O, Friedel S, von Korff M, Lisec J, Melzer M, Repsilber D, Schmidt R, Scholz M, Selbig J, Willmitzer L, Altmann T. Heterosis manifestation during early Arabidopsis seedling development is characterized by intermediate gene expression and enhanced metabolic activity in the hybrids. The Plant Journal: for Cell and Molecular Biology 2012; 71:669-83. [PMID: 22487254 DOI: 10.1111/j.1365-313x.2012.05021.x] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Indexed: 05/05/2023]
Abstract
Heterosis-associated cellular and molecular processes were analyzed in seeds and seedlings of Arabidopsis thaliana accessions Col-0 and C24 and their heterotic hybrids. Microscopic examination revealed no advantages in terms of hybrid mature embryo organ sizes or cell numbers. Increased cotyledon sizes were detectable 4 days after sowing. Growth heterosis results from elevated cell sizes and numbers, and is well established at 10 days after sowing. The relative growth rates of hybrid seedlings were most enhanced between 3 and 4 days after sowing. Global metabolite profiling and targeted fatty acid analysis revealed maternal inheritance patterns for a large proportion of metabolites in the very early stages. During developmental progression, the distribution shifts to dominant, intermediate and heterotic patterns, with most changes occurring between 4 and 6 days after sowing. The highest incidence of heterotic patterns coincides with establishment of size differences at 4 days after sowing. In contrast, overall transcript patterns at 4, 6 and 10 days after sowing are characterized by intermediate to dominant patterns, with parental transcript levels showing the largest differences. Overall, the results suggest that, during early developmental stages, intermediate gene expression and higher metabolic activity in the hybrids compared to the parents lead to better resource efficiency, and therefore enhanced performance in the hybrids.
Affiliation(s)
- Rhonda C Meyer
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany.
8
Schwachtje J, Karojet S, Kunz S, Brouwer S, van Dongen JT. Plant-growth promoting effect of newly isolated rhizobacteria varies between two Arabidopsis ecotypes. Plant Signaling & Behavior 2012; 7:623-7. [PMID: 22580689 PMCID: PMC3442855 DOI: 10.4161/psb.20176] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 05/13/2023]
Abstract
Various rhizobacteria are known for their beneficial effects on plants, i.e. promotion of growth and induction of systemic resistance against pathogens. These bacteria are categorized as plant growth promoting rhizobacteria (PGPR) and are associated with plant roots. Knowledge of the underlying mechanisms of plant growth promotion in vivo is still very limited, but interference of bacteria with plant hormone metabolism is suggested to play a major role. To obtain new growth-promoting bacteria, we started a quest for rhizobacteria that are naturally associated with Arabidopsis thaliana. A suite of native root-associated bacteria was isolated from surface-sterilized roots of the Arabidopsis ecotype Gol-1 derived from a field site near Golm (Berlin area, Germany). We found several Pseudomonas species and a Microbacterium species and tested these for growth promotion effects on the Arabidopsis ecotypes Gol-1 and Col-0, and for growth-promotion associated traits, such as auxin production, ACC deaminase activity and phosphate solubilization capacity. We showed that two of the bacterial strains promote plant growth with respect to rosette diameter and stalk length and accelerate development, and that the effects were greater when bacteria were applied to Col-0 compared with Gol-1. Furthermore, the capability of promoting growth was not explained by the tested metabolic properties of the bacteria, suggesting that further bacterial traits are required. The natural variation of growth effects, combined with the extensive transgenic approaches available for the model plant Arabidopsis, will build a valuable tool to augment our understanding of the molecular mechanisms involved in the natural Arabidopsis-PGPR association.
Affiliation(s)
- Jens Schwachtje
- Max Planck Institute of Molecular Plant Physiology, Potsdam-Golm, Germany.
9
Seifert M, Gohr A, Strickert M, Grosse I. Parsimonious higher-order hidden Markov models for improved array-CGH analysis with applications to Arabidopsis thaliana. PLoS Comput Biol 2012; 8:e1002286. [PMID: 22253580 PMCID: PMC3257270 DOI: 10.1371/journal.pcbi.1002286] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 05/05/2011] [Accepted: 10/11/2011] [Indexed: 12/19/2022] Open
Abstract
Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are only based on first-order HMMs having constrained abilities to model spatial dependencies between measurements of closely adjacent chromosomal regions. Here, we develop parsimonious higher-order HMMs enabling the interpolation between a mixture model ignoring spatial dependencies and a higher-order HMM exhaustively modeling spatial dependencies. We apply parsimonious higher-order HMMs to the analysis of Array-CGH data of the accessions C24 and Col-0 of the model plant Arabidopsis thaliana. We compare these models against first-order HMMs and other existing methods using a reference of known deletions and sequence deviations. We find that parsimonious higher-order HMMs clearly improve the identification of these polymorphisms. Moreover, we perform a functional analysis of identified polymorphisms revealing novel details of genomic differences between C24 and Col-0. Additional model evaluations are done on widely considered Array-CGH data of human cell lines indicating that parsimonious HMMs are also well-suited for the analysis of non-plant specific data. All these results indicate that parsimonious higher-order HMMs are useful for Array-CGH analyses. An implementation of parsimonious higher-order HMMs is available as part of the open source Java library Jstacs (www.jstacs.de/index.php/PHHMM).
Author Summary
Array-based comparative genomics is a standard approach for the identification of DNA copy number polymorphisms between closely related genomes. The huge amounts of data produced by these experiments require efficient and accurate bioinformatics tools for the identification of copy number polymorphisms. Hidden Markov Models (HMMs) are frequently used for analyzing such data sets, but current models are based on first-order HMMs only having limited capabilities to model spatial dependencies between measurements of closely adjacent chromosomal regions. We develop parsimonious higher-order HMMs enabling the interpolation between a mixture model ignoring spatial dependencies and a higher-order HMM exhaustively modeling these dependencies to overcome this limitation. In an in-depth case study with Arabidopsis thaliana, we find that parsimonious higher-order HMMs clearly improve the identification of copy number polymorphisms in comparison to standard first-order HMMs and other frequently used methods. Functional analysis of identified polymorphisms revealed details of genomic differences between the accessions C24 and Col-0 of Arabidopsis thaliana. An additional study on human cell lines further indicates that parsimonious HMMs are well-suited for the analysis of Array-CGH data.
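For orientation, the first-order HMM baseline that this entry contrasts with its higher-order models can be sketched as Viterbi decoding over three copy-number states with Gaussian emissions. This is not the parsimonious higher-order model and not the Jstacs API. The state names, emission means and standard deviations, and the stay probability below are all hypothetical values chosen for illustration.

```python
import math

STATES = ("deletion", "unchanged", "amplification")
# Hypothetical emission parameters: (mean, standard deviation)
# of the measured log-ratio in each state.
EMISSIONS = {
    "deletion": (-1.0, 0.3),
    "unchanged": (0.0, 0.3),
    "amplification": (1.0, 0.3),
}
STAY_PROB = 0.9  # hypothetical probability of keeping the same state


def gaussian_logpdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_transition(prev, cur):
    # First-order dependency: the next state depends only on the current one.
    if prev == cur:
        return math.log(STAY_PROB)
    return math.log((1.0 - STAY_PROB) / (len(STATES) - 1))


def viterbi(observations):
    """Most probable state path for a sequence of log-ratios."""
    if not observations:
        return []
    log_start = math.log(1.0 / len(STATES))
    scores = {
        s: log_start + gaussian_logpdf(observations[0], *EMISSIONS[s])
        for s in STATES
    }
    backpointers = []
    for x in observations[1:]:
        new_scores, pointers = {}, {}
        for cur in STATES:
            best_prev = max(
                STATES, key=lambda prev: scores[prev] + log_transition(prev, cur)
            )
            new_scores[cur] = (
                scores[best_prev]
                + log_transition(best_prev, cur)
                + gaussian_logpdf(x, *EMISSIONS[cur])
            )
            pointers[cur] = best_prev
        backpointers.append(pointers)
        scores = new_scores
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi([0.0, 0.1, -0.9, -1.1, -1.0, 0.05]))
```

A run of strongly negative log-ratios is decoded as a deletion, while a single mild outlier is absorbed into the surrounding state because two state switches cost more than the emission gain. A higher-order model would additionally condition each transition on more than one preceding state.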
Affiliation(s)
- Michael Seifert
- Department of Molecular Genetics, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
10
Günther T, Schmid KJ. Improved haplotype-based detection of ongoing selective sweeps towards an application in Arabidopsis thaliana. BMC Res Notes 2011; 4:232. [PMID: 21729283 PMCID: PMC3148560 DOI: 10.1186/1756-0500-4-232] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 02/11/2011] [Accepted: 07/05/2011] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: The increasing amount of genome information allows us to address various questions regarding the molecular evolution and population genetics of different species. Such genome-wide data sets including thousands of individuals genotyped at hundreds of thousands of markers require time-efficient and powerful analysis methods. Demography and sampling introduce a bias into present population genetic tests of natural selection, which may confound results. Thus, a modification of test statistics is necessary to introduce time-efficient and unbiased analysis methods. RESULTS: We present an improved haplotype-based test of selective sweeps in samples of unequally related individuals. For this purpose, we modified existing tests by weighting the contribution of each individual based on its uniqueness in the entire sample. In contrast to previous tests, this modified test is feasible even for large genome-wide data sets of multiple individuals. We utilize coalescent simulations to estimate the sensitivity of such haplotype-based test statistics to complex demographic scenarios, such as population structure, population growth and bottlenecks. The analysis of empirical data from humans reveals different results compared to previous tests. Additionally, we show that our statistic is applicable to empirical data from Arabidopsis thaliana. Overall, the modified test leads to a slight but significant increase of power to detect selective sweeps among all demographic scenarios. CONCLUSIONS: The concept of this modification might be applied to other statistics in population genetics to reduce the intrinsic bias of demography and sampling. Additionally, the combination of different test statistics may further improve the performance of tests for natural selection.
Affiliation(s)
- Torsten Günther
- Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany.