1
Yadav S, Ross EM, Aitken KS, Hickey LT, Powell O, Wei X, Voss-Fels KP, Hayes BJ. A linkage disequilibrium-based approach to position unmapped SNPs in crop species. BMC Genomics 2021; 22:773. [PMID: 34715779 PMCID: PMC8555328 DOI: 10.1186/s12864-021-08116-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Accepted: 10/19/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND High-density SNP arrays are now available for a wide range of crop species. Despite the development of many tools for generating genetic maps, the genome position of many SNPs from these arrays is unknown. Here we propose a linkage disequilibrium (LD)-based algorithm to allocate unassigned SNPs to chromosome regions from sparse genetic maps. This algorithm was tested on sugarcane, wheat, and barley data sets. We calculated the algorithm's efficiency by masking SNPs with known locations, then assigning their position to the map with the algorithm, and finally comparing the assigned and true positions. RESULTS In the 20-fold cross-validation, the mean proportion of masked mapped SNPs that were placed by the algorithm to a chromosome was 89.53, 94.25, and 97.23% for sugarcane, wheat, and barley, respectively. Of the markers that were placed in the genome, 98.73, 96.45 and 98.53% of the SNPs were positioned on the correct chromosome. The mean correlations between known and new estimated SNP positions were 0.97, 0.98, and 0.97 for sugarcane, wheat, and barley. The LD-based algorithm was used to assign 5920 out of 21,251 unpositioned markers to the current Q208 sugarcane genetic map, representing the highest density genetic map for this species to date. CONCLUSIONS Our LD-based approach can be used to accurately assign unpositioned SNPs to existing genetic maps, improving genome-wide association studies and genomic prediction in crop species with fragmented and incomplete genome assemblies. This approach will facilitate genomic-assisted breeding for many orphan crops that lack genetic and genomic resources.
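The placement idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes genotypes coded as allele counts, uses the squared Pearson correlation as the LD measure, and simply places an unmapped SNP at the position of its best-matching mapped SNP. The function names and the `min_r2` threshold are hypothetical.

```python
import numpy as np


def r2_with_mapped(unmapped, mapped):
    """Squared correlation between one SNP (n,) and each column of mapped (n, m)."""
    u = unmapped - unmapped.mean()
    m = mapped - mapped.mean(axis=0)
    denom = np.sqrt((u @ u) * (m * m).sum(axis=0))
    # Monomorphic SNPs have zero variance; report r2 = 0 for them.
    with np.errstate(divide="ignore", invalid="ignore"):
        r = np.where(denom > 0, (u @ m) / denom, 0.0)
    return r ** 2


def assign_snp(unmapped, mapped, chrom, pos, min_r2=0.3):
    """Return (chromosome, position, r2) of the best-matching mapped SNP.

    Returns None when no mapped SNP reaches the (assumed) min_r2 threshold,
    i.e. the SNP stays unplaced.
    """
    r2 = r2_with_mapped(np.asarray(unmapped, float), np.asarray(mapped, float))
    best = int(np.argmax(r2))
    if r2[best] < min_r2:
        return None
    return chrom[best], pos[best], float(r2[best])
```

Masking a SNP with a known location, running `assign_snp` on it, and comparing the returned chromosome and position with the true ones mirrors the cross-validation scheme the abstract describes.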
Affiliation(s)
- Seema Yadav
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia.
- Elizabeth M Ross
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Karen S Aitken
  - Agriculture and Food, CSIRO, Queensland Bioscience Precinct, St. Lucia, Brisbane, Queensland, 4067, Australia
- Lee T Hickey
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Owen Powell
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Xianming Wei
  - Sugar Research Australia, Mackay, QLD, 4741, Australia
- Kai P Voss-Fels
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia
- Ben J Hayes
  - Queensland Alliance for Agriculture and Food Innovation, Queensland Bioscience Precinct, 306 Carmody Rd., St. Lucia, Brisbane, Queensland, 4067, Australia.
2
Zhao Z, Zhou Y, Wang S, Zhang X, Wang C, Li S. LDscaff: LD-based scaffolding of de novo genome assemblies. BMC Bioinformatics 2020; 21:570. [PMID: 33371875 PMCID: PMC7768660 DOI: 10.1186/s12859-020-03895-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/17/2020] [Accepted: 11/18/2020] [Indexed: 12/11/2022] Open
Abstract
Background Genome assembly is fundamental for de novo genome analysis. Hybrid assembly, utilizing various sequencing technologies, increases both contiguity and accuracy. While such approaches require extra costly sequencing efforts, the information provided by millions of existing whole-genome sequencing datasets has not been fully utilized to resolve the task of scaffolding. Genetic recombination patterns in population data, which indicate non-random association among alleles at different loci, can provide physical distance signals to guide scaffolding. Results In this paper, we propose LDscaff for draft genome assembly incorporating linkage disequilibrium information in population data. We evaluated the performance of our method with both simulated data and real data. We simulated scaffolds by splitting the pig reference genome and reassembled them. Gaps between scaffolds were introduced ranging from 0 to 100 KB. The genome misassembly rate is 2.43% when there is no gap. Then we implemented our method to refine the Giant Panda genome and the donkey genome, which are purely assembled by NGS data. After LDscaff treatment, the resulting Panda assembly has scaffold N50 of 3.6 MB, 2.5 times larger than the original N50 (1.3 MB). The re-assembled donkey assembly has an improved N50 length of 32.1 MB from 23.8 MB. Conclusions Our method effectively improves the assemblies with existing re-sequencing data and is a potential alternative to existing assemblers that require the collection of new data.
Affiliation(s)
- Zicheng Zhao
  - BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, 518083, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR, 999077, China
- Yingxiao Zhou
  - BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, 518083, China
  - BGI-Shenzhen, Shenzhen, 518083, China
- Shuai Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR, 999077, China
- Xiuqing Zhang
  - BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, 518083, China
- Changfa Wang
  - Liaocheng Research Institute of Donkey High-Efficiency Breeding and Ecological Feeding, Liaocheng University, Liaocheng City, 252059, Shandong, China.
- Shuaicheng Li
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR, 999077, China.
3
Pengelly RJ, Collins A. Linkage disequilibrium maps to guide contig ordering for genome assembly. Bioinformatics 2018; 35:541-545. [DOI: 10.1093/bioinformatics/bty687] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/11/2018] [Revised: 07/13/2018] [Accepted: 08/03/2018] [Indexed: 11/12/2022] Open
Affiliation(s)
- Reuben J Pengelly
  - Genetic Epidemiology & Bioinformatics, Faculty of Medicine, University of Southampton, Southampton, UK
- Andrew Collins
  - Genetic Epidemiology & Bioinformatics, Faculty of Medicine, University of Southampton, Southampton, UK
4
Calus MPL, Vandenplas J. SNPrune: an efficient algorithm to prune large SNP array and sequence datasets based on high linkage disequilibrium. Genet Sel Evol 2018; 50:34. [PMID: 29940846 PMCID: PMC6019535 DOI: 10.1186/s12711-018-0404-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/28/2018] [Accepted: 06/11/2018] [Indexed: 12/02/2022] Open
Abstract
Background High levels of pairwise linkage disequilibrium (LD) in single nucleotide polymorphism (SNP) array or whole-genome sequence data may affect both performance and efficiency of genomic prediction models. Thus, this warrants pruning of genotyping data for high LD. We developed an algorithm, named SNPrune, which enables the rapid detection of any pair of SNPs in complete or high LD throughout the genome. Methods LD, measured as the squared correlation between phased alleles (r²), can only reach a value of 1 when both loci have the same count of the minor allele. Sorting loci based on the minor allele count, followed by comparison of their alleles, enables rapid detection of loci in complete LD. Detection of loci in high LD can be optimized by computing the range of the minor allele count at another locus for each possible value of the minor allele count that can yield LD values higher than a predefined threshold. This efficiently reduces the number of pairs of loci for which LD needs to be computed, instead of considering all pairwise combinations of loci. The implemented algorithm SNPrune considers bi-allelic loci, using either phased alleles or allele counts as input. SNPrune was validated against PLINK on two datasets, using an r² threshold of 0.99. The first dataset contained 52k SNP genotypes on 3534 pigs and the second dataset contained simulated whole-genome sequence data with 10.8 million SNPs and 2500 animals. Results SNPrune removed a similar number of SNPs as PLINK from the pig data but SNPrune was almost 12 times faster than PLINK. From the simulated sequence data with 10.8 million SNPs, SNPrune removed 6.4 and 1.4 million SNPs due to complete and high LD. Results were very similar regardless of whether phased alleles or allele counts were used. Using allele counts and multi-threading with 10 threads, SNPrune completed the analysis in 21 min. Using a sliding window of up to 500,000 SNPs, PLINK removed ~43,000 fewer SNPs (0.6%) in the sequence data and SNPrune was 24 to 170 times faster, using one or ten threads, respectively. Conclusions The SNPrune algorithm developed here is able to remove SNPs in high LD throughout the genome very efficiently in large datasets.
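The observation in the Methods (r² can reach 1 only when both loci share the same minor allele count) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SNPrune implementation: it assumes phased alleles coded 0/1 in a haplotype-by-locus array, and `complete_ld_groups` is a hypothetical name. Loci are bucketed by minor allele count, and only loci within the same bucket are compared.

```python
from collections import defaultdict

import numpy as np


def complete_ld_groups(hap):
    """Group loci in complete LD (r2 = 1) from 0/1 phased alleles.

    hap: (n_haplotypes, n_loci) array. Returns a list of lists of locus indices.
    """
    hap = np.asarray(hap, dtype=np.uint8)
    n = hap.shape[0]
    counts = hap.sum(axis=0)
    mac = np.minimum(counts, n - counts)  # minor allele count per locus

    # Bucket polymorphic loci by minor allele count, visiting them in sorted order.
    buckets = defaultdict(list)
    for j in np.argsort(mac, kind="stable"):
        if mac[j] > 0:
            buckets[int(mac[j])].append(int(j))

    groups = []
    for loci in buckets.values():
        by_pattern = defaultdict(list)
        for j in loci:
            col = hap[:, j]
            # r2 does not depend on which allele is coded 1, so flip each
            # column to a canonical form whose first haplotype carries 0.
            canon = col if col[0] == 0 else 1 - col
            by_pattern[canon.tobytes()].append(j)
        groups.extend(g for g in by_pattern.values() if len(g) > 1)
    return groups
```

Two bi-allelic loci have r² = 1 exactly when their allele columns are identical or complementary, which is why comparing canonical columns within a bucket is sufficient here. The high-LD case (r² above a threshold below 1) needs the count-range computation described in the abstract and is not covered by this sketch.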
Affiliation(s)
- Mario P L Calus
  - Animal Breeding and Genomics, Wageningen University & Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands.
- Jérémie Vandenplas
  - Animal Breeding and Genomics, Wageningen University & Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands
5
Hampel A, Teuscher F, Gomez-Raya L, Doschoris M, Wittenburg D. Estimation of Recombination Rate and Maternal Linkage Disequilibrium in Half-Sibs. Front Genet 2018; 9:186. [PMID: 29922330 PMCID: PMC5996054 DOI: 10.3389/fgene.2018.00186] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/08/2018] [Accepted: 05/07/2018] [Indexed: 01/23/2023] Open
Abstract
A livestock population can be characterized by different population genetic parameters, such as linkage disequilibrium and recombination rate between pairs of genetic markers. The population structure, which may be caused by family stratification, has an influence on the estimates of these parameters. An expectation maximization algorithm has been proposed for estimating these parameters in half-sibs without phasing the progeny. It, however, overlooks the fact that the underlying likelihood function may have two maxima. The magnitudes of the maxima depend on the maternal allele frequencies at the investigated marker pair. Which maximum the algorithm converges to depends on the chosen start values. We present a stepwise procedure in which the relationship between the two modes is exploited. The expectation maximization algorithm for the parameter estimation is applied twice using different start values, followed by a decision process to assess the most likely estimate. This approach was validated using simulated genotypes of half-sibs. It was also applied to a dairy cattle dataset consisting of multiple half-sib families and 39,780 marker genotypes, leading to estimates for 12,759,713 intrachromosomal marker pairs. Furthermore, the proper order of markers was verified by studying the mean of estimated recombination rates in a window adjacent to the investigated locus as well as in a window at its most distant chromosome end. Putatively misplaced markers or marker clusters were detected by comparing the results with the revised bovine genome assembly UMD 3.1.1. In total, 40 markers were identified as candidates for misplacement. This outcome may help improve the physical order of markers, which is also required for refining the bovine genetic map.
Affiliation(s)
- Alexander Hampel
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
- Friedrich Teuscher
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
- Luis Gomez-Raya
  - Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Madrid, Spain
- Michael Doschoris
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
- Dörte Wittenburg
  - Leibniz Institute for Farm Animal Biology (FBN), Institute of Genetics and Biometry, Dummerstorf, Germany
6
Linkage Disequilibrium Estimation in Low Coverage High-Throughput Sequencing Data. Genetics 2018; 209:389-400. [PMID: 29588288 PMCID: PMC5972415 DOI: 10.1534/genetics.118.300831] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 02/16/2018] [Accepted: 03/22/2018] [Indexed: 12/31/2022] Open
Abstract
High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. Two side-effects of these methods, however, are (1) sequencing errors and (2) heterozygous genotypes called as homozygous due to only one allele at a particular locus being sequenced, which occurs when the sequencing depth is insufficient. Both of these errors have a profound effect on the estimation of linkage disequilibrium (LD) and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium using low coverage sequencing data that accounts for undercalled heterozygous genotypes and sequencing errors. Our findings show that accurate estimates were obtained using GUS-LD, whereas underestimation of LD results if no adjustment is made for the errors.
7
A comparative integrated gene-based linkage and locus ordering by linkage disequilibrium map for the Pacific white shrimp, Litopenaeus vannamei. Sci Rep 2017; 7:10360. [PMID: 28871114 PMCID: PMC5583237 DOI: 10.1038/s41598-017-10515-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/09/2017] [Accepted: 08/09/2017] [Indexed: 11/23/2022] Open
Abstract
The Pacific whiteleg shrimp, Litopenaeus vannamei, is the most farmed aquaculture species worldwide with global production exceeding 3 million tonnes annually. Litopenaeus vannamei has been the focus of many selective breeding programs aiming to improve growth and disease resistance. However, these have been based primarily on phenotypic measurements and omit potential gains by integrating genetic selection into existing breeding programs. Such integration of genetic information has been hindered by the limited available genomic resources, background genetic parameters and knowledge on the genetic architecture of commercial traits for L. vannamei. This study describes the development of a comprehensive set of genomic gene-based resources including the identification and validation of 234,452 putative single nucleotide polymorphisms in-silico, of which 8,967 high value SNPs were incorporated into a commercially available Illumina Infinium ShrimpLD-24 v1.0 genotyping array. A framework genetic linkage map was constructed and combined with locus ordering by disequilibrium methodology to generate an integrated genetic map containing 4,817 SNPs, which spanned a total of 4552.5 cM and covered an estimated 98.12% of the genome. These gene-based genomic resources will not only be valuable for identifying regions underlying important L. vannamei traits, but also as a foundational resource in comparative and genome assembly activities.
8
Utsunomiya ATH, Santos DJA, Boison SA, Utsunomiya YT, Milanesi M, Bickhart DM, Ajmone-Marsan P, Sölkner J, Garcia JF, da Fonseca R, da Silva MVGB. Revealing misassembled segments in the bovine reference genome by high resolution linkage disequilibrium scan. BMC Genomics 2016; 17:705. [PMID: 27595709 PMCID: PMC5011828 DOI: 10.1186/s12864-016-3049-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/11/2016] [Accepted: 08/27/2016] [Indexed: 11/21/2022] Open
Abstract
Background Misassembly signatures, created by shuffling the order of sequences while assembling a genome, can be detected by the unexpected behavior of marker linkage disequilibrium (LD) decay. We developed a heuristic process to identify misassembly signatures, applied it to the bovine reference genome assembly (UMDv3.1) and presented the consequences of misassemblies in two case studies. Results We identified 2,906 single nucleotide polymorphism (SNP) markers presenting unexpected LD decay behavior in 626 putative misassembled contigs, which comprised less than 1% of the whole genome. Although this represents a small fraction of the reference sequence, these poorly assembled segments can have severe implications for the local genome context. For instance, we showed that one of the misassembled regions mapped to the POLL locus, which affected the annotation of positional candidate genes in a GWAS case study for polledness in Nellore (Bos indicus beef cattle). Additionally, we found that poorly performing markers in imputation mapped to putative misassembled regions, and that correction of marker positions based on LD was able to recover imputation accuracy. Conclusions This heuristic approach can be useful to cross validate reference assemblies and to filter out markers located at low confidence genomic regions before conducting downstream analyses.
Affiliation(s)
- Adam T H Utsunomiya
  - Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista - UNESP, Campus de Jaboticabal, São Paulo, Brasil.
- Daniel J A Santos
  - Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista - UNESP, Campus de Jaboticabal, São Paulo, Brasil
- Yuri T Utsunomiya
  - Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista - UNESP, Campus de Jaboticabal, São Paulo, Brasil
- Marco Milanesi
  - Faculdade de Medicina Veterinária de Araçatuba, Universidade Estadual Paulista - UNESP, Campus de Araçatuba, São Paulo, Brasil
- Derek M Bickhart
  - Animal Genomics and Improvement Laboratory, ARS, USDA, Beltsville, MD, USA
- Paolo Ajmone-Marsan
  - Institute of Zootechnics and Biodiversity and Ancient DNA Research Center, Università Cattolica del Sacro Cuore, Piacenza, Italy
  - Nutrigenomics and Proteomics Research Center - PRONUTRIGEN, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Johann Sölkner
  - Department of Sustainable Agricultural Systems, Division of Livestock Sciences, BOKU - University of Natural Resources and Life Sciences, Vienna, Austria
- José F Garcia
  - Faculdade de Medicina Veterinária de Araçatuba, Universidade Estadual Paulista - UNESP, Campus de Araçatuba, São Paulo, Brasil
  - International Atomic Energy Agency (IAEA) Collaborating Centre on Animal Genomics and Bioinformatics, Araçatuba, São Paulo, Brasil
- Ricardo da Fonseca
  - Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista - UNESP, Campus de Jaboticabal, São Paulo, Brasil
  - Faculdade de Ciências Agrárias e Tecnológicas, Universidade Estadual Paulista - UNESP, Campus de Dracena, São Paulo, Brasil
9
Genome-wide estimation of linkage disequilibrium from population-level high-throughput sequencing data. Genetics 2014; 197:1303-13. [PMID: 24875187 DOI: 10.1534/genetics.114.165514] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
Abstract
Rapidly improving sequencing technologies provide unprecedented opportunities for analyzing genome-wide patterns of polymorphisms. In particular, they have great potential for linkage-disequilibrium analyses on both global and local genetic scales, which will substantially improve our ability to derive evolutionary inferences. However, there are some difficulties with analyzing high-throughput sequencing data, including high error rates associated with base reads and complications from the random sampling of sequenced chromosomes in diploid organisms. To overcome these difficulties, we developed a maximum-likelihood estimator of linkage disequilibrium for use with error-prone sampling data. Computer simulations indicate that the estimator is nearly unbiased with a sampling variance at high coverage asymptotically approaching the value expected when all relevant information is accurately estimated. The estimator does not require phasing of haplotypes and enables the estimation of linkage disequilibrium even when all individual reads cover just single polymorphic sites.
10
Corbin LJ, Blott SC, Swinburne JE, Vaudin M, Bishop SC, Woolliams JA. The identification of SNPs with indeterminate positions using the Equine SNP50 BeadChip. Anim Genet 2012; 43:337-9. [PMID: 22486508 DOI: 10.1111/j.1365-2052.2011.02243.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
Abstract
We have used linkage disequilibrium (LD) to identify single nucleotide polymorphisms (SNPs) on the Illumina Equine SNP50 BeadChip, which may be incorrectly positioned on the genome map. A total of 1201 Thoroughbred horses were genotyped using the Illumina Equine SNP50 BeadChip. LD was evaluated in a pairwise fashion between all autosomal SNPs, both within and across chromosomes. Filters were then applied to the data, firstly to identify SNPs that may have been mapped to the wrong chromosome and secondly to identify SNPs that may have been incorrectly positioned within chromosomes. We identified a single SNP on ECA28, which showed low LD with neighbouring SNPs but considerable LD with a group of SNPs on ECA10. Furthermore, a cluster of SNPs on ECA5 showed unusually low LD with surrounding SNPs. A total of 39 SNPs met the criteria for unusual within-chromosome LD. The results of this study indicate that some SNPs may be misplaced. This finding is significant, as misplaced SNPs may lead to difficulties in the application of genomic methods, such as homozygosity mapping, for which SNP order is important.
Affiliation(s)
- L J Corbin
  - Division of Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, UK.
11
Age- and disease-dependent HERV-W envelope allelic variation in brain: association with neuroimmune gene expression. PLoS One 2011; 6:e19176. [PMID: 21559469 PMCID: PMC3084769 DOI: 10.1371/journal.pone.0019176] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 11/30/2010] [Accepted: 03/22/2011] [Indexed: 12/27/2022] Open
Abstract
Background The glycoprotein, Syncytin-1, is encoded by a human endogenous retrovirus (HERV)-W env gene and is capable of inducing neuroinflammation. The specific allele(s) responsible for Syncytin-1 expression in the brain is uncertain. Herein, HERV-W env diversity together with Syncytin-1 abundance and host immune gene profiles were examined in the nervous system using a multiplatform approach. Results HERV-W env sequences were encoded by multiple chromosomal encoding loci in primary human neurons compared with less chromosomal diversity in astrocytes and microglia (p<0.05). HERV-W env RNA sequences cloned from brains of patients with systemic or neurologic diseases were principally derived from chromosomal locus 7q21.2. Within the same specimens, HERV-W env transcript levels were correlated with the expression of multiple proinflammatory genes (p<0.05). Deep sequencing of brain transcriptomes disclosed the env transcripts to be the most abundant HERV-W transcripts, showing greater expression in fetal compared with healthy adult brain specimens. Syncytin-1's expression in healthy brain specimens was derived from multiple encoding loci and linked to distinct immune and developmental gene profiles. Conclusions Syncytin-1 expression in the brain during disease was associated with neuroinflammation and was principally encoded by a full length provirus. The present studies also highlighted the diversity in HERV gene expression within the brain and reinforce the potential contributions of HERV expression to neuroinflammatory diseases.