1
Forni D, Martin D, Abujaber R, Sharp AJ, Sironi M, Hollox EJ. Determining multiallelic complex copy number and sequence variation from high coverage exome sequencing data. BMC Genomics 2015; 16:891. [PMID: 26526070 PMCID: PMC4630827 DOI: 10.1186/s12864-015-2123-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/01/2015] [Accepted: 10/22/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Copy number variation (CNV) is a major component of genomic variation, yet methods to accurately type genomic CNV lag behind methods that type single nucleotide variation. High-throughput sequencing can contribute to these methods by using sequence read depth, which takes the number of reads that map to a given part of the reference genome as a proxy for copy number of that region, and compares it across samples. Furthermore, high-throughput sequencing also provides information on the sequence differences between copies within and between individuals. METHODS In this study we use high-coverage phase 3 exome sequences of the 1000 Genomes project to infer diploid copy number of the beta-defensin genomic region, a well-studied CNV that carries several beta-defensin genes involved in the antimicrobial response, signalling, and fertility. We also use these data to call sequence variants, a particular challenge given the multicopy nature of the region. RESULTS We confidently call copy number and sequence variation of the beta-defensin genes on 1285 samples from 26 global populations, and validate copy number using Nanostring nCounter and triplex paralogue ratio test data. We use the copy number calls to verify the genomic extent of the CNV and validate sequence calls using analysis of cloned PCR products. We identify novel variation, mostly individually rare, predicted to alter amino-acid sequence in the beta-defensin genes. Such novel variants may alter antimicrobial properties or have off-target receptor interactions, and may contribute to individuality in immunological response and fertility. CONCLUSIONS Given that 81% of identified sequence variants were not previously in dbSNP, we show that sequence variation in multiallelic CNVs represents an unappreciated source of genomic diversity.
Affiliation(s)
- Diego Forni
- Department of Genetics, University of Leicester, Leicester, UK; Bioinformatics, Scientific Institute IRCCS E. MEDEA, Bosisio Parini, Italy
- Diana Martin
- Department of Genetics, University of Leicester, Leicester, UK
- Razan Abujaber
- Department of Genetics, University of Leicester, Leicester, UK
- Andrew J Sharp
- Department of Genetics and Genome Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Manuela Sironi
- Bioinformatics, Scientific Institute IRCCS E. MEDEA, Bosisio Parini, Italy
- Edward J Hollox
- Department of Genetics, University of Leicester, Leicester, UK
2
Fan Y, Zhang Y, Xu S, Kong N, Zhou Y, Ren Z, Deng Y, Lin L, Ren Y, Wang Q, Zi J, Wen B, Liu S. Insights from ENCODE on Missing Proteins: Why β-Defensin Expression Is Scarcely Detected. J Proteome Res 2015; 14:3635-44. [PMID: 26258396 DOI: 10.1021/acs.jproteome.5b00565] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 01/28/2023]
Abstract
β-Defensins (DEFBs) have a variety of functions, yet the majority of these proteins were not identified in a recent proteome survey. Neither protein detection nor analysis of RNA-seq transcriptomic data for three liver cancer cell lines identified any expression products, and an extensive investigation of DEFB transcripts in over 70 cell lines gave similar results. This raises the question: why are DEFB genes scarcely expressed? After examining DEFB gene annotation and the physicochemical properties of its protein products, we postulated that regulatory elements could play a key role in the resultant poor transcription of DEFB genes. Four regions containing DEFB genes and six adjacent regions on chromosomes 6, 8, and 20 were carefully investigated using The Encyclopedia of DNA Elements (ENCODE) information, such as that of DNase I hypersensitive sites (DHSs), transcription factors (TFs), and histone modifications. The results revealed that the intensities of these ENCODE features were globally weaker than those in the adjacent regions. Notably, DEFB-related regions on chromosomes 6 and 8 containing several non-DEFB genes had lower ENCODE feature intensities, indicating that the absence of DEFB mRNAs might not depend on the gene family but may instead depend on gene location and chromatin structure.
Affiliation(s)
- Yang Fan
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Yue Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Shaohang Xu
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Nannan Kong
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Yang Zhou
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Zhe Ren
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Yamei Deng
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Liang Lin
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Yan Ren
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Quanhui Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
- Jin Zi
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Bo Wen
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Siqi Liu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, No 1, Beichen West Road, Beijing 100101, China; BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China; Graduate University of the Chinese Academy of Sciences, 19A, Yuquan Road, Beijing 100049, China
3
Taudien S, Huse K, Groth M, Platzer M. Narrowing down the distal border of the copy number variable beta-defensin gene cluster on human 8p23. BMC Res Notes 2014; 7:93. [PMID: 24552181 PMCID: PMC3942070 DOI: 10.1186/1756-0500-7-93] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/07/2014] [Accepted: 02/10/2014] [Indexed: 12/18/2022] Open
Abstract
Background Copy number variation (CNV) in the range from 2 to 12 per diploid genome is an outstanding feature of the beta-defensin gene (DEFB) cluster on human chromosome 8p23.1, demonstrated repeatedly by different methods. So far, CNV has been proven for a 115 kb region between DEFB4 and 21 kb proximal of DEFB107, but the borders of the entire CNV repeat unit are still unknown. Our study aimed to narrow down the distal border of the DEFB cluster. Results We established tests for length polymorphisms based on amplification and capillary electrophoresis with laser-induced fluorescence (CE-LIF) analysis of seven insertion/deletion (indel) containing regions spread over the entire cluster. The tests were carried out with 25 genomic DNAs with different previously determined cluster copy numbers. CNV was demonstrated for six indels between ~1 kb distal of DEFB108P and 10 kb proximal of DEFB107. In contrast, the most distal indel is not affected by CNV. Conclusion Our analysis fixes the minimal length of proven CNV at 157 kb, including DEFB108P but excluding DEFB109P. The distal border between the CNV and non-CNV parts of the DEF cluster is located in the 59 kb interval chr8:7,171,082-7,230,128.
Affiliation(s)
- Stefan Taudien
- Leibniz Institute for Age Research - Fritz Lipmann Institute, Beutenbergstr. 11, D-07745 Jena, Germany
4
Barber JCK, Rodrigues R, Maloney VK, Taborda F, do C Rodrigues M, Bateman MS. Another family with a euchromatic duplication variant of 9q13-q21.1 derived from segmentally duplicated pericentromeric euchromatin. Cytogenet Genome Res 2013; 141:64-9. [PMID: 23651944 DOI: 10.1159/000350870] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 01/30/2013] [Indexed: 11/19/2022] Open
Abstract
Microscopically visible copy number variations within the proximal short arm heterochromatin and proximal long arm of chromosome 9 have been described as euchromatic variants (EVs) and are derived from extensive segmental duplications (SDs) that map to both the proximal short and long arms of chromosome 9. Recently, 3-4 additional copies of an SD cassette were found in 2 families with duplication EVs of 9q13-q21. Here, we report a third family with a duplication EV of 9q13-q21.1 that was ascertained at prenatal diagnosis for advanced maternal age and found in the fetus and her phenotypically normal mother. Dual-colour fluorescence in situ hybridization with bacterial artificial chromosomes RP11-246P17 and RP11-211E19 was consistent with the EV chromosome having 1-2 additional copies of a similar SD cassette, except that the SD-boundary clone RP11-88I18 was not apparently included. It is important to distinguish the 9q13-q21.1 EVs from possible pathogenic imbalances of chromosome 9, especially at prenatal diagnosis, as these EVs have no established phenotypic or reproductive consequences. The nature of the G-dark bands in 9q13-q21 EVs is briefly discussed.
Affiliation(s)
- J C K Barber
- Department of Human Genetics and Genomic Medicine, Faculty of Medicine, Southampton General Hospital, University of Southampton, Southampton, UK.
5
Taudien S, Gäbel G, Kuss O, Groth M, Grützmann R, Huse K, Kluttig A, Wolf A, Nothnagel M, Rosenstiel P, Greiser KH, Werdan K, Krawczak M, Pilarsky C, Platzer M. Association studies of the copy-number variable β-defensin cluster on 8p23.1 in adenocarcinoma and chronic pancreatitis. BMC Res Notes 2012; 5:629. [PMID: 23148552 PMCID: PMC3532138 DOI: 10.1186/1756-0500-5-629] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 06/29/2012] [Accepted: 11/07/2012] [Indexed: 12/20/2022] Open
Abstract
Background Human β-defensins are a family of antimicrobial peptides located at the mucosal surface. Both sequence multi-site variations (MSV) and copy-number variants (CNV) of the defensin-encoding genes are associated with increased risk for various diseases, including cancer and inflammatory conditions such as psoriasis and acute pancreatitis. In a case–control study, we investigated the association between MSV in DEFB104 as well as defensin gene (DEF) cluster copy number (CN), and pancreatic ductal adenocarcinoma (PDAC) and chronic pancreatitis (CP). Results Two groups of PDAC (N=70) and CP (N=60) patients were compared to matched healthy control groups CARLA1 (N=232) and CARLA2 (N=160), respectively. Four DEFB104 MSV were haplotyped by PCR, cloning and sequencing. DEF cluster CN was determined by multiplex ligation-dependent probe amplification. Neither the PDAC nor the CP cohort shows significant differences in the DEFB104 haplotype distribution compared to its respective control group, CARLA1 or CARLA2. The diploid DEF cluster CN exhibits a significantly different distribution between PDAC and CARLA1 (Fisher’s exact test P=0.027), but not between CP and CARLA2 (P=0.867). Conclusion The different DEF cluster CN distribution between PDAC patients and healthy controls indicates a potential protective effect of higher CNs against the disease.
Affiliation(s)
- Stefan Taudien
- Genome Analysis, Leibniz Institute for Age Research - Fritz Lipmann Institute, Beutenbergstr 11, D-07745, Jena, Germany.
7
Haas J, Katus HA, Meder B. Next-generation sequencing entering the clinical arena. Mol Cell Probes 2011; 25:206-11. [PMID: 21914469 DOI: 10.1016/j.mcp.2011.08.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 08/03/2011] [Revised: 08/29/2011] [Accepted: 08/29/2011] [Indexed: 10/17/2022]
Abstract
Over the last decade the genetic etiology of many heritable diseases has been resolved. For heart muscle diseases, so-called cardiomyopathies, mutations in more than 40 different genes have been identified. Due to this large genetic heterogeneity and the lack of adequate gene-diagnostic tools, most patients are not genetically characterized, although this would be important for individualized patient care. Currently, next-generation sequencing technologies are revolutionizing genetic and epigenetic research, since they are capable of producing billions of bases of sequence information in a single experiment. Accordingly, this powerful technology can now also open avenues for genetic diagnostics. The scope of this article is to illustrate technical approaches, clinical applications, and as yet unsolved problems of next-generation sequencing entering the clinical arena.
Affiliation(s)
- Jan Haas
- Department of Internal Medicine III, University of Heidelberg, Im Neuenheimer Feld 350, Heidelberg 69120, Germany