1
Liu Y, Wu X, Wang Y. An integrated approach for copy number variation discovery in parent-offspring trios. Brief Bioinform 2021; 22:6306464. [PMID: 34151932 DOI: 10.1093/bib/bbab230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Revised: 04/27/2021] [Accepted: 05/25/2021] [Indexed: 11/14/2022]
Abstract
Whole-genome sequencing (WGS) of parent-offspring trios has become widely used to identify causal copy number variations (CNVs) in rare and complex diseases. Existing CNV detection approaches usually do not make effective use of Mendelian inheritance in parent-offspring trios and yield low accuracy. In this study, we propose a novel integrated approach, TrioCNV2, for jointly detecting CNVs from WGS data of the parent-offspring trio. TrioCNV2 first makes use of the read depth and discordant read pairs to infer approximate locations of CNVs and then employs the split read and local de novo assembly approaches to refine the breakpoints. We use the real WGS data of two parent-offspring trios to demonstrate TrioCNV2's performance and compare it with other CNV detection approaches. The software TrioCNV2 is implemented using a combination of Java and R and is freely available from the website at https://github.com/yongzhuang/TrioCNV2.
Affiliation(s)
- Yongzhuang Liu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Xiaoliang Wu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
- Yadong Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
2
Malinga J, Mogeni P, Omedo I, Rockett K, Hubbart C, Jeffreys A, Williams TN, Kwiatkowski D, Bejon P, Ross A. Investigating the drivers of the spatio-temporal patterns of genetic differences between Plasmodium falciparum malaria infections in Kilifi County, Kenya. Sci Rep 2019; 9:19018. [PMID: 31836742 PMCID: PMC6911066 DOI: 10.1038/s41598-019-54348-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/11/2019] [Accepted: 11/12/2019] [Indexed: 01/17/2023]
Abstract
Knowledge of how malaria infections spread locally is important both for the design of targeted interventions aiming to interrupt malaria transmission and the design of trials to assess the interventions. A previous analysis of 1602 genotyped Plasmodium falciparum parasites in Kilifi, Kenya collected over 12 years found an interaction between time and geographic distance: the mean number of single nucleotide polymorphism (SNP) differences was lower for pairs of infections which were both a shorter time interval and shorter geographic distance apart. We determine whether the empiric pattern could be reproduced by a simple model, and what mean geographic distances between parent and offspring infections and hypotheses about genotype-specific immunity or a limit on the number of infections would be consistent with the data. We developed an individual-based stochastic simulation model of households, people and infections. We parameterized the model for the total number of infections, and population and household density observed in Kilifi. The acquisition of new infections, mutation, recombination, geographic location and clearance were included. We fit the model to the observed numbers of SNP differences between pairs of parasite genotypes. The patterns observed in the empiric data could be reproduced. Although we cannot rule out genotype-specific immunity or a limit on the number of infections per individual, they are not necessary to account for the observed patterns. The mean geographic distance between parent and offspring malaria infections for the base model was 0.4 km (95% CI 0.24, 1.20), for a distribution with 58% of distances shorter than the mean. Very short mean distances did not fit well, but mixtures of distributions were also consistent with the data. 
For a pathogen which undergoes meiosis in a setting with moderate transmission and a low coverage of infections, analytic methods are limited but an individual-based model can be used with genotyping data to estimate parameter values and investigate hypotheses about underlying processes.
Affiliation(s)
- Josephine Malinga
  - Swiss Tropical and Public Health Institute, Basel, Switzerland; University of Basel, Basel, Switzerland
- Polycarp Mogeni
  - Kenya Medical Research Institute-Wellcome Trust Research Programme, Kilifi, Kenya
- Irene Omedo
  - Kenya Medical Research Institute-Wellcome Trust Research Programme, Kilifi, Kenya
- Kirk Rockett
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Christina Hubbart
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Anne Jeffreys
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
- Thomas N Williams
  - Kenya Medical Research Institute-Wellcome Trust Research Programme, Kilifi, Kenya; Department of Medicine, South Kensington Campus, Imperial College London, London, UK
- Dominic Kwiatkowski
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Philip Bejon
  - Kenya Medical Research Institute-Wellcome Trust Research Programme, Kilifi, Kenya; Centre for Tropical Medicine & Global Health, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK
- Amanda Ross
  - Swiss Tropical and Public Health Institute, Basel, Switzerland; University of Basel, Basel, Switzerland
3
Cunha MLR, Meijers JCM, Middeldorp S. Introduction to the analysis of next generation sequencing data and its application to venous thromboembolism. Thromb Haemost 2015; 114:920-32. [PMID: 26446408 DOI: 10.1160/th15-05-0411] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 05/15/2015] [Accepted: 08/26/2015] [Indexed: 12/13/2022]
Abstract
Despite knowledge of various inherited risk factors associated with venous thromboembolism (VTE), no definite cause can be found in about 50% of patients. The application of data-driven searches such as GWAS has not been able to identify genetic variants with implications for clinical care, and unexplained heritability remains. In the past years, the development of several so-called next generation sequencing (NGS) platforms is offering the possibility of generating fast, inexpensive and accurate genomic information. However, so far their application to VTE has been very limited. Here we review basic concepts of NGS data analysis and explore the application of NGS technology to VTE. We provide both computational and biological viewpoints to discuss potentials and challenges of NGS-based studies.
Affiliation(s)
- Marisa L R Cunha
  - Department of Experimental Vascular Medicine, Academic Medical Center, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands. Tel.: +31 20 5662824; Fax: +31 20 6968833