1
Romdhane L, Kefi S, Mezzi N, Abassi N, Jmel H, Romdhane S, Shan J, Chouchane L, Abdelhak S. Ethnic and functional differentiation of copy number polymorphisms in Tunisian and HapMap population unveils insights on genome organizational plasticity. Sci Rep 2024; 14:4654. [PMID: 38409353 PMCID: PMC10897484 DOI: 10.1038/s41598-024-54749-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Accepted: 02/15/2024] [Indexed: 02/28/2024] Open
Abstract
Admixture mapping has been useful in identifying genetic variations linked to phenotypes, adaptation and diseases. Copy number variations (CNVs) represent genomic structural variants spanning large regions of chromosomes, reaching several megabases. In this investigation, the "Canary" algorithm was applied to 102 Tunisian samples and 991 individuals from eleven HapMap III populations to genotype 1279 copy number polymorphisms (CNPs). In the present work, we investigate the Tunisian population structure using the CNP markers previously identified among Tunisians. The study revealed that Sub-Saharan African populations exhibited the highest diversity, with the highest proportions of allelic CNPs. Among all the African populations, Tunisia showed the least diversity. Individual ancestry proportions computed using STRUCTURE analysis revealed a major European component among Tunisians, with a lesser contribution from Sub-Saharan Africa and Asia. Population structure analysis indicated genetic proximity to Europeans and a noticeable distance from the Sub-Saharan African and East Asian clusters. Seven genes harbouring high-frequency Tunisian CNPs were identified that are known to be associated with 9 Mendelian diseases and/or phenotypes. Functional annotation of genes under selection highlighted a noteworthy enrichment of biological processes related to receptor pathway and activity as well as glutathione metabolism. Additionally, pathways of potential concern for health, such as drug metabolism, infectious diseases and cancers, exhibited significant enrichment. The distinctive genetic makeup of the Tunisians might have been influenced by various factors, including natural selection and genetic drift, resulting in the development of distinct genetic variations playing roles in specific biological processes. Our research provides a justification for focusing on the exclusive genome organization of this population and uncovers previously overlooked elements of the genome.
Affiliation(s)
- Lilia Romdhane
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia.
  - Department of Biology, Faculty of Sciences of Bizerte, University of Carthage, Zarzouna, Tunisia.
- Sameh Kefi
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Nessrine Mezzi
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Najla Abassi
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Haifa Jmel
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Safa Romdhane
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Jingxuan Shan
  - Laboratory of Genetic Medicine and Immunology, Weill Cornell Medicine-Qatar, Education City-Qatar Foundation, Doha, Qatar
  - Department of Genetic Medicine, Weill Cornell Medicine, New York, NY, USA
  - Genetic Intelligence Laboratory, Weill Cornell Medicine in Qatar, Education City, Qatar Foundation, Doha, Qatar
- Lotfi Chouchane
  - Laboratory of Genetic Medicine and Immunology, Weill Cornell Medicine-Qatar, Education City-Qatar Foundation, Doha, Qatar
  - Department of Genetic Medicine, Weill Cornell Medicine, New York, NY, USA
  - Genetic Intelligence Laboratory, Weill Cornell Medicine in Qatar, Education City, Qatar Foundation, Doha, Qatar
- Sonia Abdelhak
  - Genomics and Oncogenetics Laboratory (LR16IPT05), Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
2
Zhang T, Dong J, Jiang H, Zhao Z, Zhou M, Yuan T. CNV-PCC: An efficient method for detecting copy number variations from next-generation sequencing data. Front Bioeng Biotechnol 2022; 10:1000638. [DOI: 10.3389/fbioe.2022.1000638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
Copy number variations (CNVs) significantly influence the diversity of the human genome and the occurrence of many complex diseases. Next-generation sequencing (NGS) technology provides rich data for detecting CNVs, and the read depth (RD)-based approach is widely used. However, low CN (copy number of 3–4) duplication events are challenging to identify with existing methods, especially when the CNVs are small. In addition, the RD-based approach can only obtain rough breakpoints. We propose a new method, CNV-PCC (detection of CNVs based on Principal Component Classifier), to identify CNVs in whole genome sequencing data. CNV-PCC first uses the split read signal to search for potential breakpoints. A two-stage segmentation strategy is then implemented to enhance the identification of low CN duplications and small CNVs. Next, outlier scores are calculated for each segment by PCC (Principal Component Classifier). Finally, the OTSU algorithm calculates the threshold used to determine the CNV regions. Analysis of simulated data indicates that CNV-PCC outperforms the other methods in sensitivity and F1-score and improves breakpoint accuracy. Furthermore, CNV-PCC shows high consistency with other methods on real sequencing samples. This study demonstrates that CNV-PCC is an effective method for detecting CNVs, even for low CN duplications and small CNVs.
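The final thresholding step described in the abstract can be illustrated with a generic Otsu threshold over one-dimensional outlier scores. This is only a minimal sketch of the standard Otsu criterion (maximizing between-class variance), not the authors' implementation; the function name `otsu_threshold` and the example scores are hypothetical.

```python
def otsu_threshold(scores):
    """Return the cut that maximizes between-class variance (Otsu's criterion).

    Assumption: `scores` are per-segment outlier scores, higher = more anomalous.
    """
    xs = sorted(scores)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two scores")
    total = sum(xs)
    best_var, best_t = -1.0, xs[0]
    cum = 0.0
    for i in range(1, n):
        cum += xs[i - 1]
        if xs[i] == xs[i - 1]:
            continue  # no cut can be placed between equal values
        w0, w1 = i / n, (n - i) / n          # class weights
        mu0 = cum / i                        # mean of the lower class
        mu1 = (total - cum) / (n - i)        # mean of the upper class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = (xs[i - 1] + xs[i]) / 2  # midpoint between neighbours
    return best_t


# Hypothetical outlier scores for six segments; the last two stand out.
scores = [0.1, 0.2, 0.15, 0.12, 5.0, 5.2]
threshold = otsu_threshold(scores)
flagged = [i for i, s in enumerate(scores) if s > threshold]
```

Segments whose score exceeds the threshold would be the candidate CNV regions under this sketch.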
3
Agarwala S, Veerappa AM, Ramachandra NB. Identification of primary copy number variations reveal enrichment of Calcium, and MAPK pathways sensitizing secondary sites for autism. Egyptian Journal of Medical Human Genetics 2020. [DOI: 10.1186/s43042-020-00091-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
Background
Autism is a neurodevelopmental condition with genetic heterogeneity. It is characterized by difficulties in reciprocal social interactions with strong repetitive behaviors and stereotyped interests. Copy number variations (CNVs) are genomic structural variations altering the genomic structure either by duplication or deletion. De novo or inherited CNVs are found in 5–10% of autistic subjects, with sizes ranging from a few kilobases to several megabases. CNVs predispose humans to various diseases by altering gene regulation, generating chimeric genes, and disrupting the coding region, or through position effect. However, CNVs are not the initiating event in pathogenesis; additional preceding mutations might be essential for disease manifestation. The present study aims to identify the primary CNVs responsible for autism susceptibility in healthy cohorts to sensitize secondary-hits. In the current investigation, primary-hit autism gene CNVs are characterized in 1715 healthy cohorts of varying ethnicities across 12 populations using an Affymetrix high-resolution array study. Thirty-eight individuals from twelve families residing in Karnataka, India, aged 13–73 years, are included for the comparative CNV analysis. The findings are validated against 179 global autism whole-exome sequence datasets derived from the Simons Simplex Collection. These datasets are deposited in the Simons Foundation Autism Research Initiative (SFARI) database.
Results
The study revealed that 34.8% of the subjects carried 2% primary-hit CNV burden with 73 singleton-autism genes in different clusters. Of these, three conserved CNV breakpoints were identified with ARHGAP11B, DUSP22, and CHRNA7 as the target genes across 12 populations. Enrichment analysis of the population-specific autism genes revealed two signaling pathways—calcium and mitogen-activated protein kinases (MAPK) in the CNV identified regions. These impaired pathways affected the downstream cascades of neuronal function and physiology, leading to autism behavior. The pathway analysis of enriched genes unravelled complex protein interaction networks, which sensitized secondary sites for autism. Further, the identification of miRNA targets associated with autism gene CNVs added severity to the condition.
Conclusion
These findings contribute to an atlas of primary-hit genes to detect autism susceptibility in healthy cohorts, indicating their impact on secondary sites for manifestation.
4
Kong X, Li C, Wang P, Huang G, Li Z, Han Z. Soil Pollution Characteristics and Microbial Responses in a Vertical Profile with Long-Term Tannery Sludge Contamination in Hebei, China. International Journal of Environmental Research and Public Health 2019; 16:E563. [PMID: 30781422 PMCID: PMC6407015 DOI: 10.3390/ijerph16040563] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/17/2019] [Revised: 02/09/2019] [Accepted: 02/09/2019] [Indexed: 11/20/2022]
Abstract
An investigation was made into the effects of tannery sludge on soil chemical properties and microbial communities in a typical soil profile with long-term tannery sludge contamination in North China. The results showed that trivalent chromium (Cr(III)), ammonium, organic nitrogen, salinity and sulfide were the predominant contaminants in tannery sludge. Although the tannery sludge contained high chromium (Cr, 30,970 mg/kg), the mobile Cr forms (exchangeable plus carbonate-bound fraction) accounted for only 1.32%. The X-ray diffraction and X-ray photoelectron spectroscopy results further demonstrated that the Cr existed in a stable state of oxides and iron oxides. The alkaline loam soil had a significant retardation effect on the migration of salinity, ammonium, Cr(III) and sulfide, and the accumulation of these contaminants occurred in soils (0–40 cm). A good correlation (R² = 0.959) was observed between total organic carbon (TOC) and Cr(III) in the soil profile, indicating that the dissolved organic matter from sludge leachate promoted the vertical mobility of Cr(III) by forming Cr(III)-organic complexes. The halotolerant bacteria (Halomonas and Tepidimicrobium) and organic degrading bacteria (Flavobacteriaceae, Tepidimicrobium and Balneola) became the dominant microflora in the soil profile. High contents of salinity, Cr and nitrogen were the main environmental factors affecting the abundance of indigenous microorganisms in soils.
Affiliation(s)
- Xiangke Kong
  - Institute of Hydrogeology & Environmental Geology, Chinese Academy of Geological Sciences, Shijiazhuang, 050061, China.
  - Hebei and China Geological Survey Key Laboratory of Groundwater Remediation, Institute of Hydrogeology & Environmental Geology, Shijiazhuang, 050061, China.
- Chunhui Li
  - School of Earth Science and Engineering, North China University of Water Resources and Electric Power, Zhengzhou, 450046, China.
- Ping Wang
  - Institute of Hydrogeology & Environmental Geology, Chinese Academy of Geological Sciences, Shijiazhuang, 050061, China.
  - Hebei and China Geological Survey Key Laboratory of Groundwater Remediation, Institute of Hydrogeology & Environmental Geology, Shijiazhuang, 050061, China.
- Guoxin Huang
  - Chinese Academy for Environmental Planning, Beijing 100012, China.
- Zhitao Li
  - Chinese Academy for Environmental Planning, Beijing 100012, China.
- Zhantao Han
  - Institute of Hydrogeology & Environmental Geology, Chinese Academy of Geological Sciences, Shijiazhuang, 050061, China.
  - Hebei and China Geological Survey Key Laboratory of Groundwater Remediation, Institute of Hydrogeology & Environmental Geology, Shijiazhuang, 050061, China.
5
Malekpour SA, Pezeshk H, Sadeghi M. MSeq-CNV: accurate detection of Copy Number Variation from Sequencing of Multiple samples. Sci Rep 2018; 8:4009. [PMID: 29507384 PMCID: PMC5838159 DOI: 10.1038/s41598-018-22323-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/12/2017] [Accepted: 02/16/2018] [Indexed: 01/23/2023] Open
Abstract
Currently, only a few tools are capable of detecting genome-wide Copy Number Variations (CNVs) based on sequencing of multiple samples. Although aberrations in mate pair insertion sizes provide additional hints for CNV detection based on multiple samples, the majority of the current tools rely only on the depth of coverage. Here, we propose a new algorithm (MSeq-CNV) which allows detecting common CNVs across multiple samples. MSeq-CNV applies a mixture density for modeling aberrations in depth of coverage and abnormalities in the mate pair insertion sizes. Each component in this mixture density applies a Binomial distribution for modeling the number of mate pairs with aberration in the insertion size and also a Poisson distribution for emitting the read counts, in each genomic position. MSeq-CNV is applied to simulated data and also to real data of six HapMap individuals with high-coverage sequencing in the 1000 Genomes Project. These individuals include a CEU trio of European ancestry and a YRI trio of Nigerian ethnicity. Ancestry of these individuals is studied by clustering the identified CNVs. MSeq-CNV is also applied for detecting CNVs in two samples with low-coverage sequencing in the 1000 Genomes Project and six samples from the Simons Genome Diversity Project.
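The mixture described in the abstract, with a Binomial term for aberrant mate pairs and a Poisson term for read counts in each component, can be sketched as a posterior over copy-number states at one genomic position. This is an illustrative sketch only, not the MSeq-CNV code: treating the two signals as independent within a component, the function names, and all parameter values below are assumptions.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing k reads under Poisson(lam), lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def binom_logpmf(k, n, p):
    """Log-probability of k aberrant pairs out of n under Binomial(n, p), 0 < p < 1."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def component_posteriors(read_count, n_pairs, n_aberrant, components):
    """Posterior weight of each mixture component for one position.

    `components` is a list of (weight, poisson_mean, aberrant_pair_prob).
    Assumption: the two signals are independent given the component.
    """
    logs = [
        math.log(w) + poisson_logpmf(read_count, lam) + binom_logpmf(n_aberrant, n_pairs, p)
        for w, lam, p in components
    ]
    m = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logs]
    z = sum(exps)
    return [e / z for e in exps]


# Hypothetical components: deletion, normal, duplication.
components = [(0.1, 15.0, 0.4), (0.8, 30.0, 0.02), (0.1, 45.0, 0.4)]
# Hypothetical observation: low depth and many aberrant mate pairs.
post = component_posteriors(read_count=14, n_pairs=20, n_aberrant=9, components=components)
```

With these made-up numbers, the first (deletion) component receives the largest posterior weight, since both the low read count and the high fraction of aberrant pairs favour it.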
Affiliation(s)
- Seyed Amir Malekpour
  - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Hamid Pezeshk
  - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran.
  - School of Biological Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran.
  - Department of Mathematics and Statistics, Concordia University, Montreal, Canada.
- Mehdi Sadeghi
  - National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
6
Suresh RV, Lingaiah K, Veerappa AM, Ramachandra NB. Identifying the risk of producing aneuploids using meiotic recombination genes as biomarkers: A copy number variation approach. Indian J Med Res 2017; 145:39-50. [PMID: 28574013 PMCID: PMC5460571 DOI: 10.4103/ijmr.ijmr_965_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022] Open
Abstract
Background & objectives: Aneuploids are the most common chromosomal abnormality in liveborns and are usually the result of non-disjunction (NDJ) in meiosis. Copy number variations (CNVs) are large structural variations affecting the human genome. CNVs influence critical genes involved in causing NDJ by altering their copy number, which affects the clinical outcome. In this study, the influence of CNVs on critical meiotic recombination was examined using new computational technologies to assess their role in causing aneuploidy. Methods: This investigation was based on the analysis of 12 random normal populations consisting of 1714 individuals for aneuploidy-causing genes under CNV effect. To examine the effect of CNVs on genes causing aneuploidy, meiotic recombination genes were analyzed using EnrichR, WebGestalt and Ingenuity Pathway Analysis (IPA). Results: Forty-three NDJ genes were found under CNV burden; IPA and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis of CNVs in meiotic recombination genes revealed a significant role of breast cancer gene 1, amyloid protein precursor, mitogen-activated protein kinase and nerve growth factor as key molecular players involved in causing aneuploidy. Interaction of these genes with other CNV-overlapping genes involved in cell cycle, recombination and meiosis might lead to increased incidence of aneuploidy. Interpretation & conclusions: The findings of this study implied that the effect of CNVs on the normal genome contributed to amplifying the occurrence of chromosomal aneuploidies. Normal individuals carrying variations in the susceptible genes causing aneuploids remain undetected in the population until the disorder genes are expressed in succeeding generations.
Affiliation(s)
- Raviraj V Suresh
  - Department of Studies in Genetics & Genomics, University of Mysore, Mysuru, India
- Kusuma Lingaiah
  - Department of Studies in Genetics & Genomics, University of Mysore, Mysuru, India
- Avinash M Veerappa
  - Department of Studies in Genetics & Genomics, University of Mysore, Mysuru, India
- Nallur B Ramachandra
  - Department of Studies in Genetics & Genomics, University of Mysore, Mysuru, India
7
Murthy MN, Veerappa AM, Seshachalam KB, Ramachandra NB. High-resolution arrays reveal burden of copy number variations on Parkinson disease genes associated with increased disease risk in random cohorts. Neurol Res 2016; 38:775-85. [PMID: 27399248 DOI: 10.1080/01616412.2016.1204105] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/21/2022]
Abstract
BACKGROUND: Parkinson disease (PD) is a neurological disease responsible for a considerable rate of mortality and morbidity in society. Since the symptoms of the disease appear much later than the actual onset of neuron degeneration, a majority of cases remain undiagnosed until the manifestation of the symptoms. OBJECTIVES: In order to investigate the existence of such susceptibility in the population, we analyzed Copy Number Variation (CNV) influences on PD genes in 1715 individuals from 12 different populations. RESULTS: Overall, 16 CNV-PD genes, 3 known to be causal and 13 associated, were found to be significantly enriched. PARK2 was under heavy burden, with ~1% of the population containing CNVs in the exonic region. The impact of these genes on the genome and disease pathway was analyzed using several genome analysis tools. The protein interaction network of CNV-PD genes revealed a complex interaction of molecules forming a major hub around α-Synuclein, whose direct interactors LRRK2, PARK2 and ATP13A2 are under CNV influence. CONCLUSIONS: We hypothesize that CNVs may not be the initiating event in the pathogenesis of PD and remain latent until additional secondary hits are acquired, and we also propose novel genes that may fall under the PD pathway and contribute to pathogenesis.
Affiliation(s)
- Megha N Murthy
  - Genetics and Genomics Lab, Department of Genetics and Genomics, University of Mysore, Mysore, India
- Avinash M Veerappa
  - Genetics and Genomics Lab, Department of Genetics and Genomics, University of Mysore, Mysore, India
- Nallur B Ramachandra
  - Genetics and Genomics Lab, Department of Genetics and Genomics, University of Mysore, Mysore, India