1
Ralli S, Vira T, Robles-Espinoza CD, Adams DJ, Brooks-Wilson AR. Variant ranking pipeline for complex familial disorders. Sci Rep 2024; 14:13599. [PMID: 38866901 PMCID: PMC11169219 DOI: 10.1038/s41598-024-64169-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 06/05/2024] [Indexed: 06/14/2024]
Abstract
Identifying genetic susceptibility factors for complex disorders remains a challenging task. To analyze collections of small and large pedigrees where genetic heterogeneity is likely, but biological commonalities are plausible, we have developed a weights-based pipeline to prioritize variants and genes. The Weights-based vAriant Ranking in Pedigrees (WARP) pipeline prioritizes variants using 5 weights: disease incidence rate, number of cases in a family, genome fraction shared amongst cases in a family, allele frequency and variant deleteriousness. Weights, except for the population allele frequency weight, are normalized between 0 and 1. Weights are combined multiplicatively to produce family-specific-variant weights that are then averaged across all families in which the variant is observed to generate a multifamily weight. Sorting multifamily weights in descending order creates a ranked list of variants and genes for further investigation. WARP was validated using familial melanoma sequence data from the European Genome-phenome Archive. The pipeline identified variation in known germline melanoma genes POT1, MITF and BAP1 in 4 out of 13 families (31%). Analysis of the other 9 families identified several interesting genes, some of which might have a role in melanoma. WARP provides an approach to identify disease predisposing genes in studies with small and large pedigrees.
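The combination step described in this abstract (multiply the weights within a family, average across the families in which the variant is observed, sort in descending order) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input layout and the toy numbers are hypothetical, and the abstract does not specify how each of the five weights is computed.

```python
from collections import defaultdict
from math import prod
from statistics import mean


def family_variant_weight(weights):
    """Combine the per-family weights for one variant multiplicatively."""
    return prod(weights)


def rank_variants(observations):
    """Rank variants by multifamily weight.

    observations: iterable of (variant_id, family_id, weights) tuples, where
    weights holds the five weights for that variant in that family
    (hypothetical layout, chosen for illustration only).
    Returns (variant_id, multifamily_weight) pairs, highest weight first.
    """
    per_variant = defaultdict(list)
    for variant_id, _family_id, weights in observations:
        per_variant[variant_id].append(family_variant_weight(weights))
    # Average only over the families in which the variant is observed.
    multifamily = {v: mean(ws) for v, ws in per_variant.items()}
    return sorted(multifamily.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy input: two variants observed across two families.
observations = [
    ("varA", "fam1", (0.5, 1.0, 0.5, 1.0, 1.0)),
    ("varA", "fam2", (0.5, 0.5, 1.0, 1.0, 1.0)),
    ("varB", "fam1", (1.0, 1.0, 0.5, 1.0, 1.0)),
]
ranking = rank_variants(observations)
```

In this toy input, `varB` ranks above `varA` because its single family-specific weight is larger than the average of the two family-specific weights for `varA`.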
Affiliation(s)
- Sneha Ralli
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Tariq Vira
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
- David J Adams
  - Experimental Cancer Genetics, Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK
- Angela R Brooks-Wilson
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
  - Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
2
Susgun S, Kasan K, Yucesan E. Gene Hunting Approaches through the Combination of Linkage Analysis with Whole-Exome Sequencing in Mendelian Diseases: From Darwin to the Present Day. Public Health Genomics 2022; 24:207-217. [PMID: 34237751 DOI: 10.1159/000517102] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/04/2021] [Accepted: 04/27/2021] [Indexed: 11/19/2022]
Abstract
BACKGROUND: In the context of medical genetics, gene hunting is the process of identifying and functionally characterizing genes or genetic variations that contribute to disease phenotypes. In this review, we summarize the gene hunting process in its historical context, from Darwin to the present day, and detail different approaches and recent developments. SUMMARY: Linkage analysis and association studies are the most common methods used to explain the genetic background of hereditary diseases and disorders. Although linkage analysis is a relatively old approach, it is still a powerful method to detect disease-causing rare variants using family-based data, particularly for consanguineous marriages. Consanguineous marriage, or endogamy, poses a social problem in developing countries; however, it also provides a unique opportunity for scientists to identify and characterize pathogenic variants. The rapid advancements in sequencing technologies and their parallel implementation together with linkage analyses now allow us to identify candidate variants related to diseases in a relatively short time. Furthermore, we can now go one step further and functionally characterize the causative variant through in vitro and in vivo studies and unveil the variant-phenotype relationships on a molecular level more robustly. KEY MESSAGES: We suggest that the combination of linkage analysis and exome analysis is a powerful and precise tool to diagnose clinically rare and recessively inherited conditions.
Affiliation(s)
- Seda Susgun
  - Department of Medical Biology, Faculty of Medicine, Bezmialem Vakif University, Istanbul, Turkey
  - Graduate School of Health Sciences, Istanbul University, Istanbul, Turkey
- Koray Kasan
  - Faculty of Medicine, Bezmialem Vakif University, Istanbul, Turkey
- Emrah Yucesan
  - Department of Medical Biology, Faculty of Medicine, Bezmialem Vakif University, Istanbul, Turkey
3
Seo H, Cho DH. Feature selection algorithm based on dual correlation filters for cancer-associated somatic variants. BMC Bioinformatics 2020; 21:486. [PMID: 33121438 PMCID: PMC7596964 DOI: 10.1186/s12859-020-03767-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2019] [Accepted: 09/18/2020] [Indexed: 12/30/2022]
Abstract
Background: Since the development of sequencing technology, an enormous amount of genetic information has been generated, and human cancer analysis using this information is drawing attention. As the effects of variants on human cancer become known, it is important to find cancer-associated variants among countless variants. Results: We propose a new filter-based feature selection method for extracting cancer-associated somatic variants that takes correlations in the data into account. Variants associated with both the activation and the deactivation of cancer’s characteristics are analyzed using dual correlation filters. Multiobjective optimization is used to consider the two types of variants simultaneously without redundancy. To overcome the high computational complexity, we calculate a correlation-based weight to select significant variants instead of directly searching for the optimal subset of variants. The proposed algorithm is applied to the identification of melanoma metastasis or breast cancer stage, and its classification results are compared with those of a conventional single correlation filter-based method. Conclusions: We verified that the proposed dual correlation filter-based method can extract cancer-associated variants related to the characteristics of human cancer.
Affiliation(s)
- Hyein Seo
  - School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
- Dong-Ho Cho
  - School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, 34141, Daejeon, Republic of Korea
4
Sarig O, Sprecher E. The Molecular Revolution in Cutaneous Biology: Era of Next-Generation Sequencing. J Invest Dermatol 2017; 137:e79-e82. [PMID: 28411851 DOI: 10.1016/j.jid.2016.02.818] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 09/16/2015] [Revised: 12/22/2015] [Accepted: 02/01/2016] [Indexed: 11/20/2022]
Abstract
Like any true conceptual revolution, next-generation sequencing (NGS) has not only radically changed research and clinical practice, it has also modified scientific culture. With the possibility to investigate the DNA content of any organism in any context, including in somatic disorders or in tissues carrying complex microbial populations, it initially seemed as if the genetic underpinning of any biological phenomenon could now be deciphered in an almost streamlined fashion. However, in recent years, we have once again come to understand that there is no such thing as great opportunities without great challenges. The steadily expanding use of NGS and related applications is now confronting biologists and physicians with novel technological obstacles, analytical hurdles and increasingly pressing ethical questions.
Affiliation(s)
- Ofer Sarig
  - Department of Dermatology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Eli Sprecher
  - Department of Dermatology, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Department of Human Molecular Genetics & Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
5
Garagnani P, Terlizzi R, Cevoli S, Capellari S, Pierangeli G, Pirazzini C, Bacalini MG, Franceschi C, Cortelli P. Genomics and epigenomics. J Headache Pain 2017; 16:A7. [PMID: 28132381 PMCID: PMC4759335 DOI: 10.1186/1129-2377-16-s1-a7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/04/2022]
Affiliation(s)
- Paolo Garagnani
  - Dipartimento di Medicina Specialistica Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy
- Rossana Terlizzi
  - IRCCS Istituto delle Scienze Neurologiche, Bologna, Italy
  - Dipartimento di Scienze Neurologiche, Università di Bologna, Bologna, Italy
- Sabina Cevoli
  - IRCCS Istituto delle Scienze Neurologiche, Bologna, Italy
  - Dipartimento di Scienze Neurologiche, Università di Bologna, Bologna, Italy
- Sabina Capellari
  - IRCCS Istituto delle Scienze Neurologiche, Bologna, Italy
  - Dipartimento di Scienze Neurologiche, Università di Bologna, Bologna, Italy
- Giulia Pierangeli
  - IRCCS Istituto delle Scienze Neurologiche, Bologna, Italy
  - Dipartimento di Scienze Neurologiche, Università di Bologna, Bologna, Italy
- Chiara Pirazzini
  - Dipartimento di Medicina Specialistica Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy
- Maria Giulia Bacalini
  - Dipartimento di Medicina Specialistica Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy
- Claudio Franceschi
  - Dipartimento di Medicina Specialistica Diagnostica e Sperimentale, Università di Bologna, Bologna, Italy
- Pietro Cortelli
  - IRCCS Istituto delle Scienze Neurologiche, Bologna, Italy
  - Dipartimento di Scienze Neurologiche, Università di Bologna, Bologna, Italy
6
Shihab HA, Rogers MF, Ferlaino M, Campbell C, Gaunt TR. GTB - an online genome tolerance browser. BMC Bioinformatics 2017; 18:20. [PMID: 28061747 PMCID: PMC5219737 DOI: 10.1186/s12859-016-1436-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/18/2016] [Accepted: 12/17/2016] [Indexed: 02/03/2023]
Abstract
Background: Accurate methods capable of predicting the impact of single nucleotide variants (SNVs) are assuming ever increasing importance. There exists a plethora of in silico algorithms designed to help identify and prioritize SNVs across the human genome for further investigation. However, no tool exists to visualize the predicted tolerance of the genome to mutation, or the similarities between these methods. Results: We present the Genome Tolerance Browser (GTB, http://gtb.biocompute.org.uk): an online genome browser for visualizing the predicted tolerance of the genome to mutation. The server summarizes several in silico prediction algorithms and conservation scores, including 13 genome-wide prediction algorithms and conservation scores, 12 non-synonymous prediction algorithms and four cancer-specific algorithms. Conclusion: The GTB enables users to visualize the similarities and differences between several prediction algorithms and to upload their own data as additional tracks, thereby facilitating the rapid identification of potential regions of interest.
Affiliation(s)
- Hashem A Shihab
  - MRC Integrative Epidemiology Unit (IEU), University of Bristol, Bristol, BS8 2BN, UK
- Mark F Rogers
  - Intelligent Systems Laboratory, University of Bristol, Bristol, BS8 1UB, UK
- Michael Ferlaino
  - Intelligent Systems Laboratory, University of Bristol, Bristol, BS8 1UB, UK
- Colin Campbell
  - Intelligent Systems Laboratory, University of Bristol, Bristol, BS8 1UB, UK
- Tom R Gaunt
  - MRC Integrative Epidemiology Unit (IEU), University of Bristol, Bristol, BS8 2BN, UK
7
Erzurumluoglu AM, Shihab HA, Rodriguez S, Gaunt TR, Day INM. Importance of Genetic Studies in Consanguineous Populations for the Characterization of Novel Human Gene Functions. Ann Hum Genet 2016; 80:187-96. [PMID: 27000383 PMCID: PMC4949565 DOI: 10.1111/ahg.12150] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 10/02/2015] [Revised: 12/14/2015] [Accepted: 12/21/2015] [Indexed: 01/04/2023]
Abstract
Consanguineous offspring have elevated levels of homozygosity. Autozygous stretches within their genome are likely to harbour loss of function (LoF) mutations which will lead to complete inactivation or dysfunction of genes. Studying consanguineous offspring with clinical phenotypes has been very useful for identifying disease causal mutations. However, at present, most of the genes in the human genome have no disorder associated with them or have unknown function. This is presumably mostly due to the fact that homozygous LoF variants are not observed in outbred populations which are the main focus of large sequencing projects. However, another reason may be that many genes in the genome, even when completely “knocked out,” do not cause a distinct or defined phenotype. Here, we discuss the benefits and implications of studying consanguineous populations, as opposed to the traditional approach of analysing a subset of consanguineous families or individuals with disease. We suggest that studying consanguineous populations “as a whole” can speed up the characterisation of novel gene functions as well as indicating nonessential genes and/or regions in the human genome. We also suggest designing a single nucleotide variant (SNV) array to make the process more efficient.
Affiliation(s)
- A Mesut Erzurumluoglu
  - Bristol Genetic Epidemiology Laboratories (BGEL), School of Social and Community Medicine, University of Bristol, Bristol, UK
  - Genetic Epidemiology Group, Department of Health Sciences, University of Leicester, Leicester, UK
- Hashem A Shihab
  - MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Bristol, UK
- Santiago Rodriguez
  - Bristol Genetic Epidemiology Laboratories (BGEL), School of Social and Community Medicine, University of Bristol, Bristol, UK
- Tom R Gaunt
  - Bristol Genetic Epidemiology Laboratories (BGEL), School of Social and Community Medicine, University of Bristol, Bristol, UK
  - MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Bristol, UK
- Ian N M Day
  - Bristol Genetic Epidemiology Laboratories (BGEL), School of Social and Community Medicine, University of Bristol, Bristol, UK