1
Guan Z, Parmigiani G, Braun D, Trippa L. Prediction of hereditary cancers using neural networks. Ann Appl Stat 2022; 16:495-520. [PMID: 37873507 PMCID: PMC10593124 DOI: 10.1214/21-aoas1510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Family history is a major risk factor for many types of cancer. Mendelian risk prediction models translate family histories into cancer risk predictions, based on knowledge of cancer susceptibility genes. These models are widely used in clinical practice to help identify high-risk individuals. Mendelian models leverage the entire family history, but they rely on many assumptions about cancer susceptibility genes that are either unrealistic or challenging to validate, due to low mutation prevalence. Training more flexible models, such as neural networks, on large databases of pedigrees can potentially lead to accuracy gains. In this paper we develop a framework to apply neural networks to family history data and investigate their ability to learn inherited susceptibility to cancer. While there is an extensive literature on neural networks and their state-of-the-art performance in many tasks, there is little work applying them to family history data. We propose adaptations of fully-connected neural networks and convolutional neural networks to pedigrees. In data simulated under Mendelian inheritance, we demonstrate that our proposed neural network models are able to achieve nearly optimal prediction performance. Moreover, when the observed family history includes misreported cancer diagnoses, neural networks are able to outperform the Mendelian BRCAPRO model embedding the correct inheritance laws. Using a large dataset of over 200,000 family histories, the Risk Service cohort, we train prediction models for future risk of breast cancer. We validate the models using data from the Cancer Genetics Network.
Affiliation(s)
- Zoe Guan
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center
- Danielle Braun
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health
- Lorenzo Trippa
  - Department of Data Sciences, Dana-Farber Cancer Institute
2
Blunk I, Thomsen H, Reinsch N, Mayer M, Försti A, Sundquist J, Sundquist K, Hemminki K. Genomic imprinting analyses identify maternal effects as a cause of phenotypic variability in type 1 diabetes and rheumatoid arthritis. Sci Rep 2020; 10:11562. [PMID: 32665606 PMCID: PMC7360775 DOI: 10.1038/s41598-020-68212-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/11/2020] [Accepted: 06/18/2020] [Indexed: 02/08/2023] Open
Abstract
Imprinted genes, giving rise to parent-of-origin effects (POEs), have been hypothesised to affect type 1 diabetes (T1D) and rheumatoid arthritis (RA). However, maternal effects may also play a role. By using a mixed model that is able to simultaneously consider all kinds of POEs, the importance of POEs for the development of T1D and RA was investigated in a variance components analysis. The analysis was based on Swedish population-scale pedigree data. With P = 0.18 (T1D) and P = 0.26 (RA) imprinting variances were not significant. Explaining up to 19.00% (± 2.00%) and 15.00% (± 6.00%) of the phenotypic variance, the maternal environmental variance was significant for T1D (P = 1.60 × 10⁻²⁴) and for RA (P = 0.02). For the first time, the existence of maternal genetic effects on RA was indicated, contributing up to 16.00% (± 3.00%) of the total variance. Environmental factors such as the social economic index, the number of offspring, birth year as well as their interactions with sex showed large effects.
Affiliation(s)
- Inga Blunk
  - Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196, Dummerstorf, Germany
- Hauke Thomsen
  - Division of Molecular Genetic Epidemiology, German Cancer Research Centre (DKFZ), Heidelberg, Germany
  - GeneWerk GmbH, Heidelberg, Germany
- Norbert Reinsch
  - Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196, Dummerstorf, Germany
- Manfred Mayer
  - Institute of Genetics and Biometry, Leibniz Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196, Dummerstorf, Germany
- Asta Försti
  - Division of Molecular Genetic Epidemiology, German Cancer Research Centre (DKFZ), Heidelberg, Germany
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
  - Hopp Children's Cancer Center (KiTZ), Heidelberg, Germany
  - Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
- Jan Sundquist
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
  - Department of Family Medicine and Community Health, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, USA
  - Center for Community-Based Healthcare Research and Education (CoHRE), Department of Functional Pathology, School of Medicine, Shimane University, Izumo, Japan
- Kristina Sundquist
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
  - Department of Family Medicine and Community Health, Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, USA
  - Center for Community-Based Healthcare Research and Education (CoHRE), Department of Functional Pathology, School of Medicine, Shimane University, Izumo, Japan
- Kari Hemminki
  - Division of Molecular Genetic Epidemiology, German Cancer Research Centre (DKFZ), Heidelberg, Germany
  - Center for Primary Health Care Research, Lund University, Malmö, Sweden
  - Faculty of Medicine and Biomedical Center in Pilsen, Charles University in Prague, Pilsen, Czech Republic
3
Mining archival genealogy databases to gain new insights into broader historical issues. Digital Library Perspectives 2019. [DOI: 10.1108/dlp-07-2019-0025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/28/2023]
Abstract
Purpose
Several genealogical databases are now publicly available on the Web. The information stored in such databases is not only of interest for genealogical research but might also be used in broader historical studies. As a case study, this paper aims to explore what a crowdsourced genealogical online database can tell about income inequality in Denmark during the First World War.
Design/methodology/approach
The analysis is based on 55,000 family-level records on the payment of local income taxes in a major Danish provincial town (Esbjerg) from a publicly available database on the website of The Esbjerg City Archives combined with official statistics from Statistics Denmark.
Findings
Denmark saw a sharp increase in income inequality during the First World War. The analysis shows that the new riches during the First World War in a harbour city such as Esbjerg were not “goulash barons” or stock-market speculators but fishermen. There were no fishermen in the top 1 per cent of the income distribution in 1913. In 1917, more than 37 per cent of the family heads in this part of the income distribution were fishermen.
Originality/value
The paper illustrates how large-scale microdata from publicly available genealogical Web databases might be used to gain new insights into broader historical issues.
4
Nelson D, Moreau C, de Vriendt M, Zeng Y, Preuss C, Vézina H, Milot E, Andelfinger G, Labuda D, Gravel S. Inferring Transmission Histories of Rare Alleles in Population-Scale Genealogies. Am J Hum Genet 2018; 103:893-906. [PMID: 30526866 DOI: 10.1016/j.ajhg.2018.10.017] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/08/2018] [Accepted: 10/22/2018] [Indexed: 01/06/2023] Open
Abstract
Learning the transmission history of alleles through a family or population plays an important role in evolutionary, demographic, and medical genetic studies. Most classical models of population genetics have attempted to do so under the assumption that the genealogy of a population is unavailable and that its idiosyncrasies can be described by a small number of parameters describing population size and mate choice dynamics. Large genetic samples have increased sensitivity to such modeling assumptions, and large-scale genealogical datasets become a useful tool to investigate realistic genealogies. However, analyses in such large datasets are often intractable using conventional methods. We present an efficient method to infer transmission paths of rare alleles through population-scale genealogies. Based on backward-time Monte Carlo simulations of genetic inheritance, we use an importance sampling scheme to dramatically speed up convergence. The approach can take advantage of available genotypes of subsets of individuals in the genealogy including haplotype structure as well as information about the mode of inheritance and general prevalence of a mutation or disease in the population. Using a high-quality genealogical dataset of more than three million married individuals in the Quebec founder population, we apply the method to reconstruct the transmission history of chronic atrial and intestinal dysrhythmia (CAID), a rare recessive disease. We identify the most likely early carriers of the mutation and geographically map the expected carrier rate in the present-day French-Canadian population of Quebec.
5
Stefansdottir V, Skirton H, Johannsson OT, Olafsdottir H, Olafsdottir GH, Tryggvadottir L, Jonsson JJ. Electronically ascertained extended pedigrees in breast cancer genetic counseling. Fam Cancer 2018; 18:153-160. [PMID: 30251169 DOI: 10.1007/s10689-018-0105-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022]
Abstract
A comprehensive pedigree, usually provided by the counselee and verified by medical records, is essential for risk assessment in cancer genetic counseling. Collecting the relevant information is time-consuming and sometimes impossible. We studied the use of electronically ascertained pedigrees (EGPs). The study group comprised women (n = 1352) receiving HBOC genetic counseling between December 2006 and December 2016 at Landspitali in Iceland. EGPs were ascertained using information from the population-based Genealogy Database and Icelandic Cancer Registry. The likelihood of being positive for the Icelandic founder BRCA2 pathogenic variant NM_000059.3:c.767_771delCAAAT was calculated using the risk assessment program Boadicea. We used these unique data to estimate the optimal size of pedigrees, i.e., those that best balance the accuracy of risk assessment using Boadicea and the cost of ascertainment. Subgroups of 104 women positive and 105 women negative for the founder BRCA2 PV were randomly selected, and Receiver Operating Characteristic curves were compared for efficiency of PV prediction with a Boadicea score. The optimal pedigree size included 3° relatives or up to five generations, with an average of 53.8 individuals (range 9-220) (AUC 0.801). Adding 4° relatives did not improve the outcome. Pedigrees including 3° relatives are difficult and sometimes impossible to generate with conventional methods. Pedigrees ascertained with data from pre-existing genealogy databases and cancer registries can save effort and contain more information than traditional pedigrees. Genetic services should consider generating EGPs, which requires access to an accurate genealogy database and cancer registry. Local data protection laws and regulations have to be addressed.
Affiliation(s)
- V Stefansdottir
  - Department of Genetics and Molecular Medicine, Landspitali - National University Hospital, Hringbraut, 101, Reykjavik, Iceland
  - Department of Biochemistry and Molecular Biology, Univ. of Iceland, Reykjavik, Iceland
- H Skirton
  - Faculty of Health and Human Sciences, Plymouth University, Plymouth, UK
- O Th Johannsson
  - Department of Medical Oncology, Landspitali - National University Hospital, Reykjavik, Iceland
- H Olafsdottir
  - Department of Genetics and Molecular Medicine, Landspitali - National University Hospital, Hringbraut, 101, Reykjavik, Iceland
- G H Olafsdottir
  - Icelandic Cancer Registry, Icelandic Cancer Society, Reykjavik, Iceland
- L Tryggvadottir
  - Icelandic Cancer Registry, Icelandic Cancer Society, Reykjavik, Iceland
  - Faculty of Medicine, Univ. of Iceland, Reykjavik, Iceland
- J J Jonsson
  - Department of Genetics and Molecular Medicine, Landspitali - National University Hospital, Hringbraut, 101, Reykjavik, Iceland
  - Department of Biochemistry and Molecular Biology, Univ. of Iceland, Reykjavik, Iceland
  - Genetical Committee of the University of Iceland, Reykjavik, Iceland
6
Kaplanis J, Gordon A, Shor T, Weissbrod O, Geiger D, Wahl M, Gershovits M, Markus B, Sheikh M, Gymrek M, Bhatia G, MacArthur DG, Price AL, Erlich Y. Quantitative analysis of population-scale family trees with millions of relatives. Science 2018; 360:171-175. [PMID: 29496957 PMCID: PMC6593158 DOI: 10.1126/science.aam9309] [Citation(s) in RCA: 106] [Impact Index Per Article: 17.7] [Received: 02/07/2017] [Revised: 11/02/2017] [Accepted: 02/07/2018] [Indexed: 12/12/2022]
Abstract
Family trees have vast applications in fields as diverse as genetics, anthropology, and economics. However, the collection of extended family trees is tedious and usually relies on resources with limited geographical scope and complex data usage restrictions. We collected 86 million profiles from publicly available online data shared by genealogy enthusiasts. After extensive cleaning and validation, we obtained population-scale family trees, including a single pedigree of 13 million individuals. We leveraged the data to partition the genetic architecture of human longevity and to provide insights into the geographical dispersion of families. We also report a simple digital procedure to overlay other data sets with our resource.
Affiliation(s)
- Joanna Kaplanis
  - New York Genome Center, New York, NY 10013, USA
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Assaf Gordon
  - New York Genome Center, New York, NY 10013, USA
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Tal Shor
  - MyHeritage, Or Yehuda 6037606, Israel
  - Computer Science Department, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Omer Weissbrod
  - Computer Science Department, Weizmann Institute of Science, Rehovot 7610001, Israel
- Dan Geiger
  - Computer Science Department, Technion-Israel Institute of Technology, Haifa 3200003, Israel
- Mary Wahl
  - New York Genome Center, New York, NY 10013, USA
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Barak Markus
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Mona Sheikh
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
- Melissa Gymrek
  - New York Genome Center, New York, NY 10013, USA
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - Harvard Medical School, Boston, MA 02115, USA
  - Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA 02142, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
- Gaurav Bhatia
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
- Daniel G MacArthur
  - Harvard Medical School, Boston, MA 02115, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alkes L Price
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
  - Department of Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA
- Yaniv Erlich
  - New York Genome Center, New York, NY 10013, USA
  - Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
  - MyHeritage, Or Yehuda 6037606, Israel
  - Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA
  - Center for Computational Biology and Bioinformatics, Department of Systems Biology, Columbia University, New York, NY, USA
7
Kirkpatrick BE, Rashkin MD. Ancestry Testing and the Practice of Genetic Counseling. J Genet Couns 2016; 26:6-20. [DOI: 10.1007/s10897-016-0014-2] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 04/11/2016] [Accepted: 08/30/2016] [Indexed: 12/20/2022]
8
Stefansdottir V, Johannsson OT, Skirton H, Jonsson JJ. Counsellee's experience of cancer genetic counselling with pedigrees that automatically incorporate genealogical and cancer database information. J Community Genet 2016; 7:229-35. [PMID: 27372834 DOI: 10.1007/s12687-016-0271-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/05/2015] [Accepted: 06/20/2016] [Indexed: 11/24/2022] Open
Abstract
While pedigree drawing software is often utilised in genetic services, the use of genealogical databases in genetic counselling is unusual. This is mainly because of the unavailability of such databases in most countries. In Iceland, the electronically generated pedigrees used for cancer genetic counselling automatically incorporate information from a large, comprehensive genealogy database and a nation-wide cancer registry. The aim of this descriptive qualitative study was to explore counsellees' experiences of genetic services, including family history taking, using these electronically generated pedigrees. Four online focus groups with 19 participants were formed, using an asynchronous posting method. Participants were encouraged to discuss their responses to questions posted on the website by the researcher. The main themes arising were motivation, information and trust, impact of testing and emotional responses. Most of the participants expressed trust in the method of using electronically generated pedigrees, although some voiced worries about information safety. Many experienced worry and anxiety while waiting for results of genetic testing, but limited survival guilt was noted. Family communication was either unchanged or improved following genetic counselling. The use of electronically generated pedigrees was well received by participants, and they trusted the information obtained via the databases. Age did not seem to influence responses. These results may be indicative of the particular culture in Iceland, where genealogical information is well known and freely shared. Further studies are needed to determine whether use of similar approaches to genealogical information gathering may be acceptable elsewhere.
Affiliation(s)
- Vigdis Stefansdottir
  - Department of Genetics and Molecular Medicine, Landspitali-The National University Hospital of Iceland, 101, Reykjavík, Iceland
  - Department of Biochemistry and Molecular Biology, University of Iceland, 101, Reykjavík, Iceland
- Oskar Th Johannsson
  - Department of Medical Oncology, Landspitali-The National University Hospital of Iceland, 101, Reykjavík, Iceland
- Heather Skirton
  - Faculty of Health and Human Sciences, Plymouth University, Plymouth, UK
- Jon J Jonsson
  - Department of Genetics and Molecular Medicine, Landspitali-The National University Hospital of Iceland, 101, Reykjavík, Iceland
  - Department of Biochemistry and Molecular Biology, University of Iceland, 101, Reykjavík, Iceland
9
Brameld KJ, Dye DE, Maxwell S, Brisbane JM, Glasson EJ, Goldblatt J, O'Leary P. The Western Australian Family Connections Genealogical Project: Detection of Familial Occurrences of Single Gene and Chromosomal Disorders. Genet Test Mol Biomarkers 2014; 18:77-82. [DOI: 10.1089/gtmb.2013.0254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Affiliation(s)
- Kate J. Brameld
  - Neuropsychiatric Epidemiology Research Unit, School of Psychiatry and Clinical Neurosciences, The University of Western Australia, Crawley, Australia
  - Centre for Population Health Research, Curtin University, Bentley, Australia
  - School of Population Health, The University of Western Australia, Crawley, Australia
- Danielle E. Dye
  - School of Biomedical Sciences, Curtin Health Innovation Research Institute (CHIRI), Curtin University, Bentley, Australia
- Susannah Maxwell
  - Centre for Population Health Research, Curtin University, Bentley, Australia
- Joanna M. Brisbane
  - Centre for Population Health Research, Curtin University, Bentley, Australia
- Emma J. Glasson
  - School of Population Health, The University of Western Australia, Crawley, Australia
  - Telethon Institute for Child Health Research, Centre for Child Health Research, The University of Western Australia, Subiaco, Australia
- Jack Goldblatt
  - Genetic Services of Western Australia, Subiaco, Australia
  - School of Pediatrics and Child Health, The University of Western Australia, Crawley, Australia
- Peter O'Leary
  - Centre for Population Health Research, Curtin University, Bentley, Australia
  - School of Pathology and Laboratory Medicine, The University of Western Australia, Crawley, Australia
  - School of Women's and Infants' Health, The University of Western Australia, Crawley, Australia
10
Abstract
This brief report aims to give an overview of the history and current status of clinical genetics services in Iceland and specific genetic counseling considerations for Iceland's population. Presently, there are two part-time medical geneticists and one full-time genetic counselor with an MSc education from Cardiff, within the Department of Genetics and Molecular Medicine, based in Iceland's only tertiary healthcare facility, Landspitali, the National University Hospital. An oncologist (20% appointment) also contributes to the cancer genetic counseling service. In addition, a pediatric medical geneticist has a 25% appointment at the Children's Hospital. No other health care organization offers genetic counseling, and there are no private genetic counseling services.