1
He H, Yang H, Foo R, Chan W, Zhu F, Liu Y, Zhou X, Ma L, Wang LF, Zhai W. Population genomic analysis reveals distinct demographics and recent adaptation in the black flying fox (Pteropus alecto). J Genet Genomics 2023; 50:554-562. [PMID: 37182682 DOI: 10.1016/j.jgg.2023.05.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/21/2023] [Revised: 05/03/2023] [Accepted: 05/03/2023] [Indexed: 05/16/2023]
Abstract
As the only mammalian group capable of powered flight, bats have many unique biological traits. Previous comparative genomic studies in bats have focused on long-term evolution. However, the micro-evolutionary processes driving recent evolution are largely under-explored. Using resequencing data from 50 black flying foxes (Pteropus alecto), one of the model species for bats, we find that the black flying fox has much higher genetic diversity and lower levels of linkage disequilibrium than most mammalian species. Demographic inference reveals strong population fluctuations (>100-fold) coinciding with multiple historical events including the last glacial change and the Toba super eruption, suggesting that the black flying fox is a very resilient species with strong recovery abilities. While long-term adaptation in the black flying fox is enriched in metabolic genes, recent adaptation in the black flying fox has a unique landscape where recently selected genes are not strongly enriched in any functional category. The demographic history and mode of adaptation suggest that the black flying fox might be a well-adapted species with strong evolutionary resilience. Taken together, this study unravels a vibrant landscape of recent evolution for the black flying fox and sheds light on several unique evolutionary processes for bats compared to other mammalian groups.
Affiliation(s)
- Haopeng He
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hechuan Yang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Randy Foo
- Programme in Emerging Infectious Diseases, Duke-NUS Medical School, Singapore 169857, Singapore; Singhealth Duke-NUS Global Health Institute, Singapore 169857, Singapore
| | - Wharton Chan
- Programme in Emerging Infectious Diseases, Duke-NUS Medical School, Singapore 169857, Singapore; Singhealth Duke-NUS Global Health Institute, Singapore 169857, Singapore
| | - Feng Zhu
- Programme in Emerging Infectious Diseases, Duke-NUS Medical School, Singapore 169857, Singapore; Singhealth Duke-NUS Global Health Institute, Singapore 169857, Singapore
| | - Yunsong Liu
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xuming Zhou
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Liang Ma
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China.
| | - Lin-Fa Wang
- Programme in Emerging Infectious Diseases, Duke-NUS Medical School, Singapore 169857, Singapore; Singhealth Duke-NUS Global Health Institute, Singapore 169857, Singapore.
| | - Weiwei Zhai
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China.
2
Adel S, Carels N. Plant Tolerance to Drought Stress with Emphasis on Wheat. Plants (Basel) 2023; 12:2170. [PMID: 37299149 DOI: 10.3390/plants12112170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 03/16/2023] [Accepted: 03/29/2023] [Indexed: 06/12/2023]
Abstract
Environmental stresses, such as drought, have negative effects on crop yield. Drought is a stress whose impact tends to increase in some critical regions. However, the worldwide population is continuously increasing and climate change may affect its food supply in the upcoming years. Therefore, there is an ongoing effort to understand the molecular processes that may contribute to improving drought tolerance of strategic crops. These investigations should contribute to delivering drought-tolerant cultivars by selective breeding. For this reason, it is worthwhile to review regularly the literature concerning the molecular mechanisms and technologies that could facilitate gene pyramiding for drought tolerance. This review summarizes achievements obtained using QTL mapping, genomics, synteny, epigenetics, and transgenics for the selective breeding of drought-tolerant wheat cultivars. Synthetic apomixis combined with the msh1 mutation opens the way to induce and stabilize epigenomes in crops, which offers the potential of accelerating selective breeding for drought tolerance in arid and semi-arid regions.
Affiliation(s)
- Sarah Adel
- Genetic Department, Faculty of Agriculture, Ain Shams University, Cairo 11241, Egypt
| | - Nicolas Carels
- Laboratory of Biological System Modeling, Center of Technological Development for Health (CDTS), Oswaldo Cruz Foundation (Fiocruz), Rio de Janeiro 21040-361, Brazil
3
Sall S, Thompson W, Santos A, Dwyer DS. Analysis of Major Depression Risk Genes Reveals Evolutionary Conservation, Shared Phenotypes, and Extensive Genetic Interactions. Front Psychiatry 2021; 12:698029. [PMID: 34335334 PMCID: PMC8319724 DOI: 10.3389/fpsyt.2021.698029] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/20/2021] [Accepted: 06/21/2021] [Indexed: 12/29/2022]
Abstract
Major depressive disorder (MDD) affects around 15% of the population at some stage in their lifetime. It can be gravely disabling and it is associated with increased risk of suicide. Genetics play an important role; however, there are additional environmental contributions to the pathogenesis. A number of possible risk genes that increase liability for developing symptoms of MDD have been identified in genome-wide association studies (GWAS). The goal of this study was to characterize the MDD risk genes with respect to the degree of evolutionary conservation in simpler model organisms such as Caenorhabditis elegans and zebrafish, the phenotypes associated with variation in these genes and the extent of network connectivity. The MDD risk genes showed higher conservation in C. elegans and zebrafish than genome-to-genome comparisons. In addition, there were recurring themes among the phenotypes associated with variation of these risk genes in C. elegans. The phenotype analysis revealed enrichment for essential genes with pleiotropic effects. Moreover, the MDD risk genes participated in more interactions with each other than did randomly-selected genes from similar-sized gene sets. Syntenic blocks of risk genes with common functional activities were also identified. By characterizing evolutionarily-conserved counterparts to the MDD risk genes, we have gained new insights into pathogenetic processes relevant to the emergence of depressive symptoms in man.
Affiliation(s)
- Saveen Sall
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
| | - Willie Thompson
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
| | - Aurianna Santos
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
| | - Donard S. Dwyer
- Department of Psychiatry and Behavioral Medicine, Louisiana State University Health Shreveport, Shreveport, LA, United States
- Department of Pharmacology, Toxicology and Neuroscience, Louisiana State University Health Shreveport, Shreveport, LA, United States
4
Park J, Choi JY, Choi J, Chung S, Song N, Park SK, Han W, Noh DY, Ahn SH, Lee JW, Kim MK, Jee SH, Wen W, Bolla MK, Wang Q, Dennis J, Michailidou K, Shah M, Conroy DM, Harrington PA, Mayes R, Czene K, Hall P, Teras LR, Patel AV, Couch FJ, Olson JE, Sawyer EJ, Roylance R, Bojesen SE, Flyger H, Lambrechts D, Baten A, Matsuo K, Ito H, Guénel P, Truong T, Keeman R, Schmidt MK, Wu AH, Tseng CC, Cox A, Cross SS, Andrulis IL, Hopper JL, Southey MC, Wu PE, Shen CY, Fasching PA, Ekici AB, Muir K, Lophatananon A, Brenner H, Arndt V, Jones ME, Swerdlow AJ, Hoppe R, Ko YD, Hartman M, Li J, Mannermaa A, Hartikainen JM, Benitez J, González-Neira A, Haiman CA, Dörk T, Bogdanova NV, Teo SH, Mohd Taib NA, Fletcher O, Johnson N, Grip M, Winqvist R, Blomqvist C, Nevanlinna H, Lindblom A, Wendt C, Kristensen VN, Tollenaar RAEM, Heemskerk-Gerritsen BAM, Radice P, Bonanni B, Hamann U, Manoochehri M, Lacey JV, Martinez ME, Dunning AM, Pharoah PDP, Easton DF, Yoo KY, Kang D. Gene-Environment Interactions Relevant to Estrogen and Risk of Breast Cancer: Can Gene-Environment Interactions Be Detected Only among Candidate SNPs from Genome-Wide Association Studies? Cancers (Basel) 2021; 13:2370. [PMID: 34069208 PMCID: PMC8156547 DOI: 10.3390/cancers13102370] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2021] [Revised: 04/29/2021] [Accepted: 04/30/2021] [Indexed: 12/24/2022]
Abstract
In this study we aim to examine gene-environment interactions (GxEs) between genes involved with estrogen metabolism and environmental factors related to estrogen exposure. GxE analyses were conducted with 1970 Korean breast cancer cases and 2052 controls in the case-control study, the Seoul Breast Cancer Study (SEBCS). A total of 11,555 SNPs from the 137 candidate genes were included in the GxE analyses with eight established environmental factors. A replication test was conducted by using an independent population from the Breast Cancer Association Consortium (BCAC), with 62,485 Europeans and 9047 Asians. The GxE tests were performed by using two-step methods in GxEScan software. Two interactions were found in the SEBCS. The first interaction was shown between rs13035764 of NCOA1 and age at menarche in the GE|2df model (p-2df = 1.2 × 10⁻³). The age at menarche before 14 years old was associated with the high risk of breast cancer, and the risk was higher when subjects had homozygous minor allele G. The second GxE was shown between rs851998 near ESR1 and height in the GE|2df model (p-2df = 1.1 × 10⁻⁴). Height taller than 160 cm was associated with a high risk of breast cancer, and the risk increased when the minor allele was added. The findings were not replicated in the BCAC. These results would suggest specificity in Koreans for breast cancer risk.
Affiliation(s)
- JooYong Park
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul 03080, Korea; (J.P.); (S.C.); (S.K.P.); (D.K.)
- BK21plus Biomedical Science Project, Seoul National University College of Medicine, Seoul 03080, Korea
| | - Ji-Yeob Choi
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul 03080, Korea; (J.P.); (S.C.); (S.K.P.); (D.K.)
- BK21plus Biomedical Science Project, Seoul National University College of Medicine, Seoul 03080, Korea
- Institute of Health Policy and Management, Seoul National University Medical Research Center, Seoul 03080, Korea;
- Cancer Research Institute, Seoul National University, Seoul 03080, Korea; (W.H.); (D.-Y.N.)
| | - Jaesung Choi
- Institute of Health Policy and Management, Seoul National University Medical Research Center, Seoul 03080, Korea;
| | - Seokang Chung
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul 03080, Korea; (J.P.); (S.C.); (S.K.P.); (D.K.)
| | - Nan Song
- College of Pharmacy, Chungbuk National University, Cheongju-si 28160, Korea;
| | - Sue K. Park
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul 03080, Korea; (J.P.); (S.C.); (S.K.P.); (D.K.)
- Cancer Research Institute, Seoul National University, Seoul 03080, Korea; (W.H.); (D.-Y.N.)
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul 03080, Korea;
| | - Wonshik Han
- Cancer Research Institute, Seoul National University, Seoul 03080, Korea; (W.H.); (D.-Y.N.)
- Department of Surgery, Seoul National University College of Medicine, Seoul 03080, Korea
| | - Dong-Young Noh
- Cancer Research Institute, Seoul National University, Seoul 03080, Korea; (W.H.); (D.-Y.N.)
- Department of Surgery, Seoul National University College of Medicine, Seoul 03080, Korea
| | - Sei-Hyun Ahn
- Department of Surgery, Medicine and ASAN Medical Center, University of Ulsan College, Seoul 05505, Korea; (S.-H.A.); (J.W.L.)
| | - Jong Won Lee
- Department of Surgery, Medicine and ASAN Medical Center, University of Ulsan College, Seoul 05505, Korea; (S.-H.A.); (J.W.L.)
| | - Mi Kyung Kim
- Division of Cancer Epidemiology and Management, National Cancer Center, Goyang-si 10408, Korea;
| | - Sun Ha Jee
- Department of Epidemiology and Health Promotion, Institute for Health Promotion, Graduate School of Public Health, Yonsei University, Seoul 03722, Korea;
| | - Wanqing Wen
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA;
| | - Manjeet K. Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
| | - Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
| | - Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
| | - Kyriaki Michailidou
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
- Biostatistics Unit, The Cyprus Institute of Neurology & Genetics, Nicosia 2371, Cyprus
- Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology & Genetics, Nicosia 23462, Cyprus
| | - Mitul Shah
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Don M. Conroy
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Patricia A. Harrington
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Rebecca Mayes
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 65 Stockholm, Sweden; (K.C.); (P.H.)
| | - Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 65 Stockholm, Sweden; (K.C.); (P.H.)
- Department of Oncology, Södersjukhuset, 118 83 Stockholm, Sweden
| | - Lauren R. Teras
- Department of Population Science, American Cancer Society, Atlanta, GA 30303, USA;
| | - Alpa V. Patel
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA; (A.V.P.); (F.J.C.)
| | - Fergus J. Couch
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA; (A.V.P.); (F.J.C.)
| | - Janet E. Olson
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA;
| | - Elinor J. Sawyer
- School of Cancer & Pharmaceutical Sciences, Comprehensive Cancer Centre, Guy’s Campus, King’s College London, London SE1 9RT, UK;
| | - Rebecca Roylance
- Department of Oncology, UCLH Foundation Trust, London NW1 2PG, UK;
| | - Stig E. Bojesen
- Copenhagen General Population Study, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730 Herlev, Denmark;
- Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730 Herlev, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
| | - Henrik Flyger
- Department of Breast Surgery, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730 Herlev, Denmark;
| | - Diether Lambrechts
- VIB Center for Cancer Biology, 3001 Leuven, Belgium;
- Laboratory for Translational Genetics, Department of Human Genetics, University of Leuven, 3000 Leuven, Belgium
| | - Adinda Baten
- Department of Radiotherapy Oncology, KU Leuven—University of Leuven, University Hospitals Leuven, 3000 Leuven, Belgium;
| | - Keitaro Matsuo
- Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya 464-8681, Japan;
- Division of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan;
| | - Hidemi Ito
- Division of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya 466-8550, Japan;
| | - Pascal Guénel
- Center for Research in Epidemiology and Population Health (CESP), Team Exposome and Heredity, INSERM, University Paris-Saclay, 94805 Villejuif, France; (P.G.); (T.T.)
| | - Thérèse Truong
- Center for Research in Epidemiology and Population Health (CESP), Team Exposome and Heredity, INSERM, University Paris-Saclay, 94805 Villejuif, France; (P.G.); (T.T.)
| | - Renske Keeman
- Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands; (R.K.); (M.K.S.)
| | - Marjanka K. Schmidt
- Division of Molecular Pathology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands; (R.K.); (M.K.S.)
- Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute—Antoni van Leeuwenhoek Hospital, 1066 CX Amsterdam, The Netherlands
| | - Anna H. Wu
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA; (A.H.W.); (C.-C.T.); (C.A.H.)
| | - Chiu-Chen Tseng
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA; (A.H.W.); (C.-C.T.); (C.A.H.)
| | - Angela Cox
- Sheffield Institute for Nucleic Acids (SInFoNiA), Department of Oncology and Metabolism, University of Sheffield, Sheffield S10 2TN, UK;
| | - Simon S. Cross
- Academic Unit of Pathology, Department of Neuroscience, University of Sheffield, Sheffield S10 2TN, UK;
| | - kConFab Investigators
- Peter MacCallum Cancer Center, Melbourne, VIC 3000, Australia;
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Melbourne, VIC 3000, Australia
| | - Irene L. Andrulis
- Fred A. Litwin Center for Cancer Genetics, Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada;
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - John L. Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC 3010, Australia;
| | - Melissa C. Southey
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC 3168, Australia;
- Department of Clinical Pathology, The University of Melbourne, Melbourne, VIC 3010, Australia
- Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, VIC 3004, Australia
| | - Pei-Ei Wu
- Taiwan Biobank, Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan;
| | - Chen-Yang Shen
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan;
- School of Public Health, China Medical University, Taichung 404, Taiwan
| | - Peter A. Fasching
- Department of Medicine Division of Hematology and Oncology, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA;
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center ER-EMN, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuremberg, 91054 Erlangen, Germany
| | - Arif B. Ekici
- Institute of Human Genetics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg, 91054 Erlangen, Germany;
| | - Kenneth Muir
- Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK; (K.M.); (A.L.)
| | - Artitaya Lophatananon
- Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK; (K.M.); (A.L.)
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (H.B.); (V.A.)
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), 69120 Heidelberg, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
| | - Volker Arndt
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (H.B.); (V.A.)
| | - Michael E. Jones
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London SM2 5NG, UK; (M.E.J.); (A.J.S.)
| | - Anthony J. Swerdlow
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London SM2 5NG, UK; (M.E.J.); (A.J.S.)
- Division of Breast Cancer Research, The Institute of Cancer Research, London SW7 3RP, UK
| | - Reiner Hoppe
- Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, 70376 Stuttgart, Germany;
- University of Tübingen, 72074 Tübingen, Germany
| | - Yon-Dschun Ko
- Department of Internal Medicine, Evangelische Kliniken Bonn gGmbH, Johanniter Krankenhaus, 53177 Bonn, Germany;
| | - Mikael Hartman
- Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore 117549, Singapore;
- Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore 119228, Singapore
- Department of Surgery, National University Health System, Singapore 119228, Singapore
| | - Jingmei Li
- Human Genetics Division, Genome Institute of Singapore, Singapore 138672, Singapore;
| | - Arto Mannermaa
- Translational Cancer Research Area, University of Eastern Finland, 70210 Kuopio, Finland; (A.M.); (J.M.H.)
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, 70210 Kuopio, Finland
- Biobank of Eastern Finland, Kuopio University Hospital, 70210 Kuopio, Finland
| | - Jaana M. Hartikainen
- Translational Cancer Research Area, University of Eastern Finland, 70210 Kuopio, Finland; (A.M.); (J.M.H.)
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, 70210 Kuopio, Finland
| | - Javier Benitez
- Biomedical Network on Rare Diseases (CIBERER), 28029 Madrid, Spain;
- Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain;
| | - Anna González-Neira
- Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain;
| | - Christopher A. Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA; (A.H.W.); (C.-C.T.); (C.A.H.)
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, 30625 Hannover, Germany; (T.D.); (N.V.B.)
| | - Natalia V. Bogdanova
- Gynaecology Research Unit, Hannover Medical School, 30625 Hannover, Germany; (T.D.); (N.V.B.)
- Department of Radiation Oncology, Hannover Medical School, 30625 Hannover, Germany
- NN Alexandrov Research Institute of Oncology and Medical Radiology, 223040 Minsk, Belarus
| | - Soo Hwang Teo
- Breast Cancer Research Programme, Cancer Research Malaysia, Subang Jaya 47500, Malaysia;
- Department of Surgery, Faculty of Medicine, University of Malaya, Kuala Lumpur 50603, Malaysia
| | - Nur Aishah Mohd Taib
- Breast Cancer Research Unit, University Malaya Cancer Research Institute, Faculty of Medicine, University of Malaya, Kuala Lumpur 50603, Malaysia;
| | - Olivia Fletcher
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London SW7 3RP, UK; (O.F.); (N.J.)
| | - Nichola Johnson
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London SW7 3RP, UK; (O.F.); (N.J.)
| | - Mervi Grip
- Department of Surgery, Oulu University Hospital, University of Oulu, 90220 Oulu, Finland;
| | - Robert Winqvist
- Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit, Biocenter Oulu, University of Oulu, 90570 Oulu, Finland;
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, Oulu 90570, Finland
| | - Carl Blomqvist
- Department of Oncology, Helsinki University Hospital, University of Helsinki, 00290 Helsinki, Finland;
- Department of Oncology, Örebro University Hospital, 70185 Örebro, Sweden
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, 00290 Helsinki, Finland;
| | - Annika Lindblom
- Department of Molecular Medicine and Surgery, Karolinska Institutet, 171 76 Stockholm, Sweden;
- Department of Clinical Genetics, Karolinska University Hospital, 171 76 Stockholm, Sweden
| | - Camilla Wendt
- Department of Clinical Science and Education, Södersjukhuset, Karolinska Institutet, 118 83 Stockholm, Sweden;
| | - Vessela N. Kristensen
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0450 Oslo, Norway; (V.N.K.); (NBCS Collaborators)
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, 0372 Oslo, Norway
| | - NBCS Collaborators
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0450 Oslo, Norway; (V.N.K.); (NBCS Collaborators)
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, 0372 Oslo, Norway
- Department of Research, Vestre Viken Hospital, 3004 Drammen, Norway
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, 0450 Oslo, Norway
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital-Radiumhospitalet, 0450 Oslo, Norway
- Section for Breast- and Endocrine Surgery, Department of Cancer, Division of Surgery, Cancer and Transplantation Medicine, Oslo University Hospital-Ullevål, 0450 Oslo, Norway
- Department of Radiology and Nuclear Medicine, Oslo University Hospital, 0450 Oslo, Norway
- Department of Pathology at Akershus University Hospital, 1478 Lørenskog, Norway
- Department of Oncology, Division of Surgery and Cancer and Transplantation Medicine, University Hospital-Radiumhospitalet, 0405 Oslo, Norway
- National Advisory Unit on Late Effects after Cancer Treatment, Department of Oncology, Oslo University Hospital, 0405 Oslo, Norway
- Department of Oncology, Akershus University Hospital, 1478 Lørenskog, Norway
- Oslo Breast Cancer Research Consortium, Oslo University Hospital, 0405 Oslo, Norway
| | - Rob A. E. M. Tollenaar
- Department of Surgery, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands;
| | - Paolo Radice
- Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research, Fondazione IRCCS Istituto Nazionale dei Tumori (INT), 20133 Milan, Italy;
| | - Bernardo Bonanni
- Division of Cancer Prevention and Genetics, IEO, European Institute of Oncology IRCCS, 20141 Milan, Italy;
| | - Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (U.H.); (M.M.)
| | - Mehdi Manoochehri
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (U.H.); (M.M.)
| | - James V. Lacey
- Department of Computational and Quantitative Medicine, City of Hope, Duarte, CA 91010, USA;
- City of Hope Comprehensive Cancer Center, City of Hope, Duarte, CA 91010, USA
| | - Maria Elena Martinez
- Moores Cancer Center, University of California San Diego, La Jolla, CA 92037, USA;
- Herbert Wertheim School of Public Health and Longevity Science, University of California San Diego, La Jolla, CA 92161, USA
| | - Alison M. Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Paul D. P. Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Douglas F. Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB2 0SR, UK; (M.K.B.); (Q.W.); (J.D.); (K.M.); (P.D.P.P.); (D.F.E.)
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge CB1 8RN, UK; (M.S.); (D.M.C.); (P.A.H.); (R.M.); (A.M.D.)
| | - Keun-Young Yoo
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul 03080, Korea;
| | - Daehee Kang
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul 03080, Korea; (J.P.); (S.C.); (S.K.P.); (D.K.)
- Cancer Research Institute, Seoul National University, Seoul 03080, Korea; (W.H.); (D.-Y.N.)
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul 03080, Korea;
5
Dwyer DS. Genomic Chaos Begets Psychiatric Disorder. Complex Psychiatry 2020; 6:20-29. [PMID: 34883501 PMCID: PMC7673594 DOI: 10.1159/000507988] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/16/2019] [Accepted: 04/06/2020] [Indexed: 12/21/2022]
Abstract
The processes that created the primordial genome are inextricably linked to current-day vulnerability to developing a psychiatric disorder, as summarized in this review article. Chaos and dynamic forces including duplication, transposition, and recombination generated the protogenome. To survive early stages of genome evolution, self-organization emerged to curb chaos. Eventually, the human genome evolved through a delicate balance of chaos/instability and organization/stability. However, recombination coldspots, silencing of transposable elements, and other measures to limit chaos also led to retention of variants that increase risk for disease. Moreover, ongoing dynamics in the genome create various new mutations that determine liability for psychiatric disorders. Homologous recombination, long-range gene regulation, and gene interactions were all guided by spooky action-at-a-distance, which increased variability in the system. A probabilistic system of life was required to deal with a changing environment. This ensured the generation of outliers in the population, which enhanced the probability that some members would survive unfavorable environmental impacts. Some of the outliers produced through this process in man are ill suited to cope with the complex demands of modern life. Genomic chaos and mental distress from the psychological challenges of modern living will inevitably converge to produce psychiatric disorders in man.
Affiliation(s)
- Donard S. Dwyer
- Departments of Psychiatry and Behavioral Medicine and Pharmacology, Toxicology and Neuroscience, LSU Health Shreveport, Shreveport, Louisiana, USA
6
Linkage Disequilibrium-Based Inference of Genome Homology and Chromosomal Rearrangements Between Species. G3-GENES GENOMES GENETICS 2020; 10:2327-2343. [PMID: 32434754 PMCID: PMC7341147 DOI: 10.1534/g3.120.401090] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023]
Abstract
The aim of this study was to analyze the genomic homology between cattle (Bos taurus) and buffaloes (Bubalus bubalis) and to propose a rearrangement of the buffalo genome through linkage disequilibrium analyses of buffalo SNP markers referenced in the cattle genome assembly, and also to compare it to the buffalo genome assembly. A panel of bovine SNPs (single nucleotide polymorphisms) was used for hierarchical, non-hierarchical and admixture cluster analyses. The linkage disequilibrium information between markers of a specific buffalo panel was then used to infer chromosomal rearrangement. Haplotype diversity and imputation accuracy of the submetacentric chromosomes were also analyzed. The genomic homology between the species enabled us to use the bovine genome assembly to recreate a buffalo genomic reference by rearranging the submetacentric chromosomes. The centromeres of the submetacentric chromosomes exhibited high linkage disequilibrium and low haplotype diversity. This pattern allowed hypotheses about chromosome evolution: it indicated that buffalo submetacentric chromosomes are centric fusions of ancestral acrocentric chromosomes. The chronology of fusions was also suggested. Moreover, a linear regression between the buffalo and rearranged cattle assemblies, together with the imputation accuracy, indicated that the rearrangement of the chromosomes was adequate. When using the bovine reference genome assembly, the rearrangement of the buffalo submetacentric chromosomes could be done by SNP BTA (chromosome of Bos taurus) calculations: shorter BTA (shorter arm of buffalo chromosome) was given as [(shorter BTA length - SNP position in shorter BTA)] and larger BTA length as [shorter BTA length + (larger BTA length - SNP position in larger BTA)]. Finally, the proposed linkage disequilibrium-based method can be applied to elucidate chromosomal rearrangement events in other species, with the possibility of better understanding the evolutionary relationship between their genomes.
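The two bracketed expressions at the end of this abstract can be read as a coordinate transformation. The following is a minimal sketch of that reading, not the authors' code: the function name, the `arm` argument, the unit (base pairs) and the example lengths are all assumptions made for illustration.

```python
def rearranged_position(snp_pos, arm, shorter_len, larger_len):
    """Map a SNP position on a cattle chromosome (BTA) onto a fused buffalo
    submetacentric chromosome, using the two expressions quoted in the abstract.

    arm is "shorter" or "larger" (which of the two BTAs carries the SNP);
    all positions and lengths share one unit (assumed here to be base pairs).
    """
    if arm == "shorter":
        if not 0 <= snp_pos <= shorter_len:
            raise ValueError("SNP position lies outside the shorter BTA")
        # shorter BTA length - SNP position in shorter BTA
        return shorter_len - snp_pos
    if arm == "larger":
        if not 0 <= snp_pos <= larger_len:
            raise ValueError("SNP position lies outside the larger BTA")
        # shorter BTA length + (larger BTA length - SNP position in larger BTA)
        return shorter_len + (larger_len - snp_pos)
    raise ValueError("arm must be 'shorter' or 'larger'")


# Hypothetical lengths, chosen only to make the arithmetic easy to follow.
SHORTER_LEN, LARGER_LEN = 100, 250
short_arm_pos = rearranged_position(30, "shorter", SHORTER_LEN, LARGER_LEN)
long_arm_pos = rearranged_position(50, "larger", SHORTER_LEN, LARGER_LEN)
print(short_arm_pos, long_arm_pos)  # 70 300
```

Under this reading, every SNP from the shorter BTA lands in the interval from 0 to the shorter BTA length, and every SNP from the larger BTA lands beyond it, so the two arms cannot overlap on the rearranged chromosome.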
7
Prunier J, Lemaçon A, Bastien A, Jafarikia M, Porth I, Robert C, Droit A. LD-annot: A Bioinformatics Tool to Automatically Provide Candidate SNPs With Annotations for Genetically Linked Genes. Front Genet 2019; 10:1192. [PMID: 31850063 PMCID: PMC6889475 DOI: 10.3389/fgene.2019.01192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2019] [Accepted: 10/28/2019] [Indexed: 11/24/2022]
Abstract
A multitude of model and non-model species studies have now taken full advantage of powerful high-throughput genotyping advances such as SNP arrays and genotyping-by-sequencing (GBS) technology to investigate the genetic basis of trait variation. However, due to incomplete genome coverage by these technologies, the identified SNPs are likely in linkage disequilibrium (LD) with the causal polymorphisms, rather than being causal themselves. In addition, researchers could benefit from annotations for the identified candidate SNPs and, simultaneously, for all neighboring genes in genetic linkage. In such cases, LD extent estimation surrounding the candidate SNPs is required to determine the regions encompassing genes of interest. We describe here an automated pipeline, “LD-annot,” designed to delineate specific regions of interest for a given experiment and candidate polymorphisms on the basis of LD extent, and furthermore, provide annotations for all genes within such regions. LD-annot uses standard file formats, bioinformatics tools, and languages to provide identifiers, coordinates, and annotations for genes in genetic linkage with each candidate polymorphism. Although the focus lies upon SNP arrays and GBS data as they are being routinely deployed, this pipeline can be applied to a variety of datasets as long as genotypic data are available for a high number of polymorphisms and formatted into a vcf file. A checkpoint procedure in the pipeline allows several threshold values for linkage to be tested without rerunning the entire pipeline, thus saving the user computational time and resources. We applied this new pipeline to four different sample sets: two breeding populations GBS datasets, one within-pedigree SNP set coming from whole genome sequencing (WGS), and a very large multi-varieties SNP dataset obtained from WGS, representing variable sample sizes and numbers of polymorphisms. LD-annot performed within minutes, even when very high numbers of polymorphisms were investigated, and thus will efficiently assist research efforts aimed at identifying biologically meaningful genetic polymorphisms underlying phenotypic variation. The LD-annot tool is available under a GPL license from https://github.com/ArnaudDroitLab/LD-annot.
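The idea of delineating a region around a candidate SNP from LD extent can be illustrated with a small sketch. This is not LD-annot's implementation: the function names, the use of squared Pearson correlation between genotype dosages as the r² estimate, the 0.8 threshold and the toy data are all assumptions for illustration.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # a monomorphic SNP carries no LD information
        return 0.0
    return sxy * sxy / (sxx * syy)


def ld_region(positions, genotypes, candidate, threshold=0.8):
    """Return (start, end) spanning the candidate SNP and every SNP whose
    r^2 with the candidate reaches the threshold."""
    linked = [positions[candidate]]
    for pos, geno in zip(positions, genotypes):
        if r_squared(genotypes[candidate], geno) >= threshold:
            linked.append(pos)
    return min(linked), max(linked)


# Toy data: four SNPs genotyped in six individuals (hypothetical values).
positions = [100, 200, 300, 400]
genotypes = [
    [0, 1, 2, 0, 1, 2],  # identical to the candidate, r^2 = 1
    [0, 1, 2, 0, 1, 2],  # candidate SNP
    [2, 1, 0, 2, 1, 0],  # perfectly anti-correlated, r^2 = 1
    [1, 0, 1, 2, 2, 0],  # weakly correlated, r^2 = 0.25
]
region = ld_region(positions, genotypes, candidate=1, threshold=0.8)
print(region)
```

Genes overlapping the returned interval would then be the ones to annotate. Changing `threshold` only requires recomputing the interval, not the pairwise r² values, which is in the spirit of the checkpoint procedure the abstract mentions.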
Affiliation(s)
- Julien Prunier
- Genomics Center, Centre Hospitalier Universitaire de Québec-Université Laval Research Center, Quebec, QC, Canada; Forestry Research Centre, Forestry Department, Université Laval, Quebec, QC, Canada
- Audrey Lemaçon
- Genomics Center, Centre Hospitalier Universitaire de Québec-Université Laval Research Center, Quebec, QC, Canada
- Alexandre Bastien
- Faculty of Agricultural and Food Science, Université Laval, Quebec, QC, Canada
- Mohsen Jafarikia
- Canadian Centre for Swine Improvement, Ottawa, ON, Canada; Department of Animal Biosciences, University of Guelph, Guelph, ON, Canada
- Ilga Porth
- Forestry Research Centre, Forestry Department, Université Laval, Quebec, QC, Canada
- Claude Robert
- Forestry Research Centre, Forestry Department, Université Laval, Quebec, QC, Canada
- Arnaud Droit
- Genomics Center, Centre Hospitalier Universitaire de Québec-Université Laval Research Center, Quebec, QC, Canada
8
Hujoel MLA, Gazal S, Hormozdiari F, van de Geijn B, Price AL. Disease Heritability Enrichment of Regulatory Elements Is Concentrated in Elements with Ancient Sequence Age and Conserved Function across Species. Am J Hum Genet 2019; 104:611-624. [PMID: 30905396 PMCID: PMC6451699 DOI: 10.1016/j.ajhg.2019.02.008] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 09/17/2018] [Accepted: 02/05/2019] [Indexed: 02/06/2023]
Abstract
Regulatory elements, e.g., enhancers and promoters, have been widely reported to be enriched for disease and complex trait heritability. We investigated how this enrichment varies with the age of the underlying genome sequence, the conservation of regulatory function across species, and the target gene of the regulatory element. We estimated heritability enrichment by applying stratified LD score regression to summary statistics from 41 independent diseases and complex traits (average N = 320K) and meta-analyzing results across traits. Enrichment of human putative enhancers and promoters was larger in elements with older sequence age, assessed via alignment with other species irrespective of conserved functionality: putative enhancer elements with ancient sequence age (older than the split between marsupial and placental mammals) were 8.8× enriched (versus 2.5× for all putative enhancers; p = 3e-14), and promoter elements with ancient sequence age were 13.5× enriched (versus 5.1× for all promoters; p = 5e-16). Enrichment of human putative enhancers and promoters was also larger in elements whose regulatory function was conserved across species, e.g., human putative enhancers that were enhancers in ≥5 of 9 other mammals were 4.6× enriched (p = 5e-12 versus all putative enhancers). Enrichment of human promoters was larger in promoters of loss-of-function intolerant genes: 12.0× enrichment (p = 8e-15 versus all promoters). The mean value of several measures of negative selection within these genomic annotations mirrored all of these findings. Notably, the annotations with these excess heritability enrichments were jointly significant conditional on each other and on our baseline-LD model, which includes a broad set of coding, conserved, regulatory, and LD-related annotations.
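The "×-fold enriched" figures in this abstract rest on the usual definition of heritability enrichment in stratified LD score regression: the share of SNP heritability attributed to an annotation divided by the share of SNPs that the annotation contains. A minimal sketch of that ratio follows; the function name and all input numbers are hypothetical and are not taken from the study.

```python
def heritability_enrichment(h2_annotation, h2_total, snps_annotation, snps_total):
    """Enrichment = (proportion of heritability) / (proportion of SNPs)."""
    if h2_total <= 0 or snps_total <= 0 or snps_annotation <= 0:
        raise ValueError("totals and annotation size must be positive")
    prop_h2 = h2_annotation / h2_total
    prop_snps = snps_annotation / snps_total
    return prop_h2 / prop_snps


# Hypothetical annotation: 2.5% of SNPs explaining 25% of SNP heritability.
example = heritability_enrichment(
    h2_annotation=0.1, h2_total=0.4,
    snps_annotation=25_000, snps_total=1_000_000,
)
print(round(example, 2))  # 10.0
```

A value of 1 means the annotation explains exactly its proportional share of heritability; values above 1 indicate enrichment.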
Affiliation(s)
- Margaux L A Hujoel
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Division of Biostatistics, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Farhad Hormozdiari
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Bryce van de Geijn
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alkes L Price
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
9
Jabbari K, Wirtz J, Rauscher M, Wiehe T. A common genomic code for chromatin architecture and recombination landscape. PLoS One 2019; 14:e0213278. [PMID: 30865674 PMCID: PMC6415826 DOI: 10.1371/journal.pone.0213278] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/29/2018] [Accepted: 02/18/2019] [Indexed: 12/14/2022]
Abstract
Recent findings established a link between DNA sequence composition and interphase chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse conformation capture and recombination rate data to study the relationship between chromatin architecture and the recombination landscape of the human and mouse genomes. The results reveal that: (1) low recombination domains and blocks of elevated linkage disequilibrium tend to coincide with TADs and isochores, indicating co-evolving regulatory elements and genes in insulated neighbourhoods; (2) double strand break (DSB) and recombination frequencies increase in the short loops of GC-rich TADs, whereas recombination cold spots are typical of LADs; and (3) the binding and loading of proteins that are critical for DSB and meiotic recombination (SPO11, DMC1, H3K4me3 and PRDM9) are higher in GC-rich TADs. One explanation for these observations is that the occurrence of DSB and recombination in meiotic cells is associated with compositional and epigenetic features (genomic code) that influence DNA stiffness/flexibility and appear to be similar to those guiding the chromatin architecture in the interphase nucleus of pre-leptotene cells.
Affiliation(s)
- Kamel Jabbari
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
- Johannes Wirtz
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
- Martina Rauscher
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
- Thomas Wiehe
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
10
Nyaga DM, Vickers MH, Jefferies C, Perry JK, O’Sullivan JM. Type 1 Diabetes Mellitus-Associated Genetic Variants Contribute to Overlapping Immune Regulatory Networks. Front Genet 2018; 9:535. [PMID: 30524468 PMCID: PMC6258722 DOI: 10.3389/fgene.2018.00535] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/13/2018] [Accepted: 10/22/2018] [Indexed: 01/01/2023]
Abstract
Type 1 diabetes (T1D) is a chronic metabolic disorder characterized by the autoimmune destruction of insulin-producing pancreatic islet beta cells in genetically predisposed individuals. Genome-wide association studies (GWAS) have identified over 60 risk regions across the human genome, marked by single nucleotide polymorphisms (SNPs), which confer genetic predisposition to T1D. There is increasing evidence that disease-associated SNPs can alter gene expression through spatial interactions that involve distal loci, in a tissue- and development-specific manner. Here, we used three-dimensional (3D) genome organization data to identify genes that physically co-localized with DNA regions that contained T1D-associated SNPs in the nucleus. Analysis of these SNP-gene pairs using the Genotype-Tissue Expression database identified a subset of SNPs that significantly affected gene expression. We identified 246 spatially regulated genes including HLA-DRB1, LAT, MICA, BTN3A2, CTLA4, CD226, NOTCH1, TRIM26, PTEN, TYK2, CTSH, and FLRT3, which exhibit tissue-specific effects in multiple tissues. We observed that the T1D-associated variants interconnect through networks that form part of the immune regulatory pathways, including immune-cell activation, cytokine signaling, and programmed cell death protein-1 (PD-1). Our results implicate T1D-associated variants in tissue and cell-type specific regulatory networks that contribute to pancreatic beta cell inflammation and destruction, adaptive immune signaling, and immune-cell proliferation and activation. A number of other regulatory changes we identified are not typically considered to be central to the pathology of T1D. Collectively, our data represent a novel resource for the hypothesis-driven development of diagnostic, prognostic, and therapeutic interventions in T1D.
Affiliation(s)
- Denis M. Nyaga
- The Liggins Institute, The University of Auckland, Auckland, New Zealand
- Mark H. Vickers
- The Liggins Institute, The University of Auckland, Auckland, New Zealand
- Craig Jefferies
- The Liggins Institute, The University of Auckland, Auckland, New Zealand
- Starship Children’s Health, Auckland, New Zealand
- Jo K. Perry
- The Liggins Institute, The University of Auckland, Auckland, New Zealand
11
Campa A, Murube E, Ferreira JJ. Genetic Diversity, Population Structure, and Linkage Disequilibrium in a Spanish Common Bean Diversity Panel Revealed through Genotyping-by-Sequencing. Genes (Basel) 2018; 9:E518. [PMID: 30360561 PMCID: PMC6266623 DOI: 10.3390/genes9110518] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/14/2018] [Revised: 10/19/2018] [Accepted: 10/19/2018] [Indexed: 11/16/2022]
Abstract
A common bean (Phaseolus vulgaris) diversity panel of 308 lines was established from local Spanish germplasm, as well as old and elite cultivars mainly used for snap consumption. Most of the landraces included were derived from the Spanish common bean core collection, so this panel can be considered representative of the Spanish diversity for this species. The panel was characterized by 3099 single-nucleotide polymorphism markers obtained through genotyping-by-sequencing, which revealed a wide genetic diversity and a low level of redundant material within the panel. Structure, cluster, and principal component analyses revealed the presence of two main subpopulations corresponding to the two main gene pools identified in common bean, the Andean and Mesoamerican pools, although most lines (70%) were associated with the Andean gene pool. Lines showing recombination between the two gene pools were also observed, most of them useful for snap bean consumption, which suggests that both gene pools were probably used in the breeding of snap bean cultivars. The usefulness of this panel for genome-wide association studies was tested by conducting association mapping for determinacy. Significant marker-trait associations were found on chromosome Pv01, involving the gene Phvul.001G189200, which was identified as a candidate gene for determinacy in the common bean.
Affiliation(s)
- Ana Campa
- Plant Genetics, Area of Horticultural and Forest Crops, SERIDA, 33300 Asturias, Spain
- Ester Murube
- Plant Genetics, Area of Horticultural and Forest Crops, SERIDA, 33300 Asturias, Spain
- Juan José Ferreira
- Plant Genetics, Area of Horticultural and Forest Crops, SERIDA, 33300 Asturias, Spain
12
Martin JS, Xu Z, Reiner AP, Mohlke KL, Sullivan P, Ren B, Hu M, Li Y. HUGIn: Hi-C Unifying Genomic Interrogator. Bioinformatics 2017; 33:3793-3795. [PMID: 28582503 PMCID: PMC5860315 DOI: 10.1093/bioinformatics/btx359] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 03/15/2017] [Accepted: 05/31/2017] [Indexed: 11/12/2022]
Abstract
MOTIVATION: High-throughput chromatin conformation capture (3C) technologies, such as Hi-C and ChIA-PET, have the potential to elucidate the functional roles of non-coding variants. However, most published genome-wide unbiased chromatin organization studies have used cultured cell lines, limiting their generalizability. RESULTS: We developed a web browser, HUGIn, to visualize Hi-C data generated from 21 human primary tissues and cell lines. HUGIn enables assessment of chromatin contacts, both constitutive across and specific to tissue(s) and/or cell line(s), at any genomic locus, including GWAS SNPs, eQTLs and cis-regulatory elements, facilitating the understanding of both GWAS and eQTL results and functional genomics data. AVAILABILITY AND IMPLEMENTATION: HUGIn is available at http://yunliweb.its.unc.edu/HUGIn. CONTACT: yunli@med.unc.edu or hum@ccf.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joshua S Martin
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Zheng Xu
- Department of Statistics, University of Nebraska, Lincoln, NE, USA; Quantitative Life Sciences Initiative, University of Nebraska, Lincoln, NE, USA
- Alex P Reiner
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA; Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
- Karen L Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Patrick Sullivan
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Bing Ren
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, UCSD Moores Cancer Center, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA. To whom correspondence should be addressed.
- Yun Li
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA; Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA; Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA. To whom correspondence should be addressed.
13
Broeckx BJG, Derrien T, Mottier S, Wucher V, Cadieu E, Hédan B, Le Béguec C, Botherel N, Lindblad-Toh K, Saunders JH, Deforce D, André C, Peelman L, Hitte C. An exome sequencing based approach for genome-wide association studies in the dog. Sci Rep 2017; 7:15680. [PMID: 29142306 PMCID: PMC5688105 DOI: 10.1038/s41598-017-15947-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/05/2017] [Accepted: 11/04/2017] [Indexed: 12/12/2022]
Abstract
Genome-wide association studies (GWAS) are widely used to identify loci associated with phenotypic traits in the domestic dog, which has emerged as a model for Mendelian and complex traits. However, a disadvantage of GWAS is that it always requires subsequent fine-mapping or sequencing to pinpoint causal mutations. Here, we performed whole exome sequencing (WES) and canine high-density (cHD) SNP genotyping of 28 dogs from 3 breeds to compare the SNP and linkage disequilibrium characteristics, together with the power and mapping precision, of exome-guided GWAS (EG-GWAS) versus cHD-based GWAS. Using simulated phenotypes, we showed that EG-GWAS has higher power than cHD to detect associations within target regions and less power outside target regions, with power being influenced further by sample size and SNP density. We analyzed two real phenotypes (hair length and furnishing), which are fixed in certain breeds, to characterize the mapping precision for the known causal mutations. EG-GWAS identified the associated exonic and 3'UTR variants within the FGF5 and RSPO2 genes, respectively, with only a few samples per breed. In conclusion, we demonstrated that EG-GWAS can identify loci associated with Mendelian phenotypes both within and across breeds.
Affiliation(s)
- Bart J G Broeckx
- Laboratory of Animal Genetics, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium
- Thomas Derrien
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Stéphanie Mottier
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Valentin Wucher
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Edouard Cadieu
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Benoît Hédan
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Céline Le Béguec
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Nadine Botherel
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Jimmy H Saunders
- Department of Medical Imaging and Orthopedics, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium
- Dieter Deforce
- Laboratory of Pharmaceutical Biotechnology, Faculty of Pharmaceutical Sciences, Ghent University, Ghent, Belgium
- Catherine André
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
- Luc Peelman
- Laboratory of Animal Genetics, Faculty of Veterinary Medicine, Ghent University, Merelbeke, Belgium
- Christophe Hitte
- Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université Rennes1, Rennes, France
14
Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat Genet 2017; 49:1421-1427. [PMID: 28892061 DOI: 10.1038/ng.3954] [Citation(s) in RCA: 274] [Impact Index Per Article: 39.1] [Received: 10/19/2016] [Accepted: 08/16/2017] [Indexed: 12/14/2022]
Abstract
Recent work has hinted at the linkage disequilibrium (LD)-dependent architecture of human complex traits, where SNPs with low levels of LD (LLD) have larger per-SNP heritability. Here we analyzed summary statistics from 56 complex traits (average N = 101,401) by extending stratified LD score regression to continuous annotations. We determined that SNPs with low LLD have significantly larger per-SNP heritability and that roughly half of this effect can be explained by functional annotations negatively correlated with LLD, such as DNase I hypersensitivity sites (DHSs). The remaining signal is largely driven by our finding that more recent common variants tend to have lower LLD and to explain more heritability (P = 2.38 × 10^-104); the youngest 20% of common SNPs explain 3.9 times more heritability than the oldest 20%, consistent with the action of negative selection. We also inferred jointly significant effects of other LD-related annotations and confirmed via forward simulations that they jointly predict deleterious effects.
15
Pengelly RJ, Vergara-Lope A, Alyousfi D, Jabalameli MR, Collins A. Understanding the disease genome: gene essentiality and the interplay of selection, recombination and mutation. Brief Bioinform 2017; 20:267-273. [DOI: 10.1093/bib/bbx110] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/06/2017] [Indexed: 12/24/2022]
Affiliation(s)
- Reuben J Pengelly
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
- Alejandra Vergara-Lope
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
- Dareen Alyousfi
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
- M Reza Jabalameli
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
- Andrew Collins
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Southampton, UK
16
Usset JL, Raghavan R, Tyrer JP, McGuire V, Sieh W, Webb P, Chang-Claude J, Rudolph A, Anton-Culver H, Berchuck A, Brinton L, Cunningham JM, DeFazio A, Doherty JA, Edwards RP, Gayther SA, Gentry-Maharaj A, Goodman MT, Høgdall E, Jensen A, Johnatty SE, Kiemeney LA, Kjaer SK, Larson MC, Lurie G, Massuger L, Menon U, Modugno F, Moysich KB, Ness RB, Pike MC, Ramus SJ, Rossing MA, Rothstein J, Song H, Thompson PJ, van den Berg DJ, Vierkant RA, Wang-Gohrke S, Wentzensen N, Whittemore AS, Wilkens LR, Wu AH, Yang H, Pearce CL, Schildkraut JM, Pharoah P, Goode EL, Fridley BL. Assessment of Multifactor Gene-Environment Interactions and Ovarian Cancer Risk: Candidate Genes, Obesity, and Hormone-Related Risk Factors. Cancer Epidemiol Biomarkers Prev 2016; 25:780-90. [PMID: 26976855 PMCID: PMC4873330 DOI: 10.1158/1055-9965.epi-15-1039] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 10/02/2015] [Accepted: 01/21/2016] [Indexed: 11/16/2022]
Abstract
BACKGROUND: Many epithelial ovarian cancer (EOC) risk factors relate to hormone exposure, and elevated estrogen levels are associated with obesity in postmenopausal women. Therefore, we hypothesized that gene-environment interactions related to hormone-related risk factors could differ between obese and non-obese women. METHODS: We considered interactions between 11,441 SNPs within 80 candidate genes related to hormone biosynthesis and metabolism and insulin-like growth factors with six hormone-related factors (oral contraceptive use, parity, endometriosis, tubal ligation, hormone replacement therapy, and estrogen use) and assessed whether these interactions differed between obese and non-obese women. Interactions were assessed using logistic regression models and data from 14 case-control studies (6,247 cases; 10,379 controls). Histotype-specific analyses were also completed. RESULTS: SNPs in the following candidate genes showed notable interaction: IGF1R (rs41497346, estrogen plus progesterone hormone therapy, histology = all, P = 4.9 × 10^-6) and ESR1 (rs12661437, endometriosis, histology = all, P = 1.5 × 10^-5). The most notable obesity-gene-hormone risk factor interaction was within INSR (rs113759408, parity, histology = endometrioid, P = 8.8 × 10^-6). CONCLUSIONS: We have demonstrated the feasibility of assessing multifactor interactions in large genetic epidemiology studies. Follow-up studies are necessary to assess the robustness of our findings for ESR1, CYP11A1, IGF1R, CYP11B1, INSR, and IGFBP2. Future work is needed to develop powerful statistical methods able to detect these complex interactions. IMPACT: Assessment of multifactor interaction is feasible, and, here, suggests that the relationship between genetic variants within candidate genes and hormone-related risk factors may vary EOC susceptibility.
Affiliation(s)
- Joseph L Usset
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas
- Rama Raghavan
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas
- Jonathan P Tyrer
- Department of Oncology, University of Cambridge Strangeways Research Laboratory, Cambridge, United Kingdom
- Valerie McGuire
- Department of Health Research and Policy - Epidemiology, Stanford University School of Medicine, Stanford, California
- Weiva Sieh
- Department of Health Research and Policy - Epidemiology, Stanford University School of Medicine, Stanford, California
- Penelope Webb
- Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Anja Rudolph
- Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Hoda Anton-Culver
- Department of Epidemiology, University of California Irvine, Irvine, California
- Andrew Berchuck
- Department of Obstetrics and Gynecology, Duke University Medical Center, Durham, North Carolina
- Louise Brinton
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
- Julie M Cunningham
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
- Anna DeFazio
- Discipline of Obstetrics, Gynecology, and Neonatology, University of Sydney, Westmead Institute for Cancer Research, Westmead Millennium Institute, Westmead, New South Wales, Australia
- Jennifer A Doherty
- Department of Epidemiology, Geisel School of Medicine, Hanover, New Hampshire
- Robert P Edwards
- Department of Obstetrics, Gynecology, and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Simon A Gayther
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California
- Marc T Goodman
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California
- Estrid Høgdall
- Department of Virus, Lifestyle, and Genes, Danish Cancer Society Research Center, Copenhagen, Denmark. Department of Pathology, Herlev Hospital, University of Copenhagen, Copenhagen, Denmark
- Allan Jensen
- Department of Virus, Lifestyle, and Genes, Danish Cancer Society Research Center, Copenhagen, Denmark
- Sharon E Johnatty
- Division of Genetics and Public Health, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Lambertus A Kiemeney
- Department of Health Evidence, Radboud University Medical Centre, Nijmegen, the Netherlands
- Susanne K Kjaer
- Department of Gynecology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
- Melissa C Larson
- Department of Health Science Research, Mayo Clinic, Rochester, Minnesota
- Galina Lurie
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Leon Massuger
- Department of Obstetrics & Gynecology, Radboud University Medical Center, Nijmegen, the Netherlands
- Usha Menon
- Women's Cancer, Institute for Women's Health, University College London, London, United Kingdom
- Francesmary Modugno
- Department of Obstetrics, Gynecology, and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania. Department of Epidemiology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Kirsten B Moysich
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, New York
- Roberta B Ness
- School of Public Health, The University of Texas, Houston, Texas
- Malcolm C Pike
- Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York
- Susan J Ramus
- Department of Preventive Medicine, University of Southern California, Los Angeles, California
- Mary Anne Rossing
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington. Department of Epidemiology, University of Washington, Seattle, Washington
| | - Joseph Rothstein
- Department of Health Research and Policy - Epidemiology, Stanford University School of Medicine, Stanford, California
| | - Honglin Song
- Department of Oncology, University of Cambridge Strangeways Research Laboratory, Cambridge, United Kingdom
| | - Pamela J Thompson
- Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California
| | - David J van den Berg
- Department of Preventive Medicine, University of Southern California, Los Angeles, California
| | - Robert A Vierkant
- Department of Health Science Research, Mayo Clinic, Rochester, Minnesota
| | - Shan Wang-Gohrke
- Department of Obstetrics and Gynecology, University of Ulm, Ulm, Germany
| | - Nicolas Wentzensen
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Alice S Whittemore
- Department of Health Research and Policy - Epidemiology, Stanford University School of Medicine, Stanford, California
| | - Lynne R Wilkens
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
| | - Anna H Wu
- Department of Preventive Medicine, University of Southern California, Los Angeles, California
| | - Hannah Yang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Celeste Leigh Pearce
- Department of Preventive Medicine, University of Southern California, Los Angeles, California. Department of Epidemiology, University of Michigan, Ann Arbor, Michigan
| | - Joellen M Schildkraut
- Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia
| | - Paul Pharoah
- Department of Oncology, University of Cambridge Strangeways Research Laboratory, Cambridge, United Kingdom. Department of Public Health and Primary Care, University of Cambridge Strangeways Research Laboratory, Cambridge, United Kingdom
| | - Ellen L Goode
- Department of Health Science Research, Mayo Clinic, Rochester, Minnesota
| | - Brooke L Fridley
- Department of Biostatistics, University of Kansas Medical Center, Kansas City, Kansas.
|
17
|
Sodeland M, Jorde PE, Lien S, Jentoft S, Berg PR, Grove H, Kent MP, Arnyasi M, Olsen EM, Knutsen H. "Islands of Divergence" in the Atlantic Cod Genome Represent Polymorphic Chromosomal Rearrangements. Genome Biol Evol 2016; 8:1012-22. [PMID: 26983822 PMCID: PMC4860689 DOI: 10.1093/gbe/evw057] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Indexed: 01/17/2023]
Abstract
In several species genetic differentiation across environmental gradients or between geographically separate populations has been reported to center at "genomic islands of divergence," resulting in heterogeneous differentiation patterns across genomes. Here, genomic regions of elevated divergence were observed on three chromosomes of the highly mobile fish Atlantic cod (Gadus morhua) within geographically fine-scaled coastal areas. The "genomic islands" extended at least 5, 9.5, and 13 megabases on linkage groups 2, 7, and 12, respectively, and coincided with large blocks of linkage disequilibrium. For each of these three chromosomes, pairs of segregating, highly divergent alleles were identified, with little or no gene exchange between them. These patterns of recombination and divergence mirror genomic signatures previously described for large polymorphic inversions, which have been shown to repress recombination across extensive chromosomal segments. The lack of genetic exchange permits divergence between noninverted and inverted chromosomes in spite of gene flow. For the rearrangements on linkage groups 2 and 12, allelic frequency shifts between coastal and oceanic environments suggest a role in ecological adaptation, in agreement with recently reported associations between molecular variation within these genomic regions and temperature, oxygen, and salinity levels. Elevated genetic differentiation in these genomic regions has previously been described on both sides of the Atlantic Ocean, and we therefore suggest that these polymorphisms are involved in adaptive divergence across the species distributional range.
Affiliation(s)
- Marte Sodeland
- Institute of Marine Research, Flødevigen, Norway. Department of Natural Sciences, Faculty of Engineering and Science, University of Agder, Kristiansand, Norway
| | - Per Erik Jorde
- Centre for Ecological and Evolutionary Syntheses, Department of Biosciences, University of Oslo, Norway
| | - Sigbjørn Lien
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Norway
| | - Sissel Jentoft
- Department of Natural Sciences, Faculty of Engineering and Science, University of Agder, Kristiansand, Norway. Centre for Ecological and Evolutionary Syntheses, Department of Biosciences, University of Oslo, Norway
| | - Paul R Berg
- Centre for Ecological and Evolutionary Syntheses, Department of Biosciences, University of Oslo, Norway
| | - Harald Grove
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Norway
| | - Matthew P Kent
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Norway
| | - Mariann Arnyasi
- Centre for Integrative Genetics, Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Norway
| | - Esben Moland Olsen
- Institute of Marine Research, Flødevigen, Norway. Department of Natural Sciences, Faculty of Engineering and Science, University of Agder, Kristiansand, Norway
| | - Halvor Knutsen
- Institute of Marine Research, Flødevigen, Norway. Department of Natural Sciences, Faculty of Engineering and Science, University of Agder, Kristiansand, Norway. Centre for Ecological and Evolutionary Syntheses, Department of Biosciences, University of Oslo, Norway
|
18
|
Berger S, Schlather M, de los Campos G, Weigend S, Preisinger R, Erbe M, Simianer H. A Scale-Corrected Comparison of Linkage Disequilibrium Levels between Genic and Non-Genic Regions. PLoS One 2015; 10:e0141216. [PMID: 26517830 PMCID: PMC4627745 DOI: 10.1371/journal.pone.0141216] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/16/2015] [Accepted: 10/06/2015] [Indexed: 12/27/2022]
Abstract
The understanding of non-random association between loci, termed linkage disequilibrium (LD), plays a central role in genomic research. Since causal mutations are generally not included in genomic marker data, LD between those and available markers is essential for capturing the effects of causal loci on localizing genes responsible for traits. Thus, the interpretation of association studies requires a detailed knowledge of LD patterns. It is well known that most LD measures depend on minor allele frequencies (MAF) of the considered loci and the magnitude of LD is influenced by the physical distances between loci. In the present study, a procedure to compare the LD structure between genomic regions comprising several markers each is suggested. The approach accounts for different scaling factors, namely the distribution of MAF, the distribution of pair-wise differences in MAF, and the physical extent of compared regions, reflected by the distribution of pair-wise physical distances. In the first step, genomic regions are matched based on similarity in these scaling factors. In the second step, chromosome- and genome-wide significance tests for differences in medians of LD measures in each pair are performed. The proposed framework was applied to test the hypothesis that the average LD is different in genic and non-genic regions. This was tested with a genome-wide approach with data sets for humans (Homo sapiens), a highly selected chicken line (Gallus gallus domesticus) and the model plant Arabidopsis thaliana. In all three data sets we found a significantly higher level of LD in genic regions compared to non-genic regions. About 31% more LD was detected genome-wide in genic compared to non-genic regions in Arabidopsis thaliana, followed by 13.6% in human and 6% in chicken. Chromosome-wide comparison discovered significant differences on all 5 chromosomes in Arabidopsis thaliana and on one third of the human and of the chicken chromosomes.
Affiliation(s)
- Swetlana Berger
- Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
| | - Martin Schlather
- School of Business Informatics and Mathematics, University of Mannheim, Mannheim, Germany
| | - Gustavo de los Campos
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, United States of America
| | - Steffen Weigend
- Institut of Farm Animal Genetics, Friedrich-Loeffler Institut, Neustadt-Mariensee, Germany
| | | | - Malena Erbe
- Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
| | - Henner Simianer
- Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August-University, Goettingen, Germany
|
19
|
Hussin JG, Hodgkinson A, Idaghdour Y, Grenier JC, Goulet JP, Gbeha E, Hip-Ki E, Awadalla P. Recombination affects accumulation of damaging and disease-associated mutations in human populations. Nat Genet 2015; 47:400-4. [DOI: 10.1038/ng.3216] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 05/09/2014] [Accepted: 01/14/2015] [Indexed: 01/17/2023]
|
20
|
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 2015; 47:291-5. [PMID: 25642630 DOI: 10.1038/ng.3211] [Citation(s) in RCA: 2975] [Impact Index Per Article: 330.6] [Received: 03/07/2014] [Accepted: 01/07/2015] [Indexed: 12/16/2022]
Abstract
Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
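The relationship described in this abstract, test statistics regressed on LD, can be illustrated with a toy fit. The sketch below is only an illustration under stated assumptions: the function name `fit_line` and all input values are hypothetical, and it uses plain unweighted least squares, omitting any weighting or standard-error estimation a real analysis would need.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy inputs: one LD score and one GWAS chi-square statistic per SNP.
# The chi-square values were constructed as 1.05 + 0.02 * LD score.
ld_scores = [10.0, 20.0, 30.0, 40.0, 50.0]
chi_square = [1.25, 1.45, 1.65, 1.85, 2.05]

intercept, slope = fit_line(ld_scores, chi_square)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

In this toy reading, the slope reflects signal that grows with LD, while an intercept above 1 (the mean of a 1-degree-of-freedom chi-square statistic under the null) reflects inflation that does not depend on LD.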
|
21
|
Stearman RS, Cornelius AR, Lu X, Conklin DS, Del Rosario MJ, Lowe AM, Elos MT, Fettig LM, Wong RE, Hara N, Cogan JD, Phillips JA, Taylor MR, Graham BB, Tuder RM, Loyd JE, Geraci MW. Functional prostacyclin synthase promoter polymorphisms. Impact in pulmonary arterial hypertension. Am J Respir Crit Care Med 2014; 189:1110-20. [PMID: 24605778 DOI: 10.1164/rccm.201309-1697oc] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 01/16/2023]
Abstract
RATIONALE: Pulmonary arterial hypertension (PAH) is a progressive disease characterized by elevated pulmonary artery pressure, vascular remodeling, and ultimately right ventricular heart failure. PAH can have a genetic component (heritable PAH), most often through mutations of bone morphogenetic protein receptor 2, and idiopathic and associated forms. Heritable PAH is not completely penetrant within families, with approximately 20% concurrence of inactivating bone morphogenetic protein receptor 2 mutations and delayed onset of PAH disease. Because one of the treatment options is using prostacyclin analogs, we hypothesized that prostacyclin synthase promoter sequence variants associated with increased mRNA expression may play a protective role in the bone morphogenetic protein receptor 2 unaffected carriers. OBJECTIVES: To characterize the range of prostacyclin synthase promoter variants and assess their transcriptional activities in PAH-relevant cell types. To determine the distribution of prostacyclin synthase promoter variants in PAH, unaffected carriers in heritable PAH families, and control populations. METHODS: Polymerase chain reaction approaches were used to genotype prostacyclin synthase promoter variants in more than 300 individuals. Prostacyclin synthase promoter haplotypes' transcriptional activities were determined with luciferase reporter assays. MEASUREMENTS AND MAIN RESULTS: We identified a comprehensive set of prostacyclin synthase promoter variants and tested their transcriptional activities in PAH-relevant cell types. We demonstrated differences of prostacyclin synthase promoter activities dependent on their haplotype. CONCLUSIONS: Prostacyclin synthase promoter sequence variants exhibit a range of transcriptional activities. We discovered a significant bias for more active prostacyclin synthase promoter variants in unaffected carriers as compared with affected patients with PAH.
Affiliation(s)
- Robert S Stearman
- 1 Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado Denver, School of Medicine, Aurora, Colorado
|
22
|
Spencer AV, Cox A, Walters K. Comparing the efficacy of SNP filtering methods for identifying a single causal SNP in a known association region. Ann Hum Genet 2013; 78:50-61. [PMID: 24205929 PMCID: PMC4282378 DOI: 10.1111/ahg.12043] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 04/09/2013] [Accepted: 09/05/2013] [Indexed: 01/20/2023]
Abstract
Genome-wide association studies have successfully identified associations between common diseases and a large number of single nucleotide polymorphisms (SNPs) across the genome. We investigate the effectiveness of several statistics, including p-values, likelihoods, genetic map distance and linkage disequilibrium between SNPs, in filtering SNPs in several disease-associated regions. We use simulated data to compare the efficacy of filters with different sample sizes and for causal SNPs with different minor allele frequencies (MAFs) and effect sizes, focusing on the small effect sizes and MAFs likely to represent the majority of unidentified causal SNPs. In our analyses, of all the methods investigated, filtering on the ranked likelihoods consistently retains the true causal SNP with the highest probability for a given false positive rate. This was the case for all the local linkage disequilibrium patterns investigated. Our results indicate that when using this method to retain only the top 5% of SNPs, even a causal SNP with an odds ratio of 1.1 and MAF of 0.08 can be retained with a probability exceeding 0.9 using an overall sample size of 50,000.
Affiliation(s)
- Amy Victoria Spencer
- School of Mathematics and Statistics, University of Sheffield, Sheffield, S3 7RH, UK
|
23
|
Soufflet-Freslon V, Jourdan M, Clotault J, Huet S, Briard M, Peltier D, Geoffriau E. Functional gene polymorphism to reveal species history: the case of the CRTISO gene in cultivated carrots. PLoS One 2013; 8:e70801. [PMID: 23940644 PMCID: PMC3733727 DOI: 10.1371/journal.pone.0070801] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/07/2013] [Accepted: 06/24/2013] [Indexed: 01/01/2023]
Abstract
Background: Carrot is a vegetable cultivated worldwide for the consumption of its root. Historical data indicate that root colour has been differentially selected over time and according to geographical areas. Root pigmentation depends on the relative proportion of different carotenoids for the white, yellow, orange and red types but only internally for the purple one. The genetic control for root carotenoid content might be partially associated with carotenoid biosynthetic genes. Carotenoid isomerase (CRTISO) has emerged as a regulatory step in the carotenoid biosynthesis pathway and could be a good candidate to show how a metabolic pathway gene reflects a species genetic history. Methodology/Principal Findings: In this study, the nucleotide polymorphism and the linkage disequilibrium among the complete CRTISO sequence, and the deviation from neutral expectation were analysed by considering population subdivision revealed with 17 microsatellite markers. A sample of 39 accessions, which represented different geographical origins and root colours, was used. Cultivated carrot was divided into two genetic groups: one from Middle East and Asia (Eastern group), and another one mainly from Europe (Western group). The Western and Eastern genetic groups were suggested to be differentially affected by selection: a signature of balancing selection was detected within the first group whereas the second one showed no selection. A focus on orange-rooted carrots revealed that cultivars cultivated in Asia were mainly assigned to the Western group but showed CRTISO haplotypes common to Eastern carrots. Conclusion: The carotenoid pathway CRTISO gene data proved to be complementary to neutral markers in order to bring critical insight in the cultivated carrot history. We confirmed the occurrence of two migration events since domestication. Our results showed a European background in material from Japan and Central Asia. While confirming the introduction of European carrots in Japanese resources, the history of Central Asia material remains unclear.
Affiliation(s)
- Vanessa Soufflet-Freslon
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Matthieu Jourdan
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Jérémy Clotault
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Sébastien Huet
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Mathilde Briard
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Didier Peltier
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
| | - Emmanuel Geoffriau
- Agrocampus Ouest, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- Université d’Angers, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Angers, France
- INRA, UMR1345 Institut de Recherche en Horticulture et Semences, SFR4207 QUASAV, Beaucouzé, France
|
24
|
Gibson J, Tapper W, Ennis S, Collins A. Exome-based linkage disequilibrium maps of individual genes: functional clustering and relationship to disease. Hum Genet 2012; 132:233-43. [PMID: 23124193 DOI: 10.1007/s00439-012-1243-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 08/03/2012] [Accepted: 10/20/2012] [Indexed: 11/26/2022]
Abstract
Exome sequencing identifies thousands of DNA variants and a proportion of these are involved in disease. Genotypes derived from exome sequences provide particularly high-resolution coverage enabling study of the linkage disequilibrium structure of individual genes. The extent and strength of linkage disequilibrium reflects the combined influences of mutation, recombination, selection and population history. By constructing linkage disequilibrium maps of individual genes, we show that genes containing OMIM-listed disease variants are significantly under-represented amongst genes with complete or very strong linkage disequilibrium (P = 0.0004). In contrast, genes with disease variants are significantly over-represented amongst genes with levels of linkage disequilibrium close to the average for genes not known to contain disease variants (P = 0.0038). Functional clustering reveals, amongst genes with particularly strong linkage disequilibrium, significant enrichment of essential biological functions (e.g. phosphorylation, cell division, cellular transport and metabolic processes). Strong linkage disequilibrium, corresponding to reduced haplotype diversity, may reflect selection in utero against deleterious mutations which have profound impact on the function of essential genes. Genes with very weak linkage disequilibrium show enrichment of functions requiring greater allelic diversity (e.g. sensory perception and immune response). This category is not enriched for genes containing disease variation. In contrast, there is significant enrichment of genes containing disease variants amongst genes with more average levels of linkage disequilibrium. Mutations in these genes may be less likely to lead to in utero lethality and may be subject to less intense selection.
Affiliation(s)
- Jane Gibson
- Genetic Epidemiology and Genomic informatics Group, Human Genetics, University of Southampton, Southampton General Hospital, Southampton, UK
|
25
|
Schurink A, Wolc A, Ducro BJ, Frankena K, Garrick DJ, Dekkers JCM, van Arendonk JAM. Genome-wide association study of insect bite hypersensitivity in two horse populations in the Netherlands. Genet Sel Evol 2012; 44:31. [PMID: 23110538 PMCID: PMC3524047 DOI: 10.1186/1297-9686-44-31] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 06/19/2012] [Accepted: 10/19/2012] [Indexed: 01/09/2023]
Abstract
Background: Insect bite hypersensitivity is a common allergic disease in horse populations worldwide. Insect bite hypersensitivity is affected by both environmental and genetic factors. However, little is known about genes contributing to the genetic variance associated with insect bite hypersensitivity. Therefore, the aim of our study was to identify and quantify genomic associations with insect bite hypersensitivity in Shetland pony mares and Icelandic horses in the Netherlands. Methods: Data on 200 Shetland pony mares and 146 Icelandic horses were collected according to a matched case–control design. Cases and controls were matched on various factors (e.g. region, sire) to minimize effects of population stratification. Breed-specific genome-wide association studies were performed using 70 k single nucleotide polymorphisms genotypes. Bayesian variable selection method Bayes-C with a threshold model implemented in GenSel software was applied. A 1 Mb non-overlapping window approach that accumulated contributions of adjacent single nucleotide polymorphisms was used to identify associated genomic regions. Results: The percentage of variance explained by all single nucleotide polymorphisms was 13% in Shetland pony mares and 28% in Icelandic horses. The 20 non-overlapping windows explaining the largest percentages of genetic variance were found on nine chromosomes in Shetland pony mares and on 14 chromosomes in Icelandic horses. Overlap in identified associated genomic regions between breeds would suggest interesting candidate regions to follow-up on. Such regions common to both breeds (within 15 Mb) were found on chromosomes 3, 7, 11, 20 and 23. Positional candidate genes within 2 Mb from the associated windows were identified on chromosome 20 in both breeds. Candidate genes are within the equine lymphocyte antigen class II region, which evokes an immune response by recognizing many foreign molecules. Conclusions: The genome-wide association study identified several genomic regions associated with insect bite hypersensitivity in Shetland pony mares and Icelandic horses. On chromosome 20, associated genomic regions in both breeds were within 2 Mb from the equine lymphocyte antigen class II region. Increased knowledge on insect bite hypersensitivity associated genes will contribute to our understanding of its biology, enabling more efficient selection, therapy and prevention to decrease insect bite hypersensitivity prevalence.
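The 1 Mb non-overlapping window approach mentioned in this abstract can be pictured as simple binning by position. The sketch below is an assumption-laden illustration, not the GenSel implementation: the function name `window_variance`, the tuple layout, and every number are hypothetical, and per-SNP contributions are simply summed.

```python
from collections import defaultdict

WINDOW_SIZE = 1_000_000  # 1 Mb, as stated in the abstract


def window_variance(snps, window_size=WINDOW_SIZE):
    """Sum per-SNP contributions within non-overlapping windows.

    snps: iterable of (chromosome, position_bp, contribution) tuples.
    Returns a dict keyed by (chromosome, window_index).
    """
    totals = defaultdict(float)
    for chrom, pos, contribution in snps:
        window = pos // window_size  # integer bin, so windows never overlap
        totals[(chrom, window)] += contribution
    return dict(totals)


# Hypothetical SNPs: (chromosome, position in bp, share of variance explained).
snps = [
    (20, 150_000, 0.5),
    (20, 900_000, 0.25),
    (20, 1_200_000, 0.125),
    (3, 150_000, 0.25),
]

totals = window_variance(snps)
top = max(totals, key=totals.get)  # window with the largest accumulated share
print(top, totals[top])
```

Ranking the windows by their accumulated share is one way to arrive at a "top windows" list like the one the abstract reports.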
Affiliation(s)
- Anouk Schurink
- Animal Breeding and Genomics Centre, Wageningen University, P.O. Box 338, Wageningen, 6700 AH, the Netherlands
|
26
|
Dong X, Zhong T, Xu T, Xia Y, Li B, Li C, Yuan L, Ding G, Li Y. Evaluating coverage of exons by HapMap SNPs. Genomics 2012; 101:20-3. [PMID: 23000193 DOI: 10.1016/j.ygeno.2012.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2011] [Revised: 07/22/2012] [Accepted: 09/07/2012] [Indexed: 11/29/2022]
Abstract
Genome-wide association (GWA) studies are currently one of the most powerful tools in identifying disease-associated genes or variants. In typical GWA studies, single-nucleotide polymorphisms (SNPs) are often used as genetic markers. Therefore, it is critical to estimate the percentage of genetic variations which can be covered by SNPs through linkage disequilibrium (LD). In this study, we use the concept of haplotype blocks to evaluate the coverage of five SNP sets, including the HapMap and four commercial arrays, for every exon in the human genome. We show that although some chips can reach coverage similar to that of the HapMap, only about 50% of exons are completely covered by haplotype blocks of HapMap SNPs. We suggest that further high-resolution genotyping methods are required to provide adequate genome-wide power for identifying variants.
Affiliation(s)
- Xiao Dong
- Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
|
27
|
Single Nucleotide Polymorphism (SNP) Detection and Genotype Calling from Massively Parallel Sequencing (MPS) Data. Statistics in Biosciences 2012; 5:3-25. [PMID: 24489615 DOI: 10.1007/s12561-012-9067-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
Abstract
Massively parallel sequencing (MPS), since its debut in 2005, has transformed the field of genomic studies. These new sequencing technologies have resulted in the successful identification of causal variants for several rare Mendelian disorders. They have also begun to deliver on their promise to explain some of the missing heritability from genome-wide association studies (GWAS) of complex traits. We anticipate a rapidly growing number of MPS-based studies for a diverse range of applications in the near future. One crucial and nearly inevitable step is to detect SNPs and call genotypes at the detected polymorphic sites from the sequencing data. Here, we review statistical methods that have been proposed in the past five years for this purpose. In addition, we discuss emerging issues and future directions related to SNP detection and genotype calling from MPS data.
|
28
|
Freudenberg J, Gregersen PK, Freudenberg-Hua Y. A simple method for analyzing exome sequencing data shows distinct levels of nonsynonymous variation for human immune and nervous system genes. PLoS One 2012; 7:e38087. [PMID: 22701602 PMCID: PMC3368947 DOI: 10.1371/journal.pone.0038087] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/08/2012] [Accepted: 05/03/2012] [Indexed: 11/29/2022]
Abstract
To measure the strength of natural selection that acts upon single nucleotide variants (SNVs) in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs) per nonsynonymous site and synonymous SNVs (sSNVs) per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv) in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs), nervous system genes (NSGs), randomly sampled genes (RSGs), and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
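The ratio described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `rdnsv`, its parameters, and the example counts are hypothetical, and applying the correction factor f as a plain multiplier is an assumption, since the abstract does not say how f enters the calculation.

```python
def rdnsv(ns_snvs, ns_sites, s_snvs, s_sites, f=1.0):
    """Relative density of nonsynonymous SNVs against the synonymous baseline.

    f stands for the correction factor mentioned in the abstract; treating it
    as a simple multiplier is an assumption of this sketch.
    """
    if ns_sites <= 0 or s_sites <= 0 or s_snvs <= 0:
        raise ValueError("site counts and synonymous SNV count must be positive")
    ns_density = ns_snvs / ns_sites  # nsSNVs per nonsynonymous site
    s_density = s_snvs / s_sites     # sSNVs per synonymous site
    return f * ns_density / s_density


# Made-up counts, for illustration only.
example = rdnsv(ns_snvs=30, ns_sites=3000, s_snvs=25, s_sites=1000)
print(round(example, 3))
```

A value well below 1 would mean that nonsynonymous variants are rarer than the synonymous (approximately neutral) baseline predicts, which is how the abstract reads the lower estimate for nervous system genes.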
Affiliation(s)
- Jan Freudenberg
- Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Northshore LIJ Healthsystem, Manhasset, New York, United States of America.
29
Frenkel S, Kirzhner V, Korol A. Organizational heterogeneity of vertebrate genomes. PLoS One 2012; 7:e32076. [PMID: 22384143 PMCID: PMC3288070 DOI: 10.1371/journal.pone.0032076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/15/2011] [Accepted: 01/23/2012] [Indexed: 01/06/2023]
Abstract
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
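The k-mer scoring step behind Compositional Spectra analysis can be illustrated with a simplified sketch. It counts exact k-mer occurrences in a sliding window and normalizes them to frequencies; the function name `kmer_spectrum` is hypothetical, and the published CS method may select and match words differently, so this should be read as an approximation of the idea only.

```python
from collections import Counter


def kmer_spectrum(sequence, k):
    """Return the relative frequency of each k-mer found in a DNA sequence.

    Windows containing characters other than A, C, G, T (for example N)
    are skipped. Exact matching is a simplification of this sketch.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Toy sequence, for illustration only.
spectrum = kmer_spectrum("ACGTAC", 2)
print(sorted(spectrum.items()))
```

Spectra computed this way for different genome segments could then be compared with any vector distance to score heterogeneity between regions.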
Affiliation(s)
- Abraham Korol
- Department of Evolutionary and Environmental Biology and Institute of Evolution, University of Haifa, Mount Carmel, Haifa, Israel
30
A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response. PLoS One 2012; 7:e32091. [PMID: 22384149 PMCID: PMC3285204 DOI: 10.1371/journal.pone.0032091] [Citation(s) in RCA: 121] [Impact Index Per Article: 10.1] [Received: 08/29/2011] [Accepted: 01/23/2012] [Indexed: 11/19/2022]
Abstract
Circadian rhythm abnormalities in bipolar disorder (BD) have led to a search for genetic abnormalities in circadian “clock genes” associated with BD. However, no significant clock gene findings have emerged from genome-wide association studies (GWAS). At least three factors could account for this discrepancy: complex traits are polygenic, the organization of the clock is more complex than previously recognized, and/or genetic risk for BD may be shared across multiple illnesses. To investigate these issues, we considered the clock gene network at three levels: essential “core” clock genes, upstream circadian clock modulators, and downstream clock controlled genes. Using relaxed thresholds for GWAS statistical significance, we determined the rates of clock vs. control genetic associations with BD, and four additional illnesses that share clinical features and/or genetic risk with BD (major depression, schizophrenia, attention deficit/hyperactivity). Then we compared the results to a set of lithium-responsive genes. Associations with BD-spectrum illnesses and lithium-responsiveness were both enriched among core clock genes but not among upstream clock modulators. Associations with BD-spectrum illnesses and lithium-responsiveness were also enriched among pervasively rhythmic clock-controlled genes but not among genes that were less pervasively rhythmic or non-rhythmic. Our analysis reveals previously unrecognized associations between clock genes and BD-spectrum illnesses, partly reconciling previously discordant results from past GWAS and candidate gene studies.
31
Sun P, Zhang R, Jiang Y, Wang X, Li J, Lv H, Tang G, Guo X, Meng X, Zhang H, Zhang R. Assessing the patterns of linkage disequilibrium in genic regions of the human genome. FEBS J 2011; 278:3748-55. [DOI: 10.1111/j.1742-4658.2011.08293.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
32
Pavy N, Namroud MC, Gagnon F, Isabel N, Bousquet J. The heterogeneous levels of linkage disequilibrium in white spruce genes and comparative analysis with other conifers. Heredity (Edinb) 2011; 108:273-84. [PMID: 21897435 DOI: 10.1038/hdy.2011.72] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 11/09/2022]
Abstract
In plants, knowledge about linkage disequilibrium (LD) is relevant for the design of efficient single-nucleotide polymorphism arrays in relation to their use in population and association genomics studies. Previous studies of conifer genes have shown LD to decay rapidly within gene limits, but exceptions have been reported. To evaluate the extent of heterogeneity of LD among conifer genes and its potential causes, we examined LD in 105 genes of white spruce (Picea glauca) by sequencing a panel of 48 haploid megagametophytes from natural populations and further compared it with LD in other conifer species. The average pairwise r² value was 0.19 (s.d.=0.19), and LD dropped quickly with a half-decay being reached at a distance of 65 nucleotides between sites. However, LD was significantly heterogeneous among genes. A first group of 29 genes had stronger LD (mean r²=0.28), and a second group of 38 genes had weaker LD (mean r²=0.12). While a strong relationship was found with the recombination rate, there was no obvious relationship between LD and functional classification. The level of nucleotide diversity, which was highly heterogeneous across genes, was also not significantly correlated with LD. A search for selection signatures highlighted significant deviations from the standard neutral model, which could be mostly attributed to recent demographic changes. Little evidence was seen for hitchhiking and clear relationships with LD. When compared among conifer species, on average, levels of LD were similar in genes from white spruce, Norway spruce and Scots pine, whereas loblolly pine and Douglas fir genes exhibited a significantly higher LD.
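For readers unfamiliar with the pairwise LD statistic reported here, the following sketch computes the standard squared allelic correlation between two biallelic sites from haploid samples, where haplotypes are observed directly. The function name `r_squared` and the 0/1 allele coding are assumptions made for illustration; the abstract does not describe the software the authors used.

```python
def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites.

    Each argument is a list of 0/1 alleles, one per haploid sample, in the
    same sample order. D = p_AB - p_A * p_B, and r^2 is D^2 divided by
    p_A (1 - p_A) p_B (1 - p_B).
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Toy data: perfectly linked sites, then unlinked sites.
print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))
print(r_squared([0, 0, 1, 1], [0, 1, 0, 1]))
```

Averaging this value over all site pairs in a gene, and plotting it against the distance between sites, gives the kind of per-gene mean and decay curve the abstract summarizes.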
Affiliation(s)
- N Pavy
- Canada Research Chair in Forest and Environmental Genomics, Forest Research Centre, Université Laval, Québec, Canada.
33
Zhang Y, Liu JS. Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies. J Am Stat Assoc 2011; 106:846-857. [PMID: 22140288 PMCID: PMC3226809 DOI: 10.1198/jasa.2011.ap10657] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNPs) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in scope. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adopted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online.
Affiliation(s)
- Yu Zhang
- Department of Statistics, The Pennsylvania State University, 422A Thomas Building, University Park, PA 16803
- Jun S. Liu
- Department of Statistics, Harvard University, 715 Science Center, 1 Oxford St., Cambridge, MA 02138
34
Fridley BL, Biernacka JM. Gene set analysis of SNP data: benefits, challenges, and future directions. Eur J Hum Genet 2011; 19:837-43. [PMID: 21487444 DOI: 10.1038/ejhg.2011.57] [Citation(s) in RCA: 108] [Impact Index Per Article: 8.3] [Indexed: 01/30/2023]
Abstract
The last decade of human genetic research witnessed the completion of hundreds of genome-wide association studies (GWASs). However, the genetic variants discovered through these efforts account for only a small proportion of the heritability of complex traits. One explanation for the missing heritability is that the common analysis approach, assessing the effect of each single-nucleotide polymorphism (SNP) individually, is not well suited to the detection of small effects of multiple SNPs. Gene set analysis (GSA) is one of several approaches that may contribute to the discovery of additional genetic risk factors for complex traits. Complex phenotypes are thought to be controlled by networks of interacting biochemical and physiological pathways influenced by the products of sets of genes. By assessing the overall evidence of association of a phenotype with all measured variation in a set of genes, GSA may identify functionally relevant sets of genes corresponding to relevant biomolecular pathways, which will enable more focused studies of genetic risk factors. This approach may thus contribute to the discovery of genetic variants responsible for some of the missing heritability. With the increased use of these approaches for the secondary analysis of data from GWAS, it is important to understand the different GSA methods and their strengths and weaknesses, and consider challenges inherent in these types of analyses. This paper provides an overview of GSA, highlighting the key challenges, potential solutions, and directions for ongoing research.
Affiliation(s)
- Brooke L Fridley
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA.
35
Freudenberg J, Lee AT, Siminovitch KA, Amos CI, Ballard D, Li W, Gregersen PK. Locus category based analysis of a large genome-wide association study of rheumatoid arthritis. Hum Mol Genet 2010; 19:3863-72. [PMID: 20639398 PMCID: PMC2935861 DOI: 10.1093/hmg/ddq304] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/01/2010] [Accepted: 07/13/2010] [Indexed: 11/14/2022]
Abstract
To pinpoint true positive single-nucleotide polymorphism (SNP) associations in a genome-wide association study (GWAS) of rheumatoid arthritis (RA), we categorize genetic loci by external knowledge. We test both the 'enrichment of associated loci' in a locus category and the 'combined association' of a locus category. The former is quantified by the odds ratio for the presence of SNP associations at the loci of a category, whereas the latter is quantified by the number of loci in a category that have SNP associations. These measures are compared with their expected values as obtained from the permutation of the affection status. To account for linkage disequilibrium (LD) among SNPs, we view each LD block as a genetic locus. Positional candidates were defined as loci implicated by earlier GWAS results, whereas functional candidates were defined by annotations regarding the molecular roles of genes, such as gene ontology categories. As expected, immune-related categories show the largest enrichment signal, although it is not very strong. The intersection of positional and functional candidate information predicts novel RA loci near the genes TEC/TXK, MBL2 and PIK3R1/CD180. Notably, a combined association signal is not only produced by immune-related categories, but also by most other categories and even randomly defined categories. The unspecific quality of these signals limits the possible conclusions from combined association tests. It also reduces the magnitude of enrichment test results. These unspecific signals might result from common variants of small effect and hardly concentrated in candidate categories, or an inflated size of associated regions from weak LD with infrequent mutations.
Affiliation(s)
- Jan Freudenberg
- Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Manhasset, NY 11030, USA.
36
Mackiewicz D, Zawierta M, Waga W, Cebrat S. Genome analyses and modelling the relationships between coding density, recombination rate and chromosome length. J Theor Biol 2010; 267:186-92. [PMID: 20728453 DOI: 10.1016/j.jtbi.2010.08.022] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 01/06/2010] [Revised: 06/29/2010] [Accepted: 08/17/2010] [Indexed: 01/23/2023]
Abstract
In the human genome, recombination frequency between homologous chromosomes during meiosis is highly correlated with their physical length while it differs significantly when their coding density is considered. Furthermore, it has been observed that the recombination events are distributed unevenly along the chromosomes. We have found that many of such recombination properties can be predicted by computer simulations of population evolution based on the Monte Carlo methods. For example, these simulations have shown that the probability of acceptance of the recombination events by selection is higher at the ends of chromosomes and lower in their middle parts. The regions of high coding density are more prone to enter the strategy of haplotype complementation and to form clusters of genes, which are "recombination deserts". The phenomenon of switching in-between the purifying selection and haplotype complementation has a phase transition character, and many relations between the effective population size, coding density, chromosome size and recombination frequency are those of the power law type.
Affiliation(s)
- Dorota Mackiewicz
- Department of Genomics, Biotechnology Faculty, University of Wroclaw, ul. Przybyszewskiego 63/77, 51-148 Wroclaw, Poland.
37
Cooper DN, Ball EV, Mort M. Chromosomal distribution of disease genes in the human genome. Genet Test Mol Biomarkers 2010; 14:441-6. [PMID: 20642358 DOI: 10.1089/gtmb.2010.0081] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Genes are nonrandomly distributed in the human genome, both within and between chromosomes. Thus, genes of similar function and common evolutionary origin are often clustered, as are genes with similar expression profiles. We now report that the >2400 genes known to underlie human monogenic inherited disease are non-randomly distributed in the genome over and above the general nonrandomness evident in the distribution of human genes. Further, a subset of 315 inherited disease genes subject to gross deletion was found to exhibit a degree of clustering that was twice that manifested by disease genes in general. The clustering of human disease genes is likely to have important implications for understanding the genotype-phenotype relationship in contiguous gene syndromes as well as those conditions characterized by multigene deletions or complex chromosomal rearrangements.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff, United Kingdom.
38
Li S, Shih CH, Kohn MH. Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture. BMC Genomics 2010; 11:322. [PMID: 20497561 PMCID: PMC2891614 DOI: 10.1186/1471-2164-11-322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/30/2009] [Accepted: 05/24/2010] [Indexed: 01/19/2023]
Abstract
Background: The biological dimensions of genes are manifold. These include genomic properties (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties on the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes. Results: Deviations from the stereotypical gene architecture were classified as the following gene constellations: Overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions. Chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. If this scheme is applied, the stereotypical gene emerges as a rare occurrence (7.5%); slightly varied schemes yielded between ~1% and 50%. Moreover, when following our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1 and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and evolutionary rate Ka/Ks estimates, but these biases did not overwhelm the biologically meaningful observation of high evolutionary rates of male reproductive genes. Conclusion: Given the rarity of the solitary stereotypical gene, and the abundance of gene constellations that deviate from it, the presence of gene constellations, while once thought to be exceptional in large Eukaryote genomes, might have broader relevance to the understanding and study of the genome. However, according to our definition, while gene constellations can be significant correlates of functional properties of genes, they generally are weak correlates of the evolution of genes. Thus, the need for their consideration would depend on the context of studies.
Affiliation(s)
- Shuwei Li
- Department of Ecology and Evolutionary Biology, Rice University, 6100 Main Street, MS 170, Houston, Texas 77005, USA
39
Kato M, Kawaguchi T, Ishikawa S, Umeda T, Nakamichi R, Shapero MH, Jones KW, Nakamura Y, Aburatani H, Tsunoda T. Population-genetic nature of copy number variations in the human genome. Hum Mol Genet 2009; 19:761-73. [PMID: 19966329 PMCID: PMC2816609 DOI: 10.1093/hmg/ddp541] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 12/22/2022]
Abstract
Copy number variations (CNVs) are universal genetic variations, and their association with disease has been increasingly recognized. We designed high-density microarrays for CNVs, and detected 3000–4000 CNVs (4–6% of the genomic sequence) per population that included CNVs previously missed because of smaller sizes and residing in segmental duplications. The patterns of CNVs across individuals were surprisingly simple at the kilo-base scale, suggesting the applicability of a simple genetic analysis for these genetic loci. We utilized the probabilistic theory to determine integer copy numbers of CNVs and employed a recently developed phasing tool to estimate the population frequencies of integer copy number alleles and CNV–SNP haplotypes. The results showed a tendency toward a lower frequency of CNV alleles and that most of our CNVs were explained only by zero-, one- and two-copy alleles. Using the estimated population frequencies, we found several CNV regions with exceptionally high population differentiation. Investigation of CNV–SNP linkage disequilibrium (LD) for 500–900 bi- and multi-allelic CNVs per population revealed that previous conflicting reports on bi-allelic LD were unexpectedly consistent and explained by an LD increase correlated with deletion-allele frequencies. Typically, the bi-allelic LD was lower than SNP–SNP LD, whereas the multi-allelic LD was somewhat stronger than the bi-allelic LD. After further investigation of tag SNPs for CNVs, we conclude that the customary tagging strategy for disease association studies can be applicable for common deletion CNVs, but direct interrogation is needed for other types of CNVs.
Affiliation(s)
- Mamoru Kato
- Center for Genomic Medicine, RIKEN, 1-7-22 Suehiro, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
40
Lovell SC, Li X, Weerasinghe NR, Hentges KE. Correlation of microsynteny conservation and disease gene distribution in mammalian genomes. BMC Genomics 2009; 10:521. [PMID: 19909546 PMCID: PMC2779822 DOI: 10.1186/1471-2164-10-521] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/19/2008] [Accepted: 11/12/2009] [Indexed: 12/14/2022]
Abstract
Background: With the completion of the whole genome sequence for many organisms, investigations into genomic structure have revealed that gene distribution is variable, and that genes with similar function or expression are located within clusters. This clustering suggests that there are evolutionary constraints that determine genome architecture. However, as most of the evidence for constraints on genome evolution comes from studies on yeast, it is unclear how much of this prior work can be extrapolated to mammalian genomes. Therefore, in this work we wished to examine the constraints on regions of the mammalian genome containing conserved gene clusters. Results: We first identified regions of the mouse genome with microsynteny conservation by comparing gene arrangement in the mouse genome to the human, rat, and dog genomes. We then asked if any particular gene types were found preferentially in conserved regions. We found a significant correlation between conserved microsynteny and the density of mouse orthologs of human disease genes, suggesting that disease genes are clustered in genomic regions of increased microsynteny conservation. Conclusion: The correlation between microsynteny conservation and disease gene locations indicates that regions of the mouse genome with microsynteny conservation may contain undiscovered human disease genes. This study not only demonstrates that gene function constrains mammalian genome organization, but also identifies regions of the mouse genome that can be experimentally examined to produce mouse models of human disease.
Affiliation(s)
- Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Manchester M139PT, UK
41
Katanforoush A, Sadeghi M, Pezeshk H, Elahi E. Global haplotype partitioning for maximal associated SNP pairs. BMC Bioinformatics 2009; 10:269. [PMID: 19712447 PMCID: PMC2749056 DOI: 10.1186/1471-2105-10-269] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/22/2009] [Accepted: 08/27/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on LD between SNP pairs. We use the Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. Essentially, this model is reduced to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm. RESULTS: In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. Our algorithm performed better than previously reported models in a case-control association study aimed at mapping a single locus trait, based on simulation results that were evaluated by a block-based statistical test. Compared to methods of haplotype block partitioning, we performed best on detection of recombination hotspots. CONCLUSION: Our proposed method divides chromosomes into the regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.
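The significance test named in this abstract can be sketched for a single SNP pair as follows. This is an illustrative two-sided Fisher's exact test on a 2x2 table of haplotype counts, written with the standard library only; the function name, the table layout, and the toy counts are assumptions, and the authors' rule for labelling pairs as associated, independent, or not significant is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    In this sketch a, b, c, d are counts of the four two-SNP haplotypes
    (for example AB, Ab, aB, ab). The p-value sums the hypergeometric
    probabilities of all tables, with the same margins, that are no more
    likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Toy haplotype counts: only AB and ab haplotypes are observed.
print(fisher_exact_two_sided(3, 0, 0, 3))
```

With so few haplotypes even a perfectly correlated pair does not reach conventional significance, which suggests why a "not statistically significant" class is useful alongside "associated" and "independent".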
Affiliation(s)
- Ali Katanforoush
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Mehdi Sadeghi
- National Institute of Genetics Engineering and Biotechnology, Tehran, Iran
- School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics, Tehran, Iran
- Hamid Pezeshk
- School of Mathematics, Statistics and Computer Science, and Center of Excellence in Biomathematics, College of Science, University of Tehran, Tehran, Iran
- Elahe Elahi
- Department of Biology, College of Science, University of Tehran, Tehran, Iran
42
Larkin DM, Pape G, Donthu R, Auvil L, Welge M, Lewin HA. Breakpoint regions and homologous synteny blocks in chromosomes have different evolutionary histories. Genome Res 2009; 19:770-7. [PMID: 19342477 DOI: 10.1101/gr.086546.108] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.0] [Indexed: 01/22/2023]
Abstract
The persistence of large blocks of homologous synteny and a high frequency of breakpoint reuse are distinctive features of mammalian chromosomes that are not well understood in evolutionary terms. To gain a better understanding of the evolutionary forces that affect genome architecture, synteny relationships among 10 amniotes (human, chimp, macaque, rat, mouse, pig, cattle, dog, opossum, and chicken) were compared at <1 human-Mbp resolution. Homologous synteny blocks (HSBs; N = 2233) and chromosome evolutionary breakpoint regions (EBRs; N = 1064) were identified from pairwise comparisons of all genomes. Analysis of the size distribution of HSBs shared in all 10 species' chromosomes (msHSBs) identified three (>20 Mbp) that are larger than expected by chance. Gene network analysis of msHSBs >3 human-Mbp and EBRs <1 Mbp demonstrated that msHSBs are significantly enriched for genes involved in development of the central nervous and other organ systems, whereas EBRs are enriched for genes associated with adaptive functions. In addition, we found EBRs are significantly enriched for structural variations (segmental duplications, copy number variants, and indels), retrotransposed and zinc finger genes, and single nucleotide polymorphisms. These results demonstrate that chromosome breakage in evolution is nonrandom and that HSBs and EBRs are evolving in distinctly different ways. We suggest that natural selection acts on the genome to maintain combinations of genes and their regulatory elements that are essential to fundamental processes of amniote development and biological organization. Furthermore, EBRs may be used extensively to generate new genetic variation and novel combinations of genes and regulatory elements that contribute to adaptive phenotypes.
Affiliation(s)
- Denis M Larkin
- Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
43
|
Abstract
The different genetic variation discovery projects (The SNP Consortium, the International HapMap Project, the 1000 Genomes Project, etc.) aim to identify as much as possible of the underlying genetic variation in various human populations. The question we address in this article is how many new variants are yet to be found. This is an instance of the species problem in ecology, where the goal is to estimate the number of species in a closed population. We use a parametric beta-binomial model that allows us to calculate the expected number of new variants with a desired minimum frequency to be discovered in a new dataset of individuals of a specified size. The method can also be used to predict the number of individuals necessary to sequence in order to capture all (or a fraction of) the variation with a specified minimum frequency. We apply the method to three datasets: the ENCODE dataset, the SeattleSNPs dataset, and the National Institute of Environmental Health Sciences SNPs dataset. Consistent with previous descriptions, our results show that the African population is the most diverse in terms of the number of variants expected to exist, the Asian populations the least diverse, with the European population in-between. In addition, our results show a clear distinction between the Chinese and the Japanese populations, with the Japanese population being the less diverse. To find all common variants (frequency at least 1%) the number of individuals that need to be sequenced is small (approximately 350) and does not differ much among the different populations; our data show that, subject to sequence accuracy, the 1000 Genomes Project is likely to find most of these common variants and a high proportion of the rarer ones (frequency between 0.1 and 1%).
The data reveal a rule of diminishing returns: a small number of individuals (approximately 150) is sufficient to identify 80% of variants with a frequency of at least 0.1%, while a much larger number (>3,000 individuals) is necessary to find all of those variants. Finally, our results also show a much higher diversity in environmental response genes compared with the average genome, especially in African populations.
|
44
|
Abstract
Linkage disequilibrium was estimated using 7119 single nucleotide polymorphism markers across the genome and 200 animals from the North American Holstein cattle population. The analysis of maternally inherited haplotypes revealed strong linkage disequilibrium (r² > 0.8) in genomic regions of approximately 50 kb or less. While linkage disequilibrium decays as a function of genomic distance, genomic regions within genes showed greater linkage disequilibrium and greater variation in linkage disequilibrium compared with intergenic regions. Identification of haplotype blocks could characterize the most common haplotypes. Although maximum haplotype block size was over 1 Mb, mean block size was 26-113 kb by various definitions, which was larger than that observed in humans (approximately 10 kb). Effective population size of the dairy cattle population was estimated from linkage disequilibrium between single nucleotide polymorphism marker pairs in various haplotype ranges. Rapid reduction of effective population size of dairy cattle was inferred from linkage disequilibrium in recent generations. This result implies a loss of genetic diversity because of the high rate of inbreeding and high selection intensity in dairy cattle. The pattern observed in this study indicated linkage disequilibrium in the current dairy cattle population could be exploited to refine mapping resolution. Changes in effective population size during past generations imply a necessity of plans to maintain polymorphism in the Holstein population.
Affiliation(s)
- E-S Kim
- Department of Animal Sciences, University of Wisconsin, Madison, WI 53706, USA
|
45
|
Hutz JE, Kraja AT, McLeod HL, Province MA. CANDID: a flexible method for prioritizing candidate genes for complex human traits. Genet Epidemiol 2009; 32:779-90. [PMID: 18613097 DOI: 10.1002/gepi.20346] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 01/16/2023]
Abstract
Genomewide studies and localized candidate gene approaches have become everyday study designs for identifying polymorphisms in genes that influence complex human traits. Yet, in general, the number of significant findings and the need to focus on smaller regions require a prioritization of genes for further study. Some candidate gene identification algorithms have been proposed in recent years to attempt to streamline this prioritization, but many suffer from limitations imposed by the source data or are difficult to use and understand. CANDID is a prioritization algorithm designed to produce impartial, accurate rankings of candidate genes that influence complex human traits. CANDID can use information from publications, protein domain descriptions, cross-species conservation measures, gene expression profiles and protein-protein interactions in its analysis. Additionally, users may supplement these data sources with results from linkage, association and other studies. CANDID was tested on well-known complex trait genes using data from the Online Mendelian Inheritance in Man database. Additionally, CANDID was evaluated in a modeled gene discovery environment, where it ranked genes whose trait associations were published after CANDID's databases were compiled. In all settings, CANDID exhibited high sensitivity and specificity, indicating an improvement upon previously published algorithms. Its accuracy and ease of use make CANDID a highly useful tool in study design and analysis for complex human traits.
Affiliation(s)
- Janna E Hutz
- Division of Statistical Genomics, Washington University School of Medicine, Saint Louis, Missouri, USA.
|
46
|
Sigurdsson MI, Smith AV, Bjornsson HT, Jonsson JJ. HapMap methylation-associated SNPs, markers of germline DNA methylation, positively correlate with regional levels of human meiotic recombination. Genome Res 2009; 19:581-9. [PMID: 19158364 DOI: 10.1101/gr.086181.108] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.2] [Indexed: 01/22/2023]
Abstract
Inter-individual and regional variability in recombination rates cannot be fully explained by the DNA sequence itself. Epigenetic mechanisms might be one additional factor affecting recombination. A biochemical approach to studying human germline methylation is difficult. We used the density of the 434,198 nonredundant methylation-associated SNPs (mSNPs) in the derived allele HapMap data set as a surrogate marker for germline DNA methylation. We validated our methodology by demonstrating that the mSNP density confirmed known patterns of genomic methylation, including the hypermutability of methylated cytosine and hypomethylation of CpG islands. Using this approach, we found a genome-wide positive correlation between germline methylation and regional recombination rate (500-kb windows: r = 0.622, P < 10⁻¹⁵). This remained significant with multiple correlations correcting for sequence features known to affect recombination, such as GC content and CpG dinucleotides (500-kb windows: r = 0.172, P < 10⁻¹⁵). Using the ENCODE data set for increased resolution, we found a positive correlation between germline DNA methylation and recombination rate (50-kb windows: r = 0.301, P = 0.002). This correlation was further strengthened when corrected for sequence features affecting recombination (50-kb windows: r = 0.445, P < 0.0001). In the Human Epigenome Project data set there was increased DNA methylation in regions within recombination hot spots in male germ cells (0.632 vs. 0.557, P = 0.007). The relationship we observed between germline DNA methylation and recombination could be explained in two ways that are not mutually exclusive: DNA methylation could indicate preferred sites for recombination, or methylation following recombination could inhibit further recombination, perhaps by being part of the enigmatic molecular pathway mediating crossover interference.
Affiliation(s)
- Martin I Sigurdsson
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Iceland, IS-101 Reykjavik, Iceland
|
47
|
Abstract
INTRODUCTION: The primary goal of the International Haplotype Map Project has been to develop a haplotype map of the human genome that describes the common patterns of genetic variation, in order to accelerate the search for the genetic causes of human disease. Within the project, ~3.9 million distinct single-nucleotide polymorphisms (SNPs) have been genotyped in 270 individuals from four worldwide populations. The project data are available for unrestricted public use at the HapMap website. This site, which is the primary portal to genotype data produced by the project, offers bulk downloads of the data set, as well as interactive data browsing and analysis tools that are not available elsewhere. Research into the genetic contributions to a human disease commonly focuses on candidate genes identified from linkage and/or association studies, as well as from pathways suspected to be involved in a particular disease process. In studying candidate genes, a researcher will want to know whether there are any common SNPs in the immediate vicinity, what those SNPs' alleles are, and the relative frequencies of the alleles in the population. The researcher will also be particularly interested in coding SNPs, whose alleles change the amino acid sequence of the gene product and therefore might represent functional variations. This protocol provides details on how to use the genome browser to navigate to and explore HapMap data for a gene or region of interest.
|
48
|
Xing J, Witherspoon DJ, Watkins WS, Zhang Y, Tolpinrud W, Jorde LB. HapMap tagSNP transferability in multiple populations: general guidelines. Genomics 2008; 92:41-51. [PMID: 18482828 DOI: 10.1016/j.ygeno.2008.03.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 02/20/2008] [Revised: 03/26/2008] [Accepted: 03/28/2008] [Indexed: 11/30/2022]
Abstract
Linkage disequilibrium (LD) has received much attention recently because of its value in localizing disease-causing genes. Due to the extensive LD between neighboring loci in the human genome, it is believed that a subset of the single nucleotide polymorphisms in a region (tagSNPs) can be selected to capture most of the remaining SNP variants. In this study, we examined LD patterns and HapMap tagSNP transferability in more than 300 individuals. A South Indian sample and an African Mbuti Pygmy population sample were included to evaluate the performance of HapMap tagSNPs in geographically distinct and genetically isolated populations. Our results show that HapMap tagSNPs selected with r² ≥ 0.8 can capture more than 85% of the SNPs in populations that are from the same continental group. Combined tagSNPs from HapMap CEU and CHB+JPT serve as the best reference for the Indian sample. The HapMap YRI are a sufficient reference for tagSNP selection in the Pygmy sample. In addition to our findings, we reviewed over 25 recent studies of tagSNP transferability and propose a general guideline for selecting tagSNPs from HapMap populations.
Affiliation(s)
- Jinchuan Xing
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
|
49
|
Evaluation of coverage variation of SNP chips for genome-wide association studies. Eur J Hum Genet 2008; 16:635-43. [PMID: 18253166 DOI: 10.1038/sj.ejhg.5202007] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.0] [Indexed: 12/30/2022] Open
Abstract
Genome-wide association (GWA) studies for complex human diseases are now feasible. Many GWA studies rely on commercial SNP chips, for which a common evaluation criterion is global coverage of the genome. Although providing an overall evaluation of an SNP chip, the global coverage does not tell us how the coverage varies across the genome, an important feature that should be taken into consideration, as coverage variation often results in power variation and potentially biased search in subsequent association analysis. To achieve a fuller understanding of SNP chip coverage, we conducted detailed evaluation of coverage, including (1) a map of local coverage - calculated over small consecutive genomic regions and (2) gene coverage - calculated for each known gene in the genome. These evaluations can reveal the degree of variation of each SNP chip in covering the genome and can facilitate SNP chip comparisons at a finer scale.
|
50
|
Narasimhan K, Changqing Z, Choolani M. Ovarian cancer proteomics: Many technologies one goal. Proteomics Clin Appl 2008; 2:195-218. [DOI: 10.1002/prca.200780003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/24/2022]
|