1
Ochayon DE, DeVore SB, Chang WC, Krishnamurthy D, Seelamneni H, Grashel B, Spagna D, Andorf S, Martin LJ, Biagini JM, Waggoner SN, Khurana Hershey GK. Progressive accumulation of hyperinflammatory NKG2D low NK cells in early childhood severe atopic dermatitis. Sci Immunol 2024; 9:eadd3085. [PMID: 38335270 PMCID: PMC11107477 DOI: 10.1126/sciimmunol.add3085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2022] [Accepted: 12/21/2023] [Indexed: 02/12/2024]
Abstract
Atopic dermatitis (AD) is a chronic inflammatory skin disease that often precedes the development of food allergy, asthma, and allergic rhinitis. The prevailing paradigm holds that a reduced frequency and function of natural killer (NK) cells contributes to AD pathogenesis, yet the underlying mechanisms and contributions of NK cells to allergic comorbidities remain ill-defined. Here, analysis of circulating NK cells in a longitudinal early life cohort of children with AD revealed a progressive accumulation of NK cells with low expression of the activating receptor NKG2D, which was linked to more severe AD and sensitivity to allergens. This was most notable in children co-sensitized to food and aeroallergens, a risk factor for development of asthma. Individual-level longitudinal analysis in a subset of children revealed coincident reduction of NKG2D on NK cells with acquired or persistent sensitization, and this was associated with impaired skin barrier function assessed by transepidermal water loss. Low expression of NKG2D on NK cells was paradoxically associated with depressed cytolytic function but exaggerated release of the proinflammatory cytokine tumor necrosis factor-α. These observations provide important insights into a potential mechanism underlying the development of allergic comorbidity in early life in children with AD, which involves altered NK cell functional responses, and define an endotype of severe AD.
Affiliation(s)
- David E. Ochayon
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
- Stanley B. DeVore
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
  - Medical Scientist Training Program, University of Cincinnati College of Medicine
  - Cancer and Cell Biology Program, University of Cincinnati College of Medicine
- Wan-Chi Chang
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
- Durga Krishnamurthy
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center
  - Division of Human Genetics, Cincinnati Children’s Hospital Medical Center
- Harsha Seelamneni
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center
  - Division of Human Genetics, Cincinnati Children’s Hospital Medical Center
- Brittany Grashel
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
- Daniel Spagna
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
- Sandra Andorf
  - Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center
  - Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center
  - Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center
  - Department of Pediatrics, University of Cincinnati College of Medicine
- Lisa J. Martin
  - Division of Human Genetics, Cincinnati Children’s Hospital Medical Center
  - Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center
  - Department of Pediatrics, University of Cincinnati College of Medicine
- Jocelyn M. Biagini
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
  - Department of Pediatrics, University of Cincinnati College of Medicine
- Stephen N. Waggoner
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center
  - Medical Scientist Training Program, University of Cincinnati College of Medicine
  - Division of Human Genetics, Cincinnati Children’s Hospital Medical Center
  - Department of Pediatrics, University of Cincinnati College of Medicine
- Gurjit K. Khurana Hershey
  - Division of Asthma Research, Cincinnati Children’s Hospital Medical Center
  - Medical Scientist Training Program, University of Cincinnati College of Medicine
  - Cancer and Cell Biology Program, University of Cincinnati College of Medicine
  - Department of Pediatrics, University of Cincinnati College of Medicine
2
D'Amico S, Dall'Olio L, Rollo C, Alonso P, Prada-Luengo I, Dall'Olio D, Sala C, Sauta E, Asti G, Lanino L, Maggioni G, Campagna A, Zazzetti E, Delleani M, Bicchieri ME, Morandini P, Savevski V, Arroyo B, Parras J, Zhao LP, Platzbecker U, Diez-Campelo M, Santini V, Fenaux P, Haferlach T, Krogh A, Zazo S, Fariselli P, Sanavia T, Della Porta MG, Castellani G. MOSAIC: An Artificial Intelligence-Based Framework for Multimodal Analysis, Classification, and Personalized Prognostic Assessment in Rare Cancers. JCO Clin Cancer Inform 2024; 8:e2400008. [PMID: 38875514 DOI: 10.1200/cci.24.00008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 03/14/2024] [Accepted: 04/15/2024] [Indexed: 06/16/2024] Open
Abstract
PURPOSE Rare cancers constitute over 20% of human neoplasms, often affecting patients with unmet medical needs. The development of effective classification and prognostication systems is crucial to improve the decision-making process and drive innovative treatment strategies. We have created and implemented MOSAIC, an artificial intelligence (AI)-based framework designed for multimodal analysis, classification, and personalized prognostic assessment in rare cancers. Clinical validation was performed on myelodysplastic syndrome (MDS), a rare hematologic cancer with clinical and genomic heterogeneities.
METHODS We analyzed 4,427 patients with MDS divided into training and validation cohorts. Deep learning methods were applied to integrate and impute clinical/genomic features. Clustering was performed by combining Uniform Manifold Approximation and Projection for Dimension Reduction + Hierarchical Density-Based Spatial Clustering of Applications with Noise (UMAP + HDBSCAN) methods, compared with the conventional Hierarchical Dirichlet Process (HDP). Linear and AI-based nonlinear approaches were compared for survival prediction. Explainable AI (Shapley Additive Explanations approach [SHAP]) and federated learning were used to improve the interpretation and the performance of the clinical models, integrating them into distributed infrastructure.
RESULTS UMAP + HDBSCAN clustering obtained a more granular patient stratification, achieving a higher average silhouette coefficient (0.16) with respect to HDP (0.01) and higher balanced accuracy in cluster classification by Random Forest (92.7% ± 1.3% and 85.8% ± 0.8%). AI methods for survival prediction outperform conventional statistical techniques and the reference prognostic tool for MDS. Nonlinear Gradient Boosting Survival stands out in both the internal (Concordance-Index [C-Index], 0.77; SD, 0.01) and external validation (C-Index, 0.74; SD, 0.02). SHAP analysis revealed that similar features drove patients' subgroups and outcomes in both training and validation cohorts. Federated implementation improved the accuracy of developed models.
CONCLUSION MOSAIC provides an explainable and robust framework to optimize classification and prognostic assessment of rare cancers. AI-based approaches demonstrated superior accuracy in capturing genomic similarities and providing individual prognostic information compared with conventional statistical methods. Its federated implementation ensures broad clinical application, guaranteeing high performance and data protection.
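For readers unfamiliar with the concordance index (C-Index) reported in the abstract above, the following is a minimal sketch of Harrell's C-index for right-censored survival data. It is not the MOSAIC implementation; the function name `concordance_index`, its arguments, and the toy data are illustrative assumptions, and tied observation times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- predicted risk (higher means earlier expected event)

    A pair is comparable when the subject with the shorter observed time
    had an event. It is concordant when that subject also has the higher
    predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): skip tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored: pair not comparable
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.8, 0.1]
    print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported values of 0.77 and 0.74.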
Affiliation(s)
- Saverio D'Amico
  - Humanitas Clinical and Research Center-IRCCS, Milan, Italy
  - Train s.r.l., Milan, Italy
- Cesare Rollo
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Patricia Alonso
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Claudia Sala
  - Experimental, Diagnostic and Specialty Medicine-DIMES, Bologna, Italy
- Gianluca Asti
  - Humanitas Clinical and Research Center-IRCCS, Milan, Italy
- Luca Lanino
  - Humanitas Clinical and Research Center-IRCCS, Milan, Italy
- Elena Zazzetti
  - Humanitas Clinical and Research Center-IRCCS, Milan, Italy
- Borja Arroyo
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Juan Parras
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Lin Pierre Zhao
  - Hematology and Bone Marrow Transplantation, Hôpital Saint-Louis/University Paris 7, Paris, France
- Uwe Platzbecker
  - Medical Clinic and Policlinic 1, Hematology and Cellular Therapy, University Hospital Leipzig, Leipzig, Germany
- Maria Diez-Campelo
  - Hematology Department, Hospital Universitario de Salamanca, Salamanca, Spain
- Valeria Santini
  - Hematology, Azienda Ospedaliero-Universitaria Careggi & University of Florence, Florence, Italy
- Pierre Fenaux
  - Hematology and Bone Marrow Transplantation, Hôpital Saint-Louis/University Paris 7, Paris, France
- Santiago Zazo
  - Department of Signals, Systems and Radiocommunications, Polytechnic University of Madrid, Madrid, Spain
- Piero Fariselli
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Tiziana Sanavia
  - Computational Biomedicine Unit, Department of Medical Sciences, University of Turin, Turin, Italy
- Matteo Giovanni Della Porta
  - Humanitas Clinical and Research Center-IRCCS, Milan, Italy
  - Department of Biomedical Sciences, Humanitas University, Milan, Italy
- Gastone Castellani
  - Department of Physics and Astronomy (DIFA), Bologna, Italy
  - Experimental, Diagnostic and Specialty Medicine-DIMES, Bologna, Italy
3
Diaz-Recio Lorenzo C, Tran Lu Y A, Brunner O, Arbizu PM, Jollivet D, Laurent S, Gollner S. Highly structured populations of copepods at risk to deep-sea mining: Integration of genomic data with demogenetic and biophysical modelling. Mol Ecol 2024; 33:e17340. [PMID: 38605683 DOI: 10.1111/mec.17340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 02/25/2024] [Accepted: 03/04/2024] [Indexed: 04/13/2024]
Abstract
Copepoda is the most abundant taxon in deep-sea hydrothermal vents, where hard substrate is available. Despite the increasing interest in seafloor massive sulphides exploitation, there have been no population genomic studies conducted on vent meiofauna, which are known to contribute over 50% to metazoan biodiversity at vents. To bridge this knowledge gap, restriction-site-associated DNA sequencing, specifically 2b-RADseq, was used to retrieve thousands of genome-wide single-nucleotide polymorphisms (SNPs) from abundant populations of the vent-obligate copepod Stygiopontius lauensis from the Lau Basin. SNPs were used to investigate population structure, demographic histories and genotype-environment associations at a basin scale. Genetic analyses also helped to evaluate the suitability of tailored larval dispersal models and the parameterization of life-history traits that better fit the population patterns observed in the genomic dataset for the target organism. Highly structured populations were observed on both spatial and temporal scales, with divergence of populations between the north, mid, and south of the basin estimated to have occurred after the creation of the major transform fault dividing the Australian and the Niuafo'ou tectonic plate (350 kya), with relatively recent secondary contact events (<20 kya). Larval dispersal models were able to predict the high levels of structure and the highly asymmetric northward low-level gene flow observed in the genomic data. These results differ from most studies conducted on megafauna in the region, elucidating the need to incorporate smaller size when considering site prospecting for deep-sea exploitation of seafloor massive sulphides, and the creation of area-based management tools to protect areas at risk of local extinction, should mining occur.
Affiliation(s)
- Coral Diaz-Recio Lorenzo
  - Adaptation et Diversité en Milieu Marin (AD2M), Station Biologique de Roscoff, Sorbonne Université, CNRS, Roscoff, France
- Adrien Tran Lu Y
  - UMR MARBEC, University of Montpellier, IRD, Ifremer, CNRS, Sète, France
- Otis Brunner
  - Okinawa Institute for Science and Technology, Kunigami-gun, Okinawa, Japan
- Pedro Martínez Arbizu
  - Senckenberg am Meer, German Centre for Marine Biodiversity Research, Wilhelmshaven, Germany
- Didier Jollivet
  - Adaptation et Diversité en Milieu Marin (AD2M), Station Biologique de Roscoff, Sorbonne Université, CNRS, Roscoff, France
- Sabine Gollner
  - NIOZ Royal Netherlands Institute for Sea Research and Utrecht University, 't Horntje (Texel), The Netherlands
  - Utrecht University, Utrecht, The Netherlands
4
Beck JD, Diken M, Suchan M, Streuber M, Diken E, Kolb L, Allnoch L, Vascotto F, Peters D, Beißert T, Akilli-Öztürk Ö, Türeci Ö, Kreiter S, Vormehr M, Sahin U. Long-lasting mRNA-encoded interleukin-2 restores CD8+ T cell neoantigen immunity in MHC class I-deficient cancers. Cancer Cell 2024; 42:568-582.e11. [PMID: 38490213 DOI: 10.1016/j.ccell.2024.02.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 11/29/2023] [Accepted: 02/15/2024] [Indexed: 03/17/2024]
Abstract
Major histocompatibility complex (MHC) class I antigen presentation deficiency is a common cancer immune escape mechanism, but the mechanistic implications and potential strategies to address this challenge remain poorly understood. Studying β2-microglobulin (B2M) deficient mouse tumor models, we find that MHC class I loss leads to a substantial immune desertification of the tumor microenvironment (TME) and broad resistance to immune-, chemo-, and radiotherapy. We show that treatment with long-lasting mRNA-encoded interleukin-2 (IL-2) restores an immune cell infiltrated, IFNγ-promoted, highly proinflammatory TME signature, and when combined with a tumor-targeting monoclonal antibody (mAB), can overcome therapeutic resistance. Unexpectedly, the effectiveness of this treatment is driven by IFNγ-releasing CD8+ T cells that recognize neoantigens cross-presented by TME-resident activated macrophages. These macrophages acquire augmented antigen presentation proficiency and other M1-phenotype-associated features under IL-2 treatment. Our findings highlight the importance of restoring neoantigen-specific immune responses in the treatment of cancers with MHC class I deficiencies.
Affiliation(s)
- Jan D Beck
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Mustafa Diken
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany; BioNTech SE, An der Goldgrube 12, 55131 Mainz, Germany
- Martin Suchan
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Michael Streuber
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Elif Diken
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Laura Kolb
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Lisa Allnoch
  - BioNTech SE, An der Goldgrube 12, 55131 Mainz, Germany
- Fulvia Vascotto
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Daniel Peters
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Tim Beißert
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Özlem Akilli-Öztürk
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany
- Özlem Türeci
  - BioNTech SE, An der Goldgrube 12, 55131 Mainz, Germany
- Sebastian Kreiter
  - TRON gGmbH - Translational Oncology at the University Medical Center of the Johannes Gutenberg University, Freiligrathstr. 12, 55131 Mainz, Germany; BioNTech SE, An der Goldgrube 12, 55131 Mainz, Germany
- Ugur Sahin
  - BioNTech SE, An der Goldgrube 12, 55131 Mainz, Germany
5
Groza C, Schwendinger-Schreck C, Cheung WA, Farrow EG, Thiffault I, Lake J, Rizzo WB, Evrony G, Curran T, Bourque G, Pastinen T. Pangenome graphs improve the analysis of structural variants in rare genetic diseases. Nat Commun 2024; 15:657. [PMID: 38253606 PMCID: PMC10803329 DOI: 10.1038/s41467-024-44980-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 01/10/2024] [Indexed: 01/24/2024] Open
Abstract
Rare DNA alterations that cause heritable diseases are only partially resolvable by clinical next-generation sequencing due to the difficulty of detecting structural variation (SV) in all genomic contexts. Long-read, high fidelity genome sequencing (HiFi-GS) detects SVs with increased sensitivity and enables assembling personal and graph genomes. We leverage standard reference genomes, public assemblies (n = 94) and a large collection of HiFi-GS data from a rare disease program (Genomic Answers for Kids, GA4K, n = 574 assemblies) to build a graph genome representing a unified SV callset in GA4K, identify common variation and prioritize SVs that are more likely to cause genetic disease (MAF < 0.01). Using graphs, we obtain a higher level of reproducibility than the standard reference approach. We observe over 200,000 SV alleles unique to GA4K, including nearly 1000 rare variants that impact coding sequence. With improved specificity for rare SVs, we isolate 30 candidate SVs in phenotypically prioritized genes, including known disease SVs. We isolate a novel diagnostic SV in KMT2E, demonstrating use of personal assemblies coupled with pangenome graphs for rare disease genomics. The community may interrogate our pangenome with additional assemblies to discover new SVs within the allele frequency spectrum relevant to genetic diseases.
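The rare-variant prioritization mentioned in the abstract above (MAF < 0.01) can be illustrated with a short sketch. This is not the authors' pipeline; the function names, the variant identifiers, and the allele total are hypothetical, and the example assumes each variant is summarized by how many sampled alleles carry it.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Return the minor allele frequency given an alternate-allele count."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    if not 0 <= alt_count <= total_alleles:
        raise ValueError("alt_count must lie between 0 and total_alleles")
    alt_freq = alt_count / total_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def rare_variants(allele_counts, total_alleles, threshold=0.01):
    """List variant IDs that are polymorphic and have MAF below threshold.

    allele_counts -- mapping of variant ID to alternate-allele count
    """
    return [
        variant_id
        for variant_id, count in allele_counts.items()
        # Skip monomorphic sites (allele absent or fixed in the sample).
        if 0 < count < total_alleles
        and minor_allele_frequency(count, total_alleles) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical counts over a hypothetical total of 1,000 sampled alleles.
    counts = {"sv_a": 3, "sv_b": 200, "sv_c": 0}
    print(rare_variants(counts, total_alleles=1000))
```

Excluding monomorphic sites is a design choice made for this sketch so that variants absent from the sample are not reported as rare.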
Affiliation(s)
- Cristian Groza
  - Quantitative Life Sciences, McGill University, Montréal, QC, Canada
- Warren A Cheung
  - Genomic Medicine Center, Children's Mercy Hospital and Research Institute, KC, MO, USA
- Emily G Farrow
  - Genomic Medicine Center, Children's Mercy Hospital and Research Institute, KC, MO, USA
- Isabelle Thiffault
  - Genomic Medicine Center, Children's Mercy Hospital and Research Institute, KC, MO, USA
- William B Rizzo
  - Child Health Research Institute, Department of Pediatrics, Nebraska Medical Center, Omaha, NE, USA
- Gilad Evrony
  - Center for Human Genetics and Genomics, Department of Pediatrics, Neuroscience & Physiology, New York University Grossman School of Medicine, New York, NY, USA
- Tom Curran
  - Children's Mercy Research Institute, Kansas City, MO, USA
- Guillaume Bourque
  - Canadian Center for Computational Genomics, McGill University, Montréal, QC, Canada
  - Department of Human Genetics, McGill University, Montréal, QC, Canada
  - Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
  - Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada
- Tomi Pastinen
  - Genomic Medicine Center, Children's Mercy Hospital and Research Institute, KC, MO, USA
6
Fortes-Lima CA, Burgarella C, Hammarén R, Eriksson A, Vicente M, Jolly C, Semo A, Gunnink H, Pacchiarotti S, Mundeke L, Matonda I, Muluwa JK, Coutros P, Nyambe TS, Cikomola JC, Coetzee V, de Castro M, Ebbesen P, Delanghe J, Stoneking M, Barham L, Lombard M, Meyer A, Steyn M, Malmström H, Rocha J, Soodyall H, Pakendorf B, Bostoen K, Schlebusch CM. The genetic legacy of the expansion of Bantu-speaking peoples in Africa. Nature 2024; 625:540-547. [PMID: 38030719 PMCID: PMC10794141 DOI: 10.1038/s41586-023-06770-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 10/20/2023] [Indexed: 12/01/2023]
Abstract
The expansion of people speaking Bantu languages is the most dramatic demographic event in Late Holocene Africa and fundamentally reshaped the linguistic, cultural and biological landscape of the continent [1-7]. With a comprehensive genomic dataset, including newly generated data of modern-day and ancient DNA from previously unsampled regions in Africa, we contribute insights into this expansion that started 6,000-4,000 years ago in western Africa. We genotyped 1,763 participants, including 1,526 Bantu speakers from 147 populations across 14 African countries, and generated whole-genome sequences from 12 Late Iron Age individuals [8]. We show that genetic diversity amongst Bantu-speaking populations declines with distance from western Africa, with current-day Zambia and the Democratic Republic of Congo as possible crossroads of interaction. Using spatially explicit methods [9] and correlating genetic, linguistic and geographical data, we provide cross-disciplinary support for a serial-founder migration model. We further show that Bantu speakers received significant gene flow from local groups in regions they expanded into. Our genetic dataset provides an exhaustive modern-day African comparative dataset for ancient DNA studies [10] and will be important to a wide range of disciplines from science and humanities, as well as to the medical sector studying human genetic variation and health in African and African-descendant populations.
Affiliation(s)
- Cesar A Fortes-Lima
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Concetta Burgarella
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
  - AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Rickard Hammarén
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Anders Eriksson
  - cGEM, Institute of Genomics, University of Tartu, Tartu, Estonia
- Mário Vicente
  - Centre for Palaeogenetics, University of Stockholm, Stockholm, Sweden
  - Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden
- Cecile Jolly
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Armando Semo
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Hilde Gunnink
  - UGent Centre for Bantu Studies (BantUGent), Department of Languages and Cultures, Ghent University, Ghent, Belgium
  - Leiden University Centre for Linguistics, Leiden, the Netherlands
- Sara Pacchiarotti
  - UGent Centre for Bantu Studies (BantUGent), Department of Languages and Cultures, Ghent University, Ghent, Belgium
- Leon Mundeke
  - University of Kinshasa, Kinshasa, Democratic Republic of Congo
- Igor Matonda
  - University of Kinshasa, Kinshasa, Democratic Republic of Congo
- Joseph Koni Muluwa
  - Institut Supérieur Pédagogique de Kikwit, Kikwit, Democratic Republic of Congo
- Peter Coutros
  - UGent Centre for Bantu Studies (BantUGent), Department of Languages and Cultures, Ghent University, Ghent, Belgium
- Vinet Coetzee
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
- Minique de Castro
  - Biotechnology Platform, Agricultural Research Council, Onderstepoort, Pretoria, South Africa
- Peter Ebbesen
  - Department of Health Science and Technology, University of Aalborg, Aalborg, Denmark
- Joris Delanghe
  - Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Mark Stoneking
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  - Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université Lyon 1, CNRS, Villeurbanne, France
- Lawrence Barham
  - Department of Archaeology, Classics & Egyptology, University of Liverpool, Liverpool, UK
- Marlize Lombard
  - Palaeo-Research Institute, University of Johannesburg, Johannesburg, South Africa
- Anja Meyer
  - Human Variation and Identification Research Unit, School of Anatomical Sciences, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Maryna Steyn
  - Human Variation and Identification Research Unit, School of Anatomical Sciences, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Helena Malmström
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
  - Palaeo-Research Institute, University of Johannesburg, Johannesburg, South Africa
- Jorge Rocha
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Vairão, Portugal
  - BIOPOLIS Program in Genomics, Biodiversity and Land Planning, CIBIO, Campus de Vairão, Vairão, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal
- Himla Soodyall
  - Division of Human Genetics, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
  - Academy of Science of South Africa, Pretoria, South Africa
- Koen Bostoen
  - UGent Centre for Bantu Studies (BantUGent), Department of Languages and Cultures, Ghent University, Ghent, Belgium
- Carina M Schlebusch
  - Human Evolution Program, Department of Organismal Biology, Uppsala University, Uppsala, Sweden
  - Palaeo-Research Institute, University of Johannesburg, Johannesburg, South Africa
  - SciLifeLab, Uppsala, Sweden
7
Silcocks M, Farlow A, Hermes A, Tsambos G, Patel HR, Huebner S, Baynam G, Jenkins MR, Vukcevic D, Easteal S, Leslie S. Indigenous Australian genomes show deep structure and rich novel variation. Nature 2023; 624:593-601. [PMID: 38093005 PMCID: PMC10733150 DOI: 10.1038/s41586-023-06831-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/29/2022] [Accepted: 11/03/2023] [Indexed: 12/20/2023]
Abstract
The Indigenous peoples of Australia have a rich linguistic and cultural history. How this relates to genetic diversity remains largely unknown because of their limited engagement with genomic studies. Here we analyse the genomes of 159 individuals from four remote Indigenous communities, including people who speak a language (Tiwi) not from the most widespread family (Pama-Nyungan). This large collection of Indigenous Australian genomes was made possible by careful community engagement and consultation. We observe exceptionally strong population structure across Australia, driven by divergence times between communities of 26,000-35,000 years ago and long-term low but stable effective population sizes. This demographic history, including early divergence from Papua New Guinean (47,000 years ago) and Eurasian groups [1], has generated the highest proportion of previously undescribed genetic variation seen outside Africa and the most extended homozygosity compared with global samples. A substantial proportion of this variation is not observed in global reference panels or clinical datasets, and variation with predicted functional consequence is more likely to be homozygous than in other populations, with consequent implications for medical genomics [2]. Our results show that Indigenous Australians are not a single homogeneous genetic group and their genetic relationship with the peoples of New Guinea is not uniform. These patterns imply that the full breadth of Indigenous Australian genetic diversity remains uncharacterized, potentially limiting genomic medicine and equitable healthcare for Indigenous Australians.
Collapse
Affiliation(s)
- Matthew Silcocks
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- University of Melbourne, School of Biosciences, Parkville, Victoria, Australia
| | - Ashley Farlow
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia
| | - Azure Hermes
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Georgia Tsambos
- University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia
| | - Hardip R Patel
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Sharon Huebner
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Gareth Baynam
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Faculty of Health and Medical Sciences, Division of Paediatrics and Telethon Kids Institute, University of Western Australia, Perth, Western Australia, Australia
- Western Australian Register of Developmental Anomalies, King Edward Memorial Hospital and Rare Care Centre, Perth Children's Hospital, Perth, Western Australia, Australia
| | - Misty R Jenkins
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
- University of Melbourne, Department of Medical Biology, Parkville, Victoria, Australia
- Damjan Vukcevic
- University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia
- Simon Easteal
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- Stephen Leslie
- National Centre for Indigenous Genomics, John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia.
- University of Melbourne, School of Biosciences, Parkville, Victoria, Australia.
- University of Melbourne, School of Mathematics and Statistics, Parkville, Victoria, Australia.
8
Di Pino S, Donkor ED, Sánchez VM, Rodriguez A, Cassone G, Scherlis D, Hassanali A. ZundEig: The Structure of the Proton in Liquid Water from Unsupervised Learning. J Phys Chem B 2023; 127:9822-9832. [PMID: 37930954 DOI: 10.1021/acs.jpcb.3c06078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]
Abstract
The structure of the excess proton in liquid water has been the subject of lively debate on both experimental and theoretical fronts for the last century. Fluctuations of the proton are typically interpreted in terms of limiting states referred to as the Eigen and Zundel species. Here, we put these ideas under the microscope, taking advantage of recent advances in unsupervised learning that use local atomic descriptors to characterize environments of acidic water combined with advanced clustering techniques. Our agnostic approach leads to the observation of only one charged cluster and two neutral ones. We demonstrate that the charged cluster involving the excess proton is best seen as an ionic topological defect in water's hydrogen bond network, forming a single local minimum on the global free-energy landscape. This charged defect is a highly fluxional moiety, where the idealized Eigen and Zundel species are neither limiting configurations nor distinct thermodynamic states. Instead, the ionic defect enhances the presence of neutral water defects through strong interactions with the network. We dub the combination of the charged and neutral defect clusters as ZundEig, demonstrating that the fluctuations between these local environments provide a general framework for rationalizing more descriptive notions of the proton in the existing literature.
Affiliation(s)
- Solana Di Pino
- Departamento de Química Inorgánica, Analítica y Química Física/INQUIMAE, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, C1428EHA Buenos Aires, Argentina
- Edward Danquah Donkor
- International Centre for Theoretical Physics, Strada Costiera 11, 34151 Trieste, Italy
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Veronica M Sánchez
- Departamento de Química Inorgánica, Analítica y Química Física/INQUIMAE, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, C1428EHA Buenos Aires, Argentina
- Alex Rodriguez
- International Centre for Theoretical Physics, Strada Costiera 11, 34151 Trieste, Italy
- Dipartimento di Matematica e Geoscienze, Universitá degli Studi di Trieste, via Alfonso Valerio 12/1, 34127 Trieste, Italy
- Giuseppe Cassone
- Institute for Chemical-Physical Processes, National Research Council (CNR-IPCF), Viale Stagno d'Alcontres 37, 98158 Messina, Italy
- Damian Scherlis
- Departamento de Química Inorgánica, Analítica y Química Física/INQUIMAE, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, C1428EHA Buenos Aires, Argentina
- Ali Hassanali
- International Centre for Theoretical Physics, Strada Costiera 11, 34151 Trieste, Italy
9
Jang J, Kim H, Park SS, Kim M, Min YK, Jeong HO, Kim S, Hwang T, Choi DWY, Kim HJ, Song S, Kim DO, Lee S, Lee CH, Lee JW. Single-cell RNA Sequencing Reveals Novel Cellular Factors for Response to Immunosuppressive Therapy in Aplastic Anemia. Hemasphere 2023; 7:e977. [PMID: 37908861 PMCID: PMC10615405 DOI: 10.1097/hs9.0000000000000977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 09/22/2023] [Indexed: 11/02/2023]
Abstract
Aplastic anemia (AA) is a lethal hematological disorder; however, its pathogenesis is not fully understood. Although immunosuppressive therapy (IST) is a major treatment option for AA, one-third of patients do not respond to IST, and the mechanism of resistance remains elusive. To understand AA pathogenesis and IST resistance, we performed single-cell RNA sequencing (scRNA-seq) of bone marrow (BM) from healthy controls and patients with AA at diagnosis. We found that CD34+ early-stage erythroid precursor cells and PROM1+ hematopoietic stem cells were significantly depleted in AA, which suggests that their depletion might be one of the major mechanisms of AA pathogenesis related to BM-cell hypoplasia. More importantly, we observed significant enrichment of CD8+ T cells and T cell-activating intercellular interactions in IST responders, indicating an association between the expansion and activation of T cells and a positive response to IST in AA. Taken together, our findings represent a valuable resource offering novel insights into the cellular heterogeneity in the BM of AA and reveal potential biomarkers for IST, building the foundation for future precision therapies in AA.
Affiliation(s)
- Jinho Jang
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- Hongtae Kim
- Department of Biological Sciences, UNIST, Ulsan, Republic of Korea
- Sung-Soo Park
- Department of Hematology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Miok Kim
- Therapeutics & Biotechnology Division, Drug Discovery Platform Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon, Republic of Korea
- Yong Ki Min
- Therapeutics & Biotechnology Division, Drug Discovery Platform Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon, Republic of Korea
- Hyoung-oh Jeong
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- Seunghoon Kim
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- Taejoo Hwang
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- David Whee-Young Choi
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- Hee-Je Kim
- Department of Hematology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
- Sukgil Song
- Chungnam National University School of Medicine, Daejeon, Republic of Korea
- Semin Lee
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, Republic of Korea
- Korean Genomics Center, UNIST, Ulsan, Republic of Korea
- Chang Hoon Lee
- Therapeutics & Biotechnology Division, Drug Discovery Platform Research Center, Korea Research Institute of Chemical Technology (KRICT), Daejeon, Republic of Korea
- Korea SCBIO Inc, Daejeon, Republic of Korea
- Jong Wook Lee
- Department of Hematology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Republic of Korea
10
Spence JP, Zeng T, Mostafavi H, Pritchard JK. Scaling the discrete-time Wright-Fisher model to biobank-scale datasets. Genetics 2023; 225:iyad168. [PMID: 37724741 PMCID: PMC10627256 DOI: 10.1093/genetics/iyad168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 06/01/2023] [Accepted: 09/08/2023] [Indexed: 09/21/2023]
Abstract
The discrete-time Wright-Fisher (DTWF) model and its diffusion limit are central to population genetics. These models can describe the forward-in-time evolution of allele frequencies in a population resulting from genetic drift, mutation, and selection. Computing likelihoods under the diffusion process is feasible, but the diffusion approximation breaks down for large samples or in the presence of strong selection. Existing methods for computing likelihoods under the DTWF model do not scale to current exome sequencing sample sizes in the hundreds of thousands. Here, we present a scalable algorithm that approximates the DTWF model with provably bounded error. Our approach relies on two key observations about the DTWF model. The first is that transition probabilities under the model are approximately sparse. The second is that transition distributions for similar starting allele frequencies are extremely close as distributions. Together, these observations enable approximate matrix-vector multiplication in linear (as opposed to the usual quadratic) time. We prove similar properties for Hypergeometric distributions, enabling fast computation of likelihoods for subsamples of the population. We show theoretically and in practice that this approximation is highly accurate and can scale to population sizes in the tens of millions, paving the way for rigorous biobank-scale inference. Finally, we use our results to estimate the impact of larger samples on estimating selection coefficients for loss-of-function variants. We find that increasing sample sizes beyond existing large exome sequencing cohorts will provide essentially no additional information except for genes with the most extreme fitness effects.
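For orientation, the quadratic-time computation that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a haploid population of `n` individuals with genic selection and no mutation, and the function names `dtwf_transition_row` and `step` are hypothetical. Each generation multiplies the current allele-count distribution by the full binomial transition matrix, which costs O(n^2) operations.

```python
from math import comb


def dtwf_transition_row(i, n, s=0.0):
    """Probabilities of moving from i allele copies to j = 0..n copies.

    Assumes a haploid population of size n, relative fitness 1 + s for
    the focal allele, and no mutation (illustrative assumptions only).
    """
    p = i / n
    # Expected frequency after selection, before binomial resampling.
    p_sel = p * (1 + s) / (1 + s * p)
    return [comb(n, j) * p_sel**j * (1 - p_sel) ** (n - j) for j in range(n + 1)]


def step(dist, n, s=0.0):
    """Advance the allele-count distribution by one generation.

    Computes new[j] = sum_i dist[i] * P[i][j] with the dense transition
    matrix, so the work grows quadratically in n.
    """
    new = [0.0] * (n + 1)
    for i, mass in enumerate(dist):
        if mass == 0.0:
            continue  # skip starting states that carry no probability
        for j, p_ij in enumerate(dtwf_transition_row(i, n, s)):
            new[j] += mass * p_ij
    return new


if __name__ == "__main__":
    n = 10
    start = [0.0] * (n + 1)
    start[5] = 1.0  # all probability mass on 5 copies out of 10
    print(step(start, n))
```

The dense loop makes the scaling problem visible: every starting count contributes a full row of `n + 1` probabilities, which is the cost the paper's sparsity and row-similarity observations are meant to avoid.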
Affiliation(s)
- Jeffrey P Spence
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Tony Zeng
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Department of Biology, Stanford University, Stanford, CA 94305, USA
11
Mansouri V, Arjmand B, Hamzeloo-Moghadam M, Rezaei Tavirani M, Razzaghi Z, Ahmadzadeh A, Rezaei M, Robati RM. Collagen Synthesis as a Prominent Process During the Interval between Two Laser Sessions. J Lasers Med Sci 2023; 14:e50. [PMID: 38028873 PMCID: PMC10658108 DOI: 10.34172/jlms.2023.50] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 09/04/2023] [Indexed: 12/01/2023]
Abstract
Introduction: Many people suffer from skin photodamage, especially photoaging. Laser treatment is a widely used therapeutic method for repairing this damage. In the present study, the effectiveness and molecular mechanism of an Er:Glass non-ablative fractional laser on human skin were assessed via bioinformatics and network analysis. Methods: The gene expression profiles of 17 white female forearm skin samples treated with an Er:Glass non-ablative fractional laser, taken before and after laser treatment in two sessions, were extracted from the Gene Expression Omnibus (GEO). Data were evaluated via GEO2R, and the significant differentially expressed genes (DEGs) were assessed via protein-protein interaction (PPI) network analysis. The central nodes were identified and discussed for the compared sets of samples. Results: Five classes of samples were clustered into two categories: first, baseline and 7 and 14 days after the first laser session; and second, 1 day after the first laser session, 29 days after the first laser session, and 1 day after the second laser session. Gross cell functions such as cell division, the cell cycle, and the immune response were highlighted as the early targets of the laser. Collagen synthesis followed the first laser session. Conclusion: The time interval between laser sessions plays a critical role in the effectiveness of laser therapy. The findings indicate that the gross effects of laser application appear within a short time, whereas important processes such as collagen synthesis happen later.
Affiliation(s)
- Vahid Mansouri
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Babak Arjmand
- Cell Therapy and Regenerative Medicine Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Iranian Cancer Control Center (MACSA), Tehran, Iran
- Maryam Hamzeloo-Moghadam
- Traditional Medicine and Materia Medica Research Center, School of Traditional Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mostafa Rezaei Tavirani
- Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Zahra Razzaghi
- Laser Application in Medical Sciences Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Alireza Ahmadzadeh
- Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mitra Rezaei
- Genomic Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Clinical Tuberculosis and Epidemiology Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Reza M Robati
- Skin Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
12
Sohail M, Palma-Martínez MJ, Chong AY, Quinto-Cortés CD, Barberena-Jonas C, Medina-Muñoz SG, Ragsdale A, Delgado-Sánchez G, Cruz-Hervert LP, Ferreyra-Reyes L, Ferreira-Guerrero E, Mongua-Rodríguez N, Canizales-Quintero S, Jimenez-Kaufmann A, Moreno-Macías H, Aguilar-Salinas CA, Auckland K, Cortés A, Acuña-Alonzo V, Gignoux CR, Wojcik GL, Ioannidis AG, Fernández-Valverde SL, Hill AVS, Tusié-Luna MT, Mentzer AJ, Novembre J, García-García L, Moreno-Estrada A. Mexican Biobank advances population and medical genomics of diverse ancestries. Nature 2023; 622:775-783. [PMID: 37821706 PMCID: PMC10600006 DOI: 10.1038/s41586-023-06560-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 05/02/2022] [Accepted: 08/22/2023] [Indexed: 10/13/2023]
Abstract
Latin America continues to be severely underrepresented in genomics research, and fine-scale genetic histories and complex trait architectures remain hidden owing to insufficient data. To fill this gap, the Mexican Biobank project genotyped 6,057 individuals from 898 rural and urban localities across all 32 states in Mexico at a resolution of 1.8 million genome-wide markers with linked complex trait and disease information creating a valuable nationwide genotype-phenotype database. Here, using ancestry deconvolution and inference of identity-by-descent segments, we inferred ancestral population sizes across Mesoamerican regions over time, unravelling Indigenous, colonial and postcolonial demographic dynamics. We observed variation in runs of homozygosity among genomic regions with different ancestries reflecting distinct demographic histories and, in turn, different distributions of rare deleterious variants. We conducted genome-wide association studies (GWAS) for 22 complex traits and found that several traits are better predicted using the Mexican Biobank GWAS compared to the UK Biobank GWAS. We identified genetic and environmental factors associating with trait variation, such as the length of the genome in runs of homozygosity as a predictor for body mass index, triglycerides, glucose and height. This study provides insights into the genetic histories of individuals in Mexico and dissects their complex trait architectures, both crucial for making precision and preventive medicine initiatives accessible worldwide.
Affiliation(s)
- Mashaal Sohail
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico.
- Department of Human Genetics, University of Chicago, Chicago, IL, USA.
- Centro de Ciencias Genómicas (CCG), Universidad Nacional Autónoma de México (UNAM), Cuernavaca, Mexico.
- María J Palma-Martínez
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Amanda Y Chong
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Consuelo D Quinto-Cortés
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Carmina Barberena-Jonas
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Santiago G Medina-Muñoz
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Aaron Ragsdale
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, USA
- Luis Pablo Cruz-Hervert
- Instituto Nacional de Salud Pública (INSP), Cuernavaca, Mexico
- División de Estudios de Posgrado e Investigación, Facultad de Odontología, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
- Andrés Jimenez-Kaufmann
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- Hortensia Moreno-Macías
- Unidad de Biología Molecular y Medicina Genómica, Instituto de Investigaciones Biomédicas UNAM/Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
- Universidad Autónoma Metropolitana, Mexico City, Mexico
- Carlos A Aguilar-Salinas
- Division de Nutrición, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
- Kathryn Auckland
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Adrián Cortés
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Christopher R Gignoux
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Genevieve L Wojcik
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Selene L Fernández-Valverde
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico
- School of Biotechnology and Biomolecular Sciences and the RNA Institute, The University of New South Wales, Sydney, New South Wales, Australia
- Adrian V S Hill
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- The Jenner Institute, University of Oxford, Oxford, UK
- María Teresa Tusié-Luna
- Unidad de Biología Molecular y Medicina Genómica, Instituto de Investigaciones Biomédicas UNAM/Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico
- Alexander J Mentzer
- The Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
- John Novembre
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Andrés Moreno-Estrada
- Unidad de Genómica Avanzada (UGA-LANGEBIO), Centro de Investigación y Estudios Avanzados del IPN (Cinvestav), Irapuato, Mexico.
13
Li Z, Meisner J, Albrechtsen A. Fast and accurate out-of-core PCA framework for large scale biobank data. Genome Res 2023; 33:1599-1608. [PMID: 37620119 PMCID: PMC10620046 DOI: 10.1101/gr.277525.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 08/18/2023] [Indexed: 08/26/2023]
Abstract
Principal component analysis (PCA) is widely used in statistics, machine learning, and genomics for dimensionality reduction and uncovering low-dimensional latent structure. To address the challenges posed by ever-growing data size, fast and memory-efficient PCA methods have gained prominence. In this paper, we propose a novel randomized singular value decomposition (RSVD) algorithm implemented in PCAone, featuring a window-based optimization scheme that enables accelerated convergence while improving the accuracy. Additionally, PCAone incorporates out-of-core and multithreaded implementations for the existing Implicitly Restarted Arnoldi Method (IRAM) and RSVD. Through comprehensive evaluations using multiple large-scale real-world data sets in different fields, we show the advantage of PCAone over existing methods. The new algorithm achieves significantly faster computation time while maintaining accuracy comparable to the slower IRAM method. Notably, our analyses of UK Biobank, comprising around 0.5 million individuals and 6.1 million common single nucleotide polymorphisms, show that PCAone accurately computes the top 40 principal components within 9 h. This analysis effectively captures population structure, signals of selection, structural variants, and low recombination regions, utilizing <20 GB of memory and 20 CPU threads. Furthermore, when applied to single-cell RNA sequencing data featuring 1.3 million cells, PCAone accurately captures the top 40 principal components in 49 min. This performance represents a 10-fold improvement over state-of-the-art tools.
Affiliation(s)
- Zilong Li
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 2200 København, Denmark
- Jonas Meisner
- Biological and Precision Psychiatry, Mental Health Centre Copenhagen, Copenhagen University Hospital, 2100 København, Denmark
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 København, Denmark
- Anders Albrechtsen
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, 2200 København, Denmark
14
Katsumata Y, Fardo DW, Shade LMP, Nelson PT. LATE-NC risk alleles (in TMEM106B, GRN, and ABCC9 genes) among persons with African ancestry. J Neuropathol Exp Neurol 2023; 82:760-768. [PMID: 37528055 PMCID: PMC10440720 DOI: 10.1093/jnen/nlad059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Limbic-predominant age-related TDP-43 encephalopathy (LATE) affects approximately one-third of older individuals and is associated with cognitive impairment. However, there is a highly incomplete understanding of the genetic determinants of LATE neuropathologic changes (LATE-NC) in diverse populations. The defining neuropathologic feature of LATE-NC is TDP-43 proteinopathy, often with comorbid hippocampal sclerosis (HS). In terms of genetic risk factors, LATE-NC and/or HS are associated with single nucleotide variants (SNVs) in 3 genes: TMEM106B (rs1990622), GRN (rs5848), and ABCC9 (rs1914361 and rs701478). We evaluated these 3 genes in convenience samples of individuals of African ancestry. The allele frequencies of the LATE-associated alleles were significantly different between persons of primarily African (versus European) ancestry: In persons of African ancestry, the risk-associated alleles for TMEM106B and ABCC9 were less frequent, whereas the risk allele in GRN was more frequent. We performed an exploratory analysis of data from African-American subjects processed by the Alzheimer's Disease Genomics Consortium, with a subset of African-American participants (n = 166) having corroborating neuropathologic data through the National Alzheimer's Coordinating Center (NACC). In this limited-size sample, the ABCC9/rs1914361 SNV was associated with HS pathology. More work is required concerning the genetic factors influencing non-Alzheimer disease pathology such as LATE-NC in diverse cohorts.
Affiliation(s)
- Yuriko Katsumata
- University of Kentucky Sanders-Brown Center on Aging, Lexington, Kentucky, USA
- University of Kentucky Department of Biostatistics, Lexington, Kentucky, USA
- David W Fardo
- University of Kentucky Sanders-Brown Center on Aging, Lexington, Kentucky, USA
- University of Kentucky Department of Biostatistics, Lexington, Kentucky, USA
- Lincoln M P Shade
- University of Kentucky Department of Biostatistics, Lexington, Kentucky, USA
- Peter T Nelson
- University of Kentucky Sanders-Brown Center on Aging, Lexington, Kentucky, USA
- University of Kentucky Department of Pathology and Laboratory Medicine, Lexington, Kentucky, USA
15
Abstract
Following the widespread use of deep learning for genomics, deep generative modeling is also becoming a viable methodology for the broad field. Deep generative models (DGMs) can learn the complex structure of genomic data and allow researchers to generate novel genomic instances that retain the real characteristics of the original dataset. Aside from data generation, DGMs can also be used for dimensionality reduction by mapping the data space to a latent space, as well as for prediction tasks via exploitation of this learned mapping or supervised/semi-supervised DGM designs. In this review, we briefly introduce generative modeling and two currently prevailing architectures, we present conceptual applications along with notable examples in functional and evolutionary genomics, and we provide our perspective on potential challenges and future directions.
Affiliation(s)
- Burak Yelmen
- Laboratoire Interdisciplinaire des Sciences du Numérique, CNRS UMR 9015, INRIA, Université Paris-Saclay, Orsay, France
- Institute of Genomics, University of Tartu, Tartu, Estonia
- Flora Jay
- Laboratoire Interdisciplinaire des Sciences du Numérique, CNRS UMR 9015, INRIA, Université Paris-Saclay, Orsay, France
16
Henn D, Zhao D, Sivaraj D, Trotsyuk A, Bonham CA, Fischer KS, Kehl T, Fehlmann T, Greco AH, Kussie HC, Moortgat Illouz SE, Padmanabhan J, Barrera JA, Kneser U, Lenhof HP, Januszyk M, Levi B, Keller A, Longaker MT, Chen K, Qi LS, Gurtner GC. Cas9-mediated knockout of Ndrg2 enhances the regenerative potential of dendritic cells for wound healing. Nat Commun 2023; 14:4729. [PMID: 37550295 PMCID: PMC10406832 DOI: 10.1038/s41467-023-40519-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/21/2022] [Accepted: 07/26/2023] [Indexed: 08/09/2023]
Abstract
Chronic wounds impose a significant healthcare burden to a broad patient population. Cell-based therapies, while having shown benefits for the treatment of chronic wounds, have not yet achieved widespread adoption into clinical practice. We developed a CRISPR/Cas9 approach to precisely edit murine dendritic cells to enhance their therapeutic potential for healing chronic wounds. Using single-cell RNA sequencing of tolerogenic dendritic cells, we identified N-myc downregulated gene 2 (Ndrg2), which marks a specific population of dendritic cell progenitors, as a promising target for CRISPR knockout. Ndrg2-knockout alters the transcriptomic profile of dendritic cells and preserves an immature cell state with a strong pro-angiogenic and regenerative capacity. We then incorporated our CRISPR-based cell engineering within a therapeutic hydrogel for in vivo cell delivery and developed an effective translational approach for dendritic cell-based immunotherapy that accelerated healing of full-thickness wounds in both non-diabetic and diabetic mouse models. These findings could open the door to future clinical trials using safe gene editing in dendritic cells for treating various types of chronic wounds.
Affiliation(s)
- Dominic Henn
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Plastic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Surgery, University of Arizona, Tucson, AZ, USA
- Dehua Zhao
- Department of Bioengineering, Sarafan ChEM-H, Stanford University, Stanford, CA, USA
- Dharshan Sivaraj
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Surgery, University of Arizona, Tucson, AZ, USA
- Artem Trotsyuk
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Surgery, University of Arizona, Tucson, AZ, USA
- Clark Andrew Bonham
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Katharina S Fischer
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Surgery, University of Arizona, Tucson, AZ, USA
- Tim Kehl
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, Saarbrücken, Germany
- Tobias Fehlmann
- Chair for Clinical Bioinformatics, Saarland University, Saarbruecken, Germany
- Autumn H Greco
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Hudson C Kussie
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Burn, Trauma, Acute and Critical Care Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Sylvia E Moortgat Illouz
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Jagannath Padmanabhan
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Janos A Barrera
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
| | - Ulrich Kneser
- Department of Hand, Plastic, and Reconstructive Surgery, BG Trauma Center Ludwigshafen, Ruprecht-Karls-University of Heidelberg, Heidelberg, Germany
| | - Hans-Peter Lenhof
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, Saarbrücken, Germany
| | - Michael Januszyk
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
| | - Benjamin Levi
- Department of Burn, Trauma, Acute and Critical Care Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Andreas Keller
- Center for Bioinformatics, Saarland Informatics Campus, Saarland University, Saarbrücken, Germany
| | - Michael T Longaker
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
| | - Kellen Chen
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA
- Department of Surgery, University of Arizona, Tucson, AZ, USA
| | - Lei S Qi
- Department of Bioengineering, Sarafan ChEM-H, Stanford University, Stanford, CA, USA.
- Chan Zuckerberg Biohub - San Francisco, San Francisco, CA, USA.
| | - Geoffrey C Gurtner
- Hagey Laboratory for Pediatric Regenerative Medicine, Division of Plastic and Reconstructive Surgery, Stanford University, Stanford, CA, USA.
- Department of Surgery, University of Arizona, Tucson, AZ, USA.
|
17
|
Moon J, Posada-Quintero HF, Chon KH. Genetic data visualization using literature text-based neural networks: Examples associated with myocardial infarction. Neural Netw 2023; 165:562-595. [PMID: 37364469 DOI: 10.1016/j.neunet.2023.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 04/11/2023] [Accepted: 05/09/2023] [Indexed: 06/28/2023]
Abstract
Data visualization is critical to unraveling hidden information from complex and high-dimensional data. Interpretable visualization methods are critical, especially in the biology and medical fields; however, effective visualization methods for large genetic data are limited. Current visualization methods are limited to lower-dimensional data, and their performance suffers if there is missing data. In this study, we propose a literature-based visualization method to reduce high-dimensional data without compromising the dynamics of the single nucleotide polymorphisms (SNP) and textual interpretability. Our method is innovative because it is shown to (1) preserve both global and local structures of SNP while reducing the dimension of the data using literature text representations, and (2) enable interpretable visualizations using textual information. For performance evaluations, we applied the proposed approach to several classification tasks, including race, myocardial infarction event age group, and sex, using several machine learning models on the literature-derived SNP data. We used visualization approaches to examine clustering of data as well as quantitative performance metrics for the classification of the risk factors examined above. Our method outperformed all popular dimensionality reduction and visualization methods for both classification and visualization, and it is robust against missing and higher-dimensional data. Moreover, we found it feasible to incorporate both genetic and other risk information obtained from literature with our method.
Affiliation(s)
- Jihye Moon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
| | | | - Ki H Chon
- Department of Biomedical Engineering, University of Connecticut, Storrs, CT 06269, USA.
|
18
|
Schreiner W, Karch R, Cibena M, Tomasiak L, Kenn M, Pfeiler G. Clustering molecular dynamics conformations of the CC'-loop of the PD-1 immuno-checkpoint receptor. Comput Struct Biotechnol J 2023; 21:3920-3932. [PMID: 37602229 PMCID: PMC10432919 DOI: 10.1016/j.csbj.2023.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/16/2023] [Accepted: 07/03/2023] [Indexed: 08/22/2023] Open
Abstract
Molecular mechanisms within the checkpoint receptor PD-1 are essential for its activation by PD-L1 as well as for blocking such an activation via checkpoint inhibitors. We use molecular dynamics to scrutinize patterns of atomic motion in PD-1 without a ligand. Molecular dynamics is performed for the whole extracellular domain of PD-1, and the analysis focuses on its CC'-loop and some adjacent Cα-atoms. We extend previous work by applying common nearest neighbor clustering (Cnn) and compare the performance of this method with Daura clustering as well as UMAP dimension reduction and subsequent agglomerative linkage clustering. As compared to Daura clustering, we found Cnn less sensitive to cutoff selection and better able to return representative clusters for sets of different 3D atomic conformations. Interestingly, Cnn yields results quite similar to UMAP plus linkage clustering.
Affiliation(s)
- Wolfgang Schreiner
- Medical University of Vienna, Center for Medical Data Science, Spitalgasse 23, A-1090, Vienna, Austria
| | - Rudolf Karch
- Medical University of Vienna, Center for Medical Data Science, Spitalgasse 23, A-1090, Vienna, Austria
| | - Michael Cibena
- Medical University of Vienna, Center for Medical Data Science, Spitalgasse 23, A-1090, Vienna, Austria
| | - Lisa Tomasiak
- Medical University of Vienna, Center for Medical Data Science, Spitalgasse 23, A-1090, Vienna, Austria
| | - Michael Kenn
- Medical University of Vienna, Center for Medical Data Science, Spitalgasse 23, A-1090, Vienna, Austria
| | - Georg Pfeiler
- Medical University of Vienna, Department of Obstetrics and Gynecology, Division of General Gynecology and Gynecologic Oncology, Währinger Gürtel 18-20, A-1090, Vienna, Austria
|
19
|
Gonzalez-Castillo J, Fernandez IS, Lam KC, Handwerker DA, Pereira F, Bandettini PA. Manifold learning for fMRI time-varying functional connectivity. Front Hum Neurosci 2023; 17:1134012. [PMID: 37497043 PMCID: PMC10366614 DOI: 10.3389/fnhum.2023.1134012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Accepted: 06/21/2023] [Indexed: 07/28/2023] Open
Abstract
Whole-brain functional connectivity (FC) measured with functional MRI (fMRI) evolves over time in meaningful ways at temporal scales ranging from years (e.g., development) to seconds [e.g., within-scan time-varying FC (tvFC)]. Yet our ability to explore tvFC is severely constrained by its large dimensionality (several thousand dimensions). To overcome this difficulty, researchers often seek to generate low-dimensional representations (e.g., 2D and 3D scatter plots), hoping those will retain important aspects of the data (e.g., relationships to behavior and disease progression). Limited prior empirical work suggests that manifold learning techniques (MLTs), namely those seeking to infer a low-dimensional non-linear surface (i.e., the manifold) where most of the data lie, are good candidates for accomplishing this task. Here we explore this possibility in detail. First, we discuss why one should expect tvFC data to lie on a low-dimensional manifold. Second, we estimate the intrinsic dimension (ID; i.e., minimum number of latent dimensions) of tvFC data manifolds. Third, we describe the inner workings of three state-of-the-art MLTs: Laplacian Eigenmaps (LEs), T-distributed Stochastic Neighbor Embedding (T-SNE), and Uniform Manifold Approximation and Projection (UMAP). For each method, we empirically evaluate its ability to generate neuro-biologically meaningful representations of tvFC data, as well as its robustness against hyper-parameter selection. Our results show that tvFC data has an ID that ranges between 4 and 26, and that ID varies significantly between rest and task states. We also show how all three methods can effectively capture subject identity and task being performed: UMAP and T-SNE can capture these two levels of detail concurrently, but LE could only capture one at a time. We observed substantial variability in embedding quality across MLTs, and within-MLT as a function of hyper-parameter selection. To help alleviate this issue, we provide heuristics that can inform future studies. Finally, we also demonstrate the importance of feature normalization when combining data across subjects and the role that temporal autocorrelation plays in the application of MLTs to tvFC data. Overall, we conclude that while MLTs can be useful to generate summary views of labeled tvFC data, their application to unlabeled data such as resting-state remains challenging.
Affiliation(s)
- Javier Gonzalez-Castillo
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Isabel S. Fernandez
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Ka Chun Lam
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD, United States
| | - Daniel A. Handwerker
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Francisco Pereira
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD, United States
| | - Peter A. Bandettini
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
- Functional Magnetic Resonance Imaging (FMRI) Core, National Institute of Mental Health, Bethesda, MD, United States
|
20
|
Wang J, Adrianto I, Subedi K, Liu T, Wu X, Yi Q, Loveless I, Yin C, Datta I, Sant'Angelo DB, Kronenberg M, Zhou L, Mi QS. Integrative scATAC-seq and scRNA-seq analyses map thymic iNKT cell development and identify Cbfβ for its commitment. Cell Discov 2023; 9:61. [PMID: 37336875 DOI: 10.1038/s41421-023-00547-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/27/2022] [Accepted: 03/18/2023] [Indexed: 06/21/2023] Open
Abstract
Unlike conventional αβT cells, invariant natural killer T (iNKT) cells complete their terminal differentiation to functional iNKT1/2/17 cells in the thymus. However, the underlying molecular programs that guide iNKT subset differentiation remain unclear. Here, we profiled the transcriptomes of over 17,000 iNKT cells and the chromatin accessibility states of over 39,000 iNKT cells across four thymic iNKT developmental stages using single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) to define their developmental trajectories. Our study discovered novel features for iNKT precursors and different iNKT subsets and indicated that iNKT2 and iNKT17 lineage commitment may occur as early as stage 0 (ST0) by two distinct programs, while iNKT1 commitment may occur post ST0. Both iNKT1 and iNKT2 cells exhibit extensive phenotypic and functional heterogeneity, while iNKT17 cells are relatively homogeneous. Furthermore, we identified that a novel transcription factor, Cbfβ, was highly expressed at the iNKT progenitor commitment checkpoint, showed an expression trajectory similar to that of other known transcription factors for iNKT cell development, Zbtb16 and Egr2, and could direct iNKT cell fate and drive effector phenotype differentiation. Conditional deletion of Cbfβ blocked early iNKT cell development and led to severe impairment of iNKT1/2/17 cell differentiation. Overall, our findings uncovered distinct iNKT developmental programs as well as their cellular heterogeneity, and identified a novel transcription factor, Cbfβ, as a key regulator for early iNKT cell commitment.
Affiliation(s)
- Jie Wang
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Indra Adrianto
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
- Center for Bioinformatics, Department of Public Health Sciences, Henry Ford Health, Detroit, MI, USA
- Department of Medicine, College of Human Medicine, Michigan State University, East Lansing, MI, USA
| | - Kalpana Subedi
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Tingting Liu
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Xiaojun Wu
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Qijun Yi
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Ian Loveless
- Center for Bioinformatics, Department of Public Health Sciences, Henry Ford Health, Detroit, MI, USA
| | - Congcong Yin
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA
| | - Indrani Datta
- Center for Bioinformatics, Department of Public Health Sciences, Henry Ford Health, Detroit, MI, USA
| | - Derek B Sant'Angelo
- Child Health Institute of New Jersey, Rutgers Robert Wood Johnson Medical School, New Brunswick, NJ, USA
| | | | - Li Zhou
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA.
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA.
- Department of Medicine, College of Human Medicine, Michigan State University, East Lansing, MI, USA.
- Department of Internal Medicine, Henry Ford Health, Detroit, MI, USA.
| | - Qing-Sheng Mi
- Center for Cutaneous Biology and Immunology Research, Department of Dermatology, Henry Ford Health, Detroit, MI, USA.
- Immunology Research Program, Henry Ford Cancer Institute, Henry Ford Health, Detroit, MI, USA.
- Department of Medicine, College of Human Medicine, Michigan State University, East Lansing, MI, USA.
- Department of Internal Medicine, Henry Ford Health, Detroit, MI, USA.
|
21
|
Wall JD, Sathirapongsasuti JF, Gupta R, Rasheed A, Venkatesan R, Belsare S, Menon R, Phalke S, Mittal A, Fang J, Tanneeru D, Deshmukh M, Bassi A, Robinson J, Chaudhary R, Murugan S, Ul-Asar Z, Saleem I, Ishtiaq U, Fatima A, Sheikh SS, Hameed S, Ishaq M, Rasheed SZ, Memon FUR, Jalal A, Abbas S, Frossard P, Fuchsberger C, Forer L, Schoenherr S, Bei Q, Bhangale T, Tom J, Gadde SGK, B V P, Naik NK, Wang M, Kwok PY, Khera AV, Lakshmi BR, Butterworth AS, Chowdhury R, Danesh J, di Angelantonio E, Naheed A, Goyal V, Kandadai RM, Kumar H, Borgohain R, Mukherjee A, Wadia PM, Yadav R, Desai S, Kumar N, Biswas A, Pal PK, Muthane UB, Das SK, Ramprasad VL, Kukkle PL, Seshagiri S, Kathiresan S, Ghosh A, Mohan V, Saleheen D, Stawiski EW, Peterson AS. South Asian medical cohorts reveal strong founder effects and high rates of homozygosity. Nat Commun 2023; 14:3377. [PMID: 37291107 PMCID: PMC10250394 DOI: 10.1038/s41467-023-38766-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/08/2021] [Accepted: 05/15/2023] [Indexed: 06/10/2023] Open
Abstract
The benefits of large-scale genetic studies for healthcare of the populations studied are well documented, but these genetic studies have traditionally ignored people from some parts of the world, such as South Asia. Here we describe whole genome sequence (WGS) data from 4806 individuals recruited from the healthcare delivery systems of Pakistan, India and Bangladesh, combined with WGS from 927 individuals from isolated South Asian populations. We characterize population structure in South Asia and describe a genotyping array (SARGAM) and imputation reference panel that are optimized for South Asian genomes. We find evidence for high rates of reproductive isolation, endogamy and consanguinity that vary across the subcontinent and that lead to levels of rare homozygotes that reach 100 times that seen in outbred populations. Founder effects increase the power to associate functional variants with disease processes and make South Asia a uniquely powerful place for population-scale genetic studies.
Affiliation(s)
- Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, CA, 94143, USA.
- Dept of Ornithology and Mammalogy, California Academy of Sciences, San Francisco, CA, 94118, USA.
| | - J Fah Sathirapongsasuti
- MedGenome Inc., Foster City, CA, 94404, USA
- GenomeAsia 100K Foundation, Foster City, CA, 94404, USA
| | - Ravi Gupta
- MedGenome Labs Pvt. Ltd., Bengaluru, Karnataka, 560099, India
| | - Asif Rasheed
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | - Radha Venkatesan
- Madras Diabetes Research Foundation and Dr. Mohan's Diabetes Specialties Centre, Chennai, Tamil Nadu, 600086, India
| | - Saurabh Belsare
- Institute for Human Genetics, University of California, San Francisco, CA, 94143, USA
| | - Ramesh Menon
- MedGenome Labs Pvt. Ltd., Bengaluru, Karnataka, 560099, India
| | - Sameer Phalke
- MedGenome Labs Pvt. Ltd., Bengaluru, Karnataka, 560099, India
| | | | - John Fang
- Thermo Fisher Scientific, Santa Clara, CA, 95051, USA
| | - Deepak Tanneeru
- MedGenome Labs Pvt. Ltd., Bengaluru, Karnataka, 560099, India
| | | | - Akshi Bassi
- MedGenome Labs Pvt. Ltd., Bengaluru, Karnataka, 560099, India
| | - Jacqueline Robinson
- Institute for Human Genetics, University of California, San Francisco, CA, 94143, USA
| | | | | | - Zameer Ul-Asar
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | - Imran Saleem
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | - Unzila Ishtiaq
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | - Areej Fatima
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | | | | | | | | | | | - Anjum Jalal
- Faisalabad Institute of Cardiology, Faisalabad, Pakistan
| | - Shahid Abbas
- Faisalabad Institute of Cardiology, Faisalabad, Pakistan
| | - Philippe Frossard
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
| | - Christian Fuchsberger
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Institute for Biomedicine, Eurac Research, Bolzano, Italy
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Lukas Forer
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Sebastian Schoenherr
- Institute of Genetic Epidemiology, Department of Genetics and Pharmacology, Medical University of Innsbruck, Innsbruck, Austria
| | - Qixin Bei
- Department of Molecular Biology, Genentech, South San Francisco, CA, 94080, USA
| | - Tushar Bhangale
- Department of Human Genetics, Genentech, South San Francisco, CA, 94080, USA
| | - Jennifer Tom
- Product Development Data Sciences, Genentech, South San Francisco, CA, 94080, USA
| | | | - Priya B V
- Narayana Nethralaya Foundation, Bengaluru, Karnataka, 560010, India
| | | | - Minxian Wang
- Program in Medical and Population Genetics & Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Pui-Yan Kwok
- Institute for Human Genetics, University of California, San Francisco, CA, 94143, USA
- Cardiovascular Research Institute and Department of Dermatology, University of California San Francisco, San Francisco, CA, 94143, USA
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Amit V Khera
- Harvard Medical School, Boston, MA, 02115, USA
- Division of Cardiology, Department of Medicine, Brigham and Women's Hospital, Boston, MA, 02115, USA
- Verve Therapeutics, Cambridge, MA, 02139, USA
| | - B R Lakshmi
- MDCRC, Royal Care Super Speciality Hospital 1/520, Neelambur, Coimbatore, Tamil Nadu, 641062, India
| | - Adam S Butterworth
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, University of Cambridge, Cambridge, UK
- National Institute for Health Research Cambridge Biomedical Research Centre, University of Cambridge and Cambridge University Hospitals, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Rajiv Chowdhury
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - John Danesh
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, University of Cambridge, Cambridge, UK
- National Institute for Health Research Cambridge Biomedical Research Centre, University of Cambridge and Cambridge University Hospitals, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, UK
| | - Emanuele di Angelantonio
- British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
- National Institute for Health Research Blood and Transplant Research Unit in Donor Health and Genomics, University of Cambridge, Cambridge, UK
- National Institute for Health Research Cambridge Biomedical Research Centre, University of Cambridge and Cambridge University Hospitals, Cambridge, UK
- Health Data Research UK Cambridge, Wellcome Genome Campus and University of Cambridge, Cambridge, UK
| | - Aliya Naheed
- Initiative for Non Communicable Diseases, Health Systems and Population Studies Division, icddr,b, Dhaka, Bangladesh
| | - Vinay Goyal
- All India Institute of Medical Sciences (AIIMS), New Delhi, India
- Medanta Hospital, New Delhi, India
- Medanta, The Medicity, Gurgaon, India
| | | | | | - Rupam Borgohain
- Nizams Institute of Medical Sciences (NIMS), Hyderabad, India
| | - Adreesh Mukherjee
- Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, India
| | | | - Ravi Yadav
- National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, India
| | - Soaham Desai
- Shree Krishna Hospital and Pramukhaswami Medical College, Bhaikaka University, Karamsad, Gujarat, India
| | - Niraj Kumar
- All India Institute of Medical Sciences, Rishikesh, India
| | - Atanu Biswas
- Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, India
| | - Pramod Kumar Pal
- National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, India
| | - Uday B Muthane
- Parkinson and Ageing Research Foundation, Bengaluru, India
| | - Shymal K Das
- Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, India
| | | | - Prashanth L Kukkle
- All India Institute of Medical Sciences, Rishikesh, India
- Manipal Hospital, Miller Road, Bengaluru, India
- Parkinson's Disease and Movement Disorders Clinic, Bengaluru, India
| | - Somasekar Seshagiri
- GenomeAsia 100K Foundation, Foster City, CA, 94404, USA
- Department of Molecular Biology, Genentech, South San Francisco, CA, 94080, USA
| | - Sekar Kathiresan
- Program in Medical and Population Genetics & Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Verve Therapeutics, Cambridge, MA, 02139, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Arkasubhra Ghosh
- Narayana Nethralaya Foundation, Bengaluru, Karnataka, 560010, India
| | - V Mohan
- Madras Diabetes Research Foundation and Dr. Mohan's Diabetes Specialties Centre, Chennai, Tamil Nadu, 600086, India
| | - Danish Saleheen
- Center for Non-Communicable Disease, Karachi, Karachi City, Sindh, 75300, Pakistan
- Seymour, Paul and Gloria Milstein Division of Cardiology at Columbia University, New York, NY, 10032, USA
| | - Eric W Stawiski
- MedGenome Inc., Foster City, CA, 94404, USA
- GenomeAsia 100K Foundation, Foster City, CA, 94404, USA
- Department of Molecular Biology, Genentech, South San Francisco, CA, 94080, USA
- Caribou Biosciences, Berkeley, CA, 94710, USA
| | - Andrew S Peterson
- MedGenome Inc., Foster City, CA, 94404, USA.
- GenomeAsia 100K Foundation, Foster City, CA, 94404, USA.
- Department of Molecular Biology, Genentech, South San Francisco, CA, 94080, USA.
- Broadwing Bio, South San Francisco, CA, 94080, USA.
|
22
|
Anderson-Trocmé L, Nelson D, Zabad S, Diaz-Papkovich A, Kryukov I, Baya N, Touvier M, Jeffery B, Dina C, Vézina H, Kelleher J, Gravel S. On the genes, genealogies, and geographies of Quebec. Science 2023; 380:849-855. [PMID: 37228217 DOI: 10.1126/science.add5300] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2022] [Accepted: 04/24/2023] [Indexed: 05/27/2023]
Abstract
Population genetic models only provide coarse representations of real-world ancestry. We used a pedigree compiled from 4 million parish records and genotype data from 2276 French and 20,451 French Canadian individuals to finely model and trace French Canadian ancestry through space and time. The loss of ancestral French population structure and the appearance of spatial and regional structure highlight a wide range of population expansion models. Geographic features shaped migrations, and we find enrichments for migration, genetic, and genealogical relatedness patterns within river networks across regions of Quebec. Finally, we provide a freely accessible simulated whole-genome sequence dataset with spatiotemporal metadata for 1,426,749 individuals reflecting intricate French Canadian population structure. Such realistic population-scale simulations provide opportunities to investigate population genetics at an unprecedented resolution.
Affiliation(s)
- Luke Anderson-Trocmé
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University Genome Centre, Montreal, QC, Canada
| | - Dominic Nelson
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University Genome Centre, Montreal, QC, Canada
| | - Shadi Zabad
- School of Computer Science, McGill University, Montreal, QC, Canada
| | - Alex Diaz-Papkovich
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Quantitative Life Sciences, McGill University, Montreal, QC, Canada
| | - Ivan Kryukov
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University Genome Centre, Montreal, QC, Canada
| | - Nikolas Baya
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
| | - Mathilde Touvier
- Sorbonne Paris Nord University, INSERM U1153, INRAE U1125, CNAM, Nutritional Epidemiology Research Team (EREN), Epidemiology and Statistics Research Center, University Paris Cité (CRESS), Bobigny, France
| | - Ben Jeffery
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
| | - Christian Dina
- Nantes Université, CNRS, INSERM, l'institut du thorax, Nantes, France
| | - Hélène Vézina
- BALSAC Project, Université du Québec à Chicoutimi, Chicoutimi, QC, Canada
| | - Jerome Kelleher
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
| | - Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- McGill University Genome Centre, Montreal, QC, Canada
|
23
|
Cotter DJ, Hofgard EF, Novembre J, Szpiech ZA, Rosenberg NA. A rarefaction approach for measuring population differences in rare and common variation. Genetics 2023; 224:iyad070. [PMID: 37075098 PMCID: PMC10213490 DOI: 10.1093/genetics/iyad070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 12/20/2022] [Accepted: 04/07/2023] [Indexed: 04/20/2023] Open
Abstract
In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as "rare," with nonzero frequency less than or equal to a specified threshold, "common," with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating "rare" and "common" corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations.
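As a rough illustration of the rarefaction idea described in this abstract, the sketch below computes, for a single allelic type, the probability that it is unobserved, rare, or common in a random subsample drawn without replacement from a larger sample. The hypergeometric formulation, the function name `rarefied_class_probs`, and all parameter names are assumptions made for illustration; they are not taken from the paper.

```python
from math import comb


def rarefied_class_probs(copies, sample_size, subsample_size, rare_max):
    """Return (P(unobserved), P(rare), P(common)) for one allelic type.

    copies         -- observed copies of the allelic type in the full sample
    sample_size    -- total number of sampled copies at the locus
    subsample_size -- standardized subsample size used for all populations
    rare_max       -- largest copy count still classified as "rare"

    Assumption: the subsample is drawn uniformly without replacement, so the
    copy count in the subsample follows a hypergeometric distribution.
    """
    if not 0 <= copies <= sample_size:
        raise ValueError("copies must lie between 0 and sample_size")
    if not 0 < subsample_size <= sample_size:
        raise ValueError("subsample_size must lie between 1 and sample_size")

    total = comb(sample_size, subsample_size)

    def p_count(j):
        # Probability of seeing exactly j copies in the subsample.
        return comb(copies, j) * comb(sample_size - copies, subsample_size - j) / total

    p_unobserved = p_count(0)
    upper = min(rare_max, copies, subsample_size)
    p_rare = sum(p_count(j) for j in range(1, upper + 1))
    # Clamp tiny negative values caused by floating-point rounding.
    p_common = max(1.0 - p_unobserved - p_rare, 0.0)
    return p_unobserved, p_rare, p_common
```

Summing these probabilities over allelic types, with the same `subsample_size` for every population, would give expected class counts that are comparable across samples of different sizes.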
Affiliation(s)
- Daniel J Cotter
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Elyssa F Hofgard
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA
| | - John Novembre
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Zachary A Szpiech
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, CA 94305, USA
| |
|
24
|
Spence JP, Zeng T, Mostafavi H, Pritchard JK. Scaling the Discrete-time Wright Fisher model to biobank-scale datasets. bioRxiv 2023:2023.05.19.541517. [PMID: 37293115 PMCID: PMC10245735 DOI: 10.1101/2023.05.19.541517] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
The Discrete-Time Wright Fisher (DTWF) model and its large population diffusion limit are central to population genetics. These models describe the forward-in-time evolution of the frequency of an allele in a population and can include the fundamental forces of genetic drift, mutation, and selection. Computing likelihoods under the diffusion process is feasible, but the diffusion approximation breaks down for large sample sizes or in the presence of strong selection. Unfortunately, existing methods for computing likelihoods under the DTWF model do not scale to current exome sequencing sample sizes in the hundreds of thousands. Here we present an algorithm that approximates the DTWF model with provably bounded error and runs in time linear in the size of the population. Our approach relies on two key observations about Binomial distributions. The first is that Binomial distributions are approximately sparse. The second is that Binomial distributions with similar success probabilities are extremely close as distributions, allowing us to approximate the DTWF Markov transition matrix as a very low rank matrix. Together, these observations enable matrix-vector multiplication in linear (as opposed to the usual quadratic) time. We prove similar properties for Hypergeometric distributions, enabling fast computation of likelihoods for subsamples of the population. We show theoretically and in practice that this approximation is highly accurate and can scale to population sizes in the billions, paving the way for rigorous biobank-scale population genetic inference. Finally, we use our results to estimate how increasing sample sizes will improve the estimation of selection coefficients acting on loss-of-function variants. We find that increasing sample sizes beyond existing large exome sequencing cohorts will provide essentially no additional information except for genes with the most extreme fitness effects.
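For orientation, the exact DTWF update that the abstract calls quadratic can be sketched as follows. This is the naive baseline, not the authors' linear-time approximation. Each row of the transition matrix is a Binomial distribution, so propagating a frequency distribution by one generation touches every pair of states. The function names and the haploid selection parameterization x(1+s)/(1+sx) are assumptions made for illustration only.

```python
from math import comb


def dtwf_transition_row(i, n_copies, s=0.0):
    """Binomial transition probabilities out of state i (i derived copies).

    n_copies -- number of gene copies in the population
    s        -- selection coefficient (assumed haploid parameterization)
    """
    x = i / n_copies
    p = x * (1.0 + s) / (1.0 + s * x)  # expected frequency after selection
    return [
        comb(n_copies, j) * p**j * (1.0 - p) ** (n_copies - j)
        for j in range(n_copies + 1)
    ]


def dtwf_step(dist, n_copies, s=0.0):
    """Advance a distribution over allele counts by one generation.

    Exact vector-matrix product: O(n_copies**2) work per generation,
    which is the cost the paper's low-rank, sparse approximation avoids.
    """
    new = [0.0] * (n_copies + 1)
    for i, mass in enumerate(dist):
        if mass == 0.0:
            continue  # skip states that carry no probability
        for j, pr in enumerate(dtwf_transition_row(i, n_copies, s)):
            new[j] += mass * pr
    return new


if __name__ == "__main__":
    n = 20
    start = [0.0] * (n + 1)
    start[1] = 1.0  # a single new mutation
    dist = start
    for _ in range(10):
        dist = dtwf_step(dist, n)
    print("P(lost after 10 generations) =", dist[0])
```

Because the inner loop runs over all n_copies + 1 target states for every source state, this baseline becomes impractical well before biobank-scale population sizes, which is the motivation stated in the abstract.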
Affiliation(s)
| | - Tony Zeng
- Department of Genetics, Stanford University
| | | | - Jonathan K. Pritchard
- Department of Genetics, Stanford University
- Department of Biology, Stanford University
| |
|
25
|
Jiang Y, Trotsyuk AA, Niu S, Henn D, Chen K, Shih CC, Larson MR, Mermin-Bunnell AM, Mittal S, Lai JC, Saberi A, Beard E, Jing S, Zhong D, Steele SR, Sun K, Jain T, Zhao E, Neimeth CR, Viana WG, Tang J, Sivaraj D, Padmanabhan J, Rodrigues M, Perrault DP, Chattopadhyay A, Maan ZN, Leeolou MC, Bonham CA, Kwon SH, Kussie HC, Fischer KS, Gurusankar G, Liang K, Zhang K, Nag R, Snyder MP, Januszyk M, Gurtner GC, Bao Z. Wireless, closed-loop, smart bandage with integrated sensors and stimulators for advanced wound care and accelerated healing. Nat Biotechnol 2023; 41:652-662. [PMID: 36424488 DOI: 10.1038/s41587-022-01528-3] [Citation(s) in RCA: 75] [Impact Index Per Article: 75.0] [Received: 11/11/2021] [Accepted: 09/23/2022] [Indexed: 11/26/2022]
Abstract
'Smart' bandages based on multimodal wearable devices could enable real-time physiological monitoring and active intervention to promote healing of chronic wounds. However, there has been limited development in incorporation of both sensors and stimulators for the current smart bandage technologies. Additionally, while adhesive electrodes are essential for robust signal transduction, detachment of existing adhesive dressings can lead to secondary damage to delicate wound tissues without switchable adhesion. Here we overcome these issues by developing a flexible bioelectronic system consisting of wirelessly powered, closed-loop sensing and stimulation circuits with skin-interfacing hydrogel electrodes capable of on-demand adhesion and detachment. In mice, we demonstrate that our wound care system can continuously monitor skin impedance and temperature and deliver electrical stimulation in response to the wound environment. Across preclinical wound models, the treatment group healed ~25% more rapidly and with ~50% enhancement in dermal remodeling compared with control. Further, we observed activation of proregenerative genes in monocyte and macrophage cell populations, which may enhance tissue regeneration, neovascularization and dermal recovery.
Affiliation(s)
- Yuanwen Jiang
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Artem A Trotsyuk
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA
| | - Simiao Niu
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Dominic Henn
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Kellen Chen
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA
| | - Chien-Chung Shih
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Madelyn R Larson
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Alana M Mermin-Bunnell
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Smiti Mittal
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Jian-Cheng Lai
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Aref Saberi
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Ethan Beard
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Serena Jing
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Donglai Zhong
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Sydney R Steele
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Kefan Sun
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Tanish Jain
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Eric Zhao
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
| | - Christopher R Neimeth
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Willian G Viana
- Department of Biology, Stanford University, Stanford, CA, USA
| | - Jing Tang
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA
- Department of Materials Science and Engineering, Stanford University, Stanford, CA, USA
| | - Dharshan Sivaraj
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA
| | - Jagannath Padmanabhan
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Melanie Rodrigues
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - David P Perrault
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Arhana Chattopadhyay
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Zeshaan N Maan
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Melissa C Leeolou
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Clark A Bonham
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Sun Hyung Kwon
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Hudson C Kussie
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA
| | - Katharina S Fischer
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA
| | | | - Kui Liang
- BOE Technology Center, BOE Technology Group Co., Ltd, Beijing, China
| | - Kailiang Zhang
- BOE Technology Center, BOE Technology Group Co., Ltd, Beijing, China
| | - Ronjon Nag
- Stanford Distinguished Careers Institute, Stanford University, Stanford, CA, USA
| | - Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Michael Januszyk
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA
| | - Geoffrey C Gurtner
- Department of Surgery, Division of Plastic and Reconstructive Surgery, Stanford University School of Medicine, Stanford, CA, USA.
- Department of Surgery, University of Arizona College of Medicine, Tucson, AZ, USA.
| | - Zhenan Bao
- Department of Chemical Engineering, Stanford University, Stanford, CA, USA.
| |
|
26
|
Černý V, Priehodová E, Fortes-Lima C. A Population Genetic Perspective on Subsistence Systems in the Sahel/Savannah Belt of Africa and the Historical Role of Pastoralism. Genes (Basel) 2023; 14:genes14030758. [PMID: 36981029 PMCID: PMC10048103 DOI: 10.3390/genes14030758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 02/26/2023] [Accepted: 03/10/2023] [Indexed: 03/30/2023]
Abstract
This review focuses on the Sahel/Savannah belt, a large region of Africa where two alternative subsistence systems (pastoralism and agriculture) now interact. It is a long-standing question whether the pastoralists became isolated here from other populations after cattle began to spread into Africa (~8 thousand years ago, kya) or, rather, began to merge with other populations, such as agropastoralists, after the domestication of sorghum and pearl millet (~5 kya) and with the subsequent spread of agriculture. If we look at lactase persistence, a trait closely associated with the pastoral lifestyle, we see that its variants in current pastoralists distinguish them from their farmer neighbours. Most other (mostly neutral) genetic polymorphisms do not, however, indicate such clear differentiation between these groups; they suggest a common origin and/or extensive gene flow. Genetic affinity and ecological symbiosis between the two subsistence systems can help us better understand the population history of this African region. In this review, we show that genomic datasets of modern Sahel/Savannah belt populations properly collected in local populations can complement the still insufficient archaeological research of this region, especially when dealing with the prehistory of mobile populations with perishable material culture and therefore precarious archaeological visibility.
Affiliation(s)
- Viktor Černý
- Archaeogenetics Laboratory, Institute of Archaeology of the Academy of Sciences of the Czech Republic, Letenská 1, 118 01 Prague, Czech Republic
| | - Edita Priehodová
- Archaeogenetics Laboratory, Institute of Archaeology of the Academy of Sciences of the Czech Republic, Letenská 1, 118 01 Prague, Czech Republic
| | - Cesar Fortes-Lima
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, 752 36 Uppsala, Sweden
| |
|
27
|
Saltoun K, Adolphs R, Paul LK, Sharma V, Diedrichsen J, Yeo BTT, Bzdok D. Dissociable brain structural asymmetry patterns reveal unique phenome-wide profiles. Nat Hum Behav 2023; 7:251-268. [PMID: 36344655 DOI: 10.1038/s41562-022-01461-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 01/18/2022] [Accepted: 09/16/2022] [Indexed: 11/09/2022]
Abstract
Broca reported ~150 years ago that particular lesions of the left hemisphere impair speech. Since then, other brain regions have been reported to show lateralized structure and function. Yet, studies of brain asymmetry have limited their focus to pairwise comparisons between homologous regions. Here, we characterized separable whole-brain asymmetry patterns in grey and white matter structure from n = 37,441 UK Biobank participants. By pooling information on left-right shifts underlying whole-brain structure, we deconvolved signatures of brain asymmetry that are spatially distributed rather than locally constrained. Classically asymmetric regions turned out to belong to more than one asymmetry pattern. Instead of a single dominant signature, we discovered complementary asymmetry patterns that contributed similarly to whole-brain asymmetry at the population level. These asymmetry patterns were associated with unique collections of phenotypes, ranging from early lifestyle factors to demographic status to mental health indicators.
Affiliation(s)
- Karin Saltoun
- McConnell Brain Imaging Centre, Montreal Neurological Institute (MNI), McGill University, Montreal, Quebec, Canada
- Mila - Quebec Artificial Intelligence Institute, Montreal, Quebec, Canada
- Department of Biomedical Engineering, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
- School of Computer Science, McGill University, Quebec, Canada
| | - Ralph Adolphs
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
| | - Lynn K Paul
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
- International Research Consortium for the Corpus Callosum and Cerebral Connectivity (IRC5), Pasadena, CA, USA
- Fuller Graduate School of Psychology, Travis Research Institute, Pasadena, CA, USA
| | - Vaibhav Sharma
- McConnell Brain Imaging Centre, Montreal Neurological Institute (MNI), McGill University, Montreal, Quebec, Canada
- Mila - Quebec Artificial Intelligence Institute, Montreal, Quebec, Canada
| | - Joern Diedrichsen
- The Brain and Mind Institute, Western University, London, Ontario, Canada
- Department of Computer Science, Western University, London, Ontario, Canada
- Department of Statistical and Actuarial Sciences, Western University, London, Ontario, Canada
| | - B T Thomas Yeo
- Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
- Centre for Sleep & Cognition & Centre for Translational Magnetic Resonance Research, Yong Loo Lin School of Medicine, Singapore, Singapore
- N.1 Institute for Health & Institute for Digital Medicine, National University of Singapore, Singapore, Singapore
| | - Danilo Bzdok
- McConnell Brain Imaging Centre, Montreal Neurological Institute (MNI), McGill University, Montreal, Quebec, Canada
- Mila - Quebec Artificial Intelligence Institute, Montreal, Quebec, Canada
| |
|
28
|
Di N, Sharif MZ, Hu Z, Xue R, Yu B. Applicability of VGGish embedding in bee colony monitoring: comparison with MFCC in colony sound classification. PeerJ 2023; 11:e14696. [PMID: 36721779 PMCID: PMC9884476 DOI: 10.7717/peerj.14696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 12/14/2022] [Indexed: 01/27/2023]
Abstract
Background: Bee colony sound is a continuous, low-frequency buzzing sound that varies with the environment or the colony's behavior and is considered meaningful. Bees use sounds to communicate within the hive, and investigating bee colony sounds can reveal helpful information about the circumstances in the colony. Therefore, one crucial step in analyzing bee colony sounds is to extract appropriate acoustic features. Methods: This article uses VGGish (a visual geometry group-like audio classification model) embeddings and Mel-frequency Cepstral Coefficients (MFCC) generated from three bee colony sound datasets to train four machine learning algorithms and determine which acoustic feature performs better in bee colony sound recognition. Results: The results showed that VGGish embedding performs better than or on par with MFCC in all three datasets.
Affiliation(s)
- Nayan Di
- Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, China
- University of Science and Technology of China, Hefei, China
| | - Muhammad Zahid Sharif
- Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, China
- University of Science and Technology of China, Hefei, China
| | - Zongwen Hu
- Eastern Bee Research Institute, College of Animal Science and Technology, Yunnan Agricultural University, Kunming, China
- The Sericultural and Apicultural Research Institute, Yunnan Academy of Agricultural Sciences, Mengzi, China
| | - Renjie Xue
- Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, China
- University of Science and Technology of China, Hefei, China
| | - Baizhong Yu
- Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, China
- University of Science and Technology of China, Hefei, China
| |
|
29
|
Gonzalez-Castillo J, Fernandez I, Lam KC, Handwerker DA, Pereira F, Bandettini PA. Manifold Learning for fMRI time-varying FC. bioRxiv 2023:2023.01.14.523992. [PMID: 36789436 PMCID: PMC9928030 DOI: 10.1101/2023.01.14.523992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Whole-brain functional connectivity (FC) measured with functional MRI (fMRI) evolves over time in meaningful ways at temporal scales ranging from years (e.g., development) to seconds (e.g., within-scan time-varying FC (tvFC)). Yet, our ability to explore tvFC is severely constrained by its large dimensionality (several thousands). To overcome this difficulty, researchers seek to generate low-dimensional representations (e.g., 2D and 3D scatter plots) expected to retain its most informative aspects (e.g., relationships to behavior, disease progression). Limited prior empirical work suggests that manifold learning techniques (MLTs), namely those seeking to infer a low-dimensional non-linear surface (i.e., the manifold) where most of the data lies, are good candidates for accomplishing this task. Here we explore this possibility in detail. First, we discuss why one should expect tvFC data to lie on a low-dimensional manifold. Second, we estimate the intrinsic dimension (ID; i.e., the minimum number of latent dimensions) of tvFC data manifolds. Third, we describe the inner workings of three state-of-the-art MLTs: Laplacian Eigenmaps (LE), T-distributed Stochastic Neighbor Embedding (T-SNE), and Uniform Manifold Approximation and Projection (UMAP). For each method, we empirically evaluate its ability to generate neurobiologically meaningful representations of tvFC data, as well as its robustness against hyper-parameter selection. Our results show that tvFC data has an ID that ranges between 4 and 26, and that ID varies significantly between rest and task states. We also show how all three methods can effectively capture subject identity and task being performed: UMAP and T-SNE can capture these two levels of detail concurrently, but LE could only capture one at a time. We observed substantial variability in embedding quality across MLTs, and within-MLT as a function of hyper-parameter selection. To help alleviate this issue, we provide heuristics that can inform future studies. Finally, we also demonstrate the importance of feature normalization when combining data across subjects and the role that temporal autocorrelation plays in the application of MLTs to tvFC data. Overall, we conclude that while MLTs can be useful to generate summary views of labeled tvFC data, their application to unlabeled data such as resting-state remains challenging.
Affiliation(s)
| | - Isabel Fernandez
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
| | - Ka Chun Lam
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
| | - Daniel A Handwerker
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
| | - Francisco Pereira
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
| | - Peter A Bandettini
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD
- Machine Learning Group, National Institute of Mental Health, Bethesda, MD
- FMRI Core, National Institute of Mental Health, Bethesda, MD
| |
|
30
|
Belleau P, Deschênes A, Chambwe N, Tuveson DA, Krasnitz A. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 2023; 83:49-58. [PMID: 36351074 PMCID: PMC9811156 DOI: 10.1158/0008-5472.can-22-0682] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/24/2022] [Revised: 09/23/2022] [Accepted: 11/02/2022] [Indexed: 11/10/2022]
Abstract
Genetic ancestry-oriented cancer research requires the ability to perform accurate and robust genetic ancestry inference from existing cancer-derived data, including whole-exome sequencing, transcriptome sequencing, and targeted gene panels, very often in the absence of matching cancer-free genomic data. Here we examined the feasibility and accuracy of computational inference of genetic ancestry relying exclusively on cancer-derived data. A data synthesis framework was developed to optimize and assess the performance of the ancestry inference for any given input cancer-derived molecular profile. In its core procedure, the ancestral background of the profiled patient is replaced with one of any number of individuals with known ancestry. The data synthesis framework is applicable to multiple profiling platforms, making it possible to assess the performance of inference specifically for a given molecular profile and separately for each continental-level ancestry; this ability extends to all ancestries, including those without statistically sufficient representation in the existing cancer data. The inference procedure was demonstrated to be accurate and robust in a wide range of sequencing depths. Testing of the approach in four representative cancer types and across three molecular profiling modalities showed that continental-level ancestry of patients can be inferred with high accuracy, as quantified by its agreement with the gold standard of deriving ancestry from matching cancer-free molecular data. This study demonstrates that vast amounts of existing cancer-derived molecular data are potentially amenable to ancestry-oriented studies of the disease without requiring matching cancer-free genomes or patient self-reported ancestry. 
Significance: The development of a computational approach that enables accurate and robust ancestry inference from cancer-derived molecular profiles without matching cancer-free data provides a valuable methodology for genetic ancestry-oriented cancer research.
Affiliation(s)
- Pascal Belleau
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
- Cancer Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
| | - Astrid Deschênes
- Cancer Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
- Lustgarten Foundation Pancreatic Cancer Research Laboratory, Cold Spring Harbor, New York
| | - Nyasha Chambwe
- Institute of Molecular Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, New York
| | - David A. Tuveson
- Cancer Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
- Lustgarten Foundation Pancreatic Cancer Research Laboratory, Cold Spring Harbor, New York
| | - Alexander Krasnitz
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
- Cancer Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
| |
|
31
|
Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022; 46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]
Abstract
Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.
Affiliation(s)
- Alexa A. Woodward
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Ryan J. Urbanowicz
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
| | - Adam C. Naj
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Jason H. Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
| |
|
32
|
Li L, Milesi P, Tiret M, Chen J, Sendrowski J, Baison J, Chen Z, Zhou L, Karlsson B, Berlin M, Westin J, Garcia-Gil MR, Wu HX, Lascoux M. Teasing apart the joint effect of demography and natural selection in the birth of a contact zone. New Phytol 2022; 236:1976-1987. [PMID: 36093739 PMCID: PMC9828440 DOI: 10.1111/nph.18480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/18/2022] [Accepted: 08/23/2022] [Indexed: 05/26/2023]
Abstract
Vast population movements induced by recurrent climatic cycles have shaped the genetic structure of plant species. During glacial periods species were confined to low-latitude refugia from which they recolonized higher latitudes as the climate improved. This multipronged recolonization led to many lineages that later met and formed large contact zones. We utilize genomic data from 5000 Picea abies trees to test for the presence of natural selection during recolonization and establishment of a contact zone in Scandinavia. Scandinavian P. abies is today made up of a southern genetic cluster originating from the Baltics, and a northern one originating from Northern Russia. The contact zone delineating them closely matches the limit between two major climatic regions. We show that natural selection contributed to its establishment and maintenance. First, an isolation-with-migration model with genome-wide linked selection fits the data better than a purely neutral one. Second, many loci show signatures of selection or are associated with environmental variables. These loci, regrouped in clusters on chromosomes, are often related to phenology. Altogether, our results illustrate how climatic cycles, recolonization and selection can establish strong local adaptation along contact zones and affect the genetic architecture of adaptive traits.
Affiliation(s)
- Lili Li
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
| | - Pascal Milesi
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
| | - Mathieu Tiret
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
| | - Jun Chen
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
- College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| | - Janek Sendrowski
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
| | - John Baison
- Department Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå SE-90183, Sweden
| | - Zhi-qiang Chen
- Department Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå SE-90183, Sweden
| | - Linghua Zhou
- Department Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå SE-90183, Sweden
| | | | - Mats Berlin
- Skogforsk, Uppsala Science Park, 751 83 Uppsala, Sweden
| | - Johan Westin
- Unit for Field-Based Forest Research, Swedish University of Agricultural Sciences, SE-922 91 Vindeln, Sweden
| | - Maria Rosario Garcia-Gil
- Department Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå SE-90183, Sweden
| | - Harry X. Wu
- Department Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå SE-90183, Sweden
- CSIRO National Collection Research Australia, Black Mountain Laboratory, Canberra, ACT 2601, Australia
| | - Martin Lascoux
- Program in Plant Ecology and Evolution, Department of Ecology and Genetics, EBC and SciLife Lab, Uppsala University, 75236 Uppsala, Sweden
|
33
|
Kukkle PL, Geetha TS, Chaudhary R, Sathirapongsasuti JF, Goyal V, Kandadai RM, Kumar H, Borgohain R, Mukherjee A, Oliver M, Sunil M, Mootor MFE, Kapil S, Mandloi N, Wadia PM, Yadav R, Desai S, Kumar N, Biswas A, Pal PK, Muthane UB, Das SK, Sakthivel Murugan SM, Peterson AS, Stawiski EW, Seshagiri S, Gupta R, Ramprasad VL, Prai PRAOI. Genome-Wide Polygenic Score Predicts Large Number of High Risk Individuals in Monogenic Undiagnosed Young Onset Parkinson's Disease Patients from India. Adv Biol (Weinh) 2022; 6:e2101326. [PMID: 35810474 DOI: 10.1002/adbi.202101326] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/26/2021] [Revised: 05/15/2022] [Indexed: 01/28/2023]
Abstract
Parkinson's disease (PD) is a genetically heterogeneous neurodegenerative disease with poorly defined environmental influences. Genomic studies of PD patients have identified disease-relevant monogenic genes, rare variants of significance, and polygenic risk-associated variants. In this study, whole genome sequencing data from 90 young onset Parkinson's disease (YOPD) individuals are analyzed for both monogenic and polygenic risk. The genetic variant analysis identifies pathogenic/likely pathogenic variants in eight of the 90 individuals (8.8%). It includes large homozygous coding exon deletions in PRKN and SNV/InDels in VPS13C, PLA2G6, PINK1, SYNJ1, and GCH1. Eleven rare heterozygous GBA coding variants are also identified in 13 (14.4%) individuals. In 34 (56.6%) individuals, one or more variants of uncertain significance (VUS) in PD/PD-relevant genes are observed. Though YOPD patients with a prioritized pathogenic variant show a low polygenic risk score (PRS), patients with prioritized VUS or no significant rare variants show an increased PRS odds ratio for PD. This study suggests that both significant rare variants and polygenic risk from common variants together may contribute to the genesis of PD. Further validation using a larger cohort of patients will confirm the interplay between monogenic and polygenic variants and their use in routine genetic PD diagnosis and risk assessment.
Affiliation(s)
- Prashanth Lingappa Kukkle
- Department of Neurology, Manipal Hospital, Miller Road, Bangalore, 560052, India; Department of Neurology, Parkinson's Disease and Movement Disorders Clinic, Bangalore, 560010, India; Department of Neurology, All India Institute of Medical Sciences, Rishikesh, 249201, India
| | - Thenral S Geetha
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | - Ruchi Chaudhary
- Research Department, MedGenome Inc., 348 Hatch Drive, Foster City, CA, 94404, USA
| | | | - Vinay Goyal
- Department of Neurology, All India Institute of Medical Sciences (AIIMS), New Delhi, 110608, India; Department of Neurology, Medanta Hospital, New Delhi, 110047, India; Department of Neurology, Medanta, The Medicity, Gurgaon, 122006, India
| | | | - Hrishikesh Kumar
- Department of Neurology, Institute of Neurosciences Kolkata, Kolkata, 700007, India
| | - Rupam Borgohain
- Department of Neurology, Nizams Institute of Medical Sciences (NIMS), Hyderabad, 500082, India
| | - Adreesh Mukherjee
- Department of Neurology, Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, 700020, India
| | - Merina Oliver
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | - Meeta Sunil
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | | | - Shruti Kapil
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | - Nitin Mandloi
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | - Pettarusp M Wadia
- Department of Neurology, Jaslok Hospital and Research Centre, Mumbai, 400026, India
| | - Ravi Yadav
- Department of Neurology, National Institute of Mental Health and Neurosciences (NIMHANS), Bangalore, 560029, India
| | - Soaham Desai
- Department of Neurology, Shree Krishna Hospital and Pramukhswami Medical College, Bhaikaka University, Karamsad, 388325, India
| | - Niraj Kumar
- Department of Neurology, All India Institute of Medical Sciences, Rishikesh, 249201, India
| | - Atanu Biswas
- Department of Neurology, Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, 700020, India
| | - Pramod Kumar Pal
- Department of Neurology, National Institute of Mental Health and Neurosciences (NIMHANS), Bangalore, 560029, India
| | - Uday B Muthane
- Department of Neurology, Parkinson and Ageing Research Foundation, Bangalore, 560095, India
| | - Shymal Kumar Das
- Department of Neurology, Bangur Institute of Neurosciences and Institute of Post Graduate Medical Education and Research (IPGME&R), Kolkata, 700020, India
| | | | - Andrew S Peterson
- Research Department, MedGenome Inc., 348 Hatch Drive, Foster City, CA, 94404, USA
| | - Eric W Stawiski
- Research Department, MedGenome Inc., 348 Hatch Drive, Foster City, CA, 94404, USA
| | | | - Ravi Gupta
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
| | - Vedam L Ramprasad
- Research and Diagnostics Department, MedGenome Labs Pvt Ltd, Bangalore, 560099, India
|
34
|
Zhang W, Yan C, Liu X, Yang P, Wang J, Chen Y, Liu W, Li S, Zhang X, Dong G, He X, Yuan X, Jing H. Global characterization of megakaryocytes in bone marrow, peripheral blood, and cord blood by single-cell RNA sequencing. Cancer Gene Ther 2022; 29:1636-1647. [PMID: 35650393 DOI: 10.1038/s41417-022-00476-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/06/2021] [Revised: 03/03/2022] [Accepted: 04/21/2022] [Indexed: 02/04/2023]
Abstract
Megakaryocytes (MK) are mainly derived from bone marrow and are mainly involved in platelet production. Studies have shown that MK derived from bone marrow may have immune function, and that MK from peripheral blood are associated with prostate cancer. Single-cell transcriptome sequencing can help us better understand the heterogeneity and potential function of MK cell populations in bone marrow (BM), peripheral blood (PB), and cord blood (CB) of healthy and diseased people. We integrated more than 1.2 million single-cell transcriptomes from 132 samples of PB, BM, and CB from healthy individuals and patients across different datasets. We examined MK (including MK and products of MK) using single-cell RNA sequencing data analysis methods and identified MK-related protein expression using the Human Protein Atlas. We investigated the relationship between MK subtypes and non-small cell lung cancer (NSCLC) in 77 non-cancer and 402 NSCLC. We found that MK were widely distributed, that MK were more abundant in peripheral blood than in bone marrow, and that there were specific MK subtypes in peripheral blood. We found classical MK1 with typical MK characteristics and non-classical MK2 closely related to immunity, which was the most common subtype in bone marrow and cord blood. Classical MK1 was closely related to NSCLC and can be used as a diagnostic marker. MK2 may have potential adaptive immune function and play a role in NSCLC and in the autoimmune disease systemic lupus erythematosus. MK have 14 subtypes and are widely distributed in PB, CB, and BM. MK subtypes are closely related to immunity and have potential as a diagnostic indicator of NSCLC.
Affiliation(s)
- Weilong Zhang
- Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 100191, Beijing, China
| | - Changjian Yan
- The Second Affiliated Hospital of Fujian Medical University, 362000, Quanzhou, China
| | - Xiaoni Liu
- Department of Respiratory Medicine, The First Affiliated Hospital of Gannan Medical University, 341000, Ganzhou, China
| | - Ping Yang
- Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 100191, Beijing, China
| | - Jing Wang
- Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 100191, Beijing, China
| | - Yingtong Chen
- Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 100191, Beijing, China
| | - Weiyou Liu
- Department of Respiratory Medicine, The First Affiliated Hospital of Gannan Medical University, 341000, Ganzhou, China
| | - Shaoxiang Li
- Department of Pathology, Beijing Tiantan Hospital, Capital Medical University, 100070, Beijing, China
| | - Xiuru Zhang
- Department of Pathology, Beijing Tiantan Hospital, Capital Medical University, 100070, Beijing, China
| | - Gehong Dong
- Department of Pathology, Beijing Tiantan Hospital, Capital Medical University, 100070, Beijing, China
| | - Xue He
- Department of Pathology, Beijing Tiantan Hospital, Capital Medical University, 100070, Beijing, China.
| | - Xiaoliang Yuan
- Department of Respiratory Medicine, The First Affiliated Hospital of Gannan Medical University, 341000, Ganzhou, China.
| | - Hongmei Jing
- Department of Hematology, Lymphoma Research Center, Peking University Third Hospital, 100191, Beijing, China.
|
35
|
Nascimben M, Rimondini L, Corà D, Venturin M. Polygenic risk modeling of tumor stage and survival in bladder cancer. BioData Min 2022; 15:23. [PMID: 36175974 PMCID: PMC9523990 DOI: 10.1186/s13040-022-00306-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 09/18/2022] [Indexed: 11/26/2022]
Abstract
Introduction: Bladder cancer assessment with non-invasive gene expression signatures facilitates the detection of patients at risk and surveillance of their status, bypassing the discomforts given by cystoscopy. To achieve accurate cancer estimation, analysis pipelines for gene expression data (GED) may integrate a sequence of several machine learning and bio-statistical techniques to model complex characteristics of pathological patterns. Methods: Numerical experiments tested the combination of GED preprocessing by discretization with tree ensemble embeddings and nonlinear dimensionality reductions to categorize oncological patients comprehensively. Modeling aimed to identify tumor stage and distinguish survival outcomes in two situations: complete and partial data embedding. This latter experimental condition simulates the addition of new patients to an existing model for rapid monitoring of disease progression. Machine learning procedures were employed to identify the most relevant genes involved in patient prognosis and test the performance of preprocessed GED compared to untransformed data in predicting patient conditions. Results: Data embedding paired with dimensionality reduction produced prognostic maps with well-defined clusters of patients, suitable for medical decision support. A second experiment simulated the addition of new patients to an existing model (partial data embedding): Uniform Manifold Approximation and Projection (UMAP) methodology with uniform data discretization led to better outcomes than other analyzed pipelines. Further exploration of parameter space for UMAP and t-distributed stochastic neighbor embedding (t-SNE) underlined the importance of tuning a higher number of parameters for UMAP rather than t-SNE. Moreover, two different machine learning experiments identified a group of genes valuable for partitioning patients (gene relevance analysis) and showed the higher precision obtained by preprocessed data in predicting tumor outcomes for cancer stage and survival rate (six classes prediction). Conclusions: The present investigation proposed new analysis pipelines for disease outcome modeling from bladder cancer-related biomarkers. Complete and partial data embedding experiments suggested that pipelines employing UMAP had a more accurate predictive ability, supporting the recent literature trends on this methodology. However, it was also found that several UMAP parameters influence experimental results, therefore deriving a recommendation for researchers to pay attention to this aspect of the UMAP technique. Machine learning procedures further demonstrated the effectiveness of the proposed preprocessing in predicting patients’ conditions and determined a sub-group of biomarkers significant for forecasting bladder cancer prognosis.
Affiliation(s)
- Mauro Nascimben
- Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy; Enginsoft SpA, Via Giambellino 7, 35129, Padova, Italy
| | - Lia Rimondini
- Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
| | - Davide Corà
- Department of Health Sciences, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy; Department of Translational Medicine, Università del Piemonte Orientale, Via Solaroli 17, 28100, Novara, Italy
|
36
|
Fortes-Lima C, Tříska P, Čížková M, Podgorná E, Diallo MY, Schlebusch CM, Černý V. Demographic and Selection Histories of Populations Across the Sahel/Savannah Belt. Mol Biol Evol 2022; 39:6731090. [PMID: 36173804 PMCID: PMC9582163 DOI: 10.1093/molbev/msac209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
The Sahel/Savannah belt harbors diverse populations with different demographic histories and different subsistence patterns. However, populations from this large African region are notably under-represented in genomic research. To investigate the population structure and adaptation history of populations from the Sahel/Savannah space, we generated dense genome-wide genotype data of 327 individuals comprising 14 ethnolinguistic groups, including 10 previously unsampled populations. Our results highlight fine-scale population structure and complex patterns of admixture, particularly in Fulani groups and Arabic-speaking populations. Among all studied Sahelian populations, only the Rashaayda Arabic-speaking population from eastern Sudan shows a lack of gene flow from African groups, which is consistent with the short history of this population in the African continent. They are recent migrants from Saudi Arabia with evidence of strong genetic isolation during the last few generations and a strong demographic bottleneck. This population also presents a strong selection signal in a genomic region around the CNR1 gene associated with substance dependence and chronic stress. In Western Sahelian populations, signatures of selection were detected in several other genetic regions, including pathways associated with lactase persistence, immune response, and malaria resistance. Taken together, these findings refine our current knowledge of genetic diversity, population structure, migration, admixture and adaptation of human populations in the Sahel/Savannah belt and contribute to our understanding of human history and health.
Affiliation(s)
- Cesar Fortes-Lima
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Petr Tříska
- Archaeogenetics Laboratory, Institute of Archaeology of the Czech Academy of Sciences, Prague, Czech Republic
| | - Martina Čížková
- Archaeogenetics Laboratory, Institute of Archaeology of the Czech Academy of Sciences, Prague, Czech Republic
| | - Eliška Podgorná
- Archaeogenetics Laboratory, Institute of Archaeology of the Czech Academy of Sciences, Prague, Czech Republic
| | - Mame Yoro Diallo
- Archaeogenetics Laboratory, Institute of Archaeology of the Czech Academy of Sciences, Prague, Czech Republic; Department of Anthropology and Human Genetics, Faculty of Science, Charles University, Prague, Czech Republic
|
37
|
Chen K, Henn D, Sivaraj D, Bonham CA, Griffin M, Kussie HC, Padmanabhan J, Trotsyuk AA, Wan DC, Januszyk M, Longaker MT, Gurtner GC. Mechanical Strain Drives Myeloid Cell Differentiation Toward Proinflammatory Subpopulations. Adv Wound Care (New Rochelle) 2022; 11:466-478. [PMID: 34278820 PMCID: PMC9805866 DOI: 10.1089/wound.2021.0036] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/26/2021] [Accepted: 06/27/2021] [Indexed: 01/13/2023]
Abstract
Objective: After injury, humans and other mammals heal by forming fibrotic scar tissue with diminished function, and this healing process involves the dynamic interplay between resident cells within the skin and cells recruited from the circulation. Recent studies have provided mounting evidence that external mechanical forces stimulate intracellular signaling pathways to drive fibrotic processes. Innovation: While most studies have focused on studying mechanotransduction in fibroblasts, recent data suggest that mechanical stimulation may also shape the behavior of immune cells, referred to as "mechano-immunomodulation." However, the effect of mechanical strain on myeloid cell recruitment and differentiation remains poorly understood and has never been investigated at the single-cell level. Approach: In this study, we utilized a three-dimensional (3D) in vitro culture system that permits the precise manipulation of mechanical strain applied to cells. We cultured myeloid cells and used single-cell RNA-sequencing to interrogate the effects of strain on myeloid differentiation and transcriptional programming. Results: Our data indicate that myeloid cells are indeed mechanoresponsive, with mechanical stress influencing myeloid differentiation. Mechanical strain also upregulated a cascade of inflammatory chemokines, most notably from the Ccl family. Conclusion: Further understanding of how mechanical stress affects myeloid cells in conjunction with other cell types in the complicated, multicellular milieu of wound healing may lead to novel insights and therapies for the treatment of fibrosis.
Affiliation(s)
- Kellen Chen
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Dominic Henn
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Dharshan Sivaraj
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Clark A. Bonham
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Michelle Griffin
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Hudson C. Kussie
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Jagannath Padmanabhan
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Artem A. Trotsyuk
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Derrick C. Wan
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Michael Januszyk
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
| | - Michael T. Longaker
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Palo Alto, California, USA
| | - Geoffrey C. Gurtner
- Division of Plastic and Reconstructive Surgery, Department of Surgery, Stanford University School of Medicine, Stanford, California, USA
|
38
|
Ubbens J, Feldmann MJ, Stavness I, Sharpe AG. Quantitative evaluation of nonlinear methods for population structure visualization and inference. G3 GENES|GENOMES|GENETICS 2022; 12:6651067. [PMID: 35900169 PMCID: PMC9434256 DOI: 10.1093/g3journal/jkac191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 07/20/2022] [Indexed: 11/20/2022]
Abstract
Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind.
Affiliation(s)
- Jordan Ubbens
- Global Institute for Food Security (GIFS), University of Saskatchewan, Saskatoon, SK S7N 0W9, Canada
| | - Mitchell J Feldmann
- Department of Plant Sciences, University of California, Davis, CA 95616, USA
| | - Ian Stavness
- Global Institute for Food Security (GIFS), University of Saskatchewan, Saskatoon, SK S7N 0W9, Canada
- Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 0W9, Canada
| | - Andrew G Sharpe
- Global Institute for Food Security (GIFS), University of Saskatchewan, Saskatoon, SK S7N 0W9, Canada
|
39
|
Guo Q, Jiang Y, Wang Z, Bi Y, Chen G, Bai H, Chang G. Genome-Wide Association Study for Screening and Identifying Potential Shin Color Loci in Ducks. Genes (Basel) 2022; 13:genes13081391. [PMID: 36011302 PMCID: PMC9407491 DOI: 10.3390/genes13081391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 07/30/2022] [Accepted: 08/02/2022] [Indexed: 02/05/2023]
Abstract
Shin color diversity is a widespread phenomenon in birds. In this study, ducks were assessed to identify candidate genes for yellow, black, and spotted tibiae. For this purpose, we performed whole-genome resequencing of an F2 population consisting of 275 ducks crossed between Runzhou crested-white ducks and Cherry Valley ducks. We obtained 12.6 Mb of single nucleotide polymorphism (SNP) data, and the three shin colors were subsequently genotyped. Genome-wide association studies (GWASs) were performed to identify candidate and potential SNPs for the three shin colors. According to the results, 2947 and 3451 significant SNPs were associated with black and yellow shins, respectively, and six potential SNPs were associated with spotted shins. Based on the SNP annotations, the MITF, EDNRB2, POU family members, and the SLC superfamily were the candidate genes regulating pigmentation. In addition, the isoforms of EDNRB2, TYR, TYRP1, and MITF-M were significantly different between the black and yellow tibiae. MITF and EDNRB2 may have synergistic roles in the regulation of melanin synthesis, and their mutations may lead to phenotypic differences in the melanin deposition between individuals. This study provides new insights into the genetic factors that may influence tibia color diversity in birds.
Affiliation(s)
- Qixin Guo
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
| | - Yong Jiang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
| | - Zhixiu Wang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
| | - Yulin Bi
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
| | - Guohong Chen
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Joint International Research Laboratory of Agriculture and Agri-Product Safety, The Ministry of Education of China, Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China
| | - Hao Bai
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
| | - Guobin Chang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Joint International Research Laboratory of Agriculture and Agri-Product Safety, The Ministry of Education of China, Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China
|
40
|
Gimbernat-Mayol J, Dominguez Mantes A, Bustamante CD, Mas Montserrat D, Ioannidis AG. Archetypal Analysis for population genetics. PLoS Comput Biol 2022; 18:e1010301. [PMID: 36007005 PMCID: PMC9451066 DOI: 10.1371/journal.pcbi.1010301] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/29/2021] [Revised: 09/07/2022] [Accepted: 06/14/2022] [Indexed: 11/18/2022]
Abstract
The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods are computationally-intensive, prohibitively so in the case of nationwide biobanks. Here we explore Archetypal Analysis as an efficient, unsupervised approach for identifying genetic clusters and for associating individuals with them. Such unsupervised approaches help avoid conflating socially constructed ethnic labels with genetic clusters by eliminating the need for exogenous training labels. We show that Archetypal Analysis yields similar cluster structure to existing unsupervised methods such as ADMIXTURE and provides interpretative advantages. More importantly, we show that since Archetypal Analysis can be used with lower-dimensional representations of genetic data, significant reductions in computational time and memory requirements are possible. When Archetypal Analysis is run in such a fashion, it takes several orders of magnitude less compute time than the current standard, ADMIXTURE. Finally, we demonstrate uses ranging across datasets from humans to canids.
Affiliation(s)
- Julia Gimbernat-Mayol
- Department of Bioengineering, Faculty of Engineering, Imperial College London, London, United Kingdom
| | - Albert Dominguez Mantes
- Brain Mind Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Department of Biomedical Data Science, Stanford Medical School, Stanford, California, United States of America
| | - Carlos D. Bustamante
- Department of Biomedical Data Science, Stanford Medical School, Stanford, California, United States of America
| | - Daniel Mas Montserrat
- Department of Biomedical Data Science, Stanford Medical School, Stanford, California, United States of America
| | - Alexander G. Ioannidis
- Department of Biomedical Data Science, Stanford Medical School, Stanford, California, United States of America
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California, United States of America
|
41
|
Meisner J, Albrechtsen A. Haplotype and population structure inference using neural networks in whole-genome sequencing data. Genome Res 2022; 32:gr.276813.122. [PMID: 35794006 PMCID: PMC9435741 DOI: 10.1101/gr.276813.122] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/03/2022] [Accepted: 06/28/2022] [Indexed: 02/03/2023]
Abstract
Accurate inference of population structure is important in many studies of population genetics. Here we present HaploNet, a method for performing dimensionality reduction and clustering of genetic data. The method is based on local clustering of phased haplotypes using neural networks from whole-genome sequencing or dense genotype data. By using Gaussian mixtures in a variational autoencoder framework, we are able to learn a low-dimensional latent space in which we cluster haplotypes along the genome in a highly scalable manner. We show that we can use haplotype clusters in the latent space to infer global population structure using haplotype information by exploiting the generative properties of our framework. Based on fitted neural networks and their latent haplotype clusters, we can perform principal component analysis and estimate ancestry proportions based on a maximum likelihood framework. Using sequencing data from simulations and closely related human populations, we show that our approach is better at distinguishing closely related populations than standard admixture and principal component analysis software. We further show that HaploNet is fast and highly scalable by applying it to genotype array data of the UK Biobank.
Affiliation(s)
- Jonas Meisner: Department of Biology, Bioinformatics Center, University of Copenhagen, DK-2200 Copenhagen, Denmark
- Anders Albrechtsen: Department of Biology, Bioinformatics Center, University of Copenhagen, DK-2200 Copenhagen, Denmark
|
42
|
Virtual reality for the observation of oncology models (VROOM): immersive analytics for oncology patient cohorts. Sci Rep 2022; 12:11337. [PMID: 35790803 PMCID: PMC9256599 DOI: 10.1038/s41598-022-15548-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/16/2022] [Accepted: 06/24/2022] [Indexed: 11/08/2022] Open
Abstract
The significant advancement of inexpensive and portable virtual reality (VR) and augmented reality devices has re-energised research in the immersive analytics field. The immersive environment differs from a traditional 2D display used to analyse 3D data in that it provides a unified environment supporting immersion in a 3D scene, gestural interaction, haptic feedback and spatial audio. Genomic data analysis has been used in oncology to better understand the relationship between genetic profile, cancer type, and treatment options. This paper proposes a novel immersive analytics tool for cancer patient cohorts in a virtual reality environment, called virtual reality to observe oncology data models. We utilise immersive technologies to analyse the gene expression and clinical data of a cohort of cancer patients. Various machine learning algorithms and visualisation methods have also been deployed in VR to enhance the data interrogation process. We also integrate established 2D visual analytics and graphical methods from bioinformatics, such as scatter plots, descriptive statistical information, linear regression, box plots and heatmaps, into our visualisation. Our approach allows clinicians to interrogate information that is familiar and meaningful to them while providing them with immersive analytics capabilities to make new discoveries toward personalised medicine.
|
43
|
Halldorsson BV, Eggertsson HP, Moore KHS, Hauswedell H, Eiriksson O, Ulfarsson MO, Palsson G, Hardarson MT, Oddsson A, Jensson BO, Kristmundsdottir S, Sigurpalsdottir BD, Stefansson OA, Beyter D, Holley G, Tragante V, Gylfason A, Olason PI, Zink F, Asgeirsdottir M, Sverrisson ST, Sigurdsson B, Gudjonsson SA, Sigurdsson GT, Halldorsson GH, Sveinbjornsson G, Norland K, Styrkarsdottir U, Magnusdottir DN, Snorradottir S, Kristinsson K, Sobech E, Jonsson H, Geirsson AJ, Olafsson I, Jonsson P, Pedersen OB, Erikstrup C, Brunak S, Ostrowski SR, Thorleifsson G, Jonsson F, Melsted P, Jonsdottir I, Rafnar T, Holm H, Stefansson H, Saemundsdottir J, Gudbjartsson DF, Magnusson OT, Masson G, Thorsteinsdottir U, Helgason A, Jonsson H, Sulem P, Stefansson K. The sequences of 150,119 genomes in the UK Biobank. Nature 2022; 607:732-740. [PMID: 35859178 PMCID: PMC9329122 DOI: 10.1038/s41586-022-04965-x] [Citation(s) in RCA: 149] [Impact Index Per Article: 74.5] [Received: 11/05/2021] [Accepted: 06/10/2022] [Indexed: 12/25/2022]
Abstract
Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data [1,2]. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank [3]. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
Affiliation(s)
- Bjarni V Halldorsson: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Technology, Reykjavik University, Reykjavik, Iceland
- Magnus O Ulfarsson: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Marteinn T Hardarson: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Technology, Reykjavik University, Reykjavik, Iceland
- Snaedis Kristmundsdottir: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Technology, Reykjavik University, Reykjavik, Iceland
- Brynja D Sigurpalsdottir: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Technology, Reykjavik University, Reykjavik, Iceland
- Helgi Jonsson: Landspitali-University Hospital, Reykjavik, Iceland; Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Palmi Jonsson: Landspitali-University Hospital, Reykjavik, Iceland; Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Ole Birger Pedersen: Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Christian Erikstrup: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Søren Brunak: Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Sisse Rye Ostrowski: Department of Clinical Immunology, Copenhagen University Hospital (Rigshospitalet), Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health and Clinical Sciences, Copenhagen University, Copenhagen, Denmark
- Pall Melsted: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Ingileif Jonsdottir: deCODE genetics/Amgen Inc., Reykjavik, Iceland; Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Hilma Holm: deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Daniel F Gudbjartsson: deCODE genetics/Amgen Inc., Reykjavik, Iceland; School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Unnur Thorsteinsdottir: deCODE genetics/Amgen Inc., Reykjavik, Iceland; Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Agnar Helgason: deCODE genetics/Amgen Inc., Reykjavik, Iceland; Department of Anthropology, University of Iceland, Reykjavik, Iceland
|
44
|
Bellin N, Calzolari M, Magoga G, Callegari E, Bonilauri P, Lelli D, Dottori M, Montagna M, Rossi V. Unsupervised machine learning and geometric morphometrics as tools for the identification of inter and intraspecific variations in the Anopheles Maculipennis complex. Acta Trop 2022; 233:106585. [PMID: 35787418 DOI: 10.1016/j.actatropica.2022.106585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 06/08/2022] [Accepted: 06/30/2022] [Indexed: 11/01/2022]
Abstract
Geometric morphometric analysis was combined with two unsupervised machine learning algorithms, UMAP and HDBSCAN, to visualize morphological differences in wing shape among and within four Anopheles sibling species (An. atroparvus, An. melanoon, An. maculipennis s.s. and An. daciae sp. inq.) of the Maculipennis complex in Northern Italy. Specifically, we evaluated: 1) wing shape variation among and within species; 2) the consistency between groups of An. maculipennis s.s. and An. daciae sp. inq. identified based on COI sequences and those based on wing shape variability; and 3) the spatial and temporal distribution of different morphotypes. UMAP detected at least 13 main patterns of variation in wing shape among the four analyzed species and mapped intraspecific morphological variations. The relationship between the most abundant COI haplotypes of An. daciae sp. inq. and shape ordination/variation was not significant. However, morphological variation within haplotypes was observed. HDBSCAN also recognized different clusters of morphotypes within An. daciae sp. inq. (12) and An. maculipennis s.s. (4). All morphotypes shared a similar pattern of variation in the subcostal vein, the anal vein and the radio-medial cross-vein of the wing. In contrast, the marginal part of the wings remained unchanged in all clusters of both species. No significant spatial or temporal difference was observed in the frequency of the identified morphotypes. Our study demonstrates that machine learning algorithms are a useful tool when combined with geometric morphometrics, and suggests deepening the analysis of inter- and intraspecific shape variability to evaluate evolutionary constraints related to wing functionality.
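Geometric morphometrics typically starts by removing position and size from landmark configurations before any shape comparison. The sketch below shows only those two steps (centering on the centroid and scaling to unit centroid size) for 2D landmarks; it is an illustration under assumptions, not the pipeline used in this study. The function name `center_and_scale` is hypothetical, and rotation alignment, UMAP and HDBSCAN are deliberately left out.

```python
import math


def center_and_scale(landmarks):
    """Remove translation and size from a list of (x, y) landmarks."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    # Translate so the centroid sits at the origin
    centered = [(x - cx, y - cy) for x, y in landmarks]
    # Centroid size: square root of the summed squared distances to the centroid
    centroid_size = math.sqrt(sum(x * x + y * y for x, y in centered))
    return [(x / centroid_size, y / centroid_size) for x, y in centered]


# The same square at two positions and sizes yields the same coordinates.
small = center_and_scale([(0, 0), (2, 0), (2, 2), (0, 2)])
large = center_and_scale([(10, 10), (18, 10), (18, 18), (10, 18)])
```

After this normalization, remaining differences between configurations reflect shape and orientation only, which is why a rotation step normally follows in a full Procrustes superimposition.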
Affiliation(s)
- Nicolò Bellin: University of Parma, Department of Chemistry, Life Sciences and Environmental Sustainability, Parco Area delle Scienze, 11/A 43124 Parma, Italy
- Mattia Calzolari: Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna "B. Ubertini" (IZSLER), Brescia, Italy
- Giulia Magoga: Università degli Studi di Milano, Dipartimento di Scienze Agrarie e Ambientali, Via Celoria 2, 20133 Milan, Italy
- Emanuele Callegari: Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna "B. Ubertini" (IZSLER), Brescia, Italy
- Paolo Bonilauri: Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna "B. Ubertini" (IZSLER), Brescia, Italy
- Davide Lelli: Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna "B. Ubertini" (IZSLER), Brescia, Italy
- Michele Dottori: Istituto Zooprofilattico Sperimentale della Lombardia e dell'Emilia Romagna "B. Ubertini" (IZSLER), Brescia, Italy
- Matteo Montagna: Università degli Studi di Milano, Dipartimento di Scienze Agrarie e Ambientali, Via Celoria 2, 20133 Milan, Italy
- Valeria Rossi: University of Parma, Department of Chemistry, Life Sciences and Environmental Sustainability, Parco Area delle Scienze, 11/A 43124 Parma, Italy
|
45
|
Favila N, Madrigal-Trejo D, Legorreta D, Sánchez-Pérez J, Espinosa-Asuar L, Eguiarte LE, Souza V. MicNet toolbox: Visualizing and unraveling a microbial network. PLoS One 2022; 17:e0259756. [PMID: 35749381 PMCID: PMC9231805 DOI: 10.1371/journal.pone.0259756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2021] [Accepted: 04/05/2022] [Indexed: 11/19/2022] Open
Abstract
Applications of network theory to microbial ecology are an emerging and promising approach to understanding both global and local patterns in the structure and interplay of microbial communities. In this paper, we present an open-source Python toolbox consisting of two modules. The first is a visualization module that incorporates UMAP, a dimensionality reduction technique that focuses on local patterns, and HDBSCAN, a density-based clustering technique. The second runs an enhanced version of the SparCC code that supports larger datasets than before; we couple its output with network theory analyses to describe the resulting co-occurrence networks, including several novel analyses, such as structural balance metrics and a proposal to discover the underlying topology of a co-occurrence network. We validated the proposed toolbox on a simple and well-described biological network of kombucha, consisting of 48 ASVs, and we validated the improvements of our new version of SparCC. Finally, we showcase the use of the MicNet toolbox on a large dataset from Archean Domes, consisting of more than 2,000 ASVs. Our toolbox is freely available as a GitHub repository (https://github.com/Labevo/MicNetToolbox), and it is accompanied by a web dashboard (http://micnetapplb-1212130533.us-east-1.elb.amazonaws.com) that can be used in a simple and straightforward manner with relative abundance data. This easy-to-use implementation is aimed at microbial ecologists with little to no programming experience, while more experienced bioinformaticians will also be able to manipulate the source code's functions with ease.
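Structural balance in a signed co-occurrence network is commonly assessed on triangles: a triangle is balanced when the product of its three edge signs is positive. The sketch below counts balanced and unbalanced triangles in a small signed graph. It is an illustration of that general idea, not code from the MicNet toolbox; the function name `triangle_balance` and the edge representation are assumptions made for this example.

```python
from itertools import combinations


def triangle_balance(signed_edges):
    """Count (balanced, unbalanced) triangles.

    signed_edges maps frozenset({u, v}) to +1 (positive correlation)
    or -1 (negative correlation). Node labels must be sortable.
    """
    nodes = sorted({node for edge in signed_edges for node in edge})
    balanced = unbalanced = 0
    for a, b, c in combinations(nodes, 3):
        pairs = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
        if not all(pair in signed_edges for pair in pairs):
            continue  # the three nodes do not form a closed triangle
        product = 1
        for pair in pairs:
            product *= signed_edges[pair]
        if product > 0:
            balanced += 1
        else:
            unbalanced += 1
    return balanced, unbalanced


edges = {
    frozenset(("A", "B")): 1,
    frozenset(("B", "C")): 1,
    frozenset(("A", "C")): 1,
    frozenset(("C", "D")): -1,
    frozenset(("B", "D")): -1,
}
counts = triangle_balance(edges)
```

The brute-force enumeration over node triples is cubic in the number of nodes, which is fine for a few dozen taxa but would need an adjacency-based triangle search for networks with thousands of ASVs.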
Affiliation(s)
- Natalia Favila: Laboratorio de Inteligencia Artificial, Ixulabs, Mexico City, Mexico
- David Madrigal-Trejo: Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Daniel Legorreta: Laboratorio de Inteligencia Artificial, Ixulabs, Mexico City, Mexico
- Jazmín Sánchez-Pérez: Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Laura Espinosa-Asuar: Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Luis E. Eguiarte: Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Valeria Souza: Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico; Centro de Estudios del Cuaternario de Fuego-Patagonia y Antártica (CEQUA), Punta Arenas, Chile
|
46
|
Revealing the recent demographic history of Europe via haplotype sharing in the UK Biobank. Proc Natl Acad Sci U S A 2022; 119:e2119281119. [PMID: 35696575 PMCID: PMC9233301 DOI: 10.1073/pnas.2119281119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 02/01/2023] Open
Abstract
Haplotype-based analyses have recently been leveraged to interrogate the fine-scale structure in specific geographic regions, notably in Europe, although an equivalent haplotype-based understanding across the whole of Europe with these tools is lacking. Furthermore, study of identity-by-descent (IBD) sharing in a large sample of haplotypes across Europe would allow a direct comparison between the demographic histories of different regions. The UK Biobank (UKBB) is a population-scale dataset of genotype and phenotype data collected from the United Kingdom, with established sampling of worldwide ancestries. The exact content of these non-UK ancestries is largely uncharacterized; studying it could highlight valuable intracontinental ancestry references with deep phenotyping within the UKBB. In this context, we sought to investigate the sample of European ancestry captured in the UKBB. We studied the haplotypes of 5,500 UKBB individuals with a European birthplace; investigated the population structure and demographic history in Europe, showing in parallel the variety of footprints of demographic history in different genetic regions around Europe; and expanded knowledge of the genetic landscape of the east and southeast of Europe. Providing an updated map of European genetics, we leverage IBD-segment sharing to explore the extent of population isolation and size across the continent. In addition to building and expanding upon previous knowledge in Europe, our results show the UKBB as a source of diverse ancestries beyond Britain. These worldwide ancestries sampled in the UKBB may complement and inform researchers interested in specific communities or regions not limited to Britain.
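One simple way to summarize IBD-segment sharing is to total the segment lengths shared by each pair of individuals and average those totals within and between regions. The sketch below does this for a toy input. It is a hedged illustration, not the analysis from this paper: the function name `mean_ibd_sharing`, the tuple layout `(individual_a, individual_b, length_cM)` and the region labels are all hypothetical, and the average is taken only over pairs that share at least one segment.

```python
from collections import defaultdict


def mean_ibd_sharing(segments, region_of):
    """Mean total IBD length (cM) per sharing pair, keyed by region pair.

    segments: iterable of (individual_a, individual_b, length_cM).
    region_of: dict mapping individual to a region label.
    """
    totals = defaultdict(float)  # (region, region) -> summed length
    pairs = defaultdict(set)     # (region, region) -> distinct sharing pairs
    for ind_a, ind_b, length_cm in segments:
        key = tuple(sorted((region_of[ind_a], region_of[ind_b])))
        totals[key] += length_cm
        pairs[key].add(frozenset((ind_a, ind_b)))
    # Pairs with no detected segment are not counted in the denominator.
    return {key: totals[key] / len(pairs[key]) for key in totals}


sharing = mean_ibd_sharing(
    [("a", "b", 2.0), ("a", "b", 3.0), ("a", "c", 1.0)],
    {"a": "X", "b": "X", "c": "Y"},
)
```

Elevated within-region sharing relative to between-region sharing is the kind of signal usually read as smaller effective population size or greater isolation, although a real analysis would also bin segments by length and normalize by all possible pairs.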
|
47
|
Qin X, Chiang CWK, Gaggiotti OE. KLFDAPC: a supervised machine learning approach for spatial genetic structure analysis. Brief Bioinform 2022; 23:6596986. [PMID: 35649387 PMCID: PMC9294434 DOI: 10.1093/bib/bbac202] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/12/2022] [Revised: 04/05/2022] [Accepted: 04/29/2022] [Indexed: 12/30/2022] Open
Abstract
Geographic patterns of human genetic variation provide important insights into human evolution and disease. A commonly used tool to detect and describe them is principal component analysis (PCA) or the supervised linear discriminant analysis of principal components (DAPC). However, genetic features produced from both approaches could fail to correctly characterize population structure for complex scenarios involving admixture. In this study, we introduce Kernel Local Fisher Discriminant Analysis of Principal Components (KLFDAPC), a supervised non-linear approach for inferring individual geographic genetic structure that could rectify the limitations of these approaches by preserving the multimodal space of samples. We tested the power of KLFDAPC to infer population structure and to predict individual geographic origin using neural networks. Simulation results showed that KLFDAPC has higher discriminatory power than PCA and DAPC. The application of our method to empirical European and East Asian genome-wide genetic datasets indicated that the first two reduced features of KLFDAPC correctly recapitulated the geography of individuals and significantly improved the accuracy of predicting individual geographic origin when compared to PCA and DAPC. Therefore, KLFDAPC can be useful for geographic ancestry inference, design of genome scans and correction for spatial stratification in GWAS that link genes to adaptation or disease susceptibility.
Affiliation(s)
- Xinghu Qin: Centre for Biological Diversity, Sir Harold Mitchell Building, University of St Andrews, Fife, KY16 9TF, UK
- Charleston W K Chiang: Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine & Department of Quantitative and Computational Biology, University of Southern California, USA
- Oscar E Gaggiotti: Centre for Biological Diversity, Sir Harold Mitchell Building, University of St Andrews, Fife, KY16 9TF, UK
|
48
|
Bej S, Sarkar J, Biswas S, Mitra P, Chakrabarti P, Wolkenhauer O. Identification and epidemiological characterization of Type-2 diabetes sub-population using an unsupervised machine learning approach. Nutr Diabetes 2022; 12:27. [PMID: 35624098 PMCID: PMC9142500 DOI: 10.1038/s41387-022-00206-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/13/2020] [Revised: 03/11/2022] [Accepted: 05/18/2022] [Indexed: 12/05/2022] Open
Abstract
Background: Studies on Type-2 Diabetes Mellitus (T2DM) have revealed heterogeneous sub-populations in terms of underlying pathologies. However, the identification of sub-populations in epidemiological datasets remains unexplored. Here we focus on the detection of T2DM clusters in epidemiological data, specifically analysing the National Family Health Survey-4 (NFHS-4) dataset from India, which contains a wide spectrum of features, including medical history, dietary and addiction habits, and socio-economic and lifestyle patterns of 10,125 T2DM patients.
Methods: Epidemiological data are challenging to analyse because of their diverse feature types. For this reason, applying the state-of-the-art dimension reduction tool UMAP in the conventional way was found to be ineffective for the NFHS-4 dataset. We implemented a distributed clustering workflow combining different similarity measure settings of UMAP, clustering continuous, ordinal and nominal features separately. We integrated the reduced dimensions from each feature-type-distributed clustering to obtain an interpretable and unbiased clustering of the data.
Results: Our analysis reveals four significant clusters, two of which consist mainly of non-obese T2DM patients. These non-obese clusters have a lower mean age and consist mostly of rural residents. Surprisingly, in one of the obese clusters 90% of the T2DM patients practised a non-vegetarian diet, though they did not show an increased intake of plant-based protein-rich foods.
Conclusions: From a methodological perspective, we show that for diverse data types, which are frequent in epidemiological datasets, feature-type-distributed clustering using UMAP is effective, as opposed to the conventional use of the UMAP algorithm. The application of a UMAP-based clustering workflow to this type of dataset is novel in itself. Our findings demonstrate the presence of heterogeneity among Indian T2DM patients with regard to socio-demography and dietary patterns. From our analysis, we conclude that the existence of significant non-obese T2DM sub-populations, characterized by younger age groups and economic disadvantage, raises the need for different screening criteria for T2DM among rural Indian residents.
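The core idea of treating continuous, ordinal and nominal features separately can be illustrated by computing one distance per feature type instead of a single distance over the mixed record. The sketch below is an assumption-laden illustration, not the authors' workflow: the abstract does not name the similarity measures used, so Euclidean, Manhattan and Hamming distances are stand-ins chosen for this example, the function and field names are hypothetical, and the UMAP embedding and integration steps are omitted.

```python
import math


def euclidean(a, b):
    """Distance for continuous features (for example age, BMI)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Distance for ordinal features encoded as integer ranks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Fraction of mismatching categories for nominal features."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def per_type_distances(p, q):
    """One distance per feature type for two patient records."""
    return {
        "continuous": euclidean(p["continuous"], q["continuous"]),
        "ordinal": manhattan(p["ordinal"], q["ordinal"]),
        "nominal": hamming(p["nominal"], q["nominal"]),
    }


patient_1 = {"continuous": [0.0, 3.0], "ordinal": [1, 2], "nominal": ["rural", "veg"]}
patient_2 = {"continuous": [4.0, 0.0], "ordinal": [3, 2], "nominal": ["urban", "veg"]}
distances = per_type_distances(patient_1, patient_2)
```

Keeping the three distances apart avoids letting the scale of the continuous features dominate the categorical ones, which is the motivation the abstract gives for not applying a single metric to the whole record.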
Affiliation(s)
- Saptarshi Bej: Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany; Leibniz-Institute for Food Systems Biology at the Technical University Munich, Munich, Germany
- Jit Sarkar: Division of Cell Biology and Physiology, CSIR-Indian Institute of Chemical Biology, Kolkata, India; Academy of Innovative and Scientific Research, Ghaziabad, India
- Saikat Biswas: Advanced Technology Development Centre, Indian Institute of Technology, Kharagpur, India
- Pabitra Mitra: Department of Computer Science & Engineering, Indian Institute of Technology, Kharagpur, India
- Partha Chakrabarti: Division of Cell Biology and Physiology, CSIR-Indian Institute of Chemical Biology, Kolkata, India; Academy of Innovative and Scientific Research, Ghaziabad, India
- Olaf Wolkenhauer: Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany; Leibniz-Institute for Food Systems Biology at the Technical University Munich, Munich, Germany; Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, Stellenbosch, South Africa
|
49
|
The genomic origins of the world's first farmers. Cell 2022; 185:1842-1859.e18. [PMID: 35561686 PMCID: PMC9166250 DOI: 10.1016/j.cell.2022.04.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 12/13/2021] [Revised: 03/04/2022] [Accepted: 04/06/2022] [Indexed: 11/24/2022]
Abstract
The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.
|
50
|
Transcriptional adaptation of olfactory sensory neurons to GPCR identity and activity. Nat Commun 2022; 13:2929. [PMID: 35614043 PMCID: PMC9132991 DOI: 10.1038/s41467-022-30511-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/09/2021] [Accepted: 05/04/2022] [Indexed: 01/02/2023] Open
Abstract
In mammals, chemoperception relies on a diverse set of neuronal sensors able to detect chemicals present in the environment, and to adapt to various levels of stimulation. The contribution of endogenous and external factors to these neuronal identities remains to be determined. Taking advantage of the parallel coding lines present in the olfactory system, we explored the potential variations of neuronal identities before and after olfactory experience. We found that at rest, the transcriptomic profiles of mouse olfactory sensory neuron populations are already divergent, are specific to the olfactory receptor they express, and are associated with the sequence of these receptors. These divergent profiles further evolve in response to the environment, as odorant exposure leads to reprogramming via the modulation of transcription. These findings highlight a broad range of sensory neuron identities that are present at rest and that adapt to the experience of the individual, thus adding to the complexity and flexibility of sensory coding.
|