1
Parchure P, Besculides M, Zhan S, Cheng FY, Timsina P, Cheertirala SN, Kersch I, Wilson S, Freeman R, Reich D, Mazumdar M, Kia A. Malnutrition risk assessment using a machine learning-based screening tool: A multicentre retrospective cohort. J Hum Nutr Diet 2024; 37:622-632. [PMID: 38348579 DOI: 10.1111/jhn.13286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 01/05/2024] [Accepted: 01/09/2024] [Indexed: 05/22/2024]
Abstract
BACKGROUND Malnutrition is associated with increased morbidity, mortality, and healthcare costs. Early detection is important for timely intervention. This paper assesses the ability of a machine learning screening tool (MUST-Plus) implemented in registered dietitian (RD) workflow to identify malnourished patients early in the hospital stay and to improve the diagnosis and documentation rate of malnutrition.
METHODS This retrospective cohort study was conducted in a large, urban health system in New York City comprising six hospitals serving a diverse patient population. The study included all patients aged ≥ 18 years, who were not admitted for COVID-19 and had a length of stay of ≤ 30 days.
RESULTS Of the 7736 hospitalisations that met the inclusion criteria, 1947 (25.2%) were identified as being malnourished by MUST-Plus-assisted RD evaluations. The lag between admission and diagnosis improved with MUST-Plus implementation. The usability of the tool output by RDs exceeded 90%, showing good acceptance by users. When compared pre-/post-implementation, the rate of both diagnoses and documentation of malnutrition showed improvement.
CONCLUSION MUST-Plus, a machine learning-based screening tool, shows great promise as a malnutrition screening tool for hospitalised patients when used in conjunction with adequate RD staffing and training about the tool. It performed well across multiple measures and settings. Other health systems can use their electronic health record data to develop, test and implement similar machine learning-based processes to improve malnutrition screening and facilitate timely intervention.
2
Lazaridis I, Patterson N, Anthony D, Vyazov L, Fournier R, Ringbauer H, Olalde I, Khokhlov AA, Kitov EP, Shishlina NI, Ailincăi SC, Agapov DS, Agapov SA, Batieva E, Bauyrzhan B, Bereczki Z, Buzhilova A, Changmai P, Chizhevsky AA, Ciobanu I, Constantinescu M, Csányi M, Dani J, Dashkovskiy PK, Évinger S, Faifert A, Flegontov PN, Frînculeasa A, Frînculeasa MN, Hajdu T, Higham T, Jarosz P, Jelínek P, Khartanovich VI, Kirginekov EN, Kiss V, Kitova A, Kiyashko AV, Koledin J, Korolev A, Kosintsev P, Kulcsár G, Kuznetsov P, Magomedov R, Malikovich MA, Melis E, Moiseyev V, Molnár E, Monge J, Negrea O, Nikolaeva NA, Novak M, Ochir-Goryaeva M, Pálfi G, Popovici S, Rykun MP, Savenkova TM, Semibratov VP, Seregin NN, Šefčáková A, Serikovna MR, Shingiray I, Shirokov VN, Simalcsik A, Sirak K, Solodovnikov KN, Tárnoki J, Tishkin AA, Trifonov V, Vasilyev S, Akbari A, Brielle ES, Callan K, Candilio F, Cheronet O, Curtis E, Flegontova O, Iliev L, Kearns A, Keating D, Lawson AM, Mah M, Micco A, Michel M, Oppenheimer J, Qiu L, Noah Workman J, Zalzala F, Szécsényi-Nagy A, Palamara PF, Mallick S, Rohland N, Pinhasi R, Reich D. The Genetic Origin of the Indo-Europeans. bioRxiv 2024:2024.04.17.589597. [PMID: 38659893 PMCID: PMC11042377 DOI: 10.1101/2024.04.17.589597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
The Yamnaya archaeological complex appeared around 3300 BCE across the steppes north of the Black and Caspian Seas, and by 3000 BCE reached its maximal extent from Hungary in the west to Kazakhstan in the east. To localize the ancestral and geographical origins of the Yamnaya among the diverse Eneolithic people that preceded them, we studied ancient DNA data from 428 individuals of which 299 are reported for the first time, demonstrating three previously unknown Eneolithic genetic clines. First, a "Caucasus-Lower Volga" (CLV) Cline suffused with Caucasus hunter-gatherer (CHG) ancestry extended between a Caucasus Neolithic southern end in Neolithic Armenia, and a steppe northern end in Berezhnovka in the Lower Volga. Bidirectional gene flow across the CLV cline created admixed intermediate populations in both the north Caucasus, such as the Maikop people, and on the steppe, such as those at the site of Remontnoye north of the Manych depression. CLV people also helped form two major riverine clines by admixing with distinct groups of European hunter-gatherers. A "Volga Cline" was formed as Lower Volga people mixed with upriver populations that had more Eastern hunter-gatherer (EHG) ancestry, creating genetically hyper-variable populations as at Khvalynsk in the Middle Volga. A "Dnipro Cline" was formed as CLV people bearing both Caucasus Neolithic and Lower Volga ancestry moved west and acquired Ukraine Neolithic hunter-gatherer (UNHG) ancestry to establish the population of the Serednii Stih culture from which the direct ancestors of the Yamnaya themselves were formed around 4000 BCE. This population grew rapidly after 3750-3350 BCE, precipitating the expansion of people of the Yamnaya culture who totally displaced previous groups on the Volga and further east, while admixing with more sedentary groups in the west.
CLV cline people with Lower Volga ancestry contributed four fifths of the ancestry of the Yamnaya, but also, entering Anatolia from the east, contributed at least a tenth of the ancestry of Bronze Age Central Anatolians, where the Hittite language, related to the Indo-European languages spread by the Yamnaya, was spoken. We thus propose that the final unity of the speakers of the "Proto-Indo-Anatolian" ancestral language of both Anatolian and Indo-European languages can be traced to CLV cline people sometime between 4400-4000 BCE.
3
Sirak K, Jansen Van Rensburg J, Brielle E, Chen B, Lazaridis I, Ringbauer H, Mah M, Mallick S, Micco A, Rohland N, Callan K, Curtis E, Kearns A, Lawson AM, Workman JN, Zalzala F, Ahmed Al-Orqbi AS, Ahmed Salem EM, Salem Hasan AM, Britton DC, Reich D. Medieval DNA from Soqotra points to Eurasian origins of an isolated population at the crossroads of Africa and Arabia. Nat Ecol Evol 2024; 8:817-829. [PMID: 38332026 PMCID: PMC11009077 DOI: 10.1038/s41559-024-02322-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/11/2023] [Indexed: 02/10/2024]
Abstract
Soqotra, an island situated at the mouth of the Gulf of Aden in the northwest Indian Ocean between Africa and Arabia, is home to ~60,000 people subsisting through fishing and semi-nomadic pastoralism who speak a Modern South Arabian language. Most of what is known about Soqotri history derives from writings of foreign travellers who provided little detail about local people, and the geographic origins and genetic affinities of early Soqotri people have not yet been investigated directly. Here we report genome-wide data from 39 individuals who lived between ~650 and 1750 CE at six locations across the island and document strong genetic connections between Soqotra and the similarly isolated Hadramawt region of coastal South Arabia that likely reflect a source for the peopling of Soqotra. Medieval Soqotri can be modelled as deriving ~86% of their ancestry from a population such as that found in the Hadramawt today, with the remaining ~14% best proxied by an Iranian-related source with up to 2% ancestry from the Indian sub-continent, possibly reflecting genetic exchanges that occurred along with archaeologically documented trade from these regions. In contrast to all other genotyped populations of the Arabian Peninsula, genome-level analysis of the medieval Soqotri is consistent with no sub-Saharan African admixture dating to the Holocene. The deep ancestry of people from medieval Soqotra and the Hadramawt is also unique in deriving less from early Holocene Levantine farmers and more from groups such as Late Pleistocene hunter-gatherers from the Levant (Natufians) than other mainland Arabians. This attests to migrations by early farmers having less impact in southernmost Arabia and Soqotra and provides compelling evidence that there has not been complete population replacement between the Pleistocene and Holocene throughout the Arabian Peninsula.
Medieval Soqotra harboured a small population that showed qualitatively different marriage practices from modern Soqotri, with first-cousin unions occurring significantly less frequently than today.
4
Mallick S, Micco A, Mah M, Ringbauer H, Lazaridis I, Olalde I, Patterson N, Reich D. The Allen Ancient DNA Resource (AADR) a curated compendium of ancient human genomes. Sci Data 2024; 11:182. [PMID: 38341426 PMCID: PMC10858950 DOI: 10.1038/s41597-024-03031-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 01/31/2024] [Indexed: 02/12/2024] Open
Abstract
More than two hundred papers have reported genome-wide data from ancient humans. While the raw data for the vast majority are fully publicly available testifying to the commitment of the paleogenomics community to open data, formats for both raw data and meta-data differ. There is thus a need for uniform curation and a centralized, version-controlled compendium that researchers can download, analyze, and reference. Since 2019, we have been maintaining the Allen Ancient DNA Resource (AADR), which aims to provide an up-to-date, curated version of the world's published ancient human DNA data, represented at more than a million single nucleotide polymorphisms (SNPs) at which almost all ancient individuals have been assayed. The AADR has gone through six public releases at the time of writing and review of this manuscript, and crossed the threshold of >10,000 individuals with published genome-wide ancient DNA data at the end of 2022. This note is intended as a citable descriptor of the AADR.
5
Cousins T, Tabin D, Patterson N, Reich D, Durvasula A. Accurate inference of population history in the presence of background selection. bioRxiv 2024:2024.01.18.576291. [PMID: 38313273 PMCID: PMC10838404 DOI: 10.1101/2024.01.18.576291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024]
Abstract
All published methods for learning about demographic history make the simplifying assumption that the genome evolves neutrally, and do not seek to account for the effects of natural selection on patterns of variation. This is a major concern, as ample work has demonstrated the pervasive effects of natural selection and in particular background selection (BGS) on patterns of genetic variation in diverse species. Simulations and theoretical work have shown that methods to infer changes in effective population size over time (Ne(t)) become increasingly inaccurate as the strength of linked selection increases. Here, we introduce an extension to the Pairwise Sequentially Markovian Coalescent (PSMC) algorithm, PSMC+, which explicitly co-models demographic history and natural selection. We benchmark our method using forward-in-time simulations with BGS and find that our approach improves the accuracy of effective population size inference. Leveraging a high resolution map of BGS in humans, we infer considerable changes in the magnitude of inferred effective population size relative to previous reports. Finally, we separately infer Ne(t) on the X chromosome and on the autosomes in diverse great apes without making a correction for selection, and find that the inferred ratio fluctuates substantially through time in a way that differs across species, showing that uncorrected selection may be an important driver of signals of genetic difference on the X chromosome and autosomes.
6
Ringbauer H, Huang Y, Akbari A, Mallick S, Olalde I, Patterson N, Reich D. Accurate detection of identity-by-descent segments in human ancient DNA. Nat Genet 2024; 56:143-151. [PMID: 38123640 PMCID: PMC10786714 DOI: 10.1038/s41588-023-01582-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Accepted: 10/20/2023] [Indexed: 12/23/2023]
Abstract
Long DNA segments shared between two individuals, known as identity-by-descent (IBD), reveal recent genealogical connections. Here we introduce ancIBD, a method for identifying IBD segments in ancient human DNA (aDNA) using a hidden Markov model and imputed genotype probabilities. We demonstrate that ancIBD accurately identifies IBD segments >8 cM for aDNA data with an average depth of >0.25× for whole-genome sequencing or >1× for 1240k single nucleotide polymorphism capture data. Applying ancIBD to 4,248 ancient Eurasian individuals, we identify relatives up to the sixth degree and genealogical connections between archaeological groups. Notably, we reveal long IBD sharing between Corded Ware and Yamnaya groups, indicating that the Yamnaya herders of the Pontic-Caspian Steppe and the Steppe-related ancestry in various European Corded Ware groups share substantial co-ancestry within only a few hundred years. These results show that detecting IBD segments can generate powerful insights into the growing aDNA record, both on a small scale relevant to life stories and on a large scale relevant to major cultural-historical events.
7
Olalde I, Carrión P, Mikić I, Rohland N, Mallick S, Lazaridis I, Mah M, Korać M, Golubović S, Petković S, Miladinović-Radmilović N, Vulović D, Alihodžić T, Ash A, Baeta M, Bartík J, Bedić Ž, Bilić M, Bonsall C, Bunčić M, Bužanić D, Carić M, Čataj L, Cvetko M, Drnić I, Dugonjić A, Đukić A, Đukić K, Farkaš Z, Jelínek P, Jovanovic M, Kaić I, Kalafatić H, Krmpotić M, Krznar S, Leleković T, M de Pancorbo M, Matijević V, Milošević Zakić B, Osterholtz AJ, Paige JM, Tresić Pavičić D, Premužić Z, Rajić Šikanjić P, Rapan Papeša A, Paraman L, Sanader M, Radovanović I, Roksandic M, Šefčáková A, Stefanović S, Teschler-Nicola M, Tončinić D, Zagorc B, Callan K, Candilio F, Cheronet O, Fernandes D, Kearns A, Lawson AM, Mandl K, Wagner A, Zalzala F, Zettl A, Tomanović Ž, Keckarević D, Novak M, Harper K, McCormick M, Pinhasi R, Grbić M, Lalueza-Fox C, Reich D. A genetic history of the Balkans from Roman frontier to Slavic migrations. Cell 2023; 186:5472-5485.e9. [PMID: 38065079 PMCID: PMC10752003 DOI: 10.1016/j.cell.2023.10.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 08/22/2023] [Accepted: 10/18/2023] [Indexed: 12/18/2023]
Abstract
The rise and fall of the Roman Empire was a socio-political process with enormous ramifications for human history. The Middle Danube was a crucial frontier and a crossroads for population and cultural movement. Here, we present genome-wide data from 136 Balkan individuals dated to the 1st millennium CE. Despite extensive militarization and cultural influence, we find little ancestry contribution from peoples of Italic descent. However, we trace a large-scale influx of people of Anatolian ancestry during the Imperial period. Between ∼250 and 550 CE, we detect migrants with ancestry from Central/Northern Europe and the Steppe, confirming that "barbarian" migrations were propelled by ethnically diverse confederations. Following the end of Roman control, we detect the large-scale arrival of individuals who were genetically similar to modern Eastern European Slavic-speaking populations, who contributed 30%-60% of the ancestry of Balkan people, representing one of the largest permanent demographic changes anywhere in Europe during the Migration Period.
8
Fournier R, Tsangalidou Z, Reich D, Palamara PF. Haplotype-based inference of recent effective population size in modern and ancient DNA samples. Nat Commun 2023; 14:7945. [PMID: 38040695 PMCID: PMC10692198 DOI: 10.1038/s41467-023-43522-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 11/10/2023] [Indexed: 12/03/2023] Open
Abstract
Individuals sharing recent ancestors are likely to co-inherit large identical-by-descent (IBD) genomic regions. The distribution of these IBD segments in a population may be used to reconstruct past demographic events such as effective population size variation, but accurate IBD detection is difficult in ancient DNA data and in underrepresented populations with limited reference data. In this work, we introduce an accurate method for inferring effective population size variation during the past ~2000 years in both modern and ancient DNA data, called HapNe. HapNe infers recent population size fluctuations using either IBD sharing (HapNe-IBD) or linkage disequilibrium (HapNe-LD), which does not require phasing and can be computed in low coverage data, including data sets with heterogeneous sampling times. HapNe shows improved accuracy in a range of simulated demographic scenarios compared to currently available methods for IBD-based and LD-based inference of recent effective population size, while requiring fewer computational resources. We apply HapNe to several modern populations from the 1,000 Genomes Project, the UK Biobank, the Allen Ancient DNA Resource, and recently published samples from Iron Age Britain, detecting multiple instances of recent effective population size variation across these groups.
9
Patiño LH, Guerra S, Muñoz M, Luna N, Farrugia K, van de Guchte A, Khalil Z, Gonzalez-Reiche AS, Hernandez MM, Banu R, Shrestha P, Liggayu B, Firpo Betancourt A, Reich D, Cordon-Cardo C, Albrecht R, Pearl R, Simon V, Rooker A, Sordillo EM, van Bakel H, García-Sastre A, Bogunovic D, Palacios G, Paniz Mondolfi A, Ramírez JD. Phylogenetic landscape of Monkeypox Virus (MPV) during the early outbreak in New York City, 2022. Emerg Microbes Infect 2023; 12:e2192830. [PMID: 36927408 PMCID: PMC10114986 DOI: 10.1080/22221751.2023.2192830] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/20/2022] [Accepted: 03/14/2023] [Indexed: 03/18/2023]
Abstract
Monkeypox (MPOX) is a zoonotic disease endemic to regions of Central/Western Africa. The geographic endemicity of MPV has expanded, broadening the human-monkeypox virus interface and its potential for spillover. Since May 2022, a large multi-country MPV outbreak with no proven links to endemic countries has originated in Europe and has rapidly expanded around the globe, setting off genomic surveillance efforts. Here, we conducted a genomic analysis of 23 MPV-infected patients from New York City during the early outbreak, assessing the phylogenetic relationship of these strains against publicly available MPV genomes. Additionally, we compared the genomic sequences of clinical isolates versus culture-passaged samples from a subset of samples. Phylogenetic analysis revealed that MPV genomes included in this study cluster within the B.1 lineage (Clade IIb), with some of the samples displaying further differentiation into five different sub-lineages of B.1. Mutational analysis revealed 55 non-synonymous polymorphisms throughout the genome, with some of these mutations located in critical regions required for viral multiplication, structural and assembly functions, as well as the target region for antiviral treatment. In addition, we identified a large majority of polymorphisms associated with GA > AA and TC > TT nucleotide replacements, suggesting the action of human APOBEC3 enzyme. A comparison between clinical isolates and cell culture-passaged samples failed to reveal any difference. Our results provide a first glance at the mutational landscape of early MPV-2022 (B.1) circulating strains in NYC.
10
Nakatsuka N, Holguin B, Sedig J, Langenwalter PE, Carpenter J, Culleton BJ, García-Moreno C, Harper TK, Martin D, Martínez-Ramírez J, Porcayo-Michelini A, Tiesler V, Villapando-Canchola ME, Valdes Herrera A, Callan K, Curtis E, Kearns A, Iliev L, Lawson AM, Mah M, Mallick S, Micco A, Michel M, Workman JN, Oppenheimer J, Qiu L, Zalzala F, Rohland N, Punzo Diaz JL, Johnson JR, Reich D. Genetic continuity and change among the Indigenous peoples of California. Nature 2023; 624:122-129. [PMID: 37993721 PMCID: PMC10872549 DOI: 10.1038/s41586-023-06771-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 10/20/2023] [Indexed: 11/24/2023]
Abstract
Before the colonial period, California harboured more language variation than all of Europe, and linguistic and archaeological analyses have led to many hypotheses to explain this diversity. We report genome-wide data from 79 ancient individuals from California and 40 ancient individuals from Northern Mexico dating to 7,400-200 years before present (BP). Our analyses document long-term genetic continuity between people living on the Northern Channel Islands of California and the adjacent Santa Barbara mainland coast from 7,400 years BP to modern Chumash groups represented by individuals who lived around 200 years BP. The distinctive genetic lineages that characterize present-day and ancient people from Northwest Mexico increased in frequency in Southern and Central California by 5,200 years BP, providing evidence for northward migrations that are candidates for spreading Uto-Aztecan languages before the dispersal of maize agriculture from Mexico. Individuals from Baja California share more alleles with the earliest individual from Central California in the dataset than with later individuals from Central California, potentially reflecting an earlier linguistic substrate, whose impact on local ancestry was diluted by later migrations from inland regions. After 1,600 years BP, ancient individuals from the Channel Islands lived in communities with effective sizes similar to those in pre-agricultural Caribbean and Patagonia, and smaller than those on the California mainland and in sampled regions of Mexico.
11
Nguyen KAN, Tandon P, Ghanavati S, Cheetirala SN, Timsina P, Freeman R, Reich D, Levin MA, Mazumdar M, Fayad ZA, Kia A. A Hybrid Decision Tree and Deep Learning Approach Combining Medical Imaging and Electronic Medical Records to Predict Intubation Among Hospitalized Patients With COVID-19: Algorithm Development and Validation. JMIR Form Res 2023; 7:e46905. [PMID: 37883177 PMCID: PMC10636624 DOI: 10.2196/46905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 05/18/2023] [Accepted: 06/27/2023] [Indexed: 10/27/2023] Open
Abstract
BACKGROUND Early prediction of the need for invasive mechanical ventilation (IMV) in patients hospitalized with COVID-19 symptoms can help in the allocation of resources appropriately and improve patient outcomes by appropriately monitoring and treating patients at the greatest risk of respiratory failure. To help with the complexity of deciding whether a patient needs IMV, machine learning algorithms may help bring more prognostic value in a timely and systematic manner. Chest radiographs (CXRs) and electronic medical records (EMRs), typically obtained early in patients admitted with COVID-19, are the keys to deciding whether they need IMV.
OBJECTIVE We aimed to evaluate the use of a machine learning model to predict the need for intubation within 24 hours by using a combination of CXR and EMR data in an end-to-end automated pipeline. We included historical data from 2481 hospitalizations at The Mount Sinai Hospital in New York City.
METHODS CXRs were first resized, rescaled, and normalized. Then lungs were segmented from the CXRs by using a U-Net algorithm. After splitting them into a training and a test set, the training set images were augmented. The augmented images were used to train an image classifier to predict the probability of intubation with a prediction window of 24 hours by retraining a pretrained DenseNet model by using transfer learning, 10-fold cross-validation, and grid search. Then, in the final fusion model, we trained a random forest algorithm via 10-fold cross-validation by combining the probability score from the image classifier with 41 longitudinal variables in the EMR. Variables in the EMR included clinical and laboratory data routinely collected in the inpatient setting. The final fusion model gave a prediction likelihood for the need of intubation within 24 hours as well.
RESULTS At a prediction probability threshold of 0.5, the fusion model provided 78.9% (95% CI 59%-96%) sensitivity, 83% (95% CI 76%-89%) specificity, 0.509 (95% CI 0.34-0.67) F1-score, 0.874 (95% CI 0.80-0.94) area under the receiver operating characteristic curve (AUROC), and 0.497 (95% CI 0.32-0.65) area under the precision recall curve (AUPRC) on the holdout set. Compared to the image classifier alone, which had an AUROC of 0.577 (95% CI 0.44-0.73) and an AUPRC of 0.206 (95% CI 0.08-0.38), the fusion model showed significant improvement (P<.001). The most important predictor variables were respiratory rate, C-reactive protein, oxygen saturation, and lactate dehydrogenase. The imaging probability score ranked 15th in overall feature importance.
CONCLUSIONS We show that, when linked with EMR data, an automated deep learning image classifier improved performance in identifying hospitalized patients with severe COVID-19 at risk for intubation. With additional prospective and external validation, such a model may assist risk assessment and optimize clinical decision-making in choosing the best care plan during the critical stages of COVID-19.
12
Harney É, Sirak K, Sedig J, Micheletti S, Curry R, Ancona Esselmann S, Reich D. Ethical considerations when co-analyzing ancient DNA and data from private genetic databases. Am J Hum Genet 2023; 110:1447-1453. [PMID: 37541241 PMCID: PMC10502734 DOI: 10.1016/j.ajhg.2023.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 06/15/2023] [Accepted: 06/21/2023] [Indexed: 08/06/2023] Open
Abstract
Ancient DNA studies have begun to explore the possibility of identifying identical DNA segments shared between historical and living people. This research requires access to large genetic datasets to maximize the likelihood of identifying previously unknown, close genetic connections. Direct-to-consumer genetic testing companies, such as 23andMe, Inc., manage by far the largest and most diverse genetic databases that can be used for this purpose. It is therefore important to think carefully about guidelines for carrying out collaborations between researchers and such companies. Such collaborations require consideration of ethical issues, including policies for sharing ancient DNA datasets, and ensuring reproducibility of research findings when access to privately controlled genetic datasets is limited. At the same time, they introduce unique possibilities for returning results to the research participants whose data are analyzed, including those who are identified as close genetic relatives of historical individuals, thereby enabling ancient DNA research to contribute to the restoration of information about ancestral connections that were lost over time, which can be particularly meaningful for families and groups where such history has not been well documented. We explore these issues by describing our experience designing and carrying out a study searching for genetic connections between 18th- and 19th-century enslaved and free African Americans who labored at Catoctin Furnace, Maryland, and 23andMe research participants. We share our experience in the hope of helping future researchers navigate similar ethical considerations, recognizing that our perspective is part of a larger conversation about best ethical practices.
13
Flegontov P, Işıldak U, Maier R, Yüncü E, Changmai P, Reich D. Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes. PLoS Genet 2023; 19:e1010931. [PMID: 37676865 PMCID: PMC10508636 DOI: 10.1371/journal.pgen.1010931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Revised: 09/19/2023] [Accepted: 08/21/2023] [Indexed: 09/09/2023] Open
Abstract
f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data (that is, all single nucleotide polymorphisms, or SNPs, in the individuals being analyzed), but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array" which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases.
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
|
14
|
Harney É, Micheletti S, Bruwelheide KS, Freyman WA, Bryc K, Akbari A, Jewett E, Comer E, Louis Gates H, Heywood L, Thornton J, Curry R, Ancona Esselmann S, Barca KG, Sedig J, Sirak K, Olalde I, Adamski N, Bernardos R, Broomandkhoshbacht N, Ferry M, Qiu L, Stewardson K, Workman JN, Zalzala F, Mallick S, Micco A, Mah M, Zhang Z, Rohland N, Mountain JL, Owsley DW, Reich D. The genetic legacy of African Americans from Catoctin Furnace. Science 2023; 381:eade4995. [PMID: 37535739 PMCID: PMC10958645 DOI: 10.1126/science.ade4995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 06/20/2023] [Indexed: 08/05/2023]
Abstract
Few African Americans have been able to trace family lineages back to ancestors who died before the 1870 United States Census, the first in which all Black people were listed by name. We analyzed 27 individuals from Maryland's Catoctin Furnace African American Cemetery (1774-1850), identifying 41,799 genetic relatives among consenting research participants in 23andMe, Inc.'s genetic database. One of the highest concentrations of close relatives is in Maryland, suggesting that descendants of the Catoctin individuals remain in the area. We find that many of the Catoctin individuals derived African ancestry from the Wolof or Kongo groups and European ancestry from Great Britain and Ireland. This study demonstrates the power of joint analysis of historical DNA and large datasets generated through direct-to-consumer ancestry testing.
|
15
|
Nikitin AG, Videiko M, Patterson N, Renson V, Reich D. Interactions between Trypillian farmers and North Pontic forager-pastoralists in Eneolithic central Ukraine. PLoS One 2023; 18:e0285449. [PMID: 37314969 DOI: 10.1371/journal.pone.0285449] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/15/2022] [Accepted: 04/24/2023] [Indexed: 06/16/2023] Open
Abstract
The establishment of agrarian economy in Eneolithic East Europe is associated with the Pre-Cucuteni-Cucuteni-Trypillia complex (PCCTC). PCCTC farmers interacted with Eneolithic forager-pastoralist groups of the North Pontic steppe as PCCTC extended from the Carpathian foothills to the Dnipro Valley beginning in the late 5th millennium BCE. While the cultural interaction between the two groups is evident through the Cucuteni C pottery style that carries steppe influence, the extent of biological interactions between Trypillian farmers and the steppe remains unclear. Here we report the analysis of artefacts from the late 5th millennium Trypillian settlement at the Kolomiytsiv Yar Tract (KYT) archaeological complex in central Ukraine, focusing on a human bone fragment found in the Trypillian context at KYT. Diet stable isotope ratios obtained from the bone fragment suggest the diet of the KYT individual to be within the range of forager-pastoralists of the North Pontic area. Strontium isotope ratios of the KYT individual are consistent with having originated from contexts of the Serednii Stih (Sredny Stog) culture sites of the Middle Dnipro Valley. Genetic analysis of the KYT individual indicates ancestry derived from a proto-Yamna population such as Serednii Stih. Overall, the KYT archaeological site presents evidence of interactions between Trypillians and Eneolithic Pontic steppe inhabitants of the Serednii Stih horizon and suggests a potential for gene flow between the two groups as early as the beginning of the 4th millennium BCE.
|
16
|
Maier R, Flegontov P, Flegontova O, Isildak U, Changmai P, Reich D. On the limits of fitting complex models of population history to f-statistics. eLife 2023; 12:85492. [PMID: 37057893 DOI: 10.7554/elife.85492] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 12/10/2022] [Accepted: 04/05/2023] [Indexed: 04/15/2023] Open
Abstract
Our understanding of population history in deep time has been assisted by fitting admixture graphs ('AGs') to data: models that specify the ordering of population splits and mixtures, which along with the amount of genetic drift on each lineage and the proportions of mixture, is the only information needed to predict the patterns of allele frequency correlation among populations. Not needing to specify population size changes, split times, or whether admixture events were sudden or drawn out simplifies the space of models that need to be searched. However, the space of possible AGs relating populations is vast and cannot be sampled fully, and thus most published studies have identified fitting AGs through a manual process driven by prior hypotheses, leaving the vast majority of alternative models unexplored. Here, we develop a method for systematically searching the space of all AGs that can incorporate non-genetic information in the form of topology constraints. We implement this findGraphs tool within a software package, ADMIXTOOLS 2, which is a reimplementation of the ADMIXTOOLS software with new features and large performance gains. We apply this methodology to identify alternative models to AGs that played key roles in eight published studies and find that graphs modeling more than six populations and two or three admixture events are often not unique, with many alternative models fitting nominally or significantly better than the published one. Our results suggest that strong claims about population history from AGs should only be made when all well-fitting and temporally plausible models share common topological features. 
Our re-evaluation of published data also provides insight into the population histories of humans, dogs, and horses, identifying features that are stable across the models we explored, as well as scenarios of population relationships that differ in important ways from models that have been highlighted in the literature, that fit the allele frequency correlation data, and that are not obviously wrong.
|
17
|
Mallick S, Micco A, Mah M, Ringbauer H, Lazaridis I, Olalde I, Patterson N, Reich D. The Allen Ancient DNA Resource (AADR): A curated compendium of ancient human genomes. bioRxiv 2023:2023.04.06.535797. [PMID: 37066305 PMCID: PMC10104067 DOI: 10.1101/2023.04.06.535797] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Indexed: 04/18/2023]
Abstract
More than two hundred papers have reported genome-wide data from ancient humans. While the raw data for the vast majority are fully publicly available, testifying to the commitment of the paleogenomics community to open data, formats for both raw data and meta-data differ. There is thus a need for uniform curation and a centralized, version-controlled compendium that researchers can download, analyze, and reference. Since 2019, we have been maintaining the Allen Ancient DNA Resource (AADR), which aims to provide an up-to-date, curated version of the world's published ancient human DNA data, represented at more than a million single nucleotide polymorphisms (SNPs) at which almost all ancient individuals have been assayed. The AADR has gone through six public releases since it was first made available and crossed the threshold of >10,000 ancient individuals with genome-wide data at the end of 2022. This note is intended as a citable description of the AADR.
|
18
|
Berdnikov IM, Makarov NP, Savenkova TM, Berdnikova NE, Sokolova NB, Kim AM, Reich D. Middle Neolithic Burials in Baikal-Yenisey Siberia: Problems of Cultural Identity and Genesis. Archaeology, Ethnology & Anthropology of Eurasia 2023. [DOI: 10.17746/1563-0110.2023.51.1.042-051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2023]
Abstract
The study focuses on the analysis of Middle Neolithic burial complexes of the Baikal-Yenisey Siberia. Based on a series of reliable radiocarbon dates, their age lies within the range of 6190–5900 cal BP. It partly corresponds to the end of the hiatus in the mortuary traditions of Cis-Baikal. Features of the burial rite and funerary offerings are analyzed and compared with those of neighboring territories. One of the most frequent images in the art of the Middle Neolithic Baikal-Yenisey Siberia is that of the waterfowl, rendered as figurines. The common grave goods are leaf-shaped stone arrowheads, shell beads, and pendants made of animal bones and teeth. The funerary rite included the use of fire and reddish mineral pigment, as well as disrupting the anatomical integrity of the skeletons, possibly due to partial burial (the data are tentative). Most burials of the late stage of the hiatus are evidently those of hunter-gatherers manufacturing the Ust-Belaya ceramics, which were found in certain burials. A bone arrowhead with a biconical point and figurines representing waterfowl suggest cultural ties with the Urals and Western Siberia; but their nature has yet to be clarified, which requires large-scale AMS-dating and paleogenetic analysis.
|
19
|
Fernandes DM, Sirak KA, Cheronet O, Novak M, Brück F, Zelger E, Llanos-Lizcano A, Wagner A, Zettl A, Mandl K, Duffet Carlson KS, Oberreiter V, Özdoğan KT, Sawyer S, La Pastina F, Borgia E, Coppa A, Dobeš M, Velemínský P, Reich D, Bell LS, Pinhasi R. Density separation of petrous bone powders for optimized ancient DNA yields. Genome Res 2023; 33:622-631. [PMID: 37072186 PMCID: PMC10234301 DOI: 10.1101/gr.277714.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 04/11/2023] [Indexed: 04/20/2023]
Abstract
Density separation is a process routinely used to segregate minerals, organic matter, and even microplastics, from soils and sediments. Here we apply density separation to archaeological bone powders before DNA extraction to increase endogenous DNA recovery relative to a standard control extraction of the same powders. Using nontoxic heavy liquid solutions, we separated powders from the petrous bones of 10 individuals of similar archaeological preservation into eight density intervals (2.15 to 2.45 g/cm³, in 0.05 increments). We found that the 2.30 to 2.35 g/cm³ and 2.35 to 2.40 g/cm³ intervals yielded up to 5.28-fold more endogenous unique DNA than the corresponding standard extraction (and up to 8.53-fold before duplicate read removal), while maintaining signals of ancient DNA authenticity and not reducing library complexity. Although small 0.05 g/cm³ intervals may maximally optimize yields, a single separation to remove materials with a density above 2.40 g/cm³ yielded up to 2.57-fold more endogenous DNA on average, which enables the simultaneous separation of samples that vary in preservation or in the type of material analyzed. While requiring no new ancient DNA laboratory equipment and fewer than 30 min of extra laboratory work, the implementation of density separation before DNA extraction can substantially boost endogenous DNA yields without decreasing library complexity. Although subsequent studies are required, we present theoretical and practical foundations that may prove useful when applied to other ancient DNA substrates such as teeth, other bones, and sediments.
|
20
|
Wei X, Robles CR, Pazokitoroudi A, Ganna A, Gusev A, Durvasula A, Gazal S, Loh PR, Reich D, Sankararaman S. The lingering effects of Neanderthal introgression on human complex traits. eLife 2023; 12:e80757. [PMID: 36939312 PMCID: PMC10076017 DOI: 10.7554/elife.80757] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/03/2022] [Accepted: 03/17/2023] [Indexed: 03/21/2023] Open
Abstract
The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute an unexpected extent to complex human traits. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank. Introgressed Neanderthal variants make a significant contribution to trait variation (explaining 0.12% of trait variation on average). However, the contribution of introgressed variants tends to be significantly depleted relative to modern human variants matched for allele frequency and linkage disequilibrium (about 59% depletion on average), consistent with purifying selection on introgressed variants. Different from previous studies (McArthur et al., 2021), we find no evidence for elevated heritability across the phenotypes examined. We identified 348 independent significant associations of introgressed Neanderthal variants with 64 phenotypes. Previous work (Skov et al., 2020) has suggested that a majority of such associations are likely driven by statistical association with nearby modern human variants that are the true causal variants. Applying a customized fine-mapping led us to identify 112 regions across 47 phenotypes containing 4303 unique genetic variants where introgressed variants are highly likely to have a phenotypic effect. Examination of these variants reveals their substantial impact on genes that are important for the immune system, development, and metabolism.
|
21
|
Liu STH, Mirceta M, Lin G, Anderson DM, Broomes T, Jen A, Abid A, Reich D, Hall C, Aberg JA. Safety, Tolerability, and Pharmacokinetics of Anti-SARS-CoV-2 Immunoglobulin Intravenous (Human) Investigational Product (COVID-HIGIV) in Healthy Adults: a Randomized, Controlled, Double-Blinded, Phase 1 Study. Antimicrob Agents Chemother 2023; 67:e0151422. [PMID: 36852998 PMCID: PMC10019156 DOI: 10.1128/aac.01514-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023] Open
Abstract
Anti-SARS-CoV-2 immunoglobulin (human) investigational product (COVID-HIGIV) is a purified immunoglobulin preparation containing SARS-CoV-2 polyclonal antibodies. This single-center clinical trial aimed to characterize the safety and pharmacokinetics of COVID-HIGIV in healthy, adult volunteers. Participants were enrolled to receive one of three doses of COVID-HIGIV (100, 200, 400 mg/kg) or placebo in a 2:2:2:1 randomization scheme. Between 24 December 2020 and 27 July 2021, 28 participants met eligibility and were randomized with 27 of these 28 (96.4%) being administered either COVID-HIGIV (n = 23) or placebo (n = 4). Only one SAE was observed, and it occurred in the placebo group. A total of 18 out of 27 participants (66.7%) reported 50 adverse events (AEs) overall. All COVID-HIGIV-related adverse events were mild or moderate in severity and transient. The most frequent AEs (>5% of participants) reported in the safety population were headache (n = 6, 22.2%), chills (n = 3, 11.1%), increased bilirubin (n = 2, 7.4%), muscle spasms (n = 2, 7.4%), seasonal allergies (n = 2, 7.4%), pyrexia (n = 2, 7.4%), and oropharyngeal pain (n = 2, 7.4%). Using the SARS-CoV-2 binding IgG immunoassay (n = 22, specific for pharmacokinetics), the geometric means of Cmax (AU/mL) for the three COVID-HIGIV dose levels (low to high) were 7.69, 17.02, and 33.27 AU/mL; the average values of Tmax were 7.09, 7.93, and 5.36 h, respectively. The half-life of COVID-HIGIV per dose level was 24 d (583 h), 31 d (753 h), and 26 d (619 h) for the 100 mg/kg, 200 mg/kg, and 400 mg/kg groups, respectively. The safety and pharmacokinetics of COVID-HIGIV support its development as a single-dose regimen for postexposure prophylaxis or treatment of COVID-19.
|
22
|
Motti JMB, Pauro M, Scabuzzo C, García A, Aldazábal V, Vecchi R, Bayón C, Pastor N, Demarchi DA, Bravi CM, Reich D, Cabana GS, Nores R. Ancient mitogenomes from the Southern Pampas of Argentina reflect local differentiation and limited extra-regional linkages after rapid initial colonization. American Journal of Biological Anthropology 2023; 181:216-230. [PMID: 36919783 DOI: 10.1002/ajpa.24727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 02/13/2023] [Accepted: 02/22/2023] [Indexed: 03/16/2023]
Abstract
OBJECTIVE This study aims to contribute to the recovery of Indigenous evolutionary history in the Southern Pampas region of Argentina through an analysis of ancient complete mitochondrial genomes. MATERIALS AND METHODS We generated DNA data for nine complete mitogenomes from the Southern Pampas, dated to between 2531 and 723 cal BP. In combination with previously published ancient mitogenomes from the region and from throughout South America, we documented instances of extra-regional lineage-sharing, and estimated coalescent ages for local lineages using a Bayesian method with tip calibrations in a phylogenetic analysis. RESULTS We identified a novel mitochondrial haplogroup, B2b16, and two recently defined haplogroups, A2ay and B2ak1, as well as three local haplotypes within founder haplogroups C1b and C1d. We detected lineage-sharing with ancient and contemporary individuals from Central Argentina, but not with ancient or contemporary samples from North Patagonian or Littoral regions of Argentina, despite archeological evidence of cultural interactions with the latter regions. The estimated coalescent age of these shared lineages is ~10,000 years BP. DISCUSSION The history of the human populations in the Southern Pampas is temporally deep, exhibiting long-term continuity of mitogenome lineages. Additionally, the identification of highly localized mtDNA clades accords with a model of relatively rapid initial colonization of South America by Indigenous communities, followed by more local patterns of limited gene flow and genetic drift in various South American regions, including the Pampas.
|
23
|
Barton AR, Santander CG, Skoglund P, Moltke I, Reich D, Mathieson I. Insufficient evidence for natural selection associated with the Black Death. bioRxiv 2023:2023.03.14.532615. [PMID: 36993413 PMCID: PMC10055098 DOI: 10.1101/2023.03.14.532615] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 05/07/2023]
Abstract
Klunk et al. analyzed ancient DNA data from individuals in London and Denmark before, during and after the Black Death [1], and argued that allele frequency changes at immune genes were too large to be produced by random genetic drift and thus must reflect natural selection. They also identified four specific variants that they claimed show evidence of selection including at ERAP2, for which they estimate a selection coefficient of 0.39, several times larger than any selection coefficient on a common human variant reported to date. Here we show that these claims are unsupported for four reasons. First, the signal of enrichment of large allele frequency changes in immune genes comparing people in London before and after the Black Death disappears after an appropriate randomization test is carried out: the P value increases by ten orders of magnitude and is no longer significant. Second, a technical error in the estimation of allele frequencies means that none of the four originally reported loci actually pass the filtering thresholds. Third, the filtering thresholds do not adequately correct for multiple testing. Finally, in the case of the ERAP2 variant rs2549794, which Klunk et al. show experimentally may be associated with a host interaction with Y. pestis, we find no evidence of significant frequency change either in the data that Klunk et al. report, or in published data spanning 2,000 years. While it remains plausible that immune genes were subject to natural selection during the Black Death, the magnitude of this selection and which specific genes may have been affected remain unknown.
|
24
|
Ringbauer H, Huang Y, Akbari A, Mallick S, Patterson N, Reich D. ancIBD - Screening for identity by descent segments in human ancient DNA. bioRxiv 2023:2023.03.08.531671. [PMID: 36945531 PMCID: PMC10028887 DOI: 10.1101/2023.03.08.531671] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 06/18/2023]
Abstract
Long DNA sequences shared between two individuals, known as identical by descent (IBD) segments, are a powerful signal for identifying close and distant biological relatives because they only arise when the pair shares a recent common ancestor. Existing methods to call IBD segments between present-day genomes cannot be straightforwardly applied to ancient DNA data (aDNA) due to typically low coverage and high genotyping error rates. We present ancIBD, a method to identify IBD segments for human aDNA data implemented as a Python package. Our approach is based on a Hidden Markov Model, using as input genotype probabilities imputed based on a modern reference panel of genomic variation. Through simulation and downsampling experiments, we demonstrate that ancIBD robustly identifies IBD segments longer than 8 centimorgans for aDNA data with at least either 0.25x average whole-genome sequencing (WGS) coverage depth or at least 1x average depth for in-solution enrichment experiments targeting a widely used aDNA SNP set ('1240k'). This application range allows us to screen a substantial fraction of the aDNA record for IBD segments and we showcase two downstream applications. First, leveraging the fact that biological relatives up to the sixth degree are expected to share multiple long IBD segments, we identify relatives between 10,156 ancient Eurasian individuals and document evidence of long-distance migration, for example by identifying a pair of two approximately fifth-degree relatives who were buried 1410 km apart in Central Asia 5000 years ago. Second, by applying ancIBD, we reveal new details regarding the spread of ancestry related to Steppe pastoralists into Europe starting 5000 years ago.
We find that the first individuals in Central and Northern Europe carrying high amounts of Steppe-ancestry, associated with the Corded Ware culture, share high rates of long IBD (12-25 cM) with Yamnaya herders of the Pontic-Caspian steppe, signaling a strong bottleneck and a recent biological connection on the order of only a few hundred years, providing evidence that the Yamnaya themselves are a main source of Steppe ancestry in Corded Ware people. We also detect elevated sharing of long IBD segments between Corded Ware individuals and people associated with the Globular Amphora culture (GAC) from Poland and Ukraine, who were Copper Age farmers not yet carrying Steppe-like ancestry. These IBD links appear for all Corded Ware groups in our analysis, indicating that individuals related to GAC contexts must have had a major demographic impact early on in the genetic admixtures giving rise to various Corded Ware groups across Europe. These results show that detecting IBD segments in aDNA can generate new insights both on a small scale, relevant to understanding the life stories of people, and on the macroscale, relevant to large-scale cultural-historical events.
|
25
|
Brielle ES, Fleisher J, Wynne-Jones S, Sirak K, Broomandkhoshbacht N, Callan K, Curtis E, Iliev L, Lawson AM, Oppenheimer J, Qiu L, Stewardson K, Workman JN, Zalzala F, Ayodo G, Gidna AO, Kabiru A, Kwekason A, Mabulla AZP, Manthi FK, Ndiema E, Ogola C, Sawchuk E, Al-Gazali L, Ali BR, Ben-Salem S, Letellier T, Pierron D, Radimilahy C, Rakotoarisoa JA, Raaum RL, Culleton BJ, Mallick S, Rohland N, Patterson N, Mwenje MA, Ahmed KB, Mohamed MM, Williams SR, Monge J, Kusimba S, Prendergast ME, Reich D, Kusimba CM. Entwined African and Asian genetic roots of medieval peoples of the Swahili coast. Nature 2023; 615:866-873. [PMID: 36991187 PMCID: PMC10060156 DOI: 10.1038/s41586-023-05754-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 01/24/2023] [Indexed: 03/31/2023]
Abstract
The urban peoples of the Swahili coast traded across eastern Africa and the Indian Ocean and were among the first practitioners of Islam among sub-Saharan people [1,2]. The extent to which these early interactions between Africans and non-Africans were accompanied by genetic exchange remains unknown. Here we report ancient DNA data for 80 individuals from 6 medieval and early modern (AD 1250-1800) coastal towns and an inland town after AD 1650. More than half of the DNA of many of the individuals from coastal towns originates from primarily female ancestors from Africa, with a large proportion (and occasionally more than half) of the DNA coming from Asian ancestors. The Asian ancestry includes components associated with Persia and India, with 80-90% of the Asian DNA originating from Persian men. Peoples of African and Asian origins began to mix by about AD 1000, coinciding with the large-scale adoption of Islam. Before about AD 1500, the Southwest Asian ancestry was mainly Persian-related, consistent with the narrative of the Kilwa Chronicle, the oldest history told by people of the Swahili coast [3]. After this time, the sources of DNA became increasingly Arabian, consistent with evidence of growing interactions with southern Arabia [4]. Subsequent interactions with Asian and African people further changed the ancestry of present-day people of the Swahili coast in relation to the medieval individuals whose DNA we sequenced.
|