1
Jacob Machado D, White RA, Kofsky J, Janies DA. Fundamentals of genomic epidemiology, lessons learned from the coronavirus disease 2019 (COVID-19) pandemic, and new directions. Antimicrobial Stewardship & Healthcare Epidemiology: ASHE 2021; 1:e60. [PMID: 36168505 PMCID: PMC9495640 DOI: 10.1017/ash.2021.222] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2021] [Accepted: 10/15/2021] [Indexed: 04/19/2023]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic was one of the significant causes of death worldwide in 2020. The disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), an RNA virus of the subfamily Orthocoronavirinae related to 2 other clinically relevant coronaviruses, SARS-CoV and MERS-CoV. Like other coronaviruses and several other viruses, SARS-CoV-2 originated in bats. However, unlike other coronaviruses, SARS-CoV-2 resulted in a devastating pandemic. The SARS-CoV-2 pandemic rages on due to viral evolution that leads to more transmissible and immune evasive variants. Technology such as genomic sequencing has driven the shift from syndromic to molecular epidemiology and promises better understanding of variants. The COVID-19 pandemic has exposed critical impediments that must be addressed to develop the science of pandemics. Much of the progress is being applied in the developed world. However, barriers to the use of molecular epidemiology in low- and middle-income countries (LMICs) remain, including lack of logistics for equipment and reagents and lack of training in analysis. We review the molecular epidemiology literature to understand its origins from the SARS epidemic (2002-2003) through influenza events and the current COVID-19 pandemic. We advocate for improved genomic surveillance of SARS-CoV and understanding the pathogen diversity in potential zoonotic hosts. This work will require training in phylogenetics and high-performance computing to improve analyses of the origin and spread of pathogens. The overarching goals are to understand and abate zoonosis risk through interdisciplinary collaboration and lowering logistical barriers.
Affiliation(s)
- Denis Jacob Machado
  - University of North Carolina at Charlotte, College of Computing and Informatics, Department of Bioinformatics and Genomics, Charlotte, North Carolina
- Richard Allen White
  - University of North Carolina at Charlotte, College of Computing and Informatics, Department of Bioinformatics and Genomics, Charlotte, North Carolina
  - University of North Carolina at Charlotte, North Carolina Research Campus (NCRC), Kannapolis, North Carolina
- Janice Kofsky
  - University of North Carolina at Charlotte, College of Computing and Informatics, Department of Bioinformatics and Genomics, Charlotte, North Carolina
- Daniel A. Janies
  - University of North Carolina at Charlotte, College of Computing and Informatics, Department of Bioinformatics and Genomics, Charlotte, North Carolina
2
Damodaran L, de Bernardi Schneider A, Chen S, Janies D. Evolution of endemic and sylvatic lineages of dengue virus. Cladistics 2020; 36:115-128. [PMID: 34618965 DOI: 10.1111/cla.12402] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 08/02/2019] [Indexed: 11/30/2022]
Abstract
Recent disease outbreaks have raised awareness of tropical pathogens, especially mosquito-borne viruses. Dengue virus (DENV) is a widely studied mammalian pathogen transmitted by various species of mosquito in the genus Aedes, especially Aedes aegypti and Aedes albopictus. The prevailing view of the research community is that endemic viral lineages that cause epidemics of DENV in humans have emerged over time from sylvatic viral lineages, which persist in wild, non-human primates. These notions have been examined by researchers through phylogenetic analyses of the envelope gene (E) from the four serotypes of DENV (serotypes DENV-1 to DENV-4). In these previous reports, researchers used visual inspection of a phylogeny in order to assert that sylvatic lineages lead to endemic clades. In making this assertion, these researchers also reasserted the model of periodic sylvatic to endemic disease outbreaks. Since that study, there has been a significant increase in data both in terms of metadata (e.g., place and host of isolation) and genetic sequences of DENV. Here, we re-examine the model of sylvatic to endemic shifts in viral lineages through a phylogenetic tree search and character evolution study of metadata on the tree. We built a dataset of nucleotide sequences for 188 isolates of DENV that have metadata on sylvatic or endemic sampling along with three orthologous sequences from West Nile virus as the outgroup for the phylogenetic analysis. In contrast to previous research, we find that there are several shifts from endemic to sylvatic lineages as well as sylvatic to endemic lineages, indicating a much more dynamic model of evolution. We propose that a model that allows oscillation between sylvatic and endemic hosts better captures the dynamics of DENV transmission.
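The abstract describes mapping sylvatic/endemic metadata onto a tree and counting shifts, but it does not name the optimization method. As a minimal sketch, assuming Fitch parsimony on a binary tree, the minimum number of state changes can be counted as follows. The isolate names, the toy tree, and the state assignments are hypothetical and are not taken from the 188-isolate dataset.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a nested-tuple binary tree.

    tree   -- a leaf name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each leaf name to its observed state
    Returns (candidate_state_set, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    changes = left_changes + right_changes
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no change needed here.
        return common, changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, changes + 1


# Hypothetical four-isolate example (not real DENV data).
toy_tree = ((("iso_A", "iso_B"), "iso_C"), "iso_D")
toy_states = {
    "iso_A": "sylvatic",
    "iso_B": "endemic",
    "iso_C": "endemic",
    "iso_D": "sylvatic",
}
root_set, n_changes = fitch(toy_tree, toy_states)
print(sorted(root_set), n_changes)
```

This pass only counts changes; assigning a direction to each shift (sylvatic to endemic or the reverse) would need a second, top-down pass that fixes a state at every internal node.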
Affiliation(s)
- Lambodhar Damodaran
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, 28223-0001, NC, USA
  - Institute of Bioinformatics, University of Georgia, 120 Green St., Athens, 30602, GA, USA
- Adriano de Bernardi Schneider
  - AntiViral Research Center, Department of Medicine, University of California San Diego, 220 Dickinson St, Suite A, San Diego, 92103-8208, CA, USA
- Shi Chen
  - Department of Public Health Sciences, College of Health and Human Services, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, 28223-0001, NC, USA
- Daniel Janies
  - Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, 28223-0001, NC, USA
3
Janies D. Phylogenetic Concepts and Tools Applied to Epidemiologic Investigations of Infectious Diseases. Microbiol Spectr 2019; 7:10.1128/microbiolspec.ame-0006-2018. [PMID: 31325287 PMCID: PMC10956736 DOI: 10.1128/microbiolspec.ame-0006-2018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/13/2018] [Indexed: 01/13/2023]
Abstract
In this review, which is a part of the Microbiology Spectrum Curated Collection: Advances in Molecular Epidemiology of Infectious Diseases, I present an overview of the principles used to classify organisms in the field of phylogenetics, highlight the methods used to infer the interrelationships of organisms, and summarize how these concepts are applied to molecular epidemiologic analyses. I present steps in analyses that come downstream of the assembly of a set of genomes or genes and the production of a multiple-sequence alignment or other matrices of putative orthologs for comparison. I focus on the history of the problem of phylogenetic reconstruction and debates within the field about the most appropriate methods. I illustrate methods that bridge the gap between molecular epidemiology and traditional epidemiology, including phylogenetic character evolution and geographic visualization. Finally, I provide practical advice on how to conduct an example analysis in the appendix. *This article is part of a curated collection.
Affiliation(s)
- Daniel Janies
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223
4
Young SG, Kitchen A, Kayali G, Carrel M. Unlocking pandemic potential: prevalence and spatial patterns of key substitutions in avian influenza H5N1 in Egyptian isolates. BMC Infect Dis 2018; 18:314. [PMID: 29980172 PMCID: PMC6035396 DOI: 10.1186/s12879-018-3222-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/18/2017] [Accepted: 06/28/2018] [Indexed: 11/10/2022]
Abstract
Background: Avian influenza H5N1 has a high human case fatality rate, but is not yet well-adapted to human hosts. Amino acid substitutions currently circulating in avian populations may enhance viral fitness in, and thus viral adaptation to, human hosts. Substitutions which could increase the risk of a human pandemic (through changes to host specificity, virulence, replication ability, transmissibility, or drug susceptibility) are termed key substitutions (KS). Egypt represents the epicenter of human H5N1 infections, with more confirmed cases than any other country. To date, however, there have not been any spatial analyses of KS in Egypt. Methods: Using 925 viral samples of H5N1 from Egypt, we aligned protein sequences and scanned for KS. We geocoded isolates using dasymetric mapping, then carried out geospatial hot spot analyses to identify spatial clusters of high KS detection rates. KS prevalence and spatial clusters were evaluated for all detected KS, as well as when stratified by phenotypic consequence. Results: A total of 39 distinct KS were detected in the wild, including 17 not previously reported in Egypt. KS were detected in 874 samples (94.5%). Detection rates varied by viral protein with most KS observed in the surface hemagglutinin (HA) and neuraminidase (NA) proteins, as well as the interior non-structural 1 (NS1) protein. The most frequently detected KS were associated with increased viral binding to mammalian cells and virulence. Samples with high overall detection rates of KS exhibited statistically significant spatial clustering in two governorates in the northwestern Nile delta, Alexandria and Beheira. Conclusions: KS provide a possible mechanism by which avian influenza H5N1 could evolve into a pandemic candidate. With numerous KS circulating in Egypt, and non-random spatial clustering of KS detection rates, these findings suggest the need for increased surveillance in these areas.
Affiliation(s)
- Sean G Young
  - Department of Environmental and Occupational Health, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Andrew Kitchen
  - Department of Anthropology, University of Iowa, Iowa City, IA, USA
- Ghazi Kayali
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, University of Texas Health Sciences Center, Houston, TX, USA
  - Department of Scientific Research, Human Link, Hazmieh, Lebanon
- Margaret Carrel
  - Department of Geographical and Sustainability Sciences, University of Iowa, Iowa City, IA, USA
  - Department of Epidemiology, University of Iowa, Iowa City, IA, USA
5
Tahsin T, Weissenbacher D, Rivera R, Beard R, Firago M, Wallstrom G, Scotch M, Gonzalez G. A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records. J Am Med Inform Assoc 2016; 23:934-41. [PMID: 26911818 PMCID: PMC4997033 DOI: 10.1093/jamia/ocv172] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/21/2015] [Revised: 10/22/2015] [Accepted: 10/22/2015] [Indexed: 01/09/2023]
Abstract
OBJECTIVE: The metadata reflecting the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance this metadata by extracting more specific geographic information from related full-text articles and mapping it to latitude and longitude using knowledge derived from external geographical databases. MATERIALS AND METHODS: We developed a rule-based information extraction framework for linking GenBank records to the latitude and longitude of the LOIH. Our system first extracts existing geospatial metadata from GenBank records and attempts to improve it by seeking additional, relevant geographic information from text and tables in related full-text PubMed Central articles. The final extracted locations of the records, based on data assimilated from these sources, are then disambiguated and mapped to their respective geo-coordinates. We evaluated our approach on a manually annotated dataset comprising 5728 GenBank records for the influenza A virus. RESULTS: We found the precision, recall, and F-measure of our system for linking GenBank records to the latitude and longitude of their LOIH to be 0.832, 0.967, and 0.894, respectively. DISCUSSION: Our system had a high level of accuracy for linking GenBank records to the geo-coordinates of the LOIH. However, it can be further improved by expanding our database of geospatial data, incorporating spell correction, and enhancing the rules used for extraction. CONCLUSION: Our system performs reasonably well for linking GenBank records for the influenza A virus to the geo-coordinates of their LOIH based on record metadata and information extracted from related full-text articles.
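The three reported scores can be cross-checked against each other. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, but the reported numbers are consistent with that reading. The function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
score = f_measure(0.832, 0.967)
print(round(score, 3))  # 0.894
```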
Affiliation(s)
- Tasnia Tahsin
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Davy Weissenbacher
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Robert Rivera
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Rachel Beard
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Mari Firago
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Garrick Wallstrom
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Matthew Scotch
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
- Graciela Gonzalez
  - Department of Biomedical Informatics, Arizona State University, 13212 E Shea Blvd, Scottsdale, AZ 85259, USA
6
Abstract
This article describes a simple tool to display geophylogenies on web maps including Google Maps and OpenStreetMap. The tool reads a NEXUS format file that includes geographic information, and outputs a GeoJSON format file that can be displayed in a web map application.
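To illustrate the output side of such a tool, here is a minimal sketch that turns a small georeferenced tree into a GeoJSON FeatureCollection, with nodes as Point features and branches as LineString features. It is not the tool described in the abstract: NEXUS parsing is omitted, and the function name, node labels, and coordinates are hypothetical. GeoJSON positions are written in longitude, latitude order.

```python
import json


def tree_to_geojson(edges, coords):
    """Build a GeoJSON FeatureCollection from a georeferenced tree.

    edges  -- list of (parent, child) node-name pairs
    coords -- dict mapping every node name to (longitude, latitude)
    """
    features = []
    # One Point feature per node.
    for node, (lon, lat) in coords.items():
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": node},
        })
    # One LineString feature per branch, drawn from parent to child.
    for parent, child in edges:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [list(coords[parent]), list(coords[child])],
            },
            "properties": {"parent": parent, "child": child},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical three-node tree with made-up coordinates.
coords = {"root": (30.5, 30.6), "tip_A": (31.2, 30.0), "tip_B": (29.9, 31.2)}
edges = [("root", "tip_A"), ("root", "tip_B")]
collection = tree_to_geojson(edges, coords)
print(json.dumps(collection, indent=2))
```

The printed JSON can be saved to a file and loaded as a layer by web map libraries that accept GeoJSON.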
Affiliation(s)
- Roderic Page
  - College of Medical, Veterinary & Life Sciences, Institute of Biodiversity, Animal Health and Comparative Medicine, University of Glasgow, Glasgow, UK
7
Wheeler WC, Whiteley PM. Historical linguistics as a sequence optimization problem: the evolution and biogeography of Uto-Aztecan languages. Cladistics 2015; 31:113-125. [PMID: 34758582 DOI: 10.1111/cla.12078] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Accepted: 03/18/2014] [Indexed: 11/30/2022]
Abstract
Language origins and diversification are vital for mapping human history. Traditionally, the reconstruction of language trees has been based on cognate forms among related languages, with ancestral protolanguages inferred by individual investigators. Disagreement among competing authorities is typically extensive, without empirical grounds for resolving alternative hypotheses. Here, we apply analytical methods derived from DNA sequence optimization algorithms to Uto-Aztecan languages, treating words as sequences of sounds. Our analysis yields novel relationships and suggests a resolution to current conflicts about the Proto-Uto-Aztecan homeland. The techniques used for Uto-Aztecan are applicable to written and unwritten languages, and should enable more empirically robust hypotheses of language relationships, language histories, and linguistic evolution.
Affiliation(s)
- Ward C Wheeler
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY, 10024-5192, USA
- Peter M Whiteley
  - Division of Anthropology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY, 10024-5192, USA
8
Janies DA, Pomeroy LW, Krueger C, Zhang Y, Senturk IF, Kaya K, Çatalyürek ÜV. Phylogenetic visualization of the spread of H7 influenza A viruses. Cladistics 2015; 31:679-691. [DOI: 10.1111/cla.12107] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Accepted: 11/28/2014] [Indexed: 11/29/2022]
Affiliation(s)
- Daniel A. Janies
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
- Laura W. Pomeroy
  - Department of Veterinary Preventative Medicine, Ohio State University, A100 Sisson Hall, 1920 Coffey Road, Columbus, OH 43210, USA
- Chris Krueger
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC 28223, USA
- Yuqi Zhang
  - College of Medicine and Life Sciences, University of Toledo, Toledo, OH 43606, USA
- Izzet F. Senturk
  - Department of Biomedical Informatics, Ohio State University College of Medicine, Columbus, OH 43210, USA
- Kamer Kaya
  - Faculty of Engineering and Natural Sciences, Sabanci University, Orta Mahalle, Tuzla, 34956 İstanbul, Turkey
- Ümit V. Çatalyürek
  - Department of Biomedical Informatics, Ohio State University College of Medicine, Columbus, OH 43210, USA
9
Wheeler WC, Lucaroni N, Hong L, Crowley LM, Varón A. POY version 5: phylogenetic analysis using dynamic homologies under multiple optimality criteria. Cladistics 2014; 31:189-196. [PMID: 34772261 DOI: 10.1111/cla.12083] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Accepted: 04/28/2014] [Indexed: 11/26/2022]
Affiliation(s)
- Ward C. Wheeler
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024-5192, USA
- Nicholas Lucaroni
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024-5192, USA
- Lin Hong
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024-5192, USA
- Louise M. Crowley
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024-5192, USA
- Andrés Varón
  - Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, NY 10024-5192, USA
10
El Zowalaty ME, Bustin SA, Husseiny MI, Ashour HM. Avian influenza: virology, diagnosis and surveillance. Future Microbiol 2014; 8:1209-27. [PMID: 24020746 DOI: 10.2217/fmb.13.81] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022]
Abstract
Avian influenza virus (AIV) is the causative agent of a zoonotic disease that affects populations worldwide with often devastating economic and health consequences. Most AIV subtypes cause little or no disease in waterfowl, but outbreaks in poultry can be associated with high mortality. Although transmission of AIV to humans occurs rarely and is strain dependent, the virus has the ability to mutate or reassort into a form that triggers a life-threatening infection. The constant emergence of new influenza strains makes it particularly challenging to predict the behavior, spread, virulence or potential for human-to-human transmission. Because it is difficult to anticipate which viral strain or what location will initiate the next pandemic, it is difficult to prepare for that event. However, rigorous implementation of biosecurity, vaccination and education programs can minimize the threat of AIV. Global surveillance programs help record and identify newly evolving and potentially pandemic strains harbored by the reservoir host.
Affiliation(s)
- Mohamed E El Zowalaty
  - Postgraduate Medical Institute, Faculty of Health, Social Care & Education, Anglia Ruskin University, Chelmsford, Essex, UK
11
Abstract
Evolution is inherently a spatiotemporal process; despite this, phylogenetic and geographical data and models remain largely isolated from one another. Geographical information systems provide a ready-made spatial modelling, analysis and dissemination environment within which phylogenetic models can be explicitly linked with their associated spatial data and subsequently integrated with other georeferenced data sets describing the biotic and abiotic environment. GeoPhyloBuilder 1.0 is an extension for the ArcGIS geographical information system that builds a 'geophylogenetic' data model from a phylogenetic tree and associated geographical data. Geophylogenetic database objects can subsequently be queried, spatially analysed and visualized in both 2D and 3D within a geographical information system.
Affiliation(s)
- David M Kidd
  - National Evolutionary Synthesis Center, Suite A200, 2024 West Main Street, Durham, NC 27705, USA
12
Carrel M, Emch M. Genetics: A New Landscape for Medical Geography. Annals of the Association of American Geographers 2013; 103:1452-1467. [PMID: 24558292 PMCID: PMC3928082 DOI: 10.1080/00045608.2013.784102] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
Abstract
The emergence and re-emergence of human pathogens resistant to medical treatment will present a challenge to the international public health community in the coming decades. Geography is uniquely positioned to examine the progressive evolution of pathogens across space and through time, and to link molecular change to interactions between population and environmental drivers. Landscape as an organizing principle for the integration of natural and cultural forces has a long history in geography, and, more specifically, in medical geography. Here, we explore the role of landscape in medical geography, the emergent field of landscape genetics, and the great potential that exists in the combination of these two disciplines. We argue that landscape genetics can enhance medical geographic studies of local-level disease environments with quantitative tests of how human-environment interactions influence pathogenic characteristics. In turn, such analyses can expand theories of disease diffusion to the molecular scale and distinguish the important factors in ecologies of disease that drive genetic change of pathogens.
Affiliation(s)
- Michael Emch
  - Department of Geography, University of North Carolina-Chapel Hill
13
Janies DA, Pomeroy LW, Aaronson JM, Handelman S, Hardman J, Kawalec K, Bitterman T, Wheeler WC. Analysis and visualization of H7 influenza using genomic, evolutionary and geographic information in a modular web service. Cladistics 2012; 28:483-488. [PMID: 32313365 PMCID: PMC7162197 DOI: 10.1111/j.1096-0031.2012.00401.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Accepted: 03/12/2012] [Indexed: 11/28/2022]
Abstract
We have reported previously on use of a web-based application, Supramap (http://supramap.org) for the study of biogeographic, genotypic, and phenotypic evolution. Using Supramap we have developed maps of the spread of drug-resistant influenza and host shifts in H1N1 and H5N1 influenza and coronaviruses such as SARS. Here we report on another zoonotic pathogen, H7 influenza, and provide an update on the implementation of Supramap as a web service. We find that the emergence of pathogenic strains of H7 is labile with many transitions from high to low pathogenicity, and from low to high pathogenicity. We use Supramap to put these events in a temporal and geospatial context. We identify several lineages of H7 influenza with biomarkers of high pathogenicity in regions that have not been reported in the scientific literature. The original implementation of Supramap was built with tightly coupled client and server software. Now we have decoupled the components to provide a modular web service for POY (http://poyws.org) that can be consumed by a data provider to create a novel application. To demonstrate the web service, we have produced an application, Geogenes (http://geogenes.org). Unlike in Supramap, in which the user is required to create and upload data files, in Geogenes the user works from a graphical interface to query an underlying dataset. Geogenes demonstrates how the web service can provide underlying processing for any sequence and metadata database. © The Willi Hennig Society 2012.
Affiliation(s)
- Daniel A Janies
  - Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210 USA
- Laura W Pomeroy
  - Department of Veterinary Preventative Medicine, Ohio State University, Columbus, OH 43210 USA
- Jacob M Aaronson
  - Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210 USA
- Samuel Handelman
  - Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210 USA
- Jori Hardman
  - Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210 USA
- Kevin Kawalec
  - Department of Biomedical Informatics, Ohio State University, Columbus, OH 43210 USA
- Ward C Wheeler
  - Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA
14
Carrel MA, Emch M, Nguyen T, Todd Jobe R, Wan XF. Population-environment drivers of H5N1 avian influenza molecular change in Vietnam. Health Place 2012; 18:1122-31. [PMID: 22652510 DOI: 10.1016/j.healthplace.2012.04.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/16/2011] [Revised: 04/16/2012] [Accepted: 04/19/2012] [Indexed: 11/26/2022]
Abstract
This study identifies population and environment drivers of genetic change in H5N1 avian influenza viruses (AIV) in Vietnam using a landscape genetics approach. While prior work has examined how combinations of local-level environmental variables influence H5N1 occurrence, this research expands the analysis to the complex genetic characteristics of H5N1 viruses. A dataset of 125 highly pathogenic H5N1 AIV isolated in Vietnam from 2003 to 2007 is used to explore which population and environment variables are correlated with increased genetic change among viruses. Results from non-parametric multidimensional scaling and regression analyses indicate that variables relating to both the environmental and social ecology of humans and birds in Vietnam interact to affect the genetic character of viruses. These findings suggest that it is a combination of suitable environments for species mixing, the presence of high numbers of potential hosts, and in particular the temporal characteristics of viral occurrence, that drive genetic change among H5N1 AIV in Vietnam.
Affiliation(s)
- Margaret A Carrel
  - Department of Geography, University of Iowa, Iowa City, IA 52242, USA
15
Newman SH, Hill NJ, Spragens KA, Janies D, Voronkin IO, Prosser DJ, Yan B, Lei F, Batbayar N, Natsagdorj T, Bishop CM, Butler PJ, Wikelski M, Balachandran S, Mundkur T, Douglas DC, Takekawa JY. Eco-virological approach for assessing the role of wild birds in the spread of avian influenza H5N1 along the Central Asian Flyway. PLoS One 2012; 7:e30636. [PMID: 22347393 PMCID: PMC3274535 DOI: 10.1371/journal.pone.0030636] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 05/18/2011] [Accepted: 12/20/2011] [Indexed: 11/18/2022]
Abstract
A unique pattern of highly pathogenic avian influenza (HPAI) H5N1 outbreaks has emerged along the Central Asian Flyway, where infection of wild birds has been reported with steady frequency since 2005. We assessed the potential for two hosts of HPAI H5N1, the bar-headed goose (Anser indicus) and ruddy shelduck (Tadorna ferruginea), to act as agents for virus dispersal along this 'thoroughfare'. We used an eco-virological approach to compare the migration of 141 birds marked with GPS satellite transmitters during 2005-2010 with: 1) the spatio-temporal patterns of poultry and wild bird outbreaks of HPAI H5N1, and 2) the trajectory of the virus in the outbreak region based on phylogeographic mapping. We found that biweekly utilization distributions (UDs) for 19.2% of bar-headed geese and 46.2% of ruddy shelduck were significantly associated with outbreaks. Ruddy shelduck showed highest correlation with poultry outbreaks owing to their wintering distribution in South Asia, where there is considerable opportunity for HPAI H5N1 spillover from poultry. Both species showed correlation with wild bird outbreaks during the spring migration, suggesting they may be involved in the northward movement of the virus. However, phylogeographic mapping of HPAI H5N1 clades 2.2 and 2.3 did not support dissemination of the virus in a northern direction along the migration corridor. In particular, two subclades (2.2.1 and 2.3.2) moved in a strictly southern direction in contrast to our spatio-temporal analysis of bird migration. Our attempt to reconcile the disciplines of wild bird ecology and HPAI H5N1 virology highlights prospects offered by both approaches as well as their limitations.
Affiliation(s)
- Scott H Newman
  - EMPRES Wildlife Unit, Emergency Centre for Transboundary Animal Diseases, Animal Production and Health Division, Food and Agriculture Organization of the United Nations, Rome, Italy
16
Page RDM. Space, time, form: viewing the Tree of Life. Trends Ecol Evol 2011; 27:113-20. [PMID: 22209094 DOI: 10.1016/j.tree.2011.12.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 07/01/2011] [Revised: 12/05/2011] [Accepted: 12/05/2011] [Indexed: 02/06/2023]
Abstract
There are numerous ways to display a phylogenetic tree, which is reflected in the diversity of software tools available to phylogeneticists. Displaying very large trees continues to be a challenge, made ever harder as increasing computing power enables researchers to construct ever-larger trees. At the same time, computing technology is enabling novel visualisations, ranging from geophylogenies embedded on digital globes to touch-screen interfaces that enable greater interaction with evolutionary trees. In this review, I survey recent developments in phylogenetic visualisation, highlighting successful (and less successful) approaches and sketching some future directions.
Collapse
Affiliation(s)
- Roderic D M Page
- Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, G12 8QQ, UK.
|
17
|
Colosimo ME, Peterson MW, Mardis S, Hirschman L. Nephele: genotyping via complete composition vectors and MapReduce. SOURCE CODE FOR BIOLOGY AND MEDICINE 2011; 6:13. [PMID: 21851626 PMCID: PMC3182884 DOI: 10.1186/1751-0473-6-13] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 04/05/2011] [Accepted: 08/18/2011] [Indexed: 02/02/2023]
Abstract
BACKGROUND Current sequencing technology makes it practical to sequence many samples of a given organism, raising new challenges for the processing and interpretation of large genomics data sets with associated metadata. Traditional computational phylogenetic methods are ideal for studying the evolution of gene/protein families and using those to infer the evolution of an organism, but are less than ideal for the study of the whole organism mainly due to the presence of insertions/deletions/rearrangements. These methods provide the researcher with the ability to group a set of samples into distinct genotypic groups based on sequence similarity, which can then be associated with metadata, such as host information, pathogenicity, and time or location of occurrence. Genotyping is critical to understanding, at a genomic level, the origin and spread of infectious diseases. Increasingly, genotyping is coming into use for disease surveillance activities, as well as for microbial forensics. The classic genotyping approach has been based on phylogenetic analysis, starting with a multiple sequence alignment. Genotypes are then established by expert examination of phylogenetic trees. However, these traditional single-processor methods are suboptimal for rapidly growing sequence datasets being generated by next-generation DNA sequencing machines, because they increase in computational complexity quickly with the number of sequences. RESULTS Nephele is a suite of tools that uses the complete composition vector algorithm to represent each sequence in the dataset as a vector derived from its constituent k-mers, bypassing the need for multiple sequence alignment, and affinity propagation clustering to group the sequences into genotypes based on a distance measure over the vectors. Our methods produce results that correlate well with expert-defined clades or genotypes, at a fraction of the computational cost of traditional phylogenetic methods run on traditional hardware. Nephele can use the open-source Hadoop implementation of MapReduce to parallelize execution using multiple compute nodes. We were able to generate a neighbour-joined tree of over 10,000 16S samples in less than 2 hours. CONCLUSIONS We conclude that using Nephele can substantially decrease the processing time required for generating genotype trees of tens to hundreds of organisms at genome scale sequence coverage.
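The alignment-free idea in this abstract (represent each sequence as a vector of k-mer statistics, then compare vectors) can be illustrated with a minimal sketch. This is not Nephele's code and not the full complete composition vector algorithm, which also corrects observed k-mer frequencies against expected ones; it uses plain relative k-mer frequencies and a cosine distance as simplifying assumptions. The function names `kmer_vector` and `cosine_distance` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Return relative k-mer frequencies in a fixed lexicographic k-mer order.

    Simplification (assumption): raw frequencies only, without the
    background correction used by complete composition vectors.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # k-mers containing characters outside the alphabet (e.g. N) are ignored.
    total = sum(counts[km] for km in kmers) or 1
    return [counts[km] / total for km in kmers]


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


if __name__ == "__main__":
    a = kmer_vector("ACGTACGTACGT", k=2)
    b = kmer_vector("ACGTACGAACGT", k=2)
    print(round(cosine_distance(a, b), 4))
```

A pairwise matrix of such distances is the kind of input a clustering step (affinity propagation in the abstract) would consume; no alignment is computed at any point.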
Affiliation(s)
- Marc E Colosimo
- The MITRE Corporation, 202 Burlington Rd, Bedford MA 01730, USA.
|
18
|
Janies DA, Treseder T, Alexandrov B, Habib F, Chen JJ, Ferreira R, Çatalyürek Ü, Varón A, Wheeler WC. The Supramap project: linking pathogen genomes with geography to fight emergent infectious diseases. Cladistics 2011; 27:61-66. [PMID: 32313364 PMCID: PMC7162175 DOI: 10.1111/j.1096-0031.2010.00314.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Accepted: 02/23/2010] [Indexed: 11/27/2022] Open
Abstract
Novel pathogens have the potential to become critical issues of national security, public health and economic welfare. As demonstrated by the response to Severe Acute Respiratory Syndrome (SARS) and influenza, genomic sequencing has become an important method for diagnosing agents of infectious disease. Despite the value of genomic sequences in characterizing novel pathogens, raw data on their own do not provide the information needed by public health officials and researchers. One must integrate knowledge of the genomes of pathogens with host biology and geography to understand the etiology of epidemics. To these ends, we have created an application called Supramap (http://supramap.osu.edu) to put information on the spread of pathogens and key mutations across time, space and various hosts into a geographic information system (GIS). To build this application, we created a web service for integrated sequence alignment and phylogenetic analysis as well as methods to describe the tree, mutations, and host shifts in Keyhole Markup Language (KML). We apply the application to 239 sequences of the polymerase basic 2 (PB2) gene of recent isolates of avian influenza (H5N1). We map a mutation, glutamic acid to lysine at position 627 in the PB2 protein (E627K), in H5N1 influenza that allows for increased replication of the virus in mammals. We use a statistical test to support the hypothesis of a correlation of E627K mutations with avian-mammalian host shifts but reject the hypothesis that lineages with E627K are moving westward. Data, instructions for use, and visualizations are included as supplemental materials at: http://supramap.osu.edu/sm/supramap/publications. © The Willi Hennig Society 2010.
Affiliation(s)
- Daniel A Janies
- Department of Biomedical Informatics, The Ohio State University, College of Medicine, Columbus, OH 43210, USA
| | - Travis Treseder
- Department of Biomedical Informatics, The Ohio State University, College of Medicine, Columbus, OH 43210, USA
| | - Boyan Alexandrov
- Department of Biomedical Informatics, The Ohio State University, College of Medicine, Columbus, OH 43210, USA
| | - Farhat Habib
- Indian Institute of Science Education and Research (IISER) Garware Circle, Sutarwadi, Pashan Pune, Maharashtra 411021, India
| | - Jennifer J Chen
- Department of Biomedical Informatics, The Ohio State University, College of Medicine, Columbus, OH 43210, USA
| | - Renato Ferreira
- Universidade Federal de Minas Gerais, Departamento de Ciência da Computação, Belo Horizonte, MG, Brazil
| | - Ümit Çatalyürek
- Department of Biomedical Informatics, The Ohio State University, College of Medicine, Columbus, OH 43210, USA
| | - Andrés Varón
- Division of Invertebrate Zoology, The American Museum of Natural History, New York, NY 10024, USA
- Computer Science Department, The Graduate Center, The City University of New York, New York, NY 10016, USA
| | - Ward C Wheeler
- Division of Invertebrate Zoology, The American Museum of Natural History, New York, NY 10024, USA
|
19
|
Bloomquist EW, Lemey P, Suchard MA. Three roads diverged? Routes to phylogeographic inference. Trends Ecol Evol 2010; 25:626-32. [PMID: 20863591 PMCID: PMC2956787 DOI: 10.1016/j.tree.2010.08.010] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Received: 04/07/2010] [Revised: 08/25/2010] [Accepted: 08/26/2010] [Indexed: 11/29/2022]
Abstract
Phylogeographic methods facilitate inference of the geographical history of genetic lineages. Recent examples explore human migration and the origins of viral pandemics. There is longstanding disagreement over the use and validity of certain phylogeographic inference methodologies. In this paper, we highlight three distinct frameworks for phylogeographic inference to give a taste of this disagreement. Each of the three approaches presents a different viewpoint on phylogeography, most fundamentally on how we view the relationship between the inferred history of a sample and the history of the population the sample is embedded in. Satisfactory resolution of this relationship between history of the tree and history of the population remains a challenge for all but the most trivial models of phylogeographic processes. Intriguingly, we believe that some recent methods that entirely avoid inference about the history of the population will eventually help to reach a resolution.
Affiliation(s)
- Erik W. Bloomquist
- Mathematical Biosciences Institute, The Ohio State University, Columbus, OH 43210, USA
| | - Philippe Lemey
- Department of Microbiology and Immunology, Rega Institute, K.U. Leuven, Leuven 3000, Belgium
| | - Marc A. Suchard
- Department of Biostatistics, UCLA School of Public Health, Los Angeles, CA 90095, USA
- Departments of Biomathematics and Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
|
20
|
Liang L, Xu B, Chen Y, Liu Y, Cao W, Fang L, Feng L, Goodchild MF, Gong P. Combining spatial-temporal and phylogenetic analysis approaches for improved understanding on global H5N1 transmission. PLoS One 2010; 5:e13575. [PMID: 21042591 PMCID: PMC2962646 DOI: 10.1371/journal.pone.0013575] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Received: 05/17/2010] [Accepted: 09/30/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Since late 2003, the highly pathogenic influenza A H5N1 has initiated several outbreak waves that swept across the Eurasian and African continents. Getting prepared for reassortment or mutation of H5N1 viruses has become a global priority. Although the spreading mechanism of H5N1 has been studied from different perspectives, its main transmission agents and spread route problems remain unsolved. METHODOLOGY/PRINCIPAL FINDINGS Based on a compilation of the time and location of global H5N1 outbreaks from November 2003 to December 2006, we report an interdisciplinary effort that combines the geospatial informatics approach with a bioinformatics approach to form an improved understanding of the transmission mechanisms of H5N1 virus. Through a spherical coordinate based analysis, which is not conventionally done in geographical analyses, we reveal obvious spatial and temporal clusters of global H5N1 cases on different scales, which we consider to be associated with two different transmission modes of H5N1 viruses. Then through an interdisciplinary study of both geographic and phylogenetic analysis, we obtain an H5N1 spreading route map. Our results provide insight on competing hypotheses as to which avian hosts are responsible for the spread of H5N1. CONCLUSIONS/SIGNIFICANCE We found that although South China and Southeast Asia may be the virus pool of avian flu, East Siberia may be the source of the H5N1 epidemic. The concentration of migratory birds from different places increases the possibility of gene mutation. Special attention should be paid to East Siberia, Middle Siberia and South China for improved surveillance of H5N1 viruses and monitoring of migratory birds.
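The abstract does not say how the spherical-coordinate analysis was carried out. As one illustration of why working on the sphere matters for globally distributed outbreak records, the sketch below computes great-circle distances with the haversine formula rather than treating latitude and longitude as planar coordinates. The choice of haversine, the function name `haversine_km`, and the mean Earth radius constant are assumptions made for this example, not details taken from the paper.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to guard against floating-point drift slightly above 1.
    return 2 * EARTH_RADIUS_KM * asin(sqrt(min(1.0, a)))


if __name__ == "__main__":
    # Two hypothetical outbreak locations (coordinates are illustrative only).
    print(round(haversine_km(23.1, 113.3, 62.0, 129.7), 1))
```

Distances of this kind could feed a spatial clustering step; at high latitudes they differ substantially from planar distances on raw degrees.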
Affiliation(s)
- Lu Liang
- State Key Laboratory of Remote Sensing Science, Jointly Sponsored by Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing Normal University, Beijing, China
- Center for Earth System Science, Tsinghua University, Beijing, China
| | - Bing Xu
- Department of Geography, University of Utah, Salt Lake City, Utah, United States of America
- Department of Environmental Science and Engineering, Tsinghua University, Beijing, China
| | - Yanlei Chen
- Department of Environmental Science, Policy and Management, University of California, Berkeley, California, United States of America
| | - Yang Liu
- Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
| | - Wuchun Cao
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
| | - Liqun Fang
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, China
| | - Limin Feng
- Key Laboratory for Biodiversity Science and Ecological Engineering, Ministry of Education, College of Life Science, Beijing Normal University, Beijing, China
| | - Michael F. Goodchild
- Department of Geography, University of California Santa Barbara, Santa Barbara, California, United States of America
| | - Peng Gong
- State Key Laboratory of Remote Sensing Science, Jointly Sponsored by Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing Normal University, Beijing, China
- Center for Earth System Science, Tsinghua University, Beijing, China
- Department of Environmental Science, Policy and Management, University of California, Berkeley, California, United States of America
|
21
|
Haase M, Starick E, Fereidouni S, Strebelow G, Grund C, Seeland A, Scheuner C, Cieslik D, Smietanka K, Minta Z, Zorman-Rojs O, Mojzis M, Goletic T, Jestin V, Schulenburg B, Pybus O, Mettenleiter T, Beer M, Harder T. Possible sources and spreading routes of highly pathogenic avian influenza virus subtype H5N1 infections in poultry and wild birds in Central Europe in 2007 inferred through likelihood analyses. INFECTION GENETICS AND EVOLUTION 2010; 10:1075-84. [DOI: 10.1016/j.meegid.2010.07.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/20/2010] [Revised: 07/01/2010] [Accepted: 07/02/2010] [Indexed: 12/09/2022]
|
22
|
Affiliation(s)
- David M Kidd
- NESCent (National Evolutionary Synthesis Center), Durham, NC 27005, USA.
|
23
|
Lokossou AA, Rietman H, Wang M, Krenek P, van der Schoot H, Henken B, Hoekstra R, Vleeshouwers VGAA, van der Vossen EAG, Visser RGF, Jacobsen E, Vosman B. Diversity, distribution, and evolution of Solanum bulbocastanum late blight resistance genes. MOLECULAR PLANT-MICROBE INTERACTIONS : MPMI 2010; 23:1206-16. [PMID: 20687810 DOI: 10.1094/mpmi-23-9-1206] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 05/03/2023]
Abstract
Knowledge on the evolution and distribution of late blight resistance genes is important for a better understanding of the dynamics of these genes in nature. We analyzed the presence and allelic diversity of the late blight resistance genes Rpi-blb1, Rpi-blb2, and Rpi-blb3, originating from Solanum bulbocastanum, in a set of tuber-bearing Solanum species comprising 196 different taxa. The three genes were only present in some Mexican diploid as well as polyploid species closely related to S. bulbocastanum. Sequence analysis of the fragments obtained from the Rpi-blb1 and Rpi-blb3 genes suggests an evolution through recombinations and point mutations. For Rpi-blb2, only sequences identical to the cloned gene were found in S. bulbocastanum accessions, suggesting that it has emerged recently. The three resistance genes occurred in different combinations and frequencies in S. bulbocastanum accessions and their spread is confined to Central America. A selected set of genotypes was tested for their response to the avirulence effectors IPIO-2, Avr-blb2, and Pi-Avr2, which interact with Rpi-blb1, Rpi-blb2, and Rpi-blb3, respectively, as well as by disease assays with a diverse set of isolates. Using this approach, some accessions could be identified that contain novel, as yet unknown, late blight resistance factors in addition to the Rpi-blb1, Rpi-blb2, and Rpi-blb3 genes.
Affiliation(s)
- Anoma A Lokossou
- Wageningen UR Plant Breeding, P.O. Box 16, 6700AA, Wageningen, The Netherlands
|
24
|
Genome informatics of influenza A: from data sharing to shared analytical capabilities. Anim Health Res Rev 2010; 11:73-9. [DOI: 10.1017/s1466252310000083] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/29/2023]
Abstract
Emerging infectious diseases are critical issues of public health and the economic and social stability of nations. As demonstrated by the international response to the severe acute respiratory syndrome (SARS) and influenza A, rapid genomic sequencing is a crucial tool to understand diseases that occur at the interface of human and animal populations. However, our ability to make sense of sequence data lags behind our ability to acquire the data. The potential of sequence data on pathogens is not fully realized until raw data are translated into public health intelligence. Sequencing technologies have become highly mechanized. If the political will for data sharing remains strong, the frontier for progress in emerging infectious diseases will be in analysis of sequence data and translation of results into better public health science and policy. For example, applying analytical tools such as Supramap (http://supramap.osu.edu) to genomic data for pathogens, public health scientists can track specific mutations in pathogens that confer the ability to infect humans or resist drugs. The results produced by the Supramap application are compelling visualizations of pathogen lineages and features mapped into geographic information systems that can be used to test hypotheses and to follow the spread of diseases across geography and hosts and communicate the results to a wide audience.
|
25
|
Hankeln W, Buttigieg PL, Fink D, Kottmann R, Yilmaz P, Glöckner FO. MetaBar - a tool for consistent contextual data acquisition and standards compliant submission. BMC Bioinformatics 2010; 11:358. [PMID: 20591175 PMCID: PMC2912304 DOI: 10.1186/1471-2105-11-358] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/11/2010] [Accepted: 06/30/2010] [Indexed: 11/10/2022] Open
Abstract
Background Environmental sequence datasets are increasing at an exponential rate; however, the vast majority of them lack appropriate descriptors like sampling location, time and depth/altitude: generally referred to as metadata or contextual data. The consistent capture and structured submission of these data is crucial for integrated data analysis and ecosystems modeling. The application MetaBar has been developed to support consistent contextual data acquisition. Results MetaBar is a spreadsheet and web-based software tool designed to assist users in the consistent acquisition, electronic storage, and submission of contextual data associated with their samples. A preconfigured Microsoft® Excel® spreadsheet is used to initiate structured contextual data storage in the field or laboratory. Each sample is given a unique identifier and at any stage the sheets can be uploaded to the MetaBar database server. To label samples, identifiers can be printed as barcodes. An intuitive web interface provides quick access to the contextual data in the MetaBar database as well as user and project management capabilities. Export functions facilitate contextual and sequence data submission to the International Nucleotide Sequence Database Collaboration (INSDC), comprising the DNA DataBase of Japan (DDBJ), the European Molecular Biology Laboratory database (EMBL) and GenBank. MetaBar requests and stores contextual data in compliance with the Genomic Standards Consortium specifications. The MetaBar open source code base for local installation is available under the GNU General Public License version 3 (GNU GPL3). Conclusion The MetaBar software supports the typical workflow from data acquisition and field-sampling to contextual data enriched sequence submission to an INSDC database. The integration with the megx.net marine Ecological Genomics database and portal facilitates georeferenced data integration and metadata-based comparisons of sampling sites as well as interactive data visualization. The ample export functionalities and the INSDC submission support enable exchange of data across disciplines and safeguarding contextual data.
|
26
|
Peterson AT, Knapp S, Guralnick R, Soberón J, Holder MT. The big questions for biodiversity informatics. SYST BIODIVERS 2010. [DOI: 10.1080/14772001003739369] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 10/18/2022]
|
27
|
Dang CC, Le QS, Gascuel O, Le VS. FLU, an amino acid substitution model for influenza proteins. BMC Evol Biol 2010; 10:99. [PMID: 20384985 PMCID: PMC2873421 DOI: 10.1186/1471-2148-10-99] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 09/21/2009] [Accepted: 04/12/2010] [Indexed: 01/28/2023] Open
Abstract
Background The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution of influenza viruses. Results A maximum likelihood approach was applied to estimate an amino acid substitution model (FLU) from ~113,000 influenza protein sequences, consisting of ~20 million residues. FLU outperforms 14 widely used models in constructing maximum likelihood phylogenetic trees for the majority of influenza protein alignments. On average, FLU gains ~42 log likelihood points with an alignment of 300 sites. Moreover, topologies of trees constructed using FLU and other models are frequently different. FLU does indeed have an impact on likelihood improvement as well as tree topologies. It was implemented in PhyML and can be downloaded from ftp://ftp.sanger.ac.uk/pub/1000genomes/lsq/FLU or included in PhyML 3.0 server at http://www.atgc-montpellier.fr/phyml/. Conclusions FLU should be useful for any influenza protein analysis system which requires an accurate description of amino acid substitutions.
Affiliation(s)
- Cuong Cao Dang
- College of Technology, Vietnam National University Hanoi, Cau Giay, Hanoi, Vietnam
|
28
|
Bokhari SH, Janies DA. Reassortment networks for investigating the evolution of segmented viruses. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2010; 7:288-298. [PMID: 20431148 DOI: 10.1109/tcbb.2008.73] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/29/2023]
Abstract
Many viruses of interest, such as influenza A, have distinct segments in their genome. The evolution of these viruses involves mutation and reassortment, where segments are interchanged between viruses that coinfect a host. Phylogenetic trees can be constructed to investigate the mutation-driven evolution of individual viral segments. However, reassortment events among viral genomes are not well depicted in such bifurcating trees. We propose the concept of reassortment networks to analyze the evolution of segmented viruses. These are layered graphs in which the layers represent evolutionary stages such as a temporal series of seasons in which influenza viruses are isolated. Nodes represent viral isolates and reassortment events between pairs of isolates. Edges represent evolutionary steps, while weights on edges represent edit costs of reassortment and mutation events. Paths represent possible transformation series among viruses. The length of each path is the sum edit cost of the events required to transform one virus into another. In order to analyze τ stages of evolution of n viruses with segments of maximum length m, we first compute the pairwise distances between all corresponding segments of all viruses in O(m²n²) time using dynamic programming. The reassortment network, with O(τn²) nodes, is then constructed using these distances. The ancestors and descendants of a specific virus can be traced via shortest paths in this network, which can be found in O(τn³) time.
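The first step described in this abstract, pairwise distances between corresponding segments computed by dynamic programming, can be sketched as follows. The abstract does not give the edit costs, so unit costs for substitution, insertion and deletion (Levenshtein distance) are an assumption here, and the names `edit_distance` and `pairwise_segment_distances` are hypothetical. Each pair of segments of length at most m costs O(m²), and there are O(n²) virus pairs, which matches the O(m²n²) bound quoted above for a fixed number of segments.

```python
def edit_distance(a, b):
    """Levenshtein distance with unit costs (assumed; not the paper's costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def pairwise_segment_distances(viruses):
    """Map each unordered pair of isolates to a list of per-segment distances.

    `viruses` maps an isolate name to its segments, listed in the same
    segment order for every isolate.
    """
    names = sorted(viruses)
    out = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            out[(a, b)] = [edit_distance(sa, sb)
                           for sa, sb in zip(viruses[a], viruses[b])]
    return out


if __name__ == "__main__":
    toy = {  # toy two-segment "genomes", illustrative only
        "v1": ["ACGT", "GGCA"],
        "v2": ["ACGA", "GGCA"],
        "v3": ["TCGT", "GACA"],
    }
    for pair, dists in pairwise_segment_distances(toy).items():
        print(pair, dists)
```

Per-segment distances of this kind are the raw material for edge weights in a layered network; building the network and running shortest paths over it are separate steps not shown here.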
Affiliation(s)
- Shahid H Bokhari
- Department of Biomedical Informatics, Ohio State University, 3190 Graves Hall, 333 W. 10th Ave. Columbus, OH 43210, USA.
|
29
|
Hovmöller R, Alexandrov B, Hardman J, Janies D. Tracking the geographical spread of avian influenza (H5N1) with multiple phylogenetic trees. Cladistics 2010; 26:1-13. [DOI: 10.1111/j.1096-0031.2009.00297.x] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 12/01/2022] Open
|
30
|
Sidlauskas B, Ganapathy G, Hazkani-Covo E, Jenkins KP, Lapp H, McCall LW, Price S, Scherle R, Spaeth PA, Kidd DM. Linking big: the continuing promise of evolutionary synthesis. Evolution 2009; 64:871-80. [DOI: 10.1111/j.1558-5646.2009.00892.x] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/27/2022]
|
31
|
Macdonald N, Beiko R. Tracking the evolution and geographic spread of Influenza A. PLOS CURRENTS 2009; 1:RRN1014. [PMID: 20029608 PMCID: PMC2762414 DOI: 10.1371/currents.rrn1014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Accepted: 08/27/2009] [Indexed: 02/07/2023]
Abstract
The 2009 swine-origin strain of Influenza A H1N1 has spread to nearly all parts of the world, with 175 countries reporting confirmed cases thus far. Consistent with seasonal flu outbreaks, the current pandemic strain has shown rapid dispersal, with multiple examples of introduction into different geographic regions. Here we use an automated pipeline to collect data for analysis in the geospatial package GenGIS, which allows the geographic and temporal tracking of new sequence types and polymorphisms. Using this approach, we examine a pair of amino acid changes in the neuraminidase protein that are implicated in antibody recognition, and exhibit global dispersal with little or no geographic structure.
|
32
|
Parks DH, Porter M, Churcher S, Wang S, Blouin C, Whalley J, Brooks S, Beiko RG. GenGIS: A geospatial information system for genomic data. Genome Res 2009; 19:1896-904. [PMID: 19635847 DOI: 10.1101/gr.095612.109] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.5] [Indexed: 02/01/2023]
Abstract
The increasing availability of genetic sequence data associated with explicit geographic and ecological information is offering new opportunities to study the processes that shape biodiversity. The generation and testing of hypotheses using these data sets requires effective tools for mathematical and visual analysis that can integrate digital maps, ecological data, and large genetic, genomic, or metagenomic data sets. GenGIS is a free and open-source software package that supports the integration of digital map data with genetic sequences and environmental information from multiple sample sites. Essential bioinformatic and statistical tools are integrated into the software, allowing the user a wide range of analysis options for their sequence data. Data visualizations are combined with the cartographic display to yield a clear view of the relationship between geography and genomic diversity, with a particular focus on the hierarchical clustering of sites based on their similarity or phylogenetic proximity. Here we outline the features of GenGIS and demonstrate its application to georeferenced microbial metagenomic, HIV-1, and human mitochondrial DNA data sets.
|
33
|
Sloan CD, Duell EJ, Shi X, Irwin R, Andrew AS, Williams SM, Moore JH. Ecogeographic genetic epidemiology. Genet Epidemiol 2009; 33:281-9. [PMID: 19025788 DOI: 10.1002/gepi.20386] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 12/31/2022]
Abstract
Complex diseases such as cancer and heart disease result from interactions between an individual's genetics and environment, i.e. their human ecology. Rates of complex diseases have consistently demonstrated geographic patterns of incidence, or spatial "clusters" of increased incidence relative to the general population. Likewise, genetic subpopulations and environmental influences are not evenly distributed across space. Merging appropriate methods from genetic epidemiology, ecology and geography will provide a more complete understanding of the spatial interactions between genetics and environment that result in spatial patterning of disease rates. Geographic information systems (GIS), which are tools designed specifically for dealing with geographic data and performing spatial analyses to determine their relationship, are key to this kind of data integration. Here the authors introduce a new interdisciplinary paradigm, ecogeographic genetic epidemiology, which uses GIS and spatial statistical analyses to layer genetic subpopulation and environmental data with disease rates and thereby discern the complex gene-environment interactions which result in spatial patterns of incidence.
Affiliation(s)
- Chantel D Sloan
- Computational Genetics Laboratory, Department of Genetics, Dartmouth Medical School, Lebanon, New Hampshire, USA
|
34
|
Guralnick R, Hill A. Biodiversity informatics: automated approaches for documenting global biodiversity patterns and processes. Bioinformatics 2009; 25:421-8. [PMID: 19129210 DOI: 10.1093/bioinformatics/btn659] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Data about biodiversity have been scattered in different formats in natural history collections, survey reports and the literature. A central challenge for the biodiversity informatics community is to provide the means to share and rapidly synthesize these data and the knowledge they provide us to build an easily accessible, unified global map of biodiversity. Such a map would provide raw and summary data and information on biodiversity and its change across the world at multiple scales. RESULTS We discuss a series of steps required to create a unified global map of biodiversity. These steps include: building biodiversity repositories; creating scalable species distribution maps; creating flexible, user-programmable pipelines which enable biodiversity assessment; and integrating phylogenetic approaches into biodiversity assessment. We show two case studies that combine phyloinformatic and biodiversity informatic approaches to document large scale biodiversity patterns. The first case study uses data available from the Barcode of Life initiative in order to make species conservation assessment of North American birds taking into account evolutionary uniqueness. The second case study uses full genomes of influenza A available from Genbank to provide an auto-updating documentation of the evolution and geographic spread of these viruses. AVAILABILITY Both the website for tracking evolution and spread of influenza A and the website for applying phyloinformatics analysis to Barcode of Life data are available as outcomes of case studies (http://biodiversity.colorado.edu).
Affiliation(s)
- Robert Guralnick
- University of Colorado Museum of Natural History, University of Colorado Boulder, Boulder, CO 80309-0265, USA.
|
35
|
Almeida RPP, Bennett GM, Anhalt MD, Tsai CW, O'Grady P. Spread of an introduced vector-borne banana virus in Hawaii. Mol Ecol 2008; 18:136-46. [PMID: 19037897 DOI: 10.1111/j.1365-294x.2008.04009.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Emerging diseases are increasing in incidence; therefore, understanding how pathogens are introduced into new regions and cause epidemics is of importance for the development of strategies that may hinder their spread. We used molecular data to study how a vector-borne banana virus, Banana bunchy top virus (BBTV), spread in Hawaii after it was first detected in 1989. Our analyses suggest that BBTV was introduced once into Hawaii, on the island of Oahu. All other islands were infected with isolates originating from Oahu, suggesting that movement of contaminated plant material was the main driving factor responsible for interisland spread of BBTV. The rate of mutation inferred by the phylogenetic analysis (1.4 × 10⁻⁴ bp/year) was similar to that obtained in an experimental evolution study under greenhouse conditions (3.9 × 10⁻⁴ bp/year). We used these values to estimate the number of infections occurring under field conditions per year. Our results suggest that strict and enforced regulations limiting the movement of banana plant material among Hawaiian islands could have reduced interisland spread of this pathogen.
Affiliation(s)
- Rodrigo P P Almeida
- Department of Environmental Science, Policy and Management, University of California, Berkeley, CA 94720, USA.
|
36
|
Hill AW, Guralnick RP, Wilson MJC, Habib F, Janies D. Evolution of drug resistance in multiple distinct lineages of H5N1 avian influenza. Infect Genet Evol 2008; 9:169-78. [PMID: 19022400 DOI: 10.1016/j.meegid.2008.10.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 05/31/2008] [Revised: 10/12/2008] [Accepted: 10/13/2008] [Indexed: 12/15/2022]
Abstract
Some predict that influenza A H5N1 will be the cause of a pandemic among humans. In preparation for such an event, many governments and organizations have stockpiled antiviral drugs such as oseltamivir (Tamiflu). However, it is known that multiple lineages of H5N1 are already resistant to another class of drugs, adamantane derivatives, and a few lineages are resistant to oseltamivir. What is less well understood is the evolutionary history of the mutations that confer drug resistance in the H5N1 population. In order to address this gap, we conducted phylogenetic analyses of 676 genomic sequences of H5N1 and used the resulting hypotheses as a basis for asking 3 molecular evolutionary questions: (1) Have drug-resistant genotypes arisen in distinct lineages of H5N1 through point mutation or through reassortment? (2) Is there evidence for positive selection on the codons that lead to drug resistance? (3) Is there evidence for covariation between positions in the genome that confer resistance to drugs and other positions, unrelated to drug resistance, that may be under selection for other phenotypes? We also examine how drug-resistant lineages proliferate across the landscape by projecting our phylogenetic analysis onto a virtual globe. Our results for H5N1 show that in most cases drug resistance has arisen by independent point mutations rather than reassortment or covariation. Furthermore, we found that some codons that mediate resistance to adamantane derivatives are under positive selection, but did not find positive selection on codons that mediate resistance to oseltamivir. Together, our phylogenetic methods, molecular evolutionary analyses, and geographic visualization provide a framework for analysis of globally distributed genomic data that can be used to monitor the evolution of drug resistance.
Affiliation(s)
- Andrew W Hill
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO 80309, USA.
|
37
|
Boulos MNK, Scotch M, Cheung KH, Burden D. Web GIS in practice VI: a demo playlist of geo-mashups for public health neogeographers. Int J Health Geogr 2008; 7:38. [PMID: 18638385 PMCID: PMC2491600 DOI: 10.1186/1476-072x-7-38] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 07/06/2008] [Accepted: 07/18/2008] [Indexed: 11/15/2022]
Abstract
'Mashup' was originally used to describe the mixing together of musical tracks to create a new piece of music. The term now refers to Web sites or services that weave data from different sources into a new data source or service. Using a musical metaphor that builds on the origin of the word 'mashup', this paper presents a demonstration "playlist" of four geo-mashup vignettes that make use of a range of Web 2.0, Semantic Web, and 3-D Internet methods, with outputs/end-user interfaces spanning the flat Web (two-dimensional – 2-D maps), a three-dimensional – 3-D mirror world (Google Earth) and a 3-D virtual world (Second Life ®). The four geo-mashup "songs" in this "playlist" are: 'Web 2.0 and GIS (Geographic Information Systems) for infectious disease surveillance', 'Web 2.0 and GIS for molecular epidemiology', 'Semantic Web for GIS mashup', and 'From Yahoo! Pipes to 3-D, avatar-inhabited geo-mashups'. It is hoped that this showcase of examples and ideas, and the pointers we are providing to the many online tools that are freely available today for creating, sharing and reusing geo-mashups with minimal or no coding, will ultimately spark the imagination of many public health practitioners and stimulate them to start exploring the use of these methods and tools in their day-to-day practice. The paper also discusses how today's Web is rapidly evolving into a much more intensely immersive, mixed-reality and ubiquitous socio-experiential Metaverse that is heavily interconnected through various kinds of user-created mashups.
Affiliation(s)
- Maged N Kamel Boulos
- Faculty of Health and Social Work, University of Plymouth, Drake Circus, Plymouth, Devon PL4 8AA, UK.
|
38
|
Janies D, Habib F, Alexandrov B, Hill A, Pol D. Evolution of genomes, host shifts and the geographic spread of SARS-CoV and related coronaviruses. Cladistics 2008; 24:111-130. [PMID: 32313363 PMCID: PMC7162247 DOI: 10.1111/j.1096-0031.2008.00199.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Accepted: 10/23/2007] [Indexed: 11/26/2022]
Abstract
Severe acute respiratory syndrome (SARS) is a novel human illness caused by a previously unrecognized coronavirus (CoV) termed SARS-CoV. There are conflicting reports on the animal reservoir of SARS-CoV. Many of the groups that argue carnivores are the original reservoir of SARS-CoV use a phylogeny to support their argument. However, the phylogenies in these studies often lack outgroup and rooting criteria necessary to determine the origins of SARS-CoV. Recently, SARS-CoV has been isolated from various species of Chiroptera from China (e.g., Rhinolophus sinicus) thus leading to reconsideration of the original reservoir of SARS-CoV. We evaluated the hypothesis that SARS-CoV isolated from Chiroptera are the original zoonotic source for SARS-CoV by sampling SARS-CoV and non-SARS-CoV from diverse hosts including Chiroptera, as well as carnivores, artiodactyls, rodents, birds and humans. Regardless of alignment parameters, optimality criteria, or isolate sampling, the resulting phylogenies clearly show that the SARS-CoV was transmitted to small carnivores well after the epidemic of SARS in humans that began in late 2002. The SARS-CoV isolates from small carnivores in Shenzhen markets form a terminal clade that emerged recently from within the radiation of human SARS-CoV. There is evidence of subsequent exchange of SARS-CoV between humans and carnivores. In addition, SARS-CoV was transmitted independently from humans to farmed pigs (Sus scrofa). The position of SARS-CoV isolates from Chiroptera is basal to the SARS-CoV clade isolated from humans and carnivores. Although sequence data indicate that Chiroptera are a good candidate for the original reservoir of SARS-CoV, the structural biology of the spike protein of SARS-CoV isolated from Chiroptera suggests that these viruses are not able to interact with the human variant of the receptor of SARS-CoV, angiotensin-converting enzyme 2 (ACE2).
In SARS-CoV we study, both visually and statistically, labile genomic fragments and putative key mutations of the spike protein that may be associated with host shifts. We display host shifts and candidate mutations on trees projected in virtual globes depicting the spread of SARS-CoV. These results suggest that more sampling of coronaviruses from diverse hosts, especially Chiroptera, carnivores and primates, will be required to understand the genomic and biochemical evolution of coronaviruses, including SARS-CoV. © The Willi Hennig Society 2008.
Affiliation(s)
- Daniel Janies
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Farhat Habib
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Department of Physics, The Ohio State University, Columbus, OH, USA
- Boyan Alexandrov
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Biomedical Sciences Program, The Ohio State University, Columbus, OH, USA
- Andrew Hill
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA
- Diego Pol
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Mathematical Biosciences Institute, The Ohio State University, Columbus, OH, USA
- Museo Paleontológico Egidio Feruglio, Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina
|
39
|
Wallace RG, Fitch WM. Influenza A H5N1 immigration is filtered out at some international borders. PLoS One 2008; 3:e1697. [PMID: 18301773 PMCID: PMC2244808 DOI: 10.1371/journal.pone.0001697] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Received: 10/16/2007] [Accepted: 01/17/2008] [Indexed: 11/26/2022]
Abstract
BACKGROUND Geographic spread of highly pathogenic influenza A H5N1, the bird flu strain, appears a necessary condition for accelerating the evolution of a related human-to-human infection. As H5N1 spreads the virus diversifies in response to the variety of socioecological environments encountered, increasing the chance a human infection emerges. Genetic phylogenies have for the most part provided only qualitative evidence that localities differ in H5N1 diversity. For the first time H5N1 variation is quantified across geographic space. METHODOLOGY AND PRINCIPAL FINDINGS We constructed a statistical phylogeography of 481 H5N1 hemagglutinin genetic sequences from samples collected across 28 Eurasian and African localities through 2006. The MigraPhyla protocol showed southern China was a source of multiple H5N1 strains. Nested clade analysis indicated H5N1 was widely dispersed across southern China by both limited dispersal and long distance colonization. The UniFrac metric, a measure of shared phylogenetic history, grouped H5N1 from Indonesia, Japan, Thailand and Vietnam with those from southeastern Chinese provinces engaged in intensive international trade. Finally, H5N1's accumulative phylogenetic diversity was greatest in southern China and declined beyond. The gradient was interrupted by areas of greater and lesser phylogenetic dispersion, indicating H5N1 migration was restricted at some geopolitical borders. Thailand and Vietnam, just south of China, showed significant phylogenetic clustering, suggesting newly invasive H5N1 strains have been repeatedly filtered out at their northern borders even as both countries suffered recurring outbreaks of endemic strains. In contrast, Japan, while successful in controlling outbreaks, has been subjected to multiple introductions of the virus. CONCLUSIONS The analysis demonstrates phylogenies can provide local health officials with more than hypotheses about relatedness. 
Pathogen dispersal, the functional relationships among disease ecologies across localities, and the efficacy of control efforts can also be inferred, all from viral genetic sequences alone.
Affiliation(s)
- Robert G Wallace
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, USA.
|
40
|
|
41
|
Abstract
Biodiversity data are rapidly becoming available over the Internet in common formats that promote sharing and exchange. Currently, these data are somewhat problematic, primarily with regard to geographic and taxonomic accuracy, for use in ecological research, natural resources management and conservation decision-making. However, web-based georeferencing tools that utilize best practices and gazetteer databases can be employed to improve geographic data. Taxonomic data quality can be improved through web-enabled valid taxon names databases and services, as well as more efficient mechanisms to return systematic research results and taxonomic misidentification rates back to the biodiversity community. Both of these are under construction. A separate but related challenge will be developing web-based visualization and analysis tools for tracking biodiversity change. Our aim was to discuss how such tools, combined with data of enhanced quality, will help transform today's portals to raw biodiversity data into nexuses of collaborative creation and sharing of biodiversity knowledge.
Affiliation(s)
- Robert P Guralnick
- Department of Ecology and Evolutionary Biology, University of Colorado at Boulder, Boulder, CO 80309, USA.
|