1
Zuntini AR, Carruthers T, Maurin O, Bailey PC, Leempoel K, Brewer GE, Epitawalage N, Françoso E, Gallego-Paramo B, McGinnie C, Negrão R, Roy SR, Simpson L, Toledo Romero E, Barber VMA, Botigué L, Clarkson JJ, Cowan RS, Dodsworth S, Johnson MG, Kim JT, Pokorny L, Wickett NJ, Antar GM, DeBolt L, Gutierrez K, Hendriks KP, Hoewener A, Hu AQ, Joyce EM, Kikuchi IABS, Larridon I, Larson DA, de Lírio EJ, Liu JX, Malakasi P, Przelomska NAS, Shah T, Viruel J, Allnutt TR, Ameka GK, Andrew RL, Appelhans MS, Arista M, Ariza MJ, Arroyo J, Arthan W, Bachelier JB, Bailey CD, Barnes HF, Barrett MD, Barrett RL, Bayer RJ, Bayly MJ, Biffin E, Biggs N, Birch JL, Bogarín D, Borosova R, Bowles AMC, Boyce PC, Bramley GLC, Briggs M, Broadhurst L, Brown GK, Bruhl JJ, Bruneau A, Buerki S, Burns E, Byrne M, Cable S, Calladine A, Callmander MW, Cano Á, Cantrill DJ, Cardinal-McTeague WM, Carlsen MM, Carruthers AJA, de Castro Mateo A, Chase MW, Chatrou LW, Cheek M, Chen S, Christenhusz MJM, Christin PA, Clements MA, Coffey SC, Conran JG, Cornejo X, Couvreur TLP, Cowie ID, Csiba L, Darbyshire I, Davidse G, Davies NMJ, Davis AP, van Dijk KJ, Downie SR, Duretto MF, Duvall MR, Edwards SL, Eggli U, Erkens RHJ, Escudero M, de la Estrella M, Fabriani F, Fay MF, Ferreira PDL, Ficinski SZ, Fowler RM, Frisby S, Fu L, Fulcher T, Galbany-Casals M, Gardner EM, German DA, Giaretta A, Gibernau M, Gillespie LJ, González CC, Goyder DJ, Graham SW, Grall A, Green L, Gunn BF, Gutiérrez DG, Hackel J, Haevermans T, Haigh A, Hall JC, Hall T, Harrison MJ, Hatt SA, Hidalgo O, Hodkinson TR, Holmes GD, Hopkins HCF, Jackson CJ, James SA, Jobson RW, Kadereit G, Kahandawala IM, Kainulainen K, Kato M, Kellogg EA, King GJ, Klejevskaja B, Klitgaard BB, Klopper RR, Knapp S, Koch MA, Leebens-Mack JH, Lens F, Leon CJ, Léveillé-Bourret É, Lewis GP, Li DZ, Li L, Liede-Schumann S, Livshultz T, Lorence D, Lu M, Lu-Irving P, Luber J, Lucas EJ, Luján M, Lum M, Macfarlane TD, Magdalena C, Mansano VF, Masters LE, Mayo SJ, McColl K, 
McDonnell AJ, McDougall AE, McLay TGB, McPherson H, Meneses RI, Merckx VSFT, Michelangeli FA, Mitchell JD, Monro AK, Moore MJ, Mueller TL, Mummenhoff K, Munzinger J, Muriel P, Murphy DJ, Nargar K, Nauheimer L, Nge FJ, Nyffeler R, Orejuela A, Ortiz EM, Palazzesi L, Peixoto AL, Pell SK, Pellicer J, Penneys DS, Perez-Escobar OA, Persson C, Pignal M, Pillon Y, Pirani JR, Plunkett GM, Powell RF, Prance GT, Puglisi C, Qin M, Rabeler RK, Rees PEJ, Renner M, Roalson EH, Rodda M, Rogers ZS, Rokni S, Rutishauser R, de Salas MF, Schaefer H, Schley RJ, Schmidt-Lebuhn A, Shapcott A, Al-Shehbaz I, Shepherd KA, Simmons MP, Simões AO, Simões ARG, Siros M, Smidt EC, Smith JF, Snow N, Soltis DE, Soltis PS, Soreng RJ, Sothers CA, Starr JR, Stevens PF, Straub SCK, Struwe L, Taylor JM, Telford IRH, Thornhill AH, Tooth I, Trias-Blasi A, Udovicic F, Utteridge TMA, Del Valle JC, Verboom GA, Vonow HP, Vorontsova MS, de Vos JM, Al-Wattar N, Waycott M, Welker CAD, White AJ, Wieringa JJ, Williamson LT, Wilson TC, Wong SY, Woods LA, Woods R, Worboys S, Xanthos M, Yang Y, Zhang YX, Zhou MY, Zmarzty S, Zuloaga FO, Antonelli A, Bellot S, Crayn DM, Grace OM, Kersey PJ, Leitch IJ, Sauquet H, Smith SA, Eiserhardt WL, Forest F, Baker WJ. Phylogenomics and the rise of the angiosperms. Nature 2024:10.1038/s41586-024-07324-0. [PMID: 38658746 DOI: 10.1038/s41586-024-07324-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 03/15/2024] [Indexed: 04/26/2024]
Abstract
Angiosperms are the cornerstone of most terrestrial ecosystems and human livelihoods [1,2]. A robust understanding of angiosperm evolution is required to explain their rise to ecological dominance. So far, the angiosperm tree of life has been determined primarily by means of analyses of the plastid genome [3,4]. Many studies have drawn on this foundational work, such as classification and first insights into angiosperm diversification since their Mesozoic origins [5-7]. However, the limited and biased sampling of both taxa and genomes undermines confidence in the tree and its implications. Here, we build the tree of life for almost 8,000 (about 60%) angiosperm genera using a standardized set of 353 nuclear genes [8]. This 15-fold increase in genus-level sampling relative to comparable nuclear studies [9] provides a critical test of earlier results and brings notable change to key groups, especially in rosids, while substantiating many previously predicted relationships. Scaling this tree to time using 200 fossils, we discovered that early angiosperm evolution was characterized by high gene tree conflict and explosive diversification, giving rise to more than 80% of extant angiosperm orders. Steady diversification ensued through the remaining Mesozoic Era until rates resurged in the Cenozoic Era, concurrent with decreasing global temperatures and tightly linked with gene tree conflict. Taken together, our extensive sampling combined with advanced phylogenomic methods shows the deep history and full complexity in the evolution of a megadiverse clade.
Affiliation(s)
- Elaine Françoso
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Centre for Ecology, Evolution and Behaviour, Department of Biological Sciences, School of Life Sciences and the Environment, Royal Holloway University of London, London, UK
- Lalita Simpson
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Laura Botigué
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus UAB, Barcelona, Spain
- Steven Dodsworth
  - School of Biological Sciences, University of Portsmouth, Portsmouth, UK
- Jan T Kim
  - School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, UK
- Lisa Pokorny
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Biodiversity and Conservation, Real Jardín Botánico (RJB-CSIC), Madrid, Spain
- Norman J Wickett
  - Department of Biological Sciences, Clemson University, Clemson, SC, USA
- Guilherme M Antar
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
  - Departamento de Ciências Agrárias e Biológicas, Centro Universitário Norte do Espírito Santo, Universidade Federal do Espírito Santo, São Mateus, Brazil
- Kasper P Hendriks
  - Department of Biology, University of Osnabrück, Osnabrück, Germany
  - Naturalis Biodiversity Center, Leiden, The Netherlands
- Alina Hoewener
  - Plant Biodiversity, Technical University Munich, Freising, Germany
- Ai-Qun Hu
  - Royal Botanic Gardens, Kew, Richmond, UK
- Elizabeth M Joyce
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
  - Systematic, Biodiversity and Evolution of Plants, Ludwig Maximilian University of Munich, Munich, Germany
- Izai A B S Kikuchi
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Drew A Larson
  - Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Elton John de Lírio
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Jing-Xia Liu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
- Natalia A S Przelomska
  - Royal Botanic Gardens, Kew, Richmond, UK
  - School of Biological Sciences, University of Portsmouth, Portsmouth, UK
- Toral Shah
  - Royal Botanic Gardens, Kew, Richmond, UK
- Gabriel K Ameka
  - Department of Plant and Environmental Biology, University of Ghana, Accra, Ghana
- Rose L Andrew
  - Botany and N.C.W. Beadle Herbarium, University of New England, Armidale, New South Wales, Australia
- Marc S Appelhans
  - Department of Systematics, Biodiversity and Evolution of Plants, Albrecht-von-Haller Institute of Plant Sciences, University of Göttingen, Göttingen, Germany
- Montserrat Arista
  - Departamento de Biología Vegetal y Ecología, Facultad de Biología, Universidad de Sevilla, Seville, Spain
- María Jesús Ariza
  - General Research Services, Herbario SEV, CITIUS, Universidad de Sevilla, Seville, Spain
- Juan Arroyo
  - Departamento de Biología Vegetal y Ecología, Facultad de Biología, Universidad de Sevilla, Seville, Spain
- C Donovan Bailey
  - Department of Biology, New Mexico State University, Las Cruces, NM, USA
- Helen F Barnes
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Matthew D Barrett
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Russell L Barrett
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Randall J Bayer
  - Department of Biological Sciences, University of Memphis, Memphis, TN, USA
- Michael J Bayly
  - School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
- Ed Biffin
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
- Joanne L Birch
  - School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
- Diego Bogarín
  - Naturalis Biodiversity Center, Leiden, The Netherlands
  - Jardín Botánico Lankester, Universidad de Costa Rica, Cartago, Costa Rica
- Peter C Boyce
  - Centro Studi Erbario Tropicale, Dipartimento di Biologia, University of Florence, Florence, Italy
- Linda Broadhurst
  - Centre for Australian National Biodiversity Research, National Research Collections Australia, CSIRO, Canberra, Australian Capital Territory, Australia
- Gillian K Brown
  - Queensland Herbarium and Biodiversity Science, Brisbane Botanic Gardens, Toowong, Queensland, Australia
- Jeremy J Bruhl
  - Botany and N.C.W. Beadle Herbarium, University of New England, Armidale, New South Wales, Australia
- Anne Bruneau
  - Institut de Recherche en Biologie Végétale and Département de Sciences Biologiques, University of Montreal, Montreal, Quebec, Canada
- Sven Buerki
  - Department of Biological Sciences, Boise State University, Boise, ID, USA
- Edie Burns
  - Royal Botanic Gardens, Kew, Richmond, UK
- Margaret Byrne
  - Biodiversity and Conservation Science, Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Western Australia, Australia
- Ainsley Calladine
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
- Ángela Cano
  - Cambridge University Botanic Garden, Cambridge, UK
- Warren M Cardinal-McTeague
  - Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada
- Alejandra de Castro Mateo
  - Departamento de Biología Vegetal y Ecología, Facultad de Biología, Universidad de Sevilla, Seville, Spain
- Mark W Chase
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Environment and Agriculture, Curtin University, Bentley, Western Australia, Australia
- Shilin Chen
  - Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, China
  - Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Beijing, China
- Maarten J M Christenhusz
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Environment and Agriculture, Curtin University, Perth, Western Australia, Australia
  - Plant Gateway, Den Haag, The Netherlands
- Pascal-Antoine Christin
  - Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, Sheffield, UK
- Mark A Clements
  - Centre for Australian National Biodiversity Research, National Research Collections Australia, CSIRO, Canberra, Australian Capital Territory, Australia
- Skye C Coffey
  - Western Australian Herbarium, Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Western Australia, Australia
- John G Conran
  - School of Biological Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Xavier Cornejo
  - Herbario GUAY, Facultad de Ciencias Naturales, Universidad de Guayaquil, Guayaquil, Ecuador
- Ian D Cowie
  - Northern Territory Herbarium Department of Environment Parks & Water Security, Northern Territory Government, Palmerston, Northern Territory, Australia
- Kor-Jent van Dijk
  - The University of Adelaide, North Terrace Campus, Adelaide, South Australia, Australia
- Stephen R Downie
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Marco F Duretto
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Melvin R Duvall
  - Department of Biological Sciences and Institute for the Study of the Environment, Sustainability and Energy, Northern Illinois University, DeKalb, IL, USA
- Urs Eggli
  - Sukkulenten-Sammlung Zürich/ Grün Stadt Zürich, Zürich, Switzerland
- Roy H J Erkens
  - Naturalis Biodiversity Center, Leiden, The Netherlands
  - Maastricht Science Programme, Maastricht University, Maastricht, The Netherlands
  - System Earth Science, Maastricht University, Venlo, The Netherlands
- Marcial Escudero
  - Departamento de Biología Vegetal y Ecología, Facultad de Biología, Universidad de Sevilla, Seville, Spain
- Manuel de la Estrella
  - Departamento de Botánica, Ecología y Fisiología Vegetal, Facultad de Ciencias, Universidad de Córdoba, Córdoba, Spain
- Paola de L Ferreira
  - Departamento de Biologia, Faculdade de Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, São Paulo, Brazil
  - Department of Biology, Aarhus University, Aarhus, Denmark
- Rachael M Fowler
  - School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
- Sue Frisby
  - Royal Botanic Gardens, Kew, Richmond, UK
- Lin Fu
  - South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Mercè Galbany-Casals
  - Systematics and Evolution of Vascular Plants (UAB)-Associated Unit to CSIC by IBB, Departament de Biologia Animal, Biologia Vegetal i Ecologia, Facultat de Biociències, Universitat Autònoma de Barcelona, Bellaterra, Spain
- Elliot M Gardner
  - Department of Biology, Case Western Reserve University, Cleveland, OH, USA
- Augusto Giaretta
  - Faculdade de Ciências Biológicas e Ambientais, Universidade Federal da Grande Dourados, Dourados, Brazil
- Marc Gibernau
  - Laboratoire Sciences Pour l'Environnement, Université de Corse, Ajaccio, France
- Cynthia C González
  - Herbario Trelew, Universidad Nacional de la Patagonia San Juan Bosco, Trelew, Argentina
- Sean W Graham
  - Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Bee F Gunn
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Diego G Gutiérrez
  - Museo Argentino de Ciencias Naturales (MACN-CONICET), Buenos Aires, Argentina
- Jan Hackel
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Biology, Universität Marburg, Marburg, Germany
- Thomas Haevermans
  - Institut de Systématique, Evolution, Biodiversité, Muséum National d'Histoire Naturelle, Paris, France
- Anna Haigh
  - Royal Botanic Gardens, Kew, Richmond, UK
- Jocelyn C Hall
  - Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Tony Hall
  - Royal Botanic Gardens, Kew, Richmond, UK
- Melissa J Harrison
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Oriane Hidalgo
  - Institut Botànic de Barcelona (IBB CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Trevor R Hodkinson
  - Botany, School of Natural Sciences, Trinity College Dublin, The University of Dublin, Dublin, Ireland
- Gareth D Holmes
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Shelley A James
  - Western Australian Herbarium, Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Western Australia, Australia
- Richard W Jobson
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Gudrun Kadereit
  - Prinzessin Therese von Bayern-Lehrstuhl für Systematik, Biodiversität & Evolution der Pflanzen, Ludwig-Maximilians-Universität München, Botanische Staatssammlung München, Botanischer Garten München-Nymphenburg, Munich, Germany
- Masahiro Kato
  - National Museum of Nature and Science, Tsukuba, Japan
- Graham J King
  - Southern Cross University, Lismore, New South Wales, Australia
- Ronell R Klopper
  - Foundational Biodiversity Science Division, South African National Biodiversity Institute, Pretoria, South Africa
  - Department of Plant and Soil Sciences, University of Pretoria, Pretoria, South Africa
- Marcus A Koch
  - Centre for Organismal Studies, Biodiversity and Plant Systematics, Heidelberg University, Heidelberg, Germany
- Frederic Lens
  - Naturalis Biodiversity Center, Leiden, The Netherlands
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
- Lan Li
  - CSIRO, Canberra, Australian Capital Territory, Australia
- Tatyana Livshultz
  - Department of Biodiversity, Earth and Environmental Sciences, Drexel University, Philadelphia, PA, USA
  - Academy of Natural Science, Drexel University, Philadelphia, PA, USA
- David Lorence
  - National Tropical Botanical Garden, Kalaheo, HI, USA
- Meng Lu
  - Royal Botanic Gardens, Kew, Richmond, UK
- Patricia Lu-Irving
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Jaquelini Luber
  - Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, Brazil
- Mabel Lum
  - Bioplatforms Australia Ltd, Sydney, New South Wales, Australia
- Terry D Macfarlane
  - Western Australian Herbarium, Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Western Australia, Australia
- Vidal F Mansano
  - Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, Brazil
- Kristina McColl
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Angela J McDonnell
  - Department of Biological Sciences, Saint Cloud State University, Saint Cloud, MN, USA
- Andrew E McDougall
  - The University of Adelaide, North Terrace Campus, Adelaide, South Australia, Australia
- Todd G B McLay
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Hannah McPherson
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Rosa I Meneses
  - Instituto de Arqueología y Antropología, Universidad Católica del Norte, San Pedro de Atacama, Chile
- Taryn L Mueller
  - Department of Ecology, Evolution & Behavior, University of Minnesota, St. Paul, MN, USA
- Klaus Mummenhoff
  - Department of Biology, University of Osnabrück, Osnabrück, Germany
- Jérôme Munzinger
  - AMAP Lab, Université Montpellier, IRD, CIRAD, CNRS INRAE, Montpellier, France
- Priscilla Muriel
  - Laboratorio de Ecofisiología, Escuela de Ciencias Biológicas, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
- Daniel J Murphy
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Katharina Nargar
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
  - Centre for Australian National Biodiversity Research, National Research Collections Australia, CSIRO, Canberra, Australian Capital Territory, Australia
- Lars Nauheimer
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Francis J Nge
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
- Reto Nyffeler
  - Department of Systematic and Evolutionary Botany, University of Zürich, Zürich, Switzerland
- Andrés Orejuela
  - Royal Botanic Garden Edinburgh, Edinburgh, UK
  - Grupo de Investigación en Recursos Naturales Amazónicos, Instituto Tecnológico del Putumayo, Mocoa, Colombia
- Edgardo M Ortiz
  - Plant Biodiversity, Technical University Munich, Freising, Germany
- Luis Palazzesi
  - Museo Argentino de Ciencias Naturales (MACN-CONICET), Buenos Aires, Argentina
- Ariane Luna Peixoto
  - Instituto de Pesquisas Jardim Botânico do Rio de Janeiro, Rio de Janeiro, Brazil
- Jaume Pellicer
  - Institut Botànic de Barcelona (IBB CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Darin S Penneys
  - Department of Biology and Marine Biology, University of North Carolina Wilmington, Wilmington, NC, USA
- Claes Persson
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Marc Pignal
  - Institut de Systématique, Evolution, Biodiversité, Muséum National d'Histoire Naturelle, Paris, France
- Yohan Pillon
  - LSTM Université Montpellier, CIRAD, IRD, Montpellier, France
- José R Pirani
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Carmen Puglisi
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Missouri Botanical Garden, St. Louis, MO, USA
- Ming Qin
  - South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Richard K Rabeler
  - Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Matthew Renner
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Eric H Roalson
  - School of Biological Sciences, Washington State University, Pullman, WA, USA
- Michele Rodda
  - National Parks Board, Singapore Botanic Gardens, Singapore, Singapore
- Saba Rokni
  - Royal Botanic Gardens, Kew, Richmond, UK
- Rolf Rutishauser
  - Department of Systematic and Evolutionary Botany, University of Zürich, Zürich, Switzerland
- Miguel F de Salas
  - Tasmanian Herbarium, University of Tasmania, Sandy Bay, Tasmania, Australia
- Hanno Schaefer
  - Plant Biodiversity, Technical University Munich, Freising, Germany
- Alexander Schmidt-Lebuhn
  - Centre for Australian National Biodiversity Research, National Research Collections Australia, CSIRO, Canberra, Australian Capital Territory, Australia
- Alison Shapcott
  - School of Science Technology and Engineering, Center for Bioinnovation, University Sunshine Coast, Sippy Downs, Queensland, Australia
- Kelly A Shepherd
  - Western Australian Herbarium, Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Western Australia, Australia
- Mark P Simmons
  - Department of Biology, Colorado State University, Fort Collins, CO, USA
- André O Simões
  - Departamento de Biologia Vegetal, Universidade Estadual de Campinas, Campinas, Brazil
- Michelle Siros
  - Royal Botanic Gardens, Kew, Richmond, UK
  - University of California, San Francisco, San Francisco, CA, USA
- Eric C Smidt
  - Departamento de Botânica, Universidade Federal do Paraná, Curitiba, Brazil
- James F Smith
  - Department of Biological Sciences, Boise State University, Boise, ID, USA
- Neil Snow
  - Pittsburg State University, Pittsburg, KS, USA
- Douglas E Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
- Pamela S Soltis
  - Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
- Julian R Starr
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Ian R H Telford
  - Botany and N.C.W. Beadle Herbarium, University of New England, Armidale, New South Wales, Australia
- Andrew H Thornhill
  - Botany and N.C.W. Beadle Herbarium, University of New England, Armidale, New South Wales, Australia
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
  - School of Biological Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Ifeanna Tooth
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Frank Udovicic
  - Royal Botanic Gardens Victoria, Melbourne, Victoria, Australia
- Jose C Del Valle
  - Departamento de Biología Vegetal y Ecología, Facultad de Biología, Universidad de Sevilla, Seville, Spain
- G Anthony Verboom
  - Department of Biological Sciences and Bolus Herbarium, University of Cape Town, Cape Town, South Africa
- Helen P Vonow
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
- Jurriaan M de Vos
  - Department of Environmental Sciences-Botany, University of Basel, Basel, Switzerland
- Michelle Waycott
  - State Herbarium of South Australia, Botanic Gardens and State Herbarium, Adelaide, South Australia, Australia
  - School of Biological Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Cassiano A D Welker
  - Instituto de Biologia, Universidade Federal de Uberlândia, Uberlândia, Brazil
- Adam J White
  - Australian National Herbarium, Centre for Australian National Biodiversity Research, National Research Collections Australia, CSIRO, Canberra, Australian Capital Territory, Australia
- Luis T Williamson
  - The University of Adelaide, North Terrace Campus, Adelaide, South Australia, Australia
- Trevor C Wilson
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Sin Yeng Wong
  - Institute of Biodiversity And Environmental Conservation, Universiti Malaysia Sarawak, Samarahan, Malaysia
- Lisa A Woods
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Stuart Worboys
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Ya Yang
  - University of Minnesota-Twin Cities, St. Paul, MN, USA
- Meng-Yuan Zhou
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
- Alexandre Antonelli
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
  - Gothenburg Global Biodiversity Centre, University of Gothenburg, Gothenburg, Sweden
  - Department of Biology, University of Oxford, Oxford, UK
- Darren M Crayn
  - Australian Tropical Herbarium, James Cook University, Smithfield, Queensland, Australia
- Olwen M Grace
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Royal Botanic Garden Edinburgh, Edinburgh, UK
- Hervé Sauquet
  - National Herbarium of NSW, Botanic Gardens of Sydney, Mount Annan, New South Wales, Australia
- Stephen A Smith
  - Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
- Wolf L Eiserhardt
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Biology, Aarhus University, Aarhus, Denmark
- William J Baker
  - Royal Botanic Gardens, Kew, Richmond, UK
  - Department of Biology, Aarhus University, Aarhus, Denmark
2
Kersey PJ, Antonelli A. Physical infrastructure and global capacity are both needed to fight biodiversity loss. Nat Plants 2023; 9:1940. [PMID: 38040836 DOI: 10.1038/s41477-023-01594-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2023]
3
Campbell LI, Nwezeobi J, van Brunschot SL, Kaweesi T, Seal SE, Swamy RAR, Namuddu A, Maslen GL, Mugerwa H, Armean IM, Haggerty L, Martin FJ, Malka O, Santos-Garcia D, Juravel K, Morin S, Stephens ME, Muhindira PV, Kersey PJ, Maruthi MN, Omongo CA, Navas-Castillo J, Fiallo-Olivé E, Mohammed IU, Wang HL, Onyeka J, Alicai T, Colvin J. Comparative evolutionary analyses of eight whitefly Bemisia tabaci sensu lato genomes: cryptic species, agricultural pests and plant-virus vectors. BMC Genomics 2023; 24:408. [PMID: 37468834 DOI: 10.1186/s12864-023-09474-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/14/2023] [Accepted: 06/21/2023] [Indexed: 07/21/2023]
Abstract
BACKGROUND: The group of > 40 cryptic whitefly species called Bemisia tabaci sensu lato are amongst the world's worst agricultural pests and plant-virus vectors. Outbreaks of B. tabaci s.l. and the associated plant-virus diseases continue to contribute to global food insecurity and social instability, particularly in sub-Saharan Africa and Asia. Published B. tabaci s.l. genomes have limited use for studying African cassava B. tabaci SSA1 species, due to the high genetic divergences between them. Genomic annotations presented here were performed using the 'Ensembl gene annotation system', to ensure that comparative analyses and conclusions reflect biological differences, as opposed to arising from different methodologies underpinning transcript model identification.
RESULTS: We present here six new B. tabaci s.l. genomes from Africa and Asia, and two re-annotated previously published genomes, to provide evolutionary insights into these globally distributed pests. Genome sizes ranged between 616-658 Mb and exhibited some of the highest coverage of transposable elements reported within Arthropoda. Many fewer total protein coding genes (PCG) were recovered compared to the previously published B. tabaci s.l. genomes and structural annotations generated via the uniform methodology strongly supported a repertoire of between 12.8-13.2 × 10^3 PCG. An integrative systematics approach incorporating phylogenomic analysis of nuclear and mitochondrial markers supported a monophyletic Aleyrodidae and the basal positioning of B. tabaci Uganda-1 to the sub-Saharan group of species. Reciprocal cross-mating data and the co-cladogenesis pattern of the primary obligate endosymbiont 'Candidatus Portiera aleyrodidarum' from 11 Bemisia genomes further supported the phylogenetic reconstruction to show that African cassava B. tabaci populations consist of just three biological species. We include comparative analyses of gene families related to detoxification, sugar metabolism, vector competency and evaluate the presence and function of horizontally transferred genes, essential for understanding the evolution and unique biology of constituent B. tabaci s.l. species.
CONCLUSIONS: These genomic resources have provided new and critical insights into the genetics underlying B. tabaci s.l. biology. They also provide a rich foundation for post-genomic research, including the selection of candidate gene-targets for innovative whitefly and virus-control strategies.
Affiliation(s)
- Lahcen I Campbell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Joachim Nwezeobi
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, Hinxton, UK
- Sharon L van Brunschot
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - CSIRO Health and Biosecurity, Dutton Park, QLD, Australia
  - School of Biological Sciences, The University of Queensland, Brisbane, QLD, Australia
- Tadeo Kaweesi
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - Rwebitaba Zonal Agricultural Research and Development Institute, Fort Portal, Uganda
- Susan E Seal
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
- Rekha A R Swamy
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
- Annet Namuddu
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - National Crops Resources Research Institute, Kampala, Uganda
- Gareth L Maslen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Imperial College London, South Kensington, London, UK
- Habibu Mugerwa
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - Department of Entomology, University of Georgia, Griffin, GA, USA
- Irina M Armean
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Leanne Haggerty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Osnat Malka
  - Department of Entomology, The Hebrew University of Jerusalem, Rehovot, Israel
- Diego Santos-Garcia
  - CNRS, Laboratory of Biometry and Evolutionary Biology UMR 5558, University of Lyon, Villeurbanne, France
  - Center for Biology and Management of Populations, INRAe UMR1062, Montferrier-sur-Lez, France
- Ksenia Juravel
  - Department of Entomology, The Hebrew University of Jerusalem, Rehovot, Israel
- Shai Morin
  - Department of Entomology, The Hebrew University of Jerusalem, Rehovot, Israel
- Paul Visendi Muhindira
  - Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
  - Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, Australia
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Royal Botanic Gardens, Kew, London, UK
| | - M N Maruthi
- Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
| | | | - Jesús Navas-Castillo
- Instituto de Hortofruticultura Subtropical Y Mediterránea "La Mayora" (IHSM-UMA-CSIC), Consejo Superior de Investigaciones Científicas, Málaga, Algarrobo-Costa, Spain
| | - Elvira Fiallo-Olivé
- Instituto de Hortofruticultura Subtropical Y Mediterránea "La Mayora" (IHSM-UMA-CSIC), Consejo Superior de Investigaciones Científicas, Málaga, Algarrobo-Costa, Spain
| | | | - Hua-Ling Wang
- Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
- College of Forestry, Hebei Agricultural University, Baoding, Hebei, China
| | - Joseph Onyeka
- National Root Crops Research Institute (NRCRI), Umudike, Nigeria
| | - Titus Alicai
- National Crops Resources Research Institute, Kampala, Uganda
| | - John Colvin
- Natural Resources Institute, University of Greenwich, Chatham, Kent, UK
|
4
|
Argentin J, Bolser D, Kersey PJ, Flicek P. Comparative analysis of repeat content in plant genomes, large and small. Front Plant Sci 2023; 14:1103035. [PMID: 37521909 PMCID: PMC10376685 DOI: 10.3389/fpls.2023.1103035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 06/14/2023] [Indexed: 08/01/2023]
Abstract
The DNA Features pipeline is the analysis pipeline at EMBL-EBI that annotates repeat elements, including transposable elements. With Ensembl's goal to stay at the cutting edge of genome annotation, we proved that this pipeline needed an update. We then created a new analysis that allowed the Ensembl database to store the repeat classification from the PGSB repeat classification (Recat). This new dataset was then fetched using Perl scripts and used to prove that the pipeline modification induced a gain in sensitivity. Finally, we performed a comparative analysis of transposable element distribution in all plant species available, raising new questions about transposable elements in certain branches of the taxonomic tree.
Affiliation(s)
- Joris Argentin
- Institut de Biologie en Santé, Centre Hospitalier Universitaire (CHU) d’Angers, Angers, France
| | - Dan Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, United Kingdom
| | - Paul J. Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, United Kingdom
- Digital Revolution, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, United Kingdom
|
5
|
Beier S, Fiebig A, Pommier C, Liyanage I, Lange M, Kersey PJ, Weise S, Finkers R, Koylass B, Cezard T, Courtot M, Contreras-Moreira B, Naamati G, Dyer S, Scholz U. Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR. F1000Res 2022; 11. [PMID: 35811804 PMCID: PMC9218589 DOI: 10.12688/f1000research.109080.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 05/17/2022] [Indexed: 11/20/2022] Open
Abstract
In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of metadata in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine actionable data flow, generic elements need to be further specified. We strongly support the merits of the FAIR principles and see the need to facilitate them also through technical implementation specifications. They form a basis for the proposed VCF extensions here. We have learned from the existing application of VCF that the definition of relevant metadata using controlled standards, vocabulary and the consistent use of cross-references via resolvable identifiers (machine-readable) are particularly necessary and propose their encoding. VCF is an established standard for the exchange and publication of genotyping data. Other data formats are also used to capture variant data (for example, the HapMap and the gVCF formats), but none currently have the reach of VCF. For the sake of simplicity, we will only discuss VCF and our recommendations for its use, but these recommendations could also be applied to gVCF. However, the part of the VCF standard relating to metadata (as opposed to the actual variant calls) defines a syntactic format but no vocabulary, unique identifier or recommended content. In practice, often only sparse descriptive metadata is included. When descriptive metadata is provided, proprietary metadata fields are frequently added that have not been agreed upon within the community which may limit long-term and comprehensive interoperability. To address this, we propose recommendations for supplying and encoding metadata, focusing on use cases from plant sciences. We expect there to be overlap, but also divergence, with the needs of other domains.
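As a rough illustration of the metadata problem described in this abstract, the sketch below reads the `##key=value` meta-information lines that the VCF specification places at the top of a file and reports which entries from a checklist are absent. The checklist, the example header and the function names are hypothetical and are not taken from the article's recommendations; only the `##` line syntax and the `fileformat` entry come from the VCF specification itself.

```python
def parse_vcf_meta(lines):
    """Collect '##key=value' meta-information lines into a dict of lists."""
    meta = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith("##"):
            break  # meta lines end at the '#CHROM' column header
        key, sep, value = line[2:].partition("=")
        if sep:
            # A key such as 'contig' may occur many times, so keep a list.
            meta.setdefault(key, []).append(value)
    return meta


def missing_keys(meta, expected):
    """Return the checklist keys that never appear in the header."""
    return [key for key in expected if key not in meta]


# Hypothetical header, used only to exercise the two functions above.
EXAMPLE_HEADER = [
    "##fileformat=VCFv4.3",
    "##reference=file:///example/assembly.fa",
    "##contig=<ID=chr1,length=1000>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]

# Hypothetical checklist; a real one would follow a community recommendation.
EXPECTED = ["fileformat", "reference", "contig", "SAMPLE"]

meta = parse_vcf_meta(EXAMPLE_HEADER)
print(missing_keys(meta, EXPECTED))
```

A check of this kind only tests whether a key is present. It says nothing about whether the value uses a controlled vocabulary or a resolvable identifier, which is the harder part of the problem the abstract raises.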
Affiliation(s)
- Sebastian Beier
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
- Institute of Bio- and Geosciences, Bioinformatics (IBG-4), Forschungszentrum Jülich GmbH, Jülich, 52425, Germany
| | - Anne Fiebig
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
| | - Cyril Pommier
- BioinfOmics, Plant bioinformatics facility, Université Paris-Saclay, INRAE, Versailles, France
| | - Isuru Liyanage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Matthias Lange
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
| | | | - Stephan Weise
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
| | - Richard Finkers
- Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Gennovation B.V., Wageningen, The Netherlands
| | - Baron Koylass
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Timothee Cezard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Mélanie Courtot
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Ontario Institute for Cancer Research, Toronto, Canada
| | - Bruno Contreras-Moreira
- Laboratorio de Biología Computacional y Estructural, Estación Experimental Aula Dei-CSIC, Zaragoza, 50059, Spain
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Sarah Dyer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
| | - Uwe Scholz
- Breeding Research, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, Seeland, 06466, Germany
|
6
|
Lawniczak MKN, Durbin R, Flicek P, Lindblad-Toh K, Wei X, Archibald JM, Baker WJ, Belov K, Blaxter ML, Marques Bonet T, Childers AK, Coddington JA, Crandall KA, Crawford AJ, Davey RP, Di Palma F, Fang Q, Haerty W, Hall N, Hoff KJ, Howe K, Jarvis ED, Johnson WE, Johnson RN, Kersey PJ, Liu X, Lopez JV, Myers EW, Pettersson OV, Phillippy AM, Poelchau MF, Pruitt KD, Rhie A, Castilla-Rubio JC, Sahu SK, Salmon NA, Soltis PS, Swarbreck D, Thibaud-Nissen F, Wang S, Wegrzyn JL, Zhang G, Zhang H, Lewin HA, Richards S. Standards recommendations for the Earth BioGenome Project. Proc Natl Acad Sci U S A 2022; 119:e2115639118. [PMID: 35042802 PMCID: PMC8795494 DOI: 10.1073/pnas.2115639118] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 11/25/2022] Open
Abstract
A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific community: Sample Collection and Processing, Sequencing and Assembly, Annotation, Analysis, and IT and Informatics. The current versions of the resulting standards documents are available on the EBP website, with the recognition that opportunities, technologies, and challenges may improve or change in the future, requiring flexibility for the EBP to meet its goals. Here, we describe some highlights from the proposed standards, and areas where additional challenges will need to be met.
Affiliation(s)
- Mara K N Lawniczak
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Richard Durbin
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB3 0DH, United Kingdom
| | - Paul Flicek
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University 751 23 Uppsala, Sweden
| | | | - John M Archibald
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS B3H 4R2, Canada
| | - William J Baker
- Department of Accelerated Taxonomy, Royal Botanic Gardens, Kew, Surrey TW9 3AE, United Kingdom
| | - Katherine Belov
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Mark L Blaxter
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Tomas Marques Bonet
- Institute of Evolutionary Biology, Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies 08010 Barcelona, Spain
- Centre Nacional d'Anàlisi Genòmica - Centre for Genomic Regulation, Barcelona Institute of Science and Technology 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona 08193 Barcelona, Spain
| | - Anna K Childers
- Bee Research Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705
| | - Jonathan A Coddington
- Smithsonian Institution, National Museum of Natural History, Washington, DC 20560-0105
| | - Keith A Crandall
- Computational Biology Institute and Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes 111711 Bogotá, Colombia
| | - Robert P Davey
- Engineering Biology, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | | | - Qi Fang
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen 518083, China
| | - Wilfried Haerty
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Neil Hall
- Genome British Columbia, Vancouver, BC V5Z 0C4, Canada
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Katharina J Hoff
- Institute of Mathematics and Computer Science, Center for Functional Genomics of Microbes, University of Greifswald 17489 Greifswald, Germany
| | - Kerstin Howe
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Erich D Jarvis
- Vertebrate Genomes Lab, The Rockefeller University, New York, NY 10065
- HHMI, Chevy Chase, MD 20815
| | - Warren E Johnson
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630
- The Walter Reed Biosystematics Unit, Museum Support Center MRC-534, Smithsonian Institution, Suitland, MD 20746-2863
| | - Rebecca N Johnson
- Smithsonian Institution, National Museum of Natural History, Washington, DC 20560-0105
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, United Kingdom
| | - Xin Liu
- China National GeneBank, Shenzhen 518120, China
| | - Jose Victor Lopez
- Halmos College of Arts and Sciences, Guy Harvey Oceanographic Center, Nova Southeastern University, Dania Beach, FL 33004
| | - Eugene W Myers
- Department of Systems Biology, Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
| | | | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD 20894
| | - Monica F Poelchau
- National Agricultural Library, USDA Agricultural Research Service, Beltsville, MD 20705
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD 20894
| | | | - Sunil Kumar Sahu
- China National GeneBank, Shenzhen 518120, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen 518083, China
| | - Nicholas A Salmon
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
| | - David Swarbreck
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
| | - Sibo Wang
- China National GeneBank, Shenzhen 518120, China
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, Computational Biology Core, University of Connecticut, Storrs, CT 06269
| | - Guojie Zhang
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen 1165 Copenhagen, Denmark
- China National Genebank, BGI-Shenzhen 518083 Shenzhen, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences 650223 Kunming, China
| | - He Zhang
- BGI-Qingdao, BGI-Shenzhen 266555 Qingdao, China
| | - Harris A Lewin
- University of California Davis Genome Center, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| | - Stephen Richards
- University of California Davis Genome Center, University of California, Davis, CA 95616
|
7
|
Kress WJ, Soltis DE, Kersey PJ, Wegrzyn JL, Leebens-Mack JH, Gostel MR, Liu X, Soltis PS. Green plant genomes: What we know in an era of rapidly expanding opportunities. Proc Natl Acad Sci U S A 2022; 119:e2115640118. [PMID: 35042803 PMCID: PMC8795535 DOI: 10.1073/pnas.2115640118] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 01/01/2023] Open
Abstract
Green plants play a fundamental role in ecosystems, human health, and agriculture. As de novo genomes are being generated for all known eukaryotic species as advocated by the Earth BioGenome Project, increasing genomic information on green land plants is essential. However, setting standards for the generation and storage of the complex set of genomes that characterize the green lineage of life is a major challenge for plant scientists. Such standards will need to accommodate the immense variation in green plant genome size, transposable element content, and structural complexity while enabling research into the molecular and evolutionary processes that have resulted in this enormous genomic variation. Here we provide an overview and assessment of the current state of knowledge of green plant genomes. To date fewer than 300 complete chromosome-scale genome assemblies representing fewer than 900 species have been generated across the estimated 450,000 to 500,000 species in the green plant clade. These genomes range in size from 12 Mb to 27.6 Gb and are biased toward agricultural crops with large branches of the green tree of life untouched by genomic-scale sequencing. Locating suitable tissue samples of most species of plants, especially those taxa from extreme environments, remains one of the biggest hurdles to increasing our genomic inventory. Furthermore, the annotation of plant genomes is at present undergoing intensive improvement. It is our hope that this fresh overview will help in the development of genomic quality standards for a cohesive and meaningful synthesis of green plant genomes as we scale up for the future.
Affiliation(s)
- W John Kress
- National Museum of Natural History, Smithsonian Institution, Department of Botany, Washington, DC 20013-7012
- Department of Biological Sciences, Dartmouth College, Hanover, NH 03755
- Arnold Arboretum, Harvard University, Boston, MA 02130
| | - Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
- Department of Biology, University of Florida, Gainesville, FL 32611
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, Institute for Systems Genomics: Computational Biology Core, University of Connecticut, Storrs, CT 06269-3214
| | - James H Leebens-Mack
- Department of Plant Biology, 2101 Miller Plant Sciences, University of Georgia, Athens, GA 30602-7271
| | - Morgan R Gostel
- Botanical Research Institute of Texas, Fort Worth, TX 76107-3400
| | - Xin Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518120, China
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
|
8
|
Lewin HA, Richards S, Lieberman Aiden E, Allende ML, Archibald JM, Bálint M, Barker KB, Baumgartner B, Belov K, Bertorelle G, Blaxter ML, Cai J, Caperello ND, Carlson K, Castilla-Rubio JC, Chaw SM, Chen L, Childers AK, Coddington JA, Conde DA, Corominas M, Crandall KA, Crawford AJ, DiPalma F, Durbin R, Ebenezer TE, Edwards SV, Fedrigo O, Flicek P, Formenti G, Gibbs RA, Gilbert MTP, Goldstein MM, Graves JM, Greely HT, Grigoriev IV, Hackett KJ, Hall N, Haussler D, Helgen KM, Hogg CJ, Isobe S, Jakobsen KS, Janke A, Jarvis ED, Johnson WE, Jones SJM, Karlsson EK, Kersey PJ, Kim JH, Kress WJ, Kuraku S, Lawniczak MKN, Leebens-Mack JH, Li X, Lindblad-Toh K, Liu X, Lopez JV, Marques-Bonet T, Mazard S, Mazet JAK, Mazzoni CJ, Myers EW, O'Neill RJ, Paez S, Park H, Robinson GE, Roquet C, Ryder OA, Sabir JSM, Shaffer HB, Shank TM, Sherkow JS, Soltis PS, Tang B, Tedersoo L, Uliano-Silva M, Wang K, Wei X, Wetzer R, Wilson JL, Xu X, Yang H, Yoder AD, Zhang G. The Earth BioGenome Project 2020: Starting the clock. Proc Natl Acad Sci U S A 2022; 119:e2115635118. [PMID: 35042800 PMCID: PMC8795548 DOI: 10.1073/pnas.2115635118] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Indexed: 12/22/2022] Open
Affiliation(s)
- Harris A Lewin
- Department of Evolution and Ecology, College of Biological Sciences, University of California, Davis, CA 95616
- Department of Population Health and Reproduction, University of California, Davis, CA 95616
| | - Stephen Richards
- University of California Davis Genome Center, University of California, Davis, CA 95616
| | - Erez Lieberman Aiden
- DNA Zoo and The Center for Genome Architecture, Baylor College of Medicine, Houston, TX 77030
| | - Miguel L Allende
- Center for Genome Regulation, Universidad de Chile 3425 Santiago, Chile
- Facultad de Ciencias, Universidad de Chile 3425 Santiago, Chile
| | - John M Archibald
- Department of Biochemistry & Molecular Biology, Dalhousie University, Halifax, NS B3H 4H7, Canada
| | - Miklós Bálint
- LOEWE Centre of Translational Biodiversity Genomics, Senckenberg Leibniz Institution for Biodiversity and Earth System Research 60325 Frankfurt am Main, Germany
- Institute for Insect Biotechnology, Justus-Liebig University 35392 Giessen, Germany
| | - Katharine B Barker
- Global Genome Biodiversity Network Secretariat, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | | | - Katherine Belov
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Giorgio Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara 44121 Ferrara, Italy
| | - Mark L Blaxter
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - Jing Cai
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Nicolette D Caperello
- University of California Davis Genome Center, University of California, Davis, CA 95616
| | - Keith Carlson
- The Novim Group, University of California, Santa Barbara, CA 93106
| | | | - Shu-Miaw Chaw
- Biodiversity Research Center, Academia Sinica 11529 Taipei, Taiwan
| | - Lei Chen
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Anna K Childers
- Bee Research Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705
| | - Jonathan A Coddington
- Global Genome Initiative, National Museum of Natural History, Smithsonian Institution, Washington, DC 20560
| | - Dalia A Conde
- Conservation Science, Species360 Conservation Science Alliance, Bloomington, MN 55425
- Department of Biology, University of Southern Denmark 5230 Odense M, Denmark
| | - Montserrat Corominas
- Department of Genetics, Microbiology, and Statistics, Universitat de Barcelona 08028 Barcelona, Spain
- Catalan Society for Biology, Institute for Catalan Studies 08001 Barcelona, Spain
| | - Keith A Crandall
- Department of Biostatistics & Bioinformatics, Computational Biology Institute, George Washington University, Washington, DC 20052
- Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC 20052
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes 111711 Bogotá, Colombia
| | | | - Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - ThankGod E Ebenezer
- UniProt, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Scott V Edwards
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138
- Museum of Comparative Zoology, Harvard University, Cambridge, MA 02138
| | - Olivier Fedrigo
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
| | - Paul Flicek
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, United Kingdom
| | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY 10065
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030
| | - M Thomas P Gilbert
- GLOBE Institute, University of Copenhagen 1350 Copenhagen, Denmark
- University Museum, Norwegian University of Science and Technology 7491 Trondheim, Norway
| | - Melissa M Goldstein
- Department of Health Policy and Management, George Washington University, Washington, DC 20052
| | - Jennifer Marshall Graves
- School of Life Sciences, La Trobe University, Bundoora, VIC 3086, Australia
- Institute for Applied Ecology, University of Canberra, Bruce, ACT 2617, Australia
| | - Henry T Greely
- Stanford Law School, Stanford University, Stanford, CA 94305
| | - Igor V Grigoriev
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
| | - Kevin J Hackett
- Office of National Programs, US Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705
| | - Neil Hall
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - David Haussler
- Genome Institute, University of California, Santa Cruz, CA 95060
- HHMI, Chevy Chase, MD 20815
| | - Kristofer M Helgen
- Australian Museum Research Institute, Australian Museum, Sydney, NSW 2000, Australia
| | - Carolyn J Hogg
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
| | - Sachiko Isobe
- Department of Frontier Research and Development, Kazusa DNA Research Institute, Chiba 292-0818, Japan
| | | | - Axel Janke
- LOEWE Centre of Translational Biodiversity Genomics, Senckenberg Leibniz Institution for Biodiversity and Earth System Research 60325 Frankfurt am Main, Germany
| | - Erich D Jarvis
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
- HHMI, Chevy Chase, MD 20815
| | - Warren E Johnson
- Walter Reed Biosystematics Unit, Smithsonian Institution, Suitland, MD 20746
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630
| | - Steven J M Jones
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
| | - Elinor K Karlsson
- Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond TW9 3AE, United Kingdom
| | - Jin-Hyoung Kim
- Division of Life Sciences, Korea Polar Research Institute 21990 Incheon, South Korea
| | - W John Kress
- Museum of Natural History, Smithsonian Institution, Washington, DC 20013-7012
| | - Shigehiro Kuraku
- Department of Genomics and Evolutionary Biology, National Institute of Genetics 411-8540 Shizuoka, Japan
- Laboratory for Phyloinformatics, RIKEN Center for Biosystems Dynamics Research 650-0047 Hyogo, Japan
| | - Mara K N Lawniczak
- Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | | | - Xueyan Li
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Yunnan, China
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University 752 36 Uppsala, Sweden
| | - Xin Liu
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Jose V Lopez
- Department of Biological Sciences, Halmos College of Arts and Sciences, Nova Southeastern University, Dania Beach, FL 33004
- Guy Harvey Oceanographic Center, Dania Beach, FL 33004
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology, Pompeu Fabra University, Consejo Superior de Investigaciones Cientificas, Parc de Recerca Biomedica de Barcelona 08003 Barcelona, Spain
- Catalan Institute of Research and Advanced Studies 08010 Barcelona, Spain
- Centre Nacional d'Anàlisi Genòmica, Centre for Genomic Regulation, Barcelona Institute of Science and Technology 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona 08193 Barcelona, Spain
| | - Sophie Mazard
- Bioplatforms Australia, Macquarie University, Sydney, NSW 2109, Australia
| | - Jonna A K Mazet
- One Health Institute, University of California Davis, CA 95616
| | - Camila J Mazzoni
- Berlin Center for Genomics in Biodiversity Research 14195 Berlin, Germany
- Evolutionary Genetics Department, Leibniz Institute for Zoo and Wildlife Research 10315 Berlin, Germany
| | - Eugene W Myers
- Max Planck Institute for Molecular Cell Biology and Genetics 01307 Dresden, Germany
| | - Rachel J O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
| | - Sadye Paez
- Laboratory of the Neurogenetics of Language, The Rockefeller University, New York, NY 10065
| | - Hyun Park
- Division of Biotechnology, Korea University 02841 Seoul, Korea
| | - Gene E Robinson
- Department of Entomology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Cristina Roquet
- Systematics and Evolution of Vascular Plants Associated Unit to Consejo Superior de Investigaciones Cientificas, Departament de Biologia Animal, Biologia Vegetal i Ecologia, Universitat Autònoma de Barcelona 08193 Bellaterra, Spain
- Laboratoire d'Ecologie Alpine, University Grenoble Alpes, University Savoie Mont Blanc, CNRS 38000 Grenoble, France
| | - Oliver A Ryder
- Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, CA 92027
- Division of Biology, Department of Evolution, Behavior, and Ecology, University of California, San Diego, La Jolla, CA 92039
| | - Jamal S M Sabir
- Department of Biological Sciences, Faculty of Science, King Abdulaziz University 21589 Jeddah, Saudi Arabia
- Centre of Excellence in Bionanoscience Research, King Abdulaziz University 21589 Jeddah, Saudi Arabia
| | - H Bradley Shaffer
- La Kretz Center for California Conservation Science, Institute of Environment and Sustainability, University of California, Los Angeles, CA 90024
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
| | - Timothy M Shank
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543
| | - Jacob S Sherkow
- Department of Entomology, Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- College of Law, University of Illinois at Urbana-Champaign, Champaign, IL 61820
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
| | - Boping Tang
- Jiangsu Key Laboratory for Bioresources of Saline Soils, Jiangsu Provincial Key Laboratory of Coastal Wetland Bioresources and Environmental Protection, Jiangsu Synthetic Innovation Center for Coastal Bio-agriculture, School of Wetlands, Yancheng Teachers University 224002 Yancheng, China
| | - Leho Tedersoo
- Center of Mycology and Microbiology, University of Tartu 50411 Tartu, Estonia
- College of Science, King Saud University 11451 Riyadh, Saudi Arabia
| | | | - Kun Wang
- School of Ecology and Environment, Northwestern Polytechnical University 710072 Xi'an, China
| | - Xiaofeng Wei
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Regina Wetzer
- Research and Collections, Natural History Museum of Los Angeles County, Los Angeles, CA 90007
- Biological Sciences, University of Southern California, Los Angeles, CA 90089
| | - Julia L Wilson
- Wellcome Sanger Institute, Cambridge CB10 1SA, United Kingdom
| | - Xun Xu
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Huanming Yang
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
| | - Anne D Yoder
- Department of Biology, Duke University, Durham, NC 27708
- Duke Center for Genomic and Computational Biology, Duke University, Durham, NC 27708
| | - Guojie Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Yunnan, China
- BGI-Research, Beijing Genomics Institute-Shenzhen 518083 Shenzhen, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen 2100 Copenhagen, Denmark
- China National Genebank, Beijing Genomics Institute 51803 Shenzhen, China
| |
|
9
|
Baker WJ, Bailey P, Barber V, Barker A, Bellot S, Bishop D, Botigué LR, Brewer G, Carruthers T, Clarkson JJ, Cook J, Cowan RS, Dodsworth S, Epitawalage N, Françoso E, Gallego B, Johnson MG, Kim JT, Leempoel K, Maurin O, McGinnie C, Pokorny L, Roy S, Stone M, Toledo E, Wickett NJ, Zuntini AR, Eiserhardt WL, Kersey PJ, Leitch IJ, Forest F. A Comprehensive Phylogenomic Platform for Exploring the Angiosperm Tree of Life. Syst Biol 2021; 71:301-319. [PMID: 33983440 PMCID: PMC8830076 DOI: 10.1093/sysbio/syab035] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 02/23/2021] [Revised: 05/06/2021] [Accepted: 05/08/2021] [Indexed: 12/22/2022] Open
Abstract
The tree of life is the fundamental biological roadmap for navigating the evolution and properties of life on Earth, and yet remains largely unknown. Even angiosperms (flowering plants) are fraught with data gaps, despite their critical role in sustaining terrestrial life. Today, high-throughput sequencing promises to significantly deepen our understanding of evolutionary relationships. Here, we describe a comprehensive phylogenomic platform for exploring the angiosperm tree of life, comprising a set of open tools and data based on the 353 nuclear genes targeted by the universal Angiosperms353 sequence capture probes. The primary goals of this article are to (i) document our methods, (ii) describe our first data release, and (iii) present a novel open data portal, the Kew Tree of Life Explorer (https://treeoflife.kew.org). We aim to generate novel target sequence capture data for all genera of flowering plants, exploiting natural history collections such as herbarium specimens, and augment it with mined public data. Our first data release, described here, is the most extensive nuclear phylogenomic data set for angiosperms to date, comprising 3099 samples validated by DNA barcode and phylogenetic tests, representing all 64 orders, 404 families (96%) and 2333 genera (17%). A “first pass” angiosperm tree of life was inferred from the data, which totaled 824,878 sequences, 489,086,049 base pairs, and 532,260 alignment columns, for interactive presentation in the Kew Tree of Life Explorer. This species tree was generated using methods that were rigorous, yet tractable at our scale of operation. Despite limitations pertaining to taxon and gene sampling, gene recovery, models of sequence evolution and paralogy, the tree strongly supports existing taxonomy, while challenging numerous hypothesized relationships among orders and placing many genera for the first time. The validated data set, species tree and all intermediates are openly accessible via the Kew Tree of Life Explorer and will be updated as further data become available. This major milestone toward a complete tree of life for all flowering plant species opens doors to a highly integrated future for angiosperm phylogenomics through the systematic sequencing of standardized nuclear markers. Our approach has the potential to serve as a much-needed bridge between the growing movement to sequence the genomes of all life on Earth and the vast phylogenomic potential of the world’s natural history collections. [Angiosperms; Angiosperms353; genomics; herbariomics; museomics; nuclear phylogenomics; open access; target sequence capture; tree of life.]
Affiliation(s)
- William J Baker
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Paul Bailey
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Vanessa Barber
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Abigail Barker
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Sidonie Bellot
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - David Bishop
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Laura R Botigué
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
- Centre for Research in Agricultural Genomics, Campus UAB, Edifici CRAG, Bellaterra Cerdanyola del Vallès, 08193 Barcelona, Spain
| | - Grace Brewer
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Tom Carruthers
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - James J Clarkson
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Jeffrey Cook
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Robyn S Cowan
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Steven Dodsworth
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
- School of Life Sciences, University of Bedfordshire, University Square, Luton LU1 3JU, United Kingdom
| | | | - Elaine Françoso
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Berta Gallego
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Matthew G Johnson
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
| | - Jan T Kim
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
- Department of Computer Science, School of Physics, Engineering and Computer Science, University of Hertfordshire, Hatfield, Hertfordshire, AL10 9AB, United Kingdom
| | - Kevin Leempoel
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Olivier Maurin
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | | | - Lisa Pokorny
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
- Centre for Plant Biotechnology and Genomics (CBGP) UPM-INIA, 28223 Pozuelo de Alarcón (Madrid), Spain
| | - Shyamali Roy
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Malcolm Stone
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Eduardo Toledo
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Norman J Wickett
- Plant Science and Conservation, Chicago Botanic Garden, 1000 Lake Cook Road, Glencoe, IL 60022, USA
| | | | - Wolf L Eiserhardt
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
- Department of Biology, Aarhus University, 8000 Aarhus C, Denmark
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Ilia J Leitch
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| | - Félix Forest
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, United Kingdom
| |
|
10
|
Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 01/27/2023] Open
Abstract
Gramene (http://www.gramene.org), a knowledgebase founded on comparative functional analyses of genomic and pathway data for model plants and major crops, supports agricultural researchers worldwide. The resource is committed to open access and reproducible science based on the FAIR data principles. Since the last NAR update, we made nine releases; doubled the genome portal's content; expanded curated genes, pathways and expression sets; and implemented the Domain Informational Vocabulary Extraction (DIVE) algorithm for extracting gene function information from publications. The current release, #63 (October 2020), hosts 93 reference genomes, with over 3.9 million genes in 122 947 families with orthologous and paralogous classifications. Plant Reactome portrays pathway networks using a combination of manual biocuration in rice (320 reference pathways) and orthology-based projections to 106 species. The Reactome platform facilitates comparison between reference and projected pathways, gene expression analyses and overlays of gene-gene interactions. Gramene integrates ontology-based protein structure-function annotation; information on genetic, epigenetic, expression, and phenotypic diversity; and gene functional annotations extracted from plant-focused journals using DIVE. We train plant researchers in biocuration of genes and pathways; host curated maize gene structures as tracks in the maize genome browser; and integrate curated rice genes and pathways in the Plant Reactome.
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Priyanka Garg
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Justin Cook
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto M5G 1L7, Canada
| | - Daniel Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
| | - Peter D'Eustachio
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Amit Gupta
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Weijia Xu
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Jennifer Regala
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Current affiliation: American Urological Association, Linthicum, MD 21090, USA
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Crispin Taylor
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
| |
|
11
|
Tello-Ruiz MK, Naithani S, Gupta P, Olson A, Wei S, Preece J, Jiao Y, Wang B, Chougule K, Garg P, Elser J, Kumari S, Kumar V, Contreras-Moreira B, Naamati G, George N, Cook J, Bolser D, D'Eustachio P, Stein LD, Gupta A, Xu W, Regala J, Papatheodorou I, Kersey PJ, Flicek P, Taylor C, Jaiswal P, Ware D. Gramene 2021: harnessing the power of comparative genomics and pathways for plant research. Nucleic Acids Res 2021; 49:D1452-D1463. [PMID: 33170273 DOI: 10.1093/nar/gkaa979/5973447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Accepted: 10/09/2020] [Indexed: 05/20/2023] Open
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Priyanka Garg
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Justin Cook
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto M5G 1L7, Canada
| | - Daniel Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Geromics Inc., Cambridge CB1 3NF, UK
| | - Peter D'Eustachio
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Amit Gupta
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Weijia Xu
- Texas Advanced Computing Center, University of Texas at Austin, Austin, TX 78758, USA
| | - Jennifer Regala
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
- Current affiliation: American Urological Association, Linthicum, MD 21090, USA
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Current affiliation: Royal Botanic Gardens, Kew Richmond, Surrey TW9 3AE, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Crispin Taylor
- American Society of Plant Biologists, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA
| |
|
12
|
Williamson HF, Brettschneider J, Caccamo M, Davey RP, Goble C, Kersey PJ, May S, Morris RJ, Ostler R, Pridmore T, Rawlings C, Studholme D, Tsaftaris SA, Leonelli S. Data management challenges for artificial intelligence in plant and agricultural research. F1000Res 2021; 10:324. [PMID: 36873457 PMCID: PMC9975417 DOI: 10.12688/f1000research.52204.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/12/2023] [Indexed: 01/19/2023] Open
Abstract
Artificial Intelligence (AI) is increasingly used within plant science, yet it is far from being routinely and effectively implemented in this domain. Particularly relevant to the development of novel food and agricultural technologies is the development of validated, meaningful and usable ways to integrate, compare and visualise large, multi-dimensional datasets from different sources and scientific approaches. After a brief summary of the reasons for the interest in data science and AI within plant science, the paper identifies and discusses eight key challenges in data management that must be addressed to further unlock the potential of AI in crop and agronomic research, and particularly the application of Machine Learning (ML), which holds much promise for this domain.
Affiliation(s)
- Hugh F Williamson
- Exeter Centre for the Study of the Life Sciences & Institute for Data Science and Artificial Intelligence, University of Exeter, Exeter, UK
| | | | - Mario Caccamo
- NIAB, National Research Institute of Brewing, East Malling, UK
| | | | - Carole Goble
- Department of Computer Science, University of Manchester, Manchester, UK
| | | | - Sean May
- School of Biosciences, University of Nottingham, Loughborough, UK
| | | | - Richard Ostler
- Department of Computational and Analytical Sciences, Rothamsted Research, Harpenden, UK
| | - Tony Pridmore
- School of Computer Science, University of Nottingham, Nottingham, UK
| | - Chris Rawlings
- Department of Computational and Analytical Sciences, Rothamsted Research, Harpenden, UK
| | | | - Sotirios A Tsaftaris
- Institute of Digital Communications, University of Edinburgh, Edinburgh, UK
- Alan Turing Institute, London, UK
| | - Sabina Leonelli
- Exeter Centre for the Study of the Life Sciences & Institute for Data Science and Artificial Intelligence, University of Exeter, Exeter, UK
- Alan Turing Institute, London, UK
| |
|
13
|
McCouch S, Navabi ZK, Abberton M, Anglin NL, Barbieri RL, Baum M, Bett K, Booker H, Brown GL, Bryan GJ, Cattivelli L, Charest D, Eversole K, Freitas M, Ghamkhar K, Grattapaglia D, Henry R, Valadares Inglis MC, Islam T, Kehel Z, Kersey PJ, King GJ, Kresovich S, Marden E, Mayes S, Ndjiondjop MN, Nguyen HT, Paiva SR, Papa R, Phillips PWB, Rasheed A, Richards C, Rouard M, Amstalden Sampaio MJ, Scholz U, Shaw PD, Sherman B, Staton SE, Stein N, Svensson J, Tester M, Montenegro Valls JF, Varshney R, Visscher S, von Wettberg E, Waugh R, Wenzl P, Rieseberg LH. Mobilizing Crop Biodiversity. Mol Plant 2020; 13:1341-1344. [PMID: 32835887 DOI: 10.1016/j.molp.2020.08.011] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 08/18/2020] [Revised: 08/19/2020] [Accepted: 08/19/2020] [Indexed: 05/10/2023]
Affiliation(s)
- Susan McCouch
- Plant Breeding and Genetics, School of Integrated Plant Sciences, Cornell University, Ithaca, NY, 14853, USA
| | - Zahra Katy Navabi
- DivSeek, Global Institute for Food Security, 110 Gymnasium Place, University of Saskatchewan, Saskatoon, SK, S7N 0W9, Canada
- Global Institute for Food Security, 110 Gymnasium Place, University of Saskatchewan, Saskatoon, SK, S7N 4J8, Canada
| | - Michael Abberton
- International Institute of Tropical Agriculture (IITA), PMB 5320, Oyo Rd, Ibadan, Nigeria
| | - Noelle L Anglin
- International Potato Center (CIP) 1895 Avenida La Molina, Lima Peru 12, Lima 15023, Peru
| | - Rosa Lia Barbieri
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Michael Baum
- International Center for Agricultural Research in the Dry Areas (ICARDA), Station Exp. INRA-Quich. Rue Hafiane Cherkaoui. Agdal. Rabat - Instituts, 10111, Rabat, Morocco
| | - Kirstin Bett
- Department of Plant Sciences, University of Saskatchewan, 51 Campus Dr., Saskatoon, SK S7N 5A8, Canada
| | - Helen Booker
- Department of Plant Agriculture, University of Guelph, Rm 316, Crop Science Bldg, 50 Stone Rd E, Guelph, ON N1G 2W1, Canada
| | - Gerald L Brown
- Genome Prairie, 111 Research Drive, Suite 101, Saskatoon, SK, S7N 3R2, Canada
| | - Glenn J Bryan
- The James Hutton Institute, Errol Road, Invergowrie, Dundee, DD2 5DA, UK
| | - Luigi Cattivelli
- CREA, Research Centre for Genomics and Bioinformatics, via San Protaso 302, Fiorenzuola d'Arda, 29017, Italy
| | - David Charest
- Genome British Columbia, 400-575 West 8th Avenue, Vancouver, BC, V5Z 0C4, Canada
| | - Kellye Eversole
- International Wheat Genome Sequencing Consortium, 2841 NE Marywood Ct, Lee's Summit, MO, 64086, USA
| | - Marcelo Freitas
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Kioumars Ghamkhar
- Forage Science, Grasslands Research Centre, AgResearch, Palmerston North, 4410, New Zealand
| | - Dario Grattapaglia
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Robert Henry
- Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane, QLD 4072, Australia
| | - Maria Cleria Valadares Inglis
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Tofazzal Islam
- Institute of Biotechnology and Genetic Engineering (IBGE), Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur 1706, Bangladesh
| | - Zakaria Kehel
- International Center for Agricultural Research in the Dry Areas (ICARDA), Station Exp. INRA-Quich. Rue Hafiane Cherkaoui. Agdal. Rabat - Instituts, 10111, Rabat, Morocco
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK
| | - Graham J King
- Southern Cross University, PO Box 157, Lismore, NSW 2480, Australia
| | - Stephen Kresovich
- Feed the Future Innovation Lab for Crop Improvement, 431 Weill Hall, Cornell University, Ithaca, NY, 14853, USA
| | - Emily Marden
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC V6R 2A5, Canada
| | - Sean Mayes
- Crops For the Future (UK) CIC 76-80 Baddow Road, Chelmsford, Essex, CM2 7PJ, UK
| | - Marie Noelle Ndjiondjop
- Africa Rice Center (AfricaRice), Mbe Research Station, Bouaké, 01 BP 2511 Bouaké, Côte d'Ivoire
| | - Henry T Nguyen
- University of Missouri, Division of Plant Sciences, 25 Agriculture Lab Bldg, College of Agriculture, Food and Natural Resources, University of Missouri, Columbia, MO 65211, USA
| | - Samuel Rezende Paiva
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Roberto Papa
- Università Politecnica delle Marche, D3A-Dipartimento di Scienze Agrarie, Alimentari e Ambientali, Via Brecce Bianche, 60131, Ancona, Italy
| | - Peter W B Phillips
- Johnson Shoyama Graduate School of Public Policy, University of Saskatchewan, 101 Diefenbaker Place, Saskatoon, S7N 5B8, Canada
| | - Awais Rasheed
- CIMMYT-China office, Beijing 100081, Beijing, P.R. China
| | - Christopher Richards
- USDA-ARS National Laboratory for Genetic Resources Preservation, 1111 South Mason St, Fort Collins, CO, 80521, USA
| | - Mathieu Rouard
- Bioversity International, Parc Scientifique Agropolis II, 34397, Montpellier, Cedex 5, France
| | - Maria Jose Amstalden Sampaio
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Corrensstr. 3, D-06466 Seeland, Germany
| | - Paul D Shaw
- The James Hutton Institute, Errol Road, Invergowrie, Dundee, DD2 5DA, UK
| | - Brad Sherman
- Law School, University of Queensland, St Lucia, QLD, 4072, Australia
| | - S Evan Staton
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC V6R 2A5, Canada
| | - Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Corrensstr. 3, D-06466 Seeland, Germany
- CiBreed - Center for Integrated Breeding Research, Department of Crop Sciences, Georg-August University Göttingen, Von Siebold Straße 8, D-37075 Göttingen, Germany
| | | | - Mark Tester
- King Abdullah University of Science & Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
| | - Jose Francisco Montenegro Valls
- Embrapa Genetic Resources and Biotechnology, Parque Estação Biológica, Final Av W5 Norte, Caixa Postal 02372, 70770-917 - Brasília DF, Brazil
| | - Rajeev Varshney
- Center of Excellence in Genomics & Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru - 502 324, Telangana State, India
| | - Stephen Visscher
- Global Institute for Food Security, 110 Gymnasium Place, University of Saskatchewan, Saskatoon, SK, S7N 4J8, Canada
| | - Eric von Wettberg
- University of Vermont, 63 Carrigan Drive, Jeffords Hall, Burlington, VT, 05405, USA
| | - Robbie Waugh
- The James Hutton Institute, Errol Road, Invergowrie, Dundee, DD2 5DA, UK; School of Agriculture and Wine & Waite Research Institute, University of Adelaide, Waite Campus, Glen Osmond, SA, 5064, Australia
| | - Peter Wenzl
- Centro Internacional de Agricultura Tropical (CIAT), Km 17 Recta Cali-Palmira, 763537 Cali, Colombia
| | - Loren H Rieseberg
- Department of Botany and Biodiversity Research Centre, University of British Columbia, Vancouver, BC V6R 2A5, Canada.
|
14
|
Pérez-Escobar OA, Richardson JE, Howes MJR, Lucas E, Álvarez de Róman N, Collemare J, Graham IA, Gratzfeld J, Kersey PJ, Leitch IJ, Paton A, Hollingsworth PM, Antonelli A. Untapped resources for medical research. Science 2020; 369:781-782. [DOI: 10.1126/science.abc8085] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
Affiliation(s)
| | - James E. Richardson
- Department of Biology, Faculty of Natural Sciences, Universidad del Rosario, Bogotá, Colombia
- Royal Botanic Garden Edinburgh, Edinburgh, EH3 5LR, UK
| | - Melanie-Jayne R. Howes
- Royal Botanic Gardens, Kew, TW9 3AE, UK
- Institute of Pharmaceutical Science, Faculty of Life Sciences & Medicine, King's College London, SE1 9NH, UK
| | - Eve Lucas
- Royal Botanic Gardens, Kew, TW9 3AE, UK
| | | | | | - Ian A. Graham
- Department of Biology, Centre for Novel Agricultural Products, University of York, York, YO10 5DD, UK
| | | | | | | | | | | | - Alexandre Antonelli
- Royal Botanic Gardens, Kew, TW9 3AE, UK
- Gothenburg Global Biodiversity Centre and University of Gothenburg, Gothenburg, Sweden
|
15
|
Papoutsoglou EA, Faria D, Arend D, Arnaud E, Athanasiadis IN, Chaves I, Coppens F, Cornut G, Costa BV, Ćwiek-Kupczyńska H, Droesbeke B, Finkers R, Gruden K, Junker A, King GJ, Krajewski P, Lange M, Laporte MA, Michotey C, Oppermann M, Ostler R, Poorter H, Ramírez-Gonzalez R, Ramšak Ž, Reif JC, Rocca-Serra P, Sansone SA, Scholz U, Tardieu F, Uauy C, Usadel B, Visser RGF, Weise S, Kersey PJ, Miguel CM, Adam-Blondon AF, Pommier C. Enabling reusability of plant phenomic datasets with MIAPPE 1.1. New Phytol 2020. [PMID: 32171029 DOI: 10.15454/1yxvzv] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023]
Abstract
Enabling data reuse and knowledge discovery is increasingly critical in modern science, and requires an effort towards standardising data publication practices. This is particularly challenging in the plant phenotyping domain, due to its complexity and heterogeneity. We have produced the MIAPPE 1.1 release, which enhances the existing MIAPPE standard in coverage, to support perennial plants, in structure, through an explicit data model, and in clarity, through definitions and examples. We evaluated MIAPPE 1.1 by using it to express several heterogeneous phenotyping experiments in a range of different formats, to demonstrate its applicability and the interoperability between the various implementations. Furthermore, the extended coverage is demonstrated by the fact that one of the datasets could not have been described under MIAPPE 1.0. MIAPPE 1.1 marks a major step towards enabling plant phenotyping data reusability, thanks to its extended coverage, and especially the formalisation of its data model, which facilitates its implementation in different formats. Community feedback has been critical to this development, and will be a key part of ensuring adoption of the standard.
Affiliation(s)
- Evangelia A Papoutsoglou
- Plant Breeding, Wageningen University & Research, PO Box 386, Wageningen, 6700AJ, the Netherlands
| | - Daniel Faria
- BioData.pt, Instituto Gulbenkian de Ciência, 2780-156, Oeiras, Portugal
- INESC-ID, 1000-029, Lisboa, Portugal
| | - Daniel Arend
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - Elizabeth Arnaud
- Bioversity International, Parc Scientifique Agropolis II, Montpellier Cedex 5, 34397, France
| | - Ioannis N Athanasiadis
- Geo-Information Science and Remote Sensing Laboratory, Wageningen University, Droevendaalsesteeg 3, Wageningen, 6708PB, the Netherlands
| | - Inês Chaves
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB NOVA) Avenida da República, 2780-157, Oeiras, Portugal
- Instituto de Biologia Experimental e Tecnológica (iBET), 2780-157, Oeiras, Portugal
| | - Frederik Coppens
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, Ghent, 9052, Belgium
- VIB Center for Plant Systems Biology, Technologiepark 71, Ghent, 9052, Belgium
| | | | - Bruno V Costa
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB NOVA) Avenida da República, 2780-157, Oeiras, Portugal
- BioISI - Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, Lisboa, 1749-016, Portugal
| | - Hanna Ćwiek-Kupczyńska
- Institute of Plant Genetics, Polish Academy of Sciences, ul. Strzeszyńska 34, 60-479, Poznań, Poland
| | - Bert Droesbeke
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, Ghent, 9052, Belgium
- VIB Center for Plant Systems Biology, Technologiepark 71, Ghent, 9052, Belgium
| | - Richard Finkers
- Plant Breeding, Wageningen University & Research, PO Box 386, Wageningen, 6700AJ, the Netherlands
| | - Kristina Gruden
- Department of Biotechnology and Systems Biology, National Institute of Biology, SI1000, Ljubljana, Slovenia
| | - Astrid Junker
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - Graham J King
- Southern Cross Plant Science, Southern Cross University, Lismore, NSW 2577, Australia
| | - Paweł Krajewski
- Institute of Plant Genetics, Polish Academy of Sciences, ul. Strzeszyńska 34, 60-479, Poznań, Poland
| | - Matthias Lange
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - Marie-Angélique Laporte
- Bioversity International, Parc Scientifique Agropolis II, Montpellier Cedex 5, 34397, France
| | - Célia Michotey
- Université Paris-Saclay, INRAE, URGI, Versailles, 78026, France
| | - Markus Oppermann
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - Richard Ostler
- Computational and Analytical Sciences, Rothamsted Research, Harpenden, AL5 2JQ, UK
| | - Hendrik Poorter
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, D-52425, Jülich, Germany
- Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia
| | | | - Živa Ramšak
- Department of Biotechnology and Systems Biology, National Institute of Biology, SI1000, Ljubljana, Slovenia
| | - Jochen C Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - Philippe Rocca-Serra
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, UK
| | - Susanna-Assunta Sansone
- Oxford e-Research Centre, Department of Engineering Science, University of Oxford, 7 Keble Road, Oxford, OX1 3QG, UK
| | - Uwe Scholz
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | - François Tardieu
- INRA, Laboratoire d'Ecophysiologie des Plantes sous Stress Environnementaux, UMR759, Montpellier, 34060, France
| | - Cristobal Uauy
- Department of Crop Genetics, John Innes Centre, Norwich Research Park, Colney, Norwich, NR4 7UH, UK
| | - Björn Usadel
- Plant Sciences (IBG-2), Forschungszentrum Jülich GmbH, D-52425, Jülich, Germany
- Institute for Biology I, BioSC, RWTH Aachen University, Worringer Weg 3, 52074, Aachen, Germany
| | - Richard G F Visser
- Plant Breeding, Wageningen University & Research, PO Box 386, Wageningen, 6700AJ, the Netherlands
| | - Stephan Weise
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, 06466, Seeland, Germany
| | | | - Célia M Miguel
- Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa (ITQB NOVA) Avenida da República, 2780-157, Oeiras, Portugal
- BioISI - Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, Lisboa, 1749-016, Portugal
| | | | - Cyril Pommier
- Université Paris-Saclay, INRAE, URGI, Versailles, 78026, France
|
16
|
Papoutsoglou EA, Faria D, Arend D, Arnaud E, Athanasiadis IN, Chaves I, Coppens F, Cornut G, Costa BV, Ćwiek‐Kupczyńska H, Droesbeke B, Finkers R, Gruden K, Junker A, King GJ, Krajewski P, Lange M, Laporte M, Michotey C, Oppermann M, Ostler R, Poorter H, Ramírez‐Gonzalez R, Ramšak Ž, Reif JC, Rocca‐Serra P, Sansone S, Scholz U, Tardieu F, Uauy C, Usadel B, Visser RGF, Weise S, Kersey PJ, Miguel CM, Adam‐Blondon A, Pommier C. Enabling reusability of plant phenomic datasets with MIAPPE 1.1. New Phytol 2020; 227:260-273. [PMID: 32171029 PMCID: PMC7317793 DOI: 10.1111/nph.16544] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 11/22/2019] [Accepted: 02/24/2020] [Indexed: 05/21/2023]
|
17
|
Papoutsoglou EA, Faria D, Arend D, Arnaud E, Athanasiadis IN, Chaves I, Coppens F, Cornut G, Costa BV, Ćwiek-Kupczyńska H, Droesbeke B, Finkers R, Gruden K, Junker A, King GJ, Krajewski P, Lange M, Laporte MA, Michotey C, Oppermann M, Ostler R, Poorter H, Ramírez-Gonzalez R, Ramšak Ž, Reif JC, Rocca-Serra P, Sansone SA, Scholz U, Tardieu F, Uauy C, Usadel B, Visser RGF, Weise S, Kersey PJ, Miguel CM, Adam-Blondon AF, Pommier C. Enabling reusability of plant phenomic datasets with MIAPPE 1.1. New Phytol 2020. [PMID: 32171029 DOI: 10.15454/ah6u4a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2023]
|
18
|
Howe KL, Contreras-Moreira B, De Silva N, Maslen G, Akanni W, Allen J, Alvarez-Jarreta J, Barba M, Bolser DM, Cambell L, Carbajo M, Chakiachvili M, Christensen M, Cummins C, Cuzick A, Davis P, Fexova S, Gall A, George N, Gil L, Gupta P, Hammond-Kosack KE, Haskell E, Hunt SE, Jaiswal P, Janacek SH, Kersey PJ, Langridge N, Maheswari U, Maurel T, McDowall MD, Moore B, Muffato M, Naamati G, Naithani S, Olson A, Papatheodorou I, Patricio M, Paulini M, Pedro H, Perry E, Preece J, Rosello M, Russell M, Sitnik V, Staines DM, Stein J, Tello-Ruiz MK, Trevanion SJ, Urban M, Wei S, Ware D, Williams G, Yates AD, Flicek P. Ensembl Genomes 2020-enabling non-vertebrate genomic research. Nucleic Acids Res 2020; 48:D689-D695. [PMID: 31598706 PMCID: PMC6943047 DOI: 10.1093/nar/gkz890] [Citation(s) in RCA: 283] [Impact Index Per Article: 70.8] [Received: 09/14/2019] [Revised: 09/29/2019] [Accepted: 10/02/2019] [Indexed: 12/28/2022] Open
Abstract
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of interfaces to genomic data across the tree of life, including reference genome sequence, gene models, transcriptional data, genetic variation and comparative analysis. Data may be accessed via our website, online tools platform and programmatic interfaces, with updates made four times per year (in synchrony with Ensembl). Here, we provide an overview of Ensembl Genomes, with a focus on recent developments. These include the continued growth, more robust and reproducible sets of orthologues and paralogues, and enriched views of gene expression and gene function in plants. Finally, we report on our continued deeper integration with the Ensembl project, which forms a key part of our future strategy for dealing with the increasing quantity of available genome-scale data across the tree of life.
Affiliation(s)
- Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Bruno Contreras-Moreira
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nishadi De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gareth Maslen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Wasiu Akanni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - James Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jorge Alvarez-Jarreta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthieu Barba
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dan M Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Lahcen Cambell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Manuel Carbajo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Marc Chakiachvili
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mikkel Christensen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alayne Cuzick
- Department of Biointeractions and Crop Protection, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Silvie Fexova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Astrid Gall
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nancy George
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Kim E Hammond-Kosack
- Department of Biointeractions and Crop Protection, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Erin Haskell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sophie H Janacek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nick Langridge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Uma Maheswari
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Maurel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mark D McDowall
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ben Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Naamati
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Irene Papatheodorou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helder Pedro
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Marc Rosello
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthew Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Vasily Sitnik
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniel M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Joshua Stein
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | | | - Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Martin Urban
- Department of Biointeractions and Crop Protection, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, UK
| | - Sharon Wei
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
| | - Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrew D Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
19
|
Pedro H, Yates AD, Kersey PJ, De Silva NH. Collaborative Annotation Redefines Gene Sets for Crucial Phytopathogens. Front Microbiol 2019; 10:2477. [PMID: 31787936 PMCID: PMC6854995 DOI: 10.3389/fmicb.2019.02477] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/01/2019] [Accepted: 10/15/2019] [Indexed: 11/15/2022] Open
Abstract
Accurate and comprehensive annotation of genomic sequences underpins advances in managing plant disease. However, important plant pathogens still have incomplete and inconsistent gene sets and lack dedicated funding or teams to improve this annotation. This paper describes a collaborative approach to gene curation to address this shortcoming. In the first instance, over 40 members of the Botrytis cinerea community from eight countries, with training and infrastructural support from Ensembl Fungi, used the gene editing tool Apollo to systematically review the entire gene set (11,707 protein coding genes) in 6-7 months. This has subsequently been checked and disseminated. Following this, a similar project for another pathogen, Blumeria graminis f. sp. hordei, also led to a completely redefined gene set. Currently, we are working with the Zymoseptoria tritici community to enable them to achieve the same. While the tangible outcome of these projects is improved gene sets, it is apparent that the inherent agreement and ownership of a single gene set by research teams as they undergo this curation process are consequential to the acceleration of research in the field. With the generation of large data sets increasingly affordable, there is value in unifying both the divergent data sets and their associated research teams, pooling time, expertise, and resources. Community-driven annotation efforts can pave the way for a new kind of collaboration among pathogen research communities to generate well-annotated reference data sets, beneficial not just for the genome being examined but for related species and the refinement of automatic gene prediction tools.
Affiliation(s)
| | | | | | - Nishadi H. De Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom
|
20
|
Brewer GE, Clarkson JJ, Maurin O, Zuntini AR, Barber V, Bellot S, Biggs N, Cowan RS, Davies NMJ, Dodsworth S, Edwards SL, Eiserhardt WL, Epitawalage N, Frisby S, Grall A, Kersey PJ, Pokorny L, Leitch IJ, Forest F, Baker WJ. Factors Affecting Targeted Sequencing of 353 Nuclear Genes From Herbarium Specimens Spanning the Diversity of Angiosperms. Front Plant Sci 2019; 10:1102. [PMID: 31620145 PMCID: PMC6759688 DOI: 10.3389/fpls.2019.01102] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 06/07/2019] [Accepted: 08/12/2019] [Indexed: 05/03/2023]
Abstract
The world's herbaria collectively house millions of diverse plant specimens, including endangered or extinct species and type specimens. Unlocking genetic data from the typically highly degraded DNA obtained from herbarium specimens was difficult until the arrival of high-throughput sequencing approaches, which can be applied to low quantities of severely fragmented DNA. Target enrichment involves using short molecular probes that hybridise and capture genomic regions of interest for high-throughput sequencing. In this study on herbariomics, we used this targeted sequencing approach and the Angiosperms353 universal probe set to recover up to 351 nuclear genes from 435 herbarium specimens that are up to 204 years old and span the breadth of angiosperm diversity. We show that on average 207 genes were successfully retrieved from herbarium specimens, although the mean number of genes retrieved and target enrichment efficiency is significantly higher for silica gel-dried specimens. Forty-seven target nuclear genes were recovered from a herbarium specimen of the critically endangered St Helena boxwood, Mellissia begoniifolia, collected in 1815. Herbarium specimens yield significantly less high-molecular-weight DNA than silica gel-dried specimens, and genomic DNA quality declines with sample age, which is negatively correlated with target enrichment efficiency. Climate, taxon-specific traits, and collection strategies additionally impact target sequence recovery. We also detected taxonomic bias in targeted sequencing outcomes for the 10 most numerous angiosperm families that were investigated in depth. 
We recommend that (1) for species distributed in wet tropical climates, silica gel-dried specimens should be used preferentially; (2) for species distributed in seasonally dry tropical climates, herbarium and silica gel-dried specimens yield similar results, and either collection can be used; (3) taxon-specific traits should be explored and established for effective optimisation of taxon-specific studies using herbarium specimens; (4) all herbarium sheets should, in future, be annotated with details of the preservation method used; (5) long-term storage of herbarium specimens should be in stable, low-humidity, and low-temperature environments; and (6) targeted sequencing with universal probes, such as Angiosperms353, should be investigated closely as a new approach for DNA barcoding that will ensure better exploitation of herbarium specimens than traditional Sanger sequencing approaches.
Affiliation(s)
- Grace E. Brewer
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - James J. Clarkson
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Olivier Maurin
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | | | - Vanessa Barber
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Sidonie Bellot
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Nicola Biggs
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Robyn S. Cowan
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Nina M. J. Davies
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Steven Dodsworth
- School of Life Sciences, University of Bedfordshire, Luton, Bedfordshire, United Kingdom
| | - Sara L. Edwards
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Wolf L. Eiserhardt
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Department of Bioscience, Aarhus University, Ny Munkegade Aarhus C, Denmark
| | | | - Sue Frisby
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Aurélie Grall
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Paul J. Kersey
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Lisa Pokorny
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Centre for Plant Biotechnology and Genomics (CBGP, UPM-INIA), Pozuelo de Alarcón, Madrid, Spain
| | - Ilia J. Leitch
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - Félix Forest
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
| | - William J. Baker
- Science Directorate, Royal Botanic Gardens, Kew, Richmond, United Kingdom
|
21
|
Tello-Ruiz MK, Naithani S, Stein JC, Gupta P, Campbell M, Olson A, Wei S, Preece J, Geniza MJ, Jiao Y, Lee YK, Wang B, Mulvaney J, Chougule K, Elser J, Al-Bader N, Kumari S, Thomason J, Kumar V, Bolser DM, Naamati G, Tapanari E, Fonseca N, Huerta L, Iqbal H, Keays M, Munoz-Pomer Fuentes A, Tang A, Fabregat A, D'Eustachio P, Weiser J, Stein LD, Petryszak R, Papatheodorou I, Kersey PJ, Lockhart P, Taylor C, Jaiswal P, Ware D. Gramene 2018: unifying comparative genomics and pathway resources for plant research. Nucleic Acids Res 2019; 46:D1181-D1189. [PMID: 29165610 PMCID: PMC5753211 DOI: 10.1093/nar/gkx1111] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Received: 09/15/2017] [Accepted: 10/25/2017] [Indexed: 12/24/2022] Open
Abstract
Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.
Affiliation(s)
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Joshua C Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Parul Gupta
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Michael Campbell
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Matthew J Geniza
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Young Koung Lee
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; Division of Biological Sciences and Institute for Basic Science, Wonkwang University, Iksan 54538, Korea
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Joseph Mulvaney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Noor Al-Bader
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - James Thomason
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Vivek Kumar
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Daniel M Bolser
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Guy Naamati
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Electra Tapanari
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Nuno Fonseca
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Laura Huerta
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Haider Iqbal
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Maria Keays
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | | | - Amy Tang
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Antonio Fabregat
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Peter D'Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
| | - Joel Weiser
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
| | - Lincoln D Stein
- Adaptive Oncology Program, Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada
| | - Robert Petryszak
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Irene Papatheodorou
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Paul J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
| | - Patti Lockhart
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
| | - Crispin Taylor
- American Society of Plant Biologists, 15501 Monona Drive, Rockville, MD 20855-2768, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA
|
22
|
Bolt BJ, Rodgers FH, Shafie M, Kersey PJ, Berriman M, Howe KL. Using WormBase ParaSite: An Integrated Platform for Exploring Helminth Genomic Data. Methods Mol Biol 2018; 1757:471-491. [PMID: 29761467 DOI: 10.1007/978-1-4939-7737-6_15] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 01/01/2023]
Abstract
WormBase ParaSite (parasite.wormbase.org) is a comprehensive resource for the genomes of parasitic nematodes and flatworms (helminths). It currently includes genomic data for over 100 helminth species, adding value by way of consistent functional annotation, gene comparative analysis and gene expression analysis. We provide several ways of exploring the data including a choice of genome browsers, genome and gene summary pages, text and sequence searching, a query wizard, bulk downloads, and programmatic interfaces. WormBase ParaSite is released three to six times per year, and is developed in collaboration with WormBase (www.wormbase.org) and Ensembl Genomes (www.ensemblgenomes.org).
Affiliation(s)
- Bruce J Bolt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | | | - Myriam Shafie
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK
| | | | - Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, UK.
|
23
|
Clavijo BJ, Venturini L, Schudoma C, Accinelli GG, Kaithakottil G, Wright J, Borrill P, Kettleborough G, Heavens D, Chapman H, Lipscombe J, Barker T, Lu FH, McKenzie N, Raats D, Ramirez-Gonzalez RH, Coince A, Peel N, Percival-Alwyn L, Duncan O, Trösch J, Yu G, Bolser DM, Namaati G, Kerhornou A, Spannagl M, Gundlach H, Haberer G, Davey RP, Fosker C, Palma FD, Phillips AL, Millar AH, Kersey PJ, Uauy C, Krasileva KV, Swarbreck D, Bevan MW, Clark MD. An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations. Genome Res 2017; 27:885-896. [PMID: 28420692 PMCID: PMC5411782 DOI: 10.1101/gr.217117.116] [Citation(s) in RCA: 243] [Impact Index Per Article: 34.7] [Received: 10/13/2016] [Accepted: 03/14/2017] [Indexed: 01/16/2023]
Abstract
Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
Affiliation(s)
| | | | | | | | | | | | | | | | | | | | | | - Tom Barker
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
| | - Fu-Hao Lu
- John Innes Centre, Norwich, NR4 7UH, United Kingdom
| | | | - Dina Raats
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
| | | | | | - Ned Peel
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
| | | | - Owen Duncan
- ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia
| | - Josua Trösch
- ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia
| | - Guotai Yu
- John Innes Centre, Norwich, NR4 7UH, United Kingdom
| | - Dan M Bolser
- EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom
| | - Guy Namaati
- EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom
| | - Arnaud Kerhornou
- EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom
| | - Manuel Spannagl
- Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Heidrun Gundlach
- Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Georg Haberer
- Plant Genome and Systems Biology, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Robert P Davey
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
- University of East Anglia, Norwich, NR4 7TJ, United Kingdom
| | | | - Federica Di Palma
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
- University of East Anglia, Norwich, NR4 7TJ, United Kingdom
| | | | - A Harvey Millar
- ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley Western Australia 6009, Australia
| | - Paul J Kersey
- EMBL European Bioinformatics Institute, Hinxton, CB10 1SD, United Kingdom
| | | | - Ksenia V Krasileva
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
- University of East Anglia, Norwich, NR4 7TJ, United Kingdom
- The Sainsbury Laboratory, Norwich, NR4 7UH, United Kingdom
| | - David Swarbreck
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
- University of East Anglia, Norwich, NR4 7TJ, United Kingdom
| | | | - Matthew D Clark
- Earlham Institute, Norwich, NR4 7UZ, United Kingdom
- University of East Anglia, Norwich, NR4 7TJ, United Kingdom
|
24
|
Petrov AI, Kay SJE, Kalvari I, Howe KL, Gray KA, Bruford EA, Kersey PJ, Cochrane G, Finn RD, Bateman A, Kozomara A, Griffiths-Jones S, Frankish A, Zwieb CW, Lau BY, Williams KP, Chan PP, Lowe TM, Cannone JJ, Gutell R, Machnicka MA, Bujnicki JM, Yoshihama M, Kenmochi N, Chai B, Cole JR, Szymanski M, Karlowski WM, Wood V, Huala E, Berardini TZ, Zhao Y, Chen R, Zhu W, Paraskevopoulou MD, Vlachos IS, Hatzigeorgiou AG, Ma L, Zhang Z, Puetz J, Stadler PF, McDonald D, Basu S, Fey P, Engel SR, Cherry JM, Volders PJ, Mestdagh P, Wower J, Clark MB, Quek XC, Dinger ME. RNAcentral: a comprehensive database of non-coding RNA sequences. Nucleic Acids Res 2017; 45:D128-D134. [PMID: 27794554 PMCID: PMC5210518 DOI: 10.1093/nar/gkw1008] [Citation(s) in RCA: 132] [Impact Index Per Article: 18.9] [Received: 09/19/2016] [Revised: 10/13/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022] Open
Abstract
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.
|
25
|
Abstract
Ensembl Plants (http://plants.ensembl.org) is an integrative resource presenting genome-scale information for 39 sequenced plant species. Available data includes genome sequence, gene models, functional annotation, and polymorphic loci; for the latter, additional information including population structure, individual genotypes, linkage, and phenotype data is available for some species. Comparative data is also available, including genomic alignments and "gene trees," which show the inferred evolutionary history of each gene family represented in the resource. Access to the data is provided through a genome browser, which incorporates many specialist interfaces for different data types, through a variety of programmatic interfaces, and via a specialist data mining tool supporting rapid filtering and retrieval of bulk data. Genomic data from many non-plant species, including those of plant pathogens, pests, and pollinators, is also available via the same interfaces through other divisions of Ensembl. Ensembl Plants is updated 4-6 times a year and is developed in collaboration with our international partners in the Gramene (http://www.gramene.org) and transPLANT (http://www.transplantdb.eu) projects.
Affiliation(s)
- Dan M Bolser
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Daniel M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|
26
|
Kohl A, Pondeville E, Schnettler E, Crisanti A, Supparo C, Christophides GK, Kersey PJ, Maslen GL, Takken W, Koenraadt CJM, Oliva CF, Busquets N, Abad FX, Failloux AB, Levashina EA, Wilson AJ, Veronesi E, Pichard M, Arnaud Marsh S, Simard F, Vernick KD. Advancing vector biology research: a community survey for future directions, research applications and infrastructure requirements. Pathog Glob Health 2016; 110:164-72. [PMID: 27677378 PMCID: PMC5072118 DOI: 10.1080/20477724.2016.1211475] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/18/2023] Open
Abstract
Vector-borne pathogens impact public health, animal production, and animal welfare. Research on arthropod vectors such as mosquitoes, ticks, sandflies, and midges which transmit pathogens to humans and economically important animals is crucial for development of new control measures that target transmission by the vector. While insecticides are an important part of this arsenal, appearance of resistance mechanisms is increasingly common. Novel tools for genetic manipulation of vectors, use of Wolbachia endosymbiotic bacteria, and other biological control mechanisms to prevent pathogen transmission have led to promising new intervention strategies, adding to strong interest in vector biology and genetics as well as vector-pathogen interactions. Vector research is therefore at a crucial juncture, and strategic decisions on future research directions and research infrastructure investment should be informed by the research community. A survey initiated by the European Horizon 2020 INFRAVEC-2 consortium set out to canvass priorities in the vector biology research community and to determine key activities that are needed for researchers to efficiently study vectors, vector-pathogen interactions, as well as access the structures and services that allow such activities to be carried out. We summarize the most important findings of the survey which in particular reflect the priorities of researchers in European countries, and which will be of use to stakeholders that include researchers, government, and research organizations.
Affiliation(s)
- Alain Kohl
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Emilie Pondeville
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Esther Schnettler
- MRC-University of Glasgow Centre for Virus Research, Glasgow, UK
| | - Andrea Crisanti
- Department of Life Sciences, Imperial College London, London, UK
| | - Clelia Supparo
- Department of Life Sciences, Imperial College London, London, UK
| | | | - Paul J Kersey
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
| | - Gareth L Maslen
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
| | - Willem Takken
- Laboratory of Entomology, Wageningen University and Research Centre, Wageningen, The Netherlands
| | | | - Clelia F Oliva
- Polo d'Innovazione di Genomica, Genetica e Biologia, Perugia, Italy
| | - Núria Busquets
- Centre de Recerca en Sanitat Animal (CReSA), Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Campus UAB, Barcelona, Spain
| | - F Xavier Abad
- Centre de Recerca en Sanitat Animal (CReSA), Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Campus UAB, Barcelona, Spain
| | - Anna-Bella Failloux
- Arboviruses and Insect Vectors Unit, Department of Virology, Institut Pasteur, Paris cedex 15, France
| | - Elena A Levashina
- Department of Vector Biology, Max-Planck-Institut für Infektionsbiologie, Campus Charité Mitte, Berlin, Germany
| | - Anthony J Wilson
- Integrative Entomology Group, Vector-borne Viral Diseases Programme, The Pirbright Institute, Surrey, UK
| | - Eva Veronesi
- Swiss National Centre for Vector Entomology, Institute of Parasitology, University of Zürich, Zürich, Switzerland
| | - Maëlle Pichard
- Department of Parasites and Insect Vectors, Institut Pasteur, Unit of Insect Vector Genetics and Genomics, Paris cedex 15, France
| | - Sarah Arnaud Marsh
- Department of Parasites and Insect Vectors, Institut Pasteur, Unit of Insect Vector Genetics and Genomics, Paris cedex 15, France
| | - Frédéric Simard
- MIVEGEC "Maladies Infectieuses et Vecteurs: Ecologie, Génétique, Evolution et Contrôle", UMR IRD224-CNRS5290-Université de Montpellier, Montpellier, France
| | - Kenneth D Vernick
- Department of Parasites and Insect Vectors, Institut Pasteur, Unit of Insect Vector Genetics and Genomics, Paris cedex 15, France; CNRS Unit of Hosts, Vectors and Pathogens (URA3012), Institut Pasteur, Paris cedex 15, France
|
27
|
Spannagl M, Alaux M, Lange M, Bolser DM, Bader KC, Letellier T, Kimmel E, Flores R, Pommier C, Kerhornou A, Walts B, Nussbaumer T, Grabmuller C, Chen J, Colmsee C, Beier S, Mascher M, Schmutzer T, Arend D, Thanki A, Ramirez-Gonzalez R, Ayling M, Ayling S, Caccamo M, Mayer KFX, Scholz U, Steinbach D, Quesneville H, Kersey PJ. transPLANT Resources for Triticeae Genomic Data. Plant Genome 2016; 9. [PMID: 27898761 DOI: 10.3835/plantgenome2015.06.0038] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/06/2023]
Abstract
The genome sequences of many important Triticeae species, including bread wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), remained uncharacterized for a long time because of their high repeat content, large sizes, and polyploidy. As a result of improvements in sequencing technologies and novel analysis strategies, several of these have recently been deciphered. These efforts have generated new insights into Triticeae biology and genome organization and have important implications for downstream usage by breeders, experimental biologists, and comparative genomicists. transPLANT is an EU-funded project aimed at constructing hardware, software, and data infrastructure for genome-scale research in the life sciences. Since the Triticeae data are intrinsically complex, heterogeneous, and distributed, the transPLANT consortium has undertaken efforts to develop common data formats and tools that enable the exchange and integration of data from distributed resources. Here we present an overview of the individual Triticeae genome resources hosted by transPLANT partners, introduce the objectives of transPLANT, and outline common developments and interfaces supporting integrated data access.
|
28
|
Tello-Ruiz MK, Stein J, Wei S, Preece J, Olson A, Naithani S, Amarasinghe V, Dharmawardhana P, Jiao Y, Mulvaney J, Kumari S, Chougule K, Elser J, Wang B, Thomason J, Bolser DM, Kerhornou A, Walts B, Fonseca NA, Huerta L, Keays M, Tang YA, Parkinson H, Fabregat A, McKay S, Weiser J, D'Eustachio P, Stein L, Petryszak R, Kersey PJ, Jaiswal P, Ware D. Gramene 2016: comparative plant genomics and pathway resources. Nucleic Acids Res 2015; 44:D1133-40. [PMID: 26553803 PMCID: PMC4702844 DOI: 10.1093/nar/gkv1179] [Citation(s) in RCA: 108] [Impact Index Per Article: 12.0] [Received: 09/25/2015] [Accepted: 10/13/2015] [Indexed: 12/21/2022] Open
Abstract
Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ~200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.
Affiliation(s)
| | - Joshua Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sharon Wei
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Preece
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Andrew Olson
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Vindhya Amarasinghe
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Palitha Dharmawardhana
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Yinping Jiao
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Joseph Mulvaney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sunita Kumari
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Kapeel Chougule
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
| | - Bo Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- James Thomason
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Daniel M Bolser
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Arnaud Kerhornou
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Brandon Walts
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Nuno A Fonseca
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Laura Huerta
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Maria Keays
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Y Amy Tang
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Helen Parkinson
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Antonio Fabregat
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Sheldon McKay
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Joel Weiser
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Peter D'Eustachio
- Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA
- Lincoln Stein
- Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto, M5G 1L7, Canada
- Robert Petryszak
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Paul J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK
- Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA; USDA ARS NAA Robert W. Holley Center for Agriculture and Health, Agricultural Research Service, Ithaca, NY 14853, USA

29
McDowall MD, Harris MA, Lock A, Rutherford K, Staines DM, Bähler J, Kersey PJ, Oliver SG, Wood V. PomBase 2015: updates to the fission yeast database. Nucleic Acids Res 2014; 43:D656-61. [PMID: 25361970 PMCID: PMC4383888 DOI: 10.1093/nar/gku1040] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Indexed: 11/29/2022] Open
Abstract
PomBase (http://www.pombase.org) is the model organism database for the fission yeast Schizosaccharomyces pombe. PomBase provides a central hub for the fission yeast community, supporting both exploratory and hypothesis-driven research. It provides users easy access to data ranging from the sequence level, to molecular and phenotypic annotations, through to the display of genome-wide high-throughput studies. Recent improvements to the site extend annotation specificity, improve usability and allow for monthly data updates. Both in-house curators and community researchers provide manually curated data to PomBase. The genome browser provides access to published high-throughput data sets and the genomes of three additional Schizosaccharomyces species (Schizosaccharomyces cryophilus, Schizosaccharomyces japonicus and Schizosaccharomyces octosporus).
Affiliation(s)
- Mark D McDowall
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Midori A Harris
- Cambridge Systems Biology and Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge, Cambridgeshire CB2 1GA, UK
- Antonia Lock
- Research Department of Genetics, Evolution and Environment, and UCL Cancer Institute, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
- Kim Rutherford
- Cambridge Systems Biology and Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge, Cambridgeshire CB2 1GA, UK
- Daniel M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Jürg Bähler
- Research Department of Genetics, Evolution and Environment, and UCL Cancer Institute, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
- Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Stephen G Oliver
- Cambridge Systems Biology and Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge, Cambridgeshire CB2 1GA, UK
- Valerie Wood
- Cambridge Systems Biology and Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge, Cambridgeshire CB2 1GA, UK

30
Petrov AI, Kay SJE, Gibson R, Kulesha E, Staines D, Bruford EA, Wright MW, Burge S, Finn RD, Kersey PJ, Cochrane G, Bateman A, Griffiths-Jones S, Harrow J, Chan PP, Lowe TM, Zwieb CW, Wower J, Williams KP, Hudson CM, Gutell R, Clark MB, Dinger M, Quek XC, Bujnicki JM, Chua NH, Liu J, Wang H, Skogerbø G, Zhao Y, Chen R, Zhu W, Cole JR, Chai B, Huang HD, Huang HY, Cherry JM, Hatzigeorgiou A, Pruitt KD. RNAcentral: an international database of ncRNA sequences. Nucleic Acids Res 2014; 43:D123-9. [PMID: 25352543 PMCID: PMC4384043 DOI: 10.1093/nar/gku991] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Indexed: 01/23/2023] Open
Abstract
The field of non-coding RNA biology has been hampered by the lack of availability of a comprehensive, up-to-date collection of accessioned RNA sequences. Here we present the first release of RNAcentral, a database that collates and integrates information from an international consortium of established RNA sequence databases. The initial release contains over 8.1 million sequences, including representatives of all major functional classes. A web portal (http://rnacentral.org) provides free access to data, search functionality, cross-references, source code and an integrated genome browser for selected species.

31
Gago S, Alastruey-Izquierdo A, Marconi M, Buitrago MJ, Kerhornou A, Kersey PJ, Mellado E, Cuenca-Estrella M, Rodríguez-Tudela JL, Cuesta I. Ribosomic DNA intergenic spacer 1 region is useful when identifying Candida parapsilosis spp. complex based on high-resolution melting analysis. Med Mycol 2014; 52:472-81. [PMID: 24847037 DOI: 10.1093/mmy/myu009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
The epidemiology of Candida parapsilosis and the closely related species C. orthopsilosis and C. metapsilosis has changed in recent years, justifying the need to identify this complex at the species level. In this study we investigate the intergenic spacer 1 (IGS1) of the ribosomal DNA (rDNA) to evaluate the utility of this gene region as a phylogenetic molecular marker and the suitability of a high-resolution melting (HRM) strategy based on this region for identification of members of the C. parapsilosis spp. complex. We sequenced the IGS1 and the internal transcribed spacer (ITS) regions of the rDNA from 33 C. parapsilosis sensu lato strains. Although both regions are useful in identifying species, comparative sequence analysis showed that the diversity in the IGS1 region was higher than in the ITS sequences. We also developed an HRM analysis that reliably identifies C. parapsilosis spp. complex based on the amplification of 70 bp in the IGS1 region. All isolates were correctly identified with a confidence interval >98%. Our results demonstrate that HRM analysis based on the IGS1 region is a powerful tool for distinguishing C. parapsilosis from cryptic species.
Affiliation(s)
- Sara Gago
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Ana Alastruey-Izquierdo
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain; Spanish Network for Research on Infectious Diseases, Instituto de Salud Carlos III, Madrid, Spain
- Marco Marconi
- Bioinformatic Unit, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- María José Buitrago
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Arnaud Kerhornou
- Protein and Nucleotide Database (PANDA) Group, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Paul J Kersey
- Protein and Nucleotide Database (PANDA) Group, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Emilia Mellado
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Manuel Cuenca-Estrella
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Juan Luis Rodríguez-Tudela
- Mycology Service, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain
- Isabel Cuesta
- Bioinformatic Unit, Centro Nacional de Microbiología, Instituto de Salud Carlos III, Majadahonda, Madrid, Spain

32
Monaco MK, Stein J, Naithani S, Wei S, Dharmawardhana P, Kumari S, Amarasinghe V, Youens-Clark K, Thomason J, Preece J, Pasternak S, Olson A, Jiao Y, Lu Z, Bolser D, Kerhornou A, Staines D, Walts B, Wu G, D'Eustachio P, Haw R, Croft D, Kersey PJ, Stein L, Jaiswal P, Ware D. Gramene 2013: comparative plant genomics resources. Nucleic Acids Res 2013; 42:D1193-9. [PMID: 24217918 PMCID: PMC3964986 DOI: 10.1093/nar/gkt1110] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.9] [Indexed: 12/29/2022] Open
Abstract
Gramene (http://www.gramene.org) is a curated online resource for comparative functional genomics in crops and model plant species, currently hosting 27 fully and 10 partially sequenced reference genomes in its build number 38. Its strength derives from the application of a phylogenetic framework for genome comparison and the use of ontologies to integrate structural and functional annotation data. Whole-genome alignments complemented by phylogenetic gene family trees help infer syntenic and orthologous relationships. Genetic variation data, sequences and genome mappings available for 10 species, including Arabidopsis, rice and maize, help infer putative variant effects on genes and transcripts. The pathways section also hosts 10 species-specific metabolic pathways databases developed in-house or by our collaborators using Pathway Tools software, which facilitates searches for pathway, reaction and metabolite annotations, and allows analyses of user-defined expression datasets. Recently, we released a Plant Reactome portal featuring 133 curated rice pathways. This portal will be expanded for Arabidopsis, maize and other plant species. We continue to provide genetic and QTL maps and marker datasets developed by crop researchers. The project provides a unique community platform to support scientific research in plant genomics including studies in evolution, genetics, plant breeding, molecular biology, biochemistry and systems biology.
Affiliation(s)
- Marcela K Monaco
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA, Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA, EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK, Informatics and Bio-computing Program, Ontario Institute of Cancer Research, Toronto M5G 1L7, Canada, Department of Biochemistry & Molecular Pharmacology, NYU School of Medicine, New York, NY 10016, USA and NAA Plant, Soil & Nutrition Laboratory Research Unit, USDA-ARS, Ithaca, NY 14853, USA

33
Bradnam KR, Fass JN, Alexandrov A, Baranay P, Bechner M, Birol I, Boisvert S, Chapman JA, Chapuis G, Chikhi R, Chitsaz H, Chou WC, Corbeil J, Del Fabbro C, Docking TR, Durbin R, Earl D, Emrich S, Fedotov P, Fonseca NA, Ganapathy G, Gibbs RA, Gnerre S, Godzaridis E, Goldstein S, Haimel M, Hall G, Haussler D, Hiatt JB, Ho IY, Howard J, Hunt M, Jackman SD, Jaffe DB, Jarvis ED, Jiang H, Kazakov S, Kersey PJ, Kitzman JO, Knight JR, Koren S, Lam TW, Lavenier D, Laviolette F, Li Y, Li Z, Liu B, Liu Y, Luo R, Maccallum I, Macmanes MD, Maillet N, Melnikov S, Naquin D, Ning Z, Otto TD, Paten B, Paulo OS, Phillippy AM, Pina-Martins F, Place M, Przybylski D, Qin X, Qu C, Ribeiro FJ, Richards S, Rokhsar DS, Ruby JG, Scalabrin S, Schatz MC, Schwartz DC, Sergushichev A, Sharpe T, Shaw TI, Shendure J, Shi Y, Simpson JT, Song H, Tsarev F, Vezzi F, Vicedomini R, Vieira BM, Wang J, Worley KC, Yin S, Yiu SM, Yuan J, Zhang G, Zhang H, Zhou S, Korf IF. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience 2013; 2:10. [PMID: 23870653 PMCID: PMC3844414 DOI: 10.1186/2047-217x-2-10] [Citation(s) in RCA: 415] [Impact Index Per Article: 37.7] [Received: 01/23/2013] [Accepted: 07/15/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. RESULTS In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. CONCLUSIONS Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.

34
Kersey PJ, Staines DM, Lawson D, Kulesha E, Derwent P, Humphrey JC, Hughes DST, Keenan S, Kerhornou A, Koscielny G, Langridge N, McDowall MD, Megy K, Maheswari U, Nuhn M, Paulini M, Pedro H, Toneva I, Wilson D, Yates A, Birney E. Ensembl Genomes: an integrative resource for genome-scale data from non-vertebrate species. Nucleic Acids Res 2011; 40:D91-7. [PMID: 22067447 PMCID: PMC3245118 DOI: 10.1093/nar/gkr895] [Citation(s) in RCA: 142] [Impact Index Per Article: 10.9] [Indexed: 11/13/2022] Open
Abstract
Ensembl Genomes (http://www.ensemblgenomes.org) is an integrative resource for genome-scale data from non-vertebrate species. The project exploits and extends technology (for genome annotation, analysis and dissemination) developed in the context of the (vertebrate-focused) Ensembl project and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. Since its launch in 2009, Ensembl Genomes has undergone rapid expansion, with the goal of providing coverage of all major experimental organisms, and additionally including taxonomic reference points to provide the evolutionary context in which genes can be understood. Against the backdrop of a continuing increase in genome sequencing activities in all parts of the tree of life, we seek to work, wherever possible, with the communities actively generating and using data, and are participants in a growing range of collaborations involved in the annotation and analysis of genomes.
Affiliation(s)
- Paul J Kersey
- Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

35
Wood V, Harris MA, McDowall MD, Rutherford K, Vaughan BW, Staines DM, Aslett M, Lock A, Bähler J, Kersey PJ, Oliver SG. PomBase: a comprehensive online resource for fission yeast. Nucleic Acids Res 2011; 40:D695-9. [PMID: 22039153 PMCID: PMC3245111 DOI: 10.1093/nar/gkr853] [Citation(s) in RCA: 228] [Impact Index Per Article: 17.5] [Indexed: 01/31/2023] Open
Abstract
PomBase (www.pombase.org) is a new model organism database established to provide access to comprehensive, accurate, and up-to-date molecular data and biological information for the fission yeast Schizosaccharomyces pombe to effectively support both exploratory and hypothesis-driven research. PomBase encompasses annotation of genomic sequence and features, comprehensive manual literature curation and genome-wide data sets, and supports sophisticated user-defined queries. The implementation of PomBase integrates a Chado relational database that houses manually curated data with Ensembl software that supports sequence-based annotation and web access. PomBase will provide user-friendly tools to promote curation by experts within the fission yeast community. This will make a key contribution to shaping its content and ensuring its comprehensiveness and long-term relevance.
Affiliation(s)
- Valerie Wood
- Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, Sanger Building, 80 Tennis Court Road, Cambridge CB2 1GA, UK.

36
Earl D, Bradnam K, St John J, Darling A, Lin D, Fass J, Yu HOK, Buffalo V, Zerbino DR, Diekhans M, Nguyen N, Ariyaratne PN, Sung WK, Ning Z, Haimel M, Simpson JT, Fonseca NA, Birol İ, Docking TR, Ho IY, Rokhsar DS, Chikhi R, Lavenier D, Chapuis G, Naquin D, Maillet N, Schatz MC, Kelley DR, Phillippy AM, Koren S, Yang SP, Wu W, Chou WC, Srivastava A, Shaw TI, Ruby JG, Skewes-Cox P, Betegon M, Dimon MT, Solovyev V, Seledtsov I, Kosarev P, Vorobyev D, Ramirez-Gonzalez R, Leggett R, MacLean D, Xia F, Luo R, Li Z, Xie Y, Liu B, Gnerre S, MacCallum I, Przybylski D, Ribeiro FJ, Yin S, Sharpe T, Hall G, Kersey PJ, Durbin R, Jackman SD, Chapman JA, Huang X, DeRisi JL, Caccamo M, Li Y, Jaffe DB, Green RE, Haussler D, Korf I, Paten B. Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Res 2011; 21:2224-41. [PMID: 21926179 DOI: 10.1101/gr.126599.111] [Citation(s) in RCA: 317] [Impact Index Per Article: 24.4] [Indexed: 12/21/2022]
Abstract
Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.
Affiliation(s)
- Dent Earl
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California 95064, USA

37
Earl D, Bradnam K, St John J, Darling A, Lin D, Fass J, Yu HOK, Buffalo V, Zerbino DR, Diekhans M, Nguyen N, Ariyaratne PN, Sung WK, Ning Z, Haimel M, Simpson JT, Fonseca NA, Birol İ, Docking TR, Ho IY, Rokhsar DS, Chikhi R, Lavenier D, Chapuis G, Naquin D, Maillet N, Schatz MC, Kelley DR, Phillippy AM, Koren S, Yang SP, Wu W, Chou WC, Srivastava A, Shaw TI, Ruby JG, Skewes-Cox P, Betegon M, Dimon MT, Solovyev V, Seledtsov I, Kosarev P, Vorobyev D, Ramirez-Gonzalez R, Leggett R, MacLean D, Xia F, Luo R, Li Z, Xie Y, Liu B, Gnerre S, MacCallum I, Przybylski D, Ribeiro FJ, Yin S, Sharpe T, Hall G, Kersey PJ, Durbin R, Jackman SD, Chapman JA, Huang X, DeRisi JL, Caccamo M, Li Y, Jaffe DB, Green RE, Haussler D, Korf I, Paten B. Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Res 2011. [PMID: 21926179 DOI: 10.1101/gr.126599] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/19/2022]

38
Kersey PJ, Lawson D, Birney E, Derwent PS, Haimel M, Herrero J, Keenan S, Kerhornou A, Koscielny G, Kähäri A, Kinsella RJ, Kulesha E, Maheswari U, Megy K, Nuhn M, Proctor G, Staines D, Valentin F, Vilella AJ, Yates A. Ensembl Genomes: extending Ensembl across the taxonomic space. Nucleic Acids Res 2009; 38:D563-9. [PMID: 19884133 PMCID: PMC2808935 DOI: 10.1093/nar/gkp871] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.7] [Indexed: 11/30/2022] Open
Abstract
Ensembl Genomes (http://www.ensemblgenomes.org) is a new portal offering integrated access to genome-scale data from non-vertebrate species of scientific interest, developed using the Ensembl genome annotation and visualisation platform. Ensembl Genomes consists of five sub-portals (for bacteria, protists, fungi, plants and invertebrate metazoa) designed to complement the availability of vertebrate genomes in Ensembl. Many of the databases supporting the portal have been built in close collaboration with the scientific community, which we consider as essential for maintaining the accuracy and usefulness of the resource. A common set of user interfaces (which include a graphical genome browser, FTP, BLAST search, a query optimised data warehouse, programmatic access, and a Perl API) is provided for all domains. Data types incorporated include annotation of (protein and non-protein coding) genes, cross references to external resources, and high throughput experimental data (e.g. data from large scale studies of gene expression and polymorphism visualised in their genomic context). Additionally, extensive comparative analysis has been performed, both within defined clades and across the wider taxonomy, and sequence alignments and gene trees resulting from this can be accessed through the site.
Affiliation(s)
- P J Kersey
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK.

39
Sterk P, Kersey PJ, Apweiler R. Genome Reviews: Standardizing Content and Representation of Information about Complete Genomes. OMICS: A Journal of Integrative Biology 2006; 10:114-8. [PMID: 16901215 DOI: 10.1089/omi.2006.10.114] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
The Genome Reviews database provides up-to-date, standardized, and comprehensively annotated views of the genomic sequence of organisms with completely deciphered genomes. Currently, Genome Reviews contains information about the genomes of archaea, bacteria, and selected lower eukaryotes. Expansion to viral genomes and additional eukaryotes is planned. Genome Reviews is available for download in relational and flat file formats. In this paper, the rationale behind the creation of Genome Reviews, the approach taken in standardizing the data and its representation, and particular issues encountered in doing so are described.
Affiliation(s)
- Peter Sterk
- EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.

40
Mignone F, Grillo G, Licciulli F, Iacono M, Liuni S, Kersey PJ, Duarte J, Saccone C, Pesole G. UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs. Nucleic Acids Res 2005; 33:D141-6. [PMID: 15608165 PMCID: PMC539975 DOI: 10.1093/nar/gki021] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.6] [Indexed: 11/17/2022] Open
Abstract
The 5′ and 3′ untranslated regions of eukaryotic mRNAs play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5′ and 3′ untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated (and also collated as the UTRsite database) and cross-links to genomic and protein data are provided. The integration of UTRdb with genomic and protein data has allowed the implementation of a powerful retrieval resource for the selection and extraction of UTR subsets based on their genomic coordinates and/or features of the protein encoded by the relevant mRNA (e.g. GO term, PFAM domain, etc.). All internet resources implemented for retrieval and functional analysis of 5′ and 3′ untranslated regions of eukaryotic mRNAs are accessible at http://www.ba.itb.cnr.it/UTR/.
Affiliation(s)
- Flavio Mignone
- Dipartimento di Scienze Biomolecolari e Biotecnologie, Università di Milano, via Celoria 26, 20133 Milano, Italy

41
Kersey PJ, Duarte J, Williams A, Karavidopoulou Y, Birney E, Apweiler R. The International Protein Index: An integrated database for proteomics experiments. Proteomics 2004; 4:1985-8. [PMID: 15221759 DOI: 10.1002/pmic.200300721] [Citation(s) in RCA: 622] [Impact Index Per Article: 31.1] [Indexed: 11/11/2022]
Abstract
Despite the complete determination of the genome sequence of several higher eukaryotes, their proteomes remain relatively poorly defined. Information about proteins identified by different experimental and computational methods is stored in different databases, meaning that no single resource offers full coverage of known and predicted proteins. IPI (the International Protein Index) has been developed to address these issues and offers complete nonredundant data sets representing the human, mouse and rat proteomes, built from the Swiss-Prot, TrEMBL, Ensembl and RefSeq databases.
Affiliation(s)
- Paul J Kersey
- EMBL Outstation, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.

42
Maqbool Z, Kersey PJ, Fantes PA, McInerny CJ. MCB-mediated regulation of cell cycle-specific cdc22+ transcription in fission yeast. Mol Genet Genomics 2003; 269:765-75. [PMID: 12898217 DOI: 10.1007/s00438-003-0885-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 01/31/2003] [Accepted: 06/09/2003] [Indexed: 10/26/2022]
Abstract
The cdc22+ gene of the fission yeast, Schizosaccharomyces pombe, encodes the large subunit of ribonucleotide reductase, and is periodically expressed during the mitotic cell cycle, transcript abundance reaching a maximum at the G1-S boundary. This regulation of expression is controlled by a transcription factor complex called DSC1, which binds to MCB motifs (ACGCGT) present in the promoter of cdc22+. cdc22+ has a complex pattern of MCBs, including two clusters of four motifs each, one of which is located within the transcribed region. We show that both clusters of MCBs contribute to the regulation of cdc22+ expression during the cell cycle, each having a different role. The MCB cluster within the transcribed region has the major role in regulating cdc22+, as its removal results in loss of transcription. The upstream cluster, instead, controls cell cycle-specific transcription through a negative function, as its removal results in expression of cdc22+ throughout the cell cycle. Both MCB clusters bind DSC1. We show that the interaction of DSC1 with the MCB cluster within the transcribed region has a high "on-off" rate, suggesting a mechanism by which DSC1 could activate expression, and still allow RNA polymerase to pass during transcription. Finally, we show that both clusters are orientation-dependent in their function. The significance of these results, in the context of MCB-mediated regulation of G1-S expression in fission yeast, is discussed.
Affiliation(s)
- Z Maqbool
- Division of Biochemistry and Molecular Biology, Institute of Biomedical and Life Sciences, University of Glasgow, G12 8QQ, Glasgow, Scotland
|
43
|
Kersey PJ, Morris L, Hermjakob H, Apweiler R. Integr8: enhanced inter-operability of European molecular biology databases. Methods Inf Med 2003; 42:154-60. [PMID: 12743652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
OBJECTIVES: The increasing production of molecular biology data in the post-genomic era, and the proliferation of databases that store it, require the development of an integrative layer in database services to facilitate the synthesis of related information. The solution of this problem is made more difficult by the absence of universal identifiers for biological entities, and the breadth and variety of available data. METHODS: Integr8 was modelled using UML (Unified Modeling Language). Integr8 is being implemented as an n-tier system using a modern object-oriented programming language (Java). An object-relational mapping tool, OJB, is being used to specify the interface between the upper layers and an underlying relational database. RESULTS: The European Bioinformatics Institute is launching the Integr8 project. Integr8 will be an automatically populated database in which we will maintain stable identifiers for biological entities, describe their relationships with each other (in accordance with the central dogma of biology), and store equivalences between identified entities in the source databases. Only core data will be stored in Integr8, with web links to the source databases providing further information. CONCLUSIONS: Integr8 will provide the integrative layer of the next generation of bioinformatics services from the EBI. Web-based interfaces will be developed to offer gene-centric views of the integrated data, presenting (where known) the links between genome, proteome and phenotype.
Affiliation(s)
- P J Kersey
- The European Bioinformatics Institute, Cambridge, UK.
|
44
|
Harris P, Kersey PJ, McInerny CJ, Fantes PA. Cell cycle, DNA damage and heat shock regulate suc22+ expression in fission yeast. Mol Gen Genet 1996; 252:284-91. [PMID: 8842148 DOI: 10.1007/bf02173774] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 02/02/2023]
Abstract
The suc22+ gene of Schizosaccharomyces pombe encodes the small subunit of ribonucleotide reductase. Two transcripts that hybridise to suc22+ have previously been described: a constitutive transcript of 1.5 kb, and a transcript of approximately 1.9 kb that is induced when DNA replication is blocked by hydroxyurea. In this paper we show that both transcripts derive from the suc22+ gene, are polyadenylated, and have transcription initiation sites separated by approximately 550 nucleotides. The absence of translation initiation codons and predicted intron splice sites within this 550 nucleotide region suggests strongly that both transcripts encode the same protein. Under normal growth conditions, the larger suc22+ transcript is present at a very low level. This low level expression is periodic during the cell cycle, showing a pattern similar to that of other genes under regulation by MCB elements with a maximum in G1/S phase. Consistent with this, there are MCB elements upstream of the initiation site of the transcript. This pattern of expression contrasts with the continuous expression, at a much higher level, of the smaller suc22+ transcript. The larger suc22+ transcript is induced by exposure of cells to 4-nitroquinoline oxide (4-NQO), a UV-mimetic agent that causes DNA damage. The transcriptional response to 4-NQO is observed in cells previously arrested in G2 by a cdc2ts mutation, demonstrating that induction can occur outside S phase. We show that the rad1+ gene, part of the mitotic checkpoint, is required for induction of the large transcript. Exposure of cells to heat shock also induces the suc22+ large transcript: a consensus heat shock element has been identified upstream of the large transcript start site.
Affiliation(s)
- P Harris
- Institute of Cell and Molecular Biology, University of Edinburgh, UK
|
45
|
Abstract
In this paper we describe properties of the cdc10-C4 mutant of the fission yeast Schizosaccharomyces pombe. The cdc10+ gene encodes a component of the DSC1Sp/MBF transcription complex, which is required for cell-cycle regulated expression at G1-S of several genes via cis-acting MCB (MluI cell cycle box) elements. At permissive temperatures cdc10-C4 causes expression of MCB-regulated genes through the whole cell cycle, which in asynchronously dividing cells is manifested in overall higher expression levels. This overexpression phenotype is cold sensitive: in cdc10-C4 cells, MCB genes are expressed at progressively higher levels at lower temperatures. In heterozygous cdc10-C4/cdc10+ diploid strains, MCB-regulated genes are not overexpressed, suggesting that loss, rather than alteration, of function of the cdc10-C4 gene product is the reason for unregulated target gene expression. Consistent with this, the cdc10-C4 mutant allele is known to encode a truncated protein. We have also overexpressed the region of the cdc10 protein absent in cdc10-C4 under the control of an inducible promoter. This induces a G1 delay, and additionally causes a reduction of the overexpression of MCB genes in cdc10-C4 strains. These results suggest that DSC1Sp/MBF represses, as well as activates, MCB gene expression during the cell cycle.
Affiliation(s)
- C J McInerny
- Institute of Cell and Molecular Biology, University of Edinburgh, UK
|
46
|
|
47
|
Abstract
A family pedigree is described in which the propositus has tricho-dental syndrome. Evidence is presented that the short hair frequently observed in this condition is due to a short anagen phase of the hair cycle.
|
48
|
|
49
|
|
50
|
Chapman PH, Kersey PJ, Keys B, Shuster S, Rawlins MD. Generalised tissue abnormality of aryl hydrocarbon hydroxylase in psoriasis. Br Med J 1980; 281:1315-6. [PMID: 7437778 PMCID: PMC1714719 DOI: 10.1136/bmj.281.6251.1315] [Citation(s) in RCA: 30] [Impact Index Per Article: 0.7] [Indexed: 01/25/2023]
Abstract
Microsomal aryl hydrocarbon hydroxylase (AHH) activity and inducibility were measured in jejunal mucosa, liver, and lesion-free epidermis of patients with psoriasis. In all three tissues AHH activity and inducibility were less than in controls. This demonstration of a generalised enzymatic abnormality in the tissues of patients with psoriasis is in keeping with the suggestion that it may be close to the underlying genetic defect.
|