1
Conde S, Rami JF, Okello DK, Sambou A, Muitia A, Oteng-Frimpong R, Makweti L, Sako D, Faye I, Chintu J, Coulibaly AM, Miningou A, Asibuo JY, Konate M, Banla EM, Seye M, Djiboune YR, Tossim HA, Sylla SN, Hoisington D, Clevenger J, Chu Y, Tallury S, Ozias-Akins P, Fonceka D. The groundnut improvement network for Africa (GINA) germplasm collection: a unique genetic resource for breeding and gene discovery. G3 (Bethesda, Md.) 2023; 14:jkad244. [PMID: 37875136 PMCID: PMC10755195 DOI: 10.1093/g3journal/jkad244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 08/22/2023] [Accepted: 10/03/2023] [Indexed: 10/26/2023]
Abstract
Cultivated peanut or groundnut (Arachis hypogaea L.) is a grain legume grown in many developing countries by smallholder farmers for food, feed, and/or income. The speciation of the cultivated species, which involved polyploidization followed by domestication, greatly reduced its variability at the DNA level. Mobilizing peanut diversity is a prerequisite for any breeding program that aims to overcome the main constraints on production and to increase yield in farmers' fields. In this study, the Groundnut Improvement Network for Africa assembled a collection of 1,049 peanut breeding lines, varieties, and landraces from 9 countries in Africa. The collection was genotyped with the Axiom_Arachis2 48K SNP array, and 8,229 polymorphic single nucleotide polymorphism (SNP) markers were used to analyze the genetic structure of this collection and quantify the level of genetic diversity in each breeding program. A supervised model was developed using discriminant analysis of principal components (DAPC) to unambiguously assign 542, 35, and 172 genotypes to the Spanish, Valencia, and Virginia market types, respectively. Distance-based clustering of the collection showed a clear grouping structure according to subspecies and market types, with 73% of the genotypes classified as fastigiata and 27% as hypogaea subspecies. Analysis with STRUCTURE confirmed the overall structure and showed that, at a minimum membership of 0.8, 76% of the varieties that were not assigned by DAPC were actually admixed. This was particularly the case for most of the genotypes of the Valencia subgroup, which exhibited admixed genetic heritage. The results also showed that the geographic origin (i.e. East, Southern, and West Africa) did not strongly explain the genetic structure. The gene diversity managed by each breeding program, measured by the expected heterozygosity, ranged from 0.25 to 0.39, with the Niger breeding program having the lowest diversity, mainly because only lines that belong to the fastigiata subspecies are used in this program.
Finally, we developed a core collection composed of 300 accessions based on breeding traits and genetic diversity. This collection, which is composed of 205 genotypes of fastigiata subspecies (158 Spanish and 47 Valencia) and 95 genotypes of hypogaea subspecies (all Virginia), improves the genetic diversity of each individual breeding program and is, therefore, a unique resource for allele mining and breeding.
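The abstract reports gene diversity as expected heterozygosity. As a rough illustration only, the sketch below computes the simplest form of that statistic, the mean of 2p(1-p) over biallelic SNPs, from alternate-allele dosages. The function name, the dosage coding (0, 1, 2, with -1 for a missing call), and the toy matrix are assumptions made for this example; the study's own estimator and software are not described here and may differ (for instance, by applying a sample-size correction).

```python
from typing import List, Sequence


def expected_heterozygosity(genotypes: Sequence[Sequence[int]]) -> float:
    """Mean expected heterozygosity (2p(1-p)) over biallelic markers.

    genotypes[i][j] is the alternate-allele count (0, 1 or 2) for
    sample i at marker j; -1 marks a missing call (assumed coding).
    """
    n_markers = len(genotypes[0])
    per_locus: List[float] = []
    for j in range(n_markers):
        # Keep only the samples with a non-missing call at this marker.
        calls = [row[j] for row in genotypes if row[j] >= 0]
        if not calls:
            continue  # marker with no data contributes nothing
        p = sum(calls) / (2 * len(calls))  # alternate-allele frequency
        per_locus.append(2 * p * (1 - p))
    if not per_locus:
        raise ValueError("no marker has any non-missing genotype")
    return sum(per_locus) / len(per_locus)


# Toy data: 4 samples x 3 markers (hypothetical values).
toy = [
    [0, 2, 0],
    [2, 2, 0],
    [0, 0, 0],
    [2, 0, -1],
]
print(round(expected_heterozygosity(toy), 3))  # 0.333
```

In the toy matrix the first two markers have an allele frequency of 0.5 (contributing 0.5 each) and the third is monomorphic (contributing 0), so the mean is one third.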
Affiliation(s)
- Soukeye Conde
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
  - UMR AGAP, CIRAD, 34398 Montpellier, France
  - CIRAD, INRAE, AGAP, University Montpellier, Institut Agro, 34398 Montpellier, France
  - F.S.T., Département de B.V., Université Cheikh Anta Diop, BP 5005 Dakar, Senegal
- Jean-François Rami
  - UMR AGAP, CIRAD, 34398 Montpellier, France
  - CIRAD, INRAE, AGAP, University Montpellier, Institut Agro, 34398 Montpellier, France
- David K Okello
  - National Semi-Arid Resources Research Institute-Serere, PO Box 56, Kampala, Uganda
- Aissatou Sambou
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
- Amade Muitia
  - Mozambique Agricultural Research Institute (Instituto de Investigação Agrária de Moçambique), Northeast Zonal Centre, Nampula Research Station, PO Box 1922, Nampula, Mozambique
- Richard Oteng-Frimpong
  - Groundnut Improvement Program, Council for Scientific and Industrial Research (CSIR)-Savanna Agricultural Research Institute, PO Box 52, Tamale, Ghana
- Lutangu Makweti
  - Zambia Agriculture Research Institute (ZARI), PO Box 510089, Chipata, Zambia
- Dramane Sako
  - Institut d’Economie Rurale (IER), Centre Régional de Recherche Agronomique (CRRA), BP 281 Kayes, Mali
- Issa Faye
  - ISRA, Institut Sénégalais de Recherches Agricoles, Centre National de Recherche Agronomique, BP 53 Bambey, Senegal
- Justus Chintu
  - Chitedze Agricultural Research Service, PO Box 158, Lilongwe, Malawi
- Adama M Coulibaly
  - Institut National de Recherche Agronomique du Niger (INRAN), BP 240 Maradi, Niger
- Amos Miningou
  - INERA, CREAF, 01 BP 476 Ouagadougou 01, Burkina Faso
- James Y Asibuo
  - Council for Scientific and Industrial Research-Crops Research Institute (CSIR-CRI), P.O. Box 3785, Kumasi, Ghana
- Moumouni Konate
  - INERA, DRREA-Ouest, 01 BP 910 Bobo Dioulasso 01, Burkina Faso
- Essohouna M Banla
  - Institut Togolais de Recherche Agronomique (ITRA), 13BP267 Lome, Togo
- Maguette Seye
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
- Yvette R Djiboune
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
- Hodo-Abalo Tossim
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
- Samba N Sylla
  - F.S.T., Département de B.V., Université Cheikh Anta Diop, BP 5005 Dakar, Senegal
- David Hoisington
  - Feed the Future Innovation Lab for Peanut, College of Agricultural and Environmental Sciences, University of Georgia, Athens, GA 30602, USA
- Josh Clevenger
  - HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
- Ye Chu
  - Institute of Plant Breeding Genetics and Genomics and Department of Horticulture, College of Agricultural and Environmental Sciences, University of Georgia, Tifton, GA 31793, USA
- Shyam Tallury
  - Plant Genetic Resources Conservation Unit, Griffin, GA 30223, USA
- Peggy Ozias-Akins
  - Institute of Plant Breeding Genetics and Genomics and Department of Horticulture, College of Agricultural and Environmental Sciences, University of Georgia, Tifton, GA 31793, USA
- Daniel Fonceka
  - ISRA, Centre d’Etudes Régional pour l’Amélioration de l’Adaptation à la Sécheresse, CERAAS-Route de Khombole, Thiès BP 3320, Senegal
  - UMR AGAP, CIRAD, 34398 Montpellier, France
  - CIRAD, INRAE, AGAP, University Montpellier, Institut Agro, 34398 Montpellier, France
2
Sempéré G, Larmande P, Rouard M. Managing High-Density Genotyping Data with Gigwa. Methods Mol Biol 2022; 2443:415-427. [PMID: 35037218 DOI: 10.1007/978-1-0716-2067-0_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Next-generation sequencing technologies have enabled high-density genotyping for large numbers of samples. SNP calling pipelines now produce up to millions of such markers, which then need to be filtered in various ways depending on the type of analysis. One of the main challenges still lies in managing an increasing volume of genotyping files that are difficult to handle for many applications. Here, we provide a practical guide for efficiently managing large genomic variation data using Gigwa, a user-friendly, scalable and versatile application that may be deployed either remotely on web servers or on a local machine.
Affiliation(s)
- Guilhem Sempéré
  - CIRAD, UMR INTERTRYP, Montpellier, France
  - INTERTRYP, Univ Montpellier, CIRAD, IRD, Montpellier, France
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
- Pierre Larmande
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
  - DIADE, Univ Montpellier, IRD, Montpellier, France
- Mathieu Rouard
  - French Institute of Bioinformatics (IFB)-South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France
  - Bioversity International, Parc Scientifique Agropolis II, Montpellier, France
4
Cenci A, Sardos J, Hueber Y, Martin G, Breton C, Roux N, Swennen R, Carpentier SC, Rouard M. Unravelling the complex story of intergenomic recombination in ABB allotriploid bananas. Annals of Botany 2021; 127:7-20. [PMID: 32104882 PMCID: PMC7750727 DOI: 10.1093/aob/mcaa032] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 10/31/2019] [Accepted: 02/25/2020] [Indexed: 05/24/2023]
Abstract
BACKGROUND AND AIMS: Bananas (Musa spp.) are a major staple food for hundreds of millions of people in developing countries. The cultivated varieties are seedless and parthenocarpic clones of which the ancestral origin remains to be clarified. The most important cultivars are triploids with an AAA, AAB or ABB genome constitution, with A and B genomes provided by M. acuminata and M. balbisiana, respectively. Previous studies suggested that inter-genome recombinations were relatively common in banana cultivars and that triploids were more likely to have passed through an intermediate hybrid. In this study, we investigated the chromosome structure within the ABB group, composed of starchy cooking bananas that play an important role in food security.
METHODS: Using SNP markers called from RADSeq data, we studied the chromosome structure of 36 ABB genotypes spanning defined taxonomic subgroups. To complement our understanding, we searched for similar events within nine AB hybrid genotypes.
KEY RESULTS: Recurrent homoeologous exchanges (HEs), i.e. chromatin exchanges between A and B subgenomes, were unravelled with at least nine founding events (HE patterns) at the origin of ABB bananas prior to clonal diversification. Two independent founding events were found for Pisang Awak genotypes. Two HE patterns, corresponding to genotypes Pelipita and Klue Teparod, show an over-representation of B genome contribution. Three HE patterns mainly found in Indian accessions shared some recombined regions and two additional patterns did not correspond to any known subgroups.
CONCLUSIONS: The discovery of the nine founding events allowed an investigation of the possible routes that led to the creation of the different subgroups, which resulted in new hypotheses. Based on our observations, we suggest different routes that gave rise to the current diversity in the ABB cultivars: routes involving primary AB hybrids, routes leading to shared HEs and routes leading to a B excess ratio. Genetic fluxes took place between M. acuminata and M. balbisiana, particularly in India, where these unbalanced AB hybrids and ABB allotriploids originated, and where cultivated M. balbisiana are abundant. The results of this study clarify the classification of ABB cultivars, possibly leading to the revision of the classification of this subgroup.
Affiliation(s)
- Alberto Cenci
  - Alliance Bioversity International - CIAT, Montpellier, France
- Julie Sardos
  - Alliance Bioversity International - CIAT, Montpellier, France
- Yann Hueber
  - Alliance Bioversity International - CIAT, Montpellier, France
- Guillaume Martin
  - AGAP, Université de Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
  - CIRAD, UMR AGAP, Montpellier, France
- Nicolas Roux
  - Alliance Bioversity International - CIAT, Montpellier, France
- Rony Swennen
  - Alliance Bioversity International - CIAT, Leuven, Belgium
  - Laboratory of Tropical Crop Improvement, Division of Crop Biotechnics, KU Leuven, Leuven, Belgium
  - International Institute of Tropical Agriculture, c/o The Nelson Mandela African Institution of Science and Technology (NM-AIST), Arusha, Tanzania
- Mathieu Rouard
  - Alliance Bioversity International - CIAT, Montpellier, France
5
Morales N, Bauchet GJ, Tantikanjana T, Powell AF, Ellerbrock BJ, Tecle IY, Mueller LA. High density genotype storage for plant breeding in the Chado schema of Breedbase. PLoS One 2020; 15:e0240059. [PMID: 33175872 PMCID: PMC7657515 DOI: 10.1371/journal.pone.0240059] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/11/2020] [Accepted: 09/17/2020] [Indexed: 12/24/2022]
Abstract
Modern breeding programs routinely use genome-wide information for selecting individuals to advance. The large volumes of genotypic information required present a challenge for data storage and query efficiency. Major use cases require genotyping data to be linked with trait phenotyping data. In contrast to phenotyping data that are often stored in relational database schemas, next-generation genotyping data are traditionally stored in non-relational storage systems due to their extremely large scope. This study presents a novel data model implemented in Breedbase (https://breedbase.org/) for uniting relational phenotyping data and non-relational genotyping data within the open-source PostgreSQL database engine. Breedbase is an open-source web database designed to manage all of a breeder's informatics needs: management of field experiments, phenotypic and genotypic data collection and storage, and statistical analyses. The genotyping data are stored in a PostgreSQL data type known as binary JavaScript Object Notation (JSONb), where the JSON structures closely follow the Variant Call Format (VCF) data model. The Breedbase genotyping data model can handle different ploidy levels, structural variants, and any genotype encoded in VCF. JSONb is both compressed and indexed, resulting in a space- and time-efficient system. Furthermore, file caching maximizes data retrieval performance. Integration of all breeding data within the Chado database schema retains referential integrity that may be lost when genotyping and phenotyping data are stored in separate systems. Benchmarking demonstrates that the system is fast enough for computation of a genomic relationship matrix (GRM) and genome-wide association study (GWAS) for datasets involving 1,325 diploid Zea mays, 314 triploid Musa acuminata, and 924 diploid Manihot esculenta samples genotyped with 955,690, 142,119, and 287,952 genotyping-by-sequencing (GBS) markers, respectively.
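To make the idea of a JSON structure that follows the VCF data model more concrete, here is a minimal sketch of per-sample genotype documents that could be serialized into a JSON column, together with a ploidy-agnostic dosage count read from the VCF `GT` string. The document layout, sample names, and marker names are hypothetical and chosen for illustration; they are not Breedbase's actual schema. Only `GT` and `DP` are standard VCF FORMAT keys.

```python
import json
from typing import Dict, Optional


def dosage(gt: str) -> Optional[int]:
    """Count non-reference alleles in a VCF GT string of any ploidy.

    Accepts unphased ('0/1', '0/1/1') or phased ('0|1') calls and
    returns None when any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")


# Hypothetical documents: one per sample, keyed by marker name.
records = [
    {
        "sample": "diploid_sample_A",
        "calls": {
            "S1_1000": {"GT": "0/1", "DP": 14},
            "S1_2000": {"GT": "./.", "DP": 0},
        },
    },
    {
        "sample": "triploid_sample_B",
        "calls": {
            "S1_1000": {"GT": "0/1/1", "DP": 9},
            "S1_2000": {"GT": "1/1/1", "DP": 11},
        },
    },
]

# Round-trip through JSON text, as a JSON column would store it.
decoded = [json.loads(json.dumps(record)) for record in records]

dosages: Dict[str, Dict[str, Optional[int]]] = {
    rec["sample"]: {m: dosage(c["GT"]) for m, c in rec["calls"].items()}
    for rec in decoded
}
print(dosages["diploid_sample_A"])   # {'S1_1000': 1, 'S1_2000': None}
print(dosages["triploid_sample_B"])  # {'S1_1000': 2, 'S1_2000': 3}
```

Because the dosage is derived from however many alleles appear in `GT`, the same code handles diploid and triploid samples without a separate ploidy field.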
Affiliation(s)
- Nicolas Morales
  - Plant Breeding and Genetics, Cornell University, Ithaca, NY, United States of America
  - Boyce Thompson Institute, Ithaca, NY, United States of America
- Isaak Y. Tecle
  - Boyce Thompson Institute, Ithaca, NY, United States of America
6
Selby P, Abbeloos R, Backlund JE, Basterrechea Salido M, Bauchet G, Benites-Alfaro OE, Birkett C, Calaminos VC, Carceller P, Cornut G, Vasques Costa B, Edwards JD, Finkers R, Yanxin Gao S, Ghaffar M, Glaser P, Guignon V, Hok P, Kilian A, König P, Lagare JEB, Lange M, Laporte MA, Larmande P, LeBauer DS, Lyon DA, Marshall DS, Matthews D, Milne I, Mistry N, Morales N, Mueller LA, Neveu P, Papoutsoglou E, Pearce B, Perez-Masias I, Pommier C, Ramírez-González RH, Rathore A, Raquel AM, Raubach S, Rife T, Robbins K, Rouard M, Sarma C, Scholz U, Sempéré G, Shaw PD, Simon R, Soldevilla N, Stephen G, Sun Q, Tovar C, Uszynski G, Verouden M. BrAPI-an application programming interface for plant breeding applications. Bioinformatics 2020; 35:4147-4155. [PMID: 30903186 PMCID: PMC6792114 DOI: 10.1093/bioinformatics/btz190] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 08/14/2018] [Revised: 11/23/2018] [Accepted: 03/20/2019] [Indexed: 12/04/2022]
Abstract
Motivation: Modern genomic breeding methods rely heavily on very large amounts of phenotyping and genotyping data, presenting new challenges in effective data management and integration. Recently, the size and complexity of datasets have increased significantly, with the result that data are often stored on multiple systems. As analyses of interest increasingly require aggregation of datasets from diverse sources, data exchange between disparate systems becomes a challenge.
Results: To facilitate interoperability among breeding applications, we present the public plant Breeding Application Programming Interface (BrAPI). BrAPI is a standardized web service API specification. The development of BrAPI is a collaborative, community-based initiative involving a growing global community of over a hundred participants representing several dozen institutions and companies. Development of such a standard is recognized as critical to a number of important large breeding system initiatives as a foundational technology. The focus of the first version of the API is on providing services for connecting systems and retrieving basic breeding data including germplasm, study, observation, and marker data. A number of BrAPI-enabled applications, termed BrAPPs, have been written that take advantage of the emerging support of BrAPI by many databases.
Availability and implementation: More information on BrAPI, including links to the specification, test suites, BrAPPs, and sample implementations, is available at https://brapi.org/. The BrAPI specification and the developer tools are provided as free and open source.
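As a rough sketch of what consuming a standardized web service for germplasm data can look like, the example below builds a paginated request URL and unpacks a response envelope with `metadata.pagination` and `result.data`, the general shape used by BrAPI list calls. The base URL, the helper names, and the sample payload are invented for illustration, and the exact call path and field names vary between specification versions, so the specification at https://brapi.org/ should be treated as the authority. No network request is made.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode


def brapi_url(base: str, call: str, page: int = 0, page_size: int = 100) -> str:
    """Build a paginated URL for a BrAPI-style list call (assumed v1 path)."""
    query = urlencode({"page": page, "pageSize": page_size})
    return f"{base.rstrip('/')}/brapi/v1/{call}?{query}"


def unpack(payload: Dict[str, Any]) -> Tuple[List[Dict[str, Any]], bool]:
    """Return the data rows and whether further pages remain."""
    pagination = payload["metadata"]["pagination"]
    has_more = pagination["currentPage"] + 1 < pagination["totalPages"]
    return payload["result"]["data"], has_more


# Illustrative payload only; not output from any real server.
sample_response = json.loads("""
{
  "metadata": {
    "pagination": {"currentPage": 0, "pageSize": 2,
                   "totalCount": 3, "totalPages": 2},
    "status": [],
    "datafiles": []
  },
  "result": {
    "data": [
      {"germplasmDbId": "g1", "germplasmName": "Line A"},
      {"germplasmDbId": "g2", "germplasmName": "Line B"}
    ]
  }
}
""")

url = brapi_url("https://example.org/", "germplasm", page=0, page_size=2)
rows, more = unpack(sample_response)
print(url)
print([row["germplasmName"] for row in rows], more)
```

A client would typically keep requesting the next page while `unpack` reports that more pages remain.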
Affiliation(s)
- Peter Selby
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, New York, USA
- Omar E Benites-Alfaro
  - International Potato Center (CIP), Lima, Peru
  - International Food Policy Research Institute (IFPRI), Washington DC, USA
- Viana C Calaminos
  - International Rice Research Institute (IRRI), Los Baños, Laguna, The Philippines
- Pierre Carceller
  - AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
- Richard Finkers
  - Department of Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Star Yanxin Gao
  - Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Mehmood Ghaffar
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Philip Glaser
  - Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Puthick Hok
  - Diversity Arrays Technology, Bruce, Australia
- Patrick König
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Matthias Lange
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- David S LeBauer
  - College of Agricultural and Life Sciences, The University of Arizona, Tucson, AZ, USA
- David S Marshall
  - Information & Computational Sciences, The James Hutton Institute, Dundee, UK
  - SRUC, Edinburgh, UK
- Iain Milne
  - Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Pascal Neveu
  - MISTEA, INRA, Montpellier SupAgro, Universite de Montpellier, Montpellier, France
- Evangelia Papoutsoglou
  - Department of Plant Breeding, Wageningen University & Research, Wageningen, The Netherlands
- Cyril Pommier
  - URGI, INRA, Université Paris-Saclay, Versailles, France
- Abhishek Rathore
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Angel Manica Raquel
  - International Rice Research Institute (IRRI), Los Baños, Laguna, The Philippines
- Sebastian Raubach
  - Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Trevor Rife
  - Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- Kelly Robbins
  - Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, New York, USA
- Chaitanya Sarma
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Uwe Scholz
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Guilhem Sempéré
  - AGAP, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
  - INTERTRYP, Univ Montpellier, CIRAD, IRD, Montpellier, France
- Paul D Shaw
  - Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Nahuel Soldevilla
  - Integrated Breeding Program (IBP), CIMMYT, Texcoco, Mexico
  - LeafNode Technology, Buenos Aires, Argentina
- Gordon Stephen
  - Information & Computational Sciences, The James Hutton Institute, Dundee, UK
- Qi Sun
  - Institute of Biotechnology, Cornell University, Ithaca, New York, USA
- Clarysabel Tovar
  - Integrated Breeding Program (IBP), CIMMYT, Texcoco, Mexico
  - LeafNode Technology, Buenos Aires, Argentina
- Maikel Verouden
  - Wageningen University & Research, Biometris, Wageningen PB, The Netherlands
7
Nti-Addae Y, Matthews D, Ulat VJ, Syed R, Sempéré G, Pétel A, Renner J, Larmande P, Guignon V, Jones E, Robbins K. Benchmarking database systems for Genomic Selection implementation. Database-The Journal of Biological Databases and Curation 2020; 2019:5566651. [PMID: 31508797 PMCID: PMC6737464 DOI: 10.1093/database/baz096] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/26/2019] [Revised: 05/29/2019] [Accepted: 07/01/2019] [Indexed: 01/07/2023]
Abstract
MOTIVATION: With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Making effective use of this information requires DNA extraction and marker production facilities that can efficiently deploy the desired set of markers across samples, with a turnaround time short enough to allow selection before crosses need to be made. In reality, by the time breeders have collected all their phenotyping data and received the corresponding genotyping data, they often have only a short window in which to make decisions. This makes it challenging to organize the information and use it in downstream analyses that support breeders' decisions. Implementing genomic selection routinely as part of breeding programs therefore requires an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems.
RESULTS: We found that data extraction times are greatly influenced by the orientation in which genotype data are stored in a system. HDF5 consistently performed best, in part because it can work more efficiently with both orientations of the allele matrix.
AVAILABILITY: http://gobiin1.bti.cornell.edu:6083/projects/GBM/repos/benchmarking/browse.
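The point about storage orientation can be illustrated with a small in-memory sketch. With a marker-major layout (one row per marker), pulling all calls for one sample touches every row, whereas a sample-major layout returns the same data as a single row. The matrix values and function names below are made up for illustration; this is not the benchmark code referenced above, and real storage engines add chunking, caching, and compression that change the picture in detail.

```python
from typing import List

Matrix = List[List[int]]


def transpose(matrix: Matrix) -> Matrix:
    """Switch between marker-major and sample-major orientation."""
    return [list(column) for column in zip(*matrix)]


def sample_from_marker_major(matrix: Matrix, sample_idx: int) -> List[int]:
    """One sample's calls: reads a single value from every marker row."""
    return [row[sample_idx] for row in matrix]


def sample_from_sample_major(matrix: Matrix, sample_idx: int) -> List[int]:
    """One sample's calls: a single contiguous row lookup."""
    return matrix[sample_idx]


# Rows are markers, columns are samples (hypothetical dosages).
marker_major = [
    [0, 1, 2],
    [2, 2, 0],
    [1, 0, 0],
    [0, 0, 1],
]
sample_major = transpose(marker_major)

# Both orientations hold the same data; only the access pattern differs.
print(sample_from_marker_major(marker_major, 1))  # [1, 2, 0, 0]
print(sample_from_sample_major(sample_major, 1))  # [1, 2, 0, 0]
```

The reverse holds for marker-wise extraction, which is one way to read the reported advantage of a format that handles both orientations efficiently.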
Affiliation(s)
- Victor Jun Ulat
  - Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT)
- Raza Syed
  - Institute of Biotechnology, Cornell University
- Kelly Robbins
  - Section of Plant Breeding and Genetics, School of Integrative Plants Sciences, Cornell University
8
Sempéré G, Pétel A, Rouard M, Frouin J, Hueber Y, De Bellis F, Larmande P. Gigwa v2-Extended and improved genotype investigator. Gigascience 2019; 8:5488103. [PMID: 31077313 PMCID: PMC6511067 DOI: 10.1093/gigascience/giz051] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 11/30/2018] [Revised: 02/19/2019] [Accepted: 04/08/2019] [Indexed: 11/19/2022]
Abstract
Background: The study of genetic variations is the basis of many research domains in biology. From genome structure to population dynamics, many applications involve the use of genetic variants. The advent of next-generation sequencing technologies led to such a flood of data that the daily work of scientists is often more focused on data management than data analysis. This mass of genotyping data poses several computational challenges in terms of storage, search, sharing, analysis, and visualization. While existing tools try to solve these challenges, few of them offer a comprehensive and scalable solution.
Results: Gigwa v2 is an easy-to-use, species-agnostic web application for managing and exploring high-density genotyping data. It can handle multiple databases and may be installed on a local computer or deployed as an online data portal. It supports various standard import and export formats, provides advanced filtering options, and offers means to visualize density charts or push selected data into various stand-alone or online tools. It implements 2 standard RESTful application programming interfaces, GA4GH, which is health-oriented, and BrAPI, which is breeding-oriented, thus offering wide possibilities of interaction with third-party applications. The project home page provides a list of live instances allowing users to test the system on public data (or reasonably sized user-provided data).
Conclusions: This new version of Gigwa provides a more intuitive and more powerful way to explore large amounts of genotyping data by offering a scalable solution to search for genotype patterns, functional annotations, or more complex filtering. Furthermore, its user-friendliness and interoperability make it widely accessible to the life science community.
Affiliation(s)
- Guilhem Sempéré
  - Centre de coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR INTERTRYP, F-34398 Montpellier, France
  - South Green Bioinformatics Platform, Bioversity, CIRAD, Institut National de la Recherche Agronomique (INRA), IRD, Montpellier, France
  - INTERTRYP, Univ Montpellier, CIRAD, Institut de Recherche pour le Développement (IRD), Montpellier, France
- Adrien Pétel
  - South Green Bioinformatics Platform, Bioversity, CIRAD, Institut National de la Recherche Agronomique (INRA), IRD, Montpellier, France
  - DIADE, Univ Montpellier, IRD, 911 Avenue Agropolis, 34394 Montpellier, France
- Mathieu Rouard
  - South Green Bioinformatics Platform, Bioversity, CIRAD, Institut National de la Recherche Agronomique (INRA), IRD, Montpellier, France
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- Julien Frouin
  - CIRAD, UMR AGAP, F-34398 Montpellier, France
  - AGAP, Univ Montpellier, CIRAD, INRA, Institut national d'études supérieures agronomiques de Montpellier (Montpellier SupAgro), Montpellier, France
- Yann Hueber
  - South Green Bioinformatics Platform, Bioversity, CIRAD, Institut National de la Recherche Agronomique (INRA), IRD, Montpellier, France
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- Fabien De Bellis
  - CIRAD, UMR AGAP, F-34398 Montpellier, France
  - AGAP, Univ Montpellier, CIRAD, INRA, Institut national d'études supérieures agronomiques de Montpellier (Montpellier SupAgro), Montpellier, France
- Pierre Larmande
  - South Green Bioinformatics Platform, Bioversity, CIRAD, Institut National de la Recherche Agronomique (INRA), IRD, Montpellier, France
  - DIADE, Univ Montpellier, IRD, 911 Avenue Agropolis, 34394 Montpellier, France
9
Wercelens P, da Silva W, Hondo F, Castro K, Walter ME, Araújo A, Lifschitz S, Holanda M. Bioinformatics Workflows With NoSQL Database in Cloud Computing. Evol Bioinform Online 2019; 15:1176934319889974. [PMID: 31839702 PMCID: PMC6896126 DOI: 10.1177/1176934319889974] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/27/2019] [Accepted: 10/29/2019] [Indexed: 12/29/2022]
Abstract
Scientific workflows can be understood as arrangements of managed activities executed by different processing entities. It is a regular Bioinformatics approach applying workflows to solve problems in Molecular Biology, notably those related to sequence analyses. Due to the nature of the raw data and the in silico environment of Molecular Biology experiments, apart from the research subject, 2 practical and closely related problems have been studied: reproducibility and computational environment. When aiming to enhance the reproducibility of Bioinformatics experiments, various aspects should be considered. The reproducibility requirements comprise the data provenance, which enables the acquisition of knowledge about the trajectory of data over a defined workflow, the settings of the programs, and the entire computational environment. Cloud computing is a booming alternative that can provide this computational environment, hiding technical details, and delivering a more affordable, accessible, and configurable on-demand environment for researchers. Considering this specific scenario, we proposed a solution to improve the reproducibility of Bioinformatics workflows in a cloud computing environment using both Infrastructure as a Service (IaaS) and Not only SQL (NoSQL) database systems. To meet the goal, we have built 3 typical Bioinformatics workflows and ran them on 1 private and 2 public clouds, using different types of NoSQL database systems to persist the provenance data according to the Provenance Data Model (PROV-DM). We present here the results and a guide for the deployment of a cloud environment for Bioinformatics exploring the characteristics of various NoSQL database systems to persist provenance data.
Affiliation(s)
- Polyane Wercelens
  - Department of Computer Science, University of Brasília, Brasília, Brazil
- Waldeyr da Silva
  - Department of Computer Science, University of Brasília, Brasília, Brazil
  - NEPBIO (Group of Biological Studies and Research on Cerrado), Federal Institute of Goiás (IFG), Formosa, Goiás, Brazil
- Fernanda Hondo
  - Department of Computer Science, University of Brasília, Brasília, Brazil
- Klayton Castro
  - Department of Computer Science, University of Brasília, Brasília, Brazil
- Aletéia Araújo
  - Department of Computer Science, University of Brasília, Brasília, Brazil
- Sergio Lifschitz
  - Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
- Maristela Holanda
  - Department of Computer Science, University of Brasília, Brasília, Brazil
10. Juanillas V, Dereeper A, Beaume N, Droc G, Dizon J, Mendoza JR, Perdon JP, Mansueto L, Triplett L, Lang J, Zhou G, Ratharanjan K, Plale B, Haga J, Leach JE, Ruiz M, Thomson M, Alexandrov N, Larmande P, Kretzschmar T, Mauleon RP. Rice Galaxy: an open resource for plant science. Gigascience 2019; 8:giz028. [PMID: 31107941 PMCID: PMC6527052 DOI: 10.1093/gigascience/giz028] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 06/28/2018] [Revised: 08/29/2018] [Accepted: 02/12/2019] [Indexed: 01/16/2023]
Abstract
BACKGROUND Rice molecular genetics, breeding, genetic diversity, and allied research (such as rice-pathogen interaction) have adopted sequencing technologies and high-density genotyping platforms for genome variation analysis and gene discovery. Germplasm collections representing rice diversity, improved varieties, and elite breeding materials are accessible through rice gene banks for use in research and breeding, with many having genome sequences and high-density genotype data available. Combining phenotypic and genotypic information on these accessions enables genome-wide association analysis, which is driving quantitative trait loci discovery and molecular marker development. Comparative sequence analyses across quantitative trait loci regions facilitate the discovery of novel alleles. Analyses involving DNA sequences and large genotyping matrices for thousands of samples, however, pose a challenge to non-computer savvy rice researchers. FINDINGS The Rice Galaxy resource has shared datasets that include high-density genotypes from the 3,000 Rice Genomes project and sequences with corresponding annotations from 9 published rice genomes. The Rice Galaxy web server and deployment installer includes tools for designing single-nucleotide polymorphism assays, analyzing genome-wide association studies, population diversity, rice-bacterial pathogen diagnostics, and a suite of published genomic prediction methods. A prototype Rice Galaxy compliant to Open Access, Open Data, and Findable, Accessible, Interoperable, and Reproducible principles is also presented. CONCLUSIONS Rice Galaxy is a freely available resource that empowers the plant research community to perform state-of-the-art analyses and utilize publicly available big datasets for both fundamental and applied science.
Affiliation(s)
- Venice Juanillas
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
- Alexis Dereeper
  - Institut de recherche pour le développement (IRD), University of Montpellier, DIADE, IPME, Montpellier, France
- Nicolas Beaume
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
- Gaetan Droc
  - CIRAD, UMR AGAP, F-34398 Montpellier, France
- Joshua Dizon
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
- John Robert Mendoza
  - Advanced Science and Technology Institute, Department of Science and Technology, Quezon City, Philippines
- Jon Peter Perdon
  - Advanced Science and Technology Institute, Department of Science and Technology, Quezon City, Philippines
- Locedie Mansueto
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
- Lindsay Triplett
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523-1177, USA
- Jillian Lang
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523-1177, USA
- Gabriel Zhou
  - Indiana University, 107 S Indiana Ave, Bloomington, IN 47405, USA
- Beth Plale
  - Indiana University, 107 S Indiana Ave, Bloomington, IN 47405, USA
- Jason Haga
  - National Institute of Advanced Industrial Science and Technology, AIST Tsukuba Central 1,1-1-1 Umezono, Tsukuba, Ibaraki 305-8560, Japan
- Jan E Leach
  - Department of Bioagricultural Sciences and Pest Management, Colorado State University, Fort Collins, CO 80523-1177, USA
- Manuel Ruiz
  - CIRAD, UMR AGAP, F-34398 Montpellier, France
- Michael Thomson
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
  - Department of Soil and Crop Sciences, Texas A&M University, Houston, TX, USA
- Nickolai Alexandrov
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
- Pierre Larmande
  - Institut de recherche pour le développement (IRD), University of Montpellier, DIADE, IPME, Montpellier, France
- Tobias Kretzschmar
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
  - Southern Cross Plant Science, Southern Cross University, Lismore, Australia
- Ramil P Mauleon
  - International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines
  - Southern Cross Plant Science, Southern Cross University, Lismore, Australia
11. Cubry P, Tranchant-Dubreuil C, Thuillet AC, Monat C, Ndjiondjop MN, Labadie K, Cruaud C, Engelen S, Scarcelli N, Rhoné B, Burgarella C, Dupuy C, Larmande P, Wincker P, François O, Sabot F, Vigouroux Y. The Rise and Fall of African Rice Cultivation Revealed by Analysis of 246 New Genomes. Curr Biol 2018; 28:2274-2282.e6. [PMID: 29983312 DOI: 10.1016/j.cub.2018.05.066] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 02/12/2018] [Revised: 04/10/2018] [Accepted: 05/24/2018] [Indexed: 12/23/2022]
Abstract
African rice (Oryza glaberrima) was domesticated independently from Asian rice. The geographical origin of its domestication remains elusive. Using 246 new whole-genome sequences, we inferred the cradle of its domestication to be in the Inner Niger Delta. Domestication was preceded by a sharp decline of most wild populations that started more than 10,000 years ago. The wild population collapse occurred during the drying of the Sahara. This finding supports the hypothesis that depletion of wild resources in the Sahara triggered African rice domestication. African rice cultivation strongly expanded 2,000 years ago. During the last 5 centuries, a sharp decline of its cultivation coincided with the introduction of Asian rice in Africa. A gene, PROG1, associated with an erect plant architecture phenotype, showed convergent selection in two rice cultivated species, Oryza glaberrima from Africa and Oryza sativa from Asia. In contrast, a shattering gene, SH5, showed selection signature during African rice domestication, but not during Asian rice domestication. Overall, our genomic data revealed a complex history of African rice domestication influenced by important climatic changes in the Saharan area, by the expansion of African agricultural society, and by recent replacement by another domesticated species.
Affiliation(s)
- Philippe Cubry
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France
- Christine Tranchant-Dubreuil
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; SouthGreen Development Platform, Agropolis Campus, Montpellier, France
- Anne-Céline Thuillet
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France
- Cécile Monat
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; SouthGreen Development Platform, Agropolis Campus, Montpellier, France
- Karine Labadie
  - CEA, Institut de Biologie François Jacob, Genoscope, 2 Rue Gaston Crémieux, 91057 Evry, France; CNRS, UMR 8030, CP5706, Evry, France; Université d'Evry, UMR 8030, CP5706, Evry, France
- Corinne Cruaud
  - CEA, Institut de Biologie François Jacob, Genoscope, 2 Rue Gaston Crémieux, 91057 Evry, France; CNRS, UMR 8030, CP5706, Evry, France; Université d'Evry, UMR 8030, CP5706, Evry, France
- Stefan Engelen
  - CEA, Institut de Biologie François Jacob, Genoscope, 2 Rue Gaston Crémieux, 91057 Evry, France; CNRS, UMR 8030, CP5706, Evry, France; Université d'Evry, UMR 8030, CP5706, Evry, France
- Nora Scarcelli
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France
- Bénédicte Rhoné
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, Lyon, France
- Concetta Burgarella
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France
- Pierre Larmande
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; SouthGreen Development Platform, Agropolis Campus, Montpellier, France; Institut de Biologie Computationnelle (IBC), Université Montpellier 2, 860 Rue St Priest, 34095 Montpellier Cedex 5, France
- Patrick Wincker
  - CEA, Institut de Biologie François Jacob, Genoscope, 2 Rue Gaston Crémieux, 91057 Evry, France; CNRS, UMR 8030, CP5706, Evry, France; Université d'Evry, UMR 8030, CP5706, Evry, France
- Olivier François
  - Université Grenoble-Alpes, CNRS, UMR 5525 TIMC-IMAG, 38042 Grenoble, France
- François Sabot
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; SouthGreen Development Platform, Agropolis Campus, Montpellier, France; Université de Montpellier, Place Eugène Bataillon, 34000 Montpellier, France
- Yves Vigouroux
  - Institut de Recherche pour le Développement, UMR DIADE, 911 Avenue Agropolis, 34394 Montpellier, France; Université de Montpellier, Place Eugène Bataillon, 34000 Montpellier, France
12. Ruas M, Guignon V, Sempere G, Sardos J, Hueber Y, Duvergey H, Andrieu A, Chase R, Jenny C, Hazekamp T, Irish B, Jelali K, Adeka J, Ayala-Silva T, Chao CP, Daniells J, Dowiya B, Effa Effa B, Gueco L, Herradura L, Ibobondji L, Kempenaers E, Kilangi J, Muhangi S, Ngo Xuan P, Paofa J, Pavis C, Thiemele D, Tossou C, Sandoval J, Sutanto A, Vangu Paka G, Yi G, Van den Houwe I, Roux N, Rouard M. MGIS: managing banana (Musa spp.) genetic resources information and high-throughput genotyping data. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2018; 2017:3866796. [PMID: 29220435 PMCID: PMC5502358 DOI: 10.1093/database/bax046] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/16/2017] [Accepted: 05/12/2017] [Indexed: 12/22/2022]
Abstract
Unraveling the genetic diversity held in genebanks on a large scale is underway, due to advances in Next-generation sequence (NGS) based technologies that produce high-density genetic markers for a large number of samples at low cost. Genebank users should be in a position to identify and select germplasm from the global genepool based on a combination of passport, genotypic and phenotypic data. To facilitate this, a new generation of information systems is being designed to efficiently handle data and link it with other external resources such as genome or breeding databases. The Musa Germplasm Information System (MGIS), the database for global ex situ-held banana genetic resources, has been developed to address those needs in a user-friendly way. In developing MGIS, we selected a generic database schema (Chado), the robust content management system Drupal for the user interface, and Tripal, a set of Drupal modules which links the Chado schema to Drupal. MGIS allows germplasm collection examination, accession browsing, advanced search functions, and germplasm orders. Additionally, we developed unique graphical interfaces to compare accessions and to explore them based on their taxonomic information. Accession-based data has been enriched with publications, genotyping studies and associated genotyping datasets reporting on germplasm use. Finally, an interoperability layer has been implemented to facilitate the link with complementary databases like the Banana Genome Hub and the MusaBase breeding database. Database URL: https://www.crop-diversity.org/mgis/
Affiliation(s)
- Max Ruas
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- V Guignon
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
  - South Green Bioinformatics Platform, Montpellier, France
- G Sempere
  - South Green Bioinformatics Platform, Montpellier, France
  - CIRAD, UMR AGAP 34398 Montpellier Cedex 5, France
- J Sardos
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- Y Hueber
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
  - South Green Bioinformatics Platform, Montpellier, France
- H Duvergey
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- A Andrieu
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- R Chase
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- C Jenny
  - CIRAD, UMR AGAP 34398 Montpellier Cedex 5, France
- T Hazekamp
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- B Irish
  - USDA-ARS-Tropical Agriculture Research Station, Mayaguez, Puerto Rico
- K Jelali
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
- J Adeka
  - University of Kisangani, Kisangani (UNIKIS), Democratic Republic of Congo
- T Ayala-Silva
  - USDA-ARS-Tropical Agriculture Research Station, Mayaguez, Puerto Rico
- C P Chao
  - Taiwan Banana Research Institute (TBRI), Chiuju, Pingtung, Taiwan, Republic of China
- J Daniells
  - Department of Agriculture, Fisheries and Forestry, Queensland Government (DAFF South Johnstone), Brisbane, Australia
- B Dowiya
  - Institut National pour l'Etude et la Recherche Agronomiques (INERA), Democratic Republic of Congo
- B Effa Effa
  - Centre National de la Recherche Scientifique et Technologique (CENAREST), Libreville, Gabon
- L Gueco
  - Institute of Plant Breeding (IPB), University of the Philippines (UPLB), Los Baños, Philippines
- L Herradura
  - Bureau of Plant Industry (BPI) - Davao National Crop Research and Development Center, Davao City, Philippines
- L Ibobondji
  - Centre Africain de Recherche sur Bananes et Plantains (CARBAP), Njombe, Cameroon
- E Kempenaers
  - Bioversity International, International Musa Germplasm Transit Center (ITC), KULeuven, Leuven, Belgium
- J Kilangi
  - Agricultural Research Institute (ARI) Maruku, Bukoba, Tanzania
- S Muhangi
  - National Agricultural Research Organization (NARO), Mbarara, Uganda
- P Ngo Xuan
  - Fruit and Vegetable Research Institute (FAVRI), Hanoi, Vietnam
- J Paofa
  - National Agricultural Research Institute (NARI), Laloki, Papua New Guinea
- C Pavis
  - CRB Plantes Tropicales, CIRAD INRA - Neufchâteau, Guadeloupe, France
- D Thiemele
  - Centre National de Recherches Agronomiques (CNRA), Abidjan, Cote d'Ivoire
- C Tossou
  - Institut National de Recherche Agronomique du Bénin (INRAB), Cotonou, Bénin
- J Sandoval
  - Corporación Bananera Nacional S.A (CORBANA), San José, Costa Rica
- A Sutanto
  - Indonesian Centre for Horticultural Research and Development (ICHORD), Bogor, Indonesia
- G Vangu Paka
  - Institut National pour l'Etude et la Recherche Agronomiques (INERA), Democratic Republic of Congo
- G Yi
  - Institute of Fruit Tree Research (IFTR), Guangdong Academy of Agricultural Sciences (GDAAS), Guangdong, China
- I Van den Houwe
  - Bioversity International, International Musa Germplasm Transit Center (ITC), KULeuven, Leuven, Belgium
- N Roux
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
  - Bioversity International, International Musa Germplasm Transit Center (ITC), KULeuven, Leuven, Belgium
- M Rouard
  - Bioversity International, Parc Scientifique Agropolis II, 34397 Montpellier Cedex 5, France
  - South Green Bioinformatics Platform, Montpellier, France
13. Sempéré G, Philippe F, Dereeper A, Ruiz M, Sarah G, Larmande P. Erratum to: Gigwa-Genotype investigator for genome-wide analyses. Gigascience 2016; 5:48. [PMID: 27806726 PMCID: PMC5094008 DOI: 10.1186/s13742-016-0153-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/19/2016] [Accepted: 10/21/2016] [Indexed: 11/10/2022]
Affiliation(s)
- Guilhem Sempéré
  - UMR InterTryp (CIRAD), Campus International de Baillarguet, 34398, Montpellier, Cedex 5, France
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- Florian Philippe
  - UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- Alexis Dereeper
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR IPME (IRD), 911 Avenue Agropolis, 34394, Montpellier, Cedex 5, France
- Manuel Ruiz
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR AGAP, CIRAD, 34398, Montpellier, Cedex 5, France
  - Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
  - Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), 6713, Cali, Colombia
- Gautier Sarah
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - INRA, UMR AGAP, 34398, Montpellier, Cedex 5, France
- Pierre Larmande
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
  - INRIA Zenith Team, LIRMM, 161 Rue Ada, 34095, Montpellier, Cedex 5, France
14. Sempéré G, Philippe F, Dereeper A, Ruiz M, Sarah G, Larmande P. Gigwa-Genotype investigator for genome-wide analyses. Gigascience 2016; 5:25. [PMID: 27267926 PMCID: PMC4897896 DOI: 10.1186/s13742-016-0131-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/12/2016] [Accepted: 05/16/2016] [Indexed: 01/16/2023]
Abstract
BACKGROUND Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. DESCRIPTION Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. CONCLUSIONS The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.
Affiliation(s)
- Guilhem Sempéré
  - UMR InterTryp (CIRAD), Campus International de Baillarguet, 34398, Montpellier, Cedex 5, France
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- Florian Philippe
  - UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- Alexis Dereeper
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR IPME (IRD), 911 Avenue Agropolis, 34394, Montpellier, Cedex 5, France
- Manuel Ruiz
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR AGAP, CIRAD, 34398, Montpellier, Cedex 5, France
  - Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
  - Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), 6713, Cali, Colombia
- Gautier Sarah
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - INRA, UMR AGAP, 34398, Montpellier, Cedex 5, France
- Pierre Larmande
  - South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
  - Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
  - INRIA Zenith Team, LIRMM, 161 Rue Ada, 34095, Montpellier, Cedex 5, France