1
Zass L, Mwapagha LM, Louis-Jacques AF, Allali I, Mulindwa J, Kiran A, Hanachi M, Souiai O, Mulder N, Oduaran OH. Advancing microbiome research through standardized data and metadata collection: introducing the Microbiome Research Data Toolkit. Database (Oxford) 2024; 2024:baae062. [PMID: 39167718 PMCID: PMC11338178 DOI: 10.1093/database/baae062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 06/28/2024] [Accepted: 08/15/2024] [Indexed: 08/23/2024]
Abstract
Microbiome research has made significant gains with the evolution of sequencing technologies. Ensuring comparability between studies and enhancing the findability, accessibility, interoperability and reproducibility of microbiome data are crucial for maximizing the value of this growing body of research. To address the challenges of standardized metadata reporting, collection and curation, the Microbiome Working Group of the Human Heredity and Health in Africa (H3Africa) consortium aimed to develop a comprehensive solution. In this paper, we present the Microbiome Research Data Toolkit, a versatile tool designed to standardize microbiome research metadata, facilitate MIxS-MIMS and PhenX reporting, standardize prospective collection of participant biological and lifestyle data, and retrospectively harmonize such data. This toolkit enables past, present and future microbiome research endeavors to collaborate effectively, fostering novel collaborations and accelerating knowledge discovery in the field. Database URL: https://doi.org/10.25375/uct.24218999.v2.
Affiliation(s)
- Lyndon Zass
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Rondebosch, Cape Town 7701, South Africa
- Lamech M Mwapagha
- Department of Biology, Chemistry and Physics, Faculty of Health, Natural Resources and Applied Sciences, Namibia University of Science and Technology, Private Bag 13388, 13 Jackson Kaujeua Street, Windhoek, Namibia
- Adetola F Louis-Jacques
- Department of Obstetrics and Gynecology, Division of Maternal-Fetal Medicine, University of Florida, 1600 SW Archer Road, Gainesville, FL 32610, USA
- Imane Allali
- Laboratory of Human Pathologies Biology, Department of Biology, Faculty of Sciences, Mohammed V University in Rabat, Rabat, Morocco
- Julius Mulindwa
- Department of Biochemistry and Sports Sciences, College of Natural Sciences, Makerere University, P.O. Box 7062, Kampala, Uganda
- Anmol Kiran
- Malawi-Liverpool-Wellcome Trust, P.O. Box 30096, Blantyre 3, Malawi
- Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool CH64 7TE, UK
- Mariem Hanachi
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR16IPT09), Institute Pasteur of Tunis, University Tunis El Manar, 13, Place Pasteur, B.P. 74, Tunis 1002, Tunisia
- Oussama Souiai
- Laboratory of Bioinformatics, Biomathematics and Biostatistics (LR16IPT09), Institute Pasteur of Tunis, University Tunis El Manar, 13, Place Pasteur, B.P. 74, Tunis 1002, Tunisia
- Nicola Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences, IDM, University of Cape Town, Rondebosch, Cape Town 7701, South Africa
- Ovokeraye H Oduaran
- Sydney Brenner Institute for Molecular Bioscience, University of the Witwatersrand, 9 Jubilee Road, Parktown 2193, Johannesburg, South Africa

2
Cavusoglu E, Sari U, Tiryaki I. Genome-wide identification and expression analysis of Na+/H+ antiporter (NHX) genes in tomato under salt stress. Plant Direct 2023; 7:e543. [PMID: 37965196 PMCID: PMC10641485 DOI: 10.1002/pld3.543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 09/09/2023] [Accepted: 10/09/2023] [Indexed: 11/16/2023]
Abstract
Plant Na+/H+ antiporter (NHX) genes enhance salt tolerance by preventing excessive Na+ accumulation in the cytosol through partitioning of Na+ ions into vacuoles or extracellular transport across the plasma membrane. However, there is limited detailed information regarding the salt stress responsive SlNHXs in the most recent tomato genome. We investigated the role of this gene family's expression patterns in the open flower tissues under salt shock in Solanum lycopersicum using a genome-wide approach. A total of seven putative SlNHX genes located on chromosomes 1, 4, 6, and 10 were identified, but no ortholog of the NHX5 gene was identified in the tomato genome. Phylogenetic analysis revealed that these genes are divided into three different groups. SlNHX proteins with 10-12 transmembrane domains were hypothetically localized in vacuoles or cell membranes. Promoter analysis revealed that SlNHX6 and SlNHX8 are involved with the stress-related MeJA hormone in response to salt stress signaling. The structural motif analysis of SlNHX1, -2, -3, -4, and -6 proteins showed that they have highly conserved amiloride binding sites. The protein-protein network revealed that SlNHX7 and SlNHX8 interact physically with Salt Overly Sensitive (SOS) pathway proteins. Transcriptome analysis demonstrated that the SlNHX2 and SlNHX6 genes were substantially expressed in the open flower tissues. Moreover, quantitative PCR analysis indicated that all SlNHX genes, particularly SlNHX6 and SlNHX8, are significantly upregulated by salt shock in the open flower tissues. Our results provide an updated framework for future genetic research and development of breeding strategies against salt stress in the tomato.
Affiliation(s)
- Erman Cavusoglu
- Department of Agricultural Biotechnology, Faculty of Agriculture, Canakkale Onsekiz Mart University, Terzioglu Campus, Canakkale, Turkey
- Ugur Sari
- Department of Agricultural Biotechnology, Faculty of Agriculture, Canakkale Onsekiz Mart University, Terzioglu Campus, Canakkale, Turkey
- Iskender Tiryaki
- Department of Agricultural Biotechnology, Faculty of Agriculture, Canakkale Onsekiz Mart University, Terzioglu Campus, Canakkale, Turkey

3
Alvare G, Roche-Lima A, Fristensky B. BioLegato: a programmable, object-oriented graphic user interface. BMC Bioinformatics 2023; 24:316. [PMID: 37605108 PMCID: PMC10441721 DOI: 10.1186/s12859-023-05436-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 08/03/2023] [Indexed: 08/23/2023] Open
Abstract
BACKGROUND: Biologists are faced with an ever-changing array of complex software tools with steep learning curves, often run on High Performance Computing platforms. To resolve the tradeoff between analytical sophistication and usability, we have designed BioLegato, a programmable graphical user interface (GUI) for running external programs.
RESULTS: BioLegato can run any program or pipeline that can be launched as a command. BioLegato reads specifications for each tool from files written in PCD, a simple language for specifying GUI components that set parameters for calling external programs. Thus, adding new tools to BioLegato can be done without changing the BioLegato Java code itself. The process is as simple as copying an existing PCD file and modifying it for the new program, which is more like filling in a form than writing code. PCD thus facilitates rapid development of new applications using existing programs as building blocks, and getting them to work together seamlessly.
CONCLUSION: BioLegato applies Object-Oriented concepts to the user experience by organizing applications based on discrete data types and the methods relevant to that data. PCD makes it easier for BioLegato applications to evolve with the succession of analytical tools for bioinformatics. BioLegato is applicable not only in biology, but in almost any field in which disparate software tools need to work as an integrated system.
Affiliation(s)
- Graham Alvare
- Access Norwest Co-op Community Health, Winnipeg, Canada
- Abiel Roche-Lima
- RCMI Program, Medical Science Campus, University of Puerto Rico, San Juan, Puerto Rico, USA
- Brian Fristensky
- Department of Plant Science, University of Manitoba, Winnipeg, Canada.

4
Li H, Sun P, Wang Y, Zhang Z, Yang J, Suo Y, Han W, Diao S, Li F, Fu J. Allele-aware chromosome-level genome assembly of the autohexaploid Diospyros kaki Thunb. Sci Data 2023; 10:270. [PMID: 37169805 PMCID: PMC10175270 DOI: 10.1038/s41597-023-02175-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/14/2022] [Accepted: 04/21/2023] [Indexed: 05/13/2023] Open
Abstract
Artificially improving persimmon (Diospyros kaki Thunb.), one of the most important fruit trees, remains challenging owing to the lack of reference genomes. In this study, we generated an allele-aware chromosome-level genome assembly for the autohexaploid persimmon 'Xiaoguotianshi' (Chinese-PCNA type) using PacBio CCS and Hi-C technology. The final assembly contained 4.52 Gb, with a contig N50 value of 5.28 Mb and scaffold N50 value of 44.01 Mb, of which 4.06 Gb (89.87%) was anchored onto 90 chromosome-level pseudomolecules comprising 15 homologous groups with 6 allelic chromosomes in each. A total of 153,288 protein-coding genes were predicted, of which 98.60% were functionally annotated. Repetitive sequences accounted for 64.02% of the genome, and 110,480 rRNAs, 12,297 tRNAs, 1,483 miRNAs, and 3,510 snRNA genes were also identified. This genome assembly fills the knowledge gap in the autohexaploid persimmon genome, which is conducive to the study of the regulatory mechanisms underlying the major economically advantageous traits of persimmons and to promoting breeding programs.
Affiliation(s)
- Huawei Li
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, No. 498 Shaoshan South Road, Changsha, 410004, China
- Peng Sun
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Yiru Wang
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Zhongren Zhang
- Novogene Bioinformatics Institute, Beijing, 100083, China
- Jun Yang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai Chenshan Botanical Garden, 3888 Chenhua Road, Shanghai, 201602, China
- Yujing Suo
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Weijuan Han
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Songfeng Diao
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China
- Fangdong Li
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China.
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China.
- Jianmin Fu
- Research Institute of Non-timber Forestry, Chinese Academy of Forestry, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China.
- Key Laboratory of Non-timber Forest Germplasm Enhancement & Utilization of State Administration of Forestry and Grassland, No. 3 Weiwu Road, Jinshui District, Zhengzhou, 450003, China.
- Henan Key Laboratory of Germplasm Innovation and Utilization of Eco-economic Woody Plant, Pingdingshan University, Pingdingshan, 467000, China.

5
Zhao Y, Yang Z, Zhang Z, Yin M, Chu S, Tong Z, Qin Y, Zha L, Fang Q, Yuan Y, Huang L, Peng H. The first chromosome-level Fallopia multiflora genome assembly provides insights into stilbene biosynthesis. Horticulture Research 2023; 10:uhad047. [PMID: 37213683 PMCID: PMC10194901 DOI: 10.1093/hr/uhad047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Accepted: 03/07/2023] [Indexed: 05/23/2023]
Abstract
Fallopia multiflora (Thunb.) Harald, a vine belonging to the Polygonaceae family, is used in traditional medicine. The stilbenes contained in it have significant pharmacological activities in anti-oxidation and anti-aging. This study describes the assembly of the F. multiflora genome and presents its chromosome-level genome sequence containing 1.46 gigabases of data (with a contig N50 of 1.97 megabases), 1.44 gigabases of which was assigned to 11 pseudochromosomes. Comparative genomics confirmed that F. multiflora shared a whole-genome duplication event with Tartary buckwheat and then underwent different transposon evolution after separation. Combining genomics, transcriptomics, and metabolomics data to map a network of associated genes and metabolites, we identified two FmRS genes responsible for the catalysis of one molecule of p-coumaroyl-CoA and three molecules of malonyl-CoA to resveratrol in F. multiflora. These findings not only serve as the basis for revealing the stilbene biosynthetic pathway but will also contribute to the development of tools for increasing the production of bioactive stilbenes through molecular breeding in plants or metabolic engineering in microbes. Moreover, the reference genome of F. multiflora is a useful addition to the genomes of the Polygonaceae family.
Affiliation(s)
- Shanshan Chu
- School of Pharmacy, Anhui University of Chinese Medicine, Hefei 230012, China
- Anhui Province Key Laboratory of Research & Development of Chinese Medicine, Hefei 230012, China
- Zhenzhen Tong
- School of Pharmacy, Anhui University of Chinese Medicine, Hefei 230012, China
- Yuejian Qin
- School of Pharmacy, Anhui University of Chinese Medicine, Hefei 230012, China
- Liangping Zha
- School of Pharmacy, Anhui University of Chinese Medicine, Hefei 230012, China
- Anhui Province Key Laboratory of Research & Development of Chinese Medicine, Hefei 230012, China
- Qingying Fang
- School of Pharmacy, Anhui University of Chinese Medicine, Hefei 230012, China
- Anhui Province Key Laboratory of Research & Development of Chinese Medicine, Hefei 230012, China

6
Song A, Su J, Wang H, Zhang Z, Zhang X, Van de Peer Y, Chen F, Fang W, Guan Z, Zhang F, Wang Z, Wang L, Ding B, Zhao S, Ding L, Liu Y, Zhou L, He J, Jia D, Zhang J, Chen C, Yu Z, Sun D, Jiang J, Chen S, Chen F. Analyses of a chromosome-scale genome assembly reveal the origin and evolution of cultivated chrysanthemum. Nat Commun 2023; 14:2021. [PMID: 37037808 PMCID: PMC10085997 DOI: 10.1038/s41467-023-37730-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 07/30/2022] [Accepted: 03/29/2023] [Indexed: 04/12/2023] Open
Abstract
Chrysanthemum (Chrysanthemum morifolium Ramat.) is a globally important ornamental plant with great economic, cultural, and symbolic value. However, research on chrysanthemum is challenging due to its complex genetic background. Here, we report a near-complete assembly and annotation for C. morifolium comprising 27 pseudochromosomes (8.15 Gb; scaffold N50 of 303.69 Mb). Comparative and evolutionary analyses reveal a whole-genome triplication (WGT) event shared by Chrysanthemum species approximately 6 million years ago (Mya) and the possible lineage-specific polyploidization of C. morifolium approximately 3 Mya. Multilevel evidence suggests that C. morifolium is likely a segmental allopolyploid. Furthermore, a combination of genomics and transcriptomics approaches demonstrates that the C. morifolium genome can be used to identify genes underlying key ornamental traits. Phylogenetic analysis of CmCCD4a traces the flower colour breeding history of cultivated chrysanthemum. Genomic resources generated from this study could help to accelerate chrysanthemum genetic improvement.
Affiliation(s)
- Aiping Song
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Jiangshuo Su
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Haibin Wang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Zhongren Zhang
- Novogene Bioinformatics Institute, Beijing, 100083, China
- Xingtan Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
- Yves Van de Peer
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Department of Plant Biotechnology and Bioinformatics, Ghent University, VIB Center for Plant Systems Biology, 9052, Ghent, Belgium
- Center for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, 0028, South Africa
- Fei Chen
- College of tropical crops, Sanya Nanfan Research Institute, Hainan University & Hainan Yazhou Bay Seed Laboratory, Sanya, Hainan, 572025, China
- Weimin Fang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Zhiyong Guan
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Fei Zhang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Zhenxing Wang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Likai Wang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Baoqing Ding
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Shuang Zhao
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Lian Ding
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Ye Liu
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Lijie Zhou
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Jun He
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Diwen Jia
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Jiali Zhang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Chuwen Chen
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Zhongyu Yu
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Daojin Sun
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China
- Jiafu Jiang
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China.
- Sumei Chen
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China.
- Fadi Chen
- State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Key Laboratory of Landscaping, Key Laboratory of Flower Biology and Germplasm Innovation (South), Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu, 210095, China.

7
Zhong CJ, Hu XL, Yang XL, Gan HQ, Yan KC, Shu FT, Wei P, Gong T, Luo PF, James TD, Chen ZH, Zheng YJ, He XP, Xia ZF. Metabolically Specific In Situ Fluorescent Visualization of Bacterial Infection on Wound Tissues. ACS Applied Materials & Interfaces 2022; 14:39808-39818. [PMID: 36005548 DOI: 10.1021/acsami.2c10115] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
The ability to effectively detect bacterial infection in human tissues is important for the timely treatment of the infection. However, traditional techniques fail to visualize bacterial species adhered to host cells in situ in a target-specific manner. Dihydropteroate synthase (DHPS) exclusively exists in bacterial species and metabolically converts p-aminobenzoic acid (PABA) to folic acid (FA). By targeting this bacterium-specific metabolism, we have developed a fluorescent imaging probe, PABA-DCM, based on the conjugation of PABA with a long-wavelength fluorophore, dicyanomethylene 4H-pyran (DCM). We confirmed that the probe can be used in the synthetic pathway of a broad spectrum of Gram-positive and negative bacteria, resulting in a significantly extended retention time in bacterial over mammalian cells. We validated that DHPS catalytically introduces a dihydropteridine group to the amino end of the PABA motif of PABA-DCM, and the resulting adduct leads to an increase in the FA levels of bacteria. We also constructed a hydrogel dressing containing PABA-DCM and graphene oxide (GO), termed PABA-DCM@GO, that achieves target-specific fluorescence visualization of bacterial infection on the wounded tissues of mice. Our research paves the way for the development of fluorescent imaging agents that target species-conserved metabolic pathways of microorganisms for the in situ monitoring of infections in human tissues.
Affiliation(s)
- Chen-Jian Zhong
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
- Xi-Le Hu
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, School of Chemistry and Molecular Engineering, Frontiers Center for Materiobiology and Dynamic Chemistry, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, PR China
- Xiao-Lan Yang
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
- Department of Burn Surgery and Wound Repair, Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou 362001, Fujian, China
| | - Hui-Qi Gan
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, School of Chemistry and Molecular Engineering, Frontiers Center for Materiobiology and Dynamic Chemistry, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, PR China
| | - Kai-Cheng Yan
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, School of Chemistry and Molecular Engineering, Frontiers Center for Materiobiology and Dynamic Chemistry, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, PR China
| | - Fu-Ting Shu
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
| | - Pei Wei
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
| | - Teng Gong
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
| | - Peng-Fei Luo
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
| | - Tony D James
- Department of Chemistry, University of Bath, Bath BA2 7AY, United Kingdom
- School of Chemistry and Chemical Engineering, Henan Normal University, Xinxiang 453007, PR China
| | - Zhao-Hong Chen
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
| | - Yong-Jun Zheng
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
| | - Xiao-Peng He
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, School of Chemistry and Molecular Engineering, Frontiers Center for Materiobiology and Dynamic Chemistry, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, PR China
- The International Cooperation Laboratory on Signal Transduction, Eastern Hepatobiliary Surgery Hospital, Shanghai 200438, China
- National Center for Liver Cancer, Shanghai 200438, China
| | - Zhao-Fan Xia
- Department of Burn Surgery and Wound Repair, Fujian Burn Medical Center, Fujian Provincial Key Laboratory of Burn and Trauma, Fujian Medical University Union Hospital, Fuzhou 350001, Fujian, PR China
- Department of Burn Surgery, the First Affiliated Hospital of Naval Medical University, Shanghai 200433, PR China
- Research Unit of Key Techniques for Treatment of Burns and Combined Burns and Trauma Injury, Chinese Academy of Medical Sciences, Shanghai 200433, China
| |
|
8
|
Oh SH, Martin-Yken H, Coleman DA, Dague E, Hoyer LL. Development and Use of a Monoclonal Antibody Specific for the Candida albicans Cell-Surface Protein Hwp1. Front Cell Infect Microbiol 2022; 12:907453. [PMID: 35832385 PMCID: PMC9273023 DOI: 10.3389/fcimb.2022.907453] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/29/2022] [Accepted: 05/20/2022] [Indexed: 12/04/2022] Open
Abstract
The Candida albicans cell-surface protein Hwp1 functions in adhesion to the host and in biofilm formation. A peptide from the Gln-Pro-rich adhesive domain of Hwp1 was used to raise monoclonal antibody (MAb) 2-E8. MAb 2-E8 specificity for Hwp1 was demonstrated using a hwp1/hwp1 C. albicans isolate and strains that expressed at least one HWP1 allele. Immunofluorescence and atomic force microscopy experiments using MAb 2-E8 confirmed C. albicans germ-tube-specific detection of the Hwp1 protein. MAb 2-E8 also immunolabeled the tips of some Candida dubliniensis germ tubes grown under conditions that maximized HWP1 expression. The phylogeny of HWP1 and closely related genes suggested that the Gln-Pro-rich adhesive domain was unique to C. albicans and C. dubliniensis focusing the utility of MAb 2-E8 on these species. This new reagent can be used to address unanswered questions about Hwp1 and its interactions with other proteins in the context of C. albicans biology and pathogenesis.
Affiliation(s)
- Soon-Hwan Oh
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Hélène Martin-Yken
- Toulouse Biotechnology Institute, Université de Toulouse, CNRS, INRAE, INSA, Toulouse, France
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
| | - David A. Coleman
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Etienne Dague
- LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
| | - Lois L. Hoyer
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|
9
|
Kim J, Oh SH, Rodriguez-Bobadilla R, Vuong VM, Hubka V, Zhao X, Hoyer LL. Peering Into Candida albicans Pir Protein Function and Comparative Genomics of the Pir Family. Front Cell Infect Microbiol 2022; 12:836632. [PMID: 35372132 PMCID: PMC8975586 DOI: 10.3389/fcimb.2022.836632] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2021] [Accepted: 02/11/2022] [Indexed: 11/24/2022] Open
Abstract
The fungal cell wall, comprised primarily of protein and polymeric carbohydrate, maintains cell structure, provides protection from the environment, and is an important antifungal drug target. Pir proteins (proteins with internal repeats) are linked to cell wall β-1,3-glucan and are best studied in Saccharomyces cerevisiae. Sequential deletion of S. cerevisiae PIR genes produces strains with increasingly notable cell wall damage. However, a true null mutant lacking all five S. cerevisiae PIR genes was never constructed. Because only two PIR genes (PIR1, PIR32) were annotated in the Candida albicans genome, the initial goal of this work was to construct a true Δpir/Δpir null strain in this species. Unexpectedly, the phenotype of the null strain was almost indistinguishable from its parent, leading to the search for other proteins with Pir function. Bioinformatic approaches revealed nine additional C. albicans proteins that share a conserved Pir functional motif (minimally DGQ). Examination of the protein sequences revealed another conserved motif (QFQFD) toward the C-terminal end of each protein. Sequence similarities and presence of the conserved motif(s) were used to identify a set of 75 proteins across 16 fungal species that are proposed here as Pir proteins. The Pir family is greatly expanded in C. albicans and C. dubliniensis compared to other species and the orthologs are known to have specialized function during chlamydospore formation. Predicted Pir structures showed a conserved core of antiparallel beta-sheets and sometimes-extensive loops that contain amino acids with the potential to form linkages to cell wall components. Pir phylogeny demonstrated emergence of specific ortholog groups among the fungal species. Variation in gene expression patterns was noted among the ortholog groups during growth in rich medium. PIR allelic variation was quite limited despite the presence of a repeated sequence in many loci. Results presented here demonstrate that the Pir family is larger than previously recognized and lead to new hypotheses to test to better understand Pir proteins and their role in the fungal cell wall.
Affiliation(s)
- Jisoo Kim
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Soon-Hwan Oh
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | | | - Vien M. Vuong
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Vit Hubka
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Laboratory of Fungal Genetics and Metabolism, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
| | - Xiaomin Zhao
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Lois L. Hoyer
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|
10
|
BEHZADI PAYAM, GAJDÁCS MÁRIÓ. Worldwide Protein Data Bank (wwPDB): A virtual treasure for research in biotechnology. Eur J Microbiol Immunol (Bp) 2021; 11:77-86. [PMID: 34908533 PMCID: PMC8830413 DOI: 10.1556/1886.2021.00020] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/06/2021] [Accepted: 11/23/2021] [Indexed: 12/25/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) provides a wide range of digital data regarding biology and biomedicine. This large internet resource holds important biological data obtained from experiments performed around the globe by different scientists. The Worldwide Protein Data Bank (wwPDB) is a collection of 3D structure data for important biomolecules, including nucleic acids (RNAs and DNAs) and proteins. Moreover, this database accumulates knowledge regarding the function and evolution of biomacromolecules, which supports different disciplines such as biotechnology. The 3D structure, functional characteristics and phylogenetic properties of biomacromolecules give a deep understanding of the biomolecules' characteristics. An important advantage of the wwPDB database is that its data are updated every week. This updating process gives users the newest data and information for their projects. The data and information in wwPDB can greatly support accurate visualization and illustration of biomacromolecules in biotechnology. As demonstrated by the SARS-CoV-2 pandemic, reliable and rapidly accessible biological data for microbiology, immunology, vaccinology, and drug development are critical to address many healthcare-related challenges that are facing humanity. The aim of this paper is to introduce the readers to wwPDB, and to highlight the importance of this database in biotechnology, with the expectation that the number of scientists interested in the utilization of Protein Data Bank's resources will increase substantially in the coming years.
Affiliation(s)
- PAYAM BEHZADI
- Department of Microbiology, College of Basic Sciences, Shahr-e-Qods Branch, Islamic Azad University, Tehran, 37541-374, Iran
| | - MÁRIÓ GAJDÁCS
- Department of Oral Biology and Experimental Dental Research, Faculty of Dentistry, University of Szeged, 6720, Szeged, Hungary. Corresponding author. Tel.: +36-62-342-532.
| |
|
11
|
Mohammadi Jenghara M, Iranpour Mobarakeh M, Ebrahimpour Komleh H. Relation Extraction of Protein Complexes from Dynamic Protein-Protein Interaction Network. J Biomed Phys Eng 2021; 11:675-684. [PMID: 34904064 PMCID: PMC8649159 DOI: 10.31661/jbpe.v0i0.1119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2019] [Accepted: 02/14/2019] [Indexed: 11/16/2022]
Abstract
Background: Dynamic protein-protein interaction networks (DPPIN) can confirm the conditional and temporal features of proteins and protein complexes. In addition, the relations among protein complexes in dynamic networks can provide useful information for understanding the dynamic functionality of PPI networks. Objective: In this paper, an algorithm is presented to discover temporal association rules from a dynamic PPIN dataset. Material and Methods: In this analytical study, the static protein-protein interaction network is transformed into a dynamic network using gene expression thresholding in order to extract the protein complex relations. The number of proteins present in the dynamic network is large at each time point, and this number increases when multidimensional rules are extracted across different times. By mapping the gold standard protein complexes as reference protein complexes, the items in each transaction are reduced from active proteins to protein complexes. Subgraphs extracted as protein complexes at each time point are weighted according to their degree of similarity to the reference protein complexes. Mega-transactions and extended items are created based on the occurrence bitmap matrix of the reference complexes. Rules are then extracted from the mega-transactions of protein complexes. Results: The proposed method has been evaluated using gold standard protein complex rules. The number of rules extracted from the Biogrid datasets and protein complexes is 281, with support 0.2. Conclusion: The distinguishing characteristic of the proposed algorithm is the simultaneous extraction of intra-transaction and inter-transaction rules. Evaluation of the results using EBI data shows the efficiency of the proposed algorithm.
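The thresholding and support steps named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it treats interactions rather than protein complexes as items, it omits mega-transactions and rule generation, and the edge list, expression values, threshold and function names are all hypothetical.

```python
# Toy inputs (hypothetical): a static PPI edge list and expression per time point.
STATIC_EDGES = [("A", "B"), ("B", "C"), ("C", "D")]
EXPRESSION = {
    "A": [0.9, 0.8, 0.1],
    "B": [0.7, 0.9, 0.8],
    "C": [0.2, 0.6, 0.9],
    "D": [0.1, 0.7, 0.9],
}


def dynamic_network(edges, expression, threshold):
    """Return one edge list per time point, keeping an edge only when both
    of its proteins are expressed at or above the threshold."""
    n_points = len(next(iter(expression.values())))
    snapshots = []
    for t in range(n_points):
        active = {p for p, levels in expression.items() if levels[t] >= threshold}
        snapshots.append([(u, v) for u, v in edges if u in active and v in active])
    return snapshots


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if set(itemset) <= set(t))
    return hits / len(transactions)


# Each time-point snapshot is treated as one transaction (a simplification).
snapshots = dynamic_network(STATIC_EDGES, EXPRESSION, threshold=0.5)
bc_support = support([("B", "C")], snapshots)
```

In this toy setting an itemset would pass a minimum support of 0.2, the value quoted in the abstract, if it appears in at least one of every five snapshots.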
Affiliation(s)
- Moslem Mohammadi Jenghara
- PhD, Department of Computer Engineering and Information Technology, Payame Noor University, Tehran, Iran
| | - Majid Iranpour Mobarakeh
- PhD, Department of Computer Engineering and Information Technology, Payame Noor University, Tehran, Iran
| | | |
|
12
|
Oh SH, Schliep K, Isenhower A, Rodriguez-Bobadilla R, Vuong VM, Fields CJ, Hernandez AG, Hoyer LL. Using Genomics to Shape the Definition of the Agglutinin-Like Sequence (ALS) Family in the Saccharomycetales. Front Cell Infect Microbiol 2021; 11:794529. [PMID: 34970511 PMCID: PMC8712946 DOI: 10.3389/fcimb.2021.794529] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/13/2021] [Accepted: 11/09/2021] [Indexed: 01/09/2023] Open
Abstract
The Candida albicans agglutinin-like sequence (ALS) family is studied because of its contribution to cell adhesion, fungal colonization, and polymicrobial biofilm formation. The goal of this work was to derive an accurate census and sequence for ALS genes in pathogenic yeasts and other closely related species, while probing the boundaries of the ALS family within the Order Saccharomycetales. Bioinformatic methods were combined with laboratory experimentation to characterize 47 novel ALS loci from 8 fungal species. AlphaFold predictions suggested the presence of a conserved N-terminal adhesive domain (NT-Als) structure in all Als proteins reported to date, as well as in S. cerevisiae alpha-agglutinin (Sag1). Lodderomyces elongisporus, Meyerozyma guilliermondii, and Scheffersomyces stipitis were notable because each species had genes with C. albicans ALS features, as well as at least one that encoded a Sag1-like protein. Detection of recombination events between the ALS family and gene families encoding other cell-surface proteins such as Iff/Hyr and Flo suggest widespread domain swapping with the potential to create cell-surface diversity among yeast species. Results from the analysis also revealed subtelomeric ALS genes, ALS pseudogenes, and the potential for yeast species to secrete their own soluble adhesion inhibitors. Information presented here supports the inclusion of SAG1 in the ALS family and yields many experimental hypotheses to pursue to further reveal the nature of the ALS family.
Affiliation(s)
- Soon-Hwan Oh
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Klaus Schliep
- Institute of Environmental Biotechnology, Graz University of Technology, Graz, Austria
| | - Allyson Isenhower
- Department of Biology, Millikin University, Decatur, IL, United States
| | | | - Vien M. Vuong
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Christopher J. Fields
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Alvaro G. Hernandez
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Lois L. Hoyer
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|
13
|
Mulder N, Zass L, Hamdi Y, Othman H, Panji S, Allali I, Fakim YJ. African Global Representation in Biomedical Sciences. Annu Rev Biomed Data Sci 2021; 4:57-81. [PMID: 34465182 DOI: 10.1146/annurev-biodatasci-102920-112550] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
Abstract
African populations are diverse in their ethnicity, language, culture, and genetics. Although plagued by high disease burdens, until recently the continent has largely been excluded from biomedical studies. Along with limitations in research and clinical infrastructure, human capacity, and funding, this omission has resulted in an underrepresentation of African data and disadvantaged African scientists. This review interrogates the relative abundance of biomedical data from Africa, primarily in genomics and other omics. The visibility of African science through publications is also discussed. A challenge encountered in this review is the relative lack of annotation of data on their geographical or population origin, with African countries represented as a single group. In addition to the abovementioned limitations, the global representation of African data may also be attributed to the hesitation to deposit data in public repositories. Whatever the reason, the disparity should be addressed, as African data have enormous value for scientists in Africa and globally.
Affiliation(s)
- Nicola Mulder
- Computational Biology Division, Department of Integrative Biomedical Sciences and Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
- Wellcome Centre for Infectious Diseases Research in Africa (CIDRI-AFRICA), Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
| | - Lyndon Zass
- Computational Biology Division, Department of Integrative Biomedical Sciences and Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
| | - Yosr Hamdi
- Laboratory of Biomedical Genomics and Oncogenetics and Laboratory of Human and Experimental Pathology, Institut Pasteur de Tunis, University of Tunis El Manar, 1002 Tunis, Tunisia
| | - Houcemeddine Othman
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg 2193, South Africa
| | - Sumir Panji
- Computational Biology Division, Department of Integrative Biomedical Sciences and Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town 7925, South Africa
| | - Imane Allali
- Laboratory of Human Pathologies Biology, Department of Biology, Faculty of Sciences, and Genomic Center of Human Pathologies, Faculty of Medicine and Pharmacy, Mohammed V University in Rabat, 1014 Rabat, Morocco
| | - Yasmina Jaufeerally Fakim
- Biotechnology Unit, Department of Agricultural and Food Science, Faculty of Agriculture, University of Mauritius, Réduit 80837, Mauritius
| |
|
14
|
Guo X, Chen F, Gao F, Li L, Liu K, You L, Hua C, Yang F, Liu W, Peng C, Wang L, Yang X, Zhou F, Tong J, Cai J, Li Z, Wan B, Zhang L, Yang T, Zhang M, Yang L, Yang Y, Zeng W, Wang B, Wei X, Xu X. CNSA: a data repository for archiving omics data. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2020:5875523. [PMID: 32705130 PMCID: PMC7377928 DOI: 10.1093/database/baaa055] [Citation(s) in RCA: 212] [Impact Index Per Article: 70.7] [Received: 04/16/2020] [Revised: 05/31/2020] [Accepted: 06/25/2020] [Indexed: 12/16/2022]
Abstract
With the application and development of high-throughput sequencing technology in life and health sciences, massive multi-omics data brings the problem of efficient management and utilization. Database development and biocuration are the prerequisites for the reuse of these big data. Here, relying on China National GeneBank (CNGB), we present CNGB Sequence Archive (CNSA) for archiving omics data, including raw sequencing data and its further analyzed results which are organized into six objects, namely Project, Sample, Experiment, Run, Assembly and Variation at present. Moreover, CNSA has created a correlation model of living samples, sample information and analytical data on some projects. Both living samples and analytical data are directly correlated with the sample information. From either one, information or data of the other two can be obtained, so that all data can be traced throughout the life cycle from the living sample to the sample information to the analytical data. Complying with the data standards commonly used in the life sciences, CNSA is committed to building a comprehensive and curated data repository for storing, managing and sharing of omics data. We will continue to improve the data standards and provide free access to open-data resources for worldwide scientific communities to support academic research and the bio-industry. Database URL: https://db.cngb.org/cnsa/.
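The traceability described in this abstract, where a reader can move between a living sample, its sample information and its analytical data, can be sketched as linked records. The sketch below is illustrative only. It covers three of the six object types the abstract names, and the chain Sample → Experiment → Run, every field name and every identifier are assumptions rather than the CNSA schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    sample_id: str
    project_id: str
    living_sample_id: Optional[str] = None  # hypothetical link to a stored specimen


@dataclass(frozen=True)
class Experiment:
    experiment_id: str
    sample_id: str


@dataclass(frozen=True)
class Run:
    run_id: str
    experiment_id: str


def sample_for_run(run_id: str, runs: Dict[str, Run],
                   experiments: Dict[str, Experiment],
                   samples: Dict[str, Sample]) -> Sample:
    """Trace analytical data (a run) back to its sample information."""
    experiment = experiments[runs[run_id].experiment_id]
    return samples[experiment.sample_id]


def runs_for_sample(sample_id: str, runs: Dict[str, Run],
                    experiments: Dict[str, Experiment]) -> List[str]:
    """Trace sample information forward to every run derived from it."""
    exp_ids = {e.experiment_id for e in experiments.values() if e.sample_id == sample_id}
    return sorted(r.run_id for r in runs.values() if r.experiment_id in exp_ids)


# Toy records with made-up identifiers.
samples = {"S1": Sample("S1", "P1", living_sample_id="LS1")}
experiments = {"E1": Experiment("E1", "S1")}
runs = {"R1": Run("R1", "E1"), "R2": Run("R2", "E1")}
```

Storing only identifiers in each record keeps the links navigable in both directions, which is the property the abstract emphasizes.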
Affiliation(s)
- Xueqin Guo
- China National GeneBank, Shenzhen 518120, China
| | | | - Fei Gao
- China National GeneBank, Shenzhen 518120, China
| | - Ling Li
- China National GeneBank, Shenzhen 518120, China
| | - Ke Liu
- China National GeneBank, Shenzhen 518120, China
| | - Lijin You
- China National GeneBank, Shenzhen 518120, China
| | - Cong Hua
- China National GeneBank, Shenzhen 518120, China
| | - Fan Yang
- China National GeneBank, Shenzhen 518120, China
| | | | | | - Lina Wang
- China National GeneBank, Shenzhen 518120, China
| | | | - Feiyu Zhou
- China National GeneBank, Shenzhen 518120, China
| | - Jiawei Tong
- China National GeneBank, Shenzhen 518120, China
| | - Jia Cai
- China National GeneBank, Shenzhen 518120, China
| | - Zhiyong Li
- China National GeneBank, Shenzhen 518120, China
| | - Bo Wan
- China National GeneBank, Shenzhen 518120, China
| | - Lei Zhang
- China National GeneBank, Shenzhen 518120, China
| | - Tao Yang
- China National GeneBank, Shenzhen 518120, China
| | | | - Linlin Yang
- China National GeneBank, Shenzhen 518120, China
| | - Yawen Yang
- China National GeneBank, Shenzhen 518120, China
| | - Wenjun Zeng
- China National GeneBank, Shenzhen 518120, China
| | - Bo Wang
- China National GeneBank, Shenzhen 518120, China
| | | | - Xun Xu
- China National GeneBank, Shenzhen 518120, China
- BGI-Shenzhen, Shenzhen 518083, China
- Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen 518120, China
| |
|
15
|
Pantziarka P, Capistrano I R, De Potter A, Vandeborne L, Bouche G. An Open Access Database of Licensed Cancer Drugs. Front Pharmacol 2021; 12:627574. [PMID: 33776770 PMCID: PMC7991999 DOI: 10.3389/fphar.2021.627574] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/09/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] Open
Abstract
A global, comprehensive and open access listing of approved anticancer drugs does not currently exist. Partial information is available from multiple sources, including regulatory authorities, national formularies and scientific agencies. Many such data sources include drugs used in oncology for supportive care, diagnostic or other non-antineoplastic uses. We describe a methodology to combine and cleanse relevant data from multiple sources to produce an open access database of drugs licensed specifically for therapeutic antineoplastic purposes. The resulting list is provided as an open access database, (http://www.redo-project.org/cancer-drugs-db/), so that it may be used by researchers as input for further research projects, for example literature-based text mining for drug repurposing.
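The combine-and-cleanse step this abstract describes can be pictured with a small Python sketch. It is not the authors' methodology: the record layout, the `use` labels, the function name and the two toy source lists are hypothetical, and a real pipeline would need far more careful name matching than lower-casing.

```python
def build_drug_list(*sources):
    """Merge drug records from several sources, normalise the names, and keep
    only drugs that at least one source lists with an antineoplastic use."""
    merged = {}
    for source in sources:
        for record in source:
            key = record["name"].strip().lower()  # naive name normalisation
            merged.setdefault(key, set()).add(record["use"])
    return sorted(name for name, uses in merged.items() if "antineoplastic" in uses)


# Toy source lists standing in for a regulator listing and a national formulary.
regulator = [
    {"name": "Imatinib ", "use": "antineoplastic"},
    {"name": "Ondansetron", "use": "supportive care"},
]
formulary = [
    {"name": "imatinib", "use": "antineoplastic"},
    {"name": "Cisplatin", "use": "antineoplastic"},
]

licensed = build_drug_list(regulator, formulary)
```

Duplicates that differ only in case or whitespace collapse into one entry, and the supportive-care drug is dropped because no source lists it with an antineoplastic use.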
Affiliation(s)
- Pan Pantziarka
- The Anticancer Fund, Brussels, Belgium
- The George Pantziarka TP53 Trust, London, United Kingdom
| | | | - Arno De Potter
- Faculty of Medicine, University of Leuven, Leuven, Belgium
| | | | | |
|
16
|
Chakraborty C, Sharma AR, Bhattacharya M, Sharma G, Lee SS. Immunoinformatics Approach for the Identification and Characterization of T Cell and B Cell Epitopes towards the Peptide-Based Vaccine against SARS-CoV-2. Arch Med Res 2021; 52:362-370. [PMID: 33546870 PMCID: PMC7846223 DOI: 10.1016/j.arcmed.2021.01.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 09/25/2020] [Accepted: 01/14/2021] [Indexed: 02/07/2023]
Abstract
Presently, immunoinformatics is playing a significant role in epitope identification and vaccine design for various critical diseases. Using immunoinformatics, several scientists are trying to identify and characterize T cell and B cell epitopes as well as design peptide-based vaccines against SARS-CoV-2. In this review article, we discuss the importance of adaptive immunity and its significance for designing the SARS-CoV-2 vaccine. Moreover, we illustrate several key points for utilizing immunoinformatics in vaccine design, such as the criteria for selection and identification of epitopes, T cell epitope and B cell epitope prediction, and different emerging tools/databases for immunoinformatics. In the current scenario, a few immunoinformatics studies have been performed for various infectious pathogens and related diseases. Thus, we have also summarized and included these current immunoinformatics studies in this review article. Finally, we discuss the probable T cell and B cell epitopes and their identification and characterization for vaccine design against SARS-CoV-2.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, India
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252, Gangwon-do, Republic of Korea
| | - Ashish Ranjan Sharma
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252, Gangwon-do, Republic of Korea
| | - Manojit Bhattacharya
- Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore Odisha, India
| | - Garima Sharma
- Department of Biomedical Science and Institute of Bioscience and Biotechnology, Kangwon National University, Chuncheon, Republic of Korea
| | - Sang-Soo Lee
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252, Gangwon-do, Republic of Korea
| |
|
17
|
Oh SH, Isenhower A, Rodriguez-Bobadilla R, Smith B, Jones J, Hubka V, Fields C, Hernandez A, Hoyer LL. Pursuing Advances in DNA Sequencing Technology to Solve a Complex Genomic Jigsaw Puzzle: The Agglutinin-Like Sequence (ALS) Genes of Candida tropicalis. Front Microbiol 2021; 11:594531. [PMID: 33552012 PMCID: PMC7856822 DOI: 10.3389/fmicb.2020.594531] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/13/2020] [Accepted: 11/17/2020] [Indexed: 12/16/2022] Open
Abstract
The agglutinin-like sequence (ALS) gene family encodes cell-surface adhesins that interact with host and abiotic surfaces, promoting colonization by opportunistic fungal pathogens such as Candida tropicalis. Studies of Als protein contribution to C. tropicalis adhesion would benefit from an accurate catalog of ALS gene sequences as well as insight into relative gene expression levels. Even in the genomics era, this information has been elusive: genome assemblies are often broken within ALS genes because of their extensive regions of highly conserved, repeated DNA sequences and because there are many similar ALS genes at different chromosomal locations. Here, we describe the benefit of long-read DNA sequencing technology to facilitate characterization of C. tropicalis ALS loci. Thirteen ALS loci in C. tropicalis strain MYA-3404 were deduced from a genome assembly constructed from Illumina MiSeq and Oxford Nanopore MinION data. Although the MinION data were valuable, PCR amplification and Sanger sequencing of ALS loci were still required to complete and verify the gene sequences. Each predicted Als protein featured an N-terminal binding domain, a central domain of tandemly repeated sequences, and a C-terminal domain rich in Ser and Thr. The presence of a secretory signal peptide and consensus sequence for addition of a glycosylphosphatidylinositol (GPI) anchor was consistent with predicted protein localization to the cell surface. TaqMan assays were designed to recognize each ALS gene, as well as both alleles at the divergent CtrALS3882 locus. C. tropicalis cells grown in five different in vitro conditions showed differential expression of various ALS genes. To place the C. tropicalis data into a larger context, TaqMan assays were also designed and validated for analysis of ALS gene expression in Candida albicans and Candida dubliniensis. These comparisons identified the subset of highly expressed C. tropicalis ALS genes that were predicted to encode proteins with the most abundant cell-surface presence, prioritizing them for subsequent functional analysis. Data presented here provide a solid foundation for future experimentation to deduce ALS family contributions to C. tropicalis adhesion and pathogenesis.
Affiliation(s)
- Soon-Hwan Oh
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Allyson Isenhower
- Department of Biology, Millikin University, Decatur, IL, United States
| | | | - Brooke Smith
- Department of Biology, Millikin University, Decatur, IL, United States
| | - Jillian Jones
- Department of Biology, Millikin University, Decatur, IL, United States
| | - Vit Hubka
- Department of Botany, Faculty of Science, Charles University, Prague, Czechia
- Laboratory of Fungal Genetics and Metabolism, Institute of Microbiology, Czech Academy of Sciences, Prague, Czechia
| | - Christopher Fields
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Alvaro Hernandez
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Lois L Hoyer
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|
18
|
Kahsay R, Vora J, Navelkar R, Mousavi R, Fochtman BC, Holmes X, Pattabiraman N, Ranzinger R, Mahadik R, Williamson T, Kulkarni S, Agarwal G, Martin M, Vasudev P, Garcia L, Edwards N, Zhang W, Natale DA, Ross K, Aoki-Kinoshita KF, Campbell MP, York WS, Mazumder R. GlyGen data model and processing workflow. Bioinformatics 2020; 36:3941-3943. [PMID: 32324859 PMCID: PMC7320628 DOI: 10.1093/bioinformatics/btaa238] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/29/2020] [Revised: 03/31/2020] [Accepted: 04/16/2020] [Indexed: 11/18/2022]
Abstract
Summary: Glycoinformatics plays a major role in glycobiology research, and the development of a comprehensive glycoinformatics knowledgebase is critical. This application note describes the GlyGen data model, processing workflow and the data access interfaces featuring programmatic use case example queries based on specific biological questions. The GlyGen project is a data integration, harmonization and dissemination project for carbohydrate and glycoconjugate-related data retrieved from multiple international data sources including UniProtKB, GlyTouCan, UniCarbKB and other key resources. Availability and implementation: The GlyGen web portal is freely available at https://glygen.org. The data portal, web services, SPARQL endpoint and GitHub repository are also freely available at https://data.glygen.org, https://api.glygen.org, https://sparql.glygen.org and https://github.com/glygener, respectively. All code is released under the GNU General Public License version 3 (GNU GPLv3) and is available on GitHub at https://github.com/glygener. The datasets are made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Robel Kahsay
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Jeet Vora
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Rahi Navelkar
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Reza Mousavi
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Brian C Fochtman
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Xavier Holmes
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Nagarajan Pattabiraman
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| | - Rene Ranzinger
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Rupali Mahadik
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Tatiana Williamson
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Sujeet Kulkarni
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Gaurav Agarwal
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Maria Martin
- European Bioinformatics Institute, Hinxton CB10 1SD, UK
| | | | - Leyla Garcia
- ZB MED Information Centre for Life Sciences, Cologne 50931, Germany
| | - Nathan Edwards
- Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20007, USA
| | - Wenjin Zhang
- Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20007, USA
| | - Darren A Natale
- Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20007, USA
| | - Karen Ross
- Department of Biochemistry and Molecular & Cellular Biology, Georgetown University, Washington, DC 20007, USA
| | | | - Matthew P Campbell
- Institute for Glycomics, Griffith University, Southport, QLD 4222, Australia
| | - William S York
- Complex Carbohydrate Research Center, The University of Georgia, Athens, GA 30602, USA
| | - Raja Mazumder
- Department of Biochemistry & Molecular Medicine, The George Washington School of Medicine and Health Sciences, Washington, DC 20052, USA
| |
|
19
|
A novel computational approach for predicting complex phenotypes in Drosophila (starvation-sensitive and sterile) by deriving their gene expression signatures from public data. PLoS One 2020; 15:e0240824. [PMID: 33104720 PMCID: PMC7588067 DOI: 10.1371/journal.pone.0240824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2020] [Accepted: 10/05/2020] [Indexed: 11/19/2022]
Abstract
Many research teams perform numerous genetic, transcriptomic, proteomic and other types of omic experiments to understand molecular, cellular and physiological mechanisms of disease and health. Often (but not always), the results of these experiments are deposited in publicly available repository databases. These data records often include phenotypic characteristics following genetic and environmental perturbations, with the aim of discovering underlying molecular mechanisms leading to the phenotypic responses. A constrained set of phenotypic characteristics is usually recorded and these are mostly hypothesis driven or possible to record within financial or practical constraints. We present a novel proof-of-principle computational approach for combining publicly available gene-expression data from control/mutant animal experiments that exhibit a particular phenotype, and we use this approach to predict unobserved phenotypic characteristics in new experiments (data derived from EBI’s ArrayExpress and ExpressionAtlas respectively). We utilised available microarray gene-expression data for two phenotypes (starvation-sensitive and sterile) in Drosophila. The data were combined using a linear mixed-effects model with the inclusion of consecutive principal components to account for variability between experiments in conjunction with Gene Ontology enrichment analysis. We present how available data can be ranked according to their likelihood of exhibiting these two phenotypes using random forest. The results from our study show that it is possible to integrate seemingly different gene-expression microarray data and predict a potential phenotypic manifestation with a relatively high degree of confidence (>80% AUC). This provides thus far unexplored opportunities for inferring unknown and unbiased phenotypic characteristics from already performed experiments, in order to identify studies for future analyses.
Molecular mechanisms associated with gene and environment perturbations are intrinsically linked and give rise to a variety of phenotypic manifestations. Therefore, unravelling the phenotypic spectrum can help to gain insights into disease mechanisms associated with gene and environmental perturbations. Our approach uses public data that are set to increase in volume, thus providing value for money.
|
20
|
Metabolic drug targets of the cytosine metabolism pathways in the dromedary camel (Camelus dromedarius) and blood parasite Trypanosoma evansi. Trop Anim Health Prod 2020; 52:3337-3358. [PMID: 32926292 DOI: 10.1007/s11250-020-02366-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/07/2019] [Accepted: 07/31/2020] [Indexed: 10/23/2022]
Abstract
Trypanosomiasis is a major illness affecting camels in tropical and subtropical regions. Comparisons of camel and Trypanosoma evansi genomes can lead to the discovery of new drug targets for treating Trypanosoma infections. The synthesis pathways of cytosine, cytidine, cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), deoxycytidine, deoxycytidine monophosphate (dCMP), deoxycytidine diphosphate (dCDP), and deoxycytidine triphosphate (dCTP) were compared in the dromedary camel (Camelus dromedarius) and T. evansi. None of the enzymes involved in the cytosine pathway were detected in camels or T. evansi. Notably, cytidine kinase (CK) and 5'-nucleotidase, which interconvert cytidine and CMP, were not detected in T. evansi but were present in camels. UMP/CMP kinase was not predicted in T. evansi. Therefore, the presence of enzymes involved in the CTP synthesis cascade was not predicted in T. evansi. CMP synthesis might also be catalyzed by other enzymes, e.g., purine nucleotide kinases. Both camel and T. evansi share an efficient enzyme system for converting CDP to CTP. In conclusion, CTP synthase is important for homeostasis of cytosine nucleotides in T. evansi and could be a potential drug target against the parasite. In addition, the inhibition of UMP synthesis might contribute to parasite death as it is a shared source for CTP synthesis.
|
21
|
Chen T, Tyagi S. Integrative computational epigenomics to build data-driven gene regulation hypotheses. Gigascience 2020; 9:giaa064. [PMID: 32543653 PMCID: PMC7297091 DOI: 10.1093/gigascience/giaa064] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/25/2020] [Revised: 05/25/2020] [Accepted: 05/26/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Diseases are complex phenotypes often arising as an emergent property of a non-linear network of genetic and epigenetic interactions. To translate this resulting state into a causal relationship with a subset of regulatory features, many experiments deploy an array of laboratory assays from multiple modalities. Often, each of these resulting datasets is large, heterogeneous, and noisy. Thus, it is non-trivial to unify these complex datasets into an interpretable phenotype. Although recent methods address this problem with varying degrees of success, they are constrained by their scopes or limitations. Therefore, an important gap in the field is the lack of a universal data harmonizer with the capability to arbitrarily integrate multi-modal datasets. RESULTS: In this review, we perform a critical analysis of methods with the explicit aim of harmonizing data, as opposed to case-specific integration. This revealed that matrix factorization, latent variable analysis, and deep learning are potent strategies. Finally, we describe the properties of an ideal universal data harmonization framework. CONCLUSIONS: A sufficiently advanced universal harmonizer has major medical implications: (1) identifying dysregulated biological pathways responsible for a disease is a powerful diagnostic tool; (2) investigating these pathways further allows the biological community to better understand a disease's mechanisms; and (3) precision medicine also benefits from developments in this area, particularly in the context of the growing field of selective epigenome editing, which can suppress or induce a desired phenotype.
Affiliation(s)
- Tyrone Chen
- 25 Rainforest Walk, School of Biological Sciences, Monash University, Clayton, VIC 3800, Australia
| | - Sonika Tyagi
- 25 Rainforest Walk, School of Biological Sciences, Monash University, Clayton, VIC 3800, Australia
| |
|
22
|
Blomberg N, Lauer KB. Connecting data, tools and people across Europe: ELIXIR's response to the COVID-19 pandemic. Eur J Hum Genet 2020; 28:719-723. [PMID: 32415272 PMCID: PMC7225634 DOI: 10.1038/s41431-020-0637-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/16/2020] [Accepted: 04/20/2020] [Indexed: 01/04/2023]
Abstract
ELIXIR, the European research infrastructure for life science data, provides open access to data, tools and workflows in response to the COVID-19 pandemic. ELIXIR's 23 nodes have reacted swiftly to support researchers in their combined efforts against the pandemic, setting out three joint priorities: 1. Connecting national COVID-19 data platforms to create federated European COVID-19 Data Spaces; 2. Fostering good data management to make COVID-19 data open, FAIR and reusable over the long term; 3. Providing open tools, workflows and computational resources to drive reproducible and collaborative science. ELIXIR's strategy is based on the support given by our national nodes - collectively spanning over 200 institutes - to research projects and on partnering with community initiatives to drive development and adoption of good data practice and community-driven standards. ELIXIR Nodes provide support activities locally and internationally, from provisioning compute capabilities to helping collect viral sequence data from hospitals. Some Nodes have prioritised access to their national cloud and compute facilities for all COVID-19 research projects, while others have developed tools to search, access and share all data related to the pandemic in a national healthcare setting.
Affiliation(s)
- Niklas Blomberg
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | | |
|
23
|
Dana JM, Gutmanas A, Tyagi N, Qi G, O'Donovan C, Martin M, Velankar S. SIFTS: updated Structure Integration with Function, Taxonomy and Sequences resource allows 40-fold increase in coverage of structure-based annotations for proteins. Nucleic Acids Res 2019; 47:D482-D489. [PMID: 30445541 PMCID: PMC6324003 DOI: 10.1093/nar/gky1114] [Citation(s) in RCA: 127] [Impact Index Per Article: 31.8] [Received: 09/07/2018] [Accepted: 10/22/2018] [Indexed: 12/12/2022]
Abstract
The Structure Integration with Function, Taxonomy and Sequences resource (SIFTS; http://pdbe.org/sifts/) was established in 2002 and continues to operate as a collaboration between the Protein Data Bank in Europe (PDBe; http://pdbe.org) and the UniProt Knowledgebase (UniProtKB; http://uniprot.org). The resource is instrumental in the transfer of annotations between protein structure and protein sequence resources through provision of up-to-date residue-level mappings between entries from the PDB and from UniProtKB. SIFTS also incorporates residue-level annotations from other biological resources, currently comprising the NCBI taxonomy database, IntEnz, GO, Pfam, InterPro, SCOP, CATH, PubMed, Ensembl, Homologene and automatic Pfam domain assignments based on HMM profiles. The recently released implementation of SIFTS includes support for multiple cross-references for proteins in the PDB, allowing mappings to UniProtKB isoforms and UniRef90 cluster members. This development makes structure data in the PDB readily available to over 1.8 million UniProtKB accessions.
Affiliation(s)
- Jose M Dana
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Aleksandras Gutmanas
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nidhi Tyagi
- Protein Function Development, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guoying Qi
- Protein Function Development, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Claire O'Donovan
- Metabolomics, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Maria Martin
- Protein Function Development, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
24
|
Conte N, Mason JC, Halmagyi C, Neuhauser S, Mosaku A, Yordanova G, Chatzipli A, Begley DA, Krupke DM, Parkinson H, Meehan TF, Bult CC. PDX Finder: A portal for patient-derived tumor xenograft model discovery. Nucleic Acids Res 2019; 47:D1073-D1079. [PMID: 30535239 PMCID: PMC6323912 DOI: 10.1093/nar/gky984] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 08/13/2018] [Accepted: 11/30/2018] [Indexed: 11/12/2022]
Abstract
Patient-derived tumor xenograft (PDX) mouse models are a versatile oncology research platform for studying tumor biology and for testing chemotherapeutic approaches tailored to genomic characteristics of individual patients’ tumors. PDX models are generated and distributed by a diverse group of academic labs, multi-institution consortia and contract research organizations. The distributed nature of PDX repositories and the use of different metadata standards for describing model characteristics present a significant challenge to identifying PDX models relevant to specific cancer research questions. The Jackson Laboratory and EMBL-EBI are addressing these challenges by co-developing PDX Finder, a comprehensive open global catalog of PDX models and their associated datasets. Within PDX Finder, model attributes are harmonized and integrated using a previously developed community minimal information standard to support consistent searching across the originating resources. Links to repositories are provided from the PDX Finder search results to facilitate model acquisition and/or collaboration. The PDX Finder resource currently contains information for 1985 PDX models of diverse cancers including those from large resources such as the Patient-Derived Models Repository, PDXNet and EurOPDX. Individuals or organizations that generate and distribute PDXs are invited to increase the ‘findability’ of their models by participating in the PDX Finder initiative at www.pdxfinder.org.
Affiliation(s)
- Nathalie Conte
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jeremy C Mason
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Csaba Halmagyi
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Steven Neuhauser
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Abayomi Mosaku
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Galabina Yordanova
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Aikaterini Chatzipli
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Dale A Begley
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Debra M Krupke
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Helen Parkinson
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Terrence F Meehan
- European Molecular Biology Laboratory- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carol C Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| |
|
25
|
Huang L, Feng G, Yan H, Zhang Z, Bushman BS, Wang J, Bombarely A, Li M, Yang Z, Nie G, Xie W, Xu L, Chen P, Zhao X, Jiang W, Zhang X. Genome assembly provides insights into the genome evolution and flowering regulation of orchardgrass. Plant Biotechnology Journal 2020; 18:373-388. [PMID: 31276273 PMCID: PMC6953241 DOI: 10.1111/pbi.13205] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 02/23/2019] [Revised: 05/27/2019] [Accepted: 06/29/2019] [Indexed: 05/18/2023]
Abstract
Orchardgrass (Dactylis glomerata L.) is an important forage grass for cultivating livestock worldwide. Here, we report an ~1.84-Gb chromosome-scale diploid genome assembly of orchardgrass, with a contig N50 of 0.93 Mb, a scaffold N50 of 6.08 Mb and a super-scaffold N50 of 252.52 Mb, which is the first chromosome-scale assembled genome of a cool-season forage grass. The genome includes 40 088 protein-coding genes, and 69% of the assembled sequences are transposable elements, with long terminal repeats (LTRs) being the most abundant. The LTR retrotransposons may have been activated and expanded in the grass genome in response to environmental changes during the Pleistocene between 0 and 1 million years ago. Phylogenetic analysis reveals that orchardgrass diverged after rice but before three Triticeae species, and evolutionarily conserved chromosomes were detected by analysing ancient chromosome rearrangements in these grass species. We also resequenced the whole genome of 76 orchardgrass accessions and found that germplasm from Northern Europe and East Asia clustered together, likely due to the exchange of plants along the 'Silk Road' or other ancient trade routes connecting the East and West. Last, a combined transcriptome, quantitative genetic and bulk segregant analysis provided insights into the genetic network regulating flowering time in orchardgrass and revealed four main candidate genes controlling this trait. This chromosome-scale genome and the online database of orchardgrass developed here will facilitate the discovery of genes controlling agronomically important traits, stimulate genetic improvement of and functional genetic research on orchardgrass and provide comparative genetic resources for other forage grasses.
Affiliation(s)
- Linkai Huang
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Guangyan Feng
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Haidong Yan
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
- School of Plant and Environmental Sciences, Virginia Tech, Blacksburg, VA, USA
| | | | | | - Jianping Wang
- Agronomy Department, University of Florida, Gainesville, FL, USA
| | | | - Mingzhou Li
- Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Zhongfu Yang
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Gang Nie
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Wengang Xie
- State Key Laboratory of Grassland Agro-Ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, China
| | - Lei Xu
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Peilin Chen
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | - Xinxin Zhao
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| | | | - Xinquan Zhang
- Department of Grassland Science, Animal Science and Technology College, Sichuan Agricultural University, Chengdu, China
| |
|
26
|
Coyne MJ, Béchon N, Matano LM, McEneany VL, Chatzidaki-Livanis M, Comstock LE. A family of anti-Bacteroidales peptide toxins wide-spread in the human gut microbiota. Nat Commun 2019; 10:3460. [PMID: 31371723 PMCID: PMC6671954 DOI: 10.1038/s41467-019-11494-1] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 01/23/2019] [Accepted: 07/17/2019] [Indexed: 12/15/2022]
Abstract
Bacteria often produce antimicrobial toxins to compete in microbial communities. Here we identify a family of broad-spectrum peptide toxins, named bacteroidetocins, produced by Bacteroidetes species. We study this toxin family using phenotypic, mutational, bioinformatic, and human metagenomic analyses. Bacteroidetocins are related to class IIa bacteriocins of Gram-positive bacteria and kill members of the Bacteroidetes phylum, including Bacteroides, Parabacteroides, and Prevotella gut species, as well as pathogenic Prevotella species. The bacteroidetocin biosynthesis genes are found in horizontally acquired mobile elements, which likely allow dissemination within the gut microbiota and may explain their wide distribution in human populations. Bacteroidetocins may have potential applications in microbiome engineering and as therapeutics for polymicrobial diseases such as bacterial vaginosis and periodontal disease.
Affiliation(s)
- Michael J Coyne
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Nathalie Béchon
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Institut Pasteur, Genetics of Biofilms Unit, 75015 Paris, cedex 15, 25-28 rue du Docteur Roux, France
- Ecole Doctorale Bio Sorbonne Paris Cité (BioSPC), Paris Diderot University, 75013, Cellule Pasteur, Paris, cedex, France
| | - Leigh M Matano
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Valentina Laclare McEneany
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Maria Chatzidaki-Livanis
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Department of Biological Sciences, University of Ohio, Athens, OH, 45701, USA
| | - Laurie E Comstock
- Division of Infectious Diseases, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.
| |
|
27
|
Vamathevan J, Apweiler R, Birney E. Biomolecular Data Resources: Bioinformatics Infrastructure for Biomedical Data Science. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-072018-021321] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
Abstract
Technological advances have continuously driven the generation of bio-molecular data and the development of bioinformatics infrastructure, which enables data reuse for scientific discovery. Several types of data management resources have arisen, such as data deposition databases, added-value databases or knowledgebases, and biology-driven portals. In this review, we provide a unique overview of the gradual evolution of these resources and discuss the goals and features that must be considered in their development. With the increasing application of genomics in the health care context and with 60 to 500 million whole genomes estimated to be sequenced by 2022, biomedical research infrastructure is transforming, too. Systems for federated access, portable tools, provision of reference data, and interpretation tools will enable researchers to derive maximal benefits from these data. Collaboration, coordination, and sustainability of data resources are key to ensure that biomedical knowledge management can scale with technology shifts and growing data volumes.
Affiliation(s)
- Jessica Vamathevan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Rolf Apweiler
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| |
|
28
|
González G, Evans CL. Biomedical Image Processing with Containers and Deep Learning: An Automated Analysis Pipeline: Data architecture, artificial intelligence, automated processing, containerization, and clusters orchestration ease the transition from data acquisition to insights in medium-to-large datasets. Bioessays 2019; 41:e1900004. [PMID: 31094000 PMCID: PMC6538271 DOI: 10.1002/bies.201900004] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/07/2019] [Revised: 03/18/2019] [Indexed: 12/13/2022]
Abstract
Here, a streamlined, scalable, laboratory approach is discussed that enables medium-to-large dataset analysis. The presented approach combines data management, artificial intelligence, containerization, cluster orchestration, and quality control in a unified analytic pipeline. The unique combination of these individual building blocks creates a new and powerful analysis approach that can readily be applied to medium-to-large datasets by researchers to accelerate the pace of research. The proposed framework is applied to a project that counts the number of plasmonic nanoparticles bound to peripheral blood mononuclear cells in dark-field microscopy images. By using the techniques presented in this article, the images are automatically processed overnight, without user interaction, streamlining the path from experiment to conclusions.
Affiliation(s)
- Germán González
- PNP Research Corporation, Drury, MA 01343
- Sierra Research S.L.U., Avda. Costa Blanca 132, 03540 Alicante, Spain
| | - Conor L. Evans
- Wellman Center for Photomedicine, Harvard Medical School, Massachusetts General Hospital, CNY149-3, 13th St, Charlestown, MA 02129
- Ludwig Center at Harvard, Harvard Medical School, Boston, MA
| |
|
29
|
Oliveira AL. Biotechnology, Big Data and Artificial Intelligence. Biotechnol J 2019; 14:e1800613. [DOI: 10.1002/biot.201800613] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 12/06/2018] [Indexed: 11/11/2022]
Affiliation(s)
- Arlindo L. Oliveira
- INESC-ID, Instituto Superior Técnico, University of Lisbon, R. Alves Redol 9, 1000-029 Lisboa, Portugal
| |
|
30
|
Leng J, Shoura M, McLeish TCB, Real AN, Hardey M, McCafferty J, Ranson NA, Harris SA. Securing the future of research computing in the biosciences. PLoS Comput Biol 2019; 15:e1006958. [PMID: 31095554 PMCID: PMC6521984 DOI: 10.1371/journal.pcbi.1006958] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
Abstract
Improvements in technology often drive scientific discovery. Therefore, research requires sustained investment in the latest equipment and training for the researchers who are going to use it. Prioritising and administering infrastructure investment is challenging because future needs are difficult to predict. In the past, highly computationally demanding research was associated primarily with particle physics and astronomy experiments. However, as biology becomes more quantitative and bioscientists generate more and more data, their computational requirements may ultimately exceed those of physical scientists. Computation has always been central to bioinformatics, but now imaging experiments have rapidly growing data processing and storage requirements. There is also an urgent need for new modelling and simulation tools to provide insight and understanding of these biophysical experiments. Bioscience communities must work together to provide the software and skills training needed in their areas. Research-active institutions need to recognise that computation is now vital in many more areas of discovery and create an environment where it can be embraced. The public must also become aware of both the power and limitations of computing, particularly with respect to their health and personal data.
Affiliation(s)
- Joanna Leng
- School of Computing, University of Leeds, Leeds, United Kingdom
| | - Massa Shoura
- School of Pathology, Stanford University, Palo Alto, California, United States of America
| | | | - Alan N. Real
- Advanced Research Computing, University of Durham, Durham, United Kingdom
| | - Mariann Hardey
- Advanced Research Computing, University of Durham, Durham, United Kingdom
- School of Business, University of Durham, Durham, United Kingdom
| | | | - Neil A. Ranson
- Astbury Centre for Structural and Molecular Biology, University of Leeds, Leeds, United Kingdom
- School of Molecular and Cellular Biology, University of Leeds, Leeds, United Kingdom
| | - Sarah A. Harris
- Astbury Centre for Structural and Molecular Biology, University of Leeds, Leeds, United Kingdom
- School of Physics and Astronomy, University of Leeds, Leeds, United Kingdom
| |
|
31
|
Oh SH, Smith B, Miller AN, Staker B, Fields C, Hernandez A, Hoyer LL. Agglutinin-Like Sequence (ALS) Genes in the Candida parapsilosis Species Complex: Blurring the Boundaries Between Gene Families That Encode Cell-Wall Proteins. Front Microbiol 2019; 10:781. [PMID: 31105652 PMCID: PMC6499006 DOI: 10.3389/fmicb.2019.00781] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/07/2019] [Accepted: 03/27/2019] [Indexed: 12/13/2022]
Abstract
The agglutinin-like sequence (Als) proteins are best-characterized in Candida albicans and known for their role in adhesion of the fungal cell to host and abiotic surfaces. ALS sequences are often misassembled in whole-genome sequence data because each species has multiple ALS loci that contain similar sequences, most notably tandem copies of highly conserved repeated sequences. The Candida parapsilosis species complex includes Candida parapsilosis, Candida orthopsilosis, and Candida metapsilosis, three distinct but closely related species. Using publicly available genome resources, de novo genome assemblies, and laboratory experimentation including Sanger sequencing, five ALS genes were characterized in C. parapsilosis strain CDC317, three in C. orthopsilosis strain 90-125, and four in C. metapsilosis strain ATCC 96143. The newly characterized ALS genes shared similar features with the well-known C. albicans ALS family, but also displayed unique attributes such as novel short, imperfect repeat sequences that were found in other genes encoding fungal cell-wall proteins. Evidence of recombination between ALS sequences and other genes was most obvious in CmALS2265, which had the 5' end of an ALS gene and the repeated sequences and 3' end from the IFF/HYR family. Together, these results blur the boundaries between the fungal cell-wall families that were defined in C. albicans. TaqMan assays were used to quantify relative expression for each ALS gene. Some measurements were complicated by the assay location within the ALS gene. Considerable variation was noted in relative gene expression for isolates of the same species. Overall, however, there was a trend toward higher relative gene expression in saturated cultures rather than younger cultures. This work provides a complete description of the ALS genes in the C. parapsilosis species complex and a toolkit that promotes further investigations into the role of the Als proteins in host-fungal interactions.
Affiliation(s)
- Soon-Hwan Oh
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Brooke Smith
- Department of Biology, Millikin University, Decatur, IL, United States
| | | | - Bart Staker
- Seattle Structural Genomics Center for Infectious Disease, Seattle Children’s Hospital, Seattle, WA, United States
| | - Christopher Fields
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Alvaro Hernandez
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| | - Lois L. Hoyer
- Department of Pathobiology, University of Illinois at Urbana-Champaign, Urbana, IL, United States
| |
|
32
|
Olsson TS, Hartley M. Lightweight data management with dtool. PeerJ 2019; 7:e6562. [PMID: 30867992 PMCID: PMC6409086 DOI: 10.7717/peerj.6562] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/04/2018] [Accepted: 02/04/2019] [Indexed: 11/29/2022]
Abstract
The explosion in volumes and types of data has led to substantial challenges in data management. These challenges are often faced by front-line researchers who are already dealing with rapidly changing technologies and have limited time to devote to data management. There are good high-level guidelines for managing and processing scientific data. However, there is a lack of simple, practical tools to implement these guidelines. This is particularly problematic in a highly distributed research environment where needs differ substantially from group to group and centralised solutions are difficult to implement and storage technologies change rapidly. To meet these challenges we have developed dtool, a command line tool for managing data. The tool packages data and metadata into a unified whole, which we call a dataset. The dataset provides consistency checking and the ability to access metadata for both the whole dataset and individual files. The tool can store these datasets on several different storage systems, including a traditional file system, object store (S3 and Azure) and iRODS. It includes an application programming interface that can be used to incorporate it into existing pipelines and workflows. The tool has provided substantial process, cost, and peace-of-mind benefits to our data management practices and we want to share these benefits. The tool is open source and available freely online at http://dtool.readthedocs.io.
Affiliation(s)
- Tjelvar S.G. Olsson
- Computational Systems Biology, John Innes Centre, Norwich, United Kingdom
| | - Matthew Hartley
- Computational Systems Biology, John Innes Centre, Norwich, United Kingdom
| |
|
33
|
Cook CE, Lopez R, Stroe O, Cochrane G, Brooksbank C, Birney E, Apweiler R. The European Bioinformatics Institute in 2018: tools, infrastructure and training. Nucleic Acids Res 2019; 47:D15-D22. [PMID: 30445657 PMCID: PMC6323906 DOI: 10.1093/nar/gky1124] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/03/2018] [Revised: 10/19/2018] [Accepted: 11/11/2018] [Indexed: 02/03/2023]
Abstract
The European Bioinformatics Institute (https://www.ebi.ac.uk/) archives, curates and analyses life sciences data produced by researchers throughout the world, and makes these data available for re-use globally (https://www.ebi.ac.uk/). Data volumes continue to grow exponentially: total raw storage capacity now exceeds 160 petabytes, and we manage these increasing data flows while maintaining the quality of our services. This year we have improved the efficiency of our computational infrastructure and doubled the bandwidth of our connection to the worldwide web. We report two new data resources, the Single Cell Expression Atlas (https://www.ebi.ac.uk/gxa/sc/), which is a component of the Expression Atlas; and the PDBe-Knowledgebase (https://www.ebi.ac.uk/pdbe/pdbe-kb), which collates functional annotations and predictions for structure data in the Protein Data Bank. Additionally, Europe PMC (http://europepmc.org/) has added preprint abstracts to its search results, supplementing results from peer-reviewed publications. EMBL-EBI maintains over 150 analytical bioinformatics tools that complement our data resources. We make these tools available for users through a web interface as well as programmatically using application programming interfaces, whilst ensuring the latest versions are available for our users. Our training team, with support from all of our staff, continued to provide on-site, off-site and web-based training opportunities for thousands of researchers worldwide this year.
Affiliation(s)
- Charles E Cook
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rodrigo Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Oana Stroe
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Cath Brooksbank
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ewan Birney
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Rolf Apweiler
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
35
|
Huntley RP, Kramarz B, Sawford T, Umrao Z, Kalea A, Acquaah V, Martin MJ, Mayr M, Lovering RC. Expanding the horizons of microRNA bioinformatics. RNA 2018; 24:1005-1017. [PMID: 29871895 PMCID: PMC6049505 DOI: 10.1261/rna.065565.118] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/08/2018] [Accepted: 06/01/2018] [Indexed: 06/08/2023]
Abstract
MicroRNA regulation of key biological and developmental pathways is a rapidly expanding area of research, accompanied by vast amounts of experimental data. This data, however, is not widely available in bioinformatic resources, making it difficult for researchers to find and analyze microRNA-related experimental data and define further research projects. We are addressing this problem by providing two new bioinformatics data sets that contain experimentally verified functional information for mammalian microRNAs involved in cardiovascular-relevant, and other, processes. To date, our resource provides over 4400 Gene Ontology annotations associated with over 500 microRNAs from human, mouse, and rat and over 2400 experimentally validated microRNA:target interactions. We illustrate how this resource can be used to create microRNA-focused interaction networks with a biological context using the known biological role of microRNAs and the mRNAs they regulate, enabling discovery of associations between gene products, biological pathways and, ultimately, diseases. This data will be crucial in advancing the field of microRNA bioinformatics and will establish consistent data sets for reproducible functional analysis of microRNAs across all biological research areas.
Affiliation(s)
- Rachael P Huntley
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Barbara Kramarz
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Tony Sawford
- European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Zara Umrao
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Anastasia Kalea
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Vanessa Acquaah
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| | - Maria J Martin
- European Bioinformatics Institute, European Molecular Biology Laboratory (EMBL-EBI), Wellcome Trust Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Manuel Mayr
- King's British Heart Foundation Centre, King's College London, London SE5 9NU, United Kingdom
| | - Ruth C Lovering
- Institute of Cardiovascular Science, University College London, London WC1E 6JF, United Kingdom
| |
|
36
|
Kleywegt GJ, Velankar S, Patwardhan A. Structural biology data archiving - where we are and what lies ahead. FEBS Lett 2018; 592:2153-2167. [PMID: 29749603 PMCID: PMC6019198 DOI: 10.1002/1873-3468.13086] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/28/2018] [Revised: 04/25/2018] [Accepted: 04/30/2018] [Indexed: 12/31/2022]
Abstract
For almost 50 years, structural biology has endeavoured to conserve and share its experimental data and their interpretations (usually, atomistic models) through global public archives such as the Protein Data Bank, Electron Microscopy Data Bank and Biological Magnetic Resonance Data Bank (BMRB). These archives are treasure troves of freely accessible data that document our quest for molecular or atomic understanding of biological function and processes in health and disease. They have prepared the field to tackle new archiving challenges as more and more (combinations of) techniques are being utilized to elucidate structure at ever increasing length scales. Furthermore, the field has made substantial efforts to develop validation methods that help users to assess the reliability of structures and to identify the most appropriate data for their needs. In this Review, we present an overview of public data archives in structural biology and discuss the importance of validation for users and producers of structural data. Finally, we sketch our efforts to integrate structural data with bioimaging data and with other sources of biological data. This will make relevant structural information available and more easily discoverable for a wide range of scientists.
Affiliation(s)
- Gerard J. Kleywegt
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Sameer Velankar
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Ardan Patwardhan
- European Molecular Biology Laboratory (EMBL), European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| |
|
37
|
Rigden DJ, Fernández XM. The 2018 Nucleic Acids Research database issue and the online molecular biology database collection. Nucleic Acids Res 2018; 46:D1-D7. [PMID: 29316735 PMCID: PMC5753253 DOI: 10.1093/nar/gkx1235] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 11/28/2017] [Accepted: 11/29/2017] [Indexed: 12/20/2022]
Abstract
The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/.
Affiliation(s)
- Daniel J Rigden
- Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK
| | | |
|