Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zeng S, Lyu Z, Narisetti SRK, Xu D, Joshi T. Knowledge Base Commons (KBCommons) v1.1: a universal framework for multi-omics data integration and biological discoveries. BMC Genomics 2019;20:947. [PMID: 31856718 PMCID: PMC6923931 DOI: 10.1186/s12864-019-6287-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/15/2023]

Cited by Other Article(s)

Chan YO, Biová J, Mahmood A, Dietz N, Bilyeu K, Škrabišová M, Joshi T. Genomic Variations Explorer (GenVarX): a toolset for annotating promoter and CNV regions using genotypic and phenotypic differences. Front Genet 2023;14:1251382. [PMID: 37928239 PMCID: PMC10623549 DOI: 10.3389/fgene.2023.1251382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2023] [Accepted: 09/27/2023] [Indexed: 11/07/2023]

Abstract

The rapid growth of sequencing technology and its increasing popularity in biology-related research over the years has made whole genome re-sequencing (WGRS) data become widely available. A large amount of WGRS data can unlock the knowledge gap between genomics and phenomics through gaining an understanding of the genomic variations that can lead to phenotype changes. These genomic variations are usually comprised of allele and structural changes in DNA, and these changes can affect the regulatory mechanisms causing changes in gene expression and altering the phenotypes of organisms. In this research work, we created the GenVarX toolset, that is backed by transcription factor binding sequence data in promoter regions, the copy number variations data, SNPs and Indels data, and phenotypes data which can potentially provide insights about phenotypic differences and solve compelling questions in plant research. Analytics-wise, we have developed strategies to better utilize the WGRS data and mine the data using efficient data processing scripts, libraries, tools, and frameworks to create the interactive and visualization-enhanced GenVarX toolset that encompasses both promoter regions and copy number variation analysis components. The main capabilities of the GenVarX toolset are to provide easy-to-use interfaces for users to perform queries, visualize data, and interact with the data. Based on different input windows on the user interface, users can provide inputs corresponding to each field and submit the information as a query. The data returned on the results page is usually displayed in a tabular fashion. In addition, interactive figures are also included in the toolset to facilitate the visualization of statistical results or tool outputs. Currently, the GenVarX toolset supports soybean, rice, and Arabidopsis. 
Researchers can access the soybean GenVarX toolset from SoyKB at https://soykb.org/SoybeanGenVarX/ and the rice and Arabidopsis GenVarX toolsets from the KBCommons web portal at https://kbcommons.org/system/tools/GenVarX/Osativa and https://kbcommons.org/system/tools/GenVarX/Athaliana, respectively.

Xu Z, Cheng S, Qiu X, Wang X, Hu Q, Shi Y, Liu Y, Lin J, Tian J, Peng Y, Jiang Y, Yang Y, Ye J, Wang Y, Meng X, Li Z, Li H, Wang Y. A pipeline for sample tagging of whole genome bisulfite sequencing data using genotypes of whole genome sequencing. BMC Genomics 2023;24:347. [PMID: 37353738 DOI: 10.1186/s12864-023-09413-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 05/27/2023] [Indexed: 06/25/2023]

Abstract

BACKGROUND

In large-scale high-throughput sequencing projects and biobank construction, sample tagging is essential to prevent sample mix-ups. Despite the availability of fingerprint panels for DNA data, little research has been conducted on sample tagging of whole genome bisulfite sequencing (WGBS) data. This study aims to construct a pipeline and identify applicable fingerprint panels to address this problem.

RESULTS

Using autosome-wide A/T polymorphic single nucleotide variants (SNVs) obtained from whole genome sequencing (WGS) and WGBS of individuals from the Third China National Stroke Registry, we designed a fingerprint panel and constructed an optimized pipeline for tagging WGBS data. This pipeline used Bis-SNP to call genotypes from the WGBS data, and optimized genotype comparison by eliminating wildtype homozygous and missing genotypes, and retaining variants with identical genomic coordinates and reference/alternative alleles. WGS-based and WGBS-based genotypes called from identical or different samples were extensively compared using hap.py. In the first batch of 94 samples, the genotype consistency rates using the autosome-wide A/T polymorphic SNV panel were 71.01%-84.23% for matched and 51.43%-60.50% for mismatched WGS and WGBS data. This capability to tag WGBS data was validated in the second batch of 240 samples, with genotype consistency rates of 70.61%-84.65% and 49.58%-61.42% for the matched and mismatched data, respectively. We also determined that the number of genetic variants required to correctly tag WGBS data was on the order of thousands by testing six fingerprint panels whose variant counts differed by orders of magnitude. Additionally, we affirmed this result with two self-designed panels of 1351 and 1278 SNVs, respectively. Furthermore, this study confirmed that using the number of genetic variants with identical coordinates and ref/alt alleles, or the number of identical genotypes, could not correctly tag WGBS data.
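The genotype comparison described in the RESULTS paragraph can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which relies on Bis-SNP and hap.py); it only mimics the stated filtering on toy data. The dictionary layout, the function names, and the choice to drop a site when either callset reports a missing or homozygous-reference genotype are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory representation (an assumption, not from the paper):
# a variant is keyed by (chromosome, position, ref allele, alt allele) and
# a genotype is a pair of allele indices, or None when the call is missing.
VariantKey = Tuple[str, int, str, str]
Genotype = Optional[Tuple[int, int]]


def informative_calls(calls: Dict[VariantKey, Genotype]) -> Dict[VariantKey, Tuple[int, int]]:
    """Drop missing and homozygous-reference (wildtype) genotypes."""
    return {k: gt for k, gt in calls.items() if gt is not None and tuple(gt) != (0, 0)}


def consistency_rate(wgs: Dict[VariantKey, Genotype], wgbs: Dict[VariantKey, Genotype]) -> float:
    """Fraction of shared variants whose unphased genotypes agree."""
    a = informative_calls(wgs)
    b = informative_calls(wgbs)
    # Keep only variants with identical coordinates and ref/alt alleles.
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no comparable variants after filtering")
    matches = sum(1 for k in shared if sorted(a[k]) == sorted(b[k]))
    return matches / len(shared)


# Toy data, invented purely to exercise the function.
wgs_calls = {
    ("chr1", 100, "A", "T"): (0, 1),
    ("chr1", 200, "A", "T"): (1, 1),
    ("chr1", 300, "T", "A"): (0, 0),   # wildtype homozygous: dropped
    ("chr1", 400, "A", "T"): None,     # missing: dropped
    ("chr2", 50, "A", "T"): (0, 1),    # alt allele differs in WGBS: not shared
}
wgbs_calls = {
    ("chr1", 100, "A", "T"): (1, 0),
    ("chr1", 200, "A", "T"): (0, 1),
    ("chr1", 300, "T", "A"): (0, 1),
    ("chr1", 400, "A", "T"): (0, 1),
    ("chr2", 50, "A", "G"): (0, 1),
}
print(consistency_rate(wgs_calls, wgbs_calls))  # prints 0.5
```

In this framing, a WGBS sample would be assigned to the WGS sample with the highest consistency rate. The abstract reports matched pairs at roughly 71%-85% and mismatched pairs at roughly 50%-61%, but it states no decision threshold, so none is assumed here.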

CONCLUSION

This study proposed an optimized pipeline, applicable fingerprint panels, and a lower boundary for the number of fingerprint genetic variants needed for correct sample tagging of WGBS data, which are valuable for tagging WGBS data and integrating multi-omics data for biobanks.

Affiliation(s)

Zhe Xu: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
Si Cheng: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; Clinical Center for Precision Medicine in Stroke, Capital Medical University, Beijing, 100069, China; Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, 100069, China
Xin Qiu: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China
Xiaoqi Wang: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Qiuwen Hu: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Yanfeng Shi: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
Yang Liu: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
Jinxi Lin: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China
Jichao Tian: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Yongfei Peng: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Yong Jiang: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China
Yadong Yang: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Jianwei Ye: BioChain (Beijing) Science and Technology, Inc, Economic and Technological Development Area, 100176, Beijing, P. R. China
Yilong Wang: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
Xia Meng: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China
Zixiao Li: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China
Hao Li: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China
Yongjun Wang: Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; China National Clinical Research Center for Neurological Diseases, Beijing, 100070, China; Center of excellence for Omics Research (CORe), Beijing Tiantan Hospital, Capital Medical University, Beijing, 100070, China; Clinical Center for Precision Medicine in Stroke, Capital Medical University, Beijing, 100069, China; Advanced Innovation Center for Human Brain Protection, Capital Medical University, Beijing, 100069, China

Chan YO, Dietz N, Zeng S, Wang J, Flint-Garcia S, Salazar-Vidal MN, Škrabišová M, Bilyeu K, Joshi T. The Allele Catalog Tool: a web-based interactive tool for allele discovery and analysis. BMC Genomics 2023;24:107. [PMID: 36899307 PMCID: PMC10007842 DOI: 10.1186/s12864-023-09161-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/07/2022] [Accepted: 01/31/2023] [Indexed: 03/12/2023]

Abstract

BACKGROUND

The advancement of sequencing technologies today has made a plethora of whole-genome re-sequenced (WGRS) data publicly available. However, research utilizing the WGRS data without further configuration is nearly impossible. To solve this problem, our research group has developed an interactive Allele Catalog Tool to enable researchers to explore the coding region allelic variation present in over 1,000 re-sequenced accessions each for soybean, Arabidopsis, and maize.

RESULTS

The Allele Catalog Tool was designed originally with soybean genomic data and resources. The Allele Catalog datasets were generated using our variant calling pipeline (SnakyVC) and the Allele Catalog pipeline (AlleleCatalog). The variant calling pipeline was developed to process raw sequencing reads in parallel and generate Variant Call Format (VCF) files, and the Allele Catalog pipeline takes VCF files to perform imputation, functional effect prediction, and allele assembly for each gene to generate curated Allele Catalog datasets. Both pipelines were utilized to generate the data panels (VCF files and Allele Catalog files), in which the accessions of the WGRS datasets were collected from various sources, currently representing over 1,000 diverse accessions each for soybean, Arabidopsis, and maize. The main features of the Allele Catalog Tool include data query, visualization of results, categorical filtering, and download functions. Queries are performed from user input, and results are presented in tabular format as summary results by categorical description and genotype results of the alleles for each gene. The categorical information is specific to each species; additionally, available detailed meta-information is provided in modal popups. The genotypic information contains the variant positions, reference or alternate genotypes, the functional effect classes, and the amino-acid changes of each accession. In addition, the results can be downloaded for other research purposes.

CONCLUSIONS

The Allele Catalog Tool is a web-based tool that currently supports three species: soybean, Arabidopsis, and maize. The Soybean Allele Catalog Tool is hosted on the SoyKB website ( https://soykb.org/SoybeanAlleleCatalogTool/ ), while the Allele Catalog Tool for Arabidopsis and maize is hosted on the KBCommons website ( https://kbcommons.org/system/tools/AlleleCatalogTool/Zmays and https://kbcommons.org/system/tools/AlleleCatalogTool/Athaliana ). Researchers can use this tool to connect variant alleles of genes with meta-information of species.

Zhang R, Zhang C, Yu C, Dong J, Hu J. Integration of multi-omics technologies for crop improvement: Status and prospects. FRONTIERS IN BIOINFORMATICS 2022;2:1027457. [PMID: 36438626 PMCID: PMC9689701 DOI: 10.3389/fbinf.2022.1027457] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/25/2022] [Accepted: 09/28/2022] [Indexed: 08/03/2023]

Yang Y, La TC, Gillman JD, Lyu Z, Joshi T, Usovsky M, Song Q, Scaboo A. Linkage analysis and residual heterozygotes derived near isogenic lines reveals a novel protein quantitative trait loci from a Glycine soja accession. FRONTIERS IN PLANT SCIENCE 2022;13:938100. [PMID: 35968122 PMCID: PMC9372550 DOI: 10.3389/fpls.2022.938100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]

Abstract

Modern soybean [Glycine max (L.) Merr] cultivars have low overall genetic variation due to repeated bottleneck events that arose during domestication and from selection strategies typical of many soybean breeding programs. In both public and private soybean breeding programs, the introgression of wild soybean (Glycine soja Siebold and Zucc.) alleles is a viable option to increase genetic diversity and identify new sources for traits of value. The objectives of our study were to examine the genetic architecture responsible for seed protein and oil using a recombinant inbred line (RIL) population derived from hybridizing a G. max line ('Osage') with a G. soja accession (PI 593983). Linkage mapping identified a total of seven significant quantitative trait loci on chromosomes 14 and 20 for seed protein and on chromosome 8 for seed oil with LOD scores ranging from 5.3 to 31.7 for seed protein content and from 9.8 to 25.9 for seed oil content. We analyzed 3,015 single F4:9 soybean plants to develop two residual heterozygotes derived near isogenic lines (RHD-NIL) populations by targeting nine SNP markers from genotype-by-sequencing, which corresponded to two novel quantitative trait loci (QTL) derived from G. soja: one for a novel seed oil QTL on chromosome 8 and another for a novel protein QTL on chromosome 14. Single marker analysis and linkage analysis using 50 RHD-NILs validated the chromosome 14 protein QTL, and whole genome sequencing of RHD-NILs allowed us to reduce the QTL interval from ∼16.5 to ∼4.6 Mbp. We identified two genomic regions based on recombination events which had significant increases of 0.65 and 0.72% in seed protein content without a significant decrease in seed oil content. A new Kompetitive allele-specific polymerase chain reaction (KASP) assay, which will be useful for introgression of this trait into modern elite G. max cultivars, was developed in one region. 
Within the significantly associated genomic regions, a total of eight genes are considered as candidate genes, based on the presence of gene annotations associated with the protein or amino acid metabolism/movement. Our results provide better insights into utilizing wild soybean as a source of genetic diversity for soybean cultivar improvement utilizing native traits.

Wang J, Sidharth S, Zeng S, Jiang Y, Chan YO, Lyu Z, McCubbin T, Mertz R, Sharp RE, Joshi T. Bioinformatics for plant and agricultural discoveries in the age of multiomics: A review and case study of maize nodal root growth under water deficit. PHYSIOLOGIA PLANTARUM 2022;174:e13672. [PMID: 35297059 DOI: 10.1111/ppl.13672] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/09/2021] [Revised: 03/03/2022] [Accepted: 03/11/2022] [Indexed: 06/14/2023]

Affiliation(s)

Juexin Wang: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA; Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
Sen Sidharth: Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA; Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA; Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, USA
Shuai Zeng: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
Yuexu Jiang: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA; Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
Yen On Chan: MU Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA
Zhen Lyu: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
Tyler McCubbin: Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA; Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, USA
Rachel Mertz: Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, USA; Division of Biological Sciences, University of Missouri, Columbia, Missouri, USA
Robert E Sharp: Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA; Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, USA
Trupti Joshi: Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA; Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA; Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA; Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, USA; MU Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri, USA; Department of Health Management and Informatics, University of Missouri, Columbia, Missouri, USA

Krassowski M, Das V, Sahu SK, Misra BB. State of the Field in Multi-Omics Research: From Computational Needs to Data Mining and Sharing. Front Genet 2020;11:610798. [PMID: 33362867 PMCID: PMC7758509 DOI: 10.3389/fgene.2020.610798] [Citation(s) in RCA: 139] [Impact Index Per Article: 34.8] [Received: 09/27/2020] [Accepted: 11/20/2020] [Indexed: 12/24/2022]

Jamil IN, Remali J, Azizan KA, Nor Muhammad NA, Arita M, Goh HH, Aizat WM. Systematic Multi-Omics Integration (MOI) Approach in Plant Systems Biology. FRONTIERS IN PLANT SCIENCE 2020;11:944. [PMID: 32754171 PMCID: PMC7371031 DOI: 10.3389/fpls.2020.00944] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 03/05/2020] [Accepted: 06/10/2020] [Indexed: 05/03/2023]

Naba A, Ricard-Blum S. The Extracellular Matrix Goes -Omics: Resources and Tools. EXTRACELLULAR MATRIX OMICS 2020. [DOI: 10.1007/978-3-030-58330-9_1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]