1
Liu W, Vu T, Konigsberg I, Pratte K, Zhuang Y, Kechris K. SmCCNet 2.0: A Comprehensive Tool for Multi-omics Network Inference with Shiny Visualization. bioRxiv 2024:2023.11.20.567893. [PMID: 38045372 PMCID: PMC10690212 DOI: 10.1101/2023.11.20.567893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Summary: Sparse multiple canonical correlation network analysis (SmCCNet) is a machine learning technique for integrating omics data along with a variable of interest (e.g., phenotype of complex disease), and reconstructing multi-omics networks that are specific to this variable. We present the second-generation SmCCNet (SmCCNet 2.0), which integrates single or multiple omics data types along with a quantitative or binary phenotype of interest. In addition, this new package offers a streamlined setup process that can be configured manually or automatically, ensuring a flexible and user-friendly experience. Availability: The package is available from both CRAN (https://cran.r-project.org/web/packages/SmCCNet/index.html) and GitHub (https://github.com/KechrisLab/SmCCNet) under the MIT license. The network visualization tool is available at https://smccnet.shinyapps.io/smccnetnetwork/.
Affiliation(s)
- Weixuan Liu
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Thao Vu
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Iain Konigsberg
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Katherine Pratte
  - Department of Biostatistics, National Jewish Health, Denver, 80206, CO, USA
- Yonghua Zhuang
  - Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
- Katerina Kechris
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, 80045, CO, USA
2
Carrozzini B, Cascarano GL, Giacovazzo C. The Automatic Solution of Macromolecular Crystal Structures via Molecular Replacement Techniques: REMO22 and Its Pipeline. Int J Mol Sci 2023; 24:ijms24076070. [PMID: 37047043 PMCID: PMC10094557 DOI: 10.3390/ijms24076070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 03/08/2023] [Accepted: 03/14/2023] [Indexed: 04/14/2023]
Abstract
A description of REMO22, a new molecular replacement program for proteins and nucleic acids, is provided. This program, as with REMO09, can use various types of prior information through appropriate conditional distribution functions. Its efficacy in model searching has been validated through several test cases involving proteins and nucleic acids. Although REMO22 can be configured with different protocols according to user directives, it has been developed primarily as an automated tool for determining the crystal structures of macromolecules. To be useful in the current crystallographic environment, REMO22 must produce results that compare favorably with those of the most widely used Molecular Replacement (MR) programs. To assess this, we chose two leading tools in the field, PHASER and MOLREP. REMO22, MOLREP and PHASER were included in pipelines that contain two additional steps: phase refinement (SYNERGY) and automated model building (CAB). To evaluate the effectiveness of REMO22, SYNERGY and CAB, we conducted experimental tests on numerous macromolecular structures. The results indicate that REMO22, along with its pipeline REMO22 + SYNERGY + CAB, presents a viable alternative to currently used phasing tools.
Affiliation(s)
- Benedetta Carrozzini
  - Istituto di Cristallografia, The National Research Council (CNR), Via G. Amendola 122/o, I-70126 Bari, Italy
- Giovanni Luca Cascarano
  - Istituto di Cristallografia, The National Research Council (CNR), Via G. Amendola 122/o, I-70126 Bari, Italy
- Carmelo Giacovazzo
  - Istituto di Cristallografia, The National Research Council (CNR), Via G. Amendola 122/o, I-70126 Bari, Italy
3
Miñarro-Lleonar M, Ruiz-Carmona S, Alvarez-Garcia D, Schmidtke P, Barril X. Development of an Automatic Pipeline for Participation in the CELPP Challenge. Int J Mol Sci 2022; 23:ijms23094756. [PMID: 35563148 PMCID: PMC9105952 DOI: 10.3390/ijms23094756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 04/20/2022] [Accepted: 04/21/2022] [Indexed: 12/01/2022]
Abstract
The prediction of how a ligand binds to its target is an essential step for Structure-Based Drug Design (SBDD) methods. Molecular docking is a standard tool to predict the binding mode of a ligand to its macromolecular receptor and to quantify their mutual complementarity, with multiple applications in drug design. However, docking programs do not always find correct solutions, either because the correct poses are not sampled or because of inaccuracies in the scoring functions. Quantifying docking performance in real scenarios is essential to understanding the limitations of these programs, managing expectations and guiding future developments. Here, we present a fully automated pipeline for pose prediction, validated by participating in the Continuous Evaluation of Ligand Pose Prediction (CELPP) Challenge. Acknowledging the intrinsic limitations of the docking method, we devised a strategy to automatically mine and exploit pre-existing data, defining, whenever possible, empirical restraints to guide the docking process. We show that the pipeline is able to generate predictions for most of the proposed targets and to obtain poses with low RMSD values when compared to the crystal structure. All things considered, our pipeline highlights some major challenges in the automatic prediction of protein–ligand complexes, which will be addressed in future versions of the pipeline.
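The RMSD comparison mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes that the predicted and crystal poses list the same heavy atoms in the same order and already share a coordinate frame (for example, both placed in the receptor's frame), and it ignores ligand symmetry. The function name `pose_rmsd` and the toy coordinates are hypothetical; this is not the CELPP evaluation protocol or the authors' code.

```python
import math


def pose_rmsd(predicted, reference):
    """Root-mean-square deviation (same units as the input, e.g. angstroms).

    Assumes both poses contain the same atoms in the same order and are
    expressed in the same coordinate frame; no superposition is performed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Hypothetical three-atom ligand: the docked pose is shifted 1 unit along z.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
print(pose_rmsd(docked, crystal))
```

A rigid shift of every atom by the same distance gives an RMSD equal to that distance, which makes this toy case easy to check by hand.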
Affiliation(s)
- Marina Miñarro-Lleonar
  - Pharmacy Faculty, University of Barcelona, Av. de Joan XXIII 27-31, 08028 Barcelona, Spain
- Daniel Alvarez-Garcia
  - GAIN Therapeutics, Parc Cientific de Barcelona, Baldiri i Reixac 10, 08029 Barcelona, Spain
- Peter Schmidtke
  - Discngine S.A.S., 79 Avenue Ledru Rollin, 75012 Paris, France
- Xavier Barril
  - Pharmacy Faculty, University of Barcelona, Av. de Joan XXIII 27-31, 08028 Barcelona, Spain
  - GAIN Therapeutics, Parc Cientific de Barcelona, Baldiri i Reixac 10, 08029 Barcelona, Spain
  - Catalan Institute for Research and Advanced Studies (ICREA), Passeig de Lluis Companys 23, 08010 Barcelona, Spain
4
Abstract
Quantitative susceptibility mapping (QSM) is an MRI-based, computational method for anatomically localizing and measuring concentrations of specific biomarkers in tissue, such as iron. Growing research suggests QSM is a viable method for evaluating the impact of iron overload in neurological disorders and on cognitive performance in aging. Several software toolboxes are currently available to reconstruct QSM maps from 3D GRE MR images. However, few if any software packages currently offer fully automated pipelines for QSM-based data analyses, from DICOM images to region-of-interest (ROI) based QSM values. Even fewer QSM-based software packages offer quality control measures for evaluating the QSM output. Here, we address these gaps in the field by introducing and demonstrating the reliability and external validity of Ironsmith, an open-source, fully automated pipeline for creating and processing QSM maps, extracting QSM values from subcortical and cortical brain regions (89 ROIs) and evaluating the quality of QSM data using SNR measures and assessment of outlier regions on phase images. Ironsmith also features automatic filtering of QSM outlier values and precise CSF-only QSM reference masks that minimize partial volume effects. Testing of Ironsmith revealed excellent intra- and inter-rater reliability. Finally, external validity of Ironsmith was demonstrated via an anatomically selective relationship between motor performance and Ironsmith-derived QSM values in motor cortex. In sum, Ironsmith provides a freely available, reliable, turn-key pipeline for QSM-based data analyses to support research on the impact of brain iron in aging and neurodegenerative disease.
Affiliation(s)
- Valentinos Zachariou
  - Department of Neuroscience, College of Medicine, University of Kentucky, Lexington, KY 40536-0298, United States
- Christopher E Bauer
  - Department of Neuroscience, College of Medicine, University of Kentucky, Lexington, KY 40536-0298, United States
- David K Powell
  - Department of Neuroscience, Magnetic Resonance Imaging and Spectroscopy Center, College of Medicine, University of Kentucky, Lexington, KY 40536-0298, United States
- Brian T Gold
  - Department of Neuroscience, Sanders-Brown Center on Aging, Magnetic Resonance Imaging and Spectroscopy Center, College of Medicine, University of Kentucky, Lexington, KY 40536-0298, United States
5
Johnson LK, Alexander H, Brown CT. Re-assembly, quality evaluation, and annotation of 678 microbial eukaryotic reference transcriptomes. Gigascience 2019; 8:5241890. [PMID: 30544207 PMCID: PMC6481552 DOI: 10.1093/gigascience/giy158] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 05/25/2018] [Revised: 09/18/2018] [Accepted: 11/29/2018] [Indexed: 12/18/2022]
Abstract
Background: De novo transcriptome assemblies are required prior to analyzing RNA sequencing data from a species without an existing reference genome or transcriptome. Despite the prevalence of transcriptomic studies, the effects of using different workflows, or "pipelines," on the resulting assemblies are poorly understood. Here, a pipeline was programmatically automated and used to assemble and annotate raw transcriptomic short-read data collected as part of the Marine Microbial Eukaryotic Transcriptome Sequencing Project. The resulting transcriptome assemblies were evaluated and compared against assemblies that were previously generated with a different pipeline developed by the National Center for Genome Research. Results: New transcriptome assemblies contained the majority of previous contigs as well as new content. On average, 7.8% of the annotated contigs in the new assemblies were novel gene names not found in the previous assemblies. Taxonomic trends were observed in the assembly metrics. Assemblies from the Dinoflagellata showed a higher number of contigs and unique k-mers than transcriptomes from other phyla, while assemblies from Ciliophora had a lower percentage of open reading frames compared to other phyla. Conclusions: Given current bioinformatics approaches, there is no single "best" reference transcriptome for a particular set of raw data. As the optimum transcriptome is a moving target, improving (or not) with new tools and approaches, automated and programmable pipelines are invaluable for managing the computationally intensive tasks required for re-processing large sets of samples with revised pipelines and ensuring a common evaluation workflow is applied to all samples. Thus, re-assembling existing data with new tools using automated and programmable pipelines may yield more accurate identification of taxon-specific trends across samples in addition to novel and useful products for the community.
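One of the assembly metrics named above, the number of unique k-mers, can be sketched in a few lines of Python. This is only an illustration of the metric under simple assumptions: k-mers are counted as plain substrings on the given strand, reverse complements are not merged, and the value of `k`, the function name `unique_kmers`, and the toy contigs are all hypothetical rather than taken from the study's pipeline.

```python
def unique_kmers(contigs, k):
    """Return the set of distinct length-k substrings across all contigs.

    Assumption: k-mers are not canonicalized, so a k-mer and its reverse
    complement are counted separately.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seen = set()
    for seq in contigs:
        seq = seq.upper()
        # Slide a window of width k along the contig.
        for start in range(len(seq) - k + 1):
            seen.add(seq[start:start + k])
    return seen


# Hypothetical toy assembly with two short contigs.
contigs = ["ACGTAC", "GTACGG"]
kmers = unique_kmers(contigs, k=4)
print(len(kmers), sorted(kmers))
```

Contigs shorter than `k` contribute nothing, because the window range is empty for them.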
Affiliation(s)
- Lisa K Johnson
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California Davis, One Shields Ave, Davis, CA 95616, USA
  - Molecular, Cellular, and Integrative Physiology Graduate Group, University of California Davis, One Shields Ave, Davis, CA 95616, USA
- Harriet Alexander
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California Davis, One Shields Ave, Davis, CA 95616, USA
  - Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543, USA
- C Titus Brown
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California Davis, One Shields Ave, Davis, CA 95616, USA
  - Molecular, Cellular, and Integrative Physiology Graduate Group, University of California Davis, One Shields Ave, Davis, CA 95616, USA
  - Genome Center, University of California Davis, 451 Health Sciences Dr, Davis, CA 95616, USA
6
Marques P, Soares JM, Alves V, Sousa N. BrainCAT - a tool for automated and combined functional magnetic resonance imaging and diffusion tensor imaging brain connectivity analysis. Front Hum Neurosci 2013; 7:794. [PMID: 24319419 PMCID: PMC3836207 DOI: 10.3389/fnhum.2013.00794] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/12/2013] [Accepted: 10/31/2013] [Indexed: 01/03/2023]
Abstract
Multimodal neuroimaging studies have recently become a trend in the neuroimaging field and are likely to become a standard in the future. Brain connectivity studies that combine functional activation patterns from resting-state or task-related functional magnetic resonance imaging (fMRI) with diffusion tensor imaging (DTI) tractography are growing in popularity. However, there is a scarcity of solutions to perform optimized, intuitive, and consistent multimodal fMRI/DTI studies. Here we propose a new tool, brain connectivity analysis tool (BrainCAT), for an automated and standard multimodal analysis of combined fMRI/DTI data, using freely available tools. With a friendly graphical user interface, BrainCAT aims to make data processing easier and faster, implementing a fully automated data processing pipeline and minimizing the need for user intervention, which hopefully will expand the use of combined fMRI/DTI studies. Its validity was tested in an aging study of the default mode network (DMN) white matter connectivity. The results identified the cingulum bundle as the structural connector of the precuneus/posterior cingulate cortex and the medial frontal cortex, regions of the DMN. Moreover, mean fractional anisotropy (FA) values along the cingulum extracted with BrainCAT showed a strong correlation with FA values from the manual selection of the same bundle. Taken together, these results provide evidence that BrainCAT is suitable for these analyses.
Affiliation(s)
- Paulo Marques
  - Life and Health Sciences Research Institute, School of Health Sciences, University of Minho, Braga, Portugal
  - ICVS/3B's - PT Government Associate Laboratory, Braga/Guimarães, Portugal