1
Plasma Proteins Associated with COVID-19 Severity in Puerto Rico. Int J Mol Sci 2024; 25:5426. [PMID: 38791465 PMCID: PMC11121485 DOI: 10.3390/ijms25105426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 05/10/2024] [Accepted: 05/12/2024] [Indexed: 05/26/2024] Open
Abstract
Viral strains, age, and host factors are associated with variable immune responses against SARS-CoV-2 and disease severity. Puerto Ricans have a genetic mixture of races: European, African, and Native American. We hypothesized that unique host proteins/pathways are associated with COVID-19 disease severity in Puerto Rico. Following IRB approval, a total of 95 unvaccinated men and women aged 21-71 years old were recruited in Puerto Rico from 2020-2021. Plasma samples were collected from COVID-19-positive subjects (n = 39) and COVID-19-negative individuals (n = 56) during acute disease. COVID-19-positive individuals were stratified based on symptomatology as follows: mild (n = 18), moderate (n = 13), and severe (n = 8). Quantitative proteomics was performed in plasma samples using tandem mass tag (TMT) labeling. Labeled peptides were subjected to LC/MS/MS and analyzed by Proteome Discoverer (version 2.5), Limma software (version 3.41.15), and Ingenuity Pathways Analysis (IPA, version 22.0.2). Cytokines were quantified using a human cytokine array. Proteomics analyses of severely affected COVID-19-positive individuals revealed 58 differentially expressed proteins. Cadherin-13, which participates in synaptogenesis, was downregulated in severe patients and validated by ELISA. Cytokine immunoassay showed that TNF-α levels decreased with disease severity. This study uncovers potential host predictors of COVID-19 severity and new avenues for treatment in Puerto Ricans.
2
Novel hydrazone compounds with broad-spectrum antiplasmodial activity and synergistic interactions with antimalarial drugs. Antimicrob Agents Chemother 2024:e0164323. [PMID: 38639491 DOI: 10.1128/aac.01643-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 03/20/2024] [Indexed: 04/20/2024] Open
Abstract
The development of novel antiplasmodial compounds with broad-spectrum activity against different stages of Plasmodium parasites is crucial to prevent malaria disease and parasite transmission. This study evaluated the antiplasmodial activity of seven novel hydrazone compounds (referred to as CB compounds: CB-27, CB-41, CB-50, CB-53, CB-58, CB-59, and CB-61) against multiple stages of Plasmodium parasites. All CB compounds inhibited blood stage proliferation of drug-resistant or sensitive strains of Plasmodium falciparum in the low micromolar to nanomolar range. Interestingly, CB-41 exhibited prophylactic activity against hypnozoites and liver schizonts in Plasmodium cynomolgi, a primate model for Plasmodium vivax. Four CB compounds (CB-27, CB-41, CB-53, and CB-61) inhibited P. falciparum oocyst formation in mosquitoes, and five CB compounds (CB-27, CB-41, CB-53, CB-58, and CB-61) hindered the in vitro development of Plasmodium berghei ookinetes. The CB compounds did not inhibit the activation of P. berghei female and male gametocytes in vitro. Isobologram assays demonstrated synergistic interactions between CB-61 and the FDA-approved antimalarial drugs, clindamycin and halofantrine. Testing of six CB compounds showed no inhibition of Plasmodium glutathione S-transferase as a putative target and no cytotoxicity in HepG2 liver cells. CB compounds are promising candidates for further development as antimalarial drugs against multidrug-resistant parasites, which could also prevent malaria transmission.
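Isobologram results such as those reported for CB-61 are commonly summarized with a sum of fractional inhibitory concentrations (ΣFIC). The following is a minimal sketch of that calculation, not the authors' analysis: the IC50 values and the interpretation cutoffs (0.5 and 4.0) are assumptions chosen for illustration, and different studies use different cutoffs.

```python
def fic_index(ic50_a_alone, ic50_b_alone, ic50_a_combo, ic50_b_combo):
    """Sum of fractional inhibitory concentrations for a two-drug combination."""
    return ic50_a_combo / ic50_a_alone + ic50_b_combo / ic50_b_alone


def classify_interaction(sum_fic, synergy_cutoff=0.5, antagonism_cutoff=4.0):
    """Label an interaction; both cutoffs are assumed conventions, not from the abstract."""
    if sum_fic <= synergy_cutoff:
        return "synergistic"
    if sum_fic > antagonism_cutoff:
        return "antagonistic"
    return "additive/indifferent"


# Hypothetical IC50 values (same units for each drug alone and in combination).
sum_fic = fic_index(ic50_a_alone=10.0, ic50_b_alone=20.0,
                    ic50_a_combo=2.0, ic50_b_combo=4.0)
print(round(sum_fic, 3), classify_interaction(sum_fic))
```

Each drug contributes the ratio of its IC50 in combination to its IC50 alone, so a combination that needs much less of both drugs yields a small ΣFIC.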
3
Properties and Mechanisms of Deletions, Insertions, and Substitutions in the Evolutionary History of SARS-CoV-2. Int J Mol Sci 2024; 25:3696. [PMID: 38612505 PMCID: PMC11011937 DOI: 10.3390/ijms25073696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 03/22/2024] [Accepted: 03/23/2024] [Indexed: 04/14/2024] Open
Abstract
SARS-CoV-2 has accumulated many mutations since its emergence in late 2019. Nucleotide substitutions leading to amino acid replacements constitute the primary material for natural selection. Insertions, deletions, and substitutions appear to be critical for coronavirus's macro- and microevolution. Understanding the molecular mechanisms of mutations in the mutational hotspots (positions, loci with recurrent mutations, and nucleotide context) is important for disentangling roles of mutagenesis and selection. In the SARS-CoV-2 genome, deletions and insertions are frequently associated with repetitive sequences, whereas C>U substitutions are often surrounded by nucleotides resembling the APOBEC mutable motifs. We describe various approaches to mutation spectra analyses, including the context features of RNAs that are likely to be involved in the generation of recurrent mutations. We also discuss the interplay between mutations and natural selection as a complex evolutionary trend. The substantial variability and complexity of pipelines for the reconstruction of mutations and the huge number of genomic sequences are major problems for the analyses of mutations in the SARS-CoV-2 genome. As a solution, we advocate for the development of a centralized database of predicted mutations, which needs to be updated on a regular basis.
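One simple form of the nucleotide-context analysis mentioned above is to tabulate the bases flanking each C>U substitution. The sketch below assumes two pre-aligned, equal-length RNA sequences with no gaps; the function name and the toy sequences are hypothetical, and real pipelines work from alignments or phylogenies of many genomes.

```python
from collections import Counter


def c_to_u_contexts(reference, variant):
    """Count trinucleotide contexts (5' base, C, 3' base) of C>U substitutions."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    contexts = Counter()
    # Skip the first and last positions, which lack a full flanking context.
    for i in range(1, len(reference) - 1):
        if reference[i] == "C" and variant[i] == "U":
            contexts[reference[i - 1:i + 2]] += 1
    return contexts


# Toy sequences for illustration only.
ref = "AUCGUCAACG"
var = "AUUGUUAACG"
print(c_to_u_contexts(ref, var))
```

The resulting counts can then be compared against the contexts expected from a candidate mutable motif.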
4
Quantitative Proteomics Reveal That CB2R Agonist JWH-133 Downregulates NF-κB Activation, Oxidative Stress, and Lysosomal Exocytosis from HIV-Infected Macrophages. Int J Mol Sci 2024; 25:3246. [PMID: 38542221 PMCID: PMC10970132 DOI: 10.3390/ijms25063246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 03/11/2024] [Accepted: 03/12/2024] [Indexed: 04/13/2024] Open
Abstract
HIV-associated neurocognitive disorders (HAND) affect 15-55% of HIV-positive patients, and effective therapies are unavailable. HIV-infected monocyte-derived macrophages (MDM) invade the brain of these individuals, promoting neurotoxicity. We demonstrated an increased expression of cathepsin B (CATB), a lysosomal protease, in monocytes and post-mortem brain tissues of women with HAND. Increased CATB release from HIV-infected MDM leads to neurotoxicity, and its secretion is associated with NF-κB activation, oxidative stress, and lysosomal exocytosis. The cannabinoid receptor 2 (CB2R) agonist JWH-133 decreases HIV-1 replication, CATB secretion, and neurotoxicity from HIV-infected MDM, but the mechanisms are not entirely understood. We hypothesized that HIV-1 infection upregulates the expression of proteins associated with oxidative stress and that a CB2R agonist could reverse these effects. MDM were isolated from healthy women donors (n = 3), infected with HIV-1ADA, and treated with JWH-133. At 13 days post-infection, cell lysates were labeled by Tandem Mass Tag (TMT) and analyzed by LC/MS/MS-based quantitative proteomics and bioinformatics. While HIV-1 infection upregulated CATB, NF-κB signaling, Nrf2-mediated oxidative stress response, and lysosomal exocytosis, JWH-133 treatment downregulated the expression of the proteins involved in these pathways. Our results suggest that JWH-133 is a potential alternative therapy against HIV-induced neurotoxicity and warrant in vivo studies to test its potential against HAND.
5
Antimalarial Drug Combination Predictions Using the Machine Learning Synergy Predictor (MLSyPred©) tool. Acta Parasitol 2024; 69:415-425. [PMID: 38165555 PMCID: PMC11001753 DOI: 10.1007/s11686-023-00765-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Accepted: 11/27/2023] [Indexed: 01/04/2024]
Abstract
PURPOSE Antimalarial drug resistance is a global public health problem that leads to treatment failure. Synergistic drug combinations can improve treatment outcomes and delay the development of drug resistance. Here, we describe the implementation of a freely available computational tool, Machine Learning Synergy Predictor (MLSyPred©), to predict potential synergy in antimalarial drug combinations. METHODS The MLSyPred© synergy prediction method extracts molecular fingerprints from the drugs' biochemical structures to use as features and also cleans and prepares the raw data. Five machine learning algorithms (Logistic Regression, Random Forest, Support vector machine, Ada Boost, and Gradient Boost) were implemented to build prediction models. Implementation and application of the MLSyPred© tool were tested using datasets from 1540 combinations of 79 drugs and compounds biologically evaluated in pairs for three strains of Plasmodium falciparum (3D7, HB3, and Dd2). RESULTS The best prediction models were obtained using Logistic Regression for antimalarials with the strains Dd2 and HB3 (0.81 and 0.70 AUC, respectively) and Random Forest for antimalarials with 3D7 (0.69 AUC). The MLSyPred© tool yielded 45% precision for synergistically predicted antimalarial drug combinations that were annotated and biologically validated, thus confirming the functionality and applicability of the tool. CONCLUSION The MLSyPred© tool is freely available and represents a promising strategy for discovering potential synergistic drug combinations for further development as novel antimalarial therapies.
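The two metrics reported above, AUC and precision, can be computed from a model's outputs without any machine learning library. This is a minimal sketch of the metrics only, not of the MLSyPred© tool; the labels, scores, and function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def precision(labels, predictions):
    """Fraction of predicted-synergistic combinations that are truly synergistic."""
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    false_pos = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions")
    return true_pos / (true_pos + false_pos)


# Hypothetical labels (1 = synergistic) and model scores for four combinations.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores), precision(labels, [1, 1, 0, 0]))
```

AUC summarizes ranking quality across all thresholds, whereas precision depends on the threshold used to call a combination synergistic, which is why the abstract reports them separately.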
6
Unsupervised machine learning method for indirect estimation of reference intervals for chronic kidney disease in the Puerto Rican population. Sci Rep 2023; 13:17198. [PMID: 37821500 PMCID: PMC10567761 DOI: 10.1038/s41598-023-43830-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/28/2023] [Indexed: 10/13/2023] Open
Abstract
Reference intervals (RIs) for clinical laboratory values are extremely important for diagnostics and treatment of patients. However, the determination of these ranges is costly and time-consuming. As a result, often different unverified RIs are used in practice for the same analyte and the same range is used for all patients despite evidence that the values are gender, age, and ethnicity dependent. Moreover, the abnormal flags are rudimentary, merely indicating if a value is within the RI. At the same time, clinical lab data generated in the everyday medical practice contains a wealth of information, that given the correct methodology, can help determine the RIs for each specific segment of the population, including populations that suffer from health disparities. In this work, we develop unsupervised machine learning methods, based on Gaussian mixtures, to determine RIs of analytes related to chronic kidney disease, using millions of routine lab results for the Puerto Rican population. We show that the measures are both gender and age dependent and we find evidence for normal age-related organ function deterioration and failure. We also show that the joint distribution of measures improves the diagnostic value of the lab results.
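Once a Gaussian mixture has been fitted and one component is taken to represent the non-pathological population, a reference interval can be read off as the central range of that component. The sketch below covers only this last step and assumes the fitting is already done; the 95% coverage, the function name, and the component parameters are assumptions for illustration, not values from the study.

```python
from statistics import NormalDist


def reference_interval(mean, sd, coverage=0.95):
    """Central interval of a Gaussian component covering `coverage` of its mass."""
    tail = (1.0 - coverage) / 2.0
    component = NormalDist(mu=mean, sigma=sd)
    return component.inv_cdf(tail), component.inv_cdf(1.0 - tail)


# Hypothetical fitted component for one analyte in one age/sex stratum.
low, high = reference_interval(mean=1.0, sd=0.2)
print(f"{low:.2f} to {high:.2f}")
```

Repeating this per age and sex stratum would yield the stratified intervals the abstract argues for.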
7
Discovery of Ancestry-specific Variants Associated with Clopidogrel Response among Caribbean Hispanics. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.09.29.23296372. [PMID: 37873439 PMCID: PMC10593031 DOI: 10.1101/2023.09.29.23296372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Background High on-treatment platelet reactivity (HTPR) with clopidogrel is predictive of ischemic events in adults with coronary artery disease. Despite strong data suggesting HTPR varies with ethnicity, including clinical and genetic variables, no genome-wide association study (GWAS) of clopidogrel response has been performed among Caribbean Hispanics. This study aimed to identify genetic predictors of HTPR in a cohort of Caribbean Hispanic cardiovascular patients from Puerto Rico. Methods Local Ancestry inference (LAI) and traditional GWASs were performed on a cohort of 511 clopidogrel-treated patients, stratified based on their P2Y12 reaction units (PRU) into responders and non-responders (HTPR). Results The LAI GWAS identified variants within the CYP2C19 region associated with HTPR, predominantly driven by individuals of European ancestry and absent in those with native ancestry. Incorporating local ancestry adjustment notably enhanced our ability to detect associations. While no loci reached traditional GWAS significance, three variants showed suggestive significance at chromosomes 3, 14 and 22 (OSBPL10 rs1376606, DERL3 rs5030613, and RGS6 rs9323567). In addition, a variant in the UNC5C gene on chromosome 4 was associated with an increased risk of HTPR. These findings were not identified in other cohorts, highlighting the unique genetic landscape of Caribbean Hispanics. Conclusion This is the first GWAS of clopidogrel response in Hispanics, confirming the relevance of the CYP2C19 cluster, particularly among those with European ancestry, and also identifying novel markers in a diverse patient population. Further studies are warranted to replicate our findings in other diverse cohorts and meta-analyses.
8
Non-Random Enrichment of Single-Nucleotide Polymorphisms Associated with Clopidogrel Resistance within Risk Loci Linked to the Severity of Underlying Cardiovascular Diseases: The Role of Admixture. Genes (Basel) 2023; 14:1813. [PMID: 37761953 PMCID: PMC10531115 DOI: 10.3390/genes14091813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 09/14/2023] [Accepted: 09/14/2023] [Indexed: 09/29/2023] Open
Abstract
Cardiovascular disease (CVD) is one of the leading causes of death in Puerto Rico, where clopidogrel is commonly prescribed to prevent ischemic events. Genetic contributors to both a poor clopidogrel response and the severity of CVD have been identified mainly in Europeans. However, the non-random enrichment of single-nucleotide polymorphisms (SNPs) associated with clopidogrel resistance within risk loci linked to underlying CVDs, and the role of admixture, have yet to be tested. This study aimed to assess the possible interaction between genetic biomarkers linked to CVDs and those associated with clopidogrel resistance among admixed Caribbean Hispanics. We identified 50 SNPs significantly associated with CVDs in previous genome-wide association studies (GWASs). These SNPs were combined with another ten SNPs related to clopidogrel resistance in Caribbean Hispanics. We developed Python scripts to determine whether SNPs related to CVDs are in close proximity to those associated with the clopidogrel response. The average and individual local ancestry (LAI) within each locus were inferred, and 60 random SNPs with their corresponding LAIs were generated for enrichment estimation purposes. Our results showed no CVD-linked SNPs in close proximity to those associated with the clopidogrel response among Caribbean Hispanics. Consequently, no genetic loci with a dual predictive role for the risk of CVD severity and clopidogrel resistance were found in this population. Native American ancestry was the most enriched within the risk loci linked to CVDs in this population. The non-random enrichment of disease susceptibility loci with drug-response SNPs is a new frontier in Precision Medicine that needs further attention.
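The abstract mentions Python scripts that test whether CVD-linked SNPs lie close to clopidogrel-response SNPs. A minimal sketch of such a proximity check follows; it is not the authors' code, and the window size, SNP identifiers, and coordinates are all hypothetical.

```python
def nearby_pairs(cvd_snps, drug_snps, window_bp=500_000):
    """Return (cvd_id, drug_id, distance) for SNP pairs on the same chromosome
    whose positions differ by at most `window_bp` base pairs."""
    pairs = []
    for cvd_id, (cvd_chrom, cvd_pos) in cvd_snps.items():
        for drug_id, (drug_chrom, drug_pos) in drug_snps.items():
            distance = abs(cvd_pos - drug_pos)
            if cvd_chrom == drug_chrom and distance <= window_bp:
                pairs.append((cvd_id, drug_id, distance))
    return pairs


# Hypothetical SNPs: identifier -> (chromosome, position in base pairs).
cvd_snps = {"cvd_snp_1": ("9", 22_100_000), "cvd_snp_2": ("1", 55_000_000)}
drug_snps = {"drug_snp_1": ("10", 94_780_000), "drug_snp_2": ("9", 22_400_000)}
print(nearby_pairs(cvd_snps, drug_snps))
```

An empty result for real data would correspond to the finding that no CVD-linked SNPs sit near the clopidogrel-response SNPs.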
9
BioLegato: a programmable, object-oriented graphic user interface. BMC Bioinformatics 2023; 24:316. [PMID: 37605108 PMCID: PMC10441721 DOI: 10.1186/s12859-023-05436-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Accepted: 08/03/2023] [Indexed: 08/23/2023] Open
Abstract
BACKGROUND Biologists are faced with an ever-changing array of complex software tools with steep learning curves, often run on High Performance Computing platforms. To resolve the tradeoff between analytical sophistication and usability, we have designed BioLegato, a programmable graphical user interface (GUI) for running external programs. RESULTS BioLegato can run any program or pipeline that can be launched as a command. BioLegato reads specifications for each tool from files written in PCD, a simple language for specifying GUI components that set parameters for calling external programs. Thus, adding new tools to BioLegato can be done without changing the BioLegato Java code itself. The process is as simple as copying an existing PCD file and modifying it for the new program, which is more like filling in a form than writing code. PCD thus facilitates rapid development of new applications using existing programs as building blocks, and getting them to work together seamlessly. CONCLUSION BioLegato applies Object-Oriented concepts to the user experience by organizing applications based on discrete data types and the methods relevant to that data. PCD makes it easier for BioLegato applications to evolve with the succession of analytical tools for bioinformatics. BioLegato is applicable not only in biology, but in almost any field in which disparate software tools need to work as an integrated system.
10
The 29-nucleotide deletion in SARS-CoV: truncated versions of ORF8 are under purifying selection. BMC Genomics 2023; 24:387. [PMID: 37430204 DOI: 10.1186/s12864-023-09482-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023] Open
Abstract
BACKGROUND Accessory proteins have diverse roles in coronavirus pathobiology. One of them in SARS-CoV (the causative agent of the severe acute respiratory syndrome outbreak in 2002-2003) is encoded by the open reading frame 8 (ORF8). Among the most dramatic genomic changes observed in SARS-CoV isolated from patients during the peak of the pandemic in 2003 was the acquisition of a characteristic 29-nucleotide deletion in ORF8. This deletion causes the splitting of ORF8 into two smaller ORFs, namely ORF8a and ORF8b. Functional consequences of this event are not entirely clear. RESULTS Here, we performed evolutionary analyses of ORF8a and ORF8b genes and documented that in both cases the frequency of synonymous mutations was greater than that of nonsynonymous ones. These results suggest that ORF8a and ORF8b are under purifying selection; thus, proteins translated from these ORFs are likely to be functionally important. Comparisons with several other SARS-CoV genes revealed that another accessory gene, ORF7a, has a similar ratio of nonsynonymous to synonymous mutations, suggesting that ORF8a, ORF8b, and ORF7a are under similar selection pressure. CONCLUSIONS Our results for SARS-CoV echo the known excess of deletions in the ORF7a-ORF7b-ORF8 complex of accessory genes in SARS-CoV-2. A high frequency of deletions in this gene complex might reflect recurrent searches in "functional space" of various accessory protein combinations that may eventually produce more advantageous configurations of accessory proteins similar to the fixed deletion in the SARS-CoV ORF8 gene.
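The comparison of synonymous and nonsynonymous mutations rests on classifying each codon change by whether it alters the encoded amino acid. The sketch below shows only that classification for two aligned, in-frame coding sequences. It is a raw count, not the authors' method and not a normalized dN/dS estimate, and the function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitution_types(reference, variant):
    """Return (synonymous, nonsynonymous) counts of changed codons."""
    if len(reference) != len(variant) or len(reference) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(reference), 3):
        ref_codon, var_codon = reference[i:i + 3], variant[i:i + 3]
        if ref_codon == var_codon:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[var_codon]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy example: CTT>CTC keeps leucine, AAA>AGA changes lysine to arginine.
print(count_substitution_types("ATGCTTAAA", "ATGCTCAGA"))
```

A proper selection analysis would additionally normalize these counts by the numbers of synonymous and nonsynonymous sites.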
11
Evaluation of AIML + HDR-A Course to Enhance Data Science Workforce Capacity for Hispanic Biomedical Researchers. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:2726. [PMID: 36768092 PMCID: PMC9914971 DOI: 10.3390/ijerph20032726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 01/25/2023] [Accepted: 01/29/2023] [Indexed: 06/18/2023]
Abstract
Artificial intelligence (AI) and machine learning (ML) facilitate the creation of revolutionary medical techniques. Unfortunately, biases in current AI and ML approaches are perpetuating minority health inequity. One of the strategies to solve this problem is training a diverse workforce. For this reason, we created the course "Artificial Intelligence and Machine Learning applied to Health Disparities Research (AIML + HDR)" which applied general Data Science (DS) approaches to health disparities research with an emphasis on Hispanic populations. Some technical topics covered included the Jupyter Notebook Framework, coding with R and Python to manipulate data, and ML libraries to create predictive models. Some health disparities topics covered included Electronic Health Records, Social Determinants of Health, and Bias in Data. As a result, the course was taught to 34 selected Hispanic participants and evaluated by a survey on a Likert scale (0-4). The surveys showed high satisfaction (more than 80% of participants agreed) regarding the course organization, activities, and covered topics. The students strongly agreed that the activities were relevant to the course and promoted their learning (3.71 ± 0.21). The students strongly agreed that the course was helpful for their professional development (3.76 ± 0.18). The open question was quantitatively analyzed and showed that seventy-five percent of the comments received from the participants confirmed their great satisfaction.
12
Summary of Year-One Effort of the RCMI Consortium to Enhance Research Capacity and Diversity with Data Science. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 20:279. [PMID: 36612607 PMCID: PMC9819075 DOI: 10.3390/ijerph20010279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/07/2022] [Revised: 11/22/2022] [Accepted: 12/02/2022] [Indexed: 05/23/2023]
Abstract
Despite being disproportionately impacted by health disparities, Black, Hispanic, Indigenous, and other underrepresented populations account for a significant minority of graduates in biomedical data science-related disciplines. Given their commitment to educating underrepresented students and trainees, minority serving institutions (MSIs) can play a significant role in enhancing diversity in the biomedical data science workforce. Little has been published about the reach, curricular breadth, and best practices for delivering these data science training programs. The purpose of this paper is to summarize six Research Centers in Minority Institutions (RCMIs) awarded funding from the National Institute of Minority Health Disparities (NIMHD) to develop new data science training programs. A cross-sectional survey was conducted to better understand the demographics of learners served, curricular topics covered, methods of instruction and assessment, challenges, and recommendations by program directors. Programs demonstrated overall success in reach and curricular diversity, serving a broad range of students and faculty, while also covering a broad range of topics. The main challenges highlighted were a lack of resources and infrastructure and teaching learners with varying levels of experience and knowledge. Further investments in MSIs are needed to sustain training efforts and develop pathways for diversifying the biomedical data science workforce.
13
Research Infrastructure Core Facilities at Research Centers in Minority Institutions: Part I-Research Resources Management, Operation, and Best Practices. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:16979. [PMID: 36554864 PMCID: PMC9779820 DOI: 10.3390/ijerph192416979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 12/04/2022] [Accepted: 12/14/2022] [Indexed: 06/17/2023]
Abstract
Funded by the National Institutes of Health (NIH), the Research Centers in Minority Institutions (RCMI) Program fosters the development and implementation of innovative research aimed at improving minority health and reducing or eliminating health disparities. Currently, there are 21 RCMI Specialized (U54) Centers that share the same framework, comprising four required core components, namely the Administrative, Research Infrastructure, Investigator Development, and Community Engagement Cores. The Research Infrastructure Core (RIC) is fundamentally important for biomedical and health disparities research as a critical function domain. This paper aims to assess the research resources and services provided and evaluate the best practices in research resources management and networking across the RCMI Consortium. We conducted a REDCap-based survey and collected responses from 57 RIC Directors and Co-Directors from 98 core leaders. Our findings indicated that the RIC facilities across the 21 RCMI Centers provide access to major research equipment and are managed by experienced faculty and staff who provide expert consultative and technical services. However, several impediments to RIC facilities operation and management have been identified, and these are currently being addressed through implementation of cost-effective strategies and best practices of laboratory management and operation.
14
Decreased CSTB, RAGE, and Axl Receptor Are Associated with Zika Infection in the Human Placenta. Cells 2022; 11:3627. [PMID: 36429055 PMCID: PMC9688057 DOI: 10.3390/cells11223627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 11/10/2022] [Accepted: 11/11/2022] [Indexed: 11/18/2022] Open
Abstract
Zika virus (ZIKV) compromises placental integrity, infecting the fetus. However, the mechanisms associated with ZIKV penetration into the placenta leading to fetal infection are unknown. Cystatin B (CSTB), the receptor for advanced glycation end products (RAGE), and tyrosine-protein kinase receptor UFO (AXL) have been implicated in ZIKV infection and inflammation. This work investigates CSTB, RAGE, and AXL receptor expression and activation pathways in ZIKV-infected placental tissues at term. The hypothesis is that there is overexpression of CSTB and increased inflammation affecting RAGE and AXL receptor expression in ZIKV-infected placentas. Pathological analyses of 22 placentas were performed to determine changes caused by ZIKV infection. Quantitative proteomics, immunofluorescence, and western blot were performed to analyze proteins and pathways affected by ZIKV infection in frozen placentas. The pathological analysis confirmed decreased size of capillaries, hyperplasia of Hofbauer cells, disruption in the trophoblast layer, cell agglutination, and ZIKV localization to the trophoblast layer. In addition, there was a significant decrease in CSTB, RAGE, and AXL expression and upregulation of caspase 1, tubulin beta, and heat shock protein 27. Modulation of these proteins and activation of inflammasome and pyroptosis pathways suggest targets for modulation of ZIKV infection in the placenta.
15
Proteomics analysis to identify intracellular inflammatory pathways modulated by Fh15 in macrophages stimulated with LPS- E. coli. THE JOURNAL OF IMMUNOLOGY 2022. [DOI: 10.4049/jimmunol.208.supp.164.07] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Previous studies from our laboratory have demonstrated that Fh15, the recombinant variant of F. hepatica FABP, exhibits powerful anti-inflammatory properties. The administration of a single IV dose (for NHP) or IP dose (for mouse) of Fh15 prior to or after lethal doses of E. coli-LPS, respectively, is enough to significantly suppress the pro-inflammatory cytokine storm and prevent the lethal pathologic consequences of septic shock. This suggests that Fh15 has a broad spectrum of action and may modulate pro-inflammatory mechanisms. The present study aimed to identify intracellular inflammatory pathways that could be modulated by Fh15. To achieve this goal, we applied a quantitative proteomic approach using Tandem Mass Tag peptide labeling, which is used to quantify and identify biological macromolecules such as proteins. Samples were analyzed using liquid chromatography with tandem mass spectrometry. RAW 264.7 cells were stimulated with LPS and/or Fh15; cells treated with Fh15 or LPS alone were used as controls. For the bioinformatics analysis, a fold change of 1.5 and a p-value ≤ 0.05 were used as cutoffs for the enrichment analysis in Ingenuity Pathway Analysis (IPA). Using these approaches, we identified 257 proteins associated with LPS treatment: 173 were upregulated and 84 were downregulated. Furthermore, we identified 139 proteins associated with Fh15 treatment: 38 were upregulated and 101 were downregulated. The IPA analysis identified proteins associated with NF-κB, iNOS, and Th1/Th2 pathways. We also identified 8 differentially abundant proteins that are associated with inflammatory and/or infectious diseases. Currently, we are evaluating these proteins with ELISA, Western blot, and/or flow cytometry techniques.
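The fold-change and p-value cutoffs described above amount to a simple filter over the quantified proteins. In the sketch below, the abstract supplies only the values 1.5 and 0.05; treating the fold change as a ratio, calling downregulation at 1/1.5, and the protein names and numbers are all assumptions for illustration.

```python
def classify_protein(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Label a protein as 'up', 'down', or 'unchanged'.

    Assumes `fold_change` is a treated/control ratio, so downregulation
    is called at or below 1 / fc_cutoff (an assumed convention).
    """
    if p_value > p_cutoff:
        return "unchanged"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= 1.0 / fc_cutoff:
        return "down"
    return "unchanged"


# Hypothetical proteins: name -> (fold change, p-value).
results = {
    "protein_A": (2.10, 0.010),
    "protein_B": (0.50, 0.030),
    "protein_C": (1.80, 0.200),
    "protein_D": (1.20, 0.001),
}
for name, (fc, p) in results.items():
    print(name, classify_protein(fc, p))
```

Only proteins passing both cutoffs would be passed on to pathway enrichment.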
Supported by 1SC1AI155439-01 NIAID
16
Reduced RBPMS Levels Promote Cell Proliferation and Decrease Cisplatin Sensitivity in Ovarian Cancer Cells. Int J Mol Sci 2022; 23:535. [PMID: 35008958 PMCID: PMC8745614 DOI: 10.3390/ijms23010535] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/01/2021] [Revised: 12/28/2021] [Accepted: 12/29/2021] [Indexed: 12/14/2022] Open
Abstract
Worldwide, the number of cancer-related deaths continues to increase due to the ability of cancer cells to become chemotherapy-resistant and metastasize. For women with ovarian cancer, a staggering 70% will become resistant to the front-line therapy, cisplatin. Although many mechanisms of cisplatin resistance have been proposed, the key mechanisms of such resistance remain elusive. The RNA binding protein with multiple splicing (RBPMS) binds to nascent RNA transcripts and regulates splicing, transport, localization, and stability. Evidence indicates that RBPMS also binds to protein members of the AP-1 transcription factor complex, repressing its activity. Until now, little has been known about the biological function of RBPMS in ovarian cancer. Accordingly, we interrogated available Internet databases and found that ovarian cancer patients with high RBPMS levels live longer than patients with low RBPMS levels. Similarly, immunohistochemical (IHC) analysis in a tissue array of ovarian cancer patient samples revealed weaker RBPMS staining in serous ovarian cancer tissues than in normal ovarian tissues. We generated clustered regularly interspaced short palindromic repeats (CRISPR)-mediated RBPMS knockout vectors that were stably transfected in the high-grade serous ovarian cancer cell line, OVCAR3. The knockout of RBPMS in these cells was confirmed via bioinformatics analysis, real-time PCR, and Western blot analysis. We found that the RBPMS knockout clones grew faster and were more invasive than the control CRISPR clones. RBPMS knockout also reduced the sensitivity of the OVCAR3 cells to cisplatin treatment. Moreover, β-galactosidase (β-Gal) measurements showed that RBPMS knockdown induced senescence in ovarian cancer cells.
We performed RNAseq in the RBPMS knockout clones and identified several downstream-RBPMS transcripts, including non-coding RNAs (ncRNAs) and protein-coding genes associated with alteration of the tumor microenvironment as well as those with oncogenic or tumor suppressor capabilities. Moreover, proteomic studies confirmed that RBPMS regulates the expression of proteins involved in cell detoxification, RNA processing, and cytoskeleton network and cell integrity. Interrogation of the Kaplan-Meier (KM) plotter database identified multiple downstream-RBPMS effectors that could be used as prognostic and response-to-therapy biomarkers in ovarian cancer. These studies suggest that RBPMS acts as a tumor suppressor gene and that lower levels of RBPMS promote the cisplatin resistance of ovarian cancer cells.
|
17
|
A MACHINE LEARNING-BASED APPROACH TO EPILEPTIC SEIZURE PREDICTION USING ELECTRO-ENCEPHALOGRAPHIC SIGNALS. JOURNAL OF ENGINEERING RESEARCH 2022; 2:10.22533/at.ed.317282219056. [PMID: 35711293 PMCID: PMC9199360 DOI: 10.22533/at.ed.317282219056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The brain is made up of billions of neurons, which control all the actions we perform. In epilepsy, the ordered pattern of brain signals is altered, causing epileptiform discharges in an individual's brain. Approximately 1% of the world population has epilepsy and, therefore, there is a need for studies that can help in the diagnosis and treatment of this disorder. The objective of this work is to develop a machine learning-based approach to predict epileptic seizures using non-invasive electroencephalography (EEG). To this end, the classification of interictal and preictal states was performed using the CHB-MIT database. The algorithm was developed to predict epileptic seizures in multiple subjects using a patient-independent approach. The Discrete Wavelet Transform was used to decompose the EEG signals into 5 levels. Three features (Spectral Power, Mean, and Standard Deviation) were evaluated to determine which gave the best result, with a Support Vector Machine (SVM) as the classifier. The study achieved accuracies of 92.30%, 84.60%, and 76.92% for the Power, Standard Deviation, and Mean features, respectively.
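The feature-extraction step described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not name the mother wavelet, so the Haar wavelet is an assumption, "spectral power" is approximated here as the mean squared coefficient of each sub-band, and the input is a synthetic sine wave rather than CHB-MIT data. The resulting per-band features would then be passed to a classifier such as an SVM, which is not shown.

```python
import math
import statistics


def haar_dwt_step(signal):
    """One Haar DWT level: returns (approximation, detail) coefficients.

    Assumption: Haar wavelet. An odd trailing sample is dropped.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def wavedec(signal, levels=5):
    """Multi-level decomposition, ordered [cA_n, cD_n, ..., cD_1]."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs


def band_features(coeffs):
    """Per sub-band (power, mean, standard deviation).

    Power is taken as the mean squared coefficient (an illustrative proxy).
    """
    features = []
    for band in coeffs:
        power = sum(c * c for c in band) / len(band)
        features.append((power, statistics.fmean(band), statistics.pstdev(band)))
    return features


# Synthetic 256-sample signal standing in for one EEG window (hypothetical data).
signal = [math.sin(2 * math.pi * 8 * t / 256) for t in range(256)]
coeffs = wavedec(signal, levels=5)
features = band_features(coeffs)
```

A 5-level decomposition yields six sub-bands (one approximation and five detail bands), so each window contributes six values per feature type.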
|
18
|
A novel artificial intelligence-based approach for identification of deoxynucleotide aptamers. PLoS Comput Biol 2021; 17:e1009247. [PMID: 34343165 PMCID: PMC8362955 DOI: 10.1371/journal.pcbi.1009247] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/26/2020] [Revised: 08/13/2021] [Accepted: 07/05/2021] [Indexed: 02/07/2023] Open
Abstract
The selection of a DNA aptamer through the Systematic Evolution of Ligands by EXponential enrichment (SELEX) method involves multiple binding steps, in which a target and a library of randomized DNA sequences are mixed for selection of a single, nucleotide-specific molecule. Usually, 10 to 20 steps are required for SELEX to be completed. Throughout this process it is necessary to discriminate between true DNA aptamers and unspecified DNA-binding sequences. Thus, a novel machine learning-based approach was developed to support and simplify the early steps of the SELEX process by helping to discriminate DNA aptamers from unspecified DNA-binding sequences. An Artificial Intelligence (AI) approach to identify aptamers was implemented based on Natural Language Processing (NLP) and Machine Learning (ML). An NLP method (CountVectorizer) was used to extract information from the nucleotide sequences. Four ML algorithms (Logistic Regression, Decision Tree, Gaussian Naïve Bayes, Support Vector Machines) were trained using data from the NLP method along with sequence information. The best-performing model was Support Vector Machines because it had the best ability to discriminate between positive and negative classes. In our model, an Accuracy (A) of 0.995, the fraction of samples that the model correctly classified, and an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.998, the degree to which a model is capable of distinguishing between classes, were observed. The developed AI approach is useful for identifying potential DNA aptamers and reducing the number of rounds in a SELEX selection. This new approach could be applied in the design of DNA libraries and result in a more efficient and faster process for DNA aptamers to be chosen during SELEX.
In this manuscript, the authors explain the development and validation of a novel artificial intelligence approach to support and simplify the early steps of the SELEX process by helping to discriminate deoxynucleotide aptamers from unspecified DNA-binding sequences. The approach was implemented based on Natural Language Processing and Machine Learning. CountVectorizer, a Natural Language Processing method, was used to extract information from nucleotide sequences. Four Machine Learning algorithms (Logistic Regression, Decision Tree, Gaussian Naïve Bayes, and Support Vector Machines) were trained using data from the Natural Language Processing method along with sequence information. Of these four trained machine learning algorithms, the best-performing and selected model was Support Vector Machines, because it had the best discriminatory metrics (i.e., Accuracy (A) = 0.995; AUROC (AU) = 0.998). In general, all models showed good metric results for predicting DNA aptamer sequences. The complexity of the Machine Learning model and the difficulty of interpreting it may hinder its application in standard practice. For this reason, the development of a web app is already under way to facilitate the interpretation and application of the obtained results.
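The count-based sequence encoding mentioned above can be illustrated with a small sketch. It treats overlapping k-mers as the "words" of a nucleotide sequence and counts them, which is the general idea behind a count vectorizer. The k-mer length of 3, the fixed ACGT vocabulary, and the example sequence are assumptions for illustration; the abstract does not state how the sequences were tokenized.

```python
from collections import Counter
from itertools import product


def kmer_vocabulary(k=3, alphabet="ACGT"):
    """All possible k-mers over the alphabet, in a fixed order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


def count_vectorize(sequence, vocabulary, k=3):
    """Count overlapping k-mers in the sequence, one entry per vocabulary item."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical toy sequence, not taken from the study's data.
vocab = kmer_vocabulary(k=3)
vector = count_vectorize("ACGTACGT", vocab)
```

Each sequence becomes a fixed-length vector (64 entries for k = 3) that can be fed to any of the four classifiers named in the abstract. scikit-learn's `CountVectorizer` can be configured for character n-grams to produce a comparable representation.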
|
19
|
Abstract 2180: Somatic mutation landscape in early-onset colorectal cancer tumors from hispanics. Cancer Res 2021. [DOI: 10.1158/1538-7445.am2021-2180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Background: In the last 40 years, the incidence of colorectal cancer (CRC) among individuals <50 years (early-onset CRC) has been increasing at an alarming rate in the US, and is expected to increase by >140% by 2030. Early-onset CRC represents a clinically distinct form of CRC often associated with a poor prognosis. During 2012-2017, more than 11% of the CRC cases in the US and more than 9% of the total CRC cases in Puerto Rico corresponded to patients <50 years old. This highlights the imperative need to describe the genetic drivers of early-onset CRC in Hispanics in order to increase early diagnosis, improve personalized clinical management, and improve survival outcomes. The objective of this study was to characterize the somatic mutation profile of early-onset tumors from Puerto Ricans, a Hispanic subpopulation with a high CRC burden, in order to better understand early-onset CRC biology.
Methods: Whole exome sequencing analyses were performed using the HiSeq4000 System (Illumina) on concordant colorectal adenocarcinoma and colonic mucosa tissue samples from 58 individuals diagnosed with colorectal cancer at <50 years old (early-onset CRC) and 25 individuals diagnosed with CRC >60 years old (late-onset CRC). Somatic variant calling and annotation/visualization were performed using Strelka and Ingenuity Variant Analysis software, respectively. All participants were recruited by Puerto Rico Familial Colorectal Cancer Registry (PURIFICAR).
Results: Early-onset and late-onset CRC tumors showed distinct mutational profiles, only sharing TTN and MUC19 among their top 10 most mutated genes. The most frequently mutated genes in early-onset CRC tumors were: TTN (79%), MUC19 (75%), SYNE1 (68%), PCDHGA1 (68%), LRP1B (67%), PCDHGA2 (67%), PCDHGA3 (65%), PCDHGB1 (%), MUC16 (65%) and PCDHGA4 (65%). In late-onset CRC, the top 10 most frequently mutated genes were: CSMD1 (68%), APC (68%), SYNE1 (64%), TTN (60%), MUC19 (60%), MUC16 (60%), PCDHA1 (60%), PCDHB1 (%), PCDHG1 (60%) and PCDHG2 (60%).
Conclusion: This study presents the somatic mutation profile of early-onset colorectal tumors from Puerto Ricans, a Hispanic subgroup with noted CRC health disparities. Somatic mutational profiles were found to be distinct when comparing early-onset and late-onset colorectal tumors. The majority of the somatic mutations detected in the early-onset CRC tumors were in non-coding regions, suggesting that epigenetic regulation may contribute to early-onset colorectal carcinogenesis.
Citation Format: Maria Gonzalez-Pons, Ingrid Montes-Rodriguez, Kelvin Carrasquillo-Carrion, Abiel Roche-Lima, Sandeep Singhal, Anna M. Napoles, Jung S. Byun, Eliseo Perez-Stable, Kevin Gardner, Marcia Cruz-Correa. Somatic mutation landscape in early-onset colorectal cancer tumors from hispanics [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 2180.
|
20
|
DNA Methylation, Deamination, and Translesion Synthesis Combine to Generate Footprint Mutations in Cancer Driver Genes in B-Cell Derived Lymphomas and Other Cancers. Front Genet 2021; 12:671866. [PMID: 34093666 PMCID: PMC8170131 DOI: 10.3389/fgene.2021.671866] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/24/2021] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Cancer genomes harbor numerous genomic alterations and many cancers accumulate thousands of nucleotide sequence variations. A prominent fraction of these mutations arises as a consequence of the off-target activity of DNA/RNA editing cytosine deaminases followed by the replication/repair of edited sites by DNA polymerases (pol), as deduced from the analysis of the DNA sequence context of mutations in different tumor tissues. We have used the weight matrix (sequence profile) approach to analyze mutagenesis due to Activation Induced Deaminase (AID) and two error-prone DNA polymerases. Control experiments using shuffled weight matrices and somatic mutations in immunoglobulin genes confirmed the power of this method. Analysis of somatic mutations in various cancers suggested that AID and DNA polymerases η and θ contribute to mutagenesis in contexts that almost universally correlate with the context of mutations in A:T and G:C sites during the affinity maturation of immunoglobulin genes. Previously, we demonstrated that AID contributes to mutagenesis in (de)methylated genomic DNA in various cancers. Our current analysis of methylation data from malignant lymphomas suggests that driver genes are subject to different (de)methylation processes than non-driver genes and, in addition to AID, the activity of pols η and θ contributes to the establishment of methylation-dependent mutation profiles. This may reflect the functional importance of interplay between mutagenesis in cancer and (de)methylation processes in different groups of genes. The resulting changes in CpG methylation levels and chromatin modifications are likely to cause changes in the expression levels of driver genes that may affect cancer initiation and/or progression.
|
21
|
Toward the discovery of biological functions associated with the mechanosensor Mtl1p of Saccharomyces cerevisiae via integrative multi-OMICs analysis. Sci Rep 2021; 11:7411. [PMID: 33795741 PMCID: PMC8016984 DOI: 10.1038/s41598-021-86671-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/28/2020] [Accepted: 03/15/2021] [Indexed: 02/06/2023] Open
Abstract
Functional analysis of the Mtl1 protein in Saccharomyces cerevisiae has revealed that this transmembrane sensor endows yeast cells with resistance to oxidative stress through a signaling mechanism called the cell wall integrity pathway (CWI). We observed upregulation of multiple heat shock proteins (HSPs), proteins associated with the formation of stress granules, and the phosphatase subunit of trehalose 6-phosphate synthase which suggests that mtl1Δ strains undergo intrinsic activation of a non-lethal heat stress response. Furthermore, quantitative global proteomic analysis conducted on TMT-labeled proteins combined with metabolome analysis revealed that mtl1Δ strains exhibit decreased levels of metabolites of carboxylic acid metabolism, decreased expression of anabolic enzymes and increased expression of catabolic enzymes involved in the metabolism of amino acids, with enhanced expression of mitochondrial respirasome proteins. These observations support the idea that Mtl1 protein controls the suppression of a non-lethal heat stress response under normal conditions while it plays an important role in metabolic regulatory mechanisms linked to TORC1 signaling that are required to maintain cellular homeostasis and optimal mitochondrial function.
|
22
|
Machine-Learning-Based In-Hospital Mortality Prediction for Transcatheter Mitral Valve Repair in the United States. CARDIOVASCULAR REVASCULARIZATION MEDICINE 2021; 22:22-28. [PMID: 32591310 PMCID: PMC7736498 DOI: 10.1016/j.carrev.2020.06.017] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/13/2020] [Revised: 06/02/2020] [Accepted: 06/10/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND Transcatheter mitral valve repair (TMVR) utilization has increased significantly in the United States in recent years. Yet, a risk-prediction tool for adverse events has not been developed. We aimed to generate a machine-learning-based algorithm to predict in-hospital mortality after TMVR. METHODS Patients who underwent TMVR from 2012 through 2015 were identified using the National Inpatient Sample database. The study population was randomly divided into a training set (n = 636) and a testing set (n = 213). Prediction models for in-hospital mortality were obtained using five supervised machine-learning classifiers. RESULTS A total of 849 TMVRs were analyzed in our study. The overall in-hospital mortality was 3.1%. A naïve Bayes (NB) model had the best discrimination for fifteen variables, with an area under the receiver-operating curve (AUC) of 0.83 (95% CI, 0.80-0.87), compared to 0.77 for logistic regression (95% CI, 0.58-0.95), 0.73 for an artificial neural network (95% CI, 0.55-0.91), and 0.67 for both a random forest and a support-vector machine (95% CI, 0.47-0.87). History of coronary artery disease, history of chronic kidney disease, and smoking were the three most significant predictors of in-hospital mortality. CONCLUSIONS We developed a robust machine-learning-derived model to predict in-hospital mortality in patients undergoing TMVR. This model is promising for decision-making and deserves further clinical validation.
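For readers unfamiliar with the AUC values reported above, the metric can be computed directly from predicted risk scores as the probability that a randomly chosen positive case (here, an in-hospital death) is scored higher than a randomly chosen negative case, with ties counted as one half. The sketch below uses hypothetical labels and scores, not data from the study, and the function name `roc_auc` is illustrative.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = died in hospital) and model risk scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. With a low event rate such as the 3.1% mortality reported here, the test set contains few positive cases, which is consistent with the wide confidence intervals quoted for several of the models.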
|
23
|
Machine Learning Prediction Models for In-Hospital Mortality After Transcatheter Aortic Valve Replacement. JACC Cardiovasc Interv 2020; 12:1328-1338. [PMID: 31320027 DOI: 10.1016/j.jcin.2019.06.013] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 03/25/2019] [Revised: 05/31/2019] [Accepted: 06/04/2019] [Indexed: 01/23/2023]
Abstract
OBJECTIVES This study sought to develop and compare an array of machine learning methods to predict in-hospital mortality after transcatheter aortic valve replacement (TAVR) in the United States. BACKGROUND Existing risk prediction tools for in-hospital complications in patients undergoing TAVR have been designed using statistical modeling approaches and have certain limitations. METHODS Patient data were obtained from the National Inpatient Sample database from 2012 to 2015. The data were randomly divided into a development cohort (n = 7,615) and a validation cohort (n = 3,268). Logistic regression, artificial neural network, naive Bayes, and random forest machine learning algorithms were applied to obtain in-hospital mortality prediction models. RESULTS A total of 10,883 TAVRs were analyzed in our study. The overall in-hospital mortality was 3.6%. Overall, the prediction models' performance, measured by area under the curve, was good (>0.80). The best model was obtained by logistic regression (area under the curve: 0.92; 95% confidence interval: 0.89 to 0.95). Most obtained models plateaued after introducing 10 variables. Acute kidney injury was the main predictor of in-hospital mortality, ranked with the highest mean importance in all the models. The National Inpatient Sample TAVR score showed the best discrimination among available TAVR prediction scores. CONCLUSIONS Machine learning methods can generate robust models to predict in-hospital mortality for TAVR. The National Inpatient Sample TAVR score should be considered for prognosis and shared decision making in TAVR patients.
|
24
|
Abstract 2124: Hotspots of sequence variability in gut microbial metagenomic datasets and their association with colorectal cancer. Cancer Res 2020. [DOI: 10.1158/1538-7445.am2020-2124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The gut microbiome has been found to impact host predisposition to disease. Previous research shows that certain species of microbes, encoding pro-inflammatory factors, increase the probability of developing dysplasia and colorectal cancer (CRC). Recently, through the oligotyping approach, we found genetic variants for an amplified region of usp, a genotoxic protein made by E. coli in the gut. These different variants were present in healthy, adenoma and CRC clinical stool samples. To our surprise, the same positions of genetic variability found experimentally were confirmed by BLAST analyses of shotgun metagenomic data from a case-control study conducted in France and Washington D.C. In this work, using the same metagenomics data, we searched for hotspots of genetic variability in the sequences for five genes with known pro-inflammatory and toxicity profiles: usp, tcpC, gelE, clbB, and clbN. More importantly, we extended our analyses to cover the entire length of these genes using a command line approach to BLAST+. The sequence reads for both clbB and clbN lacked coverage, rendering their analysis unquantifiable. The output data for both tcpC and gelE showed high levels of homology between the sequences of the healthy and CRC patients. However, we found twenty-nine hotspots in the usp gene, thirteen of which cause a modification in the primary structure of the protein. Among these thirteen positions, nine were different in cases and controls suggesting that these hotspots of variation are associated with CRC. All these results provide insight into both the link between variability and pathogenicity, and the viability of shotgun sequencing for such types of analyses.
Citation Format: Rachell M. Martinez-Ramirez, Miguel M. Girod-Hoffman, Abiel Roche-Lima, Kelvin Carrasquillo-Carrión, Josué Pérez-Santiago, Abel Baerga-Ortiz. Hotspots of sequence variability in gut microbial metagenomic datasets and their association with colorectal cancer [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 2124.
|
25
|
Reply: Leveraging Machine Learning to Generate Prediction Models for Structural Valve Interventions. JACC Cardiovasc Interv 2020; 12:2113-2114. [PMID: 31648770 DOI: 10.1016/j.jcin.2019.09.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/29/2019] [Accepted: 09/04/2019] [Indexed: 10/25/2022]
|
26
|
Machine Learning Algorithm for Predicting Warfarin Dose in Caribbean Hispanics Using Pharmacogenetic Data. Front Pharmacol 2020; 10:1550. [PMID: 32038238 PMCID: PMC6987072 DOI: 10.3389/fphar.2019.01550] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/09/2019] [Accepted: 12/02/2019] [Indexed: 12/18/2022] Open
Abstract
Despite some previous examples of successful application to the field of pharmacogenomics, the utility of machine learning (ML) techniques for warfarin dose predictions in Caribbean Hispanic patients has yet to be fully evaluated. This study compares seven ML methods to predict warfarin dosing in Caribbean Hispanics. This is a secondary analysis of genetic and non-genetic clinical data from 190 cardiovascular Hispanic patients. Seven ML algorithms were applied to the data. The data were split 80%/20% into training and test sets. ML algorithms were trained with the training set to obtain the models. Model performance was determined by computing the corresponding mean absolute error (MAE) and the percentage of patients whose predicted optimal dose was within ±20% of the actual stabilization dose, and then compared between groups of patients with “normal” (i.e., > 21 but <49 mg/week), low (i.e., ≤21 mg/week, “sensitive”), and high (i.e., ≥49 mg/week, “resistant”) dose requirements. Random forest regression (RFR) significantly outperformed all other methods, with an MAE of 4.73 mg/week and 80.56% of cases within ±20% of the actual stabilization dose. Among those with “normal” dose requirements, RFR also performed better than the other models (MAE = 2.91 mg/week). In the “sensitive” group, support vector regression (SVR) was superior to the others, with a lower MAE of 4.79 mg/week. Finally, multivariate adaptive splines (MARS) showed the best performance in the “resistant” group (MAE = 7.22 mg/week, with 66.7% of predictions within ±20%). Models generated by using the RFR, MARS, and SVR algorithms showed significantly better predictions of weekly warfarin dosing in the studied cohorts than other algorithms. The ML models performed better for patients with “normal,” “sensitive,” and “resistant” dose requirements than those reported for other populations and previous statistical models.
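The two evaluation metrics and the three dose groups defined in this abstract can be written out as a short sketch. The thresholds (≤21, between 21 and 49, ≥49 mg/week) and the ±20% tolerance come from the abstract; the function names and the four example doses are hypothetical and are not study data.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted doses (mg/week)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def percent_within(actual, predicted, tolerance=0.20):
    """Percentage of predictions within +/- tolerance of the actual dose."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(p - a) <= tolerance * a)
    return 100.0 * hits / len(actual)


def dose_group(weekly_dose):
    """Dose-requirement group using the thresholds stated in the abstract."""
    if weekly_dose <= 21:
        return "sensitive"
    if weekly_dose >= 49:
        return "resistant"
    return "normal"


# Hypothetical stabilization doses and model predictions, in mg/week.
actual = [14.0, 35.0, 42.0, 56.0]
predicted = [17.5, 33.0, 45.0, 40.0]

mae = mean_absolute_error(actual, predicted)
within_20 = percent_within(actual, predicted)
groups = [dose_group(d) for d in actual]
```

Because the ±20% window scales with the actual dose, the same absolute error counts as a miss for a "sensitive" patient but may count as a hit for a "resistant" one, which is one reason the abstract reports results separately for each group.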
|
27
|
Hotspots of Sequence Variability in Gut Microbial Genes Encoding Pro-Inflammatory Factors Revealed by Oligotyping. Front Genet 2019; 10:631. [PMID: 31354787 PMCID: PMC6629961 DOI: 10.3389/fgene.2019.00631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2019] [Accepted: 06/17/2019] [Indexed: 11/21/2022] Open
Abstract
The gut microbiota has been implicated in a number of normal and disease biological processes. Recent studies have identified a subset of gut bacterial genes as potentially involved in inflammatory processes. In this work, we explore the sequence variability for some of these bacterial genes using a combination of deep sequencing and oligotyping, a data analysis application that identifies mutational hotspots in short stretches of DNA. The genes for pks island, tcpC and usp, all harbored by certain strains of E. coli and all implicated in inflammation, were amplified by PCR directly from stool samples and subjected to deep amplicon sequencing. For comparison, the same genes were amplified from individual bacterial clones. The amplicons for pks island and tcpC from stool samples showed minimal levels of heterogeneity comparable with the individual clones. The amplicons for usp from stool samples, by contrast, revealed the presence of five distinct oligotypes in two different regions. Of these, the oligotype GT was found to be present in the control uropathogenic clinical isolate and also detected in stool samples from individuals with colorectal cancer (CRC). Mutational hotspots were mapped onto the USP protein, revealing possible substitutions around Leu110, Glu114, and Arg115 in the middle of the pyocin domain (Gln110, Gln114, and Thr115 in most healthy samples), and also Arg218 in the middle of the nuclease domain (His218 in the uropathogenic strain). All of these results suggest that a level of variability within bacterial pro-inflammatory genes could explain differences in bacterial virulence and phenotype.
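Oligotyping, as generally described, ranks the positions of aligned reads by Shannon entropy and uses the most variable positions to define oligotypes. The sketch below shows that core idea on a toy alignment. The entropy threshold of 0.5 bits, the function names, and the four reads are assumptions for illustration and do not reproduce the study's pipeline or data.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the nucleotides observed at one position."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def variable_positions(aligned_reads, threshold=0.5):
    """Return (position, entropy) for columns at or above the entropy threshold.

    Assumes all reads are already aligned to the same length.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to equal length")
    result = []
    for i in range(length):
        entropy = column_entropy([read[i] for read in aligned_reads])
        if entropy >= threshold:
            result.append((i, entropy))
    return result


# Hypothetical aligned amplicon reads; only position 2 varies (G vs. T).
reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC"]
hotspots = variable_positions(reads)
```

Concatenating the bases found at the high-entropy positions of each read gives a short label, analogous to the two-letter oligotype "GT" mentioned in the abstract.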
|
28
|
Ethnogeographic prevalence and implications of the 677C>T and 1298A>C MTHFR polymorphisms in US primary care populations. Biomark Med 2019; 13:649-661. [PMID: 31157538 PMCID: PMC6630484 DOI: 10.2217/bmm-2018-0392] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/26/2018] [Accepted: 03/25/2019] [Indexed: 02/04/2023] Open
Abstract
Aim: Variants of the MTHFR gene have been associated with a wide range of diseases. Materials & methods: The present study analyzed data from clinical genotyping of MTHFR 677C>T and 1298A>C in 1405 patients in urban primary care settings. Results: Striking differences in ethnogeographic frequencies of MTHFR polymorphisms were observed. African-Americans appear to be protected from MTHFR deficiency. Hispanics and Caucasians may be at elevated risk due to increased frequencies of 677C>T and 1298A>C, respectively. Conclusion: Individuals carrying mutations at both polymorphic sites were rare and doubly homozygous mutants were absent, suggesting that the TTcc genotype is extremely rare in the greater population. The results suggest multilocus MTHFR genotyping may yield deeper insight into the ethnogeographic association between MTHFR variants and disease.
|
29
|
Assessment of Mitral Annular Plane Systolic Excursion in Patients With Left Ventricular Diastolic Dysfunction. Cardiol Res 2019; 10:83-88. [PMID: 31019637 PMCID: PMC6469911 DOI: 10.14740/cr837] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/23/2019] [Accepted: 02/25/2019] [Indexed: 02/02/2023] Open
Abstract
Background Mitral annular plane systolic excursion (MAPSE) is a well-known surrogate measurement of left ventricular ejection fraction (LVEF) and prognostic factor for many cardiac conditions. However, little is known about its role in assessing LV diastolic function; we therefore sought to identify potential determinants of MAPSE in patients with LV diastolic dysfunction (LVDD). Methods Our echocardiographic database was queried for studies of patients with normal sinus rhythm. Patients were allocated into three groups: LVDD 0, LVDD 1 and LVDD 2 in accordance with LVDD stages recommended by the American Society of Echocardiography guidelines. Results A total of 107 echocardiographic studies were included in the study. The mean MAPSE was 1.22 ± 0.32 cm. Groups LVDD 0 (n = 23), LVDD 1 (n = 43), and LVDD 2 (n = 41) were significantly different in most of the studied variables. In particular, MAPSE differed between the three groups (P < 0.01). A multiple regression analysis showed that age, LVEF and LV mass index, rather than LVDD and left atrial measurements, were predictors of MAPSE. Finally, a regression model was constructed to predict MAPSE in the studied group, showing that age and LVEF explained 46% of the MAPSE variation. A two-way contour plot was produced to ease interpretation of the model. Conclusions Age and measures of LV systolic function correlated well with MAPSE. A simplified model to predict MAPSE based on age and LVEF is proposed. Additional studies are needed to examine the potential role of MAPSE in identifying symptoms and overall prognosis in LVDD patients.
|
30
|
Racial/Ethnic Disparities in Patients Undergoing Transcatheter Aortic Valve Replacement: Insights from the Healthcare Cost and Utilization Project's National Inpatient Sample. CARDIOVASCULAR REVASCULARIZATION MEDICINE 2019; 20:546-552. [PMID: 30987828 DOI: 10.1016/j.carrev.2019.04.005] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 03/09/2019] [Revised: 04/03/2019] [Accepted: 04/04/2019] [Indexed: 11/26/2022]
Abstract
PURPOSE To identify racial/ethnic disparities in utilization rates, in-hospital outcomes and health care resource use among Non-Hispanic Whites (NHW), African Americans (AA) and Hispanics undergoing transcatheter aortic valve replacement (TAVR) in the United States (US). METHODS AND RESULTS The National Inpatient Sample database was queried for patients ≥18 years of age who underwent TAVR from 2012 to 2014. The primary outcome was all-cause in-hospital mortality. A total of 36,270 individuals were included in the study. The number of TAVR procedures performed per million population increased in all study groups over the three years [38.8 to 103.8 (NHW); 9.1 to 26.4 (AA) and 9.4 to 18.2 (Hispanics)]. The overall in-hospital mortality was 4.2% for the entire cohort. Race/ethnicity showed no association with in-hospital mortality (P > .05). Though no significant differences were found between AA and NHW in any secondary outcome, being Hispanic was associated with a higher incidence of acute myocardial infarction (aOR = 2.02; 95% CI, 1.06-3.85; P = .03), stroke/transient ischemic attack (aOR = 1.81; 95% CI, 1.04-3.14; P = .04), acute kidney injury (aOR = 1.65; 95% CI, 1.23-2.21; P < .01), prolonged length of stay (aOR = 1.18; 95% CI, 1.08-1.29; P < .01) and higher hospital costs (aOR = 1.27; 95% CI, 1.18-1.36; P < .01). CONCLUSION There are significant racial disparities in patients undergoing TAVR in the US. Though in-hospital mortality was not associated with race/ethnicity, Hispanic patients had lower TAVR utilization, more in-hospital complications, prolonged length of stay and increased hospital costs.
|
31
|
MACHINE LEARNING PREDICTION MODELS FOR IN-HOSPITAL MORTALITY AFTER TRANSCATHETER AORTIC VALVE IMPLANTATION AMONG RACIAL/ETHNIC MINORITY POPULATIONS IN THE UNITED STATES. J Am Coll Cardiol 2019. [DOI: 10.1016/s0735-1097(19)31785-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
|
32
|
Nucleotide Weight Matrices Reveal Ubiquitous Mutational Footprints of AID/APOBEC Deaminases in Human Cancer Genomes. Cancers (Basel) 2019; 11:cancers11020211. [PMID: 30759888 PMCID: PMC6406962 DOI: 10.3390/cancers11020211] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/11/2019] [Revised: 01/30/2019] [Accepted: 01/30/2019] [Indexed: 01/08/2023] Open
Abstract
Cancer genomes accumulate nucleotide sequence variations that number in the tens of thousands per genome. A prominent fraction of these mutations is thought to arise as a consequence of the off-target activity of DNA/RNA editing cytosine deaminases. These enzymes, collectively called activation induced deaminase (AID)/APOBECs, deaminate cytosines located within defined DNA sequence contexts. The resulting changes of the original C:G pair in these contexts (mutational signatures) provide indirect evidence for the participation of specific cytosine deaminases in a given cancer type. The conventional method used for the analysis of mutable motifs is the consensus approach. Here, for the first time, we have adopted the frequently used weight matrix (sequence profile) approach for the analysis of mutagenesis and provide evidence for this method being a more precise descriptor of mutations than the sequence consensus approach. We confirm that while mutational footprints of APOBEC1, APOBEC3A, APOBEC3B, and APOBEC3G are prominent in many cancers, mutable motifs characteristic of the action of the humoral immune response somatic hypermutation enzyme, AID, are the most widespread feature of somatic mutation spectra attributable to deaminases in cancer genomes. Overall, the weight matrix approach reveals that somatic mutations are significantly associated with at least one AID/APOBEC mutable motif in all studied cancers.
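The difference between the consensus approach and the weight matrix (sequence profile) approach contrasted in this abstract can be illustrated with a toy example. A consensus gives a yes/no match, while a weight matrix gives a graded log-odds score for any site. The WRC consensus (W = A/T, R = A/G) is a commonly cited AID motif and is used here only as an illustration; the training sites, pseudocount, and uniform background are hypothetical choices, not the matrices used in the paper.

```python
import math

BASES = "ACGT"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT", "R": "AG"}


def matches_consensus(site, consensus):
    """Binary test: does every base fall within the IUPAC code at its position?"""
    return all(base in IUPAC[code] for base, code in zip(site, consensus))


def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weight matrix built from aligned example sites."""
    matrix = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return matrix


def score(matrix, site):
    """Sum of per-position log-odds weights for a candidate site."""
    return sum(column[base] for column, base in zip(matrix, site))


# Hypothetical mutated-site contexts (mutated C in the last position).
sites = ["AGC", "TGC", "AAC", "TAC", "AGC", "GGC"]
pwm = build_weight_matrix(sites)
```

In this toy setup, "GGC" fails the WRC consensus outright but still receives a positive weight matrix score, lower than that of "AGC". This graded behavior is the kind of added precision the abstract attributes to the weight matrix approach.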
|
33
|
Modelling and molecular docking studies of the cytoplasmic domain of Wsc-family, full-length Ras2p, and therapeutic antifungal compounds. Comput Biol Chem 2019; 78:338-352. [PMID: 30654316 DOI: 10.1016/j.compbiolchem.2019.01.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/09/2018] [Revised: 12/29/2018] [Accepted: 01/02/2019] [Indexed: 12/28/2022]
Abstract
Saccharomyces cerevisiae, the budding yeast, must remodel initial cell shape and cell wall integrity during vegetative growth and pheromone-induced morphogenesis. The cell wall remodeling is monitored and regulated by the cell wall integrity (CWI) signaling pathway. Wsc1p, together with Wsc2p and Wsc3p, belongs to a family of highly O-glycosylated cell surface proteins that function as stress sensors of the cell wall in S. cerevisiae. These cell surface proteins have the main role of activating the CWI signaling pathway by stimulating the small G-protein Rho1p, which subsequently activates protein kinase C (Pkc1p) and a mitogen activated protein (MAP) kinase cascade that activates downstream transcription factors of stress-response genes. Wsc1p, Wsc2p, and Wsc3p possess a cytoplasmic domain where two conserved regions of the sequence have been assessed to be important for Rom2p interaction. Meanwhile, other research groups have also proposed that these transmembrane proteins could support protein-protein interactions with Ras2p. Molecular structures of the cytoplasmic domain of Wsc1p, Wsc2p and Wsc3p were generated using the standard and fully-automated ORCHESTAR procedures provided by the Sybyl-X 2.1.1 program. The tridimensional structure of full length Ras2p was also generated with Phyre2. These protein models were validated with Procheck-PDBsum and ProSA-web tools and subsequently used in docking-based modeling of protein-protein and protein-compound interfaces for extensive structural and functional characterization of their interaction. The results retrieved from STRING 10.5 suggest that the Wsc-family is involved in protein-protein interactions with each other and with Ras2p. Docking-based studies also validated the existence of protein-protein interactions mainly between Motif I (Wsc3p > Wsc1p > Wsc2p) and Ras2p, in agreement with the data provided by STRING 10.5. 
Additionally, Calcofluor White was shown to bind preferentially to Wsc1p (-9.5 kcal/mol), whereas Caspofungin binds to Wsc3p (-9.1 kcal/mol), Wsc1p (-9.1 kcal/mol) and, more weakly, Wsc2p (-6.9 kcal/mol). Thus, these data suggest Caspofungin as a common inhibitor for the Wsc-family. The MTiOpenScreen database provided a list of new compounds with energy scores higher than those of the compounds used in our docking studies, suggesting that these new compounds have a better affinity towards the cytoplasmic domains and Ras2p. Based on these data, there are new and possibly more effective compounds that should be considered as therapeutic agents against yeast infection.
|
34
|
The Characterization of Monoclonal Antibodies to Mouse TLT-1 Suggests That TLT-1 Plays a Role in Wound Healing. Monoclon Antib Immunodiagn Immunother 2018; 37:78-86. [PMID: 29708866 DOI: 10.1089/mab.2017.0063] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
Platelets play a vital role in hemostasis and inflammation. The membrane receptor TREM-like transcript-1 (TLT-1) is involved in platelet aggregation, bleeding, and inflammation, and it is localized in the α-granules of platelets. Upon platelet activation, TLT-1 is released from α-granules both in its transmembrane form and as a soluble fragment (sTLT-1). Higher levels of sTLT-1 have been detected in the plasma of patients with acute inflammation or sepsis, suggesting an important role for TLT-1 during inflammation. However, the roles of TLT-1 in hemostasis and inflammation are not well understood. We are developing the mouse model of TLT-1 to mechanistically test clinical associations of TLT-1 in health and disease. To facilitate our studies, monoclonal murine TLT-1 (mTLT-1) antibodies were produced by the immunization of a rabbit using the negatively charged region of the mTLT-1 extracellular domain 122PPVPGPREGEEAEDEK139. In the present study, we demonstrate that two selected clones, 4.6 and 4.8, are suitable for the detection of mTLT-1 by western blot, immunoprecipitation, immunofluorescent staining, and flow cytometry, and that they inhibit platelet aggregation in aggregometry assays. In addition, we found that the topical administration of clone 4.8 delayed the wound healing process in an experimental burn model. These results suggest that TLT-1 plays an important role in wound healing and, because both clones specifically detect mTLT-1, they are suitable for further developing TLT-1-based models of inflammation and hemostasis in vivo.
|
35
|
TCT-173 Racial Disparities in Patients Undergoing Transcatheter Aortic Valve Replacement: Insights of the Healthcare Cost and Utilization Project's National Inpatient Sample. J Am Coll Cardiol 2018. [DOI: 10.1016/j.jacc.2018.08.1286] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
|
36
|
The Presence of Genotoxic and/or Pro-inflammatory Bacterial Genes in Gut Metagenomic Databases and Their Possible Link With Inflammatory Bowel Diseases. Front Genet 2018; 9:116. [PMID: 29692798 PMCID: PMC5902703 DOI: 10.3389/fgene.2018.00116] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/06/2017] [Accepted: 03/22/2018] [Indexed: 01/19/2023] Open
Abstract
Background: The human gut microbiota is a dynamic community of microorganisms that mediate important biochemical processes. Differences in the gut microbial composition have been associated with inflammatory bowel diseases (IBD) and other intestinal disorders. In this study, we quantified and compared the frequencies of eight genotoxic and/or pro-inflammatory bacterial genes found in metagenomic Whole Genome Sequences (mWGSs) of samples from individuals with IBD vs. a cohort of healthy human subjects. Methods: The eight selected gene sequences were clbN, clbB, cif, cnf-1, usp, tcpC from Escherichia coli, gelE from Enterococcus faecalis and murB from Akkermansia muciniphila. We also included the sequences for the conserved murB genes from E. coli and E. faecalis as markers for the presence of Enterobacteriaceae or Enterococci in the samples. The gene sequences were chosen based on their previously reported ability to disrupt normal cellular processes to either promote inflammation or to cause DNA damage in cultured cells or animal models, which could be linked to a role in IBD. The selected sequences were searched in three different mWGS datasets accessed through the Human Microbiome Project (HMP): a healthy cohort (N = 251), a Crohn's disease cohort (N = 60) and an ulcerative colitis cohort (N = 17). Results: Firstly, the sequences for the murB housekeeping genes from Enterobacteriaceae and Enterococci were more frequently found in the IBD cohorts (32% E. coli in IBD vs. 12% in healthy; 13% E. faecalis in IBD vs. 3% in healthy) than in the healthy cohort, confirming earlier reports of a higher presence of both of these taxa in IBD. For some of the sequences in our study, especially usp and gelE, their frequency was even more sharply increased in the IBD cohorts than in the healthy cohort, suggesting an association with IBD that is not easily explained by the increased presence of E. coli or E. faecalis in those samples. 
Conclusion: Our results suggest a significant association between the presence of some of these genotoxic or pro-inflammatory gene sequences and IBDs. In addition, these results illustrate the power and limitations of the HMP database in the detection of possible clinical correlations for individual bacterial genes.
|
37
|
Learning stochastic finite-state transducer to predict individual patient outcomes. HEALTH AND TECHNOLOGY 2016; 6:239-245. [PMID: 27942425 PMCID: PMC5124435 DOI: 10.1007/s12553-016-0146-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/30/2016] [Accepted: 10/05/2016] [Indexed: 11/30/2022]
Abstract
High-frequency data in intensive care units are flashed on a screen for a few seconds and never used again. However, these data can be used by machine learning and data mining techniques to predict patient outcomes. Learning finite-state transducers (FSTs) have been widely used in problems where sequences need to be manipulated and insertions, deletions and substitutions need to be modeled. In this paper, we learned the edit distance costs of a symbolic univariate time series representation through a stochastic finite-state transducer to predict patient outcomes in intensive care units. The Nearest-Neighbor method with these learned costs was used to classify the patient status within an hour after 10 h of data. Several experiments were conducted to estimate the parameters that best fit the model with respect to the prediction metrics. Our best results are compared with published work; most of the metrics (i.e., Accuracy, Precision and F-measure) were improved.
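The combination of edit distance costs and nearest-neighbor classification described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: in the paper the costs are learned through a stochastic finite-state transducer, whereas here the cost functions, the symbol alphabet, and the labelled sequences are hypothetical fixed values chosen for the example.

```python
def weighted_edit_distance(a, b, ins_cost, del_cost, sub_cost):
    """Dynamic-programming edit distance with per-symbol operation costs."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):          # delete every symbol of a
        d[i][0] = d[i - 1][0] + del_cost(a[i - 1])
    for j in range(1, m + 1):          # insert every symbol of b
        d[0][j] = d[0][j - 1] + ins_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + del_cost(a[i - 1]),
                d[i][j - 1] + ins_cost(b[j - 1]),
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),
            )
    return d[n][m]

def nearest_neighbor(query, labelled, distance):
    """Return the label of the training sequence closest to the query."""
    return min(labelled, key=lambda item: distance(query, item[0]))[1]

# Hypothetical costs: substituting nearby symbols is cheaper than distant ones.
def ins(_symbol): return 1.0
def dele(_symbol): return 1.0
def sub(x, y): return 0.0 if x == y else 0.5 * abs(ord(x) - ord(y))

def dist(s, t):
    return weighted_edit_distance(s, t, ins, dele, sub)

# Hypothetical symbolic time series with outcome labels (not data from the paper).
training = [("aabbc", "stable"), ("ccddd", "critical")]
prediction = nearest_neighbor("aabcc", training, dist)
```

Passing the costs as functions keeps the distance routine unchanged if the fixed values are later swapped for learned ones.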
|
38
|
Metabolic network prediction through pairwise rational kernels. BMC Bioinformatics 2014; 15:318. [PMID: 25260372 PMCID: PMC4261252 DOI: 10.1186/1471-2105-15-318] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/11/2014] [Accepted: 09/23/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Metabolic networks are represented by the set of metabolic pathways. Metabolic pathways are a series of biochemical reactions, in which the product (output) from one reaction serves as the substrate (input) to another reaction. Many pathways remain incompletely characterized. One of the major challenges of computational biology is to obtain better models of metabolic pathways. Existing models are dependent on the annotation of the genes. This propagates error accumulation when the pathways are predicted by incorrectly annotated genes. Pairwise classification methods are supervised learning methods used to classify new pairs of entities. Some of these classification methods, e.g., Pairwise Support Vector Machines (SVMs), use pairwise kernels. Pairwise kernels describe similarity measures between two pairs of entities. Using pairwise kernels to handle sequence data requires long processing times and large storage. Rational kernels are kernels based on weighted finite-state transducers that represent similarity measures between sequences or automata. They have been effectively used in problems that handle large amounts of sequence information, such as protein essentiality, natural language processing and machine translation. RESULTS We create a new family of pairwise kernels using weighted finite-state transducers (called Pairwise Rational Kernel (PRK)) to predict metabolic pathways from a variety of biological data. PRKs take advantage of the simpler representations and faster algorithms of transducers. Because raw sequence data can be used, the predictor model avoids the errors introduced by incorrect gene annotations. We then developed several experiments with PRKs and Pairwise SVM to validate our methods using the metabolic network of Saccharomyces cerevisiae. As a result, when PRKs are used, our method executes faster in comparison with other pairwise kernels. 
Also, when we use PRKs combined with other simple kernels that include evolutionary information, the accuracy values are improved, while maintaining lower construction and execution times. CONCLUSIONS The power of using kernels is that almost any sort of data can be represented using kernels. Therefore, completely disparate types of data can be combined to add power to kernel-based machine learning methods. When we compared our proposal using PRKs with other similar kernels, the execution times were decreased, with no compromise of accuracy. We also showed that by combining PRKs with other kernels that include evolutionary information, the accuracy can also be improved. As our proposal can use any type of sequence data, genes do not need to be properly annotated, avoiding accumulation errors because of incorrect previous annotations.
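To make the notion of a pairwise kernel built on a sequence kernel concrete, here is a minimal sketch under stated assumptions. The base kernel is a plain n-gram count kernel, used as a simple stand-in for a rational kernel, and the pairwise combination is the common symmetric tensor-product form; the abstract does not say which constructions the authors used, so both choices, along with the function names and example sequences, are assumptions.

```python
from collections import Counter

def ngram_kernel(x, y, n=3):
    """Count-based n-gram kernel: sum over shared n-grams of count products."""
    cx = Counter(x[i:i + n] for i in range(len(x) - n + 1))
    cy = Counter(y[i:i + n] for i in range(len(y) - n + 1))
    return sum(count * cy[gram] for gram, count in cx.items())

def pairwise_kernel(pair1, pair2, base=ngram_kernel):
    """Symmetric tensor-product pairwise kernel between two pairs of sequences.

    K((a, b), (c, d)) = k(a, c) * k(b, d) + k(a, d) * k(b, c),
    so the value does not depend on the order of the items within a pair.
    """
    (a, b), (c, d) = pair1, pair2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)

# Hypothetical raw sequences (not data from the paper).
p1 = ("ATGGCA", "GGCATT")
p2 = ("ATGGCC", "TGGCAT")
similarity = pairwise_kernel(p1, p2)
```

Because the kernel works directly on raw sequences, no gene annotation is needed to compute it, which mirrors the motivation given in the abstract.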
|
39
|
BioPCD - A Language for GUI Development Requiring a Minimal Skill Set. INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS 2012; 57:9-16. [PMID: 27818582 PMCID: PMC5096648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
BioPCD is a new language whose purpose is to simplify the creation of Graphical User Interfaces (GUIs) by biologists with minimal programming skills. The first step in developing BioPCD was to create a minimal superset of the language referred to as PCD (Pythonesque Command Description). PCD defines the core of terminals and high-level nonterminals required to describe data of almost any type. BioPCD adds to PCD the constructs necessary to describe GUI components and the syntax for executing system commands. BioPCD is implemented using JavaCC to convert the grammar into code. BioPCD is designed to be terse, readable, and simple enough to be learned by copying and modifying existing BioPCD files. We demonstrate that BioPCD can easily be used to generate GUIs for existing command line programs. Although BioPCD was designed to make it easier to run bioinformatics programs, it could be used in any domain in which many useful command line programs exist that do not have GUIs.
|
40
|
Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers. JOURNAL OF PHYSICS. CONFERENCE SERIES 2012; 341:012034. [PMID: 27795731 PMCID: PMC5082745 DOI: 10.1088/1742-6596/341/1/012034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Finite automata in which each transition is augmented with an output label, in addition to the familiar input label, are called finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignments of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation. This probability is calculated using techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the database sizes are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were run using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). The results show that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experiment varies the precision parameter; in this case, we obtain smaller execution times using the parallel algorithm. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster is varied. In this last experiment, speedup increases considerably when more threads are used; however, it levels off once the number of threads reaches 16 or more.
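The definition of a finite-state transducer that opens this abstract can be illustrated with a minimal sketch. It is not the conditional or weighted transducer learned in the paper: it is an unweighted, deterministic toy whose states, transition table, and function name are assumptions made for the example.

```python
def transduce(transitions, start, finals, inputs):
    """Run a deterministic finite-state transducer over an input string.

    transitions maps (state, input_symbol) -> (output_symbol, next_state).
    Returns the output string, or None if the input is rejected.
    """
    state = start
    outputs = []
    for symbol in inputs:
        key = (state, symbol)
        if key not in transitions:
            return None                 # no transition for this symbol
        output, state = transitions[key]
        outputs.append(output)
    return "".join(outputs) if state in finals else None

# Hypothetical one-state transducer that rewrites DNA symbols as RNA symbols.
dna_to_rna = {
    ("q0", "A"): ("A", "q0"),
    ("q0", "C"): ("C", "q0"),
    ("q0", "G"): ("G", "q0"),
    ("q0", "T"): ("U", "q0"),
}

rna = transduce(dna_to_rna, "q0", {"q0"}, "GATTACA")
```

A weighted or conditional transducer would attach a probability or cost to each transition as well, which is what the learning techniques named in the abstract estimate.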
|