1
Zhao Z, Guo Y, Chowdhury T, Anjum S, Li J, Huang L, Cupp-Sutton KA, Burgett A, Shi D, Wu S. Top-Down Proteomics Analysis of Picogram-Level Complex Samples Using Spray-Capillary-Based Capillary Electrophoresis-Mass Spectrometry. Anal Chem 2024; 96:8763-8771. [PMID: 38722793 DOI: 10.1021/acs.analchem.4c01119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Proteomics analysis of mass-limited samples has become increasingly important for understanding biological systems in physiologically relevant contexts such as patient samples, multicellular organoids, spheroids, and single cells. However, relatively low sensitivity in top-down proteomics methods makes their application to mass-limited samples challenging. Capillary electrophoresis (CE) has emerged as an ideal separation method for mass-limited samples due to its high separation resolution, ultralow detection limit, and minimal sample volume requirements. Recently, we developed "spray-capillary", an electrospray ionization (ESI)-assisted device, that is capable of quantitative ultralow-volume sampling (e.g., pL-nL level). Here, we developed a spray-capillary-CE-MS platform for ultrasensitive top-down proteomics analysis of intact proteins in mass-limited complex biological samples. Specifically, to improve the sensitivity of the spray-capillary platform, we incorporated a polyethylenimine (PEI)-coated capillary and optimized the spray-capillary inner diameter. Under optimized conditions, we successfully detected over 200 proteoforms from 50 pg of E. coli lysate. To our knowledge, the spray-capillary CE-MS platform developed here represents one of the most sensitive detection methods for top-down proteomics. Furthermore, in a proof-of-principle experiment, we detected 261 ± 65 and 174 ± 45 intact proteoforms from fewer than 50 HeLa and OVCAR-8 cells, respectively, by coupling nanodroplet-based sample preparation with our optimized CE-MS platform. Overall, our results demonstrate the capability of the modified spray-capillary CE-MS platform to perform top-down proteomics analysis on picogram amounts of samples. This advancement presents the possibility of meaningful top-down proteomics analysis of mass-limited samples down to the level of single mammalian cells.
Affiliation(s)
- Zhitao Zhao
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Yanting Guo
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Trishika Chowdhury
  - Department of Chemistry and Biochemistry, University of Alabama, 250 Hackberry Ln, Tuscaloosa, Alabama 35487, United States
- Samin Anjum
  - Department of Chemistry and Biochemistry, University of Alabama, 250 Hackberry Ln, Tuscaloosa, Alabama 35487, United States
- Jiaxue Li
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Lushuang Huang
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
- Kellye A Cupp-Sutton
  - Department of Chemistry and Biochemistry, University of Alabama, 250 Hackberry Ln, Tuscaloosa, Alabama 35487, United States
- Anthony Burgett
  - Department of Pharmaceutical Sciences, University of Oklahoma Health Sciences, 1110 N. Stonewall Ave., Oklahoma City, Oklahoma 73117, United States
- Dingjing Shi
  - Department of Psychology, University of Oklahoma, 455 W Lindsey Street, Norman, Oklahoma 73069, United States
- Si Wu
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Parkway, Norman, Oklahoma 73019, United States
  - Department of Chemistry and Biochemistry, University of Alabama, 250 Hackberry Ln, Tuscaloosa, Alabama 35487, United States
2
Zhao Y, Gong P. Optimal site selection strategies for urban parks green spaces under the joint perspective of spatial equity and social equity. Front Public Health 2024; 12:1310340. [PMID: 38638465 PMCID: PMC11024374 DOI: 10.3389/fpubh.2024.1310340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 03/22/2024] [Indexed: 04/20/2024] Open
Abstract
Urban park green spaces (UPGS) are a crucial element of social public resources closely related to the health and well-being of urban residents, and issues of equity have always been a focal point of concern. This study takes the downtown area of Nanchang as an example and uses more accurate point of interest (POI) and area of interest (AOI) data as analysis sources. The improved Gaussian two-step floating catchment area (G2SFCA) and spatial autocorrelation models are then used to assess the spatial and social equity in the study area, and the results of the two assessments are coupled to determine the optimization objective using the community as the smallest unit. Finally, the assessment results are combined with the k-means algorithm and the particle swarm optimization (PSO) algorithm to propose practical optimization strategies with the objectives of minimum walking distance and maximum fairness. The results indicate the following: (1) There are significant differences in UPGS accessibility among residents with different walking distances, with the more densely populated Old Town and Honggu Tan areas having lower average accessibility and being the main areas of hidden blindness, while the fringe areas in the northern and south-western parts of the city are the main areas of visible blindness. (2) Overall, the UPGS accessibility in Nanchang City exhibits a spatial pattern of decreasing from the east, south, and west to the center. Nanchang City is in transition towards improving spatial and social equity while achieving basic regional equity. (3) There is a spatial positive correlation between socioeconomic level and UPGS accessibility, reflecting certain social inequity. (4) Based on the above research results, the UPGS layout optimization scheme was proposed, 29 new UPGS locations and regions were identified, and the overall accessibility was improved by 2.76. The research methodology and framework can be used as a tool to identify the underserved areas of UPGS and optimize the spatial and social equity of UPGS, which is in line with the current trend of urban development in the world and provides a scientific basis for urban infrastructure planning and spatial resource allocation.
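The two-step floating catchment idea named in this abstract can be sketched in a few lines. This is a minimal illustration of the commonly used Gaussian 2SFCA form, not the authors' improved variant: the function names, the toy inputs, and the choice of decay function are assumptions made for illustration only.

```python
import math

def gaussian_weight(d, d0):
    """Gaussian distance-decay weight (assumed form); zero beyond radius d0."""
    if d > d0:
        return 0.0
    return (math.exp(-0.5 * (d / d0) ** 2) - math.exp(-0.5)) / (1.0 - math.exp(-0.5))

def g2sfca(supply, demand, dist, d0):
    """Hypothetical helper.

    supply[j]: capacity of park j (e.g. area)
    demand[i]: population of community i
    dist[i][j]: travel distance from community i to park j
    d0: catchment radius, in the same unit as dist
    """
    # Step 1: supply-to-demand ratio of each park over the weighted
    # population that falls inside its catchment.
    ratios = []
    for j, s in enumerate(supply):
        weighted_pop = sum(
            gaussian_weight(dist[i][j], d0) * p for i, p in enumerate(demand)
        )
        ratios.append(s / weighted_pop if weighted_pop > 0 else 0.0)
    # Step 2: accessibility of each community is the weighted sum of the
    # ratios of the parks it can reach.
    return [
        sum(gaussian_weight(dist[i][j], d0) * r for j, r in enumerate(ratios))
        for i in range(len(demand))
    ]

# Toy example: one park, one community next to it, one out of range.
access = g2sfca(
    supply=[10.0], demand=[100.0, 100.0], dist=[[0.0], [2000.0]], d0=1000.0
)
```

In the toy example the second community lies outside the catchment radius, so it gets an accessibility of zero; areas of that kind are what an assessment like this would flag as underserved.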
Affiliation(s)
- Peng Gong
  - College of Gardening and Arts, Jiangxi Agricultural University, Nanchang, China
3
Claréus B, Daukantaité D. Off track or on? Associations of positive and negative life events with the continuation versus cessation of repetitive adolescent nonsuicidal self-injury. J Clin Psychol 2023; 79:2459-2477. [PMID: 37178314 DOI: 10.1002/jclp.23533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 02/13/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]
Abstract
OBJECTIVE This study examined how patterns of repetitive (≥5 instances) nonsuicidal self-injury (NSSI) associate with measures of resilience and life events retrospectively reported to have occurred within the last year, 1 to <5 years ago, and 5 to <10 years ago. METHOD Life events reported by 557 young adults (mean [SD] age 25.3 [0.68]; 59.2% women) were classified as positive, negative, or profoundly negative based on their relationship to participants' mental health and well-being. We subsequently examined how these categories, together with resilience, were cross-sectionally associated with reporting no NSSI, and the (full/partial) cessation/continuation of repetitive NSSI from adolescence to young adulthood. RESULTS Repetitive NSSI in adolescence was associated with (profoundly) negative life events. Relative to cessation, NSSI continuation was significantly associated with more kinds of negative life events (odds ratio [OR] = 1.79) and fewer kinds of positive life events 1 to <5 years ago (OR = 0.65) and tended to be associated with lower resilience (b = -0.63, p = 0.056). Neither life events nor resilience significantly differentiated individuals reporting full or partial cessation. CONCLUSION Resilience appears important for the cessation of repetitive NSSI, but contextual factors must still be considered. Assessing positive life events in future studies holds promise.
4
Yiakoumetti A, Hanko EKR, Zou Y, Chua J, Chromy J, Stoney RA, Valdehuesa KNG, Connolly JA, Yan C, Hollywood KA, Takano E, Breitling R. Expanding flavone and flavonol production capabilities in Escherichia coli. Front Bioeng Biotechnol 2023; 11:1275651. [PMID: 37920246 PMCID: PMC10619664 DOI: 10.3389/fbioe.2023.1275651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Accepted: 10/04/2023] [Indexed: 11/04/2023] Open
Abstract
Flavones and flavonols are important classes of flavonoids with nutraceutical and pharmacological value, and their production by fermentation with recombinant microorganisms promises to be a scalable and economically favorable alternative to extraction from plant sources. Flavones and flavonols have been produced recombinantly in a number of microorganisms, with Saccharomyces cerevisiae typically being a preferred production host for these compounds due to higher yields and titers of precursor compounds, as well as generally improved ability to functionally express cytochrome P450 enzymes without requiring modification to improve their solubility. Recently, a rapid prototyping platform has been developed for high-value compounds in E. coli, and a number of gatekeeper (2S)-flavanones, from which flavones and flavonols can be derived, have been produced to high titers in E. coli using this platform. In this study, we extended these metabolic pathways using the previously reported platform to produce apigenin, chrysin, luteolin and kaempferol from the gatekeeper flavonoids naringenin, pinocembrin and eriodictyol by the expression of either type-I flavone synthases (FNS-I) or type-II flavone synthases (FNS-II) for flavone biosynthesis, and by the expression of flavanone 3-dioxygenases (F3H) and flavonol synthases (FLS) for the production of the flavonol kaempferol. In our best-performing strains, titers of apigenin and kaempferol reached 128 mg L⁻¹ and 151 mg L⁻¹ in 96-DeepWell plates in cultures supplemented with an additional 3 mM tyrosine, though titers for chrysin (6.8 mg L⁻¹) from phenylalanine, and luteolin (5.0 mg L⁻¹) from caffeic acid were considerably lower. In strains with upregulated tyrosine production, apigenin and kaempferol titers reached 80.2 mg L⁻¹ and 42.4 mg L⁻¹ respectively, without the further supplementation of tyrosine beyond the amount present in the rich medium. Notably, the highest apigenin, chrysin and luteolin titers were achieved with FNS-II enzymes, suggesting that cytochrome P450s can show competitive performance compared with non-cytochrome P450 enzymes in prokaryotes for the production of flavones.
Affiliation(s)
- Rainer Breitling
  - Manchester Institute of Biotechnology, School of Chemistry, Faculty of Science and Engineering, University of Manchester, Manchester, United Kingdom
5
Casalino L, Seitz C, Lederhofer J, Tsybovsky Y, Wilson IA, Kanekiyo M, Amaro RE. Breathing and Tilting: Mesoscale Simulations Illuminate Influenza Glycoprotein Vulnerabilities. ACS Central Science 2022; 8:1646-1663. [PMID: 36589893 PMCID: PMC9801513 DOI: 10.1021/acscentsci.2c00981] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 08/19/2022] [Indexed: 05/28/2023]
Abstract
Influenza virus has resurfaced recently from inactivity during the early stages of the COVID-19 pandemic, raising serious concerns about the nature and magnitude of future epidemics. The main antigenic targets of influenza virus are two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). Whereas the structural and dynamical properties of both glycoproteins have been studied previously, the understanding of their plasticity in the whole-virion context is fragmented. Here, we investigate the dynamics of influenza glycoproteins in a crowded protein environment through mesoscale all-atom molecular dynamics simulations of two evolutionary-linked glycosylated influenza A whole-virion models. Our simulations reveal and kinetically characterize three main molecular motions of influenza glycoproteins: NA head tilting, HA ectodomain tilting, and HA head breathing. The flexibility of HA and NA highlights antigenically relevant conformational states, as well as facilitates the characterization of a novel monoclonal antibody, derived from convalescent human donor, that binds to the underside of the NA head. Our work provides previously unappreciated views on the dynamics of HA and NA, advancing the understanding of their interplay and suggesting possible strategies for the design of future vaccines and antivirals against influenza.
Affiliation(s)
- Lorenzo Casalino
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Christian Seitz
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Julia Lederhofer
  - Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yaroslav Tsybovsky
  - Electron Microscopy Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research Sponsored by the National Cancer Institute, Frederick, Maryland 21702, United States
- Ian A. Wilson
  - Department of Integrative Structural and Computational Biology and the Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, California 92037, United States
- Masaru Kanekiyo
  - Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Rommie E. Amaro
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
6
Hacking C, Verbeek H, Hamers JPH, Sion K, Aarts S. Text mining in long-term care: Exploring the usefulness of artificial intelligence in a nursing home setting. PLoS One 2022; 17:e0268281. [PMID: 36006921 PMCID: PMC9409502 DOI: 10.1371/journal.pone.0268281] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/10/2021] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open
Abstract
Objectives: In nursing homes, narrative data are collected to evaluate quality of care as perceived by residents or their family members. This results in a large amount of textual data. However, as the volume of data increases, it becomes beyond the capability of humans to analyze it. This study aims to explore the usefulness of text mining approaches regarding narrative data gathered in a nursing home setting. Design: Exploratory study showing a variety of text mining approaches. Setting and participants: Data has been collected as part of the project ‘Connecting Conversations’: assessing experienced quality of care by conducting individual interviews with residents of nursing homes (n = 39), family members (n = 37) and care professionals (n = 49). Methods: Several pre-processing steps were applied. A variety of text mining analyses were conducted: individual word frequencies, bigram frequencies, a correlation analysis and a sentiment analysis. A survey was conducted to establish a sentiment analysis model tailored to text collected in long-term care for older adults. Results: Residents, family members and care professionals uttered respectively 285, 362 and 549 words per interview. Word frequency analysis showed that words that occurred most frequently in the interviews are often positive. Despite some differences in word usage, correlation analysis displayed that similar words are used by all three groups to describe quality of care. Most interviews displayed a neutral sentiment. Care professionals expressed a more diverse sentiment compared to residents and family members. A topic clustering analysis showed a total of 12 topics including ‘relations’ and ‘care environment’. Conclusions and implications: This study demonstrates the usefulness of text mining to extend our knowledge regarding quality of care in a nursing home setting. With the rise of textual (narrative) data, text mining can lead to valuable new insights for long-term care for older adults.
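The word-frequency and bigram-frequency analyses mentioned in this abstract can be illustrated with a short sketch. It is not the study's pipeline: the tokenizer, the stopword handling, the function names, and the sample sentence are assumptions chosen to keep the example self-contained.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def word_and_bigram_counts(text, stopwords=frozenset()):
    """Hypothetical helper: count single words and adjacent word pairs."""
    tokens = [t for t in tokenize(text) if t not in stopwords]
    words = Counter(tokens)
    # Bigrams are pairs of neighbouring tokens after stopword removal;
    # this simple version does not stop at sentence boundaries.
    bigrams = Counter(zip(tokens, tokens[1:]))
    return words, bigrams

# Made-up sample text, not taken from the study's interviews.
sample = "The staff are kind. The staff listen."
words, bigrams = word_and_bigram_counts(sample, stopwords={"the", "are"})
```

Calling `words.most_common(10)` or `bigrams.most_common(10)` would then list the most frequent terms, which is the kind of summary a word-frequency analysis reports.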
Affiliation(s)
- Coen Hacking
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Hilde Verbeek
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Jan P. H. Hamers
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Katya Sion
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
- Sil Aarts
  - Faculty of Health Medicine and Life Sciences, Department of Health Services Research, CAPHRI Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands
  - The Living Lab in Ageing & Long-Term Care, Maastricht, The Netherlands
7
Casalino L, Seitz C, Lederhofer J, Tsybovsky Y, Wilson IA, Kanekiyo M, Amaro RE. Breathing and tilting: mesoscale simulations illuminate influenza glycoprotein vulnerabilities. bioRxiv: The Preprint Server for Biology 2022:2022.08.02.502576. [PMID: 35982676 PMCID: PMC9387122 DOI: 10.1101/2022.08.02.502576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
Influenza virus has resurfaced recently from inactivity during the early stages of the COVID-19 pandemic, raising serious concerns about the nature and magnitude of future epidemics. The main antigenic targets of influenza virus are two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). Whereas the structural and dynamical properties of both glycoproteins have been studied previously, the understanding of their plasticity in the whole-virion context is fragmented. Here, we investigate the dynamics of influenza glycoproteins in a crowded protein environment through mesoscale all-atom molecular dynamics simulations of two evolutionary-linked glycosylated influenza A whole-virion models. Our simulations reveal and kinetically characterize three main molecular motions of influenza glycoproteins: NA head tilting, HA ectodomain tilting, and HA head breathing. The flexibility of HA and NA highlights antigenically relevant conformational states, as well as facilitates the characterization of a novel monoclonal antibody, derived from human convalescent plasma, that binds to the underside of the NA head. Our work provides previously unappreciated views on the dynamics of HA and NA, advancing the understanding of their interplay and suggesting possible strategies for the design of future vaccines and antivirals against influenza.
Affiliation(s)
- Lorenzo Casalino
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Christian Seitz
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Julia Lederhofer
  - Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Yaroslav Tsybovsky
  - Electron Microscopy Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21702, United States
- Ian A. Wilson
  - Department of Integrative Structural and Computational Biology and the Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
- Masaru Kanekiyo
  - Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
- Rommie E. Amaro
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States. Corresponding author.
8
Xu Z, York LM, Seethepalli A, Bucciarelli B, Cheng H, Samac DA. Objective Phenotyping of Root System Architecture Using Image Augmentation and Machine Learning in Alfalfa (Medicago sativa L.). Plant Phenomics (Washington, D.C.) 2022; 2022:9879610. [PMID: 35479182 PMCID: PMC9012978 DOI: 10.34133/2022/9879610] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/13/2021] [Accepted: 03/03/2022] [Indexed: 12/28/2022]
Abstract
Active breeding programs specifically for root system architecture (RSA) phenotypes remain rare; however, breeding for branch and taproot types in the perennial crop alfalfa is ongoing. Phenotyping in this and other crops for active RSA breeding has mostly used visual scoring of specific traits or subjective classification into different root types. While image-based methods have been developed, translation to applied breeding is limited. This research is aimed at developing and comparing image-based RSA phenotyping methods using machine and deep learning algorithms for objective classification of 617 root images from mature alfalfa plants collected from the field to support the ongoing breeding efforts. Our results show that unsupervised machine learning tends to incorrectly classify roots into a normal distribution with most lines predicted as the intermediate root type. Encouragingly, random forest and TensorFlow-based neural networks can classify the root types into branch-type, taproot-type, and an intermediate taproot-branch type with 86% accuracy. With image augmentation, the prediction accuracy was improved to 97%. Coupling the predicted root type with its prediction probability will give breeders a confidence level for better decisions to advance the best and exclude the worst lines from their breeding program. This machine and deep learning approach enables accurate classification of the RSA phenotypes for genomic breeding of climate-resilient alfalfa.
Affiliation(s)
- Zhanyou Xu
  - USDA-ARS, Plant Science Research Unit, 1991 Upper Buford Circle, St. Paul, MN 55108, USA
- Larry M. York
  - Biosciences Division and Center for Bioenergy Innovation, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- Bruna Bucciarelli
  - Department of Agronomy and Plant Genetics, University of Minnesota, 1991 Upper Buford Circle, St. Paul, MN 55108, USA
- Hao Cheng
  - Department of Animal Science, University of California, 2251 Meyer Hall, One Shields Ave., Davis, CA 95616, USA
- Deborah A. Samac
  - USDA-ARS, Plant Science Research Unit, 1991 Upper Buford Circle, St. Paul, MN 55108, USA
9
Abstract
Clustering of the contents of a document corpus is used to create sub-corpora that are expected to consist of documents related to each other. However, while clustering is used in a variety of ways in document applications such as information retrieval, and a range of methods have been applied to the task, there has been relatively little exploration of how well it works in practice. Indeed, given the high dimensionality of the data it is possible that clustering may not always produce meaningful outcomes. In this paper we use a well-known clustering method to explore a variety of techniques, existing and novel, to measure clustering effectiveness. Results with our new, extrinsic techniques based on relevance judgements or retrieved documents demonstrate that retrieval-based information can be used to assess the quality of clustering, and also show that clustering can succeed to some extent at gathering together similar material. Further, they show that intrinsic clustering techniques that have been shown to be informative in other domains do not work for information retrieval. Whether clustering is sufficiently effective to have a significant impact on practical retrieval is unclear, but as the results show, our measurement techniques can effectively distinguish between clustering methods.
10
Schmidt MN, Seddig D, Davidov E, Mørup M, Albers KJ, Bauer JM, Glückstad FK. Latent profile analysis of human values: What is the optimal number of clusters? Methodology-European Journal of Research Methods for the Behavioral and Social Sciences 2021. [DOI: 10.5964/meth.5479] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022] Open
Abstract
Latent Profile Analysis (LPA) is a method to extract homogeneous clusters characterized by a common response profile. Previous works applying LPA to human value segmentation tend to select a small number of moderately homogeneous clusters based on model selection criteria such as Akaike information criterion, Bayesian information criterion and Entropy. The question is whether a small number of clusters is all that can be gleaned from the data. While some studies have carefully compared different statistical model selection criteria, there are currently no established criteria to assess if an increased number of clusters generates meaningful theoretical insights. This article examines the content and meaningfulness of the clusters extracted using two algorithms: Variational Bayesian LPA and Maximum Likelihood LPA. For both methods, our results point towards eight as the optimal number of clusters for characterizing distinctive Schwartz value typologies that generate meaningful insights and predict several external variables.
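Two of the selection criteria named in this abstract have simple closed forms: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of free parameters, n the number of observations, and L the maximized likelihood. The sketch below applies them to choose a cluster count. The log-likelihoods and parameter counts are made-up illustrative numbers, not results from the article, and `best_k` is a hypothetical helper.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion; lower is better."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2 * log_lik

def best_k(fits, n_obs):
    """fits maps a cluster count to (log_likelihood, n_params)."""
    by_aic = min(fits, key=lambda k: aic(*fits[k]))
    by_bic = min(fits, key=lambda k: bic(*fits[k], n_obs))
    return by_aic, by_bic

# Illustrative (invented) fits for 2, 3 and 4 clusters on 500 observations.
fits = {2: (-1000.0, 10), 3: (-985.0, 16), 4: (-975.0, 22)}
choice = best_k(fits, n_obs=500)
```

With these invented numbers AIC prefers four clusters while BIC prefers two, because BIC penalizes each extra parameter more heavily once n exceeds about 7. Disagreement of this kind is one reason the choice of cluster count is often not settled by the criteria alone.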
11
Phan HP, Ngu BH. Introducing the Concept of Consonance-Disconsonance of Best Practice: A Focus on the Development of 'Student Profiling'. Front Psychol 2021; 12:557968. [PMID: 33995160 PMCID: PMC8121024 DOI: 10.3389/fpsyg.2021.557968] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/01/2020] [Accepted: 04/07/2021] [Indexed: 11/17/2022] Open
Abstract
The present study, using a non-experimental approach, investigated a theoretical concept of best practice, which we recently introduced - namely: a 'state of consonance' and a 'state of disconsonance' of best practice. Consonance of best practice posits that different levels of best practice (e.g., low level of best practice versus optimal level of best practice), as well as other comparable psychological constructs (e.g., motivation towards learning) would cluster or 'group' together. Disconsonance of best practice, in contrast, would indicate non-overlapping of contrasting levels of best practice (i.e., low level of best practice versus optimal level of best practice). Taiwanese undergraduates (N = 831) from five private universities in Taipei City and New Taipei City, Taiwan took part in the study by responding to a suite of Likert-scale questionnaires (e.g., Best Practice Questionnaires, Motivation towards Learning Questionnaire), which took approximately 30-35 min to complete. Cluster analysis, commonly known as ClA, was used to analyze the data and seek theoretical understanding into the nature of the consonance of best practice. Results, overall, showed support for our proposition, resulting in four distinct profiles: 'a Balanced Profile,' 'an Intrinsic Motivation Profile,' 'a Current Best Practice + Interest Profile,' and 'a Current Best Practice + Motivation Profile.' This evidence, helping to advance further research development, has a number of practical implications for consideration. For example, how could we use the Balanced Profile to develop learning objectives and/or pedagogical practices that would encourage students to enjoy their learning experiences?
Affiliation(s)
- Huy P. Phan
  - School of Education, University of New England, Armidale, NSW, Australia
12
Sessa M, Khan AR, Liang D, Andersen M, Kulahci M. Artificial Intelligence in Pharmacoepidemiology: A Systematic Review. Part 1-Overview of Knowledge Discovery Techniques in Artificial Intelligence. Front Pharmacol 2020; 11:1028. [PMID: 32765261 PMCID: PMC7378532 DOI: 10.3389/fphar.2020.01028] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/23/2019] [Accepted: 06/24/2020] [Indexed: 12/14/2022] Open
Abstract
Aim: To perform a systematic review on the application of artificial intelligence (AI) based knowledge discovery techniques in pharmacoepidemiology. Study Eligibility Criteria: Clinical trials, meta-analyses, narrative/systematic reviews, and observational studies using (or mentioning articles using) artificial intelligence techniques were eligible. Articles without a full text available in the English language were excluded. Data Sources: Articles recorded from 1950/01/01 to 2019/05/06 in Ovid MEDLINE were screened. Participants: Studies including humans (real or simulated) exposed to a drug. Results: In total, 72 original articles and 5 reviews were identified via Ovid MEDLINE. Twenty different knowledge discovery methods were identified, mainly from the area of machine learning (66/72; 91.7%). Classification/regression (44/72; 61.1%), classification/regression + model optimization (13/72; 18.0%), and classification/regression + features selection (12/72; 16.7%) were the three most frequent tasks in the reviewed literature that machine learning methods have been applied to solve. The top three used techniques were artificial neural networks, random forest, and support vector machine models. Conclusions: The use of AI-based knowledge discovery techniques has increased exponentially over the years, covering numerous sub-topics of pharmacoepidemiology. Systematic Review Registration: Systematic review registration number in PROSPERO: CRD42019136552.
Affiliation(s)
- Maurizio Sessa
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
- Abdul Rauf Khan
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark; Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark
- David Liang
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
- Morten Andersen
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
- Murat Kulahci
- Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark; Department of Business Administration, Technology and Social Sciences, Luleå University of Technology, Luleå, Sweden
|
13
|
Spurek P, Byrski K, Tabor J. Online updating of active function cross-entropy clustering. Pattern Anal Appl 2019. [DOI: 10.1007/s10044-018-0701-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
14
|
|
15
|
Kriegel HP, Schubert E, Zimek A. The (black) art of runtime evaluation: Are we comparing algorithms or implementations? Knowl Inf Syst 2016. [DOI: 10.1007/s10115-016-1004-2] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 10/20/2022]
|
16
|
Köhn HF, Chiu CY, Brusco MJ. Heuristic cognitive diagnosis when the Q-matrix is unknown. Br J Math Stat Psychol 2015; 68:268-291. [PMID: 25496248 DOI: 10.1111/bmsp.12044] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/24/2012] [Revised: 07/12/2014] [Indexed: 06/04/2023]
Abstract
Cognitive diagnosis models of educational test performance rely on a binary Q-matrix that specifies the associations between individual test items and the cognitive attributes (skills) required to answer those items correctly. Current methods for fitting cognitive diagnosis models to educational test data and assigning examinees to proficiency classes are based on parametric estimation methods such as expectation maximization (EM) and Markov chain Monte Carlo (MCMC) that frequently encounter difficulties in practical applications. In response to these difficulties, non-parametric classification techniques (cluster analysis) have been proposed as heuristic alternatives to parametric procedures. These non-parametric classification techniques first aggregate each examinee's test item scores into a profile of attribute sum scores, which then serve as the basis for clustering examinees into proficiency classes. Like the parametric procedures, the non-parametric classification techniques require that the Q-matrix underlying a given test be known. Unfortunately, in practice, the Q-matrix for most tests is not known and must be estimated to specify the associations between items and attributes, risking a misspecified Q-matrix that may then result in the incorrect classification of examinees. This paper demonstrates that clustering examinees into proficiency classes based on their item scores rather than on their attribute sum-score profiles does not require knowledge of the Q-matrix, and results in a more accurate classification of examinees.
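The two clustering inputs contrasted in this abstract can be sketched as follows. This is a minimal illustration only, assuming binary item scores and a small made-up Q-matrix; the function name `attribute_sum_scores` and all data values are hypothetical and do not come from the paper.

```python
def attribute_sum_scores(item_scores, q_matrix):
    """Aggregate each examinee's item scores into an attribute sum-score profile.

    item_scores: one list of 0/1 item scores per examinee.
    q_matrix:    one row per item, one 0/1 column per attribute
                 (1 means the item requires that attribute).
    """
    n_attributes = len(q_matrix[0])
    profiles = []
    for scores in item_scores:
        profile = [
            sum(score * q_row[k] for score, q_row in zip(scores, q_matrix))
            for k in range(n_attributes)
        ]
        profiles.append(profile)
    return profiles


# Hypothetical Q-matrix: 4 items (rows) by 2 attributes (columns).
Q = [[1, 0], [0, 1], [1, 1], [1, 0]]

# Hypothetical item scores for 3 examinees on the 4 items.
X = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1]]

# Input for sum-score clustering: requires Q to be known.
profiles = attribute_sum_scores(X, Q)
print(profiles)  # [[2, 0], [0, 1], [3, 2]]

# Input for item-score clustering: the rows of X, with no Q-matrix needed.
item_score_input = X
```

Either `profiles` or `item_score_input` could then be passed to a clustering routine; only the first depends on a correctly specified Q-matrix, which is the dependency the abstract says item-score clustering avoids.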
Affiliation(s)
- Hans-Friedrich Köhn
- Department of Psychology, University of Illinois at Urbana-Champaign, Illinois, USA
|
17
|
Lord E, Diallo AB, Makarenkov V. Classification of bioinformatics workflows using weighted versions of partitioning and hierarchical clustering algorithms. BMC Bioinformatics 2015; 16:68. [PMID: 25887434 PMCID: PMC4354763 DOI: 10.1186/s12859-015-0508-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/01/2014] [Accepted: 02/20/2015] [Indexed: 11/10/2022]
Abstract
Background: Workflows, or computational pipelines, consisting of collections of multiple linked tasks are becoming more and more popular in many scientific fields, including computational biology. For example, simulation studies, which are now a must for statistical validation of new bioinformatics methods and software, are frequently carried out using the available workflow platforms. Workflows are typically organized to minimize the total execution time and to maximize the efficiency of the included operations. Clustering algorithms can be applied either for regrouping similar workflows for their simultaneous execution on a server, or for dispatching some lengthy workflows to different servers, or for classifying the available workflows with a view to performing a specific keyword search. Results: In this study, we consider four different workflow encoding and clustering schemes which are representative of bioinformatics projects. Some of them allow for clustering workflows with similar topological features, while the others regroup workflows according to their specific attributes (e.g. associated keywords) or execution time. The four types of workflow encoding examined in this study were compared using the weighted versions of k-means and k-medoids partitioning algorithms. The Calinski-Harabasz, Silhouette and logSS clustering indices were considered. Hierarchical classification methods, including the UPGMA, Neighbor Joining, Fitch and Kitsch algorithms, were also applied to classify bioinformatics workflows. Moreover, a novel pairwise measure of clustering solution stability, which can be computed in situations when a series of independent program runs is carried out, was introduced. Conclusions: Our findings based on the analysis of 220 real-life bioinformatics workflows suggest that the weighted clustering models based on keyword information or task execution times provide the most appropriate clustering solutions. Using datasets generated by the Armadillo and Taverna scientific workflow management systems, we found that the weighted cosine distance in association with the k-medoids partitioning algorithm and the presence-absence workflow encoding provided the highest values of the Rand index among all compared clustering strategies. The introduced clustering stability indices, PS and PSG, can be effectively used to identify elements with a low clustering support. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0508-1) contains supplementary material, which is available to authorized users.
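For readers unfamiliar with the two quantities named in the conclusions, the sketch below shows one common form of a weighted cosine distance and the standard pair-counting Rand index. It is an illustration under stated assumptions: the abstract does not give the paper's exact weighting scheme, so the per-feature weights `w`, the function names, and the toy vectors are all hypothetical.

```python
import math
from itertools import combinations


def weighted_cosine_distance(x, y, w):
    """One common weighted cosine distance: 1 - weighted cosine similarity.

    x, y: feature vectors (e.g. presence-absence encodings of two workflows).
    w:    non-negative per-feature weights (assumed form, not from the paper).
    """
    dot = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    norm_x = math.sqrt(sum(wi * xi * xi for wi, xi in zip(w, x)))
    norm_y = math.sqrt(sum(wi * yi * yi for wi, yi in zip(w, y)))
    if norm_x == 0 or norm_y == 0:
        return 1.0  # convention chosen here for all-zero vectors
    return 1.0 - dot / (norm_x * norm_y)


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two partitions agree.

    A pair agrees when both partitions put the two elements in the same
    cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical presence-absence encodings of two workflows over 3 tasks.
d_same = weighted_cosine_distance([1, 0, 1], [1, 0, 1], [1.0, 1.0, 1.0])
d_disjoint = weighted_cosine_distance([1, 0], [0, 1], [2.0, 0.5])

# Identical partitions up to relabeling give a Rand index of 1.
ri_identical = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
ri_crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

In a k-medoids setting, a distance such as `weighted_cosine_distance` would be evaluated between every pair of encoded workflows, and `rand_index` would compare the resulting partition against a reference classification.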
Affiliation(s)
- Etienne Lord
- Département d'informatique, Université du Québec à Montréal, C.P. 8888 succ. Centre-Ville, Montreal, QC, H3C 3P8, Canada; Département de sciences biologiques, Université de Montréal, C.P. 6128 succ. Centre-Ville, Montreal, QC, H3C 3J7, Canada.
- Abdoulaye Baniré Diallo
- Département d'informatique, Université du Québec à Montréal, C.P. 8888 succ. Centre-Ville, Montreal, QC, H3C 3P8, Canada.
- Vladimir Makarenkov
- Département d'informatique, Université du Québec à Montréal, C.P. 8888 succ. Centre-Ville, Montreal, QC, H3C 3P8, Canada.
|
18
|
|
19
|
Fritz H, García-Escudero LA, Mayo-Iscar A. A fast algorithm for robust constrained clustering. Comput Stat Data Anal 2013. [DOI: 10.1016/j.csda.2012.11.018] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 10/27/2022]
|
20
|
|