Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wan C, Jones DT. Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks. NAT MACH INTELL 2020;2:540-50. [DOI: 10.1038/s42256-020-0222-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

For:	Wan C, Jones DT. Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks. NAT MACH INTELL 2020;2:540-50. [DOI: 10.1038/s42256-020-0222-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Number

Cited by Other Article(s)

Beltrán JF, Herrera-Belén L, Parraguez-Contreras F, Farías JG, Machuca-Sepúlveda J, Short S. MultiToxPred 1.0: a novel comprehensive tool for predicting 27 classes of protein toxins using an ensemble machine learning approach. BMC Bioinformatics 2024;25:148. [PMID: 38609877 PMCID: PMC11010298 DOI: 10.1186/s12859-024-05748-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Accepted: 03/14/2024] [Indexed: 04/14/2024] Open

Singh J, Khanna NN, Rout RK, Singh N, Laird JR, Singh IM, Kalra MK, Mantella LE, Johri AM, Isenovic ER, Fouda MM, Saba L, Fatemi M, Suri JS. GeneAI 3.0: powerful, novel, generalized hybrid and ensemble deep learning frameworks for miRNA species classification of stationary patterns from nucleotides. Sci Rep 2024;14:7154. [PMID: 38531923 DOI: 10.1038/s41598-024-56786-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024] Open

Abstract

Due to the intricate relationship between the small non-coding ribonucleic acid (miRNA) sequences, the classification of miRNA species, namely Human, Gorilla, Rat, and Mouse is challenging. Previous methods are not robust and accurate. In this study, we present AtheroPoint's GeneAI 3.0, a powerful, novel, and generalized method for extracting features from the fixed patterns of purines and pyrimidines in each miRNA sequence in ensemble paradigms in machine learning (EML) and convolutional neural network (CNN)-based deep learning (EDL) frameworks. GeneAI 3.0 utilized five conventional (Entropy, Dissimilarity, Energy, Homogeneity, and Contrast), and three contemporary (Shannon entropy, Hurst exponent, Fractal dimension) features, to generate a composite feature set from given miRNA sequences which were then passed into our ML and DL classification framework. A set of 11 new classifiers was designed consisting of 5 EML and 6 EDL for binary/multiclass classification. It was benchmarked against 9 solo ML (SML), 6 solo DL (SDL), 12 hybrid DL (HDL) models, resulting in a total of 11 + 27 = 38 models were designed. Four hypotheses were formulated and validated using explainable AI (XAI) as well as reliability/statistical tests. The order of the mean performance using accuracy (ACC)/area-under-the-curve (AUC) of the 24 DL classifiers was: EDL > HDL > SDL. The mean performance of EDL models with CNN layers was superior to that without CNN layers by 0.73%/0.92%. Mean performance of EML models was superior to SML models with improvements of ACC/AUC by 6.24%/6.46%. EDL models performed significantly better than EML models, with a mean increase in ACC/AUC of 7.09%/6.96%. The GeneAI 3.0 tool produced expected XAI feature plots, and the statistical tests showed significant p-values. Ensemble models with composite features are highly effective and generalized models for effectively classifying miRNA sequences.

Collapse

Mujahid O, Contreras I, Beneyto A, Vehi J. Generative deep learning for the development of a type 1 diabetes simulator. COMMUNICATIONS MEDICINE 2024;4:51. [PMID: 38493243 PMCID: PMC10944502 DOI: 10.1038/s43856-024-00476-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Accepted: 03/05/2024] [Indexed: 03/18/2024] Open

Abstract

BACKGROUND

Type 1 diabetes (T1D) simulators, crucial for advancing diabetes treatments, often fall short of capturing the entire complexity of the glucose-insulin system due to the imprecise approximation of the physiological models. This study introduces a simulation approach employing a conditional deep generative model. The aim is to overcome the limitations of existing T1D simulators by synthesizing virtual patients that more accurately represent the entire glucose-insulin system physiology.

METHODS

Our methodology utilizes a sequence-to-sequence generative adversarial network to simulate virtual T1D patients causally. Causality is embedded in the model by introducing shifted input-output pairs during training, with a 90-min shift capturing the impact of input insulin and carbohydrates on blood glucose. To validate our approach, we train and evaluate the model using three distinct datasets, each consisting of 27, 12, and 10 T1D patients, respectively. In addition, we subject the trained model to further validation for closed-loop therapy, employing a state-of-the-art controller.

RESULTS

The generated patients display statistical similarity to real patients when evaluated on the time-in-range results for each of the standard blood glucose ranges in T1D management along with means and variability outcomes. When tested for causality, authentic causal links are identified between the insulin, carbohydrates, and blood glucose levels of the virtual patients. The trained generative model demonstrates behaviours that are closer to reality compared to conventional T1D simulators when subjected to closed-loop insulin therapy using a state-of-the-art controller.

CONCLUSIONS

These results highlight our approach's capability to accurately capture physiological dynamics and establish genuine causal relationships, holding promise for enhancing the development and evaluation of therapies in diabetes.

Collapse

Michael-Pitschaze T, Cohen N, Ofer D, Hoshen Y, Linial M. Detecting anomalous proteins using deep representations. NAR Genom Bioinform 2024;6:lqae021. [PMID: 38486884 PMCID: PMC10939404 DOI: 10.1093/nargab/lqae021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2023] [Revised: 11/17/2023] [Accepted: 02/23/2024] [Indexed: 03/17/2024] Open

Beltrán JF, Belén LH, Farias JG, Zamorano M, Lefin N, Miranda J, Parraguez-Contreras F. VirusHound-I: prediction of viral proteins involved in the evasion of host adaptive immune response using the random forest algorithm and generative adversarial network for data augmentation. Brief Bioinform 2023;25:bbad434. [PMID: 38033292 PMCID: PMC10753651 DOI: 10.1093/bib/bbad434] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 10/18/2023] [Accepted: 11/05/2023] [Indexed: 12/02/2023] Open

Ibtehaz N, Kagaya Y, Kihara D. Domain-PFP allows protein function prediction using function-aware domain embedding representations. Commun Biol 2023;6:1103. [PMID: 37907681 PMCID: PMC10618451 DOI: 10.1038/s42003-023-05476-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2023] [Accepted: 10/17/2023] [Indexed: 11/02/2023] Open

Mardikoraem M, Wang Z, Pascual N, Woldring D. Generative models for protein sequence modeling: recent advances and future directions. Brief Bioinform 2023;24:bbad358. [PMID: 37864295 PMCID: PMC10589401 DOI: 10.1093/bib/bbad358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2023] [Revised: 09/08/2023] [Accepted: 09/12/2023] [Indexed: 10/22/2023] Open

Ibtehaz N, Kagaya Y, Kihara D. Domain-PFP: Protein Function Prediction Using Function-Aware Domain Embedding Representations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.23.554486. [PMID: 37662252 PMCID: PMC10473699 DOI: 10.1101/2023.08.23.554486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/05/2023]

Wang H, Fu T, Du Y, Gao W, Huang K, Liu Z, Chandak P, Liu S, Van Katwyk P, Deac A, Anandkumar A, Bergen K, Gomes CP, Ho S, Kohli P, Lasenby J, Leskovec J, Liu TY, Manrai A, Marks D, Ramsundar B, Song L, Sun J, Tang J, Veličković P, Welling M, Zhang L, Coley CW, Bengio Y, Zitnik M. Scientific discovery in the age of artificial intelligence. Nature 2023;620:47-60. [PMID: 37532811 DOI: 10.1038/s41586-023-06221-2] [Citation(s) in RCA: 76] [Impact Index Per Article: 76.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Accepted: 05/16/2023] [Indexed: 08/04/2023]

Affiliation(s)

Hanchen Wang Department of Engineering, University of Cambridge, Cambridge, UK Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA Department of Research and Early Development, Genentech Inc, South San Francisco, CA, USA Department of Computer Science, Stanford University, Stanford, CA, USA
Tianfan Fu Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
Yuanqi Du Department of Computer Science, Cornell University, Ithaca, NY, USA
Wenhao Gao Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
Kexin Huang Department of Computer Science, Stanford University, Stanford, CA, USA
Ziming Liu Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
Payal Chandak Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, USA
Shengchao Liu Mila - Quebec AI Institute, Montreal, Quebec, Canada Université de Montréal, Montreal, Quebec, Canada
Peter Van Katwyk Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA Data Science Institute, Brown University, Providence, RI, USA
Andreea Deac Mila - Quebec AI Institute, Montreal, Quebec, Canada Université de Montréal, Montreal, Quebec, Canada
Anima Anandkumar Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA NVIDIA, Santa Clara, CA, USA
Karianne Bergen Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA Data Science Institute, Brown University, Providence, RI, USA
Carla P Gomes Department of Computer Science, Cornell University, Ithaca, NY, USA
Shirley Ho Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA Department of Physics, Carnegie Mellon University, Pittsburgh, PA, USA Department of Physics and Center for Data Science, New York University, New York, NY, USA
Pushmeet Kohli Google DeepMind, London, UK
Joan Lasenby Department of Engineering, University of Cambridge, Cambridge, UK
Jure Leskovec Department of Computer Science, Stanford University, Stanford, CA, USA
Tie-Yan Liu Microsoft Research, Beijing, China
Arjun Manrai Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Debora Marks Department of Systems Biology, Harvard Medical School, Boston, MA, USA Broad Institute of MIT and Harvard, Cambridge, MA, USA
Bharath Ramsundar Deep Forest Sciences, Palo Alto, CA, USA
Le Song BioMap, Beijing, China Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Jimeng Sun University of Illinois at Urbana-Champaign, Champaign, IL, USA
Jian Tang Mila - Quebec AI Institute, Montreal, Quebec, Canada HEC Montréal, Montreal, Quebec, Canada CIFAR AI Chair, Toronto, Ontario, Canada
Petar Veličković Google DeepMind, London, UK Department of Computer Science and Technology, University of Cambridge, Cambridge, UK
Max Welling University of Amsterdam, Amsterdam, Netherlands Microsoft Research Amsterdam, Amsterdam, Netherlands
Linfeng Zhang DP Technology, Beijing, China AI for Science Institute, Beijing, China
Connor W Coley Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Yoshua Bengio Mila - Quebec AI Institute, Montreal, Quebec, Canada Université de Montréal, Montreal, Quebec, Canada
Marinka Zitnik Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA. Broad Institute of MIT and Harvard, Cambridge, MA, USA. Harvard Data Science Initiative, Cambridge, MA, USA. Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.

Collapse

Zheng Y, Young ND, Song J, Chang BC, Gasser RB. An informatic workflow for the enhanced annotation of excretory/secretory proteins of Haemonchus contortus. Comput Struct Biotechnol J 2023;21:2696-2704. [PMID: 37143762 PMCID: PMC10151223 DOI: 10.1016/j.csbj.2023.03.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2023] [Revised: 03/16/2023] [Accepted: 03/16/2023] [Indexed: 03/19/2023] Open

Wang F, Feng X, Kong R, Chang S. Generating new protein sequences by using dense network and attention mechanism. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023;20:4178-4197. [PMID: 36899622 DOI: 10.3934/mbe.2023195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/18/2023]

Zhu YH, Zhang C, Liu Y, Omenn GS, Freddolino PL, Yu DJ, Zhang Y. TripletGO: Integrating Transcript Expression Profiles with Protein Homology Inferences for Gene Function Prediction. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022;20:1013-1027. [PMID: 35568117 PMCID: PMC10025770 DOI: 10.1016/j.gpb.2022.03.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/20/2021] [Revised: 03/02/2022] [Accepted: 04/16/2022] [Indexed: 01/13/2023]

Soleymani F, Paquet E, Viktor H, Michalowski W, Spinello D. Protein-protein interaction prediction with deep learning: A comprehensive review. Comput Struct Biotechnol J 2022;20:5316-5341. [PMID: 36212542 PMCID: PMC9520216 DOI: 10.1016/j.csbj.2022.08.070] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2022] [Revised: 08/29/2022] [Accepted: 08/30/2022] [Indexed: 11/15/2022] Open

Fu X, Bates PA. Application of deep learning methods: From molecular modelling to patient classification. Exp Cell Res 2022;418:113278. [PMID: 35810775 DOI: 10.1016/j.yexcr.2022.113278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2022] [Revised: 06/16/2022] [Accepted: 07/05/2022] [Indexed: 11/28/2022]

Kagaya Y, Flannery ST, Jain A, Kihara D. ContactPFP: Protein Function Prediction Using Predicted Contact Information. FRONTIERS IN BIOINFORMATICS 2022;2. [PMID: 35875419 PMCID: PMC9302406 DOI: 10.3389/fbinf.2022.896295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Abstract Computational function prediction is one of the most important problems in bioinformatics as elucidating the function of genes is a central task in molecular biology and genomics. Most of the existing function prediction methods use protein sequences as the primary source of input information because the sequence is the most available information for query proteins. There are attempts to consider other attributes of query proteins. Among these attributes, the three-dimensional (3D) structure of proteins is known to be very useful in identifying the evolutionary relationship of proteins, from which functional similarity can be inferred. Here, we report a novel protein function prediction method, ContactPFP, which uses predicted residue-residue contact maps as input structural features of query proteins. Although 3D structure information is known to be useful, it has not been routinely used in function prediction because the 3D structure is not experimentally determined for many proteins. In ContactPFP, we overcome this limitation by using residue-residue contact prediction, which has become increasingly accurate due to rapid development in the protein structure prediction field. ContactPFP takes a query protein sequence as input and uses predicted residue-residue contact as a proxy for the 3D protein structure. To characterize how predicted contacts contribute to function prediction accuracy, we compared the performance of ContactPFP with several well-established sequence-based function prediction methods. The comparative study revealed the advantages and weaknesses of ContactPFP compared to contemporary sequence-based methods. There were many cases where it showed higher prediction accuracy. We examined factors that affected the accuracy of ContactPFP using several illustrative cases that highlight the strength of our method. Collapse

Wang Y, Luo X, Zou Q. Effector-GAN: prediction of fungal effector proteins based on pretrained deep representation learning methods and generative adversarial networks. Bioinformatics 2022;38:3541-3548. [PMID: 35640972 DOI: 10.1093/bioinformatics/btac374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2022] [Revised: 05/05/2022] [Accepted: 05/27/2022] [Indexed: 11/13/2022] Open

Serov N, Vinogradov V. Artificial intelligence to bring nanomedicine to life. Adv Drug Deliv Rev 2022;184:114194. [PMID: 35283223 DOI: 10.1016/j.addr.2022.114194] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 03/04/2022] [Accepted: 03/07/2022] [Indexed: 12/13/2022]

Abstract

The technology of drug delivery systems (DDSs) has demonstrated an outstanding performance and effectiveness in production of pharmaceuticals, as it is proved by many FDA-approved nanomedicines that have an enhanced selectivity, manageable drug release kinetics and synergistic therapeutic actions. Nonetheless, to date, the rational design and high-throughput development of nanomaterial-based DDSs for specific purposes is far from a routine practice and is still in its infancy, mainly due to the limitations in scientists' capabilities to effectively acquire, analyze, manage, and comprehend complex and ever-growing sets of experimental data, which is vital to develop DDSs with a set of desired functionalities. At the same time, this task is feasible for the data-driven approaches, high throughput experimentation techniques, process automatization, artificial intelligence (AI) technology, and machine learning (ML) approaches, which is referred to as The Fourth Paradigm of scientific research. Therefore, an integration of these approaches with nanomedicine and nanotechnology can potentially accelerate the rational design and high-throughput development of highly efficient nanoformulated drugs and smart materials with pre-defined functionalities. In this Review, we survey the important results and milestones achieved to date in the application of data science, high throughput, as well as automatization approaches, combined with AI and ML to design and optimize DDSs and related nanomaterials. This manuscript mission is not only to reflect the state-of-art in data-driven nanomedicine, but also show how recent findings in the related fields can transform the nanomedicine's image. We discuss how all these results can be used to boost nanomedicine translation to the clinic, as well as highlight the future directions for the development, data-driven, high throughput experimentation-, and AI-assisted design, as well as the production of nanoformulated drugs and smart materials with pre-defined properties and behavior. This Review will be of high interest to the chemists involved in materials science, nanotechnology, and DDSs development for biomedical applications, although the general nature of the presented approaches enables knowledge translation to many other fields of science.

Collapse

Vicedomini R, Bouly JP, Laine E, Falciatore A, Carbone A. Multiple profile models extract features from protein sequence data and resolve functional diversity of very different protein families. Mol Biol Evol 2022;39:6556147. [PMID: 35353898 PMCID: PMC9016551 DOI: 10.1093/molbev/msac070] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Abstract

Functional classification of proteins from sequences alone has become a critical bottleneck in understanding the myriad of protein sequences that accumulate in our databases. The great diversity of homologous sequences hides, in many cases, a variety of functional activities that cannot be anticipated. Their identification appears critical for a fundamental understanding of the evolution of living organisms and for biotechnological applications. ProfileView is a sequence-based computational method, designed to functionally classify sets of homologous sequences. It relies on two main ideas: the use of multiple profile models whose construction explores evolutionary information in available databases, and a novel definition of a representation space in which to analyse sequences with multiple profile models combined together. ProfileView classifies protein families by enriching known functional groups with new sequences and discovering new groups and subgroups. We validate ProfileView on seven classes of widespread proteins involved in the interaction with nucleic acids, amino acids and small molecules, and in a large variety of functions and enzymatic reactions. Profile-View agrees with the large set of functional data collected for these proteins from the literature regarding the organisation into functional subgroups and residues that characterise the functions. In addition, ProfileView resolves undefined functional classifications and extracts the molecular determinants underlying protein functional diversity, showing its potential to select sequences towards accurate experimental design and discovery of novel biological functions. On protein families with complex domain architecture, ProfileView functional classification reconciles domain combinations, unlike phylogenetic reconstruction. ProfileView proves to outperform the functional classification approach PANTHER, the two k-mer based methods CUPP and eCAMI and a neural network approach based on Restricted Boltzmann Machines. It overcomes time complexity limitations of the latter.

Collapse

AIM in Genomic Basis of Medicine: Applications. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]

Asadi M, McPhedran KN. Greenhouse gas emission estimation from municipal wastewater using a hybrid approach of generative adversarial network and data-driven modelling. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021;800:149508. [PMID: 34391143 DOI: 10.1016/j.scitotenv.2021.149508] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/06/2021] [Revised: 07/29/2021] [Accepted: 08/03/2021] [Indexed: 06/13/2023]

Abstract

Greenhouse gas (GHG) emissions including carbon dioxide (CO₂), methane (CH₄), and nitrous oxide (N₂O) created via wastewater treatment processes are not easily modeled given the non-linearity and complexity of biological processes. These factors are also impacted by limited data availability making the development of artificial data generation algorithms, such as a generative adversarial network (GAN), useful for determination of GHG emission rate estimates (EREs). The main objective of this study was to develop a hybrid approach of using GAN and regression modelling to determine GHG EREs from a cold-region biological nutrient removal (BNR) municipal wastewater treatment plant (MWTP) in which the aerobic reactor has previously been established as the main GHG emission source. To our knowledge, this is the first application of GAN used for MWTP modelling purposes. The EREs were generated from laboratory-scale reactors used in conjunction with facility-monitored operating parameters to develop the GAN and regression models. Results showed that regression models provided reasonable EREs using parameters including hydraulic retention time (HRT), temperature, total organic carbon, and dissolved oxygen (DO) concentrations for CO₂ EREs; HRT, temperature, DO and phosphate (PO₄^3-) concentrations for CH₄ EREs; and temperature, DO, and nitrogen (nitrite, nitrate, and ammonium) concentrations for N₂O EREs. Additionally, the addition of 100 GAN-created virtual data points improved regression model metrics including increased correlation coefficient and index agreement values, and decreased root mean square error values. Clearly, virtual data augmentation using GAN is a valuable resource in supplementation of limited data for improved modelling outcomes. Genetic algorithm optimization was also used to determine operating parameter modifications resulting in potential for minimization (or maximization) of GHG emissions.

Collapse

Chu HY, Wong ASL. Facilitating Machine Learning-Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. ADVANCED GENETICS (HOBOKEN, N.J.) 2021;2:2100038. [PMID: 36619853 PMCID: PMC9744531 DOI: 10.1002/ggn2.202100038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/07/2021] [Revised: 11/08/2021] [Indexed: 01/11/2023]

A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides. Int J Mol Sci 2021;22:ijms222212291. [PMID: 34830173 PMCID: PMC8619404 DOI: 10.3390/ijms222212291] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2021] [Revised: 11/09/2021] [Accepted: 11/11/2021] [Indexed: 11/17/2022] Open

Vu TTD, Jung J. Protein function prediction with gene ontology: from traditional to deep learning models. PeerJ 2021;9:e12019. [PMID: 34513334 PMCID: PMC8395570 DOI: 10.7717/peerj.12019] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2021] [Accepted: 07/29/2021] [Indexed: 11/25/2022] Open

Li M, Zhang W. PHIAF: prediction of phage-host interactions with GAN-based data augmentation and sequence-based feature fusion. Brief Bioinform 2021;23:6362109. [PMID: 34472593 DOI: 10.1093/bib/bbab348] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2021] [Revised: 07/05/2021] [Accepted: 07/18/2021] [Indexed: 01/01/2023] Open

Chen RJ, Lu MY, Chen TY, Williamson DFK, Mahmood F. Synthetic data in machine learning for medicine and healthcare. Nat Biomed Eng 2021;5:493-497. [PMID: 34131324 PMCID: PMC9353344 DOI: 10.1038/s41551-021-00751-8] [Citation(s) in RCA: 144] [Impact Index Per Article: 48.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]

Kamada M, Okuno Y. AIM in Genomic Basis of Medicine: Applications. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_264-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Makrodimitris S, van Ham RCHJ, Reinders MJT. Automatic Gene Function Prediction in the 2020's. Genes (Basel) 2020;11:E1264. [PMID: 33120976 PMCID: PMC7692357 DOI: 10.3390/genes11111264] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2020] [Revised: 10/19/2020] [Accepted: 10/21/2020] [Indexed: 02/06/2023] Open