1
Gómez-Martín C, Aparicio-Puerta E, Hackenberg M. sRNAtoolbox: Dockerized Analysis of Small RNA Sequencing Data in Model and Non-model Species. Methods Mol Biol 2023; 2630:179-213. [PMID: 36689184 DOI: 10.1007/978-1-0716-2982-6_13] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
The current versions of the microRNA databases MiRgeneDB, miRBase, and PmiREN contain annotations for a total of 358 different species. Public repositories, however, host small RNA sequencing data for over 800 species. This discrepancy implies that microRNA research is also very active in species that neither have an available high-quality genome assembly nor annotations for microRNAs or other types of noncoding genes. These cases are particularly challenging to analyze because reference sequences need to be collected from different sources and processed and formatted appropriately so that the dedicated small RNA analysis tools can make use of them. In this protocol we describe how small RNA sequencing data can be easily analyzed by means of a dockerized version of the well-established sRNAtoolbox/sRNAbench small RNA tools. We outline the analysis of two publicly available datasets to demonstrate basic aspects like the preparation of the local database, expression profiling, or differential expression analysis as well as more advanced features such as quantification of exogenous RNA content and data analysis in non-model species.
Affiliation(s)
- Cristina Gómez-Martín
- Department of Pathology, Cancer Center Amsterdam, Amsterdam UMC, VU University, Amsterdam, The Netherlands
2
Ullah A, Chakir A. Improvement for tasks allocation system in VM for cloud datacenter using modified bat algorithm. Multimed Tools Appl 2022; 81:29443-29457. [PMID: 35401026 PMCID: PMC8977130 DOI: 10.1007/s11042-022-12904-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Revised: 02/25/2022] [Accepted: 03/09/2022] [Indexed: 06/14/2023]
Abstract
Since its inception, cloud computing has greatly transformed our lives by connecting the entire world through shared computational resources over the internet. The COVID-19 pandemic has also disrupted traditional learning and business and led us towards an era of cloud-based activities. The virtual machine (VM) is one of the main elements of virtualization in cloud computing; it represents a physical server as a virtual machine. Effective utilization of these VMs is important for achieving an effective task scheduling mechanism in a cloud environment. This paper focuses on improving the task distribution system across VMs for cloud computing using a load balancing technique. To that end, the first modification was made to the fitness function value of the Bat algorithm, which is used in the load balancer section: once the algorithm's iterations are complete, the tasks are distributed among the different VMs, so this part of the algorithm was modified. The second modification was made to the search process of the Bat algorithm in the dimension section. The proposed algorithm is known as the modified Bat algorithm. Four parameters are used to check the performance of the system: throughput, makespan, degree of imbalance, and processing time. The proposed algorithm provides efficient results compared to other standard techniques. Hence the proposed algorithm improved cloud data center accuracy and efficiency.
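To make three of the evaluation parameters named in this abstract concrete, the following minimal sketch computes makespan, throughput, and degree of imbalance for one task-to-VM assignment. The abstract does not give formulas, so the definitions used here (makespan as the largest VM completion time, throughput as tasks per unit of makespan, degree of imbalance as (T_max - T_min) / T_avg) are commonly used conventions assumed for illustration. The function names and the workload are hypothetical, and this is not the modified Bat algorithm itself.

```python
from typing import Dict, List


def vm_completion_times(task_lengths: List[float],
                        assignment: List[int],
                        vm_speeds: List[float]) -> List[float]:
    """Total time each VM needs to finish the tasks assigned to it."""
    times = [0.0] * len(vm_speeds)
    for length, vm in zip(task_lengths, assignment):
        times[vm] += length / vm_speeds[vm]
    return times


def scheduling_metrics(task_lengths: List[float],
                       assignment: List[int],
                       vm_speeds: List[float]) -> Dict[str, float]:
    """Makespan, throughput and degree of imbalance (assumed definitions)."""
    times = vm_completion_times(task_lengths, assignment, vm_speeds)
    makespan = max(times)
    mean_time = sum(times) / len(times)
    return {
        "makespan": makespan,
        "throughput": len(task_lengths) / makespan,
        "degree_of_imbalance": (max(times) - min(times)) / mean_time,
    }


# Hypothetical workload: four tasks (in instructions) on two VMs
# (in instructions per second); assignment[i] is the VM index of task i.
tasks = [400.0, 200.0, 200.0, 100.0]
speeds = [100.0, 50.0]
metrics = scheduling_metrics(tasks, [0, 0, 0, 1], speeds)
print(metrics)
```

A load balancer such as the one described would search over `assignment` vectors to lower the makespan and the degree of imbalance; here the assignment is fixed by hand.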
Affiliation(s)
- Arif Ullah
- Department of Computing, Riphah International University, Faisalabad, Punjab 44000 Faisalabad, Pakistan
- Aziza Chakir
- Faculty of Law, Economics and Social Sciences, Hassan II University, Casablanca, Morocco
3
Landman T, Nissim N. Deep-Hook: A trusted deep learning-based framework for unknown malware detection and classification in Linux cloud environments. Neural Netw 2021; 144:648-685. [PMID: 34656885 DOI: 10.1016/j.neunet.2021.09.019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/31/2020] [Revised: 09/18/2021] [Accepted: 09/20/2021] [Indexed: 11/30/2022]
Abstract
Since the beginning of the 21st century, the use of cloud computing has increased rapidly, and it currently plays a significant role among most organizations' information technology (IT) infrastructure. Virtualization technologies, particularly virtual machines (VMs), are widely used and lie at the core of cloud computing. While different operating systems can run on top of VM instances, in public cloud environments the Linux operating system is used 90% of the time. Because of their prevalence, organizational Linux-based virtual servers have become an attractive target for cyber-attacks, mainly launched by sophisticated malware designed to cause harm, sabotage operations, obtain data, or gain financial profit. This has resulted in the need for an advanced and reliable unknown malware detection mechanism for Linux cloud-based environments. Antivirus software and today's even more advanced malware detection solutions have limitations in detecting new, unseen, and evasive malware. Moreover, many existing solutions are considered untrusted, as they operate on the inspected machine and can be interfered with, and can even be detected by the malware itself, allowing malware to evade detection and cause damage. In this paper, we propose Deep-Hook, a trusted framework for unknown malware detection in Linux-based cloud environments. Deep-Hook hooks the VM's volatile memory in a trusted manner and acquires the memory dump to discover malware footprints while the VM operates. The memory dumps are transformed into visual images which are analyzed using a convolutional neural network (CNN) based classifier. The proposed framework has some key advantages, such as its agility, its ability to eliminate the need for features defined by a cyber domain expert, and most importantly, its ability to analyze the entire memory dump and thus better utilize the indications it conceals, allowing the induction of a more accurate detection model.
Deep-Hook was evaluated on widely used Linux virtual servers; four state-of-the-art CNN architectures; eight image resolutions; and a total of 22,400 volatile memory dumps representing the execution of a broad set of benign and malicious Linux applications. Our experimental evaluation results demonstrate Deep-Hook's ability to effectively, efficiently, and accurately detect and classify unknown malware (even evasive malware like rootkits), with an AUC and accuracy of up to 99.9%.
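The abstract states that memory dumps are transformed into images before classification but does not describe the mapping. The sketch below shows one simple possibility, assumed purely for illustration: each byte becomes one grayscale pixel (0-255), laid out row by row at a fixed width. The function name, the zero-padding of the last row, and the tiny input are hypothetical and are not taken from Deep-Hook.

```python
from typing import List


def bytes_to_grayscale(dump: bytes, width: int) -> List[List[int]]:
    """Arrange raw bytes row by row into a grid of 0-255 pixel values.

    The final row is zero-padded so that every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(dump)) % width
    padded = dump + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Hypothetical 10-byte "dump" rendered as a 4-pixel-wide image.
image = bytes_to_grayscale(bytes(range(10)), width=4)
for row in image:
    print(row)
```

In practice the grid would be resized to the input resolution expected by the CNN; the abstract mentions eight image resolutions without listing them.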
Affiliation(s)
- Tom Landman
- Malware Lab, Cyber Security Research Center, Ben-Gurion University of the Negev, Israel; Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Israel
- Nir Nissim
- Malware Lab, Cyber Security Research Center, Ben-Gurion University of the Negev, Israel; Department of Industrial Engineering and Management, Ben-Gurion University of the Negev, Israel
4
El Motaki S, Yahyaouy A, Gualous H, Sabor J. A new weighted fuzzy C-means clustering for workload monitoring in cloud datacenter platforms. Cluster Comput 2021; 24:3367-3379. [PMID: 34155435 PMCID: PMC8210524 DOI: 10.1007/s10586-021-03331-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/03/2020] [Revised: 05/30/2021] [Accepted: 06/01/2021] [Indexed: 05/25/2023]
Abstract
The rapid growth in virtualization solutions has driven the widespread adoption of cloud computing paradigms among various industries and applications. This has led to a growing need for XaaS solutions and equipment to enable teleworking. To meet this need, cloud operators and datacenters have to overcome several challenges related to continuity, the quality of services provided, data security, and anomaly detection issues. Mainly, anomaly detection methods play a critical role in detecting virtual machines' abnormal behaviours that can potentially violate service level agreements established with users. Unsupervised machine learning techniques are among the most commonly used technologies for implementing anomaly detection systems. This paper introduces a novel clustering approach for analyzing virtual machine behaviour while running workloads in a system based on resource usage details (such as CPU utilization and downtime events). The proposed algorithm is inspired by the intuitive mechanism of flocking birds in nature to form reasonable clusters. Each starling movement's direction depends on self-information and information provided by other close starlings during the flight. Analogically, after associating a weight with each data sample to guide the formation of meaningful groups, each data element determines its next position in the feature space based on its current position and surroundings. Based on a realistic dataset and clustering validity indices, the experimental evaluation shows that the new weighted fuzzy c-means algorithm provides interesting results and outperforms the corresponding standard algorithm (weighted fuzzy c-means).
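For orientation, the standard (baseline) weighted fuzzy c-means that the abstract compares against is usually written as below. This is the textbook formulation with per-sample weights, given here as an assumption; the abstract does not state the exact variant used, and the flocking-inspired position update of the proposed algorithm is not reproduced.

```latex
% Textbook weighted fuzzy c-means, shown for orientation only.
% x_j: data sample, w_j: sample weight, c_i: cluster centre,
% u_{ij}: membership of sample j in cluster i, m > 1: fuzzifier,
% C: number of clusters, N: number of samples.
\begin{align}
J &= \sum_{i=1}^{C} \sum_{j=1}^{N} w_j \, u_{ij}^{m} \, \lVert x_j - c_i \rVert^{2},
  \qquad \sum_{i=1}^{C} u_{ij} = 1 \ \text{for every } j, \\
u_{ij} &= \left[ \sum_{k=1}^{C}
  \left( \frac{\lVert x_j - c_i \rVert}{\lVert x_j - c_k \rVert} \right)^{\frac{2}{m-1}}
  \right]^{-1}, \\
c_i &= \frac{\sum_{j=1}^{N} w_j \, u_{ij}^{m} \, x_j}{\sum_{j=1}^{N} w_j \, u_{ij}^{m}}.
\end{align}
```

The two update rules are alternated until the memberships stop changing; the sample weights affect only the centre update, because each membership constraint applies to a single sample.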
Affiliation(s)
- Ali Yahyaouy
- University Sidi Mohamed Ben Abdellah, Fez, Morocco
5
Patel YS, Malwi Z, Nighojkar A, Misra R. Truthful online double auction based dynamic resource provisioning for multi-objective trade-offs in IaaS clouds. Cluster Comput 2021; 24:1855-1879. [PMID: 33456318 PMCID: PMC7799171 DOI: 10.1007/s10586-020-03225-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2020] [Revised: 12/11/2020] [Accepted: 12/15/2020] [Indexed: 06/12/2023]
Abstract
Auction designs have recently been adopted for static and dynamic resource provisioning in IaaS clouds, such as Microsoft Azure and Amazon EC2. However, the existing mechanisms are mostly restricted to simple auctions, single-objective, offline setting, one-sided interactions either among cloud users or cloud service providers (CSPs), and possible misreports of cloud user's private information. This paper proposes a more realistic scenario of online auctioning for IaaS clouds, with the unique characteristics of elasticity for time-varying arrival of cloud user requests under the time-based server maintenance in cloud data centers. We propose an online truthful double auction technique for balancing the multi-objective trade-offs between energy, revenue, and performance in IaaS clouds, consisting of a weighted bipartite matching based winning-bid determination algorithm for resource allocation and a Vickrey-Clarke-Groves (VCG) driven algorithm for payment calculation of winning bids. Through rigorous theoretical analysis and extensive trace-driven simulation studies exploiting Google cluster workload traces, we demonstrate that our mechanism significantly improves the performance while promising truthfulness, heterogeneity, economic efficiency, and individual rationality, and that it has polynomial-time computational complexity.
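The two building blocks named in the abstract, matching-based winner determination and VCG payments, can be illustrated with a deliberately simplified sketch. The assumptions are loud ones: the auction here is one-sided and offline (only users bid, servers have no asks), the matching is found by brute force rather than by a polynomial-time bipartite matching algorithm, and all names and bid values are hypothetical. It shows the VCG idea only, namely that a winner pays the welfare loss its presence imposes on the other bidders, and is not the paper's mechanism.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Matching = Dict[int, Optional[int]]


def best_matching(values: List[List[float]],
                  users: List[int]) -> Tuple[float, Matching]:
    """Brute-force maximum-welfare matching of users to distinct servers.

    values[u][s] is user u's bid for server s; a user may stay unmatched (None).
    """
    n_servers = len(values[0]) if values else 0
    options: List[Optional[int]] = list(range(n_servers)) + [None] * len(users)
    best_welfare, best_match = -1.0, {}
    for perm in permutations(options, len(users)):
        welfare = sum(values[u][s] for u, s in zip(users, perm) if s is not None)
        if welfare > best_welfare:
            best_welfare, best_match = welfare, dict(zip(users, perm))
    return best_welfare, best_match


def vcg_payments(values: List[List[float]]) -> Tuple[Matching, Dict[int, float]]:
    """Winner determination plus VCG payment for every matched user."""
    users = list(range(len(values)))
    total, match = best_matching(values, users)
    payments: Dict[int, float] = {}
    for u in users:
        server = match[u]
        if server is None:
            continue  # losing bidders pay nothing
        others = [v for v in users if v != u]
        welfare_without_u, _ = best_matching(values, others)
        welfare_of_others = total - values[u][server]
        payments[u] = welfare_without_u - welfare_of_others
    return match, payments


# Hypothetical bids: three users, two servers.
bids = [[10.0, 6.0], [8.0, 7.0], [5.0, 4.0]]
match, payments = vcg_payments(bids)
print(match, payments)
```

In this example each winner pays less than its own bid for the server it receives, which is the individual-rationality property the abstract refers to.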
Affiliation(s)
- Yashwant Singh Patel
- Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihar, India
- Animesh Nighojkar
- Department of Computer Science and Engineering, University of South Florida, 4202 E Fowler Ave, Tampa, FL 33620 USA
- Rajiv Misra
- Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihar, India
6
Gómez-Martín C, Lebrón R, Oliver JL, Hackenberg M. Prediction of CpG Islands as an Intrinsic Clustering Property Found in Many Eukaryotic DNA Sequences and Its Relation to DNA Methylation. Methods Mol Biol 2019; 1766:31-47. [PMID: 29605846 DOI: 10.1007/978-1-4939-7768-0_3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
Abstract
The promoter region of around 70% of all genes in the human genome is overlapped by a CpG island (CGI). CGIs have known functions in the transcription initiation and outstanding compositional features like high G+C content and CpG ratios when compared to the bulk DNA. We have shown before that CGIs manifest as clusters of CpGs in mammalian genomes and can therefore be detected using clustering methods. These techniques have several advantages over sliding window approaches which apply compositional properties as thresholds. In this protocol we show how to determine local (CpG islands) and global (distance distribution) clustering properties of CG dinucleotides and how to generalize this analysis to any k-mer or combinations of it. In addition, we illustrate how to easily cross the output of a CpG island prediction algorithm with our methylation database to detect differentially methylated CGIs. The analysis is given in a step-by-step protocol and all necessary programs are implemented into a virtual machine or, alternatively, the software can be downloaded and easily installed.
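The idea that CpG islands appear as clusters of CpGs can be sketched in a few lines: locate every CG dinucleotide, look at the distances between neighbouring CpGs, and merge those closer than a threshold. This is only a toy version under stated assumptions; the function names, the fixed `max_gap` threshold, and the example sequence are hypothetical, and the published protocol's threshold selection and statistical filtering are not reproduced here.

```python
from typing import List, Tuple


def cpg_positions(seq: str) -> List[int]:
    """0-based start positions of every CG dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def neighbour_distances(positions: List[int]) -> List[int]:
    """Distances between consecutive CpGs (the global distance distribution)."""
    return [b - a for a, b in zip(positions, positions[1:])]


def cluster_cpgs(positions: List[int], max_gap: int,
                 min_cpgs: int = 2) -> List[Tuple[int, int, int]]:
    """Merge consecutive CpGs whose start positions are at most max_gap apart.

    Returns (start, end, n_cpgs) tuples, with end exclusive.
    """
    clusters: List[Tuple[int, int, int]] = []
    current: List[int] = []
    for pos in positions:
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_cpgs:
                clusters.append((current[0], current[-1] + 2, len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_cpgs:
        clusters.append((current[0], current[-1] + 2, len(current)))
    return clusters


# Hypothetical sequence: a dense CpG stretch, a long gap, two looser CpGs.
seq = "AT" + "CG" * 3 + "A" + "T" * 13 + "CG" + "AAA" + "CG" + "TT"
positions = cpg_positions(seq)
print(neighbour_distances(positions))
print(cluster_cpgs(positions, max_gap=5))
```

Replacing the literal "CG" with another k-mer gives the generalization to arbitrary k-mers that the abstract mentions.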
Affiliation(s)
- Cristina Gómez-Martín
- Department of Genetics, Faculty of Science, University of Granada, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Granada, Spain
- Ricardo Lebrón
- Department of Genetics, Faculty of Science, University of Granada, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Granada, Spain
- José L Oliver
- Department of Genetics, Faculty of Science, University of Granada, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Granada, Spain
- Michael Hackenberg
- Department of Genetics, Faculty of Science, University of Granada, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Granada, Spain
7
Strozzi F, Janssen R, Wurmus R, Crusoe MR, Githinji G, Di Tommaso P, Belhachemi D, Möller S, Smant G, de Ligt J, Prins P. Scalable Workflows and Reproducible Data Analysis for Genomics. Methods Mol Biol 2019; 1910:723-745. [PMID: 31278683 PMCID: PMC7613310 DOI: 10.1007/978-1-4939-9074-0_24] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 01/13/2023]
Abstract
Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to large volumes, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems and how these follow different approaches to tackle execution and reproducibility issues. We show how any researcher can create a reusable and reproducible bioinformatics pipeline that can be deployed and run anywhere. We show how to create a scalable, reusable, and shareable workflow using four different workflow engines: the Common Workflow Language (CWL), Guix Workflow Language (GWL), Snakemake, and Nextflow, each of which can be run in parallel. We show how to bundle a number of tools used in evolutionary biology by using Debian, GNU Guix, and Bioconda software distributions, along with the use of container systems, such as Docker, GNU Guix, and Singularity. Together these distributions represent the overall majority of software packages relevant for biology, including PAML, Muscle, MAFFT, MrBayes, and BLAST. By bundling software in lightweight containers, they can be deployed on a desktop, in the cloud, and, increasingly, on compute clusters. By bundling software through these public software distributions, and by creating reproducible and shareable pipelines using these workflow engines, not only do bioinformaticians spend less time reinventing the wheel, but we also get closer to the ideal of making science reproducible. The examples in this chapter allow a quick comparison of different solutions.
8
Verhoeven A, Giera M, Mayboroda OA. KIMBLE: A versatile visual NMR metabolomics workbench in KNIME. Anal Chim Acta 2018; 1044:66-76. [PMID: 30442406 DOI: 10.1016/j.aca.2018.07.070] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/09/2018] [Revised: 07/25/2018] [Accepted: 07/26/2018] [Indexed: 01/17/2023]
Abstract
The problem of reproducibility of scientific research is a serious issue in biomedical sciences. In addition to experimental repeatability, limiting the (pre-) analytical variance is also essential. To address this problem in the field of metabolomics, we have designed KIMBLE, the KNIME-based Integrated MetaBoLomics Environment, a novel platform for the processing and analysis of NMR metabolomics data. It consists of an elaborate NMR metabolomics workflow in the KNIME workflow management system that handles both targeted and untargeted metabolomics. The workflow provides a self-documenting way of transforming raw time-domain NMR data into metabolic insights. Parameters for the quantification of a number of interesting metabolites in urine are included in the workflow, and several useful statistical analysis and visualization tools are incorporated as well. The workflow comes with an interesting sports-induced ketosis dataset so that new users can easily get acquainted with the platform. The user is free to adapt and extend the workflow to his or her personal needs. The KIMBLE workflow, the KNIME software and all the required libraries are installed in a VirtualBox virtual machine that allows for facile installation and use by non-experts.
Affiliation(s)
- Aswin Verhoeven
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Albinusdreef 2, 2333ZA, Leiden, The Netherlands
- Martin Giera
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Albinusdreef 2, 2333ZA, Leiden, The Netherlands
- Oleg A Mayboroda
- Center for Proteomics and Metabolomics, Leiden University Medical Center, Albinusdreef 2, 2333ZA, Leiden, The Netherlands
9
Abstract
High-throughput sequencing (HTS) data for small RNAs (noncoding RNA molecules that are 20-250 nucleotides in length) can now be routinely generated by minimally equipped wet laboratories; however, the bottleneck in HTS-based research has now shifted to the analysis of such huge amounts of data. One of the reasons is that many analysis types require a Linux environment, but computers, system administrators, and bioinformaticians entail additional costs that often cannot be afforded by small to mid-sized groups or laboratories. Web servers are an alternative that can be used if the data are not subject to privacy restrictions (which is very often an important issue with medical data). However, in any case they are less flexible than stand-alone programs, limiting the number of workflows and analysis types that can be carried out. We show in this protocol how virtual machines can be used to overcome those problems and limitations. sRNAtoolboxVM is a virtual machine that can be executed on all common operating systems through virtualization programs like VirtualBox or VMware, providing the user with a large number of preinstalled programs like sRNAbench for small RNA analysis without the need to maintain additional servers and/or operating systems.
Affiliation(s)
- Cristina Gómez-Martín
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071, Granada, Spain
- Ricardo Lebrón
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Avda. del Conocimiento s/n, Granada, Spain
- Antonio Rueda
- Queen Mary University of London, Dawson Hall, Charterhouse Square, London, UK
- José L Oliver
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Avda. del Conocimiento s/n, Granada, Spain
- Michael Hackenberg
- Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, Campus de Fuentenueva s/n, 18071, Granada, Spain; Lab. de Bioinformática, Centro de Investigación Biomédica, PTS, Instituto de Biotecnología, Avda. del Conocimiento s/n, Granada, Spain
10
Agrawal S, Arze C, Adkins RS, Crabtree J, Riley D, Vangala M, Galens K, Fraser CM, Tettelin H, White O, Angiuoli SV, Mahurkar A, Fricke WF. CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline. BMC Genomics 2017; 18:332. [PMID: 28449639 PMCID: PMC5408420 DOI: 10.1186/s12864-017-3717-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/27/2017] [Accepted: 04/21/2017] [Indexed: 11/11/2022]
Abstract
Background: The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics.
Results: CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. CloVR-Comparative runs reference-free multiple whole-genome alignments to determine unique, shared and core coding sequences (CDSs) and single nucleotide polymorphisms (SNPs). Output includes short summary reports and detailed text-based results files, graphical visualizations (phylogenetic trees, circular figures), and a database file linked to the Sybil comparative genome browser. Data up- and download, pipeline configuration and monitoring, and access to Sybil are managed through the CloVR-Comparative web interface. CloVR-Comparative and Sybil are distributed as part of the CloVR virtual appliance, which runs on local computers or the Amazon EC2 cloud. Representative datasets (e.g. 40 draft and complete Escherichia coli genomes) are processed in <36 h on a local desktop or at a cost of <$20 on EC2.
Conclusions: CloVR-Comparative allows anybody with Internet access to run comparative genomics projects, while eliminating the need for on-site computational resources and expertise.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3717-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Cesar Arze
- Institute for Genome Sciences, Baltimore, MD, USA
- David Riley
- Institute for Genome Sciences, Baltimore, MD, USA
- Kevin Galens
- Institute for Genome Sciences, Baltimore, MD, USA
- Claire M Fraser
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
- Hervé Tettelin
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA
- Owen White
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD, USA
- W Florian Fricke
- Institute for Genome Sciences, Baltimore, MD, USA; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, USA; Department of Nutrigenomics, University of Hohenheim, Stuttgart, Germany
11
Bonzon P. Towards neuro-inspired symbolic models of cognition: linking neural dynamics to behaviors through asynchronous communications. Cogn Neurodyn 2017; 11:327-353. [PMID: 28761554 PMCID: PMC5509613 DOI: 10.1007/s11571-017-9435-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 12/29/2016] [Revised: 02/19/2017] [Accepted: 03/08/2017] [Indexed: 12/12/2022]
Abstract
A computational architecture modeling the relation between perception and action is proposed. Basic brain processes representing synaptic plasticity are first abstracted through asynchronous communication protocols and implemented as virtual microcircuits. These are used in turn to build mesoscale circuits embodying parallel cognitive processes. Encoding these circuits into symbolic expressions finally gives rise to neuro-inspired programs that are compiled into pseudo-code to be interpreted by a virtual machine. Quantitative evaluation measures are given by the modification of synapse weights over time. This approach is illustrated by models of simple forms of behaviors exhibiting cognition up to the third level of animal awareness. As a potential benefit, symbolic models of emergent psychological mechanisms could lead to the discovery of the learning processes involved in the development of cognition. The executable specifications of an experimental platform allowing for the reproduction of simulated experiments are given in the “Appendix”.
Affiliation(s)
- Pierre Bonzon
- Department of Information Systems, Faculty of HEC, University of Lausanne, 1015 Lausanne, Switzerland
12
Mitchener WG. Evolution of communication protocols using an artificial regulatory network. Artif Life 2014; 20:491-530. [PMID: 25148549 DOI: 10.1162/artl_a_00146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
I describe the Utrecht Machine (UM), a discrete artificial regulatory network designed for studying how evolution discovers biochemical computation mechanisms. The corresponding binary genome format is compatible with gene deletion, duplication, and recombination. In the simulation presented here, an agent consisting of two UMs, a sender and a receiver, must encode, transmit, and decode a binary word over time using the narrow communication channel between them. This communication problem has chicken-and-egg structure in that a sending mechanism is useless without a corresponding receiving mechanism. An in-depth case study reveals that a coincidence creates a minimal partial solution, from which a sequence of partial sending and receiving mechanisms evolve. Gene duplications contribute by enlarging the regulatory network. Analysis of 60,000 sample runs under a variety of parameter settings confirms that crossover accelerates evolution, that stronger selection tends to find clumsier solutions and finds them more slowly, and that there is implicit selection for robust mechanisms and genomes at the codon level. Typical solutions associate each input bit with an activation speed and combine them almost additively. The parents of breakthrough organisms sometimes have lower fitness scores than others in the population, indicating that populations can cross valleys in the fitness landscape via outlying members. The simulation exhibits back mutations and population-level memory effects not accounted for in traditional population genetics models. All together, these phenomena suggest that new evolutionary models are needed that incorporate regulatory network structure.