142 results for "Natalio Krasnogor"
Search Results
2. Toward Full-Stack In Silico Synthetic Biology: Integrating Model Specification, Simulation, Verification, and Biological Compilation
- Author
-
Boyang Peter Dun, Christophe Ladroue, Jamie Twycross, Harold Fellermann, Natalio Krasnogor, Savas Konur, Laurentiu Mierla, Anil Wipat, Bradley Brown, Sara Kalvala, and Marian Gheorghe
- Subjects
Software suite, Computer science, Interoperability, Biomedical Engineering, Synthetic biological circuit, System requirements specification, General Medicine, Biochemistry, Genetics and Molecular Biology (miscellaneous), Synthetic biology, Workflow, Workbench, SBML, Software engineering - Abstract
We present the Infobiotics Workbench (IBW), a user-friendly, scalable, and integrated computational environment for the computer-aided design of synthetic biological systems. It supports an iterative workflow that begins with specification of the desired synthetic system, followed by simulation and verification of the system in high-performance environments and ending with the eventual compilation of the system specification into suitable genetic constructs. IBW integrates modeling, simulation, verification, and biocompilation features into a single software suite. This integration is achieved through a new domain-specific biological programming language, the Infobiotics Language (IBL), which tightly combines these different aspects of in silico synthetic biology into a full-stack integrated development environment. Unlike existing synthetic biology modeling or specification languages, IBL uniquely blends modeling, verification, and biocompilation statements into a single file. This allows biologists to incorporate design constraints within the specification file rather than using decoupled and independent formalisms for different in silico analyses. This novel approach offers seamless interoperability across different tools as well as compatibility with SBOL and SBML frameworks and removes the burden of doing manual translations for standalone applications. We demonstrate the features, usability, and effectiveness of IBW and IBL using well-established synthetic biological circuits.
- Published
- 2021
3. For the sake of the Bioeconomy: define what a Synthetic Biology Chassis is!
- Author
-
Natalio Krasnogor, Víctor de Lorenzo, and Markus Schmidt
- Subjects
Chassis, Traceability, Pseudomonas putida, Computer science, Bioengineering, General Medicine, Data science, Unique identifier, Jargon, Synthetic biology, Molecular Biology, Biotechnology - Abstract
At the onset of the 4th Industrial Revolution, the role of synthetic biology (SynBio) as a fuel for the bioeconomy requires clarification of the terms typically adopted by this growing scientific-technical field. The concept of the chassis as a defined, reusable biological frame where non-native components can be plugged in and out to create new functionalities lies at the boundary between frontline bioengineering and more traditional recombinant DNA technology. As synthetic biology leaves academic laboratories and starts penetrating industrial and environmental realms, regulatory agencies demand clear definitions and descriptions of SynBio constituents, processes and products. In this article, the state of the ongoing discussion on what constitutes a chassis is reviewed, a non-equivocal nomenclature for the jargon in use is proposed, and objective criteria are recommended for distinguishing SynBio agents from traditional GMOs. The use of genomic barcodes as unique identifiers is strongly advocated. Finally, the soil bacterium Pseudomonas putida is presented as an example of the roadmap an environmental isolate may follow to become a bona fide SynBio chassis.
- Published
- 2021
4. Automatic Tuning of Rule-Based Evolutionary Machine Learning via Problem Structure Identification
- Author
-
Maria A. Franco, Natalio Krasnogor, and Jaume Bacardit
- Subjects
Structure (mathematical logic), Computer science, Value (computer science), Binary number, Rule-based system, Machine learning, Theoretical Computer Science, Variety (cybernetics), Identification (information), Artificial intelligence, Noise (video), Function (engineering) - Abstract
The success of any machine learning technique depends on the correct setting of its parameters and, when it comes to large-scale datasets, hand-tuning these parameters becomes impractical. However, very large datasets can be pre-processed to distil information that helps in appropriately setting various system parameters, which in turn makes sophisticated machine learning methods easier for end-users to apply. By modelling the performance of machine learning algorithms as a function of the structure inherent in very large datasets, one could, in principle, detect "hotspots" in the parameter space and auto-tune machine learning algorithms for better dataset-specific performance. In this work we present a parameter setting mechanism for a rule-based evolutionary machine learning system that is capable of finding an adequate parameter value for a wide variety of synthetic classification problems with binary attributes, with and without added noise. Moreover, in the final validation stage our automated mechanism reduces the computational time of preliminary experiments by up to 71% on a challenging real-world bioinformatics dataset.
- Published
- 2020
5. AREA: An adaptive reference-set based evolutionary algorithm for multiobjective optimisation
- Author
-
Shengxiang Yang, Marcus Kaiser, Mingjun Zhong, Jinglei Guo, Shouyong Jiang, Natalio Krasnogor, and Hongru Li
- Subjects
Mathematical optimization, Multiobjective optimisation, Information Systems and Management, Computer science, Population, Evolutionary algorithm, Multi-objective optimization, Theoretical Computer Science, Set (abstract data type), Artificial Intelligence, Decomposition (computer science), Sensitivity (control systems), Adaptation (computer science), Local mating, Search target, Pareto front, Computer Science Applications, Range (mathematics), Control and Systems Engineering, Reference set, Software - Abstract
The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link. Population-based evolutionary algorithms have great potential to handle multiobjective optimisation problems. However, the performance of these algorithms depends largely on problem characteristics, and there is a need to improve them for wide applicability. References, often specified by the decision maker's preference in different forms, are very effective at boosting the performance of algorithms. This paper proposes a novel framework for the effective use of references to strengthen algorithms. The framework treats references as search targets that can be adjusted based on the information collected during the search. It is combined with new strategies, such as reference adaptation and adaptive local mating, to solve different types of problems. The proposed algorithm is compared with the state of the art on a wide range of problems with diverse characteristics. The comparison and an extensive sensitivity analysis demonstrate that the proposed algorithm is competitive and robust across the different types of problems studied in this paper.
- Published
- 2020
6. NIHBA: a network interdiction approach for metabolic engineering design
- Author
-
Yong Wang, Natalio Krasnogor, Shouyong Jiang, and Marcus Kaiser
- Subjects
Statistics and Probability, Mathematical optimization, Computer science, Models, Biological, Biochemistry, Bilevel optimization, Overhead (computing), Production (economics), Molecular Biology, Systems Biology, Computational Biology, Interdiction, Computer Science Applications, Flux balance analysis, Computational Mathematics, Metabolic Engineering, Computational Theory and Mathematics, Algorithms, Metabolic Networks and Pathways, Software - Abstract
Motivation: Flux balance analysis (FBA) based bilevel optimization has been a great success in redesigning metabolic networks for biochemical overproduction. To date, many computational approaches have been developed to solve the resulting bilevel optimization problems. However, most of them are of limited use due to a biased optimality principle, poor scalability with the size of metabolic networks, potential numeric issues or the low number of design solutions obtained in a single run. Results: Here, we have employed a network interdiction model free of growth optimality assumptions, a special case of bilevel optimization, for computational strain design and have developed a hybrid Benders algorithm (HBA) that deals with the complicating binary variables in the model, thereby achieving high efficiency without numeric issues in the search for the best design strategies. More importantly, HBA can list solutions that meet users' production requirements during the search, making it possible to obtain numerous design strategies at a small runtime overhead (typically ∼1 h for the cases studied in this article). Availability and implementation: Source code implemented in the MATLAB COBRA Toolbox is freely available at https://github.com/chang88ye/NIHBA. Contact: math4neu@gmail.com or natalio.krasnogor@ncl.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2020
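The network interdiction idea behind NIHBA can be illustrated in miniature: knocking out a reaction is analogous to removing an edge from a flux network and observing how the achievable flux changes. The sketch below is an editorial illustration only (not NIHBA's hybrid Benders algorithm; the toy network and all names are invented): it enumerates single-edge removals over a tiny capacity graph and reports the most damaging one, using a plain Edmonds-Karp max-flow.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    # build residual capacities, adding zero-capacity reverse edges
    res = {u: dict(vs) for u, vs in cap.items()}
    for u, vs in cap.items():
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # collect the path, find its bottleneck, and push flow along it
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        flow += push

def best_interdiction(cap, s, t):
    """Try removing each single edge; return the removal that hurts flow most."""
    base = max_flow(cap, s, t)
    results = {}
    for u in list(cap):
        for v in list(cap[u]):
            reduced = {x: {y: c for y, c in ys.items() if (x, y) != (u, v)}
                       for x, ys in cap.items()}
            results[(u, v)] = max_flow(reduced, s, t)
    worst = min(results, key=results.get)
    return base, worst, results
```

For a toy "substrate to biomass" network `{'glc': {'a': 4, 'b': 3}, 'a': {'bio': 3, 'b': 2}, 'b': {'bio': 4}}`, the baseline flow is 7 and removing `('glc', 'a')` is among the most damaging knockouts; a genome-scale version of this brute-force enumeration is exactly what NIHBA's decomposition is designed to avoid.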
7. Linking Engineered Cells to Their Digital Twins: A Version Control System for Strain Engineering
- Author
-
Paweł Widera, Jonathan Tellechea-Luzardo, Natalio Krasnogor, Charles Winterhalter, Víctor de Lorenzo, Jerzy Kozyra, Engineering and Physical Sciences Research Council (UK), Royal Academy of Engineering, and European Commission
- Subjects
Computer science, Biomedical Engineering, Revision control, Biochemistry, Genetics and Molecular Biology (miscellaneous), Set (abstract data type), Synthetic biology, Strain engineering, Software, DNA barcode, Escherichia coli, DNA Barcoding, Taxonomic, Clustered Regularly Interspaced Short Palindromic Repeats, Recombination, Genetic, Version control system, Scale (chemistry), Sequence Analysis, DNA, General Medicine, Digital twin, Reproducibility, Biological engineering, VCS, Microorganisms, Genetically-Modified, Genetic Engineering, Software engineering, Biotechnology, Bacillus subtilis, Agile software development - Abstract
As DNA sequencing and synthesis become cheaper and more easily accessible, the scale and complexity of biological engineering projects is set to grow. Yet, although there is an accelerating convergence between biotechnology and digital technology, a deficit in software and laboratory techniques diminishes the ability to make biotechnology more agile, reproducible, and transparent while, at the same time, limiting the security and safety of synthetic biology constructs. To partially address some of these problems, this paper presents an approach for physically linking engineered cells to their digital footprint—we called it digital twinning. This enables the tracking of the entire engineering history of a cell line in a specialized version control system for collaborative strain engineering via simple barcoding protocols., J.T.L., C.W., J.K., and N.K. were supported by the UK Engineering and Physical Research Council under project “Synthetic Portabolomics: Leading the way at the crossroads of the Digital and the Bio Economies (EP/N031962/1)”. N.K. is funded by a Royal Academy of Engineering Chair in Emerging Technology award. V.d.L. was supported by project “BioRoboost (H2020-NMBP-BIO-CSA-2018, grant agreement N820699)”.
- Published
- 2020
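The version control system described above links wet-lab strains to a digital engineering history. As a hedged sketch of the core bookkeeping only (the paper's actual data model is not given in this listing; every function and field name below is invented for illustration), a strain history can be modelled as a git-style chain of content-addressed commits:

```python
import hashlib
import json

def commit(parent_hash, author, message, change):
    """Create a content-addressed record of one strain-engineering step.

    `change` is any JSON-serialisable description of the DNA edit,
    e.g. {"op": "insert", "part": "barcode_1", "locus": "amyE"}
    (all hypothetical field names).
    """
    record = {"parent": parent_hash, "author": author,
              "message": message, "change": change}
    digest = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return digest, record

def verify_chain(head, store):
    """Walk parent links from `head`, re-hashing every record.

    Any tampering with a stored record changes its hash and breaks
    the chain, which is what makes the history trustworthy.
    """
    h = head
    while h is not None:
        record = store[h]
        redo = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        if redo != h:
            return False
        h = record["parent"]
    return True
```

The head hash plays the role of the physical barcode: sequencing the barcode in a cell recovers a pointer into its complete digital history.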
8. Linking Engineered Cells to Their Digital Twins: a Version Control System for Strain Engineering
- Author
-
Paweł Widera, Natalio Krasnogor, Víctor de Lorenzo, and Jonathan Tellechea-Luzardo
- Subjects
Computer science, Scale (chemistry), Biological engineering, Set (abstract data type), Synthetic biology, Software, Strain engineering, Cell culture, Software engineering, Agile software development - Abstract
As DNA sequencing and synthesis become cheaper and more easily accessible, the scale and complexity of biological engineering projects is set to grow. Yet, although there is an accelerating convergence between biotechnology and computing science, a deficit in software and laboratory techniques diminishes the ability to make biotechnology more agile, reproducible and transparent while, at the same time, limiting the security and safety of synthetic biology constructs. To partially address some of these problems, this paper presents an approach for physically linking engineered cells to their digital footprint - we called it digital twinning. This enables the tracking of the entire engineering history of a cell line in a specialised version control system for collaborative strain engineering.
- Published
- 2019
9. NIHBA: A Network Interdiction Approach with Hybrid Benders Algorithm for Strain Design
- Author
-
Yong Wang, Shouyong Jiang, Marcus Kaiser, and Natalio Krasnogor
- Subjects
Computer science ,Production (economics) ,Overhead (computing) ,Interdiction ,Algorithm ,Flux balance analysis - Abstract
Flux balance analysis (FBA) based bilevel optimisation has been a great success in redesigning metabolic networks for biochemical overproduction. To date, many computational approaches have been developed to solve the resulting bilevel optimisation problems. However, most of them are of limited use due to a biased optimality principle, poor scalability with the size of metabolic networks, potential numeric issues, or the low number of design solutions obtained in a single run. In this work, we have employed a network interdiction model free of growth optimality assumptions, a special case of bilevel optimisation, for computational strain design and have developed a hybrid Benders algorithm (HBA) that deals with the complicating binary variables in the model, thereby achieving high efficiency without numeric issues in the search for the best design strategies. More importantly, HBA can list solutions that meet users' production requirements during the search, making it possible to obtain numerous design strategies at a small runtime overhead (typically ∼1 hour).
- Published
- 2019
10. A Scalable Test Suite for Continuous Dynamic Multiobjective Optimisation
- Author
-
Shengxiang Yang, Marcus Kaiser, Shouyong Jiang, Stefanos Kollias, and Natalio Krasnogor
- Subjects
Computer science, Neural and evolutionary computing, Machine learning, Multi-objective optimization, Computer Science Applications, Test (assessment), Human-Computer Interaction, Set (abstract data type), Empirical research, Control and Systems Engineering, Scalability, Test suite, Artificial intelligence, Electrical and Electronic Engineering, Software, Information Systems - Abstract
Dynamic multiobjective optimisation has gained increasing attention in recent years. Test problems are of great importance for facilitating the development of advanced algorithms that can handle dynamic environments well. However, many existing dynamic multiobjective test problems have not been rigorously constructed and analysed, which may induce unexpected bias when they are used for algorithmic analysis. In this paper, some of these biases are identified after a review of widely used test problems. They include poor scalability of objectives and, more importantly, a problematic overemphasis of static properties rather than dynamics, making it difficult to draw accurate conclusions about the strengths and weaknesses of the algorithms studied. A diverse set of dynamics and features that a good test suite should have is then highlighted. We further develop a scalable continuous test suite, which includes a number of dynamics and features that have rarely been considered in the literature but frequently occur in real life. Empirical studies demonstrate that the proposed test suite is more challenging to the dynamic multiobjective optimisation algorithms found in the literature and can test algorithms in ways that existing test suites cannot.
- Published
- 2019
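For context, the best-known earlier dynamic test problems this abstract reacts against include the FDA family of Farina, Deb and Amato, in which the Pareto set drifts with a discretised time signal. A minimal Python transcription of FDA1 (parameter names follow the common convention; treat the details as illustrative rather than as the proposed suite):

```python
import math

def fda1(x, tau, n_t=10, tau_t=5):
    """FDA1 dynamic bi-objective test problem (minimise f1 and f2).

    x[0] lies in [0, 1]; the remaining variables in [-1, 1].
    The Pareto-optimal set is x[i] = G(t) for i >= 1, so the optimum
    moves every time the discretised time t advances.
    """
    t = (1.0 / n_t) * (tau // tau_t)   # time steps every tau_t generations
    G = math.sin(0.5 * math.pi * t)
    f1 = x[0]
    g = 1.0 + sum((xi - G) ** 2 for xi in x[1:])
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2
```

A point that is Pareto-optimal at `tau = 0` (where `G = 0`) scores strictly worse after the environment changes, which is exactly the property a dynamic algorithm must detect and track.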
11. Easybiotics: a GUI for 3D physical modelling of multi-species bacterial populations
- Author
-
Jonathan Naylor, Harold Fellermann, and Natalio Krasnogor
- Subjects
Statistics and Probability, Computer science, Process (engineering), Complex system, Biochemistry, User-Computer Interface, Software, Molecular Biology, Graphical user interface, Bacteria, Physical modelling, Computer Science Applications, Computational Mathematics, Workflow, Computational Theory and Mathematics, Virtual machine, Graph (abstract data type), Software engineering - Abstract
Motivation: 3D physical modelling is a powerful computational technique that allows for the simulation of complex systems such as consortia of mixed bacterial species. The complexities in physical modelling reside in the knowledge-intensive model building process and the computational expense of calculating numerical solutions. These models can offer insights into microbiology, both in understanding natural systems and as design tools for developing novel synthetic bacterial systems. Developing a robust synthetic system typically requires multiple iterations around the specify→design→build→test cycle to meet specifications. This process is laborious and expensive for both the computational and laboratory aspects, hence any improvement in any of the workflow steps would be welcome. We previously introduced Simbiotics, a powerful and flexible platform for designing and analyzing 3D simulations of mixed-species bacterial populations. Simbiotics requires programming experience to use, which creates a barrier to entry for new users. Results: To enable biologists who may not have programming skills to install and use Simbiotics, we present in this application note Easybiotics, a user-friendly graphical user interface for Simbiotics. Users may design, simulate and analyze models from within the graphical user interface, with features such as live graph plotting and parameter sweeps. Easybiotics provides full access to all of Simbiotics' simulation features, such as cell growth, motility and gene regulation. Availability and implementation: Easybiotics and Simbiotics are free to use under the GPL3.0 licence, and can be found at: http://ico2s.org/software/simbiotics.html. We also provide readily downloadable virtual machine sandboxes to facilitate rapid installation.
- Published
- 2018
12. Less detectable environmental changes in dynamic multiobjective optimisation
- Author
-
Marcus Kaiser, Shouyong Jiang, Jinglei Guo, Natalio Krasnogor, and Shengxiang Yang
- Subjects
Mathematical optimization, Dynamic multiobjective optimisation, Computer science, Less detectable environment, Environmental changes, Evolutionary algorithm, Algorithm design, Bridge (nautical) - Abstract
Multiobjective optimisation in dynamic environments is challenging due to the presence of dynamics in the problems in question. Whilst much progress has been made in benchmarks and algorithm design for dynamic multiobjective optimisation, there is a lack of work on the detectability of environmental changes and how it affects the performance of evolutionary algorithms. This gap is not intentional but is due to the unavailability of suitable test cases to study. To bridge the gap, this work presents several scenarios where environmental changes are less likely to be detected. Our experimental studies suggest that such less detectable environments pose a big challenge to evolutionary algorithms.
- Published
- 2018
13. The fittest, the common, and the dullest: Selection dynamics of exact autocatalytic replicators
- Author
-
Harold Fellermann, Ben Shirt-Ediss, and Natalio Krasnogor
- Subjects
Autocatalysis, Computer science, Survival of the fittest, Dynamics (mechanics), Statistical physics, Selection (genetic algorithm) - Published
- 2018
14. Strain Design as Multiobjective Network Interdiction Problem: A Preliminary Approach
- Author
-
Marcus Kaiser, Marina Torres, Natalio Krasnogor, Shouyong Jiang, and David A. Pelta
- Subjects
Minimisation (psychology), Problem solver, Mathematical optimization, Computer science, Strain (biology), Numerical analysis, Sorting, Synthetic biological circuit, Interdiction, Task (project management), Genetic algorithm - Abstract
Computer-aided techniques have been widely applied to analyse the biological circuits of microorganisms and to facilitate the rational modification of metabolic networks for strain design, with the aim of maximising the production of desired biochemicals in metabolic engineering. Most existing computational methods for strain design formulate the network redesign as a bilevel optimisation problem. While such methods have shown great promise, this paper employs the idea of network interdiction to fulfil the task. Strain design is formulated as a Multiobjective Network Interdiction Problem (MO-NIP) in which two objectives (biomass and bioengineered product) are optimised simultaneously, in addition to the minimisation of the costs of genetic perturbations (design costs). An initial approach to solving the MO-NIP uses the Nondominated Sorting Genetic Algorithm II (NSGA-II). The examples shown demonstrate the usefulness of the proposed formulation for the MO-NIP and the feasibility of NSGA-II as a problem solver.
- Published
- 2018
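NSGA-II's defining step, reflected in its name, is fast nondominated sorting of the population into successive Pareto fronts. A compact, self-contained version for minimisation problems (an illustrative sketch, not the authors' implementation) is:

```python
def nondominated_sort(points):
    """Fast nondominated sorting for minimisation.

    Returns a list of fronts, each a list of indices into `points`:
    front 0 holds the nondominated solutions, front 1 those dominated
    only by front 0, and so on.
    """
    n = len(points)

    def dominates(p, q):
        # p dominates q if it is no worse everywhere and better somewhere
        return all(a <= b for a, b in zip(p, q)) and \
               any(a < b for a, b in zip(p, q))

    S = [[] for _ in range(n)]   # S[i]: solutions dominated by i
    count = [0] * n              # count[i]: how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                S[i].append(j)
            elif dominates(points[j], points[i]):
                count[i] += 1
        if count[i] == 0:
            fronts[0].append(i)
    f = 0
    while fronts[f]:
        nxt = []
        for i in fronts[f]:
            for j in S[i]:
                count[j] -= 1
                if count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
        f += 1
    fronts.pop()  # drop the trailing empty front
    return fronts
```

In the MO-NIP setting each point would be a (negated biomass, negated product, design cost) vector for one set of knockouts; the first front is the trade-off surface reported to the metabolic engineer.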
15. Optimizing nucleic acid sequences for a molecular data recorder
- Author
-
Harold Fellermann, Natalio Krasnogor, Annunziata Lopiccolo, Jerzy Kozyra, and Ben Shirt-Ediss
- Subjects
Fold (higher-order function), Computer science, Evolutionary algorithm, Brute-force search, DNA sequencing, DNA computing, Nucleic acid, A-DNA, Algorithm - Abstract
We recently reported the design for a DNA nano-device that can record and store molecular signals. Here we present an evolutionary algorithm tailored to optimising nucleic acid sequences that predictively fold into our desired target structures. In our approach, a DNA device is first specified abstractly: the topology of the individual strands and their desired foldings into multi-strand complexes are described at the domain level. Initially, this design is decomposed into a set of pairwise strand interactions. Then, we optimize candidate domains such that the resulting sequences (a) fold with high accuracy into the desired target structures both individually and (b) jointly, and (c) show high affinity for desired binding partners while showing low affinity for any undesired partner. As an optimization heuristic we use a genetic algorithm that employs a linear combination of the above scores. Our algorithm was able to generate DNA sequences that satisfy all given criteria. Even though we cannot establish the theoretically achievable optima (as this would require exhaustive search), our solutions score 90% of an upper bound that ignores conflicting objectives. We envision that this approach can be generalized towards a broad class of toehold-mediated strand displacement systems.
- Published
- 2017
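The fitness idea in the abstract, high affinity for desired partners and low affinity for undesired ones, can be caricatured with a toy genetic algorithm over the four-letter DNA alphabet. This is an editorial sketch using a crude Watson-Crick match count as the score, not the thermodynamic folding predictions the paper relies on; all sequences and parameters are invented:

```python
import random

COMP = {"A": "T", "T": "A", "C": "G", "G": "C"}

def pairing(c, partner):
    """Count Watson-Crick pairs when `c` binds `partner` antiparallel."""
    return sum(1 for i in range(len(c)) if COMP[c[i]] == partner[-1 - i])

def fitness(c, desired, undesired):
    """Reward binding the desired partner, penalise the undesired one."""
    return pairing(c, desired) - pairing(c, undesired)

def evolve(desired, undesired, pop_size=30, generations=200, seed=1):
    """Truncation-selection GA with single-point mutation, elitist by design."""
    rng = random.Random(seed)
    n = len(desired)
    pop = ["".join(rng.choice("ATCG") for _ in range(n))
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda c: fitness(c, desired, undesired),
                        reverse=True)
        parents = scored[: pop_size // 2]   # keep the top half (elitism)
        children = []
        for p in parents:                   # one point mutation per child
            i = rng.randrange(n)
            child = p[:i] + rng.choice("ATCG") + p[i + 1:]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, desired, undesired))
```

The real objective in the paper is a linear combination of several such scores (individual folding, joint folding, desired and undesired affinities); the single-term toy above only shows the selection-and-mutation loop.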
16. Hard Data Analytics Problems Make for Better Data Analysis Algorithms: Bioinformatics as an Example
- Author
-
Jaume Bacardit, Paweł Widera, Nicola Lazzarini, and Natalio Krasnogor
- Subjects
Biological data, Information Systems and Management, Biodata, Process (engineering), Computer science, Data science, Field (computer science), Computer Science Applications, Variety (cybernetics), Metadata, Knowledge extraction, Data analysis, Information Systems - Abstract
Data mining and knowledge discovery techniques have greatly progressed in the last decade. They are now able to handle larger and larger datasets, process heterogeneous information, integrate complex metadata, and extract and visualize new knowledge. Often these advances were driven by new challenges arising from real-world domains, with biology and biotechnology a prime source of diverse and hard (e.g., high volume, high throughput, high variety, and high noise) data analytics problems. The aim of this article is to show the broad spectrum of data mining tasks and challenges present in biological data, and how these challenges have driven us over the years to design new data mining and knowledge discovery procedures for biodata. This is illustrated with the help of two kinds of case studies. The first kind is focused on the field of protein structure prediction, where we have contributed in several areas: by designing, through regression, functions that can distinguish between good and bad models of a protein's predicted structure; by creating new measures to characterize aspects of a protein's structure associated with individual positions in a protein's sequence, measures containing information that might be useful for protein structure prediction; and by creating accurate estimators of these structural aspects. The second kind of case study is focused on omics data analytics, a class of biological data characterized for having extremely high dimensionalities. Our methods were able not only to generate very accurate classification models, but also to discover new biological knowledge that was later ratified by experimentalists. 
Finally, we describe several strategies to tightly integrate knowledge extraction and data mining in order to create a new class of biodata mining algorithms that can natively embrace the complexity of biological data, efficiently generate accurate information in the form of classification/regression models, and extract valuable new knowledge. Thus, a complete data-to-information-to-knowledge pipeline is presented.
- Published
- 2014
17. Qualitative and Quantitative Analysis of Systems and Synthetic Biology Constructs using P Systems
- Author
-
Laurentiu Mierla, Natalio Krasnogor, Ciprian Dragomir, Florentin Ipate, Marian Gheorghe, and Savas Konur
- Subjects
Model checking, Theoretical computer science, Computer science, Systems biology, Green Fluorescent Proteins, Biomedical Engineering, Models, Biological, Biochemistry, Genetics and Molecular Biology (miscellaneous), Set (abstract data type), Synthetic biology, Computer Simulation, Formal verification, Stochastic Processes, Computational model, Models, Statistical, Bacteria, Mathematical model, Systems Biology, Quorum Sensing, General Medicine, Pseudomonas aeruginosa, Workbench, Artificial Cells, Synthetic Biology, Algorithms, Signal Transduction - Abstract
Computational models are perceived as an attractive alternative to mathematical models (e.g., ordinary differential equations). These models incorporate a set of methods for specifying, modeling, testing, and simulating biological systems. In addition, they can be analyzed using algorithmic techniques (e.g., formal verification). This paper shows how formal verification is utilized in systems and synthetic biology through qualitative vs quantitative analysis. We choose two well-known case studies: quorum sensing in Pseudomonas aeruginosa and the pulse generator. The paper reports the verification analysis of the two systems, carried out using several model checking tools integrated into the Infobiotics Workbench platform, where system models are based on stochastic P systems.
- Published
- 2014
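Stochastic P system models of the kind mentioned above are typically executed with a Gillespie-style stochastic simulation algorithm before or alongside model checking. A generic, minimal SSA (not the Infobiotics Workbench's engine; the reaction set in the usage below is an invented toy) looks like:

```python
import random

def gillespie(propensities, stoichiometry, state, t_end, seed=0):
    """Minimal Gillespie stochastic simulation algorithm.

    propensities:  list of functions a_j(state) -> reaction rate
    stoichiometry: list of per-reaction state-change tuples
    Returns the trajectory as a list of (time, state) pairs.
    """
    rng = random.Random(seed)
    t, state = 0.0, list(state)
    traj = [(0.0, tuple(state))]
    while True:
        props = [a(state) for a in propensities]
        total = sum(props)
        if total == 0.0:
            break                     # no reaction can fire any more
        t += rng.expovariate(total)   # waiting time to the next reaction
        if t > t_end:
            break
        r, acc = rng.random() * total, 0.0
        for j, p in enumerate(props): # choose reaction j with prob p/total
            acc += p
            if r < acc:
                break
        state = [x + d for x, d in zip(state, stoichiometry[j])]
        traj.append((t, tuple(state)))
    return traj
```

A single-species degradation model, `X -> 0` with propensity `1.0 * X`, exercises the whole loop: starting from 20 molecules the trajectory fires exactly 20 reactions and ends at zero.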
18. Heuristic for Maximizing DNA Reuse in Synthetic DNA Library Assembly
- Author
-
Natalio Krasnogor, Ofir Raz, Jonathan Blakes, Tuval Ben-Yehezkel, Jaume Bacardit, Uriel Feige, Paweł Widera, and Ehud Shapiro
- Subjects
Genetics, DNA synthesis, Computer science, Heuristic, Biomedical Engineering, DNA, General Medicine, Computational biology, Reuse, Biochemistry, Genetics and Molecular Biology (miscellaneous), Synthetic biology, Synthetic DNA, Synthetic Biology, Algorithms, Gene Library, Production rate - Abstract
De novo DNA synthesis is in need of new ideas for increasing production rate and reducing cost. DNA reuse in combinatorial library construction is one such idea. Here, we describe an algorithm for planning the multistage assembly of DNA libraries with shared intermediates that greedily attempts to maximize DNA reuse, and we show both theoretically and empirically that it runs in linear time. We compare solution quality and algorithmic performance to the best results reported for computing DNA assembly graphs, finding that our algorithm achieves solutions of equivalent quality but with dramatically shorter running times and substantially improved scalability. We also show that the related computational problem bounded-depth min-cost string production (BDMSP), which captures DNA library assembly operations with a simplified cost model, is NP-hard and APX-hard by reduction from vertex cover. The algorithm presented here provides solutions with a near-minimal number of stages and, thanks to almost instantaneous planning of DNA libraries, can be used as a metric of "manufacturability" to guide DNA library design. Rapid planning remains applicable even for DNA library sizes vastly exceeding today's biochemical assembly methods, future-proofing our method.
- Published
- 2014
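The reuse idea can be sketched with a byte-pair-style greedy planner: treat each library member as a sequence of parts and, at every stage, fuse the most frequent adjacent pair across the whole library, so that an intermediate shared by several targets is assembled only once. This is an editorial toy (part names invented), not the paper's linear-time algorithm or its cost model:

```python
from collections import Counter

def plan_assembly(targets):
    """Greedy multistage assembly plan with shared intermediates.

    `targets` are tuples of primitive part names. Each step fuses one
    adjacent pair everywhere it occurs, so the returned list of steps
    is the number of distinct binary assemblies actually performed.
    """
    seqs = [list(t) for t in targets]
    steps = []
    while any(len(s) > 1 for s in seqs):
        # count adjacent pairs across all partially assembled targets
        pairs = Counter()
        for s in seqs:
            for a, b in zip(s, s[1:]):
                pairs[(a, b)] += 1
        pair = pairs.most_common(1)[0][0]
        steps.append(pair)
        a, b = pair
        # fuse every occurrence of the chosen pair into one intermediate
        for s in seqs:
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [pair]
                i += 1
    return steps
```

For three hypothetical constructs sharing a promoter-gene intermediate, naive independent assembly needs six binary steps while the shared plan needs five; the paper's algorithm pursues the same kind of saving at library scale with a more refined cost model.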
19. GAssist vs. BioHEL: critical assessment of two paradigms of genetics-based machine learning
- Author
-
Maria A. Franco, Natalio Krasnogor, and Jaume Bacardit
- Subjects
Genetics, Learning classifier system, Computer science, Active learning (machine learning), Algorithmic learning theory, Stability (learning theory), Online machine learning, Multi-task learning, Computational intelligence, Machine learning, Generalization error, Theoretical Computer Science, Support vector machine, Computational learning theory, Unsupervised learning, Geometry and Topology, Artificial intelligence, Instance-based learning, Software - Abstract
This paper reports an exhaustive analysis performed over two specific Genetics-based Machine Learning systems: BioHEL and GAssist. These two systems share many mechanisms and operators, but at the same time they apply two different learning paradigms (the Iterative Rule Learning approach and the Pittsburgh approach, respectively). The aims of this paper are to: (a) propose standard configurations for handling small and large datasets, (b) compare the two systems in terms of learning capabilities, complexity of the obtained solutions and learning time, (c) determine the areas of the problem space where each of the two systems performs better, and (d) compare them with other well-known machine learning algorithms. The results show that it is possible to find standard configurations for both systems, and that with these configurations the systems perform up to the standards of other state-of-the-art machine learning algorithms such as Support Vector Machines. Moreover, we identify the problem domains where each of these systems has advantages and disadvantages, and propose ways to improve the systems based on this analysis.
- Published
- 2013
20. Exploring programmable self-assembly in non-DNA based molecular computing
- Author
-
Hector Zenil, Germán Terrazas, and Natalio Krasnogor
- Subjects
FOS: Computer and information sciences ,Computer science ,Computer Science - Artificial Intelligence ,Distributed computing ,Supramolecular chemistry ,Complex system ,FOS: Physical sciences ,0102 computer and information sciences ,02 engineering and technology ,Computational Complexity (cs.CC) ,Master plan ,01 natural sciences ,Computational Engineering, Finance, and Science (cs.CE) ,Cluster analysis ,Computer Science - Computational Engineering, Finance, and Science ,Scale (chemistry) ,Computational Physics (physics.comp-ph) ,021001 nanoscience & nanotechnology ,Computer Science Applications ,Computer Science - Computational Complexity ,Artificial Intelligence (cs.AI) ,010201 computation theory & mathematics ,Physics - Data Analysis, Statistics and Probability ,Theory of computation ,Self-assembly ,0210 nano-technology ,Physics - Computational Physics ,Data Analysis, Statistics and Probability (physics.data-an) - Abstract
Self-assembly is a phenomenon observed in nature at all scales, in which autonomous entities build complex structures without external influence or a centralised master plan. Modelling such entities and programming correct interactions among them is crucial for controlling the manufacture of desired complex structures at the molecular and supramolecular scale. This work focuses on a programmability model for non-DNA-based molecules and on the complex behaviour analysis of their self-assembled conformations. In particular, we look into the modelling, programming and simulation of porphyrin molecule self-assembly, and apply Kolmogorov complexity-based techniques to classify and assess simulation results in terms of information content. The analysis focuses on phase transitions, clustering, variability and parameter discovery, which as a whole pave the way to the notion of complex systems programmability.
- Published
- 2016
21. An Integrated In Silico Simulation and Biomatter Compilation Approach to Cellular Computation
- Author
-
Marian Gheorghe, Daven Sanassy, Laurentiu Marian Mierla, Natalio Krasnogor, Sara Kalvala, Harold Fellermann, Christophe Ladroue, and Savas Konur
- Subjects
0301 basic medicine ,Theoretical computer science ,Computer science ,business.industry ,Distributed computing ,Computation ,Information processing ,0102 computer and information sciences ,computer.software_genre ,01 natural sciences ,03 medical and health sciences ,Synthetic biology ,030104 developmental biology ,Software ,010201 computation theory & mathematics ,Logic gate ,Workbench ,Compiler ,business ,Membrane computing ,computer - Abstract
Recent advances in Synthetic Biology are ushering in a new practical computational substrate based on programmable information processing via biological cells. Owing to the difficulty of orchestrating complex programs on myriads of relatively simple, limited and highly stochastic processors such as living cells, robust computational technologies to specify, simulate, analyse and compile cellular programs are in demand. We provide the Infobiotics Workbench (Ibw), a software platform developed to model and analyse stochastic compartmentalised systems, which supports various computational techniques such as modelling, simulation, verification and biocompilation. We report here the details of our work on modelling, simulation and, for the first time, biocompilation, while verification is reported elsewhere in this book. We consider some basic genetic logic gates to illustrate the main features of the Ibw platform. Our results show that membrane computing provides a suitable formalism for building synthetic biology models. The software platform we developed permits the analysis of biological systems through the computational methods integrated into the workbench, providing significant advantages in terms of time and an enhanced understanding of biological functionality.
- Published
- 2016
22. In Vitro Implementation of a Stack Data Structure Based on DNA Strand Displacement
- Author
-
Harold Fellermann, Annunziata Lopiccolo, Natalio Krasnogor, and Jerzy Kozyra
- Subjects
0301 basic medicine ,business.industry ,Computer science ,02 engineering and technology ,021001 nanoscience & nanotechnology ,Signal ,Branch migration ,Displacement (vector) ,03 medical and health sciences ,030104 developmental biology ,Mode (computer interface) ,Stack (abstract data type) ,Dna assembly ,0210 nano-technology ,business ,Algorithm ,Computer hardware ,Dna strand displacement - Abstract
We present an implementation of an in vitro signal recorder based on DNA assembly and strand displacement. The signal recorder implements a stack data structure in which both data and operators are represented by single-stranded DNA "bricks". The stack grows by adding push and write bricks and shrinks in a last-in-first-out manner by adding pop and read bricks. We report the design of the signal recorder and its mode of operation, and give experimental results from capillary electrophoresis as well as transmission electron microscopy that demonstrate the capability of the device to store and later release several successive signals. We conclude by discussing potential future improvements to our current results.
- Published
- 2016
23. A genotype-phenotype-fitness assessment protocol for evolutionary self-assembly Wang tiles design
- Author
-
Germán Terrazas and Natalio Krasnogor
- Subjects
Control and Optimization ,Fitness function ,General Computer Science ,business.industry ,Fitness approximation ,Wang tile ,Computer science ,Evolutionary algorithm ,Interactive evolutionary computation ,Machine learning ,computer.software_genre ,Distance correlation ,Genetic algorithm ,Artificial intelligence ,business ,computer ,Selection (genetic algorithm) - Abstract
In a previous work we reported on the evolutionary design optimisation of self-assembling Wang tiles capable of arranging themselves into a target structure. Apart from the significant findings on how self-assembly is achieved, nothing has yet been said about the efficiency with which individuals were evolved, especially given that the mapping from genotype to phenotype, and from phenotype to fitness, is clearly a complex, stochastic and non-linear relationship. One common procedure would be to run many experiments for different configurations followed by a fitness comparison, which is not only time-consuming but also inaccurate for such intricate mappings. In this paper we report on a complementary dual assessment protocol to analyse whether our genetic algorithm, which uses morphological image analysis as its fitness function, is an effective methodology. We present fitness distance correlation to measure how well the fitness of an individual correlates with its genotypic distance to a known optimum, and introduce clustering as a mechanism to verify how effectively the objective function differentiates between dissimilar phenotypes and classifies similar ones for the purpose of selection.
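Fitness distance correlation itself is simply the Pearson correlation between fitness values and each genotype's distance to a known optimum. A minimal stdlib sketch (the function name and example values are ours, not the paper's):

```python
import math

def fitness_distance_correlation(fitnesses, distances):
    """Pearson correlation between fitness and genotypic distance to a
    known optimum. For a maximisation problem, values near -1 indicate
    that fitness improves as the search nears the optimum (an easy
    landscape); values near +1 indicate a deceptive one."""
    n = len(fitnesses)
    mf = sum(fitnesses) / n
    md = sum(distances) / n
    cov = sum((f - mf) * (d - md) for f, d in zip(fitnesses, distances)) / n
    sf = math.sqrt(sum((f - mf) ** 2 for f in fitnesses) / n)
    sd = math.sqrt(sum((d - md) ** 2 for d in distances) / n)
    return cov / (sf * sd)
```

For example, fitnesses `[10, 8, 6, 4]` at distances `[0, 1, 2, 3]` yield an FDC of exactly -1.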
- Published
- 2012
24. Analysing BioHEL using challenging boolean functions
- Author
-
Maria A. Franco, Natalio Krasnogor, and Jaume Bacardit
- Subjects
Class (set theory) ,Class (computer programming) ,Generality ,Fitness function ,Rule induction ,Computer science ,business.industry ,Cognitive Neuroscience ,Evolutionary algorithm ,Machine learning ,computer.software_genre ,Set (abstract data type) ,Mathematics (miscellaneous) ,Artificial Intelligence ,Scalability ,Default rule ,Computer Vision and Pattern Recognition ,Sensitivity (control systems) ,Artificial intelligence ,Boolean function ,business ,computer ,Mathematics - Abstract
In this work we present an extensive empirical analysis of the BioHEL genetics-based machine learning system using the k-Disjunctive Normal Form (k-DNF) family of boolean functions. These functions present a broad set of possible challenges for most machine learning techniques, such as different degrees of specificity, class imbalance and niche overlap. Moreover, as the ideal solutions are known, it is possible to assess if a learning system is able to find them, and how fast. Specifically, we study two aspects of BioHEL: its sensitivity to the coverage breakpoint parameter (that determines the degree of generality pressure applied by the fitness function) and the impact of the default rule policy. The results show that BioHEL is highly sensitive to the choice of coverage breakpoint and that using a default class suitable for the problem allows the system to learn faster than using other default class policies (e.g. the majority class policy). Moreover, the experiments indicate that BioHEL’s scalability depends directly on both k (the specificity of the k-DNF terms) and the number of terms in the problem. In the last part of the paper we discuss alternative policies to adjust the coverage breakpoint parameter.
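The k-DNF benchmark family is easy to generate exhaustively, which is what makes the ideal solutions known in advance. A small sketch of such a generator (function names are ours, not BioHEL's; each term covers 2^(n-k) of the 2^n instances, so class imbalance grows with k):

```python
import itertools
import random

def random_kdnf(num_attrs, k, num_terms, rng):
    """Sample a k-DNF: a disjunction of `num_terms` conjunctions, each
    fixing k distinct attributes to required boolean values."""
    terms = []
    for _ in range(num_terms):
        attrs = rng.sample(range(num_attrs), k)
        terms.append([(a, rng.randint(0, 1)) for a in attrs])
    return terms

def eval_kdnf(terms, x):
    """True iff any term's literals are all satisfied by instance x."""
    return any(all(x[a] == v for a, v in t) for t in terms)

def kdnf_dataset(num_attrs, terms):
    """Enumerate all 2^n boolean instances with their class label."""
    return [(x, eval_kdnf(terms, x))
            for x in itertools.product((0, 1), repeat=num_attrs)]
```

With a single 2-literal term over 3 attributes, exactly 2 of the 8 instances are positive, illustrating how specificity drives imbalance.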
- Published
- 2012
25. Towards the Design of Heuristics by Means of Self-Assembly
- Author
-
Natalio Krasnogor, Dario Landa-Silva, and Germán Terrazas
- Subjects
FOS: Computer and information sciences ,Computational model ,Generality ,021103 operations research ,Computer Science - Artificial Intelligence ,Computer science ,Heuristic ,business.industry ,lcsh:Mathematics ,0211 other engineering and technologies ,Computer Science - Neural and Evolutionary Computing ,Genetic programming ,02 engineering and technology ,lcsh:QA1-939 ,lcsh:QA75.5-76.95 ,Artificial Intelligence (cs.AI) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,lcsh:Electronic computers. Computer science ,Neural and Evolutionary Computing (cs.NE) ,Artificial intelligence ,Heuristics ,business - Abstract
Current investigations of hyper-heuristic design have sprung up in two different flavours: heuristics that choose heuristics and heuristics that generate heuristics. In the latter, the goal is to develop a problem-domain-independent strategy to automatically generate a well-performing heuristic for the problem at hand. This can be done, for example, by automatically selecting and combining different low-level heuristics into a problem-specific and effective strategy. Hyper-heuristics thus raise the level of generality of automated problem solving by attempting to select and/or generate tailored heuristics for the problem at hand, and approaches such as genetic programming have been proposed for this. In this paper, we explore an elegant nature-inspired alternative based on self-assembly construction processes, in which structures emerge out of local interactions between autonomous components. This idea arises from previous work in which computational models of self-assembly were subject to evolutionary design in order to perform the automatic construction of user-defined structures. The aim of this paper is therefore to present a novel methodology for the automated design of heuristics by means of self-assembly.
- Published
- 2010
26. Correction to Simbiotics: A Multiscale Integrative Platform for 3D Modeling of Bacterial Populations
- Author
-
Harold Fellermann, Miguel Cámara, Catherine A. Biggs, Yuchun Ding, Felix Dafhnis-Calas, Stephan Heeb, Jonathan Naylor, Natalio Krasnogor, Joy Mukherjee, Waleed K. Mohammed, Nicholas S. Jakubovics, and Phillip C. Wright
- Subjects
business.industry ,Computer science ,Biomedical Engineering ,General Medicine ,Computational biology ,3D modeling ,business ,Biochemistry, Genetics and Molecular Biology (miscellaneous) - Published
- 2018
27. A learning classifier system with mutual-information-based fitness
- Author
-
Jaume Bacardit, Max Kun Jiang, Michael Stout, Robert E. Smith, Natalio Krasnogor, and Jonathan D. Hirst
- Subjects
Learning classifier system ,Artificial neural network ,Computer science ,business.industry ,Cognitive Neuroscience ,Supervised learning ,Contrast (statistics) ,Mutual information ,Information theory ,Machine learning ,computer.software_genre ,Evolutionary computation ,Naive Bayes classifier ,ComputingMethodologies_PATTERNRECOGNITION ,Mathematics (miscellaneous) ,Artificial Intelligence ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,computer - Abstract
This paper introduces a new variety of learning classifier system (LCS), called MILCS, which utilizes mutual information as fitness feedback. Unlike most LCSs, MILCS is specifically designed for supervised learning. We present experimental results, and contrast them to results from XCS, UCS, GAssist, BioHEL, C4.5 and Naive Bayes. We discuss the explanatory power of the resulting rule sets. MILCS is also shown to promote the discovery of default hierarchies, an important advantage of LCSs. Final comments include future directions for this research, including investigations in neural networks and other systems.
- Published
- 2010
28. GP challenge: evolving energy function for protein structure prediction
- Author
-
Paweł Widera, Natalio Krasnogor, and Jonathan M. Garibaldi
- Subjects
business.industry ,Computer science ,Genetic programming ,Function (mathematics) ,Protein structure prediction ,Machine learning ,computer.software_genre ,Linear function ,Computer Science Applications ,Theoretical Computer Science ,Set (abstract data type) ,Hardware and Architecture ,Artificial intelligence ,CASP ,business ,Linear combination ,computer ,Software ,Energy (signal processing) - Abstract
One of the key elements in protein structure prediction is the ability to distinguish between good and bad candidate structures. This distinction is made by estimating the structure's energy. The energy function used in the best state-of-the-art automatic predictors competing in the most recent CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiment is defined as a weighted sum of a set of energy terms designed by experts. We hypothesised that combining these terms more freely would improve the prediction quality. To test this hypothesis, we designed a genetic programming algorithm to evolve the protein energy function. We compared the predictive power of the best evolved function against a linear combination of energy terms with weights optimised by the Nelder-Mead algorithm. The GP-based optimisation outperformed the optimised linear function. We have made the data used in our experiments publicly available in order to encourage others to further investigate this challenging problem using GP and other methods, and to attempt to improve on the results presented here.
- Published
- 2009
29. An unorthodox introduction to Memetic Algorithms
- Author
-
Natalio Krasnogor
- Subjects
Underpinning ,business.industry ,Natural computing ,Computer science ,Key (cryptography) ,Memetic algorithm ,General Medicine ,Artificial intelligence ,business - Abstract
Memetic Algorithms have become one of the key methodologies behind solvers that are capable of tackling very large, real-world optimisation problems. They are being actively investigated in research institutions as well as broadly applied in industry. This article provides a very short introduction to Memetic Algorithms. It is a condensed version of a chapter of the same title to appear in the Handbook of Natural Computing, published by Springer [35], which gives a pragmatic guide to the key design issues underpinning Memetic Algorithm (MA) engineering.
- Published
- 2008
30. Improving the scalability of rule-based evolutionary learning
- Author
-
Natalio Krasnogor, Jaume Bacardit, and Edmund K. Burke
- Subjects
Control and Optimization ,General Computer Science ,Bioinformatics ,business.industry ,Rule induction ,Computer science ,Evolutionary algorithm ,Feature selection ,Rule-based system ,Semi-supervised learning ,Evolutionary algorithms ,computer.software_genre ,Machine learning ,Support vector machine ,Protein structure prediction ,Scalability ,Artificial intelligence ,Data mining ,business ,Representation (mathematics) ,Evolutionary learning ,Learning classifier systems ,computer - Abstract
Evolutionary learning techniques are comparable in accuracy to other learning methods such as Bayesian learning and SVMs. These techniques often produce more interpretable knowledge than, e.g., SVMs; however, efficiency is a significant drawback. This paper presents a new representation motivated by our observation that Bioinformatics and Systems Biology often give rise to very large-scale datasets that are noisy, ambiguous and usually described by a large number of attributes. The crucial observation is that, in the most successful rules obtained for such datasets, only a few key attributes (from the large number of available ones) are expressed in a rule; hence automatically discovering these few key attributes, and only keeping track of them, contributes a substantial speed-up by avoiding useless match operations with irrelevant attributes. In effect, this procedure performs fine-grained feature selection at a rule-wise level, as the key attributes may differ for each learned rule. The proposed representation has been tested within the BioHEL machine learning system, and the experiments performed show not only that the representation has competent learning performance, but also that it reduces the system's run-time considerably: it is up to 2-3 times faster than state-of-the-art evolutionary learning representations designed specifically for efficiency.
- Published
- 2008
31. A tale of human-competitiveness in bioinformatics
- Author
-
Michael Stout, Jaume Bacardit, and Natalio Krasnogor
- Subjects
Theoretical computer science ,Computer science ,business.industry ,Open problem ,Evolutionary algorithm ,General Medicine ,Mutual information ,Protein structure prediction ,Bioinformatics ,Information theory ,Machine learning ,computer.software_genre ,Metric (mathematics) ,Domain knowledge ,Artificial intelligence ,business ,Representation (mathematics) ,computer - Abstract
A key open problem, which has defied scientists for decades, is predicting the 3D structure of a protein (Protein Structure Prediction, PSP) from its primary sequence: the amino acids that compose the protein chain. Full atomistic molecular dynamics simulations are, for all intents and purposes, impractical, as current empirical models may require massive computational resources. One possible way of alleviating this cost and making the problem easier is to simplify the protein representation over which the native 3D state is searched for. We have proposed a protocol based on evolutionary algorithms to perform this simplification of the protein representation. Our protocol does not use any domain knowledge; instead it uses a well-known information-theoretic metric, mutual information, to generate a reduced representation that maintains the crucial information needed for PSP. The evaluation of our method has shown that it generates alphabets whose performance is competitive with the original, non-simplified representation. Moreover, these reduced alphabets achieve better-than-human performance when compared with some classic reduced alphabets.
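The idea of judging a reduced alphabet by the mutual information it preserves can be shown with a toy example (the residue grouping and labels below are illustrative only, not the evolved alphabets of the paper):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def reduce_alphabet(seq, groups):
    """Map each residue to its group label, e.g. hydrophobic vs polar."""
    return [groups[r] for r in seq]
```

In the test below, collapsing four residues into two groups loses no mutual information with the (toy) structural labels, i.e. the reduction keeps the crucial information.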
- Published
- 2008
32. A Genetic Algorithm Approach to Probing the Evolution of Self-Organized Nanostructured Systems
- Author
-
Peter Siepmann, Ioan Vancea, Philip Moriarty, C. P. Martin, and Natalio Krasnogor
- Subjects
Self-organization ,Matching (graph theory) ,Computer science ,Mechanical Engineering ,Monte Carlo method ,Nanoparticle ,Bioengineering ,General Chemistry ,Parameter space ,Condensed Matter Physics ,Colloidal Solution ,Solid substrate ,Models, Chemical ,Genetic algorithm ,Genetics ,Nanoparticles ,Computer Simulation ,General Materials Science ,Colloids ,Statistical physics ,Biological system ,Monte Carlo Method ,Algorithms - Abstract
We present a new methodology, based on a combination of genetic algorithms and image morphometry, for matching the outcome of a Monte Carlo simulation to experimental observations of a far-from-equilibrium nanosystem. The Monte Carlo model used simulates a colloidal solution of nanoparticles drying on a solid substrate and has previously been shown to produce patterns very similar to those observed experimentally. Our approach enables the broad parameter space associated with simulated nanoparticle self-organization to be searched effectively for a given experimental target morphology.
- Published
- 2007
33. Special Issue on Memetic Algorithms
- Author
-
Yew-Soon Ong, Hisao Ishibuchi, and Natalio Krasnogor
- Subjects
business.industry ,Computer science ,MathematicsofComputing_NUMERICALANALYSIS ,Evolutionary algorithm ,Heuristic programming ,General Medicine ,ComputingMethodologies_ARTIFICIALINTELLIGENCE ,Computer Science Applications ,Human-Computer Interaction ,ComputingMethodologies_PATTERNRECOGNITION ,Control and Systems Engineering ,Section (archaeology) ,Memetic algorithm ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software ,Information Systems - Abstract
The ten papers in this special section are devoted to memetic algorithms. The papers are loosely grouped into two categories: memetic algorithm methodologies and domain-specific memetic algorithms. Briefly summarizes the articles included in this section.
- Published
- 2007
34. Quorum sensing P systems
- Author
-
Francesco Bernardini, Marian Gheorghe, and Natalio Krasnogor
- Subjects
education.field_of_study ,Theoretical computer science ,General Computer Science ,Computer science ,P systems ,Population ,Vibrio fischeri ,Turing machines ,Theoretical Computer Science ,Quorum sensing ,Turing machine ,symbols.namesake ,symbols ,education ,Algorithm ,P system ,Computer Science(all) - Abstract
This paper continues the investigation of the population P systems model [F. Bernardini, M. Gheorghe, Population P systems, Journal of Universal Computer Science 10 (5) (2004) 509–539] by considering bacterial quorum sensing (QS) phenomena as the basis of the new approach. A new computational model called QS P systems is introduced. It is proved that QS P systems are able to simulate counter machines, and hence they are equivalent in power to Turing machines. An example of a QS P system modelling the behaviour of Vibrio fischeri bacteria colonies is also presented, and the emergence of the QS mechanism is illustrated.
- Published
- 2007
35. Complexity measurement based on information theory and kolmogorov complexity
- Author
-
Hector Zenil, Leong Ting Lui, Natalio Krasnogor, Cameron Alexander, and Germán Terrazas
- Subjects
Theoretical computer science ,Kolmogorov complexity ,Computer science ,business.industry ,Complex system ,Game complexity ,Descriptive complexity theory ,Information theory ,General Biochemistry, Genetics and Molecular Biology ,Cellular automaton ,Structural complexity theory ,Artificial Intelligence ,Artificial intelligence ,business ,Quantum complexity theory - Abstract
In the past decades many definitions of complexity have been proposed. Most of these definitions are based either on Shannon's information theory or on Kolmogorov complexity; these two are often compared, but very few studies integrate the two ideas. In this article we introduce a new measure of complexity that builds on both of these theories. As a demonstration of the concept, the technique is applied to elementary cellular automata and simulations of the self-organization of porphyrin molecules.
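The two ingredients of such a measure are easy to approximate in practice: Shannon entropy from symbol frequencies, and Kolmogorov complexity via a real compressor (a standard, crude upper bound). The stdlib sketch below illustrates the two views, not the article's exact combined measure:

```python
import math
import zlib
from collections import Counter

def shannon_entropy(s):
    """Per-symbol Shannon entropy of a string, in bits."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())

def kolmogorov_upper_bound(s):
    """zlib-compressed length in bytes: a crude, computable upper
    bound on the Kolmogorov complexity of s."""
    return len(zlib.compress(s.encode(), 9))
```

A periodic string such as `"01" * 500` and an irregular bit string have the same per-symbol entropy (about 1 bit), yet very different compressed lengths; this divergence is precisely why integrating the two theories is informative.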
- Published
- 2015
36. Self Generating Metaheuristics in Bioinformatics: The Proteins Structure Comparison Case
- Author
-
Natalio Krasnogor
- Subjects
Similarity (geometry) ,Relation (database) ,business.industry ,Computer science ,Structural alignment ,Function (mathematics) ,Bioinformatics ,Computer Science Applications ,Theoretical Computer Science ,Hardware and Architecture ,Genetic algorithm ,Memetic algorithm ,Local search (optimization) ,business ,Metaheuristic ,Software - Abstract
In this paper we describe the application of a so-called "self-generating" Memetic Algorithm to the Maximum Contact Map Overlap problem (MAX-CMO). The maximum overlap of contact maps is emerging as a leading modelling technique for obtaining structural alignments between pairs of protein structures. Identifying structural alignments (and hence similarity among proteins) is essential to the correct assessment of the relation between protein structure and function, and a robust methodology for structural comparison could have an impact on the process of rational drug design. The self-generating Memetic Algorithm we present in this work concurrently evolves both the solutions (i.e. protein alignments) and the local search move operators it needs to solve the problem instance at hand. The concurrent generation of local search strategies and solutions allows the Memetic Algorithm to produce better results than those given by a Genetic Algorithm or a Memetic Algorithm with human-designed local searchers. The approach has been tried on four different data sets (one composed of randomly generated proteins and three of real-world proteins) with encouraging results.
- Published
- 2004
37. Blind optimisation problem instance classification via enhanced universal similarity metric
- Author
-
Ivan Contreras, José Ignacio Hidalgo, Natalio Krasnogor, and Ignacio Arnaldo
- Subjects
Control and Optimization ,General Computer Science ,Computer science ,Natural computing ,business.industry ,Computation ,Complex system ,Evolutionary computation ,Euclidean geometry ,Memetic algorithm ,Artificial intelligence ,business ,Classifier (UML) ,Know-how - Abstract
The ultimate aim of Memetic Computing is the fully autonomous solution of complex optimisation problems. For a while now, the Memetic Algorithms literature has been moving towards ever greater generalisation of optimisers, initiated by seminal papers such as Krasnogor and Smith (IEEE Trans 9(5):474–488, 2005; Workshops Proceedings of the 2000 International Genetic and Evolutionary Computation Conference (GECCO2000), 2000) and Krasnogor and Gustafson (Advances in nature-inspired computation: the PPSN VII Workshops 16(52), 2002), and followed by related and more recent work such as Ong and Keane (IEEE Trans Evol Comput 8(2):99–110, 2004), Ong et al. (IEEE Comp Int Mag 5(2):24–31, 2010) and Burke et al. (Hyper-heuristics: an emerging direction in modern search technology, 2003). In this trend towards greater generality and applicability, research has focused on selecting (or even evolving) the right search operator(s) to use when tackling a given instance of a fixed problem type (e.g. the Euclidean 2D TSP) within a range of optimisation frameworks (Krasnogor, Handbook of natural computation, Springer, Berlin/Heidelberg, 2009). This paper takes the first step up the generalisation ladder: it assumes that the optimiser is given (perhaps by other solvers that do not know how to deal with it) a problem instance to tackle, and must autonomously, without human intervention, pre-select the likely family of problems to which the instance belongs. To that end we propose an Automatic Problem Classifier System able to identify automatically which kind of instance or problem the system is dealing with. We test an innovative approach to the Universal Similarity Metric, a variant of the normalised compression distance (NCD) based on the management of compression dictionaries, to classify different problem instances. The results obtained are encouraging, as we achieve a 96% average classification success rate on the studied dataset.
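The NCD at the heart of this kind of classifier has a compact standard form. A sketch using zlib as the compressor (the paper's dictionary-management variant is not reproduced here; `classify` is our own illustrative nearest-neighbour wrapper):

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance: a computable stand-in for the
    (uncomputable) universal similarity metric. Near 0 for very
    similar inputs, near 1 for unrelated ones."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(instance: bytes, labelled):
    """Assign the label of the NCD-nearest labelled instance."""
    return min(labelled, key=lambda item: ncd(instance, item[0]))[1]
```

Because the metric is blind to the encoding of the data, the same function can compare TSP instances, SAT formulas or any other serialised problem instances without feature engineering.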
- Published
- 2014
38. Meta-stochastic simulation of biochemical models for systems and synthetic biology
- Author
-
Natalio Krasnogor, Paweł Widera, and Daven Sanassy
- Subjects
Computer science ,Biochemical Phenomena ,Systems biology ,Biomedical Engineering ,Machine learning ,computer.software_genre ,Biochemistry, Genetics and Molecular Biology (miscellaneous) ,Models, Biological ,Upload ,Synthetic biology ,Stochastic simulation ,Web application ,Computer Simulation ,TRACE (psycholinguistics) ,Class (computer programming) ,Internet ,Stochastic Processes ,business.industry ,Systems Biology ,General Medicine ,Linear Models ,A priori and a posteriori ,Synthetic Biology ,Artificial intelligence ,business ,computer ,Algorithms - Abstract
Stochastic simulation algorithms (SSAs) are used to trace realistic trajectories of biochemical systems at low species concentrations. As the complexity of modeled biosystems increases, it is important to select the best-performing SSA. Numerous improvements to SSAs have been introduced, but each tends to apply only to a certain class of models. This makes it difficult for a systems or synthetic biologist to decide which algorithm to employ when confronted with a new model that requires simulation. In this paper, we demonstrate that it is possible to determine which algorithm is best suited to simulate a particular model, and that this can be predicted a priori to algorithm execution. We present a Web-based tool, ssapredict, that allows scientists to upload a biochemical model and obtain a prediction of the best-performing SSA. Furthermore, ssapredict gives the user the option to download our high-performance simulator ngss preconfigured to perform the simulation of the queried biochemical model with the predicted fastest algorithm as the simulation engine. The ssapredict Web application is available at http://ssapredict.ico2s.org. It is free software and its source code is distributed under the terms of the GNU Affero General Public License.
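Gillespie's direct method is the baseline exact SSA that the many improved variants build on. A minimal stdlib sketch (the function name and the `(rate, propensity, state_change)` layout are our own, not the ngss API):

```python
import random

def gillespie_direct(x, reactions, t_end, rng):
    """Gillespie direct-method SSA: exact stochastic trajectory of a
    well-mixed reaction network. `x` maps species to counts; each
    reaction is (rate_constant, propensity_fn, state_change_dict)."""
    t, traj = 0.0, [(0.0, dict(x))]
    while True:
        props = [k * f(x) for k, f, _ in reactions]
        a0 = sum(props)
        if a0 == 0.0:
            break                      # no reaction can fire
        dt = rng.expovariate(a0)       # exponential waiting time
        if t + dt > t_end:
            break
        t += dt
        r = rng.random() * a0          # choose a reaction proportionally
        for p, (_, _, change) in zip(props, reactions):
            r -= p
            if r <= 0.0:
                for species, delta in change.items():
                    x[species] += delta
                break
        traj.append((t, dict(x)))
    return traj

# Example network: first-order decay A -> 0 with rate constant 1.0
decay = [(1.0, lambda s: s['A'], {'A': -1})]
```

Starting from 50 molecules of A, the trajectory records exactly 50 decay events at exponentially distributed intervals; improved SSAs differ mainly in how they locate the next reaction as the network grows.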
- Published
- 2014
39. Modelling and Stochastic Simulation of Synthetic Biological Boolean Gates
- Author
-
Marian Gheorghe, Daven Sanassy, Laurentiu Mierla, Harold Fellermann, Natalio Krasnogor, Savas Konur, Christophe Ladroue, and Sara Kalvala
- Subjects
0303 health sciences ,Engineered genetic ,OR gate ,Theoretical computer science ,010304 chemical physics ,Computer science ,Stochastic process ,Control engineering ,01 natural sciences ,QA76 ,QH301 ,03 medical and health sciences ,Synthetic biology ,Logic gate ,0103 physical sciences ,Stochastic simulation ,030304 developmental biology ,Statistical hypothesis testing - Abstract
Synthetic Biology aspires to design, compose and engineer biological systems that implement specified behaviour. When designing such systems, hypothesis testing via computational modelling and simulation is vital in order to reduce the need for costly wet lab experiments. As a case study, we discuss the use of computational modelling and stochastic simulation for engineered genetic circuits that implement Boolean AND and OR gates reported in the literature. We present performance analysis results for nine different state-of-the-art stochastic simulation algorithms and analyse the dynamic behaviour of the proposed gates. Stochastic simulations verify the desired functioning of the proposed gate designs.
- Published
- 2014
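The core idea of a stochastic Boolean gate can be sketched in a few lines: the output-producing reaction has a propensity proportional to the product of the input counts, so output accumulates only when both inputs are high. This is a toy discrete-time abstraction, not the genetic gate designs analysed in the paper:

```python
import random

def simulate_and_gate(a, b, steps=2000, k=1.0, dt=0.01, seed=1):
    """Toy stochastic AND gate: the reaction A + B -> A + B + C has
    propensity k*A*B, so output C can only be produced when both
    inputs are present. Illustrative abstraction only."""
    rng = random.Random(seed)
    c = 0
    for _ in range(steps):
        propensity = k * a * b
        if rng.random() < min(1.0, propensity * dt):  # fire with prob ~ propensity*dt
            c += 1
    return c

# Check the truth table: the output is "high" only for the input pair (1, 1)
table = {(a, b): simulate_and_gate(a, b) > 0 for a in (0, 1) for b in (0, 1)}
```

An OR gate is obtained analogously with two independent production reactions, one per input.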
40. Formalizing modularization and data hiding in synthetic biology
- Author
-
Maik Hadorn, Harold Fellermann, Rudolf Marcel Füchslin, and Natalio Krasnogor
- Subjects
business.industry ,Computer science ,572: Biochemie ,Distributed computing ,Information processing ,Compartmentalization (information security) ,Synthetic biology ,Hardware and Architecture ,Brane calculi ,Embedded system ,Information hiding ,Modular programming ,Electrical and Electronic Engineering ,Representation (mathematics) ,business ,Membrane computing ,Software - Abstract
Biological systems employ compartmentalization and other co-localization strategies in order to orchestrate a multitude of biochemical processes by simultaneously enabling “data hiding” and modularization. This article presents recent research that embraces compartmentalization and co-location as an organizational programmatic principle in synthetic biological and biomimetic systems. In these systems, artificial vesicles and synthetic minimal cells are envisioned as nanoscale reactors for programmable biochemical synthesis and as chassis for molecular information processing. We present P systems, brane calculi, and the recently developed chemtainer calculus as formal frameworks providing data hiding and modularization and thus enabling the representation of highly complicated hierarchically organized compartmentalized reaction systems. We demonstrate how compartmentalization can greatly reduce the complexity required to implement computational functionality, and how addressable compartments permit the scaling-up of programmable chemical synthesis.
- Published
- 2014
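The "addressable compartments" idea from the abstract above can be illustrated with a toy routing function over nested compartments. The data layout and names here are invented for illustration and do not reproduce the chemtainer calculus:

```python
def deliver(compartment, address, molecule):
    """Route a molecule down a hierarchy of nested compartments by a tag
    address: a toy rendering of programmable, addressable compartments."""
    target = compartment
    for tag in address:
        target = target["children"][tag]
    target["contents"].append(molecule)
    return compartment

# A cell containing two vesicles, one of which nests a further compartment
cell = {"contents": [], "children": {
    "vesicleA": {"contents": [], "children": {}},
    "vesicleB": {"contents": [], "children": {
        "inner": {"contents": [], "children": {}},
    }},
}}
deliver(cell, ["vesicleB", "inner"], "substrate")
```

Because each compartment only sees its own contents, reactions placed in one vesicle cannot interfere with another, which is exactly the "data hiding" the abstract refers to.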
41. Chemical Production and Molecular Computing in Addressable Reaction Compartments
- Author
-
Harold Fellermann and Natalio Krasnogor
- Subjects
business.industry ,Reaction rule ,Computer science ,Information hiding ,Distributed computing ,Brane calculi ,Artificial intelligence ,business ,Emulsion droplet ,Chemical production - Abstract
Biological systems employ compartmentalisation in order to orchestrate a multitude of biochemical processes by simultaneously enabling “data hiding” and modularisation. In this paper, we present recent research projects that embrace compartmentalisation as an organisational programmatic principle in synthetic biological and biomimetic systems. In these systems, artificial vesicles and synthetic minimal cells are envisioned as nanoscale reactors for programmable biochemical synthesis and as chassis for molecular information processing. We present P systems, brane calculi, and the recently developed chemtainer calculus as formal frameworks providing data hiding and modularisation and thus enabling the representation of highly complicated hierarchically organised compartmentalised reaction systems. We demonstrate how compartmentalisation can greatly reduce the complexity required to implement computational functionality, and how addressable compartments permit the scaling-up of programmable chemical synthesis.
- Published
- 2014
42. Conventional Verification for Unconventional Computing: a Genetic XOR Gate Example
- Author
-
Florentin Ipate, Savas Konur, Natalio Krasnogor, Ciprian Dragomir, and Marian Gheorghe
- Subjects
High-level verification ,Algebra and Number Theory ,Theoretical computer science ,Computer science ,business.industry ,Computation ,Computer programming ,Data_CODINGANDINFORMATIONTHEORY ,Theoretical Computer Science ,Computational Theory and Mathematics ,Computer engineering ,Logic gate ,Unconventional computing ,business ,XOR gate ,Formal verification ,Information Systems ,Hardware_LOGICDESIGN - Abstract
As unconventional computation matures and non-standard programming frameworks are demonstrated, the need for formal verification will become more prevalent. This is so because "programming" in unconventional substrates is difficult. In this paper we show how conventional verification tools can be used to verify unconventional programs implementing a logical XOR gate.
- Published
- 2014
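The simplest, fully conventional form of the verification the paper advocates is an exhaustive check of a candidate gate against its specification. The XOR decomposition below is a hypothetical design used only to illustrate the method:

```python
def genetic_xor(a: bool, b: bool) -> bool:
    """Candidate XOR composed from OR/AND/NOT sub-gates, mirroring how an
    XOR circuit is typically built from simpler gates (hypothetical design)."""
    return (a or b) and not (a and b)

def verify_xor(impl) -> bool:
    """Exhaustively check an implementation against the XOR specification
    over its entire (tiny) state space."""
    spec = lambda a, b: a != b
    return all(impl(a, b) == spec(a, b)
               for a in (False, True) for b in (False, True))

ok = verify_xor(genetic_xor)  # True: the candidate design meets its specification
```

Real verification tools generalise this idea to models whose state spaces are far too large to enumerate by hand, using model checking rather than brute force.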
43. P-systems and X-machines
- Author
-
Marian Gheorghe and Natalio Krasnogor
- Subjects
Structure (mathematical logic) ,Computational model ,Theoretical computer science ,Computer science ,Theory of computation ,Formal language ,Complex system ,Rewriting ,Membrane computing ,P system ,Computer Science Applications - Abstract
A number of computational paradigms have been inspired by, or used for modelling, the multifaceted complex phenomena present in biological systems. One of the most recent and successful of these paradigms, called membrane computing or P systems (Păun 2002), is a vigorous research field with a significant impact on a variety of disciplines. Indeed, P systems research is a fast-growing field: in 2003 the Thomson Institute for Scientific Information characterised the initial paper as "fast breaking" and the domain as an "emergent research front in computer science". As the name suggests, a membrane computing system formally captures various mechanisms present in cells, tissues and other more complex organisms. P systems, through complexity and formal language theory constructs, provide a general framework to build up and study nature-inspired computational models. In the more basic P system models, metabolites, nutrients and other more complex (macro)molecules normally found inside a cell are encoded as multisets of simple objects or as complex strings, respectively. Different chemical interactions, such as transcription, translation and various enzymatic and degradation processes, are in turn represented by rewriting rules that operate on these multisets of objects of different types. Moreover, the device, as occurs for example in eukaryotic cells, can be hierarchically organised by means of a well-defined structure of nested compartments. In other cases, the membranes are organised as a network of compartments, in a similar fashion to what is found in a variety of biological tissues in which cells are positioned in a well-defined "matrix" structure. More complex entities, such as bacterial colonies or social insects (ants, bees, etc.), are represented by dynamic-structure entities that communicate, move
- Published
- 2009
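The multiset-rewriting core of a P system described above can be sketched directly. The sketch below applies one rule as many times as it fits simultaneously, a crude stand-in for the maximally parallel semantics of P systems; the enzymatic rule is a made-up example:

```python
from collections import Counter

def apply_rule(contents, lhs, rhs):
    """Apply a multiset rewriting rule lhs -> rhs inside one membrane as
    many times as it fits simultaneously (a simplified stand-in for the
    maximally parallel rule application of P systems)."""
    n = min(contents[s] // c for s, c in lhs.items())  # simultaneous applications
    out = Counter(contents)
    for s, c in lhs.items():
        out[s] -= n * c
    for s, c in rhs.items():
        out[s] += n * c
    return +out  # unary '+' drops species whose count fell to zero

# Toy enzymatic rule in a single membrane: 2 substrate + enzyme -> product + enzyme
cell = Counter({"substrate": 5, "enzyme": 1})
cell = apply_rule(cell,
                  Counter({"substrate": 2, "enzyme": 1}),
                  Counter({"product": 1, "enzyme": 1}))
```

A full P system would evaluate all rules across a nested compartment structure at each step, with additional rules for moving objects across membranes.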
44. Algorithms and models for complex natural systems
- Author
-
Giuditta Franco, Carlos A. Coello Coello, Natalio Krasnogor, and Mario Pavone
- Subjects
Model checking ,Computational model ,business.industry ,Computer science ,Theory of computation ,Complex system ,Metabolic network ,Inversion (meteorology) ,Local search (optimization) ,business ,Algorithm ,Biological network ,Computer Science Applications - Abstract
Analysing genomic data and complex natural phenomena in computational terms enhances our comprehension of both nature and computation. The cross-fertilization of algorithms and models for natural complex systems at the molecular, cellular, or higher levels has thus become an active research area, and a more in-depth investigation of mutual relationships, synergies, similarities, and differences should be encouraged. This special issue is meant to foster novel hybrid approaches, including general methods of (bio)informatics and synthetic biology, as well as to present new emerging research concerned with the study and analysis of genome organization, and with the design, modelling, and implementation of bio-inspired evolvable systems. Of particular interest are (unconventional) computational techniques designed to increase our understanding of the evolution of biological life, such as algorithms to infer gene structure and functioning, and parallel distributed computational models involving mechanisms of recognition, affinity-based discrimination, reactivity and adaptation to the environment, and spatial search and movement. After a peer-review process, six manuscripts were accepted for inclusion in this special issue.
In "Combining flux balance analysis and model checking for metabolic network validation and analysis", by Roberto Pagliarini, Mara Sangiovanni, Adriano Peron, and Diego Di Bernardo, the authors present a novel approach for extracting relevant qualitative information from a metabolic network model, integrating constraint-based techniques with model-checking methods. This new computational approach may be helpful in understanding the mechanisms governing the onset and progression of human metabolism-related disorders. It was applied to the simulation and analysis of a well-known inherited disease (primary hyperoxaluria type I), in which the lack of a particular liver enzyme causes the body to accumulate excessive amounts of oxalate, leading to renal failure.
In "A hybrid method for inversion of 3D DC resistivity logging measurements", by Ewa Gajda-Zagorska, Robert Schaefer, Maciej Smolka, Maciej Paszynski, and David Pardo, the authors present a new hybrid method for solving the challenging inversion of 3D direct current (DC) resistivity logging measurements. The methodology combines an hp hierarchic genetic strategy (hp-HGS) with a gradient-based optimization method for local search. The problem is formulated as a global optimization problem, and simulations are performed using a self-adaptive hp-finite element method. The experimental results demonstrate the suitability of the proposed method for the tackled inversion problem. In "An evolutionary procedure for inferring MP systems regulation functions of biological networks", by Alberto
- Published
- 2015
45. Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features
- Author
-
Natalio Krasnogor, Federico Divina, Jesús S. Aguilar-Ruiz, Jaume Bacardit, Paweł Widera, Alfonso E. Márquez-Chamorro, and Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos
- Subjects
Statistics and Probability ,Computer science ,Machine learning ,computer.software_genre ,Biochemistry ,Artificial Intelligence ,Humans ,Protein Interaction Domains and Motifs ,Representation (mathematics) ,Databases, Protein ,Molecular Biology ,Structure (mathematical logic) ,Sequence ,Fusion ,Caspase 8 ,business.industry ,Rank (computer programming) ,Process (computing) ,Computational Biology ,Proteins ,Caspase 9 ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics ,Artificial intelligence ,Data mining ,Scale (map) ,business ,computer ,Algorithms - Abstract
Motivation: The prediction of a protein's contact map has become, in recent years, a crucial stepping stone for the prediction of the complete 3D structure of a protein. In this article, we describe a methodology for this problem that was shown to be successful in CASP8 and CASP9. The methodology is based on (i) the fusion of predictions of a variety of structural aspects of protein residues, (ii) an ensemble strategy used to facilitate the training process and (iii) a rule-based machine learning system from which we can extract human-readable explanations of the predictor and derive useful information about the contact map representation. Results: The main part of the evaluation is the comparison against the sequence-based contact prediction methods from CASP9, where our method achieved the best rank in five of the six evaluated metrics. We also assess the impact of the ensemble size used in our predictor to show the trade-off between performance and training time. Finally, we study the rule sets generated by our machine learning system. From this analysis, we are able to estimate the contribution of the attributes in our representation and how they interact to derive contact predictions. Availability: http://icos.cs.nott.ac.uk/servers/psp.html. Contact: natalio.krasnogor@nottingham.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2012
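For readers unfamiliar with the target of such predictors: a contact map is a binary matrix marking residue pairs whose representative atoms lie close in space. A minimal sketch of the definition (using the commonly quoted 8-angstrom threshold and a sequence-separation filter; the toy coordinates are invented):

```python
import math

def contact_map(coords, threshold=8.0, min_sep=6):
    """Binary contact map: residues i and j are 'in contact' when their
    representative (e.g. C-beta) atoms lie within `threshold` angstroms
    and are at least `min_sep` positions apart in the sequence."""
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_sep, n):
            close = math.dist(coords[i], coords[j]) <= threshold
            cmap[i][j] = cmap[j][i] = close
    return cmap

# Toy chain: eight residues on a line, 3 angstroms apart, with a ninth
# residue folded back next to the start of the chain
coords = [(3.0 * i, 0.0, 0.0) for i in range(8)] + [(0.0, 3.0, 0.0)]
cmap = contact_map(coords)  # long-range contact between residues 0 and 8
```

The prediction task runs this definition in reverse: inferring which entries of the matrix are true from the sequence alone, before any 3D coordinates exist.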
46. Post-processing operators for decision lists
- Author
-
Maria A. Franco, Jaume Bacardit, and Natalio Krasnogor
- Subjects
business.industry ,Computer science ,Order (business) ,Pruning (decision trees) ,Artificial intelligence ,Decision list ,business ,Machine learning ,computer.software_genre ,computer - Abstract
This paper proposes three post-processing operators (rule cleaning, rule pruning and rule swapping) which, combined in different ways, can help reduce the complexity of decision lists evolved by means of genetics-based machine learning. While the first two operators work on individual rules to reduce the number of expressed attributes, the last one changes the order of the rules (based on the similarities between them) to identify and delete unnecessary ones. These operators were tested using the BioHEL system over 35 different problems. Our results show that it is possible to reduce the number of specified attributes per rule and the number of rules by up to 30% in some problems, without producing significant changes in test accuracy. Moreover, the approaches presented in this paper can be easily extended to other learning paradigms and representations.
- Published
- 2012
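The kind of simplification such operators enable can be illustrated with a toy decision list. The sketch below implements only the simplest case, deleting rules that never fire on the training set; it is a simplified cousin of the paper's operators, with names and data layout invented for illustration:

```python
def predict(dlist, x, default="neg"):
    """Evaluate a decision list: the first rule whose condition matches wins."""
    for cond, label in dlist:
        if all(x.get(attr) == val for attr, val in cond.items()):
            return label
    return default

def drop_unused_rules(dlist, training):
    """Post-processing sketch: delete rules that never fire on the
    training set, so they cannot affect any training prediction."""
    used = set()
    for x in training:
        for i, (cond, _) in enumerate(dlist):
            if all(x.get(attr) == val for attr, val in cond.items()):
                used.add(i)
                break  # evaluation stops at the first matching rule
    return [rule for i, rule in enumerate(dlist) if i in used]

rules = [({"a": 1}, "pos"), ({"b": 2}, "pos"), ({"c": 3}, "neg")]
data = [{"a": 1, "b": 0, "c": 0}, {"a": 0, "b": 0, "c": 3}]
pruned = drop_unused_rules(rules, data)  # the {"b": 2} rule never fires
```

Rule swapping goes further: because a rule shadowed by an earlier, similar rule never fires, reordering can expose more rules as deletable than this sketch finds.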
47. The ten grand challenges of synthetic life
- Author
-
Vitor A. dos Santos, Manuel Porcar, Andrés Moya, Antoine Danchin, Víctor de Lorenzo, Steen Rasmussen, and Natalio Krasnogor
- Subjects
Computer science ,Systems biology ,0206 medical engineering ,Bioengineering ,02 engineering and technology ,Bioinformatics ,Task (project management) ,03 medical and health sciences ,Synthetic biology ,Artificial life ,Milestone (project management) ,Systems and Synthetic Biology ,Challenges ,Molecular Biology ,VLAG ,030304 developmental biology ,Grand Challenges ,Streamlined genomes ,Systeem en Synthetische Biologie ,0303 health sciences ,Data science ,Commentary ,020602 bioinformatics ,Biotechnology - Abstract
The construction of artificial life is one of the main scientific challenges of the Synthetic Biology era. Advances in DNA synthesis and a better understanding of regulatory processes make the goal of constructing the first artificial cell a realistic possibility. This would be both a fundamental scientific milestone and a starting point of a vast range of applications, from biofuel production to drug design. However, several major issues might hamper the objective of achieving an artificial cell. From the bottom-up to the selection-based strategies, this work encompasses the ten grand challenges synthetic biologists will have to be aware of in order to cope with the task of creating life in the lab.
- Published
- 2011
48. (Computational) synthetic biology
- Author
-
Natalio Krasnogor
- Subjects
Computer science ,Emerging technologies ,business.industry ,Systems biology ,In silico ,Data science ,Genome ,In vitro ,Variety (cybernetics) ,Synthetic biology ,Proteome ,Artificial intelligence ,Complex systems biology ,business ,Biological computation - Abstract
The ultimate goal of systems biology is the development of executable in silico models of cells and organisms. Systems biology attempts to provide an integrative methodology which can cope with, on the one hand, the data deluge generated by high-throughput experimental technologies and, on the other hand, emerging technologies that produce scarce, often noisy data, while capturing novel biological knowledge within human-understandable models and simulations. In its more modest instantiations, Systems Biology seeks to *clarify* current biological understanding by formalizing what the constitutive elements of a biological system are and how they interact with each other, and to aid in the *testing* of current understanding against experimental data. In its most ambitious incarnations, however, it aims at *predicting* the behavior of biological systems beyond current understanding and available data, thus shedding light onto possible new experimental routes that could lead to better theoretical insights. Synthetic biology, on the other hand, aims to implement, in vitro/vivo, organisms whose behavior is engineered. The field of synthetic biology holds great promise for the design, construction and development of artificial (i.e. man-made) biological (sub)systems, thus offering viable new routes to genetically modified organisms, smart drugs, as well as model systems to examine artificial genomes and proteomes. The informed manipulation of such biological (sub)systems could have an enormous positive impact on our societies, with its effects being felt across a range of activities such as the provision of healthcare and environmental protection and remediation.
The basic premise of synthetic biology is that methods commonly used to design and construct non-biological systems, such as those employed in the computational sciences and the engineering disciplines, could also be used to model and program novel synthetic biosystems. Synthetic biology thus lies at the interface of a variety of disciplines ranging from biology through chemistry, physics, computer science, mathematics and engineering. In this tutorial I will provide an entry-level understanding of Systems and Synthetic Biology, their goals, methods and limitations. Furthermore, I will describe the many potential applications of evolutionary computation to these two fields. Indeed, I believe that the EC community has a beautiful new application domain in which its methods could be both valued and challenged.
- Published
- 2011
49. Darwin's magic: Evolutionary computation in nanoscience, bioinformatics and systems biology
- Author
-
Natalio Krasnogor
- Subjects
Synthetic biology ,Protein structure ,Computer science ,Systems biology ,Darwin (ADL) ,Natural science ,Magic (programming) ,Nanotechnology ,Bioinformatics ,Ecology and Evolutionary Biology ,Biological computation ,Evolutionary computation - Abstract
In this talk I will overview ten years of research in the application of evolutionary computation ideas in the natural sciences. The talk will take us on a tour that will cover problems in nanoscience, e.g. controlling self-organizing systems, optimizing scanning probe microscopy, etc., problems arising in bioinformatics, such as predicting protein structures and their features, to challenges emerging in systems and synthetic biology. Although the algorithmic solutions involved in these problems are different from each other, at their core, they retain Darwin's wonderful insights. I will conclude the talk by giving a personal view on why EC has been so successful and where, in my mind, the future lies.
- Published
- 2011
50. Integrative analysis of large-scale biological data sets
- Author
-
Enrico Glaab, Natalio Krasnogor, and Jonathan M. Garibaldi
- Subjects
Biological data ,Exploit ,Computer science ,business.industry ,Bioinformatics ,Scale (chemistry) ,Modular design ,Proteomics ,Machine learning ,computer.software_genre ,Genetics & Genomics ,Set (abstract data type) ,ComputingMethodologies_PATTERNRECOGNITION ,Robustness (computer science) ,General Materials Science ,Artificial intelligence ,business ,computer ,Network analysis ,Cancer - Abstract
We present two novel web applications for microarray and gene/protein set analysis, ArrayMining.net and TopoGSA. These bioinformatics tools use integrative analysis methods, including ensemble and consensus machine learning techniques, as well as modular combinations of different analysis types, to extract new biological insights from experimental transcriptomics and proteomics data. They enable researchers to combine related algorithms and datasets to increase the robustness and accuracy of statistical analyses and to exploit synergies of different computational methods, ranging from statistical learning to optimization and topological network analysis.
- Published
- 2011
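The simplest of the consensus schemes such tools build on is a majority vote over the outputs of several analysers. A minimal sketch (the classifiers and labels below are hypothetical, not ArrayMining's actual algorithms):

```python
from collections import Counter

def consensus(predictions):
    """Majority-vote consensus: for each sample, return the label most
    analysers agreed on. predictions: one list of labels per analyser."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Three hypothetical classifiers labelling the same five samples
p1 = ["tumour", "normal", "tumour", "normal", "tumour"]
p2 = ["tumour", "tumour", "tumour", "normal", "normal"]
p3 = ["normal", "normal", "tumour", "normal", "tumour"]
labels = consensus([p1, p2, p3])
```

Combining analysers this way tends to be more robust than any single method, since uncorrelated errors are voted down; this is the intuition behind the ensemble techniques the abstract mentions.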