
Autonomous agents


This page is a placeholder, or under current development; it is here principally to establish the logical framework of the site. The material on this page is correct, but incomplete.


Autonomous agents for bioinformatics.



 

Introductory reading

Merelli et al. (2007) Agents in bioinformatics, computational and systems biology. Brief Bioinformatics 8:45-59. (pmid: 16772270)

[ PubMed ] [ DOI ] The adoption of agent technologies and multi-agent systems constitutes an emerging area in bioinformatics. In this article, we report on the activity of the Working Group on Agents in Bioinformatics (BIOAGENTS) founded during the first AgentLink III Technical Forum meeting on the 2nd of July, 2004, in Rome. The meeting provided an opportunity for seeding collaborations between the agent and bioinformatics communities to develop a different (agent-based) approach of computational frameworks both for data analysis and management in bioinformatics and for systems modelling and simulation in computational and systems biology. The collaborations gave rise to applications and integrated tools that we summarize and discuss in context of the state of the art in this area. We investigate on future challenges and argue that the field should still be explored from many perspectives ranging from bio-conceptual languages for agent-based simulation, to the definition of bio-ontology-based declarative languages to be used by information agents, and to the adoption of agents for computational grids.


 

Levels of software coupling

Considering flexibility in the design and development of software introduces the notion of coupling of components: how widely and deeply components depend on each other. Tight coupling may lead to better performance; loose coupling may lead to higher flexibility. Dependencies can exist along many dimensions. Coupling can be structural (a component includes another component), explicit (two components use each other directly), or implicit, through sharing resources, requiring communication through a common language or standards, or assuming some (or no) synchronicity or sequence of execution. As a general rule: unnecessary coupling is always bad.
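These three kinds of dependency are easiest to see in code. The following is a minimal Python sketch; all names (Pipeline, Parser, normalize, validate, CACHE) are hypothetical, chosen only to illustrate the distinctions.

# Structural coupling: Pipeline contains, and therefore cannot exist
# without, a Parser.
class Parser:
    def parse(self, text):
        return text.split()

class Pipeline:
    def __init__(self):
        self.parser = Parser()

# Explicit coupling: normalize depends directly, by name, on validate.
def validate(seq):
    if not seq:
        raise ValueError("empty sequence")
    return seq

def normalize(seq):
    return validate(seq).upper()

# Implicit coupling: nothing in the signatures reveals that both
# functions depend on the same shared resource.
CACHE = {}

def producer(key, value):
    CACHE[key] = value

def consumer(key):
    return CACHE.get(key)   # silently assumes producer has already run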

One can identify degrees of coupling with programming paradigms as follows:

  • Sequential (unstructured) programming ([1])
Instructions are written and executed one after another. Intermediate data is not isolated but kept in variables in memory; everything is tightly coupled. This was the traditional way to develop code. Advantage: can be quick to develop and very efficient to run (little overhead). Disadvantage: code is hard to maintain and not easily reusable; changes often have unanticipated side-effects. (The first sketch after this list contrasts this style with procedural code.)
  • Procedural programming ([2])
Code is broken up into modules that communicate through well-defined interfaces. Advantages: code becomes much easier to structure and to maintain as projects become more complex. Rather than requiring awareness of the entire state of the program, the procedures (or functions, or subroutines ...) need only be aware of the parameters that are passed to them. Disadvantages: parameters can still drift out of synchrony regarding their syntax (their datatypes or data structures) or their semantics (their meaning), since they are not defined and maintained in parallel with the procedures that use them.
  • Object oriented programming ([3])
To further insulate code components from side-effects and inadvertent change, to support code-reuse, and to simplify maintenance and extensibility, the idea of objects was introduced. An object contains both the description of parameters (attributes, properties ...) and of the functions that operate on the object (methods). The object oriented paradigm is usually said to facilitate three goals: encapsulation (no need to concern oneself with the internal workings of a procedure if the interface is specified), polymorphism (the same request can have different results depending on its context, e.g. an object may support a method multiply() that behaves differently depending on whether an instance of the object is a scalar or a matrix; see the second sketch after this list), and inheritance (classes of objects can be defined based on the properties of other classes, which they inherit). Advantages: an emphasis on modeling and structured design supports tackling very complex problems through iterated development. Disadvantages: encapsulation can make code hard to debug, polymorphism can make code hard to read, and inheritance may not be all that useful in the real world and may introduce side-effects (changing code in base classes affects all derived classes). OO is not a panacea and not a substitute for clear thinking.
  • Distributed computing ([4])
In the quest for increased computing resources, distributed computing schemes have been developed that farm out parts of a larger computation across a network to other machines, typically ones that have nothing to do at the moment. Because the code is executed on remote machines, it needs to be sufficiently independent; structural or procedural coupling is avoided, but implicit coupling can be significant. (The third sketch after this list shows the idea, scaled down to a single machine.) Advantages: cheap access to resources; easily scalable; redundancy and fault-tolerance. Disadvantages: security concerns; not all problems can be divided into distributable tasks; development overhead for scheduling, communication and integration of results.
  • Autonomous agent systems ([5])
The loosest coupling could be achieved if software components could act totally autonomously. Such "autonomous" components can be called agents. Agents are abstract concepts, not recipes for implementation; the emphasis is on behaviour, not on data or methods. The many existing definitions of agents usually include concepts such as persistence (code is not executed on demand but runs continuously and decides for itself when it should perform some activity), autonomy (agents are capable of task selection, prioritization, goal-directed behaviour, and decision-making without human intervention), social ability (agents are able to engage other components through some sort of communication and coordination), and reactivity (agents perceive the context in which they operate and react to it appropriately); a minimal agent loop is given in the last sketch after this list. Advantages: the most flexible of all programming paradigms; the weakest coupling; easily able to integrate a wide variety of standards, resources and languages. Disadvantages: hype has obscured the concepts; computations are no longer strictly deterministic (since they depend on an external, changing context) and may thus not be reproducible; it may be difficult to keep track of task progress; scheduling overhead may be significant.
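To make the contrast between the first two paradigms concrete, here is a minimal, hypothetical Python sketch of one small task, the GC content of a set of sequences, written first sequentially and then procedurally. The data and names are illustrative only.

# Sequential (unstructured): one stream of instructions, all state in
# shared variables; any later code may depend on any of them.
sequences = ["ATGC", "GGCC", "ATAT"]
total = 0
gc = 0
for s in sequences:
    total += len(s)
    gc += s.count("G") + s.count("C")
gc_content = gc / total        # tightly coupled to everything above

# Procedural: the same computation behind a well-defined interface;
# the function sees only the parameters passed to it.
def gc_content_of(seqs):
    """Return the fraction of G/C bases across all sequences."""
    total = sum(len(s) for s in seqs)
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return gc / total

print(gc_content_of(["ATGC", "GGCC", "ATAT"]))   # 0.5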
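The multiply() example from the object oriented item, as a hypothetical Python sketch: the same request behaves differently depending on whether the instance is a scalar or a matrix (polymorphism), both classes inherit describe() from a shared base class (inheritance), and the internal representation is reached only through methods (encapsulation).

class Value:
    """Base class: derived classes inherit describe()."""
    def describe(self):
        return "%s holding %r" % (self.__class__.__name__, self.data)

class Scalar(Value):
    def __init__(self, x):
        self.data = x                  # encapsulated state
    def multiply(self, k):
        return Scalar(self.data * k)

class Matrix(Value):
    def __init__(self, rows):
        self.data = rows
    def multiply(self, k):             # same request, different behaviour
        return Matrix([[k * x for x in row] for row in self.data])

# The caller issues the same request to either object:
for v in (Scalar(3), Matrix([[1, 2], [3, 4]])):
    print(v.multiply(2).describe())
# Scalar holding 6
# Matrix holding [[2, 4], [6, 8]]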
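A hypothetical sketch of the distributed idea, scaled down to worker processes on a single machine using Python's standard multiprocessing module: the work units carry no shared state, are farmed out, and the results are integrated afterwards. A real distributed system would add network transport, scheduling and fault tolerance.

from multiprocessing import Pool

def gc_content(seq):
    """Independent work unit: shares no state with other tasks."""
    return (seq.count("G") + seq.count("C")) / len(seq)

if __name__ == "__main__":
    sequences = ["ATGC", "GGCC", "ATAT", "GCGC"]
    with Pool(processes=4) as pool:
        results = pool.map(gc_content, sequences)   # farm out, collect
    print(results)    # [0.5, 1.0, 0.0, 1.0]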
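Finally, a toy Python sketch of the agent behaviours named above: persistence (a continuously running loop), reactivity (sensing a simulated environment) and autonomy (the agent selects its own tasks). This is meant only to fix the ideas; real agent platforms, such as ones following the FIPA specifications listed below, add communication languages, directories and lifecycle management.

import random
import time

class Agent:
    """Toy agent: runs persistently, senses its context, picks its own tasks."""
    def __init__(self, name):
        self.name = name
        self.queue = []                          # goals the agent has adopted

    def sense(self):
        # Reactivity: perceive the (here: simulated) environment.
        if random.random() < 0.3:
            self.queue.append("new job at %.0f" % time.time())

    def decide(self):
        # Autonomy: the agent itself selects and prioritizes work.
        return self.queue.pop(0) if self.queue else None

    def run(self, cycles=5, pause=0.1):
        # Persistence: not invoked on demand, but looping continuously.
        for _ in range(cycles):
            self.sense()
            task = self.decide()
            if task:
                print("%s acts on: %s" % (self.name, task))
            time.sleep(pause)

Agent("worker-1").run()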


   

Further reading and resources

Hassanien et al. (2013) Computational intelligence techniques in bioinformatics. Comput Biol Chem 47:37-47. (pmid: 23891719)

[ PubMed ] [ DOI ] Computational intelligence (CI) is a well-established paradigm with current systems having many of the characteristics of biological computers and capable of performing a variety of tasks that are difficult to do using conventional techniques. It is a methodology involving adaptive mechanisms and/or an ability to learn that facilitate intelligent behavior in complex and changing environments, such that the system is perceived to possess one or more attributes of reason, such as generalization, discovery, association and abstraction. The objective of this article is to present to the CI and bioinformatics research communities some of the state-of-the-art in CI applications to bioinformatics and motivate research in new trend-setting directions. In this article, we present an overview of the CI techniques in bioinformatics. We will show how CI techniques including neural networks, restricted Boltzmann machine, deep belief network, fuzzy logic, rough sets, evolutionary algorithms (EA), genetic algorithms (GA), swarm intelligence, artificial immune systems and support vector machines, could be successfully employed to tackle various problems such as gene expression clustering and classification, protein sequence classification, gene selection, DNA fragment assembly, multiple sequence alignment, and protein function prediction and its structure. We discuss some representative methods to provide inspiring examples to illustrate how CI can be utilized to address these problems and how bioinformatics data can be characterized by CI. Challenges to be addressed and future directions of research are also presented and an extensive bibliography is included.

Holcombe et al. (2012) Modelling complex biological systems using an agent-based approach. Integr Biol (Camb) 4:53-64. (pmid: 22052476)

[ PubMed ] [ DOI ] Many of the complex systems found in biology are comprised of numerous components, where interactions between individual agents result in the emergence of structures and function, typically in a highly dynamic manner. Often these entities have limited lifetimes but their interactions both with each other and their environment can have profound biological consequences. We will demonstrate how modelling these entities, and their interactions, can lead to a new approach to experimental biology bringing new insights and a deeper understanding of biological systems.

Severin et al. (2010) eHive: an artificial intelligence workflow system for genomic analysis. BMC Bioinformatics 11:240. (pmid: 20459813)

[ PubMed ] [ DOI ] BACKGROUND: The Ensembl project produces updates to its comparative genomics resources with each of its several releases per year. During each release cycle approximately two weeks are allocated to generate all the genomic alignments and the protein homology predictions. The number of calculations required for this task grows approximately quadratically with the number of species. We currently support 50 species in Ensembl and we expect the number to continue to grow in the future. RESULTS: We present eHive, a new fault tolerant distributed processing system initially designed to support comparative genomic analysis, based on blackboard systems, network distributed autonomous agents, dataflow graphs and block-branch diagrams. In the eHive system a MySQL database serves as the central blackboard and the autonomous agent, a Perl script, queries the system and runs jobs as required. The system allows us to define dataflow and branching rules to suit all our production pipelines. We describe the implementation of three pipelines: (1) pairwise whole genome alignments, (2) multiple whole genome alignments and (3) gene trees with protein homology inference. Finally, we show the efficiency of the system in real case scenarios. CONCLUSIONS: eHive allows us to produce computationally demanding results in a reliable and efficient way with minimal supervision and high throughput. Further documentation is available at: http://www.ensembl.org/info/docs/eHive/.

Ren et al. (2008) Multi-agent-based bio-network for systems biology: protein-protein interaction network as an example. Amino Acids 35:565-72. (pmid: 18425405)

[ PubMed ] [ DOI ] Recently, a collective effort from multiple research areas has been made to understand biological systems at the system level. This research requires the ability to simulate particular biological systems as cells, organs, organisms, and communities. In this paper, a novel bio-network simulation platform is proposed for system biology studies by combining agent approaches. We consider a biological system as a set of active computational components interacting with each other and with an external environment. Then, we propose a bio-network platform for simulating the behaviors of biological systems and modelling them in terms of bio-entities and society-entities. As a demonstration, we discuss how a protein-protein interaction (PPI) network can be seen as a society of autonomous interactive components. From interactions among small PPI networks, a large PPI network can emerge that has a remarkable ability to accomplish a complex function or task. We also simulate the evolution of the PPI networks by using the bio-operators of the bio-entities. Based on the proposed approach, various simulators with different functions can be embedded in the simulation platform, and further research can be done from design to development, including complexity validation of the biological system.

Karasavvas et al. (2005) A criticality-based framework for task composition in multi-agent bioinformatics integration systems. Bioinformatics 21:3155-63. (pmid: 15890745)

[ PubMed ] [ DOI ] MOTIVATION: During task composition, such as can be found in distributed query processing, workflow systems and AI planning, decisions have to be made by the system and possibly by users with respect to how a given problem should be solved. Although there is often more than one correct way of solving a given problem, these multiple solutions do not necessarily lead to the same result. Some researchers are addressing this problem by providing data provenance information. Others use expert advice encoded in a supporting knowledge-base. In this paper, we propose an approach that assesses the importance of such decisions with respect to the overall result. We present a way of measuring decision criticality and describe its potential use. RESULTS: A multi-agent bioinformatics integration system is used as the basis of a framework that facilitates such functionality. We propose an agent architecture, and a concrete bioinformatics example (prototype) is used to show how certain decisions may not be critical in the context of more complex tasks.

Karasavvas et al. (2004) Bioinformatics integration and agent technology. J Biomed Inform 37:205-19. (pmid: 15196484)

[ PubMed ] [ DOI ] Vast amounts of life sciences data are scattered around the world in the form of a variety of heterogeneous data sources. The need to be able to co-relate relevant information is fundamental to increase the overall knowledge and understanding of a specific subject. Bioinformaticians aspire to find ways to integrate biological data sources for this purpose and system integration is a very important research topic. The purpose of this paper is to provide an overview of important integration issues that should be considered when designing a bioinformatics integration system. The currently prevailing approach for integration is presented with examples of bioinformatics information systems together with their main characteristics. Here, we introduce agent technology and we argue why it provides an appropriate solution for designing bioinformatics integration systems.

Franklin & Graesser: Is it an Agent or a Program ... ? – whitepaper at IIS, Memphis
FIPA (Foundation for Intelligent Physical Agents) – A rich resource of specifications for a well thought-out agent standard. (Index of specifications is here.)