CSB modelling methods


Modelling Methods in Computational Biology


This page is a placeholder, or under current development; it is here principally to establish the logical framework of the site. The material on this page is correct, but incomplete.


Intuitively, we might think of pathway, network, and systems modelling as simply ever more sophisticated applications of coupled differential equations. That is how pathway modelling has been done in biochemistry for decades. At the cellular level, however, matters are more difficult: we do not have the precise concentrations and rate constants that we would need to populate such models. Important alternative (or complementary) approaches either focus on the constraints that are inherent to the real world, such as mass and energy balance, or push the boundaries of simpler descriptions, down to purely topological approaches that can in principle be populated from interaction networks.
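As a minimal sketch of the coupled differential equation view, the following Python code integrates a hypothetical two-step pathway A -> B -> C with mass-action kinetics, using a fixed-step fourth-order Runge-Kutta scheme. The rate constants, initial concentrations, and function names are assumptions chosen for illustration; they are not taken from any of the papers listed on this page.

```python
def pathway_rhs(state, k_ab=1.0, k_bc=0.5):
    """Mass-action rates for A -> B -> C (hypothetical rate constants)."""
    a, b, c = state
    return (-k_ab * a, k_ab * a - k_bc * b, k_bc * b)


def rk4_step(f, state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(s, slope, h):
        return tuple(x + h * d for x, d in zip(s, slope))

    s1 = f(state)
    s2 = f(shifted(state, s1, dt / 2))
    s3 = f(shifted(state, s2, dt / 2))
    s4 = f(shifted(state, s3, dt))
    return tuple(
        x + dt / 6 * (p + 2 * q + 2 * r + s)
        for x, p, q, r, s in zip(state, s1, s2, s3, s4)
    )


def simulate(state, dt=0.01, steps=1000):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(pathway_rhs, state, dt)
        trajectory.append(state)
    return trajectory


# Start with all material in A and follow it for 10 time units.
trajectory = simulate((1.0, 0.0, 0.0))
final_a, final_b, final_c = trajectory[-1]
```

Even this toy model needs two rate constants and three initial concentrations, which illustrates why parameter availability becomes the limiting factor at the cellular scale.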


Introductory reading

Meier-Schellersheim et al. (2009) Multiscale modeling for biologists. Wiley Interdiscip Rev Syst Biol Med 1:4-14. (pmid: 20448808)

Biomedical research frequently involves performing experiments and developing hypotheses that link different scales of biological systems such as, for instance, the scales of intracellular molecular interactions to the scale of cellular behavior and beyond to the behavior of cell populations. Computational modeling efforts that aim at exploring such multiscale systems quantitatively with the help of simulations have to incorporate several different simulation techniques because of the different time and space scales involved. Here, we provide a nontechnical overview of how different scales of experimental research can be combined with the appropriate computational modeling techniques. We also show that current modeling software permits building and simulating multiscale models without having to become involved with the underlying technical details of computational modeling.


 


 

ODEs, PDEs and their stochastic counterparts

Toni & Stumpf (2010) Parameter inference and model selection in signaling pathway models. Methods Mol Biol 673:283-95. (pmid: 20835806)

To support and guide an extensive experimental research into systems biology of signaling pathways, increasingly more mechanistic models are being developed with hopes of gaining further insight into biological processes. In order to analyze these models, computational and statistical techniques are needed to estimate the unknown kinetic parameters. This chapter reviews methods from frequentist and Bayesian statistics for estimation of parameters and for choosing which model is best for modeling the underlying system. Approximate Bayesian computation techniques are introduced and employed to explore different hypothesis about the JAK-STAT signaling pathway.

Hughey et al. (2010) Computational modeling of mammalian signaling networks. Wiley Interdiscip Rev Syst Biol Med 2:194-209. (pmid: 20836022)

One of the most exciting developments in signal transduction research has been the proliferation of studies in which a biological discovery was initiated by computational modeling. In this study, we review the major efforts that enable such studies. First, we describe the experimental technologies that are generally used to identify the molecular components and interactions in, and dynamic behavior exhibited by, a network of interest. Next, we review the mathematical approaches that are used to model signaling network behavior. Finally, we focus on three specific instances of 'model-driven discovery': cases in which computational modeling of a signaling network has led to new insights that have been verified experimentally.

Ullah & Wolkenhauer (2010) Stochastic approaches in systems biology. Wiley Interdiscip Rev Syst Biol Med 2:385-397. (pmid: 20836037)

The discrete and random occurrence of chemical reactions far from thermodynamic equilibrium, and low copy numbers of chemical species, in systems biology necessitate stochastic approaches. This review is an effort to give the reader a flavor of the most important stochastic approaches relevant to systems biology. Notions of biochemical reaction systems and the relevant concepts of probability theory are introduced side by side. This leads to an intuitive and easy-to-follow presentation of a stochastic framework for modeling subcellular biochemical systems. In particular, we make an effort to show how the notion of propensity, the chemical master equation (CME), and the stochastic simulation algorithm arise as consequences of the Markov property. Most stochastic modeling reviews focus on stochastic simulation approaches--the exact stochastic simulation algorithm and its various improvements and approximations. We complement this with an outline of an analytical approximation. The most common formulation of stochastic models for biochemical networks is the CME. Although stochastic simulations are a practical way to realize the CME, analytical approximations offer more insight into the influence of randomness on system's behavior. Toward that end, we cover the chemical Langevin equation and the related Fokker-Planck equation and the two-moment approximation (2MA). Throughout the text, two pedagogical examples are used to illustrate key ideas. With extensive references to the literature, our goal is to clarify key concepts and thereby prepare the reader for more advanced texts.

Neves (2012) Modeling of spatially-restricted intracellular signaling. Wiley Interdiscip Rev Syst Biol Med 4:103-15. (pmid: 21766466)

Understanding the signaling capabilities of a cell presents a major challenge, not only due to the number of molecules involved, but also because of the complex network connectivity of intracellular signaling. Recently, the proliferation of quantitative imaging techniques has led to the discovery of the vast spatial organization of intracellular signaling. Computational modeling has emerged as a powerful tool for understanding how inhomogeneous signaling originates and is maintained. This article covers the current imaging techniques used to obtain quantitative spatial data and the mathematical approaches used to model spatial cell biology. Modeling-derived hypotheses have been experimentally tested and the integration of modeling and imaging approaches has led to non-intuitive mechanistic insights.

The Gillespie algorithm for stochastic simulation.
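A minimal sketch of the direct-method stochastic simulation algorithm, applied to a hypothetical birth-death process (constant production of a species X and first-order degradation), might look as follows. The rate constants, the seed, and the function name are illustrative assumptions, and only the Python standard library is used.

```python
import random


def gillespie_birth_death(x0=0, k_prod=10.0, k_deg=1.0, t_max=20.0, seed=1):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x)."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        # Propensities of the two reactions in the current state.
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        # The waiting time to the next event is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_max:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death()
```

For this process the copy number fluctuates around k_prod / k_deg once the initial transient has passed; a deterministic ODE model would return only that mean value, without the fluctuations.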


 

Constraint based modelling, Flux balance analysis

Behre et al. (2012) Detecting structural invariants in biological reaction networks. Methods Mol Biol 804:377-407. (pmid: 22144164)

The detection and analysis of structural invariants in cellular reaction networks is of central importance to achieve a more comprehensive understanding of metabolism. In this work, we review different kinds of structural invariants in reaction networks and their Petri net-based representation. In particular, we discuss invariants that can be obtained from the left and right null spaces of the stoichiometric matrix which correspond to conserved moieties (P-invariants) and elementary flux modes (EFMs, minimal T-invariants). While conserved moieties can be used to detect stoichiometric inconsistencies in reaction networks, EFMs correspond to a mathematically rigorous definition of the concept of a biochemical pathway. As outlined here, EFMs allow to devise strategies for strain improvement, to assess the robustness of metabolic networks subject to perturbations, and to analyze the information flow in regulatory and signaling networks. Another important aspect addressed by this review is the limitation of metabolic pathway analysis using EFMs to small or medium-scale reaction networks. We discuss two recently introduced approaches to circumvent these limitations. The first is an algorithm to enumerate a subset of EFMs in genome-scale metabolic networks starting from the EFM with the least number of reactions. The second approach, elementary flux pattern analysis, allows to analyze pathways through specific subsystems of genome-scale metabolic networks. In contrast to EFMs, elementary flux patterns much more accurately reflect the metabolic capabilities of a subsystem of metabolism as well as its integration into the entire system.

van Eunen et al. (2011) Quantitative analysis of flux regulation through hierarchical regulation analysis. Meth Enzymol 500:571-95. (pmid: 21943915)

Regulation analysis is a methodology that quantifies to what extent a change in the flux through a metabolic pathway is regulated by either gene expression or metabolism. Two extensions to regulation analysis were developed over the past years: (i) the regulation of V(max) can be dissected into the various levels of the gene-expression cascade, such as transcription, translation, protein degradation, etc. and (ii) a time-dependent version allows following flux regulation when cells adapt to changes in their environment. The methodology of the original form of regulation analysis as well as of the two extensions will be described in detail. In addition, we will show what is needed to apply regulation analysis in practice. Studies in which the different versions of regulation analysis were applied revealed that flux regulation was distributed over various processes and depended on time, enzyme, and condition of interest. In the case of the regulation of glycolysis in baker's yeast, it appeared, however, that cells that remain under respirofermentative conditions during a physiological challenge tend to invoke more gene-expression regulation, while a shift between respirofermentative and respiratory conditions invokes an important contribution of metabolic regulation. The complexity of the regulation observed in these studies raises the question what is the advantage of this highly distributed and condition-dependent flux regulation.

Schäuble et al. (2011) Hands-on metabolism analysis of complex biochemical networks using elementary flux modes. Meth Enzymol 500:437-56. (pmid: 21943910)

The aim of this chapter is to discuss the basic principles and reasoning behind elementary flux mode analysis (EFM analysis)--an important tool for the analysis of metabolic networks. We begin with a short introduction into metabolic pathway analysis and subsequently outline in detail fundamentals of EFM analysis by way of a small example network. We discuss issues arising in the reconstruction of metabolic networks required for EFM analysis and how they can be circumvented. Subsequently, we analyze a more elaborate example network representing photosynthate metabolism. Finally, we give an overview of applications of EFM analysis in biotechnology and other fields and discuss issues arising when applying methods from metabolic pathway analysis to genome-scale metabolic networks.

Gianchandani et al. (2010) The application of flux balance analysis in systems biology. Wiley Interdiscip Rev Syst Biol Med 2:372-382. (pmid: 20836035)

An increasing number of genome-scale reconstructions of intracellular biochemical networks are being generated. Coupled with these stoichiometric models, several systems-based approaches for probing these reconstructions in silico have been developed. One such approach, called flux balance analysis (FBA), has been effective at predicting systemic phenotypes in the form of fluxes through a reaction network. FBA employs a linear programming (LP) strategy to generate a flux distribution that is optimized toward a particular 'objective,' subject to a set of underlying physicochemical and thermodynamic constraints. Although classical FBA assumes steady-state conditions, several extensions have been proposed in recent years to constrain the allowable flux distributions and enable characterization of dynamic profiles even with minimal kinetic information. Furthermore, FBA coupled with techniques for measuring fluxes in vivo has facilitated integration of computational and experimental approaches, and is allowing pursuit of rational hypothesis-driven research. Ultimately, as we will describe in this review, studying intracellular reaction fluxes allows us to understand network structure and function and has broad applications ranging from metabolic engineering to drug discovery.

Terzer et al. (2009) Genome-scale metabolic networks. Wiley Interdiscip Rev Syst Biol Med 1:285-297. (pmid: 20835998)

During the last decade, models have been developed to characterize cellular metabolism at the level of an entire metabolic network. The main concept that underlies whole-network metabolic modeling is the identification and mathematical definition of constraints. Here, we review large-scale metabolic network modeling, in particular, stoichiometric- and constraint-based approaches. Although many such models have been reconstructed, few networks have been extensively validated and tested experimentally, and we focus on these. We describe how metabolic networks can be represented using stoichiometric matrices and well-defined constraints on metabolic fluxes. We then discuss relatively successful approaches, including flux balance analysis (FBA), pathway analysis, and common extensions or modifications to these approaches. Finally, we describe techniques for integrating these approaches with models of other biological processes.

Oberhardt et al. (2009) Flux balance analysis: interrogating genome-scale metabolic networks. Methods Mol Biol 500:61-80. (pmid: 19399432)

Flux balance analysis (FBA) is a computational method to analyze reconstructions of biochemical networks. FBA requires the formulation of a biochemical network in a precise mathematical framework called a stoichiometric matrix. An objective function is defined (e.g., growth rate) toward which the system is assumed to be optimized. In this chapter, we present the methodology, theory, and common pitfalls of the application of FBA.

Becker et al. (2007) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat Protoc 2:727-38. (pmid: 17406635)

The manner in which microorganisms utilize their metabolic processes can be predicted using constraint-based analysis of genome-scale metabolic networks. Herein, we present the constraint-based reconstruction and analysis toolbox, a software package running in the Matlab environment, which allows for quantitative prediction of cellular behavior using a constraint-based approach. Specifically, this software allows predictive computations of both steady-state and dynamic optimal growth behavior, the effects of gene deletions, comprehensive robustness analyses, sampling the range of possible cellular metabolic states and the determination of network modules. Functions enabling these calculations are included in the toolbox, allowing a user to input a genome-scale metabolic model distributed in Systems Biology Markup Language format and perform these calculations with just a few lines of code. The results are predictions of cellular behavior that have been verified as accurate in a growing body of research. After software installation, calculation time is minimal, allowing the user to focus on the interpretation of the computational results.
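The constraints described in the abstracts above can be illustrated on a very small example. The sketch below defines a hypothetical four-reaction network (uptake of A, conversion of A to B coupled to ATP consumption, export of B, and recharging of ADP to ATP), checks that a flux vector lies in the right null space of the stoichiometric matrix (steady state), and checks that a weight vector lies in the left null space (a conserved moiety). The network, the bounds, and all names are assumptions made for illustration. Because the steady-state solutions of this toy network are all multiples of one flux mode, maximizing export reduces to scaling that mode to the tightest bound; a genome-scale model would need a linear programming solver instead.

```python
REACTIONS = ["uptake", "convert", "export", "recharge"]
METABOLITES = ["A", "B", "ATP", "ADP"]

# Stoichiometric matrix S: rows are metabolites, columns are reactions.
S = [
    [1, -1, 0, 0],   # A:   produced by uptake, consumed by convert
    [0, 1, -1, 0],   # B:   produced by convert, consumed by export
    [0, -1, 0, 1],   # ATP: consumed by convert, produced by recharge
    [0, 1, 0, -1],   # ADP: produced by convert, consumed by recharge
]

# Hypothetical upper flux bounds; only uptake is limiting here.
UPPER_BOUNDS = [10.0, 1000.0, 1000.0, 1000.0]


def mat_vec(matrix, vector):
    """Return matrix @ vector for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def vec_mat(vector, matrix):
    """Return vector @ matrix for plain nested lists."""
    n_cols = len(matrix[0])
    return [
        sum(vector[i] * matrix[i][j] for i in range(len(matrix)))
        for j in range(n_cols)
    ]


def is_steady_state(fluxes, tol=1e-9):
    """True if S v = 0, i.e. no metabolite accumulates or depletes."""
    return all(abs(x) <= tol for x in mat_vec(S, fluxes))


def is_conserved(weights, tol=1e-9):
    """True if w^T S = 0, i.e. the weighted metabolite sum never changes."""
    return all(abs(x) <= tol for x in vec_mat(weights, S))


# The single flux mode of this network: every reaction carries equal flux.
unit_mode = [1.0, 1.0, 1.0, 1.0]

# Scale the mode until the tightest upper bound is reached.
scale = min(ub / v for ub, v in zip(UPPER_BOUNDS, unit_mode))
optimal_fluxes = [scale * v for v in unit_mode]
export_flux = optimal_fluxes[REACTIONS.index("export")]
```

Here `is_conserved([0, 0, 1, 1])` reports that the ATP + ADP pool is conserved, and `is_steady_state(optimal_fluxes)` confirms that the scaled flux distribution still balances every metabolite.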


 

Logical models

Wynn et al. (2012) Logic-based models in systems biology: a predictive and parameter-free network analysis method. Integr Biol (Camb) 4:1323-37. (pmid: 23072820)

Highly complex molecular networks, which play fundamental roles in almost all cellular processes, are known to be dysregulated in a number of diseases, most notably in cancer. As a consequence, there is a critical need to develop practical methodologies for constructing and analysing molecular networks at a systems level. Mathematical models built with continuous differential equations are an ideal methodology because they can provide a detailed picture of a network's dynamics. To be predictive, however, differential equation models require that numerous parameters be known a priori and this information is almost never available. An alternative dynamical approach is the use of discrete logic-based models that can provide a good approximation of the qualitative behaviour of a biochemical system without the burden of a large parameter space. Despite their advantages, there remains significant resistance to the use of logic-based models in biology. Here, we address some common concerns and provide a brief tutorial on the use of logic-based models, which we motivate with biological examples.

Wang et al. (2012) Boolean modeling in systems biology: an overview of methodology and applications. Phys Biol 9:055001. (pmid: 23011283)

Mathematical modeling of biological processes provides deep insights into complex cellular systems. While quantitative and continuous models such as differential equations have been widely used, their use is obstructed in systems wherein the knowledge of mechanistic details and kinetic parameters is scarce. On the other hand, a wealth of molecular level qualitative data on individual components and interactions can be obtained from the experimental literature and high-throughput technologies, making qualitative approaches such as Boolean network modeling extremely useful. In this paper, we build on our research to provide a methodology overview of Boolean modeling in systems biology, including Boolean dynamic modeling of cellular networks, attractor analysis of Boolean dynamic models, as well as inferring biological regulatory mechanisms from high-throughput data using Boolean models. We finally demonstrate how Boolean models can be applied to perform the structural analysis of cellular networks. This overview aims to acquaint life science researchers with the basic steps of Boolean modeling and its applications in several areas of systems biology.

Chaouiya et al. (2012) Logical modelling of gene regulatory networks with GINsim. Methods Mol Biol 804:463-79. (pmid: 22144167)

Discrete mathematical formalisms are well adapted to model large biological networks, for which detailed kinetic data are scarce. This chapter introduces the reader to a well-established qualitative (logical) framework for the modelling of regulatory networks. Relying on GINsim, a software implementing this logical formalism, we guide the reader step by step towards the definition and the analysis of a simple model of the lysis-lysogeny decision in the bacteriophage λ.

Garg et al. (2012) Implicit methods for qualitative modeling of gene regulatory networks. Methods Mol Biol 786:397-443. (pmid: 21938638)

Advancements in high-throughput technologies to measure increasingly complex biological phenomena at the genomic level are rapidly changing the face of biological research from the single-gene single-protein experimental approach to studying the behavior of a gene in the context of the entire genome (and proteome). This shift in research methodologies has resulted in a new field of network biology that deals with modeling cellular behavior in terms of network structures such as signaling pathways and gene regulatory networks. In these networks, different biological entities such as genes, proteins, and metabolites interact with each other, giving rise to a dynamical system. Even though there exists a mature field of dynamical systems theory to model such network structures, some technical challenges are unique to biology such as the inability to measure precise kinetic information on gene-gene or gene-protein interactions and the need to model increasingly large networks comprising thousands of nodes. These challenges have renewed interest in developing new computational techniques for modeling complex biological systems. This chapter presents a modeling framework based on Boolean algebra and finite-state machines that are reminiscent of the approach used for digital circuit synthesis and simulation in the field of very-large-scale integration (VLSI). The proposed formalism enables a common mathematical framework to develop computational techniques for modeling different aspects of the regulatory networks such as steady-state behavior, stochasticity, and gene perturbation experiments.

Whelan et al. (2011) Representation, simulation, and hypothesis generation in graph and logical models of biological networks. Methods Mol Biol 759:465-82. (pmid: 21863503)

This chapter presents a discussion of metabolic modeling from graph theory and logical modeling perspectives. These perspectives are closely related and focus on the coarse structure of metabolism, rather than the finer details of system behavior. The models have been used as background knowledge for hypothesis generation by Robot Scientists using yeast as a model eukaryote, where experimentation and machine learning are used to identify additional knowledge to improve the metabolic model. The logical modeling concept is being adapted to cell signaling and transduction biological networks.

Morris et al. (2010) Logic-based models for the analysis of cell signaling networks. Biochemistry 49:3216-24. (pmid: 20225868)

Computational models are increasingly used to analyze the operation of complex biochemical networks, including those involved in cell signaling networks. Here we review recent advances in applying logic-based modeling to mammalian cell biology. Logic-based models represent biomolecular networks in a simple and intuitive manner without describing the detailed biochemistry of each interaction. A brief description of several logic-based modeling methods is followed by six case studies that demonstrate biological questions recently addressed using logic-based models and point to potential advances in model formalisms and training procedures that promise to enhance the utility of logic-based methods for studying the relationship between environmental inputs and phenotypic or signaling state outputs of complex signaling networks.
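To make the parameter-free character of logical models concrete, the following sketch encodes a hypothetical two-gene toggle switch (A and B repress each other) as a Boolean network with synchronous updates and enumerates its attractors by exhaustive search over all states. The rules, node names, and function names are illustrative assumptions and are not taken from any of the papers above; exhaustive enumeration is feasible only for small networks.

```python
from itertools import product

# Update rules: each node's next value as a function of the current state.
RULES = {
    "A": lambda s: not s["B"],   # A is on unless repressed by B
    "B": lambda s: not s["A"],   # B is on unless repressed by A
}
NODES = sorted(RULES)


def step(state):
    """Apply all rules at once (synchronous update) to a tuple of booleans."""
    named = dict(zip(NODES, state))
    return tuple(bool(RULES[node](named)) for node in NODES)


def find_attractors():
    """Follow every initial state until it revisits a state; collect cycles."""
    attractors = set()
    for start in product([False, True], repeat=len(NODES)):
        seen = []
        state = start
        while state not in seen:
            seen.append(state)
            state = step(state)
        # The states from the first revisited one onward form the attractor.
        cycle = seen[seen.index(state):]
        attractors.add(frozenset(cycle))
    return attractors


attractors = find_attractors()
```

For this toy network the search finds two fixed points (A on with B off, and B on with A off) and one two-state cycle between both-off and both-on. The cycle is a known consequence of updating all nodes simultaneously, which is one reason why the choice of update scheme matters when interpreting attractors.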

 

Petri Nets

...

 

Cellular Automata

...

 

Process calculi (pi-calculus)

Regev et al. (2001) Representation and simulation of biochemical processes using the pi-calculus process algebra. Pac Symp Biocomput 459-70. (pmid: 11262964)

Despite the rapidly accumulating body of knowledge about protein networks, there is currently no convenient way of sharing and manipulation of such information. We suggest that a formal computer language for describing the biomolecular processes underlying protein networks is essential for rapid advancement in this field. We propose to model biomolecular processes by using the pi-Calculus, a process algebra, originally developed for describing computer processes. Our model for biochemical processes is mathematically well-defined, while remaining biologically faithful and transparent. It is amenable to computer simulation, analysis and formal verification. We have developed a computer simulation system, the PiFCP, for execution and analysis of pi-calculus programs. The system allows us to trace, debug and monitor the behavior of biochemical networks under various manipulations. We present a pi-calculus model for the RTK-MAPK signal transduction pathway, formally represent detailed molecular and biochemical information, and study it by various PiFCP simulations.


 

Agent-based models

...


 

Exercises

Marwan et al. (2012) Petri nets in Snoopy: a unifying framework for the graphical display, computational modelling, and simulation of bacterial regulatory networks. Methods Mol Biol 804:409-37. (pmid: 22144165)

Using the example of phosphate regulation in enteric bacteria, we demonstrate the particular suitability of stochastic Petri nets to model biochemical phenomena and their simulative exploration by various features of the software tool Snoopy.


 

Further reading and resources

Kalhor et al. (2011) Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nat Biotechnol 30:90-8. (pmid: 22198700)

We describe tethered conformation capture (TCC), a method for genome-wide mapping of chromatin interactions. By performing ligations on solid substrates rather than in solution, TCC substantially enhances the signal-to-noise ratio, thereby facilitating a detailed analysis of interactions within and between chromosomes. We identified a group of regions in each chromosome in human cells that account for the majority of interchromosomal interactions. These regions are marked by high transcriptional activity, suggesting that their interactions are mediated by transcriptional machinery. Each of these regions interacts with numerous other such regions throughout the genome in an indiscriminate fashion, partly driven by the accessibility of the partners. As a different combination of interactions is likely present in different cells, we developed a computational method to translate the TCC data into physical chromatin contacts in a population of three-dimensional genome structures. Statistical analysis of the resulting population demonstrates that the indiscriminate properties of interchromosomal interactions are consistent with the well-known architectural features of the human genome.

Misteli (2012) Parallel genome universes. Nat Biotechnol 30:55-6. (pmid: 22231096)
