Polychromatic flow cytometry is a popular technique that has wide usage in the medical sciences, especially for studying phenotypic properties of cells. The high-dimensionality of data generated by flow cytometry usually makes it difficult to visualize. The naive solution of simply plotting two-dimensional graphs for every combination of observables becomes impractical as the number of dimensions increases. A natural solution is to project the data from the original high dimensional space to a lower dimensional space while approximately preserving the overall relationship between the data points. The expert can then easily visualize and analyze this low-dimensional embedding of the original dataset.
A stochastic computational model parameter synthesis system comprising at least one processor capable of executing processor executable code, and a non-transitory computer memory operably coupled with the at least one processor and storing processor executable code, which when executed by processor, causes processor to synthesize at least one parameter of a stochastic computational model to satisfy one or more behavioral specifications of properties observed in a modeled system. The processor generates and searches randomized projections of a first parameter space having n dimensions into one or more second abstract parameter space having d dimensions, where d is less than n, and outputs a signal indicative of a synthesized parameter value to the user.
Background: Probabilistic models have gained widespread acceptance in the systems biology community as a useful way to represent complex biological systems. Such models are developed using existing knowledge of the structure and dynamics of the system, experimental observations, and inferences drawn from statistical analysis of empirical data. A key bottleneck in building such models is that some system variables cannot be measured experimentally. These variables are incorporated into the model as numerical parameters. Determining values of these parameters that justify existing experiments and provide reliable predictions when model simulations are performed is a key research problem. Domain experts usually estimate the values of these parameters by fitting the model to experimental data. Model fitting is usually expressed as an optimization problem that requires minimizing a cost-function which measures some notion of distance between the model and the data. This optimization problem is often solved by combining local and global search methods that tend to perform well for the specific application domain. When some prior information about parameters is available, methods such as Bayesian inference are commonly used for parameter learning. Choosing the appropriate parameter search technique requires detailed domain knowledge and insight into the underlying system.
Results: Using an agent-based model of the dynamics of acute inflammation, we demonstrate a novel parameter estimation algorithm by discovering the amount and schedule of doses of bacterial lipopolysaccharide that guarantee a set of observed clinical outcomes with high probability. We synthesized values of twenty-eight unknown parameters such that the parameterized model instantiated with these parameter values satisfies four specifications describing the dynamic behavior of the model.
Conclusions: We have developed a new algorithmic technique for discovering parameters in complex stochastic models of biological systems given behavioral specifications written in a formal mathematical logic. Our algorithm uses Bayesian model checking, sequential hypothesis testing, and stochastic optimization to automatically synthesize parameters of probabilistic biological models.
We present algorithms for finding control strategies in Boolean Networks (BN). Our approach uses symbolic techniques from the field of model checking. We show that despite recent hardness-results for finding control policies, a model checking-based approach is often capable of scaling to extremely large and complex models. We demonstrate the effectiveness of our approach by applying it to a BN model of embryogenesis in D. melanogaster with 15,360 Boolean variables.
We present an algorithm, called BioLab, for verifying tem- poral properties of rule-based models of cellular signalling networks.
BioLab models are encoded in the BioNetGen language, and prop- erties are expressed as formulae in probabilistic bounded linear temporal logic. Temporal logic is a formalism for representing and reasoning about propositions qualified in terms of time. Properties are then verified using sequential hypothesis testing on executions generated using stochastic simulation. BioLab is optimal, in the sense that it generates the mini- mum number of executions necessary to verify the given property. Bio- Lab also provides guarantees on the probability of it generating Type-I (i.e., false-positive) and Type-II (i.e., false-negative) errors. Moreover, these error bounds are pre-specified by the user. We demonstrate Bio- Lab by verifying stochastic effects and bistability in the dynamics of the T-cell receptor signaling network.
We present a novel approach for predicting protein folding kinetics using techniques from the field of model checking. This represents the first time model checking has been applied to a problem in the field of structural biology. The protein’s energy landscape is encoded sym- bolically using Binary Decision Diagrams and related data structures. Questions regarding the kinetics of folding are encoded as formulas in the temporal logic CTL. Model checking algorithms are then used to make quantitative predictions about the kinetics of folding. We show that our approach scales to state spaces as large as 1023 when using exact algo- rithms for model checking. This is at least 14 orders of magnitude larger than the number of configurations considered by comparable techniques. Furthermore, our approach scales to state spaces at least as large as 1032 unique configurations when using approximation algorithms for model checking. We tested our method on 19 test proteins. The quantitative predictions regarding folding rates for these test proteins are in good agreement with experimentally measured values, achieving a correlation coefficient of 0.87.
This paper introduces novel techniques for exact and approximate inference in Dynamic Bayesian Networks (DBNs) based on algorithms, data structures, and formalisms from the field of model checking. Model checking comprises a family of techniques from for formally verifying systems of concurrent reactive processes. We discuss: i) the use of temporal logics as a query language for inference over DBNs; ii) translation of DBNs into probabilistic reactive modules; and iii) the use of symbolic data structures and algorithms for deciding complex stochastic temporal logic formu- las. We demonstrate the effectiveness of these new algorithms by examining the behavior of an enhanced expression model of embryogenesis in D. melanogaster. In particular, we converted an existing deterministic developmental model over a one-dimensional arrays of cells into a stochas- tic model over a two dimensional array of cells. Our results confirm that the rules which govern the one-dimensional model also display wild-type expression patterns in the two-dimensional case within certain parameter bounds.