Learning Boolean Network Dynamics from Data in Biological Systems
Extracting information about the dynamics and functioning of complex systems is one of the most challenging tasks in modern science. A central problem is to accurately and reliably infer the underlying cause-and-effect (i.e., causal) network from observational data, especially when the system consists of a large number of interacting components and the dynamics are intrinsically nonlinear. Building on our recently developed theory of causation entropy and its optimization (J. Sun, D. Taylor, and E. M. Bollt, SIAM Journal on Applied Dynamical Systems 14, 73–106, 2015), we developed an efficient computational approach to learn, or in other words "reverse engineer", Boolean networks and functions directly from observational data. In contrast to existing methods, which are limited to small networks, require small node degree, demand very large sample sizes, or suffer from some combination of these restrictions, the proposed approach, by utilizing the optimal causation entropy principle, handles large networks and produces accurate reconstructions even from limited, noisy data.
We demonstrate the effectiveness of our approach using both synthetic and experimental data, including time series from random Boolean networks, simulation data from plant signaling networks, data generated by board games (Tic-Tac-Toe and chess), categorical data for automated medical diagnosis including acute inflammation and cardiac imaging, and data from gene expression.
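To make the central quantity concrete, the following is a minimal sketch of how causation entropy might be estimated from a Boolean time series. It assumes the definition C(j -> i | K) = H(x_i(t+1) | x_K(t)) - H(x_i(t+1) | x_K(t), x_j(t)) and a simple plug-in (frequency-count) entropy estimator. The function names (`cond_entropy`, `causation_entropy`, `simulate`) and the three-node toy network are hypothetical illustrations, not the implementation used in this work, and the sketch omits the optimization and significance-testing steps of the full approach.

```python
from collections import Counter
from math import log2


def cond_entropy(target, cond_rows):
    """Plug-in estimate of H(target | cond) in bits from paired samples."""
    n = len(target)
    joint = Counter(zip(cond_rows, target))  # counts of (condition, target) pairs
    marg = Counter(cond_rows)                # counts of each condition
    return -sum(c / n * log2(c / marg[k]) for (k, _), c in joint.items())


def causation_entropy(series, j, i, cond):
    """Estimate C(j -> i | cond) from a list of state tuples ordered in time.

    Assumed definition:
    H(x_i(t+1) | x_cond(t)) - H(x_i(t+1) | x_cond(t), x_j(t)).
    """
    nxt = [row[i] for row in series[1:]]  # x_i at time t+1
    past = series[:-1]                    # full state at time t
    base = [tuple(row[k] for k in cond) for row in past]
    ext = [tuple(row[k] for k in cond) + (row[j],) for row in past]
    return cond_entropy(nxt, base) - cond_entropy(nxt, ext)


def simulate(steps):
    """Hypothetical toy network: nodes 1 and 2 form a 2-bit counter,
    and node 0 is updated as the XOR of nodes 1 and 2."""
    state = (0, 0, 0)
    series = [state]
    for _ in range(steps):
        _, x1, x2 = state
        state = (x1 ^ x2, x1 ^ x2, 1 - x2)
        series.append(state)
    return series


series = simulate(400)
no_conditioning = causation_entropy(series, 1, 0, ())
given_node_2 = causation_entropy(series, 1, 0, (2,))
print(no_conditioning, given_node_2)
```

In this toy example, the estimate without conditioning is 0 bits, while conditioning on node 2 gives 1 bit: node 1 appears uninformative about node 0 on its own, and its influence only becomes visible once node 2 is accounted for. This is the kind of joint, nonlinear dependence that pairwise measures miss in Boolean functions.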