- Data-driven methods for design, control and verification of cyber-physical systems – combining machine learning and statistics with physics-based modeling
- Efficient methods for analysis and identification of complex dynamical systems
- Applications in Energy and Power Systems, and in Cancer Systems Biology

I am interested in engineering problems that require both rigorous modeling and analysis, to exploit available domain knowledge, and a data-driven approach, to incorporate insights from measurements and reduce model uncertainty.
My current research is motivated by problems in energy (intelligent operation and control of electric power distribution) and cancer systems biology (understanding heterogeneity and treatment escape in breast cancer tumors).
I combine system-theoretic tools (modeling, analysis, control) with optimization and data-driven inference methods (machine learning and statistics). In recent years, trends in both research areas and their respective applications have motivated exploring their intersection. On the one hand, systems theory has a long tradition of developing system identification tools to construct models from data, and there is now renewed interest in updating these methods with new computational and inference techniques. On the other hand, in statistics and machine learning, the abundance of data often leads to scalability and computational challenges, which raises the natural question of how to navigate the available data in a structured and efficient way. It is therefore becoming more important to incorporate the domain knowledge of the application and to use modeling and abstraction to help select the right features and design experiments.

Smart Grid Control

High penetration levels of distributed generation (DG) and electric vehicles (EVs) diversify power flow and bring uncertainty to distribution networks, making planning and control more involved for distribution system operators (DSOs). The consequent need to augment forecasts with real-time state estimation is economically and technically challenging, since it requires investing in a large number of sensors that have to communicate with often older and slower supervisory control and data acquisition (SCADA) systems. We address distribution grid state estimation by combining only a limited set of sensors with load forecast information. We revisit open problems in a recent paper that proposes a Bayesian estimation scheme. We derive the estimator for balanced power networks via rigorous modeling, allowing for generalization to three-phase unbalanced networks. An offline analysis of load aggregation, forecast accuracy and number of sensors provides concrete engineering trade-offs to determine the optimal number of sensors for a desired accuracy. This estimation procedure can be used in real time as an observer for control problems, or offline for planning purposes to assess the effect of DG or EVs on specific network components.
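To make the idea of fusing forecasts with a few sensors concrete, the following is a minimal sketch of a linear-Gaussian Bayesian update. It is an illustration only and not the estimator derived in the paper: the linear measurement model, the lossless feeder, and all numbers (`prior_mean`, `prior_cov`, the sensor reading) are assumptions made up for this example.

```python
# Minimal sketch: treat the load forecast as a Gaussian prior and fuse one
# scalar sensor reading y = h . x + noise. All values are hypothetical.

def bayes_update(mean, cov, h, y, r):
    """Return posterior mean and covariance after one scalar measurement.

    mean, cov: prior (forecast) mean vector and symmetric covariance matrix
    h:         measurement row vector, y: sensor reading, r: sensor noise variance
    """
    n = len(mean)
    cov_h = [sum(cov[i][j] * h[j] for j in range(n)) for i in range(n)]  # cov @ h
    innovation_var = sum(h[i] * cov_h[i] for i in range(n)) + r
    gain = [c / innovation_var for c in cov_h]
    residual = y - sum(h[i] * mean[i] for i in range(n))
    new_mean = [mean[i] + gain[i] * residual for i in range(n)]
    # cov is symmetric, so (h^T cov)[j] equals cov_h[j]
    new_cov = [[cov[i][j] - gain[i] * cov_h[j] for j in range(n)] for i in range(n)]
    return new_mean, new_cov


# Hypothetical forecast for three loads, with independent forecast errors.
prior_mean = [1.0, 2.0, 1.5]
prior_cov = [[0.04, 0.0, 0.0],
             [0.0, 0.25, 0.0],
             [0.0, 0.0, 0.09]]

# One feeder-head sensor that measures the sum of the loads (losses ignored).
h_feeder = [1.0, 1.0, 1.0]
post_mean, post_cov = bayes_update(prior_mean, prior_cov, h_feeder, y=5.0, r=0.01)

print("posterior mean:", post_mean)
print("posterior variances:", [post_cov[i][i] for i in range(3)])
```

In this toy setting a single aggregate measurement shifts the least certain load forecast the most, which hints at why forecast accuracy and sensor count can be traded off against each other.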

Electronic power inverters are capable of quickly delivering reactive power to maintain customer voltages within operating tolerances and to reduce system losses in distribution grids. This paper proposes a systematic and data-driven approach to determine reactive power inverter output as a function of local measurements in a manner that obtains near-optimal results. First, we use a network model together with historical load and generation data to solve an optimal power flow problem, which yields globally optimal reactive power injections for all controllable inverters in the network. Subsequently, we use regression to find a function for each inverter that maps its local historical data to an approximation of its optimal reactive power injection. The resulting functions then serve as decentralized controllers in the participating inverters, predicting the optimal injection from new local measurements. The method achieves near-optimal results when performing voltage- and capacity-constrained loss minimization and voltage flattening, and allows for an efficient volt-VAR optimization (VVO) scheme in which legacy control equipment collaborates with existing inverters to facilitate safe operation of distribution networks with higher levels of distributed generation.
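The regression step can be illustrated with a small sketch. It assumes, purely for illustration, a single local feature (voltage magnitude in per unit), a linear fit, and clipping to a capacity limit `q_max`. The paper's actual features and regression model may differ, and the data below are made-up placeholders standing in for optimal power flow results.

```python
# Minimal sketch: learn one inverter's local control law from (hypothetical)
# pairs of local voltage measurements and OPF-optimal reactive power injections.

def fit_local_controller(local_meas, q_opt):
    """Ordinary least-squares fit of q ~ intercept + slope * measurement."""
    n = len(local_meas)
    mean_x = sum(local_meas) / n
    mean_q = sum(q_opt) / n
    sxx = sum((x - mean_x) ** 2 for x in local_meas)
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(local_meas, q_opt))
    slope = sxq / sxx
    intercept = mean_q - slope * mean_x
    return intercept, slope


def local_control(measurement, intercept, slope, q_max):
    """Predict the injection from a new local measurement, clipped to capacity."""
    q = intercept + slope * measurement
    return max(-q_max, min(q_max, q))


# Placeholder training data (not from the paper): voltages in p.u. and the
# reactive power an offline optimal power flow would have chosen.
voltages = [0.96, 0.98, 1.00, 1.02, 1.04]
q_optimal = [0.20, 0.11, 0.01, -0.09, -0.21]

intercept, slope = fit_local_controller(voltages, q_optimal)

# At run time the inverter only needs its own measurement.
q_nominal = local_control(1.00, intercept, slope, q_max=0.3)
q_low_voltage = local_control(0.90, intercept, slope, q_max=0.3)
print(slope, q_nominal, q_low_voltage)
```

The fitted function needs no communication at run time, which is the property that makes the resulting controllers decentralized.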

Breast cancer tumors have inherently heterogeneous cell types that respond differently to treatments. Although there is a wealth of studies describing canonical cell signaling networks, little is known about how these networks operate in different cancer cells and under different treatments. This paper proposes a method to decompose a set of responses gathered from experiments on different cancer cells into common and specific components. The key to this retrieval is the derivation of a linear time-varying model of the shared dynamics among the different cell lines. A convex optimization problem is derived that retrieves both the model and the common and specific responses without a priori information. The method is tested on synthetic data, and it reproduces known facts when tested on a biological data set with protein expression data from breast cancer experiments. The technique can be used to analyze specific responses to understand what treatments can be combined to persistently treat a heterogeneous cancer tumor. The linear time-varying model sheds light on how proteins interact over time.

Phenotypic heterogeneity, or cellular diversity on the phenotypic level, poses a major hurdle to effective treatment of certain cancers (e.g., triple negative breast cancer). Hence, the discovery of strategies to reduce this heterogeneity is a fundamental priority for the cancer biology community. Knowledge of the phenomena that govern drug-induced cell populations is needed to design therapeutic approaches that systematically control phenotypic diversity. In this work, we present a mathematically simple, yet powerful framework to suggest why observed trends in phenotype time-trajectories occur and why trends may change under application of targeted therapies. A linear time-invariant model is derived to represent cell division and death of each phenotype in addition to switching between phenotype pairs. A convex optimization procedure is formulated and solved to estimate values of model parameters using measurements from a breast cancer cell line. The resulting values are analyzed and found to agree qualitatively with several existing biological hypotheses. Remarkably, the model indicates that the PI3K/mTOR inhibitor therapy reduces the prominence of basal phenotypes due to switching behavior and low cell division, as opposed to other logical explanations, such as selective elimination. Further, the model suggests that the MEK inhibitor drug promotes the prominence of basal phenotypes via switching phenomena, instead of increased cell division or reduced death. Our modeling framework and initial results are important milestones in the design of richer models and experiments with the potential to drive discovery of effective cancer treatments.
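The structure of such a model can be sketched for two phenotypes. The parameterization below (a net growth rate per phenotype and per-step switching fractions), the rate values, and the use of unconstrained least squares are all assumptions for illustration. Least squares is a convex problem, but it only stands in for the paper's optimization procedure, whose exact constraints are not given here.

```python
# Minimal sketch: discrete-time LTI phenotype model x[k+1] = A x[k], plus a
# least-squares fit of A from a trajectory. All rates are hypothetical.

def build_A(growth, switch):
    """growth[i]: division minus death rate of phenotype i per step.
    switch[i][j]: fraction of phenotype i that switches to phenotype j per step."""
    n = len(growth)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0 + growth[i] - sum(switch[i][j] for j in range(n) if j != i)
        for j in range(n):
            if j != i:
                A[j][i] = switch[i][j]  # cells leaving i arrive in j
    return A


def simulate(A, x0, steps):
    """Propagate the cell counts for a number of time steps."""
    traj = [list(x0)]
    for _ in range(steps):
        x = traj[-1]
        traj.append([sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))])
    return traj


def fit_A_two_state(traj):
    """Least squares for two states: A = (X1 X0^T) (X0 X0^T)^-1."""
    G = [[0.0, 0.0], [0.0, 0.0]]  # X0 X0^T
    C = [[0.0, 0.0], [0.0, 0.0]]  # X1 X0^T
    for x, x_next in zip(traj[:-1], traj[1:]):
        for i in range(2):
            for j in range(2):
                G[i][j] += x[i] * x[j]
                C[i][j] += x_next[i] * x[j]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    G_inv = [[G[1][1] / det, -G[0][1] / det], [-G[1][0] / det, G[0][0] / det]]
    return [[sum(C[i][k] * G_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical rates: phenotype 0 grows and switches to 1 more often than the reverse.
A_true = build_A(growth=[0.05, -0.02], switch=[[0.0, 0.10], [0.03, 0.0]])
trajectory = simulate(A_true, x0=[100.0, 20.0], steps=6)
A_fit = fit_A_two_state(trajectory)
print(A_fit)
```

In this parameterization the off-diagonal entries of the fitted matrix carry the switching behavior and the diagonal entries mix growth with outflow, which is the kind of separation that lets one ask whether a change in phenotype prominence comes from switching or from division and death.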