The Cellular Signal Integration Group (C-SIG) at the Technical University of Denmark (DTU) and the Center for Biological Sequence Analysis (CBS) conducts computational and quantitative experimental biology with the aim of linking dynamic protein interaction networks to disease progression. This will enable us to target the networks driving regulatory diseases such as cancer, diabetes and neurological disorders. We are motivated by the opportunity to perform systems-based targeting of complex human diseases while simultaneously gaining insight into the fundamental principles behind kinase-based cellular information processing. Cell signaling networks are the foundation of cell fate and behavior, and their aberrant activity is a key mechanism underlying the pathological behavior of cells during tumor development. However, signaling networks are highly complex, involving a large ensemble of dynamic interactions that fluctuate in space and time. Thus, understanding how aberrant cell decisions arise requires a global view of cell signaling networks. We and others have demonstrated that obtaining systems-level insight into cell signaling requires a combination of experimental and computational exploration. Our lab has developed powerful computational tools (e.g. NetworKIN and NetPhorest) that can model phosphorylation-driven cellular signaling networks. We have previously deployed these on quantitative proteomics data to model DNA damage, stem-cell differentiation, cell-cell communication and signaling evolution, and to compare model organisms. Below we discuss some of our current research activities and projects.
Integrative Network Biology - Biological Forecasting
While studying molecular interactions has been a research focus for many years and has provided much insight into biology, a new age of integrative network biology has arrived. The aim of integrative network biology is to provide predictive models of cellular networks based on the integration of large and heterogeneous data sets, for example originating from proteomics and high-throughput functional genomics studies. We, and others, have shown that network models pertaining to cellular signaling can also be utilized for fine-grained prediction of human disease. Through quantitative systematic measurements of these networks and by integration of multiple types of data, it is now possible to define dynamic changes to these networks and to perform computational modeling that enables predictions to be made about the specific responses such systems should elicit. In particular, we have recently shown that by integrating systems genetics data with phospho-proteomics data and computational models of cellular kinase specificity, we could derive an integrative network model of JNK regulation in the fruit fly. This network can now serve as a framework for future targeted proteomics studies in cancer cells (in vitro and in vivo) to define how the network evolves and changes during cancer progression. We are now actively pursuing the integration of data from deep sequencing, functional genomics, extreme-throughput microscopy and mass spectrometry to perform network medicine and systems-level modeling of cancer metastasis.
In an integrative network biology study of Eph-ephrin cell-cell contact initiated signaling, we recently developed novel experimental and computational approaches to characterize the intracellular signaling networks that control the sorting of distinct cell types into separate compartments. This is an important problem, since cell compartmentalization is critical for the development of complex tissues and is usurped in a variety of disease states. We used data integration (with NetPhorest and NetworKIN) to computationally reconstruct cell-specific information processing during Eph receptor/ephrin-initiated cell sorting. An interesting innovation was to correlate phosphorylation of a kinase activation loop with phosphorylation of a predicted substrate site, and thus to assign kinases and substrates in an activity-dependent manner. Together, the combination of cell-specific proteomics, large-scale functional analysis and modeling provides a systems-level view of Eph receptor/ephrin-mediated signaling and cell sorting. This work provides the first systematic analysis of cell-specific signaling events induced by contact between two different cell populations. It therefore serves as a precedent for future investigations into normal and pathologic cell-cell interactions and the associated phosphorylation-driven networks. At DTU we are studying this and related systems as a function of time and during genetic and chemical perturbations.
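The activity-dependent assignment described above can be illustrated with a minimal sketch. It is not the published pipeline: it simply assumes that each kinase activation loop and each substrate site has a quantified phosphorylation profile over the same time points, and keeps a predicted kinase-substrate pair only when the two profiles are strongly positively correlated. The function names, the threshold `min_r`, the identifiers (`KIN_A`, `SUB_1:Y100`) and the profile values are all hypothetical placeholders, not measured data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no activity information
    return sxy / sqrt(sxx * syy)


def assign_active_pairs(loop_profiles, site_profiles, predicted_pairs, min_r=0.8):
    """Keep predicted (kinase, site) pairs whose phosphorylation profiles co-vary.

    loop_profiles:   kinase name -> activation-loop phosphorylation over time
    site_profiles:   site name   -> substrate-site phosphorylation over time
    predicted_pairs: (kinase, site) tuples, e.g. from a kinase-specificity predictor
    min_r:           hypothetical correlation cut-off (an assumption of this sketch)
    """
    kept = []
    for kinase, site in predicted_pairs:
        r = pearson(loop_profiles[kinase], site_profiles[site])
        if r >= min_r:
            kept.append((kinase, site, r))
    return kept


# Toy profiles, invented purely for illustration.
loop_profiles = {"KIN_A": [1.0, 2.0, 3.0, 4.0], "KIN_B": [4.0, 3.0, 2.0, 1.0]}
site_profiles = {"SUB_1:Y100": [2.0, 4.1, 6.0, 8.2]}
predicted_pairs = [("KIN_A", "SUB_1:Y100"), ("KIN_B", "SUB_1:Y100")]

kept = assign_active_pairs(loop_profiles, site_profiles, predicted_pairs)
for kinase, site, r in kept:
    print(f"{kinase} -> {site} (r = {r:.3f})")
```

In this toy example only `KIN_A` is retained, because its activation-loop profile rises together with the substrate site while that of `KIN_B` falls. A real analysis would also need to account for measurement noise, time lags between kinase activation and substrate phosphorylation, and multiple testing.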
An underlying aim of our network research is to establish frameworks that, similar to weather prediction models, can predict the behavior of cellular and biological systems. Our approach resembles forecasting in the sense that we quantify molecular dynamics at the genome and proteome levels in order to predict changes at the phenotypic level. These studies involve large-scale quantitative analysis of genomic mutations, signaling dynamics (e.g. phosphorylation), cell morphology and other phenotypic markers, in combination with machine-learning-based data integration and network model reconstruction.
Complex Systems - Networks and Evolution
John Nash showed that within a complex system, individuals are best off if they make the best decision that they can, taking into account the decisions of the other individuals. Recently, we investigated whether similar principles influence the evolution of signaling networks in multicellular animals. Specifically, by analyzing a set of metazoan species we observed a striking negative correlation of genomically encoded tyrosine content with biological complexity (as measured by the number of cell types in each organism). We argued that this observed tyrosine loss correlates with the expansion of tyrosine kinases in the evolution of the metazoan lineage and that it may relate to the optimization of signaling systems in multicellular animals. We proposed that this phenomenon illustrates genome-wide adaptive evolution to accommodate beneficial genetic perturbation. From a disease perspective this observation is exciting because reducing the number of potentially harmful tyrosine kinase interactions is likely important to avoid cancer and other complex diseases, and losing tyrosines seems to be a deliberate effort by the cells to reduce the risk of malfunction and disease.
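The kind of comparison described above can be sketched as follows, assuming that each species is represented by its set of protein sequences and by a count of its cell types. The sketch computes the fraction of tyrosine (Y) residues per proteome and a Spearman rank correlation against cell-type number. The choice of Spearman correlation is an assumption of this sketch, and the species names, sequences and cell-type counts are invented placeholders, not real data.

```python
from math import sqrt


def tyrosine_fraction(proteins):
    """Fraction of residues that are tyrosine (Y) across a list of protein sequences."""
    total = sum(len(p) for p in proteins)
    if total == 0:
        raise ValueError("empty proteome")
    return sum(p.count("Y") for p in proteins) / total


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Invented toy proteomes and cell-type counts, for illustration only.
proteomes = {
    "species_A": ["MKYYLAYG", "YSTYAK"],
    "species_B": ["MKYLLAGG", "ASTYAK"],
    "species_C": ["MKLLLAGG", "ASTYAK"],
}
cell_types = {"species_A": 3, "species_B": 50, "species_C": 200}

species = sorted(proteomes)
y_content = [tyrosine_fraction(proteomes[s]) for s in species]
complexity = [cell_types[s] for s in species]
rho = spearman(y_content, complexity)
print(f"Spearman rho = {rho:.2f}")
```

The toy input is constructed so that tyrosine content falls as the cell-type count rises, which gives a perfectly negative rank correlation. Real proteomes would additionally require correction for shared ancestry between species, which this sketch ignores.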
Although numerous human phosphorylation sites and their dynamics have been characterized, the evolutionary history and physiological importance of many signaling events remain unknown. Using target phosphoproteomes determined with a similar experimental and computational pipeline, we recently investigated the conservation of human phosphorylation events in distantly related model organisms (fly, worm, and yeast). With a sequence-alignment approach, we identified 479 phosphorylation events in 344 human proteins that appear to be positionally conserved over ∼600 million years of evolution and hence are likely to be involved in fundamental cellular processes. This sequence-alignment analysis suggested that many phosphorylation sites evolve rapidly and therefore do not display strong evolutionary conservation in terms of sequence position in distantly related organisms. Thus, we devised a network-alignment approach to reconstruct conserved kinase-substrate networks, which identified 778 phosphorylation events in 698 human proteins. Both methods identified proteins tightly regulated by phosphorylation and signal integration hubs and both types of phosphoproteins were enriched in proteins encoded by disease-associated genes. We analyzed the cellular functions and structural relationships for these conserved signaling events, noting the incomplete nature of current phosphoproteomes. Assessing phosphorylation conservation at both site and network levels proved useful for exploring both fast-evolving and ancient signaling events. Finally, we showed that multiple complex diseases converge within the conserved networks, suggesting that disease development might rely on common molecular networks.
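As a rough illustration of what positional conservation means in the sequence-alignment approach, the sketch below takes a pairwise alignment of a human protein and its ortholog (two gapped strings of equal length), maps a human phosphorylation site to the residue aligned with it, and calls the site conserved if that residue is a phosphorylatable amino acid (S, T or Y) that is itself a known phosphorylation site. The criterion, the function names and the toy alignment are assumptions made for illustration; the published analysis may have used different rules.

```python
def aligned_partner(human_aln, other_aln, human_pos):
    """Map a 1-based human residue position through a pairwise alignment.

    Returns (residue, 1-based position) in the other sequence, or None if the
    human residue is aligned to a gap.
    """
    if len(human_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    h = o = 0  # ungapped positions seen so far in each sequence
    for a, b in zip(human_aln, other_aln):
        if a != "-":
            h += 1
        if b != "-":
            o += 1
        if a != "-" and h == human_pos:
            return None if b == "-" else (b, o)
    raise ValueError("human_pos lies beyond the end of the human sequence")


def is_positionally_conserved(human_aln, other_aln, human_pos, other_phosphosites):
    """True if the aligned residue is S/T/Y and a known phosphosite in the ortholog."""
    partner = aligned_partner(human_aln, other_aln, human_pos)
    if partner is None:
        return False
    residue, position = partner
    return residue in "STY" and position in other_phosphosites


# Toy alignment, invented for illustration; "-" marks a gap.
human_aln = "MAS-PTYK"
other_aln = "MGSAP-YR"
other_phosphosites = {6}  # hypothetical observed phosphosite in the ortholog

for pos in (3, 5, 6):
    conserved = is_positionally_conserved(human_aln, other_aln, pos, other_phosphosites)
    print(pos, aligned_partner(human_aln, other_aln, pos), conserved)
```

Here human Y6 aligns to a phosphorylated tyrosine in the ortholog and counts as conserved, human T5 aligns to a gap, and human S3 aligns to a serine with no observed phosphorylation. Sites of the latter kinds are the ones a network-alignment approach could still recover if the kinase-substrate relationship, rather than the exact position, is preserved.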
Our most recent work aims to enable network-based interpretation of next-generation sequencing (NGS) data, and we are developing powerful new algorithms to predict the impact of disease mutations on cellular signaling networks. The aim is to move far beyond the primitive "Driver/Passenger" paradigm currently deployed in genome-wide association studies (GWAS) and large-scale genomic sequencing projects.
Signaling Network States and Dynamics
Signaling networks are not static. Any particular local network or sub-network (for example that of a kinase family and its substrates) is in a state of constant flux between a finite number of network states, each with a different architecture. We term these network ensembles, or the set of executable networks. The differences between a stem cell and a differentiated cell, between a normal epithelial cell and a cancer cell, or between a cancer cell in the primary tumor and a metastatic cell, may reflect differences in the network state-space that each cell explores. The network architecture at any point in state-space determines phenotypic output. Even though each network is composed of thousands of elements with complex connectivity, it is likely that only a limited set of network states actually exist and are optimal for cell growth. If cells could explore large regions of network state-space, it would be unlikely that any aspects of network architecture would be conserved between cell types or species, yet such conservation is often observed. Importantly, because there are likely only a limited number of network states, we propose that quantification of an entire cell’s network state-space at a given point in time is an achievable goal.
To perform systematic quantification of network activity following a defined environmental cue, we recently used mass spectrometry (MS) in combination with siRNA screening and computational models of kinase activation to create integrative models of network states in two distinct cell types. We could thereby study cell-cell contact initiated signaling in cells expressing the Eph receptor (EphR) and cells expressing ephrin, the membrane-bound ligand for EphR. We examined the differences between the networks of kinases, substrates and adaptors that regulate and respond to the signaling events that occur following a cell-cell contact mediated EphR-ephrin interaction. Importantly, we showed that the network dynamics and structure during cell-cell contact are significantly different from those observed using traditional artificial ligands. Thus, moving forward with quantitative biology, it is important to reconsider how we stimulate biological systems (for example, with a growth factor) if the end goal is to derive physiologically relevant models.
Computational Biology - Kinome Biology Algorithms
A major activity within our lab is to continue to develop computational tools (for example, NetworKIN and NetPhorest) and to deploy these on quantitative mass-spectrometry-based proteomics data (or other types of data) to understand, at a systems level, the principles of how the spatial and temporal assembly of mammalian interaction networks transmits and processes information in order to alter cellular behavior. Gaining broader network coverage will facilitate the move from prediction to descriptive modeling of mammalian signaling networks. We research the use of Bayesian statistics and machine learning (in particular Artificial Neural Networks) to model and integrate signaling networks with those related to cell behavior and phenotype. The development of the next generation of algorithms requires the use of both exascale-computing large shared-memory systems (SGI ALTIX UV) and GPU-based infrastructure for the acceleration needed for large-scale deployment. We also investigate statistical algorithms such as Markov Random Fields, Markov chain Monte Carlo simulations and Support Vector Machines (SVMs) on biological data. Another major activity is the analysis of very high-dimensional non-linear data cubes such as those pertaining to morphological data obtained through large-scale quantitative microscopy. We deploy both linear and non-linear dimensionality reduction to integrate these data with signaling dynamics. Our mantra (inspired by Doug Lauffenburger at MIT) is that in systems biology the modeling approach most often depends on the type of data and the biological question in mind.
We have developed some of the most frequently used bioinformatics algorithms for studying protein disorder (GlobPlot and DisEMBL) as well as linear motifs in signaling proteins (ELM). Recently we launched the NetPhorest automated machine learning framework, which is now a community resource for signaling biologists. A major goal is to integrate the NetPhorest atlas into NetworKIN. This will enable more accurate predictions to be made for a larger fraction of the kinome and facilitate modeling of interactions mediated by e.g. SH2 and BRCT phospho-binding domains. We are developing new system-specific contextual algorithms which will enable us to calculate protein associations similar to STRING in a manner specific to a given system (e.g. a particular cell line). We aim to release all our code as Open Source and as web services once they have matured. We are developing and applying powerful new tools for quantitative and systems biological research. In collaboration with Prof. Tony Pawson's laboratory in Toronto we have recently published the large-scale open source laboratory information management system OpenFreezer and an in-house pipeline for proteomics data analysis, ProteoChart (Pasculescu et al., unpublished).
Current drug development efforts almost uniformly focus on specific steps in a well-described disease “pathway” and aim to identify highly specific inhibitors for these steps. However, these strategies have generally been ineffective for identifying therapeutically useful approaches for treating complex diseases. We therefore propose to target the network itself and/or its properties. Agents capable of this can be termed Network Drugs: molecules (or combinations of molecules) that target a protein (or several proteins) selected based on its importance for a particular network structure or dynamics (what we term network utilization) associated with disease progression. In other words, a network drug aims to normalize malfunctioning signaling networks in order to either normalize or destroy a diseased cell or tissue. In our lab we aim to elucidate at a systems level how the dynamic behavior and function of signaling networks contribute to cancer progression. We term this approach Network Medicine. A future aim is to develop novel drug discovery strategies that target aberrant protein signaling networks, in collaboration with the Novo Nordisk Foundation Center for Protein Research, Memorial Sloan-Kettering Cancer Center (MSKCC), the Biotech Research and Innovation Centre (BRIC) and others, to develop new treatments for cancers. Currently we are performing three large-scale integrative studies of network drugs in melanoma, ovarian cancer and colon cancer, aiming to identify network-based diagnostic markers as well as new kinase inhibitor combination therapies.