Research
Our laboratory conducts computational biology, bioinformatics and experimental research in the field of Integrative Network and Systems Biology.
Introduction
The Cellular & Molecular Logic Team at The Institute of Cancer Research (ICR) conducts computational and quantitative biology with the aim of linking dynamic protein interaction networks to disease progression. This will enable us to target the networks driving regulatory diseases such as cancer, diabetes and neurological disorders. We are motivated by the opportunity to perform systems-based targeting of complex human diseases while simultaneously gaining insight into the fundamental principles behind kinase-based cellular information processing. Cell signalling networks are the foundation of cell fate and behaviour, and their aberrant activity is a key mechanism underlying the pathological behaviour of cells during tumour development. However, signalling networks are highly complex, involving a large ensemble of dynamic interactions that fluctuate in space and time. Thus, understanding how aberrant cell decisions arise requires a global view of cell signalling networks. We and others have demonstrated that obtaining systems-level insight into cell signalling requires a combination of experimental and computational exploration. Our lab has developed computational tools (e.g. NetworKIN and NetPhorest) that can model phosphorylation-driven cellular signalling networks. We have previously deployed these on quantitative proteomics data to model DNA damage, stem-cell differentiation, cell-cell communication and signalling evolution, and to compare model organisms. Below we discuss some of our current research activities and projects.
Integrative Network Biology
While the study of molecular interactions has been a research focus for many years and has provided much insight into biology, the age of integrative network biology has now arrived. The aim of integrative network biology is to provide models of cellular networks based on the integration of large and heterogeneous data sets, for example those originating from proteomics and high-throughput functional genomics studies. We, and others, have shown that network models of cellular signalling can also be used for fine-grained prediction of human disease. Through quantitative, systematic measurements of these networks and by integrating multiple types of data, it is now possible to define dynamic changes in these networks and to build computational models that predict the specific responses such systems should elicit. In particular, we have recently shown that by integrating systems genetics data with phospho-proteomics data and computational models of cellular kinase specificity, we could derive an integrative network model of JNK regulation in the fruit fly. This network can now serve as a framework for future targeted proteomics studies in cancer cells (in vitro and in vivo) to define how the network evolves and changes during cancer progression. We are now actively pursuing the integration of data from deep sequencing, functional genomics, extreme-throughput microscopy and mass spectrometry to perform network medicine and systems-level modelling of cancer metastasis.
In an integrative network biology study of Eph-ephrin cell-cell contact initiated signalling, we recently developed novel experimental and computational approaches to characterise the intracellular signaling networks that control the sorting of distinct cell types into separate compartments. This is an important problem, since cell compartmentalisation is critical for the development of complex tissues and is usurped in a variety of disease states. We used data integration (with NetPhorest and NetworKIN) to computationally reconstruct cell-specific information processing during Eph receptor/ephrin-initiated cell sorting. An interesting innovation was to correlate phosphorylation of a kinase activation loop with phosphorylation of a predicted substrate site, and thus to assign kinases and substrates in an activity-dependent manner. Together, the combination of cell-specific proteomics, large-scale functional analysis and modelling provides a systems-level view of Eph receptor/ephrin-mediated signaling and cell sorting. This work provides the first systematic analysis of cell-specific signaling events induced by contact between two different cell populations, and therefore serves as a precedent for future investigations into normal and pathologic cell-cell interactions and the associated phosphorylation-driven networks.
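The activity-dependent assignment described above can be illustrated with a minimal sketch. It is not the published pipeline: the function names, the Pearson correlation measure, the 0.8 cut-off and the toy time courses are all assumptions made for illustration. Candidate kinase-substrate pairs (which in practice might come from a predictor such as NetworKIN) are kept only when the kinase's activation-loop phosphorylation profile tracks the phosphorylation profile of the predicted substrate site.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def assign_active_kinases(activation_loop, substrate_sites, candidates, min_r=0.8):
    """Keep candidate (kinase, site) pairs whose profiles correlate.

    activation_loop: kinase name -> activation-loop phosphorylation profile
    substrate_sites: site name -> phosphorylation profile of that site
    candidates:      predicted (kinase, site) pairs
    min_r:           hypothetical correlation cut-off (an assumption)
    """
    assigned = []
    for kinase, site in candidates:
        if kinase not in activation_loop or site not in substrate_sites:
            continue  # no quantitative data for this pair
        r = pearson(activation_loop[kinase], substrate_sites[site])
        if r >= min_r:
            assigned.append((kinase, site, r))
    return assigned


# Toy, made-up time courses (four time points); names are placeholders.
activation_loop = {"KIN_A": [1.0, 2.0, 3.0, 4.0], "KIN_B": [4.0, 3.0, 2.0, 1.0]}
substrate_sites = {"SUB1_Y100": [2.0, 4.0, 6.0, 8.0]}
candidates = [("KIN_A", "SUB1_Y100"), ("KIN_B", "SUB1_Y100")]

assigned = assign_active_kinases(activation_loop, substrate_sites, candidates)
print(assigned)
```

In this toy case only KIN_A is retained, because its activation-loop profile rises together with the substrate site while the KIN_B profile falls.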
Complex Systems - Networks and Evolution
John Nash showed that within a complex system, individuals are best off if they make the best decision that they can, taking into account the decisions of the other individuals. Recently, we investigated whether similar principles influence the evolution of signaling networks in multicellular animals. Specifically, by analyzing a set of metazoan species we observed a striking negative correlation of genomically encoded tyrosine content with biological complexity (as measured by the number of cell types in each organism). We argued that this observed tyrosine loss correlates with the expansion of tyrosine kinases in the evolution of the metazoan lineage and that it may relate to the optimization of signaling systems in multicellular animals. We proposed that this phenomenon illustrates genome-wide adaptive evolution to accommodate beneficial genetic perturbation. From a disease perspective this observation is exciting because reducing the number of potentially harmful tyrosine kinase interactions is likely important to avoid cancer and other complex diseases, and the loss of tyrosines appears to be an adaptive response that reduces the risk of malfunction and disease.
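As a rough illustration of the kind of analysis described above, the sketch below computes the tyrosine fraction of a set of protein sequences and its Spearman rank correlation with a complexity measure. The choice of Spearman correlation, the function names, the species labels, the sequences and the cell-type counts are all placeholders invented for this example; they are not data or methods from the study.

```python
from math import sqrt


def tyrosine_fraction(proteins):
    """Fraction of all residues in a proteome that are tyrosine (Y)."""
    total = sum(len(p) for p in proteins)
    if total == 0:
        raise ValueError("empty proteome")
    return sum(p.upper().count("Y") for p in proteins) / total


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Placeholder proteomes and cell-type counts (invented, not real data).
proteomes = {
    "species_a": ["MYYKY", "AYLYG"],
    "species_b": ["MYKLA", "AGLYG"],
    "species_c": ["MKLAG", "AGLYG"],
}
cell_types = {"species_a": 1, "species_b": 10, "species_c": 100}

species = sorted(proteomes)
y_content = [tyrosine_fraction(proteomes[s]) for s in species]
complexity = [cell_types[s] for s in species]
rho = spearman(y_content, complexity)
print(y_content, rho)
```

With these placeholder values the tyrosine fraction falls as the cell-type count rises, so the rank correlation is negative; a real analysis would use complete proteomes and would also need to account for phylogenetic relatedness between species.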
Although numerous human phosphorylation sites and their dynamics have been characterized, the evolutionary history and physiological importance of many signaling events remain unknown. Using target phosphoproteomes determined with a similar experimental and computational pipeline, we recently investigated the conservation of human phosphorylation events in distantly related model organisms (fly, worm, and yeast). With a sequence-alignment approach, we identified 479 phosphorylation events in 344 human proteins that appear to be positionally conserved over ∼600 million years of evolution and hence are likely to be involved in fundamental cellular processes. This sequence-alignment analysis suggested that many phosphorylation sites evolve rapidly and therefore do not display strong evolutionary conservation in terms of sequence position in distantly related organisms. Thus, we devised a network-alignment approach to reconstruct conserved kinase-substrate networks, which identified 778 phosphorylation events in 698 human proteins. Both methods identified proteins tightly regulated by phosphorylation and signal integration hubs, and both types of phosphoproteins were enriched in proteins encoded by disease-associated genes. We analyzed the cellular functions and structural relationships for these conserved signaling events, noting the incomplete nature of current phosphoproteomes. Assessing phosphorylation conservation at both site and network levels proved useful for exploring both fast-evolving and ancient signaling events. Finally, we showed that multiple complex diseases converge within the conserved networks, suggesting that disease development might rely on common molecular networks.
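The idea of positional conservation in the sequence-alignment approach can be sketched as follows. This is a simplified illustration under stated assumptions, not the published method: it takes an already computed pairwise alignment (gaps written as "-"), treats a site as positionally conserved only when an observed human phosphosite and an observed ortholog phosphosite fall in the same alignment column, and uses invented sequences and site positions.

```python
def positionally_conserved(aligned_human, aligned_ortholog, human_sites, ortholog_sites):
    """Return (human_pos, ortholog_pos) pairs of phosphosites sharing a column.

    aligned_human, aligned_ortholog: gapped sequences of equal length
    human_sites, ortholog_sites:     1-based positions of observed
                                     phosphosites in the ungapped sequences
    """
    if len(aligned_human) != len(aligned_ortholog):
        raise ValueError("aligned sequences must have the same length")
    human_sites = set(human_sites)
    ortholog_sites = set(ortholog_sites)
    pairs = []
    h_pos = 0  # position in the ungapped human sequence
    o_pos = 0  # position in the ungapped ortholog sequence
    for h, o in zip(aligned_human, aligned_ortholog):
        if h != "-":
            h_pos += 1
        if o != "-":
            o_pos += 1
        if h == "-" or o == "-":
            continue  # a gap column cannot hold a conserved site
        if h_pos in human_sites and o_pos in ortholog_sites:
            pairs.append((h_pos, o_pos))
    return pairs


# Invented alignment and phosphosite positions, for illustration only.
aligned_human = "MKS-PTAY"
aligned_ortholog = "M-SGPSAY"
conserved = positionally_conserved(
    aligned_human, aligned_ortholog, human_sites={3, 5, 7}, ortholog_sites={2, 7}
)
print(conserved)  # [(3, 2), (7, 7)]
```

Human position 5 is not reported here because no phosphorylation was observed at the aligned ortholog residue, which mirrors the point made above that incomplete phosphoproteomes and rapidly evolving sites limit a purely positional analysis.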
Signaling Network States and Dynamics
Signaling networks are not static. Any particular local network or sub-network (for example, that of a kinase family and its substrates) is in constant flux between a finite number of network states, each with a different architecture; we term these network ensembles, or the set of executable networks. The differences between a stem cell and a differentiated cell, or between a normal epithelial cell, a cancer cell in the primary tumour and a metastatic cell, may reflect differences in the network state-space that each cell explores. The network architecture at any point in state-space determines phenotypic output. Even though each network is composed of thousands of elements with complex connectivity, it is likely that only a limited set of network states actually exist and are optimal for cell growth. If cells could explore large regions of network state-space, it would be unlikely that any aspects of network architecture would be conserved between cell types or species, yet such conservation is often observed. Importantly, because there are likely only a limited number of network states, we propose that quantification of an entire cell’s network state-space at a given point in time is an achievable goal.
To perform systematic quantification of network activity following a defined environmental cue, we recently used mass spectrometry (MS) in combination with siRNA screening and computational models of kinase activation to create integrative models of network states in two distinct cell types. We could thereby study cell-cell contact initiated signalling in cells expressing the Eph receptor (EphR) and cells expressing ephrin, the membrane-bound ligand for EphR. We examined the differences between the networks of kinases, substrates and adaptors that regulate and respond to the signalling events following a cell-cell contact mediated EphR-ephrin interaction. Importantly, we showed that the network dynamics and structure during cell-cell contact are significantly different from those observed using traditional artificial ligands. Thus, as quantitative biology moves forward, it is important to reconsider how we stimulate biological systems (for example, with a growth factor) if the end goal is to derive physiologically relevant models.
Computational Biology and Algorithms
A major activity within our lab is to continue to develop computational tools (for example, NetworKIN and NetPhorest) and to deploy these on quantitative mass-spectrometry based proteomics data (or other types of data) to understand, at a systems level, the principles by which the spatial and temporal assembly of mammalian interaction networks transmits and processes information in order to alter cellular behaviour. Gaining broader network coverage will facilitate the move from prediction to descriptive modelling of mammalian signalling networks. We research the use of Bayesian statistics and machine learning (in particular Artificial Neural Networks) to model and integrate signalling networks with those related to cell behaviour and phenotype. The development of the next generation of algorithms requires the use of both large shared-memory systems (SGI ALTIX UV) and GPU-based infrastructure for the acceleration needed for large-scale deployment. We also investigate alternative statistical algorithms, such as Markov chain Monte Carlo simulations and Support Vector Machines (SVMs), on biological data. In computational biology the modelling approach most often depends on the type of data and the biological question in mind.
We have developed some of the most frequently used bioinformatics algorithms for studying protein disorder (GlobPlot and DisEMBL) as well as linear motifs in signaling proteins (ELM). Recently we launched the NetPhorest automated machine learning framework, which is now a community resource for signaling biologists. A major goal is to integrate the NetPhorest atlas into NetworKIN. This will enable more accurate predictions to be made for a larger fraction of the kinome and facilitate modelling of interactions mediated by, e.g., SH2 and BRCT phospho-binding domains. We are developing new system-specific contextual algorithms which will enable us to calculate protein associations similar to STRING in a system-specific manner (e.g. for a particular cell line). We aim to release all our code as Open Source and as web services once it has matured. We will develop and apply new powerful tools for quantitative and systems biological research.
Network Medicine
Current drug development efforts almost uniformly focus on specific steps in a well-described disease “pathway” and aim to identify highly specific inhibitors for these steps. However, these strategies have generally been ineffective for identifying therapeutically useful approaches for treating complex diseases. We therefore propose to target the network itself and/or its properties. Agents capable of this can be termed network drugs: molecules (or combinations of molecules) that target a protein (or several proteins) selected for its importance to a particular network structure or dynamics (what we term network utilisation) associated with disease progression. In other words, a network drug aims to normalise malfunctioning signalling networks. In our lab we aim to elucidate at a systems level how the dynamic behaviour and function of signalling networks contribute to cancer progression. We term this approach Network Medicine. A future aim is therefore to develop, in collaboration with the ICR’s Centre for Cancer Therapeutics, novel drug discovery strategies that target aberrant protein signalling networks and lead to new treatments for cancers.

