Network-based Analysis of Protein Function
Abstract:
Large-scale protein-protein interaction networks have been determined for organisms across the evolutionary spectrum. The resulting interactomes are a valuable resource for furthering our understanding of cellular function, pathways and organization. In this thesis, we focus on uncovering the relationship between the topological characteristics of these networks and the cellular functions that underlie them.
In the first part of this thesis, we study the problem of network modularity. Cellular networks are known to have a modular organization, with groups of proteins working together to perform some larger biological process. Numerous clustering approaches have been applied to uncover protein complexes and functional modules from large-scale physical protein interaction data. We develop a comprehensive framework to assess how well network clustering approaches perform in uncovering protein complexes and biological processes, and in predicting protein functions. By applying this framework, we find that the topological characteristics of a network are a significant factor in the accuracy trade-offs between local and global (i.e., clustering) approaches for uncovering cellular function.
In the second part of this thesis, we focus on relating one important aspect of a protein's function, its essentiality, to network topology. A protein is essential if it is vital for a cell's survival, that is, if its removal kills the cell. Researchers have previously observed that essential proteins tend to have many physical interactions. We find that this relationship between essentiality and interaction degree holds at different scales of organization. In particular, we find that the number of intra-complex or intra-process interactions that a protein has is a better indicator of its essentiality than its overall number of interactions. Moreover, we find that within an essential complex, the essential proteins tend to have more interactions, especially intra-complex interactions, than the non-essential proteins. Finally, we build a module-level interaction network and find that essential complexes and processes tend to have higher interaction degrees in this network than non-essential ones; that is, they tend to exhibit a larger amount of functional cross-talk.
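
To make the quantities in the preceding paragraph concrete, the following minimal Python sketch computes a protein's overall interaction degree, its intra-complex degree, and a module-level network on toy data. Everything here is an illustrative assumption rather than the thesis's actual procedure: the protein and complex names are hypothetical, an interaction is counted as intra-complex when the two proteins share at least one complex, and two complexes are linked at the module level when any protein of one interacts with a protein of the other.

```python
from collections import defaultdict

# Hypothetical toy data: undirected physical interactions and complex membership.
interactions = [("A", "B"), ("A", "C"), ("B", "C"),
                ("C", "D"), ("D", "E"), ("E", "F")]
complexes = {"cx1": {"A", "B", "C"}, "cx2": {"D", "E"}, "cx3": {"F"}}


def membership_map(modules):
    """Map each protein to the set of modules it belongs to."""
    membership = defaultdict(set)
    for name, members in modules.items():
        for protein in members:
            membership[protein].add(name)
    return membership


def overall_degree(edges):
    """Number of interactions per protein, regardless of module."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def intra_module_degree(edges, modules):
    """Number of interactions per protein with partners that share a module.

    Assumption: an edge counts as intra-module if its endpoints have
    at least one module in common.
    """
    membership = membership_map(modules)
    degree = defaultdict(int)
    for u, v in edges:
        if membership[u] & membership[v]:
            degree[u] += 1
            degree[v] += 1
    return dict(degree)


def module_network(edges, modules):
    """Set of module pairs joined by at least one inter-module interaction.

    Assumption: this linking rule is one simple choice among many.
    """
    membership = membership_map(modules)
    links = set()
    for u, v in edges:
        for m in membership[u]:
            for n in membership[v]:
                if m != n:
                    links.add(frozenset((m, n)))
    return links


total = overall_degree(interactions)
intra = intra_module_degree(interactions, complexes)
links = module_network(interactions, complexes)

# Module-level degree: number of distinct modules each module is linked to.
module_degree = {name: sum(1 for link in links if name in link)
                 for name in complexes}
```

In this toy network, protein C has three interactions overall but only two within its complex, and cx2 has the highest module-level degree because it is linked to both other complexes.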