Motivation: Although a great deal of progress is being made in the development of fast and reliable experimental techniques to extract genome-wide networks of protein-protein and protein-DNA interactions, the sequencing of new genomes proceeds at an even faster rate. There is therefore a considerable need for reliable methods of in silico prediction of protein interactions based solely on sequence similarity information and known interactions from well-studied organisms. This problem can be solved if a dependency exists between sequence similarity and the conservation of the proteins' functions.

Results: In this paper, we introduce a novel probabilistic method for predicting protein-protein interactions, using a new empirical probabilistic formula that describes the loss of interactions between homologous proteins during the course of evolution. This formula describes an evolutionary process quite similar to the process of the Earth's population growth. In addition, our method favors predictions confirmed by several interacting pairs over predictions coming from a single interacting pair. Our approach is useful for working with "noisy" data such as those coming from high-throughput experiments. We have generated predictions for five "model" organisms (H. sapiens, D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae) and evaluated the quality of these predictions.
Keywords
- Functional evolution
- Protein interactions
- Sequence similarity
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Computer Science Applications