Abstract
Motivation: Cancer genomes exhibit a large number of different alterations that affect many genes in a diverse manner. An improved understanding of the generative mechanisms behind the mutation rules and their influence on gene community behavior is of great importance for the study of cancer. Results: To expand our capability to analyze combinatorial patterns of cancer alterations, we developed a rigorous methodology for cancer mutation pattern discovery based on a new, constrained form of correlation clustering. Our new algorithm, named C3 (Cancer Correlation Clustering), leverages mutual exclusivity of mutations, patient coverage and driver network concentration principles. To test C3, we performed a detailed analysis on TCGA breast cancer and glioblastoma data and showed that our algorithm outperforms the state-of-the-art CoMEt method in terms of discovering mutually exclusive gene modules and identifying biologically relevant driver genes. The proposed agnostic clustering method represents a unique tool for efficient and reliable identification of mutation patterns and driver pathways in large-scale cancer genomics studies, and it may also be used for other clustering problems on biological graphs.
Original language | English (US) |
---|---|
Pages (from-to) | 3717-3728 |
Number of pages | 12 |
Journal | Bioinformatics |
Volume | 32 |
Issue number | 24 |
DOIs | |
State | Published - Dec 15 2016 |
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics