TY - GEN
T1 - Heart of the matter
T2 - 2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
AU - Kosorukoff, Alex
AU - Sinha, Saurabh
PY - 2008
Y1 - 2008
AB - Clustering is widely used by genomics researchers to discover functional patterns in data. The inherent subjectivity and hardness of the clustering task often lead researchers to explore multiple clustering results of the same data, using different algorithms and parameter settings. This further necessitates a method to automatically summarize multiple clustering results. A natural question to ask about several clustering results is "what is the structure they all have in common?" This work presents a computational method to answer this question. We provide a precise formulation of the problem of computing the consensus of several clusterings, examine its computational complexity and find the problem to be NP-hard. We describe a greedy heuristic to solve the problem, and assess its performance on synthetic data. We demonstrate several applications of this algorithm on genomics data. Our program will be freely available for download.
UR - http://www.scopus.com/inward/record.url?scp=58049152711&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58049152711&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2008.28
DO - 10.1109/BIBM.2008.28
M3 - Conference contribution
AN - SCOPUS:58049152711
SN - 9780769534527
T3 - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
SP - 155
EP - 162
BT - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
Y2 - 3 November 2008 through 5 November 2008
ER -