TY - GEN
T1 - Ensemble online clustering through decentralized observations
AU - Katselis, Dimitrios
AU - Beck, Carolyn L.
AU - Van Der Schaar, Mihaela
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014
Y1 - 2014
N2 - We investigate the problem of online learning for an ensemble of agents clustering incoming data, i.e., the problem of combining online local clustering decisions made by distributed agents to improve knowledge and accuracy of implicit clusters hidden in the incoming data streams. We focus on clustering using the well-known K-means algorithm for numerical data due to its efficiency in clustering large data sets. Nevertheless, our results can be straightforwardly extended to, e.g., the K-modes variant of the K-means algorithm to handle categorical data, as well as to other clustering algorithms. We show that the proposed ensemble online solutions, which are based on a simple majority-voting scheme, converge to the centralized solutions that would be made by a fusion center, that is, the solutions resulting from one agent with access to all information across agents. Given the dimensions of the clustering model, the aforementioned convergence is demonstrated to be achievable for relatively small sizes of the ensemble.
UR - http://www.scopus.com/inward/record.url?scp=84988234204&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84988234204&partnerID=8YFLogxK
U2 - 10.1109/CDC.2014.7039497
DO - 10.1109/CDC.2014.7039497
M3 - Conference contribution
AN - SCOPUS:84988234204
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 910
EP - 915
BT - 53rd IEEE Conference on Decision and Control, CDC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
Y2 - 15 December 2014 through 17 December 2014
ER -