Note to users. If you're seeing this message, it means that your browser cannot find this page's style/presentation instructions -- or possibly that you are using a browser that does not support current Web standards. Find out more about why this message is appearing, and what you can do to make your experience of our site the best it can be.
Clustering by Passing Messages Between Data Points
Brendan J. Frey* and
Delbert Dueck
Clustering data by identifying a subset of representative examplesis important for processing sensory signals and detecting patternsin data. Such "exemplars" can be found by randomly choosingan initial subset of data points and then iteratively refiningit, but this works well only if that initial choice is closeto a good solution. We devised a method called "affinity propagation,"which takes as input measures of similarity between pairs ofdata points. Real-valued messages are exchanged between datapoints until a high-quality set of exemplars and correspondingclusters gradually emerges. We used affinity propagation tocluster images of faces, detect genes in microarray data, identifyrepresentative sentences in this manuscript, and identify citiesthat are efficiently accessed by airline travel. Affinity propagationfound clusters with much lower error than other methods, andit did so in less than one-hundredth the amount of time.
Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario M5S 3G4, Canada.
* To whom correspondence should be addressed. E-mail: frey{at}psi.toronto.edu
The editors suggest the following Related Resources on Science sites:
In Science Magazine
TECHNICAL COMMENTS
Michael J. Brusco and Hans-Friedrich Köhn (8 February 2008) Science319 (5864), 726c.
[DOI: 10.1126/science.1150938] |Abstract »|Full Text »|PDF »
TECHNICAL COMMENTS
Brendan J. Frey and Delbert Dueck (8 February 2008) Science319 (5864), 726d.
[DOI: 10.1126/science.1151268] |Abstract »|Full Text »|PDF »
PERSPECTIVES
Marc Mézard (16 February 2007) Science315 (5814), 949.
[DOI: 10.1126/science.1139678] |Summary »|Full Text »|PDF »
THIS ARTICLE HAS BEEN CITED BY OTHER ARTICLES:
Temporal clustering by affinity propagation reveals transcriptional modules in Arabidopsis thaliana.
S. J. Kiddle, O. P. F. Windram, S. McHattie, A. Mead, J. Beynon, V. Buchanan-Wollaston, K. J. Denby, and S. Mukherjee (2010)
Bioinformatics
26, 355-362
|Abstract »|Full Text »|PDF »
Social Choice in a Computer-Assisted Simulation.
P. Thavikulwat (2009)
Simulation Gaming
40, 488-512
|Abstract »|PDF »
jClust: a clustering and visualization toolbox.
G. A. Pavlopoulos, C. N. Moschopoulos, S. D. Hooper, R. Schneider, and S. Kossida (2009)
Bioinformatics
25, 1994-1996
|Abstract »|Full Text »|PDF »
Prediction of sub-cavity binding preferences using an adaptive physicochemical structure representation.
Quantitative Proteomic Analysis of Bean Plants Infected by a Virulent and Avirulent Obligate Rust Fungus.
J. Lee, J. Feng, K. B. Campbell, B. E. Scheffler, W. M. Garrett, S. Thibivilliers, G. Stacey, D. Q. Naiman, M. L. Tucker, M. A. Pastor-Corrales, et al. (2009)
Mol. Cell. Proteomics
8, 19-31
|Abstract »|Full Text »|PDF »
Message-passing algorithms for the prediction of protein domain interactions from protein-protein interaction data.
M. Iqbal, A. A. Freitas, C. G. Johnson, and M. Vergassola (2008)
Bioinformatics
24, 2064-2070
|Abstract »|Full Text »|PDF »
Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space.
Y. Loewenstein, E. Portugaly, M. Fromer, and M. Linial (2008)
Bioinformatics
24, i41-i49
|Abstract »|Full Text »|PDF »
Cell Identity Mediates the Response of Arabidopsis Roots to Abiotic Stress.
J. R. Dinneny, T. A. Long, J. Y. Wang, J. W. Jung, D. Mace, S. Pointer, C. Barron, S. M. Brady, J. Schiefelbein, and P. N. Benfey (2008)
Science
320, 942-945
|Abstract »|Full Text »|PDF »
Comment on "Clustering by Passing Messages Between Data Points".
VISDA: an open-source caBIGTM analytical tool for data clustering and beyond.
J. Wang, H. Li, Y. Zhu, M. Yousef, M. Nebozhyn, M. Showe, L. Showe, J. Xuan, R. Clarke, and Y. Wang (2007)
Bioinformatics
23, 2024-2027
|Abstract »|Full Text »|PDF »