Originally published in Science Express on 11 January 2007
Science 16 February 2007:
Vol. 315, no. 5814, pp. 972-976
DOI: 10.1126/science.1136800

Reports

Clustering by Passing Messages Between Data Points

Brendan J. Frey* and Delbert Dueck

Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such "exemplars" can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called "affinity propagation," which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.

Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario M5S 3G4, Canada.

* To whom correspondence should be addressed. E-mail: frey{at}psi.toronto.edu
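
The message-passing procedure summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not spell out the update rules, so the "responsibility" and "availability" updates below follow the form commonly given for affinity propagation, and the function name `affinity_propagation`, the damping factor of 0.5, the fixed iteration count, the toy one-dimensional points, and the self-similarity ("preference") of -10 are all assumptions chosen for the example.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Sketch of affinity propagation on an n x n similarity matrix (n >= 2).

    S[i][k] is the similarity of point i to candidate exemplar k; the
    diagonal S[k][k] is the "preference" for point k to be an exemplar.
    Returns, for each point, the index of the exemplar it selects.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i,
        # relative to i's best competing candidate.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                new_r = S[i][k] - best_other
                R[i][k] = damping * R[i][k] + (1 - damping) * new_r

        # Availability: accumulated evidence from other points that
        # k would make a good exemplar.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from points other than k
            for i in range(n):
                if i == k:
                    new_a = total
                else:
                    new_a = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new_a

    # Each point picks the candidate maximizing availability + responsibility.
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


# Toy data (hypothetical): two well-separated groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
preference = -10.0  # assumed value; lower preferences yield fewer exemplars
S = [[preference if i == k else -(points[i] - points[k]) ** 2
      for k in range(len(points))] for i in range(len(points))]

labels = affinity_propagation(S)
print(labels)  # exemplar index chosen by each point
```

Here similarity is taken to be negative squared distance, so points in the same group should settle on a shared exemplar from within that group. The pure-Python loops cost on the order of n^3 per iteration in this naive form and are meant only to make the message flow readable.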


Science. ISSN 0036-8075 (print), 1095-9203 (online)