Estivill-Castro, Vladimir. Why so many clustering algorithms – A Position Paper. ACM SIGKDD Explorations Newsletter. 20 June 2002, 4 (1): 65–75. S2CID 7329935. doi:10.1145/568574.568575.
Defays, D. An efficient algorithm for a complete link method. The Computer Journal (British Computer Society). 1977, 20 (4): 364–366. doi:10.1093/comjnl/20.4.364.
Ester, Martin; Kriegel, Hans-Peter; Sander, Jörg; Xu, Xiaowei. A density-based algorithm for discovering clusters in large spatial databases with noise. Simoudis, Evangelos; Han, Jiawei; Fayyad, Usama M. (eds.). Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press: 226–231. 1996. ISBN 1-57735-004-9.
Ankerst, Mihael; Breunig, Markus M.; Kriegel, Hans-Peter; Sander, Jörg. OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD international conference on Management of data. ACM Press: 49–60. 1999. CiteSeerX 10.1.1.129.6542.
Pfitzner, Darius; Leibbrandt, Richard; Powers, David. Characterization and evaluation of similarity measures for pairs of clusterings. Knowledge and Information Systems (Springer). 2009, 19 (3): 361–394. S2CID 6935380. doi:10.1007/s10115-008-0150-6.
Clatworthy, J., Buick, D., Hankins, M., Weinman, J., & Horne, R. (2005). The use and reporting of cluster analysis in health psychology: A review. British Journal of Health Psychology 10: 329-358.
Cole, A. J. & Wishart, D. (1970). An improved algorithm for the Jardine-Sibson method of generating overlapping clusters. The Computer Journal 13(2):156-163.
Ester, M., Kriegel, H.P., Sander, J., and Xu, X. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, USA: AAAI Press, pp. 226–231.
Heyer, L.J., Kruglyak, S. and Yooseph, S. (1999). Exploring Expression Data: Identification and Analysis of Coexpressed Genes. Genome Research 9:1106-1115.
Huang, Z. (1998). Extensions to the K-means Algorithm for Clustering Large Datasets with Categorical Values. Data Mining and Knowledge Discovery, 2, p. 283-304.
Jardine, N. & Sibson, R. (1968). The construction of hierarchic and non-hierarchic classifications. The Computer Journal 11:177.
Ng, R.T. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. Proceedings of the 20th VLDB Conference, Santiago, Chile, pp. 144–155.
Romesburg, H. Charles, Cluster Analysis for Researchers, 2004, 340 pp. ISBN 1-4116-0617-5, or from the publisher (archived at the Internet Archive); reprint of the 1990 edition published by Krieger Pub. Co. A Japanese language translation is available from Uchida Rokakuho Publishing Co., Ltd., Tokyo, Japan.
Zhang, T., Ramakrishnan, R., and Livny, M. 1996. BIRCH: An efficient data clustering method for very large databases. Proceedings of ACM SIGMOD Conference, Montreal, Canada, pp. 103–114.
For spectral clustering:
Jianbo Shi and Jitendra Malik, "Normalized Cuts and Image Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888-905, August 2000. Available on Jitendra Malik's homepage (archived at the Internet Archive).
Marina Meila and Jianbo Shi, "Learning Segmentation with Random Walk", Neural Information Processing Systems, NIPS, 2001. Available from Jianbo Shi's homepage (archived at the Internet Archive).
For estimating the number of clusters:
Can, F., Ozkarahan, E. A. (1990) "Concepts and effectiveness of the cover coefficient-based clustering methodology for text databases." ACM Transactions on Database Systems. 15 (4) 483-517.
For another presentation of hierarchical, k-means and fuzzy c-means clustering, see this introduction to clustering (archived at the Internet Archive), which also explains mixtures of Gaussians.
YALE (Yet Another Learning Environment): freely available open-source software for data pre-processing, knowledge discovery, data mining, machine learning, and visualization. It includes a clustering plugin, fully integrates Weka, is easily extensible, and features a graphical user interface as well as an XML-based scripting language for data mining.
mixmod (archived at the Internet Archive): model-based cluster and discriminant analysis. Code in C++, with interfaces to Matlab and Scilab.
LingPipe Clustering Tutorial (archived at the Internet Archive): a tutorial on complete-link and single-link clustering using LingPipe, a Java text data mining package distributed with source.
Weka: contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
Tanagra (archived at the Internet Archive): free data mining software that includes several clustering algorithms, such as K-MEANS, SOM, Clustering Tree, HAC and more.
Cluster: open-source clustering software. The routines are available as a C clustering library, a Python extension module, and a Perl module.