Volume 30 (2020): Issue 4 (December 2020)
Journal Details
First published: 05 Apr 2007
Publication timeframe: 4 times per year
Access type: Open Access

A feasible k-means kernel trick under non-Euclidean feature space

Published Online: 31 Dec 2020
Page range: 703 - 715
Received: 17 Feb 2020
Accepted: 17 Jul 2020

This paper asks whether the use of the kernel trick is justified, investigating the special case of its use in the kernel k-means algorithm. Kernel k-means is a clustering algorithm that clusters data much like k-means when no embedding of the data points into Euclidean space is provided and only a matrix of “distances” (dissimilarities) or similarities is available. The kernel trick bypasses the need to find an embedding into Euclidean space. We show, however, that the algorithm returns wrong results if the embedding does not actually exist, which means that the embedding must be found before the algorithm is applied. If it is found, the kernel trick is pointless. If it is not found, the distance matrix needs to be repaired. But the repair methods require the construction of an embedding, which, first, makes the kernel trick pointless, because it is no longer needed, and, second, may cause kernel k-means to return different clusterings before and after the repair, so that the value of the clustering is called into question. In this paper we identify a distance-repair method that produces the same clustering before and after its application and does not need to be performed explicitly, so that the embedding need not be constructed explicitly. This renders the kernel trick applicable for kernel k-means.
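To make the setting concrete, here is a minimal NumPy sketch of kernel k-means (an illustration, not the authors' implementation). It runs Lloyd-style updates using only a precomputed Gram matrix K: the squared feature-space distance from a point to a cluster mean can be expressed entirely through kernel values, which is exactly the kernel trick discussed above.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=100):
    """Lloyd-style kernel k-means on a Gram matrix K (n x n).
    Squared distance from point i to the mean of cluster c is
        K[i,i] - 2*mean_{j in c} K[i,j] + mean_{j,l in c} K[j,l],
    so no explicit embedding of the points is ever constructed."""
    n = K.shape[0]
    diag = np.diag(K)
    # Deterministic farthest-point seeding in feature space.
    seeds = [0]
    d2 = diag - 2.0 * K[:, 0] + diag[0]
    for _ in range(1, k):
        seeds.append(int(np.argmax(d2)))
        d2 = np.minimum(d2, diag - 2.0 * K[:, seeds[-1]] + diag[seeds[-1]])
    seeds = np.array(seeds)
    labels = (diag[:, None] - 2.0 * K[:, seeds] + diag[seeds][None, :]).argmin(axis=1)
    for _ in range(n_iter):
        dist = np.empty((n, k))
        for c in range(k):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                dist[:, c] = np.inf  # empty cluster: never the nearest
                continue
            # ||phi(x_i) - mu_c||^2 expressed with kernel values only
            dist[:, c] = (diag
                          - 2.0 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

With a linear kernel K = X @ X.T this reduces to ordinary k-means; the appeal of the trick is that K may come from any kernel, or directly from a similarity matrix, without ever materializing an embedding. The paper's point is that when that matrix does not correspond to any Euclidean embedding, the distances computed this way can be negative or otherwise invalid, and the clustering is wrong.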

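Whether a Euclidean embedding exists at all can be checked with the classical Gower/Schoenberg criterion: a dissimilarity matrix D is Euclidean if and only if the doubly centered matrix B = -1/2 J D^2 J (with J the centering matrix) is positive semidefinite. The sketch below, a generic illustration rather than the repair method the paper identifies, tests this criterion and, when it fails, applies the classical additive-constant shift of the squared dissimilarities by the most negative eigenvalue of B.

```python
import numpy as np

def euclidean_check_and_repair(D, tol=1e-10):
    """Test Gower/Schoenberg's criterion for the dissimilarity matrix D and,
    if it fails, repair D by the classical additive-constant shift
    (a generic repair, not necessarily the one advocated in the paper).
    Returns (possibly repaired D, was_euclidean)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    B = -0.5 * J @ (D ** 2) @ J             # Gram matrix candidate
    lam_min = np.linalg.eigvalsh(B)[0]      # eigenvalues in ascending order
    if lam_min >= -tol:
        return D, True                      # D is already Euclidean
    # Shift off-diagonal squared dissimilarities by -2*lam_min; this adds
    # -lam_min to every eigenvalue of B on the centered subspace, making
    # the new B positive semidefinite.
    D2 = D ** 2 - 2.0 * lam_min * (1.0 - np.eye(n))
    return np.sqrt(D2), False
```

The repaired matrix is Euclidean by construction, but, as the abstract notes, such a repair may change the clustering that kernel k-means produces; the paper's contribution is a repair that provably does not.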

