Exploiting multi–core and many–core parallelism for subspace clustering

eISSN: 2083-8492
Language: English

Publication timeframe: 4 times per year
Journal Subjects: Mathematics, Applied Mathematics

Published Online: Mar 29, 2019

Page range: 81–91

Received: Feb 10, 2018

Accepted: Sep 16, 2018

DOI: https://doi.org/10.2478/amcs-2019-0006

Keywords: data mining, subspace clustering, multi-core, many-core, GPU computing

© 2019 Amitava Datta et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
