Otwarty dostęp

A Comparative Study for Outlier Detection Methods in High Dimensional Text Data


Zacytuj

[1] D. Hawkins. Identification of Outliers. Chapman and Hall, 1980.10.1007/978-94-015-3994-4 Search in Google Scholar

[2] C. Aggarwal. Outlier analysis (2nd ed.) Springer, 2017.10.1007/978-3-319-47578-3 Search in Google Scholar

[3] Caroline Cynthia and Thomas George. An outlier detection approach on credit card fraud detection using machine learning: A comparative analysis on supervised and unsupervised learning. In: Peter J., Fernandes S., Alavi A. (eds) Intelligence in Big Data Technologies-Beyond the Hype. Advances in Intelligent Systems and Computing, 1167, 2021. Search in Google Scholar

[4] H. Mazzawi, G. Dalai, D. Rozenblat, L. Ein-Dor, M. Ninio, O. Lavi, A. Adir, E. Aharoni, and E. Kermany. Anomaly detection in large databases using behavioral patterning. In ICDE, 2017.10.1109/ICDE.2017.158 Search in Google Scholar

[5] T. Li, J. Ma, and C. Sun. Dlog: diagnosing router events with syslogs for anomaly detection. The Journal of Supercomputing, 74(2):845–867, 2018. Search in Google Scholar

[6] C. Park. Outlier and anomaly pattern detection on data streams. The journal of supercomputing, 75:6118–6128, 2019.10.1007/s11227-018-2674-1 Search in Google Scholar

[7] H. Wang, M. Bah, and M. Hammad. Progress in outlier detection techniques: A survey. IEEE Access, 7, 2019.10.1109/ACCESS.2019.2932769 Search in Google Scholar

[8] A. Boukerche, L. Zheng, and O. Alfandi. Outlier detection: Methods, models, and classification. ACM Computing Surveys, 53:1–37, 2020. Search in Google Scholar

[9] X. Zhao, J. Zhang, and X. Qin. Loma: A local outlier mining algorithm based on attribute relevance analysis. Expert Systems with Applications, 84, 2017.10.1016/j.eswa.2017.05.009 Search in Google Scholar

[10] X. Zhao, J. Zhang, X. Qin, J. Cai, and Y. Ma. Parallel mining of contextual outlier using sparse subspace. Expert Systems with Applications, 126, 2019.10.1016/j.eswa.2019.02.020 Search in Google Scholar

[11] F. Kamalov and H. Leung. Outlier detection in high dimensional data. Journal of Information and Knowledge Management, 19, 2020.10.1142/S0219649220400134 Search in Google Scholar

[12] C. Park. A dimension reduction method for unsupervised outlier detection in high dimensional data(written in korean). Journal of KIISE. In press. Search in Google Scholar

[13] S. Damaswanny, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proceeding of ACM SIGMOD, pages 427–438, 2000.10.1145/335191.335437 Search in Google Scholar

[14] E. Knorr and R. Ng. Finding intensional knowledge of distance-based outliers. In Proceeding of 25th International Conference on Very Large Databases, 1999. Search in Google Scholar

[15] M. Sugiyama and K. Borgwardt. Rapid distance-based outlier detection via sampling. In International Conference on Neural Information Processing Systems, 2013. Search in Google Scholar

[16] A. Zimek, E. Schubert, and H. Kriegel. A survey on unsupervised outlier detection in high-dimensional numerical data. Statistical Analysis and Data Mining, 5:363–387, 2012.10.1002/sam.11161 Search in Google Scholar

[17] H. Kriegel, M. Schubert, and A. Zimek. Angle-based outlier detection in high-dimensional data. In Proceeding of KDD, pages 444–452, 2008.10.1145/1401890.1401946 Search in Google Scholar

[18] M. Goldstein and A. Dengel. Histogram-based outlier score (hbos): a fast unsupervised anomaly detection algorithm. In Proceeding of KI, pages 59–63, 2012. Search in Google Scholar

[19] B. Scholkopf, J. Platt, J. Shawe-Taylor, and A. Smola. Estimating the support of a high-dimensional distribution. Neural computation, pages 1443–1471, 2001.10.1162/08997660175026496511440593 Search in Google Scholar

[20] M. Amer, M. Goldstein, and S. Abdennadher. Enhancing one-class support vector machines for unsupervised anomaly detection. In Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description, 2013.10.1145/2500853.2500857 Search in Google Scholar

[21] L. Ruff, R. Vandermeulen, N. Gornitz, L. Deecke, S. Siddiqui, A. Binder, E. Muller, and M. Kloft. Deep one-class classification. In Proceeding of international conference on machine learning, 2018. Search in Google Scholar

[22] M. Breunig, H. Kriegel, R. Ng, and J. Sander. Lof: Identifying density-based local outliers. In Proceeding of the ACM Sigmod International Conference on Management of Data, 2000.10.1145/342009.335388 Search in Google Scholar

[23] P. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Addison Wesley, Boston, 2006. Search in Google Scholar

[24] F. Liu, K. Ting, and Z. Zhou. Isolation forest. In Proceedings of the 8th international conference on data mining, 2008.10.1109/ICDM.2008.17 Search in Google Scholar

[25] G. Susto, A. Beghi, and S. McLoone. Anomaly detection through on-line isolation forest: An application to plasma etching. In the 28th Annual SEMI Advanced Semiconductor Manufacturing Conference (ASMC), pages 89–94, 2017.10.1109/ASMC.2017.7969205 Search in Google Scholar

[26] L. Puggini and S. MCLoone. An enhanced variable selection and isolation forest based methodology for anomaly detection with oes data. Engineering Applications of Artificial Intelligence, 67:126–135, 2018.10.1016/j.engappai.2017.09.021 Search in Google Scholar

[27] J. Kim, H. Naganathan, S. Moon, W. Chong, and S. Ariaratnam. Applications of clustering and isolation forest techniques in real-time building energy-consumption data: Application to leed certified buildings. Journal of energy Engineering, 143, 2017.10.1061/(ASCE)EY.1943-7897.0000479 Search in Google Scholar

[28] J. Hofmockel and E. Sax. Isolation forest for anomaly detection in raw vehicle sensor data. In the 4th International Conference on Vehicle Technology and Intelligent Transport Systems (VE-HITS 2018), pages 411–416, 2018.10.5220/0006758004110416 Search in Google Scholar

[29] J. Livesey. Kurtosis provides a good omnibus test for outliers in small samples. Clinical Biochemistry, 40:1032–1036, 2007.10.1016/j.clinbiochem.2007.04.00317499683 Search in Google Scholar

[30] F. Liu, K. Ting, and Z. Zhou. On detecting clustered anomalies using sciforest. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2010.10.1007/978-3-642-15883-4_18 Search in Google Scholar

[31] S. Hariri, M. Kind, and R. Brunner. Extended isolation forest. IEEE transactions on knowledge and data engineering, 33:1479–1489, 2021.10.1109/TKDE.2019.2947676 Search in Google Scholar

[32] H. Kriegel, P. Kroger, E. Schubert, and A. Zimek. Outlier detection in axis-parallel subspaces of high dimensional data. In Proceedings of PAKDD, 2009.10.1007/978-3-642-01307-2_86 Search in Google Scholar

[33] A. Lazarevic and V. Kumar. Feature bagging for outlier detection. In Proceedings of KDD, 2005.10.1145/1081870.1081891 Search in Google Scholar

[34] R. Duda, P. Hart, and D. Stork. Pattern classification (2nd ed.). Wiley-interscience, 2000. Search in Google Scholar

[35] M. Shyu, S. Chen, K. Sarinnapakorn, and L. Chang. A novel anomaly detection scheme based on principal component classifier. In Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, 2003. Search in Google Scholar

[36] P. Westfall. Kurtosis as peakedness, 1905-2014. r.i.p. The American Statistician, 68(3):191–195, 2014.10.1080/00031305.2014.917055432175325678714 Search in Google Scholar

[37] D. Pena and F. Prieto. Multivariate outlier detection and robust covariance matrix estimation. Technometrics, 43:286–310, 2001.10.1198/004017001316975899 Search in Google Scholar

[38] D. Greene and P. Cunningham. Practical solutions to the problem of diagonal dominance in kernel document clustering. In Proceeding of ICML, 2006.10.1145/1143844.1143892 Search in Google Scholar

[39] Y. Zhao, Z. Nasrullah, and Z. Li. Pyod: A python toolbox for scalable outlier detection. Journal of Machine Learning Research, 20:1–7, 2019. Search in Google Scholar

[40] A. Paszke, S. Gross, F. Massa, and et. al A. Lerer. Pytorch: An imperative style, high-performance deep learning library. In NeurIPS, pages 8026–8037, 2019. Search in Google Scholar

[41] L. Abdallah, M. Badarna, W. Khalifa, and M. Yousef. Multikoc: Multi-one-class classifier based k-means clustering. Algorithms, 14(5):1–10, 2021. Search in Google Scholar

[42] B. Krawczyk, M. Wozniak, and B. Cyganek. Clustering-based ensemble for one-class classification. Information sciences, 264:182–195, 2014.10.1016/j.ins.2013.12.019 Search in Google Scholar

eISSN:
2449-6499
Język:
Angielski
Częstotliwość wydawania:
4 razy w roku
Dziedziny czasopisma:
Computer Sciences, Databases and Data Mining, Artificial Intelligence