Open Access

FSPL: A Meta–Learning Approach for a Filter and Embedded Feature Selection Pipeline

International Journal of Applied Mathematics and Computer Science
Image Analysis, Classification and Protection (Special section, pp. 7–70), Marcin Niemiec, Andrzej Dziech and Jakob Wassermann (Eds.)


Aarts, E.H.L. and van Laarhoven, P.J.M. (1987). Simulated annealing: A pedestrian review of the theory and some applications, in P.A. Devijver and J. Kittler (Eds), Pattern Recognition Theory and Applications, Springer, Berlin/Heidelberg, pp. 179–192.

Abdullah, A.S., Selvakumar, S., Karthikeyan, P. and Venkatesh, M. (2017). Comparing the efficacy of decision tree and its variants using medical data, Indian Journal of Science and Technology 10: 1–8.

Alshaikhdeeb, B. and Ahmad, K. (2017). Feature selection for chemical compound extraction using wrapper approach with naive Bayes classifier, 6th International Conference on Electrical Engineering and Informatics (ICEEI), Langkawi, Malaysia, pp. 1–6.

Anthony, T., Tian, Z. and Barber, D. (2017). Thinking fast and slow with deep learning and tree search, Conference on Neural Information Processing Systems, Long Beach, USA.

Azhagusundari, B. and Thanamani, A.S. (2013). Feature selection based on information gain, International Journal of Innovative Technology and Exploring Engineering 2(2): 18–21.

Bilalli, B., Abelló, A. and Aluja-Banet, T. (2017). On the predictive power of metafeatures in OpenML, International Journal of Applied Mathematics and Computer Science 27(4): 697–712, DOI: 10.1515/amcs-2017-0048.

Liao, B. and Luus, R. (2005). Comparison of the Luus–Jaakola optimization procedure and the genetic algorithm, Engineering Optimization 37(4): 381–396.

Bo, Z.W., Hua, L.Z. and Yu, Z.G. (2006). Optimization of process route by genetic algorithms, Robotics and Computer-Integrated Manufacturing 22: 180–188.

Bolón-Canedo, V. and Alonso-Betanzos, A. (2019). Ensembles for feature selection: A review and future trends, Information Fusion 52: 1–12.

Brazdil, P., Giraud-Carrier, C., Soares, C. and Vilalta, R. (2009). Metalearning: Applications to Data Mining, Springer, Berlin/Heidelberg.

Chandrashekar, G. and Sahin, F. (2014). A survey on feature selection methods, Computers & Electrical Engineering 40(1): 16–28.

Chengzhang, L. and Jiucheng, X. (2019). Feature selection with the Fisher score followed by the maximal clique centrality algorithm can accurately identify the hub genes of hepatocellular carcinoma, Scientific Reports 9: 17283.

Drori, I., Krishnamurthy, Y., Rampin, R., de Paula Lourenco, R., Ono, J.P., Cho, K., Silva, C. and Freire, J. (2018). AlphaD3M: Machine learning pipeline synthesis, AutoML Workshop at ICML, Stockholm, Sweden.

Engels, R. and Theusinger, C. (1998). Using a data metric for preprocessing advice for data mining applications, European Conference on Artificial Intelligence, Brighton, UK, pp. 23–28.

Erickson, N., Mueller, J., Shirkov, A., Zhang, H., Larroy, P., Li, M. and Smola, A. (2020). AutoGluon-tabular: Robust and accurate AutoML for structured data, arXiv: 2003.06505.

Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M. and Hutter, F. (2020). Auto-Sklearn 2.0: Hands-free AutoML via meta-learning, arXiv: 2007.04074.

Feurer, M., Klein, A., Eggensperger, K., Springenberg, J.T., Blum, M. and Hutter, F. (2019). Auto-sklearn: Efficient and robust automated machine learning, in F. Hutter et al. (Eds), Automated Machine Learning, Springer, Cham, pp. 113–134.

Feurer, M., Springenberg, J.T. and Hutter, F. (2014). Using meta-learning to initialize Bayesian optimization of hyperparameters, International Conference on Metalearning and Algorithm Selection, Prague, Czech Republic, pp. 3–10.

Feurer, M., Springenberg, J.T. and Hutter, F. (2015). Initializing Bayesian hyperparameter optimization via meta-learning, Proceedings of the 29th AAAI Conference on Artificial Intelligence, Austin, USA, pp. 1128–1135.

Freitas, A.A. (2014). Comprehensible classification models: A position paper, ACM SIGKDD Explorations Newsletter 15(1): 1–10.

Fushiki, T. (2011). Estimation of prediction error by using k-fold cross-validation, Statistics and Computing 21: 137–146.

Ghaheri, A., Shoar, S., Naderan, M. and Hoseini, S.S. (2015). The applications of genetic algorithms in medicine, Oman Medical Journal 30(6): 406–416.

Gil, Y., Yao, K.-T., Ratnakar, V., Garijo, D., Steeg, G.V., Szekely, P., Brekelmans, R., Kejriwal, M., Lau, F. and Huang, I.-H. (2018). P4ML: A phased performance-based pipeline planner for automated machine learning, AutoML Workshop at ICML, Stockholm, Sweden.

Grabmeier, J.L. and Lambe, L.A. (2007). Decision trees for binary classification variables grow equally with the Gini impurity measure and Pearson’s chi-square test, International Journal of Business Intelligence and Data Mining 2(2): 213–226.

Gu, Q., Li, Z. and Han, J. (2011). Generalized Fisher score for feature selection, Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, Barcelona, Spain, pp. 266–273.

He, X., Zhao, K. and Chu, X. (2021). AutoML: A survey of the state-of-the-art, Knowledge-Based Systems 212: 106622.

Holland, J.H. (1992). Genetic algorithms, Scientific American 267(1): 66–73.

Ivosev, G., Burton, L. and Bonner, R. (2008). Dimensionality reduction and visualization in principal component analysis, Analytical Chemistry 80(13): 4933–4944.

Kang, Y., Cai, Z., Tan, C.-W., Huang, Q. and Liu, H. (2020). Natural language processing (NLP) in management research: A literature review, Journal of Management Analytics 7(2): 139–172.

Kanna, S.S. and Ramaraj, N. (2010). Feature selection algorithms: A survey and experimental evaluation, Knowledge-Based Systems 23(6): 580–585.

Keren Simon, L., Liberzon, A. and Lazebnik, T. (2023). A computational framework for physics-informed symbolic regression with straightforward integration of domain knowledge, Scientific Reports 13(1): 1249.

Kietz, J.-U., Serban, F., Bernstein, A. and Fischer, S. (2012). Designing KDD workflows via HTN-planning for intelligent discovery assistance, 5th Planning to Learn Workshop at the European Conference on Artificial Intelligence, Montpellier, France.

Kumar, V. and Minz, S. (2014). Feature selection: A literature review, Smart Computing Review 4(3): 211–229.

Kusy, M. and Zajdel, R. (2021). A weighted wrapper approach to feature selection, International Journal of Applied Mathematics and Computer Science 31(4): 685–696, DOI: 10.34768/amcs-2021-0047.

Lazebnik, T., Zaher, B., Bunimovich-Mendrazitsky, S. and Halachmi, S. (2022). Predicting acute kidney injury following open partial nephrectomy treatment using SAT-pruned explainable machine learning model, BMC Medical Informatics and Decision Making 22: 133.

Lemke, C., Budka, M. and Gabrys, B. (2015). Metalearning: A survey of trends and technologies, Artificial Intelligence Review 44(1): 117–130.

Lin, X., Li, C., Ren, W., Luo, X. and Qi, Y. (2019). A new feature selection method based on symmetrical uncertainty and interaction gain, Computational Biology and Chemistry 83: 107149.

Liu, Y., Mu, Y., Chen, K., Li, Y. and Guo, J. (2020). Daily activity feature selection in smart homes based on Pearson correlation coefficient, Neural Processing Letters 51: 1771–1787.

Luo, G. (2016). A review of automatic selection methods for machine learning algorithms and hyper-parameter values, Network Modeling Analysis in Health Informatics and Bioinformatics 5(1): 18.

Ma, L., Li, M., Gao, Y., Chen, T., Ma, X. and Qu, L. (2017). A novel wrapper approach for feature selection in object-based image classification using polygon-based cross-validation, IEEE Geoscience and Remote Sensing Letters 14(3): 409–413.

Maile, H., Li, J.O., Gore, D., Leucci, M., Mulholland, P., Hau, S., Szabo, A., Moghul, I., Balaskas, K., Fujinami, K., Hysi, P., Davidson, A., Liskova, P., Hardcastle, A., Tuft, S. and Pontikos, N. (2021). Machine learning algorithms to detect subclinical keratoconus: Systematic review, JMIR Medical Informatics 9(12): e27363.

Molina, L.C., Belanche, L. and Nebot, A. (2002). Feature selection algorithms: A survey and experimental evaluation, 2002 IEEE International Conference on Data Mining, Maebashi City, Japan, pp. 306–313.

Mussa, D.J. and Jameel, N.G.M. (2019). Relevant SMS spam feature selection using wrapper approach and XGBoost algorithm, Kurdistan Journal of Applied Research 4(2): 110–120.

Muthukrishnan, R. and Rohini, R. (2016). Lasso: A feature selection technique in predictive modeling for machine learning, IEEE International Conference on Advances in Computer Applications (ICACA), Coimbatore, India, pp. 18–20.

Neumann, J., Schnorr, C. and Steidl, G. (2005). Combined SVM-based feature selection and classification, Machine Learning 61: 129–150.

Nguyen, P., Hilario, M. and Kalousis, A. (2014). Using meta-mining to support data mining workflow planning and optimization, Journal of Artificial Intelligence Research 51: 605–644.

Nisioti, E., Chatzidimitriou, K.C. and Symeonidis, A.L. (2018). Predicting hyperparameters from meta-features in binary classification problems, AutoML Workshop at International Conference on Machine Learning, Stockholm, Sweden.

Oliveto, P.S. and Witt, C. (2015). Improved time complexity analysis of the simple genetic algorithm, Theoretical Computer Science 605: 21–41.

Olson, R.S. and Moore, J.H. (2016). TPOT: A tree-based pipeline optimization tool for automating machine learning, JMLR: Workshop and Conference Proceedings 64: 66–74.

Ometto, G., Moghul, I., Montesano, G., Hunter, A., Pontikos, N., Jones, P. R., Keane, P.A., Liu, X., Denniston, A.K. and Crabb, D.P. (2019). ReLayer: A free, online tool for extracting retinal thickness from cross-platform OCT images, Translational Vision Science and Technology 8(3): 25.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V. (2011). Scikit-learn: Machine learning in Python, Journal of Machine Learning Research 12: 2825–2830.

Pinto, F., Cerqueira, V., Soares, C. and Mendes-Moreira, J. (2017). autoBagging: Learning to rank bagging workflows with metalearning, arXiv: 1706.09367.

Plackett, R.L. (1983). Karl Pearson and the chi-squared test, International Statistical Review/Revue Internationale de Statistique 51: 59–72.

Reif, M., Shafait, F. and Dengel, A. (2012). Meta-learning for evolutionary parameter optimization of classifiers, Machine Learning 87: 357–380.

Rice, J.R. (1976). The algorithm selection problem, Advances in Computers 15: 65–118.

Rokach, L. (2016). Decision forest: Twenty years of research, Information Fusion 27: 111–125.

Rosenfeld, A. (2021). Better metrics for evaluating explainable artificial intelligence, AAMAS’21: 20th International Conference on Autonomous Agents and Multiagent Systems, (virtual), pp. 45–50.

Rosenfeld, A. and Freiman, M. (2021). Explainable feature ensembles through homogeneous and heterogeneous intersections, IJCAI-PRICAI 2020 Workshop on Explainable Artificial Intelligence, (online).

Rosenfeld, A., Graham, D.G., Hamoudi, R., Butawan, R., Eneh, V., Khan, S., Miah, H., Niranjan, M. and Lovat, L.B. (2015). MIAT: A novel attribute selection approach to better predict upper gastrointestinal cancer, International Conference on Data Science and Advanced Analytics, Paris, France.

Rosenfeld, A. and Richardson, A. (2019). Explainability in human-agent systems, Autonomous Agents and Multi-Agent Systems 33(6): 673–705.

Saeys, Y., Abeel, T. and de Peer, Y.V. (2008). Robust feature selection using ensemble feature selection techniques, in W. Daelemans et al. (Eds), Machine Learning and Knowledge Discovery in Databases, Springer, Berlin, pp. 313–325.

Savchenko, E. and Lazebnik, T. (2022). Computer aided functional style identification and correction in modern Russian texts, Journal of Data, Information and Management 4: 25–32.

Seijo-Pardo, B., Porto-Díaz, I., Bolón-Canedo, V. and Alonso-Betanzos, A. (2017). Ensemble feature selection: Homogeneous and heterogeneous approaches, Knowledge-Based Systems 118: 124–139.

Serban, F., Vanschoren, J., Kietz, J.U. and Bernstein, A.A. (2013). A survey of intelligent assistants for data analysis, ACM Computing Surveys 45(3): 1–35.

Sharma, A., Imoto, S. and Miyano, S. (2012). A top-r feature selection algorithm for microarray gene expression data, IEEE/ACM Transactions on Computational Biology and Bioinformatics 9(3): 754–764.

Shatte, A.B.R., Hutchinson, D.M. and Teague, S.J. (2019). Machine learning in mental health: A scoping review of methods and applications, Psychological Medicine 49(9): 1426–1448.

Shen, Z., Chen, X. and Garibaldi, J.M. (2020). A novel meta learning framework for feature selection using data synthesis and fuzzy similarity, IEEE World Congress on Computational Intelligence, (online).

Shilbayeh, S. and Vadera, S. (2014). Feature selection in meta learning framework, Science and Information Conference, London, UK, pp. 269–275.

Smith-Miles, K.A. (2009). Cross-disciplinary perspectives on meta-learning for algorithm selection, ACM Computing Surveys 41(1): 6.

Soares, C., Brazdil, P.B. and Kuba, P. (2004). A meta-learning method to select the kernel width in support vector regression, Machine Learning 54: 195–209.

Strang, B., van der Putten, P., van Rijn, J.N. and Hutter, F. (2018). Don’t rule out simple models prematurely: A large scale benchmark comparing linear and non-linear classifiers in OpenML, in W. Duivesteijn et al. (Eds), Advances in Intelligent Data Analysis XVII, Springer, Berlin, pp. 303–315.

Swain, P.H. and Hauska, H. (1977). The decision tree classifier: Design and potential, IEEE Transactions on Geoscience Electronics 15(3): 142–147.

Tang, J., Alelyani, S. and Liu, H. (2014). Feature Selection for Classification: A Review, CRC Press, Boca Raton.

Teisseyre, P. (2022). Joint feature selection and classification for positive unlabelled multi-label data using weighted penalized empirical risk minimization, International Journal of Applied Mathematics and Computer Science 32(2): 311–322, DOI: 10.34768/amcs-2022-0023.

Tokarev, K.E., Zotov, V.M., Khavronina, V.N. and Rodionova, O.V. (2021). Convolutional neural network of deep learning in computer vision and image classification problems, IOP Conference Series: Earth and Environmental Science 786(1): 012040.

Vanschoren, J. (2018). Meta-learning: A survey, arXiv: 1810.03548.

Vasan, K.K. and Surendiran, B. (2016). Dimensionality reduction using principal component analysis for network intrusion detection, Perspectives in Science 8: 510–512.

Waring, J., Lindvall, C. and Umeton, R. (2020). Automated machine learning: Review of the state-of-the-art and opportunities for healthcare, Artificial Intelligence in Medicine 104: 101822.

Wasimuddin, M., Elleithy, K., Abuzneid, A.-S., Faezipour, M. and Abuzaghleh, O. (2020). Stages-based ECG signal analysis from traditional signal processing to machine learning approaches: A survey, IEEE Access 8: 177782–177803.

Wu, S., Roberts, K., Datta, S., Du, J., Ji, Z., Si, Y., Soni, S., Wang, Q., Wei, Q., Xiang, Y., Zhao, B. and Xu, H. (2020). Deep learning in clinical natural language processing: A methodical review, Journal of the American Medical Informatics Association 27(3): 457–470.

Zebari, R.R., Abdulazeez, A.M., Zeebaree, D.Q., Zebari, D.A. and Saeed, J.N. (2020). A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction, Journal of Applied Science and Technology Trends 1(2): 56–70.

Zhu, X., Huang, Z., Shen, H.T., Cheng, J. and Xu, C. (2012). Dimensionality reduction by mixed kernel canonical correlation analysis, Pattern Recognition 45(8): 3003–3016.

eISSN:
2083-8492
Language:
English
Publication frequency:
4 times per year
Journal subjects:
Mathematics, Applied Mathematics