[1] M. K. Saggi and S. Jain, “A Survey Towards an Integration of Big Data Analytics to Big Insights for Value-Creation,” Information Processing & Management, vol. 54, no. 5, pp. 758–790, Sep. 2018. https://doi.org/10.1016/j.ipm.2018.01.010
[2] A. Oussous, F. Z. Benjelloun, A. A. Lahcen, and S. Belfkih, “Big Data Technologies: A survey,” Journal of King Saud University – Computer and Information Sciences, vol. 30, no. 4, pp. 431–448, Oct. 2018. https://doi.org/10.1016/j.jksuci.2017.06.001
[3] G. Haixiang, L. Yijing, J. Shang, G. Mingyun, H. Yuanyue, and G. Bing, “Learning From Class-Imbalanced Data: Review of Methods and Applications,” Expert Systems with Applications, vol. 73, pp. 220–239, May 2017. https://doi.org/10.1016/j.eswa.2016.12.035
[4] H. He and E. A. Garcia, “Learning From Imbalanced Data,” IEEE Transactions on Knowledge & Data Engineering, vol. 21, no. 9, pp. 1263–1284, Sep. 2009. https://doi.org/10.1109/TKDE.2008.239
[5] S. Das, S. Datta, and B. B. Chaudhuri, “Handling Data Irregularities in Classification: Foundations, Trends, and Future Challenges,” Pattern Recognition, vol. 81, pp. 674–693, Sep. 2018. https://doi.org/10.1016/j.patcog.2018.03.008
[6] J. Stefanowski, “Dealing With Data Difficulty Factors While Learning From Imbalanced Data,” in Challenges in Computational Statistics and Data Mining, pp. 333–363, 2016. https://doi.org/10.1007/978-3-319-18781-5_17
[7] A. Fernández, S. del Río, N. V. Chawla, and F. Herrera, “An Insight Into Imbalanced Big Data Classification: Outcomes and Challenges,” Complex & Intelligent Systems, vol. 3, no. 2, pp. 105–120, Jun. 2017. https://doi.org/10.1007/s40747-017-0037-9
[8] S. del Río, V. López, J. M. Benítez, and F. Herrera, “On the Use of MapReduce for Imbalanced Big Data Using Random Forest,” Information Sciences, vol. 285, pp. 112–137, 2014. https://doi.org/10.1016/j.ins.2014.03.043
[9] S. S. Patil and S. P. Sonavane, “Enriched Over_Sampling Techniques for Improving Classification of Imbalanced Big Data,” in 2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService), USA, 2017, pp. 1–10. https://doi.org/10.1109/BigDataService.2017.19
[10] M. Ghanavati, R. K. Wong, F. Chen, Y. Wang, and C. S. Perng, “An Effective Integrated Method for Learning Big Imbalanced Data,” in 2014 IEEE International Congress on Big Data, USA, 2014, pp. 691–698. https://doi.org/10.1109/BigData.Congress.2014.102
[11] D. Galpert, S. del Río, F. Herrera, E. Ancede-Gallardo, A. Antunes, and G. Agüero-Chapin, “An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species,” BioMed Research International, vol. 2015, Article ID 748681, 2015. https://doi.org/10.1155/2015/748681
[12] S. del Río, J. M. Benítez, and F. Herrera, “Analysis of Data Preprocessing Increasing the Oversampling Ratio for Extremely Imbalanced Big Data Classification,” in 2015 IEEE Trustcom/BigDataSE/ISPA, Finland, 2015, pp. 180–185. https://doi.org/10.1109/Trustcom.2015.579
[13] I. Triguero, S. del Río, V. López, J. Bacardit, J. M. Benítez, and F. Herrera, “ROSEFW-RF: The Winner Algorithm for the ECBDL’14 Big Data Competition: An Extremely Imbalanced Big Data Bioinformatics Problem,” Knowledge-Based Systems, vol. 87, pp. 69–79, Oct. 2015. https://doi.org/10.1016/j.knosys.2015.05.027
[14] I. Triguero, M. Galar, S. Vluymans, C. Cornelis, H. Bustince, F. Herrera, and Y. Saeys, “Evolutionary Undersampling for Imbalanced Big Data Classification,” in 2015 IEEE Congress on Evolutionary Computation (CEC), Japan, 2015, pp. 715–722. https://doi.org/10.1109/CEC.2015.7256961
[15] I. Triguero, M. Galar, D. Merino, J. Maillo, H. Bustince, and F. Herrera, “Evolutionary Undersampling for Extremely Imbalanced Big Data Classification Under Apache Spark,” in 2016 IEEE Congress on Evolutionary Computation (CEC), Canada, 2016, pp. 640–647. https://doi.org/10.1109/CEC.2016.7743853
[16] S. Kamal, S. H. Ripon, N. Dey, A. S. Ashour, and V. Santhi, “A MapReduce approach to diminish imbalance parameters for big deoxyribonucleic acid dataset,” Computer Methods and Programs in Biomedicine, vol. 131, pp. 191–206, Jul. 2016. https://doi.org/10.1016/j.cmpb.2016.04.005
[17] F. Hu, H. Li, H. Lou, and J. Dai, “A parallel oversampling algorithm based on NRSBoundary-SMOTE,” Journal of Information & Computational Science, vol. 11, no. 13, pp. 4655–4665, Sep. 2014. https://doi.org/10.12733/jics20104484
[18] R. C. Bhagat and S. S. Patil, “Enhanced SMOTE Algorithm for Classification of Imbalanced Big-Data Using Random Forest,” in 2015 IEEE International Advance Computing Conference (IACC), India, 2015, pp. 403–408. https://doi.org/10.1109/IADCC.2015.7154739
[19] C. K. Maurya, D. Toshniwal, and G. V. Venkoparao, “Online Sparse Class Imbalance Learning on Big Data,” Neurocomputing, vol. 216, pp. 250–260, Dec. 2016. https://doi.org/10.1016/j.neucom.2016.07.040
[20] M. Tang, C. Yang, K. Zhang, and Q. Xie, “Cost-Sensitive Support Vector Machine Using Randomized Dual Coordinate Descent Method for Big Class-Imbalanced Data Classification,” Abstract and Applied Analysis, vol. 2014, Article ID 416591, Jul. 2014. https://doi.org/10.1155/2014/416591
[21] X. Wang, X. Liu, and S. Matwin, “A distributed instance-weighted SVM algorithm on large-scale imbalanced datasets,” in 2014 IEEE International Conference on Big Data, USA, 2014, pp. 45–51. https://doi.org/10.1109/BigData.2014.7004467
[22] V. López, S. del Río, J. M. Benítez, and F. Herrera, “Cost-Sensitive Linguistic Fuzzy Rule Based Classification Systems Under the MapReduce Framework for Imbalanced Big Data,” Fuzzy Sets and Systems, vol. 258, pp. 5–38, Jan. 2015. https://doi.org/10.1016/j.fss.2014.01.015
[23] S. del Río, V. López, J. M. Benítez, and F. Herrera, “A MapReduce Approach to Address Big Data Classification Problems Based on the Fusion of Linguistic Fuzzy Rules,” International Journal of Computational Intelligence Systems, vol. 8, no. 3, pp. 422–437, May 2015. https://doi.org/10.1080/18756891.2015.1017377
[24] J. Zhai, S. Zhang, M. Zhang, and X. Liu, “Fuzzy Integral-Based ELM Ensemble for Imbalanced Big Data Classification,” Soft Computing, vol. 22, no. 11, pp. 3519–3531, Jun. 2018. https://doi.org/10.1007/s00500-018-3085-1
[25] Z. Wang, J. Xin, H. Yang, S. Tian, G. Yu, C. Xu, and Y. Yao, “Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning,” Tsinghua Science and Technology, vol. 22, no. 2, pp. 160–173, Apr. 2017. https://doi.org/10.23919/TST.2017.7889638
[26] N. B. Abdel-Hamid, S. ElGhamrawy, A. El Desouky, and H. Arafat, “A Dynamic Spark-Based Classification Framework for Imbalanced Big Data,” Journal of Grid Computing, vol. 16, no. 4, pp. 607–626, Dec. 2018. https://doi.org/10.1007/s10723-018-9465-z
[27] J. L. Leevy, T. M. Khoshgoftaar, R. A. Bauder, and N. Seliya, “A Survey on Addressing High-Class Imbalance in Big Data,” Journal of Big Data, vol. 5, no. 42, Dec. 2018. https://doi.org/10.1186/s40537-018-0151-6
[28] J. W. Huang, C. W. Chiang, and J. W. Chang, “Email Security Level Classification of Imbalanced Data Using Artificial Neural Network: The Real Case in a World-Leading Enterprise,” Engineering Applications of Artificial Intelligence, vol. 75, pp. 11–21, Oct. 2018. https://doi.org/10.1016/j.engappai.2018.07.010
[29] T. Jo and N. Japkowicz, “Class Imbalances Versus Small Disjuncts,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 40–49, Jun. 2004. https://doi.org/10.1145/1007730.1007737
[30] A. Agrawal, H. L. Viktor, and E. Paquet, “SCUT: Multi-Class Imbalanced Data Classification Using SMOTE and Cluster-Based Undersampling,” in 2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), 2015, vol. 1, pp. 226–234. https://doi.org/10.5220/0005595502260234
[31] W. C. Lin, C. F. Tsai, Y. H. Hu, and J. S. Jhang, “Clustering-Based Undersampling in Class-Imbalanced Data,” Information Sciences, vol. 409, pp. 17–26, Oct. 2017. https://doi.org/10.1016/j.ins.2017.05.008
[32] I. Nekooeimehr and S. K. Lai-Yuen, “Adaptive Semi-Unsupervised Weighted Oversampling (A-SUWO) for Imbalanced Datasets,” Expert Systems with Applications, vol. 46, pp. 405–416, Mar. 2016. https://doi.org/10.1016/j.eswa.2015.10.031
[33] A. Estabrooks, T. Jo, and N. Japkowicz, “A Multiple Resampling Method for Learning from Imbalanced Data Sets,” Computational Intelligence, vol. 20, no. 1, pp. 18–36, Feb. 2004. https://doi.org/10.1111/j.0824-7935.2004.t01-1-00228.x
[34] H. Guo, J. Zhou, and C. A. Wu, “Imbalanced Learning Based on Data-Partition and SMOTE,” Information, vol. 9, no. 238, Sep. 2018. https://doi.org/10.3390/info9090238
[35] GAZİ-BIDISEC. Gazi University Big Data and Information Security Center. [Online]. Available: http://bigdatacenter.gazi.edu.tr/ [Accessed: Sep. 2019].
[36] T. Hasanin and T. Khoshgoftaar, “The Effects of Random Undersampling with Simulated Class Imbalance for Big Data,” in 2018 IEEE International Conference on Information Reuse and Integration (IRI), USA, 2018, pp. 70–79. https://doi.org/10.1109/IRI.2018.00018