[Aggarwal, C.C. and Zhai, C.-X. (Eds.) (2012). Mining Text Data, Springer, New York, NY.10.1007/978-1-4614-3223-4]Search in Google Scholar
[Aswani Kumar, C. and Srinivas, S. (2006). Latent semantic indexing using eigenvalue analysis for efficient information retrieval, International Journal of Applied Mathematics and Computer Science 16(4): 551-558.]Search in Google Scholar
[Bayes, T. (1763). An essay towards solving a problem in the doctrine of chances, Philosophical Transactions of the Royal Society of London 53: 370-418.10.1098/rstl.1763.0053]Search in Google Scholar
[Bilski, A. and Wojciechowski, J. (2016). Automatic parametric fault detection in complex analog systems based on a method of minimum node selection, International Journal of Applied Mathematics and Computer Science 26(3): 655-668, DOI: 10.1515/amcs-2016-0045.10.1515/amcs-2016-0045]Open DOISearch in Google Scholar
[Blei, D.M., Ng, A.Y. and Jordan, M.I. (2003). Latent Dirichlet allocation, Journal of Machine Learning Research 3: 993-1022.]Search in Google Scholar
[Breiman, L. (1996). Bagging predictors, Machine Learning 24(2): 123-140.10.1007/BF00058655]Search in Google Scholar
[Breiman, L. (2001). Random forests, Machine Learning 45(1): 5-32.10.1023/A:1010933404324]Search in Google Scholar
[Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984). Classification and Regression Trees, Chapman and Hall, New York, NY.]Search in Google Scholar
[Cestnik, B. (1990). Estimating probabilities: A crucial task in machine learning, Proceedings of the 9th European Conference on Artificial Intelligence (ECAI-90), Stockholm, Sweden, pp. 147-149.]Search in Google Scholar
[Cichosz, P. (2015). Data Mining Algorithms: Explained Using R, Wiley, Chichester.10.1002/9781118950951]Search in Google Scholar
[Cortes, C. and Vapnik, V.N. (1995). Support-vector networks, Machine Learning 20(3): 273-297.10.1007/BF00994018]Search in Google Scholar
[Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, New York, NY.10.1017/CBO9780511801389]Search in Google Scholar
[Dařena, F. and Žižka, J. (2017). Ensembles of classifiers for parallel categorization of large number of text documents expressing opinions, Journal of Applied Economic Sciences 12(1): 25-35.]Search in Google Scholar
[Dietterich, T.G. (2000). Ensemble methods in machine learning, Proceedings of the 1st International Workshop on Multiple Classifier Systems, Cagliari, Italy, pp. 1-15.10.1007/3-540-45014-9_1]Search in Google Scholar
[Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learning 29(2-3): 103-137.10.1023/A:1007413511361]Search in Google Scholar
[Duchi, J., Hazan, E. and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research 12: 2121-2159.]Search in Google Scholar
[Dumais, S.T. (2005). Latent semantic analysis, Annual Review of Information Science and Technology 38(1): 188-229.10.1002/aris.1440380105]Search in Google Scholar
[Dumais, S.T., Platt, J.C., Heckerman, D. and Sahami, M. (1998). Inductive learning algorithms and representations for text categorization, Proceedings of the 7th International Conference on Information and Knowledge Management (CIKM-98), Bethesda, MD, USA, pp. 148-155.10.1145/288627.288651]Search in Google Scholar
[Egan, J.P. (1975). Signal Detection Theory and ROC Analysis, Academic Press, New York, NY.]Search in Google Scholar
[Fawcett, T. (2006). An introduction to ROC analysis, Pattern Recognition Letters 27(8): 861-874.10.1016/j.patrec.2005.10.010]Search in Google Scholar
[Forman, G. (2003). An extensive empirical study of feature selection measures for text classification, Journal of Machine Learning Research 3: 1289-1305. Goldberg, Y. and Levy, O. (2014). word2vec Explained: Deriving Mikolov et al.’s negative sampling word-embedding method, arXiv: 1402.3722.]Search in Google Scholar
[Guyon, I.M. and Elisseeff, A. (2003). An introduction to variable and feature selection, Journal of Machine Learning Research 3: 1157-1182.]Search in Google Scholar
[Hamel, L.H. (2009). Knowledge Discovery with Support Vector Machines, Wiley, New York, NY.10.1002/9780470503065]Search in Google Scholar
[Hand, D.J. and Yu, K. (2001). Idiot’s Bayes-not so stupid after all?, International Statistical Review 69(3): 385-399.10.2307/1403452]Search in Google Scholar
[Heaps, H.S. (1978). Information Retrieval: Computational and Theoretical Aspects, Academic Press, New York, NY.]Search in Google Scholar
[Hilbe, J.M. (2009). Logistic Regression Models, Chapman and Hall, New York, NY.10.1201/9781420075779]Search in Google Scholar
[Holtz, P., Kronberger, N. and Wagner, W. (2012). Analyzing Internet forums: A practical guide, Journal of Media Psychology 24(2): 55-66.10.1027/1864-1105/a000062]Search in Google Scholar
[Joachims, T. (1998). Text categorization with support vector machines: Learning with many relevant features, Proceedings of the 10th European Conference on Machine Learning (ECML-98), Chemnitz, Germany, pp. 137-142.10.1007/BFb0026683]Search in Google Scholar
[Joachims, T. (2002). Learning to Classify Text by Support Vector Machines: Methods, Theory, and Algorithms, Springer, New York, NY.10.1007/978-1-4615-0907-3]Search in Google Scholar
[Koprinska, I., Poon, J., Clark, J. and Chan, J. (2007). Learning to classify e-mail, Information Sciences: An International Journal 177(10): 2167-2187.10.1016/j.ins.2006.12.005]Search in Google Scholar
[Lau, J.H. and Baldwin, T. (2016). An empirical evaluation of doc2vec with practical insights into document embedding generation, Proceedings of the 1st Workshop on Representation Learning for NLP, Berlin, Germany, pp. 78-86.10.18653/v1/W16-1609]Search in Google Scholar
[Le, Q.V. and Mikolov, T. (2014). Distributed representations of sentences and documents, Proceedings of the 31st International Conference on Machine Learning (ICML-14), Beijing, China, pp. 1188-1196.]Search in Google Scholar
[Lewis, D.D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval, Proceedings of the Tenth European Conference on Machine Learning (ECML- 98), Chemnitz, Germany, pp. 4-15.10.1007/BFb0026666]Search in Google Scholar
[Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest, R News 2(3): 18-22, http://CRAN.R-project.org/doc/Rnews/.]Search in Google Scholar
[Liu, H. and Motoda, H. (1998). Feature Selection for Knowledge Discovery and Data Mining, Springer, New York, NY.10.1007/978-1-4615-5689-3]Search in Google Scholar
[Liu, H., Motoda, H., Setiono, R. and Zhao, Z. (2010). Feature selection: An ever-evolving frontier in data mining, Proceedings of the 4th Workshop on Feature Selection in Data Mining (FSDM-10), Hyderabad, India, pp. 4-13.]Search in Google Scholar
[Lui, A. K.-F., Li, S.C. and Choy, S.O. (2007). An evaluation of automatic text categorization in online discussion analysis, Proceedings of the 7th IEEE International Conference on Advanced Learning Technologies (ICALT-2007), Niigata, Japan, pp. 205-209.10.1109/ICALT.2007.59]Search in Google Scholar
[Manning, C.D., Raghavan, P., and Schütze, H. (2008). Introduction to Information Retrieval, Cambridge University Press, Cambridge.10.1017/CBO9780511809071]Search in Google Scholar
[Marra, R.M., Moore, J.L. and Klimczak, A.K. (2004). Content analysis of online discussion forums: A comparative analysis of protocols, Educational Technology Research and Development 52(2): 23-40.10.1007/BF02504837]Search in Google Scholar
[McCallum, A. and Nigam, K. (1998). A comparison of event models for naive Bayes text classification, Proceedings of the AAAI/ICML-98 Workshop on Learning for Text Categorization, Madison, WI, USA, pp. 41-48.]Search in Google Scholar
[Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A. and Leisch, F. (2015). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien, R package version 1.6-7, https://CRAN.R-project.org/package=e1071.]Search in Google Scholar
[Mikolov, T., Chen, K., Corrado, G.S. and Dean, J. (2013a). Efficient estimation of word representations in vector space, arXiv:1301.3781.]Search in Google Scholar
[Mikolov, T., Le, Q.V. and Sutskever, I. (2013b). Exploiting similarities among languages for machine translation, arXiv:1309.4168.]Search in Google Scholar
[Mitchell, J. and Lapata, M. (2010). Composition in distributional models of semantics, Cognitive Science 34(8): 1388-1429.10.1111/j.1551-6709.2010.01106.x21564253]Search in Google Scholar
[Moldovan, A., Boţ, R.I. and Wanka, G. (2005). Latent semantic indexing for patent documents, International Journal of Applied Mathematics and Computer Science 15(4): 551-560.]Search in Google Scholar
[Oooms, J. (2016). hunspell: Morphological Analysis and Spell Checker for R, R package version 2.3, https://CRAN.R-project.org/package=hunspell.]Search in Google Scholar
[Pennington, J., Socher, R. and Manning, C.D. (2014). GloVe: Global vectors for word representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP-14), Doha, Qatar, pp. 1532-1543.10.3115/v1/D14-1162]Search in Google Scholar
[Platt, J.C. (1998). Fast training of support vector machines using sequential minimal optimization, in B. Schölkopf et al. (Eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, pp.185-208.10.7551/mitpress/1130.003.0016]Search in Google Scholar
[Platt, J.C. (2000). Probabilistic outputs for support vector machines and comparison to regularized likelihood methods, in A.J. Smola et al. (Eds.), Advances in Large Margin Classifiers, MIT Press, Cambridge, MA, pp. 61-74.]Search in Google Scholar
[Quinlan, J.R. (1986). Induction of decision trees, Machine Learning 1: 81-106.10.1007/BF00116251]Search in Google Scholar
[R Development Core Team (2016). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, http://www.R-project.org.]Search in Google Scholar
[Radovanović, M. and Ivanović, M. (2008). Text mining: Approaches and applications, Novi Sad Journal of Mathematics 38(3): 227-234.]Search in Google Scholar
[Rios, G. and Zha, H. (2004). Exploring support vector machines and random forests for spam detection, Proceedings of the 1st International Conference on Email and Anti Spam (CEAS-04), Mountain View, CA, USA, pp. 398-403.]Search in Google Scholar
[Rousseau, F., Kiagias, E. and Vazirgiannis, M. (2015). Text categorization as a graph classification problem, Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics and the 6th International Joint Conference on Natural Language Processing (ACLIJCNLP-15), Beijing, China, pp. 1702-1712.]Search in Google Scholar
[Said, D. and Wanas, N. (2011). Clustering posts in online discussion forum threads, International Journal of Computer Science and Information Technology 3(2): 1-14.10.5121/ijcsit.2011.3201]Search in Google Scholar
[Schölkopf, B. and Smola, A.J. (2001). Learning with Kernels, MIT Press, Cambridge, MA.]Search in Google Scholar
[Sebastiani, F. (2002). Machine learning in automated text categorization, ACM Computing Surveys 34(1): 1-47.10.1145/505282.505283]Search in Google Scholar
[Selivanov, D. (2016). text2vec: Modern Text Mining Framework for R, R package version 0.4.0, https://CRAN.R-project.org/package=text2vec.]Search in Google Scholar
[Siwek, K. and Osowski, S. (2016). Data mining methods for prediction of air pollution, International Journal of Applied Mathematics and Computer Science 26(2): 467-478, DOI: 10.1515/amcs-2016-0033.10.1515/amcs-2016-0033]Open DOISearch in Google Scholar
[Szymański, J. (2014). Comparative analysis of text representation methods using classification, Cybernetics and Systems 45(2): 180-199.10.1080/01969722.2014.874828]Search in Google Scholar
[Wu, Q., Ye, Y., Zhang, H., Ng, M.K. and Ho, S.-H. (2014). ForesTexter: An efficient random forest algorithm for imbalanced text categorization, Knowledge-Based Systems 67: 105-116.10.1016/j.knosys.2014.06.004]Search in Google Scholar
[Xu, B., Guo, X., Ye, Y. and Cheng, J. (2012). An improved random forest classifier for text categorization, Journal of Computers 7(12): 2913-2920.10.4304/jcp.7.12.2913-2920]Search in Google Scholar
[Xue, D. and Li, F. (2015). Research of text categorization model based on random forests, 2015 IEEE International Conference on Computational Intelligence and Communication Technology (CICT-15), Ghaziabad, India, pp. 173-176.10.1109/CICT.2015.101]Search in Google Scholar
[Yang, Y. and Pedersen, J. (1997). A comparative study on feature selection in text categorization, Proceedings of the 14th International Conference on Machine Learning (ICML-97), Nashville, TN, USA, pp. 412-420.]Search in Google Scholar
[Yessenalina, A. and Cardie, C. (2011). Compositional matrix-space models for sentiment analysis, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP-11), Edinburgh, UK, pp. 172-182.]Search in Google Scholar