Open Access

Experiments with language combinatorics in text classification: lessons learned and future implications



[1] Ptaszynski M., Masui F., Rzepka R., Araki K., First Glance on Pattern-based Language Modeling, Language Acquisition and Understanding Research Group Technical Reports, 2014.

[2] Ptaszynski M., Masui F., Kimura Y., Rzepka R., Araki K., Extracting Patterns of Harmful Expressions for Cyberbullying Detection, Proceedings of LTC’15, 2016, 370-375. DOI: 10.1016/j.ijcci.2016.07.002

[3] Ptaszynski M., Masui F., Rzepka R., Araki K., Subjective? Emotional? Emotive?: Language Combinatorics based Automatic Detection of Emotionally Loaded Sentences, Linguistics and Literature Studies, Vol. 5, No. 1, 2017, 36-50. DOI: 10.13189/lls.2017.050103

[4] Bickel S., Haider P., Scheffer T., Predicting sentences using n-gram language models, Proceedings of HLT-EMNLP 2005, 2005, 193-200. DOI: 10.3115/1220575.1220600

[5] Li H., Ma B., A phonotactic language model for spoken language identification, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, 2005, 515-522. DOI: 10.3115/1219840.1219904

[6] Ponte J.M., Croft W.B., A language modeling approach to information retrieval, Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1998, 275-281. DOI: 10.1145/290941.291008

[7] Brown P.F., Cocke J., Pietra S.A.D., Pietra V.J.D., Jelinek F., Lafferty J.D., Mercer R.L., Roossin P.S., A statistical approach to machine translation, Computational Linguistics, Vol. 16, No. 2, 1990, 79-85.

[8] Mays E., Damerau F.J., Mercer R.L., Context based spelling correction, Information Processing & Management, Vol. 27, No. 5, 1991, 517-522. DOI: 10.1016/0306-4573(91)90066-U

[9] Kupiec J., Robust part-of-speech tagging using a hidden Markov model, Computer Speech & Language, Vol. 6, No. 3, 1992, 225-242. DOI: 10.1016/0885-2308(92)90019-Z

[10] Hu Y., Lu R., Li X., Chen Y., Duan J., A language modeling approach to sentiment analysis, Computational Science – ICCS 2007, 1186-1193. DOI: 10.1007/978-3-540-72586-2_165

[11] Ptaszynski M., Rzepka R., Araki K., Momouchi Y., Language combinatorics: A sentence pattern extraction architecture based on combinatorial explosion, International Journal of Computational Linguistics (IJCL), Vol. 2, No. 1, 2011, 24-36.

[12] Harris Z., Distributional Structure, Word, Vol. 10, No. 2/3, 1954, 146-162. DOI: 10.1080/00437956.1954.11659520

[13] Cambria E., Hussain A., Sentic Computing: Techniques, Tools, and Applications, Springer, 2012. DOI: 10.1007/978-94-007-5070-8

[14] Lu Y., Zhai C.X., Positional Language Models for Information Retrieval, 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009, 299-306. DOI: 10.1145/1571941.1571993

[15] Markov A.A., Extension of the limit theorems of probability theory to a sum of variables connected in a chain, Reprinted in Appendix B of: R. Howard, Dynamic Probabilistic Systems, Vol. 1: Markov Chains, John Wiley and Sons, 1971.

[16] Huang X., Alleva F., Hon H.W., Hwang M.Y., Rosenfeld R., The SPHINX-II Speech Recognition System: An Overview, Computer Speech & Language, Vol. 7, 1992, 137-148. DOI: 10.1006/csla.1993.1007

[17] Guthrie D., Allison B., Liu W., Guthrie L., Wilks Y., A closer look at skip-gram modelling, Proceedings of LREC-2006, 2006, 1-4.

[18] Pickhardt R., Gottron T., Korner M., Wagner P.G., Speicher T., Staab S., A Generalized Language Model as the Combination of Skipped n-grams and Modified Kneser Ney Smoothing, Proceedings of ACL 2014, 2014, 1145-1154. DOI: 10.3115/v1/P14-1108

[19] Ptaszynski M., Lempa P., Masui F., A Modular System for Support of Experiments in Text Classification, Technical Transactions, Vol. 7-B/2015, 229-243.

[20] Nakajima Y., Ptaszynski M., Honma H., Masui F., Investigation of Future Reference Expressions in Trend Information, Proceedings of the 2014 AAAI Spring Symposium Series, 2014, 31-38.

[21] Ptaszynski M., Dybala P., Rzepka R., Araki K., Affecting Corpora: Experiments with Automatic Affect Annotation System – A Case Study of the 2channel Forum, Proceedings of PACLING-09, 2009, 223-228.

[22] Human Rights Research Institute Against All Forms for Discrimination and Racism in Mie Prefecture, Japan, http://www.pref.mie.lg.jp/jinkenc/hp/ (access: 21.04.2017).

[23] Ministry of Education, Culture, Sports, Science and Technology (MEXT), ‘Netto-jo no ijime’ ni kansuru taio manyuaru jirei shu (gakko, kyoin muke), MEXT, 2008.

[24] Ure J., Lexical density and register differentiation, [in:] Applications of Linguistics, (eds.) G. Perren, J.L.M. Trim, Cambridge University Press, London 1971, 443-452.