Pre-Training MLM Using Bert for the Albanian Language

