Kabyle corpus digital database and exploitation. Test of lexicometric analysis of the identity dimension in the romanesque discourse

This contribution shows, through a preliminary analysis of a corpus sample composed of the first five Kabyle novels (1963-1990), what lexicometry, a new statistics-based method, can contribute to the processing of large corpora and the building of databases. It describes the phases of preliminary corpus processing (transcription, tagging and lemmatization) that precede the various stages of its exploitation. Within this corpus, we examine the theme of identity conveyed by the five works, highlighting both the overused vocabulary and the singularity of each work in relation to the corpus as a whole. Before the quantitative analysis of the vocabulary can begin, however, the data must be prepared. We therefore focus on the orthographic choices needed to remove ambiguities, and on the tagging and lemmatization of the corpus. For this work we used the Lexico5 software tool.

eISSN: 1338-4287
Language: English

Publication timeframe: 2 times per year
Journal Subjects: Linguistics and Semiotics, Theoretical Frameworks and Disciplines, Linguistics, other

Published Online: Aug 17, 2022

Page range: 894 - 905

DOI: https://doi.org/10.2478/jazcas-2022-0014

Keywords
corpus, kabyle, identity, novel, lexicometry, databases

© 2022 Arezki Ikherbane et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
