Swarm Algorithms for NLP - The Case of Limited Training Data

This article describes a novel phrasing model for segmenting sentences of unconstrained text into syntactically defined phrases. The model is based on the notion of attraction and repulsion forces between adjacent words. Each force is weighted by a system parameter, and the parameter values are optimised via particle swarm optimisation. The approach is designed to be language-independent and is tested here on different languages.
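As an illustration only, the idea of weighting attraction and repulsion forces and tuning the weights with particle swarm optimisation can be sketched as below. The toy data, the two-weight decision rule, the function names and the PSO settings are all assumptions made for this sketch; they are not taken from the article and do not reproduce its model.

```python
import random

# Hypothetical toy data: for each gap between adjacent words, an
# (attraction, repulsion) feature pair and a gold label (True = phrase boundary).
GAPS = [
    ((0.9, 0.1), False),
    ((0.2, 0.8), True),
    ((0.7, 0.3), False),
    ((0.4, 0.9), True),
    ((0.8, 0.5), False),
    ((0.3, 0.6), True),
]

def predict_boundary(weights, features):
    """Assumed rule: place a boundary when weighted repulsion exceeds weighted attraction."""
    w_attr, w_rep = weights
    attraction, repulsion = features
    return w_rep * repulsion > w_attr * attraction

def accuracy(weights, gaps=GAPS):
    """Fraction of gaps whose predicted boundary matches the gold label."""
    hits = sum(predict_boundary(weights, f) == gold for f, gold in gaps)
    return hits / len(gaps)

def pso(fitness, dim=2, n_particles=10, iters=50,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO that maximises `fitness` over the box [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_fit = [fitness(p) for p in pos]      # personal best fitness values
    g = max(range(n_particles), key=lambda i: best_fit[i])
    g_pos, g_fit = best_pos[g][:], best_fit[g]  # global best so far
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp each coordinate to the search box.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit > g_fit:
                    g_pos, g_fit = pos[i][:], fit
    return g_pos, g_fit

best_weights, best_accuracy = pso(accuracy)
```

In this sketch the fitness function is simply boundary accuracy on the toy gaps, so the swarm searches for a pair of weights under which the assumed decision rule agrees with the gold labels as often as possible.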

The phrasing model’s performance is assessed per se, by calculating the segmentation accuracy against a gold-standard segmentation. Operational testing also involves integrating the model into a phrase-based Machine Translation (MT) system and measuring the translation quality when the phrasing model is used to segment input text into phrases. Experiments show that the performance of this approach is comparable to that of other leading segmentation methods and exceeds that of baseline systems.
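One common way to score a segmentation against a reference segmentation is to compare phrase-boundary positions. The sketch below assumes that measure; the abstract does not state which accuracy metric the article uses, and the function names and example sentence are hypothetical.

```python
def boundaries(phrases):
    """Return the word offsets at which a phrase ends (sentence end excluded)."""
    offsets, position = set(), 0
    for phrase in phrases[:-1]:
        position += len(phrase)
        offsets.add(position)
    return offsets

def boundary_f1(predicted, gold):
    """Precision, recall and F1 of predicted phrase boundaries against gold ones."""
    pred_b, gold_b = boundaries(predicted), boundaries(gold)
    correct = len(pred_b & gold_b)
    precision = correct / len(pred_b) if pred_b else 0.0
    recall = correct / len(gold_b) if gold_b else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1

# Hypothetical example: the same sentence segmented two ways.
gold = [["the", "old", "man"], ["walked"], ["to", "the", "shop"]]
predicted = [["the", "old", "man"], ["walked", "to"], ["the", "shop"]]
scores = boundary_f1(predicted, gold)
```

Scoring boundaries rather than whole phrases gives partial credit when a phrase is only wrong at one edge, which is one reason this style of measure is often chosen for segmentation tasks.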

Language: English

Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Databases and Data Mining, Artificial Intelligence

George Tambouratzis

Marina Vassiliou

Published Online: May 09, 2019

Page range: 219 - 234

Received: Dec 07, 2018

Accepted: Jan 30, 2019

DOI: https://doi.org/10.2478/jaiscr-2019-0005

Keywords: particle swarm optimisation, natural language processing, text phrasing, machine translation

© 2019 George Tambouratzis et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
