Volume 17 (2017): Issue 2 (June 2017)
Journal details
Format: Journal
eISSN: 1314-4081
First published: 13 Mar 2012
Publication frequency: 4 times per year
Languages: English
Access type: Open access

Machine Translation: Phrase-Based, Rule-Based and Neural Approaches with Linguistic Evaluation

Published online: 26 Jun 2017
Volume & Issue: Volume 17 (2017) - Issue 2 (June 2017)
Pages: 28-43
Abstract

In this article we present a novel, linguistically driven evaluation method and apply it to the main approaches to Machine Translation (Rule-based, Phrase-based, Neural) to gain insight into their strengths and weaknesses in much more detail than current evaluation schemes provide. Translating between two languages requires substantial modelling of knowledge about the two languages, about translation, and about the world. Using English-German IT-domain translation as a case study, we also enhance the Phrase-based system by exploiting parallel treebanks for syntax-aware phrase extraction and by interfacing with Linked Open Data (LOD) to extract named entity translations in a post-decoding framework.

