Volume 104 (2015): Issue 1 (October 2015)
Journal Details
Format: Journal
First Published: 04 May 2010
Publication timeframe: 2 times per year
Languages: English
Copyright: © 2020 Sciendo

MT-ComparEval: Graphical evaluation interface for Machine Translation development

Published Online: 07 Sep 2015
Page range: 63 - 74

The tool described in this article is designed to help MT developers: it provides a web-based graphical user interface for systematically comparing and evaluating MT engines and experiments through automatic measures and statistics. The evaluation panel provides graphs, tests for statistical significance, and n-gram statistics. We also present a demo server at http://wmt.ufal.cz with WMT14 and WMT15 translations.
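
The abstract does not say which significance test or which n-gram statistics the tool computes. As a rough illustration of the kind of comparison it describes, the sketch below applies paired bootstrap resampling, a test commonly used in MT evaluation, to two systems scored with a simple clipped n-gram precision. The function names, the stand-in metric, and the example sentences are all assumptions made for this illustration and are not taken from MT-ComparEval.

```python
import random
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(hyps, refs, n=1):
    """Corpus-level clipped n-gram precision (a simple stand-in metric, not BLEU)."""
    matched = total = 0
    for hyp, ref in zip(hyps, refs):
        hyp_counts = ngram_counts(hyp.split(), n)
        ref_counts = ngram_counts(ref.split(), n)
        # Counter intersection clips each n-gram count at the reference count.
        matched += sum((hyp_counts & ref_counts).values())
        total += sum(hyp_counts.values())
    return matched / total if total else 0.0


def paired_bootstrap(sys_a, sys_b, refs, samples=1000, seed=0):
    """Return the fraction of resampled test sets on which A outscores B."""
    rng = random.Random(seed)
    size = len(refs)
    wins_a = 0
    for _ in range(samples):
        # Draw sentence indices with replacement; both systems share the draw.
        idx = [rng.randrange(size) for _ in range(size)]
        sample_refs = [refs[i] for i in idx]
        score_a = ngram_precision([sys_a[i] for i in idx], sample_refs)
        score_b = ngram_precision([sys_b[i] for i in idx], sample_refs)
        if score_a > score_b:
            wins_a += 1
    return wins_a / samples


if __name__ == "__main__":
    # Hypothetical toy data: system A reproduces the references exactly.
    refs = ["the cat sat on the mat", "a dog barked loudly"]
    sys_a = ["the cat sat on the mat", "a dog barked loudly"]
    sys_b = ["the cat sat under a rug", "a dog slept"]
    print(paired_bootstrap(sys_a, sys_b, refs, samples=200))
```

A win fraction close to 1 (or close to 0) suggests that the difference between the two systems is unlikely to be an artifact of the particular test sentences. With a real test set, the stand-in metric would be replaced by whichever automatic measure is being reported.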

