
Performance Comparison of Statistical vs. Neural-Based Translation System on Low-Resource Languages



Figure 1:

Architecture of SMT system. SMT, statistical machine translation.

Figure 2:

Encoder–decoder model with attention [3].

Figure 3:

Transformer model [21].

Figure 4:

GAN model in the case of NMT use. GAN, generative adversarial network.

Figure 5:

Heat map representation of attention visualization.
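The heat map in Figure 5 visualizes attention weights: for each decoder step, a distribution over source tokens. As an illustrative sketch (not the authors' implementation), one row of such a heat map can be computed with dot-product alignment scores followed by a softmax; the dimensions and random inputs below are made up for the example:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_weights(decoder_state, encoder_states):
    # Dot-product alignment scores between one decoder state and every
    # encoder state, normalized to a distribution over source tokens.
    scores = encoder_states @ decoder_state   # shape: (src_len,)
    return softmax(scores)

rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 8))   # 5 source tokens, hidden size 8 (illustrative)
dec = rng.normal(size=(8,))     # a single decoder hidden state
w = attention_weights(dec, enc)
print(w)  # one row of the heat map; entries are non-negative and sum to 1
```

Stacking one such row per target token yields the matrix that the heat map renders.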

Figure 6:

Masked values are represented with zeros.

Figure 7:

Graphical representation of masking operation; the colored right half is the masked part.
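Figures 6 and 7 together describe causal masking: positions to the right of the current token are hidden, and their attention weights become zero. A minimal NumPy sketch of this operation (illustrative, with made-up scores; not the paper's code):

```python
import numpy as np

def causal_mask(n):
    # True marks positions to hide: the upper-triangular "right half"
    # shown shaded in Figure 7.
    return np.triu(np.ones((n, n), dtype=bool), k=1)

def masked_softmax(scores, mask):
    # Masked scores are set to -inf before the softmax, so the resulting
    # weights at those positions are exactly the zeros shown in Figure 6.
    s = np.where(mask, -np.inf, scores)
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.arange(16, dtype=float).reshape(4, 4)
w = masked_softmax(scores, causal_mask(4))
print(np.round(w, 3))  # each row sums to 1; future positions are 0
```

Setting the masked scores to negative infinity (rather than zero) is what guarantees the softmax output is exactly zero there.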

Figure 8:

BLEU score generated by NMT and SMT for Eng.–Beng. language pairs. MT, machine translation; SGD, stochastic gradient descent; SMT, statistical machine translation.

Figure 9:

BLEU score generated by NMT and SMT for Eng.–Hindi language pairs. MT, machine translation; SGD, stochastic gradient descent; SMT, statistical machine translation.

Figure 10:

BLEU scores by n-gram order, with the smallest n-gram order yielding the highest score. SGD, stochastic gradient descent.

Attention-based NMT outperforms SMT for the Bengali–Hindi language pair (Das et al. [32])

Translation model BLEU score Iterations
Attention-based NMT model 20.41 25
MOSES (SMT) 14.35 -

NMT outperformed SMT with transfer learning, ensembling, and further data processing (Zoph et al.)

Language SBMT NMT Transfer Final
Hausa 23.7 16.8 21.3 24.0
Turkish 20.4 11.4 17.0 18.7
Uzbek 17.9 10.7 14.4 16.8
Urdu 17.9 5.2 13.8 14.5

An NMT system with a transformer model and BPE outperformed phrase-based SMT for the English–Hindi and Hindi–English language pairs (Haque et al. [33])

MT model BLEU METEOR TER
Eng.–Hindi PBSMT 28.8 30.2 53.4
Eng.–Hindi NMT 36.6 33.5 46.3
Hindi–Eng. PBSMT 34.1 36.6 50.0
Hindi–Eng. NMT 39.9 38.5 42.0
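The Haque et al. system uses BPE (byte-pair encoding) subword segmentation. A minimal sketch of the core BPE merge step, repeatedly fusing the most frequent adjacent symbol pair (illustrative toy example, not their implementation):

```python
from collections import Counter

def most_frequent_pair(tokens):
    # Count adjacent symbol pairs and return the most frequent one.
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge(tokens, pair):
    # Replace every occurrence of `pair` with the fused symbol.
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list("lower lowest low".replace(" ", "_"))  # start from characters
for _ in range(3):  # three merge steps for the toy corpus
    tokens = merge(tokens, most_frequent_pair(tokens))
print(tokens)
```

Real BPE learns the merge table on the training corpus and applies it to both source and target sides, shrinking the vocabulary that the NMT model must handle, which is especially helpful for morphologically rich, low-resource languages.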

English–Hindi translation using different optimizers

Language pair Optimizer BLEU-4 score MT model No. of epochs
Eng.–Hindi Adam 12.25 NMT with attention 14
Eng.–Hindi SGD 11.50 NMT with attention 14
Eng.–Hindi - 16.64 MOSES (SMT) -

English–Bengali translation BLEU scores using different optimizers

Language pairs Optimizer BLEU-4 score MT model No. of epochs
Eng.–Beng. Adam 10.78 NMT with attention 12
Eng.–Beng. SGD 11.17 NMT with attention 12
Eng.–Beng. - 14.58 MOSES (SMT) -

BLEU-1, -2, and -3 scores for the Eng.–Beng. and Eng.–Hindi language pairs using the Adam and SGD optimizers

BLEU Eng.–Beng. NMT (Adam) Eng.–Beng. NMT (SGD) Eng.–Hindi NMT (Adam) Eng.–Hindi NMT (SGD)
BLEU-1 14.15 13.91 15.77 14.18
BLEU-2 12.65 13.11 14.12 13.33
BLEU-3 11.83 12.17 13.95 12.19
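The pattern in this table (and in Figure 10) of lower n-gram orders scoring higher follows directly from BLEU's clipped n-gram precision: longer n-grams are harder to match. A minimal sketch of that precision (omitting full BLEU's brevity penalty and geometric mean; the example sentences are made up for illustration):

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    # Each candidate n-gram is credited at most as many times as it
    # appears in the reference ("clipped" counts).
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    overlap = sum(min(count, ref[g]) for g, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

candidate = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
for n in (1, 2, 3):
    print(f"p{n} = {modified_precision(candidate, reference, n):.2f}")
# p1 = 0.83, p2 = 0.60, p3 = 0.25 — precision drops as n grows
```

Every mismatched word invalidates all the n-grams that contain it, so a single substitution costs one unigram but up to n higher-order n-grams, producing the monotone decrease seen across BLEU-1, -2, and -3.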

For various low-resource corpora, SMT outperformed NMT (Ahmadnia et al. [17])

Corpus SMT NMT NMT* NMT**
Gnome 20.54 15.49 17.26 18.76
KDE4 15.64 13.36 14.29 15.71
Subtitles 18.82 18.62 19.51 22.54
Ubuntu 16.76 14.27 15.14 15.87
Tanzil 17.69 15.14 16.53 17.72
Overall 17.06 15.25 16.67 17.32
eISSN: 1178-5608
Language: English
Publication timeframe: Volume Open
Journal Subjects: Engineering, Introductions and Overviews, other