Benchmarking 24 Large Language Models for Automated Multiple-Choice Question Generation in Latvian

Large Language Models (LLMs) are increasingly used for a wide range of text generation tasks. This paper investigates the generation of multiple-choice questions in Latvian to assess both the ability of LLMs to generate high-quality questions and answers and, more broadly, their capability to process Latvian, a lower-resourced language that has received relatively little attention in LLM research. The study benchmarks 24 LLMs developed by Anthropic, DeepSeek, OpenAI, Google, Meta, Mistral, and Microsoft. The findings show that these models vary in their ability to handle Latvian and to produce grammatically correct, coherent, and meaningful text. The best-performing closed-weights model is claude-3.5-sonnet (by Anthropic), the best-performing open-weights model is deepseek-v3 (by DeepSeek), and the best-performing small open-weights model is open-mistral-nemo (by Mistral).

Language: English

Publication frequency: 1 issue per year
Journal subjects: Computer Science, Artificial Intelligence, Information Technology, Project Management, Software Development

Anna Daupare

Gints Jēkabsons

Published online: 30 May 2025

Pages: 85-90

Received: 03 Apr. 2025

Accepted: 15 May 2025

DOI: https://doi.org/10.2478/acss-2025-0010

Keywords: Large language models, multiple-choice questions, text generation

© 2025 Anna Daupare et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
