About this article
Published online: 30 Sep 2024
Pages: 69 - 79
DOI: https://doi.org/10.2478/ijanmc-2024-0029
© 2024 Kun Li et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Figure 1.

Figure 2.

Figure 3.

Figure 4.

Experimental results on various datasets
Dataset | Model | NoAgent | ReAct | InterAct | VIMBank |
---|---|---|---|---|---|
ALFWorld | Qwen2-7b | 48.8 | 54.7 | 60.1 | 72.3 |
ALFWorld | ChatGLM3 | 46.2 | 49.2 | 55.8 | 64.9 |
HotpotQA | Qwen2-7b | 51.6 | 57.3 | 63.4 | 76.3 |
HotpotQA | ChatGLM3 | 45.9 | 51.8 | 59.7 | 71.5 |
KAgentBench | Qwen2-7b | 34.2 | 48.5 | 52.6 | 58.7 |
KAgentBench | ChatGLM3 | 32.6 | 44.7 | 46.3 | 54.2 |
Reasoning cost of ALFWorld environment
Method | 200 | 600 | 1000 |
---|---|---|---|
NoAgent | 63.2K | 164.7K | 334.7K |
VIMBank | 56.8K | 142.6K | 258.3K |
Experimental Environment
Experimental Environment | Version |
---|---|
CPU | Intel Core i9-10900K |
GPU | NVIDIA Tesla V100 PCIe 32G |
Language | Python 3.9 |
Framework | LangChain |