Multi-Agent Debate System based on Large Language Model: Comparative Analysis of Interaction Architectures

임보정; 서호건

doi:10.30693/SMJ.2025.14.10.140

This study analyzes how system architecture affects debate quality in multi-agent systems based on large language models (LLMs). Debate topics were first evaluated with seven LLMs: GPT-4o, Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-2.0-Flash, Llama-4-maverick, Deepseek-chat-v3-0324, and Claude-Sonnet-4. After the response performance of each model was compared, Gemini-2.5-Flash was selected as the representative LLM for the study. A multi-agent debate system was then built with four interaction architectures (MagenticOne, Swarm, SelectorGroupChat, and RoundRobinGroupChat), and debates were held on the 10 topics that scored highest during the model selection process. In the experiments, the SelectorGroupChat architecture, which dynamically selects the next speaker according to the flow of the conversation, achieved the highest debate scores. Among the four architectures compared, the study thus identifies SelectorGroupChat as the one best suited to debate in a multi-agent system.
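The difference between a fixed speaking order and dynamic speaker selection can be illustrated with a minimal sketch. It does not use the frameworks named in the abstract and is not the authors' implementation: the agent names, the `TRIGGERS` table, and the `keyword_relevance` scoring function are hypothetical stand-ins, chosen only so the example runs without an LLM. In a real selector-style architecture, a model would typically make the choice that `keyword_relevance` approximates here.

```python
from itertools import cycle, islice
from typing import Callable

Message = tuple[str, str]  # (speaker, text)


def round_robin_order(agents: list[str], turns: int) -> list[str]:
    """Fixed order: agents speak in a repeating cycle, regardless of content."""
    return list(islice(cycle(agents), turns))


def select_next_speaker(
    agents: list[str],
    history: list[Message],
    relevance: Callable[[str, list[Message]], int],
) -> str:
    """Dynamic order: pick the agent scoring highest against the conversation.

    The previous speaker is excluded so that no agent speaks twice in a row.
    """
    last_speaker = history[-1][0] if history else None
    candidates = [a for a in agents if a != last_speaker] or agents
    return max(candidates, key=lambda a: relevance(a, history))


# Hypothetical trigger words: which words in the last message prompt each agent.
TRIGGERS = {
    "pro": {"risk", "cost"},
    "con": {"benefit", "gain"},
    "moderator": {"summarize", "unclear"},
}


def keyword_relevance(agent: str, history: list[Message]) -> int:
    """Toy stand-in for an LLM selector: count trigger words in the last message."""
    if not history:
        return 0
    words = set(history[-1][1].lower().split())
    return len(TRIGGERS[agent] & words)


if __name__ == "__main__":
    agents = ["pro", "con", "moderator"]
    print(round_robin_order(agents, 5))
    history = [("pro", "The benefit is clear")]
    print(select_next_speaker(agents, history, keyword_relevance))
```

With round-robin ordering, the moderator speaks every third turn whether or not a summary is needed. With the selector, the next turn goes to whichever agent the scoring function judges most relevant to the last message.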
