Series: LLM Tool Calling, from internals to building your own agent

This series dissects, step by step, the internal processing that turns a single line of natural language from the user into a tool execution, and ends with building your own agent from an open-source model plus custom middleware.

| Part | Topic | Key point |
| --- | --- | --- |
| Part 1 | Overview | Confirms the five layers between natural language and tool execution |
| Part 2 | Chat Template | JSON does not go into the model directly |
| Part 3 | Tokenization | The model cannot read text: token IDs and control tokens |
| Part 4 | Model inference | The "use a tool or not" decision and constrained decoding |
| Part 5 | Tool execution | The client's execution loop after receiving tool_use |
| Part 6 | Native vs Non-native | Same capability, different structure → Middleware |
| Part 7 (this post) | Building the Middleware | Implementing prompt assembly + output parsing + execution loop by hand |
| Part 8 | Running open-source models locally | Serving a local LLM with Ollama/vLLM |
| Part 9 | Your own agent | Model + Middleware = a complete agent |

  • Tool Calling Middleware is a software layer that implements, from the outside, the tool calling layers a Non-native model lacks (Chat Template, output parser, execution loop)
  • It is a conversion/execution loop with three roles: tool definitions → system prompt text, model output → tool call pattern detection and JSON extraction, and tool execution → result feedback
  • It talks to the model through text-based protocols such as Hermes or XML, and relies on regex parsing instead of a Native model's control tokens + constrained decoding

Why this concept is needed

  • Part 6 showed that Non-native models lack most of the five layers, including the Chat Template, the output parser, and Constrained Decoding. This post implements those missing layers directly in code
  • We need to see exactly what code "building a Middleware" means before we can connect it to a local model in Part 8
  • As the official Qwen documentation puts it, function calling is essentially implemented through prompt engineering. This post verifies that claim in code

AS-IS

sequenceDiagram
    autonumber
    participant User as User
    participant LLM as Model weights

    User->>LLM: "What's the weather in Seoul?"
    LLM-->>User: "I can't check the current weather in Seoul,<br/>but February is generally cold..."
    Note over User,LLM: There is no way to pass tool definitions<br/>and no tool-call-formatted output is produced

TO-BE

sequenceDiagram
    autonumber
    participant User as User
    box Application
        participant MW as Middleware
        participant Tool as Tool function
    end
    participant LLM as Model weights

    User->>MW: "What's the weather in Seoul?"
    MW->>LLM: Text prompt including tool definitions
    LLM-->>MW: "...<tool_call>{...get_weather...}</tool_call>"
    MW->>MW: Detect tool call pattern + extract JSON
    MW->>Tool: Execute get_weather("Seoul")
    Tool-->>MW: {temperature: "15°C", condition: "sunny"}
    MW->>LLM: Prompt including the tool result
    LLM-->>MW: "The current weather in Seoul is 15°C and sunny"
    MW-->>User: Final response
    Note over MW,Tool: Tool definitions and tool functions are written by the developer<br/>The Middleware handles prompt conversion + parsing + the execution loop

The three roles of the Middleware

Part 6 said that the Middleware "fills in the missing layers from the outside". Concretely, it has three roles:

sequenceDiagram
    autonumber
    participant User as User
    box Application
        participant MW as Middleware
        participant Tool as Tool function
    end
    participant LLM as Model weights

    User->>MW: "What's the weather in Seoul?"

    Note over MW: Role 1: Prompt assembly<br/>tool JSON → system prompt text
    MW->>LLM: Text prompt formatted for the protocol
    LLM-->>MW: "...<tool_call>{...get_weather...}</tool_call>"

    Note over MW: Role 2: Output parsing<br/>Detect tool call with a regex → extract JSON
    MW->>Tool: Execute get_weather("Seoul")
    Tool-->>MW: {temperature: "15°C", condition: "sunny"}

    Note over MW: Role 3: Execution loop<br/>Feed the tool result back to the model
    MW->>LLM: Prompt including the tool result
    LLM-->>MW: "The current weather in Seoul is 15°C and sunny"

    MW-->>User: Final response

| Role | Native layer it replaces | What it does | Diagram step |
| --- | --- | --- | --- |
| Prompt assembly | Chat Template | Converts tool JSON → system prompt text | 2 |
| Output parsing | Output parser + Constrained Decoding | Detects the <tool_call> pattern in the model output → extracts JSON | 3 |
| Execution loop | SDK's toolRunner | Executes the tool → feeds the result back to the model → checks for further tool calls → repeats | 4-7 |

Protocol - the format the Middleware and the model agree on (steps 2-3 and 6-7)

In the diagram above, the steps where the Middleware and the model communicate (2-3 and 6-7) need an agreement: "what text carries the tool definitions, and what text does the model use to express a tool call?" That agreement is the protocol.

In Native models, control tokens played this role (Part 3). The Middleware has no control tokens, so it substitutes plain text patterns.

Two representative protocols:

Hermes protocol

This format was defined by the NousResearch Hermes models and adopted by open-source models fine-tuned for tool calling, such as Qwen3. The JSON goes inside <tool_call> XML tags:

Model output:
<tool_call>
{"name": "get_weather", "arguments": {"location": "Seoul"}}
</tool_call>

The tool definitions in the system prompt are also passed as JSON:

You are provided with function signatures within <tools></tools> XML tags:
<tools>
[{"type": "function", "function": {"name": "get_weather", ...}}]
</tools>

For each function call return a json object with function name and arguments
within <tool_call></tool_call> XML tags:
<tool_call>
{"name": "<function_name>", "arguments": <args_json_object>}
</tool_call>

XML protocol

This format expresses both the tool name and the arguments as XML tags. morphXmlToolMiddleware in ai-sdk-tool-call-middleware uses this approach:

Model output:
<tool_call>
<tool_name>get_weather</tool_name>
<location>Seoul</location>
</tool_call>

Because each parameter name becomes an XML tag name, the arguments can be extracted with XML parsing alone, without any JSON parsing.
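
As a rough illustration of that point, the sketch below pulls the tool name and arguments out of the XML-protocol output using tag extraction alone. `parseXmlToolCall` is a hypothetical helper, not the morphXmlToolMiddleware implementation, and it assumes flat string parameters with no nesting.

```typescript
interface XmlToolCall {
  name: string;
  arguments: Record<string, string>;
}

// Hypothetical helper: extracts the first flat XML-style tool call, or null if none is found.
function parseXmlToolCall(text: string): XmlToolCall | null {
  const block = /<tool_call>([\s\S]*?)<\/tool_call>/.exec(text);
  if (!block) return null;

  let name = "";
  const args: Record<string, string> = {};
  // Matches <tag>value</tag>; the \1 backreference forces the closing tag to match the opening one.
  const tagPattern = /<([A-Za-z_][\w-]*)>([\s\S]*?)<\/\1>/g;
  let m: RegExpExecArray | null;
  while ((m = tagPattern.exec(block[1])) !== null) {
    const tag = m[1];
    const value = m[2].trim();
    if (tag === "tool_name") name = value;
    else args[tag] = value; // every other tag is treated as a parameter
  }
  return name ? { name, arguments: args } : null;
}

const xmlCall = parseXmlToolCall(
  "<tool_call>\n<tool_name>get_weather</tool_name>\n<location>Seoul</location>\n</tool_call>"
);
console.log(xmlCall);
```

All values come back as strings here, so a real implementation would still need the tool's schema to recover numbers or booleans.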

Which protocol to choose

| | Hermes | XML |
| --- | --- | --- |
| tool call arguments | JSON object | XML tags |
| Parsing difficulty | Requires JSON.parse | XML tag extraction only |
| Suitable models | Models fine-tuned on the Hermes format (Qwen3, Hermes, etc.) | General-purpose models without fine-tuning |
| Implementations | hermesToolMiddleware, qwen3CoderToolMiddleware | morphXmlToolMiddleware |

The key criterion: which format was the model trained on? As Part 4 showed, fine-tuning raises the probability that the model generates a particular pattern. If Qwen3 was fine-tuned on the Hermes format, using the Hermes protocol gives that pattern the highest generation probability.

What about a general-purpose model without fine-tuning? Then you have to rely on in-context learning: spell out the format in detail in the system prompt and hope the model follows the instructions. In that case the XML protocol may work better, because XML appears frequently in web data, so even general-purpose models are reasonably familiar with XML tag patterns.

This post implements the Hermes protocol, because the models we will actually use, such as Qwen3, support it and it is the most widely used format.

Tool execution - function call and result return (steps 4 and 5)

If the protocol is the text-level contract between the Middleware and the model, steps 4 and 5 are the code-level exchange between the Middleware and the Tool. Regardless of the protocol, this is an ordinary function call.

  • Step 4 (MW→Tool): The Middleware uses the tool name ("get_weather") and arguments ({"location": "Seoul"}) extracted by output parsing to look up the registered function and call it. This is the same "tool name → function mapping → execution" covered in Part 5.
  • Step 5 (Tool→MW): The Tool function returns its result as JSON. In step 6 this result is fed back to the model in the protocol's format.

// Step 4: look up the function by tool name and call it
const tool = tools[parsedToolCall.name];       // "get_weather" → function mapping
const result = await tool.execute(parsedToolCall.arguments);  // execute
 
// Step 5: the function returns its result as JSON
// result = { temperature: "15°C", condition: "sunny" }

The tool definitions and the functions that execute them are written by the developer in the application. The Middleware only calls these functions, receives the results, and passes them to the model.

With the protocol and tool execution covered, we can now implement the three roles in code.

Role 1: Prompt assembly

The first role is to convert the tool definition JSON into system prompt text. We do ourselves what Claude's API server did in Part 2.

interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema
}
 
function assembleToolPrompt(tools: ToolDefinition[]): string {
  // Convert the tool definitions into a Hermes-format JSON array
  const toolDefs = tools.map((t) => ({
    type: "function",
    function: {
      name: t.name,
      description: t.description,
      parameters: t.parameters,
    },
  }));
 
  return [
    "You are a function calling AI model.",
    "You are provided with function signatures within <tools></tools> XML tags:",
    "<tools>",
    JSON.stringify(toolDefs, null, 2),
    "</tools>",
    "",
    "For each function call return a json object with function name and arguments",
    "within <tool_call></tool_call> XML tags:",
    "<tool_call>",
    '{"name": "<function_name>", "arguments": <args_json_object>}',
    "</tool_call>",
  ].join("\n");
}

What this function does is simple: it takes the tool JSON and wraps it in text the model can understand. Example output:

You are a function calling AI model.
You are provided with function signatures within <tools></tools> XML tags:
<tools>
[
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  }
]
</tools>
 
For each function call return a json object with function name and arguments
within <tool_call></tool_call> XML tags:
<tool_call>
{"name": "<function_name>", "arguments": <args_json_object>}
</tool_call>

This plays the same role as Claude's system prompt in Part 2, which began with "In this environment you have access to a set of tools...". Only the format differs; the job is the same: telling the model about the tool definitions as text.

Tool execution results must also be included in the prompt when they are fed back:

function assembleToolResult(
  toolName: string,
  result: string
): string {
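  // `result` must already be serialized JSON; it is spliced in verbatim.
  // JSON.stringify escapes any quotes or backslashes in the tool name.
  return [
    "<tool_response>",
    `{"name": ${JSON.stringify(toolName)}, "content": ${result}}`,
    "</tool_response>",
  ].join("\n");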
}
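
To see what the model actually receives, the snippet below calls a copy of assembleToolResult and checks that the payload line is valid JSON. The copy escapes the tool name with JSON.stringify, which is an assumption of this sketch, and the weather values are the same made-up sample data used throughout.

```typescript
// Copy of assembleToolResult so the snippet runs on its own.
function assembleToolResult(toolName: string, result: string): string {
  return [
    "<tool_response>",
    `{"name": ${JSON.stringify(toolName)}, "content": ${result}}`,
    "</tool_response>",
  ].join("\n");
}

// `result` must already be serialized JSON, because it is spliced in verbatim.
const feedback = assembleToolResult(
  "get_weather",
  JSON.stringify({ temperature: "15°C", condition: "sunny" })
);
console.log(feedback);
// <tool_response>
// {"name": "get_weather", "content": {"temperature":"15°C","condition":"sunny"}}
// </tool_response>

// Sanity check: the line between the tags parses back into an object.
const payload = JSON.parse(feedback.split("\n")[1]);
console.log(payload.content.condition); // sunny
```

Passing a non-JSON string as `result` would produce an invalid payload, so the caller is responsible for serializing with JSON.stringify first.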

Role 2: Output parsing

The second role is to detect the tool call pattern in the text the model generates and extract it as JSON. This is what the API server's output parser did in Part 4.

The model's actual output is text like this:

Let me check the weather.
 
<tool_call>
{"name": "get_weather", "arguments": {"location": "Seoul"}}
</tool_call>

We need to find the <tool_call> tag in this text and extract the JSON inside it:

interface ParsedToolCall {
  name: string;
  arguments: Record<string, unknown>;
}
 
function parseToolCalls(text: string): ParsedToolCall[] {
  const pattern = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
  const toolCalls: ParsedToolCall[] = [];
 
  let match;
  while ((match = pattern.exec(text)) !== null) {
    const jsonStr = match[1].trim();
    const parsed = JSON.parse(jsonStr);
    toolCalls.push({
      name: parsed.name,
      arguments: parsed.arguments,
    });
  }
 
  return toolCalls;
}

The regular expression /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g is the core. It captures the content between <tool_call> and </tool_call>, and [\s\S]*? lazily matches any character, including newlines.

This is where the difference from Native models is greatest. A Native model marks the tool call region unambiguously with control tokens and guarantees valid JSON through constrained decoding (Parts 3 and 4). The Middleware relies on regex matching and JSON.parse. If the model does not close the <tool_call> tag properly, or the JSON is malformed, parsing fails.

This limitation is covered in the error handling section.
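
Both failure modes are easy to reproduce. The sketch below reuses the same regular expression on two made-up model outputs; `extractToolCallBodies` is a hypothetical helper that only returns the raw captured strings.

```typescript
// Returns the raw strings captured between <tool_call> tags.
function extractToolCallBodies(text: string): string[] {
  const pattern = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
  const bodies: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) bodies.push(match[1]);
  return bodies;
}

// Case 1: the closing tag is missing, so the regex finds nothing at all.
const unclosed = '<tool_call>\n{"name": "get_weather", "arguments": {"location": "Seoul"}}';
const unclosedCount = extractToolCallBodies(unclosed).length;
console.log(unclosedCount); // 0

// Case 2: the tags are fine but the JSON is not (unquoted key and value).
const broken = '<tool_call>\n{"name": "get_weather", "arguments": {location: Seoul}}\n</tool_call>';
const brokenBody = extractToolCallBodies(broken)[0];
let parseFailed = false;
try {
  JSON.parse(brokenBody);
} catch {
  parseFailed = true; // JSON.parse throws a SyntaxError for this input
}
console.log(parseFailed); // true
```

In the first case the unguarded parser silently treats the output as plain text; in the second it throws, which would abort the whole loop.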

Branching on whether a tool call is present

The response is branched based on the parsing result:

function classifyResponse(text: string):
  | { type: "tool_calls"; calls: ParsedToolCall[]; textBefore: string }
  | { type: "text"; content: string } {
 
  const toolCalls = parseToolCalls(text);
 
  if (toolCalls.length > 0) {
    // Extract the text before <tool_call> (the model's preamble)
    const textBefore = text.split("<tool_call>")[0].trim();
    return { type: "tool_calls", calls: toolCalls, textBefore };
  }
 
  return { type: "text", content: text };
}

This plays the same role as the stop_reason: "tool_use" vs stop_reason: "end_turn" branch in Part 5. With a Native model the API server sets stop_reason for you, but in the Middleware you have to check for the <tool_call> pattern in the text yourself.

Role 3: Execution loop

The third role is to repeat prompt assembly → model call → output parsing → tool execution → result feedback. This is what the SDK's toolRunner did in Part 5.

interface Tool {
  description: string;
  parameters: Record<string, unknown>;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}
 
type Message = { role: string; content: string };
 
async function toolCallingLoop(
  model: { generate: (opts: { system: string; messages: Message[] }) => Promise<{ text: string }> },
  tools: Record<string, Tool>,
  prompt: string,
  maxSteps: number = 5
): Promise<string> {
  // Role 1: prompt assembly
  const toolDefs = Object.entries(tools).map(([name, t]) => ({
    name,
    description: t.description,
    parameters: t.parameters,
  }));
  const systemPrompt = assembleToolPrompt(toolDefs);
 
  const messages: Message[] = [{ role: "user", content: prompt }];
 
  for (let step = 0; step < maxSteps; step++) {
    // Call the model
    const response = await model.generate({
      system: systemPrompt,
      messages,
    });
 
    // Role 2: output parsing
    const classified = classifyResponse(response.text);
 
    if (classified.type === "text") {
      return classified.content; // Final text response → exit the loop
    }
 
    // Tool calls are present → execute them and feed the results back
    messages.push({ role: "assistant", content: response.text });
 
    for (const toolCall of classified.calls) {
      const tool = tools[toolCall.name];
      if (!tool) {
        throw new Error(`Unknown tool: ${toolCall.name}`);
      }
 
      // Execute the tool
      const result = await tool.execute(toolCall.arguments);
      const resultStr = JSON.stringify(result);
 
      // Append the result to the messages → passed to the model on the next iteration
      messages.push({
        role: "tool",
        content: assembleToolResult(toolCall.name, resultStr),
      });
    }
    // Loop again → the model produces further tool calls or the final response
  }
 
  return "Maximum number of steps reached.";
}

This is the multi-turn tool calling sequence diagram from Part 5 implemented in code. The loop has two termination conditions:

  • classifyResponse returns "text" → the model produced its final response (= stop_reason: "end_turn")
  • step >= maxSteps → the maximum number of steps was reached (= stopWhen: stepCountIs(5) in the Vercel AI SDK)
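
Both termination conditions can be exercised without a real model. The sketch below is a condensed variant of the loop (one tool call per turn, no system prompt) driven by two scripted fake models; `miniLoop`, `wellBehaved`, and `stuck` are hypothetical names introduced only for this illustration.

```typescript
type Message = { role: string; content: string };
type Generate = (messages: Message[]) => Promise<string>;
type ToolFn = (args: Record<string, unknown>) => Promise<unknown>;
type LoopResult = { answer: string; steps: number; stoppedBy: "text" | "maxSteps" };

const CALL = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/;

// Condensed variant of toolCallingLoop that also reports why it stopped.
async function miniLoop(
  generate: Generate,
  tools: Record<string, ToolFn>,
  prompt: string,
  maxSteps = 5
): Promise<LoopResult> {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let step = 1; step <= maxSteps; step++) {
    const text = await generate(messages);
    const match = CALL.exec(text);
    if (!match) return { answer: text, steps: step, stoppedBy: "text" };

    // No error handling here: unknown tools or broken JSON would throw.
    const call = JSON.parse(match[1]) as { name: string; arguments: Record<string, unknown> };
    const result = await tools[call.name](call.arguments);
    messages.push({ role: "assistant", content: text });
    messages.push({
      role: "tool",
      content: `<tool_response>\n${JSON.stringify({ name: call.name, content: result })}\n</tool_response>`,
    });
  }
  return { answer: "", steps: maxSteps, stoppedBy: "maxSteps" };
}

const demoTools = { get_weather: async () => ({ temperature: "15°C", condition: "sunny" }) };
const callText = '<tool_call>\n{"name": "get_weather", "arguments": {"location": "Seoul"}}\n</tool_call>';

// Fake model A: calls the tool once, then answers as soon as it sees a tool message.
const wellBehaved: Generate = async (messages) =>
  messages.some((m) => m.role === "tool") ? "It is 15°C and sunny in Seoul." : callText;

// Fake model B: never stops calling the tool, so only maxSteps ends the loop.
const stuck: Generate = async () => callText;

const runs = Promise.all([
  miniLoop(wellBehaved, demoTools, "What's the weather in Seoul?"),
  miniLoop(stuck, demoTools, "What's the weather in Seoul?", 3),
]);
runs.then(([first, second]) => {
  console.log(first.stoppedBy, first.steps); // text 2
  console.log(second.stoppedBy, second.steps); // maxSteps 3
});
```

The second run shows why the step limit matters: without it, a model that keeps emitting tool calls would loop forever.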

Usage example

// Run
const answer = await toolCallingLoop(
  ollamaModel,
  {
    get_weather: {
      description: "Get the current weather in a given location",
      parameters: {
        type: "object",
        properties: { location: { type: "string" } },
        required: ["location"],
      },
      execute: async ({ location }) => ({
        temperature: "15°C",
        condition: "sunny",
        location,
      }),
    },
  },
  "What's the weather in Seoul?"
);
 
console.log(answer);
// → "The current weather in Seoul is 15°C and sunny."

This is the entire Middleware implementation. With three functions, assembleToolPrompt + parseToolCalls + toolCallingLoop, tool calling works on a Non-native model.

Errors will happen

The official Qwen documentation gives this warning:

“It is not guaranteed that the model generation will always follow the protocol even with proper prompting or templates.”

In Native models, constrained decoding guaranteed valid JSON every time (Part 4). The Middleware has no such safety net. The model is free to generate any token, so errors are not a possibility but a certainty.

Main error types and how to handle them

| Error | Cause | Handling |
| --- | --- | --- |
| Unclosed tag | <tool_call> is opened but generation ends without </tool_call> | Regex match fails → treat as a text response |
| JSON parse failure | {"name": "get_weather", "arguments": {location: Seoul}} (missing quotes) | Wrap in try-catch, then retry or fall back |
| Nonexistent tool | The model generates a tool name that is not in the definitions | Validate the tool name, then feed an error message back to the model |
| Type mismatch | "temperature" (string) instead of temperature (number) | Type coercion based on the JSON Schema |
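
The last row, schema-based type coercion, could look roughly like the sketch below. `coerceArgs` is a hypothetical helper, not part of any library, and it only handles top-level `number`, `integer`, and `boolean` properties whose values arrived as strings.

```typescript
type PropertySchema = { type?: string };
interface ObjectSchema {
  properties?: Record<string, PropertySchema>;
}

// Hypothetical helper: coerces string values to the type the schema declares.
function coerceArgs(
  args: Record<string, unknown>,
  schema: ObjectSchema
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...args };
  for (const [key, prop] of Object.entries(schema.properties ?? {})) {
    const value = out[key];
    if (typeof value !== "string") continue; // only string values are coerced

    if (prop.type === "number" || prop.type === "integer") {
      const n = Number(value);
      const ok =
        value.trim() !== "" &&
        Number.isFinite(n) &&
        (prop.type === "number" || Number.isInteger(n));
      if (ok) out[key] = n; // otherwise leave the original string untouched
    } else if (prop.type === "boolean") {
      if (value === "true") out[key] = true;
      if (value === "false") out[key] = false;
    }
  }
  return out;
}

const coerced = coerceArgs(
  { location: "Seoul", days: "3", metric: "true" },
  {
    properties: {
      location: { type: "string" },
      days: { type: "integer" },
      metric: { type: "boolean" },
    },
  }
);
console.log(coerced); // { location: 'Seoul', days: 3, metric: true }
```

Values that cannot be coerced are left as they are, so a later validation step can still report them to the model.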

A parsing function with error handling added:

function parseToolCallsSafe(text: string): {
  toolCalls: ParsedToolCall[];
  errors: string[];
} {
  const pattern = /<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/g;
  const toolCalls: ParsedToolCall[] = [];
  const errors: string[] = [];
 
  let match;
  while ((match = pattern.exec(text)) !== null) {
    try {
      const parsed = JSON.parse(match[1].trim());
 
      if (!parsed.name || typeof parsed.name !== "string") {
        errors.push(`Invalid tool name: ${JSON.stringify(parsed.name)}`);
        continue;
      }
 
      toolCalls.push({
        name: parsed.name,
        arguments: parsed.arguments ?? {},
      });
    } catch (e) {
      errors.push(`JSON parse failed: ${match[1].trim().slice(0, 100)}`);
    }
  }
 
  return { toolCalls, errors };
}

When an error occurs, the error message can be fed back to the model so it can retry. This is the same principle as the Vercel AI SDK passing tool execution errors to the model as a tool-error content part so that the LLM can recover on its own.
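
One possible shape for that feedback is sketched below. The helper names `buildErrorFeedback` and `findUnknownTools`, the use of the `tool` role, and the wording of the hint are all assumptions of this sketch rather than anything a library prescribes.

```typescript
type Message = { role: string; content: string };

// Hypothetical helper: turns parser/validation errors into a message the model can react to.
function buildErrorFeedback(errors: string[], knownTools: string[]): Message {
  const body = {
    error: errors.join("; "),
    hint: `Reply with valid JSON inside <tool_call> tags. Available tools: ${knownTools.join(", ")}`,
  };
  return {
    role: "tool",
    content: ["<tool_response>", JSON.stringify(body), "</tool_response>"].join("\n"),
  };
}

// Reports unknown tool names as errors instead of throwing.
function findUnknownTools(names: string[], knownTools: string[]): string[] {
  return names
    .filter((n) => knownTools.indexOf(n) === -1)
    .map((n) => `Unknown tool: ${n}`);
}

const known = ["get_weather"];
const errors = [
  "JSON parse failed: {location: Seoul}",
  ...findUnknownTools(["get_weather", "get_stock"], known),
];
const feedbackMessage = buildErrorFeedback(errors, known);
console.log(feedbackMessage.content);
```

Pushing this message onto the conversation and continuing the loop gives the model a chance to correct itself, while the step limit still bounds the number of retries.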

Assembling it as a Vercel AI SDK Middleware

The three functions built so far (assembleToolPrompt, parseToolCalls, toolCallingLoop) were meant to explain how a Middleware works. In production, the same logic is packaged to fit the Vercel AI SDK's middleware structure.

A middleware passed to the Vercel AI SDK's wrapLanguageModel can provide three hook points:

| Hook point | Role | Our function |
| --- | --- | --- |
| transformParams | Transforms parameters before they reach the model | assembleToolPrompt (prompt assembly) |
| wrapGenerate | Wraps the doGenerate call | parseToolCalls (output parsing) |
| wrapStream | Wraps the doStream call | Streaming version of parseToolCalls |

import { type LanguageModelV3Middleware } from "ai";
 
const toolCallMiddleware: LanguageModelV3Middleware = {
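  // Conceptual sketch: in the SDK's provider spec the prompt is a message array and
  // the result carries content parts, so the field names below are simplified.
  // Role 1: prompt assembly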
  transformParams: async ({ params }) => ({
    ...params,
    prompt: {
      ...params.prompt,
      system: assembleToolPrompt(params.tools ?? []),
    },
    // Tool definitions now live in the system prompt, so remove them
    tools: undefined,
  }),
 
  // Role 2: output parsing
  wrapGenerate: async ({ doGenerate }) => {
    const result = await doGenerate();
    const { toolCalls, errors } = parseToolCallsSafe(result.text ?? "");
 
    if (toolCalls.length > 0) {
      return {
        ...result,
        toolCalls: toolCalls.map((tc) => ({
          toolCallType: "function" as const,
          toolCallId: crypto.randomUUID(),
          toolName: tc.name,
          args: JSON.stringify(tc.arguments),
        })),
        finishReason: "tool-calls" as const,
      };
    }
 
    return result;
  },
};
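
The middleware above leaves wrapStream unimplemented. The main difficulty in a streaming version is that a tag such as <tool_call> can be split across chunks, so text has to be held back until it is clear whether a tag is starting. The class below is a minimal sketch of that buffering logic only; `StreamingToolCallParser` is a hypothetical name, and wiring it into wrapStream (stream part types, tool call IDs) is deliberately left out.

```typescript
const OPEN = "<tool_call>";
const CLOSE = "</tool_call>";

// Hypothetical streaming parser: feed it text chunks, get back complete tool call bodies.
class StreamingToolCallParser {
  private buffer = "";

  // Returns plain text that is safe to emit plus any tool calls completed by this chunk.
  push(chunk: string): { text: string; toolCalls: string[] } {
    this.buffer += chunk;
    let text = "";
    const toolCalls: string[] = [];

    while (true) {
      const start = this.buffer.indexOf(OPEN);
      if (start === -1) {
        // Hold back a tail that could be the beginning of "<tool_call>".
        const keep = this.partialOpenLength();
        text += this.buffer.slice(0, this.buffer.length - keep);
        this.buffer = this.buffer.slice(this.buffer.length - keep);
        break;
      }
      text += this.buffer.slice(0, start);
      const end = this.buffer.indexOf(CLOSE, start);
      if (end === -1) {
        this.buffer = this.buffer.slice(start); // wait for the closing tag
        break;
      }
      toolCalls.push(this.buffer.slice(start + OPEN.length, end).trim());
      this.buffer = this.buffer.slice(end + CLOSE.length);
    }
    return { text, toolCalls };
  }

  // Length of the longest buffer suffix that is a proper prefix of OPEN.
  private partialOpenLength(): number {
    const max = Math.min(this.buffer.length, OPEN.length - 1);
    for (let len = max; len > 0; len--) {
      if (OPEN.startsWith(this.buffer.slice(-len))) return len;
    }
    return 0;
  }
}

const parser = new StreamingToolCallParser();
const chunks = [
  "Let me check. <tool_",
  'call>{"name": "get_weather", ',
  '"arguments": {"location": "Seoul"}}</tool',
  "_call> done",
];
const seen = chunks.map((c) => parser.push(c));
console.log(seen);
```

Only the fourth chunk yields a tool call, because that is when the closing tag becomes complete; an unclosed tag at the end of the stream would stay in the buffer and needs separate handling.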

Now wrap the model with this middleware:

import { wrapLanguageModel, generateText, tool, stepCountIs } from "ai";
import { createOllama } from "ollama-ai-provider";
import { z } from "zod";
 
const ollama = createOllama();
 
// Wrap the model with the middleware
const wrappedModel = wrapLanguageModel({
  model: ollama("qwen3:8b"),
  middleware: toolCallMiddleware,
});
 
// Now use it like native tool calling
const result = await generateText({
  model: wrappedModel,
  tools: {
    get_weather: tool({
      description: "Get the current weather",
      // AI SDK 5+ (where stopWhen/stepCountIs live) names this field `inputSchema`
      inputSchema: z.object({ location: z.string() }),
      execute: async ({ location }) => ({
        temperature: "15°C",
        condition: "sunny",
      }),
    }),
  },
  stopWhen: stepCountIs(5),
  prompt: "What's the weather in Seoul?",
});

From the developer's point of view, the interface is identical to a native tool calling API. The middleware handles prompt assembly and output parsing internally, and generateText drives the execution loop through stopWhen.

A real implementation: @ai-sdk-tool/parser

You can write the middleware yourself, but a proven implementation already exists. ai-sdk-tool-call-middleware supports four protocols:

import {
  hermesToolMiddleware,       // Hermes: JSON in <tool_call> tags
  morphXmlToolMiddleware,     // XML: parameters as XML tags
  yamlXmlToolMiddleware,      // YAML-XML: XML tags + YAML body
  qwen3CoderToolMiddleware,   // Qwen3-specific
} from "@ai-sdk-tool/parser";
 
// Choose the protocol that matches the model
const wrappedModel = wrapLanguageModel({
  model: ollama("qwen3:8b"),
  middleware: hermesToolMiddleware,
});

You can also create a custom protocol:

import { createToolMiddleware, hermesProtocol } from "@ai-sdk-tool/parser";
 
const customMiddleware = createToolMiddleware({
  protocol: hermesProtocol,
  toolSystemPromptTemplate: (tools) =>
    `Use these tools: ${JSON.stringify(tools)}`,
});

In streaming environments, tool call parsing is handled incrementally through the tool-input-start, tool-input-delta, and tool-input-end events, and errors can be handled with a custom providerOptions.toolCallMiddleware.onError callback.

Next part: can we run the model ourselves too?

This post implemented the three Middleware roles in code. Prompt assembly, output parsing, and the execution loop: these three functions are enough to make tool calling work on a Non-native model.

So far, though, we have taken model calls such as ollama("qwen3:8b") for granted. Where is this model actually running? What does it take to run an open-source model locally?

The next part covers serving open-source models locally with Ollama and vLLM. In particular, it also looks at how vLLM's --tool-call-parser hermes option performs the Middleware role at the serving level.

References