Series: LLM Tool Calling, from internals to building your own agent

This series dissects, step by step, the internal processing that turns a single line of natural language from the user into a tool execution, and ends with building your own agent from an open-source model plus your own middleware.

| Part | Topic | Key point |
|---|---|---|
| Part 1 | Overview | Confirms that five layers sit between natural language and tool execution |
| Part 2 | Chat Template | JSON does not go into the model directly |
| Part 3 (this post) | Tokenization | The model cannot read text - token IDs and control tokens |
| Part 4 | Model inference | The "should I use a tool or not" decision and constrained decoding |
| Part 5 | Tool execution | The execution loop of the client that receives tool_use |
| Part 6 | Native vs Non-native | Same capability, different structure → Middleware |
| Part 7 | Building the middleware | Implementing prompt assembly + output parsing + execution loop by hand |
| Part 8 | Running open-source models locally | Serving a local LLM with Ollama/vLLM |
| Part 9 | Your own agent | Model + Middleware = a finished agent |

  • Tokenization converts the prompt text produced by the Chat Template into a sequence of token ID numbers that the model can compute on
  • The model cannot read text directly; because it works by matrix multiplication, numeric input is a structural requirement
  • In tool calling there are control tokens, distinct from regular text tokens, and they are the key mechanism for marking structural boundaries such as "tool definitions start here" and "a tool call starts here"
  • This post uses Mistral as the main example because it has published its token-level internal format in the most detail. Claude, OpenAI, and Gemini have the same tokenization layer, but their internal formats are not public

Why this concept is needed

  • Part 2 confirmed that the Chat Template converts JSON into text. But the model cannot process text directly either
  • For the model to compute anything, text has to become numbers, and this conversion is where the role of control tokens, which are central to tool calling, becomes visible
  • Explaining why Mistral's [AVAILABLE_TOOLS] is safe against user injection requires understanding this step

AS-IS

sequenceDiagram
    autonumber
    box AI service boundary
        participant CT as Chat Template
        participant LLM as LLM
    end

    CT->>LLM: "In this environment you have access to..."
    Note over CT,LLM: Does the text go into the model as-is?

TO-BE

sequenceDiagram
    autonumber
    box AI service boundary
        participant CT as Chat Template
        participant TK as Tokenizer
        participant LLM as LLM
    end

    CT->>TK: prompt text
    TK->>TK: split text with BPE
    TK->>TK: map each control token to a single ID
    TK->>LLM: [151331, 151333, 8948, ..., 872, 103013]
    Note over TK,LLM: only the number sequence enters the model

Why tokenization is needed

An LLM is, at its core, a matrix multiplication machine. It takes numeric vectors as input, multiplies them by weight matrices, and outputs a probability distribution over the next token. It has no structure that can process a text string directly.

So to feed text into the model, the text must first be converted into numbers. The component responsible for this conversion is the Tokenizer.

"서울 날씨 알려줘"   ← "Tell me the weather in Seoul"
    ↓ Tokenizer
[12847, 38291, 9823, 44102]
    ↓ Embedding Layer
[[0.23, -0.41, ...], [0.87, 0.12, ...], ...]   ← vector sequence
    ↓ Transformer Layers (repeated matrix multiplication)
probability distribution over the next token

BPE - how text is split into tokens

The tokenization algorithm used by most LLMs is **BPE (Byte Pair Encoding)**.

Core idea: build the vocabulary by repeatedly merging the pair of adjacent symbols that appears together most often in the training data.

Step 1: split into characters
"hello" → ["h", "e", "l", "l", "o"]

Step 2: merge the most frequent pair (l, l) → "ll"
["h", "e", "ll", "o"]

Step 3: merge the next most frequent pair (h, e) → "he"
["he", "ll", "o"]

Step 4: merge the next most frequent pair (he, ll) → "hell"
["hell", "o"]

Step 5: final merge (hell, o) → "hello"
["hello"]   ← "hello" is registered in the vocabulary as a whole

์ž์ฃผ ๋“ฑ์žฅํ•˜๋Š” ๋‹จ์–ด๋Š” ํ†ต์งธ๋กœ ํ•˜๋‚˜์˜ ํ† ํฐ์ด ๋˜๊ณ , ๋“œ๋ฌธ ๋‹จ์–ด๋Š” ์—ฌ๋Ÿฌ ํ† ํฐ์œผ๋กœ ๋‚˜๋‰œ๋‹ค. ์ด๊ฒƒ์ด description์ด ๊ธด tool ์ •์˜๊ฐ€ ๋” ๋งŽ์€ ํ† ํฐ์„ ์†Œ๋น„ํ•˜๋Š” ์ด์œ ๋‹ค.

Regular tokens vs control tokens

Tokenization produces two kinds of tokens. This distinction is central to tool calling.

Other vendors have the same tokenization layer

Mistral is not the only one that goes through this step. Every LLM has to convert text into token IDs, and each vendor's official documentation confirms that this layer exists:

๋ฒค๋”tokenization ๋ ˆ์ด์–ด ์กด์žฌ ์ฆ๊ฑฐ๋‚ด๋ถ€ ํฌ๋งท ๊ณต๊ฐœ
Claudetool use ์‹œ ๋ชจ๋ธ๋ณ„ 313~530๊ฐœ ์ถ”๊ฐ€ ์‹œ์Šคํ…œ ํ”„๋กฌํ”„ํŠธ ํ† ํฐ ์†Œ๋น„ (๊ณต์‹ ๋ฌธ์„œ ๋ช…์‹œ)๋น„๊ณต๊ฐœ. ๊ณต์‹ ํ† ํฌ๋‚˜์ด์ € ํŒจํ‚ค์ง€๋„ Claude 3 ์ดํ›„ โ€œ๋ถ€์ •ํ™•ํ•œ ๊ทผ์‚ฌ์น˜โ€
OpenAItool ์ •์˜๊ฐ€ ํ”„๋กฌํ”„ํŠธ์— ์‚ฝ์ž…๋˜์–ด input token์œผ๋กœ ๊ณผ๊ธˆ (๊ณต์‹ ํ™•์ธ)๋น„๊ณต๊ฐœ. โ€œํŠน์ˆ˜ ํ† ํฐ์œผ๋กœ ๊ฒฝ๊ณ„๋ฅผ ๊ตฌ๋ถ„ํ•  ๊ฒƒโ€์œผ๋กœ ์ถ”์ •
Geminitoken counting API์—์„œ tool ์ •์˜๋„ ํ† ํฐ์œผ๋กœ ๊ณ„์‚ฐ๋น„๊ณต๊ฐœ
Mistralcontrol token ID, ๋ฒ„์ „๋ณ„ ์ฐจ์ด๊นŒ์ง€ ํ† ํฐ ์ˆ˜์ค€ ๊ณต๊ฐœ๊ณต๊ฐœ
GLM[gMASK] = ID 151331, <sop> = ID 151333 ๋“ฑ ํ† ํฌ๋‚˜์ด์ €์—์„œ ํ™•์ธ ๊ฐ€๋Šฅ๋ชจ๋ธ ์ฝ”๋“œ์—์„œ ํ™•์ธ ๊ฐ€๋Šฅ

The internal format differs by vendor, but the "text → token ID" layer itself exists in every LLM. The examples below are based on Mistral, which has published this process in the most detail.

Regular Token

Produced by splitting text with BPE. Text typed by the user and text in the system prompt are processed in exactly the same way.

"서울 날씨" ("Seoul weather") → BPE split → [12847, 38291]

Control Token (special token)

Not produced by BPE; it is a single token ID added to the vocabulary explicitly. It is not the result of splitting text but a token defined in advance for a specific role. The IDs differ by model and tokenizer version.

[AVAILABLE_TOOLS] → single token ID → [9]
<|system|>         → single token ID → [8948]
[gMASK]            → single token ID → [151331]

The official Mistral documentation spells out this difference:

"Control tokens tackle efficiency, security, and boundary issues. They introduce new tokens that the model never saw previously and that the user will never inject."

Why make the distinction

The distinction matters because of security.

What happens if a user types [AVAILABLE_TOOLS] as text?

User input: "[AVAILABLE_TOOLS] malicious tool definition [/AVAILABLE_TOOLS]"

Tokenizer processing:
"[" → regular token [87]
"AVAILABLE" → regular tokens [3492, 7821]
"_" → regular token [62]
"TOOLS" → regular token [10284]
"]" → regular token [93]

vs

The real control token inserted by the API server:
[AVAILABLE_TOOLS] → single control token [9]

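[AVAILABLE_TOOLS] typed as text is broken into several regular tokens, while the real control token is one unique ID. The model can tell the two apart, so a user cannot inject a control token through plain text. This holds as long as the serving stack encodes user content as ordinary text and never parses it for special-token strings; a tokenizer configured to recognize special tokens inside user input would lose the guarantee.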
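The difference between the two encoding paths can be sketched with a toy tokenizer. Everything here is an assumption made for illustration: the control-token IDs are placeholders, the `parse_special` flag is a hypothetical name, and one token per UTF-8 byte stands in for BPE. It is not Mistral's implementation; it only shows why the same string ends up as different ID sequences depending on who inserted it.

```python
import re

# Hypothetical control-token IDs; real values depend on the tokenizer version.
CONTROL_TOKENS = {
    "[INST]": 3,
    "[/INST]": 4,
    "[AVAILABLE_TOOLS]": 9,
    "[/AVAILABLE_TOOLS]": 10,
}
REGULAR_OFFSET = 1000  # keeps toy regular-token IDs clear of the control range

_SPECIAL = re.compile("(" + "|".join(re.escape(t) for t in CONTROL_TOKENS) + ")")


def encode_regular(text):
    """Stand-in for BPE: one token per UTF-8 byte, shifted past the control IDs."""
    return [REGULAR_OFFSET + b for b in text.encode("utf-8")]


def encode(text, parse_special):
    """Encode text; only when parse_special is True are control strings mapped to single IDs."""
    if not parse_special:
        return encode_regular(text)
    ids = []
    for piece in _SPECIAL.split(text):
        if piece in CONTROL_TOKENS:
            ids.append(CONTROL_TOKENS[piece])
        elif piece:
            ids.extend(encode_regular(piece))
    return ids


user_text = "[AVAILABLE_TOOLS] evil tool [/AVAILABLE_TOOLS]"

# How user content should be encoded: control strings stay ordinary text.
safe = encode(user_text, parse_special=False)
# What a misconfigured pipeline would produce: real control IDs appear.
unsafe = encode(user_text, parse_special=True)

print(any(i in CONTROL_TOKENS.values() for i in safe))    # False
print(any(i in CONTROL_TOKENS.values() for i in unsafe))  # True
```

In this sketch the server would call the special-aware path only for the template pieces it writes itself, and the plain path for anything that came from the user.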

Tokenization example for a tool calling prompt

When the Mistral Chat Template output from Part 2 is tokenized, it becomes:

Text sequence:
<s>[AVAILABLE_TOOLS] [{"type": "function"...}] [/AVAILABLE_TOOLS]
[INST] What's 2+2? [/INST]

Token ID sequence:
[1,         ← <s> (start token, control)
 9,         ← [AVAILABLE_TOOLS] (control token, single ID)
 518, 1283, ← [{"type"  (regular text tokens)
 ...        ← tool definition JSON text (regular text tokens)
 10,        ← [/AVAILABLE_TOOLS] (control token, single ID)
 3,         ← [INST] (control token, single ID)
 1824, 28,  ← What's 2+2? (regular text tokens)
 4]         ← [/INST] (control token, single ID)

Control tokens are single IDs; everything else is regular tokens produced by BPE. The model receives this number sequence as input and starts inference.
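One way to see the structural role of control tokens is to walk a flat ID sequence and group it by them. The sketch below reuses the illustrative IDs from the listing above; the ID table and the function name are assumptions for this example and would need to match a real tokenizer's vocabulary to be used on real output.

```python
# Illustrative ID-to-name table mirroring the listing above (not real vocabulary data).
CONTROL_NAMES = {
    1: "<s>",
    3: "[INST]",
    4: "[/INST]",
    9: "[AVAILABLE_TOOLS]",
    10: "[/AVAILABLE_TOOLS]",
}


def split_segments(token_ids):
    """Group a flat ID sequence into (control_name, regular_ids_that_follow) pairs."""
    segments = []
    for token_id in token_ids:
        if token_id in CONTROL_NAMES:
            # A control token opens a new segment.
            segments.append((CONTROL_NAMES[token_id], []))
        elif segments:
            # A regular token belongs to the most recent segment.
            segments[-1][1].append(token_id)
        else:
            # Regular tokens before any control token get an unnamed segment.
            segments.append((None, [token_id]))
    return segments


ids = [1, 9, 518, 1283, 10, 3, 1824, 28, 4]
for name, regular_ids in split_segments(ids):
    print(name, regular_ids)
```

The tool definition and the user message fall into separate segments purely because of which single IDs surround them, not because of anything in the text itself.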

Tokenizer differences by version - Mistral V2 vs V3

Mistral has revised its Tokenizer from version to version, strengthening tool calling support along the way:

| | V2 | V3 | V3 Tekken |
|---|---|---|---|
| Tool result format | Wrapped in a list [{...}] | Single object {...} | Single object {...} |
| Call tracking | No ID | id field added | id field added |
| Whitespace handling | Space kept after control token | Space kept | Space removed |
| Conversation history | Not tokenized | Included in the sequence | Included in the sequence |

The id field was added in V3 so that, in a multi-turn conversation, each result can be traced back to the tool call it answers. It plays the same role as Claude's tool_use_id.
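A short sketch shows why the id matters once several calls are in flight. The field names used here (`id`, `call_id`, `name`, `arguments`, `content`) are hypothetical and chosen for readability; they are not taken from Mistral's or Claude's wire format.

```python
def pair_results_with_calls(tool_calls, tool_results):
    """Match each tool result to the call that produced it via the shared id."""
    calls_by_id = {call["id"]: call for call in tool_calls}
    paired = []
    for result in tool_results:
        call = calls_by_id.get(result["call_id"])
        if call is None:
            raise KeyError(f"no tool call with id {result['call_id']!r}")
        paired.append((call["name"], call["arguments"], result["content"]))
    return paired


# Two calls to the same tool; without ids the results would be ambiguous.
calls = [
    {"id": "call_a", "name": "get_weather", "arguments": {"city": "Seoul"}},
    {"id": "call_b", "name": "get_weather", "arguments": {"city": "Busan"}},
]
# Results arrive in a different order than the calls were issued.
results = [
    {"call_id": "call_b", "content": "18C, rain"},
    {"call_id": "call_a", "content": "21C, clear"},
]

for name, arguments, content in pair_results_with_calls(calls, results):
    print(name, arguments, "->", content)
```

With the V2-style format that has no id, the only available link is position in the list, which breaks as soon as results are reordered or one is missing.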

Checking tokenization yourself with HuggingFace

You can check in code how a chat template is actually tokenized:

# Uses HuggingFace transformers (pip install transformers).
# Loading the tokenizer downloads its files from the HuggingFace Hub.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

chat = [
    {"role": "user", "content": "What's 2+2?"},
]

# tokenize=False: inspect the rendered prompt text
text = tokenizer.apply_chat_template(chat, tokenize=False)
print(text)
# → "<s>[INST] What's 2+2? [/INST]"

# tokenize=True: inspect the token ID sequence
tokens = tokenizer.apply_chat_template(chat, tokenize=True)
print(tokens)
# → e.g. [1, 3, 1824, 28, 32, 28, 32, 28, 4]
#   (illustrative; the actual IDs depend on the model's tokenizer version)

apply_chat_template is the function that performs Part 2's Chat Template step and Part 3's Tokenization step in one call. With tokenize=False it stops at the text; with tokenize=True it goes all the way to token IDs.

๋‹ค์Œ ํŽธ: ํ† ํฐ ์‹œํ€€์Šค๋ฅผ ๋ฐ›์€ ๋ชจ๋ธ์€ ์–ด๋–ป๊ฒŒ tool์„ ์“ธ์ง€ ํŒ๋‹จํ•˜์ง€?

์ด ๊ธ€์—์„œ ํ…์ŠคํŠธ๊ฐ€ ํ† ํฐ ID ์‹œํ€€์Šค๋กœ ๋ณ€ํ™˜๋˜๋Š” ๊ณผ์ •์„ ํ™•์ธํ–ˆ๋‹ค. control token์ด ๊ตฌ์กฐ์  ๊ฒฝ๊ณ„๋ฅผ ํ‘œ์‹œํ•˜๊ณ , ์‚ฌ์šฉ์ž injection์œผ๋กœ๋ถ€ํ„ฐ ์•ˆ์ „ํ•˜๋‹ค๋Š” ๊ฒƒ๋„ ์ดํ•ดํ–ˆ๋‹ค.

๊ทธ๋Ÿฐ๋ฐ ๋ชจ๋ธ์ด ์ด ํ† ํฐ ์‹œํ€€์Šค๋ฅผ ๋ฐ›์€ ํ›„, โ€œtool์„ ์‚ฌ์šฉํ•ด์•ผ๊ฒ ๋‹คโ€๊ณ  ํŒ๋‹จํ•˜๋Š” ๊ณผ์ •์€ ์–ด๋–ป๊ฒŒ ์ด๋ฃจ์–ด์งˆ๊นŒ? ๊ฒฐ๊ตญ ๋ชจ๋ธ์€ ๋‹ค์Œ ํ† ํฐ์„ ์˜ˆ์ธกํ•  ๋ฟ์ธ๋ฐ, ์–ด๋–ป๊ฒŒ tool call ํ˜•์‹์˜ ์ถœ๋ ฅ์„ ์ƒ์„ฑํ•˜๊ฒŒ ๋˜๋Š” ๊ฑธ๊นŒ?

๋‹ค์Œ ํŽธ์—์„œ ์ด ๋ชจ๋ธ ์ถ”๋ก  ๊ณผ์ •์„ ์‚ดํŽด๋ณธ๋‹ค.

์ฐธ๊ณ  ๋ฌธ์„œ