• A delta is the text fragment (incremental change) that an LLM streaming API has newly generated since the previous chunk
  • It is the object inside each event chunk delivered over SSE (Server-Sent Events) that describes what changed
  • The name comes from the mathematical Δ (delta), meaning "amount of change": here, "the part of the whole that was added this time"

Why this concept is needed

  • If the LLM generates the entire response and returns it in one go, the user waits anywhere from a few seconds to tens of seconds
  • Streaming deltas lets the client display text in real time from the first character → much better UX
  • Helps avoid HTTP timeouts on long responses and Tool Use

AS-IS (no streaming)

sequenceDiagram
    autonumber
    Client->>LLM API: POST /messages (stream: false)
    Note over LLM API: Generating the full response...<br/>(5~30 s wait)
    LLM API-->>Client: Returns the completed full response
    Note over Client: User receives the result<br/>all at once

TO-BE (delta streaming)

sequenceDiagram
    autonumber
    Client->>LLM API: POST /messages (stream: true)
    LLM API-->>Client: delta: "H"
    LLM API-->>Client: delta: "el"
    LLM API-->>Client: delta: "lo"
    LLM API-->>Client: message_stop
    Note over Client: Text is displayed in real time<br/>as it is generated

Anthropic API delta event structure

Full event flow through which deltas are delivered:

event: message_start          ← 1. Message initialized
event: content_block_start    ← 2. Content block starts
event: content_block_delta    ← 3. Deltas delivered repeatedly (the core part!)
event: content_block_delta
event: content_block_delta
event: content_block_stop     ← 4. Block ends
event: message_delta          ← 5. Message-level changes (stop_reason, etc.)
event: message_stop           ← 6. Stream ends

Example of an actual delta chunk:

{
  "type": "content_block_delta",
  "index": 0,
  "delta": {
    "type": "text_delta",
    "text": "Hello"
  }
}
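
To make the event flow concrete, here is a minimal sketch of reading such a stream by hand. It assumes each SSE frame is an `event:` line plus a `data:` line carrying JSON, with frames separated by a blank line. `parse_sse`, `collect_text`, and `SAMPLE` are illustrative names, and the sample stream is hand-written rather than captured from a real API call.

```python
import json


def parse_sse(raw: str):
    """Yield (event_name, payload_dict) pairs from raw SSE text."""
    for frame in raw.strip().split("\n\n"):  # frames are separated by a blank line
        event_name, data_lines = None, []
        for line in frame.splitlines():
            if line.startswith("event:"):
                event_name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event_name, json.loads("\n".join(data_lines))


def collect_text(raw: str) -> str:
    """Concatenate the text of every text_delta found in the stream."""
    return "".join(
        payload["delta"]["text"]
        for name, payload in parse_sse(raw)
        if name == "content_block_delta" and payload["delta"]["type"] == "text_delta"
    )


# Hand-written sample stream (an assumption for illustration only).
SAMPLE = (
    'event: content_block_start\n'
    'data: {"type": "content_block_start", "index": 0, '
    '"content_block": {"type": "text", "text": ""}}\n'
    '\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Hel"}}\n'
    '\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "lo"}}\n'
    '\n'
    'event: content_block_stop\n'
    'data: {"type": "content_block_stop", "index": 0}\n'
    '\n'
)

print(collect_text(SAMPLE))  # Hello
```

In practice the SDK does this parsing for you; the sketch only shows what the wire format looks like underneath.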

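Three main delta types (Anthropic)

Other delta types also exist (for example signature_delta for thinking blocks); the three below are the ones most commonly handled.

| delta type | Description | Field |
|---|---|---|
| text_delta | Fragment of a regular text response | text |
| input_json_delta | Tool Use parameters (partial JSON) | partial_json |
| thinking_delta | Extended Thinking content | thinking |

# Handling each delta type with the Python SDK
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# `...` stands for the usual request arguments (model, max_tokens, messages).
with client.messages.stream(...) as stream:
    for event in stream:
        if event.type == "content_block_delta":
            if event.delta.type == "text_delta":
                print(event.delta.text, end="", flush=True)
            elif event.delta.type == "thinking_delta":
                print(event.delta.thinking, end="", flush=True)
            elif event.delta.type == "input_json_delta":
                print(event.delta.partial_json, end="", flush=True)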
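
Unlike text, input_json_delta fragments are not usable one by one: each partial_json value is only a slice of a JSON document. A rough sketch of one way to handle this is to buffer the fragments per content block and parse them when the block stops. The events below are plain dicts written by hand, and `collect_tool_inputs`, the tool name `get_weather`, and the id `toolu_example` are hypothetical.

```python
import json


def collect_tool_inputs(events):
    """Join partial_json fragments per content block; parse once the block stops."""
    buffers = {}  # block index -> list of JSON fragments
    inputs = {}   # block index -> parsed tool input
    for event in events:
        if (event["type"] == "content_block_delta"
                and event["delta"]["type"] == "input_json_delta"):
            buffers.setdefault(event["index"], []).append(event["delta"]["partial_json"])
        elif event["type"] == "content_block_stop" and event["index"] in buffers:
            # A single fragment is not valid JSON; only the joined string is.
            raw = "".join(buffers.pop(event["index"]))
            inputs[event["index"]] = json.loads(raw) if raw else {}
    return inputs


# Hand-written events (assumed shapes, not captured from a real stream).
events = [
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "tool_use", "id": "toolu_example",
                       "name": "get_weather", "input": {}}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": '{"city": "Se'}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "input_json_delta", "partial_json": 'oul", "unit": "celsius"}'}},
    {"type": "content_block_stop", "index": 1},
]

print(collect_tool_inputs(events))  # {1: {'city': 'Seoul', 'unit': 'celsius'}}
```

Parsing only at content_block_stop avoids having to deal with incomplete JSON mid-stream.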

OpenAI vs Anthropic: delta structure comparison

Both APIs use the delta concept, but the structures differ:

OpenAI (chat.completion.chunk):

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "choices": [{
    "index": 0,
    "delta": {
      "role": "assistant",   // included only in the first chunk
      "content": "Hi"        // text fragment
    },
    "finish_reason": null
  }]
}

Anthropic (content_block_delta):

{
  "type": "content_block_delta",
  "index": 0,
  "delta": {
    "type": "text_delta",    // delta type stated explicitly
    "text": "Hi"             // text fragment
  }
}

| Item | OpenAI | Anthropic |
|---|---|---|
| Text location | choices[0].delta.content | delta.text |
| delta type distinction | No type field (inferred from which field is present: content / tool_calls) | Stated explicitly in the type field |
| Event wrapping | Unnamed SSE events (data: lines only), ending with data: [DONE] | Named SSE events (event: + data:) |
| Tool Use delta | delta.tool_calls | input_json_delta |
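
The differences in the comparison can be hidden behind a small adapter. The following is a minimal sketch, assuming chunks have already been decoded into dicts shaped like the two examples above; `extract_text` is a hypothetical helper, not part of either SDK.

```python
from typing import Optional


def extract_text(chunk: dict) -> Optional[str]:
    """Return the text fragment in an OpenAI- or Anthropic-style chunk, else None."""
    if chunk.get("object") == "chat.completion.chunk":  # OpenAI shape
        choices = chunk.get("choices") or []
        if not choices:
            return None
        return choices[0].get("delta", {}).get("content")
    if chunk.get("type") == "content_block_delta":      # Anthropic shape
        delta = chunk.get("delta", {})
        if delta.get("type") == "text_delta":
            return delta.get("text")
    return None


openai_chunk = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion.chunk",
    "choices": [{"index": 0,
                 "delta": {"role": "assistant", "content": "Hi"},
                 "finish_reason": None}],
}
anthropic_chunk = {
    "type": "content_block_delta",
    "index": 0,
    "delta": {"type": "text_delta", "text": "Hi"},
}

print(extract_text(openai_chunk), extract_text(anthropic_chunk))  # Hi Hi
```

Non-text chunks (tool calls, message_stop, and so on) return None here, so a caller can skip them with a simple check.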

Assembling the full text by accumulating deltas

Deltas are fragments, so the client has to accumulate them to obtain the full response:

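import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

full_text = ""
# `...` stands for the usual request arguments (model, max_tokens, messages).
with client.messages.stream(...) as stream:
    for text in stream.text_stream:
        full_text += text                # accumulate the delta
        print(text, end="", flush=True)  # print in real time

print(f"\nFinal full response: {full_text}")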
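
When working from raw events instead of text_stream, the same accumulation can be sketched offline. The example below assumes events are dicts shaped like the Anthropic examples earlier, keeps one buffer per content block index, and also picks up stop_reason from message_delta. `assemble_message` and the sample events are illustrative only.

```python
def assemble_message(events):
    """Accumulate text deltas per content block and capture the stop reason."""
    blocks = {}         # content block index -> accumulated text
    stop_reason = None
    for event in events:
        if (event["type"] == "content_block_delta"
                and event["delta"]["type"] == "text_delta"):
            index = event["index"]
            blocks[index] = blocks.get(index, "") + event["delta"]["text"]
        elif event["type"] == "message_delta":
            stop_reason = event["delta"].get("stop_reason")
    # Join blocks in index order to rebuild the full text.
    text = "".join(blocks[i] for i in sorted(blocks))
    return text, stop_reason


# Hand-written events (assumed shapes, not captured from a real stream).
events = [
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "Hel"}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "text_delta", "text": "lo"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "message_delta", "delta": {"stop_reason": "end_turn"}},
    {"type": "message_stop"},
]

print(assemble_message(events))  # ('Hello', 'end_turn')
```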

References