waveterm/aiprompts/openai-streaming.md
Mike Sawka d272a4ec03
2025-10-07 13:32:10 -07:00


OpenAI Responses API SSE Events Documentation

This document outlines the Server-Sent Events (SSE) format that OpenAI's Responses API uses when streaming a response, based on the Vercel AI SDK implementation.

Core Event Types

Response Lifecycle Events

response.created

Emitted when a new response begins.

{
  "type": "response.created",
  "response": {
    "id": "resp_abc123",
    "created_at": 1640995200,
    "model": "gpt-5",
    "service_tier": "default"
  }
}

response.completed

Emitted when the response completes successfully.

{
  "type": "response.completed",
  "response": {
    "incomplete_details": null,
    "usage": {
      "input_tokens": 100,
      "input_tokens_details": {
        "cached_tokens": 50
      },
      "output_tokens": 200,
      "output_tokens_details": {
        "reasoning_tokens": 150
      }
    },
    "service_tier": "default"
  }
}
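
As a minimal sketch of how a client might read the usage block above, the helper below flattens the nested counters into one dictionary. The function name `extract_usage` and the output key names are assumptions made for illustration; only the input field names come from the event shown in this document.

```python
from typing import Any, Dict


def extract_usage(event: Dict[str, Any]) -> Dict[str, int]:
    """Flatten the usage object of a response.completed/response.incomplete event.

    Hypothetical helper: missing detail objects are treated as zero counts.
    """
    usage = event.get("response", {}).get("usage") or {}
    input_details = usage.get("input_tokens_details") or {}
    output_details = usage.get("output_tokens_details") or {}
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "cached_input_tokens": input_details.get("cached_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
        "reasoning_tokens": output_details.get("reasoning_tokens", 0),
    }


completed = {
    "type": "response.completed",
    "response": {
        "incomplete_details": None,
        "usage": {
            "input_tokens": 100,
            "input_tokens_details": {"cached_tokens": 50},
            "output_tokens": 200,
            "output_tokens_details": {"reasoning_tokens": 150},
        },
        "service_tier": "default",
    },
}
print(extract_usage(completed))
```

Defaulting absent detail objects to zero keeps the same helper usable for events that carry only the top-level token counts.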

response.incomplete

Emitted when the response ends before it is complete (e.g., because the output token limit was reached).

{
  "type": "response.incomplete",
  "response": {
    "incomplete_details": {
      "reason": "max_output_tokens"
    },
    "usage": {
      "input_tokens": 100,
      "output_tokens": 4000
    }
  }
}

Content Block Events

response.output_item.added

Emitted when a new output item (content block) is added.

{
  "type": "response.output_item.added",
  "output_index": 0,
  "item": {
    "type": "message",
    "id": "msg_abc123"
  }
}

Item types can be:

  • message - Text content
  • reasoning - Reasoning/thinking content
  • function_call - Tool call
  • web_search_call - Web search tool call
  • computer_call - Computer use tool call
  • file_search_call - File search tool call
  • image_generation_call - Image generation tool call
  • code_interpreter_call - Code interpreter tool call

response.output_item.done

Emitted when an output item is completed.

{
  "type": "response.output_item.done",
  "output_index": 0,
  "item": {
    "type": "message",
    "id": "msg_abc123"
  }
}

For function calls, the item includes the complete arguments as a JSON-encoded string:

{
  "type": "response.output_item.done",
  "output_index": 1,
  "item": {
    "type": "function_call",
    "id": "call_abc123",
    "call_id": "call_abc123",
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco\"}",
    "status": "completed"
  }
}
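
Because `arguments` arrives as a JSON-encoded string, a client has to decode it before invoking the tool. The sketch below shows one way to do that; `tool_call_from_item` is a hypothetical helper name, and the returned dictionary shape is an assumption rather than part of the API.

```python
import json
from typing import Any, Dict, Optional


def tool_call_from_item(item: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Decode a completed function_call item; return None for other item types."""
    if item.get("type") != "function_call":
        return None
    try:
        # An empty arguments string is treated as "no arguments".
        arguments = json.loads(item.get("arguments") or "{}")
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"function_call {item.get('call_id')} has invalid arguments JSON"
        ) from exc
    return {"call_id": item["call_id"], "name": item["name"], "arguments": arguments}


done_item = {
    "type": "function_call",
    "id": "call_abc123",
    "call_id": "call_abc123",
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco\"}",
    "status": "completed",
}
print(tool_call_from_item(done_item))
```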

Text Streaming Events

response.output_text.delta

Emitted for incremental text content.

{
  "type": "response.output_text.delta",
  "item_id": "msg_abc123",
  "delta": "Hello, how can I",
  "logprobs": [
    {
      "token": "Hello",
      "logprob": -0.1,
      "top_logprobs": [
        {
          "token": "Hello",
          "logprob": -0.1
        },
        {
          "token": "Hi",
          "logprob": -2.3
        }
      ]
    }
  ]
}
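
A minimal sketch of accumulating these deltas, keyed by `item_id` so that text from different message items stays separate. The class name `TextAccumulator` is hypothetical, and log probabilities are ignored here.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class TextAccumulator:
    """Collect response.output_text.delta fragments per item_id."""

    def __init__(self) -> None:
        self._chunks: DefaultDict[str, List[str]] = defaultdict(list)

    def handle(self, event: Dict[str, Any]) -> None:
        # Ignore every event that is not a text delta.
        if event.get("type") != "response.output_text.delta":
            return
        self._chunks[event["item_id"]].append(event.get("delta", ""))

    def text(self, item_id: str) -> str:
        # .get avoids creating an empty entry for unknown items.
        return "".join(self._chunks.get(item_id, []))


acc = TextAccumulator()
acc.handle({"type": "response.output_text.delta", "item_id": "msg_abc123", "delta": "Hello, how can I"})
acc.handle({"type": "response.output_text.delta", "item_id": "msg_abc123", "delta": " help?"})
print(acc.text("msg_abc123"))
```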

Tool Call Events

response.function_call_arguments.delta

Emitted for streaming function call arguments.

{
  "type": "response.function_call_arguments.delta",
  "item_id": "call_abc123",
  "output_index": 1,
  "delta": "\"location\": \"San"
}
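
Each delta is only a fragment of the final JSON string, so it cannot be parsed on its own. The sketch below buffers fragments per `output_index` and decodes them once the matching `response.output_item.done` event arrives. `ToolArgBuffer` is a hypothetical name, and preferring the `arguments` field of the completed item over the buffered text is a design choice made for this example.

```python
import json
from typing import Any, Dict, Optional


class ToolArgBuffer:
    """Buffer function-call argument deltas until the item is done."""

    def __init__(self) -> None:
        self._buffers: Dict[int, str] = {}

    def handle(self, event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        etype = event.get("type")
        if etype == "response.function_call_arguments.delta":
            idx = event["output_index"]
            self._buffers[idx] = self._buffers.get(idx, "") + event["delta"]
            return None
        if etype == "response.output_item.done":
            item = event.get("item", {})
            if item.get("type") != "function_call":
                return None
            buffered = self._buffers.pop(event["output_index"], "")
            # Prefer the complete string on the item; fall back to the buffer.
            return json.loads(item.get("arguments") or buffered or "{}")
        return None


buf = ToolArgBuffer()
for fragment in ["{", "\"location\": \"San", " Francisco\"}"]:
    buf.handle({
        "type": "response.function_call_arguments.delta",
        "item_id": "call_abc123",
        "output_index": 1,
        "delta": fragment,
    })
args = buf.handle({
    "type": "response.output_item.done",
    "output_index": 1,
    "item": {"type": "function_call", "id": "call_abc123", "call_id": "call_abc123", "name": "get_weather"},
})
print(args)
```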

Reasoning Events

response.reasoning_summary_part.added

Emitted when a new reasoning summary part is added.

{
  "type": "response.reasoning_summary_part.added",
  "item_id": "reasoning_abc123",
  "summary_index": 0
}

response.reasoning_summary_text.delta

Emitted for incremental reasoning text.

{
  "type": "response.reasoning_summary_text.delta",
  "item_id": "reasoning_abc123",
  "summary_index": 0,
  "delta": "Let me think about this step by step..."
}
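
Since one reasoning item can carry several summary parts, a client could key its buffers on the pair `(item_id, summary_index)`. The following sketch assumes that approach; `ReasoningTracker` and the choice to join parts with a blank line are illustrative, not taken from the API.

```python
from typing import Any, Dict, Tuple


class ReasoningTracker:
    """Accumulate reasoning summary text per (item_id, summary_index)."""

    def __init__(self) -> None:
        self._parts: Dict[Tuple[str, int], str] = {}

    def handle(self, event: Dict[str, Any]) -> None:
        etype = event.get("type")
        if etype == "response.reasoning_summary_part.added":
            key = (event["item_id"], event["summary_index"])
            self._parts.setdefault(key, "")
        elif etype == "response.reasoning_summary_text.delta":
            key = (event["item_id"], event["summary_index"])
            self._parts[key] = self._parts.get(key, "") + event["delta"]

    def summary(self, item_id: str) -> str:
        # Sort by summary_index so parts are joined in order.
        keys = sorted(k for k in self._parts if k[0] == item_id)
        return "\n\n".join(self._parts[k] for k in keys)


tracker = ReasoningTracker()
tracker.handle({"type": "response.reasoning_summary_part.added", "item_id": "reasoning_abc123", "summary_index": 0})
tracker.handle({
    "type": "response.reasoning_summary_text.delta",
    "item_id": "reasoning_abc123",
    "summary_index": 0,
    "delta": "Let me think about this step by step...",
})
print(tracker.summary("reasoning_abc123"))
```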

Annotation Events

response.output_text.annotation.added

Emitted when citations or annotations are added to text.

{
  "type": "response.output_text.annotation.added",
  "annotation": {
    "type": "url_citation",
    "url": "https://example.com/article",
    "title": "Example Article"
  }
}

Or for file citations:

{
  "type": "response.output_text.annotation.added",
  "annotation": {
    "type": "file_citation",
    "file_id": "file_abc123",
    "filename": "document.pdf",
    "quote": "This is the relevant quote",
    "start_index": 100,
    "end_index": 150
  }
}
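
The two annotation shapes above can be reduced to a single display string. This is a sketch only: `describe_annotation` and the output format are hypothetical, and annotation types other than the two shown here are returned as `None`.

```python
from typing import Any, Dict, Optional


def describe_annotation(event: Dict[str, Any]) -> Optional[str]:
    """Return a short label for a url_citation or file_citation annotation."""
    if event.get("type") != "response.output_text.annotation.added":
        return None
    ann = event.get("annotation") or {}
    kind = ann.get("type")
    if kind == "url_citation":
        url = ann["url"]
        title = ann.get("title") or url
        return f"{title} <{url}>"
    if kind == "file_citation":
        file_id = ann["file_id"]
        name = ann.get("filename") or file_id
        return f"{name} [{file_id}]"
    return None  # unknown annotation type


url_event = {
    "type": "response.output_text.annotation.added",
    "annotation": {
        "type": "url_citation",
        "url": "https://example.com/article",
        "title": "Example Article",
    },
}
print(describe_annotation(url_event))
```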

Error Events

error

Emitted when an error occurs.

{
  "type": "error",
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Please try again later.",
  "param": null,
  "sequence_number": 5
}
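
One way to surface this event to calling code is to convert it into an exception. The sketch below assumes a hypothetical `StreamError` class and `raise_on_error` helper; the field names are the ones shown in the event above.

```python
from typing import Any, Dict, Optional


class StreamError(Exception):
    """Raised when the stream delivers an error event."""

    def __init__(self, code: Optional[str], message: str,
                 param: Optional[str] = None,
                 sequence_number: Optional[int] = None) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.param = param
        self.sequence_number = sequence_number


def raise_on_error(event: Dict[str, Any]) -> None:
    """Raise StreamError for error events; do nothing for any other event."""
    if event.get("type") != "error":
        return
    raise StreamError(
        code=event.get("code"),
        message=event.get("message", ""),
        param=event.get("param"),
        sequence_number=event.get("sequence_number"),
    )


error_event = {
    "type": "error",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Please try again later.",
    "param": None,
    "sequence_number": 5,
}
try:
    raise_on_error(error_event)
except StreamError as err:
    print(err.code, err.sequence_number)
```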

Built-in Tool Call Schemas

Web Search Call

{
  "type": "web_search_call",
  "id": "search_abc123",
  "status": "completed",
  "action": {
    "type": "search",
    "query": "OpenAI API documentation"
  }
}

File Search Call

{
  "type": "file_search_call",
  "id": "search_abc123",
  "queries": ["OpenAI pricing", "API limits"],
  "results": [
    {
      "attributes": {},
      "file_id": "file_abc123",
      "filename": "pricing.pdf",
      "score": 0.85,
      "text": "OpenAI API pricing starts at..."
    }
  ]
}

Code Interpreter Call

{
  "type": "code_interpreter_call",
  "id": "code_abc123",
  "code": "print('Hello, world!')",
  "container_id": "container_123",
  "outputs": [
    {
      "type": "logs",
      "logs": "Hello, world!\n"
    }
  ]
}

Image Generation Call

{
  "type": "image_generation_call",
  "id": "img_abc123",
  "result": "<base64-encoded image data>"
}

Computer Use Call

{
  "type": "computer_call",
  "id": "computer_abc123",
  "status": "completed"
}

Event Processing Flow

  1. Response Start: response.created → Initialize response tracking
  2. Content Blocks: response.output_item.added → Start tracking content block
  3. Streaming Content:
    • response.output_text.delta → Accumulate text
    • response.function_call_arguments.delta → Accumulate tool arguments
    • response.reasoning_summary_text.delta → Accumulate reasoning
  4. Content Complete: response.output_item.done → Finalize content block
  5. Response End: response.completed/response.incomplete → Finalize response
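
The five steps above can be sketched as a single dispatch loop. This is an illustrative outline, not the Vercel AI SDK's implementation: `process_stream` and the shape of its result are assumptions, blocks are keyed by item id because the text delta event shown in this document carries `item_id`, and unknown event types are silently ignored.

```python
from typing import Any, Dict, Iterable

DELTA_EVENTS = {
    "response.output_text.delta",
    "response.function_call_arguments.delta",
    "response.reasoning_summary_text.delta",
}


def process_stream(events: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    result: Dict[str, Any] = {
        "response_id": None, "blocks": {}, "status": "in_progress", "usage": None,
    }
    for event in events:
        etype = event.get("type")
        if etype == "response.created":                    # 1. response start
            result["response_id"] = event["response"]["id"]
        elif etype == "response.output_item.added":        # 2. new content block
            item = event["item"]
            result["blocks"][item["id"]] = {
                "type": item["type"],
                "output_index": event["output_index"],
                "content": "",
                "done": False,
            }
        elif etype in DELTA_EVENTS:                        # 3. streaming content
            block = result["blocks"].get(event["item_id"])
            if block is not None:
                block["content"] += event["delta"]
        elif etype == "response.output_item.done":         # 4. content complete
            block = result["blocks"].get(event["item"]["id"])
            if block is not None:
                block["done"] = True
        elif etype in ("response.completed", "response.incomplete"):  # 5. end
            result["status"] = etype.split(".", 1)[1]
            result["usage"] = event["response"].get("usage")
    return result


stream = [
    {"type": "response.created", "response": {"id": "resp_abc123"}},
    {"type": "response.output_item.added", "output_index": 0,
     "item": {"type": "message", "id": "msg_abc123"}},
    {"type": "response.output_text.delta", "item_id": "msg_abc123", "delta": "Hello, how can I"},
    {"type": "response.output_text.delta", "item_id": "msg_abc123", "delta": " help?"},
    {"type": "response.output_item.done", "output_index": 0,
     "item": {"type": "message", "id": "msg_abc123"}},
    {"type": "response.completed",
     "response": {"usage": {"input_tokens": 100, "output_tokens": 200}}},
]
final = process_stream(stream)
print(final["status"], final["blocks"]["msg_abc123"]["content"])
```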

Key Differences from Anthropic

| Aspect | OpenAI Responses API | Anthropic Messages API |
| --- | --- | --- |
| Text streaming | response.output_text.delta | content_block_delta (type: text_delta) |
| Tool arguments | response.function_call_arguments.delta | content_block_delta (type: input_json_delta) |
| Reasoning | response.reasoning_summary_text.delta | content_block_delta (type: thinking_delta) |
| Block tracking | output_index | index |
| Response start | response.created | message_start |
| Response end | response.completed | message_stop |
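
For an adapter layer that has to translate between the two formats, the event-name rows of this comparison can be captured as a lookup table. The sketch below encodes only the pairs listed above; the names `OPENAI_TO_ANTHROPIC` and `anthropic_equivalent` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# OpenAI event type -> (Anthropic event type, Anthropic delta type or None)
OPENAI_TO_ANTHROPIC: Dict[str, Tuple[str, Optional[str]]] = {
    "response.output_text.delta": ("content_block_delta", "text_delta"),
    "response.function_call_arguments.delta": ("content_block_delta", "input_json_delta"),
    "response.reasoning_summary_text.delta": ("content_block_delta", "thinking_delta"),
    "response.created": ("message_start", None),
    "response.completed": ("message_stop", None),
}


def anthropic_equivalent(openai_type: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return the Anthropic counterpart, or None when the comparison lists none."""
    return OPENAI_TO_ANTHROPIC.get(openai_type)


print(anthropic_equivalent("response.output_text.delta"))
```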

Error Handling

  • Parse each SSE event with proper JSON validation
  • Handle unknown event types gracefully (forward as-is or ignore)
  • Track sequence_number for error events to maintain order
  • Use output_index to correlate events with specific content blocks
  • Handle partial JSON in tool argument deltas (accumulate until complete)
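
A minimal sketch of the first two points: it splits an SSE text stream into events, validates each `data` payload as JSON, and forwards events of any type as-is. `parse_sse` is a hypothetical helper; it only reads `data` fields, skips payloads that are not JSON objects with a `type` key, and, like a standard SSE parser, drops a trailing event that is not terminated by a blank line.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON events from an iterable of SSE text lines."""
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                payload = "\n".join(data_lines)
                data_lines = []
                try:
                    event = json.loads(payload)
                except json.JSONDecodeError:
                    continue  # invalid JSON: skip this event
                if isinstance(event, dict) and "type" in event:
                    yield event
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional space after the colon
        if field == "data":
            data_lines.append(value)


sample = [
    "event: response.output_text.delta\n",
    "data: {\"type\": \"response.output_text.delta\", \"item_id\": \"msg_abc123\", \"delta\": \"Hi\"}\n",
    "\n",
    "data: {not json\n",
    "\n",
    ": keep-alive\n",
    "data: {\"type\": \"some.future.event\"}\n",
    "\n",
]
for parsed in parse_sse(sample):
    print(parsed["type"])
```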

Implementation Notes

  • Do not assume that events for one output item arrive contiguously; use output_index and item_id for correlation
  • Multiple reasoning summary parts can exist; track by summary_index
  • Tool calls can be provider-executed (built-in tools) or require client execution
  • Logprobs are optional and only included when requested
  • Usage token counts are only available in the terminal events (response.completed and response.incomplete)