Wave AI Architecture Documentation

Overview

Wave AI is a chat-based AI assistant feature integrated into Wave Terminal. It provides a conversational interface for interacting with various AI providers (OpenAI, Anthropic, Perplexity, Google, and Wave's cloud proxy) through a unified streaming architecture. The feature is implemented as a block view within Wave Terminal's modular system.

Architecture Components

Frontend Architecture (frontend/app/view/waveai/)

Core Components

1. WaveAiModel Class

  • Purpose: Main view model implementing the ViewModel interface
  • Responsibilities:
    • State management using Jotai atoms
    • Configuration management (presets, AI options)
    • Message handling and persistence
    • RPC communication with backend
    • UI state coordination

2. AiWshClient Class

  • Purpose: Specialized WSH RPC client for AI operations
  • Extends: WshClient
  • Responsibilities:
    • Handle incoming aisendmessage RPC calls
    • Route messages to the model's sendMessage method
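A hedged sketch of this class (the handle_aisendmessage naming convention, the AiMessageData shape, and the route helper are assumptions based on the command name, not verbatim source):

class AiWshClient extends WshClient {
    model: WaveAiModel;

    constructor(model: WaveAiModel) {
        super(makeFeBlockRouteId(model.blockId));   // route ID helper assumed
        this.model = model;
    }

    // Assumed convention: WshClient dispatches the "aisendmessage" command
    // to a handler named handle_<command>.
    handle_aisendmessage(rh: RpcResponseHelper, data: AiMessageData) {
        this.model.sendMessage(data.message);
    }
}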

3. React Components

  • WaveAi: Main container component
  • ChatWindow: Scrollable message display with auto-scroll behavior
  • ChatItem: Individual message renderer with role-based styling
  • ChatInput: Auto-resizing textarea with keyboard navigation

State Management (Jotai Atoms)

Message State:

messagesAtom: PrimitiveAtom<Array<ChatMessageType>>
messagesSplitAtom: SplitAtom<Array<ChatMessageType>>
latestMessageAtom: Atom<ChatMessageType>
addMessageAtom: WritableAtom<unknown, [message: ChatMessageType], void>
updateLastMessageAtom: WritableAtom<unknown, [text: string, isUpdating: boolean], void>
removeLastMessageAtom: WritableAtom<unknown, [], void>
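The writable atoms encapsulate all list mutations, so components never manipulate the message array directly. As a minimal sketch of this pattern (illustrative, not verbatim source; it assumes the incoming text replaces the last message's text), updateLastMessageAtom could be defined as:

import { atom } from "jotai";

const updateLastMessageAtom = atom(null, (get, set, text: string, isUpdating: boolean) => {
    const messages = get(messagesAtom);   // messagesAtom from the listing above
    const lastMessage = messages[messages.length - 1];
    if (lastMessage == null) {
        return;
    }
    // Swap out the trailing (in-progress) message for one carrying the new text.
    set(messagesAtom, [...messages.slice(0, -1), { ...lastMessage, text, isUpdating }]);
});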

Configuration State:

presetKey: Atom<string>           // Current AI preset selection
presetMap: Atom<{[k: string]: MetaType}>  // Available AI presets
mergedPresets: Atom<MetaType>     // Merged configuration hierarchy
aiOpts: Atom<WaveAIOptsType>      // Final AI options for requests

UI State:

locked: PrimitiveAtom<boolean>    // Prevents input during AI response
viewIcon: Atom<string>            // Header icon
viewName: Atom<string>            // Header title
viewText: Atom<HeaderElem[]>      // Dynamic header elements
endIconButtons: Atom<IconButtonDecl[]>  // Header action buttons

Configuration Hierarchy

The AI configuration follows a three-tier hierarchy (lowest to highest priority):

  1. Global Settings: atoms.settingsAtom["ai:*"]
  2. Preset Configuration: presets[presetKey]["ai:*"]
  3. Block Metadata: block.meta["ai:*"]

Configuration is merged using mergeMeta() utility, allowing fine-grained overrides at each level.
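A sketch of the derivation using the atoms listed earlier (mergeMeta(base, overrides) is assumed to return base with overrides applied; blockMetaAtom is an illustrative name for the block metadata source):

const mergedPresets = atom((get) => {
    const settings = get(atoms.settingsAtom);               // 1. global "ai:*" defaults
    const preset = get(presetMap)[get(presetKey)] ?? {};    // 2. selected preset
    const blockMeta = get(blockMetaAtom);                   // 3. per-block overrides
    return mergeMeta(mergeMeta(settings, preset), blockMeta);
});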

Data Flow - Frontend

User Input → sendMessage() → 
├── Add user message to UI
├── Create WaveAIStreamRequest
├── Call RpcApi.StreamWaveAiCommand()
├── Add typing indicator
└── Stream response handling:
    ├── Update message incrementally
    ├── Handle errors
    └── Save complete conversation
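A hedged sketch of this loop (it assumes RpcApi.StreamWaveAiCommand yields WaveAIPacketType values as an async iterable and accepts an RPC timeout option; the helper functions stand in for the atom setters described above):

const request: WaveAIStreamRequest = { clientid: clientId, opts: aiOpts, prompt: history };
let fullText = "";
try {
    for await (const packet of RpcApi.StreamWaveAiCommand(client, request, { timeout: aiOpts.timeoutms })) {
        if (packet.error) {
            throw new Error(packet.error);
        }
        fullText += packet.text ?? "";
        updateLastMessage(fullText, true);    // incremental UI update
    }
    updateLastMessage(fullText, false);       // finalize the assistant message
    saveHistory([...history, { role: "assistant", content: fullText }]);   // persist exchange
} catch (e) {
    addErrorMessage(String(e));               // surface the failure as an "error" chat message
}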

Backend Architecture (pkg/waveai/)

Core Interface

AIBackend Interface:

type AIBackend interface {
    StreamCompletion(
        ctx context.Context,
        request wshrpc.WaveAIStreamRequest,
    ) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType]
}

Backend Implementations

1. OpenAIBackend (openaibackend.go)

  • Providers: OpenAI, Azure OpenAI, Cloudflare Azure
  • Features:
    • Reasoning model support (o1, o3, o4, gpt-5)
    • Proxy support
    • Multiple API types (OpenAI, Azure, AzureAD, CloudflareAzure)
  • Streaming: Uses go-openai library for SSE streaming

2. AnthropicBackend (anthropicbackend.go)

  • Provider: Anthropic Claude
  • Features:
    • Custom SSE parser for Anthropic's event format
    • System message handling
    • Usage token tracking
  • Events: message_start, content_block_delta, message_stop, etc.

3. WaveAICloudBackend (cloudbackend.go)

  • Provider: Wave's cloud proxy service
  • Transport: WebSocket connection to Wave cloud
  • Features:
    • Fallback when no API token/baseURL provided
    • Built-in rate limiting and abuse protection

4. PerplexityBackend (perplexitybackend.go)

  • Provider: Perplexity AI
  • Implementation: Mirrors the OpenAI backend (Perplexity exposes an OpenAI-compatible API)

5. GoogleBackend (googlebackend.go)

  • Provider: Google AI (Gemini)
  • Implementation: Custom integration for Google's API

Backend Routing Logic

func RunAICommand(ctx context.Context, request wshrpc.WaveAIStreamRequest) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType] {
    var backend AIBackend
    // Route based on request.Opts.APIType:
    switch request.Opts.APIType {
    case "anthropic":
        backend = AnthropicBackend{}
    case "perplexity":
        backend = PerplexityBackend{}
    case "google":
        backend = GoogleBackend{}
    default:
        if IsCloudAIRequest(request.Opts) {
            backend = WaveAICloudBackend{}
        } else {
            backend = OpenAIBackend{}
        }
    }
    return backend.StreamCompletion(ctx, request)
}

RPC Communication Layer

WSH RPC Integration

Command: streamwaveai
Type: Response Stream (one request, multiple responses)

Request Type (WaveAIStreamRequest):

type WaveAIStreamRequest struct {
    ClientId string                    `json:"clientid,omitempty"`
    Opts     *WaveAIOptsType           `json:"opts"`
    Prompt   []WaveAIPromptMessageType `json:"prompt"`
}

Response Type (WaveAIPacketType):

type WaveAIPacketType struct {
    Type         string           `json:"type"`
    Model        string           `json:"model,omitempty"`
    Created      int64            `json:"created,omitempty"`
    FinishReason string           `json:"finish_reason,omitempty"`
    Usage        *WaveAIUsageType `json:"usage,omitempty"`
    Index        int              `json:"index,omitempty"`
    Text         string           `json:"text,omitempty"`
    Error        string           `json:"error,omitempty"`
}

Configuration Types

AI Options (WaveAIOptsType):

type WaveAIOptsType struct {
    Model      string `json:"model"`
    APIType    string `json:"apitype,omitempty"`
    APIToken   string `json:"apitoken"`
    OrgID      string `json:"orgid,omitempty"`
    APIVersion string `json:"apiversion,omitempty"`
    BaseURL    string `json:"baseurl,omitempty"`
    ProxyURL   string `json:"proxyurl,omitempty"`
    MaxTokens  int    `json:"maxtokens,omitempty"`
    MaxChoices int    `json:"maxchoices,omitempty"`
    TimeoutMs  int    `json:"timeoutms,omitempty"`
}
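For instance, a frontend options object targeting a local OpenAI-compatible server might look like this (field names follow the JSON tags above; all values are hypothetical):

const localOpts: WaveAIOptsType = {
    model: "llama3",
    apitype: "openai",
    apitoken: "",                             // empty for an unauthenticated local server
    baseurl: "http://localhost:11434/v1",     // e.g. an Ollama OpenAI-compatible endpoint
    maxtokens: 2048,
    timeoutms: 60000,
};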

Data Persistence

Chat History Storage

Frontend:

  • Method: fetchWaveFile(blockId, "aidata")
  • Format: JSON array of WaveAIPromptMessageType
  • Sliding Window: Last 30 messages (slidingWindowSize = 30; see the sketch after the Backend notes below)

Backend:

  • Service: BlockService.SaveWaveAiData(blockId, history)
  • Storage: Block-associated file storage
  • Persistence: Automatic save after each complete exchange
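A sketch of the full load/trim/save cycle (it assumes fetchWaveFile resolves to the raw file bytes and that SaveWaveAiData is reachable through the frontend service layer; both assumptions are noted in the comments):

const slidingWindowSize = 30;

// Load prior history from the block's "aidata" file (return shape assumed).
const file = await fetchWaveFile(blockId, "aidata");
const history: WaveAIPromptMessageType[] = file?.data ? JSON.parse(new TextDecoder().decode(file.data)) : [];

// After a completed exchange, append the new messages, keep only the newest
// slidingWindowSize entries, and persist via the service layer.
const trimmed = [...history, userMsg, assistantMsg].slice(-slidingWindowSize);
await BlockService.SaveWaveAiData(blockId, trimmed);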

Message Format

UI Messages (ChatMessageType):

interface ChatMessageType {
    id: string;
    user: string;        // "user" | "assistant" | "error"
    text: string;
    isUpdating?: boolean;
}

Stored Messages (WaveAIPromptMessageType):

type WaveAIPromptMessageType struct {
    Role    string `json:"role"`     // "user" | "assistant" | "system" | "error"
    Content string `json:"content"`
    Name    string `json:"name,omitempty"`
}
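Converting between the two is a direct field mapping, since the UI's user value doubles as the stored role (a sketch; the filter drops an in-flight typing indicator before persisting):

function toPromptMessages(messages: ChatMessageType[]): WaveAIPromptMessageType[] {
    return messages
        .filter((m) => !m.isUpdating)                     // skip the in-progress placeholder
        .map((m) => ({ role: m.user, content: m.text }));
}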

Error Handling

Frontend Error Handling

  1. Network Errors: Caught in streaming loop, displayed as error messages
  2. Empty Responses: Automatically remove typing indicator
  3. Cancellation: User can cancel via stop button (model.cancel = true); see the sketch after this list
  4. Partial Responses: Saved even if incomplete due to errors
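Cancellation is cooperative: the stop button flips a flag that the streaming loop checks between packets, as in this sketch (handlePacket and setLocked are illustrative names):

for await (const packet of aiGen) {
    if (this.cancel) {
        break;                    // stop button set model.cancel = true
    }
    handlePacket(packet);
}
// Any partial text already rendered stays in the chat (see item 4).
this.cancel = false;              // reset the flag for the next request
setLocked(false);                 // clear the locked atom to re-enable input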

Backend Error Handling

  1. Panic Recovery: All backends use panichandler.PanicHandler()
  2. Context Cancellation: Proper cleanup on request cancellation
  3. Provider Errors: Wrapped and forwarded to frontend
  4. Connection Errors: Detailed error messages for debugging

UI Features

Message Rendering

  • Markdown Support: Full markdown rendering with syntax highlighting
  • Role-based Styling: Different colors/layouts for user/assistant/error messages
  • Typing Indicator: Animated dots during AI response
  • Font Configuration: Configurable font sizes via presets

Input Handling

  • Auto-resize: Textarea grows/shrinks with content (max 5 lines)
  • Keyboard Navigation:
    • Enter to send
    • Cmd+L to clear history
    • Arrow keys for code block selection
  • Code Block Selection: Navigate through code blocks in responses

Scroll Management

  • Auto-scroll: Automatically scrolls to new messages
  • User Scroll Detection: Pauses auto-scroll when user manually scrolls
  • Smart Resume: Resumes auto-scroll when near bottom

Configuration Management

Preset System

Preset Structure:

{
  "ai@preset-name": {
    "display:name": "Preset Display Name",
    "display:order": 1,
    "ai:model": "gpt-4",
    "ai:apitype": "openai",
    "ai:apitoken": "sk-...",
    "ai:baseurl": "https://api.openai.com/v1",
    "ai:maxtokens": 4000,
    "ai:fontsize": "14px",
    "ai:fixedfontsize": "12px"
  }
}

Configuration Keys:

  • ai:model - AI model name
  • ai:apitype - Provider type (openai, anthropic, perplexity, google)
  • ai:apitoken - API authentication token
  • ai:baseurl - Custom API endpoint
  • ai:proxyurl - HTTP proxy URL
  • ai:maxtokens - Maximum response tokens
  • ai:timeoutms - Request timeout
  • ai:fontsize - UI font size
  • ai:fixedfontsize - Code block font size

Provider Detection

The UI automatically detects and displays the active provider:

  • Cloud: Wave's proxy (no token/baseURL)
  • Local: localhost/127.0.0.1 endpoints
  • Remote: External API endpoints
  • Provider-specific: Anthropic and Perplexity, shown with custom provider icons
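A sketch of that detection as a pure function over the merged options (illustrative; in the source this logic feeds the viewText header atom):

function detectProvider(opts: WaveAIOptsType): string {
    if (opts.apitype === "anthropic" || opts.apitype === "perplexity") {
        return opts.apitype;                  // provider-specific icon
    }
    if (!opts.apitoken && !opts.baseurl) {
        return "cloud";                       // falls back to Wave's proxy
    }
    try {
        const host = new URL(opts.baseurl).hostname;
        if (host === "localhost" || host === "127.0.0.1") {
            return "local";
        }
    } catch {
        // unparseable baseurl: fall through and treat it as remote
    }
    return "remote";
}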

Performance Considerations

Frontend Optimizations

  • Jotai Atoms: Granular reactivity, only re-render affected components
  • Memo Components: ChatWindow and ChatItem are memoized
  • Throttled Scrolling: Scroll events throttled to 100ms
  • Debounced Scroll Detection: User scroll detection debounced to 300ms
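A sketch of the throttle/debounce pairing using the throttle-debounce package (an assumed dependency; the handler internals and isNearBottom helper are illustrative):

import { throttle, debounce } from "throttle-debounce";

// Auto-scroll runs at most once per 100ms while tokens stream in.
const scrollToBottomThrottled = throttle(100, () => {
    if (!userScrolled) {
        chatWindowRef.current?.scrollTo({ top: chatWindowRef.current.scrollHeight });
    }
});

// User-scroll detection settles 300ms after the last scroll event;
// auto-scroll resumes once the user is back near the bottom.
const onUserScroll = debounce(300, () => {
    userScrolled = !isNearBottom(chatWindowRef.current);
});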

Backend Optimizations

  • Streaming: All responses are streamed for immediate feedback
  • Context Cancellation: Proper cleanup prevents resource leaks
  • Connection Pooling: HTTP clients reuse connections
  • Error Recovery: Graceful degradation on provider failures

Security Considerations

API Token Handling

  • Storage: Tokens stored in encrypted configuration
  • Transmission: Tokens only sent to configured endpoints
  • Validation: Backend validates token format and permissions

Request Validation

  • Input Sanitization: User input validated before sending
  • Rate Limiting: Cloud backend includes built-in rate limiting
  • Error Filtering: Sensitive error details filtered from UI

Extension Points

Adding New Providers

  1. Implement AIBackend Interface: Create new backend struct
  2. Add Provider Detection: Update RunAICommand() routing logic
  3. Add Configuration: Define provider-specific config keys
  4. Update UI: Add provider detection in viewText atom

Custom Message Types

  1. Extend ChatMessageType: Add new user types
  2. Update ChatItem Rendering: Handle new message types
  3. Modify Storage: Update persistence format if needed

This architecture provides a flexible, extensible foundation for AI chat functionality while maintaining clean separation between UI, business logic, and provider integrations.