Wave AI Architecture Documentation
Overview
Wave AI is a chat-based AI assistant feature integrated into Wave Terminal. It provides a conversational interface for interacting with various AI providers (OpenAI, Anthropic, Perplexity, Google, and Wave's cloud proxy) through a unified streaming architecture. The feature is implemented as a block view within Wave Terminal's modular system.
Architecture Components
Frontend Architecture (frontend/app/view/waveai/)
Core Components
1. WaveAiModel Class
- Purpose: Main view model implementing the ViewModel interface
- Responsibilities:
- State management using Jotai atoms
- Configuration management (presets, AI options)
- Message handling and persistence
- RPC communication with backend
- UI state coordination
2. AiWshClient Class
- Purpose: Specialized WSH RPC client for AI operations
- Extends: WshClient
- Responsibilities:
- Handle incoming aisendmessage RPC calls
- Route messages to the model's sendMessage method
3. React Components
- WaveAi: Main container component
- ChatWindow: Scrollable message display with auto-scroll behavior
- ChatItem: Individual message renderer with role-based styling
- ChatInput: Auto-resizing textarea with keyboard navigation
State Management (Jotai Atoms)
Message State:
messagesAtom: PrimitiveAtom<Array<ChatMessageType>>
messagesSplitAtom: SplitAtom<Array<ChatMessageType>>
latestMessageAtom: Atom<ChatMessageType>
addMessageAtom: WritableAtom<unknown, [message: ChatMessageType], void>
updateLastMessageAtom: WritableAtom<unknown, [text: string, isUpdating: boolean], void>
removeLastMessageAtom: WritableAtom<unknown, [], void>
Configuration State:
presetKey: Atom<string> // Current AI preset selection
presetMap: Atom<{[k: string]: MetaType}> // Available AI presets
mergedPresets: Atom<MetaType> // Merged configuration hierarchy
aiOpts: Atom<WaveAIOptsType> // Final AI options for requests
UI State:
locked: PrimitiveAtom<boolean> // Prevents input during AI response
viewIcon: Atom<string> // Header icon
viewName: Atom<string> // Header title
viewText: Atom<HeaderElem[]> // Dynamic header elements
endIconButtons: Atom<IconButtonDecl[]> // Header action buttons
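The write-only atoms encapsulate list updates so components never mutate message state directly. A minimal sketch of how addMessageAtom and updateLastMessageAtom could be built on messagesAtom (the exact update semantics, such as whether streamed text is appended or replaced, are assumptions here):

import { atom } from "jotai";

interface ChatMessageType {
    id: string;
    user: string;
    text: string;
    isUpdating?: boolean;
}

const messagesAtom = atom<Array<ChatMessageType>>([]);

// Write-only atom in the style of addMessageAtom: appends one message.
const addMessageAtom = atom(null, (get, set, message: ChatMessageType) => {
    set(messagesAtom, [...get(messagesAtom), message]);
});

// Write-only atom in the style of updateLastMessageAtom: appends streamed
// text to the last message; isUpdating drives the typing indicator.
const updateLastMessageAtom = atom(null, (get, set, text: string, isUpdating: boolean) => {
    const messages = get(messagesAtom);
    if (messages.length === 0) return;
    const last = messages[messages.length - 1];
    set(messagesAtom, [...messages.slice(0, -1), { ...last, text: last.text + text, isUpdating }]);
});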
Configuration Hierarchy
The AI configuration follows a three-tier hierarchy (lowest to highest priority):
- Global Settings: atoms.settingsAtom["ai:*"]
- Preset Configuration: presets[presetKey]["ai:*"]
- Block Metadata: block.meta["ai:*"]
Configuration is merged using the mergeMeta() utility, allowing fine-grained overrides at each level.
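For illustration, a simplified stand-in for the merge (the real mergeMeta handles more cases; here it is assumed to be a shallow, key-wise merge where higher-priority values win):

type MetaType = { [key: string]: any };

// Simplified stand-in for mergeMeta: shallow key-wise merge, later wins.
function mergeMeta(base: MetaType, overrides: MetaType): MetaType {
    return { ...base, ...overrides };
}

const globalSettings = { "ai:model": "gpt-4", "ai:maxtokens": 2000 };
const preset = { "ai:model": "claude-3-5-sonnet", "ai:apitype": "anthropic" };
const blockMeta = { "ai:maxtokens": 4000 };

// Lowest to highest priority: global settings, then preset, then block metadata.
const merged = mergeMeta(mergeMeta(globalSettings, preset), blockMeta);
// => { "ai:model": "claude-3-5-sonnet", "ai:apitype": "anthropic", "ai:maxtokens": 4000 }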
Data Flow - Frontend
User Input → sendMessage() →
├── Add user message to UI
├── Create WaveAIStreamRequest
├── Call RpcApi.StreamWaveAiCommand()
├── Add typing indicator
└── Stream response handling:
├── Update message incrementally
├── Handle errors
└── Save complete conversation
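A condensed sketch of this flow in TypeScript (the helper names on the model are hypothetical; treating the streamed response as an async iterable of WaveAIPacketType is an assumption based on the RPC layer described below):

async function sendMessage(model: WaveAiModel, text: string) {
    model.addMessage({ id: crypto.randomUUID(), user: "user", text });
    const request: WaveAIStreamRequest = {
        clientid: model.clientId,
        opts: model.aiOpts,              // merged configuration (see above)
        prompt: model.buildPrompt(text), // sliding-window history + new message
    };
    model.addMessage({ id: crypto.randomUUID(), user: "assistant", text: "", isUpdating: true }); // typing indicator
    try {
        for await (const packet of RpcApi.StreamWaveAiCommand(request)) {
            if (packet.error) throw new Error(packet.error);
            if (packet.text) model.updateLastMessage(packet.text, true);
        }
        model.updateLastMessage("", false); // finalize the assistant message
    } catch (e) {
        model.addErrorMessage(String(e));  // partial text is kept
    }
    await model.saveConversation();        // persist the complete exchange
}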
Backend Architecture (pkg/waveai/)
Core Interface
AIBackend Interface:
type AIBackend interface {
StreamCompletion(
ctx context.Context,
request wshrpc.WaveAIStreamRequest,
) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType]
}
Backend Implementations
1. OpenAIBackend (openaibackend.go)
- Providers: OpenAI, Azure OpenAI, Cloudflare Azure
- Features:
- Reasoning model support (o1, o3, o4, gpt-5)
- Proxy support
- Multiple API types (OpenAI, Azure, AzureAD, CloudflareAzure)
- Streaming: Uses the go-openai library for SSE streaming
2. AnthropicBackend (anthropicbackend.go)
- Provider: Anthropic Claude
- Features:
- Custom SSE parser for Anthropic's event format
- System message handling
- Usage token tracking
- Events:
message_start, content_block_delta, message_stop, etc.
3. WaveAICloudBackend (cloudbackend.go)
- Provider: Wave's cloud proxy service
- Transport: WebSocket connection to Wave cloud
- Features:
- Fallback when no API token/baseURL provided
- Built-in rate limiting and abuse protection
4. PerplexityBackend (perplexitybackend.go)
- Provider: Perplexity AI
- Implementation: Similar to OpenAI backend
5. GoogleBackend (googlebackend.go)
- Provider: Google AI (Gemini)
- Implementation: Custom integration for Google's API
Backend Routing Logic
func RunAICommand(ctx context.Context, request wshrpc.WaveAIStreamRequest) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType] {
// Route based on request.Opts.APIType:
switch request.Opts.APIType {
case "anthropic":
backend = AnthropicBackend{}
case "perplexity":
backend = PerplexityBackend{}
case "google":
backend = GoogleBackend{}
default:
if IsCloudAIRequest(request.Opts) {
backend = WaveAICloudBackend{}
} else {
backend = OpenAIBackend{}
}
}
return backend.StreamCompletion(ctx, request)
}
RPC Communication Layer
WSH RPC Integration
Command: streamwaveai
Type: Response Stream (one request, multiple responses)
Request Type (WaveAIStreamRequest):
type WaveAIStreamRequest struct {
ClientId string `json:"clientid,omitempty"`
Opts *WaveAIOptsType `json:"opts"`
Prompt []WaveAIPromptMessageType `json:"prompt"`
}
Response Type (WaveAIPacketType):
type WaveAIPacketType struct {
Type string `json:"type"`
Model string `json:"model,omitempty"`
Created int64 `json:"created,omitempty"`
FinishReason string `json:"finish_reason,omitempty"`
Usage *WaveAIUsageType `json:"usage,omitempty"`
Index int `json:"index,omitempty"`
Text string `json:"text,omitempty"`
Error string `json:"error,omitempty"`
}
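On the frontend, packets can be folded into a final result as the stream arrives. A sketch, assuming a TypeScript mirror of the packet whose field names follow the JSON tags above, and that packets arrive in order:

interface WaveAIPacketType {
    type: string;
    model?: string;
    created?: number;
    finish_reason?: string;
    usage?: unknown; // WaveAIUsageType; fields omitted here
    index?: number;
    text?: string;
    error?: string;
}

interface WaveAIResult {
    text: string;
    model?: string;
    finishReason?: string;
}

function foldPacket(acc: WaveAIResult, pkt: WaveAIPacketType): WaveAIResult {
    if (pkt.error) throw new Error(pkt.error); // provider errors surface here
    return {
        text: acc.text + (pkt.text ?? ""),
        model: pkt.model ?? acc.model,
        finishReason: pkt.finish_reason ?? acc.finishReason,
    };
}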
Configuration Types
AI Options (WaveAIOptsType):
type WaveAIOptsType struct {
Model string `json:"model"`
APIType string `json:"apitype,omitempty"`
APIToken string `json:"apitoken"`
OrgID string `json:"orgid,omitempty"`
APIVersion string `json:"apiversion,omitempty"`
BaseURL string `json:"baseurl,omitempty"`
ProxyURL string `json:"proxyurl,omitempty"`
MaxTokens int `json:"maxtokens,omitempty"`
MaxChoices int `json:"maxchoices,omitempty"`
TimeoutMs int `json:"timeoutms,omitempty"`
}
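An illustrative options value (field names follow the JSON tags above; the values are examples only):

const aiOpts: WaveAIOptsType = {
    model: "gpt-4",
    apitype: "openai",
    apitoken: "sk-...",                    // placeholder token
    baseurl: "https://api.openai.com/v1",
    maxtokens: 4000,
    timeoutms: 60000,
};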
Data Persistence
Chat History Storage
Frontend:
- Method: fetchWaveFile(blockId, "aidata")
- Format: JSON array of WaveAIPromptMessageType
- Sliding Window: Last 30 messages (slidingWindowSize = 30); see the sketch below
Backend:
- Service: BlockService.SaveWaveAiData(blockId, history)
- Storage: Block-associated file storage
- Persistence: Automatic save after each complete exchange
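A minimal sketch of the trim-and-save step (BlockService.SaveWaveAiData is named above; applying the trim at save time is an assumption):

const slidingWindowSize = 30;

// Keep only the most recent messages so the stored "aidata" file stays bounded.
function trimHistory(history: WaveAIPromptMessageType[]): WaveAIPromptMessageType[] {
    return history.slice(-slidingWindowSize);
}

async function saveConversation(blockId: string, history: WaveAIPromptMessageType[]) {
    await BlockService.SaveWaveAiData(blockId, trimHistory(history));
}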
Message Format
UI Messages (ChatMessageType):
interface ChatMessageType {
id: string;
user: string; // "user" | "assistant" | "error"
text: string;
isUpdating?: boolean;
}
Stored Messages (WaveAIPromptMessageType):
type WaveAIPromptMessageType struct {
Role string `json:"role"` // "user" | "assistant" | "system" | "error"
Content string `json:"content"`
Name string `json:"name,omitempty"`
}
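A sketch of converting UI messages into the stored prompt format (filtering out "error" messages before sending is an assumption):

interface WaveAIPromptMessageType {
    role: string; // "user" | "assistant" | "system" | "error"
    content: string;
    name?: string;
}

function toPromptHistory(messages: ChatMessageType[]): WaveAIPromptMessageType[] {
    return messages
        .filter((m) => m.user === "user" || m.user === "assistant")
        .map((m) => ({ role: m.user, content: m.text }));
}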
Error Handling
Frontend Error Handling
- Network Errors: Caught in streaming loop, displayed as error messages
- Empty Responses: Automatically remove typing indicator
- Cancellation: User can cancel via the stop button (model.cancel = true)
- Partial Responses: Saved even if incomplete due to errors
Backend Error Handling
- Panic Recovery: All backends use panichandler.PanicHandler()
- Context Cancellation: Proper cleanup on request cancellation
- Provider Errors: Wrapped and forwarded to frontend
- Connection Errors: Detailed error messages for debugging
UI Features
Message Rendering
- Markdown Support: Full markdown rendering with syntax highlighting
- Role-based Styling: Different colors/layouts for user/assistant/error messages
- Typing Indicator: Animated dots during AI response
- Font Configuration: Configurable font sizes via presets
Input Handling
- Auto-resize: Textarea grows/shrinks with content (max 5 lines)
- Keyboard Navigation:
- Enter to send
- Cmd+L to clear history
- Arrow keys for code block selection
- Code Block Selection: Navigate through code blocks in responses
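A simplified sketch of the keyboard handling (the model methods are hypothetical; the real handler also tracks code-block selection state):

function handleKeyDown(e: React.KeyboardEvent<HTMLTextAreaElement>, model: WaveAiModel) {
    if (e.key === "Enter" && !e.shiftKey) {
        e.preventDefault();
        model.sendMessage(e.currentTarget.value); // Enter to send
    } else if (e.key === "l" && e.metaKey) {
        e.preventDefault();
        model.clearHistory();                     // Cmd+L to clear history
    } else if (e.key === "ArrowUp" || e.key === "ArrowDown") {
        model.moveCodeBlockSelection(e.key === "ArrowUp" ? -1 : 1);
    }
}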
Scroll Management
- Auto-scroll: Automatically scrolls to new messages
- User Scroll Detection: Pauses auto-scroll when user manually scrolls
- Smart Resume: Resumes auto-scroll when near bottom
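A sketch of the scroll policy (the near-bottom threshold is an assumed value):

const NEAR_BOTTOM_PX = 30; // assumed threshold for "near bottom"

// Called (throttled) on user scroll: follow only while near the bottom.
function onScroll(el: HTMLElement, state: { following: boolean }) {
    const distFromBottom = el.scrollHeight - el.scrollTop - el.clientHeight;
    state.following = distFromBottom < NEAR_BOTTOM_PX;
}

// Called when a message is added or updated.
function onNewMessage(el: HTMLElement, state: { following: boolean }) {
    if (state.following) {
        el.scrollTop = el.scrollHeight; // auto-scroll to the latest message
    }
}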
Configuration Management
Preset System
Preset Structure:
{
"ai@preset-name": {
"display:name": "Preset Display Name",
"display:order": 1,
"ai:model": "gpt-4",
"ai:apitype": "openai",
"ai:apitoken": "sk-...",
"ai:baseurl": "https://api.openai.com/v1",
"ai:maxtokens": 4000,
"ai:fontsize": "14px",
"ai:fixedfontsize": "12px"
}
}
Configuration Keys:
- ai:model - AI model name
- ai:apitype - Provider type (openai, anthropic, perplexity, google)
- ai:apitoken - API authentication token
- ai:baseurl - Custom API endpoint
- ai:proxyurl - HTTP proxy URL
- ai:maxtokens - Maximum response tokens
- ai:timeoutms - Request timeout in milliseconds
- ai:fontsize - UI font size
- ai:fixedfontsize - Code block font size
Provider Detection
The UI automatically detects and displays the active provider:
- Cloud: Wave's proxy (no token/baseURL)
- Local: localhost/127.0.0.1 endpoints
- Remote: External API endpoints
- Provider-specific: Anthropic, Perplexity with custom icons
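A sketch of these rules (heuristics only; the real viewText atom also selects icons and display names):

function detectProvider(opts: WaveAIOptsType): string {
    if (opts.apitype === "anthropic" || opts.apitype === "perplexity") {
        return opts.apitype; // provider-specific icon
    }
    if (!opts.apitoken && !opts.baseurl) {
        return "cloud";      // falls back to Wave's proxy
    }
    const baseurl = opts.baseurl ?? "";
    if (baseurl.includes("localhost") || baseurl.includes("127.0.0.1")) {
        return "local";
    }
    return "remote";
}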
Performance Considerations
Frontend Optimizations
- Jotai Atoms: Granular reactivity, only re-render affected components
- Memo Components: ChatWindow and ChatItem are memoized
- Throttled Scrolling: Scroll events throttled to 100ms
- Debounced Scroll Detection: User scroll detection debounced to 300ms
Backend Optimizations
- Streaming: All responses are streamed for immediate feedback
- Context Cancellation: Proper cleanup prevents resource leaks
- Connection Pooling: HTTP clients reuse connections
- Error Recovery: Graceful degradation on provider failures
Security Considerations
API Token Handling
- Storage: Tokens stored in encrypted configuration
- Transmission: Tokens only sent to configured endpoints
- Validation: Backend validates token format and permissions
Request Validation
- Input Sanitization: User input validated before sending
- Rate Limiting: Cloud backend includes built-in rate limiting
- Error Filtering: Sensitive error details filtered from UI
Extension Points
Adding New Providers
- Implement AIBackend Interface: Create new backend struct
- Add Provider Detection: Update the RunAICommand() routing logic
- Add Configuration: Define provider-specific config keys
- Update UI: Add provider detection in the viewText atom
Custom Message Types
- Extend ChatMessageType: Add new user types
- Update ChatItem Rendering: Handle new message types
- Modify Storage: Update persistence format if needed
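For example, a hypothetical "system" UI role could be added like this (all names are illustrative):

type ChatUser = "user" | "assistant" | "error" | "system"; // extended union

interface ChatMessageType {
    id: string;
    user: ChatUser;
    text: string;
    isUpdating?: boolean;
}

// ChatItem then branches on the role for styling.
function classForUser(user: ChatUser): string {
    return `chat-msg-${user}`; // e.g. "chat-msg-system" (hypothetical CSS class)
}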
This architecture provides a flexible, extensible foundation for AI chat functionality while maintaining clean separation between UI, business logic, and provider integrations.