# Wave AI Architecture Documentation

## Overview

Wave AI is a chat-based AI assistant integrated into Wave Terminal. It provides a conversational interface for interacting with various AI providers (OpenAI, Anthropic, Perplexity, Google, and Wave's cloud proxy) through a unified streaming architecture. The feature is implemented as a block view within Wave Terminal's modular system.

## Architecture Components

### Frontend Architecture (`frontend/app/view/waveai/`)

#### Core Components

**1. WaveAiModel Class**

- **Purpose**: Main view model implementing the `ViewModel` interface
- **Responsibilities**:
    - State management using Jotai atoms
    - Configuration management (presets, AI options)
    - Message handling and persistence
    - RPC communication with the backend
    - UI state coordination

**2. AiWshClient Class**

- **Purpose**: Specialized WSH RPC client for AI operations
- **Extends**: `WshClient`
- **Responsibilities**:
    - Handle incoming `aisendmessage` RPC calls
    - Route messages to the model's `sendMessage` method

**3. React Components**

- **WaveAi**: Main container component
- **ChatWindow**: Scrollable message display with auto-scroll behavior
- **ChatItem**: Individual message renderer with role-based styling
- **ChatInput**: Auto-resizing textarea with keyboard navigation

#### State Management (Jotai Atoms)

**Message State**:

```typescript
messagesAtom: PrimitiveAtom<Array<ChatMessageType>>
messagesSplitAtom: SplitAtom<Array<ChatMessageType>>
latestMessageAtom: Atom<ChatMessageType>
addMessageAtom: WritableAtom<unknown, [message: ChatMessageType], void>
updateLastMessageAtom: WritableAtom<unknown, [text: string, isUpdating: boolean], void>
removeLastMessageAtom: WritableAtom<unknown, [], void>
```

**Configuration State**:

```typescript
presetKey: Atom<string>                  // Current AI preset selection
presetMap: Atom<{[k: string]: MetaType}> // Available AI presets
mergedPresets: Atom<MetaType>            // Merged configuration hierarchy
aiOpts: Atom<WaveAIOptsType>             // Final AI options for requests
```

**UI State**:

```typescript
locked: PrimitiveAtom<boolean>           // Prevents input during AI response
viewIcon: Atom<string>                   // Header icon
viewName: Atom<string>                   // Header title
viewText: Atom<HeaderElem[]>             // Dynamic header elements
endIconButtons: Atom<IconButtonDecl[]>   // Header action buttons
```
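
The write-only atoms above follow Jotai's `atom(null, write)` pattern. Below is a minimal sketch of two of them, assuming messages live in a primitive `messagesAtom`; the real implementations may differ in detail (for example, whether `text` is a streamed delta or the full message is an assumption here).

```typescript
import { atom, PrimitiveAtom } from "jotai";

interface ChatMessageType {
    id: string;
    user: string;
    text: string;
    isUpdating?: boolean;
}

const messagesAtom: PrimitiveAtom<Array<ChatMessageType>> = atom<Array<ChatMessageType>>([]);

// Write-only atom: append a message to the conversation.
const addMessageAtom = atom(null, (get, set, message: ChatMessageType) => {
    set(messagesAtom, [...get(messagesAtom), message]);
});

// Write-only atom: update the last (streaming) message in place.
// Treating `text` as an appended delta is an assumption in this sketch.
const updateLastMessageAtom = atom(null, (get, set, text: string, isUpdating: boolean) => {
    const messages = get(messagesAtom);
    if (messages.length === 0) return;
    const last = messages[messages.length - 1];
    set(messagesAtom, [...messages.slice(0, -1), { ...last, text: last.text + text, isUpdating }]);
});
```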

#### Configuration Hierarchy

The AI configuration follows a three-tier hierarchy (lowest to highest priority):

1. **Global Settings**: `atoms.settingsAtom["ai:*"]`
2. **Preset Configuration**: `presets[presetKey]["ai:*"]`
3. **Block Metadata**: `block.meta["ai:*"]`

Configuration is merged with the `mergeMeta()` utility, allowing fine-grained overrides at each level, as in the sketch below.
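
A simplified sketch of that merge; the real `mergeMeta()` handles nested keys and deletion semantics, but the priority order is the same. The example values are hypothetical.

```typescript
type MetaType = { [key: string]: any };

// Later arguments win: block metadata overrides the preset,
// which overrides global settings.
function mergeAiMeta(settings: MetaType, preset: MetaType, blockMeta: MetaType): MetaType {
    return { ...settings, ...preset, ...blockMeta };
}

const merged = mergeAiMeta(
    { "ai:model": "gpt-4", "ai:maxtokens": 4000 },                // global settings
    { "ai:model": "claude-3-sonnet", "ai:apitype": "anthropic" }, // preset
    { "ai:maxtokens": 8000 }                                      // block metadata
);
// => { "ai:model": "claude-3-sonnet", "ai:apitype": "anthropic", "ai:maxtokens": 8000 }
```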

#### Data Flow - Frontend

```
User Input → sendMessage() →
    ├── Add user message to UI
    ├── Create WaveAIStreamRequest
    ├── Call RpcApi.StreamWaveAiCommand()
    ├── Add typing indicator
    └── Stream response handling:
        ├── Update message incrementally
        ├── Handle errors
        └── Save complete conversation
```
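
A sketch of this flow, assuming `RpcApi.StreamWaveAiCommand` can be consumed as an async iterable of response packets; the model helpers (`addMessage`, `updateLastMessage`, `buildPrompt`, `saveConversation`) are illustrative stand-ins for the real methods.

```typescript
async function sendMessage(model: WaveAiModel, text: string): Promise<void> {
    // Add the user message, then a placeholder that acts as the typing indicator.
    model.addMessage({ id: crypto.randomUUID(), user: "user", text });
    model.addMessage({ id: crypto.randomUUID(), user: "assistant", text: "", isUpdating: true });

    const request: WaveAIStreamRequest = {
        clientid: model.clientId,
        opts: model.getAiOpts(),
        prompt: model.buildPrompt(text),
    };
    try {
        for await (const packet of RpcApi.StreamWaveAiCommand(request)) {
            if (packet.text) {
                model.updateLastMessage(packet.text, true); // incremental update
            }
        }
        model.updateLastMessage("", false); // finalize the assistant message
    } catch (e) {
        model.addMessage({ id: crypto.randomUUID(), user: "error", text: String(e) });
    }
    model.saveConversation(); // partial responses are saved too
}
```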

### Backend Architecture (`pkg/waveai/`)

#### Core Interface

**AIBackend Interface**:

```go
type AIBackend interface {
    StreamCompletion(
        ctx context.Context,
        request wshrpc.WaveAIStreamRequest,
    ) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType]
}
```

#### Backend Implementations

**1. OpenAIBackend** (`openaibackend.go`)

- **Providers**: OpenAI, Azure OpenAI, Cloudflare Azure
- **Features**:
    - Reasoning model support (o1, o3, o4, gpt-5)
    - Proxy support
    - Multiple API types (OpenAI, Azure, AzureAD, CloudflareAzure)
- **Streaming**: Uses the `go-openai` library for SSE streaming

**2. AnthropicBackend** (`anthropicbackend.go`)

- **Provider**: Anthropic Claude
- **Features**:
    - Custom SSE parser for Anthropic's event format
    - System message handling
    - Usage token tracking
- **Events**: `message_start`, `content_block_delta`, `message_stop`, etc.

**3. WaveAICloudBackend** (`cloudbackend.go`)

- **Provider**: Wave's cloud proxy service
- **Transport**: WebSocket connection to Wave cloud
- **Features**:
    - Fallback when no API token/baseURL is provided
    - Built-in rate limiting and abuse protection

**4. PerplexityBackend** (`perplexitybackend.go`)

- **Provider**: Perplexity AI
- **Implementation**: Similar to the OpenAI backend

**5. GoogleBackend** (`googlebackend.go`)

- **Provider**: Google AI (Gemini)
- **Implementation**: Custom integration for Google's API

#### Backend Routing Logic

```go
func RunAICommand(ctx context.Context, request wshrpc.WaveAIStreamRequest) chan wshrpc.RespOrErrorUnion[wshrpc.WaveAIPacketType] {
    // Route based on request.Opts.APIType:
    var backend AIBackend
    switch request.Opts.APIType {
    case "anthropic":
        backend = AnthropicBackend{}
    case "perplexity":
        backend = PerplexityBackend{}
    case "google":
        backend = GoogleBackend{}
    default:
        if IsCloudAIRequest(request.Opts) {
            backend = WaveAICloudBackend{}
        } else {
            backend = OpenAIBackend{}
        }
    }
    return backend.StreamCompletion(ctx, request)
}
```

### RPC Communication Layer

#### WSH RPC Integration

**Command**: `streamwaveai`

**Type**: Response stream (one request, multiple responses)

**Request Type** (`WaveAIStreamRequest`):

```go
type WaveAIStreamRequest struct {
    ClientId string                    `json:"clientid,omitempty"`
    Opts     *WaveAIOptsType           `json:"opts"`
    Prompt   []WaveAIPromptMessageType `json:"prompt"`
}
```

**Response Type** (`WaveAIPacketType`):

```go
type WaveAIPacketType struct {
    Type         string           `json:"type"`
    Model        string           `json:"model,omitempty"`
    Created      int64            `json:"created,omitempty"`
    FinishReason string           `json:"finish_reason,omitempty"`
    Usage        *WaveAIUsageType `json:"usage,omitempty"`
    Index        int              `json:"index,omitempty"`
    Text         string           `json:"text,omitempty"`
    Error        string           `json:"error,omitempty"`
}
```
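
On the frontend, a consumer might fold these packets into a single response roughly as follows. This is a sketch; the field names mirror the JSON tags above.

```typescript
interface WaveAIPacket {
    type: string;
    finish_reason?: string;
    text?: string;
    error?: string;
}

// Accumulate streamed packets into the final response text.
function applyPacket(acc: { text: string; done: boolean }, packet: WaveAIPacket): void {
    if (packet.error) {
        throw new Error(packet.error); // surfaced to the user as an error message
    }
    if (packet.text) {
        acc.text += packet.text;
    }
    if (packet.finish_reason) {
        acc.done = true; // e.g. "stop" or "length"
    }
}
```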

#### Configuration Types

**AI Options** (`WaveAIOptsType`):

```go
type WaveAIOptsType struct {
    Model      string `json:"model"`
    APIType    string `json:"apitype,omitempty"`
    APIToken   string `json:"apitoken"`
    OrgID      string `json:"orgid,omitempty"`
    APIVersion string `json:"apiversion,omitempty"`
    BaseURL    string `json:"baseurl,omitempty"`
    ProxyURL   string `json:"proxyurl,omitempty"`
    MaxTokens  int    `json:"maxtokens,omitempty"`
    MaxChoices int    `json:"maxchoices,omitempty"`
    TimeoutMs  int    `json:"timeoutms,omitempty"`
}
```
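
The merged `ai:*` keys map onto these options before a request is sent. A plausible sketch of that conversion (the actual logic lives in the model's `aiOpts` atom and may differ):

```typescript
interface WaveAIOptsType {
    model: string;
    apitype?: string;
    apitoken: string;
    baseurl?: string;
    proxyurl?: string;
    maxtokens?: number;
    timeoutms?: number;
}

function toAiOpts(meta: { [key: string]: any }): WaveAIOptsType {
    return {
        model: meta["ai:model"],
        apitype: meta["ai:apitype"],
        apitoken: meta["ai:apitoken"] ?? "", // empty token routes to the cloud backend
        baseurl: meta["ai:baseurl"],
        proxyurl: meta["ai:proxyurl"],
        maxtokens: meta["ai:maxtokens"],
        timeoutms: meta["ai:timeoutms"],
    };
}
```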

### Data Persistence

#### Chat History Storage

**Frontend**:

- **Method**: `fetchWaveFile(blockId, "aidata")`
- **Format**: JSON array of `WaveAIPromptMessageType`
- **Sliding Window**: Last 30 messages (`slidingWindowSize = 30`); see the sketch below

**Backend**:

- **Service**: `BlockService.SaveWaveAiData(blockId, history)`
- **Storage**: Block-associated file storage
- **Persistence**: Automatic save after each complete exchange
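
A sketch of the load/save cycle; `fetchWaveFile` and `BlockService.SaveWaveAiData` are the calls named above, but their exact signatures are assumed here.

```typescript
const slidingWindowSize = 30;

interface WaveAIPromptMessageType {
    role: string;
    content: string;
    name?: string;
}

async function loadHistory(blockId: string): Promise<WaveAIPromptMessageType[]> {
    const data = await fetchWaveFile(blockId, "aidata");
    const history: WaveAIPromptMessageType[] = data ? JSON.parse(data) : [];
    return history.slice(-slidingWindowSize); // keep only the most recent messages
}

async function saveHistory(blockId: string, history: WaveAIPromptMessageType[]): Promise<void> {
    await BlockService.SaveWaveAiData(blockId, history.slice(-slidingWindowSize));
}
```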

#### Message Format

**UI Messages** (`ChatMessageType`):

```typescript
interface ChatMessageType {
    id: string;
    user: string; // "user" | "assistant" | "error"
    text: string;
    isUpdating?: boolean;
}
```

**Stored Messages** (`WaveAIPromptMessageType`):

```go
type WaveAIPromptMessageType struct {
    Role    string `json:"role"` // "user" | "assistant" | "system" | "error"
    Content string `json:"content"`
    Name    string `json:"name,omitempty"`
}
```
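
Converting between the two shapes is straightforward, since the stored `role` field corresponds to the UI `user` field. A sketch (the generated ids exist only for rendering):

```typescript
function promptToChatMessage(p: WaveAIPromptMessageType): ChatMessageType {
    return { id: crypto.randomUUID(), user: p.role, text: p.content };
}

function chatToPromptMessage(m: ChatMessageType): WaveAIPromptMessageType {
    return { role: m.user, content: m.text };
}
```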

### Error Handling

#### Frontend Error Handling

1. **Network Errors**: Caught in the streaming loop and displayed as error messages
2. **Empty Responses**: Typing indicator is removed automatically
3. **Cancellation**: User can cancel via the stop button (`model.cancel = true`); see the sketch below
4. **Partial Responses**: Saved even if incomplete due to errors
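
Cancellation can be implemented as a flag checked inside the streaming loop, roughly like this (a sketch, not the exact implementation):

```typescript
async function consumeStream(
    stream: AsyncIterable<{ text?: string }>,
    model: { cancel: boolean; updateLastMessage: (text: string, isUpdating: boolean) => void }
): Promise<void> {
    for await (const packet of stream) {
        if (model.cancel) {
            model.cancel = false; // reset the flag; keep the partial text
            break;
        }
        if (packet.text) {
            model.updateLastMessage(packet.text, true);
        }
    }
    model.updateLastMessage("", false); // mark the message complete
}
```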

#### Backend Error Handling

1. **Panic Recovery**: All backends use `panichandler.PanicHandler()`
2. **Context Cancellation**: Proper cleanup on request cancellation
3. **Provider Errors**: Wrapped and forwarded to the frontend
4. **Connection Errors**: Detailed error messages for debugging

### UI Features

#### Message Rendering

- **Markdown Support**: Full markdown rendering with syntax highlighting
- **Role-based Styling**: Different colors/layouts for user, assistant, and error messages
- **Typing Indicator**: Animated dots during AI response
- **Font Configuration**: Configurable font sizes via presets

#### Input Handling

- **Auto-resize**: Textarea grows/shrinks with content (max 5 lines)
- **Keyboard Navigation** (see the sketch below):
    - Enter to send
    - Cmd+L to clear history
    - Arrow keys for code block selection
- **Code Block Selection**: Navigate through code blocks in responses
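
A keydown handler wiring up these shortcuts might look like this. It is a sketch: Shift+Enter inserting a newline, and the exact arrow-key semantics, are assumptions.

```typescript
interface InputActions {
    send: () => void;
    clearHistory: () => void;
    selectCodeBlock: (direction: 1 | -1) => void;
}

function handleKeyDown(e: KeyboardEvent, actions: InputActions): void {
    if (e.key === "Enter" && !e.shiftKey) {
        e.preventDefault();
        actions.send();
    } else if (e.key === "l" && e.metaKey) {
        e.preventDefault(); // Cmd+L
        actions.clearHistory();
    } else if (e.key === "ArrowUp" || e.key === "ArrowDown") {
        actions.selectCodeBlock(e.key === "ArrowUp" ? -1 : 1);
    }
}
```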

#### Scroll Management

- **Auto-scroll**: Automatically scrolls to new messages
- **User Scroll Detection**: Pauses auto-scroll when the user scrolls manually
- **Smart Resume**: Resumes auto-scroll when near the bottom, as sketched below
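
The smart-resume behavior reduces to a near-bottom check; this is a sketch, and the pixel threshold is illustrative.

```typescript
const NEAR_BOTTOM_PX = 40;

function isNearBottom(el: HTMLElement): boolean {
    return el.scrollHeight - el.scrollTop - el.clientHeight < NEAR_BOTTOM_PX;
}

// Called when a new message (or a streaming update) arrives.
function maybeAutoScroll(container: HTMLElement, userScrolledUp: boolean): void {
    if (!userScrolledUp || isNearBottom(container)) {
        container.scrollTop = container.scrollHeight; // jump to the latest message
    }
}
```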

### Configuration Management

#### Preset System

**Preset Structure**:

```json
{
    "ai@preset-name": {
        "display:name": "Preset Display Name",
        "display:order": 1,
        "ai:model": "gpt-4",
        "ai:apitype": "openai",
        "ai:apitoken": "sk-...",
        "ai:baseurl": "https://api.openai.com/v1",
        "ai:maxtokens": 4000,
        "ai:fontsize": "14px",
        "ai:fixedfontsize": "12px"
    }
}
```

**Configuration Keys**:

- `ai:model` - AI model name
- `ai:apitype` - Provider type (openai, anthropic, perplexity, google)
- `ai:apitoken` - API authentication token
- `ai:baseurl` - Custom API endpoint
- `ai:proxyurl` - HTTP proxy URL
- `ai:maxtokens` - Maximum response tokens
- `ai:timeoutms` - Request timeout in milliseconds
- `ai:fontsize` - UI font size
- `ai:fixedfontsize` - Code block font size

#### Provider Detection

The UI automatically detects and displays the active provider (sketched below):

- **Cloud**: Wave's proxy (no token/baseURL)
- **Local**: localhost/127.0.0.1 endpoints
- **Remote**: External API endpoints
- **Provider-specific**: Anthropic, Perplexity with custom icons
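
A simplified version of those rules; the returned labels are illustrative, not the exact UI strings.

```typescript
function detectProvider(opts: { apitype?: string; apitoken?: string; baseurl?: string }): string {
    if (opts.apitype === "anthropic" || opts.apitype === "perplexity" || opts.apitype === "google") {
        return opts.apitype; // provider-specific icon
    }
    if (!opts.apitoken && !opts.baseurl) {
        return "cloud"; // falls back to Wave's proxy
    }
    if (opts.baseurl?.includes("localhost") || opts.baseurl?.includes("127.0.0.1")) {
        return "local";
    }
    return "remote";
}
```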

### Performance Considerations

#### Frontend Optimizations

- **Jotai Atoms**: Granular reactivity; only affected components re-render
- **Memoized Components**: `ChatWindow` and `ChatItem` are memoized
- **Throttled Scrolling**: Scroll events throttled to 100ms (see the sketch below)
- **Debounced Scroll Detection**: User scroll detection debounced to 300ms
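
The scroll throttling can be done with a generic helper like the one below; this is a sketch, and the codebase may use a library helper (e.g. lodash's `throttle`) instead.

```typescript
function throttle<T extends (...args: any[]) => void>(fn: T, waitMs: number): T {
    let last = 0;
    return ((...args: any[]) => {
        const now = Date.now();
        if (now - last >= waitMs) {
            last = now;
            fn(...args);
        }
    }) as T;
}

// Usage: container.addEventListener("scroll", throttle(onScroll, 100));
```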

#### Backend Optimizations

- **Streaming**: All responses are streamed for immediate feedback
- **Context Cancellation**: Proper cleanup prevents resource leaks
- **Connection Pooling**: HTTP clients reuse connections
- **Error Recovery**: Graceful degradation on provider failures

### Security Considerations

#### API Token Handling

- **Storage**: Tokens stored in encrypted configuration
- **Transmission**: Tokens only sent to configured endpoints
- **Validation**: Backend validates token format and permissions

#### Request Validation

- **Input Sanitization**: User input validated before sending
- **Rate Limiting**: The cloud backend includes built-in rate limiting
- **Error Filtering**: Sensitive error details filtered from the UI

### Extension Points

#### Adding New Providers

1. **Implement the AIBackend Interface**: Create a new backend struct
2. **Add Provider Detection**: Update the `RunAICommand()` routing logic
3. **Add Configuration**: Define provider-specific config keys
4. **Update the UI**: Add provider detection in the `viewText` atom

#### Custom Message Types

1. **Extend ChatMessageType**: Add new user types (see the sketch below)
2. **Update ChatItem Rendering**: Handle the new message types
3. **Modify Storage**: Update the persistence format if needed
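
A hypothetical example of steps 1 and 2, adding a "notice" type alongside the existing roles; the type name and class names are illustrative.

```typescript
// Extended message type: "notice" joins "user" | "assistant" | "error".
interface ChatMessageType {
    id: string;
    user: string;
    text: string;
    isUpdating?: boolean;
}

function classForMessage(msg: ChatMessageType): string {
    switch (msg.user) {
        case "user":
            return "chat-msg-user";
        case "assistant":
            return "chat-msg-assistant";
        case "error":
            return "chat-msg-error";
        case "notice":
            return "chat-msg-notice"; // styling branch for the new type
        default:
            return "chat-msg-generic";
    }
}
```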

This architecture provides a flexible, extensible foundation for AI chat functionality while maintaining a clean separation between UI, business logic, and provider integrations.