diff --git a/docs/API.md b/docs/API.md
new file mode 100644
index 0000000..6baa21f
--- /dev/null
+++ b/docs/API.md
@@ -0,0 +1,233 @@
+# API Reference
+
+This document describes the tool interface exposed to the LLM and the internal APIs for extending nanobot.
+
+## Tool Interface
+
+All tools implement the `Tool` interface from `src/agent/tools/base.ts`:
+
+```typescript
+interface Tool {
+  name: string; // Tool identifier
+  description: string; // LLM-readable description
+  parameters: Record; // JSON Schema object
+  execute(args: Record): Promise;
+}
+```
+
+## Built-in Tools
+
+### read_file
+
+Read a file from the filesystem.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| path | string | yes | Absolute or relative file path |
+| offset | number | no | Line number to start from (1-indexed) |
+| limit | number | no | Maximum number of lines to read |
+
+**Returns**: Line-numbered content (e.g., `1: first line\n2: second line`)
+
+### write_file
+
+Write content to a file, creating parent directories as needed.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| path | string | yes | File path to write |
+| content | string | yes | Content to write |
+
+**Returns**: Success message or error
+
+### edit_file
+
+Replace an exact string in a file.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| path | string | yes | File path to edit |
+| oldString | string | yes | Exact string to replace |
+| newString | string | yes | Replacement string |
+| replaceAll | boolean | no | Replace all occurrences |
+
+**Returns**: Success message or error if oldString not found
+
+### list_dir
+
+List files in a directory.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| path | string | yes | Directory path |
+| recursive | boolean | no | List recursively |
+
+**Returns**: One file/directory per line, directories suffixed with `/`
+
+### exec
+
+Execute a shell command.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| command | string | yes | Shell command to execute |
+| timeout | number | no | Timeout in seconds (default: 120) |
+| workdir | string | no | Working directory override |
+
+**Returns**: Combined stdout + stderr
+
+### web_search
+
+Search the web using Brave Search API.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| query | string | yes | Search query |
+| count | number | no | Number of results (default: 10) |
+
+**Returns**: JSON array of `{ title, url, snippet }` objects
+
+### web_fetch
+
+Fetch and parse a URL.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| url | string | yes | URL to fetch |
+| mode | string | no | `markdown` (default), `raw`, or `html` |
+
+**Returns**:
+- HTML pages: extracted readable text (via Readability)
+- JSON: pretty-printed JSON
+- Other: raw text
+
+### message
+
+Send a message to the current chat channel.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| content | string | yes | Message content |
+
+**Returns**: Success confirmation
+
+### spawn
+
+Spawn a background subagent for long-running tasks.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| task | string | yes | Task description for the subagent |
+
+**Returns**: Spawn confirmation with subagent ID
+
+### cron
+
+Manage scheduled tasks.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| action | string | yes | `list`, `add`, `remove`, `enable`, `disable`, `run`, `status` |
+| id | string | conditional | Job ID (for remove/enable/disable/run) |
+| name | string | conditional | Job name (for add) |
+| message | string | conditional | Task message (for add) |
+| schedule | string | conditional | Schedule expression (for add) |
+| deleteAfterRun | boolean | no | Delete after one execution |
+
+**Schedule formats**:
+- `every Ns/m/h/d` — e.g., `every 30m`
+- `at YYYY-MM-DD HH:MM` — one-time
+- Cron expression — e.g., `0 9 * * 1-5`
+
+**Returns**: Action-specific response (job list, confirmation, status)
+
+## Internal APIs
+
+### BaseChannel
+
+Extend to create new channel types:
+
+```typescript
+abstract class BaseChannel {
+  _bus: MessageBus;
+  abstract start(): Promise;
+  abstract stop(): void;
+  abstract send(chatId: string, content: string, metadata?: Record): Promise;
+  isAllowed(senderId: string, allowFrom: string[]): boolean;
+}
+```
+
+### MessageBus
+
+```typescript
+class MessageBus {
+  publishInbound(msg: InboundMessage): void;
+  consumeInbound(): Promise;
+  publishOutbound(msg: OutboundMessage): void;
+  consumeOutbound(): Promise;
+}
+```
+
+### InboundMessage
+
+```typescript
+type InboundMessage = {
+  channel: string; // 'mattermost', 'cli', 'system'
+  senderId: string; // User identifier
+  chatId: string; // Conversation identifier
+  content: string; // Message text
+  metadata: Record;
+  media?: string[]; // Optional media URLs
+};
+```
+
+### OutboundMessage
+
+```typescript
+type OutboundMessage = {
+  channel: string;
+  chatId: string;
+  content: string | null;
+  metadata: Record;
+  media?: string[];
+};
+```
+
+### LLMProvider
+
+```typescript
+class LLMProvider {
+  defaultModel: string;
+  chat(opts: ChatOptions): Promise<{ response: LLMResponse; responseMessages: ModelMessage[] }>;
+  chatWithRetry(opts: ChatOptions): Promise<{ response: LLMResponse; responseMessages: ModelMessage[] }>;
+}
+```
+
+### Session
+
+```typescript
+class Session {
+  key: string;
+  messages: SessionMessage[];
+  createdAt: string;
+  updatedAt: string;
+  lastConsolidated: number;
+  getHistory(maxMessages?: number): SessionMessage[];
+  clear(): void;
+}
+```
+
+### CronService
+
+```typescript
+class CronService {
+  listJobs(): CronJob[];
+  addJob(job: Omit): CronJob;
+  removeJob(id: string): boolean;
+  enableJob(id: string, enabled: boolean): boolean;
+  runJob(id: string): Promise;
+  status(): string;
+  start(): void;
+  stop(): void;
+}
+```
diff --git a/docs/Architecture.md b/docs/Architecture.md
new file mode 100644
index 0000000..bd443d2
--- /dev/null
+++ b/docs/Architecture.md
@@ -0,0 +1,150 @@
+# Architecture
+
+## Tech Stack
+
+| Layer | Technology |
+|-------|------------|
+| Runtime | Bun (v1.0+) |
+| Language | TypeScript (strict mode) |
+| LLM Abstraction | Vercel AI SDK v6 |
+| Validation | Zod v4 |
+| CLI | Commander |
+| Colors | picocolors |
+| Formatting | oxfmt (single quotes) |
+| Linting | oxlint |
+
+## Folder Structure
+
+```
+nanobot-ts/
+├── index.ts  # Entry point
+├── src/
+│   ├── agent/
+│   │   ├── loop.ts  # AgentLoop: LLM ↔ tool execution loop
+│   │   ├── context.ts  # ContextBuilder: system prompt assembly
+│   │   ├── memory.ts  # MemoryConsolidator: token management
+│   │   ├── skills.ts  # Skill loader from workspace
+│   │   ├── subagent.ts  # SubagentManager: background tasks
+│   │   └── tools/
+│   │       ├── base.ts  # Tool interface + ToolRegistry
+│   │       ├── filesystem.ts  # read_file, write_file, edit_file, list_dir
+│   │       ├── shell.ts  # exec
+│   │       ├── web.ts  # web_search, web_fetch
+│   │       ├── message.ts  # message
+│   │       ├── spawn.ts  # spawn
+│   │       └── cron.ts  # cron
+│   ├── channels/
+│   │   ├── base.ts  # BaseChannel abstract class
+│   │   ├── mattermost.ts  # Mattermost WebSocket + REST
+│   │   └── manager.ts  # ChannelManager lifecycle
+│   ├── bus/
+│   │   ├── types.ts  # InboundMessage, OutboundMessage schemas
+│   │   └── queue.ts  # AsyncQueue, MessageBus
+│   ├── provider/
+│   │   ├── types.ts  # LLMResponse, ToolCall, ChatOptions
+│   │   └── index.ts  # LLMProvider (AI SDK wrapper)
+│   ├── session/
+│   │   ├── types.ts  # SessionMessage, SessionMeta schemas
+│   │   └── manager.ts  # Session persistence (JSONL)
+│   ├── cron/
+│   │   ├── types.ts  # CronJob, CronSchedule schemas
+│   │   └── service.ts  # CronService
+│   ├── heartbeat/
+│   │   └── service.ts  # HeartbeatService
+│   ├── config/
+│   │   ├── types.ts  # Zod config schemas
+│   │   └── loader.ts  # loadConfig, env overrides
+│   └── cli/
+│       └── commands.ts  # gateway + agent commands
+├── templates/  # Default workspace files
+│   ├── SOUL.md  # Agent personality
+│   ├── USER.md  # User preferences
+│   ├── TOOLS.md  # Tool documentation
+│   ├── AGENTS.md  # Agent behavior rules
+│   ├── HEARTBEAT.md  # Periodic tasks
+│   └── memory/MEMORY.md  # Long-term memory
+└── skills/  # Bundled skills
+```
+
+## Data Flow
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                          Gateway Mode                           │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                 │
+│  Mattermost ──► BaseChannel ──► MessageBus ──► AgentLoop        │
+│      ▲                               │             │            │
+│      │                               ▼             ▼            │
+│      │                         OutboundQueue  LLMProvider       │
+│      │                               │             │            │
+│      └───────────────────────────────┘             ▼            │
+│                                              ToolRegistry       │
+│                                                    │            │
+│                                                    ▼            │
+│                                             Tool.execute()      │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+
+┌─────────────────────────────────────────────────────────────────┐
+│                           Agent Mode                            │
+├─────────────────────────────────────────────────────────────────┤
+│                                                                 │
+│  CLI stdin ──► processDirect() ──► AgentLoop ──► Response       │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Key Components
+
+### AgentLoop
+The core orchestrator. Consumes inbound messages, runs the LLM tool-calling loop, and publishes responses.
+
+1. Receives `InboundMessage` from bus
+2. Loads/creates session by key
+3. Builds context (system prompt + history)
+4. Calls LLM with tools
+5. Executes tool calls, appends results
+6. Repeats until no tool calls or max iterations
+7. Saves session, publishes response
+
+### MessageBus
+An async queue system for decoupling channels from the agent loop.
+
+- `publishInbound()` / `consumeInbound()`: messages from channels to agent
+- `publishOutbound()` / `consumeOutbound()`: responses from agent to channels
+
+### LLMProvider
+Wraps Vercel AI SDK `generateText()` with:
+
+- Model string resolution (e.g., `openrouter/anthropic/claude-sonnet-4-5`)
+- Retry logic (3 attempts, exponential backoff)
+- Malformed JSON repair
+- Normalized `LLMResponse` type
+
+### SessionManager
+Persists conversation history to JSONL files in `~/.nanobot/sessions/`.
+
+- Key format: `{channel}:{chatId}` (e.g., `mattermost:abc123`)
+- Supports history truncation for context window limits
+
+### ToolRegistry
+Stores tools by name, provides OpenAI-compatible function definitions to the LLM.
+
+### MemoryConsolidator
+When session history exceeds token limits, summarizes old messages and archives to `memory/MEMORY.md`.
+
+## Configuration
+
+- File: `~/.nanobot/config.json`
+- Validation: Zod schemas in `src/config/types.ts`
+- Env overrides: `NANOBOT_MODEL`, `NANOBOT_WORKSPACE`, `NANOBOT_CONFIG`
+
+## Session Key Convention
+
+| Channel | Key Format | Example |
+|---------|-----------|----------|
+| Mattermost | `mattermost:{channelId}` | `mattermost:abc123` |
+| Mattermost (thread) | `mattermost:{channelId}:{rootId}` | `mattermost:abc:def456` |
+| CLI | `cli:{chatId}` | `cli:interactive` |
+| System | `system:{source}` | `system:heartbeat` |
diff --git a/docs/Discoveries.md b/docs/Discoveries.md
new file mode 100644
index 0000000..c873433
--- /dev/null
+++ b/docs/Discoveries.md
@@ -0,0 +1,151 @@
+# Discoveries
+
+Empirical learnings from implementation that future sessions should know.
+
+## Zod v4 Specifics
+
+### `.default()` on Nested Objects
+
+Zod v4 requires factory functions for nested object defaults, and the factory must return the **full output type** (not just `{}`):
+
+```typescript
+// ❌ Wrong - empty object won't match the schema
+const Config = z.object({
+  nested: NestedSchema.default({}),
+});
+
+// ✅ Correct - factory returning full type
+const Config = z.object({
+  nested: NestedSchema.default(() => ({ field: value, ... })),
+});
+```
+
+### `z.record()` Requires Two Arguments
+
+```typescript
+// ❌ Wrong
+z.record(z.string())
+
+// ✅ Correct
+z.record(z.string(), z.unknown())
+```
+
+## AI SDK v6 Changes
+
+| v4/v5 | v6 |
+|-------|-----|
+| `LanguageModelV2` | `LanguageModel` |
+| `maxTokens` | `maxOutputTokens` |
+| `maxSteps` | `stopWhen: stepCountIs(n)` |
+| `usage.promptTokens` | `usage.inputTokens` |
+| `usage.completionTokens` | `usage.outputTokens` |
+
+## ollama-ai-provider Compatibility
+
+`ollama-ai-provider` v1.2.0 returns `LanguageModelV1`, not the expected `LanguageModel` (v2/v3). Cast at call site:
+
+```typescript
+import { ollama } from 'ollama-ai-provider';
+import type { LanguageModel } from 'ai';
+
+const model = ollama('llama3.2') as unknown as LanguageModel;
+```
+
+## js-tiktoken API
+
+```typescript
+// ❌ Wrong (Python-style)
+import { get_encoding } from 'js-tiktoken';
+
+// ✅ Correct
+import { getEncoding } from 'js-tiktoken';
+```
+
+## Bun/Node Globals
+
+`Document` is not available as a global in Bun/Node. For DOM-like operations:
+
+```typescript
+// ❌ Wrong
+function makeDocument(): Document { ... }
+
+// ✅ Correct
+function makePseudoDocument(): Record { ... }
+// Cast at call site if needed
+```
+
+## WebSocket Error Types
+
+WebSocket `onerror` handler receives an `Event`, not an `Error`:
+
+```typescript
+socket.onerror = (err: Event) => {
+  console.error(`WebSocket error: ${err.type}`);
+};
+```
+
+## Template Literals with Unknown Types
+
+When interpolating `unknown` types in template literals, explicitly convert to string:
+
+```typescript
+// ❌ Risky - may throw
+console.log(`Error: ${err}`);
+
+// ✅ Safe
+console.log(`Error: ${String(err)}`);
+```
+
+## Helper: strArg
+
+For safely extracting string arguments from `Record`:
+
+```typescript
+// src/agent/tools/base.ts
+export function strArg(args: Record, key: string, fallback = ''): string {
+  const val = args[key];
+  return typeof val === 'string' ? val : fallback;
+}
+```
+
+Usage:
+
+```typescript
+// ❌ Verbose
+const path = String(args['path'] ?? '');
+
+// ✅ Cleaner
+const path = strArg(args, 'path');
+const timeout = parseInt(strArg(args, 'timeout', '30'), 10);
+```
+
+## Mattermost WebSocket
+
+- Uses raw `WebSocket` + `fetch` (no mattermostdriver library)
+- Auth via hello message with token
+- Event types: `posted`, `post_edited`, `reaction_added`, etc.
+- Group channel policy: `mention` (default), `open`, `allowlist`
+
+## Session Persistence
+
+- Format: JSONL (one JSON object per line)
+- Location: `~/.nanobot/sessions/{sessionKey}.jsonl`
+- Tool results truncated at 16,000 characters
+- Memory consolidation triggered when approaching context window limit
+
+## Retry Logic
+
+`LLMProvider.chatWithRetry()` retries on:
+- HTTP 429 (rate limit)
+- HTTP 5xx (server errors)
+- Timeouts
+- Network errors
+
+Max 3 attempts with exponential backoff.
+
+## Config Precedence
+
+1. CLI flags (`-c`, `-m`, `-w`, `-M`)
+2. Environment variables (`NANOBOT_CONFIG`, `NANOBOT_MODEL`, `NANOBOT_WORKSPACE`)
+3. Config file (`~/.nanobot/config.json`)
+4. Zod schema defaults
diff --git a/docs/PRD.md b/docs/PRD.md
new file mode 100644
index 0000000..849f144
--- /dev/null
+++ b/docs/PRD.md
@@ -0,0 +1,64 @@
+# Product Requirements Document (PRD)
+
+## Overview
+
+nanobot is an ultra-lightweight personal AI assistant framework. It provides a chat-controlled bot that can execute tasks through natural language commands, with pluggable "channels" for different messaging platforms.
+
+## Target Audience
+
+- Individual developers and power users who want a personal AI assistant
+- Users who prefer self-hosted, privacy-respecting AI tools
+- Teams using Mattermost who want an integrated AI assistant
+- Users who need AI assistance with file operations, shell commands, and web searches
+
+## Core Features
+
+### 1. Agent Loop
+- Conversational AI powered by LLMs (Anthropic, OpenAI, Google, OpenRouter, Ollama)
+- Tool execution with iterative refinement
+- Session management with persistent conversation history
+- Memory consolidation to manage context window limits
+
+### 2. Tool System
+- **Filesystem**: read, write, edit, list files
+- **Shell**: execute arbitrary commands with configurable security constraints
+- **Web**: search (Brave), fetch and parse URLs
+- **Message**: send intermediate updates to chat channels
+- **Spawn**: delegate long-running tasks to background subagents
+- **Cron**: schedule recurring tasks
+
+### 3. Channel System
+- **Mattermost**: WebSocket-based real-time messaging with REST API for posts
+- **CLI**: local interactive terminal or single-shot mode
+- Extensible channel interface for future platforms
+
+### 4. Scheduling
+- **Cron Service**: schedule tasks with cron expressions, intervals, or one-time execution
+- **Heartbeat**: periodic wake-up to check for tasks (e.g., HEARTBEAT.md)
+
+### 5. Memory & Skills
+- Long-term memory with consolidation
+- Skill loading from workspace
+- System prompt construction from templates (SOUL.md, USER.md, TOOLS.md)
+
+## Non-Goals (Out of Scope)
+
+- Non-Mattermost channels (Telegram, Discord, Slack, etc.)
+- MCP (Model Context Protocol) client support
+- Extended thinking/reasoning token handling
+- Onboard configuration wizard
+- Multi-tenancy or user authentication
+
+## User Stories
+
+1. As a developer, I want to ask the AI to read and modify files in my workspace so I can work faster.
+2. As a team lead, I want the bot to respond in Mattermost channels when mentioned so my team can get AI help without leaving chat.
+3. As a power user, I want to schedule recurring tasks so the AI can check things automatically.
+4. As a privacy-conscious user, I want to run the bot locally with Ollama so my data stays on my machine.
+
+## Success Metrics
+
+- Zero external dependencies for core functionality beyond LLM providers
+- Sub-second response time for tool execution
+- Graceful degradation on LLM errors
+- Clear error messages for configuration issues
diff --git a/memory-bank/activeContext.md b/memory-bank/activeContext.md
index 6ad65a0..9f04d42 100644
--- a/memory-bank/activeContext.md
+++ b/memory-bank/activeContext.md
@@ -1,7 +1,7 @@
 # Active Context
 
 ## Current Focus
-All source files written and verified — typecheck and lint are both clean.
+Docs directory created with 4 files (PRD.md, Architecture.md, API.md, Discoveries.md). All source files previously written and verified — typecheck and lint are both clean.
 
 ## Session State (as of this writing)
 - All source files complete and passing `tsc --noEmit` (0 errors) and `oxlint` (0 errors, 0 warnings)
diff --git a/memory-bank/progress.md b/memory-bank/progress.md
index 5b74c93..5d383d1 100644
--- a/memory-bank/progress.md
+++ b/memory-bank/progress.md
@@ -34,6 +34,7 @@
 - **Full typecheck pass**: `tsc --noEmit` → 0 errors
 - **Full lint pass**: `oxlint` → 0 errors, 0 warnings
 - `package.json` scripts added: `start`, `dev`, `typecheck`
+- **Docs created**: `/docs/PRD.md`, `Architecture.md`, `API.md`, `Discoveries.md`
 
 ### 🔄 In Progress
 - Nothing