Chapter 2: Built-in Tools -- read, edit, and Tool Design Patterns

Corresponding source files: packages/coding-agent/src/core/tools/

A Unified Tool Architecture

Every built-in tool follows the same design pattern. Take read as an example:

import { Type } from "@sinclair/typebox";

// 1. Define the parameter schema with TypeBox
const readSchema = Type.Object({
  path: Type.String({ description: "Path to the file to read" }),
  offset: Type.Optional(Type.Number({ description: "Line number to start reading from" })),
  limit: Type.Optional(Type.Number({ description: "Maximum number of lines to read" })),
});

// 2. Create a ToolDefinition factory function
function createReadToolDefinition(cwd: string, options?: ReadToolOptions): ToolDefinition {
  return {
    name: "read",
    label: "read",
    description: "Read the contents of a file...",
    promptSnippet: "Read file contents",              // one-line summary shown in the system prompt
    promptGuidelines: ["Use read instead of cat."],   // guidelines shown in the system prompt
    parameters: readSchema,
    execute(...) { ... },       // the actual execution logic
    renderCall(...) { ... },    // how the tool call is displayed in the TUI
    renderResult(...) { ... },  // how the tool result is displayed in the TUI
  };
}

// 3. Wrap it as an AgentTool (the bridge to the agent package)
function createReadTool(cwd: string): AgentTool {
  return wrapToolDefinition(createReadToolDefinition(cwd));
}

Two API Layers: ToolDefinition vs AgentTool

Layer	Defined in	Purpose
`ToolDefinition`	coding-agent	Full definition: execution logic + TUI rendering + prompt integration
`AgentTool`	agent package	Slimmed-down version with execution logic only

wrapToolDefinition converts a ToolDefinition into an AgentTool and drops the rendering-related parts. This separation decouples a tool's UI layer from its logic layer.
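The conversion can be pictured with a minimal sketch. The interface shapes below are simplified assumptions made for illustration (the names ending in Sketch are hypothetical, and the real ToolDefinition, AgentTool, and wrapToolDefinition likely carry more fields and may behave differently):

```typescript
// Simplified, hypothetical shapes: the real interfaces have more fields.
interface AgentToolSketch {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>) => string;
}

interface ToolDefinitionSketch extends AgentToolSketch {
  promptSnippet?: string;
  renderCall?: (args: Record<string, unknown>) => string;
  renderResult?: (result: string) => string;
}

// Copy only what the agent loop needs; rendering and prompt fields are left behind.
function wrapToolDefinitionSketch(def: ToolDefinitionSketch): AgentToolSketch {
  return {
    name: def.name,
    description: def.description,
    execute: (args) => def.execute(args),
  };
}

const readDefinition: ToolDefinitionSketch = {
  name: "read",
  description: "Read the contents of a file...",
  promptSnippet: "Read file contents",
  execute: (args) => `contents of ${String(args.path)}`,
  renderCall: (args) => `read ${String(args.path)}`,
};

const readTool = wrapToolDefinitionSketch(readDefinition);
```

Because the wrapped object is built field by field rather than by spreading the definition, nothing UI-related can leak into the agent layer by accident.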

The read Tool in Detail

Reading a Text File

User request → read({ path: "main.ts", offset: 1, limit: 50 })

1. Resolve the path
   resolveReadPath("main.ts", cwd) → "/project/main.ts"

2. Check permissions
   fsAccess(absolutePath, R_OK)

3. Detect the type
   detectSupportedImageMimeType → null (text file)

4. Read the content
   fsReadFile → Buffer → UTF-8 string

5. Apply offset/limit
   Start at line 1 and take 50 lines

6. Truncation guard
   truncateHead() → at most 2000 lines or 256KB
   If exceeded → append "[Showing lines 1-2000, use offset=2001 to continue]"

7. Return the result
   { content: [{ type: "text", text: fileContent }], details: { truncation } }
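Steps 5 and 6 can be sketched as a single pure function. This is an illustration under stated assumptions, not the actual truncateHead(): sliceLines and its parameters are hypothetical names, offset is assumed to be 1-based, and only the line cap is modeled (the byte cap is left out for brevity).

```typescript
// Hypothetical helper; the 2000-line cap mirrors the limit quoted above.
const MAX_LINES = 2000;

interface SliceResult {
  text: string;
  truncated: boolean;
}

// offset is assumed to be 1-based, as in read({ offset: 1, limit: 50 }).
function sliceLines(content: string, offset = 1, limit?: number, maxLines = MAX_LINES): SliceResult {
  const lines = content.split("\n");
  const start = Math.max(offset, 1) - 1;
  const requested = limit === undefined ? lines.length - start : limit;
  const end = Math.min(start + requested, lines.length);
  const selected = lines.slice(start, end);
  if (selected.length <= maxLines) {
    return { text: selected.join("\n"), truncated: false };
  }
  // Keep the head of the selection and tell the caller where to resume.
  const kept = selected.slice(0, maxLines);
  const firstShown = start + 1;
  const lastShown = start + maxLines;
  const notice = `[Showing lines ${firstShown}-${lastShown}, use offset=${lastShown + 1} to continue]`;
  return { text: `${kept.join("\n")}\n\n${notice}`, truncated: true };
}

const sampleFile = ["a", "b", "c", "d", "e"].join("\n");
const picked = sliceLines(sampleFile, 2, 2);            // lines 2-3
const capped = sliceLines(sampleFile, 1, undefined, 3); // cap lowered to 3 lines for the demo
```

The continuation notice gives the model the exact offset to pass on its next call, so a long file can be paged through without guessing.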

Reading an Image File

The read tool automatically recognizes image files (jpg, png, gif, webp) and returns them base64-encoded:

read({ path: "screenshot.png" })
→ detectSupportedImageMimeType → "image/png"
→ readFile → Buffer → base64 string
→ resizeImage (if the image is too large, it is automatically scaled to fit within 2000x2000)
→ { content: [
     { type: "text", text: "Read image file [image/png]" },
     { type: "image", data: base64Data, mimeType: "image/png" }
   ]}
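The text does not say how detectSupportedImageMimeType decides on a type. One common approach is to inspect the file's leading "magic bytes", sketched below. sniffImageMimeType and startsWithBytes are hypothetical names, and the real implementation may work differently (for example, by file extension).

```typescript
// Hypothetical sniffing helper; the real detectSupportedImageMimeType may work differently.
function startsWithBytes(data: Uint8Array, signature: number[], at = 0): boolean {
  if (data.length < at + signature.length) return false;
  return signature.every((byte, i) => data[at + i] === byte);
}

function sniffImageMimeType(data: Uint8Array): string | null {
  if (startsWithBytes(data, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWithBytes(data, [0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWithBytes(data, [0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  // WebP: "RIFF" at bytes 0-3 and "WEBP" at bytes 8-11.
  if (startsWithBytes(data, [0x52, 0x49, 0x46, 0x46]) && startsWithBytes(data, [0x57, 0x45, 0x42, 0x50], 8)) {
    return "image/webp";
  }
  return null; // not a supported image: treat it as text
}

const pngHeader = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0, 0]);
const plainText = new Uint8Array([0x69, 0x6d, 0x70, 0x6f, 0x72, 0x74]); // "import"
```

Returning null for anything unrecognized matches the text-file flow, where a null MIME type sends the file down the text path.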

Pluggable Operations (ReadOperations)

The read tool's actual I/O is abstracted behind the ReadOperations interface:

interface ReadOperations {
  readFile: (path: string) => Promise<Buffer>;
  access: (path: string) => Promise<void>;
  detectImageMimeType?: (path: string) => Promise<string | null>;
}

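By default it uses the Node.js fs module, but you can swap in an SSH remote file system, the file system inside a Docker container, an in-memory file system, and so on. The pods package uses this design for remote sandboxed execution.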
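As an illustration of the swap, here is a minimal in-memory backend, the kind that might be used in unit tests. The interface is copied from above; createMemoryReadOperations is a hypothetical name, and how the operations object is passed into the tool (presumably through its options) is not shown because the text does not specify it.

```typescript
import { Buffer } from "node:buffer";

// Interface as shown above.
interface ReadOperations {
  readFile: (path: string) => Promise<Buffer>;
  access: (path: string) => Promise<void>;
  detectImageMimeType?: (path: string) => Promise<string | null>;
}

// Hypothetical in-memory backend keyed by absolute path.
function createMemoryReadOperations(files: Map<string, string>): ReadOperations {
  const lookup = (path: string): string => {
    const content = files.get(path);
    if (content === undefined) throw new Error(`ENOENT: no such file: ${path}`);
    return content;
  };
  return {
    // async functions turn the thrown error into a rejected promise
    readFile: async (path) => Buffer.from(lookup(path), "utf8"),
    access: async (path) => {
      lookup(path);
    },
  };
}

const memoryOps = createMemoryReadOperations(new Map([["/project/main.ts", "export {};\n"]]));
```

The optional detectImageMimeType is simply omitted here, which is what the `?` in the interface allows.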

TUI Rendering

Every tool has two render functions: renderCall and renderResult.

renderCall -- Displaying the Tool Call

read main.ts                              ← compact single-line display
read src/utils/helper.ts:10-50            ← with a line range
read ...                                   ← arguments are still streaming in
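The three display forms above could be produced by a small formatter like the following sketch. formatReadCall is a hypothetical name, and the range end is assumed to be offset + limit - 1; the text does not state how the real renderCall computes it.

```typescript
// Hypothetical formatter; the range end is assumed to be offset + limit - 1.
interface ReadArgs {
  path?: string;
  offset?: number;
  limit?: number;
}

function formatReadCall(args: ReadArgs): string {
  // While arguments are still streaming in, the path may not have arrived yet.
  if (!args.path) return "read ...";
  if (args.offset === undefined && args.limit === undefined) return `read ${args.path}`;
  const start = args.offset ?? 1;
  const range = args.limit === undefined ? `${start}-` : `${start}-${start + args.limit - 1}`;
  return `read ${args.path}:${range}`;
}

const callLines = [
  formatReadCall({ path: "main.ts" }),
  formatReadCall({ path: "src/utils/helper.ts", offset: 10, limit: 41 }),
  formatReadCall({}),
];
```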

renderResult -- Displaying the Tool Result

1  import { readFile } from "fs/promises";
2  import { resolve } from "path";
3  ...
... (50 more lines, ctrl+e to expand)     ← collapsed by default, showing only the first 10 lines

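The render functions receive a ToolRenderContext, which contains:

expanded: whether the user has expanded the result
isPartial: whether the result is still streaming in
lastComponent: the component returned by the previous render (for reuse)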
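The collapse behavior driven by expanded can be sketched as a function from text to display text. This is a simplification under assumptions: the real renderResult returns a TUI component rather than a string, and RenderContextSketch and renderReadResult are hypothetical names that model only the expanded flag.

```typescript
// Hypothetical context shape with only the field used here.
interface RenderContextSketch {
  expanded: boolean;
}

const PREVIEW_LINES = 10; // collapsed preview size quoted above

function renderReadResult(text: string, ctx: RenderContextSketch): string {
  const numbered = text.split("\n").map((line, i) => `${i + 1}  ${line}`);
  if (ctx.expanded || numbered.length <= PREVIEW_LINES) return numbered.join("\n");
  const hidden = numbered.length - PREVIEW_LINES;
  return [...numbered.slice(0, PREVIEW_LINES), `... (${hidden} more lines, ctrl+e to expand)`].join("\n");
}

const sixtyLines = Array.from({ length: 60 }, (_, i) => `line ${i + 1}`).join("\n");
const collapsedView = renderReadResult(sixtyLines, { expanded: false });
const expandedView = renderReadResult(sixtyLines, { expanded: true });
```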

Design of the edit Tool

The edit tool is probably the most complex built-in tool. It implements precise editing based on search and replace.

Why Not diff/patch

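Many AI coding tools have the model output its changes in diff format, but diff has a serious weakness: models frequently get line numbers wrong. A diff hunk depends on its line numbers and on exactly matching context lines, so a small slip can cause the whole patch to be rejected.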

Pi chose a search-and-replace strategy instead:

// The AI generates a call like this:
edit({
  path: "main.ts",
  old_string: "function hello() {\n  console.log('hi');\n}",
  new_string: "function hello() {\n  console.log('hello world');\n}",
})

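Benefits:

No dependence on line numbers; the AI only has to quote the target code accurately
Supports multiple replacements (when old_string matches more than once, all occurrences can optionally be replaced)
Easy to validate (check whether old_string exists in the file)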
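The validation and replacement logic behind these points can be sketched in a few lines. This is an assumption-laden illustration, not Pi's implementation: applyEdit, countOccurrences, and the replaceAll flag are hypothetical names, and treating an ambiguous match as an error is one reasonable policy rather than a documented one.

```typescript
// Hypothetical core of a search-and-replace edit; names and error policy are illustrative.
function countOccurrences(haystack: string, needle: string): number {
  if (needle === "") return 0;
  return haystack.split(needle).length - 1;
}

function applyEdit(content: string, oldString: string, newString: string, replaceAll = false): string {
  const matches = countOccurrences(content, oldString);
  if (matches === 0) throw new Error("old_string not found in file");
  if (matches > 1 && !replaceAll) {
    throw new Error(`old_string matches ${matches} times; add more context or replace all`);
  }
  // split/join replaces literally, so "$" or regex metacharacters in the strings are safe.
  return content.split(oldString).join(newString);
}

const beforeEdit = "function hello() {\n  console.log('hi');\n}\n";
const afterEdit = applyEdit(beforeEdit, "console.log('hi');", "console.log('hello world');");
```

Failing loudly on zero or ambiguous matches gives the model a concrete error to react to, which is the "easy to validate" property in practice.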

Backup Strategy

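edit protects the original file while modifying it:

Write the modified content to a temporary .edit.tmp file
If the write succeeds, atomically replace the original file
If an error occurs, the original can be recovered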
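The write-then-rename pattern described above can be sketched with Node's synchronous fs API. writeFileAtomically is a hypothetical name, and placing the temporary file next to the target with a .edit.tmp suffix is an assumption; the real tool may name or locate it differently and likely uses asynchronous calls.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write to a sibling temp file, then rename it over the target.
function writeFileAtomically(target: string, content: string): void {
  const tmpPath = `${target}.edit.tmp`;
  try {
    writeFileSync(tmpPath, content, "utf8");
    // A rename within one directory replaces the target in a single step on POSIX file systems.
    renameSync(tmpPath, target);
  } catch (error) {
    // The original is untouched until the rename succeeds; only the temp file needs cleanup.
    rmSync(tmpPath, { force: true });
    throw error;
  }
}

// Usage in a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "edit-demo-"));
const demoFile = join(demoDir, "main.ts");
writeFileSync(demoFile, "console.log('hi');\n", "utf8");
writeFileAtomically(demoFile, "console.log('hello world');\n");
const demoContent = readFileSync(demoFile, "utf8");
const tmpLeftBehind = existsSync(`${demoFile}.edit.tmp`);
rmSync(demoDir, { recursive: true, force: true });
```

Keeping the temporary file in the same directory as the target matters: a rename across file systems is not atomic and may fail outright.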

Design of the bash Tool

The bash tool needs particular attention to safety and control:

Timeout and Output Limits

// Default limits
Timeout: 120 seconds
Output: 512KB

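When output exceeds the limit, only the head and the tail are kept, half of the budget each, joined by a truncation marker in the middle.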
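A minimal sketch of this head-plus-tail truncation follows. truncateMiddle and the marker text are hypothetical, and the limit is counted in UTF-16 code units for simplicity, whereas a limit expressed in KB would count bytes.

```typescript
// Hypothetical helper. It counts UTF-16 code units; a real KB limit would count bytes.
function truncateMiddle(output: string, maxLength: number, marker = "\n... [output truncated] ...\n"): string {
  if (output.length <= maxLength) return output;
  const half = Math.floor(maxLength / 2);
  const head = output.slice(0, half);
  const tail = output.slice(output.length - half);
  return head + marker + tail;
}

const longOutput = "A".repeat(100) + "B".repeat(100);
const shortened = truncateMiddle(longOutput, 20);
```

Keeping both ends is a sensible fit for command output: the head usually shows what was started, and the tail usually holds the final error or summary.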

Pluggable Executor

Like read, bash has a pluggable execution interface, BashOperations:

interface BashOperations {
  exec: (command: string, signal?: AbortSignal) => Promise<BashResult>;
}

By default, commands run in a local shell. The pods package swaps this for remote execution over SSH.

promptSnippet and promptGuidelines

Each tool can declare two kinds of system prompt components:

promptSnippet -- appears in the "Available tools" list:

Available tools:
- read: Read file contents                ← promptSnippet
- bash: Run shell commands                ← promptSnippet
- edit: Edit file contents                ← promptSnippet

promptGuidelines -- appear in the "Guidelines" list:

Guidelines:
- Use read to examine files instead of cat or sed    ← from read
- Prefer grep/find/ls tools over bash                ← generated dynamically from the enabled tools
- Be concise in your responses                       ← default

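This design lets the system prompt be assembled dynamically from the tools that are actually enabled, so it never mentions a tool the user does not have.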
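The assembly step might look roughly like the sketch below. Only the field names promptSnippet and promptGuidelines come from the text; PromptTool, buildToolPrompt, and the exact output layout are assumptions for illustration.

```typescript
// Hypothetical assembly step; only promptSnippet and promptGuidelines come from the text.
interface PromptTool {
  name: string;
  promptSnippet?: string;
  promptGuidelines?: string[];
}

function buildToolPrompt(tools: PromptTool[], defaultGuidelines: string[]): string {
  const toolLines = tools
    .filter((tool) => tool.promptSnippet !== undefined)
    .map((tool) => `- ${tool.name}: ${tool.promptSnippet}`);
  const guidelineLines: string[] = [];
  for (const tool of tools) {
    for (const guideline of tool.promptGuidelines ?? []) guidelineLines.push(`- ${guideline}`);
  }
  for (const guideline of defaultGuidelines) guidelineLines.push(`- ${guideline}`);
  return ["Available tools:", ...toolLines, "", "Guidelines:", ...guidelineLines].join("\n");
}

// Only read and bash are enabled, so edit never appears in the prompt.
const enabledTools: PromptTool[] = [
  { name: "read", promptSnippet: "Read file contents", promptGuidelines: ["Use read to examine files instead of cat or sed"] },
  { name: "bash", promptSnippet: "Run shell commands" },
];
const toolPrompt = buildToolPrompt(enabledTools, ["Be concise in your responses"]);
```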

Summary

Tool architecture
├── TypeBox Schema ─── parameter definition and validation
├── ToolDefinition ─── full definition (execution + rendering + prompt)
│   ├── execute() ──── actual execution logic
│   ├── renderCall() ─ TUI call display
│   ├── renderResult() TUI result display
│   ├── promptSnippet  system prompt summary
│   └── promptGuidelines system prompt guidelines
├── AgentTool ──────── agent package interface (converted by wrapToolDefinition)
└── Operations ─────── pluggable I/O (local/SSH/Docker)