From 1576853fbcd7fb9451d63f2a09bd8c5beb4174c6 Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Sun, 10 May 2026 07:55:45 +0100
Subject: [PATCH] docs: document tool search

---
 CHANGELOG.md                   |   1 +
 docs/.i18n/glossary.zh-CN.json |  20 +++
 docs/docs.json                 |   1 +
 docs/tools/index.md            |   7 +
 docs/tools/tool-search.md      | 260 +++++++++++++++++++++++++++++++++
 5 files changed, 289 insertions(+)
 create mode 100644 docs/tools/tool-search.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4cf01d20726..693815eeae4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -106,6 +106,7 @@ Docs: https://docs.openclaw.ai
 ### Changes
 
 - Skills: add `skills.load.allowSymlinkTargets` so intentional symlinked skill folders can resolve into trusted sibling repos without disabling root containment.
+- Agents/tools: add core Tool Search so agents can search and call large OpenClaw, MCP, and client tool catalogs through one compact PI bridge.
 - Chat commands: add `/think default` and `/fast default` to clear session overrides and inherit configured/provider defaults. (#79385) Thanks @VACInc.
 - Dependencies: refresh workspace dependency pins and lockfile, including `@openai/codex` `0.130.0`, `acpx` `0.7.0`, AWS SDK `3.1044.0`, OpenTelemetry `0.217.0`, `typebox` `1.1.38`, `vite` `8.0.11`, `oxfmt` `0.48.0`, and `oxlint` `1.63.0`, and update the Codex harness model snapshot for the new bundled app-server catalog.
 - Plugins/install: add guarded plugin install overrides so onboarding and repair tests can route specific plugins to registry specs or local `npm pack` artifacts via environment variables.
diff --git a/docs/.i18n/glossary.zh-CN.json b/docs/.i18n/glossary.zh-CN.json
index 7ebfa3d1c98..740057ca623 100644
--- a/docs/.i18n/glossary.zh-CN.json
+++ b/docs/.i18n/glossary.zh-CN.json
@@ -862,5 +862,25 @@
   {
     "source": "fs-safe Cleanup Plan",
     "target": "fs-safe Cleanup Plan"
+  },
+  {
+    "source": "Tool Search",
+    "target": "工具搜索"
+  },
+  {
+    "source": "Tools and plugins",
+    "target": "工具和插件"
+  },
+  {
+    "source": "Multi-agent sandbox and tools",
+    "target": "多 Agent 沙盒和工具"
+  },
+  {
+    "source": "Exec tool",
+    "target": "Exec 工具"
+  },
+  {
+    "source": "ACP agents setup",
+    "target": "ACP Agents 设置"
   }
 ]
diff --git a/docs/docs.json b/docs/docs.json
index 44933712b4e..6b9612191fd 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -1280,6 +1280,7 @@
 "tools/reactions",
 "tools/thinking",
 "tools/tokenjuice",
+"tools/tool-search",
 "tools/loop-detection",
 "tools/trajectory",
 "tools/tts",
diff --git a/docs/tools/index.md b/docs/tools/index.md
index f38b3e9a302..ca70607619c 100644
--- a/docs/tools/index.md
+++ b/docs/tools/index.md
@@ -117,6 +117,13 @@ tool descriptor during discovery and caches it by plugin source and contract, so
 later tool planning can skip plugin runtime loading. Tool execution still loads
 the owning plugin and calls the live registered implementation.
 
+[Tool Search](/tools/tool-search) is the compact surface
+for large catalogs. Instead of putting every OpenClaw, MCP, or client tool
+schema into the prompt, OpenClaw can give the model an isolated Node runtime
+with `openclaw.tools.search`, `openclaw.tools.describe`, and
+`openclaw.tools.call`. Calls still flow back through the Gateway, so tool
+policy, approvals, hooks, and session logs remain authoritative.
+
 ## Tool configuration
 
 ### Allow and deny lists
diff --git a/docs/tools/tool-search.md b/docs/tools/tool-search.md
new file mode 100644
index 00000000000..30e891a819a
--- /dev/null
+++ b/docs/tools/tool-search.md
@@ -0,0 +1,260 @@
+---
+summary: "Tool Search: compact large PI tool catalogs behind search, describe, and call"
+title: "Tool Search"
+read_when:
+  - You want PI agents to use a large tool catalog without adding every tool schema to the prompt
+  - You want OpenClaw tools, MCP tools, and client tools exposed through one compact PI surface
+  - You are implementing or debugging tool discovery for PI runs
+---
+
+Tool Search gives PI agents one compact way to discover and call large tool
+catalogs. It is useful when the run has many available tools but the model is
+likely to need only a few of them.
+
+When enabled for PI, the model receives one `tool_search_code` tool by default.
+That tool runs a short JavaScript body in an isolated Node subprocess with an
+`openclaw.tools` bridge:
+
+```js
+const hits = await openclaw.tools.search("create a GitHub issue");
+const tool = await openclaw.tools.describe(hits[0].id);
+return await openclaw.tools.call(tool.id, {
+  title: "Crash on startup",
+  body: "Steps to reproduce...",
+});
+```
+
+The catalog can include OpenClaw tools, plugin tools, MCP tools, and
+client-provided tools. The model does not see every full schema up front.
+Instead, it searches compact descriptors, describes one selected tool when it
+needs the exact schema, and calls that tool through OpenClaw.
+
+Codex harness runs do not receive these OpenClaw Tool Search controls. OpenClaw
+passes product capabilities to Codex as dynamic tools, and Codex owns native
+code mode, native tool search, deferred dynamic tools, and nested tool calls.
+
+## How a turn runs
+
+At planning time the PI embedded runner builds the effective catalog for the
+run:
+
+1. Resolve the active tool policy for the agent, profile, sandbox, and session.
+2. List eligible OpenClaw and plugin tools.
+3. List eligible MCP tools through the session MCP runtime.
+4. Add eligible client tools supplied for the current run.
+5. Index compact descriptors for search.
+6. Expose either the PI code bridge or the structured fallback tools to the
+   model.
+
+At execution time every real tool call returns to OpenClaw. The isolated Node
+runtime does not hold plugin implementations, MCP client objects, or secrets.
+`openclaw.tools.call(...)` crosses the bridge back into the Gateway, where the
+normal policy, approval, hook, logging, and result handling still apply.
+
+## Modes
+
+`tools.toolSearch` has two model-facing modes:
+
+- `code`: exposes `tool_search_code`, the default compact JavaScript bridge.
+- `tools`: exposes `tool_search`, `tool_describe`, and `tool_call` as plain
+  structured tools for providers that should not receive code.
+
+Both modes use the same catalog and execution path. The only difference is the
+shape the model sees. If the current runtime cannot launch the isolated Node
+code-mode child process, the default `code` mode falls back to `tools` before
+catalog compaction.
+
+There is no separate source-selection config. When Tool Search is enabled, the
+catalog includes eligible OpenClaw, MCP, and client tools after normal policy
+filtering.
+
+## Why this exists
+
+Large catalogs are useful but expensive. Sending every tool schema to the model
+makes the request larger, slows planning, and increases accidental tool
+selection.
+
+Tool Search changes the shape:
+
+- direct tools: the model sees every selected schema before the first token
+- Tool Search code mode: the model sees one compact code tool and a short API
+  contract
+- Tool Search tools mode: the model sees three compact structured fallback
+  tools
+- during the turn: the model loads only the tool schemas it actually needs
+
+Direct tool exposure is still the right default for small catalogs. Tool Search
+is best when one run can see many tools, especially from MCP servers or
+client-provided app tools.
+
+## API
+
+`openclaw.tools.search(query, options?)`
+
+Searches the effective catalog for the current run. Results are compact and safe
+to put back into prompt context.
+
+```js
+const hits = await openclaw.tools.search("calendar event", { limit: 5 });
+```
+
+`openclaw.tools.describe(id)`
+
+Loads full metadata for one search result, including the exact input schema.
+
+```js
+const calendarCreate = await openclaw.tools.describe("mcp:calendar:create_event");
+```
+
+`openclaw.tools.call(id, args)`
+
+Calls a selected tool through OpenClaw.
+
+```js
+await openclaw.tools.call(calendarCreate.id, {
+  summary: "Planning",
+  start: "2026-05-09T14:00:00Z",
+});
+```
+
+The structured fallback mode exposes the same operations as tools:
+
+- `tool_search`
+- `tool_describe`
+- `tool_call`
+
+## Runtime boundary
+
+The code bridge runs in a short-lived Node subprocess. The subprocess starts
+with Node permission mode enabled, an empty environment, no filesystem or
+network grants, and no child-process or worker grants. OpenClaw enforces a
+parent-process wall-clock timeout and kills the subprocess on timeout, including
+after async continuations.
+
+The runtime exposes only:
+
+- `console.log`, `console.warn`, and `console.error`
+- `openclaw.tools.search`
+- `openclaw.tools.describe`
+- `openclaw.tools.call`
+
+Normal OpenClaw behavior still applies to final calls:
+
+- tool allow and deny policies
+- per-agent and per-sandbox tool restrictions
+- owner-only gating
+- approval hooks
+- plugin `before_tool_call` hooks
+- session identity, logs, and telemetry
+
+## Config
+
+Enable Tool Search for PI runs with the default code bridge:
+
+```bash
+openclaw config set tools.toolSearch true
+```
+
+Equivalent JSON:
+
+```json5
+{
+  tools: {
+    toolSearch: true,
+  },
+}
+```
+
+Use the structured fallback tools instead for PI runs:
+
+```json5
+{
+  tools: {
+    toolSearch: {
+      mode: "tools",
+    },
+  },
+}
+```
+
+Tune code-mode timeout and search result limits:
+
+```json5
+{
+  tools: {
+    toolSearch: {
+      mode: "code",
+      codeTimeoutMs: 10000,
+      searchDefaultLimit: 8,
+      maxSearchLimit: 20,
+    },
+  },
+}
+```
+
+Disable it:
+
+```json5
+{
+  tools: {
+    toolSearch: false,
+  },
+}
+```
+
+## Prompt and telemetry
+
+Tool Search records enough telemetry to compare it with direct tool exposure:
+
+- total serialized tool and prompt bytes sent to the harness
+- catalog size and source breakdown
+- search, describe, and call counts
+- final tool calls executed through OpenClaw
+- selected tool ids and sources
+
+Session logs should make it possible to answer:
+
+- how many tool schemas the model saw up front
+- how many search and describe operations it performed
+- which final tool was called
+- whether the result came from OpenClaw, MCP, or a client tool
+
+## E2E validation
+
+The gateway E2E runner proves both paths with the PI harness:
+
+```bash
+node --import tsx scripts/tool-search-gateway-e2e.ts
+```
+
+It creates a temporary fake plugin with a large tool catalog, starts the mock
+OpenAI provider, starts a Gateway once in direct mode and once with Tool Search
+enabled, then compares provider request payloads and session logs.
+
+The regression proves:
+
+1. Direct mode can call the fake plugin tool.
+2. Tool Search can call the same fake plugin tool.
+3. Direct mode exposes the fake plugin tool schemas directly to the provider.
+4. Tool Search exposes only the compact bridge.
+5. The Tool Search request payload is smaller for the large fake catalog.
+6. Session logs show the expected tool-call counts and bridged call telemetry.
+
+## Failure behavior
+
+Tool Search should fail closed:
+
+- if a tool is not in the effective policy, search should not return it
+- if a selected tool becomes unavailable, `tool_call` should fail
+- if policy or approval blocks execution, the call result should report that
+  block instead of bypassing it
+- if the code bridge cannot create an isolated runtime, use `mode: "tools"` or
+  disable Tool Search for that deployment
+
+## Related
+
+- [Tools and plugins](/tools)
+- [Multi-agent sandbox and tools](/tools/multi-agent-sandbox-tools)
+- [Exec tool](/tools/exec)
+- [ACP agents setup](/tools/acp-agents-setup)
+- [Building plugins](/plugins/building-plugins)