feat(core): update web tool descriptions and add domain filter parameters

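- Update the web_extract tool description to an English version
- Update the web_search tool description and add a search-date hint
- Add allowed_domains and blocked_domains parameters to web_search
- Add the GET_CURRENT_DATE_FN template function to support dynamic dates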
2025-12-17 13:47:58 +08:00
parent 2afa7bb103
commit 8d8ebb8786
4 changed files with 55 additions and 35 deletions
@@ -482,6 +482,11 @@ export function createToolDescriptionContext(
    CUSTOM_TIMEOUT_MS: () => CUSTOM_TIMEOUT_MS,
    MAX_TIMEOUT_MS: () => MAX_TIMEOUT_MS,
    MAX_OUTPUT_CHARS: () => MAX_OUTPUT_CHARS,
+    GET_CURRENT_DATE_FN: () => {
+      const now = new Date();
+      const pad = (n: number) => n.toString().padStart(2, '0');
+      return `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())} ${pad(now.getHours())}:${pad(now.getMinutes())}:${pad(now.getSeconds())}`;
+    },
  };

  return {
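
The `GET_CURRENT_DATE_FN` entry added in this hunk formats the local time as `YYYY-MM-DD HH:mm:ss`. As a minimal sketch (the name `formatLocalTimestamp` is hypothetical and not part of the commit), the same logic can be pulled into a function that takes the `Date` as an argument, which makes the output deterministic in tests:

```typescript
// Hypothetical helper, not part of the commit: same formatting as
// GET_CURRENT_DATE_FN, but the Date is injected so tests can pin it.
function formatLocalTimestamp(now: Date = new Date()): string {
  // Zero-pad a number to two digits, e.g. 7 -> "07".
  const pad = (n: number): string => n.toString().padStart(2, '0');
  const date = `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())}`;
  const time = `${pad(now.getHours())}:${pad(now.getMinutes())}:${pad(now.getSeconds())}`;
  return `${date} ${time}`;
}

// Months are zero-based in the Date constructor, so 6 means July.
console.log(formatLocalTimestamp(new Date(2025, 6, 15, 9, 5, 3))); // prints "2025-07-15 09:05:03"
```

All getters used here read the host's local time zone, so the date shown to the model depends on where the process runs.
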
@@ -1,19 +1,16 @@
-Extracts content from the specified web page URLs. Uses the Tavily Extract API to parse pages intelligently and return structured text content.
+- Fetches content from a specified URL and processes it using an AI model
+- Takes a URL and a prompt as input
+- Fetches the URL content, converts HTML to markdown
+- Processes the content with the prompt using a small, fast model
+- Returns the model's response about the content
+- Use this tool when you need to retrieve and analyze web content

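-Use cases:
- Get the full content of a web article
- Extract detailed information from documentation pages
- Fetch multiple pages for comparative analysis
- Get the list of images on a page
- Deep-extract pages that contain tables or embedded content
-
-Parameters:
- urls: list of URLs (required, at most 20; a single URL string is also accepted)
- extract_depth: "basic" for fast extraction / "advanced" for deep extraction (including tables, etc.)
- format: "markdown" / "text" output format, default markdown
- include_images: whether to include the image list, default false
-
-Returns:
- Extracted content for each URL
- Image list (if enabled)
- Failed URLs with their error messages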
+Usage notes:
+  - IMPORTANT: If an MCP-provided web fetch tool is available, prefer using that tool instead of this one, as it may have fewer restrictions.
+  - The URL must be a fully-formed valid URL
+  - HTTP URLs will be automatically upgraded to HTTPS
+  - The prompt should describe what information you want to extract from the page
+  - This tool is read-only and does not modify any files
+  - Results may be summarized if the content is very large
+  - Includes a self-cleaning 15-minute cache for faster responses when repeatedly accessing the same URL
+  - When a URL redirects to a different host, the tool will inform you and provide the redirect URL in a special format. You should then make a new WebFetch request with the redirect URL to fetch the content.
@@ -1,21 +1,25 @@
-Searches the web for up-to-date information. Uses the Tavily API for intelligent search and returns relevant page content with an AI summary.
+- Allows Claude to search the web and use the results to inform responses
+- Provides up-to-date information for current events and recent data
+- Returns search result information formatted as search result blocks, including links as markdown hyperlinks
+- Use this tool for accessing information beyond Claude's knowledge cutoff
+- Searches are performed automatically within a single API call

-This is the preferred tool for web search; do not use curl or bash commands to search the web.
+CRITICAL REQUIREMENT - You MUST follow this:
+  - After answering the user's question, you MUST include a "Sources:" section at the end of your response
+  - In the Sources section, list all relevant URLs from the search results as markdown hyperlinks: [Title](URL)
+  - This is MANDATORY - never skip including sources in your response
+  - Example format:

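-Use cases:
- Look up the latest news, events, and game updates
- Search technical documentation and API references
- Get real-time data (stock prices, weather, etc.)
- Find information about open-source projects and libraries
- Keep up with the latest technology trends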
+    [Your answer here]

-Parameters:
- query: search keywords (required)
- max_results: number of results to return, 1-20, default 5
- search_depth: "basic" for fast search / "advanced" for deep search
- topic: "general" / "news" / "finance"
- include_answer: whether to include the AI summary, default true
+    Sources:
+    - [Source Title 1](https://example.com/1)
+    - [Source Title 2](https://example.com/2)

-Returns:
- AI-generated summary answer
- List of relevant pages (title, link, content snippet)
+Usage notes:
+  - Domain filtering is supported to include or block specific websites
+  - Web search is only available in the US
+
+IMPORTANT - Use the correct year in search queries:
+  - Today's date is ${GET_CURRENT_DATE_FN()}. You MUST use this year when searching for recent information, documentation, or current events.
+  - Example: If today is 2025-07-15 and the user asks for "latest React docs", search for "React documentation 2025", NOT "React documentation 2024"
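
The `${GET_CURRENT_DATE_FN()}` placeholder above is presumably resolved against the functions registered in `createToolDescriptionContext` before the description reaches the model. The diff does not show that substitution step, so the following is only a sketch under that assumption; `renderDescription` and `TemplateContext` are hypothetical names:

```typescript
// Hypothetical types and names: the real substitution code is not shown in this diff.
type TemplateContext = Record<string, () => string | number>;

// Replace every ${NAME()} placeholder with the result of calling context.NAME.
// Unknown placeholders are left untouched so that typos stay visible.
function renderDescription(template: string, context: TemplateContext): string {
  return template.replace(/\$\{([A-Z_][A-Z0-9_]*)\(\)\}/g, (match: string, name: string) => {
    const fn = context[name];
    return typeof fn === 'function' ? String(fn()) : match;
  });
}

// Plain (non-template) string literals keep the ${...} markers as literal text.
const rendered = renderDescription(
  "Today's date is ${GET_CURRENT_DATE_FN()}. ${UNKNOWN_FN()}",
  { GET_CURRENT_DATE_FN: () => '2025-07-15 09:05:03' },
);
console.log(rendered); // prints "Today's date is 2025-07-15 09:05:03. ${UNKNOWN_FN()}"
```

Because the function is called at render time, the date reflects when the description is built, not when the process started.
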
@@ -41,6 +41,16 @@ export const webSearchTool: ToolWithMetadata = {
      description: 'Whether to include the AI-generated summary answer (default true)',
      required: false,
    },
+    allowed_domains: {
+      type: 'array',
+      description: 'Only include results from these domains',
+      required: false,
+    },
+    blocked_domains: {
+      type: 'array',
+      description: 'Exclude results from these domains',
+      required: false,
+    },
  },
  execute: async (params: Record<string, unknown>): Promise<ToolResult> => {
    const query = params.query as string;
@@ -48,6 +58,8 @@ export const webSearchTool: ToolWithMetadata = {
    const searchDepth = (params.search_depth as 'basic' | 'advanced') || 'basic';
    const topic = (params.topic as 'general' | 'news' | 'finance') || 'general';
    const includeAnswer = params.include_answer !== false;
+    const includeDomains = params.allowed_domains as string[] | undefined;
+    const excludeDomains = params.blocked_domains as string[] | undefined;

    // Permission check
    const permissionManager = getPermissionManager();
@@ -94,6 +106,8 @@ export const webSearchTool: ToolWithMetadata = {
        topic,
        maxResults,
        includeAnswer,
+        includeDomains,
+        excludeDomains,
      });

      // Format the output
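
The `as string[] | undefined` casts on `allowed_domains` and `blocked_domains` are not checked at runtime, so a bare string or a mixed-type array passed by the model would reach the search client unvalidated. A minimal sketch of a guard, assuming the search client treats `undefined` as "no filter"; `toDomainList` is a hypothetical helper and not part of the commit:

```typescript
// Hypothetical helper: coerce an untrusted tool parameter into a clean domain list.
function toDomainList(value: unknown): string[] | undefined {
  // Anything that is not an array (including a single string) means "no filter".
  if (!Array.isArray(value)) return undefined;
  const domains = value
    .filter((item): item is string => typeof item === 'string')
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item.length > 0);
  // Drop duplicates while keeping first-seen order.
  const unique = Array.from(new Set(domains));
  return unique.length > 0 ? unique : undefined;
}

const exampleParams: Record<string, unknown> = {
  allowed_domains: ['React.dev', ' developer.mozilla.org ', 42, 'react.dev'],
  blocked_domains: 'example.com', // wrong type, so it is ignored
};
console.log(toDomainList(exampleParams.allowed_domains)); // [ 'react.dev', 'developer.mozilla.org' ]
console.log(toDomainList(exampleParams.blocked_domains)); // undefined
```

Returning `undefined` for empty or invalid input keeps the call site unchanged: the two new fields can be passed straight through as in the hunk above.
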