Files
beaver_project/skills/multi-search-engine/versions/v0001/SKILL.md
steven_li 8aeb97a5fc feat(app): 移除内置agents并添加CORS支持和技能上传优化
移除了agents/registry.json中的所有内置agents配置,将agents数组清空。
为web应用添加了CORS中间件支持,允许指定的前端地址跨域访问。
重构了技能上传功能,增加了LLM重写机制,自动规范化上传的技能格式。
新增了工具名称提取逻辑,从技能正文中自动识别Required Tools段落。
更新了技能学习候选者和草稿的载荷结构,添加评估报告统计信息。
修改了意图路由技能的说明,改进任务状态管理逻辑。
2026-06-12 13:25:20 +08:00

194 lines
7.2 KiB
Markdown

---
name: multi-search-engine
description: Multi search engine integration with 16 engines (7 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required.
tools:
- web_fetch
---
# Multi Search Engine
## Overview
Multi search engine integration with 16 engines (7 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required.
## When to Use
- Use when the task requires Multi Search Engine guidance.
## Required Tools
- `web_fetch`
## Workflow
- Identify whether the user's request matches the skill's trigger conditions.
- Read the relevant source guidance below and apply only the steps that fit the current task.
- Use the required tools deliberately and keep tool output tied to the user's goal.
### Source Guidance
### Multi Search Engine
Integration of 16 search engines for web crawling without API keys.
#### Workflow
1. **Preparation**: AI Agent initializes an empty in-memory cookie store. Cookies are only acquired dynamically during search operations when access is denied
2. **Language Evaluation**: Detect the language attribute of the search query. If the query is in Chinese, use Domestic search engines (Baidu, Bing CN, Bing INT, 360, Sogou, WeChat, Shenma). If the query is non-Chinese, use International search engines (Google, Google HK, DuckDuckGo, Yahoo, Startpage, Brave, Ecosia, Qwant, WolframAlpha). Select engines based on query relevance and availability.
3. **Controlled Search**: Use web_fetch to execute search requests with rate limiting:
- Add 1-2 second delay between requests to respect server load
- Batch requests in groups of 3-4 engines with sequential execution between batches
- Include standard browser headers to identify as legitimate user agent
- If access is denied (403/429), fetch engine homepage to obtain fresh session cookies
4. **Cookie Management**:
- Cookies are stored ONLY in memory during runtime
- Cookies are acquired on-demand when search requests fail
- No cookies are read from or written to config.json or any file
- Cookies are cleared after search session completes
- Only session cookies from search engine domains are captured
5. **Retry Mechanism**: If a search fails due to cookie/session issues, retry once with freshly acquired cookies after a 2-second delay
6. **Result Aggregation**: Consolidate successful results from search engines, organize and summarize them to output a core search report
#### Search Engines
##### Domestic (7)
- **Baidu**: `https://www.baidu.com/s?wd={keyword}`
- **Bing CN**: `https://cn.bing.com/search?q={keyword}&ensearch=0`
- **Bing INT**: `https://cn.bing.com/search?q={keyword}&ensearch=1`
- **360**: `https://www.so.com/s?q={keyword}`
- **Sogou**: `https://sogou.com/web?query={keyword}`
- **WeChat**: `https://wx.sogou.com/weixin?type=2&query={keyword}`
- **Shenma**: `https://m.sm.cn/s?q={keyword}`
##### International (9)
- **Google**: `https://www.google.com/search?q={keyword}`
- **Google HK**: `https://www.google.com.hk/search?q={keyword}`
- **DuckDuckGo**: `https://duckduckgo.com/html/?q={keyword}`
- **Yahoo**: `https://search.yahoo.com/search?p={keyword}`
- **Startpage**: `https://www.startpage.com/sp/search?query={keyword}`
- **Brave**: `https://search.brave.com/search?q={keyword}`
- **Ecosia**: `https://www.ecosia.org/search?q={keyword}`
- **Qwant**: `https://www.qwant.com/?q={keyword}`
- **WolframAlpha**: `https://www.wolframalpha.com/input?i={keyword}`
#### Quick Examples
```javascript
// Basic search
web_fetch({"url": "https://www.google.com/search?q=python+tutorial"})
// Site-specific
web_fetch({"url": "https://www.google.com/search?q=site:github.com+react"})
// File type
web_fetch({"url": "https://www.google.com/search?q=machine+learning+filetype:pdf"})
// Time filter (past week)
web_fetch({"url": "https://www.google.com/search?q=ai+news&tbs=qdr:w"})
// Privacy search
web_fetch({"url": "https://duckduckgo.com/html/?q=privacy+tools"})
// DuckDuckGo Bangs
web_fetch({"url": "https://duckduckgo.com/html/?q=!gh+tensorflow"})
// Knowledge calculation
web_fetch({"url": "https://www.wolframalpha.com/input?i=100+USD+to+CNY"})
```
#### Advanced Operators
| Operator | Example | Description |
|----------|---------|-------------|
| `site:` | `site:github.com python` | Search within site |
| `filetype:` | `filetype:pdf report` | Specific file type |
| `""` | `"machine learning"` | Exact match |
| `-` | `python -snake` | Exclude term |
| `OR` | `cat OR dog` | Either term |
#### Time Filters
| Parameter | Description |
|-----------|-------------|
| `tbs=qdr:h` | Past hour |
| `tbs=qdr:d` | Past day |
| `tbs=qdr:w` | Past week |
| `tbs=qdr:m` | Past month |
| `tbs=qdr:y` | Past year |
#### Privacy Engines
- **DuckDuckGo**: No tracking
- **Startpage**: Google results + privacy
- **Brave**: Independent index
- **Qwant**: EU GDPR compliant
#### Bangs Shortcuts (DuckDuckGo)
| Bang | Destination |
|------|-------------|
| `!g` | Google |
| `!gh` | GitHub |
| `!so` | Stack Overflow |
| `!w` | Wikipedia |
| `!yt` | YouTube |
#### WolframAlpha Queries
- Math: `integrate x^2 dx`
- Conversion: `100 USD to CNY`
- Stocks: `AAPL stock`
- Weather: `weather in Beijing`
#### Documentation
- `references/advanced-search.md` - Domestic search guide
- `references/international-search.md` - International search guide
- `CHANGELOG.md` - Version history
#### License
MIT
#### Security & Privacy Notice
##### Cookie Handling
- **Purpose**: Cookies are used ONLY to maintain search session state when access is denied (403/429 errors)
- **Storage**: Cookies are kept STRICTLY in memory during runtime - NEVER persisted to disk or config files
- **Acquisition**: Cookies are acquired on-demand from search engine homepages only when search requests fail
- **Scope**: Only session cookies from the specific search engine domain are captured
- **Lifecycle**: Cookies are cleared immediately after the search session completes
- **No Pre-configuration**: No cookies are loaded from config.json or any external file at startup
- **No API Keys**: This tool uses standard web search URLs, no authentication required
##### Crawling Ethics
- **Rate Limiting**: Implement reasonable delays between requests (recommend 1-2 seconds)
- **Respect robots.txt**: Honor search engine crawling policies
- **Terms of Service**: Users are responsible for complying with search engine ToS
- **Purpose**: Designed for legitimate search aggregation, not mass data scraping
##### Data Handling
- **No Personal Data**: Tool does not collect or transmit user personal information
- **Local Execution**: All operations run locally, no external data transmission
- **Session Isolation**: Cookies are session-specific and cleared after use
## Validation
- Verify the requested outcome with the most direct available check.
- Report any skipped step, unavailable dependency, or remaining uncertainty explicitly.
## Boundaries
- Do not broaden the task beyond the user's request.
- Do not use tools that are not listed or clearly available in the current runtime.
## Anti-Patterns
- Do not summarize the skill instead of applying it.
- Do not claim completion without validation evidence.