feat: 添加MinIO文件系统支持并优化外部连接器功能
- 添加MinIO用户文件系统配置选项(BEAVER_MINIO_ROOT_USER等) - 更新外部连接器配置结构,包括BASE_URL和认证令牌设置 - 改进connector provider支持更多类型(official, feishu_bot等) - 实现Mistral模型推理模式支持reasoning_effort参数 - 增强外部连接器策略配置和运行时配置管理 - 添加connector bridge事件验证和安全保护机制 - 优化任务路由逻辑,区分simple_chat和new_task场景 - 更新初始技能工具提示配置,分离authoring admin功能
This commit is contained in:
156
skills/multi-search-engine/versions/v0001/SKILL.md
Normal file
156
skills/multi-search-engine/versions/v0001/SKILL.md
Normal file
@ -0,0 +1,156 @@
|
||||
---
|
||||
name: multi-search-engine
|
||||
description: Multi search engine integration with 16 engines (7 CN + 9 Global). Supports advanced search operators, time filters, site search, privacy engines, and WolframAlpha knowledge queries. No API keys required.
|
||||
tools:
|
||||
- web_fetch
|
||||
---
|
||||
|
||||
# Multi Search Engine
|
||||
|
||||
Integration of 16 search engines for web crawling without API keys.
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Preparation**: AI Agent initializes an empty in-memory cookie store. Cookies are only acquired dynamically during search operations when access is denied
|
||||
|
||||
2. **Language Evaluation**: Detect the language attribute of the search query. If the query is in Chinese, use Domestic search engines (Baidu, Bing CN, Bing INT, 360, Sogou, WeChat, Shenma). If the query is non-Chinese, use International search engines (Google, Google HK, DuckDuckGo, Yahoo, Startpage, Brave, Ecosia, Qwant, WolframAlpha). Select engines based on query relevance and availability.
|
||||
|
||||
3. **Controlled Search**: Use web_fetch to execute search requests with rate limiting:
|
||||
- Add 1-2 second delay between requests to respect server load
|
||||
- Batch requests in groups of 3-4 engines with sequential execution between batches
|
||||
- Include standard browser headers to identify as legitimate user agent
|
||||
- If access is denied (403/429), fetch engine homepage to obtain fresh session cookies
|
||||
|
||||
4. **Cookie Management**:
|
||||
- Cookies are stored ONLY in memory during runtime
|
||||
- Cookies are acquired on-demand when search requests fail
|
||||
- No cookies are read from or written to config.json or any file
|
||||
- Cookies are cleared after search session completes
|
||||
- Only session cookies from search engine domains are captured
|
||||
|
||||
5. **Retry Mechanism**: If a search fails due to cookie/session issues, retry once with freshly acquired cookies after a 2-second delay
|
||||
|
||||
6. **Result Aggregation**: Consolidate successful results from search engines, organize and summarize them to output a core search report
|
||||
|
||||
## Search Engines
|
||||
|
||||
### Domestic (7)
|
||||
- **Baidu**: `https://www.baidu.com/s?wd={keyword}`
|
||||
- **Bing CN**: `https://cn.bing.com/search?q={keyword}&ensearch=0`
|
||||
- **Bing INT**: `https://cn.bing.com/search?q={keyword}&ensearch=1`
|
||||
- **360**: `https://www.so.com/s?q={keyword}`
|
||||
- **Sogou**: `https://sogou.com/web?query={keyword}`
|
||||
- **WeChat**: `https://wx.sogou.com/weixin?type=2&query={keyword}`
|
||||
- **Shenma**: `https://m.sm.cn/s?q={keyword}`
|
||||
|
||||
### International (9)
|
||||
- **Google**: `https://www.google.com/search?q={keyword}`
|
||||
- **Google HK**: `https://www.google.com.hk/search?q={keyword}`
|
||||
- **DuckDuckGo**: `https://duckduckgo.com/html/?q={keyword}`
|
||||
- **Yahoo**: `https://search.yahoo.com/search?p={keyword}`
|
||||
- **Startpage**: `https://www.startpage.com/sp/search?query={keyword}`
|
||||
- **Brave**: `https://search.brave.com/search?q={keyword}`
|
||||
- **Ecosia**: `https://www.ecosia.org/search?q={keyword}`
|
||||
- **Qwant**: `https://www.qwant.com/?q={keyword}`
|
||||
- **WolframAlpha**: `https://www.wolframalpha.com/input?i={keyword}`
|
||||
|
||||
## Quick Examples
|
||||
|
||||
```javascript
|
||||
// Basic search
|
||||
web_fetch({"url": "https://www.google.com/search?q=python+tutorial"})
|
||||
|
||||
// Site-specific
|
||||
web_fetch({"url": "https://www.google.com/search?q=site:github.com+react"})
|
||||
|
||||
// File type
|
||||
web_fetch({"url": "https://www.google.com/search?q=machine+learning+filetype:pdf"})
|
||||
|
||||
// Time filter (past week)
|
||||
web_fetch({"url": "https://www.google.com/search?q=ai+news&tbs=qdr:w"})
|
||||
|
||||
// Privacy search
|
||||
web_fetch({"url": "https://duckduckgo.com/html/?q=privacy+tools"})
|
||||
|
||||
// DuckDuckGo Bangs
|
||||
web_fetch({"url": "https://duckduckgo.com/html/?q=!gh+tensorflow"})
|
||||
|
||||
// Knowledge calculation
|
||||
web_fetch({"url": "https://www.wolframalpha.com/input?i=100+USD+to+CNY"})
|
||||
```
|
||||
|
||||
## Advanced Operators
|
||||
|
||||
| Operator | Example | Description |
|
||||
|----------|---------|-------------|
|
||||
| `site:` | `site:github.com python` | Search within site |
|
||||
| `filetype:` | `filetype:pdf report` | Specific file type |
|
||||
| `""` | `"machine learning"` | Exact match |
|
||||
| `-` | `python -snake` | Exclude term |
|
||||
| `OR` | `cat OR dog` | Either term |
|
||||
|
||||
## Time Filters
|
||||
|
||||
| Parameter | Description |
|
||||
|-----------|-------------|
|
||||
| `tbs=qdr:h` | Past hour |
|
||||
| `tbs=qdr:d` | Past day |
|
||||
| `tbs=qdr:w` | Past week |
|
||||
| `tbs=qdr:m` | Past month |
|
||||
| `tbs=qdr:y` | Past year |
|
||||
|
||||
## Privacy Engines
|
||||
|
||||
- **DuckDuckGo**: No tracking
|
||||
- **Startpage**: Google results + privacy
|
||||
- **Brave**: Independent index
|
||||
- **Qwant**: EU GDPR compliant
|
||||
|
||||
## Bangs Shortcuts (DuckDuckGo)
|
||||
|
||||
| Bang | Destination |
|
||||
|------|-------------|
|
||||
| `!g` | Google |
|
||||
| `!gh` | GitHub |
|
||||
| `!so` | Stack Overflow |
|
||||
| `!w` | Wikipedia |
|
||||
| `!yt` | YouTube |
|
||||
|
||||
## WolframAlpha Queries
|
||||
|
||||
- Math: `integrate x^2 dx`
|
||||
- Conversion: `100 USD to CNY`
|
||||
- Stocks: `AAPL stock`
|
||||
- Weather: `weather in Beijing`
|
||||
|
||||
## Documentation
|
||||
|
||||
- `references/advanced-search.md` - Domestic search guide
|
||||
- `references/international-search.md` - International search guide
|
||||
- `CHANGELOG.md` - Version history
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
## Security & Privacy Notice
|
||||
|
||||
### Cookie Handling
|
||||
- **Purpose**: Cookies are used ONLY to maintain search session state when access is denied (403/429 errors)
|
||||
- **Storage**: Cookies are kept STRICTLY in memory during runtime - NEVER persisted to disk or config files
|
||||
- **Acquisition**: Cookies are acquired on-demand from search engine homepages only when search requests fail
|
||||
- **Scope**: Only session cookies from the specific search engine domain are captured
|
||||
- **Lifecycle**: Cookies are cleared immediately after the search session completes
|
||||
- **No Pre-configuration**: No cookies are loaded from config.json or any external file at startup
|
||||
- **No API Keys**: This tool uses standard web search URLs, no authentication required
|
||||
|
||||
### Crawling Ethics
|
||||
- **Rate Limiting**: Implement reasonable delays between requests (recommend 1-2 seconds)
|
||||
- **Respect robots.txt**: Honor search engine crawling policies
|
||||
- **Terms of Service**: Users are responsible for complying with search engine ToS
|
||||
- **Purpose**: Designed for legitimate search aggregation, not mass data scraping
|
||||
|
||||
### Data Handling
|
||||
- **No Personal Data**: Tool does not collect or transmit user personal information
|
||||
- **Local Execution**: All operations run locally, no external data transmission
|
||||
- **Session Isolation**: Cookies are session-specific and cleared after use
|
||||
Reference in New Issue
Block a user