Liang

@clawhub-matthew77-ac7442ae63

5 prompts

0 upvotes received

0 contributions

Joined 3 months ago

5 contributions in the last year

Minimax API

Skill

Provides image analysis and description from URLs or local files plus real-time web search using MiniMax's Token Plan API.

---
name: minimax-api
description: Enables image understanding and web search via MiniMax's Token Plan API. Use when asked to analyze/describe images, extract information from images, or search the web. Handles both HTTP/HTTPS image URLs and local file paths (absolute paths). Triggers on: "analyze this image", "describe this picture", "what's in this image", "search the web for", "look up", "web search".
---

# Minimax API

## Overview

Provides two capabilities via MiniMax's Token Plan API:
1. **Image understanding** — analyze images via VLM
2. **Web search** — real-time web search

API base URL: `https://api.minimaxi.com`

## Capabilities

### 1. understand_image

Analyzes an image and returns a text description.

**Input:**
- `image_url`: HTTP/HTTPS URL or absolute local file path (e.g., `/home/user/photo.png`, `D:\images\photo.png`)
- `prompt`: What to ask about the image

**Output:** Text description from the VLM.

**Script:** `scripts/minimax_image.py`

**Usage:**
```bash
export MINIMAX_API_KEY="your_api_key"

python3 skills/minimax-api/scripts/minimax_image.py \
  --prompt "Describe this image briefly" \
  --image-url "https://example.com/photo.jpg"

# Or with local file
python3 skills/minimax-api/scripts/minimax_image.py \
  --prompt "Extract text from this image" \
  --image-url "/home/user/documents/receipt.png"
```

### 2. web_search

Performs a web search and returns formatted results.

**Input:**
- `query`: Search query string

**Output:** Formatted text listing organic results (title, date, link, snippet) and related searches.

**Script:** `scripts/minimax_search.py`

**Usage:**
```bash
export MINIMAX_API_KEY="your_api_key"

python3 skills/minimax-api/scripts/minimax_search.py \
  --query "MiniMax M2.7 release notes"
```

## Setup

**Required:** A MiniMax API key from [platform.minimaxi.com](https://platform.minimaxi.com).

Set it as an environment variable:

```bash
export MINIMAX_API_KEY="your_api_key_here"
```

Add the above line to your `~/.bashrc` (or `.zshrc`) to make it permanent.

Alternatively, pass `--api-key` directly on the command line (not recommended — exposes key in shell history).
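
The lookup order described above (an explicit `--api-key` value first, then the `MINIMAX_API_KEY` environment variable) can be sketched as follows. The helper name `resolve_api_key` is hypothetical and is not part of the shipped scripts; it only illustrates the precedence.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(cli_value: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, preferring an explicit value over the environment.

    `resolve_api_key` is a hypothetical helper used for illustration only.
    """
    key = (cli_value or env.get("MINIMAX_API_KEY", "")).strip()
    if not key:
        raise RuntimeError("MINIMAX_API_KEY is not set and no --api-key was given")
    return key
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.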

## API Reference

See `references/api_spec.md` for full API documentation including request/response schemas, error codes, and headers.

FILE:references/api_spec.md
# MiniMax Token Plan API Specification

## Overview

MiniMax provides two API endpoints accessible via the Token Plan subscription:
- **VLM API** — image understanding (vision)
- **Search API** — web search

Base URL: `https://api.minimaxi.com`

## Authentication

All requests require:
```
Authorization: Bearer {API_KEY}
MM-API-Source: minimax-coding-plan-mcp
Content-Type: application/json
```
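
As a minimal sketch, the three headers above can be attached to a request with the standard library alone. The function name `build_request` and the constant `BASE_URL` are illustrative assumptions; the header names and values are the ones listed in this section.

```python
import json
import urllib.request

BASE_URL = "https://api.minimaxi.com"  # base URL stated in this spec


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request carrying the required headers (hypothetical helper)."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "MM-API-Source": "minimax-coding-plan-mcp",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object is only constructed here; sending it would be done with `urllib.request.urlopen(req, timeout=...)`.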

## Endpoints

### 1. Image Understanding (VLM)

**Endpoint:** `POST /v1/coding_plan/vlm`

**Request Body:**
```json
{
  "prompt": "string - The question/instruction about the image",
  "image_url": "string - Image as data URL (data:image/{format};base64,{data})"
}
```

**Supported Image Formats:** JPEG, PNG, WebP

**Image URL Formats:**
- `data:image/jpeg;base64,{base64_data}` — base64 encoded image data
- `data:image/png;base64,{base64_data}` — base64 encoded PNG
- `data:image/webp;base64,{base64_data}` — base64 encoded WebP
- `https://example.com/image.jpg` — HTTP/HTTPS URL (only works via MCP server which auto-converts)
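
A data URL in the formats listed above can be produced from raw bytes roughly as shown below. This is a sketch under assumptions: `to_data_url` and `EXT_TO_FORMAT` are hypothetical names, and rejecting unknown extensions is a choice made here (the bundled script instead falls back to `jpeg`).

```python
import base64

# Formats this spec lists as supported, keyed by file extension.
EXT_TO_FORMAT = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".webp": "webp"}


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode image bytes as data:image/{format};base64,{data} (hypothetical helper)."""
    lower = filename.lower()
    for ext, fmt in EXT_TO_FORMAT.items():
        if lower.endswith(ext):
            break
    else:
        # Assumption: fail loudly rather than guess a format.
        raise ValueError(f"Unsupported image format: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{fmt};base64,{encoded}"
```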

**Response:**
```json
{
  "content": "string - The VLM's text response",
  "base_resp": {
    "status_code": 0,
    "status_msg": "string"
  }
}
```

**Error Codes:**
- `0` — Success
- `1004` — Authentication error (invalid API key)
- `2013` — Invalid parameters (e.g., invalid image URL)
- `2038` — Real-name verification required

### 2. Web Search

**Endpoint:** `POST /v1/coding_plan/search`

**Request Body:**
```json
{
  "q": "string - Search query"
}
```

**Response:**
```json
{
  "organic": [
    {
      "title": "string - Result title",
      "link": "string - Result URL",
      "snippet": "string - Result description",
      "date": "string - Publication date (if available)"
    }
  ],
  "related_searches": [
    {
      "query": "string - Related search query"
    }
  ],
  "base_resp": {
    "status_code": 0,
    "status_msg": "string"
  }
}
```
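
Given a response shaped like the schema above, the useful fields can be pulled out with plain dictionary access. The sketch below assumes only the field names shown in the schema; `top_links` is a hypothetical helper.

```python
from typing import List, Tuple


def top_links(response: dict, limit: int = 3) -> List[Tuple[str, str]]:
    """Return (title, link) pairs from `organic`, skipping entries without a link."""
    pairs = []
    for item in response.get("organic", []):
        link = item.get("link", "")
        if link:
            pairs.append((item.get("title", "No title"), link))
    return pairs[:limit]
```

Using `.get()` with defaults keeps the helper tolerant of optional fields such as `date`, which the schema marks as present only when available.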

## Error Handling

All endpoints return a `base_resp` object:

| status_code | Meaning |
|-------------|---------|
| 0 | Success |
| 1004 | Auth error — check API key |
| 2013 | Invalid parameters |
| 2038 | Real-name verification needed |

On error, the tool exits with code 1 and prints the error message to stderr.
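
A caller that prefers exceptions over exit codes could check `base_resp` along these lines. This is a sketch, not the scripts' actual behavior: `MiniMaxAPIError`, `STATUS_MEANINGS`, and `check_base_resp` are hypothetical names, and the fallback messages are copied from the table above.

```python
STATUS_MEANINGS = {
    1004: "Auth error: check API key",
    2013: "Invalid parameters",
    2038: "Real-name verification needed",
}


class MiniMaxAPIError(RuntimeError):
    """Raised when base_resp.status_code is non-zero (hypothetical class)."""

    def __init__(self, code: int, message: str):
        super().__init__(f"[{code}] {message}")
        self.code = code


def check_base_resp(response: dict) -> dict:
    """Return the response unchanged on success, raise on a non-zero status code."""
    base = response.get("base_resp", {})
    code = base.get("status_code", 0)
    if code != 0:
        # Prefer the server's message; fall back to the documented meaning.
        message = base.get("status_msg") or STATUS_MEANINGS.get(code, "Unknown error")
        raise MiniMaxAPIError(code, message)
    return response
```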

FILE:scripts/minimax_image.py
#!/usr/bin/env python3
"""
Minimax VLM (Vision Language Model) API client.
Handles image understanding via MiniMax Token Plan API.

Usage:
    python3 minimax_image.py --api-key KEY --prompt PROMPT --image-url URL_OR_PATH
"""

import argparse
import base64
import json
import os
import sys
import urllib.request
import urllib.error


def download_image_as_base64(url: str) -> str:
    """Download HTTP/HTTPS image and return base64 data URL."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            image_data = response.read()
            content_type = response.headers.get('Content-Type', 'image/jpeg').lower()
            
            # Detect format
            if 'png' in content_type:
                fmt = 'png'
            elif 'webp' in content_type:
                fmt = 'webp'
            elif 'jpeg' in content_type or 'jpg' in content_type:
                fmt = 'jpeg'
            else:
                fmt = 'jpeg'  # default
            
            b64_data = base64.b64encode(image_data).decode('utf-8')
            return f"data:image/{fmt};base64,{b64_data}"
    except urllib.error.URLError as e:
        print(f"Error downloading image: {e}", file=sys.stderr)
        sys.exit(1)


def local_file_to_base64(path: str) -> str:
    """Read local image file and return base64 data URL."""
    if not os.path.exists(path):
        print(f"Error: File not found: {path}", file=sys.stderr)
        sys.exit(1)
    
    # Detect format from extension
    lower_path = path.lower()
    if lower_path.endswith('.png'):
        fmt = 'png'
    elif lower_path.endswith('.webp'):
        fmt = 'webp'
    elif lower_path.endswith('.jpg') or lower_path.endswith('.jpeg'):
        fmt = 'jpeg'
    else:
        fmt = 'jpeg'  # default
    
    try:
        with open(path, 'rb') as f:
            image_data = f.read()
        b64_data = base64.b64encode(image_data).decode('utf-8')
        return f"data:image/{fmt};base64,{b64_data}"
    except IOError as e:
        print(f"Error reading file: {e}", file=sys.stderr)
        sys.exit(1)


def process_image_url(image_url: str) -> str:
    """
    Process image input and convert to base64 data URL.
    
    Handles:
    - HTTP/HTTPS URLs: downloads and converts to base64
    - Local file paths: reads file and converts to base64
    - Base64 data URLs: passes through as-is
    """
    if image_url.startswith('data:'):
        # Already a data URL
        return image_url
    elif image_url.startswith(('http://', 'https://')):
        return download_image_as_base64(image_url)
    else:
        # Local file path
        return local_file_to_base64(image_url)


def call_vlm_api(api_key: str, prompt: str, image_url: str) -> dict:
    """Call MiniMax VLM API."""
    # Process image (convert to base64 if needed)
    processed_image_url = process_image_url(image_url)
    
    url = "https://api.minimaxi.com/v1/coding_plan/vlm"
    
    payload = {
        "prompt": prompt,
        "image_url": processed_image_url
    }
    
    data = json.dumps(payload).encode('utf-8')
    
    req = urllib.request.Request(
        url,
        data=data,
        headers={
            'Authorization': f'Bearer {api_key}',
            'Content-Type': 'application/json',
            'MM-API-Source': 'minimax-coding-plan-mcp'
        },
        method='POST'
    )
    
    try:
        with urllib.request.urlopen(req, timeout=60) as response:
            result = json.loads(response.read().decode('utf-8'))
            return result
    except urllib.error.HTTPError as e:
        error_body = e.read().decode('utf-8') if e.fp else ''
        print(f"API Error {e.code}: {error_body}", file=sys.stderr)
        sys.exit(1)
    except urllib.error.URLError as e:
        print(f"Request Error: {e}", file=sys.stderr)
        sys.exit(1)


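def main():
    parser = argparse.ArgumentParser(description='Minimax VLM Image Understanding')
    # Fall back to MINIMAX_API_KEY so the documented `export` workflow works
    parser.add_argument('--api-key', default=os.environ.get('MINIMAX_API_KEY'),
                        help='Minimax API key (defaults to $MINIMAX_API_KEY)')
    parser.add_argument('--prompt', required=True, help='Prompt/question about the image')
    parser.add_argument('--image-url', required=True, help='Image URL or local file path')

    args = parser.parse_args()
    if not args.api_key:
        parser.error('API key required: set MINIMAX_API_KEY or pass --api-key')

    result = call_vlm_api(args.api_key, args.prompt, args.image_url)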
    
    # Check for API errors
    base_resp = result.get('base_resp', {})
    if base_resp.get('status_code', 0) != 0:
        status_msg = base_resp.get('status_msg', 'Unknown error')
        print(f"API Error: {status_msg}", file=sys.stderr)
        sys.exit(1)
    
    # Output the content
    content = result.get('content', '')
    if content:
        print(content)
    else:
        print("No content returned from API", file=sys.stderr)
        sys.exit(1)


if __name__ == '__main__':
    main()

FILE:scripts/minimax_search.py
#!/usr/bin/env python3
"""
Minimax Web Search API client.
Performs web searches via MiniMax Token Plan API.

Usage:
    python3 minimax_search.py --api-key KEY --query "search query"

Example:
    python3 minimax_search.py --api-key KEY --query "MiniMax M2.7 release notes"
"""

import argparse
import json
import sys
import urllib.request
import urllib.error


def call_search_api(api_key: str, query: str) -> dict:
    """Call MiniMax Search API."""
    url = "https://api.minimaxi.com/v1/coding_plan/search"
    
    payload = {
        "q": query
    }
    
    data = json.dumps(payload).encode('utf-8')
    
    req = urllib.request.Request(
        url,
        data=data,
        headers={
            'Authorization': f'Bearer {api_key}',
            'Content-Type': 'application/json',
            'MM-API-Source': 'minimax-coding-plan-mcp'
        },
        method='POST'
    )
    
    try:
        with urllib.request.urlopen(req, timeout=30) as response:
            result = json.loads(response.read().decode('utf-8'))
            return result
    except urllib.error.HTTPError as e:
        error_body = e.read().decode('utf-8') if e.fp else ''
        print(f"API Error {e.code}: {error_body}", file=sys.stderr)
        sys.exit(1)
    except urllib.error.URLError as e:
        print(f"Request Error: {e}", file=sys.stderr)
        sys.exit(1)


def format_results(result: dict) -> str:
    """Format search results as readable text."""
    lines = []
    
    base_resp = result.get('base_resp', {})
    if base_resp.get('status_code', 0) != 0:
        return f"API Error: {base_resp.get('status_msg', 'Unknown error')}"
    
    organic = result.get('organic', [])
    if organic:
        lines.append("=== Search Results ===")
        for i, item in enumerate(organic, 1):
            title = item.get('title', 'No title')
            link = item.get('link', '')
            snippet = item.get('snippet', '')
            date = item.get('date', '')
            
            lines.append(f"\n[{i}] {title}")
            if date:
                lines.append(f"    Date: {date}")
            lines.append(f"    Link: {link}")
            if snippet:
                lines.append(f"    {snippet}")
    
    related = result.get('related_searches', [])
    if related:
        lines.append("\n=== Related Searches ===")
        for item in related:
            query = item.get('query', '')
            if query:
                lines.append(f"  - {query}")
    
    return '\n'.join(lines) if lines else "No results found"


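def main():
    import os  # local import: only needed for the environment-variable fallback

    parser = argparse.ArgumentParser(description='Minimax Web Search')
    # Fall back to MINIMAX_API_KEY so the documented `export` workflow works
    parser.add_argument('--api-key', default=os.environ.get('MINIMAX_API_KEY'),
                        help='Minimax API key (defaults to $MINIMAX_API_KEY)')
    parser.add_argument('--query', required=True, help='Search query')

    args = parser.parse_args()
    if not args.api_key:
        parser.error('API key required: set MINIMAX_API_KEY or pass --api-key')

    result = call_search_api(args.api_key, args.query)

    # Report API-level errors on stderr with a non-zero exit code
    base_resp = result.get('base_resp', {})
    if base_resp.get('status_code', 0) != 0:
        print(f"API Error: {base_resp.get('status_msg', 'Unknown error')}", file=sys.stderr)
        sys.exit(1)

    print(format_results(result))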


if __name__ == '__main__':
    main()

Tavily Search

Skill

Web search using Tavily's LLM-optimized API. Returns relevant results with content snippets, scores, and metadata.

---
name: tavily-search
description: Web search using Tavily's LLM-optimized API. Returns relevant results with content snippets, scores, and metadata.
homepage: https://tavily.com
metadata: {"openclaw":{"emoji":"🔍","requires":{"bins":["node"],"env":["TAVILY_API_KEY"]},"primaryEnv":"TAVILY_API_KEY"}}
---

# Tavily Search

Search the web and get relevant results optimized for LLM consumption.

## Authentication

Get your API key at https://tavily.com and add to your OpenClaw config:

```json
{
  "skills": {
    "entries": {
      "tavily-search": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}
```

Or set the environment variable:
```bash
export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
```

## Quick Start

### Using the Script

```bash
node {baseDir}/scripts/search.mjs "query"
node {baseDir}/scripts/search.mjs "query" -n 10
node {baseDir}/scripts/search.mjs "query" --depth advanced
node {baseDir}/scripts/search.mjs "query" --topic news
```

### Examples

```bash
# Basic search
node {baseDir}/scripts/search.mjs "python async patterns"

# With more results
node {baseDir}/scripts/search.mjs "React hooks tutorial" -n 10

# Advanced search
node {baseDir}/scripts/search.mjs "machine learning" --depth advanced

# News search
node {baseDir}/scripts/search.mjs "AI news" --topic news

# Domain-filtered search
node {baseDir}/scripts/search.mjs "Python docs" --include-domains docs.python.org
```

## Options

| Option | Description | Default |
|--------|-------------|---------|
| `-n <count>` | Number of results (1-20) | 10 |
| `--depth <mode>` | Search depth: `ultra-fast`, `fast`, `basic`, `advanced` | `basic` |
| `--topic <topic>` | Topic: `general` or `news` | `general` |
| `--time-range <range>` | Time range: `day`, `week`, `month`, `year` | - |
| `--include-domains <domains>` | Comma-separated domains to include | - |
| `--exclude-domains <domains>` | Comma-separated domains to exclude | - |
| `--raw-content` | Include full page content | false |
| `--json` | Output raw JSON | false |
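
For readers who want to see how these options map onto the request, here is a Python sketch of the body that `scripts/search.mjs` assembles. The field names (`max_results`, `search_depth`, `include_domains`, and so on) are taken from that script; the function name `build_search_body` is hypothetical.

```python
from typing import List, Optional


def build_search_body(query: str, max_results: int = 10, depth: str = "basic",
                      topic: str = "general", time_range: Optional[str] = None,
                      include_domains: Optional[List[str]] = None,
                      exclude_domains: Optional[List[str]] = None,
                      raw_content: bool = False) -> dict:
    """Mirror the request body built by search.mjs (illustrative only)."""
    body = {
        "query": query,
        "max_results": max(1, min(max_results, 20)),  # clamp to the 1-20 range
        "search_depth": depth,
        "topic": topic,
        "include_raw_content": raw_content,
    }
    # Optional filters are only sent when they were actually provided.
    if time_range:
        body["time_range"] = time_range
    if include_domains:
        body["include_domains"] = include_domains
    if exclude_domains:
        body["exclude_domains"] = exclude_domains
    return body
```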

## Search Depth

| Depth | Latency | Relevance | Use Case |
|-------|---------|-----------|----------|
| `ultra-fast` | Lowest | Lower | Real-time chat, autocomplete |
| `fast` | Low | Good | Need chunks but latency matters |
| `basic` | Medium | High | General-purpose, balanced |
| `advanced` | Higher | Highest | Precision matters, research |

## Tips

- **Keep queries under 400 characters** - Think search query, not prompt
- **Break complex queries into sub-queries** - Better results than one massive query
- **Use `--include-domains`** to focus on trusted sources
- **Use `--time-range`** for recent information
- **Filter by `score`** (0-1) to get highest relevance results
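
The score-filtering tip can be applied to the `results` array of the `--json` output. The sketch below assumes each result is a dict with an optional `score` between 0 and 1; `filter_by_score` and the 0.5 threshold are illustrative choices, not recommendations from Tavily.

```python
from typing import List


def filter_by_score(results: List[dict], min_score: float = 0.5) -> List[dict]:
    """Keep results at or above min_score, highest score first.

    Results without a score are treated as 0.0 (an assumption).
    """
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r.get("score", 0.0), reverse=True)
```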
FILE:scripts/search.mjs
#!/usr/bin/env node

function usage() {
  console.error(`Usage: search.mjs "query" [options]

Options:
  -n <count>              Number of results (1-20, default: 10)
  --depth <mode>           Search depth: ultra-fast, fast, basic, advanced (default: basic)
  --topic <topic>          Topic: general or news (default: general)
  --time-range <range>      Time range: day, week, month, year
  --include-domains <list>  Comma-separated domains to include
  --exclude-domains <list>  Comma-separated domains to exclude
  --raw-content            Include full page content
  --json                   Output raw JSON

Examples:
  search.mjs "python async patterns"
  search.mjs "React hooks tutorial" -n 10
  search.mjs "AI news" --topic news --time-range week
  search.mjs "Python docs" --include-domains docs.python.org,realpython.com`);
  process.exit(2);
}

const args = process.argv.slice(2);
if (args.length === 0 || args[0] === "-h" || args[0] === "--help") usage();

const query = args[0];
let maxResults = 10;
let searchDepth = "basic";
let topic = "general";
let timeRange = null;
let includeDomains = [];
let excludeDomains = [];
let includeRawContent = false;
let outputJson = false;

for (let i = 1; i < args.length; i++) {
  const a = args[i];
  if (a === "-n") {
    maxResults = Number.parseInt(args[i + 1] ?? "10", 10);
    i++;
    continue;
  }
  if (a === "--depth") {
    searchDepth = args[i + 1] ?? "basic";
    i++;
    continue;
  }
  if (a === "--topic") {
    topic = args[i + 1] ?? "general";
    i++;
    continue;
  }
  if (a === "--time-range") {
    timeRange = args[i + 1];
    i++;
    continue;
  }
  if (a === "--include-domains") {
    includeDomains = (args[i + 1] ?? "").split(",").map(d => d.trim()).filter(Boolean);
    i++;
    continue;
  }
  if (a === "--exclude-domains") {
    excludeDomains = (args[i + 1] ?? "").split(",").map(d => d.trim()).filter(Boolean);
    i++;
    continue;
  }
  if (a === "--raw-content") {
    includeRawContent = true;
    continue;
  }
  if (a === "--json") {
    outputJson = true;
    continue;
  }
  console.error(`Unknown arg: ${a}`);
  usage();
}

const apiKey = (process.env.TAVILY_API_KEY ?? "").trim();
if (!apiKey) {
  console.error("Error: TAVILY_API_KEY not set");
  console.error("Get your API key at https://tavily.com");
  process.exit(1);
}

const body = {
  query: query,
  max_results: Math.max(1, Math.min(maxResults, 20)),
  search_depth: searchDepth,
  topic: topic,
  include_raw_content: includeRawContent,
};

if (timeRange) body.time_range = timeRange;
if (includeDomains.length > 0) body.include_domains = includeDomains;
if (excludeDomains.length > 0) body.exclude_domains = excludeDomains;

const resp = await fetch("https://api.tavily.com/search", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
  },
  body: JSON.stringify(body),
});

if (!resp.ok) {
  const text = await resp.text().catch(() => "");
  throw new Error(`Tavily Search failed (${resp.status}): ${text}`);
}

const data = await resp.json();

if (outputJson) {
  console.log(JSON.stringify(data, null, 2));
  process.exit(0);
}

// Print AI answer if available
if (data.answer) {
  console.log("## Answer\n");
  console.log(data.answer);
  console.log("\n---\n");
}

// Print results
const results = (data.results ?? []).slice(0, maxResults);
console.log(`## Sources (${results.length} results)\n`);

for (const r of results) {
  const title = String(r?.title ?? "").trim();
  const url = String(r?.url ?? "").trim();
  const content = String(r?.content ?? "").trim();
  const score = r?.score ? ` (relevance: ${(r.score * 100).toFixed(0)}%)` : "";

  if (!title || !url) continue;

  console.log(`- **${title}**${score}`);
  console.log(`  ${url}`);
  if (content) {
    // Show at most 300 characters of the snippet
    console.log(`  ${content.slice(0, 300)}${content.length > 300 ? "..." : ""}`);
  }
  console.log();
}

if (data.response_time) {
  console.log(`\nResponse time: ${data.response_time}s`);
}

Tavily Research

Skill

Comprehensive research grounded in web data with explicit citations. Use when you need multi-source synthesis—comparisons, current events, market analysis, d...

---
name: tavily-research
description: Comprehensive research grounded in web data with explicit citations. Use when you need multi-source synthesis—comparisons, current events, market analysis, detailed reports.
homepage: https://tavily.com
metadata: {"openclaw":{"emoji":"📊","requires":{"bins":["node"],"env":["TAVILY_API_KEY"]},"primaryEnv":"TAVILY_API_KEY"}}
---

# Tavily Research

Conduct comprehensive research on any topic with automatic source gathering, analysis, and response generation with citations.

## Authentication

Get your API key at https://tavily.com and add to your OpenClaw config:

```json
{
  "skills": {
    "entries": {
      "tavily-research": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}
```

Or set the environment variable:
```bash
export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
```

## Quick Start

### Using the Script

```bash
node {baseDir}/scripts/research.mjs "query"
node {baseDir}/scripts/research.mjs "query" --model pro
node {baseDir}/scripts/research.mjs "query" --output report.md
```

### Examples

```bash
# Quick overview
node {baseDir}/scripts/research.mjs "What is retrieval augmented generation?"

# Comprehensive analysis
node {baseDir}/scripts/research.mjs "LangGraph vs CrewAI for multi-agent systems" --model pro

# Market research with output file
node {baseDir}/scripts/research.mjs "Fintech startup landscape 2025" --model pro --output fintech-report.md

# Technical comparison
node {baseDir}/scripts/research.mjs "React vs Vue vs Svelte" --model pro
```

## Options

| Option | Description | Default |
|--------|-------------|---------|
| `--model <model>` | Model: `mini`, `pro`, `auto` | `mini` |
| `--output <file>` | Save report to file | - |
| `--json` | Output raw JSON | false |

## Model Selection

**Rule of thumb**: "what does X do?" → mini. "X vs Y vs Z" or "best way to..." → pro.

| Model | Use Case | Speed |
|-------|----------|-------|
| `mini` | Single topic, targeted research | ~30s |
| `pro` | Comprehensive multi-angle analysis | ~60-120s |
| `auto` | API chooses based on complexity | Varies |
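
In the bundled `scripts/research.mjs`, the model name is translated into Tavily search parameters before the request is sent. A Python sketch of that mapping follows; `research_params` is a hypothetical name, and the validation of unknown model names is an addition made here (the script itself does not validate).

```python
def research_params(model: str = "mini") -> dict:
    """Map a model name to search parameters, mirroring research.mjs."""
    if model not in ("mini", "pro", "auto"):
        # Assumption: reject unknown names; the script silently accepts them.
        raise ValueError(f"Unknown model: {model}")
    if model == "mini":
        return {"search_depth": "basic", "max_results": 5}
    # In the script, "pro" sets these values explicitly and "auto" keeps the
    # same defaults, so both end up with identical parameters.
    return {"search_depth": "advanced", "max_results": 10}
```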

## Output Format

The research includes:
- **AI-generated answer**: Comprehensive synthesis
- **Sources**: Citations with titles, URLs, and relevance scores
- **Metadata**: Query, response time, and statistics

## Tips

- Research can take 30-120 seconds depending on complexity
- Use `--model pro` for comparisons, market analysis, or detailed reports
- Use `--output` to save reports for later reference
- The `auto` model lets Tavily choose based on query complexity
FILE:scripts/research.mjs
#!/usr/bin/env node

function usage() {
  console.error(`Usage: research.mjs "query" [options]

Options:
  --model <model>   Model: mini, pro, auto (default: mini)
  --output <file>   Save report to file
  --json            Output raw JSON

Examples:
  research.mjs "What is retrieval augmented generation?"
  research.mjs "LangGraph vs CrewAI" --model pro
  research.mjs "Fintech landscape 2025" --model pro --output report.md`);
  process.exit(2);
}

const args = process.argv.slice(2);
if (args.length === 0 || args[0] === "-h" || args[0] === "--help") usage();

const query = args[0];
let model = "mini";
let outputFile = null;
let outputJson = false;

for (let i = 1; i < args.length; i++) {
  const a = args[i];
  if (a === "--model") {
    model = args[i + 1] ?? "mini";
    i++;
    continue;
  }
  if (a === "--output") {
    outputFile = args[i + 1];
    i++;
    continue;
  }
  if (a === "--json") {
    outputJson = true;
    continue;
  }
  console.error(`Unknown arg: ${a}`);
  usage();
}

const apiKey = (process.env.TAVILY_API_KEY ?? "").trim();
if (!apiKey) {
  console.error("Error: TAVILY_API_KEY not set");
  console.error("Get your API key at https://tavily.com");
  process.exit(1);
}

const body = {
  query: query,
  search_depth: "advanced",
  include_answer: true,
  include_raw_content: false,
  max_results: 10,
  topic: "general",
};

// Map model to search_depth
if (model === "mini") {
  body.search_depth = "basic";
  body.max_results = 5;
} else if (model === "pro") {
  body.search_depth = "advanced";
  body.max_results = 10;
}

const resp = await fetch("https://api.tavily.com/search", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
  },
  body: JSON.stringify(body),
});

if (!resp.ok) {
  const text = await resp.text().catch(() => "");
  throw new Error(`Tavily Research failed (${resp.status}): ${text}`);
}

const data = await resp.json();

if (outputJson) {
  console.log(JSON.stringify(data, null, 2));
  process.exit(0);
}

// Build report
let report = "";

report += `# Research Report: ${query}\n\n`;

if (data.answer) {
  report += `## Summary\n\n${data.answer}\n\n`;
  report += `---\n\n`;
}

// Sources
const results = data.results ?? [];
report += `## Sources (${results.length})\n\n`;

for (const r of results) {
  const title = String(r?.title ?? "").trim();
  const url = String(r?.url ?? "").trim();
  const content = String(r?.content ?? "").trim();
  const score = r?.score ? ` (relevance: ${(r.score * 100).toFixed(0)}%)` : "";

  if (!title || !url) continue;

  report += `### ${title}${score}\n\n`;
  report += `${url}\n\n`;
  if (content) {
    report += `${content}\n\n`;
  }
}

if (data.response_time) {
  report += `\n---\n\nResponse time: ${data.response_time}s\n`;
}

// Output
if (outputFile) {
  const fs = await import("fs");
  fs.writeFileSync(outputFile, report, "utf-8");
  console.log(`Report saved to ${outputFile}`);
} else {
  console.log(report);
}

Tavily Extract

Skill

Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages.

---
name: tavily-extract
description: Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages.
homepage: https://tavily.com
metadata: {"openclaw":{"emoji":"📄","requires":{"bins":["node"],"env":["TAVILY_API_KEY"]},"primaryEnv":"TAVILY_API_KEY"}}
---

# Tavily Extract

Extract clean content from specific URLs. Ideal when you know which pages you want content from.

## Authentication

Get your API key at https://tavily.com and add to your OpenClaw config:

```json
{
  "skills": {
    "entries": {
      "tavily-extract": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}
```

Or set the environment variable:
```bash
export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
```

## Quick Start

### Using the Script

```bash
node {baseDir}/scripts/extract.mjs "https://example.com/article"
node {baseDir}/scripts/extract.mjs "url1,url2,url3"
node {baseDir}/scripts/extract.mjs "url" --query "authentication API"
```

### Examples

```bash
# Single URL
node {baseDir}/scripts/extract.mjs "https://docs.python.org/3/tutorial/classes.html"

# Multiple URLs
node {baseDir}/scripts/extract.mjs "https://example.com/page1,https://example.com/page2"

# With query focus
node {baseDir}/scripts/extract.mjs "https://example.com/docs" --query "authentication API"

# Advanced extraction for JS pages
node {baseDir}/scripts/extract.mjs "https://app.example.com" --depth advanced --timeout 60
```

## Options

| Option | Description | Default |
|--------|-------------|---------|
| `--query <text>` | Rerank chunks by relevance | - |
| `--chunks <n>` | Chunks per URL (1-5, requires query) | 3 |
| `--depth <mode>` | Extract depth: `basic` or `advanced` | `basic` |
| `--format <fmt>` | Output format: `markdown` or `text` | `markdown` |
| `--timeout <sec>` | Max wait time (1-60 seconds) | varies |
| `--json` | Output raw JSON | false |
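
The options above correspond to fields in the request body that `scripts/extract.mjs` sends. The Python sketch below mirrors that mapping; the field names come from the script, while `build_extract_body` is a hypothetical name and the exceptions stand in for the script's error messages.

```python
from typing import List, Optional


def build_extract_body(urls: List[str], query: Optional[str] = None, chunks: int = 3,
                       depth: str = "basic", fmt: str = "markdown",
                       timeout: Optional[float] = None) -> dict:
    """Mirror the request body built by extract.mjs (illustrative only)."""
    if not urls:
        raise ValueError("No URLs provided")
    if len(urls) > 20:
        raise ValueError("Max 20 URLs per request")
    body = {"urls": list(urls), "extract_depth": depth, "format": fmt}
    if query:
        # chunks_per_source is only sent together with a query
        body["query"] = query
        body["chunks_per_source"] = chunks
    if timeout:
        body["timeout"] = timeout
    return body
```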

## Extract Depth

| Depth | When to Use |
|-------|-------------|
| `basic` | Simple text extraction, faster |
| `advanced` | Dynamic/JS-rendered pages, tables, structured data |

## Tips

- **Max 20 URLs per request** - batch larger lists
- **Use `--query` + `--chunks`** to get only relevant content
- **Try `basic` first**, fall back to `advanced` if content is missing
- **Set longer `--timeout`** for slow pages (up to 60s)
- **Check `failed_results`** in JSON output for URLs that couldn't be extracted
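
Batching a longer list to respect the 20-URL limit could look like the sketch below. `batch_urls` is a hypothetical helper; each returned batch can be joined with commas and passed to `extract.mjs` as its first argument.

```python
from typing import Iterable, List


def batch_urls(urls: Iterable[str], size: int = 20) -> List[List[str]]:
    """Split URLs into batches of at most `size`, dropping blank entries."""
    cleaned = [u.strip() for u in urls if u.strip()]
    return [cleaned[i:i + size] for i in range(0, len(cleaned), size)]
```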
FILE:scripts/extract.mjs
#!/usr/bin/env node

function usage() {
  console.error(`Usage: extract.mjs "url1,url2,..." [options]

URLs are comma-separated (max 20).

Options:
  --query <text>    Rerank chunks by relevance
  --chunks <n>       Chunks per URL (1-5, requires query)
  --depth <mode>     Extract depth: basic or advanced (default: basic)
  --format <fmt>     Output format: markdown or text (default: markdown)
  --timeout <sec>    Max wait time (1-60 seconds)
  --json              Output raw JSON

Examples:
  extract.mjs "https://docs.python.org/3/tutorial/classes.html"
  extract.mjs "https://example.com/page1,https://example.com/page2"
  extract.mjs "https://example.com/docs" --query "authentication API"
  extract.mjs "https://app.example.com" --depth advanced --timeout 60`);
  process.exit(2);
}

const args = process.argv.slice(2);
if (args.length === 0 || args[0] === "-h" || args[0] === "--help") usage();

const urlsInput = args[0];
let query = null;
let chunksPerSource = 3;
let extractDepth = "basic";
let format = "markdown";
let timeout = null;
let outputJson = false;

for (let i = 1; i < args.length; i++) {
  const a = args[i];
  if (a === "--query") {
    query = args[i + 1];
    i++;
    continue;
  }
  if (a === "--chunks") {
    chunksPerSource = Number.parseInt(args[i + 1] ?? "3", 10);
    i++;
    continue;
  }
  if (a === "--depth") {
    extractDepth = args[i + 1] ?? "basic";
    i++;
    continue;
  }
  if (a === "--format") {
    format = args[i + 1] ?? "markdown";
    i++;
    continue;
  }
  if (a === "--timeout") {
    timeout = Number.parseFloat(args[i + 1]);
    i++;
    continue;
  }
  if (a === "--json") {
    outputJson = true;
    continue;
  }
  console.error(`Unknown arg: ${a}`);
  usage();
}

const urls = urlsInput.split(",").map(u => u.trim()).filter(Boolean);
if (urls.length === 0) {
  console.error("Error: No URLs provided");
  process.exit(1);
}
if (urls.length > 20) {
  console.error("Error: Max 20 URLs per request");
  process.exit(1);
}

const apiKey = (process.env.TAVILY_API_KEY ?? "").trim();
if (!apiKey) {
  console.error("Error: TAVILY_API_KEY not set");
  console.error("Get your API key at https://tavily.com");
  process.exit(1);
}

const body = {
  urls: urls,
  extract_depth: extractDepth,
  format: format,
};

if (query) body.query = query;
if (query && chunksPerSource) body.chunks_per_source = chunksPerSource;
if (timeout) body.timeout = timeout;

const resp = await fetch("https://api.tavily.com/extract", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
  },
  body: JSON.stringify(body),
});

if (!resp.ok) {
  const text = await resp.text().catch(() => "");
  throw new Error(`Tavily Extract failed (${resp.status}): ${text}`);
}

const data = await resp.json();

if (outputJson) {
  console.log(JSON.stringify(data, null, 2));
  process.exit(0);
}

// Print results
const results = data.results ?? [];
console.log(`## Extracted Content (${results.length} URLs)\n`);

for (const r of results) {
  const url = String(r?.url ?? "").trim();
  const content = String(r?.raw_content ?? "").trim();

  if (!url) continue;

  console.log(`### ${url}\n`);
  if (content) {
    console.log(content);
  }
  console.log("\n---\n");
}

// Print failed results
const failed = data.failed_results ?? [];
if (failed.length > 0) {
  console.log(`## Failed (${failed.length})\n`);
  for (const f of failed) {
    console.log(`- ${f.url}: ${f.error ?? "Unknown error"}`);
  }
}

if (data.response_time) {
  console.log(`\nResponse time: ${data.response_time}s`);
}

Tavily Crawl

Skill

Crawl any website and save pages as local markdown files. Ideal for downloading documentation, knowledge bases, or web content for offline access or analysis.

---
name: tavily-crawl
description: Crawl any website and save pages as local markdown files. Ideal for downloading documentation, knowledge bases, or web content for offline access or analysis.
homepage: https://tavily.com
metadata: {"openclaw":{"emoji":"🕷️","requires":{"bins":["node"],"env":["TAVILY_API_KEY"]},"primaryEnv":"TAVILY_API_KEY"}}
---

# Tavily Crawl

Crawl websites to extract content from multiple pages. Ideal for documentation, knowledge bases, and site-wide content extraction.

## Authentication

Get your API key at https://tavily.com and add it to your OpenClaw config:

```json
{
  "skills": {
    "entries": {
      "tavily-crawl": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}
```

Or set it as an environment variable:
```bash
export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
```

## Quick Start

### Using the Script

```bash
node {baseDir}/scripts/crawl.mjs "https://docs.example.com"
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --output ./docs
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 --limit 50
```

### Examples

```bash
# Basic crawl
node {baseDir}/scripts/crawl.mjs "https://docs.example.com"

# Deeper crawl with limits
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --limit 50

# Save to files
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" --depth 2 --output ./docs

# Focused crawl with path filters
node {baseDir}/scripts/crawl.mjs "https://example.com" --depth 2 \
  --select "/docs/.*" --exclude "/blog/.*"

# With semantic instructions
node {baseDir}/scripts/crawl.mjs "https://docs.example.com" \
  --instructions "Find API documentation" --chunks 3
```

## Options

| Option | Description | Default |
|--------|-------------|---------|
| `--depth <n>` | Crawl depth (1-5) | 1 |
| `--breadth <n>` | Links per page | 20 |
| `--limit <n>` | Total pages cap | 50 |
| `--output <dir>` | Save pages to directory | - |
| `--instructions <text>` | Natural language guidance | - |
| `--chunks <n>` | Chunks per page (1-5, requires instructions) | - |
| `--depth-mode <mode>` | Extract depth: `basic` or `advanced` | `basic` |
| `--select <pattern>` | Regex pattern to include | - |
| `--exclude <pattern>` | Regex pattern to exclude | - |
| `--timeout <sec>` | Max wait time (10-150 seconds) | 150 |
| `--json` | Output raw JSON | false |
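
To make the table concrete, here is a minimal sketch of how these options map onto the JSON body that `scripts/crawl.mjs` posts to `https://api.tavily.com/crawl`. The field names are the ones the script uses; the helper name `buildCrawlBody` and the shape of its `opts` argument are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper (not part of the skill): mirrors the option-to-field
// mapping done in scripts/crawl.mjs, using the defaults from the table above.
function buildCrawlBody(url, opts = {}) {
  const body = {
    url,
    max_depth: opts.depth ?? 1,
    max_breadth: opts.breadth ?? 20,
    limit: opts.limit ?? 50,
    extract_depth: opts.depthMode ?? "basic",
    timeout: opts.timeout ?? 150,
  };
  if (opts.instructions) {
    body.instructions = opts.instructions;
    // --chunks is only sent when --instructions is also given.
    if (opts.chunks) body.chunks_per_source = opts.chunks;
  }
  // The script wraps each single pattern in a one-element array.
  if (opts.select) body.select_paths = [opts.select];
  if (opts.exclude) body.exclude_paths = [opts.exclude];
  return body;
}

const body = buildCrawlBody("https://docs.example.com", {
  depth: 2,
  instructions: "Find API documentation",
  chunks: 3,
  select: "/docs/.*",
});
console.log(JSON.stringify(body, null, 2));
```

Passing `chunks` without `instructions` leaves `chunks_per_source` out of the body, which matches the "requires instructions" note in the table.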

## Depth vs Performance

| Depth | Typical Pages | Time |
|-------|---------------|------|
| 1 | 10-50 | Seconds |
| 2 | 50-500 | Minutes |
| 3 | 500-5000 | Many minutes |

**Start with `--depth 1`** and increase only if needed.
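
The growth in the table can be approximated with simple arithmetic. The sketch below assumes a worst case in which every page yields `--breadth` new, unique links; that is an assumption for illustration, not a description of how Tavily schedules a crawl. Real sites repeat links, so typical counts are lower. The function name `estimateMaxPages` is hypothetical.

```javascript
// Rough worst-case page count: the start page plus breadth^level pages
// at each level, capped by --limit. Illustrative only.
function estimateMaxPages(depth, breadth = 20, limit = Infinity) {
  let total = 1;     // the start page
  let frontier = 1;  // pages discovered at the current level
  for (let level = 1; level <= depth; level++) {
    frontier *= breadth;
    total += frontier;
  }
  return Math.min(total, limit);
}

console.log(estimateMaxPages(1));        // 21
console.log(estimateMaxPages(2));        // 421
console.log(estimateMaxPages(3));        // 8421
console.log(estimateMaxPages(3, 20, 50)); // 50 (capped by the limit)
```

Each extra level multiplies the frontier by the breadth, which is why setting `--limit` matters more than any other option once depth exceeds 1.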

## Crawl for Context vs Data Collection

**For agentic use (feeding results into context):** Always use `--instructions` + `--chunks`. This returns only relevant chunks instead of full pages, preventing context window explosion.

**For data collection (saving to files):** Omit `--chunks` to get full page content.
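
The two modes differ only in which flags are passed. As a small sketch, the hypothetical helper `crawlArgs` below builds the argument list for `crawl.mjs` in each mode; the flags come from the Options table, while the helper, its `mode` values, and its defaults are assumptions made for this example.

```javascript
// Hypothetical helper: picks crawl.mjs arguments for the two use cases.
function crawlArgs(url, { mode, instructions, chunks = 3, outputDir = "./docs" } = {}) {
  if (mode === "context") {
    // Agentic use: return only relevant chunks to keep the context small.
    if (!instructions) throw new Error("context mode needs instructions");
    return [url, "--instructions", instructions, "--chunks", String(chunks)];
  }
  // Data collection: full pages, written to files.
  return [url, "--output", outputDir];
}

console.log(crawlArgs("https://docs.example.com", {
  mode: "context",
  instructions: "Find API documentation",
}));
console.log(crawlArgs("https://docs.example.com", { mode: "collect" }));
```

The returned array can be handed to a process spawner as-is, which avoids shell quoting problems with multi-word instructions.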

## Tips

- **Always use `--chunks` for agentic workflows** - prevents context explosion when feeding results to LLMs
- **Omit `--chunks` only for data collection** - when saving full pages to files
- **Start conservative** (`--depth 1`, `--limit 20`) and scale up
- **Use path patterns** to focus on relevant sections
- **Always set a `--limit`** to prevent runaway crawls
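
To preview what a pair of path patterns would keep before spending API credits, the patterns can be tried locally. This sketch assumes the patterns are applied to the URL path and behave like JavaScript regular expressions; Tavily's server-side matching may differ in detail. The helper name `isSelected` is hypothetical.

```javascript
// Hypothetical local preview of --select / --exclude filtering.
// Assumption: patterns are matched against the URL path only.
function isSelected(pageUrl, select, exclude) {
  const path = new URL(pageUrl).pathname;
  if (exclude && new RegExp(exclude).test(path)) return false;
  if (select && !new RegExp(select).test(path)) return false;
  return true;
}

const candidates = [
  "https://example.com/docs/api",
  "https://example.com/blog/launch",
  "https://example.com/pricing",
];
for (const u of candidates) {
  console.log(u, isSelected(u, "/docs/.*", "/blog/.*"));
}
```

With `--select "/docs/.*"` and `--exclude "/blog/.*"`, only the first URL passes: the blog page is excluded and the pricing page does not match the select pattern.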
FILE:scripts/crawl.mjs
#!/usr/bin/env node

import fs from "fs";
import path from "path";

function usage() {
  console.error(`Usage: crawl.mjs "url" [options]

Options:
  --depth <n>              Crawl depth (1-5, default: 1)
  --breadth <n>            Links per page (default: 20)
  --limit <n>              Total pages cap (default: 50)
  --output <dir>           Save pages to directory
  --instructions <text>    Natural language guidance
  --chunks <n>             Chunks per page (1-5, requires instructions)
  --depth-mode <mode>      Extract depth: basic or advanced (default: basic)
  --select <pattern>       Regex pattern to include
  --exclude <pattern>      Regex pattern to exclude
  --timeout <sec>          Max wait time (10-150 seconds, default: 150)
  --json                   Output raw JSON

Examples:
  crawl.mjs "https://docs.example.com"
  crawl.mjs "https://docs.example.com" --depth 2 --limit 50
  crawl.mjs "https://docs.example.com" --depth 2 --output ./docs
  crawl.mjs "https://example.com" --instructions "Find API docs" --chunks 3`);
  process.exit(2);
}

const args = process.argv.slice(2);
if (args.length === 0 || args[0] === "-h" || args[0] === "--help") usage();

const url = args[0];
let maxDepth = 1;
let maxBreadth = 20;
let limit = 50;
let outputDir = null;
let instructions = null;
let chunksPerSource = null;
let extractDepth = "basic";
let selectPaths = null;
let excludePaths = null;
let timeout = 150;
let outputJson = false;

for (let i = 1; i < args.length; i++) {
  const a = args[i];
  if (a === "--depth") {
    maxDepth = Number.parseInt(args[i + 1] ?? "1", 10);
    i++;
    continue;
  }
  if (a === "--breadth") {
    maxBreadth = Number.parseInt(args[i + 1] ?? "20", 10);
    i++;
    continue;
  }
  if (a === "--limit") {
    limit = Number.parseInt(args[i + 1] ?? "50", 10);
    i++;
    continue;
  }
  if (a === "--output") {
    outputDir = args[i + 1];
    i++;
    continue;
  }
  if (a === "--instructions") {
    instructions = args[i + 1];
    i++;
    continue;
  }
  if (a === "--chunks") {
    chunksPerSource = Number.parseInt(args[i + 1], 10);
    i++;
    continue;
  }
  if (a === "--depth-mode") {
    extractDepth = args[i + 1] ?? "basic";
    i++;
    continue;
  }
  if (a === "--select") {
    selectPaths = args[i + 1];
    i++;
    continue;
  }
  if (a === "--exclude") {
    excludePaths = args[i + 1];
    i++;
    continue;
  }
  if (a === "--timeout") {
    timeout = Number.parseFloat(args[i + 1] ?? "150");
    i++;
    continue;
  }
  if (a === "--json") {
    outputJson = true;
    continue;
  }
  console.error(`Unknown arg: ${a}`);
  usage();
}

const apiKey = (process.env.TAVILY_API_KEY ?? "").trim();
if (!apiKey) {
  console.error("Error: TAVILY_API_KEY not set");
  console.error("Get your API key at https://tavily.com");
  process.exit(1);
}

const body = {
  url: url,
  max_depth: maxDepth,
  max_breadth: maxBreadth,
  limit: limit,
  extract_depth: extractDepth,
};

if (instructions) body.instructions = instructions;
if (instructions && chunksPerSource) body.chunks_per_source = chunksPerSource;
if (selectPaths) body.select_paths = [selectPaths];
if (excludePaths) body.exclude_paths = [excludePaths];
if (timeout) body.timeout = timeout;

console.error(`Crawling ${url} (depth: ${maxDepth}, limit: ${limit})...`);

const resp = await fetch("https://api.tavily.com/crawl", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": `Bearer ${apiKey}`,
  },
  body: JSON.stringify(body),
});

if (!resp.ok) {
  const text = await resp.text().catch(() => "");
  throw new Error(`Tavily Crawl failed (${resp.status}): ${text}`);
}

const data = await resp.json();

if (outputJson) {
  console.log(JSON.stringify(data, null, 2));
  process.exit(0);
}

// Print results
const results = data.results ?? [];
console.error(`\nCrawled ${results.length} pages\n`);

if (outputDir) {
  // Save to files
  fs.mkdirSync(outputDir, { recursive: true });

  for (const r of results) {
    const pageUrl = String(r?.url ?? "").trim();
    const content = String(r?.raw_content ?? "").trim();

    if (!pageUrl) continue;

    // Generate filename from URL
    const urlObj = new URL(pageUrl);
    let filename = urlObj.pathname.replace(/[^a-zA-Z0-9_-]/g, "_") || "index";
    filename = filename.replace(/^_+|_+$/g, "");
    filename = filename || "index";
    filename += ".md";

    const filepath = path.join(outputDir, filename);
    fs.writeFileSync(filepath, content, "utf-8");
    console.error(`Saved: ${filepath}`);
  }

  console.error(`\nAll pages saved to ${outputDir}/`);
} else {
  // Print to stdout
  console.log(`## Crawl Results (${results.length} pages)\n`);

  for (const r of results) {
    const pageUrl = String(r?.url ?? "").trim();
    const content = String(r?.raw_content ?? "").trim();

    if (!pageUrl) continue;

    console.log(`### ${pageUrl}\n`);
    if (content) {
      console.log(content);
    }
    console.log("\n---\n");
  }
}

if (data.response_time) {
  console.error(`\nResponse time: ${data.response_time}s`);
}
