Joe

@clawhub-keepfit44-b40cc406cb

Job Hunter

Skill

LinkedIn job search assistant that scrapes listings, filters by technologies and countries, and scores matches with AI. Use when the user wants to find jobs,...

---
name: job-hunter
description: LinkedIn job search assistant that scrapes listings, filters by technologies and countries, and scores matches with AI. Use when the user wants to find jobs, search for job openings, look for work, job hunt, or find career opportunities. Triggers on phrases like "find jobs", "job search", "looking for work", "job openings", "search LinkedIn", "remote jobs", "buscar trabajo", "ofertas de trabajo", "ofertas de empleo", "empleo remoto", "vacantes", "buscar empleo", "trabajo remoto", "career opportunities", "hiring", "job listings".
metadata:
  { "openclaw": { "requires": { "bins": ["python3"], "packages": { "pip": ["httpx", "selectolax", "google-genai"] } }, "emoji": "🔍" } }
---

# Job Hunter

AI-powered LinkedIn job search assistant that scrapes real-time listings, filters by technology and location, and scores each match — delivered through chat.

## Setup

Before first use, the user needs a **Google Gemini API key** for AI scoring. Ask for it and save it:

```bash
python3 scripts/job_hunter.py setkey "USER_GEMINI_KEY_HERE"
```

If the user doesn't have one, searches still work, but without AI scoring (every job gets a neutral 0.5 score). Free keys are available at https://aistudio.google.com/apikey

## Core Workflow

### 1. Conversational Search

When the user asks to search for jobs, gather these parameters conversationally:

- **keywords** (required): job title or search terms (e.g., "Python developer", "data engineer")
- **technologies** (optional): required tech stack (e.g., ["Python", "AWS", "Docker"])
- **countries** (optional): countries to search in (e.g., ["Spain", "Germany"])
- **remote** (optional): true/false for remote-only jobs
- **experience** (optional): "internship", "entry", "associate", "mid", "senior", "director", or "executive"
- **exclude** (optional): terms to exclude (e.g., ["consultant", "staffing"])
- **company_size** (optional): list of LinkedIn size codes "1"-"8" (e.g., ["4", "5"]; 1=1-10, 4=201-500, 7=5001-10000)
- **salary_min** (optional): minimum salary in EUR
- **ai_prompt** (optional): extra criteria for AI scoring (e.g., "Must use microservices")
- **max_pages** (optional): pages to scrape per location (default 3, max 5)
- **min_score** (optional): minimum AI score to show (default 0.6)

Don't ask for ALL parameters — just ask the essentials (keywords, technologies, countries) and use sensible defaults for the rest. Let the user add filters if they want.
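
As a rough sketch of the guidance above, the parameters gathered in conversation can be assembled into a dict that contains only what the user actually supplied, leaving everything else to the script's defaults. The helper name `build_search_params` is hypothetical and not part of `scripts/job_hunter.py`; only the keys come from the list above.

```python
import json


# Hypothetical helper: the keys mirror the parameter list above, but the
# function itself is an illustration and not part of scripts/job_hunter.py.
def build_search_params(keywords, technologies=None, countries=None, **filters):
    """Return a params dict containing only the values the user supplied."""
    if not keywords or not keywords.strip():
        raise ValueError("keywords is required")
    params = {"keywords": keywords.strip()}
    if technologies:
        params["technologies"] = list(technologies)
    if countries:
        params["countries"] = list(countries)
    # Optional filters are added only when set, so omitted ones fall back
    # to the script's own defaults (for example max_pages and min_score).
    for key, value in filters.items():
        if value is not None:
            params[key] = value
    return params


params = build_search_params(
    "Python developer",
    technologies=["Python", "AWS"],
    countries=["Spain"],
    remote=True,
    experience=None,  # not mentioned by the user, so it is left out
)
print(json.dumps(params))
```

The resulting JSON string is what would be passed as the argument to the `search` command.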

### 2. Run the Search

```bash
python3 scripts/job_hunter.py search '{
  "keywords": "Python developer",
  "technologies": ["Python", "FastAPI", "AWS"],
  "countries": ["Spain", "Germany"],
  "remote": true,
  "experience": "mid",
  "exclude": ["consultant"],
  "min_score": 0.6,
  "max_pages": 3
}'
```

The script returns JSON with scored jobs. Present the results in a clean format:

> **1. Senior Python Engineer** — TechCorp
> Madrid, Spain | Remote | €50k-60k
> Score: 0.92 — "Excelente match: remoto, Python/FastAPI"
> https://linkedin.com/jobs/view/12345

Show the top results (score >= min_score) sorted by score. If there are many results, show the top 10 and mention how many more are available.
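
A minimal sketch of that presentation step, assuming the result shape printed by the `search` command (`jobs`, `above_threshold`) and the per-job fields from the reference file. The function name `format_results` and the plain-text layout are illustrative assumptions, not part of the script.

```python
# Hypothetical formatter for the JSON printed by the search command.
def format_results(result, limit=10):
    """Render the top `limit` jobs as chat-ready text, best score first."""
    jobs = sorted(result.get("jobs", []), key=lambda j: j.get("ai_score", 0), reverse=True)
    shown = jobs[:limit]
    blocks = []
    for rank, job in enumerate(shown, start=1):
        # Location and salary are optional in the scraped data.
        details = " | ".join(p for p in (job.get("location"), job.get("salary")) if p)
        blocks.append("\n".join([
            f"**{rank}. {job['title']}** - {job['company']}",
            details or "Location not listed",
            f"Score: {job.get('ai_score', 0):.2f} - \"{job.get('ai_summary') or ''}\"",
            job["url"],
        ]))
    # Mention how many matches were left out of the list.
    remaining = result.get("above_threshold", len(jobs)) - len(shown)
    if remaining > 0:
        blocks.append(f"{remaining} more matching job(s) available.")
    return "\n\n".join(blocks)


sample = {
    "above_threshold": 2,
    "jobs": [
        {"title": "Backend Developer", "company": "DataWorks", "location": "Berlin, Germany",
         "salary": None, "ai_score": 0.71, "ai_summary": "Partial match",
         "url": "https://www.linkedin.com/jobs/view/67890"},
        {"title": "Senior Python Engineer", "company": "TechCorp", "location": "Madrid, Spain",
         "salary": "€50,000 - €60,000", "ai_score": 0.92, "ai_summary": "Strong match",
         "url": "https://www.linkedin.com/jobs/view/12345"},
    ],
}
text = format_results(sample, limit=1)
print(text)
```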

**Important:** Searches take time (30-90 seconds) due to LinkedIn scraping. Tell the user to wait.

### 3. Save Interesting Jobs

Users can save jobs they like for later review:

```bash
# Save a job
python3 scripts/job_hunter.py save '{
  "title": "Senior Python Engineer",
  "company": "TechCorp",
  "location": "Madrid",
  "url": "https://linkedin.com/jobs/view/12345",
  "score": 0.92,
  "notes": "Great match, applied on 2026-03-19"
}'

# List saved jobs
python3 scripts/job_hunter.py saved

# Remove a saved job
python3 scripts/job_hunter.py unsave "https://linkedin.com/jobs/view/12345"
```

### 4. Search History

```bash
# Show recent searches
python3 scripts/job_hunter.py history

# Re-run a previous search by its history index (0 is the most recent)
python3 scripts/job_hunter.py rerun 1
```

## Handling Different Languages

Detect the user's language and:
- Respond in their language
- Request AI summaries in the user's language by passing it in ai_prompt (e.g., "Respond in German"); the scoring prompt in the script asks for Spanish by default
- Job data stays in the original LinkedIn language
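
One way to carry the detected language into a search is sketched below. `with_language` is a hypothetical helper; whether the model honors the instruction depends on the scoring prompt in the script, so treat this as an illustration rather than a guarantee.

```python
# Hypothetical helper: appends a language instruction to ai_prompt without
# mutating the original params dict.
def with_language(params, language):
    """Return a copy of params whose ai_prompt also asks for `language`."""
    updated = dict(params)
    instruction = f"Respond in {language}"
    existing = (updated.get("ai_prompt") or "").strip().rstrip(".")
    updated["ai_prompt"] = f"{existing}. {instruction}" if existing else instruction
    return updated


base = {"keywords": "data engineer", "ai_prompt": "Must use microservices"}
localized = with_language(base, "Spanish")
print(localized["ai_prompt"])
```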

## Tips

- **Per-country searches** give much better results than global "Remote" searches on LinkedIn
- If no results, suggest broadening: fewer technologies, more countries, lower experience level
- LinkedIn may rate-limit after many searches — suggest waiting 5-10 minutes if errors occur
- Encourage users to save interesting jobs before they disappear from LinkedIn

## Storage

All data is stored as JSON in `~/.openclaw/job-hunter/`:
- `config.json` — Gemini API key and settings
- `history.json` — search history
- `saved.json` — saved jobs

See [references/search_format.md](references/search_format.md) for full schemas.
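
Because the storage is plain JSON, the files can also be read outside the script. The sketch below assumes `saved.json` holds a JSON array of saved-job objects; `load_saved_jobs` is a hypothetical name, and the demonstration runs against a throwaway directory rather than the real data directory.

```python
import json
import tempfile
from pathlib import Path


# Hypothetical reader for saved.json; returns an empty list when the file
# has not been created yet.
def load_saved_jobs(data_dir):
    path = Path(data_dir) / "saved.json"
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demonstration in a temporary directory instead of ~/.openclaw/job-hunter/
with tempfile.TemporaryDirectory() as tmp:
    empty = load_saved_jobs(tmp)
    sample = [{"title": "Senior Python Engineer", "url": "https://linkedin.com/jobs/view/12345"}]
    (Path(tmp) / "saved.json").write_text(json.dumps(sample), encoding="utf-8")
    saved = load_saved_jobs(tmp)

print(len(empty), len(saved))
```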

FILE:references/search_format.md
# Job Hunter Data Formats

## Search Parameters

```json
{
  "keywords": "Python developer",
  "technologies": ["Python", "FastAPI", "AWS"],
  "countries": ["Spain", "Germany"],
  "remote": true,
  "experience": "mid",
  "exclude": ["consultant", "staffing"],
  "company_size": ["4", "5", "6"],
  "salary_min": 40000,
  "ai_prompt": "Must use microservices architecture",
  "max_pages": 3,
  "min_score": 0.6
}
```

### Experience Levels
- `internship`, `entry`, `associate`, `mid`, `senior`, `director`, `executive`

### Company Size Codes (LinkedIn f_CS)
- `1`: 1-10 employees
- `2`: 11-50
- `3`: 51-200
- `4`: 201-500
- `5`: 501-1000
- `6`: 1001-5000
- `7`: 5001-10000
- `8`: 10001+
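
For illustration, this is roughly how the experience levels and size codes above end up as LinkedIn query parameters. The parameter names (`f_E`, `f_CS`, `f_WT`) and the level-to-code mapping follow `scripts/job_hunter.py`; the function `filter_query` is a hypothetical stand-in for the script's URL building.

```python
from urllib.parse import urlencode

# Mirrors the experience mapping used by scripts/job_hunter.py
EXPERIENCE_CODES = {
    "internship": "1", "entry": "2", "associate": "3",
    "mid": "4", "senior": "4", "director": "5", "executive": "6",
}


# Hypothetical helper that builds only the filter part of the query string.
def filter_query(experience=None, company_size=None, remote=False):
    query = {}
    if remote:
        query["f_WT"] = "2"  # remote-only filter
    if experience in EXPERIENCE_CODES:
        query["f_E"] = EXPERIENCE_CODES[experience]
    if company_size:
        query["f_CS"] = ",".join(company_size)  # e.g. "4,5,6"
    return urlencode(query)


qs = filter_query("mid", ["4", "5", "6"], remote=True)
print(qs)
```

Unknown experience levels are simply ignored, which matches how the script skips values that are missing from its mapping.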

## Search Result (per job)

```json
{
  "title": "Senior Python Engineer",
  "company": "TechCorp",
  "location": "Madrid, Spain",
  "url": "https://www.linkedin.com/jobs/view/12345",
  "posted_at": "2026-03-18",
  "salary": "€50,000 - €60,000",
  "ai_score": 0.92,
  "ai_summary": "Excelente match: remoto, Python/FastAPI, ubicación Madrid"
}
```

## Saved Job

```json
{
  "title": "Senior Python Engineer",
  "company": "TechCorp",
  "location": "Madrid, Spain",
  "url": "https://www.linkedin.com/jobs/view/12345",
  "score": 0.92,
  "notes": "Applied on 2026-03-19",
  "saved_at": "2026-03-19T10:30:00"
}
```

## Config

```json
{
  "gemini_api_key": "...",
  "gemini_model": "gemini-2.5-flash",
  "min_ai_score": 0.6,
  "max_pages": 3
}
```
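
A small sketch of how a partially filled `config.json` could be combined with the default values shown above. `effective_config` is a hypothetical helper; the script itself reads individual keys with fallbacks instead of merging.

```python
# Default values as listed in the Config example above.
DEFAULTS = {
    "gemini_api_key": "",
    "gemini_model": "gemini-2.5-flash",
    "min_ai_score": 0.6,
    "max_pages": 3,
}


# Hypothetical merge: stored values win, missing keys fall back to defaults.
def effective_config(stored):
    return {**DEFAULTS, **stored}


config = effective_config({"gemini_api_key": "example-key", "max_pages": 5})
print(config["gemini_model"], config["max_pages"])
```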

FILE:scripts/job_hunter.py
#!/usr/bin/env python3
"""
Job Hunter - LinkedIn Job Search Skill for OpenClaw

A personal job search assistant that reads publicly available LinkedIn job
listings (the same pages any web browser can access) and optionally scores
them with Google Gemini AI. No authentication, login, or private data access
is involved — only public search result pages are fetched.

Usage:
    job_hunter.py search '<json_params>'
    job_hunter.py setkey <gemini_api_key>
    job_hunter.py save '<json_job>'
    job_hunter.py saved
    job_hunter.py unsave <url>
    job_hunter.py history
    job_hunter.py rerun <index>

License: MIT
"""

from __future__ import annotations

import asyncio
import json
import logging
import re
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
from urllib.parse import quote_plus, urlencode

import httpx
from selectolax.parser import HTMLParser, Node

__all__ = [
    "cmd_history",
    "cmd_rerun",
    "cmd_save",
    "cmd_saved",
    "cmd_search",
    "cmd_setkey",
    "cmd_unsave",
    "load_config",
    "load_json",
    "main",
    "parse_job_listings",
    "save_json",
    "score_jobs",
    "scrape_jobs",
]

# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
MAX_DESCRIPTION_LENGTH = 2000
MAX_DESCRIPTION_FOR_PROMPT = 500
MAX_RESULTS_OUTPUT = 30
MAX_HISTORY_ENTRIES = 20
DEFAULT_MIN_SCORE = 0.6
DEFAULT_MAX_PAGES = 3
MAX_PAGES_CAP = 5
DEFAULT_MAX_CONCURRENT = 5

# ---------------------------------------------------------------------------
# Storage
# ---------------------------------------------------------------------------
DATA_DIR = Path.home() / ".openclaw" / "job-hunter"
CONFIG_FILE = DATA_DIR / "config.json"
HISTORY_FILE = DATA_DIR / "history.json"
SAVED_FILE = DATA_DIR / "saved.json"


def ensure_dir() -> None:
    """Create the data directory if it does not exist."""
    DATA_DIR.mkdir(parents=True, exist_ok=True)


def load_json(path: Path, default: Any = None) -> Any:
    """Load and return JSON data from *path*, or *default* if missing."""
    if path.exists():
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    return default if default is not None else {}


def save_json(path: Path, data: Any) -> None:
    """Persist *data* as JSON to *path*, creating directories as needed."""
    ensure_dir()
    # ensure_ascii=False writes raw non-ASCII text, so pin the file encoding
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_config() -> dict[str, Any]:
    """Return the stored config or sensible defaults."""
    return load_json(CONFIG_FILE, {
        "gemini_api_key": "",
        "gemini_model": "gemini-2.5-flash",
        "min_ai_score": DEFAULT_MIN_SCORE,
        "max_pages": DEFAULT_MAX_PAGES,
    })


# ---------------------------------------------------------------------------
# LinkedIn scraping
# ---------------------------------------------------------------------------
LINKEDIN_JOBS_URL = "https://www.linkedin.com/jobs/search/"
JOBS_PER_PAGE = 25

EXPERIENCE_MAP: dict[str, str] = {
    "internship": "1", "entry": "2", "associate": "3",
    "mid": "4", "mid-senior": "4", "senior": "4",
    "director": "5", "executive": "6",
}

LOCATION_TO_COUNTRY: dict[str, str] = {
    ", ca": "united states", ", ny": "united states", ", tx": "united states",
    ", wa": "united states", ", il": "united states", ", ma": "united states",
    ", co": "united states", ", ga": "united states", ", va": "united states",
    ", pa": "united states", ", nc": "united states", ", or": "united states",
    ", fl": "united states", ", nj": "united states", ", ct": "united states",
    ", az": "united states", ", mn": "united states", ", oh": "united states",
    ", md": "united states", ", mi": "united states", ", ut": "united states",
    ", dc": "united states", ", tn": "united states", ", mo": "united states",
    "new york": "united states", "san francisco": "united states",
    "seattle": "united states", "austin": "united states", "boston": "united states",
    "chicago": "united states", "los angeles": "united states", "denver": "united states",
    "atlanta": "united states", "dallas": "united states", "portland": "united states",
    "miami": "united states", "san jose": "united states", "san diego": "united states",
    "washington": "united states", "houston": "united states", "philadelphia": "united states",
    "charlotte": "united states", "pittsburgh": "united states", "raleigh": "united states",
    "minneapolis": "united states", "detroit": "united states", "phoenix": "united states",
    "salt lake": "united states",
    "london": "united kingdom", "manchester": "united kingdom",
    "edinburgh": "united kingdom", "birmingham": "united kingdom",
    "bristol": "united kingdom", "cambridge": "united kingdom",
    "oxford": "united kingdom", "glasgow": "united kingdom", "leeds": "united kingdom",
    "england": "united kingdom", "scotland": "united kingdom", "wales": "united kingdom",
    "berlin": "germany", "munich": "germany", "münchen": "germany",
    "hamburg": "germany", "frankfurt": "germany", "cologne": "germany",
    "köln": "germany", "düsseldorf": "germany", "stuttgart": "germany",
    "zurich": "switzerland", "zürich": "switzerland", "geneva": "switzerland",
    "genève": "switzerland", "basel": "switzerland", "bern": "switzerland",
    "lausanne": "switzerland",
    "vienna": "austria", "wien": "austria", "graz": "austria",
    "salzburg": "austria", "linz": "austria",
    "amsterdam": "netherlands", "rotterdam": "netherlands",
    "the hague": "netherlands", "eindhoven": "netherlands", "utrecht": "netherlands",
    "dublin": "ireland", "cork": "ireland", "galway": "ireland",
    "sydney": "australia", "melbourne": "australia", "brisbane": "australia",
    "perth": "australia", "adelaide": "australia", "canberra": "australia",
    "toronto": "canada", "vancouver": "canada", "montreal": "canada",
    "ottawa": "canada", "calgary": "canada",
    "stockholm": "sweden", "gothenburg": "sweden", "malmö": "sweden",
    "copenhagen": "denmark", "oslo": "norway", "helsinki": "finland",
    "paris": "france", "lyon": "france", "toulouse": "france",
    "brussels": "belgium", "antwerp": "belgium",
    "luxembourg": "luxembourg",
    "madrid": "spain", "barcelona": "spain", "valencia": "spain",
    "sevilla": "spain", "malaga": "spain", "bilbao": "spain",
    "lisbon": "portugal", "porto": "portugal",
    "rome": "italy", "milan": "italy", "turin": "italy",
    "warsaw": "poland", "krakow": "poland", "wroclaw": "poland",
    "prague": "czech republic", "brno": "czech republic",
    "bucharest": "romania", "cluj": "romania",
    "budapest": "hungary",
    "athens": "greece", "thessaloniki": "greece",
    "tel aviv": "israel", "jerusalem": "israel",
    "singapore": "singapore",
    "tokyo": "japan", "osaka": "japan",
    "seoul": "south korea",
    "bangalore": "india", "mumbai": "india", "hyderabad": "india",
    "delhi": "india", "pune": "india", "chennai": "india",
    "são paulo": "brazil", "rio de janeiro": "brazil",
    "mexico city": "mexico", "guadalajara": "mexico", "monterrey": "mexico",
    "buenos aires": "argentina",
    "bogota": "colombia", "medellin": "colombia",
    "santiago": "chile", "lima": "peru",
}

logger = logging.getLogger("job_hunter")


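def _extract_linkedin_id(url: str) -> str | None:
    """Extract the numeric LinkedIn job ID from a job URL."""
    # The ID is the trailing digit run of the path segment; anchoring at the
    # segment end keeps digits inside the title slug (e.g. "level-2-support")
    # from being mistaken for the ID.
    match = re.search(r"/jobs/view/(?:[^/?#]*-)?(\d+)(?=[/?#]|$)", url)
    return match.group(1) if match else None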


def _parse_relative_date(text: str | None) -> str | None:
    """Convert a relative date string (e.g. '3 days ago') to ISO date."""
    if not text:
        return None
    text = text.lower().strip()
    now = datetime.now(timezone.utc)
    patterns: list[tuple[str, Any]] = [
        (r"(\d+)\s*second", lambda m: timedelta(seconds=int(m.group(1)))),
        (r"(\d+)\s*minute", lambda m: timedelta(minutes=int(m.group(1)))),
        (r"(\d+)\s*hour", lambda m: timedelta(hours=int(m.group(1)))),
        (r"(\d+)\s*day", lambda m: timedelta(days=int(m.group(1)))),
        (r"(\d+)\s*week", lambda m: timedelta(weeks=int(m.group(1)))),
        (r"(\d+)\s*month", lambda m: timedelta(days=int(m.group(1)) * 30)),
    ]
    for pattern, delta_fn in patterns:
        match = re.search(pattern, text)
        if match:
            return (now - delta_fn(match)).date().isoformat()
    return None


def _parse_single_card(card: Node) -> dict[str, Any] | None:
    """Parse a single LinkedIn job card HTML node into a job dict."""
    title_el = card.css_first("h3, h4, .base-search-card__title")
    title = title_el.text(strip=True) if title_el else None

    link_el = card.css_first("a[href*='/jobs/view/'], a.base-card__full-link")
    url = link_el.attributes.get("href", "").split("?")[0] if link_el else None
    linkedin_id = _extract_linkedin_id(url) if url else None

    if not title or not linkedin_id or not url:
        return None

    company_el = card.css_first("h4 a, .base-search-card__subtitle, .base-search-card__subtitle a")
    company = company_el.text(strip=True) if company_el else "Unknown"

    location_el = card.css_first(".job-search-card__location, .base-search-card__metadata span")
    location = location_el.text(strip=True) if location_el else None

    time_el = card.css_first("time")
    posted_at = None
    if time_el:
        dt = time_el.attributes.get("datetime")
        posted_at = dt if dt else _parse_relative_date(time_el.text(strip=True))

    salary_el = card.css_first(".job-search-card__salary-info, .base-search-card__metadata .salary")
    salary = salary_el.text(strip=True) if salary_el else None

    return {
        "linkedin_id": linkedin_id, "title": title, "company": company,
        "location": location, "url": url, "posted_at": posted_at,
        "salary": salary, "description": None,
    }


def parse_job_listings(html: str) -> list[dict[str, Any]]:
    """Parse LinkedIn search results HTML and return a list of job dicts."""
    parser = HTMLParser(html)
    jobs: list[dict[str, Any]] = []
    cards = parser.css("ul.jobs-search__results-list > li")
    if not cards:
        cards = parser.css("[data-entity-urn]")
    if not cards:
        cards = parser.css("div.base-search-card")

    for card in cards:
        try:
            job = _parse_single_card(card)
            if job:
                jobs.append(job)
        except (AttributeError, KeyError, TypeError) as exc:
            logger.warning("Failed to parse job card: %s", exc)
            continue
    return jobs


async def _fetch_description(client: httpx.AsyncClient, job: dict[str, Any]) -> str | None:
    """Fetch the full description for a single job listing."""
    try:
        resp = await client.get(job["url"])
        resp.raise_for_status()
    except httpx.HTTPStatusError as exc:
        logger.warning("HTTP %s fetching description for %s", exc.response.status_code, job["url"])
        return None
    except httpx.RequestError as exc:
        logger.warning("Request error fetching description for %s: %s", job["url"], exc)
        return None
    parser = HTMLParser(resp.text)
    desc_el = (
        parser.css_first("div.description__text")
        or parser.css_first("div.show-more-less-html__markup")
        or parser.css_first("section.description")
    )
    return desc_el.text(strip=True)[:MAX_DESCRIPTION_LENGTH] if desc_el else None


async def _enrich_descriptions(
    client: httpx.AsyncClient,
    jobs: list[dict[str, Any]],
    max_concurrent: int = DEFAULT_MAX_CONCURRENT,
) -> None:
    """Fetch and attach descriptions for a list of jobs in batches."""
    for i in range(0, len(jobs), max_concurrent):
        batch = jobs[i:i + max_concurrent]
        descriptions = await asyncio.gather(*[_fetch_description(client, j) for j in batch])
        for job, desc in zip(batch, descriptions):
            job["description"] = desc
        if i + max_concurrent < len(jobs):
            await asyncio.sleep(2.0)  # polite delay between batches


def _job_matches_technologies(job: dict[str, Any], techs_lower: list[str]) -> bool:
    """Return True if any technology keyword appears in the job title or description."""
    searchable = job["title"].lower()
    if job.get("description"):
        searchable += " " + job["description"].lower()
    return any(tech in searchable for tech in techs_lower)


def _location_matches_countries(location: str | None, countries_lower: set[str]) -> bool:
    """Return True if *location* maps to one of the given countries."""
    if not location:
        return False
    loc = location.lower()
    if any(country in loc for country in countries_lower):
        return True
    for fragment, country in LOCATION_TO_COUNTRY.items():
        if fragment in loc and country in countries_lower:
            return True
    return False


async def _scrape_one_location(
    client: httpx.AsyncClient,
    params: dict[str, Any],
    location: str | None,
    max_pages: int,
) -> list[dict[str, Any]]:
    """Scrape job listings for a single location across multiple pages."""
    all_jobs: list[dict[str, Any]] = []
    for page in range(max_pages):
        url_params: dict[str, str] = {
            "keywords": params["keywords"],
            "start": str(page * JOBS_PER_PAGE),
            "sortBy": "DD",
        }
        if location:
            url_params["location"] = location
        if params.get("remote"):
            url_params["f_WT"] = "2"
        if params.get("experience") and params["experience"] in EXPERIENCE_MAP:
            url_params["f_E"] = EXPERIENCE_MAP[params["experience"]]
        if params.get("company_size"):
            url_params["f_CS"] = ",".join(params["company_size"])

        url = f"{LINKEDIN_JOBS_URL}?{urlencode(url_params, quote_via=quote_plus)}"
        try:
            response = await client.get(url)
            response.raise_for_status()
        except httpx.HTTPStatusError as exc:
            logger.warning("HTTP %s fetching page %d for '%s'", exc.response.status_code, page, location)
            break
        except httpx.RequestError as exc:
            logger.warning("Request error on page %d for '%s': %s", page, location, exc)
            break

        jobs = parse_job_listings(response.text)
        if not jobs:
            break
        all_jobs.extend(jobs)

        if page < max_pages - 1:
            await asyncio.sleep(2.0)  # polite delay between pages
    return all_jobs


async def scrape_jobs(params: dict[str, Any]) -> list[dict[str, Any]]:
    """Scrape LinkedIn job listings based on search parameters."""
    headers = {
        "Accept": "text/html",
    }
    max_pages = min(params.get("max_pages", DEFAULT_MAX_PAGES), MAX_PAGES_CAP)
    all_jobs: list[dict[str, Any]] = []
    seen_ids: set[str] = set()

    async with httpx.AsyncClient(headers=headers, follow_redirects=True, timeout=30.0) as client:
        countries = params.get("countries", [])
        if countries:
            for country in countries:
                jobs = await _scrape_one_location(client, params, country, max_pages)
                for j in jobs:
                    if j["linkedin_id"] not in seen_ids:
                        seen_ids.add(j["linkedin_id"])
                        all_jobs.append(j)
                await asyncio.sleep(2.0)  # polite delay between countries
        else:
            all_jobs = await _scrape_one_location(client, params, params.get("location"), max_pages)

    # Filter excluded terms
    exclude = params.get("exclude", [])
    if exclude:
        exclude_lower = [e.lower() for e in exclude]
        all_jobs = [
            j for j in all_jobs
            if not any(term in j["title"].lower() or term in j["company"].lower() for term in exclude_lower)
        ]

    # Enrich with descriptions
    if all_jobs:
        async with httpx.AsyncClient(follow_redirects=True, timeout=30.0) as desc_client:
            await _enrich_descriptions(desc_client, all_jobs)

    # Filter by technologies
    technologies = params.get("technologies", [])
    if technologies:
        techs_lower = [t.lower() for t in technologies]
        all_jobs = [j for j in all_jobs if _job_matches_technologies(j, techs_lower)]

    return all_jobs


# ---------------------------------------------------------------------------
# AI Scoring (Google Gemini)
# ---------------------------------------------------------------------------
SCORING_PROMPT = """You are a job matching assistant. Score each job from 0.0 to 1.0 based on how
well it matches the candidate's criteria. Be strict: only score above 0.7 if the job is a strong match.

Candidate criteria:
- Keywords: {keywords}
- Required technologies: {technologies}
- Location preference: {location}
- Remote: {remote}
- Minimum salary: {salary_min}
- Experience level: {experience}
- Exclude terms: {exclude}
- Additional requirements: {ai_prompt}

Jobs to evaluate:
{jobs_text}

Respond ONLY with a JSON array. Each element must have:
- "index": the job number (starting from 0)
- "score": float from 0.0 to 1.0
- "reason": one sentence explaining the score, in Spanish unless the additional requirements ask for another language

Example response:
[{{"index": 0, "score": 0.85, "reason": "Buen match por tecnologia y ubicacion remota"}}]
"""

BATCH_SIZE = 10


def _format_jobs_for_prompt(jobs: list[dict[str, Any]]) -> str:
    """Format a list of jobs into a text block for the AI scoring prompt."""
    lines: list[str] = []
    for i, job in enumerate(jobs):
        parts = [f"Job {i}:", f"  Title: {job['title']}", f"  Company: {job['company']}"]
        if job.get("location"):
            parts.append(f"  Location: {job['location']}")
        if job.get("salary"):
            parts.append(f"  Salary: {job['salary']}")
        if job.get("description"):
            parts.append(f"  Description: {job['description'][:MAX_DESCRIPTION_FOR_PROMPT]}")
        parts.append(f"  URL: {job['url']}")
        lines.append("\n".join(parts))
    return "\n\n".join(lines)


def _build_scoring_prompt(jobs: list[dict[str, Any]], params: dict[str, Any]) -> str:
    """Build the full Gemini scoring prompt for a batch of jobs."""
    return SCORING_PROMPT.format(
        keywords=params.get("keywords", ""),
        technologies=", ".join(params.get("technologies", [])) or "Not specified",
        location=", ".join(params.get("countries", [])) or params.get("location", "Any"),
        remote="Yes" if params.get("remote") else "No preference",
        salary_min=f"EUR {params['salary_min']}" if params.get("salary_min") else "Not specified",
        experience=params.get("experience", "Any"),
        exclude=", ".join(params.get("exclude", [])) or "None",
        ai_prompt=params.get("ai_prompt", "No additional requirements"),
        jobs_text=_format_jobs_for_prompt(jobs),
    )


def _parse_scores(response_text: str, count: int) -> list[dict[str, Any]]:
    """Parse AI response text into a list of score dicts, with fallback."""
    try:
        text = response_text.strip()
        if "```" in text:
            start = text.find("[")
            end = text.rfind("]") + 1
            if start >= 0 and end > start:
                text = text[start:end]
        scores = json.loads(text)
        if not isinstance(scores, list):
            raise ValueError("Expected JSON array")
        return scores
    except (json.JSONDecodeError, ValueError) as exc:
        logger.warning("Failed to parse AI scores: %s", exc)
        return [{"index": i, "score": 0.5, "reason": "Score unavailable"} for i in range(count)]


def score_jobs(
    jobs: list[dict[str, Any]],
    params: dict[str, Any],
    api_key: str,
    model: str = "gemini-2.5-flash",
) -> list[dict[str, Any]]:
    """Score jobs using Google Gemini AI, or assign neutral scores if no key."""
    if not jobs:
        return []
    if not api_key:
        for job in jobs:
            job["ai_score"] = 0.5
            job["ai_summary"] = "No Gemini API key configured"
        return jobs

    from google import genai
    client = genai.Client(api_key=api_key)

    for batch_start in range(0, len(jobs), BATCH_SIZE):
        batch = jobs[batch_start:batch_start + BATCH_SIZE]
        prompt = _build_scoring_prompt(batch, params)
        try:
            response = client.models.generate_content(
                model=model,
                contents=prompt,
                config=genai.types.GenerateContentConfig(temperature=0.1, max_output_tokens=2048),
            )
            scores = _parse_scores(response.text, len(batch))
            score_map = {s["index"]: s for s in scores}
            for i, job in enumerate(batch):
                score_data = score_map.get(i, {"score": 0.5, "reason": "Not scored"})
                job["ai_score"] = float(score_data.get("score", 0.5))
                job["ai_summary"] = score_data.get("reason")
        except Exception as e:
            logger.error("AI scoring failed for batch starting at %d: %s", batch_start, e)
            for job in batch:
                job["ai_score"] = 0.5
                job["ai_summary"] = f"AI error: {e}"

    return jobs


# ---------------------------------------------------------------------------
# Commands
# ---------------------------------------------------------------------------
def cmd_search(params_json: str) -> None:
    """Search LinkedIn for jobs matching the given JSON parameters."""
    params = json.loads(params_json)

    if "keywords" not in params:
        print(json.dumps({
            "status": "error",
            "message": "Missing required 'keywords' key in search parameters",
        }, indent=2))
        return

    config = load_config()

    # Run scraping
    jobs = asyncio.run(scrape_jobs(params))

    # Save to history (always, even with no results)
    history = load_json(HISTORY_FILE, [])
    history.insert(0, {
        "params": params,
        "total_found": len(jobs),
        "above_threshold": 0,
        "timestamp": datetime.now().isoformat(),
    })
    history = history[:MAX_HISTORY_ENTRIES]
    save_json(HISTORY_FILE, history)

    if not jobs:
        print(json.dumps({"status": "no_results", "message": "No jobs found matching your criteria"}, indent=2))
        return

    # Score with AI
    api_key = config.get("gemini_api_key", "")
    model = config.get("gemini_model", "gemini-2.5-flash")
    jobs = score_jobs(jobs, params, api_key, model)

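    # Filter by min score. Without an API key every job carries the neutral
    # 0.5 score, so the threshold is skipped instead of hiding all results.
    min_score = params.get("min_score", config.get("min_ai_score", DEFAULT_MIN_SCORE))
    scored_jobs = sorted(jobs, key=lambda j: j.get("ai_score", 0), reverse=True)
    filtered = [j for j in scored_jobs if not api_key or j.get("ai_score", 0) >= min_score]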

    # Clean output (remove description to keep output manageable)
    for j in scored_jobs:
        j.pop("description", None)
        j.pop("linkedin_id", None)

    # Update history with actual above_threshold count
    history = load_json(HISTORY_FILE, [])
    if history:
        history[0]["above_threshold"] = len(filtered)
        save_json(HISTORY_FILE, history)

    print(json.dumps({
        "status": "ok",
        "total_scraped": len(jobs),
        "above_threshold": len(filtered),
        "min_score": min_score,
        "jobs": filtered[:MAX_RESULTS_OUTPUT],
        "all_jobs": len(scored_jobs),
    }, indent=2, ensure_ascii=False))


def cmd_setkey(api_key: str) -> None:
    """Store the Gemini API key in the config file."""
    config = load_config()
    config["gemini_api_key"] = api_key
    save_json(CONFIG_FILE, config)
    print(json.dumps({"status": "ok", "message": "Gemini API key saved"}, indent=2))


def cmd_save(job_json: str) -> None:
    """Save a job to the bookmarks list (no duplicates by URL)."""
    job = json.loads(job_json)
    job["saved_at"] = datetime.now().isoformat()
    saved = load_json(SAVED_FILE, [])
    # Don't duplicate
    if not any(s.get("url") == job.get("url") for s in saved):
        saved.append(job)
        save_json(SAVED_FILE, saved)
    print(json.dumps({"status": "saved", "total_saved": len(saved)}, indent=2))


def cmd_saved() -> None:
    """List all saved/bookmarked jobs."""
    saved = load_json(SAVED_FILE, [])
    print(json.dumps({"saved_jobs": saved, "total": len(saved)}, indent=2, ensure_ascii=False))


def cmd_unsave(url: str) -> None:
    """Remove a saved job by its URL."""
    saved = load_json(SAVED_FILE, [])
    saved = [s for s in saved if s.get("url") != url]
    save_json(SAVED_FILE, saved)
    print(json.dumps({"status": "removed", "total_saved": len(saved)}, indent=2))


def cmd_history() -> None:
    """Show the search history with indexed entries."""
    history = load_json(HISTORY_FILE, [])
    for i, h in enumerate(history):
        h["index"] = i
    print(json.dumps({"searches": history}, indent=2, ensure_ascii=False))


def cmd_rerun(index_str: str) -> None:
    """Re-run a previous search by its history index."""
    history = load_json(HISTORY_FILE, [])
    index = int(index_str)
    if index < 0 or index >= len(history):
        print(json.dumps({"status": "error", "message": f"Invalid index. History has {len(history)} entries."}, indent=2))
        return
    params = history[index]["params"]
    cmd_search(json.dumps(params))


def main() -> None:
    """CLI entry point for the job_hunter skill."""
    if len(sys.argv) < 2:
        print("Usage: job_hunter.py <command> [args]", file=sys.stderr)
        sys.exit(1)

    command = sys.argv[1]
    commands: dict[str, Any] = {
        "search": lambda: cmd_search(sys.argv[2]),
        "setkey": lambda: cmd_setkey(sys.argv[2]),
        "save": lambda: cmd_save(sys.argv[2]),
        "saved": lambda: cmd_saved(),
        "unsave": lambda: cmd_unsave(sys.argv[2]),
        "history": lambda: cmd_history(),
        "rerun": lambda: cmd_rerun(sys.argv[2]),
    }

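    if command not in commands:
        print(f"Unknown command: {command}", file=sys.stderr)
        sys.exit(1)

    # Every command except "saved" and "history" takes exactly one argument;
    # fail with a usage message instead of an IndexError when it is missing.
    if command not in ("saved", "history") and len(sys.argv) < 3:
        print(f"Usage: job_hunter.py {command} <arg>", file=sys.stderr)
        sys.exit(1)

    commands[command]()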


if __name__ == "__main__":
    main()


Study Buddy

Skill


---
name: study-buddy
description: Interactive study assistant that creates flashcards, quizzes, and spaced repetition reviews from any source material (notes, PDFs, photos, text, URLs). Use when the user wants to study, memorize, review, prepare for exams, create flashcards, take a quiz, practice questions, or learn any topic. Triggers on phrases like "study", "quiz me", "flashcards", "review", "exam prep", "test me", "help me memorize", "spaced repetition", "study session".
metadata:
  { "openclaw": { "emoji": "📚", "requires": { "bins": ["python3"] } } }
---

# Study Buddy

AI-powered study assistant that turns any material into interactive learning sessions with flashcards, quizzes, and spaced repetition — delivered through chat.

## Core Workflow

### 1. Create Flashcards from Material

When the user provides study material (text, image, PDF, URL):

1. Extract and analyze the content
2. Identify key concepts, definitions, formulas, dates, and relationships
3. Generate flashcards as Q&A pairs
4. Store them using `scripts/deck_manager.py`

```bash
# Create a new deck
python3 scripts/deck_manager.py create "Biology Exam" --cards '[
  {"q": "What is mitosis?", "a": "Cell division producing two identical daughter cells"},
  {"q": "What are the phases of mitosis?", "a": "Prophase, Metaphase, Anaphase, Telophase (PMAT)"}
]'

# Add cards to existing deck
python3 scripts/deck_manager.py add "Biology Exam" --cards '[
  {"q": "What is meiosis?", "a": "Cell division producing four genetically different gametes"}
]'
```

**Card generation guidelines:**
- One concept per card
- Questions should test understanding, not just recall
- Include mnemonics when helpful (e.g., PMAT for mitosis phases)
- For math/science: include both formula cards and application cards
- For languages: include context sentences, not just word translations
- Aim for 10-20 cards per topic section
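
The `--cards` argument shown above is a JSON array of objects with `q` and `a` keys. A minimal sketch of building that payload from generated question/answer pairs follows; the helper name `build_cards_payload` is hypothetical and not part of the skill's scripts.

```python
import json


def build_cards_payload(pairs: list[tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs into a JSON array of {"q", "a"} objects."""
    cards = []
    for question, answer in pairs:
        question, answer = question.strip(), answer.strip()
        # Skip pairs with an empty side so every card stays usable.
        if question and answer:
            cards.append({"q": question, "a": answer})
    # ensure_ascii=False keeps non-English study material readable in the payload.
    return json.dumps(cards, ensure_ascii=False)


payload = build_cards_payload([
    ("What is mitosis?", "Cell division producing two identical daughter cells"),
    ("   ", "orphan answer"),  # dropped: the question is blank
])
print(payload)
```

Passing the resulting string as a single argument (for example through an argv list rather than a shell string) avoids quoting problems when answers contain apostrophes.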

### 2. Quiz Session

When the user asks to be quizzed:

```bash
# Get cards due for review (spaced repetition)
python3 scripts/deck_manager.py review "Biology Exam"

# Get random quiz (all cards)
python3 scripts/deck_manager.py quiz "Biology Exam" --count 10
```

**Quiz delivery format:**

Present one question at a time:

> **Question 3/10**
> What are the phases of mitosis?

Wait for the user's answer, then reveal:

> **Answer:** Prophase, Metaphase, Anaphase, Telophase (PMAT)
>
> How did you do?
> Got it | Partially | Missed it

Record the result:

```bash
python3 scripts/deck_manager.py record "Biology Exam" --card-id 2 --result "correct"
```

Results affect spaced repetition scheduling:
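- correct: review interval increases (1 day, then 3 days, then the previous interval multiplied by the card's ease factor)
- partial: interval stays the same, but the ease factor drops
- missed: interval resets to 1 day and the ease factor drops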
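
The three self-assessment labels offered to the user have to be translated into the `--result` values that the `record` command accepts. A minimal sketch of that mapping, assuming the labels are exactly "Got it", "Partially", and "Missed it"; the names `RESULT_BY_LABEL` and `record_command` are hypothetical.

```python
RESULT_BY_LABEL = {
    "got it": "correct",
    "partially": "partial",
    "missed it": "missed",
}


def record_command(deck_name: str, card_id: int, label: str) -> list[str]:
    """Build the argv list for `deck_manager.py record` from a self-assessment label."""
    result = RESULT_BY_LABEL.get(label.strip().lower())
    if result is None:
        raise ValueError(f"Unrecognized self-assessment: {label!r}")
    return [
        "python3", "scripts/deck_manager.py", "record", deck_name,
        "--card-id", str(card_id), "--result", result,
    ]


print(record_command("Biology Exam", 2, "Got it"))
```

Returning an argv list (suitable for `subprocess.run`) keeps deck names with spaces intact without manual shell quoting.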

### 3. Spaced Repetition Review

When the user starts a study session or asks "what should I review?":

```bash
# Check what's due across all decks
python3 scripts/deck_manager.py due

# Review specific deck
python3 scripts/deck_manager.py review "Biology Exam"
```

Only show cards that are due based on the SM-2 algorithm intervals. After each session, show a summary:

> **Session complete!**
> Reviewed: 12 cards
> Correct: 9 | Partial: 2 | Missed: 1
> Next review: 3 cards due tomorrow
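
The summary above can be assembled from the list of results recorded during the session. This is a sketch only: `session_summary` is a hypothetical helper, and the count of cards due tomorrow is assumed to be computed elsewhere.

```python
from collections import Counter


def session_summary(results: list[str], due_tomorrow: int) -> str:
    """Format a session summary from recorded results ("correct", "partial", "missed")."""
    counts = Counter(results)  # missing keys count as 0
    lines = [
        "Session complete!",
        f"Reviewed: {len(results)} cards",
        f"Correct: {counts['correct']} | Partial: {counts['partial']} | Missed: {counts['missed']}",
        f"Next review: {due_tomorrow} cards due tomorrow",
    ]
    return "\n".join(lines)


results = ["correct"] * 9 + ["partial"] * 2 + ["missed"]
print(session_summary(results, due_tomorrow=3))
```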

### 4. Generate Practice Exam

When the user asks for an exam or test:

```bash
python3 scripts/deck_manager.py exam "Biology Exam" --questions 20 --types "multiple_choice,short_answer,true_false"
```

Generate a mix of question types from the deck:
- **Multiple choice** (4 options, one correct) -- use other cards' answers as distractors
- **True/False** -- modify real answers slightly for false statements
- **Short answer** -- direct questions from flashcards
- **Fill in the blank** -- remove key terms from answers
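
For the fill-in-the-blank type, one possible approach is to blank out a chosen key term in the card's answer. The sketch below assumes the key term has already been picked (by the assistant, for instance); `make_cloze` is a hypothetical helper, not a function of `deck_manager.py`.

```python
import re


def make_cloze(answer: str, key_term: str, blank: str = "_____") -> str:
    """Replace every whole-word occurrence of key_term in answer with a blank."""
    # \b matches word boundaries, so this works for terms that start and end
    # with a letter or digit; matching is case-insensitive.
    pattern = re.compile(rf"\b{re.escape(key_term)}\b", flags=re.IGNORECASE)
    cloze, replaced = pattern.subn(blank, answer)
    if replaced == 0:
        raise ValueError(f"{key_term!r} does not occur in the answer")
    return cloze


print(make_cloze("Cell division producing two identical daughter cells", "identical"))
```

Raising on a term that never occurs prevents an exam question that shows the full answer with no blank in it.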

### 5. Deck Management

```bash
# List all decks
python3 scripts/deck_manager.py list

# Show deck stats
python3 scripts/deck_manager.py stats "Biology Exam"

# Export deck (share with others)
python3 scripts/deck_manager.py export "Biology Exam"

# Import deck
python3 scripts/deck_manager.py import deck_file.json

# Delete deck
python3 scripts/deck_manager.py delete "Biology Exam"
```

For guidance on handling different input types (text, photos, PDFs, URLs) and tips for creating effective cards, see [references/guidelines.md](references/guidelines.md).

## Storage

All decks are stored as JSON in `~/.openclaw/study-buddy/decks/`. Each deck file contains cards, review history, and scheduling metadata. See [references/data_format.md](references/data_format.md) for the schema.

## Multilingual Support

Study Buddy works in any language. Detect the user's language from their message and:
- Generate cards in the same language
- Write quiz prompts in the user's language
- Support mixed-language decks (useful for language learning)

FILE:references/data_format.md
# Study Buddy - Data Format

## Deck Schema

```json
{
  "name": "Biology Exam",
  "cards": [
    {
      "id": 1,
      "q": "What is mitosis?",
      "a": "Cell division producing two identical daughter cells",
      "interval": 7,
      "ease": 2.6,
      "repetitions": 3,
      "next_review": "2026-03-26T10:00:00",
      "created_at": "2026-03-19T10:00:00"
    }
  ],
  "next_id": 2,
  "created_at": "2026-03-19T10:00:00",
  "updated_at": "2026-03-19T10:00:00"
}
```

## Card Fields

| Field | Type | Description |
|-------|------|-------------|
| id | int | Unique card ID within deck |
| q | string | Question text |
| a | string | Answer text |
| interval | int | Days until next review |
| ease | float | SM-2 ease factor (minimum 1.3, default 2.5; the script applies no upper cap) |
| repetitions | int | Correct answers since the last miss (a partial answer does not reset it) |
| next_review | ISO datetime | When the card is next due |
| created_at | ISO datetime | When the card was created |
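
Since `next_review` is stored as an ISO datetime string, deciding whether a card is due is a single comparison after parsing. A minimal sketch, using the field names from the table above; `due_cards` is a hypothetical helper and `now` is passed in explicitly to make the example deterministic.

```python
from datetime import datetime


def due_cards(cards: list[dict], now: datetime) -> list[dict]:
    """Return the cards whose next_review timestamp is at or before now."""
    return [c for c in cards if datetime.fromisoformat(c["next_review"]) <= now]


cards = [
    {"id": 1, "next_review": "2026-03-26T10:00:00"},
    {"id": 2, "next_review": "2026-03-19T10:00:00"},
]
now = datetime(2026, 3, 20, 9, 0, 0)
print([c["id"] for c in due_cards(cards, now)])  # prints [2]
```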

## SM-2 Algorithm

Interval progression for consecutive correct answers:
- Rep 1: 1 day
- Rep 2: 3 days
- Rep 3+: previous_interval * ease_factor, rounded down to whole days

Ease adjustments:
- correct: ease + 0.1
- partial: ease - 0.15
- missed: ease - 0.2, reset to rep 0, interval 1 day

Minimum ease: 1.3
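
To see how these rules interact, the sketch below replays a run of results on a fresh card (interval 0, ease 2.5, 0 repetitions). It follows the rules listed above, rounds the ease to two decimals, and truncates intervals to whole days; the function name `step` and the tuple layout are illustrative choices.

```python
MIN_EASE = 1.3
EASE_DELTA = {"correct": 0.1, "partial": -0.15, "missed": -0.2}


def step(interval: int, ease: float, reps: int, result: str) -> tuple[int, float, int]:
    """Apply one review result and return the new (interval, ease, reps)."""
    if result == "correct":
        if reps == 0:
            interval = 1
        elif reps == 1:
            interval = 3
        else:
            interval = int(interval * ease)  # truncates to whole days
        reps += 1
    elif result == "missed":
        reps, interval = 0, 1
    # "partial" leaves interval and reps untouched; only the ease changes.
    ease = round(max(MIN_EASE, ease + EASE_DELTA[result]), 2)
    return interval, ease, reps


state = (0, 2.5, 0)
history = []
for result in ["correct"] * 5 + ["missed"]:
    state = step(*state, result)
    history.append(state[0])
print(history)  # [1, 3, 8, 22, 63, 1]
```

Because the ease factor rises by 0.1 with every correct answer, the intervals grow faster than a fixed multiplier would suggest: five correct answers in a row already push the next review about two months out.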

## Storage Location

`~/.openclaw/study-buddy/decks/<deck_name>.json`

Deck names are normalized: lowercased, with spaces and forward slashes replaced by underscores.
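
A short sketch of that normalization, mirroring the `deck_path` helper in `scripts/deck_manager.py`; the function name `deck_file` is used here only for illustration.

```python
from pathlib import Path

DECKS_DIR = Path.home() / ".openclaw" / "study-buddy" / "decks"


def deck_file(name: str) -> Path:
    """Map a deck name to its JSON file under the decks directory."""
    safe = name.lower().replace(" ", "_").replace("/", "_")
    return DECKS_DIR / f"{safe}.json"


print(deck_file("Biology Exam").name)  # biology_exam.json
```

One consequence of this scheme is that names differing only in case or in spaces versus underscores ("Biology Exam", "biology_exam") resolve to the same file.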

FILE:references/guidelines.md
# Study Buddy Guidelines

## Handling Different Input Types

| Input | How to Process |
|-------|---------------|
| **Text/notes** | Extract key concepts directly |
| **Photo of handwritten notes** | Describe the image, extract text and concepts |
| **PDF document** | Read and extract key sections |
| **URL/webpage** | Fetch content, extract main points |
| **Topic name only** | Generate cards from AI knowledge on the topic |
| **Conversation history** | Summarize and create cards from recent discussion |

## Tips for Effective Cards

- **Atomic**: One fact per card, never compound questions
- **Clear**: Unambiguous questions with definitive answers
- **Bidirectional**: For definitions, create both "What is X?" and the reverse, "Which term is defined as ...?"
- **Visual mnemonics**: Suggest memory tricks when possible
- **Progressive**: Start with basic recall, add application questions later

FILE:scripts/deck_manager.py
#!/usr/bin/env python3
"""
Study Buddy - Deck Manager
Manages flashcard decks with spaced repetition (SM-2 algorithm).

License: MIT

Usage:
    deck_manager.py create <deck_name> --cards '<json_array>'
    deck_manager.py add <deck_name> --cards '<json_array>'
    deck_manager.py list
    deck_manager.py stats <deck_name>
    deck_manager.py review <deck_name>
    deck_manager.py quiz <deck_name> [--count N]
    deck_manager.py exam <deck_name> [--questions N] [--types types]
    deck_manager.py record <deck_name> --card-id <id> --result <correct|partial|missed>
    deck_manager.py due
    deck_manager.py export <deck_name>
    deck_manager.py import <file_path>
    deck_manager.py delete <deck_name>
"""

from __future__ import annotations

import argparse
import json
import logging
import random
import sys
from datetime import datetime, timedelta
from pathlib import Path

logger = logging.getLogger(__name__)

__all__ = [
    "DECKS_DIR",
    "ensure_dir",
    "deck_path",
    "load_deck",
    "save_deck",
    "new_card",
    "sm2_update",
    "cmd_create",
    "cmd_add",
    "cmd_list",
    "cmd_stats",
    "cmd_review",
    "cmd_quiz",
    "cmd_exam",
    "cmd_record",
    "cmd_due",
    "cmd_export",
    "cmd_import",
    "cmd_delete",
    "main",
]

# --- Named constants ---
MASTERY_THRESHOLD = 5          # Repetitions needed to consider a card mastered
DEFAULT_EASE = 2.5             # Starting ease factor for new cards
MIN_EASE = 1.3                 # Minimum ease factor (SM-2 floor)
EASE_BONUS_CORRECT = 0.1      # Ease increase on correct answer
EASE_PENALTY_PARTIAL = -0.15   # Ease decrease on partial answer
EASE_PENALTY_MISSED = -0.2    # Ease decrease on missed answer
DEFAULT_QUIZ_COUNT = 10        # Default number of quiz questions
DEFAULT_EXAM_QUESTIONS = 20    # Default number of exam questions

DECKS_DIR = Path.home() / ".openclaw" / "study-buddy" / "decks"


def ensure_dir() -> None:
    """Create the decks directory if it does not exist."""
    DECKS_DIR.mkdir(parents=True, exist_ok=True)


def deck_path(name: str) -> Path:
    """Return the filesystem path for a deck given its name."""
    safe = name.lower().replace(" ", "_").replace("/", "_")
    return DECKS_DIR / f"{safe}.json"


def load_deck(name: str) -> dict:
    """Load a deck from disk by name, exiting if not found."""
    path = deck_path(name)
    if not path.exists():
        logger.error("Deck '%s' not found.", name)
        print(f"Error: Deck '{name}' not found.", file=sys.stderr)
        sys.exit(1)
    with open(path) as f:
        return json.load(f)


def save_deck(deck: dict) -> None:
    """Persist a deck dictionary to its JSON file on disk."""
    ensure_dir()
    path = deck_path(deck["name"])
    with open(path, "w") as f:
        json.dump(deck, f, indent=2, ensure_ascii=False)


def new_card(card_id: int, question: str, answer: str) -> dict:
    """Create a new flashcard with default SM-2 scheduling values."""
    return {
        "id": card_id,
        "q": question,
        "a": answer,
        "interval": 0,
        "ease": DEFAULT_EASE,
        "repetitions": 0,
        "next_review": datetime.now().isoformat(),
        "created_at": datetime.now().isoformat(),
    }


def sm2_update(card: dict, result: str) -> dict:
    """Update a card's scheduling using a simplified SM-2 style rule.

    result is one of "correct", "partial", or "missed". The card is modified
    in place and also returned.
    """
    ease = card.get("ease", DEFAULT_EASE)
    interval = card.get("interval", 0)
    reps = card.get("repetitions", 0)

    if result == "correct":
        # Fixed first two steps (1 day, 3 days), then grow by the ease factor.
        if reps == 0:
            interval = 1
        elif reps == 1:
            interval = 3
        else:
            interval = int(interval * ease)  # truncated to whole days
        reps += 1
        ease = max(MIN_EASE, ease + EASE_BONUS_CORRECT)
    elif result == "partial":
        # Interval and repetition count are kept; only the ease drops.
        ease = max(MIN_EASE, ease + EASE_PENALTY_PARTIAL)
    elif result == "missed":
        # Start over: back to a 1-day interval with a lower ease.
        reps = 0
        interval = 1
        ease = max(MIN_EASE, ease + EASE_PENALTY_MISSED)

    card["interval"] = interval
    card["ease"] = round(ease, 2)
    card["repetitions"] = reps
    # A card is never rescheduled sooner than one day from now.
    card["next_review"] = (datetime.now() + timedelta(days=max(interval, 1))).isoformat()
    return card


def _validate_cards_json(cards_json: str) -> list[dict]:
    """Parse and validate a JSON string of cards, ensuring each has 'q' and 'a' keys."""
    try:
        cards_data = json.loads(cards_json)
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON for cards: %s", exc)
        print(f"Error: Invalid JSON for cards: {exc}", file=sys.stderr)
        sys.exit(1)

    if not isinstance(cards_data, list):
        logger.error("Cards must be a JSON array.")
        print("Error: Cards must be a JSON array.", file=sys.stderr)
        sys.exit(1)

    for i, card in enumerate(cards_data):
        if "q" not in card or "a" not in card:
            logger.error("Card at index %d is missing required 'q' and/or 'a' keys.", i)
            print(
                f"Error: Card at index {i} is missing required 'q' and/or 'a' keys.",
                file=sys.stderr,
            )
            sys.exit(1)

    return cards_data


def cmd_create(args: argparse.Namespace) -> None:
    """Create a new flashcard deck from a JSON array of cards."""
    ensure_dir()
    path = deck_path(args.deck_name)
    if path.exists():
        logger.error("Deck '%s' already exists. Use 'add' to add cards.", args.deck_name)
        print(f"Error: Deck '{args.deck_name}' already exists. Use 'add' to add cards.", file=sys.stderr)
        sys.exit(1)

    cards_data = _validate_cards_json(args.cards)
    cards = [new_card(i + 1, c["q"], c["a"]) for i, c in enumerate(cards_data)]

    deck = {
        "name": args.deck_name,
        "cards": cards,
        "next_id": len(cards) + 1,
        "created_at": datetime.now().isoformat(),
        "updated_at": datetime.now().isoformat(),
    }
    save_deck(deck)
    print(json.dumps({"status": "created", "deck": args.deck_name, "cards": len(cards)}, indent=2))


def cmd_add(args: argparse.Namespace) -> None:
    """Add new cards to an existing deck."""
    deck = load_deck(args.deck_name)
    cards_data = _validate_cards_json(args.cards)
    next_id = deck.get("next_id", len(deck["cards"]) + 1)

    new_cards = []
    for i, c in enumerate(cards_data):
        new_cards.append(new_card(next_id + i, c["q"], c["a"]))

    deck["cards"].extend(new_cards)
    deck["next_id"] = next_id + len(new_cards)
    deck["updated_at"] = datetime.now().isoformat()
    save_deck(deck)
    print(json.dumps({"status": "added", "deck": args.deck_name, "new_cards": len(new_cards), "total_cards": len(deck["cards"])}, indent=2))


def cmd_list(args: argparse.Namespace) -> None:
    """List all decks with card counts and due counts."""
    ensure_dir()
    decks: list[dict] = []
    for f in sorted(DECKS_DIR.glob("*.json")):
        with open(f) as fh:
            d = json.load(fh)
            now = datetime.now()
            due = sum(1 for c in d["cards"] if datetime.fromisoformat(c["next_review"]) <= now)
            decks.append({
                "name": d["name"],
                "cards": len(d["cards"]),
                "due": due,
                "created": d.get("created_at", "unknown"),
            })
    print(json.dumps({"decks": decks}, indent=2))


def cmd_stats(args: argparse.Namespace) -> None:
    """Show statistics for a specific deck."""
    deck = load_deck(args.deck_name)
    now = datetime.now()
    cards = deck["cards"]
    due = [c for c in cards if datetime.fromisoformat(c["next_review"]) <= now]
    mastered = [c for c in cards if c.get("repetitions", 0) >= MASTERY_THRESHOLD]
    learning = [c for c in cards if 0 < c.get("repetitions", 0) < MASTERY_THRESHOLD]
    new = [c for c in cards if c.get("repetitions", 0) == 0]

    print(json.dumps({
        "deck": args.deck_name,
        "total_cards": len(cards),
        "due_now": len(due),
        "mastered": len(mastered),
        "learning": len(learning),
        "new": len(new),
        "average_ease": round(sum(c.get("ease", DEFAULT_EASE) for c in cards) / max(len(cards), 1), 2),
    }, indent=2))


def cmd_review(args: argparse.Namespace) -> None:
    """Return due cards for review, shuffled randomly."""
    deck = load_deck(args.deck_name)
    now = datetime.now()
    due = [c for c in deck["cards"] if datetime.fromisoformat(c["next_review"]) <= now]

    if not due:
        next_reviews = sorted(deck["cards"], key=lambda c: c["next_review"])
        next_time = next_reviews[0]["next_review"] if next_reviews else "never"
        print(json.dumps({"status": "no_cards_due", "next_review": next_time}, indent=2))
        return

    random.shuffle(due)
    cards_out = [{"id": c["id"], "q": c["q"], "a": c["a"], "repetitions": c.get("repetitions", 0)} for c in due]
    print(json.dumps({"deck": args.deck_name, "due_count": len(due), "cards": cards_out}, indent=2))


def cmd_quiz(args: argparse.Namespace) -> None:
    """Generate a random quiz from the deck's cards."""
    deck = load_deck(args.deck_name)
    count = min(args.count or DEFAULT_QUIZ_COUNT, len(deck["cards"]))
    selected = random.sample(deck["cards"], count)
    cards_out = [{"id": c["id"], "q": c["q"], "a": c["a"]} for c in selected]
    print(json.dumps({"deck": args.deck_name, "quiz_count": count, "cards": cards_out}, indent=2))


def cmd_exam(args: argparse.Namespace) -> None:
    """Generate a structured exam with multiple question types."""
    deck = load_deck(args.deck_name)
    count = min(args.questions or DEFAULT_EXAM_QUESTIONS, len(deck["cards"]))
    types = (args.types or "multiple_choice,short_answer,true_false").split(",")
    selected = random.sample(deck["cards"], count)
    all_answers = [c["a"] for c in deck["cards"]]

    questions: list[dict] = []
    for i, card in enumerate(selected):
        q_type = types[i % len(types)]
        q: dict = {"number": i + 1, "type": q_type, "question": card["q"], "card_id": card["id"]}

        if q_type == "multiple_choice":
            distractors = [a for a in all_answers if a != card["a"]]
            random.shuffle(distractors)
            options = [card["a"]] + distractors[:3]
            random.shuffle(options)
            q["options"] = options
            q["correct"] = card["a"]
        elif q_type == "true_false":
            use_true = random.choice([True, False])
            if use_true:
                q["statement"] = card["a"]
                q["correct"] = True
            else:
                if distractors := [a for a in all_answers if a != card["a"]]:
                    q["statement"] = random.choice(distractors)
                else:
                    q["statement"] = card["a"]
                    use_true = True
                q["correct"] = use_true
        else:
            q["correct"] = card["a"]

        questions.append(q)

    print(json.dumps({"deck": args.deck_name, "exam": questions}, indent=2))


def cmd_record(args: argparse.Namespace) -> None:
    """Record a review result for a specific card and update its schedule."""
    deck = load_deck(args.deck_name)
    card_id = args.card_id

    for card in deck["cards"]:
        if card["id"] == card_id:
            sm2_update(card, args.result)
            deck["updated_at"] = datetime.now().isoformat()
            save_deck(deck)
            print(json.dumps({
                "status": "recorded",
                "card_id": card_id,
                "result": args.result,
                "next_review": card["next_review"],
                "interval_days": card["interval"],
                "ease": card["ease"],
            }, indent=2))
            return

    logger.error("Card ID %d not found in deck '%s'.", card_id, args.deck_name)
    print(f"Error: Card ID {card_id} not found in deck '{args.deck_name}'.", file=sys.stderr)
    sys.exit(1)


def cmd_due(args: argparse.Namespace) -> None:
    """List all decks that have cards due for review."""
    ensure_dir()
    now = datetime.now()
    results: list[dict] = []
    for f in sorted(DECKS_DIR.glob("*.json")):
        with open(f) as fh:
            d = json.load(fh)
            due = [c for c in d["cards"] if datetime.fromisoformat(c["next_review"]) <= now]
            if due:
                results.append({"deck": d["name"], "due_count": len(due)})
    print(json.dumps({"due_decks": results, "total_due": sum(r["due_count"] for r in results)}, indent=2))


def cmd_export(args: argparse.Namespace) -> None:
    """Export a deck as formatted JSON to stdout."""
    deck = load_deck(args.deck_name)
    print(json.dumps(deck, indent=2, ensure_ascii=False))


def cmd_import(args: argparse.Namespace) -> None:
    """Import a deck from a JSON file on disk."""
    ensure_dir()
    try:
        with open(args.file_path) as f:
            deck = json.load(f)
    except FileNotFoundError:
        logger.error("File not found: %s", args.file_path)
        print(f"Error: File not found: {args.file_path}", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON in file '%s': %s", args.file_path, exc)
        print(f"Error: Invalid JSON in file '{args.file_path}': {exc}", file=sys.stderr)
        sys.exit(1)

    if "name" not in deck or "cards" not in deck:
        logger.error("Invalid deck format. Must have 'name' and 'cards'.")
        print("Error: Invalid deck format. Must have 'name' and 'cards'.", file=sys.stderr)
        sys.exit(1)
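    # Fill in IDs and scheduling defaults so hand-written decks (only q/a)
    # do not break commands that read "next_review" or "id".
    for i, card in enumerate(deck["cards"], start=1):
        defaults = new_card(card.get("id", i), card.get("q", ""), card.get("a", ""))
        for key, value in defaults.items():
            card.setdefault(key, value)
    deck.setdefault("next_id", max((c["id"] for c in deck["cards"]), default=0) + 1)
    save_deck(deck)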
    print(json.dumps({"status": "imported", "deck": deck["name"], "cards": len(deck["cards"])}, indent=2))


def cmd_delete(args: argparse.Namespace) -> None:
    """Delete a deck file from disk."""
    path = deck_path(args.deck_name)
    if not path.exists():
        logger.error("Deck '%s' not found.", args.deck_name)
        print(f"Error: Deck '{args.deck_name}' not found.", file=sys.stderr)
        sys.exit(1)
    path.unlink()
    print(json.dumps({"status": "deleted", "deck": args.deck_name}, indent=2))


def main() -> None:
    """Entry point: parse CLI arguments and dispatch to the appropriate command."""
    parser = argparse.ArgumentParser(description="Study Buddy - Flashcard Deck Manager")
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("create")
    p.add_argument("deck_name")
    p.add_argument("--cards", required=True)

    p = sub.add_parser("add")
    p.add_argument("deck_name")
    p.add_argument("--cards", required=True)

    sub.add_parser("list")

    p = sub.add_parser("stats")
    p.add_argument("deck_name")

    p = sub.add_parser("review")
    p.add_argument("deck_name")

    p = sub.add_parser("quiz")
    p.add_argument("deck_name")
    p.add_argument("--count", type=int, default=DEFAULT_QUIZ_COUNT)

    p = sub.add_parser("exam")
    p.add_argument("deck_name")
    p.add_argument("--questions", type=int, default=DEFAULT_EXAM_QUESTIONS)
    p.add_argument("--types", default="multiple_choice,short_answer,true_false")

    p = sub.add_parser("record")
    p.add_argument("deck_name")
    p.add_argument("--card-id", type=int, required=True)
    p.add_argument("--result", required=True, choices=["correct", "partial", "missed"])

    sub.add_parser("due")

    p = sub.add_parser("export")
    p.add_argument("deck_name")

    p = sub.add_parser("import")
    p.add_argument("file_path")

    p = sub.add_parser("delete")
    p.add_argument("deck_name")

    args = parser.parse_args()

    commands: dict[str, callable] = {
        "create": cmd_create, "add": cmd_add, "list": cmd_list,
        "stats": cmd_stats, "review": cmd_review, "quiz": cmd_quiz,
        "exam": cmd_exam, "record": cmd_record, "due": cmd_due,
        "export": cmd_export, "import": cmd_import, "delete": cmd_delete,
    }
    commands[args.command](args)


if __name__ == "__main__":
    main()
