@clawhub-keepfit44-b40cc406cb
LinkedIn job search assistant that scrapes listings, filters by technologies and countries, and scores matches with AI. Use when the user wants to find jobs,...
---
name: job-hunter
description: LinkedIn job search assistant that scrapes listings, filters by technologies and countries, and scores matches with AI. Use when the user wants to find jobs, search for job openings, look for work, job hunt, or find career opportunities. Triggers on phrases like "find jobs", "job search", "looking for work", "job openings", "search LinkedIn", "remote jobs", "buscar trabajo", "ofertas de trabajo", "ofertas de empleo", "empleo remoto", "vacantes", "buscar empleo", "trabajo remoto", "career opportunities", "hiring", "job listings".
metadata:
{ "openclaw": { "requires": { "bins": ["python3"], "packages": { "pip": ["httpx", "selectolax", "google-genai"] } }, "emoji": "🔍" } }
---
# Job Hunter
AI-powered LinkedIn job search assistant that scrapes real-time listings, filters by technology and location, and scores each match — delivered through chat.
## Setup
Before first use, the user needs a **Google Gemini API key** for AI scoring. Ask for it and save it:
```bash
python3 scripts/job_hunter.py setkey "USER_GEMINI_KEY_HERE"
```
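If the user doesn't have one, searches still work but without AI scoring: every job gets a neutral 0.5 score. Because the default `min_score` is 0.6, pass `"min_score": 0.5` (or lower) in that case, otherwise every job is filtered out of the results. Free keys are available at https://aistudio.google.com/apikey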
## Core Workflow
### 1. Conversational Search
When the user asks to search for jobs, gather these parameters conversationally:
- **keywords** (required): job title or search terms (e.g., "Python developer", "data engineer")
- **technologies** (optional): required tech stack (e.g., ["Python", "AWS", "Docker"])
- **countries** (optional): countries to search in (e.g., ["Spain", "Germany"])
- **remote** (optional): true/false for remote-only jobs
- **experience** (optional): "internship", "entry", "associate", "mid", "senior", "director", "executive"
- **exclude** (optional): terms to exclude (e.g., ["consultant", "staffing"])
- **company_size** (optional): list of LinkedIn size codes "1"-"8", e.g., ["4", "5"] (1=1-10, 4=201-500, 7=5001-10000)
- **salary_min** (optional): minimum salary in EUR (passed to the AI scorer as a criterion, not applied as a hard filter)
- **ai_prompt** (optional): extra criteria for AI scoring (e.g., "Must use microservices")
- **max_pages** (optional): pages to scrape per location (default 3, max 5)
- **min_score** (optional): minimum AI score to show (default 0.6)
Don't ask for ALL parameters — just ask the essentials (keywords, technologies, countries) and use sensible defaults for the rest. Let the user add filters if they want.
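As a rough illustration of "essentials plus defaults", the sketch below assembles a search payload from whatever the user supplied. The helper name `build_search_params` is hypothetical and not part of `scripts/job_hunter.py`; the defaults and the experience levels are copied from the parameter list above and from `references/search_format.md`.

```python
import json
from typing import Any

# Experience levels as listed in references/search_format.md.
VALID_EXPERIENCE = {
    "internship", "entry", "associate", "mid", "senior", "director", "executive",
}


def build_search_params(keywords: str, **optional: Any) -> dict[str, Any]:
    """Hypothetical helper: fill documented defaults for omitted fields."""
    if not keywords or not keywords.strip():
        raise ValueError("keywords is required")
    params: dict[str, Any] = {
        "keywords": keywords.strip(),
        "max_pages": 3,    # documented default
        "min_score": 0.6,  # documented default
    }
    # Copy only the optional fields the user actually supplied.
    for key, value in optional.items():
        if value is not None:
            params[key] = value
    # Clamp max_pages to the documented range (at most 5 pages).
    params["max_pages"] = max(1, min(int(params["max_pages"]), 5))
    experience = params.get("experience")
    if experience is not None and experience not in VALID_EXPERIENCE:
        raise ValueError(f"unknown experience level: {experience}")
    return params


if __name__ == "__main__":
    payload = build_search_params(
        "Python developer", technologies=["Python"], countries=["Spain"]
    )
    # The JSON string is what would be passed to `job_hunter.py search`.
    print(json.dumps(payload))
```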
### 2. Run the Search
```bash
python3 scripts/job_hunter.py search '{
"keywords": "Python developer",
"technologies": ["Python", "FastAPI", "AWS"],
"countries": ["Spain", "Germany"],
"remote": true,
"experience": "mid",
"exclude": ["consultant"],
"min_score": 0.6,
"max_pages": 3
}'
```
The script returns JSON with scored jobs. Present the results in a clean format:
> **1. Senior Python Engineer** — TechCorp
> Madrid, Spain | Remote | €50k-60k
> Score: 0.92 — "Excelente match: remoto, Python/FastAPI"
> https://linkedin.com/jobs/view/12345
Show the top results (score >= min_score) sorted by score. If there are many results, show the top 10 and mention how many more are available.
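The presentation step could be sketched as follows. This is a minimal illustration, not part of the skill: `format_results` is a hypothetical name, and it assumes the output shape produced by the `search` command (`jobs`, `above_threshold`, and per-job `ai_score` and `ai_summary` fields).

```python
from typing import Any


def format_results(payload: dict[str, Any], limit: int = 10) -> str:
    """Hypothetical formatter: render the top jobs as chat-friendly text."""
    jobs = sorted(
        payload.get("jobs", []),
        key=lambda j: j.get("ai_score", 0),
        reverse=True,
    )
    lines: list[str] = []
    for rank, job in enumerate(jobs[:limit], start=1):
        lines.append(f"{rank}. {job['title']} - {job['company']}")
        # Location and salary are optional in the scraped data.
        details = [d for d in (job.get("location"), job.get("salary")) if d]
        if details:
            lines.append("   " + " | ".join(details))
        summary = job.get("ai_summary") or "no summary"
        lines.append(f"   Score: {job.get('ai_score', 0):.2f} - {summary}")
        lines.append(f"   {job['url']}")
    # Mention how many matches above the threshold were not shown.
    total = payload.get("above_threshold", len(jobs))
    remaining = total - min(limit, len(jobs))
    if remaining > 0:
        lines.append(f"... and {remaining} more available")
    return "\n".join(lines)
```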
**Important:** Searches take time (30-90 seconds) due to LinkedIn scraping. Tell the user to wait.
### 3. Save Interesting Jobs
Users can save jobs they like for later review:
```bash
# Save a job
python3 scripts/job_hunter.py save '{
"title": "Senior Python Engineer",
"company": "TechCorp",
"location": "Madrid",
"url": "https://linkedin.com/jobs/view/12345",
"score": 0.92,
"notes": "Great match, applied on 2026-03-19"
}'
# List saved jobs
python3 scripts/job_hunter.py saved
# Remove a saved job
python3 scripts/job_hunter.py unsave "https://linkedin.com/jobs/view/12345"
```
### 4. Search History
```bash
# Show recent searches
python3 scripts/job_hunter.py history
# Re-run a previous search by its history index (0 = most recent)
python3 scripts/job_hunter.py rerun 1
```
## Handling Different Languages
Detect the user's language and:
- Respond in their language
- AI summaries default to Spanish because the scoring prompt asks for it; to request the user's language, pass it in ai_prompt (e.g., "Respond in English")
- Job data stays in the original LinkedIn language
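One possible way to carry the detected language into `ai_prompt` is sketched below. The helper `with_language` is hypothetical, and whether the model honors the instruction depends on the scoring prompt.

```python
from typing import Optional


def with_language(ai_prompt: Optional[str], language: str) -> str:
    """Hypothetical helper: append a language instruction to ai_prompt."""
    instruction = f"Respond in {language}"
    if not ai_prompt:
        return instruction
    # Avoid repeating the instruction if the user already included it.
    if instruction.lower() in ai_prompt.lower():
        return ai_prompt
    return f"{ai_prompt.rstrip('. ')}. {instruction}"
```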
## Tips
- **Per-country searches** give much better results than global "Remote" searches on LinkedIn
- If no results, suggest broadening: fewer technologies, more countries, lower experience level
- LinkedIn may rate-limit after many searches — suggest waiting 5-10 minutes if errors occur
- Encourage users to save interesting jobs before they disappear from LinkedIn
## Storage
All data is stored as JSON in `~/.openclaw/job-hunter/`:
- `config.json` — Gemini API key and settings
- `history.json` — search history
- `saved.json` — saved jobs
See [references/search_format.md](references/search_format.md) for full schemas.
FILE:references/search_format.md
# Job Hunter Data Formats
## Search Parameters
```json
{
"keywords": "Python developer",
"technologies": ["Python", "FastAPI", "AWS"],
"countries": ["Spain", "Germany"],
"remote": true,
"experience": "mid",
"exclude": ["consultant", "staffing"],
"company_size": ["4", "5", "6"],
"salary_min": 40000,
"ai_prompt": "Must use microservices architecture",
"max_pages": 3,
"min_score": 0.6
}
```
### Experience Levels
- `internship`, `entry`, `associate`, `mid`, `senior`, `director`, `executive`
### Company Size Codes (LinkedIn f_CS)
- `1`: 1-10 employees
- `2`: 11-50
- `3`: 51-200
- `4`: 201-500
- `5`: 501-1000
- `6`: 1001-5000
- `7`: 5001-10000
- `8`: 10001+
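When a user describes company size in employees rather than codes, the table above can be turned into a lookup. The sketch below is illustrative only; `size_code` and `size_codes_between` are hypothetical helpers, not functions in `scripts/job_hunter.py`.

```python
# Upper bound of each bucket, paired with its code, from the table above.
SIZE_BUCKETS = [
    (10, "1"), (50, "2"), (200, "3"), (500, "4"),
    (1000, "5"), (5000, "6"), (10000, "7"),
]


def size_code(employees: int) -> str:
    """Return the size code whose bucket contains *employees*."""
    if employees < 1:
        raise ValueError("employees must be at least 1")
    for upper, code in SIZE_BUCKETS:
        if employees <= upper:
            return code
    return "8"  # 10001+


def size_codes_between(min_employees: int, max_employees: int) -> list[str]:
    """Return every code covering the inclusive headcount range."""
    low = int(size_code(min_employees))
    high = int(size_code(max_employees))
    return [str(code) for code in range(low, high + 1)]
```

For example, a request for "companies with 201 to 5000 employees" would map to the `company_size` value `["4", "5", "6"]` used in the sample parameters.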
## Search Result (per job)
```json
{
"title": "Senior Python Engineer",
"company": "TechCorp",
"location": "Madrid, Spain",
"url": "https://www.linkedin.com/jobs/view/12345",
"posted_at": "2026-03-18",
"salary": "€50,000 - €60,000",
"ai_score": 0.92,
"ai_summary": "Excelente match: remoto, Python/FastAPI, ubicación Madrid"
}
```
## Saved Job
```json
{
"title": "Senior Python Engineer",
"company": "TechCorp",
"location": "Madrid, Spain",
"url": "https://www.linkedin.com/jobs/view/12345",
"score": 0.92,
"notes": "Applied on 2026-03-19",
"saved_at": "2026-03-19T10:30:00"
}
```
## Config
```json
{
"gemini_api_key": "...",
"gemini_model": "gemini-2.5-flash",
"min_ai_score": 0.6,
"max_pages": 3
}
```
FILE:scripts/job_hunter.py
#!/usr/bin/env python3
"""
Job Hunter - LinkedIn Job Search Skill for OpenClaw
A personal job search assistant that reads publicly available LinkedIn job
listings (the same pages any web browser can access) and optionally scores
them with Google Gemini AI. No authentication, login, or private data access
is involved — only public search result pages are fetched.
Usage:
job_hunter.py search '<json_params>'
job_hunter.py setkey <gemini_api_key>
job_hunter.py save '<json_job>'
job_hunter.py saved
job_hunter.py unsave <url>
job_hunter.py history
job_hunter.py rerun <index>
License: MIT
"""
from __future__ import annotations
import asyncio
import json
import logging
import re
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
from urllib.parse import quote_plus, urlencode
import httpx
from selectolax.parser import HTMLParser, Node
__all__ = [
"cmd_history",
"cmd_rerun",
"cmd_save",
"cmd_saved",
"cmd_search",
"cmd_setkey",
"cmd_unsave",
"load_config",
"load_json",
"main",
"parse_job_listings",
"save_json",
"score_jobs",
"scrape_jobs",
]
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
MAX_DESCRIPTION_LENGTH = 2000
MAX_DESCRIPTION_FOR_PROMPT = 500
MAX_RESULTS_OUTPUT = 30
MAX_HISTORY_ENTRIES = 20
DEFAULT_MIN_SCORE = 0.6
DEFAULT_MAX_PAGES = 3
MAX_PAGES_CAP = 5
DEFAULT_MAX_CONCURRENT = 5
# ---------------------------------------------------------------------------
# Storage
# ---------------------------------------------------------------------------
DATA_DIR = Path.home() / ".openclaw" / "job-hunter"
CONFIG_FILE = DATA_DIR / "config.json"
HISTORY_FILE = DATA_DIR / "history.json"
SAVED_FILE = DATA_DIR / "saved.json"
def ensure_dir() -> None:
    """Create the data directory if it does not exist."""
    DATA_DIR.mkdir(parents=True, exist_ok=True)


def load_json(path: Path, default: Any = None) -> Any:
    """Load and return JSON data from *path*, or *default* if missing."""
    if path.exists():
        # Read as UTF-8 explicitly: save_json writes non-ASCII text unescaped.
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    return default if default is not None else {}


def save_json(path: Path, data: Any) -> None:
    """Persist *data* as JSON to *path*, creating directories as needed."""
    ensure_dir()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)


def load_config() -> dict[str, Any]:
    """Return the stored config or sensible defaults."""
    return load_json(CONFIG_FILE, {
        "gemini_api_key": "",
        "gemini_model": "gemini-2.5-flash",
        "min_ai_score": DEFAULT_MIN_SCORE,
        "max_pages": DEFAULT_MAX_PAGES,
    })
# ---------------------------------------------------------------------------
# LinkedIn scraping
# ---------------------------------------------------------------------------
LINKEDIN_JOBS_URL = "https://www.linkedin.com/jobs/search/"
JOBS_PER_PAGE = 25
EXPERIENCE_MAP: dict[str, str] = {
"internship": "1", "entry": "2", "associate": "3",
"mid": "4", "mid-senior": "4", "senior": "4",
"director": "5", "executive": "6",
}
LOCATION_TO_COUNTRY: dict[str, str] = {
", ca": "united states", ", ny": "united states", ", tx": "united states",
", wa": "united states", ", il": "united states", ", ma": "united states",
", co": "united states", ", ga": "united states", ", va": "united states",
", pa": "united states", ", nc": "united states", ", or": "united states",
", fl": "united states", ", nj": "united states", ", ct": "united states",
", az": "united states", ", mn": "united states", ", oh": "united states",
", md": "united states", ", mi": "united states", ", ut": "united states",
", dc": "united states", ", tn": "united states", ", mo": "united states",
"new york": "united states", "san francisco": "united states",
"seattle": "united states", "austin": "united states", "boston": "united states",
"chicago": "united states", "los angeles": "united states", "denver": "united states",
"atlanta": "united states", "dallas": "united states", "portland": "united states",
"miami": "united states", "san jose": "united states", "san diego": "united states",
"washington": "united states", "houston": "united states", "philadelphia": "united states",
"charlotte": "united states", "pittsburgh": "united states", "raleigh": "united states",
"minneapolis": "united states", "detroit": "united states", "phoenix": "united states",
"salt lake": "united states",
"london": "united kingdom", "manchester": "united kingdom",
"edinburgh": "united kingdom", "birmingham": "united kingdom",
"bristol": "united kingdom", "cambridge": "united kingdom",
"oxford": "united kingdom", "glasgow": "united kingdom", "leeds": "united kingdom",
"england": "united kingdom", "scotland": "united kingdom", "wales": "united kingdom",
"berlin": "germany", "munich": "germany", "münchen": "germany",
"hamburg": "germany", "frankfurt": "germany", "cologne": "germany",
"köln": "germany", "düsseldorf": "germany", "stuttgart": "germany",
"zurich": "switzerland", "zürich": "switzerland", "geneva": "switzerland",
"genève": "switzerland", "basel": "switzerland", "bern": "switzerland",
"lausanne": "switzerland",
"vienna": "austria", "wien": "austria", "graz": "austria",
"salzburg": "austria", "linz": "austria",
"amsterdam": "netherlands", "rotterdam": "netherlands",
"the hague": "netherlands", "eindhoven": "netherlands", "utrecht": "netherlands",
"dublin": "ireland", "cork": "ireland", "galway": "ireland",
"sydney": "australia", "melbourne": "australia", "brisbane": "australia",
"perth": "australia", "adelaide": "australia", "canberra": "australia",
"toronto": "canada", "vancouver": "canada", "montreal": "canada",
"ottawa": "canada", "calgary": "canada",
"stockholm": "sweden", "gothenburg": "sweden", "malmö": "sweden",
"copenhagen": "denmark", "oslo": "norway", "helsinki": "finland",
"paris": "france", "lyon": "france", "toulouse": "france",
"brussels": "belgium", "antwerp": "belgium",
"luxembourg": "luxembourg",
"madrid": "spain", "barcelona": "spain", "valencia": "spain",
"sevilla": "spain", "malaga": "spain", "bilbao": "spain",
"lisbon": "portugal", "porto": "portugal",
"rome": "italy", "milan": "italy", "turin": "italy",
"warsaw": "poland", "krakow": "poland", "wroclaw": "poland",
"prague": "czech republic", "brno": "czech republic",
"bucharest": "romania", "cluj": "romania",
"budapest": "hungary",
"athens": "greece", "thessaloniki": "greece",
"tel aviv": "israel", "jerusalem": "israel",
"singapore": "singapore",
"tokyo": "japan", "osaka": "japan",
"seoul": "south korea",
"bangalore": "india", "mumbai": "india", "hyderabad": "india",
"delhi": "india", "pune": "india", "chennai": "india",
"são paulo": "brazil", "rio de janeiro": "brazil",
"mexico city": "mexico", "guadalajara": "mexico", "monterrey": "mexico",
"buenos aires": "argentina",
"bogota": "colombia", "medellin": "colombia",
"santiago": "chile", "lima": "peru",
}
logger = logging.getLogger("job_hunter")
def _extract_linkedin_id(url: str) -> str | None:
    """Extract the numeric LinkedIn job ID from a job URL."""
    # The ID is the digit run that ends the last slug segment, optionally
    # preceded by a hyphenated title (e.g. ".../python-3-developer-12345").
    # Matching greedily up to the last hyphen avoids picking up digits that
    # belong to the title itself.
    match = re.search(r"/jobs/view/(?:[^/?#]*-)?(\d+)(?:[/?#]|$)", url)
    return match.group(1) if match else None


def _parse_relative_date(text: str | None) -> str | None:
    """Convert a relative date string (e.g. '3 days ago') to ISO date."""
    if not text:
        return None
    text = text.lower().strip()
    now = datetime.now(timezone.utc)
    patterns: list[tuple[str, Any]] = [
        (r"(\d+)\s*second", lambda m: timedelta(seconds=int(m.group(1)))),
        (r"(\d+)\s*minute", lambda m: timedelta(minutes=int(m.group(1)))),
        (r"(\d+)\s*hour", lambda m: timedelta(hours=int(m.group(1)))),
        (r"(\d+)\s*day", lambda m: timedelta(days=int(m.group(1)))),
        (r"(\d+)\s*week", lambda m: timedelta(weeks=int(m.group(1)))),
        # Months are approximated as 30 days.
        (r"(\d+)\s*month", lambda m: timedelta(days=int(m.group(1)) * 30)),
    ]
    for pattern, delta_fn in patterns:
        match = re.search(pattern, text)
        if match:
            return (now - delta_fn(match)).date().isoformat()
    return None
def _parse_single_card(card: Node) -> dict[str, Any] | None:
    """Parse a single LinkedIn job card HTML node into a job dict."""
    title_el = card.css_first("h3, h4, .base-search-card__title")
    title = title_el.text(strip=True) if title_el else None
    link_el = card.css_first("a[href*='/jobs/view/'], a.base-card__full-link")
    url = link_el.attributes.get("href", "").split("?")[0] if link_el else None
    linkedin_id = _extract_linkedin_id(url) if url else None
    if not title or not linkedin_id or not url:
        return None
    company_el = card.css_first("h4 a, .base-search-card__subtitle, .base-search-card__subtitle a")
    company = company_el.text(strip=True) if company_el else "Unknown"
    location_el = card.css_first(".job-search-card__location, .base-search-card__metadata span")
    location = location_el.text(strip=True) if location_el else None
    time_el = card.css_first("time")
    posted_at = None
    if time_el:
        # Prefer the machine-readable datetime attribute over the visible text.
        dt = time_el.attributes.get("datetime")
        posted_at = dt if dt else _parse_relative_date(time_el.text(strip=True))
    salary_el = card.css_first(".job-search-card__salary-info, .base-search-card__metadata .salary")
    salary = salary_el.text(strip=True) if salary_el else None
    return {
        "linkedin_id": linkedin_id, "title": title, "company": company,
        "location": location, "url": url, "posted_at": posted_at,
        "salary": salary, "description": None,
    }


def parse_job_listings(html: str) -> list[dict[str, Any]]:
    """Parse LinkedIn search results HTML and return a list of job dicts."""
    parser = HTMLParser(html)
    jobs: list[dict[str, Any]] = []
    # Try progressively looser selectors in case the page layout differs.
    cards = parser.css("ul.jobs-search__results-list > li")
    if not cards:
        cards = parser.css("[data-entity-urn]")
    if not cards:
        cards = parser.css("div.base-search-card")
    for card in cards:
        try:
            job = _parse_single_card(card)
            if job:
                jobs.append(job)
        except (AttributeError, KeyError, TypeError) as exc:
            logger.warning("Failed to parse job card: %s", exc)
            continue
    return jobs
async def _fetch_description(client: httpx.AsyncClient, job: dict[str, Any]) -> str | None:
    """Fetch the full description for a single job listing."""
    try:
        resp = await client.get(job["url"])
        resp.raise_for_status()
    except httpx.HTTPStatusError as exc:
        logger.warning("HTTP %s fetching description for %s", exc.response.status_code, job["url"])
        return None
    except httpx.RequestError as exc:
        logger.warning("Request error fetching description for %s: %s", job["url"], exc)
        return None
    parser = HTMLParser(resp.text)
    desc_el = (
        parser.css_first("div.description__text")
        or parser.css_first("div.show-more-less-html__markup")
        or parser.css_first("section.description")
    )
    return desc_el.text(strip=True)[:MAX_DESCRIPTION_LENGTH] if desc_el else None


async def _enrich_descriptions(
    client: httpx.AsyncClient,
    jobs: list[dict[str, Any]],
    max_concurrent: int = DEFAULT_MAX_CONCURRENT,
) -> None:
    """Fetch and attach descriptions for a list of jobs in batches."""
    for i in range(0, len(jobs), max_concurrent):
        batch = jobs[i:i + max_concurrent]
        descriptions = await asyncio.gather(*[_fetch_description(client, j) for j in batch])
        for job, desc in zip(batch, descriptions):
            job["description"] = desc
        if i + max_concurrent < len(jobs):
            await asyncio.sleep(2.0)  # polite delay between batches


def _job_matches_technologies(job: dict[str, Any], techs_lower: list[str]) -> bool:
    """Return True if any technology keyword appears in the job title or description."""
    # Plain substring match: short names such as "go" also match inside other words.
    searchable = job["title"].lower()
    if job.get("description"):
        searchable += " " + job["description"].lower()
    return any(tech in searchable for tech in techs_lower)


def _location_matches_countries(location: str | None, countries_lower: set[str]) -> bool:
    """Return True if *location* maps to one of the given countries."""
    if not location:
        return False
    loc = location.lower()
    if any(country in loc for country in countries_lower):
        return True
    for fragment, country in LOCATION_TO_COUNTRY.items():
        if fragment in loc and country in countries_lower:
            return True
    return False
async def _scrape_one_location(
    client: httpx.AsyncClient,
    params: dict[str, Any],
    location: str | None,
    max_pages: int,
) -> list[dict[str, Any]]:
    """Scrape job listings for a single location across multiple pages."""
    all_jobs: list[dict[str, Any]] = []
    for page in range(max_pages):
        url_params: dict[str, str] = {
            "keywords": params["keywords"],
            "start": str(page * JOBS_PER_PAGE),
            "sortBy": "DD",
        }
        if location:
            url_params["location"] = location
        if params.get("remote"):
            url_params["f_WT"] = "2"
        if params.get("experience") and params["experience"] in EXPERIENCE_MAP:
            url_params["f_E"] = EXPERIENCE_MAP[params["experience"]]
        if params.get("company_size"):
            # Accept a single code or a list, given as strings or integers.
            sizes = params["company_size"]
            if isinstance(sizes, (str, int)):
                sizes = [sizes]
            url_params["f_CS"] = ",".join(str(s) for s in sizes)
        url = f"{LINKEDIN_JOBS_URL}?{urlencode(url_params, quote_via=quote_plus)}"
        try:
            response = await client.get(url)
            response.raise_for_status()
        except httpx.HTTPStatusError as exc:
            logger.warning("HTTP %s fetching page %d for '%s'", exc.response.status_code, page, location)
            break
        except httpx.RequestError as exc:
            logger.warning("Request error on page %d for '%s': %s", page, location, exc)
            break
        jobs = parse_job_listings(response.text)
        if not jobs:
            break
        all_jobs.extend(jobs)
        if page < max_pages - 1:
            await asyncio.sleep(2.0)  # polite delay between pages
    return all_jobs


async def scrape_jobs(params: dict[str, Any]) -> list[dict[str, Any]]:
    """Scrape LinkedIn job listings based on search parameters."""
    headers = {
        "Accept": "text/html",
    }
    max_pages = min(params.get("max_pages", DEFAULT_MAX_PAGES), MAX_PAGES_CAP)
    all_jobs: list[dict[str, Any]] = []
    seen_ids: set[str] = set()
    async with httpx.AsyncClient(headers=headers, follow_redirects=True, timeout=30.0) as client:
        countries = params.get("countries", [])
        if countries:
            for country in countries:
                jobs = await _scrape_one_location(client, params, country, max_pages)
                # Deduplicate jobs that appear under more than one country.
                for j in jobs:
                    if j["linkedin_id"] not in seen_ids:
                        seen_ids.add(j["linkedin_id"])
                        all_jobs.append(j)
                await asyncio.sleep(2.0)  # polite delay between countries
        else:
            all_jobs = await _scrape_one_location(client, params, params.get("location"), max_pages)
    # Filter excluded terms
    exclude = params.get("exclude", [])
    if exclude:
        exclude_lower = [e.lower() for e in exclude]
        all_jobs = [
            j for j in all_jobs
            if not any(term in j["title"].lower() or term in j["company"].lower() for term in exclude_lower)
        ]
    # Enrich with descriptions
    if all_jobs:
        async with httpx.AsyncClient(follow_redirects=True, timeout=30.0) as desc_client:
            await _enrich_descriptions(desc_client, all_jobs)
    # Filter by technologies
    technologies = params.get("technologies", [])
    if technologies:
        techs_lower = [t.lower() for t in technologies]
        all_jobs = [j for j in all_jobs if _job_matches_technologies(j, techs_lower)]
    return all_jobs
# ---------------------------------------------------------------------------
# AI Scoring (Google Gemini)
# ---------------------------------------------------------------------------
SCORING_PROMPT = """You are a job matching assistant. Score each job from 0.0 to 1.0 based on how
well it matches the candidate's criteria. Be strict: only score above 0.7 if the job is a strong match.
Candidate criteria:
- Keywords: {keywords}
- Required technologies: {technologies}
- Location preference: {location}
- Remote: {remote}
- Minimum salary: {salary_min}
- Experience level: {experience}
- Exclude terms: {exclude}
- Additional requirements: {ai_prompt}
Jobs to evaluate:
{jobs_text}
Respond ONLY with a JSON array. Each element must have:
- "index": the job number (starting from 0)
- "score": float from 0.0 to 1.0
- "reason": one sentence explaining the score in Spanish
Example response:
[{{"index": 0, "score": 0.85, "reason": "Buen match por tecnologia y ubicacion remota"}}]
"""
BATCH_SIZE = 10
def _format_jobs_for_prompt(jobs: list[dict[str, Any]]) -> str:
    """Format a list of jobs into a text block for the AI scoring prompt."""
    lines: list[str] = []
    for i, job in enumerate(jobs):
        parts = [f"Job {i}:", f"  Title: {job['title']}", f"  Company: {job['company']}"]
        if job.get("location"):
            parts.append(f"  Location: {job['location']}")
        if job.get("salary"):
            parts.append(f"  Salary: {job['salary']}")
        if job.get("description"):
            parts.append(f"  Description: {job['description'][:MAX_DESCRIPTION_FOR_PROMPT]}")
        parts.append(f"  URL: {job['url']}")
        lines.append("\n".join(parts))
    return "\n\n".join(lines)


def _build_scoring_prompt(jobs: list[dict[str, Any]], params: dict[str, Any]) -> str:
    """Build the full Gemini scoring prompt for a batch of jobs."""
    return SCORING_PROMPT.format(
        keywords=params.get("keywords", ""),
        technologies=", ".join(params.get("technologies", [])) or "Not specified",
        location=", ".join(params.get("countries", [])) or params.get("location", "Any"),
        remote="Yes" if params.get("remote") else "No preference",
        salary_min=f"EUR {params['salary_min']}" if params.get("salary_min") else "Not specified",
        experience=params.get("experience", "Any"),
        exclude=", ".join(params.get("exclude", [])) or "None",
        ai_prompt=params.get("ai_prompt", "No additional requirements"),
        jobs_text=_format_jobs_for_prompt(jobs),
    )
def _parse_scores(response_text: str, count: int) -> list[dict[str, Any]]:
    """Parse AI response text into a list of score dicts, with fallback."""
    try:
        text = (response_text or "").strip()
        # Keep only the outermost JSON array, dropping code fences or any
        # prose the model may have wrapped around it.
        start = text.find("[")
        end = text.rfind("]") + 1
        if start >= 0 and end > start:
            text = text[start:end]
        scores = json.loads(text)
        if not isinstance(scores, list):
            raise ValueError("Expected JSON array")
        return scores
    except (json.JSONDecodeError, ValueError) as exc:
        logger.warning("Failed to parse AI scores: %s", exc)
        return [{"index": i, "score": 0.5, "reason": "Score unavailable"} for i in range(count)]
def score_jobs(
    jobs: list[dict[str, Any]],
    params: dict[str, Any],
    api_key: str,
    model: str = "gemini-2.5-flash",
) -> list[dict[str, Any]]:
    """Score jobs using Google Gemini AI, or assign neutral scores if no key."""
    if not jobs:
        return []
    if not api_key:
        for job in jobs:
            job["ai_score"] = 0.5
            job["ai_summary"] = "No Gemini API key configured"
        return jobs
    # Imported lazily so searches without a key do not need google-genai.
    from google import genai
    client = genai.Client(api_key=api_key)
    for batch_start in range(0, len(jobs), BATCH_SIZE):
        batch = jobs[batch_start:batch_start + BATCH_SIZE]
        prompt = _build_scoring_prompt(batch, params)
        try:
            response = client.models.generate_content(
                model=model,
                contents=prompt,
                config=genai.types.GenerateContentConfig(temperature=0.1, max_output_tokens=2048),
            )
            scores = _parse_scores(response.text, len(batch))
            score_map = {s["index"]: s for s in scores}
            for i, job in enumerate(batch):
                score_data = score_map.get(i, {"score": 0.5, "reason": "Not scored"})
                job["ai_score"] = float(score_data.get("score", 0.5))
                job["ai_summary"] = score_data.get("reason")
        except Exception as e:
            # Any failure in a batch falls back to neutral scores for that batch.
            logger.error("AI scoring failed for batch starting at %d: %s", batch_start, e)
            for job in batch:
                job["ai_score"] = 0.5
                job["ai_summary"] = f"AI error: {e}"
    return jobs
# ---------------------------------------------------------------------------
# Commands
# ---------------------------------------------------------------------------
def cmd_search(params_json: str) -> None:
    """Search LinkedIn for jobs matching the given JSON parameters."""
    try:
        params = json.loads(params_json)
    except json.JSONDecodeError as exc:
        print(json.dumps({
            "status": "error",
            "message": f"Search parameters are not valid JSON: {exc}",
        }, indent=2))
        return
    if not isinstance(params, dict) or "keywords" not in params:
        print(json.dumps({
            "status": "error",
            "message": "Missing required 'keywords' key in search parameters",
        }, indent=2))
        return
    config = load_config()
    # Run scraping
    jobs = asyncio.run(scrape_jobs(params))
    # Save to history (always, even with no results)
    history = load_json(HISTORY_FILE, [])
    history.insert(0, {
        "params": params,
        "total_found": len(jobs),
        "above_threshold": 0,
        "timestamp": datetime.now().isoformat(),
    })
    history = history[:MAX_HISTORY_ENTRIES]
    save_json(HISTORY_FILE, history)
    if not jobs:
        print(json.dumps({"status": "no_results", "message": "No jobs found matching your criteria"}, indent=2))
        return
    # Score with AI
    api_key = config.get("gemini_api_key", "")
    model = config.get("gemini_model", "gemini-2.5-flash")
    jobs = score_jobs(jobs, params, api_key, model)
    # Filter by min score
    min_score = params.get("min_score", config.get("min_ai_score", DEFAULT_MIN_SCORE))
    scored_jobs = sorted(jobs, key=lambda j: j.get("ai_score", 0), reverse=True)
    filtered = [j for j in scored_jobs if j.get("ai_score", 0) >= min_score]
    # Clean output (remove description to keep output manageable)
    for j in scored_jobs:
        j.pop("description", None)
        j.pop("linkedin_id", None)
    # Update history with actual above_threshold count
    history = load_json(HISTORY_FILE, [])
    if history:
        history[0]["above_threshold"] = len(filtered)
        save_json(HISTORY_FILE, history)
    print(json.dumps({
        "status": "ok",
        "total_scraped": len(jobs),
        "above_threshold": len(filtered),
        "min_score": min_score,
        "jobs": filtered[:MAX_RESULTS_OUTPUT],
        "all_jobs": len(scored_jobs),
    }, indent=2, ensure_ascii=False))
def cmd_setkey(api_key: str) -> None:
    """Store the Gemini API key in the config file."""
    config = load_config()
    config["gemini_api_key"] = api_key
    save_json(CONFIG_FILE, config)
    print(json.dumps({"status": "ok", "message": "Gemini API key saved"}, indent=2))


def cmd_save(job_json: str) -> None:
    """Save a job to the bookmarks list (no duplicates by URL)."""
    job = json.loads(job_json)
    job["saved_at"] = datetime.now().isoformat()
    saved = load_json(SAVED_FILE, [])
    # Don't duplicate
    if not any(s.get("url") == job.get("url") for s in saved):
        saved.append(job)
        save_json(SAVED_FILE, saved)
    print(json.dumps({"status": "saved", "total_saved": len(saved)}, indent=2))


def cmd_saved() -> None:
    """List all saved/bookmarked jobs."""
    saved = load_json(SAVED_FILE, [])
    print(json.dumps({"saved_jobs": saved, "total": len(saved)}, indent=2, ensure_ascii=False))


def cmd_unsave(url: str) -> None:
    """Remove a saved job by its URL."""
    saved = load_json(SAVED_FILE, [])
    saved = [s for s in saved if s.get("url") != url]
    save_json(SAVED_FILE, saved)
    print(json.dumps({"status": "removed", "total_saved": len(saved)}, indent=2))


def cmd_history() -> None:
    """Show the search history with indexed entries."""
    history = load_json(HISTORY_FILE, [])
    # Index 0 is the most recent search.
    for i, h in enumerate(history):
        h["index"] = i
    print(json.dumps({"searches": history}, indent=2, ensure_ascii=False))


def cmd_rerun(index_str: str) -> None:
    """Re-run a previous search by its history index."""
    history = load_json(HISTORY_FILE, [])
    index = int(index_str)
    if index < 0 or index >= len(history):
        print(json.dumps({"status": "error", "message": f"Invalid index. History has {len(history)} entries."}, indent=2))
        return
    params = history[index]["params"]
    cmd_search(json.dumps(params))
def main() -> None:
    """CLI entry point for the job_hunter skill."""
    if len(sys.argv) < 2:
        print("Usage: job_hunter.py <command> [args]", file=sys.stderr)
        sys.exit(1)
    command = sys.argv[1]
    commands: dict[str, Any] = {
        "search": lambda: cmd_search(sys.argv[2]),
        "setkey": lambda: cmd_setkey(sys.argv[2]),
        "save": lambda: cmd_save(sys.argv[2]),
        "saved": lambda: cmd_saved(),
        "unsave": lambda: cmd_unsave(sys.argv[2]),
        "history": lambda: cmd_history(),
        "rerun": lambda: cmd_rerun(sys.argv[2]),
    }
    if command not in commands:
        print(f"Unknown command: {command}", file=sys.stderr)
        sys.exit(1)
    # Every command except saved/history takes one argument; fail with a
    # clear message instead of an IndexError when it is missing.
    if command not in ("saved", "history") and len(sys.argv) < 3:
        print(f"Command '{command}' requires an argument", file=sys.stderr)
        sys.exit(1)
    commands[command]()


if __name__ == "__main__":
    main()
Interactive study assistant that creates flashcards, quizzes, and spaced repetition reviews from any source material (notes, PDFs, photos, text, URLs). Use w...
---
name: study-buddy
description: Interactive study assistant that creates flashcards, quizzes, and spaced repetition reviews from any source material (notes, PDFs, photos, text, URLs). Use when the user wants to study, memorize, review, prepare for exams, create flashcards, take a quiz, practice questions, or learn any topic. Triggers on phrases like "study", "quiz me", "flashcards", "review", "exam prep", "test me", "help me memorize", "spaced repetition", "study session".
metadata:
{ "openclaw": { "emoji": "📚", "requires": { "bins": ["python3"] } } }
---
# Study Buddy
AI-powered study assistant that turns any material into interactive learning sessions with flashcards, quizzes, and spaced repetition — delivered through chat.
## Core Workflow
### 1. Create Flashcards from Material
When the user provides study material (text, image, PDF, URL):
1. Extract and analyze the content
2. Identify key concepts, definitions, formulas, dates, and relationships
3. Generate flashcards as Q&A pairs
4. Store them using `scripts/deck_manager.py`
```bash
# Create a new deck
python3 scripts/deck_manager.py create "Biology Exam" --cards '[
{"q": "What is mitosis?", "a": "Cell division producing two identical daughter cells"},
{"q": "What are the phases of mitosis?", "a": "Prophase, Metaphase, Anaphase, Telophase (PMAT)"}
]'
# Add cards to existing deck
python3 scripts/deck_manager.py add "Biology Exam" --cards '[
{"q": "What is meiosis?", "a": "Cell division producing four genetically different gametes"}
]'
```
**Card generation guidelines:**
- One concept per card
- Questions should test understanding, not just recall
- Include mnemonics when helpful (e.g., PMAT for mitosis phases)
- For math/science: include both formula cards and application cards
- For languages: include context sentences, not just word translations
- Aim for 10-20 cards per topic section
### 2. Quiz Session
When the user asks to be quizzed:
```bash
# Get cards due for review (spaced repetition)
python3 scripts/deck_manager.py review "Biology Exam"
# Get random quiz (all cards)
python3 scripts/deck_manager.py quiz "Biology Exam" --count 10
```
**Quiz delivery format:**
Present one question at a time:
> **Question 3/10**
> What are the phases of mitosis?
Wait for the user's answer, then reveal:
> **Answer:** Prophase, Metaphase, Anaphase, Telophase (PMAT)
>
> How did you do?
> Got it | Partially | Missed it
Record the result:
```bash
python3 scripts/deck_manager.py record "Biology Exam" --card-id 2 --result "correct"
```
Results affect spaced repetition scheduling:
- correct: review interval increases (1 day, then 3 days, then the previous interval multiplied by the card's ease factor)
- partial: interval stays the same
- missed: interval resets to 1 day
### 3. Spaced Repetition Review
When the user starts a study session or asks "what should I review?":
```bash
# Check what's due across all decks
python3 scripts/deck_manager.py due
# Review specific deck
python3 scripts/deck_manager.py review "Biology Exam"
```
Only show cards that are due based on the SM-2 algorithm intervals. After each session, show a summary:
> **Session complete!**
> Reviewed: 12 cards
> Correct: 9 | Partial: 2 | Missed: 1
> Next review: 3 cards due tomorrow
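
The first three lines of that summary can be produced by tallying the results recorded during the session. This is a minimal sketch; `session_summary` is a hypothetical helper, and the "Next review" line is left out because it depends on the deck's scheduling data.

```python
from collections import Counter


def session_summary(results):
    """Format a session summary from a list of 'correct'/'partial'/'missed' strings.

    Hypothetical helper for illustration; not part of deck_manager.py.
    """
    counts = Counter(results)  # missing keys count as 0
    lines = [
        "**Session complete!**",
        f"Reviewed: {len(results)} cards",
        f"Correct: {counts['correct']} | Partial: {counts['partial']} | Missed: {counts['missed']}",
    ]
    return "\n".join(lines)


print(session_summary(["correct"] * 9 + ["partial"] * 2 + ["missed"]))
```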
### 4. Generate Practice Exam
When the user asks for an exam or test:
```bash
python3 scripts/deck_manager.py exam "Biology Exam" --questions 20 --types "multiple_choice,short_answer,true_false"
```
Generate a mix of question types from the deck:
- **Multiple choice** (4 options, one correct) -- use other cards' answers as distractors
- **True/False** -- modify real answers slightly for false statements
- **Short answer** -- direct questions from flashcards
- **Fill in the blank** -- remove key terms from answers
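
The bundled script does not appear to emit fill-in-the-blank items itself, so they would be built from a card's answer during the conversation. One possible sketch follows, assuming the key term has already been chosen; `make_cloze` is a hypothetical helper.

```python
def make_cloze(answer, key_term, blank="_____"):
    """Replace the first occurrence of key_term in answer with a blank.

    Hypothetical helper for illustration; choosing the key term is left
    to the caller.
    """
    if key_term not in answer:
        raise ValueError(f"{key_term!r} does not occur in the answer")
    return answer.replace(key_term, blank, 1)


question = make_cloze(
    "Cell division producing two identical daughter cells", "identical"
)
print(question)
```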
### 5. Deck Management
```bash
# List all decks
python3 scripts/deck_manager.py list
# Show deck stats
python3 scripts/deck_manager.py stats "Biology Exam"
# Export deck (share with others)
python3 scripts/deck_manager.py export "Biology Exam"
# Import deck
python3 scripts/deck_manager.py import deck_file.json
# Delete deck
python3 scripts/deck_manager.py delete "Biology Exam"
```
For guidance on handling different input types (text, photos, PDFs, URLs) and tips for creating effective cards, see [references/guidelines.md](references/guidelines.md).
## Storage
All decks are stored as JSON in `~/.openclaw/study-buddy/decks/`. Each deck file contains cards, review history, and scheduling metadata. See [references/data_format.md](references/data_format.md) for the schema.
## Multilingual Support
Study Buddy works in any language. Detect the user's language from their message and:
- Generate cards in the same language
- Quiz prompts in the user's language
- Support mixed-language decks (useful for language learning)
FILE:references/data_format.md
# Study Buddy - Data Format
## Deck Schema
```json
{
  "name": "Biology Exam",
  "cards": [
    {
      "id": 1,
      "q": "What is mitosis?",
      "a": "Cell division producing two identical daughter cells",
      "interval": 7,
      "ease": 2.6,
      "repetitions": 3,
      "next_review": "2026-03-26T10:00:00",
      "created_at": "2026-03-19T10:00:00"
    }
  ],
  "next_id": 2,
  "created_at": "2026-03-19T10:00:00",
  "updated_at": "2026-03-19T10:00:00"
}
```
## Card Fields
| Field | Type | Description |
|-------|------|-------------|
| id | int | Unique card ID within deck |
| q | string | Question text |
| a | string | Answer text |
| interval | int | Days until next review |
| ease | float | SM-2 ease factor (minimum 1.3, default 2.5) |
| repetitions | int | Consecutive correct answers |
| next_review | ISO datetime | When the card is next due |
| created_at | ISO datetime | When the card was created |
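
Because `next_review` is stored as an ISO datetime string, deciding whether a card is due comes down to parsing it and comparing it with the current time. The sketch below mirrors the comparison used in `scripts/deck_manager.py`; the `is_due` name and the optional `now` parameter are illustrative additions.

```python
from datetime import datetime


def is_due(card, now=None):
    """Return True when the card's next_review time has been reached.

    Illustrative helper; `now` can be injected to make the check testable.
    """
    now = now or datetime.now()
    return datetime.fromisoformat(card["next_review"]) <= now


card = {"id": 1, "next_review": "2026-03-26T10:00:00"}
print(is_due(card, now=datetime(2026, 3, 26, 10, 0, 0)))
```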
## SM-2 Algorithm
Interval progression for consecutive correct answers:
- Rep 1: 1 day
- Rep 2: 3 days
- Rep 3+: previous_interval * ease_factor
Ease adjustments:
- correct: ease + 0.1
- partial: ease - 0.15
- missed: ease - 0.2, reset to rep 0, interval 1 day
Minimum ease: 1.3
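
To see how these rules play out, the sketch below applies them to a new card (interval 0, ease 2.5, 0 repetitions) answered correctly five times in a row. It is a simplified restatement of the rules above, not the script itself; `next_schedule` is an illustrative name, and truncating the interval to whole days is an assumption.

```python
MIN_EASE = 1.3


def next_schedule(interval, ease, reps, result):
    """Return (interval, ease, reps) after one review, following the rules above."""
    if result == "correct":
        if reps == 0:
            interval = 1
        elif reps == 1:
            interval = 3
        else:
            # Assumption: the product is truncated to whole days
            interval = int(interval * ease)
        reps += 1
        ease += 0.1
    elif result == "partial":
        ease -= 0.15  # interval and repetitions stay unchanged
    elif result == "missed":
        reps, interval = 0, 1
        ease -= 0.2
    else:
        raise ValueError(f"unknown result: {result!r}")
    return interval, round(max(MIN_EASE, ease), 2), reps


interval, ease, reps = 0, 2.5, 0
progression = []
for _ in range(5):
    interval, ease, reps = next_schedule(interval, ease, reps, "correct")
    progression.append(interval)
print(progression)
```

Under these assumptions the intervals grow as 1, 3, 8, 22, and 63 days, and a single missed answer afterwards resets the interval to 1 day.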
## Storage Location
`~/.openclaw/study-buddy/decks/<deck_name>.json`
Deck names are normalized: lowercased, with spaces and forward slashes replaced by underscores.
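
As a quick illustration of the naming rule, the sketch below maps a deck name to its file path in the same way the bundled script does, which also replaces forward slashes. The function name `deck_file` is illustrative.

```python
from pathlib import Path

DECKS_DIR = Path.home() / ".openclaw" / "study-buddy" / "decks"


def deck_file(name):
    """Map a human-readable deck name to its JSON file path."""
    safe = name.lower().replace(" ", "_").replace("/", "_")
    return DECKS_DIR / f"{safe}.json"


print(deck_file("Biology Exam").name)
```

One consequence of this rule is that names differing only in case or spacing, such as "Biology Exam" and "biology_exam", resolve to the same file.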
FILE:references/guidelines.md
# Study Buddy Guidelines
## Handling Different Input Types
| Input | How to Process |
|-------|---------------|
| **Text/notes** | Extract key concepts directly |
| **Photo of handwritten notes** | Describe the image, extract text and concepts |
| **PDF document** | Read and extract key sections |
| **URL/webpage** | Fetch content, extract main points |
| **Topic name only** | Generate cards from AI knowledge on the topic |
| **Conversation history** | Summarize and create cards from recent discussion |
## Tips for Effective Cards
- **Atomic**: One fact per card, never compound questions
- **Clear**: Unambiguous questions with definitive answers
- **Bidirectional**: For definitions, create both "What is X?" and the reverse, "Which term means [definition]?"
- **Visual mnemonics**: Suggest memory tricks when possible
- **Progressive**: Start with basic recall, add application questions later
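
For the bidirectional tip, a term and its definition can be expanded into a forward and a reverse card mechanically. This is a rough sketch; `bidirectional_cards` and the question wording are assumptions, and real cards usually benefit from rephrasing by hand.

```python
def bidirectional_cards(term, definition):
    """Build a forward and a reverse card for one term/definition pair.

    Hypothetical helper for illustration; the question templates are
    placeholders.
    """
    return [
        {"q": f"What is {term}?", "a": definition},
        {"q": f"Which term means: {definition}?", "a": term},
    ]


for card in bidirectional_cards(
    "mitosis", "Cell division producing two identical daughter cells"
):
    print(card["q"], "->", card["a"])
```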
FILE:scripts/deck_manager.py
#!/usr/bin/env python3
"""
Study Buddy - Deck Manager
Manages flashcard decks with spaced repetition (SM-2 algorithm).

License: MIT

Usage:
    deck_manager.py create <deck_name> --cards '<json_array>'
    deck_manager.py add <deck_name> --cards '<json_array>'
    deck_manager.py list
    deck_manager.py stats <deck_name>
    deck_manager.py review <deck_name>
    deck_manager.py quiz <deck_name> [--count N]
    deck_manager.py exam <deck_name> [--questions N] [--types types]
    deck_manager.py record <deck_name> --card-id <id> --result <correct|partial|missed>
    deck_manager.py due
    deck_manager.py export <deck_name>
    deck_manager.py import <file_path>
    deck_manager.py delete <deck_name>
"""
from __future__ import annotations
import argparse
import json
import logging
import os
import random
import sys
from datetime import datetime, timedelta
from pathlib import Path
logger = logging.getLogger(__name__)
__all__ = [
    "DECKS_DIR",
    "ensure_dir",
    "deck_path",
    "load_deck",
    "save_deck",
    "new_card",
    "sm2_update",
    "cmd_create",
    "cmd_add",
    "cmd_list",
    "cmd_stats",
    "cmd_review",
    "cmd_quiz",
    "cmd_exam",
    "cmd_record",
    "cmd_due",
    "cmd_export",
    "cmd_import",
    "cmd_delete",
    "main",
]
# --- Named constants ---
MASTERY_THRESHOLD = 5 # Repetitions needed to consider a card mastered
DEFAULT_EASE = 2.5 # Starting ease factor for new cards
MIN_EASE = 1.3 # Minimum ease factor (SM-2 floor)
EASE_BONUS_CORRECT = 0.1 # Ease increase on correct answer
EASE_PENALTY_PARTIAL = -0.15 # Ease decrease on partial answer
EASE_PENALTY_MISSED = -0.2 # Ease decrease on missed answer
DEFAULT_QUIZ_COUNT = 10 # Default number of quiz questions
DEFAULT_EXAM_QUESTIONS = 20 # Default number of exam questions
DECKS_DIR = Path.home() / ".openclaw" / "study-buddy" / "decks"
def ensure_dir() -> None:
    """Create the decks directory if it does not exist."""
    DECKS_DIR.mkdir(parents=True, exist_ok=True)


def deck_path(name: str) -> Path:
    """Return the filesystem path for a deck given its name."""
    safe = name.lower().replace(" ", "_").replace("/", "_")
    return DECKS_DIR / f"{safe}.json"


def load_deck(name: str) -> dict:
    """Load a deck from disk by name, exiting if not found."""
    path = deck_path(name)
    if not path.exists():
        logger.error("Deck '%s' not found.", name)
        print(f"Error: Deck '{name}' not found.", file=sys.stderr)
        sys.exit(1)
    with open(path) as f:
        return json.load(f)


def save_deck(deck: dict) -> None:
    """Persist a deck dictionary to its JSON file on disk."""
    ensure_dir()
    path = deck_path(deck["name"])
    with open(path, "w") as f:
        json.dump(deck, f, indent=2, ensure_ascii=False)

def new_card(card_id: int, question: str, answer: str) -> dict:
    """Create a new flashcard with default SM-2 scheduling values."""
    return {
        "id": card_id,
        "q": question,
        "a": answer,
        "interval": 0,
        "ease": DEFAULT_EASE,
        "repetitions": 0,
        "next_review": datetime.now().isoformat(),
        "created_at": datetime.now().isoformat(),
    }


def sm2_update(card: dict, result: str) -> dict:
    """Apply the SM-2 spaced repetition algorithm to update a card's scheduling."""
    ease = card.get("ease", DEFAULT_EASE)
    interval = card.get("interval", 0)
    reps = card.get("repetitions", 0)
    if result == "correct":
        # First two successes use fixed intervals; later ones scale by ease.
        if reps == 0:
            interval = 1
        elif reps == 1:
            interval = 3
        else:
            interval = int(interval * ease)
        reps += 1
        ease = max(MIN_EASE, ease + EASE_BONUS_CORRECT)
    elif result == "partial":
        # Interval and repetitions stay unchanged; only ease drops.
        ease = max(MIN_EASE, ease + EASE_PENALTY_PARTIAL)
    elif result == "missed":
        reps = 0
        interval = 1
        ease = max(MIN_EASE, ease + EASE_PENALTY_MISSED)
    card["interval"] = interval
    card["ease"] = round(ease, 2)
    card["repetitions"] = reps
    card["next_review"] = (datetime.now() + timedelta(days=max(interval, 1))).isoformat()
    return card

def _validate_cards_json(cards_json: str) -> list[dict]:
    """Parse and validate a JSON string of cards, ensuring each has 'q' and 'a' keys."""
    try:
        cards_data = json.loads(cards_json)
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON for cards: %s", exc)
        print(f"Error: Invalid JSON for cards: {exc}", file=sys.stderr)
        sys.exit(1)
    if not isinstance(cards_data, list):
        logger.error("Cards must be a JSON array.")
        print("Error: Cards must be a JSON array.", file=sys.stderr)
        sys.exit(1)
    for i, card in enumerate(cards_data):
        # Reject non-object entries (strings, numbers) as well as missing keys.
        if not isinstance(card, dict) or "q" not in card or "a" not in card:
            logger.error("Card at index %d is missing required 'q' and/or 'a' keys.", i)
            print(
                f"Error: Card at index {i} is missing required 'q' and/or 'a' keys.",
                file=sys.stderr,
            )
            sys.exit(1)
    return cards_data

def cmd_create(args: argparse.Namespace) -> None:
    """Create a new flashcard deck from a JSON array of cards."""
    ensure_dir()
    path = deck_path(args.deck_name)
    if path.exists():
        logger.error("Deck '%s' already exists. Use 'add' to add cards.", args.deck_name)
        print(f"Error: Deck '{args.deck_name}' already exists. Use 'add' to add cards.", file=sys.stderr)
        sys.exit(1)
    cards_data = _validate_cards_json(args.cards)
    cards = [new_card(i + 1, c["q"], c["a"]) for i, c in enumerate(cards_data)]
    deck = {
        "name": args.deck_name,
        "cards": cards,
        "next_id": len(cards) + 1,
        "created_at": datetime.now().isoformat(),
        "updated_at": datetime.now().isoformat(),
    }
    save_deck(deck)
    print(json.dumps({"status": "created", "deck": args.deck_name, "cards": len(cards)}, indent=2))


def cmd_add(args: argparse.Namespace) -> None:
    """Add new cards to an existing deck."""
    deck = load_deck(args.deck_name)
    cards_data = _validate_cards_json(args.cards)
    next_id = deck.get("next_id", len(deck["cards"]) + 1)
    new_cards = []
    for i, c in enumerate(cards_data):
        new_cards.append(new_card(next_id + i, c["q"], c["a"]))
    deck["cards"].extend(new_cards)
    deck["next_id"] = next_id + len(new_cards)
    deck["updated_at"] = datetime.now().isoformat()
    save_deck(deck)
    print(json.dumps({"status": "added", "deck": args.deck_name, "new_cards": len(new_cards), "total_cards": len(deck["cards"])}, indent=2))

def cmd_list(args: argparse.Namespace) -> None:
    """List all decks with card counts and due counts."""
    ensure_dir()
    decks: list[dict] = []
    for f in sorted(DECKS_DIR.glob("*.json")):
        with open(f) as fh:
            d = json.load(fh)
        now = datetime.now()
        due = sum(1 for c in d["cards"] if datetime.fromisoformat(c["next_review"]) <= now)
        decks.append({
            "name": d["name"],
            "cards": len(d["cards"]),
            "due": due,
            "created": d.get("created_at", "unknown"),
        })
    print(json.dumps({"decks": decks}, indent=2))


def cmd_stats(args: argparse.Namespace) -> None:
    """Show statistics for a specific deck."""
    deck = load_deck(args.deck_name)
    now = datetime.now()
    cards = deck["cards"]
    due = [c for c in cards if datetime.fromisoformat(c["next_review"]) <= now]
    mastered = [c for c in cards if c.get("repetitions", 0) >= MASTERY_THRESHOLD]
    learning = [c for c in cards if 0 < c.get("repetitions", 0) < MASTERY_THRESHOLD]
    new = [c for c in cards if c.get("repetitions", 0) == 0]
    print(json.dumps({
        "deck": args.deck_name,
        "total_cards": len(cards),
        "due_now": len(due),
        "mastered": len(mastered),
        "learning": len(learning),
        "new": len(new),
        "average_ease": round(sum(c.get("ease", DEFAULT_EASE) for c in cards) / max(len(cards), 1), 2),
    }, indent=2))

def cmd_review(args: argparse.Namespace) -> None:
    """Return due cards for review, shuffled randomly."""
    deck = load_deck(args.deck_name)
    now = datetime.now()
    due = [c for c in deck["cards"] if datetime.fromisoformat(c["next_review"]) <= now]
    if not due:
        # ISO timestamps sort chronologically as strings.
        next_reviews = sorted(deck["cards"], key=lambda c: c["next_review"])
        next_time = next_reviews[0]["next_review"] if next_reviews else "never"
        print(json.dumps({"status": "no_cards_due", "next_review": next_time}, indent=2))
        return
    random.shuffle(due)
    cards_out = [{"id": c["id"], "q": c["q"], "a": c["a"], "repetitions": c.get("repetitions", 0)} for c in due]
    print(json.dumps({"deck": args.deck_name, "due_count": len(due), "cards": cards_out}, indent=2))


def cmd_quiz(args: argparse.Namespace) -> None:
    """Generate a random quiz from the deck's cards."""
    deck = load_deck(args.deck_name)
    # Clamp to [0, deck size] so a negative --count cannot crash random.sample.
    count = max(0, min(args.count or DEFAULT_QUIZ_COUNT, len(deck["cards"])))
    selected = random.sample(deck["cards"], count)
    cards_out = [{"id": c["id"], "q": c["q"], "a": c["a"]} for c in selected]
    print(json.dumps({"deck": args.deck_name, "quiz_count": count, "cards": cards_out}, indent=2))

def cmd_exam(args: argparse.Namespace) -> None:
    """Generate a structured exam with multiple question types."""
    deck = load_deck(args.deck_name)
    # Clamp to [0, deck size] so a negative --questions cannot crash random.sample.
    count = max(0, min(args.questions or DEFAULT_EXAM_QUESTIONS, len(deck["cards"])))
    # Tolerate spaces after commas, e.g. "multiple_choice, true_false".
    types = [t.strip() for t in (args.types or "multiple_choice,short_answer,true_false").split(",")]
    selected = random.sample(deck["cards"], count)
    all_answers = [c["a"] for c in deck["cards"]]
    questions: list[dict] = []
    for i, card in enumerate(selected):
        # Cycle through the requested question types.
        q_type = types[i % len(types)]
        q: dict = {"number": i + 1, "type": q_type, "question": card["q"], "card_id": card["id"]}
        if q_type == "multiple_choice":
            # Other cards' answers serve as distractors.
            distractors = [a for a in all_answers if a != card["a"]]
            random.shuffle(distractors)
            options = [card["a"]] + distractors[:3]
            random.shuffle(options)
            q["options"] = options
            q["correct"] = card["a"]
        elif q_type == "true_false":
            use_true = random.choice([True, False])
            if use_true:
                q["statement"] = card["a"]
                q["correct"] = True
            else:
                if distractors := [a for a in all_answers if a != card["a"]]:
                    q["statement"] = random.choice(distractors)
                else:
                    # No other answers available: fall back to a true statement.
                    q["statement"] = card["a"]
                    use_true = True
                q["correct"] = use_true
        else:
            # Any other type is treated as a short-answer question.
            q["correct"] = card["a"]
        questions.append(q)
    print(json.dumps({"deck": args.deck_name, "exam": questions}, indent=2))

def cmd_record(args: argparse.Namespace) -> None:
    """Record a review result for a specific card and update its schedule."""
    deck = load_deck(args.deck_name)
    card_id = args.card_id
    for card in deck["cards"]:
        if card["id"] == card_id:
            sm2_update(card, args.result)
            deck["updated_at"] = datetime.now().isoformat()
            save_deck(deck)
            print(json.dumps({
                "status": "recorded",
                "card_id": card_id,
                "result": args.result,
                "next_review": card["next_review"],
                "interval_days": card["interval"],
                "ease": card["ease"],
            }, indent=2))
            return
    logger.error("Card ID %d not found in deck '%s'.", card_id, args.deck_name)
    print(f"Error: Card ID {card_id} not found in deck '{args.deck_name}'.", file=sys.stderr)
    sys.exit(1)


def cmd_due(args: argparse.Namespace) -> None:
    """List all decks that have cards due for review."""
    ensure_dir()
    now = datetime.now()
    results: list[dict] = []
    for f in sorted(DECKS_DIR.glob("*.json")):
        with open(f) as fh:
            d = json.load(fh)
        due = [c for c in d["cards"] if datetime.fromisoformat(c["next_review"]) <= now]
        if due:
            results.append({"deck": d["name"], "due_count": len(due)})
    print(json.dumps({"due_decks": results, "total_due": sum(r["due_count"] for r in results)}, indent=2))

def cmd_export(args: argparse.Namespace) -> None:
    """Export a deck as formatted JSON to stdout."""
    deck = load_deck(args.deck_name)
    print(json.dumps(deck, indent=2, ensure_ascii=False))


def cmd_import(args: argparse.Namespace) -> None:
    """Import a deck from a JSON file on disk."""
    ensure_dir()
    try:
        with open(args.file_path) as f:
            deck = json.load(f)
    except FileNotFoundError:
        logger.error("File not found: %s", args.file_path)
        print(f"Error: File not found: {args.file_path}", file=sys.stderr)
        sys.exit(1)
    except json.JSONDecodeError as exc:
        logger.error("Invalid JSON in file '%s': %s", args.file_path, exc)
        print(f"Error: Invalid JSON in file '{args.file_path}': {exc}", file=sys.stderr)
        sys.exit(1)
    if "name" not in deck or "cards" not in deck:
        logger.error("Invalid deck format. Must have 'name' and 'cards'.")
        print("Error: Invalid deck format. Must have 'name' and 'cards'.", file=sys.stderr)
        sys.exit(1)
    save_deck(deck)
    print(json.dumps({"status": "imported", "deck": deck["name"], "cards": len(deck["cards"])}, indent=2))


def cmd_delete(args: argparse.Namespace) -> None:
    """Delete a deck file from disk."""
    path = deck_path(args.deck_name)
    if not path.exists():
        logger.error("Deck '%s' not found.", args.deck_name)
        print(f"Error: Deck '{args.deck_name}' not found.", file=sys.stderr)
        sys.exit(1)
    path.unlink()
    print(json.dumps({"status": "deleted", "deck": args.deck_name}, indent=2))

def main() -> None:
    """Entry point: parse CLI arguments and dispatch to the appropriate command."""
    parser = argparse.ArgumentParser(description="Study Buddy - Flashcard Deck Manager")
    sub = parser.add_subparsers(dest="command", required=True)

    p = sub.add_parser("create")
    p.add_argument("deck_name")
    p.add_argument("--cards", required=True)

    p = sub.add_parser("add")
    p.add_argument("deck_name")
    p.add_argument("--cards", required=True)

    sub.add_parser("list")

    p = sub.add_parser("stats")
    p.add_argument("deck_name")

    p = sub.add_parser("review")
    p.add_argument("deck_name")

    p = sub.add_parser("quiz")
    p.add_argument("deck_name")
    p.add_argument("--count", type=int, default=DEFAULT_QUIZ_COUNT)

    p = sub.add_parser("exam")
    p.add_argument("deck_name")
    p.add_argument("--questions", type=int, default=DEFAULT_EXAM_QUESTIONS)
    p.add_argument("--types", default="multiple_choice,short_answer,true_false")

    p = sub.add_parser("record")
    p.add_argument("deck_name")
    p.add_argument("--card-id", type=int, required=True)
    p.add_argument("--result", required=True, choices=["correct", "partial", "missed"])

    sub.add_parser("due")

    p = sub.add_parser("export")
    p.add_argument("deck_name")

    p = sub.add_parser("import")
    p.add_argument("file_path")

    p = sub.add_parser("delete")
    p.add_argument("deck_name")

    args = parser.parse_args()
    # Map each subcommand name to its handler function.
    commands = {
        "create": cmd_create, "add": cmd_add, "list": cmd_list,
        "stats": cmd_stats, "review": cmd_review, "quiz": cmd_quiz,
        "exam": cmd_exam, "record": cmd_record, "due": cmd_due,
        "export": cmd_export, "import": cmd_import, "delete": cmd_delete,
    }
    commands[args.command](args)


if __name__ == "__main__":
    main()