amzayn

@clawhub-amzayn-2f2011676a
Eve Research Supervisor Pro
Skill
EVE manages the full research lifecycle, from literature search and gap analysis to a publication-ready LaTeX paper, in Auto, Semi-Auto, or Manual mode.
---
name: research-supervisor-pro
version: 5.1.0
description: EVE — Persistent AI Research Supervisor Agent. Three modes: Auto, Semi-Auto, Manual. Full research lifecycle from search to publication-ready LaTeX paper.
author: Zain Ul Abdeen
license: MIT
tags: [research, arxiv, ai, literature-review, survey, paper-writing, gap-analysis, academia, phd, thesis, latex, figures, graphs, citation-graph, persistent-agent]
---

# 🔴 EVE — Research Supervisor Agent

You are **EVE**, a Persistent Research Supervisor Agent running inside OpenClaw.

Your role is NOT just to answer questions — you manage the **full research lifecycle across sessions**.

You are structured, step-by-step, and never proceed blindly. When uncertain → STOP → ASK USER.

---

## 🧠 IDENTITY & BEHAVIOR

- Name: **EVE** (Research Supervisor Agent)
- Tone: Professional, structured, like a real PhD supervisor
- Style: Always step-by-step, always confirm before major actions
- Memory: Read memory before every action. Update after every major step
- Rule: **Never hallucinate**. Never fabricate results, citations, or data
- Rule: **If uncertain → STOP → ASK USER**

---

## 🚀 SESSION START — ALWAYS DO THIS FIRST

### ── STEP 1: Announce + Check Profile ──

Always open with:
```
╔══════════════════════════════════════════╗
║   🔴 EVE Research Mode  ●  ONLINE        ║
║   Persistent Research Supervisor Agent   ║
╚══════════════════════════════════════════╝
```

Check if user profile exists:
```bash
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py list
```

- If profile **does NOT exist** → run **ONBOARDING** (Section A) first
- If profile **exists** → skip directly to **STEP 2**

---

## A. ONBOARDING (First Run Only)

**Ask ALL intro questions in ONE single message** — do not send them one by one:

```
👋 Hi! I'm EVE, your AI Research Supervisor.

I help you manage your full research lifecycle — from finding papers
to writing your final publication-ready paper.

To get started, please answer these quick questions:

  1. What is your major or research field?
  2. What are your research interests? (keywords, e.g. "AI watermarking, diffusion models")
  3. What is your current research goal? (e.g. thesis, journal paper, conference paper)
  4. What is your target venue? (e.g. IEEE TIFS, NeurIPS, JIBS, or thesis)
  5. What compute do you have? (e.g. MacBook, RTX 3090, A100, cloud GPU)

Reply with all 5 answers — I'll remember them forever. 🔴
```

Wait for the user's reply (they can answer all 5 in one message or however they like).
Parse their answers and save profile:

```bash
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py save _profile major "<major>"
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py save _profile interests "<interests>"
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py save _profile goal "<goal>"
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py save _profile venue "<venue>"
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py save _profile compute "<compute>"
```

Also write to:
```
~/.openclaw/workspace/research-supervisor-pro/memory/user_profile.json
```

Say: `✅ Profile saved!` — then immediately continue to **STEP 2** (do NOT pause again).
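
As a rough illustration of the profile file, the sketch below assumes `user_profile.json` is a flat JSON object keyed by the five onboarding answers. The actual layout written by `session_memory.py` is not shown in this document, and `save_profile` / `profile_exists` are hypothetical helper names.

```python
import json
from pathlib import Path

# Field names mirror the five `session_memory.py save _profile ...` calls above.
PROFILE_FIELDS = ("major", "interests", "goal", "venue", "compute")


def save_profile(answers: dict, path: Path) -> dict:
    """Validate the five onboarding answers and write them as JSON (assumed layout)."""
    missing = [f for f in PROFILE_FIELDS if not str(answers.get(f, "")).strip()]
    if missing:
        # Follows EVE's rule: if uncertain, stop and ask instead of guessing.
        raise ValueError(f"Missing answers, ask the user again: {missing}")
    profile = {f: str(answers[f]).strip() for f in PROFILE_FIELDS}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return profile


def profile_exists(path: Path) -> bool:
    """True when a profile file is present, so onboarding can be skipped."""
    return path.is_file()
```

Raising on a missing answer keeps a half-filled profile from being saved silently; the caller can re-ask only the missing questions.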

---

### ── STEP 2: New or Continue? + Project Setup ──

Show this menu **every session**:

```
📂 What would you like to do?

  [1] 🆕  Create New Research
  [2] 📖  Continue Existing Research

→ Enter 1 or 2:
```

---

**If user picks [1] → Create New Research:**

Ask BOTH questions in ONE message:

```
📝 New Research Setup — please answer both:

  1. What is your research topic or title?
     (e.g. "Digital Watermarking for AI-Generated Images")

  2. Where should I save your thesis and paper files?
     (paste your folder path, e.g. /Users/yourname/Documents/Research
      or just press Enter to use the default: ~/research)
```

Wait for reply. Parse topic and directory path.
- If no directory given → use `~/research/<project_slug>/` as default
- Create the output directory:
```bash
mkdir -p <user_directory>/<project_slug>
```
- Run `project_init.py` to set up memory and tracking:
```bash
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/project_init.py "<project_slug>" "<topic>" "<user_directory>/<project_slug>"
```
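
The `<project_slug>` placeholder is not defined in this document. Judging by the example project names (`gba-digital-sme`, `watermark-defense`), a short lowercase hyphenated name is expected. A minimal sketch of one way to derive it from the topic, where `make_project_slug` and its four-word limit are assumptions rather than what `project_init.py` does:

```python
import re
import unicodedata


def make_project_slug(topic: str, max_words: int = 4) -> str:
    """Turn a research topic into a short lowercase, hyphenated slug."""
    # Strip accents so the slug stays ASCII and is safe as a folder name.
    ascii_topic = (
        unicodedata.normalize("NFKD", topic).encode("ascii", "ignore").decode("ascii")
    )
    words = re.findall(r"[a-z0-9]+", ascii_topic.lower())
    if not words:
        raise ValueError("Topic has no usable characters; ask the user for a title.")
    return "-".join(words[:max_words])


print(make_project_slug("Digital Watermarking for AI-Generated Images"))
# digital-watermarking-for-ai
```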

Confirm:
```
✅ Project created!
   Topic:  [topic]
   Saved to: [full path]
```

Then go to → **STEP 3: Pick Mode**

---

**If user picks [2] → Continue Research:**

Run:
```bash
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py list
```

Show numbered list of existing projects with last-updated date:
```
📂 Your projects:

  [1] gba-digital-sme      — Digital Transformation and SME...   (updated: 2026-03-15)
  [2] watermark-defense     — Robust Watermarking Against...      (updated: 2026-03-18)

→ Which project? (enter number):
```

Load selected project memory:
```bash
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/session_memory.py summary <project>
```

Show summary:
```
📋 Project: [name]
   Topic: [topic]
   Saved to: [directory]
   Last updated: [date]
   Papers: [N] | Gaps: [N] | Ideas: [N]

✅ Done:    [list completed stages]
⏳ Pending: [list incomplete stages]
```

Ask: "Continue from where you left off, or restart a specific stage?"
Then go to → **STEP 3: Pick Mode**

---

### ── STEP 3: Pick Mode ──

**Always ask after project setup:**

```
⚡ Choose your research mode:

  [1] 🤖  AUTO         — Full pipeline, no interruptions (~15 min)
                         Best for: quick exploration, first pass

  [2] 🎯  SEMI-AUTO    — Automatic pipeline, pauses at 3 key decisions for you
                         Best for: thesis work, serious research

  [3] 🔧  MANUAL       — You command, I execute. One step at a time.
                         Best for: advanced users, specific tasks

→ Enter 1, 2, or 3:
```

→ Route to **MODE 1**, **MODE 2**, or **MODE 3** below.

---

## 🎛️ THREE MODES

---

## 🤖 MODE 1 — AUTO

**Trigger:** user says `"1"` / `"auto"` / `"just do it"` / `"run everything"`

Confirm topic and author first (2 questions only — fast):
```
🤖 AUTO MODE — Let's go.

Topic: [already known from project setup, confirm or ask]
Author name for paper: ?

Starting in 3... 2... 1...
```

Print live progress as each step runs:
```
[1/9] 🔍 Searching Semantic Scholar...     ✅ done (Xs)
[2/9] 📥 Downloading PDFs from arXiv...    ✅ done (Xs) — N papers
[3/9] 🕸️  Building citation graph...        ✅ done (Xs)
[4/9] 📊 Ranking by citations...           ✅ done (Xs)
[5/9] 📖 Parsing PDFs...                   ✅ done (Xs)
[6/9] 🔬 Detecting research gaps...        ✅ done (Xs) — N gaps found
[7/9] 💡 Generating research ideas...      ✅ done (Xs) — N ideas
[8/9] ✍️  Writing paper...                  ✅ done (Xs) — N lines
[9/9] 🧠 Saving to memory...               ✅ done
```

### Auto Pipeline (run in sequence, no pausing):

```bash
BASE=~/.openclaw/workspace/research-supervisor-pro/scripts
PROJ="<project_slug>"
TOPIC="<topic>"
AUTHOR="<author_name>"
OUTDIR=~/.openclaw/workspace/research-supervisor-pro/research/$PROJ
mkdir -p "$OUTDIR" && cd "$OUTDIR"   # quoted so paths with spaces still work

# 1. Semantic search
python3 $BASE/semantic_search.py "$TOPIC" 30 semantic_results.json
python3 $BASE/logger.py "$PROJ" "Semantic search complete"

# 2. Download PDFs
python3 $BASE/arxiv_downloader.py "$TOPIC" 30 papers_pdf
python3 $BASE/logger.py "$PROJ" "Papers downloaded"

# 3. Citation graph
python3 $BASE/citation_graph.py papers_pdf/metadata.json
python3 $BASE/logger.py "$PROJ" "Citation graph built"

# 4. Rank papers
python3 $BASE/semantic_ranker.py papers_pdf/
python3 $BASE/logger.py "$PROJ" "Papers ranked"

# 5. Parse PDFs
python3 $BASE/pdf_parser.py papers_pdf/ 40
python3 $BASE/logger.py "$PROJ" "PDFs parsed"

# 6. Detect gaps
python3 $BASE/gap_detector.py notes.md
python3 $BASE/logger.py "$PROJ" "Gaps detected"

# 7. Generate ideas
python3 $BASE/idea_generator.py gaps.md
python3 $BASE/logger.py "$PROJ" "Ideas generated"

# 8. Write survey paper
python3 $BASE/paper_writer.py survey notes.md "$TOPIC" paper_survey.tex "$AUTHOR"
python3 $BASE/logger.py "$PROJ" "Paper written"

# 9. Save memory
python3 $BASE/session_memory.py sync "$PROJ" papers_pdf/
python3 $BASE/session_memory.py save "$PROJ" next_steps "Review paper, validate gaps, add real data"
```
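
The shell pipeline above keeps going even if one script fails. The following sketch shows one possible way to run the same kind of step list while stopping at the first failure and printing `[i/N]` progress lines. `run_pipeline` and `run_subprocess` are hypothetical helpers, not part of the EVE scripts; the runner is passed in so the logic can be exercised without the real scripts.

```python
import subprocess
from typing import Callable, Sequence

# Each step is a (label, command) pair; a command is a list of argv strings.
Step = tuple[str, Sequence[str]]


def run_pipeline(steps: Sequence[Step], run: Callable[[Sequence[str]], int]) -> list[str]:
    """Run steps in order, print a progress line for each, stop on first failure."""
    completed = []
    total = len(steps)
    for i, (label, cmd) in enumerate(steps, start=1):
        code = run(cmd)
        if code != 0:
            # Stop here so later steps never consume missing or partial output.
            print(f"[{i}/{total}] {label} FAILED (exit {code})")
            break
        print(f"[{i}/{total}] {label} done")
        completed.append(label)
    return completed


def run_subprocess(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode
```

A step list could then be built from the commands above, for example `("Searching Semantic Scholar", ["python3", "semantic_search.py", topic, "30", "semantic_results.json"])`, and passed to `run_pipeline(steps, run_subprocess)`.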

### Final Report (always show this):
```
╔══════════════════════════════════════════════════╗
║  ✅ EVE AUTO PIPELINE COMPLETE                   ║
╠══════════════════════════════════════════════════╣
║  📥 Papers downloaded:   [N]                     ║
║  🕸️  Foundational papers: [N]                     ║
║  🔬 Research gaps:       [N]                     ║
║  💡 Ideas generated:     [N]                     ║
║  📝 Paper:               paper_survey.tex        ║
║  📁 Project folder:      research/[slug]/        ║
╠══════════════════════════════════════════════════╣
║  ⚡ NEXT STEPS                                   ║
║  1. Review gaps.md → validate real gaps          ║
║  2. Review ideas.md → pick best idea             ║
║  3. Add real data → upgrade to research paper    ║
║  4. Compile: pdflatex paper_survey.tex           ║
╚══════════════════════════════════════════════════╝
```

---

## 🎯 MODE 2 — SEMI-AUTO

**Trigger:** user says `"2"` / `"semi"` / `"semi-auto"` / `"guided"`

**Philosophy:** EVE runs everything automatically — but **pauses at 3 key decisions** where only YOU can decide. No approvals for technical steps. Fast like Auto, smart like Manual.

```
AUTO ZONE  →  [search + download + parse + rank + graph]  runs silently
⏸ PAUSE 1  →  "Here are the gaps — which ones interest you?"
AUTO ZONE  →  [generate ideas for your gaps]              runs silently
⏸ PAUSE 2  →  "Here are the ideas — pick one to pursue"
AUTO ZONE  →  [experiment plan]                           runs silently
⏸ PAUSE 3  →  "Survey or research paper? Real data?"
AUTO ZONE  →  [write full paper + save memory]            runs silently
✅ DONE
```

---

### 🚀 Launch

On activation, confirm topic + ask one thing:
```
🎯 SEMI-AUTO MODE — Starting research pipeline.

Topic: [topic]
How many papers should I search? [default: 30 / enter number]:
```

Then immediately run Phase 1 silently.

---

### ⚡ PHASE 1 — Auto Discovery (no pauses)

Run all at once, show live ticker:
```
🔍 Searching Semantic Scholar...           ✅ 30 results
📥 Downloading PDFs from arXiv...          ✅ 28 PDFs
🕸️  Building citation graph...              ✅ 12 foundational papers
📊 Ranking by citations...                 ✅ done
📖 Parsing PDFs...                         ✅ 28 papers parsed
```

Scripts:
```bash
python3 semantic_search.py "$TOPIC" $N semantic_results.json
python3 arxiv_downloader.py "$TOPIC" $N papers_pdf
python3 citation_graph.py papers_pdf/metadata.json
python3 semantic_ranker.py papers_pdf/
python3 pdf_parser.py papers_pdf/ $N
python3 logger.py "$PROJ" "Phase 1 complete"
```

---

### ⏸ PAUSE 1 — Gap Selection (YOU decide)

Run gap detection, then stop and show results:
```bash
python3 gap_detector.py notes.md
```

Display:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ⏸ PAUSE 1/3 — Which gaps interest you?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

🔬 Found [N] research gaps:

  1. ★★★  [most relevant gap — high impact]
  2. ★★★  [gap]
  3. ★★☆  [gap]
  4. ★★☆  [gap]
  5. ★☆☆  [gap]
  ...

Also found [N] foundational papers you must cite:
  → [title 1], [title 2], [title 3]

Which gaps do you want to explore?
→ Enter numbers (e.g. 1,3) or "all" or "top3":
```

Wait for input. Save selected gaps to `filtered_gaps.md`. Then immediately continue.
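
The reply formats accepted at this pause (`1,3`, `all`, `top3`) can be parsed with a small helper. This is a sketch only: `parse_gap_selection` is a hypothetical name, and treating `topN` as "the first N gaps in the displayed order" is an assumption.

```python
import re


def parse_gap_selection(reply: str, n_gaps: int) -> list[int]:
    """Parse replies like "1,3", "all", or "top3" into 1-based gap numbers."""
    text = reply.strip().lower()
    top = re.fullmatch(r"top\s*(\d+)", text)
    if text == "all":
        picks = list(range(1, n_gaps + 1))
    elif top:
        # Assumption: gaps are displayed best-first, so "topN" means the first N.
        picks = list(range(1, min(int(top.group(1)), n_gaps) + 1))
    else:
        picks = []
        for token in re.split(r"[,\s]+", text):
            if not token:
                continue
            if not token.isdecimal() or not 1 <= int(token) <= n_gaps:
                raise ValueError(f"Unrecognised gap number: {token!r}")
            if int(token) not in picks:
                picks.append(int(token))
    if not picks:
        raise ValueError("No gaps selected; ask the user again.")
    return picks
```

Invalid input raises instead of defaulting to a guess, which matches the "if uncertain, stop and ask" rule.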

---

### ⚡ PHASE 2 — Auto Ideas (no pauses)

```
💡 Generating ideas for your [N] selected gaps...   ✅ 5 ideas ready
```

```bash
python3 idea_generator.py filtered_gaps.md
python3 logger.py "$PROJ" "Ideas generated"
```

---

### ⏸ PAUSE 2 — Idea Selection (YOU decide)

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ⏸ PAUSE 2/3 — Which idea do you want to pursue?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

┌─────────────────────────────────────────────┐
│ 💡 IDEA 1                                   │
│ Title:   [title]                            │
│ Problem: [gap it addresses]                 │
│ Method:  [specific technical approach]      │
│ Venue:   [e.g. IEEE TIFS / NeurIPS]         │
│ Novelty: [why this hasn't been done]        │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│ 💡 IDEA 2  ...                              │
└─────────────────────────────────────────────┘

→ Which idea? (number / "generate more" / "combine 1 and 3"):
```

Save chosen idea to memory. Then immediately continue.

---

### ⚡ PHASE 3 — Auto Planning (no pauses)

```
🧪 Building experiment plan for: [idea title]...    ✅ done
```

Output full experiment plan inline (baselines, datasets, metrics, timeline, compute estimate).

---

### ⏸ PAUSE 3 — Paper Type (YOU decide)

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ⏸ PAUSE 3/3 — What kind of paper?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  [1] 📄 Survey paper      — literature only, no experiments needed
  [2] 🔬 Research paper    — I have real experimental results
  [3] 📝 Specific section  — just write one part for now

Do you have real data/results to include? [yes/no]

→ Choose:
```

- **If [1] Survey:** proceed immediately to Phase 4.
- **If [2] Research + has data:** share `experiment_data_template.json`, wait for data, then Phase 4.
- **If [2] Research + NO data:** → run **RESEARCH ROADMAP MODE** below.

---

### 🗺️ RESEARCH ROADMAP MODE (no data yet)

Triggered when: user wants research paper but has no experimental results.

#### Step R1 — Understand their setup

Ask these questions **one by one** (not all at once):

```
🔬 No problem — let's build your research roadmap.
I'll create a complete step-by-step plan to get you
from zero to a publishable research paper.

First, I need to understand your setup.
```

Ask:
1. "What machine/GPU do you have? (e.g. RTX 3090, A100, MacBook, cloud GPU)"
2. "What OS? (Linux / Windows / macOS)"
3. "Do you have Python + PyTorch already set up? [yes/no]"
4. "Do you have access to the datasets? (e.g. DiffusionDB, LAION, custom) [yes/no/unsure]"
5. "How much time do you have? (e.g. 2 weeks, 1 month, 3 months)"
6. "What is your coding level? [beginner / intermediate / advanced]"

Save to memory:
```bash
python3 session_memory.py save "$PROJ" decisions "Machine: [GPU] | OS: [OS] | Time: [time] | Level: [level]"
```

---

#### Step R2 — Generate Full Research Flowchart

After collecting setup, generate and display the complete research tree:

```
╔══════════════════════════════════════════════════════════════╗
║  🗺️  RESEARCH ROADMAP — [idea title]                         ║
║  Estimated total time: [X weeks]                             ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  PHASE A — Environment Setup          [est. X days]          ║
║  ├── A1. Install dependencies                                ║
║  ├── A2. Download base models                                ║
║  └── A3. Verify GPU/compute works                            ║
║                                                              ║
║  PHASE B — Baseline Implementation    [est. X days]          ║
║  ├── B1. Implement/clone baseline 1 ([method])               ║
║  ├── B2. Implement/clone baseline 2 ([method])               ║
║  ├── B3. Run baseline experiments                            ║
║  └── B4. Record baseline numbers                             ║
║                                                              ║
║  PHASE C — Your Method                [est. X days]          ║
║  ├── C1. Implement proposed approach                         ║
║  ├── C2. Train on [dataset]                                  ║
║  ├── C3. Evaluate on [metrics]                               ║
║  └── C4. Ablation study                                      ║
║                                                              ║
║  PHASE D — Analysis                   [est. X days]          ║
║  ├── D1. Compare against baselines                           ║
║  ├── D2. Generate figures + tables                           ║
║  └── D3. Statistical significance tests                      ║
║                                                              ║
║  PHASE E — Paper Writing              [est. X days]          ║
║  ├── E1. Fill experiment_data_template.json                  ║
║  ├── E2. EVE generates figures + tables                      ║
║  ├── E3. EVE writes full LaTeX paper                         ║
║  └── E4. Review + submit                                     ║
║                                                              ║
╚══════════════════════════════════════════════════════════════╝

📋 Current status: Phase A — Not started

→ Ready to begin? I'll guide you through each step. [yes/no]
```

Save roadmap to:
```
research/<slug>/roadmap.md
```

---

#### Step R3 — Step-by-Step Execution (resumable)

Track progress in `roadmap_progress.json`:
```json
{
  "current_phase": "A",
  "current_step": "A1",
  "completed": [],
  "blocked": [],
  "last_updated": "2026-03-19"
}
```

**At every step, EVE:**
1. Explains what needs to be done
2. Provides exact commands to run (no guessing)
3. Verifies the step completed successfully
4. Marks it done in `roadmap_progress.json`
5. Moves to next step automatically
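
Steps 4 and 5 amount to a small state update on `roadmap_progress.json`. A minimal sketch, assuming the step order of the roadmap template above and the JSON fields shown; `mark_step_done` and `save_progress` are hypothetical helpers:

```python
import json
from pathlib import Path

# Step order taken from the roadmap template (phases A to E).
STEP_ORDER = [
    "A1", "A2", "A3",
    "B1", "B2", "B3", "B4",
    "C1", "C2", "C3", "C4",
    "D1", "D2", "D3",
    "E1", "E2", "E3", "E4",
]


def mark_step_done(progress: dict, step: str, today: str) -> dict:
    """Return a new progress dict with `step` completed and the pointer advanced."""
    if step != progress["current_step"]:
        raise ValueError(f"Expected {progress['current_step']}, got {step}")
    # Drop empty placeholder entries before appending the finished step.
    completed = [s for s in progress["completed"] if s] + [step]
    idx = STEP_ORDER.index(step)
    next_step = STEP_ORDER[idx + 1] if idx + 1 < len(STEP_ORDER) else None
    return {
        **progress,
        "completed": completed,
        "current_step": next_step,
        # The phase is the letter prefix of the step ID; None once E4 is done.
        "current_phase": next_step[0] if next_step else None,
        "last_updated": today,
    }


def save_progress(progress: dict, path: Path) -> None:
    """Persist progress so the next session can resume from it."""
    path.write_text(json.dumps(progress, indent=2), encoding="utf-8")
```

Refusing to mark any step other than the current one keeps the file consistent with the rule that a step is verified before the next one starts.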

Example — Step A1:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 📍 STEP A1 — Install Dependencies
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Based on your setup (RTX 3090, Linux, Python installed):

Run these commands:
  pip install torch torchvision diffusers transformers
  pip install accelerate datasets pypdf requests

Done? [yes / error: paste it here]
```

→ If **yes**: mark A1 complete, move to A2
→ If **error**: diagnose + fix + retry before moving on

Example — Step B4 (data collection):
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 📍 STEP B4 — Record Baseline Numbers
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Please share your baseline results.

Expected format:
  Method: HiDDeN
  BER: 0.31
  Bit Accuracy: 41.2%
  PSNR: 34.2

Paste your results or upload your CSV/JSON:
```

→ EVE saves results directly into `experiment_data_template.json`
→ No manual template filling needed

---

#### Step R4 — Smart Resume (next session)

When user returns to this project:
```bash
python3 session_memory.py summary "$PROJ"
```

EVE detects roadmap progress and says:
```
📍 Resuming your research roadmap...

✅ Completed: A1, A2, A3, B1, B2
⏳ In progress: B3 — Run baseline experiments
⬜ Remaining: B4, C1, C2, C3, C4, D1, D2, D3, E1, E2, E3, E4

Picking up from Step B3. Ready? [yes/no]
```
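
The three lists in the resume message follow directly from the step order and the saved progress. A sketch under the assumption that progress is stored as a list of completed step IDs plus one current step; `resume_summary` is a hypothetical helper:

```python
def resume_summary(step_order: list[str], completed: list[str], current: str) -> dict:
    """Split the roadmap into completed, in-progress, and remaining steps."""
    done_set = set(completed)
    done = [s for s in step_order if s in done_set]
    remaining = [s for s in step_order if s not in done_set and s != current]
    return {"completed": done, "in_progress": current, "remaining": remaining}


steps = ["A1", "A2", "A3", "B1", "B2", "B3", "B4", "C1", "C2", "C3", "C4",
         "D1", "D2", "D3", "E1", "E2", "E3", "E4"]
summary = resume_summary(steps, ["A1", "A2", "A3", "B1", "B2"], "B3")
print("Completed:", ", ".join(summary["completed"]))
print("In progress:", summary["in_progress"])
print("Remaining:", ", ".join(summary["remaining"]))
```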

---

#### Step R5 — Missing Items During Process

If something is missing mid-pipeline, EVE **stops immediately** and asks:
```
⚠️ BLOCKED — Step C2 needs: [DiffusionDB dataset]

This is required to continue. Options:
  [1] Download it now (I'll give you the command)
  [2] Use a smaller substitute dataset (I'll suggest one)
  [3] Skip this step and continue with limitations
  [4] Pause — I'll save progress and we resume later

→ Choose:
```

EVE never proceeds past a blocker silently. Progress is always saved before pausing.

---

### ⚡ PHASE 4 — Auto Write (no pauses)

```
✍️  Writing paper...
   abstract...        ✅
   introduction...    ✅
   related work...    ✅
   methodology...     ✅
   results...         ✅
   conclusion...      ✅
🧠 Saving to memory... ✅
```

```bash
python3 paper_writer.py [survey|research] notes.md "$TOPIC" paper.tex "$AUTHOR" [data.json]
python3 session_memory.py sync "$PROJ" papers_pdf/
python3 session_memory.py save "$PROJ" decisions "Chose idea: [title]"
python3 logger.py "$PROJ" "Pipeline complete"
```

---

### ✅ Final Report

```
╔══════════════════════════════════════════════════╗
║  ✅ EVE SEMI-AUTO COMPLETE                       ║
╠══════════════════════════════════════════════════╣
║  📥 Papers:            [N]                       ║
║  🕸️  Foundational:      [N] (must-cite)           ║
║  🔬 Gaps found:        [N]  → you picked [N]     ║
║  💡 Idea chosen:       [title]                   ║
║  📝 Paper:             [filename] ([N] lines)    ║
║  📁 Project:           research/[slug]/          ║
╠══════════════════════════════════════════════════╣
║  NEXT STEPS                                      ║
║  • Review paper.tex — validate all sections      ║
║  • Add real data → upgrade to research paper     ║
║  • Compile: pdflatex [filename]                  ║
║  • Next session: I'll remember everything 🔴     ║
╚══════════════════════════════════════════════════╝
```

---

## 🔧 MODE 3 — MANUAL

**Trigger:** user says `"3"` / `"manual"` / `"command mode"`

Show command card on activation:
```
╔══════════════════════════════════════════════════════╗
║  🔧 MANUAL MODE — EVE Command Reference              ║
╠══════════════════════════════════════════════════════╣
║  SEARCH                                              ║
║   search <topic>              Semantic Scholar       ║
║   download <topic> [N]        Download N PDFs        ║
║                                                      ║
║  ANALYSIS                                            ║
║   citation graph              Build who-cites-whom   ║
║   rank papers                 Rank by citations      ║
║   parse papers                Extract PDF content    ║
║                                                      ║
║  INTELLIGENCE                                        ║
║   find gaps                   Detect research gaps   ║
║   gaps for [topic]            Topic-specific gaps    ║
║   generate ideas              From all gaps          ║
║   ideas for gap [N]           For specific gap       ║
║                                                      ║
║  WRITING                                             ║
║   write survey                Full survey paper      ║
║   write research paper        With real data         ║
║   write [section]             One section only       ║
║   generate figures            From data file         ║
║                                                      ║
║  MEMORY                                              ║
║   show projects               List all projects      ║
║   show progress               Current project state  ║
║   analyze paper <title>       Deep single-paper read ║
║   save <note>                 Save to memory         ║
║                                                      ║
║  SWITCH                                              ║
║   auto                        Switch to auto mode    ║
║   semi                        Switch to semi mode    ║
║   help                        Show this card again   ║
╚══════════════════════════════════════════════════════╝

Ready. What's your command?
```

**Rules in manual mode:**
- Execute **one command only** per message
- After each command: show result + stop
- **Never chain** to next step automatically
- If command is ambiguous → ask for clarification before running
- User can type `"semi"` or `"auto"` anytime to switch mode

---

## 📊 REAL DATA → FIGURES + TABLES

When user has real experimental results:

1. Share template:
```
~/.openclaw/workspace/research-supervisor-pro/templates/experiment_data_template.json
```

2. Template supports:
   - Line plots (training curves, convergence)
   - Multi-curve plots (compare methods)
   - Bar charts (metric comparison)
   - LaTeX comparison tables
   - Ablation study tables

3. Run:
```bash
python3 paper_writer.py research "<topic>" my_data.json paper.tex "Author" "Venue"
```

4. Output: figures auto-generated + auto-inserted into LaTeX

---

## 🕸️ CITATION GRAPH USAGE

After `citation_graph.py` runs:
- 🟢 Green nodes = your downloaded papers
- 🟠 Orange nodes = foundational papers (cited by 2+ in your set) → **must cite**
- 🔵 Blue nodes = other referenced papers

```bash
# Visualize (requires graphviz)
dot -Tpng citation_graph.dot -o citation_graph.png
```

Read `citation_graph_summary.md` — foundational papers go in your Related Work.
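
The "cited by 2+ in your set" rule can be illustrated with a short counting sketch. The input shape (a mapping from each downloaded paper to the papers it references) and the name `find_foundational` are assumptions; the sketch does not reproduce how `citation_graph.py` stores or colors nodes.

```python
from collections import Counter


def find_foundational(references: dict[str, list[str]], min_citers: int = 2) -> list[str]:
    """Return papers cited by at least `min_citers` distinct downloaded papers."""
    counts = Counter()
    for paper, cited in references.items():
        # set() so a paper that lists the same reference twice counts once.
        for ref in set(cited):
            if ref != paper:  # ignore self-citations
                counts[ref] += 1
    return sorted(p for p, n in counts.items() if n >= min_citers)


refs = {
    "paper_a": ["hidden_2018", "stegastamp_2020"],
    "paper_b": ["hidden_2018", "hidden_2018"],
    "paper_c": ["tree_ring_2023"],
}
print(find_foundational(refs))  # ['hidden_2018']
```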

---

## 📄 PAPER ANALYSIS FORMAT

When analyzing any paper:
```
## Paper: <Title>
- Problem:    What problem does it solve?
- Method:     What approach do they use?
- Results:    Key numbers / findings
- Strengths:  What works well?
- Weaknesses: What fails or is missing?
- Relevance:  How does this relate to user's research?
- Gap:        What open problem does this suggest?
```

---

## 🧪 EXPERIMENT PLAN FORMAT

```
## Experiment Plan: <Idea Title>
- Hypothesis:      What we expect to show
- Baselines:       [3-5 existing methods to compare]
- Dataset:         [specific datasets]
- Metrics:         [evaluation metrics]
- Ablation:        [components to ablate]
- Expected Result: [realistic improvement range]
- Timeline:        [milestones]
- Compute:         [GPU hours / VRAM estimate]
```

---

## 📚 FEATURE 3 — AUTO BIBLIOGRAPHY

EVE generates a complete `.bib` file automatically from every paper it downloads.
**No manual citation work ever.**

### When to run:
After `arxiv_downloader.py` completes — run bib generation immediately:
```bash
python3 bib_generator.py papers_pdf/metadata.json references.bib
```

### Output:
- `references.bib` — ready to use in LaTeX (`\bibliography{references}`)
- `cite_map.json` — auto-used by `paper_writer.py` to replace `\cite{AuthorYear}` placeholders
- `cite_cheatsheet.md` — quick `\cite{Key}` reference for manual editing

### In LaTeX paper (auto-added by paper_writer.py):
```latex
\bibliographystyle{plain}
\bibliography{references}
```

### BibTeX key format:
```
FirstAuthorLastNameYEARKeyword
e.g. \cite{Wen2023TreeRing}
     \cite{Zhu2018HiDDeN}
```
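
A key in this shape can be assembled from the first author, year, and title. The sketch below is illustrative only: how `bib_generator.py` actually picks the keyword is not stated here, so taking the first title word with punctuation removed is an assumption, as is the helper name `make_bibtex_key`.

```python
import re


def make_bibtex_key(first_author: str, year: int, title: str) -> str:
    """Build a key shaped like FirstAuthorLastNameYEARKeyword."""
    name_parts = first_author.split()
    if not name_parts or not title.strip():
        raise ValueError("Author and title are required to build a key.")
    # Assumption: the last whitespace-separated token is the family name.
    last_name = re.sub(r"[^A-Za-z]", "", name_parts[-1])
    # Assumption: the keyword is the first title word, punctuation removed.
    first_word = re.split(r"[\s:]+", title.strip())[0]
    keyword = re.sub(r"[^A-Za-z0-9]", "", first_word)
    return f"{last_name}{year}{keyword}"
```

With these assumptions, a first title word of `Tree-Ring` yields the keyword `TreeRing` and `HiDDeN:` yields `HiDDeN`, which matches the two example keys above.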

**Add to Auto + Semi-Auto pipeline** after Step 2 (download):
```bash
python3 bib_generator.py papers_pdf/metadata.json references.bib
python3 logger.py "$PROJ" "Bibliography generated"
```

---

## 🎓 FEATURE 4 — THESIS CONTEXT FILE

EVE reads your specific thesis context to make gap detection and ideas **targeted to YOUR research**, not generic.

### Setup (first time only):
```bash
python3 thesis_context.py init
```
Asks for: thesis title, your claim, baseline paper, baseline result, your method, attack types, datasets, metrics, venue, supervisor, deadline.

### View current context:
```bash
python3 thesis_context.py show
```

### Update a field:
```bash
python3 thesis_context.py update baseline_result "41.2% bit accuracy"
```

### How EVE uses it:
Before running `gap_detector.py` or `idea_generator.py`, inject thesis context:
```bash
THESIS_CONTEXT=$(python3 thesis_context.py export)
# Pass as additional context to LLM calls
```

This makes gap detection say:
> "Gap: No defense exists against Type 2 (partial regeneration) attacks on HiDDeN"

Instead of generic:
> "Gap: Robustness is limited"

---

## 📋 FEATURE 6 — VENUE-SPECIFIC CHECKLISTS

Before writing any paper, EVE generates a checklist for the target venue.
**Never miss a requirement.**

### Supported venues:
- `ieee tifs` — IEEE Transactions on Information Forensics and Security
- `neurips` — Neural Information Processing Systems
- `cvpr` — IEEE/CVF CVPR
- `iccv` — ICCV
- `acm mm` — ACM Multimedia
- `ieee tsp` — IEEE Transactions on Signal Processing
- `thesis` — Master's/PhD Thesis

### Show checklist:
```bash
python3 venue_checklist.py "ieee tifs"
```

### Save to project:
```bash
python3 venue_checklist.py check <project> "ieee tifs"
```
Saves `venue_checklist.md` to your project folder.

### When to use:
- At PAUSE 3 (paper type selection) — always show checklist for chosen venue
- Before paper writing starts — confirm all requirements are understood
- After paper is written — review checklist to catch missing items

---

## 🖥️ FEATURE 8 — SSH/SLURM SERVER MONITORING

Connect to your GPU server and monitor experiments without leaving EVE.

### Setup (one time):
```bash
python3 server_monitor.py setup
# Enter: hostname, username, SSH key, working directory
```

### Commands:
```bash
python3 server_monitor.py status          # full server status (GPU + jobs + disk)
python3 server_monitor.py jobs            # list your running SLURM jobs
python3 server_monitor.py gpu             # GPU memory and utilization
python3 server_monitor.py watch <job_id>  # watch job log live
python3 server_monitor.py pull <job_id> <remote_path>  # pull results
python3 server_monitor.py run <script.sh> # submit SLURM job
```

### When user says:
- "check my server" → run `server_monitor.py status`
- "check my jobs" → run `server_monitor.py jobs`
- "check GPU" → run `server_monitor.py gpu`
- "watch job 12345" → run `server_monitor.py watch 12345`
- "pull results" → run `server_monitor.py pull <job_id> <remote_path>` (ask for the job ID and remote path if not given)

---

## 🔔 FEATURE 10 — REAL-TIME EXPERIMENT ALERTS

EVE watches your training jobs and alerts you when something happens.
**Auto-extracts metrics and updates your data template.**

### Watch a job with milestone alert:
```bash
# Alert when BER drops below 0.1
python3 experiment_alert.py watch 12345 --metric BER --threshold 0.1 --project my_thesis

# Just poll every 60 seconds
python3 experiment_alert.py poll 12345 --interval 60
```

### Parse a log file:
```bash
python3 experiment_alert.py parse logs/job_12345.out
```
Extracts: BER, BitAcc, PSNR, SSIM, Loss, Epoch, errors
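
To make the extraction step concrete, here is a minimal regex sketch. It assumes log lines of the form `Name: value` or `Name=value` (for example `Epoch: 3 Loss=0.42 BER: 0.12`); real training logs may use a different layout, and `parse_log` / `milestone_hit` are hypothetical names, not the internals of `experiment_alert.py`.

```python
import re

METRICS = ("BER", "BitAcc", "PSNR", "SSIM", "Loss", "Epoch")

# Assumed log style: metric name, then ':' or '=', then a number.
_PATTERN = re.compile(
    r"\b(" + "|".join(METRICS) + r")\b\s*[:=]\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_log(text: str) -> dict[str, float]:
    """Return the last value seen for each metric in the log text."""
    latest = {}
    for name, value in _PATTERN.findall(text):
        latest[name] = float(value)
    return latest


def milestone_hit(metrics: dict[str, float], metric: str, threshold: float) -> bool:
    """True when the metric is present and has dropped below the threshold."""
    return metric in metrics and metrics[metric] < threshold
```

The "below threshold" comparison suits error-style metrics such as BER; a metric where higher is better (PSNR, SSIM) would need the opposite comparison.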

### Auto-update data template from log:
```bash
python3 experiment_alert.py update my_thesis logs/job_12345.out
```
→ Reads your training log → extracts metrics → fills `experiment_data.json` automatically
→ Then `paper_writer.py` can generate figures from real data immediately

### Detects automatically:
- ✅ Training completed
- 🎯 Metric milestone hit (e.g. BER < 0.1)
- ❌ Crash / OOM / NaN loss
- 📊 Epoch progress updates

### When user says:
- "watch my experiment" → `experiment_alert.py watch <job_id>`
- "is training done?" → `experiment_alert.py poll <job_id>`
- "parse my training log" → `experiment_alert.py parse <file>`
- "update my data from log" → `experiment_alert.py update <project> <file>`

---

## 🔑 API — ZERO SETUP ON PETCLAW

LLM steps use **PetClaw built-in API** automatically:
- Key: `brainApiKey` from `~/.petclaw/petclaw-settings.json`
- URL: `brainApiUrl` from same file
- Model: `brainModel` from same file
- **No setup needed** — works out of the box

Fallback order:
1. PetClaw built-in ← **default, zero setup**
2. `OPENAI_API_KEY` env var
3. Keyword-only (offline fallback)
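
The fallback order can be sketched as a single resolver. The key names (`brainApiKey`, `brainApiUrl`, `brainModel`) and the `OPENAI_API_KEY` variable come from the text above; the function name `resolve_llm_config`, the returned dictionary shape, and the requirement that all three PetClaw keys be present are assumptions.

```python
import json
import os
from pathlib import Path
from typing import Mapping


def resolve_llm_config(settings_path: Path, env: Mapping[str, str]) -> dict:
    """Pick an LLM backend following the documented fallback order."""
    # 1. PetClaw built-in settings file (missing or invalid file falls through).
    try:
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        settings = {}
    if not isinstance(settings, dict):
        settings = {}
    if all(settings.get(k) for k in ("brainApiKey", "brainApiUrl", "brainModel")):
        return {
            "backend": "petclaw",
            "api_key": settings["brainApiKey"],
            "api_url": settings["brainApiUrl"],
            "model": settings["brainModel"],
        }
    # 2. OPENAI_API_KEY environment variable.
    if env.get("OPENAI_API_KEY"):
        return {"backend": "openai", "api_key": env["OPENAI_API_KEY"]}
    # 3. Offline keyword-only fallback.
    return {"backend": "keyword-only"}
```

Typical use would be `resolve_llm_config(Path.home() / ".petclaw" / "petclaw-settings.json", os.environ)`; passing the environment in explicitly keeps the function easy to test.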

---

## 🧠 MEMORY RULES

- **Always read memory before acting**
- **Always update memory after major steps**
- Memory files live in:
  ```
  ~/.openclaw/workspace/research-supervisor-pro/memory/
  ~/.openclaw/workspace/research-supervisor-pro/research/<project>/memory.md
  ```

Commands:
```bash
python3 session_memory.py summary <project>   # view project state
python3 session_memory.py list                # list all projects
python3 session_memory.py save <p> decisions "Chose HiDDeN as baseline"
python3 session_memory.py save <p> next_steps "Run ablation on patch size"
python3 session_memory.py sync <p> papers_pdf/
```

---

## ⚠️ CRITICAL RULES

1. **Never fabricate** results, citations, or data
2. **Always cite** — arXiv IDs, paper titles, `\cite{}` in LaTeX
3. **Memory first** — check memory before every action
4. **Pause for the user** at the three decision points in Semi-Auto mode (gap selection, idea selection, paper type); technical steps run without approval
5. **One step at a time** in Manual mode — never auto-chain
6. **If uncertain → STOP → ASK USER. Never proceed blindly.**

FILE:README.md
# 🔴 EVE — Research Supervisor Pro

> Persistent AI Research Supervisor Agent — from your first idea to a publication-ready LaTeX paper.

---

## 🎛️ Three Modes

| Mode | Description | Best For |
|---|---|---|
| 🤖 **Auto** | Topic in → full paper out, ~15 min, no interruptions | Quick exploration, first pass |
| 🎯 **Semi-Auto** | Auto pipeline with 3 smart pauses for your decisions | Thesis work, serious research |
| 🔧 **Manual** | One command at a time, full control | Advanced users, specific tasks |

---

## ✨ Features

| Feature | Description |
|---|---|
| 🔍 Semantic Search | Find papers by meaning via Semantic Scholar |
| 📥 PDF Download | Download real PDFs from arXiv with metadata |
| 🕸️ Citation Graph | Build who-cites-whom network, find foundational papers |
| 📊 Citation Ranking | Rank papers by citation count |
| 🔬 Gap Detection | LLM-powered research gap analysis (★★★ ranked) |
| 💡 Idea Generation | Generate specific, publishable research ideas |
| 📝 Paper Writing | Full LaTeX papers — survey + research |
| 📈 Real Data → Figures | Feed your results → auto-generate matplotlib graphs |
| 📋 Real Data → Tables | Comparison + ablation tables in LaTeX |
| 📚 Auto Bibliography | Every paper → instant BibTeX entry, zero manual work |
| 🎓 Thesis Context | Your specific research context for targeted suggestions |
| 📋 Venue Checklists | Requirements for IEEE TIFS, NeurIPS, CVPR, thesis, and more |
| 🖥️ Server Monitor | SSH into GPU server, check jobs, GPU usage, pull results |
| 🔔 Experiment Alerts | Watch training, alert on completion/crash/milestone |
| 🗺️ Research Roadmap | Full step-by-step plan when you have no data yet |
| 🧠 Session Memory | Remembers your research across sessions and weeks |

---

## 🚀 Install from GitHub

### One-line install (copy + paste)
```bash
git clone https://github.com/amzayn/eve-research-supervisor-pro && cd eve-research-supervisor-pro && bash install.sh && mkdir -p ~/.openclaw/workspace/skills/eve-research-supervisor-pro && cp SKILL.md ~/.openclaw/workspace/skills/eve-research-supervisor-pro/SKILL.md && echo "✅ EVE installed!"
```

### Step by step
```bash
# 1. Download
git clone https://github.com/amzayn/eve-research-supervisor-pro
cd eve-research-supervisor-pro

# 2. Install scripts + dependencies
bash install.sh

# 3. Register skill with OpenClaw
mkdir -p ~/.openclaw/workspace/skills/eve-research-supervisor-pro
cp SKILL.md ~/.openclaw/workspace/skills/eve-research-supervisor-pro/SKILL.md

# 4. Talk to your AI
# "EVE, start research mode"
```

### Verify installation
```bash
ls ~/.openclaw/workspace/research-supervisor-pro/scripts/
# Should show 18 scripts
```
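
If you prefer a scripted check over counting by eye, a minimal sketch is shown here. The script names are copied from the "File Structure" listing in this README; the helper name `missing_scripts` is hypothetical and is not shipped with EVE.

```python
import os

# Script names copied from the "File Structure" listing in this README
EXPECTED_SCRIPTS = [
    "arxiv_downloader.py", "semantic_search.py", "semantic_ranker.py",
    "citation_graph.py", "pdf_parser.py", "gap_detector.py",
    "idea_generator.py", "paper_writer.py", "build_survey.py",
    "bib_generator.py", "thesis_context.py", "venue_checklist.py",
    "server_monitor.py", "experiment_alert.py", "roadmap_tracker.py",
    "project_init.py", "session_memory.py", "logger.py",
]


def missing_scripts(scripts_dir):
    """Return the expected script names that are not present in scripts_dir."""
    scripts_dir = os.path.expanduser(scripts_dir)
    present = set(os.listdir(scripts_dir)) if os.path.isdir(scripts_dir) else set()
    return [name for name in EXPECTED_SCRIPTS if name not in present]


# Example: missing_scripts("~/.openclaw/workspace/research-supervisor-pro/scripts")
# returns an empty list when every listed script was installed.
```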

### First time setup (optional but recommended)
```bash
# Set your thesis context for targeted suggestions
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/thesis_context.py init

# Configure your GPU server for monitoring
python3 ~/.openclaw/workspace/research-supervisor-pro/scripts/server_monitor.py setup
```

## 🚀 Quick Start

### First time — set your thesis context
```bash
python3 scripts/thesis_context.py init
```

### Run full auto pipeline
```bash
python3 scripts/arxiv_downloader.py "watermark diffusion models" 30
python3 scripts/bib_generator.py
python3 scripts/citation_graph.py
python3 scripts/semantic_ranker.py
python3 scripts/pdf_parser.py
python3 scripts/gap_detector.py
python3 scripts/idea_generator.py
python3 scripts/paper_writer.py survey notes.md "Your Topic" paper.tex "Your Name"
```
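
The commands above run independently, so a failed download would not stop the later steps. A minimal sketch of chaining them and halting at the first failure, assuming each script exits non-zero on error (as `arxiv_downloader.py` does via `sys.exit(1)`); `run_pipeline` and `STEPS` are hypothetical names, not part of the repository:

```python
import subprocess
import sys


def run_pipeline(steps, runner=subprocess.run):
    """Run each command in order; stop at the first non-zero exit code.

    Returns the list of commands that completed successfully.
    """
    done = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            print(f"Step failed ({result.returncode}): {' '.join(cmd)}")
            break
        done.append(cmd)
    return done


# Same order as the commands listed above (first three steps shown)
STEPS = [
    [sys.executable, "scripts/arxiv_downloader.py", "watermark diffusion models", "30"],
    [sys.executable, "scripts/bib_generator.py"],
    [sys.executable, "scripts/citation_graph.py"],
]

# Usage, from the repository root: run_pipeline(STEPS)
```

The `runner` parameter exists only so the control flow can be tried with a stub instead of real subprocesses.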

### Or just tell your AI agent:
```
"EVE, research watermark diffusion models — semi-auto mode"
```

---

## 🗺️ Research Roadmap (No Data Yet?)

EVE asks about your machine, timeline, and coding level — then generates a full roadmap:

```
PHASE A — Environment Setup      (1-2 days)
PHASE B — Baseline Implementation (3-5 days)
PHASE C — Your Method             (5-7 days)
PHASE D — Analysis                (2-3 days)
PHASE E — Paper Writing           (EVE handles this)
```

Step-by-step execution, smart resume across sessions, blocker handling.

---

## 📊 Real Data → Paper

Fill the template with your results:
```json
{
  "experiments": [{"name": "BER vs Epochs", "x": [...], "y": [...]}],
  "comparisons": [{"title": "vs Baselines", "methods": [...], "values": [...]}],
  "tables": [{"caption": "Results", "headers": [...], "rows": [[...]]}]
}
```

Run and get auto-generated figures + LaTeX tables in your paper:
```bash
python3 scripts/paper_writer.py research "Topic" data.json paper.tex "Author" "IEEE TIFS"
```
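
The `data.json` file can also be assembled from Python rather than written by hand. The sketch below uses only the keys shown in the template above; the length checks are an assumption about what makes the data plottable, not documented behaviour of `paper_writer.py`, and `make_experiment_data` is a hypothetical helper.

```python
import json


def make_experiment_data(experiments=(), comparisons=(), tables=()):
    """Assemble a dict with the three top-level keys shown in the template."""
    for exp in experiments:
        # Each curve needs as many y values as x values
        if len(exp["x"]) != len(exp["y"]):
            raise ValueError(f"x/y length mismatch in experiment {exp['name']!r}")
    for comp in comparisons:
        # One value per compared method
        if len(comp["methods"]) != len(comp["values"]):
            raise ValueError(f"methods/values mismatch in comparison {comp['title']!r}")
    for table in tables:
        # Every row must have one cell per header
        for row in table["rows"]:
            if len(row) != len(table["headers"]):
                raise ValueError(f"row width mismatch in table {table['caption']!r}")
    return {
        "experiments": list(experiments),
        "comparisons": list(comparisons),
        "tables": list(tables),
    }


def save_experiment_data(data, path="data.json"):
    """Write the assembled dict as indented JSON."""
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
```

Fill the lists with your own measured values before calling `save_experiment_data`; the sketch does not supply any numbers.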

---

## 🖥️ Server Monitoring

```bash
python3 scripts/server_monitor.py setup    # one-time config
python3 scripts/server_monitor.py status   # GPU + jobs + disk
python3 scripts/server_monitor.py watch 12345  # live job log
```

## 🔔 Experiment Alerts

```bash
# Alert when BER drops below 0.1
python3 scripts/experiment_alert.py watch 12345 --metric BER --threshold 0.1

# Auto-update data template from training log
python3 scripts/experiment_alert.py update my_project logs/train.out
```
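
To illustrate what a metric threshold check involves, here is a minimal sketch. It assumes log lines of the form `BER=0.12` or `BER: 0.12`; the real log format and the actual logic of `experiment_alert.py` may differ, and `first_line_below` is a hypothetical name.

```python
import re


def first_line_below(log_lines, metric="BER", threshold=0.1):
    """Return (line_index, value) for the first line where metric < threshold.

    Returns None if the threshold is never crossed. Lines without a
    parsable numeric value for the metric (e.g. "BER=nan") are skipped.
    """
    pattern = re.compile(
        rf"{re.escape(metric)}\s*[=:]\s*([0-9]*\.?[0-9]+(?:[eE][-+]?\d+)?)"
    )
    for idx, line in enumerate(log_lines):
        match = pattern.search(line)
        if match and float(match.group(1)) < threshold:
            return idx, float(match.group(1))
    return None
```

For example, given the lines `"epoch 1 BER=0.40"`, `"epoch 2 BER=0.12"`, `"epoch 3 BER=0.08"` and the default threshold, the function returns the index and value of the third line.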

---

## 📁 File Structure

```
eve-research-supervisor-pro/
├── SKILL.md                          ← AI agent instructions
├── README.md                         ← This file
├── package.json                      ← ClawHub manifest
├── config.yaml                       ← Configuration
├── install.sh                        ← Auto installer
├── scripts/
│   ├── arxiv_downloader.py           Search & download papers
│   ├── semantic_search.py            Semantic Scholar search
│   ├── semantic_ranker.py            Rank by citations
│   ├── citation_graph.py             Build citation network
│   ├── pdf_parser.py                 Extract PDF content
│   ├── gap_detector.py               LLM gap detection
│   ├── idea_generator.py             LLM idea generation
│   ├── paper_writer.py               Full LaTeX paper writer
│   ├── build_survey.py               Survey builder
│   ├── bib_generator.py              Auto BibTeX generation
│   ├── thesis_context.py             Thesis context manager
│   ├── venue_checklist.py            Venue requirements
│   ├── server_monitor.py             SSH/SLURM monitoring
│   ├── experiment_alert.py           Training job alerts
│   ├── roadmap_tracker.py            Research roadmap tracker
│   ├── project_init.py               Project folder setup
│   ├── session_memory.py             Persistent memory
│   └── logger.py                     Pipeline logger
└── templates/
    ├── experiment_data_template.json  Real data format
    └── survey.tex                     LaTeX template
```

---

## ⚙️ Requirements

```bash
pip install requests pypdf matplotlib numpy
```
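
To see which of these packages are still missing before running the pipeline, a minimal sketch using only the standard library (the module names are the import names of the packages listed above; `missing_modules` is a hypothetical helper):

```python
import importlib.util

# Import names of the packages from the pip command above
REQUIRED = ["requests", "pypdf", "matplotlib", "numpy"]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Usage: print(missing_modules()) lists whatever still needs `pip install`.
```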

**Zero API setup on PetClaw** — uses built-in key automatically.

For non-PetClaw:
```bash
export OPENAI_API_KEY=your_key
export OPENAI_BASE_URL=https://api.openai-hk.com/v1
```

Optional (citation graph visualization):
```bash
brew install graphviz   # macOS
sudo apt install graphviz    # Linux (Debian/Ubuntu)
```

---

## 📋 Supported Venues

IEEE TIFS · NeurIPS · CVPR · ICCV · ACM MM · IEEE TSP · Master's/PhD Thesis

---

## 👤 Author

**Zain Ul Abdeen**  
Master's student, Harbin Institute of Technology (Shenzhen)  
Research: AI watermarking, diffusion models, adversarial ML

---

## 📄 License

MIT — Free to use, modify, and share.

FILE:config.yaml
name: research-supervisor-pro
version: 5.1.0
description: Full-stack autonomous AI research supervisor for literature search, gap analysis, idea generation, experiment planning, and survey writing.

entry: SKILL.md

tools:
  - web_search
  - web_fetch
  - memory
  - exec

capabilities:
  - reasoning
  - planning
  - summarization
  - research
  - latex

memory:
  enabled: true
  path: ~/.openclaw/workspace/research-supervisor-pro/memory/

scripts:
  base: ~/.openclaw/workspace/research-supervisor-pro/scripts/

llm:
  default_model: gpt-4o
  base_url: "https://api.openai-hk.com/v1"
  api_key: "-"

author: Zain Ul Abdeen
license: MIT

FILE:install.sh
#!/bin/bash
# EVE Research Supervisor Pro — Installer
# Installs to ~/.openclaw/workspace/research-supervisor-pro/

set -e

BASE=~/.openclaw/workspace/research-supervisor-pro
echo ""
echo "╔══════════════════════════════════════════╗"
echo "║   🔴 EVE Research Supervisor Pro v5.1    ║"
echo "║   Installing...                          ║"
echo "╚══════════════════════════════════════════╝"
echo ""

# Create directories
mkdir -p "$BASE"/{scripts,templates,memory,research,figures}

# Copy scripts and templates
cp -r scripts/* "$BASE/scripts/"
cp -r templates/* "$BASE/templates/"
chmod +x "$BASE/scripts/"*.py

# Python dependencies
echo "🔍 Checking Python dependencies..."
python3 -c "import requests"   2>/dev/null || pip3 install requests   -q
python3 -c "import pypdf"      2>/dev/null || pip3 install pypdf       -q
python3 -c "import matplotlib" 2>/dev/null || pip3 install matplotlib  -q
python3 -c "import numpy"      2>/dev/null || pip3 install numpy       -q
echo "✅ Python dependencies OK"

# Optional: Graphviz for citation graph visualization
if ! command -v dot &>/dev/null; then
  echo "ℹ️  Optional: install Graphviz to visualize citation graphs:"
  echo "   macOS:  brew install graphviz"
  echo "   Linux:  sudo apt install graphviz"
fi

echo ""
echo "✅ EVE installed to: $BASE"
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo " QUICK START"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo " Just say to your AI agent:"
echo "   \"EVE, start research mode\""
echo ""
echo " Or run manually:"
echo "   python3 $BASE/scripts/thesis_context.py init"
echo "   python3 $BASE/scripts/arxiv_downloader.py \"your topic\" 30"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""

FILE:package.json
{
  "name": "eve-research-supervisor-pro",
  "version": "5.1.0",
  "description": "EVE — Persistent AI Research Supervisor Agent. Three modes (Auto/Semi-Auto/Manual), full research lifecycle: paper search, citation graph, gap detection, idea generation, LaTeX paper writing with real figures/tables, server monitoring, and experiment alerts.",
  "author": "Zain Ul Abdeen",
  "license": "MIT",
  "tags": [
    "research", "arxiv", "ai", "phd", "thesis",
    "literature-review", "survey", "paper-writing",
    "gap-analysis", "latex", "figures", "citation-graph",
    "slurm", "gpu", "experiment", "persistent-agent"
  ],
  "homepage": "https://clawhub.ai/skills/eve-research-supervisor-pro",
  "repository": "",
  "files": [
    "SKILL.md",
    "README.md",
    "config.yaml",
    "install.sh",
    "scripts/",
    "templates/",
    "docs/",
    "assets/"
  ],
  "postinstall": "bash install.sh",
  "engines": {
    "python": ">=3.9"
  },
  "dependencies": {
    "python": ["requests", "pypdf", "matplotlib", "numpy"]
  }
}

FILE:scripts/arxiv_downloader.py
#!/usr/bin/env python3
"""
arxiv_downloader.py — Download real PDFs + metadata from arXiv
Usage: python3 arxiv_downloader.py "<topic>" <max_results> [output_dir]
"""

import requests
import os
import sys
import json
import time
from xml.etree import ElementTree

def download_papers(query, max_results=30, output_dir="papers_pdf"):
    os.makedirs(output_dir, exist_ok=True)
    metadata_list = []

    url = (
        f"http://export.arxiv.org/api/query"
        f"?search_query=all:{requests.utils.quote(query)}"
        f"&start=0&max_results={max_results}"
        f"&sortBy=relevance&sortOrder=descending"
    )

    print(f"🔍 Searching arXiv for: {query}")
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
    except Exception as e:
        print(f"❌ Failed to query arXiv: {e}")
        sys.exit(1)

    root = ElementTree.fromstring(response.content)
    ns = "{http://www.w3.org/2005/Atom}"
    entries = root.findall(f"{ns}entry")

    print(f"📄 Found {len(entries)} papers. Downloading...")

    for i, entry in enumerate(entries):
        try:
            arxiv_id_raw = entry.find(f"{ns}id").text.strip()
            arxiv_id = arxiv_id_raw.split("/abs/")[-1].replace("/", "_")
            title = entry.find(f"{ns}title").text.strip().replace("\n", " ")
            abstract = entry.find(f"{ns}summary").text.strip().replace("\n", " ")
            authors = [
                a.find(f"{ns}name").text
                for a in entry.findall(f"{ns}author")
            ]
            published = entry.find(f"{ns}published").text.strip()

            pdf_url = arxiv_id_raw.replace("/abs/", "/pdf/") + ".pdf"
            filename = f"{output_dir}/paper_{i:03d}_{arxiv_id}.pdf"

            # Save metadata
            metadata_list.append({
                "index": i,
                "arxiv_id": arxiv_id,
                "title": title,
                "authors": authors,
                "published": published,
                "abstract": abstract,
                "pdf_url": pdf_url,
                "filename": filename
            })

            # Download PDF with rate limiting
            if not os.path.exists(filename):
                time.sleep(1.5)  # Respect arXiv rate limits
                pdf_resp = requests.get(pdf_url, timeout=60)
                if pdf_resp.status_code == 200:
                    with open(filename, "wb") as f:
                        f.write(pdf_resp.content)
                    print(f"  ✅ [{i+1}/{len(entries)}] {title[:60]}...")
                else:
                    print(f"  ⚠️  [{i+1}/{len(entries)}] Skipped (HTTP {pdf_resp.status_code}): {title[:50]}")
            else:
                print(f"  ⏭️  [{i+1}/{len(entries)}] Already exists: {title[:60]}")

        except Exception as e:
            print(f"  ❌ [{i+1}] Error: {e}")
            continue

    # Save metadata JSON for other scripts to use
    with open(f"{output_dir}/metadata.json", "w") as f:
        json.dump(metadata_list, f, indent=2)

    print(f"\n✅ Processed {len(metadata_list)} papers; PDFs saved to {output_dir}/")
    print(f"📋 Metadata saved to {output_dir}/metadata.json")
    return metadata_list


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python3 arxiv_downloader.py \"<topic>\" <max_results> [output_dir]")
        sys.exit(1)

    query = sys.argv[1]
    max_results = int(sys.argv[2])
    output_dir = sys.argv[3] if len(sys.argv) > 3 else "papers_pdf"

    download_papers(query, max_results, output_dir)

FILE:scripts/bib_generator.py
#!/usr/bin/env python3
"""
bib_generator.py — Auto-generate BibTeX (.bib) from downloaded paper metadata
Feature 3: Auto Bibliography Management
No manual citation work — every downloaded paper becomes a ready BibTeX entry.
Usage:
  python3 bib_generator.py [metadata_json] [output_bib]
  python3 bib_generator.py papers_pdf/metadata.json references.bib
"""

import sys
import os
import json
import re
import time
import requests

SS_BASE = "https://api.semanticscholar.org/graph/v1"


def slugify_key(title, authors, year):
    """Generate a clean BibTeX key: FirstAuthorLastNameYearKeyword"""
    first_author = ""
    if authors:
        name = authors[0].strip()
        parts = name.split()
        first_author = parts[-1] if parts else "Unknown"
        # Remove non-alpha chars
        first_author = re.sub(r'[^a-zA-Z]', '', first_author)

    # First meaningful word from title (skip short words)
    stopwords = {'a','an','the','of','in','on','for','with','and','or','is','are','to','via','by','from'}
    title_words = [w for w in re.sub(r'[^a-zA-Z0-9 ]','',title).split()
                   if w.lower() not in stopwords and len(w) > 2]
    keyword = title_words[0].capitalize() if title_words else "Paper"

    return f"{first_author}{year or 'XXXX'}{keyword}"


def fetch_doi(arxiv_id):
    """Try to get DOI from Semantic Scholar."""
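    # Strip only a trailing version suffix (e.g. "v2"); a bare split("v") would
    # truncate old-style IDs whose category contains a "v" (e.g. "solv-int")
    clean_id = re.sub(r"v\d+$", "", arxiv_id).replace("_", "/")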
    url = f"{SS_BASE}/paper/arXiv:{clean_id}?fields=externalIds,venue,publicationVenue"
    try:
        time.sleep(1.0)
        r = requests.get(url, timeout=10)
        if r.status_code == 200:
            data = r.json()
            ext = data.get("externalIds", {}) or {}
            doi = ext.get("DOI", "")
            venue = ""
            pv = data.get("publicationVenue") or {}
            venue = pv.get("name", "") or data.get("venue", "")
            return doi, venue
    except Exception:
        pass
    return "", ""


def make_bibtex_entry(paper, fetch_extra=False):
    """Convert paper metadata dict to a BibTeX entry string."""
    title   = paper.get("title", "Unknown Title")
    authors = paper.get("authors", [])
    year_raw = paper.get("published", "")[:4]
    year    = year_raw if year_raw.isdigit() else "2024"
    # Drop only the trailing version suffix, then restore "/" in old-style IDs
    arxiv_id = re.sub(r"v\d+$", "", paper.get("arxiv_id", "")).replace("_", "/")
    abstract = paper.get("abstract", "")[:300]

    key = slugify_key(title, authors, year)

    # Join author names with " and " (the BibTeX separator); keep at most 8 authors
    author_str = " and ".join(authors[:8]) if authors else "Unknown"

    doi, venue = "", "arXiv"
    if fetch_extra and arxiv_id:
        doi, venue = fetch_doi(arxiv_id)

    url = f"https://arxiv.org/abs/{arxiv_id}" if arxiv_id else ""

    lines = [
        f"@article{{{key},",
        f"  title   = {{{{{title}}}}},",
        f"  author  = {{{author_str}}},",
        f"  year    = {{{year}}},",
        f"  journal = {{{venue or 'arXiv preprint'}}},",
    ]
    if arxiv_id:
        lines.append(f"  note    = {{arXiv:{arxiv_id}}},")
    if doi:
        lines.append(f"  doi     = {{{doi}}},")
    if url:
        lines.append(f"  url     = {{{url}}},")
    if abstract:
        lines.append(f"  abstract = {{{abstract}...}},")
    lines.append("}")

    return key, "\n".join(lines)


def generate_bib(metadata_path="papers_pdf/metadata.json",
                 output_bib="references.bib",
                 fetch_extra=True):
    if not os.path.exists(metadata_path):
        print(f"❌ {metadata_path} not found. Run arxiv_downloader.py first.")
        sys.exit(1)

    with open(metadata_path) as f:
        papers = json.load(f)

    print(f"📚 Generating BibTeX for {len(papers)} papers...")

    entries = []
    key_map = {}   # key → title (for collision detection)
    cite_map = []  # list of {original_title, bibtex_key} for reference

    for i, paper in enumerate(papers):
        title = paper.get("title", "")
        try:
            key, entry = make_bibtex_entry(paper, fetch_extra=(fetch_extra and i < 20))
        except Exception as e:
            print(f"  ⚠️  [{i+1}] Error for '{title[:40]}': {e}")
            continue

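        # Handle duplicate keys: rename the key inside the entry text as well,
        # otherwise the .bib file keeps the colliding key
        if key in key_map:
            new_key = f"{key}_{i}"
            entry = entry.replace(f"@article{{{key},", f"@article{{{new_key},", 1)
            key = new_key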

        key_map[key] = title
        entries.append(entry)
        cite_map.append({"bibtex_key": key, "title": title})
        print(f"  ✅ [{i+1}/{len(papers)}] \\cite{{{key}}}  ←  {title[:55]}...")

    # Write .bib file
    with open(output_bib, "w") as f:
        f.write("% Auto-generated by EVE Research Supervisor\n")
        f.write(f"% Papers: {len(entries)}\n\n")
        f.write("\n\n".join(entries))
        f.write("\n")

    # Write cite_map.json for paper_writer.py to use
    with open("cite_map.json", "w") as f:
        json.dump(cite_map, f, indent=2)

    # Write human-readable cite cheatsheet
    with open("cite_cheatsheet.md", "w") as f:
        f.write("# Citation Cheatsheet\n\n")
        f.write("Use these keys in your LaTeX paper:\n\n")
        for item in cite_map:
            f.write(f"- `\\cite{{{item['bibtex_key']}}}` → {item['title'][:70]}\n")

    print(f"\n✅ BibTeX generated!")
    print(f"   {output_bib}         — include in LaTeX: \\bibliography{{references}}")
    print(f"   cite_map.json        — used by paper_writer.py automatically")
    print(f"   cite_cheatsheet.md   — your quick reference")
    print(f"\n   In your .tex file add:")
    print(f"   \\bibliographystyle{{plain}}")
    print(f"   \\bibliography{{{output_bib.replace('.bib','')}}}")
    return cite_map


if __name__ == "__main__":
    metadata = sys.argv[1] if len(sys.argv) > 1 else "papers_pdf/metadata.json"
    output   = sys.argv[2] if len(sys.argv) > 2 else "references.bib"
    generate_bib(metadata, output)

FILE:scripts/build_survey.py
#!/usr/bin/env python3
"""
build_survey.py — Generate a proper LaTeX survey paper from research notes
Usage: python3 build_survey.py [notes_file] [output_file] [topic] [api_key]
"""

import os
import sys
import re
import json
import datetime
import requests

# ── Config ──────────────────────────────────────────────────────────────────
def _get_api_config():
    """Use PetClaw built-in API first, fall back to env vars."""
    settings_path = os.path.expanduser("~/.petclaw/petclaw-settings.json")
    try:
        # json is already imported at module level
        with open(settings_path) as f:
            d = json.load(f)
        key = d.get("brainApiKey", "")
        if key:
            return {
                "key":   key,
                "base":  d.get("brainApiUrl", "https://petclaw.ai/api/v1"),
                "model": os.environ.get("SURVEY_MODEL", d.get("brainModel", "petclaw-1.0"))
            }
    except Exception:
        pass
    return {
        "key":   os.environ.get("OPENAI_API_KEY", ""),
        "base":  os.environ.get("OPENAI_BASE_URL", "https://api.openai-hk.com/v1"),
        "model": os.environ.get("SURVEY_MODEL", "gpt-4o")
    }

_cfg        = _get_api_config()
OPENAI_BASE = _cfg["base"]
OPENAI_KEY  = _cfg["key"]
MODEL       = _cfg["model"]


def generate_survey_llm(notes_text, topic, api_key):
    """Use LLM to write structured survey sections."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are an expert academic writer. Write a structured literature survey "
                    "based on the provided research notes. Output clean LaTeX sections only "
                    "(no \\documentclass, no preamble — just section content). "
                    "Include: Introduction, Related Work, Key Methods, Research Gaps, Future Directions. "
                    "Use proper \\section{}, \\subsection{}, and \\paragraph{} formatting. "
                    "Keep it academic, concise, and well-organized."
                )
            },
            {
                "role": "user",
                "content": f"Topic: {topic}\n\nNotes:\n{notes_text[:6000]}"
            }
        ],
        "temperature": 0.4,
        "max_tokens": 3000
    }
    try:
        r = requests.post(f"{OPENAI_BASE}/chat/completions", headers=headers, json=payload, timeout=60)
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"].strip()
    except Exception as e:
        print(f"  ⚠️  LLM call failed: {e}. Using structured template.")
        return None


def build_survey_template(notes_text, topic):
    """Fallback: structured template from notes."""
    # Extract paper titles from notes
    titles = re.findall(r'^## \d+\. (.+)$', notes_text, re.MULTILINE)
    abstracts = re.findall(r'\*\*Abstract:\*\* (.+)', notes_text)

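    def _esc(text):
        """Escape the LaTeX special characters this script handles."""
        return text.replace("&", "\\&").replace("%", "\\%").replace("#", "\\#").replace("_", "\\_")

    paper_items = ""
    for i, title in enumerate(titles[:20]):
        # Escape titles too: an unescaped "&" or "_" in a title breaks compilation
        paper_items += f"  \\item \\textbf{{{_esc(title[:80])}}}\n"
        if i < len(abstracts):
            paper_items += f"  {_esc(abstracts[i][:200])}...\n\n"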

    return f"""
\\section{{Introduction}}
This survey covers recent advances in \\textit{{{topic}}}. 
We reviewed {len(titles)} papers retrieved from arXiv and analyzed their methods, results, and limitations.

\\section{{Related Work}}
\\subsection{{Overview of Key Papers}}
\\begin{{enumerate}}
{paper_items}
\\end{{enumerate}}

\\section{{Key Methods and Approaches}}
The surveyed literature presents a range of approaches. 
We categorize them by their core techniques and evaluate their reported performance.

\\section{{Research Gaps}}
Based on our analysis, several key challenges remain unaddressed in the current literature:
\\begin{{itemize}}
  \\item Lack of standardized benchmarks for evaluating methods under adversarial conditions
  \\item Limited cross-domain generalization of proposed approaches
  \\item Insufficient evaluation on real-world deployment scenarios
\\end{{itemize}}

\\section{{Future Directions}}
Future research should focus on:
\\begin{{itemize}}
  \\item Developing robust evaluation frameworks
  \\item Bridging the gap between theoretical results and practical applications
  \\item Exploring cross-modal and multi-task approaches
\\end{{itemize}}

\\section{{Conclusion}}
This survey provides a comprehensive overview of {topic}. 
We identified {len(titles)} relevant works and highlighted key open problems for future research.
"""


def build_survey(notes_file="notes.md", output_file="survey.tex", topic="AI Research", api_key=None):
    if not os.path.exists(notes_file):
        print(f"❌ {notes_file} not found. Run pdf_parser.py first.")
        sys.exit(1)

    with open(notes_file) as f:
        notes_text = f.read()

    print(f"📝 Building LaTeX survey for topic: {topic}")

    key = api_key or OPENAI_KEY
    body = None

    if key:
        print("  Using LLM to write survey sections...")
        body = generate_survey_llm(notes_text, topic, key)

    if not body:
        print("  Using structured template (fallback)...")
        body = build_survey_template(notes_text, topic)

    # Escape special chars in topic for LaTeX title
    safe_topic = topic.replace("&", "\\&").replace("%", "\\%").replace("#", "\\#").replace("_", "\\_")
    date_str = datetime.date.today().strftime("%B %d, %Y")

    latex = f"""\\documentclass[12pt,a4paper]{{article}}
\\usepackage[utf8]{{inputenc}}
\\usepackage[T1]{{fontenc}}
\\usepackage{{hyperref}}
\\usepackage{{geometry}}
\\geometry{{margin=1in}}
\\usepackage{{enumitem}}
\\usepackage{{parskip}}

\\title{{Literature Survey: {safe_topic}}}
\\author{{Research Supervisor Pro}}
\\date{{{date_str}}}

\\begin{{document}}
\\maketitle
\\tableofcontents
\\newpage

{body}

\\end{{document}}
"""

    with open(output_file, "w") as f:
        f.write(latex)

    print(f"\n✅ Survey saved → {output_file}")
    print(f"   Compile with: pdflatex {output_file}")
    return output_file


if __name__ == "__main__":
    notes_file  = sys.argv[1] if len(sys.argv) > 1 else "notes.md"
    output_file = sys.argv[2] if len(sys.argv) > 2 else "survey.tex"
    topic       = sys.argv[3] if len(sys.argv) > 3 else "AI Research"
    api_key     = sys.argv[4] if len(sys.argv) > 4 else None
    build_survey(notes_file, output_file, topic, api_key)

FILE:scripts/citation_graph.py
#!/usr/bin/env python3
"""
citation_graph.py — Build a citation graph: who cites whom, find clusters
Uses Semantic Scholar API. Outputs graph data + visual DOT file.
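Usage: python3 citation_graph.py [metadata_json] [depth]
  depth=1: direct citations and references of each seed paper (default)
  depth=2: reserved for citations of citations; this script currently
           accepts the value but only performs the depth-1 crawl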
"""

import json
import os
import sys
import time
import requests

SS_BASE = "https://api.semanticscholar.org/graph/v1"


def get_paper_data(arxiv_id, title=""):
    """Fetch paper info from Semantic Scholar."""
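    # Strip a trailing version suffix (e.g. "v2") without touching a "v"
    # elsewhere in the ID (old-style categories such as "solv-int")
    base, sep, ver = arxiv_id.rpartition("v")
    clean_id = (base if sep and ver.isdigit() else arxiv_id).replace("_", "/")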

    # Try arXiv ID first
    url = f"{SS_BASE}/paper/arXiv:{clean_id}?fields=paperId,title,year,citationCount,citations,references"
    try:
        time.sleep(1.2)
        r = requests.get(url, timeout=15)
        if r.status_code == 200:
            return r.json()
    except Exception:
        pass

    # Fallback: title search
    if title:
        url2 = f"{SS_BASE}/paper/search?query={requests.utils.quote(title)}&limit=1&fields=paperId,title,year,citationCount,citations,references"
        try:
            time.sleep(1.2)
            r2 = requests.get(url2, timeout=15)
            if r2.status_code == 200:
                data = r2.json().get("data", [])
                if data:
                    return data[0]
        except Exception:
            pass
    return None


def build_graph(metadata_path="papers_pdf/metadata.json", depth=1):
    if not os.path.exists(metadata_path):
        print(f"❌ {metadata_path} not found. Run arxiv_downloader.py first.")
        sys.exit(1)

    with open(metadata_path) as f:
        papers = json.load(f)

    print(f"🕸️  Building citation graph for {len(papers)} papers (depth={depth})...")

    graph = {}      # paperId → node info
    edges = []      # (from_id, to_id, type)  type = "cites" | "cited_by"
    id_map = {}     # arxiv_id → paperId

    # Phase 1: fetch all seed papers
    for i, paper in enumerate(papers):
        arxiv_id = paper.get("arxiv_id", "")
        title = paper.get("title", "")
        print(f"  [{i+1}/{len(papers)}] Fetching: {title[:55]}...")

        data = get_paper_data(arxiv_id, title)
        if not data:
            print(f"    ⚠️  Not found on Semantic Scholar")
            continue

        paper_id = data.get("paperId", arxiv_id)
        id_map[arxiv_id] = paper_id

        graph[paper_id] = {
            "id": paper_id,
            "arxiv_id": arxiv_id,
            "title": title or data.get("title", ""),
            "year": data.get("year"),
            "citations": data.get("citationCount", 0),
            "is_seed": True
        }

        # Add reference edges (this paper cites → )
        for ref in data.get("references", [])[:30]:
            ref_id = ref.get("paperId")
            ref_title = ref.get("title", "")
            if ref_id:
                edges.append((paper_id, ref_id, "cites"))
                if ref_id not in graph:
                    graph[ref_id] = {
                        "id": ref_id,
                        "title": ref_title,
                        "year": ref.get("year"),
                        "citations": 0,
                        "is_seed": False
                    }

        # Add citation edges (cited by ←)
        for cit in data.get("citations", [])[:20]:
            cit_id = cit.get("paperId")
            cit_title = cit.get("title", "")
            if cit_id:
                edges.append((cit_id, paper_id, "cited_by"))
                if cit_id not in graph:
                    graph[cit_id] = {
                        "id": cit_id,
                        "title": cit_title,
                        "year": cit.get("year"),
                        "citations": 0,
                        "is_seed": False
                    }

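    # Count how many seed papers reference each paper. Only "cites" edges are
    # counted: "cited_by" edges originate from non-seed papers and would
    # otherwise mark any seed with two or more citers as foundational.
    citation_counts = {}
    for _, to_id, typ in edges:
        if typ == "cites":
            citation_counts[to_id] = citation_counts.get(to_id, 0) + 1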

    # Identify key papers (cited by multiple seed papers = likely foundational)
    foundational = {pid: cnt for pid, cnt in citation_counts.items() if cnt >= 2}

    # Save graph JSON
    graph_data = {
        "nodes": list(graph.values()),
        "edges": [{"from": f, "to": t, "type": typ} for f, t, typ in edges],
        "foundational_papers": [
            {"id": pid, "title": graph.get(pid, {}).get("title", ""), "cited_by_count": cnt}
            for pid, cnt in sorted(foundational.items(), key=lambda x: -x[1])
        ]
    }

    with open("citation_graph.json", "w") as f:
        json.dump(graph_data, f, indent=2)

    # Save DOT file for visualization (Graphviz)
    with open("citation_graph.dot", "w") as f:
        f.write("digraph CitationGraph {\n")
        f.write('  rankdir=LR;\n  node [shape=box, style=filled];\n')

        for node in graph.values():
            # Guard against a missing (None) title in the API response
            label = (node["title"] or "")[:40].replace('"', "'")
            color = "#4CAF50" if node.get("is_seed") else "#90CAF9"
            if node["id"] in foundational:
                color = "#FF9800"
            f.write(f'  "{node["id"]}" [label="{label}", fillcolor="{color}"];\n')

        for frm, to, typ in edges[:200]:  # limit for readability
            style = "solid" if typ == "cites" else "dashed"
            f.write(f'  "{frm}" -> "{to}" [style={style}];\n')

        f.write("}\n")

    # Human-readable summary
    with open("citation_graph_summary.md", "w") as f:
        f.write("# Citation Graph Summary\n\n")
        f.write(f"- **Papers analyzed:** {len(graph)}\n")
        f.write(f"- **Citation edges:** {len(edges)}\n\n")
        f.write("## 🌟 Foundational Papers (cited by multiple papers in your set)\n\n")
        for p in graph_data["foundational_papers"][:15]:
            f.write(f"- **{p['title'][:80]}** — cited by {p['cited_by_count']} papers\n")
        f.write("\n## 🟢 Seed Papers (your downloaded set)\n\n")
        for node in graph.values():
            if node.get("is_seed"):
                f.write(f"- {node['title'][:80]} ({node.get('year','?')})\n")

    print(f"\n✅ Citation graph built!")
    print(f"   Nodes: {len(graph)} | Edges: {len(edges)}")
    print(f"   Foundational papers: {len(foundational)}")
    print(f"   Saved: citation_graph.json, citation_graph.dot, citation_graph_summary.md")
    print(f"\n   Visualize: dot -Tpng citation_graph.dot -o citation_graph.png")
    return graph_data
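

# Illustrative sketch (not part of the original script): the "foundational
# paper" rule used in build_graph(), isolated so it can be tried on a toy edge
# list. The helper name _example_foundational and the sample IDs in the usage
# note are hypothetical.
from collections import Counter


def _example_foundational(edges, min_citations=2):
    """Return {paper_id: count} for papers cited at least min_citations times.

    edges is a list of (from_id, to_id, edge_type) tuples, as in build_graph().
    """
    counts = Counter(to_id for _, to_id, _ in edges)
    return {pid: cnt for pid, cnt in counts.items() if cnt >= min_citations}


# Usage: with edges A->C, B->C and A->D, only "C" is cited twice, so
# _example_foundational([("A", "C", "cites"), ("B", "C", "cites"), ("A", "D", "cites")])
# keeps "C" and drops "D".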


if __name__ == "__main__":
    metadata = sys.argv[1] if len(sys.argv) > 1 else "papers_pdf/metadata.json"
    depth    = int(sys.argv[2]) if len(sys.argv) > 2 else 1
    build_graph(metadata, depth)

FILE:scripts/experiment_alert.py
#!/usr/bin/env python3
"""
experiment_alert.py — Feature 10: Real-Time Experiment Alerts
Monitors your running experiments and alerts you when:
  - Training finishes (success or failure)
  - A metric milestone is hit (e.g. BER < 0.1)
  - Job crashes or runs out of memory
  - Results are ready to pull

Usage:
  python3 experiment_alert.py watch  <job_id> [--metric BER --threshold 0.1] [--project <project>]
  python3 experiment_alert.py poll   <job_id> [--interval 60]
  python3 experiment_alert.py parse  <log_file>           — extract metrics from log
  python3 experiment_alert.py update <project> <log_file> — auto-update data template
"""

import sys
import os
import re
import json
import time
import datetime
import subprocess

BASE       = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro")
CONFIG     = os.path.join(BASE, "memory/server_config.json")
ALERTS_LOG = os.path.join(BASE, "memory/alerts.log")


def load_server_config():
    if not os.path.exists(CONFIG):
        return {}
    with open(CONFIG) as f:
        return json.load(f)


def log_alert(message):
    os.makedirs(os.path.dirname(ALERTS_LOG), exist_ok=True)
    ts = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(ALERTS_LOG, "a") as f:
        f.write(f"[{ts}] {message}\n")
    print(f"🔔 ALERT [{ts}]: {message}")


def ssh_run(cmd, timeout=30):
    cfg = load_server_config()
    host = cfg.get("host", "")
    user = cfg.get("user", "")
    port = cfg.get("port", 22)
    key  = cfg.get("ssh_key", "")
    if not host:
        return None
    ssh = ["ssh", "-o", "StrictHostKeyChecking=no", "-o", f"ConnectTimeout={timeout}",
           "-p", str(port)]
    if key:
        ssh += ["-i", os.path.expanduser(key)]
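    # Fall back to the SSH default user when none is configured
    ssh += [f"{user}@{host}" if user else host, cmd]
    try:
        r = subprocess.run(ssh, capture_output=True, text=True, timeout=timeout+5)
        # ssh itself exits with 255 when the connection or authentication fails
        if r.returncode == 255:
            return None
        return r.stdout.strip()
    except Exception:
        return None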


# ── Metric Patterns ───────────────────────────────────────────────────────────
METRIC_PATTERNS = {
    # (?<![A-Za-z]) stops short names from matching inside other words,
    # e.g. "number: 3" being read as BER or "alpha ba 1" as BitAcc
    "BER":          r"(?<![A-Za-z])(?:BER|bit.error.rate)[:\s=]+([0-9]+\.?[0-9]*)",
    "BitAcc":       r"(?<![A-Za-z])(?:bit.?acc(?:uracy)?|BA)[:\s=]+([0-9]+\.?[0-9]*)",
    "PSNR":         r"PSNR[:\s=]+([0-9]+\.?[0-9]*)",
    "SSIM":         r"SSIM[:\s=]+([0-9]+\.?[0-9]*)",
    "Loss":         r"(?:loss|train.loss)[:\s=]+([0-9]+\.?[0-9]*)",
    "Epoch":        r"[Ee]poch[:\s]+([0-9]+)",
    "Accuracy":     r"(?:acc(?:uracy)?)[:\s=]+([0-9]+\.?[0-9]*)",
    "DetectionAcc": r"(?:detection.?acc(?:uracy)?)[:\s=]+([0-9]+\.?[0-9]*)",
}

FINISH_PATTERNS = [
    r"training complete",
    r"finished training",
    r"job complete",
    r"saved model",
    r"best model saved",
    r"evaluation complete",
    r"done\.",
]

ERROR_PATTERNS = [
    # Skip metric names such as "bit error rate", which are not failures
    r"error(?![\s._-]*rate)|exception|traceback",
    r"cuda out of memory",
    r"killed",
    r"nan loss",
    r"inf loss",
    r"segmentation fault",
]


def parse_log(log_text):
    """Extract all metrics from a log file text."""
    metrics_found = {}
    lines = log_text.split("\n")

    for line in lines:
        line_lower = line.lower()

        # Check metrics
        for name, pattern in METRIC_PATTERNS.items():
            m = re.search(pattern, line, re.IGNORECASE)
            if m:
                try:
                    val = float(m.group(1))
                    if name not in metrics_found:
                        metrics_found[name] = []
                    metrics_found[name].append(val)
                except ValueError:
                    pass

        # Check finish
        for fp in FINISH_PATTERNS:
            if re.search(fp, line_lower):
                metrics_found["__finished__"] = True

        # Check errors (record each line once, even if several patterns match)
        for ep in ERROR_PATTERNS:
            if re.search(ep, line_lower):
                metrics_found.setdefault("__errors__", []).append(line.strip()[:100])
                break

    return metrics_found


def print_metrics_summary(metrics):
    print("\n📊 EXTRACTED METRICS:")
    print("━" * 50)
    for name, values in metrics.items():
        if name.startswith("__"):
            continue
        if isinstance(values, list) and values:
            print(f"  {name:<15}: last={values[-1]:.4f}  min={min(values):.4f}  max={max(values):.4f}  ({len(values)} readings)")
    if metrics.get("__finished__"):
        print("  ✅ Training COMPLETED")
    errors = metrics.get("__errors__", [])
    if errors:
        print(f"  ❌ {len(errors)} errors found:")
        for e in errors[:3]:
            print(f"     {e}")
    print("━" * 50)


def update_data_template(project, metrics):
    """Auto-update the project's experiment_data.json with extracted metrics."""
    template_path = os.path.join(BASE, "research", project, "experiment_data.json")

    existing = {}
    if os.path.exists(template_path):
        with open(template_path) as f:
            existing = json.load(f)

    # Build training curve if epochs + metric present
    if "Epoch" in metrics and ("BER" in metrics or "Loss" in metrics):
        epochs = metrics.get("Epoch", [])
        bers = metrics.get("BER", [])
        losses = metrics.get("Loss", [])

        if "experiments" not in existing:
            existing["experiments"] = []

        if bers and len(bers) == len(epochs):
            existing["experiments"].append({
                "name": "BER vs Epochs",
                "xlabel": "Epoch",
                "ylabel": "BER",
                "label": "Bit Error Rate during training",
                "x": epochs[:len(bers)],
                "y": bers
            })
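        if losses:
            # Use the real epoch numbers when they line up with the loss readings;
            # otherwise fall back to the reading index and label the axis accordingly
            aligned = len(losses) == len(epochs)
            existing["experiments"].append({
                "name": "Training Loss",
                "xlabel": "Epoch" if aligned else "Reading",
                "ylabel": "Loss",
                "label": "Training loss curve",
                "x": epochs if aligned else list(range(len(losses))),
                "y": losses
            })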

    os.makedirs(os.path.dirname(template_path), exist_ok=True)
    with open(template_path, "w") as f:
        json.dump(existing, f, indent=2)

    print(f"✅ experiment_data.json updated → {template_path}")
    return template_path
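

# Illustrative sketch (not part of the original script): poll_job() infers the
# comparison direction from the threshold itself (a threshold below 0.5 is
# treated as "lower is better"). An explicit per-metric table is one possible
# alternative. The names _EXAMPLE_LOWER_IS_BETTER and _example_milestone_reached
# are hypothetical, and the directions listed are assumptions about typical usage.
_EXAMPLE_LOWER_IS_BETTER = {"BER": True, "Loss": True, "PSNR": False,
                            "SSIM": False, "BitAcc": False, "Accuracy": False}


def _example_milestone_reached(metric, value, threshold):
    """True once value has crossed threshold in the metric's "good" direction."""
    thr = float(threshold)
    # Unknown metrics fall back to the threshold-based heuristic used in poll_job()
    lower_is_better = _EXAMPLE_LOWER_IS_BETTER.get(metric, thr < 0.5)
    return value < thr if lower_is_better else value > thr


# Usage: a loss threshold of 0.8 would be read as "higher is better" by the
# threshold heuristic; with the table, _example_milestone_reached("Loss", 0.7, 0.8)
# fires only once the loss drops below 0.8.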


def poll_job(job_id, interval=60, metric=None, threshold=None, project=None):
    """Poll job status every N seconds, alert on completion or milestone."""
    # job_id is interpolated into remote shell commands, so allow only safe characters
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", str(job_id)):
        print(f"❌ Invalid job id: {job_id}")
        return

    cfg = load_server_config()
    if not cfg.get("host"):
        print("❌ No server configured in memory/server_config.json")
        return
    workdir = cfg.get("workdir", "~")

    print(f"🔍 Polling job {job_id} every {interval}s...")
    if metric and threshold:
        print(f"   Alert when {metric} {'<' if float(threshold) < 0.5 else '>'} {threshold}")
    print("   (Ctrl+C to stop)\n")

    alerted_milestone = False
    alerted_error = False
    failures = 0  # consecutive SSH failures

    while True:
        try:
            # Check if job still running (None means the SSH call itself failed)
            status = ssh_run(f"squeue -j {job_id} --noheader 2>/dev/null")
            if status is None:
                failures += 1
                if failures >= 5:
                    log_alert(f"❌ Lost contact with the server while polling job {job_id}")
                    break
                print(f"⚠️  Could not reach the server; retrying in {interval}s...")
                time.sleep(interval)
                continue
            failures = 0

            # Get latest log lines
            log_content = ssh_run(f"find {workdir} -name '*{job_id}*' -newer /tmp -exec tail -100 {{}} \\; 2>/dev/null | tail -200")

            if log_content:
                metrics = parse_log(log_content)

                # Check milestone (thresholds below 0.5 are treated as "lower is better")
                if metric and threshold and not alerted_milestone:
                    vals = metrics.get(metric, [])
                    if vals:
                        last_val = vals[-1]
                        thr = float(threshold)
                        if (thr < 0.5 and last_val < thr) or (thr >= 0.5 and last_val > thr):
                            log_alert(f"🎯 MILESTONE: Job {job_id} — {metric}={last_val:.4f} (threshold={threshold})")
                            alerted_milestone = True

                # Check completion
                if metrics.get("__finished__"):
                    log_alert(f"✅ Job {job_id} COMPLETED!")
                    print_metrics_summary(metrics)
                    if project:
                        update_data_template(project, metrics)
                    break

                # Check errors (alert once, not on every poll)
                errors = metrics.get("__errors__", [])
                if errors and not alerted_error:
                    log_alert(f"❌ Job {job_id} has errors: {errors[0]}")
                    alerted_error = True

                # Show current status
                epoch_vals = metrics.get("Epoch", [])
                ber_vals   = metrics.get("BER", [])
                ts = datetime.datetime.now().strftime("%H:%M:%S")
                status_line = f"[{ts}] Job {job_id}"
                if epoch_vals:
                    status_line += f" | Epoch {int(epoch_vals[-1])}"
                if ber_vals:
                    status_line += f" | BER {ber_vals[-1]:.4f}"
                print(status_line, end="\r")

            if not status:
                log_alert(f"✅ Job {job_id} no longer in queue — likely finished")
                break

            time.sleep(interval)

        except KeyboardInterrupt:
            print("\n\nStopped polling.")
            break


def parse_local_log(log_file):
    """Parse a local log file and extract metrics."""
    if not os.path.exists(log_file):
        print(f"❌ File not found: {log_file}")
        return
    with open(log_file) as f:
        content = f.read()
    metrics = parse_log(content)
    print_metrics_summary(metrics)
    return metrics


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)

    action = sys.argv[1]

    if action == "watch" and len(sys.argv) >= 3:
        job_id    = sys.argv[2]
        metric    = None
        threshold = None
        project   = None
        for i, arg in enumerate(sys.argv):
            if arg == "--metric"    and i+1 < len(sys.argv): metric    = sys.argv[i+1]
            if arg == "--threshold" and i+1 < len(sys.argv): threshold = sys.argv[i+1]
            if arg == "--project"   and i+1 < len(sys.argv): project   = sys.argv[i+1]
        poll_job(job_id, interval=30, metric=metric, threshold=threshold, project=project)

    elif action == "poll" and len(sys.argv) >= 3:
        job_id   = sys.argv[2]
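        # Read the value that follows --interval, wherever the flag appears
        interval = 60
        if "--interval" in sys.argv:
            idx = sys.argv.index("--interval")
            if idx + 1 < len(sys.argv):
                interval = int(sys.argv[idx + 1])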
        poll_job(job_id, interval=interval)

    elif action == "parse" and len(sys.argv) >= 3:
        parse_local_log(sys.argv[2])

    elif action == "update" and len(sys.argv) >= 4:
        metrics = parse_local_log(sys.argv[3])
        if metrics:
            update_data_template(sys.argv[2], metrics)

    else:
        print(__doc__)

FILE:scripts/gap_detector.py
#!/usr/bin/env python3
"""
gap_detector.py — Detect research gaps using LLM analysis (not just keyword matching)
Falls back to keyword matching if no API key available.
Usage: python3 gap_detector.py [notes_file] [api_key]
"""

import os
import sys
import re
import json
import requests

# ── Config ──────────────────────────────────────────────────────────────────
def _get_api_config():
    """Use PetClaw built-in API first, fall back to env vars."""
    settings_path = os.path.expanduser("~/.petclaw/petclaw-settings.json")
    try:
        with open(settings_path) as f:
            d = json.load(f)
        key = d.get("brainApiKey", "")
        if key:
            return {
                "key":   key,
                "base":  d.get("brainApiUrl", "https://petclaw.ai/api/v1"),
                "model": os.environ.get("GAP_MODEL", d.get("brainModel", "petclaw-1.0"))
            }
    except Exception:
        pass
    return {
        "key":   os.environ.get("OPENAI_API_KEY", ""),
        "base":  os.environ.get("OPENAI_BASE_URL", "https://api.openai-hk.com/v1"),
        "model": os.environ.get("GAP_MODEL", "gpt-4o")
    }

_cfg        = _get_api_config()
OPENAI_BASE = _cfg["base"]
OPENAI_KEY  = _cfg["key"]
MODEL       = _cfg["model"]

GAP_KEYWORDS = [
    "however", "but", "limited", "limitation", "lack", "challenge",
    "future work", "not addressed", "open problem", "remains unclear",
    "to our knowledge", "no existing", "insufficient", "gap", "drawback",
    "unable to", "fails to", "does not handle", "beyond the scope"
]

def detect_gaps_llm(text_chunk, api_key):
    """Use LLM to intelligently detect research gaps."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are a research assistant. Given text extracted from research papers, "
                    "identify concrete research gaps, limitations, and unsolved problems. "
                    "Return a JSON array of gap strings. Be specific and concise. "
                    "Example: [\"No benchmark exists for X\", \"Method fails under Y conditions\"]"
                )
            },
            {
                "role": "user",
                "content": f"Extract research gaps from this text:\n\n{text_chunk[:3000]}"
            }
        ],
        "temperature": 0.3,
        "max_tokens": 800
    }
    try:
        r = requests.post(f"{OPENAI_BASE}/chat/completions", headers=headers, json=payload, timeout=30)
        r.raise_for_status()
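        content = r.json()["choices"][0]["message"]["content"].strip()
        # Parse JSON array from response; coerce items to strings so the later
        # de-duplication and Markdown output also work if the model returns objects
        match = re.search(r'\[.*\]', content, re.DOTALL)
        if match:
            parsed = json.loads(match.group())
            if isinstance(parsed, list):
                return [str(g).strip() for g in parsed if str(g).strip()]
        return [content]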
    except Exception as e:
        print(f"  ⚠️  LLM call failed: {e}. Falling back to keyword matching.")
        return None
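

# Illustrative sketch (not part of the original script): a stricter way to pull
# a JSON array of strings out of an LLM reply that may be wrapped in prose or a
# Markdown code fence. Unlike detect_gaps_llm(), it returns an empty list rather
# than raising when the array is malformed. The helper name
# _example_extract_json_array is hypothetical.
import json
import re


def _example_extract_json_array(content):
    """Return a list of non-empty strings parsed from content, or [] if none."""
    match = re.search(r"\[.*\]", content, re.DOTALL)
    if not match:
        return []
    try:
        parsed = json.loads(match.group())
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return [str(item).strip() for item in parsed if str(item).strip()]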


def detect_gaps_keywords(text):
    """Fallback: simple keyword-based gap detection."""
    gaps = []
    sentences = re.split(r'(?<=[.!?])\s+', text)
    for s in sentences:
        s = s.strip()
        if len(s) < 20:
            continue
        for kw in GAP_KEYWORDS:
            if kw in s.lower():
                gaps.append(s)
                break
    return list(dict.fromkeys(gaps))  # de-duplicate while keeping document order


def detect_gaps(notes_file="notes.md", api_key=None):
    if not os.path.exists(notes_file):
        print(f"❌ {notes_file} not found. Run pdf_parser.py first.")
        sys.exit(1)

    with open(notes_file) as f:
        text = f.read()

    print(f"🔬 Detecting research gaps from {notes_file}...")

    gaps = []

    # Try LLM first
    key = api_key or OPENAI_KEY
    if key:
        print("  Using LLM-based gap detection...")
        # Process in chunks (notes can be large); only the first 15,000 characters are analyzed
        chunk_size = 3000
        chunks = [text[i:i+chunk_size] for i in range(0, min(len(text), 15000), chunk_size)]
        for chunk in chunks:
            result = detect_gaps_llm(chunk, key)
            if result:
                gaps.extend(result)

    # Fall back to keywords when no API key is set or the LLM returned nothing
    if not gaps:
        print("  Using keyword-based gap detection (fallback)...")
        gaps = detect_gaps_keywords(text)

    # Deduplicate
    gaps = list(dict.fromkeys(gaps))

    with open("gaps.md", "w") as f:
        f.write("# Detected Research Gaps\n\n")
        for i, g in enumerate(gaps, 1):
            f.write(f"{i}. {g}\n\n")

    print(f"\n✅ Detected {len(gaps)} research gaps → gaps.md")
    return gaps


if __name__ == "__main__":
    notes_file = sys.argv[1] if len(sys.argv) > 1 else "notes.md"
    api_key    = sys.argv[2] if len(sys.argv) > 2 else None
    detect_gaps(notes_file, api_key)

FILE:scripts/idea_generator.py
#!/usr/bin/env python3
"""
idea_generator.py — Generate real research ideas from detected gaps using LLM
Falls back to structured templates if no API key.
Usage: python3 idea_generator.py [gaps_file] [api_key]
"""

import os
import sys
import re
import json
import requests

# ── Config ──────────────────────────────────────────────────────────────────
def _get_api_config():
    """Use PetClaw built-in API first, fall back to env vars."""
    settings_path = os.path.expanduser("~/.petclaw/petclaw-settings.json")
    try:
        with open(settings_path) as f:
            d = json.load(f)
        key = d.get("brainApiKey", "")
        if key:
            return {
                "key":   key,
                "base":  d.get("brainApiUrl", "https://petclaw.ai/api/v1"),
                "model": os.environ.get("IDEA_MODEL", d.get("brainModel", "petclaw-1.0"))
            }
    except Exception:
        pass
    return {
        "key":   os.environ.get("OPENAI_API_KEY", ""),
        "base":  os.environ.get("OPENAI_BASE_URL", "https://api.openai-hk.com/v1"),
        "model": os.environ.get("IDEA_MODEL", "gpt-4o")
    }

_cfg        = _get_api_config()
OPENAI_BASE = _cfg["base"]
OPENAI_KEY  = _cfg["key"]
MODEL       = _cfg["model"]


def generate_ideas_llm(gaps_text, api_key):
    """Use LLM to generate concrete, publishable research ideas."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": MODEL,
        "messages": [
            {
                "role": "system",
                "content": (
                    "You are a senior AI researcher and research supervisor. "
                    "Given a list of research gaps, generate specific, concrete, and publishable research ideas. "
                    "For each idea provide:\n"
                    "- Title: A clear paper title\n"
                    "- Problem: What gap it addresses\n"
                    "- Method: Proposed approach (specific, not vague)\n"
                    "- Baselines: What to compare against\n"
                    "- Metrics: How to evaluate\n"
                    "- Venue: Target conference/journal (e.g. CVPR, NeurIPS, IEEE TIFS)\n"
                    "- Novelty: Why this is new\n\n"
                    "Generate 5 ideas. Be specific and realistic."
                )
            },
            {
                "role": "user",
                "content": f"Generate research ideas from these gaps:\n\n{gaps_text[:3000]}"
            }
        ],
        "temperature": 0.7,
        "max_tokens": 2000
    }
    try:
        r = requests.post(f"{OPENAI_BASE}/chat/completions", headers=headers, json=payload, timeout=45)
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"].strip()
    except Exception as e:
        print(f"  ⚠️  LLM call failed: {e}. Using template fallback.")
        return None


def generate_ideas_template(gaps):
    """Fallback: structured template-based ideas."""
    ideas = []
    for i, gap in enumerate(gaps[:5], 1):
        idea = (
            f"## Idea {i}\n"
            f"**Title:** Addressing: {gap[:80]}\n"
            f"**Problem:** {gap}\n"
            f"**Method:** Propose a novel approach to solve this gap using deep learning / LLM-guided methods.\n"
            f"**Baselines:** Compare against existing SOTA methods in this area.\n"
            f"**Metrics:** Accuracy, robustness, computational efficiency.\n"
            f"**Venue:** IEEE Transactions / CVPR / NeurIPS\n"
            f"**Novelty:** First work to directly address this specific limitation.\n"
        )
        ideas.append(idea)
    return "\n\n".join(ideas)


def generate_ideas(gaps_file="gaps.md", api_key=None):
    if not os.path.exists(gaps_file):
        print(f"❌ {gaps_file} not found. Run gap_detector.py first.")
        sys.exit(1)

    with open(gaps_file) as f:
        gaps_text = f.read()

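    # Parse individual gaps: drop only the "N. " list prefix, so gaps that begin
    # with a digit (e.g. "3D ...") keep their text
    gaps = [re.sub(r"^\d+\.\s*", "", line.strip()) for line in gaps_text.split("\n")
            if line.strip() and not line.startswith("#")]
    gaps = [g for g in gaps if len(g) > 20]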

    print(f"💡 Generating research ideas from {len(gaps)} gaps...")

    key = api_key or OPENAI_KEY
    ideas_text = None

    if key:
        print("  Using LLM-based idea generation...")
        ideas_text = generate_ideas_llm(gaps_text, key)

    if not ideas_text:
        print("  Using template-based idea generation (fallback)...")
        ideas_text = generate_ideas_template(gaps)

    with open("ideas.md", "w") as f:
        f.write("# Generated Research Ideas\n\n")
        f.write(ideas_text)

    print(f"\n✅ Research ideas saved → ideas.md")
    return ideas_text


if __name__ == "__main__":
    gaps_file = sys.argv[1] if len(sys.argv) > 1 else "gaps.md"
    api_key   = sys.argv[2] if len(sys.argv) > 2 else None
    generate_ideas(gaps_file, api_key)

FILE:scripts/logger.py
#!/usr/bin/env python3
"""
logger.py — Log pipeline events to project research log
Usage: python3 logger.py "<project>" "<message>"
"""

import sys
import os
import datetime

def log(project, message, base_dir=None):
    if base_dir is None:
        base_dir = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro")

    log_dir = os.path.join(base_dir, "research", project)
    os.makedirs(log_dir, exist_ok=True)

    log_file = os.path.join(log_dir, "auto_log.md")
    timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    with open(log_file, "a") as f:
        f.write(f"- [{timestamp}] {message}\n")

    print(f"📝 Logged: {message}")

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python3 logger.py \"<project>\" \"<message>\"")
        sys.exit(1)
    log(sys.argv[1], sys.argv[2])

FILE:scripts/paper_writer.py
#!/usr/bin/env python3
"""
paper_writer.py — Write a full academic paper in LaTeX
Supports: survey papers, research papers with real data, figures, tables, graphs
Usage: python3 paper_writer.py <mode> [options]
  mode: survey    — write literature survey from notes.md
  mode: research  — write research paper from real experimental data
  mode: section   — write a single section
"""

import sys
import os
import json
import re
import datetime
import requests

# ── Config ──────────────────────────────────────────────────────────────────
def _get_api_config():
    """Use PetClaw built-in API first, fall back to env vars."""
    settings_path = os.path.expanduser("~/.petclaw/petclaw-settings.json")
    try:
        with open(settings_path) as f:
            d = json.load(f)
        key = d.get("brainApiKey", "")
        if key:
            return {
                "key":   key,
                "base":  d.get("brainApiUrl", "https://petclaw.ai/api/v1"),
                "model": os.environ.get("PAPER_MODEL", d.get("brainModel", "petclaw-1.0"))
            }
    except Exception:
        pass
    return {
        "key":   os.environ.get("OPENAI_API_KEY", ""),
        "base":  os.environ.get("OPENAI_BASE_URL", "https://api.openai-hk.com/v1"),
        "model": os.environ.get("PAPER_MODEL", "gpt-4o")
    }

_cfg        = _get_api_config()
OPENAI_BASE = _cfg["base"]
OPENAI_KEY  = _cfg["key"]
MODEL       = _cfg["model"]


# ── LLM Helper ──────────────────────────────────────────────────────────────
def llm(system, user, max_tokens=3000, temperature=0.4):
    if not OPENAI_KEY:
        return None
    headers = {"Authorization": f"Bearer {OPENAI_KEY}", "Content-Type": "application/json"}
    payload = {
        "model": MODEL,
        "messages": [{"role": "system", "content": system}, {"role": "user", "content": user}],
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    try:
        r = requests.post(f"{OPENAI_BASE}/chat/completions", headers=headers, json=payload, timeout=90)
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"].strip()
    except Exception as e:
        print(f"  ⚠️  LLM error: {e}")
        return None


# ── LaTeX Preamble ───────────────────────────────────────────────────────────
def latex_preamble(title, authors, abstract, paper_type="survey"):
    date_str = datetime.date.today().strftime("%B %d, %Y")
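    # Escape LaTeX special characters, skipping ones that are already escaped
    # (re-escaping "\%" would yield "\\%": a line break followed by a comment)
    safe = lambda s: re.sub(r"(?<!\\)([&%#_])", r"\\\1", s)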

    extra_packages = ""
    if paper_type == "research":
        extra_packages = """
\\usepackage{booktabs}
\\usepackage{multirow}
\\usepackage{siunitx}
\\usepackage{subcaption}"""

    return f"""\\documentclass[12pt,a4paper]{{article}}
\\usepackage[utf8]{{inputenc}}
\\usepackage[T1]{{fontenc}}
\\usepackage{{hyperref}}
\\usepackage{{geometry}}
\\geometry{{margin=1in}}
\\usepackage{{enumitem}}
\\usepackage{{parskip}}
\\usepackage{{graphicx}}
\\usepackage{{float}}
\\usepackage{{amsmath}}
\\usepackage{{amssymb}}
\\usepackage{{xcolor}}
\\usepackage{{caption}}
\\usepackage{{natbib}}
\\usepackage{{tabularx}}
\\usepackage{{longtable}}{extra_packages}

\\title{{{safe(title)}}}
\\author{{{safe(authors)}}}
\\date{{{date_str}}}

\\begin{{document}}
\\maketitle

\\begin{{abstract}}
{safe(abstract)}
\\end{{abstract}}

\\tableofcontents
\\newpage
"""


# ── Figure Generator ─────────────────────────────────────────────────────────
def generate_figures_from_data(data_file, output_dir="figures"):
    """
    Generate matplotlib figures from user's real experimental data.
    data_file: JSON with structure:
      {
        "experiments": [
          {"name": "HiDDeN", "x": [...], "y": [...], "label": "BER vs epochs"},
          ...
        ],
        "tables": [
          {"caption": "Comparison", "headers": [...], "rows": [[...], ...]},
          ...
        ]
      }
    """
    try:
        import matplotlib
        matplotlib.use("Agg")
        import matplotlib.pyplot as plt
        import numpy as np
    except ImportError:
        print("⚠️  matplotlib not installed. Run: pip install matplotlib numpy")
        return []

    os.makedirs(output_dir, exist_ok=True)

    if not os.path.exists(data_file):
        print(f"❌ Data file not found: {data_file}")
        return []

    with open(data_file) as f:
        data = json.load(f)

    figures = []

    for i, exp in enumerate(data.get("experiments", [])):
        fig, ax = plt.subplots(figsize=(8, 5))
        name = exp.get("name", f"Experiment {i+1}")
        x = exp.get("x", [])
        ys = exp.get("y", [])
        label = exp.get("label", "Value")
        xlabel = exp.get("xlabel", "X")
        ylabel = exp.get("ylabel", "Y")

        # Support multiple curves
        if ys and isinstance(ys[0], list):
            labels = exp.get("labels", [f"Series {j}" for j in range(len(ys))])
            for curve, lbl in zip(ys, labels):
                ax.plot(x, curve, marker="o", label=lbl)
            ax.legend()
        else:
            ax.plot(x, ys, marker="o", color="#2196F3", linewidth=2)

        ax.set_title(name, fontsize=14, fontweight="bold")
        ax.set_xlabel(xlabel, fontsize=12)
        ax.set_ylabel(ylabel, fontsize=12)
        ax.grid(True, alpha=0.3)
        plt.tight_layout()

        # Keep only filename-safe characters from the experiment name
        slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
        fname = f"{output_dir}/fig_{i+1}_{slug}.png"
        fig.savefig(fname, dpi=150, bbox_inches="tight")
        plt.close(fig)
        print(f"  📊 Generated figure: {fname}")
        figures.append({"file": fname, "caption": label, "name": name})

    # Bar charts for comparisons
    for i, comp in enumerate(data.get("comparisons", [])):
        methods = comp.get("methods", [])
        values = comp.get("values", [])
        metric = comp.get("metric", "Score")
        if not values:
            continue  # nothing to plot, and max() would fail on an empty list
        fig, ax = plt.subplots(figsize=(9, 5))
        colors = ["#4CAF50" if v == max(values) else "#90CAF9" for v in values]
        bars = ax.bar(methods, values, color=colors, edgecolor="black", linewidth=0.5)
        ax.bar_label(bars, fmt="%.2f", padding=3)
        ax.set_title(comp.get("title", f"Comparison: {metric}"), fontsize=14, fontweight="bold")
        ax.set_ylabel(metric, fontsize=12)
        ax.set_ylim(0, max(values) * 1.15)
        ax.grid(axis="y", alpha=0.3)
        plt.tight_layout()
        # Keep only filename-safe characters from the metric name
        slug = re.sub(r"[^a-z0-9]+", "_", metric.lower()).strip("_")
        fname = f"{output_dir}/bar_{i+1}_{slug}.png"
        fig.savefig(fname, dpi=150, bbox_inches="tight")
        plt.close(fig)
        print(f"  📊 Generated bar chart: {fname}")
        figures.append({"file": fname, "caption": comp.get("title", metric), "name": metric})

    return figures
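

# Illustrative sketch (not part of the original script): the docstring of
# generate_figures_from_data() lists "experiments" and "tables", but the code
# also reads "xlabel", "ylabel", "labels" and a top-level "comparisons" list.
# The helper name and every number below are made-up placeholders that only
# show the expected shape; they are not experimental results.
import json


def _example_data_template():
    """Return a dict in the shape generate_figures_from_data() reads."""
    return {
        "experiments": [
            {   # single curve: y is a flat list
                "name": "Training Loss", "label": "Training loss curve",
                "xlabel": "Epoch", "ylabel": "Loss",
                "x": [1, 2, 3], "y": [0.9, 0.6, 0.4],
            },
            {   # several curves: y is a list of lists, one label per curve
                "name": "BER vs Epochs", "label": "BER for two placeholder runs",
                "xlabel": "Epoch", "ylabel": "BER",
                "x": [1, 2, 3], "y": [[0.5, 0.3, 0.2], [0.5, 0.4, 0.3]],
                "labels": ["Run A", "Run B"],
            },
        ],
        "comparisons": [
            {"title": "Placeholder comparison", "metric": "Score",
             "methods": ["Method A", "Method B"], "values": [0.7, 0.8]},
        ],
        "tables": [
            {"caption": "Placeholder results", "headers": ["Method", "Score"],
             "rows": [["Method A", 0.7], ["Method B", 0.8]]},
        ],
    }


# Usage: write json.dumps(_example_data_template(), indent=2) to a file and pass
# that path as data_file, then replace the placeholders with measured values.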


def generate_tables_latex(data_file):
    """Generate LaTeX tables from experimental data."""
    if not os.path.exists(data_file):
        return ""

    with open(data_file) as f:
        data = json.load(f)

    # Escape LaTeX special characters in user data (e.g. "95%", "bit_acc"),
    # skipping ones that are already escaped
    esc = lambda s: re.sub(r"(?<!\\)([&%#_])", r"\\\1", str(s))

    latex = ""
    for table in data.get("tables", []):
        caption = table.get("caption", "Results")
        headers = table.get("headers", [])
        rows = table.get("rows", [])
        label = re.sub(r"[^a-z0-9]+", "_", caption.lower()).strip("_")

        col_fmt = "l" + "c" * (len(headers) - 1)
        header_row = " & ".join([f"\\textbf{{{esc(h)}}}" for h in headers]) + " \\\\"

        latex += f"""
\\begin{{table}}[H]
\\centering
\\caption{{{esc(caption)}}}
\\label{{tab:{label}}}
\\begin{{tabular}}{{{col_fmt}}}
\\toprule
{header_row}
\\midrule
"""
        for row in rows:
            latex += " & ".join([esc(cell) for cell in row]) + " \\\\\n"

        latex += f"""\\bottomrule
\\end{{tabular}}
\\end{{table}}

"""
    return latex


# ── Survey Paper Writer ──────────────────────────────────────────────────────
def write_survey(notes_file="notes.md", bib_file=None, topic="AI Research",
                 output="paper_survey.tex", author="Research Supervisor Pro"):

    print(f"📝 Writing survey paper: {topic}")

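    if not os.path.exists(notes_file):
        print(f"❌ {notes_file} not found. Run pdf_parser.py first.")
        sys.exit(1)

    with open(notes_file) as f:
        notes = f.read()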

    # Load citation graph if available
    graph_context = ""
    if os.path.exists("citation_graph_summary.md"):
        with open("citation_graph_summary.md") as f:
            graph_context = f.read()[:2000]

    # Load top papers
    top_context = ""
    if os.path.exists("top_papers.txt"):
        with open("top_papers.txt") as f:
            top_context = f.read()[:2000]

    context = f"Notes:\n{notes[:5000]}\n\nTop Papers:\n{top_context}\n\nCitation Graph:\n{graph_context}"

    sections = {}
    section_prompts = {
        "abstract": ("Write a 200-word academic abstract for a survey on: " + topic, 400),
        "introduction": ("Write the Introduction section (2-3 paragraphs) for a survey on " + topic + ". Include motivation, scope, and contributions. LaTeX only.", 800),
        "related_work": ("Write the Related Work section with \\subsection{} groupings. Use \\cite{} placeholders. Based on context.", 1500),
        "taxonomy": ("Write a Taxonomy/Categorization section — classify the approaches found in the notes into 3-5 categories with \\subsection{}.", 1200),
        "analysis": ("Write a Comparative Analysis section discussing strengths, weaknesses, and trade-offs of the methods.", 1000),
        "gaps": ("Write a Research Gaps and Open Problems section as a \\begin{itemize} list of specific open problems.", 800),
        "future": ("Write a Future Directions section — specific, concrete, actionable research directions.", 600),
        "conclusion": ("Write a Conclusion section (1-2 paragraphs) summarizing the survey.", 400),
    }

    sys_prompt = (
        "You are an expert academic writer specializing in AI research surveys. "
        "Write in formal academic English. Use LaTeX formatting only — no markdown. "
        "Use \\section{}, \\subsection{}, \\paragraph{} as appropriate. "
        "Include \\cite{AuthorYear} placeholders for references. "
        "Be specific, analytical, and technically precise."
    )

    for sec_key, (prompt, tokens) in section_prompts.items():
        print(f"  ✍️  Writing: {sec_key}...")
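        # context is already bounded when it is built (notes, top papers, citation graph);
        # slicing it to 4000 chars here would cut off the top papers and citation graph.
        result = llm(sys_prompt, f"{prompt}\n\nContext:\n{context}", max_tokens=tokens)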
        sections[sec_key] = result or f"% TODO: Write {sec_key} section manually\n"

    # Extract abstract for preamble: strip any abstract environment or heading the LLM added
    abstract_text = sections.get("abstract", "").replace("\\begin{abstract}", "").replace("\\end{abstract}", "").strip()
    abstract_text = re.sub(r'\\section\*?\{Abstract\}', '', abstract_text, flags=re.IGNORECASE).strip()

    # Build full paper
    latex = latex_preamble(f"A Survey on {topic}", author, abstract_text, "survey")

    latex += sections.get("introduction", "") + "\n\n"
    latex += sections.get("related_work", "") + "\n\n"
    latex += sections.get("taxonomy", "") + "\n\n"
    latex += sections.get("analysis", "") + "\n\n"
    latex += sections.get("gaps", "") + "\n\n"
    latex += sections.get("future", "") + "\n\n"
    latex += sections.get("conclusion", "") + "\n\n"

    # Bibliography
    if bib_file and os.path.exists(bib_file):
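        # splitext drops only the trailing extension; str.replace would also remove ".bib" inside the name
        latex += f"\n\\bibliographystyle{{plain}}\n\\bibliography{{{os.path.splitext(bib_file)[0]}}}\n"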
    else:
        latex += "\n% Add your bibliography file: \\bibliography{references}\n"

    latex += "\n\\end{document}\n"

    with open(output, "w") as f:
        f.write(latex)

    print(f"\n✅ Survey paper written → {output}")
    print(f"   Compile: pdflatex {output}")
    return output


# ── Research Paper Writer (with real data) ───────────────────────────────────
def write_research_paper(data_file=None, notes_file="notes.md", topic="",
                          output="paper_research.tex", author="", venue=""):

    print(f"📝 Writing research paper: {topic}")

    # Generate figures from real data
    figures = []
    tables_latex = ""
    if data_file and os.path.exists(data_file):
        print("  📊 Generating figures and tables from your data...")
        figures = generate_figures_from_data(data_file)
        tables_latex = generate_tables_latex(data_file)
    else:
        print("  ℹ️  No data file provided — paper will have placeholder figures/tables")

    # Load notes
    notes = ""
    if os.path.exists(notes_file):
        with open(notes_file) as f:
            notes = f.read()[:4000]

    # Load ideas/gaps
    ideas = ""
    if os.path.exists("ideas.md"):
        with open("ideas.md") as f:
            ideas = f.read()[:2000]

    context = f"Research topic: {topic}\nVenue: {venue}\nNotes:\n{notes}\nIdeas:\n{ideas}"

    sys_prompt = (
        "You are a senior AI researcher writing a research paper for publication. "
        "Write in formal academic English. Use LaTeX formatting only. "
        "Be technically precise. Use \\cite{} placeholders. "
        "Focus on novelty, methodology, and experimental validation."
    )

    sections = {}
    section_prompts = {
        "abstract": (f"Write a 150-word abstract for a research paper on: {topic}. Mention method, experiments, and key results.", 300),
        "introduction": (f"Write the Introduction section for a research paper on {topic} targeting {venue}. Include problem statement, motivation, contributions (as \\begin{{itemize}}), and paper organization.", 1000),
        "related_work": ("Write Related Work section with \\subsection groupings. Compare with existing methods.", 1000),
        "methodology": (f"Write the Methodology section for the proposed approach to: {topic}. Include formal problem definition, proposed method, algorithm (if applicable).", 1500),
        "experiments": ("Write the Experimental Setup section. Include: datasets used, baselines compared, evaluation metrics, implementation details.", 800),
        "results": ("Write the Results and Analysis section. Reference figures and tables with \\ref{}. Analyze results critically.", 1000),
        "conclusion": (f"Write a Conclusion section for a paper on {topic}. Summarize contributions and future work.", 400),
    }

    for sec_key, (prompt, tokens) in section_prompts.items():
        print(f"  ✍️  Writing: {sec_key}...")
        result = llm(sys_prompt, f"{prompt}\n\nContext:\n{context}", max_tokens=tokens)
        sections[sec_key] = result or f"% TODO: Write {sec_key} section\n"

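    # Strip any abstract environment the LLM added, mirroring the survey path
    abstract_text = sections.get("abstract", "").replace("\\begin{abstract}", "").replace("\\end{abstract}", "").strip()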

    latex = latex_preamble(topic, author, abstract_text, "research")

    latex += sections.get("introduction", "") + "\n\n"
    latex += sections.get("related_work", "") + "\n\n"
    latex += sections.get("methodology", "") + "\n\n"
    latex += sections.get("experiments", "") + "\n\n"

    # Insert real figures
    if figures:
        latex += "\n\\section{Results}\n\n"
        for i, fig in enumerate(figures, 1):
            rel_path = fig["file"]
            cap = fig["caption"].replace("_", "\\_")
            latex += f"""
\\begin{{figure}}[H]
\\centering
\\includegraphics[width=0.85\\textwidth]{{{rel_path}}}
\\caption{{{cap}}}
\\label{{fig:{i}}}
\\end{{figure}}

"""
        latex += sections.get("results", "") + "\n\n"
    else:
        latex += sections.get("results", "") + "\n\n"
        # Placeholder figure
        latex += """
\\begin{figure}[H]
\\centering
\\fbox{\\parbox{0.7\\textwidth}{\\centering\\vspace{2cm}[Insert Figure Here]\\vspace{2cm}}}
\\caption{Results comparison (placeholder — add your figures)}
\\label{fig:results}
\\end{figure}

"""

    # Insert real tables
    if tables_latex:
        latex += tables_latex
    else:
        # Placeholder table
        latex += """
\\begin{table}[H]
\\centering
\\caption{Performance comparison (placeholder — add your results)}
\\label{tab:results}
\\begin{tabular}{lcccc}
\\toprule
\\textbf{Method} & \\textbf{Metric 1} & \\textbf{Metric 2} & \\textbf{Metric 3} & \\textbf{Metric 4} \\\\
\\midrule
Baseline 1 & - & - & - & - \\\\
Baseline 2 & - & - & - & - \\\\
\\textbf{Ours} & \\textbf{-} & \\textbf{-} & \\textbf{-} & \\textbf{-} \\\\
\\bottomrule
\\end{tabular}
\\end{table}

"""

    latex += sections.get("conclusion", "") + "\n\n"
    latex += "\n% \\bibliographystyle{plain}\n% \\bibliography{references}\n"
    latex += "\n\\end{document}\n"

    with open(output, "w") as f:
        f.write(latex)

    print(f"\n✅ Research paper written → {output}")
    if figures:
        print(f"   Figures: {len(figures)} generated in figures/")
    print(f"   Compile: pdflatex {output}")
    return output


# ── Main ─────────────────────────────────────────────────────────────────────
if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)

    mode = sys.argv[1]

    if mode == "survey":
        notes  = sys.argv[2] if len(sys.argv) > 2 else "notes.md"
        topic  = sys.argv[3] if len(sys.argv) > 3 else "AI Research"
        output = sys.argv[4] if len(sys.argv) > 4 else "paper_survey.tex"
        author = sys.argv[5] if len(sys.argv) > 5 else "Author"
        write_survey(notes, None, topic, output, author)

    elif mode == "research":
        topic     = sys.argv[2] if len(sys.argv) > 2 else "AI Research"
        data_file = sys.argv[3] if len(sys.argv) > 3 else None
        output    = sys.argv[4] if len(sys.argv) > 4 else "paper_research.tex"
        author    = sys.argv[5] if len(sys.argv) > 5 else "Author"
        venue     = sys.argv[6] if len(sys.argv) > 6 else "IEEE Transactions"
        write_research_paper(data_file, "notes.md", topic, output, author, venue)

    else:
        print(f"Unknown mode: {mode}")
        print(__doc__)
        sys.exit(1)  # signal failure to calling scripts

FILE:scripts/pdf_parser.py
#!/usr/bin/env python3
"""
pdf_parser.py — Extract text from PDFs using pypdf (replaces deprecated PyPDF2)
Usage: python3 pdf_parser.py [papers_dir] [max_papers]
"""

import os
import sys
import json

def parse_pdfs(papers_dir="papers_pdf", max_papers=40):
    # Load metadata if available
    metadata_path = f"{papers_dir}/metadata.json"
    metadata = {}
    if os.path.exists(metadata_path):
        with open(metadata_path) as f:
            for p in json.load(f):
                metadata[p.get("filename", "")] = p

    # Try pypdf first, fallback to PyPDF2
    try:
        from pypdf import PdfReader
    except ImportError:
        try:
            from PyPDF2 import PdfReader
            print("⚠️  Using deprecated PyPDF2. Run: pip install pypdf")
        except ImportError:
            print("❌ No PDF library found. Run: pip install pypdf")
            sys.exit(1)

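    # Match the extension case-insensitively and tolerate a missing directory
    if os.path.isdir(papers_dir):
        pdf_files = [f for f in os.listdir(papers_dir) if f.lower().endswith(".pdf")]
    else:
        pdf_files = []
    pdf_files = sorted(pdf_files)[:max_papers]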

    if not pdf_files:
        print(f"❌ No PDF files found in {papers_dir}/")
        sys.exit(1)

    print(f"📖 Parsing {len(pdf_files)} PDFs...")

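    # Extracted PDF text is often non-ASCII; write UTF-8 regardless of the platform default
    with open("notes.md", "w", encoding="utf-8") as out: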
        out.write("# Research Notes — Extracted from PDFs\n\n")

        for i, pdf_file in enumerate(pdf_files):
            full_path = f"{papers_dir}/{pdf_file}"
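            # metadata.json may key entries by full path or by bare filename; accept either
            meta = metadata.get(full_path) or metadata.get(pdf_file, {})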
            title = meta.get("title", pdf_file)
            authors = meta.get("authors", [])
            abstract = meta.get("abstract", "")

            out.write(f"## {i+1}. {title}\n")
            if authors:
                out.write(f"**Authors:** {', '.join(authors[:4])}\n\n")
            if abstract:
                out.write(f"**Abstract:** {abstract[:800]}\n\n")

            # Extract PDF text
            try:
                reader = PdfReader(full_path)
                text = ""
                for page in reader.pages[:4]:  # First 4 pages
                    extracted = page.extract_text()
                    if extracted:
                        text += extracted

                # Clean up text
                text = text.replace("\x00", "").strip()
                if text:
                    out.write(f"**Extracted Content:**\n{text[:1500]}\n\n")
                else:
                    out.write("**Note:** Could not extract text (possibly scanned PDF)\n\n")

            except Exception as e:
                out.write(f"**Error:** Could not parse PDF — {e}\n\n")
                out.write("---\n\n")  # keep the entry separator even when parsing fails
                print(f"  ⚠️  [{i+1}] Parse error: {pdf_file}: {e}")
                continue

            out.write("---\n\n")
            print(f"  ✅ [{i+1}/{len(pdf_files)}] {title[:60]}...")

    print(f"\n✅ Parsed {len(pdf_files)} PDFs → notes.md")


if __name__ == "__main__":
    papers_dir = sys.argv[1] if len(sys.argv) > 1 else "papers_pdf"
    max_papers = int(sys.argv[2]) if len(sys.argv) > 2 else 40
    parse_pdfs(papers_dir, max_papers)

FILE:scripts/project_init.py
#!/usr/bin/env python3
"""
project_init.py — Initialize a new research project folder (Semi-Manual mode)
Creates: plan.md, notes.md, report.md, memory.md
Usage: python3 project_init.py "<project_title>" [base_dir]
"""

import sys
import os
import re
import datetime
import json

BASE = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro/research")


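def slugify(text):
    text = text.lower().strip()
    text = re.sub(r'[^\w\s-]', '', text)
    text = re.sub(r'[\s_-]+', '-', text)
    # Trim hyphens left at either end (including after truncation) and never return an empty slug
    return text[:50].strip('-') or "untitled"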


def init_project(title, base_dir=BASE):
    slug = slugify(title)
    project_dir = os.path.join(base_dir, slug)

    if os.path.exists(project_dir):
        print(f"⚠️  Project already exists: {project_dir}")
        print(f"   Loading existing project...")
        return project_dir, slug

    os.makedirs(project_dir, exist_ok=True)
    os.makedirs(os.path.join(project_dir, "papers_pdf"), exist_ok=True)
    os.makedirs(os.path.join(project_dir, "figures"), exist_ok=True)

    date = datetime.date.today().isoformat()

    # plan.md
    with open(os.path.join(project_dir, "plan.md"), "w") as f:
        f.write(f"# Research Plan — {title}\n\n")
        f.write(f"**Created:** {date}\n\n")
        f.write("## Goal\n[Define your research goal here]\n\n")
        f.write("## Research Questions\n1. \n2. \n3. \n\n")
        f.write("## Methodology\n[Outline your approach]\n\n")
        f.write("## Timeline\n| Milestone | Target Date | Status |\n|---|---|---|\n| Literature Review | | ⬜ |\n| Gap Analysis | | ⬜ |\n| Experiment Design | | ⬜ |\n| Implementation | | ⬜ |\n| Paper Writing | | ⬜ |\n\n")
        f.write("## Target Venue\n[Conference / Journal]\n")

    # notes.md
    with open(os.path.join(project_dir, "notes.md"), "w") as f:
        f.write(f"# Research Notes — {title}\n\n")
        f.write(f"**Created:** {date}\n\n")
        f.write("## Key Papers\n[Papers will be added here by pdf_parser.py]\n\n")
        f.write("## Key Insights\n\n## Questions to Answer\n")

    # report.md
    with open(os.path.join(project_dir, "report.md"), "w") as f:
        f.write(f"# Research Report — {title}\n\n")
        f.write(f"**Created:** {date}\n\n")
        f.write("## Abstract\n[To be written]\n\n")
        f.write("## Progress\n- [ ] Literature review complete\n- [ ] Gaps identified\n- [ ] Ideas generated\n- [ ] Experiments planned\n- [ ] Paper drafted\n")

    # memory.md
    with open(os.path.join(project_dir, "memory.md"), "w") as f:
        f.write(f"# Project Memory — {title}\n\n")
        f.write(f"**Created:** {date}\n\n")
        f.write("## Decisions Made\n\n## Key Findings\n\n## Next Steps\n\n## Session Log\n")

    # meta.json
    meta = {
        "title": title,
        "slug": slug,
        "created": date,
        "status": "active",
        "mode": "semi-manual"
    }
    with open(os.path.join(project_dir, "meta.json"), "w") as f:
        json.dump(meta, f, indent=2)

    print(f"✅ Project initialized: {project_dir}")
    print(f"   Files created: plan.md, notes.md, report.md, memory.md, meta.json")
    print(f"   Folders: papers_pdf/, figures/")
    return project_dir, slug


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print('Usage: python3 project_init.py "<project_title>" [base_dir]')
        sys.exit(1)
    title    = sys.argv[1]
    base_dir = sys.argv[2] if len(sys.argv) > 2 else BASE
    init_project(title, base_dir)

FILE:scripts/roadmap_tracker.py
#!/usr/bin/env python3
"""
roadmap_tracker.py — Track research roadmap progress (resumable across sessions)
Usage:
  python3 roadmap_tracker.py init   <project> <idea_title> <gpu> <time_weeks>
  python3 roadmap_tracker.py status <project>
  python3 roadmap_tracker.py done   <project> <step_id>       (e.g. A1, B3)
  python3 roadmap_tracker.py block  <project> <step_id> <reason>
  python3 roadmap_tracker.py unblock <project> <step_id>
  python3 roadmap_tracker.py next   <project>
  python3 roadmap_tracker.py list   <project>
"""

import sys
import os
import json
import datetime

BASE = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro/research")


def get_progress_file(project):
    return os.path.join(BASE, project, "roadmap_progress.json")


def load_progress(project):
    path = get_progress_file(project)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def save_progress(project, data):
    path = get_progress_file(project)
    # Create the project folder only when writing, so read-only commands
    # (status, next) do not leave empty folders behind for mistyped project names.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    data["last_updated"] = datetime.date.today().isoformat()
    with open(path, "w") as f:
        json.dump(data, f, indent=2)


def build_roadmap(idea_title, gpu, time_weeks):
    """Generate roadmap steps based on user setup."""
    phases = [
        {
            "id": "A",
            "name": "Environment Setup",
            "days": max(1, time_weeks // 6),
            "steps": [
                {"id": "A1", "name": "Install Python dependencies (torch, diffusers, etc.)", "status": "pending"},
                {"id": "A2", "name": f"Verify {gpu} GPU/compute is working", "status": "pending"},
                {"id": "A3", "name": "Download base models (SD, HiDDeN, etc.)", "status": "pending"},
            ]
        },
        {
            "id": "B",
            "name": "Baseline Implementation",
            "days": max(3, time_weeks // 4),
            "steps": [
                {"id": "B1", "name": "Clone/implement Baseline 1", "status": "pending"},
                {"id": "B2", "name": "Clone/implement Baseline 2", "status": "pending"},
                {"id": "B3", "name": "Run baseline experiments", "status": "pending"},
                {"id": "B4", "name": "Record baseline numbers (BER, Bit Acc, PSNR, SSIM)", "status": "pending"},
            ]
        },
        {
            "id": "C",
            "name": "Proposed Method",
            "days": max(5, time_weeks // 3),
            "steps": [
                {"id": "C1", "name": "Implement proposed approach", "status": "pending"},
                {"id": "C2", "name": "Train on dataset", "status": "pending"},
                {"id": "C3", "name": "Evaluate on metrics", "status": "pending"},
                {"id": "C4", "name": "Run ablation study", "status": "pending"},
            ]
        },
        {
            "id": "D",
            "name": "Analysis",
            "days": max(2, time_weeks // 5),
            "steps": [
                {"id": "D1", "name": "Compare results against all baselines", "status": "pending"},
                {"id": "D2", "name": "Generate figures and tables (EVE auto-generates)", "status": "pending"},
                {"id": "D3", "name": "Statistical significance / robustness tests", "status": "pending"},
            ]
        },
        {
            "id": "E",
            "name": "Paper Writing",
            "days": max(3, time_weeks // 4),
            "steps": [
                {"id": "E1", "name": "Fill experiment_data_template.json with real results", "status": "pending"},
                {"id": "E2", "name": "EVE generates all figures + LaTeX tables", "status": "pending"},
                {"id": "E3", "name": "EVE writes full LaTeX research paper", "status": "pending"},
                {"id": "E4", "name": "Review, revise, and submit", "status": "pending"},
            ]
        }
    ]
    return phases


def init_roadmap(project, idea_title, gpu, time_weeks):
    phases = build_roadmap(idea_title, gpu, int(time_weeks))
    all_steps = [s["id"] for p in phases for s in p["steps"]]

    data = {
        "project": project,
        "idea_title": idea_title,
        "gpu": gpu,
        "time_weeks": int(time_weeks),  # store as a number, matching what build_roadmap used
        "created": datetime.date.today().isoformat(),
        "last_updated": datetime.date.today().isoformat(),
        "current_phase": "A",
        "current_step": "A1",
        "completed": [],
        "blocked": [],
        "all_steps": all_steps,
        "phases": phases
    }

    save_progress(project, data)

    # Also write human-readable roadmap.md
    roadmap_path = os.path.join(BASE, project, "roadmap.md")
    total_days = sum(p["days"] for p in phases)
    # The roadmap contains emoji and box-drawing characters; write UTF-8 explicitly
    with open(roadmap_path, "w", encoding="utf-8") as f:
        f.write(f"# 🗺️ Research Roadmap — {idea_title}\n\n")
        f.write(f"**GPU:** {gpu} | **Timeline:** {time_weeks} weeks (~{total_days} days)\n\n")
        f.write("```\n")
        for phase in phases:
            f.write(f"\nPHASE {phase['id']} — {phase['name']}  [~{phase['days']} days]\n")
            for step in phase["steps"]:
                f.write(f"  ├── {step['id']}. {step['name']}\n")
        f.write("```\n\n")
        f.write("## Progress\n\n")
        f.write("Track via: `python3 roadmap_tracker.py status <project>`\n")

    print(f"✅ Roadmap initialized for: {idea_title}")
    print(f"   Steps: {len(all_steps)} across {len(phases)} phases")
    print(f"   Saved: roadmap_progress.json + roadmap.md")
    return data


def print_status(project):
    data = load_progress(project)
    if not data:
        print(f"❌ No roadmap found for project: {project}")
        print(f"   Run: python3 roadmap_tracker.py init <project> <title> <gpu> <weeks>")
        return

    completed = set(data.get("completed", []))
    blocked   = {b["step"]: b["reason"] for b in data.get("blocked", [])}
    all_steps = data.get("all_steps", [])
    total     = len(all_steps)
    done      = len(completed)
    pct       = int(done / total * 100) if total else 0
    bar       = "█" * (pct // 5) + "░" * (20 - pct // 5)

    print(f"\n{'━'*56}")
    print(f" 🗺️  RESEARCH ROADMAP — {data['idea_title']}")
    print(f" Project: {project} | GPU: {data['gpu']} | {data['time_weeks']} weeks")
    print(f" Progress: [{bar}] {pct}% ({done}/{total} steps)")
    print(f"{'━'*56}")

    for phase in data["phases"]:
        phase_done = all(s["id"] in completed for s in phase["steps"])
        icon = "✅" if phase_done else "⏳"
        print(f"\n {icon} PHASE {phase['id']} — {phase['name']}")
        for step in phase["steps"]:
            sid = step["id"]
            if sid in completed:
                mark = "✅"
            elif sid in blocked:
                mark = f"🚫 BLOCKED: {blocked[sid][:40]}"
            elif sid == data.get("current_step"):
                mark = "▶️  ← CURRENT"
            else:
                mark = "⬜"
            print(f"    {mark}  {sid}. {step['name']}")

    print(f"\n{'━'*56}")
    print(f" 📍 Current step: {data.get('current_step','?')}")
    print(f" ⏳ Remaining:    {total - done} steps")
    print(f"{'━'*56}\n")


def mark_done(project, step_id):
    data = load_progress(project)
    if not data:
        print(f"❌ No roadmap found for: {project}")
        return

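    step_id = step_id.upper()
    # Reject IDs that are not part of the roadmap so the progress count stays accurate
    if step_id not in data["all_steps"]:
        print(f"❌ Unknown step: {step_id} (valid: {', '.join(data['all_steps'])})")
        return
    if step_id not in data["completed"]:
        data["completed"].append(step_id)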

    # Remove from blocked if was blocked
    data["blocked"] = [b for b in data["blocked"] if b["step"] != step_id]

    # Find next pending step
    all_steps = data["all_steps"]
    remaining = [s for s in all_steps if s not in data["completed"]]
    data["current_step"] = remaining[0] if remaining else "COMPLETE"

    # Update current phase
    if remaining:
        data["current_phase"] = remaining[0][0]

    save_progress(project, data)
    print(f"✅ Step {step_id} marked complete!")
    if remaining:
        print(f"   Next step: {remaining[0]}")
    else:
        print(f"   🎉 ALL STEPS COMPLETE! Ready to write your paper.")


def mark_blocked(project, step_id, reason):
    data = load_progress(project)
    if not data:
        print(f"❌ No roadmap found: {project}")
        return
    step_id = step_id.upper()
    if step_id not in data["all_steps"]:
        print(f"❌ Unknown step: {step_id}")
        return
    data["blocked"] = [b for b in data["blocked"] if b["step"] != step_id]
    data["blocked"].append({"step": step_id, "reason": reason, "since": datetime.date.today().isoformat()})
    save_progress(project, data)
    print(f"🚫 Step {step_id} marked as blocked: {reason}")


def unblock(project, step_id):
    data = load_progress(project)
    if not data:
        print(f"❌ No roadmap found: {project}")
        return
    step_id = step_id.upper()
    data["blocked"] = [b for b in data["blocked"] if b["step"] != step_id]
    save_progress(project, data)
    print(f"✅ Step {step_id} unblocked!")


def next_step(project):
    data = load_progress(project)
    if not data:
        print(f"❌ No roadmap found: {project}")
        return
    step_id = data.get("current_step", "")
    if step_id == "COMPLETE":
        print("🎉 All steps complete! Ready to write your paper.")
        return
    # Find step details
    for phase in data["phases"]:
        for step in phase["steps"]:
            if step["id"] == step_id:
                print(f"\n📍 Next Step: {step_id} — {step['name']}")
                print(f"   Phase: {phase['id']} — {phase['name']}")
                return


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)

    action = sys.argv[1]

    if action == "init" and len(sys.argv) >= 6:
        init_roadmap(sys.argv[2], sys.argv[3], sys.argv[4], sys.argv[5])
    elif action == "status" and len(sys.argv) >= 3:
        print_status(sys.argv[2])
    elif action == "done" and len(sys.argv) >= 4:
        mark_done(sys.argv[2], sys.argv[3])
    elif action == "block" and len(sys.argv) >= 5:
        mark_blocked(sys.argv[2], sys.argv[3], sys.argv[4])
    elif action == "unblock" and len(sys.argv) >= 4:
        unblock(sys.argv[2], sys.argv[3])
    elif action == "next" and len(sys.argv) >= 3:
        next_step(sys.argv[2])
    else:
        print(__doc__)

FILE:scripts/semantic_ranker.py
#!/usr/bin/env python3
"""
semantic_ranker.py — Rank papers by citation count using Semantic Scholar API
Reads metadata.json from arxiv_downloader.py output
Usage: python3 semantic_ranker.py [papers_dir]
"""

import json
import os
import sys
import time
import requests

def rank_papers(papers_dir="papers_pdf"):
    metadata_path = f"{papers_dir}/metadata.json"

    if not os.path.exists(metadata_path):
        print(f"❌ metadata.json not found in {papers_dir}/")
        print("   Run arxiv_downloader.py first.")
        sys.exit(1)

    with open(metadata_path) as f:
        papers = json.load(f)

    print(f"📊 Ranking {len(papers)} papers via Semantic Scholar...")

    ranked = []

    for i, paper in enumerate(papers):
        title = paper.get("title", "")
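        arxiv_id = paper.get("arxiv_id", "").replace("_", "/")
        # Drop only a trailing version suffix ("v2"); a plain split on "v" would also
        # truncate old-style IDs whose category contains a "v" (e.g. "solv-int/...").
        base_id, sep, version = arxiv_id.rpartition("v")
        if not (sep and version.isdigit()):
            base_id = arxiv_id

        citations = 0
        year = None

        try:
            # Try ArXiv ID lookup first (more reliable)
            url = f"https://api.semanticscholar.org/graph/v1/paper/arXiv:{base_id}?fields=citationCount,year,title"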
            time.sleep(1.2)  # Respect rate limits
            r = requests.get(url, timeout=15)

            if r.status_code == 200:
                data = r.json()
                citations = data.get("citationCount", 0) or 0
                year = data.get("year")
            else:
                # Fallback: title search (pause again so the second request also respects rate limits)
                url2 = f"https://api.semanticscholar.org/graph/v1/paper/search?query={requests.utils.quote(title)}&limit=1&fields=citationCount,year"
                time.sleep(1.2)
                r2 = requests.get(url2, timeout=15)
                if r2.status_code == 200:
                    results = r2.json().get("data", [])
                    if results:
                        citations = results[0].get("citationCount", 0) or 0
                        year = results[0].get("year")

        except Exception as e:
            print(f"  ⚠️  [{i+1}] Could not fetch citations for: {title[:50]} ({e})")

        paper["citations"] = citations
        paper["year"] = year
        ranked.append(paper)
        print(f"  [{i+1}/{len(papers)}] {citations:>5} citations | {title[:55]}...")

    # Sort by citation count descending
    ranked.sort(key=lambda x: x.get("citations", 0), reverse=True)

    # Save ranked results
    with open("top_papers.json", "w") as f:
        json.dump(ranked, f, indent=2)

    # Also save human-readable txt
    with open("top_papers.txt", "w") as f:
        f.write("# Top Papers by Citation Count\n\n")
        for rank, p in enumerate(ranked[:40], 1):
            # Use .get throughout so one incomplete metadata entry does not abort the report
            f.write(f"{rank}. [{p.get('citations',0)} citations] {p.get('title','?')}\n")
            f.write(f"   Authors: {', '.join(p.get('authors', [])[:3])}\n")
            f.write(f"   Published: {p.get('published','?')[:10]}\n")
            f.write(f"   arXiv: {p.get('arxiv_id','?')}\n\n")

    print(f"\n✅ Ranked {len(ranked)} papers")
    print(f"📋 Saved to top_papers.json + top_papers.txt")
    return ranked


if __name__ == "__main__":
    papers_dir = sys.argv[1] if len(sys.argv) > 1 else "papers_pdf"
    rank_papers(papers_dir)

FILE:scripts/semantic_search.py
#!/usr/bin/env python3
"""
semantic_search.py — Full-text semantic paper search via Semantic Scholar
Better than arXiv keyword search — finds by meaning, not just title words.
Usage: python3 semantic_search.py "<query>" [limit] [output_file]
"""

import sys
import os
import json
import time
import requests

SS_BASE = "https://api.semanticscholar.org/graph/v1"
FIELDS  = "paperId,title,abstract,authors,year,citationCount,openAccessPdf,externalIds"


def search_semantic_scholar(query, limit=30, output_file="semantic_results.json"):
    print(f"🔍 Semantic Scholar search: {query}")
    print(f"   Fetching up to {limit} results...\n")

    results = []
    offset = 0
    batch = min(limit, 100)

    while len(results) < limit:
        url = (
            f"{SS_BASE}/paper/search"
            f"?query={requests.utils.quote(query)}"
            f"&limit={batch}&offset={offset}"
            f"&fields={FIELDS}"
        )
        try:
            time.sleep(1.0)
            r = requests.get(url, timeout=20)
            r.raise_for_status()
            data = r.json()
        except Exception as e:
            print(f"❌ Search failed: {e}")
            break

        batch_data = data.get("data", [])
        if not batch_data:
            break

        results.extend(batch_data)
        total = data.get("total", 0)
        print(f"  Fetched {len(results)}/{min(limit, total)} results...")

        if len(results) >= limit or len(results) >= total:
            break
        offset += batch

    results = results[:limit]

    # Enrich with arXiv IDs where available
    enriched = []
    for r in results:
        # The API returns null for missing fields, so guard with "or" instead of .get defaults
        arxiv_id = (r.get("externalIds") or {}).get("ArXiv") or ""
        open_pdf = r.get("openAccessPdf") or {}
        enriched.append({
            "paperId": r.get("paperId", ""),
            "title": r.get("title") or "",
            "abstract": r.get("abstract") or "",
            "authors": [a.get("name") or "" for a in (r.get("authors") or [])[:5]],
            "year": r.get("year"),
            "citations": r.get("citationCount") or 0,
            "arxiv_id": arxiv_id,
            "pdf_url": open_pdf.get("url") or (f"https://arxiv.org/pdf/{arxiv_id}" if arxiv_id else ""),
            "ss_url": f"https://www.semanticscholar.org/paper/{r.get('paperId','')}"
        })

    # Sort by citation count
    enriched.sort(key=lambda x: x.get("citations", 0), reverse=True)

    # Save JSON
    with open(output_file, "w") as f:
        json.dump(enriched, f, indent=2)

    # Save readable markdown
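    # Swap only the extension; str.replace would leave the name unchanged (and overwrite
    # the JSON) when output_file does not contain ".json"
    md_file = os.path.splitext(output_file)[0] + ".md"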
    with open(md_file, "w") as f:
        f.write(f"# Semantic Scholar Search Results\n")
        f.write(f"**Query:** {query}\n")
        f.write(f"**Total found:** {len(enriched)}\n\n---\n\n")

        for i, p in enumerate(enriched, 1):
            f.write(f"## {i}. {p['title']}\n")
            f.write(f"**Authors:** {', '.join(p['authors'][:3])}\n")
            f.write(f"**Year:** {p.get('year','?')} | **Citations:** {p.get('citations',0)}\n")
            if p.get("arxiv_id"):
                f.write(f"**arXiv:** [{p['arxiv_id']}](https://arxiv.org/abs/{p['arxiv_id']})\n")
            if p.get("pdf_url"):
                f.write(f"**PDF:** {p['pdf_url']}\n")
            if p.get("abstract"):
                f.write(f"\n**Abstract:** {p['abstract'][:400]}...\n")
            f.write("\n---\n\n")

    print(f"\n✅ Found {len(enriched)} papers")
    print(f"   Saved: {output_file}, {md_file}")
    return enriched


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print('Usage: python3 semantic_search.py "<query>" [limit] [output_file]')
        sys.exit(1)
    query  = sys.argv[1]
    limit  = int(sys.argv[2]) if len(sys.argv) > 2 else 30
    output = sys.argv[3] if len(sys.argv) > 3 else "semantic_results.json"
    search_semantic_scholar(query, limit, output)

FILE:scripts/server_monitor.py
#!/usr/bin/env python3
"""
server_monitor.py — Feature 8: SSH/SLURM Experiment Monitoring
Connect to your GPU server, watch jobs, auto-pull results when done.
Usage:
  python3 server_monitor.py setup                        — configure server
  python3 server_monitor.py jobs                         — list running SLURM jobs
  python3 server_monitor.py watch <job_id>               — watch job log live
  python3 server_monitor.py status                       — show all jobs + GPU usage
  python3 server_monitor.py pull <job_id> <remote_path>  — pull results when done
  python3 server_monitor.py gpu                          — check GPU usage
  python3 server_monitor.py run <script.sh>             — submit SLURM job
"""

import sys
import os
import json
import subprocess
import time
import datetime

CONFIG_FILE = os.path.expanduser(
    "~/.openclaw/workspace/research-supervisor-pro/memory/server_config.json"
)


def load_config():
    if not os.path.exists(CONFIG_FILE):
        return {}
    with open(CONFIG_FILE) as f:
        return json.load(f)


def save_config(data):
    os.makedirs(os.path.dirname(CONFIG_FILE), exist_ok=True)
    with open(CONFIG_FILE, "w") as f:
        json.dump(data, f, indent=2)
    print(f"✅ Server config saved → {CONFIG_FILE}")


def ssh_cmd(config, remote_cmd, timeout=30):
    """Run a command on the remote server via SSH."""
    host = config.get("host", "")
    user = config.get("user", "")
    port = config.get("port", 22)
    key  = config.get("ssh_key", "")

    if not host or not user:
        print("❌ Server not configured. Run: python3 server_monitor.py setup")
        return None, "Not configured"

    cmd = ["ssh", "-o", "StrictHostKeyChecking=no",
           "-o", f"ConnectTimeout={timeout}",
           "-p", str(port)]
    if key:
        cmd += ["-i", os.path.expanduser(key)]
    cmd += [f"{user}@{host}", remote_cmd]

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout+5)
        return result.stdout.strip(), result.stderr.strip()
    except subprocess.TimeoutExpired:
        return None, "SSH timeout"
    except Exception as e:
        return None, str(e)


def setup_server():
    print("\n🖥️  SERVER SETUP")
    print("━" * 50)
    print("Configure your GPU server for experiment monitoring.\n")

    cfg = load_config()

    fields = [
        ("host",     "Server hostname or IP (e.g. 10.26.xx.xx or login.hpc.edu):"),
        ("user",     "Username:"),
        ("port",     "SSH port (default 22):"),
        ("ssh_key",  "SSH key path (e.g. ~/.ssh/id_rsa, or Enter for password):"),
        ("workdir",  "Remote working directory (e.g. /home/user/research):"),
        ("scheduler","Job scheduler (slurm / pbs / none):"),
    ]

    for key, prompt in fields:
        current = cfg.get(key, "")
        val = input(f"{prompt}\n  [{current or 'not set'}] → ").strip()
        if val:
            cfg[key] = val
        elif not current and key == "port":
            cfg[key] = "22"
        elif not current and key == "scheduler":
            cfg[key] = "slurm"

    save_config(cfg)

    # Test connection
    print("\n🔍 Testing connection...")
    out, err = ssh_cmd(cfg, "echo 'EVE_CONNECTED' && hostname")
    if out and "EVE_CONNECTED" in out:
        print(f"✅ Connected to: {out.split(chr(10))[-1]}")
    else:
        print(f"❌ Connection failed: {err}")
        print("   Check your hostname, username, and SSH key.")


def list_jobs():
    cfg = load_config()
    scheduler = cfg.get("scheduler", "slurm")

    if scheduler == "slurm":
        cmd = "squeue --me --format='%.10i %.20j %.8T %.10M %.6D %R' 2>/dev/null"
    elif scheduler == "pbs":
        cmd = "qstat -u $USER 2>/dev/null"
    else:
        cmd = "ps aux | grep python | grep -v grep"

    out, err = ssh_cmd(cfg, cmd)
    if out:
        print(f"\n📋 RUNNING JOBS on {cfg.get('host','?')}:")
        print("━" * 60)
        print(out)
        print("━" * 60)
    else:
        print(f"❌ Could not list jobs: {err}")


def watch_job(job_id):
    cfg = load_config()
    workdir = cfg.get("workdir", "~")

    # Find log file
    log_patterns = [
        f"{workdir}/logs/*{job_id}*.out",
        f"{workdir}/slurm-{job_id}.out",
        f"{workdir}/*{job_id}*.log",
    ]

    log_file = None
    for pattern in log_patterns:
        out, _ = ssh_cmd(cfg, f"ls {pattern} 2>/dev/null | head -1")
        if out:
            log_file = out.strip()
            break

    if not log_file:
        # Try finding by job name
        out, _ = ssh_cmd(cfg, f"scontrol show job {job_id} 2>/dev/null | grep StdOut | awk -F= '{{print $2}}'")
        if out:
            log_file = out.strip()

    if not log_file:
        print(f"❌ Log file not found for job {job_id}")
        print("   Looking in:", workdir)
        return

    print(f"👁️  Watching job {job_id} → {log_file}")
    print("   (Ctrl+C to stop)\n")
    print("━" * 60)

    seen_lines = 0
    try:
        while True:
            out, err = ssh_cmd(cfg, f"tail -n +{seen_lines+1} {log_file} 2>/dev/null")
            if out:
                lines = out.split("\n")
                for line in lines:
                    print(line)
                    # Flag lines that suggest the job hit an error
                    low = line.lower()
                    if "error" in low or "failed" in low:
                        print("\n🚨 ALERT: Job may have failed!")
                seen_lines += len(lines)

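            # Check if job is still running. An empty reply means the job has left
            # the queue; None means the SSH call timed out, so keep polling.
            status_out, _ = ssh_cmd(cfg, f"squeue -j {job_id} --noheader 2>/dev/null")
            if status_out == "":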
                print(f"\n✅ Job {job_id} appears to have finished!")
                break

            time.sleep(10)
    except KeyboardInterrupt:
        print("\n\nStopped watching.")


def check_gpu():
    cfg = load_config()
    out, err = ssh_cmd(cfg, "nvidia-smi --query-gpu=name,memory.used,memory.total,utilization.gpu,temperature.gpu --format=csv,noheader,nounits 2>/dev/null")

    if out:
        print(f"\n🖥️  GPU STATUS on {cfg.get('host','?')}:")
        print("━" * 60)
        lines = out.strip().split("\n")
        for i, line in enumerate(lines):
            parts = [p.strip() for p in line.split(",")]
            if len(parts) >= 5:
                name, mem_used, mem_total, util, temp = parts[:5]
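                # Guard both fields: nvidia-smi can return non-numeric values here
                if mem_used.isdigit() and mem_total.isdigit() and int(mem_total) > 0:
                    mem_pct = int(mem_used) / int(mem_total) * 100
                else:
                    mem_pct = 0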
                bar = "█" * int(mem_pct / 10) + "░" * (10 - int(mem_pct / 10))
                print(f"  GPU {i}: {name}")
                print(f"    Memory:  [{bar}] {mem_used}/{mem_total} MB ({mem_pct:.0f}%)")
                print(f"    Util:    {util}% | Temp: {temp}°C")
        print("━" * 60)
    else:
        print(f"❌ Could not get GPU info: {err}")


def pull_results(job_id, remote_path, local_path=None):
    cfg = load_config()
    host = cfg.get("host", "")
    user = cfg.get("user", "")
    port = cfg.get("port", 22)
    key  = cfg.get("ssh_key", "")

    if not local_path:
        local_path = f"./results_{job_id}/"
    os.makedirs(local_path, exist_ok=True)

    print(f"📥 Pulling results from {remote_path} → {local_path}")

    cmd = ["scp", "-r", "-P", str(port), "-o", "StrictHostKeyChecking=no"]
    if key:
        cmd += ["-i", os.path.expanduser(key)]
    cmd += [f"{user}@{host}:{remote_path}", local_path]

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
        if result.returncode == 0:
            print(f"✅ Results pulled to: {local_path}")
            # List pulled files
            for f in os.listdir(local_path):
                print(f"   📄 {f}")
        else:
            print(f"❌ SCP failed: {result.stderr}")
    except Exception as e:
        print(f"❌ Error: {e}")


def server_status():
    cfg = load_config()
    if not cfg.get("host"):
        print("❌ Server not configured. Run: python3 server_monitor.py setup")
        return

    print(f"\n🖥️  SERVER: {cfg.get('user')}@{cfg.get('host')}")
    print("━" * 60)

    # CPU/Memory
    out, _ = ssh_cmd(cfg, "top -bn1 | grep 'Cpu\\|Mem' | head -2")
    if out:
        print("System:", out[:100])

    # Disk
    out, _ = ssh_cmd(cfg, f"df -h {cfg.get('workdir','~')} 2>/dev/null | tail -1")
    if out:
        print("Disk:  ", out)

    print()
    check_gpu()
    print()
    list_jobs()


if __name__ == "__main__":
    if len(sys.argv) < 2 or sys.argv[1] == "setup":
        setup_server()
    elif sys.argv[1] == "jobs":
        list_jobs()
    elif sys.argv[1] == "watch" and len(sys.argv) >= 3:
        watch_job(sys.argv[2])
    elif sys.argv[1] == "status":
        server_status()
    elif sys.argv[1] == "gpu":
        check_gpu()
    elif sys.argv[1] == "pull" and len(sys.argv) >= 4:
        local = sys.argv[4] if len(sys.argv) > 4 else None
        pull_results(sys.argv[2], sys.argv[3], local)
    elif sys.argv[1] == "run" and len(sys.argv) >= 3:
        cfg = load_config()
        out, err = ssh_cmd(cfg, f"cd {cfg.get('workdir','~')} && sbatch {sys.argv[2]}")
        print(out or err)
    else:
        print(__doc__)

FILE:scripts/session_memory.py
#!/usr/bin/env python3
"""
session_memory.py — Persistent research memory across sessions
Tracks topics, papers, gaps, ideas, experiments, and decisions over weeks.
Usage:
  python3 session_memory.py save <project> <key> <value>
  python3 session_memory.py load <project>
  python3 session_memory.py summary <project>
  python3 session_memory.py list
  python3 session_memory.py sync <project> [papers_dir]
"""

import sys
import os
import json
import datetime

BASE = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro/memory")


def get_project_file(project):
    os.makedirs(BASE, exist_ok=True)
    return os.path.join(BASE, f"{project}.json")


def load_project(project):
    path = get_project_file(project)
    if not os.path.exists(path):
        return {
            "project": project,
            "created": datetime.datetime.now().isoformat(),
            "last_updated": datetime.datetime.now().isoformat(),
            "topic": "",
            "goal": "",
            "papers": [],
            "gaps": [],
            "ideas": [],
            "experiments": [],
            "decisions": [],
            "next_steps": [],
            "sessions": []
        }
    with open(path) as f:
        return json.load(f)


def save_project(project, data):
    path = get_project_file(project)
    data["last_updated"] = datetime.datetime.now().isoformat()
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
    print(f"✅ Memory saved → {path}")


def add_entry(project, key, value):
    data = load_project(project)
    timestamp = datetime.datetime.now().isoformat()

    if key in ["topic", "goal"]:
        data[key] = value
    else:
        # List-valued keys (papers, gaps, ideas, ...) and any generic key share
        # the same timestamped entry format; create the list if it is missing.
        data.setdefault(key, []).append({"timestamp": timestamp, "content": value})

    # Log session event
    data.setdefault("sessions", []).append({
        "timestamp": timestamp,
        "action": f"Added {key}: {str(value)[:80]}"
    })

    save_project(project, data)


def print_summary(project):
    data = load_project(project)
    print(f"\n{'='*60}")
    print(f"📚 Project: {data['project']}")
    print(f"   Topic: {data.get('topic', 'Not set')}")
    print(f"   Goal:  {data.get('goal', 'Not set')}")
    print(f"   Created: {data['created'][:10]}")
    print(f"   Last Updated: {data['last_updated'][:10]}")
    print(f"{'='*60}")
    print(f"📄 Papers tracked:      {len(data.get('papers', []))}")
    print(f"🔬 Gaps identified:     {len(data.get('gaps', []))}")
    print(f"💡 Ideas generated:     {len(data.get('ideas', []))}")
    print(f"🧪 Experiments planned: {len(data.get('experiments', []))}")
    print(f"📝 Decisions logged:    {len(data.get('decisions', []))}")
    print(f"➡️  Next steps:          {len(data.get('next_steps', []))}")

    if data.get("next_steps"):
        print(f"\n🎯 Next Steps:")
        for s in data["next_steps"][-3:]:
            print(f"  - {s['content']}")

    if data.get("sessions"):
        print(f"\n🕐 Recent Activity:")
        for s in data["sessions"][-5:]:
            print(f"  [{s['timestamp'][:16]}] {s['action']}")
    print(f"{'='*60}\n")


def list_projects():
    if not os.path.exists(BASE):
        print("No projects found.")
        return
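    # server_monitor.py and thesis_context.py keep their own JSON files in this
    # directory; they are not projects, so leave them out of the listing.
    non_projects = {"server_config.json", "thesis_context.json"}
    files = sorted(f[:-len(".json")] for f in os.listdir(BASE)
                   if f.endswith(".json") and f not in non_projects)
    if not files:
        print("No projects found.")
        return
    print(f"\n📂 Research Projects ({len(files)}):")
    for p in files:
        data = load_project(p)
        print(f"  - {p} | Topic: {str(data.get('topic','?'))[:40]} | Updated: {str(data.get('last_updated','?'))[:10]}")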


def sync_from_pipeline(project, papers_dir="papers_pdf"):
    """Auto-sync papers from pipeline output into memory."""
    data = load_project(project)
    metadata_path = f"{papers_dir}/metadata.json"
    if os.path.exists(metadata_path):
        with open(metadata_path) as f:
            papers = json.load(f)
        existing_ids = {p["content"].get("arxiv_id") for p in data["papers"] if isinstance(p.get("content"), dict)}
        new = 0
        for p in papers:
            if p.get("arxiv_id") not in existing_ids:
                data["papers"].append({
                    "timestamp": datetime.datetime.now().isoformat(),
                    "content": {
                        "arxiv_id": p.get("arxiv_id"),
                        "title": p.get("title"),
                        "authors": p.get("authors", [])[:3],
                        "published": p.get("published", "")[:10]
                    }
                })
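                if p.get("arxiv_id"):
                    # Avoid adding the same paper twice from one metadata file
                    existing_ids.add(p["arxiv_id"])
                new += 1
        save_project(project, data)
        print(f"✅ Synced {new} new papers into memory")
    else:
        print(f"❌ No metadata found at {metadata_path}; nothing to sync")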


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(__doc__)
        sys.exit(1)

    action = sys.argv[1]

    if action == "list":
        list_projects()
    elif action == "load" and len(sys.argv) >= 3:
        print_summary(sys.argv[2])
    elif action == "summary" and len(sys.argv) >= 3:
        print_summary(sys.argv[2])
    elif action == "save" and len(sys.argv) >= 5:
        add_entry(sys.argv[2], sys.argv[3], sys.argv[4])
    elif action == "sync" and len(sys.argv) >= 3:
        papers_dir = sys.argv[3] if len(sys.argv) > 3 else "papers_pdf"
        sync_from_pipeline(sys.argv[2], papers_dir)
    else:
        print(__doc__)

FILE:scripts/thesis_context.py
#!/usr/bin/env python3
"""
thesis_context.py — Feature 4: Thesis Context File
Stores YOUR specific research context so gap detection and idea generation
are targeted to your exact thesis, not generic.
Usage:
  python3 thesis_context.py init                    — interactive setup
  python3 thesis_context.py show                    — show current context
  python3 thesis_context.py update <field> <value>  — update a field
  python3 thesis_context.py export                  — export as context string for LLM
"""

import sys
import os
import json
import datetime

CONTEXT_FILE = os.path.expanduser(
    "~/.openclaw/workspace/research-supervisor-pro/memory/thesis_context.json"
)


def load_context():
    if not os.path.exists(CONTEXT_FILE):
        return {}
    with open(CONTEXT_FILE) as f:
        return json.load(f)


def save_context(data):
    os.makedirs(os.path.dirname(CONTEXT_FILE), exist_ok=True)
    data["last_updated"] = datetime.datetime.now().isoformat()
    with open(CONTEXT_FILE, "w") as f:
        json.dump(data, f, indent=2)
    print(f"✅ Thesis context saved → {CONTEXT_FILE}")


def interactive_init():
    print("\n🎓 THESIS CONTEXT SETUP")
    print("━" * 50)
    print("I'll remember your exact research context forever.")
    print("This makes gap detection and ideas much more targeted.\n")

    ctx = load_context()

    questions = [
        ("thesis_title",      "Thesis title:"),
        ("your_claim",        "Your main claim / contribution (1-2 sentences):"),
        ("baseline_paper",    "Your key baseline paper (arXiv ID or title):"),
        ("baseline_result",   "Baseline result you're trying to beat (e.g. 41.2% bit accuracy):"),
        ("your_method",       "Your proposed method (brief description):"),
        ("attack_types",      "Attack types you're testing against:"),
        ("watermark_methods", "Watermark methods you're using (e.g. HiDDeN, StegaStamp):"),
        ("datasets",          "Datasets you're using:"),
        ("metrics",           "Evaluation metrics (e.g. BER, PSNR, SSIM):"),
        ("target_venue",      "Target venue (journal/conference):"),
        ("supervisor",        "Supervisor name:"),
        ("deadline",          "Submission deadline:"),
        ("university",        "University / department:"),
    ]

    for key, prompt in questions:
        current = ctx.get(key, "")
        if current:
            val = input(f"{prompt}\n  [current: {current[:60]}] → ").strip()
            if not val:
                val = current
        else:
            val = input(f"{prompt}\n  → ").strip()
        if val:
            ctx[key] = val

    save_context(ctx)
    print("\n✅ Thesis context saved! EVE will now give you targeted suggestions.")
    return ctx


def show_context():
    ctx = load_context()
    if not ctx:
        print("❌ No thesis context found. Run: python3 thesis_context.py init")
        return

    print("\n🎓 YOUR THESIS CONTEXT")
    print("━" * 50)
    fields = [
        ("thesis_title",      "Thesis Title"),
        ("your_claim",        "Your Claim"),
        ("baseline_paper",    "Baseline Paper"),
        ("baseline_result",   "Baseline Result"),
        ("your_method",       "Your Method"),
        ("attack_types",      "Attack Types"),
        ("watermark_methods", "Watermark Methods"),
        ("datasets",          "Datasets"),
        ("metrics",           "Metrics"),
        ("target_venue",      "Target Venue"),
        ("supervisor",        "Supervisor"),
        ("deadline",          "Deadline"),
        ("university",        "University"),
    ]
    for key, label in fields:
        val = ctx.get(key, "—")
        print(f"  {label:<22}: {val}")
    print(f"\n  Last updated: {ctx.get('last_updated','?')[:16]}")
    print("━" * 50)


def export_context():
    """Export as a compact string for injection into LLM prompts."""
    ctx = load_context()
    if not ctx:
        return ""

    lines = [
        "=== USER THESIS CONTEXT (use this to make suggestions specific) ===",
        f"Thesis: {ctx.get('thesis_title', '')}",
        f"Claim: {ctx.get('your_claim', '')}",
        f"Baseline: {ctx.get('baseline_paper', '')} — result: {ctx.get('baseline_result', '')}",
        f"Method: {ctx.get('your_method', '')}",
        f"Attacks tested: {ctx.get('attack_types', '')}",
        f"Watermark methods: {ctx.get('watermark_methods', '')}",
        f"Datasets: {ctx.get('datasets', '')}",
        f"Metrics: {ctx.get('metrics', '')}",
        f"Target venue: {ctx.get('target_venue', '')}",
        f"Deadline: {ctx.get('deadline', '')}",
        "=== END CONTEXT ===",
    ]
    return "\n".join(lines)


def update_field(field, value):
    ctx = load_context()
    ctx[field] = value
    save_context(ctx)
    print(f"✅ Updated {field} → {value}")


if __name__ == "__main__":
    if len(sys.argv) < 2 or sys.argv[1] == "init":
        interactive_init()
    elif sys.argv[1] == "show":
        show_context()
    elif sys.argv[1] == "export":
        print(export_context())
    elif sys.argv[1] == "update" and len(sys.argv) >= 4:
        update_field(sys.argv[2], sys.argv[3])
    else:
        print(__doc__)

FILE:scripts/venue_checklist.py
#!/usr/bin/env python3
"""
venue_checklist.py — Feature 6: Venue-Specific Paper Checklists
Generates a checklist of requirements for your target venue.
Built-in knowledge of major AI/ML venues.
Usage:
  python3 venue_checklist.py <venue_name>
  python3 venue_checklist.py list           — show all supported venues
  python3 venue_checklist.py check <project> <venue>  — save checklist to project
"""

import sys
import os
import json
import datetime

BASE = os.path.expanduser("~/.openclaw/workspace/research-supervisor-pro/research")

# ── Venue Database ────────────────────────────────────────────────────────────
VENUES = {
    "ieee tifs": {
        "full_name": "IEEE Transactions on Information Forensics and Security",
        "type": "journal",
        "impact": "Q1, IF ~6.8",
        "pages": "10-14 pages (double column)",
        "abstract_limit": "250 words",
        "review_type": "Double-blind",
        "turnaround": "3-6 months",
        "requirements": [
            "Double-column IEEE format",
            "Abstract max 250 words",
            "Keywords: 5-10 IEEE Terms",
            "Must include: thorough related work section",
            "Must include: ablation study",
            "Must include: comparison vs at least 3 baselines",
            "Must include: statistical significance analysis",
            "Ethics statement if human subjects involved",
            "Code/data availability statement recommended",
            "Supplementary material allowed (unlimited)",
            "No page limit for references",
        ],
        "avoid": [
            "Overclaiming — avoid 'state-of-the-art' without proof",
            "Missing ablation study — reviewers will reject",
            "Only 1-2 baselines — minimum 3 required",
            "Weak evaluation — must test on multiple datasets",
        ],
        "tips": [
            "TIFS reviewers are security-focused — frame contributions in security context",
            "Include threat model section",
            "Show robustness to both known and unknown attacks",
            "Reproducibility is valued — share code",
        ]
    },
    "neurips": {
        "full_name": "Neural Information Processing Systems",
        "type": "conference",
        "impact": "Top-tier, A*",
        "pages": "9 pages (content) + unlimited references",
        "abstract_limit": "150 words (submission system)",
        "review_type": "Double-blind",
        "turnaround": "~4 months",
        "requirements": [
            "NeurIPS LaTeX template (required)",
            "9 pages max for main paper",
            "Unlimited pages for references",
            "Supplementary appendix allowed",
            "Abstract max 150 words in system",
            "Reproducibility checklist must be completed",
            "Broader impact statement required",
            "Limitations section strongly recommended",
            "Code/data release strongly encouraged for reproducibility",
            "Checklist: theory / experiments / dataset / model",
        ],
        "avoid": [
            "No limitations section — red flag for reviewers",
            "Missing reproducibility checklist",
            "Vague experimental setup — be very specific",
            "No broader impact — required since 2020",
        ],
        "tips": [
            "NeurIPS values novelty + theoretical depth",
            "Include proofs in appendix if you have theory",
            "Strong empirical results matter as much as theory",
            "Clear, clean figures — reviewers read on laptop screens",
        ]
    },
    "cvpr": {
        "full_name": "IEEE/CVF Conference on Computer Vision and Pattern Recognition",
        "type": "conference",
        "impact": "Top-tier, A*",
        "pages": "8 pages + references",
        "abstract_limit": "~200 words",
        "review_type": "Double-blind",
        "turnaround": "~3 months",
        "requirements": [
            "CVPR LaTeX template (required)",
            "8 pages max (main) + unlimited references",
            "Supplementary up to 100MB",
            "Anonymous submission (no author names)",
            "Cite your own prior work in the third person (do not reveal authorship)",
            "Good quality figures — this is a vision venue",
            "Ablation study expected",
            "Comparison to recent methods (within 2 years)",
        ],
        "avoid": [
            "Low quality figures — CVPR is a visual venue",
            "Old baselines — must compare to recent methods",
            "Missing ablation",
            "Citing your own previous work non-anonymously",
        ],
        "tips": [
            "Visual results sell papers — invest in figures",
            "Include failure cases in supplementary",
            "User study or perceptual quality metrics valued",
            "Teaser figure on page 1 is critical",
        ]
    },
    "iccv": {
        "full_name": "International Conference on Computer Vision",
        "type": "conference",
        "impact": "Top-tier, A*",
        "pages": "8 pages + references",
        "abstract_limit": "~200 words",
        "review_type": "Double-blind",
        "turnaround": "~4 months",
        "requirements": [
            "ICCV LaTeX template",
            "8 pages max + unlimited references",
            "Supplementary material allowed",
            "Double-blind submission",
            "High-quality experimental evaluation",
            "Ablation study expected",
        ],
        "avoid": ["Similar to CVPR — see CVPR checklist"],
        "tips": ["Similar to CVPR — visual quality of results is key"]
    },
    "acm mm": {
        "full_name": "ACM International Conference on Multimedia",
        "type": "conference",
        "impact": "Top-tier multimedia, A",
        "pages": "8 pages + 2 for references",
        "abstract_limit": "150 words",
        "review_type": "Double-blind",
        "turnaround": "~3 months",
        "requirements": [
            "ACM template",
            "8 pages content + 2 pages references",
            "Multimodal evaluation preferred",
            "Demo/system paper track available",
            "Reproducibility encouraged",
        ],
        "avoid": [
            "Single-modal experiments for a multimedia venue",
            "Missing perceptual quality evaluation",
        ],
        "tips": [
            "Strong for watermarking — security + multimedia intersection",
            "Include perceptual quality metrics (PSNR, SSIM, LPIPS)",
        ]
    },
    "ieee tsp": {
        "full_name": "IEEE Transactions on Signal Processing",
        "type": "journal",
        "impact": "Q1, IF ~5.4",
        "pages": "12-14 pages double column",
        "abstract_limit": "200 words",
        "review_type": "Single-blind",
        "turnaround": "4-8 months",
        "requirements": [
            "IEEE double-column format",
            "Strong mathematical formulation required",
            "Theoretical analysis valued",
            "Comprehensive experiments",
            "Must include complexity analysis",
        ],
        "avoid": [
            "Weak mathematical contribution",
            "Missing theoretical bounds or analysis",
        ],
        "tips": [
            "TSP values signal processing theory — formalize your method mathematically",
            "Include convergence analysis if optimization-based",
        ]
    },
    "thesis": {
        "full_name": "Master's / PhD Thesis",
        "type": "thesis",
        "impact": "Academic degree requirement",
        "pages": "60-150 pages typically",
        "abstract_limit": "300-500 words",
        "review_type": "Committee review",
        "turnaround": "Per university schedule",
        "requirements": [
            "University LaTeX/Word template",
            "Full literature review chapter",
            "Methodology chapter",
            "Experiments chapter with all details",
            "Conclusion and future work",
            "Complete bibliography",
            "List of figures and tables",
            "Acknowledgements section",
            "Declaration of originality",
            "All code and data archived",
        ],
        "avoid": [
            "Insufficient related work — committees expect exhaustive review",
            "Missing implementation details — must be reproducible",
            "No future work section",
        ],
        "tips": [
            "Include a chapter introduction and summary for each chapter",
            "Be verbose — more detail is better than less for a thesis",
            "Add a glossary for all technical terms",
            "Include negative results — shows thorough investigation",
        ]
    }
}


def normalize_venue(name):
    return name.lower().strip()


def get_venue(name):
    norm = normalize_venue(name)
    # An empty string would match every key in the partial-match loop below
    if not norm:
        return None
    # Exact match
    if norm in VENUES:
        return VENUES[norm]
    # Partial match
    for key in VENUES:
        if key in norm or norm in key:
            return VENUES[key]
    return None


def show_checklist(venue_name):
    venue = get_venue(venue_name)
    if not venue:
        print(f"❌ Venue '{venue_name}' not found.")
        print(f"   Supported: {', '.join(VENUES.keys())}")
        print(f"   Or use 'thesis' for thesis format")
        return None

    print(f"\n📋 VENUE CHECKLIST: {venue['full_name']}")
    print(f"   Type: {venue['type']} | Impact: {venue['impact']}")
    print(f"   Pages: {venue['pages']} | Abstract: {venue['abstract_limit']}")
    print(f"   Review: {venue['review_type']} | Turnaround: {venue['turnaround']}")
    print()

    print("✅ REQUIREMENTS:")
    for req in venue["requirements"]:
        print(f"   [ ] {req}")

    print("\n❌ AVOID:")
    for avoid in venue["avoid"]:
        print(f"   ⚠️  {avoid}")

    print("\n💡 TIPS:")
    for tip in venue["tips"]:
        print(f"   → {tip}")

    print()
    return venue


def save_to_project(project, venue_name):
    venue = get_venue(venue_name)
    if not venue:
        print(f"❌ Venue not found: {venue_name}")
        return

    project_dir = os.path.join(BASE, project)
    os.makedirs(project_dir, exist_ok=True)
    checklist_path = os.path.join(project_dir, "venue_checklist.md")

    with open(checklist_path, "w") as f:
        f.write(f"# 📋 Venue Checklist: {venue['full_name']}\n\n")
        f.write(f"**Type:** {venue['type']}  \n")
        f.write(f"**Impact:** {venue['impact']}  \n")
        f.write(f"**Pages:** {venue['pages']}  \n")
        f.write(f"**Abstract limit:** {venue['abstract_limit']}  \n")
        f.write(f"**Review type:** {venue['review_type']}  \n\n")

        f.write("## ✅ Requirements\n\n")
        for req in venue["requirements"]:
            f.write(f"- [ ] {req}\n")

        f.write("\n## ❌ Avoid\n\n")
        for avoid in venue["avoid"]:
            f.write(f"- ⚠️  {avoid}\n")

        f.write("\n## 💡 Tips\n\n")
        for tip in venue["tips"]:
            f.write(f"- {tip}\n")

        f.write(f"\n---\n*Generated by EVE — {datetime.date.today()}*\n")

    print(f"✅ Checklist saved → {checklist_path}")
    return checklist_path


def list_venues():
    print("\n📋 SUPPORTED VENUES:")
    print("━" * 50)
    for key, v in VENUES.items():
        print(f"  {key:<15} {v['full_name'][:45]}")
    print("━" * 50)


if __name__ == "__main__":
    if len(sys.argv) < 2 or sys.argv[1] == "list":
        list_venues()
    elif sys.argv[1] == "check":
        if len(sys.argv) >= 4:
            # Join the remaining args so multi-word venues like "ieee tifs" work unquoted
            save_to_project(sys.argv[2], " ".join(sys.argv[3:]))
        else:
            print(__doc__)
    else:
        show_checklist(" ".join(sys.argv[1:]))

FILE:templates/experiment_data_template.json
{
  "_comment": "Template for your real experimental data. The numbers below are placeholders; replace them with your actual results.",
  "_usage": "python3 paper_writer.py research 'Your Topic' experiment_data.json output.tex 'Your Name' 'IEEE TIFS'",

  "experiments": [
    {
      "name": "Bit Error Rate vs Training Epochs",
      "xlabel": "Epoch",
      "ylabel": "Bit Error Rate (BER)",
      "label": "BER convergence during training",
      "x": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
      "y": [0.45, 0.38, 0.29, 0.21, 0.15, 0.10, 0.08, 0.06, 0.05, 0.04]
    },
    {
      "name": "Robustness Under Different Attacks",
      "xlabel": "Attack Type",
      "ylabel": "Bit Accuracy (%)",
      "label": "Watermark survival rate under different attack types",
      "x": ["Gaussian Noise", "JPEG Compress", "Crop", "Rotate", "LLM-guided"],
      "y": [
        [88.2, 85.1, 79.3, 72.4, 41.2],
        [91.5, 89.2, 83.1, 78.9, 65.3]
      ],
      "labels": ["HiDDeN (baseline)", "Ours (proposed)"]
    }
  ],

  "comparisons": [
    {
      "title": "Watermark Detection Accuracy Under LLM-Guided Attack",
      "metric": "Detection Accuracy (%)",
      "methods": ["HiDDeN", "StegaStamp", "Tree-Ring", "ROBIN", "Ours"],
      "values": [41.2, 43.7, 38.9, 61.4, 78.5]
    },
    {
      "title": "Image Quality (PSNR) After Watermark Embedding",
      "metric": "PSNR (dB)",
      "methods": ["HiDDeN", "StegaStamp", "Tree-Ring", "ROBIN", "Ours"],
      "values": [34.2, 35.1, 36.8, 33.9, 35.7]
    }
  ],

  "tables": [
    {
      "caption": "Comparison of Watermarking Methods Under LLM-Guided Adversarial Attacks",
      "headers": ["Method", "BER (↓)", "Bit Acc (↑)", "PSNR (↑)", "SSIM (↑)", "Detection Acc (↑)"],
      "rows": [
        ["HiDDeN \\cite{Zhu2018}", "0.31", "41.2\\%", "34.2", "0.91", "41.2\\%"],
        ["StegaStamp \\cite{Tancik2020}", "0.29", "43.7\\%", "35.1", "0.93", "43.7\\%"],
        ["Tree-Ring \\cite{Wen2023}", "0.33", "38.9\\%", "36.8", "0.94", "38.9\\%"],
        ["ROBIN \\cite{Robin2024}", "0.19", "61.4\\%", "33.9", "0.90", "61.4\\%"],
        ["\\textbf{Ours}", "\\textbf{0.11}", "\\textbf{78.5\\%}", "\\textbf{35.7}", "\\textbf{0.93}", "\\textbf{78.5\\%}"]
      ]
    },
    {
      "caption": "Ablation Study: Effect of Each Component",
      "headers": ["Configuration", "BER (↓)", "Bit Acc (↑)", "PSNR (↑)"],
      "rows": [
        ["Base model only", "0.31", "41.2\\%", "34.2"],
        ["+ Adversarial training", "0.18", "65.3\\%", "35.1"],
        ["+ Attack taxonomy augmentation", "0.14", "72.8\\%", "35.5"],
        ["+ Full pipeline (Ours)", "0.11", "78.5\\%", "35.7"]
      ]
    }
  ]
}