@clawhub-nick-tsyen-19f8837cfa
Export GitHub starred repositories by category and sync them to a Notion database.
---
name: github-stars-notion-sync
description: Export GitHub starred repositories by category and sync them to a Notion database.
---
# GitHub Stars to Notion Sync Skill
This skill automates exporting your GitHub starred repositories (grouped by custom lists/categories) and syncing them into a structured Notion database.
## Instructions
When this skill is active, you can perform the following tasks:
### 1. Export GitHub Stars
Use the shell script in `./scripts/export_stars.sh` to fetch the starred repositories in each of your GitHub lists and save them to `./assets/starred_lists.md`. Stars that are not assigned to any list are not exported.
- **Requirement**: GitHub CLI (`gh`) must be installed and authenticated.
- **Output**: A Markdown file with tables for each category.
### 2. Sync to Notion
Use the Python script in `./scripts/sync_stars_to_notion_db.py` to parse the exported Markdown and populate a Notion database.
- **Requirement**: `NOTION_API_KEY` environment variable must be set.
- **Requirement**: `requests` library must be installed.
- **Config**: Local state is tracked in `./assets/.notion_sync_config.json`.
### 3. Workflow
1. From the repository root, run `bash scripts/export_stars.sh`.
2. From the same directory, run `python3 scripts/sync_stars_to_notion_db.py`.
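The two steps depend on each other: the export script truncates `assets/starred_lists.md` before writing, and the sync script archives the existing Notion rows before it parses that file. A failed export followed by a sync can therefore leave the database empty. The following is a minimal sketch of a runner that stops at the first failing step. The names `PIPELINE` and `run_pipeline` are hypothetical and not part of this skill; the commands are the ones from `agent.yaml`.

```python
import subprocess
from typing import Callable, Sequence

# Commands taken from agent.yaml; both expect the repository root as cwd.
PIPELINE = [
    ["bash", "scripts/export_stars.sh"],
    ["python3", "scripts/sync_stars_to_notion_db.py"],
]


def run_pipeline(
    commands: Sequence[Sequence[str]] = PIPELINE,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> int:
    """Run each command in order and stop at the first non-zero exit code.

    Returns the number of commands that completed successfully.
    """
    completed = 0
    for command in commands:
        result = runner(list(command))
        if result.returncode != 0:
            print(f"Step failed ({result.returncode}): {' '.join(command)}")
            break
        completed += 1
    return completed
```

Calling `run_pipeline()` from the repository root runs both steps. The `runner` parameter exists only so the control flow can be exercised without touching GitHub or Notion.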
## Tool Definitions
- **export_stars**: Fetches GitHub stars and updates `./assets/starred_lists.md`.
- **sync_to_notion**: Syncs the contents of `./assets/starred_lists.md` to Notion.
FILE:requirements.txt
requests
FILE:agent.yaml
name: github-stars-notion-sync-agent
version: 1.0.0
description: An agent that manages GitHub starred repository backups and Notion sync.
tools:
  - name: export_stars
    description: Export all GitHub starred repositories categorized by custom lists to a local Markdown file.
    command: bash scripts/export_stars.sh
  - name: sync_to_notion
    description: Sync the exported GitHub starred repositories to a Notion database.
    command: python3 scripts/sync_stars_to_notion_db.py
environment:
  required_vars:
    - NOTION_API_KEY
dependencies:
  - python: requests
  - cli: gh
  - cli: jq
FILE:README.md
# GitHub Stars Notion Sync Agent Skill
A specialized agent skill designed to automate the backup and synchronization of your GitHub starred repositories into a structured Notion database. This skill organizes your stars by GitHub's custom lists (categories), making it easier to manage and search your collection.
## 📁 Repository Structure
```text
github-stars-notion-sync/
├── SKILL.md                 # (Required) Skill definition for Gemini CLI
├── agent.yaml               # Agent metadata and tool mapping
├── README.md                # Main documentation
├── requirements.txt         # Python dependencies
├── scripts/                 # Implementation logic
│   ├── export_stars.sh
│   └── sync_stars_to_notion_db.py
├── references/              # Supplemental documentation
│   ├── export_stars.md
│   └── sync_stars.md
└── assets/                  # Data and local state
    ├── starred_lists.md
    └── .notion_sync_config.json
```
## 🚀 Getting Started
### Prerequisites
1. **GitHub CLI (`gh`)**: Must be installed and authenticated.
2. **jq**: Required for JSON processing in the shell script.
3. **Notion API Key**: Obtain a token from [Notion Developers](https://developers.notion.com/) and set it as an environment variable:
```bash
export NOTION_API_KEY="ntn_..."
```
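The sync script reads the key with `os.environ.get("NOTION_API_KEY")` and does not check the result, so a missing variable only shows up later as a failed API call. As a hedged illustration, a fail-fast check could look like the sketch below. `build_headers` is a hypothetical helper, not a function in the script; the header names and version string mirror the ones the script sends.

```python
import os

NOTION_VERSION = "2022-06-28"  # same API version string the sync script sends


def build_headers(env=os.environ) -> dict:
    """Return Notion request headers, or raise if NOTION_API_KEY is unset or empty."""
    token = env.get("NOTION_API_KEY", "").strip()
    if not token:
        raise RuntimeError(
            "NOTION_API_KEY is not set; export it before running the sync."
        )
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Notion-Version": NOTION_VERSION,
    }
```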
### Installation
Install Python dependencies:
```bash
pip install -r requirements.txt
```
## 🛠️ Usage
This skill exposes two primary tools:
### 1. `export_stars`
Fetches the starred repositories in each of your GitHub Lists and writes one Markdown table per list.
- **Run manually**: `bash scripts/export_stars.sh`
- **Output**: `./assets/starred_lists.md`
### 2. `sync_to_notion`
Parses the local Markdown file and populates/updates a Notion database.
- **Run manually**: `python3 scripts/sync_stars_to_notion_db.py`
- **State**: Tracks the Notion database ID in `./assets/.notion_sync_config.json`.
## 📚 Documentation
Detailed information for each script can be found in the `references/` directory.
- [Exporting Stars](./references/export_stars.md)
- [Syncing to Notion](./references/sync_stars.md)
FILE:scripts/export_stars.sh
#!/bin/bash
# Define the output file name
OUTPUT_FILE="assets/starred_lists.md"
echo "Fetching starred repositories organized by category (supports >100 lists and >100 stars per list)..."
> "$OUTPUT_FILE"
# 1. Fetch all lists first (handling pagination via gh api --paginate)
echo "Fetching all categories..."
LISTS_JSON=$(gh api graphql --paginate -f query='
  query($endCursor: String) {
    viewer {
      lists(first: 100, after: $endCursor) {
        pageInfo { hasNextPage endCursor }
        nodes {
          id
          name
        }
      }
    }
  }' --jq '.data.viewer.lists.nodes[]' 2>/dev/null)
if [ -z "$LISTS_JSON" ]; then
  echo "No lists found or an error occurred."
  exit 1
fi
# 2. Iterate over the lists and fetch items for each one
echo "$LISTS_JSON" | jq -c '.' | while read -r list; do
  list_id=$(echo "$list" | jq -r '.id')
  list_name=$(echo "$list" | jq -r '.name')
  echo "Processing category: $list_name..."
  echo "## $list_name" >> "$OUTPUT_FILE"
  echo "" >> "$OUTPUT_FILE"
  echo "| Repo name | Repo handler | Full URL to Repo | Number of Stars |" >> "$OUTPUT_FILE"
  echo "|---|---|---|---|" >> "$OUTPUT_FILE"
  # Fetch paginated items for this specific list
  # Note: The '?' allows jq to handle cases where items might be empty without erroring out
  gh api graphql --paginate -F id="$list_id" -f query='
    query($id: ID!, $endCursor: String) {
      node(id: $id) {
        ... on UserList {
          items(first: 100, after: $endCursor) {
            pageInfo { hasNextPage endCursor }
            nodes {
              ... on Repository {
                name
                owner { login }
                url
                stargazerCount
              }
            }
          }
        }
      }
    }' --jq '.data.node.items.nodes[]? | select(. != null) | "| \(.name) | \(.owner.login) | \(.url) | \(.stargazerCount) |"' >> "$OUTPUT_FILE"
  echo "" >> "$OUTPUT_FILE"
done
echo "Successfully saved all repositories to $OUTPUT_FILE"
FILE:scripts/sync_stars_to_notion_db.py
import argparse
import json
import os
import sys
try:
    import requests
    import urllib3
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
except ImportError:
    print("The 'requests' library is required. Install it using: pip install requests")
    sys.exit(1)
# Notion integration token, read from the NOTION_API_KEY environment variable (no fallback)
NOTION_TOKEN = os.environ.get("NOTION_API_KEY")
NOTION_VERSION = "2022-06-28"
CONFIG_FILE = "assets/.notion_sync_config.json"
DEFAULT_PARENT_PAGE_ID = "f94aa417-3269-4fa6-a869-dc5b22eb1cca"
HEADERS = {
    "Authorization": f"Bearer {NOTION_TOKEN}",
    "Content-Type": "application/json",
    "Notion-Version": NOTION_VERSION
}
def load_config():
    if os.path.exists(CONFIG_FILE):
        with open(CONFIG_FILE, 'r') as f:
            return json.load(f)
    return {}
def save_config(config):
    with open(CONFIG_FILE, 'w') as f:
        json.dump(config, f, indent=4)
def parse_markdown(filepath):
    """Parses markdown and returns a list of dictionaries representing the rows."""
    if not os.path.exists(filepath):
        print(f"File {filepath} not found.")
        sys.exit(1)
    with open(filepath, 'r', encoding='utf-8') as f:
        lines = f.readlines()
    rows = []
    current_category = "Uncategorized"
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("## "):
            # Multi-select options in Notion cannot contain commas
            current_category = line[3:].strip().replace(",", "")[:100]
        elif line.startswith("|") and not line.startswith("|---"):
            cells = [cell.strip() for cell in line.split("|")[1:-1]]
            # Ignore the header row
            if len(cells) == 4 and cells[0].lower() != "repo name":
                try:
                    stars = int(cells[3].replace(",", ""))
                except ValueError:
                    stars = 0
                rows.append({
                    "repo_name": cells[0],
                    "repo_handler": cells[1],
                    "url": cells[2],
                    "stars": stars,
                    "category": current_category
                })
    return rows
def create_database(db_name, parent_id):
    """Creates a new Notion database and returns its ID."""
    url = "https://api.notion.com/v1/databases"
    payload = {
        "parent": {"type": "page_id", "page_id": parent_id},
        "title": [
            {"type": "text", "text": {"content": db_name}}
        ],
        "properties": {
            "Repo name": {"title": {}},
            "Repo handler": {"rich_text": {}},
            "Full URL to Repo": {"url": {}},
            "Number of Stars": {"number": {"format": "number"}},
            "category": {"multi_select": {}}
        }
    }
    response = requests.post(url, headers=HEADERS, json=payload, verify=False)
    if response.status_code != 200:
        print(f"Error creating database: {response.text}")
        sys.exit(1)
    return response.json()["id"]
def clear_database(db_id):
    """Archives all existing pages in the database so it can be 'overwritten'."""
    print(f"Clearing existing entries in database {db_id}...")
    query_url = f"https://api.notion.com/v1/databases/{db_id}/query"
    has_more = True
    next_cursor = None
    count = 0
    while has_more:
        payload = {}
        if next_cursor:
            payload["start_cursor"] = next_cursor
        response = requests.post(query_url, headers=HEADERS, json=payload, verify=False)
        if response.status_code != 200:
            print(f"Error querying database for clearing: {response.text}")
            break
        data = response.json()
        pages = data.get("results", [])
        for page in pages:
            page_id = page["id"]
            patch_url = f"https://api.notion.com/v1/pages/{page_id}"
            requests.patch(patch_url, headers=HEADERS, json={"archived": True}, verify=False)
            count += 1
        has_more = data.get("has_more", False)
        next_cursor = data.get("next_cursor")
    print(f"Cleared {count} existing entries.")
def insert_row(db_id, row):
    """Inserts a single row into the Notion database."""
    url = "https://api.notion.com/v1/pages"
    payload = {
        "parent": {"type": "database_id", "database_id": db_id},
        "properties": {
            "Repo name": {
                "title": [{"type": "text", "text": {"content": row["repo_name"]}}]
            },
            "Repo handler": {
                "rich_text": [{"type": "text", "text": {"content": row["repo_handler"]}}]
            },
            "Number of Stars": {
                "number": row["stars"]
            },
            "category": {
                "multi_select": [{"name": row["category"]}]
            }
        }
    }
    # Only assign URL if it looks like a valid URL start to avoid Notion API schema errors
    url_val = row["url"]
    if url_val.startswith("http"):
        payload["properties"]["Full URL to Repo"] = {"url": url_val}
    response = requests.post(url, headers=HEADERS, json=payload, verify=False)
    if response.status_code != 200:
        print(f"Error inserting row for {row['repo_name']}: {response.text}")
def main():
    parser = argparse.ArgumentParser(description="Sync GitHub stars markdown to a Notion Database.")
    parser.add_argument("--input", default="assets/starred_lists.md", help="Path to the input markdown file.")
    parser.add_argument("--db-name", default="Starred GitHub Repositories DB", help="Name of the Notion Database.")
    parser.add_argument("--parent-id", default=DEFAULT_PARENT_PAGE_ID, help="ID of the parent Notion page.")
    args = parser.parse_args()
    # Check config for existing database ID mapping
    config = load_config()
    db_id = config.get(args.db_name)
    if db_id:
        print(f"Found existing database ID in config for '{args.db_name}': {db_id}")
        clear_database(db_id)
    else:
        print(f"Creating new database '{args.db_name}'...")
        db_id = create_database(args.db_name, args.parent_id)
        config[args.db_name] = db_id
        save_config(config)
        print(f"Database created with ID: {db_id}")
    print(f"Parsing {args.input}...")
    rows = parse_markdown(args.input)
    if not rows:
        print("No content found to sync.")
        sys.exit(0)
    print(f"Inserting {len(rows)} rows into the Notion Database...")
    for i, row in enumerate(rows, 1):
        insert_row(db_id, row)
        if i % 10 == 0:
            print(f"Inserted {i}/{len(rows)} rows...")
    print("Sync complete!")
if __name__ == "__main__":
    main()
FILE:references/sync_stars.md
# Sync Stars to Notion DB (`sync_stars_to_notion_db.py`)
## 📖 Overview
The `sync_stars_to_notion_db.py` script bridges the gap between an exported local markdown list of GitHub Starred repositories and your Notion workspace. Given a markdown file structured with tables and category headings, this script interprets the data and programmatically builds a native **Notion Database** populated with your repositories.
Unlike traditional exports that just create a Notion text block or simple subpage layout, this script generates an actionable, sortable, and filterable database using explicit Notion property types (Options, URLs, Numbers, and Text).
## ✨ Key Features
- **Database Property Mapping**: The script parses tabular markdown and assigns it robust data types within Notion:
  - `Repo name` ➔ `title`
  - `Repo handler` ➔ `rich_text`
  - `Full URL to Repo` ➔ `url`
  - `Number of Stars` ➔ `number`
  - `category` ➔ `multi_select`
- **Idempotent Syncing (Smart Overwrites)**: The tool uses a local state file, `assets/.notion_sync_config.json`, to map each database name to its Notion Database ID. If you rerun the script with the same database name, it avoids duplicates by archiving the existing rows before inserting the fresh ones. If the state file is deleted, the next run creates a new database.
- **CLI Configuration**: `argparse` options (`--input`, `--db-name`, `--parent-id`) set the input file, the database name, and the parent page.
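To make the property mapping concrete, here is a minimal sketch of how one parsed row could be turned into the `properties` object of a Notion page. It follows the shapes used in `insert_row`, but `row_to_properties` itself is a hypothetical helper, and the sample row reuses values from the export example.

```python
import json


def row_to_properties(row: dict) -> dict:
    """Map one parsed table row onto the Notion property types listed above."""
    properties = {
        "Repo name": {
            "title": [{"type": "text", "text": {"content": row["repo_name"]}}]
        },
        "Repo handler": {
            "rich_text": [{"type": "text", "text": {"content": row["repo_handler"]}}]
        },
        "Number of Stars": {"number": row["stars"]},
        "category": {"multi_select": [{"name": row["category"]}]},
    }
    # The url property is only set when the value looks like a URL.
    if row["url"].startswith("http"):
        properties["Full URL to Repo"] = {"url": row["url"]}
    return properties


if __name__ == "__main__":
    sample = {
        "repo_name": "cleanlab",
        "repo_handler": "cleanlab",
        "url": "https://github.com/cleanlab/cleanlab",
        "stars": 9324,
        "category": "Machine Learning",
    }
    print(json.dumps(row_to_properties(sample), indent=2))
```

Because each row carries exactly one category, a repository that sits in two GitHub lists ends up as two rows, each with a single `multi_select` option.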
## ⚙️ Prerequisites
1. **Python Packages**: The script requires the `requests` library. Install it using pip:
```bash
pip install requests
```
2. **Notion API Key**:
The script reads your integration token from the `NOTION_API_KEY` environment variable:
```bash
export NOTION_API_KEY="ntn_..."
```
*(Note: The script has no fallback key. If `NOTION_API_KEY` is not set, the Notion API calls fail with an authentication error. The only hardcoded default is the parent page ID.)*
---
## 🚀 Usage
### Basic Execution
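Run the script with all defaults (reads `assets/starred_lists.md` and creates or updates a database named "Starred GitHub Repositories DB"). The default input and state paths are relative to the working directory, so in the repository layout run the commands below from the repository root as `python scripts/sync_stars_to_notion_db.py`: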
```bash
python sync_stars_to_notion_db.py
```
### Specifying a Custom Title (State Identifier)
The database name is the key under which the database ID is stored in the state file, so a new name creates a new database:
```bash
python sync_stars_to_notion_db.py --db-name "My Awesome Repo Collection"
```
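The state file is a flat JSON object whose keys are database names and whose values are Notion database IDs. A small sketch of that lookup, assuming the same file layout the script writes; the function names and the IDs below are hypothetical placeholders:

```python
import json
import os
import tempfile


def lookup_database(config_path: str, db_name: str):
    """Return the stored database ID for db_name, or None if it is unknown."""
    if not os.path.exists(config_path):
        return None
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f).get(db_name)


def remember_database(config_path: str, db_name: str, db_id: str) -> None:
    """Store db_id under db_name, keeping any other entries in the file."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path, "r", encoding="utf-8") as f:
            config = json.load(f)
    config[db_name] = db_id
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, ".notion_sync_config.json")
        remember_database(path, "Starred GitHub Repositories DB", "hypothetical-id-1")
        remember_database(path, "My Awesome Repo Collection", "hypothetical-id-2")
        print(lookup_database(path, "My Awesome Repo Collection"))
        print(lookup_database(path, "Unknown name"))
```

A name that is missing from the file is what sends the script down the "create a new database" path.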
### Declaring a Different Data Source
Use the `--input` argument to read a different local Markdown file:
```bash
python sync_stars_to_notion_db.py --input "other_stars.md"
```
### Assigning a Parent Page Destination
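Set the Notion page that hosts a newly created database by passing its UUID. The default is a page ID hardcoded in the script, so pass your own. The option only takes effect when a database is created, not when an existing one is updated: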
```bash
python sync_stars_to_notion_db.py --parent-id "YOUR_NOTION_PAGE_UUID_HERE"
```
### View Help Menu
```bash
python sync_stars_to_notion_db.py --help
```
---
## 🧠 How It Works Under the Hood
1. **Local State Check**
The script starts by looking for `.notion_sync_config.json`. It tries to find a pre-existing Notion Database ID linked to the requested `--db-name`.
2. **Smart Database Management**
   - **If an ID is found (Updating):** The script calls the `/v1/databases/{id}/query` endpoint page by page and sets `"archived": true` on every existing row. The database itself, its schema, and its sharing permissions stay in place, so the fresh rows land in the same database.
   - **If no ID is found (Creation):** The script sends a `POST` request to the `/v1/databases` endpoint to create a new database with the property types listed above, places it under the page given by `--parent-id`, and saves the new database ID to `.notion_sync_config.json`.
3. **Data Parsing (`parse_markdown`)**
The script processes your target file line by line:
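   - Captures the text following each `## ` header and uses it as the `category` for the rows below it. Rows that appear before any header get the category `Uncategorized`.
   - Strips commas from the header text, because Notion does not allow commas in `multi_select` option names, and truncates it to 100 characters.
   - Splits each Markdown table row on `|`, skips the header and separator rows, and removes commas from the Number of Stars value so it can be stored as an integer. A value that is not numeric is stored as `0`.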
4. **Data Injection (`insert_row`)**
After establishing the schema, the script loops through your parsed categories/tables and inserts them progressively as individual database items. It prints its sync progress to your console every 10 rows.
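The clearing step in item 2 is a cursor-paginated loop. The sketch below isolates that loop from the HTTP calls so it can be read and tested on its own. `archive_all_pages`, `query_page`, and `archive_page` are hypothetical names; the response shape (`results`, `has_more`, `next_cursor`) is the one `clear_database` reads. Unlike the script, this version also counts failed archive calls, which is an illustrative addition.

```python
from typing import Callable, Optional, Tuple


def archive_all_pages(
    query_page: Callable[[Optional[str]], dict],
    archive_page: Callable[[str], bool],
) -> Tuple[int, int]:
    """Walk every result page of a database query and archive each entry.

    query_page(start_cursor) returns a dict shaped like a Notion database
    query response: {"results": [...], "has_more": bool, "next_cursor": ...}.
    archive_page(page_id) returns True when the archive request succeeded.
    Returns (archived, failed).
    """
    archived = failed = 0
    cursor = None
    while True:
        data = query_page(cursor)
        for page in data.get("results", []):
            if archive_page(page["id"]):
                archived += 1
            else:
                failed += 1
        if not data.get("has_more"):
            break
        cursor = data.get("next_cursor")
    return archived, failed
```

In a real run, `query_page` would wrap the `POST /v1/databases/{id}/query` call and `archive_page` the `PATCH /v1/pages/{page_id}` call that the script already makes.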
FILE:references/export_stars.md
# Export GitHub Stars Script
The `export_stars.sh` script is a command-line utility that exports the starred GitHub repositories you have assigned to custom GitHub Lists (categories) and writes one Markdown table per list. Stars that are not in any list are not exported, and a repository that is in several lists appears once in each table.
## Features
- **Category Grouping**: Organizes your stars directly into the Lists you created on GitHub.
- **Full Pagination Support**: Paginates through both lists and repositories. A single GraphQL request returns at most 100 nodes per connection, so the script follows the `endCursor` of each page until `hasNextPage` is false.
- **Markdown Ready**: Generates a Markdown file (`starred_lists.md`) that can be viewed in any Markdown editor or committed to a repository.
## Requirements
To run this script, you will need the following dependencies installed on your system:
1. **[GitHub CLI (`gh`)](https://cli.github.com/)**: Standard command-line tool for GitHub.
   - You must be authenticated with the CLI and have the necessary scopes.
   - Run `gh auth status` to check if you are logged in.
   - If not, run `gh auth login` and follow the prompts.
2. **[jq](https://jqlang.github.io/jq/)**: A lightweight and flexible command-line JSON processor.
   - Mac: `brew install jq`
   - Debian/Ubuntu: `sudo apt-get install jq`
   - Arch Linux: `sudo pacman -S jq`
## Usage
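1. Open your terminal in the repository root. The script writes to `assets/starred_lists.md` relative to the current directory.
2. Ensure the script is executable. If it isn't, run:
```bash
chmod +x scripts/export_stars.sh
```
3. Execute the script:
```bash
./scripts/export_stars.sh
```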
## Output Format
The script writes its output to `assets/starred_lists.md`, relative to the current working directory, and overwrites any previous export. Each category name becomes a level 2 header, followed by a Markdown table.
### Columns Included
- **Repo name**: The name of the repository.
- **Repo handler**: The username or organization that owns the repository.
- **Full URL to Repo**: The direct link to the repository on GitHub.
- **Number of Stars**: The current total stargazer count for that repository.
### Output Example
```markdown
## 🤖 Machine Learning
| Repo name | Repo handler | Full URL to Repo | Number of Stars |
|---|---|---|---|
| cleanlab | cleanlab | https://github.com/cleanlab/cleanlab | 9324 |
| human-learn | koaning | https://github.com/koaning/human-learn | 1500 |
```
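As an illustration of how this format can be read back, the following sketch parses the example above into one record per repository. It mirrors the approach of `parse_markdown` in the sync script but is a simplified, hypothetical version: it takes a string instead of a file path and skips the comma and length handling.

```python
from typing import Dict, List

SAMPLE = """\
## 🤖 Machine Learning
| Repo name | Repo handler | Full URL to Repo | Number of Stars |
|---|---|---|---|
| cleanlab | cleanlab | https://github.com/cleanlab/cleanlab | 9324 |
| human-learn | koaning | https://github.com/koaning/human-learn | 1500 |
"""


def parse_star_tables(markdown: str) -> List[Dict[str, object]]:
    """Return one dict per table row, tagged with the nearest preceding header."""
    rows: List[Dict[str, object]] = []
    category = "Uncategorized"
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            category = line[3:].strip()
        elif line.startswith("|") and not line.startswith("|---"):
            cells = [cell.strip() for cell in line.split("|")[1:-1]]
            # Skip the column header row and anything that is not a 4-cell row.
            if len(cells) != 4 or cells[0].lower() == "repo name":
                continue
            rows.append({
                "repo_name": cells[0],
                "repo_handler": cells[1],
                "url": cells[2],
                "stars": int(cells[3]),
                "category": category,
            })
    return rows


if __name__ == "__main__":
    for row in parse_star_tables(SAMPLE):
        print(row["category"], row["repo_name"], row["stars"])
```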
## How It Works Under The Hood
The script performs the following operations:
1. Uses the GitHub GraphQL API, through `gh api graphql --paginate`, to fetch every page of your custom Lists.
2. Uses `jq` to read the `id` and `name` of each list.
3. Loops over the list IDs and runs one paginated GraphQL query per list to fetch the repositories in it. Each query requests 100 items per page and follows the cursor until no pages remain.
4. Appends each repository to the Markdown file as a table row, formatted with `jq` string interpolation.
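The four steps above amount to two nested cursor loops followed by row formatting. The Python sketch below shows that structure with the network calls abstracted away. `paginate`, `render_markdown`, `fetch_lists`, and `fetch_items` are hypothetical names; each fetch function is assumed to return one page shaped like the GraphQL connections in the script (`nodes` plus `pageInfo` with `hasNextPage` and `endCursor`).

```python
from typing import Callable, Dict, Iterator, List, Optional

Page = Dict[str, object]


def paginate(fetch: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every node from a cursor-paginated connection."""
    cursor = None
    while True:
        page = fetch(cursor)
        for node in page["nodes"]:
            if node:  # skip null or empty nodes
                yield node
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return
        cursor = info["endCursor"]


def render_markdown(
    fetch_lists: Callable[[Optional[str]], Page],
    fetch_items: Callable[[str, Optional[str]], Page],
) -> str:
    """Build one header and one table per list, in the order lists are returned."""
    out: List[str] = []
    for star_list in paginate(fetch_lists):
        out.append(f"## {star_list['name']}")
        out.append("")
        out.append("| Repo name | Repo handler | Full URL to Repo | Number of Stars |")
        out.append("|---|---|---|---|")
        # Bind the list ID now so the inner loop pages through this list only.
        list_id = star_list["id"]
        for repo in paginate(lambda cursor, lid=list_id: fetch_items(lid, cursor)):
            out.append(
                f"| {repo['name']} | {repo['owner']['login']} "
                f"| {repo['url']} | {repo['stargazerCount']} |"
            )
        out.append("")
    return "\n".join(out)
```

Passing the fetch functions in as parameters keeps the loop logic independent of how the pages are obtained, whether through `gh`, an HTTP client, or canned test data.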