@clawhub-tsingcode-57351dce42
---
name: ppt-compressor
description: This skill should be used when the user wants to compress a PowerPoint (.pptx) file by reducing the size of embedded videos and large images. It handles the complete workflow of extracting media from .pptx archives, compressing videos with a bundled ffmpeg (no installation required) and images with Pillow, and repackaging into a valid .pptx file. Trigger phrases include "compress PPT", "reduce PPT size", "compress PowerPoint", "PPT too large", "shrink presentation", "compress videos in PPT", "compress images in PPT", "PPT压缩", "压缩PPT", "PPT文件太大", or any mention of reducing .pptx file sizes involving embedded media. Supports Windows, macOS, and Linux.
---
# PPT Compressor (Videos & Images)
## Overview
Compress embedded videos and large images (>1MB) in PowerPoint (.pptx) files to significantly reduce file size while maintaining playback compatibility. The skill provides a bundled Python script with **built-in ffmpeg** — no installation required.
## Prerequisites
- **Python 3.7+** must be installed
- **Pillow** — automatically installed if missing (used for image compression)
- **ffmpeg** — **bundled** in `{SKILL_DIR}/scripts/bin/`. No manual installation needed!
- If the bundled ffmpeg is missing, run: `python {SKILL_DIR}/scripts/download_ffmpeg.py` to re-download
- Supports automatic download for Windows, macOS, and Linux
## Architecture
```
{SKILL_DIR}/
├── SKILL.md                      # This file
└── scripts/
    ├── compress_ppt_videos.py    # Main compression script (cross-platform)
    ├── compress.py               # Simplified entry point wrapper
    ├── path_helper.py            # Cross-platform path utilities
    ├── download_ffmpeg.py        # FFmpeg downloader (Windows/macOS/Linux)
    └── bin/
        ├── ffmpeg[.exe]          # Bundled ffmpeg (platform-specific)
        └── ffprobe[.exe]         # Bundled ffprobe (platform-specific)
```
### FFmpeg Resolution Order
The script looks for ffmpeg in the following order and uses the first match:
1. **Bundled version** in `{SKILL_DIR}/scripts/bin/` (preferred)
2. **System PATH** as fallback
This means users **never need to install ffmpeg manually**.
## User Interaction Guide (IMPORTANT for UX)
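### Getting the User's PPT File Path
When the user asks to compress a PPT but **does not provide a file path**, the Agent should:
#### Method 1: Ask the user to drag and drop the file (recommended, works on every platform)
Tell the user directly:
> 📂 **Please drag the PPT file you want to compress here**, or send me the file path directly.
#### Method 2: Give path-copying instructions for the user's operating system
**Windows users:**
> 💡 In File Explorer, hold Shift and right-click the file, choose "Copy as path", then paste it here.
**macOS users:**
> 💡 In Finder, select the file and press `Option + Command + C` to copy its path, then paste it here.
> Alternatively: right-click the file, hold the Option key, and choose "Copy xxx as Pathname".
**Linux users:**
> 💡 In the file manager, right-click the file and choose "Copy Path".
> Or use the terminal: `readlink -f /path/to/file.pptx`
#### Method 3: Detect paths in the user's message automatically
The Agent should recognize the various path formats that can appear in a user message: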
| OS | Example user input | Path the Agent should extract |
|------|-------------|---------------------|
| Windows | `压缩这个 C:\Users\user\Desktop\报告.pptx` | `C:/Users/user/Desktop/报告.pptx` |
| Windows | `"D:/工作文档/演示文稿.pptx" 太大了` | `D:/工作文档/演示文稿.pptx` |
| macOS | `/Users/john/Documents/presentation.pptx` | `/Users/john/Documents/presentation.pptx` |
| macOS | `~/Desktop/report.pptx 压缩一下` | `~/Desktop/report.pptx` |
| Linux | `/home/user/documents/slides.pptx` | `/home/user/documents/slides.pptx` |
| Any | File dragged in directly (appears as path text) | Extract the full path automatically |
**Reference regular expressions for path detection:**
```
Windows absolute path: [A-Za-z]:[\\\/][^\s"'<>|*?]+\.pptx
Unix absolute path:    /[^\s"'<>|*?]+\.pptx
Home-directory path:   ~/[^\s"'<>|*?]+\.pptx
Quoted path:           ["'][^"']+\.pptx["']
```
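As a rough illustration of how the patterns above could be applied, the sketch below tries the quoted form first (so that paths containing spaces survive) and then falls back to the unquoted forms. The function name `find_pptx_candidates` and the case-insensitive flag are assumptions made for this example; the bundled `extract_pptx_paths` in `path_helper.py` may behave differently.

```python
import re

# Quoted paths: everything between a pair of quotes that ends in .pptx
QUOTED = re.compile(r'''["']([^"']+\.pptx)["']''', re.IGNORECASE)

# Unquoted paths: Windows drive, home directory, or Unix root prefix,
# followed by characters that cannot be whitespace or shell metacharacters
UNQUOTED = re.compile(r'''(?:[A-Za-z]:[\\/]|~/|/)[^\s"'<>|*?]+\.pptx''', re.IGNORECASE)


def find_pptx_candidates(message):
    """Return candidate .pptx paths found in a user message (hypothetical helper)."""
    quoted = QUOTED.findall(message)
    if quoted:
        return quoted
    return UNQUOTED.findall(message)


if __name__ == '__main__':
    print(find_pptx_candidates('~/Desktop/report.pptx 压缩一下'))
```

Note that the unquoted patterns stop at the first whitespace character, so a path with spaces is only recovered when the user wraps it in quotes.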
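#### Method 4: If no path can be detected, ask politely
If the user's message contains no clear file path, ask with the template below that matches **the user's operating system**:
**Windows:**
```
I need to know where the PPT file is before I can compress it. Please tell me in any of these ways:
1. **Drag and drop**: drag the .pptx file straight into the chat box
2. **Copy the path**: in File Explorer, hold Shift and right-click the file → "Copy as path"
3. **Type it in**: for example `C:\Users\your-username\Desktop\filename.pptx`
```
**macOS:**
```
I need to know where the PPT file is before I can compress it. Please tell me in any of these ways:
1. **Drag and drop**: drag the .pptx file straight into the chat box
2. **Copy the path**: in Finder, select the file and press Option + Command + C
3. **Type it in**: for example `/Users/your-username/Documents/filename.pptx` or `~/Desktop/filename.pptx`
```
**Linux:**
```
I need to know where the PPT file is before I can compress it. Please tell me in any of these ways:
1. **Drag and drop**: drag the .pptx file straight into the chat box
2. **Copy the path**: in the file manager, right-click the file → "Copy Path"
3. **Type it in**: for example `/home/your-username/Documents/filename.pptx` or `~/Documents/filename.pptx`
```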
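### Path Cleanup and Normalization
After obtaining a path, the Agent should:
1. **Strip surrounding quotes**: `"C:\path\file.pptx"` → `C:\path\file.pptx`
2. **Convert to forward slashes**: `C:\Users\file.pptx` → `C:/Users/file.pptx` (accepted by Python on Windows)
3. **Check the file extension**: make sure it is a `.pptx` file
4. **Check that the file exists**: verify this before running anything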
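The four steps can be sketched as a single function. This is only an illustration: `clean_user_path` is a hypothetical name, the `~` expansion is an extra convenience not listed above, and the bundled `normalize_path` / `validate_path` helpers may differ in detail.

```python
import os


def clean_user_path(raw):
    """Apply the cleanup steps to a raw path string (hypothetical helper).

    Returns a (path, exists) tuple; raises ValueError for non-.pptx input.
    """
    # 1. Strip surrounding whitespace and quotes
    path = raw.strip().strip('"\'')
    # 2. Use forward slashes, which Python accepts on Windows as well
    #    (assumes backslashes are separators, not part of a file name)
    path = path.replace('\\', '/')
    # Expand a leading ~ so the existence check sees a real path
    path = os.path.expanduser(path)
    # 3. Check the extension
    if not path.lower().endswith('.pptx'):
        raise ValueError(f"Not a .pptx file: {path}")
    # 4. Report whether the file exists
    return path, os.path.isfile(path)
```

Returning the existence flag instead of raising lets the Agent decide how to phrase the follow-up question to the user.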
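### How to Check That the File Exists
Before running the compression, the Agent should first confirm that the file exists:
```bash
python -c "import os; print('✓ File exists' if os.path.exists(r'<path>') else '✗ File not found')"
```
If the file does not exist, politely ask the user to check that the path is correct.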
---
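## Complete Agent Execution Flow
When the user asks to compress a PPT, the Agent should follow these steps:
### Step 1: Check whether the user's message contains a path
Use `path_helper.py` or a regular expression to extract the path from the user's message:
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from path_helper import extract_pptx_paths; paths = extract_pptx_paths(r'''<full user message>'''); print(paths[0] if paths else 'NO_PATH_FOUND')"
```
### Step 2: Decide the next step from the result
**Case A: a path was found** → go to Step 3 to validate it
**Case B: no path was found** → ask politely according to the user's operating system (see "User Interaction Guide" above)
**Quick version (any platform):**
```
📂 Please send me the path of the PPT file you want to compress.
**Easiest way:** just drag the file into the chat box!
Or copy the file path and paste it here.
```
### Step 3: Validate the file
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from path_helper import validate_path; r = validate_path(r'<extracted path>'); print(f'Valid: {r[\"valid\"]}, Size: {r[\"size_mb\"]}MB' if r['valid'] else f'Error: {r[\"error\"]}')"
```
### Step 4: Run the compression
Once validation passes, run the compression (using a Python one-liner):
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'<validated path>')"
```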
### Step 5: Report the result
When the compression finishes, tell the user:
- Original file size
- Compressed file size
- Compression ratio
- Output file location
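The four items in the report can be derived from two byte counts and the output path. The sketch below is one possible formatting; `compression_summary` is a hypothetical helper, not part of the bundled scripts.

```python
def compression_summary(original_bytes, compressed_bytes, output_path):
    """Build a short report covering the four items listed above."""
    mib = 1024 * 1024
    # Percentage saved relative to the original; guard against empty input
    if original_bytes > 0:
        saved_pct = (1 - compressed_bytes / original_bytes) * 100
    else:
        saved_pct = 0.0
    return (
        f"Original size:   {original_bytes / mib:.2f} MB\n"
        f"Compressed size: {compressed_bytes / mib:.2f} MB\n"
        f"Reduction:       {saved_pct:.1f}%\n"
        f"Output file:     {output_path}"
    )


if __name__ == '__main__':
    print(compression_summary(100 * 1024 * 1024, 25 * 1024 * 1024, 'deck_compressed.pptx'))
```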
---
## CRITICAL: Execution Instructions (Agent MUST Follow)
### ⚠️ IMPORTANT: Avoid Path Parsing Issues
When running the compression script, you **MUST** use the following approach to avoid shell path parsing issues on Windows:
**RECOMMENDED METHOD (Most Reliable):**
Use Python's `-c` flag with raw strings to bypass shell escaping issues:
```bash
python -c "
import sys
sys.path.insert(0, r'{SKILL_DIR}/scripts')
from compress_ppt_videos import compress_pptx
compress_pptx(r'<input_path>')
"
```
**ALTERNATIVE: Use forward slashes with double-quoted paths:**
```bash
python "{SKILL_DIR}/scripts/compress_ppt_videos.py" "<input_path>"
```
### Common Pitfalls to AVOID:
1. **DON'T** use backslashes with `cd` in PowerShell - it may fail silently
2. **DON'T** use complex path concatenation in shells
3. **DON'T** attempt multiple `cd` commands in sequence
4. **DO** use raw strings (r'...') in Python for Windows paths
5. **DO** use forward slashes in paths when possible (Python accepts them on Windows)
6. **DO** double-quote paths containing spaces or non-ASCII characters
### Handling Chinese Filenames / Paths with Spaces
Chinese characters and spaces in paths are common. Always:
- Wrap file paths in quotes: `"path/to/文件.pptx"`
- Use raw strings in Python: `r"C:\Users\用户\桌面\文件.pptx"`
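When the script is launched from Python rather than typed into a shell, one way to sidestep quoting entirely is to pass the command as an argument list, so that no shell ever parses the path. The sketch below assumes hypothetical helper names (`build_compress_command`, `run_compress`); only the script path and its command-line options come from this document.

```python
import subprocess
import sys


def build_compress_command(script_path, pptx_path, *options):
    """Return an argument list; each element reaches the script unmodified."""
    return [sys.executable, script_path, pptx_path, *options]


def run_compress(script_path, pptx_path, *options):
    """Run the compression script without going through a shell."""
    cmd = build_compress_command(script_path, pptx_path, *options)
    # With a list and the default shell=False, spaces and non-ASCII
    # characters in pptx_path need no quoting or escaping
    return subprocess.run(cmd).returncode


if __name__ == '__main__':
    print(build_compress_command('scripts/compress_ppt_videos.py',
                                 'C:/Users/用户/桌面/季度 报告.pptx', '--crf', '32'))
```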
## Workflow
### Quick Start
To compress a PPT file with default settings (compresses both videos and images >1MB):
```bash
python "{SKILL_DIR}/scripts/compress_ppt_videos.py" "<input.pptx>"
```
This produces `<input>_compressed.pptx` in the same directory.
### How It Works
1. **Extract** — The `.pptx` file (ZIP archive) is extracted to a temporary directory
2. **Discover Videos** — All video files under `ppt/media/` are identified (supports: `.mp4`, `.avi`, `.mov`, `.wmv`, `.m4v`, `.mkv`, `.webm`, `.flv`, `.mpeg`, `.mpg`)
3. **Compress Videos** — Each video is compressed using ffmpeg with H.264 codec and AAC audio
4. **Discover Images** — All image files > 1MB under `ppt/media/` are identified (supports: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.tiff`, `.tif`, `.webp`)
5. **Compress Images** — Each large image is compressed using Pillow with quality optimization and optional downscaling
6. **Smart Replace** — Compressed files only replace originals if they are actually smaller
7. **Repackage** — The modified directory is re-zipped into a valid `.pptx` file
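Because a `.pptx` file is an ordinary ZIP archive, the discovery steps can also be previewed without extracting anything. The following is a minimal sketch using the standard `zipfile` module; `list_media` is a hypothetical helper and is not how the bundled script performs discovery (it extracts first and then scans `ppt/media/`).

```python
import zipfile


def list_media(pptx_path):
    """Return sorted (name, uncompressed_size) pairs for entries under ppt/media/."""
    with zipfile.ZipFile(pptx_path) as zf:
        return sorted(
            (info.filename, info.file_size)
            for info in zf.infolist()
            if info.filename.startswith('ppt/media/') and not info.is_dir()
        )
```

Calling `list_media("deck.pptx")` on a real presentation gives a quick view of which embedded files dominate the package size.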
### Command-Line Options
```
python compress_ppt_videos.py <input.pptx> [options]
General Options:
-o, --output PATH Output file path (default: <input>_compressed.pptx)
--no-videos Skip video compression
--no-images Skip image compression
--dry-run Preview what would be compressed without doing it
Video Compression Options:
--crf VALUE Quality factor 0-51 (default: 28, higher = smaller file)
--preset PRESET Encoding speed (default: medium)
--max-height PIXELS Max video height, 0 = no scaling (default: 720)
--audio-bitrate BITRATE Audio bitrate (default: 128k)
Image Compression Options:
--image-quality VALUE JPEG quality 1-95 (default: 80, lower = smaller file)
--image-max-dim PIXELS Max image dimension, 0 = no scaling (default: 1920)
--image-threshold BYTES Size threshold for image compression (default: 1048576 = 1MB)
```
### Video Compression Parameters Guide
| Parameter | Default | Purpose | Guidance |
|-----------|---------|---------|----------|
| `--crf` | 28 | Controls visual quality | 23 = high quality, 28 = good for PPT, 32 = aggressive compression |
| `--preset` | medium | Encoding speed vs ratio | `fast` for speed, `slow` for smaller files |
| `--max-height` | 720 | Downscale resolution | 720p is sufficient for most presentations; set 0 to keep original |
| `--audio-bitrate` | 128k | Audio quality | 96k for speech-only, 128k for general, 192k for music |
### Image Compression Parameters Guide
| Parameter | Default | Purpose | Guidance |
|-----------|---------|---------|----------|
| `--image-quality` | 80 | JPEG quality (1-95) | 60 = aggressive, 80 = balanced, 90 = high quality |
| `--image-max-dim` | 1920 | Max width or height | 1920 for Full HD, 1280 for 720p, 0 to keep original |
| `--image-threshold` | 1MB | Min size to compress | Only images larger than this are compressed |
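To make the effect of `--image-max-dim` concrete, the sketch below reproduces the aspect-preserving arithmetic used by `compress_image` in the bundled script; the standalone function name `scaled_size` is an assumption for illustration.

```python
def scaled_size(width, height, max_dim):
    """Return the (width, height) an image would be resized to.

    max_dim <= 0 disables scaling; images already within the limit
    are left unchanged.
    """
    if max_dim <= 0 or (width <= max_dim and height <= max_dim):
        return width, height
    # Scale by the tighter of the two constraints to keep the aspect ratio
    ratio = min(max_dim / width, max_dim / height)
    return int(width * ratio), int(height * ratio)


if __name__ == '__main__':
    print(scaled_size(3840, 2160, 1920))  # a 4K screenshot with the default limit
```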
### Recommended Presets by Use Case
- **Maximum compression** (email/sharing): `--crf 32 --preset slow --max-height 480 --audio-bitrate 96k --image-quality 60 --image-max-dim 1280`
- **Balanced** (default, good for most): `--crf 28 --preset medium --max-height 720 --image-quality 80`
- **High quality** (important presentations): `--crf 23 --preset slow --max-height 0 --image-quality 90 --image-max-dim 0`
- **Fast processing** (quick compress): `--crf 28 --preset fast --max-height 720 --image-quality 80`
- **Images only** (no video compression): `--no-videos --image-quality 75 --image-max-dim 1920`
- **Videos only** (no image compression): `--no-images --crf 28`
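When the Python API is used instead of the command line, the presets above can be kept in a small lookup table. The sketch below assumes the keyword names shown in the API Reference (`crf`, `preset`, `max_height`, `audio_bitrate`, `image_quality`, `image_max_dim`, `skip_videos`, `skip_images`); the preset keys and the `kwargs_for` helper are hypothetical.

```python
# Keyword arguments per use case, transcribed from the list above
PRESETS = {
    'maximum': dict(crf=32, preset='slow', max_height=480,
                    audio_bitrate='96k', image_quality=60, image_max_dim=1280),
    'balanced': dict(crf=28, preset='medium', max_height=720, image_quality=80),
    'high_quality': dict(crf=23, preset='slow', max_height=0,
                         image_quality=90, image_max_dim=0),
    'fast': dict(crf=28, preset='fast', max_height=720, image_quality=80),
    'images_only': dict(skip_videos=True, image_quality=75, image_max_dim=1920),
    'videos_only': dict(skip_images=True, crf=28),
}


def kwargs_for(use_case):
    """Return a copy of the keyword arguments for a named use case."""
    return dict(PRESETS[use_case])
```

A call would then look like `run(path, **kwargs_for('maximum'))`, assuming `run` forwards its keyword arguments to `compress_pptx`.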
## Usage Examples
### Example 1: Basic Compression (RECOMMENDED METHOD - Most Reliable)
User request: "Help me compress this PPT file, it's too large to email"
**Agent should use this Python one-liner approach:**
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'C:/Users/user/Desktop/presentation.pptx')"
```
Or equivalently (with forward slashes which work on Windows):
```bash
python "{SKILL_DIR}/scripts/compress_ppt_videos.py" "C:/Users/user/Desktop/presentation.pptx"
```
### Example 2: Using the Python API (for complex scenarios)
When you need more control or want to avoid shell issues entirely:
```python
# This Python code can be executed directly
import sys
sys.path.insert(0, r'{SKILL_DIR}/scripts')
from compress_ppt_videos import run
# Basic usage
run(r"C:/path/to/presentation.pptx")
# With custom settings
run(r"C:/path/to/presentation.pptx", crf=32, image_quality=70)
# High quality compression
run(r"C:/path/to/presentation.pptx", crf=23, preset='slow', max_height=0)
```
### Example 3: Custom Video Settings
User request: "Compress my PPT but keep high video quality"
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'C:/path/to/file.pptx', crf=23, preset='slow', max_height=0, image_quality=90)"
```
### Example 4: Images Only
User request: "My PPT has huge screenshots, just compress the images"
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'C:/path/to/file.pptx', skip_videos=True, image_quality=75)"
```
### Example 5: Preview Before Compressing (Dry Run)
User request: "I want to see what media is in my PPT before compressing"
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'C:/path/to/file.pptx', dry_run=True)"
```
### Example 6: Batch Compression
User request: "Compress all PPT files in this folder"
```python
import sys
import os
sys.path.insert(0, r'{SKILL_DIR}/scripts')
from compress_ppt_videos import run

folder = r"C:/path/to/folder"
for f in os.listdir(folder):
    # Skip files that are already the output of a previous run
    if f.endswith('.pptx') and not f.endswith('_compressed.pptx'):
        run(os.path.join(folder, f))
```
## Troubleshooting
### Path / Shell Issues (MOST COMMON)
- **Command doesn't execute / silent failure**: Use the Python one-liner method:
```bash
python -c "import sys; sys.path.insert(0, r'{SKILL_DIR}/scripts'); from compress_ppt_videos import run; run(r'<path>')"
```
- **Unicode/Chinese path errors**: Always use raw strings (`r'...'`) and forward slashes
- **PowerShell path issues**: Avoid `cd` commands; use full paths or Python one-liner
- **macOS/Linux: "python" not found**: Use `python3` instead of `python`
- **Home directory (~) not expanding**: Use `os.path.expanduser('~/path')` in Python
### FFmpeg Issues
- **"ffmpeg not found"**: Run `python {SKILL_DIR}/scripts/download_ffmpeg.py` to download the bundled version
- **macOS: "ffmpeg" cannot be opened (security)**: Run this command first:
```bash
xattr -d com.apple.quarantine "{SKILL_DIR}/scripts/bin/ffmpeg"
xattr -d com.apple.quarantine "{SKILL_DIR}/scripts/bin/ffprobe"
```
Or install ffmpeg via Homebrew: `brew install ffmpeg`
- **Linux: permission denied**: Make ffmpeg executable:
```bash
chmod +x "{SKILL_DIR}/scripts/bin/ffmpeg"
chmod +x "{SKILL_DIR}/scripts/bin/ffprobe"
```
### Compression Issues
- **"Bad zip archive"**: The file may be corrupted or not a valid .pptx file
- **No size reduction**: Media may already be well-compressed; try lower CRF/quality or skip the file
- **Video not playing after compression**: Try `crf=23, max_height=0` for conservative compression
- **Image quality too low**: Increase `image_quality` (e.g., 90) or set `image_max_dim=0`
- **Pillow not installed**: The script auto-installs it; if that fails, run `pip install Pillow` manually
## Resources
### scripts/
- **`compress_ppt_videos.py`** — Main compression script. Provides:
  - `run(input_path, **kwargs)` — Simple function for programmatic use (RECOMMENDED)
  - `compress_pptx(...)` — Full function with all parameters
  - `main(argv=None)` — CLI entry point that accepts argument list
- **`path_helper.py`** — Path helper utilities for validating and extracting user-supplied paths:
  - `validate_path(path)` — Checks whether a path is valid and returns a detailed status
  - `extract_pptx_paths(text)` — Extracts .pptx paths from a user message
  - `normalize_path(path)` — Normalizes the path format
- **`compress.py`** — Simplified entry point wrapper
- **`download_ffmpeg.py`** — Downloads and installs ffmpeg essentials to `scripts/bin/`. Run this if the bundled ffmpeg is missing.
- **`bin/ffmpeg[.exe]`** — Bundled ffmpeg binary (`ffmpeg.exe` on Windows, `ffmpeg` on macOS/Linux). Auto-detected by the compression script.
- **`bin/ffprobe[.exe]`** — Bundled ffprobe binary (same naming per platform). Used for video analysis.
### API Reference
```python
from compress_ppt_videos import run, compress_pptx
# Simple API (RECOMMENDED)
run(input_path, output_path=None, **kwargs)
# Full API
compress_pptx(
input_pptx,
output_pptx=None,
crf=28,
preset='medium',
max_height=720,
audio_bitrate='128k',
image_quality=80,
image_max_dim=1920,
image_threshold=1048576, # 1MB
skip_images=False,
skip_videos=False,
dry_run=False
)
```
FILE:scripts/compress.py
#!/usr/bin/env python3
"""
PPT Compressor - Simple Entry Point
This is a simplified wrapper that handles path issues automatically.
It can be copied to any directory and executed directly.
Usage:
python compress.py "path/to/your.pptx"
python compress.py "path/to/your.pptx" --crf 32 --image-quality 70
"""
import sys
import os

# Get the directory where this script is located
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))

# Add the scripts directory to Python path
if SCRIPT_DIR not in sys.path:
    sys.path.insert(0, SCRIPT_DIR)

# Import and run the main compression module
from compress_ppt_videos import main

if __name__ == '__main__':
    main()
FILE:scripts/compress_ppt_videos.py
#!/usr/bin/env python3
"""
PPT Compressor - Compress videos and large images embedded in PowerPoint files.
Workflow:
    1. Open the .pptx file as a ZIP archive (no copy or rename is needed)
    2. Extract the archive to a temporary directory
    3. Find all video files under ppt/media/ and compress with ffmpeg
    4. Find all image files > 1MB under ppt/media/ and compress with Pillow
    5. Replace originals with compressed versions (only if smaller)
    6. Repackage into a new .pptx file
Usage:
python compress_ppt_videos.py <input.pptx> [--output <output.pptx>] [--crf <value>] [--preset <preset>] [--max-height <pixels>]
Arguments:
input.pptx Path to the input PowerPoint file
--output, -o Path for the output compressed file (default: <input>_compressed.pptx)
--crf CRF (Constant Rate Factor) value for quality control (default: 28, range: 0-51, higher = smaller file)
--preset FFmpeg encoding preset (default: medium, options: ultrafast/superfast/veryfast/faster/fast/medium/slow/slower/veryslow)
--max-height Maximum video height in pixels for downscaling (default: 720, set 0 to disable)
--audio-bitrate Audio bitrate (default: 128k)
--image-quality JPEG quality for image compression (default: 80, range: 1-95)
--image-max-dim Maximum dimension (width or height) for image downscaling (default: 1920, set 0 to disable)
--image-threshold Image file size threshold in bytes for compression (default: 1048576 = 1MB)
--no-images Skip image compression
--no-videos Skip video compression
--dry-run Show what would be done without actually compressing
Dependencies:
- Python 3.7+
- Pillow (pip install Pillow) - for image compression
- ffmpeg - bundled in scripts/bin/ or system PATH
"""
import sys
import os
import shutil
import zipfile
import subprocess
import argparse
import tempfile
import json
from pathlib import Path
# Supported file extensions
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.wmv', '.m4v', '.mkv', '.webm', '.flv', '.mpeg', '.mpg'}
IMAGE_EXTENSIONS = {'.png', '.jpg', '.jpeg', '.bmp', '.tiff', '.tif', '.webp'}
# Default image compression threshold: 1MB
DEFAULT_IMAGE_THRESHOLD = 1 * 1024 * 1024 # 1MB
def get_bundled_bin_dir():
    """Get the path to the bundled bin directory containing ffmpeg."""
    script_dir = Path(__file__).parent.resolve()
    return script_dir / 'bin'


def find_ffmpeg():
    """
    Find ffmpeg executable. Priority:
    1. Bundled version in scripts/bin/
    2. System PATH

    Returns:
        tuple: (ffmpeg_path, ffprobe_path) or (None, None) if not found
    """
    bin_dir = get_bundled_bin_dir()
    # Check bundled version first
    if sys.platform == 'win32':
        ffmpeg_bundled = bin_dir / 'ffmpeg.exe'
        ffprobe_bundled = bin_dir / 'ffprobe.exe'
    else:
        ffmpeg_bundled = bin_dir / 'ffmpeg'
        ffprobe_bundled = bin_dir / 'ffprobe'
    if ffmpeg_bundled.exists() and ffprobe_bundled.exists():
        print(f"[INFO] Using bundled ffmpeg from: {bin_dir}")
        return str(ffmpeg_bundled), str(ffprobe_bundled)
    # Fall back to system PATH
    ffmpeg_name = 'ffmpeg.exe' if sys.platform == 'win32' else 'ffmpeg'
    ffprobe_name = 'ffprobe.exe' if sys.platform == 'win32' else 'ffprobe'
    ffmpeg_sys = shutil.which(ffmpeg_name)
    ffprobe_sys = shutil.which(ffprobe_name)
    if ffmpeg_sys and ffprobe_sys:
        print(f"[INFO] Using system ffmpeg from PATH")
        return ffmpeg_sys, ffprobe_sys
    return None, None

def check_ffmpeg(ffmpeg_path):
    """Verify that ffmpeg is accessible and working."""
    try:
        result = subprocess.run(
            [ffmpeg_path, '-version'],
            capture_output=True, text=True, timeout=10
        )
        if result.returncode == 0:
            version_line = result.stdout.split('\n')[0]
            print(f"[INFO] ffmpeg version: {version_line}")
            return True
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers FileNotFoundError and also PermissionError, which is
        # raised when the bundled binary is present but not executable
        pass
    return False

def ensure_pillow():
    """Ensure Pillow is installed, install if missing."""
    try:
        from PIL import Image
        return True
    except ImportError:
        print("[INFO] Pillow not found. Installing...")
        try:
            result = subprocess.run(
                [sys.executable, '-m', 'pip', 'install', 'Pillow', '-q'],
                capture_output=True, text=True, timeout=120
            )
            if result.returncode == 0:
                print("[INFO] Pillow installed successfully.")
                return True
            else:
                print(f"[WARN] Failed to install Pillow: {result.stderr}")
                return False
        except Exception as e:
            print(f"[WARN] Failed to install Pillow: {e}")
            return False

def get_video_info(ffprobe_path, video_path):
    """Get video file information using ffprobe."""
    try:
        result = subprocess.run(
            [
                ffprobe_path, '-v', 'quiet',
                '-print_format', 'json',
                '-show_format', '-show_streams',
                str(video_path)
            ],
            capture_output=True, text=True, timeout=30
        )
        if result.returncode == 0:
            return json.loads(result.stdout)
    except (OSError, subprocess.TimeoutExpired, json.JSONDecodeError):
        # OSError covers a missing or non-executable ffprobe binary
        pass
    return None

def format_size(size_bytes):
    """Format byte size to human-readable string."""
    for unit in ['B', 'KB', 'MB', 'GB']:
        if size_bytes < 1024.0:
            return f"{size_bytes:.2f} {unit}"
        size_bytes /= 1024.0
    return f"{size_bytes:.2f} TB"

def compress_video(ffmpeg_path, ffprobe_path, input_path, output_path,
                   crf=28, preset='medium', max_height=720, audio_bitrate='128k'):
    """
    Compress a video file using ffmpeg with H.264 codec.

    Returns:
        True if compression succeeded, False otherwise
    """
    input_path = Path(input_path)
    output_path = Path(output_path)
    # Build ffmpeg command
    cmd = [
        ffmpeg_path, '-y',       # Overwrite output without asking
        '-i', str(input_path),   # Input file
    ]
    # Video codec: H.264 for maximum compatibility with PowerPoint
    video_filters = []
    # Add scale filter if max_height is set and video is larger
    if max_height > 0:
        info = get_video_info(ffprobe_path, input_path)
        if info:
            for stream in info.get('streams', []):
                if stream.get('codec_type') == 'video':
                    height = stream.get('height', 0)
                    if height > max_height:
                        video_filters.append(f"scale=-2:{max_height}")
                    break
    cmd.extend([
        '-c:v', 'libx264',       # H.264 codec (best PPT compatibility)
        '-crf', str(crf),        # Quality factor
        '-preset', preset,       # Encoding speed/compression tradeoff
        '-profile:v', 'high',    # H.264 profile for good quality
        '-level', '4.1',         # Compatibility level
        '-pix_fmt', 'yuv420p',   # Pixel format for maximum compatibility
    ])
    if video_filters:
        cmd.extend(['-vf', ','.join(video_filters)])
    # Audio settings: AAC for best compatibility
    cmd.extend([
        '-c:a', 'aac',           # AAC audio codec
        '-b:a', audio_bitrate,   # Audio bitrate
        '-ar', '44100',          # Sample rate
    ])
    cmd.extend([
        '-movflags', '+faststart',
    ])
    cmd.append(str(output_path))
    try:
        result = subprocess.run(
            cmd,
            capture_output=True, text=True, timeout=600
        )
        if result.returncode == 0:
            return True
        else:
            print(f"[ERROR] ffmpeg failed for {input_path.name}:")
            stderr_lines = result.stderr.strip().split('\n')
            for line in stderr_lines[-5:]:
                print(f"  {line}")
            return False
    except subprocess.TimeoutExpired:
        print(f"[ERROR] Compression timed out for {input_path.name}")
        return False

def compress_image(input_path, output_path, quality=80, max_dim=1920):
    """
    Compress an image file using Pillow.

    Parameters:
        input_path: Path to the source image
        output_path: Path for the compressed output
        quality: JPEG quality (1-95, default: 80)
        max_dim: Max dimension (width or height), 0 to disable (default: 1920)

    Returns:
        True if compression succeeded, False otherwise
    """
    try:
        from PIL import Image
    except ImportError:
        print(f"[WARN] Pillow not available, skipping image compression for {input_path}")
        return False
    try:
        input_path = Path(input_path)
        output_path = Path(output_path)
        img = Image.open(str(input_path))
        original_format = img.format  # e.g. 'PNG', 'JPEG', 'BMP'
        original_mode = img.mode
        # Resize if larger than max_dim
        if max_dim > 0:
            w, h = img.size
            if w > max_dim or h > max_dim:
                ratio = min(max_dim / w, max_dim / h)
                new_w = int(w * ratio)
                new_h = int(h * ratio)
                img = img.resize((new_w, new_h), Image.LANCZOS)
                print(f"  Resized: {w}x{h} -> {new_w}x{new_h}")
        # Determine output format based on original extension
        suffix = input_path.suffix.lower()
        if suffix in ('.jpg', '.jpeg'):
            # JPEG has no alpha channel or palette, so convert first
            if original_mode in ('RGBA', 'P', 'LA'):
                img = img.convert('RGB')
            img.save(str(output_path), 'JPEG', quality=quality, optimize=True)
        elif suffix == '.png':
            # PNG stays PNG, with or without transparency; only the encoder's
            # optimize pass and the optional resize above reduce its size
            img.save(str(output_path), 'PNG', optimize=True)
        elif suffix in ('.bmp', '.tiff', '.tif'):
            # BMP/TIFF pixels are re-encoded as PNG data but written to
            # output_path as given, so the file keeps its original name and
            # extension inside the package. No rename is performed: the
            # caller looks for the result at exactly output_path.
            if original_mode in ('RGBA', 'P', 'LA'):
                img = img.convert('RGB')
            img.save(str(output_path), 'PNG', optimize=True)
        elif suffix == '.webp':
            img.save(str(output_path), 'WEBP', quality=quality, method=6)
        else:
            # Fallback: save as JPEG
            if original_mode in ('RGBA', 'P', 'LA'):
                img = img.convert('RGB')
            img.save(str(output_path), 'JPEG', quality=quality, optimize=True)
        return True
    except Exception as e:
        print(f"[ERROR] Image compression failed for {input_path}: {e}")
        return False

def extract_pptx(pptx_path, extract_dir):
    """Extract .pptx file (which is a zip archive) to a directory."""
    try:
        with zipfile.ZipFile(str(pptx_path), 'r') as zf:
            zf.extractall(str(extract_dir))
        return True
    except zipfile.BadZipFile:
        print(f"[ERROR] {pptx_path} is not a valid .pptx file (bad zip archive)")
        return False
    except Exception as e:
        print(f"[ERROR] Failed to extract {pptx_path}: {e}")
        return False

def repackage_pptx(source_dir, output_path):
    """
    Repackage extracted directory back into a .pptx file.
    Use deflated compression and preserve directory structure.
    """
    source_dir = Path(source_dir)
    output_path = Path(output_path)
    try:
        with zipfile.ZipFile(str(output_path), 'w', zipfile.ZIP_DEFLATED) as zf:
            for file_path in sorted(source_dir.rglob('*')):
                if file_path.is_file():
                    arcname = file_path.relative_to(source_dir)
                    zf.write(str(file_path), str(arcname))
        return True
    except Exception as e:
        print(f"[ERROR] Failed to repackage .pptx: {e}")
        return False

def find_videos(media_dir):
    """Find all video files in the media directory."""
    media_path = Path(media_dir)
    if not media_path.exists():
        return []
    videos = []
    for f in media_path.iterdir():
        if f.is_file() and f.suffix.lower() in VIDEO_EXTENSIONS:
            videos.append(f)
    return sorted(videos)

def find_large_images(media_dir, threshold=DEFAULT_IMAGE_THRESHOLD):
    """Find all image files larger than threshold in the media directory."""
    media_path = Path(media_dir)
    if not media_path.exists():
        return []
    images = []
    for f in media_path.iterdir():
        if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS:
            if f.stat().st_size > threshold:
                images.append(f)
    return sorted(images)

def compress_pptx(input_pptx, output_pptx=None, crf=28, preset='medium',
max_height=720, audio_bitrate='128k',
image_quality=80, image_max_dim=1920,
image_threshold=DEFAULT_IMAGE_THRESHOLD,
skip_images=False, skip_videos=False, dry_run=False):
"""
Main function: Compress all videos and large images in a .pptx file.
Returns:
True if successful, False otherwise
"""
input_pptx = Path(input_pptx).resolve()
if not input_pptx.exists():
print(f"[ERROR] File not found: {input_pptx}")
return False
if input_pptx.suffix.lower() != '.pptx':
print(f"[ERROR] File is not a .pptx file: {input_pptx}")
return False
# Generate output path if not provided
if output_pptx is None:
output_pptx = input_pptx.parent / f"{input_pptx.stem}_compressed.pptx"
else:
output_pptx = Path(output_pptx).resolve()
original_size = input_pptx.stat().st_size
print(f"[INFO] Input: {input_pptx}")
print(f"[INFO] Output: {output_pptx}")
print(f"[INFO] Original size: {format_size(original_size)}")
print()
# Find ffmpeg if video compression is needed
ffmpeg_path, ffprobe_path = None, None
if not skip_videos:
ffmpeg_path, ffprobe_path = find_ffmpeg()
if ffmpeg_path is None:
print("[WARN] ffmpeg not found (not bundled and not in PATH).")
print("[WARN] Video compression will be skipped.")
            # Print the real script location instead of the literal "{SKILL_DIR}" placeholder
            print(f"[WARN] To enable video compression, run: python {get_bundled_bin_dir().parent / 'download_ffmpeg.py'}")
skip_videos = True
elif not dry_run:
if not check_ffmpeg(ffmpeg_path):
print("[WARN] ffmpeg found but not working. Video compression will be skipped.")
skip_videos = True
# Check Pillow for image compression
has_pillow = False
if not skip_images:
has_pillow = ensure_pillow()
if not has_pillow:
print("[WARN] Pillow not available. Image compression will be skipped.")
skip_images = True
if not skip_videos:
print(f"[INFO] Video settings: CRF={crf}, preset={preset}, max_height={max_height}px, audio={audio_bitrate}")
if not skip_images:
print(f"[INFO] Image settings: quality={image_quality}, max_dim={image_max_dim}px, threshold={format_size(image_threshold)}")
print()
# Create a temporary working directory
with tempfile.TemporaryDirectory(prefix='ppt_compress_') as tmp_dir:
tmp_path = Path(tmp_dir)
extract_dir = tmp_path / 'extracted'
# Step 1: Extract the .pptx archive
print("[STEP 1/5] Extracting .pptx archive...")
if not extract_pptx(input_pptx, extract_dir):
return False
media_dir = extract_dir / 'ppt' / 'media'
# Step 2: Find and compress videos
videos = find_videos(media_dir) if not skip_videos else []
total_video_saved = 0
if videos:
print(f"\n[STEP 2/5] Found {len(videos)} video file(s):")
total_video_size = 0
for v in videos:
size = v.stat().st_size
total_video_size += size
print(f" - {v.name} ({format_size(size)})")
print(f" Total video size: {format_size(total_video_size)}")
print()
if dry_run:
print("[DRY-RUN] Would compress the above videos with these settings:")
print(f" CRF: {crf}, Preset: {preset}, Max Height: {max_height}px")
else:
print(" Compressing videos with ffmpeg...")
compressed_dir = tmp_path / 'compressed_videos'
compressed_dir.mkdir()
success_count = 0
for i, video in enumerate(videos, 1):
print(f" [{i}/{len(videos)}] Compressing {video.name}...", end=' ', flush=True)
original_video_size = video.stat().st_size
compressed_path = compressed_dir / video.name
if compress_video(ffmpeg_path, ffprobe_path, video, compressed_path,
crf, preset, max_height, audio_bitrate):
compressed_size = compressed_path.stat().st_size
saved = original_video_size - compressed_size
ratio = (1 - compressed_size / original_video_size) * 100 if original_video_size > 0 else 0
if compressed_size < original_video_size:
shutil.copy2(str(compressed_path), str(video))
total_video_saved += saved
print(f"OK ({format_size(original_video_size)} -> {format_size(compressed_size)}, saved {ratio:.1f}%)")
else:
print(f"SKIP (compressed file is larger, keeping original)")
success_count += 1
else:
print("FAILED (keeping original)")
print(f"\n Video compression: {success_count}/{len(videos)} processed")
print(f" Total saved from videos: {format_size(total_video_saved)}")
else:
if not skip_videos:
print("[STEP 2/5] No video files found in ppt/media/.")
else:
print("[STEP 2/5] Video compression skipped.")
# Step 3: Find and compress large images
large_images = find_large_images(media_dir, image_threshold) if not skip_images else []
total_image_saved = 0
if large_images:
print(f"\n[STEP 3/5] Found {len(large_images)} image(s) larger than {format_size(image_threshold)}:")
total_image_size = 0
for img in large_images:
size = img.stat().st_size
total_image_size += size
print(f" - {img.name} ({format_size(size)})")
print(f" Total large image size: {format_size(total_image_size)}")
print()
if dry_run:
print("[DRY-RUN] Would compress the above images with these settings:")
print(f" Quality: {image_quality}, Max Dimension: {image_max_dim}px")
else:
print(" Compressing images with Pillow...")
compressed_img_dir = tmp_path / 'compressed_images'
compressed_img_dir.mkdir()
success_count = 0
for i, img_file in enumerate(large_images, 1):
print(f" [{i}/{len(large_images)}] Compressing {img_file.name}...", end=' ', flush=True)
original_img_size = img_file.stat().st_size
compressed_path = compressed_img_dir / img_file.name
if compress_image(img_file, compressed_path, image_quality, image_max_dim):
if compressed_path.exists():
compressed_size = compressed_path.stat().st_size
saved = original_img_size - compressed_size
ratio = (1 - compressed_size / original_img_size) * 100 if original_img_size > 0 else 0
if compressed_size < original_img_size:
shutil.copy2(str(compressed_path), str(img_file))
total_image_saved += saved
print(f"OK ({format_size(original_img_size)} -> {format_size(compressed_size)}, saved {ratio:.1f}%)")
else:
print("SKIP (compressed file is larger, keeping original)")
else:
print("SKIP (output file not created)")
success_count += 1
else:
print("FAILED (keeping original)")
print(f"\n Image compression: {success_count}/{len(large_images)} processed")
print(f" Total saved from images: {format_size(total_image_saved)}")
else:
if not skip_images:
print(f"\n[STEP 3/5] No images larger than {format_size(image_threshold)} found.")
else:
print("\n[STEP 3/5] Image compression skipped.")
# A dry run must not write anything, so stop before any file is copied or repackaged
if dry_run:
print("\n[DRY-RUN] No actual compression performed.")
return True
# Nothing to compress: deliver an unchanged copy as the output
if not videos and not large_images:
print("\n[INFO] No compressible media found. Copying original file.")
shutil.copy2(str(input_pptx), str(output_pptx))
print(f"[INFO] Copied original file to {output_pptx}")
return True
# Step 4: Repackage into .pptx
print("\n[STEP 4/5] Repackaging into .pptx...")
if not repackage_pptx(extract_dir, output_pptx):
return False
# Step 5: Show final results
final_size = output_pptx.stat().st_size
total_reduction = original_size - final_size
total_ratio = (1 - final_size / original_size) * 100 if original_size > 0 else 0
print()
print("=" * 60)
print(" Compression Complete!")
print(f" Original: {format_size(original_size)}")
print(f" Compressed: {format_size(final_size)}")
if total_video_saved > 0:
print(f" Saved (videos): {format_size(total_video_saved)}")
if total_image_saved > 0:
print(f" Saved (images): {format_size(total_image_saved)}")
print(f" Total Saved: {format_size(total_reduction)} ({total_ratio:.1f}%)")
print(f" Output: {output_pptx}")
print("=" * 60)
return True
def run(input_path, output_path=None, **kwargs):
    """
    Compress a PPT file with a single call. This is the recommended way to
    call the compressor from Python code or agent scripts.

    Parameters:
        input_path: Path to the input .pptx file (string or Path)
        output_path: Optional path for the output file (default: <input>_compressed.pptx)
        **kwargs: Additional options:
            - crf (int): Video quality factor, 0-51, higher = smaller (default: 28)
            - preset (str): Encoding speed preset (default: 'medium')
            - max_height (int): Max video height in pixels (default: 720)
            - audio_bitrate (str): Audio bitrate (default: '128k')
            - image_quality (int): JPEG quality, 1-95 (default: 80)
            - image_max_dim (int): Max image dimension in pixels (default: 1920)
            - image_threshold (int): Image size threshold in bytes (default: 1 MB)
            - skip_images (bool): Skip image compression (default: False)
            - skip_videos (bool): Skip video compression (default: False)
            - dry_run (bool): Preview without compressing (default: False)

    Returns:
        bool: True if compression succeeded, False otherwise

    Example:
        from compress_ppt_videos import run
        run(r"C:/Users/user/Desktop/presentation.pptx")
        run(r"C:/path/to.pptx", crf=32, image_quality=70)
    """
    return compress_pptx(
        input_pptx=input_path,
        output_pptx=output_path,
        crf=kwargs.get('crf', 28),
        preset=kwargs.get('preset', 'medium'),
        max_height=kwargs.get('max_height', 720),
        audio_bitrate=kwargs.get('audio_bitrate', '128k'),
        image_quality=kwargs.get('image_quality', 80),
        image_max_dim=kwargs.get('image_max_dim', 1920),
        image_threshold=kwargs.get('image_threshold', DEFAULT_IMAGE_THRESHOLD),
        skip_images=kwargs.get('skip_images', False),
        skip_videos=kwargs.get('skip_videos', False),
        dry_run=kwargs.get('dry_run', False)
    )
def main(argv=None):
    """
    Main entry point with command-line argument parsing.

    Parameters:
        argv: Optional list of arguments. If None, uses sys.argv.
            This allows programmatic calling: main(['input.pptx', '--crf', '32'])
    """
    parser = argparse.ArgumentParser(
        description='Compress videos and large images in PowerPoint (.pptx) files.',
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s presentation.pptx
  %(prog)s presentation.pptx -o compressed.pptx
  %(prog)s presentation.pptx --crf 32 --preset fast
  %(prog)s presentation.pptx --max-height 480
  %(prog)s presentation.pptx --image-quality 70 --image-max-dim 1280
  %(prog)s presentation.pptx --no-videos   # Only compress images
  %(prog)s presentation.pptx --no-images   # Only compress videos
  %(prog)s presentation.pptx --dry-run
"""
    )
    parser.add_argument('input', help='Path to the input .pptx file')
    parser.add_argument('-o', '--output', help='Path for the output compressed .pptx file')

    # Video compression options
    video_group = parser.add_argument_group('Video Compression Options')
    video_group.add_argument('--crf', type=int, default=28,
                             help='CRF value for quality (0-51, default: 28, higher = smaller)')
    video_group.add_argument('--preset', default='medium',
                             choices=['ultrafast', 'superfast', 'veryfast', 'faster',
                                      'fast', 'medium', 'slow', 'slower', 'veryslow'],
                             help='FFmpeg encoding preset (default: medium)')
    video_group.add_argument('--max-height', type=int, default=720,
                             help='Max video height in pixels, 0 to disable (default: 720)')
    video_group.add_argument('--audio-bitrate', default='128k',
                             help='Audio bitrate (default: 128k)')

    # Image compression options
    image_group = parser.add_argument_group('Image Compression Options')
    image_group.add_argument('--image-quality', type=int, default=80,
                             help='JPEG quality for image compression (1-95, default: 80)')
    image_group.add_argument('--image-max-dim', type=int, default=1920,
                             help='Max image dimension in pixels, 0 to disable (default: 1920)')
    image_group.add_argument('--image-threshold', type=int, default=DEFAULT_IMAGE_THRESHOLD,
                             help='Image file size threshold in bytes (default: 1048576 = 1MB)')

    # Flags
    parser.add_argument('--no-images', action='store_true',
                        help='Skip image compression')
    parser.add_argument('--no-videos', action='store_true',
                        help='Skip video compression')
    parser.add_argument('--dry-run', action='store_true',
                        help='Show what would be done without compressing')

    # Parse arguments - use provided argv or sys.argv
    args = parser.parse_args(argv)

    # Validate CRF range
    if not 0 <= args.crf <= 51:
        parser.error("CRF must be between 0 and 51")

    # Validate image quality range
    if not 1 <= args.image_quality <= 95:
        parser.error("Image quality must be between 1 and 95")

    # Run compression
    success = compress_pptx(
        input_pptx=args.input,
        output_pptx=args.output,
        crf=args.crf,
        preset=args.preset,
        max_height=args.max_height,
        audio_bitrate=args.audio_bitrate,
        image_quality=args.image_quality,
        image_max_dim=args.image_max_dim,
        image_threshold=args.image_threshold,
        skip_images=args.no_images,
        skip_videos=args.no_videos,
        dry_run=args.dry_run
    )
    sys.exit(0 if success else 1)


if __name__ == '__main__':
    main()
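The video and image loops above share one rule: a re-encoded file replaces the original only when it is strictly smaller. A minimal sketch of that rule in isolation follows; `replace_if_smaller` is a hypothetical helper name used for illustration and is not part of the script.

```python
import shutil
import tempfile
from pathlib import Path


def replace_if_smaller(original: Path, candidate: Path) -> int:
    """Overwrite `original` with `candidate` only when that saves space.

    Returns the number of bytes saved (0 when the original is kept).
    """
    original_size = original.stat().st_size
    candidate_size = candidate.stat().st_size
    if candidate_size >= original_size:
        return 0  # re-encoding did not help, so the original stays in place
    shutil.copy2(str(candidate), str(original))
    return original_size - candidate_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        original = tmp_path / "clip.mp4"
        smaller = tmp_path / "clip_small.mp4"
        larger = tmp_path / "clip_large.mp4"
        # Dummy payloads stand in for real media files
        original.write_bytes(b"x" * 1000)
        smaller.write_bytes(b"x" * 400)
        larger.write_bytes(b"x" * 1500)

        print(replace_if_smaller(original, larger))   # original kept
        print(replace_if_smaller(original, smaller))  # original replaced
        print(original.stat().st_size)
```

Keeping the comparison in one place would let the video and image paths report savings identically, at the cost of one more helper to maintain.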
FILE:scripts/download_ffmpeg.py
#!/usr/bin/env python3
"""
Download and extract ffmpeg for the current platform (Windows/macOS/Linux).
Supported platforms:
- Windows (x64): from gyan.dev
- macOS (Intel build; runs on Apple Silicon through Rosetta 2): from evermeet.cx
- Linux (x64): from johnvansickle.com
"""
import urllib.request
import zipfile
import tarfile
import os
import shutil
import sys
import glob
import platform
import stat
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
BIN_DIR = os.path.join(SCRIPT_DIR, "bin")
# Download URLs for different platforms
FFMPEG_URLS = {
    'windows': {
        'url': 'https://www.gyan.dev/ffmpeg/builds/ffmpeg-release-essentials.zip',
        'type': 'zip',
        'binaries': ['ffmpeg.exe', 'ffprobe.exe'],
    },
    'darwin': {  # macOS
        # evermeet.cx provides macOS builds
        'ffmpeg_url': 'https://evermeet.cx/ffmpeg/getrelease/ffmpeg/zip',
        'ffprobe_url': 'https://evermeet.cx/ffmpeg/getrelease/ffprobe/zip',
        'type': 'zip_separate',
        'binaries': ['ffmpeg', 'ffprobe'],
    },
    'linux': {
        'url': 'https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz',
        'type': 'tar.xz',
        'binaries': ['ffmpeg', 'ffprobe'],
    },
}
def get_platform():
    """Detect the current platform."""
    system = platform.system().lower()
    if system == 'windows':
        return 'windows'
    elif system == 'darwin':
        return 'darwin'
    elif system == 'linux':
        return 'linux'
    else:
        return None
def make_executable(path):
    """Make a file executable (Unix-like systems)."""
    if os.path.exists(path):
        st = os.stat(path)
        os.chmod(path, st.st_mode | stat.S_IEXEC | stat.S_IXGRP | stat.S_IXOTH)
def download_file(url, dest_path):
    """Download url to dest_path; return True on success, False on failure."""
    print(f"[INFO] Downloading from {url} ...")
    try:
        urllib.request.urlretrieve(url, dest_path)
        print(f"[INFO] Downloaded to {dest_path}")
        return True
    except Exception as e:
        print(f"[ERROR] Failed to download: {e}")
        return False
def download_windows():
    """Download and extract ffmpeg for Windows."""
    config = FFMPEG_URLS['windows']
    zip_path = os.path.join(SCRIPT_DIR, "ffmpeg_download.zip")
    if not download_file(config['url'], zip_path):
        return False
    print("[INFO] Extracting...")
    try:
        with zipfile.ZipFile(zip_path, "r") as zf:
            zf.extractall(SCRIPT_DIR)
    except Exception as e:
        print(f"[ERROR] Failed to extract: {e}")
        return False

    # Find the extracted bin directory
    pattern = os.path.join(SCRIPT_DIR, "ffmpeg-*-essentials_build", "bin")
    matches = glob.glob(pattern)
    if not matches:
        pattern = os.path.join(SCRIPT_DIR, "ffmpeg-*", "bin")
        matches = glob.glob(pattern)
    if matches:
        src_bin = matches[0]
        for fname in config['binaries']:
            src = os.path.join(src_bin, fname)
            dst = os.path.join(BIN_DIR, fname)
            if os.path.exists(src):
                shutil.copy2(src, dst)
                print(f"[INFO] Copied {fname} to {BIN_DIR}")

    # Cleanup: the ffmpeg-* glob also covers the ffmpeg-*-essentials_build directory
    os.remove(zip_path)
    for d in glob.glob(os.path.join(SCRIPT_DIR, "ffmpeg-*")):
        if os.path.isdir(d) and d != BIN_DIR:
            shutil.rmtree(d, ignore_errors=True)
    return True
def download_macos():
    """Download and extract ffmpeg and ffprobe for macOS."""
    config = FFMPEG_URLS['darwin']

    # ffmpeg and ffprobe ship as separate zip archives; handle both the same way
    for name in config['binaries']:
        zip_path = os.path.join(SCRIPT_DIR, f"{name}.zip")
        if not download_file(config[f'{name}_url'], zip_path):
            return False
        print(f"[INFO] Extracting {name}...")
        try:
            with zipfile.ZipFile(zip_path, "r") as zf:
                zf.extractall(BIN_DIR)
            os.remove(zip_path)
        except Exception as e:
            print(f"[ERROR] Failed to extract {name}: {e}")
            return False

    # Mark both binaries as executable
    for fname in config['binaries']:
        make_executable(os.path.join(BIN_DIR, fname))
    return True
def download_linux():
    """Download and extract ffmpeg for Linux."""
    config = FFMPEG_URLS['linux']
    archive_path = os.path.join(SCRIPT_DIR, "ffmpeg_download.tar.xz")
    if not download_file(config['url'], archive_path):
        return False
    print("[INFO] Extracting...")
    try:
        with tarfile.open(archive_path, "r:xz") as tf:
            tf.extractall(SCRIPT_DIR)
    except Exception as e:
        print(f"[ERROR] Failed to extract: {e}")
        return False

    # Find the extracted directory
    matches = glob.glob(os.path.join(SCRIPT_DIR, "ffmpeg-*-amd64-static"))
    if not matches:
        matches = glob.glob(os.path.join(SCRIPT_DIR, "ffmpeg-*-static"))
    if matches:
        src_dir = matches[0]
        for fname in config['binaries']:
            src = os.path.join(src_dir, fname)
            dst = os.path.join(BIN_DIR, fname)
            if os.path.exists(src):
                shutil.copy2(src, dst)
                make_executable(dst)
                print(f"[INFO] Copied {fname} to {BIN_DIR}")

    # Cleanup: the ffmpeg-*-static glob also covers ffmpeg-*-amd64-static
    os.remove(archive_path)
    for d in glob.glob(os.path.join(SCRIPT_DIR, "ffmpeg-*-static")):
        if os.path.isdir(d):
            shutil.rmtree(d, ignore_errors=True)
    return True
def main():
    os.makedirs(BIN_DIR, exist_ok=True)

    # Detect platform
    plat = get_platform()
    if plat is None:
        print(f"[ERROR] Unsupported platform: {platform.system()}")
        print("[INFO] Supported platforms: Windows, macOS, Linux")
        sys.exit(1)
    print(f"[INFO] Detected platform: {plat}")

    # Check if already downloaded
    config = FFMPEG_URLS[plat]
    binaries = config['binaries']
    ffmpeg_path = os.path.join(BIN_DIR, binaries[0])
    ffprobe_path = os.path.join(BIN_DIR, binaries[1])
    if os.path.exists(ffmpeg_path) and os.path.exists(ffprobe_path):
        print(f"[INFO] ffmpeg already exists at {BIN_DIR}")
        return

    # Download for the detected platform
    success = False
    if plat == 'windows':
        success = download_windows()
    elif plat == 'darwin':
        success = download_macos()
    elif plat == 'linux':
        success = download_linux()

    # Verify that both binaries are in place (the compressor needs ffprobe too)
    if success and os.path.exists(ffmpeg_path) and os.path.exists(ffprobe_path):
        print(f"[SUCCESS] ffmpeg is ready at {ffmpeg_path}")
    else:
        print("[ERROR] Failed to download/extract ffmpeg")
        print("\n[INFO] Manual installation alternatives:")
        if plat == 'windows':
            print("  - Download from: https://www.gyan.dev/ffmpeg/builds/")
            print("  - Or install via: choco install ffmpeg")
        elif plat == 'darwin':
            print("  - Install via: brew install ffmpeg")
        elif plat == 'linux':
            print("  - Install via: sudo apt install ffmpeg (Debian/Ubuntu)")
            print("  - Install via: sudo dnf install ffmpeg (Fedora)")
            print("  - Install via: sudo pacman -S ffmpeg (Arch)")
        sys.exit(1)


if __name__ == "__main__":
    main()
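The Windows path above unpacks the whole archive and then deletes everything except two binaries. As an alternative sketch, not what the script does, the wanted members can be copied straight out of the zip so nothing else touches the disk. `extract_binaries` is a hypothetical helper, and the archive layout in the demo is made up for illustration.

```python
import io
import os
import shutil
import tempfile
import zipfile


def extract_binaries(archive, names, dest_dir):
    """Copy only the archive members whose base name is in `names` into dest_dir.

    `archive` may be a path or a file-like object.
    Returns the base names that were found and extracted, in archive order.
    """
    os.makedirs(dest_dir, exist_ok=True)
    wanted = set(names)
    extracted = []
    with zipfile.ZipFile(archive, "r") as zf:
        for info in zf.infolist():
            base = os.path.basename(info.filename)
            if info.is_dir() or base not in wanted:
                continue
            # Write under the base name only, dropping any directory prefix
            with zf.open(info) as src, open(os.path.join(dest_dir, base), "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(base)
    return extracted


if __name__ == "__main__":
    # Build a tiny in-memory archive with a made-up layout
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ffmpeg-demo-essentials_build/bin/ffmpeg.exe", b"fake ffmpeg")
        zf.writestr("ffmpeg-demo-essentials_build/bin/ffprobe.exe", b"fake ffprobe")
        zf.writestr("ffmpeg-demo-essentials_build/README.txt", b"docs")
    buf.seek(0)
    with tempfile.TemporaryDirectory() as out_dir:
        print(extract_binaries(buf, ["ffmpeg.exe", "ffprobe.exe"], out_dir))
        print(sorted(os.listdir(out_dir)))
```

Writing each member under its base name also avoids depending on the versioned top-level directory name inside the archive.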
FILE:scripts/path_helper.py
#!/usr/bin/env python3
"""
Path Helper for PPT Compressor (Cross-Platform)
Helps validate and normalize user-provided file paths on Windows, macOS, and Linux.
Usage:
python path_helper.py <path_or_user_input>
python path_helper.py --extract "full user message that contains the path"
Functions:
validate_path(path) -> dict with status, normalized_path, error
extract_pptx_paths(text) -> list of found .pptx paths
normalize_path(path) -> cleaned path string
get_platform() -> 'windows' | 'darwin' | 'linux'
"""
import os
import re
import sys
import platform
from pathlib import Path
def get_platform() -> str:
    """Get the current platform name."""
    system = platform.system().lower()
    if system == 'windows':
        return 'windows'
    elif system == 'darwin':
        return 'darwin'  # macOS
    else:
        return 'linux'
def normalize_path(path: str) -> str:
    """
    Normalize a file path for cross-platform compatibility.

    - Removes surrounding quotes
    - On Windows: preserves drive letter, converts to forward slashes
    - On Unix: keeps forward slashes
    - Expands ~ to user home directory
    - Resolves the path to an absolute one when it exists on disk

    Args:
        path: Raw path string from user input

    Returns:
        Normalized path string
    """
    if not path:
        return ""

    # Remove surrounding quotes (single or double)
    path = path.strip()
    if (path.startswith('"') and path.endswith('"')) or \
            (path.startswith("'") and path.endswith("'")):
        path = path[1:-1]

    # Expand user home directory (~)
    path = os.path.expanduser(path)

    # Convert to Path object for normalization
    try:
        p = Path(path)
        if p.exists():
            # Existing paths are resolved to their absolute form
            path = str(p.resolve())
        else:
            # Otherwise only the path string itself is normalized
            path = str(p)
    except Exception:
        pass

    # Convert backslashes to forward slashes (Python handles both on Windows)
    path = path.replace('\\', '/')
    return path
def extract_pptx_paths(text: str) -> list:
    """
    Extract potential .pptx file paths from user text.

    Handles cross-platform paths:
    - Windows: C:/Users/name/file.pptx or C:\\Users\\name\\file.pptx
    - macOS/Linux: /Users/name/file.pptx or ~/Documents/file.pptx
    - Quoted paths: "path with spaces.pptx" or 'path.pptx'
    - Chinese and other unicode characters in paths

    Args:
        text: User's message that may contain file paths

    Returns:
        List of extracted paths (may be empty)
    """
    paths = []
    current_platform = get_platform()

    # Pattern 1: Quoted paths (highest priority - handles spaces and special chars)
    # Works on all platforms
    quoted_pattern = r'["\']([^"\']+\.pptx)["\']'
    for match in re.finditer(quoted_pattern, text, re.IGNORECASE):
        paths.append(match.group(1))

    # Pattern 2: Windows absolute paths (drive letter)
    # Handles both forward and back slashes, Chinese chars
    # Stop at whitespace, Chinese punctuation, or common delimiters
    win_pattern = r'[A-Za-z]:[/\\](?:[^<>:"|?*\n\s,。!?、;:]+[/\\])*[^<>:"|?*\n\s,。!?、;:]+\.pptx'
    for match in re.finditer(win_pattern, text, re.IGNORECASE):
        path = match.group(0)
        if path not in paths:
            paths.append(path)

    # Pattern 3: Home directory paths: ~/something.pptx (for macOS/Linux)
    # On Windows, only tried when nothing was found yet (avoids false positives)
    if not paths or current_platform != 'windows':
        home_pattern = r'~[/][^\s<>:"|?*\n,。!?、;:]+\.pptx'
        for match in re.finditer(home_pattern, text, re.IGNORECASE):
            path = match.group(0)
            if path not in paths:
                paths.append(path)

    # Pattern 4: Unix absolute paths: /path/to/file.pptx (for macOS/Linux)
    # Must start with / and not be preceded by drive letter
    # Only match on non-Windows platforms OR if no Windows paths found
    if not paths or current_platform != 'windows':
        # Match absolute paths that start with /
        # Exclude paths that might be part of a Windows path
        unix_pattern = r'(?<![A-Za-z]:)(?<![~/\w])/(?:[^\s<>:"|?*\n,。!?、;:]+/)*[^\s<>:"|?*\n,。!?、;:]+\.pptx'
        for match in re.finditer(unix_pattern, text, re.IGNORECASE):
            path = match.group(0)
            # Verify it starts with / and is a valid absolute path
            if path.startswith('/') and path not in paths:
                paths.append(path)

    # Normalize all found paths
    normalized = [normalize_path(p) for p in paths]

    # Remove duplicates while preserving order
    seen = set()
    unique = []
    for p in normalized:
        key = p.lower() if current_platform == 'windows' else p
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique
def validate_path(path: str) -> dict:
    """
    Validate a file path and return detailed status.

    Args:
        path: File path to validate

    Returns:
        Dictionary with:
        - valid (bool): Whether the path is valid and file exists
        - normalized_path (str): Cleaned path
        - exists (bool): Whether file exists
        - is_pptx (bool): Whether file has .pptx extension
        - size_mb (float): File size in MB (if exists)
        - error (str): Error message (if any)
        - suggestion (str): Helpful suggestion for the user
    """
    result = {
        'valid': False,
        'normalized_path': '',
        'exists': False,
        'is_pptx': False,
        'size_mb': 0,
        'error': '',
        'suggestion': ''
    }
    if not path:
        result['error'] = 'Path is empty'
        result['suggestion'] = 'Please provide the full path to the PPT file'
        return result

    # Normalize the path
    normalized = normalize_path(path)
    result['normalized_path'] = normalized

    # Check extension
    if not normalized.lower().endswith('.pptx'):
        result['error'] = f'File extension is not .pptx (current: {Path(normalized).suffix})'
        result['suggestion'] = 'This tool only supports PowerPoint files in .pptx format'
        return result
    result['is_pptx'] = True

    # Check if file exists
    try:
        p = Path(normalized)
        if not p.exists():
            # Try the raw input with backslashes (Windows)
            p = Path(path)
            if not p.exists():
                result['error'] = 'File does not exist'
                result['suggestion'] = f'Please check that the path is correct: {normalized}'
                return result
            else:
                normalized = str(p.resolve()).replace('\\', '/')
                result['normalized_path'] = normalized
        result['exists'] = True
        result['size_mb'] = round(p.stat().st_size / (1024 * 1024), 2)
        result['valid'] = True
    except Exception as e:
        result['error'] = f'Path resolution error: {e}'
        result['suggestion'] = 'The path may contain invalid characters; try wrapping it in quotes'
        return result
    return result
def format_validation_result(result: dict) -> str:
    """Format validation result for display."""
    if result['valid']:
        return (
            "✅ File validation passed\n"
            f"📁 Path: {result['normalized_path']}\n"
            f"📊 Size: {result['size_mb']} MB\n"
        )
    else:
        return (
            "❌ File validation failed\n"
            f"Reason: {result['error']}\n"
            f"💡 Suggestion: {result['suggestion']}\n"
        )
def main():
    """Command-line interface for path helper."""
    usage = """
PPT Path Helper - validate and extract PPT file paths

Usage:
  python path_helper.py <file path>        Validate the given path
  python path_helper.py --extract "text"   Extract .pptx paths from text
  python path_helper.py --help             Show this help

Examples:
  python path_helper.py "C:/Users/user/Desktop/报告.pptx"
  python path_helper.py --extract "Please compress C:\\Users\\test\\文件.pptx for me"
"""
    # Show the usage text both when no argument is given and on --help
    if len(sys.argv) < 2 or sys.argv[1] == '--help':
        print(usage)
        return

    if sys.argv[1] == '--extract':
        if len(sys.argv) < 3:
            print("Error: --extract requires a text argument")
            return
        text = ' '.join(sys.argv[2:])
        paths = extract_pptx_paths(text)
        if paths:
            print(f"Found {len(paths)} .pptx path(s):")
            for i, p in enumerate(paths, 1):
                print(f"  {i}. {p}")
                result = validate_path(p)
                if result['exists']:
                    print(f"     ✓ File exists ({result['size_mb']} MB)")
                else:
                    print(f"     ✗ {result['error']}")
        else:
            print("No .pptx file path found in the text")
            print("💡 Tip: make sure the path includes the full drive letter (e.g. C:/) or wrap it in quotes")
    else:
        # Validate provided path
        path = ' '.join(sys.argv[1:])
        result = validate_path(path)
        print(format_validation_result(result))
        if result['valid']:
            script_dir = os.path.dirname(os.path.abspath(__file__))
            print("You can compress it with the following command:")
            print(
                f'python -c "import sys; sys.path.insert(0, r\'{script_dir}\'); '
                f'from compress_ppt_videos import run; '
                f'run(r\'{result["normalized_path"]}\')"'
            )


if __name__ == '__main__':
    main()
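The quoted-path rule (Pattern 1 in `extract_pptx_paths`) can be tried on its own. The sketch below reuses the same regular expression; `find_quoted_pptx` and the sample message are hypothetical and exist only for this illustration.

```python
import re

# Same expression as quoted_pattern: a quote, anything but quotes ending in .pptx, a quote
QUOTED_PPTX = re.compile(r'["\']([^"\']+\.pptx)["\']', re.IGNORECASE)


def find_quoted_pptx(text: str) -> list:
    """Return the .pptx paths that appear inside single or double quotes."""
    return [m.group(1) for m in QUOTED_PPTX.finditer(text)]


if __name__ == "__main__":
    message = 'Please compress "C:/Users/me/My Deck.pptx" and \'~/slides/报告.PPTX\' today'
    for found in find_quoted_pptx(message):
        print(found)
```

Because the character class excludes both quote characters, a path that itself contains an apostrophe would not be captured by this rule; the unquoted patterns are the fallback in that case.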