@clawhub-huanglizhuo-d5ed81d569
---
name: qwen-asr
description: >-
  Local speech-to-text using Qwen3-ASR (CPU-only, no API key, no cloud).
  Use when: (1) a voice message or audio file needs transcription,
  (2) user asks to transcribe audio, (3) speech-to-text is needed.
  Supports offline, segmented, and streaming modes. macOS and Linux only.
metadata:
  {
    "openclaw": {
      "emoji": "🗣️",
      "os": ["darwin", "linux"],
      "requires": { "bins": ["qwen-asr"] }
    }
  }
---
# qwen-asr
Local, CPU-only speech-to-text powered by Qwen3-ASR. No API key or cloud needed.
- Source code: [huanglizhuo/QwenASR](https://github.com/huanglizhuo/QwenASR)
- Based on: [antirez/qwen-asr](https://github.com/antirez/qwen-asr) (original C implementation)
## Install
Run the install script to download the pre-built binary and model:
```bash
bash {baseDir}/scripts/install.sh
```
This will:
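1. Download the `qwen-asr` binary for your platform from GitHub Releases into `~/.local/bin` (pre-built for macOS arm64 and Linux x86_64; elsewhere, install from source with `cargo install qwen-asr-cli`)
2. Download the `qwen3-asr-0.6b` model (~1.5 GB) from Hugging Face into `~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b`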
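
To confirm that both install steps succeeded, a quick check along these lines may help. This is a minimal sketch: `check_qwen_asr` is a hypothetical helper name, not part of the skill, and it assumes the model directory contains `model.safetensors`, which is the file `scripts/install.sh` looks for.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill): report whether the
# qwen-asr binary and the model weights are where install.sh puts them.
check_qwen_asr() {
  # Optional first argument overrides the default model directory
  local model_dir="${1:-$HOME/.openclaw/tools/qwen-asr/qwen3-asr-0.6b}"
  local status=0
  if command -v qwen-asr >/dev/null 2>&1; then
    echo "binary: $(command -v qwen-asr)"
  else
    echo "binary: missing (is ~/.local/bin on your PATH?)"
    status=1
  fi
  # Assumption: model.safetensors marks a complete model download
  if [ -f "$model_dir/model.safetensors" ]; then
    echo "model:  $model_dir"
  else
    echo "model:  missing at $model_dir"
    status=1
  fi
  return "$status"
}
```

Calling `check_qwen_asr` with no arguments checks the default model directory; a non-zero return means at least one piece is missing.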
## Usage
### Transcribe an audio file
```bash
bash {baseDir}/scripts/transcribe.sh <audio-file>
```
Accepts WAV directly, plus any other format `ffmpeg` can decode (mp3, m4a, ogg, flac, opus, webm, aac, etc.).
Non-WAV files are converted to 16 kHz mono WAV with `ffmpeg`, which must be installed.
Or call `qwen-asr` directly (WAV only):
```bash
qwen-asr -d ~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b -i <audio-file> --silent
```
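
If you call `qwen-asr` directly on a non-WAV file, the conversion has to happen first. The sketch below mirrors the `ffmpeg` settings used in `scripts/transcribe.sh` (16 kHz, mono, WAV). The helper name `transcribe_via_ffmpeg` and the `QWEN_MODEL_DIR` variable are illustrative assumptions, not part of the skill.

```shell
#!/usr/bin/env bash
# Illustrative variable: the default model directory documented in this skill
QWEN_MODEL_DIR="$HOME/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# Hypothetical helper: convert one audio file to 16 kHz mono WAV,
# transcribe it with qwen-asr, then remove the temporary copy.
transcribe_via_ffmpeg() {
  local input="$1" workdir status=0
  workdir="$(mktemp -d)" || return 1
  # Same ffmpeg settings as scripts/transcribe.sh
  ffmpeg -y -i "$input" -ar 16000 -ac 1 -f wav "$workdir/audio.wav" -loglevel error \
    && qwen-asr -d "$QWEN_MODEL_DIR" -i "$workdir/audio.wav" --silent \
    || status=$?
  # Clean up whether or not the steps above succeeded
  rm -rf "$workdir"
  return "$status"
}
```

Usage would look like `transcribe_via_ffmpeg voice-note.m4a`. Writing into a fresh temporary directory keeps the original file untouched and makes cleanup a single `rm -rf`.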
### From stdin
```bash
cat audio.wav | qwen-asr -d ~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b --stdin --silent
```
### Common parameters
| Flag | Description |
|------|-------------|
| `--silent` | Print only transcription text (no progress) |
| `--language <lang>` | Force language (e.g., `zh`, `en`) |
| `-S <seconds>` | Segmented mode: split audio into chunks |
| `--stream` | Streaming mode: process audio in real time |
| `--stdin` | Read audio from stdin |
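
The flags in the table can be combined on one command line. As a rough sketch, the wrapper below adds `--language` and `-S` only when values are supplied. `run_asr` and `QWEN_MODEL_DIR` are hypothetical names introduced for illustration, and the input is assumed to already be a WAV file.

```shell
#!/usr/bin/env bash
# Illustrative variable: the default model directory documented in this skill
QWEN_MODEL_DIR="$HOME/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# Hypothetical wrapper: run_asr <wav-file> [language] [segment-seconds]
run_asr() {
  local wav="$1" lang="${2:-}" segment="${3:-}"
  # Arguments that are always passed
  local args=(-d "$QWEN_MODEL_DIR" -i "$wav" --silent)
  # Optional flags, appended only when a value was given
  if [ -n "$lang" ]; then args+=(--language "$lang"); fi
  if [ -n "$segment" ]; then args+=(-S "$segment"); fi
  qwen-asr "${args[@]}"
}
```

For example, `run_asr meeting.wav zh 30` would invoke `qwen-asr` with `--language zh -S 30`, while `run_asr meeting.wav` leaves both options unset. Building the argument list as an array keeps paths with spaces intact.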
### Model path
Default model directory: `~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b`
FILE:scripts/transcribe.sh
#!/usr/bin/env bash
set -euo pipefail
MODEL_DIR="$HOME/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"
# Convert non-WAV audio to 16 kHz mono WAV via ffmpeg; print the path to use
to_wav() {
  local input="$1"
  # Dispatch on the file extension (the text after the last dot)
  case "${input##*.}" in
    wav|WAV) echo "$input" ;;
    *)
      if ! command -v ffmpeg &>/dev/null; then
        echo "Error: ffmpeg is required for non-WAV files. Install it with:" >&2
        echo "  macOS: brew install ffmpeg" >&2
        echo "  Linux: sudo apt install ffmpeg" >&2
        exit 1
      fi
      local tmp
      tmp="$(mktemp /tmp/qwen-asr-XXXXXX.wav)"
      ffmpeg -y -i "$input" -ar 16000 -ac 1 -f wav "$tmp" -loglevel error
      echo "$tmp"
      ;;
  esac
}
if [ "${1:-}" = "--stdin" ]; then
  shift
  exec qwen-asr -d "$MODEL_DIR" --stdin --silent "$@"
elif [ -n "${1:-}" ]; then
  INPUT="$(to_wav "$1")"
  # Remove the converted file on exit, but never the caller's original WAV
  if [ "$INPUT" != "$1" ]; then
    trap 'rm -f "$INPUT"' EXIT
  fi
  # Pass any remaining arguments through to qwen-asr
  qwen-asr -d "$MODEL_DIR" -i "$INPUT" --silent "${@:2}"
else
  echo "Usage: transcribe.sh <audio-file> [options...]"
  echo "       transcribe.sh --stdin [options...]"
  echo ""
  echo "Supports: wav, mp3, m4a, ogg, flac, opus, webm, aac, etc. (non-WAV requires ffmpeg)"
  echo "Options are passed through to qwen-asr (e.g., --language zh, -S 30)"
  exit 1
fi
FILE:scripts/install.sh
#!/usr/bin/env bash
set -euo pipefail
REPO="huanglizhuo/QwenASR"
INSTALL_DIR="$HOME/.local/bin"
MODEL_DIR="$HOME/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"
# --- 1. Install binary ---
if command -v qwen-asr &>/dev/null; then
  echo "qwen-asr is already installed: $(command -v qwen-asr)"
else
  # Detect platform
  OS="$(uname -s)"
  ARCH="$(uname -m)"
  case "${OS}-${ARCH}" in
    Darwin-arm64) TARGET="aarch64-apple-darwin" ;;
    Linux-x86_64) TARGET="x86_64-unknown-linux-gnu" ;;
    *)
      echo "No pre-built binary for ${OS}-${ARCH}."
      echo "Install from source: cargo install qwen-asr-cli"
      exit 1
      ;;
  esac
  # Get latest qwen-asr-cli release tag
  echo "Fetching latest release..."
  # "|| true" keeps set -e/pipefail from aborting before the friendly error below
  TAG=$(curl -fsSL "https://api.github.com/repos/${REPO}/releases" \
    | grep -o '"tag_name": *"qwen-asr-cli-v[^"]*"' \
    | head -1 \
    | sed 's/"tag_name": *"//;s/"//') || true
  if [ -z "$TAG" ]; then
    echo "Could not find a qwen-asr-cli release."
    echo "Install from source: cargo install qwen-asr-cli"
    exit 1
  fi
  # Strip the tag prefix to get the bare version number
  VERSION="${TAG#qwen-asr-cli-v}"
  ARCHIVE="qwen-asr-${VERSION}-${TARGET}.tar.gz"
  URL="https://github.com/${REPO}/releases/download/${TAG}/${ARCHIVE}"
  echo "Downloading ${ARCHIVE}..."
  TMPDIR="$(mktemp -d)"
  trap 'rm -rf "$TMPDIR"' EXIT
  if ! curl -fSL -o "${TMPDIR}/${ARCHIVE}" "$URL"; then
    echo "Download failed. No pre-built binary for your platform in this release."
    echo "Install from source: cargo install qwen-asr-cli"
    exit 1
  fi
  # Extract to install dir
  mkdir -p "$INSTALL_DIR"
  tar -xzf "${TMPDIR}/${ARCHIVE}" -C "$INSTALL_DIR"
  chmod +x "${INSTALL_DIR}/qwen-asr"
  echo "Installed qwen-asr to ${INSTALL_DIR}/qwen-asr"
  # Check if INSTALL_DIR is in PATH
  if ! echo "$PATH" | tr ':' '\n' | grep -qx "$INSTALL_DIR"; then
    echo ""
    echo "NOTE: ${INSTALL_DIR} is not in your PATH."
    echo "Add it with: export PATH=\"${INSTALL_DIR}:\$PATH\""
    # Make the new binary visible to the model download step below
    export PATH="${INSTALL_DIR}:${PATH}"
  fi
fi
# --- 2. Download model ---
if [ -d "$MODEL_DIR" ] && [ -f "${MODEL_DIR}/model.safetensors" ]; then
  echo "Model already downloaded at ${MODEL_DIR}"
else
  echo "Downloading qwen3-asr-0.6b model..."
  mkdir -p "$(dirname "$MODEL_DIR")"
  qwen-asr download qwen3-asr-0.6b --output "$MODEL_DIR"
  echo "Model downloaded to ${MODEL_DIR}"
fi
echo ""
echo "Setup complete! Test with:"
echo "  qwen-asr -d ${MODEL_DIR} -i <audio-file> --silent"