lizhuo

Qwen ASR
---
name: qwen-asr
description: >-
  Local speech-to-text using Qwen3-ASR (CPU-only, no API key, no cloud).
  Use when: (1) a voice message or audio file needs transcription,
  (2) user asks to transcribe audio, (3) speech-to-text is needed.
  Supports offline, segmented, and streaming modes. macOS and Linux only.
metadata:
  {
    "openclaw": {
      "emoji": "🗣️",
      "os": ["darwin", "linux"],
      "requires": { "bins": ["qwen-asr"] }
    }
  }
---

# qwen-asr

Local, CPU-only speech-to-text powered by Qwen3-ASR. No API key or cloud needed.

- Source code: [huanglizhuo/QwenASR](https://github.com/huanglizhuo/QwenASR)
- Based on: [antirez/qwen-asr](https://github.com/antirez/qwen-asr) (original C implementation)

## Install

Run the install script to download the pre-built binary and model:

```bash
bash {baseDir}/scripts/install.sh
```

This will:
1. Download the `qwen-asr` binary for your platform from GitHub Releases
2. Download the `qwen3-asr-0.6b` model (~1.5 GB) from HuggingFace
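
After the install script finishes, a quick check like the one below can confirm that the binary is reachable. This is only a sketch: `in_path` is a hypothetical helper, and `~/.local/bin` is the install location used by `scripts/install.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if directory $1 appears in the
# colon-separated list $2 (defaults to the current $PATH).
in_path() {
  local dir="$1" list="${2:-$PATH}"
  case ":${list}:" in
    *":${dir}:"*) return 0 ;;
    *) return 1 ;;
  esac
}

BIN_DIR="${HOME}/.local/bin"  # where scripts/install.sh places the binary

if command -v qwen-asr >/dev/null 2>&1; then
  echo "qwen-asr found at: $(command -v qwen-asr)"
elif in_path "$BIN_DIR"; then
  echo "qwen-asr not found in ${BIN_DIR}; re-run the install script"
else
  echo "Add ${BIN_DIR} to PATH: export PATH=\"${BIN_DIR}:\$PATH\""
fi
```

The `case` match avoids spawning `grep` or `tr`, so the check still works when `PATH` is minimal.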

## Usage

### Transcribe an audio file

```bash
bash {baseDir}/scripts/transcribe.sh <audio-file>
```

Supports common audio formats: wav, mp3, m4a, ogg, flac, opus, webm, aac, and others that `ffmpeg` can decode.
Non-WAV files are converted automatically to 16 kHz mono WAV via `ffmpeg`, which must be installed.
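
The WAV versus non-WAV decision is made from the file extension. A minimal sketch of that check follows; `needs_conversion` is a hypothetical name, and unlike the bundled script it compares the extension case-insensitively.

```bash
#!/usr/bin/env bash
# Hypothetical helper: return 0 (true) when the file is not a WAV
# and would therefore need an ffmpeg conversion first.
needs_conversion() {
  local ext
  # ${1##*.} strips everything up to the last dot, leaving the extension.
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  [ "$ext" != "wav" ]
}

for f in note.wav memo.MP3 clip.Wav; do
  if needs_conversion "$f"; then
    echo "$f: convert with ffmpeg first"
  else
    echo "$f: pass to qwen-asr directly"
  fi
done
```

A file name without any dot is treated as non-WAV here, since the expansion leaves the whole name in place.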

Or call `qwen-asr` directly (WAV only):

```bash
qwen-asr -d ~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b -i <audio-file> --silent
```

### From stdin

```bash
cat audio.wav | qwen-asr -d ~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b --stdin --silent
```

### Common parameters

| Flag | Description |
|------|-------------|
| `--silent` | Print only transcription text (no progress) |
| `--language <lang>` | Force language (e.g., `zh`, `en`) |
| `-S <seconds>` | Segmented mode — split audio into chunks |
| `--stream` | Streaming mode — process audio in real time |
| `--stdin` | Read audio from stdin |
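
The flags above can be combined in one call. The sketch below assembles such a command line from optional settings; `build_cmd` is a hypothetical helper that only prints the command for inspection and does not run `qwen-asr`.

```bash
#!/usr/bin/env bash
MODEL_DIR="${HOME}/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# Hypothetical helper: print a qwen-asr command line for an audio file,
# with an optional language ($2) and segment length in seconds ($3).
build_cmd() {
  local file="$1" lang="${2:-}" seconds="${3:-}"
  local cmd=(qwen-asr -d "$MODEL_DIR" -i "$file" --silent)
  if [ -n "$lang" ]; then cmd+=(--language "$lang"); fi
  if [ -n "$seconds" ]; then cmd+=(-S "$seconds"); fi
  # Joined with spaces for display only; quoting is not preserved.
  printf '%s\n' "${cmd[*]}"
}

build_cmd meeting.wav zh 30   # forced language plus 30-second segments
build_cmd memo.wav            # defaults only
```

Building the arguments in an array keeps each flag and its value together, which makes optional flags easy to add or drop.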

### Model path

Default model directory: `~/.openclaw/tools/qwen-asr/qwen3-asr-0.6b`
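
To check whether the model is already in place, one option is to look for `model.safetensors` inside the model directory, which is the same test `scripts/install.sh` uses. A minimal sketch, where `model_ready` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
DEFAULT_MODEL_DIR="${HOME}/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# Hypothetical helper: succeed if the directory ($1, or the default)
# exists and contains model.safetensors.
model_ready() {
  local dir="${1:-$DEFAULT_MODEL_DIR}"
  [ -d "$dir" ] && [ -f "${dir}/model.safetensors" ]
}

if model_ready; then
  echo "Model found at ${DEFAULT_MODEL_DIR}"
else
  echo "Model missing; run scripts/install.sh to download it"
fi
```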

FILE:scripts/transcribe.sh
#!/usr/bin/env bash
set -euo pipefail

MODEL_DIR="${HOME}/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# Convert non-WAV audio to WAV via ffmpeg, return path to use
to_wav() {
  local input="$1"
  case "${input##*.}" in
    wav|WAV) echo "$input" ;;
    *)
      if ! command -v ffmpeg &>/dev/null; then
        echo "Error: ffmpeg is required for non-WAV files. Install it with:" >&2
        echo "  macOS:  brew install ffmpeg" >&2
        echo "  Linux:  sudo apt install ffmpeg" >&2
        exit 1
      fi
      local tmp
      tmp="$(mktemp /tmp/qwen-asr-XXXXXX.wav)"
      ffmpeg -y -i "$input" -ar 16000 -ac 1 -f wav "$tmp" -loglevel error
      echo "$tmp"
      ;;
  esac
}

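if [ "${1:-}" = "--stdin" ]; then
  shift
  exec qwen-asr -d "$MODEL_DIR" --stdin --silent "$@"
elif [ -n "${1:-}" ]; then
  INPUT="$(to_wav "$1")"
  # Only clean up the temporary WAV created by to_wav, never the caller's original file
  if [ "$INPUT" != "$1" ]; then
    trap 'rm -f "$INPUT"' EXIT
  fi
  qwen-asr -d "$MODEL_DIR" -i "$INPUT" --silent "${@:2}"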
else
  echo "Usage: transcribe.sh <audio-file> [options...]"
  echo "       transcribe.sh --stdin [options...]"
  echo ""
  echo "Supports: wav, mp3, m4a, ogg, flac, opus, webm, aac, etc. (non-WAV requires ffmpeg)"
  echo "Options are passed through to qwen-asr (e.g., --language zh, -S 30)"
  exit 1
fi

FILE:scripts/install.sh
#!/usr/bin/env bash
set -euo pipefail

REPO="huanglizhuo/QwenASR"
INSTALL_DIR="${HOME}/.local/bin"
MODEL_DIR="${HOME}/.openclaw/tools/qwen-asr/qwen3-asr-0.6b"

# --- 1. Install binary ---
if command -v qwen-asr &>/dev/null; then
  echo "qwen-asr is already installed: $(command -v qwen-asr)"
else
  # Detect platform
  OS="$(uname -s)"
  ARCH="$(uname -m)"

  case "${OS}-${ARCH}" in
    Darwin-arm64)  TARGET="aarch64-apple-darwin" ;;
    Linux-x86_64)  TARGET="x86_64-unknown-linux-gnu" ;;
    *)
      echo "No pre-built binary for ${OS}-${ARCH}."
      echo "Install from source: cargo install qwen-asr-cli"
      exit 1
      ;;
  esac

  # Get latest qwen-asr-cli release tag
  echo "Fetching latest release..."
  TAG=$(curl -fsSL "https://api.github.com/repos/${REPO}/releases" \
    | grep -o '"tag_name": *"qwen-asr-cli-v[^"]*"' \
    | head -1 \
    | sed 's/"tag_name": *"//;s/"//')

  if [ -z "$TAG" ]; then
    echo "Could not find a qwen-asr-cli release."
    echo "Install from source: cargo install qwen-asr-cli"
    exit 1
  fi

  VERSION="${TAG#qwen-asr-cli-v}"
  ARCHIVE="qwen-asr-${VERSION}-${TARGET}.tar.gz"
  URL="https://github.com/${REPO}/releases/download/${TAG}/${ARCHIVE}"

  echo "Downloading ${ARCHIVE}..."
  TMPDIR="$(mktemp -d)"
  trap 'rm -rf "$TMPDIR"' EXIT

  if ! curl -fSL -o "${TMPDIR}/${ARCHIVE}" "$URL"; then
    echo "Download failed. No pre-built binary for your platform in this release."
    echo "Install from source: cargo install qwen-asr-cli"
    exit 1
  fi

  # Extract to install dir
  mkdir -p "$INSTALL_DIR"
  tar -xzf "${TMPDIR}/${ARCHIVE}" -C "$INSTALL_DIR"
  chmod +x "${INSTALL_DIR}/qwen-asr"

  echo "Installed qwen-asr to ${INSTALL_DIR}/qwen-asr"

  # Check if INSTALL_DIR is in PATH
  if ! echo "$PATH" | tr ':' '\n' | grep -qx "$INSTALL_DIR"; then
    echo ""
    echo "NOTE: ${INSTALL_DIR} is not in your PATH."
    echo "Add it with:  export PATH=\"${INSTALL_DIR}:\$PATH\""
  fi
fi

# --- 2. Download model ---
if [ -d "$MODEL_DIR" ] && [ -f "${MODEL_DIR}/model.safetensors" ]; then
  echo "Model already downloaded at ${MODEL_DIR}"
else
  echo "Downloading qwen3-asr-0.6b model..."
  mkdir -p "$(dirname "$MODEL_DIR")"
  qwen-asr download qwen3-asr-0.6b --output "$MODEL_DIR"
  echo "Model downloaded to ${MODEL_DIR}"
fi

echo ""
echo "Setup complete! Test with:"
echo "  qwen-asr -d ${MODEL_DIR} -i <audio-file> --silent"