Granite Speech

Private speech-to-text in your browser — no audio sent to any server.

Granite Speech WebGPU runs local inference directly in your browser using GPU acceleration. It supports English, French, German, Spanish, Portuguese, and Japanese, and is designed for private, on-device transcription — not as a general-purpose or cloud-connected transcription service.

IBM Granite

Granite Speech WebGPU

Private, on-device speech-to-text running entirely in your browser — supports English, French, German, Spanish, Portuguese, and Japanese with no audio sent to any server.


Capture audio in chunks — everything stays on-device.

Audio is captured locally, in the browser or on an edge device, and is never sent to an external server. Splitting the audio into chunks keeps latency and memory use manageable while the entire transcription experience stays private.
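
The chunking idea can be illustrated with a minimal sketch. It assumes the captured audio is already available as mono PCM samples in a Float32Array; the helper name chunkAudio, the chunk length, and the overlap are hypothetical choices for illustration, not values taken from Granite Speech WebGPU.

```typescript
// Hypothetical helper: split mono PCM samples into fixed-size chunks.
// Consecutive chunks share `overlap` samples so words cut at a boundary
// appear in both neighbours. The last chunk may be shorter than chunkSize.
function chunkAudio(
  samples: Float32Array,
  chunkSize: number,
  overlap: number = 0,
): Float32Array[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in [0, chunkSize)");
  }
  const step = chunkSize - overlap;
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += step) {
    const end = Math.min(start + chunkSize, samples.length);
    // subarray returns a view on the same buffer, so no samples are copied.
    chunks.push(samples.subarray(start, end));
    if (end === samples.length) break; // the tail has been covered
  }
  return chunks;
}
```

For example, assuming 16 kHz audio, chunkAudio(pcm, 16000 * 5, 16000) would yield five-second chunks that overlap by one second. Because each chunk is a view rather than a copy, memory use stays close to the size of the original buffer.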

Supports English, French, German, Spanish, Portuguese, and Japanese.

Granite Speech WebGPU focuses on six languages for browser-based transcription and translation. It is purpose-built for private, local use — not a general-purpose cloud transcription service.

Run speech inference with WebGPU acceleration.

Granite Speech converts acoustic input into language tokens or transcript segments — powered by your local GPU, with no cloud dependency and no audio leaving the device.
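
Since this step depends on a local GPU, a page would typically check for WebGPU support before loading a model. The sketch below shows one way to do that check. The names detectBackend, NavigatorLike, and GpuLike are hypothetical; the only browser API assumed is navigator.gpu.requestAdapter(), which resolves to null when no suitable adapter exists. It does not show how Granite Speech WebGPU itself loads or runs the model.

```typescript
// Minimal structural types so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type Backend = "webgpu" | "unavailable";

// Hypothetical helper: report whether a usable WebGPU adapter is present.
async function detectBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) {
    return "unavailable"; // the browser does not expose WebGPU at all
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "unavailable";
  } catch {
    return "unavailable"; // treat a failed request as no support
  }
}

// In a browser this would be called as:
//   detectBackend(navigator as NavigatorLike)
```

Passing the navigator object in as a parameter, instead of reading the global directly, keeps the check easy to exercise outside a browser.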

Assemble text and post-process.

Partial outputs from each chunk are merged, cleaned up, and shown to the user quickly and reliably. This step also runs entirely within the browser, so privacy is maintained end to end.
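
One simple way to merge partial outputs is sketched below. It assumes each chunk yields a plain-text partial transcript and that neighbouring chunks may repeat a few words at their boundary. The function name mergeTranscripts and the word-overlap heuristic are illustrative assumptions, not the method Granite Speech WebGPU uses.

```typescript
// Hypothetical helper: join partial transcripts, dropping words that are
// repeated at the boundary between consecutive parts.
function mergeTranscripts(parts: string[]): string {
  const merged: string[] = [];
  for (const part of parts) {
    // Cleaning step: trim and collapse whitespace, ignore empty parts.
    const words = part.trim().split(/\s+/).filter((w) => w.length > 0);
    // Find the longest suffix of the merged text that matches a prefix
    // of the new part, compared case-insensitively.
    let overlap = 0;
    for (let k = Math.min(merged.length, words.length); k > 0; k--) {
      const tail = merged.slice(merged.length - k);
      const matches = tail.every(
        (w, i) => w.toLowerCase() === words[i]?.toLowerCase(),
      );
      if (matches) {
        overlap = k;
        break;
      }
    }
    merged.push(...words.slice(overlap));
  }
  return merged.join(" ");
}
```

With this heuristic, the parts "the quick brown", "brown fox jumps", and "jumps over" merge into "the quick brown fox jumps over". A genuine repetition that falls exactly on a chunk boundary would also be collapsed, which is a known limitation of matching on words alone.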