Question 1

Where does my audio go?

Accepted Answer

Nowhere. Both transcription and the optional summary run entirely in your browser via Transformers.js (WebGPU/WASM) and WebLLM. Audio bytes never leave your device.

Question 2

What models are used?

Accepted Answer

Whisper tiny.en (about 75 MB) for the transcript, and Llama 3.2 1B Instruct (about 750 MB, quantized) for the optional TL;DR + key points summary.

Question 3

What happens on first use?

Accepted Answer

Whisper downloads (~75 MB) and caches in your browser the first time you press Rec or Upload. If you ask for a summary, Llama 3.2 1B downloads (~750 MB) on that click. Both are cached so subsequent runs don't re-download.
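
To make the download arithmetic concrete, here is a minimal sketch that works out how much still has to be fetched for a given run. The sizes are the approximate figures quoted in this FAQ. The names `CacheState`, `pendingDownloadMB`, and `estimateSeconds`, and the link speed in the usage note, are hypothetical and not part of the app.

```typescript
// Approximate model sizes quoted in this FAQ, in megabytes.
const WHISPER_TINY_EN_MB = 75;
const LLAMA_3_2_1B_MB = 750;

// Hypothetical record of which models are already in the browser cache.
interface CacheState {
  whisperCached: boolean;
  llamaCached: boolean;
}

// Megabytes that still need to be downloaded before this run can start.
// The summary model only counts when a summary is actually requested.
function pendingDownloadMB(wantsSummary: boolean, cache: CacheState): number {
  let total = 0;
  if (!cache.whisperCached) total += WHISPER_TINY_EN_MB;
  if (wantsSummary && !cache.llamaCached) total += LLAMA_3_2_1B_MB;
  return total;
}

// Rough download time in seconds for a link speed given in megabits per second.
function estimateSeconds(megabytes: number, linkMbps: number): number {
  return (megabytes * 8) / linkMbps;
}
```

On a first visit with a summary requested, this gives 825 MB in total. On an assumed 50 Mbit/s connection, that is roughly 132 seconds of download before protocol overhead. Later runs with both models cached need 0 MB.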

Question 4

Does it work offline?

Accepted Answer

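Yes for the models, after the initial downloads. They are kept in your browser's local cache, so transcription and summarization need no network on later visits. The page itself is not yet cached for offline loading (see Question 7), so it must be loaded while you are still online.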

Question 5

How accurate is it?

Accepted Answer

Whisper tiny.en is small and fast — great for clear English speech, but it can struggle with heavy accents, noisy audio, or domain-specific jargon. Larger Whisper variants are more accurate at the cost of download size.
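
If you want to put a number on accuracy for your own recordings, one common measure is word error rate (WER): the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is an illustration only. The page does not compute this, and the function name `wordErrorRate` and its simple whitespace tokenization are assumptions.

```typescript
// Word error rate: (substitutions + deletions + insertions) / reference word count.
// Tokenization here is lowercase plus whitespace splitting, which is a simplification.
function wordErrorRate(reference: string, hypothesis: string): number {
  const tokenize = (s: string): string[] =>
    s.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);

  // WER is undefined for an empty reference; report 0 only if both are empty.
  if (ref.length === 0) return hyp.length === 0 ? 0 : Infinity;

  // prev[j] holds the edit distance between the reference words seen so far
  // and the first j hypothesis words (classic two-row Levenshtein).
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // reference word missing from the hypothesis
        curr[j - 1] + 1,    // extra word in the hypothesis
        prev[j - 1] + cost  // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

For example, comparing "send the invoice today" with a transcript that reads "send invoice today" yields 0.25, because one of the four reference words was dropped.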

Question 6

What is WebGPU? What if my browser doesn’t have it?

Accepted Answer

WebGPU is a browser API for running GPU compute. Recent Chrome and Edge support it natively; Safari support depends on the version, and older releases keep it behind a flag. When WebGPU isn't available, the page automatically falls back to WebAssembly, which is slower but works everywhere.
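
The fallback described above can be sketched as a simple feature check. This is not the page's actual code. The names `Backend`, `NavigatorLike`, and `pickBackend` are hypothetical, and the check only looks for the `navigator.gpu` entry point.

```typescript
type Backend = "webgpu" | "wasm";

// Minimal shape of the part of `navigator` this check reads.
// Declared locally so the sketch does not depend on WebGPU type definitions.
interface NavigatorLike {
  gpu?: unknown;
}

// Returns "webgpu" when the browser exposes the WebGPU entry point,
// otherwise falls back to "wasm".
function pickBackend(nav: NavigatorLike): Backend {
  return nav.gpu !== undefined && nav.gpu !== null ? "webgpu" : "wasm";
}
```

In a browser this would be called as `pickBackend(navigator as NavigatorLike)`. A more thorough check would also await `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable GPU adapter is available, and treat that case as "wasm" too.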

Question 7

Can I run this offline as a PWA?

Accepted Answer

Not yet. The page is still a regular web app: even with the models cached, the page itself is loaded from the Vercel edge on each visit. A service worker could cache it for offline use, and PWA support is on the roadmap.

Private transcribe

FAQ