Quick Start

1. Install dependencies

powershell
npm install

VILab now supports both Windows and Apple Silicon macOS desktop development from the same repository.

2. Start the desktop app

powershell
npm run dev

On Apple Silicon macOS, you can also use the helper script:

bash
bash scripts/dev-mac.sh

On Windows, the local Whisper runtime may require a 64-bit libclang.dll during Rust compilation. npm run dev, npm run build, and npm run cargo:check now try to auto-detect common install locations first. If cargo still fails inside whisper-rs-sys, set LIBCLANG_PATH to a directory that contains a compatible 64-bit libclang.dll and run the command again.
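
For example, in PowerShell (the LLVM path below is illustrative; point it at whichever directory actually contains your 64-bit libclang.dll):

```powershell
# Illustrative path; adjust to your own 64-bit LLVM install
$env:LIBCLANG_PATH = "C:\Program Files\LLVM\bin"
npm run dev
```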

On Windows, the same startup path now also prepares ONNX Runtime automatically for built-in SenseVoice support. You normally do not need to install onnxruntime.dll by hand before running the desktop app.

On macOS, before the first dictation workflow can complete, you must grant:

  • Microphone permission for audio capture
  • Accessibility permission for global shortcuts and paste-back into the focused app

The current default macOS dictation shortcut is Cmd+Shift+Option+D.

npm run dev launches:

  • The Vite renderer
  • The Tauri desktop shell
  • The embedded VILab HTTP service managed by the app

3. Confirm the service URL

In the desktop app, open Settings and check Server URL.

The current default is:

text
http://127.0.0.1:8765

4. Check health

powershell
curl.exe http://127.0.0.1:8765/health

In Windows PowerShell, bare curl is an alias for Invoke-WebRequest, so call curl.exe to run the real curl binary.

Expected response:

json
{
  "serviceId": "uuid",
  "version": "0.1.7",
  "publicModel": "vilab-local-stt"
}

publicModel is the currently exposed STT model alias. It can represent either the active local model alias or the currently routed cloud STT alias.
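
If you script against the health endpoint, you can pull publicModel out of the response. The sketch below parses the sample payload shown above; on a live host, pipe the output of curl -s http://127.0.0.1:8765/health instead:

```shell
# Sample health payload (on a live host: curl -s http://127.0.0.1:8765/health)
HEALTH='{"serviceId":"uuid","version":"0.1.7","publicModel":"vilab-local-stt"}'
# Extract the publicModel field with sed (avoids a jq dependency)
printf '%s\n' "$HEALTH" | sed -n 's/.*"publicModel": *"\([^"]*\)".*/\1/p'
```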

5. Choose the STT mode

Open Settings and configure Routing and models -> STT mode:

  • Local: desktop dictation and /v1/audio/transcriptions use the active local speech runtime
  • Cloud: desktop dictation and /v1/audio/transcriptions use the selected cloud provider and cloud STT model

The Home page quick-start header now shows the current STT source and model name, for example:

  • Local · Whisper Small
  • Local · SenseVoice
  • Cloud · whisper-large-v3-turbo

Settings now save automatically after each change. You do not need to click a separate save button after switching STT mode, dictation priority, providers, or local models.

6. If using local STT, install and activate a local speech model

Before desktop dictation or LAN clients can call /v1/audio/transcriptions, open Settings and configure Local speech models:

  • Download a built-in model such as Whisper Base, Whisper Small, or SenseVoice, or
  • Place a compatible Whisper.cpp .bin model file in the models directory and refresh, then
  • Activate the model you want the host to use

The desktop app and the embedded HTTP service now share the same local speech runtime.

Built-in SenseVoice is installed as a managed directory model. Manual custom local models are still limited to Whisper.cpp .bin files in this phase.

Built-in SenseVoice is supported on both Windows and Apple Silicon macOS in this release.

If you select Cloud STT mode instead, configure the cloud provider and cloud STT model in the same routing section.

7. Choose the transcript output mode

Open Settings and configure Transcript mode and Scene mode for desktop dictation:

  • Verbatim keeps the output closest to the raw transcript
  • Smart Clean removes low-risk fillers, repetitions, and obvious self-repairs using only local cleanup rules
  • Polished runs the cloud transform provider after smart-clean and can shape the result for chat, email, or notes

History now keeps multiple output variants per session so you can switch between raw, smart-clean, and polished results without losing the original transcript.

8. Configure text transform providers

Cloud providers are now used for polished dictation, cleanup, rewrite, and other text transforms.

In Settings, configure:

  • Transform provider
  • API key
  • API base URL
  • Rewrite model

9. Open Prompt Lab

In the sidebar footer, click Test platform (under Open Docs) to launch Prompt Lab in your browser.

Prompt Lab is intended for internal prompt and model evaluation:

  • Smart Clean runs local rules only and should finish almost instantly
  • Polished uses the selected provider, model, preset, and optional prompt override
  • Runs do not write to the normal session history

10. Create an external API key

In Settings -> External API keys:

  • Create one key per project, script, device, or integration
  • Do not distribute the adminKey to normal callers
  • Copy the new key immediately; the full value is only shown once
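
As a sketch of how a caller might use such a key, assuming the service follows the OpenAI-style bearer-token and multipart upload conventions its /v1/audio/transcriptions path suggests (the header and form-field names here are assumptions, not confirmed API details):

```bash
# Hypothetical usage; VILAB_API_KEY holds the external key created above
curl -s http://127.0.0.1:8765/v1/audio/transcriptions \
  -H "Authorization: Bearer $VILAB_API_KEY" \
  -F "file=@recording.wav"
```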

11. Build a packaged macOS app

On Apple Silicon macOS:

bash
npm run build:mac
bash scripts/run-mac-app.sh

npm run build:mac produces the packaged .app and .dmg, while run-mac-app.sh opens the newest built app bundle for a local smoke test.

Next steps

Public release docs and self-hosted deployment guidance.