Quick Start

1. Install dependencies

powershell
npm install

VILab now supports both Windows and Apple Silicon macOS desktop development from the same repository.

2. Start the desktop app

powershell
npm run dev

On Apple Silicon macOS, you can also use the helper script:

bash
bash scripts/dev-mac.sh

On Windows, the local Whisper runtime may require a 64-bit libclang.dll during Rust compilation. npm run dev, npm run build, and npm run cargo:check now try to auto-detect common install locations first. If cargo still fails inside whisper-rs-sys, set LIBCLANG_PATH to a directory that contains a compatible 64-bit libclang.dll and run the command again.
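
For example, in PowerShell (the LLVM path below is illustrative; point it at whichever directory actually contains your 64-bit libclang.dll):

```powershell
# Illustrative path; adjust to your own 64-bit LLVM install
$env:LIBCLANG_PATH = "C:\Program Files\LLVM\bin"
npm run dev
```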

On Windows, the same startup path now also prepares ONNX Runtime automatically for built-in SenseVoice support. You normally do not need to install onnxruntime.dll by hand before running the desktop app.

On macOS, before the first dictation workflow can complete, you must grant:

  • Microphone permission for audio capture
  • Accessibility permission for global shortcuts and paste-back into the focused app

The current default macOS dictation shortcut is Cmd+Shift+Option+D.

npm run dev launches:

  • The Vite renderer
  • The Tauri desktop shell
  • The embedded VILab HTTP service managed by the app

3. Confirm the service URL

In the desktop app, open Settings and check Server URL.

The current default is:

text
http://127.0.0.1:8765

4. Check health

powershell
curl.exe http://127.0.0.1:8765/health

In Windows PowerShell, bare curl is an alias for Invoke-WebRequest, so call curl.exe to run the real curl binary.

Expected response:

json
{
  "serviceId": "uuid",
  "version": "0.1.7",
  "publicModel": "vilab-local-stt"
}

publicModel is the currently exposed STT model alias. It can represent either the active local model alias or the currently routed cloud STT alias.
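
If you script against the health endpoint, you can pull publicModel out of the response. The sketch below parses the sample payload shown above; on a live host, pipe the output of curl -s http://127.0.0.1:8765/health instead:

```shell
# Sample health payload (on a live host: curl -s http://127.0.0.1:8765/health)
HEALTH='{"serviceId":"uuid","version":"0.1.7","publicModel":"vilab-local-stt"}'
# Extract the publicModel field with sed (avoids a jq dependency)
printf '%s\n' "$HEALTH" | sed -n 's/.*"publicModel": *"\([^"]*\)".*/\1/p'
```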

5. Choose the STT mode

Open Settings and configure Routing and models -> STT mode:

  • Local: desktop dictation and /v1/audio/transcriptions use the active local speech runtime
  • Cloud: desktop dictation and /v1/audio/transcriptions use the selected cloud provider and cloud STT model

The Home page quick-start header now shows the current STT source and model name, for example:

  • Local · Whisper Small
  • Local · SenseVoice
  • Cloud · whisper-large-v3-turbo

Settings now save automatically after each change. You do not need to click a separate save button after switching STT mode, dictation priority, providers, or local models.

6. If using local STT, install and activate a local speech model

Before desktop dictation or LAN clients can call /v1/audio/transcriptions, open Settings and configure Local speech models:

  • Download a built-in model such as Whisper Base, Whisper Small, or SenseVoice, or
  • Place a compatible Whisper.cpp .bin model file in the models directory and refresh, then
  • Activate the model you want the host to use

The desktop app and the embedded HTTP service now share the same local speech runtime.

Built-in SenseVoice is installed as a managed directory model. Manual custom local models are still limited to Whisper.cpp .bin files in this phase.

Built-in SenseVoice is supported on both Windows and Apple Silicon macOS in this release.

If you select Cloud STT mode instead, configure the cloud provider and cloud STT model in the same routing section.

7. Choose the transcript output mode

Open Settings and configure Transcript mode and Scene mode for desktop dictation:

  • Verbatim keeps the output closest to the raw transcript
  • Smart Clean removes low-risk fillers, repetitions, and obvious self-repairs using only local cleanup rules
  • Polished runs the cloud transform provider after smart-clean and can shape the result for chat, email, or notes

History now keeps multiple output variants per session so you can switch between raw, smart-clean, and polished results without losing the original transcript.

8. Configure text transform providers

Cloud providers are now used for polished dictation, cleanup, rewrite, and other text transforms.

In Settings, configure:

  • Transform provider
  • API key
  • API base URL
  • Rewrite model

9. Open Prompt Lab

In the sidebar footer, click Test platform (under Open Docs) to launch Prompt Lab in your browser.

Prompt Lab is intended for internal prompt and model evaluation:

  • Smart Clean runs local rules only and should finish almost instantly
  • Polished uses the selected provider, model, preset, and optional prompt override
  • Runs do not write to the normal session history

10. Create an external API key

In Settings -> External API keys:

  • Create one key per project, script, device, or integration
  • Do not distribute the adminKey to normal callers
  • Copy the new key immediately; the full value is only shown once
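
As a sketch of how a caller might use such a key, assuming the service follows the OpenAI-style bearer-token and multipart upload conventions its /v1/audio/transcriptions path suggests (the header and form-field names here are assumptions, not confirmed API details):

```bash
# Hypothetical usage; VILAB_API_KEY holds the external key created above
curl -s http://127.0.0.1:8765/v1/audio/transcriptions \
  -H "Authorization: Bearer $VILAB_API_KEY" \
  -F "file=@recording.wav"
```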

11. Build a packaged macOS app

On Apple Silicon macOS:

bash
npm run build:mac
bash scripts/run-mac-app.sh

npm run build:mac produces the packaged .app and .dmg, while run-mac-app.sh opens the newest built app bundle for a local smoke test.

Next steps

Public release docs and self-hosted deployment guidance.