Configuration Guide

Customize hotkeys, choose providers, configure AI cleanup, and fine-tune settings.

Global Hotkey

The global hotkey triggers push-to-talk recording from any application. The default is Ctrl+Alt+Space. You can change this in Settings > Hotkey.

Hold the hotkey to record, then release it to stop and transcribe. Alternatively, configure toggle mode: press once to start and once to stop. You can also set a push-to-talk grace period: if you release the key within the grace window, Dikt switches to toggle-like behavior, so you don't need to hold the key for longer dictations.
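The interaction between push-to-talk and the grace period can be pictured as a small state machine. The sketch below is only an illustration of the behavior described above, not Dikt's actual implementation; the class name, the 0.3-second default, and the returned action strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotkeyState:
    """Hypothetical model of push-to-talk with a grace window."""
    grace_seconds: float = 0.3          # assumed value, not Dikt's real default
    recording: bool = False
    latched: bool = False               # True after a quick tap switched to toggle-like mode
    pressed_at: Optional[float] = None

    def on_press(self, now: float) -> str:
        if self.recording and self.latched:
            # Second press in toggle-like mode ends the recording.
            self.recording = False
            self.latched = False
            self.pressed_at = None
            return "stop"
        if not self.recording:
            self.recording = True
            self.pressed_at = now
            return "start"
        return "ignore"                 # e.g. keyboard auto-repeat while held

    def on_release(self, now: float) -> str:
        if not self.recording or self.latched or self.pressed_at is None:
            return "ignore"
        if now - self.pressed_at <= self.grace_seconds:
            # Released inside the grace window: keep recording until the next press.
            self.latched = True
            return "latch"
        # Held longer than the grace window: plain push-to-talk.
        self.recording = False
        self.pressed_at = None
        return "stop"
```

A quick tap therefore produces start, latch, and later stop on the next press, while a long hold produces start followed by stop on release.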

Transcription Provider

Choose among local (Whisper.cpp), cloud (OpenAI Whisper API), and GPT-4o Audio transcription. Configure in Settings > Transcription.

Local: No API key needed. Download a model from the Model Manager.
Cloud: Requires an OpenAI API key (BYOK), or is included with Pro/Trial.
GPT-4o Audio: Highest quality cloud transcription via OpenAI GPT-4o multimodal.

AI Cleanup (LLM Post-Processing)

Enable AI cleanup to fix grammar and punctuation and to remove filler words. Supports OpenAI GPT-4o-mini and Anthropic Claude Haiku. Configure in Settings > AI & Text.

BYOK users provide their own API keys. Pro users get this included via the managed proxy.

Profanity Filter

Automatically removes curse words from transcriptions. Uses dual coverage: the AI cleanup prompt is augmented to filter profanity, and a local word-list filter catches anything the AI misses. Add custom words to extend the built-in list. Configure in Settings > Swear Jar.

The Swear Jar tracks removal stats with a configurable cost-per-word, daily trends, and top offenders.
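As a rough illustration of the local word-list pass and the cost-per-word tally, here is a minimal sketch. It assumes whole-word, case-insensitive matching; the placeholder word list, the function name, and the 0.25 cost are invented for the example and do not reflect Dikt's built-in list or defaults.

```python
import re

# Placeholder entries only; Dikt's real built-in list is not shown here.
BUILT_IN_WORDS = {"darn", "heck"}


def build_filter(custom_words=()):
    """Return a function that strips listed words and reports how many were removed."""
    words = BUILT_IN_WORDS | {w.lower() for w in custom_words}
    # Longest first so longer entries win over shorter prefixes.
    ordered = sorted(words, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in ordered) + r")\b",
        re.IGNORECASE,
    )

    def apply(text):
        cleaned, removed = pattern.subn("", text)
        cleaned = re.sub(r"\s{2,}", " ", cleaned)            # collapse leftover gaps
        cleaned = re.sub(r"\s+([,.!?;:])", r"\1", cleaned)   # no space before punctuation
        return cleaned.strip(), removed

    return apply


apply_filter = build_filter(custom_words=["frick"])
text, removed = apply_filter("Well heck, that frick thing is Darn slow.")
jar_total = removed * 0.25   # hypothetical cost-per-word setting
```

In the dual-coverage scheme described above, a pass like this would run on the text after AI cleanup, so it only has to catch what the prompt-based filtering missed.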

Overlay System

Dikt has two overlay modes that keep you informed without opening the full main window. The Notification Overlay shows transcription results as a brief semi-transparent popup. The Compact Overlay is a persistent floating window showing recording status and audio levels with live waveform visualization.

Both overlays support configurable Type (Minimal, Line, Box), Style (Pill, Box, Text), Position (9 presets or custom drag), Opacity (0.5–1.0), Display (choose which monitor for multi-monitor setups), and Size (1–100 scaling). Visual previews in dropdown menus let you see each option before selecting it. Configure in Settings > General.

The compact overlay features Waveform Indicators with 8 visual styles (Mirror, Bars, Pulse, Steps, Dots, Ribbon, Peaks, or None) that respond to audio levels during recording. When using Box type, enable Show Transcription (last transcribed text), Show Controls (Record/Stop button), and Show History (mini history list). Additional options include Click Opens App, Lock Position, and Auto-Hide (show during recording, hide after transcription with configurable delay of 1–30 seconds). Toggle the compact overlay from the status bar button in the main window.
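The documented overlay ranges can be summarized as a small validation sketch. This is an illustration only: the field names and default values are hypothetical and are not taken from Dikt's settings file; only the ranges and the waveform style names come from the text above.

```python
from dataclasses import dataclass

WAVEFORM_STYLES = ("Mirror", "Bars", "Pulse", "Steps", "Dots", "Ribbon", "Peaks", "None")


@dataclass
class OverlaySettings:
    """Hypothetical container; names and defaults are assumptions."""
    opacity: float = 1.0          # documented range: 0.5 to 1.0
    size: int = 50                # documented range: 1 to 100
    waveform: str = "Bars"        # one of the 8 documented styles
    auto_hide_delay_s: int = 3    # documented range: 1 to 30 seconds

    def validate(self) -> list:
        """Return a list of human-readable problems (empty when all values are in range)."""
        problems = []
        if not 0.5 <= self.opacity <= 1.0:
            problems.append("opacity must be between 0.5 and 1.0")
        if not 1 <= self.size <= 100:
            problems.append("size must be between 1 and 100")
        if self.waveform not in WAVEFORM_STYLES:
            problems.append("unknown waveform style: " + self.waveform)
        if not 1 <= self.auto_hide_delay_s <= 30:
            problems.append("auto-hide delay must be between 1 and 30 seconds")
        return problems
```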

Typing Animation displays transcription text with a letter-by-letter effect in both overlays (enabled by default, toggle in Settings).

Audio Dimming

Audio dimming automatically lowers your system volume while you're recording, so background music or notifications don't interfere with your dictation. Configure the dim percentage (0–100%) in Settings > Audio. If the app crashes during recording, volume is automatically restored on next launch.
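One way to picture dimming with crash recovery is to persist the original volume before lowering it. The sketch below assumes that the dim percentage is the amount of reduction (0 leaves the volume unchanged, 100 mutes) and that a marker file records the pre-dim volume; both are assumptions for illustration, and the function and file names are hypothetical rather than Dikt's own.

```python
import json
from pathlib import Path
from typing import Optional


def dimmed_volume(current: float, dim_percent: int) -> float:
    """Assumed meaning: reduce the current volume (0.0 to 1.0) by dim_percent."""
    if not 0 <= dim_percent <= 100:
        raise ValueError("dim_percent must be between 0 and 100")
    return current * (100 - dim_percent) / 100


def begin_dimming(marker: Path, current: float, dim_percent: int) -> float:
    """Record the original volume on disk, then return the volume to apply."""
    marker.write_text(json.dumps({"original_volume": current}), encoding="utf-8")
    return dimmed_volume(current, dim_percent)


def end_dimming(marker: Path) -> None:
    """Normal path: recording finished and volume was restored, so drop the marker."""
    marker.unlink(missing_ok=True)


def volume_to_restore_on_launch(marker: Path) -> Optional[float]:
    """If a marker survived a crash, return the volume that should be restored."""
    if not marker.exists():
        return None
    original = json.loads(marker.read_text(encoding="utf-8"))["original_volume"]
    marker.unlink()
    return original
```

Because the marker is written before the volume changes, a crash at any point during recording still leaves enough information to restore the original level at the next start.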

GPU Acceleration

Speed up local Whisper transcription using your GPU. Go to Settings > Models > Acceleration and choose:

Auto: Detects your GPU and uses it if compatible
CPU: Software-only (works on all machines)
CUDA: NVIDIA GPUs with CUDA support
DirectML: AMD and Intel GPUs
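The options above suggest how Auto might resolve to a concrete backend. The following sketch is a guess at that decision based only on the list above (NVIDIA with CUDA support maps to CUDA, AMD and Intel map to DirectML, anything else falls back to CPU); the function name and its parameters are hypothetical, and Dikt's real detection logic may differ.

```python
from typing import Optional


def resolve_backend(setting: str, gpu_vendor: Optional[str], cuda_available: bool = False) -> str:
    """Map the Acceleration setting to a backend name (illustrative only)."""
    choice = setting.lower()
    if choice != "auto":
        # An explicit choice (cpu, cuda, directml) is used as-is.
        return choice
    vendor = (gpu_vendor or "").lower()
    if vendor == "nvidia" and cuda_available:
        return "cuda"
    if vendor in {"amd", "intel"}:
        return "directml"
    # No GPU detected, or an incompatible one: software-only path.
    return "cpu"
```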

AI Personas

Choose a cleanup persona in Settings > Speed & AI > Cleanup Persona. Six built-in personas are available: Technical Writer, Doctor, Journalist, Lawyer, Code Reviewer, and Casual. Click "Add Custom" to create your own persona with a custom system prompt. Personas can be assigned per Voice Profile.

Translation

Set an output language in Settings > Speed & AI > Output Language. Enter the target language name (e.g. "French", "Spanish", "Japanese"). After transcription and AI cleanup, the text is automatically translated before injection. Leave blank to disable. Requires AI cleanup to be enabled (Pro or BYOK).
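The ordering described above (transcription, then AI cleanup, then translation, then injection) can be sketched as a simple pipeline. The stage functions are passed in as plain callables because the real cleanup and translation calls are not documented here; every name in this example is hypothetical.

```python
from typing import Callable, Optional


def post_process(
    transcript: str,
    cleanup: Optional[Callable[[str], str]],
    translate: Callable[[str, str], str],
    output_language: str,
) -> str:
    """Return the text that would be injected, following the documented stage order."""
    if cleanup is None:
        # AI cleanup disabled: translation is unavailable, so inject the raw transcript.
        return transcript
    text = cleanup(transcript)
    target = output_language.strip()
    if target:
        # A blank output language disables translation.
        text = translate(text, target)
    return text


# Stand-in stages for demonstration; real stages would call an LLM provider.
def demo_cleanup(text: str) -> str:
    return text.replace("um ", "").capitalize()


def demo_translate(text: str, language: str) -> str:
    return "[" + language + "] " + text


result = post_process("um hello there", demo_cleanup, demo_translate, "French")
```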

Noise Profile

Go to Settings > Audio and click "Learn Background Noise". Record 3 seconds of ambient sound (fans, AC, office hum). Dikt saves the noise profile and applies spectral subtraction to future recordings. Re-learn any time your environment changes. The profile is serialized to your settings file and persists across restarts.
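For readers unfamiliar with spectral subtraction, the core idea is to estimate the average noise magnitude per frequency bin and subtract it from each later frame. The sketch below shows that idea on already-computed magnitude spectra using plain lists; the spectral floor value and function names are assumptions, and Dikt's actual parameters and framing are not documented here.

```python
def learn_noise_profile(noise_frames):
    """Average magnitude per frequency bin over frames that contain only background noise."""
    count = len(noise_frames)
    bins = len(noise_frames[0])
    return [sum(frame[k] for frame in noise_frames) / count for k in range(bins)]


def subtract_noise(frame, profile, floor=0.05):
    """Subtract the noise estimate from one frame of magnitudes.

    Each bin is clamped to a small fraction of its original magnitude so that
    over-subtraction never produces negative values.
    """
    return [max(mag - noise, floor * mag) for mag, noise in zip(frame, profile)]


# Two noise-only frames with two frequency bins each.
profile = learn_noise_profile([[1.0, 2.0], [3.0, 2.0]])
cleaned = subtract_noise([5.0, 1.0], profile, floor=0.5)
```

In a full implementation the magnitudes would come from a short-time Fourier transform of the recording, and the cleaned magnitudes would be recombined with the original phase before converting back to audio.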

TTS Preview

Enable in Settings > Speed & AI > TTS Preview. Choose a provider:

Windows: Built-in System.Speech voices (free, offline, configurable voice and speed)
ElevenLabs: High-quality neural voices (requires API key and voice ID in settings)

After transcription, Dikt reads the text aloud. Press your hotkey to confirm injection, or Escape to cancel.

Wake Word

Go to Settings > General > Wake Word, enable wake word detection, and configure the trigger phrase (default: "Hey Dikt"). Detection uses Windows built-in speech recognition, so no model download is required.

Settings File Location

Settings are stored in a JSON file at:

%APPDATA%\Dikt\settings.json

You can export and import settings from Settings > Advanced.
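If you want to inspect the settings file from a script, a minimal read-only sketch looks like the following. It relies only on the documented location and JSON format; the key names inside the file are not documented here, so the example lists whatever keys it finds rather than assuming any. The helper names are hypothetical.

```python
import json
import os
from pathlib import Path


def settings_path(env=os.environ) -> Path:
    """Build the documented location: %APPDATA%\\Dikt\\settings.json."""
    return Path(env["APPDATA"]) / "Dikt" / "settings.json"


def load_settings(path: Path) -> dict:
    """Parse the settings file; raises FileNotFoundError if Dikt has not written it yet."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__" and "APPDATA" in os.environ:
    path = settings_path()
    if path.exists():
        for key in sorted(load_settings(path)):
            print(key)
```

For changes, the export and import options in Settings > Advanced are the safer route; editing the file by hand while the app is running may be overwritten.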
