
How Dikt Works
From voice to polished text in seconds. A simple four-step pipeline that runs locally or in the cloud.
The Pipeline
Record
Press your global hotkey, or say "Hey Dikt" with wake word listening enabled, and start talking. NAudio captures high-quality audio from your microphone in the background, whichever application has focus. Noise profile learning filters out ambient sound, and system audio can optionally be dimmed during recording.
Transcribe
Your audio is converted to text in one of three ways: locally on your machine with Whisper.cpp (with optional GPU acceleration via CUDA or DirectML), through OpenAI's Whisper API, or through GPT-4o Audio for maximum accuracy.
AI Cleanup
Optional AI post-processing with Claude or GPT fixes grammar and punctuation and removes filler words. Choose an AI persona for your writing style, apply word correction rules, translate to another language, or enable Markdown voice mode for structured documents. A profanity filter removes curse words automatically.
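As a rough illustration of what filler-word removal and word correction rules do to a transcript, here is a minimal rule-based sketch. It is not Dikt's implementation (the cleanup step described above uses Claude or GPT); the `FILLER_WORDS` list, the `WORD_CORRECTIONS` mapping, and the `clean_transcript` function are hypothetical names chosen for this example.

```python
import re

# Hypothetical rule set: names and patterns are illustrative, not Dikt's configuration.
FILLER_WORDS = ("um", "uh", "erm")
WORD_CORRECTIONS = {"dict": "Dikt", "whisper cpp": "Whisper.cpp"}


def clean_transcript(text: str) -> str:
    """Remove filler words, apply word corrections, and tidy spacing."""
    # Drop standalone filler words along with a trailing comma, if any.
    filler = r"\b(?:" + "|".join(map(re.escape, FILLER_WORDS)) + r")\b,?\s*"
    text = re.sub(filler, "", text, flags=re.IGNORECASE)

    # Replace each misheard phrase with its preferred spelling (whole words only).
    for wrong, right in WORD_CORRECTIONS.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, right, text, flags=re.IGNORECASE)

    # Collapse repeated whitespace and restore the leading capital.
    text = re.sub(r"\s+", " ", text).strip()
    return text[:1].upper() + text[1:]
```

Calling `clean_transcript("um, so dict uses whisper cpp, uh, locally")` yields `"So Dikt uses Whisper.cpp, locally"`. Fixed rules like these are predictable but cannot judge context, which is the gap a language model fills.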
Inject
The polished text is automatically injected at your cursor position in whatever application you're using. Optionally hear it read back with TTS preview before injection. If injection fails, the clipboard re-inject queue saves it for one-click retry.
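The flow from transcription through injection, including the retry queue for failed injections, can be sketched as follows. This is an assumption-laden outline rather than Dikt's actual code: `run_pipeline`, `retry_injection`, and `reinject_queue` are hypothetical names, and each stage is passed in as a plain callable so the example stays self-contained.

```python
from collections import deque
from typing import Callable, Optional

# Hypothetical queue of texts whose injection failed, kept for a later retry.
reinject_queue: deque = deque()


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    inject: Callable[[str], bool],
    cleanup: Optional[Callable[[str], str]] = None,
) -> str:
    """Run transcribe -> optional cleanup -> inject on already-recorded audio."""
    text = transcribe(audio)
    if cleanup is not None:  # AI cleanup is optional
        text = cleanup(text)
    if not inject(text):  # injection failed: keep the text for retry
        reinject_queue.append(text)
    return text


def retry_injection(inject: Callable[[str], bool]) -> int:
    """Re-attempt queued texts in order; return how many were injected."""
    injected = 0
    for _ in range(len(reinject_queue)):
        text = reinject_queue.popleft()
        if inject(text):
            injected += 1
        else:
            reinject_queue.append(text)  # still failing: keep it queued
    return injected
```

Keeping the failed text in a queue instead of discarding it means a focus change or a locked input field costs one extra click, not a re-recording.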
Local vs Cloud Transcription
Choose the approach that fits your needs, or combine local and cloud engines with automatic failover.
| | Local (Whisper.cpp) | Cloud (OpenAI API) | GPT-4o Audio |
|---|---|---|---|
| Cost | Free | Pay-per-use | Pay-per-use |
| Internet Required | No | Yes | Yes |
| Privacy | Audio stays on device | Audio sent to OpenAI | Audio sent to OpenAI |
| Accuracy | Good (varies by model) | Excellent | Best |
| Speed | Depends on hardware | Fast (server-side) | Fast (server-side) |
| Engine | Whisper.cpp models | OpenAI Whisper API | GPT-4o multimodal |
| GPU Acceleration | CUDA / DirectML | N/A (server-side) | N/A (server-side) |
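Automatic failover between engines could look roughly like the sketch below: try each engine in a preferred order and fall back to the next when one raises an error. The function name `transcribe_with_failover` and the `(name, callable)` engine pairs are assumptions made for this example; how Dikt orders its engines and detects failure is not specified here.

```python
from typing import Callable, Sequence, Tuple

# An engine is a (name, function) pair; the function turns audio bytes into text.
Engine = Tuple[str, Callable[[bytes], str]]


def transcribe_with_failover(audio: bytes, engines: Sequence[Engine]) -> Tuple[str, str]:
    """Try each engine in order; return (engine_name, text) from the first success."""
    errors = []
    for name, engine in engines:
        try:
            return name, engine(audio)
        except Exception as exc:  # e.g. missing model file, no network, API error
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all transcription engines failed: " + "; ".join(errors))
```

Listing the local engine first keeps audio on the device whenever possible, and listing a cloud engine first favors accuracy and uses local transcription only when the network is unavailable.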
Privacy & Security
Your voice data is yours. Dikt is designed from the ground up to keep your information private and secure.
- DPAPI encryption for all API keys stored on disk
- No server-side storage of your transcriptions or audio
- Local-only mode disables all network features entirely
- Anonymous telemetry is opt-out; disable it anytime in Settings
- Atomic file writes prevent settings corruption
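The atomic-write pattern in the last bullet is commonly implemented by writing to a temporary file and then renaming it over the original, so a crash mid-write leaves the old settings intact. The sketch below shows that general technique in Python; the function name `save_settings_atomically` and the JSON format are assumptions for illustration, not a description of Dikt's settings code.

```python
import json
import os
import tempfile


def save_settings_atomically(path: str, settings: dict) -> None:
    """Write settings to a temp file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(settings, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_path, path)  # replaces the old file in a single rename
    except BaseException:
        # Clean up the partial temp file; the original settings file is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A reader of the settings file therefore sees either the complete old version or the complete new one, never a half-written file.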
Whisper Model Comparison
Choose the model that balances speed, accuracy, and disk space for your needs.
| Model | Size | Speed | Accuracy | Best For |
|---|---|---|---|---|
| tiny | ~75 MB | Fastest | Basic | Quick drafts, low-resource machines |
| base | ~142 MB | Fast | Good | General use with decent hardware |
| small | ~466 MB | Moderate | Very Good | Balanced speed and accuracy |
| medium | ~1.5 GB | Slower | Excellent | High-accuracy offline transcription |
| large-v3-turbo | ~1.5 GB | Moderate | Best | Best accuracy-to-speed ratio, multi-language |
| large-v3 | ~2.9 GB | Slowest | Best | Maximum accuracy, multi-language |
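One simple way to act on the table above is to pick the most capable model that fits a disk budget. The helper below is a hypothetical sketch: the sizes are the approximate figures from the table converted to MB, and `largest_model_within` is a name invented for this example.

```python
from typing import Optional

# Approximate sizes in MB from the comparison table, listed smallest to largest.
MODEL_SIZES_MB = {
    "tiny": 75,
    "base": 142,
    "small": 466,
    "medium": 1500,
    "large-v3-turbo": 1500,
    "large-v3": 2900,
}


def largest_model_within(budget_mb: float) -> Optional[str]:
    """Return the last-listed model whose size fits the budget, or None."""
    choice = None
    for name, size_mb in MODEL_SIZES_MB.items():
        if size_mb <= budget_mb:
            choice = name  # later entries are larger or rated more accurate
    return choice
```

With a 500 MB budget this returns `"small"`, and with 2000 MB it returns `"large-v3-turbo"`. Disk space is only one constraint, so speed on your hardware may still favor a smaller model.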
Ready to Try Dikt?
14-day free trial. No credit card required. All features included.