Frequently Asked Questions

Everything you need to know about Dikt.

General

What is Dikt?

Dikt is a Windows desktop speech-to-text application that combines local AI transcription with optional cloud services. It features a global hotkey for push-to-talk, automatic text injection at your cursor, and AI-powered cleanup of your transcriptions.

Is Dikt free?

Dikt offers a 14-day free trial with all features included — cloud transcription and AI cleanup work out of the box, no API keys needed. After the trial, you can choose a BYOK license ($29 one-time) where you bring your own API keys, or a Pro subscription ($10/month) with everything included.

Does Dikt work on Mac or Linux?

Not currently. Dikt is a Windows-only application that requires Windows 10 or later. Mac and Linux support may be considered in the future.

Does it work offline?

Yes. Local Whisper mode runs entirely on your machine with no internet connection required. Your audio never leaves your computer. Enable "Local Only" mode in Settings > Advanced to disable all cloud features and work fully offline.

What languages are supported?

Dikt supports 100+ languages through OpenAI's Whisper models. You can set a specific language in Settings for best accuracy, or use auto-detection to let the model identify the language automatically.

What is AI Command Mode?

AI Command Mode lets you transform selected text with voice instructions. Select text in any app, press Ctrl+Shift+Space, and speak an instruction like "make this more formal", "translate to Spanish", or "rewrite as bullet points". The AI replaces the selection with the result. It requires an OpenAI or Anthropic API key (BYOK) or a Pro subscription.

What is Cloud Sync?

Cloud Sync is a Pro feature that syncs your custom vocabulary, snippets, context profiles, and AI prompt between the Dikt desktop app and your web dashboard at dikt.app/dashboard. Enable it in Settings > Advanced > Cloud Sync. Changes sync automatically using version-based conflict resolution.

What is the overlay system?

Dikt has two overlay modes. The Notification Overlay shows transcription results as a brief semi-transparent popup that auto-dismisses. The Compact Overlay is a persistent always-on-top floating window showing recording status and audio levels. Both support configurable Type (Minimal/Line/Box), Style (Pill/Box/Text), Position (9 presets or custom drag), Opacity, Display (choose which monitor), and Size scaling. The compact overlay also features live waveform indicators (8 visual styles), Box-mode features (transcription preview, record/stop controls, mini history), click-to-open, lock position, auto-hide, and a letter-by-letter typing animation.

What are voice profiles?

Voice profiles let multiple users share the same machine with separate settings, vocabulary, history, and statistics. Each profile stores its data in a separate directory. Switch profiles from the system tray > Profiles menu. The app restarts to load the new profile's data.

What are dictation streaks?

Dikt tracks consecutive days where you make at least one dictation. Your current streak, longest streak, and milestone badges (7-day, 14-day, 30-day, 90-day, 365-day) are shown on the Dashboard. It's a fun way to build a daily dictation habit.
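
For readers curious how a streak like this can be computed, here is a minimal sketch. It is not Dikt's actual code: the function names are hypothetical, and it assumes a streak stays alive through a day on which you have not dictated yet, as long as you dictated the day before.

```python
from datetime import date, timedelta

MILESTONES = (7, 14, 30, 90, 365)  # badge thresholds listed in the FAQ


def current_streak(dictation_days: set[date], today: date) -> int:
    """Count consecutive days with at least one dictation.

    Assumption: if nothing has been dictated today yet, the streak is
    counted up to yesterday instead of being reset.
    """
    day = today if today in dictation_days else today - timedelta(days=1)
    streak = 0
    while day in dictation_days:
        streak += 1
        day -= timedelta(days=1)
    return streak


def earned_badges(longest_streak: int) -> list[int]:
    """Return every milestone reached by the longest streak."""
    return [m for m in MILESTONES if longest_streak >= m]
```

With eight dictation days in a row ending today, current_streak returns 8 and earned_badges(8) returns only the 7-day badge.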

Does Dikt support multiple users on the same computer?

Yes. Voice Profiles let each person have their own vocabulary list, history, and settings. Switch profiles from the tray icon or Settings > Voice Profiles. The app restarts to load the new profile's data.

Does Dikt track my dictation habits?

Dikt tracks a daily dictation streak (like Duolingo). You earn milestone badges at 7, 14, 30, 90, and 365 consecutive days. View your streak on the Dashboard tab in Settings.

Transcription

How accurate is the transcription?

Accuracy depends on your microphone quality, background noise, and the transcription method. Local Whisper models range from fast-but-basic (tiny) to slower-but-accurate (large). Cloud transcription via OpenAI typically provides the highest accuracy.

Is my audio sent to the cloud?

It depends on your chosen mode. In local Whisper mode, your audio is processed entirely on your machine and never leaves your computer. In cloud mode, audio is sent to OpenAI for transcription and is subject to their data handling policies. You have full control over which mode you use, and can switch at any time in Settings.

Can I use Dikt completely offline?

Yes. Local Whisper mode works without any internet connection. Your audio is processed entirely on your machine. Enable "Local Only" mode in Settings > Advanced to disable all network features.

What is the profanity filter?

Dikt can automatically remove profanity from your transcriptions. It uses dual coverage: the AI cleanup prompt is augmented to filter profanity, and a local word-list filter catches anything the AI misses. There's a built-in list of common curse words, and you can add your own custom words. The "Swear Jar" tracks how many words were removed with configurable cost-per-word tracking. Configure everything in Settings > Swear Jar.
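
The local word-list pass and the Swear Jar tally could look roughly like the sketch below. This illustrates the idea only and is not Dikt's implementation: the function names, the whole-word matching rule, and the rounding to two decimals are all assumptions.

```python
import re


def filter_profanity(text: str, blocked_words: set[str]) -> tuple[str, int]:
    """Remove blocked words (case-insensitive, whole words only).

    Returns the cleaned text and the number of words removed.
    """
    if not blocked_words:
        return text, 0
    alternatives = "|".join(re.escape(w) for w in sorted(blocked_words))
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    cleaned, removed = pattern.subn("", text)
    # Collapse the double spaces left behind by removed words.
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    return cleaned, removed


def swear_jar_total(removed_count: int, cost_per_word: float) -> float:
    """Hypothetical cost tracking: removed words times the configured cost."""
    return round(removed_count * cost_per_word, 2)
```

For example, filtering "that DARN printer is darn slow" against the list {"darn"} yields "that printer is slow" with two removals.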

What's the difference between local and cloud transcription?

Local transcription runs whisper.cpp on your computer. It's private and works offline, but uses your own hardware (the CPU, or the GPU if acceleration is enabled). Cloud transcription sends audio to OpenAI's servers for processing. It's typically more accurate and faster, but requires an internet connection.

What is audio dimming?

Audio dimming automatically lowers your system volume while you're dictating, so background music or other sounds don't interfere. Configure the dim percentage (0–100%) in Settings > Audio. If the app crashes during recording, volume is automatically restored on next launch.

What are auto-correction suggestions?

When enabled, Dikt uses word-level timestamps and confidence scores from Whisper to highlight uncertain words before injection. Low-confidence words are underlined in a preview overlay so you can review and correct them before the text is typed out. Enable in Settings > Speed & AI > Auto-Correction. This adds a brief review step to the pipeline.
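
As a rough illustration of the review step, the sketch below flags words whose confidence falls under a threshold and marks them in a plain-text preview. The Word type, the 0.6 threshold, and the underscore markers are assumptions made for this example; Dikt's actual threshold and rendering are not documented here.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    confidence: float  # 0.0 (no confidence) to 1.0 (certain)


def uncertain_words(words: list[Word], threshold: float = 0.6) -> list[str]:
    """Return the words whose confidence is below the (assumed) threshold."""
    return [w.text for w in words if w.confidence < threshold]


def render_preview(words: list[Word], threshold: float = 0.6) -> str:
    """Build a preview string, wrapping uncertain words in underscores."""
    return " ".join(
        f"_{w.text}_" if w.confidence < threshold else w.text for w in words
    )
```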

Does Dikt support GPU acceleration for local Whisper?

Yes. In Settings > Models, set Acceleration to Auto, CUDA, or DirectML. Auto detects your GPU and uses it if available. CUDA requires an NVIDIA GPU; DirectML works on most modern AMD and Intel GPUs. Dikt falls back to the CPU if no compatible GPU is found.

What is noise profile learning?

Noise profile learning lets Dikt filter out consistent background noise (fans, AC, office hum). Click "Learn Background Noise" in Settings > Audio to record 3 seconds of ambient sound. Dikt then uses spectral subtraction to remove that noise from future recordings, improving transcription accuracy in noisy environments.

Technical

What are BYOK API keys?

BYOK stands for 'Bring Your Own Keys.' You provide your own OpenAI and/or Anthropic API keys, which you get by creating accounts with those services. You pay those providers directly for API usage — Dikt doesn't add any markup.

Is my data secure?

Yes. API keys are encrypted using Windows DPAPI. Local transcription never sends audio anywhere. When using cloud mode, data is sent only to your chosen provider (OpenAI or Anthropic). Dikt does not collect, store, or share your transcription data.

What are the system requirements?

Dikt requires Windows 10 or later (64-bit), the .NET 8 Desktop Runtime (auto-installed if missing), approximately 200 MB of disk space, and a microphone. An internet connection is required only for cloud features.

What are update channels?

Dikt offers three update channels: Stable (default, recommended), Beta (early access to upcoming features), and Canary (bleeding-edge, may be unstable). Change your channel in Settings > Advanced > Update Channel. Dikt auto-checks for updates and shows a notification bar when a new version is available.

What is the push-to-talk grace period?

The grace period lets you use push-to-talk mode without holding the hotkey. If you press and release the hotkey within the grace period (configurable in seconds), Dikt switches to toggle-like behavior — it keeps recording until you press the hotkey again. This is useful for longer dictations. Configure it in Settings > General. An optional audio cue plays when the grace period activates.
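
The behavior described above can be modeled as a small state machine. The sketch below is only an illustration under stated assumptions: the class and field names are hypothetical, timestamps are passed in as seconds, and a release exactly at the grace limit still counts as a quick tap.

```python
from dataclasses import dataclass


@dataclass
class PushToTalk:
    grace_seconds: float
    recording: bool = False
    latched: bool = False  # True once a quick tap switched to toggle-like behavior
    pressed_at: float = 0.0

    def key_down(self, now: float) -> None:
        if self.latched:
            # A second press ends a latched recording.
            self.recording = False
            self.latched = False
            return
        self.recording = True
        self.pressed_at = now

    def key_up(self, now: float) -> None:
        if not self.recording:
            return  # release that follows the press which stopped recording
        if now - self.pressed_at <= self.grace_seconds:
            self.latched = True  # quick tap: keep recording until the next press
        else:
            self.recording = False  # normal push-to-talk: stop on release
```

With a 0.5 second grace period, a tap released after 0.2 seconds keeps recording until the next press, while a key held for 3 seconds stops recording on release.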

What's the difference between notification and compact overlay?

The notification overlay appears briefly after each transcription to show the result, then auto-dismisses. The compact overlay is a persistent floating window that stays on screen showing recording status and audio levels with live waveform visualization. Both overlays support configurable Type, Style, Position, Opacity, Display (multi-monitor), and Size. The compact overlay adds exclusive features: waveform indicators (8 styles including Mirror, Bars, Pulse, and more), Box-mode features (transcription preview, record/stop controls, mini history list), click-to-open, lock position, and auto-hide. Enable the compact overlay from the status bar toggle button or in Settings.

What is the clipboard re-inject queue?

If Dikt can't inject text because no window is focused or the target app blocks it, the transcription is queued instead of lost. The compact overlay shows a 'transcription queued' indicator — click it to inject into whatever window you focus next. The queue is cleared on app restart.

What is wake word listening?

Wake word listening lets you say "Hey Dikt" to start recording hands-free — no hotkey needed. A lightweight background detector uses Windows built-in speech recognition to monitor your microphone and triggers recording when it hears the wake phrase. No model download required. Enable it in Settings > General > Wake Word.

What are waveform styles?

The compact overlay can display a live audio waveform that responds to your voice during recording. There are 8 styles to choose from: Mirror (mirrored waveform above/below center), Bars (vertical bars), Pulse (smooth sine wave), Steps (staircase pattern), Dots (points at varying heights), Ribbon (filled envelope), Peaks (sharp zigzag spikes), or None (simple dot indicator). The waveform color changes by state: green when ready, red when recording, and blue when transcribing. Configure in Settings > General > Compact Overlay > Waveform Style.

Integrations

How do I set up the Obsidian integration?

In Settings > Text Processing > Output Target, select "Obsidian vault". Enter the absolute path to your vault root folder and the relative path to your note file (e.g., "Dikt/dictations.md"). Transcriptions will be appended to that file. You can use output templates with {text}, {date}, and {time} placeholders.
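
Conceptually, this output target appends text to a file inside your vault. The sketch below shows one way that could work; it is not Dikt's code. The function name is hypothetical, and creating missing folders and ending each entry with a newline are assumptions.

```python
from pathlib import Path


def append_to_vault(vault_root: str, note_path: str, entry: str) -> Path:
    """Append one entry to a note inside an Obsidian vault.

    vault_root must be absolute; note_path is relative to it,
    for example "Dikt/dictations.md".
    """
    root = Path(vault_root)
    if not root.is_absolute():
        raise ValueError("vault_root must be an absolute path")
    note = root / note_path
    # Assumption: missing folders are created rather than treated as errors.
    note.parent.mkdir(parents=True, exist_ok=True)
    with note.open("a", encoding="utf-8") as f:
        f.write(entry.rstrip("\n") + "\n")
    return note
```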

How do I set up the Notion integration?

First, create an integration at notion.so/my-integrations and copy the API key. Then share your target Notion page with the integration. In Dikt Settings > Text Processing > Output Target, select "Notion page" and enter your API key and page ID. Transcriptions are appended as paragraph blocks.
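
For context, appending a paragraph through Notion's public API means sending a PATCH request to the "append block children" endpoint. The sketch below only builds such a request without sending it. Whether Dikt builds its requests exactly this way is an assumption, and the function name and the pinned API version are chosen for illustration.

```python
import json

NOTION_VERSION = "2022-06-28"  # a published Notion API version, assumed here


def build_append_request(api_key: str, page_id: str, text: str) -> dict:
    """Describe the HTTP request that appends one paragraph block to a page."""
    body = {
        "children": [
            {
                "object": "block",
                "type": "paragraph",
                "paragraph": {
                    "rich_text": [{"type": "text", "text": {"content": text}}]
                },
            }
        ]
    }
    return {
        "method": "PATCH",
        "url": f"https://api.notion.com/v1/blocks/{page_id}/children",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Notion-Version": NOTION_VERSION,
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }
```

The request fails with a permissions error unless the page has been shared with the integration, which is why the sharing step above matters.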

How does the VS Code extension work?

Enable the Local API in Dikt Settings > Advanced (port 9847 by default). Install the Dikt extension from the VS Code marketplace. The extension connects to the running Dikt desktop app via localhost HTTP. Use Ctrl+Alt+Space to toggle dictation — transcribed text is inserted at your cursor. The status bar shows connection state.

What are output templates?

Output templates let you format transcriptions before they're sent to an output target. Use placeholders: {text} for the transcription, {date} for the current date (YYYY-MM-DD), and {time} for the current time (HH:mm). Example: "## {date}\n{text}\n---" creates a Markdown entry. Configure in Settings > Text Processing > Output Target.
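
To make the placeholder behavior concrete, here is a minimal sketch of template rendering. The function name is hypothetical and this is not Dikt's implementation. It substitutes all placeholders in a single pass, so a transcription that happens to contain the text "{date}" is left untouched.

```python
import re
from datetime import datetime

PLACEHOLDER = re.compile(r"\{(text|date|time)\}")


def render_template(template: str, text: str, now: datetime) -> str:
    """Fill {text}, {date} (YYYY-MM-DD), and {time} (HH:mm) in one pass."""
    values = {
        "text": text,
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M"),
    }
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)
```

Rendering the example template with the text "Buy milk" on 5 March 2024 produces a "## 2024-03-05" heading, the text on the next line, and a closing "---" line.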

Can I send transcriptions to multiple targets?

Yes. When using Obsidian or Notion as your output target, enable "Also inject at cursor" to simultaneously paste the text at your cursor position and send it to the configured target.

Can Dikt translate my dictation into another language?

Yes. In Settings > Speed & AI, set an Output Language (e.g. "French", "Spanish", "Japanese"). After transcription, the AI cleanup step translates the cleaned text into that language before injecting it. You can dictate in any supported language and receive output in any language your AI provider supports.

What is Markdown voice mode?

Markdown voice mode lets you dictate document structure with voice commands: say "heading one" for # headings, "bullet point" for list items, "open code block" for fenced code blocks, "bold" / "italic" for inline formatting, and more. It's automatically enabled when Obsidian, Typora, or VS Code is the active window, or you can enable it manually in Settings > Speed & AI.

What is code dictation mode?

Code dictation mode formats your dictation as code instead of prose. It activates automatically when you're dictating into an IDE — Rider, Visual Studio, Cursor, IntelliJ, or VS Code. For example, saying "create a function called get user that takes id and returns a user object" produces formatted code. It uses a specialized AI context profile tuned for code generation.

Teams

How do team workspaces work?

A team workspace lets you share vocabulary, text snippets, and context profiles with your team members. When a team admin adds terms to the team vocabulary, every member gets them automatically via cloud sync. Personal settings always override team defaults when there's a conflict.

How is team billing handled?

Team plans are billed per seat at $25/seat/month. When you create a team, you choose how many seats you need. You can add or remove seats anytime — billing adjusts automatically via Stripe.

How do I invite team members?

Team owners and admins can invite members by email through the team management API or web dashboard. Invited members receive an email and can join the team once they have a Dikt account.

How do team and personal settings merge?

Team vocabulary is additive — your personal vocabulary is combined with team vocabulary, and both are used during transcription. For snippets and context profiles, team settings provide defaults, but your personal settings override on any key conflict. This means you always keep your personal customizations.
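
The two merge rules can be sketched in a few lines. The function names and data shapes below are assumptions for illustration: vocabulary is modeled as a list of terms, and snippets or context profiles as key-to-value mappings.

```python
def merge_vocabulary(team: list[str], personal: list[str]) -> list[str]:
    """Additive merge: team terms plus personal terms, duplicates dropped."""
    return list(dict.fromkeys(team + personal))


def merge_keyed(team: dict[str, str], personal: dict[str, str]) -> dict[str, str]:
    """Team values are defaults; a personal value wins on any key conflict."""
    return {**team, **personal}
```

For example, a team snippet "sig" is replaced by your personal "sig", while a team snippet you have not defined yourself is still available.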

What is a Team Workspace?

A Team Workspace lets your organization share vocabulary lists, text snippets, and context profiles across all members. Individual users can still override any shared setting with their own preference.

How do I create a team?

Purchase a Team plan on the pricing page, then visit your account dashboard and click 'Create Team'. Invite members by email — each person uses their own Dikt license seat.

How are team and personal settings merged?

Team settings are the baseline. Any personal setting you configure in the app overrides the team default for you only. Shared vocabulary words are merged (team words + your personal words).

Billing

How does the free trial work?

Download and install Dikt — the trial starts automatically. You get 14 days with all features enabled, including cloud transcription and AI cleanup — no API keys or credit card required. After the trial, local Whisper continues to work for free; cloud features require a license.

What happens when my trial expires?

After your 14-day trial ends, local Whisper transcription continues to work indefinitely for free. Cloud transcription, AI cleanup, and other premium features require activating a BYOK license ($29 one-time, bring your own API keys) or Pro subscription ($10/month, everything included). Pro also unlocks exclusive features like the AI Chat Assistant and Cloud Sync.

Can I cancel my subscription anytime?

Yes. You can cancel your Pro subscription at any time through your Stripe customer portal or by contacting support. There are no cancellation fees, and your Pro features remain active until the end of your current billing period.

Is there a refund policy?

Yes. We offer a 30-day money-back guarantee on all purchases, no questions asked. Simply contact support@dikt.app to request a refund and we will process it promptly.

How does Team pricing work?

Team plans are $25 per seat per month. Choose your seat count when you subscribe — the price scales linearly. You can adjust seats anytime through your Stripe customer portal. All team members get full Pro-level features plus shared workspace settings.

Is there a team or business plan?

Yes. The Team plan is $25/seat/month. It includes all Pro features plus shared vocabulary, snippets, context profiles, and team management. There's no minimum seat count.

AI Features

What are AI personas?

AI personas change the style and tone of your AI cleanup. Choose from 6 built-in personas: Technical Writer (precise, clear), Doctor (clinical notes), Journalist (engaging prose), Lawyer (formal legal language), Code Reviewer (preserves identifiers exactly), or Casual (light, conversational). You can also define your own persona with a custom system prompt. Switch personas in Settings > Speed & AI > Cleanup Persona.

Does AI Command Mode remember context between commands?

Yes. AI Command Mode now supports multi-turn context — it remembers your last 3 instructions so you can refine your text iteratively. For example: 'make this more formal', then 'shorten it', then 'add bullet points'. Say 'start over' or 'clear history' to reset the context.
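
A bounded history like this is often kept in a fixed-size queue. The sketch below assumes that approach; the class name and the exact matching of the reset phrases are hypothetical, not taken from Dikt.

```python
from collections import deque

RESET_PHRASES = {"start over", "clear history"}


class CommandContext:
    """Keeps the most recent instructions, dropping the oldest beyond max_turns."""

    def __init__(self, max_turns: int = 3) -> None:
        self.history: deque[str] = deque(maxlen=max_turns)

    def add(self, instruction: str) -> list[str]:
        """Record an instruction and return the context to send with it."""
        if instruction.strip().lower() in RESET_PHRASES:
            self.history.clear()
            return []
        self.history.append(instruction)
        return list(self.history)
```

After a fourth instruction, the first one is no longer part of the context.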

What is word correction training?

Word correction training lets you define wrong→correct pairs that are applied automatically on every future transcription. For example, if Whisper consistently transcribes a name as "Jon" when you mean "John", add a correction rule. Rules use case-insensitive whole-word matching and are applied as a post-processing pass before AI cleanup. Manage your corrections in Settings > Vocabulary > Word Corrections.
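
Case-insensitive whole-word matching can be sketched with regular expressions, as below. The function name and the rule format (a mapping from the wrong word to the correct one) are assumptions for this example.

```python
import re


def apply_corrections(text: str, rules: dict[str, str]) -> str:
    """Replace each wrong word with its correction, whole words only."""
    for wrong, correct in rules.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # A function replacement avoids backslash handling in the correction.
        text = pattern.sub(lambda m, c=correct: c, text)
    return text
```

With the rule "Jon" to "John", the text "jon called Jonathan and JON" becomes "John called Jonathan and John"; "Jonathan" is untouched because only whole words match. In this sketch rules run one after another, so the output of one rule can be matched by a later rule.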

What is TTS preview?

TTS (text-to-speech) preview reads your transcription aloud before injecting it — useful for proofreading long dictations. After transcription completes, Dikt reads the text back to you. Press your hotkey to confirm and inject, or press Escape to cancel. Choose between Windows built-in voices (free, offline) or ElevenLabs neural voices (high quality, requires API key). Configure in Settings > Speed & AI > TTS Preview.

Privacy

How is my data protected?

Dikt uses multiple layers of protection. All sensitive credentials (API keys, license key) are encrypted at rest using Windows DPAPI (Data Protection API), tied to your Windows user account. Audio is never stored on our servers — local mode processes everything on your machine, and cloud mode streams audio directly to OpenAI without intermediary storage. Settings and transcription history stay entirely on your computer. For full details, see our privacy policy at dikt.app/privacy.

Does Dikt collect telemetry or analytics?

Dikt sends anonymous usage heartbeats to help improve the product. No personal data, audio, or transcriptions are collected. The machine identifier is a one-way hash and cannot be linked back to you. You can disable telemetry at any time in Settings > Privacy. Your transcriptions, audio, and settings always stay on your machine.

How do I opt out of all network features?

Enable "Local Only" mode in Settings > Advanced. This disables cloud transcription, AI cleanup, license validation, and update checks. Only local Whisper transcription will work.

What data is encrypted?

All API keys (OpenAI, Anthropic) and your license key are encrypted at rest using Windows DPAPI (Data Protection API). This means they are encrypted with your Windows user credentials and cannot be read by other users on the same machine.

Still have a question?

We're here to help. Reach out and we'll get back to you as soon as we can.

Contact Support