Model Comparison

Compare Whisper models from tiny to large-v3: download sizes, accuracy, and speed.

Dikt uses whisper.cpp for local transcription. Models range from tiny (fast, basic) to large-v3 (slow, most accurate). The speed figures in the table are approximate processing times for a 30-second audio clip on a modern CPU.

Model           Parameters  Download  Speed  Accuracy   Multi-Language  Notes
tiny            39M         ~75 MB    ~1s    Basic      Limited         Fastest, lowest resource usage
base            74M         ~142 MB   ~2s    Good       Good            Good balance for quick tasks
small           244M        ~466 MB   ~5s    Very Good  Very Good       Recommended for most users
medium          769M        ~1.5 GB   ~12s   Excellent  Excellent       High accuracy, needs more RAM
large-v3-turbo  809M        ~1.5 GB   ~8s    Best       Best            Best accuracy-to-speed ratio, multi-language
large-v3        1550M       ~2.9 GB   ~25s   Best       Best            Maximum accuracy, all languages
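
To make the trade-off in the table concrete, here is a minimal Python sketch that picks a model for a given download and speed budget. It is not part of Dikt. The `Model` class and `pick_model` function are hypothetical names, the figures are the approximate values from the table (with 1 GB taken as 1000 MB), and treating a later table row as the preferred choice is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    download_mb: int       # approximate download size from the table
    seconds_per_clip: int  # approximate time for a 30-second clip


# Values copied from the table, in table order (assumed preference order).
MODELS = [
    Model("tiny", 75, 1),
    Model("base", 142, 2),
    Model("small", 466, 5),
    Model("medium", 1500, 12),
    Model("large-v3-turbo", 1500, 8),
    Model("large-v3", 2900, 25),
]


def pick_model(max_download_mb: int, max_seconds_per_clip: int) -> Optional[Model]:
    """Return the last table entry that fits both budgets, or None if none fits."""
    fitting = [
        m for m in MODELS
        if m.download_mb <= max_download_mb
        and m.seconds_per_clip <= max_seconds_per_clip
    ]
    return fitting[-1] if fitting else None
```

For example, `pick_model(500, 10)` selects small, while raising the download budget to 2000 MB with the same speed limit selects large-v3-turbo, since medium exceeds the 10-second limit.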

How to Download Models

Open Dikt, go to Settings > Model Manager. Select the model you want and click Download. Models are cached locally and only need to be downloaded once.
