Model Comparison

Compare Whisper models from tiny to large-v3: download sizes, accuracy, and speed.

Dikt uses whisper.cpp for local transcription. Models range from tiny (fastest, basic accuracy) to large-v3 (slowest, most accurate). The speed figures below are approximate transcription times for a 30-second audio clip on a modern CPU.

Model	Parameters	Download Size	Time (30 s clip)	Accuracy	Multi-Language	Notes
tiny	39M	~75 MB	~1s	Basic	Limited	Fastest, lowest resource usage
base	74M	~142 MB	~2s	Good	Good	Good balance for quick tasks
small	244M	~466 MB	~5s	Very Good	Very Good	Recommended for most users
medium	769M	~1.5 GB	~12s	Excellent	Excellent	High accuracy, needs more RAM
large-v3-turbo	809M	~1.5 GB	~8s	Best	Best	Best accuracy-to-speed ratio, multi-language
large-v3	1550M	~2.9 GB	~25s	Best	Best	Maximum accuracy, all languages
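
The trade-offs in the table can be turned into a simple selection rule. The following Python sketch is illustrative only and is not part of Dikt. It copies the table values into a list and returns the most accurate model that fits a download budget and a waiting-time budget. The names `Model`, `estimate_seconds`, and `pick_model` are hypothetical, and the sketch makes two assumptions: that transcription time scales linearly with audio length, and that the table rows are ordered from least to most accurate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    params_m: int         # parameters, in millions
    download_mb: int      # approximate download size, in MB
    secs_per_30s: float   # approximate seconds to transcribe a 30-second clip


# Values copied from the table above.
# Assumption: rows are ordered from least to most accurate.
MODELS = [
    Model("tiny", 39, 75, 1),
    Model("base", 74, 142, 2),
    Model("small", 244, 466, 5),
    Model("medium", 769, 1500, 12),
    Model("large-v3-turbo", 809, 1500, 8),
    Model("large-v3", 1550, 2900, 25),
]


def estimate_seconds(model: Model, audio_seconds: float) -> float:
    """Rough transcription time, assuming time scales linearly with audio length."""
    return model.secs_per_30s * audio_seconds / 30.0


def pick_model(audio_seconds: float,
               max_wait_seconds: float,
               max_download_mb: float) -> Optional[Model]:
    """Return the most accurate model that fits both budgets, or None if none fits."""
    for model in reversed(MODELS):  # try the most accurate model first
        fits_disk = model.download_mb <= max_download_mb
        fits_time = estimate_seconds(model, audio_seconds) <= max_wait_seconds
        if fits_disk and fits_time:
            return model
    return None
```

For example, for a 10-minute recording with a 3-minute wait limit and a 2 GB download limit, `pick_model(600, 180, 2000).name` gives `"large-v3-turbo"`: large-v3 exceeds the download limit, while large-v3-turbo is estimated at about 160 seconds. Real timings depend on hardware, so treat the result as a starting point.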

How to Download Models

Open Dikt and go to Settings > Model Manager. Select the model you want and click Download. Models are cached locally, so each one only needs to be downloaded once.
