Simple voice dictation application for Linux. Press the shortcut key, speak, press the shortcut key again, and text will appear in whatever app owns the cursor at the time.
Uses whisper.cpp for offline speech-to-text transcription. No fancy GPUs are required although whisper.cpp is capable of using them if you have one available. Once your speech is transcribed, it is sent to a ydotool daemon that will write the text into the focused application.
Super useful for voice prompting AI models and speaking terminal commands.
Here's a quick demo
![]() |
![]() |
- Local speech-to-text processing via whisper.cpp (no cloud dependencies)
- No expensive hardware required (works well on a plain x86 laptop with AVX instructions)
- Global keyboard shortcuts for system-wide operation
- Automatic text injection into focused applications
- Configurable whisper models and shortcuts
Run the setup script:
git clone https://github.com/cjams/whispertux
cd whispertux
python3 setup.pyThe setup script handles everything: system dependencies, creating Python virtual environment, building whisper.cpp, downloading models, configuring services, and testing the installation. See setup.md for details.
Start the application:
./whispertux
# or
python3 main.pyAfter building the project, you can add WhisperTux to your desktop environment's applications menu:
# Create desktop entry for GNOME/KDE/other desktop environments
bash scripts/create-desktop-entry.shThis will:
- Add WhisperTux to your applications menu
- Optionally configure it to start automatically on login
- Create proper desktop integration for launching from GUI
- Press $GLOBAL_SHORTCUT (configurable within the app) to start recording
- Speak clearly into your microphone
- Press $GLOBAL_SHORTCUT again to stop recording
- Transcribed text appears in the currently focused application
You can say 'tux enter' to simulate Enter keypress after you're done speaking for automated carriage return.
You can also add overrides that will replace words before writing the final output text. For example, if you want every instance of 'duck' to be replaced by 'squirrel', you would add an override in the Word Overrides section with Original being 'duck'.
Settings are stored in ~/.config/whispertux/config.json:
{
"primary_shortcut": "F12",
"model": "base",
"key_delay": 15,
"use_clipboard": false,
"always_on_top": true,
"theme": "darkly",
"audio_device": null
}Any whisper model is usable. By default the base model is downloaded and used. You can download additional models from within the app.
- Linux with a GUI. Has only been tested on GNOME/Ubuntu but should work on others. Depends on evdev for handling low-level input events
- Python 3
- Microphone access
Test shortcut functionality:
python3 -c "from src.global_shortcuts import test_key_accessibility; test_key_accessibility()"Check microphone access:
python3 -c "from src.audio_capture import AudioCapture; print(AudioCapture().is_available())"List available audio devices:
python3 -c "from src.audio_capture import AudioCapture; AudioCapture().list_devices()"If you see failed to open uinput device errors, run the fix script:
./scripts/fix-uinput-permissions.shThis script will:
- Add your user to the
inputandttygroups - Create the necessary udev rule for
/dev/uinputaccess - Reload udev rules
You may need to log out and back in or reboot for group changes to take effect.
Verify ydotoold service status:
systemctl status ydotoold
sudo systemctl restart ydotoold # if neededTest text injection directly:
ydotool type "test message"Check available models:
python3 -c "from src.whisper_manager import WhisperManager; print(WhisperManager().get_available_models())"Download models manually:
cd whisper.cpp/models
bash download-ggml-model.sh base.en- Architecture - Technical architecture and component design
- Setup Details - Manual installation and system configuration
MIT License

