WhisperTux

Simple voice dictation application for Linux. Press the shortcut key, speak, press the shortcut key again, and text will appear in whatever app owns the cursor at the time.

Uses whisper.cpp for offline speech-to-text transcription. No fancy GPUs are required although whisper.cpp is capable of using them if you have one available. Once your speech is transcribed, it is sent to a ydotool daemon that will write the text into the focused application.

Super useful for voice prompting AI models and speaking terminal commands.

Here's a quick demo

Screenshots

Features

Local speech-to-text processing via whisper.cpp (no cloud dependencies)
No expensive hardware required (works well on a plain x86 laptop with AVX instructions)
Global keyboard shortcuts for system-wide operation
Automatic text injection into focused applications
Configurable whisper models and shortcuts

Installation

Run the setup script:

git clone https://github.com/cjams/whispertux
cd whispertux
python3 setup.py

The setup script handles everything: system dependencies, creating Python virtual environment, building whisper.cpp, downloading models, configuring services, and testing the installation. See setup.md for details.

Usage

Start the application:

./whispertux
# or
python3 main.py

Desktop Integration (Optional)

After building the project, you can add WhisperTux to your desktop environment's applications menu:

# Create desktop entry for GNOME/KDE/other desktop environments
bash scripts/create-desktop-entry.sh

This will:

Add WhisperTux to your applications menu
Optionally configure it to start automatically on login
Create proper desktop integration for launching from GUI

Basic Operation

Press $GLOBAL_SHORTCUT (configurable within the app) to start recording
Speak clearly into your microphone
Press $GLOBAL_SHORTCUT again to stop recording
Transcribed text appears in the currently focused application

You can say 'tux enter' to simulate Enter keypress after you're done speaking for automated carriage return.

You can also add overrides that will replace words before writing the final output text. For example, if you want every instance of 'duck' to be replaced by 'squirrel', you would add an override in the Word Overrides section with Original being 'duck'.

Configuration

Settings are stored in ~/.config/whispertux/config.json:

{
  "primary_shortcut": "F12",
  "model": "base",
  "key_delay": 15,
  "use_clipboard": false,
  "always_on_top": true,
  "theme": "darkly",
  "audio_device": null
}

Available Models

Any whisper model is usable. By default the base model is downloaded and used. You can download additional models from within the app.

System Requirements

Linux with a GUI. Has only been tested on GNOME/Ubuntu but should work on others. Depends on evdev for handling low-level input events
Python 3
Microphone access

Troubleshooting

Global Shortcuts Not Working

Test shortcut functionality:

python3 -c "from src.global_shortcuts import test_key_accessibility; test_key_accessibility()"

Audio Issues

Check microphone access:

python3 -c "from src.audio_capture import AudioCapture; print(AudioCapture().is_available())"

List available audio devices:

python3 -c "from src.audio_capture import AudioCapture; AudioCapture().list_devices()"

Text Injection Problems

If you see failed to open uinput device errors, run the fix script:

./scripts/fix-uinput-permissions.sh

This script will:

Add your user to the input and tty groups
Create the necessary udev rule for /dev/uinput access
Reload udev rules

You may need to log out and back in or reboot for group changes to take effect.

Verify ydotoold service status:

systemctl status ydotoold
sudo systemctl restart ydotoold  # if needed

Test text injection directly:

ydotool type "test message"

Whisper Model Issues

Check available models:

python3 -c "from src.whisper_manager import WhisperManager; print(WhisperManager().get_available_models())"

Download models manually:

cd whisper.cpp/models
bash download-ggml-model.sh base.en

Documentation

Architecture - Technical architecture and component design
Setup Details - Manual installation and system configuration

License

MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
assets		assets
docs		docs
scripts		scripts
src		src
systemd		systemd
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
setup.py		setup.py
whispertux		whispertux
whispertux-main.png		whispertux-main.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

WhisperTux

Screenshots

Features

Installation

Usage

Desktop Integration (Optional)

Basic Operation

Configuration

Available Models

System Requirements

Troubleshooting

Global Shortcuts Not Working

Audio Issues

Text Injection Problems

Whisper Model Issues

Documentation

License

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

cjams/whispertux

Folders and files

Latest commit

History

Repository files navigation

WhisperTux

Screenshots

Features

Installation

Usage

Desktop Integration (Optional)

Basic Operation

Configuration

Available Models

System Requirements

Troubleshooting

Global Shortcuts Not Working

Audio Issues

Text Injection Problems

Whisper Model Issues

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages