Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification. For more details: github.com/openai/whisper
faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. For more details: github.com/guillaumekln/faster-whisper
Tgisper is a Telegram bot that uses OpenAI's Whisper model to convert voice and audio messages to text. Just record a voice message for the bot, or forward one to it from another chat, and you're done!
docker run -d \
-e ASR_MODEL=small \
-e BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm \
-e OMP_NUM_THREADS=2 \
ghcr.io/ckaytev/tgisper:main

Install the command-line tool ffmpeg:
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

Install Poetry with the following command:
pip3 install poetry

Install packages:
poetry install

Set the environment variables:
export BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm
# The list of available models (https://github.com/openai/whisper/#available-models-and-languages)
export ASR_MODEL=base
# When running on CPU, make sure to set the same number of threads
export OMP_NUM_THREADS=2

Start the bot polling:
poetry run tgisper

With docker compose:
docker compose run -d -P -e BOT_TOKEN=3916463517:ABC2tkTGkD9FHl4Ra-jv2Vv6DVECTyeV3Mm tgisper
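
Whichever way the bot is started, a missing or malformed variable usually only shows up after launch. The sketch below is one possible pre-flight check for the variables used above. It is not part of Tgisper: the function names `check_env` and `check_ffmpeg` are hypothetical, and the token test is only a rough shape check (`<id>:<secret>`) that cannot tell whether Telegram will accept the token.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; not shipped with Tgisper.

# check_env: print one line per problem found in the environment
# variables used above; return non-zero if anything looks wrong.
check_env() {
    status=0

    # BOT_TOKEN must be non-empty and contain a colon with text on both
    # sides. This is only a shape check, not a validity check.
    case "${BOT_TOKEN:-}" in
        ?*:?*) ;;
        *)
            echo "BOT_TOKEN is not set or does not look like <id>:<secret>"
            status=1
            ;;
    esac

    # OMP_NUM_THREADS is optional here; when set it must be an integer
    # greater than zero.
    threads="${OMP_NUM_THREADS:-1}"
    case "$threads" in
        *[!0-9]*)
            echo "OMP_NUM_THREADS must be a positive integer"
            status=1
            ;;
        *)
            if [ "$threads" -le 0 ]; then
                echo "OMP_NUM_THREADS must be a positive integer"
                status=1
            fi
            ;;
    esac

    return "$status"
}

# check_ffmpeg: only relevant for the non-Docker install, where ffmpeg
# has to be available on PATH.
check_ffmpeg() {
    if ! command -v ffmpeg >/dev/null 2>&1; then
        echo "ffmpeg not found on PATH"
        return 1
    fi
}
```

With these functions defined in the current shell, `check_env && check_ffmpeg && poetry run tgisper` would start the bot only when both checks pass. For the Docker variants, `check_env` alone is enough, assuming the image provides its own ffmpeg.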