🎙 Speech Transcription API

The Speech Transcription API is a RESTful service that processes audio input and converts speech into text using state-of-the-art speech recognition models. It is well suited to building transcription tools, smart assistants, and voice-controlled applications.

🚀 Features

🎤 Transcribe audio to text (STT, speech-to-text)
🔐 Secure JWT-based authentication
⚡ FastAPI backend with async support
🐳 Dockerized for easy deployment (CPU & GPU)

🛠️ Getting Started

Follow the steps below to set up and run the Speech Transcription API using Docker (with optional GPU acceleration).

📦 Install Dependencies

You can use either uv (recommended for speed) or pip.

Using `uv`:

```
uv sync
```

Using `pip`:

Create a virtual environment:
```
python -m venv .venv
```

Activate the virtual environment:

```
source .venv/bin/activate  # Linux/macOS
# .venv\Scripts\activate   # Windows
```

Install the required packages:
```
pip install -r requirements.txt
```

⚙️ Configure Environment Variables

Copy the example environment file and fill in the necessary values:

```
cp .env.example .env
```

Edit the `.env` file to set your environment variables. You can keep the default values or customize them as needed.
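To confirm that no variable was missed, the keys in `.env` can be compared against those in `.env.example`. The following is a minimal sketch, not part of the project: the helper names `parse_env_text` and `missing_keys` are hypothetical, and it assumes both files contain only simple `KEY=VALUE` lines and `#` comments.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching single or double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_keys(example_text: str, env_text: str) -> list[str]:
    """Return keys present in the example file but absent or empty in .env."""
    example = parse_env_text(example_text)
    actual = parse_env_text(env_text)
    return sorted(key for key in example if not actual.get(key))
```

One way to use it is to read both files with `pathlib.Path(...).read_text()` and print whatever `missing_keys` returns. Note that a variable left empty on purpose would also be reported.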

🐳 Build and Run the Docker Container

Using CPU:

Start the Docker container with the following command:

```
docker-compose up --build
```

This command will build the Docker image and start the container.

Using GPU:

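Edit the `docker-compose.yml` file to enable GPU acceleration. The exact changes depend on your setup; typically this means building the image from `Dockerfile.cuda` and reserving an NVIDIA device for the service. Then start the container with the same command:

```
docker-compose up --build
```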

This command will build the Docker image and start the container with GPU support.

Once the container is running, the API is available at http://localhost:8000 and its documentation at http://localhost:8000/docs.
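Because the first start may take a while, it can help to wait until the documentation URL responds before sending requests. The following is a minimal sketch using only the Python standard library; the function name `wait_for_api` and its default timeout and polling interval are assumptions for illustration, not part of the project.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_api(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # The server is not accepting connections yet, or it returned
            # an error status; wait and try again.
            pass
        time.sleep(interval)
    return False
```

For example, `wait_for_api("http://localhost:8000/docs")` returns `True` once the documentation page is served and `False` if nothing answers within the timeout.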
