---
title: Api Embedding
emoji: 🐠
colorFrom: green
colorTo: purple
sdk: docker
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

🧠 Unified Embedding API

🧩 Unified API for all your dense and sparse embedding needs — plug and play with any model from Hugging Face or your own fine-tuned versions. This is the official repository for the Hugging Face Space.


🚀 Overview

Unified Embedding API is a modular, open-source, RAG-ready API built for developers who want a simple, unified way to serve dense and sparse embedding models.

It’s designed for vector search, semantic retrieval, and AI-powered pipelines — all controlled from a single config.yaml file.
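For intuition, vector search boils down to comparing embedding vectors by similarity. The toy sketch below (pure Python, with made-up 2-D vectors standing in for real embeddings returned by the API) ranks documents by cosine similarity against a query vector:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy corpus: document id -> (made-up) embedding vector.
docs = {"doc1": [0.1, 0.9], "doc2": [0.8, 0.2]}
query = [0.2, 0.8]

# Retrieve the document whose embedding is most similar to the query.
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # doc1
```

In a real pipeline, the API produces the vectors and a vector database (FAISS, Qdrant, etc.) performs this search at scale.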

⚠️ Note: This is a development API.
For production deployment, host it on cloud platforms such as Hugging Face TGI, AWS, or GCP.


🧩 Features

  • 🧠 Unified Interface — One API to handle dense, sparse, and reranking models.
  • ⚙️ Configurable — Switch models instantly via config.yaml.
  • 🔍 Vector DB Ready — Easily integrates with FAISS, Chroma, Qdrant, Milvus, etc.
  • 📈 RAG Support — Perfect base for Retrieval-Augmented Generation systems.
  • ⚡ Fast & Lightweight — Powered by FastAPI and optimized with async processing.
  • 🧰 Extendable — Add your own models or pipelines effortlessly.
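To illustrate the config-driven design, here is a minimal sketch of what a `config.yaml` could look like. The keys and model names below are assumptions for illustration only; check the repo's `config.yaml` for the actual schema.

```yaml
# Illustrative sketch only; the real keys are defined in the repo's config.yaml.
models:
  dense:
    name: sentence-transformers/all-MiniLM-L6-v2   # any Hugging Face dense model
  sparse:
    name: prithivida/Splade_PP_en_v1               # any sparse (e.g. SPLADE) model
server:
  host: 0.0.0.0
  port: 7860   # default port exposed by Hugging Face Spaces
```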

📁 Project Structure


```
unified-embedding-api/
│
├── core/
│   ├── embedding.py
│   └── model_manager.py
├── models/
│   └── model.py
├── app.py                   # Entry point (FastAPI server)
├── config.yaml              # Model + system configuration
├── Dockerfile
├── requirements.txt
└── README.md
```


🧩 Model Selection

The default configuration is optimized for a CPU instance with 2 vCPU / 16 GB RAM. See the MTEB Leaderboard for model size and memory usage as a reference.

⚠️ If you plan to use larger models like Qwen2-embedding-8B, upgrade your Space hardware first.


☁️ How to Deploy (Free 🚀)

Deploy your custom Embedding API on Hugging Face Spaces — free, fast, and serverless.

🔧 Steps:

  1. Clone this Space Template: 👉 Hugging Face Space — fahmiaziz/api-embedding
  2. Edit config.yaml to set your own model names and backend preferences.
  3. Push your code — Spaces will automatically rebuild and host your API.

That’s it! You now have a live embedding API endpoint powered by your models.
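Once the Space is live, you can call it from any HTTP client. The sketch below is a minimal Python client using only the standard library; the `/embed` route, payload fields, and model name are assumptions, so check `app.py` in the repo for the actual route and request schema.

```python
# Minimal client sketch for a deployed Space. The endpoint path ("/embed")
# and payload fields are assumptions; see the repo's app.py for the real API.
import json
from urllib import request

def build_payload(texts, model="sentence-transformers/all-MiniLM-L6-v2"):
    """Build the JSON request body (field names are assumed, not confirmed)."""
    return json.dumps({"model": model, "input": texts}).encode("utf-8")

def embed(base_url, texts):
    """POST texts to the (assumed) /embed route and return the parsed JSON."""
    req = request.Request(
        base_url.rstrip("/") + "/embed",
        data=build_payload(texts),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (requires a running Space; URL is illustrative):
# vectors = embed("https://your-username-api-embedding.hf.space", ["hello world"])
```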


🧑‍💻 Contributing

Contributions are welcome! Please open an issue or submit a pull request to discuss changes.


⚠️ License

MIT License © 2025. Developed with ❤️ by the open-source community.


✨ “Unify your embeddings. Simplify your AI stack.”
