A high-performance vector database REST API built with FastAPI
- Docker Engine 20.10+ and Docker Compose 2.0+
- 4GB+ RAM recommended
- 2GB+ available disk space
make build
# Or: docker build -t vectordb-api:latest .make run
# Or: docker run -d --name vectordb -p 8000:8000 -v vectordb_data:/app/data vectordb-api:latest# Check container status
docker ps
# Test health endpoint
curl http://localhost:8000/health- API Base: http://localhost:8000
- Interactive Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
# Create a library
curl -X POST "http://localhost:8000/libraries/" \
-H "Content-Type: application/json" \
-d '{"name": "Test Library", "metadata": {"description": "Demo library"}}'
# Note the library ID from response, then add a document
export LIBRARY_ID="your-library-id-here"
curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/documents/" \
-H "Content-Type: application/json" \
-d '{
"document_data": {"metadata": {"title": "AI Document"}},
"chunk_texts": ["Artificial intelligence is transforming technology."]
}'
# Build index
curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/index" \
-H "Content-Type: application/json" \
-d '{"index_type": "optimized_linear", "force_rebuild": true}'
# Search
curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/search" \
-H "Content-Type: application/json" \
-d '{"query_text": "machine learning", "top_k": 3}'- Open your browser
- Navigate to http://localhost:8000/docs
- You should see the interactive API documentation
Step 7.1: Create a Library
- Click on POST /libraries
- Use this JSON:
{
"name": "AI Research Library",
"metadata": {
"description": "Collection of AI and ML documents",
"category": "research"
}
}- Copy the returned
library_id
Step 7.2: Create a Document & add Chunks
- Click on POST /libraries/{library_id}/documents
- Replace
{library_id}with your copied ID - Use this JSON:
{
"document_data": {
"metadata": {
"title": "string",
"author": "string",
"created_at": "2025-09-30T06:24:03.501Z",
"updated_at": "2025-09-30T06:24:03.501Z",
"source": "string",
"file_type": "string",
"file_size": 0,
"tags": [
"string"
],
"category": "string",
"language": "string"
}
},
"chunk_texts": [
"Artificial intelligence (AI) is the simulation of human intelligence in machines to perform tasks and learn from experience, often involving complex reasoning, planning, and decision-making"
]
}
- Copy the returned `document_id`**Step 7.3: Build Index**
- Click on **POST /libraries/{library_id}/index**
- Try different index types:
```json
{
"index_type": "linear"
}
Step 7.4: Perform Search
- Click on POST /libraries/{library_id}/search
- Test semantic search:
{
"query_text": "What is artificial intelligence?",
"top_k": 5,
"include_text": true,
"include_metadata": true
}┌─────────────────────────────────────────────────────────────────────────────────┐
│ CLIENT APPLICATIONS │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Web Browser │ │ Python SDK │ │ cURL │ │ Mobile App │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │ │
│ └──────────────────┴──────────────────┴──────────────────┘ │
│ │ │
│ HTTP/REST API │
└────────────────────────────────────┼─────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LAYER 1: API LAYER (FastAPI) │
│ Handles HTTP, Validation, Routing │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────┐ ┌────────────────────┐ ┌─────────────────────┐ │
│ │ library_router.py │ │ document_router.py │ │ chunk_router.py │ │
│ ├────────────────────┤ ├────────────────────┤ ├─────────────────────┤ │
│ │ POST /libraries │ │ POST /libraries/ │ │ POST /libraries/ │ │
│ │ GET /libraries │ │ {id}/docs │ │ {id}/chunks │ │
│ │ GET /libraries/ │ │ GET /libraries/ │ │ GET /libraries/ │ │
│ │ {id} │ │ {id}/docs/ │ │ {id}/chunks/ │ │
│ │ PUT /libraries/ │ │ {doc_id} │ │ {chunk_id} │ │
│ │ {id} │ │ PUT /libraries/ │ │ PUT /libraries/ │ │
│ │ DELETE /libraries/ │ │ {id}/docs/ │ │ {id}/chunks/ │ │
│ │ {id} │ │ {doc_id} │ │ {chunk_id} │ │
│ └────────────────────┘ │ DELETE /libraries/ │ │ DELETE /libraries/ │ │
│ │ {id}/docs/ │ │ {id}/chunks/ │ │
│ ┌────────────────────┐ │ {doc_id} │ │ {chunk_id} │ │
│ │ search_router.py │ └────────────────────┘ └─────────────────────┘ │
│ ├────────────────────┤ │
│ │ POST /libraries/ │ ┌────────────────────┐ │
│ │ {id}/search │ │ index_router.py │ │
│ │ │ ├────────────────────┤ │
│ │ - Query by text │ │ POST /libraries/ │ │
│ │ - Query by vector │ │ {id}/index │ │
│ │ - Metadata filters │ │ GET /libraries/ │ │
│ │ - Top K results │ │ {id}/index/ │ │
│ └────────────────────┘ │ stats │ │
│ └────────────────────┘ │
│ │
│ Responsibilities: │
│ • HTTP request/response handling │
│ • Pydantic model validation │
│ • Status code management │
│ • Error transformation (exceptions → HTTP errors) │
│ • OpenAPI/Swagger documentation generation │
│ • NO business logic here! │
│ │
└──────────────────────────────────┬───────────────────────────────────────────────┘
│
│ Calls
▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LAYER 2: SERVICE LAYER (Business Logic) │
│ Orchestrates Operations, Enforces Rules │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ ┌──────────────────────┐ ┌─────────────────────┐ │
│ │ library_service.py │ │ document_service.py │ │ chunk_service.py │ │
│ ├─────────────────────┤ ├──────────────────────┤ ├─────────────────────┤ │
│ │ create_library() │ │ create_document() │ │ create_chunk() │ │
│ │ get_library() │ │ get_document() │ │ get_chunk() │ │
│ │ update_library() │ │ update_document() │ │ update_chunk() │ │
│ │ delete_library() │ │ delete_document() │ │ delete_chunk() │ │
│ │ list_libraries() │ │ add_chunks_to_doc() │ │ bulk_create() │ │
│ └─────────────────────┘ └──────────────────────┘ └─────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌──────────────────────┐ ┌─────────────────────┐ │
│ │ search_service.py │ │ index_service.py │ │ embedding_service.py│ │
│ ├─────────────────────┤ ├──────────────────────┤ ├─────────────────────┤ │
│ │ search_library() │ │ build_index() │ │ get_embedding() │ │
│ │ - Apply filters │ │ delete_index() │ │ batch_embed() │ │
│ │ - Call index search │ │ update_index() │ │ - Cohere API │ │
│ │ - Rank results │ │ get_index_stats() │ │ - Retry logic │ │
│ │ - Format output │ │ recommend_algo() │ │ - Error handling │ │
│ └─────────────────────┘ └──────────────────────┘ └─────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ metadata_filter_service.py │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ apply_filters() - Pre/Post filtering logic │ │
│ │ parse_filter_expression() - Query language parsing │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ Responsibilities: │
│ • Business logic and rules enforcement │
│ • Transaction coordination across repositories │
│ • Integration with external services (embeddings) │
│ • Concurrency control (async/await, locking) │
│ • Algorithm selection and optimization │
│ • Error handling and domain exceptions │
│ │
└──────────────────────────────────┬───────────────────────────────────────────────┘
│
│ Uses
▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LAYER 3: REPOSITORY LAYER (Data Access) │
│ Abstracts Data Storage/Retrieval │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────┐ ┌──────────────────────┐ ┌──────────────────────┐ │
│ │library_repository.py │ │document_repository.py│ │ chunk_repository.py │ │
│ ├──────────────────────┤ ├──────────────────────┤ ├──────────────────────┤ │
│ │ create(library) │ │ create(document) │ │ create(chunk) │ │
│ │ get(id) → Library │ │ get(id) → Document │ │ get(id) → Chunk │ │
│ │ update(id, library) │ │ update(id, document) │ │ update(id, chunk) │ │
│ │ delete(id) → bool │ │ delete(id) → bool │ │ delete(id) → bool │ │
│ │ list() → List[...] │ │ get_by_library() │ │ get_by_library() │ │
│ │ exists(id) → bool │ │ count() → int │ │ get_by_document() │ │
│ └──────────────────────┘ └──────────────────────┘ │ get_vectors() │ │
│ └──────────────────────┘ │
│ ┌────────────────────┐ │
│ │ base.py │ │
│ ├────────────────────┤ │
│ │ BaseRepository │ │
│ │ - CRUD interface │ │
│ │ - Common methods │ │
│ └────────────────────┘ │
│ │
│ 💾 Current Implementation: In-Memory Storage │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ libraries: Dict[UUID, Library] = {} │ │
│ │ documents: Dict[UUID, Document] = {} │ │
│ │ chunks: Dict[UUID, Chunk] = {} │ │
│ │ │ │
│ │ + asyncio.Lock() for thread-safety │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ 📋 Responsibilities: │
│ • CRUD operations on domain entities │
│ • Data persistence abstraction │
│ • Query implementation │
│ • Thread-safe access to shared state │
│ • Can be swapped: InMemory → PostgreSQL → Redis (transparent to services) │
│ │
└──────────────────────────────────┬───────────────────────────────────────────────┘
│
│ Stores/Retrieves
▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ LAYER 4: DOMAIN LAYER (Core Models) │
│ Pydantic Models, Business Entities │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐ │
│ │ library.py │ │ document.py │ │ chunk.py │ │
│ ├────────────────────┤ ├────────────────────┤ ├────────────────────┤ │
│ │ class Library: │ │ class Document: │ │ class Chunk: │ │
│ │ id: UUID │ │ id: UUID │ │ id: UUID │ │
│ │ name: str │ │ library_id: UUID │ │ text: str │ │
│ │ description: str │ │ document_id: UUID│ │ embedding: List │ │
│ │ index_type: str │ │ metadata: dict │ │ document_id: UUID│ │
│ │ index_status: str│ │ chunk_count: int │ │ library_id: UUID │ │
│ │ metadata: dict │ │ created_at: dt │ │ metadata: dict │ │
│ │ created_at: dt │ │ updated_at: dt │ │ created_at: dt │ │
│ │ updated_at: dt │ └────────────────────┘ └────────────────────┘ │
│ └────────────────────┘ │
│ │
│ ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐ │
│ │ search.py │ │ metadata_filter.py │ │ pagination.py │ │
│ ├────────────────────┤ ├────────────────────┤ ├────────────────────┤ │
│ │ SearchRequest │ │ MetadataFilter │ │ PaginationParams │ │
│ │ SearchResponse │ │ FilterOperator │ │ PaginatedResponse │ │
│ │ SearchResult │ │ FilterExpression │ └────────────────────┘ │
│ └────────────────────┘ └────────────────────┘ │
│ │
│ Responsibilities: │
│ • Define domain entities with validation │
│ • Type safety with Pydantic │
│ • Automatic serialization/deserialization │
│ • Field constraints and business rules │
│ • No dependencies on other layers (pure domain) │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────┐
│ CROSS-CUTTING CONCERNS: INDEXING ALGORITHMS │
│ (Used by Index Service) │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ base.py (Abstract Interface) │ │
│ │ ┌──────────────────────────────────────────────────────────────────┐ │ │
│ │ │ class BaseIndex(ABC): │ │ │
│ │ │ @abstractmethod │ │ │
│ │ │ def build(vectors: List[Tuple[UUID, List[float]]]) -> None │ │ │
│ │ │ @abstractmethod │ │ │
│ │ │ def search(query: List[float], k: int) -> List[SearchResult] │ │ │
│ │ │ @abstractmethod │ │ │
│ │ │ def add(chunk_id: UUID, vector: List[float]) -> None │ │ │
│ │ │ @abstractmethod │ │ │
│ │ │ def remove(chunk_id: UUID) -> bool │ │ │
│ │ └──────────────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ implements │
│ ┌───────────────┴───────────────┐ │
│ ▼ ▼ ▼ │
│ ┌───────────────────────┐ ┌──────────────┐ ┌──────────────────────┐ │
│ │ linear_index.py │ │kd_tree_index │ │ lsh_index.py │ │
│ ├───────────────────────┤ ├──────────────┤ ├──────────────────────┤ │
│ │ LinearIndex │ │ KDTreeIndex │ │ LSHIndex │ │
│ │ │ │ │ │ │ │
│ │ Time: O(N×D) │ │ Time: O(logN)│ │ Time: O(N^ρ), ρ<1 │ │
│ │ Space: O(N×D) │ │ to O(N) │ │ Space: O(L×N×D) │ │
│ │ Recall: 100% │ │ Space: O(N×D)│ │ Recall: 90-95% │ │
│ │ │ │ Recall: 100% │ │ │ │
│ │ ✓ Exact search │ │ │ │ ✓ High dimensions │ │
│ │ ✓ Simple │ │ ✓ Fast in low│ │ ✓ Sub-linear search │ │
│ │ ✗ Poor scalability │ │ dimensions │ │ ✓ Tunable precision │ │
│ │ │ │ ✗ Curse of │ │ ✗ Approximate │ │
│ │ Use: <10K vectors │ │ dimensionlty│ │ │ │
│ └───────────────────────┘ └──────────────┘ │ Use: 100K+ vectors, │ │
│ │ high-D │ │
│ ┌──────────────────────┐ ┌───────────────┐ └──────────────────────┘ │
│ │ hnsw_index.py │ │optimized_ │ │
│ ├──────────────────────┤ │linear_index.py│ ┌───────────────────────┐ │
│ │ HNSWIndex │ ├───────────────┤ │ multiprobe_lsh_ │ │
│ │ │ │ Optimized │ │ index.py │ │
│ │ Time: O(log N) │ │ Linear │ ├───────────────────────┤ │
│ │ Space: O(M×N×D) │ │ │ │ MultiProbe LSH │ │
│ │ Recall: 95-99% │ │ + SIMD │ │ │ │
│ │ │ │ + Batching │ │ + Multi-bucket probe │ │
│ │ ✓ State-of-the-art │ │ + NumPy opts │ │ + Better recall │ │
│ │ ✓ Best recall/speed │ └───────────────┘ └───────────────────────┘ │
│ │ ✓ Dynamic inserts │ │
│ │ ✓ Runtime tunable │ ┌─────────────────────────────────────────┐ │
│ │ ✗ Complex impl │ │ index_factory.py │ │
│ │ ✗ Higher memory │ ├─────────────────────────────────────────┤ │
│ │ │ │ create(type, dimension) → BaseIndex │ │
│ │ Use: Production! │ │ recommend_index_type() → IndexType │ │
│ └──────────────────────┘ └─────────────────────────────────────────┘ │
│ │
│ Design Pattern: Strategy Pattern │
│ • Index Service uses BaseIndex interface │
│ • Concrete implementations are interchangeable │
│ • Open/Closed Principle: add new algorithms without modifying existing code │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────┐
│ CROSS-CUTTING CONCERNS: INFRASTRUCTURE │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ persistence/persistence_manager.py │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ • save_state() - Snapshot to disk │ │
│ │ • load_state() - Restore from disk │ │
│ │ • write_wal_entry() - Write-Ahead Log │ │
│ │ • Periodic snapshots (every 5 min) │ │
│ │ • JSON for metadata, NumPy compressed for vectors │ │
│ │ • Atomic writes with temp files │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ distributed/distributed_architecture.py │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ • Leader-Follower pattern │ │
│ │ • Leader election (Raft-like consensus) │ │
│ │ • Heartbeat mechanism │ │
│ │ • Async/Sync replication modes │ │
│ │ • Reads: Any node | Writes: Leader only │ │
│ │ • Automatic failover on leader failure │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ core/error_handlers.py │ │
│ ├──────────────────────────────────────────────────────────────────────┤ │
│ │ • Custom domain exceptions │ │
│ │ • HTTP exception mapping │ │
│ │ • Centralized error handling │ │
│ │ • Structured error responses │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────────┐
│ DATA FLOW EXAMPLE │
│ (Search Request: "Find similar documents") │
└─────────────────────────────────────────────────────────────────────────────────┘
1. CLIENT
│
│ POST /libraries/{id}/search
│ { "query_text": "machine learning", "top_k": 5 }
▼
2. API LAYER (search_router.py)
│
│ • Validate request with Pydantic
│ • Extract library_id, query_text, top_k
▼
│ search_service.search_library(lib_id, query_text, top_k)
▼
3. SERVICE LAYER (search_service.py)
│
│ A) Get library from repository
│ library_repo.get(library_id) → Library
│
│ B) Generate embedding for query text
│ embedding_service.get_embedding("machine learning")
│ └─→ Call Cohere API → [0.123, -0.456, 0.789, ...]
│
│ C) Get index for library
│ index_service.get_index(library_id) → HNSWIndex
│
│ D) Search index
│ index.search(query_vector, k=5)
│ └─→ HNSW traversal → [chunk1, chunk2, chunk3, chunk4, chunk5]
│
│ E) Apply metadata filters (if any)
│ metadata_filter_service.apply_filters(results, filters)
│
│ F) Fetch full chunk data
│ chunk_repo.get_batch([chunk_ids])
│
│ G) Format response
│ SearchResponse(results=[...], total=5, search_time_ms=12.3)
▼
4. API LAYER
│
│ Convert to JSON
│ Return HTTP 200 with results
▼
5. CLIENT
│
│ Receives search results with similarity scores
└─→ Display to user
┌─────────────────────────────────────────────────────────────────────────────────┐
│ KEY DESIGN PRINCIPLES │
├─────────────────────────────────────────────────────────────────────────────────┤
│ │
│ SOLID PRINCIPLES │
│ │
│ S - Single Responsibility │
│ Each class has one reason to change │
│ Example: LinearIndex only handles linear search │
│ │
│ O - Open/Closed │
│ Open for extension, closed for modification │
│ Example: Add new index types without changing search service │
│ │
│ L - Liskov Substitution │
│ Subclasses can replace parent classes │
│ Example: Any BaseIndex implementation works in search service │
│ │
│ I - Interface Segregation │
│ Clients don't depend on unused methods │
│ Example: Search doesn't need to know about index building internals │
│ │
│ D - Dependency Inversion │
│ Depend on abstractions, not concretions │
│ Example: Service → BaseIndex interface, not specific index class │
│ │
│ ──────────────────────────────────────────────────────────────────────────── │
│ │
│ DOMAIN-DRIVEN DESIGN │
│ │
│ • Layered Architecture - Clear separation of concerns │
│ • Repository Pattern - Data access abstraction │
│ • Service Layer - Business logic isolation │
│ • Domain Models - Rich, validated entities (Pydantic) │
│ • Ubiquitous Language - Consistent terminology (Library, Chunk, Index) │
│ │
│ ──────────────────────────────────────────────────────────────────────────── │
│ │
│ CONCURRENCY & THREAD SAFETY │
│ │
│ • async/await throughout (non-blocking I/O) │
│ • asyncio.Lock for shared state access │
│ • Thread-safe repository operations │
│ • FastAPI handles concurrent requests automatically │
│ │
│ ──────────────────────────────────────────────────────────────────────────── │
│ │
│ BEST PRACTICES │
│ │
│ • Type hints everywhere (static typing) │
│ • Pydantic for validation (fail fast) │
│ • Early returns (avoid deep nesting) │
│ • Composition over inheritance │
│ • No hardcoded values (use constants) │
│ • Comprehensive error handling │
│ • Unit + Integration testing │
│ │
└─────────────────────────────────────────────────────────────────────────────────┘
------------------------------ TO DO
┌─────────────────────────────────────────────────────────────────────────────────┐
│ DEPLOYMENT ARCHITECTURE │
└─────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────┐
│ Load Balancer │
└────────────┬────────────┘
│
┌───────────────────┼───────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ Docker │ │ Docker │ │ Docker │
│ Container 1 │ │ Container 2 │ │ Container 3 │
│ │ │ │ │ │
│ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │
│ │ FastAPI │ │ │ │ FastAPI │ │ │ │ FastAPI │ │
│ │ App │ │ │ │ App │ │ │ │ App │ │
│ └───────────┘ │ │ └───────────┘ │ │ └───────────┘ │
│ │ │ │ │ │
│ Leader │ │ Follower │ │ Follower │
│ (Writes) │ │ (Reads) │ │ (Reads) │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
└───────────────────┼───────────────────┘
│
Replication & Sync
│
▼
┌──────────────────────┐
│ Persistent Storage │
│ │
│ data/ │
│ ├─ snapshots/ │
│ ├─ wal/ │
│ └─ indexes/ │
└──────────────────────┘
Docker Volume: vectordb_data (mounted to /app/data)
Once the application is running, navigate to http://localhost:8000/docs to access the interactive Swagger UI.
- Click on GET /health
- Click "Try it out"
- Click "Execute"
- Verify the response shows:
{ "status": "healthy", "embedding_service": { "provider": "cohere", "available": true, "dimension": 1024 } }
- Click on POST /libraries
- Click "Try it out"
- Replace the request body with:
{ "name": "AI Research Library", "description": "Collection of AI and ML documents", "index_type": "linear", "metadata": { "category": "research", "owner": "test_user" } } - Click "Execute"
- Copy the
library_idfrom the response (you'll need it for next steps)
Example response:
{
"id": "123e4567-e89b-12d3-a456-426614174000",
"name": "AI Research Library",
"description": "Collection of AI and ML documents",
"index_type": "linear",
"metadata": {
"category": "research",
"owner": "test_user"
},
"created_at": "2024-01-20T10:00:00Z",
"updated_at": "2024-01-20T10:00:00Z"
}- Click on POST /libraries/{library_id}/documents
- Click "Try it out"
- Enter your
library_idin the path parameter - Replace the request body with:
{ "name": "Introduction to Machine Learning", "metadata": { "author": "John Doe", "year": 2024, "topic": "ML Basics" } } - Click "Execute"
- Copy the
document_idfrom the response
- Click on POST /libraries/{library_id}/chunks
- Click "Try it out"
- Enter your
library_idin the path parameter - Add multiple chunks with different texts:
Chunk 1:
{
"text": "Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed.",
"document_id": "your-document-id-here",
"metadata": {
"chapter": 1,
"section": "Introduction"
}
}Chunk 2:
{
"text": "Neural networks are computing systems inspired by biological neural networks that constitute animal brains. They are the foundation of deep learning.",
"document_id": "your-document-id-here",
"metadata": {
"chapter": 2,
"section": "Neural Networks"
}
}Chunk 3:
{
"text": "Natural language processing is a field of AI that gives machines the ability to read, understand, and derive meaning from human language.",
"document_id": "your-document-id-here",
"metadata": {
"chapter": 3,
"section": "NLP"
}
}- Execute each chunk creation separately
- The API will automatically generate embeddings using Cohere
- Click on POST /libraries/{library_id}/index
- Click "Try it out"
- Enter your
library_idin the path parameter - Choose an index type in the request body:
Options:
{ "index_type": "kd_tree" }"linear","kd_tree", or"lsh" - Click "Execute"
- Click on POST /libraries/{library_id}/search
- Click "Try it out"
- Enter your
library_idin the path parameter - Try different search queries:
Example 1 - Text Query:
{
"query_text": "What is deep learning?",
"top_k": 5,
"include_text": true,
"include_metadata": true
}Example 2 - Semantic Similarity:
{
"query_text": "artificial intelligence and neural computation",
"top_k": 3,
"include_text": true,
"min_score": 0.5
}Example 3 - With Metadata Filters:
{
"query_text": "machine learning fundamentals",
"top_k": 5,
"filters": {
"chapter": 1
},
"include_text": true,
"include_metadata": true
}- Click "Execute"
- Observe the results sorted by similarity score
Expected response structure:
{
"query": "What is deep learning?",
"results": [
{
"chunk_id": "chunk-uuid",
"document_id": "doc-uuid",
"score": 0.8567,
"text": "Neural networks are computing systems...",
"metadata": {
"chapter": 2,
"section": "Neural Networks"
}
}
],
"total_results": 1,
"search_time_ms": 12.5,
"index_type": "kd_tree"
}- Go back to POST /libraries/{library_id}/index
- Rebuild the index with different algorithms:
"linear"- Brute force search (most accurate)"kd_tree"- Tree-based spatial index (fast for low dimensions)"lsh"- Locality-Sensitive Hashing (fast approximate search)
- Repeat the search queries and compare:
- Search times
- Result accuracy
- Similarity scores -->
- Click on PUT /libraries/{library_id}/chunks/{chunk_id}
- Update the text or metadata
- The embedding will be automatically regenerated
- Click on DELETE /libraries/{library_id}/chunks/{chunk_id}
- Verify deletion with GET /libraries/{library_id}/chunks
- Click on GET /libraries
- See all created libraries with pagination support
- Click on POST /libraries/{library_id}/chunks/bulk
- Provide multiple texts at once:
{
"document_id": "your-document-id",
"texts": [
"First chunk text about machine learning",
"Second chunk text about deep learning",
"Third chunk text about neural networks"
]
}- First, get an embedding from a chunk
- Use POST /libraries/{library_id}/search with:
{
"query_vector": [0.123, -0.456, 0.789, ...],
"top_k": 5
}- Create a library with many chunks (50+)
- Compare search performance:
- Without index (linear scan)
- With KD-Tree index
- With LSH index
- Monitor response times in the Swagger UI
Successful Library Creation: Returns library with unique ID Chunk Creation: Automatically generates 1024-dimensional embeddings Semantic Search: Returns relevant chunks even with different wording Similarity Scores: Range from 0.0 to 1.0 (higher is more similar) Index Building: Improves search performance for large datasets Metadata Filtering: Correctly filters results based on criteria
-
Semantic Similarity Test:
- Add chunk: "Machine learning is a type of artificial intelligence"
- Search: "AI and ML technologies"
- Expected: High similarity score (>0.6)
-
Different Topics Test:
- Add chunk: "Python is a programming language"
- Search: "Neural networks and deep learning"
- Expected: Low similarity score (<0.3)
-
Exact Match Test:
- Add chunk with specific text
- Search with same text
- Expected: Very high similarity score (>0.95)
- No results returned: Check if chunks were added to the correct library
- Low similarity scores: Ensure Cohere API key is set correctly
- Index not improving performance: May need more data (>100 chunks) to see benefits
- API errors: Check the response details in Swagger UI for specific error messages
| Method | Endpoint | Description |
|---|---|---|
| GET | / |
Root endpoint with API info |
| GET | /health |
Health check with service status |
| POST | /libraries |
Create a new library |
| GET | /libraries |
List all libraries |
| GET | /libraries/{id} |
Get library details |
| PUT | /libraries/{id} |
Update library metadata |
| DELETE | /libraries/{id} |
Delete a library |
| POST | /libraries/{id}/documents |
Create a document |
| GET | /libraries/{id}/documents/{doc_id} |
Get document details |
| POST | /libraries/{id}/chunks |
Add a chunk |
| POST | /libraries/{id}/chunks/bulk |
Add multiple chunks |
| GET | /libraries/{id}/chunks |
List chunks in library |
| POST | /libraries/{id}/index |
Build/rebuild index |
| POST | /libraries/{id}/search |
Perform vector search |
If you prefer command-line testing:
# Create a library
curl -X POST "http://localhost:8000/libraries" \
-H "Content-Type: application/json" \
-d '{"name": "Test Library", "index_type": "linear"}'
# Add a chunk (replace library-id and document-id)
curl -X POST "http://localhost:8000/libraries/{library-id}/chunks" \
-H "Content-Type: application/json" \
-d '{"text": "Sample text", "document_id": "{document-id}"}'
# Search
curl -X POST "http://localhost:8000/libraries/{library-id}/search" \
-H "Content-Type: application/json" \
-d '{"query_text": "Sample query", "top_k": 5}'# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=html
# Run specific test file
pytest tests/test_api_integration.py -v