25-200x faster than MySQL FULLTEXT. In-memory full-text search engine with MySQL replication.
MySQL FULLTEXT is painfully slow: it scans B-tree pages on disk, does not compress postings, and struggles with common terms.
MygramDB solves this with an in-memory search replica that syncs from MySQL via GTID-based binlog replication, delivering sub-60ms queries even when a term matches 75% of the rows.
Tested on 1.7M rows (real production data):
| Query Type | MySQL (Cold/Warm) | MygramDB | Speedup |
|---|---|---|---|
| SORT id LIMIT 100 (typical use) | 900-3,700ms | 24-56ms | 25-68x |
| Medium-freq term (4.6% match) | 906ms / 592ms | 24ms | 38x / 25x |
| High-freq term (47.5% match) | 2,495ms / 2,017ms | 42ms | 59x / 48x |
| Ultra high-freq (74.9% match) | 3,753ms / 3,228ms | 56ms | 68x / 58x |
| Two terms AND | 1,189ms | 10ms | 115x |
| COUNT queries | 680-1,070ms | 5-9ms | 70-200x |
Key advantages:
- No cache warmup needed - Always fast, even on cold starts
- SORT optimization - Uses primary key index (no external sort)
- Scales with result size - Larger result sets show bigger speedups
- Consistent performance - MySQL varies 600ms-3.7s, MygramDB stays under 60ms
- High concurrency - Serves queries from memory under heavy load; MySQL FULLTEXT often stalls under concurrent traffic
Real-world impact: Under heavy concurrent load, MySQL FULLTEXT starts queuing requests because of disk I/O bottlenecks, causing cascading delays and timeouts. MygramDB's in-memory architecture sustains the same load with consistent sub-60ms latencies.
See Performance Guide for detailed benchmarks.
Prerequisites: Ensure MySQL has GTID mode enabled:
-- Check GTID mode (should be ON)
SHOW VARIABLES LIKE 'gtid_mode';
-- If OFF, enable GTID mode (MySQL 8.0+)
SET GLOBAL enforce_gtid_consistency = ON;
SET GLOBAL gtid_mode = OFF_PERMISSIVE;
SET GLOBAL gtid_mode = ON_PERMISSIVE;
SET GLOBAL gtid_mode = ON;
Start MygramDB:
docker run -d --name mygramdb \
-p 11016:11016 \
-e MYSQL_HOST=your-mysql-host \
-e MYSQL_USER=repl_user \
-e MYSQL_PASSWORD=your_password \
-e MYSQL_DATABASE=mydb \
-e TABLE_NAME=articles \
-e TABLE_PRIMARY_KEY=id \
-e TABLE_TEXT_COLUMN=content \
-e TABLE_NGRAM_SIZE=2 \
-e REPLICATION_SERVER_ID=12345 \
ghcr.io/libraz/mygram-db:latest
# Check logs
docker logs -f mygramdb
# Trigger initial data sync (required on first start)
docker exec mygramdb mygram-cli -p 11016 SYNC articles
# Try a search
docker exec mygramdb mygram-cli -p 11016 SEARCH articles "hello world"
git clone https://github.com/libraz/mygram-db.git
cd mygram-db
docker-compose up -d
# Wait for MySQL to be ready (check with docker-compose logs -f)
# Trigger initial data sync
docker-compose exec mygramdb mygram-cli -p 11016 SYNC articles
# Try searching
docker-compose exec mygramdb mygram-cli -p 11016 SEARCH articles "hello"
Includes MySQL 8.4 with sample data for instant testing.
# Search with pagination
SEARCH articles "hello world" SORT id LIMIT 100
# Sort by custom column
SEARCH articles "hello" SORT created_at DESC LIMIT 50
# LIMIT with offset (MySQL-style)
SEARCH articles "tech" LIMIT 10,100 # offset=10, count=100
# Count matches
COUNT articles "hello world"
# Multi-term AND search
SEARCH articles hello AND world
# With filters
SEARCH articles tech FILTER status=1 LIMIT 100
# Get by primary key
GET articles 12345
See Protocol Reference for all commands.
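When these commands are issued from application code, it can help to assemble the command string in one place. The following is a minimal Python sketch that builds only the command shapes shown above. `build_search` and its parameters are hypothetical names, not part of MygramDB or its client library; always quoting the query and placing FILTER before SORT when both are given are assumptions. The Protocol Reference remains the authoritative grammar.

```python
def build_search(table, query, filter_eq=None, sort=None, descending=False,
                 limit=None, offset=None):
    """Assemble a SEARCH command string in the shapes shown above.

    Hypothetical helper. The clause order (FILTER, SORT, LIMIT) when several
    clauses are combined is an assumption, not taken from the protocol spec.
    """
    if '"' in query:
        # Escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("query must not contain double quotes")
    parts = ["SEARCH", table, f'"{query}"']
    if filter_eq is not None:
        column, value = filter_eq
        parts += ["FILTER", f"{column}={value}"]
    if sort:
        parts += ["SORT", sort]
        if descending:
            parts.append("DESC")
    if limit is not None:
        # MySQL-style "offset,count" form when an offset is given.
        if offset is not None:
            parts += ["LIMIT", f"{offset},{limit}"]
        else:
            parts += ["LIMIT", str(limit)]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_search("articles", "hello world", sort="id", limit=100))
    print(build_search("articles", "tech", limit=100, offset=10))
```

The resulting string can be passed to `mygram-cli` as in the Docker examples, or sent over whichever protocol the application uses.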
- Fast: 25-200x faster than MySQL FULLTEXT
- MySQL Replication: Real-time GTID-based binlog streaming
- Multiple Tables: Index multiple tables in one instance
- Dual Protocol: TCP (memcached-style) and HTTP/REST API
- High Concurrency: Thread pool supporting 10,000+ connections
- Unicode: ICU-based normalization for CJK/multilingual text
- Compression: Hybrid Delta encoding + Roaring bitmaps
- Easy Deploy: Single binary or Docker container
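To illustrate the n-gram idea behind settings such as `TABLE_NGRAM_SIZE=2`, here is a toy Python sketch: it splits text into overlapping character bigrams, keeps one posting set per bigram in memory, and answers a query by intersecting postings. This is not MygramDB's implementation. The function names are made up for illustration, normalization is reduced to lowercasing rather than ICU, and postings are plain Python sets rather than delta-encoded lists or Roaring bitmaps. The result is a candidate set only: a document containing every bigram of the query does not necessarily contain the query as a contiguous string.

```python
from collections import defaultdict


def ngrams(text, n=2):
    """Return the overlapping character n-grams of text (lowercased)."""
    text = text.lower()
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs, n=2):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def candidates(index, query, n=2):
    """Ids of documents that contain every n-gram of the query."""
    grams = ngrams(query, n)
    if not grams:
        return set()
    postings = [index.get(gram, set()) for gram in grams]
    return set.intersection(*postings)


if __name__ == "__main__":
    docs = {1: "hello world", 2: "help wanted", 3: "東京タワー"}
    index = build_index(docs)
    print(sorted(candidates(index, "hello")))
    print(sorted(candidates(index, "hel")))
    print(sorted(candidates(index, "京タ")))
```

Because bigrams need no word boundaries, the same code handles the Japanese sample without a dictionary-based tokenizer, which is the usual reason n-grams are chosen for CJK text.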
graph LR
MySQL[MySQL Primary] -->|binlog GTID| MygramDB1[MygramDB #1]
MySQL -->|binlog GTID| MygramDB2[MygramDB #2]
MygramDB1 -->|Search| App[Application]
MygramDB2 -->|Search| App
App -->|Write| MySQL
MygramDB acts as a specialized read replica for full-text search, while MySQL handles writes and normal queries.
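One way an application might combine the two systems is to ask the search replica for matching primary keys and then load the rows from MySQL. The Python sketch below shows that flow with injected callables so it stays independent of any client library. `search_ids` and `fetch_rows` are hypothetical, and the assumption that a search yields primary keys in result order should be checked against the Protocol Reference.

```python
def search_articles(query, search_ids, fetch_rows, limit=100):
    """Run full-text search on the replica, then load rows from MySQL.

    Both callables are hypothetical and supplied by the caller:
      search_ids(query, limit) -> list of primary keys from the search replica
      fetch_rows(ids)          -> dict mapping primary key to row from MySQL
    """
    ids = search_ids(query, limit)
    if not ids:
        return []
    rows = fetch_rows(ids)
    # Keep the search ordering; skip ids that MySQL no longer has, which can
    # happen briefly while the replica catches up with a delete.
    return [rows[i] for i in ids if i in rows]


if __name__ == "__main__":
    table = {1: "first article", 3: "third article"}
    result = search_articles(
        "article",
        search_ids=lambda q, n: [3, 2, 1],
        fetch_rows=lambda ids: {i: table[i] for i in ids if i in table},
    )
    print(result)
```

Writes never pass through this path; they go to MySQL directly and reach the replica through the binlog.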
✅ Good fit:
- Search-heavy workloads (read >> write)
- Millions of documents with full-text search
- Need sub-100ms search latency
- Simple deployment requirements
- Japanese/CJK text with ngrams
❌ Not recommended:
- Write-heavy workloads
- Dataset doesn't fit in RAM (~1-2GB per million docs)
- Need distributed search across nodes
- Complex aggregations/analytics
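A quick capacity check can be derived from the rule of thumb of roughly 1-2GB of RAM per million documents. The Python sketch below only applies that arithmetic; the function names are illustrative, and real usage will depend on factors such as document length and n-gram size, so treat the output as a rough range rather than a measurement.

```python
def estimate_ram_gb(doc_count, gb_per_million=(1.0, 2.0)):
    """Return a (low, high) RAM estimate in GB from the rule of thumb."""
    millions = doc_count / 1_000_000
    low, high = gb_per_million
    return millions * low, millions * high


def fits_in_ram(doc_count, available_gb):
    """True when even the high end of the estimate fits in available_gb."""
    return estimate_ram_gb(doc_count)[1] <= available_gb


if __name__ == "__main__":
    low, high = estimate_ram_gb(1_700_000)
    print(f"1.7M docs: roughly {low:.1f}-{high:.1f} GB")
    print(fits_in_ram(10_000_000, available_gb=16))
```

Checking against the high end of the range is a deliberately conservative choice, since running out of memory is worse than over-provisioning.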
- Docker Deployment Guide - Production Docker setup
- Configuration Guide - All configuration options
- Protocol Reference - Complete command reference
- HTTP API Reference - REST API documentation
- Performance Guide - Benchmarks and optimization
- Replication Guide - MySQL replication setup
- Installation Guide - Build from source
- Development Guide - Contributing guidelines
- Client Library - C/C++ client library
System:
- RAM: ~1-2GB per million documents
- OS: Linux or macOS
MySQL:
- Version: 5.7.6+ or 8.0+
- GTID mode enabled (gtid_mode=ON)
- Binary log format: ROW (binlog_format=ROW)
- Replication privileges: REPLICATION SLAVE, REPLICATION CLIENT
See Installation Guide for details.
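The MySQL requirements above can be checked mechanically before starting a replica. The Python sketch below works on a plain dict of variable values, such as one built from `SHOW VARIABLES` output, so it needs no database driver. The helper names are hypothetical. The only MySQL behavior it relies on is that `gtid_mode` can change just one step at a time (OFF, OFF_PERMISSIVE, ON_PERMISSIVE, ON), which is why the prerequisite commands are written as a sequence.

```python
GTID_MODE_STEPS = ["OFF", "OFF_PERMISSIVE", "ON_PERMISSIVE", "ON"]


def check_prerequisites(variables):
    """Return a list of problems found in a dict of MySQL variable values.

    variables is expected to look like SHOW VARIABLES output, for example
    {"gtid_mode": "ON", "binlog_format": "ROW"}.
    """
    problems = []
    if variables.get("gtid_mode", "").upper() != "ON":
        problems.append("gtid_mode must be ON")
    if variables.get("binlog_format", "").upper() != "ROW":
        problems.append("binlog_format must be ROW")
    return problems


def gtid_enable_statements(current_mode):
    """SQL statements that step gtid_mode from current_mode up to ON."""
    position = GTID_MODE_STEPS.index(current_mode.upper())
    statements = []
    if position < len(GTID_MODE_STEPS) - 1:
        statements.append("SET GLOBAL enforce_gtid_consistency = ON;")
    for mode in GTID_MODE_STEPS[position + 1:]:
        statements.append(f"SET GLOBAL gtid_mode = {mode};")
    return statements


if __name__ == "__main__":
    print(check_prerequisites({"gtid_mode": "OFF", "binlog_format": "ROW"}))
    for statement in gtid_enable_statements("OFF"):
        print(statement)
```

On a busy server, the MySQL manual additionally describes waiting for ongoing anonymous transactions to finish before the final step; this sketch does not model that, and it does not verify the replication privileges.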
Contributions welcome! See Development Guide.
- libraz <libraz@libraz.net>
- Roaring Bitmaps for compressed bitmaps
- ICU for Unicode support
- spdlog for logging
- yaml-cpp for configuration