Efficient multi-strategy Zotero search using code execution pattern
A Python library for Zotero MCP that implements Anthropic's code execution pattern to enable safe, comprehensive searches without context overflow or crashes.
import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths
from zotero_lib import SearchOrchestrator, format_results
# Single comprehensive search - fetches 100+ items, returns top 20
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("embodied cognition", max_results=20)
print(format_results(results))
That's it! This automatically:
- ✅ Performs semantic + keyword + tag searches
- ✅ Deduplicates results
- ✅ Ranks by relevance
- ✅ Keeps large datasets in code (no crashes)
For OR-style searches (e.g., multiple spellings or languages), search each term separately and merge:
# Search for "Atayal" OR "泰雅族"
all_results = {}
for term in ['Atayal', '泰雅族']:
    results = orchestrator.comprehensive_search(term, max_results=50)
    for item in results:
        all_results[item.key] = item  # Deduplicate by key
# Re-rank combined results (_rank_items is a private helper and may change)
ranked = orchestrator._rank_items(list(all_results.values()), 'Atayal 泰雅族')
print(format_results(ranked[:25]))
Why? Zotero treats multi-word queries as AND conditions. Searching "Atayal 泰雅族" finds only items matching BOTH terms, not items matching either one.
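The difference between AND and OR matching can be seen in a small self-contained sketch. The toy `TITLES` dictionary and the `and_search` and `or_search` helpers are hypothetical and only mimic the behavior described above; they do not call Zotero or this library.

```python
from typing import Dict, List

# Hypothetical toy library: item key -> title. Not real Zotero data.
TITLES: Dict[str, str] = {
    "K1": "Atayal weaving traditions",
    "K2": "泰雅族的口述歷史",
    "K3": "Austronesian languages of Taiwan",
}


def and_search(query: str) -> List[str]:
    """Return keys whose title contains every whitespace-separated term."""
    terms = query.lower().split()
    return [k for k, t in TITLES.items() if all(term in t.lower() for term in terms)]


def or_search(terms: List[str]) -> List[str]:
    """Search each term separately and merge, deduplicating by key."""
    merged: Dict[str, str] = {}
    for term in terms:
        for key in and_search(term):
            merged[key] = TITLES[key]
    return list(merged)


and_hits = and_search("Atayal 泰雅族")        # no title contains both terms
or_hits = or_search(["Atayal", "泰雅族"])     # one hit per spelling
print(and_hits, or_hits)
```

In this toy data the combined query matches nothing, while the per-term merge finds both the English and the Chinese title.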
Direct MCP tool calls have limitations:
- 🚫 Crash risk with large result sets (>15-20 items)
- 🚫 Token bloat - all results load into LLM context
- 🚫 Manual orchestration - multiple searches, manual deduplication
- 🚫 No ranking - results not sorted by relevance
Code execution keeps large datasets in the execution environment:
- ✅ No crashes - only filtered results return to context
- ✅ Token efficient - process 100+ items, return top 20
- ✅ Auto-orchestration - multi-strategy search in one call
- ✅ Auto-ranking - results sorted by relevance
One function call performs:
- Semantic search (multiple variations)
- Keyword search (multiple modes)
- Tag-based search
- Automatic deduplication
- Relevance ranking
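The steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's implementation: the `Item` dataclass, the stub strategies, and the toy `score` function are all hypothetical, and the real ranking logic may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    """Hypothetical stand-in for a Zotero item; not the library's ZoteroItem."""
    key: str
    title: str
    tags: List[str]


def run_strategies(query: str, strategies: List[Callable[[str], List[Item]]]) -> Dict[str, Item]:
    """Run each search strategy and merge the results, deduplicating by item key."""
    merged: Dict[str, Item] = {}
    for strategy in strategies:
        for item in strategy(query):
            merged.setdefault(item.key, item)  # first occurrence wins
    return merged


def score(item: Item, query: str) -> int:
    """Toy relevance score: a title match counts 2, a tag match counts 1."""
    terms = query.lower().split()
    title = item.title.lower()
    tags = [t.lower() for t in item.tags]
    return sum(2 * (term in title) + (term in tags) for term in terms)


def comprehensive_search_sketch(query: str, strategies, max_results: int = 20) -> List[Item]:
    """Merge all strategies, rank by the toy score, and return the top results."""
    merged = run_strategies(query, strategies)
    ranked = sorted(merged.values(), key=lambda it: score(it, query), reverse=True)
    return ranked[:max_results]


# Stub data and strategies standing in for real keyword and tag searches.
LIBRARY = [
    Item("A1", "Embodied cognition and perception", ["cognition"]),
    Item("B2", "Cognition in the wild", ["embodied"]),
    Item("C3", "Unrelated chemistry paper", ["chemistry"]),
]


def keyword_search(query: str) -> List[Item]:
    terms = query.lower().split()
    return [it for it in LIBRARY if any(t in it.title.lower() for t in terms)]


def tag_search(query: str) -> List[Item]:
    terms = query.lower().split()
    return [it for it in LIBRARY if any(t in [x.lower() for x in it.tags] for t in terms)]


top = comprehensive_search_sketch("embodied cognition", [keyword_search, tag_search], max_results=2)
print([it.key for it in top])
```

Both stub strategies return items A1 and B2, so deduplication by key is what keeps each item from appearing twice before ranking.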
# ❌ Old way: Crash risk
results1 = zotero_semantic_search("query", limit=10) # Limited to 10
results2 = zotero_search_items("query", limit=10) # Another 10
# Manual deduplication, manual ranking...
# ✅ New way: Safe and comprehensive
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("query", max_results=20)
# Fetches 100+, processes in code, returns top 20
# Fetch broadly, filter in code
from zotero_lib import ZoteroLibrary  # assumed to be exported alongside SearchOrchestrator
library = ZoteroLibrary()
items = library.search_items("machine learning", limit=100) # Safe!
# Filter to recent journal articles
filtered = orchestrator.filter_by_criteria(
    items,
    item_types=["journalArticle"],
    date_range=(2020, 2025)
)
- Python 3.8+
- Zotero MCP installed via pipx
- Claude Code or similar code execution environment
- Clone this repository:
git clone https://github.com/yourusername/zotero-code-execution.git
cd zotero-code-execution
- Install dependencies (optional; usually already installed with Zotero MCP):
pip install -r requirements.txt
- Use in your code:
import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths # Adds zotero_mcp to path
from zotero_lib import SearchOrchestrator, format_results
# Comprehensive search
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("neural networks", max_results=20)
print(format_results(results))
# Author search, newest first
library = ZoteroLibrary()
results = library.search_items("Kahneman", qmode="titleCreatorYear", limit=50)
sorted_results = sorted(results, key=lambda x: x.date or "", reverse=True)  # tolerate missing dates
print(format_results(sorted_results))
# Tag search
library = ZoteroLibrary()
results = library.search_by_tag(["learning", "cognition"], limit=50)
print(format_results(results[:20]))
# Recently added items
library = ZoteroLibrary()
results = library.get_recent(limit=20)
print(format_results(results))
# Custom filtering in code
library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)
items = library.search_items("AI", limit=100)
# Only recent papers with DOI
recent_with_doi = [
    item for item in items
    if item.doi and item.date and item.date[:4].isdigit() and int(item.date[:4]) >= 2020
]
print(format_results(recent_with_doi))
See examples.py for 8 complete working examples.
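The custom filter above reads the year from the first four characters of `item.date`, which works only when the date string begins with the year. If dates in a library are less regular (for example "March 2020"), a more tolerant helper may be useful. The sketch below is an assumption about what such dates look like, and `extract_year` is a hypothetical helper, not part of this library.

```python
import re
from typing import Optional

# A run of exactly four digits that is not part of a longer number.
_YEAR = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_year(date: Optional[str]) -> Optional[int]:
    """Return the first standalone four-digit number in a date string, or None."""
    if not date:
        return None
    match = _YEAR.search(date)
    return int(match.group(1)) if match else None


for raw in ["2021-03-04", "March 2020", "12/05/2019", "n.d.", "", None]:
    print(repr(raw), "->", extract_year(raw))
```

A filter could then use `(extract_year(item.date) or 0) >= 2020`, so items without a recognizable year are skipped rather than raising an error.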
This repository includes a Claude Code skill for easy integration.
Copy the skill to your Claude skills directory:
cp -r claude-skill ~/.claude/skills/zotero-mcp-code
In Claude Code, searches will automatically use the code execution pattern:
"Find papers about embodied cognition"
Claude will write code using this library instead of direct MCP calls.
See claude-skill/SKILL.md for complete skill documentation.
SearchOrchestrator: main class for automated multi-strategy searching.
comprehensive_search(query, max_results=20, use_semantic=True, use_keyword=True, use_tags=True, search_limit_per_strategy=50)
Performs comprehensive search with automatic deduplication and ranking.
Returns: List of ZoteroItem objects
filter_by_criteria(items, ...)
Filters items by criteria such as item type (item_types) and publication year range (date_range).
Returns: Filtered list of ZoteroItem objects
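As a rough illustration of what filtering by item type and year range might involve, here is a self-contained sketch. It is not the library's code: the `Item` dataclass and `filter_items` function are hypothetical, and treating `date_range` as inclusive on both ends is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical stand-in for ZoteroItem with only the fields used here."""
    key: str
    item_type: str
    date: str  # assumed to start with a four-digit year when present


def filter_items(
    items: Iterable[Item],
    item_types: Optional[List[str]] = None,
    date_range: Optional[Tuple[int, int]] = None,
) -> List[Item]:
    """Keep items that satisfy every criterion that was supplied."""
    kept = []
    for item in items:
        if item_types is not None and item.item_type not in item_types:
            continue
        if date_range is not None:
            head = item.date[:4]
            if not (len(head) == 4 and head.isascii() and head.isdigit()):
                continue  # no usable year, so the item cannot match a date range
            start, end = date_range  # assumed inclusive on both ends
            if not start <= int(head) <= end:
                continue
        kept.append(item)
    return kept


items = [
    Item("A", "journalArticle", "2021-05-01"),
    Item("B", "book", "2022"),
    Item("C", "journalArticle", "2015"),
    Item("D", "journalArticle", ""),
]
recent_articles = filter_items(items, item_types=["journalArticle"], date_range=(2020, 2025))
print([item.key for item in recent_articles])
```

Criteria left as `None` are ignored, so calling `filter_items(items)` with no criteria returns every item unchanged.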
ZoteroLibrary: low-level interface to Zotero.
- search_items(query, ...) - Keyword search
- semantic_search(query, ...) - Semantic/vector search
- search_by_tag(tags, ...) - Tag-based search
- get_recent(limit) - Recently added items
- get_tags() - All library tags
- format_results(items, include_abstracts=True, max_abstract_length=300) - Format as markdown
See README_LIBRARY.md for complete API documentation.
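To make the `include_abstracts` and `max_abstract_length` parameters concrete, the following sketch shows one way a markdown formatter with abstract truncation could work. The `Item` dataclass, the `format_markdown` function, and the exact output layout are hypothetical; the real `format_results` output may look different.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """Hypothetical stand-in for ZoteroItem with only the fields used here."""
    title: str
    date: str
    abstract: str = ""


def format_markdown(items: List[Item], include_abstracts: bool = True,
                    max_abstract_length: int = 300) -> str:
    """Render items as a numbered markdown list, truncating long abstracts."""
    lines = []
    for number, item in enumerate(items, start=1):
        lines.append(f"{number}. **{item.title}** ({item.date or 'n.d.'})")
        if include_abstracts and item.abstract:
            abstract = item.abstract
            if len(abstract) > max_abstract_length:
                abstract = abstract[:max_abstract_length].rstrip() + "..."
            lines.append(f"   {abstract}")
    return "\n".join(lines)


demo = format_markdown(
    [Item("Thinking, Fast and Slow", "2011", "A" * 500)],
    max_abstract_length=10,
)
print(demo)
```

Truncating abstracts is one of the simpler ways to bound how much text returns to the LLM context per item.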
Based on Anthropic's code execution with MCP:
- Claude writes Python code (not direct MCP calls)
- Code fetches large datasets (100+ items) from Zotero
- Code processes in execution environment (dedup, rank, filter)
- Only filtered results return to LLM context (20 items)
Result: Large datasets stay out of context, preventing crashes and saving tokens.
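The effect of keeping the full dataset in the execution environment can be illustrated with synthetic data. In this sketch the records, the `summarize` helper, and the use of JSON character counts as a stand-in for tokens are all assumptions; it does not measure real token usage.

```python
import json
from typing import Dict, List


def summarize(items: List[Dict], max_results: int = 20) -> List[Dict]:
    """Rank in code, then return only the compact fields for the top results."""
    ranked = sorted(items, key=lambda it: it["score"], reverse=True)
    return [{"key": it["key"], "title": it["title"]} for it in ranked[:max_results]]


# Synthetic stand-in for 100 fetched items, each with a long abstract.
raw = [
    {"key": f"K{i}", "title": f"Paper {i}", "abstract": "x" * 1000, "score": i % 7}
    for i in range(100)
]

compact = summarize(raw)
raw_size = len(json.dumps(raw))          # what a direct tool call would put in context
compact_size = len(json.dumps(compact))  # what the code execution pattern returns

print(f"items: {len(raw)} -> {len(compact)}")
print(f"characters: {raw_size} -> {compact_size}")
```

Character counts are only a rough proxy for tokens, and the actual saving depends on how large the fetched items are and how much of each item is returned.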
Expected benefits, based on Anthropic's pattern and the implementation design (not yet benchmarked):
- Token reduction: estimated 50-90% (exact amount depends on search size)
- Function calls: 5-10x → 1x (confirmed by design)
- Search limits: 10-15 → 100+ items (safe in code)
- Crash prevention: Likely effective (untested)
See HONEST_STATUS.md for detailed status and validation needs.
- README_LIBRARY.md - Complete library documentation
- QUICK_START.md - Quick reference guide
- CLAUDE_INSTRUCTIONS.md - Instructions for Claude Code
- examples.py - 8 working examples
- IMPLEMENTATION_SUMMARY.md - Technical details
- HONEST_STATUS.md - Implementation status
- claude-skill/SKILL.md - Claude Code skill docs
Contributions welcome! Areas for improvement:
- Performance validation - Measure actual token savings
- Better ranking - Incorporate semantic similarity scores
- Caching - Cache search results with invalidation
- Parallel processing - Execute search strategies concurrently
- Export functions - Batch BibTeX generation, CSV export
MIT License - see LICENSE file for details.
- Based on Zotero MCP
- Inspired by Anthropic's code execution with MCP
- Zotero MCP - The underlying MCP server
- Claude Code - Code execution environment
- FastMCP - MCP server framework
If you use this in research, please cite:
@software{zotero_code_execution,
title = {Zotero Code Execution: Efficient Multi-Strategy Search},
year = {2025},
url = {https://github.com/kerim/zotero-code-execution}
}