Zotero Code Execution

Efficient multi-strategy Zotero search using the code execution pattern

A Python library for Zotero MCP that implements Anthropic's code execution pattern to enable safe, comprehensive searches without context overflow or crashes.

Quick Start

import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths
from zotero_lib import SearchOrchestrator, format_results

# Single comprehensive search - fetches 100+ items, returns top 20
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("embodied cognition", max_results=20)
print(format_results(results))

That's it! This automatically:

✅ Performs semantic + keyword + tag searches
✅ Deduplicates results
✅ Ranks by relevance
✅ Keeps large datasets in code (no crashes)

Multi-Term Searches

For OR-style searches (e.g., multiple spellings or languages), search each term separately and merge:

# Search for "Atayal" OR "泰雅族"
all_results = {}

for term in ['Atayal', '泰雅族']:
    results = orchestrator.comprehensive_search(term, max_results=50)
    for item in results:
        all_results[item.key] = item  # Deduplicate by key

# Re-rank combined results
# Note: _rank_items is underscore-prefixed, i.e. private by Python convention
ranked = orchestrator._rank_items(list(all_results.values()), 'Atayal 泰雅族')
print(format_results(ranked[:25]))

Why? Zotero treats multi-word queries as AND conditions. Searching "Atayal 泰雅族" finds items matching BOTH terms, not either term.
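
The merge step can also be written without relying on the private ranking method. The sketch below is a minimal, self-contained illustration rather than the library's implementation: `Item`, `or_search`, and `fake_search` are hypothetical names, and the ranking rule (items matched by more terms come first) is an assumption chosen for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a ZoteroItem; only key and title are modeled."""
    key: str
    title: str


def or_search(terms: Iterable[str], search: Callable[[str], List[Item]]) -> List[Item]:
    """Run `search` once per term, merge by key, and rank by number of matching terms."""
    items: Dict[str, Item] = {}
    hits: Dict[str, int] = {}
    for term in terms:
        seen_this_term = set()
        for item in search(term):
            items.setdefault(item.key, item)  # deduplicate by key
            if item.key not in seen_this_term:
                seen_this_term.add(item.key)
                hits[item.key] = hits.get(item.key, 0) + 1
    # sorted() is stable, so items with equal hit counts keep first-seen order
    return sorted(items.values(), key=lambda it: -hits[it.key])


# Fake per-term results standing in for real searches (assumption for the demo)
FAKE_INDEX = {
    "Atayal": [Item("A1", "Atayal weaving"), Item("B2", "Atayal 泰雅族 oral history")],
    "泰雅族": [Item("B2", "Atayal 泰雅族 oral history"), Item("C3", "泰雅族語法")],
}


def fake_search(term: str) -> List[Item]:
    return FAKE_INDEX.get(term, [])


merged = or_search(["Atayal", "泰雅族"], fake_search)
print([item.key for item in merged])
```

In real use, `search` would be something like `lambda t: orchestrator.comprehensive_search(t, max_results=50)`.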

Why This Exists

The Problem

Direct MCP tool calls have limitations:

🚫 Crash risk with large result sets (>15-20 items)
🚫 Token bloat - all results load into LLM context
🚫 Manual orchestration - multiple searches, manual deduplication
🚫 No ranking - results not sorted by relevance

The Solution

Code execution keeps large datasets in the execution environment:

✅ No crashes - only filtered results return to context
✅ Token efficient - process 100+ items, return top 20
✅ Auto-orchestration - multi-strategy search in one call
✅ Auto-ranking - results sorted by relevance

Features

Multi-Strategy Search

One function call performs:

Semantic search (multiple variations)
Keyword search (multiple modes)
Tag-based search
Automatic deduplication
Relevance ranking
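
The steps listed above can be sketched as a small pipeline: run several strategies, deduplicate by item key, then rank. This is an illustrative sketch only. The `Item` class, the toy `score` function, and the two stand-in strategies are assumptions, and the library's actual ranking may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Item:
    """Hypothetical stand-in for a ZoteroItem."""
    key: str
    title: str
    tags: Set[str] = field(default_factory=set)


def score(item: Item, query: str) -> float:
    """Toy relevance: 1 per query word in the title, 0.5 per query word matching a tag."""
    words = query.lower().split()
    title = item.title.lower()
    tags = {t.lower() for t in item.tags}
    return sum(w in title for w in words) + 0.5 * sum(w in tags for w in words)


def multi_strategy_search(
    query: str,
    strategies: List[Callable[[str], List[Item]]],
    max_results: int = 20,
) -> List[Item]:
    unique: Dict[str, Item] = {}
    for strategy in strategies:
        for item in strategy(query):
            unique.setdefault(item.key, item)  # first occurrence wins
    ranked = sorted(unique.values(), key=lambda it: score(it, query), reverse=True)
    return ranked[:max_results]


LIBRARY = [
    Item("K1", "Embodied cognition and language", {"cognition"}),
    Item("K2", "Cognition in the wild", {"anthropology"}),
    Item("K3", "Embodied tools", {"embodied", "cognition"}),
]


def keyword_strategy(query: str) -> List[Item]:
    words = query.lower().split()
    return [it for it in LIBRARY if any(w in it.title.lower() for w in words)]


def tag_strategy(query: str) -> List[Item]:
    words = set(query.lower().split())
    return [it for it in LIBRARY if words & {t.lower() for t in it.tags}]


top = multi_strategy_search("embodied cognition", [keyword_strategy, tag_strategy], max_results=2)
print([it.key for it in top])
```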

Safe Large Searches

# ❌ Old way: Crash risk
results1 = zotero_semantic_search("query", limit=10)  # Limited to 10
results2 = zotero_search_items("query", limit=10)     # Another 10
# Manual deduplication, manual ranking...

# ✅ New way: Safe and comprehensive
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("query", max_results=20)
# Fetches 100+, processes in code, returns top 20

Advanced Filtering

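# Fetch broadly, filter in code
from zotero_lib import SearchOrchestrator, ZoteroLibrary

library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)
items = library.search_items("machine learning", limit=100)  # Safe!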

# Filter to recent journal articles
filtered = orchestrator.filter_by_criteria(
    items,
    item_types=["journalArticle"],
    date_range=(2020, 2025)
)
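
To make the filtering step concrete, here is a minimal self-contained sketch of filtering by item type and year range. It is not the library's `filter_by_criteria`: the `Item` fields, the `filter_items` name, and the choice to treat `date_range` as inclusive and to drop undated items are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical stand-in for a ZoteroItem."""
    key: str
    item_type: str
    date: Optional[str]  # e.g. "2021-05-01"; may be missing


def filter_items(
    items: Iterable[Item],
    item_types: Optional[List[str]] = None,
    date_range: Optional[Tuple[int, int]] = None,
) -> List[Item]:
    kept = []
    for item in items:
        if item_types is not None and item.item_type not in item_types:
            continue
        if date_range is not None:
            # Assumption: items without a leading 4-digit year are dropped
            year_text = (item.date or "")[:4]
            if not year_text.isdigit():
                continue
            start, end = date_range  # assumption: both bounds inclusive
            if not start <= int(year_text) <= end:
                continue
        kept.append(item)
    return kept


items = [
    Item("A", "journalArticle", "2021-05-01"),
    Item("B", "book", "2022"),
    Item("C", "journalArticle", "2018-01-01"),
    Item("D", "journalArticle", None),
]
recent_articles = filter_items(items, item_types=["journalArticle"], date_range=(2020, 2025))
print([it.key for it in recent_articles])
```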

Installation

Requirements

Python 3.8+
Zotero MCP installed via pipx
Claude Code or similar code execution environment

Setup

Clone this repository:

git clone https://github.com/kerim/zotero-code-execution.git
cd zotero-code-execution

Install dependencies (optional - usually already installed with Zotero MCP):

pip install -r requirements.txt

Use in your code:

import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths  # Adds zotero_mcp to path
from zotero_lib import SearchOrchestrator, ZoteroLibrary, format_results

Usage Examples

Basic Search

orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("neural networks", max_results=20)
print(format_results(results))

Filter by Author

library = ZoteroLibrary()
results = library.search_items("Kahneman", qmode="titleCreatorYear", limit=50)
# Newest first; items without a date sort last instead of raising a TypeError
sorted_results = sorted(results, key=lambda x: x.date or "", reverse=True)
print(format_results(sorted_results))

Tag-Based Search

library = ZoteroLibrary()
results = library.search_by_tag(["learning", "cognition"], limit=50)
print(format_results(results[:20]))

Recent Papers

library = ZoteroLibrary()
results = library.get_recent(limit=20)
print(format_results(results))

Custom Filtering

library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)

items = library.search_items("AI", limit=100)

# Only recent papers with DOI (skips dates that do not start with a 4-digit year)
recent_with_doi = [
    item for item in items
    if item.doi and item.date and item.date[:4].isdigit() and int(item.date[:4]) >= 2020
]
print(format_results(recent_with_doi))
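
The custom filter above takes the year from the first four characters of `item.date`, which only works for dates that begin with a year. If your library contains free-form dates such as "March 2020", a small helper along these lines may be more forgiving. `extract_year` is a hypothetical helper, not part of the library, and the accepted year range (1500 to 2099) is an arbitrary assumption.

```python
import re
from typing import Optional

# A 4-digit year between 1500 and 2099 that is not part of a longer number
_YEAR = re.compile(r"(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)")


def extract_year(date_text: Optional[str]) -> Optional[int]:
    """Return the first plausible year found in a date string, or None."""
    if not date_text:
        return None
    match = _YEAR.search(date_text)
    return int(match.group(1)) if match else None


dates = ["2021-05-01", "March 2020", "12/03/2019", "n.d.", None]
print([extract_year(d) for d in dates])
```

It can then replace the slice in the filter, for example `(extract_year(item.date) or 0) >= 2020`.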

See examples.py for 8 complete working examples.

Claude Code Skill

This repository includes a Claude Code skill for easy integration.

Installation

Copy the skill to your Claude skills directory:

cp -r claude-skill ~/.claude/skills/zotero-mcp-code

Usage

In Claude Code, searches will automatically use the code execution pattern:

"Find papers about embodied cognition"

Claude will write code using this library instead of direct MCP calls.

See claude-skill/SKILL.md for complete skill documentation.

API Reference

`SearchOrchestrator`

Main class for automated multi-strategy searching.

`comprehensive_search(query, max_results=20, use_semantic=True, use_keyword=True, use_tags=True, search_limit_per_strategy=50)`

Performs comprehensive search with automatic deduplication and ranking.

Returns: List of ZoteroItem objects

`filter_by_criteria(items, item_types=None, date_range=None, required_tags=None, excluded_tags=None)`

Filter items by various criteria.

Returns: Filtered list of ZoteroItem objects
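
The exact semantics of `required_tags` and `excluded_tags` are documented in README_LIBRARY.md. As a rough illustration, the sketch below assumes that every required tag must be present, that no excluded tag may be present, and that matching is case-insensitive. `passes_tag_filter` is a hypothetical helper, not a library function.

```python
from typing import Iterable, Optional


def passes_tag_filter(
    item_tags: Iterable[str],
    required_tags: Optional[Iterable[str]] = None,
    excluded_tags: Optional[Iterable[str]] = None,
) -> bool:
    """True if the item has all required tags and none of the excluded ones."""
    tags = {t.lower() for t in item_tags}
    required = {t.lower() for t in (required_tags or [])}
    excluded = {t.lower() for t in (excluded_tags or [])}
    return required <= tags and not (excluded & tags)


item_tags = ["Learning", "cognition"]
print(passes_tag_filter(item_tags, required_tags=["learning"]))
print(passes_tag_filter(item_tags, excluded_tags=["cognition"]))
```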

`ZoteroLibrary`

Low-level interface to Zotero.

search_items(query, ...) - Keyword search
semantic_search(query, ...) - Semantic/vector search
search_by_tag(tags, ...) - Tag-based search
get_recent(limit) - Recently added items
get_tags() - All library tags

Helper Functions

format_results(items, include_abstracts=True, max_abstract_length=300) - Format as markdown
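
For a sense of what the `include_abstracts` and `max_abstract_length` parameters control, here is a minimal formatter sketch. The output layout, the `Item` fields, and the `format_items` name are assumptions for illustration; the real `format_results` may produce different markdown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical stand-in for a ZoteroItem."""
    title: str
    creators: str
    date: Optional[str] = None
    abstract: Optional[str] = None


def format_items(
    items: List[Item],
    include_abstracts: bool = True,
    max_abstract_length: int = 300,
) -> str:
    lines = []
    for number, item in enumerate(items, start=1):
        year = (item.date or "n.d.")[:4]
        lines.append(f"{number}. **{item.title}** ({item.creators}, {year})")
        if include_abstracts and item.abstract:
            abstract = item.abstract
            if len(abstract) > max_abstract_length:
                # Truncate long abstracts so the output stays compact
                abstract = abstract[:max_abstract_length].rstrip() + "..."
            lines.append(f"   {abstract}")
    return "\n".join(lines)


sample = [Item("Thinking, Fast and Slow", "Kahneman", "2011", "A" * 10)]
print(format_items(sample, max_abstract_length=4))
```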

See README_LIBRARY.md for complete API documentation.

Architecture

Based on Anthropic's code execution with MCP:

Claude writes Python code (not direct MCP calls)
Code fetches large datasets (100+ items) from Zotero
Code processes in execution environment (dedup, rank, filter)
Only filtered results return to LLM context (20 items)

Result: Large datasets stay out of context, preventing crashes and saving tokens.
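
The flow above can be illustrated with a toy example in which a large fetch stays inside a function and only a compact summary is returned. Everything here is a stand-in: `fetch_all` fabricates records locally instead of calling Zotero, and the `relevance` values are arbitrary.

```python
from typing import Dict, List


def fetch_all(n: int) -> List[Dict]:
    """Stand-in for a large fetch; each record carries a bulky abstract."""
    return [
        {
            "key": f"ITEM{i:03d}",
            "title": f"Paper {i}",
            "relevance": (i * 37) % 101,  # arbitrary toy score
            "abstract": "lorem ipsum " * 200,
        }
        for i in range(n)
    ]


def search_in_sandbox(n_fetch: int = 120, n_return: int = 20) -> List[Dict]:
    records = fetch_all(n_fetch)  # large data stays inside this function
    records.sort(key=lambda r: r["relevance"], reverse=True)
    # Only compact fields of the top results leave the execution environment
    return [{"key": r["key"], "title": r["title"]} for r in records[:n_return]]


summary = search_in_sandbox()
full_size = len(str(fetch_all(120)))
summary_size = len(str(summary))
print(len(summary), summary_size < full_size)
```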

Performance

Expected Benefits

Based on Anthropic's pattern and implementation design:

Token reduction: 50-90% (exact amount depends on search size)
Function calls: 5-10x → 1x (confirmed by design)
Search limits: 10-15 → 100+ items (safe in code)
Crash prevention: Likely effective (untested)
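
As a back-of-the-envelope check on the token reduction range, the arithmetic can be sketched as follows. The per-item token count and the fixed overhead for the code itself are illustrative assumptions, not measurements.

```python
def token_reduction(
    items_fetched: int,
    items_returned: int,
    tokens_per_item: int,
    code_overhead_tokens: int,
) -> float:
    """Fraction of context tokens saved versus loading every fetched item."""
    before = items_fetched * tokens_per_item
    after = items_returned * tokens_per_item + code_overhead_tokens
    return 1 - after / before


# Assumed figures: 300 tokens per item, 500 tokens for the code and its output framing
reduction = token_reduction(100, 20, tokens_per_item=300, code_overhead_tokens=500)
print(f"{reduction:.0%}")  # prints 78%
```

With these assumed numbers, 100 fetched items would cost 30,000 tokens in context, while 20 returned items plus overhead cost 6,500, which falls inside the projected 50-90% range. Smaller fetches save proportionally less.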

Status

⚠️ Proof of concept - Performance claims are theoretical projections, not measured results.

See HONEST_STATUS.md for detailed status and validation needs.

Documentation

README_LIBRARY.md - Complete library documentation
QUICK_START.md - Quick reference guide
CLAUDE_INSTRUCTIONS.md - Instructions for Claude Code
examples.py - 8 working examples
IMPLEMENTATION_SUMMARY.md - Technical details
HONEST_STATUS.md - Implementation status
claude-skill/SKILL.md - Claude Code skill docs

Contributing

Contributions welcome! Areas for improvement:

Performance validation - Measure actual token savings
Better ranking - Incorporate semantic similarity scores
Caching - Cache search results with invalidation
Parallel processing - Execute search strategies concurrently
Export functions - Batch BibTeX generation, CSV export

License

MIT License - see LICENSE file for details.

Credits

Based on Zotero MCP
Inspired by Anthropic's code execution with MCP

Related Projects

Zotero MCP - The underlying MCP server
Claude Code - Code execution environment
FastMCP - MCP server framework

Citation

If you use this in research, please cite:

@software{zotero_code_execution,
  title = {Zotero Code Execution: Efficient Multi-Strategy Search},
  year = {2025},
  url = {https://github.com/kerim/zotero-code-execution}
}
