Feature/add data explorer example #112
base: main
Pull Request Overview
This pull request introduces a new Python MCP server (data_explorer_server_python) that powers an interactive Data Explorer widget, enabling CSV uploads, dataset profiling, filtered previews, and chart generation. The implementation includes both frontend React components and backend Python services with comprehensive test coverage.
Key changes:
- Added a complete Data Explorer widget with React frontend supporting CSV uploads, data filtering, and chart visualization (bar, scatter, histogram)
- Implemented Python backend server with dataset storage, profiling, filtering, and chart generation capabilities
- Updated documentation to describe the new server and its asset-serving behavior
Reviewed Changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/data-explorer/App.tsx` | Main React component implementing upload UI, dataset selection, filtering, table preview, and chart builder |
| `src/data-explorer/types.ts` | TypeScript type definitions for dataset profiles, filters, charts, and API responses |
| `src/data-explorer/utils/format.ts` | Utility functions for formatting numbers, bytes, percentages, and values |
| `src/data-explorer/utils/callTool.ts` | Helper for calling MCP tools and parsing JSON responses |
| `src/data-explorer/index.tsx` | Widget entry point and root rendering |
| `data_explorer_server_python/main.py` | Core server implementation with MCP tool handlers, upload sessions, and asset loading |
| `data_explorer_server_python/schemas.py` | Pydantic models for request/response validation and API contracts |
| `data_explorer_server_python/store.py` | In-memory dataset storage with thread-safe operations |
| `data_explorer_server_python/profiling.py` | Dataset profiling logic for column metadata and statistics |
| `data_explorer_server_python/filters.py` | DataFrame filtering implementation for equals and range conditions |
| `data_explorer_server_python/charts.py` | Chart data generation for bar, scatter, and histogram visualizations |
| `data_explorer_server_python/utils.py` | Type conversion and DataFrame utility functions |
| `data_explorer_server_python/tests/test_server.py` | Comprehensive test suite covering upload, preview, chart, and chunked upload flows |
| `data_explorer_server_python/requirements.txt` | Python dependencies specification |
| `data_explorer_server_python/README.md` | Server documentation with setup and usage instructions |
| `README.md` | Updated main documentation with Data Explorer server details |
| `package.json` | Added recharts dependency for chart visualization |
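
For readers unfamiliar with what "in-memory dataset storage with thread-safe operations" might look like, here is a minimal sketch. It is not the code in `store.py`; the class name `InMemoryDatasetStore`, its methods, and the row-dict representation are all assumptions made for illustration.

```python
import threading
import uuid
from typing import Dict, List, Optional

Row = Dict[str, object]


class InMemoryDatasetStore:
    """Hypothetical store mapping dataset ids to lists of row dicts."""

    def __init__(self) -> None:
        # A single lock guards every read and write of the shared dict.
        self._lock = threading.Lock()
        self._datasets: Dict[str, List[Row]] = {}

    def add(self, rows: List[Row]) -> str:
        """Store a copy of the rows and return a freshly generated id."""
        dataset_id = uuid.uuid4().hex
        with self._lock:
            self._datasets[dataset_id] = list(rows)
        return dataset_id

    def get(self, dataset_id: str) -> Optional[List[Row]]:
        """Return a copy of the rows, or None if the id is unknown."""
        with self._lock:
            rows = self._datasets.get(dataset_id)
            return list(rows) if rows is not None else None

    def remove(self, dataset_id: str) -> bool:
        """Delete a dataset; report whether anything was removed."""
        with self._lock:
            return self._datasets.pop(dataset_id, None) is not None
```

Returning copies from `get` keeps callers from mutating the stored list outside the lock; a real store holding large DataFrames would likely make a different trade-off.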
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
cching-openai left a comment
Nice submission! I'm a TypeScript/JS dev, so I'd get someone else to review the Python code here specifically, but I left a comment on the main server.
Also, I suggest adding a note in your README, or a stub in the main server, on how to add authentication. Right now, for example, the HTTP app is exposed without authentication, and CORS is open to any origin (*).
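
One possible shape for such a stub, sketched with the standard library only so it does not depend on how the server wires its HTTP app. The function names, the `ALLOWED_ORIGINS` set, and the bearer-token scheme are assumptions for illustration, not part of this PR.

```python
import hmac
from typing import Iterable, Optional

# Hypothetical allow-list; a real deployment would load this from configuration.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def is_origin_allowed(
    origin: Optional[str], allowed: Iterable[str] = ALLOWED_ORIGINS
) -> bool:
    """Accept only origins named explicitly, instead of the wildcard '*'."""
    return origin is not None and origin in set(allowed)


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header against a shared secret."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```

A request handler or middleware could call both checks before dispatching to any tool and reject the request otherwise.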

    raise HTTPException(status_code=413, detail=_format_size_limit_error(max_bytes))


    def _extract_path_from_payload(payload: UploadDatasetInput) -> Optional[Path]:

This accepts any filePath/fileUri, expands it, and _read_csv_from_path reads whatever file the caller names before handing its contents back through previews, profiles, and charts. When this server is wired into an LLM tool chain, any prompt that hits data-explorer.upload could extract secrets from the user's machine (e.g., /etc/passwd, SSH keys). Consider dropping raw path/URI ingestion, enforcing an allow-listed directory, or adding an explicit human approval step before reading from disk.
Thanks for the review and good flag! I added tighter controls around allowlisting directories for uploads and updated the README accordingly. Let me know if there's anything else I should address.
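
For context, a directory allow-list check of the kind discussed in this thread can be sketched as follows. This is not the code added in the PR; `resolve_allowed_path` and its parameters are hypothetical names, and the only assumption is that uploads should be confined to a known set of root directories.

```python
from pathlib import Path
from typing import Iterable


def resolve_allowed_path(raw_path: str, allowed_roots: Iterable[Path]) -> Path:
    """Return the resolved path if it lies under an allowed root, else raise."""
    # resolve() normalizes '..' segments and follows symlinks, so traversal
    # tricks are collapsed before the containment check runs.
    candidate = Path(raw_path).expanduser().resolve()
    for root in allowed_roots:
        root_resolved = root.expanduser().resolve()
        if candidate == root_resolved or root_resolved in candidate.parents:
            return candidate
    raise PermissionError(f"{candidate} is outside the allowed upload directories")
```

Comparing resolved paths rather than raw strings matters here: a string prefix check would accept `/srv/uploads/../secrets.txt`, while the resolved form is rejected.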
This pull request adds a new Python MCP server for the Data Explorer widget, enabling interactive CSV upload, profiling, preview, and charting functionality. The changes include updates to the main `README.md` to document the new server, addition of a dedicated `README.md` for the Data Explorer server, and implementation of core backend logic for dataset handling, filtering, profiling, and chart generation. These enhancements allow users to upload and explore datasets directly through the Data Explorer widget.

Documentation updates:
- Main `README.md`: introduces the new `data_explorer_server_python` MCP server, describes its capabilities (CSV uploads, filters, charts), and provides setup and usage instructions. Also clarifies asset serving behavior and adds the Data Explorer to the list of demo servers.
- `README.md` in `data_explorer_server_python/`: explains prerequisites, setup, server commands, and available MCP tools for the Data Explorer widget.

Core backend implementation:
- `charts.py`: implements bar, scatter, and histogram chart generation from pandas DataFrames, supporting grouping, aggregation, and binning logic.
- `filters.py`: applies equals and range filters to DataFrame columns, enabling filtered previews and charting.
- `profiling.py`: computes column metadata, numeric and datetime statistics, and top values for profiling uploaded datasets.

Data modeling and API schemas:
- `schemas.py`: defines pydantic models for all request and response types, including dataset upload, chunked upload, preview, chart configuration, and dataset profiling.

Project setup:
- `requirements.txt`: specifies dependencies for FastAPI, MCP SDK, pandas, numpy, and related libraries.
- `__init__.py` for `data_explorer_server_python`.
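
To make the filter and binning behavior described above concrete, here is a small sketch. It deliberately uses plain lists of row dicts instead of pandas DataFrames so that it runs with the standard library alone; the function names and the equal-width binning rule are assumptions, not the contents of `filters.py` or `charts.py`.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, object]


def apply_equals(rows: Sequence[Row], column: str, value: object) -> List[Row]:
    """Keep rows whose cell in `column` equals `value`."""
    return [row for row in rows if row.get(column) == value]


def apply_range(
    rows: Sequence[Row],
    column: str,
    minimum: Optional[float] = None,
    maximum: Optional[float] = None,
) -> List[Row]:
    """Keep rows whose numeric cell lies within [minimum, maximum]."""
    kept: List[Row] = []
    for row in rows:
        cell = row.get(column)
        # Skip missing and non-numeric cells (bool is excluded on purpose).
        if isinstance(cell, bool) or not isinstance(cell, (int, float)):
            continue
        if minimum is not None and cell < minimum:
            continue
        if maximum is not None and cell > maximum:
            continue
        kept.append(row)
    return kept


def histogram(
    values: Sequence[float], bin_count: int
) -> List[Tuple[float, float, int]]:
    """Return (bin_start, bin_end, count) tuples for equal-width bins."""
    if bin_count < 1:
        raise ValueError("bin_count must be at least 1")
    if not values:
        return []
    low, high = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (high - low) / bin_count or 1.0
    counts = [0] * bin_count
    for value in values:
        # The maximum value would index one past the end, so clamp it.
        index = min(int((value - low) / width), bin_count - 1)
        counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(bin_count)
    ]
```

Filters compose by chaining, for example `apply_range(apply_equals(rows, "city", "Oslo"), "temp", minimum=5)`, and the filtered values can then be passed to `histogram`.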