@saito-oai

This pull request adds a new Python MCP server for the Data Explorer widget, enabling interactive CSV upload, profiling, preview, and charting functionality. The changes include updates to the main README.md to document the new server, addition of a dedicated README.md for the Data Explorer server, and implementation of core backend logic for dataset handling, filtering, profiling, and chart generation. These enhancements allow users to upload and explore datasets directly through the Data Explorer widget.

Documentation updates:

  • Updated README.md to introduce the new data_explorer_server_python MCP server, describe its capabilities (CSV uploads, filters, charts), and provide setup and usage instructions. Also clarified asset serving behavior and added the Data Explorer to the list of demo servers.
  • Added a detailed README.md to data_explorer_server_python/ explaining prerequisites, setup, server commands, and available MCP tools for the Data Explorer widget.

Core backend implementation:

  • Added charts.py implementing bar, scatter, and histogram chart generation from pandas DataFrames, supporting grouping, aggregation, and binning logic.
  • Added filters.py to apply equals and range filters to DataFrame columns, enabling filtered previews and charting.
  • Added profiling.py to compute column metadata, numeric and datetime statistics, and top values for profiling uploaded datasets.
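
To illustrate what "equals and range filters" can mean in practice, here is a minimal sketch in plain Python over a list of row dictionaries. It is not the code from filters.py, which operates on pandas DataFrames. The names EqualsFilter, RangeFilter, row_matches, and apply_filters are hypothetical, and the inclusive bounds and AND-combination of filters are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class EqualsFilter:
    """Hypothetical filter: keep rows whose column equals a value."""
    column: str
    value: Any


@dataclass
class RangeFilter:
    """Hypothetical filter: keep rows whose column lies within bounds."""
    column: str
    minimum: Optional[float] = None  # inclusive lower bound; None means unbounded
    maximum: Optional[float] = None  # inclusive upper bound; None means unbounded


Filter = Union[EqualsFilter, RangeFilter]
Row = Dict[str, Any]


def row_matches(row: Row, flt: Filter) -> bool:
    value = row.get(flt.column)
    if isinstance(flt, EqualsFilter):
        return value == flt.value
    if value is None:
        # Assumption: missing values never satisfy a range filter.
        return False
    if flt.minimum is not None and value < flt.minimum:
        return False
    if flt.maximum is not None and value > flt.maximum:
        return False
    return True


def apply_filters(rows: List[Row], filters: List[Filter]) -> List[Row]:
    # Assumption: filters combine with AND, so a row must satisfy all of them.
    return [row for row in rows if all(row_matches(row, f) for f in filters)]
```

With an empty filter list every row is kept, which matches the usual expectation that an unfiltered preview shows the whole dataset.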

Data modeling and API schemas:

  • Added schemas.py defining pydantic models for all request and response types, including dataset upload, chunked upload, preview, chart configuration, and dataset profiling.

Project setup:

  • Added requirements.txt specifying dependencies for FastAPI, MCP SDK, pandas, numpy, and related libraries.
  • Added package init file __init__.py for data_explorer_server_python.

Copilot AI left a comment

Pull Request Overview

This pull request introduces a new Python MCP server (data_explorer_server_python) that powers an interactive Data Explorer widget, enabling CSV uploads, dataset profiling, filtered previews, and chart generation. The implementation includes both frontend React components and backend Python services with comprehensive test coverage.

Key changes:

  • Added a complete Data Explorer widget with React frontend supporting CSV uploads, data filtering, and chart visualization (bar, scatter, histogram)
  • Implemented Python backend server with dataset storage, profiling, filtering, and chart generation capabilities
  • Updated documentation to describe the new server and its asset-serving behavior

Reviewed Changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/data-explorer/App.tsx | Main React component implementing upload UI, dataset selection, filtering, table preview, and chart builder |
| src/data-explorer/types.ts | TypeScript type definitions for dataset profiles, filters, charts, and API responses |
| src/data-explorer/utils/format.ts | Utility functions for formatting numbers, bytes, percentages, and values |
| src/data-explorer/utils/callTool.ts | Helper for calling MCP tools and parsing JSON responses |
| src/data-explorer/index.tsx | Widget entry point and root rendering |
| data_explorer_server_python/main.py | Core server implementation with MCP tool handlers, upload sessions, and asset loading |
| data_explorer_server_python/schemas.py | Pydantic models for request/response validation and API contracts |
| data_explorer_server_python/store.py | In-memory dataset storage with thread-safe operations |
| data_explorer_server_python/profiling.py | Dataset profiling logic for column metadata and statistics |
| data_explorer_server_python/filters.py | DataFrame filtering implementation for equals and range conditions |
| data_explorer_server_python/charts.py | Chart data generation for bar, scatter, and histogram visualizations |
| data_explorer_server_python/utils.py | Type conversion and DataFrame utility functions |
| data_explorer_server_python/tests/test_server.py | Comprehensive test suite covering upload, preview, chart, and chunked upload flows |
| data_explorer_server_python/requirements.txt | Python dependencies specification |
| data_explorer_server_python/README.md | Server documentation with setup and usage instructions |
| README.md | Updated main documentation with Data Explorer server details |
| package.json | Added recharts dependency for chart visualization |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

saito-oai and others added 3 commits November 12, 2025 23:34
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

cching-openai left a comment

Nice submission! I'm a TypeScript/JS dev, so I'd get someone else to review the Python code here specifically, but I left a comment on the main server.

Also, I suggest adding a note to your README, or a stub in the main server, showing how users can add authentication. Right now, for example, the HTTP app is exposed without authentication, and CORS is open to the * origin.
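
As one possible shape for such a stub, the sketch below checks a bearer token and an origin allowlist using only the standard library. It is an illustration under stated assumptions, not code from this PR: the function names is_authorized and is_origin_allowed are hypothetical, and how the expected token is provisioned (environment variable, config file, secret store) is left open.

```python
import hmac
from typing import Collection, Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for a well-formed 'Bearer <token>' header that matches.

    Hypothetical helper: the server would call this with the incoming
    Authorization header before handling a request.
    """
    if not authorization_header or not expected_token:
        # Reject missing headers, and refuse to run with an empty secret.
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())


def is_origin_allowed(origin: Optional[str], allowed_origins: Collection[str]) -> bool:
    """Return True only if the Origin header is on an explicit allowlist,
    in contrast to accepting the wildcard '*' origin."""
    return origin is not None and origin in allowed_origins
```

In a FastAPI app, a check like this would typically sit in a dependency or middleware, and the origin allowlist would be passed to the CORS middleware in place of "*".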

raise HTTPException(status_code=413, detail=_format_size_limit_error(max_bytes))


def _extract_path_from_payload(payload: UploadDatasetInput) -> Optional[Path]:

This accepts any filePath/fileUri, expands it, and _read_csv_from_path reads whatever file the caller names before handing its contents back through previews, profiles, and charts. When this server is wired into an LLM tool chain, any prompt that hits data-explorer.upload could extract secrets from the user's machine (e.g., /etc/passwd, SSH keys). Consider dropping raw path/URI ingestion, enforcing an allow-listed directory, or adding an explicit human approval step before reading from disk.
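
The allow-listed directory option could look roughly like the sketch below. It is a minimal illustration, not the fix that was eventually merged: the names PathNotAllowedError, resolve_allowed_path, and is_path_allowed are hypothetical, and the list of allowed directories is assumed to come from server configuration.

```python
from pathlib import Path
from typing import Iterable


class PathNotAllowedError(ValueError):
    """Raised when a requested file falls outside every allow-listed directory."""


def resolve_allowed_path(raw_path: str, allowed_dirs: Iterable[Path]) -> Path:
    """Return the fully resolved path if it lies inside an allowed directory."""
    # Resolve symlinks and ".." segments first, so a path such as
    # "uploads/../../etc/passwd" cannot slip past the containment check.
    candidate = Path(raw_path).expanduser().resolve()
    for base in allowed_dirs:
        root = Path(base).expanduser().resolve()
        # Compare whole path components rather than string prefixes, so that
        # "/data/uploads-evil" is not mistaken for a child of "/data/uploads".
        if root in candidate.parents:
            return candidate
    raise PathNotAllowedError(f"{candidate} is outside the allowed directories")


def is_path_allowed(raw_path: str, allowed_dirs: Iterable[Path]) -> bool:
    """Boolean convenience wrapper around resolve_allowed_path."""
    try:
        resolve_allowed_path(raw_path, allowed_dirs)
    except PathNotAllowedError:
        return False
    return True
```

The caller would then read from the returned resolved path, not from the raw input, so the file that was checked is the file that gets opened.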

Author

Thanks for the review and good flag! I added tighter controls around allowlisting directories for uploading and updated the README accordingly. Let me know if there's anything else I should address.

3 participants