Skip to content

[Feature]: CLI server mode (pydoll serve) #247

@thalissonvs

Description

@thalissonvs

It would be useful to expose Pydoll as an HTTP service, so external systems can trigger crawls without writing Python code. The idea is to add a CLI command:

pydoll serve --port 8000

This spins up a lightweight server (likely as a plugin, to avoid bloating the core) and exposes a simple API.

Proposed API

Initial endpoint:

  • POST /crawl → body contains { "url": "https://example.com", "format": "html" | "markdown" }
  • Response returns the page content, either as HTML or Markdown (depending on the Markdown exporter feature).

This endpoint becomes a foundation for LLM integrations, where the returned HTML or Markdown can be fed into models for structured data extraction. By exposing crawling as a simple web API, Pydoll can be plugged directly into AI pipelines, data labeling flows, or automated extraction systems without extra glue code.

This could start as a separate repository (pydoll-serve) and evolve independently, but integrating a CLI hook into Pydoll keeps the DX simple.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestfuture planningIdeas or features proposed for future development.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions