Monte Carlo Tree Search (MCTS) is a heuristic search algorithm that systematically explores a tree of candidate outputs to refine language model responses. Upon receiving an input, the MCTS pipeline generates multiple candidate answers through iterative simulations. In each iteration, the algorithm evaluates these candidates against feedback and propagates their scores back up the tree. This scales the model's reasoning at inference time, allowing the best response to be selected from the pool of candidates.
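Below is a minimal, self-contained sketch of the generic MCTS loop described above (selection, expansion, simulation, backpropagation). A random scorer stands in for LLM generation and evaluation, and every name in it is illustrative rather than taken from this project's code.

```python
import math
import random


class Node:
    """One candidate answer in the search tree."""

    def __init__(self, text, parent=None):
        self.text = text
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_score = 0.0

    def uct(self, c=1.4):
        # Unvisited nodes are explored first.
        if self.visits == 0:
            return float("inf")
        exploit = self.total_score / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore


def mcts(root, iterations=2, simulations=2, children_per_parent=2):
    for _ in range(iterations):
        for _ in range(simulations):
            # 1. Selection: follow the highest-UCT child down to a leaf.
            node = root
            while node.children:
                node = max(node.children, key=Node.uct)
            # 2. Expansion: a real pipeline would ask the LLM for refined
            #    candidates here; the refinement step is stubbed out.
            for i in range(children_per_parent):
                node.children.append(Node(f"{node.text}/refined-{i}", parent=node))
            # 3. Simulation: score one new candidate (stubbed with randomness).
            leaf = random.choice(node.children)
            score = random.random()
            # 4. Backpropagation: push the score up toward the root.
            while leaf is not None:
                leaf.visits += 1
                leaf.total_score += score
                leaf = leaf.parent
    # Pick the child of the root with the best average score.
    best = max(root.children, key=lambda n: n.total_score / max(n.visits, 1))
    return best.text


print(mcts(Node("draft answer")))
```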
This FastAPI server exposes two endpoints:
| Method | Endpoint | Description |
|---|---|---|
| POST | `/v1/chat/completions` | Accepts chat completion requests; each call is wrapped with an MCTS refinement pipeline. |
| GET | `/v1/models` | Proxies the request to the underlying LLM provider's models endpoint. |
During a chat completion call, the server runs an MCTS pipeline that produces iterative updates. Each update includes a dynamic Mermaid diagram and detailed logs of the iteration process. All intermediate responses are combined into a single `<details>` block, and the final answer is then appended using a consistent, structured markdown template.
- Create a `secrets.env` with the variables from the `docker-compose.yml` file.
- Pull the image and deploy the application with Docker Compose:

  ```bash
  docker pull ghcr.io/bearlike/mcts-openai-api:latest
  docker compose --env-file secrets.env up -d
  # Go to http://hostname:8426/docs for Swagger API docs and test the endpoints.
  ```

- Use `http://hostname:8426/v1` as the OpenAI Base URL with any API key in any compatible application, as shown in the sketch below.
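For example, you can point the official `openai` Python client at the server. This is a minimal sketch assuming the Docker deployment above; replace `hostname` with your server's address.

```python
from openai import OpenAI

# Any API key string is accepted, per the Docker quickstart above.
client = OpenAI(base_url="http://hostname:8426/v1", api_key="any-key")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How many R in STRAWBERRY?"}],
)
print(response.choices[0].message.content)
```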
**Manual Installation**
- Python 3.13+
- Poetry for dependency management
- Clone the repository:

  ```bash
  git clone https://github.com/bearlike/mcts-openai-api.git
  cd mcts-openai-api
  ```
- Copy the environment file:

  Copy the example environment file to `.env` and set your `OPENAI_API_KEY`:

  ```bash
  cp .env.example .env
  ```

  Open the `.env` file and update the `OPENAI_API_KEY` (and other settings if needed).
- Install dependencies:

  Use Poetry to install the required packages:

  ```bash
  poetry install
  ```
- Run the server:

  Start the FastAPI server with Uvicorn:

  ```bash
  # Visit http://mcts-server:8000/docs to view the Swagger API documentation
  uvicorn main:app --reload
  ```
You can test the server using curl or any HTTP client.
```bash
curl -X 'POST' \
  'http://mcts-server:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "How many R in STRAWBERRY?"
      }
    ],
    "max_tokens": 1024,
    "temperature": 0.5,
    "reasoning_effort": "low"
  }' | jq -r '.choices[0].message.content'
```

This request returns a JSON response with the aggregated intermediate responses wrapped inside a single `<details>` block, followed by the final answer.
Wraps a chat completion request in an MCTS pipeline that refines the answer by generating intermediate updates and a final response.
| Parameter | Data Type | Default | Description |
|---|---|---|---|
| `model` | string (required) | N/A | Model to use, e.g., `gpt-4o-mini`. |
| `messages` | array (required) | N/A | Array of chat messages with `role` and `content`. |
| `max_tokens` | number (optional) | N/A | Maximum tokens allowed in each step response. |
| `temperature` | number (optional) | `0.7` | Controls the randomness of the output. |
| `stream` | boolean (optional) | `false` | If `false`, aggregates streamed responses and returns on completion. If `true`, streams intermediate responses. |
| `reasoning_effort` | string (optional) | `low` | Controls the MCTSAgent search settings:<br>`low` - 2 iterations, 2 simulations per iteration, and 2 child nodes per parent (default).<br>`medium` - 3 iterations, 3 simulations per iteration, and 3 child nodes per parent.<br>`high` - 4 iterations, 4 simulations per iteration, and 4 child nodes per parent. |
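A hedged sketch of consuming intermediate updates with `stream: true`, again via the `openai` Python client (the server URL is assumed from the manual setup above):

```python
from openai import OpenAI

client = OpenAI(base_url="http://mcts-server:8000/v1", api_key="any-key")

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How many R in STRAWBERRY?"}],
    stream=True,
    # `reasoning_effort` is not a standard chat-completions field, so it is
    # passed through the client's extra_body escape hatch.
    extra_body={"reasoning_effort": "medium"},
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```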
Proxies requests to list available models from the underlying LLM provider configured via `OPENAI_API_BASE_URL`.
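For instance, a short sketch against the same assumed local server:

```python
from openai import OpenAI

client = OpenAI(base_url="http://mcts-server:8000/v1", api_key="any-key")

# Forwarded by the server to the provider's /v1/models endpoint.
for model in client.models.list():
    print(model.id)
```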
This project is licensed under the MIT License.
