-
Notifications
You must be signed in to change notification settings - Fork 0
Home
Monte Carlo Tree Search (MCTS) is a heuristic search algorithm that systematically explores a tree of candidate outputs to refine language model responses. Upon receiving an input, the MCTS pipeline generates multiple candidate answers through iterative simulations. In each iteration, the algorithm evaluates and updates these candidates based on feedback, propagating the best scores upward. This process enhances inference by scaling the model's reasoning capabilities, enabling the selection of the optimal response from multiple candidates.
This FastAPI server exposes two endpoints:
| Method | Endpoint | Description |
|---|---|---|
| POST | /v1/chat/completions |
Accepts chat completion requests. The call is wrapped with an MCTS refinement |
| GET | /v1/models |
Proxies a request to the underlying LLM provider’s models endpoint |
During a chat completion call, the server runs an MCTS pipeline that produces iterative updates. Each update includes a dynamic Mermaid diagram and detailed logs of the iteration process. All intermediate responses are combined into a single <details> block. Finally, the final answer is appended at the end using a consistent, structured markdown template.
This project is licensed under the MIT License.