
Conversation

@Kylie-dot-s
Contributor

  • Fix rule_resume.py: Replace invalid eval_details/eval_status with proper EvalDetail fields (status, label, reason)
  • Fix mcp_server.py: Add environment variables to force ThreadPool usage and prevent ProcessPool fd inheritance
    • Set LOCAL_DEPLOYMENT_MODE=true to use ThreadPoolExecutor instead of ProcessPoolExecutor
    • Set TQDM_DISABLE=1 to prevent progress bar output pollution
    • Add stdout/stderr redirection with StringIO for better error handling
    • Fix rule name extraction logic to handle both class and instance types
    • Add defensive checks for Model.prompt_name_map access
    • Change default rule group fallback from 'default' to 'sft' to avoid buggy Resume rules
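The environment-variable fix above can be sketched as follows. This is a minimal illustration, not the actual `mcp_server.py` code; the key point is that the variables must be set before the libraries that read them are imported:

```python
import os

# Must run BEFORE importing dingo/tqdm so the libraries see the values at import time.
os.environ["LOCAL_DEPLOYMENT_MODE"] = "true"  # prefer ThreadPoolExecutor over ProcessPoolExecutor
os.environ["TQDM_DISABLE"] = "1"              # keep tqdm progress bars off stdout/stderr

print(os.environ["LOCAL_DEPLOYMENT_MODE"], os.environ["TQDM_DISABLE"])
```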

Kylie-dot-s and others added 2 commits December 26, 2025 14:13
@gemini-code-assist
Contributor

Summary of Changes

Hello @Kylie-dot-s, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the stability and reliability of the MCP server's communication and Dingo evaluation processes. It addresses critical issues related to stdio output pollution and ProcessPoolExecutor file descriptor inheritance by enforcing specific environment configurations and redirecting output. The changes also streamline the Dingo evaluation setup, improve LLM configuration handling, and refine rule processing, leading to a more robust and predictable execution environment for evaluations.

Highlights

  • Improved Stdio Communication: The MCP server now explicitly sets environment variables (TQDM_DISABLE, LOCAL_DEPLOYMENT_MODE) to prevent tqdm progress bars from polluting stdio and to force the use of ThreadPoolExecutor over ProcessPoolExecutor, mitigating file descriptor inheritance issues.
  • Enhanced Logging and Error Handling: Logging is now conditionally redirected to a file (mcp_server.log) when the MCP_TRANSPORT is stdio, ensuring clean stdout/stderr for JSON-RPC communication. Additionally, stdout/stderr are redirected to StringIO during Dingo execution to capture internal output for better debugging and error handling.
  • Refactored Dingo Evaluation Configuration: The run_dingo_evaluation function has been refactored to use structured dictionaries for dataset, executor, and evaluator configurations, improving clarity and maintainability. It also includes a new SELF_CONTAINED_LLM_EVALUATORS set to handle LLM evaluators that manage their own prompts.
  • Rule and LLM Configuration Fixes: The PR fixes rule name extraction logic to correctly handle both class and instance types of rules, changes the default rule group fallback from 'default' to 'sft' to avoid buggy Resume rules, and adds defensive checks for Model.prompt_name_map access.
  • Forced Single-Threaded Execution: The max_workers setting is now explicitly forced to 1 during Dingo execution to ensure ThreadPool usage and further prevent ProcessPool related issues, especially with file descriptor inheritance.
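The logging redirection and StringIO capture described above can be sketched like this. The `run_dingo` stand-in below is hypothetical (it replaces the real Dingo executor call), and the `mcp_server.log` path is taken from the summary:

```python
import io
import logging
import os
from contextlib import redirect_stdout, redirect_stderr

# When speaking JSON-RPC over stdio, log to a file so stdout stays clean.
if os.environ.get("MCP_TRANSPORT", "stdio") == "stdio":
    logging.basicConfig(filename="mcp_server.log", level=logging.INFO)

def run_dingo(_config):
    # Stand-in for the real Dingo executor, which may print to stdout.
    print("tqdm noise that would corrupt JSON-RPC")
    return {"score": 1.0}

# Capture anything Dingo writes, so it can be surfaced in error reports
# instead of polluting the protocol stream.
buf_out, buf_err = io.StringIO(), io.StringIO()
with redirect_stdout(buf_out), redirect_stderr(buf_err):
    result = run_dingo(None)

captured = buf_out.getvalue()
```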


@gemini-code-assist (bot) left a comment


Code Review

This pull request effectively addresses stdio communication issues in mcp_server.py by forcing single-threaded execution and redirecting output streams to prevent interference. It also introduces several valuable refactorings and robustness improvements, such as adding defensive checks, improving LLM configuration handling, and making rule processing more flexible. The changes align well with the PR description and significantly enhance the server's stability and maintainability. I have a few suggestions to further improve the code, mainly regarding a performance concern and some minor code duplication.

Comment on lines +806 to +807
input_data["max_workers"] = 1
log.info(f"Forced max_workers=1 to use ThreadPool mode (avoiding ProcessPool fd inheritance)")


Severity: high

This change unconditionally forces max_workers to 1 to solve stdio communication issues. While this is effective for stdio mode, it also impacts other transport modes like sse, potentially limiting performance unnecessarily.

Consider making this change conditional on the transport mode. This would solve the stdio issue without affecting other deployment scenarios.

Suggested change
-input_data["max_workers"] = 1
-log.info(f"Forced max_workers=1 to use ThreadPool mode (avoiding ProcessPool fd inheritance)")
+if os.environ.get("MCP_TRANSPORT", "stdio") == "stdio":
+    input_data["max_workers"] = 1
+    log.info(f"Forced max_workers=1 to use ThreadPool mode (avoiding ProcessPool fd inheritance)")
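The transport-conditional behavior suggested above can be tried in isolation. The helper name `effective_max_workers` is illustrative, not part of the codebase:

```python
import os

def effective_max_workers(requested: int) -> int:
    # Force single-threaded execution only for stdio transport, where
    # ProcessPool fd inheritance would corrupt the JSON-RPC stream.
    if os.environ.get("MCP_TRANSPORT", "stdio") == "stdio":
        return 1
    return requested

os.environ["MCP_TRANSPORT"] = "sse"
print(effective_max_workers(8))  # prints 8: sse keeps the requested parallelism
os.environ["MCP_TRANSPORT"] = "stdio"
print(effective_max_workers(8))  # prints 1: stdio is forced single-threaded
```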

Comment on lines +654 to +672
if isinstance(rule_group_data, list):
rules_iterable = rule_group_data
elif hasattr(rule_group_data, 'rules'):
rules_iterable = rule_group_data.rules
else:
rules_iterable = []
log.warning(f"Unknown structure for rule group '{rule_group}'")

# 2. Smart name extraction (fixes the 'type' issue)
evals_list = []
for rule in rules_iterable:
# If rule is a class (type), take __name__ directly
# If rule is an instance, take __class__.__name__
if isinstance(rule, type):
r_name = rule.__name__
else:
r_name = rule.__class__.__name__

evals_list.append({"name": r_name})


Severity: medium

The logic here for iterating through rule_group_data and handling both class and instance types is very similar to the logic in the _get_rule_details_logic function (lines 877-907). To improve maintainability and follow the DRY (Don't Repeat Yourself) principle, consider extracting this shared iteration logic into a common helper function. This would centralize the handling of different rule group structures.
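One way to extract the shared iteration logic, as suggested. The helper names `iter_rules` and `rule_names` are hypothetical, sketched here for illustration:

```python
def iter_rules(rule_group_data):
    """Yield rules from either a plain list or an object exposing .rules."""
    if isinstance(rule_group_data, list):
        yield from rule_group_data
    elif hasattr(rule_group_data, "rules"):
        yield from rule_group_data.rules
    # Unknown structures yield nothing; the caller can log a warning.

def rule_names(rule_group_data):
    # A class contributes __name__; an instance contributes its class's name.
    return [r.__name__ if isinstance(r, type) else r.__class__.__name__
            for r in iter_rules(rule_group_data)]
```

Both call sites (the evaluation setup and `_get_rule_details_logic`) could then delegate to these helpers instead of duplicating the type checks.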

log.info(f"Found {len(evals_list)} rules in group '{rule_group}'")
else:
# Fallback to common rules
evals_list = [{"name": "RuleColonEnd"}, {"name": "RuleContentNull"}, {"name": "RuleDocRepeat"}]


Severity: medium

The list of fallback rules [{"name": "RuleColonEnd"}, {"name": "RuleContentNull"}, {"name": "RuleDocRepeat"}] is hardcoded here and also in the except block on line 681. To improve maintainability and avoid duplication, consider defining this list as a constant at the top of the file, for example: FALLBACK_RULES = [...].
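A sketch of the suggested constant. The `fallback_evals` accessor is an optional extra (not in the suggestion) that guards against callers mutating the shared list:

```python
# Module-level constant, as suggested; placement at the top of the file is illustrative.
FALLBACK_RULES = [
    {"name": "RuleColonEnd"},
    {"name": "RuleContentNull"},
    {"name": "RuleDocRepeat"},
]

def fallback_evals():
    # Return a shallow copy so call sites cannot mutate the shared constant.
    return [dict(r) for r in FALLBACK_RULES]
```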

Kylie-dot-s and others added 3 commits December 26, 2025 14:28
- Add noqa: E402 comments to imports that must stay after os.environ setup
- Fix E261: Add proper spacing before inline comments
- Fix F841: Remove unused variable 'inner_e' in exception handler
- Fix E115: Correct indentation of FIX END comment
- Keep noqa: E402 comments for imports after os.environ setup
- Maintain proper import order and formatting
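The `noqa: E402` pattern from these commits looks roughly like the following. The `json` import stands in for the real post-env-setup imports in `mcp_server.py`:

```python
"""Env vars must be set before heavy imports, so E402 is silenced deliberately."""
import os

os.environ.setdefault("TQDM_DISABLE", "1")
os.environ.setdefault("LOCAL_DEPLOYMENT_MODE", "true")

import json  # noqa: E402  (import intentionally placed after os.environ setup)

config = json.loads('{"max_workers": 1}')
```

Without the `noqa` comment, flake8 flags any module-level import that follows executable code as E402, even when the ordering is required for correctness.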
