
[BUG] 404 Not Found on evaluate() with Azure OpenAI Graders - Possible Service/SDK Incompatibility #44763

@JavierAvellaEmpire

Description

I am encountering a persistent 404 Not Found error when attempting to run the Azure OpenAI Graders workflow, both via the Python SDK and the Azure AI Foundry (Portal) GUI. This occurs even when using the official sample dataset and following the lab documentation.

1. Environment & Setup

  • Notebook Source: scenarios/evaluate/Azure_OpenAI_Graders/Azure_OpenAI_Graders.ipynb

  • Python Version: 3.10 (Miniconda)

  • Libraries: azure-ai-evaluation, openai (latest)

  • Region: canadaeast

  • Models Tested: gpt-4o, gpt-4.1-nano, o3
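Because "latest" resolves to different releases over time, the exact installed versions are useful for reproducing this. A minimal sketch for collecting them, using only the standard library; `report_versions` is a hypothetical helper name, and the package names are the ones listed above:

```python
from importlib import metadata
import platform


def report_versions(packages):
    """Return a mapping of package name to installed version (or a marker)."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions(["azure-ai-evaluation", "openai"]).items():
        print(f"{name}=={version}")
```

Pasting the printed lines into the issue pins the report to specific releases.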

2. Identified Issues

A. Missing Module / Deprecation

The notebook initially fails on the following import:

Python
from openai.types.eval_string_check_grader import EvalStringCheckGrader

Error: Module not found. I had to remove this line to proceed, indicating a possible breaking change in the recent openai library versions that the notebook has not yet accounted for.
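Instead of deleting the import, a guarded lookup can show which module path, if any, the installed openai release provides. The sketch below is illustrative only: `first_importable` is a hypothetical helper, and the second candidate (`openai.types.graders.StringCheckGrader`) is an assumption about where newer releases may expose the type, to be checked against the installed version.

```python
import importlib


def first_importable(candidates):
    """Return the first (module, attribute) pair that resolves, else None."""
    for module_path, attr_name in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # Covers ModuleNotFoundError as well; try the next candidate.
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


# First entry is the notebook's original import.
# Second entry is an ASSUMED newer location, not confirmed by this report.
CANDIDATES = [
    ("openai.types.eval_string_check_grader", "EvalStringCheckGrader"),
    ("openai.types.graders", "StringCheckGrader"),
]

if __name__ == "__main__":
    print("resolved:", first_importable(CANDIDATES))
```

If this prints `resolved: None`, neither path exists in the installed release, which would itself be worth recording here.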

B. Execution Failure (404 Error)

When calling the evaluate function, the process fails with an openai.NotFoundError: Error code: 404.

Code Snippet:

Python
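from azure.ai.evaluation import evaluate

# fname and the three graders are defined earlier in the notebook.
evaluation = evaluate(
    data=fname,
    evaluators={
        "label": label_grader,
        "string": string_grader,
        "similarity": sim_grader,
    },
)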

Traceback Highlights:

The error originates deep within the openai base client during the evaluation group creation:

Plaintext
File "/opt/miniconda/lib/python3.10/site-packages/azure/ai/evaluation/_evaluate/_evaluate_aoai.py", line 137, in _begin_single_aoai_evaluation
    eval_group_info = client.evals.create(
File "/opt/miniconda/lib/python3.10/site-packages/openai/resources/evals/evals.py", line 109, in create
    return self._post(...)
openai.NotFoundError: Error code: 404
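The traceback shows the status code but not the URL that was requested. A minimal sketch for recovering that from the exception follows. It assumes the error exposes `status_code` and an httpx-style `response`, as `openai.APIStatusError` subclasses do; `find_http_error` and `describe_http_error` are hypothetical helper names, and the cause-chain walk is there in case the SDK wraps the original error.

```python
def find_http_error(exc):
    """Walk the exception chain and return the first error carrying a response."""
    while exc is not None:
        if getattr(exc, "response", None) is not None:
            return exc
        exc = exc.__cause__ or exc.__context__
    return None


def describe_http_error(exc):
    """Summarise an HTTP-status exception; missing attributes become None."""
    response = getattr(exc, "response", None)
    request = getattr(response, "request", None)
    url = getattr(request, "url", None)
    return {
        "type": type(exc).__name__,
        "status_code": getattr(exc, "status_code", None),
        "method": getattr(request, "method", None),
        "url": str(url) if url is not None else None,
        "body": getattr(response, "text", None),
    }


# Intended use around the failing call (not executed here):
#     try:
#         evaluation = evaluate(...)
#     except Exception as exc:
#         http_error = find_http_error(exc)
#         print(describe_http_error(http_error) if http_error else repr(exc))
```

The printed URL and response body would show whether the 404 comes from the path, the API version, or the resource itself.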

3. Troubleshooting Steps Taken

To rule out configuration errors, I performed the following:

  • Infrastructure: Completely re-deployed the Azure AI project and resources.

  • Credentials: Verified API keys, endpoints, and environment variables (all correct).

  • Model Variety: Switched between GPT-4o, GPT-4.1-nano, and o3 deployments; all returned the same 404.

  • Portal Testing: Attempted the same evaluation using the Azure AI Foundry GUI (Match Criteria tool) with the official data.jsonl.

  • Portal Result: The run starts and abruptly ends with EvaluationException('Error code: 404') and INFO:__main__:RUN DOES NOT EXIST.

4. Summary of Observations

Attempt | Method | Result
-- | -- | --
Local Notebook | Python SDK | openai.NotFoundError: 404
Azure AI Foundry Portal | GUI / Match Criteria | EvaluationException: 404

Expected Behavior

The evaluate() function should communicate with the Azure OpenAI evaluation back-end, create the evaluation run, and return the metrics.

Actual Behavior

The back-end returns a 404, implying that the endpoint the SDK is attempting to hit (client.evals.create) is either missing, incorrectly constructed, or not supported in the current region/version.

Metadata

Assignees

No one assigned

    Labels

      • Evaluation: Issues related to the client library for Azure AI Evaluation

      • OpenAI

      • Service Attention: Workflow: This issue is responsible by Azure service team.

      • customer-reported: Issues that are reported by GitHub users external to the Azure organization.

      • needs-team-attention: Workflow: This issue needs attention from Azure service team or SDK team

      • question: The issue doesn't require a change to the product in order to be resolved. Most issues start as that
