Conversation

@nickmisasi (Collaborator) commented Dec 15, 2025

Summary

  • Adds CI workflow to build and push the MCP server Docker image to Docker Hub on release tags (a rough sketch of this job is included below)
  • Adds multi-stage Dockerfile for standalone MCP server container (linux/amd64 and linux/arm64)
  • Adds make mcp-server-docker target for local builds
  • Adds documentation for Docker-based deployment
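
For reviewers, a rough sketch of the shape that job takes, assuming the usual QEMU/Buildx multi-arch setup; the image name, secret names, and Dockerfile path here are placeholders, not the actual values in this PR:

```yaml
# Hypothetical outline only; the real workflow in this PR may differ in details.
on:
  push:
    tags: ["v*"]

jobs:
  docker-mcp-server-build-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # QEMU + Buildx enable the linux/amd64 and linux/arm64 builds listed above.
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}   # secret names are assumptions
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: ci/extract-version
        id: version
        run: echo "version=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile.mcp-server               # placeholder path
          platforms: linux/amd64,linux/arm64
          push: true
          tags: example/mcp-server:${{ steps.version.outputs.version }}   # placeholder repo
```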

Test plan

  • Verify local build with make mcp-server-docker
  • Test STDIO mode with Claude Code/Desktop
  • Test HTTP mode connectivity

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@github-actions

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 19/21 tests passed (90.5%)

| Provider | Total | Passed | Failed | Pass Rate |
| --- | --- | --- | --- | --- |
| ⚠️ OPENAI | 21 | 19 | 2 | 90.5% |

❌ Failed Evaluations


OPENAI

1. TestChannelSummarization/[openai]_channel_summarization_developers_webapp_channel

  • Score: 0.00
  • Rubric: mentions @claudio.costa is working on adding code coverage tracking to the monorepo
  • Reason: The output mentions @claudio.costa is tracking code coverage and sharing a PR, but it does not mention adding coverage tracking to the monorepo specifically.

2. TestChannelSummarization/[openai]_channel_summarization_developers_webapp_channel

  • Score: 0.00
  • Rubric: mentions claudio and harrison discussing exactly what should be tracked for code coverage
  • Reason: While the output mentions Claudio and Harrison discussing coverage-related topics (snapshot tests and exploring E2E via Playwright) and a focus on the webapp, it does not state that they discussed exactly what should be tracked for code coverage. Therefore, it does not meet the rubric requirement.

Anthropic

⚠️ Overall: 13/14 tests passed (92.9%)

| Provider | Total | Passed | Failed | Pass Rate |
| --- | --- | --- | --- | --- |
| ⚠️ ANTHROPIC | 14 | 13 | 1 | 92.9% |

❌ Failed Evaluations


ANTHROPIC

1. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output lists a mix of bugs, UX defects, and a feature gap explicitly labeled 'Not a Bug', so it is not strictly a list of bugs.

Azure OpenAI

Overall: 21/21 tests passed (100.0%)

| Provider | Total | Passed | Failed | Pass Rate |
| --- | --- | --- | --- | --- |
| ✅ AZURE | 21 | 21 | 0 | 100.0% |

Mistral

Overall: 19/19 tests passed (100.0%)

| Provider | Total | Passed | Failed | Pass Rate |
| --- | --- | --- | --- | --- |
| ✅ MISTRAL | 19 | 19 | 0 | 100.0% |

AWS Bedrock

Overall: 25/25 tests passed (100.0%)

| Provider | Total | Passed | Failed | Pass Rate |
| --- | --- | --- | --- | --- |
| ✅ BEDROCK | 25 | 25 | 0 | 100.0% |

This comment was automatically generated by the eval CI pipeline.


@NARSimoes left a comment


I'm just adding a few points for discussion - overall this looks good.

runs-on: ubuntu-latest
if: startsWith(github.ref, 'refs/tags/v')
steps:
- uses: actions/checkout@v4


I'd consider pinning each action to a specific commit SHA for better reproducibility. For example, instead of actions/checkout@v4 we can use something like actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1. The same applies to the other steps.
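
For illustration, the pinned form would look like this; the checkout SHA/version pair is the one quoted above, while the second line uses a placeholder SHA rather than a real pin:

```yaml
    steps:
      # Immutable pin; the trailing comment keeps the human-readable version.
      - uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
      # Apply the same pattern to every other action (placeholder SHA below).
      - uses: docker/setup-buildx-action@<full-commit-sha> # vX.Y.Z
```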

- name: ci/extract-version
id: version
run: echo "version=${GITHUB_REF#refs/tags/v}" >> $GITHUB_OUTPUT


What do you think about adding a step to scan the Dockerfile & images on pull requests and before pushing to the repository? That way we can catch issues earlier in CI, before the images are published. A rough sketch of such a step is below.
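
A minimal sketch of what an image-scan step could look like, assuming Trivy via aquasecurity/trivy-action; the action version and the mcp-server:ci tag are assumptions, not something this PR defines:

```yaml
      # Assumes the image was built and loaded locally in a previous step
      # and tagged mcp-server:ci (hypothetical tag).
      - name: ci/scan-image
        uses: aquasecurity/trivy-action@0.24.0
        with:
          image-ref: mcp-server:ci
          format: table
          exit-code: "1"              # fail the job on findings
          severity: CRITICAL,HIGH
```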

docker-mcp-server-build-push:
runs-on: ubuntu-latest
if: startsWith(github.ref, 'refs/tags/v')


What do you think about breaking this down so the Dockerfile is also validated in pull requests? For example, something like:

  • On pull requests: checkout -> build -> scan / lint / test the Dockerfile & images
  • On tags matching refs/tags/v: push the images (see the sketch below)
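
A rough sketch of that split, assuming a single workflow triggered on both pull requests and tags; the job names, Dockerfile path, lint tooling, and image tag are illustrative only:

```yaml
on:
  pull_request:
  push:
    tags: ["v*"]

jobs:
  # Pull requests: lint the Dockerfile and build the image locally, never push.
  docker-mcp-server-validate:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hadolint/hadolint-action@v3.1.0        # example Dockerfile linter
        with:
          dockerfile: ./Dockerfile.mcp-server        # placeholder path
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile.mcp-server
          push: false
          load: true                                 # keep the image locally for scanning
          tags: mcp-server:ci

  # Release tags: build the multi-arch images and push them.
  docker-mcp-server-build-push:
    if: startsWith(github.ref, 'refs/tags/v')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ...same build steps as the PR job, plus platforms and push: true
```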
