151 changes: 151 additions & 0 deletions .github/workflows/api-monitoring.yml
@@ -0,0 +1,151 @@
name: Live API Monitoring

on:
  schedule:
    # Run daily at midnight UTC
    - cron: '0 0 * * *'
  workflow_dispatch:
    # Allow manual triggering from GitHub UI

jobs:
  monitor-apis:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write
      actions: read

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run API monitoring tests
        id: monitoring
        env:
          MAPBOX_ACCESS_TOKEN: ${{ secrets.MAPBOX_ACCESS_TOKEN }}
          RUN_API_MONITORING: 'true'
        run: |
          npm test -- test/integration/live-api-monitoring.test.ts
        continue-on-error: true

      - name: Check for failures
        id: check_failures
        run: |
          if [ -d "test/failures" ] && [ "$(ls -A test/failures)" ]; then
            echo "has_failures=true" >> $GITHUB_OUTPUT
            echo "Failures detected in test/failures/"
            ls -la test/failures/
          else
            echo "has_failures=false" >> $GITHUB_OUTPUT
            echo "No failures detected"
          fi

      - name: Upload failure artifacts
        if: steps.check_failures.outputs.has_failures == 'true'
        uses: actions/upload-artifact@v4
        with:
          name: api-monitoring-failures-${{ github.run_number }}
          path: test/failures/
          retention-days: 30

      - name: Create GitHub issue on failure
        if: steps.check_failures.outputs.has_failures == 'true'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const path = require('path');

            // Read all failure files
            const failuresDir = 'test/failures';
            const files = fs.readdirSync(failuresDir);

            let failures = [];
            for (const file of files) {
              const content = fs.readFileSync(path.join(failuresDir, file), 'utf8');
              failures.push(JSON.parse(content));
            }

            // Group failures by tool
            const byTool = {};
            for (const failure of failures) {
              if (!byTool[failure.tool]) {
                byTool[failure.tool] = [];
              }
              byTool[failure.tool].push(failure);
            }

            // Build issue body
            let body = `## API Schema Validation Failures Detected\n\n`;
            body += `**Date:** ${new Date().toISOString()}\n`;
            body += `**Total Failures:** ${failures.length}\n`;
            body += `**Workflow Run:** https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n\n`;

            body += `### Summary by Tool\n\n`;
            for (const [tool, toolFailures] of Object.entries(byTool)) {
              body += `- **${tool}**: ${toolFailures.length} failure(s)\n`;
            }

            body += `\n### Failure Details\n\n`;
            for (const [tool, toolFailures] of Object.entries(byTool)) {
              body += `#### ${tool}\n\n`;
              for (const failure of toolFailures) {
                body += `**Query:** \`${JSON.stringify(failure.query)}\`\n`;
                body += `**Error:** ${failure.error.split('\n')[0]}\n\n`;
              }
            }

            body += `\n### Next Steps\n\n`;
            body += `1. Download the failure artifacts from the workflow run\n`;
            body += `2. Review the full API responses in the JSON files\n`;
            body += `3. Update the relevant output schemas to handle the new response format\n`;
            body += `4. Consider if this is a breaking API change that needs escalation\n\n`;
            body += `### Artifacts\n\n`;
            body += `Detailed failure responses are available in the workflow artifacts: [api-monitoring-failures-${{ github.run_number }}](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})\n`;

            // Check if an issue already exists for today
            const today = new Date().toISOString().split('T')[0];
            const existingIssues = await github.rest.issues.listForRepo({
              owner: context.repo.owner,
              repo: context.repo.repo,
              state: 'open',
              labels: 'api-monitoring',
              per_page: 10
            });

            const todayIssue = existingIssues.data.find(issue =>
              issue.title.includes(today)
            );

            if (todayIssue) {
              // Update existing issue
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: todayIssue.number,
                body: `### Additional Failures Detected\n\n${body}`
              });
            } else {
              // Create new issue
              await github.rest.issues.create({
                owner: context.repo.owner,
                repo: context.repo.repo,
                title: `[API Monitor] Schema validation failures - ${today}`,
                body: body,
                labels: ['api-monitoring', 'schema-validation', 'needs-triage']
              });
            }

      - name: Comment on success
        if: steps.check_failures.outputs.has_failures == 'false'
        run: |
          echo "✅ All API monitoring tests passed. Schemas are in sync with current APIs."
3 changes: 3 additions & 0 deletions .gitignore
@@ -151,3 +151,6 @@ dist

# Test results
test-results.xml

# API monitoring failures
test/failures/
168 changes: 168 additions & 0 deletions test/integration/README.md
@@ -0,0 +1,168 @@
# Integration Tests

This directory contains integration tests that make real API calls to external services.

## Live API Monitoring

**File:** `live-api-monitoring.test.ts`

### Purpose

Detects schema drift and breaking changes in upstream Mapbox APIs by:

- Making real API calls daily
- Validating responses against our output schemas
- Automatically reporting failures via GitHub issues
- Saving problematic responses for analysis

### Why This Exists

Mapbox APIs don't follow strict semantic versioning, and responses can change without notice. This monitoring system provides early warning when schemas drift, preventing production failures.

### How It Works

**Daily Schedule:**

1. GitHub Actions runs at midnight UTC
2. Tests call Mapbox APIs with various queries
3. Responses are validated against Zod schemas
4. Failures are saved to `test/failures/`
5. A GitHub issue is created with details
6. Artifacts are uploaded for investigation
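
Steps 3 and 4 revolve around a failure record that is written to disk as JSON. The following is a minimal sketch of what such a record and its file name might look like. The `ValidationFailure` fields are inferred from the example later in this README, while `buildFailure`, `failureFileName`, and the naming scheme are hypothetical helpers used only for illustration.

```typescript
// Shape inferred from the fields used in the README example; treat as an assumption.
interface ValidationFailure {
  tool: string;
  query: unknown;
  error: string;
  response: unknown;
  timestamp: string; // ISO 8601
}

// Build a failure record for one query whose response did not match the schema.
function buildFailure(
  tool: string,
  query: unknown,
  error: string,
  response: unknown,
  now: Date = new Date()
): ValidationFailure {
  return { tool, query, error, response, timestamp: now.toISOString() };
}

// Derive a file name for test/failures/ (hypothetical naming scheme).
function failureFileName(failure: ValidationFailure): string {
  const stamp = failure.timestamp.replace(/[:.]/g, '-');
  return `${failure.tool}-${stamp}.json`;
}

const exampleFailure = buildFailure(
  'my_new_tool',
  { param: 'value1' },
  'Expected string, received number',
  { id: 42 },
  new Date('2025-01-01T00:00:00.000Z')
);
const exampleFileName = failureFileName(exampleFailure);
// This JSON string is what would be written to the failure file.
const exampleJson = JSON.stringify(exampleFailure, null, 2);
```

Keeping the full response in the record is what makes the uploaded artifact useful: the issue only shows the first line of the error, while the JSON file preserves the payload that triggered it.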

**Skipped in Regular CI:**

- Monitoring suites are wrapped in `describe.skipIf(skipInCI)`, so regular CI runs skip them and avoid API rate limits
- They run only when the `RUN_API_MONITORING=true` environment variable is set
- GitHub Actions workflow sets this automatically
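
One way the skip flag could be derived from the environment is sketched below. This is an assumption about the wiring, not a copy of the test file: `shouldSkipMonitoring` is a hypothetical helper, and the only facts taken from this README are the `skipInCI` name and the `RUN_API_MONITORING=true` opt-in.

```typescript
type Env = Record<string, string | undefined>;

// Assumption: monitoring is skipped unless RUN_API_MONITORING is exactly 'true'.
function shouldSkipMonitoring(env: Env): boolean {
  return env['RUN_API_MONITORING'] !== 'true';
}

// In the test file this might be wired up as:
//   const skipInCI = shouldSkipMonitoring(process.env);
const skippedByDefault = shouldSkipMonitoring({});
const enabledInWorkflow = shouldSkipMonitoring({ RUN_API_MONITORING: 'true' });
```

Making the check opt-in rather than opt-out means a missing or misspelled variable results in skipped tests instead of unexpected live API calls.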

### Running Locally

```bash
# Run API monitoring tests (requires MAPBOX_ACCESS_TOKEN in your environment, as in the workflow)
RUN_API_MONITORING=true npm test -- test/integration/live-api-monitoring.test.ts

# Check for failures
ls test/failures/
```

### Adding New Monitored Tools

1. Import the tool and its output schema
2. Create a new `describe.skipIf(skipInCI)` block
3. Define representative test queries/inputs
4. Follow the existing pattern for validation and failure handling

Example:

```typescript
describe.skipIf(skipInCI)('MyNewTool', () => {
  const tool = new MyNewTool({ httpRequest });

  const testInputs = [{ param: 'value1' }, { param: 'value2' }];

  it('should handle current API responses', async () => {
    const failures: ValidationFailure[] = [];

    for (const input of testInputs) {
      try {
        const result = await tool.run(input);

        if (result.isError) {
          failures.push({
            tool: 'my_new_tool',
            query: input,
            error: 'Tool returned isError=true',
            response: result,
            timestamp: new Date().toISOString()
          });
          continue;
        }

        const validation = MyToolOutputSchema.safeParse(
          result.structuredContent
        );

        if (!validation.success) {
          const failure: ValidationFailure = {
            tool: 'my_new_tool',
            query: input,
            error: validation.error.message,
            response: result.structuredContent,
            timestamp: new Date().toISOString()
          };
          failures.push(failure);

          await fs.writeFile(
            path.join(failuresDir, `mytool-${Date.now()}.json`),
            JSON.stringify(failure, null, 2)
          );
        }
      } catch (error) {
        // Handle unexpected errors
        failures.push({
          tool: 'my_new_tool',
          query: input,
          error: String(error),
          response: null,
          timestamp: new Date().toISOString()
        });
      }
    }

    if (failures.length > 0) {
      console.error('Failures:', failures);
    }

    expect(failures).toHaveLength(0);
  }, 60000);
});
```

### Responding to Failures

When a GitHub issue is created:

1. **Download artifacts** from the workflow run
2. **Review failure JSON files** to see actual API responses
3. **Identify the schema mismatch:**
   - New field added by API?
   - Field type changed?
   - Field removed?
4. **Update the output schema** to handle the new format
5. **Test locally** with the saved failure response
6. **Create PR** with schema fix
7. **Close the monitoring issue** once fixed
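
For step 3, a quick shape comparison can help sort a mismatch into the three cases listed (added, type changed, removed). The sketch below is a hypothetical helper, not part of the test suite: it compares only top-level fields using `typeof`, and the field names in the sample data are invented for illustration.

```typescript
type Shape = Record<string, string>; // field name -> coarse type name

interface ShapeDiff {
  added: string[];   // present in the response, absent from the expected shape
  removed: string[]; // expected, but missing from the response
  changed: string[]; // present in both, with a different coarse type
}

function has(shape: Shape, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(shape, key);
}

// Reduce a saved response to its top-level field types.
function describeShape(value: Record<string, unknown>): Shape {
  const shape: Shape = {};
  for (const key of Object.keys(value)) {
    const v = value[key];
    shape[key] = v === null ? 'null' : Array.isArray(v) ? 'array' : typeof v;
  }
  return shape;
}

function diffShape(expected: Shape, actual: Shape): ShapeDiff {
  return {
    added: Object.keys(actual).filter((k) => !has(expected, k)),
    removed: Object.keys(expected).filter((k) => !has(actual, k)),
    changed: Object.keys(expected).filter(
      (k) => has(actual, k) && expected[k] !== actual[k]
    )
  };
}

// Invented sample data: one field added, one removed, one with a new type.
const expectedShape: Shape = { name: 'string', distance: 'number', tags: 'array' };
const savedResponse = { name: 'Cafe', distance: '120m', rating: 4.5 };
const shapeDiff = diffShape(expectedShape, describeShape(savedResponse));
```

The Zod error message in the failure file remains the authoritative description of the mismatch; a diff like this is only a first pass for triage.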

### GitHub Actions Workflow

**File:** `.github/workflows/api-monitoring.yml`

**Triggers:**

- Daily at midnight UTC (cron schedule)
- Manual dispatch from GitHub UI

**What It Does:**

- Runs monitoring tests
- Uploads failure artifacts
- Creates GitHub issues with labels: `api-monitoring`, `schema-validation`, `needs-triage`
- Updates existing issues if multiple failures occur in one day
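
The issue-handling logic can be sketched outside the workflow as two small pure functions: one that counts failures per tool for the summary, and one that looks for an open issue whose title already contains today's date. This is a simplified TypeScript restatement of the inline script in the workflow under stated assumptions: `FailureRecord`, `IssueRef`, and the function names are hypothetical, and the GitHub API calls are left out.

```typescript
interface FailureRecord {
  tool: string;
  error: string;
}

interface IssueRef {
  number: number;
  title: string;
}

// Count failures per tool, as in the "Summary by Tool" section of the issue body.
function countByTool(failures: FailureRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const failure of failures) {
    counts[failure.tool] = (counts[failure.tool] || 0) + 1;
  }
  return counts;
}

// Render one Markdown bullet per tool.
function summaryLines(failures: FailureRecord[]): string[] {
  const counts = countByTool(failures);
  return Object.keys(counts).map(
    (tool) => `- **${tool}**: ${counts[tool]} failure(s)`
  );
}

// Return the open issue whose title contains the UTC date (YYYY-MM-DD), if any.
function findIssueForDate(issues: IssueRef[], date: Date): IssueRef | undefined {
  const day = date.toISOString().slice(0, 10);
  for (const issue of issues) {
    if (issue.title.indexOf(day) !== -1) {
      return issue;
    }
  }
  return undefined;
}

const sampleFailures: FailureRecord[] = [
  { tool: 'tool_a', error: 'missing field' },
  { tool: 'tool_a', error: 'wrong type' },
  { tool: 'tool_b', error: 'missing field' }
];
const summary = summaryLines(sampleFailures);
const openIssues: IssueRef[] = [
  { number: 7, title: '[API Monitor] Schema validation failures - 2025-01-01' }
];
const sameDay = findIssueForDate(openIssues, new Date('2025-01-01T12:00:00Z'));
const nextDay = findIssueForDate(openIssues, new Date('2025-01-02T00:00:00Z'));
```

When a same-day issue is found, the workflow adds a comment to it; otherwise it opens a new issue. Matching on the date in the title keeps repeated manual runs on the same day from opening duplicates.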

**Required Secrets:**

- `MAPBOX_ACCESS_TOKEN` - Valid Mapbox token for API access

### Philosophy

This complements PR #73's non-fatal validation approach:

- **PR #73** makes validation failures non-fatal (resilience)
- **API Monitoring** detects schema drift early (observability)

Together they provide:

- ✅ Resilient production behavior (users get data)
- ✅ Early warning system (maintainers notified)
- ✅ Clear evidence for schema updates (failure artifacts)