A Model Context Protocol (MCP) server that integrates with Supadata for video transcript extraction, web scraping, crawling, and site discovery.
- Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
- Web scraping, crawling, and URL discovery
- Automatic retries and rate limiting
Connect your AI assistant to Supadata's MCP server to enable transcript extraction and web scraping capabilities directly in your workflow.
claude mcp add --transport http supadata https://api.supadata.ai/mcp \
--header "x-api-token: YOUR_SUPADATA_API_TOKEN"
Add to your config file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"supadata": {
"url": "https://api.supadata.ai/mcp",
"headers": {
"x-api-token": "YOUR_SUPADATA_API_TOKEN"
}
}
}
}
Add to .cursor/mcp.json in your project root (or global config):
{
"mcpServers": {
"supadata": {
"url": "https://api.supadata.ai/mcp",
"headers": {
"x-api-token": "YOUR_SUPADATA_API_TOKEN"
}
}
}
}
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"supadata": {
"serverUrl": "https://api.supadata.ai/mcp",
"headers": {
"x-api-token": "YOUR_SUPADATA_API_TOKEN"
}
}
}
}
Add to your VS Code settings.json:
{
"mcp": {
"servers": {
"supadata": {
"url": "https://api.supadata.ai/mcp",
"headers": {
"x-api-token": "YOUR_SUPADATA_API_TOKEN"
}
}
}
}
}
Open Cline settings and add to the MCP Servers configuration:
{
"supadata": {
"url": "https://api.supadata.ai/mcp",
"headers": {
"x-api-token": "YOUR_SUPADATA_API_TOKEN"
}
}
}
Replace YOUR_SUPADATA_API_TOKEN with your API token from supadata.ai.
- SUPADATA_API_KEY: Your Supadata API key
The server includes configurable retry and rate limiting parameters:
const CONFIG = {
retry: {
maxAttempts: 3, // Number of retry attempts
initialDelay: 1000, // Initial delay (milliseconds)
maxDelay: 10000, // Maximum delay between retries (milliseconds)
backoffFactor: 2 // Exponential backoff multiplier
}
};
Select the right tool based on your needs:
- Transcript: Extract video transcripts from platforms and file URLs
- Scrape: Extract content from a single page when you know the exact URL
- Map: Discover all available URLs on a website
- Crawl: Extract content from multiple related pages comprehensively
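The guidance above can be read as a small decision procedure. The sketch below is only an illustration: `chooseTool` and its option names (`isVideo`, `wantsContent`, `knowsExactUrl`) are hypothetical and not part of the server, and the returned strings are simply the lowercase tool names used in this guide.

```javascript
// Hypothetical helper (not part of the Supadata server): maps a described
// need to one of the four tool names from the guide.
function chooseTool({ isVideo = false, wantsContent = true, knowsExactUrl = false } = {}) {
  // Video platforms and media file URLs go to the transcript tool.
  if (isVideo) return "transcript";
  // If only a list of URLs is needed, discovery is enough.
  if (!wantsContent) return "map";
  // Page content: one known page is a scrape, several related pages a crawl.
  return knowsExactUrl ? "scrape" : "crawl";
}

console.log(chooseTool({ knowsExactUrl: true }));
```

In practice an AI assistant makes this choice from the tool descriptions; the function only makes the branching explicit.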
| Tool | Best for | Returns |
|---|---|---|
| transcript | Video transcript extraction | text/markdown |
| scrape | Single page content | markdown/html |
| map | URL discovery on a site | URL[] |
| crawl | Multi-page extraction | markdown/html[] |
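As a side note on the retry settings shown in the configuration section, here is a minimal sketch of how exponential backoff values like these are typically applied. It is not the server's actual implementation: `RETRY`, `retryDelay`, and `withRetry` are hypothetical names, and the sketch assumes `maxAttempts` counts total tries rather than retries after the first failure.

```javascript
// Same values as the retry block in this README; the name RETRY is hypothetical.
const RETRY = { maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 };

// Delay in milliseconds to wait after failed attempt number `attempt` (1-based).
// Grows by backoffFactor each time and is capped at maxDelay.
function retryDelay(attempt, retry = RETRY) {
  const delay = retry.initialDelay * Math.pow(retry.backoffFactor, attempt - 1);
  return Math.min(delay, retry.maxDelay);
}

// Runs `operation`, retrying on failure up to retry.maxAttempts total tries
// (assumption: maxAttempts includes the first try).
async function withRetry(operation, retry = RETRY) {
  let lastError;
  for (let attempt = 1; attempt <= retry.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retry.maxAttempts) {
        // Wait before the next try.
        await new Promise((resolve) => setTimeout(resolve, retryDelay(attempt, retry)));
      }
    }
  }
  throw lastError;
}
```

With these values the waits after the first and second failures are 1000 ms and 2000 ms. The `maxDelay` cap only matters if `maxAttempts` is raised: the uncapped delay would reach 16000 ms after a fifth failure and is limited to 10000 ms.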
Extract transcripts from supported video platforms (YouTube, TikTok, Instagram, Twitter) and file URLs.
Usage:
supadata_transcript --url "https://youtube.com/watch?v=example" --lang "en"
Check the progress of a transcript extraction job using the job ID.
Usage:
supadata_check_transcript_status --id "550e8400-e29b-41d4-a716-446655440000"
Extract content from a single URL with advanced options.
Usage:
supadata_scrape --url "https://example.com" --lang "en"
Discover all indexed URLs on a website to find relevant pages before scraping.
Usage:
supadata_map --url "https://example.com"
Start an asynchronous crawl job to extract content from multiple pages on a site.
Usage:
supadata_crawl --url "https://example.com/blog" --limit 100
Check the progress of a crawl job using the job ID.
Usage:
supadata_check_crawl_status --id "550e8400-e29b-41d4-a716-446655440000"
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
- Fork the repository
- Create your feature branch
- Run tests: npm test
- Submit a pull request
MIT License - see LICENSE file for details