Official Supadata MCP Server - Adds powerful video & web scraping to Cursor, Claude and any other LLM clients.

supadata-ai/mcp

Supadata MCP Server

A Model Context Protocol (MCP) server that integrates with Supadata for video transcript extraction, web scraping, crawling, and site discovery.

Features

  • Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
  • Web scraping, crawling, and URL discovery
  • Automatic retries and rate limiting

Installation

Connect your AI assistant to Supadata's MCP server to enable transcript extraction and web scraping capabilities directly in your workflow.

Claude Code

claude mcp add --transport http supadata https://api.supadata.ai/mcp \
  --header "x-api-token: YOUR_SUPADATA_API_TOKEN"

Claude Desktop

Add to your config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "supadata": {
      "url": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}

Cursor

Add to .cursor/mcp.json in your project root (or global config):

{
  "mcpServers": {
    "supadata": {
      "url": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "supadata": {
      "serverUrl": "https://api.supadata.ai/mcp",
      "headers": {
        "x-api-token": "YOUR_SUPADATA_API_TOKEN"
      }
    }
  }
}

VS Code + Copilot

Add to your VS Code settings.json:

{
  "mcp": {
    "servers": {
      "supadata": {
        "url": "https://api.supadata.ai/mcp",
        "headers": {
          "x-api-token": "YOUR_SUPADATA_API_TOKEN"
        }
      }
    }
  }
}

Cline (VS Code Extension)

Open Cline settings and add to the MCP Servers configuration:

{
  "supadata": {
    "url": "https://api.supadata.ai/mcp",
    "headers": {
      "x-api-token": "YOUR_SUPADATA_API_TOKEN"
    }
  }
}

Replace YOUR_SUPADATA_API_TOKEN with your API token from supadata.ai.
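
The JSON snippets above differ mainly in the key that holds the server URL ("url" for most clients, "serverUrl" for Windsurf). As a rough sketch, the entry could be generated like this. The client labels and the supadataEntry helper are hypothetical names used only for illustration; they are not part of Supadata or of any client.

```typescript
// Labels for this sketch only; they are not identifiers defined by Supadata.
type ClientName = "claude-desktop" | "cursor" | "windsurf";

const SUPADATA_MCP_URL = "https://api.supadata.ai/mcp";

// Hypothetical helper: build the "supadata" entry for a client's mcpServers object.
// Windsurf uses "serverUrl" in the snippets above; the other clients use "url".
function supadataEntry(client: ClientName, token: string): Record<string, unknown> {
  const urlKey = client === "windsurf" ? "serverUrl" : "url";
  return {
    [urlKey]: SUPADATA_MCP_URL,
    headers: { "x-api-token": token },
  };
}

const windsurfConfig = {
  mcpServers: { supadata: supadataEntry("windsurf", "YOUR_SUPADATA_API_TOKEN") },
};
console.log(JSON.stringify(windsurfConfig, null, 2));
```

The printed object matches the Windsurf snippet above. VS Code and Cline nest the entry differently, so this helper only covers the three clients that share the mcpServers layout.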

Configuration

Environment Variables

  • SUPADATA_API_KEY: Your Supadata API key

System Configuration

The server includes configurable retry and rate-limiting parameters. The retry settings are shown below:

const CONFIG = {
  retry: {
    maxAttempts: 3,           // Number of retry attempts
    initialDelay: 1000,       // Initial delay (milliseconds)
    maxDelay: 10000,          // Maximum delay between retries (milliseconds)
    backoffFactor: 2          // Exponential backoff multiplier
  }
};
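
To make the retry settings concrete, here is a minimal sketch of how an exponential backoff delay could be derived from them. The retryDelay helper is hypothetical, and the formula (initialDelay multiplied by backoffFactor for each previous attempt, capped at maxDelay) is the conventional one, assumed here rather than taken from the server's source.

```typescript
// Retry settings copied from the CONFIG block above.
interface RetryConfig {
  maxAttempts: number;
  initialDelay: number; // milliseconds
  maxDelay: number; // milliseconds
  backoffFactor: number;
}

const retry: RetryConfig = {
  maxAttempts: 3,
  initialDelay: 1000,
  maxDelay: 10000,
  backoffFactor: 2,
};

// Hypothetical helper: delay before retry number `attempt` (1-based).
// Assumes delay = initialDelay * backoffFactor^(attempt - 1), capped at maxDelay.
function retryDelay(attempt: number, cfg: RetryConfig): number {
  const raw = cfg.initialDelay * Math.pow(cfg.backoffFactor, attempt - 1);
  return Math.min(raw, cfg.maxDelay);
}

// One delay per attempt allowed by maxAttempts.
const delays = Array.from({ length: retry.maxAttempts }, (_, i) => retryDelay(i + 1, retry));
console.log(delays); // [ 1000, 2000, 4000 ]
```

Under these assumptions the cap only takes effect from the fifth attempt onward (16000 ms is reduced to 10000 ms), so with maxAttempts set to 3 it is never reached.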

How to Choose a Tool

Select the right tool based on your needs:

  • Transcript: Extract video transcripts from platforms and file URLs
  • Scrape: Extract content from a single page when you know the exact URL
  • Map: Discover all available URLs on a website
  • Crawl: Extract content from multiple related pages comprehensively
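
| Tool       | Best for                    | Returns         |
|------------|-----------------------------|-----------------|
| transcript | Video transcript extraction | text/markdown   |
| scrape     | Single page content         | markdown/html   |
| map        | URL discovery on a site     | URL[]           |
| crawl      | Multi-page extraction       | markdown/html[] |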

Available Tools

Transcript (supadata_transcript)

Extract transcripts from supported video platforms (YouTube, TikTok, Instagram, Twitter) and file URLs.

Usage:

supadata_transcript --url "https://youtube.com/watch?v=example" --lang "en"
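
The usage line above is shorthand. An MCP client invokes a tool by sending a JSON-RPC 2.0 tools/call request. The sketch below builds such a request, assuming the argument names (url, lang) mirror the flags shown in the usage line; the server's actual input schema is not reproduced here, and buildToolCall is a hypothetical helper.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names are assumed to match the --url and --lang flags above.
const request = buildToolCall(1, "supadata_transcript", {
  url: "https://youtube.com/watch?v=example",
  lang: "en",
});
console.log(JSON.stringify(request, null, 2));
```

In practice the MCP client (Cursor, Claude, and so on) builds this request for you; the sketch only shows what the shorthand stands for.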

Check Transcript Status (supadata_check_transcript_status)

Check the progress of a transcript extraction job using the job ID.

Usage:

supadata_check_transcript_status --id "550e8400-e29b-41d4-a716-446655440000"

Scrape (supadata_scrape)

Extract content from a single URL with advanced options.

Usage:

supadata_scrape --url "https://example.com" --lang "en"

Map (supadata_map)

Discover all indexed URLs on a website to find relevant pages before scraping.

Usage:

supadata_map --url "https://example.com"

Crawl (supadata_crawl)

Start an asynchronous crawl job to extract content from multiple pages on a site.

Usage:

supadata_crawl --url "https://example.com/blog" --limit 100

Check Crawl Status (supadata_check_crawl_status)

Check the progress of a crawl job using the job ID.

Usage:

supadata_check_crawl_status --id "550e8400-e29b-41d4-a716-446655440000"
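
Because crawl jobs (and some transcript jobs) run asynchronously, a caller typically polls the status tool until the job finishes. The following is a minimal polling sketch. The JobStatus shape and its status values are hypothetical, since this README does not document the response fields, and the check function stands in for whatever call your MCP client makes to supadata_check_crawl_status or supadata_check_transcript_status.

```typescript
// Hypothetical job status shape; the real response fields are not documented here.
type JobStatus = { status: "pending" | "completed" | "failed"; result?: unknown };

// Poll `check` until the job leaves the "pending" state or maxChecks is reached.
async function pollJob(
  check: () => Promise<JobStatus>,
  intervalMs: number,
  maxChecks: number,
): Promise<JobStatus> {
  for (let i = 0; i < maxChecks; i++) {
    const job = await check();
    if (job.status !== "pending") return job;
    // Wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job still pending after ${maxChecks} checks`);
}

// Usage with a fake checker that reports completion on the third call.
let calls = 0;
const fakeCheck = async (): Promise<JobStatus> =>
  ++calls < 3 ? { status: "pending" } : { status: "completed", result: ["page-1", "page-2"] };

pollJob(fakeCheck, 10, 5).then((job) => console.log(job.status, calls)); // completed 3
```

A fixed interval keeps the sketch short; a real caller might prefer a growing interval for long crawls.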

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Run tests: npm test
  4. Submit a pull request

License

MIT License - see LICENSE file for details
