Extract, process, and export YouTube podcast transcripts with AI-powered summaries. Built with Next.js, TypeScript, and yt-dlp. Features speaker detection, deduplication, TXT export format, and multi-LLM summary generation.

shrimpy8/youtube-transcript-processor

YouTube Podcast Transcript Processor

A Next.js application for extracting, processing, and exporting YouTube podcast transcripts with advanced features including speaker detection, deduplication, and TXT/PDF export.

📸 Screenshots

Home Page

Main interface with processing options, favorite channels, and video preview

AI-Powered Episode Summary

Extract transcripts and generate AI summaries with bullet points and timestamp links

Favorite Channels with One-Click Summarize

Save favorite podcast channels, browse episodes, and summarize with any LLM provider

Dark Mode

Full dark mode support with system preference detection

🤖 AI Summary

The application generates AI-powered summaries using 3 LLM providers in 3 styles:

| Provider | Model | Notes |
|---|---|---|
| Anthropic | Claude Sonnet 4.5 | System + user message split (Anthropic best practice), temperature 0.7 |
| Google Gemini | Gemini 2.5 Flash | Single content block, temperature 0.7 |
| Perplexity | Sonar Online | Chat completions format, temperature 0.7 |

| Style | Output | Limit |
|---|---|---|
| Bullets | 10-15 bullet points with YouTube timestamp links | 10-15 bullets |
| Narrative | Flowing essay (Opening, Key Ideas, Practical Takeaways, Closing) | 750-1000 words |
| Technical | Structured extraction (Tools, Workflows, Tips, Metrics) | 2000 words max |

Prompt templates are stored in the prompts/ folder and loaded at runtime. They have gone through multiple iterations of tuning to tighten accuracy and produce quality results across all providers. See prompts/README.md for full details on which files are used by which LLMs and modes.
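
As a rough illustration of the per-provider notes in the table above, the sketch below shapes one prompt and transcript into three different request bodies. The helper name `buildRequestBody`, the `SummaryInput` type, and the `max_tokens` value are assumptions made for this example; the project's actual request code may differ.

```typescript
// Hypothetical helper, not taken from the repository: builds a request body
// for each provider from the same instructions and transcript.
type Provider = "anthropic" | "gemini" | "perplexity";

interface SummaryInput {
  model: string;        // e.g. the value of ANTHROPIC_MODEL
  instructions: string; // prompt template text, assumed already loaded from prompts/
  transcript: string;
}

const TEMPERATURE = 0.7; // same value for all three providers, per the table

function buildRequestBody(provider: Provider, input: SummaryInput): Record<string, unknown> {
  const { model, instructions, transcript } = input;
  switch (provider) {
    case "anthropic":
      // System + user split: instructions go in `system`, transcript in the user turn.
      return {
        model,
        max_tokens: 4096, // assumed value for illustration
        temperature: TEMPERATURE,
        system: instructions,
        messages: [{ role: "user", content: transcript }],
      };
    case "gemini":
      // Single content block: instructions and transcript are concatenated.
      return {
        contents: [{ parts: [{ text: `${instructions}\n\n${transcript}` }] }],
        generationConfig: { temperature: TEMPERATURE },
      };
    case "perplexity":
      // Chat completions format: a list of role-tagged messages.
      return {
        model,
        temperature: TEMPERATURE,
        messages: [
          { role: "system", content: instructions },
          { role: "user", content: transcript },
        ],
      };
  }
}
```

Keeping the shaping in one pure function makes it easy to unit-test each provider's payload without making network calls.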

AI Summary Examples

The ai_summary folder contains example summaries generated by different providers and styles:

  • AI Summary Folder - Contains summaries from Anthropic Sonnet 4.5, Google Gemini 2.5 Flash, and Perplexity Sonar Online

⚡ Performance Optimizations

The application includes comprehensive performance optimizations:

Runtime Optimizations

  • Session-based caching: Channel data is cached in memory for 5 minutes, enabling instant tab switching
  • Request deduplication: Prevents duplicate concurrent API requests
  • Component memoization: React.memo and useMemo prevent unnecessary re-renders
  • Optimized video enrichment: Parallel processing for video metadata fetching
  • Tab persistence: Channel tab stays mounted once viewed for faster subsequent access
  • Debounce & throttle: Optimized user input handling and API calls
  • Lazy loading: Images and heavy components loaded on demand
  • Code splitting: Automatic bundle splitting for optimal loading
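
The first two items in this list (a 5-minute in-memory cache and request deduplication) can be combined in a small wrapper. The following is a minimal sketch under that assumption; the class name `DedupedCache` and its API are hypothetical and are not taken from `channel-cache.ts` or `api-client.ts`.

```typescript
// Hypothetical sketch: in-memory TTL cache that also shares in-flight requests.
const DEFAULT_TTL_MS = 5 * 60 * 1000; // 5 minutes, matching the list above

interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class DedupedCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private inFlight = new Map<string, Promise<T>>();
  private ttlMs: number;

  constructor(ttlMs: number = DEFAULT_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  async get(key: string, fetcher: () => Promise<T>): Promise<T> {
    // 1. Serve a fresh cached value if one exists.
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value;

    // 2. Reuse a request that is already running for this key.
    const pending = this.inFlight.get(key);
    if (pending) return pending;

    // 3. Otherwise start the request, cache the result, and clear the in-flight slot.
    const request = fetcher()
      .then((value) => {
        this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, request);
    return request;
  }
}
```

Two concurrent `get` calls with the same key trigger a single fetch, and later calls within the TTL are served from memory. A failed fetch is not cached, so the next call retries.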

Build Optimizations

  • Bundle optimization: Webpack code splitting with vendor/common chunks
  • Image optimization: AVIF and WebP format support with caching
  • Font optimization: Font display swap for faster rendering
  • Tree shaking: Unused code elimination
  • SWC minification: Fast JavaScript minification

Performance Monitoring

  • Web Vitals tracking: FCP, LCP, FID, CLS, and TTFB monitoring
  • Performance metrics: Page load time, DOM content loaded time
  • Memory usage tracking: JavaScript heap size monitoring
  • Bundle size analysis: Resource size tracking and optimization

🎨 User Interface

The application features a clean, modern interface with:

  • Tabbed interface: Video tab shows preview and transcript, Channel tab shows top 10 videos
  • Real-time processing: Visual feedback during transcript processing
  • Search functionality: Search within transcripts with highlighting
  • Export options: TXT and PDF export with customizable options (metadata, timestamps)
  • Dark mode: Full dark mode support with system preference detection
  • Responsive design: Works seamlessly on mobile, tablet, and desktop
  • Loading skeletons: Smooth loading states for async content
  • Smooth animations: CSS transitions with reduced motion support
  • Micro-interactions: Visual feedback for all user actions
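
To make the "customizable options (metadata, timestamps)" for TXT export concrete, here is a minimal sketch of how such an export could be assembled. The types `TranscriptSegment`, `VideoMetadata`, and `ExportOptions`, the function `exportToTxt`, and the exact output layout are all assumptions for illustration, not the project's actual `ExportControls` behavior.

```typescript
// Hypothetical types and layout, for illustration only.
interface TranscriptSegment {
  start: number;    // seconds from the start of the video
  speaker?: string; // e.g. "Host" or "Guest"
  text: string;
}

interface VideoMetadata {
  title: string;
  channel: string;
  url: string;
}

interface ExportOptions {
  includeMetadata: boolean;
  includeTimestamps: boolean;
}

// Formats seconds as MM:SS, or H:MM:SS for videos longer than an hour.
function formatTimestamp(totalSeconds: number): string {
  const s = Math.floor(totalSeconds);
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const sec = s % 60;
  const pad = (n: number) => String(n).padStart(2, "0");
  return h > 0 ? `${h}:${pad(m)}:${pad(sec)}` : `${pad(m)}:${pad(sec)}`;
}

function exportToTxt(
  meta: VideoMetadata,
  segments: TranscriptSegment[],
  options: ExportOptions
): string {
  const lines: string[] = [];
  if (options.includeMetadata) {
    // Optional header block followed by a blank line.
    lines.push(`Title: ${meta.title}`, `Channel: ${meta.channel}`, `URL: ${meta.url}`, "");
  }
  for (const seg of segments) {
    const stamp = options.includeTimestamps ? `[${formatTimestamp(seg.start)}] ` : "";
    const speaker = seg.speaker ? `${seg.speaker}: ` : "";
    lines.push(`${stamp}${speaker}${seg.text}`);
  }
  return lines.join("\n");
}
```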

♿ Accessibility

The application is built with accessibility in mind:

  • WCAG 2.1 AA compliant: Meets accessibility standards
  • Keyboard navigation: Full keyboard support for all interactions
  • Screen reader support: ARIA labels and semantic HTML
  • Focus management: Proper focus trapping and restoration
  • Color contrast: Meets WCAG contrast requirements
  • Skip links: Quick navigation for keyboard users
  • Reduced motion: Respects user's motion preferences

🚀 Getting Started

For the full setup guide, see docs/SETUP.md.

Environment Setup

Before running the development server, you need to configure your environment variables. Create a .env.local file in the root directory:

# Copy the example and add your API keys
cp .env.example .env.local  # If .env.example exists
# Or create .env.local manually

Add your API keys to .env.local:

# Anthropic API Configuration (Required for AI Summary feature)
ANTHROPIC_API_KEY=sk-ant-your-api-key-here
ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
ANTHROPIC_MODEL_NAME=Anthropic Sonnet 4.5

# Google Gemini API Configuration (Optional)
GOOGLE_GEMINI_API_KEY=your_google_gemini_api_key_here
GOOGLE_GEMINI_MODEL=gemini-2.5-flash
GOOGLE_GEMINI_MODEL_NAME=Google Gemini 2.5 Flash

# Perplexity API Configuration (Optional)
PERPLEXITY_API_KEY=your_perplexity_api_key_here
PERPLEXITY_MODEL=sonar-online
PERPLEXITY_MODEL_NAME=Perplexity Sonar Online

Note: ANTHROPIC_API_KEY is required to use the AI Summary feature; the Google Gemini and Perplexity keys are optional. You can get an Anthropic API key from Anthropic's Console.

For more details, see docs/ENV_VARIABLES.md.
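
One way to read these variables on the server is to treat a provider as enabled only when both its API key and model are set. The sketch below assumes that rule; the function `loadProviders` and the `ProviderConfig` type are hypothetical, and only the variable names come from the `.env.local` example above.

```typescript
// Hypothetical sketch: derive the list of usable providers from environment variables.
interface ProviderConfig {
  id: string;
  apiKey: string;
  model: string;
  displayName: string;
}

// Variable name prefixes as used in the .env.local example.
const PROVIDER_PREFIXES = {
  anthropic: "ANTHROPIC",
  gemini: "GOOGLE_GEMINI",
  perplexity: "PERPLEXITY",
} as const;

function loadProviders(env: Record<string, string | undefined>): ProviderConfig[] {
  const providers: ProviderConfig[] = [];
  for (const [id, prefix] of Object.entries(PROVIDER_PREFIXES)) {
    const apiKey = env[`${prefix}_API_KEY`]?.trim();
    const model = env[`${prefix}_MODEL`]?.trim();
    // Assumption: a provider without a key or model is treated as not configured.
    if (!apiKey || !model) continue;
    providers.push({
      id,
      apiKey,
      model,
      // Fall back to the raw model ID when no display name is set.
      displayName: env[`${prefix}_MODEL_NAME`]?.trim() || model,
    });
  }
  return providers;
}
```

In a Next.js API route this would typically be called as `loadProviders(process.env)`, so the keys stay on the server and never reach the browser.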

Running the Development Server

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open http://localhost:3000 with your browser to see the result.

You can start editing the page by modifying src/app/page.tsx. The page auto-updates as you edit the file.

This project uses next/font to automatically optimize and load Geist, Vercel's font family.

🛠️ Tech Stack

  • Framework: Next.js 15+ (App Router)
  • Language: TypeScript 5+
  • Styling: Tailwind CSS 4+
  • UI Components: shadcn/ui (Radix UI + Lucide Icons)
  • React: 19+

📦 Features

✅ Core Features

  • ✅ YouTube URL validation and parsing (multiple formats)
  • ✅ Transcript processing with deduplication
  • ✅ Automatic speaker detection (Host/Guest)
  • ✅ TXT and PDF export with customizable options
  • ✅ Single video transcript processing
  • ✅ Channel and playlist video browsing
  • ✅ Interactive transcript viewer with search
  • ✅ Real-time processing options with persistence
  • ✅ Channel information display with top 10 videos
  • ✅ AI-powered transcript summaries (Anthropic, Google Gemini, Perplexity)
  • ✅ My Favorite Podcast Channels: save up to 5 channels, browse latest episodes, one-click summarize pipeline
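
As an example of what "URL validation and parsing (multiple formats)" can involve, the sketch below extracts a video ID from a few common YouTube URL shapes. The supported formats (`watch?v=`, `youtu.be/`, `embed/`, `shorts/`, `live/`) and the function name `extractVideoId` are assumptions for this example; the project's `youtube-validator.ts` may accept a different set.

```typescript
// Hypothetical sketch: pull an 11-character video ID out of common YouTube URLs.
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }

  // Normalize www. and m. hosts to the bare domain.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let candidate: string | null = null;

  if (host === "youtu.be") {
    // Short links: https://youtu.be/<id>
    candidate = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Standard links: https://www.youtube.com/watch?v=<id>
      candidate = url.searchParams.get("v");
    } else {
      // Path-based links: /embed/<id>, /shorts/<id>, /live/<id>
      const match = url.pathname.match(/^\/(?:embed|shorts|live)\/([^/]+)/);
      candidate = match?.[1] ?? null;
    }
  }

  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Parsing with the built-in `URL` class rather than a single regular expression keeps extra query parameters such as `&t=42s` from interfering with ID extraction.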

✅ Performance & Optimization

  • ✅ Session-based caching for instant tab switching
  • ✅ Request deduplication to prevent duplicate API calls
  • ✅ Component memoization (React.memo, useMemo, useCallback)
  • ✅ Code splitting and lazy loading
  • ✅ Bundle optimization and tree shaking
  • ✅ Image optimization (AVIF, WebP)
  • ✅ Performance monitoring (Web Vitals tracking)
  • ✅ Debounce and throttle utilities

✅ User Experience

  • ✅ Dark mode support with system preference detection
  • ✅ Responsive mobile design with touch optimization
  • ✅ Loading skeletons for smooth loading states
  • ✅ Smooth animations with reduced motion support
  • ✅ Micro-interactions and visual feedback
  • ✅ Error handling with recovery options
  • ✅ Empty states with helpful messages
  • ✅ Comprehensive error boundaries

✅ Accessibility

  • ✅ WCAG 2.1 AA compliance
  • ✅ Full keyboard navigation support
  • ✅ Screen reader optimization
  • ✅ ARIA labels on all interactive elements
  • ✅ Focus management and trapping
  • ✅ Color contrast compliance
  • ✅ Skip links for quick navigation

🚧 Future Enhancements

  • Server-side persistence (Supabase migration)
  • Advanced speaker identification (ML-based)
  • Multi-language support
  • Browser extension

πŸ—οΈ Project Structure

src/
├── app/                    # Next.js App Router
│   ├── api/               # API routes
│   │   ├── transcript/    # Transcript fetching endpoints
│   │   ├── channel/       # Channel information endpoint
│   │   ├── discover/      # Video discovery endpoint
│   │   └── ai-summary/    # AI summary + config endpoints
│   ├── api-docs/          # Interactive Swagger/OpenAPI docs
│   ├── layout.tsx         # Root layout with theme provider
│   └── page.tsx           # Home page with main UI
├── components/            # React components
│   ├── ui/               # shadcn/ui components
│   │   └── skeleton.tsx  # Loading skeleton component
│   ├── layout/           # Layout components (Header, Footer, Container)
│   ├── features/         # Feature-specific components
│   │   ├── VideoPreview.tsx           # Video metadata and tabs
│   │   ├── ChannelDetails.tsx         # Channel info and top videos
│   │   ├── TranscriptViewer.tsx       # Transcript display with search
│   │   ├── ProcessingOptions.tsx      # Processing configuration
│   │   ├── ExportControls.tsx         # Export functionality
│   │   ├── FavoriteChannels.tsx       # Saved channels with episode list
│   │   ├── SummarizePipelineModal.tsx # Pipeline progress modal
│   │   ├── ErrorDisplay.tsx           # Error display component
│   │   ├── EmptyState.tsx             # Empty state components
│   │   └── RetryButton.tsx            # Retry action component
│   └── ErrorBoundary.tsx # React error boundary
├── lib/                   # Utility functions
│   ├── transcript-processor.ts  # Processing logic
│   ├── ytdlp-service.ts         # yt-dlp integration
│   ├── api-client.ts            # API client with caching
│   ├── channel-cache.ts         # Session-based caching
│   ├── youtube-validator.ts     # URL validation
│   ├── performance-utils.ts     # Performance utilities
│   ├── accessibility-utils.ts   # Accessibility helpers
│   ├── mobile-utils.ts          # Mobile optimization
│   ├── performance-monitor.ts   # Performance monitoring
│   ├── animations.ts            # Animation utilities
│   └── utils.ts                 # General utilities
├── hooks/                 # Custom React hooks
│   ├── useChannelData.ts           # Channel data with caching
│   ├── useTranscriptProcessing.ts  # Transcript processing
│   ├── useProcessingOptions.ts     # Options management
│   ├── useUrlValidation.ts         # URL validation
│   ├── useFavoriteChannels.ts      # Channel CRUD, episode cache, localStorage
│   ├── useUrlDetection.ts          # Channel/playlist URL detection
│   ├── useUrlSubmission.ts         # URL validation and transcript fetching
│   └── useSummarizePipeline.ts     # One-click summarize pipeline orchestration
└── types/                 # TypeScript definitions
    └── index.ts          # Type definitions

🧪 Testing

The project includes comprehensive testing:

  • Unit Tests: Vitest + React Testing Library (80%+ coverage)
  • Integration Tests: API routes and utility functions
  • E2E Tests: Playwright for user flows and cross-browser testing
  • Performance Tests: Web Vitals and bundle size monitoring
  • Accessibility Tests: WCAG compliance and keyboard navigation

Run tests:

npm test              # Unit tests
npm run test:coverage # With coverage report
npm run test:e2e      # E2E tests

📚 Documentation

📝 Learn More

🚒 Deployment

Deploy on Vercel

The easiest way to deploy your Next.js app is to use the Vercel Platform.

Pre-Deployment Checklist

  • Set all required environment variables in Vercel dashboard
  • Ensure yt-dlp binary is available in deployment environment
  • Verify API keys are configured correctly
  • Run npm run build locally to verify build succeeds
  • Run npm run test:e2e to verify E2E tests pass
  • Check bundle size meets performance targets (< 1MB initial JS)

Performance Targets

  • ✅ Page load time < 2 seconds
  • ✅ Lighthouse Performance score > 90
  • ✅ Lighthouse Accessibility score > 95
  • ✅ Bundle size < 1MB initial JavaScript
  • ✅ Memory usage < 100MB typical operations
