A Next.js application for extracting, processing, and exporting YouTube podcast transcripts with advanced features including speaker detection, deduplication, and TXT/PDF export.
Main interface with processing options, favorite channels, and video preview
Extract transcripts and generate AI summaries with bullet points and timestamp links
Save favorite podcast channels, browse episodes, and summarize with any LLM provider
Full dark mode support with system preference detection
The application generates AI-powered summaries using 3 LLM providers in 3 styles:
| Provider | Model | Notes |
|---|---|---|
| Anthropic | Claude Sonnet 4.5 | System + user message split (Anthropic best practice), temperature 0.7 |
| Google Gemini | Gemini 2.5 Flash | Single content block, temperature 0.7 |
| Perplexity | Sonar Online | Chat completions format, temperature 0.7 |
| Style | Output | Limit |
|---|---|---|
| Bullets | 10-15 bullet points with YouTube timestamp links | 10-15 bullets |
| Narrative | Flowing essay (Opening, Key Ideas, Practical Takeaways, Closing) | 750-1000 words |
| Technical | Structured extraction (Tools, Workflows, Tips, Metrics) | 2000 words max |
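The Bullets style attaches YouTube timestamp links to each point. A minimal sketch of how such a link could be built, assuming the standard `watch?v=...&t=...s` URL form. The names `timestampLink`, `formatTimestamp`, and `bulletWithLink`, and the Markdown bullet layout, are hypothetical and not taken from this codebase:

```typescript
// Hypothetical helpers; the app's real implementation may differ.

// Build a YouTube URL that starts playback at the given offset (whole seconds).
function timestampLink(videoId: string, seconds: number): string {
  const t = Math.max(0, Math.floor(seconds));
  return `https://www.youtube.com/watch?v=${encodeURIComponent(videoId)}&t=${t}s`;
}

// Render an offset as m:ss, or h:mm:ss once it reaches one hour.
function formatTimestamp(seconds: number): string {
  const total = Math.max(0, Math.floor(seconds));
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  const ss = String(s).padStart(2, "0");
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${ss}` : `${m}:${ss}`;
}

// Assumed output layout: a Markdown bullet with a linked timestamp label.
function bulletWithLink(text: string, videoId: string, seconds: number): string {
  return `- ${text} ([${formatTimestamp(seconds)}](${timestampLink(videoId, seconds)}))`;
}
```

For example, `bulletWithLink("Intro to the topic", "dQw4w9WgXcQ", 95)` yields a bullet whose label reads `1:35` and whose link ends in `&t=95s`.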
Prompt templates are stored in the prompts/ folder and loaded at runtime. They have gone through multiple iterations of tuning to tighten accuracy and produce quality results across all providers. See prompts/README.md for full details on which files are used by which LLMs and modes.
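The provider notes in the table above (system + user split for Anthropic, a single content block for Gemini, chat completions for Perplexity) could translate into request bodies roughly as follows. This is a sketch based on each provider's public HTTP API, not the app's actual code: `buildRequestBody`, the `max_tokens` value, and the use of a system message for Perplexity are assumptions.

```typescript
type Provider = "anthropic" | "gemini" | "perplexity";

interface SummaryRequest {
  provider: Provider;
  model: string;        // e.g. the value of ANTHROPIC_MODEL
  instructions: string; // prompt template text, assumed to be loaded already
  transcript: string;
}

const TEMPERATURE = 0.7; // same value for all three providers, per the table

// Hypothetical helper: returns the JSON body for one summary request.
function buildRequestBody(req: SummaryRequest): Record<string, unknown> {
  switch (req.provider) {
    case "anthropic":
      // Messages API: instructions go in `system`, the transcript in a user turn.
      return {
        model: req.model,
        max_tokens: 4096, // required by the API; this value is an assumption
        temperature: TEMPERATURE,
        system: req.instructions,
        messages: [{ role: "user", content: req.transcript }],
      };
    case "gemini":
      // generateContent: one content block; the model is part of the URL path.
      return {
        contents: [{ parts: [{ text: `${req.instructions}\n\n${req.transcript}` }] }],
        generationConfig: { temperature: TEMPERATURE },
      };
    case "perplexity":
      // Chat completions format; the system message here is an assumption.
      return {
        model: req.model,
        temperature: TEMPERATURE,
        messages: [
          { role: "system", content: req.instructions },
          { role: "user", content: req.transcript },
        ],
      };
  }
}
```

Keeping the three shapes in one function makes the per-provider differences easy to compare when a prompt is tuned for all of them.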
The ai_summary folder contains example summaries generated by different providers and styles:
- AI Summary Folder - Contains summaries from Anthropic Sonnet 4.5, Google Gemini 2.5 Flash, and Perplexity Sonar Online
The application includes comprehensive performance optimizations:
- Session-based caching: Channel data is cached in memory for 5 minutes, enabling instant tab switching
- Request deduplication: Prevents duplicate concurrent API requests
- Component memoization: React.memo and useMemo prevent unnecessary re-renders
- Optimized video enrichment: Parallel processing for video metadata fetching
- Tab persistence: Channel tab stays mounted once viewed for faster subsequent access
- Debounce & throttle: Optimized user input handling and API calls
- Lazy loading: Images and heavy components loaded on demand
- Code splitting: Automatic bundle splitting for optimal loading
- Bundle optimization: Webpack code splitting with vendor/common chunks
- Image optimization: AVIF and WebP format support with caching
- Font optimization: Font display swap for faster rendering
- Tree shaking: Unused code elimination
- SWC minification: Fast JavaScript minification
- Web Vitals tracking: FCP, LCP, FID, CLS, and TTFB monitoring
- Performance metrics: Page load time, DOM content loaded time
- Memory usage tracking: JavaScript heap size monitoring
- Bundle size analysis: Resource size tracking and optimization
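The session-based caching and request deduplication items above can be combined in one small structure. The following is a minimal sketch under stated assumptions, not the contents of `channel-cache.ts`: the class name `TtlCache`, its API, and the injectable clock are hypothetical; only the 5-minute lifetime comes from the list above.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical in-memory cache with a TTL and in-flight request sharing.
class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private inFlight = new Map<string, Promise<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: string, fetcher: () => Promise<T>): Promise<T> {
    // 1. Serve a fresh cached value without touching the network.
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;

    // 2. If the same key is already being fetched, share that promise.
    const pending = this.inFlight.get(key);
    if (pending) return pending;

    // 3. Otherwise start one request, cache its result, then clear the marker.
    const request = fetcher()
      .then((value) => {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });
    this.inFlight.set(key, request);
    return request;
  }
}

// Channel data cached for 5 minutes, as described above.
const channelCache = new TtlCache<unknown>(5 * 60 * 1000);
```

Failed requests are not cached in this sketch, so a later call retries the fetch.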
The application features a clean, modern interface with:
- Tabbed interface: Video tab shows preview and transcript, Channel tab shows top 10 videos
- Real-time processing: Visual feedback during transcript processing
- Search functionality: Search within transcripts with highlighting
- Export options: TXT and PDF export with customizable options (metadata, timestamps)
- Dark mode: Full dark mode support with system preference detection
- Responsive design: Works seamlessly on mobile, tablet, and desktop
- Loading skeletons: Smooth loading states for async content
- Smooth animations: CSS transitions with reduced motion support
- Micro-interactions: Visual feedback for all user actions
The application is built with accessibility in mind:
- WCAG 2.1 AA compliant: Meets accessibility standards
- Keyboard navigation: Full keyboard support for all interactions
- Screen reader support: ARIA labels and semantic HTML
- Focus management: Proper focus trapping and restoration
- Color contrast: Meets WCAG contrast requirements
- Skip links: Quick navigation for keyboard users
- Reduced motion: Respects user's motion preferences
For the full setup guide, see docs/SETUP.md.
Before running the development server, you need to configure your environment variables. Create a .env.local file in the root directory:
# Copy the example and add your API keys
cp .env.example .env.local # If .env.example exists
# Or create .env.local manually

Add your API keys to .env.local:
# Anthropic API Configuration (Required for AI Summary feature)
ANTHROPIC_API_KEY=sk-ant-your-api-key-here
ANTHROPIC_MODEL=claude-sonnet-4-20250514
ANTHROPIC_MODEL_NAME=Anthropic Sonnet 4.5
# Google Gemini API Configuration (Optional)
GOOGLE_GEMINI_API_KEY=your_google_gemini_api_key_here
GOOGLE_GEMINI_MODEL=gemini-2.5-flash
GOOGLE_GEMINI_MODEL_NAME=Google Gemini 2.5 Flash
# Perplexity API Configuration (Optional)
PERPLEXITY_API_KEY=your_perplexity_api_key_here
PERPLEXITY_MODEL=sonar-online
PERPLEXITY_MODEL_NAME=Perplexity Sonar Online

Note: The ANTHROPIC_API_KEY is required if you want to use the AI Summary feature. You can get your API key from Anthropic's Console.
For more details, see docs/ENV_VARIABLES.md.
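To illustrate how the variables above might be consumed, here is a hedged sketch that turns an environment object into a list of enabled providers. The function `readProviderConfigs`, the rule that a provider counts as enabled only when both its key and model are set, and the fallback of the display name to the model ID are all assumptions; the app's real logic may differ.

```typescript
type Env = Record<string, string | undefined>;

interface ProviderConfig {
  id: string;          // "anthropic", "google_gemini", or "perplexity"
  apiKey: string;
  model: string;
  displayName: string;
}

// Variable name prefixes taken from the .env.local example above.
const PROVIDER_PREFIXES = ["ANTHROPIC", "GOOGLE_GEMINI", "PERPLEXITY"] as const;

// Hypothetical helper: collect every provider that has a key and a model.
function readProviderConfigs(env: Env): ProviderConfig[] {
  const configs: ProviderConfig[] = [];
  for (const prefix of PROVIDER_PREFIXES) {
    const apiKey = env[`${prefix}_API_KEY`]?.trim();
    const model = env[`${prefix}_MODEL`]?.trim();
    if (!apiKey || !model) continue; // treat the provider as disabled
    configs.push({
      id: prefix.toLowerCase(),
      apiKey,
      model,
      // Assumed fallback: show the raw model ID when no display name is set.
      displayName: env[`${prefix}_MODEL_NAME`]?.trim() || model,
    });
  }
  return configs;
}

// In a Next.js server context this would be called as:
//   const providers = readProviderConfigs(process.env);
```

Passing the environment in as an argument keeps the function easy to test without touching real secrets.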
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open http://localhost:3000 with your browser to see the result.
You can start editing the page by modifying app/page.tsx. The page auto-updates as you edit the file.
This project uses next/font to automatically optimize and load Geist, a new font family for Vercel.
- Framework: Next.js 15+ (App Router)
- Language: TypeScript 5+
- Styling: Tailwind CSS 4+
- UI Components: shadcn/ui (Radix UI + Lucide Icons)
- React: 19+
- ✅ YouTube URL validation and parsing (multiple formats)
- ✅ Transcript processing with deduplication
- ✅ Automatic speaker detection (Host/Guest)
- ✅ TXT and PDF export with customizable options
- ✅ Single video transcript processing
- ✅ Channel and playlist video browsing
- ✅ Interactive transcript viewer with search
- ✅ Real-time processing options with persistence
- ✅ Channel information display with top 10 videos
- ✅ AI-powered transcript summaries (Anthropic, Google Gemini, Perplexity)
- ✅ My Favorite Podcast Channels: Save up to 5 channels, browse latest episodes, one-click summarize pipeline
- ✅ Session-based caching for instant tab switching
- ✅ Request deduplication to prevent duplicate API calls
- ✅ Component memoization (React.memo, useMemo, useCallback)
- ✅ Code splitting and lazy loading
- ✅ Bundle optimization and tree shaking
- ✅ Image optimization (AVIF, WebP)
- ✅ Performance monitoring (Web Vitals tracking)
- ✅ Debounce and throttle utilities
- ✅ Dark mode support with system preference detection
- ✅ Responsive mobile design with touch optimization
- ✅ Loading skeletons for smooth loading states
- ✅ Smooth animations with reduced motion support
- ✅ Micro-interactions and visual feedback
- ✅ Error handling with recovery options
- ✅ Empty states with helpful messages
- ✅ Comprehensive error boundaries
- ✅ WCAG 2.1 AA compliance
- ✅ Full keyboard navigation support
- ✅ Screen reader optimization
- ✅ ARIA labels on all interactive elements
- ✅ Focus management and trapping
- ✅ Color contrast compliance
- ✅ Skip links for quick navigation
- Server-side persistence (Supabase migration)
- Advanced speaker identification (ML-based)
- Multi-language support
- Browser extension
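The URL validation and parsing feature listed above handles multiple URL formats. A minimal sketch of one way to extract a video ID follows. Which formats the app actually accepts is not stated here, so the supported hosts and paths, the 11-character ID check, and the function name `extractVideoId` are assumptions rather than a description of `youtube-validator.ts`:

```typescript
// Hypothetical helper: returns the video ID, or null if the input is not a
// recognized YouTube video URL.
function extractVideoId(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }

  // Normalize "www." and "m." prefixes so one comparison covers both.
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let candidate: string | null = null;

  if (host === "youtu.be") {
    // Short links: https://youtu.be/<id>
    candidate = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Standard links: https://www.youtube.com/watch?v=<id>
      candidate = url.searchParams.get("v");
    } else {
      // Path-based links: /shorts/<id>, /embed/<id>, /live/<id>
      const match = url.pathname.match(/^\/(?:shorts|embed|live)\/([^/]+)/);
      candidate = match?.[1] ?? null;
    }
  }

  // Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
  return candidate !== null && /^[A-Za-z0-9_-]{11}$/.test(candidate) ? candidate : null;
}
```

Returning `null` instead of throwing lets the caller show a validation message without a try/catch around every keystroke.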
src/
├── app/  # Next.js App Router
│   ├── api/  # API routes
│   │   ├── transcript/  # Transcript fetching endpoints
│   │   ├── channel/  # Channel information endpoint
│   │   ├── discover/  # Video discovery endpoint
│   │   └── ai-summary/  # AI summary + config endpoints
│   ├── api-docs/  # Interactive Swagger/OpenAPI docs
│   ├── layout.tsx  # Root layout with theme provider
│   └── page.tsx  # Home page with main UI
├── components/  # React components
│   ├── ui/  # shadcn/ui components
│   │   └── skeleton.tsx  # Loading skeleton component
│   ├── layout/  # Layout components (Header, Footer, Container)
│   ├── features/  # Feature-specific components
│   │   ├── VideoPreview.tsx  # Video metadata and tabs
│   │   ├── ChannelDetails.tsx  # Channel info and top videos
│   │   ├── TranscriptViewer.tsx  # Transcript display with search
│   │   ├── ProcessingOptions.tsx  # Processing configuration
│   │   ├── ExportControls.tsx  # Export functionality
│   │   ├── FavoriteChannels.tsx  # Saved channels with episode list
│   │   ├── SummarizePipelineModal.tsx  # Pipeline progress modal
│   │   ├── ErrorDisplay.tsx  # Error display component
│   │   ├── EmptyState.tsx  # Empty state components
│   │   └── RetryButton.tsx  # Retry action component
│   └── ErrorBoundary.tsx  # React error boundary
├── lib/  # Utility functions
│   ├── transcript-processor.ts  # Processing logic
│   ├── ytdlp-service.ts  # yt-dlp integration
│   ├── api-client.ts  # API client with caching
│   ├── channel-cache.ts  # Session-based caching
│   ├── youtube-validator.ts  # URL validation
│   ├── performance-utils.ts  # Performance utilities
│   ├── accessibility-utils.ts  # Accessibility helpers
│   ├── mobile-utils.ts  # Mobile optimization
│   ├── performance-monitor.ts  # Performance monitoring
│   ├── animations.ts  # Animation utilities
│   └── utils.ts  # General utilities
├── hooks/  # Custom React hooks
│   ├── useChannelData.ts  # Channel data with caching
│   ├── useTranscriptProcessing.ts  # Transcript processing
│   ├── useProcessingOptions.ts  # Options management
│   ├── useUrlValidation.ts  # URL validation
│   ├── useFavoriteChannels.ts  # Channel CRUD, episode cache, localStorage
│   ├── useUrlDetection.ts  # Channel/playlist URL detection
│   ├── useUrlSubmission.ts  # URL validation and transcript fetching
│   └── useSummarizePipeline.ts  # One-click summarize pipeline orchestration
└── types/  # TypeScript definitions
    └── index.ts  # Type definitions
The project includes comprehensive testing:
- Unit Tests: Vitest + React Testing Library (80%+ coverage)
- Integration Tests: API routes and utility functions
- E2E Tests: Playwright for user flows and cross-browser testing
- Performance Tests: Web Vitals and bundle size monitoring
- Accessibility Tests: WCAG compliance and keyboard navigation
Run tests:
npm test # Unit tests
npm run test:coverage # With coverage report
npm run test:e2e # E2E tests

- Interactive API Docs - Swagger/OpenAPI UI (available at `/api-docs` when running locally)
- docs/SETUP.md - Setup and installation guide
- docs/API.md - API reference (endpoints, request/response schemas, rate limits)
- docs/INFRASTRUCTURE.md - Architecture, tech stack, and infrastructure
- docs/ENV_VARIABLES.md - Environment variable configuration
- prompts/ - AI summary prompt templates (README for details)
- How It Works - Interactive architecture overview (available in-app at `/how-it-works.html`)
- Next.js Documentation
- Tailwind CSS Documentation
- shadcn/ui Documentation
- Playwright Documentation
- Vitest Documentation
The easiest way to deploy your Next.js app is to use the Vercel Platform.
- Set all required environment variables in Vercel dashboard
- Ensure the `yt-dlp` binary is available in the deployment environment
- Verify API keys are configured correctly
- Run `npm run build` locally to verify the build succeeds
- Run `npm run test:e2e` to verify E2E tests pass
- Check bundle size meets performance targets (< 1MB initial JS)
- ✅ Page load time < 2 seconds
- ✅ Lighthouse Performance score > 90
- ✅ Lighthouse Accessibility score > 95
- ✅ Bundle size < 1MB initial JavaScript
- ✅ Memory usage < 100MB typical operations
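The targets above could be checked automatically, for example in a pre-deploy script. The sketch below only compares numbers that are supplied to it; how the measurements are collected is left open, and the names `Measurements` and `checkTargets` are hypothetical.

```typescript
// Hypothetical measurement record, in the same units the targets use.
interface Measurements {
  pageLoadSeconds: number;
  lighthousePerformance: number;   // 0-100
  lighthouseAccessibility: number; // 0-100
  initialJsMB: number;
  memoryMB: number;
}

// Returns one message per missed target; an empty array means all targets pass.
function checkTargets(m: Measurements): string[] {
  const failures: string[] = [];
  if (!(m.pageLoadSeconds < 2)) failures.push("Page load time must be < 2 seconds");
  if (!(m.lighthousePerformance > 90)) failures.push("Lighthouse Performance score must be > 90");
  if (!(m.lighthouseAccessibility > 95)) failures.push("Lighthouse Accessibility score must be > 95");
  if (!(m.initialJsMB < 1)) failures.push("Initial JavaScript must be < 1MB");
  if (!(m.memoryMB < 100)) failures.push("Memory usage must be < 100MB");
  return failures;
}
```

Writing each comparison as `!(value < limit)` means a missing or `NaN` measurement counts as a failure rather than silently passing.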