MCP Server by palvindersander (public)

youtube transcript mcp

mcp to get youtube video metadata and transcript -- written by an llm


YouTube Transcript MCP Server

An MCP server for fetching transcripts from YouTube videos directly in Claude.

Documentation: For complete documentation, see the Documentation Index.

Architecture

System Overview

graph TD
    Claude[Claude Desktop] -->|Command| MCP[MCP Protocol]
    MCP -->|Request| Server[Transcript MCP Server]
    Server -->|Parse URL| YTLib[Transcript Library]
    YTLib -->|Extract Video ID| YT[YouTube]
    YTLib -->|Fetch Transcript| YT
    YTLib -->|Fetch Metadata| YT
    YTLib -->|Process| Results[Results]
    Results -->|Format Response| Server
    Server -->|Return Data| MCP
    MCP -->|Display Results| Claude
    
    %% New fact-checking components
    Server -->|Search Request| SearchAPI[Search API Client]
    SearchAPI -->|Query| Web[Web Search]
    Web -->|Results| SearchAPI
    SearchAPI -->|Formatted Results| Server
    
    Server -->|Segment Request| SegmentLib[Transcript Segment]
    YTLib -->|Transcript Data| SegmentLib
    SegmentLib -->|Extracted Segment| Server
    
    subgraph "Transcript Server"
        Server
        YTLib
        SearchAPI
        SegmentLib
    end
    
    subgraph "External Services"
        YT
        Web
    end
    
    subgraph "Client"
        Claude
        MCP
    end

Component Structure

classDiagram
    class FastMCP {
        +run()
        +tool()
    }
    
    class TranscriptMCPServer {
        +get_transcript()
        +get_video_metadata()
        +list_transcript_languages()
        +get_chapter_markers()
        +search_for_claim_verification()
        +extract_transcript_segment()
        +find_claim_in_transcript()
    }
    
    class TranscriptLib {
        +get_video_id()
        +get_video_metadata()
        +get_video_statistics()
        +get_available_languages()
        +get_transcript()
        +get_chapter_markers()
        +format_transcript_text()
        +format_transcript_json()
    }
    
    class SearchAPIClient {
        +search()
        +search_for_claim_verification()
    }
    
    class TranscriptSegment {
        +timestamp_to_seconds()
        +seconds_to_timestamp()
        +find_transcript_segment()
        +extract_transcript_segment()
        +find_claim_in_transcript()
    }
    
    class YouTubeTranscriptAPI {
        +get_transcript()
        +list_transcripts()
    }
    
    class TranscriptError {
        +message
    }
    
    class SearchAPIError {
        +message
    }
    
    FastMCP <|-- TranscriptMCPServer : extends
    TranscriptMCPServer --> TranscriptLib : uses
    TranscriptMCPServer --> SearchAPIClient : uses
    TranscriptMCPServer --> TranscriptSegment : uses
    TranscriptLib --> YouTubeTranscriptAPI : uses
    TranscriptLib --> TranscriptError : throws
    SearchAPIClient --> SearchAPIError : throws
    TranscriptSegment --> TranscriptLib : uses

Request Flow Sequence

sequenceDiagram
    participant User
    participant Claude
    participant MCPServer
    participant TranscriptLib
    participant YouTube
    
    User->>Claude: @transcript get_transcript [URL]
    Claude->>MCPServer: Call get_transcript
    MCPServer->>TranscriptLib: get_video_id(url)
    MCPServer->>TranscriptLib: get_video_metadata(video_id)
    TranscriptLib->>YouTube: HTTP request to oEmbed API
    YouTube-->>TranscriptLib: Return metadata
    TranscriptLib->>YouTube: HTTP request to page
    YouTube-->>TranscriptLib: Return page content
    TranscriptLib-->>MCPServer: Return metadata
    
    MCPServer->>TranscriptLib: get_video_statistics(video_id)
    TranscriptLib->>YouTube: HTTP request to page
    YouTube-->>TranscriptLib: Return page content
    TranscriptLib-->>MCPServer: Return statistics
    
    MCPServer->>TranscriptLib: get_chapter_markers(video_id)
    TranscriptLib->>YouTube: HTTP request to page
    YouTube-->>TranscriptLib: Return page content
    TranscriptLib-->>MCPServer: Return chapter markers
    
    MCPServer->>TranscriptLib: get_transcript(video_id)
    TranscriptLib->>YouTube: Request transcript
    YouTube-->>TranscriptLib: Return raw transcript
    TranscriptLib-->>MCPServer: Return transcript segments
    MCPServer->>TranscriptLib: format_transcript_text(transcript, chapters)
    TranscriptLib-->>MCPServer: Return formatted transcript with chapters
    MCPServer-->>Claude: Return complete response
    Claude-->>User: Display transcript with metadata, statistics, and chapters

Fact-Checking Request Flow

sequenceDiagram
    participant User
    participant Claude
    participant MCPServer
    participant TranscriptLib
    participant SearchAPI
    participant TranscriptSegment
    participant YouTube
    participant WebSearch
    
    User->>Claude: Request transcript & summarize
    Claude->>MCPServer: get_transcript(url)
    MCPServer->>TranscriptLib: Get transcript & metadata
    TranscriptLib->>YouTube: Fetch data
    YouTube-->>TranscriptLib: Return data
    TranscriptLib-->>MCPServer: Return formatted data
    MCPServer-->>Claude: Return transcript
    Claude-->>User: Display transcript & summary
    
    User->>Claude: Request fact check for claim
    Claude->>Claude: Identify claim to check
    
    alt Find Claim Location
        Claude->>MCPServer: find_claim_in_transcript(url, claim)
        MCPServer->>TranscriptLib: get_transcript(url)
        TranscriptLib->>YouTube: Fetch transcript
        YouTube-->>TranscriptLib: Return transcript
        TranscriptLib-->>MCPServer: Return transcript data
        MCPServer->>TranscriptSegment: find_claim_in_transcript(transcript, claim)
        TranscriptSegment-->>MCPServer: Return claim location & context
        MCPServer-->>Claude: Return claim context
    end
    
    alt Get Transcript Segment
        Claude->>MCPServer: extract_transcript_segment(url, timestamp)
        MCPServer->>TranscriptSegment: extract_transcript_segment(url, timestamp)
        TranscriptSegment->>TranscriptLib: get_transcript(url)
        TranscriptLib->>YouTube: Fetch transcript
        YouTube-->>TranscriptLib: Return transcript
        TranscriptLib-->>TranscriptSegment: Return transcript data
        TranscriptSegment-->>MCPServer: Return segment with context
        MCPServer-->>Claude: Return formatted segment
    end
    
    Claude->>MCPServer: search_for_claim_verification(claim, context)
    MCPServer->>SearchAPI: search_for_claim_verification(claim, context)
    SearchAPI->>WebSearch: Search for fact check
    WebSearch-->>SearchAPI: Return search results
    SearchAPI->>WebSearch: Search for direct information
    WebSearch-->>SearchAPI: Return search results
    SearchAPI-->>MCPServer: Return combined, formatted results
    MCPServer-->>Claude: Return structured search data
    
    Claude->>Claude: Analyze verification data
    Claude-->>User: Present fact check results

Project Status and Roadmap

This project is actively maintained. See the Progress Tracker for:

- Current implementation status
- Completed features
- Planned enhancements
- Roadmap for future development

Setup Instructions

1. Install dependencies:

```
python3 -m pip install -r requirements.txt
```

2. Configure Claude for Desktop:

Open the Claude for Desktop configuration file (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json) and add:

```
{
    "mcpServers": {
        "transcript": {
            "command": "python3",
            "args": [
                "/absolute/path/to/transcript_mcp.py"
            ]
        }
    }
}
```

Replace /absolute/path/to/transcript_mcp.py with the actual path to the MCP script.
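If you would rather script this step than edit the file by hand, the following is a minimal sketch of merging the entry into an existing configuration. The helper names `add_transcript_server` and `update_config_file` are hypothetical and not part of this repository; only the JSON structure is taken from the configuration shown above.

```python
import json
from pathlib import Path


def add_transcript_server(config: dict, script_path: str) -> dict:
    """Return a copy of config with the "transcript" server entry added.

    Hypothetical helper for illustration; other configured servers are kept.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["transcript"] = {"command": "python3", "args": [script_path]}
    updated["mcpServers"] = servers
    return updated


def update_config_file(config_file: Path, script_path: str) -> None:
    """Read the config file (if it exists), add the entry, and write it back."""
    config = json.loads(config_file.read_text()) if config_file.exists() else {}
    merged = add_transcript_server(config, script_path)
    config_file.write_text(json.dumps(merged, indent=4))
```

Merging into a copy rather than overwriting the file keeps any other MCP servers you have already configured.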

Usage

Once configured, you can use the transcript MCP server in Claude with commands like:

```
@transcript get_transcript https://www.youtube.com/watch?v=ELj2LLNP8Ak
```

Or:

```
@transcript list_transcript_languages https://www.youtube.com/watch?v=ELj2LLNP8Ak
```

Or:

```
@transcript get_video_metadata https://www.youtube.com/watch?v=ELj2LLNP8Ak
```

Available Tools

get_transcript(url, language_code=None, include_metadata=True, include_chapters=True)
- Fetches a transcript for a YouTube video with timestamps in ~10 second intervals
- Arguments:
  - url: YouTube video URL or ID
  - language_code (optional): Language code (e.g., 'en', 'es')
  - include_metadata (optional): Whether to include video metadata (default: True)
  - include_chapters (optional): Whether to include chapter markers in the transcript (default: True)
- Returns:
  - Video metadata (title, author, channel URL, view count, etc.) if requested
  - Chapter markers if available and requested
  - Transcript with timestamps and chapter markers

get_video_metadata(url, include_statistics=True)
- Fetches metadata and statistics for a YouTube video
- Arguments:
  - url: YouTube video URL or ID
  - include_statistics (optional): Whether to include view count, likes, etc. (default: True)
- Returns:
  - Video title
  - Author/channel name
  - Channel URL
  - Thumbnail URL
  - View count (if available)
  - Like count (if available)
  - Upload date (if available)
  - Video description

list_transcript_languages(url)
- Lists available transcript languages for a YouTube video
- Arguments:
  - url: YouTube video URL or ID

get_chapter_markers(url)
- Fetches chapter markers for a YouTube video
- Arguments:
  - url: YouTube video URL or ID
- Returns:
  - List of chapter markers with timestamps and titles, or a message if no chapters are found

search_for_claim_verification(claim, context=None)
- Searches for information to help verify a claim made in a video
- Arguments:
  - claim: The specific claim to verify (a statement that can be true or false)
  - context (optional): Context from the video to help with the search
- Returns:
  - JSON-formatted search results with fact-checking and general information

extract_transcript_segment(url, timestamp, context_seconds=30)
- Extracts a specific segment of a transcript around a timestamp
- Arguments:
  - url: YouTube video URL or ID
  - timestamp: Timestamp in format MM:SS or HH:MM:SS
  - context_seconds (optional): Number of seconds of context before and after (default: 30)
- Returns:
  - The transcript segment with metadata

find_claim_in_transcript(url, claim, fuzzy_match=True)
- Finds a specific claim in a transcript and returns its timestamp and context
- Arguments:
  - url: YouTube video URL or ID
  - claim: The claim to find
  - fuzzy_match (optional): Whether to use fuzzy matching (default: True)
- Returns:
  - Timestamp and context of the claim if found
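
To make the timestamp and context_seconds arguments of extract_transcript_segment concrete, here is a minimal sketch of how a context window might be selected. It is not the repository's implementation: `timestamp_to_seconds` reuses a name from the component diagram but its body is an assumption, `extract_segment` is a hypothetical helper, and the entry shape (dicts with "text" and "start" in seconds) is assumed.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "MM:SS" or "HH:MM:SS" to a number of seconds."""
    parts = [int(p) for p in timestamp.strip().split(":")]
    if len(parts) not in (2, 3):
        raise ValueError(f"Expected MM:SS or HH:MM:SS, got {timestamp!r}")
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + part
    return seconds


def extract_segment(entries, timestamp, context_seconds=30):
    """Return the entries whose start time falls inside the context window.

    entries: list of dicts with "text" and "start" keys (assumed shape).
    """
    center = timestamp_to_seconds(timestamp)
    window_start = center - context_seconds
    window_end = center + context_seconds
    return [e for e in entries if window_start <= e["start"] <= window_end]
```

With the default of 30 seconds, a request for 12:34 would select entries that start between 12:04 and 13:04.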

Transcript Format

The transcript is formatted with timestamps in approximately 10-second intervals. Short segments are merged until they reach about 10 seconds in duration. Each line is prefixed with a timestamp in [MM:SS] format.
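
The merging rule can be sketched as follows. This is an illustration under assumptions rather than the server's actual code: the function names are hypothetical, and each raw entry is assumed to be a dict with "text", "start", and "duration" (both in seconds).

```python
def merge_segments(entries, target_seconds=10.0):
    """Merge consecutive entries until each block spans about target_seconds.

    Returns a list of (start_seconds, text) tuples.
    """
    merged = []
    block_start, block_texts = None, []
    for entry in entries:
        if block_start is None:
            block_start = entry["start"]
        block_texts.append(entry["text"].strip())
        block_end = entry["start"] + entry["duration"]
        # Close the block once it covers roughly the target duration.
        if block_end - block_start >= target_seconds:
            merged.append((block_start, " ".join(block_texts)))
            block_start, block_texts = None, []
    # Keep any trailing entries that never reached the target duration.
    if block_texts:
        merged.append((block_start, " ".join(block_texts)))
    return merged


def format_line(start_seconds, text):
    """Prefix a merged block with its [MM:SS] timestamp."""
    minutes, seconds = divmod(int(start_seconds), 60)
    return f"[{minutes:02d}:{seconds:02d}] {text}"
```

Because a block closes only after it reaches the target, blocks can run slightly longer than 10 seconds, which matches the "approximately" in the description.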

When chapter markers are available and included, they are displayed in two ways:

- As a complete list at the top of the response under the "Chapter Markers" section
- Inserted at appropriate positions in the transcript with a format like [CHAPTER] MM:SS - Chapter Title

This dual approach makes it easier to get an overview of the video structure while also seeing chapter transitions as you read through the content.
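
One way to interleave chapter markers with transcript lines is sketched below. The function name `insert_chapters` and the tuple-based input shapes are assumptions made for illustration; only the output formats ([MM:SS] and [CHAPTER] MM:SS - Chapter Title) come from the description above.

```python
def insert_chapters(lines, chapters):
    """Interleave chapter markers with transcript lines.

    lines:    list of (start_seconds, text) tuples, sorted by time
    chapters: list of (start_seconds, title) tuples, sorted by time
    """
    def mmss(seconds):
        minutes, secs = divmod(int(seconds), 60)
        return f"{minutes:02d}:{secs:02d}"

    output, pending = [], list(chapters)
    for start, text in lines:
        # Emit every chapter that begins at or before this transcript line.
        while pending and pending[0][0] <= start:
            chapter_start, title = pending.pop(0)
            output.append(f"[CHAPTER] {mmss(chapter_start)} - {title}")
        output.append(f"[{mmss(start)}] {text}")
    # Chapters that start after the last transcript line are still listed.
    for chapter_start, title in pending:
        output.append(f"[CHAPTER] {mmss(chapter_start)} - {title}")
    return output
```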

Video Metadata and Statistics

The server can extract the following information from YouTube videos:

- Video title
- Author/channel name
- Channel URL
- Thumbnail URL
- View count
- Like count
- Upload date
- Video description

This information can be included with transcripts or retrieved separately using the get_video_metadata tool.
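
The request flow diagram shows the basic metadata coming from YouTube's oEmbed API. The sketch below shows what such a lookup might look like using only the standard library; the function names are hypothetical and the repository's own code may differ. oEmbed covers only the title, author, channel URL, and thumbnail; per the diagram, statistics such as view count come from a separate page request that is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OEMBED_ENDPOINT = "https://www.youtube.com/oembed"


def oembed_url(video_id: str) -> str:
    """Build the oEmbed request URL for a video ID."""
    watch_url = f"https://www.youtube.com/watch?v={video_id}"
    return f"{OEMBED_ENDPOINT}?{urlencode({'url': watch_url, 'format': 'json'})}"


def parse_oembed(payload: dict) -> dict:
    """Map oEmbed field names onto the metadata names used in this document."""
    return {
        "title": payload.get("title"),
        "author": payload.get("author_name"),
        "channel_url": payload.get("author_url"),
        "thumbnail_url": payload.get("thumbnail_url"),
    }


def fetch_basic_metadata(video_id: str, timeout: float = 10.0) -> dict:
    """Fetch and parse oEmbed metadata (performs a network request)."""
    with urlopen(oembed_url(video_id), timeout=timeout) as response:
        return parse_oembed(json.load(response))
```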

Chapter Markers

Chapter markers are segments of a video defined by the video creator. The server can extract these markers from YouTube videos when available. Each chapter has:

- A title describing the content
- A timestamp indicating when the chapter starts
- A formatted time string (HH:MM:SS or MM:SS)

Chapter markers can be included directly in the transcript to provide additional context and structure, or retrieved separately using the get_chapter_markers tool.
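
The three chapter fields could be modeled as in the sketch below. How the server actually locates chapters on the page is not described here; parsing "0:00 Title" lines from a video description is an assumption chosen for illustration, and `Chapter` and `parse_chapters` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Matches lines such as "0:00 Intro" or "1:02:03 - Deep dive".
CHAPTER_LINE = re.compile(r"^\s*((?:\d{1,2}:)?\d{1,2}:\d{2})\s*-?\s*(.+?)\s*$")


@dataclass
class Chapter:
    title: str
    start_seconds: int

    @property
    def formatted_time(self) -> str:
        """Return HH:MM:SS for chapters past the first hour, else MM:SS."""
        hours, rest = divmod(self.start_seconds, 3600)
        minutes, seconds = divmod(rest, 60)
        if hours:
            return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
        return f"{minutes:02d}:{seconds:02d}"


def parse_chapters(description: str) -> list:
    """Collect chapters from description lines that start with a timestamp."""
    chapters = []
    for line in description.splitlines():
        match = CHAPTER_LINE.match(line)
        if not match:
            continue
        seconds = 0
        for part in match.group(1).split(":"):
            seconds = seconds * 60 + int(part)
        chapters.append(Chapter(title=match.group(2), start_seconds=seconds))
    return chapters
```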

Fact-Checking

The server provides several tools to help Claude verify information from YouTube videos:

- Claim Verification Search: Uses web search to find information that can verify claims made in videos
- Transcript Segment Extraction: Extracts specific segments of transcripts around timestamps for focused analysis
- Claim Finding: Locates claims within transcripts with exact or fuzzy matching
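
As an illustration of exact versus fuzzy claim finding, the sketch below tries a substring match first and falls back to a similarity score from the standard library's difflib. The repository's matching strategy is not documented here, so the function name `find_claim`, the entry shape, and the 0.6 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def find_claim(entries, claim, fuzzy_match=True, threshold=0.6):
    """Return the entry that best matches claim, or None.

    entries:   list of dicts with "text" and "start" keys (assumed shape)
    threshold: minimum similarity ratio for a fuzzy hit (arbitrary default)
    """
    needle = claim.lower().strip()
    # Exact (case-insensitive) substring match takes priority.
    for entry in entries:
        if needle in entry["text"].lower():
            return entry
    if not fuzzy_match:
        return None
    # Otherwise pick the most similar entry, if it is similar enough.
    best, best_score = None, 0.0
    for entry in entries:
        score = SequenceMatcher(None, needle, entry["text"].lower()).ratio()
        if score > best_score:
            best, best_score = entry, score
    return best if best_score >= threshold else None
```

Fuzzy matching matters because users often paraphrase a claim rather than quoting the transcript word for word.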

When fact-checking, Claude can:

1. Identify specific claims from the video transcript
2. Find where in the transcript the claim appears
3. Gather context around the claim
4. Search for verification information
5. Analyze the results to provide a fact-check assessment
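
For the search step, the fact-checking sequence diagram shows two searches (one for fact checks, one for direct information) whose results are combined. A minimal sketch of that pattern follows; the query wording, the result layout, and the `verify_claim` name are assumptions, and `search` stands in for whatever search client is configured.

```python
def verify_claim(claim, search, context=None):
    """Run a fact-check search and a direct search, then combine the results.

    search: any callable that takes a query string and returns a list of results.
    """
    suffix = f" {context}" if context else ""
    queries = {
        "fact_check": f"fact check {claim}{suffix}",
        "direct": f"{claim}{suffix}",
    }
    return {
        "claim": claim,
        "results": {name: search(query) for name, query in queries.items()},
    }
```

Passing the search function in as an argument keeps the sketch independent of any particular provider and makes it easy to exercise with a stub.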

Example Fact-Checking Flow

User: @transcript get_transcript https://www.youtube.com/watch?v=ELj2LLNP8Ak and summarize

Claude: [fetches transcript and provides summary]

User: Please fact check the claim that "AI will replace all programmers by 2025" made at 12:34 in the video

Claude: Let me check this claim carefully.

First, I'll extract the segment from the video to verify the exact wording...

[uses extract_transcript_segment to get context]

Now I'll search for information to verify this claim...

[uses search_for_claim_verification to gather data]

Based on my research:
1. Expert consensus from multiple sources indicates that AI will augment programming rather than completely replace programmers by 2025
2. While AI coding assistants are becoming more capable, they still require human oversight and direction
3. The claim appears to be an exaggeration that contradicts current industry projections

The claim that "AI will replace all programmers by 2025" is not supported by current evidence and expert analysis.

Search API Configuration

The search functionality requires a search API key. To configure this:

1. Get an API key from a search provider (the default implementation uses Serper.dev).
2. Set the API key in one of these ways:
   - Recommended: create a config.py file in the project root with:
     ```
     # Serper.dev API key
     SEARCH_API_KEY = "your_api_key_here"
     ```
     This file is already in .gitignore, so your API key isn't committed to version control.
   - Alternative: set the environment variable SEARCH_API_KEY:
     ```
     export SEARCH_API_KEY=your_api_key_here
     ```
3. Verify that your configuration works:
   ```
   python3 test_api_key.py
   ```
   This tests whether your API key is correctly detected from config.py or the environment variable.
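
A lookup that honors both options might look like the sketch below. The order (config.py first, then the environment variable) is an assumption, since the precedence is not stated here, and `load_search_api_key` is a hypothetical name rather than a function from this repository.

```python
import importlib
import os


def load_search_api_key(config_module="config"):
    """Return the API key from config.py if present, else from the environment.

    Returns None when neither source provides a key.
    """
    try:
        module = importlib.import_module(config_module)
        key = getattr(module, "SEARCH_API_KEY", None)
        if key:
            return key
    except ImportError:
        pass  # No config.py found; fall through to the environment variable.
    return os.environ.get("SEARCH_API_KEY") or None
```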

Mock Mode for Testing

The fact-checking feature includes a mock mode that allows testing and demonstration without requiring a real API key. When no API key is configured, the system can automatically switch to mock mode, which:

- Generates realistic-looking search results based on the query
- Clearly indicates that mock results are being used
- Maintains the same data structure as real search results

This enables users to see how the fact-checking tools work without needing to configure an API key. In a Claude conversation, mock results will be preceded by a clear notice:

[NOTE: Using mock search results for demonstration purposes. To use real search results, set the SEARCH_API_KEY environment variable.]

To explicitly test mock mode, run:

```
python3 test_missing_api_key.py
```

Important Note About Missing API Keys

If no API key is configured and mock mode is disabled, the search-based fact-checking tools will return an error message:

Error: No Search API key configured

This behavior is intentional - the MCP server remains functional for transcript retrieval even if search functionality is unavailable.
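
The three outcomes described in this section and the previous one (real search, mock results with a notice, or an error) can be sketched as a single dispatch function. The notice and error strings are quoted from this document; the function name, the `allow_mock` flag, and the result fields are hypothetical.

```python
MOCK_NOTICE = (
    "[NOTE: Using mock search results for demonstration purposes. "
    "To use real search results, set the SEARCH_API_KEY environment variable.]"
)


def search_or_fallback(query, api_key, real_search, allow_mock=True):
    """Choose between real search, mock results, and the error message.

    real_search: callable taking (query, api_key) and returning a list of results.
    """
    if api_key:
        return {"mock": False, "results": real_search(query, api_key)}
    if allow_mock:
        # Mock entries mirror the assumed shape of real results.
        mock_results = [
            {
                "title": f"Mock result for: {query}",
                "link": "https://example.com/mock",
                "snippet": "Placeholder text.",
            }
        ]
        return {"mock": True, "notice": MOCK_NOTICE, "results": mock_results}
    return {"error": "Error: No Search API key configured"}
```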

Testing

You can test the core functionality with the included test scripts:

```
python3 test_transcript.py [video_id] [language_code]
python3 test_chapter_markers.py [video_id]
python3 test_statistics.py [video_id]
python3 test_top_chapter_markers.py [video_id] [language_code]
python3 test_missing_api_key.py
python3 test_api_key.py
```

Testing scripts for specific features:

- test_transcript.py - Tests basic transcript retrieval
- test_chapter_markers.py - Tests chapter marker extraction
- test_statistics.py - Tests video statistics retrieval
- test_top_chapter_markers.py - Tests the chapter markers at the top of transcript feature
- test_missing_api_key.py - Tests error handling and mock mode when no API key is available
- test_api_key.py - Tests if your API key configuration is working correctly

Notes:

- Always run test scripts with python3 rather than making them executable
- The video_id parameter can be a full YouTube URL or just the ID
- If no video_id is provided, a default testing video will be used
- Log files with timestamps are saved in the logs/ directory
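
Accepting either a full URL or a bare ID could work roughly as sketched below. `get_video_id` borrows its name from the component diagram, but this body is an assumption, as is the rule that IDs are 11 characters from the set A-Z, a-z, 0-9, "_" and "-". URLs without a scheme are not handled.

```python
import re
from urllib.parse import parse_qs, urlparse

VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")


def get_video_id(url_or_id: str) -> str:
    """Return the video ID from a YouTube URL, or the input if it already is one."""
    candidate = url_or_id.strip()
    if VIDEO_ID.match(candidate):
        return candidate
    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    if host.endswith("youtu.be"):
        candidate = parsed.path.lstrip("/")
    elif host.endswith("youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            # Covers paths such as /embed/<id> and /shorts/<id>.
            candidate = parsed.path.rstrip("/").split("/")[-1]
    if not VIDEO_ID.match(candidate):
        raise ValueError(f"Could not find a video ID in {url_or_id!r}")
    return candidate
```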

Development Guidelines

When developing for this project:

- Always run Python scripts using python3 rather than making them executable
- Test changes thoroughly with the provided test scripts
- Document any significant changes in the appropriate documentation files
- Follow the existing code style for consistency

Quick Start

Clone the repository

git clone https://github.com/palvindersander/youtube-transcript-mcp

Install dependencies

cd youtube-transcript-mcp
python3 -m pip install -r requirements.txt


Repository Details

- Owner: palvindersander
- Repo: youtube-transcript-mcp
- Language: Python
- License: -
- Last fetched: 8/10/2025
