diffusionstudio/agent

The agentic video editing framework

Repository Info

Stars: 149
Forks: 15
Watchers: 149
Issues: 1
Language: Python
License: MIT License


Documentation

Video Composer Agent


Setup

pip install uv
uv sync

or alternatively:

uv add -r requirements.txt

Environment Variables

To run Video Composer Agent, you need to set the environment variables defined in .env.example. Using Vercel Environment Variables is recommended, but a local .env file is sufficient.

Note: Do not commit your .env file. It contains secrets that would give others access to your OpenAI and authentication provider accounts.
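As a quick sanity check before running the agent, the sketch below compares the names listed in .env.example against what is set in the process environment or a local .env file. It is only an illustration: the helper names parse_env_file and missing_variables are hypothetical and not part of this repository, and the parser handles only simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_variables(example_path, env_path=None, environ=None):
    """Return the names from the example file that are unset or empty."""
    environ = dict(os.environ if environ is None else environ)
    if env_path is not None and Path(env_path).exists():
        # Values already in the process environment win over the file
        for key, value in parse_env_file(env_path).items():
            environ.setdefault(key, value)
    required = parse_env_file(example_path).keys()
    return sorted(name for name in required if not environ.get(name))


if __name__ == "__main__":
    if Path(".env.example").exists():
        missing = missing_variables(".env.example", ".env")
        if missing:
            print("Missing environment variables:", ", ".join(missing))
```

Running this from the repository root prints the names that still need values, without ever printing the secrets themselves.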

Run Agent

To run the main script:

uv run main.py

Feel free to modify the main.py script to add new tools and modify the agent's behavior.

Demo

https://github.com/user-attachments/assets/55625231-89ce-4bf5-af45-de0672e597e1

Documentation Search

The documentation search system provides semantic search over Diffusion Studio's documentation.

Usage

from src.tools.docs_search import DocsSearchTool

# Initialize search tool
docs_search = DocsSearchTool()

# Basic search
results = docs_search.forward(query="how to add text overlay")

# With reranking for more accurate results
results = docs_search.forward(query="how to add text overlay", rerank_results=True)

# Limit number of results
results = docs_search.forward(query="how to add text overlay", limit=10)

# With filters
results = docs_search.forward(
    query="video transitions",
    filter_conditions={"section": "video-effects"}
)

The search tool:

  • Uses vector embeddings for fast semantic search
  • Supports optional semantic reranking for higher accuracy
  • Allows filtering by documentation sections
  • Auto-embeds documentation from a configured URL
  • Maintains an embedding cache and uses hash checking to detect changed content
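The behaviours listed above can be illustrated with a small, dependency-free sketch. It is not the DocsSearchTool implementation: EmbeddingCache, toy_embed, and search are hypothetical names, and the word-count "embedding" merely stands in for a real embedding model. What it shows is the general idea of re-embedding a page only when its content hash changes, then ranking by cosine similarity with an optional section filter and result limit.

```python
import hashlib
import math

# Stand-in vocabulary for the toy embedding (an assumption for this sketch)
VOCAB = ["text", "overlay", "video", "transition", "audio"]


def toy_embed(text):
    """Toy embedding: counts of vocabulary words. A real model goes here."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


class EmbeddingCache:
    """Re-embeds a document only when its content hash changes."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self._store = {}  # doc_id -> (content hash, vector)
        self.embed_calls = 0

    def get(self, doc_id, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached = self._store.get(doc_id)
        if cached is not None and cached[0] == digest:
            return cached[1]  # content unchanged, reuse the vector
        self.embed_calls += 1
        vector = self.embed_fn(text)
        self._store[doc_id] = (digest, vector)
        return vector


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, docs, cache, limit=5, filter_conditions=None):
    """Rank docs (dicts with 'id', 'text', 'section') by cosine similarity."""
    filter_conditions = filter_conditions or {}
    scored = []
    for doc in docs:
        # Skip documents that do not match every filter condition
        if any(doc.get(k) != v for k, v in filter_conditions.items()):
            continue
        vector = cache.get(doc["id"], doc["text"])
        scored.append((cosine(query_vec, vector), doc["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:limit]]
```

Because the cache is keyed on a hash of the text, a repeated search over unchanged documentation triggers no new embedding calls, which is the point of hash checking.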

Development

See CONTRIBUTING.md for development setup and guidelines.

ToDos (PRs welcome)

  • Make python agent fully async
  • Add TS implementation of agent
  • Stream the browser's console logs back to the agent
  • Add support for feedback for more modalities like audio
    • Speech to text to remove certain sentences
    • Waveform analysis to sync audio to video
    • Moderation analysis to remove certain phrases
  • Add MCP integration

    MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.

  • Add BM25 to DocsSearchTool to enable hybrid search
  • Add support for video understanding models like VideoLLaMA
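For anyone picking up the BM25 item, here is a rough sketch of what hybrid search could look like. It is not taken from this repository: the function names are hypothetical, tokenization is a plain whitespace split, and reciprocal rank fusion is just one common way to merge a keyword ranking with a vector ranking, chosen here for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents containing each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            n = doc_freq[term]
            idf = math.log((n_docs - n + 0.5) / (n + 0.5) + 1)
            # Term frequency saturates via k1; b normalizes by document length
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def ranking(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings; each contributes 1 / (k + rank) per document."""
    fused = Counter()
    for ranked in rankings:
        for rank, doc_index in enumerate(ranked, start=1):
            fused[doc_index] += 1.0 / (k + rank)
    return sorted(fused, key=lambda i: fused[i], reverse=True)
```

In a hybrid setup, one ranking would come from ranking(bm25_scores(query, docs)) and the other from the existing vector search, with reciprocal_rank_fusion combining the two lists of document indices.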

Quick Start

1. Clone the repository

git clone https://github.com/diffusionstudio/agent

2. Install dependencies

cd agent
pip install uv
uv sync

3. Follow the documentation

Check the repository's README.md file for specific installation and usage instructions.

Repository Details

Owner: diffusionstudio
Repo: agent
Language: Python
License: MIT License
Last fetched: 8/10/2025
