
agent
The agentic video editing framework
Documentation
Video Composer Agent
Setup
pip install uv
uv sync
or alternatively:
uv add -r requirements.txt
Environment Variables
You will need to use the environment variables defined in .env.example to run Video Composer Agent. It's recommended you use Vercel Environment Variables for this, but a .env file is all that is necessary.
Note: Do not commit your .env file. Doing so would expose secrets that let others access your OpenAI and authentication provider accounts.
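As a quick sanity check before running the agent, a small script along these lines can confirm that the required variables are present. This is a minimal sketch using only the standard library; the helper names load_env_file and missing_variables are hypothetical, and OPENAI_API_KEY is an assumed variable name. Use the names actually listed in .env.example.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_variables(required: list[str], file_values: dict[str, str]) -> list[str]:
    """Return required names set neither in the process environment nor in the file."""
    return [
        name
        for name in required
        if not os.environ.get(name) and not file_values.get(name)
    ]


if __name__ == "__main__":
    # OPENAI_API_KEY is an assumed name; replace it with the names in .env.example.
    missing = missing_variables(["OPENAI_API_KEY"], load_env_file())
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```

The parser handles only plain KEY=VALUE lines; multi-line values and variable interpolation are out of scope for this sketch.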
Run Agent
To run the main script:
uv run main.py
Feel free to modify the main.py script to add new tools and modify the agent's behavior.
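For orientation, a new tool might look roughly like the sketch below. It assumes tools are plain classes that expose a forward method, as DocsSearchTool does in the usage example in this README. ClipDurationTool, its fields, and its parameters are hypothetical; how a tool is registered with the agent depends on the code in main.py, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class ClipDurationTool:
    """Hypothetical tool: computes a clip's duration from its frame data."""

    name: str = "clip_duration"
    description: str = (
        "Returns the duration in seconds of a clip, "
        "given its frame count and frame rate."
    )

    def forward(self, frame_count: int, fps: float) -> float:
        # Reject frame rates that would make the division meaningless.
        if fps <= 0:
            raise ValueError("fps must be positive")
        return frame_count / fps


tool = ClipDurationTool()
print(tool.forward(frame_count=240, fps=24.0))  # 10.0
```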
Demo
https://github.com/user-attachments/assets/55625231-89ce-4bf5-af45-de0672e597e1
Documentation Search
The documentation search system provides semantic search capabilities for Diffusion Studio's documentation:
Usage
from src.tools.docs_search import DocsSearchTool
# Initialize search tool
docs_search = DocsSearchTool()
# Basic search
results = docs_search.forward(query="how to add text overlay")
# With reranking for more accurate results
results = docs_search.forward(query="how to add text overlay", rerank_results=True)
# Limit number of results
results = docs_search.forward(query="how to add text overlay", limit=10)
# With filters
results = docs_search.forward(
    query="video transitions",
    filter_conditions={"section": "video-effects"}
)
The search tool:
- Uses vector embeddings for fast semantic search
- Supports optional semantic reranking for higher accuracy
- Allows filtering by documentation sections
- Auto-embeds documentation from configured URL
- Maintains embedding cache with hash checking
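The sketch below illustrates two of the ideas in this list: a hash-checked embedding cache and cosine-similarity search with section filtering. It is not the DocsSearchTool implementation. toy_embed is a stand-in for a real embedding model, and EmbeddingCache, search, and the document fields (id, text, section) are hypothetical names chosen for illustration. Reranking is not covered.

```python
import hashlib
import math


def content_hash(text: str) -> str:
    """Fingerprint of the text, used to detect changed documents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Stand-in embedding (hashed bag of words); a real system calls a model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    return vec


class EmbeddingCache:
    """Re-embeds a document only when its content hash changes."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[str, list[float]]] = {}
        self.embed_calls = 0  # counts cache misses, for demonstration

    def get(self, doc_id: str, text: str) -> list[float]:
        digest = content_hash(text)
        cached = self._store.get(doc_id)
        if cached is not None and cached[0] == digest:
            return cached[1]
        self.embed_calls += 1
        vector = toy_embed(text)
        self._store[doc_id] = (digest, vector)
        return vector


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, cache, limit=5, filter_conditions=None):
    """Return up to `limit` (score, doc_id) pairs, best match first."""
    query_vec = toy_embed(query)
    scored = []
    for doc in docs:
        # Skip documents that do not match every filter condition.
        if filter_conditions and any(
            doc.get(key) != value for key, value in filter_conditions.items()
        ):
            continue
        scored.append((cosine(query_vec, cache.get(doc["id"], doc["text"])), doc["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

Calling search twice over unchanged documents leaves embed_calls untouched on the second call, which is the point of the hash check: only new or edited documents are embedded again.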
Development
See CONTRIBUTING.md for development setup and guidelines.
ToDos (PRs Welcome)
- Make python agent fully async
- Add TS implementation of agent
- Stream the console logs of browser back to the agent
- Add support for feedback for more modalities like audio
- Speech to text to remove certain sentences
- Waveform analysis to sync audio to video
- Moderation analysis to remove certain phrases
- Add MCP integration
MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.
- Add BM25 to DocsSearchTool to enable hybrid search
- Add support for video understanding models like VideoLLaMA
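For contributors picking up the hybrid search item, one possible shape is sketched below: score documents with BM25, then merge the keyword ranking with the vector ranking using reciprocal rank fusion. This is an assumption about how hybrid search could be built, not a description of existing code. The function names, the whitespace tokenizer, and the parameter defaults (k1=1.5, b=0.75, k=60) are illustrative choices.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with BM25 (whitespace tokens)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Number of documents containing each term.
    doc_freq: Counter = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer documents need more matches.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Merge several rankings of document indices into one fused ranking."""
    fused: dict[int, float] = {}
    for ranking in rankings:
        for rank, doc_index in enumerate(ranking, start=1):
            fused[doc_index] = fused.get(doc_index, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda index: fused[index], reverse=True)
```

Rank fusion is used here because BM25 scores and cosine similarities live on different scales, and combining ranks avoids having to normalize them against each other.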
Quick Start
Clone the repository
git clone https://github.com/diffusionstudio/agent
Install dependencies
cd agent
uv sync
Follow the documentation
Check the repository's README.md file for specific installation and usage instructions.