MCP Server: arghya05/s12 (public)

An intelligent web automation agent that can understand, navigate, and automatically interact with web pages.

Repository Info

Language: TypeScript
License: MIT License


Documentation

🤖 AI Browser Agent

Intelligent web automation agent that can understand, navigate, and interact with web pages autonomously.


🎯 What This Agent Does

This AI agent can:

🌐 Open and navigate to any website automatically
🔍 Intelligently detect form elements and interactive components
✍️ Fill forms with provided data
📤 Submit forms autonomously
🧠 Adapt strategies when initial approaches fail
📸 Take screenshots for verification

✨ Features

Multi-Strategy Element Detection: Uses both DOM service and CSS selectors
Smart Fallback Logic: Automatically tries alternative approaches
Real Browser Automation: Uses Playwright with Chromium
Local Profile Management: Avoids permission issues with local directories
Comprehensive Logging: Detailed progress tracking
Error Recovery: Handles failures gracefully

🚀 Quick Start

Prerequisites

# Python 3.11+
# Conda environment (recommended)

Installation

Clone the repository:

git clone https://github.com/yourusername/browser-ai-agent.git
cd browser-ai-agent

Install dependencies:

pip install -r requirements.txt

Run the demo:

python examples/google_form_demo.py

📖 Usage Examples

Basic Form Filling

import asyncio

from ai_browser_agent import DirectFormFiller

async def fill_my_form():
    filler = DirectFormFiller()
    try:
        await filler.fill_google_form()
    finally:
        # Always release the browser, even if form filling fails
        await filler.close()

# Run the agent
asyncio.run(fill_my_form())
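
The try/finally cleanup above can also be wrapped in an async context manager so that every caller gets the cleanup for free. The following is a minimal sketch, not part of the project: `managed_filler` and `FakeFiller` are hypothetical names, and the only assumption about the real `DirectFormFiller` is that it exposes an awaitable `close()`, as the example above suggests.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_filler(factory):
    """Create a filler via `factory` and always close it on exit.

    `factory` is any zero-argument callable returning an object with an
    awaitable close() method (hypothetical helper, not part of the project).
    """
    filler = factory()
    try:
        yield filler
    finally:
        await filler.close()


class FakeFiller:
    """Stand-in for DirectFormFiller so the sketch runs without a browser."""

    def __init__(self):
        self.filled = False
        self.closed = False

    async def fill_google_form(self):
        self.filled = True

    async def close(self):
        self.closed = True


async def demo():
    async with managed_filler(FakeFiller) as filler:
        await filler.fill_google_form()
    return filler


if __name__ == "__main__":
    result = asyncio.run(demo())
    print(result.filled, result.closed)
```

With the real agent, passing `DirectFormFiller` as the factory should behave the same way, provided its constructor takes no required arguments.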

Custom Form Data

# Modify the form data in the script
form_data = {
    'name': 'Your Name',
    'email': 'your.email@domain.com',
    'organization': 'Your Organization',
    'message': 'Your custom message'
}

🎬 Demo

The included demo successfully:

Opens Google Form - "Agentic Form" (live example)
Detects 4 text inputs using CSS selectors
Fills form fields:
- Name: "AI Agent Demo"
- Email: "demo@tsai.in"
- Organization: "The School of AI"
- Message: Custom automation message
Submits form automatically
Takes screenshot for verification
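
The demo flow above (open the form, locate text inputs with a CSS selector, fill them in order, take a screenshot) can be approximated directly with Playwright's async API. This is a rough sketch under stated assumptions, not the project's actual implementation: the `input[type="text"]` selector, the field order, the `form_filled.png` path, and the helper names `pair_inputs_with_values` and `fill_text_inputs` are all hypothetical, and a real form may need different selectors. Submission is left out because the submit control's selector is not given here.

```python
# Assumed order in which the text inputs appear on the form.
FIELD_ORDER = ("name", "email", "organization", "message")


def pair_inputs_with_values(form_data, detected_count, field_order=FIELD_ORDER):
    """Return (input_index, value) pairs for fields that have both a value
    and a detected input at the same position in `field_order`."""
    return [
        (index, form_data[key])
        for index, key in enumerate(field_order)
        if index < detected_count and key in form_data
    ]


async def fill_text_inputs(url, form_data, screenshot_path="form_filled.png"):
    """Open `url`, fill the detected text inputs by position, take a screenshot."""
    # Imported lazily so the pure helper above works without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        try:
            await page.goto(url)
            inputs = page.locator('input[type="text"]')  # assumed selector
            count = await inputs.count()
            for index, value in pair_inputs_with_values(form_data, count):
                await inputs.nth(index).fill(value)
            await page.screenshot(path=screenshot_path)
        finally:
            await browser.close()
```

Calling `asyncio.run(fill_text_inputs(url, form_data))` with a real form URL would run the flow. Keeping the index mapping in a pure function makes it possible to check the fill plan without launching a browser.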

🛠️ Technical Architecture

Core Components

DirectFormFiller: Main agent class
BrowserSession: Manages browser lifecycle
DomService: Handles page element detection
CSS Selectors: Fallback element targeting
Action Registry: Available browser actions

Key Technologies

Playwright: Browser automation engine
Chromium: Headless/headed browser
AsyncIO: Asynchronous execution
CSS Selectors: Robust element targeting

🔧 Configuration

Browser Settings

from pathlib import Path

# BrowserProfile is assumed to be imported from the project's browser layer;
# its import is not shown in this snippet.

# Local profile directory (avoids permission issues)
local_profile_dir = Path(__file__).parent / "browser_profiles" / "default"
local_downloads_dir = Path(__file__).parent / "browser_profiles" / "downloads"

profile = BrowserProfile(
    headless=False,  # Set to True for headless mode
    user_data_dir=local_profile_dir,
    downloads_dir=local_downloads_dir
)
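
The configuration above points the browser at directories that live next to the script. Creating them up front is one way to make permission problems surface early rather than at browser launch. A minimal sketch with `pathlib`, assuming the same `browser_profiles/default` and `browser_profiles/downloads` layout; `ensure_profile_dirs` is a hypothetical helper, not part of the project:

```python
import tempfile
from pathlib import Path


def ensure_profile_dirs(base_dir):
    """Create <base_dir>/browser_profiles/{default,downloads} if missing."""
    root = Path(base_dir) / "browser_profiles"
    profile_dir = root / "default"
    downloads_dir = root / "downloads"
    for directory in (profile_dir, downloads_dir):
        # parents=True builds browser_profiles/ too; exist_ok=True makes reruns safe
        directory.mkdir(parents=True, exist_ok=True)
    return profile_dir, downloads_dir


if __name__ == "__main__":
    # A temporary directory stands in for the script's own folder.
    with tempfile.TemporaryDirectory() as tmp:
        profile_dir, downloads_dir = ensure_profile_dirs(tmp)
        print(profile_dir.is_dir(), downloads_dir.is_dir())
```

In the agent itself, `Path(__file__).parent` would presumably take the place of the temporary directory.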

Element Detection Strategy

Primary: DOM Service with highlight elements
Fallback: CSS selectors with multiple strategies
Recovery: Index-based element targeting
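
The three-tier strategy above amounts to trying detectors in priority order and stopping at the first one that finds something. A minimal, browser-free sketch of that control flow follows. The detector functions are hypothetical stand-ins, since the real DOM service, CSS selector, and index-based lookups are not shown here:

```python
def detect_with_fallback(strategies):
    """Run (name, detector) pairs in order; return the first non-empty result.

    A detector that raises or returns nothing counts as a miss. Returns a
    tuple of (winning strategy name, elements, list of recorded errors).
    """
    errors = []
    for name, detector in strategies:
        try:
            elements = detector()
        except Exception as exc:  # a failing strategy must not stop the chain
            errors.append((name, str(exc)))
            continue
        if elements:
            return name, elements, errors
    return None, [], errors


# Hypothetical stand-ins for the three tiers described above.
def dom_service_detector():
    raise RuntimeError("DOM service unavailable")


def css_selector_detector():
    return ["input#name", "input#email"]


def index_based_detector():
    return ["element[0]"]


if __name__ == "__main__":
    chain = [
        ("dom_service", dom_service_detector),
        ("css_selectors", css_selector_detector),
        ("index_based", index_based_detector),
    ]
    print(detect_with_fallback(chain))
```

Returning the collected errors alongside the result keeps the failed strategies visible in logs even when a later tier succeeds.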

📁 Project Structure

browser-ai-agent/
├── README.md                 # This file
├── requirements.txt          # Python dependencies
├── examples/
│   └── google_form_demo.py   # Working form filling demo
├── docs/
│   └── setup.md              # Detailed setup guide
├── browser_profiles/         # Local browser data (git ignored)
└── .gitignore                # Git ignore rules

🐛 Troubleshooting

Permission Issues

Uses local browser_profiles/ directory
No system config directory dependencies

Element Detection Failures

Multiple fallback strategies implemented
CSS selectors as robust backup
Index-based targeting as final fallback

Browser Issues

Automatically manages browser lifecycle
Local profile prevents conflicts
Detailed logging for debugging
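
When a browser action fails intermittently, a retry wrapper that logs each attempt is one way to combine the logging and recovery ideas above. The sketch below uses only the standard library. `retry_async`, its parameters, the `browser_agent` logger name, and `make_flaky_action` are hypothetical and are not taken from the project:

```python
import asyncio
import logging

logger = logging.getLogger("browser_agent")  # assumed logger name


async def retry_async(action, attempts=3, delay=0.0):
    """Await `action()` up to `attempts` times, logging every failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = await action()
            logger.info("attempt %d succeeded", attempt)
            return result
        except Exception as exc:
            last_error = exc
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


def make_flaky_action(failures):
    """Return an async action that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def flaky_action():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("element not found")
        return state["calls"]

    return flaky_action


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(asyncio.run(retry_async(make_flaky_action(2))))
```

Chaining the final error with `from last_error` preserves the original traceback, which helps when reading the debug log afterwards.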

🤝 Contributing

Fork the repository
Create a feature branch
Test your changes thoroughly
Submit a pull request

📄 License

MIT License - see LICENSE file for details.

🎉 Success Story

This agent successfully demonstrated:

Intelligent adaptation when the MCP HTTP protocol failed
Robust element detection using multiple strategies
Real-world automation on live Google Forms
Error recovery and graceful fallbacks

Result: 100% successful form completion with zero manual intervention! 🎯


Built with ❤️ for intelligent web automation

Quick Start

Clone the repository

git clone https://github.com/arghya05/s12

Install dependencies

cd s12
pip install -r requirements.txt

Follow the documentation

Check the repository's README.md file for specific installation and usage instructions.

Repository Details

Owner: arghya05
Repo: s12
Language: TypeScript
License: MIT License
Last fetched: 8/10/2025

