arghya05/s12 (MCP Server, public)

An intelligent web automation agent that can understand, navigate, and interact with web pages autonomously.

Documentation

🤖 AI Browser Agent

Intelligent web automation agent that can understand, navigate, and interact with web pages autonomously.


🎯 What This Agent Does

This AI agent can:

  • 🌐 Open and navigate to any website automatically
  • 🔍 Intelligently detect form elements and interactive components
  • ✍️ Fill forms with provided data
  • 📤 Submit forms autonomously
  • 🧠 Adapt strategies when initial approaches fail
  • 📸 Take screenshots for verification

✨ Features

  • Multi-Strategy Element Detection: Uses both DOM service and CSS selectors
  • Smart Fallback Logic: Automatically tries alternative approaches
  • Real Browser Automation: Uses Playwright with Chromium
  • Local Profile Management: Avoids permission issues with local directories
  • Comprehensive Logging: Detailed progress tracking
  • Error Recovery: Handles failures gracefully

🚀 Quick Start

Prerequisites

# Python 3.11+
# Conda environment (recommended)

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/browser-ai-agent.git
cd browser-ai-agent
  2. Install dependencies:
pip install -r requirements.txt
  3. Run the demo:
python examples/google_form_demo.py

📖 Usage Examples

Basic Form Filling

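import asyncio

from ai_browser_agent import DirectFormFiller

async def fill_my_form():
    filler = DirectFormFiller()
    try:
        # Fill and submit the demo Google Form
        await filler.fill_google_form()
    finally:
        # Always release the browser, even if filling fails
        await filler.close()

# Run the agent
if __name__ == "__main__":
    asyncio.run(fill_my_form())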
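
Live pages sometimes fail transiently (slow loads, elements that are not ready yet), so a call such as fill_my_form could be wrapped in a small retry helper. The sketch below is an illustration only: retry_async and its attempts and delay parameters are hypothetical and are not part of the repository.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

async def retry_async(
    action: Callable[[], Awaitable[T]],
    attempts: int = 3,
    delay: float = 0.0,
) -> T:
    """Await action() up to `attempts` times, pausing longer after each failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await action()
        except Exception as exc:  # broad on purpose: this is a generic wrapper
            last_error = exc
            if attempt < attempts:
                # Linear backoff: delay, 2 * delay, 3 * delay, ...
                await asyncio.sleep(delay * attempt)
    raise RuntimeError(f"failed after {attempts} attempts") from last_error
```

With this helper, the run line would become asyncio.run(retry_async(fill_my_form)), assuming each attempt creates and closes its own browser as fill_my_form does.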

Custom Form Data

# Modify the form data in the script
form_data = {
    'name': 'Your Name',
    'email': 'your.email@domain.com',
    'organization': 'Your Organization',
    'message': 'Your custom message'
}
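
Before handing a dictionary like this to the agent, it can help to check that every expected field is present and non-empty. The sketch below assumes the four keys shown above are the ones the demo form needs; REQUIRED_FIELDS and validate_form_data are hypothetical helpers, not part of the repository.

```python
# Assumed field names, taken from the form_data example above.
REQUIRED_FIELDS = ("name", "email", "organization", "message")

def validate_form_data(form_data: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the data looks usable."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = form_data.get(field, "")
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    # Very rough sanity check, not a full email validator.
    email = form_data.get("email", "")
    if isinstance(email, str) and email.strip() and "@" not in email:
        problems.append("email does not contain '@'")
    return problems
```

Calling validate_form_data(form_data) before starting the browser surfaces missing values early, instead of partway through a form submission.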

🎬 Demo

The included demo successfully:

  1. Opens Google Form - "Agentic Form" (live example)
  2. Detects 4 text inputs using CSS selectors
  3. Fills form fields:
    • Name: "AI Agent Demo"
    • Email: "demo@tsai.in"
    • Organization: "The School of AI"
    • Message: Custom automation message
  4. Submits form automatically
  5. Takes screenshot for verification

🛠️ Technical Architecture

Core Components

  • DirectFormFiller: Main agent class
  • BrowserSession: Manages browser lifecycle
  • DomService: Handles page element detection
  • CSS Selectors: Fallback element targeting
  • Action Registry: Available browser actions

Key Technologies

  • Playwright: Browser automation engine
  • Chromium: Headless/headed browser
  • AsyncIO: Asynchronous execution
  • CSS Selectors: Robust element targeting

🔧 Configuration

Browser Settings

from pathlib import Path

# BrowserProfile comes from the project's browser automation library
# (its import is not shown in the original snippet).

# Local profile directory (avoids permission issues)
local_profile_dir = Path(__file__).parent / "browser_profiles" / "default"
local_downloads_dir = Path(__file__).parent / "browser_profiles" / "downloads"

profile = BrowserProfile(
    headless=False,  # Set to True for headless mode
    user_data_dir=local_profile_dir,
    downloads_dir=local_downloads_dir
)
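
The profile and downloads directories need to exist and be writable before the browser starts. A minimal sketch of creating them with pathlib follows; ensure_profile_dirs is a hypothetical helper, and whether the agent already creates these directories itself is not stated in this README.

```python
from pathlib import Path

def ensure_profile_dirs(base_dir: Path) -> tuple[Path, Path]:
    """Create the local profile and downloads directories under base_dir."""
    profile_dir = base_dir / "browser_profiles" / "default"
    downloads_dir = base_dir / "browser_profiles" / "downloads"
    for directory in (profile_dir, downloads_dir):
        # parents=True builds browser_profiles/ as needed;
        # exist_ok=True makes repeated runs harmless.
        directory.mkdir(parents=True, exist_ok=True)
    return profile_dir, downloads_dir
```

The returned paths could then be passed as user_data_dir and downloads_dir in the configuration above.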

Element Detection Strategy

  1. Primary: DOM service with element highlighting enabled
  2. Fallback: CSS selectors, trying several selector strategies
  3. Recovery: Index-based element targeting
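
The three-tier order above can be pictured as a chain of strategies tried in sequence until one returns elements. The sketch below shows only that control flow, using stand-in functions instead of a real browser; detect_with_fallback and the three stand-ins are hypothetical and do not reflect the repository's actual code.

```python
from typing import Callable, Optional, Sequence

Strategy = tuple[str, Callable[[], Optional[list[str]]]]

def detect_with_fallback(strategies: Sequence[Strategy]) -> tuple[str, list[str]]:
    """Try each strategy in order and return the first non-empty result."""
    errors = []
    for name, strategy in strategies:
        try:
            elements = strategy()
        except Exception as exc:
            errors.append(f"{name}: {exc}")
            continue
        if elements:
            return name, elements
        errors.append(f"{name}: no elements found")
    raise LookupError("all strategies failed: " + "; ".join(errors))

# Stand-ins for the three tiers (no browser involved).
def dom_service() -> list[str]:
    raise RuntimeError("DOM service unavailable")

def css_selectors() -> list[str]:
    return ["input[type='text']", "textarea"]

def index_based() -> list[str]:
    return ["element #0"]

name, elements = detect_with_fallback([
    ("dom", dom_service),
    ("css", css_selectors),
    ("index", index_based),
])
print(name, elements)
```

Collecting the error from each failed tier keeps the final exception informative when every strategy fails.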

📁 Project Structure

browser-ai-agent/
├── README.md                 # This file
├── requirements.txt          # Python dependencies
├── examples/
│   └── google_form_demo.py   # Working form filling demo
├── docs/
│   └── setup.md             # Detailed setup guide
├── browser_profiles/        # Local browser data (git ignored)
└── .gitignore              # Git ignore rules

🐛 Troubleshooting

Permission Issues

  • The agent stores browser data in the local browser_profiles/ directory
  • It does not depend on any system config directory

Element Detection Failures

  • Multiple fallback strategies implemented
  • CSS selectors as robust backup
  • Index-based targeting as final fallback

Browser Issues

  • Automatically manages browser lifecycle
  • Local profile prevents conflicts
  • Detailed logging for debugging
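
When debugging, it helps to have the log output at DEBUG level with timestamps. The sketch below uses Python's standard logging module; the logger name "browser_agent" and the configure_logging helper are assumptions, since this README does not say how the agent configures its own logging.

```python
import logging

def configure_logging(level: int = logging.DEBUG) -> logging.Logger:
    """Return a logger that writes timestamped messages to stderr."""
    logger = logging.getLogger("browser_agent")  # hypothetical logger name
    logger.setLevel(level)
    # Attach the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A call such as configure_logging().debug("opening form") would then print a timestamped line for each step.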

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Test your changes thoroughly
  4. Submit a pull request

📄 License

MIT License - see LICENSE file for details.

🎉 Success Story

This agent successfully demonstrated:

  • Intelligent adaptation when MCP HTTP protocol failed
  • Robust element detection using multiple strategies
  • Real-world automation on live Google Forms
  • Error recovery and graceful fallbacks

Result: the demo run filled in and submitted the form with zero manual intervention! 🎯


Built with ❤️ for intelligent web automation

Repository Details

  • Owner: arghya05
  • Repo: s12
  • Language: TypeScript
  • License: MIT License
  • Last fetched: 8/10/2025
