
s12
An intelligent web automation agent that can understand, navigate, and interact with web pages autonomously.
About This Server
Model Context Protocol (MCP) - This server can be integrated with AI applications to provide additional context and capabilities, enabling enhanced AI interactions and functionality.
Documentation
🤖 AI Browser Agent
Intelligent web automation agent that can understand, navigate, and interact with web pages autonomously.
🎯 What This Agent Does
This AI agent can:
- 🌐 Open and navigate to any website automatically
- 🔍 Intelligently detect form elements and interactive components
- ✍️ Fill forms with provided data
- 📤 Submit forms autonomously
- 🧠 Adapt strategies when initial approaches fail
- 📸 Take screenshots for verification
✨ Features
- Multi-Strategy Element Detection: Uses both DOM service and CSS selectors
- Smart Fallback Logic: Automatically tries alternative approaches
- Real Browser Automation: Uses Playwright with Chromium
- Local Profile Management: Avoids permission issues with local directories
- Comprehensive Logging: Detailed progress tracking
- Error Recovery: Handles failures gracefully
🚀 Quick Start
Prerequisites
# Python 3.11+
# Conda environment (recommended)
Installation
- Clone the repository:
git clone https://github.com/yourusername/browser-ai-agent.git
cd browser-ai-agent
- Install dependencies:
pip install -r requirements.txt
- Run the demo:
python examples/google_form_demo.py
📖 Usage Examples
Basic Form Filling
import asyncio

from ai_browser_agent import DirectFormFiller


async def fill_my_form():
    # Create the agent, run the form-filling flow, and always release the browser.
    filler = DirectFormFiller()
    try:
        await filler.fill_google_form()
    finally:
        await filler.close()


# Run the agent
asyncio.run(fill_my_form())
Custom Form Data
# Modify the form data in the script
form_data = {
    'name': 'Your Name',
    'email': 'your.email@domain.com',
    'organization': 'Your Organization',
    'message': 'Your custom message'
}
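Before handing custom data to the agent, it can help to check that every field is present. The sketch below is only an illustration: `find_form_data_problems` and `REQUIRED_FIELDS` are hypothetical names, not part of this repository, and the email check is deliberately minimal.

```python
from typing import Dict, List

# Assumption: the four keys used in the form_data example above are all required.
REQUIRED_FIELDS = ("name", "email", "organization", "message")


def find_form_data_problems(form_data: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the data looks usable."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        value = form_data.get(field, "")
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    email = form_data.get("email", "")
    # Very rough sanity check only; this is not full email validation.
    if isinstance(email, str) and email.strip() and "@" not in email:
        problems.append("email does not contain '@'")
    return problems


form_data = {
    "name": "Your Name",
    "email": "your.email@domain.com",
    "organization": "Your Organization",
    "message": "Your custom message",
}
print(find_form_data_problems(form_data))  # prints: []
```

Running the check first means a typo in a key name surfaces before the browser is launched.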
🎬 Demo
The included demo successfully:
- Opens Google Form - "Agentic Form" (live example)
- Detects 4 text inputs using CSS selectors
- Fills form fields:
  - Name: "AI Agent Demo"
  - Email: "demo@tsai.in"
  - Organization: "The School of AI"
  - Message: Custom automation message
- Submits form automatically
- Takes screenshot for verification
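The detect-and-fill part of the demo can be sketched roughly as follows. This is not the repository's implementation: `fill_text_inputs` is a hypothetical helper, and the selector string is an assumption about how the text inputs might be matched. The `page` argument is expected to behave like a Playwright async `Page`, using only `Page.locator`, `Locator.count`, `Locator.nth`, and `Locator.fill`.

```python
from typing import Any, Sequence

# Assumption: the form's text inputs can be matched with this CSS selector list.
TEXT_INPUT_SELECTOR = 'input[type="text"], input[type="email"], textarea'


async def fill_text_inputs(
    page: Any,
    values: Sequence[str],
    selector: str = TEXT_INPUT_SELECTOR,
) -> int:
    """Fill matched inputs in document order and return how many were filled."""
    inputs = page.locator(selector)
    found = await inputs.count()
    filled = 0
    for index, value in enumerate(values):
        if index >= found:
            break  # more values than inputs: stop rather than raise
        await inputs.nth(index).fill(value)
        filled += 1
    return filled
```

With Playwright installed, `page` would typically come from `browser.new_page()` inside an `async with async_playwright() as p:` block, followed by `await page.goto(...)`. Comparing the returned count with `len(values)` is one simple way to decide whether a fallback strategy is needed.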
🛠️ Technical Architecture
Core Components
- DirectFormFiller: Main agent class
- BrowserSession: Manages browser lifecycle
- DomService: Handles page element detection
- CSS Selectors: Fallback element targeting
- Action Registry: Available browser actions
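As a rough illustration of what an action registry could look like, the sketch below maps action names to async callables. It is a guess at the general pattern, not the project's actual code: `ActionRegistry`, `go_to_url`, and `input_text` are hypothetical names, and the actions only return strings instead of driving a browser.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

ActionFn = Callable[..., Awaitable[str]]


class ActionRegistry:
    """Keeps a name -> coroutine function mapping of available actions."""

    def __init__(self) -> None:
        self._actions: Dict[str, ActionFn] = {}

    def action(self, name: str) -> Callable[[ActionFn], ActionFn]:
        """Decorator that registers a coroutine function under the given name."""
        def decorator(fn: ActionFn) -> ActionFn:
            if name in self._actions:
                raise ValueError(f"action already registered: {name}")
            self._actions[name] = fn
            return fn
        return decorator

    def names(self) -> List[str]:
        return sorted(self._actions)

    async def run(self, name: str, **params: object) -> str:
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        return await self._actions[name](**params)


registry = ActionRegistry()


@registry.action("go_to_url")
async def go_to_url(url: str) -> str:
    return f"navigated to {url}"  # a real action would drive the browser here


@registry.action("input_text")
async def input_text(index: int, text: str) -> str:
    return f"typed {text!r} into element {index}"


print(registry.names())  # prints: ['go_to_url', 'input_text']
print(asyncio.run(registry.run("input_text", index=0, text="AI Agent Demo")))
# prints: typed 'AI Agent Demo' into element 0
```

Looking actions up by name keeps the agent's decision step separate from the code that actually touches the browser.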
Key Technologies
- Playwright: Browser automation engine
- Chromium: Headless/headed browser
- AsyncIO: Asynchronous execution
- CSS Selectors: Robust element targeting
🔧 Configuration
Browser Settings
from pathlib import Path

# BrowserProfile is assumed to be imported from the project's browser layer.

# Local profile directory (avoids permission issues)
local_profile_dir = Path(__file__).parent / "browser_profiles" / "default"
local_downloads_dir = Path(__file__).parent / "browser_profiles" / "downloads"
profile = BrowserProfile(
    headless=False,  # Set to True for headless mode
    user_data_dir=local_profile_dir,
    downloads_dir=local_downloads_dir,
)
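The configuration above only builds the paths. One possible way to make sure the directories exist before the browser starts is sketched here; `prepare_profile_dirs` is a hypothetical helper, not part of the repository, and it assumes the same `browser_profiles/default` and `browser_profiles/downloads` layout.

```python
from pathlib import Path
from typing import Tuple


def prepare_profile_dirs(base_dir: Path) -> Tuple[Path, Path]:
    """Create the local profile and downloads directories if they are missing."""
    profile_dir = base_dir / "browser_profiles" / "default"
    downloads_dir = base_dir / "browser_profiles" / "downloads"
    for directory in (profile_dir, downloads_dir):
        # parents=True creates browser_profiles/ as needed;
        # exist_ok=True makes repeated runs safe.
        directory.mkdir(parents=True, exist_ok=True)
    return profile_dir, downloads_dir
```

Calling it with `Path(__file__).parent` would yield the same two paths used in the settings above, with the directories created inside the project tree instead of a system config location.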
Element Detection Strategy
- Primary: DOM Service with highlight elements
- Fallback: CSS selectors with multiple strategies
- Recovery: Index-based element targeting
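The three-tier strategy above amounts to a fallback chain: try each detector in order and keep the first one that returns something. The sketch below shows the idea under stated assumptions. `detect_with_fallback` and the three stand-in detectors are hypothetical, and the detectors return placeholder strings in place of real page elements.

```python
import logging
from typing import Callable, Iterable, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

Strategy = Tuple[str, Callable[[], Sequence[object]]]


def detect_with_fallback(
    strategies: Iterable[Strategy],
) -> Tuple[Optional[str], Sequence[object]]:
    """Try each (name, finder) in order; return the first non-empty result."""
    for name, finder in strategies:
        try:
            elements = finder()
        except Exception as exc:  # a failing strategy should not abort the chain
            logger.warning("%s failed: %s", name, exc)
            continue
        if elements:
            return name, elements
    return None, []


def dom_service_lookup() -> Sequence[object]:
    raise RuntimeError("DOM service unavailable")  # simulated failure


def css_lookup() -> Sequence[object]:
    return ["input#name", "input#email"]  # placeholder results


strategy, found = detect_with_fallback([
    ("dom_service", dom_service_lookup),
    ("css_selectors", css_lookup),
    ("index_based", lambda: ["element[0]"]),
])
print(strategy, found)  # prints: css_selectors ['input#name', 'input#email']
```

Returning the name of the strategy that succeeded makes it easy to log which tier was used on a given page.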
📁 Project Structure
browser-ai-agent/
├── README.md # This file
├── requirements.txt # Python dependencies
├── examples/
│ └── google_form_demo.py # Working form filling demo
├── docs/
│ └── setup.md # Detailed setup guide
├── browser_profiles/ # Local browser data (git ignored)
└── .gitignore # Git ignore rules
🐛 Troubleshooting
Permission Issues
- Uses local browser_profiles/ directory
- No system config directory dependencies
Element Detection Failures
- Multiple fallback strategies implemented
- CSS selectors as robust backup
- Index-based targeting as final fallback
Browser Issues
- Automatically manages browser lifecycle
- Local profile prevents conflicts
- Detailed logging for debugging
🤝 Contributing
- Fork the repository
- Create a feature branch
- Test your changes thoroughly
- Submit a pull request
📄 License
MIT License - see LICENSE file for details.
🎉 Success Story
This agent successfully demonstrated:
- Intelligent adaptation when the MCP HTTP protocol failed
- Robust element detection using multiple strategies
- Real-world automation on live Google Forms
- Error recovery and graceful fallbacks
Result: the demo form was filled and submitted with zero manual intervention! 🎯
🔗 Links
- Demo Video
- Documentation
- Issues
- Contributing Guide
Built with ❤️ for intelligent web automation
Quick Start
Clone the repository
git clone https://github.com/arghya05/s12
Install dependencies
cd s12
pip install -r requirements.txt
Follow the documentation
Check the repository's README.md file for specific installation and usage instructions.