
something
A threat hunting chatbot built on RAG, supporting document analysis, log processing, and advanced threat detection.
ThreatChat - Threat Hunting Chatbot with MCP
This project implements a Retrieval-Augmented Generation (RAG) threat hunting chatbot using:
- Google Gemini for embeddings and LLM responses
- Qdrant as the vector database
- FastAPI for the web/chat API
- Model Context Protocol (MCP) for enhanced Elasticsearch integration
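The flow these components implement (embed the query, retrieve the nearest documents, ground the LLM prompt in them) can be sketched without any external service. Everything below is a hypothetical stand-in, not the project's code: `embed` is a toy word-count function rather than the Gemini embedding model, a plain list replaces Qdrant, and the function names are illustrative.

```python
import math
from collections import Counter
from typing import List

def embed(text: str) -> Counter:
    """Toy embedding (word counts); stands in for the Gemini embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question: str, docs: List[str], top_k: int = 5) -> List[str]:
    """Return the top_k documents most similar to the question (stands in for a Qdrant search)."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, context: List[str]) -> str:
    """Assemble the grounded prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

docs = [
    "Mimikatz dumps credentials from LSASS memory",
    "Beaconing is periodic C2 traffic",
    "Phishing emails deliver malicious attachments",
]
question = "how are credentials dumped from LSASS"
top = retrieve(question, docs, top_k=1)
prompt = build_prompt(question, top)
```

In the real pipeline the similarity search runs inside the vector database and the prompt goes to the LLM; only the order of the steps is meant to carry over.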
Features
- Document Ingestion and Indexing
  - On startup, scans the documents/ directory for PDF, TXT, and MD files
  - Uses the Gemini embedding model (models/embedding-001) to vectorize documents
  - Stores embeddings in a Qdrant collection named threat_docs
- Chat Interface
  - REST endpoint at POST /chat to ask questions
  - Embeds user queries and retrieves the top-5 relevant documents from Qdrant
  - Feeds retrieved context into the Gemini LLM to generate grounded answers
- Direct EVTX Log Analysis
  - Upload Windows Event Log (.evtx) files directly for analysis
  - Processes logs directly without going through the RAG pipeline
  - Provides security-focused analysis with MITRE ATT&CK TTPs
  - Formats responses clearly for security analysts
- Elasticsearch Integration via MCP
  - Connect to Elasticsearch through a Model Context Protocol server
  - Visualize and analyze network connections from Sysmon logs
  - Query logs directly with natural language
  - Intelligent caching system for performance optimization
  - Cache management interface for statistics and manual clearing
- Advanced Threat Hunting Queries
  - Lateral movement detection between systems
  - PowerShell command analysis with security context
  - Process creation event analysis with command details
  - Login failure detection and brute force attempt identification
  - Command and Control (C2) traffic detection
  - Data exfiltration activity monitoring
  - Privilege escalation attempt detection
  - Time range extraction from natural language (e.g., "last 7 days", "since yesterday")
  - Results mapped to MITRE ATT&CK techniques and tactics
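Time range extraction from phrases such as "last 7 days" or "since yesterday" can be approximated with a couple of regular expressions. This is a minimal sketch under assumptions: the function name, the supported phrases, and the choice to start "since yesterday" at midnight are illustrative and may not match how ThreatChat parses questions.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Maps the unit word found in the question to the timedelta keyword argument.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days", "week": "weeks"}

def extract_time_range(question: str, now: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Return (start, end) for 'last N <unit>' or 'since yesterday', else None."""
    text = question.lower()
    m = re.search(r"\blast\s+(\d+)\s+(minute|hour|day|week)s?\b", text)
    if m:
        delta = timedelta(**{_UNITS[m.group(2)]: int(m.group(1))})
        return now - delta, now
    if re.search(r"\bsince\s+yesterday\b", text):
        # Assumption: "yesterday" starts at midnight of the previous day.
        start = (now - timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
        return start, now
    return None  # no recognizable time phrase in the question
```

Passing `now` explicitly keeps the function deterministic; the resulting pair could then be used as a timestamp range filter in the log query.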
Prerequisites
- Python 3.8+
- Qdrant server (local or remote)
- Docker (optional, for running Qdrant locally)
Setup
- Clone and navigate
    git clone <repo-url> .
    cd C:/Users/moham/Desktop/pfe/threat-chat   # the author's local path; use your own project directory
- Install dependencies
    pip install -r requirements.txt
- Run tests (optional)
    python run_tests.py
    # Run specific test modules
    python run_tests.py --pattern test_mcp_client.py
    # Run with less verbose output
    python run_tests.py --quiet
- Configure environment
  Create a .env file with the following variables:
    # Gemini AI Configuration
    GEMINI_API_KEY=your_gemini_api_key
    GEMINI_FALLBACK_API_KEY=your_backup_api_key
    GEMINI_TIMEOUT=120

    # Vector Database Configuration
    QDRANT_URL=http://localhost:6333
    QDRANT_API_KEY=
    DOCUMENTS_PATH=./documents

    # Elasticsearch MCP Server Configuration
    MCP_ES_URL=http://localhost:8000/api
    MCP_API_KEY=your_mcp_api_key
    ES_INDEX=logs
    MCP_TIMEOUT=60
    MCP_CACHE_ENABLED=true
    MCP_CACHE_TTL=300
- Optional: Set up MCP server
  For enhanced Elasticsearch integration, set up the MCP server using the guides in the docs/ directory:
  - MCP Server Setup
  - MCP Server Implementation
  - MCP API Documentation
- Create and activate a virtual environment
    python -m venv venv
    .\venv\Scripts\Activate.ps1
- Install dependencies
    pip install -r requirements.txt
- Environment variables
  Copy the .env file and ensure the following keys are set:
    # Gemini AI Configuration
    GEMINI_API_KEY=your_api_key_here
    GEMINI_FALLBACK_API_KEY=optional_fallback_key
    GEMINI_TIMEOUT=120

    # Vector Database Configuration
    QDRANT_URL=http://localhost:6333
    QDRANT_API_KEY=
    DOCUMENTS_PATH=./documents

    # Elasticsearch MCP Server Configuration
    MCP_ES_URL=http://localhost:8000/api
    MCP_API_KEY=your_mcp_api_key_here
    ES_INDEX=logs
    MCP_TIMEOUT=60
    MCP_CACHE_ENABLED=true
    MCP_CACHE_TTL=300
- Run Qdrant (if not already running)
    docker run -p 6333:6333 qdrant/qdrant
- Create a documents/ folder and add your threat docs (PDF, TXT, MD)
    mkdir documents
- Start the application
    uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
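To illustrate how the environment variables from the Setup steps might be consumed, here is a minimal sketch that reads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical names, and the defaults simply mirror the sample values shown above; the application's own configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping

@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    documents_path: str
    es_index: str
    gemini_timeout: int      # seconds
    mcp_timeout: int         # seconds
    mcp_cache_enabled: bool
    mcp_cache_ttl: int       # seconds

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to the sample values."""
    return Settings(
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        documents_path=env.get("DOCUMENTS_PATH", "./documents"),
        es_index=env.get("ES_INDEX", "logs"),
        gemini_timeout=int(env.get("GEMINI_TIMEOUT", "120")),
        mcp_timeout=int(env.get("MCP_TIMEOUT", "60")),
        # Environment values are strings, so the boolean is parsed explicitly.
        mcp_cache_enabled=env.get("MCP_CACHE_ENABLED", "true").strip().lower() in ("1", "true", "yes"),
        mcp_cache_ttl=int(env.get("MCP_CACHE_TTL", "300")),
    )
```

Accepting any mapping rather than reading `os.environ` directly makes the loader easy to exercise with a plain dictionary.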
Usage
Web Interface
Access the web interface at http://localhost:8000/ and use the tabbed interface:
- EVTX Analysis: Upload Windows Event Log files and ask questions about them
- Document Search: Ask questions about your threat hunting documentation
- Elasticsearch Analysis: Query your Elasticsearch logs directly with natural language
API Endpoints
Document Search
POST /chat
Content-Type: multipart/form-data
Form fields:
- question: "What indicators are mentioned regarding malware X?"
EVTX Analysis
POST /chat
Content-Type: multipart/form-data
Form fields:
- question: "Are there any suspicious login attempts?"
- evtx_files: [binary files]
Elasticsearch Analysis
POST /es/chat
Content-Type: multipart/form-data
Form fields:
- question: "Show me recent network connections"
- index: "winlogbeat-*"
The response for all endpoints will include the answer and the list of sources used.
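The endpoints above accept multipart/form-data, so a client only needs to encode the form fields and POST them. The sketch below builds such a request with the standard library alone; `build_form_request` is a hypothetical helper, it handles text fields only (no evtx_files upload), and it assumes the server runs at http://localhost:8000 as in the Setup section.

```python
import uuid
import urllib.request
from typing import Dict

def build_form_request(url: str, fields: Dict[str, str]) -> urllib.request.Request:
    """Build a multipart/form-data POST request from plain text fields."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    body = ("".join(parts) + f"--{boundary}--\r\n").encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_form_request(
    "http://localhost:8000/es/chat",
    {"question": "Show me recent network connections", "index": "winlogbeat-*"},
)

# To send it against a running instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The same helper works for document search by pointing it at /chat with only the question field.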
Notes
- The ingestion runs on startup; restart the API to re-index when documents change.
- The app is modular; you can swap embedding or LLM models by updating app/gemini_client.py.
Advanced Threat Hunting Features
ThreatChat supports several specialized query types for advanced threat hunting:
Lateral Movement Detection
Identifies potential lateral movement through a network by analyzing:
- Authentication events across multiple systems
- Network connections between systems
- Remote process execution events
- Service installations and creations
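One simple heuristic behind the first signal, authentication events across multiple systems, is to flag accounts that log on to several distinct hosts within a short window. This is an illustrative sketch only: the event field names (`user`, `host`, `time`), the 10-minute window, and the three-host threshold are assumptions, not values taken from ThreatChat.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Set

def find_multi_host_logons(
    events: Iterable[dict],
    window: timedelta = timedelta(minutes=10),
    min_hosts: int = 3,
) -> Dict[str, Set[str]]:
    """Return {user: hosts} for users seen on >= min_hosts distinct hosts within window."""
    by_user = defaultdict(list)
    for e in events:
        by_user[e["user"]].append((e["time"], e["host"]))

    flagged = {}
    for user, logons in by_user.items():
        logons.sort()  # chronological order
        for i, (start, _) in enumerate(logons):
            # Hosts reached within `window` of this logon.
            hosts = {h for t, h in logons[i:] if t - start <= window}
            if len(hosts) >= min_hosts:
                flagged[user] = hosts
                break
    return flagged

events = [
    {"user": "alice", "host": "ws1", "time": datetime(2024, 5, 10, 9, 0)},
    {"user": "alice", "host": "ws2", "time": datetime(2024, 5, 10, 9, 2)},
    {"user": "alice", "host": "srv1", "time": datetime(2024, 5, 10, 9, 5)},
    {"user": "bob", "host": "ws3", "time": datetime(2024, 5, 10, 9, 0)},
    {"user": "bob", "host": "ws3", "time": datetime(2024, 5, 10, 9, 30)},
]
suspicious = find_multi_host_logons(events)
```

A hit here is a lead rather than a verdict; administrators and service accounts routinely touch many hosts, so results would normally be correlated with the other signals listed above.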
PowerShell Command Analysis
Analyzes PowerShell commands with security focus:
- Identifies obfuscated commands
- Detects encoded PowerShell payloads
- Recognizes fileless malware techniques
- Maps suspicious commands to MITRE ATT&CK techniques
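Detecting encoded payloads usually starts with recovering the script passed to PowerShell's -EncodedCommand parameter, which is Base64 over UTF-16LE text. The sketch below is a hedged illustration: `decode_encoded_command` is a hypothetical helper, and the pattern only covers a few common abbreviations of the flag (-e, -ec, -enc), not every prefix PowerShell accepts.

```python
import base64
import re
from typing import Optional

# Matches -EncodedCommand and a few common short forms, followed by a Base64 blob.
_ENC_FLAG = re.compile(
    r"(?<!\S)-(?:encodedcommand|enc|ec|e)\s+([A-Za-z0-9+/=]{8,})",
    re.IGNORECASE,
)

def decode_encoded_command(command_line: str) -> Optional[str]:
    """Return the decoded script if the command line carries an encoded payload."""
    m = _ENC_FLAG.search(command_line)
    if not m:
        return None
    try:
        # PowerShell encodes the script as UTF-16LE before Base64-encoding it.
        return base64.b64decode(m.group(1)).decode("utf-16-le")
    except ValueError:
        return None  # not valid Base64 or not valid UTF-16LE text

payload = base64.b64encode("Write-Output hi".encode("utf-16-le")).decode("ascii")
decoded = decode_encoded_command(f"powershell.exe -NoP -W Hidden -enc {payload}")
```

The decoded text can then be inspected for further obfuscation or matched against ATT&CK technique indicators.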
Command and Control (C2) Detection
Searches for indicators of C2 traffic:
- Unusual outbound connection patterns
- Beaconing behavior
- Anomalous destination ports
- Domain/IP reputation analysis
- Long-running or persistent connections
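Beaconing behavior shows up as connections to the same destination at nearly constant intervals. A common way to score this is the coefficient of variation of the gaps between connections; the sketch below assumes timestamps in seconds for a single source and destination pair, and the 0.1 threshold and five-event minimum are illustrative rather than values used by ThreatChat.

```python
from statistics import mean, pstdev
from typing import Sequence

def looks_like_beacon(
    timestamps: Sequence[float],
    max_cv: float = 0.1,
    min_events: int = 5,
) -> bool:
    """True when the gaps between connections are nearly constant."""
    if len(timestamps) < min_events:
        return False  # too few connections to judge periodicity
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return False
    # Coefficient of variation: spread of the gaps relative to their mean.
    return pstdev(gaps) / avg <= max_cv

regular = [0, 60, 121, 180, 240, 301]    # roughly every 60 seconds
irregular = [0, 5, 200, 210, 900, 905]   # bursty, human-like traffic
```

Malware that adds jitter to its check-in interval will raise the variation, so the threshold would need tuning against real traffic.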
Data Exfiltration Detection
Identifies potential data theft activity:
- Large file transfers to external destinations
- Unusual file access patterns before network activity
- Temporal correlation between file access and network communication
- Non-standard protocols or ports for data transfer
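The first indicator, large transfers to external destinations, can be approximated by summing outbound bytes per source and destination pair and keeping only public destinations. The field names (`src`, `dst`, `bytes_out`) and the 100 MiB threshold below are assumptions made for the sketch.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, Tuple

MIB = 1024 * 1024

def flag_large_external_transfers(
    flows: Iterable[dict],
    threshold_bytes: int = 100 * MIB,
) -> Dict[Tuple[str, str], int]:
    """Return {(src, dst): total_bytes} for non-private destinations above the threshold."""
    totals = defaultdict(int)
    for f in flows:
        if ipaddress.ip_address(f["dst"]).is_private:
            continue  # skip internal destinations
        totals[(f["src"], f["dst"])] += f["bytes_out"]
    return {pair: total for pair, total in totals.items() if total > threshold_bytes}

flows = [
    {"src": "10.0.0.5", "dst": "8.8.8.8", "bytes_out": 80 * MIB},
    {"src": "10.0.0.5", "dst": "8.8.8.8", "bytes_out": 40 * MIB},
    {"src": "10.0.0.5", "dst": "10.0.0.9", "bytes_out": 500 * MIB},
]
flagged = flag_large_external_transfers(flows)
```

Aggregating before thresholding matters because exfiltration is often split into many smaller transfers.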
Privilege Escalation Detection
Finds attempts to gain higher privileges:
- UAC bypass techniques
- Token manipulation
- Process injection into privileged processes
- Suspicious service creations or modifications
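Mapping these findings to MITRE ATT&CK can be as simple as a lookup table keyed by finding type. The technique IDs below are the published ATT&CK identifiers for the four categories listed; the `kind` labels, the `tag_findings` helper, and the output field names are hypothetical and not taken from the project.

```python
from typing import Dict, Iterable, List, Tuple

# Finding type -> (ATT&CK technique ID, technique name)
TECHNIQUES: Dict[str, Tuple[str, str]] = {
    "uac_bypass": ("T1548.002", "Abuse Elevation Control Mechanism: Bypass User Account Control"),
    "token_manipulation": ("T1134", "Access Token Manipulation"),
    "process_injection": ("T1055", "Process Injection"),
    "service_creation": ("T1543.003", "Create or Modify System Process: Windows Service"),
}

def tag_findings(findings: Iterable[dict]) -> List[dict]:
    """Return copies of the findings with technique_id and technique_name added."""
    tagged = []
    for f in findings:
        technique_id, name = TECHNIQUES.get(f.get("kind", ""), (None, None))
        tagged.append({**f, "technique_id": technique_id, "technique_name": name})
    return tagged

results = tag_findings([
    {"kind": "uac_bypass", "host": "ws1"},
    {"kind": "unknown_activity", "host": "ws2"},
])
```

Unrecognized finding types are kept with empty technique fields so that nothing is silently dropped from the analyst's view.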
License
MIT License
Quick Start
Clone the repository
    git clone https://github.com/mohamedjawady/something
    cd something
Install dependencies
    pip install -r requirements.txt
Then follow the Setup section above for configuration and startup.