
md rag mcp
RAG to index md files accessible via MCP
Repository Info
About This Server
RAG to index md files accessible via MCP
Model Context Protocol (MCP) - This server can be integrated with AI applications to provide additional context and capabilities, enabling enhanced AI interactions and functionality.
Documentation
Journal RAG System
A powerful, privacy-focused journal system with AI-powered semantic search and metadata analysis. This system allows you to query your personal journal entries using natural language and analyze patterns in your mood, habits, and thoughts over time.
✨ Features
- 🔍 Semantic Search: Query your journal entries using natural language
- 📊 Frontmatter Analysis: Track mood, anxiety, weight, and custom metrics over time
- 🤖 AI Agent Integration: Connect to AI assistants like Rovo Dev, VS Code, etc.
- 🏠 Local Processing: Journal files and search indexing stay on your machine
- 🚀 GPU Acceleration: Optional NVIDIA GPU support for faster embeddings
- 📱 Cross-Platform: macOS, Ubuntu/Debian, Arch Linux, and Windows/WSL with automated setup
🚀 Quick Start
1. Choose Your Setup Method
macOS (Automated):
./setup_mac_environment.sh
Ubuntu/Debian (Automated):
./setup_ubuntu_environment.sh
Arch Linux (Automated):
./setup_arch_environment.sh
Windows/WSL (Automated):
# After setting up WSL with Ubuntu, run:
./setup_ubuntu_environment.sh
For detailed Windows/WSL setup, see setup_win_instructions.md.
2. Configure MCP Servers
- Copy the template:
cp mcp.json.template mcp.json - Replace
${PROJECT_ROOT}with your actual project path - Configure your AI agent to use the MCP servers
3. Start Journaling
Add entries to the journal/ directory:
journal/2025/01/01.md
journal/2025/01/02.md
With optional frontmatter for tracking:
---
date: 2025-01-01
mood: 7
anxiety: 3
weight_kg: 70
---
# Today's thoughts
Your journal content here...
📁 Directory Structure
md-rag-mcp/
├── README.md # This file
├── setup_mac_environment.sh # macOS automated setup
├── setup_ubuntu_environment.sh # Ubuntu/Debian automated setup
├── setup_arch_environment.sh # Arch Linux automated setup
├── setup_win_instructions.md # Windows/WSL setup guide
├── install_instructions.md # General setup guide
├── mcp.json.template # MCP configuration template
├── journal/ # Your journal entries
│ ├── 2025/01/01.md # Daily entries
│ └── topics/ # Topic-based entries
└── .tech/ # Technical components
├── code/
│ ├── mcp/ # MCP servers
│ │ ├── journal_rag_mcp.py # Semantic search server
│ │ ├── frontmatter_mcp.py # Metadata analysis server
│ │ ├── requirements.txt # Python dependencies
│ │ └── .venv/ # Virtual environment
│ └── scripts/
│ └── rag_search.py # Standalone search script
├── data/ # Generated data
│ └── chroma_db/ # Vector database
└── docs/ # Additional documentation
🔧 System Requirements
Minimum Requirements
- Python 3.8+
- 4GB RAM
- 2GB free disk space
Recommended (for GPU acceleration)
- NVIDIA GPU with CUDA support
- 8GB+ GPU VRAM for large embedding models
- 16GB+ system RAM
Supported Platforms
- macOS (Intel & Apple Silicon) - Automated setup
- Ubuntu/Debian (including WSL) - Automated setup
- Arch Linux - Automated setup
- Windows (via WSL2 with Ubuntu) - Automated setup
🎯 Usage
Semantic Search
Query your journal entries using natural language:
# Via standalone script
source .tech/code/mcp/.venv/bin/activate
python .tech/code/scripts/rag_search.py
# Via AI agent (after MCP setup)
"What did I write about productivity last month?"
"Show me entries where I felt anxious"
Frontmatter Analysis
Analyze patterns in your journal metadata:
# Via AI agent (after MCP setup)
"What's my average mood this month?"
"Show me my weight trend over the last 3 months"
"Plot my anxiety levels vs sleep quality"
🤖 AI Agent Integration
Rovo Dev (Free) - Recommended!
Rovo Dev is a free AI agent built by Atlassian that works excellently with this journal system.
Setup Steps:
- Install Rovo Dev CLI: Follow the complete setup guide at the official Atlassian Community post
- Configure MCP Integration:
# Open Rovo Dev's MCP configuration file acli rovodev mcp # Copy the contents from mcp.json.template # Replace ${PROJECT_ROOT} with your actual path, e.g.: # /Users/yourname/Documents/md-rag-mcp - Create .agent.md: Create a custom persona that defines how Rovo Dev should interact with your journal. This persona can help with thoughtful questioning, pattern recognition across entries, balanced reflection prompts, and contextual analysis of your journaling themes and personal growth over time.
- Start Rovo Dev:
acli rovodev - Start Querying: Try commands like:
- "What did I write about yesterday?"
- "Show me my mood trends this month"
- "Find entries where I mentioned productivity"
Other Agents
Cloud-based options:
- Cursor: MCP-compatible, all conversations sent to cloud
- GitHub Copilot: If MCP-compatible, cloud-based
Local privacy options:
- Cline (VS Code extension): Can connect to local LLMs (Ollama, LM Studio)
- Roo Code (VS Code extension): Can connect to local LLMs for complete privacy
Any MCP-compatible AI agent can connect using the provided configuration.
📊 Frontmatter Fields
Track various metrics in your journal entries:
---
date: 2025-01-01 # Entry date (required)
mood: 7 # 1-10 scale
anxiety: 3 # 1-10 scale
energy: 8 # 1-10 scale
sleep_hours: 7.5 # Hours of sleep
weight_kg: 70 # Weight in kg
exercise_minutes: 30 # Exercise duration
weather: "sunny" # Weather description
tags: ["work", "family"] # Custom tags
---
You can customize these fields based on what you want to track.
🔍 Advanced Features
GPU Acceleration
If you have an NVIDIA GPU, the system will automatically use it for faster embedding generation:
# Check GPU status
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
Backup and Sync
Your journal is just markdown files - sync with any service:
- Git repositories
- Cloud storage (Dropbox, iCloud, etc.)
- Network drives
📚 Documentation
- Setup Instructions:
install_instructions.md - Windows Setup:
setup_win_instructions.md - Migration Plan:
MIGRATION_PLAN.md(development)
🤝 Contributing
This is an open-source project! Contributions welcome:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
📝 License
This project is open source. See LICENSE file for details.
🆘 Support
Common Issues
"Module not found" errors:
- Ensure virtual environment is activated
- Verify requirements are installed in correct venv
MCP connection issues:
- Check paths in mcp.json are absolute, not relative
- Verify Python executable path is correct
Slow performance:
- Check if GPU acceleration is working
- Consider using a smaller embedding model
- Ensure sufficient RAM available
Getting Help
- Check the troubleshooting section in
install_instructions.md - Review the setup guides for your platform
- Open an issue on GitHub with system details
🔒 Privacy & Data Considerations
What stays private (always local):
- Your journal files and content
- RAG embeddings and search indexing
- Frontmatter analysis and statistics
- All file processing and storage
What gets exposed with cloud-based AI agents (Rovo Dev, Cursor - both 100% cloud-based):
- Your conversations about journal content
- AI-generated insights and analysis
- Personal reflections and patterns discussed
- Any journal excerpts shared during conversations
For complete privacy: Use VS Code extensions like Cline or Roo Code with local LLMs (Ollama, LM Studio, etc.). These extensions can connect to local models while keeping all conversations on your machine.
Bottom line: Your raw journal data stays local, but if you use cloud AI agents to analyze it, those conversations are not private.
Quick Start
Clone the repository
git clone https://github.com/estevaom/md-rag-mcpInstall dependencies
cd md-rag-mcp
npm installFollow the documentation
Check the repository's README.md file for specific installation and usage instructions.
Repository Details
Recommended MCP Servers
Discord MCP
Enable AI assistants to seamlessly interact with Discord servers, channels, and messages.
Knit MCP
Connect AI agents to 200+ SaaS applications and automate workflows.
Apify MCP Server
Deploy and interact with Apify actors for web scraping and data extraction.
BrowserStack MCP
BrowserStack MCP Server for automated testing across multiple browsers.
Zapier MCP
A Zapier server that provides automation capabilities for various apps.