ITAC AI Agent (An end-to-end GenAI framework)

Overview

The ITAC AI Agent is a modular, production-ready framework that integrates Large Language Models (LLMs) with Intel Tiber AI Cloud (ITAC) via the Model Context Protocol (MCP) to enable intelligent agentic behavior. It combines LangGraph-based orchestration, LangChain tool-calling, and advanced RAG-powered memory with secure, auditable tool execution. The system supports natural language queries that trigger real cloud actions, backed by open-source LLMs (e.g., LLaMA, Mistral) or OpenAI-compatible APIs, and demonstrates scalable, secure integration of advanced GenAI components across infrastructure.

Architecture

Sequence Diagram

(See the sequence diagram image in the repository.)

Features

  • LLM-powered agent: Supports both OpenAI GPT models (via API) and local Hugging Face models (Llama, Mistral, etc.).
  • LangGraph orchestration: Multi-tool, multi-step agent workflow using LangGraph for robust, extensible logic.
  • Advanced RAG (Retrieval-Augmented Generation):
    • Hybrid search combining BM25 keyword and semantic vector search
    • Optional cross-encoder reranking for enhanced accuracy
    • Adaptive document retrieval based on query complexity
    • Query caching for improved performance
    • Multiple search strategies (hybrid, semantic, keyword)
  • Conversation memory: short-term in-session context plus long-term history persisted in PostgreSQL.
  • Secure tool execution: All tool calls are routed through the MCP server, with authentication and logging.
  • Extensible tool registry: Easily add new tools for cloud, infrastructure, or document Q&A.
  • Async and streaming support: Fast, scalable, and ready for production workloads.
  • Environment-based configuration: Uses .env for secrets and endpoints.
  • Local model caching: Avoids repeated downloads by using a local Hugging Face cache.

Quickstart

  1. Clone the repository

    git clone <repo-url>
    cd nextgen-ai
    
  2. Set up environment and install dependencies

    make install
    
  3. Configure environment variables

    • Copy .env.example to .env and fill in your secrets (OpenAI API key, Hugging Face token, ITAC tokens, etc.).
    • Ensure your .env file contains the required database configuration (a quick Python connection check using these values is sketched after this list):
    DB_NAME=your_db_name
    DB_USER=your_db_user
    DB_PASS=your_db_password
    DB_HOST=localhost
    DB_PORT=5432
    
  4. Install & Setup PostgreSQL

    make setup-postgres
    

    This automated setup will:

    • Install PostgreSQL server and client tools
    • Start and enable the PostgreSQL service
    • Create the database and user from your .env configuration
    • Set up proper permissions and privileges
    • Configure authentication for password-based access
    • Create the required database tables (conversation_history)
  5. Download embedding models

    # Download MiniLM embedding model (required for RAG)
    make download-model-minilm
    
  6. Build the vectorstore for document Q&A

    # Place your docs in docs/ and set RAG_DOC_PATH in .env
    make build-vectorstore
    
  7. Build and run the vLLM Hermes server for local LLM inference

    # Build and start the vLLM Hermes Docker image (downloads the model and vllm-fork automatically; runs on port 8000 by default)
    make setup-vllm-hermes
    
    # Check the logs or health endpoint to verify the server is running
    make logs-vllm-hermes
    
  8. Start the application

    make start-nextgen-suite
    
  9. Interact with the system by entering natural language queries such as:

    • "List of all available ITAC products"
    • "What is the weather in Dallas?"
    • "Give me a detailed explanation of ITAC gRPC APIs"

    The agent will automatically select and call the appropriate tools, returning results with source attribution.
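
Before starting the suite, the database settings from step 3 can be sanity-checked with a few lines of Python. This is a hypothetical helper, not part of the repository; it assumes psycopg2-binary and python-dotenv are installed:

```python
# check_db.py -- hypothetical helper to verify the DB_* values in .env.
import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()  # read DB_* settings from .env in the current directory

conn = psycopg2.connect(
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASS"],
    host=os.getenv("DB_HOST", "localhost"),
    port=int(os.getenv("DB_PORT", "5432")),
)
with conn, conn.cursor() as cur:
    # to_regclass returns NULL if the table has not been created yet
    cur.execute("SELECT to_regclass('conversation_history')")
    print("conversation_history table:", cur.fetchone()[0] or "missing")
conn.close()
```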

Advanced RAG Features

The system includes a production-ready RAG implementation with:

  • BM25 keyword search: Exact term matching
  • Semantic vector search: Meaning-based retrieval
  • Ensemble combination: Configurable weights (default: 70% semantic, 30% keyword)
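
A minimal sketch of such a hybrid retriever using LangChain's community packages; the names, paths, and weights below are illustrative, not the repository's exact wiring:

```python
# Hybrid BM25 + semantic retrieval sketch (illustrative; not the repo's actual module).
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

def build_hybrid_retriever(docs, k=5):
    # Keyword side: BM25 over the raw document set.
    bm25 = BM25Retriever.from_documents(docs)
    bm25.k = k
    # Semantic side: FAISS over local MiniLM embeddings (cf. RAG_EMBED_MODEL).
    embeddings = HuggingFaceEmbeddings(model_name="./resources/models/minilm")
    semantic = FAISS.from_documents(docs, embeddings).as_retriever(search_kwargs={"k": k})
    # Ensemble with the documented defaults: 70% semantic, 30% keyword.
    return EnsembleRetriever(retrievers=[semantic, bm25], weights=[0.7, 0.3])
```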

Optional Reranking

  • Cross-encoder reranking: Enhanced relevance scoring using sentence-transformers
  • Automatic trigger: Detects queries with "detailed", "comprehensive", "thorough" keywords
  • Configurable: Enable/disable via RAG_ENABLE_RERANKER environment variable
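
The rescoring pass itself is small; a sketch using the sentence-transformers CrossEncoder with the model named below in the configuration section (the function shape is an assumption):

```python
# Cross-encoder reranking sketch (wiring is illustrative).
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2")

def rerank(query: str, docs: list[str], top_k: int = 5) -> list[str]:
    # Score each (query, document) pair; a higher score means more relevant.
    scores = reranker.predict([(query, doc) for doc in docs])
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```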

Adaptive Retrieval

  • Query complexity analysis: Adjusts document count based on query length
  • Smart K selection: Simple queries use fewer docs, complex queries use more
  • Performance optimization: Caches results for repeated queries
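
A sketch of what adaptive K selection plus query caching can look like; the thresholds and keyword list here are assumptions, not the repository's exact heuristics:

```python
# Adaptive retrieval sketch (thresholds below are illustrative assumptions).
from functools import lru_cache

DETAIL_HINTS = ("detailed", "comprehensive", "thorough")

def choose_k(query: str, base_k: int = 5) -> int:
    # Short factual queries pull fewer documents; long or "detailed" ones pull more.
    words = len(query.split())
    if words <= 5:
        return max(2, base_k - 2)
    if words > 15 or any(hint in query.lower() for hint in DETAIL_HINTS):
        return 2 * base_k
    return base_k

@lru_cache(maxsize=100)  # mirrors RAG_CACHE_SIZE=100
def cached_retrieve(query: str) -> tuple[str, ...]:
    # `retriever` is the hybrid retriever from the sketch above.
    docs = retriever.invoke(query)[:choose_k(query)]
    return tuple(doc.page_content for doc in docs)
```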

Adding New Tools

  1. Implement your tool in mcp_server/tools/.
  2. Register it in the register_tools function in mcp_server/server.py.
  3. Restart the server to pick up new tools.
  4. (Optional) Update the LangGraph agent logic if you want custom routing or multi-tool workflows.
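
As a hypothetical sketch of the first two steps (the real registration signature is whatever mcp_server/server.py defines; the tool, module, and add_tool call below are illustrative):

```python
# mcp_server/tools/weather.py -- hypothetical tool module.
async def get_weather(city: str) -> str:
    """Return a short weather summary for the given city."""
    ...  # call a weather API here and format the result

# mcp_server/server.py -- register it next to the existing tools.
from mcp_server.tools.weather import get_weather

def register_tools(server):
    # Hypothetical registration call; mirror however the existing tools are added.
    server.add_tool(get_weather)
```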

Environment Variables

See .env.example for all required and optional variables, including:

Core Configuration

  • OPENAI_API_KEY, OPENAI_API_BASE, OPENAI_MODEL (for the OpenAI API)
  • HUGGINGFACE_HUB_TOKEN (for model downloads)

RAG Configuration

  • RAG_EMBED_MODEL (local model path, e.g., ./resources/models/minilm)
  • RAG_DOC_PATH, RAG_INDEX_DIR (for RAG document processing)
  • RAG_SEMANTIC_WEIGHT=0.7 (semantic search weight)
  • RAG_KEYWORD_WEIGHT=0.3 (keyword search weight)
  • RAG_RETRIEVAL_K=5 (default number of documents to retrieve)
  • RAG_CACHE_SIZE=100 (query cache size)

Reranking Configuration

  • RAG_ENABLE_RERANKER=true (enable/disable reranking)
  • RAG_RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-12-v2 (reranker model)
  • RAG_RERANK_CANDIDATE_MULTIPLIER=3 (candidate multiplier for reranking)

ITAC Integration

  • ITAC_PRODUCTS

Database Configuration

  • DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASS
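
Since the documented default search weights sum to 1.0 and several variables are mandatory, a small startup check can catch misconfiguration early. A hypothetical sketch, assuming python-dotenv:

```python
# Config validation sketch (hypothetical helper; names match .env.example).
import os

from dotenv import load_dotenv

load_dotenv()

semantic = float(os.getenv("RAG_SEMANTIC_WEIGHT", "0.7"))
keyword = float(os.getenv("RAG_KEYWORD_WEIGHT", "0.3"))
if abs(semantic + keyword - 1.0) > 1e-6:
    raise RuntimeError("RAG_SEMANTIC_WEIGHT and RAG_KEYWORD_WEIGHT must sum to 1.0")

for name in ("OPENAI_API_KEY", "RAG_EMBED_MODEL", "RAG_DOC_PATH", "DB_NAME", "DB_USER"):
    if not os.getenv(name):
        raise RuntimeError(f"Missing required environment variable: {name}")
```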

Available Make Commands

```sh
# Environment setup
make install                    # Set up virtual environment and install dependencies
make install-postgres-deps      # Install PostgreSQL Python dependencies

# Database setup
make setup-postgres             # Complete PostgreSQL installation and configuration

# Model management
make download-model-minilm      # Download MiniLM embedding model for RAG
make download-model-llama-2-7b-chat-hf  # Download LLaMA 2 model for local inference

# RAG system
make build-vectorstore          # Build FAISS vectorstore from documents
make test-rag                   # Test RAG pipeline functionality

# Application
make start-nextgen-suite        # Start both MCP client and server
make clean                      # Clean up environment and artifacts
```

Production Notes

  • Never commit real secrets to .env or git.
  • Use make commands for all workflows to ensure proper environment setup.
  • All Python scripts use the .venv and correct PYTHONPATH for imports.
  • Logging is enabled for all major actions and errors.
  • For production, set environment variables securely (e.g., Docker secrets, Kubernetes secrets).
  • Monitor logs for errors and tool execution.

RAG Performance Tuning

  • Cache size: Adjust RAG_CACHE_SIZE based on memory constraints
  • Search weights: Tune RAG_SEMANTIC_WEIGHT and RAG_KEYWORD_WEIGHT for your use case
  • Reranking: Enable for better quality, disable for faster responses
  • K values: Adjust RAG_RETRIEVAL_K based on document corpus size

Testing

RAG System Testing

# Test RAG pipeline with sample queries
make test-rag

Manual Testing

# Start the system
make start-nextgen-suite

# Test different query types:
# 1. Simple factual: "What is the capital of the USA?"
# 2. Complex technical: "Give info on ITAC gRPC API"
# 3. Detailed request: "List of all available ITAC products"

Complete Setup Example

For a complete setup from scratch:

# Clone and setup
git clone <repo-url>
cd nextgen-ai

# Configure environment variables
cp .env.example .env
# Edit .env file with your actual values:
# - Set OpenAI API key, Hugging Face token, ITAC tokens
# - Configure database settings (DB_NAME, DB_USER, DB_PASS, etc.)
# - Set RAG and other configuration parameters

# Install everything
make install
make setup-postgres
make download-model-minilm
make build-vectorstore

# Start the application
make start-nextgen-suite

Troubleshooting

Common Issues

  • Failed to retrieve ITAC products: Ensure your tool's outbound requests are not being routed through a proxy:

    export NO_PROXY=
    export no_proxy=
    
  • ModuleNotFoundError: Ensure you are running from the project root and using make targets.

  • Model not found: Check RAG_EMBED_MODEL and Hugging Face token.

  • Vectorstore errors: Ensure you have built the vectorstore and set RAG_INDEX_DIR correctly.

  • Rate limits: Use a Hugging Face token and cache models locally.

  • Tool not called: Ensure your tool is registered and appears in the agent's tool list.

RAG-Specific Issues

  • Poor search quality: Try enabling reranking with RAG_ENABLE_RERANKER=true
  • Slow responses: Disable reranking or reduce RAG_RETRIEVAL_K
  • Memory issues: Reduce RAG_CACHE_SIZE or use smaller embedding models
  • Reranking errors: Ensure sentence-transformers is installed: pip install sentence-transformers

PostgreSQL Issues

  • Peer authentication failed: If you get "FATAL: Peer authentication failed for user", run:
    make setup-postgres  # This will reconfigure authentication
    
    Or manually connect using:
    psql -U demo_user -d demo_db -h localhost  # Forces TCP connection with password auth
    
  • Connection refused: Ensure PostgreSQL is running: sudo systemctl status postgresql
  • Database does not exist: Re-run make setup-postgres to recreate the database

PostgreSQL Setup for Long-Term Memory

This project uses PostgreSQL to persist all conversation history for long-term memory.

The simplest way to set up PostgreSQL is using the provided Makefile target:

# One-command PostgreSQL setup
make setup-postgres

This runs the same automated setup described in step 4 of the Quickstart (installation, service start, database and user creation, permissions, password-based authentication, and table creation), and it also handles existing databases and users gracefully.

Manual Setup (Alternative)

If you prefer manual setup:

  1. Install PostgreSQL on your system.
  2. Create a database and user (e.g., demo_db and demo_user).
  3. Create the required tables using the schema in common_utils/database/conversation_history.sql:
    psql -U demo_user -d demo_db -f common_utils/database/conversation_history.sql
    
  4. Set your database credentials in the .env file.
  5. Restart the application to enable persistent conversation memory.
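
If psql is not available, step 3's schema file can also be applied from Python; a minimal sketch assuming psycopg2-binary and the demo credentials above:

```python
# Apply the conversation_history schema without psql (illustrative sketch).
import psycopg2

conn = psycopg2.connect(dbname="demo_db", user="demo_user",
                        password="your_db_password", host="localhost", port=5432)
with conn, conn.cursor() as cur:
    with open("common_utils/database/conversation_history.sql") as f:
        cur.execute(f.read())  # the whole script runs as one statement batch
conn.close()
```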

Configuration

Ensure your .env file contains the database configuration:

DB_NAME=demo_db
DB_USER=demo_user
DB_PASS=your_secure_password
DB_HOST=localhost
DB_PORT=5432

All user and assistant messages will be stored in PostgreSQL, enabling robust long-term memory and analytics.
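
Persisting and recalling turns is then plain SQL. The column names in this sketch (session_id, role, content) are hypothetical; the authoritative schema is common_utils/database/conversation_history.sql:

```python
# Long-term memory sketch -- column names are hypothetical; consult
# common_utils/database/conversation_history.sql for the real schema.
def save_turn(conn, session_id: str, role: str, content: str) -> None:
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO conversation_history (session_id, role, content) "
            "VALUES (%s, %s, %s)",
            (session_id, role, content),
        )

def load_history(conn, session_id: str, limit: int = 20):
    with conn.cursor() as cur:
        cur.execute(
            "SELECT role, content FROM conversation_history "
            "WHERE session_id = %s ORDER BY id DESC LIMIT %s",
            (session_id, limit),
        )
        return list(reversed(cur.fetchall()))  # oldest turn first
```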

System Architecture Details

RAG Pipeline Flow

  1. Document Ingestion: Documents are processed and stored in FAISS vectorstore
  2. Query Processing: User queries are analyzed for complexity and intent
  3. Hybrid Retrieval: BM25 and semantic search run in parallel
  4. Optional Reranking: Cross-encoder reranks results for better relevance
  5. Answer Generation: LLM generates response using retrieved context
  6. Source Attribution: System provides transparency about sources used
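
Composed end to end, the flow condenses to a few lines. This sketch reuses the illustrative hybrid retriever and rerank helper from the earlier sections and assumes langchain_openai for the generation step; the model name is a placeholder:

```python
# End-to-end RAG flow sketch, composing the earlier illustrative pieces.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # any OpenAI-compatible endpoint works via OPENAI_API_BASE

def answer(query: str, retriever) -> str:
    docs = retriever.invoke(query)                       # steps 2-3: retrieval
    top = rerank(query, [d.page_content for d in docs])  # step 4: optional reranking
    context = "\n\n".join(top)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm.invoke(prompt).content                    # step 5: answer generation
```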

Agent Workflow

  1. Query Reception: User input received via LangGraph agent
  2. Tool Selection: Agent selects appropriate tool based on query content
  3. Tool Execution: Selected tool executes via MCP protocol
  4. Response Assembly: Results are formatted and returned to user
  5. Memory Storage: Conversation history saved to PostgreSQL
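
LangGraph's prebuilt ReAct helper captures this loop in a few lines; a sketch under the assumption that tools exposed by the MCP server are available as callables (the repository's actual graph may define custom nodes and routing):

```python
# Minimal LangGraph agent sketch (the repo's graph may be more elaborate).
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o-mini")
agent = create_react_agent(llm, tools=[get_weather])  # e.g. the hypothetical tool above

result = agent.invoke({"messages": [("user", "What is the weather in Dallas?")]})
print(result["messages"][-1].content)  # final assistant reply; history goes to PostgreSQL
```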

For more details, see comments in each script and .env.example.


Security Reminder:
Never commit real secrets or tokens. Use secure methods to handle sensitive information in production environments.

License

See the LICENSE file for full details.
