opendatahub-io / llama-stack-demos (public)

Llama Stack Demos

Collection of demos for building Llama Stack-based apps on OpenShift

Repository Info

Stars: 51 · Forks: 65 · Watchers: 51 · Issues: 26
Language: Jupyter Notebook · License: Apache License 2.0

About This Server

Collection of demos for building Llama Stack-based apps on OpenShift

Model Context Protocol (MCP) - This server can be integrated with AI applications to provide additional context and capabilities, enabling enhanced AI interactions and functionality.

Documentation

Llama Stack Demos on OpenDataHub

This repository contains practical examples and demos designed to get you building AI apps with Llama Stack on Kubernetes or OpenShift quickly. Whether you're a cluster admin looking to deploy the right GenAI infrastructure or a developer eager to innovate with AI agents, the content here should help you get started.

🛠️ Get Started

Requirements

To run these demos, your environment must meet the following requirements:

  • OpenShift cluster 4.17+
  • 2 GPUs with at least 40 GB of VRAM each

Deployment Instructions

With the requirements in place, follow these steps to deploy the core components:

  1. Create a dedicated OpenShift project:
    oc new-project llama-serve
    
  2. Apply the Kubernetes manifests:
    oc apply -k kubernetes/kustomize/overlay/all-models
    
    This will deploy the foundational Llama Stack services, vLLM model servers, and MCP tool servers.
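Once the manifests are applied, you can confirm the rollout with `oc get pods -n llama-serve`. As a rough sketch, readiness can also be checked programmatically by parsing that command's JSON output; the pod names below are hypothetical stand-ins, not the actual names the overlay creates:

```python
import json

def unready_pods(pods_json: str) -> list[str]:
    """Return names of pods that are not yet Ready.

    Expects the JSON printed by `oc get pods -n llama-serve -o json`.
    """
    pods = json.loads(pods_json)["items"]
    unready = []
    for pod in pods:
        conditions = pod.get("status", {}).get("conditions", [])
        ready = any(c["type"] == "Ready" and c["status"] == "True" for c in conditions)
        if not ready:
            unready.append(pod["metadata"]["name"])
    return unready

# Stubbed API response for illustration; a real check would pipe in `oc` output.
sample = json.dumps({"items": [
    {"metadata": {"name": "llamastack-0"},
     "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "vllm-0"},
     "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
]})
print(unready_pods(sample))  # ['vllm-0']
```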

Setting Up Your Development Environment

We use uv for managing Python dependencies, ensuring a consistent and efficient development experience. Here's how to get your environment ready:

  1. Install uv:
    pip install uv
    
  2. Synchronize your environment:
    uv sync
    
  3. Activate the virtual environment:
    source .venv/bin/activate
    

Now you're all set to run any Python scripts or Jupyter notebooks within the demos/rag_agentic directory!
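As a minimal first smoke test from that environment, you can send an OpenAI-style chat request to the deployed Llama Stack server. This sketch only builds the request body; the base URL, model id, and OpenAI-compatible endpoint path are all assumptions to substitute with your deployment's actual values:

```python
import json

# Assumed placeholders: replace with the route your cluster exposes and a
# model id that your Llama Stack server actually reports.
BASE_URL = "http://localhost:8321"
MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

def chat_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion body for the Llama Stack server."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }

# A real client would POST this body to the server's chat-completions endpoint.
url = f"{BASE_URL}/v1/openai/v1/chat/completions"
print(url)
print(json.dumps(chat_payload("What is Llama Stack?")))
```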

💡 Demo Architecture

The diagram below shows an example architecture for a secure Llama Stack-based application deployed on OpenShift (OCP), using MCP tools and a Milvus vector DB for its agentic and RAG workflows. This is the same architecture implemented in the RAG/agentic demos.

[Architecture diagram]
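As a toy illustration of the retrieve-then-generate flow in that architecture, the sketch below swaps in-memory stand-ins for the real components: a word-overlap score replaces the embedding model and Milvus search, and the final prompt string is what would be sent to the vLLM-served model. All names and documents here are invented for illustration:

```python
from collections import Counter

# Stand-in corpus; in the demos this lives in a Milvus vector store.
DOCS = [
    "Llama Stack provides unified APIs for inference, agents, and vector stores.",
    "Milvus is a vector database used here for retrieval-augmented generation.",
    "OpenShift runs the vLLM model servers that back the demos.",
]

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding'; a real deployment uses an embedding model."""
    return Counter(text.lower().split())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for a vector search)."""
    q = embed(query)
    scored = sorted(DOCS, key=lambda d: sum((embed(d) & q).values()), reverse=True)
    return scored[:k]

def answer(query: str) -> str:
    """Stitch retrieved context into a prompt; a real app sends this to the LLM."""
    context = " ".join(retrieve(query))
    return f"Context: {context}\nQuestion: {query}"

print(answer("Which vector database is used?"))
```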


We're excited to see what you build with Llama Stack! If you have any questions or feedback, please don't hesitate to open an issue. Happy building! 🎉

Quick Start

  1. Clone the repository:
    git clone https://github.com/opendatahub-io/llama-stack-demos
    
  2. Install dependencies:
    cd llama-stack-demos
    pip install uv
    uv sync
    
  3. Follow the documentation:
    See the repository's README.md for detailed setup and usage instructions.

