Reference Architecture | Redis Iris — Context Engine
Redis Eats — Redis Iris Context Engine Architecture
The Redis Eats demo is a food delivery customer support agent that proves one point: LLMs are a commodity — the data layer is what determines agent quality. It runs two modes side by side: a naive RAG chatbot that can only answer from policy documents, and a Redis Iris-powered agent that knows the customer, their active order, their driver's live location, and their preferences across sessions. This diagram shows the full Iris stack that makes the second mode possible.
Agent framework: LangGraph ReAct
LLM: GPT-4o
MCP tools: 62 auto-generated
Data entities: 9 types
Pipeline phases: 7
Embeddings: text-embedding-3-small · 1536-dim
Data Sources

Operational Databases

Postgres, MongoDB, MySQL, Oracle — systems of record for all entities

Customer & Order Entities

Customer, Order, OrderItem, Restaurant, Driver — 9 entity types total

Delivery & Support Entities

DeliveryEvent, Payment, SupportTicket — operational event and service data

Policy Documents

Refund, delivery, and service policies — vector-embedded for semantic search

User Conversation

Live chat input, session turns, prior interactions across all channels

Ingest Layer

RDI

Change data capture (CDC) stream from the operational databases that keeps all 9 entity types continuously up to date in Redis

Embedding Pipeline

OpenAI text-embedding-3-small (1536-dim) — policy documents vectorized and written to Redis at load time
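
The load-time step can be sketched as follows. This is a minimal illustration, not the demo's code: `fake_embed` is a hypothetical stand-in for the real embedding call, a plain dict stands in for Redis, and the `policy:<id>` key layout and field names are assumptions. Vectors are packed as little-endian float32 bytes, a common layout for vector fields stored in hashes.

```python
import struct
from typing import Callable, Dict, List

DIM = 1536  # dimension of text-embedding-3-small, as stated above


def pack_vector(vec: List[float]) -> bytes:
    """Serialize a vector as little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def load_policies(
    docs: Dict[str, str],
    embed: Callable[[str], List[float]],
    store: Dict[str, Dict[str, bytes]],
) -> int:
    """Embed each policy document once and write it under a 'policy:<id>' key."""
    for doc_id, text in docs.items():
        vec = embed(text)
        if len(vec) != DIM:
            raise ValueError(f"expected {DIM} dims, got {len(vec)}")
        store[f"policy:{doc_id}"] = {
            "text": text.encode("utf-8"),
            "embedding": pack_vector(vec),
        }
    return len(docs)


def fake_embed(text: str) -> List[float]:
    """Hypothetical stand-in for the embedding API; carries no semantic meaning."""
    return [float(len(text) % 7)] * DIM
```

Because the vectors are computed once at load time, request-time work is limited to embedding the user question and running the similarity search.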

Redis — Unified Context Layer

Redis Hash + Search + Vector

9 entity schemas with tag, text, and numeric indexes. Policy documents are stored as 1536-dim vectors indexed with cosine distance. All are queried by Context Retriever at request time.

Redis Agent Memory

Short-term: per-session event log. Long-term: LLM-extracted facts and preferences (spicy food, contactless delivery) retrieved via vector similarity each turn

Redis LangCache

Semantic cache — "What's your refund policy?" and "Tell me your refund rules" resolve to the same cached answer, bypassing the agent pipeline entirely
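
The idea behind a semantic cache can be sketched in a few lines. This is not the LangCache API: the `SemanticCache` class, the 0.9 threshold, and the hand-written toy vectors are all assumptions made for illustration. A lookup returns a stored answer only when the query embedding is close enough to a previously cached one.

```python
import math
from typing import Callable, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, embed: Callable[[str], List[float]], threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries: List[Tuple[List[float], str]] = []

    def put(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))

    def get(self, query: str) -> Optional[str]:
        """Return the best-matching cached answer, or None on a miss."""
        qv = self.embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.entries:
            score = cosine(qv, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


# Toy vectors chosen by hand so the two refund phrasings land close together.
TOY_VECTORS = {
    "What's your refund policy?": [0.9, 0.1, 0.0],
    "Tell me your refund rules": [0.85, 0.2, 0.0],
    "Where is my driver?": [0.0, 0.1, 0.95],
}


def toy_embed(text: str) -> List[float]:
    return TOY_VECTORS[text]
```

With these toy vectors, caching an answer for the first refund question makes the second phrasing a hit, while the driver question is a miss and would fall through to the agent pipeline.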

Redis Context Retriever

Reads entity schemas → auto-generates 62 MCP tools (customer lookup, order search, driver location, delivery events, payments, support tickets) → serves them as governed MCP interfaces to the agent. No hand-written tool code.
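
One way schema-driven tool generation could work is sketched below. The `ToolSpec` shape, the naming scheme, the "one lookup tool plus one filter tool per indexed field" rule, and the field names in `SCHEMAS` are all hypothetical; the source does not describe how Context Retriever derives its 62 tools.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical descriptor for one generated tool."""
    name: str
    description: str
    params: Dict[str, str]


# Illustrative schema fragment: field name -> index type (tag, text, numeric).
SCHEMAS: Dict[str, Dict[str, str]] = {
    "Order": {"customer_id": "tag", "status": "tag", "total": "numeric"},
    "Driver": {"name": "text", "status": "tag"},
}


def generate_tools(schemas: Dict[str, Dict[str, str]]) -> List[ToolSpec]:
    """Derive one lookup tool per entity and one filter tool per indexed field."""
    tools: List[ToolSpec] = []
    for entity, fields in schemas.items():
        slug = entity.lower()
        tools.append(
            ToolSpec(f"get_{slug}_by_id", f"Fetch one {entity} by primary key.", {"id": "string"})
        )
        for field_name, index_type in fields.items():
            param_type = "number" if index_type == "numeric" else "string"
            tools.append(
                ToolSpec(
                    f"find_{slug}_by_{field_name}",
                    f"Find {entity} records filtered by {field_name} ({index_type} index).",
                    {field_name: param_type},
                )
            )
    return tools
```

The point of the sketch is that adding a field or entity to the schema changes the tool list without anyone writing tool code by hand.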

Agent Layer

LangGraph ReAct Agent

Receives enriched prompt (query + session memory + long-term preferences) and calls Context Retriever MCP tools in a ReAct loop until the answer is grounded in live data
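
The ReAct loop itself can be sketched in plain Python. This does not use LangGraph's API: `ToolCall`, `Final`, `scripted_decide`, and the `get_order` tool are hypothetical stand-ins, with `decide` playing the role of the LLM choosing the next action.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union


@dataclass
class ToolCall:
    tool: str
    args: Dict[str, Any]


@dataclass
class Final:
    text: str


Decision = Union[ToolCall, Final]
Observation = Dict[str, Any]


def react_loop(
    prompt: str,
    decide: Callable[[str, List[Observation]], Decision],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 8,
) -> Tuple[str, List[Observation]]:
    """Alternate between deciding and acting until a final answer is produced."""
    observations: List[Observation] = []
    for _ in range(max_steps):
        decision = decide(prompt, observations)
        if isinstance(decision, Final):
            return decision.text, observations
        result = tools[decision.tool](**decision.args)
        observations.append({"tool": decision.tool, "args": decision.args, "result": result})
    raise RuntimeError("no final answer within max_steps")


def scripted_decide(prompt: str, observations: List[Observation]) -> Decision:
    """Stand-in for the LLM: look up the order first, then answer from the result."""
    if not observations:
        return ToolCall("get_order", {"order_id": "o-1"})
    status = observations[-1]["result"]["status"]
    return Final(f"Your order is {status}.")


# Hypothetical tool returning canned data in place of a live lookup.
TOOLS = {"get_order": lambda order_id: {"id": order_id, "status": "out_for_delivery"}}
```

The returned observation list doubles as a trace of every tool call, which is the kind of data an activity panel can display.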

GPT-4o

Reasons over real customer, order, and driver context rather than static documents. It never queries Redis directly; all data access flows through Context Retriever tools.

Output

Chat Interface

Streaming response via SSE — answer grounded in live order, customer, and driver data rather than static policy text
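
Streaming over SSE comes down to writing `data:` lines separated by blank lines. A minimal sketch of the framing follows; the `token` payload key and the closing `done` event name are assumptions, not the demo's actual wire format.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame each token as one Server-Sent Event, then emit a closing event."""
    for token in tokens:
        # Each event is a "data:" line terminated by a blank line.
        yield f"data: {json.dumps({'token': token})}\n\n"
    # A named event lets the client know the stream is complete.
    yield "event: done\ndata: {}\n\n"
```

A web framework would write these strings to a response with the `text/event-stream` content type, and the browser's `EventSource` would deliver them one event at a time.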

Activity Panel

Live trace of every MCP tool call, phase progress, and total pipeline latency — visible in the Redis Eats demo UI

Context Retriever: 62 MCP tools
Memory types: Short + Long term
Pipeline phases: 7 phases
Simple RAG — Before Redis Iris

Generic answers from policy documents only

The RAG agent embeds the question, searches the policy vector index, retrieves the top 3 document chunks, and asks GPT-4o to answer from those chunks only. It has no access to customer identity, order state, driver location, or session history.

1. Embed user question → text-embedding-3-small
2. VectorQuery → Policy index → top 3 chunks (cosine similarity)
3. GPT-4o → answer from policy text only
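
The three steps above can be sketched with a brute-force search. The two-dimensional vectors, chunk labels, and prompt wording are made up for illustration; a real deployment would run the search against the Redis vector index rather than in Python.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec: List[float], index: List[Tuple[str, List[float]]], k: int = 3) -> List[str]:
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_rag_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved chunks."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only these policy excerpts:\n{context}\n\nQuestion: {question}"


# Toy index: (chunk label, embedding). Real embeddings would have 1536 dimensions.
POLICY_INDEX = [
    ("refund policy chunk", [1.0, 0.0]),
    ("delivery delay chunk", [0.0, 1.0]),
    ("service policy chunk", [0.7, 0.7]),
    ("payment policy chunk", [-1.0, 0.0]),
]
```

Nothing in this path touches customer, order, or driver data, which is why the answer can only restate policy text.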
"Delays can happen due to high demand, traffic, or restaurant prep time" — no live data
Redis Iris — Real-Time Context Pipeline

Answers grounded in live customer, order, and driver data

The Iris agent runs a 7-phase pipeline, preceded by a LangCache check (phase 0). LangCache short-circuits on repeated questions. Agent Memory enriches the prompt with session history and preferences. Context Retriever MCP tools pull live entity data. GPT-4o reasons over real context, not static documents.

0. LangCache: check semantic cache; a HIT returns instantly
1. Init LangGraph ReAct agent + bind 62 MCP tools + 5 internal tools
2–3. Memory: write event, retrieve short-term + long-term memories
4. Build enriched prompt: query + memory context block
5. Tools: ReAct loop with 5+ MCP tool calls (customer, order, driver, events)
6–7. GPT-4o: stream answer + write response to Memory + LangCache
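
The control flow of these phases can be sketched as a single function. Every name here is hypothetical: `cache_get` and `cache_put` stand in for LangCache, the two lists stand in for Agent Memory, and `run_agent` hides agent setup and the tool loop. Long-term memories are passed in whole rather than retrieved by vector similarity, to keep the sketch short.

```python
from typing import Callable, Dict, List, Optional


def handle_turn(
    query: str,
    session: List[str],
    long_term: List[str],
    cache_get: Callable[[str], Optional[str]],
    cache_put: Callable[[str, str], None],
    run_agent: Callable[[str], str],
) -> Dict[str, object]:
    """Run one chat turn through a cache check, memory enrichment, and the agent."""
    # Phase 0: cache check. A hit skips every later phase.
    cached = cache_get(query)
    if cached is not None:
        return {"answer": cached, "cache_hit": True}

    # Phases 2-3: record the event, then gather short- and long-term memories.
    session.append(f"user: {query}")
    memory_block = "\n".join(["Session:"] + session + ["Preferences:"] + long_term)

    # Phase 4: build the enriched prompt.
    prompt = f"{memory_block}\n\nQuestion: {query}"

    # Phases 1 and 5: agent setup and the tool loop live behind run_agent.
    answer = run_agent(prompt)

    # Phases 6-7: return the answer and write it back to memory and cache.
    session.append(f"agent: {answer}")
    cache_put(query, answer)
    return {"answer": answer, "cache_hit": False}
```

Passing a dict's `get` and `__setitem__` as the cache callables gives an exact-match stand-in; repeating the same query then returns with `cache_hit` set to `True` and never invokes the agent.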
"Your Sakura Sushi order is with driver Marcus — flat tire reported 10 min ago. ETA now 7:42 PM."