Postgres, MongoDB, MySQL, Oracle — systems of record for all entities
Customer, Order, OrderItem, Restaurant, Driver — 9 entity types total
DeliveryEvent, Payment, SupportTicket — operational event and service data
Refund, delivery, and service policies — vector-embedded for semantic search
Live chat input, session turns, prior interactions across all channels
CDC stream from operational DBs keeps all 9 entity types fresh in Redis
OpenAI text-embedding-3-small (1536-dim) — policy documents vectorized and written to Redis at load time
9 entity schemas with tag, text, and numeric indexes. Policy documents stored as 1536-dim vectors indexed by cosine distance. All queried by Context Retriever at request time.
Short-term: per-session event log. Long-term: LLM-extracted facts and preferences (spicy food, contactless delivery) retrieved via vector similarity each turn
Semantic cache — "What's your refund policy?" and "Tell me your refund rules" resolve to the same cached answer, bypassing the agent pipeline entirely
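A semantic cache lookup of this kind can be sketched in a few lines. This is a minimal in-memory illustration, not the LangCache implementation: the `SemanticCache` class, the injected `embed` function, and the 0.9 similarity threshold are all assumptions made for the example.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical cache keyed by question meaning rather than exact text."""

    def __init__(self, embed: Callable[[str], Sequence[float]], threshold: float = 0.9):
        self._embed = embed          # assumed: any text -> vector function
        self._threshold = threshold  # assumed cut-off; tune per embedding model
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, question: str, answer: str) -> None:
        self._entries.append((list(self._embed(question)), answer))

    def get(self, question: str) -> Optional[str]:
        """Return the closest cached answer, or None if nothing is similar enough."""
        query_vec = self._embed(question)
        best_score, best_answer = 0.0, None
        for vec, answer in self._entries:
            score = cosine_similarity(query_vec, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self._threshold else None
```

A hit returns the stored answer without running the agent; a miss returns `None` and the request continues down the pipeline. A linear scan is used here for clarity; a production cache would use a vector index.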
Reads entity schemas → auto-generates 62 MCP tools (customer lookup, order search, driver location, delivery events, payments, support tickets) → serves them as governed MCP interfaces to the agent. No hand-written tool code.
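To illustrate how tools might be derived from schemas rather than written by hand, the sketch below maps each indexed field to one tool descriptor. The schema shape, the `ToolSpec` type, and the tool naming scheme are hypothetical; the actual Context Retriever format and its 62 tools are not shown in the source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical descriptor for one generated tool."""
    name: str
    description: str
    parameters: Dict[str, str]


# Assumed mapping from index type to (name template, description template, parameters).
_TEMPLATES = {
    "tag": ("filter_{entity}_by_{field}",
            "Return {entity} records whose {field} equals the given value.",
            {"value": "string"}),
    "text": ("search_{entity}_by_{field}",
             "Full-text search over the {field} of {entity} records.",
             {"query": "string"}),
    "numeric": ("range_{entity}_by_{field}",
                "Return {entity} records whose {field} lies within a range.",
                {"min": "number", "max": "number"}),
}


def generate_tools(entity: str, fields: Dict[str, str]) -> List[ToolSpec]:
    """Build one lookup-by-id tool plus one tool per indexed field."""
    entity_key = entity.lower()
    tools = [ToolSpec(
        name=f"get_{entity_key}_by_id",
        description=f"Fetch a single {entity_key} record by its id.",
        parameters={"id": "string"},
    )]
    for field_name, index_type in fields.items():
        template = _TEMPLATES.get(index_type)
        if template is None:
            continue  # fields with unrecognized index types produce no tool
        name_t, desc_t, params = template
        tools.append(ToolSpec(
            name=name_t.format(entity=entity_key, field=field_name),
            description=desc_t.format(entity=entity_key, field=field_name),
            parameters=dict(params),
        ))
    return tools
```

For example, `generate_tools("Order", {"status": "tag", "total": "numeric"})` yields three descriptors: a lookup by id, a tag filter on `status`, and a range query on `total`.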
Receives enriched prompt (query + session memory + long-term preferences) and calls Context Retriever MCP tools in a ReAct loop until the answer is grounded in live data
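The ReAct loop described above alternates between deciding on an action and observing a tool result. The sketch below shows that control flow only. The `decide` callable stands in for the LLM call, and the action dictionary shape and `max_steps` limit are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Action = Dict[str, Any]
Observation = Dict[str, Any]


def react_loop(
    prompt: str,
    decide: Callable[[str, List[Observation]], Action],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
) -> Tuple[str, List[Observation]]:
    """Call tools until `decide` returns a final answer or the step budget runs out."""
    observations: List[Observation] = []
    for _ in range(max_steps):
        # Reason: the model sees the prompt plus everything observed so far.
        action = decide(prompt, observations)
        if action["type"] == "final":
            return action["answer"], observations
        # Act: run the requested tool and record what it returned.
        args = action.get("args", {})
        tool = tools.get(action["tool"])
        if tool is None:
            result = {"error": f"unknown tool {action['tool']}"}
        else:
            result = tool(**args)
        observations.append({"tool": action["tool"], "args": args, "result": result})
    raise RuntimeError("no final answer within max_steps")
```

The returned list of observations doubles as a trace of every tool call made during the turn.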
Reasons over real customer, order, and driver context — not static documents. Never queries Redis directly. All data access flows through Context Retriever tools
Streaming response via SSE — answer grounded in live order, customer, and driver data rather than static policy text
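For reference, a Server-Sent Events stream is a sequence of text frames, each made of `event:` and `data:` lines followed by a blank line. The sketch below formats answer tokens as such frames. The `token` and `done` event names and the JSON payload shape are assumptions, not the demo's actual wire format.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_frame(data: str, event: Optional[str] = None) -> str:
    """Format one SSE frame; multi-line data becomes multiple `data:` lines."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one frame per token, then a final frame marking the end of the answer."""
    for token in tokens:
        yield sse_frame(json.dumps({"token": token}), event="token")
    yield sse_frame("{}", event="done")
```

A web framework would send these strings with the `text/event-stream` content type so the browser can render the answer as it arrives.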
Live trace of every MCP tool call, phase progress, and total pipeline latency — visible in the Redis Eats demo UI
The RAG agent embeds the question, searches the policy vector index, retrieves the top 3 document chunks, and asks GPT-4o to answer from those chunks only. It has no access to customer identity, order state, driver location, or session history.
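The retrieve-then-answer flow of the RAG agent can be sketched as follows. The embedding and GPT-4o calls are left out; `top_k_chunks` and `build_prompt` are hypothetical helpers, and the prompt wording is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved excerpts."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, 1))
    return (
        "Answer using only the policy excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        + context
        + "\n\nQuestion: "
        + question
    )
```

Nothing in this prompt identifies the customer or their order, which is why the answer can only restate policy text.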
The Iris agent runs a 7-phase pipeline. LangCache short-circuits on repeated or semantically equivalent questions. Agent Memory enriches the prompt with session history and preferences. Context Retriever MCP tools pull live entity data. GPT-4o reasons over real context, not static documents.
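The ordering of cache, memory, and agent in the Iris pipeline can be sketched as below. Only three of the stages named above are shown, not the full 7 phases. The `handle_turn` function and its callable parameters are hypothetical stand-ins for LangCache, Agent Memory, and the GPT-4o agent.

```python
from typing import Callable, Dict, List, Optional


def handle_turn(
    question: str,
    cache_lookup: Callable[[str], Optional[str]],
    cache_store: Callable[[str, str], None],
    load_memory: Callable[[str], List[str]],
    run_agent: Callable[[str], str],
) -> Dict[str, str]:
    """Serve from cache when possible; otherwise enrich the prompt and run the agent."""
    # Cache check: a hit skips memory lookup and the agent entirely.
    cached = cache_lookup(question)
    if cached is not None:
        return {"answer": cached, "source": "cache"}

    # Memory enrichment: prepend remembered facts and preferences.
    memory = load_memory(question)
    prompt = "\n".join(
        ["Known about this customer:", *memory, "", "Question: " + question]
    )

    # Agent run: the agent is expected to call tools for live data.
    answer = run_agent(prompt)
    cache_store(question, answer)
    return {"answer": answer, "source": "agent"}
```

This sketch caches every answer for simplicity. A real system would likely cache only answers that do not depend on a specific customer or order.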