Reference Architecture | Redis Iris — Real-Time Context Engine
ArchitectureRedis Iris
Redis Iris — Real-Time Context Engine
Redis Iris is a unified platform that sits between AI agents and enterprise data sources. It replaces ad-hoc retrieval pipelines, hand-coded MCP servers, and custom memory systems with a single context engine — handling data navigability, freshness, agent memory, and LLM cost optimization in one stack.
RDI — Data freshness
Context Retriever — MCP tools from schema
Agent Memory — Short + long-term
Redis Search — Vector + hybrid
LangCache — Semantic cache
Enterprise Data Sources

Structured Data

Relational databases, CRMs, ERPs, operational systems — the entities agents need to reason about (customers, orders, accounts, inventory)

Semi-Structured Data

JSON, event streams, Kafka topics, API responses, logs — real-time signals and operational state changes

Unstructured Data

Documents, PDFs, support transcripts, knowledge base articles, policies — content best accessed via vector similarity search

Agent Context

Session history, prior decisions, user preferences, workflow state — the context agents generate themselves as they work

Ingest Layer

RDI — Redis Data Integration

Change data capture from relational and NoSQL sources. Keeps Redis continuously synchronized with systems of record — no batch jobs, no stale reads, no custom ETL pipelines

Embedding Pipeline

Unstructured content vectorized at ingest time and written to Redis vector indexes — ready for semantic search without runtime embedding latency

Streaming Ingest

Kafka, event APIs, and webhook feeds — real-time operational signals written directly into Redis for immediate agent access

Redis Iris — Unified Context Layer

Redis Search

Vector, full-text, and hybrid search over structured and unstructured data in a single query. Powers RAG over documents and semantic similarity over entity fields

Redis Agent Memory

Short-term: per-session working memory across agent turns. Long-term: LLM-extracted durable facts and preferences retrieved via vector similarity. Agents get smarter over time without manual state management

Redis LangCache

Semantic cache-as-a-service — routes semantically equivalent queries to cached responses before the agent pipeline runs. Reduces LLM cost and response latency for repeated or similar requests

Redis Context Retriever

Schema-first MCP tool generation — define your data model once and Context Retriever auto-compiles a governed set of MCP tools the agent can call directly. No hand-written tool code, no text-to-SQL, no raw database access. The agent navigates your data in business language, not query language.

AI Agent

Agent Framework

LangGraph, CrewAI, OpenAI Agents SDK, Microsoft Agent Framework, Google ADK — any framework that supports MCP tool use and memory injection

LLM

GPT-4o, Claude, Gemini, Llama — receives enriched prompts including session memory, long-term preferences, and Context Retriever tool results. Reasons over real data, not static documents

MCP Tools

Auto-generated by Context Retriever from the entity schema. The agent calls governed, structured interfaces — not raw queries. Access controls enforced server-side at the Context Retriever layer

Agent Output + Feedback Loop

Grounded Response

Agent answers are based on live entity data and retrieved context — not hallucinated or stale. Every response is traceable to a Context Retriever tool call

Action Execution

Downstream actions triggered by agent decisions — API calls, workflow updates, CRM writes, notifications — all driven by context assembled in Redis

Memory + Cache Write-back

Agent Memory records session outcomes and extracts durable facts. LangCache stores the response for future semantic cache hits. The system compounds with every interaction

Context Retriever
Schema → MCP tools
Agent Memory
Short + long term
LangCache
Semantic cache
Context Retriever
Define your data schema once. Get a governed MCP server automatically. Agents navigate business objects — not raw database rows.
Agent Memory
Session context across turns. Durable facts across sessions. Agents personalize, remember, and improve — without manual state management.
RDI
Change data capture from any source. Redis stays synchronized with systems of record. Agents always see fresh data — not a stale snapshot.
LangCache
Semantic deduplication of LLM calls. Equivalent questions return cached answers instantly. Lower cost, faster responses, no pipeline overhead.