§ 4 Architecture Comparator
How do different memory architectures handle the same conversation? This module places six strategies side by side (sliding window, summarization, RAG, hierarchical tiers, infinite context, and graph memory) and feeds them identical input. Watch tokens flow and get compressed, retrieved, or discarded in real time. The synchronized playback reveals each architecture's trade-offs: latency vs. recall, compression vs. fidelity, simplicity vs. capacity. No single winner emerges; the optimal choice depends on your dialogue's structure.
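To make the simplest of these strategies concrete, here is a minimal sketch of a sliding-window memory. The class name and interface are hypothetical (they are not the demo's API); the 8-slot window matches the comparator's configuration, and older turns are simply discarded once the window is full.

```python
from collections import deque

class SlidingWindowMemory:
    """Illustrative sliding-window memory: keeps only the last N turns."""

    def __init__(self, max_slots: int = 8):
        # deque with maxlen silently evicts the oldest turn on overflow
        self.turns = deque(maxlen=max_slots)

    def add(self, turn: str) -> None:
        self.turns.append(turn)

    def context(self) -> str:
        # The prompt context is just the surviving turns, in order
        return "\n".join(self.turns)

memory = SlidingWindowMemory(max_slots=8)
for i in range(10):
    memory.add(f"turn {i}")
# Turns 0 and 1 have been evicted; only turns 2-9 remain in context.
```

This is the "simplicity" end of the trade-off space: eviction is free and deterministic, but anything outside the window is unrecoverable.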
Figure 7: Synchronized playback panels (4 of 6 architectures selected) at Turn 0 of a 20-turn conversation. Each panel reports its token usage (20 / 4096) and memory state: Sliding Window (window 1/8 slots used), Summarization (buffer 1 of 5 turns before next compression), RAG (retrieved top-3 relevant turns from vector store), Hierarchical (Tier 1: 1 turn verbatim; Tier 2: 0 summarized). Playback speed is adjustable.
Figure 8
§ 4.5 Validate Live: How Would Each Architecture Handle Your Conversation?
Provide a multi-turn conversation and the LLM will simulate what each architecture retains versus loses, showing the concrete trade-offs of sliding window, summarization, RAG, and graph memory on your data.
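The retains-vs.-loses contrast can be sketched offline as well. Below, truncation stands in for the sliding window and a trivial keyword-overlap score stands in for the vector store's embedding similarity; both function names and the scoring rule are illustrative assumptions, not the article's implementation.

```python
def sliding_window(turns: list[str], max_slots: int = 8) -> list[str]:
    # The window keeps only the most recent turns; everything earlier is lost.
    return turns[-max_slots:]

def rag_retrieve(turns: list[str], query: str, k: int = 3) -> list[str]:
    # Toy retriever: score each turn by word overlap with the query.
    # Real RAG systems use embedding similarity, not keyword overlap.
    def score(turn: str) -> int:
        return len(set(turn.lower().split()) & set(query.lower().split()))
    return sorted(turns, key=score, reverse=True)[:k]

turns = [f"turn {i}: we discussed topic {i}" for i in range(20)]
window = sliding_window(turns)         # only turns 12-19 survive
hits = rag_retrieve(turns, "topic 3")  # can still recall early turn 3
```

The contrast is the point: the window has irreversibly dropped turn 3, while retrieval can surface it at any later point, at the cost of maintaining and querying a store.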