HEMA: A Hippocampus-Inspired Extended Memory Architecture for Long-Context AI Conversations

인공지능연구 (KJAI), Vol. 13, No. 2

Large language models (LLMs) maintain coherence over a few thousand tokens but degrade sharply in multi-hundred-turn conversations. We present a hippocampus-inspired dual-memory architecture that separates dialogue context into (1) Compact Memory, a continuously updated one-sentence summary that preserves the global narrative, and (2) Vector Memory, an episodic store of chunk embeddings queried via cosine similarity. Integrated with an off-the-shelf 6B-parameter transformer, the system sustains >300-turn dialogues while keeping the prompt under 3.5K tokens. On long-form QA and story-continuation benchmarks, Compact + Vector Memory raises factual recall accuracy from 41% to 87% and human-rated coherence from 2.7 to 4.3. Precision-recall analysis shows that, with 10K indexed chunks, Vector Memory achieves P@5 ≥ 0.80 and R@50 ≥ 0.74, doubling the area under the PR curve relative to a summarisation-only baseline. Ablation experiments reveal that (i) semantic forgetting (age-weighted pruning of low-salience chunks) cuts retrieval latency by 34% with <2 pp recall loss, and (ii) a two-level summary-of-summaries eliminates cascade errors that otherwise emerge after 1,000 turns. By reconciling verbatim recall with semantic continuity, our architecture offers a practical path toward scalable, privacy-aware conversational AI capable of engaging in months-long dialogue without retraining the underlying model.
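
The abstract describes the two stores concretely enough to sketch in code. The following Python sketch is an illustration, not the authors' implementation: the class and function names (VectorMemory, build_prompt), the decay and threshold parameters, and the use of exponential decay for age-weighted pruning are all assumptions layered on the abstract's description, not details taken from the paper.

import math

import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))


class VectorMemory:
    """Episodic store of chunk embeddings, queried by cosine similarity.

    forget() is one plausible reading of the abstract's "semantic
    forgetting": age-weighted pruning that drops low-salience chunks.
    The exponential decay and threshold values are assumptions.
    """

    def __init__(self, decay: float = 0.01):
        self.chunks: list[dict] = []  # each: {"text", "embedding", "salience", "turn"}
        self.decay = decay            # hypothetical per-turn age penalty

    def add(self, text: str, embedding: np.ndarray, salience: float, turn: int) -> None:
        self.chunks.append({"text": text, "embedding": embedding,
                            "salience": salience, "turn": turn})

    def retrieve(self, query_emb: np.ndarray, k: int = 5) -> list[str]:
        """Return the k chunk texts most similar to the query embedding."""
        ranked = sorted(self.chunks,
                        key=lambda c: cosine_similarity(query_emb, c["embedding"]),
                        reverse=True)
        return [c["text"] for c in ranked[:k]]

    def forget(self, current_turn: int, threshold: float = 0.1) -> None:
        """Age-weighted pruning: drop chunks whose decayed salience falls low."""
        self.chunks = [
            c for c in self.chunks
            if c["salience"] * math.exp(-self.decay * (current_turn - c["turn"])) >= threshold
        ]


def build_prompt(compact_summary: str, retrieved: list[str], user_turn: str) -> str:
    """Assemble the LLM prompt: global summary + episodic recall + new turn."""
    episodic = "\n".join(f"- {t}" for t in retrieved)
    return (f"Conversation summary: {compact_summary}\n"
            f"Relevant earlier excerpts:\n{episodic}\n"
            f"User: {user_turn}\nAssistant:")

In this reading, Compact Memory is simply the rolling one-sentence summary string passed to build_prompt; the abstract's two-level summary-of-summaries would replace it with a summary compiled from periodic sub-summaries rather than a single ever-growing one.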

1. Introduction

2. Literature Review

3. Methodology

4. Results

5. Discussion

6. Conclusion

References
