Add Hermes memory evaluation framework with LoCoMo dataset support
- Implement HermesClient for interacting with the Hermes CLI.
- Create judge module for grading QA outputs from Hermes memory.
- Develop LoCoMo dataset parsing and formatting utilities.
- Introduce run_eval script to facilitate memory evaluation using LoCoMo-style datasets.
2
eval/hermes_memory_eval/__init__.py
Normal file
@@ -0,0 +1,2 @@
"""Hermes memory evaluation helpers."""