Add Hermes memory evaluation framework with LoCoMo dataset support
- Implement HermesClient for interacting with the Hermes CLI.
- Create judge module for grading QA outputs from Hermes memory.
- Develop LoCoMo dataset parsing and formatting utilities.
- Introduce run_eval script to facilitate memory evaluation using LoCoMo-style datasets.
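
For readers who cannot see the suppressed diff, the following is a minimal sketch of what LoCoMo-style parsing and formatting might look like. It is not the code from this commit. The function names (iter_turns, format_transcript, iter_qa) are hypothetical, and the field layout (a "conversation" object with "session_N" turn lists, plus a "qa" list of question/answer entries) is an assumption based on the publicly released LoCoMo data, so check it against the actual locomo10.json before relying on it.

```python
import json
from typing import Any, Iterator


def iter_turns(conversation: dict[str, Any]) -> Iterator[tuple[int, str, str]]:
    """Yield (session_number, speaker, text) in numeric session order.

    Assumes turn lists live under keys such as "session_1", while sibling
    keys such as "session_1_date_time" hold strings and are skipped.
    """
    session_numbers = sorted(
        int(key.split("_")[1])
        for key, value in conversation.items()
        if key.startswith("session_") and isinstance(value, list)
    )
    for number in session_numbers:
        for turn in conversation[f"session_{number}"]:
            yield number, turn["speaker"], turn["text"]


def format_transcript(conversation: dict[str, Any]) -> str:
    """Render a conversation as one line of plain text per turn."""
    return "\n".join(
        f"[session {number}] {speaker}: {text}"
        for number, speaker, text in iter_turns(conversation)
    )


def iter_qa(sample: dict[str, Any]) -> Iterator[tuple[str, str]]:
    """Yield (question, answer) pairs; answers are coerced to strings."""
    for item in sample.get("qa", []):
        # Assumption: some entries may lack an "answer" field; skip those.
        if "question" in item and "answer" in item:
            yield item["question"], str(item["answer"])


def load_samples(path: str) -> list[dict[str, Any]]:
    """Load a dataset file that is assumed to be a JSON list of samples."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Under these assumptions, load_samples("eval/hermes_memory_eval/datasets/locomo10.json") would return the samples, format_transcript(sample["conversation"]) would produce the text to feed into memory, and iter_qa(sample) would supply the questions to grade.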
66751
eval/hermes_memory_eval/datasets/locomo10.json
Normal file
File diff suppressed because one or more lines are too long