declare-lab/OffTopicEval
Viewer • Updated • 214k • 194 • 5
Natural Language Processing
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!