Modeling Open-World Cognition as On-Demand Synthesis of Probabilistic Models Paper • 2507.12547 • Published Jul 16, 2025
On the Same Wavelength? Evaluating Pragmatic Reasoning in Language Models across Broad Concepts Paper • 2509.06952 • Published Sep 8, 2025
Code-enabled language models can outperform reasoning models on diverse tasks Paper • 2510.20909 • Published Oct 23, 2025 • 1
LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers Paper • 2310.15164 • Published Oct 23, 2023 • 3
Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models Paper • 2405.09605 • Published May 15, 2024
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling Paper • 2504.05410 • Published Apr 7, 2025 • 2
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code Paper • 2403.07974 • Published Mar 12, 2024 • 3
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution Paper • 2401.03065 • Published Jan 5, 2024 • 11
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models Paper • 2306.15626 • Published Jun 27, 2023 • 17