Update README.md

README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: DeepResearchEvaluator
 emoji: 🏆🏆🏆
 colorFrom: red
 colorTo: purple
@@ -8,14 +8,66 @@ sdk_version: 1.41.1
 app_file: app.py
 pinned: true
 license: mit
-short_description:
+short_description: Deep Research Evaluator for Long Horizon Learning Tasks
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
-
-
-
+A Deep Research Evaluator is a conceptual AI system designed to analyze and synthesize information from extensive research literature, such as arXiv papers, to learn about specific topics and generate code for long-horizon AI tasks. This involves understanding complex subjects, identifying relevant methodologies, and implementing solutions that require planning and execution over extended sequences.
+
+Key Topics and Related Papers:
+
+Long-Horizon Task Planning in Robotics:
+
+"MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model"
+Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song
+This paper introduces a method that decomposes complex tasks at multiple levels to enhance planning with open-source large language models.
+ARXIV
+
+"ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning"
+Authors: Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, Lei Ma
+The study presents a framework that improves LLM-based planning through iterative self-refinement, enhancing the feasibility and correctness of task plans.
+ARXIV
+
+Skill-Based Reinforcement Learning:
+
+"Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks"
+Authors: Haoqi Yuan, Chi Zhang, Hongcheng Wang, Feiyang Xie, Penglin Cai, Hao Dong, Zongqing Lu
+This research builds multi-task agents in open-world environments by learning basic skills and planning over them to accomplish long-horizon tasks efficiently.
+ARXIV
+
+"SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks"
+Authors: Yongyan Wen, Siyuan Li, Rongchang Zuo, Lei Yuan, Hangyu Mao, Peng Liu
+The paper proposes a framework that integrates a differentiable decision tree into the high-level policy to generate skill embeddings, improving the explainability of decision-making in complex tasks.
+ARXIV
+
+Neuro-Symbolic Approaches:
+
+"Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation"
+Authors: Jie-Jing Shao, Hao-Ran Hao, Xiao-Wen Yang, Yu-Feng Li
+This work combines data-driven learning with symbolic reasoning to enable long-horizon planning through abductive imitation learning.
+ARXIV
+
+"CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning"
+Authors: [Authors not specified]
+The study presents a method that uses large language models to translate constraints into formal specifications, facilitating long-horizon task and motion planning.
+ARXIV
+
+Evaluation Frameworks for AI Models:
+
+"ASI: Accuracy-Stability Index for Evaluating Deep Learning Models"
+Authors: Wei Dai, Daniel Berleant
+The paper introduces the Accuracy-Stability Index (ASI), a quantitative measure that accounts for both accuracy and stability when assessing deep learning models.
+ARXIV
+
+"Benchmarks for Deep Off-Policy Evaluation"
+Authors: Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
+This research provides a collection of policies that, together with existing offline datasets, can be used to benchmark off-policy evaluation in deep learning.
+ARXIV
+
+These topics and papers contribute to AI systems that can understand research literature and apply the acquired knowledge to complex, long-horizon tasks.
+
+---
+
 
 Features:
 
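The README above describes analyzing arXiv papers; a natural first step for such an evaluator is fetching paper metadata from the arXiv API, which returns Atom XML. A minimal sketch, assuming the public `http://export.arxiv.org/api/query` endpoint — the function names here are illustrative, and the parsing works on any Atom feed, so the example below needs no network access:

```python
import urllib.parse
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # namespace used by arXiv API responses

def arxiv_query_url(terms, max_results=5):
    """Build an arXiv API query URL for the given search terms."""
    params = urllib.parse.urlencode({
        "search_query": "all:" + terms,
        "max_results": max_results,
    })
    return "http://export.arxiv.org/api/query?" + params

def parse_arxiv_feed(xml_text):
    """Extract (title, author list) pairs from an arXiv Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.iter(ATOM + "entry"):
        # Titles may wrap across lines in the feed; collapse the whitespace.
        title = " ".join(entry.findtext(ATOM + "title", default="").split())
        authors = [a.findtext(ATOM + "name", default="").strip()
                   for a in entry.iter(ATOM + "author")]
        papers.append((title, authors))
    return papers

# Offline example using a trimmed-down feed of the shape the API returns:
sample = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>ISR-LLM: Iterative Self-Refined Large Language Model
      for Long-Horizon Sequential Task Planning</title>
    <author><name>Zhehua Zhou</name></author>
    <author><name>Jiayang Song</name></author>
  </entry>
</feed>"""
papers = parse_arxiv_feed(sample)
```

In a live Space, `urllib.request.urlopen(arxiv_query_url(...))` would supply the feed text; the summaries of the fetched papers could then feed the evaluation step.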