Spaces: JeffYang52415/LLMEval-Dataset-Parser

JeffYang52415 committed on Dec 30, 2024

Commit b01d107 (unverified) · 1 Parent(s): fecdc3d

bug: fix minor bugs

Files changed (2)

.github/workflows/huggingface-sync.yml +13 -2
README.md +19 -17

.github/workflows/huggingface-sync.yml CHANGED

@@ -18,6 +18,15 @@ jobs:
           git config --global user.email "github-actions[bot]@users.noreply.github.com"
           git config --global user.name "github-actions[bot]"
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.x"
+      - name: Install Hugging Face CLI
+        run: |
+          pip install --upgrade huggingface-hub
       - name: Login to Hugging Face
         env:
           HF_TOKEN: ${{ secrets.HUGGINGFACE_TOKEN }}
@@ -26,5 +35,7 @@ jobs:
       - name: Push to Hugging Face Space
         run: |
-          git remote add space https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser
-          git push space main:main
+          git remote add space https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser || true
+          git fetch space || true
+          # Force push to ensure sync, use with caution
+          git push -f space main:main
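
Note on the updated push step: `|| true` lets `git remote add` and `git fetch` fail without stopping the script (for example, `git remote add` exits with a non-zero status when the `space` remote already exists), while `git push -f` carries no such guard, so a failed push still fails the step. A minimal Python sketch of that control flow follows. The names `run_tolerant`, `run_strict`, and `sync_to_space` are hypothetical and used only for illustration; they are not part of the workflow or of any library.

```python
import subprocess
import sys


def run_tolerant(cmd: list[str]) -> int:
    """Run a command and ignore failure, like `cmd || true` in a shell step."""
    return subprocess.run(cmd, check=False).returncode


def run_strict(cmd: list[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(cmd, check=True)


def sync_to_space(remote_url: str) -> None:
    """Hypothetical mirror of the three git commands in the push step."""
    # Adding the remote fails if it already exists; that is safe to ignore.
    run_tolerant(["git", "remote", "add", "space", remote_url])
    # A failed fetch is ignored as well.
    run_tolerant(["git", "fetch", "space"])
    # The force push is the step that must succeed, so errors propagate.
    run_strict(["git", "push", "-f", "space", "main:main"])


if __name__ == "__main__":
    # Demonstrate the two behaviours with a command that always exits with 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print("tolerant exit status:", run_tolerant(failing))
    try:
        run_strict(failing)
    except subprocess.CalledProcessError as exc:
        print("strict run raised with exit status:", exc.returncode)
```

Because the push is forced, any commits that exist only on the Space are overwritten, which is what the "use with caution" comment in the diff warns about.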

README.md CHANGED

@@ -13,10 +13,12 @@ short_description: A collection of parsers for LLM benchmark datasets
 **LLMDataParser** is a Python library that provides parsers for benchmark datasets used in evaluating Large Language Models (LLMs). It offers a unified interface for loading and parsing datasets like **MMLU**, **GSM8k**, and others, streamlining dataset preparation for LLM evaluation. The library aims to simplify the process of working with common LLM benchmark datasets through a consistent API.
+**Spaces**: You can also try out the online demo on Hugging Face Spaces:
+[LLMEval Dataset Parser Demo](https://huggingface.co/spaces/JeffYang52415/LLMEval-Dataset-Parser)
 ## Features
 - **Unified Interface**: Consistent `DatasetParser` for all datasets.
-- **LLM-Agnostic**: Independent of any specific language model.
 - **Easy to Use**: Simple methods and built-in Python types.
 - **Extensible**: Easily add support for new datasets.
 - **Gradio**: Built-in Gradio interface for interactive dataset exploration and testing.
@@ -78,22 +80,22 @@ Poetry manages the virtual environment and dependencies automatically, so you do
 Here's a simple example demonstrating how to use the library:
 ```python
- from llmdataparser import ParserRegistry
- # list all available parsers
- ParserRegistry.list_parsers()
- # get a parser
- parser = ParserRegistry.get_parser("mmlu")
- # load the parser
- parser.load() # optional: task_name, split
- # parse the parser
- parser.parse() # optional: split_names
- print(parser.task_names)
- print(parser.split_names)
- print(parser.get_dataset_description)
- print(parser.get_huggingface_link)
- print(parser.total_tasks)
- data = parser.get_parsed_data
+from llmdataparser import ParserRegistry
+# list all available parsers
+ParserRegistry.list_parsers()
+# get a parser
+parser = ParserRegistry.get_parser("mmlu")
+# load the parser
+parser.load() # optional: task_name, split
+# parse the parser
+parser.parse() # optional: split_names
+print(parser.task_names)
+print(parser.split_names)
+print(parser.get_dataset_description)
+print(parser.get_huggingface_link)
+print(parser.total_tasks)
+data = parser.get_parsed_data
 ```
 We also provide a Gradio demo for interactive testing:
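
Note on the usage example in the README hunk: it looks a parser up by name through `ParserRegistry` and then calls `load()` and `parse()` on it. The sketch below shows one way such a name-to-parser registry could be structured. It is an assumption made for illustration, not the library's implementation: only the names `ParserRegistry`, `DatasetParser`, `list_parsers`, `get_parser`, `load`, and `parse` come from the README, while `register`, `ToyParser`, the `raw` and `parsed` attributes, and the in-memory data are hypothetical.

```python
from typing import Dict, List, Type


class DatasetParser:
    """Hypothetical base class: every parser exposes the same load/parse API."""

    def __init__(self) -> None:
        self.raw: List[str] = []
        self.parsed: List[dict] = []

    def load(self) -> None:
        raise NotImplementedError

    def parse(self) -> None:
        raise NotImplementedError


class ParserRegistry:
    """Hypothetical name-to-class registry; not the library's actual code."""

    _parsers: Dict[str, Type[DatasetParser]] = {}

    @classmethod
    def register(cls, name: str):
        """Class decorator that stores a parser class under a lowercase name."""
        def decorator(parser_cls: Type[DatasetParser]) -> Type[DatasetParser]:
            cls._parsers[name.lower()] = parser_cls
            return parser_cls
        return decorator

    @classmethod
    def list_parsers(cls) -> List[str]:
        return sorted(cls._parsers)

    @classmethod
    def get_parser(cls, name: str) -> DatasetParser:
        key = name.lower()
        if key not in cls._parsers:
            raise ValueError(f"Unknown parser: {name!r}")
        return cls._parsers[key]()


@ParserRegistry.register("toy")
class ToyParser(DatasetParser):
    """Stand-in dataset with in-memory rows instead of a real benchmark."""

    def load(self) -> None:
        self.raw = ["2+2=?|4", "3*3=?|9"]

    def parse(self) -> None:
        # Split each "question|answer" row into a plain dictionary.
        self.parsed = [
            {"question": q, "answer": a}
            for q, a in (row.split("|") for row in self.raw)
        ]


parser = ParserRegistry.get_parser("toy")
parser.load()
parser.parse()
print(ParserRegistry.list_parsers())
print(parser.parsed[0])
```

With a registry of this shape, supporting a new dataset means adding one decorated subclass, which is one plausible reading of the "Extensible" bullet in the feature list.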