---
title: Tokenizer Arena
emoji: ⚔
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 4.38.1
app_file: app.py
pinned: false
datasets:
- cc100
tags:
- tokenizer
short_description: Compare different tokenizers in char-level and byte-level.
---
Please visit our GitHub repo for more information: https://github.com/xu-song/tokenizer-arena
## Run the Gradio demo
```sh
python app.py
```
## Deploy to Hugging Face
```sh
python compression_util.py # cache compression statistics
python character_util.py # cache character statistics
python stats/sample.py # sample the compression statistics
git add stats/compression_rate/*
git add -u .
```
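The deploy steps above cache compression statistics before committing them. As a rough illustration of what a char-level versus byte-level comparison can look like, here is a minimal sketch. It is not the code in `compression_util.py`: the function name `compression_stats`, the returned keys, and the whitespace tokenizer are all hypothetical stand-ins chosen for the example.

```python
from typing import Callable, Dict, List


def compression_stats(text: str, tokenize: Callable[[str], List[str]]) -> Dict[str, float]:
    """Return chars-per-token and bytes-per-token for one text sample.

    Hypothetical helper for illustration; not part of tokenizer-arena.
    """
    n_tokens = len(tokenize(text))
    if n_tokens == 0:
        raise ValueError("tokenizer produced no tokens")
    n_chars = len(text)                   # char-level length (Unicode code points)
    n_bytes = len(text.encode("utf-8"))   # byte-level length (UTF-8)
    return {
        "n_tokens": n_tokens,
        "chars_per_token": n_chars / n_tokens,
        "bytes_per_token": n_bytes / n_tokens,
    }


# Stand-in tokenizer (whitespace split), used only to keep the sketch self-contained.
stats = compression_stats("héllo wörld", str.split)
print(stats)
```

The two ratios differ whenever the text contains non-ASCII characters, which is one reason to report both when comparing tokenizers across languages.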