Prompt Template
XTuner's prompt templates are kept consistent with each LLM's official template. Below, we explain the template logic using the InternLM-Chat model (`internlm_chat`) as an example.
Structure
internlm_chat=dict(
    SYSTEM='<|System|>:{system}\n',
    INSTRUCTION='<|User|>:{input}<eoh>\n<|Bot|>:',
    SUFFIX='<eoa>',
    SUFFIX_AS_EOS=True,
    SEP='\n',
    STOP_WORDS=['<eoa>'])
- `SYSTEM`: The template for the "system" field during Q&A, where `{system}` represents the "system" text. This field appears only once in a multi-turn dialogue, in the first turn.
- `INSTRUCTION`: The template for the "instruction" field during Q&A, where `{input}` represents the user instruction text.
- `SUFFIX`: The suffix that is appended to the "response" of each Q&A turn. Typically, it also serves as a special ending symbol (i.e., `eos`). Defaults to `''`.
- `SUFFIX_AS_EOS`: Whether the suffix above acts as the ending symbol. If set to `True`, it replaces the `eos_token` of the `tokenizer`. Otherwise, the `eos_token` of the `tokenizer` is still used to mark the end of the sequence. Defaults to `False`.
- `SEP`: The separator between turns in a multi-turn dialogue. It is appended after the `INSTRUCTION` and `SUFFIX`. Defaults to `''`.
- `STOP_WORDS`: The stop words used during the text generation stage. The `eos_token` of the `tokenizer` is added to `STOP_WORDS` automatically, so it does not need to be set manually.
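The `SUFFIX_AS_EOS` and `STOP_WORDS` rules described above can be illustrated with a minimal sketch. The helper names `resolve_eos` and `resolve_stop_words` are hypothetical and are not part of XTuner's API; they only mirror the behavior stated in the field descriptions, and `'</s>'` stands in for an arbitrary tokenizer `eos_token`.

```python
# Illustrative copy of the template from the "Structure" section.
internlm_chat = dict(
    SYSTEM='<|System|>:{system}\n',
    INSTRUCTION='<|User|>:{input}<eoh>\n<|Bot|>:',
    SUFFIX='<eoa>',
    SUFFIX_AS_EOS=True,
    SEP='\n',
    STOP_WORDS=['<eoa>'])


def resolve_eos(template, tokenizer_eos):
    """Hypothetical helper: choose the end-of-sequence marker.

    If SUFFIX_AS_EOS is True, the template's SUFFIX replaces the
    tokenizer's eos_token; otherwise the tokenizer's eos_token is kept.
    """
    if template.get('SUFFIX_AS_EOS', False):
        return template.get('SUFFIX', '')
    return tokenizer_eos


def resolve_stop_words(template, tokenizer_eos):
    """Hypothetical helper: template stop words plus the tokenizer's eos_token."""
    stop_words = list(template.get('STOP_WORDS', []))
    if tokenizer_eos not in stop_words:
        stop_words.append(tokenizer_eos)
    return stop_words


eos = resolve_eos(internlm_chat, '</s>')
stop_words = resolve_stop_words(internlm_chat, '</s>')
```

With this template, `eos` is the suffix `<eoa>` rather than the tokenizer's own token, while `stop_words` contains both.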
Results
Single-turn
<|System|>:{system}
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
Multi-turn
<|System|>:{system}
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
<|User|>:{input}<eoh>
<|Bot|>:{output}<eoa>
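The single-turn and multi-turn layouts above can be reproduced with a short sketch. The `render` function is a hypothetical illustration, not XTuner's implementation. It assumes that `SYSTEM` is emitted only in the first turn, and that `SUFFIX` followed by `SEP` is appended after every response.

```python
# Illustrative copy of the template from the "Structure" section.
internlm_chat = dict(
    SYSTEM='<|System|>:{system}\n',
    INSTRUCTION='<|User|>:{input}<eoh>\n<|Bot|>:',
    SUFFIX='<eoa>',
    SUFFIX_AS_EOS=True,
    SEP='\n',
    STOP_WORDS=['<eoa>'])


def render(template, turns, system=None):
    """Hypothetical helper: build the full training text for a dialogue.

    `turns` is a list of (user_input, output) pairs.
    """
    text = ''
    for index, (user_input, output) in enumerate(turns):
        # The system field appears only once, in the first turn.
        if index == 0 and system is not None:
            text += template.get('SYSTEM', '').format(system=system)
        text += template['INSTRUCTION'].format(input=user_input)
        # The response is followed by the suffix, then the turn separator.
        text += output + template.get('SUFFIX', '') + template.get('SEP', '')
    return text


single = render(internlm_chat, [('Hi', 'Hello!')], system='Be brief.')
multi = render(
    internlm_chat,
    [('Hi', 'Hello!'), ('Bye', 'Goodbye!')],
    system='Be brief.')
print(multi)
```

Under these assumptions, every rendered turn ends with `<eoa>` followed by a newline, and the `<|System|>` line appears once regardless of the number of turns.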
Choosing the prompt template
| Model | Prompt Template |
|---|---|
| baichuan-inc/Baichuan-7B | default* |
| baichuan-inc/Baichuan-13B-Base | default* |
| baichuan-inc/Baichuan-13B-Chat | baichuan_chat |
| baichuan-inc/Baichuan2-7B-Base | default* |
| baichuan-inc/Baichuan2-7B-Chat | baichuan2_chat |
| baichuan-inc/Baichuan2-13B-Base | default* |
| baichuan-inc/Baichuan2-13B-Chat | baichuan2_chat |
| THUDM/chatglm2-6b | chatglm2 |
| THUDM/chatglm3-6b | chatglm3 |
| THUDM/chatglm3-6b-base | chatglm3 |
| deepseek-ai/deepseek-coder-6.7b-base | deepseek_coder |
| deepseek-ai/deepseek-coder-6.7b-instruct | deepseek_coder |
| internlm/internlm-7b | default* |
| internlm/internlm-20b | default* |
| internlm/internlm-chat-7b | internlm_chat |
| internlm/internlm-chat-20b | internlm_chat |
| huggyllama/llama-7b | default |
| meta-llama/Llama-2-7b-hf | llama2_chat |
| meta-llama/Llama-2-7b-chat-hf | llama2_chat |
| meta-llama/Llama-2-70b-hf | llama2_chat |
| lmsys/vicuna-7b-v1.5 | vicuna |
| lmsys/vicuna-13b-v1.5 | vicuna |
| mistralai/Mistral-7B-v0.1 | mistral |
| mistralai/Mixtral-8x7B-v0.1 | mixtral |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | mixtral |
| Qwen/Qwen-1_8B | default* |
| Qwen/Qwen-1_8B-Chat | qwen_chat |
| Qwen/Qwen-7B | default* |
| Qwen/Qwen-7B-Chat | qwen_chat |
| Qwen/Qwen-72B | default* |
| Qwen/Qwen-72B-Chat | qwen_chat |
| bigcode/starcoder | default |
| 01-ai/Yi-6B | default |
| 01-ai/Yi-34B | default |
| HuggingFaceH4/zephyr-7b-beta | zephyr |
| deepseek-ai/deepseek-moe-16b-base | deepseek_moe |
| deepseek-ai/deepseek-moe-16b-chat | deepseek_moe |
| internlm/internlm2-1_8b | default* |
| internlm/internlm2-7b | default* |
| internlm/internlm2-20b | default* |
| internlm/internlm2-chat-1_8b | internlm2_chat |
| internlm/internlm2-chat-7b | internlm2_chat |
| internlm/internlm2-chat-20b | internlm2_chat |
| Qwen/Qwen1.5-0.5B | default* |
| Qwen/Qwen1.5-0.5B-Chat | qwen_chat |
| Qwen/Qwen1.5-1.8B | default* |
| Qwen/Qwen1.5-1.8B-Chat | qwen_chat |
| Qwen/Qwen1.5-4B | default* |
| Qwen/Qwen1.5-4B-Chat | qwen_chat |
| Qwen/Qwen1.5-7B | default* |
| Qwen/Qwen1.5-7B-Chat | qwen_chat |
| Qwen/Qwen1.5-14B | default* |
| Qwen/Qwen1.5-14B-Chat | qwen_chat |
| Qwen/Qwen1.5-72B | default* |
| Qwen/Qwen1.5-72B-Chat | qwen_chat |
| google/gemma-2b | default* |
| google/gemma-2b-it | gemma* |
| google/gemma-7b | default* |
| google/gemma-7b-it | gemma* |
*: The official templates of these models contain special tokens (such as `<|im_start|>` and `<|im_end|>`) that were not trained during the pre-training phase. Therefore, these models use the `default` template.