luguoshan committed on
Commit 783d346 · verified · 1 Parent(s): 3a63b48

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -13,7 +13,7 @@ library_name: transformers
 
 - `LLaDA-MoE-7B-A1B-Base`: A base pre-trained model designed for research and secondary development.
 - `LLaDA-MoE-7B-A1B-Instruct`: An instruction-tuned model optimized for practical applications.
-
+- `LLaDA-MoE-7B-A1B-Instruct-TD`: A specialized instruction-tuned model, further optimized for accelerated inference using Trajectory Distillation.
 ---
 <div align="center">
 <img src="https://raw.githubusercontent.com/Ulov888/LLaDA_Assets/main/benchmarks_grouped_bar.png" width="800" />
@@ -48,6 +48,8 @@ library_name: transformers
 |--------|-------------|-------------------|
 | [`inclusionAI/LLaDA-MoE-7B-A1B-Base`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Base) | Base pre-trained model for research and fine-tuning. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Base) |
 | [`inclusionAI/LLaDA-MoE-7B-A1B-Instruct`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct) | Instruction-tuned model, ready for downstream applications. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct) |
+| [`inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD) | An instruction-tuned model further optimized with **Trajectory Distillation (TD)** for accelerated inference. Decodes multiple tokens per forward pass. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD) |
+
 
 ---
 