nielsr (HF Staff) committed
Commit 305dff1 · verified · 1 parent: 8082a53

Add project page link to model card


This PR enhances the model card by adding a link to the project page (`https://internvl.github.io/blog/2025-09-29-SDLM/`) in the introductory section, improving discoverability for users seeking additional context about the project.

Apart from a reordering of the YAML front-matter keys, the existing metadata and content (including the arXiv paper link and the sample usage) remain unchanged, as they are already accurate and meet the documentation guidelines.
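Since the card keeps `pipeline_tag: text-generation` alongside the `custom_code` tag, the checkpoint ships its own modeling code and must be loaded with `trust_remote_code=True`. A minimal loading sketch, assuming the standard `transformers` auto classes resolve for this repository (defer to the card's own sample usage, which this PR does not modify, for generation specifics):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# `custom_code` in the card metadata means the repo ships custom modeling
# code, so loading requires explicitly trusting remote code.
repo = "OpenGVLab/SDLM-3B-D8"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
```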

Files changed (1): README.md (+15, -15)
README.md CHANGED
@@ -1,18 +1,6 @@
 ---
-license: apache-2.0
-license_name: qwen
-license_link: https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE
-pipeline_tag: text-generation
-library_name: transformers
 base_model:
 - Qwen/Qwen2.5-3B
-base_model_relation: finetune
-language:
-- en
-tags:
-- sdlm
-- diffusion language model
-- custom_code
 datasets:
 - dyyyyyyyy/ScaleQuest-Math
 - OpenCoder-LLM/opc-sft-stage2
@@ -20,15 +8,27 @@ datasets:
 - HuggingFaceTB/smoltalk2
 - LipengCS/Table-GPT
 - allenai/SciRIFF
+language:
+- en
+library_name: transformers
+license: apache-2.0
+license_name: qwen
+license_link: https://huggingface.co/Qwen/Qwen2.5-3B/blob/main/LICENSE
+pipeline_tag: text-generation
+tags:
+- sdlm
+- diffusion language model
+- custom_code
+base_model_relation: finetune
 ---
 
 # SDLM-3B-D8
 
-[\[📂 GitHub\]](https://github.com/OpenGVLab/SDLM) [\[📜 Tech Report\]](https://arxiv.org/abs/2509.24007) [\[🤗 HuggingFace\]](https://huggingface.co/collections/OpenGVLab/sdlm-68ac82709d7c343ad36aa552)
+[\[📂 GitHub\]](https://github.com/OpenGVLab/SDLM) [\[📜 Tech Report\]](https://arxiv.org/abs/2509.24007) [\[🚀 Project Page\]](https://internvl.github.io/blog/2025-09-29-SDLM/) [\[🤗 HuggingFace\]](https://huggingface.co/collections/OpenGVLab/sdlm-68ac82709d7c343ad36aa552)
 
 ## Introduction
 
-We propose a <b>S</b>equential <b>D</b>iffusion <b>L</b>anguage <b>M</b>odel (<b>SDLM</b>) to cheaply stimulate the parallel prediction capabilities of diffusion models. Specifically, SDLM reduces distribution shift by limiting the prediction range to a fixed block length and enforces decoding order through a longest-prefix decoding method, thereby significantly improving prediction efficiency while ensuring generation quality. Our method can be viewed as a further generalization of the autoregressive (AR) paradigm, so pre-trained AR weights can be migrated to the diffusion framework with only minimal instruction fine-tuning.
+We propose a **S**equential **D**iffusion **L**anguage **M**odel (**SDLM**) to cheaply stimulate the parallel prediction capabilities of diffusion models. Specifically, SDLM reduces distribution shift by limiting the prediction range to a fixed block length and enforces decoding order through a longest-prefix decoding method, thereby significantly improving prediction efficiency while ensuring generation quality. Our method can be viewed as a further generalization of the autoregressive (AR) paradigm, so pre-trained AR weights can be migrated to the diffusion framework with only minimal instruction fine-tuning.
 
 ![image/png](https://huggingface.co/OpenGVLab/SDLM-3B-D8/resolve/main/assets/three_framework.png)
 
@@ -151,4 +151,4 @@ If you find this project useful in your research, please consider citing:
  journal={arXiv preprint arXiv:2509.24007},
  year={2025}
 }
-```
+```
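The introduction in the diff above describes SDLM's decoding rule only in prose. As a rough sketch of the idea (not the authors' implementation, which ships in the repo's custom code; the `num_future_tokens` argument, the confidence threshold, and the acceptance rule below are all illustrative assumptions), a block-wise longest-prefix decoder might look like this:

```python
import torch

def longest_prefix_block_decode(model, input_ids, block_len=8,
                                conf_threshold=0.9, max_new_tokens=256):
    """Illustrative sketch of block-wise longest-prefix decoding.

    Hypothetical: SDLM's actual masking scheme, confidence measure, and
    acceptance rule live in the repository's custom code. Here we simply
    predict a fixed-length block in parallel and keep the longest prefix
    whose per-token confidence clears a threshold.
    """
    generated = input_ids
    while generated.shape[1] - input_ids.shape[1] < max_new_tokens:
        # Predict an entire block of `block_len` future tokens in one
        # forward pass (assumed signature, not the real API).
        logits = model(generated, num_future_tokens=block_len)
        probs = torch.softmax(logits, dim=-1)   # (1, block_len, vocab)
        conf, tokens = probs.max(dim=-1)        # per-position confidence

        # Longest-prefix rule: keep tokens up to the first position whose
        # confidence falls below the threshold; always keep at least one
        # token so decoding degrades gracefully to autoregressive.
        below = (conf[0] < conf_threshold).nonzero()
        keep = int(below[0]) if below.numel() else block_len
        keep = max(keep, 1)

        generated = torch.cat([generated, tokens[:, :keep]], dim=1)
    return generated
```

Under this rule the accepted prefix length adapts per step: in the best case the model commits `block_len` tokens per forward pass, and in the worst case it falls back to one token, recovering ordinary autoregressive decoding. That adaptivity is where the efficiency claim in the introduction comes from.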