wyx

DecoderImmortal

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 12 days ago

ProxyAttn: Guided Sparse Attention via Representative Heads

Paper • 2509.24745 • Published Sep 29 • 1

liked a Space 19 days ago

yzweak/AutoPR

Generate social media posts from PDFs

liked a model about 2 months ago

baidu/ERNIE-4.5-21B-A3B-Thinking

Text Generation • 22B • Updated 13 days ago • 954 • 759

liked a dataset 2 months ago

Naomibas/llm-system-prompts-benchmark

Viewer • Updated Jul 11, 2024 • 100 • 66 • 13

updated 2 models 4 months ago

DecoderImmortal/Llama3-8B-MSN

8B • Updated Jul 9 • 7

DecoderImmortal/DeepSeek-Coder-7B-MSN

7B • Updated Jul 9 • 4

upvoted a paper 4 months ago

A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8 • 92

upvoted a collection 4 months ago

ERNIE 4.5

Collection

Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174

published 2 models 5 months ago

DecoderImmortal/DeepSeek-Coder-7B-MSN

7B • Updated Jul 9 • 4

DecoderImmortal/Llama3-8B-MSN

8B • Updated Jul 9 • 7

upvoted an article 7 months ago

What is test-time compute and how to scale it?

Article • and 1 other • Feb 6 • 107

upvoted a paper 7 months ago

What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models

Paper • 2503.24235 • Published Mar 31 • 54

updated a model 11 months ago

DecoderImmortal/LM-Combiner

Updated Nov 22, 2024 • 1

upvoted a paper about 1 year ago

Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers

Paper • 2404.04925 • Published Apr 7, 2024 • 1

updated a model about 1 year ago

DecoderImmortal/CDA4GEC

Updated Sep 1, 2024
