`lyrallms` Capability Matrix

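| Model | Attn method: Unfused | Attn method: FlashAttn2 | MEMOPT mode: W4A16 | MEMOPT mode: W8A16 | KVCache precision: FP16 | KVCache precision: INT8 |
|---|---|---|---|---|---|---|
| LLaMA | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XVERSE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Baichuan 1/2 (7B and 13B) | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| ChatGLM | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |
| BELLE | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |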
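
When scripting around several models, it can help to check a configuration against the matrix before attempting a conversion. The following is a minimal sketch that simply transcribes the table above into a Python dictionary. The names `SUPPORT` and `is_supported` are hypothetical helpers for illustration and are not part of the `lyrallms` API.

```python
# Capability matrix transcribed from the table above.
# SUPPORT and is_supported are illustrative names, not part of lyrallms.
FEATURES = ("Unfused", "FlashAttn2", "W4A16", "W8A16", "FP16", "INT8")

SUPPORT = {
    "LLaMA":    dict(zip(FEATURES, (True, True,  True,  True, True, True))),
    "XVERSE":   dict(zip(FEATURES, (True, True,  True,  True, True, True))),
    "Baichuan": dict(zip(FEATURES, (True, False, True,  True, True, False))),
    "ChatGLM":  dict(zip(FEATURES, (True, False, False, True, True, False))),
    "BELLE":    dict(zip(FEATURES, (True, False, False, True, True, False))),
}


def is_supported(model: str, *features: str) -> bool:
    """Return True only if every requested feature is marked ✅ for the model."""
    row = SUPPORT[model]
    return all(row[feature] for feature in features)


if __name__ == "__main__":
    # LLaMA with FlashAttn2, W4A16 weights and an INT8 KV cache.
    print(is_supported("LLaMA", "FlashAttn2", "W4A16", "INT8"))
    # ChatGLM with W4A16 weights.
    print(is_supported("ChatGLM", "W4A16"))
```

An unknown model or feature name raises `KeyError`, which is deliberate in this sketch: a typo should fail loudly rather than be reported as "unsupported".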

`lyrallms` Usage

Calibration

See the README.md in the calibration folder.

Converting and Running Accelerated Models in Python

LLaMA

See the README.md in the LyraLlamaPy folder.

Baichuan

See the README.md in the LyraBaichuanPy folder.