Update README

Files changed (3)

README.md CHANGED

@@ -118,7 +118,7 @@ We have released multiple optimized models converted from original HuggingFace o
 - XVERSE-13B-Chat
 - LLaMA-Ziya-13B
 - Baichuan-7B, Baichuan-13B-Base, Baichuan-13B-Chat, Baichuan2-7B-Base, Baichuan2-7B-Chat, Baichuan2-13B-Base and lyraBaichuan2-13B-Chat
-- Yi-6B
 Feel free to contact us if you would like to convert a finetuned version of LLMs.

 - XVERSE-13B-Chat
 - LLaMA-Ziya-13B
 - Baichuan-7B, Baichuan-13B-Base, Baichuan-13B-Chat, Baichuan2-7B-Base, Baichuan2-7B-Chat, Baichuan2-13B-Base and lyraBaichuan2-13B-Chat
+- Yi-6B, Yi-34B
 Feel free to contact us if you would like to convert a finetuned version of LLMs.

lyrallms/LyraBaichuanPy/README.md CHANGED

@@ -67,22 +67,3 @@ print(output_texts)
 - Batch inference
 - Variable-length batch inference
 - Batch streaming inference
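-## Custom model parameters
-The provided conversion script `parse_model_params.py` converts the HuggingFace-format parameters of Baichuan1/2 models into the per-layer parameters required by the accelerated version. A `-model_name` conversion argument is provided; you can set it yourself so that the generated config.ini file is distinguishable.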
-```shell
-python parse_model_params.py -i your_model_dir -o output_dir -t_g 1 -i_g 1 -weight_data_type "fp16" -model_name "baichuan2-13b"
-```
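-The conversion script also copies the three files `tokenizer.model`, `special_tokens_map.json`, and `tokenizer_config.json` from tokenizer_source in the same directory to output_dir, so that the tokenizer for the accelerated Baichuan model can be initialized directly when the accelerated model is used later.
-The converted model parameters are stored one file per parameter under `output_dir/{i_g}-gpu-{weight_data_type}`; use `merge_bin.py` to merge the multiple bin files into one.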
-```shell
-layer_num=40 # 13B->40, 7B->32
-python merge_bin.py -i model_dir/{i_g}-gpu-{weight_data_type} -o output_dir -l ${layer_num}
-```
-Copy the five files `config.ini`, `config.json`, `tokenizer.model`, `special_tokens_map.json`, and `tokenizer_config.json` mentioned above to output_dir.

 - Batch inference
 - Variable-length batch inference
 - Batch streaming inference

lyrallms/LyraLlamaPy/README.md CHANGED

@@ -59,17 +59,4 @@ print(output_texts)
 For more test scripts and usage, see the [README.md](./examples/README.md) under `examples`, for example:
 - Batch inference
 - Variable-length batch inference
-- Batch streaming inference
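-## Custom model parameters
-The provided conversion script `parse_model_params.py` converts the HuggingFace-format parameters of LLaMa models into the per-layer parameters required by the accelerated version. Because LLaMa has many variants, a `-model_name` conversion argument is provided; you can set it yourself so that the generated config.ini file is distinguishable.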
-```shell
-python parse_model_params.py -i your_model_dir -o output_dir -t_g 1 -i_g 1 -weight_data_type "fp16" -model_name "llama"
-```
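-The converted model parameters are stored one file per parameter under `output_dir/{i_g}-gpu-{weight_data_type}`. This split layout helps concurrent IO, but it is less convenient to handle.
-The conversion script also copies the four files `tokenizer.model`, `tokenizer.json`, `special_tokens_map.json`, and `tokenizer_config.json` from tokenizer_source in the same directory to output_dir, so that the tokenizer for the accelerated LLaMa model can be initialized directly when the accelerated model is used later.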

 For more test scripts and usage, see the [README.md](./examples/README.md) under `examples`, for example:
 - Batch inference
 - Variable-length batch inference
+- Batch streaming inference