Commit 98a6c61 (verified), committed by fulgidus · 1 parent: c3b36c7

Upload gguf/README.md with huggingface_hub

Files changed (1)
  1. gguf/README.md +152 -0
gguf/README.md ADDED
@@ -0,0 +1,152 @@
1
+ # ZigNet Qwen2.5-Coder-7B (GGUF Q4_K_M)
2
+
3
+ Quantized GGUF version of [ZigNet Qwen2.5-Coder-7B](https://huggingface.co/fulgidus/zignet-qwen2.5-coder-7b) for local inference.
4
+
5
+ ## Model Details
6
+
7
+ - **Base Model**: Qwen/Qwen2.5-Coder-7B-Instruct
8
+ - **Fine-tuning**: LoRA adapters trained on 13,756 Zig examples
9
+ - **Quantization**: Q4_K_M (4-bit, mixed K-quants)
10
+ - **Size**: 4.4GB (about 71% smaller than the 15GB F16 model)
11
+ - **Quality**: ~95-98% retention vs full precision
12
+ - **Speed**: 3-4x faster than F16 on consumer GPUs
13
+
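The size reduction can be checked directly from the two file sizes quoted in the list above. A minimal sketch, where the helper name `sizeReductionPercent` is hypothetical and the inputs are the 15GB and 4.4GB figures from this README:

```typescript
// Hypothetical helper: percentage by which a quantized file is smaller
// than the original file. Both sizes must use the same unit.
function sizeReductionPercent(originalGB: number, quantizedGB: number): number {
    if (originalGB <= 0) {
        throw new RangeError("originalGB must be positive");
    }
    return (1 - quantizedGB / originalGB) * 100;
}

// Sizes as quoted above: F16 at 15GB, Q4_K_M at 4.4GB.
const reduction = sizeReductionPercent(15, 4.4);
console.log(`${reduction.toFixed(1)}% smaller`); // prints "70.7% smaller"
```

The same helper can be reused to compare other quantization levels once their file sizes are known.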
14
+ ## Quick Start
15
+
16
+ ### With Ollama
17
+
18
+ ```bash
19
+ # Download model
20
+ huggingface-cli download fulgidus/zignet-qwen2.5-coder-7b \
21
+ gguf/zignet-qwen-7b-q4km.gguf \
22
+ --local-dir ./models
23
+
24
+ # Create Modelfile
25
+ cat > Modelfile << 'EOF'
26
+ FROM ./models/gguf/zignet-qwen-7b-q4km.gguf
27
+
28
+ TEMPLATE """{{ if .System }}<|im_start|>system
29
+ {{ .System }}<|im_end|>
30
+ {{ end }}{{ if .Prompt }}<|im_start|>user
31
+ {{ .Prompt }}<|im_end|>
32
+ {{ end }}<|im_start|>assistant
33
+ {{ .Response }}<|im_end|>
34
+ """
35
+
36
+ SYSTEM """You are ZigNet, an AI assistant specialized in Zig programming language (v0.13-0.15).
37
+
38
+ Your expertise includes:
39
+ - Explaining Zig syntax, features, and idioms
40
+ - Understanding comptime, generics, and error handling
41
+ - Providing code examples and fixes
42
+ - Referencing official Zig documentation"""
43
+
44
+ PARAMETER temperature 0.7
45
+ PARAMETER top_p 0.9
46
+ PARAMETER num_ctx 4096
47
+ PARAMETER stop "<|im_start|>"
48
+ PARAMETER stop "<|im_end|>"
49
+ EOF
50
+
51
+ # Import to Ollama
52
+ ollama create zignet:latest -f Modelfile
53
+
54
+ # Run
55
+ ollama run zignet:latest "Explain comptime in Zig"
56
+ ```
57
+
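The `TEMPLATE` in the Modelfile above wraps each turn in ChatML markers. When a runtime does not apply a chat template for you, the same prompt string can be assembled by hand. A minimal sketch, assuming a hypothetical helper named `buildChatMLPrompt` that mirrors that template up to the open assistant turn:

```typescript
// Hypothetical helper mirroring the Modelfile TEMPLATE: an optional system
// turn, a user turn, then an assistant turn left open for the model.
function buildChatMLPrompt(userPrompt: string, systemPrompt?: string): string {
    const parts: string[] = [];
    if (systemPrompt) {
        parts.push(`<|im_start|>system\n${systemPrompt}<|im_end|>\n`);
    }
    if (userPrompt) {
        parts.push(`<|im_start|>user\n${userPrompt}<|im_end|>\n`);
    }
    // The model writes its reply here and is expected to end it with
    // <|im_end|>, which is why that token is configured as a stop sequence.
    parts.push("<|im_start|>assistant\n");
    return parts.join("");
}

const chatPrompt = buildChatMLPrompt(
    "Explain comptime in Zig",
    "You are ZigNet, an AI assistant specialized in Zig."
);
console.log(chatPrompt);
```

Ollama applies the template itself, so this is only needed when passing a raw prompt to another runtime.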
58
+ ### With llama.cpp
59
+
60
+ ```bash
61
+ # Download model
62
+ huggingface-cli download fulgidus/zignet-qwen2.5-coder-7b \
63
+ gguf/zignet-qwen-7b-q4km.gguf
64
+
65
+ # Run with llama.cpp
66
+ ./llama-cli \
67
+ -m ~/.cache/huggingface/hub/models--fulgidus--zignet-qwen2.5-coder-7b/snapshots/*/gguf/zignet-qwen-7b-q4km.gguf \
68
+ -p "Explain Zig's error handling system:" \
69
+ -n 512 \
70
+ --temp 0.7 \
71
+ --top-p 0.9 \
72
+ -ngl 35 # Offload layers to GPU
73
+ ```
74
+
75
+ ### With node-llama-cpp (JavaScript/TypeScript)
76
+
77
+ ```typescript
78
+ import { getLlama, LlamaChatSession } from "node-llama-cpp";
79
+
80
+ const llama = await getLlama(); // node-llama-cpp v3 entry point
81
+ const model = await llama.loadModel({
82
+ modelPath: "./models/gguf/zignet-qwen-7b-q4km.gguf"
83
+ });
84
+ const context = await model.createContext();
85
+ const session = new LlamaChatSession({ contextSequence: context.getSequence() });
86
+
87
+ const response = await session.prompt(
88
+ "Generate a generic ArrayList in Zig"
89
+ );
90
+
91
+ console.log(response);
92
+ ```
93
+
94
+ ## Hardware Requirements
95
+
96
+ | Configuration | VRAM | RAM | Speed |
97
+ |--------------|------|-----|-------|
98
+ | CPU only | - | 8GB | ~15-20 tokens/s |
99
+ | NVIDIA GPU | 6GB | 8GB | ~60-80 tokens/s |
100
+ | Apple M1/M2 | - | 8GB | ~40-50 tokens/s |
101
+ | RTX 3090 | 24GB | 16GB | ~100+ tokens/s |
102
+
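The throughput column translates into rough wall-clock time for a response of a given length. A minimal sketch, treating the table values as approximate and using a hypothetical helper named `generationTimeRange`:

```typescript
// Hypothetical helper: seconds needed to generate `tokens` tokens when
// throughput lies between minRate and maxRate (tokens per second).
function generationTimeRange(
    tokens: number,
    minRate: number,
    maxRate: number
): [number, number] {
    if (minRate <= 0 || maxRate < minRate) {
        throw new RangeError("rates must satisfy 0 < minRate <= maxRate");
    }
    // The fastest time comes from the highest rate, the slowest from the lowest.
    return [tokens / maxRate, tokens / minRate];
}

// 512 tokens at the CPU-only rate of roughly 15-20 tokens/s from the table.
const [fastest, slowest] = generationTimeRange(512, 15, 20);
console.log(`${fastest.toFixed(1)}s to ${slowest.toFixed(1)}s`); // prints "25.6s to 34.1s"
```

Prompt processing time is not included, so real latency will be somewhat higher.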
103
+ ## Use Cases
104
+
105
+ - **Documentation Lookup**: Explain Zig language features
106
+ - **Code Generation**: Generate idiomatic Zig patterns
107
+ - **Error Fixes**: Suggest corrections for compilation errors
108
+ - **MCP Integration**: Power ZigNet's `get_zig_docs` and `suggest_fix` tools
109
+
110
+ ## Limitations
111
+
112
+ - Optimized for Zig 0.13-0.15 (may have reduced accuracy on other versions)
113
+ - Not suitable for real-time syntax parsing (use compiler-based tools)
114
+ - Always validate generated code with `zig build` or `zig test`
115
+
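Validating generated code first requires pulling the Zig snippets out of the model's Markdown response. A minimal sketch of that extraction step, assuming the response marks code with `zig`-tagged fences; the helper name `extractZigBlocks` is hypothetical:

```typescript
// The fence marker is built with repeat() only so that this snippet can
// itself sit inside a Markdown code fence.
const FENCE = "`".repeat(3);

// Hypothetical helper: return the body of every zig-tagged fenced block.
function extractZigBlocks(response: string): string[] {
    const pattern = new RegExp(FENCE + "zig[^\\n]*\\n([\\s\\S]*?)" + FENCE, "g");
    const blocks: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(response)) !== null) {
        blocks.push(match[1] ?? "");
    }
    return blocks;
}

const sampleResponse = [
    "Here is a struct:",
    FENCE + "zig",
    "const Point = struct { x: f32, y: f32 };",
    FENCE,
    "Hope that helps.",
].join("\n");

const zigBlocks = extractZigBlocks(sampleResponse);
console.log(zigBlocks.length, zigBlocks[0]);
```

Each extracted block can then be written to a `.zig` file and checked with `zig test`, as recommended above.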
116
+ ## Example Outputs
117
+
118
+ **Prompt**: "Explain error sets in Zig"
119
+
120
+ **Response**:
121
+ > In Zig, an error set is a tagged union type that allows functions to express failure and return both a value and an error code. By declaring the return type as `error_set_type`, a function communicates which errors it may return, enabling compile-time safety and clarity about how errors are handled.
122
+
123
+ **Prompt**: "Generate a simple Zig struct with methods"
124
+
125
+ **Response**:
126
+ ```zig
127
+ const Point = struct {
128
+ x: f32,
129
+ y: f32,
130
+
131
+ pub fn init(x: f32, y: f32) Point {
132
+ return Point{ .x = x, .y = y };
133
+ }
134
+
135
+ pub fn distance(self: Point, other: Point) f32 {
136
+ const dx = self.x - other.x;
137
+ const dy = self.y - other.y;
138
+ return @sqrt(dx * dx + dy * dy);
139
+ }
140
+ };
141
+ ```
142
+
143
+ ## License
144
+
145
+ Apache 2.0
146
+
147
+ ## Links
148
+
149
+ - **Full Model (LoRA)**: [fulgidus/zignet-qwen2.5-coder-7b](https://huggingface.co/fulgidus/zignet-qwen2.5-coder-7b)
150
+ - **Training Dataset**: [fulgidus/zignet-training-dataset](https://huggingface.co/datasets/fulgidus/zignet-training-dataset)
151
+ - **ZigNet MCP Server**: [github.com/fulgidus/zignet](https://github.com/fulgidus/zignet)
152
+ - **Base Model**: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)