Upload gguf/README.md with huggingface_hub
gguf/README.md
ADDED
# ZigNet Qwen2.5-Coder-7B (GGUF Q4_K_M)

Quantized GGUF version of [ZigNet Qwen2.5-Coder-7B](https://huggingface.co/fulgidus/zignet-qwen2.5-coder-7b) for local inference.

## Model Details

- **Base Model**: Qwen/Qwen2.5-Coder-7B-Instruct
- **Fine-tuning**: LoRA adapters trained on 13,756 Zig examples
- **Quantization**: Q4_K_M (4-bit, mixed K-quants)
- **Size**: 4.4GB (about 71% smaller than the 15GB F16 model)
- **Quality**: ~95-98% retention vs full precision
- **Speed**: 3-4x faster than the F16 model on consumer GPUs

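The size figure can be checked with a line of arithmetic. The sketch below is only an illustration: `reductionPercent` is a hypothetical helper name, and the inputs are the rounded sizes listed above, so the exact file sizes may shift the result slightly.

```typescript
// Relative size reduction, in percent, when going from originalGB to quantizedGB.
function reductionPercent(originalGB: number, quantizedGB: number): number {
  return (1 - quantizedGB / originalGB) * 100;
}

// Rounded figures from the list above: F16 is about 15GB, Q4_K_M is 4.4GB.
const reduction = reductionPercent(15, 4.4);
console.log(`${reduction.toFixed(1)}% smaller`); // prints "70.7% smaller"
```

With these rounded inputs the reduction comes out at roughly 71%.
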
## Quick Start

### With Ollama

```bash
# Download model
huggingface-cli download fulgidus/zignet-qwen2.5-coder-7b \
  gguf/zignet-qwen-7b-q4km.gguf \
  --local-dir ./models

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./models/gguf/zignet-qwen-7b-q4km.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>
"""

SYSTEM """You are ZigNet, an AI assistant specialized in Zig programming language (v0.13-0.15).

Your expertise includes:
- Explaining Zig syntax, features, and idioms
- Understanding comptime, generics, and error handling
- Providing code examples and fixes
- Referencing official Zig documentation"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 4096
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
EOF

# Import to Ollama
ollama create zignet:latest -f Modelfile

# Run
ollama run zignet:latest "Explain comptime in Zig"
```

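Once the model is imported, it can also be queried programmatically. The following is a minimal sketch, assuming Ollama is serving on its default address (`http://localhost:11434`) and a runtime with a global `fetch` (Node 18+). The helper names `buildGenerateRequest` and `askZigNet` are hypothetical and not part of ZigNet or Ollama.

```typescript
// Subset of the JSON body accepted by Ollama's /api/generate endpoint.
interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build a non-streaming request for the model created with `ollama create`.
function buildGenerateRequest(prompt: string, model = "zignet:latest"): GenerateRequest {
  return { model, prompt, stream: false };
}

// Send the prompt to a local Ollama server and return the generated text.
// Assumption: the server listens on the default port 11434.
async function askZigNet(prompt: string, baseUrl = "http://localhost:11434"): Promise<string> {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(prompt)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}

// Usage (requires a running Ollama server):
// askZigNet("Explain comptime in Zig").then(console.log);
```

Setting `stream: false` makes the server return a single JSON object instead of a stream of partial responses, which keeps the example short.
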
### With llama.cpp

```bash
# Download model into ./models
# (without --local-dir the file is stored under a snapshot hash in the Hugging Face cache)
huggingface-cli download fulgidus/zignet-qwen2.5-coder-7b \
  gguf/zignet-qwen-7b-q4km.gguf \
  --local-dir ./models

# Run with llama.cpp
./llama-cli \
  -m ./models/gguf/zignet-qwen-7b-q4km.gguf \
  -p "Explain Zig's error handling system:" \
  -n 512 \
  --temp 0.7 \
  --top-p 0.9 \
  -ngl 35  # Offload layers to GPU
```

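The Ollama Modelfile above wraps each turn in ChatML markers (`<|im_start|>` / `<|im_end|>`). When passing a raw string to `llama-cli -p`, the prompt may or may not be wrapped for you depending on the llama.cpp version and flags, so it can help to build the ChatML string explicitly. The sketch below mirrors that template for a single turn; `toChatML` is a hypothetical helper name.

```typescript
// Render a single-turn ChatML prompt in the same layout as the Modelfile TEMPLATE.
function toChatML(userPrompt: string, system?: string): string {
  let out = "";
  if (system) {
    out += `<|im_start|>system\n${system}<|im_end|>\n`;
  }
  out += `<|im_start|>user\n${userPrompt}<|im_end|>\n`;
  // Leave the assistant turn open so the model writes the reply.
  out += "<|im_start|>assistant\n";
  return out;
}

console.log(toChatML("Explain Zig's error handling system", "You are ZigNet."));
```

The resulting string ends with an open assistant turn, and `<|im_end|>` then acts as the stop marker, matching the `PARAMETER stop` lines in the Modelfile.
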
### With node-llama-cpp (JavaScript/TypeScript)

```typescript
import { getLlama, LlamaChatSession } from "node-llama-cpp";

// node-llama-cpp v3 API: models and contexts are created through a Llama instance
// (v2 used `new LlamaModel(...)` and `new LlamaContext(...)` instead)
const llama = await getLlama();
const model = await llama.loadModel({
  modelPath: "./models/gguf/zignet-qwen-7b-q4km.gguf"
});

const context = await model.createContext();
const session = new LlamaChatSession({
  contextSequence: context.getSequence()
});

const response = await session.prompt(
  "Generate a generic ArrayList in Zig"
);

console.log(response);
```

## Hardware Requirements

| Configuration | VRAM | RAM | Speed |
|--------------|------|-----|-------|
| CPU only | - | 8GB | ~15-20 tokens/s |
| NVIDIA GPU | 6GB | 8GB | ~60-80 tokens/s |
| Apple M1/M2 | - | 8GB | ~40-50 tokens/s |
| RTX 3090 | 24GB | 16GB | ~100+ tokens/s |

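To get a feel for what these throughput figures mean in practice, the sketch below converts them into wall-clock time for a 512-token reply (the `-n 512` used in the llama.cpp example). It is a rough estimate only: it assumes a steady generation rate and ignores prompt processing and model load time. `generationSeconds` is a hypothetical helper name.

```typescript
// Seconds needed to generate `tokens` tokens at a steady `tokensPerSecond`.
function generationSeconds(tokens: number, tokensPerSecond: number): number {
  return tokens / tokensPerSecond;
}

// Speed ranges copied from the table above (tokens/s, low to high).
const speeds: Record<string, [number, number]> = {
  "CPU only": [15, 20],
  "NVIDIA GPU": [60, 80],
  "Apple M1/M2": [40, 50],
};

for (const [config, [slow, fast]] of Object.entries(speeds)) {
  const best = generationSeconds(512, fast).toFixed(1);
  const worst = generationSeconds(512, slow).toFixed(1);
  console.log(`${config}: ${best}-${worst} s for 512 tokens`);
}
// First line printed: "CPU only: 25.6-34.1 s for 512 tokens"
```
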
## Use Cases

- **Documentation Lookup**: Explain Zig language features
- **Code Generation**: Generate idiomatic Zig patterns
- **Error Fixes**: Suggest corrections for compilation errors
- **MCP Integration**: Power ZigNet's `get_zig_docs` and `suggest_fix` tools

## Limitations

- Optimized for Zig 0.13-0.15 (may have reduced accuracy on other versions)
- Not suitable for real-time syntax parsing (use compiler-based tools)
- Always validate generated code with `zig build` or `zig test`

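The last point can be automated. The sketch below writes generated code to a temporary file and runs `zig test` on it, treating a non-zero exit status as a failure. It assumes the `zig` binary is on `PATH`; `checkZigSource` and the injectable `Runner` type are hypothetical names introduced for illustration, not part of ZigNet.

```typescript
import { spawnSync } from "node:child_process";
import { mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface CheckResult {
  ok: boolean;
  stderr: string;
}

// Runs a command and reports its exit status; injectable so it can be faked in tests.
type Runner = (cmd: string, args: string[]) => { status: number | null; stderr: string };

const defaultRunner: Runner = (cmd, args) => {
  const r = spawnSync(cmd, args, { encoding: "utf8" });
  // status is null (and stderr may be null) when the binary could not be started.
  return { status: r.status, stderr: r.stderr ?? "" };
};

// Write `source` to a temporary .zig file and run `zig test` on it.
function checkZigSource(source: string, run: Runner = defaultRunner): CheckResult {
  const dir = mkdtempSync(join(tmpdir(), "zignet-"));
  const file = join(dir, "generated.zig");
  try {
    writeFileSync(file, source);
    const { status, stderr } = run("zig", ["test", file]);
    return { ok: status === 0, stderr };
  } finally {
    // Always clean up the temporary directory.
    rmSync(dir, { recursive: true, force: true });
  }
}
```

Because Zig analyzes declarations lazily, a clean `zig test` run mainly covers code reachable from the file's tests, so generated code without tests deserves a closer look.
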
## Example Outputs

**Prompt**: "Explain error sets in Zig"

**Response**:
> In Zig, an error set is a tagged union type that allows functions to express failure and return both a value and an error code. By declaring the return type as `error_set_type`, a function communicates which errors it may return, enabling compile-time safety and clarity about how errors are handled.

**Prompt**: "Generate a simple Zig struct with methods"

**Response**:
```zig
const Point = struct {
    x: f32,
    y: f32,

    pub fn init(x: f32, y: f32) Point {
        return Point{ .x = x, .y = y };
    }

    pub fn distance(self: Point, other: Point) f32 {
        const dx = self.x - other.x;
        const dy = self.y - other.y;
        return @sqrt(dx * dx + dy * dy);
    }
};
```

## License

Apache 2.0

## Links

- **Full Model (LoRA)**: [fulgidus/zignet-qwen2.5-coder-7b](https://huggingface.co/fulgidus/zignet-qwen2.5-coder-7b)
- **Training Dataset**: [fulgidus/zignet-training-dataset](https://huggingface.co/datasets/fulgidus/zignet-training-dataset)
- **ZigNet MCP Server**: [github.com/fulgidus/zignet](https://github.com/fulgidus/zignet)
- **Base Model**: [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)