Distribution Strategies, VRAM Tiering, and Deployment Protocols
Placing models inside Content/ for Pak packaging.
mmap efficiency.
Storing models in Saved/Models/ or custom external paths.
To keep behavior stable across hardware tiers, define a VRAM watermark (a locked budget) for each tier:
| Hardware Tier | Recommended Model | Locked VRAM Budget |
|---|---|---|
| Base (RTX 3060) | Gemma-2B (4-bit) | 1.8 GB |
| Standard (RTX 4060) | Gemma-4-E2B (4-bit) | 2.5 GB |
| Ultra (RTX 4080+) | Llama-3-8B (8-bit) | 6.5 GB |
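
The table above can be turned into a small lookup at startup. The following is a minimal sketch, not a definitive implementation: the names `ModelTier`, `kTiers`, and `SelectTier` are hypothetical, the "largest budget that fits" rule and the `reservedMb` headroom parameter are assumptions not stated in the table, and budgets are converted with 1 GB = 1000 MB for simplicity.

```cpp
#include <array>
#include <optional>
#include <string_view>

// One row of the tier table. Budgets are stored in megabytes
// (assumption: 1 GB = 1000 MB, so 1.8 GB becomes 1800 MB).
struct ModelTier {
    std::string_view tier;
    std::string_view model;
    int budgetMb;
};

// Rows copied from the table, ordered from smallest to largest budget.
constexpr std::array<ModelTier, 3> kTiers{{
    {"Base", "Gemma-2B (4-bit)", 1800},
    {"Standard", "Gemma-4-E2B (4-bit)", 2500},
    {"Ultra", "Llama-3-8B (8-bit)", 6500},
}};

// Hypothetical selection rule: choose the largest tier whose locked budget
// fits into the free VRAM that remains after reserving `reservedMb` for
// rendering. Returns std::nullopt when even the Base tier does not fit.
inline std::optional<ModelTier> SelectTier(int freeVramMb, int reservedMb) {
    const int usableMb = freeVramMb - reservedMb;
    std::optional<ModelTier> best;
    for (const ModelTier& t : kTiers) {
        if (t.budgetMb <= usableMb) {
            best = t;  // later rows have larger budgets, so keep the last fit
        }
    }
    return best;
}
```

How free VRAM is measured is outside this sketch; the caller is assumed to supply it from whatever the platform or engine exposes. Returning an empty `std::optional` lets the caller disable the model or fall back to another path when no tier fits.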