The really neat thing about GGUF is that it's just one file. Compare this to a typical safetensors repo on Hugging Face, where there's a pile of necessary JSON files scattered around - or to a typical Ollama model, which is an OCI image with layers of JSON, Go templates, etc. inside. The contents are roughly the same, but GGUF is more ergonomic because it keeps all of this in a single file. ...
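
To make the "one file" point concrete, here's a rough sketch of peeking at the fixed-size header at the start of a GGUF file. It assumes a little-endian file of GGUF version 2 or later, where the two counts are 64-bit; `read_gguf_header` is just a name made up for this example, not part of any library.

```python
import struct


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes little-endian GGUF v2+: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    # "<IQQ" = little-endian uint32 followed by two uint64 values (20 bytes)
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

The metadata key/value pairs (the stuff that would otherwise live in separate JSON files) and the tensor data follow this header in the same file, so a single `open()` is all it takes to get at them.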