This workflow converts machine learning models downloaded from Hugging Face into the GGUF format that Ollama and llama.cpp use for local inference. It reads model weights, maps tensor names, extracts ...
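To make the tensor-name mapping step more concrete, here is a minimal sketch in Python. It assumes a Llama-style checkpoint and the GGUF tensor names llama.cpp conventionally uses for that architecture. The function `map_tensor_name` and the two lookup tables are hypothetical names chosen for illustration, not this project's actual code, and other architectures use different source names.

```python
import re

# Assumed Llama-style mapping from Hugging Face module names to GGUF names.
# Illustrative only: real converters cover many more architectures and tensors.
_TOP_LEVEL = {
    "model.embed_tokens": "token_embd",
    "model.norm": "output_norm",
    "lm_head": "output",
}

_PER_LAYER = {
    "self_attn.q_proj": "attn_q",
    "self_attn.k_proj": "attn_k",
    "self_attn.v_proj": "attn_v",
    "self_attn.o_proj": "attn_output",
    "mlp.gate_proj": "ffn_gate",
    "mlp.up_proj": "ffn_up",
    "mlp.down_proj": "ffn_down",
    "input_layernorm": "attn_norm",
    "post_attention_layernorm": "ffn_norm",
}

# Matches "model.layers.<index>.<module path>"
_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)$")


def map_tensor_name(hf_name: str) -> str:
    """Translate one Hugging Face tensor name into its GGUF counterpart."""
    # Split off the trailing ".weight" or ".bias" so only the module path is mapped.
    base, _, suffix = hf_name.rpartition(".")
    if suffix not in ("weight", "bias"):
        raise ValueError(f"unexpected tensor suffix: {hf_name}")

    if base in _TOP_LEVEL:
        return f"{_TOP_LEVEL[base]}.{suffix}"

    match = _LAYER_RE.match(base)
    if match and match.group(2) in _PER_LAYER:
        layer_index, module = match.groups()
        return f"blk.{layer_index}.{_PER_LAYER[module]}.{suffix}"

    # Fail loudly rather than silently dropping a tensor.
    raise ValueError(f"unmapped tensor: {hf_name}")


if __name__ == "__main__":
    for name in ("model.embed_tokens.weight", "model.layers.3.self_attn.q_proj.weight"):
        print(name, "->", map_tensor_name(name))
```

Raising on an unknown name is a deliberate choice in this sketch: a converter that skips tensors it does not recognize can produce a file that loads but gives wrong output.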
This project is a refactored reorganization of code that originated in llama.cpp by Georgi Gerganov and contributors. All model conversion logic, GGUF format handling, and tensor transformations are from ...