Build and run llama.cpp locally on Fedora 42 with ROCm
There are already multiple tutorials online on how to run an LLM on Linux, but I found nothing really helpful for Fedora, so I am documenting my process here.
I got my information from two sources: the tutorial Building llama.cpp with ROCm on Fedora 41 and this GitHub script.
My Hardware
I am doing this on a Framework Laptop with the Ryzen 7 7840U and consequently its integrated Radeon 780M graphics. I have 16GB of memory, which I will soon be upgrading to 32GB.
Installation
- I installed inside the folder
~/Documents/llama-build/
- To start, I needed to install the standard stuff for building llama.cpp
sudo dnf install rocminfo rocm- rocm-clinfo make gcc cmake 'hipblas-*'
- I also needed to install libcurl-devel and clang, as the script failed without them
- Then you can clone the llama.cpp GitHub repo:
git clone https://github.com/ggml-org/llama.cpp.git
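- The build itself is handled by the linked script. If you would rather do it by hand, here is a minimal sketch based on llama.cpp's ROCm/HIP build options (GGML_HIP enables the HIP backend; gfx1103 is my assumption for the 780M iGPU target, so adjust it for your GPU):
cd llama.cpp
cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1103 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
The binaries (llama-cli, llama-server, and so on) then land in build/bin/, which is the path used in the commands below.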
Troubleshooting
Just some additional info:
When running the script, I got the following error:
-- Could NOT find CURL (missing: CURL_LIBRARY CURL_INCLUDE_DIR)
CMake Error at common/CMakeLists.txt:85 (message):
Could NOT find CURL. Hint: to disable this feature, set -DLLAMA_CURL=OFF
This was fixed by installing libcurl-devel
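On Fedora that just means installing the devel package and re-running CMake:
sudo dnf install libcurl-devel
Alternatively, as the error message itself suggests, passing -DLLAMA_CURL=OFF to CMake disables the feature instead.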
Downloading a model
- To download the model, you can use the Hugging Face Python tooling; Hugging Face is the best place to get these models (a sketch of the download command follows this list).
- It's important to use the .gguf model type, as this is what llama.cpp can run with.
- Which model to run depends on your hardware's performance, of course. I ran
Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL
- The full command to run the downloaded model was:
./build/bin/llama-cli --model ~/git/hf.co/unsloth/Qwen2.5-VL-7B-Instruct-GGUF/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf --main-gpu 0 --gpu-layers 40
- To check the parameters, you can run
./build/bin/llama-cli --help
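The download step itself isn't shown above; a minimal sketch using the huggingface_hub CLI (the repo and file names are taken from the model path above, and --local-dir is just an example location) would be:
pip install -U 'huggingface_hub[cli]'
huggingface-cli download unsloth/Qwen2.5-VL-7B-Instruct-GGUF Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf --local-dir ~/models
Once the .gguf file is on disk, point the --model argument of llama-cli at it as shown above.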
Forget everything and just use LM Studio
I thought that LM Studio was something like a UI on top of llama.cpp. But apparently it's not: you can forget all of the shenanigans above and run it directly. Follow this setup guide and get it up and running in about two minutes.