Build and run llama.cpp locally on Fedora 42 with ROCm
There are already multiple tutorials online on how to run an LLM on Linux, but I found nothing really helpful for Fedora, so I am documenting my process here.
I got my information from two sources: the tutorial Building llama.cpp with ROCm on Fedora 41 and this GitHub script.
My Hardware
I am doing this on a Framework Laptop with the Ryzen 7 7840U and consequently its integrated Radeon 780M graphics. I have 16GB of memory, which I will soon be upgrading to 32GB.
Installation
- I installed inside the folder
~/Documents/llama-build/
- To start, I needed to install the standard stuff for building llama.cpp
sudo dnf install rocminfo rocm- rocm-clinfo make gcc cmake 'hipblas-*'
- I also needed to install libcurl-devel and clang, as the script failed without them
- Then you can clone the llama.cpp GitHub repo:
git clone https://github.com/ggml-org/llama.cpp.git
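- The build itself is handled by the linked script. If you would rather do it by hand, here is a minimal sketch based on llama.cpp's ROCm/HIP build options (GGML_HIP enables the HIP backend; gfx1103 is my assumption for the 780M iGPU target, so adjust it for your GPU):
cd llama.cpp
cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1103 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j
The binaries (llama-cli, llama-server, and so on) then land in build/bin/, which is the path used in the commands below.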
Troubleshooting
Just some additional info:
When running the script, I got the following error:
-- Could NOT find CURL (missing: CURL_LIBRARY CURL_INCLUDE_DIR)
CMake Error at common/CMakeLists.txt:85 (message):
Could NOT find CURL. Hint: to disable this feature, set -DLLAMA_CURL=OFF
This was fixed by installing libcurl-devel
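On Fedora that just means installing the devel package and re-running CMake:
sudo dnf install libcurl-devel
Alternatively, as the error message itself suggests, passing -DLLAMA_CURL=OFF to CMake disables the feature instead.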
Downloading a model
- To download the model, you can use the Hugging Face Python tooling; Hugging Face is the best place to get these models (a sketch of the download command follows this list).
- It's important to use the .gguf model type, as this is what llama.cpp can run with.
- Which model to run depends on your hardware's performance, of course. I ran
Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL
- The full command to run the downloaded model was:
./build/bin/llama-cli --model ~/git/hf.co/unsloth/Qwen2.5-VL-7B-Instruct-GGUF/Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf --main-gpu 0 --gpu-layers 40
- To check the parameters, you can run
./build/bin/llama-cli --help
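The download step itself isn't shown above; a minimal sketch using the huggingface_hub CLI (the repo and file names are taken from the model path above, and --local-dir is just an example location) would be:
pip install -U 'huggingface_hub[cli]'
huggingface-cli download unsloth/Qwen2.5-VL-7B-Instruct-GGUF Qwen2.5-VL-7B-Instruct-UD-Q4_K_XL.gguf --local-dir ~/models
Once the .gguf file is on disk, point the --model argument of llama-cli at it as shown above.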
Forget everything and just use LM Studio
I thought that LM Studio was something like a UI on top of llama.cpp. But apparently it's not: you can forget all of the shenanigans above and run it directly. Follow this setup guide and get it up and running in about two minutes.