LLM: ubuntu 24.04 ollama + open-webui gpu full docker

From OnnoWiki

Reference: https://projectable.me/ubuntu-24-04-nvidia-drivers-ollama/

Check the GPU

sudo su
lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation AD107M [GeForce RTX 4060 Max-Q / Mobile] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 22be (rev a1)


Remove all NVIDIA drivers

sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt clean
sudo apt autoremove

Update

sudo add-apt-repository ppa:graphics-drivers/ppa --yes  
sudo apt-get update

update-pciids

Install the driver

sudo apt-get install nvidia-driver-580 -y
sudo apt-get reinstall linux-headers-$(uname -r)
sudo update-initramfs -u
sudo dkms status  
sudo reboot

Check

nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.09              Driver Version: 580.82.09      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   55C    P0            590W /  115W |       2MiB /   8188MiB |     12%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
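For scripting, the same information can be read in machine-readable form; a minimal sketch (the query fields are standard nvidia-smi options, and the fallback branch covers machines where the driver is not installed yet):

```shell
# Query GPU name, driver version and total VRAM as CSV;
# fall back gracefully when nvidia-smi is not present.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_info=$(nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader)
else
    gpu_info="nvidia-smi not found (driver not installed?)"
fi
echo "$gpu_info"
```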


Install NVIDIA CUDA

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb  
sudo dpkg -i cuda-keyring_1.1-1_all.deb

sudo apt-get update

sudo apt-get install cuda-toolkit -y  
sudo apt-get install nvidia-gds -y
sudo dkms status
sudo reboot
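After the reboot, nvcc can be used to confirm the toolkit landed; a small sketch (assuming the usual /usr/local/cuda/bin install location of the NVIDIA packages, which is not on PATH by default):

```shell
# nvcc normally lives under /usr/local/cuda/bin after a cuda-toolkit install.
export PATH=/usr/local/cuda/bin:$PATH
if command -v nvcc >/dev/null 2>&1; then
    nvcc_status=$(nvcc --version | tail -n 1)
else
    nvcc_status="nvcc not found - check the cuda-toolkit install"
fi
echo "$nvcc_status"
```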


Install docker

sudo apt-get install curl apt-transport-https ca-certificates software-properties-common -y
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg  
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update  
sudo apt-get install docker-ce -y 
sudo usermod -aG docker $USER   # log out and back in (or run: newgrp docker) for this to take effect

Install Nvidia Container Toolkit

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
 
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
   sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
   sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update  
sudo apt-get install nvidia-container-toolkit -y
sudo systemctl restart docker
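The sed in the step above only rewrites each deb line to reference the keyring; applied to a sample line (the input shape is an assumption based on how that list file usually looks), it does this:

```shell
# Sample repository line and the same sed rewrite used above.
# The single quotes keep $(ARCH) literal, as it appears in the real list file.
line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'
rewritten=$(echo "$line" | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g')
echo "$rewritten"
```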

Install docker-compose

sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose


Create docker-compose.yml



name: ollama-openwebui

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    # NVIDIA GPU access (the gpus attribute needs a recent Docker Compose)
    gpus: all
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
      # Allow access from Open WebUI
      - OLLAMA_HOST=0.0.0.0
      - OLLAMA_KEEP_ALIVE=24h
      # (optional) CORS origins, if web access from another domain is needed
      # - OLLAMA_ORIGINS=*
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"   # Ollama API
    restart: unless-stopped
    networks:
      - llmnet 

  openwebui:
    image: ghcr.io/open-webui/open-webui:latest
    container_name: openwebui
    depends_on:
      - ollama
    environment:
      # Point Open WebUI at the ollama service
      - OLLAMA_BASE_URL=http://ollama:11434
      # (optional) enable auth for the first admin
      # - WEBUI_AUTH=true
      # - ADMIN_EMAIL=you@example.com
    volumes:
      - openwebui:/app/backend/data
    ports:
      - "3000:8080"     # Open WebUI -> http://localhost:3000
    restart: unless-stopped
    networks:
      - llmnet

volumes:
  ollama:
  openwebui:

networks:
  llmnet:
    driver: bridge
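Optionally, the ollama service can expose a healthcheck so Open WebUI only starts once the API actually answers; a sketch to merge into the file above (the `ollama ps` subcommand exists in current images, but treat the exact probe as an assumption):

```yaml
  ollama:
    # ... as above, plus:
    healthcheck:
      test: ["CMD", "ollama", "ps"]
      interval: 10s
      timeout: 5s
      retries: 5

  openwebui:
    # ... as above, but wait for a healthy ollama:
    depends_on:
      ollama:
        condition: service_healthy
```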

Start the stack

sudo docker-compose up -d
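Once the stack is up, a quick probe against the published Ollama port confirms it is serving; a sketch (`/api/version` is part of the Ollama HTTP API, and the guard keeps the script usable on machines where the stack is not running):

```shell
# Probe the Ollama API published on the host; degrade gracefully if it is down.
url="http://localhost:11434/api/version"
if command -v curl >/dev/null 2>&1 && curl -fsS "$url" >/dev/null 2>&1; then
    status="ollama API reachable at $url"
else
    status="ollama API not reachable at $url (is the stack up?)"
fi
echo "$status"
```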


Pull models (one time only)

You can pull models via the CLI or from Open WebUI (the Models menu).

Example via the CLI:

# example from the Llama 3.1 family (8B)
docker exec -it ollama ollama pull llama3.1

# small & fast model for testing
docker exec -it ollama ollama pull qwen2.5:3b

# quick generation example
docker exec -it ollama ollama run qwen2.5:3b


In Open WebUI, pick that model from the dropdown, then try a prompt.

Update / Maintenance

# update images
docker-compose pull
docker-compose up -d
# check the logs if something goes wrong
docker logs -f ollama
docker logs -f openwebui
# stop & start
docker-compose down
docker-compose up -d

Hardening (optional)

Run behind a reverse proxy (Caddy/Traefik/Nginx) with TLS if it is publicly accessible.

Restrict the binding to localhost only and use an SSH tunnel:

change ports: "3000:8080" → "127.0.0.1:3000:8080"

access via ssh -L 3000:localhost:3000 user@server.

Enable auth in Open WebUI (WEBUI_AUTH=true).
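The localhost-only binding mentioned above looks like this in the compose file:

```yaml
  openwebui:
    # ... as above, but published only on the loopback interface:
    ports:
      - "127.0.0.1:3000:8080"
```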


Quick troubleshooting

Open WebUI cannot connect to Ollama: make sure

OLLAMA_BASE_URL=http://ollama:11434

and that both containers are on the same Compose network (llmnet).

GPU not detected: check nvidia-smi on the host; then

docker run --rm --gpus all nvidia/cuda:12.5.0-base-ubuntu24.04 nvidia-smi

If that fails, repeat the NVIDIA Container Toolkit installation and systemctl restart docker.

Slow / running out of VRAM: try a smaller model (e.g. qwen2.5:3b, llama3.2:3b), or use a quantized variant (:q4_0, etc., where available).


Download Models

docker exec -it ollama /bin/bash

ollama pull deepseek-r1:7b
ollama pull gemma3:4b
ollama pull qwen3:8b
ollama pull llama3.1:8b
ollama pull llama3.2:3b
ollama pull mistral:7b
ollama pull llava:7b
ollama pull qwen2.5-coder:7b
ollama pull olmo2:7b
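The pulls above can also be scripted as a loop from the host; a sketch (the docker exec call is commented out here so the loop can be dry-run without the stack running):

```shell
# Model tags from the list above; trim to what fits your VRAM.
MODELS="deepseek-r1:7b gemma3:4b qwen3:8b llama3.1:8b llama3.2:3b mistral:7b llava:7b qwen2.5-coder:7b olmo2:7b"

for m in $MODELS; do
    echo "pulling $m"
    # On the host, with the stack up, uncomment:
    # docker exec ollama ollama pull "$m"
done
```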

