Commit c16cfc3

chriscrosstalkclaude authored and committed

fix(GPU): detect NVIDIA GPUs via Docker API instead of lspci
The previous lspci-based GPU detection fails inside Docker containers because lspci isn't available, causing Ollama to always run CPU-only even when a GPU and the NVIDIA Container Toolkit are present on the host.

Replace it with a Docker API runtime check (docker.info() -> Runtimes) as the primary detection method. This works from inside any container via the mounted Docker socket and confirms both GPU presence and toolkit installation. Keep lspci as a fallback for host-based installs and AMD detection.

Also add Docker-based GPU detection to the benchmark hardware info: exec nvidia-smi inside the Ollama container to get the actual GPU model name instead of showing "Not detected".

Tested on nomad3 (Intel Core Ultra 9 285HX + RTX 5060): AI performance went from 12.7 tok/s (CPU) to 281.4 tok/s (GPU), a 22x improvement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
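The primary detection path described in the commit message reduces to checking the `Runtimes` map returned by the Docker daemon's info endpoint. A minimal sketch of that check, with the payload shape modelled after what a dockerode-style `docker.info()` returns (the helper name `hasNvidiaRuntime` and the sample payloads are illustrative, not part of the commit):

```typescript
// Shape of the relevant slice of `docker.info()`: the Engine API's /info
// response maps runtime names to their configuration under `Runtimes`.
interface DockerInfoSlice {
  Runtimes?: Record<string, { path?: string }>
}

// True when the daemon has the NVIDIA Container Toolkit runtime registered,
// i.e. containers can be started with GPU access.
function hasNvidiaRuntime(info: DockerInfoSlice): boolean {
  return 'nvidia' in (info.Runtimes ?? {})
}

// Example payloads: a GPU-enabled host vs. a default install.
const gpuHost: DockerInfoSlice = {
  Runtimes: {
    nvidia: { path: 'nvidia-container-runtime' },
    runc: { path: 'runc' },
  },
}
const cpuHost: DockerInfoSlice = { Runtimes: { runc: { path: 'runc' } } }

console.log(hasNvidiaRuntime(gpuHost)) // true
console.log(hasNvidiaRuntime(cpuHost)) // false
```

Because the check goes through the Docker socket rather than host tools like lspci, it behaves identically whether the caller runs on the host or inside a container.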
1 parent 812d13c commit c16cfc3

File tree

4 files changed: +117 −18 lines

admin/app/services/benchmark_service.ts

Lines changed: 54 additions & 0 deletions

```diff
@@ -270,6 +270,60 @@ export class BenchmarkService {
       gpuModel = discreteGpu?.model || graphics.controllers[0]?.model || null
     }
 
+    // Fallback: Check Docker for nvidia runtime and query GPU model via nvidia-smi
+    if (!gpuModel) {
+      try {
+        const dockerInfo = await this.dockerService.docker.info()
+        const runtimes = dockerInfo.Runtimes || {}
+        if ('nvidia' in runtimes) {
+          logger.info('[BenchmarkService] NVIDIA container runtime detected, querying GPU model via nvidia-smi')
+
+          // Try to get GPU model name from the running Ollama container
+          try {
+            const containers = await this.dockerService.docker.listContainers({ all: false })
+            const ollamaContainer = containers.find((c) =>
+              c.Names.includes(`/${SERVICE_NAMES.OLLAMA}`)
+            )
+
+            if (ollamaContainer) {
+              const container = this.dockerService.docker.getContainer(ollamaContainer.Id)
+              const exec = await container.exec({
+                Cmd: ['nvidia-smi', '--query-gpu=name', '--format=csv,noheader'],
+                AttachStdout: true,
+                AttachStderr: true,
+                Tty: true,
+              })
+
+              const stream = await exec.start({ Tty: true })
+              const output = await new Promise<string>((resolve) => {
+                let data = ''
+                const timeout = setTimeout(() => resolve(data), 5000)
+                stream.on('data', (chunk: Buffer) => { data += chunk.toString() })
+                stream.on('end', () => { clearTimeout(timeout); resolve(data) })
+              })
+
+              const gpuName = output.replace(/[\x00-\x08]/g, '').trim()
+              if (gpuName && !gpuName.toLowerCase().includes('error') && !gpuName.toLowerCase().includes('not found')) {
+                gpuModel = gpuName
+                logger.info(`[BenchmarkService] GPU detected via nvidia-smi: ${gpuModel}`)
+              } else {
+                gpuModel = 'NVIDIA GPU (model unknown)'
+                logger.info('[BenchmarkService] NVIDIA runtime present but nvidia-smi query failed, using generic name')
+              }
+            } else {
+              gpuModel = 'NVIDIA GPU (model unknown)'
+              logger.info('[BenchmarkService] NVIDIA runtime present but Ollama container not running')
+            }
+          } catch (execError) {
+            gpuModel = 'NVIDIA GPU (model unknown)'
+            logger.warn(`[BenchmarkService] nvidia-smi exec failed: ${execError.message}`)
+          }
+        }
+      } catch (dockerError) {
+        logger.warn(`[BenchmarkService] Could not query Docker info for GPU detection: ${dockerError.message}`)
+      }
+    }
+
     // Fallback: Extract integrated GPU from CPU model name
     if (!gpuModel) {
       const cpuFullName = `${cpu.manufacturer} ${cpu.brand}`
```
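Because the exec above allocates a TTY, the raw nvidia-smi output can carry low control bytes and a trailing `\r\n`, which is why the diff strips `\x00`-`\x08` and trims before trusting the string. That sanitisation step can be isolated as a pure function (the name `parseGpuName` is illustrative, not from the diff):

```typescript
// Sanitize raw TTY output from `nvidia-smi --query-gpu=name` the same way
// the benchmark code does: strip low control bytes, trim whitespace, and
// reject obvious error text. Returns null when no usable name was produced.
function parseGpuName(raw: string): string | null {
  const name = raw.replace(/[\x00-\x08]/g, '').trim()
  if (!name) return null
  const lower = name.toLowerCase()
  if (lower.includes('error') || lower.includes('not found')) return null
  return name
}

console.log(parseGpuName('NVIDIA GeForce RTX 5060\r\n')) // "NVIDIA GeForce RTX 5060"
console.log(parseGpuName('OCI runtime exec failed: nvidia-smi: not found')) // null
```

Keeping the parsing pure makes the failure branch (fall back to the generic `'NVIDIA GPU (model unknown)'` label) easy to exercise without a live container.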

admin/app/services/docker_service.ts

Lines changed: 38 additions & 17 deletions

```diff
@@ -454,13 +454,13 @@ export class DockerService {
     let gpuHostConfig = containerConfig?.HostConfig || {}
 
     if (service.service_name === SERVICE_NAMES.OLLAMA) {
-      const gpuType = await this._detectGPUType()
+      const gpuResult = await this._detectGPUType()
 
-      if (gpuType === 'nvidia') {
+      if (gpuResult.type === 'nvidia') {
        this._broadcast(
          service.service_name,
          'gpu-config',
-          `NVIDIA GPU detected. Configuring container with GPU support...`
+          `NVIDIA container runtime detected. Configuring container with GPU support...`
        )
 
        // Add GPU support for NVIDIA
@@ -474,7 +474,7 @@
            },
          ],
        }
-      } else if (gpuType === 'amd') {
+      } else if (gpuResult.type === 'amd') {
        // this._broadcast(
        //   service.service_name,
        //   'gpu-config',
@@ -503,6 +503,12 @@
        //   `[DockerService] Configured ${amdDevices.length} AMD GPU devices for Ollama`
        //   )
        // }
+      } else if (gpuResult.toolkitMissing) {
+        this._broadcast(
+          service.service_name,
+          'gpu-config',
+          `NVIDIA GPU detected but NVIDIA Container Toolkit is not installed. Using CPU-only configuration. Install the toolkit and reinstall AI Assistant for GPU acceleration: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html`
+        )
      } else {
        this._broadcast(
          service.service_name,
@@ -691,44 +697,59 @@
  }
 
  /**
-   * Detect GPU type (NVIDIA or AMD) on the system.
-   * Returns 'nvidia', 'amd', or 'none'.
+   * Detect GPU type and toolkit availability.
+   * Primary: Check Docker runtimes via docker.info() (works from inside containers).
+   * Fallback: lspci for host-based installs and AMD detection.
   */
-  private async _detectGPUType(): Promise<'nvidia' | 'amd' | 'none'> {
+  private async _detectGPUType(): Promise<{ type: 'nvidia' | 'amd' | 'none'; toolkitMissing?: boolean }> {
    try {
+      // Primary: Check Docker daemon for nvidia runtime (works from inside containers)
+      try {
+        const dockerInfo = await this.docker.info()
+        const runtimes = dockerInfo.Runtimes || {}
+        if ('nvidia' in runtimes) {
+          logger.info('[DockerService] NVIDIA container runtime detected via Docker API')
+          return { type: 'nvidia' }
+        }
+      } catch (error) {
+        logger.warn(`[DockerService] Could not query Docker info for GPU runtimes: ${error.message}`)
+      }
+
+      // Fallback: lspci for host-based installs (not available inside Docker)
      const execAsync = promisify(exec)
 
-      // Check for NVIDIA GPU
+      // Check for NVIDIA GPU via lspci
      try {
        const { stdout: nvidiaCheck } = await execAsync(
          'lspci 2>/dev/null | grep -i nvidia || true'
        )
        if (nvidiaCheck.trim()) {
-          logger.info('[DockerService] NVIDIA GPU detected')
-          return 'nvidia'
+          // GPU hardware found but no nvidia runtime — toolkit not installed
+          logger.warn('[DockerService] NVIDIA GPU detected via lspci but NVIDIA Container Toolkit is not installed')
+          return { type: 'none', toolkitMissing: true }
        }
      } catch (error) {
-        // Continue to AMD check
+        // lspci not available (likely inside Docker container), continue
      }
 
-      // Check for AMD GPU
+      // Check for AMD GPU via lspci
      try {
        const { stdout: amdCheck } = await execAsync(
          'lspci 2>/dev/null | grep -iE "amd|radeon" || true'
        )
        if (amdCheck.trim()) {
-          logger.info('[DockerService] AMD GPU detected')
-          return 'amd'
+          logger.info('[DockerService] AMD GPU detected via lspci')
+          return { type: 'amd' }
        }
      } catch (error) {
-        // No GPU detected
+        // lspci not available, continue
      }
 
      logger.info('[DockerService] No GPU detected')
-      return 'none'
+      return { type: 'none' }
    } catch (error) {
      logger.warn(`[DockerService] Error detecting GPU type: ${error.message}`)
-      return 'none'
+      return { type: 'none' }
    }
  }
```
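The widened `_detectGPUType()` return value drives four distinct user-facing branches in the container-creation path: NVIDIA runtime present, AMD hardware, NVIDIA hardware without the toolkit, and no GPU at all. A sketch of that branch selection as a pure function (the name `pickGpuBranch` and the string tags are illustrative; the service broadcasts full messages instead):

```typescript
// Mirror of the detection result shape introduced in docker_service.ts.
type GpuDetection = { type: 'nvidia' | 'amd' | 'none'; toolkitMissing?: boolean }

// Map a detection result to the configuration branch taken when creating
// the Ollama container. Order matters: toolkitMissing only applies when
// no usable runtime was found.
function pickGpuBranch(result: GpuDetection): 'nvidia' | 'amd' | 'toolkit-missing' | 'cpu-only' {
  if (result.type === 'nvidia') return 'nvidia'
  if (result.type === 'amd') return 'amd'
  if (result.toolkitMissing) return 'toolkit-missing'
  return 'cpu-only'
}

console.log(pickGpuBranch({ type: 'none', toolkitMissing: true })) // "toolkit-missing"
console.log(pickGpuBranch({ type: 'none' })) // "cpu-only"
```

The `toolkit-missing` branch is what lets the installer tell the user their GPU exists but is unreachable, rather than silently falling back to CPU.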
734755

admin/docs/faq.md

Lines changed: 23 additions & 1 deletion

```diff
@@ -110,10 +110,32 @@ The Maps feature requires downloaded map data. If you see a blank area:
 ### AI responses are slow
 
 Local AI requires significant computing power. To improve speed:
+- **Add a GPU** — An NVIDIA GPU with the NVIDIA Container Toolkit can improve AI speed by 10-20x or more
 - Close other applications on the server
 - Ensure adequate cooling (overheating causes throttling)
 - Consider using a smaller/faster AI model if available
-- Add a GPU if your hardware supports it (NVIDIA or AMD)
+
+### How do I enable GPU acceleration for AI?
+
+N.O.M.A.D. automatically detects NVIDIA GPUs when the NVIDIA Container Toolkit is installed on the host system. To set up GPU acceleration:
+
+1. **Install an NVIDIA GPU** in your server (if not already present)
+2. **Install the NVIDIA Container Toolkit** on the host — follow the [official installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
+3. **Reinstall the AI Assistant** — Go to [Apps](/settings/apps), find AI Assistant, and click **Force Reinstall**
+
+N.O.M.A.D. will detect the GPU during installation and configure the AI to use it automatically. You'll see "NVIDIA container runtime detected" in the installation progress.
+
+**Tip:** Run a [System Benchmark](/settings/benchmark) before and after to see the difference. GPU-accelerated systems typically see 100+ tokens per second vs 10-15 on CPU only.
+
+### I added/changed my GPU but AI is still slow
+
+When you add or swap a GPU, N.O.M.A.D. needs to reconfigure the AI container to use it:
+
+1. Make sure the **NVIDIA Container Toolkit** is installed on the host
+2. Go to **[Apps](/settings/apps)**
+3. Find the **AI Assistant** and click **Force Reinstall**
+
+Force Reinstall recreates the AI container with GPU support enabled. Without this step, the AI continues to run on CPU only.
 
 ### AI Chat not available
```
admin/docs/getting-started.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -84,6 +84,8 @@ N.O.M.A.D. includes a built-in AI chat interface powered by Ollama. It runs enti
 
 **Note:** The AI Assistant must be installed first. Enable it during Easy Setup or install it from the [Apps](/settings/apps) page.
 
+**GPU Acceleration:** If your server has an NVIDIA GPU with the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) installed, N.O.M.A.D. will automatically use it for AI — dramatically faster responses (10-20x improvement). If you add a GPU later, go to [Apps](/settings/apps) and **Force Reinstall** the AI Assistant to enable it.
+
 ---
 
 ### Knowledge Base — Document-Aware AI
```
