Commit fe08fc0

chriscrosstalk and claude authored and committed
fix(GPU): persist GPU type to KV store for reliable passthrough
GPU detection results were only applied at container creation time and never persisted. If live detection failed transiently (Docker daemon hiccup, runtime temporarily unavailable), Ollama would silently fall back to CPU-only mode with no way to recover short of force-reinstall.

Now _detectGPUType() persists successful detections to the KV store (gpu.type = 'nvidia' | 'amd') and uses the saved value as a fallback when live detection returns nothing. This ensures GPU config survives across container recreations regardless of transient detection failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 418f82f commit fe08fc0

2 files changed: +25 -0 lines changed
admin/app/services/docker_service.ts

Lines changed: 24 additions & 0 deletions
@@ -691,6 +691,7 @@ export class DockerService {
         const runtimes = dockerInfo.Runtimes || {}
         if ('nvidia' in runtimes) {
           logger.info('[DockerService] NVIDIA container runtime detected via Docker API')
+          await this._persistGPUType('nvidia')
           return { type: 'nvidia' }
         }
       } catch (error) {
@@ -722,12 +723,26 @@ export class DockerService {
         )
         if (amdCheck.trim()) {
           logger.info('[DockerService] AMD GPU detected via lspci')
+          await this._persistGPUType('amd')
           return { type: 'amd' }
         }
       } catch (error) {
         // lspci not available, continue
       }
 
+      // Last resort: check if we previously detected a GPU and it's likely still present.
+      // This handles cases where live detection fails transiently (e.g., Docker daemon
+      // hiccup, runtime temporarily unavailable) but the hardware hasn't changed.
+      try {
+        const savedType = await KVStore.getValue('gpu.type')
+        if (savedType === 'nvidia' || savedType === 'amd') {
+          logger.info(`[DockerService] No GPU detected live, but KV store has '${savedType}' from previous detection. Using saved value.`)
+          return { type: savedType as 'nvidia' | 'amd' }
+        }
+      } catch {
+        // KV store not available, continue
+      }
+
       logger.info('[DockerService] No GPU detected')
       return { type: 'none' }
     } catch (error) {
@@ -736,6 +751,15 @@ export class DockerService {
     }
   }
 
+  private async _persistGPUType(type: 'nvidia' | 'amd'): Promise<void> {
+    try {
+      await KVStore.setValue('gpu.type', type)
+      logger.info(`[DockerService] Persisted GPU type '${type}' to KV store`)
+    } catch (error) {
+      logger.warn(`[DockerService] Failed to persist GPU type: ${error.message}`)
+    }
+  }
+
   /**
    * Discover AMD GPU DRI devices dynamically.
    * Returns an array of device configurations for Docker.
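
Reviewer note: the persist-then-fall-back flow in the hunks above can be exercised in isolation with a sketch like the one below. It is not the project's code. `InMemoryKVStore`, `detectGPUType`, `isGPUType`, and the `liveDetect` callback are hypothetical stand-ins for `KVStore`, `_detectGPUType()`, and the Docker/lspci probes, which this diff does not show in full.

```typescript
type GPUType = 'nvidia' | 'amd'
type GPUResult = { type: GPUType | 'none' }

// Hypothetical in-memory stand-in for the project's KVStore (assumption:
// the real store exposes async getValue/setValue, as the diff suggests).
class InMemoryKVStore {
  private data = new Map<string, string>()

  async getValue(key: string): Promise<string | null> {
    const value = this.data.get(key)
    return value === undefined ? null : value
  }

  async setValue(key: string, value: string): Promise<void> {
    this.data.set(key, value)
  }
}

// Type guard so the saved string is narrowed without a cast.
function isGPUType(value: unknown): value is GPUType {
  return value === 'nvidia' || value === 'amd'
}

// liveDetect is a hypothetical probe: it resolves to a GPU type, resolves to
// null when nothing is found, or rejects on a transient failure.
async function detectGPUType(
  store: InMemoryKVStore,
  liveDetect: () => Promise<GPUType | null>
): Promise<GPUResult> {
  let live: GPUType | null = null
  try {
    live = await liveDetect()
  } catch {
    // Transient probe failure: treat as "nothing detected live".
  }

  if (live !== null) {
    try {
      // Best-effort persistence; a store failure must not block detection.
      await store.setValue('gpu.type', live)
    } catch {
      // Ignored on purpose, mirroring the warn-and-continue behaviour.
    }
    return { type: live }
  }

  try {
    const saved = await store.getValue('gpu.type')
    if (isGPUType(saved)) {
      return { type: saved }
    }
  } catch {
    // Store unavailable: fall through to 'none'.
  }
  return { type: 'none' }
}
```

With this sketch, a successful probe followed by a failing one against the same store resolves to the saved type, while a fresh store with a failing probe resolves to `'none'`. One consequence worth noting: nothing here ever clears `gpu.type`, so a machine whose GPU is physically removed would keep resolving to the saved value.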

admin/types/kv_store.ts

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ export const KV_STORE_SCHEMA = {
   'ui.hasVisitedEasySetup': 'boolean',
   'ui.theme': 'string',
   'ai.assistantCustomName': 'string',
+  'gpu.type': 'string',
 } as const
 
 type KVTagToType<T extends string> = T extends 'boolean' ? boolean : string
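
Reviewer note: because the schema tags `gpu.type` as a plain `'string'`, the schema alone does not restrict the value to `'nvidia'` or `'amd'`. That is presumably why `_detectGPUType()` compares the saved value explicitly. The sketch below illustrates this, assuming the four schema entries visible in the hunk. `KVStoreKey`, `KVStoreValue`, and `matchesSchema` are hypothetical helpers that do not appear in this diff.

```typescript
// Trimmed copy of the schema: only the entries visible in the hunk above.
const KV_STORE_SCHEMA = {
  'ui.hasVisitedEasySetup': 'boolean',
  'ui.theme': 'string',
  'ai.assistantCustomName': 'string',
  'gpu.type': 'string',
} as const

type KVTagToType<T extends string> = T extends 'boolean' ? boolean : string

// Assumed helper types (not shown in the diff).
type KVStoreKey = keyof typeof KV_STORE_SCHEMA
type KVStoreValue<K extends KVStoreKey> = KVTagToType<(typeof KV_STORE_SCHEMA)[K]>

// Hypothetical runtime check: does a value match the tag declared for its key?
function matchesSchema(key: KVStoreKey, value: unknown): boolean {
  return typeof value === KV_STORE_SCHEMA[key]
}

// Compiles: the schema only promises a string, not a known GPU type.
const savedGpu: KVStoreValue<'gpu.type'> = 'intel'

// The schema-level check accepts it, so callers still need to compare
// against the values they actually support.
const schemaAcceptsIntel = matchesSchema('gpu.type', savedGpu)
```

If a tighter guarantee were wanted, one option would be a dedicated union type for this key, but that would be a change to the schema's design rather than something this commit does.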