Description
Describe the bug
After upgrading to v10.3.0+ (which switched from JS Chroma bindings to Python-based chroma-mcp via uvx), backfill of existing observations to ChromaDB fails for projects with large datasets. The error occurs in chroma_add_documents when metadata contains values that ChromaDB cannot accept as MetadataValue (only str/int/float/bool are supported).
New individual observation syncs work fine — the issue is specific to the bulk backfill code path.
Error message:
[CHROMA_SYNC] Backfilling observations {project=project-A, missing=1760, existing=0, total=1760}
[CHROMA_SYNC] Backfill failed {project=project-A} chroma-mcp tool "chroma_add_documents" returned error:
Error executing tool chroma_add_documents: Failed to add documents to collection 'cm__claude-mem':
argument 'metadatas': Cannot convert Python object to MetadataValue
Affected projects (same error):
- project-A (1760 observations): failed
- project-B: failed
- project-C: failed
- project-D: failed (different error: JSON Parse error: Unexpected identifier "server")
- Small projects (project-E, project-F, project-G, etc.): succeeded
Root cause:
addDocuments() sends batches of documents with metadata to chroma_add_documents. ChromaDB metadata only supports scalar types (str, int, float, bool). If any observation in a batch contains metadata with arrays, nested objects, or null values, the entire batch fails.
The previous JS-based Chroma integration likely handled type coercion internally. The new chroma-mcp bridge passes metadata as-is to Python chromadb, which strictly validates types.
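Small projects presumably succeed because none of their observations happen to carry non-scalar metadata. With thousands of observations, large projects are far more likely to contain at least one, and a single bad entry is enough to fail its batch.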
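To confirm this diagnosis on a failing project, a scan along the following lines could list which metadata entries are not plain scalars. This is only a sketch: findNonScalarMetadata and the sample field names are hypothetical and not part of claude-mem, and treating NaN/Infinity as invalid is an assumption (they cannot be represented in JSON, so they are unlikely to reach Python as floats).

```javascript
// Hypothetical diagnostic helper (not part of claude-mem): report every
// metadata value that is not a string, boolean, or finite number.
function findNonScalarMetadata(metadatas) {
  const problems = [];
  metadatas.forEach((meta, index) => {
    for (const [key, value] of Object.entries(meta ?? {})) {
      const t = typeof value;
      const isScalar =
        t === 'string' ||
        t === 'boolean' ||
        (t === 'number' && Number.isFinite(value)); // assumption: reject NaN/Infinity
      if (isScalar) continue;

      // Give each offending value a readable label for the report.
      let kind = t;
      if (value === null) kind = 'null';
      else if (Array.isArray(value)) kind = 'array';
      else if (t === 'number') kind = 'non-finite number';

      problems.push({ index, key, kind });
    }
  });
  return problems;
}

// Illustrative batch with made-up field names: the first entry is clean,
// the second contains an array, a nested object, and a null.
const sampleBatch = [
  { project: 'project-A', title: 'ok', count: 3 },
  { project: 'project-A', tags: ['a', 'b'], extra: { nested: true }, note: null },
];

const problems = findNonScalarMetadata(sampleBatch);
console.log(problems);
```

Running a check like this over the metadata built for a failing batch would show whether arrays, nested objects, or nulls are actually present before any sanitizing is applied.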
Steps to reproduce
- Have existing observation data from pre-v10.3.0
- Upgrade to v10.3.0+ (or delete ~/.claude-mem/chroma/ to force re-backfill)
- Wait for backfill to trigger on session start
- Check logs: large projects fail with Cannot convert Python object to MetadataValue
Suggested fix
Sanitize metadata before passing to chroma_add_documents — convert non-scalar values to strings:
// In addDocuments() or the metadata construction step:
const sanitizeMetadata = (meta) => {
  const result = {};
  for (const [key, value] of Object.entries(meta)) {
    // Drop null/undefined entries entirely
    if (value === null || value === undefined) continue;
    if (typeof value === 'object') {
      // Arrays and nested objects are serialized to a JSON string
      result[key] = JSON.stringify(value);
    } else {
      result[key] = value;
    }
  }
  return result;
};
Environment
- claude-mem: 10.3.1
- chroma-mcp: 0.2.6 (via uvx)
- Platform: WSL2 (Ubuntu 24.04, kernel 6.6.87.2)
- Total observations across all projects: ~12,000+
Additional context
- Individual observation sync (non-backfill) works correctly — likely because new observations are constructed with scalar-only metadata
- project-D fails with a different error (JSON Parse error), which may be a separate data corruption issue
- Related: #1182 "v10.3.0+: chroma-mcp SSL default breaks CHROMA_MODE=remote with local HTTP server" (SSL default issue in remote mode), #1166 "v10.2.5: cache directory missing node_modules - semantic search broken on marketplace install" (node_modules missing in cache)