Embeddings

GalaxDB computes text embeddings locally - no external API, no data leaving your machine. Embeddings are computed automatically when you insert rows into a table with an EMBEDDING MODEL column.

How It Works

When you declare a column with EMBEDDING MODEL 'model-id' DIM n, GalaxDB registers that column as an embedding source. On every INSERT the text value is sent to the sidecar process, which returns a float32 vector of dimension n. That vector is stored alongside the row and indexed in the HNSW graph.

INSERT INTO docs (id, body) VALUES (1, 'machine learning')
         ↓
  galaxdb-server receives INSERT
         ↓
  sends text to galaxdb-sidecar  (is_query: false → document prefix)
         ↓
  sidecar returns float32[384]
         ↓
  row stored + HNSW index updated

Sidecar Process

The sidecar (galaxdb-sidecar) is a separate process that loads the HuggingFace model and serves embedding requests over a local Unix socket. This isolation means:

The main database process never loads Python or ML frameworks
A sidecar crash does not crash the database - writes are queued and drained on recovery
The sidecar can be updated independently of the database binary

bash

galaxdb-server \
  --data-dir ./data \
  --sidecar /usr/local/bin/galaxdb-sidecar \
  --model sentence-transformers/all-MiniLM-L6-v2

Warning

If the sidecar is unavailable, INSERT into embedding columns returns a SidecarUnavailable error. GalaxDB never silently stores zero vectors or falls back to mock embeddings.

Supported Models

Seven model architectures are verified end-to-end. Pass any of these to --model at startup. The DIM value in your CREATE TABLE must match the model output dimension.

Model	Dim	Asymmetric	Notes
sentence-transformers/all-MiniLM-L6-v2	384	No	Default. Fast, general-purpose.
BAAI/bge-m3	1024	No	Multilingual, retrieval-optimized.
Qwen/Qwen3-Embedding-0.6B	1024	Yes	Instruction-tuned, CPU-friendly.
Qwen/Qwen3-Embedding-4B	2560	Yes	Higher quality. GPU recommended.
Qwen/Qwen3-Embedding-8B	4096	Yes	Highest quality. GPU required.
google/embeddinggemma-300m	768	Yes	Bidirectional Gemma. Gated - HF token needed.
LiquidAI/LFM2.5-Embedding-350M	1024	Yes	Hybrid architecture, CPU-friendly.

Models not in this list are auto-detected: the sidecar reads the model config from HuggingFace. If the architecture is one GalaxDB supports (BERT, XLM-RoBERTa, Qwen3, Gemma3, LFM2), it loads without a code change. An unknown architecture exits with a typed error listing what is supported.

Asymmetric Encoding

Qwen3-Embedding, EmbeddingGemma, and LFM2.5-Embedding apply different instruction prefixes to queries vs stored documents. Using the wrong prefix degrades retrieval quality. GalaxDB handles this automatically:

On INSERT, the sidecar receives is_query: false and applies the document prefix.
On SEMANTIC_MATCH, the sidecar receives is_query: true and applies the query prefix.

Symmetric models (all-MiniLM, BGE-M3) ignore the flag. You never need to manage prefixes manually.

Embedding Column Syntax

SQL

CREATE TABLE docs (
    id   INT PRIMARY KEY,
    body TEXT EMBEDDING MODEL 'sentence-transformers/all-MiniLM-L6-v2' DIM 384
);

-- BGE-M3 (1024-dim, multilingual)
CREATE TABLE docs_m3 (
    id   INT PRIMARY KEY,
    body TEXT EMBEDDING MODEL 'BAAI/bge-m3' DIM 1024
);

Health Check

SQL

SHOW EMBEDDING HEALTH;
SHOW EMBEDDING HEALTH FOR docs;

bash

curl http://localhost:9090/health
# {"status":"ok","version":"0.7.0","subsystems":{"disk_full":false,"sidecar_healthy":true,"connections_active":0}}

curl http://localhost:9090/metrics | grep embedding
# galaxdb_embedding_ops_total 1042
# galaxdb_embedding_queue_depth 0
# galaxdb_embedding_backlog_depth 0

Storage Engine

Vector Search