Introduction

EntDB is a Rust-based SQL database engine with PostgreSQL wire compatibility.

It runs in two modes:

  • embedded library (entdb crate),
  • server (entdb-server) for psql and pg drivers.

Current scope

  • Core DDL/DML for OLTP-style workloads.
  • MVCC transactions.
  • WAL + restart recovery.
  • Optional polyglot SQL ingress on both server and embedded paths. When enabled, EntDB rewrites selected non-PostgreSQL SQL forms (for example MySQL backticks and numeric LIMIT offset, count) into PostgreSQL-compatible SQL before parsing.

See quickstart.md for usage and architecture.md for runtime layout.

Quickstart

Run EntDB locally using one of two paths:

  • Server path (entdb-server + psql)
  • Embedded path (Rust API with EntDb::connect)

Path A: Server (entdb-server + psql)

1. Start the server

From crates.io (when entdb-server is published):

cargo install entdb-server --locked
entdb --host 127.0.0.1 --port 5433 --data-path ./entdb.data --auth-user entdb --auth-password entdb

From source (this repo):

cargo run -p entdb-server -- \
  --host 127.0.0.1 \
  --port 5433 \
  --data-path ./entdb.data \
  --auth-user entdb \
  --auth-password entdb

Optional: enable polyglot ingress rewrites (MySQL-style backticks and numeric LIMIT offset, count):

ENTDB_POLYGLOT=1 entdb --host 127.0.0.1 --port 5433 --data-path ./entdb.data --auth-user entdb --auth-password entdb

or from source:

ENTDB_POLYGLOT=1 cargo run -p entdb-server -- \
  --host 127.0.0.1 \
  --port 5433 \
  --data-path ./entdb.data \
  --auth-user entdb \
  --auth-password entdb

2. Connect with psql

psql "host=127.0.0.1 port=5433 user=entdb password=entdb dbname=entdb"

3. Execute SQL

CREATE TABLE users (id INTEGER, name TEXT);
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
SELECT id, name FROM users ORDER BY id LIMIT 10;

Expected result:

 id | name
----+-------
  1 | alice
  2 | bob

4. Try vector and BM25 SQL (server mode)

Run the following SQL in psql:

CREATE TABLE embeddings (id INT, vec VECTOR(3));
INSERT INTO embeddings VALUES (1, '[0.1,0.2,0.3]'), (2, '[0.9,0.8,0.7]');
SELECT id, vec <-> '[0.2,0.2,0.2]' AS dist FROM embeddings ORDER BY id;

CREATE TABLE docs (id INT, content TEXT);
INSERT INTO docs VALUES (1, 'database systems'), (2, 'search indexing');
CREATE INDEX idx_docs_bm25 ON docs USING bm25 (content) WITH (text_config='english');
SELECT id, content <@ to_bm25query('database', 'idx_docs_bm25') AS score FROM docs ORDER BY id;

Path B: Embedded Rust API

Add to Cargo.toml:

[dependencies]
entdb = "0.2.0"

use entdb::EntDb;

fn main() -> entdb::Result<()> {
    let db = EntDb::connect("./entdb_data")?;
    db.execute("CREATE TABLE users (id INT, name TEXT)")?;
    db.execute("INSERT INTO users VALUES (1, 'alice')")?;
    let rows = db.execute("SELECT * FROM users")?;
    println!("{rows:?}");
    db.close()?;
    Ok(())
}

5. Try vector and BM25 SQL (embedded mode)

Use the same features directly from Rust:

use entdb::EntDb;

fn main() -> entdb::Result<()> {
    let db = EntDb::connect("./entdb_data")?;
    db.execute("CREATE TABLE embeddings (id INT, vec VECTOR(3))")?;
    db.execute("INSERT INTO embeddings VALUES (1, '[0.1,0.2,0.3]')")?;
    db.execute("SELECT id, vec <-> '[0.2,0.2,0.2]' AS dist FROM embeddings")?;

    db.execute("CREATE TABLE docs (id INT, content TEXT)")?;
    db.execute("INSERT INTO docs VALUES (1, 'database systems')")?;
    db.execute("CREATE INDEX idx_docs_bm25 ON docs USING bm25 (content)")?;
    db.execute(
        "SELECT id, content <@ to_bm25query('database', 'idx_docs_bm25') AS score FROM docs",
    )?;
    db.close()?;
    Ok(())
}

Search and Vectors

EntDB supports vector similarity and BM25 text search in both runtime modes:

  • server mode (entdb-server via pgwire clients like psql)
  • embedded mode (entdb crate via EntDb::execute)

Vector SQL (pgvector-style surface)

Supported:

  • VECTOR(n) column type
  • vector text literals: '[0.1,0.2,0.3]'
  • <-> L2 distance
  • <=> cosine distance (distance form)

Example:

CREATE TABLE embeddings (id INT, vec VECTOR(3));
INSERT INTO embeddings VALUES
  (1, '[0.1,0.2,0.3]'),
  (2, '[0.9,0.8,0.7]');

SELECT id, vec <-> '[0.2,0.2,0.2]' AS l2_dist
FROM embeddings
ORDER BY id;
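
For intuition, the two distance operators can be sketched in plain Rust. This is a minimal illustration rather than EntDB's implementation: the function names are hypothetical, and it assumes that `<=>` follows the common pgvector-style definition of cosine distance as 1 minus cosine similarity.

```rust
/// Euclidean (L2) distance, the quantity `<->` is described as returning.
/// Returns None when the dimensions differ.
fn l2_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let sum: f32 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
    Some(sum.sqrt())
}

/// Cosine distance, assumed here to be 1 - cosine similarity.
/// Returns None for mismatched dimensions or zero-length vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

fn main() {
    // Same data as the SQL example above.
    let query: [f32; 3] = [0.2, 0.2, 0.2];
    let rows: [(i32, [f32; 3]); 2] = [(1, [0.1, 0.2, 0.3]), (2, [0.9, 0.8, 0.7])];
    for (id, vec) in rows {
        println!(
            "id={id} l2={:?} cosine={:?}",
            l2_distance(&vec, &query),
            cosine_distance(&vec, &query)
        );
    }
}
```

Returning `None` on a length mismatch mirrors the dimension check that a `VECTOR(n)` column implies.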

BM25 SQL (pg_textsearch-style surface)

Supported:

  • CREATE INDEX ... USING bm25 (...)
  • optional WITH (text_config='english'|'simple')
  • to_bm25query(query_text, index_name)
  • <@ score operator

Example:

CREATE TABLE docs (id INT, content TEXT);
INSERT INTO docs VALUES
  (1, 'database database systems'),
  (2, 'systems design'),
  (3, 'database retrieval');

CREATE INDEX idx_docs_bm25 ON docs USING bm25 (content) WITH (text_config='english');

SELECT id, content <@ to_bm25query('database', 'idx_docs_bm25') AS score
FROM docs
ORDER BY id;
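
To show what a BM25 score measures, here is a small sketch of the textbook Okapi BM25 formula applied to the three documents above. It is not EntDB's scorer: the whitespace tokenizer, the parameters `k1 = 1.2` and `b = 0.75`, and the function names are assumptions, and the real `text_config` analyzers and score scale may differ.

```rust
use std::collections::HashMap;

/// Lowercasing whitespace tokenizer (a stand-in for a real text analyzer).
fn tokenize(text: &str) -> Vec<String> {
    text.split_whitespace().map(|t| t.to_lowercase()).collect()
}

/// Score every document against `query` with textbook Okapi BM25.
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len() as f64).sum::<f64>() / n.max(1.0);
    if avg_len == 0.0 {
        return vec![0.0; docs.len()]; // empty corpus or only empty documents
    }

    // Document frequency: how many documents contain each term.
    let mut df: HashMap<&str, f64> = HashMap::new();
    for doc in &tokenized {
        let mut terms: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        terms.sort_unstable();
        terms.dedup();
        for term in terms {
            *df.entry(term).or_insert(0.0) += 1.0;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    let n_t = df.get(term.as_str()).copied().unwrap_or(0.0);
                    let idf = (1.0 + (n - n_t + 0.5) / (n_t + 0.5)).ln();
                    idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}

fn main() {
    let docs = ["database database systems", "systems design", "database retrieval"];
    for (i, score) in bm25_scores(&docs, "database", 1.2, 0.75).iter().enumerate() {
        println!("id={} score={score:.4}", i + 1);
    }
}
```

With these inputs, the document that repeats the query term ranks first and the document that lacks it scores zero.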

Embedded mode example

use entdb::EntDb;

fn main() -> entdb::Result<()> {
    let db = EntDb::connect("./entdb_data")?;
    db.execute("CREATE TABLE embeddings (id INT, vec VECTOR(3))")?;
    db.execute("INSERT INTO embeddings VALUES (1, '[0.1,0.2,0.3]')")?;
    db.execute("SELECT id, vec <-> '[0.2,0.2,0.2]' AS d FROM embeddings")?;

    db.execute("CREATE TABLE docs (id INT, content TEXT)")?;
    db.execute("INSERT INTO docs VALUES (1, 'database systems')")?;
    db.execute("CREATE INDEX idx_docs_bm25 ON docs USING bm25 (content)")?;
    db.execute(
        "SELECT id, content <@ to_bm25query('database', 'idx_docs_bm25') AS score FROM docs",
    )?;
    db.close()?;
    Ok(())
}

Planner behavior and limitations

  • BM25 sidecar files are persisted to disk and kept up to date by DML operations.
  • The planner can choose a BM25-backed scan for single-table queries that include column <@ to_bm25query('...', 'index_name') and have a matching BM25 index.
  • Non-matching shapes safely fall back to standard plan paths.

Current limitations:

  • text_config supports only english and simple.
  • The BM25-backed scan covers only specific query shapes, not every SQL form.
  • BM25 and sidecar behavior is tuned for the current SQL subset and continues to evolve.

Reliability and sidecar format

  • BM25 sidecar files are schema-versioned (version field).
  • Legacy unversioned sidecars are read for compatibility.
  • Writes are atomic via temp-file + rename.
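
The temp-file + rename pattern can be sketched with the Rust standard library alone. This is a generic illustration, not EntDB's code: the helper name `write_atomic`, the `.tmp` suffix, the file name, and the JSON payload are all hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `bytes` to `path` so that readers see either the old file or the
/// complete new file, never a partially written one.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Keep the temp file next to the target so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);
    {
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(bytes)?;
        tmp.sync_all()?; // make the contents durable before publishing them
    }
    // Replace the target in one step.
    fs::rename(&tmp_path, path)
}

fn main() -> io::Result<()> {
    let target = std::env::temp_dir().join("example.bm25.idx.json");
    write_atomic(&target, br#"{"version":1,"postings":{}}"#)?;
    write_atomic(&target, br#"{"version":1,"postings":{"database":[1]}}"#)?;
    println!("{}", fs::read_to_string(&target)?);
    fs::remove_file(&target)
}
```

A crash before the rename leaves the previous sidecar intact; a crash after it leaves the new one. A stricter variant would also sync the parent directory after the rename.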

Troubleshooting

  • unsupported bm25 text_config ...: use english or simple.
  • vector dimension mismatch ...: ensure literal dimensions match VECTOR(n).
  • to_bm25query first argument must be TEXT: pass a string literal/query text.

Architecture

EntDB has two ingress paths:

  • pgwire server path (entdb-server), and
  • embedded Rust API path (QueryEngine).

Both use the same SQL core (binder/planner/optimizer/executor), MVCC model, and storage engine.

+--------------------+
| External clients   |
| psql / pg drivers  |
+---------+----------+
          |
          v
+--------------------+                        +--------------------+
| entdb-server       |                        | Embedded app       |
| pgwire, auth, TLS  |                        | Rust API           |
+---------+----------+                        +---------+----------+
          |                                             |
          +----------------------+----------------------+
                                 |
                                 v
                      +------------------------------+
                      | SQL Core                     |
                      | Binder / Planner /           |
                      | Optimizer / Executors        |
                      +---------------+--------------+
                                      |
                                      v
                      +------------------------------+
                      | Catalog                      |
                      | table/index metadata         |
                      +---------------+--------------+
                                      |
                                      v
                      +-------------------------------+
                      | TransactionManager            |
                      | MVCC + txn lifecycle          |
                      +-------+---------------+-------+
                              |               |
                              |               +--------------------------+
                              |                                          |
                              v                                          v
                    +----------------------+                  +----------------------+
                    | BufferPool + Table + |                  | *.txn.wal            |
                    | B+Tree               |                  | *.txn.json           |
                    +----------+-----------+                  +----------------------+
                               |
                               v
                    +----------------------+
                    | LogManager +         |
                    | RecoveryManager      |
                    +----------+-----------+
                               |
                               v
                    +----------------------+
                    | *.wal                |
                    +----------------------+
                               |
                               v
                    +----------------------+
                    | *.data               |
                    +----------------------+

+------------------------------+
| OptimizerHistoryRecorder     |
+--------------+---------------+
               ^
               |
+--------------+--------------+
| QueryEngine / entdb-server  |
| SQL path (read + write)     |
+--------------+--------------+
               |
               v
+------------------------------+
| *.optimizer_history.json     |
+------------------------------+

+------------------------------+
| *.bm25.<index>.json          |
| (versioned sidecar index)    |
+------------------------------+

Startup (Database::open)

  1. DiskManager + LogManager
  2. BufferPool::with_log_manager(...)
  3. RecoveryManager::recover()
  4. Catalog::load(...)
  5. TransactionManager persistence setup
  6. OptimizerHistoryRecorder initialization

Query flow

  1. Parse SQL, or reuse a prepared statement.
  2. Bind names/types/relations, unless a prepared fast path can bypass the generic binder.
  3. Plan logical operators.
  4. Optimize (CBO/HBO with history) or dispatch to a narrow prepared fast path.
  5. Execute operators against MVCC-visible rows or index lookup paths.
  6. Persist writes via WAL-first ordering and page flush according to durability policy.

Durability policy

EntDB exposes three runtime durability policies:

  • Full: strict sync on commit path
  • Normal: reduced sync pressure for higher throughput
  • Off: best-effort durability for ephemeral workloads

Embedded and server callers can also force durability on individual writes with an explicit barrier or per-call override.
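
One way to picture how a policy and a per-call override interact is the decision function below. It is only a sketch under stated assumptions: the enum mirrors the three policy names, but `should_sync`, the `force` flag, and the batching threshold used for Normal are hypothetical and do not describe EntDB's actual API or sync schedule.

```rust
/// The three policies named above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Durability {
    Full,
    Normal,
    Off,
}

/// Hypothetical batching threshold for `Normal`; not an EntDB constant.
const NORMAL_SYNC_EVERY: u32 = 8;

/// Decide whether this commit should sync the WAL before it is acknowledged.
/// `force` models an explicit barrier or per-call override.
fn should_sync(policy: Durability, force: bool, commits_since_sync: u32) -> bool {
    if force {
        return true; // an explicit durable boundary wins over any policy
    }
    match policy {
        Durability::Full => true,
        Durability::Normal => commits_since_sync + 1 >= NORMAL_SYNC_EVERY,
        Durability::Off => false,
    }
}

fn main() {
    for policy in [Durability::Full, Durability::Normal, Durability::Off] {
        println!(
            "{policy:?}: plain commit syncs={} forced commit syncs={}",
            should_sync(policy, false, 0),
            should_sync(policy, true, 0)
        );
    }
}
```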

Index path

  • CREATE INDEX ... USING btree builds a secondary B-tree over existing rows.
  • DML maintains those indexes on later INSERT, UPDATE, DELETE, UPSERT, and INSERT ... SELECT.
  • Equality filters on indexed columns can route directly to index lookup executors instead of scanning the table.

Prepared and bulk execution

  • Prepared statements can use a fast path for common point reads and simple keyed DML.
  • Embedded bulk APIs (insert_many, update_many, delete_many) run repeated keyed writes in one transaction and avoid parser/binder/planner churn.

Vector and BM25 path

  • Vector operators (<->, <=>) are evaluated in expression execution.
  • BM25 index metadata is stored in catalog (IndexType::Bm25 with text_config).
  • BM25 documents/postings are stored in per-index sidecar files: *.bm25.<index>.json.
  • Sidecar writes are atomic (temp file + rename) and sidecar format is versioned.
  • On matching query shapes (column <@ to_bm25query(...) with matching index), planner can route to a BM25-backed scan path; non-matching shapes fall back to regular scan/filter/project paths.

Storage Engine

The storage layer is page-native and built for deterministic flush/recovery behavior under pressure.

Buffer Pool Eviction/Flush Flow

+------------------------+
| fetch/new page         |
+------------------------+
            |
            v
+------------------------+
| pin frame in pool      |
+------------------------+
            |
            v
+------------------------+
| mutate page (dirty=1)  |
+------------------------+
            |
            v
+------------------------+
| unpin frame            |
+------------------------+
            |
            v
   +-------------------+
   | pressure present? |
   +-------------------+
      | yes       | no
      v           v
+----------------------+   +----------------------+
| choose LRU-K victim  |   | reuse frame/continue |
+----------------------+   +----------------------+
            |
            v
   +-------------------+
   | victim is dirty?  |
   +-------------------+
      | yes       | no
      v           v
+----------------------+   +----------------------+
| WAL flush_up_to(LSN) |   | evict victim         |
+----------------------+   +----------------------+
            |
            v
+----------------------+
| write page to disk   |
+----------------------+
            |
            v
+----------------------+
| evict victim         |
+----------------------+
            |
            v
+----------------------+
| reuse frame/continue |
+----------------------+
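
The eviction branch of the flow above can be sketched as a toy model. The types and function names are hypothetical, the "disk" is a map from page id to the LSN that was written, and the LRU-K rule is simplified (frames with fewer than K recorded accesses are evicted first); EntDB's BufferPool may differ in all of these details.

```rust
use std::collections::HashMap;

struct Frame {
    page_id: u32,
    dirty: bool,
    pin_count: u32,
    page_lsn: u64,      // LSN of the last WAL record that changed this page
    accesses: Vec<u64>, // logical access timestamps, oldest first
}

#[derive(Default)]
struct Wal {
    flushed_lsn: u64,
}

impl Wal {
    /// Make every record up to `lsn` durable (modelled as a counter).
    fn flush_up_to(&mut self, lsn: u64) {
        self.flushed_lsn = self.flushed_lsn.max(lsn);
    }
}

/// Simplified LRU-K: among unpinned frames, pick the one whose K-th most
/// recent access is oldest; frames with fewer than K accesses sort first.
fn choose_victim(frames: &[Frame], k: usize) -> Option<usize> {
    let k = k.max(1);
    frames
        .iter()
        .enumerate()
        .filter(|(_, f)| f.pin_count == 0)
        .min_by_key(|(_, f)| {
            let kth = if f.accesses.len() >= k { f.accesses[f.accesses.len() - k] } else { 0 };
            (kth, f.accesses.first().copied().unwrap_or(0))
        })
        .map(|(i, _)| i)
}

/// Evict one frame. A dirty victim forces the WAL flush before the page write.
fn evict(frames: &mut Vec<Frame>, wal: &mut Wal, disk: &mut HashMap<u32, u64>, k: usize) -> Option<u32> {
    let idx = choose_victim(frames, k)?;
    let victim = frames.remove(idx);
    if victim.dirty {
        wal.flush_up_to(victim.page_lsn); // WAL first
        disk.insert(victim.page_id, victim.page_lsn); // then the page itself
    }
    Some(victim.page_id)
}

fn main() {
    let mut frames = vec![
        Frame { page_id: 1, dirty: true, pin_count: 0, page_lsn: 10, accesses: vec![1, 5] },
        Frame { page_id: 2, dirty: false, pin_count: 0, page_lsn: 0, accesses: vec![2, 3] },
        Frame { page_id: 3, dirty: true, pin_count: 1, page_lsn: 12, accesses: vec![4] },
    ];
    let (mut wal, mut disk) = (Wal::default(), HashMap::new());
    while let Some(page) = evict(&mut frames, &mut wal, &mut disk, 2) {
        println!("evicted page {page}, WAL flushed up to LSN {}", wal.flushed_lsn);
    }
}
```

Flushing the log before the page write is what lets recovery trust that every change found in an on-disk page also has a WAL record.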

How EntDB uses storage components

  • Page: fixed-size page with header/checksum, used as the stable on-disk unit.
  • DiskManager: page allocation/deallocation and positioned I/O.
  • BufferPool: in-memory frames with pin/unpin semantics, dirty tracking, and LRU-K victim selection.
  • SlottedPage: tuple slot directory for variable-length row storage.
  • Table: heap-page row layout plus tuple identity management.
  • B+Tree: page-native index structure for point/range access.

What this is good for

  • deterministic correctness under pressure,
  • stable persistence format with reopen/recovery invariants,
  • predictable behavior during eviction-heavy and crash-recovery scenarios.

Reference files

  • crates/entdb/src/storage/page.rs
  • crates/entdb/src/storage/disk_manager.rs
  • crates/entdb/src/storage/buffer_pool.rs
  • crates/entdb/src/storage/slotted_page.rs
  • crates/entdb/src/storage/table.rs
  • crates/entdb/src/storage/btree/tree.rs

SQL and Query Engine

EntDB executes SQL through a fixed pipeline:

  1. Parse SQL into AST.
  2. Bind names/types/relations.
  3. Build logical plan.
  4. Optimize the plan (simple plan shapes).
  5. Execute with concrete operators.

Supported SQL surface

Implemented subset includes:

  • DDL:
      • CREATE TABLE, DROP TABLE
      • CREATE INDEX, DROP INDEX
      • selected ALTER TABLE operations (add/drop/rename column, rename table)
  • DML/query:
      • INSERT, INSERT ... SELECT, UPSERT/ON CONFLICT, SELECT, UPDATE, DELETE, TRUNCATE
  • extended DML forms:
      • UPDATE ... FROM, UPDATE ... RETURNING
      • DELETE ... USING, DELETE ... RETURNING, DELETE ... ORDER BY/LIMIT, multi-table DELETE
  • predicates and ordering: WHERE, ORDER BY, LIMIT, OFFSET
  • aggregation: COUNT, SUM, AVG, MIN, MAX, multi-column GROUP BY
  • relational operators: INNER JOIN (including join chains), UNION, UNION ALL
  • query forms: CTE (WITH), derived subqueries, literal SELECT without FROM
  • window/query features: row_number() over (...), scalar function projections
  • transaction SQL: BEGIN, COMMIT, ROLLBACK

It also supports:

  • typed parameter binding for extended protocol (AST-level binding, no string substitution),
  • optional SQL dialect transpilation ingress with guarded fallback/error contracts.

Prepared execution

EntDB has two prepared execution layers:

  • generic prepared execution, which reuses parsed SQL and bound parameters
  • a narrow prepared fast path for common hot statements

The current fast path covers:

  • SELECT ... FROM t
  • SELECT ... FROM t WHERE col = $1
  • SELECT COUNT(*) ... WHERE col OP $1
  • simple single-row INSERT
  • keyed UPDATE
  • keyed DELETE

This fast path is intentionally narrow. Unsupported shapes fall back to the normal binder/planner/executor pipeline.

Index-backed equality lookups

For single-column B-tree indexes, equality predicates can bypass full scans:

  • CREATE INDEX ... USING btree builds the index over existing rows
  • later DML keeps the index in sync
  • simple equality filters can dispatch to an index lookup executor

This is the path used to speed up repeated keyed lookups such as WHERE id = $1.
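
The idea can be illustrated with a toy table that keeps an ordered map beside its row heap. This is a sketch only: `Users`, `lookup_eq`, and `scan_eq` are hypothetical names, and the standard library's `BTreeMap` stands in for EntDB's page-native B+Tree.

```rust
use std::collections::BTreeMap;

/// Toy table: a heap of rows plus one secondary index on `id`.
struct Users {
    rows: Vec<Option<(i64, String)>>, // slot -> row, None once deleted
    by_id: BTreeMap<i64, Vec<usize>>, // id -> slots holding that id
}

impl Users {
    /// Like CREATE INDEX: build the index over rows that already exist.
    fn with_index(rows: Vec<(i64, String)>) -> Self {
        let mut table = Users { rows: Vec::new(), by_id: BTreeMap::new() };
        for (id, name) in rows {
            table.insert(id, name);
        }
        table
    }

    /// INSERT keeps the index in sync with the heap.
    fn insert(&mut self, id: i64, name: String) {
        self.rows.push(Some((id, name)));
        self.by_id.entry(id).or_default().push(self.rows.len() - 1);
    }

    /// DELETE by id clears the heap slots and drops the index entry.
    fn delete(&mut self, id: i64) {
        for slot in self.by_id.remove(&id).unwrap_or_default() {
            self.rows[slot] = None;
        }
    }

    /// Index path for `WHERE id = ?`: visit only the matching slots.
    fn lookup_eq(&self, id: i64) -> Vec<&str> {
        self.by_id
            .get(&id)
            .into_iter()
            .flatten()
            .filter_map(|&slot| self.rows[slot].as_ref().map(|(_, name)| name.as_str()))
            .collect()
    }

    /// Fallback path: scan every live row and filter.
    fn scan_eq(&self, id: i64) -> Vec<&str> {
        self.rows
            .iter()
            .flatten()
            .filter(|(row_id, _)| *row_id == id)
            .map(|(_, name)| name.as_str())
            .collect()
    }
}

fn main() {
    let mut users = Users::with_index(vec![(1, "alice".into()), (2, "bob".into())]);
    users.insert(3, "carol".into());
    println!("index path: {:?}", users.lookup_eq(2));
    println!("scan path:  {:?}", users.scan_eq(2));
    users.delete(2);
    println!("after delete: {:?}", users.lookup_eq(2));
}
```

Both paths return the same rows; the index path simply avoids touching rows whose key does not match.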

Bulk embedded APIs

The embedded Rust API also exposes batched write helpers:

  • insert_many(...)
  • update_many(...)
  • delete_many(...)

These helpers are not new SQL syntax. They are embedded API shortcuts that execute repeated keyed changes in one transaction while avoiding repeated SQL parse/bind/plan work.

SQL Dialect Transpiler Support

EntDB includes a guarded SQL transpiler ingress for selected non-PostgreSQL query shapes.

It uses the external polyglot-sql crate.

How it works:

  • Transpiler is disabled by default.
  • Enable it with ENTDB_POLYGLOT=1 (server) or engine.set_polyglot_enabled(true) (embedded QueryEngine).
  • SQL is transpiled before PostgreSQL parsing/binding.
  • If transpilation changes the SQL, EntDB records both the original and the transpiled form in the error context for debugging.

Currently supported rewrites:

  • MySQL-style identifier quoting: `users` -> "users"
  • MySQL numeric LIMIT offset, count -> PostgreSQL LIMIT count OFFSET offset

Guardrails and behavior:

  • Non-numeric LIMIT offset, count is left unchanged (no unsafe guessing).
  • Unbalanced backticks are rejected.
  • Unsupported delimiter syntax is rejected.
  • Rewriter only triggers for candidate inputs; normal PostgreSQL SQL bypasses transpilation.

Example (with transpiler enabled):

SELECT `id`, `name` FROM `users` ORDER BY `id` LIMIT 1, 2;

is executed as:

SELECT "id", "name" FROM "users" ORDER BY "id" LIMIT 2 OFFSET 1;
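
The two rewrites and their guardrails can be approximated with a small hand-rolled sketch. It does not use or reproduce the polyglot-sql crate's API: the function names are hypothetical, and the LIMIT handling is deliberately naive (it only looks at the last `LIMIT ` in the statement and does not skip string literals).

```rust
/// Replace MySQL-style backtick quoting with double quotes.
/// Backticks inside single-quoted string literals are left alone.
fn rewrite_backticks(sql: &str) -> Result<String, String> {
    let mut out = String::with_capacity(sql.len());
    let (mut in_string, mut in_ident) = (false, false);
    for ch in sql.chars() {
        match ch {
            '\'' if !in_ident => {
                in_string = !in_string;
                out.push(ch);
            }
            '`' if !in_string => {
                in_ident = !in_ident;
                out.push('"');
            }
            _ => out.push(ch),
        }
    }
    if in_ident {
        Err("unbalanced backticks".to_string())
    } else {
        Ok(out)
    }
}

/// Rewrite a trailing numeric `LIMIT offset, count` to `LIMIT count OFFSET offset`.
/// Anything that is not two plain integers is returned unchanged.
fn rewrite_limit(sql: &str) -> String {
    // ASCII lowercasing keeps byte offsets identical to the original string.
    let lower = sql.to_ascii_lowercase();
    let Some(pos) = lower.rfind("limit ") else {
        return sql.to_string();
    };
    let tail = sql[pos + "limit ".len()..].trim();
    let (body, semi) = match tail.strip_suffix(';') {
        Some(rest) => (rest.trim_end(), ";"),
        None => (tail, ""),
    };
    let Some((offset, count)) = body.split_once(',') else {
        return sql.to_string();
    };
    match (offset.trim().parse::<u64>(), count.trim().parse::<u64>()) {
        (Ok(offset), Ok(count)) => {
            format!("{}LIMIT {count} OFFSET {offset}{semi}", &sql[..pos])
        }
        _ => sql.to_string(), // non-numeric form: no guessing
    }
}

/// Apply both rewrites in order.
fn transpile(sql: &str) -> Result<String, String> {
    Ok(rewrite_limit(&rewrite_backticks(sql)?))
}

fn main() {
    let input = "SELECT `id`, `name` FROM `users` ORDER BY `id` LIMIT 1, 2;";
    println!("{:?}", transpile(input));
    println!("{:?}", transpile("SELECT `id FROM users"));
}
```

Rejecting unbalanced input and passing non-numeric LIMIT forms through unchanged follow the guardrails listed above.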

Vector and BM25 support (both modes)

The same SQL surface is supported through:

  • server mode (entdb-server, pgwire clients such as psql)
  • embedded mode (EntDb Rust API)

Vector SQL

  • type: VECTOR(n)
  • literal form: '[x,y,z]'
  • operators:
      • <-> (L2 distance)
      • <=> (cosine distance)

Example:

CREATE TABLE embeddings (id INT, vec VECTOR(3));
INSERT INTO embeddings VALUES (1, '[0.1,0.2,0.3]');
SELECT id, vec <-> '[0.2,0.2,0.2]' AS dist FROM embeddings;

BM25 SQL

  • index DDL: CREATE INDEX ... USING bm25 (...) [WITH (...)]
  • query constructor: to_bm25query(query_text, index_name)
  • scoring operator: <@

Example:

CREATE TABLE docs (id INT, content TEXT);
CREATE INDEX idx_docs_bm25 ON docs USING bm25 (content) WITH (text_config='english');
SELECT id, content <@ to_bm25query('database', 'idx_docs_bm25') AS score
FROM docs;

Status note:

  • BM25 sidecar index persistence and DML maintenance are implemented.
  • Planner/executor can use a BM25-backed scan for matching <@ to_bm25query(...) query shapes.
  • BM25 sidecar files are versioned (version field) with legacy unversioned read compatibility.

Why this design

  • Same core engine for embedded and pgwire paths.
  • Predictable semantics for common transactional queries.
  • Defensive error handling for malformed SQL and unsupported rewrites.

Reference files

  • crates/entdb/src/query/binder.rs
  • crates/entdb/src/query/planner.rs
  • crates/entdb/src/query/optimizer.rs
  • crates/entdb/src/query/executor/mod.rs
  • crates/entdb/src/query/tests/engine_tests.rs

Transactions and MVCC

EntDB uses MVCC row versioning with transaction snapshots.

MVCC Visibility Flow

+-------------------------------------------+
| row version                               |
| (created_txn, deleted_txn?)               |
+-------------------------------------------+
                     |
                     v
+-------------------------------------------+
| lookup txn status for created/deleted txns|
+-------------------------------------------+
                     |
                     v
+-------------------------------------------+
| compare commit ts against reader snapshot |
+-------------------------------------------+
                     |
                     v
         +---------------------------+
         | created visible to reader?|
         +---------------------------+
            | no               | yes
            v                  v
   +----------------+   +---------------------------+
   | skip row       |   | deleted txn visible?      |
   +----------------+   +---------------------------+
                             | yes             | no
                             v                 v
                    +----------------+   +----------------+
                    | hide row       |   | return row     |
                    +----------------+   +----------------+
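
The decision tree above translates almost directly into code. The sketch below is an illustration under assumptions, not EntDB's implementation: all type and function names are hypothetical, commit timestamps are plain integers, and a transaction is assumed to see its own writes.

```rust
use std::collections::HashMap;

type TxnId = u64;
type Timestamp = u64;

#[derive(Clone, Copy, Debug, PartialEq)]
enum TxnStatus {
    InProgress,
    Committed(Timestamp),
    Aborted,
}

struct RowVersion {
    created_txn: TxnId,
    deleted_txn: Option<TxnId>,
}

struct Snapshot {
    reader_txn: TxnId,
    snapshot_ts: Timestamp,
}

/// A transaction's effects are visible if it is the reader itself (assumption)
/// or if it committed at or before the reader's snapshot timestamp.
fn txn_visible(txn: TxnId, snap: &Snapshot, statuses: &HashMap<TxnId, TxnStatus>) -> bool {
    if txn == snap.reader_txn {
        return true;
    }
    matches!(statuses.get(&txn), Some(TxnStatus::Committed(ts)) if *ts <= snap.snapshot_ts)
}

/// Walk the flow: creation must be visible, and a visible deletion hides the row.
fn row_visible(row: &RowVersion, snap: &Snapshot, statuses: &HashMap<TxnId, TxnStatus>) -> bool {
    if !txn_visible(row.created_txn, snap, statuses) {
        return false; // skip row
    }
    match row.deleted_txn {
        Some(deleter) if txn_visible(deleter, snap, statuses) => false, // hide row
        _ => true, // return row
    }
}

fn main() {
    let statuses = HashMap::from([
        (1, TxnStatus::Committed(5)),
        (2, TxnStatus::Committed(20)),
        (3, TxnStatus::InProgress),
        (4, TxnStatus::Aborted),
    ]);
    let snap = Snapshot { reader_txn: 9, snapshot_ts: 10 };
    let rows = [
        RowVersion { created_txn: 1, deleted_txn: None },
        RowVersion { created_txn: 1, deleted_txn: Some(2) },
        RowVersion { created_txn: 3, deleted_txn: None },
        RowVersion { created_txn: 4, deleted_txn: None },
    ];
    for row in &rows {
        println!(
            "created={} deleted={:?} visible={}",
            row.created_txn,
            row.deleted_txn,
            row_visible(row, &snap, &statuses)
        );
    }
}
```

A delete committed after the snapshot does not hide the row, which is what gives readers a stable view while writers continue.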

How EntDB applies MVCC

Highlights:

  • transaction API and SQL control (BEGIN / COMMIT / ROLLBACK),
  • snapshot-based visibility rules,
  • write-write conflict detection,
  • persisted transaction metadata (*.txn.wal, *.txn.json),
  • vacuum controls for version cleanup policy.

Why this is useful

  • readers get stable visibility while writers progress concurrently,
  • write-write conflicts fail fast instead of silently corrupting visibility,
  • restart behavior preserves committed state and hides incomplete work.

Validation coverage

  • concurrent transaction matrix tests,
  • restart visibility tests,
  • crash/recovery scenario tests.

Reference files

  • crates/entdb/src/tx.rs
  • crates/entdb/src/query/executor/mod.rs
  • crates/entdb/src/query/executor/update.rs
  • crates/entdb/src/query/executor/delete.rs
  • crates/entdb/src/query/tests/mvcc_tests.rs

Reliability and Recovery

EntDB durability and restart safety are driven by WAL-first write semantics and deterministic recovery.

Durability policy is configurable at runtime:

  • Full: strict sync on commit path
  • Normal: reduced sync pressure with the same recovery model
  • Off: best-effort durability for ephemeral workloads

Callers that normally run in Normal or Off can still force a durable boundary on a specific operation.

Reliability stack

  • WAL record checksums and replay safety,
  • analysis/redo/undo recovery paths,
  • failpoint and crash-point matrices,
  • idempotent recovery expectations.

WAL Recovery Flow

+------------------------------+
| startup                      |
+------------------------------+
               |
               v
+------------------------------+
| scan WAL records             |
+------------------------------+
               |
               v
+------------------------------+
| analysis                     |
| - collect txn states         |
| - collect touched pages      |
+------------------------------+
               |
               v
+------------------------------+
| redo                         |
| - replay committed updates   |
| - respect page LSN checks    |
+------------------------------+
               |
               v
+------------------------------+
| undo                         |
| - roll back incomplete txns  |
+------------------------------+
               |
               v
+------------------------------+
| consistent recovered state   |
+------------------------------+
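
The three phases can be sketched on a toy log. This is a simplified model, not EntDB's RecoveryManager: each page holds a single integer, records carry before and after images, no two unfinished transactions touch the same page, and all names are hypothetical. It is also not full ARIES (no compensation records).

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Debug)]
enum Record {
    Update { lsn: u64, txn: u64, page: u32, before: i64, after: i64 },
    Commit { txn: u64 },
}

#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Page {
    lsn: u64,
    value: i64,
}

fn recover(log: &[Record], pages: &mut HashMap<u32, Page>) {
    // Analysis: collect the transactions that reached commit.
    let committed: HashSet<u64> = log
        .iter()
        .filter_map(|r| match r {
            Record::Commit { txn } => Some(*txn),
            _ => None,
        })
        .collect();

    // Redo: replay committed updates unless the page already reflects them.
    for record in log {
        if let Record::Update { lsn, txn, page, after, .. } = record {
            if committed.contains(txn) {
                let p = pages.entry(*page).or_default();
                if p.lsn < *lsn {
                    *p = Page { lsn: *lsn, value: *after };
                }
            }
        }
    }

    // Undo: walk backwards and restore before-images of incomplete
    // transactions whose changes had reached the page.
    for record in log.iter().rev() {
        if let Record::Update { lsn, txn, page, before, .. } = record {
            if !committed.contains(txn) {
                if let Some(p) = pages.get_mut(page) {
                    if p.lsn >= *lsn {
                        p.value = *before;
                    }
                }
            }
        }
    }
}

fn main() {
    let log = vec![
        Record::Update { lsn: 1, txn: 1, page: 1, before: 0, after: 10 },
        Record::Commit { txn: 1 },
        Record::Update { lsn: 2, txn: 2, page: 1, before: 10, after: 99 }, // never commits
    ];
    // The crash happened after the uncommitted change was flushed to disk.
    let mut pages = HashMap::from([(1, Page { lsn: 2, value: 99 })]);
    recover(&log, &mut pages);
    println!("after first recovery:  {:?}", pages[&1]);
    recover(&log, &mut pages);
    println!("after second recovery: {:?}", pages[&1]);
}
```

Running `recover` twice leaves the pages unchanged the second time, which is the idempotence property the reliability stack expects.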

Validation coverage

Reliability behavior is validated with crash matrices, failpoint-driven recovery tests, and MVCC restart visibility tests.

What this protects

  • committed transactions remain visible across restart,
  • incomplete/aborted transactions do not leak visibility,
  • repeated recovery runs converge to the same state,
  • durability policy changes trade commit latency against sync strictness, not MVCC visibility rules.

Reference files

  • crates/entdb/src/wal/log_record.rs
  • crates/entdb/src/wal/log_manager.rs
  • crates/entdb/src/wal/recovery.rs
  • crates/entdb/src/wal/tests/recovery_tests.rs
  • crates/entdb/tests/crash_matrix.rs

Operating EntDB

EntDB runs as a single-node transactional SQL engine.

Runtime boundaries

You can configure:

  • max concurrent connections,
  • max statement size,
  • per-query timeout,
  • auth policy (md5 or scram-sha-256),
  • optional TLS transport.

Deployment basics

  • Keep --data-path on durable storage.
  • Use different data paths for dev/stage/prod.
  • Keep auth enabled outside local development.
  • Use TLS for non-local traffic.

Lifecycle behavior

  • Clean shutdown flushes dirty pages and transaction metadata.
  • Restart replays recovery before serving queries.

What to monitor

  • query latency and error rate,
  • transaction conflict rate,
  • WAL flush latency,
  • buffer pool pressure.