NEDB Documentation

NEDB Reference Documentation

v2.0.27 — The DAG Engine

v2 DAG Engine

NEDB v2.0.27 ships the production DAG substrate: every document version is hashed with BLAKE2b and stored as an immutable, addressable object. There is no append-only log to replay and no corrupted tail to recover. After the first open, MANIFEST restores seq and the Merkle head in O(1); the deferred background cold scan lets the daemon accept connections immediately on first open. The head hash is a real Merkle root over the entire DAG, anchorable on-chain and comparable across replicas, updated incrementally on every write.

v1 content is still valid. The v1 AOF engine and its full API (NQL, time-travel, causal provenance, AES-256-GCM, RESP2, SQL/Redis adapters) remain unchanged. The DAG engine is a drop-in replacement substrate: same wire protocol, same query language, same Python/Node API surface. Pass --dag (or NEDBD_DAG=1) to opt in. Everything below this section also applies to the v2 engine unless it specifically refers to log.aof.

Production status: NEDB v2.0.27 is serving vision.interchained.org from the DAG engine — 1,310,703 sequences indexed, AES-256-GCM encrypted at rest, live at block height 620,989.

What changed at the substrate

Storage is a Merkle DAG, not a log. Every put produces an immutable, BLAKE2b-keyed object. Document history is a real DAG of version objects, not a sequential AOF.
Warm start is O(1). A small MANIFEST file records seq and the current Merkle head. After the first open, the daemon restores both in constant time — independent of dataset size.
Cold scan is deferred. On the first open of an existing dataset, the daemon accepts connections immediately; the integrity scan runs in a background thread with a live progress bar. Reads serve instantly; writes return 503 startup in progress until the startup_ready gate flips.
Silent corruption is impossible. Objects are immutable and hash-addressed, so a flipped byte changes the digest; the DAG ignores the broken object and the head hash refuses to verify against it.
The head is a real Merkle root, updated in O(1). BLAKE2b commitment over the full DAG of every version ever written, advanced incrementally on every write, never recomputed.
IdIndex sharded across 256 subdirectories. Prevents directory slowdown at 1M+ documents.
TCP_NODELAY on the axum listener. Eliminates the 40–200 ms Nagle delay on macOS loopback.
Deletes are tombstones. History never disappears; AS OF still resolves the pre-delete value, and the head hash still commits to every version ever written.
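
The content-addressing idea behind these points can be sketched in a few lines of Python. This is a minimal in-memory illustration, not NEDB's on-disk format: `ObjectStore`, `put_object`, and `get_object` are hypothetical names, and the 32-byte digest size is an assumption.

```python
import hashlib

class ObjectStore:
    """Toy content-addressed store: an object's key is the BLAKE2b digest of its bytes."""

    def __init__(self):
        self._objects = {}

    def put_object(self, data: bytes) -> str:
        digest = hashlib.blake2b(data, digest_size=32).hexdigest()
        self._objects[digest] = data  # storing identical bytes twice changes nothing
        return digest

    def get_object(self, digest: str):
        data = self._objects.get(digest)
        if data is None:
            return None
        # Re-hash on read: a flipped byte no longer matches its own key.
        if hashlib.blake2b(data, digest_size=32).hexdigest() != digest:
            return None
        return data

store = ObjectStore()
key = store.put_object(b'{"fact": "Earth orbits Sun"}')
assert store.get_object(key) == b'{"fact": "Earth orbits Sun"}'

# Simulate corruption by changing one byte behind the store's back.
store._objects[key] = b'{"fact": "Earth orbits Sum"}'
assert store.get_object(key) is None
```

Because the key is derived from the content, a damaged object cannot pass for the original: it simply stops resolving under its old digest.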

--dag / NEDBD_DAG=1

Two equivalent ways to launch the v2 engine:

shell — flag form

nedbd --dag --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag  auth=off

shell — env form (recommended for systemd / containers)

NEDBD_DAG=1 NEDB_TMK=<hex> nedbd --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag  enc=on

Without --dag or NEDBD_DAG=1, nedbd runs the v1 AOF engine exactly as in 1.x. The v1 daemon binary, log format, and HTTP routes are unchanged.

Both forms do the same thing. The --dag CLI flag and the NEDBD_DAG=1 environment variable flip the same internal switch, so pick whichever fits your deployment style. Setting both is harmless; they never conflict.

The nedbd-v2 binary

The v2 engine ships as a separate binary, nedbd-v2. The nedbd command is a thin launcher: when --dag or NEDBD_DAG=1 is detected, it execs nedbd-v2 from the same install directory with the same arguments and environment.

shell — where the binary lives

which nedbd-v2
# /usr/local/bin/nedbd-v2  (or wherever nedb-engine installs binaries)

# Both pip and npm packages ship the v2 binary alongside the v1 launcher.
pip show -f nedb-engine | grep nedbd
# scripts/nedbd
# bin/nedbd-v2

# You can also invoke nedbd-v2 directly, skipping the launcher.
nedbd-v2 --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag

Invoking nedbd-v2 directly is identical to nedbd --dag — the launcher only adds the engine-selection logic; everything else is the same Rust process.

Startup modes — warm vs. cold

nedbd v2 has two distinct startup paths. Both keep the daemon responsive; the difference is whether an integrity scan is required.

Warm start (every restart after the first open)

On every restart after the database has been opened at least once, the daemon reads the MANIFEST file and restores seq and the current Merkle head in O(1). The full DAG is not scanned; objects are read on demand. Boot is constant-time and independent of dataset size.

shell — warm start (subsequent restart)

NEDBD_DAG=1 nedbd --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag
#   manifest: seq=1310703  head=b2:9c14e07a…  (warm start in 4 ms)
#   ready

Cold start (first open of an existing dataset)

If no MANIFEST exists yet, the daemon spawns the integrity scan in a background thread and starts accepting connections immediately. Reads serve from the DAG as the scan proceeds. Writes are blocked behind the startup_ready gate — any PUT or mutation returns HTTP 503 with body {"error":"startup in progress","scan":{"objects":X,"of":Y,"rate":Z,"eta_s":W}} until the scan completes. The MANIFEST is written at the end of the scan; all subsequent restarts are warm.

shell — cold start (first open, deferred scan with progress bar)

NEDBD_DAG=1 nedbd --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag
#   listener ready    (writes blocked: 503 startup in progress)
#   cold scan:  47312 / 1310703 objs   1.4s elapsed   ~21k/s   eta 60s
#   cold scan:  ████████░░░  730 K / 1.31 M  21k/s  eta 28s
#   manifest written: seq=1310703  head=b2:9c14e07a…   startup_ready=true
#   writes unblocked

Reads work instantly during cold scan. The DAG is content-addressed, so any object that has been written is fetchable by digest before the scan reaches it. Only the write gate (startup_ready) waits for the full scan to finish — to guarantee replay protection counters and idempotency tables are fully rehydrated before accepting new writes.
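
A client that writes during a cold scan needs to tolerate the 503 response. The sketch below shows one way to retry until the gate opens. `write_with_retry` and its `send` callable are hypothetical: `send` stands in for whatever performs the HTTP request and returns a `(status, body)` pair, so the example runs without a daemon.

```python
import time

def write_with_retry(send, attempts=5, delay_s=0.5):
    """Call send() until it stops returning HTTP 503 or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < attempts - 1:
            time.sleep(delay_s)  # wait before asking again
    return status, body  # still 503 after the last attempt

# Simulated daemon: one "startup in progress" reply, then success.
replies = iter([
    (503, {"error": "startup in progress",
           "scan": {"objects": 730000, "of": 1310703, "rate": 21043, "eta_s": 28}}),
    (200, {"seq": 1310704}),
])
result = write_with_retry(lambda: next(replies), delay_s=0)
print(result)  # (200, {'seq': 1310704})
```

A real client could also read `eta_s` from the 503 body to choose the delay instead of using a fixed interval.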

Watching startup live with SSE

The new GET /events SSE endpoint streams the cold-scan progress along with every subsequent write — useful for ops dashboards and CI.

shell — tail the live event stream

curl http://127.0.0.1:7070/events
# event: scan
# data: {"objects":730000,"of":1310703,"rate":21043,"eta_s":28}
# event: ready
# data: {"seq":1310703,"head":"b2:9c14e07a…"}
# event: write
# data: {"seq":1310704,"coll":"beliefs","head":"b2:7af3c11e…"}
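
For consumers that are not curl, the stream can be parsed with a few lines of Python. This is a minimal sketch of standard Server-Sent Events framing (fields end at a blank line); `parse_sse` is a hypothetical helper, and it assumes every `data:` payload is JSON, as in the sample above.

```python
import json

def parse_sse(lines):
    """Yield (event_name, data_dict) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                      # a blank line ends one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):          # SSE comment / keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())

sample = [
    "event: scan\n",
    'data: {"objects":730000,"of":1310703,"rate":21043,"eta_s":28}\n',
    "\n",
    "event: ready\n",
    'data: {"seq":1310703,"head":"b2:9c14e07a…"}\n',
    "\n",
]
events = list(parse_sse(sample))
for name, payload in events:
    print(name, payload)
```

In practice `lines` would be the decoded lines of a streaming HTTP response rather than a list.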

Migrating from v1 AOF

v2 reads v1 AOF data on first open and produces an equivalent DAG. The original log.aof is preserved for rollback, and the new DAG is written to dag/ alongside it.

shell — in-place migration

# Before: ./data/log.aof, ./data/meta.json (v1 AOF engine)
# Stop v1 daemon, then start v2 on the same directory:
NEDBD_DAG=1 nedbd --data ./data

# Boot log on first open (cold scan runs deferred — socket open immediately):
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag
#   listener ready    (writes blocked: 503 startup in progress)
#   migrate: scanning log.aof (1.2M ops) …
#   migrate: built DAG with 47312 unique object hashes
#   migrate: head = b2:7af3c11e…  verified
#   manifest written   startup_ready=true   (subsequent starts: O(1) warm)

# After: ./data/log.aof (kept for rollback), ./data/dag/, ./data/MANIFEST

The head hash after migration commits to the same logical history that verify() would have produced on v1 — same NQL results, same AS OF answers, same causal traces. The Merkle root, however, is a new value (the DAG hashes are computed differently than the v1 chain hashes), so anchor consumers must record the new root.

Migration is one-way at the head. After migration, v1 will still be able to read log.aof (it ignores the dag/ directory), but any writes made via v2 are stored only in the DAG and will not be visible to a v1 daemon. Keep both running in parallel only with care; the recommended path is a clean cutover.

BLAKE2b Merkle head

The v2 engine exposes the DAG root as db.head() (and on every HTTP response). It is a BLAKE2b digest over an ordered Merkle tree of all DAG objects.

Python

from nedb import NEDB

db = NEDB("./data", dag=True)

db.put("claims", "c1", {"fact": "Earth orbits Sun"})
db.head()
# "b2:7af3c11e9d4f5a1c…"  (BLAKE2b Merkle root over the full DAG)

# Every HTTP response also carries the current head:
# {"rows": [...], "count": 1, "seq": 12, "head": "b2:7af3c11e…"}

shell — verify the DAG

curl http://localhost:7070/v1/databases/myapp/verify
# {"ok": true, "engine": "dag", "head": "b2:7af3c11e…", "objects": 47312}

verify() on the v2 engine re-walks every DAG object, recomputes its BLAKE2b digest, confirms each parent edge resolves, and re-derives the Merkle root. Any tampering changes the digest and the verification fails. On the production benchmark hardware (Intel iMac), verify() processes 30,000 objects in 1.38 s — roughly 21,000 BLAKE2b/sec.

The Merkle head is incremental. Live updates advance the head in O(1) on every write — the previous root is combined with the new object's digest via a BLAKE2b update step. verify() only rewalks the full DAG on demand; normal operation never recomputes the root from scratch.
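
The incremental update described above can be sketched as a fold over object digests. The exact combination rule, the starting value, and the digest size are not specified here, so `advance_head`, `rederive_head`, and `EMPTY_HEAD` are illustrative assumptions rather than NEDB's actual scheme.

```python
import hashlib

EMPTY_HEAD = b"\x00" * 32  # assumed starting value for an empty DAG

def digest(data: bytes) -> bytes:
    return hashlib.blake2b(data, digest_size=32).digest()

def advance_head(prev_head: bytes, object_digest: bytes) -> bytes:
    """O(1) update: hash the previous head together with the new object's digest."""
    return digest(prev_head + object_digest)

def rederive_head(objects) -> bytes:
    """Full re-walk: recompute every digest and fold them in write order."""
    head = EMPTY_HEAD
    for obj in objects:
        head = advance_head(head, digest(obj))
    return head

objects = [b"v1 of doc", b"v2 of doc", b"tombstone"]
head = EMPTY_HEAD
for obj in objects:                 # live writes advance the head one step at a time
    head = advance_head(head, digest(obj))

assert head == rederive_head(objects)          # a full re-walk agrees with the live head
tampered = [b"v1 of doc", b"v2 of d0c", b"tombstone"]
assert rederive_head(tampered) != head         # any changed object changes the head
```

The per-write cost is one hash over two fixed-size inputs, which is why the head can advance without touching earlier objects.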

Tombstone deletes

Deletes in the DAG engine are tombstone objects: small immutable records that point at the version being deleted. They are part of the DAG and the Merkle root, so the head hash still commits to the fact that the delete happened.

Python

db.put("users", "alice", {"name": "Alice", "status": "active"})
snap = db.seq

db.delete("users", "alice")        # writes a tombstone object

db.get("users", "alice")              # → None (HEAD sees the tombstone)
db.get("users", "alice", as_of=snap)     # → {"name": "Alice", ...}

# The head still commits to the original version — proof never disappears.
assert db.verify()

Tombstones never reclaim space — the original version object is still in the DAG so AS OF reads can resolve it. If you need true physical erasure (e.g. for GDPR), use db.purge(coll, id) which removes the version object and writes a purge tombstone that records the removal in the Merkle history.
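
To make the read semantics concrete, here is a toy in-memory model of tombstones and AS OF resolution. `VersionedStore` and `TOMBSTONE` are hypothetical names for illustration; the real engine stores versions as DAG objects, which this sketch does not attempt to model.

```python
TOMBSTONE = object()  # marker meaning "deleted at this sequence"

class VersionedStore:
    def __init__(self):
        self.seq = -1            # first write gets seq 0
        self._versions = {}      # key -> list of (seq, value), oldest first

    def _append(self, key, value):
        self.seq += 1
        self._versions.setdefault(key, []).append((self.seq, value))

    def put(self, key, doc):
        self._append(key, dict(doc))

    def delete(self, key):
        self._append(key, TOMBSTONE)   # history is kept; only a marker is added

    def get(self, key, as_of=None):
        # Newest version at or before as_of wins; a tombstone reads as None.
        for seq, value in reversed(self._versions.get(key, [])):
            if as_of is None or seq <= as_of:
                return None if value is TOMBSTONE else value
        return None

db = VersionedStore()
db.put("users:alice", {"name": "Alice", "status": "active"})
snap = db.seq
db.delete("users:alice")

print(db.get("users:alice"))               # None
print(db.get("users:alice", as_of=snap))   # {'name': 'Alice', 'status': 'active'}
```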

Replay protection still works. Nonce tracking and idempotency keys are stored as DAG objects too. Replaying a stale write raises ReplayError just as on v1; calling a write with the same idem key returns the original result and writes nothing new.

Environment variables

The v2 engine reads the same environment variables as v1, plus NEDBD_DAG to opt in:

Variable	Default	Description
`NEDBD_DAG`	`0`	Set to `1` to launch the v2 DAG engine (`nedbd-v2`). Equivalent to passing `--dag` on the command line.
`NEDBD_HOST`	`127.0.0.1`	Bind address. Defaults to loopback in v2.0.27 (was `0.0.0.0` previously) — security hardening fix. Set explicitly to `0.0.0.0` to expose the daemon on all interfaces.
`NEDBD_PORT`	`7070`	HTTP bind port. Unchanged from v1.
`NEDBD_TOKEN`	unset	Optional bearer token. If set, every `/v1/*` request must include `Authorization: Bearer <token>`. Unchanged from v1.
`NEDB_TMK`	unset	AES-256-GCM at-rest encryption key: 32 bytes, hex-encoded (64 hex characters). The DAG engine encrypts every object before writing it to `dag/`; the Merkle structure remains verifiable through encryption.
`NEDBD_DATA`	`./nedb-data`	Root directory. The DAG engine creates `dag/` for objects, the IdIndex sharded across 256 subdirectories, and a small `MANIFEST` file for the current `seq` + Merkle head.
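
One way to produce a suitable value for `NEDB_TMK` is Python's `secrets` module. This is a sketch under the assumption that any 64-character hex string encoding 32 random bytes is accepted; how the key is stored and handed to the daemon is up to your deployment.

```python
import secrets

tmk = secrets.token_hex(32)               # 32 random bytes rendered as hex
assert len(bytes.fromhex(tmk)) == 32      # decodes back to exactly 32 bytes
print(len(tmk))  # 64
```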

shell — full v2 launch

NEDBD_DAG=1 \
NEDBD_HOST=0.0.0.0 \
NEDBD_PORT=7070 \
NEDBD_DATA=/var/db/nedb \
NEDBD_TOKEN=my-secret \
NEDB_TMK=$(cat /run/secrets/nedb-tmk) \
nedbd

Benchmarks (v2.0.27)

Measured on an Intel iMac with 10,000 sequential writes, 100,000 reads, and a working set of ~30,000 unique DAG objects. nedbd-v2 was running with NEDBD_DAG=1 and AES-256-GCM enabled.

Operation	Throughput	p50	p99
Sequential writes	418 ops/s	2.3 ms	3.3 ms
Point-lookup reads	478 ops/s	2.0 ms	3.0 ms
ORDER BY queries	489 ops/s	1.8 ms	4.3 ms
Batch writes (500 ops/req)	1,104 ops/s	0.9 ms	1.2 ms
Tamper-verify (30k objects)	~21,000 BLAKE2b/s	—	1.38 s total

TCP_NODELAY matters. The axum listener sets TCP_NODELAY on every socket; without it, macOS loopback adds the Nagle algorithm's 40–200 ms delay on small writes — enough to drag p99 from 3.3 ms to 250 ms+.
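
The daemon sets this option on its own listener. For comparison, the same socket option can be set from a Python client with the standard `socket` module. This sketch only shows the option itself and does not connect anywhere.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Disable Nagle's algorithm so small writes are sent immediately.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print(bool(nodelay))
```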

Reproduce with the included benchmark:

NEDBD_DAG=1 nedbd --data /tmp/perf &
python3 tests/test_dag_perf.py --n 10000 --reads 100000

Introduction

NEDB is a versioned, time-traveling embedded database — replay-protected, idempotent, relational, and searchable. In v1 it is built on one hash-chained, nonce-enforced, append-only log; in v2 the same semantics sit on a content-addressed Merkle DAG. Both substrates expose the same API, the same NQL, and the same time-travel guarantees.

What makes it different: most databases store a snapshot of current state. NEDB stores the log of every operation and derives state from it. That one decision makes replay protection, crash recovery, time-travel reads, and on-chain provability all fall out for free — from the same structure.

Key properties

Replay-protected: every write carries a strictly-monotonic per-client nonce. Stale or duplicate ops are rejected.
Idempotent: pass an idem key and retries are no-ops — the original result is returned without re-executing.
MVCC time-travel: read the database exactly as it was at any past sequence — AS OF seq.
Relational: first-class graph edges with O(1) traversal — and the graph time-travels too.
Durable: NEDB(path) appends every op to disk (log.aof) and fsyncs it; the database reloads by replaying that log on open.
Provable: verify() rewalks the BLAKE2b hash chain; the head hash is a commitment to the entire history, anchorable on-chain.

Getting Started

Installation

NEDB ships as a universal Python package — one command, any platform, no compiler needed.

shell

pip install nedb-engine

Verify the install and start the server daemon:

python3 -c "import nedb; print(nedb.__version__)"
# 2.0.27

nedbd
# nedbd 2.0.27 — http://127.0.0.1:7070  data=./nedb-data  auth=off  engine=aof

nedbd --dag
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./nedb-data  auth=off  engine=dag

Python 3.8+ required. The package ships a universal py3-none-any wheel: no Rust toolchain, no native compilation. The optional Rust core (nedb._native) is an additive roadmap item; the pure-Python engine is the production baseline.

Quickstart

Up and running in under a minute — a durable database with indexes, relations, and time-travel.

Python

from nedb import NEDB

# Durable (persists to disk). Use NEDB() for in-memory.
db = NEDB("./mydata")

# Indexes power fast filtering, sorting, and full-text search
db.create_index("users", "status", "eq")
db.create_index("users", "age", "ordered")
db.create_index("users", "bio", "search")

# Write rows
db.put("users", "alice", {"name": "Alice", "age": 31, "status": "active", "bio": "systems engineer"})
db.put("users", "bob",   {"name": "Bob",   "age": 40, "status": "active", "bio": "database architect"})

# Idempotent write — safe to retry forever
db.put("orders", "o1", {"total": 42}, client="checkout", nonce=7, idem="charge-o1")

# Query with NQL
db.query('FROM users WHERE status = "active" ORDER BY age DESC')
db.query('FROM users SEARCH "systems"')

# Relations + traversal
db.link("users:alice", "follows", "users:bob")
db.q("users").where("_id", "=", "alice").traverse("follows").run()

# Time-travel
snap = db.seq
db.put("users", "alice", {"age": 32, "status": "active"})
db.get("users", "alice", as_of=snap)["age"]  # → 31

# Integrity
assert db.verify()              # hash chain intact
assert db.verify_determinism()  # state == replay(log)

db.close()

Core Concepts

The append-only log

Every mutation (put, delete, link, put_file) creates an Op that is appended to the OpLog. Each Op is chained to the previous one via a BLAKE2b hash — the head hash is a cryptographic commitment to the entire history. Nothing is ever rewritten or deleted from the log.

State is a pure function of the log. The MVCC store, relations, and indexes are materialized views — they are rebuilt by replaying the log, which means crash recovery and time-travel are free side effects of the same design.

Sequences and time-travel

Every Op gets a monotonically-increasing seq — an integer starting at 0. db.seq returns the current sequence number. Any read can be made time-traveling by passing as_of=seq: the engine replays or truncates the log to that point and returns the result that was true then.
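
The two ideas so far, a hash-chained log and state derived by replay, fit in a short sketch. The field names, the JSON serialization, and the genesis value are assumptions made for illustration; NEDB's real Op layout may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first op

def op_hash(prev_hash, body):
    payload = json.dumps(body, sort_keys=True).encode()
    return hashlib.blake2b(prev_hash.encode() + payload, digest_size=32).hexdigest()

def append(log, kind, key, value=None):
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "op": kind, "key": key, "value": value}
    log.append({**body, "prev_hash": prev, "hash": op_hash(prev, body)})

def verify(log):
    """Re-walk the chain: every op must link to its predecessor and hash correctly."""
    prev = GENESIS
    for i, op in enumerate(log):
        body = {k: op[k] for k in ("seq", "op", "key", "value")}
        if op["seq"] != i or op["prev_hash"] != prev or op["hash"] != op_hash(prev, body):
            return False
        prev = op["hash"]
    return True

def replay(log, as_of=None):
    """State is a fold over the log; stopping early gives a time-travel read."""
    state = {}
    for op in log:
        if as_of is not None and op["seq"] > as_of:
            break
        if op["op"] == "put":
            state[op["key"]] = op["value"]
        elif op["op"] == "delete":
            state.pop(op["key"], None)
    return state

log = []
append(log, "put", "users:alice", {"age": 31})
append(log, "put", "users:alice", {"age": 32})
append(log, "delete", "users:alice")

assert verify(log)
assert replay(log) == {}
assert replay(log, as_of=0) == {"users:alice": {"age": 31}}

log[1]["value"] = {"age": 99}   # tamper with history
assert not verify(log)
```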

Replay protection and idempotency

Each write takes a client identifier and a nonce. The nonce must strictly exceed the last nonce seen from that client — stale or repeated ops raise ReplayError. If you also pass an idem key, the very first successful write is recorded; subsequent calls with the same key return the original result and append nothing to the log.
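
The rules in this paragraph can be modelled with two small maps. `WriteGuard` is a hypothetical class, the `ReplayError` below is a local stand-in for the library's exception, and checking the idempotency key before the nonce is an assumption, chosen so that a retry carrying the same nonce and key returns the original result.

```python
class ReplayError(Exception):
    """Raised when a nonce does not exceed the last one seen for the client."""

class WriteGuard:
    def __init__(self):
        self._last_nonce = {}    # client -> highest accepted nonce
        self._idem_results = {}  # idem key -> result of the first successful write

    def submit(self, client, nonce, apply, idem=None):
        if idem is not None and idem in self._idem_results:
            return self._idem_results[idem]      # retry: nothing is re-executed
        last = self._last_nonce.get(client)
        if last is not None and nonce <= last:
            raise ReplayError(f"stale nonce {nonce} for client {client!r}")
        result = apply()
        self._last_nonce[client] = nonce
        if idem is not None:
            self._idem_results[idem] = result
        return result

guard = WriteGuard()
applied = []

def charge():
    applied.append("charge o1")
    return {"total": 42}

first = guard.submit("checkout", 7, charge, idem="charge-o1")
retry = guard.submit("checkout", 7, charge, idem="charge-o1")
assert retry is first and len(applied) == 1    # the retry appended nothing

try:
    guard.submit("checkout", 7, charge)        # same nonce, no idem key
except ReplayError as exc:
    print("rejected:", exc)
```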

Collections and keys

NEDB is schema-agnostic. A "collection" is just a string namespace prefix. Every document has an _id field (set automatically from the id argument if missing). Internal store keys are formatted as collection:id.

Relations

Edges are stored as (from, relation, to) triples — the "from" and "to" are full node keys in collection:id form. Relations also time-travel: neighbors(frm, rel, as_of=seq) returns the edges that existed at that sequence.

Python API

NEDB()

The main database object. Instantiate once; keep open for the lifetime of your application.

NEDB(path: str | None = None) → NEDB

Parameter	Type	Description
path	str | None	Directory path for durable storage. If `None` (default) the database is in-memory only and nothing is written to disk.

db = NEDB()              # in-memory
db = NEDB("./data")     # durable — creates ./data/log.aof + meta.json

# Use as a context manager for automatic close/flush:
with NEDB("./data") as db:
    db.put("k", "1", {"v": 1})

put()

Insert or replace a document. If a document with the same id already exists it is fully replaced. Returns the stored document (with _id set).

put(coll: str, id: str, doc: dict, *, client: str = "local", nonce: int | None = None, idem: str | None = None) → dict

Parameter	Type	Description
coll	str	Collection name (e.g. `"users"`)
id	str	Document identifier — unique within the collection
doc	dict	The document to store. `_id` is set automatically.
client	str	Client identifier for nonce tracking. Default: `"local"`
nonce	int | None	Monotonic write counter for this client. Auto-incremented if `None`.
idem	str | None	Idempotency key. If this key was already used, the original result is returned and nothing is appended.

# Simple write
doc = db.put("users", "alice", {"name": "Alice", "age": 31})

# Idempotent write — calling this 100 times = calling it once
db.put("orders", "o1", {"total": 99}, client="api", nonce=1, idem="order-o1-v1")

# Explicit nonce (replay-protected from a distributed service)
db.put("events", "e1", {"type": "click"}, client="tracker", nonce=42)

delete()

Remove a document. The deletion is appended to the log as a delete op, so history is preserved and the document remains visible at past sequences via AS OF.

delete(coll: str, id: str, *, client: str = "local", nonce: int | None = None, idem: str | None = None)

db.delete("users", "alice")

get()

Retrieve a single document by id. Returns None if the document does not exist (or did not exist at as_of).

get(coll: str, id: str, as_of: int | None = None) → dict | None

doc = db.get("users", "alice")          # current HEAD
old = db.get("users", "alice", as_of=3)  # as it was at seq 3

query()

Execute a NQL query string. Returns a list of matching documents. See the NQL Reference for the full grammar.

query(nql: str) → list[dict]

rows = db.query('FROM users WHERE status = "active" ORDER BY age DESC LIMIT 10')
found = db.query('FROM users SEARCH "engineer"')
old   = db.query('FROM users AS OF 5 WHERE age > 25')

create_index()

Create a secondary index on a collection field. Indexes are maintained incrementally on every write and dramatically speed up queries. Existing rows are backfilled immediately.

create_index(coll: str, field: str, kind: str = "eq")

kind	Type	Use for
`"eq"`	Hash map	Equality filters — `WHERE field = value`
`"ordered"`	Sorted list	Range queries and sorting — `ORDER BY field`, `WHERE field > n`
`"search"`	Inverted index	Full-text search — `SEARCH "term"`

db.create_index("users", "status", "eq")       # WHERE status = "active"
db.create_index("users", "created_at", "ordered")  # ORDER BY created_at DESC
db.create_index("users", "bio", "search")        # SEARCH "engineer"

Create indexes before seeding data when possible — but creating after works too; existing rows are backfilled. Index configuration is persisted in meta.json (durable mode) so indexes survive restarts.
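
The three index kinds correspond to three familiar data structures. The sketch below builds each one by hand over two rows to show what kind of lookup it enables; it is an illustration using plain Python containers, not NEDB's internal index code.

```python
import bisect
from collections import defaultdict

rows = {
    "alice": {"age": 31, "status": "active", "bio": "systems engineer"},
    "bob":   {"age": 40, "status": "active", "bio": "database architect"},
}

eq_index = defaultdict(set)       # "eq": value -> set of ids
ordered_index = []                # "ordered": sorted list of (value, id)
search_index = defaultdict(set)   # "search": term -> set of ids

for row_id, doc in rows.items():
    eq_index[doc["status"]].add(row_id)
    bisect.insort(ordered_index, (doc["age"], row_id))
    for term in doc["bio"].lower().split():
        search_index[term].add(row_id)

# WHERE status = "active": one hash lookup
active = eq_index["active"]

# WHERE age > 35: binary search, then read the tail of the sorted list
ages = [age for age, _ in ordered_index]
older = [row_id for _, row_id in ordered_index[bisect.bisect_right(ages, 35):]]

# SEARCH "systems engineer": intersect the posting sets (AND semantics)
matches = search_index["systems"] & search_index["engineer"]

print(sorted(active), older, sorted(matches))  # ['alice', 'bob'] ['bob'] ['alice']
```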

link() / unlink()

Create or remove a directed graph edge. Node keys are in collection:id format.

link(frm: str, rel: str, to: str, *, client: str = "local", nonce: int | None = None)

unlink(frm: str, rel: str, to: str, *, client: str = "local", nonce: int | None = None)

neighbors() / inbound()

Traverse edges from a node. Returns a list of node key strings (collection:id). Both support time-travel via as_of.

neighbors(frm: str, rel: str, as_of: int | None = None) → list[str]

inbound(to: str, rel: str, as_of: int | None = None) → list[str]

db.link("users:alice", "follows", "users:bob")
db.link("users:alice", "follows", "users:carol")

db.neighbors("users:alice", "follows")         # ["users:bob", "users:carol"]
db.inbound("users:bob", "follows")             # ["users:alice"]

snap = db.seq
db.unlink("users:alice", "follows", "users:bob")
db.neighbors("users:alice", "follows", as_of=snap)  # ["users:bob", "users:carol"] (time-travel)

put_file() / get_file()

Git-style versioned file storage with Cascade compression: content-defined chunking, content-addressed dedup, and two compression tiers ("warm" fast / "cold" archival). Every version has a Merkle root.

put_file(name: str, data: bytes, tier: str = "warm", ...) → int (version index)

get_file(name: str, version: int = -1, tier: str = "warm") → bytes

file_root(name: str, version: int = -1, tier: str = "warm") → str (Merkle root hex)

with open("notes.txt", "rb") as f:
    data = f.read()
new_data = data + b"\nrevised"                # stand-in for the edited file contents

v1 = db.put_file("notes.txt", data)           # warm tier (fast)
v2 = db.put_file("notes.txt", new_data, tier="cold")  # cold tier (max compression)

db.get_file("notes.txt", v1)                  # original bytes back
db.file_root("notes.txt", v1)                 # Merkle root, anchorable on ITC
db.compression_stats("warm")                  # {"ratio": 39.9, "dedup_hits": 20, ...}
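
Content-defined chunking and content-addressed dedup, the first two ingredients named above, can be sketched as follows. This is a generic illustration using a simple gear-style rolling hash; the boundary mask, minimum chunk size, and all function names are assumptions, and neither the Cascade format nor its compression tiers are modelled.

```python
import hashlib

# Per-byte pseudo-random values for the rolling hash, derived deterministically.
GEAR = [int.from_bytes(hashlib.blake2b(bytes([i]), digest_size=4).digest(), "big")
        for i in range(256)]

def split_chunks(data: bytes, mask: int = 0x3F, min_size: int = 16):
    """Cut data where a rolling hash of recent bytes matches a boundary pattern."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (h & mask) == 0:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]

chunk_store = {}  # digest -> chunk bytes, shared by every file version

def store_version(data: bytes):
    """Store a version as a list of chunk digests; return (recipe, new_chunk_count)."""
    recipe, new = [], 0
    for chunk in split_chunks(data):
        key = hashlib.blake2b(chunk, digest_size=32).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = chunk
            new += 1
        recipe.append(key)
    return recipe, new

def load_version(recipe):
    return b"".join(chunk_store[key] for key in recipe)

data = b"".join(b"line %d of notes.txt\n" % n for n in range(200))
recipe1, new1 = store_version(data)
recipe2, new2 = store_version(data)      # identical content: nothing new is stored
assert load_version(recipe1) == data and new2 == 0
```

Because boundaries depend on the bytes themselves rather than on fixed offsets, an edit near the start of a file tends to leave later chunks unchanged, which is what lets successive versions share storage.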

verify() / verify_determinism()

verify() → bool

Rewalk the entire BLAKE2b hash chain and confirm no op has been modified, reordered, or deleted. Returns True if the chain is intact. Run this after loading from disk to confirm persistence was not corrupted.

verify_determinism() → bool

Replay the log from scratch into fresh state and compare the result with the current materialized state. Returns True if they match — proving that state is a pure function of the log.

Properties

Property	Type	Description
`db.seq`	int	Current sequence number (number of ops − 1)
`db.head`	str	Current chain head — BLAKE2b hex digest of the entire history

Lifecycle

flush() / close()

flush() forces an fsync without closing. close() flushes and closes the AOF file handle. Always call close() (or use the context manager) before exiting in durable mode.

Fluent Builder — q()

An alternative to raw NQL strings when you want to compose queries programmatically.

q(coll: str) → Query

Returns a Query builder. Chain methods and call .run() to execute.

Method	NQL equivalent
`.where(field, op, value)`	`WHERE field op value`
`.as_of(seq)`	`AS OF seq`
`.search(text)`	`SEARCH "text"`
`.order_by(field, "DESC")`	`ORDER BY field DESC`
`.traverse(rel)`	`TRAVERSE rel`
`.limit(n)`	`LIMIT n`
`.run()`	(executes and returns `list[dict]`)

results = (
    db.q("users")
      .where("status", "=", "active")
      .where("age", ">=", 25)
      .order_by("age", "DESC")
      .limit(10)
      .run()
)
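
To make the method-to-clause mapping in the table concrete, here is a toy builder that renders an NQL string. `NQLBuilder` and `to_nql` are hypothetical: the real `q()` builder compiles to a query plan and may never produce text, so treat this only as an illustration of how the chained calls line up with NQL clauses.

```python
import json

class NQLBuilder:
    def __init__(self, coll):
        self._coll = coll
        self._where = []
        self._order = None
        self._limit = None

    def where(self, field, op, value):
        # json.dumps yields double-quoted strings and bare numbers, like NQL literals.
        self._where.append(f"{field} {op} {json.dumps(value)}")
        return self

    def order_by(self, field, direction="ASC"):
        self._order = f"ORDER BY {field} {direction}"
        return self

    def limit(self, n):
        self._limit = f"LIMIT {int(n)}"
        return self

    def to_nql(self):
        parts = [f"FROM {self._coll}"]
        if self._where:
            parts.append("WHERE " + " AND ".join(self._where))
        if self._order:
            parts.append(self._order)
        if self._limit:
            parts.append(self._limit)
        return " ".join(parts)

text = (
    NQLBuilder("users")
    .where("status", "=", "active")
    .where("age", ">=", 25)
    .order_by("age", "DESC")
    .limit(10)
    .to_nql()
)
print(text)
# FROM users WHERE status = "active" AND age >= 25 ORDER BY age DESC LIMIT 10
```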

Query Language

NQL Reference

NQL — the NEDB Query Language — is a small, readable query syntax. One grammar, one parser; the query() method and the q() builder both compile to the same plan.

Full grammar

FROM <collection>
  [ AS OF <seq> ]
  [ WHERE <field> <op> <value> ( AND <field> <op> <value> )* ]
  [ SEARCH "<text>" ]
  [ ORDER BY <field> [ ASC | DESC ] ]
  [ TRAVERSE <relation> ]
  [ LIMIT <n> ]

op ∈ = != < <= > >=
String values use double quotes. Numbers are unquoted.

FROM

Required. Specifies the collection to query.

db.query('FROM users')                     # all users
db.query('FROM orders LIMIT 20')           # first 20 orders

WHERE

Filter rows. Multiple conditions are combined with AND. All six comparison operators are supported. Equality filters use indexed lookups when an eq index exists.

db.query('FROM users WHERE status = "active"')
db.query('FROM users WHERE age >= 25 AND status = "active"')
db.query('FROM orders WHERE total > 100 AND status != "cancelled"')

Index hint: equality conditions on indexed fields (eq index) skip the collection scan entirely. Create an eq index on any field you filter by frequently.

SEARCH

Full-text search across all fields that have a search index. All terms in the query string must appear (AND semantics). Without an index, NEDB falls back to a linear scan — still correct, just slower.

db.create_index("users", "bio", "search")

db.query('FROM users SEARCH "rust"')
db.query('FROM users SEARCH "systems engineer"')  # both terms must match

ORDER BY

Sort results by a field. Default direction is ASC. An ordered index makes sorting faster but is not required.

db.query('FROM users ORDER BY age ASC')
db.query('FROM users ORDER BY created_at DESC LIMIT 10')
db.query('FROM orders WHERE status = "paid" ORDER BY total DESC')

TRAVERSE

Follow a relation from the result set. First filters/sorts rows in the FROM collection, then follows the named edge to the target collection. Returns documents from the target collection.

# Who does alice follow?
db.query('FROM users WHERE _id = "alice" TRAVERSE follows')

# All work orders for active projects
db.query('FROM projects WHERE status = "active" TRAVERSE contains')

AS OF

Time-travel read. Execute the query against the database as it existed at the given sequence number. Works with all other clauses.

snap = db.seq
db.put("users", "alice", {"age": 32})

# alice is 31 at the snapshot, 32 at HEAD
db.query(f'FROM users AS OF {snap} WHERE _id = "alice"')
db.query('FROM users AS OF 0')   # empty — nothing existed at seq 0

LIMIT

Truncate the result set. Applied after all other clauses (WHERE, SEARCH, ORDER BY, TRAVERSE).

db.query('FROM users ORDER BY created_at DESC LIMIT 5')
db.query('FROM users WHERE status = "active" LIMIT 100')

Persistence

Durable Mode

Pass a directory path to NEDB() to make the database durable. Every op is appended to disk immediately — NEDB uses the same model as Redis AOF persistence.

Python

# Session 1 — write data
db = NEDB("./mydata")
db.create_index("users", "status", "eq")
db.put("users", "alice", {"name": "Alice", "status": "active"})
db.close()  # flush + fsync

# Session 2 — reopen (replays log.aof, rebuilds state)
db = NEDB("./mydata")
assert db.verify()                            # chain intact across the restart
assert db.get("users", "alice")["name"] == "Alice"

Always call db.close() (or use the context manager) before exiting. Unflushed writes may still sit in an OS buffer and can be lost after an unclean shutdown. close() calls fsync() so the AOF is fully durable.

What gets written to disk

Two files are created in the data directory:

File	Contents
`log.aof`	One JSON line per op, in append order. Contains the full Op including seq, client, nonce, op type, payload, timestamp, prev_hash, and hash. Never rewritten — only appended.
`meta.json`	The index configuration: list of `[coll, field, kind]` tuples. Updated on `create_index()`. Loaded first on open so indexes are rebuilt during log replay.

On open, NEDB:

Loads meta.json and registers the index configuration.
Reads log.aof line by line and folds each Op into the MVCC store, relations, and indexes.
Restores the nonce counters and idempotency map from the ops themselves.
Opens log.aof in append mode — all new ops go to the end.

The hash chain is preserved verbatim (hashes are not recomputed on load), so verify() and the head commitment survive restarts exactly.
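
The open sequence above amounts to a single pass over a JSON-lines file. The sketch below folds such a file into state and nonce counters. The key names (`op`, `coll`, `id`, `doc`) are assumptions for illustration, not the actual `log.aof` schema, and index rebuilding and hash checks are left out.

```python
import json
import os
import tempfile

def open_log(path):
    """Fold a JSON-lines op log into (state, last_nonce_by_client, next_seq)."""
    state, nonces, next_seq = {}, {}, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            op = json.loads(line)
            key = f'{op["coll"]}:{op["id"]}'
            if op["op"] == "put":
                state[key] = op["doc"]
            elif op["op"] == "delete":
                state.pop(key, None)
            # Nonce counters are restored from the ops themselves.
            nonces[op["client"]] = max(nonces.get(op["client"], op["nonce"]), op["nonce"])
            next_seq = op["seq"] + 1
    return state, nonces, next_seq

ops = [
    {"seq": 0, "client": "local", "nonce": 1, "op": "put",
     "coll": "users", "id": "alice", "doc": {"name": "Alice"}},
    {"seq": 1, "client": "local", "nonce": 2, "op": "put",
     "coll": "users", "id": "bob", "doc": {"name": "Bob"}},
    {"seq": 2, "client": "local", "nonce": 3, "op": "delete",
     "coll": "users", "id": "bob", "doc": None},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "log.aof")
    with open(path, "w", encoding="utf-8") as f:
        for op in ops:
            f.write(json.dumps(op) + "\n")
    state, nonces, next_seq = open_log(path)

print(state, nonces, next_seq)
# {'users:alice': {'name': 'Alice'}} {'local': 3} 3
```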

Server

nedbd — Server Daemon

Run NEDB as a long-lived process and connect clients over HTTP — the way you'd run Redis or Postgres. Each named database is a durable NEDB(path) held open in memory.

shell

nedbd
# nedbd 2.0.27 — http://127.0.0.1:7070  data=./nedb-data  auth=off  engine=aof

# v2 DAG engine — pass --dag (or set NEDBD_DAG=1)
nedbd --dag --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag

Environment variables

Variable	Default	Description
`NEDBD_DAG`	`0`	Set to `1` to launch the v2 DAG engine (`nedbd-v2`). Equivalent to passing `--dag`.
`NEDBD_HOST`	`127.0.0.1`	Bind address. As of v2.0.27 the default is loopback (was `0.0.0.0`) — a security-hardening fix. Set explicitly to `0.0.0.0` to expose on all interfaces.
`NEDBD_PORT`	`7070`	Bind port.
`NEDBD_DATA`	`./nedb-data`	Root directory for database files. Each database is a subdirectory.
`NEDBD_TOKEN`	unset	Optional bearer token. If set, every `/v1/*` request must include `Authorization: Bearer <token>`.
`NEDB_TMK`	unset	AES-256-GCM at-rest encryption key: 32 bytes, hex-encoded (64 hex characters). Encrypts AOF (v1) or DAG objects (v2).

shell — custom config (v1 AOF engine)

NEDBD_PORT=9000 NEDBD_DATA=/var/db/nedb NEDBD_TOKEN=my-secret nedbd

shell — v2 DAG engine with encryption

NEDBD_DAG=1 NEDB_TMK=<hex> nedbd --data ./data
# nedbd-v2 2.0.27 — http://127.0.0.1:7070  data=./data  engine=dag  enc=on

HTTP Routes

All request/response bodies are JSON. Authentication (if NEDBD_TOKEN is set) uses Authorization: Bearer <token> on every /v1/* request.

Route	Description
GET `/health`	Ping. Returns `{"ok": true, "version": "2.0.27", "engine": "aof" | "dag", "startup_ready": true|false, "databases": [...]}`. No auth required.
GET `/events`	New in v2.0.27. Server-Sent Events stream. Emits cold-scan progress (`event: scan`), the `event: ready` transition when `startup_ready` flips, and one `event: write` per mutation thereafter (each carrying `seq`, `coll`, and the updated Merkle `head`). Auth-gated like `/v1/*`. `curl http://127.0.0.1:7070/events`
GET `/v1/databases`	List all databases with summary (name, seq, head, rows, collections).
POST `/v1/databases`	Create a database. Body: `{"name": "shop", "init": {...}}`. `init` is optional; see the init payload below.
GET `/v1/databases/:name`	Full detail: collections, indexes, seq, head, integrity check, recent log.
DELETE `/v1/databases/:name`	Drop the database and delete its files. Irreversible.
POST `/v1/databases/:name/query`	Run NQL. Body: `{"nql": "FROM users LIMIT 10"}`. Returns `{"rows": [...], "count": N, "seq": N, "head": "..."}`
POST `/v1/databases/:name/put`	Insert or replace a row. Body: `{"coll": "users", "id": "u1", "doc": {...}, "client"?, "nonce"?, "idem"?}`
POST `/v1/databases/:name/index`	Add an index. Body: `{"coll": "users", "field": "status", "kind": "eq"}`
POST `/v1/databases/:name/link`	Create a graph edge. Body: `{"frm": "users:u1", "rel": "follows", "to": "users:u2"}`
DELETE `/v1/databases/:name/rows/:coll/:id`	Delete a row by collection and id. Appended to the log.
GET `/v1/databases/:name/verify`	Re-walk the hash chain. Returns `{"ok": true, "seq": N, "head": "..."}`
GET `/v1/databases/:name/log?limit=N`	Recent log entries (newest first). Default limit: 50.
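
From Python, these routes can be called with nothing beyond the standard library. The sketch below only builds the request, so it runs without a daemon. `build_request` is a hypothetical helper, and the database name `myapp` and token `my-secret` are the placeholder values used elsewhere on this page.

```python
import json
import urllib.request

def build_request(base, path, body=None, token=None, method=None):
    """Build a JSON request for nedbd, adding the bearer token when one is given."""
    headers = {}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(base + path, data=data, headers=headers, method=method)

req = build_request(
    "http://127.0.0.1:7070",
    "/v1/databases/myapp/query",
    body={"nql": "FROM users LIMIT 10"},
    token="my-secret",
)
print(req.get_method(), req.full_url)
# POST http://127.0.0.1:7070/v1/databases/myapp/query

# Against a running daemon, send it with:
#   with urllib.request.urlopen(req) as resp:
#       rows = json.load(resp)["rows"]
```

Routes that use DELETE need the method passed explicitly, for example `build_request(base, "/v1/databases/myapp", method="DELETE")`.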

The init payload

When creating a database you can seed it in one call:

{
  "name": "shop",
  "init": {
    "indexes": [
      ["users", "status", "eq"],
      ["orders", "total", "ordered"]
    ],
    "seed": {
      "users": [{"id": "u1", "name": "Ada", "status": "active"}]
    },
    "links": [
      ["users:u1", "placed", "orders:o1"]
    ]
  }
}
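
The payload above can also be assembled programmatically. The following is a minimal sketch, assuming only the field names shown in the JSON example; the helper name make_create_body is hypothetical and not part of the NEDB client API.

```python
import json


def make_create_body(name, indexes=(), seed=None, links=()):
    """Assemble a POST /v1/databases body in the shape shown above."""
    body = {"name": name}
    init = {}
    if indexes:
        init["indexes"] = [list(ix) for ix in indexes]    # [coll, field, kind]
    if seed:
        init["seed"] = {coll: list(rows) for coll, rows in seed.items()}
    if links:
        init["links"] = [list(ln) for ln in links]        # [from, rel, to]
    if init:                                              # "init" is optional
        body["init"] = init
    return body


body = make_create_body(
    "shop",
    indexes=[("users", "status", "eq"), ("orders", "total", "ordered")],
    seed={"users": [{"id": "u1", "name": "Ada", "status": "active"}]},
    links=[("users:u1", "placed", "orders:o1")],
)
print(json.dumps(body, indent=2))
```

Omitting every optional argument yields a body with only "name", which matches the plain create call in the curl examples.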

curl Examples

Create a database

curl -X POST http://localhost:7070/v1/databases \
  -H 'Content-Type: application/json' \
  -d '{"name":"myapp"}'

Query

curl -X POST http://localhost:7070/v1/databases/myapp/query \
  -H 'Content-Type: application/json' \
  -d '{"nql":"FROM users WHERE status = \"active\" ORDER BY age DESC LIMIT 5"}'

Insert a row

curl -X POST http://localhost:7070/v1/databases/myapp/put \
  -H 'Content-Type: application/json' \
  -d '{"coll":"users","id":"u1","doc":{"name":"Ada","status":"active","age":31}}'

Verify integrity

curl http://localhost:7070/v1/databases/myapp/verify
# {"ok": true, "seq": 1, "head": "a3f..."}
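
To illustrate what re-walking a hash chain means, here is a toy sketch. It assumes each entry's head is the BLAKE2b hash of the previous head plus a canonical JSON encoding of the op; the actual NEDB entry format, genesis value, and failure response are not specified in this document, so GENESIS, next_head, and the {"ok": False, ...} shape are illustrative assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64   # assumed genesis head, for this sketch only


def next_head(prev_head, op):
    # Canonical JSON so the same op always hashes to the same bytes
    payload = json.dumps(op, sort_keys=True, separators=(",", ":"))
    data = (prev_head + payload).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def append(log, op):
    prev = log[-1]["head"] if log else GENESIS
    log.append({"seq": len(log) + 1, "op": op, "head": next_head(prev, op)})


def verify(log):
    """Recompute every head from genesis and compare with the stored one."""
    head = GENESIS
    for expected_seq, entry in enumerate(log, start=1):
        head = next_head(head, entry["op"])
        if entry["seq"] != expected_seq or entry["head"] != head:
            return {"ok": False, "seq": expected_seq}   # first bad entry
    return {"ok": True, "seq": len(log), "head": head}


log = []
append(log, {"put": ["users", "u1", {"name": "Ada"}]})
append(log, {"put": ["users", "u2", {"name": "Bo"}]})
result = verify(log)
```

Because each head depends on the one before it, editing any earlier op changes every later head, which is why a single walk from genesis is enough to detect tampering.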

With bearer auth

curl http://localhost:7070/v1/databases \
  -H 'Authorization: Bearer my-secret'
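
The same bearer header can be attached from Python's standard library. This is a minimal sketch that only builds the request object without sending it; build_request is a hypothetical helper name, and the base URL is the default address shown in the boot log.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:7070"   # default address from the boot log


def build_request(path, body=None, token=None, method=None, base=BASE):
    """Build (but do not send) a JSON request for a nedbd route."""
    headers = {}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        # Only needed when NEDBD_TOKEN is set on the server
        headers["Authorization"] = f"Bearer {token}"
    if method is None:
        method = "POST" if body is not None else "GET"
    return urllib.request.Request(base + path, data=data,
                                  headers=headers, method=method)


req = build_request("/v1/databases/myapp/query",
                    body={"nql": "FROM users LIMIT 10"},
                    token="my-secret")
# To send it: urllib.request.urlopen(req, timeout=5)
```

Passing method="DELETE" covers the drop-database and delete-row routes with the same helper.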

Tail the SSE event stream (new in v2.0.27)

curl http://127.0.0.1:7070/events
# event: scan
# data: {"objects":730000,"of":1310703,"rate":21043,"eta_s":28}
# event: ready
# data: {"seq":1310703,"head":"b2:9c14e07a…"}
# event: write
# data: {"seq":1310704,"coll":"beliefs","head":"b2:7af3c11e…"}
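
A client consuming this stream needs to split it into events. The sketch below parses Server-Sent Events text lines into (event, data) pairs; it assumes every data payload is JSON, as in the sample output above, and that events are separated by a blank line as in standard SSE. The function name parse_sse is hypothetical.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                       # blank line ends one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):           # comment / keep-alive line
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
    if data:                                 # stream ended without a blank line
        yield event, json.loads("\n".join(data))


sample = (
    'event: scan\n'
    'data: {"objects":730000,"of":1310703,"rate":21043,"eta_s":28}\n'
    '\n'
    'event: write\n'
    'data: {"seq":1310704,"coll":"beliefs","head":"b2:7af3c11e…"}\n'
    '\n'
)
events = list(parse_sse(sample.splitlines()))
```

In a live client the lines would come from a streaming HTTP response rather than a string; the parsing logic stays the same.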

Clients

Node / JavaScript

Connect to a running nedbd instance over HTTP. No native addon required — pure fetch.

JavaScript (Node 18+ / browser)

const BASE = 'http://127.0.0.1:7070'

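async function query(db, nql) {
  const res = await fetch(`${BASE}/v1/databases/${db}/query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ nql })
  })
  // Surface HTTP errors instead of returning undefined rows
  if (!res.ok) throw new Error(`query failed: HTTP ${res.status}`)
  return (await res.json()).rows
}

async function put(db, coll, id, doc) {
  const res = await fetch(`${BASE}/v1/databases/${db}/put`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ coll, id, doc })
  })
  // Writes can return 503 while the cold scan is still running
  if (!res.ok) throw new Error(`put failed: HTTP ${res.status}`)
  return res.json()
}

// Usage (top-level await needs an ES module; otherwise wrap in an async function)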
const users = await query('myapp', 'FROM users WHERE status = "active"')
await put('myapp', 'users', 'u2', { name: 'Bo', status: 'active', age: 28 })

Python requests

Python — HTTP client (no import nedb needed)

import requests

BASE = "http://127.0.0.1:7070"

rows = requests.post(
    f"{BASE}/v1/databases/myapp/query",
    json={"nql": 'FROM users WHERE status = "active" ORDER BY age DESC'}
).json()["rows"]

requests.post(
    f"{BASE}/v1/databases/myapp/put",
    json={"coll": "users", "id": "u3", "doc": {"name": "Cy", "status": "active"}}
)

Adapters

SQL Compatibility

Translate standard SQL to NQL and NEDB operations — no MySQL or MariaDB code involved. SQL is a familiar entry point; the NEDB engine executes everything natively.

Python

from nedb import NEDB
from nedb.sql import sql_exec, sql_to_nql

db = NEDB("./data")

# SELECT → NQL query
sql_exec(db, "SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 10")
sql_exec(db, "SELECT * FROM users WHERE bio LIKE '%rust%'")       # LIKE → SEARCH
sql_exec(db, f"SELECT * FROM orders AS OF {snap}")               # NEDB extension: time-travel; snap is a seq captured earlier

# INSERT → db.put()
sql_exec(db, "INSERT INTO users (id, name, age, status) VALUES ('u1', 'Ada', 31, 'active')")

# UPDATE → fetch + merge + db.put()  (all other fields preserved)
sql_exec(db, "UPDATE users SET age = 32 WHERE id = 'u1'")

# DELETE → db.delete()
sql_exec(db, "DELETE FROM users WHERE id = 'u1'")

# See what NQL a SELECT compiles to:
sql_to_nql("SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 5")
# → 'FROM users WHERE status = "active" ORDER BY age DESC LIMIT 5'

Supported SQL

Statement	Mapped to	Notes
`SELECT *`	NQL `FROM … WHERE … ORDER BY … LIMIT`	All columns returned (no projection yet)
`WHERE field = 'v'`	`WHERE field = "v"`	All six ops: `= != < <= > >=`
`WHERE field LIKE '%x%'`	`SEARCH "x"`	Best-effort; requires a `search` index
`AS OF n`	`AS OF n`	NEDB extension — time-travel in SQL
`INSERT INTO t (id, …) VALUES (…)`	`db.put()`	An `id` or `_id` column is required
`UPDATE t SET f=v WHERE id = x`	fetch + merge + `db.put()`	Only id-targeted updates; all other fields preserved
`DELETE FROM t WHERE id = x`	`db.delete()`	Only id-targeted deletes

OR, JOIN, subqueries, and aggregates raise SQLUnsupportedError with a clear message. They are on the roadmap.
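
To make the SELECT mapping in the table concrete, here is a toy translator for the simplest form only: SELECT * with at most one comparison, an optional ORDER BY, and an optional LIMIT. It is a regex sketch written for illustration, not the adapter's implementation; toy_sql_to_nql is a hypothetical name, and anything it does not recognise raises ValueError rather than SQLUnsupportedError.

```python
import re

_SELECT = re.compile(
    r"^\s*SELECT\s+\*\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<field>\w+)\s*(?P<op>!=|<=|>=|=|<|>)\s*"
    r"(?P<value>'[^']*'|-?\d+(?:\.\d+)?))?"
    r"(?:\s+ORDER\s+BY\s+(?P<order>\w+)(?:\s+(?P<dir>ASC|DESC))?)?"
    r"(?:\s+LIMIT\s+(?P<limit>\d+))?\s*;?\s*$",
    re.IGNORECASE,
)


def toy_sql_to_nql(sql):
    m = _SELECT.match(sql)
    if not m:
        raise ValueError("unsupported SQL for this sketch")
    parts = [f"FROM {m['table']}"]
    if m["field"]:
        value = m["value"]
        if value.startswith("'"):
            # SQL single-quoted string -> NQL double-quoted string
            # (values containing a double quote are not handled here)
            value = '"' + value[1:-1] + '"'
        parts.append(f"WHERE {m['field']} {m['op']} {value}")
    if m["order"]:
        direction = f" {m['dir'].upper()}" if m["dir"] else ""
        parts.append(f"ORDER BY {m['order']}{direction}")
    if m["limit"]:
        parts.append(f"LIMIT {m['limit']}")
    return " ".join(parts)


nql = toy_sql_to_nql(
    "SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 5"
)
# nql == 'FROM users WHERE status = "active" ORDER BY age DESC LIMIT 5'
```

A JOIN or a projection list fails the pattern and is rejected, which mirrors the adapter's choice to refuse unsupported statements instead of guessing.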

Adapters

Redis Compatibility

Map Redis commands to NEDB primitives — no hiredis or Redis server code involved. Redis keys become NEDB collections; the engine handles persistence, integrity, and time-travel automatically.

Python

from nedb import NEDB
from nedb.redis_compat import RedisCompat

r = RedisCompat(NEDB("./data"))

# Strings
r.execute("SET", "k", "hello")           # → "OK"
r.execute("GET", "k")                    # → "hello"
r.execute("INCR", "counter")             # → 1
r.execute("MSET", "a", "1", "b", "2")   # → "OK"

# Hashes  (Redis key names with colons work: "user:1")
r.execute("HSET", "user:1", "name", "Ada", "age", "31")
r.execute("HGET", "user:1", "name")     # → "Ada"
r.execute("HGETALL", "user:1")          # → {"name": "Ada", "age": "31"}

# Sets
r.execute("SADD", "tags", "python", "rust")
r.execute("SMEMBERS", "tags")           # → {"python", "rust"}
r.execute("SISMEMBER", "tags", "rust") # → 1

# Lists
r.execute("RPUSH", "q", "a", "b", "c")
r.execute("LRANGE", "q", 0, -1)         # → ["a", "b", "c"]
r.execute("LPOP", "q")                  # → "a"

# Unsupported — clear error + roadmap link
r.execute("EXPIRE", "k", 60)            # → RedisUnsupportedError

Command coverage

Group	Commands
Strings	`SET GET DEL EXISTS INCR INCRBY DECR DECRBY MSET MGET SETNX GETDEL APPEND STRLEN TYPE RENAME KEYS DBSIZE FLUSHDB`
Hashes	`HSET HMSET HSETNX HGET HMGET HGETALL HDEL HEXISTS HKEYS HVALS HLEN HINCRBY`
Sets	`SADD SMEMBERS SISMEMBER SREM SCARD SUNION SINTER SDIFF`
Lists	`LPUSH RPUSH LRANGE LLEN LINDEX LSET LPOP RPOP`
Unsupported (roadmap)	`EXPIRE TTL PEXPIRE PTTL PERSIST · SUBSCRIBE PUBLISH · MULTI EXEC DISCARD WATCH`

Key encoding: Redis key names that contain characters invalid in NQL collection names (e.g. user:1) are stored using a safe hex encoding (user__3a__1). This is transparent — you always use the original key name in commands.
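
The encoding can be pictured with a short sketch. It assumes that letters, digits, and underscores are the characters valid in a collection name and that every other byte becomes __xx__ with xx its lowercase hex value, which reproduces the user:1 example above. The function names are hypothetical, and the adapter's real scheme may differ (for instance, this sketch does not protect a key that already contains a literal __3a__).

```python
import re

_SAFE = re.compile(r"[A-Za-z0-9_]")        # assumed valid collection-name characters
_ESCAPE = re.compile(r"__([0-9a-f]{2})__")


def encode_key(key):
    """Replace every byte outside the safe set with __<hex>__."""
    out = []
    for byte in key.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if _SAFE.fullmatch(ch) else f"__{byte:02x}__")
    return "".join(out)


def decode_key(name):
    """Invert encode_key: turn each __<hex>__ back into its byte."""
    raw = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), name)
    return raw.encode("latin-1").decode("utf-8")


encoded = encode_key("user:1")      # 'user__3a__1'
original = decode_key(encoded)      # 'user:1'
```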

v1.x features

wrap_redis() — Redis Layer-2

Already running on Redis? Wrap your connection in one line and add NEDB's full feature set alongside your existing app — zero migration required, zero impact on your existing keys.

Two surfaces coexist on the same connection:

Surface 1 — every Redis command (r.set, r.hset, r.get, …) passes through to Redis unchanged. Alice's app code doesn't change.
Surface 2 — r.nedb.* exposes the full NEDB API: NQL queries, time-travel, causal provenance, hash chain verification.

Python — one-line wrap

import redis, json
from nedb import wrap_redis

# ONE LINE — Alice's app code doesn't change
r = wrap_redis(redis.Redis("localhost", 6379), db_name="rideshare")

# Surface 1 — existing Redis commands, unchanged
r.set("driver:d1", json.dumps({"name": "Bob", "status": "active"}))
r.get("driver:d1")     # → b'{"name": "Bob", ...}'
r.hset("trip:t1", mapping={"status": "requested"})

# Surface 2 — new NEDB features on the same connection
r.nedb.put("driver", "d1", {"name": "Bob", "status": "active"})
r.nedb.query('FROM driver WHERE status = "active"')
r.nedb.verify()    # → True

Isolation guarantee

NEDB never writes to Alice's namespace. It owns only keys under the nedb:{db_name}: prefix:

Key	Type	Purpose
`nedb:{db_name}:oplog`	Redis Stream	Append-only op log (in-process mode)
`nedb:{db_name}:snapshot`	Redis Hash	Checkpoint
`nedb:{db_name}:meta`	Redis Hash	Index configuration

wrap_redis() signature

Python

wrap_redis(
    r,                         # redis.Redis (or compatible) connection
    db_name: str = "default",  # logical name; NEDB uses nedb:{db_name}:*
    nedbd_url: str = None,     # route r.nedb.* to a nedbd server (see below)
    nedbd_token: str = None,   # bearer token for nedbd auth (optional)
) → WrappedRedis

Local testing — no Redis server needed: use fakeredis as a drop-in. See examples/fakeredis_demo.py in the repo.

v1.x features

Backfill — import existing Redis data

A one-time SCAN imports Alice's existing Redis keys into NEDB's hash chain. After backfill, the imported data is queryable via NQL, covered by time-travel from that point on, and verifiable.

Three-step migration

Python

# Step 1 — register: map Redis key globs to NEDB collections (chainable)
(r.nedb
 .register("driver:*", collection="driver", value_parser=json.loads)
 .register("trip:*",   collection="trip",   value_type="hash")
)

# Step 2 — backfill: import all existing Redis data once
imported = r.nedb.backfill()     # → int (number of keys imported)

# Step 3 — shadow: all future surface-1 writes auto-chain
r.nedb.shadow_writes = True

register() — collection mapping

Param	Type	Description
`pattern`	str	Redis key glob, e.g. `"driver:*"`
`collection`	str	NEDB collection name
`id_extractor`	callable	`fn(key) → id`. Default: `key.rsplit(":", 1)[-1]`
`value_parser`	callable	`fn(raw) → dict`. Default: JSON decode, fallback to `{"_v": raw}`
`value_type`	str	`"string"` (default) · `"hash"` · `"json"`

Returns self for chaining.
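
The defaults in the table can be written out as plain functions. This is a sketch based only on the descriptions above; the function names and the route_key helper are hypothetical, wrapping non-object JSON (such as a bare number) in {"_v": raw} is an assumption, and Python's fnmatch is used as a stand-in for Redis glob matching, which differs in some edge cases.

```python
import fnmatch
import json


def default_id_extractor(key):
    # "driver:d1" -> "d1"; a key without a colon is returned unchanged
    return key.rsplit(":", 1)[-1]


def default_value_parser(raw):
    """JSON-decode the value; fall back to {"_v": raw} if that fails."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):
        return {"_v": raw}
    # Assumption: documents must be dicts, so wrap scalars and lists too
    return value if isinstance(value, dict) else {"_v": raw}


def route_key(key, mappings):
    """Return (collection, id) for the first matching glob, else None."""
    for pattern, collection in mappings:
        if fnmatch.fnmatchcase(key, pattern):
            return collection, default_id_extractor(key)
    return None


mappings = [("driver:*", "driver"), ("trip:*", "trip")]
target = route_key("trip:t1", mappings)    # ('trip', 't1')
```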

backfill() — one-pass SCAN import

Python

# Backfill all registered patterns
imported = r.nedb.backfill()

# Or backfill a single pattern directly (no prior register needed)
imported = r.nedb.backfill("zone:*", "zone", value_parser=json.loads)

Every imported record gets _source: "backfill" and _evidence: "backfill" in the hash chain entry. Returns total keys imported (int).

v1.x features

Write Shadowing

Set r.nedb.shadow_writes = True and every surface-1 write is silently mirrored into NEDB's hash chain. Alice's app writes to Redis normally; NEDB captures the full write history without any code changes.

Python

r.nedb.shadow_writes = True

# Alice's code — unchanged
r.set("driver:d1", json.dumps({"name": "Bob", "status": "active"}))
r.hset("trip:t1", mapping={"status": "en_route", "driver_id": "d1"})

# → both writes auto-chained into NEDB
r.nedb.get("driver", "d1")  # → {"name": "Bob", "status": "active", "_source": "shadow"}

# HSET merges with existing NEDB doc
r.nedb.get("trip", "t1")   # → {"status": "en_route", "driver_id": "d1", "rider_id": "u1", ...}

# Time-travel through shadowed writes
snap = r.nedb.seq
r.set("driver:d1", json.dumps({"name": "Bob", "status": "offline"}))
r.nedb.get_as_of("driver", "d1", snap)   # → {"status": "active", ...}

# Disable at any time
r.nedb.shadow_writes = False

Shadowed commands

The following write commands are intercepted when shadow_writes=True: set setnx setex psetex getset mset msetnx hset hmset hsetnx hincrby hincrbyfloat hdel lpush rpush lset ltrim lpop rpop sadd srem zadd zincrby zrem del unlink rename append incr incrby decr decrby setrange.

Keys that match a registered pattern get a full NEDB put() (NQL-queryable, time-travelable). Unmatched keys get a raw chain entry (tamper-evidence only).

Shadow failures never break the Redis surface call — any error in NEDB is silently swallowed.
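
The ordering and error handling described above can be sketched with a toy wrapper: the primary write always happens first, and a failure in the mirror is caught so the caller never sees it. ShadowingClient is a hypothetical stand-in (a dict plays the role of Redis and a callable plays the role of NEDB), and the shadow_errors counter is an addition of this sketch, not a documented attribute.

```python
class ShadowingClient:
    """Forward writes to a primary store and mirror them best-effort."""

    def __init__(self, primary, mirror):
        self.primary = primary        # dict standing in for Redis
        self.mirror = mirror          # callable(key, value) standing in for NEDB
        self.shadow_writes = False
        self.shadow_errors = 0        # sketch-only: count swallowed failures

    def set(self, key, value):
        self.primary[key] = value     # surface-1 write always happens first
        if self.shadow_writes:
            try:
                self.mirror(key, value)
            except Exception:
                # Swallowed on purpose: the Redis call must not fail
                self.shadow_errors += 1
        return True


def broken_mirror(key, value):
    raise RuntimeError("chain unavailable")


store = {}
client = ShadowingClient(store, broken_mirror)
client.shadow_writes = True
ok = client.set("driver:d1", '{"status": "active"}')
```

Counting swallowed errors, as this sketch does, is one way to keep silent failures observable without breaking the pass-through guarantee.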

v1.x features

wrap_redis() + nedbd Mode

Pass nedbd_url= to route all r.nedb.* calls to a running nedbd server instead of the in-process engine. nedbd handles its own durable AOF persistence on disk; the Redis Stream backend is bypassed entirely.

Python

import redis, json
from nedb import wrap_redis

r = wrap_redis(
    redis.Redis("localhost", 6379),
    db_name="rideshare",
    nedbd_url="http://localhost:8421",   # ← nedbd server
    nedbd_token="my-secret",             # ← optional bearer token
)

# Surface 1 — Redis, unchanged
r.set("driver:d1", "...")

# Surface 2 — forwarded to nedbd over HTTP/JSON
r.nedb.put("driver", "d1", {"name": "Bob"})
r.nedb.query('FROM driver WHERE status = "active"')
r.nedb.verify()   # → True

# Backfill + shadow_writes work identically in nedbd mode
r.nedb.register("driver:*", "driver", value_parser=json.loads)
r.nedb.backfill()
r.nedb.shadow_writes = True

Mode comparison

Mode	How	Persistence	Best for
In-process	`wrap_redis(r, db_name=…)`	Redis Stream oplog	Lightweight; single-process apps
nedbd	`wrap_redis(r, …, nedbd_url=…)`	Durable AOF on disk	Multi-process; production deployments

NedBdProxy is the internal HTTP client that translates r.nedb.* calls to nedbd's /v1/databases/{name}/* API. The surface API is identical in both modes — switching is a one-line change to wrap_redis().

Causal provenance in nedbd mode

As of v1.2.1 the nedbd PUT endpoint accepts caused_by, evidence, confidence, valid_from, and valid_to from the request body and passes them through to the engine:

Python

r.nedb.put("trip", "t1",
    {"rider": "u1", "driver": "d1"},
    caused_by=[r.nedb.seq - 1],
    evidence="inference",
    confidence=0.97)

r.nedb.query('FROM trip WHERE _id = "t1" TRACE caused_by')  # works in nedbd mode

Adapters

Auto-Indexing

Wrap a NEDB instance with AutoIndexDB and indexes are created automatically based on observed query patterns — no manual create_index() calls required.

Python

from nedb import NEDB, AutoIndexDB

db = AutoIndexDB(NEDB("./data"), threshold=5, verbose=True)

# Query as normal — field usage is tallied automatically
db.query('FROM users WHERE status = "active"')   # 1/5
db.query('FROM users WHERE status = "active"')   # 2/5
# ... 3 more ...
db.query('FROM users WHERE status = "active"')
# [autoindex] created eq index on users.status (threshold=5)

# Check what's been created and what's close
db.analyze()
# {"tallies": {"users.status (eq)": 5}, "indexes_created": ["users.status (eq)"], "threshold": 5}

db.suggest()
# ["users.age (ordered) — 3/5 queries"]  ← not yet at threshold

# All other NEDB methods work unchanged — AutoIndexDB is a transparent proxy
db.put("users", "u1", {"name": "Ada", "status": "active"})
db.get("users", "u1")
db.verify()

Parameter	Type	Description
`db`	NEDB	Any NEDB instance (embedded or durable)
`threshold`	int	Query count before an index is auto-created. Default: `5`
`verbose`	bool	Print a message when an index is created. Default: `False`

How it works

Every query() call is intercepted. The NQL string is parsed to extract WHERE field names and ORDER BY fields. Each (collection, field, kind) combination is tallied. Once the count reaches threshold, create_index() is called automatically. Equality conditions (=, !=) create an eq index; range comparisons (<, >, <=, >=) and ORDER BY create an ordered index.
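
The tallying logic described above can be approximated in a few lines. This is a regex-based sketch, not AutoIndexDB's parser: the Tally class is hypothetical, it records the index it would create instead of calling create_index(), and it only understands simple field-operator-value conditions.

```python
import re
from collections import Counter

_FROM = re.compile(r"^\s*FROM\s+(\w+)", re.IGNORECASE)
_WHERE = re.compile(
    r"\bWHERE\b(.*?)(?=\bORDER\s+BY\b|\bLIMIT\b|\bGROUP\s+BY\b|$)",
    re.IGNORECASE | re.DOTALL,
)
_COND = re.compile(r"(\w+)\s*(<=|>=|!=|=|<|>)")
_ORDER = re.compile(r"\bORDER\s+BY\s+(\w+)", re.IGNORECASE)


class Tally:
    def __init__(self, threshold=5):
        self.threshold = threshold
        self.counts = Counter()
        self.created = []     # (collection, field, kind), in creation order

    def observe(self, nql):
        # Blank out string literals so their contents are never parsed
        stripped = re.sub(r'"[^"]*"', '""', nql)
        m = _FROM.match(stripped)
        if not m:
            return
        coll, keys = m.group(1), set()
        where = _WHERE.search(stripped)
        if where:
            for field, op in _COND.findall(where.group(1)):
                kind = "eq" if op in ("=", "!=") else "ordered"
                keys.add((coll, field, kind))
        order = _ORDER.search(stripped)
        if order:
            keys.add((coll, order.group(1), "ordered"))
        for key in keys:      # each combination counts once per query
            self.counts[key] += 1
            if self.counts[key] == self.threshold:
                self.created.append(key)   # real code would create the index here


tally = Tally(threshold=5)
for _ in range(5):
    tally.observe('FROM users WHERE status = "active"')
```

After the fifth identical query, tally.created holds the single entry ("users", "status", "eq").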

v0.5.x

Snapshot Checkpoints

Capture the full database state to disk — anchored in the hash chain — so future starts are O(delta) instead of O(total). The chain never breaks: the checkpoint is a real op in the AOF.

Python

from nedb import NEDB

db = NEDB("./data")
# ... write 100 K rows ...
db.checkpoint()   # anchor state in the chain → writes snapshot.json
db.close()

# Next open: loads snapshot then replays only delta ops
db2 = NEDB("./data")
assert db2.verify()   # chain from genesis → checkpoint op → delta: intact

nedbd auto-checkpoints on SIGTERM / SIGINT. Stopping the daemon with Ctrl+C or kill writes a checkpoint for every open database before shutdown. The next start loads from those snapshots — restart time is proportional to writes since the last checkpoint, not total history.

You can also trigger a checkpoint over HTTP at any time:

curl -X POST localhost:7070/v1/databases/mydb/checkpoint
# {"ok": true, "head": "abc...", "seq": 1042}

What the snapshot contains

MVCC store (every key's HEAD value and write seq)
Relations (graph edges with added/removed seqs)
Index configuration (eq / ordered / search specs)
BlobStore chunks and file manifests (both tiers, compressed)
Nonce and idempotency tables (replay protection survives restart)

Pre-checkpoint time-travel: AS OF queries for seqs before the snapshot require the full AOF. Keep the log file for archival use if you need indefinite time-travel; otherwise snapshots are self-contained for all operations after the checkpoint seq.
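
The O(delta) restart can be illustrated with a toy log. This sketch assumes a list of (kind, key, value) ops and a snapshot that stores the state plus the seq it was taken at; it ignores that the real checkpoint is itself an op in the chain, and all names are hypothetical.

```python
def apply(state, op):
    kind, key, value = op
    if kind == "put":
        state[key] = value
    elif kind == "delete":
        state.pop(key, None)
    return state


def replay(log, state=None, after_seq=0):
    """Apply only the ops whose 1-based seq is greater than after_seq."""
    state = dict(state or {})
    for op in log[after_seq:]:     # work is proportional to the delta
        apply(state, op)
    return state


log = [
    ("put", "u1", {"age": 31}),
    ("put", "u2", {"age": 28}),
]
snapshot = {"seq": len(log), "state": replay(log)}   # checkpoint after op 2

# Ops written after the checkpoint form the delta
log.append(("delete", "u1", None))
log.append(("put", "u3", {"age": 40}))

fast = replay(log, snapshot["state"], after_seq=snapshot["seq"])
full = replay(log)
```

Both paths reach the same state; the snapshot path simply skips the ops that the checkpoint already covers.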

v0.5.x

TTL / Key Expiry

Set a time-to-live on any document. Expiry is lazy (checked on every read) and append-only (a delete op is written to the log when a key expires).

Python

# Put with TTL
db.put("cache", "session", {"token": "xyz"}, ttl_s=3600)   # expires in 1 hour

# Set / update TTL on an existing doc
db.expire("cache", "session", ttl_s=300)   # update to 5 minutes; False if not found

# Bulk sweep (call periodically for background maintenance)
n = db.sweep()   # → number of expired docs deleted

Method	Description
`put(..., ttl_s=N)`	Store `_expires_at = now + N` in the doc. Lazy expiry on every subsequent `get()`.
`expire(coll, id, ttl_s)`	Set or update TTL on an existing document. Returns `False` if the document doesn't exist.
`sweep()`	Scan and delete all expired documents now. Returns the count deleted.

Time-travel ignores expiry. get(..., as_of=seq) always returns what was true at that seq, even if the document has since expired. Expiry only fires on HEAD reads.
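
Lazy, append-only expiry can be sketched with an in-memory store and an injectable clock. LazyTTLStore is a hypothetical illustration that keeps only HEAD values and a bare op list; it borrows the _expires_at field name from the table above and nothing else from the engine.

```python
import time


class LazyTTLStore:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.docs = {}      # (coll, doc_id) -> doc, HEAD only
        self.log = []       # append-only list of (op, coll, doc_id)

    def put(self, coll, doc_id, doc, ttl_s=None):
        doc = dict(doc)
        if ttl_s is not None:
            doc["_expires_at"] = self.clock() + ttl_s
        self.docs[(coll, doc_id)] = doc
        self.log.append(("put", coll, doc_id))

    def _expired(self, doc):
        exp = doc.get("_expires_at")
        return exp is not None and exp <= self.clock()

    def _delete(self, coll, doc_id):
        del self.docs[(coll, doc_id)]
        self.log.append(("delete", coll, doc_id))   # expiry is a logged delete

    def get(self, coll, doc_id):
        doc = self.docs.get((coll, doc_id))
        if doc is not None and self._expired(doc):  # checked lazily on read
            self._delete(coll, doc_id)
            return None
        return doc

    def expire(self, coll, doc_id, ttl_s):
        doc = self.get(coll, doc_id)
        if doc is None:
            return False
        self.put(coll, doc_id, doc, ttl_s=ttl_s)
        return True

    def sweep(self):
        dead = [k for k, d in self.docs.items() if self._expired(d)]
        for coll, doc_id in dead:
            self._delete(coll, doc_id)
        return len(dead)


now = [1000.0]                                   # fake clock for the demo
store = LazyTTLStore(clock=lambda: now[0])
store.put("cache", "session", {"token": "xyz"}, ttl_s=3600)
now[0] += 3601
gone = store.get("cache", "session")             # expired on this read
```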

The Redis adapter's EXPIRE, TTL, and PTTL commands now map to db.expire().

v0.5.x

GROUP BY Aggregations

Aggregate query results by a field. Compatible with WHERE, SEARCH, and LIMIT — filtering happens before grouping.

NQL examples

# Count rows per group
db.query("FROM orders GROUP BY status COUNT")
# → [{"status": "paid", "count": 42}, {"status": "pending", "count": 7}]

# Sum a numeric field
db.query("FROM sales GROUP BY region SUM revenue")
# → [{"region": "north", "count": 3, "sum_revenue": 15000}, ...]

# Average
db.query("FROM scores GROUP BY grade AVG score")

# Min / Max
db.query("FROM items GROUP BY category MIN price")
db.query("FROM items GROUP BY category MAX price")

# With WHERE (filter before grouping)
db.query('FROM orders WHERE region = "EU" GROUP BY status COUNT')

Aggregate functions

Syntax	Output field	Description
`GROUP BY f COUNT`	`count`	Number of rows in the group
`GROUP BY f SUM field`	`sum_field`	Sum of `field` across the group
`GROUP BY f AVG field`	`avg_field`	Average of `field`
`GROUP BY f MIN field`	`min_field`	Minimum value of `field`
`GROUP BY f MAX field`	`max_field`	Maximum value of `field`

Every group result always includes count alongside the requested aggregate. Non-numeric values are skipped for SUM / AVG / MIN / MAX.
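
The output shape described above can be reproduced in plain Python. This is an illustrative sketch over a list of dicts, not the engine's implementation; group_by is a hypothetical helper, and omitting the aggregate field when a group has no numeric values is an assumption.

```python
_AGGS = {
    "SUM": sum,
    "MIN": min,
    "MAX": max,
    "AVG": lambda xs: sum(xs) / len(xs),
}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def group_by(rows, key, agg="COUNT", field=None):
    groups = {}
    for row in rows:
        groups.setdefault(row.get(key), []).append(row)
    out = []
    for value, members in groups.items():
        result = {key: value, "count": len(members)}   # count is always present
        if agg != "COUNT":
            nums = [r[field] for r in members if _is_number(r.get(field))]
            if nums:                                   # non-numeric values skipped
                result[f"{agg.lower()}_{field}"] = _AGGS[agg](nums)
        out.append(result)
    return out


sales = [
    {"region": "north", "revenue": 5000},
    {"region": "north", "revenue": 10000},
    {"region": "north", "revenue": "n/a"},
    {"region": "south", "revenue": 7},
]
totals = group_by(sales, "region", "SUM", "revenue")
```

Here the "north" group reports count 3 but sums only the two numeric rows, matching the rule that non-numeric values are skipped.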

v0.5.3

Encryption at Rest

AES-256-GCM at-rest encryption with a double-envelope key structure. Toggle-able — zero overhead when disabled. Encrypts the AOF, snapshot.json, and BlobStore chunks.

The double envelope

External TMK (Table Master Key)   ← provided by operator (env / arg / key file)
        ↓ AES-256-GCM wrap
DEK (Data Encryption Key)         ← random per database → stored in key.enc
        ↓ AES-256-GCM encrypt
Data (AOF lines, snapshot.json, blob chunks)

Enable encryption

Python — programmatic TMK

db = NEDB("./data", tmk=bytes.fromhex("a3f1..."))   # 32-byte key, written as 64 hex characters

Environment variable (recommended for production)

# .env
NEDB_TMK=a3f1...  # 64-char hex string (any length accepted, normalised via HKDF)

# or a key file
NEDB_TMK_FILE=/run/secrets/nedb-tmk

nedbd — encrypts all databases it manages

NEDB_TMK=a3f1... nedbd

Key rotation

Re-wrap the DEK under a new TMK without re-encrypting any data. The old TMK is immediately rejected after rotation.

db = NEDB("./data", tmk=old_key)
db.rewrap_key(old_tmk=old_key, new_tmk=new_key)
db.close()
# Database now only opens with new_key

What is encrypted

File	Coverage
`log.aof`	Every op line encrypted individually — `{"enc":1,"ct":"<b64>"}`
`snapshot.json`	Entire file encrypted as a single envelope
`key.enc`	Wrapped DEK (never the plaintext DEK)
BlobStore chunks	Each chunk encrypted before base64 storage in the snapshot

The hash chain is unaffected. verify(), AS OF, and Merkle proofs operate on decrypted log entries — encryption is a transparent layer below the log. The chain from genesis through every checkpoint op to the current head remains continuously verifiable.

Bundled since v0.5.5: cryptography is a required dependency and ships with every pip install nedb-engine. No separate install is needed.

v0.6.0

RESP2 Wire Protocol

nedbd now speaks the Redis Serialization Protocol (RESP2). redis-cli, redis-benchmark, and every Redis client library in every language connects to nedbd natively — no Redis installation required.
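
For readers unfamiliar with the wire format, a RESP2 request is an array of bulk strings. The sketch below encodes a command and decodes the three single-line reply types; it follows the public RESP2 specification rather than anything specific to nedbd, and the function names are hypothetical.

```python
def encode_command(*args):
    """Encode a command as a RESP2 array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def decode_simple(reply):
    """Decode simple-string, integer, and error replies only."""
    kind = reply[:1]
    body = reply[1:].split(b"\r\n", 1)[0]
    if kind == b"+":
        return body.decode("utf-8")
    if kind == b":":
        return int(body)
    if kind == b"-":
        raise RuntimeError(body.decode("utf-8"))
    raise ValueError("not a single-line reply")


ping = encode_command("PING")
set_cmd = encode_command("SET", "key", "hello")
eval_cmd = encode_command("EVAL", 'FROM users WHERE status = "active" LIMIT 5', 0)
```

Bulk-string and array replies (for example from SMEMBERS or EVAL) need a length-aware reader and are left out of this sketch.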

Enable RESP2

# env var: NEDBD_RESP2_PORT (0 = disabled, default)
NEDBD_RESP2_PORT=6379 nedbd

# Boot log:
#   nedbd 0.6.0 — http://127.0.0.1:7070  data=./nedb-data  auth=off
#   resp2  — redis://  127.0.0.1:6379  (RESP2 wire protocol)

Connect with redis-cli

redis-cli -p 6379 PING                              # PONG
redis-cli -p 6379 SELECT salonbooking             # OK — open that database
redis-cli -p 6379 SET key "hello"                  # OK
redis-cli -p 6379 HSET "user:1" name Ada age 31   # :2
redis-cli -p 6379 SADD tags python rust            # :2
redis-cli -p 6379 SMEMBERS tags                     # {"python", "rust"}

NQL pass-through via EVAL

# EVAL runs any NQL query; rows are returned as JSON strings
redis-cli -p 6379 EVAL "FROM users WHERE status = \"active\" LIMIT 5" 0
# 1) "{\"_id\":\"u1\",\"name\":\"Ada\",\"status\":\"active\"}"
# 2) "{\"_id\":\"u2\",\"name\":\"Bo\",\"status\":\"active\"}"

SELECT maps to NEDB database names

SELECT <name> opens the named NEDB database (creates it if it doesn't exist). This replaces Redis's integer 0-15 DBs with NEDB's named databases — use the database name you deployed in nedbd.

Command coverage

Group	Commands
Strings	`SET GET DEL EXISTS INCR INCRBY DECR DECRBY MSET MGET SETNX GETDEL APPEND STRLEN TYPE RENAME KEYS DBSIZE FLUSHDB`
Hashes	`HSET HMSET HSETNX HGET HMGET HGETALL HDEL HEXISTS HKEYS HVALS HLEN HINCRBY`
Sets	`SADD SMEMBERS SISMEMBER SREM SCARD SUNION SINTER SDIFF`
Lists	`LPUSH RPUSH LRANGE LLEN LINDEX LSET LPOP RPOP`
Server	`PING SELECT COMMAND QUIT DBSIZE KEYS TYPE FLUSHDB`
NQL	`EVAL "<nql>" 0` — run any NQL query
Unsupported (roadmap)	`EXPIRE TTL PTTL SUBSCRIBE PUBLISH MULTI EXEC` — clear `-ERR` with roadmap note

Release History

Changelog

2.0.27

Warm start, deferred cold scan, SSE log stream, production hardening

MANIFEST warm start: nedbd loads seq and the Merkle head in O(1) on every restart after the first open, with no scan and no replay, independent of dataset size.
Deferred background cold scan: the server accepts connections immediately on first open; the integrity scan runs in a background thread with a live progress bar (rate + ETA). Reads serve instantly from the DAG; writes return HTTP 503 startup in progress until the startup_ready gate flips.
GET /events: new Server-Sent Events endpoint streams cold-scan progress, the ready transition, and per-write events with the updated Merkle head.
NEDBD_HOST=127.0.0.1 by default (was 0.0.0.0), a security-hardening fix; set explicitly to 0.0.0.0 to expose.
O(1) Merkle head: the BLAKE2b root is advanced incrementally on every write, never recomputed.
IdIndex sharded across 256 subdirectories to keep directory listing fast at 1M+ documents.
TCP_NODELAY on the axum listener: eliminates the 40–200 ms Nagle delay on macOS loopback.
Production: vision.interchained.org is live on v2.0.27 with 1,310,703 sequences indexed at block height 620,989.
Benchmarks (Intel iMac, 10k writes / 100k reads / 30k objects): 418/s writes (p99 3.3 ms), 478/s point reads (p99 3.0 ms), 489/s ORDER BY (p99 4.3 ms), 1,104 ops/s batch writes, 30k-object tamper-verify in 1.38 s (~21k BLAKE2b/sec).

2.0.4

The DAG Engine — content-addressed Merkle DAG storage

New storage substrate built around a content-addressed Merkle DAG. Every document version is an immutable, BLAKE2b-verified object. Launch with nedbd --dag or NEDBD_DAG=1 — the launcher execs the new nedbd-v2 Rust binary. Instant cold start (no AOF replay; multi-GB databases boot in milliseconds). No corruption possible (immutable hash-addressed objects; a corrupted byte changes the digest and is rejected by the DAG root). Real Merkle head (BLAKE2b root over the full DAG of every version ever written, on every response). Tombstone deletes (history is permanent; AS OF still resolves pre-delete values). One-way migration from v1 AOF: existing log.aof is scanned once and a new dag/ directory is built alongside it. Same NQL, same API, same wire protocol — only the substrate changes. The v1 AOF engine remains available without --dag.

1.2.1

wrap_redis() + nedbd server mode

wrap_redis() now accepts nedbd_url= and nedbd_token= to route all r.nedb.* calls to a running nedbd HTTP server instead of the in-process engine. New NedBdProxy class translates put/get/query/create_index/link/delete/verify/checkpoint to nedbd's /v1/databases/{name}/* HTTP API. Backfill and write shadowing work identically in nedbd mode. Also: nedbd PUT endpoint now passes caused_by, evidence, confidence, valid_from, valid_to through to the engine (previously only client, nonce, idem were forwarded). 29/29 nedbd integration tests; 74/74 in-process tests.

1.2.0

wrap_redis() backfill + write shadowing

Three-step Redis migration: register() maps key globs to NEDB collections, backfill() scans existing Redis keys via SCAN and imports them into the hash chain in one pass (evidence tagged "backfill"), shadow_writes = True auto-chains all future surface-1 writes. New: CollectionMapping with custom id_extractor and value_parser; chainable register(); direct backfill(pattern, collection) without prior register. 30 new tests; 74/74 total. Also: updated fakeredis_demo.py (29/29 checks) and README wrap_redis section.

1.1.0

wrap_redis() — NEDB as a Redis layer-2

Wrap any existing redis.Redis connection with NEDB in one line: r = wrap_redis(redis.Redis(...), db_name="..."). Two surfaces coexist: Surface 1 — every Redis command passes through unchanged; Surface 2 — r.nedb.* gives time-travel, NQL, causal provenance, and hash-chain verification. NEDB persists via Redis Streams (nedb:{db_name}:oplog). Isolation guarantee: NEDB never writes outside its nedb:{db_name}: prefix. 44 tests; fakeredis demo.

1.0.5

License → GPL-3.0-or-later; npm homepage fix

Updated license in package.json and pyproject.toml from Apache-2.0 to GPL-3.0-or-later. Added explicit "homepage" to package.json to prevent npm from appending #readme.

1.0.3

Rust NQL integer/float comparison fix

serde_json::json!(3.0f64) serialises as Float(3.0) but a document field parsed from JSON arrives as PosInt(3). Fixed by comparing via as_f64() in the Rust nql.rs cmp() function so integer literals and float literals compare equal when numerically equal.

1.0.2

Rust AS OF index bypass fix

The eq index was being used even for AS OF queries; since the index reflects HEAD only, AS OF queries returned stale current-version results. Fixed with an as_of.is_none() guard so the eq index is only used for HEAD reads.

1.0.1

napi-rs Node.js binding rewrite

Complete rewrite of rust/crates/nedb-node/src/lib.rs to match the current NEDB core API. Fixes API drift from the original stub. All napi-rs bindings now match put/get/query/createIndex/link/verify/head/seq/getAsOf.

0.6.0

RESP2 wire protocol — nedbd is a drop-in Redis replacement

Enable with NEDBD_RESP2_PORT=6379 nedbd. redis-cli, redis-benchmark, and every Redis client library connects natively. SELECT <name> opens a named NEDB database. EVAL "<nql>" 0 runs any NQL query. All major command groups supported (strings, hashes, sets, lists); EXPIRE/TTL/SUBSCRIBE/MULTI return clear -ERR with roadmap note. Also: auto backfill-encrypt existing plaintext AOF on first encrypted open (v0.5.6), eager database open at startup so backfill shows in boot log (v0.5.7), cryptography bundled as required dep (v0.5.5).

0.5.3

AES-256-GCM encryption at rest — double envelope TMK

Toggle-able at-rest encryption for the AOF, snapshots, and BlobStore chunks. Double-envelope: a random per-database DEK is wrapped by an external TMK (programmatic, NEDB_TMK env, or key file). The GCM tag detects any tampering on read. Key rotation via db.rewrap_key() re-wraps the DEK without touching data. The hash chain, verify(), and AS OF work unchanged through encryption. Requires pip install cryptography. 38 new tests; 127/127 total.

0.5.2

BlobStore (Cascade files) persisted in checkpoints

Files stored via put_file() are now fully preserved across checkpoint() + restart. The snapshot serialises both BlobStore tiers: compressed chunk bytes (encrypted if DEK set), file manifests, Merkle roots, and dedup stats. get_file(), file_root(), file_proof(), and compression_stats() all work identically after a snapshot-assisted restart.

0.5.1

nedbd auto-checkpoint on SIGTERM / SIGINT

The daemon checkpoints every open database before shutdown (SIGTERM or SIGINT) so the next startup loads from the snapshot with zero delta ops to replay. Signal handler uses a daemon thread to call httpd.shutdown() — avoids deadlock with serve_forever(). Also adds POST /v1/databases/:name/checkpoint for on-demand checkpoints.

0.5.0

Snapshot checkpoints, TTL/expiry, GROUP BY aggregations

Snapshots: db.checkpoint() captures state in a snapshot.json anchored in the hash chain (the checkpoint is a real log op). Future opens are O(delta). TTL: db.put(..., ttl_s=N), db.expire(), db.sweep(). Lazy expiry on read; AS OF reads skip expiry checks. Redis EXPIRE now functional. GROUP BY: FROM t GROUP BY f COUNT|SUM|AVG|MIN|MAX field in NQL. 30 new tests; 89/89 total.

0.4.2

Structured benchmark suite

bench/benchmarks.py: measures GET/PUT/time-travel, indexed vs unindexed query, adapter overhead vs raw NQL, in-memory vs AOF write cost, optional Redis TCP and nedbd HTTP comparisons. Results written to bench/RESULTS.md. README perf table updated with real measured numbers.

0.4.0

SQL adapter, Redis compatibility, auto-indexing

Three new pure-Python adapters, zero external dependencies. SQL: sql_exec(db, sql) translates SELECT/INSERT/UPDATE/DELETE to NQL and NEDB primitives — MariaDB users can write what they know and immediately get time-travel and hash-chain integrity. Redis: RedisCompat(db).execute(cmd, *args) maps SET/GET/HSET/HGETALL/SADD/SMEMBERS/LPUSH/LRANGE and 30+ more commands to NEDB. EXPIRE/TTL/SUBSCRIBE/MULTI return RedisUnsupportedError with a roadmap note. Auto-indexing: AutoIndexDB(db, threshold=5) tallies query field usage and creates indexes automatically. 48 new tests; 59/59 total.

0.3.1

Universal wheel — installs everywhere

Switched build backend to hatchling. Publishes a py3-none-any wheel + sdist that installs on any platform and Python version without a toolchain. Fixes the silent upgrade failure on Intel Mac caused by arm64-only wheels in 0.2.0/0.3.0. The nedbd command is always present after install.

0.3.0

nedbd — server daemon

Run NEDB as a long-lived HTTP/JSON server. pip install nedb-engine now ships the nedbd console command. Each named database is a durable NEDB(path) held open in memory. Multi-database, optional bearer token auth, CORS, ThreadingHTTPServer. Verified: create, seed, query, traverse, verify, kill + restart → data intact.

0.2.0

Durable AOF persistence

NEDB(path) opens a durable database. Every op is appended to log.aof and fsync'd immediately. Index config is snapshotted to meta.json. Database reloads by replaying the log on open — verify(), AS OF time-travel, relations, and the head commitment all survive restarts. NEDB() with no path remains in-memory (fully backward-compatible). Added: flush(), close(), context-manager support.

0.1.4

Stable pure-Python baseline

Universal py3-none-any wheel + sdist. Last release in the 0.1.x line before the persistence work. All features complete: log, MVCC, relations, indexes (eq/ordered/search), NQL, Cascade compression, Merkle proofs, fluent builder. 10/10 invariant tests pass.

0.1.3

Native Rust wheels

First native wheel release — platform wheels for macOS arm64, Linux x86_64, Windows x64 via maturin + PyO3. The optional nedb._native accelerator is loaded lazily; the package works identically without it.

0.1.0

Initial release

Pure-Python reference engine. Full feature set: append-only hash-chained log, MVCC time-travel, replay protection, idempotency, first-class relations, three index types, NQL query language, git-style Cascade-compressed file store, Merkle proofs.