NEDB Reference Documentation

NEDB is a versioned, time-traveling embedded database — replay-protected, idempotent, relational, and searchable. One hash-chained, nonce-enforced, append-only log is the substrate for idempotency, replay protection, crash recovery, MVCC, and time-travel simultaneously.

What makes it different: most databases store a snapshot of current state. NEDB stores the log of every operation and derives state from it. That one decision makes replay protection, crash recovery, time-travel reads, and on-chain provability all fall out for free — from the same structure.

Key properties

  • Replay-protected: every write carries a strictly-monotonic per-client nonce. Stale or duplicate ops are rejected.
  • Idempotent: pass an idem key and retries are no-ops — the original result is returned without re-executing.
  • MVCC time-travel: read the database exactly as it was at any past sequence — AS OF seq.
  • Relational: first-class graph edges with O(1) traversal — and the graph time-travels too.
  • Durable: NEDB(path) appends every op to disk (log.aof) and fsyncs it; the database reloads by replaying that log on open.
  • Provable: verify() rewalks the BLAKE2b hash chain; the head hash is a commitment to the entire history, anchorable on-chain.

Installation

NEDB ships as a universal Python package — one command, any platform, no compiler needed.

shell
pip install nedb-engine

Verify the install and start the server daemon:

python3 -c "import nedb; print(nedb.__version__)"
# 0.3.1

nedbd
# nedbd 0.3.1 — http://127.0.0.1:7070  data=./nedb-data  auth=off

Python 3.8+ required. The package ships a universal py3-none-any wheel, so no Rust toolchain or native compilation is needed. The optional Rust core (nedb._native) is an additive roadmap item; the pure-Python engine is the production baseline.

Quickstart

Up and running in under a minute — a durable database with indexes, relations, and time-travel.

Python
from nedb import NEDB

# Durable (persists to disk). Use NEDB() for in-memory.
db = NEDB("./mydata")

# Indexes power fast filtering, sorting, and full-text search
db.create_index("users", "status", "eq")
db.create_index("users", "age", "ordered")
db.create_index("users", "bio", "search")

# Write rows
db.put("users", "alice", {"name": "Alice", "age": 31, "status": "active", "bio": "systems engineer"})
db.put("users", "bob",   {"name": "Bob",   "age": 40, "status": "active", "bio": "database architect"})

# Idempotent write — safe to retry forever
db.put("orders", "o1", {"total": 42}, client="checkout", nonce=7, idem="charge-o1")

# Query with NQL
db.query('FROM users WHERE status = "active" ORDER BY age DESC')
db.query('FROM users SEARCH "systems"')

# Relations + traversal
db.link("users:alice", "follows", "users:bob")
db.q("users").where("_id", "=", "alice").traverse("follows").run()

# Time-travel
snap = db.seq
db.put("users", "alice", {"age": 32, "status": "active"})
db.get("users", "alice", as_of=snap)["age"]  # → 31

# Integrity
assert db.verify()              # hash chain intact
assert db.verify_determinism()  # state == replay(log)

db.close()

Core Concepts

The append-only log

Every mutation (put, delete, link, put_file) creates an Op that is appended to the OpLog. Each Op is chained to the previous one via a BLAKE2b hash — the head hash is a cryptographic commitment to the entire history. Nothing is ever rewritten or deleted from the log.

State is a pure function of the log. The MVCC store, relations, and indexes are materialized views — they are rebuilt by replaying the log, which means crash recovery and time-travel are free side effects of the same design.
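
To make the chaining idea concrete, here is a minimal sketch of a hash-chained log built on hashlib.blake2b. It is an illustration only, not NEDB's implementation: the GENESIS placeholder, the op field names, and the canonical-JSON encoding are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder meaning "no previous op"

def op_hash(prev_hash, body):
    """BLAKE2b digest over the previous hash plus a canonical JSON body."""
    payload = prev_hash + json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.blake2b(payload.encode("utf-8"), digest_size=32).hexdigest()

def append(log, kind, payload):
    """Append one op, chaining it to the current head."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "op": kind, "payload": payload}
    op = dict(body, prev_hash=prev, hash=op_hash(prev, body))
    log.append(op)
    return op

def verify(log):
    """Rewalk the chain; an edited, reordered, or mid-chain removed op breaks it."""
    prev = GENESIS
    for expected_seq, op in enumerate(log):
        body = {"seq": op["seq"], "op": op["op"], "payload": op["payload"]}
        if op["seq"] != expected_seq or op["prev_hash"] != prev:
            return False
        if op["hash"] != op_hash(prev, body):
            return False
        prev = op["hash"]
    return True

log = []
append(log, "put", {"key": "users:alice", "doc": {"age": 31}})
append(log, "put", {"key": "users:bob", "doc": {"age": 40}})
intact = verify(log)

log[0]["payload"]["doc"]["age"] = 99   # tamper with history
tampered_ok = verify(log)              # the stored hash no longer matches
```

Because each hash covers the previous one, the last hash in the list plays the role of the head commitment: changing any earlier op changes every hash after it.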

Sequences and time-travel

Every Op gets a monotonically-increasing seq — an integer starting at 0. db.seq returns the current sequence number. Any read can be made time-traveling by passing as_of=seq: the engine replays or truncates the log to that point and returns the result that was true then.
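
One way to picture an as_of read is a store that keeps every version of a key tagged with the seq that wrote it. The sketch below takes that approach rather than replaying or truncating a log; the VersionedStore class and its method names are hypothetical and exist only for this example.

```python
from typing import Optional

class VersionedStore:
    """Keeps every version of every key, tagged with the seq that wrote it."""

    def __init__(self):
        self.seq = -1          # no ops yet; the first op gets seq 0
        self._versions = {}    # key -> list of (seq, doc or None), in seq order

    def put(self, key, doc):
        self.seq += 1
        self._versions.setdefault(key, []).append((self.seq, dict(doc)))
        return self.seq

    def delete(self, key):
        self.seq += 1
        self._versions.setdefault(key, []).append((self.seq, None))  # tombstone
        return self.seq

    def get(self, key, as_of: Optional[int] = None):
        limit = self.seq if as_of is None else as_of
        result = None
        for seq, doc in self._versions.get(key, []):
            if seq > limit:
                break              # later versions are invisible at this point
            result = doc
        return result

store = VersionedStore()
store.put("users:alice", {"age": 31})
snap = store.seq
store.put("users:alice", {"age": 32})
store.delete("users:alice")

at_snapshot = store.get("users:alice", as_of=snap)
at_head = store.get("users:alice")
```

A delete is just another version (a tombstone), which is why the document is gone at HEAD yet still readable at the earlier sequence.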

Replay protection and idempotency

Each write takes a client identifier and a nonce. The nonce must strictly exceed the last nonce seen from that client — stale or repeated ops raise ReplayError. If you also pass an idem key, the very first successful write is recorded; subsequent calls with the same key return the original result and append nothing to the log.
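
The two checks can be sketched in a few lines. This is an assumption-laden illustration, not NEDB's code: the WriteGuard class is hypothetical, nonces are assumed to be non-negative, and the idempotency check is assumed to run before the nonce check so that a retry with the same nonce and idem key returns the original result instead of raising.

```python
class ReplayError(Exception):
    """Raised when a nonce is not strictly greater than the last one seen."""

class WriteGuard:
    def __init__(self):
        self.last_nonce = {}     # client -> highest nonce accepted so far
        self.idem_results = {}   # idem key -> result of the first successful write
        self.log = []            # accepted ops, in order

    def write(self, client, nonce, payload, idem=None):
        if idem is not None and idem in self.idem_results:
            return self.idem_results[idem]        # retry: nothing is appended
        if nonce <= self.last_nonce.get(client, -1):
            raise ReplayError(f"stale nonce {nonce} for client {client!r}")
        self.last_nonce[client] = nonce
        result = {"seq": len(self.log), "payload": payload}
        self.log.append(result)
        if idem is not None:
            self.idem_results[idem] = result
        return result

guard = WriteGuard()
first = guard.write("checkout", 7, {"total": 42}, idem="charge-o1")
retry = guard.write("checkout", 7, {"total": 42}, idem="charge-o1")

try:
    guard.write("checkout", 7, {"total": 1})      # same nonce, no idem key
    rejected = False
except ReplayError:
    rejected = True
```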

Collections and keys

NEDB is schema-agnostic. A "collection" is just a string namespace prefix. Every document has an _id field (set automatically from the id argument if missing). Internal store keys are formatted as collection:id.

Relations

Edges are stored as (from, relation, to) triples — the "from" and "to" are full node keys in collection:id form. Relations also time-travel: neighbors(frm, rel, as_of=seq) returns the edges that existed at that sequence.
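
A time-travelling edge can be modelled by recording the seq at which it was added and, if applicable, removed. The EdgeStore below is a hypothetical sketch under that assumption; it keys edges by (from, relation) so a lookup is a single dictionary access.

```python
class EdgeStore:
    def __init__(self):
        self.seq = -1
        self._edges = {}   # (frm, rel) -> list of [to, added_seq, removed_seq or None]

    def link(self, frm, rel, to):
        self.seq += 1
        self._edges.setdefault((frm, rel), []).append([to, self.seq, None])

    def unlink(self, frm, rel, to):
        self.seq += 1
        for edge in self._edges.get((frm, rel), []):
            if edge[0] == to and edge[2] is None:
                edge[2] = self.seq     # mark as removed, keep the history

    def neighbors(self, frm, rel, as_of=None):
        at = self.seq if as_of is None else as_of
        return [to for to, added, removed in self._edges.get((frm, rel), [])
                if added <= at and (removed is None or removed > at)]

g = EdgeStore()
g.link("users:alice", "follows", "users:bob")
g.link("users:alice", "follows", "users:carol")
snap = g.seq
g.unlink("users:alice", "follows", "users:bob")

then = g.neighbors("users:alice", "follows", as_of=snap)
now = g.neighbors("users:alice", "follows")
```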

NEDB()

The main database object. Instantiate once; keep open for the lifetime of your application.

NEDB(path: str | None = None) → NEDB

Parameter   Type          Description
path        str | None    Directory path for durable storage. If None (default), the database is in-memory only and nothing is written to disk.

db = NEDB()              # in-memory
db = NEDB("./data")     # durable — creates ./data/log.aof + meta.json

# Use as a context manager for automatic close/flush:
with NEDB("./data") as db:
    db.put("k", "1", {"v": 1})

put()

Insert or replace a document. If a document with the same id already exists it is fully replaced. Returns the stored document (with _id set).

put(coll: str, id: str, doc: dict, *, client: str = "local", nonce: int | None = None, idem: str | None = None) → dict

Parameter   Type          Description
coll        str           Collection name (e.g. "users")
id          str           Document identifier, unique within the collection
doc         dict          The document to store. _id is set automatically.
client      str           Client identifier for nonce tracking. Default: "local"
nonce       int | None    Monotonic write counter for this client. Auto-incremented if None.
idem        str | None    Idempotency key. If this key was already used, the original result is returned and nothing is appended.

# Simple write
doc = db.put("users", "alice", {"name": "Alice", "age": 31})

# Idempotent write — calling this 100 times = calling it once
db.put("orders", "o1", {"total": 99}, client="api", nonce=1, idem="order-o1-v1")

# Explicit nonce (replay-protected from a distributed service)
db.put("events", "e1", {"type": "click"}, client="tracker", nonce=42)

delete()

Remove a document. The deletion is appended to the log as a delete op — the history is preserved and the document is visible at past sequences via AS OF.

delete(coll: str, id: str, *, client: str = "local", nonce: int | None = None, idem: str | None = None)
db.delete("users", "alice")

get()

Retrieve a single document by id. Returns None if the document does not exist (or did not exist at as_of).

get(coll: str, id: str, as_of: int | None = None) → dict | None
doc = db.get("users", "alice")          # current HEAD
old = db.get("users", "alice", as_of=3)  # as it was at seq 3

query()

Execute a NQL query string. Returns a list of matching documents. See the NQL Reference for the full grammar.

query(nql: str) → list[dict]
rows = db.query('FROM users WHERE status = "active" ORDER BY age DESC LIMIT 10')
found = db.query('FROM users SEARCH "engineer"')
old   = db.query('FROM users AS OF 5 WHERE age > 25')

create_index()

Create a secondary index on a collection field. Indexes are maintained incrementally on every write and dramatically speed up queries. Existing rows are backfilled immediately.

create_index(coll: str, field: str, kind: str = "eq")

kind        Type              Use for
"eq"        Hash map          Equality filters: WHERE field = value
"ordered"   Sorted list       Range queries and sorting: ORDER BY field, WHERE field > n
"search"    Inverted index    Full-text search: SEARCH "term"

db.create_index("users", "status", "eq")       # WHERE status = "active"
db.create_index("users", "created_at", "ordered")  # ORDER BY created_at DESC
db.create_index("users", "bio", "search")        # SEARCH "engineer"

Create indexes before seeding data when possible — but creating after works too; existing rows are backfilled. Index configuration is persisted in meta.json (durable mode) so indexes survive restarts.
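
The three index kinds correspond to three familiar data structures. The sketch below builds one of each over a small in-memory dataset using only the standard library; the variable names and the whitespace tokenizer are assumptions for illustration and say nothing about how NEDB stores its indexes internally.

```python
import bisect
from collections import defaultdict

rows = {
    "alice": {"age": 31, "status": "active", "bio": "systems engineer"},
    "bob": {"age": 40, "status": "active", "bio": "database architect"},
    "carol": {"age": 25, "status": "inactive", "bio": "systems architect"},
}

eq_status = defaultdict(set)    # "eq": value -> ids holding that value
ordered_age = []                # "ordered": (value, id) pairs kept sorted
search_bio = defaultdict(set)   # "search": lowercased token -> ids

for row_id, doc in rows.items():
    eq_status[doc["status"]].add(row_id)
    bisect.insort(ordered_age, (doc["age"], row_id))
    for token in doc["bio"].lower().split():
        search_bio[token].add(row_id)

# WHERE status = "active": one hash lookup
active = sorted(eq_status["active"])

# WHERE age > 30: binary search for the cut point, then slice
ages = [age for age, _ in ordered_age]
start = bisect.bisect_right(ages, 30)
older_than_30 = [row_id for _, row_id in ordered_age[start:]]

# SEARCH "systems": one inverted-index lookup
systems = sorted(search_bio["systems"])
```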

link() / unlink()

Create or remove a directed graph edge. Node keys are in collection:id format.

link(frm: str, rel: str, to: str, *, client: str = "local", nonce: int | None = None)
unlink(frm: str, rel: str, to: str, *, client: str = "local", nonce: int | None = None)

neighbors() / inbound()

Traverse edges from a node. Returns a list of node key strings (collection:id). Both support time-travel via as_of.

neighbors(frm: str, rel: str, as_of: int | None = None) → list[str]
inbound(to: str, rel: str, as_of: int | None = None) → list[str]
db.link("users:alice", "follows", "users:bob")
db.link("users:alice", "follows", "users:carol")

db.neighbors("users:alice", "follows")         # ["users:bob", "users:carol"]
db.inbound("users:bob", "follows")             # ["users:alice"]

snap = db.seq
db.unlink("users:alice", "follows", "users:bob")
db.neighbors("users:alice", "follows", as_of=snap)  # ["users:bob", "users:carol"] (time-travel)

put_file() / get_file()

Git-style versioned file storage with Cascade compression: content-defined chunking, content-addressed dedup, and two compression tiers ("warm" fast / "cold" archival). Every version has a Merkle root.

put_file(name: str, data: bytes, tier: str = "warm", ...) → int (version index)
get_file(name: str, version: int = -1, tier: str = "warm") → bytes
file_root(name: str, version: int = -1, tier: str = "warm") → str (Merkle root hex)
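with open("notes.txt", "rb") as f:
    data = f.read()
new_data = data + b"\nsecond draft"            # hypothetical edited contents for the second version
v1 = db.put_file("notes.txt", data)           # warm tier (fast)
v2 = db.put_file("notes.txt", new_data, tier="cold")  # cold tier (max compression)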

db.get_file("notes.txt", v1)                  # original bytes back
db.file_root("notes.txt", v1)                 # Merkle root — anchorable on ITC
db.compression_stats("warm")                  # {"ratio": 39.9, "dedup_hits": 20, ...}
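
The dedup and Merkle-root ideas can be sketched without the compression tiers. The example below is a simplification and not the Cascade implementation: it swaps content-defined chunking for tiny fixed-size chunks, skips compression entirely, and uses a hypothetical ChunkStore class with an assumed pairwise Merkle construction.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed size for the demo; a real chunker cuts on content

def digest(data):
    return hashlib.blake2b(data, digest_size=32).hexdigest()

def merkle_root(leaves):
    """Pairwise-hash leaf digests up to one root (last leaf duplicated if odd)."""
    if not leaves:
        return digest(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [digest((level[i] + level[i + 1]).encode("ascii"))
                 for i in range(0, len(level), 2)]
    return level[0]

class ChunkStore:
    def __init__(self):
        self.chunks = {}      # digest -> bytes, each distinct chunk stored once
        self.versions = {}    # name -> list of manifests (lists of digests)
        self.dedup_hits = 0

    def put_file(self, name, data):
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            piece = data[i:i + CHUNK_SIZE]
            key = digest(piece)
            if key in self.chunks:
                self.dedup_hits += 1       # already stored: reference it only
            else:
                self.chunks[key] = piece
            manifest.append(key)
        self.versions.setdefault(name, []).append(manifest)
        return len(self.versions[name]) - 1   # version index

    def get_file(self, name, version=-1):
        return b"".join(self.chunks[k] for k in self.versions[name][version])

    def file_root(self, name, version=-1):
        return merkle_root(self.versions[name][version])

store = ChunkStore()
v0 = store.put_file("notes.txt", b"aaaabbbbcccc")
v1 = store.put_file("notes.txt", b"aaaabbbbdddd")
```

The second version shares two of its three chunks with the first, so only one new chunk is stored, while each version still gets its own root.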

verify() / verify_determinism()

verify() → bool

Rewalk the entire BLAKE2b hash chain and confirm no op has been modified, reordered, or deleted. Returns True if the chain is intact. Run this after loading from disk to confirm persistence was not corrupted.

verify_determinism() → bool

Replay the log from scratch into fresh state and compare the result with the current materialized state. Returns True if they match — proving that state is a pure function of the log.

Properties

Property   Type   Description
db.seq     int    Current sequence number (number of ops − 1)
db.head    str    Current chain head: BLAKE2b hex digest of the entire history

Lifecycle

flush()  /  close()

flush() forces an fsync without closing. close() flushes and closes the AOF file handle. Always call close() (or use the context manager) before exiting in durable mode.

Fluent Builder — q()

An alternative to raw NQL strings when you want to compose queries programmatically.

q(coll: str) → Query

Returns a Query builder. Chain methods and call .run() to execute.

Method                       NQL equivalent
.where(field, op, value)     WHERE field op value
.as_of(seq)                  AS OF seq
.search(text)                SEARCH "text"
.order_by(field, "DESC")     ORDER BY field DESC
.traverse(rel)               TRAVERSE rel
.limit(n)                    LIMIT n
.run()                       (executes and returns list[dict])

results = (
    db.q("users")
      .where("status", "=", "active")
      .where("age", ">=", 25)
      .order_by("age", "DESC")
      .limit(10)
      .run()
)

NQL Reference

NQL — the NEDB Query Language — is a small, readable query syntax. One grammar, one parser; the query() method and the q() builder both compile to the same plan.

Full grammar

FROM <collection>
  [ AS OF <seq> ]
  [ WHERE <field> <op> <value> ( AND <field> <op> <value> )* ]
  [ SEARCH "<text>" ]
  [ ORDER BY <field> [ ASC | DESC ] ]
  [ TRAVERSE <relation> ]
  [ LIMIT <n> ]

op ∈ = != < <= > >=
String values use double quotes. Numbers are unquoted.

FROM

Required. Specifies the collection to query.

db.query('FROM users')                     # all users
db.query('FROM orders LIMIT 20')           # first 20 orders

WHERE

Filter rows. Multiple conditions are combined with AND. All six comparison operators are supported. Equality filters use indexed lookups when an eq index exists.

db.query('FROM users WHERE status = "active"')
db.query('FROM users WHERE age >= 25 AND status = "active"')
db.query('FROM orders WHERE total > 100 AND status != "cancelled"')

Index hint: equality conditions on indexed fields (eq index) skip the collection scan entirely. Create an eq index on any field you filter by frequently.

ORDER BY

Sort results by a field. Default direction is ASC. An ordered index makes sorting faster but is not required.

db.query('FROM users ORDER BY age ASC')
db.query('FROM users ORDER BY created_at DESC LIMIT 10')
db.query('FROM orders WHERE status = "paid" ORDER BY total DESC')

TRAVERSE

Follow a relation from the result set. First filters/sorts rows in the FROM collection, then follows the named edge to the target collection. Returns documents from the target collection.

# Who does alice follow?
db.query('FROM users WHERE _id = "alice" TRAVERSE follows')

# All work orders for active projects
db.query('FROM projects WHERE status = "active" TRAVERSE contains')

AS OF

Time-travel read. Execute the query against the database as it existed at the given sequence number. Works with all other clauses.

snap = db.seq
db.put("users", "alice", {"age": 32})

# alice is 31 at the snapshot, 32 at HEAD
db.query(f'FROM users AS OF {snap} WHERE _id = "alice"')
db.query('FROM users AS OF 0')   # empty — nothing existed at seq 0

LIMIT

Truncate the result set. Applied after all other clauses (WHERE, SEARCH, ORDER BY, TRAVERSE).

db.query('FROM users ORDER BY created_at DESC LIMIT 5')
db.query('FROM users WHERE status = "active" LIMIT 100')
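
The clause ordering described above (filter, then sort, then truncate) can be sketched over plain dictionaries. The run_plan function is hypothetical and covers only WHERE, ORDER BY, and LIMIT; it assumes every row carries the field being sorted on.

```python
import operator

OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def run_plan(rows, where=(), order_by=None, descending=False, limit=None):
    """Apply AND-ed conditions, then sort, then truncate."""
    out = [r for r in rows
           if all(f in r and OPS[op](r[f], v) for f, op, v in where)]
    if order_by is not None:
        out.sort(key=lambda r: r[order_by], reverse=descending)
    if limit is not None:
        out = out[:limit]       # LIMIT runs last, on the sorted result
    return out

users = [
    {"_id": "alice", "age": 31, "status": "active"},
    {"_id": "bob", "age": 40, "status": "active"},
    {"_id": "carol", "age": 25, "status": "inactive"},
    {"_id": "dave", "age": 22, "status": "active"},
]

# FROM users WHERE status = "active" AND age >= 25 ORDER BY age DESC LIMIT 2
top = run_plan(users,
               where=[("status", "=", "active"), ("age", ">=", 25)],
               order_by="age", descending=True, limit=2)
top_ids = [r["_id"] for r in top]
```

Applying LIMIT before the sort would return an arbitrary slice, which is why the truncation step comes last.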

Durable Mode

Pass a directory path to NEDB() to make the database durable. Every op is appended to disk immediately — NEDB uses the same model as Redis AOF persistence.

Python
# Session 1 — write data
db = NEDB("./mydata")
db.create_index("users", "status", "eq")
db.put("users", "alice", {"name": "Alice", "status": "active"})
db.close()  # flush + fsync

# Session 2 — reopen (replays log.aof, rebuilds state)
db = NEDB("./mydata")
assert db.verify()                            # chain intact across the restart
assert db.get("users", "alice")["name"] == "Alice"

Always call db.close() (or use the context manager) before exiting. Unflushed writes may still sit in an OS buffer and can be lost in an unclean shutdown. close() calls fsync(), so the AOF is fully durable once it returns.

What gets written to disk

Two files are created in the data directory:

File        Contents
log.aof     One JSON line per op, in append order. Contains the full Op including seq, client, nonce, op type, payload, timestamp, prev_hash, and hash. Never rewritten, only appended.
meta.json   The index configuration: list of [coll, field, kind] tuples. Updated on create_index(). Loaded first on open so indexes are rebuilt during log replay.

On open, NEDB:

  1. Loads meta.json and registers the index configuration.
  2. Reads log.aof line by line and folds each Op into the MVCC store, relations, and indexes.
  3. Restores the nonce counters and idempotency map from the ops themselves.
  4. Opens log.aof in append mode — all new ops go to the end.

The hash chain is preserved verbatim (hashes are not recomputed on load), so verify() and the head commitment survive restarts exactly.
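
The append-then-replay cycle can be sketched with a JSON-lines file in a temporary directory. This is a simplified illustration, not NEDB's loader: the op fields shown are a reduced, assumed subset, and hashes and indexes are left out.

```python
import json
import os
import tempfile

def append_op(path, op):
    """Append one JSON line and force it to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(op, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())

def replay(path):
    """Fold every logged op into fresh state and restore nonce counters."""
    state, nonces = {}, {}
    if not os.path.exists(path):
        return state, nonces
    with open(path, encoding="utf-8") as f:
        for line in f:
            op = json.loads(line)
            key = f'{op["coll"]}:{op["id"]}'
            if op["op"] == "put":
                state[key] = op["doc"]
            elif op["op"] == "delete":
                state.pop(key, None)
            nonces[op["client"]] = max(nonces.get(op["client"], -1), op["nonce"])
    return state, nonces

with tempfile.TemporaryDirectory() as data_dir:
    aof = os.path.join(data_dir, "log.aof")
    append_op(aof, {"op": "put", "coll": "users", "id": "alice",
                    "doc": {"name": "Alice"}, "client": "local", "nonce": 0})
    append_op(aof, {"op": "put", "coll": "users", "id": "bob",
                    "doc": {"name": "Bob"}, "client": "local", "nonce": 1})
    append_op(aof, {"op": "delete", "coll": "users", "id": "bob",
                    "client": "local", "nonce": 2})
    state, nonces = replay(aof)   # what a fresh process would rebuild on open
```

Because the nonce counters are derived from the ops themselves, replay protection picks up where it left off without a separate file.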

nedbd — Server Daemon

Run NEDB as a long-lived process and connect clients over HTTP — the way you'd run Redis or Postgres. Each named database is a durable NEDB(path) held open in memory.

shell
nedbd
# nedbd 0.3.1 — http://127.0.0.1:7070  data=./nedb-data  auth=off

Environment variables

Variable      Default       Description
NEDBD_HOST    127.0.0.1     Bind address
NEDBD_PORT    7070          Bind port
NEDBD_DATA    ./nedb-data   Root directory for database files. Each database is a subdirectory.
NEDBD_TOKEN   unset         Optional bearer token. If set, every /v1/* request must include Authorization: Bearer <token>.

shell — custom config
NEDBD_PORT=9000 NEDBD_DATA=/var/db/nedb NEDBD_TOKEN=my-secret nedbd

HTTP Routes

All request/response bodies are JSON. Authentication (if NEDBD_TOKEN is set) uses Authorization: Bearer <token> on every /v1/* request.

Method   Route                                 Description
GET      /health                               Ping. Returns {"ok": true, "version": "0.3.1", "databases": [...]}. No auth required.
GET      /v1/databases                         List all databases with summary (name, seq, head, rows, collections).
POST     /v1/databases                         Create a database. Body: {"name": "shop", "init": {...}}. init is optional; see the init payload below.
GET      /v1/databases/:name                   Full detail: collections, indexes, seq, head, integrity check, recent log.
DELETE   /v1/databases/:name                   Drop the database and delete its files. Irreversible.
POST     /v1/databases/:name/query             Run NQL. Body: {"nql": "FROM users LIMIT 10"}. Returns {"rows": [...], "count": N, "seq": N, "head": "..."}
POST     /v1/databases/:name/put               Insert or replace a row. Body: {"coll": "users", "id": "u1", "doc": {...}, "client"?, "nonce"?, "idem"?}
POST     /v1/databases/:name/index             Add an index. Body: {"coll": "users", "field": "status", "kind": "eq"}
POST     /v1/databases/:name/link              Create a graph edge. Body: {"frm": "users:u1", "rel": "follows", "to": "users:u2"}
DELETE   /v1/databases/:name/rows/:coll/:id    Delete a row by collection and id. Appended to the log.
GET      /v1/databases/:name/verify            Re-walk the hash chain. Returns {"ok": true, "seq": N, "head": "..."}
GET      /v1/databases/:name/log?limit=N       Recent log entries (newest first). Default limit: 50.

The init payload

When creating a database you can seed it in one call:

{
  "name": "shop",
  "init": {
    "indexes": [
      ["users", "status", "eq"],
      ["orders", "total", "ordered"]
    ],
    "seed": {
      "users": [{"id": "u1", "name": "Ada", "status": "active"}]
    },
    "links": [
      ["users:u1", "placed", "orders:o1"]
    ]
  }
}

curl Examples

Create a database
curl -X POST http://localhost:7070/v1/databases \
  -H 'Content-Type: application/json' \
  -d '{"name":"myapp"}'

Query
curl -X POST http://localhost:7070/v1/databases/myapp/query \
  -H 'Content-Type: application/json' \
  -d '{"nql":"FROM users WHERE status = \"active\" ORDER BY age DESC LIMIT 5"}'

Insert a row
curl -X POST http://localhost:7070/v1/databases/myapp/put \
  -H 'Content-Type: application/json' \
  -d '{"coll":"users","id":"u1","doc":{"name":"Ada","status":"active","age":31}}'

Verify integrity
curl http://localhost:7070/v1/databases/myapp/verify
# {"ok": true, "seq": 1, "head": "a3f..."}

With bearer auth
curl http://localhost:7070/v1/databases \
  -H 'Authorization: Bearer my-secret'

Node / JavaScript

Connect to a running nedbd instance over HTTP. No native addon required — pure fetch.

JavaScript (Node 18+ / browser)
const BASE = 'http://127.0.0.1:7070'

async function query(db, nql) {
  const res = await fetch(`${BASE}/v1/databases/${db}/query`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ nql })
  })
  return (await res.json()).rows
}

async function put(db, coll, id, doc) {
  const res = await fetch(`${BASE}/v1/databases/${db}/put`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ coll, id, doc })
  })
  return res.json()
}

// Usage
const users = await query('myapp', 'FROM users WHERE status = "active"')
await put('myapp', 'users', 'u2', { name: 'Bo', status: 'active', age: 28 })

Python requests

Python — HTTP client (no import nedb needed)
import requests

BASE = "http://127.0.0.1:7070"

rows = requests.post(
    f"{BASE}/v1/databases/myapp/query",
    json={"nql": 'FROM users WHERE status = "active" ORDER BY age DESC'}
).json()["rows"]

requests.post(
    f"{BASE}/v1/databases/myapp/put",
    json={"coll": "users", "id": "u3", "doc": {"name": "Cy", "status": "active"}}
)
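
The client snippets above omit the bearer token. As a rough sketch, a small helper can attach it when NEDBD_TOKEN is set on the server. This version uses only the standard library and only builds the request; build_request is a hypothetical helper name, and the token value is a placeholder.

```python
import json
import urllib.request
from typing import Optional

BASE = "http://127.0.0.1:7070"

def build_request(path: str, body: Optional[dict] = None,
                  token: Optional[str] = None,
                  method: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON request, adding the Authorization header when a token is given."""
    headers = {}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # With method=None, urllib picks POST when data is present and GET otherwise.
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)

req = build_request("/v1/databases/myapp/query",
                    {"nql": "FROM users LIMIT 10"}, token="my-secret")
drop = build_request("/v1/databases/myapp", token="my-secret", method="DELETE")

# To send against a running nedbd:
#   with urllib.request.urlopen(req) as resp:
#       rows = json.load(resp)["rows"]
```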

SQL Compatibility

Translate standard SQL to NQL and NEDB operations — no MySQL or MariaDB code involved. SQL is a familiar entry point; the NEDB engine executes everything natively.

Python
from nedb import NEDB
from nedb.sql import sql_exec, sql_to_nql

db = NEDB("./data")

# SELECT → NQL query
snap = db.seq  # sequence number to time-travel back to
sql_exec(db, "SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 10")
sql_exec(db, "SELECT * FROM users WHERE bio LIKE '%rust%'")       # LIKE → SEARCH
sql_exec(db, f"SELECT * FROM orders AS OF {snap}")               # NEDB extension: time-travel

# INSERT → db.put()
sql_exec(db, "INSERT INTO users (id, name, age, status) VALUES ('u1', 'Ada', 31, 'active')")

# UPDATE → fetch + merge + db.put()  (all other fields preserved)
sql_exec(db, "UPDATE users SET age = 32 WHERE id = 'u1'")

# DELETE → db.delete()
sql_exec(db, "DELETE FROM users WHERE id = 'u1'")

# See what NQL a SELECT compiles to:
sql_to_nql("SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 5")
# → 'FROM users WHERE status = "active" ORDER BY age DESC LIMIT 5'

Supported SQL

Statement                           Mapped to                              Notes
SELECT *                            NQL FROM … WHERE … ORDER BY … LIMIT    All columns returned (no projection yet)
WHERE field = 'v'                   WHERE field = "v"                      All six ops: = != < <= > >=
WHERE field LIKE '%x%'              SEARCH "x"                             Best-effort; requires a search index
AS OF n                             AS OF n                                NEDB extension: time-travel in SQL
INSERT INTO t (id, …) VALUES (…)    db.put()                               An id or _id column is required
UPDATE t SET f=v WHERE id = x       fetch + merge + db.put()               Only id-targeted updates; all other fields preserved
DELETE FROM t WHERE id = x          db.delete()                            Only id-targeted deletes

OR, JOIN, subqueries, and aggregates raise SQLUnsupportedError with a clear message. They are on the roadmap.
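
To show the flavour of the SELECT mapping, here is a deliberately small regex-based translator. It is not the nedb.sql implementation: select_to_nql is a hypothetical function, the local SQLUnsupportedError class is a stand-in, only a lone LIKE condition is handled, and subqueries and aggregates are not detected.

```python
import re

class SQLUnsupportedError(ValueError):
    """Stand-in for the error raised on unsupported SQL."""

_SELECT = re.compile(r"^\s*SELECT\s+\*\s+(FROM\s+.+?)\s*;?\s*$",
                     re.IGNORECASE | re.DOTALL)
_LIKE = re.compile(r"WHERE\s+\w+\s+LIKE\s+'%?([^'%]*)%?'", re.IGNORECASE)
_UNSUPPORTED = re.compile(r"\b(OR|JOIN|GROUP\s+BY)\b", re.IGNORECASE)

def select_to_nql(sql):
    match = _SELECT.match(sql)
    if not match:
        raise SQLUnsupportedError("only SELECT * is handled in this sketch")
    body = match.group(1)
    # Blank out string literals first so a value like 'this or that' is not flagged.
    without_literals = re.sub(r"'[^']*'", "''", body)
    found = _UNSUPPORTED.search(without_literals)
    if found:
        raise SQLUnsupportedError(f"{found.group(1).upper()} is not supported")
    body = _LIKE.sub(lambda m: f'SEARCH "{m.group(1)}"', body)   # LIKE -> SEARCH
    return re.sub(r"'([^']*)'", r'"\1"', body)                   # SQL quotes -> NQL quotes

plain = select_to_nql(
    "SELECT * FROM users WHERE status = 'active' ORDER BY age DESC LIMIT 5")
like = select_to_nql("SELECT * FROM users WHERE bio LIKE '%rust%'")

try:
    select_to_nql("SELECT * FROM users JOIN orders")
    join_rejected = False
except SQLUnsupportedError:
    join_rejected = True
```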

Redis Compatibility

Map Redis commands to NEDB primitives — no hiredis or Redis server code involved. Redis keys become NEDB collections; the engine handles persistence, integrity, and time-travel automatically.

Python
from nedb import NEDB
from nedb.redis_compat import RedisCompat

r = RedisCompat(NEDB("./data"))

# Strings
r.execute("SET", "k", "hello")           # → "OK"
r.execute("GET", "k")                    # → "hello"
r.execute("INCR", "counter")             # → 1
r.execute("MSET", "a", "1", "b", "2")   # → "OK"

# Hashes  (Redis key names with colons work: "user:1")
r.execute("HSET", "user:1", "name", "Ada", "age", "31")
r.execute("HGET", "user:1", "name")     # → "Ada"
r.execute("HGETALL", "user:1")          # → {"name": "Ada", "age": "31"}

# Sets
r.execute("SADD", "tags", "python", "rust")
r.execute("SMEMBERS", "tags")           # → {"python", "rust"}
r.execute("SISMEMBER", "tags", "rust") # → 1

# Lists
r.execute("RPUSH", "q", "a", "b", "c")
r.execute("LRANGE", "q", 0, -1)         # → ["a", "b", "c"]
r.execute("LPOP", "q")                  # → "a"

# Unsupported — clear error + roadmap link
r.execute("EXPIRE", "k", 60)            # → RedisUnsupportedError

Command coverage

Group                    Commands
Strings                  SET GET DEL EXISTS INCR INCRBY DECR DECRBY MSET MGET SETNX GETDEL APPEND STRLEN TYPE RENAME KEYS DBSIZE FLUSHDB
Hashes                   HSET HMSET HSETNX HGET HMGET HGETALL HDEL HEXISTS HKEYS HVALS HLEN HINCRBY
Sets                     SADD SMEMBERS SISMEMBER SREM SCARD SUNION SINTER SDIFF
Lists                    LPUSH RPUSH LRANGE LLEN LINDEX LSET LPOP RPOP
Unsupported (roadmap)    EXPIRE TTL PEXPIRE PTTL PERSIST · SUBSCRIBE PUBLISH · MULTI EXEC DISCARD WATCH

Key encoding: Redis key names that contain characters invalid in NQL collection names (e.g. user:1) are stored using a safe hex encoding (user__3a__1). This is transparent — you always use the original key name in commands.
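
An encoding of this shape can be sketched with two regular expressions. The functions below are hypothetical and reproduce only the user:1 → user__3a__1 example; escaping the underscore as well is an assumption made here so the mapping stays reversible, and the real adapter may treat it differently.

```python
import re

def encode_key(key):
    """Replace every character outside [A-Za-z0-9] with __<hex code point>__."""
    return re.sub(r"[^A-Za-z0-9]",
                  lambda m: f"__{ord(m.group()):02x}__", key)

def decode_key(name):
    """Turn each __<hex>__ token back into the character it stands for."""
    return re.sub(r"__([0-9a-f]{2,6})__",
                  lambda m: chr(int(m.group(1), 16)), name)

encoded = encode_key("user:1")
round_trip = decode_key(encode_key("cache_v2:session/42"))
```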

wrap_redis() — Redis Layer-2

Already running on Redis? Wrap your connection in one line and add NEDB's full feature set alongside your existing app — zero migration required, zero impact on your existing keys.

Two surfaces coexist on the same connection:

  • Surface 1: every Redis command (r.set, r.hset, r.get, …) passes through to Redis unchanged. Alice's app code doesn't change.
  • Surface 2: r.nedb.* exposes the full NEDB API, including NQL queries, time-travel, causal provenance, and hash chain verification.

Python — one-line wrap
import redis, json
from nedb import wrap_redis

# ONE LINE — Alice's app code doesn't change
r = wrap_redis(redis.Redis("localhost", 6379), db_name="rideshare")

# Surface 1 — existing Redis commands, unchanged
r.set("driver:d1", json.dumps({"name": "Bob", "status": "active"}))
r.get("driver:d1")     # → b'{"name": "Bob", ...}'
r.hset("trip:t1", mapping={"status": "requested"})

# Surface 2 — new NEDB features on the same connection
r.nedb.put("driver", "d1", {"name": "Bob", "status": "active"})
r.nedb.query('FROM driver WHERE status = "active"')
r.nedb.verify()    # → True

Isolation guarantee

NEDB never writes to Alice's namespace. It owns only the keys under the nedb:{db_name}: prefix:

Key                        Type            Purpose
nedb:{db_name}:oplog       Redis Stream    Append-only op log (in-process mode)
nedb:{db_name}:snapshot    Redis Hash      Checkpoint
nedb:{db_name}:meta        Redis Hash      Index configuration

wrap_redis() signature

Python
wrap_redis(
    r,                         # redis.Redis (or compatible) connection
    db_name: str = "default",  # logical name; NEDB uses nedb:{db_name}:*
    nedbd_url: str = None,     # route r.nedb.* to a nedbd server (see below)
    nedbd_token: str = None,   # bearer token for nedbd auth (optional)
) → WrappedRedis

Local testing — no Redis server needed: use fakeredis as a drop-in. See examples/fakeredis_demo.py in the repo.

Backfill — import existing Redis data

One-time SCAN of Alice's existing Redis keys into NEDB's hash chain. After backfill, all historical data is queryable via NQL, time-travelable, and verified.

Three-step migration

Python
# Step 1 — register: map Redis key globs to NEDB collections (chainable)
(r.nedb
 .register("driver:*", collection="driver", value_parser=json.loads)
 .register("trip:*",   collection="trip",   value_type="hash")
)

# Step 2 — backfill: import all existing Redis data once
imported = r.nedb.backfill()     # → int (number of keys imported)

# Step 3 — shadow: all future surface-1 writes auto-chain
r.nedb.shadow_writes = True

register() — collection mapping

Param          Type        Description
pattern        str         Redis key glob, e.g. "driver:*"
collection     str         NEDB collection name
id_extractor   callable    fn(key) → id. Default: key.rsplit(":", 1)[-1]
value_parser   callable    fn(raw) → dict. Default: JSON decode, fallback to {"_v": raw}
value_type     str         "string" (default) · "hash" · "json"

Returns self for chaining.
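
The documented defaults are easy to try in isolation. The sketch below pairs them with a hypothetical route() helper that matches keys against registered globs; it uses fnmatch, whose pattern rules are close to but not identical to Redis globs, and wrapping non-dict JSON values in {"_v": raw} is an assumption.

```python
import fnmatch
import json

def default_id_extractor(key):
    return key.rsplit(":", 1)[-1]

def default_value_parser(raw):
    """JSON-decode the value; fall back to {"_v": raw} when that is not possible."""
    try:
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        parsed = json.loads(text)
    except (TypeError, ValueError):
        return {"_v": raw}
    return parsed if isinstance(parsed, dict) else {"_v": raw}

def route(key, raw, registry):
    """Return (collection, id, doc) for the first matching pattern, else None."""
    for pattern, collection in registry:
        if fnmatch.fnmatchcase(key, pattern):
            return collection, default_id_extractor(key), default_value_parser(raw)
    return None

registry = [("driver:*", "driver"), ("trip:*", "trip")]
matched = route("driver:d1", b'{"name": "Bob"}', registry)
unmatched = route("zone:z9", "plain text", registry)
fallback = default_value_parser("plain text")
```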

backfill() — one-pass SCAN import

Python
# Backfill all registered patterns
imported = r.nedb.backfill()

# Or backfill a single pattern directly (no prior register needed)
imported = r.nedb.backfill("zone:*", "zone", value_parser=json.loads)

Every imported record gets _source: "backfill" and _evidence: "backfill" in the hash chain entry. Returns total keys imported (int).

Write Shadowing

Set r.nedb.shadow_writes = True and every surface-1 write is silently mirrored into NEDB's hash chain. Alice's app writes to Redis normally; NEDB captures the full write history without any code changes.

Python
r.nedb.shadow_writes = True

# Alice's code — unchanged
r.set("driver:d1", json.dumps({"name": "Bob", "status": "active"}))
r.hset("trip:t1", mapping={"status": "en_route", "driver_id": "d1"})

# → both writes auto-chained into NEDB
r.nedb.get("driver", "d1")  # → {"name": "Bob", "status": "active", "_source": "shadow"}

# HSET merges with existing NEDB doc
r.nedb.get("trip", "t1")   # → {"status": "en_route", "driver_id": "d1", "rider_id": "u1", ...}

# Time-travel through shadowed writes
snap = r.nedb.seq
r.set("driver:d1", json.dumps({"name": "Bob", "status": "offline"}))
r.nedb.get_as_of("driver", "d1", snap)   # → {"status": "active", ...}

# Disable at any time
r.nedb.shadow_writes = False

Shadowed commands

The following write commands are intercepted when shadow_writes=True: set setnx setex psetex getset mset msetnx hset hmset hsetnx hincrby hincrbyfloat hdel lpush rpush lset ltrim lpop rpop sadd srem zadd zincrby zrem del unlink rename append incr incrby decr decrby setrange.

Keys that match a registered pattern get a full NEDB put() (NQL-queryable, time-travelable). Unmatched keys get a raw chain entry (tamper-evidence only).

Shadow failures never break the Redis surface call — any error in NEDB is silently swallowed.
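
The "mirror after the real call, swallow mirror errors" behaviour can be sketched with a thin proxy. Everything here is hypothetical: ShadowingProxy, FakeKV, and the three-command SHADOWED set exist only for the example and do not reflect how wrap_redis() is implemented.

```python
class ShadowingProxy:
    """Forward every call to the backend; mirror selected writes afterwards."""

    SHADOWED = {"set", "hset", "delete"}   # small subset for the demo

    def __init__(self, backend, mirror):
        self._backend = backend
        self._mirror = mirror              # callable(command, args, kwargs)
        self.shadow_writes = False

    def __getattr__(self, name):
        target = getattr(self._backend, name)
        if name not in self.SHADOWED or not callable(target):
            return target

        def wrapper(*args, **kwargs):
            result = target(*args, **kwargs)     # the primary call always runs first
            if self.shadow_writes:
                try:
                    self._mirror(name, args, kwargs)
                except Exception:
                    pass                         # mirror errors never reach the caller
            return result
        return wrapper

class FakeKV:
    """In-memory stand-in for a Redis connection."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value
        return True
    def get(self, key):
        return self.data.get(key)

captured = []

def flaky_mirror(command, args, kwargs):
    captured.append((command, args))
    raise RuntimeError("mirror is down")

r = ShadowingProxy(FakeKV(), flaky_mirror)
r.shadow_writes = True
ok = r.set("driver:d1", "active")   # succeeds even though the mirror raises
```

Swallowing errors keeps the primary path safe, at the cost that a failing mirror goes unnoticed unless it is logged or counted somewhere.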

wrap_redis() + nedbd Mode

Pass nedbd_url= to route all r.nedb.* calls to a running nedbd server instead of the in-process engine. nedbd handles its own durable AOF persistence on disk; the Redis Stream backend is bypassed entirely.

Python
import json
import redis
from nedb import wrap_redis

r = wrap_redis(
    redis.Redis("localhost", 6379),
    db_name="rideshare",
    nedbd_url="http://localhost:8421",   # ← nedbd server; port must match NEDBD_PORT (default 7070)
    nedbd_token="my-secret",             # ← optional bearer token
)

# Surface 1 — Redis, unchanged
r.set("driver:d1", "...")

# Surface 2 — forwarded to nedbd over HTTP/JSON
r.nedb.put("driver", "d1", {"name": "Bob"})
r.nedb.query('FROM driver WHERE status = "active"')
r.nedb.verify()   # → True

# Backfill + shadow_writes work identically in nedbd mode
r.nedb.register("driver:*", "driver", value_parser=json.loads)
r.nedb.backfill()
r.nedb.shadow_writes = True

Mode comparison

Mode         How                              Persistence            Best for
In-process   wrap_redis(r, db_name=…)         Redis Stream oplog     Lightweight; single-process apps
nedbd        wrap_redis(r, …, nedbd_url=…)    Durable AOF on disk    Multi-process; production deployments

NedBdProxy is the internal HTTP client that translates r.nedb.* calls to nedbd's /v1/databases/{name}/* API. The surface API is identical in both modes — switching is a one-line change to wrap_redis().

Causal provenance in nedbd mode

As of v1.2.1 the nedbd PUT endpoint accepts caused_by, evidence, confidence, valid_from, and valid_to from the request body and passes them through to the engine:

Python
r.nedb.put("trip", "t1",
    {"rider": "u1", "driver": "d1"},
    caused_by=[r.nedb.seq - 1],
    evidence="inference",
    confidence=0.97)

r.nedb.query('FROM trip WHERE _id = "t1" TRACE caused_by')  # works in nedbd mode

Auto-Indexing

Wrap a NEDB instance with AutoIndexDB and indexes are created automatically based on observed query patterns — no manual create_index() calls required.

Python
from nedb import NEDB, AutoIndexDB

db = AutoIndexDB(NEDB("./data"), threshold=5, verbose=True)

# Query as normal — field usage is tallied automatically
db.query('FROM users WHERE status = "active"')   # 1/5
db.query('FROM users WHERE status = "active"')   # 2/5
# ... 3 more ...
db.query('FROM users WHERE status = "active"')
# [autoindex] created eq index on users.status (threshold=5)

# Check what's been created and what's close
db.analyze()
# {"tallies": {"users.status (eq)": 5}, "indexes_created": ["users.status (eq)"], "threshold": 5}

db.suggest()
# ["users.age (ordered) — 3/5 queries"]  ← not yet at threshold

# All other NEDB methods work unchanged — AutoIndexDB is a transparent proxy
db.put("users", "u1", {"name": "Ada", "status": "active"})
db.get("users", "u1")
db.verify()

Parameter   Type   Description
db          NEDB   Any NEDB instance (embedded or durable)
threshold   int    Query count before an index is auto-created. Default: 5
verbose     bool   Print a message when an index is created. Default: False

How it works

Every query() call is intercepted. The NQL string is parsed to extract WHERE field names and ORDER BY fields. Each (collection, field, kind) combination is tallied. Once the count reaches threshold, create_index() is called automatically. Equality conditions (=, !=) create an eq index; range comparisons (<, >, <=, >=) and ORDER BY create an ordered index.
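
The tallying logic can be approximated with a few regular expressions. The IndexAdvisor class below is a hypothetical sketch, not AutoIndexDB itself: it does not use the real NQL parser, it can be confused by keywords inside string values, and it records the decision instead of calling create_index().

```python
import re
from collections import Counter

_FROM = re.compile(r"^\s*FROM\s+(\w+)", re.IGNORECASE)
_WHERE = re.compile(
    r"\bWHERE\b(.*?)(?=\bSEARCH\b|\bORDER\s+BY\b|\bTRAVERSE\b|\bLIMIT\b|$)",
    re.IGNORECASE | re.DOTALL)
_COND = re.compile(r'(\w+)\s*(!=|<=|>=|=|<|>)\s*(?:"[^"]*"|-?[\w.]+)')
_ORDER = re.compile(r"\bORDER\s+BY\s+(\w+)", re.IGNORECASE)

class IndexAdvisor:
    def __init__(self, threshold=5):
        self.threshold = threshold
        self.tallies = Counter()   # (coll, field, kind) -> queries seen
        self.created = []          # specs that reached the threshold

    def observe(self, nql):
        source = _FROM.match(nql)
        if not source:
            return
        coll = source.group(1)
        wanted = []
        where = _WHERE.search(nql)
        if where:
            for field, op in _COND.findall(where.group(1)):
                kind = "eq" if op in ("=", "!=") else "ordered"
                wanted.append((coll, field, kind))
        order = _ORDER.search(nql)
        if order:
            wanted.append((coll, order.group(1), "ordered"))
        for spec in dict.fromkeys(wanted):       # count each spec once per query
            self.tallies[spec] += 1
            if self.tallies[spec] == self.threshold:
                self.created.append(spec)        # a real wrapper would create the index here

advisor = IndexAdvisor(threshold=5)
for _ in range(5):
    advisor.observe('FROM users WHERE status = "active"')
for _ in range(3):
    advisor.observe('FROM users WHERE age >= 25 ORDER BY age DESC')
```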

Snapshot Checkpoints

Capture the full database state to disk — anchored in the hash chain — so future starts are O(delta) instead of O(total). The chain never breaks: the checkpoint is a real op in the AOF.

Python
from nedb import NEDB

db = NEDB("./data")
# ... write 100 K rows ...
db.checkpoint()   # anchor state in the chain → writes snapshot.json
db.close()

# Next open: loads snapshot then replays only delta ops
db2 = NEDB("./data")
assert db2.verify()   # chain from genesis → checkpoint op → delta: intact

nedbd auto-checkpoints on SIGTERM / SIGINT. Stopping the daemon with Ctrl+C or kill writes a checkpoint for every open database before shutdown. The next start loads from those snapshots — restart time is proportional to writes since the last checkpoint, not total history.
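
The shutdown path can be sketched with the standard library alone. The changelog notes that the signal handler hands httpd.shutdown() to a daemon thread to avoid a deadlock with serve_forever(); the sketch below shows that pattern. The names make_shutdown_handler, serve, and checkpoint_all are hypothetical stand-ins, not nedbd's internals.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_shutdown_handler(httpd, checkpoint_all):
    """Build a signal handler that checkpoints, then stops the server."""

    def handler(signum, frame):
        checkpoint_all()  # hypothetical: checkpoint every open database
        # shutdown() blocks until serve_forever() returns. The handler runs in
        # the thread that is inside serve_forever(), so calling shutdown()
        # directly would wait on itself. A helper thread avoids that.
        threading.Thread(target=httpd.shutdown, daemon=True).start()

    return handler


def serve(httpd, checkpoint_all):
    """Install the handler for SIGTERM and SIGINT, then serve until stopped."""
    handler = make_shutdown_handler(httpd, checkpoint_all)
    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)
    try:
        httpd.serve_forever()
    finally:
        httpd.server_close()
```

A caller would build the server, for example ThreadingHTTPServer(("127.0.0.1", 7070), SomeHandler), and pass it to serve() from the main thread, since Python only allows signal handlers to be installed there.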

You can also trigger a checkpoint over HTTP at any time:

curl -X POST localhost:7070/v1/databases/mydb/checkpoint
# {"ok": true, "head": "abc...", "seq": 1042}

What the snapshot contains

  • MVCC store (every key's HEAD value and write seq)
  • Relations (graph edges with added/removed seqs)
  • Index configuration (eq / ordered / search specs)
  • BlobStore chunks and file manifests (both tiers, compressed)
  • Nonce and idempotency tables (replay protection survives restart)

Pre-checkpoint time-travel: AS OF queries for seqs before the snapshot require the full AOF. Keep the log file for archival use if you need indefinite time-travel; otherwise snapshots are self-contained for all operations after the checkpoint seq.
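
To make the O(delta) claim concrete, here is a toy model of a hash-chained log with a checkpoint op. It is a sketch under stated assumptions: ToyLog, chain_hash, the JSON serialisation, and the 32-byte BLAKE2b digest are illustrative choices, not NEDB's on-disk format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the pre-genesis hash


def chain_hash(prev, op):
    """Hash an op together with the previous head (BLAKE2b, 32-byte digest)."""
    payload = prev + json.dumps(op, sort_keys=True)
    return hashlib.blake2b(payload.encode("utf-8"), digest_size=32).hexdigest()


class ToyLog:
    def __init__(self):
        self.entries = []     # list of (op, hash), index i holds seq i + 1
        self.state = {}
        self.snapshot = None

    @property
    def head(self):
        return self.entries[-1][1] if self.entries else GENESIS

    def append(self, op):
        """Append an op to the chain, apply it, and return its seq."""
        self.entries.append((op, chain_hash(self.head, op)))
        if op["type"] == "put":
            self.state[op["key"]] = op["value"]
        return len(self.entries)

    def checkpoint(self):
        """Write a checkpoint op into the chain, then capture state at that seq."""
        seq = self.append({"type": "checkpoint"})
        self.snapshot = {"seq": seq, "head": self.head, "state": dict(self.state)}

    def reopen(self):
        """Rebuild state from the snapshot plus delta; return ops replayed."""
        start, state = 0, {}
        if self.snapshot:
            start, state = self.snapshot["seq"], dict(self.snapshot["state"])
        replayed = 0
        for op, _ in self.entries[start:]:
            if op["type"] == "put":
                state[op["key"]] = op["value"]
            replayed += 1
        self.state = state
        return replayed

    def verify(self):
        """Rewalk the chain from genesis, through the checkpoint op, to the head."""
        prev = GENESIS
        for op, digest in self.entries:
            if chain_hash(prev, op) != digest:
                return False
            prev = digest
        return True
```

After 100 puts, a checkpoint, and one more put, reopen() replays a single op rather than 102, while verify() still covers every entry because the checkpoint is itself part of the chain.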

TTL / Key Expiry

Set a time-to-live on any document. Expiry is lazy (checked on every read) and append-only (a delete op is written to the log when a key expires).

Python
# Put with TTL
db.put("cache", "session", {"token": "xyz"}, ttl_s=3600)   # expires in 1 hour

# Set / update TTL on an existing doc
db.expire("cache", "session", ttl_s=300)   # update to 5 minutes; False if not found

# Bulk sweep (call periodically for background maintenance)
n = db.sweep()   # → number of expired docs deleted

Method | Description
put(..., ttl_s=N) | Store _expires_at = now + N in the doc. Lazy expiry on every subsequent get().
expire(coll, id, ttl_s) | Set or update TTL on an existing document. Returns False if the document doesn't exist.
sweep() | Scan and delete all expired documents now. Returns the count deleted.

Time-travel ignores expiry. get(..., as_of=seq) always returns what was true at that seq, even if the document has since expired. Expiry only fires on HEAD reads.
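
The interaction between lazy expiry and time-travel can be modelled with a small versioned store. This is an illustrative sketch, not NEDB's implementation: ToyTTLStore, its injectable clock, and the use of None as a delete marker are assumptions for the example.

```python
import time


class ToyTTLStore:
    """Versioned key store with lazy, append-only expiry."""

    def __init__(self, clock=time.time):
        self.clock = clock     # injectable so expiry can be tested deterministically
        self.seq = 0
        self.versions = {}     # key -> list of (seq, doc or None)

    def _write(self, key, doc):
        self.seq += 1
        self.versions.setdefault(key, []).append((self.seq, doc))
        return self.seq

    def put(self, key, doc, ttl_s=None):
        doc = dict(doc)
        if ttl_s is not None:
            doc["_expires_at"] = self.clock() + ttl_s
        return self._write(key, doc)

    def get(self, key, as_of=None):
        history = self.versions.get(key, [])
        if as_of is not None:
            # Time-travel read: latest version at or before as_of, no expiry check.
            doc = None
            for seq, version in history:
                if seq <= as_of:
                    doc = version
            return doc
        if not history:
            return None
        doc = history[-1][1]
        if doc is not None and doc.get("_expires_at", float("inf")) <= self.clock():
            self._write(key, None)  # expiry is recorded as a delete, never an erase
            return None
        return doc

    def sweep(self):
        """Force the lazy check on every live key; return how many expired."""
        expired = 0
        for key in list(self.versions):
            live = self.versions[key][-1][1] is not None
            if live and self.get(key) is None:
                expired += 1
        return expired
```

Once the clock passes the expiry time, a HEAD read returns None and appends a delete version, while get(key, as_of=seq) for the original write still returns the document.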

The Redis adapter's EXPIRE, TTL, and PTTL commands now map to db.expire().

GROUP BY Aggregations

Aggregate query results by a field. Compatible with WHERE, SEARCH, and LIMIT — filtering happens before grouping.

NQL examples
# Count rows per group
db.query("FROM orders GROUP BY status COUNT")
# → [{"status": "paid", "count": 42}, {"status": "pending", "count": 7}]

# Sum a numeric field
db.query("FROM sales GROUP BY region SUM revenue")
# → [{"region": "north", "count": 3, "sum_revenue": 15000}, ...]

# Average
db.query("FROM scores GROUP BY grade AVG score")

# Min / Max
db.query("FROM items GROUP BY category MIN price")
db.query("FROM items GROUP BY category MAX price")

# With WHERE (filter before grouping)
db.query('FROM orders WHERE region = "EU" GROUP BY status COUNT')

Aggregate functions

Syntax | Output field | Description
GROUP BY f COUNT | count | Number of rows in the group
GROUP BY f SUM field | sum_field | Sum of field across the group
GROUP BY f AVG field | avg_field | Average of field
GROUP BY f MIN field | min_field | Minimum value of field
GROUP BY f MAX field | max_field | Maximum value of field

Every group result always includes count alongside the requested aggregate. Non-numeric values are skipped for SUM / AVG / MIN / MAX.
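
The two rules above (count is always present, non-numeric values are skipped) can be expressed as a plain function over rows. This is a sketch of the documented semantics, not the engine's code; the group_by name is hypothetical, and returning None for a group with no numeric values is an assumption the source does not specify.

```python
def group_by(rows, field, agg="COUNT", target=None):
    """Group rows by `field` and apply one aggregate to `target`."""
    groups = {}
    for row in rows:
        groups.setdefault(row.get(field), []).append(row)

    results = []
    for key, members in groups.items():
        result = {field: key, "count": len(members)}  # count is always included
        if agg != "COUNT":
            # Keep only numeric values; bool is excluded although it subclasses int.
            nums = [
                r[target] for r in members
                if isinstance(r.get(target), (int, float))
                and not isinstance(r.get(target), bool)
            ]
            name = "%s_%s" % (agg.lower(), target)
            if agg == "SUM":
                result[name] = sum(nums)
            elif agg == "AVG":
                result[name] = sum(nums) / len(nums) if nums else None  # assumption
            elif agg == "MIN":
                result[name] = min(nums) if nums else None  # assumption
            elif agg == "MAX":
                result[name] = max(nums) if nums else None  # assumption
            else:
                raise ValueError("unknown aggregate: %s" % agg)
        results.append(result)
    return results
```

With three "north" rows whose revenue values are 5000, 10000, and the string "n/a", a SUM over revenue yields count 3 and sum_revenue 15000: the string is counted as a row but left out of the sum.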

Encryption at Rest

AES-256-GCM at-rest encryption with a double-envelope key structure. Toggle-able — zero overhead when disabled. Encrypts the AOF, snapshot.json, and BlobStore chunks.

The double envelope

External TMK (Table Master Key)  ← provided by operator (env / arg / key file)
    ↓ AES-256-GCM wrap
DEK (Data Encryption Key)  ← random per database → stored in key.enc
    ↓ AES-256-GCM encrypt
Data (AOF lines, snapshot.json, blob chunks)

Enable encryption

Python — programmatic TMK
db = NEDB("./data", tmk=bytes.fromhex("a3f1..."))   # 32-byte key, written as 64 hex chars
Environment variable (recommended for production)
# .env
NEDB_TMK=a3f1...  # 64-char hex string (any length accepted, normalised via HKDF)

# or a key file
NEDB_TMK_FILE=/run/secrets/nedb-tmk
nedbd — encrypts all databases it manages
NEDB_TMK=a3f1... nedbd
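
The note that a TMK of any length is normalised via HKDF can be illustrated with the standard library. The sketch below implements HKDF (RFC 5869) with HMAC-SHA256 and derives a 32-byte key. The hash choice, the empty salt, the info label, and treating the input as a UTF-8 string are all assumptions; the source does not say which parameters NEDB uses.

```python
import hashlib
import hmac


def hkdf_sha256(ikm, length=32, salt=b"", info=b""):
    """HKDF extract-then-expand (RFC 5869) using HMAC-SHA256."""
    if not salt:
        salt = b"\x00" * hashlib.sha256().digest_size  # RFC default salt
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                            # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def normalise_tmk(raw):
    """Map operator-supplied key material of any length to a 32-byte key."""
    # "nedb-tmk-example" is a hypothetical context label for this sketch only.
    return hkdf_sha256(raw.encode("utf-8"), length=32, info=b"nedb-tmk-example")
```

The derivation is deterministic, so the same NEDB_TMK value always yields the same 32-byte key, which is the size AES-256 requires.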

Key rotation

Re-wrap the DEK under a new TMK without re-encrypting any data. The old TMK is immediately rejected after rotation.

db = NEDB("./data", tmk=old_key)
db.rewrap_key(old_tmk=old_key, new_tmk=new_key)
db.close()
# Database now only opens with new_key

What is encrypted

File | Coverage
log.aof | Every op line encrypted individually: {"enc":1,"ct":"<b64>"}
snapshot.json | Entire file encrypted as a single envelope
key.enc | Wrapped DEK (never the plaintext DEK)
BlobStore chunks | Each chunk encrypted before base64 storage in the snapshot

The hash chain is unaffected. verify(), AS OF, and Merkle proofs operate on decrypted log entries — encryption is a transparent layer below the log. The chain from genesis through every checkpoint op to the current head remains continuously verifiable.

Bundled since v0.5.5: cryptography is a required dependency and ships with every pip install nedb-engine. No separate install is needed.

RESP2 Wire Protocol

nedbd speaks the Redis Serialization Protocol (RESP2). redis-cli, redis-benchmark, and Redis client libraries in any language connect to nedbd directly, with no Redis installation required.

Enable RESP2

# env var: NEDBD_RESP2_PORT (0 = disabled, default)
NEDBD_RESP2_PORT=6379 nedbd

# Boot log:
#   nedbd 0.6.0 — http://127.0.0.1:7070  data=./nedb-data  auth=off
#   resp2  — redis://  127.0.0.1:6379  (RESP2 wire protocol)

Connect with redis-cli

redis-cli -p 6379 PING                              # PONG
redis-cli -p 6379 SELECT salonbooking             # OK — open that database
redis-cli -p 6379 SET key "hello"                  # OK
redis-cli -p 6379 HSET "user:1" name Ada age 31   # :2
redis-cli -p 6379 SADD tags python rust            # :2
redis-cli -p 6379 SMEMBERS tags                     # {"python", "rust"}
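
For readers curious what these commands look like on the wire, the sketch below encodes a command and decodes a reply using the standard RESP2 framing (arrays of bulk strings for requests; simple strings, errors, integers, bulk strings, and arrays for replies). The function names encode_command and decode_reply are hypothetical helpers, not part of nedbd.

```python
def encode_command(*args):
    """Encode a command as a RESP2 array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def decode_reply(buf, pos=0):
    """Decode one RESP2 reply starting at pos; return (value, next_pos)."""
    end = buf.index(b"\r\n", pos)
    kind, line = buf[pos:pos + 1], buf[pos + 1:end]
    nxt = end + 2
    if kind == b"+":                      # simple string, e.g. +OK
        return line.decode("utf-8"), nxt
    if kind == b"-":                      # error, e.g. -ERR ...
        raise RuntimeError(line.decode("utf-8"))
    if kind == b":":                      # integer
        return int(line), nxt
    if kind == b"$":                      # bulk string, $-1 means null
        size = int(line)
        if size == -1:
            return None, nxt
        return buf[nxt:nxt + size], nxt + size + 2
    if kind == b"*":                      # array of nested replies
        count = int(line)
        if count == -1:
            return None, nxt
        items = []
        for _ in range(count):
            item, nxt = decode_reply(buf, nxt)
            items.append(item)
        return items, nxt
    raise ValueError("unknown RESP2 type byte: %r" % kind)
```

encode_command("SET", "key", "hello") produces the bytes a client sends for the SET example above, and a reply of :2 decodes to the integer 2, matching the HSET and SADD results.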

NQL pass-through via EVAL

# EVAL runs any NQL query; rows are returned as JSON strings
redis-cli -p 6379 EVAL "FROM users WHERE status = \"active\" LIMIT 5" 0
# 1) "{\"_id\":\"u1\",\"name\":\"Ada\",\"status\":\"active\"}"
# 2) "{\"_id\":\"u2\",\"name\":\"Bo\",\"status\":\"active\"}"
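
Because each row comes back as a JSON string, a client has to decode it before use. The helper below is a small sketch; parse_eval_rows is a hypothetical name, and the input list stands in for whatever a Redis client library returns from its EVAL call.

```python
import json


def parse_eval_rows(reply):
    """Turn the JSON-string rows from EVAL into Python dicts."""
    rows = []
    for item in reply:
        # Clients may return bytes or str depending on their decode settings.
        text = item.decode("utf-8") if isinstance(item, bytes) else item
        rows.append(json.loads(text))
    return rows


# Rows as shown in the redis-cli output above.
sample = [
    b'{"_id":"u1","name":"Ada","status":"active"}',
    '{"_id":"u2","name":"Bo","status":"active"}',
]
names = [row["name"] for row in parse_eval_rows(sample)]
```

Here names holds the name field of each decoded row, in the order the rows were returned.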

SELECT maps to NEDB database names

SELECT <name> opens the named NEDB database (creates it if it doesn't exist). This replaces Redis's integer 0-15 DBs with NEDB's named databases — use the database name you deployed in nedbd.

Command coverage

Group | Commands
Strings | SET GET DEL EXISTS INCR INCRBY DECR DECRBY MSET MGET SETNX GETDEL APPEND STRLEN TYPE RENAME KEYS DBSIZE FLUSHDB
Hashes | HSET HMSET HSETNX HGET HMGET HGETALL HDEL HEXISTS HKEYS HVALS HLEN HINCRBY
Sets | SADD SMEMBERS SISMEMBER SREM SCARD SUNION SINTER SDIFF
Lists | LPUSH RPUSH LRANGE LLEN LINDEX LSET LPOP RPOP
Server | PING SELECT COMMAND QUIT DBSIZE KEYS TYPE FLUSHDB
NQL | EVAL "<nql>" 0 (run any NQL query)
Unsupported (roadmap) | EXPIRE TTL PTTL SUBSCRIBE PUBLISH MULTI EXEC (each returns a clear -ERR with a roadmap note)

Changelog

1.2.1

wrap_redis() + nedbd server mode

wrap_redis() now accepts nedbd_url= and nedbd_token= to route all r.nedb.* calls to a running nedbd HTTP server instead of the in-process engine. New NedBdProxy class translates put/get/query/create_index/link/delete/verify/checkpoint to nedbd's /v1/databases/{name}/* HTTP API. Backfill and write shadowing work identically in nedbd mode. Also: nedbd PUT endpoint now passes caused_by, evidence, confidence, valid_from, valid_to through to the engine (previously only client, nonce, idem were forwarded). 29/29 nedbd integration tests; 74/74 in-process tests.

1.2.0

wrap_redis() backfill + write shadowing

Three-step Redis migration: register() maps key globs to NEDB collections, backfill() scans existing Redis keys via SCAN and imports them into the hash chain in one pass (evidence tagged "backfill"), shadow_writes = True auto-chains all future surface-1 writes. New: CollectionMapping with custom id_extractor and value_parser; chainable register(); direct backfill(pattern, collection) without prior register. 30 new tests; 74/74 total. Also: updated fakeredis_demo.py (29/29 checks) and README wrap_redis section.

1.1.0

wrap_redis() — NEDB as a Redis layer-2

Wrap any existing redis.Redis connection with NEDB in one line: r = wrap_redis(redis.Redis(...), db_name="..."). Two surfaces coexist: Surface 1 — every Redis command passes through unchanged; Surface 2 — r.nedb.* gives time-travel, NQL, causal provenance, and hash-chain verification. NEDB persists via Redis Streams (nedb:{db_name}:oplog). Isolation guarantee: NEDB never writes outside its nedb:{db_name}: prefix. 44 tests; fakeredis demo.

1.0.5

License → GPL-3.0-or-later; npm homepage fix

Updated license in package.json and pyproject.toml from Apache-2.0 to GPL-3.0-or-later. Added explicit "homepage" to package.json to prevent npm from appending #readme.

1.0.3

Rust NQL integer/float comparison fix

serde_json::json!(3.0f64) serialises as Float(3.0) but a document field parsed from JSON arrives as PosInt(3). Fixed by comparing via as_f64() in the Rust nql.rs cmp() function so integer literals and float literals compare equal when numerically equal.

1.0.2

Rust AS OF index bypass fix

The eq index was being used even for AS OF queries; since the index reflects HEAD only, AS OF queries returned stale current-version results. Fixed with an as_of.is_none() guard so the eq index is only used for HEAD reads.

1.0.1

napi-rs Node.js binding rewrite

Complete rewrite of rust/crates/nedb-node/src/lib.rs to match the current NEDB core API. Fixes API drift from the original stub. All napi-rs bindings now match put/get/query/createIndex/link/verify/head/seq/getAsOf.

0.6.0

RESP2 wire protocol — nedbd is a drop-in Redis replacement

Enable with NEDBD_RESP2_PORT=6379 nedbd. redis-cli, redis-benchmark, and Redis client libraries connect natively. SELECT <name> opens a named NEDB database. EVAL "<nql>" 0 runs any NQL query. All major command groups are supported (strings, hashes, sets, lists); EXPIRE/TTL/SUBSCRIBE/MULTI return a clear -ERR with a roadmap note. Also: auto backfill-encrypt of an existing plaintext AOF on first encrypted open (v0.5.6), eager database open at startup so the backfill shows in the boot log (v0.5.7), and cryptography bundled as a required dependency (v0.5.5).

0.5.3

AES-256-GCM encryption at rest — double envelope TMK

Toggle-able at-rest encryption for the AOF, snapshots, and BlobStore chunks. Double-envelope: a random per-database DEK is wrapped by an external TMK (programmatic, NEDB_TMK env, or key file). The GCM tag detects any tampering on read. Key rotation via db.rewrap_key() re-wraps the DEK without touching data. The hash chain, verify(), and AS OF work unchanged through encryption. Requires pip install cryptography. 38 new tests; 127/127 total.

0.5.2

BlobStore (Cascade files) persisted in checkpoints

Files stored via put_file() are now fully preserved across checkpoint() + restart. The snapshot serialises both BlobStore tiers: compressed chunk bytes (encrypted if DEK set), file manifests, Merkle roots, and dedup stats. get_file(), file_root(), file_proof(), and compression_stats() all work identically after a snapshot-assisted restart.

0.5.1

nedbd auto-checkpoint on SIGTERM / SIGINT

The daemon checkpoints every open database before shutdown (SIGTERM or SIGINT) so the next startup loads from the snapshot with zero delta ops to replay. Signal handler uses a daemon thread to call httpd.shutdown() — avoids deadlock with serve_forever(). Also adds POST /v1/databases/:name/checkpoint for on-demand checkpoints.

0.5.0

Snapshot checkpoints, TTL/expiry, GROUP BY aggregations

Snapshots: db.checkpoint() captures state in a snapshot.json anchored in the hash chain (the checkpoint is a real log op). Future opens are O(delta). TTL: db.put(..., ttl_s=N), db.expire(), db.sweep(). Lazy expiry on read; AS OF reads skip expiry checks. Redis EXPIRE now functional. GROUP BY: FROM t GROUP BY f COUNT|SUM|AVG|MIN|MAX field in NQL. 30 new tests; 89/89 total.

0.4.2

Structured benchmark suite

bench/benchmarks.py: measures GET/PUT/time-travel, indexed vs unindexed query, adapter overhead vs raw NQL, in-memory vs AOF write cost, optional Redis TCP and nedbd HTTP comparisons. Results written to bench/RESULTS.md. README perf table updated with real measured numbers.

0.4.0

SQL adapter, Redis compatibility, auto-indexing

Three new pure-Python adapters, zero external dependencies. SQL: sql_exec(db, sql) translates SELECT/INSERT/UPDATE/DELETE to NQL and NEDB primitives — MariaDB users can write what they know and immediately get time-travel and hash-chain integrity. Redis: RedisCompat(db).execute(cmd, *args) maps SET/GET/HSET/HGETALL/SADD/SMEMBERS/LPUSH/LRANGE and 30+ more commands to NEDB. EXPIRE/TTL/SUBSCRIBE/MULTI return RedisUnsupportedError with a roadmap note. Auto-indexing: AutoIndexDB(db, threshold=5) tallies query field usage and creates indexes automatically. 48 new tests; 59/59 total.

0.3.1

Universal wheel — installs everywhere

Switched build backend to hatchling. Publishes a py3-none-any wheel + sdist that installs on any platform and Python version without a toolchain. Fixes the silent upgrade failure on Intel Mac caused by arm64-only wheels in 0.2.0/0.3.0. The nedbd command is always present after install.

0.3.0

nedbd — server daemon

Run NEDB as a long-lived HTTP/JSON server. pip install nedb-engine now ships the nedbd console command. Each named database is a durable NEDB(path) held open in memory. Multi-database, optional bearer token auth, CORS, ThreadingHTTPServer. Verified: create, seed, query, traverse, verify, kill + restart → data intact.

0.2.0

Durable AOF persistence

NEDB(path) opens a durable database. Every op is appended to log.aof and fsync'd immediately. Index config is snapshotted to meta.json. Database reloads by replaying the log on open — verify(), AS OF time-travel, relations, and the head commitment all survive restarts. NEDB() with no path remains in-memory (fully backward-compatible). Added: flush(), close(), context-manager support.

0.1.4

Stable pure-Python baseline

Universal py3-none-any wheel + sdist. Last release in the 0.1.x line before the persistence work. All features complete: log, MVCC, relations, indexes (eq/ordered/search), NQL, Cascade compression, Merkle proofs, fluent builder. 10/10 invariant tests pass.

0.1.3

Native Rust wheels

First native wheel release — platform wheels for macOS arm64, Linux x86_64, Windows x64 via maturin + PyO3. The optional nedb._native accelerator is loaded lazily; the package works identically without it.

0.1.0

Initial release

Pure-Python reference engine. Full feature set: append-only hash-chained log, MVCC time-travel, replay protection, idempotency, first-class relations, three index types, NQL query language, git-style Cascade-compressed file store, Merkle proofs.