colmena — distributed SQLite for Go

Live demo

Replicated event log

Watch a 4-node cluster heal itself.

A 4-machine Colmena cluster running right now on Fly.io. Every few seconds your browser fires a write at a random node — the row above shows which peer received it and which one applied the entry as leader, after Raft commits the replica. Restart any node with the ↻ button on its hero card; Fly brings it back in ~4 s and the cluster catches it up automatically.

polling every 2s from … Leader: electing… Source: examples/fly-ha API: colmena-ha-demo-5a1181.fly.dev

In one import

A 3-node cluster in 20 lines.

Construct a node, get a *sql.DB, write from anywhere. Writes are routed to the leader and replicated through Raft; reads hit local SQLite at native speed.

main.go

import "github.com/mentasystems/colmena"

node, _ := colmena.New(colmena.Config{
    NodeID:    "node-1",
    DataDir:   "./data/node1",
    Bind:      "0.0.0.0:9000",
    Bootstrap: true,
})
defer node.Close()

db := node.DB() // standard *sql.DB
db.Exec("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
db.Exec("INSERT INTO kv (key, value) VALUES (?, ?)", "hello", "world")

var value string
db.QueryRow("SELECT value FROM kv WHERE key = ?", "hello").Scan(&value)

What you get

Replicated SQLite without the operations tax.

Colmena combines hashicorp/raft for consensus with modernc.org/sqlite for storage. No separate database process, no sidecar, no out-of-band replication agent.

pure Go

No CGo, no C compiler. Cross-compiles cleanly to every GOOS/GOARCH.

database/sql

Drop-in *sql.DB. Existing queries, drivers, and test helpers work unchanged.

leader forwarding

Write from any node — colmena routes the call to the leader via RPC automatically.

write batching

Concurrent writes coalesce into a single Raft entry. ~60× throughput vs. unbatched.

buffered txns

db.Begin() / tx.Commit() packs every statement into one atomic Raft round-trip.

continuous backup

Litestream-style WAL streaming to filesystem or S3-compatible storage.

reactive hooks

OnApply fires on every node after each replicated write — fan out to caches, events, search.

custom RPC

Type-safe Forward[Req, Resp]() sends any request to the leader. Reuse the cluster's RPC plane.

background jobs

Job queue in colmena/jobs: typed handlers, retries with backoff, cron, cluster-wide concurrency caps. No Redis.

zero-config LAN

colmena/lan: flash one binary onto every machine. mDNS discovery, embedded mTLS CA, voter / non-voter failover.

One binary, one cluster. Ship a single Go binary with colmena embedded and you have a fault-tolerant database — no Postgres to run, no etcd to operate, no migration scripts at deploy time. The same process is the app and the data plane.

Read consistency

Three levels. Pick per query.

Every read carries a consistency level via context.Context. Default is Weak: always read from the node that processes writes, no quorum round-trip.

Level	Where it reads	Latency
ConsistencyNone	Local SQLite on this node — may lag if the node is behind on replication.	~8µs
ConsistencyWeak	Leader's local SQLite (forwarded via RPC if needed). Tiny staleness window during failover.	~90µs
ConsistencyStrong	Leader after quorum re-confirms leadership. Linearizable — impossible to read stale data.	~100µs+

Per-query override: ctx := colmena.WithConsistency(ctx, colmena.ConsistencyStrong). Mix freely — read-heavy paths can use None on followers and scale horizontally at native SQLite speed (~6µs).

Install

One `go get` away.

shell

$ go get github.com/mentasystems/colmena

# optional: zero-config LAN clustering (mDNS + embedded mTLS CA)
$ go get github.com/mentasystems/colmena/lan

# optional: S3-compatible continuous backup
$ go get github.com/mentasystems/colmena/backup/s3

Benchmarks

Throughput scales with concurrency.

The batcher coalesces concurrent writes into a single Raft entry, so throughput grows with the number of concurrent writers instead of being capped by Raft's per-entry fsync. Measured on Apple M1 Pro, 3-node cluster on localhost.

3,154w/s

128 writers · 2ms batch

4,224w/s

128 writers · 5ms batch

~11µs

P50 local follower read

Rule of thumb: a handful of concurrent writers gets you 1,000+ ops/sec; at 100+ writers you saturate SQLite on the leader, not Raft. Local reads from a follower match raw SQLite speed.

Design notes

Three principles.

embeddable

No separate database process. The cluster is your binary — distribute and run a single executable.

safe by default

Non-deterministic SQL (RANDOM(), datetime('now')) is rejected at the write path before it can silently diverge replicas. Self-describing format envelopes refuse to misinterpret data from a future release.

honest trade-offs

Single writer (inherent to Raft). 3-node minimum for fault tolerance. Statement-level replication. All documented up front — no surprises in production.

Replicated event log

Watch a 4-node cluster heal itself.

A 3-node cluster in 20 lines.

Replicated SQLite without the operations tax.

Three levels. Pick per query.

One go get away.

Throughput scales with concurrency.

Three principles.

One `go get` away.