LumoSQL: Fossil-on-LumoSQL tests

This directory holds three kinds of test for the LMDB-backed Fossil:

correctness-* decides whether a build works,
speed-* decides whether a change made Fossil quicker, and
stress-* tries to break Fossil under load.

Facts we assume:

Native SQLite-backed Fossil and libfossil are what we compare against
Fossil was invented to serve the sqlite.org repo, so that is the ultimate test of correctness and speed

Correctness

correctness-fossil-lmdb.sh runs checks against Fossil backed by an LMDB store. All the obvious operations (clone, rebuild, etc.) are tested reproducibly. Optional features are detected in the binary (for example, an LMDBv1 binary may have encryption compiled in) and, when present, are tested too. A failing command produces a "not ok" line and the run continues.
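
The "report and continue" behaviour can be sketched as below. This is a minimal illustration, not the code in correctness-fossil-lmdb.sh: the function name check, the counters n and failed, and the numbered "ok N - description" line layout are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: run one command, print a result line,
# and carry on whatever the outcome.
n=0
failed=0

check() {
    desc=$1
    shift
    n=$((n + 1))
    if "$@" >/dev/null 2>&1; then
        echo "ok $n - $desc"
    else
        failed=$((failed + 1))
        echo "not ok $n - $desc"
    fi
}

# Stand-in commands; a real run would call the fossil binary here.
check "first operation" true
check "second operation" false
check "run continues after a failure" true
echo "# $failed of $n checks failed"
```

Because check never exits the script, one failing operation cannot hide the results of the operations that follow it.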

correctness-libfossil-lmdb.sh is the same idea for libfossil: a normal lifecycle (new, open, add, commit) and the encryption check, driven through the f-* apps.

Stress

stress-fossil-lmdb.sh tries to break Fossil with load and contention, looking for corruption or crashes. There is no libfossil stress script yet: a concurrent-writer durability test needs libfossil to drive competing writers and to verify integrity afterward, which the f-* apps can't do.

Run the correctness tests, confirm they pass, then move to the speed tests.

Speed tests

Native Fossil is the standard measure. Every speed test reports each LMDB candidate's time as a multiple of the native time.
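
As an illustration of "a multiple of the native", the arithmetic is just candidate time divided by native time. The helper name ratio and the two-decimal "x" output format are assumptions for this sketch; the real scripts may format their reports differently.

```shell
#!/bin/sh
# ratio CANDIDATE_SECONDS NATIVE_SECONDS
# Prints the candidate time as a multiple of the native time.
# (Hypothetical helper, not part of speed-lib.sh.)
ratio() {
    awk -v c="$1" -v n="$2" 'BEGIN {
        if (n <= 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.2fx\n", c / n
    }'
}

ratio 2.50 2.00    # prints 1.25x
```

A value above 1.00x means the candidate was slower than native, and a value below 1.00x means it was quicker.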

A candidate for a speed test is a name:binary:env string. env is a comma-separated list of VAR=VALUE pairs (for example LUMO_LMDB_MAPSIZE=2048), so one binary can be entered several times under different runtime configurations (e.g. transaction mode). A speed test run answers "which (binary, configuration) is quickest for the end user?"
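
One way to take such a string apart is with POSIX parameter expansion, as sketched below. The function names parse_candidate and run_candidate and the cand_* variables are hypothetical, and the sketch assumes that names and paths contain no colons and that values contain no commas; the parsing in speed-lib.sh may differ.

```shell
#!/bin/sh
# parse_candidate 'name:binary:env'
# Sets cand_name, cand_bin and cand_env (hypothetical variable names).
parse_candidate() {
    cand_name=${1%%:*}          # text before the first colon
    rest=${1#*:}                # everything after the first colon
    cand_bin=${rest%%:*}        # text before the second colon
    case $rest in
        *:*) cand_env=${rest#*:} ;;   # may be empty, as in 'after:/path:'
        *)   cand_env= ;;             # no env field given at all
    esac
}

# run_candidate 'name:binary:env' [args...]
# Runs the binary with the VAR=VALUE pairs exported, in a subshell so
# the caller's environment is left untouched.
run_candidate() {
    parse_candidate "$1"
    shift
    (
        set -f                  # no globbing while splitting the env list
        IFS=,
        for pair in $cand_env; do
            export "$pair"
        done
        exec "$cand_bin" "$@"
    )
}

# Demonstration with sh standing in for a fossil binary.
run_candidate 'demo:sh:LUMO_LMDB_MAPSIZE=2048' -c 'echo "$LUMO_LMDB_MAPSIZE"'
```

Running each candidate in a subshell means a variable set for one configuration cannot leak into the next candidate's run.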

Every candidate is run at the default sync level. A Fossil repository is a WAL-mode SQLite database at synchronous=NORMAL, and both LMDB backends default to LUMO_SYNC=normal, with the same durability guarantees.

The page cache is dropped before each run (SPEED_DROP_CACHE=1, the default). If /proc/sys/vm/drop_caches is not writable the script reports that the run is warm-cache.
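
The cold-cache versus warm-cache decision might look like the sketch below. The function name drop_page_cache, its optional control-file argument (there so the logic can be tried on an ordinary file), the value 3 written to the control file, and the "cold"/"warm" output words are all assumptions for illustration; only SPEED_DROP_CACHE and /proc/sys/vm/drop_caches come from the text above.

```shell
#!/bin/sh
# drop_page_cache [CONTROL_FILE]
# Prints "cold" if the cache was dropped and "warm" if it could not be.
drop_page_cache() {
    ctl=${1:-/proc/sys/vm/drop_caches}
    if [ "${SPEED_DROP_CACHE:-1}" = 1 ] && [ -w "$ctl" ]; then
        sync                # write dirty pages out so they can be evicted
        echo 3 > "$ctl"     # on Linux, 3 = page cache plus dentries and inodes
        echo cold
    else
        echo warm
    fi
}
```

A caller could record the result next to each timing, for example state=$(drop_page_cache), so that warm-cache and cold-cache runs are never compared by accident.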

The four speed scripts share speed-lib.sh. Caching is done under ~/.cache/LumoSQL/.

speed-fossil.sh

This is the tool that drives everything else. Usage:

NATIVE=/path/to/native/fossil \
CANDIDATES='before:/path/fossil-before:
after:/path/fossil-after:' \
sh speed-fossil.sh

Knobs: REPS (default 5), WARMUP (1), SPEED_COMMITS (200), SPEED_FILES (60), SPEED_CSV (a CSV sink path). A REPS=2 run has a wide spread and will mostly be inconclusive, so raise REPS until the spread tightens.
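
To make "spread" concrete, the sketch below summarises a set of timings as a median and a relative spread. The helper name spread and the definition used, (max - min) / median as a percentage, are assumptions for this example; speed-fossil.sh may measure spread differently.

```shell
#!/bin/sh
# spread T1 T2 ...
# Prints the median and the (max - min) / median spread of the timings.
# (Hypothetical helper, not part of speed-lib.sh.)
spread() {
    [ $# -gt 0 ] || return 1
    printf '%s\n' "$@" | LC_ALL=C sort -n | awk '
        { t[NR] = $1 }
        END {
            if (NR % 2) med = t[(NR + 1) / 2]
            else        med = (t[NR / 2] + t[NR / 2 + 1]) / 2
            printf "median=%.2f spread=%.0f%%\n", med, 100 * (t[NR] - t[1]) / med
        }'
}

spread 1.90 2.00 2.10 2.00 2.05    # prints median=2.00 spread=10%
```

If two candidates differ by less than the spread of either one, the run cannot separate them, which is the case for raising REPS.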

speed-fossil-sqlite.sh

This regards sqlite.org's repo as the reason for Fossil's existence, so it clones the repo, pins it to a known version and keeps it in cache. Every native run imports, reads, rebuilds and commits against that pinned repository.

It takes a lot more time than speed-fossil.sh, but you don't need to run it as often.

NATIVE=/path/to/native/fossil \
CANDIDATES='after:/path/fossil-after:' \
sh speed-fossil-sqlite.sh

Knobs: REPS (5), SPEED_REFRESH=1 re-fetches the pinned repository.

speed-fossil-git-import.sh

fossil import --git is the heaviest sustained write Fossil does. This script clones the drhsqlite git mirror of Fossil once, writes a git fast-export stream to disk, and times the import of that stream.

NATIVE=/path/to/native/fossil \
CANDIDATES='after:/path/fossil-after:' \
sh speed-fossil-git-import.sh

Knobs: REPS (3), IMPORT_COMMITS (0 imports the whole stream; N imports the first N commits), GIT_URL, SPEED_REFRESH=1 re-fetches.

speed-libfossil.sh

This drives the libfossil f-* apps. Its source of truth is native-backed libfossil, i.e. the same f-* apps built on stock SQLite. Every candidate, native included, runs the same apps from its own build directory, so the only dimension that varies is the storage backend.

A candidate here is name:dir:env, where dir is a libfossil build directory holding the f-* apps and libfossil.so.

NATIVE=/path/to/native-libfossil \
CANDIDATES='after:/path/to/lmdb-libfossil:' \
sh speed-libfossil.sh

Knobs: REPS (5), WARMUP (1), SPEED_COMMITS (120), SPEED_CSV.

Examples to copy and paste

These assume a native build at ./fossil-native and an LMDB build at ./fossil-lmdb, with this directory as the working directory. Adjust the paths to your builds.

Correctness on the monolith:

FOSSIL=./fossil-lmdb \
CLONE_URL=https://lumosql.org/src/lumosql \
sh correctness-fossil-lmdb.sh

The everyday "did my change help?" loop, comparing two builds against native:

NATIVE=./fossil-native \
CANDIDATES='before:./fossil-before:
after:./fossil-after:' \
sh speed-fossil.sh

One binary entered twice under different mapsizes, to see whether the map reservation matters for the workload:

NATIVE=./fossil-native \
CANDIDATES='m512:./fossil-lmdb:LUMO_LMDB_MAPSIZE=512
m4096:./fossil-lmdb:LUMO_LMDB_MAPSIZE=4096' \
sh speed-fossil.sh

The bulk-write path, importing the first 500 commits of the frozen git stream:

NATIVE=./fossil-native \
CANDIDATES='after:./fossil-lmdb:' \
IMPORT_COMMITS=500 \
sh speed-fossil-git-import.sh

Stress on a single-CPU machine, where the throughput test would otherwise be skipped, sized down to a short run:

STRESS_TPUT_FORCE=1 STRESS_WRITES=200 \
FOSSIL=./fossil-lmdb \
sh stress-fossil-lmdb.sh