10:05:19 <dan> #startmeeting First Soak Test of LumoSQL on VUB Cluster
10:05:19 <lumosql-meetbot> dan: Meeting started at 2022-02-07T10:05+0000
10:05:20 <lumosql-meetbot> dan: Current chairs: dan
10:05:21 <lumosql-meetbot> dan: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting
10:05:22 <lumosql-meetbot> dan: See also: https://hcoop-meetbot.readthedocs.io/en/stable/
10:05:39 <dan> #here Dan Shearer
10:06:03 <gabby_bch> #here Gabby
10:06:38 <rubdos[m]> #here Ruben
10:06:42 <Labhraich> #here Claudio
10:06:46 <dan> morning Gabby, Ruben, Claudio
10:07:05 <dan> #topic Background
10:07:25 <dan> #info Ruben has offered a powerful 4-machine cluster at VUB for benchmarking
10:08:31 <dan> #info Claudio and Dan agreed at https://lumosql.org/meetings/lumosql.20220205.1007.html to provide consolidated benchmarks to Gabby
10:09:06 <dan> #info Right now, some benchmark.sqlite files exist on Dan and Claudio's machines
10:09:13 <dan> I think that's all?
10:09:42 <rubdos[m]> Well, I'd need to know what actions I have to take in order to make our cluster crunch data.
10:10:09 <rubdos[m]> In principle this is a Kubernetes and OpenStack cluster. I don't have SSH access and don't intend to gain SSH access.
10:10:45 <rubdos[m]> I could roll out a pod on kube that gets one of you SSH access for the benchmarks, or I could do the benchmarks myself in a kube Job and dump you the file.
10:11:06 <dan> #link https://lumosql.org/src/lumosql/doc/trunk/README.md "Quickstart: Using the Build and Benchmark System"
10:11:10 <Labhraich> You'll need some form of command prompt to run the command
10:11:54 <dan> #link https://lumosql.org/src/lumosql/doc/trunk/doc/lumo-build-benchmark.md Full documentation so far
10:12:27 <rubdos[m]> OK, so that sounds like I'll make some shell scripts that spawn a bunch of benchmark workloads on the cluster as Jobs
10:12:49 <dan> rubdos[m], if you can submit a job that runs "make" then that's all that is needed, provided you can see the resulting files.
10:12:53 <rubdos[m]> I can commit these kinds of scripts to a repository
10:13:24 <rubdos[m]> and I can dump the resulting files to a CephFS filesystem that's then hosted over HTTPS
10:13:36 <rubdos[m]> does such a thing make sense?
10:13:39 <dan> The only resulting files we want are sqlite databases
10:14:01 <dan> Not sure how we handle the case of an error?
10:14:18 <rubdos[m]> so having the resulting sqlite files hosted by nginx over HTTPS sounds good, then?
10:14:23 <rubdos[m]> Errors are just logged as errors in kube
10:14:29 <dan> yeah, that's great.
10:14:57 <rubdos[m]> #action Ruben to write Kubernetes infra scripts to run benchmarks
10:15:00 <dan> How do you handle things like build dependencies? There are very few, but if it's intended to be a compute cluster rather than a build cluster that may be an issue?
10:15:14 <rubdos[m]> #action Ruben to run those scripts on the VUB cluster
10:15:20 <rubdos[m]> #action Ruben to host the results via HTTPS
10:15:51 <rubdos[m]> I'll just compile them in place, that should work.
10:16:03 <dan> OK, it sounds like Ruben has that part of it under control.
10:16:19 <rubdos[m]> I'll try to get them started before lunch
10:16:27 <dan> wonderful!
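[Note: a minimal sketch, not from the meeting, of what the container in Ruben's Kubernetes Job might run, assuming the Fossil checkout workflow from the README linked above; the "make benchmark" target, the CephFS mount point /mnt/cephfs/results, and the output filename scheme are all assumptions to be checked against the quickstart.]

    #!/bin/sh
    # Check out LumoSQL, run a benchmark, then publish the resulting
    # sqlite database where nginx can serve it over HTTPS.
    set -e
    fossil clone https://lumosql.org/src/lumosql lumosql.fossil
    mkdir lumosql
    cd lumosql
    fossil open ../lumosql.fossil
    make benchmark
    cp benchmark.sqlite /mnt/cephfs/results/"$(hostname)-$(date +%Y%m%dT%H%M%S)".sqlite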
10:16:30 <dan> now then
10:16:35 <dan> just very briefly
10:16:46 <dan> #topic Goals for Cluster Use
10:17:34 <dan> #info The sqlite benchmark databases will be generated on homogeneous hardware, so our goal is to compare different configs on identical hardware
10:18:42 <dan> #info SQLite is single-threaded so we probably won't see cluster load-type errors in the data
10:18:53 <rubdos[m]> #action Ruben to figure out whether our fourth server contains the same CPU as the three others. We bought that one separately and it is 2U instead of 1U
10:19:25 <rubdos[m]> I'll make sure to dump the cpuinfo and some other system info to a txt file next to the benchmark results, if that makes sense?
10:19:36 <dan> #info LumoSQL can test a zillion dimensions and will be something really new in the field of databases
10:19:38 <rubdos[m]> Also, I'd need to know which configs you want to test, ultimately
10:19:53 <dan> #info I propose we do not test a zillion dimensions this time
10:20:40 <rubdos[m]> We have at least 48 cores, so we can test 48 in parallel... right?
10:20:53 <rubdos[m]> (also twice that in threads)
10:21:15 <dan> Claudio, you have your head in the benchmarking code, does it make sense for you to take an action to propose some "make benchmark" command lines?
10:21:47 <Labhraich> Probably
10:22:01 <rubdos[m]> Well, that's also basically what my kube script will do, but that's of course kube-specific
10:22:04 <Labhraich> I'd start with a very small one, so that we can check that the setup has everything it needs to run
10:22:09 <dan> Just some simple ones
10:22:10 <Labhraich> Then go for a bigger one
10:22:47 <Labhraich> It doesn't really matter if the 4 servers are all different - as long as we know what has been running on what
10:23:00 <dan> rubdos[m], it isn't just CPU, there is an I/O dimension too. For example, SQLite $version writing 10k chunks is not the same as $version+1 writing 10k chunks.
10:23:25 <rubdos[m]> ack on I/O, I'll make sure to report what kind of SSD it's running on
10:23:38 <dan> #info LumoSQL benchmarking tries to capture the details of the environment it is running on. This is so we can get benchmarks from random people all over the world.
10:23:43 <Labhraich> And I/O makes more difference than anything else, so running benchmarks in parallel may not give any useful results
10:23:44 <rubdos[m]> #action Ruben to document servers in the kube-benchmarking suite.
10:24:47 <rubdos[m]> Labhraich: should I prefer to use a ramdisk then?
10:25:04 <Labhraich> Like, the following on my own system give completely different results (while of course CPU etc remains the same): ramdisk, 3GB/s NVMe, 550MB/s SSD, SATA disk, USB disk
10:25:18 <Labhraich> A ramdisk would be good, it masks the I/O
10:25:42 <rubdos[m]> I mean, we can also do both, but that's another dimension
10:25:45 <rubdos[m]> a zillion and one
10:25:46 <Labhraich> And none of the tests will use even 1GB, so I think you have plenty of RAM
10:26:14 <rubdos[m]> We've got a lot of RAM indeed. Kube should be able to schedule the jobs whenever there's some spare, too.
10:26:50 <rubdos[m]> In what repository should these scripts go?
10:27:02 <dan> for example, I have a poor little server in Amsterdam looping through the benchmarking, increasing the chunk size by a factor of 2 each time. The resulting sqlite will be great for Gabby's collection. But it would probably produce a very different graph if I changed the disk type.
10:27:10 <Labhraich> lumosql I think. Create some directory with an appropriate name
10:27:18 <rubdos[m]> ack
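[Note: a minimal sketch, not from the meeting, of the system-info dump Ruben proposes above, using standard Linux tools; writing it next to the results on CephFS, and the exact fields worth recording, are assumptions.]

    #!/bin/sh
    # Record CPU, memory, storage and kernel details so each benchmark
    # database can be matched to the hardware that produced it.
    OUT=/mnt/cephfs/results/$(hostname)-sysinfo.txt
    {
        echo "== cpuinfo =="
        cat /proc/cpuinfo
        echo "== meminfo =="
        cat /proc/meminfo
        echo "== block devices =="
        lsblk -o NAME,MODEL,ROTA,SIZE
        echo "== kernel =="
        uname -a
    } > "$OUT"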
10:28:13 <dan> rubdos[m], we have a lot of benchmarking to come, especially relating to the costs/benefits of crypto
10:28:26 <Labhraich> I'll run datasize=5 tonight - that will give us 156 more results from me
10:29:13 <Labhraich> Yes, there are already options to include/exclude checksumming (signatures, if we modify the code), encryption, etc - but since we don't have the final mechanism to include these, it's a bit premature to run those benchmarks
10:29:16 <dan> so that means the benchmarking/ dir on lumosql can be expected to hold details for many aspects of driving the LumoSQL benchmarking code
10:30:02 <rubdos[m]> A bit off-topic: I just stumbled on "A backend needs certain characteristics:", which will probably nerd-snipe me one day into getting Lumo onto https://github.com/spacejam/sled
10:30:39 <rubdos[m]> Anyway, this sounds like meeting dismissed, if there are no other topics?
10:31:04 <dan> Yes, very nearly meeting over.
10:31:27 <dan> Gabby - your lurking presence is welcome, as the customer who wants to consume sqlite databases
10:31:58 <Labhraich> "A backend needs" to be able to cope with the things sqlite sends to it - which may be a bit more than what a key-value store normally offers
10:32:43 <rubdos[m]> It'd be a fun experiment for me.
10:32:53 <dan> rubdos[m]: this needs to be explained better, and also in the context of a pile of citations in a standard form
10:33:13 <dan> I have a lot of bits that talk around the topic in the documentation
10:33:27 <dan> gabby_bch, OK, no comments needed from you
10:33:41 <Labhraich> rubdos[m]: if you want to run a very quick benchmark (just 2 runs) to test the setup, I'd propose:
10:34:03 <Labhraich> make USE_BDB=no USE_SQLITE=no LMDB_VERSIONS=latest SQLITE_FOR_LMDB=latest
10:34:11 <dan> I have one final topic before closing the meeting
10:34:30 <dan> rubdos[m], I'd also propose the use of the "make what" target before that even.
10:34:31 <rubdos[m]> ack Claudio, I'll use that for the first tests
10:34:31 <Labhraich> Then if it all works we can go and make it do more work
10:34:57 <dan> ok, final topic
10:35:06 <dan> #topic Big Picture
10:36:05 <dan> #info LumoSQL people (Claudio and Dan so far) believe that SQLite as deployed in the trillions and LMDB as deployed in the billions do not necessarily work completely the way their respective excellent developers think
10:37:05 <dan> #info The scale matters - hardware decisions and electricity consumption when multiplied by these very high numbers, as a start
10:37:52 <dan> #action Dan to speak to the SQLite and LMDB teams/authors as soon as we have a repeatable test framework
10:38:16 <dan> #info The best result would be if these teams were running the benchmarking themselves and contributing what they get
10:38:22 <dan> that's all from me
10:38:28 <dan> Is the above clear?
10:38:35 <dan> Or sane?
10:39:18 <rubdos[m]> Sounds fair, but it might at some point need a good study of effective net energy consumption to actually see the impact.
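[Note: putting Claudio's and Dan's suggestions above together, the first run on the cluster might look like the sketch below; the two commands are verbatim from the meeting, only their ordering and the comments are editorial.]

    # Smoke test: first ask the build system what it would do, then run
    # the small two-run benchmark (LMDB backend only, latest versions).
    make what
    make USE_BDB=no USE_SQLITE=no LMDB_VERSIONS=latest SQLITE_FOR_LMDB=latest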
10:39:37 <Labhraich> One more thing about running benchmarks (I don't even know if it's documented...)
10:39:39 <rubdos[m]> Labhraich: is `~/.cache/LumoSQL` concurrency-safe? I.e., if I start two separate jobs that share this cache, will they somehow fight each other?
10:40:01 <Labhraich> Yes, that'll be fine (if not, it's a bug)
10:40:09 <dan> should be read-only for existing content
10:40:14 <Labhraich> You'll occasionally see things like "waiting for lock on ~/.cache/..."
10:40:21 <rubdos[m]> ack, that's great
10:40:36 <dan> note that will affect the time to build the tool that does the benchmark, not the benchmark itself
10:40:54 <Labhraich> So there are two options one can add to the "make" command to move the build directory and the databases during benchmarking:
10:40:55 <rubdos[m]> yep, that's perfectly fine
10:41:04 <dan> ok, thank you all!
10:41:17 <Labhraich> make ... BUILD_DIR=/path/to/some/space DB_DIR=/tmp/ramdisk
10:41:20 <dan> #endmeeting
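[Note: a sketch, added after the log, of how Claudio's DB_DIR option above might be combined with the ramdisk idea from earlier in the meeting; the tmpfs mount and its size are assumptions, though per Claudio none of the tests need even 1GB.]

    # Mount a small tmpfs so DB_DIR masks disk I/O, then point the
    # benchmark databases at it (run the mount as root).
    mkdir -p /tmp/ramdisk
    mount -t tmpfs -o size=1G tmpfs /tmp/ramdisk
    make BUILD_DIR=/path/to/some/space DB_DIR=/tmp/ramdisk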