Log Re Getting Benchmarks to Gabby

10:07:20 <Labhraich> #startmeeting 
10:07:20 <lumosql-meetbot> Labhraich: Meeting started at 2022-02-05T10:07+0000
10:07:21 <lumosql-meetbot> Labhraich: Current chairs: Labhraich
10:07:22 <lumosql-meetbot> Labhraich: Useful commands: #action #info #idea #link #topic #motion #vote #close #endmeeting
10:07:23 <lumosql-meetbot> Labhraich: See also: https://hcoop-meetbot.readthedocs.io/en/stable/
10:07:32 <danshearer> #here Dan
10:07:39 <Labhraich> #topic Benchmarking and consolidating results
10:07:43 <Labhraich> #here Claudio
10:08:22 <Labhraich> So, to open the proceedings, here's what I had planned about 2 years ago for consolidating results:
10:08:34 <Labhraich> 1. people run benchmarks according to the documentation
10:08:48 <Labhraich> 2. then they run an "export" script on the sqlite3 database
10:09:15 <Labhraich> 3. they get the output of the "export" to us - it is a plain text file, so it can be emailed, uploaded somewhere, and so on
10:09:38 <Labhraich> 4. we import it into our own database - in the process we can add extra attributes like who provided it
10:09:49 <Labhraich> and then our database will contain everybody's results
10:10:36 <danshearer> Ok with sanity/security at point 3.5 as a given. Stet.
10:11:08 <Labhraich> We can, obviously, also just allow people to send their sqlite benchmarks database, and in fact benchmark-filter.tcl has an option to copy data from one database to another
10:11:23 <Labhraich> Yes, 3.5 is so essential that it went without saying :-)
10:12:39 <Labhraich> Using a text file rather than an uploaded database has some advantages in terms of validation - if nothing else, somebody can open it in a text editor and see if it looks completely wrong
10:13:27 <Labhraich> But if people want to upload just the database, we can export it and then re-import - which will also filter out anything in the database which has nothing to do with benchmark results (one never knows what they may have added in there)
10:13:44 <danshearer> Would there be fewer steps to go wrong if the plain text format was an SQLite CSV dump?
10:13:56 <Labhraich> Not really
10:14:19 <Labhraich> Because then we still have to go and filter things to make sure it's benchmark results and not their recipes database
10:15:15 <danshearer> Fair. It's almost as if we are defining a universal standard transport format for benchmarking results.
10:15:31 <Labhraich> And my simple text format allows benchmark results (extensible as we extend what data we save), and nothing else
10:15:58 <danshearer> Given the business we are in (integrity), should each plain-text line have a very simple checksum (e.g. md5)?
10:16:11 <danshearer> Or something super-quick
10:16:13 <Labhraich> i.e. it knows about runs, attributes of a run (date, OS, architecture, etc), results and their attributes ("1000 inserts", pass, 42 seconds)
10:17:15 <Labhraich> Yes, a checksum would be good.  I thought about adding it but apparently I never did
10:17:17 <danshearer> parks the thought to one side: are we talking about a plain text Lumion format?
10:17:21 <Labhraich> Easy enough to do though
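
The per-line checksum idea discussed above could be sketched as follows. This is only an illustration: the export format is not defined in this meeting, so the sample line layout and the function names add_checksum and verify_line are hypothetical. CRC32 stands in for the "super-quick" option; an md5 digest from hashlib could be substituted in the same place.

```python
import zlib


def add_checksum(line: str) -> str:
    """Append the CRC32 of the line's UTF-8 bytes as 8 lowercase hex digits."""
    crc = zlib.crc32(line.encode("utf-8")) & 0xFFFFFFFF
    return f"{line} {crc:08x}"


def verify_line(stamped: str) -> str:
    """Return the original line, or raise ValueError if the checksum is wrong."""
    # The checksum is everything after the last space (assumed layout).
    line, sep, _digest = stamped.rpartition(" ")
    if not sep or add_checksum(line) != stamped:
        raise ValueError(f"checksum mismatch: {stamped!r}")
    return line


if __name__ == "__main__":
    # Hypothetical result line; the real export format may look different.
    stamped = add_checksum("result '1000 inserts' pass 42")
    print(stamped)
    print(verify_line(stamped))
```

A checksum like this only catches accidental damage in transit; it does not replace the sanity/security checks of "point 3.5".
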
10:17:38 <Labhraich> I suppose we could export benchmark results to Lumions
10:17:44 <Labhraich> It would tie things together nicely
10:18:15 <danshearer> #idea Could Lumions be the way to export benchmark results, and if so, do we need a plain text representation
10:18:44 <danshearer> s/export/transport/ but we all know what I mean
10:18:50 <Labhraich> OK, how about
10:19:33 <Labhraich> for right now we just consolidate our results using the simple "copy from one database to another" option in benchmark-filter.tcl and send the output database to Gabby
10:19:41 <Labhraich> This has the advantage of being there and ready to use
10:19:54 <Labhraich> The disadvantage is that there will be no verification
10:20:12 <Labhraich> But since it's data we provide, we must assume we trust it :-)
10:20:29 <Labhraich> And then we design a transport for benchmark results based on Lumions
10:20:36 <danshearer> Gabby is doing development not production. I suppose these results will persist into the future 10GiB database, but they'll be noise level.
10:20:41 <Labhraich> This will necessarily have to wait until we know what a Lumion looks like
10:21:41 <Labhraich> #motion we delay a precise definition on benchmark result transport until we have something more on Lumions; for the immediate use, we just join data together to have it in a single database
10:21:41 <lumosql-meetbot> Labhraich: Voting is open
10:21:54 <danshearer> +1
10:21:58 <danshearer> vote +1
10:22:02 <Labhraich> #vote +1
10:22:06 <danshearer> #vote +1
10:22:14 <Labhraich> Is anybody else around?
10:22:17 <Labhraich> If not, we all voted
10:22:21 <danshearer> we did
10:22:25 <danshearer> eventually.
10:22:29 <Labhraich> #close 
10:22:29 <lumosql-meetbot> Labhraich: Motion accepted: 2 in favor to 0 opposed
10:22:30 <lumosql-meetbot> Labhraich: In favor: Labhraich, danshearer
10:22:31 <lumosql-meetbot> Labhraich: Opposed:
10:22:44 <Labhraich> So I think
10:23:09 <Labhraich> #action Claudio to write very simple documentation on how to join "trusted" benchmark results into a single database
10:24:47 <Labhraich> And while we are there, I fixed the LMDB backend for the latest sqlite, which yesterday I couldn't see how to do.  It was probably a literal "couldn't see" due to monitor problems
10:25:03 <Labhraich> But I still want to simplify the running of more tests
10:25:43 <Labhraich> I see a very simple change to the makefile / build.tcl which would allow more testing with less typing of commands
10:25:54 <danshearer> Ok so this brings us to a practical question: do you have lots of test-thing.sqlite files lying around?
10:26:06 <Labhraich> Currently, build.tcl allows a list of versions (or "all") in some places but not others
10:26:13 <Labhraich> I'd change that so one can use it everywhere
10:26:30 <Labhraich> e.g.  "make benchmark LMDB_VERSIONS=all" works
10:26:39 <danshearer> Labhraich yes it does! Just yesterday I looked at what might be required to do 'all' for SQLite as well as LMDB
10:26:43 <Labhraich> But currently "make benchmark SQLITE_FOR_LMDB=all" does not work
10:26:53 <Labhraich> And it would be useful to test changes...
10:27:03 <danshearer> you can't go SQLITE_VERSIONS=all , that's what I was looking at
10:27:13 <Labhraich> Yes, that's exactly what I want to change
10:27:34 <Labhraich> Also, allow any version to be specified as "latest" - which will ask not-fork what's the latest version and use it
10:27:40 <Labhraich> Again, to simplify typing
10:27:51 <Labhraich> Not a big change, but I think it needs doing
10:28:30 <Labhraich> So I think we have 2 more points of action, one for this week and one for after we have a more precise idea of what a Lumion looks like
10:28:56 <danshearer> I agree. Example: The difference between having two steps for Fossil of "fossil clone ; fossil open" to the one-step "fossil clone" was huge. People felt it was extremely important and a blocker to adoption.
10:29:08 <Labhraich> #action Claudio to extend the specification of versions in build.tcl so that "all" and "latest" can be used on anything which takes a version number (or a list of versions)
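
The "all" / "latest" behaviour described in this action could work roughly as sketched below. This is not build.tcl code: the function name resolve_versions, the whitespace-separated spec, and the idea of passing in a pre-sorted list of known versions are all assumptions made for the example. In LumoSQL the list of versions would come from not-fork, which is not modelled here.

```python
from typing import List


def resolve_versions(spec: str, known: List[str]) -> List[str]:
    """Expand a version spec such as "all", "latest" or "3.36.0 latest".

    `known` lists the available versions, oldest first (assumed to be
    supplied by the caller).
    """
    if not known:
        raise ValueError("no known versions")
    resolved: List[str] = []
    for token in spec.split():
        if token == "all":
            candidates = known
        elif token == "latest":
            candidates = [known[-1]]
        elif token in known:
            candidates = [token]
        else:
            raise ValueError(f"unknown version: {token}")
        # Keep first-seen order and drop duplicates such as "3.37.2 latest".
        for version in candidates:
            if version not in resolved:
                resolved.append(version)
    return resolved


if __name__ == "__main__":
    known = ["3.36.0", "3.37.0", "3.37.2"]  # illustrative list only
    print(resolve_versions("latest", known))
    print(resolve_versions("all", known))
```

Applying one resolver to every variable that takes a version is what would make "all" and "latest" behave the same everywhere.
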
10:29:33 <Labhraich> And (we already agreed with the above vote):
10:29:55 <Labhraich> #action Claudio and Dan to review the transport benchmark results after we have a Lumion specification
10:30:23 <Labhraich> "the transport mechanism for benchmark results" I guess
10:30:31 <Labhraich> Let's see if this can fix the above
10:30:33 <Labhraich> #undo 
10:30:33 <lumosql-meetbot> Labhraich: Removed event: cac19d9c9b2d4968bc6392666ec8a3cf@2022-02-05T10:29+0000
10:30:44 <Labhraich> #action Claudio and Dan to review the transport mechanism for benchmark results after we have a Lumion specification
10:30:54 <danshearer> Nice!
10:30:57 <danshearer> Ok. I think that concludes this topic. Would you like to #topic Getting a Substantial Dataset of Results to Gabby, or similar?
10:30:57 <Labhraich> I guess the minutes will let us know if this all worked
10:31:13 <Labhraich> #topic Getting a Substantial Dataset of Results to Gabby
10:32:11 <danshearer> I propose that we both use tcl-benchmark to consolidate, then you send me what you have via scp, and I consolidate into a bigger one, and give that to Gabby via lumosql/dist
10:32:23 <Labhraich> We have a number of benchmark results on the lumosql server, I think, and there will be more as I test the changes I'll be making in the immediate future
10:32:32 <Labhraich> So I'll scp what I have to the server
10:32:36 <danshearer> I have no idea how many I have around, and I'll probably create some more
10:32:39 <danshearer> yes
10:33:06 <danshearer> #action Labhraich to scp benchmark results to server, with or without consolidating
10:33:12 <Labhraich> But I'd wait until I can have "XXX_VERSIONS=all" on everything because 1) it'll help testing the latest changes, and 2) will produce more results
10:33:39 <danshearer> #action Dan to consolidate Labhraich's consolidations with his own, and send to Gabby
10:34:38 <Labhraich> Unless you want to do a bunch of manual "INSERT ... SELECT ... FROM ..." you may want to wait for my simple docs on how to do it using benchmark-filter.tcl
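
For illustration, the manual route mentioned above (attaching a second database and copying rows across with INSERT ... SELECT) might look like the sketch below. The two-table schema, the column names, and the function name merge_results are assumptions made for this example and may not match the real benchmark database; benchmark-filter.tcl remains the supported way to copy results.

```python
import sqlite3

# Hypothetical schema, assumed for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS run_data (
    run_id TEXT NOT NULL, key TEXT NOT NULL, value TEXT,
    PRIMARY KEY (run_id, key));
CREATE TABLE IF NOT EXISTS test_data (
    run_id TEXT NOT NULL, test_number INTEGER NOT NULL,
    key TEXT NOT NULL, value TEXT,
    PRIMARY KEY (run_id, test_number, key));
"""


def merge_results(dest_path: str, source_path: str) -> int:
    """Copy runs missing from dest out of source; return how many were new."""
    con = sqlite3.connect(dest_path)
    try:
        con.executescript(SCHEMA)
        con.execute("ATTACH DATABASE ? AS src", (source_path,))
        # Decide once which runs are new, before touching any table.
        con.execute(
            "CREATE TEMP TABLE new_runs AS "
            "SELECT DISTINCT run_id FROM src.run_data "
            "WHERE run_id NOT IN (SELECT run_id FROM main.run_data)"
        )
        count = con.execute("SELECT COUNT(*) FROM temp.new_runs").fetchone()[0]
        con.execute(
            "INSERT INTO main.run_data (run_id, key, value) "
            "SELECT run_id, key, value FROM src.run_data "
            "WHERE run_id IN (SELECT run_id FROM temp.new_runs)"
        )
        con.execute(
            "INSERT INTO main.test_data (run_id, test_number, key, value) "
            "SELECT run_id, test_number, key, value FROM src.test_data "
            "WHERE run_id IN (SELECT run_id FROM temp.new_runs)"
        )
        con.commit()
        return count
    finally:
        con.close()
```

Only the named tables and columns are copied, so anything else in the source database is ignored. As noted earlier in the meeting, this does no verification, so it suits trusted results only.
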
10:35:34 <danshearer> #undo 
10:35:42 <Labhraich> But I think I'll do that first, today, and then commit the changes (probably in doc/lumo-benchmark-filter.md for now - as it's not the final consolidation docs)
10:35:46 <danshearer> no, that didn't work because I'm not the chair
10:35:59 <Labhraich> #undo 
10:35:59 <lumosql-meetbot> Labhraich: Removed event: 7d5742be526c44018cb42c85e4fc8283@2022-02-05T10:33+0000
10:36:04 <Labhraich> That probably worked
10:36:40 <Labhraich> In theory I can nominate you as co-chair and then we can all perform all actions
10:36:46 <Labhraich> Maybe we'll try that in another meeting
10:36:50 <danshearer> yup. And I just fixed a tyop in the build-benchmark.md . hmm. I wonder if commits should show up on #lumosql?
10:36:59 <danshearer> Yes let's try in another meeting.
10:37:06 <danshearer> Ok so let's see now.
10:37:21 <Labhraich> Would you like to state that as an #idea?  Have commits show up on IRC?
10:37:51 <danshearer> #idea It would be nice/interesting if LumoSQL commits showed up in irc
10:38:15 <danshearer> Gabby needs support with displaying benchmarking data
10:38:28 <danshearer> yes we can give her the benchmark sql, that's great
10:38:42 <danshearer> we will give her improved/new instructions on handling that data too
10:39:28 <Labhraich> I guess R can read a table from a database, but perhaps with more information on what's actually needed we can have something to extract the relevant stuff into a temporary database which then R can read
10:39:55 <Labhraich> I don't know what's needed so all I can do is ask "is there anything we can do to help?"
10:40:02 <danshearer> I guess that means she needs to be told about any new export/display options for benchmark-filter.tcl
10:40:25 <Labhraich> The options are already there, they just haven't been used yet (but I did test them)
10:40:44 <danshearer> I have used the display ones somewhat
10:40:53 <danshearer> well, in fact there is some of that in the docs.
10:40:57 <danshearer> that I wrote
10:41:09 <Labhraich> #action Claudio to double check if existing documentation in doc shows all available benchmarking / testing / filtering options, and if not update
10:41:56 <danshearer> #action Dan to point Gabby at these meeting notes so she knows what to expect, and ask her if there is anything more she wants/needs
10:42:18 <Labhraich> I wrote docs for all tools, and you extracted that from various places (including README) to a different place
10:42:28 <Labhraich> No guarantees the docs were complete, etc
10:42:47 <danshearer> notes that there is someone in Edinburgh following the build/bench Lumo instructions from scratch right this weekend. Cool, no?!
10:43:08 <Labhraich> Well, it's not me
10:43:20 <Labhraich> But I'd be interested to know if they manage
10:43:30 <Labhraich> Because of course I never had to follow these instructions
10:43:50 <Labhraich> Or rather, never had to verify if they make sense to other people
10:44:15 <Labhraich> Have we discussed everything to be discussed?
10:44:40 <danshearer> No, it's Paul Hammant who just loves trunk-based development and is interested in both Fossil and benchmarking: https://devops.paulhammant.com/
10:44:57 <danshearer> I think so, yes
10:44:59 <Labhraich> I just checked the notes I wrote yesterday and updated this morning, and it all went into this meeting with action points - so no more from me
10:45:12 <danshearer> ok I think we're done.
10:45:26 <Labhraich> Right, this meeting is officially finished
10:45:27 <danshearer> lumosql-meetbot: well done, friend!
10:45:27 <lumosql-meetbot> danshearer: Error: "well" is not a valid command.
10:45:41 <Labhraich> #endmeeting