Lars Wirzenius

The files storing B-tree nodes are not cacheable, since they get updated in place. Or rather, they are cacheable, but cache management is difficult and potentially quite expensive. The new design aims to never update files in place, and so would be much, much more cacheable, avoiding a lot of round trips, I hope.
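To make the caching argument concrete, here is a rough Python sketch. It is not the actual design: `WriteOnceStore`, `CachingClient`, and the choice to name files after a SHA-256 of their content are all assumptions made for illustration. The point is only that if a file never changes after it is written, a client can cache it forever without any invalidation logic.

```python
import hashlib


class WriteOnceStore:
    """Hypothetical stand-in for a remote repository whose files never change."""

    def __init__(self):
        self._files = {}
        self.roundtrips = 0  # counts simulated remote reads

    def put(self, data: bytes) -> str:
        # Assumption: name each file after its content, so one name can
        # never refer to two different contents.
        name = hashlib.sha256(data).hexdigest()
        self._files.setdefault(name, data)
        return name

    def get(self, name: str) -> bytes:
        self.roundtrips += 1
        return self._files[name]


class CachingClient:
    """Caches files locally; no invalidation needed since files are immutable."""

    def __init__(self, store: WriteOnceStore):
        self._store = store
        self._cache = {}

    def get(self, name: str) -> bytes:
        if name not in self._cache:
            self._cache[name] = self._store.get(name)
        return self._cache[name]
```

An "update" in this scheme is just a new file under a new name; the old name keeps returning the old content, so a cached copy is never stale.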

I expect bags to be fairly small (perhaps as small as 64 KiB, but more likely around 1 MiB), so that the overhead of wasted space and of downloading too much is kept reasonable. Normally I don't plan on rewriting bags when data is removed, but there might be a "packing" function or option to force that.
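As a sketch of what the size target could mean in practice, something like the following might work. The names `BagWriter` and `TARGET_BAG_SIZE` are made up for this example, and the one-megabyte default is only my guess at a reasonable value. Blobs are collected until the bag reaches the target size, and then the bag is sealed and never touched again.

```python
TARGET_BAG_SIZE = 1024 * 1024  # assumed target of about one megabyte


class BagWriter:
    """Hypothetical helper that groups blobs into write-once bags."""

    def __init__(self, target_size: int = TARGET_BAG_SIZE):
        self.target_size = target_size
        self.sealed = []          # finished bags, each a list of blobs
        self._current = []        # blobs in the bag being filled
        self._current_size = 0

    def add(self, blob: bytes) -> None:
        self._current.append(blob)
        self._current_size += len(blob)
        # Seal the bag once it reaches the target; it is never rewritten.
        if self._current_size >= self.target_size:
            self.flush()

    def flush(self) -> None:
        # Seal whatever has been collected so far, if anything.
        if self._current:
            self.sealed.append(self._current)
            self._current = []
            self._current_size = 0
```

A smaller target means less wasted download when only one blob in a bag is wanted, at the cost of more files and more requests. Packing would be a separate pass that copies the still-live blobs into fresh bags; it is not shown here.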

I hadn't thought about random access to a bag file. I shall ponder on this.