[HN Gopher] The subtleties of proper B+Tree implementation ___________________________________________________________________ The subtleties of proper B+Tree implementation Author : tim_sw Score : 44 points Date : 2023-12-10 19:28 UTC (3 hours ago) (HTM) web link (ayende.com) (TXT) w3m dump (ayende.com) | mondobe wrote: | I've got a Data Structures and Algorithms final tomorrow, likely | including a few 2-4 Tree problems... I will be forever grateful | that we at least don't have to implement them, just simulate | them. | airstrike wrote: | just curious.. what language is it being taught in? | widforss wrote: | When I did it it wasn't in any language. We just drawed the | evolution of the trees on paper. I got my degree in 2021. | waynesonfire wrote: | Uhh.... maybe dont roll your own data structure if such a basic | issue is causing you havoc. | travisjungroth wrote: | This makes it sound like it's the author's fault or something. | They're pointing out a pretty subtle gotcha. | schneems wrote: | Maybe link to a resource for people to download a good | implementation of that data structure. Maybe link to an article | describing hot so implement this data structure correctly in | depth. Maybe try finding some other way to add to the | conversation instead of disparaging the author for sharing | something they learned. Maybe don't post anything at all if you | don't have something productive to say. Maybe use nonviolent- | communication (NVC) techniques to say exactly what you just | said, but without condescension or attack. | travisjungroth wrote: | There's something to generalize here. If you run out of X, split | by X. This bug happens when you run out of _space_ , but split by | _count_. Author's comment said he fixed it by splitting by size | (space taken). | cperciva wrote: | Note: In most cases you don't want to split pages prior to | insertion as is being done here. Instead, you want to insert data | first, and then restore the page size invariants at the end | (prior to committing the operation / writing to disk / etc). If | you're doing multiple operations at once (e.g. adding new keys | and deleting others) it may turn out that you can avoid the | expensive page-splitting operation by postponing it. | hyperman1 wrote: | When I implemented this, long ago, I had a expanded page | structure containing the old page plus the new key. So it had | more memory than 1 page. Then the expanded page, when writing it | back, becomes 2 pages if a split is needed. | | I did it this way for a few reasons. One was I could release old | key pages after new pages were written, avoiding data loss in IO | errors. Second was key (prefix) compression became a lot more | easy. I never realised it, but it avoids the trouble in both blog | post too. | | It comes with some copy overhead, unfortunately. | NooneAtAll3 wrote: | what's the point of storing different sized keys together? ___________________________________________________________________ (page generated 2023-12-10 23:00 UTC)