[HN Gopher] An unexpected find that freed 20GB of unused index s... ___________________________________________________________________ An unexpected find that freed 20GB of unused index space (2021) Author : skadamat Score : 212 points Date : 2023-08-28 14:42 UTC (8 hours ago) (HTM) web link (hakibenita.com) (TXT) w3m dump (hakibenita.com) | voiper1 wrote: | (2021) I thought the picture looked familiar. | pletnes wrote: | Does this principle apply if you have one or a few values that | are very common? E.g. an integer column with 90% of them being 0? | convivialdingo wrote: | An index is most efficient when it represents a unique set of | values. It's still very useful to have an index for grouping, | but if the groups represent a severe minority then you will end | up wasting a lot of space and cycles searching through the | index. | lozenge wrote: | Yes, Postgres keeps track of the most common values. If it | knows 0 is common and "where val = 0" will keep 90% of rows it | might choose to use a table scan instead of the index. | teddyh wrote: | Couldn't you create separate indexes for the 0 and non-0 cases? | recursive wrote: | If zeroes are the most common value, it could be | slower to use the index when searching for zeroes. How so? | Reading values that aren't included in the index requires | following a reference back to the table row. If 90% of the | rows in your query are zeroes, you'd be better off not using | that column in your query planning. A naive filter, possibly | even after a table scan, is likely to be faster than using | the index. | teddyh wrote: | Would an index of the non-0 rows still help? | recursive wrote: | Yes. If your WHERE clause is searching for only the ones, | for instance. | ChristianGeek wrote: | Only if you used null to represent the common value, which | generally isn't a good idea. 
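The "index of the non-0 rows" idea above is exactly the partial index the article uses. The article's fix is Postgres-specific, but the same construct exists in SQLite, so a minimal, runnable sketch of the idea is possible with Python's stdlib; table and column names here are invented, not the article's schema:

```python
import sqlite3

# A mostly-NULL foreign-key-like column, as in the article: only 1 in
# 100 rows carries a value, the rest are NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER PRIMARY KEY, cancellation_id INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions (cancellation_id) VALUES (?)",
    [(None if i % 100 else i,) for i in range(10_000)],
)

# Partial index: only the non-NULL rows are indexed, so the 99% of
# NULL entries take up no index space at all.
conn.execute(
    """CREATE INDEX ix_cancellation ON transactions (cancellation_id)
       WHERE cancellation_id IS NOT NULL"""
)

# The planner can still use the partial index for equality lookups,
# since "cancellation_id = 500" implies "cancellation_id IS NOT NULL".
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM transactions WHERE cancellation_id = 500"
).fetchall()
print(plan)  # the plan detail mentions ix_cancellation
```

In Postgres the syntax is the same (`CREATE INDEX ... WHERE col IS NOT NULL`), which is the one-line change behind the article's 20GB saving.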
| hinkley wrote: | The first large scale project I worked on, my team couldn't | figure out why operations had slowed down as the data set grew. | | Indexes have log(n) insertion time per record. If you had 1000 | records in your test database, as you approach 65k your insertion | time goes up by 60% (2^10 vs 2^16 records). Success slows | everything down, and there are only so many server upgrades | available. | | Add a couple of new indexes for obscure features the business was | looking for, and now you're up to double. | winrid wrote: | Double a very small number. If that's seriously an issue, use a | faster disk. So many people are trying to run a DB on EBS with less | IOPS than my PC from 2015. | | I manage plenty of DBs with hundreds of millions of records and | 40+ indexes per table/collection... | vjerancrnjak wrote: | Insertion should still be extremely fast for such a small | index, right? | | Doing binary search over a b-tree page is <100 cycles. b-tree | traversal over 100M records should still be measured in | microseconds, and binary search over that should be too, if not in | 100s of nanoseconds. | hinkley wrote: | This conversation is about unrepresentative sample data. | vjerancrnjak wrote: | Are you saying that the index is made on columns which only | have a few unique values, so you need to scan everything all | the time? | | Given that you've included the log(n) claim I thought you | were referring to a normally functioning btree. | | A situation where 65k rows end up slowing things down | severely should have more to do with other database | features (foreign keys and similar). | lazide wrote: | 'It depends' - don't forget about locking overhead, needing | to flush multiple pages/leaves as thresholds get hit, disk I/O | wait times, etc. | | At the beginning it's rarely noticeable, but it does exist, | and under heavy load or with a lot of index writes needing to | happen it can cause very noticeable overhead. 
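hinkley's 60% figure is just the ratio of balanced-tree depths: 2^16 rows give a tree about 16/10 as deep as 2^10 rows, and the ratio is the same whatever the fanout (log_b cancels). A quick sanity check of that arithmetic:

```python
import math

def tree_levels(n_rows, fanout=2):
    """Depth of a balanced search tree over n_rows entries."""
    return math.log(n_rows, fanout)

small = tree_levels(1_000)    # ~2^10 rows in a test database
large = tree_levels(65_536)   # 2^16 rows in production
ratio = large / small
print(f"{ratio:.2f}x deeper")  # roughly 1.6x, i.e. ~60% more work per insert
```

As vjerancrnjak notes, real B-tree pages hold hundreds of keys, so the absolute depth at 65k rows is tiny (2-3 levels); the relative growth is real, but the constant is small.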
| vjerancrnjak wrote: | Yes, but these bits don't scale with log(n) like the index tree. | They depend on something other than the index data | structure. | efxhoy wrote: | I spent last week freeing up 200GB from our 600GB db with just | reindex and pg_repack. The worst offender was a 17GB (of data) | table that had 142GB of indexes. Reindex took it down to 21GB. | The table's indexing is crazy and has multiple indexes over | different sets of columns. | | A contributing factor for the huge index, I think, was the | distribution of data. It's had inserts, updates and deletes | continuously since 2015. Data is more likely to be deleted the | older it gets, so there's more data from recent years, but about | 0.1% of the data is still from 2015. I think maybe this skewed | distribution with a very long tail meant vacuum had a harder time | dealing with that index bloat. | dang wrote: | Discussed at the time: | | _An unexpected find that freed 20GB of unused index space in | PostgreSQL_ - https://news.ycombinator.com/item?id=25988871 - Feb | 2021 (78 comments) | halayli wrote: | I highly recommend pganalyze.com to discover unused indexes, | optimization opportunities, and high latency queries. | aftbit wrote: | Making indexes smaller is nice even when you have a ton of | storage, as then more can fit into the hot set. However as | someone who runs TB of databases, "just provision more storage" | is always a valid option. Especially if you are outside the | cloud. If you have your own hardware, new enterprise NVMe SSDs | are about $80/TB, and DDR4 RAM is around $1.20/GB. Four hours of | engineering time (very roughly $1000) buys either 800 GB of RAM | or 12 TB of storage. | paulddraper wrote: | Neither one will save you from slower writes due to unnecessary | indices. 
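The bloat efxhoy describes comes from deletes and updates leaving dead space behind that routine vacuuming never fully returns to the OS; a full rewrite (REINDEX / pg_repack in Postgres) reclaims it. The same mechanics can be seen in miniature with SQLite, where VACUUM plays the rewrite role; this is a toy illustration, not the article's tooling, and the schema is invented:

```python
import os
import sqlite3
import tempfile

# Build a table, then delete ~90% of it: the file keeps its old size
# because the freed pages stay allocated inside the database file.
path = os.path.join(tempfile.mkdtemp(), "bloat.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [("x" * 500,) for _ in range(10_000)],
)
conn.execute("DELETE FROM events WHERE id % 10 != 0")  # drop ~90% of rows
conn.commit()

pages_before = conn.execute("PRAGMA page_count").fetchone()[0]
conn.execute("VACUUM")  # rewrite the whole file compactly
pages_after = conn.execute("PRAGMA page_count").fetchone()[0]
print(pages_before, pages_after)  # page count drops sharply after the rewrite
```

Postgres's autovacuum marks dead space reusable but doesn't shrink indexes, which is why a long-lived table with a skewed delete pattern like efxhoy's can end up with 142GB of indexes over 17GB of data until something rewrites them.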
| dchftcs wrote: | Depending on the scale and complexity, if you make no effort to | control resource usage, costs grow exponentially, sometimes | even without growth in business, because the requirements just | become more complex. For a certain optimization, you may save | 1TB today, but end up saving 2TB some years down the road; with | just a few of these decisions you could be looking at a difference | of an order of magnitude or more, sometimes even more at larger | scale. Overall there's always a balance to be struck. | rockostrich wrote: | "Just provision more storage" is usually the answer on the cloud | as well, unless someone has a concrete idea as to how to use | less storage. Although the math is a bit trickier since it's a | monthly price rather than a one-time cost (although if your | database is growing at a constant rate then that one-time | provisioning is really just a monthly price on-prem as well). | crabbone wrote: | This is not at all a good way to compare. | | Engineering hours: there's a good chance you pay for that once, | and that solves the problem. | | 10 SSDs: will require rack space, electricity, PCIe slots, | timely replacement, management software... most of these | expenses will be recurring expenses. If done once, sometimes | the existing infrastructure can amortize these expenses (i.e. | you might have already had empty space in a rack, you might | have already had spare PCIe slots, etc.), but amortization will | only work in small numbers. | | Another aspect of this trade-off: systems inevitably lose | performance per unit of equipment as they grow, due to | management expenses and increased latency. So, if you keep | solving problems by growing the system, overall, the system | will become more and more "sluggish" until it becomes | unserviceable. | | On the other hand, solutions which minimize the number of | system resources necessary to accomplish a task increase | overall performance per unit. 
In other words, they create a higher-quality system, which is an asset in its own right. | krembo wrote: | Even on RDS, 20GB's price is negligible for most companies and | isn't worth the added effort and engineer salaries of | looking into it. From a DBA perspective, that's a cool find. | Thanks for sharing. | pgwhalen wrote: | I'm not super in the know about it, but I believe the _total_ | cost of SAN storage at my employer (on prem) is ~10X that. | codetrotter wrote: | > Four hours of engineering time (very roughly $1000) | | Why $1000 for four hours? | | Does it cost your company $150,000 per month per engineer? | vikingerik wrote: | Well, there's a multiplier between the nominal capacity and the | capacity you actually need to purchase for your whole system. | You're not buying just that 1 TB. You probably want at least | two live failover servers, maybe more. Plus however | many layers of backups and disaster recovery; a year of weekly | backups makes your 1 TB into 50, even if it's offline storage. | | My own company struggles with that - throwing more storage on | the live db is easy, so we've kept doing that for years, but | pushing around multi-terabyte backups is getting cumbersome and | we're going to have to slim down the data in prod even at the | cost of engineer effort. | danielheath wrote: | Weekly full backups are quite large vs e.g. ZFS snapshots, | right? | Dylan16807 wrote: | Are you putting indexes into your backups? | krab wrote: | pg_basebackup is, like any physical replication of | Postgres. | nine_k wrote: | Backing up indexes makes sense if you care about restoring | the DB _quickly_. 
| [deleted] | _a_a_a_ wrote: | > new enterprise NVMe SSDs are about $80/TB, and DDR4 RAM is | around $1.20/GB | | sounds incredibly cheap | telios wrote: | The post makes mention of B-tree de-duplication that is present | in PostgreSQL 13, but not 12, the version they're using; at the | same time, they're noting that the vast majority of values in | some of their foreign key indexes are NULL. | | I have to wonder if B-tree de-duplication would have helped with | that particular case? The PostgreSQL 13 documentation seems to | imply so, as far as I can tell[0] (under 63.4.2): | | > B-Tree deduplication is just as effective with "duplicates" | that contain a NULL value, even though NULL values are never | equal to each other according to the = member of any B-Tree | operator class. | | I don't think it would be as effective as a partial index as | applied in the post, I'm just curious. | | [0]: https://www.postgresql.org/docs/13/btree-implementation.html | luhn wrote: | In previous discussion of this article, an HN user did the | math: pg12 is 16 bytes per NULL and pg13 is 6.32 bytes per | NULL. https://news.ycombinator.com/item?id=25989467 So | definitely some pretty significant savings there. | zzzeek wrote: | Wow, they are not kidding, this is truly "we saved 20g with one | weird trick". We get a lot of requests from users wanting to use | these unusual Postgresql index forms that are largely unheard of | in old school Oracle / SQL Server shops, I didn't realize they | were indexing NULL values either. | aidos wrote: | The fact that this is news to you makes me feel better about | not considering it myself. I suspect we have a few large | indexes that could be given the same treatment. | hn_throwaway_99 wrote: | Wow, thanks for this great writeup! 
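luhn's per-NULL figures make the trade-off easy to estimate. A back-of-the-envelope sketch, using those quoted numbers; the row count is hypothetical:

```python
# Per-NULL B-tree index cost, per the figures quoted from the earlier
# HN thread: ~16 bytes/NULL on pg12, ~6.32 bytes/NULL with pg13's
# deduplication. A partial index excluding NULLs stores none of them.
PG12_BYTES_PER_NULL = 16
PG13_BYTES_PER_NULL = 6.32

null_rows = 1_000_000_000  # hypothetical: a billion NULLs in the column

def index_gb(bytes_per_entry, rows):
    return bytes_per_entry * rows / 1024**3

pg12 = index_gb(PG12_BYTES_PER_NULL, null_rows)
pg13 = index_gb(PG13_BYTES_PER_NULL, null_rows)
print(f"pg12: {pg12:.1f} GB, pg13 dedup: {pg13:.1f} GB, partial index: 0 GB")
```

So deduplication in pg13 would have cut the NULL overhead roughly 60%, but as telios suspects, only the partial index eliminates it.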
I thought this was really | useful not just for the "partial index" find that | is the focus of the post (that was a great point and thing to | be aware of), but for the general overview of good techniques | and things to be aware of in Postgres if you're worried about | using space inefficiently. Saving this one for reference! | | One minor "word to the wise", because I thought this was a great | post but it also has the potential to be misused: if you work at a | startup or early stage company, it is nearly _always_ the better | decision to throw more disk space at a storage problem like this | than to worry about optimizing for size. Developers are expensive, | disk space is cheap. | nico wrote: | > if you work at a startup or early stage company, it is nearly | always the better decision to throw more disk space at a | storage problem like this than to worry about optimizing for size. | Developers are expensive, disk space is cheap | | This is great advice. In general, at the beginning it is better | to keep things as simple as possible. | | At one fast growing startup I worked at, one of the founders | insisted we keep upgrading just one machine (plus redundancy | and backups). It was a great strategy! It kept the architecture | super simple, easy to manage, easy to debug and recover. For | the first 5 years of the company, the whole thing ran on one | server, while growing exponentially and serving millions of | users globally. | | After seeing that, it's clear to me you should only upgrade | when needed, and in the simplest, most straightforward way | possible. | jtc331 wrote: | There's a lot of emphasis here on just adding storage, but | that's missing the fact that unnecessary indexing affects | write and read performance also (and potentially quite | significantly). 
| | Using a partial index where it obviously matches the use case | (like when a majority of values are null) is just _correct | modeling_ and should not be considered a premature optimization | or a waste of developer time. | hyperman1 wrote: | In a similar vein, I got good value from these scripts: | | https://github.com/NikolayS/postgres_dba | | I managed to free 10%, a.k.a. about 100 GB of storage, by | reorganizing table column ordering in a big table. ___________________________________________________________________ (page generated 2023-08-28 23:00 UTC)
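hyperman1's column-reordering trick works because Postgres aligns each column on its type's alignment boundary (e.g. bigint/timestamp on 8 bytes, int on 4, smallint on 2), so listing wide columns first minimizes padding. A deliberately simplified model of that effect; real tuples also carry header and NULL-bitmap overhead, and variable-length types complicate the picture:

```python
# Simplified model of alignment padding in a fixed-width row layout.
def row_size(cols):
    """cols: list of (name, size, alignment) tuples, in column order."""
    offset = 0
    for _name, size, align in cols:
        offset += -offset % align  # pad up to the next alignment boundary
        offset += size
    return offset

# Hypothetical table: narrow and wide columns interleaved...
bad = [("active", 1, 1), ("created", 8, 8), ("flag", 1, 1),
       ("amount", 8, 8), ("kind", 2, 2)]
# ...vs the same columns ordered widest-alignment first.
good = sorted(bad, key=lambda c: -c[2])

print(row_size(bad), row_size(good))  # the reordered row is smaller
```

Here the interleaved order wastes 14 bytes per row on padding; across a big table (and the indexes that include those columns), that kind of overhead is how a pure reordering can recover ~10% of storage.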