[HN Gopher] Show HN: Reduce SQLite database size by up to 80% wi... ___________________________________________________________________ Show HN: Reduce SQLite database size by up to 80% with transparent compression Author : phiresky Score : 360 points Date : 2022-08-01 09:36 UTC (13 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | wmanley wrote: | This is really neat. I like how the compression can be applied to | specific columns rather than whole rows - or whole pages. I could | imagine storing JSON from an HTTP request here and using | Generated Columns[1] to extract the bits my application is | interested in. | | I also like how simple and powerful the custom partitioning is. | | [1]: https://www.sqlite.org/gencol.html | wmanley wrote: | On that subject I've often thought of writing a combination | decompressor/JSON parser that can parse the contents of the | dictionary before decompression for faster JSON parsing and | lower memory usage. | aabbcc1241 wrote: | Maybe the compression algorithm used in | https://github.com/beenotung/compress-json can inspire you. | wmanley wrote: | Thanks, but that wasn't quite what I meant. What I was | thinking about was processing existing gzipped or zstd | compressed JSON rather than having a new compression | format. | | I could see how customising the compressor could be helpful | - a bit like how there's an `--rsyncable` option to `gzip`, | but I'd like to keep compatibility with existing formats. I | like to avoid coming up with new formats whenever possible | because you might want to mine the data in 20 years' time - | and if it's a custom format it'll be much harder to deal | with. | | I don't have a current use-case for this, but I like | thinking about it. In particular I like data stores to be | as simple as possible. Managing persistent state is the | hard bit of most systems: unlike with runtime state or | code you have to manage forwards and backwards | compatibility, migrations, etc. and you can't just "turn it | off and on again" to fix issues. | gcr wrote: | a lot of document key/value databases that store arbitrary | JSON tend to work this way (for example Cassandra does this, | IIRC). some will even automatically build indices for you | (though hand-tuning is usually better) | klysm wrote: | I've been following that pattern in PostgreSQL and love it. It | cleanly decouples the ingest and keeps the stack simple. | masklinn wrote: | Do you mean gencols or compression? | klysm wrote: | I meant consuming the raw json into a jsonb column and then | using views or triggers to maintain the structured version | of it. | KingOfCoders wrote: | Can you elaborate? | allan_s wrote: | Not the author, but if you have a write-intensive workload | and I/O starts to be your bottleneck, compressing the | data application-side allows your database to write fewer | bytes, and thus to handle more load for the same capacity. | OnlyMortal wrote: | Be aware of the number of iops. That's a killer. | rovr138 wrote: | Look into ETL vs ELT. | | ETL: you grab the data from somewhere, transform it to clean | it/format it, then load it (in this case into postgres). | | ELT: you grab the data, load it as is. Then you use another | step to extract what you need or transform it. You keep the | original data in case you need it there too. | gcr wrote: | wait, are you saying that the columns of view tables could be | compressed? that doesn't make sense to me because views don't use | extra space to begin with...?
| | storing raw JSON in sqlite means the entire blob would be | compressed and live in a single column, right? | wmanley wrote: | You'd compress the JSON, and leave the data that you extract | from the JSON uncompressed using Generated Columns[1]. That | way you can save the original JSON data, but still have fast | querying, etc. | | It would look something like this: | | CREATE TABLE t1( | json TEXT, | data INT GENERATED ALWAYS AS (json_extract(json, '$.d')) STORED | ); | | If all you need is fast querying you could make `data` a | `VIRTUAL` column and set up an index on it. | | You can read more about this kind of technique here: | https://dgl.cx/2020/06/sqlite-json-support | | [1]: https://www.sqlite.org/gencol.html | hda111 wrote: | I love the idea. Can I use it with Django projects somehow? | d1l wrote: | I am happy for the author, but want to warn the credulous that | this is a very strange way to approach compression with SQLite. | The much more common way is to use the SQLite vfs: | | https://www.sqlite.org/vfs.html | | Briefly search for SQLite compressed vfs and you will find many | extensions. Some are quite mature. | | I get the impression that the author doesn't know about the vfs, | as it's not even mentioned in the readme. The clickbait title | similarly seems amateurish. Caveat emptor. | Dowwie wrote: | To call it strange or unusual is fine, but that isn't a compelling | argument against using it. Is it better or worse than a vfs | approach and why? | d1l wrote: | Why not read the sqlite team's own explanation: | | https://www.sqlite.org/zipvfs/doc/trunk/www/howitworks.wiki | kevingadd wrote: | Taking an amateurish or fancy or experimental approach to | compressing a sqlite database is a great way to end up with | data loss. Doesn't make it a worthless or "bad" project but | it's worthy of a warning. | phiresky wrote: | I did actually previously write a VFS, though it did something | else entirely: | | https://phiresky.github.io/blog/2021/hosting-sqlite-database... | | You're right that a comparison to compression VFSes is completely | missing. I knew about their existence but I have to admit I | didn't look too closely. | | Note that the main "novelty" here is training a shared / | partitioned compression dictionary specifically to take | advantage of redundancy that wouldn't appear within only one | row or even within a database page / block. The compression | happens at the row level and can additionally use application | knowledge - you couldn't do that at the VFS level. For example, | you can have separate compression dictionaries per different | columns and per groups of rows with some commonality. | | I'll have to compare to a compression vfs (do you have a | favorite?) and see if maybe these two methods can even be | combined. | | Edit: I see that https://github.com/mlin/sqlite_zstd_vfs does | actually also train dictionaries. It's still at the database- | page level so can't take application knowledge into account or | compress only parts of the data, but that's still pretty smart | then. | kevingadd wrote: | How does this compare with modern compression algorithms like | Brotli that do context modeling, etc.? I've found that they | manage to aggressively compress types of data you wouldn't | expect to compress well, to the point that investing energy | into doing compression yourself doesn't provide big returns | anymore.
The downside is that codecs like Brotli tend to | compress very slowly, but I can imagine a | setup where you only compress old rows, so it would be cool to | see an experiment with just compressing rows or columns and | comparing the sizes with your method. | vlovich123 wrote: | Zstd with dictionary training should beat Brotli, and Brotli | will struggle with non-text data, although I haven't | benchmarked it. | | Your underlying point though remains valid that the | incremental complexity of building that training data | probably doesn't warrant it, because the place where that | becomes valuable is quite rare, particularly for typical | SQLite databases. Still a neat trick though. | skrebbel wrote: | > amateurish | | There's no need to be this dismissive. | 867-5309 wrote: | welcome to HN | skrebbel wrote: | Meh, I find HN to be among the least dismissive places on | the internet, especially when you only consider nerd-heavy | places. I like to think that this is in part due to | comments like the one I wrote being commonplace. I've been | at the receiving end of them as well (incl a few kicks | under the butt from dang himself) and it's helped me assume | good intent, not be overly dismissive, etc. I'm just | passing the favour along. | aidenn0 wrote: | I think the frustrating thing about HN is the | (relatively) high number of comments that are both | dismissive and substantive. It's much easier (for my | brain at least) to ignore comments where the entirety is | "this sucks" or "lame" (or even "If I made something this | crappy I would throw myself off a building", which isn't | uncommon in some forums). | | In comparison, the gp comment is both dismissive _and_ | substantive in that it brings up SQLite VFS layers that | can accomplish a similar goal. The author of TFA even updated | the README to mention these, addressing gp's complaint. | | The whole "this is crap because of X", where X is | something that subjectively could be considered | legitimate criticism, is relatively common on HN, so | comments _that I don't automatically skip over_ are | rather more dismissive than on other forums. | zimpenfish wrote: | > Briefly search for SQLite compressed vfs and you will find | many extensions. | | I found 3 with a brief search - CEVFS, sqlite_zstd_vfs, and | ZIPVFS (not free, therefore discounted). | | Neither CEVFS nor sqlite_zstd_vfs supports WAL mode (which | means, e.g., no litestream replication). | | Can you recommend a mature one that does support WAL mode? | skerit wrote: | Let's not water down the meaning of the word "clickbait". I | have no problem with this title. | aabbcc1241 wrote: | I'm using sqlite on zfs and btrfs, with transparent compression | supported by the file system; the query speed is very good and | the compression ratio is also very good (database below 3GB | uncompressed, 22% after compression). | cmurf wrote: | Did you enable WAL? Any casual comparison in performance | between WAL and no-WAL? | aabbcc1241 wrote: | I enabled WAL. It performs better than the default (which is | to keep a rollback journal). | usgroup wrote: | What do you think about zfs vs btrfs for your use case? Why do | you use both? | [deleted] | [deleted] | [deleted] | logicallee wrote: | This mentions something I've been wondering about: | | >Depending on the data, this can reduce the size of the database | by 80% while keeping performance mostly the same (or even | improving it, since the data to be read from disk is smaller). | | "Even improving it" is a bold claim!
| | Modern CPU's are often hungry for data, aren't they? While being | incredibly fast and mostly idle? | | Is reading a smaller compressed file and decompressing it in the | CPU's hyperfast cache before use faster than reading it | uncompressed? | | Maybe you can speed up an entire server, even one with lots of | free space and no need for compression, by adding full disk | compression anyway! | | Unless the CPU's are at 100% utilization this could always be | used. Since the CPU will then use the data, it might be fast | enough to uncompress it at home faster than the bus can keep | dropping it off at file read speeds. | | This chart I found that goes up to 2000 suggests cache access had | an increasing performance gap with even RAM bandwidth, to say | nothing of SSD's[1] | | It is from here[2] | | (However, compressing a full disk can only help if there are | extra CPU cycles. | | For activity spikes when the CPU doesn't have extra cycles or the | cache for it, both compressed and uncompressed versions could be | available, at most doubling disk usage, so that the uncompressed | file could be read instead. | | This can ensure the CPU isn't burdened with a decompression task, | since this only helps if the fastest caches are used for it and | they might have better things to do than uncompress files. | | However, whenever the CPU has idle cores and lots of cache, which | is most of the time, it could request the compressed version. If | it is busy it could ask for the original version which saves the | decompression work.) | | Can this help solve the problem of waiting for the disk? | | [1] http://www.extremetech.com/wp-content/uploads/2014/08/CPU- | DR... | | [2] | https://www.extremetech.com/extreme/188776-how-l1-and-l2-cpu... | outworlder wrote: | > Maybe you can speed up an entire server, even one with lots | of free space and no need for compression, by adding full disk | compression anyway! | | Yes. Although not all filesystems support compression. One of | the filesystems that does support and this is regularly | benchmarked, is ZFS. It is also interesting that while you are | definitely using more CPU to read a compressed stream, if that | causes you to finish your task earlier, then it can be a net | gain even from a CPU usage standpoint. | | eg. | | https://www.servethehome.com/the-case-for-using-zfs-compress... | andrewmcwatters wrote: | Oh, interesting. I scrape GitHub's APIs to generate user and | repositories statistics data, then compress it with ZStandard to | allow the databases to fit within GitHub's file size | restrictions.[1] So, this is very relevant to me. | | It's pleasantly surprising how much data you can fit within a | ZStandard-compressed SQLite database. The two technologies fit | very well together. | | [1]: https://github.com/andrewmcwattersandco/github-statistics | mansoor_ wrote: | What does "transparent" mean in this context? | phiresky wrote: | Transparent to the application - basically the SQL queries | (select, insert, update) in your code base don't have to change | at all. | | This would be in contrast to e.g. compressing the data in your | code before inserting it into the database, then decompressing | after you load it. | AtNightWeCode wrote: | LZ77+ ftw. I am a heavy user of Zstd. I think Brotli might be a | better choice here. Or at least for my use-cases. 
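To make phiresky's description of "transparent" concrete, here is a minimal sketch in Python of what using the extension could look like. It assumes the project exposes an SQL function named zstd_enable_transparent that takes a JSON configuration, as its README describes; the database, table and column names ("app.db", "events", "payload") are placeholders. The point is that the application's SELECT/INSERT statements stay exactly as they were.

    import sqlite3

    db = sqlite3.connect("app.db")
    db.enable_load_extension(True)
    db.load_extension("libsqlite_zstd.so")  # path to the compiled extension

    # Opt a single column into transparent compression. The function name and
    # JSON shape follow the project's README; "events"/"payload" are invented.
    db.execute("""SELECT zstd_enable_transparent(
        '{"table": "events", "column": "payload",
          "compression_level": 19, "dict_chooser": "''a''"}')""")

    # Application queries are unchanged; (de)compression happens inside SQLite.
    db.execute("INSERT INTO events(payload) VALUES (?)", ('{"k": "v"}',))
    rows = db.execute("SELECT payload FROM events LIMIT 5").fetchall()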
| quietbritishjim wrote: | One option not listed (under "Other options for compressing data | in a database"), but probably closest to what this solution does | under the hood, is to do row-interdependent compression at the | application level. | | That is, while saving rows to the database, pick out every (say) | 10th row and compress it by itself, but keep the compression | state in memory. Then for each of the next 9 rows, compress that row | against that dictionary and store a reference to the row it's based | on. (For example, in Python's zlib module, you'd use the zdict | parameter [1] of compress and decompress.) A bit like keyframes | in video compression. | | You can potentially get better results this way than a generic | solution like the article, because you might have some | application knowledge about what rows are likely to be similar to | others. But it obviously requires application-level code (e.g. | fetching a logical row requires fetching two physical rows so you | have enough information to decode), and it would be a nightmare | if you need to remove or update the rows. | | [1] https://docs.python.org/3/library/zlib.html#zlib.compressobj | phiresky wrote: | > You can potentially get better results this way than a | generic solution like the article because you might have some | application knowledge about what rows are likely to be similar | to others | | The linked project actually does exactly this - you can | basically specify an SQL expression (such as strftime(date, | '%Y-%m')) that is then used to group rows together in order to | decide what to train dictionaries on. What you're describing as | picking every 10th row would be the expression (id/10). So you | can use application knowledge, you just don't have to. | | The difference is that in my project the dictionary is stored | separately so it is not affected by row deletions, and the trained | dictionary is based on all samples, not just one of them. Since | the dictionary is stored separately it's only worth it if you | combine at least maybe 100-1000 rows under one dictionary. | | > Python's zlib module, you'd use the zdict parameter | | Yes, the training feature of zstd is basically a better version | of keeping another sample of data around and using it as a | prefix. In fact you can use a data sample instead of a trained | dictionary as the `-D` parameter of the zstd command line; zstd | just has the additional feature of reducing a set of samples | down to the most essential (data-saving) parts. | quietbritishjim wrote: | Thanks for making this project, it looks fantastic! I might | end up using it if I can figure out how to use it from Python. | | I did mention that this is what you're doing, I just thought | it wasn't clear from the blog post. But I did miss that you | can use an SQL expression to choose the grouping. This is | ideal for the application I'm thinking of, which has a "message | type" field that corresponds pretty closely with which rows | are similar. | phiresky wrote: | You can use it from python as follows: | | db = sqlite3.connect(...) | db.enable_load_extension(True) | db.load_extension("libsqlite_zstd.so") | | You probably still have to figure out how to make that work | cross-platform / how to ship that binary though. | quietbritishjim wrote: | Oh interesting, I was assuming that recompiling the whole | sqlite module would be in order. Thanks again. | goldenkey wrote: | Your explanation is great. I like the key frame codec analogy. | Like you said though, this really belongs at the DB layer.
An | application-level solution would be much more than a nightmare. | | Plugin/extension/module development is severely underused in | our field. Extensibility, if possible, is often the best way to | handle edge cases or increments in a dependency, without | forking it. | | See "Benefits of Being an Extension to PostgreSQL": | https://www.citusdata.com/blog/2017/10/25/what-it-means-to-b... | | Some great software that comes to mind as being extensible in | brilliant ways through plugins: | | postgres: https://www.postgresql.org/docs/current/external- | extensions.... | | nginx: https://www.nginx.com/resources/wiki/modules/ | | sqlite: https://www.sqlite.org/loadext.html | | redis: https://redis.io/docs/modules/ | | babel: https://babeljs.io/docs/en/plugins/ | yababa_y wrote: | This makes me wonder how sqlite augmented with blosc or some | similar faster-than-memory compressor might behave. Size and | speed wins? | alecco wrote: | AFAIK Blosc is just a container. Not a good fit for row-level | compression. | yababa_y wrote: | the vfs isn't row level, though! | avereveard wrote: | I don't want to diss the effort, I have an honest question: | wouldn't this be something more appropriate to do | at the filesystem level? | hexmiles wrote: | It depends on the data and application. | | I use a similar approach for compressing data that goes into a | couple of databases/stores. Having control over when and how to | compress data can significantly improve performance; also, it | sometimes makes sense to keep compressed data even in memory, | since it can allow you to make better use of it. | | Also keep in mind that often you can't control the filesystem. | | One use case I am working on right now is | storing highly repetitive data on very slow flash storage | (which technically supports compression, but it is slow and the | compression ratio is not great), where I can significantly improve | performance. | | edit: grammar | phiresky wrote: | It would be interesting to try something like this at the file | system level, yes. | | Basically you could heuristically find interesting files, then | train a compression dictionary for e.g. each file or each | directory. Then you'd compress 32kB blocks of each file with | the corresponding dictionary. Note that this would be pretty | different from how existing file system compression works (by | splitting everything into blocks of e.g. 128kB and just | compressing those normally). | | I think it would be hard to find heuristics that work well for | this method. The result would be pretty different from what I'm | doing here, with different tradeoffs. sqlite-zstd is able to use | application-level information about which data is similar and will | compress well together. You couldn't do that within the | filesystem. It also works "in cooperation" with the file format | of the database instead of against it - e.g. you probably | wouldn't want to compress the inner B-tree nodes. | | On the other hand, it would then work for any file format, not | just SQLite databases, e.g. a huge folder of small json files. | gred wrote: | > wouldn't this be something more appropriate to do | at the filesystem level? | | Sometimes?
You've gotten a ton of replies digging into the | "why" and "when", and I only wanted to add that your question | brought to mind this quote from the official SQLite website: | | > "Think of SQLite not as a replacement for Oracle but as a | replacement for fopen()" | | https://www.sqlite.org/about.html | xdfgh1112 wrote: | Yeah but your app doesn't have control over the filesystem. | Usually an sqlite db is bundled with an application or created | by it. | | That said, it looks like you could get the same results by | doing client-side compression and decompression of data | yourself, and storing compressed blobs in sqlite. It only | compresses text column data, not entire pages. | d1l wrote: | https://www.sqlite.org/vfs.html | Gehinnn wrote: | But it uses a shared zstd dictionary among rows! | | The details are described here: | https://phiresky.github.io/blog/2022/sqlite-zstd/ | marginalia_nu wrote: | Context-aware compression can be a lot faster than blind | filesystem-level compression. | | Some data, like plain text, JSON, or XML, compresses extremely | well. You're almost certainly looking at compression rates of | up to 90%. Dense binary formats, especially formats that are | already compressed like most image formats, will barely compress | at all (sometimes they can even get bigger), and | attempting to compress them is an enormous waste of processing | power. | JulianWasTaken wrote: | Continuing the hypothetical without really knowing about | today's filesystems, such a system could presumably calculate | a moving per-page, per-fd or per-process compression factor and turn | it down or up if it started seeing lots of incompressible | data? | marginalia_nu wrote: | For a columnar database, this would probably work fairly | well as it would put data of similar compressibility | adjacent to more of its kind, but row-based databases, | where various types of data are intermingled, would probably | not benefit nearly as much. | zimpenfish wrote: | Tried this on my twitter database's `json` column (i.e. the | JSON response from the API fetching a singular tweet) with a | `dict_chooser` of '%Y-%W' to bundle weeks together and it | took 985k rows down from 6.6G to 1.5G. Not as good as | compressing the whole thing obviously, but really handy for a | queryable backup. | cmurf wrote: | At least in btrfs there are compress vs compress-force mount | options. compress uses a cheap estimator of compressibility, | whereas compress-force always passes data through the | compression algorithm. | | ZSTD has its own estimator to avoid doing unnecessary work, | but I haven't found a comparison of these estimators (which | is cheaper). | | compress results in a max extent size of 128K for compressed | extents, and 128M for uncompressed data, whereas compress-force | results in a 512K max extent size for uncompressed data. | While there's a cost to tracking more extents, it might be | less than the cost of frequent modification of large extents, | particularly if snapshots are used. Of course there's no one | answer; it'd be workload- and hardware-dependent. Hard drives | have a significant latency impact from non-contiguous extents; | NVMe much less. | phiresky wrote: | The zstd estimator is not that great, so you should | definitely keep the auto-detection of the file system on. | | If you try to compress a completely incompressible file, it | will still be extremely slow at higher compression levels: | https://github.com/borgbase/vorta/issues/485#issuecomment-8 | 8...
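zimpenfish's `dict_chooser` of '%Y-%W' above is the row-grouping idea in miniature: rows that share a trained dictionary compress far better than tiny rows compressed one by one. Below is a small self-contained sketch of that effect using the Python `zstandard` bindings rather than the extension itself; the package being installed and the sample rows are assumptions, and the data is invented purely for illustration.

    import zstandard

    # Invented rows imitating an access log: same user-agent and URL shape,
    # only the numbers vary - the kind of redundancy a shared dictionary exploits.
    rows = [
        ('{"url": "/item/%d", "status": 200, "ua": "Mozilla/5.0 (X11; Linux '
         'x86_64) AppleWebKit/537.36 Chrome/103.0 Safari/537.36"}' % i).encode()
        for i in range(2000)
    ]

    # Compressing each tiny row independently: zstd has little to work with.
    plain = zstandard.ZstdCompressor(level=19)
    per_row = sum(len(plain.compress(r)) for r in rows)

    # Train one dictionary over the group and reuse it for every row - roughly
    # the shared-dictionary idea sqlite-zstd applies per dict_chooser group.
    d = zstandard.train_dictionary(1024, rows)
    with_dict = zstandard.ZstdCompressor(level=19, dict_data=d)
    shared = sum(len(with_dict.compress(r)) for r in rows)

    print(per_row, shared)  # the shared-dictionary total should be noticeably smaller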
| magicalhippo wrote: | I can't recall if they actually implemented it or just | talked about it, but during the ZFS team meetings there | was some discussion about using LZ4 as a "compression | canary", to determine if it was worth attempting | Zstandard. That is, first try compressing with LZ4, and | if it did compress, do the actual compression with Zstandard. | | IIRC the argument went that LZ4 had a very quick bail-out | routine, and quite fast compression, so the overall | overhead would be quite small, at least for the higher | Zstandard compression levels. | Dagger2 wrote: | They implemented it; it landed in May: | https://github.com/openzfs/zfs/commit/f375b23c | fnord123 wrote: | Great idea. zfs can compress data on disk. But it's also useful | to keep the data compressed in memory so you can fit more data | in memory. There are a lot of cpu/memory tradeoffs, so it's worth | having the option to do it at the data level. | | It's also worth noting that compression at the application | layer works way better on column stores than row stores. | ciupicri wrote: | > Btrfs also supports transparent file compression. There are | three algorithms available: ZLIB, LZO and ZSTD (since v4.14). | Basically, compression is on a file by file basis. You can | have a single btrfs mount point that has some files that are | uncompressed, some that are compressed with LZO, some with | ZLIB, for instance (though you may not want it that way, it | is supported). | | https://btrfs.wiki.kernel.org/index.php/Compression | Someone wrote: | No. If you did that, SQLite would have to (logically) | decompress just about your entire database on every query. For | example, how would it read the database schema to determine | whether _"select foo from bar"_ is valid sql for the database? | If it figures that out, how would it figure out there's an | appropriate index to use? How would it read that index? | | Databases do a lot of seeking into database files and typically | only read a small part of them. That's not a scenario that's a | good fit with whole-file compression. | | Compressing individual pages is a bit better, but I think this | approach can still be better. | | Your file system probably would cache the decompressed database | file, but that would use memory, lots of it if your database is | large. | formerly_proven wrote: | FS-level compression is usually limited to compressing | relatively small blocks (think ~dozens of sectors); they | don't normally compress a large file as a single solid block. | Someone wrote: | Yes, but still, the file system would do more decompression | than needed, and that will cost performance (CPU and | caching). | ajconway wrote: | This kind of solution should work fine even if the database is | encrypted. | antman wrote: | With this you can selectively compress per column. | josephernest wrote: | Let's say I have a web visits analytics log database, with a | column "user-agent" and a column "url" that most of the time have | the same values (99.9% of rows have one of 100 typical values). | | Would this work well with compression here? | | i.e. from 50 bytes on average for these columns... to a few bytes | only? | hnarn wrote: | Probably yes, but presumably you could achieve this by | compressing the data on the file system level as well (for | example with ZFS). | Scaevolus wrote: | Yes, if you use the "dictionary" option it will build a | buffer of common strings and use that during compression, so | your user-agent columns will largely become references to the | dictionary.
Make sure you set compact to true! See this for | more information: https://facebook.github.io/zstd/#small-data | josephernest wrote: | Will this mode require a separate file along the data.db | Sqlite file, to store the compression dictionary? | | From your last link: | | > The result of this training is stored in a file called | "dictionary", which must be loaded before compression and | decompression. | | Also, what happens if a column has been compressed based on | 100 typical values, and then later we insert thousands of new | rows with 500 new frequent values. Does the compression | dictionary get automatically updated? Then do old rows need | to be rewritten? | | PS what is compact mode? | Scaevolus wrote: | It stores these compression dictionaries in the database: | | > If dictionary is an int i, it is functionally equivalent | to zstd_decompress(data, (select dict from _zstd_dict where | id = i)). | | Compact saves you 8 bytes per compressed column by removing | some metadata zstd uses : | | > if compact is true, the output will be without magic | header, without checksums, and without dictids. This will | save 4 bytes when not using dictionaries and 8 bytes when | using dictionaries. this also means the data will not be | decodeable as a normal zstd archive with the standard | tools. The same compact argument must also be passed to the | decompress function. | | You will need to update the dicts occasionally, but there's | a helper function to do this for you, read the docs about | zstd_incremental_maintenance. | cryptonector wrote: | The image near the top of the readme implies that compressing the | whole file means not having random access, which is indeed true, | but it also implies that there's nothing between whole file | compression and row/column level compression, and _that_ is not | true. You could compress pages, kinda like ZFS compresses blocks, | and still maintain random access, and get pretty close to whole- | file compression. | leobg wrote: | This looks great. How would I apply this retroactively? I.e. tell | SQLite to compress column x in an existing database, say, of type | TEXT? | Joel_Mckay wrote: | sudo apt-get install archivemount | | archivemount mydbimage.tgz /mnt/mydbimage -o big_writes | | // -o readonly disable write support | | // -o nosave do not save changes upon unmount. | | //open/cache /mnt/mydbimage/mydbimage.db | | //flush | | fusermount -u /mnt/mydbimage | | //No LGPL/Rust/cargo dependency issues... | | //as SQLite's best use-case is embedding ;) | marginalia_nu wrote: | Eh, tar is probably not an amazing format for this, since it's | designed for tape drives, and has ridiculously slow random | access. | Joel_Mckay wrote: | archivemount supports numerous formats as it is using | libarchive. | | My point was it doesn't contaminate the parent projects | license. | xyzzy_plugh wrote: | I understand why you suggest this, even though archivemount is | LGPL, but I digress. | | FUSE is cool and all but given it's "userspace" and we're just | reading a single file here, I don't find this compelling versus | reading from a block device in-process. | | TFA is compelling because it's compression over specific data | sets without being in the hot path for every operation. Putting | a sqlite.db in a tarball is... problematic. There is no random | access, so if your database doesn't fit in memory/cache, | performance will be abysmal. 
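On leobg's question above about applying this retroactively to an existing TEXT column, here is a sketch of how that might look, assuming the zstd_enable_transparent and zstd_incremental_maintenance functions behave as the project README and Scaevolus describe. The file, table and column names ("existing.db", "t", "x") are placeholders, and the meaning of the maintenance arguments is taken from the README example rather than anything stated in this thread.

    import sqlite3

    db = sqlite3.connect("existing.db")
    db.isolation_level = None          # autocommit, so VACUUM can run below
    db.enable_load_extension(True)
    db.load_extension("libsqlite_zstd.so")

    # Declare the existing TEXT column as transparently compressed.
    db.execute("""SELECT zstd_enable_transparent(
        '{"table": "t", "column": "x",
          "compression_level": 19, "dict_chooser": "''a''"}')""")

    # Existing rows are not rewritten immediately; the maintenance call trains
    # dictionaries and compresses rows incrementally (NULL = no time limit,
    # 1 = target database load - argument meaning assumed from the README).
    db.execute("SELECT zstd_incremental_maintenance(NULL, 1)")
    db.execute("VACUUM")               # reclaim the pages freed by compression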
| Joel_Mckay wrote: | LGPL is not great for porting, and linux block caching is far | from the performance bottleneck if you know what you are | doing. | | I think we can agree it depends on the use-case. =) ___________________________________________________________________ (page generated 2022-08-01 23:00 UTC)