[HN Gopher] Cloudant/IBM back off from FoundationDB based CouchD... ___________________________________________________________________ Cloudant/IBM back off from FoundationDB based CouchDB rewrite Author : jFriedensreich Score : 97 points Date : 2022-03-12 16:06 UTC (6 hours ago) (HTM) web link (lists.apache.org) (TXT) w3m dump (lists.apache.org) | elitepleb wrote: | So what's the deal with the unpopularity of CouchDB? | | It seems like a compelling database, but I've yet to run into | it in the wild. | tehbeard wrote: | Beyond the meta reason of it being old/mature, and thus not continually | piercing the tech news space with releases etc., a few things: | | Querying in a more ad-hoc way (vs. building indexes ahead of | time and querying by key, etc.) is a bit janky / not 1st class | (I think Mango addresses this, but I'm not entirely sure). | | The runtime being Erlang? It certainly seemed to be the cause | of some issues when I tried to run it in WSL, or at least my | lack of knowledge of Erlang made diagnosing it more trouble. | | The JS query server engine is/was fairly old (I think it might | have jumped to a more recent version of SpiderMonkey at some | point), and hooked up in a way that, while more modular, limits | the performance (documents have to be serialized to/from the | engine in another process, rather than just natively passed in). | | The authorization model is... unique. You can limit down to a | doc/field level who can submit changes via | validate_doc_update(...) in a design doc. So allowing those | with a reviewer role to only be able to edit a notes field on a | document, while the user in the author field has full access to | the other fields, is possible. But read access is at the | database level, as in you can either read the db, or not.
| | The way around this for having "private" storage is enabling a | feature to make a db per user automatically and assign them | rights, but this is more complicated to manage client side (two | dbs to talk to), and replication is even more of a nightmare if | stuff needs to be shareable instead of just private. | kache_ wrote: | I've used it, it's pretty decent provided you understand the | internals. | Already__Taken wrote: | npm does or did run on it https://github.com/npm/npm-registry-couchapp | if that's what you mean by in-the-wild | gedy wrote: | It is/was nice, just an early NoSQL DB with a lot of | interesting features. Just better options came about to take | its mindshare. We used it about 11 years ago for an internal | marketing CMS system and the replication and attachment support | were a good fit. | jFriedensreich wrote: | I would strongly object to "better options came about". I am | not disputing that a better fit for your specific problems may have | come along, but in the general case of the sweet spot for CouchDB | there are no obvious better alternatives. The sweet spot | being "a schemaless json database with a rest api and first | level support for master-master and online/offline | replication that values your data safety and reliability | first and everything else second." | tehbeard wrote: | What's out there with a better client device sync option? | | This is something I've been looking for for a few PWAs that | need to operate on bad/no network, and most other solutions | are either build-your-own-entire-sync-setup or magic-in-a-box that you | can't tune. | | With Couch/Pouch, I can sync with a filter/several filters to | make sure the subset of data I need is on the device. | gedy wrote: | Yeah, agreed, that's really cool. Closest I've seen is Apollo | client, but to your point you have a lot less fine-grained | control. | rat9988 wrote: | What better options would you have in mind? Asking out of | curiosity, I don't follow this space closely.
| gedy wrote: | It's been a while, but it seems like many people wanted | something simpler like MongoDB for a NoSQL document | database. CouchDB's map/reduce queries were hard for | people to get their heads around, many people didn't need attachments, | etc. | bsaul wrote: | Sidenote: I've heard FoundationDB was used for CloudKit, but is | it also used for iMessage? | | It seems like its transactional properties would be quite well | suited to something like a messenger service (where the order of | messages matters, especially with e2e encryption) | navarro485 wrote: | Pretty sure Cassandra is used for iMessage, although that may | have changed after Apple acquired FoundationDB. | gigatexal wrote: | On device, iMessage is SQLite, I think. Backend, not sure. | samwillis wrote: | While I don't have enough knowledge of the wider implications of | this, it does impact something I was experimenting with last | year. | | The FoundationDB rewrite would introduce a size limit on document | attachments; there currently isn't one. Arguably, attachments | are a rarely used feature, but I found a useful use case for them. | | I combined the CRDT Yjs toolkit with CouchDB (PouchDB on the | client) to automatically handle sync conflicts. Each couch | document was an export of the current state of the Yjs Doc (for | indexing and search), with all changes made via Yjs. The Yjs doc was | then attached to the Couch document as an attachment. When there | was a sync conflict, the Yjs documents would be merged and | re-exported to create a new version. The issue is that the | FoundationDB rewrite would limit the size, and that makes this | architecture more difficult. It's partly why I ultimately put the | project on hold.
| | (Slight aside: a CouchDB-like DB with native support for a CRDT | toolkit such as Yjs or Automerge would be awesome; when syncing | mirrors you would be able to just exchange the document state | vectors - the changes to it - rather than the whole document) | HelloNurse wrote: | But is it a small size limit that affects realistic usage? | Don't you have performance issues if you use a CRDT implemented | in JavaScript and running in the browser with large files? | samwillis wrote: | So yes, a particularly large document is not the norm but it | can happen. | | JavaScript CRDTs can be quite performant, see the Yjs | benchmarks: https://github.com/dmonad/crdt-benchmarks | kevincox wrote: | I don't see a fundamental reason why there | would be an attachment size limit. I guess it would just need | to be implemented by breaking the attachment into multiple | keys? There may be some overhead, but it seems that this is | valuable because it allows large attachments to be split across | servers as required. | tlarkworthy wrote: | When you chunk it you have the problem of what happens if | that process is interrupted. So it's not trivial (though | solvable), but it's the kind of atomics you want the new | engine to do. | aseipp wrote: | I think the person you're replying to is saying that the | document should be split across keys inside the | implementation, i.e. split across the fdb keyspace, not | split by the user at the application level. Which is the | approach you almost always have to use for 'large' values; | FoundationDB has size limitations on the k/v pairs it can | accept, and splitting documents and writing those chunks in | small transactional batches is the recommended workaround | (along with some other 'switch over' transactional write | which makes the complete document visible all at once.) | tehbeard wrote: | If I remember the fdb docs correctly, there's also a time limit on | transactions that further limits the feasible max size.
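The chunk-then-switch-over idea described in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for the key/value store, `MAX_VALUE_BYTES`, the key layout and both function names are hypothetical, and none of this is the actual CouchDB-FDB code or the FoundationDB client API. A real implementation would also need transactions so that an interrupted write cannot leave a half-visible attachment.

```python
from typing import Dict, Optional, Tuple

# Hypothetical per-value size cap; illustrative only, not FoundationDB's real limit.
MAX_VALUE_BYTES = 16

Key = Tuple[str, ...]


def write_attachment(store: Dict[Key, bytes], name: str, blob: bytes) -> int:
    """Split blob into chunks that fit the cap, then publish a manifest key."""
    chunks = [blob[i:i + MAX_VALUE_BYTES] for i in range(0, len(blob), MAX_VALUE_BYTES)]
    for index, chunk in enumerate(chunks):
        store[("att", name, "chunk", str(index))] = chunk
    # The manifest is written last (the "switch over" write): a reader only
    # treats the attachment as present once every chunk is already stored.
    store[("att", name, "manifest")] = str(len(chunks)).encode()
    return len(chunks)


def read_attachment(store: Dict[Key, bytes], name: str) -> Optional[bytes]:
    """Return the reassembled blob, or None if no manifest was published."""
    manifest = store.get(("att", name, "manifest"))
    if manifest is None:
        return None
    count = int(manifest.decode())
    return b"".join(store[("att", name, "chunk", str(i))] for i in range(count))
```

Writing a 40-byte blob with the 16-byte cap stores three chunks plus the manifest, and `read_attachment` returns the original bytes. A transaction time limit like the one mentioned above would still bound how many chunks fit into a single atomic update.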
| malkia wrote: | Reminds me of when a team I worked in had to migrate from one | database to another (we were the only team left using that one, | and no one was supporting it internally), but the new one had | a 22MB (or was it 44MB) limit on the total transaction size, | while the previous one did not (AFAIR). Someone worked on | splitting the work into several transactions (the bulk was really due to | long recorded conversation "forum" like messages related to | specific data), but overall it changed how things worked and | had some issues initially... Who would've thought you would | need that, years from the day it was originally designed... | robertnewson wrote: | The (low) attachment size limit at Cloudant is about service | quality and guiding folks to good uses of the service more than | a technical issue. | | As others have noted, the solution to storing attachments in | FDB, where keys and values have an enforced maximum length, is | to split the attachments over multiple key/values, which is | exactly what the CouchDB-FDB code currently does. | | The other limit in FDB is the five second transaction duration, | which is a more fundamental constraint on how large attachments | can be, as we are keen to complete a document update in a | single transaction. The S3 approach of uploading multiple parts | of a file and then joining them together in another request | would also work for CouchDB-FDB. While it _could_ be done, | there's no interest in the CouchDB project to support it. | samwillis wrote: | Exactly, almost all the time you would be better off saving the | attachment to an object store. However, I think I found that | small edge case where the attachment system was perfect. It | was essential to save the binary Yjs doc with the couch | document, as it needed to be synced to clients with the main | document. Saving it to an object store is not viable due to | the overhead during syncing. | robertnewson wrote: | yup.
The purpose of CouchDB's original attachment support | was "couchapps": the notion that you'd serve your | entire application from CouchDB. Attachments were therefore | for html, javascript, image, font assets, which are all | relatively small. The attachment support in CouchDB <= 3.x | is a bit more capable than that due to its implementation, | but storing large binaries was not strictly a goal of the | project. | tluyben2 wrote: | Why don't you open source your work? Otherwise, can you contact me? | Maybe I can take over this work on CouchDB; we have | to do it anyway and we would open source it. | [deleted] | matlin wrote: | This is too bad. I understand there is likely a ton of complexity | in making this switch, but I think it still leaves CouchDB with a | frustrating problem, namely document conflicts within a given | cluster. Client <-> Server conflicts are very understandable, but | when you might unexpectedly get a document conflict from two | server instances replicating with each other, you're just bound | to run into a bunch of issues. | | To have multi-master work properly you basically need Strong | Eventual Consistency via CRDTs, which most databases don't | natively support (I think only Riak does). Otherwise, you're better | off switching to a single writer model. | endisneigh wrote: | What's the simplest client or way to use FoundationDB? I was | excited for this because FDB is somewhat unintuitive to use and | deploy. | manishsharan wrote: | The FoundationDB Document Layer is compatible with the MongoDB 3.x | API: https://github.com/FoundationDB/fdb-document-layer , and | you get the transactional integrity. | | I stopped using MongoDB and switched to this. | endisneigh wrote: | Amazing - much appreciated. How is it going compared to | Mongo? | manishsharan wrote: | Better than MongoDB. Easy to scale up. And no MongoDB | gotchas for transactions. | | I use my FoundationDB cluster as a MongoDB alternative and | a Redis alternative.
Only one cluster to maintain and two | types of functionality! I have tried setting up and | maintaining clusters of MongoDB and Redis in the past and | it was horribly complicated. A FoundationDB cluster is so | much easier to set up and maintain. And it gives me | the functionality of both Redis KV and MongoDB. ___________________________________________________________________ (page generated 2022-03-12 23:00 UTC)