[HN Gopher] Cloudant/IBM back off from FoundationDB based CouchD...
       ___________________________________________________________________
        
       Cloudant/IBM back off from FoundationDB based CouchDB rewrite
        
       Author : jFriedensreich
       Score  : 97 points
       Date   : 2022-03-12 16:06 UTC (6 hours ago)
        
 (HTM) web link (lists.apache.org)
 (TXT) w3m dump (lists.apache.org)
        
       | elitepleb wrote:
       | So what's the deal with the unpopularity of CouchDB?
       | 
       | It seems like a compelling database, but I've yet to run into
       | it in the wild.
        
         | tehbeard wrote:
         | Beyond the meta of it being old/mature and thus not continually
         | piercing the tech newsspace with releases etc.
         | 
         | Querying in a more ad-hoc way (vs. building indexes ahead of
         | time and querying by key, etc) is a bit janky / not 1st class
         | (I think mango addresses this but not entirely sure).
         | 
         | The runtime being Erlang? It certainly seemed to be the cause
         | of some issues when I tried to run it in WSL, or at least my
         | lack of knowledge of Erlang made diagnosing it more trouble.
         | 
         | The JS query server engine is/was fairly old (I think it might
         | have jumped to a more recent version of Spidermonkey at some
         | point), and hooked up in a way that, while more modular, limits
         | the performance (documents have to be serialized to/from the
         | engine in another process, rather than just natively passed in)
         | 
         | The authorization model is... unique. You can limit down to a
         | doc/field level who can submit changes via
         | validate_doc_update(...) in a design doc. So allowing those
         | with a reviewer role to only be able to edit a notes field on a
         | document, while the user in the author field has full access to
         | the other fields is possible. But read access is at the
         | database level, as in you can either read the db, or not.
         | 
         | The way around this for having "private" storage is enabling a
         | feature to make a db per user automatically and assign them
         | rights, but this is more complicated to manage client side (two
         | dbs to talk to) and replication even more of a nightmare if
         | stuff needs to be shareable instead of just private.
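
A rough sketch of the field-level write rule described above (reviewers may change only `notes`, the author may change anything). A real `validate_doc_update` lives in a design document and is written in JavaScript; the Python below only models the logic. The `Forbidden` exception, the `is_allowed` helper, and the sample documents are assumptions made for illustration.

```python
class Forbidden(Exception):
    """Stands in for CouchDB's `throw({forbidden: "..."})`."""


def validate_doc_update(new_doc, old_doc, user_ctx):
    """Raise Forbidden unless user_ctx may make this change."""
    if old_doc is None:
        # Creating a document: the creator must name themselves as author.
        if new_doc.get("author") != user_ctx["name"]:
            raise Forbidden("author must be the creating user")
        return
    if user_ctx["name"] == old_doc.get("author"):
        return  # the author may change any field
    # Fields whose value differs between revisions, ignoring bookkeeping
    # fields such as _id and _rev.
    changed = {
        key
        for key in new_doc.keys() | old_doc.keys()
        if not key.startswith("_") and new_doc.get(key) != old_doc.get(key)
    }
    if "reviewer" in user_ctx["roles"] and changed <= {"notes"}:
        return  # reviewers may touch the notes field only
    raise Forbidden("not allowed to change: " + ", ".join(sorted(changed)))


def is_allowed(new_doc, old_doc, user_ctx):
    """Convenience wrapper (hypothetical) turning the exception into a bool."""
    try:
        validate_doc_update(new_doc, old_doc, user_ctx)
    except Forbidden:
        return False
    return True


old = {"_id": "post-1", "author": "alice", "body": "draft", "notes": ""}
bob = {"name": "bob", "roles": ["reviewer"]}
alice = {"name": "alice", "roles": []}
```

Nothing comparable exists on the read side: as the comment says, read access is granted per database, so a check like this cannot hide individual documents or fields.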
        
         | kache_ wrote:
         | I've used it; it's pretty decent, provided you understand the
         | internals.
        
         | Already__Taken wrote:
         | npm does or did run on it, if that's what you mean by
         | in-the-wild: https://github.com/npm/npm-registry-couchapp
        
         | gedy wrote:
         | It is/was nice, just an early NoSQL DB with a lot of
         | interesting features. Just better options came about to take
         | its mindshare. We used it about 11 years ago for an internal
         | marketing CMS system and the replication and attachment support
         | were a good fit.
        
           | jFriedensreich wrote:
           | I would strongly object to "better options came about". I am
           | not disputing that a better fit for your specific problems
           | may have come along, but for CouchDB's sweet spot in general
           | there are no obviously better alternatives. The sweet spot
           | being "a schemaless JSON database with a REST API and
           | first-class support for master-master and online/offline
           | replication that values your data safety and reliability
           | first and everything else second."
        
           | tehbeard wrote:
           | What's out there with a better client device sync option?
           | 
           | This is something I've been looking for for a few PWAs that
           | need to operate on bad/no network, and most other solutions
           | are build your own entire sync setup, or magic-in-a-box you
           | can't tune.
           | 
           | With Couch/pouch, I can sync with a filter/several filters to
           | make sure the subset of data I need is on the device.
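
One way to express such a filter is a selector on the replication request. The sketch below only builds the JSON body for CouchDB's `POST /_replicate` (it sends nothing) and previews which documents an equality-only selector would pick. The URLs, database names, and sample documents are made up, and `matches` is a deliberately tiny stand-in for real selector matching, which supports many more operators.

```python
import json


def replication_request(source, target, selector):
    """Build a body for POST /_replicate limited to matching documents."""
    return {
        "source": source,
        "target": target,
        "selector": selector,
        "continuous": True,
    }


def matches(doc, selector):
    """Equality-only subset of selector matching (no $gt, $in, etc.)."""
    return all(doc.get(field) == wanted for field, wanted in selector.items())


# Hypothetical documents on the server.
docs = [
    {"_id": "t1", "type": "task", "owner": "alice"},
    {"_id": "t2", "type": "task", "owner": "bob"},
    {"_id": "n1", "type": "note", "owner": "alice"},
]
selector = {"type": "task", "owner": "alice"}

# Only these documents would end up on the device.
synced_ids = [doc["_id"] for doc in docs if matches(doc, selector)]

body = json.dumps(
    replication_request(
        "https://couch.example.com/tasks",  # hypothetical remote database
        "http://localhost:5984/tasks",      # hypothetical local database
        selector,
    )
)
```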
        
             | gedy wrote:
             | Yeah agreed that's really cool. Closest I've seen is Apollo
             | client, but to your point you have a lot less fine grained
             | control.
        
           | rat9988 wrote:
           | What better options did you have in mind? Asking out of
           | curiosity; I don't follow this space closely.
        
             | gedy wrote:
             | It's been a while, but it seems like many people wanted
             | something simpler, like MongoDB, for a NoSQL document
             | database. CouchDB's map/reduce queries were hard for people
             | to get their heads around, many people didn't need
             | attachments, etc.
        
       | bsaul wrote:
       | Side note: I've heard FoundationDB was used for CloudKit, but is
       | it also used for iMessage?
       | 
       | It seems like its transactional properties would be quite well
       | suited to something like a messenger service (where the order of
       | messages matters, especially with e2e encryption)
        
         | navarro485 wrote:
         | Pretty sure Cassandra is used for iMessage, although that may
         | have changed after Apple acquired FoundationDB.
        
         | gigatexal wrote:
         | On device iMessage is SQLite I think. Backend not sure.
        
       | samwillis wrote:
       | While I don't have enough knowledge of the wider implications of
       | this, it does impact something I was experimenting with last
       | year.
       | 
       | The FoundationDB rewrite would introduce a size limit on document
       | attachments; there currently isn't one. Arguably attachments are
       | a rarely used feature, but I found a useful use case for them.
       | 
       | I combined the CRDT Yjs toolkit with CouchDB (PouchDB on the
       | client) to automatically handle sync conflicts. Each couch
       | document was an export of the current state of the Yjs Doc (for
       | indexing and search), all changes done via Yjs. The Yjs doc was
       | then attached to the Couch document as an attachment. When there
       | was a sync conflict the Yjs documents would be merged and re-
       | exported to create a new version. The issue being that the
       | FoundationDB rewrite would limit the size and that makes this
       | architecture more difficult. It's partly why I ultimately put the
       | project on hold.
       | 
       | (Slight aside: a CouchDB-like DB with native support for a CRDT
       | toolkit such as Yjs or Automerge would be awesome. When syncing
       | mirrors you would be able to exchange just the document state
       | vectors - the changes to it - rather than the whole document.)
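
To make the state-vector idea concrete, here is a toy model: each replica keeps an append-only log of operations per author, advertises how many operations it holds from each author, and receives only what it is missing. This is a sketch of the exchange pattern, not Yjs or Automerge; the `Replica` class, the op strings, and the delta format are all assumptions.

```python
class Replica:
    def __init__(self, replica_id):
        self.id = replica_id
        self.log = {}  # author id -> ops written by that author, in order

    def append(self, op):
        """Record a local edit under this replica's own id."""
        self.log.setdefault(self.id, []).append(op)

    def state_vector(self):
        """How many ops we hold from each author."""
        return {author: len(ops) for author, ops in self.log.items()}

    def delta_for(self, remote_vector):
        """Ops the remote lacks, as author -> (start offset, ops)."""
        delta = {}
        for author, ops in self.log.items():
            seen = remote_vector.get(author, 0)
            if len(ops) > seen:
                delta[author] = (seen, ops[seen:])
        return delta

    def apply(self, delta):
        """Append received ops, skipping any prefix we already hold."""
        for author, (start, ops) in delta.items():
            mine = self.log.setdefault(author, [])
            if len(mine) >= start:  # ignore deltas that would leave a gap
                mine.extend(ops[len(mine) - start:])


a, b = Replica("a"), Replica("b")
a.append("insert 'H'")
a.append("insert 'i'")
b.append("insert '!'")

# Each side sends its vector and receives only the missing ops.
b.apply(a.delta_for(b.state_vector()))
a.apply(b.delta_for(a.state_vector()))
```

After the two exchanges both replicas hold the same logs, and a further `delta_for` call returns an empty dict, so nothing is resent. Turning the logs into one agreed document is the part a real CRDT library handles.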
        
         | HelloNurse wrote:
         | But is it a small size limit that affects realistic usage?
         | Don't you have performance issues if you use a CRDT implemented
         | in JavaScript and running in the browser with large files?
        
           | samwillis wrote:
           | So yes, a particularly large document is not the norm but it
           | can happen.
           | 
           | JavaScript CRDTs can be quite performant, see the Yjs
           | benchmarks: https://github.com/dmonad/crdt-benchmarks
        
         | kevincox wrote:
         | I don't see a fundamental reason for an attachment size limit.
         | I guess it would just need to be implemented by breaking the
         | attachment into multiple keys? There may be some overhead, but
         | that seems worthwhile because it allows large attachments to
         | be split across servers as required.
        
           | tlarkworthy wrote:
           | When you chunk it you have to deal with what happens if that
           | process is interrupted. So it's not trivial (though solvable),
           | but it's the kind of atomicity you want the new engine to
           | provide.
        
             | aseipp wrote:
             | I think the person you're replying to is saying that the
             | document should be split across keys inside the
             | implementation, i.e. split across the fdb keyspace, not
             | split by the user at the application level. That is the
             | approach you almost always have to use for 'large' values:
             | FoundationDB has size limitations on the k/v pairs it can
             | accept, and splitting documents and writing those chunks in
             | small transactional batches is the recommended workaround
             | (along with a final 'switch over' transactional write
             | which makes the complete document visible all at once).
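
The chunk-then-switch-over pattern can be sketched with a plain dict standing in for the key-value store. This is not the CouchDB-FDB code or the FoundationDB client API; the key layout, `CHUNK_SIZE`, and the function names are assumptions chosen for illustration.

```python
CHUNK_SIZE = 10_000  # illustrative; picked to stay under a per-value limit


def write_attachment(store, doc_id, upload_id, data, chunk_size=CHUNK_SIZE):
    """Store `data` as numbered chunks, then publish it with one write."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for index, chunk in enumerate(chunks):
        # In a real store, small groups of these writes would each be
        # their own short transaction.
        store[("chunk", doc_id, upload_id, index)] = chunk
    # The switch-over: a single small write that readers follow. Until it
    # lands, the new chunks are invisible to them.
    store[("current", doc_id)] = (upload_id, len(chunks))


def read_attachment(store, doc_id):
    """Reassemble whichever upload the pointer currently names."""
    upload_id, count = store[("current", doc_id)]
    return b"".join(store[("chunk", doc_id, upload_id, i)] for i in range(count))


store = {}
write_attachment(store, "doc1", upload_id=1, data=b"a" * 25_000)

# Simulate a second upload that was interrupted before its switch-over.
store[("chunk", "doc1", 2, 0)] = b"partial"
```

Because readers only follow the pointer key, the interrupted second upload leaves orphaned chunks to clean up but never a half-visible attachment.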
        
           | tehbeard wrote:
           | If I remember the fdb docs, there's also a time limit on
           | transactions that further limits the feasible max size.
        
         | malkia wrote:
         | Reminds me of when a team I worked on had to migrate from one
         | database to another (we were the only team left using the old
         | one, and no one was supporting it internally). The new one had
         | a 22MB (or was it 44MB?) limit on the total transaction size,
         | which the previous one did not have (AFAIR). Someone worked on
         | splitting the work into several transactions (the bulk was
         | really due to long recorded "forum"-like conversation messages
         | related to specific data), but overall it changed how things
         | worked and caused some issues initially... Who would've thought
         | you would need that, years after it was originally designed...
        
         | robertnewson wrote:
         | The (low) attachment size limit at Cloudant is about service
         | quality and guiding folks to good uses of the service more than
         | a technical issue.
         | 
         | As others have noted, the solution to storing attachments in
         | FDB, where keys and values have an enforced maximum length, is
         | to split the attachments over multiple key/values, which is
         | exactly what the CouchDB-FDB code currently does.
         | 
         | The other limit in FDB is the five second transaction duration,
         | which is a more fundamental constraint on how large attachments
         | can be, as we are keen to complete a document update in a
         | single transaction. The S3 approach of uploading multiple parts
         | of a file and then joining them together in another request
         | would also work for CouchDB-FDB. While it _could_ be done,
         | there's no interest in the CouchDB project to support it.
        
           | samwillis wrote:
           | Exactly, almost all the time you would be better off saving
           | the attachment to an object store. However, I think I found
           | the small edge case where the attachment system was perfect.
           | It was essential to save the binary Yjs doc with the couch
           | document, since it needed to be synced to clients with the
           | main document. Saving it to an object store is not viable due
           | to the overhead during syncing.
        
             | robertnewson wrote:
             | yup. the purpose of couchdb's original attachment support
             | was for "couchapps". The notion that you'd serve your
             | entire application from couchdb. Attachments were therefore
             | for html, javascript, image, font assets, which are all
             | relatively small. The attachment support in CouchDB <= 3.x
             | is a bit more capable than that due to its implementation,
             | but storing large binaries was not strictly a goal of the
             | project.
        
         | tluyben2 wrote:
         | Why don't you open source your work? Can you contact me
         | otherwise, maybe I can take over this work on couchdb; we have
         | to do it anyway and we would open source it.
        
       | [deleted]
        
       | matlin wrote:
       | This is too bad. I understand there is likely a ton of complexity
       | in making this switch but I think it still leaves CouchDB with a
       | frustrating problem which is document conflicts within a given
       | cluster. Client <-> Server conflicts are very understandable but
       | when you might unexpectedly get a document conflict from two
       | server instances replicating with each other, you're just bound
       | to run into a bunch of issues.
       | 
       | To have multi-master work properly you basically need Strong
       | Eventual Consistency via CRDTs which most databases don't
       | natively support (I think only Riak). Otherwise, you're better
       | off switching to a single writer model.
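
Strong eventual consistency means replicas that have received the same updates end up in the same state, whatever order the updates arrived in. A grow-only counter is about the smallest CRDT that shows the property. The sketch below is generic and says nothing about how Riak or CouchDB are implemented; the node names and helper functions are assumptions.

```python
def increment(counter, node, amount=1):
    """Return a copy of `counter` with `node`'s own slot increased."""
    updated = dict(counter)
    updated[node] = updated.get(node, 0) + amount
    return updated


def merge(left, right):
    """Join two replicas by taking the per-node maximum."""
    return {
        node: max(left.get(node, 0), right.get(node, 0))
        for node in left.keys() | right.keys()
    }


def value(counter):
    """The counter's total is the sum of every node's slot."""
    return sum(counter.values())


# Two servers accept writes independently, then replicate both ways.
server_a = increment(increment({}, "a"), "a")
server_b = increment({}, "b")
merged_on_a = merge(server_a, server_b)
merged_on_b = merge(server_b, server_a)
```

Both merges yield the same state, and merging again changes nothing. No conflict is ever surfaced to the application, which is the contrast with CouchDB's revision conflicts that the comment draws.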
        
       | endisneigh wrote:
       | What's the simplest client or way to use foundationDB? I was
       | excited for this because FDB is somewhat unintuitive to use and
       | deploy
        
         | manishsharan wrote:
         | The FoundationDB Document Layer is compatible with the MongoDB
         | 3.x API, and you get transactional integrity:
         | https://github.com/FoundationDB/fdb-document-layer
         | 
         | I stopped using MongoDB and switched to this.
        
           | endisneigh wrote:
           | Amazing - much appreciated. How is it going compared to
           | mongo?
        
             | manishsharan wrote:
             | Better than MongoDB. Easy to scale up. And no MongoDB
             | gotchas for transaction.
             | 
             | I use my FoundationDB cluster as a MongoDB alternative and
             | a Redis alternative. Only one cluster to maintain and two
             | types of functionality! I have tried setting up and
             | maintaining clusters of MongoDB and Redis in the past and
             | it was horribly complicated. A FoundationDB cluster is so
             | much easier to set up and maintain. And it gives me the
             | functionality of both Redis KV and MongoDB.
        
       ___________________________________________________________________
       (page generated 2022-03-12 23:00 UTC)