[HN Gopher] Downsides of Offline First
       ___________________________________________________________________
        
       Downsides of Offline First
        
       Author : typingmonkey
       Score  : 254 points
       Date   : 2021-10-01 12:57 UTC (10 hours ago)
        
 (HTM) web link (rxdb.info)
 (TXT) w3m dump (rxdb.info)
        
       | akkartik wrote:
       | I remember the good old days when we had offline first by
       | default. We called them just "computer programs".
       | 
       | Offline first as a principle is more important than web apps. If
       | today's browsers have trouble with offline first, consider that a
       | downside of today's browsers.
        
       | candiddevmike wrote:
       | This is a good list. The app I built
       | (https://about.homechart.app) is "read only" offline first, it's
       | a compromise I chose over having to solve queuing writes and
       | consistency checks during the (assuming rare) occurrences users
       | find themselves offline. I'd add write support, but no one has
       | asked for it. Don't go into offline first thinking you need to
       | solve for writes, it can be added later if necessary.
        
         | JamesSwift wrote:
         | The write queue is messy but not hard as long as you are
         | serializing all the operations. The real difficult part is
         | conflict resolution. If you have an easy resolution model (e.g.
         | last one wins) then I think it's not too much extra work for
         | you.
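
A minimal sketch of the serialized write queue with a last-write-wins rule, as described in the comment above. The names `Op`, `WriteQueue` and `apply_lww`, and the integer `ts` field, are hypothetical and chosen only for illustration; a real app would also need durable storage and retry handling.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Op:
    """One queued write: set `key` to `value` at logical time `ts`."""
    key: str
    value: Any
    ts: int        # assumed timestamp; larger means later
    client: str    # tie-breaker when two timestamps are equal


@dataclass
class WriteQueue:
    """Offline writes are appended in order and flushed when back online."""
    pending: list = field(default_factory=list)

    def enqueue(self, op: Op) -> None:
        self.pending.append(op)

    def flush(self) -> list:
        ops, self.pending = self.pending, []
        return ops


def apply_lww(store: dict, ops: list) -> None:
    """Server side: keep a write only if it is newer than the stored one."""
    for op in ops:
        current = store.get(op.key)
        if current is None or (op.ts, op.client) > (current.ts, current.client):
            store[op.key] = op


server = {}
phone, laptop = WriteQueue(), WriteQueue()
phone.enqueue(Op("title", "Groceries", ts=1, client="phone"))
laptop.enqueue(Op("title", "Shopping list", ts=2, client="laptop"))

# The laptop syncs first and the phone later; the newer write still wins.
apply_lww(server, laptop.flush())
apply_lww(server, phone.flush())
print(server["title"].value)
```

Because the comparison only looks at `(ts, client)`, the order in which the queues are flushed does not change the result.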
        
       | [deleted]
        
       | endisneigh wrote:
       | In practice I believe last write wins or compare last two writes
       | is sufficient.
       | 
       | Any thoughts on this in practice?
        
         | sroussey wrote:
         | Last write wins is a simple CRDT merge type. It's just that
         | there are others depending on the data type. LWW might be good
         | for an avatar photo, but not great for a counter.
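
To illustrate the point about data types, here is a small sketch comparing a last-write-wins register with a grow-only counter (G-Counter). The function names and the `(timestamp, payload)` representation are assumptions made for this example.

```python
def merge_lww(a, b):
    """Last-write-wins register: a value is a (timestamp, payload) pair."""
    return max(a, b)


def merge_gcounter(a, b):
    """Grow-only counter: one slot per replica, merged with element-wise max."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def gcounter_value(counter):
    return sum(counter.values())


# Avatar: two replicas each pick a photo; keeping the later one is fine.
avatar = merge_lww((10, "cat.png"), (12, "dog.png"))

# Counter stored as an LWW register: both replicas start at 5 and add 1.
# The merge keeps only one of the two writes, so one increment is lost.
lww_count = merge_lww((10, 6), (12, 6))[1]

# Counter stored as a G-Counter: each replica only bumps its own slot.
base = {"a": 5}
replica_a = {**base, "a": 6}   # replica "a" increments
replica_b = {**base, "b": 1}   # replica "b" increments
g_count = gcounter_value(merge_gcounter(replica_a, replica_b))

print(avatar[1], lww_count, g_count)
```

The LWW counter ends at 6 although two increments happened, while the G-Counter ends at 7.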
        
       | megamix wrote:
       | Humans first. How about that for an idea.
        
         | lovemenot wrote:
         | Not great to be honest. I cannot imagine how to use this idea
         | to support or reject a decision.
         | 
         | Maybe as a slogan or sales pitch...
         | 
         | Reminds me of Fujitsu's Vision Statement: Human-centric
         | Computing. I asked a dozen employees of that company what it
         | meant, and not one could give a coherent answer
        
         | olah_1 wrote:
         | Idk why you're being voted down. It's a genuine design
         | principle. Some ideas that come to mind that fit with "human
         | first":
         | 
         | - offline-first
         | 
         | - human curation instead of algorithms (or at least transparent
         | algorithms that are customizable)
         | 
         | - the user is not the account number. the account is just the
         | mech that the user climbs into. user can have multiple mechs.
         | 
         | - leverage existing social fabric to provide better user
         | experience. account recovery, etc.
        
       | hdjjhhvvhga wrote:
       | An alternative: a native app.
        
         | endisneigh wrote:
         | native to what?
        
           | j1elo wrote:
           | The word "native" has been diluted in a sea of abstractions and
           | supporting technologies, but in this context I'd read "native
           | app" as removing the browser engine layer (so, not even
           | really native by far, but much closer to the OS and hardware
           | than when writing code on top of the mentioned layer)
        
             | hdjjhhvvhga wrote:
             | Yes. Putting an app in a browser has solved tens of
             | problems and introduced a hundred new ones. At some point
             | you start to wonder if a plain old native app wouldn't work
             | much better than an uncontrollably growing stack of
             | technologies, some of which were designed for something
             | completely different than what we're trying to accomplish.
        
               | hunterb123 wrote:
               | Okay. You still need a client side database with syncing.
               | Which is what this is about.
               | 
               | The pure native approach only solves (kind of) the
               | limited storage issues, nothing else.
               | 
               | If you nuke your site and only develop native apps
               | then you'll lose users to competitors because people can
               | try their service without downloading an app.
               | 
               | Yeah maybe we shouldn't have shoved an app run-time into
               | a document viewer, but here we are and things run
               | decently well (thanks to v8 and other tech) if you do
               | things right.
        
         | Veen wrote:
         | Surely many of the problems highlighted in this article apply
         | whether it's web or native. Unless all the app's data is
         | generated on device, it needs some way to synchronize with a
         | server, which might either be an "offline-first sync all the
         | data approach" or an "online-only sync a little bit at a time
         | approach".
        
           | lytefm wrote:
           | Exactly, and the boundaries blur even more when you consider
           | that it's possible to build apps that look native to the user
           | with web technology (Electron, Cordova,..)
           | 
           | And I'd add that even if all data is generated and stored on
           | the device, syncing capabilities are desirable if the user
           | wants to use his Offline-first app across devices.
        
       | ltearno wrote:
       | I once made an app and gave a talk on this matter. The problems
       | mentioned in the article are addressed from slide 15 onwards.
       | Mainly I remember having used Lamport clocks to track
       | rows' causal history. And negative primary keys for inserting
       | data in offline mode (these days I'd use UUID). Since IndexedDB
       | was not a thing yet on every browser, I used asm.js (ancestor to
       | WebAssembly) to compile SQLite for the browser. The database file
       | was stored in the LocalStorage and I used zlib (compiled with
       | emscripten) to make the most of the little space LocalStorage gave
       | us. It was a learning experience, and it worked in the end. I
       | wonder how I would do differently today... By the way, we were
       | using GWT to code for the browser, but that's just an anecdote
       | and not important for that matter...
       | 
       | Here is the link : https://fr.slideshare.net/ltearno/easing-
       | offline-web-applica...
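
For readers unfamiliar with Lamport clocks, a minimal sketch follows. It shows the general technique only and is not the implementation from the talk; the class and method names are hypothetical.

```python
class LamportClock:
    """Logical clock: counts events instead of reading wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event (for example a row edit): advance the clock."""
        self.time += 1
        return self.time

    def receive(self, remote_time):
        """On receiving a message, jump past the sender's clock."""
        self.time = max(self.time, remote_time) + 1
        return self.time


client, server = LamportClock(), LamportClock()
t1 = client.tick()         # first offline edit on the client
t2 = client.tick()         # second offline edit
t3 = server.receive(t2)    # server learns about the edits
t4 = server.tick()         # edit made on the server
t5 = client.receive(t4)    # client syncs back
print(t1, t2, t3, t4, t5)
```

If one event causally precedes another, its timestamp is smaller, which is what makes the values usable for tracking a row's causal history.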
        
         | JamesSwift wrote:
         | I've also gone the negative key route, and would move to a
         | "client dictates the key via UUID" like you mention if I redid
         | it today.
        
           | mamcx wrote:
           | Why? I have done this before and found negative keys much easier
           | to use and understand. UUIDs have the nasty property of denying
           | easy debugging, logging or any kind of HUMAN understanding...
        
             | JamesSwift wrote:
             | The primary benefit is there's no post-hoc reconciliation
             | you need to do where you re-write all the foreign keys and
             | object ids. It greatly streamlines the entire process, and
             | lets you eliminate a lot of code / potential bugs.
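
The difference between the two approaches can be sketched as follows, assuming a traditional client-server setup. The table shapes (`invoices`, `lines`), the helper names and the starting server id 501 are all made up for illustration.

```python
import uuid


def reconcile_negative_keys(invoices, lines, next_server_id):
    """Swap temporary negative invoice ids for server ids and fix foreign keys.

    Line ids would need the same treatment; omitted to keep the sketch short.
    """
    id_map = {}
    for invoice in invoices:
        id_map[invoice["id"]] = next_server_id
        invoice["id"] = next_server_id
        next_server_id += 1
    for line in lines:
        line["invoice_id"] = id_map.get(line["invoice_id"], line["invoice_id"])
    return id_map


def new_invoice_with_uuids(customer, amounts):
    """Client picks the final keys up front, so nothing is rewritten later."""
    invoice_id = str(uuid.uuid4())
    invoice = {"id": invoice_id, "customer": customer}
    lines = [{"id": str(uuid.uuid4()), "invoice_id": invoice_id, "amount": a}
             for a in amounts]
    return invoice, lines


# Negative-key route: rows created offline, then remapped after upload.
invoices = [{"id": -1, "customer": "ACME"}]
lines = [{"id": -1, "invoice_id": -1, "amount": 100},
         {"id": -2, "invoice_id": -1, "amount": 20}]
id_map = reconcile_negative_keys(invoices, lines, next_server_id=501)

# UUID route: the same rows need no remapping step.
uuid_invoice, uuid_lines = new_invoice_with_uuids("ACME", [100, 20])
print(id_map, uuid_invoice["id"])
```

With negative keys every table that references the remapped id needs its own rewrite pass; with client-generated UUIDs that pass disappears.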
        
               | mamcx wrote:
               | I found it makes it far easier to see which objects are
               | candidates for sync, and to know the original row on the
               | server.
               | 
               | I think it makes a difference whether there is a master
               | database or it is peer-to-peer.
        
               | JamesSwift wrote:
               | Yeah peer-to-peer is a different scenario. I'm assuming
               | traditional client-server.
        
             | WorldMaker wrote:
             | Some of that depends on the version of UUID you are using.
             | v1 and v6 UUIDs with timestamps can prove very useful for
             | debugging. Of course v1 is problematic because it embeds the MAC
             | address, and v6 is still just "draft/experimental" with the
             | IETF.
             | 
             | For my offline-first apps I settled on using ULIDs, which have
             | time stamps and put them first so that they sort
             | lexicographically (such as when included in string
             | CouchDB/PouchDB _ids), and I've been pretty happy with
             | that. That timestamp up front in the first few bytes can
             | help a lot in debugging/"human understanding" of about
             | where the ULID fits in a log stream.
             | 
             | I can also tell you at this point way more than you care to
             | know about storing ULIDs in Microsoft SQL Server to get
             | decent clustered index behaviors.
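
A rough sketch of how a ULID is put together, following the published ULID layout (48-bit millisecond timestamp, then 80 random bits, encoded as 26 Crockford base32 characters). The function names are hypothetical, and the sketch skips the spec's optional monotonic handling of ids generated within the same millisecond.

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None):
    """Build a 26-character ULID: timestamp first, random bits after."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")
    value = (timestamp_ms << 80) | randomness
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def ulid_timestamp_ms(ulid):
    """Recover the millisecond timestamp from the first 10 characters."""
    ts = 0
    for ch in ulid[:10]:
        ts = ts * 32 + CROCKFORD.index(ch)
    return ts


earlier = new_ulid(timestamp_ms=1_000)
later = new_ulid(timestamp_ms=2_000)
print(earlier < later, ulid_timestamp_ms(later))
```

Since the timestamp occupies the leading characters and the alphabet is in ascending ASCII order, plain string sorting orders ids by creation time, which is what helps when reading a log stream.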
        
         | leetrout wrote:
         | Thanks for sharing.
         | 
         | I do not miss GWT but I think it was inspirational.
        
       | jadafaa wrote:
       | Enjoyed reading this
        
       | josephg wrote:
       | I'm a "true believer" in CRDTs, which I have some experience in.
       | You can implement a useful CRDT for simple applications in under
       | 100 lines if all you care about are standard database objects -
       | like maps, sets and values. List CRDTs are where they get
       | complicated, but most applications aren't collaborative text
       | editors.
       | 
       | The promise of CRDTs is that unlike most conflict resolution
       | systems, you can layer over a crdt library and basically ignore
       | all the implementation details. Applications should (can) mostly
       | ignore how a CRDT works when building up the stack.
       | 
       | The biggest roadblock to their use is that they're poorly
       | understood. Well, that and implementation maturity. Automerge-rs
       | merged a PR the other day which brought a 5 minute benchmark run
       | down to 2 seconds. Bit by bit we're getting there.
        
         | the_duke wrote:
         | I might be missing something, but I have trouble seeing how
         | CRDTs can work for regular CRUD style applications.
         | 
         | Just the first example that pops into my head: edit A sets an
         | invoice status to paid, edit B changes the invoice amount from
         | 100 to 120. The merge is a paid invoice with an incorrect
         | amount.
         | 
         | A workaround would be to record a separate PaidInvoice that
         | won't be changed by the application logic.
         | 
         | But that's just a really trivial example that only involves
         | scalar fields inside a single object, and also relies on the
         | application logic considering all ways that the CRDT might
         | behave.
         | 
         | There are countless ways to end up with data that violates the
         | constraints of the domain.
         | 
         | Is there any theoretical groundwork happening on how CRDTs can
         | preserve domain semantics?
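
The invoice example above can be reproduced with a per-field last-write-wins merge. This is a sketch under the assumption that each field carries its own timestamp; the field names and timestamps are invented for the example.

```python
def merge_fields(a, b):
    """Per-field last-write-wins: each field maps to (timestamp, value)."""
    merged = dict(a)
    for name, entry in b.items():
        if name not in merged or entry > merged[name]:
            merged[name] = entry
    return merged


original = {"status": (1, "open"), "amount": (1, 100)}
edit_a = {**original, "status": (2, "paid")}   # edit A: mark as paid
edit_b = {**original, "amount": (3, 120)}      # edit B: raise the amount

merged = merge_fields(edit_a, edit_b)
invoice = {name: value for name, (_, value) in merged.items()}
print(invoice)
```

The merge converges on every replica, yet the result is a paid invoice with amount 120, a state neither editor intended. The data structure is consistent; the domain rule ("a paid invoice's amount is frozen") is not something the merge knows about.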
        
           | olah_1 wrote:
           | I think CRDTs necessitate a robust permission system to be
           | more than a toy.
           | 
           | If you want a decentralized CRDT, you'd probably end up re-
           | inventing smart contracts (which is what I found myself doing
           | once).
        
           | ollysb wrote:
           | Your general point notwithstanding, in this case wouldn't the
           | merge be a partially paid invoice?
        
         | mamcx wrote:
         | I'm interested to see how this can apply to "regular" database
         | apps (like invoicing). I need to add sync/offline to my main
         | app and want something solid to build upon (I have done
         | homemade things before) and have wondered if CRDTs could be
         | applied, but how?
        
           | azteceagle wrote:
           | We have the same need right now. Something that we have
           | researched and think has a lot of potential, at least for
           | us, is James Long's sync implementation. You can check out his
           | talks and demos at:
           | 
           | * https://www.youtube.com/watch?v=2dh_gtndayY&feature=emb_imp_...
           | 
           | * https://www.youtube.com/watch?v=DEcwa68f-jY
           | 
           | And his demo implementation (and annotated fork):
           | 
           | * https://github.com/jlongster/crdt-example-app
           | 
           | * https://github.com/clintharris/crdt-example-app_annotated/blob/master/NOTES.md
           | 
           | I wonder why there isn't some open source engine based on
           | this at least for CRUD apps since it has a lot of potential
           | and it is really "simple" to implement and even understand.
        
           | a_conservative wrote:
           | I'm a CRDT newbie, but I'll take a stab, hopefully someone
           | can correct me if I'm getting it wrong.
           | 
           | After a quick read of "CRDTs go brrr" and the Wikipedia
           | page, I think a CRDT gives us a mathematical strategy for
           | resolving conflicts. It doesn't mean that the end result will
           | make sense.
           | 
           | The Wikipedia article gives an example of merging an event
           | flag represented by a boolean variable. So the var in this
           | case means that "someone observed this event happening". So
           | the rule for merging this var from different sources is
           | simple, if any source of data reports the var as true, the
           | merged result should be true as well.
           | 
           | The implication is it matters what the data represents, not
           | just whether it is a boolean or a string, etc.
           | 
           | I'm guessing that a collaborative notes field, or a "did
           | someone call this customer" boolean might benefit from a CRDT
           | more so than keeping track of bank account values.
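
The event-flag merge described above is small enough to write out. The variable names are hypothetical; the rule is simply "true if any replica reports true".

```python
from functools import reduce


def merge_flag(a: bool, b: bool) -> bool:
    """An observed event stays observed: the merge is a logical OR."""
    return a or b


# What three replicas recorded for "did someone call this customer?"
reports = [False, True, False]
customer_was_called = reduce(merge_flag, reports, False)

# The order in which the reports arrive does not matter.
same_in_reverse = reduce(merge_flag, reversed(reports), False)
print(customer_was_called, same_in_reverse)
```

The rule works because of what the boolean means. The same OR applied to, say, an "account is active" flag would make it impossible to ever deactivate an account.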
        
           | fauigerzigerk wrote:
           | I would like to know the answer to that as well.
           | 
           | My current understanding is that CRDTs merely guarantee
           | deterministic merging of updates to some basic data structures,
           | where deterministic means that the outcome is well defined
           | and always the same regardless of the order in which updates
           | are merged.
           | 
           | That doesn't mean the outcome makes any sense at all in terms
           | of the sort of application level requirements and constraints
           | you would typically find in a transactional database
           | application. Conflicts may still arise on that level.
           | 
           | So I think what we need to really fulfill the promise of
           | CRDTs is a way to express those application level constraints
           | on top of them.
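
Both halves of this comment can be shown with a grow-only set: the merge gives the same result in every order, and an application-level rule can still be broken. `MAX_SEATS` is a hypothetical constraint invented for the example.

```python
from itertools import permutations


def merge_set(a: frozenset, b: frozenset) -> frozenset:
    """Grow-only set: merging is set union."""
    return a | b


def merge_all(order):
    state = frozenset()
    for update in order:
        state = merge_set(state, update)
    return state


# Three replicas each book one attendee while offline.
updates = [frozenset({"alice"}), frozenset({"bob"}), frozenset({"carol"})]

# Deterministic: every merge order produces the same final state.
outcomes = {merge_all(order) for order in permutations(updates)}
converged = len(outcomes) == 1

# Hypothetical application rule the CRDT knows nothing about.
MAX_SEATS = 2
final = merge_all(updates)
violates_rule = len(final) > MAX_SEATS
print(converged, violates_rule)
```

Every replica agrees on the final set, and every replica agrees on a set that overbooks the room.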
        
         | JamesSwift wrote:
         | Just read your "crdts-go-brrr" post from the other commenter
         | and just wanted to say thanks for writing all that up! Great
         | insight and great information.
        
         | thamer wrote:
         | I was also a "true believer" in CRDTs for a long time,
         | implementing my first ones in Erlang about 9 years ago[1], but
         | my opinion of where they fit has changed significantly.
         | 
         | The one issue with CRDT that I find is rarely mentioned and
         | often ignored is the case where you've deployed these data
         | structures that include merge logic to a set of participating
         | nodes that you can't necessarily update at will. Think phones
         | that people don't update, or IOT/sensor devices like electric
         | meters or other devices "in the wild".
         | 
         | When you include merge logic - really any code or rules that
         | dictate what happens when the data of 2 or more CRDTs are
         | merged - and you have bugs in this code running on devices you
         | can never update, this can be a huge mess. Sure you can
         | implement simple counters easily (like the ones I linked to),
         | and you can even use model checking to validate them. But what
         | about complex tree logic like for edits made to a document?
         | Conflict resolution logic? Distributed file system operations?
         | These are already very complex and hard to get right without
         | multiple versions involved and unfixable bugs causing mayhem.
         | 
         | Having to deal with these bugs in the context of a fleet of
         | participants on a wide range of versions of the code, the
         | combinatorial explosion of the number of possible interactions
         | and effects of these differing versions and bugs taken
         | _together_ can really become impossible to manage.
         | 
         | I'd be interested to hear from folks who have experience with
         | these kinds of issues and how they have dealt with them,
         | especially if they are still convinced that CRDTs were the
         | right choice.
         | 
         | [1] https://github.com/nicolasff/distributed-counters
        
           | bmurphy1976 wrote:
           | How are CRDTs unique in this respect? Ossification is a
           | challenge for all APIs, domain models, network layers,
           | protocol specs, etc.
        
             | lazide wrote:
             | Because those other options abstract the decision making
             | away from the 'can never be updated' part usually into some
             | part of the system where the consequences of a
             | problem/interest in the right solution are also collocated
             | with someone who can do something about it (an API server
             | with ossified clients, or network switch with buggy NICs
             | wired into it, etc.)
             | 
             | If truly peer to peer, that is a lot less clear - do you
             | end up in a collaborative p2p document model forking the
             | documents between 'new rev' clients and 'old rev' clients?
             | Who 'wins'? What is the consequence of losing?
             | 
             | At least an API server can clearly reject the client and
             | give an error message - if it's a CRDT, how does that work?
        
         | hinkley wrote:
         | I started trying to build a non-PHP Wiki for a hobby, and the
         | notion that I was going to have to implement my own version
         | control on top of it stymied me, so CRDTs looked good to me
         | when I first started to get familiar with the idea.
         | 
         | Trac (project management) does some stuff behind the scenes
         | stored in svn, which it uses for edit history. I always liked
         | that idea. Why have two? I have wished for some time that
         | someone did a Trac for git. I just don't want to be the one to
         | write it.
         | 
         | I've also wished for some time that someone would make a new
         | git, not designed by barbarian cannibals, potentially based on
         | CRDTs.
         | 
         | Somehow these notions melted together and now making progress
         | hinges on whether there's a CRDT out there that's up to the
         | task of managing source code - and written in a language I can
         | handle. So far those two haven't appeared, and based on the
         | number of corner cases I've heard described when I watch
         | lectures on CRDTs, I'm pretty sure if I tried to write one
         | myself I'd never see daylight again, but might see the inside
         | of a padded room. Do you think it's safe to say we're still in
         | the distillation phase of invention where CRDTs are concerned?
         | Is this accidental complexity we are seeing or is most of it
         | intrinsic?
         | 
         | I'm now spending a lot of my free time on the hobby instead
         | of on writing collaborative tools and/or accidentally writing a
         | Trac replacement.
        
         | a_conservative wrote:
         | Can you point to a good introduction to Conflict-free
         | replicated data types (CRDT)?
         | 
         | It's crossed my (short attention span) radar a couple times and
         | seems interesting. I really love the idea about being able to
         | use it, but also "forget" about it while developing.
         | 
         | Does "leaky abstraction" apply here? If you constrain things
         | enough, can a dev really use it and forget about the details?
        
         | BiteCode_dev wrote:
         | Man, I will remember Google Wave for the rest of my life. It
         | was really ahead of its time. Kudos for being that visionary.
        
         | BiteCode_dev wrote:
         | We all think of CRDTs for collaborative text editors, but can
         | you give examples of how they're used in the wild that are
         | unexpected, yet very useful?
        
         | jitl wrote:
         | I work on a collaborative text editing system
         | (https://www.notion.so) and read a lot about CRDTs, thank you
         | for your work in this area, particularly
         | https://josephg.com/blog/crdts-go-brrr/
         | 
         | I agree with your assessments here - CRDT is the way forward
         | for most applications; no user wants to fiddle with a merge UI
         | or picking versions like with iCloud. I think RxDB's position
         | here is from their CouchDB lineage.
         | 
         |  _> The biggest roadblock to their use is that they're poorly
         | understood. Well, that and implementation maturity._
         | 
         | I certainly have more understanding to do. My biggest open
         | question is how to design my centralized server side storage
         | system for CRDT data. To service writes from old clients I need
         | to retain a total history of a document, but I don't want to
         | force new clients to download the entire history nor do I want
         | these big histories in my cache, so I end up wanting a hot/cold
         | system; and building that kind of thing and dealing with the
         | edge cases seems like more than 100 lines of code.
         | 
         | It seems like the Yjs authors also recognize that CRDT storage
         | on the server is an area to address, there was some work on a
         | custom database in 2018, although my thinking is more about how
         | to retrofit text CRDTs into my existing very conservative
         | production cloud software stack than about writing to block
         | storage.
        
           | derefr wrote:
           | > no user wants to fiddle with a merge UI or picking versions
           | like with iCloud.
           | 
           | For prose text, what do you think about combining a document-
           | scale CRDT, with _fine-grained_ locking -- e.g. splitting the
           | document into a  "list of lines/sentences", where lines have
           | identity, and then only allowing one person to be modifying
           | _a given line_ at a time?
           | 
           | I've always felt like this was under-explored, given that in
           | prose text it's almost always _semantically_ incoherent for
           | multiple people to be trying to change a single sentence in
           | different ways at the same time anyway (i.e. they would each
           | have a series of edits they want to do to line A to turn it
           | into either A' or A"; but any result that's not either
           | purely A' or A" will very likely be nonsensical. One could
           | say that the A - A' transformation a user does to a sentence
           | is _intentionally transactional_.)
           | 
           | I almost thought Notion would be a good example of this, but
           | apparently not (https://www.notion.so/Real-time-
           | collaboration-20a1873baf334d...) -- they actually do allow
           | multiple users to be editing the same leaf-node content block
           | at the same time, and so have taken on the full scope of the
           | CRDT problem.
           | 
           | > but I don't want to force new clients to download the
           | entire history nor do I want these big histories in my cache,
           | so I end up wanting a hot/cold system; and building that kind
           | of thing and dealing with the edge cases seems like more than
           | 100 lines of code.
           | 
           | Yes, but these needs are able to be cleanly abstracted away
           | on the backend -- there are internally-complex infrastructure
           | components (like CouchDB, or Kafka) that expose a pure CQRS
           | model to clients, but internally are doing fancier stuff
           | involving reducing changes onto snapshots and then exposing
           | new CQRS streams that begin history with atomic "introduce
           | all this stuff from a snapshot"-typed 'changes'.
           | 
           | There's also some convergent evolution happening here with
           | non-replay sync strategies in blockchains (which can be more
           | interesting to look at if you care about the serverless p2p
           | Operational Transformation type use-case of CRDTs.)
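
The "reduce changes onto a snapshot, then start history from the snapshot" idea can be sketched on a toy key-value change log. The change kinds (`set`, `delete`, `snapshot`) and function names are assumptions for illustration and do not correspond to CouchDB's or Kafka's actual APIs.

```python
def apply(state, change):
    """Apply one change to a dict state and return the new state."""
    kind, payload = change
    if kind == "snapshot":
        return dict(payload)
    if kind == "set":
        key, value = payload
        return {**state, key: value}
    if kind == "delete":
        return {k: v for k, v in state.items() if k != payload}
    raise ValueError(kind)


def replay(log):
    state = {}
    for change in log:
        state = apply(state, change)
    return state


def compact(log, keep_last):
    """Fold all but the last `keep_last` changes into one snapshot change."""
    if keep_last >= len(log):
        return list(log)
    cut = len(log) - keep_last
    return [("snapshot", replay(log[:cut]))] + list(log[cut:])


log = [
    ("set", ("title", "Draft")),
    ("set", ("body", "Hello")),
    ("set", ("title", "Final")),
    ("delete", "body"),
]
short_log = compact(log, keep_last=1)
print(len(short_log), replay(short_log) == replay(log))
```

A new client replays `short_log` and reaches the same state without downloading the full history. A client that still needs to merge edits made against an older version would need the folded-away changes, which is where a hot/cold split comes in.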
        
             | tylerhou wrote:
             | There is no notion of "same time" in a distributed system
             | -- what if a client locks a line and disconnects?
             | 
             | Also, it leads to a poor user experience.
        
               | derefr wrote:
               | My vision of this wouldn't involve not accepting edits;
               | but rather, the document would react to you trying to
               | edit a locked block the same way macOS reacts to you
               | trying to edit a locked document: offer to duplicate the
               | block and then let you edit the duplicate. Both versions
               | of the block would then appear in the document, perhaps
               | with some annotation that they're sibling forks of the
               | same original text. (Compare and contrast: books that
               | compare-and-contrast works with subtle changes between
               | different versions, e.g. the four Gospels of the Bible.)
               | 
               | Also, I'm speaking about locking that's fine-grained
               | temporally as well as spatially: a line would only need
               | to be locked with a ~10s TTL when a user begins to type
               | in that line. Think of it like the user composing a
               | transaction of modifications to the same line, and then
               | committing it. A lot like typing a message into a chat
               | program. Just the user having their cursor on the line,
               | wouldn't imply that the line is locked; it would only
               | lock when they start typing.
               | 
               | This is already how group-chat apps work, mind you; if
               | you're an admin who can edit other people's message
               | lines, you nevertheless can't edit someone else's message
               | line while _they're_ editing it. But they're only
               | considered to be editing it while they're actively
               | typing, and for a few seconds after that. If they go
               | idle, someone else edits their message-line, and then you
               | come back and try to submit your edit, it will be
               | rejected. (Of course, that behaviour makes perfect sense
               | for group chat software, where the only other people who
               | can edit your text are moderators, and so moderation
               | actions _should_ "trump" user actions. In a p2p
               | collaboration context, IMHO adding
               | resistance/intentionality to per-line forking, but nevertheless
               | allowing it, makes the most sense.)
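
A sketch of the short-lived per-line lock described above, assuming a single central server that owns the clock. That assumption matters: the sketch does not address partitions or peer-to-peer operation. `LineLocks` and `try_type` are hypothetical names, and the clock is injectable so the example runs without sleeping.

```python
import time


class LineLocks:
    """Per-line leases that expire after `ttl_seconds` without typing."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.leases = {}   # line_id -> (user, expires_at)

    def try_type(self, line_id, user):
        """Called when `user` types in a line; returns True if allowed."""
        now = self.clock()
        holder = self.leases.get(line_id)
        if holder and holder[0] != user and holder[1] > now:
            return False   # someone else typed here within the TTL
        # Take the lease, or renew it if this user already holds it.
        self.leases[line_id] = (user, now + self.ttl)
        return True


now = [0.0]   # fake clock, advanced by hand
locks = LineLocks(ttl_seconds=10.0, clock=lambda: now[0])

first = locks.try_type("line-7", "ann")        # ann starts typing
now[0] = 4.0
blocked = locks.try_type("line-7", "bob")      # still leased to ann
now[0] = 15.0
after_idle = locks.try_type("line-7", "bob")   # ann's lease has expired
print(first, blocked, after_idle)
```

A rejected `try_type` is where the proposed "duplicate the block and edit the copy" behaviour would hook in.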
        
               | resoluteteeth wrote:
               | > My vision of this wouldn't involve not accepting edits;
               | but rather, the document would react to you trying to
               | edit a locked block the same way macOS reacts to you
               | trying to edit a locked document: offer to duplicate the
               | block and then let you edit the duplicate. Both versions
               | of the block would then appear in the document, perhaps
               | with some annotation that they're sibling forks of the
               | same original text. (Compare and contrast: books that
               | compare-and-contrast works with subtle changes between
               | different versions, e.g. the four Gospels of the Bible.)
               | 
               | If you're using a system where you're guaranteed to have
               | knowledge of what other people are editing at all times,
               | there's really no need to use CRDTs in the first place.
        
               | tylerhou wrote:
               | Group chat apps don't care about preserving functionality
               | when the network drops. But a document editor does. And
               | editing a message is relatively rare in chat apps so they
               | can afford to lock; editing a line is common in documents
               | and the chance of a lock conflict is higher.
               | 
               | Your suggestion also assumes that the network is
               | reliable. What happens if a user takes a lock and there
               | is a partition? If the document is P2P, there is no
               | central authority; when should the other participants
               | override the lock? How much overhead does that add to the
               | protocol?
               | 
               | The main point is that there is no notion of a central
               | clock in a distributed system; hence "lock temporally" is
               | not precise. Relative to which participant? And what
               | happens when messages are dropped? (Even the lock message
               | might be dropped!) A distributed lock implementation is
               | non-trivial.
               | 
               | https://lamport.azurewebsites.net/pubs/time-clocks.pdf
               | 
               | https://static.googleusercontent.com/media/research.googl
               | e.c...
               | 
               | https://en.m.wikipedia.org/wiki/Fallacies_of_distributed_
               | com...
        
             | suchire wrote:
             | I don't personally have experience using Quip, but from
             | engineers I know who work there, fine-grained locking is
             | how Quip handles collaboration.
        
               | jitl wrote:
               | The locking in Quip is like a UI concern - it's not a
               | guarantee, and I don't know how (or if) Quip handles
               | concurrent offline edits. As a user of Quip (while at
               | Airbnb) I was pretty frustrated by the lock UI, although
               | it improved once they added the "steal this lock" button.
        
           | josephg wrote:
           | Oh cool! I've wanted something like notion for years. Ideally
           | on top of CRDTs (so I own my own data). I really appreciate
           | all the work your company is doing! Feel free to get in touch
           | if you want to have a proper chat about this stuff.
           | 
           | > My biggest open question is how to design my centralized
           | server side storage system for CRDT data. To service writes
           | from old clients I need to retain a total history of a
           | document, but I don't want to force new clients to download
           | the entire history nor do I want these big histories in my
           | cache, so I end up wanting a hot/cold system; and building
           | that kind of thing and dealing with the edge cases seems like
           | more than 100 lines of code.
           | 
           | Yeah definitely more than 100 lines of code. I'm sad to
           | report that in diamond types (my own CRDT) I've spent ~12000
           | lines of code in an attempt to solve some of these problems.
           | I could probably get that down under 3000 loc in a rewrite if
           | I'm happy to throw away some of my optimizations. Doing so
           | would dramatically lower the size of the compiled wasm bundle
           | too - though the wasm bundle is still comfortably under 100kb
           | over the wire, so maybe it's fine?
           | 
           | Regarding history, I have a lot of thoughts. The first is
           | that with the right approach, historical data compresses
           | really well. Martin Kleppmann's automerge-perf data set has
           | 260k edits ending in 100kb of text. The saving system I'm
           | working on can store the entire editing history (enough to
           | merge changes from any version) in this example with just
           | 23kb of overhead on disk. I think that resulting data set
           | might only need to be accessed in the case of concurrent
           | changes, and then only back as far as the common ancestor.
           | But I haven't implemented that optimization yet.
           | 
           | And yeah; I've been thinking a lot about what a CRDT-native
           | database could look like too. There's way too many
           | interesting and useful problems here to explore.
        
           | prox wrote:
           | I've really seen a lot of buzz recently about notion, cool work
           | you are doing.
        
           | taeric wrote:
           | While I am sympathetic to the idea that users don't like
           | merges, they also hate idiotic combinations of data without
           | their oversight. So, bit of a rock and a hard place.
           | 
           | If you want collaboration between people, you have to
           | structure it in a way that makes it a conversation, I
           | believe.
           | 
           | I could almost see an idea that you could pattern it after
           | musicians playing together, but that is a very particular
           | kind of rehearsal that has not been done in any other
           | practice, as far as I am aware. Improv may come close, but
           | even that has very specific techniques that really don't make
           | sense in a CRDT landscape.
        
           | somenewaccount1 wrote:
           | Is there a developer community or forum for the notion.so
           | api? The slack channel published on your developer website is
           | down.
        
             | jitl wrote:
             | We have the Slack you mentioned (invite:
             | https://notiondevs.slack.com/join/shared_invite/zt-
             | vkinpzs0-...) and a Stack Overflow tag
             | (https://stackoverflow.com/questions/tagged/notion-api).
             | 
             | The Slack link from https://developers.notion.com works for
             | me, maybe Slack has a DNS issue according to this article?
             | https://www.theverge.com/2021/9/30/22702876/slack-is-down-
             | ou...
        
               | somenewaccount1 wrote:
               | Thank you for the correct link
               | 
               | The Slack link I mentioned is
               | https://join.slack.com/t/notiondevs/shared_invite/zt-
               | lkrnk74...
               | 
               | The invite "has expired".
               | 
               | It's linked in the footer of
               | https://developers.notion.com/
        
           | [deleted]
        
         | fendrak wrote:
         | I would very much love to see the "simple CRDT" implementation
         | described above, seems like it would be a great learning tool
         | and/or foundation on which to build something more complicated!
        
           | azteceagle wrote:
           | This presentation by James Long helped me a lot:
           | https://www.youtube.com/watch?v=DEcwa68f-jY
        
         | mikevm wrote:
         | So how do the various popular CRDT libraries compare nowadays?
         | 
         | There's Yjs (with a Rust port that is in progress), Automerge-
         | rs, and your own diamond-types project :).
         | 
         | Is Yjs still the current go-to for most projects' needs?
        
           | hunterb123 wrote:
           | There's also GUN (https://gun.eco/) if you want a CRDT graph
           | p2p syncing database. I have no relation to the project, but
           | the community is awesome.
        
         | amelius wrote:
         | > I'm a "true believer" in CRDTs
         | 
         | Don't make the same mistake as the Google Wave team.
         | Collaborative editing is a great intellectual challenge to work
         | on, but in reality users don't care much about documents that
         | auto-update in real time. In fact, it can be annoying.
        
           | readams wrote:
           | That's like the main feature of Google Docs and people use it
           | all the time.
        
             | Thaxll wrote:
             | Most of the time they don't use it for the realtime aspect
             | of it, they use it to easily share documents.
        
               | twicetwice wrote:
               | I think you're typical-minding here. The realtime aspect
               | is used all the time by many many people.
        
           | jitl wrote:
           | You're replying to someone who worked on Wave.
        
           | cjblomqvist wrote:
           | Parent was actually a part of the team doing the core engine
           | for Wave, so in that case it's more like he never dropped his
           | belief in it.
        
         | jasode wrote:
         | > _I'm a "true believer" in CRDTs, which I have some experience
         | in._
         | 
         | What is the largest-scale or highest-profile real-world usage
         | of CRDT today?
         | 
         | (I glanced at this CRDT vs OT topic before but I'm not up-to-
         | date on where things stand in the real-world performance:
         | https://news.ycombinator.com/item?id=23988999)
        
           | dunham wrote:
           | It's not really high scale / high performance, but Apple's
           | notes.app uses state based CRDTs internally for conflict
           | resolution. It is perhaps high profile, since a lot of people
           | are using it.
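
For readers unfamiliar with the term, here is a minimal sketch of a state-based CRDT, using a grow-only counter as the example. It says nothing about how Apple's implementation works; the type and function names are hypothetical.

```typescript
// Replica id -> number of increments observed from that replica.
type GCounter = Record<string, number>;

// Each replica only ever bumps its own slot.
function gcIncrement(state: GCounter, replicaId: string): GCounter {
  const next: GCounter = { ...state };
  next[replicaId] = (next[replicaId] ?? 0) + 1;
  return next;
}

// Merge takes the per-replica maximum, which makes it commutative,
// associative and idempotent, so replicas can sync in any order.
function gcMerge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const id of Object.keys(b)) {
    out[id] = Math.max(out[id] ?? 0, b[id] ?? 0);
  }
  return out;
}

// The counter's value is the sum over all replicas.
function gcValue(state: GCounter): number {
  return Object.keys(state).reduce((sum, id) => sum + (state[id] ?? 0), 0);
}
```

Replicas exchange their whole state and merge it. Because merging twice or in a different order gives the same result, no coordination is needed.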
        
           | dugmartin wrote:
           | I'm not sure about the largest-scale single usage but the
           | Phoenix Framework uses CRDTs for handling user presence.
           | 
           | It isn't enabled by default but it is very easy to use and
           | the CRDT backend is basically hidden away.
           | 
           | More info:
           | 
           | - https://hexdocs.pm/phoenix/presence.html
           | - https://dockyard.com/blog/2016/03/25/what-makes-phoenix-
           | pres...
        
           | jitl wrote:
           | Grandparent wrote a good article on CRDT performance
           | https://josephg.com/blog/crdts-go-brrr/
        
           | LAC-Tech wrote:
           | TomTom uses CRDTs for their SatNav software.
        
       | LeanderK wrote:
       | I am a strong believer in PWAs. I think most of the apps I use
       | could be PWAs without a problem. I really don't get why Apple
       | isn't developing them for MacOS, since I've never really used the
       | app-store on MacOS and in comparison to iOS a lot of the apps on
       | my Mac are productivity apps that are basically electron-apps. I
       | think some basic PWA functionality on the desktop would be more
       | interesting for me than more advanced PWAs for ios.
        
         | jdavis703 wrote:
         | I want to believe in PWAs. But in the real world they are just
         | way too clunky. I've observed this on both software I've
         | written, and on PWAs from teams that should objectively know
         | what they're doing like Google Calendar.
        
         | atatatat wrote:
         | PWAs lead to the walls of Apple's walled garden being torn
         | down.
         | 
         | They'll fight it with subtlety until the end, and I hope they
         | burn in hell for it.
        
           | blauditore wrote:
           | Exactly. They introduced auto-deletion of localStorage under
           | the guise of "privacy", when it was really about driving
           | people to build standalone apps (and use their store).
        
             | breakfastduck wrote:
             | Good. PWAs are absolutely awful compared to native apps.
        
         | breakfastduck wrote:
         | Because native apps are basically a huge chunk of the appeal to
         | using macOS and PWAs are the absolute antithesis of that.
         | 
         | Encourage developers to make apps that all look/feel completely
         | different in terms of style/UX - no thanks.
        
           | LeanderK wrote:
           | it's not working at all. I currently use vscode, jupyter lab
           | and Slack. None of them are native. I can't see myself
           | switching.
           | 
           | I use some native apps and they are great and all, but to be
           | honest I don't care. I want a good user experience, not the
           | technical details of what's under the hood of the UI.
           | 
           | I also feel like the stance is pointless since electron apps
           | are here to stay and it just makes the experience of
           | everything worse.
        
             | breakfastduck wrote:
             | It's not about what's under the hood. It's about a
             | consistent UX & performance.
             | 
             | Honestly VS Code is one of the best electron apps available
             | and it still feels clunky compared to basically any native
             | macOS app.
        
       | kevincox wrote:
       | I do find it disappointing that there is no reliable local
       | storage system for the web. I get the resistance to trackers but
       | there should be a way to request permission to store data that
       | isn't deleted except for explicit action by users. It means that
       | you effectively have to store the data in "the Cloud" which means
       | 1. I have to pay for it 2. If I shut down the service you are
       | screwed and 3. I have at least some access to it (encryption
       | aside).
       | 
       | I would also like to see a synced version since most browsers
       | these days support syncing settings and passwords. But creating a
       | generic syncing solution that is actually useful is hard.
        
         | oblib wrote:
         | A few years ago I tried using the browser's IndexedDB for my
         | invoicing app and loaded 1000 documents into it at a time. It
         | did ok up to about 2-3k documents but choked the browser to the
         | point of being unusable at 5k documents. That was on a Late `09
         | Mac Mini so maybe newer or more powerful PCs would do better,
         | but that's not an issue at all if you're using CouchDB to store
         | the data on the client side.
         | 
         | I use CouchDB installed on the client side to implement Offline
         | First data storage. This works for Desktop PCs and if you run
         | it on a local device that's accessible via your local network,
         | like a Raspberry Pi in-house for example, you can also use it
         | with mobile devices in-house too.
         | 
         | The local CouchDB will sync with the Cloud based CouchDB as
         | soon as they're both online. CouchDB will decide which version
         | to keep and deliver it from that point.
         | 
         | It's certainly not perfect and doesn't provide "real time
         | collaboration" but so far that's been out of reach anyway and
         | may not be a very good approach at all. The notion of several
         | people editing the same document at the same time seems to me
         | to be chaotic no matter how you approach it.
         | 
         | The biggest downside to this approach is that the user has to
         | install and configure CouchDB. I made a simple web app to help
         | with this, but it's a bit too much to expect users to install
         | and configure it.
         | 
         | What we need is a client side DB pre-installed that any web app
         | can access and the data for that app is sandboxed and can only
         | access the DB assigned to it. But it's not reasonable to add
         | that to a web browser. CouchDB can do that now.
        
         | richardwhiuk wrote:
         | Can you allow the user to load and save files?
        
           | kevincox wrote:
           | I mean sure, you can get the user to download/upload files,
           | but this is very awkward and not suited for storing every
           | change. (But is good for migrating and backing up the data).
           | Having to get the user to manually break the data out of the
           | browser is not a good UX for day-to-day work.
           | 
           | I know that Chrome is pushing for a filesystem API but I
           | don't know if that will be exempt from the usual
           | ephemerality. IIRC it is just a private storage space with a
           | filesystem-like API.
        
             | easrng wrote:
             | Chrome has an API that allows you to save to files without
             | a new download every time. They made a pretty nice library
             | that wraps that API on Chrome and it gracefully degrades on
             | other browsers.
             | 
             | https://web.dev/browser-fs-access/
        
               | jcims wrote:
               | i noticed this behavior in a drawing app called
               | excalidraw. first save opens a file dialog, subsequent
               | saves just update the file, basically like a standard
               | local text editor.
               | 
               | i keep doing 'save as' to create new files because i
               | don't trust it lol
        
         | nathcd wrote:
         | `navigator.storage.persist()` will prompt the user to allow
         | persistent storage for your site, so data won't be evicted
         | under storage pressure (for IndexedDB, service worker
         | registrations, localStorage, sessionStorage, etc.)
         | 
         | https://developer.mozilla.org/en-US/docs/Web/API/StorageMana...
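
A hedged sketch of how that call is typically wrapped. Only persisted() and persist() are real StorageManager methods; PersistableStorage, PersistOutcome and requestPersistentStorage are hypothetical names, introduced so the function can also be exercised with a fake object outside a browser.

```typescript
// Structural type covering the two StorageManager methods used below.
interface PersistableStorage {
  persisted(): Promise<boolean>;
  persist(): Promise<boolean>;
}

type PersistOutcome =
  | "already-persistent"
  | "granted"
  | "denied"
  | "unsupported";

async function requestPersistentStorage(
  storage: PersistableStorage | undefined
): Promise<PersistOutcome> {
  // Some browsers do not expose the API at all.
  if (!storage || typeof storage.persist !== "function") return "unsupported";
  // Avoid asking again if the origin's storage is already persistent.
  if (await storage.persisted()) return "already-persistent";
  // Resolves to true only if the browser agreed to make storage persistent.
  return (await storage.persist()) ? "granted" : "denied";
}

// In a browser one would call:
//   requestPersistentStorage(navigator.storage).then(console.log);
```

A "denied" or "unsupported" result is a signal to fall back to treating local data as evictable.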
        
           | marcosdumay wrote:
           | The problem runs deeper than what the GP complains about.
           | 
           | If you rely on that, and write important things to the
           | browser storage, then a week later your user accesses some
           | moronic site that doesn't work, its support tells them to
           | clear their browser cache, and your data will almost
           | certainly be wiped along with it.
           | 
           | The problem is that browsers do not even consider that they
           | may be storing important data. There is no clear way to
           | recover or backup that data, and it is tangled with what
           | comes from every other site.
        
           | simiones wrote:
           | That doesn't work on iOS, nor on Safari on Mac for any
           | released version as far as I understand:
           | 
           | https://caniuse.com/mdn-api_storagemanager_persist
        
         | _fat_santa wrote:
         | > I do find it disappointing that there is no reliable local
         | storage system for the web. I get the resistance to trackers
         | but there should be a way to request permission to store data
         | that isn't deleted except for explicit action by users.
         | 
         | I've built a number of apps in the last year or two that use
         | browser local storage. It also annoys me that the idea around
         | "storing data in your browser" is automatically attributed to
         | ad tracking. I have to go out of my way in my apps to inform
         | users that yes this app uses localStorage/cookies, but no this
         | is not used for any ad tracking, rather for actually storing
         | your app's data.
        
           | kevincox wrote:
           | How do you deal with the fact that the browser may just
           | decide to wipe the storage at any time?
        
             | jitl wrote:
             | The most important thing to do is get your user's data
             | somewhere safe as quickly as possible. For the vast
             | majority of users that means your cloud database.
             | 
             | As the user generates new data, spray it to your servers
             | _as well_ as writing it to the syncable IndexedDB local
             | storage, and to an in-memory buffer. Make your backend
             | handle writes idempotently, and retry all failures a few
             | times. (Eg, IndexedDB disk might be full or flaking out, so
             | retry writing the memory buffer to disk.)
             | 
             | As long as the _write path_ is quick, users can tolerate
             | the browser nuking offline storage cache because they can
             | re-download all the data that made it up to your server.
             | 
             | Hopefully soon the browser vendors will allow more durable
             | file system access with appropriate user controls. Chromium
             | built out the file system access API (https://web.dev/file-
             | system-access/) but it's not supported in Firefox or
             | Safari.
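
The write path described above (idempotent backend, retry on failure) can be sketched roughly as follows. This is an illustration under assumptions, not Notion's code: WriteOp, IdempotentStore and sendWithRetry are hypothetical names, and the "server" is an in-memory stand-in.

```typescript
// One user edit. The client-generated id is what makes retries safe.
interface WriteOp {
  id: string;
  payload: string;
}

// Backend stand-in: applying the same op twice has no extra effect.
class IdempotentStore {
  private readonly seen = new Map<string, string>();

  apply(op: WriteOp): void {
    if (this.seen.has(op.id)) return; // duplicate delivery, ignore
    this.seen.set(op.id, op.payload);
  }

  size(): number {
    return this.seen.size;
  }
}

// Client side: retry a failing send a few times before giving up.
async function sendWithRetry(
  op: WriteOp,
  send: (op: WriteOp) => Promise<void>,
  maxAttempts = 3
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await send(op);
      return true;
    } catch {
      // Swallow the error and try again. A real client would back off
      // and keep the op buffered in memory or local storage meanwhile.
    }
  }
  return false;
}
```

Because the store deduplicates by id, a retry after a lost acknowledgement does not apply the edit twice.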
        
               | chrisfinazzo wrote:
               | Is this (generally) reliable?
               | 
               | I know terrible, awful bugs eventually doomed WebSQL from
               | getting any traction and IndexedDB seems to be a more
               | competent replacement, but the fact that Google is
               | leaning on FSA seems like a non-starter.
               | 
               | It just feels like there's no way in hell Webkit will
               | ever implement this stuff - not because of the divide
               | between the App Store and PWA's - but due to the
               | implications for privacy.
               | 
               | Hard pass.
        
               | jitl wrote:
               | I'm not sure exactly what your reliability question is
               | about. I haven't actually used the file system API I
               | posted, probably same as you I'm waiting for it to ship
               | outside of Chromium.
               | 
               | On the subject of WebKit, IndexedDB bugs are also pretty
               | bad, especially on iOS; we have debated turning off
               | IndexedDB write buffering in Safari and just doing in-memory
               | there. The best thing to do on Apple platforms is to make
               | the app they're trying to force you to make. Then you can
               | make a little adapter so your web app can write to disk
               | using SQLite and enjoy a nice relational API without
               | needing to worry about the whims of the browser.
        
             | aboodman wrote:
             | For a SaaS style (aka client-server) application, the right
             | way to think of client-side storage is as a persistent
             | cache, for a few reasons:
             | 
             | * it can be deleted at anytime (by browser, or even by
             | user!)
             | 
             | * you generally want the server to be authoritative. if
             | there's a bug client-side, server view of state should win.
             | 
             | * it's not possible in the general case to store _all_ user
             | data offline, it's always a subset.
             | 
             | Once you realize that the client-side state is a cache,
             | potential uses of it become a lot more clear.
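
A small sketch of the "client storage is a cache, server is authoritative" model from the list above. It assumes the server stamps every value with a monotonically increasing version; PersistentCache, Versioned and loadThroughCache are hypothetical names, and an in-memory Map stands in for IndexedDB.

```typescript
interface Versioned<T> {
  value: T;
  version: number; // assigned by the server; higher is newer
}

class PersistentCache<T> {
  private readonly entries = new Map<string, Versioned<T>>();

  read(key: string): Versioned<T> | undefined {
    return this.entries.get(key);
  }

  // Server data replaces the cached copy unless it is an older version
  // (for example a stale response that arrived late).
  acceptFromServer(key: string, incoming: Versioned<T>): void {
    const cached = this.entries.get(key);
    if (!cached || incoming.version >= cached.version) {
      this.entries.set(key, incoming);
    }
  }

  // Models the browser (or the user) wiping storage at any time.
  wipe(): void {
    this.entries.clear();
  }
}

// Serve from the cache when possible, otherwise refetch from the server.
async function loadThroughCache<T>(
  cache: PersistentCache<T>,
  key: string,
  fetchFromServer: (key: string) => Promise<Versioned<T>>
): Promise<T> {
  const hit = cache.read(key);
  if (hit) return hit.value;
  const fresh = await fetchFromServer(key);
  cache.acceptFromServer(key, fresh);
  return fresh.value;
}
```

Losing the cache costs one extra fetch and nothing else, which is the property that makes eviction tolerable in this model.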
        
               | kevincox wrote:
               | That's the thing. I want to make apps where the client-
               | state is more than a cache. I want it to be able to be
               | authoritative.
               | 
               | Sure, you probably want to put some sort of syncing on
               | top, but that isn't even always necessary.
        
               | aboodman wrote:
               | Alright, there is some naming collision then.
               | 
               | "offline-first" (terrible name, but here we are)
               | generally refers to a classic web application that wants
               | to be able to run offline either for network resiliency
               | reasons or for performance.
               | 
               | "local-first" is a term that has been coined for
               | something close to what you are talking about:
               | https://www.inkandswitch.com/local-first.html
        
         | cdbattags wrote:
         | Not much more to say other than Noms was my favorite project
         | (https://github.com/attic-labs/noms) for a while until
         | acquisition and the engineers are now the ones behind
         | Replicache (https://replicache.dev/).
         | 
         | I think this is going to be the next "Realm" that works
         | everywhere.
        
           | nacs wrote:
           | Looks like replicache is pretty expensive though.
           | 
           | If you have more than 500 users, the price is $500/mo (and it
           | goes up from there).
        
           | aboodman wrote:
           | aww, thanks.
        
         | jefftk wrote:
         | _> there should be a way to request permission to store data
         | that isn't deleted except for explicit action by users_
         | 
         | I think that's "request permission to use files on disk". This
         | is in progress as https://developer.mozilla.org/en-
         | US/docs/Web/API/File_System..., though there's still more work
         | before there's a version all the browsers like (Mozilla likes
         | the ability to work with files, but thinks cross-site access
         | should not be included: https://mozilla.github.io/standards-
         | positions/#native-file-s...)
        
           | danShumway wrote:
           | > but it is wrapped together with aspects for which we do not
           | think meaningful end user consent is possible to obtain (in
           | particular cross-site access to the end user's local file
           | system)
           | 
           | This is a really difficult problem to solve, and I get
           | Mozilla's hesitation. I'm also frankly very hesitant about
           | Google leading the charge on this, not because I'm paranoid
           | about them sneaking in tracking, just because I think Google
           | tends to create less thoughtful web specifications sometimes.
           | 
           | But... cross-site file access is really important for data
           | portability and open standards, and Google's current proposal
           | isn't bad, it might be rough but it's definitely workable.
           | Mozilla really should try to figure out a way to move forward
           | on this.
           | 
           | We've seen the difference in data portability between mobile
           | and desktop apps, and a big part of the difference between
           | those two platforms is being able to very easily have
           | multiple sources working on your data at the same time.
           | Siloing data has downsides. It's tough to embrace a Unix-
           | style philosophy without allowing programs to operate on the
           | same data. And having Unix-style smaller webapps that work
           | with each other is a good way of fighting against data silos
           | and in some cases a good way of fighting against anti-user
           | and anti-privacy services in general.
           | 
           | I'd love to see more progress made on this, but who knows how
           | that will work out. Caution is probably warranted for the
           | moment, I'm just disappointed that the language suggests
           | Mozilla would never consider a proposal that included this.
           | 
           | It's also very important that this expose _user-accessible_
           | file system access and not just a virtual filesystem in the
           | browser; otherwise it just becomes another data-silo in the
           | web browser. This is something that Google's proposal really
           | gets right, and it's disappointing to see what appears to be
           | pushback on the idea that users should be able to open up the
           | directories that a web browser is writing to, inspect the
           | files, and open them or move them around the filesystem, or
           | even write to them from native apps. That to me is an
           | essential part of the proposal.
        
         | hinkley wrote:
         | sqlite bindings in browsers came and went in the time between
         | when I learned about them and finally found a nail that needed
         | that hammer.
         | 
         | I started on a design and literally within a few weeks Firefox
         | announced it was deprecated.
        
         | EamonnMR wrote:
         | What's wrong with offering the user a file to "download"
         | (actually creating a file on the fly) or "upload" (actually
         | loads it into the local web app?)
         | 
         | Edit: Here's the stack overflow copypasta I used to achieve it:
         | https://github.com/EamonnMR/Flythrough.Space/blob/master/src...
        
       | lytefm wrote:
       | I've been working on offline-first apps (CouchDB/PouchDB +
       | Cordova/Capacitor and published via App/Play Store) in the last
       | years and can definitely relate. But some points to add:
       | 
       | - The 7-day IDB limitation does not apply for apps that are
       | published through the stores
       | 
       | - Conflicts can happen, but depending on your design they might
       | not matter in practice. "Implement a proper conflict resolution
       | strategy" has been on my "todo: maybe" list for over 3 years now
       | but was never important enough.
       | 
       | - Data migration is not needed as long as schema changes are
       | additive (new doc fields, new doc types). Design carefully early
       | on, keep track of "abandoned" properties and you'll rarely need
       | a difficult migration.
       | 
       | - Depending on the performance of your customers' phones and the
       | amount of data your app is processing, it (JS -> ... -> IDB and
       | back) might not be fast enough. I had to add caching layers for
       | some use cases. But at some point, you probably want a proper
       | state management library anyways which should include caching
       | nearly for free.
       | 
       | - You can (and should!) still consider most of your data
       | relational. There is even a relational-pouch plugin. But I'm
       | strongly missing foreign key constraints and better DB-level data
       | validation than CouchDB's design docs provide.
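
The "additive schema changes need no migration" point above can be illustrated with a read path that fills in defaults. The document shapes and the readTask function are hypothetical, and the default value is an assumption made for the example.

```typescript
// Shape written by the first release of the app.
interface TaskDocV1 {
  _id: string;
  title: string;
}

// Later releases only add optional fields; nothing is renamed or removed.
interface TaskDoc extends TaskDocV1 {
  priority?: number;    // added in a later release
  legacyColor?: string; // "abandoned": tolerated on read, no longer written
}

// Normalised shape the rest of the app works with.
interface Task {
  id: string;
  title: string;
  priority: number;
}

// Old documents simply lack the new fields, so fill in defaults on read.
function readTask(doc: TaskDoc): Task {
  return {
    id: doc._id,
    title: doc.title,
    priority: doc.priority ?? 0, // assumed default for pre-upgrade documents
  };
}
```

Documents written by old and new app versions can then sit side by side in the same database without being rewritten.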
        
       | yanis_t wrote:
       | PWAs are the future.
       | 
       | Along with other benefits comes the fact that you can "install"
       | apps on devices bypassing the app stores (which means not paying
       | commission fees), which I guess is the primary reason why
       | Apple is so reluctant to give it proper support in Safari.
        
       | twobitshifter wrote:
       | Good article. For structured data, I've had good luck using UUIDs
       | for keys rather than needing an approach that relies on an atomic
       | clock.
        
       | ngrilly wrote:
       | Many of the problems mentioned in the article are solved by doing
       | offline first with a native app instead of a PWA.
        
       | taneq wrote:
       | The most confusing thing to me is a discussion of "offline first"
       | applications which starts with, and maintains, the assumption
       | that your only option is a web app.
       | 
       | Back in my day we had a word for software that always worked
       | without an internet connection. We called it "software" and it
       | was installed on the user's computer.
        
         | Swenrekcah wrote:
         | A user's computer? Written with a possessive? What a bizarre
         | idea!
        
         | tehbeard wrote:
         | Har de har look at these modern programmers with their web 2.0s
         | and js frameworks of the week...
         | 
         | Offline first is not "software" as you classified it.
         | 
         | Your "software" you are on about is more accurately described
         | as "offline only", little to none of the functionality requires
         | network access.
         | 
         | Offline first refers to how online functionality is needed for
         | the design of that app, but with steps taken to ensure that
         | even without connectivity for periods of time, it still
         | functions.
        
         | jjnoakes wrote:
         | While offline first seems to be discussed in the context of web
         | apps a lot, to me it is more about the data and synchronization
         | than where the executable lives, and most of the ideas also
         | apply to certain kinds of desktop software.
        
         | jitl wrote:
         | It is approximately infinity times easier to distribute and
         | grow the user base of a web app compared to locally installed
         | software. The consumer software economy is moving online for
         | this reason - it's much better for business.
         | 
         | RxDB software comes from this context - it's a JavaScript
         | library built for this world that attempts to retain the
         | distribution and sharing advantages of the web, while adding
         | back the responsiveness and availability of traditional
         | installed software.
         | 
         | I think you'll find the original "local first" manifesto more
         | aligned with both the user & traditional installed software
         | with less of a web focused bias:
         | https://www.inkandswitch.com/local-first.html
         | 
         |  _> In this article we propose "local-first software": a set of
         | principles for software that enables both collaboration and
         | ownership for users. Local-first ideals include the ability to
         | work offline and collaborate across multiple devices, while
         | also improving the security, privacy, long-term preservation,
         | and user control of data._
        
         | blacktriangle wrote:
         | Also back in that day 95% of our users were on Windows and we
         | had a direct relationship with them.
         | 
         | Now your user base is roughly split 40/40 between iOS and
         | Android with a non-ignorable 20 running some Windows tablet
         | thing. Then for extra fun your access to those users is
         | mediated by the giant black boxes staffed by assholes that are
         | the iOS and Play stores.
         | 
         | And of course back then nobody had any expectation that your
         | software would easily sync between devices and users because
         | that just wasn't something that was doable easily.
         | 
         | The world changes. Yeah it was better for us devs back then,
         | but honestly the new world has some real advantages.
         | 
         | For me personally, I'm cautiously optimistic that PWAs are our
         | way out of the hell that is supporting 3 native platforms for
         | all but the rarest cases.
        
           | poetaster wrote:
           | Qt/qml works for me. Write once.
        
           | lytefm wrote:
           | For me, PWA doesn't quite cut it yet but using a single
           | codebase based on web technology + adding the appropriate
           | wrappers for native functionality (Electron, React Native,
           | Capacitor, whatever...) is fine for now.
        
         | dspillett wrote:
         | That isn't offline first though: desktop software like that is
         | generally offline _only_, and the user wraps their own chosen
         | sync method (which could simply be good ol' sneaker-net or
         | frisbee-net) around that if they want/need to.
         | 
         | Offline first is only used in the context of web applications
         | and sometimes their Android/iOS cousins (which probably share
         | the same backend, where both are available), once the decision
         | had been made that a not-locally-installed and/or remote synced
         | application is desirable where possible, so isn't being
         | suggested (directly) as an alternative to locally installed
         | offline programs.
        
         | JamesSwift wrote:
         | The difference is that back in the day it _only_ worked on your
         | computer. There was no cloud component and no collaborative
         | use-case. If you only have a single, local client then a lot of
         | this doesn't apply. But it's rare for that to be the case in
         | modern software.
        
         | jasode wrote:
         | _> The most confusing thing to me is a discussion of "offline
         | first" applications which starts with, and maintains, the
         | assumption that your only option is a web app._
         | 
         | I didn't downvote your comment but in this author's article,
         | it's deliberate for the _starting context for discussion_ to be
         | a networked collaborative app.
         | 
         | Yes, internet connected apps are a _subset_ of all possible
         | software but that's not the point.
         | 
         | As an analogy, imagine if someone else submitted an article
         | about C Language memory techniques on an embedded chip. E.g.:
         | https://www.embedded.com/memory-allocation-in-c/
         | 
         | And then a commenter misunderstands that article complaining,
         | _"it's confusing to me because this article maintains that the
         | only option is C in an embedded app but over here, I'm using
         | Python with Cloudflare Workers"_
         | 
         | In other words, it doesn't seem like you're interested in
         | collaborative apps that require distributed data consistency so
         | this article looks "wrong" to you.
        
           | phkahler wrote:
           | It's confusing because "offline-first" doesn't even seem to
           | make sense in the context of web-apps, which I thought meant
           | "fancy (functional, able to do stuff) web sites" or similar.
        
           | josephg wrote:
           | I've been working in this problem space for awhile now and I
           | sort of agree with the GP poster. I really like native
           | software; and I want native software which can work simply in
           | a distributed, collaborative context. As an example, I have a
           | note taking app on my laptop. I want to be able to read and
           | edit all my notes on all my other devices. And I want that to
           | work in a way that doesn't depend on some random startup
           | keeping their servers on the other side of the planet
           | running. Right now every software company which wants to
           | build something like this needs to invent their own data
           | stack, network protocols and storage systems. And the prize
           | at the end is software that can only talk to itself via
           | closed protocols. It's infeasible, and inefficient.
           | 
           | We have an opportunity right now to do an awful lot better.
           | When we do, I want to service both native and web apps. If we
           | do it right, from the network level the distinction should
           | just boil away anyway.
        
             | zorr wrote:
             | I'm in this boat with the product I'm currently hacking on
             | but I just can't seem to commit to an architecture I'm
             | happy with. Mostly due to what you mentioned here.
             | 
             | Essentially I want to build an opensource/hackable
             | notes/tasks/calendar system with a central datastore and
             | message broker for coordination between various
             | systems/scripts/components/clients.
             | 
             | The thing I'm struggling with is to define which features
             | are supported in online and offline mode. Every feature
             | that gets added to offline mode adds tons of duplication
             | and complexity. I'm almost at the point of just saying I
             | don't need offline-mode except for viewing already-cached
             | data and maybe very basic creates/updates, with all the
             | processing happening on the backend once the client comes
             | back online.
             | 
             | edit for more context: The old-style native apps (for
             | example OmniFocus) usually have all the logic only in the
             | client and use "dumb" cloud storage for synchronizing
             | between clients. The difference with what I'm trying to
             | build is that I want that central hub to be "smart" and
             | always online so it becomes easy to hack/interact with the
             | system from cronjobs/scripts/external services.
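
The reduced offline mode described above (cached reads plus queued basic creates/updates that the backend processes on reconnect) could be sketched roughly as follows. This is only an illustration: `OfflineClient`, its methods, and the plain dict standing in for the backend are all hypothetical names, not part of any product mentioned in the thread.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OfflineClient:
    """Hypothetical client: cached reads, queued writes while offline."""
    cache: dict = field(default_factory=dict)
    pending: deque = field(default_factory=deque)
    online: bool = False

    def read(self, key):
        # Offline mode only ever serves already-cached data.
        return self.cache.get(key)

    def write(self, key, value, backend):
        # Apply locally first so the UI updates immediately.
        self.cache[key] = value
        if self.online:
            backend[key] = value
        else:
            # Keep operations in order; they are replayed on reconnect.
            self.pending.append((key, value))

    def reconnect(self, backend):
        # Replay the queue in order; real processing stays server-side.
        self.online = True
        while self.pending:
            key, value = self.pending.popleft()
            backend[key] = value


backend = {}                       # stand-in for the "smart" central hub
client = OfflineClient()
client.write("note-1", "buy milk", backend)
client.write("note-1", "buy oat milk", backend)
queued = len(client.pending)       # both writes are waiting for a connection
client.reconnect(backend)
```

Keeping the client this thin avoids duplicating backend logic, at the cost of offline users not seeing any derived results until they reconnect.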
        
               | josephg wrote:
               | The architecture I have in mind is to use a CRDT of some
               | sort as the data store. (Or OT if you have a centralised
               | server and want to keep complexity down). Then make the
               | client smart, like old school apps like OmniFocus. Do
               | concurrent editing via the data layer - so the
               | application only really deals with the local data and
               | hears about updates via the underlying data itself
               | changing.
               | 
               | The data model can be reused across many applications,
               | since there's nothing application specific about it. So
               | we can make standard, interoperable debugging tools,
               | backup tools, viewers, etc.
               | 
               | If you want to interact with the same data from scripts,
               | cron jobs and external services, just have another peer
               | on the network with access to the same set of API methods
               | the application can access. You can already read and
               | write data via that API, and any applications with the
               | data open should see any changes instantly.
               | 
               | Basically, what I'm imagining is pretty similar to a self
               | hosted firebase. Except, ideally, I want a CRDT under the
               | hood so we don't _need_ to send all edits via someone
               | else's computer.
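
One way to picture "the application hears about updates via the underlying data itself changing" is a store that notifies subscribers on every write, with scripts acting as just another writer. The sketch below is a deliberately simplified assumption: `ObservableStore` is a hypothetical name, and it leaves out the CRDT merge logic and networking that the comment has in mind.

```python
class ObservableStore:
    """Hypothetical local data layer that notifies listeners on change."""

    def __init__(self):
        self.data = {}
        self.listeners = []

    def subscribe(self, callback):
        # The application registers once and then only reacts to changes.
        self.listeners.append(callback)

    def apply(self, key, value):
        # Any peer (UI, script, cron job) writes through the same method.
        self.data[key] = value
        for callback in self.listeners:
            callback(key, value)


store = ObservableStore()
seen = []
store.subscribe(lambda key, value: seen.append((key, value)))  # the app's view
store.apply("task-1", "water plants")   # e.g. a cron job acting as a peer
```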
        
           | nonameiguess wrote:
           | Networked collaborative app doesn't need to mean runs in a
           | browser. A git repo fits that description, even a true
           | distributed VCS with no server where every editor has their
           | own copy and no single copy is authoritative. Each user
           | chooses which changes to merge into their personal copies.
           | The native versions of Microsoft Office when backed by
           | Sharepoint also operate that way, allowing users to check
           | out individual copies and edit them in a native editor,
           | although in that case Microsoft is clearly trying to push
           | people into editing directly in the browser.
           | 
           | A lot of these problems go away if you don't run in a
           | browser, because users inherently trust software more when
           | they're running a copy that can't change from underneath them
           | on a second-by-second basis. I'm a lot more willing to give
           | filesystem access to an application I have to explicitly
           | install and that remains what I installed until I knowingly
           | and intentionally upgrade it, as opposed to code pulled
           | continuously from the network as I am working.
        
         | psychometry wrote:
         | Ok, and if you want your non-web app to be compatible with
         | Windows, MacOS, iOS, and Android and you don't want to write
         | more than one app, your options are...?
        
           | poetaster wrote:
           | Qt/qml works for me. With Python sometimes.
        
           | justinclift wrote:
           | Generally Qt.
        
         | blondin wrote:
         | i agree with the essence of your comment. the "back in my day"
         | is what is, perhaps, causing resistance.
         | 
         | my impression these days is that web engineers have outnumbered
         | other software engineers. that's a problem because context
         | matters as you are alluding to.
         | 
         | we shouldn't use terms like "offline first" without an
         | appropriate context. or assume context.
        
         | hengheng wrote:
         | I will still call any web app a "web site", and if that is my
         | age showing, so be it.
        
           | atatatat wrote:
           | That's fine, if it's on your tablet or PC.
           | 
           | If you're looking at a web "site" on a vertical phone screen,
           | one side or the other did something wrong.
           | 
           | My point is: sites and services should have two separate
           | experiences, depending on screensize. These are: desktop
           | site, mobile webapp.
        
           | hoarad wrote:
           | i do so also. and it is because they have a link
        
           | taneq wrote:
           | We can shout at the cloud together.
        
       | ItsMonkk wrote:
       | The way I've learned to use git is to
       | 
       | 0. Sync with remote
       | 
       | 1. Edit files until I'm ready to check-in
       | 
       | 2. Stash changes
       | 
       | 3. Sync up with latest from remote
       | 
       | 4. Pop changes
       | 
       | 5. If there are any conflicts, deal with them here, locally.
       | Possibly delete all changes and redo. Redoing is equivalent to
       | someone checking in something at step 0. If this takes some time,
       | move back to Step 2.
       | 
       | 6. Push changes
       | 
       | If done this way, the only benefit of CRDT's is during step 5.
       | One of the lessons I've learned and truly believe is that if
       | something sucks, and it gets worse as the problem gets bigger,
       | you need to do it more often. Git merges are a great example of
       | this concept.
       | 
       | And this is where CRDT's are in trouble. The best way to make
       | step 5 easier is by making step 1 smaller. CRDT's viewpoint is
       | that we are offline, and therefore should allow any number of
       | edits, and when we reconnect the system should be able to work it
       | out. It flies in the face of smaller commits. The more changes
       | you need to merge, the harder it is. We live in a world that's
       | connected 99% of the time, and CRDT's simply aren't needed.
       | 
       | On the other hand, the "Offline First" model is great! A user
       | having access to all of their data is wonderful. As the blog
       | notes, it's not preferable for a user to have the entire
       | Wikipedia or Google indexes on their device. So you need to have
       | a use-case for Offline, and we need to do better about this type
       | of stuff, but we don't need to wait on CRDT research for this.
       | Smart caching and materialized views are where I think the real
       | progress is going to come from, making things like Offline
       | Wikipedia possible.
        
       | jcun4128 wrote:
       | I made a PWA that used IndexedDB and base64 photos. I ran into a
       | funky max-length issue that was an error from Chromium kind of
       | interesting. But yeah the main problem I had was the base64
       | images would get too big (if you had too many to load) and then
       | you would see a slower render vs. an image pulled by url.
       | 
       | Still pretty cool since I'm not a native developer and RN is
       | something I've dabbled in but don't use daily.
        
         | jitl wrote:
         | These days IndexedDB supports storing binary blobs as File
         | objects or even ArrayBuffers; the relevant bug on the Chrome
         | issue tracker was marked fixed in 2014:
         | https://bugs.chromium.org/p/chromium/issues/detail?id=108012
        
           | jcun4128 wrote:
           | Interesting, I think I was straight up using just text or
           | whatever is the default. Though I was using the Dexie wrapper
           | which is nice.
           | 
           | To be clear, I don't know if the bug was from IndexedDB; it
           | was something about length exceeded.
           | 
           | I did try to use small images too, e.g. some 150px by 150px,
           | but going to base64 usually multiplies the size by 1.3.
           | 
           | For anyone interested [1] not a phenomenal app but one I
           | poured idk 2-300+hrs into (was working on an RN version too)
           | and went nowhere sucks. I was contributing to one of those
           | codefor# deals.
           | 
           | [1] https://github.com/codeforkansascity/tagging-tracker-alt-
           | app...
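
The "multiplies the size by 1.3" figure follows from how base64 works: every 3 input bytes become 4 output characters, so the overhead is 4/3, or roughly 33%. A quick check, using random bytes as a stand-in for image data (the 30,000-byte size is an arbitrary assumption chosen to be divisible by 3):

```python
import base64
import os

raw = os.urandom(30_000)             # stand-in for a small image
encoded = base64.b64encode(raw)
ratio = len(encoded) / len(raw)      # 4 output characters per 3 input bytes
```

Storing the raw bytes as a binary blob, as suggested earlier in this subthread, avoids that overhead entirely.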
        
         | ngokevin wrote:
         | Perhaps Service Workers is better than IDB for that?
        
       | begueradj wrote:
       | Just before yesterday, I read an article here praising the
       | "Offline First" principle. And as for everything related to
       | software, I am reading again what was good is bad, and what was
       | bad is good. All you have to do for a successful clickbait is to
       | wait until someone shares his opinion so that you can finally
       | write something against it... (most of the time, just for the
       | purpose of saying you're against)
        
         | esrch wrote:
         | The other article you mention comes from the same website:
         | https://rxdb.info/offline-first.html. It's in an opinion
         | section, giving the arguments from both sides.
        
           | typingmonkey wrote:
           | Yes. I have written both of them, mostly to make sure I
           | trigger 100% of people, no matter if they like offline first
           | or not.
           | 
           | I mean, read all these comments on both articles. People say
           | offline first does not work, then they say that every
           | software is offline first by default and it is nothing new.
           | 
           | It was totally worth it spending two 3 days on it :)
        
             | somenewaccount1 wrote:
             | And you know what, I AM triggered!
             | 
             | I feel like you missed the most important point of "offline
             | first" is that your data belongs to you and does not need
             | to be shared to the cloud in order to have tremendous
             | value. The code/logic to enhance your data should be
             | shipped to you, rather than the other way around.
        
       | JamesSwift wrote:
       | It's a good article but it should be noted that this very much is
       | a web-centric view of Offline First and its challenges. When I
       | say web-centric, I mean as opposed to offline-first on a mobile
       | app.
       | 
       | Native apps don't deal with the same issues around storage, and
       | are actually _much_ more performant overall. If you haven't done
       | an offline-first app, I highly recommend it. The experience is
       | magical. You can fly around your app at the speed of the user's
       | touch. Content is magically loaded as soon as it's clicked on.
       | It's amazing as a user.
       | 
       | As for similarities, conflict resolution is a universal problem
       | and is either the most difficult or second most difficult problem
       | [1] to solve. What makes this more difficult is that there is no
       | one-size-fits-all for this. You need to have a deep, nuanced
       | understanding of your system and what makes sense for your use
       | case in terms of resolution strategies. Then you need to
       | implement them, which is not easy especially if your backend is
       | bog-standard REST on a classic SQL datastore.
       | 
       | I've enjoyed reading the past 2 rxdb articles on this (the one
       | mentioned here as well as the one from a couple days ago [2]).
       | It's great to have more content on this publicly available, when I
       | was getting into offline-first I only had a couple options.
       | 
       | [1] - At the end of the day, offline-first is a whole bunch of
       | caching, and so you end up needing to deep dive into cache
       | invalidation strategies which we all know is a hairy problem.
       | 
       | [2] - https://news.ycombinator.com/item?id=28690427
        
         | lytefm wrote:
         | > At the end of the day, offline-first is a whole bunch of
         | caching, and so you end up needing to deep dive into cache
         | invalidation strategies which we all know is a hairy problem.
         | 
         | Looks like you're not truly getting the concept of Offline-
         | first apps then. You don't have or need a cache. You have a
         | local database on the device that syncs up with the server, if
         | online.
         | 
         | Conflicts can occur, but modeling data well with a
         | CouchDB/PouchDB setup (i.e. prevent conflicts by not modifying
         | docs all the time, rather create new ones) + having a simple
         | time-stamp based heuristic can already be sufficient.
         | Otherwise, using CRDT is an option.
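
The "create new docs instead of modifying existing ones" idea can be illustrated without any database at all. The sketch below uses plain dicts as stand-ins for two replicas; `new_entry`, `replicate` and `timeline` are hypothetical helpers and not the CouchDB/PouchDB API.

```python
import uuid


def new_entry(store, text, ts):
    """Insert a fresh document instead of editing a shared one."""
    doc_id = str(uuid.uuid4())       # unique id, so replicas never collide
    store[doc_id] = {"text": text, "ts": ts}
    return doc_id


def replicate(source, target):
    # Documents are never modified after creation, so copying the
    # missing ones over cannot produce a conflict.
    for doc_id, doc in source.items():
        target.setdefault(doc_id, doc)


def timeline(store):
    # Timestamp heuristic: order entries by creation time for display.
    return [d["text"] for d in sorted(store.values(), key=lambda d: d["ts"])]


phone, laptop = {}, {}
new_entry(phone, "wrote on phone", ts=1)      # both devices are offline
new_entry(laptop, "wrote on laptop", ts=2)
replicate(phone, laptop)                      # sync in both directions
replicate(laptop, phone)
```

After syncing, both replicas hold the same two documents and render the same timeline. The trade-off is that "editing" has to be modeled as adding a newer document.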
        
           | JamesSwift wrote:
           | > You don't have or need a cache. You have a local database
           | on the device that syncs up with the server, if online.
           | 
           | The source of truth is the server. Everything else (i.e. the
           | local copy) is a snapshot of that, aka a cache. It's just that
           | offline-first is _always_ a cache-first read, so you seem to
           | think that this makes it not a cache any more, but a regular
           | data store.
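
The "cache-first read" view described above can be made concrete with a small sketch. Here the server is assumed to be the source of truth (a plain dict), and `CacheFirstStore` is a hypothetical name used only for illustration.

```python
class CacheFirstStore:
    """Hypothetical store: reads hit the local snapshot, never the network."""

    def __init__(self, server):
        self.server = server     # assumed source of truth (a dict here)
        self.local = {}          # local snapshot, i.e. the cache

    def read(self, key):
        # Always answer from the local copy, even if it is stale.
        return self.local.get(key)

    def sync(self):
        # Refresh the snapshot whenever a connection is available.
        self.local = dict(self.server)


server = {"title": "v1"}
store = CacheFirstStore(server)
store.sync()
server["title"] = "v2"           # the server moves on while we are offline
stale = store.read("title")      # still the old snapshot
store.sync()
fresh = store.read("title")      # updated after the next sync
```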
        
             | lytefm wrote:
             | > so you seem to think that this makes it not a cache any
             | more, but a regular data store.
             | 
             | Yes. In a true offline-first approach with
             | CouchDB/PouchDB, the client-side database can be considered
             | to be the main data store and the server-side DB could just
             | be a backup. Or it might not be needed at all, or might only
             | be used to migrate from one device to another.
             | 
             | I'd say whether it's a master-slave or Multi-master model
             | depends on the conflict resolution strategy.
        
             | BackBlast wrote:
             | CRDT is really a multi-master model with eventual
             | consistency. It's not merely a cache but also operates as a
             | master and source of truth. With a well-built CRDT you can
             | also skip the server and sync client-to-client where both
             | have master copies.
             | 
             | The tricky part is conflict resolution.
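
As a minimal illustration of a multi-master CRDT with eventual consistency, here is a grow-only counter (G-Counter), one of the simplest CRDTs. It is chosen as an example only; the comment above does not name a specific CRDT, and the replica names are made up.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged with max()."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of sync order or repetition.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


phone, laptop = GCounter("phone"), GCounter("laptop")
phone.increment(2)       # edits made while offline
laptop.increment(3)
phone.merge(laptop)      # client-to-client sync, no server involved
laptop.merge(phone)
laptop.merge(phone)      # merging again changes nothing
```

Both replicas end up with the same state, and neither needs to be designated the master. Richer data types need correspondingly more elaborate merge rules, which is where the difficulty lies.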
        
               | JamesSwift wrote:
               | Right, the peer-to-peer or anything with a concept of
               | multi-master / majority is a different class of problem.
               | I am speaking only to traditional client-server
               | scenarios.
        
             | LAC-Tech wrote:
             | That just sounds like last write wins - the first client to
             | sync with the server is now the source of truth and other
             | clients that were working on the same thing will get
             | clobbered.
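
The clobbering effect of last-write-wins is easy to reproduce. In this sketch the record layout and the `lww_merge` helper are hypothetical, and timestamps are plain integers for simplicity.

```python
def lww_merge(a, b):
    """Keep whichever version carries the later timestamp; drop the other."""
    return a if a["ts"] >= b["ts"] else b


base = {"text": "Meet at 10", "ts": 100}
# Two clients edit the same record while offline.
alice = {"text": "Meet at 10 in room B", "ts": 105}
bob = {"text": "Meet at 11", "ts": 107}

server = lww_merge(base, alice)    # Alice syncs first
server = lww_merge(server, bob)    # Bob syncs later and overwrites her edit
lost = "room B" not in server["text"]
```

Alice's change disappears without any error or notification, which is the user-facing problem with relying on last-write-wins for records that several clients edit.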
        
         | Trufa wrote:
         | Yeah, I'm trying to implement an Electron offline-first app that
         | syncs; there seems to be no ready-made solution.
         | 
         | Stuff like https://github.com/aerogear/offix seem to be in the
         | right direction of what I'm looking for but not nearly mature
         | enough.
         | 
         | I don't want to put too much effort into the app so I would like
         | something more or less ready made, preferably with graphql
         | apis.
         | 
         | Any suggestions welcome.
        
           | firebase-user wrote:
           | You should try using Firebase. It handles the data-sync for
           | you, and it applies all updates locally first so that your
           | app feels snappy.
        
             | JamesSwift wrote:
             | I mentioned Realm because that's what I'm familiar with but
             | yeah Firebase is also a good choice if you want a client-
             | server setup.
        
             | code-is-code wrote:
             | Firebase works until you want a different conflict
             | resolution than last-write-wins. It is also only offline
             | first if the user is authenticated. Otherwise you need a
             | connection to the servers before using the local state.
        
             | jamil7 wrote:
             | Firebase is offline-resistant more so than offline-first
             | I'd argue.
        
             | LAC-Tech wrote:
             | Last write wins is not handling data sync, it's washing
             | your hands of it.
             | 
             | Your users will be frustrated.
        
             | davidzweig wrote:
             | The (JS) frontend SDK can be set to either online or
             | offline mode. When in online mode, it seems to insist on
             | downloading full copies of records, even when a version is
             | somewhere in the local cache, slowing things down and
             | incurring cost. Some optimisation could have been applied
             | here, I didn't find anything about it in the docs. Does
             | this match others' experience?
        
             | danuker wrote:
             | Ah yes. Convenience is how Big Tech gets all data to pass
             | through it.
        
           | JamesSwift wrote:
           | I would hesitate to use any client-server setup that isn't
           | also your primary data store. So, if you have an "offline
           | aware, multi-tenant" store like CouchDB but you still need to
           | sync to the primary store which is SQL, then you lose a lot
           | of context and awareness around conflict resolution. If you
           | are going to eventually sync to the other store, I would say
           | only use Couch on the frontend, and do the syncing/resolution
           | from the perspective of the client, since the client knows
           | what it was trying to do and how best to notify on conflict.
           | 
           | The "better" option is to have an offline-aware client-server
           | component (e.g. Realm) which is the primary store as well.
           | This eliminates the sync and so all conflict resolution stays
           | in the same system with well defined semantics.
        
             | Trufa wrote:
             | Thanks for the recommendation. I'm a little bit skeptical
             | of MongoDB, but Realm looks nice. With Firebase I'm a little
             | reluctant to be so dependent on Google, but otherwise it
             | looks good.
        
               | JamesSwift wrote:
               | I share your concerns and avoid it for the same reasons :
               | )
               | 
               | Just giving options and google fodder.
        
               | BackBlast wrote:
               | PouchDB works well as the primary client store and can
               | sync to CouchDB in the cloud. IMHO this is one of the
               | more mature combinations that gives you a ready client
               | side document database that takes care of data
               | replication for you.
        
               | WorldMaker wrote:
               | It's just unfortunate that running CouchDB in the cloud
               | seems increasingly perilous. Since IBM acquired Cloudant,
               | it's much tougher to run CouchDB "at scale PaaS" on any
               | data center other than IBM's (for obvious reasons),
               | whereas early Cloudant had robust Azure and AWS support
               | for years.
               | 
               | I wish Couchbase were more helpful practically than they
               | try to present themselves theoretically. Even if their
               | products weren't so expensive the impedance mismatch
               | between their version of CouchDB's sync APIs and their
               | own APIs seems to increase by the year, and is pretty
               | noticeable in how different it works from a PouchDB
               | standpoint and how easy it is to break sync. (Impedance
               | mismatches in allowable database names and _id keys are
               | huge on their own that have massive repercussions in
               | application design.)
               | 
               | Even CouchDB is not CouchDB anymore with impedance
               | mismatches of its own between versions. On the one hand
               | it's good that Cloudant upstreamed a lot of their cluster
               | management tools directly into Apache CouchDB 2+ (even as
               | they made their PaaS offering IBM Cloud only [or whatever
               | its name of the month is]), but huge architectural
               | changes below the covers in CouchDB 3+ start to present
               | their own sync issues akin to but distinct from
               | Couchbase's (and even some of Cloudant's as they seem [?]
               | to be diverging again into their own 2+ fork after all
               | that work upstreaming stuff?).
               | 
               | More than ever, Azure CosmosDB's focus on bare minimum
               | MongoDB capability and not supporting anything like
               | CouchDB sync, despite having close to the same raw
               | ingredients (Cosmos' change feed looks a lot more like
               | Couch's, but is just missing a couple subtle things to
               | make it directly and immediately useful for Couch
               | replication) seems like a "CouchDB is dead and not worth
               | supporting" signal from Microsoft.
               | 
               | Unfortunately, I think PouchDB <=> CouchDB replication
               | has passed the "mature" point to the "decrepit" and
               | "falling apart" stage, maybe to the point of
               | "evolutionary dead end" if I'm feeling strongly
               | pessimistic enough, and I've been trying for years to
               | figure out what to replace it with.
        
               | BackBlast wrote:
               | That's a timely response and I too have noticed that
               | CouchDB may not have the best/compatible future. I have
               | an existing product that depends on PouchDB in the client
               | and the replication.
               | 
               | I'm not really into the managed db service offerings, I
               | like to have control over it that they rarely offer.
               | 
               | My tentative solution for the application is to go with
               | PouchDB on node.js on the backend too.
        
       | psychometry wrote:
       | It's infuriating to me that WebSQL was killed. 99% of real-world
       | data is relational and yet the powers that be decided that we
       | should all be forced to use IndexedDB and hacky layers like
       | PouchDB built atop it.
       | 
       | I'm excited about https://github.com/jlongster/absurd-sql though.
        
         | goohle wrote:
         | IMHO, these days browser should just ship popular libraries,
         | e.g. sqlite, with browser, or download them once and cache them
         | permanently, until a newer version is released. Think like
         | <<Linux distribution>>, but for web/wasm.
        
         | dragonwriter wrote:
         | WebSQL wasn't killed by opposition to having a relational API,
         | it was killed because the spec was tied to, and only
         | implemented by embedding, a specific, identified version of
         | SQLite.
        
           | EvanAnderson wrote:
           | I think that a developer distaste for relational databases
           | was a major driver. Digging back into correspondence on this
           | a few months ago (when this came up on HN) I found clear
           | statements that Mozilla opposed anything relational. The
           | SQLite version is a convenient excuse for some developers
           | who, at the time, were enamoured with "NoSQL".
           | 
           | Discussion here:
           | https://news.ycombinator.com/item?id=28156831
        
             | WorldMaker wrote:
             | Mozilla for a long time backed their IndexedDB with SQLite,
             | they wouldn't have done that if they were that antagonistic
             | to relational databases.
             | 
             | I trust Mozilla's surface reasons here: they inherited the
             | mess that was NPAPI from Netscape, then decades of
             | experience with XUL binary components, were among the many
             | dealing with Flash bugs and zero-day fallout well after
             | Flash's "heyday", and have combined multiple decades of
             | experience in what happens if the web depends on _specific_
             | binaries to do its job. From the standpoint that they were
             | already knee deep in trying to sandbox/rein in NPAPI,
             | remove XUL, and remove Flash I absolutely understand why
             | "you want the web to depend on the bugs and zero days of
             | SQLite directly with no abstraction layer between?" was a
             | complete non-starter.
        
           | ec109685 wrote:
           | The need to productionize another embedded database in order
           | to support embedded SQL in the browser seems like a tough
           | hill to climb given how widespread SQLite is. This stance is
           | always going to keep web apps behind mobile apps in terms of
           | features and performance.
        
       | thayne wrote:
       | > in the end the user itself could be the one that deletes the
       | browsers local data
       | 
       | This is especially true because customer support for many
       | websites often has customers begin troubleshooting by clearing the
       | browser's "cache and cookies". In other words deleting all of the
       | local data for all sites. There are ways to delete the local data
       | for just one site, but they are pretty hidden, and involve
       | multiple steps. I wish browsers had a simple "delete all data for
       | just this site" button.
        
       | lambda_dn wrote:
       | It's not worth it; having an internet connection is more
       | ubiquitous every day with Wi-Fi, 4G/5G and, coming soon, low-orbit
       | satellite grids.
       | 
       | Trying to engineer your app to work offline causes complexity in
       | the design and implementation for an issue that might never be an
       | issue for most customers.
       | 
       | Assuming your app is pretty crippled when it can't access the
       | cloud.
       | 
       | What's next, making your app still work if there is no display by
       | screen reading?
        
         | lytefm wrote:
         | This statement very much depends on the kind of app you're
         | developing, as mentioned in the article. Is the main use case
         | communicating with another user? Do you need a third party API
         | for your app's basic functionality? Sure, don't even consider
         | offline first.
         | 
         | But what if the app is only about storing and displaying data
         | entered by the user and you'd definitely also want to be able
         | to use it on an airplane, e.g. a todo/notetaking/journaling
         | app? Then offline first can make sense.
         | 
         | I'd even say that the complexity of developing an offline-first
         | app can be lower than that of a classical client/server app +
         | caching logic. Sure, you need to figure out how to do schema
         | migrations and you probably want a kill switch to lock out
         | older apps at some point. But the same applies in a classical
         | setting when API-endpoints should be changed or removed.
         | Basically, your document model is now your API. And a very
         | simplistic conflict resolution strategy like "just take the
         | most recent version" is often good enough.
         | 
         | Once that + basics like auth and account creation are set up,
         | it's very productive and low overhead to work with
         | PouchDB/CouchDB and offline first:
         | 
         | - no need to coordinate with a backend team because nothing
         | else than auth happens there
         | 
         | - simpler state management and error handling because the state
         | in the local DB can always be assumed to be correct and the DB
         | is always there
         | 
         | - no need for schema migrations as long as you only add new
         | docs or extend existing ones
         | 
         | - it's great for quickly hacking a prototype or for beginners
         | who just know some HTML/CSS/JS
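         
         A minimal sketch of the "just take the most recent version"
         strategy mentioned above. The `Doc` shape, the `updatedAt` field
         and the `resolveLww` helper are hypothetical names chosen for
         illustration; they are not part of the PouchDB or CouchDB API.
         The sketch also assumes client clocks are roughly in sync, which
         is not guaranteed in practice.

```typescript
// Hypothetical document shape; `updatedAt` is an assumed field that
// each client sets when it writes (milliseconds since the Unix epoch).
interface Doc {
  id: string;
  updatedAt: number;
  body: string;
}

// Last-write-wins: keep the version with the later timestamp.
// On equal timestamps, fall back to comparing the body so that every
// replica deterministically picks the same winner.
function resolveLww(a: Doc, b: Doc): Doc {
  if (a.updatedAt !== b.updatedAt) {
    return a.updatedAt > b.updatedAt ? a : b;
  }
  return a.body >= b.body ? a : b;
}

const local: Doc = { id: "note-1", updatedAt: 1633090000000, body: "edited offline" };
const remote: Doc = { id: "note-1", updatedAt: 1633093600000, body: "edited on server" };

const winner = resolveLww(local, remote);
console.log(winner.body); // prints "edited on server"
```

         The losing version is discarded entirely, which may be fine for
         a note body but would lose increments on something like a
         counter.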
        
         | jamil7 wrote:
         | Ideally yes, you'd make some effort to make your app accessible
         | by a screen reader.
        
       | austincheney wrote:
       | > When you create a web based offline first app, you cannot store
       | data directly on the users filesystem. In fact there are many
       | layers between your JavaScript code and the filesystem of the
       | operation system.
       | 
       | Solved: File system in the browser plus network distribution -
       | https://github.com/prettydiff/share-file-systems
        
       ___________________________________________________________________
       (page generated 2021-10-01 23:00 UTC)