[HN Gopher] Listen to your PostgreSQL database in realtime via...
       ___________________________________________________________________
        
       Listen to your PostgreSQL database in realtime via WebSockets
        
       Author : gmac
       Score  : 205 points
       Date   : 2021-04-28 12:14 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | kitd wrote:
       | Nice!
       | 
       | Can you go in the other direction, ie push via WS into a table?
       | Obviously you'd need the correct auth.
       | 
       | I had an idea of doing something similar with MQTT. Data at rest
       | and data in motion in 1 (or maybe 1.5) components.
        
         | edoceo wrote:
          | Yeah, that's pretty easy: a small Go/Rust/whatever program to
          | accept WS connections and write (well-formed, authenticated)
          | message data via INSERT.
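
A sketch of the validate-then-write step such a bridge would perform (the message shape, table whitelist, and function names here are hypothetical, and the WS transport itself is omitted):

```python
import json

# Hypothetical whitelist of tables/columns a client may write to.
ALLOWED = {"events": {"user_id", "kind", "payload"}}

def build_insert(raw: str):
    """Validate a well-formed WS message and turn it into a
    parameterized (statement, params) pair for INSERT."""
    msg = json.loads(raw)
    table, data = msg["table"], msg["data"]
    if table not in ALLOWED or not set(data) <= ALLOWED[table]:
        raise ValueError("table/columns not permitted")
    cols = sorted(data)
    placeholders = ", ".join(f"%({c})s" for c in cols)
    stmt = f'INSERT INTO {table} ({", ".join(cols)}) VALUES ({placeholders})'
    return stmt, data

stmt, params = build_insert(
    '{"table": "events", "data": {"user_id": 7, "kind": "click"}}'
)
```

A real bridge would hand `stmt` and `params` to a driver such as psycopg2 after checking the caller's auth token; the whitelist stands in for whatever permission model the server enforces.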
        
         | jordan801 wrote:
          | Sure can! I contributed to this project a few months ago.
         | Hoping to use it soon on a project.
         | 
          | I implemented a portion of CRUD with a Vue.js app. It's a
          | pretty neat setup. Just authorize with OAuth2 and then if you
          | have permissions you're good to go!
        
       | pihentagy wrote:
        | How does this relate to Debezium?
        
       | bcoates wrote:
       | Neat! Is the idea to subscribe to the websocket directly from
       | frontend code or from an application server?
       | 
       | It sounds like it's the latter, which seems like an unusual use
       | case for websockets. (or at least I can't think of anything else
       | that does ws for a server-to-server API)
        
         | jordan801 wrote:
          | Yeah! Front end apps can skip the server and get data from the
          | db directly. There's an OAuth2 setup and you can authorize
          | users to have CRUD access to tables, or even as granular as
          | rows.
        
         | cookguyruffles wrote:
          | As far as getting cheap framing and bidirectional communication
          | over an HTTP port goes, there's nothing to stop you using
          | websockets server->server. For example, many crypto exchanges
          | publish their feeds this way.
        
           | codeflo wrote:
           | But why, except to claim "web-scale"? Especially for
           | publishing events, isn't something like MQTT a lot saner?
        
             | megous wrote:
             | No broker.
        
             | cookguyruffles wrote:
             | I already listed two, because you need
             | 
             | - cheap framing
             | 
             | - bidirectional comms
             | 
             | - are already exposing an HTTP port
             | 
              | - can make no assumptions about the client's networking
              | other than port 443 being available, even if only via an
              | HTTPS proxy
             | 
             | - can make no assumptions about the client's software
             | except they can speak an extremely common protocol
             | 
             | - want security to work the same way it does for the rest
             | of your services
             | 
              | "web-scale" doesn't come into it. Websockets are inherently
              | difficult to scale because they're stateful, but in return
              | you get the lowest latency in both directions the
              | underlying network can provide, with extremely reasonable
              | (2-4 bytes) overhead compared to raw TCP or SSL.
        
       | hmsimha wrote:
       | This is something I have an immediate need for (and was about to
       | build myself) to use with a PostgreSQL/TimescaleDB instance. Is
       | it possible to have new subscriptions get up to an hour of
       | historical data before streaming starts, or even better, to
       | supply a parameter to receive historical data from a timestamp?
        
         | sk5t wrote:
         | Stream history is kind of thorny; oughtn't one consider Kafka,
         | Kinesis, etc., for that instead?
        
           | hmsimha wrote:
            | To be clear, it's not really stream history that I'm after.
            | Just the result of a database query upon connection (
            | `SELECT * FROM x WHERE time > now() - interval '1 hour'` )
        
       | brap wrote:
       | I really want Supabase to take off. For me, the #1 reason not to
       | go with Firebase was lack of relational DB support, and #2 vendor
       | lock-in.
        
         | pier25 wrote:
         | For #1 you could check out Fauna. Much more powerful than
         | Firebase regarding querying and data modeling.
         | 
         | With Fauna you still get vendor lock-in though.
        
           | yawnxyz wrote:
            | I've moved from Fauna to Supabase... their FQL language was
            | kind of confusing and poorly documented compared to Supabase.
        
           | jadbox wrote:
           | If Fauna went open-source, I'd make the case to pay for their
           | hosting. There's business value in just having the option to
           | host it internally or make modifications.
        
         | kabes wrote:
          | For #2 you could look at Meteor, which does a similar thing
          | with Mongo. It's not relational, but with Supabase you also
          | don't really get the advantages of a relational db, since the
          | update events are about single tables.
        
       | cglace wrote:
       | So this is similar to how you could subscribe to queries in
       | meteorjs.
        
       | tyre wrote:
       | Do you know if there is something similar for SQLite? I've been
       | looking for a while and wonder if maybe listening to the WAL is
       | the best option. There are hooks to tie into but the database I'm
       | looking at isn't mine (it's local on my machine for another
       | application) so idk if I should be futzing around with it.
        
         | WJW wrote:
         | https://litestream.io/ ?
        
           | benbjohnson wrote:
           | Litestream author here. Litestream monitors and replicates
           | based off the WAL and there are some tricks it has to do to
           | make that work. However, it doesn't currently decode those
           | WAL pages and determine the changed data.
           | 
           | That is something that's on the roadmap though:
           | https://github.com/benbjohnson/litestream/issues/129
        
             | tyre wrote:
             | This is awesome! I guess more likely given the title of
             | this post, but still pretty cool that the author of a
             | library is here in the comments.
        
       | raarts wrote:
       | So basically Apollo GraphQL server could use the same mechanism
       | for enabling GraphQL subscriptions right? Any idea what they
       | currently use?
        
         | speedgoose wrote:
         | They use websockets and a set of plug-ins to support redis,
         | rabbitmq, or many other things. However, Apollo is planning to
         | remove subscriptions for the next major release.
        
           | raarts wrote:
           | Apollo removing subscriptions? I don't understand, aren't
           | those a core feature of GraphQL?
        
           | ex3ndr wrote:
           | Where did you get plans to remove subscriptions?
        
         | endisneigh wrote:
          | GraphQL servers usually handle subscriptions over websockets
          | as well.
        
       | skyde wrote:
        | I love it! The only issue is that you can't consume the history
        | of changes. But this is good for caching a materialized view in
        | the client:
        | 
        | 1- subscribe to events and buffer them
        | 
        | 2- run the SQL query for your materialized view
        | 
        | 3- apply all buffered events and new events to incrementally
        | update the materialized view
        | 
        | This way the (slow/expensive) query for the materialized view
        | doesn't need to be run periodically, and your cache is always
        | fresh without needing to set a TTL.
        | 
        | If your websocket connection gets disconnected, drop the
        | materialized view and repeat from step #1.
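
The three steps above can be sketched as a toy in-memory cache (the event and row shapes are made up; a real client would key rows on the table's actual primary key):

```python
class LiveView:
    """Cache a query result and keep it fresh from change events:
    buffer events while the initial query runs, then replay them."""

    def __init__(self, run_query):
        self.run_query = run_query
        self.rows = {}      # pkey -> row
        self.buffer = []    # events seen before the query finished
        self.ready = False

    def on_event(self, event):
        if not self.ready:
            self.buffer.append(event)    # step 1: buffer
        else:
            self._apply(event)

    def load(self):
        for row in self.run_query():     # step 2: run the expensive query
            self.rows[row["id"]] = row
        self.ready = True
        for event in self.buffer:        # step 3: replay buffered events
            self._apply(event)
        self.buffer.clear()

    def _apply(self, event):
        if event["op"] == "DELETE":
            self.rows.pop(event["row"]["id"], None)
        else:                            # INSERT or UPDATE
            self.rows[event["row"]["id"]] = event["row"]

view = LiveView(lambda: [{"id": 1, "total": 10}])
view.on_event({"op": "UPDATE", "row": {"id": 1, "total": 12}})  # arrives early
view.load()
view.on_event({"op": "INSERT", "row": {"id": 2, "total": 5}})
```

Buffering before the query finishes matters: an update that races the initial snapshot would otherwise be lost, which is exactly the case the "drop and restart on disconnect" rule guards against.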
        
       | yawnxyz wrote:
       | How is this similar/different from using CouchDB as a live
       | database? I've been toying around with both (hosting CouchDB on
       | DO)
       | 
        | Which one would you recommend for small-ish (~100 users, <10k
        | records) projects?
        
         | eloff wrote:
         | For a project that small you can use NoSQL without issue
         | because you can always scan the whole thing when needed. For
         | larger data sets you want the relational model unless you're
         | very sure you don't need it (and most people who think they're
         | in this category are just wrong and will find out later the
         | hard way.)
        
           | yawnxyz wrote:
           | Thank you, that's really helpful!
           | 
           | I'm always wary of getting stuck in the "I wish I didn't use
           | NoSQL" camp, but thankfully I haven't been in that situation
           | yet
        
       | endisneigh wrote:
       | I'm really curious to see how this ends up doing vs. Hasura.
       | 
        | PostgREST (which is basically what Supabase is, obviously with
        | other value-adds) or Hasura, which exposes a GraphQL server that
        | interfaces with Postgres.
       | 
        | Personally I prefer GraphQL as there's more tooling around it
        | compared to PostgREST, but it's interesting to see. In this case
        | if Supabase were GraphQL you could just use a subscription.
       | 
       | I'd be curious to know why supabase didn't go with GraphQL.
        
         | michael_j_ward wrote:
         | I believe the graphql equivalent of postgrest is postgraphile
         | [0].
         | 
         | [0] https://www.graphile.org/postgraphile/
        
         | pier25 wrote:
          | > _I'd be curious to know why supabase didn't go with
          | GraphQL._
         | 
         | I'm guessing because GraphQL is very limited compared to SQL.
         | 
          | From reading their docs [1] it seems they have a JS API for SQL
          | with support for nested data a la GraphQL. Not sure if it's
          | their own or comes from some other library.
         | 
         | [1] https://supabase.io/docs/reference/javascript/select
        
           | rattray wrote:
           | With postgraphile you get the full power of postgresql
           | exposed through graphql.
           | 
           | Using a few commonly-used add-ons, you can write a _whole
           | lot_ of sql in gql and it Just Works, translating your
           | deeply-nested and richly-filtered gql query into a single,
           | performant sql query.
           | 
           | You can see roughly how this translation ends up here: https:
           | //gist.github.com/rattrayalex/ae39c2cf0356f1257ece4f3c... (in
           | production it's more condensed etc).
           | 
           | You can also extend with sql functions or add/wrap resolvers
           | at the js level. (And yes, you can easily hide columns,
           | rename, etc)
        
         | eloff wrote:
          | So one big difference is that this tails the WAL (as a
          | replication client), while Hasura polls. Hasura makes that
          | polling efficient by multiplexing the polling queries
          | transparently, so it sends one query to poll N queries on the
          | same table with the same shape.
          | 
          | But once you're polling a large percentage of your whole data
          | set, the WAL-tailing approach has a clear advantage.
        
         | tablatom wrote:
         | Nhost is a very similar offering, built on Hasura
         | 
         | https://nhost.io/
        
         | thom wrote:
         | Have you had positive experiences using Hasura in production?
        
           | elitan wrote:
            | Hasura has some big clients:
           | 
           | https://twitter.com/commandodev/status/1387348391047335938
        
         | iooi wrote:
          | For Python folks, this library builds a GraphQL schema from
          | your SQLAlchemy models:
          | https://github.com/gzzo/graphql-sqlalchemy
        
       | karambahh wrote:
       | For some reason I understood the title as an audio rendering of
       | changes on your pg db
       | 
        | The actual purpose is way cooler, looks like a great tool.
        
         | rzzzt wrote:
         | You can listen to GitHub commits: https://github.audio/
        
           | Ombudsman wrote:
           | This is usually how I fall asleep.
        
         | yokto wrote:
         | This reminds me of a project I did a few years ago:
         | 
         | "Stockify is a live music service that follows the mood of the
         | stock markets. It plays happy music only when the stock is
         | going up and sad music when it's going down."
         | https://vimeo.com/310372406
         | 
         | (it's of course a parody, but I made a functional prototype)
        
           | shrimpx wrote:
           | I had a related idea, to make a running program produce an
           | audible hum like a car engine would. And the hum would vary
           | depending on what paths of the program are executing, so you
           | get an idea what/how your code is "doing" by listening in.
        
           | dkersten wrote:
           | I know someone who uses tones as an easy way of knowing if
           | there's a market move on some cryptocurrencies. Basically,
           | there's different tones for a trade that traded higher than
           | previous and for a trade that traded lower than the previous.
           | Possibly pitch or volume or something was used to indicate
           | the volume of the trade, I don't quite remember. Either way,
           | he could quickly hear if there was a major market move or
            | something just by the amount and tone of the beeps. He used
            | this as a way to know when he should look at the chart or
            | whatever.
        
             | krrrh wrote:
             | We spend a lot of time training digital computers to deal
             | with analog information that has been converted into
             | digital forms, and I wonder how much we could also gain by
             | finding better ways to convert digital information into
             | analog forms that our brains (as analog computers) can
             | better parse.
        
             | recuter wrote:
             | Oh you're also friends with Gilfoyle?
             | https://www.youtube.com/watch?v=uS1KcjkWdoU
        
           | grumblestumble wrote:
           | This is what NPR does on the market segment (Kai Rysdal?) The
           | background music reflects the changes in the market over the
           | day.
           | 
            | Also, if you haven't read Douglas Adams' "Dirk Gently's
            | Holistic Detective Agency", there's a good subplot around
            | this exact idea.
        
         | m_mueller wrote:
         | Same here. App idea geared at DevOps: ambient sound scape
         | generator based on monitoring events, e.g. from Datadog.
        
           | jrockway wrote:
           | We have this in New York City to monitor the flow of traffic
           | on quiet residential streets. If the flow rate is OK, it's
           | nice and quiet. If the flow rate gets too low (or a traffic
           | light turns red) then a chorus of very angry car horns
           | erupts, and you are jolted out of your train of thought,
           | wishing you had the guts / tolerance for prison time to run
           | outside and smash every single one of the cars with a
           | crowbar.
           | 
           | I think a recording of this is actually built into Pagerduty
           | to use as a sound when you're getting paged. I went with the
           | "golf ball hit into a flock of geese" one, though. Every time
           | that goes off my first thought is "OH GOD I'M DYING HELP" but
           | then I look and it's just GCP down again. Ironically
           | therapeutic.
        
           | tetha wrote:
           | There is an implementation and a paper called peep, the
           | network auralizer. I've been thinking of building something
           | like this in rust or go as a learning project, but audio
           | streaming and mixing is harder than I thought. If someone has
           | good libraries or tutorials there, I'd be grateful.
           | 
           | http://peep.sourceforge.net/intro.html https://www.usenix.org
           | /legacy/publications/library/proceedin...
        
           | jensneuse wrote:
           | Techno livestream. BPM increases when your website gets high
           | traffic.
        
         | mark-wagner wrote:
         | Similar: Solaris's snoop (tcpdump equivalent) with the -a
         | option:
         | 
         | Listen to packets on /dev/audio (warning: can be noisy).
         | 
         | https://docs.oracle.com/cd/E23823_01/html/816-5166/snoop-1m....
        
           | krrrh wrote:
           | You've got to throw in some uncompressed pcm audio streams
           | just to keep it interesting.
           | 
           | When we were learning Linux a friend and I used to pipe
           | /dev/hda into /dev/dsp for fun and when we hit some fragments
           | of uncompressed piano recordings we joked that we must have
           | hit the kernel source code. Good times.
        
         | domoritz wrote:
         | Same here and I am slightly disappointed it's not what I
         | thought it was.
        
           | SamBam wrote:
           | Same. Figured _pop-pop-poppop-pop_ could be the background
           | noise when we go back to the office...
        
         | jmull wrote:
         | That's what I thought too.
         | 
         | I guess someone needs to do it now, hm...
         | 
         | (But: I think live db-activity rendered to music as a kind of
         | monitoring mechanism is potentially way cooler than just
         | getting db updates over a web socket.)
        
       | sscarduzio wrote:
       | Will this work on AWS RDS?
        
       | JakaJancar wrote:
       | This works until you have an A->B->C hierarchy of entities and
       | want to monitor the subtree rooted at a specific A. Then the WAL
       | for table C, even with REPLICA IDENTITY FULL, will not tell you
       | whether a change is relevant. At that point you need to do a JOIN
       | anyways, so might as well just use a trigger and NOTIFY instead.
       | 
       | Two of the other reasons for this over triggers are also
       | misleading:
       | 
       | - Setting up triggers can be automated easily.
       | 
       | - True, you only use 1 connection to the database, but you now
       | operate this app. You could also run pgbouncer.
        
         | skyde wrote:
          | if C is a "child entity" of B then its primary key should have
          | the primary key of B as a prefix.
          | 
          | This is how I design all my table schemas, and it makes
          | database partitioning easier too.
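
A toy illustration of that prefix scheme (using Python's sqlite3 for portability, though the thread is about Postgres): because the child table's primary key starts with the parent's key, a single range scan on the primary-key index answers "all C rows under this B".

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Child table C's primary key is prefixed by parent B's key, so the
# primary-key index alone can serve "all C under a given B" queries.
db.execute(
    "CREATE TABLE c (b_id INTEGER, c_id INTEGER, val TEXT, "
    "PRIMARY KEY (b_id, c_id))"
)
db.executemany(
    "INSERT INTO c VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "b"), (2, 1, "c")],
)
# One prefix scan fetches the whole subtree under b_id = 1.
subtree = db.execute(
    "SELECT val FROM c WHERE b_id = 1 ORDER BY c_id"
).fetchall()
```

This is also why the scheme makes the WAL approach easy: every change record for C carries the parent key in its primary key, so relevance can be decided without a JOIN.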
        
           | JakaJancar wrote:
           | "Should" is pretty strong. The choice of PK has its own
           | tradeoffs. But you are correct, if you go the prefix route,
           | it makes the WAL approach trivial, at least while everything
           | really is a tree and there are no many-to-many relationships.
        
           | saurik wrote:
           | I would consider this model "usually incorrect" in that it is
           | quite likely you will eventually want to be able to move an
           | entity to a different parent. I would thereby always give
           | people the advice that one "should" have primary keys for
           | every entity type that are at least global for that local
           | entity type (if not truly global for all entity types, as
           | there is a lot of power doing such) and then implement this
           | child relationship as a unique foreign key constraint, unless
           | you have some very specific functionality or performance
           | requirement that would force primary key conflation (and
           | FWIW, "I want to use this WAL watcher" might count, but it
           | seems like a steep price to pay being unable to do sane
           | reparenting).
        
           | derefr wrote:
           | I wish I could do this, but I already have more than four
           | billion As, so my A IDs need 64 bits. I use composite keys
           | for my Bs (A ID + sequence number per A), which allows me to
           | search on the pkey index using just the A column; but by the
           | time I get to C, that gets unwieldy. So my Cs have UUIDs, my
           | Bs _also_ have UUIDs (secondary unique ID, not the pkey) and
           | my Cs have a foreign key pointing to their B. While _also_
           | having an indexed A ID field.
           | 
            | I'd love to just have "A-B-C" as my Cs' IDs... but it'd
           | only work for my use-case (i.e. be performant) if it was
           | running on a computer with 256-bit registers.
        
             | skyde wrote:
              | Curious why your primary keys are GUIDs? Also, if the
              | primary key of table C is made of 3 columns (A GUID, B
              | GUID, C GUID), the index in PostgreSQL will not be that big
              | with compression enabled, because the common prefixes will
              | not be stored redundantly.
              | 
              | Also, having a single compound index on table C covering
              | columns (A ForeignKey, B ForeignKey, C GUID) is much better
              | than having multiple indexes on table C.
        
               | Ombudsman wrote:
               | GUIDs are extremely useful if you want to move and merge
               | data around without having to worry about primary key
               | collisions.
        
       ___________________________________________________________________
       (page generated 2021-04-30 23:00 UTC)