[HN Gopher] Choose Postgres queue technology ___________________________________________________________________ Choose Postgres queue technology Author : bo0tzz Score : 174 points Date : 2023-09-24 20:30 UTC (2 hours ago) (HTM) web link (adriano.fyi) (TXT) w3m dump (adriano.fyi) | andrewstuart wrote: | You don't even need a database to make a message queue. The Linux | file system makes a perfectly good basis for a message queue | since file moves are atomic. | | My guess is that many people are implementing queuing mechanisms | just for sending email. | | You can see how this works in Arnie SMTP buffer server, a super | simple queue just for emails, no database at all, just the file | system. | | https://github.com/bootrino/arniesmtpbufferserver | repiret wrote: | That's a key property leveraged in the Maildir mailbox format. | doctor_eval wrote: | This is true, and I've worked on systems that use this, but | it's a lot more work than just a rename. | | I'd recommend that, if you have a Postgres database already, | definitely use that instead. Your queues will be transactional | and they will get backed up when the rest of your database | does. | andrewstuart wrote: | >> but it's a lot more work than just a rename | | Such as? | whartung wrote: | For those using simple SELECTs, what kind of WHERE clause are you | using that works well with lots of qualified pending messages and | (somewhat) guarantees the most appropriate (oldest?) message? | levkk wrote: | Running this exact implementation with 47M jobs processed and | counting. SKIP LOCKED is great for VACUUM, and having durable | storage with indexes make otherwise expensive patterns like | delayed jobs, retries, status updates, "at least once", etc. | really easy to implement. | dlisboa wrote: | Do you have some idea of how many jobs per minute or hour do | you have? Just want to compare with what we have on Redis at | work. | | Do you also have any idea on the concurrency? How many workers | you have pulling from Postgres. | | I've used this approach before (ages ago) when Redis wasn't | even a thing, though not at high throughout requirements. | devoutsalsa wrote: | I'm sure Redis is much faster than an RDBMS w/ all the ACID | features turned on. The biggest concern I always have with | Redis is simply overwhelming the in-memory storage limits | when someone wants to do process a large number of good-sized | messages at an inconvenient time. #tradeoffs | rubenfiszel wrote: | We use exactly this for windmill (OSS Retool alternative + modern | airflow) and run benchmarks everyday. On a modest github CI | instance where one windmill worker and postgres run as | containers, our benchmarks run at 1200jobs/s. Workers can be | added and it will scale gracefully up to 5000jobs/s. We are | exploring using Citus to cross the barrier of 5000j/s on our | multi-tenant instance. | | https://github.com/windmill-labs/windmill/tree/benchmarks | simonw wrote: | One of my favourite pieces of writing about worker queues is this | by Brandur Leach: | | Transactionally Staged Job Drains in Postgres - | https://brandur.org/job-drain | | It's about the challenge of matching up transactions with queues | - where you want a queue to be populated reliably if a | transaction completes, and also reliably NOT be populated if it | doesn't. | | Brandur's pattern is to have an outgoing queue in a database | table that gets updated as part of that transaction, and can then | be separately drained to whatever queue system you like. | paulddraper wrote: | USE. ADVISORY. LOCKS. | | Do not use SKIP LOCKED unless it is a toy/low throughout. | | Row locks require transactions and disk writes. | | Advisory locks require neither. (However, you do have to stay | inside the configurable memory budget.) | ukd1 wrote: | Maybe it's changed in the last year or so, but from | benchmarking and writing / running queue software for Postgres | - SKIP LOCKED was/is significantly faster. Is that different | for MySQL? | mikeocool wrote: | Pretty common advice for scaling Postgres is to deploy | pgbouncer in transaction mode in front of it to handle | connection pooling. | | Advisory locks don't work in this setup (and will start | behaving in strange ways if you do try to use them.) Something | to consider if you go this route. | ris wrote: | Transaction-scoped advisory locks are very much a thing too. | 5id wrote: | One of the biggest benefits imo of using Postgres as your | application queue, is that any async work you schedule benefits | from transactionality. | | That is, say you have a relatively complex backend mutation that | needs to schedule some async work (eg sending an email after | signup). With a Postgres queue, if you insert the job to send the | email and then in a later part of the transaction, something | fails and the transaction rollbacks, the email is never queued to | be sent. | laurencerowe wrote: | > One of the biggest benefits imo of using Postgres as your | application queue, is that any async work you schedule benefits | from transactionality. | | This is a really important point. I often end up using a | combination of Postgres and SQS since SQS makes it easy to | autoscale the job processing cluster. | | In Postgres I have a transaction log table that includes | columns for triggered events and the pg_current_xact_id() for | the transaction. (You can also use the built in xmin of the row | but then you have to worry about transaction wrap around.) | Inserting into this row triggers a NOTIFY. | | A background process runs in a loop. Selects all rows in the | transaction table with a transaction id between the last run's | xmin and the current pg_snapshot_xmin(pg_current_snapshot()). | Maps those events to jobs and submits them to SQS. Records the | xmin. LISTEN's to await the next NOTIFY. | matsemann wrote: | Good point. We alleviate that a bit by scheduling our queue | adds to not run until after commit. But then we still have some | unsafety, and if connection to rabbit is down we're in trouble. | theptip wrote: | Worth being clear that bridging to another non-idempotent | system necessarily requires you to pick at-least-once or at- | most-once semantics. So for emails, if you fail awaiting | confirmation of your email you still need to pick between | failing your transaction and potentially duplicating the email, | or continuing and potentially dropping it. | | The big advantage is for code paths which async modify your DB; | these can be done fully transactionally with exactly-once | semantics since the Job consumption and DB update are in the | same transaction. | skybrian wrote: | It seems like listen/notify doesn't play well with a serverless | architecture. Would it make sense for Postgres to make a web | request when there's work in the queue? Is that a thing? | dools wrote: | I'm always surprised that when I see people talk about queues I | never see anyone mention beanstalkd. I've been using it for | basically everything for 10 years and it's solid as a rock, | incredibly simple and requires basically no maintenance. It Just | Works(tm) | jjice wrote: | I've implemented queues with tables in RDBMSs a few times and | it's always great and usually all you need. Worried about future | scale? Make a class to wrapper the queue with a decent interface | and swap it for RabbitMQ or whatever you want down the road. | Implementation stays opaque and you have an easy upgrade path | later on. | tiagod wrote: | Temporal, which AFAIK was made by the Uber Cadence team, which | was also involved in SQS, uses postgres as a backend. | | I used it for a web automation system for an accounting client | (automatically read files from a network share, lookup the | clients on a database, submit the documents to government | websites, using headless browsers, and put the resulting files in | the directory). It allows for completely effortless deterministic | programs that call workers that run the non deterministic code, | with built in configurable retries (react to certain exception | type, exponential back off) so you can write code that works | almost like there were no issues with api connections, | filesystem, etc. | | This code has been running for 5 or more years, with barely any | maintenance, with 0 issues so far. It keeps everything in | postgres, so even full reboots and crashes have no impact, it | will just move the work back to the queue and it will run when | there's an available worker. | leoqa wrote: | Temporal is a pretty complicated system. It has sharding built | in, stores the entire activity history and runs multiple queues | for timers and events. I'm a big fan (worked at Uber) but it's | definitely not just postgres with a few indices. | aduffy wrote: | For several projects I've opted for the even dumber approach, | that works out of the box with every ORM/Query DSL framework in | every language: using a normal table with SELECT FOR UPDATE SKIP | LOCKED | | https://www.pgcasts.com/episodes/the-skip-locked-feature-in-... | | It's not "web scale" but it easily extends to several thousand | background jobs in my experience | qaq wrote: | batch inserts process tasks in batches and it is pretty much | webscale | matsemann wrote: | I've done even simpler without locks (as no transaction logic), | where I select a row, and then try to update a field about it | being taken. If 1 row is affected, it's mine. If 0, someone | else did it before me and I select a new row. | | I've used this for tasks at big organizations without issue. No | need for any special deployments or new infra. Just spin up a | few worker threads in your app. Perhaps a thread to reset | abandoned tasks. But in three years this never actually | happened, as everything was contained in try/catch that would | add it back to the queue, and our java app was damn stable. | klysm wrote: | With what transaction isolation level? | mbb70 wrote: | I've done the same with MongoDB with findOneAndModify, simple | and solid | andrelaszlo wrote: | I guess you update it with the assigned worker id, where the | "taken by" field is currently null? Does it mean that workers | have persistent identities, something like an index? How do | you deal with workers being replaced, scaled down, etc? | | Just curious. We maintained a custom background processing | system for years but recently replaced it with off the shelf | stuff, so I'm really interested in how others are doing | similar stuff. | matsemann wrote: | No, just update set taken=1. If it was a change to the row, | you updated it. If it wasn't, someone updated before you. | | Our tasks were quick enough so that all fetched tasks would | always be able to be completed before a scale down / new | deploy etc, but we stopped fetching new ones when the | signal came so it just finished what it had. I updated | above, we did have logic to monitor if a task got taken but | never got a finished status, but I can't remember it ever | actually reporting on anything. | fsniper wrote: | You can combine this "update" with a "where taken = 0" to | directly skip taken rows. | SahAssar wrote: | That is the sort of thing that bites you hard when it | bites. It might run perfectly for years but that one | period of flappy downtime at a third party or slightly | misconfigured DNS will bite you hard. | matsemann wrote: | But compared to our rabbit setup where I work now, it was | dead stable. No losing tasks or extra engineering effort | on maintaining yet another piece of tech. Our rabbit | cluster acting up has led to multiple disasters lately. | SahAssar wrote: | Agreed, I've had my own rabbit nightmares. But setting up | a more robust queue on postgresql is easy, so you can | easily gain a lot more guarantees without more | complexity. | fbdab103 wrote: | I would set the taken field to a timestamp. Then you | could have a cleanup job that looks for any lingering | jobs aged past a reasonable timeout and null out the | field. | tylergetsay wrote: | it wont work with a timestamp because each write will | have an affected row of 1 beacuse the writes happen at | different times. setting a boolean is static | jayd16 wrote: | You can do something like UPDATE row SET timeout = NOW() | WHERE NOW() - taskTimeout > row.timestamp. You're not | stuck with comparing bools. | twic wrote: | update tasks set taken_timestamp = now() where task_id = | ? and taken_timestamp is null | bushbaba wrote: | You could even use a timestamp for handling what if this task | was never finished by the worker who locked the row. | ricardobeat wrote: | That's what's in the article. | somsak2 wrote: | Fourth paragraph of the post: | | >Applied to job records, this feature enables simple queue | processing queries, e.g. SELECT * FROM jobs ORDER BY created_at | FOR UPDATE SKIP LOCKED LIMIT 1. | surprisetalk wrote: | I recently published a manifesto and code snippets for exactly | this in Postgres! | | [1] https://taylor.town/pg-task | orangepanda wrote: | As I understand, with SKIP LOCKED rows would no longer be | processed in-order? | klysm wrote: | Depends on how many consumers you have. If you need order | guarantees, then something like the outbox pattern is | probably a better fit. | riku_iki wrote: | article says he also uses "order by" clause, but I am | wondering if it will severely limit throughput since all | messages will need to be sorted on each lookup, but this | probably can be solved by introducing index. | jpgvm wrote: | Few things. | | 1. The main downside to using PostgreSQL as a pub/sub bus with | LISTEN/NOTIFY is that LISTEN is a session feature, making it | incompatible with statement level connection pooling. | | 2. If you are going to do this use advisory locks [0]. Other | forms of explicit locking put more pressure on the database while | advisory locks are deliberately very lightweight. | | My favorite example implementation is que [1] which is ported to | several languages. | | [0] https://www.postgresql.org/docs/current/explicit- | locking.htm... | | [1] https://github.com/que-rb/que | andrelaszlo wrote: | One issue with Redis as a queue backend seems to be that | persistence is quite expensive, at least for managed Redis | instances. Using PG seems like it could be much cheaper, | especially if you already have an instance with room to spare. | | I thought it was an interesting article, and I'd love to hear | more from people using PG for queues in production (my intuition | would say you'd get a lot of table bloat and/or vacuum latency, | but I haven't tested it myself), but when it comes to the | conclusion - "choosing boring technology should be one's default | choice" - I can't think of anything more boring (in a good sense, | mostly) than Sidekiq + Redis for a Rails app. | xdanger wrote: | I do enjoy using https://github.com/graphile/worker for my | postgresql queuing needs. Very scalable, the next release 0.14 | even more so, and easy to use. | andrewstuart wrote: | MS SQL server, Postgres and MySQL all support SKIP LOCKED, which | means they are all suitable for running queues. | | I built a complete implementation in Python designed to work the | same as SQS but be more simple: | | https://github.com/starqueue/starqueue | | Alternatively if you just want to quickly hack something into | your application, here is a complete solution in one Python | function with retries (ask ChatGPT to tell you what the table | structure is): import psycopg2 import | psycopg2.extras import random db_params | = { 'database': 'jobs', 'user': | 'jobsuser', 'password': 'superSecret', | 'host': '127.0.0.1', 'port': '5432', } | conn = psycopg2.connect(**db_params) cur = | conn.cursor(cursor_factory=psycopg2.extras.DictCursor) | def do_some_work(job_data): if random.choice([True, | False]): print('do_some_work FAILED') | raise Exception else: | print('do_some_work SUCCESS') def process_job(): | sql = """DELETE FROM message_queue WHERE id = ( | SELECT id FROM message_queue | WHERE status = 'new' ORDER BY created ASC | FOR UPDATE SKIP LOCKED LIMIT 1 ) | RETURNING *; """ cur.execute(sql) | queue_item = cur.fetchone() print('message_queue says | to process job id: ', queue_item['target_id']) sql = | """SELECT * FROM jobs WHERE id =%s AND status='new_waiting' AND | attempts <= 3 FOR UPDATE;""" cur.execute(sql, | (queue_item['target_id'],)) job_data = cur.fetchone() | if job_data: try: | do_some_work(job_data) sql = """UPDATE jobs | SET status = 'complete' WHERE id =%s;""" | cur.execute(sql, (queue_item['target_id'],)) | except Exception as e: sql = """UPDATE jobs | SET status = 'failed', attempts = attempts + 1 WHERE id =%s;""" | # if we want the job to run again, insert a new item to the | message queue with this job id | cur.execute(sql, (queue_item['target_id'],)) | else: print('no job found, did not get job | id: ', queue_item['target_id']) conn.commit() | process_job() cur.close() conn.close() | jarofgreen wrote: | > I'd love to see more neoq-like libraries for languages other | than Go. | | Python has Celery, but maybe the author is looking for more | choice between brokers. | https://docs.celeryq.dev/en/stable/index.html | scraplab wrote: | Yep, we process hundreds of thousands and sometimes a few million | jobs daily inside Postgres, using Oban in Elixir. | | Having transactional semantics around background jobs is | incredibly convenient for things like scheduling email only if | the transaction is successful, and so on. | | You do need to do a little bit of autovacuum tuning, but once | sorted it's been great for us. | nicoburns wrote: | For running queues on Postgres with Node.js backend(s), I highly | recommend https://github.com/timgit/pg-boss. I'm sure it has it | scale limits. But if you're one of the 90% of the apps that never | needs any kind of scale that a modern server can't easily handle | then it's fantastic. You get transactional queueing of jobs, and | it automatically handles syncing across multiple job processing | servers using Postgres locks. | kissgyorgy wrote: | Here is a Python example how to use it: | https://gist.github.com/kissgyorgy/beccba1291de962702ea9c237... | mildavw wrote: | For Rails apps, you can do this using the ActiveJob interface via | | https://github.com/bensheldon/good_job | | Had it in production for about a quarter and it's worked well. | pphysch wrote: | Another point is that task queue technology is highly fungible. | There's nothing stopping you from starting with cron, adding in | Postgres/Redis, then graduating to Kafka or something as _need_ | arises. And running all three in parallel, with different jobs in | each. I would be surprised if the average Kafka shop didn 't also | have a bunch of random cron jobs doing things that _could_ be | implemented on Kafka or vice versa. | | At some point you may want to refactor things to reduce tech | debt, but it really is a "and" rather than "or" decision. | Scubabear68 wrote: | I agree with all of this except for the part about "cron". Cron | jobs in my experience quickly become hard to manage and | effectively invisible over time. | | Use almost anything else to manage job scheduling.... | dharmab wrote: | I'm not sure if they literally mean crond, or something | vaguely cron-like but easier to manage like systemd timers. | evil-olive wrote: | there's an important dimension of scalability that I think gets | overlooked in a lot of these discussions about database-as-a- | queue vs queue-system-as-a-queue: | | are you queuing _jobs_ , or are you queuing _messages_? | | that's a fuzzy distinction, so somewhat equivalently, what's the | expected time it takes for a worker to process a given queue | item? | | at one end, an item on the queue may take several seconds to a | minute or longer to process. at the other end, an item might take | only a few milliseconds to process. in that latter case, it's | often useful to do micro-batching, where a single worker pulls | 100 or 1000 items off the queue at once, and processes them as a | batch (such as by writing them to a separate datastore) | | the "larger" the items are (in terms of wall-clock processing | time, not necessarily in terms of size in bytes of the serialized | item payload) the more effective the database-as-a-queue solution | is, in my experience. | | as queue items get smaller / shorter to process, and start to | feel more like "messages" rather than discrete "jobs", that's | when I tend to reach for a queue system over a database. | | for example, there's a RabbitMQ blog post [0] on cluster sizing | where their recommendations _start_ at 1000 messages /second. | that same message volume on a database-as-a-queue would require, | generally speaking, 3000 write transactions per second (if we | assume one transaction to enqueue the message, one for a worker | to claim it, and one for a worker to mark it as complete / delete | it). | | can Postgres and other relational databases be scaled & tuned to | handle that write volume? yes, absolutely. however, how much | write volume are you expecting from your queue workload, compared | to the write volume from its "normal database" workload? [1] | | I think that ends up being a useful heuristic when deciding | whether or not to use a database-as-a-queue - will you have a | relational database with a "side gig" of acting like a queue, or | will you have a relational database that in terms of data volume | is primarily acting like a queue, with "normal database" work | relegated to "side gig" status? | | 0: https://blog.rabbitmq.com/posts/2020/06/cluster-sizing- | and-o... | | 1: there's also a Postgres-specific consideration here where a | lot of very short-lived "queue item" database rows can put | excessive pressure on the autovacuum system. | doctor_eval wrote: | I've used PG as a message queue, actually it was used as a | transactional front end to Kafka; we'd push messages to a PG | table during a transaction, which would then be snarfed up to | Kafka by a separate process after the transaction completed. | | I've seen very high transaction rates from this arrangement, | more than 20k messages/second. | ukd1 wrote: | I maintain QueueClassic | (https://github.com/QueueClassic/queue_classic) for Rails/Ruby | folks; which is basically what you're talking about - a queuing | system for Postgres. A bonus reason, and why I originally wanted | this was the ability to use transactions fully - i.e. I can start | one, do some stuff, add a job in to the queue (to send an email), | .....and either commit, or roll back - avoiding sending the | email. If you use resque, I found sometimes either you can't see | the record (still doing other stuff and it's not committed), or | it's not there (rollback) - so either way you had to deal with | it. | | QC (and equivs) use the same db, and same connection, so same | transaction. Saves quite a bit of cruft. ___________________________________________________________________ (page generated 2023-09-24 23:00 UTC)