[HN Gopher] Asynchronous Task Scheduling at Dropbox
___________________________________________________________________

Asynchronous Task Scheduling at Dropbox

Author : pimterry
Score  : 54 points
Date   : 2020-11-11 19:26 UTC (3 hours ago)

(HTM) web link (dropbox.tech)
(TXT) w3m dump (dropbox.tech)

| ryanworl wrote:
| "To avoid this situation, there is a termination logic in the
| Executor processes whereby an Executor process terminates itself
| as soon as three consecutive heartbeat calls fail. Each heartbeat
| timeout is large enough to eclipse three consecutive heartbeat
| failures. This ensures that the Store Consumer cannot pull such
| tasks before the termination logic ends them--the second method
| that helps achieve this guarantee."
|
| Neither this nor the first method guarantees a lack of concurrent
| execution. A long GC pause or VM migration after the second check
| could allow the job to get rescheduled due to timeout. The first
| worker could resume thinking it still had one heartbeat left to
| execute before giving up on the job, and it could've already been
| handed out to another worker in the meantime.
| richardARPANET wrote:
| Why didn't they just use Celery?
| newfeatureok wrote:
| Why is it that this task scheduling problem appears so often? Why
| hasn't this problem been "solved" in the same way sorting strings
| has?
|
| I understand companies have different requirements, but if you
| look at the history even on Hacker News this problem is basically
| being re-solved by different companies at least once a quarter.
| mfateev wrote:
| Because it is a hard problem to solve holistically.
|
| It looks simple on the surface. So almost any company ends up
| creating an implementation similar to the one described in the
| article. Then it learns that it is much harder than it looks, but
| it is usually too late. So they end up maintaining it for a
| long time with the original team long gone.
|
| BTW I believe that temporal.io (I'm tech lead of the project)
| is so far the best open source solution to this problem.
| Thaxll wrote:
| If you work in Go and need to work on a similar problem I highly
| recommend: https://cadenceworkflow.io/
| mfateev wrote:
| It is not Go specific. It already supports Go, Java and
| Ruby.
|
| temporal.io, our fork of Cadence, will have PHP support very
| soon. Support for other languages is coming. Python and
| TypeScript are the highest priority.
| swyx wrote:
| Clickable link: http://temporal.io/
|
| I worked on this site btw - would be happy to receive
| feedback on the site, particularly if any wording was
| confusing or unclear!
| rkagerer wrote:
| I wish folks building Task frameworks would provide a standard
| mechanism for tasks to signal their progress. I realize it might
| not be relevant here but I've noticed this gap in more general
| frameworks as well.
| jeffbee wrote:
| If you have the opportunity, _please_ do not build it like this.
| Referring to the architectural diagram, it is going to be much
| more efficient for the "Frontend" to persist the task data into
| a durable data store, like they show, but then the Frontend
| should simply directly call the "Store Consumer" with the task
| data in an RPC payload. There is _no_ reason in the main
| execution path why the store consumers should ever need to read
| from the database, because almost all tasks can cut-through
| immediately and be retired. Reading from the database should only
| need to happen due to restarts and retries of tasks that fail to
| cut through.
|
| Disclaimer/claim: I worked on this system and on Gmail delivery.
| mrfox321 wrote:
| In other words, the frontend gets some ACK from the DB before
| calling the "Store Consumer"? (just trying to make sure I
| understand your critique of the design)
| jeffbee wrote:
| Well it doesn't necessarily need to happen in that order, I
| think.
| Frontend needs to ensure that the task is durably
| stored before it acknowledges the end of the operation to its
| caller.
|
| Using email as an analogy, you have to commit the message to
| durable storage before you respond 250 to DATA.
| neolog wrote:
| If you do it that way, how do you make sure the task gets
| completed successfully and exactly once?
| jeffbee wrote:
| I wouldn't. Exactly-once is a fool's quest, and the scheme in
| this article does not offer it.
|
| To achieve at-least-once, you need only track which tasks
| have been successfully retired, and persist that knowledge in
| the database by either deleting or mutating the task. During
| a cold start you scan the persistent store to find tasks that
| were still pending/live at the time your process began.
| [deleted]
| imstil3earning wrote:
| What does this solve that something like RabbitMQ doesn't? Am I
| missing some key points?
| Thaxll wrote:
| RabbitMQ does not solve this problem. RabbitMQ offers a solution
| for message passing, that's it; it does not offer a framework to
| execute tasks etc ...
| aeyes wrote:
| Everything listed under "Features", "System guarantees" and
| "Lambda requirements"?
|
| With Dropbox using Python, the real question is what Celery didn't
| solve for them. My guess would be scalability.
| solumos wrote:
| It seems that Nextdoor also had issues with Celery[0].
|
| "Scalability" is a great scapegoat for making dubious
| decisions, but my guess here would be the "task priority"
| requirement.
|
| [0] https://engblog.nextdoor.com/nextdoor-taskworker-simple-effi...
| sna1l wrote:
| Why not use something like Cadence/Temporal/Amazon Workflow
| Service?
| [deleted]
| stunt wrote:
| Flyte is also in the same space.
|
| https://github.com/lyft/flyte
| sna1l wrote:
| Yeah, lots of different options here; mostly surprised they
| didn't talk about why they didn't choose any of the
| alternatives.
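A minimal sketch of the at-least-once scheme jeffbee describes above: persist the task before acknowledging, hand it straight to the handler without reading it back, delete it once retired, and rescan the store on a cold start. Everything here is an assumption made for illustration, not the Dropbox design: the `TaskStore` class, the table layout, and the `submit` / `recover` / `run_and_retire` names are hypothetical, and SQLite stands in for whatever durable store is actually used.

```python
import json
import sqlite3
from typing import Callable, List, Tuple

Handler = Callable[[dict], None]


class TaskStore:
    """Hypothetical durable task table; SQLite is only a stand-in."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tasks "
            "(id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def persist(self, payload: dict) -> int:
        # Commit before returning, so the caller can safely acknowledge.
        cur = self.db.execute(
            "INSERT INTO tasks (payload) VALUES (?)", (json.dumps(payload),)
        )
        self.db.commit()
        return cur.lastrowid

    def retire(self, task_id: int) -> None:
        # Retirement is recorded by deleting the row.
        self.db.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
        self.db.commit()

    def pending(self) -> List[Tuple[int, dict]]:
        rows = self.db.execute(
            "SELECT id, payload FROM tasks ORDER BY id"
        ).fetchall()
        return [(task_id, json.loads(raw)) for task_id, raw in rows]


def run_and_retire(store: TaskStore, task_id: int, payload: dict,
                   handler: Handler) -> bool:
    try:
        handler(payload)
    except Exception:
        return False  # the row stays in the store for a later retry
    store.retire(task_id)
    return True


def submit(store: TaskStore, payload: dict, handler: Handler) -> int:
    task_id = store.persist(payload)                  # durable first
    run_and_retire(store, task_id, payload, handler)  # cut-through, no DB read
    return task_id


def recover(store: TaskStore, handler: Handler) -> int:
    """Cold start: re-run every task that was never retired."""
    retired = 0
    for task_id, payload in store.pending():
        if run_and_retire(store, task_id, payload, handler):
            retired += 1
    return retired
```

If the process dies after the handler succeeds but before `retire` commits, `recover` runs the task again. That is the at-least-once trade-off, so handlers in a design like this would need to be idempotent.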
| jonpurdy wrote:
| I don't mean to take away from the article, but it makes me sad
| to see such awesome people building and writing about really cool
| bespoke solutions. It's obvious that Arun knows their stuff and
| is able to communicate it clearly.
|
| The sad thing is that Dropbox Product has so heavily dropped the
| ball that users like myself (from back in 2009) have switched
| away in droves over the past few years.
|
| I understand that Dropbox core functionality wouldn't have been
| enough to multiply the valuation of the company to what investors
| expected. But it would have been nice to not jam collaboration
| features into the product and mess up the simple, platform-native
| UI with its current abomination. I'd pay $10/mo forever if I
| could get the 2010-esque Dropbox Mac client and sync service back
| since it's way better than anything else (especially iCloud).
| rkagerer wrote:
| Dropbox customer here, agree wholeheartedly.
|
| They've really gone downhill by adding unwanted bloat, and it
| just seems to be accelerating. Meanwhile their core product is
| degrading. Abandonment of the Public folder in spite of a huge
| outcry from customers was disappointing. The user experience is
| plastered with advertising to try their other products, even if
| you turn off all the relevant notification settings. And lately
| I've been running into subtle functionality bugs in the client.
|
| Would happily give my money to a competitor focused on a lean,
| reliable product.
| donor20 wrote:
| This - we are a business user of Dropbox; on Windows the task
| tray is a mess, and the collab / editing / paper features are so
| annoying. Sync I think is still OK if you can ignore everything
| on the website.
|
| I do wish you could PAY for a basic version (maybe make the
| collab stuff free as part of some trial or something).
| svara wrote:
| I don't know, I think it's pretty great. Can't live without it.
|
| The client's UI is a bit odd, but at the end of the day it's
| really good at what it's supposed to do: Syncing files.
|
| Performance is also great: I'm using multiple machines to write
| code on, and I keep my local git repo on Dropbox. I can
| literally save a change on my notebook and run it on some other
| machine 3 seconds later.
|
| On Mac and Linux you might want to check out maestral
| (https://github.com/SamSchott/maestral), a third-party client
| that works really well.
| secondcoming wrote:
| Does Dropbox use proper filesystems? I considered using
| Amazon's S3 to host a repo but apparently it may not work
| properly since it's not a 'proper' file system.
| draw_down wrote:
| Come on
| Osiris wrote:
| I can understand the need for a company to be constantly trying
| to add value to their product, but that tendency to be changing
| so much can easily cause you to lose sight of what made you
| popular in the first place.
|
| I use Dropbox personally to keep documents synced between my
| computer and my wife's and also to grab documents I need from
| the web if I'm on another computer. I occasionally share a
| folder if I need to give a large number of files to someone.
|
| I recently had a notification come up on the Dropbox taskbar
| icon and it popped up this huge window that looked like a
| massive Electron app. In the old days, there wasn't even a UI,
| just a context menu that also showed the state of the sync.
|
| For me, Dropbox provides the most benefit when it's not
| visible, running invisibly in the background doing its thing.
| staticassertion wrote:
| Error: 4xx Error (4xx) We can't find the page you're looking for.
| Check out our Help center and forums for help, or head back to
| home.
|
| Getting this when I click the link.
___________________________________________________________________
(page generated 2020-11-11 23:01 UTC)