[HN Gopher] High-Performance Graph Databases
___________________________________________________________________
Author : belter
Score  : 129 points
Date   : 2023-05-21 15:39 UTC (7 hours ago)
web link (arxiv.org)

| shrubble wrote:
| What would be a use of 100k cores for a graph database?

| henrydark wrote:
| 100 use cases for 1k cores

| jandrewrogers wrote:
| Aggregate effective memory bandwidth, if you use the CPU cache well. Graph databases are not compute intensive, nor are they particularly large data models, but they are extremely memory-I/O intensive. The classic model for graph-oriented HPC was vast numbers of weak cores and barrel processors for this reason, though the utility of specialized hardware architectures has been greatly reduced over time by better software architecture for this purpose.

| ocrow wrote:
| A social media service for hundreds of millions of users?

| threeseed wrote:
| Fraud analytics.
|
| You're looking for patterns across large numbers of entities and relationships.
|
| And ideally you want this all done in real time so you can stop transactions before they are approved.

| belter wrote:
| "...we harness established practices from the HPC landscape to build a system that outperforms all past GDBs presented in the literature by orders of magnitude, for both OLTP and OLAP workloads. For this, we first identify and crystallize performance-critical building blocks in the GDB design, and abstract them into a portable and programmable API specification, called the Graph Database Interface (GDI), inspired by the best practices of MPI. We then use GDI to design a GDB for distributed-memory RDMA architectures. Our implementation harnesses one-sided RDMA communication and collective operations, and it offers architecture-independent theoretical performance guarantees.
| The resulting design achieves extreme scales of more than a hundred thousand cores..."

| [deleted]

| jandrewrogers wrote:
| I am a little confused by the purpose of this paper. The architecture described is roughly how graph databases have always been implemented on HPC systems. The main contribution seems to be that they put a lot of polish on what were admittedly prototype-ish implementations historically? I was hoping for some interesting approaches to the fundamental computer science problems that cause scaling issues when graphs become large, but this is more of a standard "throw hardware at it" solution (which has significant limitations).

| lmeyerov wrote:
| Agreed. They brush aside the years of high-performance-computing graph implementations, e.g., cuStinger. If you look at systems like Neo4j's GDS, and at how other multipurpose systems are using views/projections for accelerated compute, that approach has enabled targeting far more performance without dying under complexity. Benchmarking perf without that kind of comparison is weird; I'm surprised reviewers allowed it. (The work may still be good... you just can't know without that comparison.)

| mathisfun123 wrote:
| I love this - "it's just engineering", say the people who think the hardest part about building a spaceship is having the big idea to build the spaceship. Note that the "I love this" was sarcasm.

| jandrewrogers wrote:
| My point was that the architecture they are using has already been done multiple times for graph databases, using things like RDMA (which has existed in HPC for ages); it is a known quantity. It was less "it's just engineering" and more that I've seen similar implementations for a long time, so what makes this different? I am interested in this space in part because I spent much of my time in HPC working on graph databases.

| eternalban wrote:
| You should quickly skim the sections marked with a (comically fat) "!" symbol.
| These indicate their "key design choices and insights" in the design space.

| LukeEF wrote:
| I am not sure of the exact statistic, but something like 95% of all production databases are less than 10GB. There seems to be a 'FAANG hacker' fascination with 'extreme scale', which probably comes from seeing the challenges faced by the handful of organizations working at that level. Much of the time, what most graph database users want (as in, why they are there) is a DB that allows you to flexibly model your data and run complex queries. They probably also want some sort of interoperability. If you can do that well for 10GB, that is holy grail enough. We certainly found this while developing the graph database TerminusDB [1] - most users have smaller production DBs, use the bells-and-whistles features more lightly, and really want things like easy schema evolution.
|
| [1] https://github.com/terminusdb/terminusdb

| deegles wrote:
| What are the databases with easy schema evolution?

| swader999 wrote:
| I get that angle, but I also see orgs capturing too much data. What's the use case for it? "Not sure, but if we ever do need it we'll have it" is the typical answer.

| fnord77 wrote:
| really? I don't quite believe that. We're a tiny company with maybe 70 customers and our db is roughly 11TB.

| cubefox wrote:
| That seems a lot. What type of data?

| [deleted]

| paulddraper wrote:
| Assuming 1KB per "record", that's 150 million records per customer.
|
| Definitely a data-heavy product, whatever it is that you're offering.
|
| (Unless you keep large blobs in the DB. But database scale has more to do with records than raw storage.)

| belter wrote:
| Interesting that you mention the value 10 GB, as it is the size of a DynamoDB partition or an AWS Aurora cell...

| im_down_w_otp wrote:
| I think I kind of agree with this.
|
| One of the simpler supported backends for our Modality product (https://auxon.io/products/modality) - which results in a data model that's a special case of a DAG, for modeling big piles of causally correlated events from piles and piles of distributed components in "system of systems" use cases - is built using SQLite, and the scaling limiter is almost always how efficiently the traces & telemetry can be exfiltrated from the systems under test/observation, long before how fast the ingest path can actually record things becomes a problem.
|
| That said, I do love me some RDMA action. 10 years ago I was fiddling with getting Erlang clustering working via RDMA on a little 5-node Infiniband cluster. With mixed results.

| parentheses wrote:
| I agree with your sentiment, but I suppose you're considering the wrong statistics. Instead you should consider:
|
| - how many jobs have interviews that necessitate knowing how to handle extreme scale
|
| - the proportion of jobs (not companies) requiring extreme scale
|
| - the fact that non-extreme scales being the long tail doesn't mean it's a fat tail
|
| - the proportion of buyers/potential users that walk away from an inability to handle extreme scale
|
| ... and, more sarcastically:
|
| - the proportion of articles about extreme scale
|
| - the proportion of repos about extreme scale

| mumblemumble wrote:
| Only one anecdote, but I found out a while after starting at my current job that directly questioning the extent to which scale-out was actually needed to solve a problem, during a technical interview question, is the thing that made me stand out from the rest of the crowd and landed me the job. Being able to constructively challenge assumptions is an incredibly valuable job skill, and good managers know that.

| zimpenfish wrote:
| Counter-anecdote: directly questioning "scale-out fantasies" has contributed to my early departure from a handful of jobs and contracts.
| One place was obsessed with getting everything into AWS auto-scaling groups when the problem was actually that they were running on MySQL with a godawful schema, dumbass session management, and horrific queries that we weren't allowed to fix because they were "migrating to node microservices anyway" (pretty sure that still hasn't happened, years later).
|
| > Being able to constructively challenge assumptions is an incredibly valuable job skill
|
| I would agree, but ...
|
| > good managers
|
| ... are few and far between.

| hobs wrote:
| The best people challenge bad assumptions, and the worst bosses get mad.
|
| Had one boss get mad that I reduced the database footprint by 94% - why? Because he wrote the initial implementation and refused to believe that his baby, which cost so much space because of how awesome it was, could fit into 5GB.
|
| But challenging the status quo has gotten me to where I am, so I won't stop anytime soon :)

| threeseed wrote:
| This research paper is talking about performance whilst you're talking about scalability.
|
| Those are related but are distinct from each other.
|
| And sure, about 95% of companies would have their needs met with a simpler system, but that does leave a lot of companies who will not. And for those of us in, say, finance doing customer/fraud analytics, I would welcome all the performance I can get.

| loeg wrote:
| > This research paper is talking about performance whilst you're talking about scalability. Those are related but are distinct from each other.
|
| The paper has "Scale to Hundreds of Thousands of Cores" in the title. I have not yet read the paper, but it seems unlikely it doesn't talk about scalability.

| threeseed wrote:
| I was referring to scalability in the sense of the size of the data being stored.
|
| You can have slow queries with 10GB of data, just like you can have fast queries with 10PB of data.
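[Editor's note] A minimal sketch of that last point (the schema, table name, and row counts here are invented for illustration, not taken from the paper or the thread): the same point lookup on the same small, in-memory dataset is a full table scan without an index and a B-tree probe with one, so access path dominates latency regardless of data volume.

```python
# Hypothetical illustration: data size alone doesn't determine query speed.
# All names and numbers below are invented for the example.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txns (id INTEGER, account TEXT, amount REAL)")
# 200k rows spread across 5,000 accounts -> 40 rows per account.
conn.executemany(
    "INSERT INTO txns VALUES (?, ?, ?)",
    [(i, f"acct{i % 5000}", float(i)) for i in range(200_000)],
)

def lookup(account):
    """Time a single-account lookup, returning (row_count, seconds)."""
    t0 = time.perf_counter()
    rows = conn.execute(
        "SELECT * FROM txns WHERE account = ?", (account,)
    ).fetchall()
    return len(rows), time.perf_counter() - t0

n_scan, scan_s = lookup("acct42")    # no index: scans all 200k rows
conn.execute("CREATE INDEX idx_account ON txns(account)")
n_probe, probe_s = lookup("acct42")  # indexed: touches ~40 rows

print(f"{n_scan} rows; scan {scan_s*1e3:.2f} ms vs indexed {probe_s*1e3:.2f} ms")
```

On typical hardware the indexed lookup is orders of magnitude faster even though the whole table trivially fits in RAM; conversely, a well-partitioned and well-indexed petabyte store can answer point lookups quickly.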
| adgjlsfhk1 wrote:
| If your data is small enough to easily fit in RAM, you kind of can't have that slow a query on it (or at least you're no longer talking about a database problem).

| mumblemumble wrote:
| I'm guessing that, when the paper's authors mentioned "hundreds of thousands of cores", they didn't have 10GB of data in mind. That works out to less than a typical L1 cache's worth of data per core.

| rocqua wrote:
| This isn't a graph database like Neo4j. This is a graph database like I hoped Neo4j would be. It's not about having an easier time working with schemas. It's about analyzing graphs that are too big to fit in RAM. Transaction analysis for banks, traffic analysis of roads, failure resilience of utility networks, etc.
|
| In these kinds of workloads you quickly run into performance bottlenecks. Even in-memory analyses need care to avoid complete pointer-chasing slowdowns.
|
| I do still hope this is fast on, say, a single-CPU 32-core 64GB system with an SSD. But if this takes a cluster to be useful, then I will still love it.

| bubblethink wrote:
| > There seems to be a 'FAANG hacker' fascination
|
| Yeah, but the hacker fascination is what drives progress. You could have made the same type of argument about ML, and we would have been content with MNIST.

| RhodesianHunter wrote:
| But the 5% of places where that kind of scale is needed are the ones paying the top 1% salary band, so this is the content distributed-systems engineers like to read about and work on.

| lasfter wrote:
| Maciej Besta, the first author of this paper, is a machine.
|
| Aside from coordinating big groups to write tons of papers, he does a bunch of impressive wilderness exploration.
| I recommend checking out his website; there are some stunning photos:
|
| https://people.inf.ethz.ch/bestam/expeditions.html

| osigurdson wrote:
| Stating that someone "is a machine" is somewhat ambiguous these days, as it is increasingly possible that they literally are one.

| EGreg wrote:
| Wasn't TJ Holowaychuk confirmed to be an AI, or a group of people like Bourbaki? He never showed up at any events :)
|
| https://qr.ae/pyvfKK

| gumby wrote:
| I also misinterpreted that sentence!

| mathisfun123 wrote:
| Torsten is the PI of that group, not Maciej, so you're half right; but since Maciej is first author on this one, he probably did most of the technical work (and who cares about "coordination").

| uptownfunk wrote:
| What are these used for in the "real" world?

| perfect_wave wrote:
| A few that I've encountered:
|
| - feature generation for machine learning model training (particularly popular with fraud detection at financial institutions)
|
| - social networks (think LinkedIn or Facebook)
|
| - supply chain analysis and optimization
|
| - healthcare patient data analysis (looking at similar patients to recommend treatments or do large-scale analysis)
|
| - user identification (e.g. taking lots of data points and tying them to a specific user). There's a more specific name for this I can't remember off the top of my head.

| z3t4 wrote:
| I was once obsessed with finding the holy grail of optimization and scaling, but then came to the conclusion that every system is optimized and scaled differently. Different systems have different bottlenecks.

| slively wrote:
| So true. Rarely is anything the "best" or "better"; instead, each thing is a bucket of trade-offs we choose from. Maybe frustrating, but also what makes engineering genuinely interesting.

| uptownfunk wrote:
| This sounds similar to the no free lunch theorem in AI

| jjgreen wrote:
| Huh?
| NFL is a theorem on optimisation, not AI

| cubefox wrote:
| It's originally a theorem about supervised learning, which turned out to be the prototypical application (via CNNs) of deep learning. Deep learning is now mostly synonymous with AI. Though I'm not aware of any NFL theorem for reinforcement learning or unsupervised learning.

| mumblemumble wrote:
| Given that the most well-known use case for optimization algorithms is training machine learning models, it seems to me like a perfectly reasonable, if buzzword-oriented, thing to say.
___________________________________________________________________
(page generated 2023-05-21 23:00 UTC)