[HN Gopher] High-Performance Graph Databases
       ___________________________________________________________________
        
       High-Performance Graph Databases
        
       Author : belter
       Score  : 129 points
       Date   : 2023-05-21 15:39 UTC (7 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | shrubble wrote:
       | What would be a use of 100k cores for a graph database?
        
         | henrydark wrote:
         | 100 use cases for 1k cores
        
         | jandrewrogers wrote:
         | Aggregate effective memory bandwidth if you use the CPU cache
         | well. Graph databases are not compute intensive, nor are they
         | particularly large data models, but they are extremely memory
         | I/O intensive. The classic model for graph-oriented HPC was
         | vast numbers of weak cores and barrel processors for this
         | reason, though the utility of specialized hardware
         | architectures has been greatly reduced by better software
         | architecture for this purpose over time.
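[Editor's note] A minimal sketch of why graph traversal is memory-I/O bound: each BFS hop issues a data-dependent load whose address is unknown until the previous load completes, which is exactly the latency that barrel processors and vast numbers of weak cores were designed to hide. This is a plain-Python illustration of the access pattern, not HPC code.

```python
from collections import deque

def bfs_levels(adj, source):
    """Breadth-first search over an adjacency list.

    Each neighbor lookup adj[v] is a data-dependent memory access:
    the next address is unknown until the current vertex is loaded,
    so traversal throughput is bounded by memory latency, not by
    arithmetic. Hardware with many concurrent threads hides that
    latency by keeping many such dependent chains in flight at once.
    """
    level = {source: 0}
    q = deque([source])
    while q:
        v = q.popleft()
        for w in adj[v]:  # pointer chase: irregular, cache-unfriendly
            if w not in level:
                level[w] = level[v] + 1
                q.append(w)
    return level

# Tiny example graph: edges 0-1, 0-2, 1-3
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs_levels(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```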
        
         | ocrow wrote:
         | A social media service for hundreds of millions of users?
        
         | threeseed wrote:
         | Fraud analytics.
         | 
         | You're looking for patterns across large numbers of entities
         | and relationships.
         | 
         | And ideally you want this all done in real-time so you can stop
         | transactions before they are approved.
        
       | belter wrote:
       | "...we harness established practices from the HPC landscape to
       | build a system that outperforms all past GDBs presented in the
       | literature by orders of magnitude, for both OLTP and OLAP
       | workloads. For this, we first identify and crystallize
       | performance-critical building blocks in the GDB design, and
       | abstract them into a portable and programmable API specification,
       | called the Graph Database Interface (GDI), inspired by the best
       | practices of MPI. We then use GDI to design a GDB for
       | distributed-memory RDMA architectures. Our implementation
       | harnesses one-sided RDMA communication and collective operations,
       | and it offers architecture-independent theoretical performance
       | guarantees. The resulting design achieves extreme scales of more
       | than a hundred thousand cores..."
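[Editor's note] The abstract's "one-sided RDMA communication" means a reader fetches remote memory directly, without involving the remote CPU. A toy sketch of the idea in plain Python follows; the fixed-stride record layout, field names, and `one_sided_get` helper are hypothetical illustrations of the concept, not the paper's actual GDI specification.

```python
import struct

# Hypothetical layout (not the paper's GDI spec): each vertex record
# is a fixed-size struct in a flat, registered memory region, so a
# reader can fetch it with a single one-sided RDMA GET at
# base + vertex_id * stride, with no remote-side code running.
RECORD = struct.Struct("<qq")  # (degree, adjacency_offset)

def make_region(records):
    """Pack vertex records into one contiguous buffer."""
    buf = bytearray(RECORD.size * len(records))
    for i, rec in enumerate(records):
        RECORD.pack_into(buf, i * RECORD.size, *rec)
    return buf

def one_sided_get(region, vertex_id):
    # Stand-in for an RDMA read: address arithmetic plus a raw copy.
    return RECORD.unpack_from(region, vertex_id * RECORD.size)

region = make_region([(2, 0), (3, 16), (1, 40)])
print(one_sided_get(region, 1))  # (3, 16)
```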
        
         | [deleted]
        
       | jandrewrogers wrote:
       | I am a little confused by the purpose of this paper. The
       | architecture described is roughly how graph databases have always
       | been implemented on HPC systems. The main contribution seems to
       | be that they put a lot of polish on what were admittedly
       | prototype-ish implementations historically? I was hoping for some
       | interesting approaches to some of the fundamental computer
       | science problems that cause scaling issues when graphs become
       | large but this is more of a standard "throw hardware at it"
       | solution (which has significant limitations).
        
         | lmeyerov wrote:
         | Agreed. They brush aside the years of high performance
          | computing graph implementations, e.g., cuStinger. If you
          | look at systems like neo4j's GDS, and at where other
          | multipurpose systems are going wrt views/projections for
          | accelerated compute, that approach has enabled targeting
          | far more performance without dying under complexity.
          | Benchmarking perf without that kind of comparison is
          | weird; I'm surprised reviewers allowed that. (The work may
          | still be good... you just can't know without that
          | comparison.)
        
         | mathisfun123 wrote:
         | I love this - "it's just engineering" say the people that think
         | the hardest part about building a spaceship is having the big
         | idea to build the spaceship. Note the "I love this" was
         | sarcasm.
        
           | jandrewrogers wrote:
           | My point was that the architecture they are using has already
           | been done multiple times for graph databases using things
           | like RDMA (which has existed in HPC for ages), that is a
           | known quantity. It was less "it's just engineering" and more
           | I've seen similar implementations for a long time so what
           | makes this different? I am interested in this space in part
           | because I spent much of my time in HPC working on graph
           | databases.
        
             | eternalban wrote:
             | You should quickly skim sections marked with a (comically
             | fat) "!" symbol. These indicate their "key design choices
             | and insights" in the design space.
        
       | LukeEF wrote:
       | I am not sure of the exact statistic, but something like 95% of
       | all production databases are less than 10GB. There seems to be a
       | 'FAANG hacker' fascination with 'extreme-scale' which probably
       | comes from seeing the challenges faced by the handful of
        | organizations working at that level. Much of the time, graph
        | database users want (it's why they're there) a DB that lets
        | them flexibly model their data and run complex queries. They
        | probably also want some sort of interoperability. If you can
        | do that well for 10GB, that is holy grail enough. We
        | certainly found this developing the graph database
        | TerminusDB [1]: most users have smaller production DBs, make
        | lighter use of bells-and-whistles features, and really want
        | things like easy schema evolution.
       | 
       | [1] https://github.com/terminusdb/terminusdb
        
         | deegles wrote:
         | What are the databases with easy schema evolution?
        
         | swader999 wrote:
         | I get that angle but I also see orgs capturing too much data.
         | What's the use case for it? Not sure but if we ever do need it
         | we'll have it is the typical answer.
        
         | fnord77 wrote:
          | Really? I don't quite believe that. We're a tiny company
          | with maybe 70 customers and our DB is roughly 11 TB.
        
           | cubefox wrote:
           | That seems a lot. What type of data?
        
           | [deleted]
        
           | paulddraper wrote:
            | Assuming 1 KB per "record", that's roughly 150 million
            | records per customer.
            | 
            | Definitely a data-heavy product, whatever it is you're
            | offering.
           | 
           | (Unless you keep large blobs in the DB. But database scale
           | has more to do with records than raw storage.)
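[Editor's note] A quick sanity check on the parent's arithmetic, taking 11 TB spread over 70 customers at about 1 KB per record:

```python
# Back-of-envelope check of the figures in the thread.
total_bytes = 11 * 10**12   # 11 TB, reported for ~70 customers
customers = 70
record_bytes = 1024         # assumed ~1 KB per record

records_per_customer = total_bytes / customers / record_bytes
print(f"{records_per_customer:.3g}")  # 1.53e+08, i.e. ~150 million
```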
        
         | belter wrote:
         | Interesting that you mention the value 10 GB, as it is the size
         | of a DynamoDB partition or an AWS Aurora cell...
        
         | im_down_w_otp wrote:
         | I think I kind of agree with this.
         | 
          | One of the simpler supported backends for our Modality
          | product (https://auxon.io/products/modality) is built on
          | SQLite. It yields a data model that's a special case of a
          | DAG, for modeling big piles of causally correlated events
          | from piles and piles of distributed components in "system
          | of systems" use cases. The scaling limiter is almost
          | always how efficiently the traces & telemetry can be
          | exfiltrated from the systems under test/observation, well
          | before the speed of the ingest path becomes a problem.
         | 
         | That said, I do love me some RDMA action. 10 years ago I was
         | fiddling with getting Erlang clustering working via RDMA on a
         | little 5 node Infiniband cluster. To mixed results.
        
          | parentheses wrote:
          | I agree with your sentiment, but I suppose you're
          | considering the wrong statistics. Instead you should
          | consider:
          | 
          | - how many jobs have interviews that necessitate knowing
          | how to handle extreme scale
          | 
          | - the proportion of jobs (not companies) requiring extreme
          | scale (the fact that non-extreme scales are the long tail
          | doesn't mean it's a fat tail)
          | 
          | - the proportion of buyers/potential users that walk away
          | over the inability to handle extreme scale
          | 
          | ... and more sarcastically:
          | 
          | - the proportion of articles about extreme scale
          | 
          | - the proportion of repos about extreme scale
        
           | mumblemumble wrote:
           | Only one anecdote, but I found out a while after starting at
           | my current job that directly questioning the extent to which
           | scale-out was actually needed to solve a problem during a
           | technical interview question is the thing that made me stand
           | out from the rest of the crowd, and landed me the job. Being
           | able to constructively challenge assumptions is an incredibly
           | valuable job skill, and good managers know that.
        
             | zimpenfish wrote:
             | Counter-anecdote: directly questioning "scale-out
             | fantasies" has contributed to my early departure from a
             | handful of jobs and contracts. One place was obsessed with
             | getting everything into AWS auto-scaling groups when the
             | problem was actually that they were running on MySQL with a
             | godawful schema, dumbass session management, and horrific
             | queries that we weren't allowed to fix because they were
             | "migrating to node microservices anyway" (pretty sure that
             | still hasn't happened years later.)
             | 
             | > Being able to constructively challenge assumptions is an
             | incredibly valuable job skill
             | 
             | I would agree but ...
             | 
             | > good managers
             | 
             | ... are few and far between.
        
               | hobs wrote:
                | The best people challenge bad assumptions, and the
                | worst bosses get mad.
               | 
               | Had one boss get mad that I reduced the database
               | footprint by 94% - why? Because he wrote the initial
               | implementation and refused to believe that his baby,
               | which cost so much space because of how awesome it was,
               | could fit into 5GB.
               | 
                | But challenging the status quo has gotten me to
                | where I am, so I won't stop anytime soon :)
        
         | threeseed wrote:
         | This research paper is talking about performance whilst you're
         | talking about scalability.
         | 
         | Those are related but are distinct from each other.
         | 
         | And sure about 95% of companies would have their needs met with
         | a simpler system but that does leave a lot of companies who
         | will not. And for those of us in say finance doing
         | customer/fraud analytics I would welcome all the performance I
         | can get.
        
           | loeg wrote:
           | > This research paper is talking about performance whilst
           | you're talking about scalability. Those are related but are
           | distinct from each other.
           | 
           | The paper has "Scale to Hundreds of Thousands of Cores" in
           | the title. I have not yet read the paper but it seems
           | unlikely it doesn't talk about scalability.
        
             | threeseed wrote:
             | I was referring to scalability in the sense of the size of
             | the data being stored.
             | 
             | You can have slow queries with 10GB of data just like you
             | can have fast queries with 10PB of data.
        
               | adgjlsfhk1 wrote:
               | If your data is small enough to easily fit in ram, you
               | kind of can't have that slow a query on it (or at least
               | you no longer are talking about a database problem).
        
               | mumblemumble wrote:
               | I'm guessing that, when the paper's author mentioned
               | "hundreds of thousands of cores", they didn't have 10GB
               | of data in mind. That works out less than a typical L1
               | cache's worth of data per core.
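[Editor's note] The parent's estimate holds up; taking 200,000 cores as a stand-in for "hundreds of thousands":

```python
# 10 GB spread over ~200k cores vs. a typical L1 data cache.
data_bytes = 10 * 10**9    # 10 GB
cores = 200_000            # "hundreds of thousands of cores"

per_core = data_bytes / cores
print(per_core / 1024)     # 48.828125 KiB per core,
                           # comparable to a 32-64 KiB L1 d-cache
```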
        
         | rocqua wrote:
         | This isn't a graph database like neo4j. This is a graph
         | database like I hoped neo4j would be. It's not about having an
         | easier time working with schemas. It's about analyzing graphs
         | that are too big to fit in RAM. Transaction analysis for banks,
          | traffic analysis of roads, failure resilience of utility
         | networks, etc.
         | 
          | In these kinds of workloads you quickly run into
          | performance bottlenecks. Even in-memory analyses need care
          | to avoid complete pointer-chasing slowdowns.
         | 
         | I do still hope this is fast in like a single CPU 32 core 64GB
         | system with an SSD. But if this takes a cluster to be useful,
         | then I will still love it.
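[Editor's note] A standard mitigation for the pointer-chasing slowdowns mentioned above is a compressed sparse row (CSR) layout, which packs all adjacency lists into one contiguous array so that scanning a vertex's neighbors is a sequential read. A minimal sketch with illustrative names (not taken from the paper):

```python
def to_csr(adj_lists):
    """Compressed Sparse Row: all neighbor lists packed into one
    contiguous array plus per-vertex offsets. Scanning a vertex's
    neighbors becomes a sequential slice read instead of chasing
    per-node pointers scattered across the heap."""
    offsets = [0]
    neighbors = []
    for nbrs in adj_lists:
        neighbors.extend(nbrs)
        offsets.append(len(neighbors))
    return offsets, neighbors

def neighbors_of(offsets, neighbors, v):
    return neighbors[offsets[v]:offsets[v + 1]]

# Same tiny graph as an adjacency list per vertex 0..3.
offsets, neighbors = to_csr([[1, 2], [0, 3], [0], [1]])
print(neighbors_of(offsets, neighbors, 1))  # [0, 3]
```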
        
         | bubblethink wrote:
         | >There seems to be a 'FAANG hacker' fascination
         | 
         | Yeah, but the hacker fascination is what drives progress. You
         | could have made the same type of argument about ML, and we
         | would have been content with MNIST.
        
         | RhodesianHunter wrote:
         | But the 5% of places where that kind of scale is needed are the
         | ones paying the top 1% salary band, so this is the content
         | distributed systems engineers like to read about and work on.
        
       | lasfter wrote:
       | Maciej Besta, the first author of this paper, is a machine.
       | 
       | Aside from coordinating big groups to write tons of papers, he
       | does a bunch of impressive wilderness exploration. I recommend
       | checking out his website, there's some stunning photos:
       | 
       | https://people.inf.ethz.ch/bestam/expeditions.html
        
         | osigurdson wrote:
         | Stating that someone "is a machine" is somewhat ambiguous these
         | days as it is increasingly possible that they literally are
         | one.
        
           | EGreg wrote:
           | Wasn't TJ Holowaychuk confirmed to be an AI, or a group of
           | people like Bourbaki? He never showed up at any events :)
           | 
           | https://qr.ae/pyvfKK
        
           | gumby wrote:
           | I also misinterpreted that sentence!
        
         | mathisfun123 wrote:
          | Torsten is the PI of that group, not Maciej, so you're
          | half right. Since Maciej is first author on this one, he
          | probably did most of the technical work (and who cares
          | about "coordination").
        
       | uptownfunk wrote:
       | What are these used for in the "real" world?
        
         | perfect_wave wrote:
         | A few that I've encountered:
         | 
          | - Feature generation for machine learning model training
          | (particularly popular with fraud detection at financial
          | institutions)
          | 
          | - social networks (think LinkedIn or Facebook)
          | 
          | - supply chain analysis and optimization
          | 
          | - healthcare patient data analysis (looking at similar
          | patients to recommend treatments or do large-scale
          | analysis)
          | 
          | - user identification (e.g. taking lots of data points and
          | tying them to a specific user). There's a more specific
          | name for this I can't remember off the top of my head.
        
       | z3t4 wrote:
       | I once was obsessed at finding the holy Grail of optimization and
       | scaling, but then came to the conclusion that every system is
       | optimized and scaled differently. Different systems have
       | different bottlenecks.
        
         | slively wrote:
         | So true, rarely is anything the "best" or "better", but instead
         | each thing is a bucket of trade offs we choose from. Maybe
         | frustrating, but also what makes engineering genuinely
         | interesting.
        
         | uptownfunk wrote:
         | This sounds similar to the no free lunch theorem in AI
        
           | jjgreen wrote:
           | Huh? NFL is a theorem on optimisation, not AI
        
             | cubefox wrote:
             | It's originally a theorem about supervised learning, which
             | turned out to be the prototypical application (via CNNs) of
             | deep learning. Deep learning is now mostly synonymous with
             | AI. Though I'm not aware of any NFL theorem for
             | reinforcement learning or unsupervised learning.
        
             | mumblemumble wrote:
             | Given that the most well-known use case for optimization
             | algorithms is training machine learning models, it seems to
             | me like a perfectly reasonable, if buzzword-oriented, thing
             | to say.
        
       ___________________________________________________________________
       (page generated 2023-05-21 23:00 UTC)