[HN Gopher] Scylla - Real-Time Big Data Database
       ___________________________________________________________________
        
       Scylla - Real-Time Big Data Database
        
       Author : andrewstuart
       Score  : 56 points
       Date   : 2021-08-24 20:24 UTC (2 hours ago)
        
 (HTM) web link (www.scylladb.com)
        
       | criticaltinker wrote:
       | The benchmarks against DynamoDB, Bigtable, & CockroachDB [1]
       | appear quite impressive - anyone have real world experience that
       | can attest to these claims of improved performance and reduced
       | cost?
       | 
       | > Scylla vs DynamoDB - Database Benchmark
       | 
       | _> 20x better throughput in the hot-partition test_
       | 
       | _> Scylla Cloud is 1/7 the expense of DynamoDB when running
       | equivalent workloads_
       | 
       | _> Scylla Cloud: Average replication latency of 82ms. DynamoDB:
       | Average latency of 370ms._
       | 
       | > Scylla vs Bigtable - Database Benchmark
       | 
       | _> Scylla Cloud performs 26X better than Google Cloud Bigtable
       | when applied with real-world, unoptimized data distribution_
       | 
       | _> Google BigTable requires 10X as many nodes to accept the same
       | workload as Scylla Cloud_
       | 
       | _> Scylla Cloud was able to sustain 26x the throughput, and with
       | read latencies 1/800th and write latencies less than 1/100th of
       | Cloud Bigtable_
       | 
       | > Scylla vs CockroachDB - Database Benchmark
       | 
       | _> Loading 10x the data into Scylla took less than half the time
       | it took for CockroachDB to load the much lesser dataset._
       | 
       | _> Scylla handled 10x the amount of data._
       | 
       | _> Scylla achieved 9.3x the throughput of CockroachDB at 1/4th
       | the latency._
       | 
       | [1] https://www.scylladb.com/product/benchmarks/
        
         | kasey_junk wrote:
         | I was unwilling to sign up to read the actual benchmark report
         | for the comparison to cockroachdb but it jumped out at me as
         | odd. They solve completely different kinds of problems in my
         | experience so I'm not surprised Scylla did better in raw
         | throughput. That's not interesting though. It would be just as
         | weird for cockroach to put up a benchmark showing it
         | outperforms in distributed sql queries.
         | 
         | That said I've seen the value Scylla brings in its core value
         | prop, replacing Cassandra. It's real good at that.
        
           | biggestdummy wrote:
           | Full report is posted here with no registration wall:
           | https://www.scylladb.com/2021/01/21/cockroachdb-vs-scylla-
           | be... And they admit that it's an odd comparison. "Obviously,
           | the comparison is of the apples and oranges type..."
        
       | mianos wrote:
       | It is interesting that this is on the front page at the same
       | time as an old article about Discord moving to Cassandra,
       | considering that Discord went from Cassandra to Scylla, I believe.
        
       | andrewstuart wrote:
       | I posted this because I'm interested to hear from anyone using it
       | - how has it worked out for you?
       | 
       | I note it's written in C++ which is a bit of a surprise - I'd
       | expected Rust or Golang.
       | 
       | Interesting as well is that it is AGPL - licensing is always
       | contentious:
       | 
       | https://github.com/scylladb/scylla/blob/master/LICENSE.AGPL
        
         | zinclozenge wrote:
         | I think the main reason it's in C++ is because of its async
         | executor, Seastar. There's a similar Rust project called
         | Glommio, but it still seems very early.
        
           | biggestdummy wrote:
           | Glommio was created by Glauber Costa, one of the early
           | contributors to Seastar (and Scylla). The resemblance between
           | the two is not coincidence.
           | https://glaubercosta-11125.medium.com/c-vs-rust-an-async-
           | thr...
        
         | krapht wrote:
         | Only on Hackernews would somebody be surprised that high-
         | performance system software would be written in C++...
        
           | masterof0 wrote:
           | You read my mind. LOL. "Mr. Developer, can you please write
           | your project in Rust, or __insert_your_meme_language_here__,
           | or Javascript?"
        
             | ethelward wrote:
             | From the mouth of ScyllaDB's CTO: "So if we were
             | starting at this point in time, I would take a hard look at
             | Rust, and I imagine that we would pick it instead of C++."
        
         | jandrewrogers wrote:
         | If you are building a database engine that strongly prioritizes
         | performance, and Scylla does position itself that way, then C++
         | is the only practical choice today for many people, depending
         | on the details. It isn't that C++ is great, though modern
         | versions are pretty nice, but that it wins by default.
         | 
         | Garbage collected languages like Golang and high-performance
         | database kernels are incompatible because the GC interferes
         | with core design elements of high-performance database kernels.
         | In addition to a significant loss of performance, it introduces
         | operational edge cases you don't have to deal with in non-GC
         | languages.
         | 
         | Rust has an issue unique to Rust in the specific case of high-
         | performance database kernels. The internals of high-performance
         | databases are full of structures, behaviors, and safety
         | semantics that Rust's safety checking infrastructure is not
         | designed to reason about. Consequently, to use Rust in a way
         | that produces equivalent performance requires marking most of
         | the address space as "unsafe". And while you could do this,
         | Rust is currently less expressive than modern C++ for this type
         | of code anyway, so it isn't ergonomic either.
         | 
         | C++ is just exceptionally ergonomic for writing high-
         | performance database kernels compared to the alternatives at
         | the moment.
        
           | nhourcard wrote:
           | At QuestDB we chose zero-GC Java for 80% of the code base,
           | which resulted in superior performance on ingestion compared
           | to the alternatives.
        
           | dralley wrote:
           | Zig might be a good option -- eventually, once it's past 1.0.
        
         | enedil wrote:
         | Quoting the interview with ScyllaDB CTO, Avi Kivity
         | (https://www.scylladb.com/2020/06/30/ask-me-anything-with-avi...)
         | 
         | > Q: Would you implement Scylla in Go, Rust or Javascript if
         | you could?
         | 
         | > Avi: Good question. I wouldn't implement Scylla in
         | Javascript. It's not really a high-performance language, but I
         | will note that Node.js and Seastar share many characteristics.
         | Both are using a reactor pattern and designed for high
         | concurrency. Of course the performance is going to be very
         | different between the two, but writing code for Node.js and
         | writing code for Seastar is quite similar.
         | 
         | > Go also has an interesting take on concurrency. I still
         | wouldn't use it for something like Scylla. It is a garbage-
         | collected language so you lose a lot of predictability, and you
         | lose some performance. The concurrency model is great. The
         | language lacks generics. I like generics a lot and I think they
         | are required for complex software. I also hear that Go is
         | getting generics in the next iteration. Go is actually quite
         | close to being useful for writing a high-performance database.
         | It still has the downside of having a garbage collector, so
         | from that point-of-view I wouldn't pick it.
         | 
         | > If you are familiar with how Scylla uses the direct I/O and
         | asynchronous I/O, this is not something that Go is great at
         | right now. I imagine that it will evolve. So I wouldn't pick
         | Javascript or Go.
         | 
         | > However, the other language you mentioned, Rust, does have
         | all of the correct characteristics that Scylla requires.
         | Precise control over what happens. It doesn't have a garbage
         | collector so it means that you have predictability over how
         | much time your things take, like allocation. You don't have
         | pause times. And it is a well-designed language. I think it is
         | better than C++ which we are currently using. So if we were
         | starting at this point in time, I would take a hard look at
         | Rust, and I imagine that we would pick it instead of C++. Of
         | course, when we started Rust didn't have the maturity that it
         | has now, but it has progressed a long time since then and I'm
         | following it with great interest. I think it's a well-done
         | language.
        
         | milesward wrote:
         | We're using it with several customers: fast, reliable,
         | straightforward.
        
       ___________________________________________________________________
       (page generated 2021-08-24 23:00 UTC)