[HN Gopher] Scylla - Real-Time Big Data Database ___________________________________________________________________ Scylla - Real-Time Big Data Database Author : andrewstuart Score : 56 points Date : 2021-08-24 20:24 UTC (2 hours ago) (HTM) web link (www.scylladb.com) (TXT) w3m dump (www.scylladb.com) | criticaltinker wrote: | The benchmarks against DynamoDB, Bigtable, & CockroachDB [1] | appear quite impressive - anyone have real world experience that | can attest to these claims of improved performance and reduced | cost? | | > Scylla vs DynamoDB - Database Benchmark | | _> 20x better throughput in the hot-partition test_ | | _> Scylla Cloud is 1/7 the expense of DynamoDB when running | equivalent workloads_ | | _> Scylla Cloud: Average replication latency of 82ms. DynamoDB: | Average latency of 370ms._ | | > Scylla vs Bigtable - Database Benchmark | | _> Scylla Cloud performs 26X better than Google Cloud Bigtable | when applied with real-world, unoptimized data distribution_ | | _> Google BigTable requires 10X as many nodes to accept the same | workload as Scylla Cloud_ | | _> Scylla Cloud was able to sustain 26x the throughput, and with | read latencies 1/800th and write latencies less than 1/100th of | Cloud Bigtable_ | | > Scylla vs CockroachDB - Database Benchmark | | _> Loading 10x the data into Scylla took less than half the time | it took for CockroachDB to load the much lesser dataset._ | | _> Scylla handled 10x the amount of data._ | | _> Scylla achieved 9.3x the throughput of CockroachDB at 1/4th | the latency._ | | [1] https://www.scylladb.com/product/benchmarks/ | kasey_junk wrote: | I was unwilling to sign up to read the actual benchmark report | for the comparison to cockroachdb but it jumped out at me as | odd. They solve completely different kinds of problems in my | experience so I'm not surprised Scylla did better in raw | throughput. That's not interesting though. 
It would be just as | weird for cockroach to put up a benchmark showing it | outperforms in distributed sql queries. | | That said I've seen the value Scylla brings in its core value | prop, replacing Cassandra. It's real good at that. | biggestdummy wrote: | Full report is posted here with no registration wall: | https://www.scylladb.com/2021/01/21/cockroachdb-vs-scylla- | be... And they admit that it's an odd comparison. "Obviously, | the comparison is of the apples and oranges type..." | mianos wrote: | It is interesting that this is here on the front page and an | old article about Discord moving to Cassandra is also here, | considering Discord went from Cassandra to Scylla, I believe. | andrewstuart wrote: | I posted this because I'm interested to hear from anyone using it | - how has it worked out for you? | | I note it's written in C++ which is a bit of a surprise - I'd | expected Rust or Golang. | | Interesting as well is that it is AGPL - licensing is always contentious: | | https://github.com/scylladb/scylla/blob/master/LICENSE.AGPL | zinclozenge wrote: | I think the main reason it's in C++ is because of its async | executor, Seastar. There's a similar Rust project called | Glommio but it seems still very early. | biggestdummy wrote: | Glommio was created by Glauber Costa, one of the early | contributors to Seastar (and Scylla). The resemblance between | the two is not coincidence. | https://glaubercosta-11125.medium.com/c-vs-rust-an-async- | thr... | krapht wrote: | Only on Hackernews would somebody be surprised that high- | performance system software would be written in C++... | masterof0 wrote: | You read my mind. LOL. "Mr. Developer, can you please write | your project in Rust, or __insert_your_meme_language_here__, | or Javascript?" 
| ethelward wrote: | From the mouth of ScyllaDB's CTO: ``So if we were | starting at this point in time, I would take a hard look at | Rust, and I imagine that we would pick it instead of C++.'' | jandrewrogers wrote: | If you are building a database engine that strongly prioritizes | performance, and Scylla does position itself that way, then C++ | is the only practical choice today for many people, depending | on the details. It isn't that C++ is great, though modern | versions are pretty nice, but that it wins by default. | | Garbage collected languages like Golang and high-performance | database kernels are incompatible because the GC interferes | with core design elements of high-performance database kernels. | In addition to a significant loss of performance, it introduces | operational edge cases you don't have to deal with in non-GC | languages. | | Rust has an issue unique to Rust in the specific case of high- | performance database kernels. The internals of high-performance | databases are full of structures, behaviors, and safety | semantics that Rust's safety checking infrastructure is not | designed to reason about. Consequently, to use Rust in a way | that produces equivalent performance requires marking most of | the address space as "unsafe". And while you could do this, | Rust is currently less expressive than modern C++ for this type | of code anyway, so it isn't ergonomic either. | | C++ is just exceptionally ergonomic for writing high- | performance database kernels compared to the alternatives at | the moment. | nhourcard wrote: | At QuestDB we chose zero-GC Java for 80% of the code base, | which resulted in superior performance on ingestion compared | to the alternatives. | dralley wrote: | Zig might be a good option -- eventually, once it's past 1.0. | enedil wrote: | Quoting the interview with ScyllaDB CTO, Avi Kivity ( | https://www.scylladb.com/2020/06/30/ask-me-anything-with-avi... 
| ) | | > Q: Would you implement Scylla in Go, Rust or Javascript if | you could? | | > Avi: Good question. I wouldn't implement Scylla in | Javascript. It's not really a high-performance language, but I | will note that Node.js and Seastar share many characteristics. | Both are using a reactor pattern and designed for high | concurrency. Of course the performance is going to be very | different between the two, but writing code for Node.js and | writing code for Seastar is quite similar. | | > Go also has an interesting take on concurrency. I still | wouldn't use it for something like Scylla. It is a garbage- | collected language so you lose a lot of predictability, and you | lose some performance. The concurrency model is great. The | language lacks generics. I like generics a lot and I think they | are required for complex software. I also hear that Go is | getting generics in the next iteration. Go is actually quite | close to being useful for writing a high-performance database. | It still has the downside of having a garbage collector, so | from that point-of-view I wouldn't pick it. | | > If you are familiar with how Scylla uses the direct I/O and | asynchronous I/O, this is not something that Go is great at | right now. I imagine that it will evolve. So I wouldn't pick | Javascript or Go. | | > However, the other language you mentioned, Rust, does have | all of the correct characteristics that Scylla requires. | Precise control over what happens. It doesn't have a garbage | collector so it means that you have predictability over how | much time your things take, like allocation. You don't have | pause times. And it is a well-designed language. I think it is | better than C++ which we are currently using. So if we were | starting at this point in time, I would take a hard look at | Rust, and I imagine that we would pick it instead of C++. 
Of | course, when we started Rust didn't have the maturity that it | has now, but it has progressed a long time since then and I'm | following it with great interest. I think it's a well-done | language. | milesward wrote: | We're using it with several customers: fast, reliable, | straightforward. ___________________________________________________________________ (page generated 2021-08-24 23:00 UTC)