[HN Gopher] 15k inserts/s with Rust and SQLite (2021)
       ___________________________________________________________________
        
       15k inserts/s with Rust and SQLite (2021)
        
       Author : mattrighetti
       Score  : 63 points
        Date   : 2023-04-01 12:51 UTC (1 day ago)
        
 (HTM) web link (kerkour.com)
 (TXT) w3m dump (kerkour.com)
        
       | jeffbee wrote:
       | Anyone able to replicate this? With XFS on an NVMe device I only
       | get 500 inserts per second. I wonder if the 15000 figure is
       | attributable to the scaleway virtual server SCSI device ignoring
       | syncs.
        
         | nordsieck wrote:
         | > pragma temp_store = memory;
         | 
         | Did you do this?
        
           | jeffbee wrote:
           | I am just taking their repo and command directly from their
           | blog post.
        
           | cldellow wrote:
           | I'm probably missing something, but why would the temp_store
           | pragma be relevant? They don't appear to be using temp tables
           | in this benchmark.
        
         | cldellow wrote:
         | They're using WAL and synchronous = NORMAL, so there shouldn't
         | be that many syncs to begin with.
         | 
         | By default, I think it'd sync only on checkpoints, and a
         | checkpoint would happen for every 1000 pages of WAL. 1000 is
         | the default value of the wal_autocheckpoint pragma.
         | 
         | I am very surprised that you cannot reproduce more than 500
         | inserts/second on an NVMe drive with the same settings. I
         | didn't try running this benchmark, as I'm not familiar with
         | rust, but I've achieved ~15k/sec write transactions with python
         | bindings under similar settings.
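cldellow's point above can be checked mechanically. This is a minimal Python sketch (stdlib sqlite3; the temp-file path is illustrative, not from the article's repo) that applies the pragmas under discussion and reads them back:

```python
import os
import sqlite3
import tempfile

# WAL mode needs a real file; it is a no-op for in-memory databases.
path = os.path.join(tempfile.mkdtemp(), "bench.db")
conn = sqlite3.connect(path, isolation_level=None)  # autocommit; txns managed by hand

# The settings discussed in the thread: WAL journal, relaxed syncing,
# and the default checkpoint threshold of 1000 WAL pages.
conn.execute("pragma journal_mode = wal")
conn.execute("pragma synchronous = normal")
conn.execute("pragma wal_autocheckpoint = 1000")

print(conn.execute("pragma journal_mode").fetchone()[0])        # -> wal
print(conn.execute("pragma wal_autocheckpoint").fetchone()[0])  # -> 1000
```

With synchronous = NORMAL in WAL mode, SQLite syncs at checkpoints rather than at every commit, which is why a 500 inserts/s result looks more like a per-commit-fsync number than a WAL one.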
        
       | nprescott wrote:
       | This reminded me of a prior discussion[0] on bulk data generation
       | in SQLite with Rust (vs Python vs PyPy) which previously led me
       | to trying out two different techniques using just SQLite[1]. The
       | approach here is so similar I tried my prior solution on the
       | slowest VPS I have access to (1 vCPU core 2.4GHz, 512Mi,
        | $2.50/month from vultr):
        | 
        |     sqlite> create table users (id blob primary key not null,
        |             created_at text not null, username text not null);
        |     sqlite> create unique index idx_users_on_id on users(id);
        |     sqlite> pragma journal_mode=wal;
        |     sqlite> .load '/tmp/uuid.c.so'
        |     sqlite> .timer on
        |     sqlite> insert into users(id, created_at, username)
        |             select uuid(), strftime('%Y-%m-%dT%H:%M:%fZ'), 'hello'
        |             from generate_series limit 100000;
        |     Run Time: real 1.159 user 0.572631 sys 0.442133
       | 
       | where the UUID extension comes from the SQLite authors[2] and
       | generate_series is compiled into the SQLite CLI. It is possible
        | further pragma-tweaking might eke out more performance, but I
        | feel like this is representative of the no-optimization scenario
        | I typically find myself in.
       | 
       | In the interest of finding where the bulk of the time is spent
       | and on a hunch I tried swapping the UUID for plain auto-
        | incrementing primary keys as well:
        | 
        |     sqlite> insert into users(created_at, username)
        |             select strftime('%Y-%m-%dT%H:%M:%fZ'), 'hello'
        |             from generate_series limit 100000;
        |     Run Time: real 0.142 user 0.068090 sys 0.025507
       | 
       | Clearly UUIDs are not free!
       | 
       | [0]: https://news.ycombinator.com/item?id=27872575
       | 
       | [1]: https://idle.nprescott.com/2021/bulk-data-generation-in-
       | sqli...
       | 
       | [2]: https://sqlite.org/src/file/ext/misc/uuid.c
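The UUID-versus-rowid gap also reproduces outside the SQLite CLI. Here is a rough Python equivalent of the two runs above (stdlib only, so uuid() and generate_series are replaced by Python-side generation; the absolute numbers are not comparable to the CLI timings):

```python
import sqlite3
import time
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table u1 (id blob primary key not null, username text not null);
    create table u2 (id integer primary key, username text not null);
""")

n = 100_000

# Random 16-byte UUID keys, landing in random order all over the index b-tree.
t0 = time.perf_counter()
conn.executemany("insert into u1 values (?, ?)",
                 ((uuid.uuid4().bytes, "hello") for _ in range(n)))
t1 = time.perf_counter()

# Sequential rowid keys, always appended at the right edge of the b-tree.
conn.executemany("insert into u2 (username) values (?)",
                 (("hello",) for _ in range(n)))
t2 = time.perf_counter()

print(f"uuid pk: {t1 - t0:.3f}s  autoincrement pk: {t2 - t1:.3f}s")
```

Part of the gap is the cost of generating UUIDs at all, and part is the random key order. The separate `create unique index idx_users_on_id` in the session above is also redundant work: a blob primary key on a rowid table is already enforced by its own unique index.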
        
       | klabb3 wrote:
       | Decided to run this with NodeJS (on Linux) for mandatory Web
       | Scale:
       | 
       | 100k inserts using `better-sqlite3` (difference: it _does use_
       | prepared statements), all single-threaded with the same pragmas.
       | 
       | - 12.0k inserts/s with `uuid` and `Date.toUTCString()`
       | 
       | - 8.6k inserts/s with faking uuid and date (generated unique
       | strings)
       | 
        | ...So a large amount of time was spent generating UUIDs and
        | current timestamps.
       | 
       | It's fair to say that although Rust is slightly more Web Scale,
       | this benchmark simply measures the excellent performance of
       | SQLite.
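The generation-versus-insertion split klabb3 measured is easy to reproduce. A sketch in Python rather than Node (stdlib sqlite3 and uuid; the numbers will differ from the better-sqlite3 runs, but the ratio is the interesting part):

```python
import sqlite3
import time
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("create table users (id text, created_at text, username text)")

n = 50_000

# Cost of generating the values (the part klabb3 found dominant)...
t0 = time.perf_counter()
rows = [(str(uuid.uuid4()), datetime.now(timezone.utc).isoformat(), "hello")
        for _ in range(n)]
gen_time = time.perf_counter() - t0

# ...versus the cost of actually inserting them.
t0 = time.perf_counter()
conn.executemany("insert into users values (?, ?, ?)", rows)
insert_time = time.perf_counter() - t0

print(f"generate: {gen_time:.3f}s  insert: {insert_time:.3f}s")
```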
        
         | sroussey wrote:
          | Try again with bun. It uses a baked-in SQLite driver similar to
          | better-sqlite3.
         | 
         | https://bun.sh
        
       | szundi wrote:
       | I had 30k with PostgreSQL 6.5.3 like in 1999 on a PIII-400 or
       | something like that.
        
       | VWWHFSfQ wrote:
       | 15k inserts/second is basically nothing. This is just a batch
       | load of INSERT statements.
       | 
       | Now do it while simultaneously serving 15k selects/second.
       | 
       | Now try it with 15k counts/second. You'll see where this breaks
       | down fast.
        
         | phiresky wrote:
          | If WAL mode is enabled and the inserts are running in a
          | separate thread or process, then insert performance will be
          | exactly the same while any amount of selects are running.
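What phiresky is describing is WAL's reader/writer concurrency: readers run against a committed snapshot and are neither blocked by, nor able to see, an in-progress write. A small stdlib-Python demonstration (the temp-file path is illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal.db")
writer = sqlite3.connect(path, isolation_level=None)  # autocommit
writer.execute("pragma journal_mode = wal")
writer.execute("create table t (x integer)")
writer.execute("insert into t values (1)")

reader = sqlite3.connect(path, isolation_level=None)

# Writer holds an open write transaction...
writer.execute("begin immediate")
writer.execute("insert into t values (2)")

# ...and the reader is neither blocked nor sees the uncommitted row.
print(reader.execute("select count(*) from t").fetchone()[0])  # -> 1

writer.execute("commit")

# A fresh read transaction now sees the committed write.
print(reader.execute("select count(*) from t").fetchone()[0])  # -> 2
```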
        
         | ijidak wrote:
         | Can you qualify your statement that this is nothing?
         | 
         | How many CPUs and cores are available? Are we talking SSDs?
         | 
         | How much data is being inserted?
         | 
         | What are the keys and index?
         | 
            | 15,000 per second seems like plenty in some contexts.
        
           | andrewstuart wrote:
           | 15K doesn't seem particularly noteworthy.
           | 
           | I made a prototype message queue in Rust that processed about
           | 7 million messages/second. That was running from RAM.
           | 
           | I also had a prototype that ran from disk and it very rapidly
           | maxed out the random write performance of the SSD. Can't
           | remember the number but it was in the order of 15K to 30K
           | messages a second, bottlenecked by the SSD.
           | 
           | If the sqlite test is writing to disk it is likely also
           | bottlenecked at the disk.
        
             | gomezjdaniel wrote:
             | > I made a prototype message queue in Rust that processed
             | about 7 million messages/second. That was running from RAM.
             | 
             | Can you share the source code?
        
               | andrewstuart wrote:
               | It was pure garbage - a failed attempt to learn Rust.
        
             | marginalia_nu wrote:
             | To be fair it's very easy to outperform a DBMS by not
             | offering ACID guarantees.
        
           | VWWHFSfQ wrote:
            | The schema:
            | 
            |     CREATE TABLE IF NOT EXISTS users (
            |         id BLOB PRIMARY KEY NOT NULL,
            |         created_at TEXT NOT NULL,
            |         username TEXT NOT NULL
            |     );
            | 
            | The data:
            | 
            |     let user = User {
            |         id: uuid::Uuid::new_v4(),
            |         created_at: chrono::Utc::now(),
            |         username: String::from("Hello"),
            |     };
           | 
           | If you can't insert 15,000 of these records per second then
           | there's something wrong with your database. I'm aware that
           | people are all on the SQLite hype train, but this kind of
           | stuff is table-stakes.
           | 
           | I really like SQLite. But this blog isn't any kind of great
           | performance indicator for it.
        
             | ZephyrBlu wrote:
             | I think a lot of people have forgotten or don't know how
             | powerful CPUs are. Simple operations should be insanely
             | fast, like easily sub-millisecond if not sub-nanosecond
             | fast.
        
             | snacktaster wrote:
              | This entire post reduces to:
              | 
              |     insert into users (id, created_at, username)
              |     values ("<id>", "<created_at>", "Hello");
              | 
              | ...15,000 times.
             | 
             | And we're going to be impressed with sqlite's performance?
        
       | HopenHeyHi wrote:
        | > Is it possible to improve this micro benchmark? Of course, by
        | > bundling all the inserts in a single transaction, for example,
        | > or by using another, non-async database driver, but it does not
        | > make sense as it's not how a real-world codebase accessing a
        | > database looks. We favor simplicity over theoretical numbers.
       | 
       | I don't understand. Using transactions and prepared statements
       | you would get hundreds of thousands of inserts per second even
       | using languages like python/ruby.
       | 
       | Nothing theoretical or not-real-world about it. It is how SQLite
       | docs recommend you do it.
       | 
       | https://www.sqlite.org/faq.html#q19
       | 
       | https://rogerbinns.github.io/apsw/tips.html#sqlite-is-differ...
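The parent's claim is easy to check with nothing but the stdlib: the dominant cost of naive inserts is one implicit transaction (and, in most configurations, one sync to disk) per statement, which a single explicit transaction amortizes away. A sketch (row count kept small so the per-statement case finishes quickly; the path is illustrative):

```python
import os
import sqlite3
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "txn.db")
conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
conn.execute("create table users (id integer primary key, name text)")

n = 500

# One implicit transaction -- and one sync to disk -- per statement.
t0 = time.perf_counter()
for _ in range(n):
    conn.execute("insert into users (name) values (?)", ("hello",))
per_stmt = time.perf_counter() - t0

# The same rows inside a single explicit transaction: one sync total.
t0 = time.perf_counter()
conn.execute("begin")
for _ in range(n):
    conn.execute("insert into users (name) values (?)", ("hello",))
conn.execute("commit")
batched = time.perf_counter() - t0

print(f"autocommit: {per_stmt:.3f}s  single transaction: {batched:.3f}s")
```

This is exactly the pattern the linked FAQ entry describes: SQLite can do many thousands of inserts per second, but only a few dozen transactions per second on spinning media.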
        
         | morelisp wrote:
         | It would not be real-world in the sense that most OLTP
         | workloads are inserting (or updating) only a couple rows at a
         | time. In this sense it provides an upper bound on transactions
         | per second, which for many real-world workloads essentially
         | means requests per second (or requests*n per second for small
         | n).
         | 
         | However, it's still kind of weak because you really need to
         | interleave those with SELECTs and some UPDATEs to get a decent
         | view of how it would perform in such a case.
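A minimal version of the interleaved workload suggested above, sketched with the stdlib sqlite3 module (in-memory database, single thread, so it exercises CPU cost only, not durability or cross-connection locking):

```python
import sqlite3
import time
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("create table users (id blob primary key, username text)")

# Seed one row whose key we can look up during the run.
known_id = uuid.uuid4().bytes
conn.execute("insert into users values (?, ?)", (known_id, "hello"))

n = 20_000
t0 = time.perf_counter()
for i in range(n):
    conn.execute("insert into users values (?, ?)",
                 (uuid.uuid4().bytes, "hello"))
    if i % 4 == 0:
        # Sprinkle point reads among the writes.
        conn.execute("select username from users where id = ?",
                     (known_id,)).fetchone()
elapsed = time.perf_counter() - t0
print(f"{n / elapsed:,.0f} interleaved inserts/s")
```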
        
       | andrewstuart wrote:
       | Here is a totally unscientific benchmark in C++. Of course there
       | is "lies, damned lies and benchmarks".
       | 
       | AMD Ryzen 9 5900X 12-Core Processor
       | 
       | ./sqlitetest
       | 
       | For writing to disk: Inserted 1500000 records in 13 seconds, at a
       | rate of 115385 records per second.
       | 
       | For writing to RAM: change users.db to :memory: Inserted 1500000
       | records in 9 seconds, at a rate of 166667 records per second.
       | 
       | Credit goes not to me but to everyone's favorite AI programmer.
       | 
       | I've made a tiny attempt to optimise - likely much more can be
       | done.
       | 
       | To compile on Linux:
       | 
       | g++ sqlitetest.cpp -lsqlite3 -luuid -o sqlitetest
        |     #include <iostream>
        |     #include <cstring>
        |     #include <ctime>
        |     #include <chrono>
        |     #include <sqlite3.h>
        |     #include <uuid/uuid.h>
        | 
        |     struct User {
        |         uuid_t id;
        |         char created_at[25];
        |         char username[6];
        |     };
        | 
        |     int main() {
        |         sqlite3 *db;
        |         sqlite3_open(":memory:", &db);
        |         const char *sql = "CREATE TABLE IF NOT EXISTS users ("
        |                           "id BLOB PRIMARY KEY NOT NULL,"
        |                           "created_at TEXT NOT NULL,"
        |                           "username TEXT NOT NULL"
        |                           ");";
        |         sqlite3_exec(db, sql, nullptr, nullptr, nullptr);
        |         sqlite3_exec(db, "BEGIN", nullptr, nullptr, nullptr);
        | 
        |         User user;
        |         char uuid_str[37];
        |         time_t now = time(nullptr);
        | 
        |         int count = 0;
        |         auto start = std::chrono::steady_clock::now();
        | 
        |         for (int i = 0; i < 1500000; i++) {
        |             uuid_generate(user.id);
        |             uuid_unparse_lower(user.id, uuid_str);
        |             strftime(user.created_at, sizeof(user.created_at),
        |                      "%Y-%m-%d %H:%M:%S", localtime(&now));
        |             strncpy(user.username, "Hello", sizeof(user.username));
        | 
        |             sqlite3_stmt *stmt;
        |             sqlite3_prepare_v2(db,
        |                 "INSERT INTO users (id, created_at, username) VALUES (?, ?, ?);",
        |                 -1, &stmt, nullptr);
        |             sqlite3_bind_blob(stmt, 1, user.id, sizeof(user.id), SQLITE_STATIC);
        |             sqlite3_bind_text(stmt, 2, user.created_at, -1, SQLITE_STATIC);
        |             sqlite3_bind_text(stmt, 3, user.username, -1, SQLITE_STATIC);
        |             sqlite3_step(stmt);
        |             sqlite3_finalize(stmt);
        |             count++;
        |         }
        | 
        |         sqlite3_exec(db, "COMMIT", nullptr, nullptr, nullptr);
        | 
        |         auto end = std::chrono::steady_clock::now();
        |         auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(end - start).count();
        |         double rate = static_cast<double>(count) / elapsed;
        |         std::cout << "Inserted " << count << " records in " << elapsed
        |                   << " seconds, at a rate of " << rate
        |                   << " records per second." << std::endl;
        | 
        |         sqlite3_close(db);
        |         return 0;
        |     }
        
         | andrewstuart wrote:
         | The thing to note of course is that this is single core. Only
         | one core lights up when running.
        
        
       | fooker wrote:
       | Is that a lot?
        
         | TinkersW wrote:
          | 15 per millisecond sounds horrible... and it looks like it is
          | actually using 3 threads, so it's even worse.
        
         | gnuvince wrote:
         | It's not enormous, but it's good for the simplicity and would
         | probably be sufficient for quite a number of use cases. And I'd
         | guess that it's also probably more than what many programmers
         | get by Lego stacking 14 different abstractions.
        
         | natas wrote:
         | Doesn't strike me as a lot honestly.
        
         | shpx wrote:
          | This is running on an AMD EPYC 7543 CPU, which has 32 cores
          | that can each cycle 2,794,750,000 times per second. It's
          | running with 3 threads (if I understand correctly), so as a
          | rough estimate it's doing one insert every 560,000 or so CPU
          | cycles (2,794,750,000 * 3 / 15,000).
        
       | trollied wrote:
       | No idea why this is on the front page. Doesn't seem like anything
       | special. People seem to blindly upvote anything "with Rust", and
       | I've no idea why.
        
         | berkle4455 wrote:
         | It's also that most devs don't understand reasonable estimates
         | on how many inserts/sec a database should be able to handle or
         | how many requests/sec a website should be able to handle, and
          | they upvote and wax ecstatic about some of the most unimpressive
         | stats ever.
        
         | 0xfedbee wrote:
         | It's a known tactic to get anything on HN frontpage. Just put
         | Rust in the title, even if it's about a food recipe.
        
         | web3-is-a-scam wrote:
         | Anything with Rust or SQLite is basically instant front page.
        
       | 32gbsd wrote:
        | Yeah, but has Rust gotten simpler?
        
         | nindalf wrote:
         | They're working on it. For example, knowing that you can use
         | async in some contexts (functions) but not in others (traits)
         | is complexity that needs to be learned. By making it work
         | everywhere, the language becomes simpler.
         | 
         | But no, there will always be some inherent complexity in the
         | language because it has new concepts not present elsewhere.
         | Newer tutorials in novel formats are being made all the time
         | and they help with learning, but there's no way around having
         | to learn new things.
        
           | candiddevmike wrote:
           | I thought there was contention within the community about
           | async?
        
             | nindalf wrote:
             | I think async exists and it works well. There were 2
             | equally popular runtimes earlier but now there's only 1
             | that everyone uses. I wouldn't characterise that as
             | contention, just people trying different stuff out.
        
               | galangalalgol wrote:
               | I still think async/await is a bit of a fad outside web
               | services, and one I can't ignore if I want to use some
               | popular crates.
        
               | vlovich123 wrote:
               | Tokio is super popular but there are still better
               | runtimes at the moment. The proper design you want is
               | thread per core with some background pool of threads
               | available for work stealing / batch jobs. The reason is
               | that you can completely remove the need for any locking
                | in your critical path. I think tokio is working on making
                | that paradigm available, but I don't know how that's
                | progressing.
        
         | timeon wrote:
         | For me, everyday.
        
       | FpUser wrote:
        | Translation: SQLite is capable of 15K uniform batch inserts/s
        | under some configuration. I have no clue what is so impressive
        | here.
       | 
        | And what magical potion does Rust add to this result?
        
         | pornel wrote:
         | The article is not meant to show off how fast it is, but how
         | easy it is to achieve a good-enough level of performance
         | without anything special. It's meant to show that you don't
         | need specialized "infinitely scalable" services.
        
           | FpUser wrote:
           | >"It's meant to show that you don't need specialized
           | "infinitely scalable" services."
           | 
           | 1) This "revelation" has been known to any decent programmer
           | for ages.
           | 
            | 2) The example is very contrived and has zero practical
            | value, as the access pattern, number of tables involved,
            | constraints, and ACID requirements in real life are very
            | different.
        
       ___________________________________________________________________
       (page generated 2023-04-02 23:00 UTC)