[HN Gopher] Launch HN: QuestDB (YC S20) - Fast open source time ...
       ___________________________________________________________________
        
       Launch HN: QuestDB (YC S20) - Fast open source time series database
        
       Hey everyone, I'm Vlad and I co-founded QuestDB
       (https://questdb.io) with Nic and Tanc. QuestDB is an open source
       database for time series, events, and analytical workloads with a
       primary focus on performance (https://github.com/questdb/questdb).

       It started in 2012, when an energy trading company hired me to
       rebuild their real-time vessel tracking system. Management wanted
       me to use a well-known XML database that they had just bought a
       license for. That option would have required taking production
       down for about a week just to ingest the data, and a week of
       downtime was not an option. With no more money to spend on
       software, I turned to alternatives such as OpenTSDB, but they were
       not a fit for our data model. There was no solution in sight to
       deliver the project.

       Then I stumbled upon Peter Lawrey's Java Chronicle library [1].
       Using memory-mapped files, it loaded the same data in 2 minutes
       instead of a week. Beyond the performance aspect, I found it
       fascinating that such a simple method solved multiple issues
       simultaneously: writes are fast, reads can happen even before the
       data is committed to disk, code interacts with memory rather than
       IO functions, and there are no buffers to copy. Incidentally, this
       was my first exposure to zero-GC Java.

       But there were several issues. First, at the time it didn't look
       like the library was going to be maintained. Second, it used Java
       NIO instead of the OS API directly, which adds overhead: NIO
       creates an individual object whose sole purpose is to hold a
       memory address for each memory page. Third, although the NIO
       allocation API was well documented, the release API was not. It
       was really easy to run out of memory and hard to manage memory
       page release.

       I decided to ditch the XML DB and started to write a custom
       storage engine in Java, similar to what Java Chronicle did. This
       engine used memory-mapped files, off-heap memory and a custom
       query system for geospatial time series. Implementing it was a
       refreshing experience: I learned more in a few weeks than in years
       on the job.

       Throughout my career, I had mostly worked at large companies where
       developers are "managed" via itemized tasks sent as tickets. There
       was no room for creativity or initiative. In fact, it was in one's
       best interest to follow the ticket's exact instructions, even if
       they were complete nonsense. I had just been promoted to a
       managerial role and regretted it after a week. After so much time
       hoping for a promotion, I immediately wanted to go back to the
       technical side. I became obsessed with learning new stuff again,
       particularly in the high performance space.

       With some money set aside, I left my job and started to work on
       QuestDB solo. I used Java and a small C layer to interact directly
       with the OS API, without passing through a selector API. Although
       existing OS API wrappers would have been easier to get started
       with, their overhead increases complexity and hurts performance. I
       also wanted the system to be completely GC-free. To do this, I had
       to build off-heap memory management myself and could not use
       off-the-shelf libraries. I had to rewrite many of the standard
       ones over the years to avoid producing any garbage.
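
       To make this concrete, here is a minimal sketch of the idea (not
       QuestDB's actual code; the native binding is hypothetical):
       reading a memory-mapped column through a raw address, with no
       per-page NIO objects for the GC to track.

           import java.lang.reflect.Field;
           import sun.misc.Unsafe;

           // NIO route: channel.map(...) returns one MappedByteBuffer
           // object per mapped page, which the GC must track and whose
           // release is awkward. Direct route: a tiny C shim exposes
           // mmap(2)/munmap(2), and Java touches pages through Unsafe.
           final class MMapSketch {
               static native long mmap0(long fd, long offset, long len); // hypothetical JNI shim
               static native void munmap0(long address, long len);       // hypothetical JNI shim

               private static final Unsafe UNSAFE = loadUnsafe();

               // read the i-th long of a mapped column file
               static long readLong(long baseAddress, long i) {
                   return UNSAFE.getLong(baseAddress + i * 8L);
               }

               private static Unsafe loadUnsafe() {
                   try {
                       Field f = Unsafe.class.getDeclaredField("theUnsafe");
                       f.setAccessible(true);
                       return (Unsafe) f.get(null);
                   } catch (ReflectiveOperationException e) {
                       throw new AssertionError(e);
                   }
               }
           }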
       As I had my first kid, I had to take contracting gigs to make ends
       meet over the following 6 years. All the stuff I had been learning
       boosted my confidence, and I started performing well at
       interviews. This allowed me to get better-paying contracts, so I
       could take fewer jobs and free up more time to work on QuestDB
       while looking after my family. I would do research during the day
       and implement it in QuestDB at night. I was constantly looking for
       the next thing that would take performance closer to the limits of
       the hardware.
       A year in, I realised that my initial design was actually flawed
       and had to be thrown away. It had no concept of separation between
       readers and writers and would thus allow dirty reads. Storage was
       not guaranteed to be contiguous, and pages could be of various
       non-64-bit-divisible sizes. It was also very much cache-
       unfriendly, forcing the use of slow row-based reads instead of
       fast columnar and vectorized ones. Commits were slow, and as
       individual column files could be committed independently, they
       left the data open to corruption.

       Although this was a setback, I got back to work. I wrote the new
       engine to allow atomic and durable multi-column commits, provide
       repeatable read isolation, and make commits instantaneous. To do
       this, I separated transaction files from the data files. This made
       it possible to commit multiple columns simultaneously as a simple
       update of the last committed row id. I also made storage dense by
       removing overlapping memory pages and writing data byte by byte
       over page edges.
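
       In essence, the commit protocol boils down to publishing a new row
       count once the column appends are in place - something like this
       sketch (hypothetical names, not the actual QuestDB code):

           import java.nio.MappedByteBuffer;

           // Column files append independently; the commit itself is one
           // 8-byte store into the separate transaction file. Readers
           // only trust rows below the published count, so half-written
           // column data is never visible: atomic multi-column commits
           // and repeatable reads, without locking writers out.
           final class WriterSketch {
               private final MappedByteBuffer txn;       // last committed row id
               private final MappedByteBuffer[] columns; // one mapped file per column
               private long pendingRows;

               WriterSketch(MappedByteBuffer txn, MappedByteBuffer[] columns) {
                   this.txn = txn;
                   this.columns = columns;
               }

               void append(long[] row) {
                   int at = (int) ((txn.getLong(0) + pendingRows) * 8);
                   for (int i = 0; i < columns.length; i++) {
                       columns[i].putLong(at, row[i]); // fixed-width column data
                   }
                   pendingRows++;
               }

               void commit() {
                   txn.putLong(0, txn.getLong(0) + pendingRows); // the commit
                   pendingRows = 0;
               }
           }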
       This new approach improved query performance. It made it easy to
       split data across worker threads and to optimise the CPU pipeline
       with prefetch. It unlocked column-based execution and additional
       virtual parallelism with SIMD instruction sets [2], thanks to
       Agner Fog's Vector Class Library [3]. It made it possible to
       implement more recent innovations like our own version of Google's
       SwissTable [4]. I published more details when we released a demo
       server a few weeks ago on Show HN [5]. This demo is still
       available to try online, with a pre-loaded dataset of 1.6 billion
       rows [6]. Although it was hard and discouraging at first, this
       rewrite turned out to be the second best thing that happened to
       QuestDB.

       The best thing was that people started to contribute to the
       project. I am really humbled that Tanc and Nic left our previous
       employer to build QuestDB. A few months later, former colleagues
       of mine left their stable low-latency jobs at banks to join us. I
       take this as a huge responsibility and I don't want to let these
       guys down. The amount of work ahead gives me headaches and
       goosebumps at the same time.
       QuestDB is deployed in production, including at a large fintech
       company. We've been focusing on building a community to get our
       first users and gather as much feedback as possible.

       Thank you for reading this story - I hope it was interesting. I
       would love to read your feedback on QuestDB and to answer
       questions.

       [1] https://github.com/peter-lawrey/Java-Chronicle
       [2] https://news.ycombinator.com/item?id=22803504
       [3] https://www.agner.org/optimize/vectorclass.pdf
       [4] https://github.com/questdb/questdb/blob/master/core/src/main...
       [5] https://news.ycombinator.com/item?id=23616878
       [6] http://try.questdb.io:9000/
        
       Author : bluestreak
       Score  : 256 points
       Date   : 2020-07-28 13:57 UTC (9 hours ago)
        
       | nlitened wrote:
       | Do you measure performance vs k/shakti?
        
         | santafen wrote:
         | No one is allowed to benchmark them except them. :-)
        
       | jrexilius wrote:
       | This looks great, but more importantly good luck! There seems to
        | be a market need for this and it looks like a solid
        | implementation at first glance. You're off to a good start. I
        | hope you and your
       | team are successful!
        
         | santafen wrote:
         | Thanks!
        
       | neurostimulant wrote:
        | Congrats! I've been looking for a time series database, but
        | most of them seem to be in-memory NoSQL databases. QuestDB
        | might be exactly what I need. I'll definitely give it a try
        | soon!
        
         | bluestreak wrote:
          | Thank you! We have a quite active Slack where we try to
          | listen to and help our community. Feel free to join!
        
         | pachico wrote:
         | It would then be in your interest to know ClickHouse. I
          | recommend you have a look at it.
        
           | j1897 wrote:
            | We've had one of their contributors benchmark QuestDB
            | versus ClickHouse recently - you can find the results here:
            | https://github.com/questdb/questdb/issues/436
            | 
            | This came from a benchmark against them we had on our
            | previous website, about summing 1 billion doubles.
        
       | Random_ernest wrote:
       | Testing out the demo:
       | 
       | SELECT * FROM trips WHERE tip_amount > 500 ORDER BY tip_amount
       | DESC
       | 
       | Very interesting :-)
        
         | 120bits wrote:
         | For some reason this query is taking too long to execute. Not
         | sure if I missed something.
        
           | santafen wrote:
            | When I ran it, it took about 20s total.
        
         | santafen wrote:
         | Some of those are absolutely monstrous tips!
        
         | joan4 wrote:
          | thanks for trying the live demo. That's a very interesting
          | result indeed. Btw, we are working on applying SIMD
          | operations to filter queries (the WHERE clause), which will
          | speed up the runtime of queries like that considerably.
        
       | js4ever wrote:
       | https://try.questdb.io:9000/ is down
        
         | joan4 wrote:
         | please try http instead of https
        
         | bluestreak wrote:
         | http://try.questdb.io:9000/
        
         | [deleted]
        
       | aloukissas wrote:
       | This is great! Quick question: would you mind sharing why you
       | went with Java vs something perhaps more performant like all
       | C/C++ or Rust? I'd suspect language familiarity (which is 100%
       | ok).
        
         | aloukissas wrote:
         | You may also want to check out NetData (a hugely popular OSS
         | project) for ideas how to grow.
        
         | bluestreak wrote:
         | Java was the starting point. Back in the day Rust wasn't a
         | thing and C++ projects were quite expensive to maintain. What
         | Java does for us is IDE support, instant compilation time and
          | super easy test coverage. For things that do require
          | ultimate performance we do use C/C++ though. These libraries
          | are packaged with Java and are transparent to the end user.
        
           | aloukissas wrote:
           | Makes sense, that's what I also guessed.
        
       | zumachase wrote:
       | Hi Vlad - your anecdote about ship tracking is interesting (my
       | other startup is an AIS based dry freight trader). You must know
       | the Vortexa guys given your BP background.
       | 
       | How does QuestDB differ from other timeseries/OLAP offerings? I'm
       | not entirely clear.
        
         | bluestreak wrote:
          | thank you, life is an interesting experience :) I used to
          | work with Fabio, Vortexa's CEO, and had to turn down an offer
          | to be their first employee in order to focus on QuestDB. They
          | are an absolutely awesome bunch of guys and deserve every bit
          | of their success!
         | 
         | What makes QuestDB different from other tools is the
         | performance we aim to offer. We are completely open on how we
         | achieve this performance and we serve community first and
         | foremost.
        
       | jedberg wrote:
       | How does your performance compare to Atlas? [0]
       | 
       | [0] https://github.com/Netflix/atlas
        
         | mpsq wrote:
          | I have not tried to benchmark Atlas, but I am not sure the
          | result would be meaningful. Atlas is an in-memory database
          | while QuestDB persists to disk; the two are not very
          | comparable.
        
           | jedberg wrote:
           | Atlas persists to disk too. Netflix stores trillions of data
           | points in it.
           | 
           | It stores recent data in memory for increased performance
           | which is replicated across instances and then persists to S3
           | for long term storage, making aggregates queryable and full
           | resolution data available with a delay for a restore from
           | storage.
        
       | hintymad wrote:
        | I'm curious how QuestDB handles dimensions. OLAP support with a
        | reasonably large number of dimensions and cardinality in the
        | range of at least thousands is a must for a modern-day time
        | series database. Otherwise, what we get is only an incremental
        | improvement over Graphite -- a darling among startups, I
        | understand, but a non-scalable, extremely hard-to-use
        | timeseries database nonetheless.
       | 
        | A common flaw I see in many time-series DBs is that they store
        | one time series per combination of dimensions. As a result, any
        | aggregation will result in scanning potentially millions of
        | time series. If a time-series DB claims that it is backed by a
        | key-value store, say, Cassandra, then the DB will have the
        | aforementioned issue. For instance, Uber's M3 used to be backed
        | by Cassandra, and therefore would give this mysterious warning
        | that an aggregation function exceeded the quota of 10,000 time
        | series, even though from the user's point of view the function
        | dealt with a single time series with a number of dimensions.
        
         | bluestreak wrote:
         | We store "dimensions" as table columns with no artificial
         | limits on column count. If you able to send all dimensions in
         | the same message, they will be stored on one row of data. If
         | dimensions are sent as separate messages, current
         | implementation will store them on different rows. This will
         | make columns sparse. We can change that if need be and "update"
         | the same row as dimensions arrive as long as they have the same
         | timestamp value.
         | 
         | There is an option to store set of dimensions separately as
         | asof/splice join separate tables.
        
           | architectonic wrote:
           | Can you handle multiple time dimensions efficiently? We have
           | 3 of them, can one get away without having to physically
           | store "slices" on one of them?
        
             | bluestreak wrote:
              | if you can send all three in the same message, via
              | Influx Line Protocol for example, we will store them as
              | 3 columns in one table. Does this help?
        
           | hintymad wrote:
           | Thanks for the explanation.
        
         | roskilli wrote:
         | FYI M3 is now backed by M3DB, a distributed quorum read/write
         | replicated time-series based columnar store specialized for
         | realtime metrics. You can associate multiple values/timeseries
          | with a single set of dimensions if you use Protobufs to write
         | data, for more see the storage engine documentation[0]. The
         | current recommendation is not to limit your queries but limit
         | the global data queried per second[1] by a single DB node by
         | using a limit on the number of datapoints (inferred by blocks
          | of datapoints per series). M3DB also uses an inverted index
          | built from mmap'd FST segments[2], similar to Apache Lucene
          | and Elasticsearch, to make multi-dimensional searches on very
          | large data sets fast (hundreds of trillions of datapoints,
          | petabytes of data). This is a bit different from traditional
          | columnar databases, which focus on column stores and are
          | rarely accompanied by a full-text-search inverted index.
         | 
         | [0]: https://docs.m3db.io/m3db/architecture/engine/
         | 
         | [1]: https://docs.m3db.io/operational_guide/resource_limits/
         | 
         | [2]: https://fosdem.org/2020/schedule/event/m3db/,
         | https://fosdem.org/2020/schedule/event/m3db/attachments/audi...
         | (PDF)
        
           | ignoramous wrote:
           | Recommended reading on FST for the curious:
           | https://blog.burntsushi.net/transducers/
        
             | roskilli wrote:
              | Thank you for mentioning that. Andrew's post is really
              | fantastic, covering many things at once: fundamentals,
              | data structures, real-world impact and examples.
        
           | hintymad wrote:
           | Thanks, @roskilli! Nice documentation.
        
       | mooneater wrote:
       | Awesome! Could you share a bit about business model?
        
         | j1897 wrote:
         | hi, co-founder of QuestDB here.
         | 
          | QuestDB is open source and therefore free for everybody to
          | use. A separate product, which uses QuestDB as a library and
          | adds features typically required for massive enterprise
          | deployments, will be distributed and sold to companies as a
          | fully managed solution.
          | 
          | Our idea is to empower developers to solve their problems
          | with QuestDB open source, and for those developers to then
          | push the product within the organisation bottom-up.
        
         | gk1 wrote:
         | I'm not associated with QuestDB, but if it's anything like the
         | other open-source startups I work with then the business model
         | is probably selling a managed or hosted version of the DB with
         | enterprise benefits like security compliance, SLAs, and
         | engineering support. In that case the open-source DB will act
         | as a driver of awareness and of demand for the commercial
         | option.
        
       | gregwebs wrote:
        | I am still hoping to see comparisons to Victoria Metrics, which
        | also shows much better performance than many other TSDBs.
        | Victoria Metrics is Prometheus-compatible whereas QuestDB now
        | supports Postgres compatibility. Both offer compatibility with
        | InfluxDB.
       | 
       | The Victoria Metrics story is somewhat similar where someone
       | tried using Clickhouse for large time series data at work and was
       | astonished at how much faster it was. He then made a
       | reimplementation customized for time series data and the
       | Prometheus ecosystem.
        
       | Maro wrote:
       | Can you add a tldr?
        
         | kgraves wrote:
         | this. i'm sure qdb is a great product, but i can't even stomach
         | reading long lines and walls of text...
        
       | monstrado wrote:
       | Any plans on integration with Apache Arrow?
        
         | bluestreak wrote:
         | It has been asked here:
         | https://github.com/questdb/questdb/issues/261. Definitely on
          | our road map. It would be good if you could share your
          | story: why do you need Arrow?
        
           | monstrado wrote:
            | No urgent reason. I've noticed a decent number of
            | technologies have adopted it in some way or another. I
            | could imagine it being
           | useful for integrating QuestDB with existing internal systems
           | which use Arrow for its in-memory/interchange format.
           | 
           | Appreciate the issue link :)
        
       | pknerd wrote:
       | Stories like these help a product to get traction. Every
       | founder/creator must come up with a story related to the product.
       | 
       | Congrats!
        
       | pachico wrote:
        | I see this as a very interesting project. I use ClickHouse as
        | OLAP and I'm very happy with it. I can tell you the features
        | that make me stick to it. If some day QuestDB offers them, I
        | might explore the possibility of switching, but never before:
        | 
        | - very fast (I guess we're aligned here)
        | - real-time materialized views for aggregation functions (this
        |   is absolutely a killer feature; it makes it quite pointless
        |   to be fast if you don't have it)
        | - data warehouse features: I can join different data sources
        |   in one query. This allows me to join, for instance, my
        |   MySQL/MariaDB domain DB with it and produce very complete
        |   reports.
        | - Grafana plugin
        | - very easy to share/scale at table level
        | - huge set of functions, from geo to URL, from ML to string
        |   manipulation
        | - dictionaries: I can load the maxdb geo DB and do real-time
        |   localisation in queries
        | 
        | I might add some more once they come to mind. Having said this,
        | good job!!!
        
         | bluestreak wrote:
         | Thank you for the kind words and constructive feedback. We are
          | here to build on feedback like this. The Grafana plugin is
          | coming soon.
        
           | pachico wrote:
            | Glad to be useful. On the other hand, I can tell you that
            | ClickHouse also misses a feature everyone in the community
            | of users wishes for, which is automatic rebalancing when
            | you add a new node (sort of what Elasticsearch does).
            | 
            | And before I forget, the ClickHouse Kafka engine is simply
            | brilliant. The possibility of just publishing to Kafka and
            | having your data not only inserted into your DB but also
            | pre-processed is very powerful.
           | 
           | Let me know if I can help you with use cases we have.
           | 
           | Cheers
        
             | bluestreak wrote:
             | This is incredibly useful, thank you! It would be awesome
             | if we could chat more about your use cases at some point.
             | Drop us a line on hello at questdb.io or join our slack.
             | Whichever is easier for you.
        
             | santafen wrote:
             | Thanks for the helpful feedback! Feel free to reach out to
             | chat more. I'm super interested in more feedback from you.
             | davidgs(at)questdb(dot)io
        
               | pachico wrote:
               | I certainly will. Cheers
        
       | nmnm wrote:
       | Loved the story and the product!
        
       | bravura wrote:
       | Can you talk about some of the ideal use cases for a time series
       | db? Versus Postgres or a graph database.
        
         | santafen wrote:
         | Great question! Time series databases are a great solution for
         | applications that need to process streams of data. IoT is a
         | popular use case. DevOps and infrastructure monitoring
         | applications as well. As has been mentioned in other comments
         | here, there are a lot of use cases in financial services as
         | well.
         | 
         | These are all applications where you're dealing with streams of
         | time-stamped data that needs to be ingested, stored, and
         | queried in huge volumes.
        
       | airstrike wrote:
       | There's an opportunity for a tool that combines this sort of
       | technology in the backend with a spreadsheet-like GUI powered by
       | formulas and all the user friendliness that comes with a non-
       | programmer interface. Wall Street would forever be changed.
       | Source: I'm one of the poor souls fighting my CPU and RAM to do
       | the same thing with Excel and non-native add-ins by {FactSet,
       | Capital IQ, Bloomberg}
       | 
        | This stuff
        | 
        |     SELECT * FROM balances
        |     LATEST BY balance_ccy, cust_id
        |     WHERE timestamp <= '2020-04-22T16:15:00.000Z'
        |     AND NOT inactive;
        | 
        | makes me literally want to cry, knowing what is possible yet
        | not being able to do it at my day job :(
        
         | bluestreak wrote:
          | We are working on building solid PostgreSQL support, to the
          | point of allowing an ODBC driver to execute this type of
          | query from Excel. This is work in progress, with not much
          | left to do.
        
           | airstrike wrote:
           | Awesome! I think about this almost on a daily basis, and
            | could very well be wrong, but from my perspective I think the
           | killer feature is integrating the querying with the financial
           | data providers I mentioned above so they could sell the whole
           | thing as the final product to end users. (EDIT: from a reply
           | to another comment, it seems like some people are onto the
           | concept: https://factset.quantopian.com)
           | 
           | If you ever install FactSet for a trial period and try
           | querying time series with even ~10,000+ data points, you'd be
           | amazed at how long it takes, how sluggish it is and how often
           | Excel crashes.
           | 
           | My _real_ perspective is Microsoft should roll something
           | similar out as part of Excel and also get in the business of
           | providing the financial data as they continue the transition
           | into services over products
        
       | posedge wrote:
       | Your story is very inspiring. I wish you all the best with this
       | project.
        
         | santafen wrote:
         | Thanks for the kind words!
        
       | monstrado wrote:
       | I noticed there is "Clustering" mentioned under enterprise
       | features, but I can't seem to find any references to it in the
       | documentation. Is this something that will be strictly closed
       | source?
        
         | bluestreak wrote:
          | There will be two different flavors of replication:
          | 
          | - TCP-based replication for WAN
          | - UDP-based replication for LAN and high-traffic environments
          | 
          | We are currently building the foundation elements of this
          | replication, such as column-first and parallel writes. These
          | will go into, and always be part of, QuestDB. TCP replication
          | will sit on top of this foundation and will also be part of
          | QuestDB. UDP-based replication will be part of a different
          | product we are building that will be named Pulsar.
        
           | monstrado wrote:
           | Thanks for your response! Last question...
           | 
           | Will the clustering target just replication (HA) or will it
           | also target sharding for scaling out storage capacity?
        
             | bluestreak wrote:
             | :)
             | 
              | Eventually both. We are starting with baby steps, e.g.
              | getting data from A to B quickly and reliably.
              | Replication/HA will be first, of course. Then we want to
              | scale queries across multiple hosts; since all nodes have
              | the same data, they may as well all participate. Sharding
              | will be last. We are thinking of taking the route of
              | virtualizing tables: each shard can be its own table, and
              | the SQL optimiser can use them as partitions of a single
              | virtual table. We already take a single table and
              | partition it for execution, so sharding seems almost like
              | a natural fit.
        
       | myth_drannon wrote:
        | https://questdb.io/docs/crudOperations has JS errors and is
        | not loading (page not found).
        
         | mpsq wrote:
         | Thanks for reporting this! This is an old link, please use
         | https://questdb.io/docs/guide/crud instead. I am currently
         | updating the README and removing all dead links.
        
       | judofyr wrote:
       | Congratulations on launching! It looks like a great product. Some
       | technical questions which I didn't see answered on my first
       | glance:
       | 
       | (1) Is it a single-server only, or is it possible to store data
       | replicated as well?
       | 
       | (2) I'm guessing that all the benchmarks were done with all the
       | hot data paged into memory (correct?); what's the performance
       | once you hit the disk? How much memory do you recommend running
       | with?
       | 
       | (3) How's the durability? How often do you write to disk? How do
       | you take backups? Do you support streaming backups? How
       | fast/slow/big are snapshot backups?
        
         | bluestreak wrote:
          | thank you!
          | 
          | - replication is in the works; it is going to be both TCP-
          | and UDP-based, column-first, and very fast.
          | 
          | - yes, benchmarks are indeed done on a second pass over the
          | mmaped pages. The first pass would trigger IO, which is
          | OS-driven and dependent on disk speed. We've seen well over
          | 1.5Gb/s on disks that support this speed. Columns are mapped
          | into memory separately and accessed lazily, so the memory
          | footprint depends on what data your SQL actually lifts. We go
          | quite far to minimize false disk reads by working with row
          | ids as much as possible. For example, 'order by' will need
          | memory for 8 x row_count bytes in most cases (about 8GB to
          | order a billion rows).
          | 
          | - durability is something we want the user to have control
          | over. Under the hood we have these commit modes:
          | 
          | https://github.com/questdb/questdb/blob/master/core/src/main...
          | 
          | NOSYNC = the OS flushes memory whenever it chooses. That
          | said, we use a sliding 16MB memory window when writing, so
          | flushes are triggered as pages get unmapped.
          | ASYNC = we call msync() with MS_ASYNC.
          | SYNC = we call msync() with MS_SYNC.
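          | 
          | In Java terms, the three modes map roughly to this (a sketch
          | with hypothetical names; the real calls go through our native
          | layer):
          | 
          |     import java.nio.MappedByteBuffer;
          | 
          |     final class CommitModes {
          |         enum Mode { NOSYNC, ASYNC, SYNC }
          | 
          |         // hypothetical JNI shim calling msync(addr, len, MS_ASYNC)
          |         static native void msyncAsync(MappedByteBuffer page);
          | 
          |         static void commit(MappedByteBuffer page, Mode mode) {
          |             switch (mode) {
          |                 case NOSYNC: break;                   // OS decides when to flush
          |                 case ASYNC:  msyncAsync(page); break; // schedule flush, return now
          |                 case SYNC:   page.force();     break; // block until data is on disk
          |             }
          |         }
          |     }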
        
           | roskilli wrote:
           | Curious: What is your strategy on replication? Is it some
           | form of synchronous replication or asynchronous (i.e.
           | active/passive with potential for data loss in event of hard
            | loss of primary)? Also curious why you might look at UDP
            | replication given that, unless you use a protocol like QUIC
            | on top of it, UDP replication would be inherently lossy
            | (i.e. not even eventually consistent).
        
             | bluestreak wrote:
              | The strategy is to multicast data to several nodes
              | simultaneously. Data packets are sequenced to allow the
              | receiver to identify data loss. When loss is detected,
              | the receiver finds breathing space to send a NACK. The
              | packet and the NACK identify the missing data chunk with
              | O(1) complexity, and the sender then re-sends. Overall
              | this method is lossless and avoids the overhead of
              | contacting nodes individually and sending the same data
              | over the network multiple times. This is useful in
              | scenarios where several nodes participate in query
              | execution and getting them up to date quickly is
              | important.
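              | 
              | The receive side, reduced to its core, looks roughly like
              | this (hypothetical names, just to illustrate the
              | sequencing):
              | 
              |     import java.nio.ByteBuffer;
              | 
              |     // Packets carry a monotonically increasing sequence number,
              |     // so a gap names the missing chunk directly: one NACK, O(1).
              |     final class ReceiverSketch {
              |         private long expectedSeq;
              | 
              |         void onPacket(long seq, ByteBuffer chunk) {
              |             if (seq == expectedSeq) {
              |                 apply(chunk);               // in order: apply immediately
              |                 expectedSeq++;
              |             } else if (seq > expectedSeq) {
              |                 nack(expectedSeq, seq - 1); // ask sender to re-send range
              |                 stash(seq, chunk);          // hold until the gap is filled
              |             }                               // seq < expectedSeq: duplicate, drop
              |         }
              | 
              |         private void apply(ByteBuffer chunk) { /* write to storage */ }
              |         private void nack(long from, long to) { /* unicast to sender */ }
              |         private void stash(long seq, ByteBuffer chunk) { /* buffer out-of-order */ }
              |     }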
        
           | biztos wrote:
           | Definitely enjoyed the story and I find the product
           | interesting! I especially like the time-series aggregation
           | clauses since it makes it easy to "think in SQL."
           | 
           | I was also going to ask about replication. Any idea when it's
           | going to be done?
           | 
           | Oh and kudos for the witty (previous) company name: Appsicle,
           | haha, love that.
        
             | patrick73_uk wrote:
             | Hi, I'm a questdb dev working on replication, we should
             | have something working within a couple of months. If you
             | have any questions feel free to ask me.
        
       | TheRealNGenius wrote:
       | Maybe I'm out of the loop, but I noticed lately that a majority
       | of show/launch hn posts I click on have text that is muted. I
       | know this happens on down voted comments, but is this saying that
       | people are down voting the post itself?
        
         | santafen wrote:
         | I don't think it's being downvoted. Maybe it's because it's an
         | actual post?
        
       | shay_ker wrote:
       | Absolutely love the story. TimescaleDB & InfluxDB have had a lot
       | of posts on HN, so I'm sure others are wondering - how do we
       | compare QuestDB to them? It sounds like performance is a big one,
       | but I'm curious to hear your take on it.
        
         | mpsq wrote:
          | As you said, performance is the main differentiator. We are
          | orders of magnitude faster than TimescaleDB and InfluxDB on
          | both data ingestion and querying. TimescaleDB relies on
          | Postgres and has great SQL support. This is not the case for
          | InfluxDB, and this is where QuestDB shines: we do not plan to
          | move away from SQL; we are very dedicated to bringing good
          | support and some enhancements to make sure the querying
          | language is as flexible and efficient as possible for our
          | users.
        
           | avthar wrote:
           | Are there any performance comparisons to TimescaleDB and
           | Influx that you can share? A blog post perhaps?
        
             | j1897 wrote:
              | hi there - co-founder of QuestDB here. The demo on our
              | website hosts a 1.6-billion-row NYC taxi dataset, with 10
              | years of weather data at around 30-minute resolution and
              | weekly gas prices over the last decade.
              | 
              | We've got examples of queries in the demo, and you can
              | see the execution times there.
              | 
              | We posted a blog post comparing the ingestion speed of
              | InfluxDB and QuestDB via InfluxDB Line Protocol some time
              | ago: https://questdb.io/blog/2019/12/19/lineprot
        
               | dominotw wrote:
               | > We are orders of magnitude faster than TimescaleDB and
               | InfluxDB
               | 
               | I think gp might be asking for a source for this claim.
               | 
                | I see execution times on the demo, but I'm not sure
                | that's enough to say it's faster than Timescale.
        
               | mpsq wrote:
               | j1897 is referring to
               | https://questdb.io/blog/2020/04/02/using-simd-to-
               | aggregate-b...
        
               | srini20 wrote:
               | (Hard to draw many meaningful conclusions from a single,
               | extremely simple query without much explanation?)
               | 
               | Graph shows PostgreSQL as taking a long time, but doesn't
               | say anything about configuration or parallelization.
               | PostgreSQL should be able to parallelize that type of
                | query since 9.6+, but I _think_ they didn't use
               | parallelization in these experiments with PostgreSQL,
               | even though they used a bunch of parallel threads with
               | QuestDB?
               | 
               | So would be good to know:
               | 
               | - What version of Postgres
               | 
               | - How many parallel workers for this query
               | 
               | - If employing JIT'ing the query
               | 
               | - If pre-warming the cache in PostgreSQL and configuring
               | it to store fully in memory (as benchmarks with QuestDB
               | appeared to do a two-pass to first mmap into memory, and
               | only accounting for the second pass over in-memory data).
               | 
               | etc
               | 
               | Database benchmarking is pretty complex (and easy to
               | bias), and most queries do not look like this toy one.
        
           | shay_ker wrote:
           | I'm sure many folks would be really interested to see two
           | things:
           | 
           | 1. A blog post around a reproducible benchmark between
           | QuestDB, TimescaleDB, and InfluxDB
           | 
           | 2. A page, like questdb.io/quest-vs-timescale, that details
           | the differences in side-by-side feature comparisons, kind of
           | like this page: https://www.scylladb.com/lp/scylla-vs-
           | cassandra/. Understandably, in the early days, this page will
           | update frequently, but that level of transparency is really
           | helpful to build trust with your users. Additionally, it'll
           | help your less technical users to understand the differences,
           | and it will be a sharable link for people to convince others
           | & management that QuestDB is a good investment.
        
             | avthar wrote:
             | Perhaps the QuestDB team could add it to the Time Series
             | Benchmarking Suite [1]? It currently supports benchmarking
             | 9 databases including TimescaleDB and InfluxDB.
             | 
             | [1] https://github.com/timescale/tsbs
        
               | mpsq wrote:
               | This is a great idea, we will have a look! It is good to
               | see that the ecosystem is moving towards a normalized /
               | "standard" benchmarking tool.
        
           | hawk_ wrote:
            | do you do realtime streaming using SQL as well?
        
             | bluestreak wrote:
              | Over-the-network streaming is not yet available. Someone
              | has mentioned Kafka support - how useful would it be to
              | stream processed (aggregated) values and/or actual table
              | changes?
        
         | gregmac wrote:
          | I'd also be interested in hearing when QuestDB is _not_ a
          | good choice. Are there use cases where TimescaleDB, InfluxDB,
          | ClickHouse or something else are better suited?
        
           | j1897 wrote:
            | Hard question to answer, because each solution is unique
            | and has its own tradeoffs. Taking a step back, QuestDB is a
            | less mature product than the ones mentioned, and therefore
            | there are many features, integrations etc. still to build
            | on our side. This is a reflection of how long we have been
            | around and the capital we have raised versus those
            | companies, which are much larger in size.
        
           | bluestreak wrote:
            | OLTP is not a good fit, i.e. if your workflow consists of
            | INSERT/UPDATE/DELETE statements.
        
       | didip wrote:
       | I find your story very interesting, thank you for sharing that.
       | 
       | It also gives an interesting background as to why questdb is
       | different than all the other competitors in the space.
        
         | bluestreak wrote:
         | Thank you for the kind words!
        
       | tosh wrote:
       | kudos @ launching, impressive
        
         | santafen wrote:
          | Thanks so much!
        
       | jankotek wrote:
        | Good luck. I have worked on a similar OS database engine for
        | about a decade now. It is not bad, but I think consulting is a
        | better way to get funds. Also avoid "zero GC"; the JVM can be
        | surprisingly good.
       | 
       | Will be in touch :)
        
       | thegreatpeter wrote:
       | Am I the only one that's like "wtf is a time-series database
       | compared to a normal one?"
        
         | avthar wrote:
         | This is actually an underrated question.
         | 
         | Time-series databases offer better performance and usability
         | for dealing with time-series data (think DevOps metrics, data
         | from IoT devices, stock prices etc, anything where you're
         | monitoring and analyzing how things change over time)
         | 
          | They allow you to answer questions where time is the main
          | component of interest much more quickly and easily:
         | 
         | eg 1: IoT Sensors) Show me the average of temperature over all
         | my devices over the past 3 days in 15 minute intervals
         | 
         | eg 2: Financial data) What's the price of stock X over the past
         | 5 years
         | 
         | eg 3: DevOps data) What's the average memory and CPU used by
         | all my servers of the past 5 mins
         | 
         | A normal database could be a purely relational database (e.g
         | Postgres) or a non-relational database (e.g MongoDB). In both
         | these cases, while you could use these databases for time-
         | series data, they tend to offer worse performance at scale and
         | a worse experience for doing common things (e.g real-time
         | aggregations of data, data retention policies etc)
         | 
         | For more on time-series data and when you'd need a time-series
         | database, check out: https://blog.timescale.com/blog/what-the-
         | heck-is-time-series...
        
           | airstrike wrote:
           | > eg 2: Financial data) What's the price of stock X over the
           | past 5 years
           | 
           | This is _so incredibly frustratingly slow_ to pull on FactSet
           | and Capital IQ, it makes me want to pull my hair every time I
           | have to build line charts over time for a period greater than
           | 2 years
        
             | avthar wrote:
              | Sounds like you need a time-series database for those
              | sorts of narrow and deep queries :)
             | 
             | What's difficult is to find a database that has good
             | performance on both narrow and deep queries (e.g Price of
             | stock X for past 5 years) as well as shallow and wide
             | queries (e.g Price of all stocks in past 15mins)
        
             | fawce wrote:
             | plug, but our system provides very fast access to price,
             | fundamentals, estimates, etc:
             | https://factset.quantopian.com
        
         | Maro wrote:
          | In general, imagine a problem space where you have millions
          | (or many more) of timeseries, each potentially millions of
          | points long (usually purely floats), and you want to perform
          | time-series specific operations, like interpolate,
          | extrapolate, moving avg, forecast, alert if unusual, plot,
          | etc. Like,
         | imagine AWS has 100s-1000s (or more) of metrics per "thing",
         | and there's a very large number of things (EC2 instances,
         | subnets, SageMaker instances, network switches, etc). Very
         | specific data model, usually append-only, very specific read
         | operations.
        
         | Ixiaus wrote:
         | Yes. Google it.
        
           | airstrike wrote:
           | https://news.ycombinator.com/newsguidelines.html
           | 
           | Please don't post shallow dismissals, especially of other
           | people's work. A good critical comment teaches us something.
        
         | bluestreak wrote:
         | "normal" database does not preserve the order of data as it
         | comes in. To get the data out you have to constantly rely on
         | indexes or "order by" constantly to get chronological order
         | back. Time series database should maintain the order of data
         | and not rely on indexes to have data make sense again.
        
         | santafen wrote:
         | Here's a Medium post on Time Series Databases:
         | https://medium.com/datadriveninvestor/what-are-time-series-d...
         | But basically, they are databases where the time dimension is a
         | critical aspect of the data. They handle never-ending streams
         | of incoming time-stamped data. Think of streaming stock prices,
         | or streaming temperature data from a sensor. You want to be
         | able to query your data specifically in a time dimension --
         | let's see the temperature fluctuations over the past 24 hours,
         | including the mean. Stuff like that.
         | 
         | A 'normal one' is typically used for things like transactional
         | data which adds, deletes, and updates data among linked tables.
         | While these transactions happen in time, the time component
         | isn't necessarily a critical dimension of the data.
        
         | viraptor wrote:
         | It organises storage so that operations like "drop all entries
         | from before X", "get entries between X and Y with tags A or B"
         | are cheap and so that storing ~linearly increasing values is
         | super efficient with stupid ratios like 1:800+ for DoubleDelta.
        
         | beagle3 wrote:
          | One differentiating feature is the "as of" join. You have
          | records of the form (time, value), and you ask "what's the
          | most recent value as of $time?". On a non-TS-oriented DBMS,
          | this query is usually slow and hard to write. Window
          | extensions to SQL can make it a little better, but you can
          | assume that a proper TSDB answers this query x10 to x10,000
          | times faster on the same hardware, especially when done in
          | bulk (e.g.: I have one million (time,bid_price) records and
          | one million (time,transaction_price) records; for each
          | transaction record, I want to know what the most recent bid
          | price was at that time).
          | 
          | That's something kdb+ and ClickHouse do in milliseconds; and
          | I assume QuestDB can too, though I didn't check.
        
       | sylvain_kerkour wrote:
       | Congrats!
       | 
       | Also thank you for your awesome blog[0]! It's really the kind of
       | technical gem I enjoy reading late at night :)
       | 
       | [0] https://questdb.io/blog
        
       | rbruggem wrote:
       | great story! well done.
        
       | [deleted]
        
       | vii wrote:
       | mmap'd databases are really quick to implement. I implemented
       | both row and column orientated databases. The traders and quants
       | loved it - and adoption took off after we built a web interface
       | that let you see a whole day and also zoom into exact trades with
       | 100ms load times for even the most heavily traded symbols.
       | 
        | The benefits of mmapping, and of POSIX filesystem atomic
        | properties in general, are quick implementation: you don't have
        | to worry about buffer management. The filesystem and disk block
       | remapping layer (in SSD or even HDDs now) are radically more
       | efficient when data are given to them in contiguous large chunks.
       | This is difficult to control with mmap where the OS may write out
       | pages at its whim. However, even using advanced Linux system
       | calls like mremap and fallocate, which try to improve the
       | complexity of changing mappings and layout in the filesystem,
       | eventually this lack of control over buffers will bite you.
       | 
       | And then when you look at it, the kernel (with help from the
       | processor TLB) has to maintain complex data-structures to
       | represent the mappings and their dirty/clean states. Accessing
       | memory is not O(1) even when it is in RAM. Making something
       | better tuned to a database than the kernel page management is a
       | significant hurdle but that's where there are opportunities.
        
         | bluestreak wrote:
          | thank you for sharing! The core of memory management is
          | abstracted away: all of the query execution logic is unaware
          | of the source of the memory pointer. That said, we are still
          | learning and really appreciate your feedback. There are some
          | places where we could not beat Julia's aggregation, but the
          | delta wasn't very big. This could have been down to mapped
          | memory. We will definitely try things with direct memory too!
        
           | vii wrote:
           | The databases I implemented experimented with various ways to
           | compile queries. Turns out that the JVM can run quite fast. I
           | feel like LLVM (Julia) is likely to be able to be better for
           | throughput and definitely better for predictability of
           | performance.
           | 
           | If you know layouts and sizes, then your generated code can
           | run really fast - using SIMD and not checking bounds is a
           | win.
           | 
           | Hugepages would greatly reduce pagetable bookkeeping, but
           | obviously may magnify writes. Wish I could have tried that!
        
             | bluestreak wrote:
              | Our best performance currently is in C++ code, and LLVM
              | is something we are considering for computing
              | expressions, such as predicates and select clause items.
              | This is most likely to be way faster than what we can
              | currently do in Java. What I would like to know is
              | whether LLVM can optimize all the way to AVX512.
              | 
              | We also need to experiment with hugepages. The beauty is
              | that if reads and writes are separated, there is no issue
              | with writes: they can still use 4k pages!
        
       | samsk wrote:
       | Does it supports some kind of compression ? That's very important
       | when storing billions of events.
        
         | mpsq wrote:
          | Not yet, but this is on the roadmap. In the meantime you
          | could use a filesystem that supports compression, such as ZFS
          | or btrfs. The data is column oriented, which means
          | compression would be super efficient.
        
       | jeromerousselot wrote:
       | Great story! Thanks for sharing
        
         | joan4 wrote:
         | Thanks Jerome!
        
       | dominotw wrote:
       | something is off with your website. I just see images
       | https://questdb.io/blog/2020/07/24/use-questdb-for-swag/
        
         | mpsq wrote:
         | What browser are you using?
        
           | dominotw wrote:
           | chrome on osx Version 84.0.4147.89 (Official Build) (64-bit)
           | 
           | works fine in safari, something is up with the dark theme.
        
             | mpsq wrote:
             | I tried with the same setup and it works fine. I tried to
             | disable JS too and it's OK. Could it be a rogue extension?
        
               | dominotw wrote:
               | ah yea you are right. I had high contrast extension
               | messing with this page.
        
       | lpasselin wrote:
       | Does postgres wire support mean QuestDB can be a drop-in
       | replacement for a postgres database?
       | 
       | Is this common?
        
         | santafen wrote:
          | QuestDB Head of DevRel here ... Yes, it can be a replacement
          | for Postgres, and it will be cheaper and faster. That being
          | said, PGwire support is still in alpha and is not 100%
          | covered yet, so while migrating is possible, 100% Postgres
          | wire protocol compatibility is not there yet.
          | 
          | For traditional transactional RDBMS data, I don't think it's
          | a very common choice. For time series data, QuestDB is by far
          | the fastest choice among Postgres-compatible SQL time series
          | databases.
        
           | lpasselin wrote:
            | This is great if it works well. A drop-in replacement
            | would be great for systems with DB abstraction layers like
            | Django or SQLAlchemy.
        
           | jaydub wrote:
            | Yeah, I checked it out and wanted to use it, but a bunch
            | of regular old SQL queries don't work. Please add support
            | for the old-fashioned GROUP BY syntax! (This will be
            | helpful for getting to a true drop-in replacement!)
        
             | joan4 wrote:
             | QuestDB dev here. We added support for GROUP BY syntax in
             | yesterday's release
        
       | maz1b wrote:
       | Hi Vlad, this looks really interesting!
       | 
        | I really enjoyed reading the backstory and the founding
        | dynamics upon which QuestDB was born, and I think a lot of
        | others in the YC community will as well.
       | 
       | Can you give some use cases or specific examples of why QuestDB
       | is unique?
        
         | bluestreak wrote:
          | thanks! What differentiates us from other time series
          | databases is performance, both for ingestion and queries. For
          | example, we can ingest application performance metrics via
          | Influx Line Protocol and query them via SQL, and both should
          | be faster than the incumbents.
        
       ___________________________________________________________________
       (page generated 2020-07-28 23:00 UTC)