[HN Gopher] Timescale raises $110M Series C
       ___________________________________________________________________
        
       Timescale raises $110M Series C
        
       Author : petercooper
       Score  : 152 points
       Date   : 2022-02-22 16:38 UTC (6 hours ago)
        
 (HTM) web link (www.timescale.com)
 (TXT) w3m dump (www.timescale.com)
        
       | riffic wrote:
       | This company should have a wikipedia article at this point,
       | right? Where is it?
       | 
       | closest I can find are these:
       | 
       | https://en.wikipedia.org/wiki/TimescaleDB
       | 
       | https://en.wikipedia.org/wiki/Draft:TimescaleDB
        
         | LoriP wrote:
         | Timescale's Community Manager here. Unfortunately that's a
         | longish story of a strongly opinionated Wikipedia editor.
        
           | riffic wrote:
           | I think I might have bumped into that particular editor
           | recently, lol.
           | 
           | Let me know if you want help in putting together a draft, I'm
           | good at finding sources[0] to establish notability.
           | 
           | [0] https://www.google.com/search?tbs=bks:1&q="TimescaleDB"+-wik...
        
             | LoriP wrote:
             | Thank you, I might take you up on that...and exchange
             | experiences lol
        
       | ashvardanian wrote:
       | Insane! $110M towards yet another Postgres extension. Theoretical
       | CS and hardware have advanced so much, but people are using
       | the same old boring approaches. Truly sad.
        
         | pella wrote:
         | proposal for "Solving PostgreSQL wicked problems"
         | 
         | https://github.com/orioledb/orioledb/blob/main/orioledb-post...
        
           | ashvardanian wrote:
           | Great report, but I think B-Trees alone will not be enough. The
           | simplest common solution is to switch to LSM Trees for higher
           | write throughput. That's exactly what Yugabyte does, by
           | putting Postgres over RocksDB. Same way as Facebook uses
           | MyRocks = MySQL + RocksDB.
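
The LSM idea mentioned above can be sketched in a few lines. This is a toy illustration only, not how RocksDB, Yugabyte, or MyRocks are implemented: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "runs" standing in for on-disk sorted files are all assumptions made for the example. Writes go to an unsorted in-memory table and are flushed as whole sorted runs, which is where the write-throughput advantage over in-place B-Tree page updates comes from.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a memtable that is flushed into immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}   # newest writes; cheap to update in place
        self.runs = []       # sorted lists of (key, value), oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A real engine would write this run sequentially to disk.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search runs newest first so the latest write wins.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Merge all runs into one; newer runs overwrite older entries.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

The trade-off is visible in `get`: reads may have to check several runs, which is why real LSM engines add compaction and filters on top of this basic shape.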
        
         | Beltalowda wrote:
         | What do you think better alternatives/approaches are?
        
           | ashvardanian wrote:
           | Literally anything. There is so much to do better. Faster
           | I/O, kernel bypass and async filesystem, new persistent data-
           | structures, alternative lock-free concurrency resolution
           | schemes...
           | 
           | Disclaimer: I am highly biased, as I am
           | funding/researching/developing a DBMS myself. Out of
           | necessity though, as we constantly hit bottlenecks in the
           | persistent I/O layer. We are not selling or offering
           | anything, but will soon share some fresh internal results on
           | aforementioned topics.
           | 
           | In the meantime, here is a list of startups re-implementing
           | mostly identical ideas:
           | https://unum.cloud/post/2021-12-31-dbms-startups/
        
           | dboreham wrote:
           | PostgreSQL's process-per-connection execution model is
           | limiting.
        
         | _joel wrote:
         | True, why use proven technology with decades of production
         | usage for your data when you can use novel, theoretical CS
         | paper implementations?
        
         | pella wrote:
         | "boring" == less transactional surprise :-)
         | 
         | https://jepsen.io/analyses/postgresql-12.3
        
       | georgewfraser wrote:
       | I would love to run a time-series benchmark against a good column
       | store like Snowflake to see if purpose-built time series
       | databases are actually faster. I have a sneaking suspicion the
       | time scale databases are just reinventing the column store, and
       | that an appropriate non-sabotaged benchmark would show this.
        
         | ren_engineer wrote:
         | the main selling point is developer experience I think, rather
         | than building out a bunch of stuff on top of a more general
         | purpose tool you use a specialized DB and save time. They also
         | have some benchmarks against Clickhouse for example
        
           | ashvardanian wrote:
           | I have just looked up those charts on Timescale's website [1]
           | and I am a bit surprised. I have never used any of those DBs
           | extensively, but I have seen their sources and must say that I
           | expected bigger gaps [2]. Also worth a look: the Taxi Rides
           | Benchmark on Postgres vs Clickhouse [3].
           | 
           | [1]: https://www.timescale.com/blog/what-is-clickhouse-how-does-i...
           | [2]: https://www.pradeepchhetri.xyz/clickhousevstimescaledb/
           | [3]: https://tech.marksblogg.com/benchmarks.html
        
         | manigandham wrote:
         | > _" timescale ... are just reinventing the column store"_
         | 
         | Not reinventing but reimplementing it for Postgres, which
         | didn't have serious OLAP capabilities before. Lots of "newsql"
         | systems are combining OLTP and OLAP by starting at one side and
         | adding the other.
         | 
         | So far Timescale has column-oriented compressed storage and
         | scale-out partitioning, and they're working on matching the
         | compute part.
        
         | abraxas wrote:
         | There is a lot of value in having a columnar storage that is
         | fully ANSI SQL and supports all of the goodies that you get in
         | the Postgres ecosystem.
         | 
         | NoSQL databases with their half assed SQL grammar
         | implementations are a real pain to use in real applications
         | where they often have to be handled differently in code vs the
         | RDBMS because either their syntax is slightly different or
         | their connection stack is incompatible.
        
       | akulkarni wrote:
       | (Timescale co-founder / CEO)
       | 
       | I just want to say that we wouldn't be here without the support,
       | feedback -- and yes, even the honest critiques -- from the HN
       | community. So thank you everyone.
       | 
       | As we like to say, we've come a long way in the past 5 years, but
       | we're just getting started :-)
       | 
       | And we're hiring globally for our remote-first team!
       | 
       | https://www.timescale.com/careers
        
         | 999900000999 wrote:
         | How did you get started?
         | 
         | I've dealt with databases for a very long time, but I frankly
         | find SQL extremely hard and I could never imagine forking
         | postgres to improve it.
         | 
         | Is this your first company? Did you personally write much of
         | the early code, or did you hire a team to do so?
         | 
         | As a side note, I'm seeing an insane amount of movement when it
         | comes to business to business VC funding.
         | 
         | What about an entertainment product ?
         | 
         | Hypothetically, if you were given a year off to come up with a
         | new product, do you think it would be possible?
         | 
         | I would absolutely love to read a blog post where you discuss
         | the challenges you've faced getting here.
        
           | avthar wrote:
           | Timescaler here. Linking a few posts below [0][1] that answer
           | the majority of your very good questions. The posts detail
           | why and how TimescaleDB started and why the founders chose to
           | build a time-series database on PostgreSQL.
           | 
           | [0]: https://www.timescale.com/blog/when-boring-is-awesome-buildi...
           | [1]: https://www.timescale.com/blog/40-million-to-help-developers...
        
             | 999900000999 wrote:
             | Thank you !
             | 
             | I plan on reading this all in full.
             | 
             | Edit : Very impressive way to pivot , best of luck !
        
         | sdesol wrote:
         | Congrats and I was wondering if you can comment on the current
         | team size? I'm looking at the number of contributors that have
         | created pull requests within the last four months and it is
         | shockingly low (in a good way). Based on the following:
         | 
         | https://oss.gitsense.com/insights/github?q=pull-age%3A%3C%3D...
         | 
         | It looks like there have only really been 7-10 full-time
         | contributors, and for you to have raised what you have with
         | such a small team is quite impressive. Is development happening
         | elsewhere or is my hunch correct?
         | 
         | Edit: Thanks to feedback from mfreed, below is a more accurate
         | picture of development activity:
         | 
         | https://oss.gitsense.com/insights/github?p=authors&q=pull-ag...
        
           | mfreed wrote:
           | Hi! So the team is over 100 at this point, but engineering
           | effort is spread across multiple products.
           | 
           | The core timescaledb repo [0] currently has 10-15 primary
           | engineers, with a few others working on DB hyperfunctions and
           | our function pipelining [1] in a separate extension [2]. I
           | think generally the set of outside folks who contribute to
           | low-level database internals in C is just smaller than for
           | other types of projects.
           | 
           | We also have our promscale product [3], which is our
           | observability backend powered by SQL & TimescaleDB.
           | 
           | And then there is Timescale Cloud [4], which is obviously a
           | large engineering effort, most of which does not happen in
           | public repos.
           | 
           | Interested? We're growing the teams aggressively! Fully
           | remote & global.
           | 
           | https://www.timescale.com/careers
           | 
           | --
           | 
           | [0] https://github.com/timescale/timescaledb
           | 
           | [1] https://www.timescale.com/blog/function-pipelines-building-f...
           | 
           | [2] https://github.com/timescale/timescaledb-toolkit
           | 
           | [3] https://github.com/timescale/promscale ;
           | https://github.com/timescale/tobs
           | 
           | [4] https://www.timescale.com/blog/announcing-the-new-timescale-...
        
             | sdesol wrote:
             | Hey thanks for the insights! I've added all timescale repos
             | for indexing and should have the bigger picture in a few
             | hours. Thanks again for catering to my curiosity.
        
         | pcthrowaway wrote:
         | As a user of Timescale, one of the things I find most lacking
         | with Timescale (and postgres also to be honest) is good
         | educational content on par with MongoDB University; structured
         | courses that teach you database design concepts from first
         | principles and then what problems postgres/timescale solve on
         | top of them. Hands on experience working with datasets and an
         | interactive way of learning more about the types of things you
         | can do.
         | 
         | I realize Timescale and Mongo are very different things, but
         | when I got professionally started with software 10 years ago,
         | the MongoDB courses (and the Stanford online Intro to DB
         | course) were immensely helpful. Working with Timescale
         | professionally now I'm often unsure whether I'm doing things
         | suboptimally, e.g. making tables hypertables when a regular
         | table might be better, flexibility and capabilities of indexes
         | with hypertables, and application-facing tooling.
        
           | djk447 wrote:
           | Totally fair and something that I'm actually forming a team
           | to work on! We're starting with some very foundational
           | material [1], that may well be review and it's not as formal
           | / professional as Mongo University or the like, but I am
           | going to be continuing this course and then we'll be
           | iterating more from there. I'd really love some feedback and
           | also your questions, i.e. what you want to cover or what you
           | find confusing. You can leave comments on the video or in our
           | community Slack channel[2] or forum[3]. Thanks for the
           | feedback and I hope we'll be able to do some of that for you
           | over the coming months!
           | 
           | [1]: https://www.youtube.com/watch?v=tLJm2oStD9w
           | [2]: timescaledb.slack.com
           | [3]: https://www.timescale.com/forum/
        
       | jabiko wrote:
       | While I think that TimescaleDB is a great technology, the support
       | experience of their Timescale Cloud offering was quite
       | underwhelming.
       | 
       | In one occurrence we wanted to create a VPC peering between a
       | database hosted on Timescale Cloud and our Kubernetes cluster
       | hosted on Azure. For this you need to put the Azure resource
       | group name in a form on Timescale cloud.
       | 
       | Turns out our resource group name contained an uppercase
       | character and the form on Timescale Cloud has a (broken)
       | validation that required the name to be all lowercase. We
       | couldn't easily change that name since that would have required
       | us to re-create our production AKS/K8S cluster.
       | 
       | After contacting Timescale support (as a paying customer) the
       | answer was basically: "Well, we require the resource group name
       | to be lowercase, we can't change that, sucks to be you, bye"
       | 
       | We can live without that VPC peering, so we didn't push that
       | further, but there are zero technical reasons for that
       | restriction and I would bet that it's just a broken validation
       | regex in their backend that they are unwilling to fix.
        
         | akulkarni wrote:
         | Hi there, sorry you had a negative experience with Timescale
         | Cloud.
         | 
         | It's true that on Azure we require both the resource group name
         | and also the Virtual Network name to be in lowercase.
         | 
         | But Microsoft names are case-agnostic, so this should be okay.
         | 
         | We've had other customers with this issue before and converting
         | the resource group name to be lowercase worked for them.
         | 
         | Also, sometimes technical restrictions exist for internal
         | reasons that are not obvious / hard to share externally. :-)
         | 
         | That said, I shared your message internally and someone is
         | looking into this. Stay tuned. More soon.
        
           | jabiko wrote:
           | Hi, thank you. I'm really grateful for your answer. The
           | Azure documentation seems to hint at the resource group name
           | being case insensitive, so I guess that could actually work.
           | To be honest I don't know if we tried just using a lowercase
           | version.
           | 
           | Again, I want to emphasize that I find it really great that
           | you took the time to answer here.
        
       | CyberDildonics wrote:
       | I don't understand what is difficult or non trivial about these
       | types of databases and when people try to explain it, it usually
       | just gets more confusing. Filtering values over time is just the
       | same operations that you would find in an audio editor or a one
       | dimensional version of what you find in an image editor (weighted
       | averages of values). I wonder how many customers could just use
       | sqlite but don't know anything about computers and end up buying
       | some sort of subscription to a 'new kind of database'.
       | 
       | The web page just drops as many buzzwords as possible - web3,
       | crypto, nfts, monitor soil to fight global warming - it looks
       | like a disaster to anyone who understands the basics of
       | programming.
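
The filtering described above (a weighted average of neighbouring values, as in an audio editor) can be sketched as a one-dimensional sliding window. The function name `weighted_moving_average` and its parameters are illustrative assumptions, and the sketch covers only the computation over values already in memory, not storage, ingest, or indexing.

```python
def weighted_moving_average(values, weights):
    """Slide `weights` over `values`; each output is a normalized weighted sum."""
    n = len(weights)
    total = sum(weights)
    out = []
    # Only emit outputs where the window fits entirely inside `values`.
    for i in range(len(values) - n + 1):
        window = values[i:i + n]
        out.append(sum(v * w for v, w in zip(window, weights)) / total)
    return out


# Equal weights give a plain moving average.
print(weighted_moving_average([1, 2, 3, 4], [1, 1]))  # [1.5, 2.5, 3.5]
```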
        
         | jacobr1 wrote:
         | If your data is small enough, then sure, any number of well
         | tested data platforms will work for you.
         | 
         | The problem something like timescale tries to solve is dealing
         | with "high cardinality." When you have many unique values the
         | indexing approaches needed to ensure performance start becoming
         | different. You'll run into write performance issues if you need
         | indexes on every single column, and every single combination of
         | columns, and each column has a large number of unique values.
         | While the common factor many of these datasets tend to share is
         | that they are being constantly generated by some kind of
         | sensor/probe/live-system, they tend to have a variety of other
         | dimensions that are also high-cardinality.
        
           | CyberDildonics wrote:
           | There are two different things here - the first is people not
           | needing an elaborate solution because computers are fast and
           | the second is that if someone does need a solution with less
           | overhead, why is that difficult?
           | 
           | Values over time, like audio, are a one-dimensional signal.
           | Seeking is basic data structure stuff, filtering is basic
           | signal stuff. There aren't going to be other dimensions like
           | time, which makes the other values just other channels. If
           | you need to combine dense values they can be not only
           | filtered, but filtered into individual distributions.
           | 
           | People give abstract descriptions like you have here, but I'm
           | just not seeing a difficult problem in all of this.
        
         | lopatin wrote:
         | > I don't understand what is difficult or non trivial about
         | these types of databases
         | 
         | Boy were you right about that
        
           | beanjuiceII wrote:
           | promote that person to management
        
         | cleancoder0 wrote:
         | SQLite does not support column based optimizations. Time series
         | data is insanely compressible.
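
One reason time series data compresses so well is that timestamps usually arrive at near-regular intervals. A minimal sketch of delta-of-delta encoding, one commonly used technique, follows; the function names `encode` and `decode` are assumptions for this example, and a real engine would additionally pack the small integers into variable-width bit fields, which is omitted here.

```python
def encode(timestamps):
    """Store the first timestamp, the first delta, then deltas of deltas."""
    if len(timestamps) < 2:
        return list(timestamps)
    first_delta = timestamps[1] - timestamps[0]
    out = [timestamps[0], first_delta]
    prev_delta = first_delta
    for a, b in zip(timestamps[1:], timestamps[2:]):
        delta = b - a
        out.append(delta - prev_delta)  # 0 when the interval is unchanged
        prev_delta = delta
    return out


def decode(encoded):
    """Invert encode() by re-accumulating deltas."""
    if len(encoded) < 2:
        return list(encoded)
    out = [encoded[0], encoded[0] + encoded[1]]
    delta = encoded[1]
    for dod in encoded[2:]:
        delta += dod
        out.append(out[-1] + delta)
    return out


# Readings every 15 seconds, with the last one arriving a second late.
print(encode([1000, 1015, 1030, 1045, 1061]))  # [1000, 15, 0, 0, 1]
```

After the first two entries, a regular series turns into a run of zeros and small integers, which need far fewer bits than full timestamps.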
        
         | jtlisi wrote:
         | Have you read the gorilla TSDB paper?
         | https://www.vldb.org/pvldb/vol8/p1816-teller.pdf
         | 
         | It does a good job laying out why TSDBs are used and some of
         | the tricks they leverage to store this type of data. See the
         | requirements for the service laid out in the paper:
         | 
         | * 2 billion unique time series identified by a string key.
         | 
         | * 700 million data points (time stamp and value) added per
         | minute.
         | 
         | * Store data for 26 hours.
         | 
         | * More than 40,000 queries per second at peak.
         | 
         | * Reads succeed in under one millisecond.
         | 
         | * Support time series with 15 second granularity (4 points per
         | minute per time series).
         | 
         | * Two in-memory, not co-located replicas (for disaster recovery
         | capacity).
         | 
         | * Always serve reads even when a single server crashes.
         | 
         | * Ability to quickly scan over all in memory data.
         | 
         | * Support at least 2x growth per year
         | 
         | Lots of organizations want to adopt an SRE/devops model and
         | want a similar system. Also one thing you should know is that
         | trying to accomplish this with traditional DBMS is usually
         | possible but since it is not making specifically optimized
         | trade offs it usually is more expensive and requires a lot of
         | tuning/expertise.
         | 
         | Lots of organizations (even legacy companies) have a massive
         | need for this kind of service. Also there are very cheap
         | options out there that can handle the million-metric use case
         | for basically <$100 a month in infra costs. The use case is
         | definitely there and even if it's possible with traditional
         | DBMS systems, it's usually cheaper and more performant to use a
         | dedicated TSDB.
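
To get a feel for the scale of the requirements listed above, here is a back-of-the-envelope calculation. The ingest rate and retention window come from the list; the 16 bytes per raw point (an 8-byte timestamp plus an 8-byte double) is an assumption made for this sketch, and the function names are illustrative.

```python
POINTS_PER_MINUTE = 700_000_000  # from the requirements list
RETENTION_HOURS = 26             # from the requirements list
RAW_BYTES_PER_POINT = 16         # assumption: 8-byte timestamp + 8-byte value


def retained_points(points_per_minute=POINTS_PER_MINUTE, hours=RETENTION_HOURS):
    """Number of data points held at once under the retention window."""
    return points_per_minute * 60 * hours


def size_tb(points, bytes_per_point):
    """Storage size in terabytes (10**12 bytes)."""
    return points * bytes_per_point / 1e12


points = retained_points()
print(points)                                # 1092000000000
print(size_tb(points, RAW_BYTES_PER_POINT))  # roughly 17.5 TB uncompressed
```

Passing a smaller `bytes_per_point` to `size_tb` shows how directly compression determines whether such a working set can be kept in memory.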
        
         | [deleted]
        
       | ishikawa wrote:
       | This shows that awareness of time series data is huge today,
       | unlike 10 years ago.
        
       | mfringel wrote:
       | How does Timescale handle lookup tables?
       | 
       | That is, "I have this seldom-updated list of ~10000 things, and
       | I'm going to need to join it against my time-series data."
       | 
       | With other time-series databases I've dealt with, it's an
       | afterthought at best and the answer is typically "Enrich the data
       | via flink/benthos/etc. on import and avoid using any kind of
       | join."
       | 
       | Does Timescale's use of PostgreSQL circumvent this issue, both in
       | terms of storage of lookup tables, and performance on join?
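
For reference, the query shape in question looks like the sketch below. Since TimescaleDB hypertables are queried with ordinary SQL, a lookup table can be a regular table joined in the usual way; the example uses Python's built-in `sqlite3` purely as a stand-in so that it runs anywhere, and the table and column names (`devices`, `readings`, `site`) are hypothetical. It shows only how the join is expressed and says nothing about join performance on Timescale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Small, seldom-updated lookup table.
    CREATE TABLE devices (device_id INTEGER PRIMARY KEY, site TEXT);
    -- Time-series table (would be a hypertable in TimescaleDB).
    CREATE TABLE readings (ts INTEGER, device_id INTEGER, value REAL);
    """
)
conn.executemany(
    "INSERT INTO devices VALUES (?, ?)",
    [(1, "north"), (2, "south")],
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(100, 1, 20.0), (160, 1, 22.0), (100, 2, 30.0)],
)

# Enrich at query time with a plain JOIN instead of at import time.
rows = conn.execute(
    """
    SELECT d.site, AVG(r.value)
    FROM readings AS r
    JOIN devices AS d ON d.device_id = r.device_id
    GROUP BY d.site
    ORDER BY d.site
    """
).fetchall()
print(rows)  # [('north', 21.0), ('south', 30.0)]
```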
        
       | aeyes wrote:
       | > Looking ahead, our goal is to keep innovating on top of
       | PostgreSQL and to continue adding breakthrough capabilities
       | 
       | Does Timescale contribute back to PostgreSQL or do they truly
       | only build on top of it?
       | https://www.postgresql.org/community/contributors/ only lists two
       | contributors and they both worked on Postgres before joining
       | Timescale.
        
         | yieldgap wrote:
         | On Twitter, they said they're building a team for upstream
         | contributions
         | https://twitter.com/acoustik/status/1496145349735559168?t=HG...
        
       | tkinom wrote:
       | I have written a time series logging db with sqlite and believe
       | that approach has the following advantages:
       | 
       | * System performance scales well with the latest SSD HW, as
       | compared to a cloud-based approach that is limited by
       | network/cloud speed.
       | 
       | * One can store logs per day / week / year in separate db files
       | as needed.
       | 
       | * Backup of small db files for the last few days/weeks is
       | trivial with rsync.
       | 
       | Love to hear other pro/con arguments from folks who use Timescale
       | type approach.
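
The file-per-period layout described above can be sketched as follows. This is not the commenter's code: the function names, the `logs-YYYY-MM-DD.db` naming scheme, and the two-column schema are assumptions chosen for the example, and it opens a new connection per write for simplicity rather than speed.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def db_path_for(ts, root):
    """Map a Unix timestamp to the SQLite file for its UTC day."""
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return Path(root) / f"logs-{day}.db"


def append_log(ts, message, root):
    """Append one log row to the per-day database, creating it if needed."""
    path = db_path_for(ts, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS logs (ts REAL, message TEXT)"
            )
            conn.execute("INSERT INTO logs VALUES (?, ?)", (ts, message))
    finally:
        conn.close()
    return path
```

Because each day lands in its own file, older files stop changing once the day is over, which is what makes copying them with a tool like rsync cheap.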
        
       | fabioyy wrote:
       | I'm using Timescale to store sensor/GPS logs (50 inserts/s,
       | 24/7). After 2 months, still very good.
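
As a rough sense of scale for that ingest rate, assuming "2 months" means about 60 days of uninterrupted inserts (the 60-day figure is an assumption, not something stated above):

```python
INSERTS_PER_SECOND = 50   # from the comment
SECONDS_PER_DAY = 24 * 60 * 60
DAYS = 60                 # assumption: "2 months" taken as 60 days

rows_per_day = INSERTS_PER_SECOND * SECONDS_PER_DAY
total_rows = rows_per_day * DAYS

print(rows_per_day)  # 4320000
print(total_rows)    # 259200000
```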
        
         | hardwaresofton wrote:
         | Have you written about this anywhere? I'm sure TimescaleDB
         | would love to signal boost that post, and I separately would
         | love to read about how you have it set up and the nitty gritty
         | of the setup.
         | 
         | How are you dealing with backups/WAL and general DB
         | administration? Are you using Timescale Cloud?
        
       | mparnisari wrote:
       | Wonder if Timescale would be a good answer to
       | https://stackoverflow.com/questions/70841804/is-aws-timestre...
        
         | akulkarni wrote:
         | In our benchmarks (which you and others are welcome to
         | replicate), Timescale vastly outperformed AWS Timestream:
         | 
         | https://www.timescale.com/blog/timescaledb-vs-amazon-timestr...
        
           | mfreed wrote:
           | To replicate, please see the Time Series Benchmark Suite,
           | which is open-source and has many vendor-contributed
           | configurations:
           | 
           | - https://github.com/timescale/tsbs
           | 
           | - https://github.com/timescale/tsbs/blob/master/docs/timestrea...
        
       ___________________________________________________________________
       (page generated 2022-02-22 23:00 UTC)