hngopher.com

       [HN Gopher] Why is Snowflake so Valuable?
       ___________________________________________________________________
        
       Why is Snowflake so Valuable?
        
       Author : malisper
       Score  : 120 points
       Date   : 2020-09-30 17:50 UTC (5 hours ago)
        
 (HTM) web link (www.freshpaint.io)
 (TXT) w3m dump (www.freshpaint.io)
        
       | autokad wrote:
       | I think part of it is that many know that most companies
       | underperform the market. So I imagine it's not hard to see someone
       | justifying (correctly or incorrectly) that it's worth paying more
       | for a company you think is more likely to be one of those
       | outliers. And since there is a limited supply of these
       | companies, they can shoot up in value fast.
       | 
       | I never used snowflake, so hard for me to have a solid opinion of
       | this company. I remember when facebook IPO'ed and people were
       | like 'what? worth 100 billion? OVERVALUED'. and they were wrong
       | in every way. so who knows? Though my gut doesn't tell me this
       | company is the next facebook. coming out of the gate with a 70
       | bil market cap feels like all the growth is already priced in.
       | 
       | With facebook, at their IPO, I felt just their mobile revenue
       | in 5 years' time alone would be worth their valuation. But I had
       | weak hands. I think I bought them at 28 and sold at 22. I should
       | have had more conviction because I truly did believe they were
       | worth a lot more.
        
       | victor106 wrote:
       | > Net promoter score (NPS) is a way of measuring customer
       | satisfaction.
       | 
       | How easy/hard is it to fake an NPS score? Is this somehow
       | regulated? Can the company only provide its most satisfied
       | customers (which it knows beforehand) and only have them
       | participate to get a good NPS?
        
         | stingraycharles wrote:
         | NPS can be gamed in any way you want. If you want to use
         | it as a real metric to improve your product, it's a great
         | metric. But if you want to use it to convince shareholders your
         | company is very valuable, it's extremely easy to do so by
         | implementing certain biases.
        
         | jmt_ wrote:
         | I also have some questions/concerns over the NPS. Online
         | surveys, which are effectively the instrument of NPS, typically
         | yield statistically incorrect data, often due to some flavor
         | of self-selection bias. If you think of an online survey as an
         | experiment, they rarely allow enough control over the sample to
         | mean much. However, that's not to say it's impossible to
         | properly conduct an NPS, just that it's probably very easy to
         | get wrong which may paint a false picture.
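
To make the self-selection concern concrete, here is a minimal sketch of the standard NPS arithmetic (percentage of promoters scoring 9-10 minus percentage of detractors scoring 0-6). The customer scores below are invented for illustration and are not from the article or from Snowflake; the point is only that surveying a hand-picked subset changes the number.

```python
def nps(scores):
    """Net promoter score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical customer base: 30 promoters, 40 passives, 30 detractors.
population = [10] * 30 + [8] * 40 + [3] * 30

# Biased sample: only survey customers already known to be happy.
happy_only = [s for s in population if s >= 8]

print(nps(population))   # everyone surveyed
print(nps(happy_only))   # cherry-picked respondents
```

With these made-up scores the full population nets out to zero while the cherry-picked sample lands well above it, even though nothing about the customers changed.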
        
         | runako wrote:
         | Likely this can be easily gamed. But in the context of
         | Snowflake's value, NPS manifests in Net Retention, which is
         | likely to be more difficult to fudge:
         | 
         | > For every $1 of revenue Snowflake received from their
         | customers a year ago, that same pool of customers are now
         | paying $1.58.
         | 
         | Net Retention is more important, but in this case it also gives
         | credence to the NPS number.
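
The $1 to $1.58 figure quoted above is a cohort calculation: take the customers who were paying a year ago and compare what that same group pays now. A minimal sketch, assuming per-customer revenue is available as simple dictionaries (the customer names and amounts are hypothetical):

```python
def net_revenue_retention(revenue_then, revenue_now):
    """Revenue now from last year's cohort, divided by that cohort's revenue then.

    Customers who churned count as 0 today; customers who joined after the
    baseline period are excluded because they are not in the cohort.
    """
    then_total = sum(revenue_then.values())
    now_total = sum(revenue_now.get(customer, 0.0) for customer in revenue_then)
    return now_total / then_total


# Hypothetical numbers: "c" churned, "d" is a new customer and is ignored.
a_year_ago = {"a": 100.0, "b": 100.0, "c": 100.0}
today = {"a": 250.0, "b": 224.0, "d": 500.0}

print(net_revenue_retention(a_year_ago, today))
```

Because churned customers drag the ratio down and new customers cannot inflate it, the metric is harder to flatter than a survey score, which is the point made above.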
        
         | adarioble wrote:
         | NPS originally came from car manufacturers and industries that
         | produce products that are easily comparable, i.e. how happy are
         | you with the car? Would you recommend BMW to friends?
         | 
         | I don't think it's that great for software, but it's very
         | trendy. It can be gamed, like anything else, usually by sending
         | NPS surveys to decision makers who aren't usually the actual
         | users.
         | 
         | A slightly better metric is Customer Effort Score (CES), which
         | shows how easy it is to do business with a company.
        
       | jredwards wrote:
       | Because we're in a tech bubble. These valuations are absurd.
        
         | drawkbox wrote:
         | That and low interest rates and looking for a place to put
         | money, plus the Warren Buffett multiplier.
         | 
         | There are only ~3,600 companies on the US markets [1], half of
         | what was there in the 90s. There aren't many places to put lots
         | of money.
         | 
         | The rise of private equity (PE) really has taken lots of the
         | growth out of the public markets, but since there are so few
         | public companies, and interest rates are low, people are looking
         | for returns. Lots of it is also pump and dump schemes loading up
         | stocks and then short and distort. The problem is the growth is
         | tapped on public markets now; PE drains it before it gets there,
         | so more volatility games are being played. Couple that with less
         | purchasing power in the lower/middle classes, and that adds to
         | the games, as investment in new consumer-focused companies isn't
         | working as well when purchasing power is drained. M2V has hit a
         | precipice [2]
         | 
         | [1] https://www.wsj.com/articles/where-have-all-the-public-
         | compa...
         | 
         | [2] https://fred.stlouisfed.org/series/M2V
        
       | physcab wrote:
       | I've used Snowflake a fair amount. It's a decent product,
       | probably on par with Redshift / BigQuery. Obviously there's a lot
       | of hype and free money floating around but my take on why they
       | are popular is that they are basically a replacement for large
       | Hadoop installations that have become untenable to manage over
       | the past decade. If a company is already using Redshift or
       | BigQuery I'm not sure why they would switch.
       | 
       | I would be apprehensive about investing in Snowflake long term
       | purely because their product is highly susceptible to being
       | obsoleted in the next 5-10 years.
        
         | cgenschwap wrote:
         | I was at a company that switched from Redshift to Snowflake. It
         | was a night and day difference. Faster (orders of magnitude!),
         | cheaper, and significantly easier to work with (since everyone
         | had their own personal view of the data to mutate/work with).
         | 
         | As far as I can tell, it is a unique product in the database
         | space. Extremely well executed ideas and design.
        
           | jeffffff wrote:
           | yeah redshift is not at all comparable to snowflake. big
           | query is much closer, it's ahead in some areas and in the
           | last year has closed some of the gaps where it wasn't. big
           | query's biggest problem is that it's tied to gcp which is a
           | distant 3rd in cloud marketshare. they have big query omni
           | coming which is multi-cloud but it'll probably be a while
           | before it's comparable to big query in gcp.
        
             | philjohn wrote:
             | The other problem with BigQuery is that you can very easily
             | write a query that's going to cost you a lot of money to
             | run - with Snowflake you can let it run for an hour or so,
             | and then realise it was a bad idea and you're only out a
             | few credits, a handful of dollars.
             | 
             | The killer feature for me was the query profiler - you can
             | see WHY a query is taking a long time and optimise it -
             | BigQuery just felt like Google were brute forcing the
             | performance, and then charging you accordingly.
             | 
             | When the project I was on switched, the micro-clusters (and
             | the ability to recluster a table) as well as the MERGE
             | semantics beat BigQuery hands down - although those
             | features may be out of beta now (but I've moved on to a new
             | gig).
        
               | m0zg wrote:
               | That's also a problem that it'd be fairly straightforward
               | for Google to solve by automatically spinning up smaller,
               | entirely separate serving clusters for customers who are
               | worried about such a blowout (for a fee, obvs). It's just
               | the serving tree (+ whatever in-memory storage service
               | they use to do distributed joins nowadays), no need to
               | duplicate the rest of the service. The caveat is, a
               | smaller cluster will favor query optimizations specific
               | to that smaller cluster. Some of those "small cluster"
               | optimizations could hurt query performance when deployed
               | against BQ proper with its tens of thousands of workers.
               | 
               | Also, BQ does explain the query plan to some extent:
               | https://cloud.google.com/bigquery/query-plan-explanation.
               | Not quite at the level of a "regular" SQL DB, but it does
               | give you some info to work with when optimizing queries.
               | If you haven't used it in a while I'd give it another
               | try.
        
           | MrPowers wrote:
           | Snowflake seems like a unique product and I can only imagine
           | the complex math they're doing under the hood to achieve
           | these incredible query times. memsql is the only real
           | competitor I know of. Redshift is a lot less user friendly
           | (constant need to run vacuum queries). Parquet lakes / Delta
           | lakes don't have anything close to the performance.
           | 
           | Predicate pushdown filtering enabled by the Snowflake Spark
           | connector seems really promising. Lots of companies are
           | currently running big data analyses on Parquet files in S3.
           | Snowflake has the opportunity to grab a huge slice of the big
           | data market.
        
             | javajosh wrote:
             | _> I can only imagine the complex math they're doing under
             | the hood to achieve these incredible query times_
             | 
             | Maybe it's cynical/paranoid, but in this age of Theranos I
             | must ask: is it possible their algorithm excels at showing
             | you a reasonable looking number, rather than an accurate
             | one?
        
               | dumbfounder wrote:
               | It's SQL, if they were giving wrong answers people would
               | notice.
        
           | [deleted]
        
         | paxys wrote:
         | Nothing ever gets obsolete once it gains a large foothold in
         | the enterprise space. There's a reason why Oracle and IBM are
         | worth what they are today.
        
           | scarface74 wrote:
           | Novell, WordPerfect
        
           | mr_toad wrote:
           | > Nothing ever gets obsolete once it gains a large foothold
           | in the enterprise space.
           | 
           | Lotus? Delphi?
        
         | ghgdynb1 wrote:
         | I think the obsolescence issue is complicated.
         | 
         | I recently saw a criticism of Palantir which went: "The company
         | has largely succeeded, they say, not because of its
         | technological wizardry but because its interface is slicker and
         | more user friendly than the alternatives created by defense
         | contractors."
         | 
         | A lot of the most successful tech firms started post-dot-com
         | are decent interfaces to not-particularly-revolutionary
         | databases. In high-end consulting and investment banking,
         | appearances are hugely important. You can't have trash decks.
         | It's unsurprising to me that the same is true in defense and
         | intelligence. You can get a roof over your head and breakfast
         | at a trashy motel or the Ritz. Everybody knows the Ritz can
         | command a much higher price because "its interface is slicker
         | and more user friendly than the alternatives."
         | 
         | I think the same thing is true here.
        
         | bpodgursky wrote:
         | At the end of the day, all the data warehouses run on SQL, with
         | a bit of customization around ingestion and export. Most of
         | them are backed by object storage (S3/GCS) and those
         | integrations look very similar.
         | 
         | I wouldn't be that worried about lock-in or being made
         | obsolete. Business logic is going to be pretty easy to port
         | between Redshift, BigQuery, Snowflake, or whatever comes next.
        
           | worker767424 wrote:
           | > Most of them are backed by object storage (S3/GCS)
           | 
           | Redshift is backed by worker instances that have their own
           | stores in what's basically an EC2 instance. It's definitely
           | not backed by S3 like Athena.
           | 
           | Bigquery and GCS are both built on top of Colossus, but they
           | have different layers in between them.
        
             | bpodgursky wrote:
             | Sorry, probably should have been more precise. Meant to
             | say: most users are going to interact with the warehouses
             | via object storage for import and export of data.
             | 
             | Since the object store APIs are almost identical across
             | platforms, it doesn't matter that much which warehouse you
             | actually use for production work. It's something that does
             | massive SQL, imports data from S3, and exports data to S3.
        
             | AtlasLion wrote:
             | Same applies to Teradata vantage on cloud.
        
             | EwanToo wrote:
             | With the newer Redshift ra3 instances you use S3 backed
             | storage with local SSD caching
             | 
             | https://aws.amazon.com/redshift/features/ra3/
        
         | zeroxfe wrote:
         | > I would be apprehensive in investing in Snowflake long term
         | purely because their product is highly susceptible to being
         | obsoleted in the next 5-10 years.
         | 
         | This can be said about most products and companies. What keeps
         | them alive is how robustly they capture (and hold on to) the
         | market, reduce costs through economies of scale, and innovate.
         | This specific market is also very rapidly growing.
        
         | boh wrote:
         | I would think it wouldn't be the same product in 5-10 years.
        
       | Hizonner wrote:
       | I take it I'm supposed to know or care what "Snowflake" is...
        
         | AnimalMuppet wrote:
         | If you don't know and don't care, you could always _not
         | comment_...
        
         | haswell wrote:
         | There are many things that show up on HN that have names I
         | don't recognize. When this happens, I'm excited! I get to learn
         | about something I didn't know about before.
         | 
         | A simple Google search for "Snowflake" will immediately answer
         | your question - both by the company being the top result, and
         | by Google conveniently including a card with an overview of the
         | company.
         | 
         | There's also plenty of stuff that shows up on HN that I don't
         | care about because it's not relevant to me, but that doesn't
         | mean it isn't relevant to the rest of the community.
        
       | kwillets wrote:
       | Nerds are surprisingly susceptible to hard-sell tactics.
        
       | cblconfederate wrote:
       | If you were a foreigner and had to invest your savings somewhere
       | (because banks and govs are forcing negative rates) where would
       | you invest
        
         | newguy1234 wrote:
         | I would invest all of it into Snowflake stock, of course.
         | 
         | The shoe shine boy told me about it....
        
       | jariel wrote:
       | The article talks about 'good things' but doesn't put them in
       | context of valuation.
       | 
       | $60B is still too much.
       | 
       | It's odd that Buffett is in, it's a weird signal, because this is
       | a weird era for markets: all other things equal, we are looking
       | at .com-ish situations here and the timing would be ideal for a
       | true crash.
       | 
       | That the world economy is shrinking by 10% and governments and
       | major industries are going insolvent should be scary.
       | 
       | Perhaps investors think they are preparing for the 'covid future'
       | but this may be a weird kind of inflation whereby everything else
       | (including cash) is crap so they are piling into winners.
       | 
       | There is an emotion to a lot of stocks these days that is
       | probably making every analyst's job a nightmare - if the CEO or
       | company is popular, it really messes with valuation.
        
         | panda88888 wrote:
         | One thing to keep in mind is the price at which Buffett
         | invested. I vaguely remember that he invested at about $70 per
         | share (I may be wrong. Just going by memory here). This means
         | there is a lot of buffer room for correction at the current price
         | of around $250.
        
       | blackbear_ wrote:
       | Am I wrong if I say that they are so valuable (in monetary terms)
       | just because they are the only (or one of the few) non-free
       | database management systems (or whatever they are)?
        
         | johncolanduoni wrote:
         | Not really, they're largely targeting the same kind of use
         | cases as redshift and bigquery.
        
       | sharadov wrote:
       | OK, it's a great product, but the valuation still does not make
       | sense!
        
         | [deleted]
        
         | newguy1234 wrote:
         | My experience with investing tells me that you will never get a
         | good investment at a fair value, you always have to pay a
         | premium for it. There is a pretty consistent pattern in my
         | stock portfolio. The stocks that I overpaid for ended up being
         | great investments while the stocks that I thought I was getting
         | a deal on ended up being a disaster of an investment.
        
       | soumyadeb wrote:
       | People are excited about Snowflake because it can completely
       | disrupt the traditional data-warehouse market.
       | 
       | The legacy players like Teradata and Exadata (from Oracle) really
       | don't scale. Teradata has ~2B in revenue, Exadata is probably in
       | the same range. That's all up for grabs but that's only
       | scratching the surface.
       | 
       | Historically, only transactional data was dumped into the
       | warehouse. Snowflake is selling storage at S3 price (plus you get
       | compression so it often ends up cheaper) while they are making
       | money off compute/query. If they can provide all the right query
       | abstractions (SQL, full-text search), in theory all data can be
       | thrown in Snowflake. Yes, tech-savvy Bay Area companies can set up
       | their own stack using Presto etc. but the rest of the world is not
       | like that.
        
         | CharlesW wrote:
         | [Teradata employee here.]
         | 
         | > _The legacy players like Teradata and Exadata (from Oracle)
         | really don't scale._
         | 
         | I get why Teradata gets labelled "legacy", but one of
         | Teradata's main differentiators is scale. Teradata engineers
         | have been tackling incredibly interesting scale problems (on
         | many dimensions of "scale") for 40 years. Teradata has many
         | customers who routinely manage and perform analytics on many
         | petabytes of data.
         | 
         | > _Historically, only transactional data was dumped into the
         | warehouse._
         | 
         | That was once true, because initially that was all the data
         | that companies had. However, companies have long since used
         | data warehouses for all kinds of data -- sensor data, text,
         | behavioral data, product info/BOMs, vendor info, contract info,
         | etc. -- whatever's necessary to run the business.
         | 
         | > _Snowflake is selling storage at S3 price..._
         | 
         | This is important, but not unique. For example, Teradata's
         | current product has native support for S3 and S3-compatible
         | object stores, and you can query them just like any other
         | database table, join that data with data in high-performance
         | native storage, etc.
        
           | soumyadeb wrote:
           | Sorry, I didn't clarify well. I am sure it scales well
           | technically, but not on cost.
           | 
           | My experience of TD is from > 10 yrs ago, and back then the
           | multi-node version was substantially more expensive than the
           | single-node version. Also, storage and compute were coupled,
           | which meant I had to pay for nodes even if 99% of my data was
           | cold. That's a problem with Redshift too but not for Snowflake.
           | 
           | De-coupling storage and compute was a brilliant move by
           | Snowflake. BigQuery has completely abstracted compute - you
           | don't provision compute and only pay for data scanned.
           | However, it gives you a sense of insecurity around cost - A
           | single bad cron job running a query every sec can blow up
           | your cost (real-life experience). Snowflake provides the best
           | cost/performance tradeoff I have seen.
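
The runaway-cron-job scenario above is easy to put numbers on. The sketch below compares a pay-per-bytes-scanned model with a pay-per-running-time model; every figure in it (bytes scanned per query, the per-TB price, the hourly warehouse rate) is a made-up placeholder, not a real BigQuery or Snowflake price.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def scan_billed_cost(bytes_per_query, queries, price_per_tb):
    """Cost when billing is proportional to data scanned (no upper bound)."""
    terabytes = bytes_per_query * queries / 1e12
    return terabytes * price_per_tb


def time_billed_cost(hours_running, price_per_hour):
    """Cost when billing is proportional to how long compute is running."""
    return hours_running * price_per_hour


# Hypothetical bad cron job: one query per second, each scanning 10 GB.
queries_per_day = SECONDS_PER_DAY
scan_cost = scan_billed_cost(10e9, queries_per_day, price_per_tb=5.0)

# Hypothetical time-billed warehouse kept busy for the same 24 hours.
time_cost = time_billed_cost(24, price_per_hour=4.0)

print(f"scan-billed: ${scan_cost:,.2f} per day")
print(f"time-billed: ${time_cost:,.2f} per day")
```

Under these assumptions the scan-billed total grows with every extra query, while the time-billed total is capped by the number of hours in the day, which is one way to read the "sense of insecurity around cost" described above.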
        
           | chrisjc wrote:
           | > Teradata's current product has native support for S3 and
           | S3-compatible object stores too, and you can query them just
           | like any other database table, join that data with data in
           | high-performance native storage, etc.
           | 
           | Storage costs for S3 (or any cloud-provider object storage)
           | are only one dimension of the price. The other is interaction
           | costs which can get prohibitively expensive, for example if
           | you accidentally forget to provide a partition key in your
           | query predicate. Snowflake absorbs this cost if you use
           | internal storage (or just copy into tables).
        
         | hilbertseries wrote:
         | > Yes, tech-savvy Bay Area companies can set up their own stack
         | using Presto etc. but the rest of the world is not like that.
         | 
         | My last company was an early adopter of Snowflake. We tried
         | Presto first, circa 2016, and Presto was sloooow. We were using
         | Vertica at the time, and Presto was so much slower. Snowflake on the
         | other hand was able to perform on the same order of latency as
         | Vertica, which was pretty crazy to us.
        
           | kwillets wrote:
           | So why did you switch from Vertica (or did you)?
        
             | hilbertseries wrote:
             | Vertica was too expensive, their licensing fees were
             | terrible at our scale. Operations were also awful, if we
             | had two nodes go down we were always in trouble. We built
             | an EBS solution that made it a little better, but it still
             | wasn't tenable long term.
        
               | kwillets wrote:
               | Good info -- thanks.
               | 
               | Their Eon mode product is very similar to Snowflake, with
               | S3 storage and semi-dynamic compute nodes, but they may
               | not be as slick at marketing it or providing a UI.
        
           | soumyadeb wrote:
           | That's interesting. I thought Vertica's pitch was real-time
           | analytics for which traditional disk-based data-warehouses
           | are too slow.
        
             | hilbertseries wrote:
             | Vertica is a disk-based analytics database. It was very
             | fast, but also very expensive. And hardware failures could
             | be particularly difficult to recover from.
        
         | baskire wrote:
         | Smaller companies with Presto won't get the same performance
         | benefit.
         | 
         | Snowflake & BigQuery get the ability to have multiple customers
         | on a large cluster.
         | 
         | It'd be cost prohibitive for a single smaller customer to have
         | all that compute sitting idle for a few queries per minute.
         | 
         | Storage also benefits as Snowflake/BQ can shard your data
         | across a much larger array of disks, giving you better IO.
         | 
         | Think: is it faster to drive a car 100 ft starting at 0 mph and
         | flooring it, or to drive 100 ft in a car which starts off
         | doing 120 mph?
        
         | ETHisso2017 wrote:
         | >>>Yes, tech-savvy Bay Area companies can set up their own stack
         | using [insert open source tool here] etc. but the rest of the
         | world is not like that.
         | 
         | It sounds like a ton of these cloud infra companies have this
         | product strategy (datadog, snowflake, elastic, hashicorp, etc)
        
           | Spartan-S63 wrote:
           | When you look at cloud infra companies like that, their
           | competitive advantage is in quickly being able to ingest data
           | and make it accessible, so an off-the-shelf solution likely
           | doesn't exist for their particular use-case. Also, since that
           | operation is your competitive advantage, you should look to
           | in-source it rather than reach for a COTS solution.
        
       | jng wrote:
       | Can someone summarize Snowflake's unique technical value? I'm
       | quite familiar with both Redshift (I would summarize it as
       | Postgres adapted to sharded, columnar OLAP functioning) and
       | BigQuery (there is a famous paper explaining the architecture).
       | Also with more traditional databases such as MySQL, PostgreSQL,
       | SQL Server, and columnar OLAP databases like Vertica. I explored
       | the website a little bit, but couldn't construe a clear statement
       | of the technical architectural value. Some of the comments here
       | are valuable, but I'm missing a clearer "big picture" overview.
       | Thanks!
        
         | malisper wrote:
         | (I'm the author of the post.)
         | 
         | I've worked with a large Postgres cluster before (~1PB of data)
         | and have been experimenting with Snowflake recently. I would
         | say there are two clear technical advantages of Snowflake over
         | Redshift. First, there's no maintenance when using Snowflake.
         | You just sign up for a Snowflake account, upload a CSV, and you
         | can start querying the data. This is in contrast to Redshift
         | where you have to manually provision a cluster, resize it as
         | you add more data, etc.
         | 
         | The second is their pricing. Storing data in Snowflake costs
         | the same as it would cost to store in S3. The tradeoff is you
         | also have to pay based on how long your queries take. Depending
         | on your workload this can result in a massive cost savings. If
         | you access only small amounts of your data infrequently, it's
         | like you're storing the data in S3 and you only have to pay a
         | bit more when accessing the data. This is in contrast to
         | Redshift where you have to pay for the full cost of the cluster
         | regardless of whether you are actually querying the data or
         | not.
         | 
         | Snowflake also has a ton of quality of life improvements
         | compared to Redshift. One really nice thing is you can change
         | the amount of compute used for any individual query. For
         | example, if you have one specific slow query, you can allocate
         | 4x the compute for that one query, pay 4x as much while the
         | query is running, and get the query to run 4x faster
         | (ultimately costing you the same amount as if you used 1x the
         | compute).
         | 
         | One neat thing is there's ultimately only one "Snowflake
         | instance" in each region. Everyone's tables are in the same
         | instance, but you can only access the tables you have
         | permission to access. This allows you to easily share data
         | between different Snowflake accounts. You can store the data in
         | one account and query it from another.
         | 
         | So the core value proposition is really strong and it also has
         | a bunch of extra features that are all pretty useful at the end
         | of the day.
         | 
         | This post focused on Snowflake solely from a business point of
         | view. I'm considering writing another one that focuses on it
         | from a technical point of view.
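
The "4x the compute, 4x faster, same cost" claim above is simple arithmetic once you assume the query parallelizes perfectly. A minimal sketch of that reasoning; the function name, the one-hour baseline, and the per-second rate are hypothetical, and real queries will not always scale linearly:

```python
import math


def query_cost(base_seconds, scale, rate_per_second):
    """Runtime and cost of one query at `scale` times the baseline compute.

    Assumes ideal linear speedup: runtime is divided by `scale`, while the
    billing rate is multiplied by `scale`.
    """
    runtime = base_seconds / scale
    cost = runtime * rate_per_second * scale
    return runtime, cost


# Hypothetical slow query: one hour on baseline compute.
runtime_1x, cost_1x = query_cost(3600, scale=1, rate_per_second=0.001)
runtime_4x, cost_4x = query_cost(3600, scale=4, rate_per_second=0.001)

print(runtime_1x, cost_1x)
print(runtime_4x, cost_4x)
```

The scale factor cancels out of the cost, so under the linear-speedup assumption you are only choosing how long to wait, not how much to pay.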
        
           | jng wrote:
           | Thanks for the details, very useful. Please write that post,
           | I'm sure it will make it to the front page here :)
        
         | soared wrote:
         | Snowflake's value is that they provide the same technical
         | products as amazon/google/etc, but are not amazon/google/etc.
         | Some shops like buying into the google ecosystem, some are
         | afraid of vendor lock in.
         | 
         | Probably other things, but many companies exist just to be
         | alternatives to faang. If you're good enough, you surpass that
         | intention.
        
           | jng wrote:
           | I saw them a year or two ago positioning themselves in
           | contrast to Redshift and BigQuery. I thought "these guys are
           | building something for Microsoft to acquire" (my thought was,
           | something with a more modern OLAP architecture than SQL
           | Server, which they can offer via Azure). Naive me, they were
           | so much more business savvy than that...
        
       | bob33212 wrote:
       | People want to own the next Salesforce. Snowflake could expand
       | into ERP/CRM/Visualizations. There is over 50B revenue across the
       | world in those areas.
        
         | jariel wrote:
         | Those lines of business are very far away from Snowflake.
        
       | kthejoker2 wrote:
       | As someone who spends a lot of time in this space, their only
       | "killer app" is automated workload / data distribution
       | management. Which is cool, and hard to get right, but clearly
       | something the cloud vendors and other data players have taken
       | steps towards / offer with more or less the same outcomes.
       | 
       | And in contrast their Silicon Valley roots mean a lot of their
       | tooling/UX/data capabilities are ... undercooked. Their Web IDE
       | feels like a throwback to 2003 Hadoop, their ETL capabilities are
       | a joke, they don't support joins in views ...
       | 
       | And they've also squandered some opportunities to actually offer
       | a differentiating "all in one" data processing experience for ad
       | hoc/exploratory, BI/aggregated, and Big Data/AI/ML model
       | crunching. For example, here's their garbage blog post on Spark
       | SQL - https://www.snowflake.com/blog/snowflake-spark-
       | part-2-pushin...
       | 
       | tl;dr when someone writes a Spark job that includes a filter
       | against data in Snowflake, it's more efficient to let Snowflake
       | filter the data before shipping it off to the (much more
       | performant) Spark engine to do the actual analytical pieces of
       | the query plan, instead of just shipping all the data over and
       | letting Spark do the filtering.
       | 
       | Like ... wow, predicate pushdown is your answer?
       | 
       | Contrast with Azure Synapse providing Spark and SQL Server
       | compute in the same environment; Databricks adding Delta Lake
       | capabilities to be more schema-on-write friendly; Dremio building
       | AI into their caching, and Starburst into their workload
       | management ...
       | 
       | Anyway, I don't see any secret sauce, which means it's still just
       | traditional enterprise sales cycles...
        
         | chrisjc wrote:
         | > they don't support joins in views
         | 
         | Perhaps you mean they don't support joins in Materialized Views
         | (uppercase M)? We use Snowflake views with joins all over the
         | place. Furthermore if views don't cut it for you, you can
         | always use joins in UDTFs. Or if you really need joins in a
         | materialized view (lowercase M), you can use change streams in
         | combination with joins to maintain your own materialized view
         | (table).
         | 
         | > instead of just shipping all the data over and letting Spark
         | do the filtering
         | 
         | Forgive my ignorance, but in what capacity would this be less
         | efficient? Doesn't it make more sense to reduce your result set
         | before shipping it off to external compute?
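
The predicate pushdown trade-off discussed above can be illustrated with a
minimal sketch. This is a toy model, not Snowflake's or Spark's actual
connector: the Warehouse class, its scan() method, and the rows_shipped
counter are hypothetical names invented for illustration. It only shows that
filtering at the source produces the same result while moving fewer rows to
the external compute engine.

```python
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


class Warehouse:
    """Hypothetical stand-in for a remote data source.

    It counts how many rows leave the source so the two strategies
    can be compared.
    """

    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)
        self.rows_shipped = 0

    def scan(self, predicate: Optional[Predicate] = None) -> Iterator[Row]:
        # If a predicate is supplied, it is evaluated at the source
        # (the "pushed down" case); otherwise every row is shipped.
        for row in self.rows:
            if predicate is None or predicate(row):
                self.rows_shipped += 1
                yield row


def total_without_pushdown(wh: Warehouse, predicate: Predicate) -> int:
    # Ship everything, then filter on the consumer side.
    return sum(int(r["amount"]) for r in wh.scan() if predicate(r))


def total_with_pushdown(wh: Warehouse, predicate: Predicate) -> int:
    # Let the source filter first, so only matching rows are shipped.
    return sum(int(r["amount"]) for r in wh.scan(predicate))


def make_rows(n: int) -> List[Row]:
    # Every fourth row belongs to region "EU"; the rest are "US".
    return [{"region": "EU" if i % 4 == 0 else "US", "amount": i} for i in range(n)]


def is_eu(row: Row) -> bool:
    return row["region"] == "EU"


if __name__ == "__main__":
    naive, pushed = Warehouse(make_rows(1000)), Warehouse(make_rows(1000))
    print(total_without_pushdown(naive, is_eu), naive.rows_shipped)
    print(total_with_pushdown(pushed, is_eu), pushed.rows_shipped)
```

Both functions return the same total; only the number of rows crossing the
boundary differs, which is the saving the linked blog post describes.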
        
       | yalogin wrote:
       | I don't know anything about Snowflake but in general they are
       | fortunate to go public at this time. The tech market is in a
       | bubble and investors are frothing at the mouth for tech stocks.
       | Pessimism started to creep in for some of the already sky high
       | tech stocks. Snowflake IPO'd right at that time and people just
       | flocked to it.
        
       | aerodog wrote:
       | Buffett effect. it's insane
        
       | gumby wrote:
       | Because interest rates are super low and there's a lot of
       | uninvested capital sloshing around.
        
       | [deleted]
        
       | zwieback wrote:
       | I guess this quote from the article sums it up: "There was so
       | much hype, my mom, who doesn't even know what Snowflake is,
       | decided to invest in Snowflake."
        
         | joshdick wrote:
         | "Taxi drivers told you what to buy. The shoeshine boy could
         | give you a summary of the day's financial news as he worked
         | with rag and polish. An old beggar who regularly patrolled the
         | street in front of my office now gave me tips and, I suppose,
         | spent the money I and others gave him in the market. My cook
         | had a brokerage account and followed the ticker closely. Her
         | paper profits were quickly blown away in the gale of 1929."
        
       | TuringNYC wrote:
       | Almost all the big companies I worked for had a "database gang"
       | -- a database group which, in the name of centralization, forced
       | you to bow to them to get anything. New DB? bow to them. More
       | nodes? bow to them. Reboot? bow to them. The internal budget
       | "prices" would be off the charts unbelievable.
       | 
       | It makes sense to centralize, but only at a certain cost. Beyond
       | that cost, it is better to just de-centralize because not every
       | project can spend 4-5 months of meetings to spin up a DB.
       | 
       | The cloud changed this because it became an OpEx discussion and
       | something you could spin up on your own. For non-production
       | workloads, it becomes especially obvious to do this.
        
         | chrisjc wrote:
         | > it is better to just de-centralize
         | 
         | Until you want to join and then all of a sudden it's not your
         | problem. Then you end up with a "gang" of cowboy analysts
         | running ad-hoc data-dumps against operational datastores
         | affecting production uptime and stability only so that they can
         | do a lookup between the multiple (source_table_column_count *
         | source_table_row_count) sheets in their uber excel document.
         | 
         | I'm all for decomposing the monolith as long as you have a plan
         | for recomposing when it's necessary.
        
           | TuringNYC wrote:
           | Yep, that's why I said centralizing makes sense -- up until
           | some price point. Beyond that point, you might as well just
           | spend the money on re-composing when you need.
        
             | chrisjc wrote:
             | Fair point! Sorry I missed that.
        
         | stingraycharles wrote:
         | This still in no way answers why Snowflake is so valuable,
         | though. I completely understand your argument, and I agree with
         | it; I just don't think the article's arguments are anything
         | other than ex post facto rationalization. When they mentioned
         | NPS I almost snorted my coffee, that metric can be gamed in
         | any way you want it.
        
           | stcredzero wrote:
           | _> > Almost all the big companies I worked for had a
           | "database gang" -- a database group which, in the name of
           | centralization, forced you to bow to them to get anything.
           | New DB? bow to them. More nodes? bow to them. Reboot? bow to
           | them. The internal budget "prices" would be off the charts
           | unbelievable._
           | 
           |  _> This still in no way answers why Snowflake is so
           | valuable, though._
           | 
           | It explains it to a T! You have something you want, but
           | internal company politics and territoriality keep you from
           | getting it the way you want. An outside provider lets
           | everyone get it for a bit of cash. It's basically the same
           | play as Salesforce. It's not some kind of technical moonshot.
           | It has to do with a modicum of technology delivered by a 3rd
           | party who can avoid all of the internal friction.
           | 
           | The next founder who can think of this kind of play, then
           | execute on it, will be the next Salesforce/Snowflake, and
           | will probably have the ear of the same investors!
        
             | stingraycharles wrote:
             | What I meant was why Snowflake _specifically_ is so
             | valuable. BigQuery, Redshift or any other cloud db would
             | fill this gap as well. Why Snowflake?
        
               | mr_toad wrote:
               | > What I meant was why Snowflake specifically
               | 
               | Marketing. Vast amounts of marketing.
        
               | mmm_grayons wrote:
               | It's a chance for investors to get in on the next big
               | thing and invest specifically in data warehousing. No one
               | can put his money directly into BigQuery or Redshift.
               | 
               | Edit: why are so many people downvoting this? Is there
               | some other reason for Snowflake's valuation (aside from
               | tech bubble playing a role)?
        
               | chrisjc wrote:
               | Snowflake fanboy here who can't really answer your
               | question about why it's so valuable. Not sure I can
               | rationalize the current value. Not sure I think it should
               | be valued this much.
               | 
               | But I can probably answer why Snowflake instead of
               | Redshift (sorry, not too familiar with BigQuery)...
               | 
               | First of all it's cloud-provider agnostic so you can set
               | up Snowflake on any or all of the 3 major cloud providers
               | as well as set up replication between them directly or
               | indirectly through their data exchange. Probably the most
               | powerful feature is the way that Snowflake has the
               | ability to scale (up or down) compute (vertically and
               | horizontally) and storage independently of each other.
               | Furthermore, you have the ability to scale compute down
               | to nothing, and spin up "instantly" when the demand
               | arrives. On top of all of this there is an incredible
               | selection of functionality that I could go on and on
               | about.
        
               | texasbigdata wrote:
               | Honestly, if you don't mind, please do. I believe the
               | decoupled elastic compute / storage advantage has been
               | well described; what are the more granular or technical
               | things you like?
               | 
               | Edit: seems you've already answered this :)
               | https://news.ycombinator.com/item?id=24265856
        
               | TuringNYC wrote:
               | I find Snowflake much easier to use than BigQuery or
               | Redshift. It is also cloud-service-provider agnostic. So
               | your only hook at that point is ingest + any snowflake-
               | specific SQL (and obviously security migration etc.) So
               | for retention, they compete on UX rather than walls.
               | 
               | W/r/t value, the idea is a disproportionate share of egress
               | from Oracle, Teradata, etc will end up at Snowflake,
               | hence huge TAM, SAM, and SOM.
        
         | hitekker wrote:
         | > The internal budget "prices" would be off the charts
         | unbelievable.
         | 
         | I find that this occurs when an infrastructure team considers
         | itself a "platform". The only supplier of an asset that
         | everyone else demands can set the "price" of the asset as high
         | as they want.
        
       | jcfrei wrote:
       | I don't see it mentioned in the article but isn't one of the main
       | selling points of Snowflake their data exchange? Companies upload
       | their data to Snowflake in the hopes of someday monetizing it? If
       | that's the case then I think it's just a matter of time until
       | regulators become interested too.
        
       ___________________________________________________________________
       (page generated 2020-09-30 23:01 UTC)