[HN Gopher] Whom the gods would destroy, they first give real-ti...
___________________________________________________________________
Whom the gods would destroy, they first give real-time analytics (2013)

Author : sbdchd
Score  : 57 points
Date   : 2023-07-25 21:56 UTC (1 hours ago)

(HTM) web link (mcfunley.com)
(TXT) w3m dump (mcfunley.com)

| tech_ken wrote:
| If the main objection to constructing a real-time product monitoring system for A/B(C/D/E...) decisions is that optional stopping is bad, why not throw away the null-hypothesis significance testing and instead treat the problem as a multi-armed bandit?

| morkalork wrote:
| MAB and its friends like contextual MAB have always been the dream. Closing the loop so analytics data is pushed back to the decision point in code and isn't a one-way pipe to some dashboard is the hardest part though. For non-technical reasons.

| tech_ken wrote:
| Sort of a generalized PEBCAK

| whimsicalism wrote:
| Because it is difficult to map that onto real business decisions and oftentimes requires supporting a large space of possible UI combinations because they haven't been fully ruled out yet.

| taeric wrote:
| How well does that dodge the problem? I'd imagine a multi armed bandit should stay such that it is always sampling from many fair coins, as it were. I would be delighted to read a study on that.

| tqi wrote:
| I also think real time is mostly useless (aside from for alerting, which probably is a different tool), but I don't think the one day delay is much of (if any) protection against the experimental pitfalls described.

| brycewray wrote:
| (2013)

| PeterCorless wrote:
| Exactly. And it betrays the biases of the era. This author really got it wrong.
| whimsicalism wrote:
| While there are probably all sorts of problems with marxism when it comes to economics, in large companies there should be a 'vanguard party' of statisticians who prevent the masses from making false claims of causality from p-hacked tests.

| bluecoconut wrote:
| I believe that the comment about CAP theorem violation / treating the problem as a technically unsolved thing isn't true. E.g., see the dataflow paper that sets up clearer tradeoffs for latency and correctness in large scale data processing [1]. I think it makes sense to always hold a high bar for your technology -- if it's technically feasible, and fits within budgets (time and complexity for the team), accepting artificial limitations because they soften social problems feels like a mistake / believing in a false "ignorance is bliss" belief. I think the problem that is presented is more of a problem of popular understanding of statistics and game theory, and not the technical problem.
|
| [1] "The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing" https://research.google/pubs/pub43864/

| hn_throwaway_99 wrote:
| I really liked this article, and I thought this statement hit the nail on the head: "Confusing _how we do things_ with _how we decide which things to do_ is a fatal mistake." I've worked at companies that practice what I call "thrash management" (constantly jumping from one priority to the next based on whichever fire happens to be burning brightest that day) and it's no fun, to put it mildly.
|
| That said, once you build a system for operational metrics (i.e. what you need to detect anomalies that indicate outages, security concerns, etc.) you're already a huge way there towards having real-time analytics.
| I still wholeheartedly agree with the author that these real time metrics should only be in the service of operations, not product planning.

| wellpast wrote:
| And... near real-time with high uptime is relatively more costly to build / maintain / deploy / operate than batch -- so save your org the cost!

| masswerk wrote:
| > I can understand why engineers are predisposed to see instantaneous A/B statistics as self-evidently positive
|
| This is the crucial misunderstanding: in actuality, you are running a panel.
|
| (There is no such thing as an A/B test outside of marketing. Running a meaningful panel requires some information on the population, your samples, the homogeneity of those, etc, just to pick the right test, to begin with. Also, you need a controlled setup, which notably includes a predetermined, fixed timeframe for your panel to run. Before this is over, you have no data, at all. You are merely tossing coins...)

| PeterCorless wrote:
| Data scientists also do A/B testing on algorithms to see which one has better fit for a use case against real-world, real-time data.

| PeterCorless wrote:
| This is very 2013. Meanwhile in 2023, a decade later, you literally have systems detecting credit card fraud in milliseconds. [Disclosure: I work for StarTree, which is powered by Apache Pinot. We eat petabytes of data for breakfast.]

| Ecstatify wrote:
| What has that to do with product decisions?

| asimjalis wrote:
| This has aged well.
| dang wrote:
| Related:
|
| _Whom the Gods Would Destroy, They First Give Real-Time Analytics (2013)_ - https://news.ycombinator.com/item?id=15379660 - Oct 2017 (70 comments)
|
| _Whom the Gods Would Destroy, They First Give Real-time Analytics_ - https://news.ycombinator.com/item?id=6515805 - Oct 2013 (1 comment)
|
| _Whom the Gods Would Destroy, They First Give Real-time Analytics_ - https://news.ycombinator.com/item?id=5032588 - Jan 2013 (55 comments)

| throwaway63820 wrote:
| Just use Amplitude

| alex_lav wrote:
| Last I used Amplitude it was insanely expensive. Is that not still the case?

| codevark wrote:
| [dead]

| Xeoncross wrote:
| Them: We need metrics to know if the users like this new feature we're pushing on them.
|
| Me: or you know, we could maybe see what the users' biggest issues are first and try to build stuff to solve those problems.
___________________________________________________________________
(page generated 2023-07-25 23:00 UTC)