[HN Gopher] The pivot table, the spreadsheet's most powerful too...
       ___________________________________________________________________
        
       The pivot table, the spreadsheet's most powerful tool (2020)
        
       Author : airstrike
       Score  : 224 points
       Date   : 2023-10-09 14:36 UTC (8 hours ago)
        
 (HTM) web link (qz.com)
 (TXT) w3m dump (qz.com)
        
       | mcsoft wrote:
       | I use pivot tables all the time. The concept is brilliant, but
       | the Excel UI leaves a lot to be desired.
       | 
       | At first, you're amazed at the flexibility, but once you become
       | comfortable, you suddenly hit the limitations. Can't sort by a
       | calculated column, can't categorize without adding columns in the
       | data, etc.
       | 
       | I looked at Quantrix for a while, and it was a bit too complex
       | for practical purposes. I wonder if there are any decent
       | PivotTable tools out there?
        
       | asah wrote:
       | nice!
       | 
       | AI is about to end this - shortly, we'll be able to ask for the
       | answer directly, in zero clicks.
        
         | airstrike wrote:
         | this is unfairly downvoted, IMHO -- i doubt it will take "zero
         | clicks" -- we're more likely replacing clicks with prompts --
         | but LLMs can definitely help bridge the gap between data
         | wrangling and presentation.
         | 
         | replies to this comment seem to overvalue the challenges
         | without acknowledging that those may eventually be overcome.
        
           | seektable wrote:
           | search-driven analytics is not really a new thing and
           | products in this space were before LLM-era. This kind of
           | interface can be useful for some categories of users but it
           | is not a game-changer - prompts cannot replace Excel and its
           | pivot tables, and in fact typing prompts may be even more
           | complicated for users than good old 'clicks'.
        
         | Brian_K_White wrote:
         | and have no confidence in it
        
         | rchaud wrote:
         | Perhaps we could create a set of logical prompts to get there
         | even faster. We could call it a structured query language!
        
           | mxuribe wrote:
           | Or better yet: conversation query language (CQL)! ...Though
           | the risk exists for verbal name collision. ;-)
        
         | Bishonen88 wrote:
         | Dunno, seems to me that asking (writing in this case) is not
         | final solution. In fact, in most cases, typing down a question
         | will take longer than looking in a well created table.
         | 
         | This chat-like back-and-forth, will take forever to understand
         | some dataset properly. One question answered will give birth to
         | 5 additional questions and so on.
        
       | hankyone wrote:
       | Everytime I have tried to use pivot tables I end up just firing
       | PowerBI instead, it's a much better experience.
        
         | jimnotgym wrote:
         | It depends. It is common to use the results in further Excel
         | calculations.
        
         | bafe wrote:
         | I usually end up using panda or R
        
       | Scubabear68 wrote:
       | They are indeed very powerful, if unfortunately named, and the UI
       | always takes me a while to do it properly in Excel.
       | 
       | I like how MacOS Numbers does it, the Categories feature is much
       | easier to use, however it has limitations that are very annoying
       | if you reach them.
       | 
       | For anything complex these days, it is often easiest to export
       | the spreadsheet to CSV and run SQL queries on it with CSVQ.
        
       | kagakuninja wrote:
       | Microsoft directly stole their pivot table from a former employer
       | of mine, Brio Technology, and their product DataPivot.
       | https://en.wikipedia.org/wiki/Brio_Technology
       | 
       | According to the CEO, Microsoft summoned him to Redmond and made
       | an insultingly low offer to buy the company. They threatened to
       | build their own product and put Brio out of business if he
       | refused. He refused and MS later added pivot tables to Excel.
       | 
       | The article mentions Lotus, who apparently built a similar
       | product around the same time. The Brio founders were involved
       | with a company called Metaphor, and maybe some of the ideas were
       | developed there...
       | 
       | Brio products shipped with a sample database, the contents of the
       | CEO's wine cellar. That exact data could be seen on early
       | Microsoft office boxes...
        
         | [deleted]
        
         | viraptor wrote:
         | What part did MS steal? The pivot table concept (as in
         | summarising data by categories of rows and similar operations)
         | existed for a really long time, so I'm guessing you're not
         | referring to that?
        
           | onlyrealcuzzo wrote:
           | Aren't you allowed to "steal" basically everything except
           | exact code in software?
           | 
           | Didn't the supreme court rule in favor of Google vs Oracle
           | that Google's use of the JDK API was fair use?
           | 
           | Wasn't there also a recent supreme court ruling that you
           | can't patent non-novel math (algorithms)?
           | 
           | I'm not sure how you could be convicted of theft in software
           | outside of literally stealing and reselling someone's non-
           | open source code.
        
           | kagakuninja wrote:
           | You could say they just reverse-engineered Brio's product,
           | and therefore it wasn't theft.
           | 
           | But the presence of the Brio sample wine DB appearing on MS
           | Office boxes showed that they bought a copy of DataPivot and
           | used that DB when developing their clone. They didn't even
           | bother hiding the evidence.
           | 
           | BTW, DataPivot was released in 1991, and the only other
           | references I've seen for prior art are a Lotus product
           | released about the same time.
        
         | chinchilla2020 wrote:
         | What did they steal? The underlying tech? the name 'pivot
         | table'?
         | 
         | The concept of aggregating categorized columns of data is not
         | something Brio Technology invented.
        
         | ambyra wrote:
         | Remember when Microsoft wanted to license CPM to IBM, but they
         | didn't own it, so they got someone to make a clone and licensed
         | that to IBM?
        
         | riemannzeta wrote:
         | Something similar happened to Stac Electronics.
         | https://en.wikipedia.org/wiki/Stac_Electronics
         | 
         | But Stac had patents, lawyered up, and sued Microsoft for
         | patent infringement. Microsoft didn't lie down, but after a
         | court ruled that both of them needed a license to ship
         | products, Stac went to the largest OEM customers and offered a
         | license. It was either take the patent license or not ship, and
         | the pressure from the OEMs on Microsoft resulted in a
         | settlement that was not terrible for Stac.
         | 
         | This is one of the best examples I know of how software patents
         | can be good for competition.
        
           | roenxi wrote:
           | I think it is worth noting that these patents weren't good at
           | all for competition. Stac Electronics proved to be in no way
           | capable of competing with MS Excel, pivot tables are
           | technically not that hard and the lawsuit might have stopped
           | MS bringing them to a huge number of people faster.
           | 
           | They did nothing much, and if they'd done more they'd have
           | retarded progress.
        
             | wtallis wrote:
             | The fact that you clearly have no clue what business Stac
             | was in rather undermines the argument you're trying to
             | make.
        
           | echelon wrote:
           | > This is one of the best examples I know of how software
           | patents can be good for competition.
           | 
           | Small companies and startups with products in the market
           | (before existing incumbents have products) should be able to
           | get software patents to defend their innovation.
           | 
           | Big companies should be able to have patents, but not use
           | them against smaller players.
           | 
           | Patent trolls with no products in the market shouldn't be
           | allowed to have patents.
           | 
           | Universities and university researchers should be able to
           | have patents, but they should be forced to license at not-
           | unreasonable terms to startups.
           | 
           | My thoughts on the matter, anyway. The point is to encourage
           | innovation, especially by enabling small players bringing new
           | stuff to market. Give them a small shield against the big
           | incumbents.
        
           | deaddodo wrote:
           | The settlement was alright, but Microsoft's inclusion of
           | DoubleSpace still inevitably killed Stac (and even larger
           | potential profits) as they were unable to pivot when the
           | technology became unviable.
        
           | kagakuninja wrote:
           | Brio had a patent on their pivot table aggregation algorithm,
           | perhaps Microsoft did something differently in their
           | implementation. I've never used an Excel pivot table.
        
             | eps wrote:
             | According to Wikipedia Brio got their patent in 1999, i.e.
             | well after Excel got pivot tables.
        
               | kagakuninja wrote:
               | OK, fair enough. We don't know how long it took to get
               | approved. In theory they could have gone after MS. I
               | vaguely remember that our rival Business Objects had some
               | BI patents and sued us, and we sued them over pivot.
        
             | canadianfella wrote:
             | [dead]
        
           | HappyDaoDude wrote:
           | I am completely on board with the idea of Richard Stallmans
           | of abolishing software patents BUT it is simply an ideal that
           | I do not see ever happening on a large scale. The fact is
           | that when it comes to business, they can help. It is just a
           | shame when there is the potential to leverage them against
           | smaller players.
           | 
           | Things that work on one scale for good can also be leveraged
           | for bad on another scale.
        
           | pacaro wrote:
           | Arguably this, and similar cases, is what motivated Microsoft
           | to become an aggressive patent filer. I know that at various
           | times in my tenure there (99-13) that I was on teams where
           | patent filing was encouraged
        
       | thrdbndndn wrote:
       | this probably is more about specific implementations than pivot
       | table in general, but I often find it's "lacking" when wanting to
       | do some more advanced things with it.
       | 
       | To name a few:
       | 
       | The way how the "values" are generated is very limited in both
       | Excel and Google Sheets (in different ways).
       | 
       | The way how filter/sorting works with pivot table isn't the most
       | straightforward or flexible.
       | 
       | the "UI" elements (headers, styles, etc.) is very hard to
       | control. I often find myself creating a pivot table, and copy it
       | somewhere else, and manually fix bunch of stuff -- which kinda
       | defeats the point of it (since I can no longer dynamically update
       | it).
       | 
       | Disclaimer: I'm by no means a spreadsheet expert so I may just
       | miss something.
        
         | airstrike wrote:
         | Agreed, current implementations are totally brittle and far
         | from "feature complete" relative to what an experienced / power
         | user would expect coming in
         | 
         | It actually bothers me so much I've decided to write my own
         | spreadsheet engine. This is one of the pain points I want to
         | fix
        
         | conductr wrote:
         | I'm pretty close to a spreadsheet expert, I categorize pivot
         | tables as just quick and dirty adhoc/disposable data
         | visualization and never would use it as something for
         | presentation or where styling mattered.
         | 
         | I usually use it to quickly find unexpected values in the
         | underlying data columns. Although with spill formulas like
         | =unique() I'm using it less and less for this.
         | 
         | I find often people wanting more out of it, are really asking
         | for Power Query. Which is there, just a lot of people are
         | intimidated by. (Maybe not HN, but general population)
        
         | seektable wrote:
         | You may try these online pivot tables https://www.seektable.com
         | where you can re-order rows/columns simply with a click on the
         | header, apply filters via simple input where you can specify
         | which items to keep or exclude.
        
       | xyzelement wrote:
       | Despite being an engineer and well versed in SQL, the ability to
       | do things quickly in Excel, including whipping up pivot tables,
       | has been a big boost to my career, especially as I moved more to
       | product and business management.
       | 
       | Pivot tables let you super quickly see how things break down
       | across dimensions and play with that analysis in a way that makes
       | for rapid decision making that's not matched by much else other
       | than established tools for well-defined spaces.
        
         | chinchilla2020 wrote:
         | I feel the opposite. I rapidly upload my data to bigquery and
         | start slicing.
         | 
         | I can fire out SQL queries way faster than I can click around
         | in excel... and it is reproducible since it's easier to copy-
         | paste a query from history than redo all formatting in an excel
         | sheet later on.
        
         | WendyTheWillow wrote:
         | I've never gotten into pivot tables, mostly because data has
         | always been available in a relational db for my career.
         | 
         | But the effort to move the data into one from a spreadsheet is
         | way overkill, so I do think it's not suboptimal to use them
         | even as an engineer.
        
           | xyzelement wrote:
           | I can see how all pivot table capabilities can be replicated
           | via SQL but I think some of that becomes very expensive and
           | error prone while being simple drag-drop in Excel/Sheets.
           | 
           | The ease of doing it is a key feature. if I have to build a
           | certain report because it's my job, I will do it whatever it
           | takes. If I am just doing extra due diligence for myself, I
           | may not do it if it takes hours of SQL crafting.
        
       | bloopernova wrote:
       | I don't know if anyone will find this useful, but here's
       | something I learned literally this weekend.
       | 
       | Sort rows/cells into groups based on the value of a cell.
       | =FILTER(Stories!B2:D13,Stories!F2:F13=A2)
       | 
       | First parameter, "Stories!B2:D13" is a group of cells showing
       | some stories.
       | 
       | Second parameter, "Stories!F2:F13=A2" is the column where each
       | cell is compared to the value of A2. Rows that match are then
       | copied into wherever the =FILTER formula is placed.
       | 
       | I use it to take a list of Stories and sorts them into Sprints
       | automatically. That's useful for program increment planning, etc.
       | 
       | The other useful Excel thing I learned recently is:
       | =IF(NOT(ISBLANK(A2)),HYPERLINK("https://jira-
       | instance.atlassian.net/browse/PROJECT-"&A2,"PROJECT-"&A2),"")
       | 
       | That says: If the cell A2 is not blank, append its value onto the
       | url given, and show that as a link with the text PROJECT- with
       | the value of A2 appended.
       | 
       | I know I should have probably done something cooler with Emacs
       | and org-mode, but I have to share it with a lot of business
       | folks.
       | 
       | If by some chance either of those are useful to you, I hope they
       | work OK for you :)
        
         | layman51 wrote:
         | Thanks, I really find these examples useful. For me, the QUERY
         | function in Google Sheets has been really useful too. Same
         | thing with IMPORTRANGE. It really helps with documenting what
         | you did to get a count of rows.
        
           | bloopernova wrote:
           | I'm glad you found them useful, thank you for letting me
           | know!
        
       | __mharrison__ wrote:
       | I use this in Pandas all the time. (Just got done teaching a
       | class about it.) Very useful once you grok the syntax.
       | 
       | Smart folks like duckdb realize the utility (and the pain of
       | doing this in normal SQL) and have added PIVOT to their
       | implementation. Super useful.
        
         | tda wrote:
         | Back in the days I recall there was some plug-in to query Pivot
         | tables in postgres, but it was such a pain to use that a two
         | step approach (query the column names, and then generate a
         | second query based on these column names) with
         | Python/SQLAlchemy glue code was easier and possibly even more
         | performant. I hope the situation has improved, as indeed pivot
         | tables are so useful
        
         | bafe wrote:
         | As much as I don't like working in R, the pivoting/reshaping
         | functions in tidyr are hard to beat. It's much more ergonomic
         | than the similar functions in Python
        
         | phkahler wrote:
         | >> Smart folks like duckdb realize the utility (and the pain of
         | doing this in normal SQL)
         | 
         | I used to love the Microsoft Access visual query tool. Super
         | intuitive but maybe a little too abstract for normal people. It
         | would also produce SQL which was how I learned a little bit of
         | that.
        
         | ideamotor wrote:
         | I like R's implementation in dplyr, which is pivot_longer and
         | pivot_wider. I'd really like to see those in Postgres.
        
           | bafe wrote:
           | Weren't those functions moved to tidyr? Anyway they should be
           | taken as a model for how to implement a sane interface for
           | data reshaping in other languages. You can say what you want
           | about R/Wickham, but he's not afraid to thinker with his
           | modules until they find the perfect user experience
        
         | hackandthink wrote:
         | Oracle has PIVOT as well. (the last time I used it you had to
         | specify the columns manually)
        
       | tacon wrote:
       | I once read about several companies whose entire product was an
       | Excel addin that did ... pivot tables. The average businessperson
       | had never heard of the pivot table functionality built into
       | Excel, and they were an easy sale for the magical demo of a pivot
       | table on their own data.
        
       | Qem wrote:
       | The second most powerful is probably the Solver, that few
       | programming languages (like prolog) include equivalent
       | functionality out of the box. That's one thing that consistently
       | gets me crawling back from Python/Pharo to Calc.
        
       | mritchie712 wrote:
       | For more power: QUERY()
       | 
       | https://www.benlcollins.com/spreadsheets/google-sheets-query...
        
       | sorokod wrote:
       | "Here's an example of the pivot table in action."
       | 
       | What follows reminds me of "draw the rest of the f* *ing owl"
       | meme.
        
       | elsadek wrote:
       | I new about pivot table when I was working with Python Pandas.
       | Pandas has this feature, and it has saved me a huge amount of
       | time.
        
       | cgio wrote:
       | Great feature but also a very dangerous one. Few cases of people
       | sharing a spreadsheet with an innocuous summary pivot table, not
       | realising they also, in the process, share the full dataset that
       | supports the pivot with sensitive data.
        
       | jimnotgym wrote:
       | I am regarded as a decent Excel user, and in my field, Finance,
       | that is saying at least something. However, I had never tried
       | power pivot until today. I was a bit annoyed that it can't handle
       | many to many relationships. Time to dust off my SQL skills and do
       | it the old fashioned way...
       | 
       | I had a table of sales transactions, and a table of stock
       | balance. I wanted to join them on the item sold so per item I had
       | stock balance and a sales value per sku. I was suprised it
       | wouldn't do it. It returned in less than a second as a sell
       | query.
        
         | proamdev123 wrote:
         | The workaround for many to many relationships isn't too hard.
         | 
         | What you have to do is create another table containing unique
         | values of items sold, and then make 1:many relationships from
         | that table to the other two. You can easily make the unique
         | value table by copying and pasting all of the items sold into a
         | single column on a new sheet, highlight them all, and then Data
         | -> Drop Duplicates. It's a little annoying, but not hard.
        
       | strongpigeon wrote:
       | I remember in Google Ads, we'd build all these cool looking
       | dashboards but whenever we did user research it was pretty
       | apparent that all they wanted was their data in a pivot table...
        
         | listenallyall wrote:
         | Basically the users were trying to tell you (sounds like you
         | never got the message) that your dashboards were insufficient
         | at answering the questions people had to answer in their
         | workplace by their bosses every day.
         | 
         | Nobody cares that they looked cool (highly subjective, anyway)
         | if they can't be used to get work done. Where your team thought
         | you were adding value, you were just wasting time.
        
         | c4mpute wrote:
         | All these cool-looking dashboards are just too inflexible. You
         | cannot add your own aggragates beyond trivialities. You cannot
         | just "color that one value that bugs you". You cannot just
         | generate a readable report plus some explanatory text.
         | 
         | Spreadsheet export + pivot table gives you all that. Doable for
         | any moderately competent office drone without a round-trip
         | through some endless backlog-spec-sprint-program-test-respec-
         | sprint-... loop
        
           | Bishonen88 wrote:
           | That's a bit oversimplifying IMO.
           | 
           | There's a place for well-crafted analytics dashboards in
           | today's business, too. They're mostly tailored to specific
           | user requirements/use-cases and look nothing like the flashy
           | stuff one sees on dribbble or elsewhere.
           | 
           | Tailored analytics dashboard can solve many pain points of
           | Excel + Spreadsheets if done well. If ~1k people need to
           | access the same data each day and 'analyze' it for similar
           | things (patterns/outliers/seasonalities etc.) then a good
           | dashboard will be quicker, better and cheaper than 1k office
           | workers trying to create pivot tables. If that dashboard is
           | tailored to the use case, then those 'color that one value
           | that bugs you' can oftentimes be implemented within minutes
           | after hearing a good use-case from a user. I say that from
           | experience.
           | 
           | And from experience, I'd say that most Excel users know the
           | basics of basics. I'd bet that 90%+
        
             | hackandthink wrote:
             | Agree.
             | 
             | And a tool like Superset enables users to customize their
             | dashboards and charts.
        
             | c4mpute wrote:
             | The problem here is that you usually do not have ~1k users
             | with all the same requirements. You have 200 groups of
             | average 5 users each, all with their own department-
             | specific, country-specific or workflow-specific
             | requirements. Of course a central solution will be better
             | and cheaper. But it will never be quicker, because you will
             | take ages to just gather requirements from all 200 distinct
             | user groups. As soon as you have those requirements, they
             | will have changed already, so you are working on
             | yesteryear's problems.
             | 
             | And of course, given a working system, the users can drop
             | you a quick email, explain their problem (yes, in an ideal
             | world they could do that, and you would understand them
             | right away...) and you implement a 5min change. In reality
             | however, their problem will first have to be specified in a
             | user story, with a ton of clarification requests until the
             | story is really understood by the dev team, then you need
             | goodwill, time and money for the implementation. And maybe
             | their problem can only be solved by an ugly hack, a weird
             | special case for the ternary currency and ages-old lunar-
             | calendar-based tax-system of lampukistan. Would that really
             | be quicker than just the lampukistan team throwing together
             | a few formulas and be done faster than the initial email?
             | Even when multiplied by the special requirements of the
             | other 100 country sales teams?
             | 
             | Also, I've had similar change requests where is was
             | explicitly asked to provide a spreadsheet prototype of what
             | the statistics should look like. Well, thanks, why again do
             | we need a dev team?
             | 
             | I know that spreadsheets suck. They are ugly, undebuggable
             | hacks, always and without exception. You need tons of time
             | to implement in hours what would be a quick one-liner SQL
             | query. With terrible error behaviour, weird edge cases and
             | hell knows how many hidden bugs when the locale uses the
             | lampukistan-currency-separator instead of a decimal dot...
             | 
             | ...Except that they provide those office drones with
             | velocity, which, as the usual wisdom around here goes, is
             | everything.
        
               | airstrike wrote:
               | This is the way
        
           | c4mpute wrote:
           | To be somewhat constructive: What you rather should have done
           | is not create more elaborate dashboards. What imho the world
           | needs is an easy way to use a spreadsheet tool to generate
           | and publish a dashboard. A "make web dashboard" button right
           | next to the print button. With auto-updates when input data
           | changes of course.
        
             | jamesmaniscalco wrote:
             | There is Smartsheet, which mostly works well for this, but
             | its power-user features are pretty limited compared to
             | Excel.
        
             | airstrike wrote:
             | Yup. That's what I want to build. Thank you for saying that
             | -- I feel like it really validates my feelings hah
        
               | fiddlerwoaroof wrote:
               | The problem is always this project turns into "let's
               | build excel or tableau" and the customers that care
               | usually already use one or the other.
        
               | airstrike wrote:
               | That's fair, but fortunately I'm not planning on doing
               | either. (Well, I am still implementing ~all of Excel's
               | formulas for compatibility, but not the the UI/UX...)
               | 
               | People don't really consume data, they read documents. I
               | think that's (part of) the vision these projects lack.
        
               | amcaskill wrote:
               | I'm working on an OSS BI tool focused on a document form
               | factor. Might be of interest to you.
               | https://github.com/evidence-dev/evidence
        
               | airstrike wrote:
               | Thank you! Definitely interesting! I had actually starred
               | that repo when I saw it being discussed on some HN thread
               | a week or two ago
        
               | [deleted]
        
               | fiddlerwoaroof wrote:
               | Yeah, I misread the post I replied to: I've been on a
               | bunch of internal dashboard projects that were in danger
               | of losing focus and turning into full-fledged
               | visualization platforms.
        
             | uxp8u61q wrote:
             | Have you... used Excel? It's very simple to create any kind
             | of "dashboard" (AKA graphs on a page) and then you just
             | share the web link to the page.
        
               | c4mpute wrote:
               | Yes, I have. What Excel is still lacking is an easy
               | solution for the input side. You can bind tons of data
               | sources, but all are weird, hard-to-use, manual. There is
               | no easy "grab this from that website, get the current
               | data of what I just pasted there, mash it together,
               | publish it"
               | 
               | Hell, it cannot even do proper CSV import. You need to
               | reformat your CSV to match the locale Excel is running
               | under!
        
               | uxp8u61q wrote:
               | Uh? Are you sure you've actually used Excel? The CSV
               | import is highly configurable and leads you immediately
               | into Power Query where you can massage the data any way
               | you want.
        
               | c4mpute wrote:
               | The LibreOffice CSV import is configurable. The Excel one
               | isn't.
               | 
               | You can do things in PowerQuery, but that is far from
               | obvious and still buggy. Not to mention all the woes
               | after import, like date/time auto-interpretation and
               | autocorrections that cannot be switched off.
               | 
               | I stand by what I said. Excel imports are a huge mess.
        
               | uxp8u61q wrote:
               | Would you like me to send you some online tutorials on
               | how to import CSVs into excel? Because at this point it's
               | just crazy. Are you using excel 2009? Do you not know
               | about the "Data" tab in the ribbon? There's a whole
               | dialog to complete with several options when you import a
               | CSV file.
        
               | Seanambers wrote:
               | Powerquery oh god never again!
        
       | caycep wrote:
       | It is basically Excel's way of bringing matrix math to the masses
        
       | gadders wrote:
       | I was working at Lotus when Improv came out. From what I
       | remember, it was popular but as you started adding dimensions
       | memory use blew up. This was in the early days of Windows and 4mb
       | of memory.
        
         | kagakuninja wrote:
         | And I was working at Brio Technology in the late 90s, they had
         | a similar product called DataPivot, first sold in 1991.
         | https://en.wikipedia.org/wiki/Brio_Technology
         | 
         | I don't remember Brio's pivot tables "blowing up" per-se, but I
         | suppose computers had a lot more memory by the time I joined.
         | They used a patented algorithm to create the pivot structure by
         | aggregating a result-set from an SQL query.
        
       | tqi wrote:
       | Pivot tables powered by SSAS cubes may be the best self serve
       | analytics tool I've ever seen (where "best" is measured by how
       | much users actually use it). The ability to meet business users
       | where they are is huge for actual adoption, especially compared
       | to something like Looker (which is hot garbage for other reasons
       | as well). Plus, 9 times out of 10 people want to apply additional
       | lightweight transformations/calculations to the data, so nothing
       | beats being in Excel already.
       | 
       | Too bad OSX support is non existent and writing MDX is a pain in
       | the fucking ass.
        
         | kyllo wrote:
         | I agree and I do this via Power BI. If you import data into a
         | Power BI report, create a data model with calculated measures
         | (in DAX, not MDX), and publish it to the online service, then
         | users can click on "Analyze in Excel" and it downloads an Excel
         | workbook with a pivot table connected to that data model. I
         | provide this to the PMs for the product I work on and they're
         | able to answer a _lot_ of their questions just by pivoting
         | instead of having to write bespoke SQL.
        
           | tqi wrote:
           | Oh that's cool - does it work for business users on OSX? I
           | feel like the fall of Thinkpads was the final nail in the
           | coffin. We tried running a dedicated windows machine that
           | people could remote into, but it was just too much friction.
           | 
           | Edit: One tool that looks promising is Equals (equals.com),
           | but I haven't had a chance to play with it directly to see
           | how it compares.
        
             | kyllo wrote:
             | Assuming the company has a Microsoft 365 subscription, Mac
             | users can just use the Excel web client in their browser to
             | access the data via pivot tables.
        
         | archiewood wrote:
         | > Looker (which is hot garbage for other reasons as well)
         | 
         | From what I remember, Looker does allow you to create pivot
         | tables from the Explore interface?
         | 
         | You can then also download to csv / excel from a Looker
         | explore.
         | 
         | Something missing for you there?
        
           | tqi wrote:
           | It does, but everything is translated to raw sql then pushed
           | to the database layer, which means anything with a meaningful
           | amount of data runs like dogshit. I haven't touched MDX in a
           | long time, but my recollection is that OLAP cubes make a lot
           | of these pivot-table type queries a lot more performant.
           | 
           | Also, while it may seem like a minor thing, not being
           | connected to live source introduces a significant amount of
           | friction and room for human error. Adding a new filter or
           | measure = new copy of a file that you need to keep track of,
           | refreshing with a new month of data = a new copy of a file,
           | etc.
        
             | archiewood wrote:
             | Oh yeah. Any manual data update has potential to go wrong.
             | 
             | Especially since in my exp the most common way to do this
             | is to paste the new data over previous sheet in an excel,
             | and hope all the formulas still work.
             | 
             | Kind of fine, but let's hope there aren't any new
             | categories that weren't there last month!
        
               | tqi wrote:
               | > Kind of fine, but let's hope there aren't any new
               | categories that weren't there last month!
               | 
               | if it does mess up your formulas hopefully it does it in
               | a way that you actually notice!
               | 
               | hyperbole aside, I don't think it's entirely Looker's
               | fault that business users can't seem to get the hang of
               | it, but I think the delta between what users "should use"
               | and "actually use" is large enough that the tool just
               | isn't worth it.
        
             | [deleted]
        
         | hermitcrab wrote:
         | Have you tried Easy Data Transform? It is a lightweight ETL
         | tool for doing data transformations (such as pivot) on Excel,
         | CSV and various other file formats. Runs natively on Windows
         | and Mac.
        
           | tqi wrote:
           | The ETL part is not really the problem business users seem to
           | have, but rather needing the ability to use pre-built
           | measures / dimensions natively in Excel.
        
         | davio wrote:
         | Back in the mid 2000s, our killer app was an example app from
         | MSDN magazine. It basically embedded Excel in a simple web page
         | and we could use the pivot tables against SSAS cubes. We did a
         | little web work for permissions and to save views, but probably
         | had less than a total week of dev work.
        
         | SOLAR_FIELDS wrote:
         | I worked at a company that had this (OLAP Cube + Pivot Tables)
         | as one of my first internships and it was mind blowing. I bring
         | this up in terms of amazing analytics tools in the data space
         | somewhat often. Even highly developed BI tools like metabase
         | can't handle dimensions as well as the pivot table.
        
           | Jgrubb wrote:
           | Can you give an example? I can make metabase pivot tables
           | DANCE (imho).
        
           | namtab00 wrote:
           | I used to work on a BI and reporting platform on the
           | Microsoft stack (SSIS + SSAS + SSRS).. Even got decent at
           | writing MDX.
           | 
           | Thing is, you needed beasty servers to get good performance
           | (event with loads of cube modeling optimizations), but that
           | was rarely the case... This was in the physical servers era,
           | mind you, even started on 32 bit Win Server, which choked up
           | pretty fast...
           | 
           | I switched company around when tabular OLAP models started
           | replacing multi-dimensional models, so I never got to
           | understand those.
           | 
           | MS SSAS got replaced with Qlik / Tableau AFAIK.
        
           | [deleted]
        
       | trelane wrote:
       | In case folks don't yet know how to use pivot tables in
       | LibreOffice:
       | https://books.libreoffice.org/en/CG71/CG7108-PivotTables.htm...
        
       | digging wrote:
       | Okay... so _what is a pivot table_? I 've never used one or seen
       | one used, and my browser didn't load the embedded video.
       | 
       | What I learned about pivot tables from this article:
       | 
       | - they are an easy way to show data that's in a spreadsheet
       | 
       | - they were invented at Lotus (maybe)
       | 
       | I don't even begin to know what they actually do or how someone
       | would use one.
        
         | airstrike wrote:
         | It's a drag-and-drop UI for grouping items in a table by
         | category. Plenty of quick videos on YT since business users
         | generally struggle to grasp it. Here's one example:
         | https://www.youtube.com/watch?v=qu-AK0Hv0b4
         | 
         | It's paradoxically very useful and complete garbage at the same
         | time, IMHO
        
       | ipython wrote:
       | It's too bad that the pivot table is a poor approximation of a
       | true multidimensional spreadsheet, for example Lotus Improv:
       | https://instadeq.com/blog/posts/no-code-history-lotus-improv...
        
         | fiddlerwoaroof wrote:
         | I believe this is a sort of descendant of Improv:
         | https://quantrix.com/products/quantrix-modeler/
        
           | steve1977 wrote:
           | It is, and like Improv, has its roots in NeXSTSTEP.
           | 
           | http://www.kevra.org/TheBestOfNext/ThirdPartyProducts/ThirdP.
           | ..
           | 
           | There used to be a version of Quantrix Modeler that was
           | affordable for ,,home users", it's been quite a while though.
        
         | getravi wrote:
         | Never knew Improv existed. There are tools that are similar to
         | the vision of Improv. I use Anaplan at work everyday and it is
         | exactly what a modern cloud based SaaS version of Improv would
         | feel like. It is a multi billion dollar company and it worked
         | because it did not go after the spreadsheet space but played
         | along nicely with it.
         | 
         | The Improv article concludes "the key strategy mistake was to
         | try to market Improv to the existing spreadsheet market.
         | Instead, if the product were marketed to a segment where the
         | more structured model was a 'feature' not a 'bug' would have
         | given Lotus the time to learn and improve and refine the model
         | to a point where it would have satisfied the larger market as
         | well." and Anaplan seems to not have made this mistake. They
         | have carved out a niche in the EPM (Enterprise Performance
         | Management) market.
        
         | Brian_K_White wrote:
         | The article says Improv is where pivot tables started.
        
         | kagakuninja wrote:
         | You may want to read about DataPivot, developed by Brio
         | Technology during the same period as Lotus. Brio had a patent
         | on the pivot data aggregation algorithm, I'm not sure how
         | Lotus's pivot table worked...
         | 
         | https://en.wikipedia.org/wiki/Brio_Technology
        
         | qsort wrote:
         | Which is itself the poor man's groupby.
         | 
         | At some point you've just got to admit you have the wrong
         | abstraction. Strong Zalgo vibes.
        
           | contravariant wrote:
           | Goupby may be the higher abstraction, but it's not
           | necessarily better.
           | 
           | Dimensional models are basically modules (generalised linear
           | spaces), a groupby can do the same things but doesn't really
           | give much useful structure to work with (at best the result
           | is ordet independent, most of the time).
           | 
           | This is also why sums and counts tend to be more useful than
           | averages.
        
           | airstrike wrote:
           | Or at some point you realize we need some new abstraction ;-)
        
           | jasode wrote:
           | _> Which is itself the poor man's groupby._
           | 
           | SQL GROUP BY creates summarized horizontal _rows_.
           | 
           | Instead, pivot tables are more analogous to _crosstab_
           | queries which creates summarized vertical _columns_. It
           | "pivots" data groupings by rotating from horizontal to
           | vertical.
           | 
           | The older versions of SQL dialects that didn't have the newer
           | cross tab syntax required convoluted CASE syntax to
           | "simulate" pivot tables which didn't really work that well
           | since one had to know ahead of time -- all the unique values
           | -- to put in each CASE condition branch.
        
             | qsort wrote:
             | If you are grouping over them, they shouldn't be columns in
             | the first place, the original SQL is "right". Rows and
             | columns aren't symmetric!
             | 
             | Reshaping data should be a _presentation_ -time decision,
             | not a query-time decision. You have a dataset, a relation
             | in the algebraic sense, and you are choosing to display it
             | in some way: as a table, as a pivoted table, as a pie-
             | chart...
             | 
             | Conflating the two is a consequence of spreadsheets having
             | an unbeatable UX but a terrible data model that lets you
             | treat rows as columns and vice-versa.
        
               | fifilura wrote:
               | There are real use cases for pivoting data in SQL not
               | only for presentation. Tables can be too "long".
               | 
               | You need data points to be on the same row if you want to
               | do arithmetic operations on on different types of values.
               | For example
               | 
               | k + 3.14* number_of_chimneys - 0.13 * age -
               | 1.34*neigbourhood_criminality as house_price
        
               | tomnipotent wrote:
               | > should be a presentation-time decision, not a query-
               | time decision
               | 
               | Agree, but sometimes you just need to shove the results
               | of a SQL query into an Excel file and you don't want to
               | get fancy.
               | 
               | You're either 1) overwriting Sheet B and then using a
               | pivot table in Sheet A to get the final presentation, 2)
               | pivoting in the code/program executing the query before
               | writing to Excel or 3) pivoting in SQL and skipping the
               | code and Excel pivot table altogether.
               | 
               | I run into this a lot with data used for financial
               | modeling, or financial reporting that heavily relies on
               | using dates/categories as column/row headers.
        
               | toyg wrote:
               | _> Reshaping data should be a presentation-time decision_
               | 
               | When you potentially billions or trillions of data
               | points, it isn't.
               | 
               | Rows and columns have limits. You need some hard logic
               | for true multidimensional data at scale.
        
         | ghaff wrote:
         | One of the "crimes" (he types hyperbolically) of Microsoft
         | Office is that is basically sucked all the air out of the room
         | for anything else in the non-graphical artist office
         | productivity area that wasn't Office or a pretty direct knock-
         | off.
         | 
         | The spreadsheet model is a good example (even if Excel _is_
         | probably the best thing in Microsoft Office. But it also means
         | that if you can 't make Word do a good enough job for desktop
         | publishing you generally have to go to InDesign which is
         | probably way overkill if yoiu're not a publishing professional.
        
           | znpy wrote:
           | To be honest i saw a teacher in high school working on his
           | own textbook)the second revision of an already published
           | textbook) in word and while it had the classic wysiwyg
           | experience, it was typographically okay, almost ready to be
           | printed.
        
             | ghaff wrote:
             | Word, or even something like Google Docs which lacks some
             | features in areas like Section numbering, isn't terrible. I
             | published a book using Google Docs and basically decided
             | anything it couldn't do I didn't need or could handle
             | manually. But, if I were actually come up with a wish list
             | for a low-end publishing platform it would probably look a
             | bit different than Word.
        
         | vondur wrote:
         | The linked article specifically mentions Lotus Improv as the
         | app that had this functionality in it. Interesting how Steve
         | Jobs was able to get Lotus to make it a NeXT exclusive app
         | initially.
        
         | dannyobrien wrote:
         | I remember going to the UK launch of Lotus Improv as a newbie
         | journalist. It was really notable how both how flashy and
         | professional it was compared to other products (in retrospect,
         | I'm presuming that was Jobs' influence). Nonetheless, I think
         | they really struggled to explain pivot tables, and why, in
         | itself, that feature was sufficient to move to a new
         | application _and_ a new hardware platform.
        
         | RGamma wrote:
         | Sounds a lot like what you can do with tabular model
         | (PowerPivot) and DAX (measures) in Excel now.
        
         | II2II wrote:
         | I have not used spreadsheets very often since the mid-1990's.
         | One of the reasons: I was excited by the potential of Improv,
         | but disappointed when I realized that it had no future. Little
         | did I realize that pivot tables were a different take on the
         | concept!
         | 
         | (The other reason for abandoning spreadsheets was performance.
         | I forget how good/bad Improv was in this respect, but I doubt
         | that I would have stuck with spreadsheets since the data sets I
         | was dealing with weren't really appropriate for them.)
        
         | re5i5tor wrote:
         | On NeXT 2 years before Windows
        
         | huhtenberg wrote:
         | Half of the QZ article is literally about Salas and Improv.
        
       | bambax wrote:
       | > _Rather than enter formulas, users would be able to point and
       | click to get those summary statistics. The Lotus team called this
       | tool "flexible views," but today similar tools are called "pivot
       | tables" in both Microsoft Excel and Google Sheets._
       | 
       | > _The Lotus team showed Jobs an early prototype. "Steve Jobs
       | thought it was the coolest thing ever," Salas, now a professor at
       | Brandeis University, tells Quartz. Jobs then convinced Lotus to
       | develop the pivot table software exclusively for the NeXT
       | computer. The software came out as Lotus Improv, and though the
       | NeXT computer was a commercial failure, Lotus Improv would be
       | hugely influential._
       | 
       | Joel Spolsky had this to say about Improv though [0]:
       | 
       | > _When we were designing Excel 5.0, the first major release to
       | use serious activity-based planning, we only had to watch about
       | five customers using the product before we realized that an
       | enormous number of people just use Excel to keep lists. They are
       | not entering any formulas or doing any calculation at all! We
       | hadn't even considered this before. Keeping lists turned out to
       | be far more popular than any other activity with Excel. And this
       | led us to invent a whole slew of features that make it easier to
       | keep lists: easier sorting, automatic data entry, the AutoFilter
       | feature which helps you see a slice of your list, and multi-user
       | features which let several people work on the same list at the
       | same time while Excel automatically reconciles everything._
       | 
       | > _While Excel 5 was being designed, Lotus had shipped a "new
       | paradigm" spreadsheet called Improv. According to the press
       | releases, Improv was a whole new generation of spreadsheet, which
       | was going to blow away everything that existed before it. For
       | various strange reasons, Improv was first available on the NeXT,
       | which certainly didn't help its sales, but a lot of smart people
       | believed that Improv would be to NeXT as VisiCalc was to the
       | Apple II: it would be the killer app that made people go out and
       | buy all new hardware just to run one program._
       | 
       | > _Of course, Improv is now a footnote in history. Search for it
       | on the web, and the only links you'll find are from very over-
       | organized storeroom managers who have, for some reason, made a
       | web site with an inventory of all the stuff they have collecting
       | dust._
       | 
       | > _Why? Because in Improv, it was almost impossible to just make
       | lists. The Improv designers thought that people were using
       | spreadsheets to create complicated multi-dimensional financial
       | models. Turns out, if they asked people, they would discover that
       | making lists was so much more common than multi-dimensional
       | financial models, and in Improv, making lists was a downright
       | chore, if not impossible._
       | 
       | [0] https://www.joelonsoftware.com/2000/05/09/the-process-of-
       | des...
       | 
       | - - -
       | 
       | I don't know if pivot tables are cool; one big problem that
       | people often overlook is that they have to be recalculated
       | ("refreshed") manually; this can lead to significant errors.
       | 
       | Conditional sums are not "complex formulas", they are often
       | easier to understand and debug than pivot tables -- and they are
       | recomputed with each change, which will eventually save your ass.
        
         | rvba wrote:
         | It would be interesting to know why Microsoft was so incredibly
         | stubborn to not allow to filter by colors / collect value of
         | color in a cell.
         | 
         | I am very, very aware that this is the wrong way to do things,
         | but users really prefer to take their list and color their
         | wrong cells red and the correct cells in green instead of using
         | a separate column for "status". Of course a separate column
         | with status then allows to have the 99999 different type of
         | statuses that are created.. but is just clunkier.
         | 
         | And I know that you can now (after how many years?) filter by
         | colors, what again is clunky if there are multiple colors, but
         | you cannot get the cells color without custom VBA.
         | 
         | A simple "cellcolor()" formula would allow to make faster color
         | filters.
         | 
         | It is very funny that every organization seems to have a
         | "database" which is a list of stuff. And "big data" is when
         | this list does not fit to Excel anymore.
        
       | klysm wrote:
       | Just let me write SQL please
        
         | ChrisClark wrote:
         | You can in Google Sheets. Well, close enough at least. Use the
         | QUERY function.
        
         | whalesalad wrote:
         | duckdb will let you do both!
         | https://duckdb.org/docs/sql/statements/pivot.html
        
         | jabroni_salad wrote:
         | Well, another nice thing with pivot tables, in excel at least,
         | is I can just follow the data connection to its home and do
         | whatever I want with that. RBQL in rainbowcsv is my usual.
        
         | yunohn wrote:
         | IME pivoting on SQL results is a very common usecase too.
        
         | chasd00 wrote:
         | ignorance is bliss but you should learn to work all your tools
         | well not just rely on one.
        
         | airstrike wrote:
         | In Excel, you can have the source data for the pivot table be
         | an SQL query -- then the pivoting is just for presentation
         | purposes when you get down to the very last layer of groupby's,
         | so to speak
         | 
         | You can also use SQL queries as sources for regular tables in
         | Excel (i.e. no pivoting)
         | 
         | One of my first roles out of undergrad was to take reports that
         | would take an analyst literally days or weeks to generate into
         | a set of SQL queries that got them very close to the answer
         | with pivot tables (or just tables). Their job went from working
         | on reports to changing a couple parameters in the queries like
         | "current month" or "forecast version", which only took a couple
         | minutes, leaving them with plenty of time to think about new
         | reports to generate, how else to improve current reports, etc.
         | Still one of the most personally satisfying things I've ever
         | done.
         | 
         | Granted we can't expect all business users to learn to write
         | SQL (spoilers: they never will). I believe we collectively need
         | a more robust solution than having an SQL angel come by and
         | write queries for business users... it feels obvious and within
         | reach, but no one has really done it yet.
        
         | baq wrote:
         | common use case: get a metric truckload of data from a column
         | store binned per minute (or 5), give the giant sheet to users,
         | users use pivot tables to create hourly/daily/weekly reports as
         | they see fit. (not all users are dumb it turns out)
        
         | DanAtC wrote:
         | Dynamically pivoting SQL is a pain in most dialects
        
         | roywiggins wrote:
         | If only generating pivot tables in SQL were uniformly easy.
         | Postgres's "crosstab" is awkward at best, forcing you to
         | specify column names every time makes exploration unpleasant.
         | In the end I found it easier to do crosstabulation _in the UI_
         | (or at least, not in SQL).
        
       ___________________________________________________________________
       (page generated 2023-10-09 23:00 UTC)