[HN Gopher] The pivot table, the spreadsheet's most powerful too... ___________________________________________________________________ The pivot table, the spreadsheet's most powerful tool (2020) Author : airstrike Score : 224 points Date : 2023-10-09 14:36 UTC (8 hours ago) (HTM) web link (qz.com) (TXT) w3m dump (qz.com) | mcsoft wrote: | I use pivot tables all the time. The concept is brilliant, but | the Excel UI leaves a lot to be desired. | | At first, you're amazed at the flexibility, but once you become | comfortable, you suddenly hit the limitations. Can't sort by a | calculated column, can't categorize without adding columns in the | data, etc. | | I looked at Quantrix for a while, and it was a bit too complex | for practical purposes. I wonder if there are any decent | PivotTable tools out there? | asah wrote: | nice! | | AI is about to end this - shortly, we'll be able to ask for the | answer directly, in zero clicks. | airstrike wrote: | this is unfairly downvoted, IMHO -- i doubt it will take "zero | clicks" -- we're more likely replacing clicks with prompts -- | but LLMs can definitely help bridge the gap between data | wrangling and presentation. | | replies to this comment seem to overvalue the challenges | without acknowledging that those may eventually be overcome. | seektable wrote: | search-driven analytics is not really a new thing and | products in this space were before LLM-era. This kind of | interface can be useful for some categories of users but it | is not a game-changer - prompts cannot replace Excel and its | pivot tables, and in fact typing prompts may be even more | complicated for users than good old 'clicks'. | Brian_K_White wrote: | and have no confidence in it | rchaud wrote: | Perhaps we could create a set of logical prompts to get there | even faster. We could call it a structured query language! | mxuribe wrote: | Or better yet: conversation query language (CQL)! ...Though | the risk exists for verbal name collision. ;-) | Bishonen88 wrote: | Dunno, seems to me that asking (writing in this case) is not | final solution. In fact, in most cases, typing down a question | will take longer than looking in a well created table. | | This chat-like back-and-forth, will take forever to understand | some dataset properly. One question answered will give birth to | 5 additional questions and so on. | hankyone wrote: | Everytime I have tried to use pivot tables I end up just firing | PowerBI instead, it's a much better experience. | jimnotgym wrote: | It depends. It is common to use the results in further Excel | calculations. | bafe wrote: | I usually end up using panda or R | Scubabear68 wrote: | They are indeed very powerful, if unfortunately named, and the UI | always takes me a while to do it properly in Excel. | | I like how MacOS Numbers does it, the Categories feature is much | easier to use, however it has limitations that are very annoying | if you reach them. | | For anything complex these days, it is often easiest to export | the spreadsheet to CSV and run SQL queries on it with CSVQ. | kagakuninja wrote: | Microsoft directly stole their pivot table from a former employer | of mine, Brio Technology, and their product DataPivot. | https://en.wikipedia.org/wiki/Brio_Technology | | According to the CEO, Microsoft summoned him to Redmond and made | an insultingly low offer to buy the company. They threatened to | build their own product and put Brio out of business if he | refused. He refused and MS later added pivot tables to Excel. | | The article mentions Lotus, who apparently built a similar | product around the same time. The Brio founders were involved | with a company called Metaphor, and maybe some of the ideas were | developed there... | | Brio products shipped with a sample database, the contents of the | CEO's wine cellar. That exact data could be seen on early | Microsoft office boxes... | [deleted] | viraptor wrote: | What part did MS steal? The pivot table concept (as in | summarising data by categories of rows and similar operations) | existed for a really long time, so I'm guessing you're not | referring to that? | onlyrealcuzzo wrote: | Aren't you allowed to "steal" basically everything except | exact code in software? | | Didn't the supreme court rule in favor of Google vs Oracle | that Google's use of the JDK API was fair use? | | Wasn't there also a recent supreme court ruling that you | can't patent non-novel math (algorithms)? | | I'm not sure how you could be convicted of theft in software | outside of literally stealing and reselling someone's non- | open source code. | kagakuninja wrote: | You could say they just reverse-engineered Brio's product, | and therefore it wasn't theft. | | But the presence of the Brio sample wine DB appearing on MS | Office boxes showed that they bought a copy of DataPivot and | used that DB when developing their clone. They didn't even | bother hiding the evidence. | | BTW, DataPivot was released in 1991, and the only other | references I've seen for prior art are a Lotus product | released about the same time. | chinchilla2020 wrote: | What did they steal? The underlying tech? the name 'pivot | table'? | | The concept of aggregating categorized columns of data is not | something Brio Technology invented. | ambyra wrote: | Remember when Microsoft wanted to license CPM to IBM, but they | didn't own it, so they got someone to make a clone and licensed | that to IBM? | riemannzeta wrote: | Something similar happened to Stac Electronics. | https://en.wikipedia.org/wiki/Stac_Electronics | | But Stac had patents, lawyered up, and sued Microsoft for | patent infringement. Microsoft didn't lie down, but after a | court ruled that both of them needed a license to ship | products, Stac went to the largest OEM customers and offered a | license. It was either take the patent license or not ship, and | the pressure from the OEMs on Microsoft resulted in a | settlement that was not terrible for Stac. | | This is one of the best examples I know of how software patents | can be good for competition. | roenxi wrote: | I think it is worth noting that these patents weren't good at | all for competition. Stac Electronics proved to be in no way | capable of competing with MS Excel, pivot tables are | technically not that hard and the lawsuit might have stopped | MS bringing them to a huge number of people faster. | | They did nothing much, and if they'd done more they'd have | retarded progress. | wtallis wrote: | The fact that you clearly have no clue what business Stac | was in rather undermines the argument you're trying to | make. | echelon wrote: | > This is one of the best examples I know of how software | patents can be good for competition. | | Small companies and startups with products in the market | (before existing incumbents have products) should be able to | get software patents to defend their innovation. | | Big companies should be able to have patents, but not use | them against smaller players. | | Patent trolls with no products in the market shouldn't be | allowed to have patents. | | Universities and university researchers should be able to | have patents, but they should be forced to license at not- | unreasonable terms to startups. | | My thoughts on the matter, anyway. The point is to encourage | innovation, especially by enabling small players bringing new | stuff to market. Give them a small shield against the big | incumbents. | deaddodo wrote: | The settlement was alright, but Microsoft's inclusion of | DoubleSpace still inevitably killed Stac (and even larger | potential profits) as they were unable to pivot when the | technology became unviable. | kagakuninja wrote: | Brio had a patent on their pivot table aggregation algorithm, | perhaps Microsoft did something differently in their | implementation. I've never used an Excel pivot table. | eps wrote: | According to Wikipedia Brio got their patent in 1999, i.e. | well after Excel got pivot tables. | kagakuninja wrote: | OK, fair enough. We don't know how long it took to get | approved. In theory they could have gone after MS. I | vaguely remember that our rival Business Objects had some | BI patents and sued us, and we sued them over pivot. | canadianfella wrote: | [dead] | HappyDaoDude wrote: | I am completely on board with the idea of Richard Stallmans | of abolishing software patents BUT it is simply an ideal that | I do not see ever happening on a large scale. The fact is | that when it comes to business, they can help. It is just a | shame when there is the potential to leverage them against | smaller players. | | Things that work on one scale for good can also be leveraged | for bad on another scale. | pacaro wrote: | Arguably this, and similar cases, is what motivated Microsoft | to become an aggressive patent filer. I know that at various | times in my tenure there (99-13) that I was on teams where | patent filing was encouraged | thrdbndndn wrote: | this probably is more about specific implementations than pivot | table in general, but I often find it's "lacking" when wanting to | do some more advanced things with it. | | To name a few: | | The way how the "values" are generated is very limited in both | Excel and Google Sheets (in different ways). | | The way how filter/sorting works with pivot table isn't the most | straightforward or flexible. | | the "UI" elements (headers, styles, etc.) is very hard to | control. I often find myself creating a pivot table, and copy it | somewhere else, and manually fix bunch of stuff -- which kinda | defeats the point of it (since I can no longer dynamically update | it). | | Disclaimer: I'm by no means a spreadsheet expert so I may just | miss something. | airstrike wrote: | Agreed, current implementations are totally brittle and far | from "feature complete" relative to what an experienced / power | user would expect coming in | | It actually bothers me so much I've decided to write my own | spreadsheet engine. This is one of the pain points I want to | fix | conductr wrote: | I'm pretty close to a spreadsheet expert, I categorize pivot | tables as just quick and dirty adhoc/disposable data | visualization and never would use it as something for | presentation or where styling mattered. | | I usually use it to quickly find unexpected values in the | underlying data columns. Although with spill formulas like | =unique() I'm using it less and less for this. | | I find often people wanting more out of it, are really asking | for Power Query. Which is there, just a lot of people are | intimidated by. (Maybe not HN, but general population) | seektable wrote: | You may try these online pivot tables https://www.seektable.com | where you can re-order rows/columns simply with a click on the | header, apply filters via simple input where you can specify | which items to keep or exclude. | xyzelement wrote: | Despite being an engineer and well versed in SQL, the ability to | do things quickly in Excel, including whipping up pivot tables, | has been a big boost to my career, especially as I moved more to | product and business management. | | Pivot tables let you super quickly see how things break down | across dimensions and play with that analysis in a way that makes | for rapid decision making that's not matched by much else other | than established tools for well-defined spaces. | chinchilla2020 wrote: | I feel the opposite. I rapidly upload my data to bigquery and | start slicing. | | I can fire out SQL queries way faster than I can click around | in excel... and it is reproducible since it's easier to copy- | paste a query from history than redo all formatting in an excel | sheet later on. | WendyTheWillow wrote: | I've never gotten into pivot tables, mostly because data has | always been available in a relational db for my career. | | But the effort to move the data into one from a spreadsheet is | way overkill, so I do think it's not suboptimal to use them | even as an engineer. | xyzelement wrote: | I can see how all pivot table capabilities can be replicated | via SQL but I think some of that becomes very expensive and | error prone while being simple drag-drop in Excel/Sheets. | | The ease of doing it is a key feature. if I have to build a | certain report because it's my job, I will do it whatever it | takes. If I am just doing extra due diligence for myself, I | may not do it if it takes hours of SQL crafting. | bloopernova wrote: | I don't know if anyone will find this useful, but here's | something I learned literally this weekend. | | Sort rows/cells into groups based on the value of a cell. | =FILTER(Stories!B2:D13,Stories!F2:F13=A2) | | First parameter, "Stories!B2:D13" is a group of cells showing | some stories. | | Second parameter, "Stories!F2:F13=A2" is the column where each | cell is compared to the value of A2. Rows that match are then | copied into wherever the =FILTER formula is placed. | | I use it to take a list of Stories and sorts them into Sprints | automatically. That's useful for program increment planning, etc. | | The other useful Excel thing I learned recently is: | =IF(NOT(ISBLANK(A2)),HYPERLINK("https://jira- | instance.atlassian.net/browse/PROJECT-"&A2,"PROJECT-"&A2),"") | | That says: If the cell A2 is not blank, append its value onto the | url given, and show that as a link with the text PROJECT- with | the value of A2 appended. | | I know I should have probably done something cooler with Emacs | and org-mode, but I have to share it with a lot of business | folks. | | If by some chance either of those are useful to you, I hope they | work OK for you :) | layman51 wrote: | Thanks, I really find these examples useful. For me, the QUERY | function in Google Sheets has been really useful too. Same | thing with IMPORTRANGE. It really helps with documenting what | you did to get a count of rows. | bloopernova wrote: | I'm glad you found them useful, thank you for letting me | know! | __mharrison__ wrote: | I use this in Pandas all the time. (Just got done teaching a | class about it.) Very useful once you grok the syntax. | | Smart folks like duckdb realize the utility (and the pain of | doing this in normal SQL) and have added PIVOT to their | implementation. Super useful. | tda wrote: | Back in the days I recall there was some plug-in to query Pivot | tables in postgres, but it was such a pain to use that a two | step approach (query the column names, and then generate a | second query based on these column names) with | Python/SQLAlchemy glue code was easier and possibly even more | performant. I hope the situation has improved, as indeed pivot | tables are so useful | bafe wrote: | As much as I don't like working in R, the pivoting/reshaping | functions in tidyr are hard to beat. It's much more ergonomic | than the similar functions in Python | phkahler wrote: | >> Smart folks like duckdb realize the utility (and the pain of | doing this in normal SQL) | | I used to love the Microsoft Access visual query tool. Super | intuitive but maybe a little too abstract for normal people. It | would also produce SQL which was how I learned a little bit of | that. | ideamotor wrote: | I like R's implementation in dplyr, which is pivot_longer and | pivot_wider. I'd really like to see those in Postgres. | bafe wrote: | Weren't those functions moved to tidyr? Anyway they should be | taken as a model for how to implement a sane interface for | data reshaping in other languages. You can say what you want | about R/Wickham, but he's not afraid to thinker with his | modules until they find the perfect user experience | hackandthink wrote: | Oracle has PIVOT as well. (the last time I used it you had to | specify the columns manually) | tacon wrote: | I once read about several companies whose entire product was an | Excel addin that did ... pivot tables. The average businessperson | had never heard of the pivot table functionality built into | Excel, and they were an easy sale for the magical demo of a pivot | table on their own data. | Qem wrote: | The second most powerful is probably the Solver, that few | programming languages (like prolog) include equivalent | functionality out of the box. That's one thing that consistently | gets me crawling back from Python/Pharo to Calc. | mritchie712 wrote: | For more power: QUERY() | | https://www.benlcollins.com/spreadsheets/google-sheets-query... | sorokod wrote: | "Here's an example of the pivot table in action." | | What follows reminds me of "draw the rest of the f* *ing owl" | meme. | elsadek wrote: | I new about pivot table when I was working with Python Pandas. | Pandas has this feature, and it has saved me a huge amount of | time. | cgio wrote: | Great feature but also a very dangerous one. Few cases of people | sharing a spreadsheet with an innocuous summary pivot table, not | realising they also, in the process, share the full dataset that | supports the pivot with sensitive data. | jimnotgym wrote: | I am regarded as a decent Excel user, and in my field, Finance, | that is saying at least something. However, I had never tried | power pivot until today. I was a bit annoyed that it can't handle | many to many relationships. Time to dust off my SQL skills and do | it the old fashioned way... | | I had a table of sales transactions, and a table of stock | balance. I wanted to join them on the item sold so per item I had | stock balance and a sales value per sku. I was suprised it | wouldn't do it. It returned in less than a second as a sell | query. | proamdev123 wrote: | The workaround for many to many relationships isn't too hard. | | What you have to do is create another table containing unique | values of items sold, and then make 1:many relationships from | that table to the other two. You can easily make the unique | value table by copying and pasting all of the items sold into a | single column on a new sheet, highlight them all, and then Data | -> Drop Duplicates. It's a little annoying, but not hard. | strongpigeon wrote: | I remember in Google Ads, we'd build all these cool looking | dashboards but whenever we did user research it was pretty | apparent that all they wanted was their data in a pivot table... | listenallyall wrote: | Basically the users were trying to tell you (sounds like you | never got the message) that your dashboards were insufficient | at answering the questions people had to answer in their | workplace by their bosses every day. | | Nobody cares that they looked cool (highly subjective, anyway) | if they can't be used to get work done. Where your team thought | you were adding value, you were just wasting time. | c4mpute wrote: | All these cool-looking dashboards are just too inflexible. You | cannot add your own aggragates beyond trivialities. You cannot | just "color that one value that bugs you". You cannot just | generate a readable report plus some explanatory text. | | Spreadsheet export + pivot table gives you all that. Doable for | any moderately competent office drone without a round-trip | through some endless backlog-spec-sprint-program-test-respec- | sprint-... loop | Bishonen88 wrote: | That's a bit oversimplifying IMO. | | There's a place for well-crafted analytics dashboards in | today's business, too. They're mostly tailored to specific | user requirements/use-cases and look nothing like the flashy | stuff one sees on dribbble or elsewhere. | | Tailored analytics dashboard can solve many pain points of | Excel + Spreadsheets if done well. If ~1k people need to | access the same data each day and 'analyze' it for similar | things (patterns/outliers/seasonalities etc.) then a good | dashboard will be quicker, better and cheaper than 1k office | workers trying to create pivot tables. If that dashboard is | tailored to the use case, then those 'color that one value | that bugs you' can oftentimes be implemented within minutes | after hearing a good use-case from a user. I say that from | experience. | | And from experience, I'd say that most Excel users know the | basics of basics. I'd bet that 90%+ | hackandthink wrote: | Agree. | | And a tool like Superset enables users to customize their | dashboards and charts. | c4mpute wrote: | The problem here is that you usually do not have ~1k users | with all the same requirements. You have 200 groups of | average 5 users each, all with their own department- | specific, country-specific or workflow-specific | requirements. Of course a central solution will be better | and cheaper. But it will never be quicker, because you will | take ages to just gather requirements from all 200 distinct | user groups. As soon as you have those requirements, they | will have changed already, so you are working on | yesteryear's problems. | | And of course, given a working system, the users can drop | you a quick email, explain their problem (yes, in an ideal | world they could do that, and you would understand them | right away...) and you implement a 5min change. In reality | however, their problem will first have to be specified in a | user story, with a ton of clarification requests until the | story is really understood by the dev team, then you need | goodwill, time and money for the implementation. And maybe | their problem can only be solved by an ugly hack, a weird | special case for the ternary currency and ages-old lunar- | calendar-based tax-system of lampukistan. Would that really | be quicker than just the lampukistan team throwing together | a few formulas and be done faster than the initial email? | Even when multiplied by the special requirements of the | other 100 country sales teams? | | Also, I've had similar change requests where is was | explicitly asked to provide a spreadsheet prototype of what | the statistics should look like. Well, thanks, why again do | we need a dev team? | | I know that spreadsheets suck. They are ugly, undebuggable | hacks, always and without exception. You need tons of time | to implement in hours what would be a quick one-liner SQL | query. With terrible error behaviour, weird edge cases and | hell knows how many hidden bugs when the locale uses the | lampukistan-currency-separator instead of a decimal dot... | | ...Except that they provide those office drones with | velocity, which, as the usual wisdom around here goes, is | everything. | airstrike wrote: | This is the way | c4mpute wrote: | To be somewhat constructive: What you rather should have done | is not create more elaborate dashboards. What imho the world | needs is an easy way to use a spreadsheet tool to generate | and publish a dashboard. A "make web dashboard" button right | next to the print button. With auto-updates when input data | changes of course. | jamesmaniscalco wrote: | There is Smartsheet, which mostly works well for this, but | its power-user features are pretty limited compared to | Excel. | airstrike wrote: | Yup. That's what I want to build. Thank you for saying that | -- I feel like it really validates my feelings hah | fiddlerwoaroof wrote: | The problem is always this project turns into "let's | build excel or tableau" and the customers that care | usually already use one or the other. | airstrike wrote: | That's fair, but fortunately I'm not planning on doing | either. (Well, I am still implementing ~all of Excel's | formulas for compatibility, but not the the UI/UX...) | | People don't really consume data, they read documents. I | think that's (part of) the vision these projects lack. | amcaskill wrote: | I'm working on an OSS BI tool focused on a document form | factor. Might be of interest to you. | https://github.com/evidence-dev/evidence | airstrike wrote: | Thank you! Definitely interesting! I had actually starred | that repo when I saw it being discussed on some HN thread | a week or two ago | [deleted] | fiddlerwoaroof wrote: | Yeah, I misread the post I replied to: I've been on a | bunch of internal dashboard projects that were in danger | of losing focus and turning into full-fledged | visualization platforms. | uxp8u61q wrote: | Have you... used Excel? It's very simple to create any kind | of "dashboard" (AKA graphs on a page) and then you just | share the web link to the page. | c4mpute wrote: | Yes, I have. What Excel is still lacking is an easy | solution for the input side. You can bind tons of data | sources, but all are weird, hard-to-use, manual. There is | no easy "grab this from that website, get the current | data of what I just pasted there, mash it together, | publish it" | | Hell, it cannot even do proper CSV import. You need to | reformat your CSV to match the locale Excel is running | under! | uxp8u61q wrote: | Uh? Are you sure you've actually used Excel? The CSV | import is highly configurable and leads you immediately | into Power Query where you can massage the data any way | you want. | c4mpute wrote: | The LibreOffice CSV import is configurable. The Excel one | isn't. | | You can do things in PowerQuery, but that is far from | obvious and still buggy. Not to mention all the woes | after import, like date/time auto-interpretation and | autocorrections that cannot be switched off. | | I stand by what I said. Excel imports are a huge mess. | uxp8u61q wrote: | Would you like me to send you some online tutorials on | how to import CSVs into excel? Because at this point it's | just crazy. Are you using excel 2009? Do you not know | about the "Data" tab in the ribbon? There's a whole | dialog to complete with several options when you import a | CSV file. | Seanambers wrote: | Powerquery oh god never again! | caycep wrote: | It is basically Excel's way of bringing matrix math to the masses | gadders wrote: | I was working at Lotus when Improv came out. From what I | remember, it was popular but as you started adding dimensions | memory use blew up. This was in the early days of Windows and 4mb | of memory. | kagakuninja wrote: | And I was working at Brio Technology in the late 90s, they had | a similar product called DataPivot, first sold in 1991. | https://en.wikipedia.org/wiki/Brio_Technology | | I don't remember Brio's pivot tables "blowing up" per-se, but I | suppose computers had a lot more memory by the time I joined. | They used a patented algorithm to create the pivot structure by | aggregating a result-set from an SQL query. | tqi wrote: | Pivot tables powered by SSAS cubes may be the best self serve | analytics tool I've ever seen (where "best" is measured by how | much users actually use it). The ability to meet business users | where they are is huge for actual adoption, especially compared | to something like Looker (which is hot garbage for other reasons | as well). Plus, 9 times out of 10 people want to apply additional | lightweight transformations/calculations to the data, so nothing | beats being in Excel already. | | Too bad OSX support is non existent and writing MDX is a pain in | the fucking ass. | kyllo wrote: | I agree and I do this via Power BI. If you import data into a | Power BI report, create a data model with calculated measures | (in DAX, not MDX), and publish it to the online service, then | users can click on "Analyze in Excel" and it downloads an Excel | workbook with a pivot table connected to that data model. I | provide this to the PMs for the product I work on and they're | able to answer a _lot_ of their questions just by pivoting | instead of having to write bespoke SQL. | tqi wrote: | Oh that's cool - does it work for business users on OSX? I | feel like the fall of Thinkpads was the final nail in the | coffin. We tried running a dedicated windows machine that | people could remote into, but it was just too much friction. | | Edit: One tool that looks promising is Equals (equals.com), | but I haven't had a chance to play with it directly to see | how it compares. | kyllo wrote: | Assuming the company has a Microsoft 365 subscription, Mac | users can just use the Excel web client in their browser to | access the data via pivot tables. | archiewood wrote: | > Looker (which is hot garbage for other reasons as well) | | From what I remember, Looker does allow you to create pivot | tables from the Explore interface? | | You can then also download to csv / excel from a Looker | explore. | | Something missing for you there? | tqi wrote: | It does, but everything is translated to raw sql then pushed | to the database layer, which means anything with a meaningful | amount of data runs like dogshit. I haven't touched MDX in a | long time, but my recollection is that OLAP cubes make a lot | of these pivot-table type queries a lot more performant. | | Also, while it may seem like a minor thing, not being | connected to live source introduces a significant amount of | friction and room for human error. Adding a new filter or | measure = new copy of a file that you need to keep track of, | refreshing with a new month of data = a new copy of a file, | etc. | archiewood wrote: | Oh yeah. Any manual data update has potential to go wrong. | | Especially since in my exp the most common way to do this | is to paste the new data over previous sheet in an excel, | and hope all the formulas still work. | | Kind of fine, but let's hope there aren't any new | categories that weren't there last month! | tqi wrote: | > Kind of fine, but let's hope there aren't any new | categories that weren't there last month! | | if it does mess up your formulas hopefully it does it in | a way that you actually notice! | | hyperbole aside, I don't think it's entirely Looker's | fault that business users can't seem to get the hang of | it, but I think the delta between what users "should use" | and "actually use" is large enough that the tool just | isn't worth it. | [deleted] | hermitcrab wrote: | Have you tried Easy Data Transform? It is a lightweight ETL | tool for doing data transformations (such as pivot) on Excel, | CSV and various other file formats. Runs natively on Windows | and Mac. | tqi wrote: | The ETL part is not really the problem business users seem to | have, but rather needing the ability to use pre-built | measures / dimensions natively in Excel. | davio wrote: | Back in the mid 2000s, our killer app was an example app from | MSDN magazine. It basically embedded Excel in a simple web page | and we could use the pivot tables against SSAS cubes. We did a | little web work for permissions and to save views, but probably | had less than a total week of dev work. | SOLAR_FIELDS wrote: | I worked at a company that had this (OLAP Cube + Pivot Tables) | as one of my first internships and it was mind blowing. I bring | this up in terms of amazing analytics tools in the data space | somewhat often. Even highly developed BI tools like metabase | can't handle dimensions as well as the pivot table. | Jgrubb wrote: | Can you give an example? I can make metabase pivot tables | DANCE (imho). | namtab00 wrote: | I used to work on a BI and reporting platform on the | Microsoft stack (SSIS + SSAS + SSRS).. Even got decent at | writing MDX. | | Thing is, you needed beasty servers to get good performance | (event with loads of cube modeling optimizations), but that | was rarely the case... This was in the physical servers era, | mind you, even started on 32 bit Win Server, which choked up | pretty fast... | | I switched company around when tabular OLAP models started | replacing multi-dimensional models, so I never got to | understand those. | | MS SSAS got replaced with Qlik / Tableau AFAIK. | [deleted] | trelane wrote: | In case folks don't yet know how to use pivot tables in | LibreOffice: | https://books.libreoffice.org/en/CG71/CG7108-PivotTables.htm... | digging wrote: | Okay... so _what is a pivot table_? I 've never used one or seen | one used, and my browser didn't load the embedded video. | | What I learned about pivot tables from this article: | | - they are an easy way to show data that's in a spreadsheet | | - they were invented at Lotus (maybe) | | I don't even begin to know what they actually do or how someone | would use one. | airstrike wrote: | It's a drag-and-drop UI for grouping items in a table by | category. Plenty of quick videos on YT since business users | generally struggle to grasp it. Here's one example: | https://www.youtube.com/watch?v=qu-AK0Hv0b4 | | It's paradoxically very useful and complete garbage at the same | time, IMHO | ipython wrote: | It's too bad that the pivot table is a poor approximation of a | true multidimensional spreadsheet, for example Lotus Improv: | https://instadeq.com/blog/posts/no-code-history-lotus-improv... | fiddlerwoaroof wrote: | I believe this is a sort of descendant of Improv: | https://quantrix.com/products/quantrix-modeler/ | steve1977 wrote: | It is, and like Improv, has its roots in NeXSTSTEP. | | http://www.kevra.org/TheBestOfNext/ThirdPartyProducts/ThirdP. | .. | | There used to be a version of Quantrix Modeler that was | affordable for ,,home users", it's been quite a while though. | getravi wrote: | Never knew Improv existed. There are tools that are similar to | the vision of Improv. I use Anaplan at work everyday and it is | exactly what a modern cloud based SaaS version of Improv would | feel like. It is a multi billion dollar company and it worked | because it did not go after the spreadsheet space but played | along nicely with it. | | The Improv article concludes "the key strategy mistake was to | try to market Improv to the existing spreadsheet market. | Instead, if the product were marketed to a segment where the | more structured model was a 'feature' not a 'bug' would have | given Lotus the time to learn and improve and refine the model | to a point where it would have satisfied the larger market as | well." and Anaplan seems to not have made this mistake. They | have carved out a niche in the EPM (Enterprise Performance | Management) market. | Brian_K_White wrote: | The article says Improv is where pivot tables started. | kagakuninja wrote: | You may want to read about DataPivot, developed by Brio | Technology during the same period as Lotus. Brio had a patent | on the pivot data aggregation algorithm, I'm not sure how | Lotus's pivot table worked... | | https://en.wikipedia.org/wiki/Brio_Technology | qsort wrote: | Which is itself the poor man's groupby. | | At some point you've just got to admit you have the wrong | abstraction. Strong Zalgo vibes. | contravariant wrote: | Goupby may be the higher abstraction, but it's not | necessarily better. | | Dimensional models are basically modules (generalised linear | spaces), a groupby can do the same things but doesn't really | give much useful structure to work with (at best the result | is ordet independent, most of the time). | | This is also why sums and counts tend to be more useful than | averages. | airstrike wrote: | Or at some point you realize we need some new abstraction ;-) | jasode wrote: | _> Which is itself the poor man's groupby._ | | SQL GROUP BY creates summarized horizontal _rows_. | | Instead, pivot tables are more analogous to _crosstab_ | queries which creates summarized vertical _columns_. It | "pivots" data groupings by rotating from horizontal to | vertical. | | The older versions of SQL dialects that didn't have the newer | cross tab syntax required convoluted CASE syntax to | "simulate" pivot tables which didn't really work that well | since one had to know ahead of time -- all the unique values | -- to put in each CASE condition branch. | qsort wrote: | If you are grouping over them, they shouldn't be columns in | the first place, the original SQL is "right". Rows and | columns aren't symmetric! | | Reshaping data should be a _presentation_ -time decision, | not a query-time decision. You have a dataset, a relation | in the algebraic sense, and you are choosing to display it | in some way: as a table, as a pivoted table, as a pie- | chart... | | Conflating the two is a consequence of spreadsheets having | an unbeatable UX but a terrible data model that lets you | treat rows as columns and vice-versa. | fifilura wrote: | There are real use cases for pivoting data in SQL not | only for presentation. Tables can be too "long". | | You need data points to be on the same row if you want to | do arithmetic operations on on different types of values. | For example | | k + 3.14* number_of_chimneys - 0.13 * age - | 1.34*neigbourhood_criminality as house_price | tomnipotent wrote: | > should be a presentation-time decision, not a query- | time decision | | Agree, but sometimes you just need to shove the results | of a SQL query into an Excel file and you don't want to | get fancy. | | You're either 1) overwriting Sheet B and then using a | pivot table in Sheet A to get the final presentation, 2) | pivoting in the code/program executing the query before | writing to Excel or 3) pivoting in SQL and skipping the | code and Excel pivot table altogether. | | I run into this a lot with data used for financial | modeling, or financial reporting that heavily relies on | using dates/categories as column/row headers. | toyg wrote: | _> Reshaping data should be a presentation-time decision_ | | When you potentially billions or trillions of data | points, it isn't. | | Rows and columns have limits. You need some hard logic | for true multidimensional data at scale. | ghaff wrote: | One of the "crimes" (he types hyperbolically) of Microsoft | Office is that is basically sucked all the air out of the room | for anything else in the non-graphical artist office | productivity area that wasn't Office or a pretty direct knock- | off. | | The spreadsheet model is a good example (even if Excel _is_ | probably the best thing in Microsoft Office. But it also means | that if you can 't make Word do a good enough job for desktop | publishing you generally have to go to InDesign which is | probably way overkill if yoiu're not a publishing professional. | znpy wrote: | To be honest i saw a teacher in high school working on his | own textbook)the second revision of an already published | textbook) in word and while it had the classic wysiwyg | experience, it was typographically okay, almost ready to be | printed. | ghaff wrote: | Word, or even something like Google Docs which lacks some | features in areas like Section numbering, isn't terrible. I | published a book using Google Docs and basically decided | anything it couldn't do I didn't need or could handle | manually. But, if I were actually come up with a wish list | for a low-end publishing platform it would probably look a | bit different than Word. | vondur wrote: | The linked article specifically mentions Lotus Improv as the | app that had this functionality in it. Interesting how Steve | Jobs was able to get Lotus to make it a NeXT exclusive app | initially. | dannyobrien wrote: | I remember going to the UK launch of Lotus Improv as a newbie | journalist. It was really notable how both how flashy and | professional it was compared to other products (in retrospect, | I'm presuming that was Jobs' influence). Nonetheless, I think | they really struggled to explain pivot tables, and why, in | itself, that feature was sufficient to move to a new | application _and_ a new hardware platform. | RGamma wrote: | Sounds a lot like what you can do with tabular model | (PowerPivot) and DAX (measures) in Excel now. | II2II wrote: | I have not used spreadsheets very often since the mid-1990's. | One of the reasons: I was excited by the potential of Improv, | but disappointed when I realized that it had no future. Little | did I realize that pivot tables were a different take on the | concept! | | (The other reason for abandoning spreadsheets was performance. | I forget how good/bad Improv was in this respect, but I doubt | that I would have stuck with spreadsheets since the data sets I | was dealing with weren't really appropriate for them.) | re5i5tor wrote: | On NeXT 2 years before Windows | huhtenberg wrote: | Half of the QZ article is literally about Salas and Improv. | bambax wrote: | > _Rather than enter formulas, users would be able to point and | click to get those summary statistics. The Lotus team called this | tool "flexible views," but today similar tools are called "pivot | tables" in both Microsoft Excel and Google Sheets._ | | > _The Lotus team showed Jobs an early prototype. "Steve Jobs | thought it was the coolest thing ever," Salas, now a professor at | Brandeis University, tells Quartz. Jobs then convinced Lotus to | develop the pivot table software exclusively for the NeXT | computer. The software came out as Lotus Improv, and though the | NeXT computer was a commercial failure, Lotus Improv would be | hugely influential._ | | Joel Spolsky had this to say about Improv though [0]: | | > _When we were designing Excel 5.0, the first major release to | use serious activity-based planning, we only had to watch about | five customers using the product before we realized that an | enormous number of people just use Excel to keep lists. They are | not entering any formulas or doing any calculation at all! We | hadn't even considered this before. Keeping lists turned out to | be far more popular than any other activity with Excel. And this | led us to invent a whole slew of features that make it easier to | keep lists: easier sorting, automatic data entry, the AutoFilter | feature which helps you see a slice of your list, and multi-user | features which let several people work on the same list at the | same time while Excel automatically reconciles everything._ | | > _While Excel 5 was being designed, Lotus had shipped a "new | paradigm" spreadsheet called Improv. According to the press | releases, Improv was a whole new generation of spreadsheet, which | was going to blow away everything that existed before it. For | various strange reasons, Improv was first available on the NeXT, | which certainly didn't help its sales, but a lot of smart people | believed that Improv would be to NeXT as VisiCalc was to the | Apple II: it would be the killer app that made people go out and | buy all new hardware just to run one program._ | | > _Of course, Improv is now a footnote in history. Search for it | on the web, and the only links you'll find are from very over- | organized storeroom managers who have, for some reason, made a | web site with an inventory of all the stuff they have collecting | dust._ | | > _Why? Because in Improv, it was almost impossible to just make | lists. The Improv designers thought that people were using | spreadsheets to create complicated multi-dimensional financial | models. Turns out, if they asked people, they would discover that | making lists was so much more common than multi-dimensional | financial models, and in Improv, making lists was a downright | chore, if not impossible._ | | [0] https://www.joelonsoftware.com/2000/05/09/the-process-of- | des... | | - - - | | I don't know if pivot tables are cool; one big problem that | people often overlook is that they have to be recalculated | ("refreshed") manually; this can lead to significant errors. | | Conditional sums are not "complex formulas", they are often | easier to understand and debug than pivot tables -- and they are | recomputed with each change, which will eventually save your ass. | rvba wrote: | It would be interesting to know why Microsoft was so incredibly | stubborn to not allow to filter by colors / collect value of | color in a cell. | | I am very, very aware that this is the wrong way to do things, | but users really prefer to take their list and color their | wrong cells red and the correct cells in green instead of using | a separate column for "status". Of course a separate column | with status then allows to have the 99999 different type of | statuses that are created.. but is just clunkier. | | And I know that you can now (after how many years?) filter by | colors, what again is clunky if there are multiple colors, but | you cannot get the cells color without custom VBA. | | A simple "cellcolor()" formula would allow to make faster color | filters. | | It is very funny that every organization seems to have a | "database" which is a list of stuff. And "big data" is when | this list does not fit to Excel anymore. | klysm wrote: | Just let me write SQL please | ChrisClark wrote: | You can in Google Sheets. Well, close enough at least. Use the | QUERY function. | whalesalad wrote: | duckdb will let you do both! | https://duckdb.org/docs/sql/statements/pivot.html | jabroni_salad wrote: | Well, another nice thing with pivot tables, in excel at least, | is I can just follow the data connection to its home and do | whatever I want with that. RBQL in rainbowcsv is my usual. | yunohn wrote: | IME pivoting on SQL results is a very common usecase too. | chasd00 wrote: | ignorance is bliss but you should learn to work all your tools | well not just rely on one. | airstrike wrote: | In Excel, you can have the source data for the pivot table be | an SQL query -- then the pivoting is just for presentation | purposes when you get down to the very last layer of groupby's, | so to speak | | You can also use SQL queries as sources for regular tables in | Excel (i.e. no pivoting) | | One of my first roles out of undergrad was to take reports that | would take an analyst literally days or weeks to generate into | a set of SQL queries that got them very close to the answer | with pivot tables (or just tables). Their job went from working | on reports to changing a couple parameters in the queries like | "current month" or "forecast version", which only took a couple | minutes, leaving them with plenty of time to think about new | reports to generate, how else to improve current reports, etc. | Still one of the most personally satisfying things I've ever | done. | | Granted we can't expect all business users to learn to write | SQL (spoilers: they never will). I believe we collectively need | a more robust solution than having an SQL angel come by and | write queries for business users... it feels obvious and within | reach, but no one has really done it yet. | baq wrote: | common use case: get a metric truckload of data from a column | store binned per minute (or 5), give the giant sheet to users, | users use pivot tables to create hourly/daily/weekly reports as | they see fit. (not all users are dumb it turns out) | DanAtC wrote: | Dynamically pivoting SQL is a pain in most dialects | roywiggins wrote: | If only generating pivot tables in SQL were uniformly easy. | Postgres's "crosstab" is awkward at best, forcing you to | specify column names every time makes exploration unpleasant. | In the end I found it easier to do crosstabulation _in the UI_ | (or at least, not in SQL). ___________________________________________________________________ (page generated 2023-10-09 23:00 UTC)