[HN Gopher] Pandas Illustrated: Visual Guide to Pandas
___________________________________________________________________

Pandas Illustrated: Visual Guide to Pandas

Author : nemoniac
Score  : 109 points
Date   : 2023-01-27 19:41 UTC (3 hours ago)

(HTM) web link (scribe.citizen4.eu)
(TXT) w3m dump (scribe.citizen4.eu)

| axi1 wrote:
| The proper (free) link is
| https://betterprogramming.pub/pandas-illustrated-the-definit...

| jcq3 wrote:
| Yet another pandas tutorial. Got chatgpt now, thx.

| r2_pilot wrote:
| Good luck with plausible hallucinated interfaces in your
| statistically-generated responses.

| timdellinger wrote:
| This seems to be getting the Hug of Death, but this looks like
| the content:
|
| https://betterprogramming.pub/pandas-illustrated-the-definit...

| neonate wrote:
| https://web.archive.org/web/20230127194856/https://scribe.ci...

| dark-star wrote:
| These are not the Pandas I was looking for _waves hand_

| matsemann wrote:
| Can recommend taking a look at Polars. Kinda a successor to
| pandas.
|
| https://www.pola.rs/

| z3c0 wrote:
| Interesting. Seems to also take quite a few leaves from
| PySpark's book.

| 89vision wrote:
| Neat. I love that there's a rust implementation. Types make
| everything better

| throwaway_75369 wrote:
| So, given the title and how stressful the last couple of weeks
| have been, I was sadly disappointed when this wasn't about
| drawing cute black and white bears.
|
| I mean, data analysis is useful and all, but not what the heart
| wanted at the moment.

| [deleted]

| 867-5309 wrote:
| asking DALL-E for some Python Pandas might relieve our
| disappointment

| [deleted]

| [deleted]

| irrational wrote:
| LOL. Those were not the kind of pandas I was expecting.
|
| One of my daughters is a panda bear fanatic and I thought this
| would be a resource I could share with her.

| tomcam wrote:
| Same! Although at first glance it appears to be an excellent
| example of clear, well-illustrated documentation.

| oneoff786 wrote:
| I do almost all of my day job in pandas. I consider myself very
| good at it. My number one recommendation to new data scientists
| learning the ropes is to just not use NumPy almost at all. I'm
| not sure where people learn it but they do all of this
| complicated nonsense. Just map simple Python lambda funcs with
| pd.Series.map and that's most of what you need. Memorize your
| pd.DataFrame methods.
|
| If your code feels like it's dealing with a matrix and not a
| table, it's probably doing something funny.

| boppo1 wrote:
| What is your day job?

| ajoseps wrote:
| I think it really depends on the scale of data. If you're
| dealing with anything less than a GB, it probably doesn't
| matter all that much, but once you're dealing with larger
| datasets there is a pretty massive difference with using
| vectorized operations. Some of the pandas dataframe methods map
| to underlying numpy ones, but I don't believe that is always
| the case.

| _Wintermute wrote:
| You lose a lot of performance not using vectorised functions.
| Maybe not an issue if you're only dealing with small amounts of
| data.

| oneoff786 wrote:
| Series.map is vectorized.
|
| Pretty much everything you need in pandas is as performant as
| you ought to need for doing tabular data manipulation in
| Python. Except dataframe.apply

| _Wintermute wrote:
| It is not.
|
|     import numpy as np
|     import pandas as pd
|
|     df = pd.DataFrame({"foo": np.random.randn(100000)})
|
| pandas map:
|
|     df["foo"].map(lambda x: x * 2)
|
| 18.1 ms ± 109 µs per loop (mean ± std. dev. of 7 runs,
| 100 loops each)
|
| pandas apply:
|
|     df["foo"].apply(lambda x: x * 2)
|
| 17.9 ms ± 46.6 µs per loop (mean ± std. dev. of 7 runs,
| 100 loops each)
|
| Vectorised function, using underlying numpy operations:
|
|     df["foo"] * 2
|
| 267 µs ± 11.8 µs per loop (mean ± std. dev. of 7 runs,
| 1000 loops each)

| lcvriend wrote:
| If by "vectorized" you mean: "able to delegate the task of
| performing mathematical operations on the array's contents
| to optimized, compiled C code." then I do not think you are
| correct (unless perhaps you are supplying map with a dict
| or Series).
|
| Series.map is not compiling your lambdas to C and running
| them. If there is a built-in method available it usually will
| be faster. A notable exception is the pandas str methods, which
| devolve into Python code but generally with more overhead
| than map/apply.

| voxelghost wrote:
| Check out polars.
|
| Vectorized, choice between lazy optimization and eager.

| rbanffy wrote:
| @dang can you replace the link with the original?
| https://betterprogramming.pub/pandas-illustrated-the-definit...

| [deleted]
___________________________________________________________________
(page generated 2023-01-27 23:00 UTC)