[HN Gopher] Probability and Statistics Cookbook (2011) [pdf]
___________________________________________________________________

Probability and Statistics Cookbook (2011) [pdf]

Author : cpp_frog
Score  : 174 points
Date   : 2022-06-06 13:34 UTC (9 hours ago)

(HTM) web link (pages.cs.wisc.edu)
(TXT) w3m dump (pages.cs.wisc.edu)

| kiernanmcgowan wrote:
| Another set of notes that I refer to often is from ECE 830, also at UW Madison[0]. It was a great class that really represented the culmination of all the probability theory and signals classes I had taken over the years.
|
| [0] https://nowak.ece.wisc.edu/ece830/index.html

| cjohnson318 wrote:
| This is a nice collection of definitions and key results, but it's not a cookbook. I think of a cookbook as a collection of useful, focused examples demonstrating best practices and listing caveats.

| jenny91 wrote:
| This actually seems pretty good and has great coverage!

| snicker7 wrote:
| There is a book titled "All of Statistics" if you'd like a whirlwind tour.

| conformist wrote:
| https://www.stat.cmu.edu/~larry/all-of-statistics/index.html

| buzzdenver wrote:
| I would call this a cheat sheet rather than a cookbook.

| jmt_ wrote:
| Agree. I typed up something similar (but less detailed) as a reference during my undergrad stats major. I'd expect a cookbook to have worked-out examples of applications of these topics, but it still looks very useful as a reference.

| mturmon wrote:
| Actually quite good.
|
| I've TA'd this class, and it's surprising how many of these little facts can be helpful if you recall them at the right time. I was just reminded of:
|
|     Var[Y] = E[Var[Y|X]] + Var[E[Y|X]]
|
| and it unstuck me from a little puzzle.
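The identity mturmon quotes is the law of total variance. As a quick numeric sanity check, both sides can be computed exactly for a small two-component mixture (a hypothetical setup for illustration, not an example from the thread):

```python
# Law of total variance: Var[Y] = E[Var[Y|X]] + Var[E[Y|X]].
# Hypothetical mixture: X ~ Bernoulli(0.3),
# Y|X=0 ~ Normal(0, 1), Y|X=1 ~ Normal(5, 2).
p = 0.3
mu = {0: 0.0, 1: 5.0}       # conditional means E[Y|X=x]
sigma = {0: 1.0, 1: 2.0}    # conditional std devs of Y given X=x

# Left side: Var[Y] computed directly from the mixture moments.
e_y = (1 - p) * mu[0] + p * mu[1]
e_y2 = (1 - p) * (sigma[0]**2 + mu[0]**2) + p * (sigma[1]**2 + mu[1]**2)
var_y = e_y2 - e_y**2

# Right side: E[Var[Y|X]] + Var[E[Y|X]].
e_var = (1 - p) * sigma[0]**2 + p * sigma[1]**2
var_e = (1 - p) * (mu[0] - e_y)**2 + p * (mu[1] - e_y)**2

assert abs(var_y - (e_var + var_e)) < 1e-12
```

Here the "within-group" term contributes 1.9 and the "between-group" term 5.25, so Var[Y] = 7.15; the decomposition lets you compute a variance without ever writing down the full mixture density.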
|
| More recent version at: http://statistics.zone

| jarenmf wrote:
| Similar to the law of total expectation; it's easier for me to think of it as partitioning a weighted mean :)

| willdearden wrote:
| There is a professor who was at Wisconsin, Charles Manski, who developed partial identification, which uses tons of these decompositions.
|
| The idea: say you have a binary survey question where 80% respond and 90% of them respond "yes". What can we say about the population "yes" rate (assume the sample size is huge for simplicity)?
|
|     P(Yes) = P(Yes | response) * P(response)
|              + P(Yes | no response) * P(no response)
|            = 0.9 * 0.8 + P(Yes | no response) * 0.2
|            = 0.72 + P(Yes | no response) * 0.2
|
| Then 0 <= P(Yes | no response) <= 1, so 0.72 <= P(Yes) <= 0.92. This example is somewhat trivial, but it's a useful technique for showing exactly how your assumptions map to inferences.

| evandwight wrote:
| For those wondering how this equation is true:
|
| https://en.wikipedia.org/wiki/Law_of_total_variance

| mdp2021 wrote:
| Note that the linked PDF is a 2011 version (as is explicit), but the most recent version (0.2.7) is dated 2021 and is available at https://github.com/mavam/stat-cookbook/releases/download/0.2...
|
| There are "release notes" pages with the differences (at https://github.com/mavam/stat-cookbook/releases).

| Bo0kerDeWitt wrote:
| Nice, the original LaTeX code is there too.

| russellbeattie wrote:
| I don't know math at all. I'd love a programmer version of this, with all the algorithms in code. Probably already exists in NumPy or something.

| time_to_smile wrote:
| If you're interested in probability and statistics, it's well worth your time to get more comfortable with the math.
|
| There's a common mistake in thinking among programmers that there's a one-to-one mapping between math and code and that mathematical notation is just annoyingly terse shorthand.
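willdearden's interval arithmetic is easy to mechanize. A small sketch (the function name `yes_rate_bounds` is mine, for illustration): the identified part of P(Yes) is fixed by the data, and the bounds come from letting the unobserved P(Yes | no response) range over [0, 1].

```python
def yes_rate_bounds(p_response: float, p_yes_given_response: float):
    """Return (lower, upper) bounds on P(Yes) via the law of total
    probability, letting P(Yes | no response) range over [0, 1]."""
    identified = p_yes_given_response * p_response
    lower = identified + 0.0 * (1 - p_response)  # nonrespondents all "no"
    upper = identified + 1.0 * (1 - p_response)  # nonrespondents all "yes"
    return lower, upper

# The thread's numbers: 80% response rate, 90% "yes" among responders.
lo, hi = yes_rate_bounds(0.8, 0.9)
print(round(lo, 2), round(hi, 2))  # 0.72 0.92, matching the comment
```

The width of the interval is exactly the nonresponse rate, which makes the sensitivity of the inference to the missing data explicit.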
|
| As someone who spends a lot of time implementing mathematical ideas in code, I can tell you this is not remotely true. Mathematics deals with a level of abstraction and thinking that is fundamentally distinct from the computational implementation of those ideas.
|
| A clear example of this is the Gamma function, which appears all over those notes. It's an essential function for working deeply with statistics; you'll find it shows up just about everywhere if you look carefully enough. You can manipulate it mathematically to solve a range of problems.
|
| However, if you want to implement this from scratch in code, that is, to understand how to _compute_ the Gamma function, you're going to have to spend a lot of time studying numerical methods if you want to do more than robotically copy it from _Numerical Recipes_.
|
| Similarly, many of the integrals used in statistics can end up quite difficult to compute, but that difficulty doesn't impact their ease of use in a mathematical context. This is a common theme when working with applied math: you can do quite a lot of mathematical work on problems that you don't necessarily know how to compute yet. Once you solve your problem mathematically, then you can go on to solving how to actually compute the answer.

| jbay808 wrote:
| To add to this comment, it's hard to make any useful program if you don't have at least a clear conceptual understanding of what you're trying to do.
|
| For example, perhaps you are trying to calculate a variance. But do you have a set of raw data from which you will estimate the variance? Or some summary statistics? Or do you already have a probability distribution from which you will compute the variance? How is it represented? Is it the probability distribution over the particular variable you want the variance for, or a related variable that needs to be transformed first somehow?
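On time_to_smile's point about computing the Gamma function: one standard route is the Lanczos approximation, which replaces the integral definition with a short rational series plus a power and an exponential. A minimal sketch using the widely published g = 7, n = 9 coefficient set (an illustration of the technique, not code from the thread):

```python
import math

# Lanczos approximation to the Gamma function (g = 7, n = 9).
# The coefficients below are the standard published values for this g.
_LANCZOS_G = 7
_LANCZOS_C = [
    0.99999999999980993, 676.5203681218851, -1259.1392167224028,
    771.32342877765313, -176.61502916214059, 12.507343278686905,
    -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7,
]

def gamma(z: float) -> float:
    if z < 0.5:
        # Reflection formula extends the approximation to z < 0.5:
        # Gamma(z) * Gamma(1 - z) = pi / sin(pi * z).
        return math.pi / (math.sin(math.pi * z) * gamma(1.0 - z))
    z -= 1.0
    x = _LANCZOS_C[0]
    for i, c in enumerate(_LANCZOS_C[1:], start=1):
        x += c / (z + i)
    t = z + _LANCZOS_G + 0.5
    return math.sqrt(2.0 * math.pi) * t ** (z + 0.5) * math.exp(-t) * x

assert abs(gamma(5.0) - 24.0) < 1e-9               # Gamma(n) = (n-1)!
assert abs(gamma(0.5) - math.sqrt(math.pi)) < 1e-9
```

Even this "simple" version hides real numerical analysis: the coefficients were derived by a separate optimization, and a careless reimplementation loses accuracy or overflows for large arguments, which is exactly the gap between manipulating Gamma on paper and computing it.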
|
| You don't necessarily need to know how to handle all the math by hand, but there's no avoiding the need for at least a clear idea of what you're doing and what the sticking points might be.

___________________________________________________________________
(page generated 2022-06-06 23:01 UTC)