[HN Gopher] The unreasonable effectiveness of conditional probabilities
       ___________________________________________________________________
        
       The unreasonable effectiveness of conditional probabilities
        
       Author : kqr
       Score  : 134 points
       Date   : 2023-02-22 16:04 UTC (6 hours ago)
        
 (HTM) web link (two-wrongs.com)
 (TXT) w3m dump (two-wrongs.com)
        
       | Hitton wrote:
        | The first problem is known as the "coupon collector's problem".
        
       | [deleted]
        
       | ketanmaheshwari wrote:
       | The unreasonable effectiveness of articles having the phrase
       | "Unreasonable Effectiveness" in their title on HN:
       | https://hn.algolia.com/?q=unreasonable+effectiveness
        
       | nickponline wrote:
       | I wrote a blog about the general case
       | https://nickp.svbtle.com/general-collectors-problem
        
       | gnulinux wrote:
        | Other commenters have covered other issues, but I also want to
        | note that this article is terrible for another reason. For the
        | first problem, the author works through the math all the way to
        | a recursive expression, then switches to Python to compute the
        | value. This is bad because writing Python code to calculate the
        | ~8.3564 was easy in the first place: they did mathematical
        | analysis that wasn't even necessary, then fell back on code to
        | actually solve the problem. To be instructional, they could have
        | derived the 8.333 solution mathematically (as e.g. dgacmu or
        | vecter did in this thread) and only then moved to Python for an
        | alternative solution. As written, it's not only confusing but
        | also incomplete for both approaches.
        
       | photochemsyn wrote:
        | While I wouldn't dream of posting ChatGPT content, here are some
        | _questions_ you could give to ChatGPT to make all of this a fair
        | bit clearer:
       | 
       | First, what is an expectation value, as defined in the field of
       | statistics?
       | 
       | Second, is there some reason an expectation value wouldn't behave
       | like a strange attractor from dynamical systems theory and sort
       | of wander about, rather than settle down to a precise value?
       | 
       | Third, what's the connection (if any) between conditional
       | probability and expectation values?
       | 
        | For the first example, a more interesting real-world problem
        | would be one where the total number of different outcomes (toys)
        | is unknown - say, counting the number of species of beetle that
        | exist on planet Earth by steadily sampling all the habitats one
        | can find - how long would the tails be?
       | 
        | Variations of the later examples might involve odds that vary
        | over time in conjunction with costs of entry that vary over time
        | - what's the acceptable range of odds & costs that justifies
        | playing with some expectation of at least coming out even?
        | That's a bit more like real-world markets.
        
         | throwaway81523 wrote:
         | I think you were downvoted (not by me) because your questions,
         | except for the one about strange attractors, are about basic
         | topics in probability theory. So you should probably read a
         | book or take a class. The one about strange attractors is more
         | interesting and maybe not studied enough, but you could look at
         | https://en.wikipedia.org/wiki/El_Farol_Bar_problem for a
         | related example.
        
       | dgacmu wrote:
        | I find this a quite terrible explanation of a fairly simple
        | concept, but perhaps I haven't had enough coffee. Here's an
        | alternative try:
        | 
        | First, for the computer scientists out there, this is the Coupon
        | Collector's problem - so you probably already know it's
        | Θ(N log N). That doesn't give us a precise answer, but it should
        | guide our intuition - it should take _something_ like
        | 4 * log(4) ~= 8 draws to get what we want. [1]
       | 
       | Being exact:
       | 
        | When you already have 3 of the 4 toys, what's the chance you
       | get the 4th? It's 1 in 4. The expected number of draws to get an
       | item with P=1/4 is 4.
       | 
       | Generalizing, the expected number of tries to get an item with
       | probability p is 1/p. When you have 2 toys, you have a 1/2
       | probability of getting one of the two you haven't seen already,
       | etc. When you have no toys, you always get something you don't
       | have.
       | 
       | So the only slightly tricky case is when you have 1 toy, where
       | you have a 3 in 4 chance of getting something new. 1/(3/4) = 1
       | 1/3.
       | 
       | So the answer is very simply 4 + 2 + 1 1/3 + 1 = 8.3333.
       | 
       | More on the Coupon Collector's Problem:
       | https://en.wikipedia.org/wiki/Coupon_collector%27s_problem
       | 
       | [1] The actual bounds use ln, not log2, but from a "getting
       | intuition" perspective log2 gets us into the right range.
        
         | amluto wrote:
         | I agree - the OP's explanation is terrible. What does this even
         | mean:
         | 
         | > From a fundamentals perspective, the expected number of meals
         | is a function of the probability of finding the fourth toy in
         | any specific meal.
         | 
         | I can barely parse it. The probability of finding the nth toy
         | before the nth meal is 0, so is the answer a function of 0? Or
         | maybe the probability of finding the toy with index 4 in a
         | specific meal is 1/4, and the answer is a function of 1/4?
         | 
         | Anyway, if you drink another coffee and stop reducing your
         | fractions, you get a nicer way to say the answer :) . After
         | getting n different toys, the probability of a duplicate on the
         | next try is n/4, so the probability of a new toy is 1-n/4 =
         | (4-n)/4, and it takes 4/(4-n) draws on expectation to go from n
          | to n+1 toys. So to go from 0 toys to 4 toys takes 4 * (1/4 +
         | 1/3 + 1/2 + 1/1) tries on expectation, because you can just add
         | the expected number of tries for each new toy.
        
           | kgwgk wrote:
           | >> From a fundamentals perspective, the expected number of
           | meals is a function of the probability of finding the fourth
           | toy in any specific meal.
           | 
           | > I can barely parse it. The probability of finding the nth
           | toy before the nth meal is 0, so is the answer a function of
           | 0?
           | 
           | The probability of finding the fourth toy in the first meal
           | is 0%.
           | 
           | The probability of finding the fourth toy in the second meal
           | is 0%.
           | 
           | The probability of finding the fourth toy in the third meal
           | is 0%.
           | 
           | The probability of finding the fourth toy in the fourth meal
           | is 9.4% or whatever.
           | 
            | The probability of finding the fourth toy in the fifth meal
            | is 14.1% or whatever.
           | 
           | Etc.
           | 
            | The expected number of meals is 1 x 0% + 2 x 0% + 3 x 0% + 4
            | x 9.4% + 5 x 14.1% + 6 x ... + ...
        
             | [deleted]
        
             | kgwgk wrote:
             | More precisely:
             | 
             | The probability of finding the fourth toy in the first meal
             | is 0%.
             | 
             | The probability of finding the fourth toy in the second
             | meal is 0%.
             | 
             | The probability of finding the fourth toy in the third meal
             | is 0%.
             | 
             | The probability of finding the fourth toy in the fourth
             | meal is 3/32 = 9.4%.
             | 
             | The probability of finding the fourth toy in the fifth meal
             | is 9/64 = 14.1%.
             | 
              | The probability of finding the fourth toy in the sixth
              | meal is 75/512 = 14.6%.
             | 
              | The probability of finding the fourth toy in the seventh
              | meal is 135/1024 = 13.2%.
             | 
              | The probability of finding the fourth toy in the eighth
              | meal is 903/8192 = 11.0%.
             | 
             | The probability of finding the fourth toy in the ninth meal
             | is 1449/16384 = 8.8%.
             | 
             | Etc.
             | 
              | The expected number of meals is 1 x 0% + 2 x 0% + 3 x 0% +
              | 4 x 9.4% + 5 x 14.1% + 6 x 14.6% + 7 x 13.2% + 8 x 11.0% +
              | 9 x 8.8% + ...
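              | 
              | These figures and the expectation can be checked
              | with a short inclusion-exclusion computation (a
              | sketch; the helper name is made up):
              | 
              |   from fractions import Fraction
              |   from math import comb
              | 
              |   def p_done_by(m, n=4):
              |       # P(all n toys seen within m meals).
              |       return sum((-1)**j * comb(n, j)
              |                  * Fraction(n - j, n)**m
              |                  for j in range(n + 1))
              | 
              |   # P(collection completed at exactly meal m):
              |   for m in range(4, 10):
              |       p = p_done_by(m) - p_done_by(m - 1)
              |       print(m, p, float(p))
              | 
              |   # Expectation, truncating the infinite sum:
              |   e = sum(m * (p_done_by(m) - p_done_by(m - 1))
              |           for m in range(4, 200))
              |   print(float(e))  # ~8.3333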
        
         | onos wrote:
         | A way to see the scaling this commenter quoted:
         | 
          | Suppose there are on average N items left to be seen at time
          | t, out of N_tot total items. Then, using a continuous-time
          | approximation,
          | 
          | dN/dt = -N / N_tot
          | 
          | which rearranges to
          | 
          | -dN / N = dt / N_tot.
          | 
          | Integrating from N_tot down to some small value a on the left
          | and from 0 to t on the right gives
          | 
          | ln(N_tot / a) = t / N_tot,
          | 
          | and solving for t gives t = N_tot * ln(N_tot / a) - the
          | N log N scaling quoted.
        
         | vecter wrote:
         | To put it in a different way, construct a Markov chain with
         | these states and transitions:
         | 
         | (A_0) -> (A_1) -> (A_2) -> (A_3) -> (A_4)
         | 
         | State A_i = you have i unique toys. Each state from 1-4 also
         | has a transition to itself.
         | 
          | The probability of going from A_i -> A_{i+1} is P(A_{i,i+1}) =
          | (1 - i/4); these are 1, 3/4, 2/4, and 1/4 respectively. The
          | transition probability of staying in the same state is the
          | complement, i/4, although we don't use those values in our
          | calculations.
         | 
         | The expected time to reach A_4 is E[A_0 -> A_1] + E[A_1 -> A_2]
         | + E[A_2 -> A_3] + E[A_3 -> A_4] = 1/P(A_{0,1}) + 1/P(A_{1,2}) +
          | 1/P(A_{2,3}) + 1/P(A_{3,4}) = 1 + 4/3 + 2 + 4 = 8 1/3. This
          | works due both to linearity of expectation and to the fact
          | that each state either stays the same or advances. If any
          | state could transition to a state other than itself or the
          | subsequent one, you'd have to do a more complex calculation.
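          | 
          | If you'd rather get the answer mechanically, the standard
          | absorbing-chain machinery gives the same 8 1/3 (a numpy
          | sketch):
          | 
          |   import numpy as np
          | 
          |   # Transient states A_0..A_3; A_4 is absorbing. Q holds
          |   # the transient-to-transient transition probabilities.
          |   Q = np.zeros((4, 4))
          |   for i in range(4):
          |       Q[i, i] = i / 4              # duplicate toy: stay
          |       if i + 1 < 4:
          |           Q[i, i + 1] = 1 - i / 4  # new toy: advance
          | 
          |   # Expected steps to absorption: t = (I - Q)^-1 * 1.
          |   t = np.linalg.solve(np.eye(4) - Q, np.ones(4))
          |   print(t[0])  # 8.3333... starting from zero toys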
        
         | tunesmith wrote:
         | After 9 purchases, what's the percentage certainty you'd have
         | all four toys? If you had to pre-purchase all n at once from
         | the outset, what's the value of n to be 99.9% certain you'd get
         | all four?
        
           | travisjungroth wrote:
           | Estimates from 67 million simulations
           | 
           | After 9: 71.13%
           | 
           | p99.9: 29
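            | 
            | For reference, a sketch of the kind of simulation that
            | produces these estimates (fewer trials here):
            | 
            |   import random
            | 
            |   def draws_to_collect(n=4):
            |       seen, draws = set(), 0
            |       while len(seen) < n:
            |           seen.add(random.randrange(n))
            |           draws += 1
            |       return draws
            | 
            |   trials = sorted(draws_to_collect()
            |                   for _ in range(1_000_000))
            |   # P(all four toys within 9 meals):
            |   print(sum(t <= 9 for t in trials) / len(trials))
            |   # 99.9th percentile of meals needed:
            |   print(trials[int(0.999 * len(trials))])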
        
             | tunesmith wrote:
             | Thanks for doing that. I suspected those answers were very
             | different than the 8.3, which I'm guessing must have been
             | around the 50th percentile. The way the initial question
             | was asked, it seemed to suggest you had a pretty high
             | probability of all four after 8.3, which isn't true.
        
         | wizofaus wrote:
          | Doesn't it depend on the number/distribution of coupons
          | printed/toys manufactured, though? To take an extreme example:
          | if there are 4 different toys and only 10 of each are
          | manufactured, then the probability of getting 4 all the same
          | is surely somewhat lower than it would be if you assume an
          | essentially infinite toy supply where the chance of getting a
          | particular toy is always 1/4.
        
           | mlyle wrote:
            | We're effectively assuming that the toy supply is infinite
            | and that each toy is equally likely.
           | 
           | Or, to put it slightly more formally, we're choosing each
           | time from 4 equally likely possibilities, with replacement.
           | Imagine you're drawing from a bag with 4 toys inside. How
           | long until you've drawn each of the 4 toys, if you replace
           | the toy in the bag after each choice?
        
         | whack wrote:
         | My initial hunch was also to calculate the answer as 1 + 4/3 +
         | 2 + 4. Ie, "average tries to get 1st toy, plus average tries to
         | get 2nd toy, plus ...."
         | 
         | However, my memory of probability is very fuzzy and I wasn't
         | sure if my above intuition was mathematically rigorous. Ie, can
         | the expected values of these 4 events be simply summed in order
         | to get the combined expected value?
         | 
          | For anyone else curious about this, here's the mathematical
          | proof that you can indeed do so. Ie, proof that E(X+Y) = E(X)
          | + E(Y), where X can be defined as the number of tries to get
          | the 1st toy, Y as the number of tries to get the 2nd toy, and
          | so on: https://www.milefoot.com/math/stat/rv-sums.htm
         | 
         | In this specific example, X and Y are independent random
         | variables. But if my reading of the above proof is correct, you
         | can still sum the expected values even if X and Y happen to be
         | correlated.
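          | 
          | A quick numerical illustration that the identity survives
          | correlation (a made-up toy example):
          | 
          |   import random
          | 
          |   # X = heads in 3 flips; Y depends on X, so they are
          |   # correlated. E[X + Y] still equals E[X] + E[Y].
          |   xs, ys = [], []
          |   for _ in range(1_000_000):
          |       x = sum(random.randint(0, 1) for _ in range(3))
          |       xs.append(x)
          |       ys.append(x + random.randint(0, 1))
          | 
          |   n = len(xs)
          |   print(sum(xs) / n + sum(ys) / n)  # ~3.5
          |   print(sum(x + y for x, y in zip(xs, ys)) / n)  # ~3.5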
        
           | vecter wrote:
            | Linearity of expectation is of course true in general, but
            | in this case summing the expected waiting times (the
            | reciprocals of the individual transition probabilities) is
            | only valid because the only possible transitions are from
            | state i to i+1 (owning i unique toys to i+1 unique toys) or
            | staying in state i. If you could magically go from state i
            | to any other state (losing unique toys, or gaining more
            | than one unique toy), then this simple calculation would
            | not be correct.
        
       | allanrbo wrote:
       | Published 2023-03-18. Futuristic :-)
        
       | wirrbel wrote:
        | Conditional probabilities are reasonably effective because they
        | are the mathematical underpinning of probability theory.
        | 
        | They are not unreasonably effective.
        | 
        | It's just that sampling from probability distributions is
        | computationally expensive, which has led to the development of
        | frequentist statistics (as opposed to Bayesian statistics).
        | Frequentist statistics is unreasonably effective, since it makes
        | so many assumptions that it's hard to do a frequentist
        | statistical analysis without violating some of the assumptions
        | the methods make about the data distributions, etc.
        
         | [deleted]
        
       | travisjungroth wrote:
       | On the first problem, if you're already going to estimate the
       | answer with finite iterations over an infinite series, I think
       | you should just skip ahead to the simulation that was used to
       | test it.
       | 
        | The header for the test section says "Validation" but that's not
        | what's happening. It's not taking some answer and checking
        | whether it's correct. It's arriving at the answer in a different
        | way - one that's faster to write, faster to run, and easier to
        | understand than the other code.
       | 
       | Easier to extend and modify, too. Uneven weights for the toys?
       | Easy. Want the P99 case? The standard deviation? Easy. Simulate
       | it ten million times and see what comes out.
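        | 
        | For instance, a sketch of the weighted variant with percentiles
        | and spread (the weights here are made up for illustration):
        | 
        |   import random
        |   import statistics
        | 
        |   def draws_to_collect(weights):
        |       # Toys drawn with unequal probabilities.
        |       toys = range(len(weights))
        |       seen, draws = set(), 0
        |       while len(seen) < len(weights):
        |           seen.add(random.choices(toys, weights)[0])
        |           draws += 1
        |       return draws
        | 
        |   trials = sorted(draws_to_collect([0.4, 0.3, 0.2, 0.1])
        |                   for _ in range(100_000))
        |   print(statistics.mean(trials))          # expected meals
        |   print(statistics.stdev(trials))         # spread
        |   print(trials[int(0.99 * len(trials))])  # P99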
       | 
       | I've just been hyped on simulations over calculations for
       | probabilities lately and this is wonderful confirmation bias.
        
         | danuker wrote:
         | "The Unreasonable Effectiveness of Monte Carlo Methods"
        
         | LegionMammal978 wrote:
         | Depending on how much precision you'd like for your answer,
         | Monte Carlo methods can converge unbearably slowly, since the
         | standard deviation is proportional to 1/sqrt(N) as you add more
         | samples. I recall once having to run 100 billion simulations to
         | get the mean of a certain quantity to within a few decimal
          | places. I'd take an infinite series over that any day of the
         | week (well, as long as it's not something as slow as the
         | Leibniz series for pi/4).
         | 
         | Of course, such precision relative to the standard deviation is
         | rarely necessary for most practical situations.
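          | 
          | A tiny demonstration of that 1/sqrt(N) behavior, using the
          | mean of Uniform(0, 1) samples (an assumed example):
          | 
          |   import random
          |   import statistics
          | 
          |   # Spread of the Monte Carlo estimate shrinks ~10x for
          |   # every 100x more samples.
          |   for n in [100, 10_000, 1_000_000]:
          |       estimates = [
          |           statistics.fmean(random.random()
          |                            for _ in range(n))
          |           for _ in range(30)
          |       ]
          |       print(n, statistics.stdev(estimates))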
        
           | dekhn wrote:
            | As a friend in grad school used to say... "An infinitely
            | long Monte Carlo simulation recapitulates the underlying
            | probability distribution perfectly, but we don't want to
            | wait that long."
        
           | travisjungroth wrote:
            | Yeah, it's the practicality. Most business decisions don't
            | need more than three sig figs [source?]. We only tend to
            | think about precision on things that already have it.
        
             | 0cf8612b2e1e wrote:
             | Unless you are trying to prove the existence of a subatomic
             | particle, I am not sure what business decisions have
             | anything close to that level of certainty.
        
         | 0cf8612b2e1e wrote:
          | Just making a simulation and codifying assumptions is hugely
          | helpful. Forcing SMEs to rationalize a number and its spread
          | has so much more utility than a closed-form calculation that
          | people have to take on faith.
        
         | jrumbut wrote:
          | Yeah, I think simulations are massively underused.
          | 
          | They might not help you prepare for a black swan type event,
          | but a lot of the time we're not ready for very normal
          | problems. I suspect a day or two per year spent doing basic
          | modeling and simulation would benefit a lot of teams, even
          | back-of-the-envelope stuff. Then you can return to the
          | exercise a year later, see which assumptions were on point
          | and which turned out very differently than expected, and
          | refine the model and the parameters of the simulation.
        
           | college_physics wrote:
           | > They might not help you prepare for a black swan type event
           | 
            | At some point probabilistic simulations were all the rage
            | in the financial industry, and then the financial crisis
            | happened.
            | 
            | It is a fine art to figure out how to elicit the benefits
            | of Monte Carlo simulation - to put some structure around
            | the future without falling into a mental trap that blinds
            | you to non-quantifiable scenarios.
        
           | travisjungroth wrote:
            | They're actually great for preparing for black swan events.
            | Taleb talks about them in "The Black Swan".
            | 
            | Usually the rules of the game aren't as clearly stated as
            | in this problem. Sometimes there is a literal coupon
            | contest, or a node randomly pairs with another node. But
            | usually you're looking at history to try to understand the
            | odds and then forecast some futures. The problem with doing
            | that with statistics is that it's not very receptive to
            | "fiddling", even if it's Bayesian. With a simulation it's
            | generally easier to throw something weird in there, like an
            | effect orders of magnitude larger, dynamic effects, or a
            | totally new rule.
        
       | __anon-2023__ wrote:
       | For the collectible toy problem, how much money would you save if
       | you paired up with someone else and purchased meals until you
       | both had all 4 toys? My intuition says going from 1 collector to
       | 2 collectors would help significantly (like queuing theory).
        
       | jonnycomputer wrote:
       | Pet peeve. Just exactly what about it is unreasonable?
        
       | sedundnes wrote:
       | For "Game of Stopped Dice" the article states "we would keep our
       | winnings on the first throw only if they are 5 or 6". Would it
       | not be better to cash out even at $4 and play again using those
       | winnings?
        
       ___________________________________________________________________
       (page generated 2023-02-22 23:00 UTC)