[HN Gopher] Markov Chain Monte Carlo Without All the Bullshit (2...
       ___________________________________________________________________
        
       Markov Chain Monte Carlo Without All the Bullshit (2015)
        
       Author : larve
       Score  : 151 points
       Date   : 2022-10-25 16:04 UTC (6 hours ago)
        
 (HTM) web link (jeremykun.com)
 (TXT) w3m dump (jeremykun.com)
        
       | ckrapu wrote:
       | I like the article. That said, if you are expecting
       | biostatisticians to give the best explanation of the biggest
       | hammer in the Bayesian toolbox, you may be looking in the wrong
       | place.
       | 
       | Folks in spatial stats, machine learning, and physics have some
       | really nice introductory material.
        
         | zekrioca wrote:
         | Would you suggest some?
        
         | nightski wrote:
          | I'm not so sure about that; Richard McElreath's Statistical
         | Rethinking is an amazing resource.
        
           | emehex wrote:
           | Statistical Rethinking is probably the best textbook that
            | I've ever read. It basically jump-started my career in
           | data!
        
           | 0cf8612b2e1e wrote:
           | I don't think Richard would ever describe himself as a
           | statistician? He seems just as frustrated by the bad
           | terminology and convoluted explanations as anyone.
        
             | [deleted]
        
             | [deleted]
        
       | emehex wrote:
       | I love Markov chains! But, I too don't like "terminology,
       | notation, and style of writing in statistics"... so, I built a
       | simple "Rosetta Stone" (Python <> Swift) library implementation
       | of Markov chains here: https://github.com/maxhumber/marc
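        | 
        | (Not marc's actual API, just a minimal sketch of the idea: fit
        | a first-order chain from a sequence of states, then sample a
        | walk from it.)
        | 
        |   import random
        |   from collections import defaultdict
        | 
        |   def fit(states):
        |       # Record which state followed which, keeping duplicates
        |       # so that more frequent transitions are proportionally
        |       # more likely to be sampled.
        |       chain = defaultdict(list)
        |       for current, nxt in zip(states, states[1:]):
        |           chain[current].append(nxt)
        |       return chain
        | 
        |   def sample(chain, start, length):
        |       # Walk the chain: each step depends only on the current
        |       # state.
        |       state, path = start, [start]
        |       for _ in range(length - 1):
        |           state = random.choice(chain[state])
        |           path.append(state)
        |       return path
        | 
        |   chain = fit(["sunny", "sunny", "rainy", "sunny", "rainy", "rainy"])
        |   print(sample(chain, "sunny", 5))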
        
       | madrox wrote:
       | In the 21st century, statistics has had the odd distinction of
       | being thrust into the spotlight in a way that it never has before
       | in its centuries of existence. Before now, practitioners have
       | mostly been on an island...or more like several islands that are
       | multi-day journeys from each other by rowboat. It's created weird
       | terminology, and even the initiated don't all use the same
       | jargon. I have a degree in statistics, and one thing I learned in
       | school is that if you pick up a textbook the first thing you have
        | to do is figure out its internal terminology. Only the most basic
       | concepts or ones named after people tended to use the same names
       | everywhere.
       | 
       | I _think_ this is why computer science has been so successful at
        | co-opting a lot of statistics' thunder. It's reorganizing a lot
       | of concepts, and because everyone is interested in the results
       | (image generation, computer vision, etc) it's getting a lot of
       | adoption.
       | 
       | Amusingly, I thought that the blurb on MCMC the author quoted was
       | pretty clear. That doesn't happen to me often.
        
         | nerdponx wrote:
         | > I think this is why computer science has been so successful
          | at co-opting a lot of statistics' thunder. It's reorganizing a
         | lot of concepts, and because everyone is interested in the
         | results (image generation, computer vision, etc) it's getting a
         | lot of adoption.
         | 
         | I think it's also that "computer science" can be perceived as
         | having succeeded where statistics failed. And the terminology
         | is frankly a lot more appealing: would you rather "train an AI
         | algorithm", or "fit a model"?
         | 
         | Even the recasting of "model" as "algorithm" has a marketing
         | benefit: models have uncertainty and uncertainty is scary,
         | whereas algorithms are perceived as precise and correct.
        
           | canjobear wrote:
           | Usually I hear "training a model"
        
             | blitzar wrote:
             | When I hear "training a model" I translate that to y=mx+c
        
       | shafoshaf wrote:
       | As someone who took statistics 30 years ago and promptly forgot
       | most of it, I followed everything except "I want to efficiently
       | draw a name from this distribution". What makes a drawing
       | efficient?
        
         | ajkjk wrote:
         | Yeah the article really disappoints right away by saying
         | something seemingly arbitrary in the first paragraph
        
         | dxbydt wrote:
         | Look, if you want to draw from a unit circle, you could enclose
         | the unit circle inside a square. If the center of the unit
         | circle is the origin, it should be clear from some middle
         | school geometry that such a square has upper left cartesian
          | coordinates at (-1,1), bottom right at (1,-1). So it's a square
         | of side two. Now you can sample from this square by calling the
         | rand() function twice in any programming language, scaling &
         | translating. So def foo() { -1 + 2*rand() } will do the trick
         | in c/c++/scala/python/whatnot ( I've scaled by two and
         | translated by minus one). So you have your two random
         | variables. Pair them up & that's your tuple (x,y). Now if you
         | make 100 such tuples, not all the tuples will lie inside the
         | unit circle. So you have to toss out the ones that don't. So
          | your drawing isn't 100% efficient. How efficient is it? Well
         | if you toss out say 21 of those 100, your sampler is 79%
          | efficient. Now where the fuck does the 21 come from? Well, if
         | you use some high school geometry, unit circle has area pi and
         | that enclosing square has area 4, so pi/4 is approximately 79%,
         | so 100-79 is 21 and so on...So one can construct more efficient
         | samplers for the unit circle by not being so foolish. We should
         | stop enclosing circles in squares & listen to Marsaglia. He
         | died a decade back but before his death he solved the above
         | problem, among others, so we don't waste 21% of our energy.
         | That said, most programs I've seen in banking, data science etc
         | are written by programmers, not statisticians. So they happily
         | use an if statement & reject x% of the samples, so they are
         | super-inefficient. Drawing can be efficient if the statistician
         | codes it up. But that fucker wants to use R, so given the
         | choice between some diehard R fucker & python programmer who
         | can mess around with kubernetes & terraform in their spare
         | time, hapless manager will pick the python programmer
          | every time, so that's what makes the drawing inefficient. /s
         | tag, but not really. Just speaking from bitter personal
         | experience :)
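          | 
          | A minimal sketch of the square-then-reject sampler described
          | above (plain Python; the ~21% waste shows up as the rejected
          | draws):
          | 
          |   import random
          | 
          |   def sample_unit_disk():
          |       # Propose uniformly in the enclosing [-1, 1] x [-1, 1]
          |       # square and reject anything outside the unit circle.
          |       # About pi/4 (~79%) of proposals are accepted.
          |       while True:
          |           x = -1 + 2 * random.random()
          |           y = -1 + 2 * random.random()
          |           if x * x + y * y <= 1:
          |               return x, y
          | 
          |   points = [sample_unit_disk() for _ in range(10_000)]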
        
           | idontpost wrote:
           | Isn't the obvious solution to sample in polar coordinates
           | instead? 0-1 for the radial coordinate, and (0 - 1) * 2pi for
           | the angle.
        
             | zorgmonkey wrote:
             | If you do this you won't get a uniform distribution on the
             | circle, the points will be the most dense at the center and
             | get less dense as you go towards the edge. To make the
             | points uniform you need to use inverse transform
             | sampling[0], which will give the formula r*sqrt(rand()) for
             | radius poolcoordinate, where r is the radius of the circle
             | and rand() returns and uniform random number from the
             | interval 0 to 1.
             | 
             | [0]:
             | https://en.wikipedia.org/wiki/Inverse_transform_sampling
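              | 
              | A rough sketch of that fix (taking r = 1 for the unit
              | circle):
              | 
              |   import math, random
              | 
              |   def sample_disk(r=1.0):
              |       # Inverse transform sampling for the radius: the
              |       # sqrt compensates for area growing like radius
              |       # squared, so points come out uniform over the
              |       # disk instead of clustered at the center.
              |       radius = r * math.sqrt(random.random())
              |       theta = 2 * math.pi * random.random()
              |       x = radius * math.cos(theta)
              |       y = radius * math.sin(theta)
              |       return x, y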
        
             | tfehring wrote:
             | Yes, though you have to take the square root of the sampled
             | radius for the resulting distribution to be uniform on the
             | unit circle. (The area of the donut with r>0.5 is greater
             | than the area of the circle with r<0.5, but the naive
             | implementation would sample from each of those with
             | probability 0.5.)
             | 
             | It's still a useful illustration, though, since MCMC
             | samplers used in practice _do_ end up throwing away lots of
             | the sampled points based on predefined acceptance criteria.
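              | 
              | For a feel of that accept/reject step, a bare-bones
              | random-walk Metropolis sampler (just a sketch, not the
              | article's code; the target here is a standard normal
              | known only up to a constant):
              | 
              |   import math, random
              | 
              |   def metropolis(log_density, x0, steps, step_size=1.0):
              |       # Propose a nearby point; accept it with
              |       # probability min(1, p(new) / p(old)), otherwise
              |       # keep the current point.
              |       samples, x = [], x0
              |       for _ in range(steps):
              |           proposal = x + random.gauss(0, step_size)
              |           log_accept = min(0.0, log_density(proposal)
              |                            - log_density(x))
              |           if random.random() < math.exp(log_accept):
              |               x = proposal
              |           samples.append(x)
              |       return samples
              | 
              |   draws = metropolis(lambda x: -0.5 * x * x, x0=0.0,
              |                      steps=10_000)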
        
             | [deleted]
        
           | WaxProlix wrote:
            | I have to agree: we could save the world a lot of wasted
           | energy if there were a way to get statisticians off of
           | R/matlab and into more 'portable' spaces.
        
         | thehumanmeat wrote:
          | If you want to select an integer uniformly at random from
          | 0...n-1, you need about log n mutually independent random
          | bits in expectation. What if you don't want it to be
          | uniformly random, but drawn from some other distribution
          | instead? That's where Markov chains help; they use random
          | bits efficiently to draw from an interesting distribution.
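          | 
          | A sketch of the uniform case from fair coin flips (Python's
          | getrandbits(1) standing in for the bit source):
          | 
          |   import random
          | 
          |   def uniform_int(n):
          |       # Use k = ceil(log2(n)) fair bits to form a candidate
          |       # in 0..2^k - 1 and reject candidates >= n. Each round
          |       # succeeds with probability n / 2^k >= 1/2, so the
          |       # expected number of bits used is O(log n).
          |       k = max(1, (n - 1).bit_length())
          |       while True:
          |           candidate = 0
          |           for _ in range(k):
          |               candidate = (candidate << 1) | random.getrandbits(1)
          |           if candidate < n:
          |               return candidate
          | 
          |   print(uniform_int(6))  # one of 0..5, each with probability 1/6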
        
         | cygaril wrote:
          | One possible cost measure is the number of evaluations of the
         | probability function.
        
         | yxwvut wrote:
         | Computational efficiency is a major consideration in sampling
         | from a high dimensional distribution.
         | https://en.wikipedia.org/wiki/Rejection_sampling#Drawbacks
        
       | harry8 wrote:
       | I quite enjoyed McElreath's Statistical Rethinking including on
       | this topic.
       | 
       | https://www.youtube.com/watch?v=Qqz5AJjyugM
        
       | yuzzy192 wrote:
       | This is so useful.
        
         | blitzar wrote:
         | Be careful - removing the fancy words from "Markov Chain Monte
         | Carlo Simulations" and translating it into English can have a
         | negative effect professionally.
        
       | mdp2021 wrote:
       | Full list of primers from Jeremy Kun (of which the submitted page
       | is one):
       | 
       | https://jeremykun.com/primers/
        
       | graycat wrote:
        | The passage the article quotes from the _Encyclopedia of
        | Biostatistics_ is awash in undefined terminology, sometimes
        | about peripheral issues.
       | 
        | Clean, logical, all terms well defined and explained, with plenty
        | of advanced content, is in:
        | 
        | Erhan Çinlar, _Introduction to Stochastic Processes_,
        | ISBN 0-13-498089-1, Prentice-Hall, Englewood Cliffs, NJ, 1975.
       | 
       | The author was long at Princeton. He is a _high quality_ guy.
       | 
       | As I was working my way through grad school in a company working
       | on US national security, a question came up about the
       | _survivability_ of the US SSBN fleet under a special scenario of
       | global nuclear war but limited to sea. Results were wanted in two
        | weeks. So, I drew from Cinlar's book, postulated a Markov
       | process _subordinated_ to a Poisson process, typed some code into
        | a text editor, called a random number generator I'd written in
       | assembler based on the recurrence
       | 
        | X(n+1) = (X(n) * 5^15 + 1) mod 2^47
       | 
       | and was done on time.
       | 
       | A famous probabilist was assigned to review my work. His first
       | remark was that there was no way for my software to "fathom" the
       | enormous "state space". I responded, at each time t, the number
       | of SSBNs left is a random variable, finite, with an expectation.
       | So, I generate 500 sample paths, take their average, use the
        | strong law of large numbers, and get an estimate of its
       | expected value within a "gnat's ass" nearly all the time. "The
       | Monte Carlo puts the effort where the action is."
       | 
       | The probabilist's remark was "That is a good way to think of it."
       | 
       | Need to do some work with Markov chains, simulation, etc.? Right,
       | just read some Cinlar, not much in prerequisites (he omitted
       | measure theory), get clear explanations, no undefined
       | terminology, from first principles to some relatively advanced
       | material, and be successful with your project.
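        | 
        | (A sketch of that setup in Python. The recurrence is the one
        | quoted above; simulate_path is a placeholder for whatever
        | quantity one sample path produces.)
        | 
        |   def lcg(seed, multiplier=5**15, increment=1, modulus=2**47):
        |       # The quoted recurrence, X(n+1) = (X(n) * 5^15 + 1) mod 2^47,
        |       # scaled into [0, 1).
        |       x = seed
        |       while True:
        |           x = (x * multiplier + increment) % modulus
        |           yield x / modulus
        | 
        |   def monte_carlo_mean(simulate_path, n_paths=500):
        |       # Average the quantity of interest over many simulated
        |       # sample paths; by the strong law of large numbers the
        |       # average converges to the expected value as n_paths grows.
        |       total = sum(simulate_path(i) for i in range(n_paths))
        |       return total / n_paths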
        
         | heintje_ghulam wrote:
         | Thank you for the book recommendation. I have gained a lot from
         | reading your math study recommendations over the past few
          | years. I wish I had more time/motivation to fully follow
          | through with them.
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-10-25 23:00 UTC)