[HN Gopher] Entropy: An Introduction
       ___________________________________________________________________
        
       Entropy: An Introduction
        
       Author : aishwarya_m
       Score  : 60 points
       Date   : 2020-07-17 18:24 UTC (4 hours ago)
        
 (HTM) web link (homes.cs.washington.edu)
 (TXT) w3m dump (homes.cs.washington.edu)
        
       | cmehdy wrote:
        | I imagine the timing of this post is correlated with the
        | release of the documentary The Bit Player about Claude
        | Shannon. Haven't seen it yet but looking forward to it.
       | 
       | The article does a decent job at graphing and laying out some of
       | the concepts of entropy for information theory, but I'm not sure
       | who the target reader is, since prereqs are perhaps only slightly
       | narrower than what one needs to read Shannon's paper[0] and the
       | article is really illustrating only a fraction of the concept.
       | 
        | It can perhaps work as a primer for what shows up starting on
        | pages 10-11 of the original document. In any case, provided
        | you grasp the mathematical definition of entropy through
        | thermodynamics, the microstates-based definition through
        | Boltzmann, and "basic probabilities" (expected value, typical
        | discrete distributions, terms like "i.i.d."), you should be
        | good to go. But then you might already know all this...
       | 
       | And if you do, and you like what you read, then the full original
       | thing by Shannon is a delight to explore to truly grasp what has
       | been so foundational to a lot of things since 1948.
       | 
       | [0]
       | http://people.math.harvard.edu/~ctm/home/text/others/shannon...
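        | 
        | For anyone jumping straight to those pages: the quantity
        | introduced there is, up to a constant K, the entropy of a
        | discrete distribution,
        | 
        |     $H = -K \sum_i p_i \log p_i$
        | 
        | which is the same "-p log p" expression discussed below, with
        | K = 1 and base-2 logarithms so that H is measured in bits.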
        
       | Analog24 wrote:
       | Might be worth clarifying in the title that this is about entropy
       | in the context of information theory.
        
         | jbay808 wrote:
         | In which context is it a different concept?
        
           | hexxiiiz wrote:
           | In thermodynamics there are two other formulations of
           | entropy: the Clausius one in terms of temperature and heat,
            | and the Boltzmann one. The latter defines entropy as the
            | log of the number of microstates a system could be in,
            | given a particular macrostate.
           | 
            | The Shannon definition is equivalent to the Boltzmann one
            | only in the case where the system consists of infinitely
            | many identical subsystems. If there are only finitely
            | many, for instance, the log of the microstate count does
            | not reduce to the same "-p log p" form.
           | 
           | The Clausius def can be derived from the Boltzmann one, but
           | they are nevertheless also distinct formulations.
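            | 
            | For concreteness, the three formulations being contrasted
            | are, roughly:
            | 
            |     Clausius:  $dS = \delta Q_{rev} / T$
            |     Boltzmann: $S = k_B \ln W$, with W the number of
            |                microstates compatible with the macrostate
            |     Shannon:   $H = -\sum_x p(x) \log p(x)$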
        
             | kgwgk wrote:
             | In thermodynamics / statistical mechanics there is another
             | formulation of entropy: Gibbs entropy is different from
              | Boltzmann entropy (and equivalent to Shannon entropy in
             | information theory).
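              | 
              | In formulas, the Gibbs entropy is
              | $S = -k_B \sum_i p_i \ln p_i$ (the Shannon sum up to the
              | constant $k_B$), whereas the Boltzmann entropy is
              | $S = k_B \ln W$.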
        
             | jbay808 wrote:
             | https://en.wikipedia.org/wiki/Boltzmann%27s_entropy_formula
             | #...
             | 
             | According to Wikipedia, if you start with the Gibbs entropy
             | (which is the same as Shannon entropy), and then assume all
             | microstate probabilities are equal (which Boltzmann does),
             | you get the Boltzmann entropy formula. It also says
             | Boltzmann himself used a p ln(p) formulation.
             | 
             | So aren't they the same, perhaps up to a constant factor?
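              | 
              | Spelling that step out: starting from
              | $S = -k_B \sum_i p_i \ln p_i$ and setting $p_i = 1/W$
              | for each of the W equally likely microstates gives
              | 
              |     $S = -k_B \, W \cdot (1/W) \ln(1/W) = k_B \ln W$,
              | 
              | which is the Boltzmann formula.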
        
               | hexxiiiz wrote:
                | If you count the number of microstates for a given
                | macrostate you get a multinomial coefficient,
                | N!/(n_1!n_2!...). The log of this is the Boltzmann
                | entropy. However, if you take N to be very large or
                | infinite, you can show using the Stirling
                | approximation that this reduces to the Gibbs/Shannon
                | entropy. So, in general, no.
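                | 
                | Sketching that Stirling step (using
                | $\ln n! \approx n \ln n - n$): with $\sum_i n_i = N$
                | and $p_i = n_i / N$,
                | 
                |     $\ln \frac{N!}{n_1! n_2! \cdots}
                |        \approx N \ln N - \sum_i n_i \ln n_i
                |        = -N \sum_i p_i \ln p_i$,
                | 
                | i.e. N times the Gibbs/Shannon entropy per subsystem,
                | so the two only agree in that large-N limit.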
        
           | Analog24 wrote:
           | "Entropy" has been a concept in physics for a long time. Far
           | longer than the concept of information entropy.
        
             | jbay808 wrote:
             | Sure, but that doesn't mean that the concept of entropy in
             | physics is a _different_ concept than its incarnation in
             | information theory. Just like the concept of energy existed
             | before the development of thermodynamics, but thermodynamic
             | energy is still energy.
        
               | Analog24 wrote:
                | If you want to be pedantic about it, sure. The fact
               | remains that a discussion or blog post can be about very
               | different things depending on the context. You're not
               | going to learn anything about the relationship between
               | energy and entropy, for example, if you're talking about
               | information theory. Hence my original comment.
        
       | danielrk wrote:
        | Love the post. Just FYI, your post is not mobile-friendly.
        | When scrolling down on iPhone, it's impossible not to
        | accidentally shift the viewport away from the left margin,
        | making the left side hard to read.
        
       | ethanweinberger wrote:
       | Hi HN, I'm the author of this piece (Ethan Weinberger). I wrote
       | this originally as a set of notes for myself when brushing up on
       | concepts in information theory the past couple of weeks. I found
       | the presentations I was reading of the material to be a little
       | dry for my taste, so I tried to incorporate more visuals and
       | really emphasize the intuition behind the concepts. Glad to see
       | others are finding it useful/interesting! :)
        
         | sohamsankaran wrote:
         | Ethan also writes about machine learning at
         | https://honestyisbest.com/kernels-of-truth each week -- his
         | most recent piece there (https://honestyisbest.com/kernels-of-
         | truth/2020/Jul/14/facia...) has a neat explanation of how
         | convolutional neural networks (CNNs) work.
        
         | spinningslate wrote:
         | Thanks, I enjoyed reading. As an electronic engineering
         | student, I remember grappling with information theory in the
         | abstract: it was a weather example very similar to yours that
         | gave me the intuition I was missing.
         | 
         | An observation/suggestion. The intro is accessible to many
         | people; that drops off a steep cliff when you hit the maths.
         | Now, I'm not complaining about that: it's instructive and
         | necessary to formalise things. Where I struggle is in reading
         | the equations in my head when I don't know what words to use
         | for the symbols. For example, that very first `X ~ p(x)`. I
         | didn't know what to say for the tilde character, so couldn't
         | verbalise the statement. I do know that $\in$ (the rounded 'E')
         | means 'is a member of' so I could read the next statement. The
         | problem gets even more confusing for a non-mathematician as the
         | same symbol is used with different meaning in different
         | branches of maths/science (e.g. $\Pi$).
         | 
         | I get that writing out every equation in English isn't feasible
         | (or, at least, is asking a lot of the writer). But I wonder if
          | there's a middle way, e.g. through hyperlinking?
         | 
         | As I say: not a criticism and I don't have a good solution.
         | Just an observation from a non-mathematician. Enjoyed the piece
         | anyway.
        
           | jessriedel wrote:
           | "X ~ p(x)" means "X is a random variable drawn from the
           | probability distribution p(x)" or maybe "X is drawn from
           | p(x)" for short.
           | 
           | Are you sure it's a matter of knowing what to _say_ (in your
           | head) vs knowing the definition of the notation in the first
           | place? I am pretty familiar with this notation, but I rarely
           | verbalize it mentally. I can tell because I read and
           | understand it quickly without problem, but on the rare
            | occasion when I have to read it aloud I realize I'm not sure
           | how I should pronounce it.
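            | 
            | If it helps to make the notation concrete, here is a
            | minimal Python sketch (the weather distribution is a
            | made-up stand-in, not the article's exact numbers):
            | "X ~ p(x)" just means samples of X are drawn from p(x).
            | 
            |     import numpy as np
            | 
            |     # A toy distribution p(x) over a few weather states
            |     # (illustrative numbers only).
            |     outcomes = ["sunny", "cloudy", "rainy", "snowy"]
            |     p = np.array([0.5, 0.25, 0.125, 0.125])
            | 
            |     # "X ~ p(x)": draw samples of X from p(x).
            |     rng = np.random.default_rng(0)
            |     print(rng.choice(outcomes, size=10, p=p))
            | 
            |     # Entropy H(X) = -sum_x p(x) log2 p(x), in bits.
            |     print(-np.sum(p * np.log2(p)))  # 1.75 bits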
        
         | onurcel wrote:
         | Thank you for the great article. I believe there is a typo in
         | "we assign a value of 0 to p(x) log p(x) when x=0", it should
         | be "when p(x) = 0".
        
         | wcookeverton wrote:
         | Awesome paper Ethan!!!
        
       | abetusk wrote:
       | Not sure if it's appropriate but here's my own take at a very
       | terse restatement of Shannon's original paper [1]:
       | https://mechaelephant.com/dev/Shannon-Entropy/
       | 
        | I recommend everyone who's interested to read Shannon's original
       | paper. It's one of the few examples of an original paper that's
       | both clear and readable.
       | 
       | [1]
       | https://homes.cs.washington.edu/~ewein/blog/2020/07/14/entro...
        
       ___________________________________________________________________
       (page generated 2020-07-17 23:00 UTC)