[HN Gopher] Principal Component Analysis Explained Visually
       ___________________________________________________________________
        
       Principal Component Analysis Explained Visually
        
       Author : spking
       Score  : 93 points
       Date   : 2022-10-29 18:02 UTC (4 hours ago)
        
 (HTM) web link (setosa.io)
 (TXT) w3m dump (setosa.io)
        
       | aquafox wrote:
       | Here is a much better explanation of PCA:
       | https://stats.stackexchange.com/questions/2691/making-sense-...
       | 
       | The key insight that many are missing is that PCA solves a series
       | of optimization problems: reconstructing the data from the first
       | k PCs gives the best k-dimensional linear approximation in terms
       | of squared error. Moreover, this is equivalent to assuming that
       | the data lives in a k-dimensional subspace and becomes truly
       | high-dimensional only because of normally distributed noise that
       | spills equally into every direction (dimension).
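
       A minimal sketch of the squared-error claim above, assuming NumPy and
       synthetic data (a 2-dimensional signal embedded in 5 dimensions plus
       Gaussian noise). The helper names pca_reconstruct, subspace_reconstruct
       and squared_error are made up for this illustration. The sketch
       reconstructs the data from the first k principal directions and checks
       that no randomly drawn k-dimensional subspace through the mean does
       better.

```python
import numpy as np


def pca_reconstruct(X, k):
    """Project centered data onto its first k principal directions and map back."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T  # shape (n_features, k)
    return Xc @ V_k @ V_k.T + mean


def subspace_reconstruct(X, basis):
    """Same projection, but onto an arbitrary orthonormal basis (given as columns)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    return Xc @ basis @ basis.T + mean


def squared_error(X, X_hat):
    """Total squared reconstruction error over all samples and features."""
    return float(np.sum((X - X_hat) ** 2))


rng = np.random.default_rng(0)

# Synthetic data (an assumption for this demo): 2-D signal in 5-D plus noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

k = 2
pca_err = squared_error(X, pca_reconstruct(X, k))

# Compare against 100 random k-dimensional subspaces (orthonormalized via QR).
random_errs = []
for _ in range(100):
    Q, _ = np.linalg.qr(rng.normal(size=(5, k)))
    random_errs.append(squared_error(X, subspace_reconstruct(X, Q)))

print("PCA error:", pca_err)
print("best random subspace error:", min(random_errs))
```

       The PCA error also equals the sum of the squared singular values that
       were discarded, which is one way to see why keeping the largest ones
       minimizes the loss.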
        
         | larrydag wrote:
         | I really like the way Harrell uses PCA when building regression
         | models in Regression Modeling Strategies:
         | 
         | https://link.springer.com/book/10.1007/978-3-319-19425-7
        
         | swyx wrote:
         | Principal components is a wonderful concept, together with its
         | sister concepts, eigenvalues/eigenvectors and orthogonality. I
         | wish I could force everyone I talk to to internalize these ideas
         | so that I could have more useful discussions with them.
         | 
         | That said, yeah, not everything is linearly separable.
        
       | blt wrote:
       | In the UK eating example, it would be better to examine the
       | feature-space singular vector associated with the first singular
       | value instead of instructing the reader to "go back and look at
       | the data in the table". PCA has already done that work, no
       | additional (error-prone, subjective) interpretation needed.
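
       A rough sketch of what reading the first feature-space singular vector
       could look like, assuming NumPy. The table below is a hypothetical toy
       example, not the article's UK dataset, and the feature names are
       placeholders. The entries of the first right singular vector say how
       strongly each feature contributes to the first principal component, so
       ranking them by absolute value points at the feature driving the
       separation.

```python
import numpy as np

# Hypothetical toy table (NOT the article's data): rows are samples, columns are features.
feature_names = ["feat_a", "feat_b", "feat_c"]
X = np.array([
    [10.0, 2.0, 5.0],
    [11.0, 2.1, 5.2],
    [10.5, 1.9, 4.9],
    [10.2, 6.0, 5.1],  # this sample differs mainly in feat_b
])

# Center each feature, then take the SVD of the centered data.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# First right singular vector: the feature-space direction of the first PC.
# Its overall sign is arbitrary, so rank features by absolute weight.
first_direction = Vt[0]
order = np.argsort(-np.abs(first_direction))
ranked = [(feature_names[i], float(first_direction[i])) for i in order]

# Coordinates of each sample along the first PC.
scores = Xc @ first_direction

for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
print("sample furthest along PC1:", int(np.argmax(np.abs(scores))))
```

       Features are left unscaled here. With features in very different
       units, standardizing the columns first would change the weights, so
       that choice is part of the interpretation.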
        
       | lxe wrote:
       | Also see
       | 
       | - Markov Chains (https://setosa.io/ev/markov-chains/)
       | 
       | - Image Kernels (https://setosa.io/ev/image-kernels/)
       | 
       | - Bus Bunching (https://setosa.io/bus/)
       | 
       | Wish these guys kept producing more visualizations!
        
       | wjnc wrote:
       | Best thing I've ever read on PCA is Madeleine Udell's PhD thesis
       | [1]. It extends PCA in many directions and shows that well-known
       | techniques fit into the developed framework. (I was also impressed
       | by a 138-page math thesis that is readable as well. Quite the
       | achievement.)
       | 
       | [1] https://people.orie.cornell.edu/mru8/doc/udell15_thesis.pdf
        
         | Bukhmanizer wrote:
         | It's kind of crazy that so many people have read this thesis,
         | but it's really good. I came across it independently a few
         | years ago when I was trying to understand some stuff, but ended
         | up saving it as a reference because I liked it so much.
        
         | isoprophlex wrote:
         | This is some hot stuff! Thanks for sharing. Very lucid writing;
         | she clearly has a deep understanding of the subject matter to
         | be able to write it down so eloquently.
        
         | flashfaffe2 wrote:
         | Indeed, this seems worth a deep read, especially as it addresses
         | the main PCA shortcomings (heterogeneous data, non-numerical
         | data, etc.). Thanks mate, I've definitely found a way to keep
         | myself busy this weekend.
        
       | nerdponx wrote:
       | I'm not sure this is an explanation as much as an introductory
       | demo. Nice visualizations though.
        
       ___________________________________________________________________
       (page generated 2022-10-29 23:00 UTC)