[HN Gopher] Optimization Techniques for GPU Programming [pdf]
       ___________________________________________________________________
        
       Optimization Techniques for GPU Programming [pdf]
        
       Author : ibobev
       Score  : 37 points
       Date   : 2023-08-09 20:25 UTC (2 hours ago)
        
 (HTM) web link (dl.acm.org)
 (TXT) w3m dump (dl.acm.org)
        
       | flakiness wrote:
       | For those who are interested: "Programming Massively Parallel
       | Processors: A Hands-on Approach" is a great book for learning CUDA
       | programming, and it talks mostly about performance because, after
       | all, GPUs are about speed.
       | 
       | Unlike normal programming books, it talks a lot about how GPUs
       | work and how the introduced techniques fit in that picture. It's
       | interesting even if you are just curious how a (NVIDIA) GPU works
       | at the code level. Strongly recommended.
        
         | mathisfun123 wrote:
         | > it talks a lot about how GPUs work
         | 
         | it's true - out of all of the "LEARN CUDA IN 24 HOURS" books,
         | this is the best one. indeed this isn't one of those same books
         | - this is a textbook - but at first glance it resembles them
         | (at least the color scheme and the title led me astray when i
         | first found it).
        
         | gpuhacker wrote:
         | I bought the first edition when it came out, and definitely it
         | was a gold mine of information on the subject. I wonder though,
         | is the fourth edition worth buying another copy? Nvidia has
         | been advancing CUDA, in particular moving more towards C++ in
         | the kernel language. But none of that was present when this
          | book came out in 2007. Now more and more stuff is happening at
          | the thread block level with the Cooperative Groups C++ API and
          | at the warp level for tensor cores. It would be great if the authors
         | revisited all the early chapters to modernize that content, but
         | that's a lot of work so I don't usually count on authors making
         | such an effort for later editions.
        
       | w-m wrote:
       | Does anybody have an idea on how to get into Metal programming
       | (as in Apple Metal)? I'd love to mess around a little with this
       | on iOS and macOS while learning about tile-based rendering, but I
       | have trouble locating educational written material.
       | 
       | There's a book (https://metalbyexample.com/the-book/), but the
       | author has put up a note that it's quite out of date. It seems
       | the most up-to-date information is available in the WWDC videos
       | (regarding e.g. Metal 3), but I'd really prefer something
       | written. And Apple's documentation reads more like reference
       | material and is quite confusing when starting out.
        
         | winwang wrote:
         | (+1) I'm a newb to Metal myself, and I wanted to use Swift as
         | the driving language (which was a main selling point).
          | Unfortunately, almost all the material is in Objective-C.
        
       | winwang wrote:
       | If people like GPU programming, I wrote a blog post this week
       | about GPU-accelerated hashmaps, semi-provocatively titled "Can we
       | 10x Rust hashmap throughput?".
       | 
       | HN post here: https://news.ycombinator.com/item?id=37036058
        
       | eachro wrote:
       | I've been looking to get into GPU programming, starting with CS344
       | (https://developer.nvidia.com/udacity-cs344-intro-parallel-pr...)
       | on Udacity. I'm curious to hear from some of the more seasoned GPU
       | veterans out there: what other resources would be good to take a
       | look at after finishing the videos and assignments?
        
         | yzh wrote:
          | I would recommend the course from Oxford
          | (https://people.maths.ox.ac.uk/gilesm/cuda/). Also explore the
          | tutorial section of CUTLASS
          | (https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/...)
          | if you want to learn more about high-performance GEMM. OpenAI
          | Triton is another good resource if you want to write relatively
          | performant CUDA kernels using Python for deep learning
          | (https://openai.com/research/triton).
        
         | gpuhacker wrote:
          | If you want to go really in-depth, I can recommend GTC on
          | demand. It's Nvidia's streaming platform with videos from past
          | GTC conferences. Tony Scudiero had a couple of videos on there
          | called GPU memory bootcamp that are among the best advanced GPU
          | programming learning materials out there.
        
         | pengaru wrote:
         | https://shadertoy.com is a great way to explore shaders
        
       ___________________________________________________________________
       (page generated 2023-08-09 23:00 UTC)