[HN Gopher] Optimization Techniques for GPU Programming [pdf]
___________________________________________________________________
Author : ibobev
Score  : 37 points
Date   : 2023-08-09 20:25 UTC (2 hours ago)

(HTM) web link (dl.acm.org)
(TXT) w3m dump (dl.acm.org)

| flakiness wrote:
| For those who are interested: "Programming Massively Parallel
| Processors: A Hands-on Approach" is a great book for learning CUDA
| programming, and it talks mostly about performance because, after
| all, GPUs are about speed.
|
| Unlike normal programming books, it talks a lot about how GPUs
| work and how the introduced techniques fit in that picture. It's
| interesting even if you are just curious how a (NVIDIA) GPU works
| at the code level. Strongly recommended.
| mathisfun123 wrote:
| > it talks a lot about how GPUs work
|
| it's true - out of all of the "LEARN CUDA IN 24 HOURS" books,
| this is the best one. indeed this isn't one of those same books
| - this is a textbook - but at first glance it resembles them
| (at least the color scheme and the title led me astray when i
| first found it).
| gpuhacker wrote:
| I bought the first edition when it came out, and it was
| definitely a gold mine of information on the subject. I wonder though,
| is the fourth edition worth buying another copy? Nvidia has
| been advancing CUDA, in particular moving more towards C++ in
| the kernel language. But none of that was present when this
| book came out in 2007. Now more and more stuff is happening at
| the thread-block level with the cooperative groups C++ API and at
| the warp level for tensor cores. It would be great if the authors
| revisited all the early chapters to modernize that content, but
| that's a lot of work so I don't usually count on authors making
| such an effort for later editions.
| w-m wrote:
| Does anybody have an idea on how to get into Metal programming
| (as in Apple Metal)?
| I'd love to mess around a little with this
| on iOS and macOS while learning about tile-based rendering, but I
| have trouble locating educational written material.
|
| There's a book (https://metalbyexample.com/the-book/), but the
| author has put up a note that it's quite out of date. It seems
| the most up-to-date information is available in the WWDC videos
| (regarding e.g. Metal 3), but I'd really prefer something
| written. And Apple's documentation reads more like reference
| material and is quite confusing when starting out.
| winwang wrote:
| (+1) I'm a newb to Metal myself, and I wanted to use Swift as
| the driving language (which was a main selling point).
| Unfortunately, almost all the material is in Objective-C.
| winwang wrote:
| If people like GPU programming, I wrote a blog post this week
| about GPU-accelerated hashmaps, semi-provocatively titled "Can we
| 10x Rust hashmap throughput?".
|
| HN post here: https://news.ycombinator.com/item?id=37036058
| eachro wrote:
| I've been looking into getting into GPU programming, starting
| with CS344
| (https://developer.nvidia.com/udacity-cs344-intro-parallel-pr...)
| on Udacity. I'm curious to hear from some of the
| more seasoned GPU veterans out there: what other resources would
| be good to take a look at after finishing the videos and
| assignments?
| yzh wrote:
| I would recommend the course from Oxford
| (https://people.maths.ox.ac.uk/gilesm/cuda/). Also explore the
| tutorial section of cutlass
| (https://github.com/NVIDIA/cutlass/blob/main/media/docs/cute/...)
| if you want to learn more about
| high-performance GEMM. OpenAI Triton is another good resource
| if you want to write relatively performant CUDA kernels using
| Python for deep learning (https://openai.com/research/triton).
| gpuhacker wrote:
| If you want to go really in-depth I can recommend GTC on
| demand. It's Nvidia's streaming platform with videos from past
| GTC conferences.
| Tony Scudiero had a couple of videos on there
| called GPU Memory Bootcamp that are among the best advanced GPU
| programming learning material out there.
| pengaru wrote:
| https://shadertoy.com is a great way to explore shaders
___________________________________________________________________
(page generated 2023-08-09 23:00 UTC)