[HN Gopher] CUDA 11.0
___________________________________________________________________
CUDA 11.0

Author : ksec
Score  : 113 points
Date   : 2020-07-08 17:42 UTC (5 hours ago)

(HTM) web link (docs.nvidia.com)
(TXT) w3m dump (docs.nvidia.com)

| Yajirobe wrote:
| > Added support for Ubuntu 20.04 LTS on x86_64 platforms.
|
| Huh? I've been using CUDA for a while now on my Ubuntu 20.04
| machine
| HuShifang wrote:
| Coincidentally I was just looking at this after upgrading from
| 19.10 to 20.04 today -- basically, the version installed from
| Nvidia's old 18.04 repos works fine on 20.04.
| hobofan wrote:
| That probably just means that they are testing that platform in
| CI now, and are officially supporting it (taking bug reports
| into account, etc.).
| init wrote:
| This has been problematic for me after upgrading from 18.04 to
| 20.04. Every time I apt update cuda or some nvidia package my x
| server fails to start for several different reasons.
|
| Just today I've already spent 30 minutes trying to start x with
| this latest cuda update. Too bad I can't switch back to the
| open source nouveau driver.
| usmannk wrote:
| I noticed CUDA 11.0 was almost ready for release last week when I
| went to install CUDA and the default download page linked to the
| 11.0 Release Candidate. The 10.1 and 10.2 links were buried
| behind a link off to the side labeled "legacy". The thing is, no
| library you use is going to be supporting the CUDA 11.0 RC,
| that's ridiculous. For example, Pytorch stable is on 10.2 and
| Tensorflow only goes up to 10.1.
|
| This is generally indicative of how poorly organized the CUDA
| documentation and installation instructions are. The Conda
| dependency manager has made this a lot easier recently.
| Especially by, e.g., providing pytorch binaries.
| Though if you want to use packages like NVIDIA Apex for mixed
| precision DL[0] you're going to be in for a huge headache trying
| to compile torch from source while also managing your cuda and
| nvcc version, which sometimes must be the same but sometimes
| cannot be![1]
|
| [0] Yes, I'm aware that Apex was very recently brought into torch
| but it seems that the performance issues haven't been ironed out
| yet.
|
| [1] https://stackoverflow.com/questions/53422407/different-cuda-...
| bbatsell wrote:
| I have to use containers with nvidia-docker because NVIDIA so
| consistently and relentlessly breaks things without so much as
| a glance at backward compatibility.
| etaioinshrdlu wrote:
| The annoying thing is that nvidia-docker is still not great.
| You still have to deal with the driver installed outside the
| container, and it makes a big difference.
|
| Furthermore it seems like even the CUDA runtime is typically
| not installed in the container, but rather injected in by the
| nvidia-docker container runtime.
|
| It is not fun to deal with.
| dijksterhuis wrote:
| I moved our Deep Learning servers over to Docker images +
| JupyterHub DockerSpawners recently because maintaining all
| the various version dependencies between frameworks was an
| absolute PITA.
|
| Images are publicly available here in case anyone else needs
| something similar: https://hub.docker.com/u/uodcvip
| TheGuyWhoCodes wrote:
| I'm never sure of the relation between the driver,
| nvidia-docker and the container with a specific cuda version.
|
| Last time I tried it the cuda inside the container thought it
| was using some old driver version while a much newer version
| was installed on the host. So I had to manually install the
| older version, not sure where the issue was but maybe it was
| because I was using the deprecated nvidia-docker version 2
| which is still needed to pass gpu resources to containers run
| inside kubernetes.
| liuliu wrote:
| Yeah.
| To make matters worse, they updated libnccl-dev in their apt
| repo to be CUDA-11 based a few weeks ago. That breaks my CUDA
| app (because it is still on 10.2 and not compatible) in
| interesting ways. I've had to apt-hold libnccl-dev for a while,
| waiting for this release.
| jjoonathan wrote:
| Yeah, and the CUDA 10.0 official Visual Studio demo project
| build was broken for... looks like a year, at least, because
| they didn't want to populate the toolkit path. NVidia, you're
| better than this.
|
| https://forums.developer.nvidia.com/t/the-cuda-toolkit-v10-0...
|
| > The Conda dependency manager has made this a lot easier
|
| Yeah but conda is "Let's do dependency management with a SAT
| solver, it'll be great!" On a good day, it's just slow. On a
| bad day, the SAT solver spins for hours before failing to
| converge. On a really bad day, the SAT solver does something
| "clever."
|
| I've had a couple of really bad days this year. I'm really
| starting to not like conda very much.
| ajtulloch wrote:
| You might find https://github.com/TheSnakePit/mamba useful,
| especially if you are slowed down by package resolution.
| jjoonathan wrote:
| That looks worth a look for sure!
| techwizrd wrote:
| Conda's SAT solver for dependency management is the bane of
| my existence. For pip-installable packages, I'll almost
| always turn to pip rather than conda even when in a conda
| environment.
| jabl wrote:
| My biggest gripe with the conda dependency manager is that it
| doesn't keep track of which packages own which files, and if
| multiple packages own the same file the last one to be
| installed will happily scribble over whatever was there
| before. With hilarious results, of course.
|
| This means that keeping a conda installation up to date is
| often very tricky, when upgrading you frequently have to
| uninstall and reinstall some packages.
|
| It works better if you start from scratch with a
| requirements.yml file.
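
The file-ownership clash jabl describes can be spotted ahead of time with a small script. This is only a rough sketch: it assumes each installed package has a JSON record under `<prefix>/conda-meta/` with `name` and `files` fields, and the helper name `find_shared_files` is made up for illustration, not part of conda.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_shared_files(prefix):
    """Return {relative_path: [package, ...]} for files claimed by 2+ packages.

    Assumption: every installed package has a record at
    <prefix>/conda-meta/<name>-<version>-<build>.json containing
    "name" and "files" keys. Records without "files" are skipped.
    """
    owners = defaultdict(set)
    for record in sorted(Path(prefix, "conda-meta").glob("*.json")):
        meta = json.loads(record.read_text())
        package = meta.get("name", record.stem)
        for rel_path in meta.get("files", []):
            owners[rel_path].add(package)
    # Keep only the paths that more than one package lists as its own.
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}
```

Inside an activated environment, something like `find_shared_files(sys.prefix)` should list the paths that more than one package claims, which are the ones most likely to be overwritten on upgrade.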
| ykl wrote:
| Interesting that Fedora support seems to have been dropped.
| Anyone know why that might be?
|
| Edit: oh wait I think I see. Latest supported gcc for CUDA 11 is
| gcc 9.x, but I think latest Fedora is on gcc 10.
| tw04 wrote:
| I'm guessing it was an oversight in the table given they still
| have all the fedora installation instructions on the install
| page:
|
| https://docs.nvidia.com/cuda/cuda-installation-guide-linux/i...
| core-questions wrote:
| They've come to the conclusion that you should also come to,
| that Fedora is basically a waste of time to support because
| whatever you've gotten working will be terribly broken in the
| next release for no good reason anyone can point to?
| infairverona wrote:
| Every time I have to deal with multiple versions of CUDA on Linux
| I feel like poking my eyes out. I get that supporting developer
| libraries that have to interact with hardware is hard but come
| on...
| ziddoap wrote:
| >cuFFT now accepts __nv_bfloat16 input and output data type for
| power-of-two sizes with single precision computations within the
| kernels.
|
| This exact sentence is listed both under "New Feature" and "Known
| Issues". I'm not super familiar with CUDA stuff, but it can't be
| both, right?
| usmannk wrote:
| Looks like a mistake, should only be in New Features.
| bjornsing wrote:
| So a known issue in the known issues?
| willwill100 wrote:
| A known known
| flx42_ wrote:
| Thanks, I have reported it internally and it is now fixed.
| cjhanks wrote:
| Does anyone understand why such minor upgrades resulted in a
| major version bump? Is this some sort of stability check point?
| Or some other versioning convention?
| ibn-python wrote:
| Usually for an API it indicates a breaking change. In this case
| the removal of some functions which might require refactoring
| on the consumer's end.
| gowld wrote:
| And to keep users on a hardware upgrade treadmill.
| bigdict wrote:
| I think the page lists the changes since 11.0 RC, not the
| previous major version.
| darksoulzz wrote:
| Exciting
| chewxy wrote:
| oh great. More chasing to do. Anyone interested in working on the
| CUDA integration for Go (https://gorgonia.org/cu)? PRs welcome,
| as I am quite short on time.
___________________________________________________________________
(page generated 2020-07-08 23:00 UTC)