[HN Gopher] Vgpu_unlock: Unlock vGPU functionality for consumer ...
___________________________________________________________________
Vgpu_unlock: Unlock vGPU functionality for consumer grade GPUs

Author : fragileone
Score  : 273 points
Date   : 2021-04-09 18:42 UTC (4 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| ternox99 wrote:
| I don't understand how and where I can download the NVIDIA GRID vGPU driver. Can anyone help me?

| ur-whale wrote:
| If - like me - you don't have a clue what vGPU is:
|
| https://www.nvidia.com/en-us/data-center/virtual-solutions/
|
| TL;DR: it seems to be something useful for deploying GPUs in the cloud, but I may not have understood it fully.

| lovedswain wrote:
| It instantiates multiple logical PCI adaptors for a single physical adaptor. The logical adaptors can then be mapped into VMs, which can directly program a hardware-virtualized view of the graphics card. Intel has the same feature in their graphics and networking chips.

| ur-whale wrote:
| Thanks for the explanation, but that's more of a "this is how it works" than a "this is why it's useful".
|
| What would be the main use case?

| kjjjjjjjjjjjjjj wrote:
| Four people sharing 1 CPU and 1 GPU that is running a hypervisor, with separate installations of Windows for gaming.
|
| Basically, any workload that requires sharing a GPU between discrete VMs.

| wmf wrote:
| The use case is allowing the host system and VM(s) to access the same GPU at the same time.

| ur-whale wrote:
| Yeah, I got that from the technical explanation.
|
| What's the _practical_ use case, as in, when would I need this?
|
| [EDIT]: To maybe ask a better way: will this practically help me train my DNN faster?
|
| Or if I'm a cloud vendor, will this allow me to deploy cheaper GPUs for my users?
|
| I guess I'm asking about the economic value of the hack.

| lovedswain wrote:
| Running certain ML models in VMs
|
| Running CUDA in VMs
|
| Running transcoders in VMs
|
| Running <anything that needs a GPU> in VMs

| ur-whale wrote:
| This is the exact same information you posted above.
|
| Please see my edit.

| [deleted]

| jandrese wrote:
| You have a Linux box but you want to play a game and it doesn't work properly under Proton, so you spin up a Windows VM to play it instead.
|
| The host still wants access to the GPU to do stuff like compositing windows and H.265 encode/decode.

| skykooler wrote:
| And outputting anything to the screen in general. Usually, your monitor(s) are plugged into the ports on the GPU.

| jowsie wrote:
| Same as any hypervisor/virtual machine setup: sharing resources. You can build one big server with one big GPU and have multiple people doing multiple things on it at once, or one person using all the resources for a single intensive load.

| ur-whale wrote:
| Thanks, this is a concise answer.
|
| However, I was under the impression - at least on Linux - that I could run multiple workloads in parallel on the same GPU without having to resort to vGPU.
|
| I seem to be missing something.

| antattack wrote:
| If you are running Linux in a VM, vGPU will allow acceleration for OpenGL, WebGL and Vulkan applications - games, CAD, CAM and EDA tools, for example.

| hesk wrote:
| In addition to the answer by skykooler, virtual GPUs also allow you to set hard resource limits (e.g., amount of L2 cache, number of streaming multiprocessors), so different workloads do not interfere with each other.

| cosmie wrote:
| This[1] may help.
|
| What you're saying is true, but it's generally done using either the API remoting or device emulation methods mentioned on that wiki page. In those cases, the VM does not see your actual GPU device, but an emulated device provided by the VM software. I'm running Windows within Parallels on a Mac, and here[2] is a screenshot showing the different devices each sees.
|
| In the general case, the multiplexing is all software based. The guest VM talks to an emulated GPU, the virtualized device driver passes those calls to the hypervisor/host, which then generates equivalent calls to the GPU, and then everything goes back up the chain. So while you're still ultimately using the GPU, the software-based indirection introduces a performance penalty and a potential bottleneck. You're also limited to the cross-section of capabilities exposed by your virtualized GPU driver, your hypervisor system, and the driver being used by that hypervisor (or host OS, for Type 2 hypervisors). The table under API remoting shows just how varied 3D acceleration support is across different hypervisors.
|
| As an alternative to that, you can use fixed passthrough to directly expose your physical GPU to the VM. This lets you tap into the full capabilities of the GPU (or other PCI device) and achieves near-native performance. The graphics calls you make in the VM now go directly to the GPU, cutting out the game of telephone that emulated devices play. Assuming, of course, your video card drivers aren't actively trying to block you from running within a VM[3].
|
| The problem is that when a device is assigned to a guest VM in this manner, that VM gets exclusive access to it. Even the host OS can't use it while it's assigned to the guest.
|
| This article is about the fourth option - mediated passthrough. The vGPU functionality enables the graphics card to expose itself as multiple logical interfaces. Every VM gets its own logical interface to the GPU and sends calls directly to the physical GPU, as in normal passthrough mode, and the hardware handles the multiplexing instead of the host/hypervisor worrying about it. Which gives you the best of both worlds.
|
| [1] https://en.wikipedia.org/wiki/GPU_virtualization
|
| [2] https://imgur.com/VMAGs5D
|
| [3] https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVM...
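To make the mediated-passthrough option concrete: on Linux it surfaces as the kernel's mdev (mediated device) sysfs interface. Below is a minimal sketch of listing the advertised vGPU profiles and creating an instance from Python - the PCI address and the "nvidia-55" profile name are illustrative assumptions, and the create step needs root plus a vGPU-capable driver:

    import uuid
    from pathlib import Path

    # Assumed PCI address of a vGPU-capable card; adjust for your system.
    gpu = Path("/sys/bus/pci/devices/0000:01:00.0")
    types_dir = gpu / "mdev_supported_types"

    # Each subdirectory is one vGPU profile the driver advertises.
    for t in sorted(types_dir.iterdir()):
        name = (t / "name").read_text().strip()
        avail = (t / "available_instances").read_text().strip()
        print(f"{t.name}: {name}, {avail} instance(s) available")

    # Writing a UUID to `create` instantiates one vGPU ("nvidia-55" is a
    # placeholder profile). The resulting mdev can then be handed to QEMU
    # as a vfio-pci device, while the host keeps using the physical GPU.
    instance = uuid.uuid4()
    (types_dir / "nvidia-55" / "create").write_text(str(instance))
    print(f"created vGPU instance {instance}")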
| skykooler wrote:
| You can, but only directly under that OS. If you wanted to run, say, a Windows VM for a game that doesn't work in Wine, you'd need some way to give a virtual GPU to the virtual machine. (As it is now, the only way you'd be able to do this is to have a separate GPU that's dedicated to the VM and pass that through entirely.)

| noodlesUK wrote:
| I wish Nvidia would open this up properly. The fact that Intel integrated GPUs can do GVT-g while I literally can't buy a laptop that will do vGPU passthrough with an Nvidia card for any amount of money is infuriating.

| my123 wrote:
| GVT-g is gone on 10th-gen (Ice Lake) integrated graphics and later. It's not supported on Intel dGPUs either.

| cercatrova wrote:
| For virtualized Windows from Linux, check out Looking Glass, which I posted about previously:
|
| https://news.ycombinator.com/item?id=22907306

| albertzeyer wrote:
| The Python script actually mostly uses Frida (https://frida.re/) scripting. I hadn't seen Frida before, but it looks very powerful. I did some similar (but very basic) things with GDB/LLDB scripting before, but Frida seems to be made for exactly this kind of thing.
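For readers who haven't seen it: Frida injects a JavaScript engine into a running process and is driven from a Python script. A toy sketch in that spirit - attaching to one of Nvidia's vGPU daemons and logging its ioctl calls; the daemon name and the hook body are illustrative, not the project's actual unlock logic:

    import frida

    # Attach to a running process by name (assumes nvidia-vgpud is running).
    session = frida.attach("nvidia-vgpud")

    # The hook itself is JavaScript and executes inside the target process.
    script = session.create_script("""
    Interceptor.attach(Module.getExportByName(null, "ioctl"), {
        onEnter(args) {
            // args[1] is the ioctl request code; report it to Python.
            send("ioctl 0x" + args[1].toString(16));
        }
    });
    """)
    script.on("message", lambda message, data: print(message))
    script.load()
    input("Hooked; press Enter to detach.\n")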
| madjam002 wrote:
| I built an "X gamers/workstations, 1 CPU" type build last year, and this has been the main problem: I have two GPUs, one of which is super old, and I have to choose which one I want to use when I boot up a VM.
|
| Will definitely be checking this out!

| WrtCdEvrydy wrote:
| This is a dumb question, but which hypervisor configuration is this targeted at?
|
| There's a lot of detail in the link, which I appreciate, but maybe I missed it.

| sudosysgen wrote:
| Amazing! Simply amazing!
|
| This not only enables the use of GPGPU in VMs, but also enables the use of a single GPU to virtualize Windows video games from Linux!
|
| This means that one of the major problems with Linux on the desktop for power users goes away, and it also means that we can now deploy Linux-only GPU tech such as HIP on any operating system that supports this trick!

| cercatrova wrote:
| For virtualized Windows from Linux, check out Looking Glass, which I posted about previously:
|
| https://news.ycombinator.com/item?id=22907306

| zucker42 wrote:
| That requires two GPUs.

| ur-whale wrote:
| > Amazing! Simply amazing!
|
| If it's such a cool feature, why does NVidia lock it away on non-Tesla H/W?
|
| [EDIT]: Funny, but the answers to this question actually provide way better answers to the other question I posted in this thread (as in: what is this for).

| sudosysgen wrote:
| Because otherwise people would be able to use non-Tesla GPUs for cloud compute workloads, drastically reducing the cost of cloud GPU compute, and it would also enable the use of non-Tesla GPUs in local GPGPU clusters - additionally reducing workstation GPU sales due to more efficient resource use.
|
| GPUs are a duopoly due to intellectual property laws and high costs of entry (the only companies I know of that are willing to compete are Chinese, and only as a result of sanctions), so for NVidia this just allows for more profit.

| userbinator wrote:
| Interestingly, Intel is probably the most open with its GPUs, although it wasn't always that way; perhaps they realised they couldn't compete on performance alone.

| bayindirh wrote:
| I think AMD is on par with Intel, no?

| colechristensen wrote:
| Openness usually seems to be a feature of the runners-up.

| moonbug wrote:
| Trivial arithmetic will tell you it's not the cost of the hardware that makes AWS and Azure GPU instances expensive.

| sudosysgen wrote:
| Certainly - AWS, GCP and Azure are priced far beyond simple hardware cost even for CPU instances; there are hosts that are 2-3x cheaper for most uses with equivalent hardware resources.

| semi-extrinsic wrote:
| Yeah, but now the comparison for many companies (e.g. an R&D dept. dabbling a bit in machine learning) becomes "buy one big box with 4x RTX 3090 for ~$10k and spin up VMs on that as needed" versus the cloud bill. Previously the cost of owning physical hardware with that capability would have been a lot higher.
|
| This has the potential to challenge the cloud case for sporadic GPU use, since cloud vendors cannot buy RTX cards. But it would require that the tooling becomes simple to use and reliable.
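As a back-of-the-envelope check on that comparison (every number here is an illustrative assumption, not a quoted price):

    box_cost = 10_000   # 4x RTX 3090 workstation, USD (assumed)
    gpus = 4
    cloud_rate = 3.00   # USD per GPU-hour for a comparable instance (assumed)

    hours = box_cost / (gpus * cloud_rate)
    print(f"break-even after ~{hours:.0f} hours of full 4-GPU use")
    # ~833 hours, i.e. roughly five weeks of round-the-clock training,
    # after which the owned box is cheaper than the cloud bill.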
| simcop2387 wrote:
| Entirely for market segmentation. The ones they allow it on are much more expensive. With this, someone could create a cloud game-streaming service using normal consumer cards, dividing them up for a much cheaper experience than the $5k+ cards it's currently allowed on. The recent change to allow virtualization at all (removing the Code 43 block) enables some of that, but it does not let you, say, take a 3090, split it up for four customers, and give each of them 3060-like performance for a fraction of the cost.

| lostmsu wrote:
| I am interested in the recent change you are referring to. Is there a good article on how to use it on Windows, or at least Linux?

| sirn wrote:
| The OP is referring to the GPU passthrough setup[1], which passes a GPU from a Linux host through to a Windows guest (e.g. for gaming). This is done by detaching the GPU from the host and passing it to the VM, so most setups require two GPUs, since one needs to remain with the host (although single-GPU passthrough is also possible).
|
| Nvidia's driver used to detect whether the host was a VM and return error Code 43, blocking the card from being used (for market segmentation between GeForce and Quadro). This was usually solved by either patching the VBIOS or hiding KVM from the guest, but it was painful and unreliable. Nvidia removed this limitation with the RTX 30 series.
|
| This vGPU feature unlock (TFA) allows the GPU to be virtualized without requiring it to first be detached from the host, vastly simplifying the setup and opening up the possibility of multiple VMs running on a single GPU, each with its own dedicated vGPU.
|
| [1]: https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVM...

| my123 wrote:
| The RTX A6000 is at USD 4650, with 48GB of VRAM and the full chip enabled (+ECC, vGPU, and pro drivers, of course).
|
| The RTX 3090, with 24GB of VRAM, is at USD 1499.
|
| Consumer dGPUs from other HW vendors do not have virtualisation capabilities either.

| baybal2 wrote:
| Well, I believe Intel has it on iGPUs, just very well hidden.

| my123 wrote:
| https://news.ycombinator.com/item?id=26367726
|
| Not anymore.

| RicoElectrico wrote:
| Ngreedia - the way it's meant to be paid(tm)

| IncRnd wrote:
| Nvidia sells an ever-greater percentage of its products to the data-center market, and consumers purchase a shrinking portion. They do not want to flatten their currently upward-trending data-center sales of high-end cards.
|
| _NVIDIA's stock price has doubled since March 2020, and most of these gains can be largely attributed to the outstanding growth of its data center segment. Data center revenue alone increased a whopping 80% year over year, bringing its revenue contribution to 37% of the total. Gaming still contributes 43% of the company's total revenues, but NVIDIA's rapid growth in data center sales fueled a 39% year-over-year increase in its companywide first-quarter revenues._
|
| _The world's growing reliance on public and private cloud services requires ever-increasing processing power, so the market available for capture is staggering in its potential. Already, NVIDIA's data center A100 GPU has been mass adopted by major cloud service providers and system builders, including Alibaba (NYSE:BABA) Cloud, Amazon (NASDAQ:AMZN) AWS, Dell Technologies (NYSE:DELL), Google (NASDAQ:GOOGL) Cloud Platform, and Microsoft (NASDAQ:MSFT) Azure._
|
| https://www.fool.com/investing/2020/07/22/data-centers-hold-...

| matheusmoreira wrote:
| To make people pay more.
| Youden wrote:
| > This means that one of the major problems with Linux on the desktop for power users goes away, and it also means that we can now deploy Linux-only GPU tech such as HIP on any operating system that supports this trick!
|
| If you're brave enough, you can already do that with GPU passthrough. It's possible to detach the entire GPU from the host, transfer it to a guest, and then get it back from the guest when the guest shuts down.

| [deleted]

| spijdar wrote:
| This could be way more practically useful than GPU passthrough. GPU passthrough demands at least two GPUs (an integrated one counts), requires at least two monitors (or two video inputs on one monitor), and in my experience has a tendency to do wonky things when the guest shuts off, since the firmware doesn't seem to like soft resets without the power being cycled. It also requires some CPU and PCIe controller settings that aren't always present to run safely.
|
| This could allow a single GPU with a single video output to be used to run games in a Windows VM, without all the hoops that GPU passthrough entails. I'd definitely be excited for it!

| sudosysgen wrote:
| Certainly, but this requires BIOS/UEFI fiddling, and it also means you can't use both Windows and Linux at the same time, which is very important for me.

| airocker wrote:
| This is super! What would it take to abstract it the way CPU and memory are abstracted, by specifying limits in cgroups? Limits could be things like GPU memory size or degree of parallelism.

| liuliu wrote:
| One thing I want to figure out (because I don't have a dedicated Windows gaming desktop), and the documentation on the internet seems sparse: it is my understanding that if I want to use PCIe passthrough with a Windows VM, these GPUs cannot be available to the host machine at all - or technically they can, but I need to do some scripting to make sure the NVIDIA driver doesn't own these PCIe lanes before opening the Windows VM, and re-enable it after shutdown?
|
| If I go with the vGPU solution, I don't need to turn the NVIDIA driver on/off for these PCIe lanes when running the Windows VM? (I won't use these GPUs for display on the host machine.)

| Youden wrote:
| > One thing I want to figure out (because I don't have a dedicated Windows gaming desktop), and the documentation on the internet seems sparse: it is my understanding that if I want to use PCIe passthrough with a Windows VM, these GPUs cannot be available to the host machine at all - or technically they can, but I need to do some scripting to make sure the NVIDIA driver doesn't own these PCIe lanes before opening the Windows VM, and re-enable it after shutdown?
|
| The latter statement is correct. The GPU can be attached to the host, but it has to be detached from the host before the VM starts using it. You may also need to get a dump of the GPU ROM and configure your VM to load it at startup.
|
| Regarding the script, mine resembles [0]. You need to remove the NVIDIA drivers and then attach the card to VFIO, then do the opposite afterwards. You may also need to image your GPU ROM [1].
|
| [0]: https://techblog.jeppson.org/2019/10/primary-vga-passthrough...
|
| [1]: https://clayfreeman.github.io/gpu-passthrough/#imaging-the-g...
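Scripts like the ones linked above mostly boil down to driver rebinding through sysfs. A hedged sketch of that dance - the PCI address is an assumption, it must run as root, and nothing on the host may be using the card:

    from pathlib import Path

    GPU = "0000:01:00.0"  # assumed PCI address of the card to pass through
    dev = Path("/sys/bus/pci/devices") / GPU

    def rebind(driver: str) -> None:
        # Release the device from whatever driver currently owns it...
        if (dev / "driver").exists():
            (dev / "driver" / "unbind").write_text(GPU)
        # ...then steer the next probe to the requested driver.
        (dev / "driver_override").write_text(driver)
        Path("/sys/bus/pci/drivers_probe").write_text(GPU)

    rebind("vfio-pci")  # before booting the guest
    # ... run the VM ...
    rebind("nvidia")    # hand the card back to the host afterwards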
| matheusmoreira wrote:
| Exactly. With GPU virtualization, the driver is able to share the GPU's resources with multiple systems, such as the host operating system and guest virtual machines. Shame on Nvidia for arbitrarily locking us out of this feature.

| [deleted]

| DCKing wrote:
| Dual booting is for chumps. If I could run a base Linux system and arbitrarily run fully hardware-accelerated VMs of multiple Linux distros, BSDs and Windows, I'd be all over that. I could pretend here that I really _need_ the ability to quickly switch between OSes, that I'd like VM-based snapshots, or that I have big use cases to multiplex the hardware power in my desktop box like that. I really don't need it. I just want it.
|
| I really hope Intel sees this as an opportunity for their DG2 graphics cards due out later this year.
|
| If anyone from Intel is reading this: if you guys want to carve out a niche for yourselves and have power users advocate for your hardware - this is it. Enable SR-IOV for your upcoming Xe DG2 GPU line just as you do for your Xe integrated graphics. Just observe the lengths that people go to for their Nvidia cards, injecting code into their proprietary drivers just to run this. You can make this a champion feature just by _not disabling_ something your hardware can already do. Add some driver support into the mix and you'll have an instant enthusiast fanbase for years to come.

| strstr wrote:
| Passthrough is workable right now. It's a pain to get set up, but it is workable.
|
| You don't need vGPU to get the job done. I've had two setups over time: one based on a jank old secondary GPU used by the VM host, another based on just using the jank integrated graphics on my chip.
|
| Even still, I dual boot because it just works. It always works, and boot times are crazy low for Windows these days. No fighting with drivers. No fighting with latency issues for non-passthrough devices. It all just works.

| DCKing wrote:
| Oh, I'm aware of passthrough. It's just a complete second-class citizen because it isn't really virtualization; it's a hack. Virtualization is about multiplexing hardware. Passthrough is the opposite of multiplexing hardware: it's about yanking a peripheral from your host system and shoving it into one single guest VM. The fact that this yanking is poorly supported and has poor UX makes complete sense.
|
| I consider true peripheral multiplexing with true GPU virtualization to be the way of the future. It's true virtualization and doesn't require you to sacrifice and/or babysit a single PCIe-connected GPU. Passthrough is just a temporary hacky workaround that people have to apply now because there's nothing better.
|
| In the best-case scenario - with hardware SR-IOV support plus basic driver support for it - enabling GPU access in your VM would be a simple checkbox in the virtualization software on the host. GPU passthrough can't ever get there in terms of usability.

| fock wrote:
| I have a Quadro card, and at least for Windows guests I can easily move the card between running guests (Linux has some problems with yanking, though). Still, virtualized GPUs would be nice.

| jagrsw wrote:
| It works with some cards, not with others. E.g. for the Radeon Pro W5500 there's no known card reset method that works (no method from https://github.com/gnif/vendor-reset works), so I had to do an S3 suspend before running a VM, with _systemctl suspend_ or _rtcwake -m mem -s 2_.
|
| Now I have an additional RTX 2070 and it works OK.
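The parent's workaround, expressed as a script (the VM name is a placeholder; requires root and working S3 suspend):

    import subprocess

    # Suspend to RAM for ~2 seconds so the card gets a real power cycle...
    subprocess.run(["rtcwake", "-m", "mem", "-s", "2"], check=True)
    # ...then start the guest while the GPU is freshly reset.
    subprocess.run(["virsh", "start", "win10"], check=True)  # assumed VM name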
| blibble wrote:
| Passthrough has become very easy to set up: just add your PCI card in virt-manager and away you go.
|
| Saying that, these days I just have a second PC with a load of cheap USB switches...

| m463 wrote:
| I've been running Proxmox. I haven't run Windows, but I have Ubuntu VMs with full hardware GPU passthrough. I've passed through Nvidia and Intel GPUs.
|
| I also have a macOS VM, but I didn't set up GPU passthrough for that. Tried it once, it hung, didn't try it again. I use remote desktop anyway.
|
| Here are some misc links:
|
| https://manjaro.site/how-to-enable-gpu-passthrough-on-proxmo...
|
| https://manjaro.site/tips-to-create-ubuntu-20-04-vm-on-proxm...
|
| https://pve.proxmox.com/wiki/Pci_passthrough
|
| https://blog.konpat.me/dev/2019/03/11/setting-up-lxc-for-int...

| easton wrote:
| Given that I use my desktop remotely 90% of the time these days, I'm going to set this up next time I'm home and move my Windows stuff into a VM. Then I can run Docker natively on the host, and when Windows stops cooperating, just create a new VM (which I can't do remotely with it running on bare metal, at least not without the risk of it not coming back up).

| schaefer wrote:
| There's a _lot_ of customer loyalty on the table waiting for the first GPU manufacturer to unlock this feature on consumer-grade cards without forcing us to resort to hacks.

| [deleted]

| neatze wrote:
| To me this is a laughably naive question, but I'll ask it anyway.
|
| My understanding is that, per application, the CPU/GPU can make only a single draw call at a time, in a sequential manner (e.g. CPU->GPU->CPU->GPU).
|
| Could vGPUs be used for concurrent draw calls from multiple processes of a single application?

| milkey_mouse wrote:
| > My understanding is that, per application, the CPU/GPU can make only a single draw call at a time, in a sequential manner.
|
| The limitation you're probably thinking of is in the OpenGL drivers/API, not in the GPU driver itself. OpenGL has global (per-application) state that needs to be tracked, so outside of a few special cases like texture uploading, you have to issue OpenGL calls from only one thread. If applications use the lower-level Vulkan API, they can use a separate "command queue" for each thread. Both of those are graphics APIs; I'm less familiar with the compute-focused ones, but I'm sure they can also process calls from multiple threads.

| milkey_mouse wrote:
| And vGPUs are isolated from one another - that's the whole point - so using multiple of them in one application would be very difficult, as I don't think they can share data/memory in any way.

| neatze wrote:
| My primitive thoughts:
|
| Threaded Computation on CPU -> Single GPU Call -> Parallel Computation on GPU -> Threaded Computation on CPU ...
|
| I wonder if it can be used in such a way:
|
| Async Concurrent Computation on CPU -> Async Concurrent GPU Calls -> Parallel Time-Independent Computations on GPU -> Async Concurrent Computation on CPU

| [deleted]

| shmerl wrote:
| Is this for SR-IOV? It's too bad SR-IOV isn't supported on regular desktop AMD GPUs, for example in the Linux driver.

| Nullabillity wrote:
| Yes, this is basically NVidia's SR-IOV.
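For contrast, on hardware that does expose SR-IOV (server NICs, some datacenter GPUs), carving out virtual functions is a one-line sysfs write - which is exactly what makes its absence on consumer GPUs frustrating. The PCI address here is an assumption, and the write needs root:

    from pathlib import Path

    dev = Path("/sys/bus/pci/devices/0000:03:00.0")  # assumed SR-IOV device

    print("VFs supported:", (dev / "sriov_totalvfs").read_text().strip())
    # Each virtual function then shows up as its own PCI device that can be
    # passed to a guest, while the physical function stays with the host.
    (dev / "sriov_numvfs").write_text("4")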
| jarym wrote:
| Hacking at its finest! Nice.

| h2odragon wrote:
| > In order to make these checks pass, the hooks in vgpu_unlock_hooks.c will look for an ioremap call that maps the physical address range that contains the magic and key values, recalculate the addresses of those values into the virtual address space of the kernel module, monitor memcpy operations reading at those addresses, and if such an operation occurs, keep a copy of the value until both are known, locate the lookup tables in the .rodata section of nv-kernel.o, find the signature and data blocks, validate the signature, decrypt the blocks, edit the PCI device ID in the decrypted data, reencrypt the blocks, regenerate the signature, and insert the magic, blocks and signature into the table of vGPU-capable magic values.
|
| And that's what they do. I'm very grateful _I_ wasn't required to figure that out.

| stingraycharles wrote:
| I love the conciseness of this explanation. In just a few sentences, I completely understand the solution, but at the same time I also understand the black-magic wizardry that was required to pull it off.

| jacquesm wrote:
| Not to mention the many hours or days of being stumped. This sort of victory typically doesn't happen overnight.
|
| What bugs me about companies like NV is that if they just sold their hardware and published the specs, they'd probably sell _more_ than with all this ridiculously locked-down nonsense; it's just a lot of work thrown at limiting your customers and protecting a broken business model.

| eli wrote:
| But they'd also sell fewer high-end models. I don't doubt that they've done the math.

| minimalist wrote:
| Related but different:
|
| - nvidia-patch [0]: "This patch removes restriction on maximum number of simultaneous NVENC video encoding sessions imposed by Nvidia to consumer-grade GPUs."
|
| - About a week ago: "NVIDIA Now Allows GeForce GPU Pass-Through For Windows VMs On Linux" [1]. Note, this is only for the driver on Windows VM guests, not GNU/Linux guests.
|
| Hopefully the project in the OP will mean that GPU access is finally possible for GNU/Linux guests on Xen. Thank you for sharing, OP.
|
| [0]: https://github.com/keylase/nvidia-patch
|
| [1]: https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-G...
___________________________________________________________________
(page generated 2021-04-09 23:00 UTC)