[HN Gopher] GTK: Introducing Graphics Offload
___________________________________________________________________
 
GTK: Introducing Graphics Offload
 
Author : signa11
Score  : 249 points
Date   : 2023-11-18 08:31 UTC (14 hours ago)
 
(HTM) web link (blog.gtk.org)
(TXT) w3m dump (blog.gtk.org)
 
| ng55QPSK wrote:
| Is the same infrastructure available in Windows and MacOS?
| diath wrote:
| The last paragraph says:
|
| > At the moment, graphics offload will only work with Wayland
| on Linux. There is some hope that we may be able to implement
| similar things on MacOS, but for now, this is Wayland-only. It
| also depends on the content being in dmabufs.
| knocte wrote:
| From the article:
|
| > What are the limitations?
|
| > At the moment, graphics offload will only work with Wayland
| on Linux. There is some hope that we may be able to implement
| similar things on MacOS, but for now, this is Wayland-only. It
| also depends on the content being in dmabufs.
| PlutoIsAPlanet wrote:
| macOS supports similar things in its own native stack, but
| GTK doesn't make use of it.
| pjmlp wrote:
| Nope, it is yet another step making Gtk only relevant for Linux
| development.
| andersa wrote:
| Has it been relevant for something else before?
| pjmlp wrote:
| Yes, back when G stood for Gimp, and not GNOME.
| danieldk wrote:
| Even in those days, Gtk+ applications were quite horrible
| on non-X11 platforms. GTK has never been a good
| cross-platform toolkit, in contrast to e.g. Qt.
| pjmlp wrote:
| I guess people enjoying GIMP and Inkscape would beg to
| differ.
| danieldk wrote:
| Maybe it's a thing on Windows (I don't know), but I've
| never seen anyone use GIMP or Inkscape on macOS. I'm
| pretty sure they exist somewhere, but all Mac users I
| know use Photoshop, Pixelmator or Affinity Photo rather
| than GIMP.
| ognarb wrote:
| Inkscape devs have a lot of trouble making their app work
| on other OSes.
| jdub wrote:
| Supporting a feature on one platform does not make a toolkit
| less relevant or practical on another platform.
| pjmlp wrote:
| Except this has been happening for quite a while, hence why
| a couple of cross-platform projects have migrated from Gtk
| to Qt, including Subsurface, a bit ironically, given the
| relationship of the project to Linus.
| jdub wrote:
| Gtk has always been primarily built by and for Linux
| users.
| pjmlp wrote:
| The GIMP Tolkit was always cross-platform, the GNOME
| Tolkit not really.
| jdub wrote:
| That is ahistorical, and the misnaming doesn't help make
| your point.
| pjmlp wrote:
| As a former random Gtkmm contributor, with articles in
| The C/C++ Users Journal, I am not the revisionist here.
| DonHopkins wrote:
| What's a Tolkit? And why two of them? I thought GTK was
| the Toolkit, GIMP was the Image Manipulation Program, and
| Gnome was the desktop Network Object Model Environment.
| Am I a revisionist here? (I certainly have my
| reservations about them!)
| pjmlp wrote:
| GTK stands for The GIMP Toolkit, as it was originally used
| to write GIMP, which actually started as a MOTIF
| application.
|
| When GNOME adopted GTK as its foundation, there was a
| clear separation between GTK and the GNOME libraries,
| back in the 1.0 - 2.0 days.
|
| Eventually GNOME needs became the GTK roadmap.
|
| The rest one can find in the history books.
| DonHopkins wrote:
| Dude, I know.
I've been implementing user interface | toolkits since the early 80's, but I've still never heard | of a "Tolkit", which you mentioned twice, so I asked you | what it was -- are you making a silly pun like "Tollkit" | for "Toolkit" or "Lamework" for "Framework" or "Bloatif" | for "Motif" and I'm missing it? No hits on urban | dictionary, even. And also you still haven't explained | whether I'm a revisionist or not. | | Just like you, I love to write articles about user | interface stuff all the time, too. Just in the past week: | | My enthusiastic but balanced response to somebody who | EMPHATICALLY DEMANDED PIE MENUS ONLY for GIMP, and who | loves pie fly, but pushed my button by defending the name | GIMP by insisting that instead of the GIMP project simply | and finally conceding its name is offensive, that our | entire society adapt by globally re-signifying a widely | known offensive hurtful word (so I suggested he first go | try re-signifying the n-word first, and see how that | went): | | https://news.ycombinator.com/item?id=38233793 | | (While I would give more weight to the claim that the | name GIMP is actually all about re-signifying an | offensive term if it came from a qualified and empathic | and wheelchair using interface designer like Haraldur | Ingi Thorleifsson, I doubt that's actually the real | reason, just like it's not white people's job to re- | signify the n-word by saying it all the time...) | | Meet the man who is making Iceland wheelchair accessible | one ramp at a time: | | https://scoop.upworthy.com/meet-the-man-who-is-making- | icelan... | | Elon Musk apologises after mocking laid-off Twitter | worker, questioning his disability: | | https://www.abc.net.au/news/2023-03-08/elon-musk- | haraldur-th... | | The article about redesigning GIMP we were discussing | credited Blender with being the first to show what mouse | buttons do what at the bottom of the screen, which | actually the Lisp Machine deserves credit for, as far as | I know: | | https://news.ycombinator.com/item?id=38237231 | | I made a joke about how telling GIMP developers to make | it more like Photoshop was like telling RMS to develop | Open Software for Linux, instead of Free Software for | GNU/Linux, and somebody took the bait so I flamed about | the GIMP developer's lack of listening skills: | | https://news.ycombinator.com/item?id=38238274 | | Somebody used the phrase "Easy as pie" in a discussion | about user interface design so I had to chime in: | | https://news.ycombinator.com/item?id=38239113 | | Discussion about HTML Web Components, in which I confess | my secret affair with XML, XSLT, obsolete proprietary | Microsoft technologies, and Punkemon pie menus: | | https://news.ycombinator.com/item?id=38253752 | | Deep interesting discussion about Blender 4.0 release | notes, focusing on its historic development and its | developer's humility and openness to its users' | suggestions, in which I commented on its excellent Python | integration. | | https://news.ycombinator.com/item?id=38263171 | | Comment on how Blender earned its loads of money and | support by being responsive to its users. 
| | https://news.ycombinator.com/item?id=38232404 | | Dark puns about user interface toolkits and a cheap shot | at Motif, with an analogy between GIMP and Blender: | | https://news.ycombinator.com/item?id=38263088 | | A content warning to a parent who wanted to know which | videos their 8-year-old should watch on YouTube to learn | Blender: | | https://news.ycombinator.com/item?id=38288629 | | Posing with a cement garden gnome flipping the bird with | Chris Toshok and Miguel de Icaza and his mom at GDC2010: | | https://www.facebook.com/photo/?fbid=299606531754&set=a.5 | 173... | | https://www.facebook.com/photo/?fbid=299606491754&set=a.5 | 173... | | https://www.facebook.com/photo/?fbid=299606436754&set=a.5 | 173... | vore wrote: | It was clearly a typo you could choose to ignore | charitably instead of nitpick. Also, what is the rest of | this comment and how is it related to GTK? | DonHopkins wrote: | Because he was incorrectly nitpicking himself, and was | wrong to call somebody else a revisionist without citing | any proof, while he was factually incorrect himself, and | offering an appeal to authority of himself as a writer | and "random Gtkmm contributor" instead. I too have lots | of strong opinions about GTK, GNOME, and GIMP, so I am | happy for the opportunity to write them up, summarize | them, and share them. | | You'll have to read the rest of the comment and follow | the links to know what it says, because I already wrote | and summarized it, and don't want to write it again just | for you, because I don't believe you'd read it a second | time if you didn't read it the first time. Just use | ChatGPT, dude. | | Then you will see that it has a lot to do with GTK and | GNOME and GIMP, even including exclusive photos of Miguel | de Icaza and his mom with a garden gnome flipping the | bird. | pjmlp wrote: | Oopsie I touched a nerve. | DonHopkins wrote: | You HAD to mention MOTIF! ;) There's a reason I call it | BLOATIF and SLOWTIF... | | https://donhopkins.medium.com/the-x-windows- | disaster-128d398... | | >The Motif Self-Abuse Kit | | >X gave Unix vendors something they had professed to want | for years: a standard that allowed programs built for | different computers to interoperate. But it didn't give | them enough. X gave programmers a way to display windows | and pixels, but it didn't speak to buttons, menus, scroll | bars, or any of the other necessary elements of a | graphical user interface. Programmers invented their own. | Soon the Unix community had six or so different interface | standards. A bunch of people who hadn't written 10 lines | of code in as many years set up shop in a brick building | in Cambridge, Massachusetts, that was the former home of | a failed computer company and came up with a "solution:" | the Open Software Foundation's Motif. | | >What Motif does is make Unix slow. Real slow. A stated | design goal of Motif was to give the X Window System the | window management capabilities of HP's circa-1988 window | manager and the visual elegance of Microsoft Windows. We | kid you not. | | >Recipe for disaster: start with the Microsoft Windows | metaphor, which was designed and hand coded in assembler. | Build something on top of three or four layers of X to | look like Windows. Call it "Motif." Now put two 486 boxes | side by side, one running Windows and one running | Unix/Motif. Watch one crawl. Watch it wither. Watch it | drop faster than the putsch in Russia. Motif can't | compete with the Macintosh OS or with DOS/Windows as a | delivery platform. 
| smoldesu wrote:
| > Eventually GNOME needs became the GTK roadmap.
|
| Exactly? If you're still holding out for GTK to be a non-Linux
| toolkit in 2023 then you're either an incredibly misguided
| contributor or ignorant of the history behind the toolkit. The
| old GTK does not exist anymore; you either use GNOME's stack
| or you don't.
| chrismorgan wrote:
| GNOME co-opted and sabotaged GTK for anyone that's not GNOME.
| GTK used to be capable of being fairly OS-neutral, and was
| certainly quite neutral within Linux, and so became the widget
| toolkit of choice for diverse desktop environments and worked
| well thus; but over time GNOME has taken it over completely,
| and the desires of other desktop environments are utterly
| ignored. The GNOME Foundation has become a very, very bad
| custodian for GTK.
|
| As you say, the old GTK is dead. GNOME murdered it. I mourn it.
| smoldesu wrote:
| Yeah, I don't disagree with anything you've said. Still,
| though, I use GTK because it works, and I think the pushback
| against it is silly. GTK was never destined to be the
| cross-platform native framework. If that were attainable,
| people would have forked GTK 2 (for what?) or GTK 3 (too
| quirky). Now we're here, and the only stakeholder in the
| project is the enormously opinionated GNOME team.
|
| They've made a whole lot of objective and subjective missteps
| in the past, but I don't think it's fair to characterize them
| as an evil party here. They did the work, they reap the
| rewards, and they take the flak for the myriad of different
| ways the project could/should have gone.
| jamesfmilne wrote:
| macOS has IOSurface [0], so it can be done there too. It would
| require someone to implement it for GTK.
|
| [0] https://developer.apple.com/documentation/iosurface
| audidude wrote:
| When I wrote the macOS backend and GL renderer I made them use
| IOSurface already. So it's really a matter of setting up
| CALayer automatically the same way that we do it on Linux.
|
| I don't really have time for that though; I only wrote the
| macOS port because I had some extra holiday hacking time.
| torginus wrote:
| On Windows and DirectX, you have the concept of Shared
| Handles, which are essentially handles you can pass across
| process boundaries. It also comes with a mutex mechanism to
| signal who is using the resource at the moment. Fun fact:
| Windows at the kernel level works with the concept of
| 'objects', which can be file handles, window handles, threads,
| mutexes, or in this case textures, which are reference
| counted. Sharing a particular texture is just exposing the
| handle to multiple processes.
|
| A bit of reading if you are interested:
|
| https://learn.microsoft.com/en-us/windows/win32/direct3darti...
| diath wrote:
| I wonder if there are plans to make it work with X11 in the
| future; I've yet to see the benefit of trying to switch to
| Wayland on my desktop, it just doesn't work the way my
| 8-year-old setup does.
| knocte wrote:
| I doubt they have the energy to backport bleeding-edge tech.
| PlutoIsAPlanet wrote:
| This is one of the benefits of the Wayland protocol over X:
| being able to do this kind of thing relatively
| straightforwardly.
|
| Once support for hardware planes becomes more common in
| Wayland compositors, this can be tied in to ultimately allow
| no-copy rendering to the display for non-fullscreen
| applications, which for video playback (incl. the likes of
| YouTube) means reduced CPU & GPU usage and less power draw, as
| well as reduced latency.
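 
To make that concrete, here is a minimal sketch, in C, of what this
kind of offload amounts to at the Wayland protocol level: wrapping a
decoder-produced dmabuf in a wl_buffer and attaching it to a
subsurface, so the compositor is free to put it on a hardware plane.
It assumes a client that has already bound wl_compositor,
wl_subcompositor, and zwp_linux_dmabuf_v1 from the registry and
already owns a dmabuf fd; registry boilerplate, format/modifier
negotiation, and error handling are omitted.
 
    /* Sketch: hand a decoder-produced dmabuf to the compositor on its
     * own subsurface, so it can be scanned out directly (no copies, no
     * GPU composite). "linux-dmabuf-unstable-v1-client-protocol.h" is
     * the header wayland-scanner generates from the protocol XML. */
    #include <wayland-client.h>
    #include <drm_fourcc.h>
    #include "linux-dmabuf-unstable-v1-client-protocol.h"

    struct wl_subsurface *
    offload_video (struct wl_compositor       *compositor,
                   struct wl_subcompositor    *subcompositor,
                   struct zwp_linux_dmabuf_v1 *dmabuf,
                   struct wl_surface          *parent,
                   int fd, int width, int height, int stride)
    {
      /* A dedicated surface for the video, stacked relative to the
       * parent surface that holds the rest of the window. */
      struct wl_surface *video = wl_compositor_create_surface (compositor);
      struct wl_subsurface *sub =
        wl_subcompositor_get_subsurface (subcompositor, video, parent);
      wl_subsurface_set_position (sub, 0, 0);
      wl_subsurface_set_desync (sub);  /* video updates at its own rate */

      /* Wrap the dmabuf fd in a wl_buffer; the pixels are never mapped
       * into client memory. A real client negotiates the format. */
      struct zwp_linux_buffer_params_v1 *params =
        zwp_linux_dmabuf_v1_create_params (dmabuf);
      zwp_linux_buffer_params_v1_add (params, fd,
                                      0,       /* plane index */
                                      0,       /* offset */
                                      stride,
                                      DRM_FORMAT_MOD_LINEAR >> 32,
                                      DRM_FORMAT_MOD_LINEAR & 0xffffffff);
      struct wl_buffer *buffer =
        zwp_linux_buffer_params_v1_create_immed (params, width, height,
                                                 DRM_FORMAT_XRGB8888, 0);
      zwp_linux_buffer_params_v1_destroy (params);

      wl_surface_attach (video, buffer, 0, 0);
      wl_surface_damage (video, 0, 0, width, height);
      wl_surface_commit (video);
      wl_surface_commit (parent); /* subsurface state applies with parent */
      return sub;
    }
 
Whether the buffer actually lands on a plane is then the compositor's
call; the client-side contract is only "here is an unmodified buffer
on its own surface", which is what GTK's GtkGraphicsOffload hint
arranges automatically.
 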
| AshamedCaptain wrote:
| > This is one of the benefits of the Wayland protocol over X
|
| What.
|
| The original design of X actually encouraged a separate
| surface / Window for each single widget on your UI. This was
| actually removed in Gtk+3 ("windowless widgets"). And now they
| are bringing it back just for wayland ("subsurfaces"). As far
| as I can read, it is practically the same concept.
| tadfisher wrote:
| The original design of X had clients send drawing commands to
| the X server, basically treating the server as a remote
| Cairo/Skia-like 2D rasterizer, and subwindows were a cheap way
| to avoid pixel-level damage calculations. This was obviated in
| the common case by the Xdamage extension. Later use of windows
| as a rendering surface for shared client/server buffers was
| added with Xshm, then for device video buffers with Xv.
|
| GTK3 got rid of windowed widgets because Keith Packard
| introduced the Xrender extension, which basically added 2D
| compositing to X, which was the last remaining use for
| subwindows for every widget.
| AshamedCaptain wrote:
| This is completely wrong. Xrender is completely orthogonal to
| having windows or not. Heck, Xrender takes a _window_ as
| target -- Xrender is just an extension to allow more
| complicated drawing commands to be sent to the server (like
| alpha composition). You make your toolkit programmer's life
| more complicated, not less, by having windowless widgets (at
| the very minimum you now have to complicate your rendering &
| event handling code with offsets and clip regions and the
| like).
|
| The excuse that was used when introducing windowless widgets
| was to reduce tearing/noise during resizing, as Gtk+ had
| trouble synchronizing the resizing of all the windows at the
| same time.
| play_ac wrote:
| > Xrender is completely orthogonal to having windows or not.
| Heck, Xrender takes a _window_ as target -- Xrender is just an
| extension to allow more complicated drawing commands to be
| sent to the server (like alpha composition).
|
| Yes, that's the point. When you can tell Xrender to
| efficiently composite some pixmaps, then there's really no
| reason to use sub-windows ever.
|
| > You make your toolkit programmer's life more complicated,
| not less, by having windowless widgets (at the very minimum
| you now have to complicate your code with offsets and clip
| regions and the like).
|
| No, you still had to have offsets and clip regions before too,
| because the client still had to set and update those. And it
| was more complicated, because when you made a sub-window every
| single bit of state like that had to be synchronized with the
| X server and repeatedly copied over the wire. With client-side
| rendering everything is simply stored in the client and never
| has to deal with that problem.
| AshamedCaptain wrote:
| > When you can tell Xrender to efficiently composite some
| pixmaps then there's really no reason to use sub-windows ever.
|
| There is, or we would not be having subsurfaces on Wayland or
| this entire discussion in the first place.
|
| Are you seriously arguing that the only reason to use windows
| in Xorg is to have composition? People were using Xshape/Xmisc
| and the like to handle the lack of alpha channels in the core
| protocol? This is not what I remember. I would be surprised if
| Xshape even worked on non-top-level windows. Heck, even MOTIF
| had windowless widgets (called gadgets IIRC), and the purpose
| most definitely was not composition-related.
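 
For reference, the Xrender operation being argued about, as a minimal
hedged sketch: a toolkit that has already rendered a widget into an
ARGB32 pixmap composites it into the single toplevel window at a
client-side offset, which is the windowless-widget model. The
function name and flat-window setup here are illustrative, not GTK's
actual code.
 
    /* Sketch: alpha-blend a widget's ARGB32 pixmap into one toplevel
     * window with XRenderComposite, instead of giving the widget its
     * own server-side subwindow. dpy/window/widget_pixmap assumed
     * valid; the pixmap was drawn elsewhere. */
    #include <X11/Xlib.h>
    #include <X11/extensions/Xrender.h>

    void
    draw_widget (Display *dpy, Window window, Pixmap widget_pixmap,
                 int x, int y, unsigned int width, unsigned int height)
    {
      XRenderPictFormat *argb32 =
        XRenderFindStandardFormat (dpy, PictStandardARGB32);
      XRenderPictFormat *win_fmt =
        XRenderFindVisualFormat (dpy,
                                 DefaultVisual (dpy, DefaultScreen (dpy)));

      Picture src = XRenderCreatePicture (dpy, widget_pixmap, argb32,
                                          0, NULL);
      Picture dst = XRenderCreatePicture (dpy, window, win_fmt, 0, NULL);

      /* The offset (and any clip region) lives in the client, not in
       * a server-side subwindow hierarchy. */
      XRenderComposite (dpy, PictOpOver, src, None, dst,
                        0, 0,   /* src origin  */
                        0, 0,   /* mask origin */
                        x, y,   /* dst origin  */
                        width, height);

      XRenderFreePicture (dpy, src);
      XRenderFreePicture (dpy, dst);
    }
 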
| audidude wrote:
| Drawables on Drawables doesn't help here at all.
|
| Sure, it lets you do fast 2d acceleration, but we don't use 2d
| accel infrastructure anywhere anymore.
|
| Subsurfaces have been in Wayland since the beginning of the
| protocol.
|
| This is simply getting them to work on demand so we can do
| something X (or Xv) could never do, since Drawables would get
| moved to new memory (that may not even be mappable on the CPU
| side) on every frame.
|
| And that's to actually use the scanout plane correctly, to
| avoid powering up the 3d part of the GPU when doing video
| playback on composited systems.
| hurryer wrote:
| No screen tearing is a major benefit of using a compositor.
| mrob wrote:
| And screen tearing is a major benefit of not using a
| compositor. There's an unavoidable tradeoff between image
| quality and latency. Neither is objectively better than the
| other. Xorg has the unique advantage that you can easily
| switch between them by changing the TearFree setting with
| xrandr.
| RVuRnvbM2e wrote:
| It's not unique. Wayland has a tearing protocol.
|
| https://gitlab.freedesktop.org/wayland/wayland-
| protocols/-/t...
| mrob wrote:
| This is something that every application has to opt in to
| individually. It's not a global setting like TearFree.
| RVuRnvbM2e wrote:
| This is just untrue.
| mrob wrote:
| From the XML file that describes the protocol:
|
| "This global is a factory interface, allowing clients to
| inform which type of presentation the content of their
| surfaces is suitable for."
|
| Note that "global" refers to the interface, not the setting.
| Which Wayland compositor has the equivalent feature of "xrandr
| --output [name] --set TearFree off"?
| kaba0 wrote:
| Which is the correct default? No application should
| unknowingly render half-ready frames; that's stupid. The few
| niches where it makes sense (games, 3D applications) can opt
| into it and do their own thing.
| badsectoracula wrote:
| That is subjective, i do not want the input latency induced by
| synchronizing to the monitor's refresh rate on my desktop as
| it makes it feel sluggish. The only time i want this is when i
| watch some video (and that is of course only when i actively
| watch the video, sometimes i put a video on in the background
| when i do other stuff) - so for my case the correct default is
| to have this disabled, with the only exception being when
| watching videos.
| kaba0 wrote:
| Okay, and then the program will get an event that propagates
| down to the correct component, which reacts in some way;
| everything that changes due to that is damaged, and every
| damaged component is re-rendered with synchronization from the
| framework itself. A program has to be specifically coded for
| it (e.g. a text editor writing the rendered character directly
| into the displayed buffer) to actually make efficient use of
| tearing; otherwise it will literally just tear, with zero
| benefits.
| mrob wrote:
| You don't need to do anything special. Just render to the
| front buffer immediately and don't worry about the current
| scanout position. If it's above the part of the screen you're
| updating, great, you saved some latency. If it's below,
| latency is the same as if you waited for vsync. And if it's in
| the middle, you at least get a partial update early.
| JelteF wrote:
| Could you explain in what scenario you think it is better to
| have a display show two half images slightly faster
| (milliseconds) than one full one?
| mrob wrote:
| Text editing. I mostly work on a single line at a time.
| The chance of the tear ending up in that line is low. And even
| if it does, it lasts only for a single screen refresh cycle,
| so it's not a big deal.
|
| And you're not limited to two images. As the frame rate
| increases, the number of images increases and the tear becomes
| less noticeable. Blur Busters explains:
|
| https://blurbusters.com/faq/benefits-of-frame-rate-above-
| ref...
| kaba0 wrote:
| As the frame rate increases, the latency decreases, making it
| a non-issue. I'd rather choose this option over blinking
| screens.
| mrob wrote:
| The minimum latency is bottlenecked by the monitor unless you
| allow tearing.
| AshamedCaptain wrote:
| Many technologies have been invented to allow displaying "two
| half images slightly faster", such as interlaced scanning...
|
| Most humans will actually prefer the "slightly faster" option.
| (Obviously if you can do both, then they'd prefer that; but
| given the trade-off...)
| andreyv wrote:
| First person shooters. Vertical synchronization causes a
| noticeable output delay.
|
| For example, with a 60 Hz display and vsync, game actions
| might be shown up to 16 ms later than without vsync, which is
| ages in FPS.
| badsectoracula wrote:
| Input latency. I find the forced vsync by compositors annoying
| even when doing simple stuff like moving or resizing windows -
| it gives a sluggish feel to my desktop. This is something i
| notice even on a high refresh rate monitor.
| beebeepka wrote:
| Personally - gaming. Never liked vsync.
| maccard wrote:
| > I've yet to see the benefit of trying to switch to Wayland
| on my desktop
|
| How about Graphics Offload?
| diath wrote:
| This feature would be nice to have but is not impactful enough
| (at least to me) to outweigh the cons of having to switch to
| Wayland, which would include migrating my DE and getting
| accustomed to it, as well as looking for replacements for
| those applications that do not work properly with Wayland
| (most notably ones that deal with global keyboard hooks).
| Admittedly I have never tried XWayland, which I think could
| potentially solve some of these issues.
| maccard wrote:
| I think if you're waiting for a magic bullet of a feature to
| upgrade, you might be waiting a long time, and even Wayland
| will have been replaced by that point. Instead, look at the
| combination of features (like this one) and think about it
| together with future upgrades. I think you're right that
| XWayland is probably a compromise for now if you need it for
| things like global shortcuts.
| mnd999 wrote:
| If it worked exactly the same there would indeed be no
| benefit. If you're happy with what you have then there's no
| reason to switch.
| aktuel wrote:
| I am sorry to tell you that X11 is completely unmaintained by
| now. So the chances of that happening are zero.
| NGRhodes wrote:
| FYI, 21.1.9 was released less than a month ago
| (https://lists.x.org/archives/xorg/2023-October/061515.html);
| they are still fixing bugs.
| dralley wrote:
| You mean, they're still fixing critical CVEs.
| badsectoracula wrote:
| Which are still bugs.
|
| Also, only two of the four changes mentioned in the mail are
| about CVEs.
| AshamedCaptain wrote:
| Frankly, it was X11 which introduced "graphics offload" in the
| first place, with stuff like XV, chroma keying, and hardware
| overlays. Then compositors came and we moved to
| texture_from_surface extensions and uploading things into
| GPUs.
| This is just the eternal wheel of reinventing things in
| computing (TM) doing yet another iteration, and unlikely to
| give any tangible benefits over the situation from decades
| ago.
| play_ac wrote:
| No, nothing like this exists in X11. Xorg still doesn't really
| have support for non-RGB surfaces. DRI3 gets you part of the
| way there for attaching GPU buffers, but the way surfaces work
| would have to be overhauled to work more like Wayland, where
| they can be any format supported by the GPU. There isn't any
| incentive to implement this in X11 either, because X11 is
| supposed to work over the network and none of this stuff
| would.
|
| Yes, you're technically right that this would have been
| possible years ago, but it wasn't actually ever done, because
| X11 never had the ability to do it at the same time as using
| compositing.
| AshamedCaptain wrote:
| > Xorg still doesn't really have support for non-RGB surfaces
|
| You really need to add context to these statements, because
| _right now_ I am using, through Xorg, a program which uses a
| frigging colormap, which is as non-RGB as it gets. The entire
| reason Xlib has this "WhitePixel" and XGetPixel and XYPixmap
| and other useless functions which normally draw a lot of ire
| is that it tries to go out of its way to support practically
| other-worldly color visuals and image formats. If anything,
| I'd say it is precisely RGB which has the most problems with
| X11, especially when you go beyond 24bpp.
|
| > there for attaching GPU buffers
|
| None of this is about the GPU, but about directly presenting
| images for _hardware_ composition using direct scan-out,
| hardware layers or not. Exactly what Xv is about, and the
| reason Xv supports formats like YUV.
|
| > There isn't any incentive to implement this in X11 either
| because X11 is supposed to work over the network and none of
| this stuff would
|
| As if that prevented any of the extensions added to X11 in the
| last three decades, including Xv.
| kaba0 wrote:
| There are plenty of wheel reinventions in IT, but let's not
| pretend that modern graphics are anything like they used to
| be. We have 8k@120Hz screens now; the number of pixels that
| have to be displayed in a short amount of time is staggering.
| AshamedCaptain wrote:
| At the same time, you also have hardware that can push those
| pixels without problem. When X and these technologies were
| introduced, the hardware was not able to store the entire
| framebuffer for one screenful in memory, let alone two.
| Nowadays you are able to store a handful at the very minimum.
| Certainly there's a different level of performance here in all
| parts, but the concepts have changed very little, and this
| entire article kind of shows it.
| audidude wrote:
| This would require protocol changes for X11 at best, and
| nobody is adding new protocols. Especially when nobody does
| Drawable-of-Drawables anymore and everyone uses client-side
| drawing with Xshm.
|
| You need to dynamically change the stacking of subsurfaces on
| a per-frame basis when doing the CRTC.
| AshamedCaptain wrote:
| I really don't see why it would need a new protocol. You can
| change the stacking of "subsurfaces" in the traditional X11
| fashion, and you can most definitely do "drawables of
| drawables". At the very least I'd bet most clients still
| create a separate window for video content.
|
| I agree though it would require a lot of changes to the
| server, and no one is in the mood (like, dynamically deciding
| whether I composite this window or push it to an Xv port or a
| hardware plane? practically inconceivable in the current
| graphics stack, albeit not a technical X limitation per se).
| This entire feature is also going to be pretty pointless in
| the Wayland desktop space either way, because no one is in the
| mood there either -- your dmabufs are going to end up in the
| GPU anyway for the foreseeable future, just because of the
| complexity of liftoff, the variability of GPUs, and the like.
| audidude wrote:
| > I really don't see why it would need a new protocol.
|
| You'll need an API to remap the Drawable to the scanout plane
| from a compositor on a per-frame basis (so when submitting the
| CRTC), and the compositor isn't in control of the CRTC. So...
| AshamedCaptain wrote:
| This assumes it would be the role of the window manager or
| compositor, rather than the server, to decide that, which
| isn't how I was thinking about it. But I guess it'd make sense
| (policy vs mechanism). "Per-frame basis" I don't see why; it
| just mirrors Wayland concepts. Still, as protocols go, it's
| quite a minor change, and one applications don't necessarily
| have to support.
| kelnos wrote:
| I would very much doubt it. This would likely require work on
| Xorg itself (a new protocol extension, maybe; I don't believe
| X11 supports anything but RGB [+A, with XRender] for windows,
| and you'd probably need YUV support for this to be useful),
| which no one seems to care to do. And the GTK developers seem
| to see their X11 windowing backend as legacy code that they
| want to remove as soon as they can do so without getting too
| many complaints.
| MarcusE1W wrote:
| Is this something where it would be helpful if the Linux
| (environment) developers worked together? Like the (graphics)
| kernel, GTK, KDE, Wayland, ... guys all in one room (or video
| conference) to discuss requirements and iron out one graphics
| architecture that is efficient and transparent?
|
| I think it's good that different graphics systems exist, but
| it feels unnecessary that every team has to make their own
| discoveries about how to handle the existing pieces.
|
| If work were coordinated at least at the requirements and
| architecture level, then I think a lot of synergies could be
| achieved. After that everyone can implement the architecture
| the way that works best for their use case, but some common
| elements could be relied on.
| rawoul wrote:
| They are:
|
| https://indico.freedesktop.org/event/4/
|
| https://emersion.fr/blog/2023/hdr-hackfest-wrap-up/
|
| https://gitlab.freedesktop.org/wayland/wayland-protocols/-/w...
|
| ...
| jdub wrote:
| They do, it's just not hugely visible. Two great conferences
| where some of that work happened were linux.conf.au and the
| Linux Plumbers Conference.
| dontlaugh wrote:
| That's exactly how Wayland came to be.
| BearOso wrote:
| That's exactly what happened. This is the original intent for
| subsurfaces. A bunch of Wayland developers got together and
| wrote the spec a long time ago. The only thing happening now
| is Gtk making use of them transparently in the toolkit.
|
| Subsurfaces didn't have bug-free implementations for a while,
| so maybe some people avoided them. But I know some of us
| emulator programmers have been using them for output
| (especially because they can update asynchronously from the
| parent surface), and I think a couple of media players do,
| too.
| It's not something that most applications really need.
| jiehong wrote:
| I'm not sure I understand why an overlay allows partial
| offloading while rounding the corners of the video does not.
|
| Couldn't the rounded corners of a video also be an overlay?
|
| I'm sure I'm missing something here, but the article does not
| explain that point.
| andyferris wrote:
| I think it's that the _window_ has rounded corners, and you
| don't want the content appearing outside the window.
| audidude wrote:
| No, you can already be sure it's the right size. This has to
| do with what it takes to occlude the rounded area from the
| final display.
| orra wrote:
| I'd love to know the answer to that. This is fantastic work,
| but it'd be a shame for it to be scunnered by rounded corners.
| phkahler wrote:
| Because the UX folks want what they want. I want my UI out of
| the way, including the corners of video and my CPU load.
| audidude wrote:
| Easily solved by black bars, just like people are used to on a
| TV. I assume most video players will do this when in windowed
| mode.
| orra wrote:
| That's a good observation. Plus, you'll get black bars anyway
| if you resize the window to not be the same aspect ratio as
| the video.
| play_ac wrote:
| > Couldn't the rounded corners of a video also be an overlay?
|
| No, because the clipping is done in the client after the
| content is drawn. The client doesn't have the full screen
| contents. To make it work with an overlay, the clipping would
| have to be moved to the server. There could be another
| extension that lets you pass an alpha mask texture to the
| server to use as a clip mask. But this doesn't exist (yet?)
| audidude wrote:
| If you have the video extend to where the corners are rounded,
| you must use a "rounded clip" on the video on top of the
| shadow region (since they abut).
|
| That means you have to power up the 3d part of the GPU to do
| that (because the renderer does it in shaders).
|
| Whereas if you add some 9 pixels of black above/below to
| account for the rounded corner, there is no clipping of the
| video and you can use hardware scanout planes.
|
| That's important because keeping the 3d part of the GPU turned
| off is a huge power savings. And the scanout plane can already
| scale the video to the correct size for you.
| unwind wrote:
| Very cool!
|
| I think I found a minor typo:
|
| _GTK 4.14 will introduce a GtkGraphicsOffload widget, whose
| only job it is to give a hint that GTK should try to offload
| the content of its child widget by attaching it to a
| subsurface instead of letting GSK process it like it usually
| does._
|
| I think that "GSK" near the end should just be "GTK". It's not
| a very near miss on standard qwerty, though ...
| pja wrote:
| GSK is the GTK Scene Graph Kit:
| https://en.wikipedia.org/wiki/GTK_Scene_Graph_Kit
| unwind wrote:
| Wow, thanks, TIL! Goes to show how far removed I am from GTK
| these days, I guess. :/
| caslon wrote:
| Not a typo: https://en.wikipedia.org/wiki/GTK_Scene_Graph_Kit
| ahartmetz wrote:
| It is strange that the article doesn't compare and contrast
| with full-screen direct scanout, which most X11 and presumably
| Wayland compositors implement, e.g. KDE's kwin-wayland since
| 2021:
| https://invent.kde.org/plasma/kwin/-/merge_requests/502
|
| Maybe that is because full-screen direct scanout doesn't take
| much (if anything) in a toolkit; it's almost purely a
| compositor feature.
| kaba0 wrote:
| Is there a significant difference?
| Hardware planes are basically that, just optionally not
| full-screen.
| smallstepforman wrote:
| BeOS and Haiku allowed exposure to kernel graphics buffers
| decades ago (https://www.haiku-os.org/legacy-
| docs/bebook/BDirectWindow.ht...), which bypass the roundtrip
| to the compositor and back. A companion article describing the
| design is here (https://www.haiku-os.org/legacy-
| docs/benewsletter/Issue3-12....)
|
| After 25 years, GTK joins the party ...
|
| With no native video drivers, moving a Vesa Haiku window with
| video playback still seems smoother than doing the same in
| Gnome/Kde under X11.
| pjmlp wrote:
| Like most OSes where the desktop APIs are part of the whole
| developer experience, and not yet another pluggable piece.
| tmountain wrote:
| BeOS was truly ahead of its time. It's a shame that it didn't
| get more traction.
| ris wrote:
| There must be a name for the logical fallacy I see whenever
| someone pines over a past "clearly superior" technology that
| wasn't adopted or had its project cancelled, but I can never
| grasp it. I guess the closest thing to it is an unfalsifiable
| statement.
|
| The problem comes from people remembering all of the positive
| traits (often just _promises_) of a technology but either
| forgetting all the problems it had, or the technology never
| being given enough of a chance for people to discover all the
| areas in which it is a bit crap.
|
| BeOS? Impressive technology demo. Missing _loads_ of features
| expected in a modern OS, even at that time. No multi-user
| model even!
|
| This is also a big phenomenon in e.g. aviation. The greatest
| fighter jet _ever_ is always the one that (tragically!) got
| cancelled before anyone could discover its weaknesses.
| darkwater wrote:
| This, plus nostalgia pink glasses, plus rooting for the
| underdog (who lost), plus my niche tech is better than your
| mainstream one.
| AshamedCaptain wrote:
| > There must be a name for the logical fallacy I see whenever
| someone pines over a past "clearly superior" technology that
| wasn't adopted or had its project cancelled, but I can never
| think of it. I guess the closest thing to it is an
| unfalsifiable statement.
|
| You can run Haiku today, so it's hardly unfalsifiable, nor an
| effect of nostalgia or whatever way you want to phrase it.
|
| > BeOS? Impressive technology demo. Missing loads of features
| expected in a modern OS, even at that time. No multi-user
| model even!
|
| "Even at that time" is just false. Multi-user safe OSes abound
| in the 90s?
| dihrbtk wrote:
| Windows NT...?
| ris wrote:
| > You can run Haiku today, so it's hardly unfalsifiable
|
| Excellent, so let's falsify it: how come 20 years later the
| best thing people really have to say about BeOS/Haiku is that
| it has smooth window dragging?
|
| > multi-user safe OSes abound in the 90s?
|
| Windows NT.
|
| Linux and at least two other free unixes.
|
| Countless proprietary unixes.
|
| VMS.
|
| The widespread desktop OSs in the 90s were not considered
| serious OSs even then, more accidents of backward-
| compatibility needs.
| tialaramex wrote:
| The thing about Haiku is that their plan (initially as
| "OpenBeOS") was: since they're just re-doing BeOS, and some of
| BeOS is Open Source, they'll just get some basics working,
| then they're right back on the horse, and in a few years
| they'll be far ahead of where BeOS was.
|
| _Over two decades later_ they don't have a 1.0 release.
| cmrdporcupine wrote:
| Exactly this.
| I'm as big a fan of "alternative tech timelines" as the next
| nerd, but I can also see in retrospect why we have the set of
| compromises we have today, and all along I watched the intense
| efforts people made to navigate the mindbogglingly complicated
| minefield of competing approaches and political players and
| technical innovations that were on the scene.
|
| People have been working damned hard to build things like Gtk,
| Qt, etc., not to mention Wayland, all the while maintaining
| compatibility, and I personally am happy for their efforts.
|
| BeOS/HaikuOS is a product of a mid-90s engineering scene that
| predates the proliferation of GPUs, the web, and the set of
| programming languages that we work with today. There's nothing
| wrong with it in that context, but it's also not "better."
| Just different compromises.
|
| The other one I see nostalgia nerds reach for is the Amiga. A
| system _highly_ coupled to a set of custom chips that only
| made sense when RAM was as fast as (or faster than) the CPU,
| whose OS had no memory protection, and which was easily
| outstripped in technical abilities by the early 90s by PCs
| with commodity ISA cards, etc., because of the development of
| economies of scale in computer manufacturing. It was ahead of
| its time for about 2-3 years in the mid 80s, but in a way that
| was a dead end.
|
| Anyways, what we have right now is a messy set of compromises.
| It doesn't hurt to go looking for simplifications, but it
| _does_ hurt to pretend that the compromises don't exist for a
| reason.
|
| EDIT: I would add, though, that "multiuser" as part of a
| desktop (or especially mobile) OS has maybe proven to be a
| pointless thing. The vast majority of Linux machines out there
| in the world are run in a single-user fashion, even if they
| are capable of multiuser. Android phones, Chromebooks, desktop
| machines, and even many servers -- mostly run with just one
| master user. And we've also seen how rather not-good the Unix
| account permissions model is in terms of security, and how
| weak its timesharing of resources etc. is in the context of
| today's needs -- hence the development of cgroups, containers,
| and virtual machines / hypervisors.
| BenjiWiebe wrote:
| They run with one master user perhaps, but they have multiple
| users at one time anyways.
| cmrdporcupine wrote:
| I mean, in those cases they're almost always just using
| multiple users as a proxy for job authorization levels, not
| people.
|
| Anybody who is serious about securing a single physical
| machine for multiuser access isn't doing it through multiple
| OS accounts, and is slicing it up with VMs instead.
|
| I _do_ have a home (Windows) computer that gets used by
| multiple family members through accounts, but I think this
| isn't a common real-world use case.
| pjmlp wrote:
| Unfortunately, being technologically superior isn't enough to
| win.
|
| Money, marketing, politics, wanting to be part of the crowd,
| usually play a bigger role.
| timetraveller26 wrote:
| I was greatly impressed with the BeOS filesystem's SQL-esque
| indexing and querying.
| fulafel wrote:
| You can bypass the compositor with X11 hw acceleration
| features too. But what about Wayland? I thought apps always go
| through the compositor there, as shown e.g. in the diagram at
| https://www.apertis.org/architecture/wayland_compositors/
|
| Drawing directly from app to kernel graphics buffers (or
| hw-backed surfaces) and participating in composition are
| orthogonal, I think.
| The compositor may be compositing the kernel- or hw-backed
| surfaces.
| arghwhat wrote:
| Bypassing _composition_ is a key feature in Wayland. The
| protocols are all written with zero-copy in mind. The X11
| tricks are already there, forwarding one client's buffers
| straight to hardware, while the end-game with libliftoff is
| offloading multiple subsurfaces directly to individual planes
| at once.
|
| The _compositor_ is the entire display server in Wayland, and
| is also the component responsible for bypassing composition
| when possible.
| fulafel wrote:
| In my meager understanding, which I'm happy to be corrected
| about: in a windowed scenario (vs fullscreen), in both the X
| direct-rendering and Wayland scenarios the application
| provides a (possibly GPU-backed) surface that the compositor
| uses as a texture when forming the full-screen video output.
|
| In a full-screen case, AFAIK it's possible to skip the
| compositing step with X11, and maybe with Wayland too.
|
| "Zero copy" seems a bit ambiguous a term in graphics, because
| there's the kind of copying where whole screen- or window-
| sized buffers are being copied about, and then there are
| compositing operations where various kinds of surfaces are
| inputs to the final rendering, where pixels are also copied
| around, possibly several times, in shaders, but there aren't
| necessarily extra texture-sized intermediate buffers involved.
| arghwhat wrote:
| > In a full-screen case AFAIK it's possible to skip the
| compositing step with X11, and maybe with Wayland too.
|
| This is the trivial optimization all Wayland compositors do.
|
| The neater trick is to do this for non-fullscreen content -
| and even just parts of windows - using overlay planes. Some
| compositors have their own logic for this, but libliftoff aims
| to generalize it.
|
| Zero-copy is not really ambiguous, but to clarify: Wayland
| protocols are designed to maximize the cases where a buffer
| rendered by a client can be presented directly by the display
| hardware as-is (scanned out), without any intermediate
| operations on the content.
|
| Note "maximize" - the content must be compatible with hardware
| capabilities. Wayland provides hints to stay within
| capabilities, but a client might pick a render buffer
| modifier/format that cannot be scanned out by the display
| hardware. GPUs have a _lot_ of random limitations.
| AshamedCaptain wrote:
| To this day, moving a Haiku window under a 5k unaccelerated
| EFI GOP framebuffer still feels _significantly_ faster than
| doing the same under Windows 11, and everything KWin's
| X/Wayland has to offer on the same hardware (AMD Navi 2 GPU).
|
| > BeOS and Haiku allowed exposure to kernel graphics buffers
| decades ago
|
| In any case, X also allowed this for ages, with XV and XShm
| and the like. Of course then everyone got rid of this in order
| to have the GPU in the middle for fancier animations and
| whatnot, and things have gone downhill since.
| baybal2 wrote:
| Android has horrific UI latency despite heavily employing
| hardware acceleration.
|
| Enlightenment's EFL was exclusively software for a long time,
| but is buttery smooth despite largely relying on full redraws
| most of the time.
|
| Hardware acceleration does not compensate for a lack of hard
| computer science knowledge about efficient memory operations,
| caching, and such.
| play_ac wrote:
| > In any case, X also allowed this for ages, with XV and XShm
| and the like.
|
| No, XShm doesn't do that, and the way XV does it is completely
| dependent on drivers.
| If you're using Glamor, then XV won't use overlays at all.
| XShm uses a buffer created in CPU memory, allocated by the
| client, that the X server then has to copy to the screen.
|
| > Of course then everyone got rid of this in order to have the
| GPU in the middle for fancier animations
|
| No, for video, the GPU is used in the middle so you can do
| post-processing without copying everything into main memory
| and stalling the whole pipeline. I'd like to see an actual
| benchmark of how a fullscreen 5k video with post-processing
| plays on Haiku without any hardware acceleration.
| AshamedCaptain wrote:
| > XShm uses a buffer created in CPU memory allocated by the
| client that the X server then has to copy to the screen.
|
| Fair enough. Even with XShmCreatePixmap, you are still never
| simply mmapping the card's actual entire framebuffer, unlike
| what BDirectWindow allows (if https://www.haiku-
| os.org/legacy-docs/benewsletter/Issue3-12.... is to be
| believed, which is closer to something like DGA). With XShm,
| the server still has to copy your shared memory segment to the
| actual framebuffer.
|
| (Sorry for the previous answer here, I misunderstood your
| comment.)
|
| > No, for video, the GPU is used in the middle so you can do
| post-processing without copying everything into main memory
| and stalling the whole pipeline.
|
| Depends on what you mean by "post-processing". You can do many
| types of card-accelerated zero-copy post-processing using XV:
| colorspace conversion, scaling, etc. At the time, scaling the
| video in software, or even just doing an extra memory copy per
| frame, would have tanked the frame rate -- Xine could be used
| to watch DVDs on Pentium II-level hardware. Obviously you
| cannot put the video on the faces of rotating 3D cubes, but
| this is precisely what I call "fancy animations".
| play_ac wrote:
| > colorspace conversion, scaling
|
| There's a lot more than that. Please consider installing the
| latest version of VLC or something like that and checking all
| the available post-processing effects and filters. These
| aren't "fancy animations" and they're not rotating 3D cubes;
| they're basic features that a video player is required to
| support now. If you want to support arbitrary filters then you
| need to use the GPU. All these players stopped using XV ages
| ago; on X11 you'll get the GPU rendering pipeline too because
| of this.
|
| I don't really see the point of making these condescending
| remarks, like trying to suggest that everyone is stupid and
| only interested in making wobbly windows and spinning cubes.
| Those have never been an actual feature of anything besides
| Compiz, which is a dead project.
| AshamedCaptain wrote:
| I don't see what you mean by "condescending remarks", but I do
| think it is a stretch to claim "arbitrary filters" are a
| "basic feature that a video player is required to support
| now". As a consumer, I have absolutely _never_ used any such
| video filters, doubt most consumers are even aware of them,
| have seen few video players which support them, and most
| definitely have no idea what they are in VLC. Do they even
| enable any video filters by default? The only video filter I
| have sometimes used is deinterlacing, which doesn't really fit
| well in the GPU rendering pipeline anyway but fits very nicely
| in fixed hardware. So yes, I hardly see the need to stop using
| native accelerated video output and fall back to the GPU just
| in case someone wants to use such filters.
| This is how I end up with a card which consumes 20W just for
| showing a static desktop on two monitors.
|
| Anyway, discussing this is beside the point, and forgive me
| the rant above.
|
| If you really need GPU video filters then the GPU is obviously
| going to be the best way to implement them; there's no
| discussion possible about that. But the entire point of TFA is
| to (dynamically) go back to a model where the GPU is _not_ in
| the middle. And that model -- sans GPU -- happens to match
| what Xv was doing, and is actually faster and less power-
| consuming than always blindly using the GPU, which is where we
| are now post-Xv.
| DonHopkins wrote:
| That's also how SunView worked on SunOS in 1982, and how Sun's
| later GX graphics-accelerated framebuffer driver worked in the
| 90's. The kernel managed the clipping list, and multiple
| processes shared the same memory, locking and respecting the
| clipping list and pixel buffers in shared memory (main memory,
| not GPU memory!), so multiple processes could draw on
| different parts of the screen efficiently, without incurring
| system calls and context switches.
|
| https://en.wikipedia.org/wiki/SunView
|
| Programmers Reference Manual for the Sun Window System, rev C
| of 1 November 1983: Page 23, Locking and Clipping:
|
| http://bitsavers.trailing-edge.com/pdf/sun/sunos/1.0/800-109...
|
| But GPUs change the picture entirely. From what I understand
| by reading the article, GTK uses GL to render in the GPU then
| copies the pixels into main memory for the compositor to mix
| with other windows. But in modern GPU-first systems, the
| compositor is running in the GPU, so there would be no reason
| to ping-pong the pixels back and forth between CPU and GPU
| memory after drawing 3D or even 2D graphics with the GPU, even
| when having different processes draw and render the same
| pixels.
|
| So I'm afraid Wayland still has a lot of catching up to do, if
| it still uses a software compositor, and has to copy pixels
| back from the GPU that it drew with OpenGL. (Which is what I
| interpret the article as saying.)
|
| More recently (on an archeological time scale, but for many
| years by now), MacOS, Windows, iOS, and Android have all
| developed ways of sharing graphics between multiple processes
| not only in shared main CPU memory, but also on the GPU, which
| greatly accelerates rendering, and is commonly used by web
| browsers, real-time video playing and processing tools,
| desktop window managers, and user interface toolkits.
|
| There are various APIs to pass handles to "External" or "IO
| Surface" shared GPU texture memory around between multiple
| processes. I've written about those APIs on Hacker News
| frequently over the years:
|
| https://news.ycombinator.com/item?id=13534298
|
| DonHopkins on Jan 31, 2017 | parent | context | favorite | on:
| Open-sourcing Chrome on iOS
|
| It's my understanding that only embedded WKWebViews are
| allowed to enable the JIT compiler, but not UIWebViews (or
| in-process JavaScriptCore engines). WKWebView is an
| out-of-process web browser that uses IOSurface [1] to project
| the image into your embedding application and IPC to send
| messages.
|
| So WKWebView's dynamically generated code is running safely
| firewalled in a separate address space controlled by Apple and
| not accessible to your app, while older UIWebViews run in the
| address space of your application, and aren't allowed to write
| to code pages, so their JIT compiler is disabled.
| | Since it's running in another process, WkWebView's | JavaScriptEngine lacks the ability to expose your own Objective | C classes to JavaScript so they can be called directly [2], but | it does include a less efficient way of adding script message | handlers that call back to Objective C code via IPC [3]. | | [1] https://developer.apple.com/reference/iosurface | | [2] | https://developer.apple.com/reference/javascriptcore/jsexpor... | | [3] | https://developer.apple.com/reference/webkit/wkusercontentco... | | https://news.ycombinator.com/item?id=18763463 | | DonHopkins on Dec 26, 2018 | parent | context | favorite | on: | WKWebView, an Electron alternative on macOS/iOS | | Yes, it's a mixed bag with some things better and others worse. | But having a great JavaScript engine with the JIT enabled is | pretty important for many applications. But breaking up the | browser into different processes and communicating via messages | and sharing textures in GPU memory between processes | (IOSurface, GL_TEXTURE_EXTERNAL_OES, etc) is the inextricable | direction of progress, what all the browsers are doing now, and | why for example Firefox had to make so many old single-process | XP-COM xulrunner plug-ins obsolete. | | IOSurface: | | https://developer.apple.com/documentation/iosurface?language... | | https://shapeof.com/archives/2017/12/moving_to_metal_episode... | | GL_TEXTURE_EXTERNAL_OES: | | https://developer.android.com/reference/android/graphics/Sur... | | http://www.felixjones.co.uk/neo%20website/Android_View/ | | pcwalton on Dec 27, 2018 | prev [-] | | Chrome and Firefox with WebRender are going the opposite | direction and just putting all their rendering in the chrome | process/"GPU process" to begin with. | | DonHopkins on Dec 27, 2018 | parent [-] | | Yes I know, that's exactly what I meant by "breaking up the | browser into different processes". They used to all be in the | same process. Now they're in different processes, and | communicate via messages and shared GPU memory using platform | specific APIs like IOSurface. So it's no longer possible to | write an XP/COM plugin for the browser in C++, and call it from | the renderer, because it's running in a different process, so | you have to send messages and use shared memory instead. But | then if the renderer crashes, the entire browser doesn't crash. | | https://news.ycombinator.com/item?id=20313751 | | DonHopkins on June 29, 2019 | parent | context | favorite | on: | Red Hat Expecting X.org to "Go into Hard Maintenan... | | Actually, Electron (and most other web browsers) on the Mac | OS/X and iOS use IOSurface to share zero-copy textures in GPU | memory between the render and browser processes. Android and | Windows (I presume, but don't know name of the API, probably | part of DirectX) have similar techniques. It's like shared | memory, but for texture memory in the GPU between separate | heavy weight processes. Since simply sharing main memory | between processes wouldn't be nearly as efficient, requiring | frequent uploading and downloading textures to and from the | GPU. | | Mac OS/X and iOS IOSurface: | | https://developer.apple.com/documentation/iosurface?language... | | http://neugierig.org/software/chromium/notes/2010/08/mac-acc... | | https://github.com/SimHacker/UnityJS/blob/master/notes/IOSur... | | Android SurfaceTexture and GL_TEXTURE_EXTERNAL_OES: | | https://developer.android.com/reference/android/graphics/Sur... | | https://www.khronos.org/registry/OpenGL/extensions/OES/OES_E... 
| | https://docs.google.com/document/d/1J0fkaGS9Gseczw3wJNXvo_r-... | | https://github.com/SimHacker/UnityJS/blob/master/notes/ZeroC... | | https://github.com/SimHacker/UnityJS/blob/master/notes/Surfa... | | https://news.ycombinator.com/item?id=25997356 | | DonHopkins on Feb 2, 2021 | parent | context | favorite | on: | VideoLAN is 20 years old today | | >Probably do a multi-process media player, like Chrome is | doing, with parsers and demuxers in a different process, and | different ones for decoders and renderers. Knowing that you | probably need to IPC several Gb/s between them. Chrome and | other browsers and apps, and drivers like virtual webcams, and | libraries like Syphon, can all pass "zero-copy" image buffers | around between different processes by sharing buffers in GPU | memory (or main memory too of course) and sending IPC messages | pointing to the shared buffers. | | That's how the browser's web renderer processes efficiently | share the rendered images with the web browser user interface | process, for example. And how virtual webcam drivers can work | so efficiently, too. | | Check out iOS/macOS's "IOSurface": | | https://developer.apple.com/documentation/iosurface | | >IOSurface Share hardware-accelerated buffer data (framebuffers | and textures) across multiple processes. Manage image memory | more efficiently. | | >Overview: The IOSurface framework provides a framebuffer | object suitable for sharing across process boundaries. It is | commonly used to allow applications to move complex image | decompression and draw logic into a separate process to enhance | security. | | And Android's "SurfaceTexture" and GL_TEXTURE_EXTERNAL_OES: | | https://developer.android.com/reference/android/graphics/Sur... | | >The image stream may come from either camera preview or video | decode. A Surface created from a SurfaceTexture can be used as | an output destination for the android.hardware.camera2, | MediaCodec, MediaPlayer, and Allocation APIs. When | updateTexImage() is called, the contents of the texture object | specified when the SurfaceTexture was created are updated to | contain the most recent image from the image stream. This may | cause some frames of the stream to be skipped. | | https://source.android.com/devices/graphics/arch-st | | >The main benefit of external textures is their ability to | render directly from BufferQueue data. SurfaceTexture instances | set the consumer usage flags to GRALLOC_USAGE_HW_TEXTURE when | it creates BufferQueue instances for external textures to | ensure that the data in the buffer is recognizable by GLES. | | And Syphon, which has a rich ecosystem of apps and tools and | libraries: | | http://syphon.v002.info | | >Syphon is an open source Mac OS X technology that allows | applications to share frames - full frame rate video or stills | - with one another in realtime. Now you can leverage the | expressive power of a plethora of tools to mix, mash, edit, | sample, texture-map, synthesize, and present your imagery using | the best tool for each part of the job. Syphon gives you | flexibility to break out of single-app solutions and mix | creative applications to suit your needs. | | Of course there's a VLC Syphon server: | | https://github.com/rsodre/VLCSyphon | mananaysiempre wrote: | > From what I understand by reading the article, GTK uses GL | to render in the GPU then copies the pixels into main memory | for the compositor to mix with other windows. | | This seems very strange to me. 
 | mananaysiempre wrote:
 | > From what I understand by reading the article, GTK uses GL to render in the GPU then copies the pixels into main memory for the compositor to mix with other windows.
 |
 | This seems very strange to me. It's how things would work with wl_shm, which is the baseline pixel-pushing interface in Wayland, but AFAIU Gtk uses EGL / Mesa, which in turn uses Linux dmabufs, which is how you do hardware-accelerated rendering / DRI on Linux today in general.
 |
 | However, _how_ precisely Linux dmabufs work in a DRI context is not clear to me, because the documentation is lacking, to say the least. It seems that you can ask to map dmabufs into memory, and you can create EGLSurfaces from them, but are they always mapped into CPU memory (if only kernel-side), or can they be bare GPU memory handles until the user asks to map them?
 |
 | I'd hope for the latter, and if so, the only thing the work discussed in the article avoids is extra _GPU_-side blits (video decoding buffer to window buffer to screen), which is non-negligible but not necessarily the end of the world.
 | rjsw wrote:
 | Linux on ARM SoCs with HW video decoders that are separate from the GPU can use the V4L2 API to avoid some copying. The decoder writes a frame to a buffer that the GPU can see; then you use GL to get the GPU to merge it into the framebuffer.
 | DonHopkins wrote:
 | I misunderstood the article saying "exports the resulting texture" as meaning it exported it to CPU memory, but audidude explained how it actually works.
 |
 | I believe GL_TEXTURE_EXTERNAL_OES is an Android-only OpenGL extension that takes the place of some uses of DMABUF but is not as flexible and general.
 |
 | ChatGPT seems to know more about them, but I can't guarantee how accurate and up-to-date it is:
 |
 | https://chat.openai.com/share/abff036b-3020-4093-a13b-86cbf0...
 |
 | The tricky bit may be teaching pytorch to accept dmabuf handles and read and write dmabuf GPU buffers. (And ffmpeg too!)
 | audidude wrote:
 | I can't respond to everything incorrect in this, because it's way too long to read. But from the very start...
 |
 | Also, I wrote a significant part of GTK's current OpenGL renderer.
 |
 | > But GPUs change the picture entirely. From what I understand by reading the article, GTK uses GL to render in the GPU then copies the pixels into main memory for the compositor to mix with other windows.
 |
 | This is absolutely and completely incorrect. Once we get things into GL, the texture is backed by a DMABUF on Linux. You never read it back into main memory. That would be very, very, very stupid.
 |
 | > But in modern GPU-first systems, the compositor is running in the GPU, so there would be no reason to ping-pong the pixels back and forth between CPU and GPU memory after drawing 3D or even 2D graphics with the GPU, even when having different processes draw and render the same pixels.
 |
 | Yes, the compositor is running in the GPU too. So of course we just tell the compositor what the GL texture id is: it composites it if it must, or, if it can map that texture (again, because it's really a DMABUF) as a toplevel plane, it scans it out directly _without_ using 3D capabilities at all.
 |
 | That doesn't mean unaccelerated. It means it doesn't power up the 3D part of the GPU. It's the fastest way in/out with the least power. You can avoid "compositing" from a compositor too, when things are done right.
 |
 | > So I'm afraid Wayland still has a lot of catching up to do, if it still uses a software compositor, and has to copy pixels back from the GPU that it drew with OpenGL. (Which is what I interpret the article as saying.)
 |
 | Again, completely wrong.
 |
 | > Check out iOS/macOS's "IOSurface":
 |
 | Fun fact: I wrote the macOS backend for GTK too. And yes, it uses IOSurface much the way DMABUF is used on Linux.
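
[To make "the texture is backed by a DMABUF" concrete: with Mesa, a dmabuf file descriptor can be wrapped in an EGLImage and bound as GL texture storage with zero copies, via the standard EGL_EXT_image_dma_buf_import and OES_EGL_image extensions. A rough sketch, with error handling omitted and the fd, size, format, and stride assumed to come from whatever produced the buffer (a video decoder, a webcam, another process). Incidentally, GL_TEXTURE_EXTERNAL_OES is a Khronos OpenGL ES extension that Mesa supports too, not an Android-only mechanism:]

    /* Import a dmabuf fd as a GL texture via EGL (Mesa). */
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>
    #include <drm_fourcc.h> /* from libdrm, for DRM_FORMAT_ARGB8888 */

    GLuint import_dmabuf(EGLDisplay dpy, int fd, int width, int height,
                         int stride)
    {
        EGLAttrib attrs[] = {
            EGL_WIDTH, width,
            EGL_HEIGHT, height,
            EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_ARGB8888,
            EGL_DMA_BUF_PLANE0_FD_EXT, fd,
            EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
            EGL_DMA_BUF_PLANE0_PITCH_EXT, stride,
            EGL_NONE
        };
        /* The EGLImage is a handle to GPU memory; nothing is read back. */
        EGLImage img = eglCreateImage(dpy, EGL_NO_CONTEXT,
                                      EGL_LINUX_DMA_BUF_EXT, NULL, attrs);

        PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture =
            (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
                eglGetProcAddress("glEGLImageTargetTexture2DOES");

        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
        /* Use the imported buffer as this texture's storage. */
        image_target_texture(GL_TEXTURE_EXTERNAL_OES, (GLeglImageOES)img);
        return tex;
    }

[This also suggests the answer to the "bare GPU memory handles" question above: the fd stays a handle the whole way through, and only an explicit map of the dmabuf would bring pixels anywhere near the CPU.]
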
 | kaba0 wrote:
 | I'm not the parent poster; I'm just trying to grab this opportunity, since I "met" someone so familiar with GTK :)
 |
 | Could you please share your opinion on the toolkit, and its relation to others? Also, I heard that there was quite a lot of tech debt in GTK 3, and that part of the reason GTK 4 came as a bigger update was to fix it. What would you say, was it successful? Or are there still legacy decisions that harm the project somewhat?
 | audidude wrote:
 | > Also, I heard that there was quite a lot of tech debt in GTK 3, and that part of the reason GTK 4 came as a bigger update was to fix it
 |
 | GTK 3 itself was trying to lose the tech debt of 2.x (which in turn was shedding that of 1.x). But they were all still wrapping a fundamentally crap API, X11, for doing graphics in this century.
 |
 | GTK 4 changed that, and it now wraps a Wayland model of API. That drastically simplified GDK, which is why I could write a macOS backend in a couple of weeks.
 |
 | It also completely changed how we draw. We no longer draw in an immediate-mode style (in the form of Cairo) and instead build a retained set of draw commands. That allows for lots of new things you just couldn't do before with the old drawing model. It will also allow us to do a lot more fun things in the future (like threaded/tiled renderers).
 |
 | The APIs all over the place were simplified and focused. I can't imagine writing an application the size of GNOME Builder again with anything less than GTK 4.
 |
 | Hell, last GNOME cycle I rewrote Sysprof from scratch in a couple of months, and it's become my co-pilot every day.
 | kaba0 wrote:
 | Thanks for the comment and for your work!
 | DonHopkins wrote:
 | Thank you for the correction, it's a relief! That's nice work.
 |
 | I'm sorry, I misinterpreted the paragraph in the article saying "exports" as meaning that it exports the pixels from GPU memory to CPU memory, rather than just passing a reference, like GL_TEXTURE_EXTERNAL_OES and IOSurface do.
 |
 | > GTK has already been using dmabufs since 4.0: When composing a frame, GTK translates all the render nodes (typically several for each widget) into GL commands, sends those to the GPU, and mesa then exports the resulting texture as a dmabuf and attaches it to our Wayland surface.
 |
 | Perhaps I'd have been less confused if it said "passes a reference handle to the resulting texture in GPU memory" instead of "exports the resulting texture", because "exports" sounds expensive to me.
 |
 | Out of curiosity about the big picture: are dmabufs a Linux thing that's independent of OpenGL, or independent of the device driver, or built on top of GL_TEXTURE_EXTERNAL_OES? Or is GL_TEXTURE_EXTERNAL_OES/SurfaceTexture just an Android or OpenGL ES thing that's an alternative to dmabufs on Linux? Do they work without any dependencies on X or Wayland or OpenGL, I hope? (Since pytorch doesn't use OpenGL.)
 |
 | https://source.android.com/docs/core/graphics/arch-st
 |
 | One practical non-GUI use case I have for passing references to GPU textures between processes on Linux is pytorch. I'd like to decompress video in one process or docker container on a cloud instance with an NVidia accelerator, and then pass zero-copy references to the resulting frames into another process (or even two: each frame of video needs to be run through two different vision models) in another docker container running pytorch, sharing and multitasking the same GPU. The handles might go through a shared local file system or IPC (like how IOSurface uses Mach messages to magically send handles, or using unix domain sockets or ZeroMQ or something like that), but I don't know if that's supported at the Linux operating system level (ubuntu), or if I'd have to drop down to the NVidia driver level to do it.
 |
 | NVidia has some nice GPU video decompressor libraries, but they don't necessarily play well with pytorch in the same process, so I'd like to run them (or possibly ffmpeg) in a different process, but on the same GPU. Is it even possible, or am I barking up the wrong tree?
 |
 | It would be ideal if ffmpeg had a built-in "headless" way to perform accelerated video decompression and push out GPU texture handles to other processes somehow, instead of rendering itself, or writing pixels to files, or touching CPU memory.
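
[On the unix domain socket idea: a dmabuf is just a file descriptor, so it can be handed to another process with the standard SCM_RIGHTS ancillary-data mechanism, with no X, Wayland, or OpenGL involved; this is also how Wayland clients hand buffers to compositors. A sketch of the sending side (the receiver mirrors it with recvmsg()):]

    /* Pass a dmabuf fd to another process over a Unix domain socket.
     * SCM_RIGHTS makes the kernel install a duplicate of the fd in
     * the receiving process, which can then import or map it. */
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <string.h>

    int send_fd(int sock, int dmabuf_fd)
    {
        char byte = 0; /* must send at least one byte of real data */
        struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
        char ctrl[CMSG_SPACE(sizeof(int))];
        memset(ctrl, 0, sizeof(ctrl));

        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &dmabuf_fd, sizeof(int));

        return sendmsg(sock, &msg, 0);
    }

[Whether NVidia's proprietary stack exports its decoder output as dmabufs is a separate question (CUDA has its own cross-process handles, e.g. cudaIpcGetMemHandle); the fd-passing part, at least, is plain POSIX.]
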
 | audidude wrote:
 | > Out of curiosity about the big picture, are dmabufs a Linux thing that's independent of OpenGL, or independent of the device driver,
 |
 | They are independent of the graphics subsystem altogether (although that is where they got their start, afaik). Your webcam also uses DMABUF. So if you want to display your webcam from a GTK 4 application, this GtkGraphicsOffload will help you take that DMABUF from your camera (which may not be mappable into CPU memory, but can be DMA-passed to your GPU) and display it in a GTK application. It could either be composited on the GPU, or mapped directly to scanout if the right conditions are met.
 |
 | I wrote a library recently (libmks) and found the culprits in Qemu/VirGL/virtio_gpu that were preventing passing a DMABUF from inside a guest VM to the host. That stuff is all fixed now, so theoretically you could even have a webcam in a VM feeding a GTK 4 application that renders with VirGL, with the compositor submitting the scene to the host OS, which itself can set the planes correctly to get the same performance as if it were running on the host OS.
 |
 | > I'd like to be able to decompress video in one process or docker container on a cloud instance with an NVidia accelerator, and then pass zero-copy references
 |
 | If you want this stuff with NVidia, and you're a customer, I highly suggest you tell your NVidia representative so. Getting them to use DMABUF in a fashion that can be used from other subsystems would be fantastic.
 |
 | But at its core, if you were using Mesa and open drivers for some particular piece of hardware, yes, it's capable of working given the right conditions.
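
[For a sense of how small the opt-in surface is: wiring a widget subtree up for offload, as the article describes, is a few lines. A sketch against the GtkGraphicsOffload API the article introduces; get_video_paintable() is a hypothetical helper standing in for whatever dmabuf-backed GdkPaintable you have (video frames, a camera feed):]

    /* Wrap a picture in a GtkGraphicsOffload container so GTK may
     * place its dmabuf-backed content on a Wayland subsurface and,
     * under the right conditions, straight onto a hardware plane.
     * If the content isn't offloadable, GTK falls back to normal
     * GL rendering, so this is safe to use unconditionally. */
    #include <gtk/gtk.h>

    extern GdkPaintable *get_video_paintable(void); /* hypothetical */

    static void activate(GtkApplication *app, gpointer user_data)
    {
        (void)user_data;
        GtkWidget *window = gtk_application_window_new(app);

        GtkWidget *picture =
            gtk_picture_new_for_paintable(get_video_paintable());
        GtkWidget *offload = gtk_graphics_offload_new(picture);
        gtk_graphics_offload_set_enabled(GTK_GRAPHICS_OFFLOAD(offload),
                                         GTK_GRAPHICS_OFFLOAD_ENABLED);

        gtk_window_set_child(GTK_WINDOW(window), offload);
        gtk_window_present(GTK_WINDOW(window));
    }
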
 | play_ac wrote:
 | No, that isn't exposing kernel graphics buffers. That's allowing clients to draw to the main framebuffer with no acceleration at all. If you're memory-mapping pixels into user space and drawing with the CPU, then you're necessarily leaving kernel space. Around the same time, X11 had an extension called DGA that did the same thing. It was removed because it doesn't work correctly when you have hardware acceleration.
 |
 | So the optimization only makes sense for a machine like yours with no native drivers. With any kind of GPU acceleration it will actually make things much slower. GTK doesn't do this because it would only be useful for that kind of machine running around 25 years ago.
 | DonHopkins wrote:
 | Shared kernel graphics buffers in main memory or memory-mapped framebuffer device memory are one thing (common in the early 80's, e.g. SunView using /dev/fb in 1982). But does it expose modern shared GPU texture buffers to multiple processes? Those are a whole other ball game, and orders of magnitude more efficient, because they don't require ping-ponging pixels back and forth between the CPU and GPU when drawing and compositing, or when crossing process boundaries.
 | AshamedCaptain wrote:
 | > With any kind of GPU acceleration it will actually make things much slower. GTK doesn't do this because it would only be useful for that kind of machine running around 25 years ago.
 |
 | Precisely one of the points of TFA is to be able to use that "25 year old" hardware overlay support whenever possible (instead of the GPU) in order to save power, like Android (and classic Xv) does.
 | ris wrote:
 | > After 25 years, GTK joins the party ...
 |
 | I mean... shall we start a list of ways in which BeOS/Haiku have yet to "join the party" that the linux desktop has managed?
 |
 | Juggling the various needs of one hell of a lot more users, across a lot more platforms, with a lot more API-consuming apps to keep working, on systems which are designed with a lot more component independence, is a much harder problem to solve.
 | CyberDildonics wrote:
 | Are you mixing up number of users with technical sophistication?
 | pengaru wrote:
 | Wasn't this one of the main security issues with the BeOS architecture?
 |
 | It's not the same thing as what's being done here via Wayland. BeOS was more YOLO-style direct access to the framebuffer contents, without solving any of the hard problems (I don't think it was really possible to do properly with the hardware available at the time).
 | thriftwy wrote:
 | I can't imagine anybody passing video frames one by one through a system call as an array of pixels.
 |
 | I believe neither Xv nor GL-based renderers do that, even before we discuss hw accel.
 | chlorion wrote:
 | Emulators for older systems very often do this!
 |
 | Older consoles like the NES had a picture processing unit that generated the image. You need to emulate its state and its interaction with the rest of the system, possibly cycle by cycle, which makes it not possible to do on the GPU as a shader or whatever.
 |
 | This is kind of a niche use case for sure, but it's interesting.
 | thriftwy wrote:
 | They usually use libSDL instead of GTK, though.
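
[For the curious, that emulator-style "push an array of pixels every frame" pattern looks roughly like this with SDL2; SDL streams the CPU-rendered frame into a GPU texture rather than making a per-frame "draw these pixels" system call:]

    /* CPU-rendered frames presented via SDL2, the way many emulators
     * display the output of a software-emulated PPU. */
    #include <SDL2/SDL.h>
    #include <stdint.h>

    #define W 256
    #define H 240 /* NES-ish resolution */

    static uint32_t pixels[W * H]; /* filled in by the emulation core */

    int main(void)
    {
        SDL_Init(SDL_INIT_VIDEO);
        SDL_Window *win = SDL_CreateWindow("ppu", SDL_WINDOWPOS_CENTERED,
                                           SDL_WINDOWPOS_CENTERED,
                                           W * 3, H * 3, 0);
        SDL_Renderer *ren = SDL_CreateRenderer(win, -1,
                                               SDL_RENDERER_ACCELERATED);
        /* STREAMING: a texture meant to be rewritten from the CPU
         * every frame. */
        SDL_Texture *tex = SDL_CreateTexture(ren, SDL_PIXELFORMAT_ARGB8888,
                                             SDL_TEXTUREACCESS_STREAMING,
                                             W, H);
        for (int frame = 0; frame < 600; frame++) {
            /* ...emulate one frame of PPU output into pixels[]... */
            SDL_UpdateTexture(tex, NULL, pixels, W * sizeof(uint32_t));
            SDL_RenderClear(ren);
            SDL_RenderCopy(ren, tex, NULL, NULL);
            SDL_RenderPresent(ren); /* upload and scaling happen on the GPU */
        }
        SDL_Quit();
        return 0;
    }
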
 | donatj wrote:
 | Is this the same sort of thing Windows 95-XP had before Vista added DWM?
 |
 | Back when videos wouldn't show up in screenshots, and their position on screen could sometimes get out of sync with the window they were playing in?
 |
 | I never fully understood what was happening, but my theory at the time was that the video was being sent to the video card separately from the rendered windows.
 | superkuh wrote:
 | By going wayland-only, Gtk is becoming less of a GUI toolkit and more of just an extremely specific and arbitrary lib for the GNOME desktop environment.
 | mixedCase wrote:
 | Gtk is not going wayland only.
 | superkuh wrote:
 | https://www.phoronix.com/news/GTK5-Might-Drop-X11 "Red Hat's Matthias Clasen opened an issue on GTK entitled "Consider dropping the X11 backend""
 |
 | https://gitlab.gnome.org/GNOME/gtk/-/issues/5004
 | walteweiss wrote:
 | Everyone is going Wayland, not just GNOME; X is obsolete.
 | superkuh wrote:
 | And the handful of feature-incompatible waylands are feature-incomplete. You still can't keyboard/mouse share under any of them. That's just one of innumerable things you can't do.
 | walteweiss wrote:
 | I assume that will happen over time, won't it?
 | freedomben wrote:
 | > _By going wayland only_
 |
 | You mean this one very small piece? That seems a bit hyperbolic.
 | amelius wrote:
 | This sounds so overly complicated, considering that you can do all this in HTML without much effort.
 | freedomben wrote:
 | Then I wonder: why don't they just implement GTK with HTML?
 |
 | Probably because HTML is at the very _top_ of the stack, while this is much lower... Without everything below it on the stack, HTML is just a text file.
 | amelius wrote:
 | You're reading into it too much.
 |
 | All I meant was: if the API of HTML is simple, then why does GTK's API have to be that complicated?
 | kaba0 wrote:
 | This is literally an implementation detail; the API part is a single flag you can set if you want your subsurface to potentially make use of this feature.
 | bogwog wrote:
 | Semi-related question: are there any real benefits to having the compositor deal with compositing your buffers, as opposed to doing it yourself? Especially if you're already using hardware acceleration, passing your buffers to the system compositor seems like it could potentially introduce some latency.
 |
 | I guess it would allow the user/system to customize/tweak the compositing process somehow, but for what purpose?
 | kaba0 wrote:
 | Any kind of post-processing effect, like transparency or zoom; things like window previews and overview screens (these are sometimes possible without it as well); and tear-free rendering.
 | neurostimulant wrote:
 | Rounded corners seem like a feature that has an unexpectedly high performance penalty, but the UI designers refuse to let it go.
 | bee_rider wrote:
 | Is it possible that they are just the well-known representative example? I vaguely suspect that is the case, but I can't think of the broader class they are an example of, haha.
 |
 | The play button they show seems to be a good one, though. It is really nice to have it overlaid on the video.
 | DonHopkins wrote:
 | Fortunately the Play button disappears when the video starts playing, so it has no effect on the frame rate!
 |
 | Or instead of a triangular Play button, you could draw a big funny nose in some position and orientation, and the game would be to pause the video on a frame with somebody's face in it, with the nose in just the right spot.
 |
 | I don't know why the vlc project is ignoring my prs.
 | solarkraft wrote:
 | It's something I as a user would also refuse to let go, given that the performance penalty is reasonably small (I think it is).
 | torginus wrote:
 | I think the point is that it's not: rather than just copying a rectangular area to the screen, you have to go through the intermediate step of rendering everything to a temporary buffer, then compositing the result via a shader.
 | chris_wot wrote:
 | But... the example given shows that they place the video frame behind the window and make the front window transparent except for the round play button.
 | This apparently offloads the frame... so why not just do the same for rounded corners?
 |
 | What am I missing?
 | DonHopkins wrote:
 | It's not like crazy, out-of-control, avant-garde, different-thinking UI designers haven't totally ruined the user interface of a simple video player before!
 |
 | Interface Hall of Shame - QuickTime 4.0 Player (1999):
 |
 | http://hallofshame.gp.co.at/qtime.htm
 | bsder wrote:
 | Professional designers mostly cut their teeth on physical objects, and physical objects almost _never_ have sharp corners.
 |
 | This then got driven into the ground with the "Fisher-Price GUI" that is the norm on mobile, because you can't do anything with precision when you don't have a mouse.
 |
 | I would actually really like to see a UI with just rectangles. Really. It's okay, designers. Take a deep breath and say: "GUIs aren't bound by the physical". BeOS and MacOS used to be very rectangular. Give us a nice nostalgia wave of fad design with rectangles, please.
 |
 | Animations and drop shadows are another thing I'd like to see disappear.
 | twoodfin wrote:
 | Rounded corners for windows have been in the Macintosh operating system since the beginning.
 |
 | https://www.folklore.org/StoryView.py?story=Round_Rects_Are_...
 | sylware wrote:
 | The Steam Deck's wayland compositor is built on DRM (dmabufs) / Vulkan.
 |
 | (Sad that it is written in C++.)
 |
 | But what surprised me even more: I was expecting GTK, and moreover GTK 4, to be on par with Valve's software.
 |
 | On Linux, I would not even think of coding a modern, hardware-accelerated GFX component that is not DRM (dmabuf) / Vulkan.
 | charcircuit wrote:
 | This blog post is not talking about Mutter, GNOME's compositor. GTK's hardware acceleration had already been using dmabufs before this graphics offload feature was added.
 | sylware wrote:
 | But as the article states, with GL, not Vulkan.
 |
 | Unless the article is obsolete itself?
 | charcircuit wrote:
 | Sure, but OpenGL itself is still useful and used in modern software.
 | sylware wrote:
 | It is legacy and has started to be retired.
 |
 | Not to mention that GL is a massive, gigantic kludge/bloat compared to Vulkan (mostly due to the GLSL compiler). So it is good to let it go.
 | tristan957 wrote:
 | If you have spare time, the GTK maintainers want people to work on the Vulkan renderer. Benjamin Otte and Georges Stavracas Neto have put in a bit of effort to make the Vulkan renderer better.
 |
 | GL is only deprecated on Mac, from what I understand.
 | charcircuit wrote:
 | I don't think Valve's window toolkit ever supported Vulkan. Steam no longer uses OpenGL because they replaced their window toolkit with Chrome.
 |
 | > It is legacy and has started to be retired.
 |
 | The standard itself, but the implementations are still being maintained and new extensions are being added. It is still a solid base to build upon.
 | ori_b wrote:
 | The thing that's always felt slow to me in GTK was resizing windows, not getting pixels to the screen. I'm wondering if adding all these composited surfaces adds a cost when resizing the windows and their associated out-of-process surfaces.
 | rollcat wrote:
 | More likely it removes costs. This is very specifically an optimization.
 ___________________________________________________________________
 (page generated 2023-11-18 23:00 UTC)