[HN Gopher] Raspberry Pi 4 achieves Vulkan 1.1 conformance, gets...
       ___________________________________________________________________
        
       Raspberry Pi 4 achieves Vulkan 1.1 conformance, gets GPU
       performance boost
        
       Author : rcarmo
       Score  : 280 points
       Date   : 2021-10-30 15:03 UTC (7 hours ago)
        
 (HTM) web link (www.cnx-software.com)
        
       | rixrax wrote:
       | What's the test software / benchmark I should use on Linux
       | nowadays to measure (and compare) shader and raw GPU performance?
       | That would ideally run under both X and Wayland?
        
         | arminiusreturns wrote:
         | I have always tended towards the Phoronix Test Suite
         | (https://www.phoronix-test-suite.com/), but I'm sure there are
         | a few specific to Vulkan around. Not sure about Wayland.
        
         | fulafel wrote:
         | The application(s) you want to run is the best benchmark.
        
           | rixrax wrote:
           | Problem with this is that the application I have in mind
           | doesn't provide anything but perceptual feedback. I'd rather
           | have some cold numbers that are to some degree reproducible
           | and would give at least rough idea of the performance of
           | given HW+drivers+other-settings combination.
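
One way to get the "cold numbers" asked for above is to time the same command several times and report the median and spread. The following is only a minimal sketch: the function names are hypothetical, and the workload in the `__main__` block is a placeholder that should be replaced with a real benchmark command for the HW+driver combination under test.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so terminal I/O does not skew the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the median and sample standard deviation of the durations."""
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return {"median_s": statistics.median(durations), "stdev_s": spread}


if __name__ == "__main__":
    # Placeholder workload (an assumption): swap in the benchmark to measure.
    placeholder = [sys.executable, "-c", "sum(range(10**6))"]
    print(summarize(time_command(placeholder)))
```

Wall-clock time of a whole process is a coarse metric; it says nothing about frame pacing, but a low spread across runs at least indicates that the comparison is reproducible.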
        
       | handrous wrote:
       | Just checked, looks like Vulkan under DRM on the Pi4 works, and
       | at least some people in the Libretro ecosystem have already
       | messed around with it, so this could benefit Lakka. Awesome.
       | Maybe this'll mean getting to play with some decent CRT shaders
       | on the Pi without an unacceptable performance hit, and/or getting
       | to make better use of Retroarch's advanced input lag reduction
       | features.
        
         | mewse-hn wrote:
         | If this could get playstation emulators using vulkan running at
         | decent frame rates that would be really, really awesome
        
         | willis936 wrote:
         | CRT Royale running 60 fps in a sub 15 W machine would be
         | impressive. 1080p would be nice, 1440p would be great, and 4K
         | would be best. The pi4 can output 4K60, but I really doubt it
         | can shove through that many simulated pixels.
        
       | exabrial wrote:
       | ugh, now if I could only buy a few haha
        
         | boromi wrote:
         | Seriously I had to pay 75$ to get one recently, which was
         | painful but I needed it.
        
           | juanse wrote:
           | Same here but with 10 units. Almost 100$ each.
        
         | GhettoComputers wrote:
         | What for? I'm sure you already have some older computers in
         | your house that run much better. An APU would trounce it. Pis
         | feel like the netbooks of desktop computers; the new ones get
         | extremely hot, so I would expect it to require a heavy
         | heatsink and a constantly spinning fan if you tried this.
        
           | lytedev wrote:
           | Better power efficiency
        
       | ncmncm wrote:
       | I guess this means you can use Kompute (kompute.cc) on RPi 4,
       | now?
        
         | lucb1e wrote:
         | (A GPGPU framework, to save others a click.)
        
           | ncmncm wrote:
           | Except not a framework, it's just a library. You can use it
           | to do any of the stuff you would do with CUDA, about as fast,
           | but portably. #include it to accelerate your game's physics
           | engine, or whatever.
           | 
           | It doesn't say so at kompute.cc, but I found that it depends
           | on Vulkan 1.1.
        
       | Factorium wrote:
       | Could we see a portable Epic Games console, plugged directly into
       | their store?
       | 
       | Like the Steam Deck, but better, since developers will get 88% of
       | revenue instead of 65% on Steam.
        
         | LeoPanthera wrote:
         | The Steam Deck is a generic PC. It's not locked to the Steam
         | store. It's not even locked to the OS it comes with. You can
         | install the Epic store, or any other store, on it right now. If
         | you have one, anyway.
        
         | fortyseven wrote:
         | *slurp*
        
         | smoldesu wrote:
         | You can download EGS games on Linux just fine, so ostensibly
         | you could build one of these right now. Of course, you probably
         | wouldn't want to use ARM for a PC game console, but you're
         | welcome to try it.
        
       | pengaru wrote:
       | in TFA: s/Iglia/Igalia/
        
       | hesdeadjim wrote:
       | If only there was a world where Apple would sell M1 chips
       | separately from their walled garden.
        
         | bla3 wrote:
         | The Pi has much better performance per dollar, which is a
         | metric that's important to some people too.
        
           | tinus_hn wrote:
           | Purchase dollar? Or energy dollar?
        
             | rbanffy wrote:
             | I don't think there's anything else on the planet that
             | rivals the performance per watt of the M1 family.
             | 
             | Also, the RPi's SoC is made in an older 28nm process
             | (that's one of the reasons why it's cheaper).
        
         | Rovanion wrote:
         | They won't. Their margins on the services side are obscene so
         | getting people into that ecosystem is worth much more than the
         | sales of some processors.
        
         | mirekrusin wrote:
         | Why? They gave "it's possible" proof. They reap the benefits of
         | doing it first - all good. Now it's time for competition to
         | pick it up, possibly improve on it or fade away Intel style.
        
         | smoldesu wrote:
         | Also accepted would be a world where they just add Vulkan
         | support to their APUs already.
        
           | WithinReason wrote:
           | How about MoltenVK?
           | 
           | https://github.com/KhronosGroup/MoltenVK
        
             | smoldesu wrote:
             | It's fine, but it's frankly silly that you're forced to
             | translate a _free and open_ graphics API into a more
             | proprietary one. Compare that to something like DXVK, which
             | exists because Linux users cannot license DirectX on their
             | systems. MoltenVK exists simply because Apple thought
             | "let's not adopt the industry-wide standard for CG graphics
             | on our newer machines". Again, not bad, but a bit of a
             | sticky situation that is entirely predicated by technology
             | politics, not what's _actually possible_ on these GPUs.
        
               | tinus_hn wrote:
               | Is what is possible with Metal possible with GL though?
               | Both in performance and features? They didn't build Metal
               | just to be contrarian.
        
               | bzzzt wrote:
               | Metal was released a year before Vulkan. Apple just
               | didn't want to wait and decided to design their own
               | better than OpenGL API.
        
               | oynqr wrote:
               | Mantle was released ~1 year before Metal.
        
               | smoldesu wrote:
               | DirectX was released a decade before Vulkan, that didn't
               | stop manufacturers from including support for both so the
               | user could decide for themselves.
        
           | my123 wrote:
           | A fully compliant Vulkan implementation for M1 would come
           | with very surprising performance cliffs for a developer.
           | 
           | One of them:
           | https://github.com/KhronosGroup/MoltenVK/issues/1244
        
             | WithinReason wrote:
             | And also potential optimisations that are not possible in
             | other GPUs:
             | 
             | https://developer.apple.com/documentation/metal/gpu_features...
        
             | monocasa wrote:
             | That's pretty common for TBDRs. The tile is rendered into a
             | fixed size on chip buffer, and the driver has to split the
             | tile into multiple passes to fit all of the render target
             | data for nutty amounts of data coming out of the shader.
             | PowerVR works the same way (completely unsurprisingly).
        
             | fulafel wrote:
             | See this comment on that issue:
             | https://github.com/KhronosGroup/MoltenVK/issues/1244#issueco...
        
             | zamadatix wrote:
             | It'd be surprising if an architecture had 0 such surprises
             | and did everything Vulkan allows without any special
             | performance considerations vs another architecture.
        
         | ChuckNorris89 wrote:
         | Well, number one, why would they? Apple makes money by getting
         | consumers and locking them into their unicorns and rainbows
         | ecosystem where everything is perfect which makes consumers
         | comfortable spending boat loads of money, not by selling
         | commodity hardware.
         | 
         | Ecosystems with great UX and paid subscriptions plus a 30% cut
         | on all transactions are far more profitable than the margins
         | you make selling commodity hardware. Just ask famous phone
         | manufacturers like Siemens, Nokia and Blackberry why that is.
         | That's why SW dev salaries are much higher than HW dev salaries
         | as the former generates way more revenue than the latter.
         | That's why Apple doesn't roll out their own cloud datacenters
         | and instead just gets Amazon, Microsoft and Google to compete
         | against each other on pricing.
         | 
         | Apple only rolls out their solutions when they have an impact
         | on the final UX, like designing their own M1 silicon.
         | 
         | And number two, selling chips comes with a lot of hassle like
         | providing support to your partners like Intel and AMD do.
         | Pretty sure they don't want to bother with that.
         | 
         | Before they start selling chips I would rather they open
         | iMessage to other platforms to eliminate the bubble color
         | discrimination.
        
           | rafamaddd wrote:
           | > Before they start selling chips I would rather they open
           | iMessage to other platforms to eliminate the bubble color
           | discrimination.
           | 
           | Outside of the countries where iOS is on par with Android in
           | terms of popularity (I think the US, Canada and the UK are the
           | only ones, maybe also Australia), I don't know of, and haven't
           | seen, a single person using iMessage. Of course there are a lot
           | of people using iPhones outside of those countries, but
           | absolutely nobody uses iMessage.
           | 
           | The whole color bubble discrimination seems to happen only in
           | those countries where iOS is as popular as Android or more, and
           | people are actually using iMessage.
        
             | InvaderFizz wrote:
             | It's worse than that in the US. While iOS is a bit over
             | 50%, it's closing in on 90% for teens[0], where such
             | discrimination is most likely to occur. These numbers also
             | bode well for Apple's future market share as these teens
             | grow into adults.
             | 
             | 0: https://finance.yahoo.com/news/apple-i-phone-ownership-among...
        
               | ChuckNorris89 wrote:
               | It's getting similar in Europe for teens. I rarely see
               | them on public transport with anything other than an
               | iPhone.
        
             | rimliu wrote:
             | > but absolutely nobody uses iMessage
             | 
             | Uhm, iMessage works transparently. I just use the Messages
             | app: if my recipient uses an iPhone they get an iMessage; if
             | they use something else, they get an SMS.
        
               | mcintyre1994 wrote:
               | Their point is that most people don't use the Messages
               | app to communicate with others. In the UK for example
               | WhatsApp is massively dominant.
        
           | rbanffy wrote:
           | > Before they start selling chips I would rather they open
           | iMessage to other platforms to eliminate the bubble color
           | discrimination.
           | 
           | When so many telcos charge outrageous prices for SMSs, it's a
           | useful feature.
        
           | IgorPartola wrote:
           | I agree with you right up to how exactly does the M1 chip
           | affect the final UX? A different keyboard, screen, touchpad,
           | etc. all make a difference but why does the chip make a
           | difference?
        
             | masklinn wrote:
             | > I agree with you right up to how exactly does the M1 chip
             | affect the final UX?
             | 
             | It allows Apple to focus on what they want without being
             | limited by, and tied to, their hardware provider's strategy.
        
             | ArgyleSound wrote:
             | Power efficiency for one.
        
               | IgorPartola wrote:
               | Was there nobody else who made power efficient chips?
        
               | JiNCMG wrote:
               | Not in the x86 arena. Every time Apple gets involved with
               | a CPU developer (Motorola, IBM, Intel), its needs split
               | from the developer's desires. This time they decided to go
               | on their own (well, after years of doing this for the
               | iPhone). Note: They have been involved in the ARM CPU
               | market since the days of the Newton.
        
             | ChuckNorris89 wrote:
             | _> how exactly does the M1 chip affect the final UX?_
             | 
             | Everything runs faster, cooler, quieter and battery lasts
             | longer. Is that not part of the product UX?
        
               | IgorPartola wrote:
               | That makes it sound like Intel, AMD, ARM, etc. were
               | trying to build chips that run hotter and less
               | efficiently.
        
               | bzzzt wrote:
               | Seems like Intel really lost the plan there with every
               | new generation having just a few percent better
               | performance, trouble with moving to smaller nodes and the
               | enormous regression from spectre/meltdown.
               | 
               | The Apple chips are made for running macOS/iOS. Seems
               | there are some hardware instructions that are tailor made
               | for increasing the performance of Apple software so they
               | can make sure everything is working toward a common goal.
        
               | ChuckNorris89 wrote:
               | The end users don't care what brand of chip is under the
               | hood, or why the UX on Apple's implementation of Intel
               | chips sucked, they just know the new device has much
               | better UX overall due to the more powerful and more
               | efficient chip and will upgrade for that.
        
           | tinus_hn wrote:
           | > Before they start selling chips I would rather they open
           | iMessage to other platforms to eliminate the bubble color
           | discrimination
           | 
           | It's probably easier to just move to one of the 99% of
           | countries where nobody uses iMessage.
        
             | ChuckNorris89 wrote:
             | I already do, in Europe, where everyone and their mom uses
             | Facebook's WhatsApp for everything. While that evens the
             | playing field, I'm not sure I'd call trading a walled
             | garden for a spyware one a massive victory though.
        
               | tinus_hn wrote:
               | So who cares that a network nobody uses exists where only
               | people that have an Apple device can login?
        
               | ChuckNorris89 wrote:
               | Apparently teens and even some adults in the US where
               | they'll miss out on social activities or be mocked or
               | ignored due to not being on iMessage.
               | 
               | That doesn't affect me though as I don't live in the US
               | and am too old for that kind of stuff but I do remember
               | how easy it was to be mocked or bullied as a teen for not
               | having the same stuff as the herd, even before
               | smartphones were a thing.
        
         | NelsonMinar wrote:
         | Or alternately one where some Windows / Linux manufacturer
         | could match Apple for all the innovations in the M1 Macbooks.
         | I'm not an Apple fan but I'm envious of what they've
         | accomplished and wish I could run Windows and Linux on similar
         | hardware.
         | 
         | Other folks are starting to get there but only from the mobile
         | device direction, e.g. Tensor. Maybe I should look closer at
         | what Microsoft has done with ARM Surface.
        
           | smoldesu wrote:
           | It doesn't help that Apple bought the entire manufacturing
           | capacity for 5nm silicon from TSMC right before the chip
           | shortage hit. I think the next few years are going to get
           | very competitive though, and I'm excited to see how Intel and
           | AMD respond.
        
             | phkahler wrote:
             | Apple has done that before. IIRC when the original iPod
             | came out it used a new generation of HDD. Apple went to the
             | drive manufacturer and said "we'll take all of them" and
             | they agreed.
        
             | taf2 wrote:
             | How is Amazon able to produce their Arm chips for AWS?
             | Assuming those are not the 5nm?
        
               | smoldesu wrote:
               | There's still 5nm silicon for sale, but just not at TSMC
               | (the largest semiconductor manufacturer in the world).
               | Companies like Samsung are just now getting around to
               | mass-producing 5nm, and afaik there were a few domestic
               | Chinese manufacturers who claimed to be on the node too.
               | 
               | As for Amazon specifically though, I've got no idea.
               | They're a large enough company that they could buy out an
               | entire fab or foundry if they wanted, AWS makes more than
               | enough money to cover the costs.
        
         | jeffbee wrote:
         | The "walled garden" comes with a C and C++ toolchain, python,
         | perl, awk, sed, and a Unix shell. It is not, in any way, a
         | "walled garden" in a universe where words have shared meaning.
        
           | aftbit wrote:
           | It's a walled garden when you're not allowed to leave or bring
           | your friends in, no matter how nice the stuff on the inside
           | is.
        
             | jeffbee wrote:
             | And that analogy applies to macOS and the M1 CPU how,
             | exactly?
        
             | rimliu wrote:
             | What does it even mean?
        
         | monocasa wrote:
         | I'm hoping Alyssa Rosenzweig's fantastic work documenting the
         | M1 GPU will let us write native Vulkan drivers even for MacOS.
         | I believe she's been focusing thus far on the user space
         | visible interfaces, so a lot of that work should translate
         | well.
        
         | snvzz wrote:
         | No worries. Competition is coming.
         | 
         | https://www.phoronix.com/scan.php?page=news_item&px=SiFive-P...
         | 
         | Should be roughly M1 performance, but on RISC-V.
        
           | rafamaddd wrote:
           | uffff
           | 
           | who knows when that is coming and when are we going to be
           | able to buy regular laptops from e.g. Lenovo, HP, Acer, etc
           | with that.
           | 
           | By the time that happens, Apple may already be on their
           | third, fourth? generation of the M1, which is going to be
           | much, much faster than the M1.
        
           | phkahler wrote:
           | M1 is WAY faster than a cortex A78.
        
       | marcodiego wrote:
       | Now combine this with Zink and boom! We get OpenGL 4.6 for free:
       | https://www.phoronix.com/scan.php?page=news_item&px=Zink-Clo... .
       | 
       | Vulkan is too low level, but AFAICS it is not something one uses
       | directly; instead, a library which uses it as a back-end should
       | be used.
        
         | kcb wrote:
         | I've always wondered how this would work. Surely if it was
         | possible to reasonably implement OpenGL 4.6 on the PI GPU it
         | would already be done through Mesa.
        
         | my123 wrote:
         | > Now combine this with Zink and boom! We get OpenGL 4.6 for
         | free
         | 
         | For the RPi4 specifically:
         | 
         | That GPU has hardware limitations that make it incapable of
         | OpenGL 3.0. However, it supports GLES 3.2.
         | 
         | If you want desktop GL minus the features the hardware doesn't
         | support, you can set MESA_GL_VERSION_OVERRIDE=3.3 for example.
         | That will however never be compliant.
         | 
         | Vulkan has many extensions to allow it to work on hardware
         | which doesn't support the full feature set. (by not
         | implementing them, instead of having only version numbers)
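
The override mentioned above is just an environment variable that Mesa reads when the application starts. A minimal sketch of launching a program with it set follows; the helper names are hypothetical, and, as the comment says, the reported version is not a conformance claim.

```python
import os
import subprocess


def mesa_override_env(gl_version="3.3", base_env=None):
    """Return a copy of the environment with MESA_GL_VERSION_OVERRIDE set."""
    env = dict(os.environ if base_env is None else base_env)
    # Mesa advertises this GL version even if the hardware lacks some of
    # the features that version requires.
    env["MESA_GL_VERSION_OVERRIDE"] = gl_version
    return env


def run_with_override(cmd, gl_version="3.3"):
    """Run cmd (a list of arguments) with the override applied."""
    return subprocess.run(cmd, env=mesa_override_env(gl_version), check=False)
```

For example, `run_with_override(["glxinfo", "-B"])` should print the overridden version string, assuming `glxinfo` is installed and a Mesa driver is in use. Applications that actually rely on the unsupported features can still misrender or fail.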
        
           | zamadatix wrote:
           | The Pi hardware may not support multiple render targets or
           | other features in hardware directly but Zink is not required
           | to (and does not always) emit 1 Vulkan API call for each
           | OpenGL API call. It is free to issue as many as are needed to
           | properly emulate the OpenGL API in a conformant way. That
           | being said I don't think this particular compatibility is
           | in Zink today but there is nothing preventing it from being
           | possible just because the hardware couldn't create the render
           | targets all in one shot.
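
The idea of emulating more render targets than the hardware offers can be illustrated with simple batching: split the requested targets into groups the hardware can handle and render once per group. This is only a sketch of the concept with hypothetical names; it is not how Zink or any Mesa driver is actually implemented, and the default limit of 4 is taken from the figure quoted in this thread.

```python
def split_into_passes(requested_targets, hw_max_targets=4):
    """Group render target indices into passes of at most hw_max_targets."""
    if hw_max_targets < 1:
        raise ValueError("hw_max_targets must be at least 1")
    targets = list(range(requested_targets))
    # Each slice is one pass writing to a subset of the requested targets.
    return [targets[i:i + hw_max_targets]
            for i in range(0, len(targets), hw_max_targets)]


if __name__ == "__main__":
    # 8 targets (the OpenGL 3.0 minimum cited in the thread) on 4-target hardware.
    print(split_into_passes(8))
```

Every extra pass re-runs the geometry and shader work, which is why this kind of emulation can be conformant and still too slow to be useful.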
        
             | seba_dos1 wrote:
             | > but Zink is not required to (and does not always) emit 1
             | Vulkan API call for each OpenGL API call
             | 
             | The OpenGL driver also doesn't have to emit 1 logical
             | hardware operation for each OpenGL API call.
        
               | zamadatix wrote:
               | There is no hard technical requirement for hardware
               | drivers but it's riskier to expose performance impacting
               | emulation at that level vs the layered driver level
               | (where Zink is). For instance imagine a case where the
               | hardware supported 4 MRTs but the hardware driver
               | emulation layer exposed 8 MRTs for OpenGL compatibility
               | yet Zink needed to use 16 MRTs. Now you've got all sorts
               | of translation happening where Zink is likely calling the
               | lower emulation layer multiple times rather than just
               | calling the hardware directly. Such emulation layers are
               | expected in a layered driver, that's part of their actual
               | intent, whereas base hardware drivers are meant to expose
               | what the hardware is able to do natively and let you work
               | around it otherwise.
        
               | seba_dos1 wrote:
               | You can already enjoy stuff like OpenGL 2.1 support on
               | purely GLES 2.0 hardware this way - for instance on older
               | Raspberry Pis. There's not much Zink will bring to the
               | table that Gallium doesn't already when it comes to
               | emulation of missing hardware features (at least not if
               | you want them to actually perform in any reasonable way).
        
           | jdc wrote:
           | I wonder what specifically the GPU is missing that OpenGL needs.
        
             | my123 wrote:
             | The OpenGL 3.0 spec mandates support for 8 render targets,
             | the RPi4 GPU only has support for 4.
        
               | salawat wrote:
               | When you say render targets, do you mean drm buffers? Or
               | on GPU output buffers?
               | 
               | I'm not quite completely clueless, but I have the feeling
               | that clarification on this point will nudge me in the
               | right direction to understanding these things better.
        
               | my123 wrote:
               | GL_MAX_DRAW_BUFFERS
        
       | ArtWomb wrote:
       | Congrats! Huge effort. Full spec of Broadcom GPU (24 GFLOPS)
       | 
       | https://forums.raspberrypi.com/viewtopic.php?t=244519
        
       | prox wrote:
       | I wonder if a Raspberry GPU board (low cost graphics performance)
       | is possible. For light Blender work and maybe simple games.
        
         | my123 wrote:
         | It's better even in GPU perf/$ to buy a Jetson Nano 2GB, the
         | RPi4 GPU is really small (and not that well featured).
        
           | amelius wrote:
           | But you can only run one flavor of Linux on it, since NVidia
           | keeps the specs closed.
        
             | my123 wrote:
             | Today on the Jetson Nanos, you can just use the Fedora
             | stock image. (flashed to a microSD card)
             | 
             | It's much better than what it was before. nouveau works
             | ootb, including reclocking too.
             | 
             | It's also to be noted that all Tegras have an open-source
             | kernel mode GPU driver (nvgpu) even when using the
             | proprietary stack. However, that driver isn't in an ideal
             | state today.
        
           | numpad0 wrote:
           | FP32 GFLOPS, ballparks from random sources:
           | 
           | - this: 24
           | 
           | - Ryzen 5600g: 200(CPU)
           | 
           | - Jetson nano: 235
           | 
           | - GeForce GT1030: 1127
           | 
           | - Ryzen 3rd IGP: 2100
           | 
           | - Apple M1X: 5200
           | 
           | - Apple M1 Pro: 10400
           | 
           | - RTX3080: 35580
           | 
           | 1030 can be had for $110 even at this height of GPU
           | shortages, not that much more than a Nano. hmm
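
To put the ballpark list above in proportion, the figures can be normalized against the Pi 4. This is a small sketch using only the numbers quoted in the comment (labels kept as the commenter wrote them, unverified); the helper name is hypothetical, and the only price used is the $110 quoted for the GT 1030.

```python
# FP32 GFLOPS ballparks exactly as quoted in the comment above.
FP32_GFLOPS = {
    "Raspberry Pi 4 GPU": 24,
    "Ryzen 5600g (CPU)": 200,
    "Jetson nano": 235,
    "GeForce GT1030": 1127,
    "Ryzen 3rd IGP": 2100,
    "Apple M1X": 5200,
    "Apple M1 Pro": 10400,
    "RTX3080": 35580,
}


def relative_to(baseline, table):
    """Express every entry as a multiple of the baseline entry."""
    base = table[baseline]
    return {name: value / base for name, value in table.items()}


if __name__ == "__main__":
    for name, ratio in relative_to("Raspberry Pi 4 GPU", FP32_GFLOPS).items():
        print(f"{name}: {ratio:.1f}x")
    # GFLOPS per dollar for the one device with a quoted price.
    print(f"GT1030 GFLOPS per $: {FP32_GFLOPS['GeForce GT1030'] / 110:.1f}")
```

Peak FP32 throughput ignores memory bandwidth, drivers and power draw, so the ratios are rough orientation rather than a benchmark.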
        
           | prox wrote:
           | Wow that's an interesting device! Thanks!
        
           | krallja wrote:
           | The Jetson Nano uses a very similar SoC to the Nintendo
           | Switch, so you can expect similar performance.
        
             | JustFinishedBSG wrote:
               | It uses half a Switch SoC, GPU-wise
        
               | GhettoComputers wrote:
               | All versions? Would be cool to use a hacked switch
               | running Linux instead of Jetson if the performance was
               | that much better.
        
               | my123 wrote:
               | 921.6MHz is the GPU clock on Jetson Nano (at MAXN).
               | 
               | For the Switch:
               | 
               | > The GPU cores are clocked at 768 MHz when the device is
               | docked, and in handheld mode, fluctuating between the
               | following speeds: 307.2 MHz, 384 MHz, and 460 MHz
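
Combining the two claims in this subthread (half the GPU of a Switch, but a higher clock) gives a rough relative estimate. The sketch below assumes throughput scales linearly with GPU units times clock, which is a simplification; the 0.5 factor is the "half a Switch" claim from the thread, and the names are hypothetical.

```python
JETSON_NANO_GPU_MHZ = 921.6  # at MAXN, as quoted above

# Switch GPU clocks as quoted above; the mode names are illustrative labels.
SWITCH_GPU_MHZ = {
    "docked": 768.0,
    "handheld_low": 307.2,
    "handheld_mid": 384.0,
    "handheld_high": 460.0,
}

# Assumption taken from the thread: the Nano has half the Switch's GPU units.
NANO_TO_SWITCH_UNIT_RATIO = 0.5


def nano_vs_switch(mode):
    """Estimated Nano GPU throughput as a fraction of the Switch in `mode`."""
    return NANO_TO_SWITCH_UNIT_RATIO * JETSON_NANO_GPU_MHZ / SWITCH_GPU_MHZ[mode]


if __name__ == "__main__":
    for mode in SWITCH_GPU_MHZ:
        print(f"{mode}: {nano_vs_switch(mode):.2f}x")
```

Under these assumptions the Nano would land at roughly 0.6x of a docked Switch and between about 1.0x and 1.5x of a handheld one, ignoring memory bandwidth and thermal limits.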
        
       | StreamBright wrote:
       | This is great for many reasons.
       | 
       | https://www.reddit.com/r/MachineLearning/comments/ilcw2f/p_v...
        
       | causi wrote:
       | I wonder if we'll see any impacts from this on the Pi 4
       | applications that are presently borderline when it comes to
       | performance, like N64 emulation.
        
       ___________________________________________________________________
       (page generated 2021-10-30 23:00 UTC)