[HN Gopher] The Apple GPU and the impossible bug
       ___________________________________________________________________
        
       The Apple GPU and the impossible bug
        
       Author : stefan_
       Score  : 719 points
       Date   : 2022-05-13 13:32 UTC (9 hours ago)
        
 (HTM) web link (rosenzweig.io)
 (TXT) w3m dump (rosenzweig.io)
        
       | stefan_ wrote:
       | > The Tiled Vertex Buffer is the Parameter Buffer. PB is the
       | PowerVR name, TVB is the public Apple name, and PB is still an
       | internal Apple name.
       | 
       | Patent lawyers love this one silly trick.
        
         | robert_foss wrote:
          | Seeing how Apple licensed the full PowerVR hardware before,
          | they probably currently have a license for whatever hardware
          | they based their design on.
        
           | kimixa wrote:
           | They originally claimed they completely redesigned it and
           | announced they were therefore going to drop the PowerVR
           | architecture license - that was the reason for the stock
           | price crash and Imagination Technologies sale in 2017.
           | 
            | They have since scrubbed the internet of all such claims
            | and to this day pay for an architecture license. I think
            | it's similar to an ARM architecture license - a license for
            | any derived technology and patents rather than actually
            | being given the RTL for PowerVR-designed cores.
           | 
           | I worked at PowerVR during that time (I have Opinions, but
           | will try to keep them to myself), and my understanding was
           | that Apple hadn't actually taken new PowerVR RTL for a number
           | of years and had significant internal redesigns of large
           | units (e.g. the shader ISA was rather different from the
           | PowerVR designs of the time), but presumably they still use
           | enough of the derived tech and ideas that paying the
           | architecture license is necessary. This transfer was only one
           | way - we never saw anything internal about Apple's designs,
           | so reverse engineering efforts like this are still
           | interesting.
           | 
            | And as someone who worked on the PowerVR cores (not the Apple
            | derivatives), I can assure you that everything discussed in
            | the original post is _extremely_ familiar.
        
           | pyb wrote:
           | Apple's claim is that they designed it themselves. https://en
           | .wikipedia.org/wiki/Talk:Apple_M1#[dubious_%E2%80%...
        
             | gjsman-1000 wrote:
             | There's no reason that couldn't be a half-truth - it could
             | be a PowerVR with certain components replaced, or even the
             | entire GPU replaced but with PowerVR-like commands and
             | structure for compatibility reasons. Kind of like how AMD
             | designed their own x86 chip despite it being x86 (Intel's
             | architecture).
             | 
             | Also, if you read Hector Martin's tweets (he's doing the
             | reverse-engineering), Apple replacing the actual logic
             | while maintaining the "API" of sorts is not unheard of.
             | It's what they do with ARM themselves - using their own ARM
             | designs instead of the stock Cortex ones while maintaining
             | ARM compatibility.*
             | 
             | *Thus, Apple has a right to the name "Apple Silicon"
             | because the chip is designed by Apple, and just happens to
             | be ARM-compatible. Other chips from almost everyone else
             | use stock ARM designs from ARM themselves. Otherwise, we
                | might as well call AMD an "Intel design" because it's x86,
                | by the same logic.
        
               | quux wrote:
               | Didn't Apple have a large or even dominant role in the
               | design of the ARM64/AArch64 architecture? I remember
                | reading somewhere that they developed ARM64 and
                | essentially "gave it" to ARM, who accepted it, but nobody
                | could understand at the time why a 64-bit extension to
                | ARM was needed so urgently, or why some details of the
                | architecture had been designed the way they had. Years
                | later, with Apple Silicon, it all became clear.
        
               | kalleboo wrote:
               | The source is a former Apple engineer (now at Nvidia
               | apparently)
               | 
               | https://twitter.com/stuntpants/status/1346470705446092811
               | 
               | > _arm64 is the Apple ISA, it was designed to enable
               | Apple's microarchitecture plans. There's a reason Apple's
               | first 64 bit core (Cyclone) was years ahead of everyone
               | else, and it isn't just caches_
               | 
               | > _Arm64 didn't appear out of nowhere, Apple contracted
               | ARM to design a new ISA for its purposes. When Apple
               | began selling iPhones containing arm64 chips, ARM hadn't
               | even finished their own core design to license to
               | others._
               | 
               | > _ARM designed a standard that serves its clients and
               | gets feedback from them on ISA evolution. In 2010 few
               | cared about a 64-bit ARM core. Samsung & Qualcomm, the
               | biggest mobile vendors, were certainly caught unaware by
               | it when Apple shipped in 2013._
               | 
               | > > _Samsung was the fab, but at that point they were
               | already completely out of the design part. They likely
               | found out that it was a 64 bit core from the diagnostics
               | output. SEC and QCOM were aware of arm64 by then, but
               | they hadn't anticipated it entering the mobile market
               | that soon._
               | 
               | > _Apple planned to go super-wide with low clocks, highly
               | OoO, highly speculative. They needed an ISA to enable
               | that, which ARM provided._
               | 
               | > _M1 performance is not so because of the ARM ISA, the
               | ARM ISA is so because of Apple core performance plans a
               | decade ago._
               | 
               | > > _ARMv8 is not arm64 (AArch64). The advantages over
               | arm (AArch32) are huge. Arm is a nightmare of
               | dependencies, almost every instruction can affect flow
               | control, and must be executed and then dumped if its
               | precondition is not met. Arm64 is made for reordering._
        
               | quux wrote:
               | Thanks!
        
               | travisgriggs wrote:
               | > > M1 performance is not so because of the ARM ISA, the
               | ARM ISA is so because of Apple core performance plans a
               | decade ago.
               | 
               | This is such an interesting counterpoint to the
               | occasional "Just ship it" screed (just one yesterday I
               | think?) we see on HN.
               | 
                | I have to say, I find this long-form delivery of tech to
                | be enlightening. That kind of foresight has to mean some
                | level of technical savviness at high decision-making
                | levels. Whereas many of us are caught at companies with
                | short-sighted/tech-naive leadership who clamor to just
                | ship it so we can start making money and recoup the money
                | we're losing on these expensive tech-type developers.
        
               | kif wrote:
               | I think the "just ship it" method is necessary when
                | you're small and starting out. Unless you are well
                | funded, you can't afford to do what Apple did.
        
               | pyb wrote:
               | I haven't followed the announcements CPU side - do Apple
               | clearly claim that they designed their own CPU (with an
               | ARM instruction set)?
        
               | daneel_w wrote:
               | They are one of a handful of companies that hold a
               | license allowing them to both customize the reference
               | core and to implement the Arm ISA through their own
               | silicon design. Everyone else's SoCs all use the same Arm
                | reference mask. Qualcomm also holds such a license, which
                | is why their Snapdragon SoCs, just like Apple's A- and
                | M-series, occupy a performance tier above everything else
                | Arm.
        
               | happycube wrote:
                | The _only_ Qualcomm-designed 64-bit mobile core so far
                | was the Kryo core in the 820. They then assigned that
                | team to server chips (Centriq), then sacked the whole
                | team when they felt they needed to cut cash flow to
                | stave off Avago/Broadcom. The "Kryo" cores from the 835
                | on are rebadged/adjusted ARM cores.
                | 
                | IMO the Kryo/820 wasn't a _major_ failure; it turned out
                | a lot better than the 810, which had A53/A57 cores.
                | 
                | And _then_ they decided they needed a mobile CPU team
                | again and bought Nuvia for ~US$1 billion.
        
               | masklinn wrote:
               | According to Hector Martin (the project lead of Asahi) in
                | previous threads on the subject[0], Apple actually has an
               | "architecture+" license which is completely exclusive to
               | them, thanks to having literally been at the origins of
               | ARM: not only can Apple implement the ISA on completely
               | custom silicon rather than license ARM cores, they can
               | _customise_ the ISA (as in add instructions, as well as
               | opt out of mandatory ISA features).
               | 
               | [0] https://news.ycombinator.com/item?id=29798744
        
               | pyb wrote:
               | Such a license is a big clue, but not quite what I was
               | enquiring about...
        
               | paulmd wrote:
                | To be blunt, you're asking questions that could be
                | answered with a quick Google search, and you're coming
                | off as a bit of a jerk demanding very specific citations
                | with exact wording for basic facts like this that, again,
                | could be settled by looking through the Wikipedia article
                | on "Apple silicon" and then bouncing to a specific
                | source. People have answered your question and you're
                | brushing them off because you want it answered in an
                | exact, specific way.
               | 
               | https://en.wikipedia.org/wiki/Apple_silicon
               | 
               | https://www.anandtech.com/show/7335/the-
               | iphone-5s-review/2
               | 
               | > NVIDIA and Samsung, up to this point, have gone the
               | processor license route. They take ARM designed cores
               | (e.g. Cortex A9, Cortex A15, Cortex A7) and integrate
               | them into custom SoCs. In NVIDIA's case the CPU cores are
               | paired with NVIDIA's own GPU, while Samsung licenses GPU
               | designs from ARM and Imagination Technologies. Apple
               | previously leveraged its ARM processor license as well.
               | Until last year's A6 SoC, all Apple SoCs leveraged CPU
               | cores designed by and licensed from ARM.
               | 
               | > With the A6 SoC however, Apple joined the ranks of
               | Qualcomm with leveraging an ARM architecture license. At
               | the heart of the A6 were a pair of Apple designed CPU
               | cores that implemented the ARMv7-A ISA. I came to know
               | these cores by their leaked codename: Swift.
               | 
                | Yes, Apple has been designing and using non-reference
                | cores since the A6 era, and was one of the first to the
                | table with ARMv8 (Apple engineers claim it was designed
                | for them under contract to their specifications, but
                | _this_ part is difficult to verify with anything more
                | than citations from individual engineers).
               | 
                | I expect that Apple has said as much in their
                | presentations somewhere, but if you're that keen on
                | finding such an incredibly specific attribution, then
                | knock yourself out. It'll be in an Apple conference
                | somewhere, like WWDC. They probably have said "Apple-
                | designed silicon" or "custom core" at some point, and
                | that would be your citation - but they sell products,
                | not hardware, and they don't _extensively_ talk about
                | their architectures since they're not really the
                | product, so you probably won't find a deep dive like
                | AnandTech's from Apple directly, where they say "we have
                | 8-wide decode, 16-deep pipeline... etc." sorts of things.
        
               | [deleted]
        
               | gjsman-1000 wrote:
                | Qualcomm did use their own design, called _Kryo_, for a
                | little while, but is now focusing on cores designed by
                | Nuvia, which they just bought, for the future.
               | 
                | As for Apple, they've designed their own cores since the
                | Apple A6, which used the _Swift_ core. If you go to the
                | Wikipedia page, you can actually see the names of their
                | core designs, which they improve every year. For the M1
                | and A14, they use _Firestorm_ high-performance cores and
                | _Icestorm_ efficiency cores. The A15 uses _Avalanche_ and
                | _Blizzard_. If you visit AnandTech, they have deep dives
                | on the technical details of many of Apple's core designs
                | and how they differ from other core designs, including
                | stock ARM.
               | 
               | The Apple A5 and earlier were stock ARM cores, the last
               | one they used being Cortex A9.
               | 
                | For this reason, Apple's chips are about as much ARM
                | chips as AMD's are Intel chips: technically compatible,
                | implementation almost completely different. It's also why
                | Apple calls it "Apple Silicon" - it's not just marketing,
                | but as justified as AMD not calling their chips Intel
                | derivatives.
        
               | GeekyBear wrote:
                | > Qualcomm did use their own design, called Kryo, for a
                | little while
               | 
               | Before that, they had Scorpion and Krait, which were both
                | quite successful 32-bit ARM-compatible cores at the time.
                | 
                | Kryo started as an attempt to quickly launch a custom
                | 64-bit ARM core, and the attempt failed badly enough that
               | Qualcomm abandoned designing their own cores and turned
               | to licensing semi-custom cores from ARM instead.
        
               | amaranth wrote:
                | Kryo started as a custom design but flopped in the
                | Snapdragon 820, so they moved to a "semi-custom" design;
                | it's unclear how different that really is from the stock
                | Cortex designs.
        
               | daneel_w wrote:
                | The otherworldly performance-per-watt would be another.
        
               | stephen_g wrote:
               | They do, and their microarchitecture is unambiguously,
                | hugely different to anything else (some details in [1]).
               | The last Apple Silicon chip to use a standard Arm design
               | was the A5X, whereas they were using customised PowerVR
               | GPUs until I think the A11.
               | 
                | [1] https://www.anandtech.com/show/16226/apple-
               | silicon-m1-a14-de...
        
               | rjsw wrote:
               | > Apple replacing the actual logic while maintaining the
               | "API" of sorts is not unheard of.
               | 
                | They did this with ADB: early PowerPC systems contained a
                | controller chip that had the same API as the one
                | implemented in software on the 6502 IOP coprocessor in
                | the IIfx/Q900/Q950.
        
           | brian_herman wrote:
            | Also lawyers that can keep it in court long enough for a
            | redesign.
        
       | tambourine_man wrote:
       | Few things are more enjoyable than reading a good bug story, even
       | when it's not one's area of expertise. Well done.
        
         | alimov wrote:
         | I had the same thought. I really enjoy following along and
         | getting a glimpse into the thought process of people working
         | through challenges.
        
       | danw1979 wrote:
       | Alyssa and the rest of the Asahi team are basically magicians as
       | far as I can tell.
       | 
        | What amazing work, and great writing that takes an absolute
        | graphics layman (me) on a very technical journey while remaining
        | largely understandable.
        
         | [deleted]
        
       | nicoburns wrote:
       | > Why the duplication? I have not yet observed Metal using
       | different programs for each.
       | 
       | I'm guessing whoever designed the system wasn't sure whether they
       | would ever need to be different, and designed it so that they
       | could be. It turned out that they didn't need to be, but it was
       | either more work than it was worth to change it (considering that
       | simply passing the same parameter twice is trivial), or they
       | wanted to leave the flexibility in the system in case it's needed
        | in the future.
       | 
       | I've definitely had APIs like this in a few places in my code
       | before.
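        | 
        | A tiny hypothetical sketch of that kind of interface (the names
        | and types here are made up for illustration; they are not the
        | real driver structures):
        | 
        |     // Two slots that callers fill with the same value today,
        |     // but that could diverge later.
        |     struct FlushPrograms {
        |         var partialRenderStore: UInt64  // TVB-overflow flush
        |         var finalRenderStore: UInt64    // end-of-pass flush
        |     }
        | 
        |     let storeProgram: UInt64 = 0xDEAD_BEEF  // placeholder address
        |     // Passing the same program twice is trivial for callers...
        |     let flush = FlushPrograms(partialRenderStore: storeProgram,
        |                               finalRenderStore: storeProgram)
        |     // ...while the split keeps the door open for the two flush
        |     // paths to differ some day.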
        
         | pocak wrote:
         | I don't understand why the programs are the same. The partial
         | render store program has to write out both the color and the
         | depth buffer, while the final render store should only write
         | out color and throw away depth.
        
           | kimixa wrote:
            | Possibly pixel local storage - I think this can be accessed
            | with extended raster order groups and image blocks in Metal.
           | 
           | https://developer.apple.com/documentation/metal/resource_fun.
           | ..
           | 
            | E.g. in their example in the link above for deferred
            | rendering (figure 4), the multiple G-buffers won't actually
            | need to leave the on-chip tile buffer - unless there's a
            | partial render before the final shading shader is run.
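            | 
            | One concrete way "never leaves the on-chip tile buffer"
            | shows up at the Metal API level is a memoryless attachment.
            | A minimal Swift sketch (illustrative only, not Apple's
            | sample code; the imageblock/tile-shader path is another
            | route):
            | 
            |     import Metal
            | 
            |     let device = MTLCreateSystemDefaultDevice()!
            |     let gBufDesc = MTLTextureDescriptor.texture2DDescriptor(
            |         pixelFormat: .rgba16Float, width: 1920,
            |         height: 1080, mipmapped: false)
            |     gBufDesc.usage = .renderTarget
            |     gBufDesc.storageMode = .memoryless  // tile memory only
            |     let albedo = device.makeTexture(descriptor: gBufDesc)!
            | 
            |     let gPass = MTLRenderPassDescriptor()
            |     gPass.colorAttachments[1].texture = albedo
            |     gPass.colorAttachments[1].loadAction = .clear
            |     // Nothing is flushed to main memory at the end of the
            |     // pass - which only holds if no partial render forces
            |     // a flush first.
            |     gPass.colorAttachments[1].storeAction = .dontCare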
        
           | hansihe wrote:
            | Not necessarily; other render passes could need the depth
            | data later.
        
             | Someone wrote:
             | So it seems it allows for optimization. If you know you
             | don't need everything, one of the steps can do less than
             | the other.
        
             | johntb86 wrote:
             | Most likely that would depend on what storeAction is set
             | to: https://developer.apple.com/documentation/metal/mtlrend
             | erpas...
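              | 
              | For reference, a minimal Swift sketch of that knob
              | (assuming a `depthTexture` already created; illustrative,
              | not code from the article or the driver):
              | 
              |     let pass = MTLRenderPassDescriptor()
              |     pass.depthAttachment.texture = depthTexture
              |     pass.depthAttachment.loadAction = .clear
              |     pass.depthAttachment.clearDepth = 1.0
              |     // .store keeps depth in memory for later passes;
              |     // .dontCare lets the driver discard it after the
              |     // final flush.
              |     pass.depthAttachment.storeAction = .dontCare
              |     pass.colorAttachments[0].storeAction = .store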
        
             | pocak wrote:
             | Right, I had the article's bunny test program on my mind,
             | which looks like it has only one pass.
             | 
             | In OpenGL, the driver would have to scan the following
             | commands to see if it can discard the depth data. If it
             | doesn't see the depth buffer get cleared, it has to be
             | conservative and save the data. I assume mobile GPU drivers
             | in general do make the effort to do this optimization, as
             | the bandwidth savings are significant.
             | 
              | In Vulkan, the application explicitly specifies which
              | attachments (stencil, depth, color buffers) must be
              | persisted at the end of a render pass and which need not
              | be.
             | So that maps nicely to the "final render flush program".
             | 
             | The quote is about Metal, though, which I'm not familiar
             | with, but a sibling comment points out it's similar to
             | Vulkan in this aspect.
             | 
             | So that leaves me wondering: did Rosenzweig happen to only
             | try Metal apps that always use _MTLStoreAction.store_ in
             | passes that overflow the TVB, or is the Metal driver
             | skipping a useful optimization, or neither? E.g. because
             | the hardware has another control for this?
        
           | plekter wrote:
           | I think multisampling may be the answer.
           | 
            | For a partial render all samples must be written out, but
            | for the final one you can resolve (average) them before
            | writeout.
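            | 
            | In Metal terms that's roughly the difference between .store
            | and .multisampleResolve on the attachment. A small sketch
            | (assuming a multisampled texture `msaaColor` and a resolve
            | target `resolved` already exist; illustrative only):
            | 
            |     let pass = MTLRenderPassDescriptor()
            |     pass.colorAttachments[0].texture = msaaColor
            |     pass.colorAttachments[0].resolveTexture = resolved
            |     // Average the samples into `resolved` at the end of
            |     // the pass instead of storing every sample.
            |     pass.colorAttachments[0].storeAction = .multisampleResolve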
        
         | [deleted]
        
       | 542458 wrote:
       | It's been said more than a few times in the past, but I cannot
       | get over just how smart and motivated Alyssa Rosenzweig is -
       | she's currently an undergraduate university student, and was
       | leading the Panfrost project when she was still in high school!
       | Every time I read something she wrote I'm astounded at how
       | competent and eloquent she is.
        
         | frostwarrior wrote:
          | While I was reading I was already thinking that. I can't
          | believe how smart she is and what an awesome developer she is.
        
         | pciexpgpu wrote:
          | Undergrad? I thought she was a Staff SWE at an OSS company.
         | Seriously impressive, and ought to give anyone imposter
         | syndrome.
        
           | gjsman-1000 wrote:
            | Well, Alyssa is, and she works for Collabora while also being
            | an undergrad.
        
           | coverband wrote:
           | I was about to post "very impressive", but that seems a huge
           | understatement after finding out she's still in school...
        
         | [deleted]
        
         | aero-glide2 wrote:
          | Have to admit, whenever I see people much younger than me do
          | great things, I get very depressed.
        
           | kif wrote:
           | I used to feel this way, too. However, every single one of us
           | has their own unique circumstances.
           | 
           | I can't give too many details unfortunately. But, there's a
           | specific step I took in my career, which was completely
           | random at the time. I was still a student, and I decided not
           | to work somewhere. I resigned two weeks in. Had I not done
           | that, I wouldn't be where I am today. My situation would be
           | totally different.
           | 
           | Yes, some people are very talented. But it does take quite a
           | lot of work and dedication. And yes, sometimes you cannot
           | afford to dedicate your time to learning something because
           | life happens.
        
           | cowvin wrote:
            | No need to be depressed. It's not a competition. You can find
            | inspiration in what others achieve and try to achieve more
            | yourself.
        
           | ip26 wrote:
           | I get that. But then I remember at that age, I was only just
           | cobbling together my very first computer from the scrap bin.
           | An honest comparison is nearly impossible.
        
           | pimeys wrote:
            | And for me, her existence is enough to keep me from getting
            | depressed about my industry. Whatever she's doing is keeping
            | my hopes up for computer engineering.
        
           | [deleted]
        
           | ohgodplsno wrote:
           | Be excited! This means amazing things are coming, from
           | incredibly talented people. And even better when they put out
           | their knowledge in public, in an easy to digest form, letting
           | you learn from them.
        
         | azinman2 wrote:
         | Does anyone know if she has a proper interview somewhere? I'd
         | love to know how she got so technical in high school to be able
         | to reverse engineer a GPU -- something I would have no idea how
          | to start even with many more years of experience (although
         | admittedly I know very little about GPUs and don't do graphics
         | work).
        
       | daenz wrote:
       | That image gave me flashbacks of gnarly shader debugging I did
       | once. IIRC, I was dividing by zero in some very rare branch of a
       | fragment shader, and it caused those black tiles to flicker in
       | and out of existence. Excruciatingly painful to debug on a GPU.
        
       | thanatos519 wrote:
       | What an entertaining story!
        
       | ninju wrote:
       | > Comparing a trace from our driver to a trace from Metal,
       | looking for any relevant difference, we eventually _stumble on
       | the configuration required_ to make depth buffer flushes work.
       | 
       | > And with that, we get our bunny.
       | 
       | So what was the configuration that needed to change? Don't leave
       | us hanging!!!
        
         | [deleted]
        
       | dry_soup wrote:
       | Very interesting and easy to follow writeup, even for a graphics
       | ignoramus like myself.
        
       | Jasper_ wrote:
        | Huh, I always thought tilers re-ran their vertex shaders multiple
        | times -- once position-only to do binning, and then _again_ to
        | compute all attributes for each tile; that's what the "forward
        | tilers" like Adreno/Mali do. It's crazy that they dump all
        | geometry to main memory rather than keeping it in the pipe. It
        | explains why geometry is more of a limit on AGX/PVR than on
        | Adreno/Mali.
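        | 
        | For readers unfamiliar with the term, binning is roughly the
        | step sketched below (a toy CPU-side model, not how any of these
        | GPUs actually implements it): record, per screen tile, which
        | triangles touch it. A forward tiler keeps little more than this
        | index list and re-runs vertex work per tile; AGX/PVR instead
        | write the full post-transform attributes into the parameter
        | buffer.
        | 
        |     // Screen-space triangle: three 2D positions.
        |     struct Tri { var ax, ay, bx, by, cx, cy: Float }
        | 
        |     func bin(_ tris: [Tri], width: Int, height: Int,
        |              tile: Int = 32) -> [[Int]] {
        |         let tilesX = (width + tile - 1) / tile
        |         let tilesY = (height + tile - 1) / tile
        |         var bins = Array(repeating: [Int](),
        |                          count: tilesX * tilesY)
        |         for (i, t) in tris.enumerated() {
        |             // Conservative coverage: bounding box in tiles.
        |             let minTX = max(0, Int(min(t.ax, t.bx, t.cx)) / tile)
        |             let maxTX = min(tilesX - 1,
        |                             Int(max(t.ax, t.bx, t.cx)) / tile)
        |             let minTY = max(0, Int(min(t.ay, t.by, t.cy)) / tile)
        |             let maxTY = min(tilesY - 1,
        |                             Int(max(t.ay, t.by, t.cy)) / tile)
        |             guard minTX <= maxTX, minTY <= maxTY else { continue }
        |             for ty in minTY...maxTY {
        |                 for tx in minTX...maxTX {
        |                     bins[ty * tilesX + tx].append(i)
        |                 }
        |             }
        |         }
        |         return bins
        |     }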
        
         | pocak wrote:
         | That's what I thought, too, until I saw ARM's Hot Chips 2016
         | slides. Page 24 shows that they write transformed positions to
         | RAM, and later write varyings to RAM. That's for Bifrost, but
         | it's implied Midgard is the same, except it doesn't filter out
         | vertices from culled primitives.
         | 
         | That makes me wonder whether the other GPUs with position-only
         | shading - Intel and Adreno - do the same.
         | 
          | As for PowerVR, I've never seen them described as doing
          | position-only shading - I think they've always done full
          | vertex processing upfront.
         | 
         | edit: slides are at https://old.hotchips.org/wp-
         | content/uploads/hc_archives/hc28...
        
           | Jasper_ wrote:
            | Mali's slides here still show them doing two vertex shading
            | passes: one for positions, and another for the remaining
            | attributes.
           | I'm guessing "memory" here means high-performance in-unit
           | memory like TMEM, rather than a full frame's worth of data,
           | but I'm not sure!
        
         | atq2119 wrote:
         | I was under that impression as well. If they write out all
          | attributes, what is really the remaining difference from a
          | traditional immediate mode renderer? Nvidia reportedly has
         | vertex attributes going through memory for many generations
         | already (and they are at least partially tiled...).
         | 
         | I suppose the difference is whether the render target lives in
         | the "SM" and is explicitly loaded and flushed (by a shader, no
         | less!) or whether it lives in a separate hardware block that
         | acts as a cache.
        
           | Jasper_ wrote:
            | NV has vertex attributes "in-pipe" (hence mesh shaders), and
            | the appearance of a tiler is a misread; it's just a change in
            | the macro-rasterizer affecting which quads get dispatched
            | first, not a true tiler.
           | 
            | The big difference is the end of the pipe, as mentioned:
            | whether you have ROPs or whether your shader cores load/store
            | from a framebuffer segment. Basically, whether framebuffer
            | clears are expensive (assuming no fast-clear cheats) or free.
        
       | [deleted]
        
       | bob1029 wrote:
       | I really appreciate the writing and work that was done here.
       | 
       | It is amazing to me how complicated these systems have become. I
       | am looking over the source for the single triangle demo. Most of
       | this is just about getting information from point A to point B in
       | memory. Over 500 lines worth of GPU protocol overhead... Granted,
       | this is a one-time cost once you get it working, but it's still a
       | lot to think about and manage over time.
       | 
       | I've written software rasterizers that fit neatly within 200
       | lines and provide very flexible pixel shading techniques.
        | Certainly not capable of running a Cyberpunk 2077 scene, but
       | interactive framerates otherwise. In the good case, I can go from
       | a dead stop to final frame buffer in <5 milliseconds. Can you
       | even get the GPU to wake up in that amount of time?
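        | 
        | For anyone curious, the core of such a rasterizer is small
        | enough to sketch here (my own illustration, not the parent's
        | code): an edge-function triangle fill with a pluggable "pixel
        | shader" callback.
        | 
        |     struct Vec2 { var x, y: Float }
        | 
        |     // Signed area of the parallelogram (a->b, a->p); its sign
        |     // tells which side of edge ab the point p lies on.
        |     func edge(_ a: Vec2, _ b: Vec2, _ p: Vec2) -> Float {
        |         (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x)
        |     }
        | 
        |     // Fill one triangle into a 32-bit framebuffer, calling
        |     // `shade` with barycentric weights for each covered pixel.
        |     func rasterize(_ v0: Vec2, _ v1: Vec2, _ v2: Vec2,
        |                    width: Int, height: Int,
        |                    into fb: inout [UInt32],
        |                    shade: (Float, Float, Float) -> UInt32) {
        |         let area = edge(v0, v1, v2)
        |         guard area != 0 else { return }  // degenerate triangle
        |         let minX = max(0, Int(min(v0.x, v1.x, v2.x)))
        |         let maxX = min(width - 1, Int(max(v0.x, v1.x, v2.x)))
        |         let minY = max(0, Int(min(v0.y, v1.y, v2.y)))
        |         let maxY = min(height - 1, Int(max(v0.y, v1.y, v2.y)))
        |         guard minX <= maxX, minY <= maxY else { return }
        |         for y in minY...maxY {
        |             for x in minX...maxX {
        |                 let p = Vec2(x: Float(x) + 0.5,
        |                              y: Float(y) + 0.5)
        |                 let w0 = edge(v1, v2, p)
        |                 let w1 = edge(v2, v0, p)
        |                 let w2 = edge(v0, v1, p)
        |                 // Inside if all edges agree with the winding.
        |                 if (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
        |                    (w0 <= 0 && w1 <= 0 && w2 <= 0) {
        |                     fb[y * width + x] =
        |                         shade(w0 / area, w1 / area, w2 / area)
        |                 }
        |             }
        |         }
        |     }
        | 
        | Loop that over a triangle list, add a depth buffer and
        | perspective-correct interpolation, and you have most of a
        | sub-200-line software rasterizer.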
        
         | mef wrote:
         | with great optimization comes great complexity
        
           | [deleted]
        
       | quux wrote:
       | Impressive work and really interesting write up. Thanks!
        
       | VyseofArcadia wrote:
       | > Yes, AGX is a mobile GPU, designed for the iPhone. The M1 is a
       | screaming fast desktop, but its unified memory and tiler GPU have
       | roots in mobile phones.
       | 
       | PowerVR has its roots in a desktop video card with somewhat
        | limited release and impact. It really took off when it was used
        | in the Sega Dreamcast home console and the Sega NAOMI arcade
        | board. It was only later that people put it in phones.
        
         | robert_foss wrote:
          | But since it is a tiled rendering architecture, which is normal
          | for mobile applications and not how desktop GPUs are typically
          | architected, it would be fair to call it a mobile GPU.
        
           | Veliladon wrote:
           | Nvidia appears to be an immediate mode renderer to the user
           | but has used a tiled rendering architecture under the hood
           | since Maxwell.
        
             | pushrax wrote:
              | According to the sources I've read, it uses a tiled
              | rasterizing architecture but it's not deferred in the same
              | way as a typical mobile TBDR, which bins all vertices
              | before starting rasterization, defers all rasterization
              | until after all vertex generation, and flushes each tile to
              | the framebuffer once.
             | 
              | NV seems to rasterize vertices in small batches (i.e.
              | immediately) but buffers the rasterizer output on-die in
             | tiles. There can still be significant overlap between
             | vertex generation and rasterization. Those tiles are
             | flushed to the framebuffer, potentially before they are
             | fully rendered, and potentially multiple times per draw
             | call depending on the vertex ordering. They do some
             | primitive reordering to try to avoid flushing as much, but
             | it's not a full deferred architecture.
        
               | [deleted]
        
             | monocasa wrote:
             | Nvidia's is a tile-based immediate mode rasterizer. It's
              | more a cache-friendly immediate renderer than a TBDR.
        
         | tomc1985 wrote:
         | I actually had one of those cards! The only games I could get
         | it to work with were Half-Life, glQuake, and Jedi Knight, and
         | the bilinear texture filtering had some odd artifacting IIRC
        
         | wazoox wrote:
          | Unified memory was introduced by SGI with the O2 workstation in
          | 1996, and they used it again with their x86 workstations, the
          | SGI 320 and 540, in 1999. So it was a workstation-class
          | technology before being a mobile one :)
        
           | andrekandre wrote:
            | Even the N64 had unified memory, way back in 1995.
        
             | nwallin wrote:
             | The N64's unified memory model had a pretty big asterisk
             | though. The system had only 4kB for textures out of 4MB of
             | total RAM. And textures are what uses the most memory in a
             | lot of games.
        
             | ChuckNorris89 wrote:
              | The N64's chip was also SGI-designed.
        
         | iforgotpassword wrote:
            | Was it the Kyro 2? I had one of these but killed it by
         | overclocking... Would make for a good retro system.
        
           | smcl wrote:
           | The Kyro and Kyro 2 were a little after the Dreamcast.
        
       | sh33sh wrote:
       | Really enjoyed the way it was written
        
         | GeekyBear wrote:
         | Alyssa's writing style steps you through a technical mystery in
         | a way that remains compelling even if you lack the domain
         | knowledge to solve the mystery yourself.
        
       ___________________________________________________________________
       (page generated 2022-05-13 23:00 UTC)