[HN Gopher] The genius of binary space partitioning in Doom ___________________________________________________________________ The genius of binary space partitioning in Doom Author : pgayed Score : 248 points Date : 2022-11-21 14:39 UTC (8 hours ago) (HTM) web link (twobithistory.org) (TXT) w3m dump (twobithistory.org) | pkilgore wrote: | Also discussed / covered in | https://news.ycombinator.com/item?id=21906051 | kazinator wrote: | It seems that a challenge in BSP is going to be artifacts, when a | surface, particularly a textured one, is chopped into pieces by | several planes. If this is not done very carefully, the cuts | might show, due to edge artifacts caused by the separate scan | conversion of the fragments. Certain shading tricks like Phong | may be difficult to apply in such a way that the result looks as | if the original undivided polygon had been rendered. | dmbaggett wrote: | Amazingly brilliant work, especially given the CPU capabilities | at the time. Carmack's use of BSP trees inspired my own work on | the Crash Bandicoot renderer. I was also really intrigued by Seth | Teller's Ph.D. thesis on Precomputed Visibility Sets though I | knew that would never run on home console hardware. | | None of these techniques is relevant anymore given that all the | hardware has Z buffers, obviating the need to explicitly order | the polygons during the rendering process. But at the time (mid | 90s) it was arguably the key problem 3D game developers needed to | solve. (The other was camera control; for Crash Andy Gavin did | that.) | | A key insight is that sorting polygons correctly is inherently | O(N^2), not O(N lg N) as most would initially assume. This is | because polygon overlap is not a transitive property (A in front | of B and B in front of C does NOT imply A in front of C, due to | cyclic overlap.) This means you can't use O(N lg N) sorting, | which in turn means sorting 1000 polygons requires a million | comparisons -- infeasible for hardware at the time. | | This is why many games from that era (3DO, PS1, etc) suffer from | polygons that flicker back and forth, in front of and behind each | other: most games used bucket sorting, which is O(N) but only | approximate, and not stable frame to frame. | | The handful of games that did something more clever to enable | correct polygon sorting (Doom, Crash and I'm sure a few others) | looked much better as a result. | | Finally, just to screw with other developers, I generated a giant | file of random data to fill up the Crash 1 CD and labeled it | "bsptree.dat". I feel a bit guilty about that given that everyone | has to download it when installing the game from the internet, | even though it is completely useless to the game. | PainfullyNormal wrote: | > None of these techniques is relevant anymore given that all | the hardware has Z buffers, obviating the need to explicitly | order the polygons during the rendering process. But at the | time (mid 90s) it was arguably the key problem 3D game | developers needed to solve. (The other was camera control; for | Crash Andy Gavin did that.) | | In your opinion, What is the key problem 3d game developers | need to solve in 2022? | chrisseaton wrote: | > None of these techniques is relevant anymore given that all | the hardware has Z buffers, obviating the need to explicitly | order the polygons during the rendering process. | | You can't mean that all the polygons in a game world are now | sent to the GPU, entirely relying on viewport culling and Z | buffer to remove the ones out of view? I'm not an expert but | I'm sure that's not true - doesn't Source and latest iD Tech | use BSP right now for example? | dmbaggett wrote: | Occlusion culling is still relevant and of course BSP trees | help with that as well. What I meant is there is no value in | sorting polygons any longer. (As far as I know; the last time | I worked on a game was 1997.) | gary_0 wrote: | It should be noted that virtually all renderers do _frustum | culling_ , meaning anything in the game world that is | guaranteed to be out the camera's viewing volume is not | rendered. Frustum culling is quite simple. _Occlusion | culling_ is usually done after frustum culling and is more | complex, and the method tends to vary depending on the scene | type and game engine, or is sometimes not done at all (modern | GPUs are so powerful that smaller or simpler game areas | render fast enough without occlusion culling). | Jasper_ wrote: | That's known as "occlusion culling", and it's still a bit of | an open problem [0]; a lot of games do indeed just send all | draw calls inside the frustum. Turns out a good Z-prepass is | pretty free and helps a lot with things occlusion culling | would normally help with. And deferred rendering helps even | more with things like quad overdraw from disparate objects, | as long as your material mask is mostly the smae. | | The Source Engine still uses has a pre-built BSP for surface | visibility. Source 2 has replaced this with a world-aligned | visibility octree [1]. | | [0] The common solution these days is a low-resolution depth | buffer rasterized on the _CPU_ with SIMD, because doing it on | the GPU would have noticeable latency... you 've probably | played a game where you turn into a twisty hallway and the | walls/object in there only show up after a few frames... | that's GPU occlusion culling at work. | | [1] https://developer.valvesoftware.com/wiki/Half- | Life:_Alyx_Wor... | teraflop wrote: | It's a bit more nuanced than that. The parent commenter is | correct that modern game engines don't _need_ to sort | polygons to render them correctly (with some exceptions e.g. | transparent objects), and doing so at a fine-grained level is | generally not worth the CPU cost. Especially since the | _bandwidth_ between the CPU and GPU can be a bottleneck, so | if possible you want to only upload the level geometry once | instead of having to rearrange it on every frame. | | Game engines can of course still do their own _coarse- | grained_ occlusion culling, in order to reduce the amount of | geometry that needs to be processed per frame. And there can | still be a benefit to _approximately_ sorting objects: if you | draw objects in approximate front-to-back order, then a | shader can do "early depth testing" and skip the expensive | shading computations for pixels that are occluded by already- | drawn polygons. | gary_0 wrote: | Even for modern games that don't care about depth ordering, | they still tend to render first-person weapons or third- | person characters first, and the sky last, just because | it's an easy way to skip a bunch of overdraw, so why not? | hmry wrote: | Source was released 18 years ago. | | The current Source 2 does not use BSP anymore. | gary_0 wrote: | I think BSP was kind of obsolete even when it was used in | Source 1, but was left in because removing it would have | required changing a lot of code for no real benefit. The | blocky brush-based level geometry system from GoldSrc still | remained, but they added a lot of higher-fidelity features | on top of it. IIRC, the level editing pipeline code for | things like PVS and baked lighting were still based on code | from GoldSrc. Better, newer techniques existed but what | Valve had wasn't broken, so why fix it? | skocznymroczny wrote: | From what I've seen in the modern game engines, the current | state of the art seems to be executing a compute shader which | does frustum and occlusion culling on GPU and dynamically | generates an indirect argument buffer which gets drawn in | several or a single indirect draw. | | This article contains a basic implementation of such idea in | Vulkan - | https://vkguide.dev/docs/gpudriven/gpu_driven_engines/ | CyberDildonics wrote: | _I'm not an expert but I'm sure that's not true_ | | If you're not an expert why are you _sure_ it 's not true. | Most games render a lightweight pass to do all rasterization | to a g-buffer, then do the expensive shading later. This | separates visibility from shading. If fragments are already | occluded by the z-buffer they can be skipped as soon as they | can test their z value against the buffer. | chrisseaton wrote: | Do you have my comments bookmarked or something? You reply | to me with something negative to say far more frequently | than if you came across them randomly. Can you give it a | rest if you can't manage to tone down the snide please? | | As other comments say, including the original comment | author, game engines actually do still rely on their own | space partitioning to reduce draw calls. Source 2 just does | it quad rather than binary. Source is still used in | actively developed games and is still BSP, so it's not true | that the techniques are not relevant. | CyberDildonics wrote: | It's strange to get so upset about negativity when you | replied to someone saying "I'm sure that's not true" even | though you were talking to an expert who was making a | general statement about the way rendering works. | chrisseaton wrote: | Just give replying to me a rest will you, if you don't | like my comments. | CyberDildonics wrote: | The person you replied to made an informative comment and | you seemed to want to poke a hole in a generalization | that people could learn from, which a lot of other people | have pointed out, so I don't know why you would get upset | at me for explaining more about how rendering works. | dicroce wrote: | yeah this didn't seem right to me either | zasdffaa wrote: | > sorting polygons correctly is inherently O(N^2), [...] | because polygon overlap is not a transitive property (A in | front of B and B in front of C does NOT imply A in front of C, | due to cyclic overlap.) | | Well ok but I don't get this: | | > This means you can't use O(N lg N) sorting, which in turn | means sorting 1000 polygons requires a million comparisons -- | infeasible for hardware at the time | | ISTM you CAN'T sort it full stop because of cycles - so what | kind of sort (never mind O(N^2), _any_ sort) can order cycles? | They can 't. | kvark wrote: | > None of these techniques is relevant anymore given that all | the hardware has Z buffers | | Not true if you consider transparent objects. Rendering with | order-independent transparency is still a hot topic without a | clearly winning solution on GPU. | | Web browsers have lots of semi-transparent rectangles, which | can be transformed under "preserve3d" context. This is a | classic case of effective BSP that is actual. (background: | implementing BSP in WebRender a few years ago). | | https://github.com/servo/plane-split | bodhiandphysics wrote: | As ray tracing becomes more common, I think we'll see a big | return of various BSP trees in games. BSP is still the crucial | element in offline rendering. | detritus wrote: | > Finally, just to screw with other developers, I generated a | giant file of random data to fill up the Crash 1 CD and labeled | it "bsptree.dat". I feel a bit guilty about that given that | everyone has to download it when installing the game from the | internet, even though it is completely useless to the game. | | Wonderful! THIS is the kind of silly nitty gritty detail I want | to hear more about - particularly the whys and wherefores of | the decision, largely-unconsidered as it may well have been. | Puts me in mind of the semi-affectionate rivalry between | different demo/crack houses in the eighties and nineties, | engaging in largely unseen conflict, all submarine-like :) And, | if you're reading this - know that this thoroughly un-tech nerd | has read all of your Crash Bandicoot breakdowns, no matter how | arcane they might seem to me :) | jetbalsa wrote: | Great comment! That poor poor bsptree.dat will forever live on. | I wonder if anyone really sat down and tried to bash their head | against bsptree.dat | YesBox wrote: | I'm developing an isometric city builder game. I've figured out | how to depth sort everything on the CPU efficiently, though | there were some ugly workarounds I had to implement. Suffice to | say, for anything outside of buildings, sorting on the screen.y | position of the texture works perfectly (so long as each asset | is one tile in size, or broken up into multiple tiles). | | But, I am starting to implement vehicles, first time adding | assets that are multiple tiles in size. It would be really nice | if I could create one asset for the vehicles, but the sorting | on screen.y will not work for 3D rectangles, so I am breaking | the up into multiple tiles... | | Do you think BSP trees will work with thousands of moving | assets? i.e., I would have to recreate the BSP tree every | frame. | kqr wrote: | How much do things move between two frames? Would you really | have to recreate the full tree or just reorder a small number | of leaves? | fragmede wrote: | there's probably some optimization that you can do, but you | can't know that the second frame isn't the conclusion to a | move from 10 frames ago that is what reveals a giant area. | frame 0: open door frame 10: still can't see past | edge of door frame 11: door opens enough to see big | new area | | between 10 and 11 it turns out there's a huge difference, | even though they're only two frames apart. | YesBox wrote: | Somewhere between one to $tile_size pixels. Tile size | depends on the zoom level (I'm using SFML, had to add a | separate sprite sheet for each zoom level cause SFML's | built in zoom out method is awful). | | This is the first time of hearing BSP, and I read most of | the OP's article to have a very basic understanding how it | works. | | Since this is a tree, reordering N elements would be | approach N^2 complexity, would it not? (edit: I assumed you | would have to find each node from the root, which could | very well be a bad premise). | ant6n wrote: | Assuming everything touches the floor, and floor is flat, | sort based on bottom y coordinate of each object | YesBox wrote: | This works perfectly for isometric cubes, and is what I do | currently. | | If you imagine two long rectangles, one in each of your two | hands, and pretend they are two cars passing each other in | an isometric world, you will see that pretty soon, one | car's screen.y position will be below the other before it's | clear of the vehicle which should actually be occluding | part of it. | pfedak wrote: | Do you know how much changed for the sequels? From some reverse | engineering, Crash Warped relies on depth bucketing for dynamic | triangles, while level geometry is streamed from the disk based | on the camera position, already at the appropriate LOD and | sorted and bucketed. Is the BSP logic you're talking about | real-time, or part of a similar pre-processing step? | HellDunkel wrote: | The beauty of the BSP tree is that it used recursion. When first | approached this is a challenge to wrap your head around- very | similar to the quicksort algorithm. Great read. | | These days octtrees more or less replaced BSP trees i guess. They | are easier to handle and work better with more polygons. | j7ake wrote: | Brilliant synthesis of the problem and why it was difficult. | | My feeling is that Carmack is really good at speeding up | important parts of video game performance to achieve | qualitatively different gaming experience. | | Sometimes I wonder if Carmack was born 30 years later, when | computing resources make such advances less needed, he would | still be as successful? Perhaps he would go on to improve | performance of other problems eg virtual reality and deep | learning..... | NayamAmarshe wrote: | Doom is truly an amazing showcase of engineering achievements and | going the extra mile for optimizations. | | Games these days just depend on raw hardware and optimization | takes a back seat. Games like Plague Tale Requiem, while look | fantastic, are equally bad at the engineering part. How is an RTX | 3080 the minimum recommended spec for 1080@60fps, is beyond me. | Sadly, people these days do not care about software optimization | and those who do are bombarded with comments like "Get better | hardware". | HellDunkel wrote: | The truely great games do. I am sure the developers of BOTW | were very serious about optimization and you can actually feel | that playing the game (on switch). It's a masterpiece just like | doom. | _whiteCaps_ wrote: | Wow, this brings back memories... | | I took a 3D computer graphics course in university in the early | 2000s, and the final project was to make a game. IIRC we could | use as many open source libraries + assets as we wanted, but we | had to write the renderer. | | Our professor was a big Buffy the Vampire Slayer fan, so we made | a Buffy game. My team thought we'd have some easy points for | that. When we presented our plan, he says "I know you know I'm a | Buffy fan, so this better be good"... Ooops. Time to get to work. | | We used the Quake map format, which is BSP, and our renderer | worked pretty well. We used Quake 2 character models, which | worked pretty well on its own. I think it was because someone had | made some vampire models for it. | | When we integrated the two, the performance was terrible. Like a | slideshow. After a brief panic, we realized that the Z axis was | inverted between the two modules we'd written, and the guy that | put them together was inverting the model _every frame_. After | inverting it once when the model was loaded, they worked pretty | well together. | | We added a simple hitbox (a rectangular prism at the maximum | points of the model), and had a fun little game after that! | diskzero wrote: | I implimented a BSP renderer in my somewhat unremarkable CDROM | game Dark Fiber [1] in 1995 or so. I also remember seeing the | algorithm in Foley and Van Dam, but it wasn't until a | conversation with John at some conference/tradeshow that I went | for doing the engine using BSP. It was a lot of fun figuring out | how to optimize the algorithm and I also spent a lot of time | using Mathematica figuring out the equations. Well, that was at | least my excuse for buying a fairly expensive piece of software | as a small gaming startup. | | https://archive.org/details/DarkFiberWindowsMac | tasty_freeze wrote: | BSP (binary space partitioning) was a well known algorithm, not | something Carmack picked out of obscurity. It is well covered in | every edition of Foley and van Dam's bible "Computer Graphics". | The arcade game "I, Robot" from Atari (1983) used BSP to render | the polygons back to front -- there was no z-buffer. | | That isn't to deny that Carmack was brilliant. But him using BSP | isn't some masterstroke of genius in itself. | throw_m239339 wrote: | > The arcade game "I, Robot" from Atari (1983) used BSP to | render the polygons back to front -- there was no z-buffer. | | wow 1983 | | https://www.youtube.com/watch?v=EHkwdvfXHJc | bitexploder wrote: | Carmack even discusses this on the Lex Fridman podcast. Carmack | basically had high school level math around that era and was a | voracious reader of every scrap of game programming lore he | could find. And outside of it. He is a very intellectually | honest guy and it is refreshing to hear him talk about how he | worked in those days. I came up similar to Carmack and he is my | programming hero, so, I have a lot of admiration for how he | managed to do things. | gibolt wrote: | Long video (5hrs+), but well worth a watch/listen. Can | increase playback speed to save time | | https://youtu.be/I845O57ZSy4 | beebeepka wrote: | I spent several game nights shooting guys in TF2 while | listening to this interview. It is long. The channel has a | few good interviews with biology guys as well | bitexploder wrote: | It was easily one of my favorite podcasts of the last few | years and Lex usually drives me a little crazy. | phkahler wrote: | I, Robot used a lot of great tricks, but I'm gonna have to | consult John Manfrida on the use of actual BSP trees. | | They did use normal vectors to determine visibility and may | have done simple back to front within objects using that, but | the playfield is a grid and mostly rendered back to front. | lordfrito wrote: | FYI I'm the author of the original I Robot emulator from | 1998. This is the one that interprets the mathbox, which | means I only know what the mathbox does at the highest 'API' | level (can't 100% speak to what the mathbox microcode | actually does) | | Paul is essentially correct. It's mostly back to front, with | some assumptions. I can confirm quite a number of tricks were | used, not entirely sure they qualify as a full BSP (although | it does involve precompiled mesh optimizations) but then | again I'm no BSP expert (just an I Robot expert). | | The rasterizer draws a list of polygons (full polys, not just | triangles) which must be fully computed before sending to the | rasterizer. This "display list" is drawn in order. There is | no Z buffer, so everything draws on top of what was already | there. So it's important that the display list have the | polygons somewhat sorted. Also the polygons are drawn as is, | which means rotation/projection is done before being written | to the display list. | | The camera was fixed (can't yaw) as to simplify math. This | means you are always looking straight ahead in Z axis, which | allows for all sorts of optimizations. It also makes it | trivial to sort objects back to front, simply by checking z | coordinate. | | Next, the hardware only supports 16 moveable "3D mesh" | objects that can be drawn per frame. With an object count | this low it's trivial to sort back to front without consuming | many CPU cycles. 6809 index instructions make this part a | breeze. | | The game playfield is "built on the fly", essentially | mirroring a 16x16 (16x24? 16x32?) playfield tile array in RAM | (each byte encodes tile height), and building each cube face | polygon on the fly. | | Because of the fixed camera, no real sorting or culling is | necessary when creating the display list at the highest | level. You simply draw from back of the screen (furthest | playfield row) to front (closest row). This process never has | to change, it's a linear traversal of playfield array memory, | and it always works as long as the camera doesn't yaw. Just | draw each row on top of the last one. Guaranteed to not cause | problems. | | To draw a row (16 tiles, 15 visible) all of the | front/top/side polygons are computed and written to the | display list. Again, because Z axis is fixed it makes | creating the polygon coordinates trivial (you always see | front of cube, no culling needed). Once that's done you draw | any 3D meshes that are in the row (objects sitting on the | playfield cells). Then you move to the next row, and the | next, and then you're done. | | So at this high level, no real culling is done... things | essentially get drawn in order of depth, which always works | as long as you don't yaw the camera. | | Now... where things get interesting is the 3D mesh objects | themselves are stored in a sort of "pre-computed" way to | optimize surface removal (not sure if this was done by hand, | or by an automated/compile process). It's not so much a tree | structure, as it's actually done through specialized | instructions. I suppose the jumping/branching is the | equivalent of tree traversal. | | Basically the meshes are held in ROM, and are really just | specialized subroutines, written as a series of simple | instructions for the Mathbox. To keep this writeup simple, | treat the Mathbox as being able to execute two instructions: | 1) write a polygon to the display list 2) "relative | jump to instruction" if normal vector dot product points | towards/away from camera | | A 3D mesh object might be coded in ROM like this | Mathbox3DMeshObject: if normal vector dot product < | 0, jump to X ; pointing away from camera write poly 1 | to display list ; forward facing polygons write | poly 2 to display list ... Jump to Y | X: write poly 10 to display list ; back facing | polygons write poly 11 to display list ... | Y: if normal vector dot product < 0, jump to Z ; | sideway facing? write poly 20 to display list | write poly 21 to display list ... Z: | write poly 99 to display list ; always drawn ... | END | | A mesh object could code any number of jumps, to cover side | facing objects, corner cases, etc. Note I'm glossing over the | fact that the mathbox would also multiply the coordinates by | the current rotation matrix, project to X/Y, perform shading, | etc. before emitting to the display list. | | As code it's not a real tree structure, although executing | the code behaves a lot like traversing a tree. I guess this | is sort of a BSP? Can anyone chime in here? | | It was pretty clever how it worked out. The mathbox executes | the instructions, and creates draw instructions in order. | Polygons get skipped by virtue of the fact you're | periodically checking normal vectors, and skipping around to | different instructions as needed. | | Edited: clarity, typos | phkahler wrote: | Sounds like the mesh objects are nearly equivalent to small | BSP trees. | | Thanks John! | lordfrito wrote: | It definitely does work like a BSP tree. | | The issue to me is that a tree is a data structure, but | the meshes in I, Robot are encoded as literal subroutines | of machine instructions. There is no tree to traverse, | the Mathbox simply executes the instructions as it | encounters them. | | These aren't even "interpreted" (fake) instructions for a | software VM (like Sweet16). They're literal instructions | for the custom Mathbox microcode. The Mathbox executes | the instructions, and outputs polygons to the display | list as told. | | It's a lot like a compiler optimization that unrolls a | loop inline... Can you even call it a loop once it's | unrolled? In this case, it's as if a BSP was created, | then turned into literal instructions for a custom | machine. Many times the polygons were duplicated in the | code (in-lining polys works faster than branching back). | It blurs the line between code and data. | | I'm really blown away by the sophistication of the I, | Robot Mathbox... The microcode was exceedingly complex, | much more than just some fast matrix math (like | Battlezone etc). It executed instructions to compute dot | products, which would potentially result in X/Y/Z points | being multiplied by a rotation matrix, projected, and | written as polygons to the display list, for later | rasterization by a different processor altogether. | | Anyhow it's fun to revisit this stuff :) | chowells wrote: | Sure, that's still a BSP tree. It's just represented as | code instead of data. The duality of data and code is a | long-established optimization path for static | information. | JohnBooty wrote: | These aren't even "interpreted" (fake) | instructions for a software VM (like Sweet16). | They're literal instructions for the custom | Mathbox microcode. The Mathbox executes the | instructions, and outputs polygons to the | display list as told. | | Wow, that's amazing, I love that. Thank you so much for | explaining this. | corysama wrote: | 20 years after playing around with your emulator, I get to | tell you that the "metafile dump to Window's clipboard" | feature is just so nifty it still sticks out in my memory | today. | lordfrito wrote: | Glad you appreciated it! Not many people knew it was | there. I made a custom T shirt from some of the clipboard | vector output. | | FYI I've ported the emulator to C#, it runs in even | higher resolutions than before. [1] | | [1] https://github.com/manfreda-dot-org | tasty_freeze wrote: | To be fair, I didn't ever look at the code. I took the word | of Dave Sherman, the guy who designed the hardware for it | (though the "math box" was lifted from another game, probably | Battlezone). | | Here is one more interesting thing about the display that | Dave told me. DRAM was expensive, and at the time the I, | Robot game was much more expensive than most of the arcade | machines they put out. To save on DRAM, they used fake double | horizontal resolution. | | This is a memory from a conversation 30 years ago, but there | is 3b per pixel to specify color, and one bit to indicate if | the edge should be pushed out half a pixel. That is, when | rasterizing the polygons, they saved one fractional X bit in | the pixel. In the interior of polygons it had no effect, but | when generating video, if pixel at offset (x+0) and pixel at | offset (x+1) had a different color index, the fraction bit | from pixel (x+0) was used in the video timing generator to | place where the color transition would happen, doubling the | apparent horizontal resolution. When more than one edge | crossed a pixel the information isn't that useful, but such | pixels are a tiny fraction of all edge pixels, so overall it | was a big win. | lordfrito wrote: | You know reading this caused some forgotten neuron in my | brain to fire. | | When I was reverse engineering the thing, I seem to | remember there being a bit related to color/shading that I | couldn't figure out what it did (I could see that it was | being used). | | The system had a 64 color palette. The software broke that | up into 8 sub-palettes to give you for 8 main colors with 8 | shades each (0-7 first color, 8-15 second color, etc.) | | Polygons could call out colors directly using full palette | index. Most polygons "faked" shading, they really just | specified different shades of the same color for effect. | For example, the dodecahedron / meteors in the space waves | aren't real-time shaded, just colored to look that way. | | However the hardware could do shading, which cost an | expensive dot product. So it was used sparingly. To do | shading, the polygon used a 3 bit color "index" to call out | one of the 8 sub-pallets. Then you add an offset of 0-7 | based on the dot product of the normal vector to apply | shading (I used 8 x dot product to get you an offset from 0 | to 7.9999). | | I recall there being an extra bit there (4 bits instead of | 3) but couldn't figure out what that last bit was for (it | seemed like a "1/2 bit" to me). It looked like it was being | used, but didn't make sense, so I stripped it off, and it | didn't seem to affect anything so I forgot about it... till | now. If I'm remembering correctly, this might literally be | a "1/2 step" to add between shades? That can't be right can | it? | | Another way I read what you wrote is that it could be | considered an "anti-aliasing" bit to do some alpha blending | at the edge of a polygon to fake higher resolution? Maybe | that's a better way to read it? | | My memory is probably not 100% here... Would love to hear | more. | phkahler wrote: | I had read about that fractional pixel online, and the bit | does seen to exist in the vram (3 bits color, 3 bits | intensity, one other bit). | | I don't think it ever got implemented all the way through. | If it works as described, it should cause the right edges | of polygons to appear with higher resolution than the left | edges, particularly on black (blank) background. I've got | an I, Robot in my living room and have looked for this | visual difference and can not see it. Doodle City even | provides the opportunity to move and rotate objects, so | even with direct control of object and orientation I can't | see it. | | Apparently there were other things that never got finished | with the game too. | toolslive wrote: | Yes, If you were reading DrDobbs at the time, you would know | that they actually tried every algorithm in the computer | graphics bible (and they reported about it in that magazine). | Some algorithms were in fact better than BSP, but were a bit | instable performance wise (better avg performance, but worse | worst case) | Teslazar wrote: | I think you're referring to Michael Abrash's Ramblings in | Realtime articles which were written about Quake. When I was | a teen I used to carry these around with me and read them | over and over while I was implementing my own engine. | | Copies are available here: https://www.bluesnews.com/abrash/ | toolslive wrote: | You're right: Quake. not Doom. There are 2 problems getting | old: the first one is the memory that goes, ... the second | one I forgot. | winrid wrote: | A lot of this is covered in his Graphics Programming Black | Book too. | johnvanommen wrote: | I was thinking the same thing. | | The author references the Michael Abrash book, and said it | was from "the late 90s." | | But IIRC, Abrash was writing about BSP in Doctor Dobbs | Journal in the early 90s. | | IE: Abrash inspired Carmack, not the other way around. | | In 1992 I was really keen on learning to make arcade style | games on the PC, and I studied those Abrash articles like | crazy. | pavon wrote: | Yeah, when I was inexperienced what impressed me about Carmack | was all the "cutting edge" techniques and tricks he used to | push hardware to its limits. As I've became more experienced | what impresses me is how iD was able to crank out so much | software so quickly, have time to research these more advanced | methods, quickly triage which were promising and which were | not, and get them working and well optimized in production | software. | | I can read research and incorporate it into my code, but it | takes me a while, and I usually have multiple dead-ends where | it isn't until after I've tried using a technique with real | data that I realize it has weaknesses the researchers didn't | point out and which can't easily be fixed. I can crank out | quick code, but it isn't going to be advanced or optimized. I | struggle finding the right balance of how to trade between | these three things, and yet Carmack and iD were able to squeeze | in all three at once. | kudokatz wrote: | > As I've became more experienced what impresses me is how iD | was able to crank out so much software so quickly | | > usually have multiple dead-ends where it isn't until after | I've tried using a technique with real data that I realize it | has weaknesses the researchers didn't point out and which | can't easily be fixed | | If you listen to John Romero himself telling the story of id | and the fast iteration, it was "due to having, basically, 10 | years of intense game development experience prior": | | https://youtu.be/E2MIpi8pIvY?t=543 | | ("it's also due to the first principle we had - no | prototypes") | | A theme I tend to see is that a lot of folks with super | output appear to others like icebergs - a LOT of effort that | nobody can see, and a tiny peak at which everyone marvels. | ww520 wrote: | Carmack work long hours everyday, really long hours. | HelloNurse wrote: | Using a BSP tree to render front to back rather than back to | front is also quite obvious: it's the same visit in the | opposite order. | 0xAFFFF wrote: | I you guys are interested in how Doom was implemented, I cannot | recommend enough Fabien Sanglard's book, Game Engine Black Book: | Doom. | | https://fabiensanglard.net/gebbdoom/ | teaearlgraycold wrote: | Also the first one on Wolfenstein | phkahler wrote: | I was studying computer graphics at the time. The book we used | was "Computer Graphics Principles and Practice" I don't recall | which edition. BSP trees are covered in the book, and like the | article says, had been written about more than a decade prior. | | What Carmack had was the ability to read research papers and | translate them into working code. I can do that too, but it does | seem to be a less common skill in the software world. | JohnBooty wrote: | What Carmack had was the ability to read research | papers and translate them into working code. I can do | that too, but it does seem to be a less common skill | in the software world. | | Well, that's the argument for Carmack being merely _very very | very good_ and not a "genius." | | Here's the argument _for_ him: | | There were a lot of exceptionally talented people working in | the games industry, and for four+ generations of PC hardware | (the Keen, Wolf3D, Doom, and Quake eras) Carmack's engines were | years ahead of everybody else's and had an absolutely massive | impact on the industry. | | Of course, the term "genius" is nebulous and it can mean | whatever we want it to mean. | | However, describing him as merely a guy who implemented other | peoples' algorithms may be true in a literal sense but really | misses the big picture IMO. | bluedino wrote: | > What Carmack had was the ability to read research papers | | And he had to read actual papers, too. I could see him making | copies at the library. No Google or StackOverflow back then. | jasonwatkinspdx wrote: | Not quite when Doom came out, but within a few years you | could find a surprising amount about graphics on grad | students' homepages. I remember learning about PVS, Portal | based rendering, and a number of similar things from the | original paper author's homepage. | bitwize wrote: | Not just working code -- code that worked well in a real-time | video game on 386/486 class hardware. That's the genius bit as | far as Doom's code is concerned. | | Edit: clarify | johnvanommen wrote: | That was the thing that was so revolutionary about Abrash: he | figured out how to create graphics that rivaled arcade | hardware on the IBM PC. | | Most importantly: _without_ a GPU. | | At the time, all of the really "flashy" arcade games used | GPUs that cost around $1500 in today's dollars. Abrash pulled | off this stunt with no GPU. | | I'd argue that the entire reason the Xbox is called "Xbox" is | because of "Mode X", invented by Abrash. | bitwize wrote: | The Xbox is called Xbox because Microsoft wanted to build a | console around Windows APIs, internally calling it a | "DirectX box". | | And DirectX has nothing to do with ModeX; it's a family of | APIs: DirectDraw, Direct3D, DirectSound, etc. | coliveira wrote: | When Carmack did his work, the computers he had were 386/486 | class. So the distinction you're making doesn't make sense. | Working code is code that works in the computer you have at | the time you're writing the software. | bombcar wrote: | Doom was technically developed on NeXT workstations, but it | had to perform well enough on a 386 (even if in a reduced | window). | royjacobs wrote: | There is a distinction between 'working code' and 'working | code that runs at interactive frame rates in a severely | memory-constrained environment'. | sebstefan wrote: | > The really neat thing about a BSP tree, which Fuchs, Kedem, and | Naylor stress several times, is that it only has to be | constructed once. This is somewhat surprising, but the same BSP | tree can be used to render a scene no matter where the camera | viewpoint is. The BSP tree remains valid as long as the polygons | in the scene don't move. This is why the BSP tree is so useful | for real-time rendering--all the hard work that goes into | constructing the tree can be done beforehand rather than during | rendering. | | I can't visualize why it's true | whatshisface wrote: | Here's my explanation of BSP trees, written as much for my own | benefit as anybody else's (maybe it will help you): | | To start off with, we have a basic geometric fact: if you have | a surface that divides space into separate regions (this can be | a closed surface like a sphere, or an infinite surface like a | plane), a line of sight can't cross between the two partitions | of space without passing through the surface. If the surface is | convex, a line of sight can only cross it once. (That's one | definition of convexity.) | | Now, suppose that you have a lot of objects, and none of them | are on both sides of the boundary. You can ensure that | condition by splitting any object that crosses the boundary in | two. No matter where the line of sight begins, and no matter | what direction it moves in, the boundary then offers a | definite, albeit partial ordering of events: objects on the | same side of the boundary may intersect, then the line will | intersect the boundary, then the objects on the other side may | intersect. | | If you find an intersection with an opaque object on the same | side of the boundary, you never need to check any objects on | the other side, because the order of events above. That is a | big benefit when about half of the objects in front of you are | on your side of the boundary: if you are right in front of the | boundary, or if there's nothing on the other side of it, it | doesn't help a lot. | | Checking all of the objects on one side of the boundary is the | same kind of problem as checking all the objects, and that | means you can speed those up too by having sub-boundaries for | the sub-problems. That gives you the recursion and makes it a | tree. If the reasoning based on boundaries continues until | there is only one object in a sub-problem, you don't need to | sort lists based on distance - meaning that the whole ordering | problem can be solved just with partition trees. | | Choosing a good tree is a difficult problem because, unlike the | fact that the ordering problem can always be solved, the amount | that an individual boundary division helps depends on where you | and the objects are in relation to it. Planes are generally | used instead of other things like spheres because splitting a | polygon along the edge of a sphere doesn't result in another | polygon, while splitting it along a plane does. | bombcar wrote: | https://developer.valvesoftware.com/wiki/BSP may help - the key | is you have divided it into _convex_ spaces. | [deleted] | kazinator wrote: | Because the input to the BSP traversal is the view position. | | Every node in the tree is associated with an infinite plane | that divides space in half. Stuff on one side of the plane is | in the left subtree; stuff on the other side of the plane is in | the other subtree. | | The viewer is on one side of the plane or the other (or maybe | on it, oops). | | We know that stuff on the other side of the plane cannot | occlude the view of stuff on the viewer's side of the plane. So | for instance if we're doing back-to-front rendering, we would | draw the material from the other side of the plane first, then | stuff from this side. (Subject to the content being rotated | into the correct view and clipped to the view frustum.) | | There is no polygon in the tree that intersects/straddles these | dividing planes, which is the point. When the BSP is | constructed, whenever these planes cut through some input | polygon, it is cut into two pieces that get assigned to | different subtrees. There are then situations in which those | pieces will not get drawn at the same time though they belong | to the same logical surface. | | The planes which subdivide space in half are independent of | each other; they go every which way. So if we have nodes like | a b c d e f g | | it is possibly the case that plane b cuts through the polygon | sets f and g, and maybe some of them even straddle plane b. | None of that is relevant to them because the b subset is on the | opposite side of common ancestor a, the b plane subdivision is | concerned only separating d from e. Binary space partitioning | is not actually a mathematical partition (exhaustive division | into non-overlapping parts) of the space. The set of poygons | gets partititoned, of course; not the space itself. | yojo wrote: | For people not reading the whole thing (reasonable, it is long): | the author comes to the conclusion that it was not so "genius" | after all, given the prior art, and that he read it in a computer | graphics book. He still gives Carmack props for doing his | homework and getting a working implementation built in a novel | game engine. | jamestimmins wrote: | One thing confuses me about this: does it require a separate tree | per-axis? | gary_0 wrote: | It doesn't split by axis, it splits by planes that can be | oriented however. Planes split a space into 2 half-spaces, | hence the "binary" part. Only the older Wolfenstein-style | rendering cared about the axis. | jamestimmins wrote: | That makes sense if I think about everything existing on a | single axis (objects in front of each other on the z-axis). | | But if you then turn 90 degrees, then suddenly a bunch of | those z-axis items are side by side, and order is based on | the x-axis of my original perspective. So I don't understand | how a binary structure accounts for that perspective shift. | | Edit: For clarification, my understanding is that there's a | single tree built per-level in advance. If I'm mistaken and | it's re-building the tree 30 times per second, the lack of | axis concerns makes sense, but the article seemed to indicate | that that's computationally prohibitive. | gary_0 wrote: | The tree is built once. The tree can be traversed from any | point in the world because each node stores which plane it | used to split, and it is easy to compute which side of the | plane you are viewing. | | Have you tried reading | https://en.wikipedia.org/wiki/Binary_space_partitioning ? | It shows how a BSP is built and traversed for rendering. | jamestimmins wrote: | Interesting. Thanks for sharing. I think i need to dig | into that article to try to understand this better. | isolli wrote: | I'm reminded of the render algorithm in Comanche, the 1992 video | game [0]. I find it remarkably ingenious, and it certainly felt | incredible at the time. | | [0] https://github.com/s-macke/VoxelSpace | speed_spread wrote: | I've played countless hours with the A-10 simulator which I | believe used the same "voxel" technique. | atan2 wrote: | Really? I am not aware of such game. Do you remember the A-10 | flight sim you played that used voxelspace? | samlittlewood wrote: | Quite a few games (eg: Starfox) in the 90's would use BSP in a | slightly different way - purely within each object. The tree | would be generated offline (mechanically or by hand), and would | often be some sort of branching bytecode (test this, draw that | etc.) | | It was very useful to be able to use one plane test to discard a | whole bunch of geometry, eg: a surface plus its detail, or | everything visible through a window. | | This still left the problem of sorting between objects - mostly a | depth sort was just fine. Special cases like weapons on | hardpoints, or vehicles in hangars could be handled by having | some sort of 'call submodel' for spaces within the parent model. | Beyond that, just dial up the hostile firepower until players | stop complaining about rendering artefacts. | bigmattystyles wrote: | I thought this was going to be about all the small hallways in | Doom where you have to close one door before opening another - | this was done so that the engine would never have to render more | space than 1 room at a time. I don't know if that's true, just | something I've often heard repeated. | tboughen wrote: | I played Doom, Doom 2 and Final Doom all the way through about | 20 years ago - I can't recall a single instance of this kind of | level design. Doors stayed open once opened. Maybe it was done | with corners/sight lines instead? To be honest I remember some | vast areas rendered, especially when accessing secrets. | jonny_eh wrote: | Doors definitely auto-closed behind you. | sedeki wrote: | What is a good learning path (books, videos, ???) to take in | order to understand modern graphics programming? At the level of | Vulkan or Metal, say. | | I have had a bird's eye exposure to shader-based OpenGL but don't | feel I have a good intuitive understanding of how the GPU | operates. | newsclues wrote: | Graphics Programming Black Book by Michael Abrash 1997 | bombcar wrote: | I recommend this even though it's now 20+ years old, it's | accessible enough that you can get a foothold. | | Alternatively, get the Black Books for Wolfenstein and Doom, | and read them first. | dceddia wrote: | This tutorial is well regarded for learning the ins and outs of | modern OpenGL including how the GPU pipeline works, which I | think translates pretty well to Vulkan or Metal (at least the | concepts anyway) https://learnopengl.com/ | coldpie wrote: | Start with Foley & van Dam to lay the groundwork & terms of | art, then grab whatever API tutorial you like and get started. | You'll find plenty of further resources once you're more | familiar with the field and start discovering your interests. | Graphics programming is a seriously deep discipline and will | take years of study to start doing real work in. But it's a | super cool field with plenty of work on both the research & | practical implementation sides. | i_like_apis wrote: | Include: Ericson, Real-Time Collision Detection | https://www.goodreads.com/book/show/620505.Real_Time_Collisi... | Waterluvian wrote: | I think the "sit and read research papers" part is often omitted, | to our own peril, from the romanticized image of the genius | computer hacker of that era. | ddeville wrote: | Carmack himself describes it well in the recent interview with | Lex Fridman https://youtu.be/I845O57ZSy4?t=8961 | random314 wrote: | Yet another Carmack worship story. He is a great engineer, but | let's not overhype this. | | It started with the legendary square root function. Which lost | its legendary status once it turned out Carmack wasn't the | author. | | Now it is about standard BSP technique from Computer graphics | text books from that era. | | There is also hype about Carmack solving AGI by 2030, when he is | yet to publish a single widely cited AI paper. | | EDIT: | | I am guilty of reading the first couple of paragraphs and | assuming this was Carmack worship. My bad! Leaving the original | comment, mostly as is. | EamonnMR wrote: | The story of tracing the magic number back is really fun and | it's interesting to see how the trade of computer graphics | programming was passed down from one company to another. I | don't think it's legendary status was at all diminished when it | turns out not to have been Carmack's invention. | | https://www.beyond3d.com/content/articles/8/ | | https://web.archive.org/web/20120920120948/http://blog.quent... | dmbaggett wrote: | As someone working at the same time as him on similar problems, | I honestly and sincerely disagree. At the time my reaction was | "this is space cadet stuff". 486 hardware was really slow, and | the idea of doing any kind of real computational geometry back | then was incredibly far-fetched. Until he showed it wasn't. | ajuc wrote: | Carmack is a great programmer because he consistently produces | solid, fast code, not because he has groundbreaking ideas. | | It's similar with Linus. He didn't invented monolithic kernels, | UNIX, distributed version control. He "just" wrote good | implementations. | | Turns out brilliant ideas are overrated in programming, what | matters is execution and daily grind. | FartyMcFarter wrote: | > Turns out brilliant ideas are overrated in programming, | what matters is execution and daily grind. | | You also have to be good at finding the brilliant ideas that | are worth using. | manv1 wrote: | You need to understand the problem, understand what you're | trying to do, and understand how this abstract solution | maps to your problem. | | Then you have to implement it. | smoldesu wrote: | FWIW, the inverse square root function wasn't popularized by | Carmack. Most people shared it because they were amused by the | confused comments surrounding it, and how it demonstrated an | extremely vertical slice of optimization. | | Treating anyone like a rockstar is a bad idea, but Carmacks and | Wozniaks deserve the praise more than the Zuckerbergs and Jobs' | out there. | Ygg2 wrote: | What do you mean not worship Jiu-Jitsu practicioner, Luddite | Nemesis, Jace Hall asphyxiatior, and master of the Anti-Life | equation - John Carmack? | JustSomeNobody wrote: | Yeah, I don't think we read the same article. This isn't a | Carmack worship article. | royjacobs wrote: | Let's not underhype it, either: It seems you are | underestimating the impact that Doom had on PC gaming when it | came out. Using BSP trees to perform zero overdraw rendering | definitely was not something that had been done before on | consumer level hardware and it can definitely be argued that | applying them this way for Doom was a stroke of genius. | coliveira wrote: | You're setting the bar for "genius" too low, which | unfortunately is commonplace in computer programming. If you | take something that already exists and apply it to your code | I don't call it genius. Hi quality programming, yes, but | genius no. | royjacobs wrote: | I didn't realise I was not allowed to set my own bars, | though :) | cwillu wrote: | From the Carmack worship story: "So Carmack was not the first | person to think of using BSP trees in a real-time graphics | simulation. Of course, it's one thing to anticipate that BSP | trees might be used this way and another thing to actually do | it. But even in the implementation Carmack may have had more | guidance than is commonly assumed." | random314 wrote: | I am guilty of reading the first couple of paragraphs and | assuming this was Carmack worship. My bad! | cwillu wrote: | Honey for the poison. | choeger wrote: | What I don't understand is, how the tree can be used even after | movement? How do you know which polygon is behind the other if | camera position and angle are freely chosen? ___________________________________________________________________ (page generated 2022-11-21 23:00 UTC)