[HN Gopher] The Lost Art of C Structure Packing (2014)
       ___________________________________________________________________
        
       The Lost Art of C Structure Packing (2014)
        
       Author : isaacimagine
       Score  : 78 points
       Date   : 2022-04-27 16:50 UTC (6 hours ago)
        
 (HTM) web link (www.catb.org)
 (TXT) w3m dump (www.catb.org)
        
       | CalChris wrote:
       | Struct packing has its points for small structs. Indeed, you can
       | reduce cache use and increase cache locality. However for large
       | structs, page aligned structs, the cache lines will be
       | constrained to a particular set. Moreover, pointer following from
       | struct to struct can incur a TLB hit; the TLB is another small
       | cache. So while you may cleverly encode things to squeeze size,
       | you may then watch things slow to a crawl.
       | 
       | You are packing small structs in order to squeeze lots of them
       | into the caches. However for large structs, you should at least
       | consider refactoring them into small structs which you can then
       | pack to your heart's content.
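A minimal sketch of what packing a small struct means in practice, assuming a mainstream ABI where each member is aligned to its own size. The struct and field names are hypothetical. The sizes in the comments are what x86-64 System V typically gives; other ABIs may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with fields declared in a padding-hostile order. */
struct loose {
    char     tag;    /* 1 byte, then padding up to the alignment of id */
    uint64_t id;
    char     flag;   /* 1 byte, then padding before count */
    uint32_t count;
};                   /* typically 24 bytes on x86-64 */

/* Same fields, widest first, so the small members share the tail. */
struct tight {
    uint64_t id;
    uint32_t count;
    char     tag;
    char     flag;
};                   /* typically 16 bytes on x86-64 */

/* Reordering should never make things worse on such an ABI. */
_Static_assert(sizeof(struct tight) <= sizeof(struct loose),
               "reordered struct is not larger");

/* Bytes saved per array element by the reordering. */
size_t bytes_saved_per_element(void) {
    return sizeof(struct loose) - sizeof(struct tight);
}
```

With 64-byte cache lines, the typical 24-to-16-byte saving fits four whole elements per line instead of two (plus part of a third), which is the cache-density argument made above.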
        
       | Comevius wrote:
       | Also related: https://media.handmade-seattle.com/practical-data-
       | oriented-d...
        
       | dang wrote:
       | Related:
       | 
       |  _The Lost Art of C Structure Packing_ -
       | https://news.ycombinator.com/item?id=15626205 - Nov 2017 (49
       | comments)
       | 
       |  _The Lost Art of C Structure Packing (2014)_ -
       | https://news.ycombinator.com/item?id=12231464 - Aug 2016 (112
       | comments)
       | 
       |  _The Lost Art of C Structure Packing_ -
       | https://news.ycombinator.com/item?id=9517623 - May 2015 (4
       | comments)
       | 
       |  _The Lost Art of C Structure Packing_ -
       | https://news.ycombinator.com/item?id=9069031 - Feb 2015 (113
       | comments)
       | 
       |  _The Lost Art of C Structure Packing_ -
       | https://news.ycombinator.com/item?id=6995568 - Jan 2014 (143
       | comments)
        
       | ncmncm wrote:
       | It neglects to mention that bit fields have always been the
       | buggiest part of C compilers, and there is never a good enough
       | reason to rely on them, if you have a choice at all. Honest
       | shift-and-mask operations on unsigned machine words are always
       | better, if you absolutely must pack bitwise.
        
         | camgunz wrote:
         | Super agree. I read up on bit fields [0] a while ago and some
         | of the details about them are bonkers:
         | 
         | > Multiple adjacent bit-fields are usually packed together
         | (although this behavior is implementation-defined)
         | 
         | > The special unnamed bit-field of size zero can be forced to
         | break up padding.
         | 
         | > int b:3; may have the range of values 0..7 or -4..3 in C
         | 
         | > on some platforms, bit-fields are packed left-to-right, on
         | others right-to-left
         | 
         | I wouldn't touch them unless I absolutely had to, and knew I
         | could guarantee compiler and platform.
         | 
         | [0]: https://en.cppreference.com/w/cpp/language/bit_field
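A small sketch of how to sidestep the "0..7 or -4..3" ambiguity quoted above: spell out `unsigned` or `signed` instead of plain `int`. The struct and function names are made up for illustration, and the signed range assumes a two's complement target.

```c
#include <assert.h>

struct fields {
    unsigned int u : 3;   /* always 0..7 */
    signed int   s : 3;   /* -4..3 on two's complement targets */
    /* plain "int b : 3;" could have either range, so it is avoided */
};

/* Stores into an unsigned bit-field are reduced modulo 2^width. */
unsigned store_unsigned3(unsigned value) {
    struct fields f = {0};
    f.u = value;
    return f.u;
}

/* The caller is expected to keep value within -4..3; storing anything
 * else into a signed bit-field is implementation-defined. */
int store_signed3(int value) {
    struct fields f = {0};
    f.s = value;
    return f.s;
}
```

This pins down the value range only. Where the bits sit within the storage unit is still up to the implementation.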
        
         | mjevans wrote:
         | Until I read this article that's what I thought C Bitfields
         | _were_. I didn't realize the specification was so uselessly
         | sloppy about alignment packing that a programmer couldn't
         | reliably address specific bit field members with just a lowest
         | bit first to highest bit field of exact widths. It's quite
         | annoying that such is not what those are.
        
           | scatters wrote:
           | If the specification is lax, it's because the expected
           | behavior is different across platforms. C (and C++) has to
           | support platforms where bytes are more than 8 bits, where
           | floats are non-IEEE, and (until recently) where signed
           | integers are not 2's complement. If you want a specific
           | behavior and don't care about portability all you have to do
           | is read the ABI spec alongside the standard.
        
         | ithinkso wrote:
         | Bitfields are used a lot when you have constrained resources,
         | in my case in LTE/5G both on modem and bts sides. Every
         | struct's field takes as much as it needs to and you leave rest
         | as 'reservedX'. You never know when a new feature will have to
         | be implemented and few more bits will be needed for some new
         | field.
         | 
         | Without bitfields the code would be absolutely filled with bit-
         | access macros decreasing readability and screwing with IDE's
         | indexers and static analyzers big time
         | 
         | Not to mention the pain it would be to refactor/reorder/change
         | fields sizes which is relatively painless with bitfields
        
           | zozbot234 wrote:
           | The drawback though is that you can't reference individual
           | fields by pointer. You basically need to write the
           | equivalents of OOP getters and setters to really keep the
           | code tidy. The compiler can't plan for such things on its
           | own, not across multiple compilation units at any rate.
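A rough sketch of the getter/setter pattern mentioned above. Taking `&h->len` on a bit-field does not compile, so code that wants to write "through a pointer" has to go via a temporary. All names here (`hdr`, `parse_len`, and so on) are hypothetical.

```c
#include <assert.h>

/* Hypothetical message header; field names and widths are made up. */
struct hdr {
    unsigned int type : 4;
    unsigned int len  : 12;
};

unsigned hdr_get_len(const struct hdr *h) { return h->len; }

/* Mask explicitly so the truncation to 12 bits is visible at the call. */
void hdr_set_len(struct hdr *h, unsigned v) { h->len = v & 0xFFFu; }

/* Stand-in for any API that reports a result through a pointer. */
void parse_len(unsigned *out) { *out = 300u; }

/* "parse_len(&h->len)" is a constraint violation, so use a temporary. */
void fill_len(struct hdr *h) {
    unsigned tmp;
    parse_len(&tmp);
    hdr_set_len(h, tmp);
}
```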
        
         | chrisseaton wrote:
         | > shift-and-mask operations on unsigned machine words are
         | always better
         | 
         | ... but that's what a bit field is?
        
           | jcranmer wrote:
           | That's not what a bit field is.
           | 
           | A bit field (in C/C++) is a weird object type that can only
           | exist in a structure or union type, which kind of acts like
           | an underlying regular integral type except for those
           | situations where it does not.
           | 
           | For an example of why compilers might have issues compiling
           | bit fields properly (although this requires C++, since C's
           | ternary operator works on rvalues, not lvalues):
           |     struct A { int x: 3; int y: 5; } a;
           |     (choice ? a.x : a.y) = val;
           | 
           | Enjoy making that codegen work properly.
        
             | mwint wrote:
             | Can someone explain further why this is so hard? I'm not
             | familiar enough with any of this to understand without some
             | help.
        
               | jcranmer wrote:
               | There's a few layers of complexity here.
               | 
               | The first is lvalues. In compiler jargon, an lvalue is a
               | kind of object that can have a value stored to it. And
               | you can usually represent it as the address of some
               | memory location [1]. Of course, bitfields break this
               | representation: you need to know what the bit offset and
               | bit size of the field you're storing is (as well as the
               | signedness).
               | 
               | The next level of complexity is the conditional operator.
               | This means that, when conditional operators yield lvalues
               | [2], you now end up in a situation where the lvalue now
               | has a _conditional_ bit offset and bit size within the
               | address. Or maybe one leg of the expression returns a
               | bit-field and the other leg returns a regular int lvalue.
               | Imagine how complex your datastructure needs to be to
               | represent an lvalue during this code generation phase.
               | 
               | [1] Not all lvalues need to have memory locations. But if
               | you're writing a C compiler, it's an easy first
               | approximation to give every variable, even those marked
               | register, some memory location and rely on an
               | optimization pass to convert stack memory locations into
               | register locations, rather than keeping track of this
               | information when the frontend does code generation.
               | 
               | [2] As mentioned elsewhere, conditional operators in C do
               | not yield lvalues. But conditional operators in C++ do.
        
               | ncmncm wrote:
               | It is just very finicky, with myriad edge cases easy to
               | get wrong, and even easier to neglect to have complete
               | tests for. Each target CPU design and version has quirks.
               | Many involve sign extension.
        
               | jstimpfle wrote:
               | Extracting a subrange of bits and shifting them to the
               | beginning is not exactly rocket science.
        
               | ncmncm wrote:
               | Experience indicates otherwise.
               | 
               | If you _must_ use bit fields, make them unsigned. Bugs
               | love to hide under signed bit fields.
        
               | jstimpfle wrote:
               | That makes sense, looking at how signed arithmetic works
               | on different architectures it would feel strange to use
               | signed bitfields.
               | 
               | Unsigned bitfields are a nice way to get modular
               | arithmetic with n bits without syntactic clutter.
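A minimal sketch of the "modular arithmetic with n bits" idea, using a hypothetical 16-slot ring index. The C standard reduces values stored into an unsigned bit-field modulo 2^width, so no explicit `& 15` is written.

```c
#include <assert.h>

/* Hypothetical ring-buffer cursor: a 4-bit field wraps on its own. */
struct ring {
    unsigned int head : 4;
};

/* Advances the cursor and returns its new value. */
unsigned ring_advance(struct ring *r) {
    r->head++;          /* 15 + 1 is stored as 0: reduced modulo 2^4 */
    return r->head;
}
```

Whether a given compiler for a given target gets this right is exactly what the replies argue about. The sketch only shows what the language promises.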
        
               | ncmncm wrote:
               | "Unsigned bitfields are a nice ..."
               | 
               | Appear to be. Are, when all the stars align. Are not in
               | fact, often enough that you are issued a red warning you
               | may ignore if you are insulated from all consequences.
        
             | chrisseaton wrote:
             | Is that valid C code? Is a ternary on an L-value an
             | L-value? I'm not sure it is - regardless of bitfields or
             | not?
             | 
             | https://godbolt.org/z/aP8v5xKaz
        
               | jcranmer wrote:
               | I mentioned in the note that it's not legal C, since
               | ternaries must yield rvalues in C. It is legal C++,
               | however, since there ternaries may be lvalues.
        
               | chrisseaton wrote:
               | I feel like you added that after I replied.
        
             | WalterBright wrote:
             | The compiler can rewrite it as:
             |     choice ? ((a.x = val),a.x) : ((a.y = val),a.y);
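For readers following along in C, where the conditional operator does not yield an lvalue, a sketch of the rewritten form as a plain function. The fields are declared `unsigned` here (an assumption, unlike the `int` fields in the original example) so that the stored values are well defined.

```c
#include <assert.h>

struct A {
    unsigned int x : 3;
    unsigned int y : 5;
};

/* C has no "(choice ? a->x : a->y) = val;", so each arm does its own
 * store. Returns the value actually stored, as the comma form would. */
unsigned store_either(struct A *a, int choice, unsigned val) {
    return choice ? (a->x = val, a->x) : (a->y = val, a->y);
}
```

Each arm knows its own bit offset and width at compile time, which is why this form is easy to generate code for.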
        
             | iainmerrick wrote:
             | But, you'd have an even harder time making that work with
             | mask-and-shift macros!
             | 
             | It also doesn't seem like something that would come up very
             | often. I can't think of the last time I conditionally
             | stored to one of two struct fields, if I ever have.
             | 
             | The much more normal case would be:
             |     val = choice ? a.x : a.y;
             | 
             | That one seems pretty straightforward from a codegen
             | perspective.
        
               | ncmncm wrote:
               | "Seems" is not the domain under discussion.
        
               | jcranmer wrote:
               | The example I gave is an example of something legal with
               | bitfields (in C++) that is legitimately challenging to
               | implement [1] that leads to bugs in compilers. It's not
               | meant to be something that anyone is intended to use--
               | indeed, I'd firmly suggest that the standard ought to
               | prohibit this kind of usage.
               | 
               | The broader point is that bitfields are actually weird
               | little objects that look a lot like regular objects in
               | many, but not all, contexts. And it's very easy from a
               | language design or implementation perspective to forget
               | to account for the possibility that you're dealing with a
               | weird little object. This leads to underspecified
               | language specifications and compilers that crash if you
               | do something weird (but legal) such as virtually inherit
               | from a struct containing a bitfield as its last member.
               | 
               | [1] So challenging, in fact, that Clang gives an error
               | message "cannot compile this conditional operator yet".
               | It does work in g++, icx, and MSVC though.
        
               | iainmerrick wrote:
               | That all makes it sound like a C++ problem, not a C
               | problem.
        
               | jcranmer wrote:
               | You can make a lot of the "fun" of bitfields go away with
               | lvalue-to-rvalue conversion, and C tends to do this
               | conversion very rapidly so that it's hard to find good
               | cases for truly bizarre stuff, whereas C++ makes lvalues
               | last a lot longer.
               | 
               | Of course, if you go reach for C's standard "fun with
               | lvalue" operations, you can get some crazy nonsense. What
               | machine code should you generate here [1]:
               | struct A { int x : 5; volatile _Atomic int y: 3; } a;
               | a.y++;
               | 
               | I will note that the intersection of volatile and
               | bitfields has been another fruitful area of compiler bugs
               | [2] historically speaking. While C++ does provide better
               | what-the-ever-living-fuck moments for bitfields, C has
               | had its fair share of issues with bitfields.
               | 
               | [1] Whether or not you can make a bitfield _Atomic in C
               | is implementation-defined, so it's possible that someone
               | writes a C implementation where this is legal. I will
               | note that, in a rare display of sanity, all C compilers I
               | can test do in fact sensibly reject _Atomic bitfields,
               | but for the purposes of argument, assume that someone has
               | one where it's permitted, since it is allowable by the
               | standard.
               | 
               | [2] Or programmer bugs blamed on the compiler. This is
               | the intersection of two areas that are notorious for
               | underspecification to begin with, and combined with the
               | general tendency of programmers to expect C compilers to
               | be a thin veneer over assembly, makes it awfully
               | difficult to figure out which behavior is language-
               | intended.
        
               | dfox wrote:
               | I vaguely recall that gcc supports this even in C mode.
               | 
               | The underlying problem has to do with whether the IR has
               | first-class concept of arbitrary lvalue or whether the
               | frontend has to convert lvalues that get passed around to
               | some pointer-like thing.
               | 
               | It might look irrelevant for discussion of low-level AOT
               | compilers, but it is also interesting to compare how this
               | is implemented in dynamic/"scripting" runtimes and how
               | the choice of underlying implementation of the concept of
               | "lvalue"/"place" influences the user visible language.
               | Somewhat notably, the first draft of Common Lisp had
               | something akin to first-class lvalues, and the final
               | standard replaced all that with a significantly simpler
               | mechanism that relies purely on macros.
        
           | jmwilson wrote:
           | With shifts and masks you know where the bits are. With
           | bitfields, you don't because the specification leaves
           | everything up to the compiler.
           |     struct foo {
           |         char a : 4;
           |         char b : 4;
           |     };
           | 
           | Is a in the high-order 4 bits, or the lower 4 bits? Both
           | choices are allowed, so it's up to the compiler and makes the
           | code non-portable.
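A hedged sketch for checking which choice a particular compiler made, next to an explicit version whose layout does not depend on the compiler. It uses `unsigned char` bit-fields, which are themselves an implementation-defined extension (accepted by GCC and Clang). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct foo {
    unsigned char a : 4;
    unsigned char b : 4;
};

/* Returns the first byte the compiler produced for a = 0x3, b = 0xA. */
uint8_t bitfield_byte(void) {
    struct foo f;
    uint8_t first;
    memset(&f, 0, sizeof f);
    f.a = 0x3;
    f.b = 0xA;
    memcpy(&first, &f, 1);
    return first;
}

/* Explicit version: a is the low nibble on every conforming compiler. */
uint8_t pack_nibbles(uint8_t a, uint8_t b) {
    return (uint8_t)((a & 0x0Fu) | ((b & 0x0Fu) << 4));
}
```

If `bitfield_byte()` returns 0xA3, `a` landed in the low nibble, and if it returns 0x3A, in the high nibble. `pack_nibbles(0x3, 0xA)` is 0xA3 everywhere.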
        
             | chrisseaton wrote:
             | Surely there's an ABI? Otherwise how does this work at all?
        
               | pavlov wrote:
               | Never ever use bitfields in structs that may cross
               | library boundaries. There are some corners of C that are
               | not fit for public APIs.
        
               | masklinn wrote:
               | > Otherwise how does this work at all?
               | 
               | Hopes, prayers, and a single version of a single compiler
               | being involved.
        
             | iainmerrick wrote:
             | _With shifts and masks you know where the bits are._
             | 
             | You know where the bits are _within a single word_. But if
             | you have a struct with multiple fields, it's not safe to
             | rely on the exact memory layout even if it doesn't have any
             | bitfields.
             | 
             | If you need to represent a very specific memory layout,
             | it's not just bitfields you need to avoid, it's structs in
             | general.
             | 
             | Conversely, if you don't need to guarantee a specific
             | layout, bitfields are fine to use, and could be a useful
             | optimisation hint for the compiler.
        
               | ncmncm wrote:
               | In other words, you don't understand.
        
               | iainmerrick wrote:
               | Here's an example where I think bitfields are totally
               | appropriate:
               | 
               | Say I have a window manager, and I want to attach a bunch
               | of boolean flags to each window object (isVisible,
               | isMaximized, etc). I don't need to serialize them to
               | disk. It's highly preferable that they should be
               | efficiently bit-packed, but not strictly essential.
               | 
               | The conservative way to implement that would be bit-
               | shifts and masking (either manually or via a macro). But
               | implementing it with bitfields would be a lot easier and
               | less error-prone, and would work just as well. What
               | problems do you see with the bitfield approach?
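A rough sketch of the window-flags case described above, where the layout never leaves the process. `isVisible` and `isMaximized` come from the comment. The rest of the struct, the third flag and the helper functions are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical window object: in-memory only, never serialized. */
struct window {
    int x, y, width, height;
    unsigned int isVisible   : 1;
    unsigned int isMaximized : 1;
    unsigned int isFocused   : 1;   /* made-up third flag */
};

void window_maximize(struct window *w) {
    w->isMaximized = 1;
    w->isVisible = 1;    /* assumption: maximizing also shows the window */
}

bool window_fills_screen(const struct window *w) {
    return w->isVisible && w->isMaximized;
}
```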
        
             | dmitrygr wrote:
             | sometimes you do not care, and
             |     x = foo.a
             | 
             | is simpler than
             |     x = (foo & FOO_MASK_A) >> FOO_SHIFT_A
             | 
             | and for assignments, the difference is even bigger:
             |     foo.a = x
             | 
             | is much better than
             |     foo = (foo &~ FOO_MASK_A) | ((x << FOO_SHIFT_A) & FOO_MASK_A)
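For comparison, a sketch of how the shift-and-mask side is usually tamed with a pair of macros. The position and width of field A are invented here, and `FOO_GET_A` / `FOO_SET_A` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: field A is 3 bits wide, starting at bit 4. */
#define FOO_SHIFT_A 4
#define FOO_MASK_A  (UINT32_C(0x7) << FOO_SHIFT_A)

/* Extract field A from a word. */
#define FOO_GET_A(foo) \
    (((foo) & FOO_MASK_A) >> FOO_SHIFT_A)

/* Return a copy of the word with field A replaced by x (truncated). */
#define FOO_SET_A(foo, x) \
    (((foo) & ~FOO_MASK_A) | (((uint32_t)(x) << FOO_SHIFT_A) & FOO_MASK_A))
```

Usage then reads `x = FOO_GET_A(foo);` and `foo = FOO_SET_A(foo, x);`, which is closer to the bit-field syntax but still needs one macro pair per field.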
        
               | zozbot234 wrote:
               | If "foo" is defined as part of an API/ABI that's used in
               | multiple compile units you will always care, since
               | otherwise a random change in "implementation defined"
               | bitfield encodings on some obscure architecture might
               | break your build. Bitfields are a misfeature in most
               | real-world cases.
        
               | InitialLastName wrote:
               | The case where a) you don't care about the in-memory
               | representation of your struct and b) you care a lot about
               | being able to pack into the absolute minimum memory
               | space, but not enough to make sure the compiler actually
               | packs the fields (depending on architecture and
               | optimization settings, they might not!) is vanishingly
               | small.
               | 
               | The more frequent perceived use for bit-fields (in the
               | situation where they actually work) is to pack into a
               | serialized data format, such that memory or a data stream
               | can be accessed elsewhere. In that case, "the compiler
               | can do whatever it wants with your data packing" is
               | pretty useless, since your "elsewhere" might have a
               | different compiler that does a totally different thing.
        
               | dmitrygr wrote:
               | > is vanishingly small.
               | 
               | Ladies and gentlemen, this thought is why we now consider
               | 8GB of ram to be a "weak device".
               | 
               | No, no no no no, 1000 times no. Every situation is a low
               | ram situation. Every!
        
               | InitialLastName wrote:
               | What I'm saying is that the case where you want to use
               | less RAM for a bit field but you don't actually care if
               | the compiler _allocates less than an addressable line of
               | RAM for that bit field_ (because it actually just might
               | not) is pretty empty.
               | 
               | Edit: I know it's hard to read a whole sentence at once,
               | but I made that same point directly up there too.
        
             | Findecanor wrote:
             | While the C and C++ _language_ specs don't specify the
             | layout of bitfields, modern platforms tend to have a
             | specified _ABI_ which compilers follow when compiling for
             | that platform.
             | 
             | 64-bit Linux distros and the BSDs follow the convention
             | once set by the "C ABI for Itanium".
             | 
             | In that, bitfields are grouped in declaration order into
             | container words of the same width as the bitfield's type
             | (char, int, etc.). Bitfields don't span multiple container
             | words, and container words don't overlap. On little-endian
             | platforms, bitfields are packed LSB first, but on big-
             | endian platforms they are packed MSB first within their
             | container word. Alignment rules apply only to the container
             | words.
        
               | ncmncm wrote:
               | That is all very fine.
               | 
               | If the instructions emitted and the instructions
               | implemented both happen to match that, on every chip your
               | code must run on, you got lucky.
        
               | dfox wrote:
               | The point is that if you care about the resulting in-
               | memory layout then you by definition know on what
               | platform the code will run and what is the ABI.
               | 
               | If you want to produce the same sequence of bytes regardless
               | of underlying platform, then you have to do it by hand
               | with uint8_t[] buffers and explicit shifts and masks.
               | Casting a pointer to a struct to char* and writing it
               | somewhere is inherently non-portable and this has nothing
               | to do with bitfields and nothing to do with things like
               | __attribute__((packed)), although both of these things
               | are useful when you want to do that and understand the
               | (non-)portability implications.
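A minimal sketch of the "uint8_t[] buffers and explicit shifts and masks" approach. The wire format (4-bit type, 12-bit length, 32-bit sequence number, big-endian) and all names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wire record: 4-bit type, 12-bit length, 32-bit sequence
 * number, big-endian, 6 bytes in total. */
enum { RECORD_WIRE_SIZE = 6 };

void record_encode(uint8_t out[RECORD_WIRE_SIZE],
                   unsigned type, unsigned len, uint32_t seq) {
    out[0] = (uint8_t)(((type & 0xFu) << 4) | ((len >> 8) & 0xFu));
    out[1] = (uint8_t)(len & 0xFFu);
    out[2] = (uint8_t)(seq >> 24);
    out[3] = (uint8_t)(seq >> 16);
    out[4] = (uint8_t)(seq >> 8);
    out[5] = (uint8_t)seq;
}

void record_decode(const uint8_t in[RECORD_WIRE_SIZE],
                   unsigned *type, unsigned *len, uint32_t *seq) {
    *type = in[0] >> 4;
    *len  = ((unsigned)(in[0] & 0xFu) << 8) | in[1];
    *seq  = ((uint32_t)in[2] << 24) | ((uint32_t)in[3] << 16)
          | ((uint32_t)in[4] << 8)  |  (uint32_t)in[5];
}
```

Because every byte is computed arithmetically, the output does not depend on host endianness, struct padding, or bit-field ordering.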
        
           | ncmncm wrote:
           | Physically, yes. The difference is whether you let the
           | compiler generate and hide the shift-and-mask ops, or code
           | them by hand. _Normally_ it is better to leave details to the
           | compiler. This is the exception to that rule.
           | 
           | A result of people avoiding declaring bit fields in serious
           | use cases has been that compiler vendors didn't worry too
           | much about bitfield codegen bugs.
           | 
           |  _Probably_ Gcc and Clang are OK on x86, by now. But that
           | does not carry to, e.g., obscure microcontrollers. Heaven
           | help you if your bit field members are supposed to correspond
           | to hardware register sub-fields.
        
             | iainmerrick wrote:
             | The same applies to structs in general, not just bitfields.
        
               | ncmncm wrote:
               | Common experience is that compilers, ABIs, and
               | instruction implementations get _ordinary_ struct fields
               | right.
        
           | layer8 wrote:
           | Bit fields are a specific C language feature, allowing to
           | treat bit slices as a unit whose length doesn't need to
           | correspond to one of the integer types. See for example
           | https://docs.microsoft.com/en-us/cpp/c-language/c-bit-
           | fields....
        
             | chrisseaton wrote:
             | Yeah they compile to the same machine code operations
             | though. If those machine code operations aren't right as a
             | bitfield then they aren't going to be right done manually
             | either.
             | 
             | https://godbolt.org/z/csvTx89EG
        
               | layer8 wrote:
               | Bit field support being buggy exactly means that they
               | don't compile to the same machine code as the bit
               | shifting/masking code you would write by hand (if your
               | hand-written code is correct).
        
               | alcover wrote:
               | Is it not rather
               |     void bar(struct y *s, unsigned int foo) {
               |         s->c = (s->c & 0xf0) | foo;
               |     }
        
               | tom_ wrote:
               | ARM is little-endian, and by tradition bitfield bit
               | indexes are assigned from least significant (bit 0 in ARM
               | terms) to more significant. b occupies bits 4-7
               | inclusive.
        
               | minipci1321 wrote:
               | > Yeah they compile to the same machine code operations
               | though.
               | 
               | Not always. Switch your example to AARCH64 and check out
               | the BFI instruction.
        
         | bsder wrote:
         | Quite true. C bitpacking is lousy.
         | 
         | The best "bitpacking" I have ever dealt with is the "Erlang Bit
         | Syntax". I really wish more languages would adopt it.
         | 
         | See:
         | https://www.erlang.org/doc/programming_examples/bit_syntax.h...
        
         | WalterBright wrote:
         | > bit fields have always been the buggiest part of C compilers
         | 
         | Not in my experience. The buggiest part was the preprocessor.
         | You don't hear much about preprocessor bugs anymore because the
         | C standard doesn't dare change it, and in 40 years people have
         | finally got them working right :-/
         | 
         | Personally, I had to scrap and rewrite the C preprocessor 3
         | times to get it right.
        
         | AdamH12113 wrote:
         | Bitfields make for much more readable code when accessing
         | individual fields of hardware registers, although there are
         | some caveats if the registers are poorly-designed. The main one
         | is that bitfield writes are usually read-modify-writes, so if
         | reading the register or writing back its current value causes
         | something to happen, bitfields are a no-go. But when they work,
         | you get code like:
         |     old_divider = SpiRegs.CONFIG_REG.bit.CLK_DIVIDER;
         |     SpiRegs.CONFIG_REG.bit.CLK_DIVIDER = new_divider;
         | 
         | instead of:
         |     old_divider = (SpiRegs.CONFIG_REG & SPI_CONFIG_CLK_DIVIDER_MASK)
         |                   >> SPI_CONFIG_CLK_DIVIDER_POS;
         |     SpiRegs.CONFIG_REG = (SpiRegs.CONFIG_REG & ~SPI_CONFIG_CLK_DIVIDER_MASK)
         |                          | (new_divider << SPI_CONFIG_CLK_DIVIDER_POS);
         | 
         | or the slightly nicer but even longer:
         |     config = SpiRegs.CONFIG_REG;
         |     old_divider = (config & SPI_CONFIG_CLK_DIVIDER_MASK)
         |                   >> SPI_CONFIG_CLK_DIVIDER_POS;
         |     config &= ~SPI_CONFIG_CLK_DIVIDER_MASK;
         |     config |= new_divider << SPI_CONFIG_CLK_DIVIDER_POS;
         |     SpiRegs.CONFIG_REG = config;
         | 
         | For anything other than hardware registers, I agree that
         | they're not portable enough to rely on.
        
           | dataflow wrote:
           | It feels weird to see arguments like this when you could just
           | use a language (C++ being the elephant in the room here) that
           | lets you define methods, then call those methods instead.
        
             | nomel wrote:
             | A method that does that will often (depending on the
             | architecture) have much more overhead than a struct lookup.
             | If you're doing hardware stuff, you often care about
             | performance.
        
               | dataflow wrote:
               | Inlining?
        
             | foldr wrote:
             | You can equally define helper functions to update registers
             | in C.
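A sketch of what such helper functions might look like in C. The macro names mirror the ones used earlier in this subthread, but the field position and width are invented (real values would come from the part's datasheet), and `reg_read_field` / `reg_write_field` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 8 bits wide, starting at bit 8. */
#define SPI_CONFIG_CLK_DIVIDER_POS  8
#define SPI_CONFIG_CLK_DIVIDER_MASK (UINT32_C(0xFF) << SPI_CONFIG_CLK_DIVIDER_POS)

/* Reads the register once and extracts one field. */
static inline uint32_t reg_read_field(const volatile uint32_t *reg,
                                      uint32_t mask, unsigned pos) {
    return (*reg & mask) >> pos;
}

/* One read and one write of the register. This is still a
 * read-modify-write, so the caveat about registers with side effects
 * on read or write-back applies here too. */
static inline void reg_write_field(volatile uint32_t *reg,
                                   uint32_t mask, unsigned pos,
                                   uint32_t value) {
    uint32_t tmp = *reg;
    tmp &= ~mask;
    tmp |= (value << pos) & mask;
    *reg = tmp;
}
```

A call would look like `reg_write_field(&SpiRegs.CONFIG_REG, SPI_CONFIG_CLK_DIVIDER_MASK, SPI_CONFIG_CLK_DIVIDER_POS, new_divider);`, assuming `CONFIG_REG` is declared as a `volatile uint32_t`. With inlining, this usually compiles to the same instructions as the hand-written expression.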
        
               | dataflow wrote:
               | Definitely, but it's more ergonomically annoying with IDE
               | stuff others complained about. [1]
               | 
               | [1] https://news.ycombinator.com/item?id=31185723
        
           | LAC-Tech wrote:
           | I remember being surprised that setting an individual bit on
           | an AVR hardware register in assembly was so much shorter than
           | doing all that C bit masking stuff.
        
           | pavon wrote:
           | Helper functions or macros are just as clean as the bitfield
           | syntax. That said, hardware register access is one of those
           | things that is intrinsically tied to a specific platform (and
           | if you target multiple platforms, it will already be behind
           | an abstraction layer), so you can usually know the quirks of
           | how the toolchain for that platform supports bitfields, and
           | use them accordingly. Still more work for people reading the
           | code, though since there are a lot of hidden assumptions
           | behind that deceptively simple "=" than with an explicit mask
           | and shift.
        
             | AdamH12113 wrote:
             | >Helper functions or macros are just as clean as the
             | bitfield syntax.
             | 
             | They can be, if done right, but then I have to remember the
             | names of all the helper functions and macros. :-) An IDE
             | can auto-complete bitfield names.
             | 
             | >Still more work for people reading the code, though since
             | there are a lot of hidden assumptions behind that
             | deceptively simple "=" than with an explicit mask and
             | shift.
             | 
             | Depends on the platform. IIRC on ARM a bitfield access is
             | masking and shifting, only done by the compiler instead of
             | me. With optimized code I often have to look at the
             | disassembly anyway if I want to know what's really going
             | on.
        
           | ncmncm wrote:
           | Readable code that does not necessarily execute what it says
           | does nobody any favors.
        
             | nomel wrote:
             | This is why we test our code, or build in runtime checks,
             | before releasing it.
        
         | dmitrygr wrote:
         | > Honest shift-and-mask operations on unsigned machine words
         | are always better, if you absolutely must pack bitwise
         | 
         | Now you go ahead and teach GCC to use the arm UBFX instruction
         | for those cases. It _DOES_ use it for actual bitfields. shift +
         | mask = 2-3 instructions (immediate load may be needed). UBFX is
         | one.
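
For reference, a sketch of the shift-and-mask form being compared here: extract `width` bits starting at bit `lsb`, zero-extended. The function name `extract_bits` is an assumption. Whether a given compiler lowers this to a single bitfield-extract instruction depends on the compiler, version, and flags, so the disassembly is the only reliable answer.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits of `word` starting at bit `lsb`, zero-extended.
 * Requires width >= 1 and lsb + width <= 32. */
static inline uint32_t extract_bits(uint32_t word, unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    /* Shifting a 32-bit value by 32 is undefined, so handle that case apart. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1u);
    return (word >> lsb) & mask;
}
```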
        
           | ncmncm wrote:
           | The more complicated the instruction is, the less likely it
           | was implemented to spec on all the various products and mask
           | steppings you might execute on ... and the less likely its
           | published definition exactly matches C or C++ Standard and
           | platform C ABI specs. And, the less likely that ABI spec
           | nails down all the details.
           | 
           | Compiler implementors don't like to guess, but don't get a
           | choice. If the instruction provided doesn't match the
           | Standard, which do they implement? Both choices are wrong.
        
       | mistrial9 wrote:
       | bad compilers make bad days.. custom hardware used to (?) use
       | memory locations to control/enable features.. anything from
       | electronic access paths to actual servo-motors firing. Probably a
       | better idea to use human-readable constructs and avoid this
       | compact and tiny use pattern, IMO. If you want a tricky test for
       | yourself, perhaps some actual hardware design is a better use of
       | time these days?
        
       | loup-vaillant wrote:
       | I once encountered a structure that was packed, even though it
       | shouldn't have been. Took me over a day to notice where the error
       | came from. I was poking at the internals of a library so I could
       | gather information that it had, but did not provide. There was
       | this context structure I normally only could access through a
       | pointer, but copying the definition of the structure into my own
       | code ought to do the trick...
       | 
       | ...except it didn't.
       | 
       | The way the library was compiled by default made the structure
       | there _smaller_ than my copy. Took me some time to guess why my
       | data was all garbled, but the cause was pretty simple: there was
       | no padding, even if it meant some members ended up unaligned. I
       | had to replace the unaligned members by char arrays to get it to
       | work (I did not dare explore the compilation options of the
       | library).
       | 
       | And then I found a totally different solution for my problem. Oh
       | well.
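
A sketch of the char-array workaround described above, under stated assumptions: the real library structure is not shown in the comment, so `struct ctx_mirror` is a hypothetical stand-in with a 1-byte tag followed directly by a 4-byte counter at offset 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of a struct that was compiled without padding.
 * Every member is unsigned char, so no alignment padding is needed. */
struct ctx_mirror {
    unsigned char tag;
    unsigned char counter[4];  /* stands in for an unaligned uint32_t */
};

/* Fail the build if the compiler inserted padding after all. */
_Static_assert(sizeof(struct ctx_mirror) == 5, "unexpected padding");

/* Copy the bytes out instead of dereferencing a misaligned pointer.
 * The result uses the host byte order. */
static inline uint32_t ctx_counter(const struct ctx_mirror *m)
{
    uint32_t v;
    memcpy(&v, m->counter, sizeof v);
    return v;
}
```

Going through `memcpy` avoids the undefined behaviour of reading a `uint32_t` through a misaligned pointer; compilers commonly turn the small fixed-size copy into a plain load where the target allows it.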
        
       | digikata wrote:
       | Is what the article says re: the pahole utility still correct
       | (that it's not maintained)? Looks like it might be maintained now
       | w/ kernel git under the dwarves area.
       | 
       | Pahole is a decent utility to look at what the packing of a
       | structure actually ended up as after everything has had its
       | last effect in the compile chain.
        
       | hnur wrote:
       | "a technique for reducing the memory footprint of programs in
       | compiled languages with C-like structures"
       | 
       | I figure that's not the primary reason for structure packing, but
       | rather for fine-grained control over writing to very specific
       | memory layouts (think global descriptor table) as structs.
        
         | UncleEntity wrote:
         | I know in blender there is a compile time check to ensure the
         | structs are properly packed and it has nothing to do with
         | specific memory layouts.
         | 
         | I _think_ it has to do with reading/writing them to disk but
         | honestly never cared enough to ask anyone. Did make things
         | convenient sometimes when you could 'steal' a padding value and
         | magically got backwards compatibility because the older versions
         | just ignored that field (and when reading an older file just
         | set it to a sane default).
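
One way such a compile-time check could look, as a sketch only. This is not Blender's actual mechanism, and `struct record_v1` / `struct record_v2` are hypothetical on-disk records invented for illustration. Padding is spelled out as named fields, and a later version reuses one padding byte without moving anything else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: padding is written explicitly so the layout
 * contains no compiler-inserted holes. */
struct record_v1 {
    uint32_t id;
    uint8_t  kind;
    uint8_t  _pad[3];   /* explicit padding, written to disk as zeros */
    uint64_t offset;
};

/* A later version "steals" one padding byte for a new field. */
struct record_v2 {
    uint32_t id;
    uint8_t  kind;
    uint8_t  flags;     /* new: occupies what used to be _pad[0] */
    uint8_t  _pad[2];
    uint64_t offset;
};

/* The sum of the member sizes is 16, so any other size means hidden padding. */
_Static_assert(sizeof(struct record_v1) == 16, "record_v1 has hidden padding");
_Static_assert(sizeof(struct record_v2) == sizeof(struct record_v1),
               "size changed between versions");
_Static_assert(offsetof(struct record_v2, offset) ==
               offsetof(struct record_v1, offset), "old field moved");
```

An old reader that ignores `_pad` never looks at `flags`, and a new reader that finds zero in `flags` in an old file can treat that as the default.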
        
       | waynecochran wrote:
       | I found it interesting that Pascal had a "packed" keyword and C
       | didn't (outside of implementation specific attributes like
       | __attribute__ ((aligned (8))) in GNU).
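
GNU C has two separate attributes in this area, and they do opposite jobs: `packed` removes padding, while `aligned` raises alignment. A small sketch for GCC or Clang; the struct names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default layout: the compiler may pad after `tag` so `value` is aligned. */
struct plain { uint8_t tag; uint32_t value; };

/* GNU extension: drop the padding; `value` now sits at offset 1. */
struct __attribute__((packed)) squeezed { uint8_t tag; uint32_t value; };

/* GNU extension: raise the alignment of the whole struct to 8 bytes. */
struct __attribute__((aligned(8))) widened { uint8_t tag; uint32_t value; };
```

With these definitions `sizeof(struct squeezed)` is 5, while `struct widened` is aligned to 8 bytes and its size is rounded up to a multiple of 8.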
        
         | layer8 wrote:
         | The reason is that Pascal was used on computers with long
         | machine words (e.g. 36 bits) where memory wasn't byte
         | addressable. It was customary (in assembly code) to "pack"
         | multiple logical fields into a single word, in particular
         | multiple characters of a text string. The "packed" feature in
         | Pascal was added for that purpose.
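
To make the idea concrete, a sketch of packing several characters into one long machine word. The details are assumptions chosen for illustration: five 7-bit characters in the upper 35 bits of a 36-bit word, with a `uint64_t` standing in for the word and the names `pack5` and `unpack5` made up here.

```c
#include <assert.h>
#include <stdint.h>

/* Pack five 7-bit characters into the top 35 bits of a 36-bit word,
 * leaving the lowest bit unused. A uint64_t stands in for the word. */
static inline uint64_t pack5(const char s[5])
{
    uint64_t word = 0;
    for (int i = 0; i < 5; i++)
        word = (word << 7) | ((uint64_t)s[i] & 0x7Fu);
    return word << 1;   /* 5 * 7 = 35 bits, shifted past the spare bit */
}

/* Recover character i (0 = first, 4 = last) from a packed word. */
static inline char unpack5(uint64_t word, int i)
{
    return (char)((word >> (1 + 7 * (4 - i))) & 0x7Fu);
}
```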
        
       | dragontamer wrote:
       | I feel like "struct of arrays" style coding has really taken off
       | in the past decade, and seems to be the best way to maximize
       | memory operations these days.
       | 
       | It's not so much that "structure packing" is dead, as much as a
       | wide variety of techniques have been developed above-and-beyond
       | just simply structure packing. There are many ways to skin a cat
       | these days, and packing your structures more intelligently is
       | just one possible data optimization.
        
         | kolbusa wrote:
         | It really depends on the domain. HPC is more frequently SOA
         | (think CSR sparse matrices), while AOS may make more sense in
         | other cases.
        
         | layer8 wrote:
         | That entirely depends on the access patterns. SOA makes sense
         | when you don't often access the different fields of the same
         | object (array index) at the same time. If you do, on the other
         | hand, then AOS is more efficient.
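
A sketch of the two layouts, using a hypothetical particle type and a deliberately tiny `N`. The loop that touches only `mass` strides over unused fields in the AoS layout and reads contiguous memory in the SoA layout; a loop that used `x`, `y`, `z` and `mass` together would favour AoS instead.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* tiny size, chosen only to keep the sketch short */

/* Array of structs: all fields of one particle sit next to each other. */
struct particle { float x, y, z, mass; };
struct particles_aos { struct particle p[N]; };

/* Struct of arrays: each field is contiguous across all particles. */
struct particles_soa { float x[N], y[N], z[N], mass[N]; };

/* One field per element: consecutive reads are sizeof(struct particle) apart. */
static inline float total_mass_aos(const struct particles_aos *a)
{
    float sum = 0.0f;
    for (size_t i = 0; i < N; i++)
        sum += a->p[i].mass;
    return sum;
}

/* Same computation: consecutive reads are sizeof(float) apart. */
static inline float total_mass_soa(const struct particles_soa *s)
{
    float sum = 0.0f;
    for (size_t i = 0; i < N; i++)
        sum += s->mass[i];
    return sum;
}
```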
        
           | TillE wrote:
           | Right. There's a little too much cargo-culting in the "struct
           | of arrays" pattern; you really want to understand why it
           | works or doesn't.
           | 
           | If you have some giant bloated struct and you only care about
           | one or two fields at a time, that's one thing. But if you
           | have a well-aligned, correctly packed struct and you're
           | processing all its data, it's total nonsense to break that
           | up.
        
             | dragontamer wrote:
             | I certainly think SOA has been cargo-culted to all hell and
             | back.
             | 
             | But empirically speaking, it seems like SOA / AOS is the
             | easiest "beginner topic" to get high-performance
             | programmers thinking about memory-layout issues.
             | 
             | Maybe in the 90s or 00s, it was more popular to think about
             | struct layouts, alignment issues and the like. But today,
             | SOA is popular because RAM has gotten less... random... and
             | more sequential.
             | 
             | I think it's the changing nature of 90s era computers (RAM
             | behaving more random-accessy) vs the nature of 10s era
             | computers (RAM behaving more sequential-accessy).
             | 
             | --------
             | 
             | It's not like the 90s techniques don't work anymore. But the
             | 10s technique of "structure of arrays" and then iterating
             | for-loops over your data works better with prefetchers,
             | multiple-cache hierarchies, and other tidbits that have
             | made RAM more sequential than ever before.
             | 
             | Hopefully programmers continue to study the techniques and
             | understand what is going on under-the-hood, instead of
             | cargo-culting the pattern. Alas, we all know that
             | cargo-culting works in the short term and is easier to do than
             | actually learning the underlying machine!
        
             | layer8 wrote:
             | It's similar to row vs. column oriented databases.
        
         | monocasa wrote:
         | Wasn't that one of the cool things about the language Jai?
         | Struct definitions could be cleanly inverted between AoS and
         | SoA at use time?
        
         | pclmulqdq wrote:
         | "Struct of arrays" becoming popular may also have something to
         | do with few people understanding structure packing. AOS has
         | much better performance if you pack your structs well than if
         | you pack them naively.
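
A sketch of what packing well versus naively can mean for an array element. The two hypothetical structs below hold the same members; only the declaration order differs. On a typical x86-64 ABI the first is 24 bytes and the second 16, so an array of the second fits half again as many elements per cache line. Exact sizes are ABI-dependent.

```c
#include <assert.h>
#include <stdint.h>

/* Declaration order forces padding after each small member. */
struct naive {
    uint8_t  a;
    uint64_t big;
    uint8_t  b;
    uint32_t mid;
};

/* Same members, largest first: the small ones fill what was padding. */
struct reordered {
    uint64_t big;
    uint32_t mid;
    uint8_t  a;
    uint8_t  b;
};
```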
        
       ___________________________________________________________________
       (page generated 2022-04-27 23:01 UTC)