[HN Gopher] Structures in C: From Basics to Memory Alignment
       ___________________________________________________________________
        
       Structures in C: From Basics to Memory Alignment
        
       Author : aheck
       Score  : 47 points
       Date   : 2023-06-29 21:08 UTC (1 hour ago)
        
 (HTM) web link (abstractexpr.com)
 (TXT) w3m dump (abstractexpr.com)
        
       | colonwqbang wrote:
       | > The only good reason to use packed structures is when you need
       | to map some memory (e.g. hardware registers exposed to memory)
       | bit by bit to a structure.
       | 
       | Another common reason is when two CPUs of different
       | architectures need to access the same structure in memory,
       | e.g. a RISC-V and an Arm64 processor sharing memory in the same
       | system. Or you read structured binary data from disk and need
       | to specify an exact layout.
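
A minimal sketch of the "exact layout" case described above, assuming GCC or Clang: `__attribute__((packed))` is a compiler extension, not standard C. The struct and field names are hypothetical. It declares the same three fields with and without packing and uses C11 `static_assert` to pin the packed layout at compile time.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical on-disk record header, laid out by the default ABI rules.
 * The compiler is free to insert padding after `tag` and after `flags`. */
struct record_default {
    uint8_t  tag;
    uint32_t length;
    uint16_t flags;
};

/* Same fields with the GCC/Clang packed attribute: no padding at all,
 * so the in-memory layout matches a 7-byte external format. */
struct __attribute__((packed)) record_packed {
    uint8_t  tag;
    uint32_t length;
    uint16_t flags;
};

/* Fail the build if the compiler does not produce the expected layout. */
static_assert(sizeof(struct record_packed) == 7, "packed size");
static_assert(offsetof(struct record_packed, length) == 1, "length offset");
static_assert(offsetof(struct record_packed, flags) == 5, "flags offset");
```

On ABIs that align each scalar to its size, the unpacked version typically occupies 12 bytes. Packing fixes only the offsets; byte order still has to be agreed on separately.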
        
         | mananaysiempre wrote:
         | All of these sound weird to me--most non-stupid (hello 802.2)
         | protocols and hardware are going to have natural-aligned
         | structure fields, so basically any mainstream (8-bit-byte,
         | two's complement, etc.) ABI is going to lay them out the same
         | way, packed or not.
         | 
         | As for RV64 and Arm64, the layout rules for same-size scalar
         | types in their common ABIs are outright identical, aren't they?
         | 
         | We're (most of us) a long way away from the time when each DOS
         | compiler had its own opinions on whether long double should be
         | 8-, 16-, 32-, or 64-bit aligned and 80 or 128 bits long.
        
           | packetlost wrote:
           | You assume it was RV64. It could just as easily have been RV32.
        
           | com2kid wrote:
           | > All of these sound weird to me--most non-stupid (hello
           | 802.2) protocols and hardware are going to have natural-
           | aligned structure fields, so basically any mainstream (8-bit-
           | byte, two's complement, etc.) ABI is going to lay them out
           | the same way, packed or not.
           | 
           | In the long ago year of 2015 I worked on a project where the
           | same binary packet was:
           | 
           | 1. Generated by an 8-bit microcontroller
           | 
           | 2. Consumed by a 32-bit Cortex-M3
           | 
           | 3. Passed on to iPhones, Androids, Windows Phones, and Windows
           | PCs running ObjC, Java, C#, and C++ respectively
           | 
           | 4. Uploaded to a cloud provider
           | 
           | The phrase "natural aligned" has no meaning in that context.
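
When one packet has to cross that many compilers and languages, one common option is to stop relying on struct layout and write each field byte by byte. The sketch below assumes a little-endian wire format and a hypothetical 7-byte packet; the struct, field names, and function names are invented for illustration and are not taken from the project described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet. Its in-memory layout does not matter here,
 * because the wire format is produced field by field. */
struct sensor_packet {
    uint8_t  type;
    uint16_t sequence;
    uint32_t reading;
};

enum { SENSOR_PACKET_WIRE_SIZE = 7 };  /* 1 + 2 + 4 bytes, no padding */

static void put_u16_le(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)(v >> 8);
}

static void put_u32_le(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)(v >> 24);
}

static uint16_t get_u16_le(const uint8_t *in)
{
    return (uint16_t)((uint16_t)in[0] | ((uint16_t)in[1] << 8));
}

static uint32_t get_u32_le(const uint8_t *in)
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Serialize into exactly SENSOR_PACKET_WIRE_SIZE bytes. */
void sensor_packet_encode(const struct sensor_packet *p,
                          uint8_t out[SENSOR_PACKET_WIRE_SIZE])
{
    assert(p != NULL && out != NULL);
    out[0] = p->type;
    put_u16_le(out + 1, p->sequence);
    put_u32_le(out + 3, p->reading);
}

/* Inverse of sensor_packet_encode. */
void sensor_packet_decode(struct sensor_packet *p,
                          const uint8_t in[SENSOR_PACKET_WIRE_SIZE])
{
    assert(p != NULL && in != NULL);
    p->type     = in[0];
    p->sequence = get_u16_le(in + 1);
    p->reading  = get_u32_le(in + 3);
}
```

The shifts and masks behave the same regardless of the host's endianness or struct padding, which is why the same logic ports easily to Java or C#.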
        
             | mananaysiempre wrote:
             | > The phrase "natural aligned" has no meaning in that
             | context.
             | 
             | The phrase "naturally aligned" as I'm accustomed to seeing
             | it used refers to the alignment of a power-of-two-sized
             | type (usually a scalar one) being equal to its size. Unless
             | you're working with, say, 18-bit or 24-bit integers (that
             | do exist in obscure places), it does have a meaning, and
             | unless you're using non-eight-bit bytes that meaning is
             | fairly universal (and if you're not, your I/O is probably
             | screwed up in hard-to-predict ways[1]).
             | 
             | At least for your items 2, 3, and 4--excluding Java and C#
             | which are not relevant to TFA about C and are likely to use
             | manual packing code--you have, let's see,
             | 
             | - The bytes are eight bits wide, and ASCII strings have
             | their usual meaning;
             | 
             | - The integer types are wraparound unsigned and two's
             | complement signed little-endian with no padding bits or trap
             | representations and come in 8-bit, 16-bit, 32-bit, and
             | 64-bit sizes and identical alignments;
             | 
             | - The floating-point types are IEEE 754 single and double
             | precision floats, little endian, respectively 32 bits and
             | 64 bits in size and of identical alignment, though you
             | should probably avoid relying on subnormals or the exact
             | choice of NaNs;
             | 
             | - Structures and unions have the alignment requirement of
             | their most strictly aligned member;
             | 
             | - The members of a structure are laid out at increasing
             | offsets, with each member starting at the earliest offset
             | permitted by its alignment (while the members of a union
             | all start at offset zero as the standard requires);
             | 
             | - The structure or union is then padded at the end so that
             | its alignment divides its size.
             | 
             | If you avoid extended precision and SIMD types, the default
             | ABI settings should get you completely compatible layouts
             | here. (On an earlier ARM you might've run into mixed-endian
             | floats, but not on any Cortex flavour.) Even bitfields
             | would be entirely fine, except Microsoft _bloody_ Windows
             | had to be stupid there.
             | 
             | Honestly the only potential problem is item 1, an unspecified
             | 8-bit micro, and that only because the implicit integer
             | promotions of standard C make getting decent performance
             | out of those a bit of a crapshoot, leading to noncompliant
             | hacks like 8-bit ints or 48-bit long longs. Still, if the
             | usual complement of 8/16/32/64-bit integers is available,
             | the worst you're likely to have to do is spell out any
             | structure padding explicitly.
             | 
             | [1] https://thephd.dev/conformance-should-mean-something-
             | fputc-a...
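
The three structure rules listed above (each member at the earliest offset its alignment permits, struct alignment equal to the strictest member, tail padding so the alignment divides the size) can be sketched as a small layout calculator and compared against what the compiler actually does. `predict_layout`, `align_up`, and `struct sample` are hypothetical names; the sketch assumes C11 and power-of-two alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct used to compare prediction and reality. */
struct sample {
    uint8_t  a;
    uint32_t b;
    uint16_t c;
};

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Apply the layout rules to n members described by sizes[] and aligns[].
 * Writes each member's offset to offsets[], the struct's alignment to
 * *struct_align, and returns the padded total size. */
size_t predict_layout(const size_t *sizes, const size_t *aligns, size_t n,
                      size_t *offsets, size_t *struct_align)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        /* alignments must be non-zero powers of two */
        assert(aligns[i] != 0 && (aligns[i] & (aligns[i] - 1)) == 0);
        offset = align_up(offset, aligns[i]);  /* earliest permitted offset */
        offsets[i] = offset;
        offset += sizes[i];
        if (aligns[i] > max_align)
            max_align = aligns[i];             /* strictest member wins */
    }
    *struct_align = max_align;
    return align_up(offset, max_align);        /* tail padding */
}
```

Feeding it `sizeof` and `_Alignof` for `uint8_t`, `uint32_t`, and `uint16_t` should reproduce `offsetof` and `sizeof` for `struct sample` on any ABI that follows these rules.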
        
           | colonwqbang wrote:
           | > As for RV64 and Arm64
           | 
           | This was just a placeholder, perhaps a bad example. I program
           | a proprietary CPU architecture that does not require
           | alignment, and for which the compiler naturally prefers to
           | pack structs. Getting it to mimic Arm-style struct padding
           | is much harder and more error-prone than just having the Arm
           | pack everything.
           | 
           | Maybe you are right and we are heading for a One True Struct
           | Layout in the future. Today I think it is still too scary to
           | pass the same unpacked struct declaration to various compiler
           | archs and hope they come up with the same interpretation.
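
One hedge against compilers disagreeing about an unpacked declaration is to give every padding byte a name and let the build fail when the layout is not the expected one. This is a sketch under the assumption that all toolchains involved support C11 `static_assert` and provide the fixed-width integer types; `struct shared_msg` and its fields are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical message shared between two different CPUs.
 * Every byte is covered by a named member, so there is nothing left
 * for the compiler to pad, whether it aligns members or packs them. */
struct shared_msg {
    uint8_t  kind;
    uint8_t  reserved0[3];   /* explicit padding up to offset 4 */
    uint32_t payload_len;
    uint16_t checksum;
    uint8_t  reserved1[2];   /* explicit tail padding up to size 12 */
};

/* Each toolchain compiling this header must agree, or the build breaks. */
static_assert(offsetof(struct shared_msg, kind) == 0, "kind offset");
static_assert(offsetof(struct shared_msg, payload_len) == 4, "payload_len offset");
static_assert(offsetof(struct shared_msg, checksum) == 8, "checksum offset");
static_assert(sizeof(struct shared_msg) == 12, "total size");
```

The assertions cover offsets and size only. Byte order and the representation of any non-integer fields would still need a separate agreement.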
        
       | synergy20 wrote:
       | Pretty cool, love anything related to C.
       | 
       | Might want to add anonymous structs.
       | 
       | Also, function pointers inside a struct for simple
       | object-oriented programming in C.
       | 
       | A flexible array member is handy since you do one malloc for
       | everything, but a pointer inside the struct is more 'flexible':
       | for example, you can store a 'void *' and cast it to various
       | data types. With a flexible array, the element type must be
       | chosen up front.
        
         | keyle wrote:
         | In passing, there is no version of myself that hasn't shot
         | himself in the foot using `void *` at some point :)
        
       | zabzonk wrote:
       | > Using a Structure via a Pointer
       | 
       | The code that sets the array doesn't use the pointer.
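
The article's own listing is not part of this thread, so the following is only an illustrative sketch of what setting an array member through a pointer could look like. `struct person` and `person_init` are hypothetical names; the point is that both the array and the scalar member are reached with `->`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct with an array member. */
struct person {
    char name[32];
    int  age;
};

/* Fill in *p using only the pointer. The name is truncated if it does
 * not fit and is always NUL-terminated. */
void person_init(struct person *p, const char *name, int age)
{
    assert(p != NULL && name != NULL);
    strncpy(p->name, name, sizeof p->name - 1);  /* array member via -> */
    p->name[sizeof p->name - 1] = '\0';
    p->age = age;                                /* scalar member via -> */
}
```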
        
       ___________________________________________________________________
       (page generated 2023-06-29 23:00 UTC)