[HN Gopher] The Byte Order Fiasco
       ___________________________________________________________________
        
       The Byte Order Fiasco
        
       Author : cassepipe
       Score  : 269 points
       Date   : 2021-05-08 11:00 UTC (12 hours ago)
        
 (HTM) web link (justine.lol)
 (TXT) w3m dump (justine.lol)
        
       | londons_explore wrote:
       | Remember how we used to have machines with a 7 bit byte? And
       | everything was written to handle either 6, 7, or 8 bit bytes?
       | 
       | And now we've settled on all machines being 8 bit bytes, and
       | programmers no longer have to worry about such details?
       | 
       | Is it time to do the same for big endian machines? Is it time to
       | accept that all machines that matter are little endian, and the
       | extra effort keeping everything portable to big endian is no
       | longer worth the mental effort?
        
         | bumbada wrote:
          | What happens is that all machines that matter are little
          | endian, but the network always works in big endian.
        
           | londons_explore wrote:
           | We'll have to keep it as a quirk of history...
           | 
           | A bit like the electron has a negative charge...
        
             | PixelOfDeath wrote:
              | They had a 50/50 chance of getting the technical
              | direction of electricity right... and they fucked it up!
        
           | bombcar wrote:
            | Isn't big endian a bit more natural considered at the bit
            | level? The bits start from highest to lowest on a serial
            | connection.
        
             | ben509 wrote:
             | Big-endian is natural when you're comparing numbers, which
             | is probably why people represent numbers in a big-endian
             | fashion.
             | 
             | Little-endian is natural with casts because the address
             | doesn't change, and it's the order in which addition takes
             | place.
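              | 
              | A minimal C sketch of that second point, assuming a
              | little-endian host: the low byte of a wider integer lives
              | at the same address as the integer itself, so narrowing
              | casts never move the value in memory.
              | 
              |     #include <stdint.h>
              |     #include <stdio.h>
              |     
              |     int main(void) {
              |       uint32_t x = 0x11223344;
              |       unsigned char *p = (unsigned char *)&x;
              |       printf("%02x\n", p[0]); /* 44 on little endian */
              |       return 0;
              |     }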
        
               | kangalioo wrote:
               | I feel like big endian is more _intuitive_ because that's
               | what our number notation has evolved to be.
               | 
               | But more _natural_ is little endian because, well, it's
               | just more straightforward to have the digits' magnitude
               | be in ascending order (2^0, 2^1, 2^2, 2^3...) instead of
               | putting it in reverse.
               | 
                | Plus you encounter fewer roadblocks in practice with
                | little endian (e.g. address changes with casts), which
                | is often a sign of good natural design.
        
               | CydeWeys wrote:
               | I'm curious how you're defining "natural", and if you
               | think ISO-8601 is the reverse of "natural" too.
               | 
                | All human number systems I've ever seen write numbers out
                | as big-endian (yes, even Roman numerals), so I'm really
               | struggling to see how that wouldn't be considered
               | natural.
        
               | ByteJockey wrote:
                | It seems like it would be more natural for representing
                | the number when communicating with a human.
               | 
               | But that's not what we're doing here, so it's not
               | entirely relevant.
        
         | pabs3 wrote:
         | IBM is going to be pretty annoyed when your code doesn't work
         | on their mainframes.
        
           | jart wrote:
           | In my experience IBM does the right thing and sends patches
           | rather than asking us to fix their problems for them, and I
           | respect them for that reason, even if it's a tiny burden to
           | review those changes.
           | 
           | However endianness isn't just about supporting IBM. Modern
           | compilers will literally break your code if you alias memory
           | using a type wider than char. It's illegal per the standard.
           | In the past compilers would simply not care and say, oh the
           | architecture permits unaligned reads so we'll just let you do
           | that. Not anymore. Modern GCC and Clang force your code to
           | conform to the abstract standard definition rather than the
           | local architecture definition.
           | 
           | It's also worth noting that people think x86 architecture
           | permits unaligned reads but that's not entirely true. For
           | example, you can't do unaligned read-ahead on C strings,
            | because in extremely rare cases you might cross a page
            | boundary into memory that isn't mapped and trigger a
            | segfault.
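            | 
            | A sketch of the contrast (the cast is the thing modern
            | compilers punish; the memcpy is the blessed spelling, and
            | GCC/Clang fold it into a single load):
            | 
            |     #include <stdint.h>
            |     #include <string.h>
            |     
            |     /* UB: aliases memory with a type wider than char,
            |        and may be unaligned:
            |        uint32_t v = *(const uint32_t *)p;  */
            |     static uint32_t load32(const void *p) {
            |       uint32_t v;
            |       memcpy(&v, p, sizeof(v)); /* one mov on x86 */
            |       return v;                 /* host byte order */
            |     }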
        
             | froh wrote:
             | yes IBM provided asm for s390 hton ntoh, and "all we had to
             | do" for mainframe Linux was patch x86 only packages to use
             | hton ntoh when they persisted binary data. for the kernel
             | IBM did it on their own, contributing mainline, for
             | userland suse did it, grabbing some patches from japanese
             | turbolinux, and then red hat grabbed the patches from turbo
             | and suse, and together we got them mainline lol. and PPC
             | then just piggybacked on top of that effort.
        
         | einpoklum wrote:
         | > Is it time to accept that all machines that matter are little
         | endian.
         | 
          | Well, no, because it's not the case. SPARC is big-endian, as
          | are a bunch of IBM processors. ARM processors are mostly bi-
          | endian.
         | 
         | > Is it time to do the same for big endian machines?
         | 
         | No. Not just because of their prevalence, but because there
         | isn't a compelling reason why everything should be little-
         | endian.
        
         | genmon wrote:
         | That reminds me of a project to interface with vending
         | machines. (We built a bookshop in a vending machine that would
         | tweet whenever it sold an item, with automated stock
         | management.)
         | 
         | Vending machines have an internal protocol a little like I2C.
         | We created a custom peripheral to bridge the machine to the
         | web, based on a Raspberry Pi.
         | 
         | The protocol was defined by Coca Cola Japan in 1975 (in order
         | to have optionality in their supply chain). It's still in use
         | today. But because it was designed in Japan, with a need for
         | wide characters, it assumes 9 bit bytes.
         | 
         | We couldn't find any way to get a Raspberry Pi to speak 9 bit
         | bytes. The eventual solution was a custom shield that would
         | read the bits, and reserialise to 8 bit bytes for the Pi to
         | understand. And vice versa.
         | 
          | 9 bit bytes. I grew up knowing that bytes had variable length,
          | but this was the first time I encountered it in the wild. This
         | was 2015.
        
           | raverbashing wrote:
            | Well you could bit bang and the 9 bits wouldn't be an issue.
            | (Even if you had a tiny PIC microcontroller just to do that.)
            | 
            | This is best solved as close to the device in question as
            | possible, and in the simplest way possible.
        
             | ghoward wrote:
             | Sorry, dumb question: what is bit banging?
        
               | gspr wrote:
               | The practice of using software to literally toggle (or
               | read) individual pins with the correct software-
               | controlled timing in order to communicate with some
               | hardware.
               | 
               | To transmit a bit pattern 10010010 over a single pin
               | channel, for example, you'd literally set the pin high,
                | sleep for some predetermined amount of time, set it
               | low, sleep, set it low, sleep, set it high, etc.
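                | 
                | A rough sketch in C; gpio_write(), delay_us(), TX_PIN
                | and BIT_US are hypothetical stand-ins for whatever your
                | platform provides:
                | 
                |     /* Bit-bang one byte, MSB first, over one pin. */
                |     void send_byte(unsigned char b) {
                |       for (int i = 7; i >= 0; i--) {
                |         gpio_write(TX_PIN, (b >> i) & 1);
                |         delay_us(BIT_US); /* hold one bit period */
                |       }
                |     }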
        
               | thristian wrote:
               | In order to exchange data over a serial connection, the
               | ones and zeroes have to be sent with exact timing, so the
               | receiver can reliably tell where one bit ends and the
               | next begins. Because of this, the hardware that's doing
               | the communication can't do anything else at the same
               | time. And since the actual mechanics of the process are
               | simple and straightforward, most computers with a serial
               | connection have special serial-interface hardware (a
               | Universal Asynchronous Receiver/Transmitter, or UART) to
               | take care of it - the CPU gives the UART some data, then
               | returns to more productive pursuits while the UART works
               | away.
               | 
               | But sometimes you can't use a UART: maybe you're working
               | on a tiny embedded computer without one, or maybe you
               | need to speak a weird 9-bit protocol a standard UART
               | doesn't understand. In that case, you can make the CPU
               | pump the serial line directly. It's inefficient (there's
               | probably more interesting work the CPU could be doing)
               | and it can be difficult to make the CPU pause for
               | _exactly_ the right amount of time (CPUs are normally
               | designed to run as fast or as efficiently as possible,
                | nothing in between), but it's possible and sometimes
               | it's all you've got. That's bit-banging.
        
               | londons_explore wrote:
                | Consider being a teacher. That's a good explanation.
        
             | stefan_ wrote:
             | The irony is that while a tiny PIC can do bit banging
             | easily, the mighty Pi will struggle with it.
        
               | hazeii wrote:
               | I'm familiar with both, and have Pi's bit-banging at
               | 8MHz. It's not hard-realtime like a PIC though (where
               | I've bitbanged a resistor D2A hung off a dsPIC33 to
               | 17.734475MHz). It's an improvement over the years, but
               | surprisingly little since bit-banging 4MHz Z80's more
               | than 4 decades ago, where resolution was 1 T state
               | (250ns).
        
               | londons_explore wrote:
                | The 9 bit serial OP mentioned likely doesn't have a
                | separate clock line, so it is hard realtime and timing
                | matters a _lot_, and I doubt the Pi could reliably do
                | anything over 1 kbaud with bit banging. You could do
               | much better if you didn't run Linux.
        
           | BlueTemplar wrote:
           | We really should have moved to 32 bit bytes when moving to 64
           | bit words. Would have simplified Unicode considerably.
        
             | wongarsu wrote:
             | People were holding off on transitioning because pointers
             | use twice as much space in x64. If bytes had quadrupled in
             | space with x64 we would still be using 32 bit software
             | everywhere
        
               | BlueTemplar wrote:
               | Well, obviously it would have delayed the transition.
                | However you can only go so far with 4GB-limited memory.
                | 
                | And do you have examples of still widely used 8-bit sized
                | data formats?
        
               | jart wrote:
               | RGB and Y'CbCr
        
               | Narishma wrote:
               | You can go very far with just 4GB of memory, especially
               | when not using wasteful software.
        
               | owl57 wrote:
               | I assume you wrote this comment in UTF-8 over HTTP
               | (ASCII-based) and TLS (lots of uint8 fields).
        
             | jart wrote:
             | Use Erlang. It has 32-bit char.
        
               | toast0 wrote:
               | Not really. Strings are a list of integers [1], integers
               | are signed and fill a system word, but there's also 4
                | bits of type information. So you can have a 28-bit signed
                | integer char on a 32-bit system, or a signed 60-bit
                | integer on a 64-bit system.
               | 
                | However, since Unicode is limited to 21 bits by the
                | UTF-16 encoding, a Unicode code point will fit in a
                | small integer.
               | 
               | [1] unless you use binaries, which is often a better
               | choice.
        
             | spacechild1 wrote:
             | You know, bytes are not only about text, they are also used
             | to represent _binary_ data...
             | 
              | Not to mention that bytes have nothing to do with Unicode.
              | Unicode codepoints can be encoded in many different ways:
              | UTF-8, UTF-16, UTF-32, etc.
        
             | ChrisSD wrote:
             | Not really. Unicode is a variable width abstract encoding;
             | a single character can be made up of multiple code points.
             | 
              | For Unicode, 32-bit bytes would be an incredibly wasteful
              | in-memory encoding.
        
               | BlueTemplar wrote:
               | One byte = one "character" makes for much easier
               | programming.
               | 
               | Text generally uses a small fraction of memory and
               | storage these days.
        
               | cygx wrote:
               | Not all user-perceived characters can be represented as a
               | single Unicode codepoint. Hence, Unicode text encodings
               | (almost[1]) always have to be treated as variable length,
               | even UTF-32.
               | 
               | [1] at runtime, you could dynamically assign 'virtual'
               | codepoints to grapheme clusters and get a fixed-length
               | encoding for strings that way
        
               | jart wrote:
               | Even the individual unicode codepoints themselves are
               | variable width if we consider that things like cjk and
               | emoji take up >1 monospace cells.
        
               | lanstin wrote:
               | Every time I see one of these threads, my gratitude to
               | only do backend grows. Human behavior is too complex, let
               | the webdevs handle UI, and human languages are too
               | complex, not sure what speciality handles that. Give me
               | out of order packets and parsing code that skips a
               | character if the packet length lines up just so any day.
               | 
               | I am thankful that almost all the Unicode text I see is
               | rendered properly now, farewell the little boxes. Good
               | job lots of people.
        
               | jart wrote:
               | I think we really have the iPhone jailbreakers to thank
                | for that. U.S. developers were allergic to, almost
                | offended by, anything that wasn't ASCII, and then someone
                | released an app that unlocked the emoji icons that Apple
                | had originally intended only for Japan. Emoji are defined
                | in the astral planes, so almost nothing at the time was
                | capable of understanding them, yet they were so
                | irresistible
               | that developers worldwide who would otherwise have done
               | nothing to address their cultural biases immediately
               | fixed everything overnight to have them. So thanks to
               | cartoons, we now have a more inclusive world.
        
               | londons_explore wrote:
               | I'm pretty sure Unicode was pretty widespread before the
                | iPhone/emoji popularity.
        
               | cygx wrote:
               | There's supporting Unicode, and 'supporting' Unicode. If
               | you're only dealing with western languages, it's easy to
               | fall into the trap of only 'supporting' Unicode. Proper
               | emoji handling will put things like grapheme clusters and
               | zero-width joiners on your map.
        
               | kortex wrote:
               | > One byte = one "character" makes for much easier
               | programming.
               | 
               | Only if you are naively operating in the Anglosphere /
               | world where the most complex thing you have to handle is
               | larger character sets. In reality, there's ligatures,
               | diacritics, combining characters, RTL, nbsp, locales, and
               | emoji (with skin tones!). Not to mention legacy encoding.
               | 
               | And no, it does not use a "small fraction of memory and
               | storage" in a huge range of applications, to the point
               | where some regions have transcoding proxies still.
        
           | AnIdiotOnTheNet wrote:
           | This just doesn't seem right. Granted, I don't know much
           | about your use case, but Raspberry Pi's are powerful
           | computing devices and I find it difficult to believe there
           | was no way to handle this without additional hardware.
        
             | DeRock wrote:
             | I'm not familiar with the "vending machine" protocol he's
             | talking about, but it's entirely reasonable that it has
             | certain timing requirements. Usually the way you interface
             | with these is by having a dedicated HW block to talk the
             | protocol, or by bit banging. The former wouldn't be
             | supported on RPi because it's obscure, the latter requires
             | tight GPIO timing control that is difficult to guarantee on
             | a non-real-time system like the RPi usually runs.
        
         | [deleted]
        
         | DonHopkins wrote:
         | We used to have machines with arbitrarily sized bytes, and 36
         | bit words!
         | 
         | http://pdp10.nocrew.org/docs/instruction-set/Byte.html
         | 
         | >In the PDP-10 a "byte" is some number of contiguous bits
         | within one word. A byte pointer is a quantity (which occupies a
         | whole word) which describes the location of a byte. There are
         | three parts to the description of a byte: the word (i.e.,
         | address) in which the byte occurs, the position of the byte
         | within the word, and the length of the byte.
         | 
         | >A byte pointer has the following format:
          |        000000 000011 1 1 1111 112222222222333333
          |        012345 678901 2 3 4567 890123456789012345
          |       _________________________________________
          |      |      |      | | |    |                  |
          |      | POS  | SIZE |U|I| X  |        Y         |
          |      |______|______|_|_|____|__________________|
         | 
         | >POS is the byte position: the number of bits from the right
         | end of the byte to the right end of the word. SIZE is the byte
         | size in bits.
         | 
         | >The U field is ignored by the byte instructions.
         | 
         | >The I, X and Y fields are used, just as in an instruction, to
         | compute an effective address which specifies the location of
         | the word containing the byte.
         | 
         | "If you're not playing with 36 bits, you're not playing with a
         | full DEC!" -DIGEX (Doug Humphrey)
         | 
         | http://otc.umd.edu/staff/humphrey
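          | 
          | A rough C sketch of pulling those fields out of a byte
          | pointer held in the low 36 bits of a uint64_t (DEC numbers
          | bits from the left, so bit 0 is the most significant):
          | 
          |     #include <stdint.h>
          |     
          |     /* POS 0-5, SIZE 6-11, U 12 (ignored), I 13,
          |        X 14-17, Y 18-35, in DEC bit numbering.   */
          |     struct byteptr { unsigned pos, size, i, x; uint32_t y; };
          |     
          |     static struct byteptr decode(uint64_t w) {
          |       struct byteptr b;
          |       b.pos  = (w >> 30) & 077;     /* DEC bits  0-5  */
          |       b.size = (w >> 24) & 077;     /* DEC bits  6-11 */
          |       b.i    = (w >> 22) & 1;       /* DEC bit  13    */
          |       b.x    = (w >> 18) & 017;     /* DEC bits 14-17 */
          |       b.y    = w & 0777777;         /* DEC bits 18-35 */
          |       return b;
          |     }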
        
       | [deleted]
        
       | stkdump wrote:
       | Historical and obscure machines aside, there are a few things
       | modern C++ code should take for granted, because even new systems
       | will probably not bother breaking them: Text is encoded in UTF-8.
        | Negative integers are two's complement. Float is 32-bit IEEE 754,
        | double and long double are 64-bit IEEE 754. Char is 8 bit, short
       | is 16 bit, int is 32 bit, long long is 64 bit.
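        | 
        | If you do take those for granted, it costs nothing to make the
        | odd machine fail at compile time instead of at runtime; a
        | sketch using C11's _Static_assert (static_assert in C++):
        | 
        |     #include <limits.h>
        |     
        |     _Static_assert(CHAR_BIT == 8, "8-bit bytes");
        |     _Static_assert(sizeof(short) == 2, "16-bit short");
        |     _Static_assert(sizeof(int) == 4, "32-bit int");
        |     _Static_assert(sizeof(long long) == 8, "64-bit long long");
        |     _Static_assert((-1 & 3) == 3, "two's complement");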
        
       | pabs3 wrote:
       | I wonder if those macros work with middle-endian systems.
        
         | froh wrote:
         | no. but hton(3)/ntoh(3) from inet.h do.
        
         | dataflow wrote:
         | Is this a joke or am I just unaware of any systems out there
         | that are "middle-endian"..?!
        
           | mannschott wrote:
           | Sadly not a joke, but thankfully quite obscure:
           | https://en.wikipedia.org/wiki/Endianness#Middle-endian
        
           | hvdijk wrote:
           | There are no current middle-endian systems but they used to
           | exist. The PDP-11 is the most famous one. The macros would
           | work on all systems, but as only very old systems are middle-
           | endian, they also have old compilers so may not be able to
           | optimise it as well.
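            | 
            | For flavor, a sketch of the PDP-11 layout as commonly
            | described: a 32-bit value is stored as two little-endian
            | 16-bit words with the most significant word first, so
            | 0x0A0B0C0D lands in memory as 0B 0A 0D 0C.
            | 
            |     #include <stdint.h>
            |     
            |     /* Decode a PDP-11 middle-endian ("BADC") uint32. */
            |     static uint32_t read_pdp11_u32(const unsigned char *p) {
            |       uint32_t hi = (uint32_t)p[1] << 8 | p[0];
            |       uint32_t lo = (uint32_t)p[3] << 8 | p[2];
            |       return hi << 16 | lo;
            |     }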
        
       | ttt0 wrote:
       | https://twitter.com/m13253/status/1371615680068526081
       | 
       | Would it hurt anyone to define this undefined behavior and do
       | exactly what the source code says?
        
         | MauranKilom wrote:
         | Not sure what you think the source code "says". I mean, I know
         | what you want it to mean, but just because integer wrapping is
         | intuitive to you doesn't imply that that is what the code
         | means. C++ abstract machine and all.
         | 
         | But to answer the actual question: For C++20, integer types
         | were revisited. It is now (finally) guaranteed that signed
         | integers are two's complement, along with a list of other
         | changes. See http://www.open-
         | std.org/jtc1/sc22/wg21/docs/papers/2018/p090... also for how
         | the committee voted on the individual issues.
         | 
         | Note in particular:
         | 
         | > The main change between [P0907r0] and the subsequent revision
         | is to maintain undefined behavior when signed integer overflow
         | occurs, instead of defining wrapping behavior. This direction
         | was motivated by:
         | 
         | > - Performance concerns, whereby defining the behavior
         | prevents optimizers from assuming that overflow never occurs;
         | 
         | > - Implementation leeway for tools such as sanitizers;
         | 
         | > - Data from Google suggesting that over 90% of all overflow
         | is a bug, and defining wrapping behavior would not have solved
         | the bug.
         | 
         | So yes, the committee very recently revisited this specific
         | issue, and re-affirmed that signed integer overflow should be
         | UB.
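          | 
          | The performance point is easy to see in a tiny example: with
          | overflow undefined, the compiler may assume it never happens
          | and fold the comparison away entirely.
          | 
          |     /* GCC/Clang at -O2 compile this to "return 1":     */
          |     /* if x + 1 overflowed, that would be UB, so the    */
          |     /* optimizer is allowed to assume it cannot happen. */
          |     int always_true(int x) {
          |       return x + 1 > x;
          |     }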
        
           | ttt0 wrote:
           | I haven't noticed the signed integer overflow, which does
           | indeed complicate things, and I thought it was just the
           | infinite loop UB.
           | 
           | > Data from Google suggesting that over 90% of all overflow
           | is a bug, and defining wrapping behavior would not have
           | solved the bug.
           | 
           | Of _all_ overflow? Including unsigned integers where the
           | behavior is defined?
        
             | aliceryhl wrote:
             | That 90% of all overflows are bugs doesn't surprise me at
             | all, even if you include unsigned integers.
        
       | nly wrote:
       | This is why, in 2021, the mantra that C is a good language for
       | these low level byte twiddling tasks needs to die. Dealing with
       | alignment and endianness properly requires a language that allows
       | you to build abstractions.
       | 
        | The following is perfectly well defined in C++, despite looking
        | almost the same as the original unsafe C:
        | 
        |     #include <boost/endian.hpp>
        |     #include <cstdio>
        | 
        |     using namespace boost::endian;
        | 
        |     unsigned char b[5] = {0x80,0x01,0x02,0x03,0x04};
        | 
        |     int main() {
        |       uint32_t x = *((big_uint32_t*)(b+1));
        |       printf("%08x\n", x);
        |     }
       | 
       | Note that I deliberately misaligned the pointer by adding 1.
       | 
       | https://gcc.godbolt.org/z/5416oefjx
       | 
        | [Edit] Fun twist: the above code doesn't work if the
        | intermediate variable x is removed, because printf itself is
        | not type safe, so no type conversion happens (and the
        | conversion is where the deferred bswap takes place). In pure
        | C++ when using a type safe
       | formatting function (like fmt or iostreams) this wouldn't happen.
       | printf will let you throw any garbage in to it. tl;dr outside
       | embedded use cases writing C in 2021 is fucking nuts.
        
         | IgorPartola wrote:
         | As a very minor counterpoint: I like C because frankly it's
         | fun. I wouldn't start a web browser or maybe even an operating
         | system in it today, but as a language for messing around I find
         | it rewarding. I also think it is incredibly instructive in a
         | lot of ways. I am not a C++ developer but ANSI C has a special
         | place in my heart.
         | 
         | Also, I will say that when it comes to programming Arduinos and
         | ESP8266/ESP32 chips, I still find that C is my go to despite
         | things like Alia, MicroPython, etc. I think it's possible that
         | once Zig supports those devices fully that I might move over.
          | But in the meantime I guess I'll keep minding my off-by-one
          | errors.
        
         | themulticaster wrote:
         | This has nothing to do with C++ because your example only hides
         | the real issue occurring in the blog post example: The
         | unaligned read on the array. Try adding something like
         | printf("%08x\n", *((uint32_t*)(b)));
         | 
         | to your example and you'll see that it produces UB as well. The
          | reason there is no UB with big_uint32_t is probably that the
          | struct/class/whatever it is redefines its
         | dereferencing operator to perform byte-wise reads.
         | 
         | Godbolt example: https://gcc.godbolt.org/z/seWrb5cz7
        
           | nly wrote:
           | I fail to see your point. The point of my post is that the
           | abstractions you can build in C++ are as easy to use and as
           | efficient as doing things the wrong, unsafe way...so there's
           | no reason not to do things in a safe, correct way.
           | 
           | Obviously if you write C and compile it as C++ you still end
           | up with UB, because C++ aims for extreme levels of
           | compatibility with C.
        
             | themulticaster wrote:
             | Sorry for being unclear. My point is that the example in
             | the blog post does two things, a) it reads an unaligned
             | address causing UB and b) it performs byte-order swapping.
             | The post then goes on about avoiding UB in part b), but all
             | the time the UB was caused by the unaligned access in a).
             | 
             | Of course your example solves both a) and b) by using
             | big_uint32_t, and I agree that this is an interesting
             | abstraction provided by Boost, but I think the takeaway
             | "use C++ for low-level byte fiddling" is slightly
              | misleading: Say I was a novice C++ programmer who saw your
              | example of how C++ improves this but at the same time
              | didn't know that big_uint32_t solves the hassle of reading
              | a word from an unaligned address for me. Now I use your
              | pattern in
             | my byte-fiddling code, but then I need to read a word in
             | host endianness. What do I do? Right, I remember the HN
             | post and write *((uint32_t*)(b+1)) (without the big_,
             | because I don't need that!). And then I unintentionally
             | introduced UB. In other words, big_uint32_t is a little
             | "magic" in this case, as it suggests a similarity to
             | uint32_t which does not actually exist.
             | 
             | To be honest, I don't think the byte-wise reading is in any
             | way inappropriate in this case: If you're trying to read a
             | word _in non-native byte order from an unaligned access_ ,
             | it is perfectly fine to be very explicit about what you're
             | doing in my opinion. There also is nothing unsafe about
             | doing this as long as you follow certain guidelines, as
             | mentioned elsewhere in this thread.
        
               | nly wrote:
               | Sure, the only correct way to read an unaligned value in
               | to an aligned data type in both C or C++ is via memcpy.
               | 
               | I still think being able to define a type that models
               | what you're doing is incredibly valuable because as long
               | as you don't step outside your type system you get so
               | much for free.
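                | 
                | For reference, a sketch of that idiom (ntohl here is
                | the POSIX byte-order helper; mask-and-shift works just
                | as well): the memcpy has no alignment or aliasing
                | requirements, and optimizers turn it into a plain load
                | plus a bswap on little-endian targets.
                | 
                |     #include <arpa/inet.h> /* ntohl() */
                |     #include <stdint.h>
                |     #include <string.h>
                |     
                |     static uint32_t read_be32(const void *p) {
                |       uint32_t v;
                |       memcpy(&v, p, sizeof(v)); /* any alignment */
                |       return ntohl(v);          /* wire -> host  */
                |     }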
        
               | sgtnoodle wrote:
               | You could also mask and shift the value byte-wise just
               | like with an endian swap. Depending on the destination
                | and how aggressively the compiler optimizes memcpy,
               | it could even produce more optimal code, perhaps by
               | working in registers more.
               | 
               | Conceptual consistency is a good thing, but there is a
               | generally higher cognitive load to using C++ over C. I've
               | used both C++ and C professionally, and I've gone deeper
               | with type safety and metaprogramming than most folk. I've
               | mostly used C for the last few years, and I don't feel
               | like I'm missing anything. It's still possible to write
               | hard-to-misuse code by coming up with abstractions that
               | play to the language's strengths.
               | 
               | Operator overloading in particular is something I've
               | refined my opinion on over the years. My current thought
               | is that it's best not to use operators in
               | user/application defined APIs, and should be reserved for
               | implementing language defined "standard" APIs like the
               | STL. Instead, it's better to use functions with names
               | that unambiguously describe their purpose.
        
         | foldr wrote:
          | What are the advantages of this over a simple function with
          | the following signature?
          | 
          |     uint32_t read_big_uint32(char *bytes);
         | 
         | Having a big_uint32_t type seems wrong to me conceptually. You
         | should either deal with sequences of bytes with a defined
         | endianness or with native 32-bit integers of indeterminate
         | endianness (assuming that your code is intended to be endian
         | neutral). Having some kind of halfway house just confuses
         | things.
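          | 
          | (For concreteness, a minimal portable body for that
          | signature, with explicit widening casts so no byte is ever
          | shifted as a plain int; I've made the parameter unsigned
          | char to sidestep sign extension.)
          | 
          |     #include <stdint.h>
          |     
          |     uint32_t read_big_uint32(const unsigned char *bytes) {
          |       return (uint32_t)bytes[0] << 24 |
          |              (uint32_t)bytes[1] << 16 |
          |              (uint32_t)bytes[2] <<  8 |
          |              (uint32_t)bytes[3];
          |     }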
        
           | nly wrote:
           | The library provides those functions too, but I don't see how
            | having an arithmetic type with well defined size, endianness
           | and alignment is a bad thing.
           | 
           | If you're defining a struct to mirror a data structure from a
           | device, protocol or file format then the language / type
           | system should let you define the properties of the fields,
           | not necessarily force you to introduce a parsing/decoding
           | stage which could be more easily bypassed.
        
             | lanstin wrote:
             | It is no longer arithmetic if there is an endianness. Some
             | things are numbers and some things are sequences of bytes.
             | Arithmetic only works on the former.
        
           | mfost wrote:
            | I'd say: putting multiple of those types into a struct that
            | then perfectly describes the memory layout of each byte of
            | data in memory or in a network packet, in a reliable and
            | user-friendly way for the coder to manipulate.
        
             | foldr wrote:
             | I see. That does seem helpful once you consider how these
             | types compose, rather than thinking about a one-off
             | conversion. However, I think it would be cleaner to have a
             | library that auto-generated a parser for a given struct
             | paired with an endianness specification, rather than baking
             | the endianness into the types. (Probably this could be
             | achieved by template metaprogramming too.)
        
           | themulticaster wrote:
           | I agree, but a little nitpick: A sequence of bytes does not
            | have a defined endianness. Only groups of more than one byte
           | (i.e. half words, words, double words or whatever you want to
           | call them) have an endianness.
           | 
           | In practice, most projects (e.g. the Linux kernel or the
           | socket interface) differentiate between host (indeterminate)
           | byte order and a specific byte order (e.g. network byte
           | order/big endian).
        
         | jeffreygoesto wrote:
         | Wouldn't that cast be UB because it is type punning?
        
           | professoretc wrote:
              | char* is allowed to alias other pointer types.
        
             | jeffreygoesto wrote:
              | Hm. Afaik, you are always allowed to convert _to_ a char,
              | but _from_ is not ok in general. See e.g. [0]
             | 
             | [0] https://gist.github.com/shafik/848ae25ee209f698763cffee
             | 272a5...
        
         | jstimpfle wrote:
         | I find you missed the point of the post and the issues
         | described in it.
         | 
         | In my estimation, libraries like boost are way too big and way
         | too clever and they create more problems than they solve. Also,
         | they don't make me happy.
         | 
         | You're overfocusing on a "problem" that is almost completely
          | irrelevant for most of programming. Big endian is rarely
          | found (almost no hardware left, but some file formats
         | and networking APIs have big-endian data in them). Where you
         | still meet it, you don't do endianness conversions willy-nilly.
         | You have only a few lines in a huge project that should be
         | concerned with it. Similar situation for dealing with aligned
         | reads.
         | 
         | So, with boost you end up with a huge slow-compiling dependency
         | to solve a problem using obscure implicit mechanisms that
         | almost no-one understands or can even spot (I would never have
         | guessed that your line above seems to handle misalignment or
         | byte swapping).
         | 
         | This approach is typical for a large group of C++ programmers,
         | who seem to like to optimize for short code snippets,
         | cleverness, and/or pedantry.
         | 
         | The actual issue described in the post was the UB that is easy
         | to hit when doing bit shifting, caused by the implicit
         | conversions that are defined in C. While this is definitely an
         | unhappy situation, it's easy enough to avoid this using plain C
          | syntax (cast the expression to unsigned before shifting), using
          | no more code than the boost-type cast in your above code.
         | 
         | The fact that the UB is so easy to hit doesn't call for
         | excessive abstraction, but simply a revisit of some of the UB
         | defined in C, and how compiler writers exploit it.
         | 
          | (Anecdata: I've written a fair share of C code, though not
          | compression or encryption algorithms, and personally I'm not
         | sure I've ever hit one of the evil cases of UB. I've hit
         | Segmentation faults or had Out-of-bounds accesses, sure, but
         | personally I've never seen the language or compilers "haunt
         | me".)
        
           | jart wrote:
           | Do you use UBSAN and ASAN? When you write unit tests do you
           | feed numbers like 0x80000000 into your algorithm? When you
           | allocate test memory have you considered doing it with
           | mmap(4096) and putting the data at the _end_ of the map? (Or
           | better yet, double it and use mprotect). Those are some good
          | examples of torture tests if you're in the mood to feel
           | haunted.
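            | 
            | A sketch of that mmap trick on POSIX-ish systems: place
            | the test data flush against a PROT_NONE guard page so any
            | read past the end faults immediately (error handling
            | omitted).
            | 
            |     #include <string.h>
            |     #include <sys/mman.h>
            |     #include <unistd.h>
            |     
            |     static void *guarded(const void *data, size_t len) {
            |       size_t pg = (size_t)sysconf(_SC_PAGESIZE);
            |       char *m = mmap(0, 2 * pg, PROT_READ | PROT_WRITE,
            |                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            |       mprotect(m + pg, pg, PROT_NONE); /* traps */
            |       return memcpy(m + pg - len, data, len);
            |     }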
        
             | [deleted]
        
           | sdenton4 wrote:
           | Every day I spend futzing around with endianness is a day I'm
           | not solving 'real' problems. These things are a distraction
           | and a complete waste of developer time: It should be solved
           | 'once' and only worried about by people specifically looking
           | to improve on the existing solution. If it can't be handled
           | by a library call, there's something really broken in the
           | language.
           | 
           | (imo, both c and cpp are mainly advocated by people suffering
           | from stockholm syndrome.)
        
           | raphlinus wrote:
           | I agree with the bulk of this post.
           | 
           | Re the anecdata at the end. Have you ever run your code
           | through the sanitizers? I have. CVE-2016-2414 is one of my
           | battle scars, and I consider myself a pretty good programmer
           | who is aware of security implications.
        
             | jstimpfle wrote:
             | Very little, quite frankly. I've used valgrind in the past,
             | and found very few problems. I just ran
             | -fsanitize=undefined for the first time on one of my
             | current projects, which is an embedded network service of
             | 8KLOC, and with a quick test covering probably 50% of the
             | codepaths by doing network requests, no UB was detected (I
             | made sure the sanitizer works in my build by introducing a
             | (1<<31) expression).
             | 
             | Admittedly I'm not the type of person who spends his time
             | fuzzing his own projects, so my statement was just to say
             | that the kind of bugs that I hit by just testing my
             | software casually are almost all of the very trivial kind -
             | I've never experienced the feeling that the compiler
             | "betrayed" me and introduced an obscure bug for something
             | that looks like correct code.
             | 
             | I can't immediately see the problem in your CVE here [0],
             | was that some kind of betrayal by compiler situation? Seems
             | like strange things could happen if (end - start)
             | underflows.
             | 
             | [0] https://android.googlesource.com/platform/frameworks/mi
             | nikin...
        
               | raphlinus wrote:
               | This one wasn't specifically "betrayal by compiler," but
               | it was a confusion between signed and unsigned quantities
               | for a size field, which is very similar to the UB
               | exhibited in OP.
               | 
               | Also, the fact that you can't see the problem is actually
               | evidence of how insidious these problems are :)
               | 
               | The rules for this are arcane, and, while the solution
               | suggested in OP is correct, it skates close to the edge,
               | in that there are many similar idioms that are not ok. In
               | particular, (p[1] << 8) & 0xff00, which is code I've
               | written, is potentially UB (hence "mask, and then shift"
               | as a mantra). I'd be surprised if anyone other than jart
               | or someone who's been part of the C or C++ standards
               | process can say why.
        
             | [deleted]
        
             | vlovich123 wrote:
             | Raph, clearly you're just not as good a programmer as you
             | think you are.
        
               | raphlinus wrote:
               | Why thank you Vitali. Coming from you, that is high
               | praise indeed.
        
         | vladharbuz wrote:
         | Correct me if I'm wrong, but your example is just using a
         | library to do the same task, rather than illustrating any
         | difference between C and C++. If you want to pull boost in to
         | do this, that's great, but that hardly seems like a fair
         | comparison to the OP, since instead of implementing code to
         | solve this problem yourself you're just importing someone
         | else's code.
        
           | nly wrote:
           | No, the fact that this can be done in a library and looks
           | like a native language feature demonstrates the power of C++
           | as a language.
           | 
           | This example is demonstrating:
           | 
           | - First class treatment of user (or library) defined types
           | 
           | - Operator overloading
           | 
           | - The fact that it produces fast machine code. Try changing
           | big_uint32_t to regular uint32_t to see how this changes.
           | When you use the later ubsan will introduce a trap for
           | runtime checks, but it doesn't need to in this case.
        
             | simias wrote:
             | Operator overloading is a mixed blessing though, it can be
             | very convenient but it's also very good at obfuscating
             | what's going on.
             | 
             | For instance I'm not familiar with this boost library so
             | I'd have a lot of trouble piecing out what your snippet
             | does, especially since there's no explicit function call
             | besides the printf.
             | 
             | Personally if we're going the OOP route I'd much prefer
              | something like Rust's `var.to_be()`, `var.to_le()` etc... At
             | least it's very explicit.
             | 
             | My hot take is that operator overloading should only ever
             | be used for mathematical operators (multiplying vectors
             | etc...), everything else is almost invariably a bad idea.
        
               | pwdisswordfish8 wrote:
               | Ironically, it was proposed not so long ago to deprecate
               | to_be/to_le in favour of to_be_bytes/to_le_bytes, since
               | the former conflate abstract values with bit
               | representations.
        
               | nly wrote:
               | That's fine if whatever type 'var' happens to be is NOT
               | usable as an arithmetic type, otherwise you can easily
               | just forget to call .to_le() or .to_native(), or
               | whatever, and end up with a bug. I don't know Rust, so
               | don't know if this is the case?
               | 
               | Boost.Endian actually lets you pick between arithmetic
               | and buffer types.
               | 
               | 'big_uint32_buf_t' is a buffer type that requires you to
               | call .value() or do a conversion to an integral type. It
               | does not support arithmetic operations.
               | 
               | 'big_uint32_t' is an arithmetic type, and supports all
               | the arithmetic operators.
               | 
               | There are also variants of both endian suffixed '_at' for
               | when you know you have aligned access.
        
               | raphlinus wrote:
               | The idiomatic way to do this in Rust is to use functions
               | like .to_le_bytes(), so you have the u32 (or whatever) on
               | one end and raw bytes (something like [u8; 4]) on the
               | other. It can get slightly tedious if you're doing it by
               | hand, but it's impossible to accidentally forget. If
               | you're doing this kind of thing at scale, like dealing
               | with TrueType fonts (another bastion of big-endian), it's
               | common to reach for derive macros, which automate a great
               | deal of the tedium.
        
               | nly wrote:
               | Who decides what methods to add to the bytes
               | type/abstraction?
               | 
               | If I have a 3 byte big endian integer can I access it
               | easily in rust without resorting to shifts?
               | 
               | In C++ I could probably create a fairly convincing
               | big_uint24_t type and use it in a packed struct and there
               | would be no inconsistencies with how it's used with
               | respect to the more common varieties
        
               | raphlinus wrote:
               | In Rust, [u8; N] and &[u8] are both primitive types, and
               | not abstractions. It's possible to create an abstraction
               | around either (the former even more so now with const
               | generics), but that's not necessary. It's also possible
               | to use "extension traits" to add methods, even to
               | existing and built-in types[1].
               | 
               | I'm not sure about a 3 byte big endian integer. I mean,
               | that's going to compile down to some combination of
               | shifting and masking operations anyway, isn't it? I
               | suspect that if you have some oddball binary format that
               | needs, this it will be possible to write some code to
               | marshal it, that compiles down to the best possible asm.
               | Godbolt is your friend here :)
               | 
               | [1]: https://rust-lang.github.io/rfcs/0445-extension-
               | trait-conven...
        
               | nly wrote:
               | I agree then that in Rust you could make something
               | consistent.
               | 
               | I think there's no need for explicit shifts. You need to
               | memcpy anyway to deal with alignment issues, so you may
                | as well just copy into the last 3 bytes of a zero-
               | initialized, big endian, 32bit uint.
               | 
               | https://gcc.godbolt.org/z/jEnsW8WfE
        
               | raphlinus wrote:
               | That's just constant folding. Here's what it looks like
               | when you actually need to go to memory:
               | 
               | https://gcc.godbolt.org/z/9qGqh6M1E
               | 
               | And I think we're on the same page, it should be possible
               | to get similar results in Rust.
        
             | cbmuser wrote:
             | You are still casting one pointer type into another which
             | can result in unaligned access.
             | 
              | If you need to change byte orders, you should use a library
              | to achieve that.
        
               | nly wrote:
               | Boost.Endian is the library here and this code is safe
               | because the big_uint32_t type has an alignment
               | requirement of 1 byte.
               | 
               | This is why ubsan is silent and not even injecting a
               | check in to the compiled code.
               | 
               | You can check the alignment constraints with
               | static_assert (something else you can't do in standard
               | C): https://gcc.godbolt.org/z/KTcf9ax6r
        
               | kevin_thibedeau wrote:
               | C11 has static_assert:
               | https://gcc.godbolt.org/z/E3bGc95o3
               | 
              | It also has _Generic() so you can roll up a family of
               | endianness conversion functions and safely change types
               | without blowing up somewhere else with a hardcoded
               | conversion routine.
        
             | Brian_K_White wrote:
          | It demonstrates that C++ is even less safe.
        
         | 0x000000E2 wrote:
         | By the same token, I think most uses for C++ these days are
         | nuts. If you're doing a greenfield project 90% of the time it's
         | better to use Rust.
         | 
         | C++ has a multitude of its own pitfalls. Some of the C
         | programmer hate for C++ is justified. After all, it's just C
         | with a pre-processing stage in the end.
         | 
         | There's good reasons why many C projects never considered C++
         | but are already integrating the nascent Rust. I always hated
         | low level programming until Rust made it just as easy and
         | productive as high level stuff
        
         | jart wrote:
         | C is perfect for these problems. I like teaching the endian
         | serialization problem because it broaches so many of the topics
         | that are key to understanding C/C++ in general. Even if we
         | choose to spend the majority of our time plumbing together
         | functions written by better men, it's nice to understand how
         | the language is defined so we could write those functions, even
         | if we don't need to.
        
           | nly wrote:
           | For sure, it's a good way to teach that C is insufficient to
           | deal with even the simplest of tasks. Unfortunately teaching
           | has a bad habit of becoming practice, no matter how good the
           | intention.
           | 
           | With regard to teaching C++ specifically I tend to agree with
           | this talk:
           | 
           | CppCon 2015 - Kate Gregory "Stop Teaching C":
           | https://www.youtube.com/watch?v=YnWhqhNdYyk
        
             | jart wrote:
             | One of her slides was titled "Stop teaching pointers!" too.
             | My VP back at my old job snapped at me once because I got
             | too excited about the pointer abstractions provided by
             | modern C++. Ever since that day I try to take a more
             | rational approach to writing native code where I consider
             | what it looks like in binary and I've configured my Emacs
             | so it can do what clang.godbolt.org does in a single
             | keystroke.
        
               | nly wrote:
               | For the record, she's not really saying people shouldn't
               | learn this low level stuff... just that 'intro to C++'
               | shouldn't be teaching this stuff _first_
               | 
               | The biggest problem with C++ in industry is that people
               | tend to write "C/C++" when it deserves to be recognized
               | as a language in its own right.
        
               | jart wrote:
               | One does not simply introduce C++. It's the most insanely
               | hardcore language there is. I wouldn't have stood any
               | chance understanding it had it not been for my gentle
               | introduction with C for several years.
        
               | SAI_Peregrinus wrote:
               | C++ makes Rust look easy to learn.
        
               | pjmlp wrote:
               | Really?
               | 
               | Apparently the first year students at my university
                | didn't have any issue going from Standard Pascal to C++,
               | in the mid-90's.
               | 
               | Proper C++ was taught using our string, vector and
               | collection classes, given that we were still a couple of
               | years away from ISO C++ being fully defined.
               | 
               | C style programming with low level tricks were only
               | introduced later as advanced topics.
               | 
                | Apparently thousands of students managed to get through
                | the remaining 5 years of the degree.
        
               | BenjiWiebe wrote:
               | C++ in the mid 90s was a lot simpler than C++ now.
        
               | pjmlp wrote:
               | No one obliges you to write C++20 with SFINAE template
               | meta-programming, using classes with CTAD constructors.
               | 
               | Just like no Python newbie is able to master Python 3.9
               | full language set, standard library, numpy, pandas,
               | django,...
        
               | jart wrote:
               | Well there's a reason universities switched to Java when
               | teaching algorithms and containers after the 90's. C++ is
               | a weaker abstraction that encourages the kind of
               | curiosity that's going to cause a student's brain to melt
               | the moment they try to figure out how things work and
               | encounter the sorts of demons the coursework hasn't
               | prepared them to face. If I was going to teach it, I'd
               | start with octal machine codes and work my way up.
               | https://justine.lol/blinkenlights/realmode.html Sort of
               | like if I were to teach TypeScript then I'd start with
               | JavaScript. My approach to native development probably
               | has more in common with web development than it does with
               | modern c++ practices to be honest, and that's something I
               | talk about in one of my famous hacks: https://github.com/
               | jart/cosmopolitan/blob/4577f7fe11e5d8ef0a...
        
               | pjmlp wrote:
               | US universities maybe, there isn't much Java on my former
               | university learning plan.
               | 
               | The only subjects that went full into Java were
               | distributed computing and compiler design.
               | 
               | And during the last 20 years they already went back into
               | their decision.
               | 
               | I should note that languages like Prolog, ML and
               | Smalltalk were part of the learning subjects as well.
               | 
               | Assembly was part of electronic subjects where design of
               | a pseudo CPU was also part of the themes. So we had our
               | own pseudo Assembly, x86 and MIPS.
        
               | jcelerier wrote:
               | > Well there's a reason universities switched to Java
               | when teaching algorithms and containers after the 90's
               | 
               | Where ? I learned algorithms in C and C++ (and also a bit
               | in Caml and LISP) and I was in university 2011-2014
        
               | ta988 wrote:
                | Yes, this is the curse of knowledge: people who know C++
                | through decades of exposure are usually unable to bring
                | any newcomer to it.
        
           | microtherion wrote:
           | Yes, there is some value in using C for teaching these
           | concepts. But the problem I see is that, once taught, many
           | people will then continue to use C and their hand written
           | byte swapping functions, instead of moving on to languages
           | with better abstraction facilities and/or availing themselves
           | of the (as you point out) many available library
           | implementations of this functionality.
        
         | ok123456 wrote:
         | Or just use the functions in <arpa/inet.h> to convert from host
         | to network byteorder?
        
           | froh wrote:
           | this! use hton/ntoh and be happy.
           | 
           | nitpick: the 64bit versions are not fully available yet,
           | htonll, ntohll
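            | 
            | until then, a common fallback builds the 64-bit version
            | out of htonl() (a sketch; the htonl(1) check makes it a
            | no-op on big endian hosts):
            | 
            |     #include <arpa/inet.h>
            |     #include <stdint.h>
            |     
            |     static uint64_t my_htonll(uint64_t x) {
            |       if (htonl(1) == 1) return x; /* big endian host */
            |       return (uint64_t)htonl((uint32_t)x) << 32 |
            |              htonl((uint32_t)(x >> 32));
            |     }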
        
       | Animats wrote:
       | Rust gets this right. These primitives are available for all the
        | numeric types.
        | 
        |     u32::from_le_bytes(bytes) // u32 from 4 bytes, little endian
        |     u32::from_be_bytes(bytes) // u32 from 4 bytes, big endian
        |     u32::to_le_bytes(num)     // u32 to 4 bytes, little endian
        |     u32::to_be_bytes(num)     // u32 to 4 bytes, big endian
       | 
       | This was very useful to me recently as I had to write the
       | marshaling and un-marshaling for a game networking format with
       | hundreds of messages. With primitives like this, you can see
       | what's going on.
        
         | infradig wrote:
         | There are equivalent functions in C too. The point of the
          | article is about not using them. So how you would implement
          | the above functions in Rust would be more pertinent.
        
           | froh wrote:
            | isn't the point to be careful when implementing them, so the
            | compiler detects the intention to byteswap?
           | 
           | when we ported little endian x86 Linux to the big endian
           | mainframe we sprinkled hton/ntoh all over the place, happily
           | so. they are the way to go and they should be implemented
           | properly, not be replaced by a homegrown version.
           | 
            | all that said, I'm surprised 64-bit htonll and ntohll are not
            | standard yet. anybody know why?
        
             | thechao wrote:
             | Blech. I learned to program (around '99) by implementing
             | the crusty old FCS1.0 format, which allows for aggressively
             | weird wire formats. Our machine was a PDP-11/72 with its
             | head sawzalled off and custom wire wrap boards dropped in.
             | The "native" format (coming from analog) was 2143 order as
             | a 36b packet. The bits were [8,0:7] (using verilog
             | notation). However, sprinkled randomly in the binary header
              | were chunks of 7- and 8-bit ANSI (packed) and some mutant
             | knockoff 6-bit EBCDIC.
             | 
             | The original listing was written by "Jennifer -- please
             | call me if you have troubles", an undergraduate from MIT.
             | It was hand-assembled machine code, in a neat hand in a big
             | blue binder. That code ran non-stop except for a few
             | hurricanes from 1988 until 2008; bug-free as far as I could
             | tell. Jennifer last-name-unknown, you were my idol & my
             | demon!
             | 
             | I swore off programming for nearly a year after that.
        
         | Negitivefrags wrote:
         | Unless you are planning on running your game on a mainframe,
         | just don't bother with endianness for the networking.
         | 
         | Big endian is dead for game developers.
         | 
         | Copy entire arrays of structs onto the wire without fear!
         | 
         | (Just #pragma pack them first)
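         | 
         | For instance, a sketch of what that might look like (the struct
         | and its fields are hypothetical; GCC, Clang and MSVC all accept
         | this pragma):
         | 
         |     #include <stdint.h>
         | 
         |     #pragma pack(push, 1)       /* no padding bytes */
         |     struct net_msg {            /* same layout on every
         |                                    little-endian target */
         |         uint16_t type;
         |         uint32_t player_id;
         |         float    x, y, z;
         |     };
         |     #pragma pack(pop)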
        
           | musicale wrote:
           | > game on a mainframe
           | 
           | Maybe your program isn't a game.
           | 
            | Maybe you have to deal with a server that uses Power, or an
           | embedded system that uses PowerPC (or ARM or MIPS in big-
           | endian mode).
           | 
           | Maybe you're running on an older architecture (SPARC,
           | PowerPC, 68K.)
           | 
           | Maybe you have to deal with a pre-defined data format (e.g.
           | TCP/IP packet headers) that uses big-endian byte ordering for
           | some of its components.
        
             | Aeolun wrote:
             | That's theoretically possible. But I'd be very interested
             | in why. Especially if you are doing anything involving
             | networking.
        
       | einpoklum wrote:
       | This is valid code in C++20:
       | 
       |     if constexpr (std::endian::native == std::endian::big) {
       |         std::cout << "big-endian" << '\n';
       |     } else if constexpr (std::endian::native == std::endian::little) {
       |         std::cout << "little-endian" << '\n';
       |     } else {
       |         std::cout << "mixed-endian" << '\n';
       |     }
       | 
       | Doesn't solve everything, but it's saner even if what you're
       | writing is C-style low-level code.
        
       | st_goliath wrote:
       | FWIW there is a <sys/endian.h> on various BSDs that contains
       | "beXXtoh", "leXXtoh", "htobeXX", "htoleXX" where XX is a number
       | of bits (16, 32, 64).
       | 
       | That header is also available on Linux, but glibc (and compatible
       | libraries) named it <endian.h> instead.
       | 
       | See: man 3 endian (https://linux.die.net/man/3/endian)
       | 
       | Of course it gets a bit hairier if the code is also supposed to
       | run on other systems.
       | 
       | MacOS has OSSwapHostToLittleIntXX, OSSwapLittleToHostIntXX,
       | OSSwapHostToBigIntXX and OSSwapBigToHostIntXX in
       | <libkern/OSByteOrder.h>.
       | 
       | I'm not sure if Windows has something similar, or if it even
       | supports running on big endian machines (if you know, please
       | tell).
       | 
       | My solution for achieving some portability currently entails
       | cobbling together a "compat.h" header that defines macros for the
       | MacOS functions and including the right headers. Something like
       | this:
       | 
       | https://github.com/AgentD/squashfs-tools-ng/blob/master/incl...
       | 
       | This is usually my go-to-solution for working with low level on-
       | disk or on-the-wire binary data structures that demand a specific
       | endianness. In C I use "load/store" style functions that memcpy
       | the data from a buffer into a struct instance and do the endian
       | swapping (or reverse for the store). The copying is also
       | necessary because the struct in the buffer may not have proper
       | alignment.
       | 
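       | For the scalar case, one such load helper might look like this (a
       | minimal sketch, assuming the glibc-style <endian.h>; the name
       | load_be32 is just for illustration):
       | 
       |     #include <endian.h>
       |     #include <stdint.h>
       |     #include <string.h>
       | 
       |     static inline uint32_t load_be32(const void *p)
       |     {
       |         uint32_t v;
       |         memcpy(&v, p, sizeof v); /* the copy fixes alignment */
       |         return be32toh(v);       /* the swap fixes endianness */
       |     }
       | 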
       | Technically, the giant macro of doom in the article takes care of
       | all of this as well. But unlike the article, I would very much
       | not recommend hacking up your own stuff if there are systems
       | libraries readily available that take care of doing the same
       | thing in an efficient manner.
       | 
       | In C++ code, all of this can of course be neatly stowed away in a
       | special class with overloaded operators that transparently takes
       | care of everything and "decays" into a single integer and exactly
       | the above code after compilation, but is IMO somewhat cleaner to
       | read and adds much needed type safety.
        
         | anticristi wrote:
         | Indeed, I don't get the article. It's like writing "C is hard
         | because here is how hard it is to implement memcpy using SIMD
         | correctly."
         | 
         | Please don't do that. Use battle-tested low-level routines.
         | Unless your USP is "our software swaps bytes faster than the
         | competition", you should not spend brain power on that.
        
         | nwallin wrote:
         | Windows/MSVC has _byteswap_ushort(), _byteswap_ulong(),
         | _byteswap_uint64(). (note that unsigned long is 32 bits on
         | Windows) It's ugly but it works.
         | 
         | Boost provides boost::endian which allows converting between
         | native and big or little, which just does the right thing on
          | all architectures and compilers and compiles down to a no-op or
          | a bswap instruction. It's much better than writing (and
          | testing!) your own giant pile of macros and ifdefs to detect
         | the compiler/architecture/OS, include the correct includes, and
         | perform the correct conversions in the correct places.
        
         | [deleted]
        
         | tjoff wrote:
          | At least historically, Windows has had big-endian versions, as
          | both SPARC and Itanium use big endian.
        
           | electroly wrote:
           | Itanium can be configured to run in either endianness (it's
           | "bi-endian"). Windows on Itanium always ran in little-endian
           | mode and did not support big-endian mode. The same was true
           | of PowerPC. Windows never ran in big-endian mode on any
           | architecture.
        
       | Sebb767 wrote:
       | In case anyone else wonders how the code in the linked tweet [0]
       | would format your hard drive, it's the missing return on f1.
       | Therefore, f1 is empty as well (no ret) and calling it will
       | result in f2 being run. The commented out code is irrelevant.
       | 
       | EDIT: Reading the bug report [1], the actual cause for the
       | missing ret is that the for loop will overflow, which is UB and
       | causes clang to not emit any code for the function.
       | 
       | [0] https://twitter.com/m13253/status/1371615680068526081
       | 
       | [1] https://bugs.llvm.org/show_bug.cgi?id=49599
        
       | vlmutolo wrote:
       | > If you program in C long enough, stuff like this becomes second
       | nature, and it starts to almost feel inappropriate to even have
       | macros like the above, since it might be more appropriately
       | inlined into the specific code. Since there have simply been too
       | many APIs introduced over the years for solving this problem. To
       | name a few for 32-bit byte swapping alone: bswap_32, htobe32,
       | htole32, be32toh, le32toh, ntohl, and htonl which all have pretty
       | much the same meaning.
       | 
       | > Now you don't need to use those APIs because you know the
       | secret.
       | 
       | This sentiment seems problematic. The solution shouldn't be "we
       | just have to educate the masses of C programmers on how to
       | properly deal with endianness". That will never happen.
       | 
       | The solution should be "It's in the standard library. Go look
       | there and don't think too hard." C is sufficiently low-level, and
       | endianness problems sufficiently common, that I would expect that
       | kind of routine to be available.
        
         | lanstin wrote:
         | The point is that keeping the distinction clear in your head
         | between numeric semantics and sequence of octets semantics
          | makes the problem universally tractable. Here you have a data
          | structure with a numeric value. There you have a sequence
         | of octets described by some protocol formalism, BNF in the old
         | days. The mapping from one to the other occurs in the math
         | between octets and numeric values and the various network
         | protocols for representing numbers. There are many more choices
         | than just big endian or little endian. Could be ASN infinite
         | precision ints. Could be 32 bit IEEE floats or 64 bit IEEE
         | floats. The distinction is universal between language semantics
         | and external representations.
         | 
         | This is why people that memcpy structs right into the buf get
         | such derision, even if it's faster and written for a mono-
          | implementation of a language semantics. It is sloppy thought
         | made manifest.
        
         | pjmlp wrote:
         | Typical C culture, you would also expect that by now something
         | like SDS would be part of the standard as well.
         | 
         | https://github.com/antirez/sds
        
           | saagarjha wrote:
            | Adding an API that introduces an entirely new string model
            | that is incompatible with the rest of the standard library
            | seems like a nonstarter.
        
       | secondcoming wrote:
       | Isn't the 'modern' solution to memcpy into a temp and swap the
       | bytes in that? C++ has added/will add std::launder and std::bless
       | to deal with this issue
        
         | lanstin wrote:
         | No, it is to read a byte at a time and turn it into the
         | semantic value for the data structure you are filling in. Like
          | read 128 and then 1 and set the variable to 32769. If you are
          | the author of protobufs then you may run profiling and write
          | the best assembly etc., but otherwise no, don't do it.
        
         | loeg wrote:
         | > Isn't the 'modern' solution to memcpy into a temp and swap
         | the bytes in that?
         | 
         | Or just use the endian.h / sys/endian.h routines, which do the
         | right thing (be32dec / be32enc / whatever). memcpy+swap is
         | fine, and easier to get right than the author's giant
         | expressions, but you might as well use the named routines that
         | do exactly what you want already.
        
       | kingsuper20 wrote:
       | I've never been very satisfied with these approaches for C where
       | you hope the compiler does the right thing. It makes sense to
       | provide some C implementation for portability's sake but any
       | sizeable reordering cries out for a handtuned, processor
       | specific, approach (and the non-sizeable probably doesn't require
       | high speed). I would expect any SIMD instruction set to include a
       | shuffle.
        
         | phkahler wrote:
          | It can also be a good idea to swap recursively. First swap the
          | upper and lower halves, then swap the upper and lower quarters
          | (bytes for a 32-bit value), which can be done with only 2
          | masks. Then if it's a 64-bit value, swap alternate bytes,
          | again with only 2 masks. This can be extended all the way to a
          | full bit reverse in 3 more lines, each with 2 masks and shifts.
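          | 
          | A sketch of that approach for 32 bits (extending the same
          | pattern by one more line gives the 64-bit version):
          | 
          |     #include <stdint.h>
          | 
          |     static inline uint32_t bswap32(uint32_t x)
          |     {
          |         x = (x >> 16) | (x << 16);        /* swap halves */
          |         x = ((x & 0xFF00FF00u) >> 8)      /* swap bytes within */
          |           | ((x & 0x00FF00FFu) << 8);     /* each half */
          |         return x;
          |     }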
        
       | [deleted]
        
       | ipython wrote:
       | It's not every day you can write a blog post that calls out rob
       | pike... ;)
        
         | jart wrote:
         | Author here. I'm improving upon Rob Pike's outstanding work.
         | Standing on the shoulders of a giant.
        
           | ipython wrote:
           | Totally agree. My comment was made in jest. Mad kudos to you
           | as you clearly possess talent and humility that's in short
           | supply today.
        
       | bigbillheck wrote:
       | I just use ntohl/htonl like a civilized person.
       | 
       | (Yes, the article mentions those, but they've been standard for
       | decades).
        
         | froh wrote:
         | what's the best practice for 64bit values these days? is htonll
         | ntohll widely available yet?
        
       | amelius wrote:
       | Byte order is one of the great unnecessary historical fuck ups in
       | computing.
       | 
       | A similar one is that signedness of char is machine dependent.
       | It's typically signed on Intel and unsigned on ARM.
       | 
       | Sigh!
        
         | mytailorisrich wrote:
         | I don't think it's a fuck up, rather I think it was
         | unavoidable: Both ways are equally valid and when the time came
         | to make the decision, some people decided one way, some people
         | decided the other way.
        
         | amelius wrote:
         | By the way, mathematicians also have their fuck ups:
         | 
         | https://tauday.com/tau-manifesto
        
           | 8jy89hui wrote:
           | For anyone curious or who is still attached to pi, here is a
           | response to the tau manifesto:
           | 
           | https://blog.wolfram.com/2015/06/28/2-pi-or-not-2-pi/
        
         | joppy wrote:
         | Why is it an issue any more than say, order of fields in a
         | struct is an issue? In one case you read bytes off the disk by
          | doing ((b[0] << 8) | b[1]) (or equivalent); in the other, the
          | order is reversed. Any application-level (say, not
         | a compiler, debugger, etc) program should not even need to know
         | the native byte order, it should only need to know the encoding
         | that the file it's trying to read used.
        
           | zabzonk wrote:
           | > order of fields in a struct
           | 
           | This is defined in C to be the order the fields are declared
           | in.
        
             | occamrazor wrote:
             | But the padding rules between fields are a mess.
        
         | ta_ca wrote:
         | the greatest of all is lisp not being the most mainstream
         | language, and we can only blame the lisp companies for this
         | fiasco. in an ideal world we all would be using a lisp with
         | parametric polymorphism. from highest level abstractions to
         | machine level, all in one language.
        
           | ta_ca wrote:
           | i hope these downvotes are due to my failure at english or
           | the comment being off-topic (or both). if not, can i just
           | replace lisp with rust and be friends again?
        
           | [deleted]
        
         | bregma wrote:
         | And which is the correct byte ordering, pray tell?
        
           | bonzini wrote:
           | Little endian has the advantage that you can read the low
           | bits of data without having to adjust the address. So you can
           | for example do long addition in memory order rather than
           | having to go backwards, or (with an appropriate
           | representation such as ULEB128) in one pass without knowing
           | the size.
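            | 
            | For example, byte-at-a-time long addition can walk the buffer
            | forwards (a rough sketch; the name add_le is made up):
            | 
            |     #include <stddef.h>
            |     #include <stdint.h>
            | 
            |     /* dst = a + b over n-byte little-endian integers */
            |     void add_le(uint8_t *dst, const uint8_t *a,
            |                 const uint8_t *b, size_t n)
            |     {
            |         unsigned carry = 0;
            |         for (size_t i = 0; i < n; i++) { /* low byte first */
            |             unsigned s = a[i] + b[i] + carry;
            |             dst[i] = (uint8_t)s;
            |             carry = s >> 8;
            |         }
            |     }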
        
             | js8 wrote:
             | Maybe I am biased working on mainframes, but I would
             | personally take big endian over little endian. The reason
             | is when reading a hex dump, I can easily read the binary
             | integers from left to right.
        
               | bonzini wrote:
               | That's the only thing that BE has over LE.
               | 
               | But for example bitmaps in BE are a huge source of bugs,
               | as readers and writers need to agree on the size to use
               | for memory operations.
               | 
               | "SIMD in a word" (e.g. doing strlen or strcmp with 32- or
               | 64-bit memory accesses) might have mostly fallen out of
               | fashion these days, but it's also more efficient in LE.
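                | 
                | (The classic building block there is the "has zero byte"
                | test over a whole word, e.g.:
                | 
                |     #define HASZERO(v) \
                |         (((v) - 0x0101010101010101ULL) & ~(v) \
                |          & 0x8080808080808080ULL)
                | 
                | and on LE the first zero byte can then be located with a
                | count of trailing zeros rather than leading ones.)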
        
           | wongarsu wrote:
           | Big and little endian are named after the never-ending "holy"
           | war in Gulliver's Travels over how to open eggs. So we were
           | always of the opinion that it doesn't really matter. But I
           | open my eggs on the little end
        
           | CountHackulus wrote:
           | Middle-endian is the only correct answer. It's a tradeoff
           | between both little-endian and big-endian. The PDP-11 got it
           | right.
        
             | ben509 wrote:
             | Yup, we're all waiting for the rest of the world to catch
             | up to MM/DD/YYYY.
        
           | rwmj wrote:
           | Big Endian of course :-) However the one which has won is
           | Little Endian. Even IBM admitted this when it switched the
           | default in POWER 7 to little endian. s390x is the only
           | significant architecture that is still big endian.
        
           | kstenerud wrote:
           | Big endian is easier for humans to read when looking at a
           | memory dump, but little endian has many useful features in
           | binary encoding schemes due to the low byte being first.
           | 
           | I used to like big endian more, but after deep investigation
           | I now prefer little endian for any encoding schemes.
        
             | bombcar wrote:
             | Couldn't encoding systems be redone with emphasis on the
             | high-order bits? Or is the assumption that the values are
             | clustered in the low bits?
        
               | amelius wrote:
               | I think the fundamental problems is that if you start a
               | computation using N most significant bits and then
               | incrementally add more bits, e.g. N+M bits total, then
               | your first N bits might change as a result.
               | 
               | E.g. decimal example:                   1.00/1.00 = 1.00
               | 1.000/1.001 = 0.999000999000...
               | 
               | (adding one more bit changes the first bits of the
               | outcome)
        
               | kstenerud wrote:
               | You can put emphasis on high order bits, but that makes
               | decoding more complex. With little endian the decoder
               | builds low to high, which is MUCH easier to deal with,
               | especially on spillover.
               | 
               | For example, with ULEB128 [1], you just read 7 bits at a
               | time, going higher and higher up the value you're
               | reconstituting. If the value grows too big and you need
               | to spill over to the next (such as with big integer
               | implementations), you just fill the last bits of the old
               | value, then put the remainder bits in the next value and
               | continue on.
               | 
               | With a big endian encoding method (i.e. VLQ used in MIDI
               | format), you start from the high bits and work your way
               | down, which is fine until your value spills over. Because
               | you only have the high bits decoded at the time of the
               | spillover, you now have to start shifting bits along each
               | of your already decoded big integer portions until you
               | finally decode the lowest bit. This of course gets
               | progressively slower as the bits and your big integer
               | portions pile up.
               | 
               | Encoding is easier too, since you don't need to check if
                | for example a uint64 integer value can be encoded in 1,
                | 2, 3, 4, 5, 6, 7 or 8 bytes. Just encode the low 8 bits,
               | shift the source right by 8, repeat, until the source
               | value is 0. Then backtrack to the as-yet-blank encoded
               | length field in your message and stuff in how many bytes
               | you encoded. You just got the length calculation for
                | free. Use a scheme where you only encode up to 60-bit
               | values, place the length field in the low 4 bits, and
               | Robert's your father's brother!
               | 
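                | That loop, roughly (a sketch; encode_le is a made-up
                | name, with the length field stored separately as
                | described above):
                | 
                |     #include <stddef.h>
                |     #include <stdint.h>
                | 
                |     /* little-endian variable-length encode; the return
                |        value is what goes in the length field */
                |     size_t encode_le(uint8_t *out, uint64_t value)
                |     {
                |         size_t n = 0;
                |         do {                         /* low byte first */
                |             out[n++] = (uint8_t)value;
                |             value >>= 8;
                |         } while (value != 0);
                |         return n;
                |     }
                | 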
               | For data that is right-heavy (i.e. the fully formed data
               | always has real data on the right side and blank filler
               | on the left - such as uint32 value 8 is actually
               | 0x00000008), you want a little endian scheme. For data
               | that is left-heavy, you want a big endian scheme. Since
               | most of the data we deal with is right-heavy, little
               | endian is the way to go.
               | 
               | You can see how this has influenced my encoding design in
               | [2] [3] [4].
               | 
               | [1] https://en.wikipedia.org/wiki/LEB128
               | 
               | [2] https://github.com/kstenerud/concise-
               | encoding/blob/master/cb...
               | 
               | [3] https://github.com/kstenerud/compact-
               | float/blob/master/compa...
               | 
               | [4] https://github.com/kstenerud/compact-
               | time/blob/master/compac...
        
         | pantalaimon wrote:
         | The good thing is that Big Endian is pretty much irrelevant
         | these days. Of all the historically Big Endian architectures,
         | s390x is indeed the only one left that has not switched to
         | little endian.
        
           | globular-toast wrote:
           | Even if all CPUs were little-endian, big-endian would exist
           | almost everywhere _except_ CPUs, including in your head.
            | Unless you're some odd person who actually thinks in
            | little-endian.
        
           | chrisseaton wrote:
           | > The good thing is that Big Endian is pretty much irrelevant
           | these days.
           | 
           | This is nonsense - many file formats are big endian.
        
             | benjohnson wrote:
             | With a bonus of some being EBCDIC too.
        
               | lanstin wrote:
               | This is true.
        
           | erk__ wrote:
            | As was discussed in a subthread yesterday [0], ARM does
            | support big endian; though it is not used as much anymore,
            | it is still there.
            | 
            | POWER also still uses big endian, though recently little
            | endian POWER has gotten more popular.
           | 
           | [0]: https://news.ycombinator.com/item?id=27075419
        
           | akvadrako wrote:
           | Network protocols still mostly use "Network Byte Order", i.e.
           | big endian.
        
             | lanstin wrote:
             | Or text. Or handled by generated code like protobuf.
        
           | tssva wrote:
           | Network byte order is big endian so it is far from being
           | pretty much irrelevant these days.
        
             | BenoitEssiambre wrote:
             | Also, this might be irrelevant at the cpu level, but within
             | a byte, bits are usually displayed most significant bit
             | first, so with little endian you end up with bit order:
             | 
             | 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8
             | 
             | instead of
             | 
             | 15 to 0
             | 
             | This is because little endian is not how humans write
             | numbers. For consistency with little endianness we would
             | have to switch to writing "one hundred and twenty three" as
             | 
             | 321
        
               | froh wrote:
               | that's why little endian == broken endian
               | 
               | said a friend who also quips: "never trust a computer you
               | can lift"
        
               | LightMachine wrote:
                | Exactly. This is so infuriating. Whoever let little-
                | endian win did humanity a huge disservice.
        
               | jart wrote:
                | Blame the people who failed to localize the right-to-left
                | convention when Arabic numerals were adopted. It's one of
                | those things like pi vs. tau or Jacobin weights and
                | measurements vs. Planck units. Tradition isn't always
               | correct. John von Neumann understood that when he
               | designed modern architecture and muh hex dump is not an
               | argument.
        
               | kstenerud wrote:
               | The only benefit to big endian is that it's easier for
               | humans to read in a hex dump. Little endian on the other
               | hand has many tricks available to it for building
               | encoding schemes that are efficient on the decoder side.
        
               | tom_mellior wrote:
               | Could you elaborate on these tricks? This sounds
               | interesting.
               | 
               | The only thing I'm aware of that's neat in little endian
               | is that if you want the low byte (or word or whatever
               | suffix) of a number stored at address a, then you can
               | simply read a byte from exactly that address. Even if you
               | don't know the size of the original number.
        
               | kstenerud wrote:
               | I've posted in some other replies, but a few:
               | 
               | - Long addition is possible across very large integers by
               | just adding the bytes and keeping track of the carry.
               | 
               | - Encoding variable sized integers is possible through an
               | easy algorithm: set aside space in the encoded data for
               | the size, then encode the low bits of the value, shift,
               | repeat until value = 0. When done, store the number of
               | bytes you wrote to the earlier length field. The length
               | calculation comes for free.
               | 
               | - Decoding unaligned bits into big integers is easy
               | because you just store the leftover bits in the next
               | value of the bigint array and keep going. With big
               | endian, you're going high bits to low bits, so once you
               | pass to more than one element in the bigint array, you
               | have to start shifting across multiple elements for every
               | piece you decode from then on.
               | 
               | - Storing bit-encoded length fields into structs becomes
               | trivial since it's always in the low bit, and you can
               | just incrementally build the value low-to-high using the
               | previously decoded length field. Super easy and quick
               | decoding, without having to prepare specific sized
               | destinations.
        
               | mafuy wrote:
               | Correct me if I'm wrong, but were the now common numbers
               | not imported in the same order from Arabic, which writes
               | right to left? So numbers were invented in little endian,
               | and we just forgot to translate their order.
        
               | dahart wrote:
               | Good question, I just did a little digging to see if I
               | could find out. It sounds like old Arabic did indeed use
               | little endian in writing and speaking, but modern Arabic
               | does not. However, place values weren't invented in
               | Arabic, Wikipedia says that occurred in Mesopotamia,
               | which spoke primarily Sumerian and was written in
               | Cuneiform - where the direction was left to right.
               | 
               | https://en.wikipedia.org/wiki/Number#First_use_of_numbers
               | 
               | https://en.wikipedia.org/wiki/Mesopotamia
               | 
               | https://en.wikipedia.org/wiki/Cuneiform
        
               | gpanders wrote:
               | It might not be how humans _write_ numbers but it is
               | consistent with how we think about numbers in a base
               | system.
               | 
               | 123 = 3x10^0 + 2x10^1 + 1x10^2
               | 
               | So if you were to go and label each digit in 123 with the
               | power of 10 it represents, you end up with little endian
               | ordering (eg the 3 has index 0 and the 1 has index 2).
               | This is why little endian has always made more sense to
               | me, personally.
        
               | dahart wrote:
               | I always think about _values_ in big endian, largest
               | digit first. Scientific notation, for example, since
               | often we only care about the first few digits.
               | 
               | I sometimes think about _arithmetic_ in little endian,
               | since addition always starts with the least significant
               | digit, due to the right-to-left dependency of carrying.
               | 
               | Except lately I've been doing large additions big-endian
               | style left-to-right, allowing intermediate "digits" with
               | a value greater than 9, and doing the carry pass
               | separately after the digit addition pass. It feels easier
               | to me to think about addition this way, even though it's
               | a less efficient notation.
               | 
               | Long division and modulus are also big-endian operations.
               | My favorite CS trick was learning how you can compute any
               | arbitrarily sized number mod 7 in your head as fast as
               | people are reading the digits of the number, from left to
               | right. If you did it little-endian you'd have to remember
               | the entire number, but in big endian you can forget each
               | digit as soon as you use it.
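                | 
                | (That trick is just Horner's rule applied digit by
                | digit; a sketch, with a made-up name:
                | 
                |     /* streaming mod 7 over decimal digits, most
                |        significant digit first */
                |     unsigned mod7(const char *digits)
                |     {
                |         unsigned r = 0;
                |         for (; *digits; digits++)
                |             r = (r * 10 + (unsigned)(*digits - '0')) % 7;
                |         return r;
                |     }
                | 
                | Only the running remainder has to be kept in your head.)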
        
               | BenoitEssiambre wrote:
               | I don't know, when we write in general, we tend to write
               | the most significant stuff first so you lose less
                | information if you stop early. Even numbers we truncate:
                | "twelve million" instead of something like "twelve
                | million, zero thousand, zero hundred and zero".
        
               | lanstin wrote:
               | Next you are going to want little endian polynomials, and
               | that is just too far. Also, the advantage of big endian
               | is it naturally extends to decimals/negative exponents
               | where the later on things are less important. X squared
               | plus x plus three minus one over x plus one over x
               | squared etc.
               | 
               | Loss of big endian chips saddens me like the loss of
               | underscores in var names in Go Lang. The homogeneity is
               | worth something, thanks intel and camelCase, but the old
               | order that passes away and is no more had the beauty of a
               | new world.
        
               | occamrazor wrote:
               | In German _ein hundert drei und zwanzig_, literally _one
                | hundred three and twenty_. The hardest part is
                | telephone numbers, which are usually given in blocks of
               | two digits.
        
               | lanstin wrote:
               | Well that would be hard for me to learn. I always find
               | the small numbers between like 10 and 100 or 1000 the
               | hardest for me to remember in languages I am trying to
               | learn a bit of.
        
       | mrlonglong wrote:
       | In an ideal world which endian format would one go for?
        
         | tempodox wrote:
         | I for one would go for big-endian, simply because reading
         | memory dumps and byte blocks in assembly or elsewhere works
          | without mental byte-swapping arithmetic for multi-byte
         | entities.
         | 
         | Just out of curiosity, I would be interested in learning why so
         | many CPUs today are little-endian. Is it because it is cheaper
         | / more efficient for processor implementations or is it because
         | "the others do it, so we do it the same way"?
        
           | anticristi wrote:
           | My brain is trained to read little-endian in memory dumps.
            | It's no different than the German "fünf-und-zwanzig" (five
           | and twenty). :))
        
           | bombcar wrote:
            | https://stackoverflow.com/questions/5185551/why-is-x86-littl...
           | 
           | It simplifies certain instructions internally. Practically
           | everything is little endian because x86 won.
           | 
           | > And if you think about a serial machine, you have to
           | process all the addresses and data one-bit at a time, and the
           | rational way to do that is: low-bit to high-bit because
           | that's the way that carry would propagate. So it means that
           | [in] the jump instruction itself, the way the 14-bit address
           | would be put in a serial machine is bit-backwards, as you
           | look at it, because that's the way you'd want to process it.
           | Well, we were gonna built a byte-parallel machine, not bit-
           | serial and our compromise (in the spirit of the customer and
           | just for him), we put the bytes in backwards. We put the low-
           | byte [first] and then the high-byte. This has since been
           | dubbed "Little Endian" format and it's sort of contrary to
           | what you'd think would be natural. Well, we did it for
           | Datapoint. As you'll see, they never did use the [8008] chip
           | and so it was in some sense "a mistake", but that [Little
           | Endian format] has lived on to the 8080 and 8086 and [is] one
           | of the marks of this family.
        
             | mrlonglong wrote:
             | And does middle endian even exist?
        
               | FabHK wrote:
               | US date format: 12/31/2021
        
         | bitwize wrote:
         | Little endian. There is no extant big-endian CPU that matters.
        
           | mrlonglong wrote:
           | I did say in an ideal world.
        
             | bitwize wrote:
             | Hint: The reason why it's called "endianness" comes from
             | the novel _Gulliver 's Travels_, in which the neighboring
             | nations of Lilliput and Blefuscu went to bitter, bloody war
             | over which end to break your eggs from: the big end or the
             | little end. The warring factions were also known as Big-
             | Endians and Little-Endians, and each thought themselves
             | superior to the dirty heathens on the other side. If one
             | side were objectively correct, if there were an inherent
             | advantage to breaking your egg from one side or the other,
             | would there be a war at all?
        
               | dragonwriter wrote:
               | > if there were an inherent advantage to breaking your
               | egg from one side or the other, would there be a war at
               | all?
               | 
               | Fascism vs. not-fascism, Stalinist Communism vs. Western
               | Capitalism, Islamism vs. liberal democracy... I'm not
               | sure "the existence of war around a divide in ideas
                | proves that neither side's ideas are correct" is a
               | particularly comfortable maxim to consider the
               | ramifications of.
        
         | enqk wrote:
         | https://fgiesen.wordpress.com/2014/10/25/little-endian-vs-bi...
        
           | marcosdumay wrote:
           | Why would one choose the memory representation of the number
           | based on the advantages of the internal ALU wiring?
           | 
           | Of all those reasons, the only one I can make sense of is the
           | "I can't transparently widen fields after the fact!", and
           | that one is way too niche to explain anything.
        
             | enqk wrote:
             | I don't understand? Why not make the memory representation
             | sympathetic with the operations you're going to do on it?
             | It's the raison d'etre of computers to compute and to do it
             | fast.
             | 
             | Another example: memory representation of pixels in GPUs
             | which are swizzled to make computations efficient
        
               | marcosdumay wrote:
               | > I don't understand? Why not make the memory
               | representation sympathetic with the operations you're
               | going to do on it?
               | 
               | There's no reason to, as there's no reason not to. It's
               | basically irrelevant.
               | 
                | If carry passing is so important, why can't you just
                | mirror your transistors and operate on the same wires,
                | but in the opposite order? Well, you can, and it's
                | trivial. (And, by the way, carry passing isn't
                | important. High-performance ALUs pass the carry only
                | through blocks, which can appear anywhere. And the wiring
                | of those isn't even planar, so how you arrange them isn't
                | a showstopper.)
        
       | cygx wrote:
       | _> So the solution is simple right? Let's just use unsigned char
       | instead. Sadly no. Because unsigned char in C expressions gets
       | type promoted to the signed type int._
       | 
       | If you do use _unsigned char_, an alternative to masking would
       | be performing the cast to _uint32_t_ before instead of after the
       | shift.
       | 
       | _edit:_ For reference, this is what it would look like when
       | implemented as a function instead of a macro:
       | 
       |     static inline uint32_t read32be(const uint8_t *p)
       |     {
       |         return (uint32_t)p[0] << 24
       |              | (uint32_t)p[1] << 16
       |              | (uint32_t)p[2] <<  8
       |              | (uint32_t)p[3];
       |     }
        
       | jcadam wrote:
       | A while back I was on a project to port a satellite simulator
       | from SPARC/Solaris to RHEL/x64. The compressed telemetry stream
       | that came from the satellite needed to be in big endian (and
       | that's what the ground station software expected), and the
       | simulator needed to mimic the behavior.
       | 
       | This was not a problem for the old SPARC system, which naturally
       | put everything in the correct order without any fuss, but one of
       | the biggest sticking points in porting over to x64 was having to
        | now manually pack all of that binary data. Using Ada (what
        | else!), of course.
        
         | metiscus wrote:
          | If memory serves correctly, Ada 2012 and beyond has language-
          | level support for this. I was working on porting some code from
          | an aviation platform to run on PC and it was all in Ada 2005 so
          | we didn't have the benefit of that available.
        
           | jcadam wrote:
           | Same here, Ada2005 for the port. The simulator was originally
           | written in Ada95. Part of what made it even less fun was the
           | data was highly packed and individual fields crossed byte
           | boundaries (these 5 bits are X, the next 4 bits are Y, etc.)
           | :(
        
             | bombcar wrote:
             | Given enough memory it may be worth treating the whole
             | stream internally as a bitstream.
        
             | onox wrote:
             | Couldn't you add the Bit_Order and Scalar_Storage_Order
             | attributes (or aspects in Ada 2012) to your records/arrays?
             | Or did Scalar_Storage_Order not exist at the time?
        
       | the_real_sparky wrote:
        | This problem is its own special horror in CAN bus data. Between
       | endianness and sign it's a nightmare of en/decoding possibilities
       | and the associated mistakes that come with that.
        
         | rwmj wrote:
         | TIFF is another one. The only endian-switchable image format
         | that I'm aware of.
         | 
         | Fun fact: CD-ROM superblocks have both-endian fields. Each
         | integer is stored twice in big and little endian format. I
         | assume this was to allow underpowered 80s hardware which didn't
         | have enough resource to do byte swapping.
        
       | gumby wrote:
       | In her first sentence, the phrase "the C / C++ programming
       | language" is no longer correct: C++20 requires two's complement
       | signed integers.
       | 
       | C++ 20 is quite new so I would assume that very few people know
       | this yet.
       | 
       | C and C++ obviously differ a lot, but by that phrase she clearly
       | means "the part where then two languages overlap". The C++
       | committee has been willing to break C compatibility in a few ways
       | (not every valid C program is a valid C++ program), and this has
       | been true for a while.
        
         | loeg wrote:
         | It hasn't been true since C99, at least -- C++ didn't adopt C99
         | designated initializers.
        
         | hctaw wrote:
         | What chips can be targeted by C compilers today that don't use
         | 2's complement?
        
           | gumby wrote:
           | I haven't seen a one's complement machine in decades but at
            | the time C was standardized there were still quite a few
           | (afaik none had a single-chip CPU, to get to your question).
           | But since they existed, the language definition didn't
           | require it and some optimizations were technically UB.
           | 
           | The C++ committee decided that everyone had figured this out
           | by now and so made this breaking change.
        
         | klyrs wrote:
         | "the c/c++ language" exists insofar as you can import this c
         | code into your c++, and this is something that c++ programmers
         | need to know how to do, so they'd better learn enough of the
         | differences between c and c++ or they'll be stumped when they
         | crack open somebody else's old code.
        
       | mitchs wrote:
       | Or just cast the pointer to uint##_t and use be##toh and htobe##
        | from <endian.h>? I think this is making a mountain out of a
        | molehill. I've spent tons of time doing wire (de)serialization
        | in C for network protocols and endian swaps are far from the
        | most pressing issue I see. The big problem imo is the unsafe
        | practices around buffer handling allowing buffer overruns.
        
       | amluto wrote:
       | Why mask and then shift instead of casting to the correct type
       | and then shifting, like this:
       | 
       |     (uint32_t)x[0] << 24 | ...
       | 
       | Of course, this requires that x[0] be unsigned.
        
         | syockit wrote:
         | If this is for deserialisation then it's okay for x[0] to be
         | signed. You just need to recast the result as int32_t (or
         | simply assign to an int32_t variable without any cast) and it
         | is not UB.
        
       | baby wrote:
       | I suggest this article:
       | https://www.cryptologie.net/article/474/bits-and-bytes-order...
       | (shameless plug)
        
       | tails4e wrote:
       | Ubsan should default on. If people don't like it, then they
       | should be made to turn it off with a switch, so at least it's more
       | likely to be run than not run. Could save a huge amount of time
       | debugging when compilers or architectures change. Without it, I'd
       | say many a programmer would be caught by these subtleties in the
       | standard. Coming from a HW background (Verilog) I'd more
       | naturally default to masking and shifting when building up larger
       | variables from smaller ones, but I can imagine many would not.
        
         | patrakov wrote:
         | There was a blog post and a FOSDEM presentation by (misguided)
         | Gentoo developers a few years ago, and it was retracted,
         | because sanitizers add their own exploitable vulnerabilities
         | due to the way they work.
         | 
         | https://blog.hboeck.de/archives/879-Safer-use-of-C-code-runn...
         | 
         | https://www.openwall.com/lists/oss-security/2016/02/17/9
        
           | jart wrote:
           | Sanitizers have the ability to bring Rust-like safety
           | assurances to all the C/C++ code that exists. The fact that
           | existing ASAN runtimes weren't designed for setuid binaries
           | shouldn't dissuade us from pursuing those benefits. We just
           | need a production-worthy runtime that does less things. For
           | example, here's the ASAN runtime that's used for the redbean
           | web server: https://github.com/jart/cosmopolitan/blob/master/
           | libc/intrin...
        
             | pornel wrote:
             | Run-time detection and heuristics on a language that is
             | hard to analyze (e.g. due to weak aliasing, useless const,
             | ad-hoc ownership and thread-safety rules) aren't in the
             | same ballpark as compile-time safety guaranteed by
             | construction, and an entire modern ecosystem centered
             | around safety. Rust can use LLVM sanitizers in addition to
             | its own checks, so that's not even a trade-off.
        
           | tails4e wrote:
           | Sorry for my ignorance, but surely some UB being used for
           | optimization by the compiler is compile time only. This is
           | the part that should default on. Runtime detection is a
            | different thing entirely, but compile time is a no-brainer.
        
         | MauranKilom wrote:
         | > Ubsan should default on
         | 
         | > Could save a huge amount of time debugging when compilers or
         | architecture changes.
         | 
         | I'm assuming we come from very different backgrounds, but it's
         | not clear to me how switching compilers or _architectures_ is
         | so common that hardening code against it _by default_ is
         | appropriate. I would think that switching compilers or
         | architectures is generally done very deliberately, so
         | instrumenting code with UBsan _for that transition_ would be
         | the right thing to do?
        
           | toast0 wrote:
           | Changing compilers is a pretty regular thing IMHO; I use the
           | compiler that comes with the OS and let's assume a yearly OS
           | release cycle. Most of those will contain at least some
           | changes to the compiler.
           | 
           | I don't really want to have to take that yearly update to go
            | through and review (and presumably fix) all the UB that has
           | managed to sneak in over the year. It would be better to have
           | avoided putting it in.
        
           | tails4e wrote:
            | Changing gcc versions could cause your code with undefined
            | behaviour to change. If you rely on UB, whether you know you
            | are doing so or not, you are in for a bad time. Ubsan at
            | least lets you know if your code is robust, or a ticking
            | time bomb...
        
         | jedisct1 wrote:
         | Sanitizers may introduce side channels. This is an issue for
         | crypto code.
        
       | rwmj wrote:
       | If you can assume GCC or Clang then __builtin_bswap{16,32,64}
       | functions are provided which will be considerably more efficient,
       | less error-prone, and easier to use than anything you can
       | homebrew.
        
         | dataflow wrote:
         | _byteswap_{ushort,ulong,uint64} for MSVC. Together with yours
         | on x86 these should take care of the three major compilers.
        
         | st_goliath wrote:
         | Well, yes. The only thing missing is knowing if you have to
         | swap or not, if you don't want to assume your code will run on
         | little endian systems exclusively.
         | 
         | Or, on Linux and BSD systems at least, you can use the
         | <endian.h> or <sys/endian.h> functions
         | (https://linux.die.net/man/3/endian) and rely on the libc
         | implementation to do the system/compiler detection for you and
         | use an appropriate compiler builtin inside of an inline
         | function instead of bothering to hack something together in
         | your own code.
         | 
         | The article mentions those functions at the bottom, but
         | strangely still recommends hacking up your own macros.
        
         | jart wrote:
         | That's not true. If you write the byte swap in ANSI C using the
         | gigantic mask+shift expression it'll optimize down to the bswap
         | instruction under both GCC and Clang, as the blog post points
         | out.
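          | 
          | The expression in question is along these lines (a sketch of
          | the idiom, not the article's exact macro):
          | 
          |     #include <stdint.h>
          | 
          |     static inline uint32_t swap32(uint32_t x)
          |     {
          |         return (x & 0xFF000000u) >> 24 | (x & 0x00FF0000u) >> 8
          |              | (x & 0x0000FF00u) <<  8 | (x & 0x000000FFu) << 24;
          |     }
          | 
          | which both compilers recognize and turn into a single bswap
          | (or rev) instruction at typical optimization levels.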
        
           | rwmj wrote:
           | Assuming the macros or your giant expression are correct. But
           | you might as well use the compiler intrinsics which you
           | _know_ are both correct and the most efficient possible, and
           | get on with your life.
        
             | jart wrote:
              | Sorry, I'd rather place my faith in arithmetic than in
              | someone's API, provided the compiler is smart enough to
             | understand the arithmetic and optimize accordingly.
        
               | loeg wrote:
               | "Someone" here is the same compiler you're trusting to
               | optimize your giant arithmetic expression of the same
               | idea. Your statement is internally inconsistent.
        
               | lanstin wrote:
               | There is a value to keeping it completely clear in your
               | head the difference between a value with arithmetic
               | semantics vs a value with octets in a stream semantics.
               | That thinking will work in all contexts, while the
               | compiler knowledge is limited. The thinking will help you
               | write correct ways to encode data in the URL or into a
               | file being uploaded that your code generates for discord
               | or whatever, in Python, without knowledge of the true
               | endianness of the system the code is running on.
        
               | [deleted]
        
           | borman wrote:
           | Funny that compilers (e.g. clang:
            | https://github.com/llvm/llvm-project/blob/b04148f77713c92ee5...)
            | might be able to do that only because someone on the compiler
            | team has hand-coded a bswap expression detector.
        
         | bombcar wrote:
         | Given it can be done with careful code AND many processors have
         | a single instruction to do it I'm surprised it hasn't been
         | added to the C standard.
        
         | savant2 wrote:
         | The article explicitly shows that the provided macros are very
         | efficient with a modern compiler. You can check on godbolt.org
         | that they emit the same code.
         | 
         | Though the article only mentions bswap64 and mentioning
         | __builtin_bswap64 would be a nice addition.
        
         | fanf2 wrote:
         | But then you have to #ifdef the endianness of the target
         | architecture. If you do it the right way as Russ Cox and
         | Justine Tunney say, then your code can serialize and
         | deserialize correctly regardless of the platform endianness.
        
         | chrisseaton wrote:
         | __builtin_bswap does exactly the same thing as the macros.
        
         | russdill wrote:
          | The fallacy in the article is the idea that anyone should code
          | these functions themselves. There are plenty of public domain
          | libraries that do this correctly.
         | 
         | https://github.com/rustyrussell/ccan/blob/master/ccan/endian...
        
         | nly wrote:
         | My favourite builtins are the overflow checked integer
         | operations:
         | 
         | https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins...
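          | 
          | For example (a sketch; __builtin_add_overflow returns true if
          | the addition overflowed):
          | 
          |     uint32_t a = 0xFFFFFFF0u, b = 0x20u, sum;
          |     if (__builtin_add_overflow(a, b, &sum)) {
          |         /* overflowed; sum holds the wrapped result */
          |     }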
        
       | captainmuon wrote:
       | It is a ridiculous feature of modern C that you have to write the
       | super verbose "mask and shift" code, which then gets compiled to
        | a simple `mov` and maybe a `bswap`. Whereas the direct equivalent
       | in C, an assignment with a (type changing) cast, is illegal.
       | There is a huge mismatch between the assumptions of the C spec
       | and actual machine code.
       | 
        | One of the few reasons I ever even reached for C is the ability
        | to slurp in data and reinterpret it as a struct, or the ability
        | to reason about which registers things will show up in and mix
        | some `asm` in with my C.
       | 
       | I think there should really be a dialect of C(++) where the
       | machine model is exactly the physical machine. That doesn't mean
       | the compiler can't do optimizations, but it shouldn't do things
       | like prove code as UB and fold everything to a no-op. (Like when
       | you defensively compare a pointer to NULL that according to spec
       | must not be NULL, but practically could be...)
       | 
       | `-fno-strict-overflow -fno-strict-aliasing -fno-delete-null-
       | pointer-checks` gets you halfway there, but it would really only
       | be viable if you had a blessed `-std=high-level-assembler` or
       | `-std=friendly-c` flag.
        
         | MrBuddyCasino wrote:
         | > There is a huge mismatch between the assumptions of the C
         | spec and actual machine code.
         | 
         | People like to say "C is close to the metal". Really not true
         | at all anymore.
        
           | goldenkey wrote:
           | Actually, it is true - which is why endianness is a problem
           | in the first place. ASM code is different when written for
           | little endian vs big endian: access patterns are offset in
           | the positive direction instead of the negative one.
           | 
           | A language that does the same things regardless of endianness
           | would not have pointer arithmetic. That is not ASM and not C.
        
         | pjmlp wrote:
         | It does exist: macro assemblers, especially those with PC and
         | Amiga roots.
         | 
         | Which, given its heritage, is what PDP-11 C used to be; after
         | all, BCPL originated as the minimal language required to
         | bootstrap CPL, nothing else.
         | 
         | Actually, I think TI has a macro assembler with a C-like
         | syntax, I just cannot recall the name any longer.
        
         | simias wrote:
         | > _Wheras, the direct equivalent in C, an assignment with a
         | (type changing) cast, is illegal._
         | 
         | I don't understand what you mean by that. The direct
         | equivalent of what? Endianness is not part of the type system
         | in C, so I'm not sure I follow.
         | 
         | > _I think there should really be a dialect of C(++) where the
         | machine model is exactly the physical machine._
         | 
         | Linus agrees with you here, and I disagree with both of you.
         | _Some_ UBs could certainly be relaxed, but as a rule I want
         | my code to be portable and for the compiler to have enough
         | leeway to correctly optimize my code for different targets
         | without my having to tweak it.
         | 
         | I want strict aliasing and I want the compiler to delete
         | extraneous NULL pointer checks. Strict overflow I'm willing
         | to concede; at the very least the standard should mandate
         | wrap-on-overflow even for signed integers IMO.
        
           | lanstin wrote:
           | I am sympathetic, but portability was more important in the
           | past and gets less important each year. I used to write
           | code strictly keeping the difference between numeric types
           | and sequences of bytes in mind, hoping to one day run on an
           | Alpha or a Tandem or something, but it has been a long time
           | since I have written code that runs on anything but Intel,
           | AMD, or little-endian ARM.
        
         | mhh__ wrote:
         | D's machine model does actually assume the hardware, and
         | using compile-time metaprogramming you can pretty much do
         | whatever you want when it comes to bit twiddling - whether
         | that means assembly, flags, etc.
        
         | pornel wrote:
         | Of course nobody wants C to backstab them with UB, but at the
         | same time programmers want compilers to generate optimal code.
         | That's the market pressure that forces optimizers to be so
         | aggressive. If you can accept less optimized code, why aren't
         | you using tcc?
         | 
         | The idea of C that "just" does a straightforward machine
         | translation breaks down almost immediately. For example, you'd
         | want `int` to just overflow instead of being UB. But then it
         | turns out indexing `arr[i]` can't use 64-bit memory addressing
         | modes, because they don't overflow like a 32-bit int does. With
         | UB it doesn't matter, but a "straightforward C" would emit
         | unnecessary separate 32-bit mul/shift instructions.
         | 
         | https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...
        
           | MaxBarraclough wrote:
           | > nobody wants C to backstab them with UB, but at the same
           | time programmers want compilers to generate optimal code
           | 
           | The value of compiler optimization isn't the same thing as
           | the value of having extensive undefined behaviour in a
           | programming language.
           | 
           | Rust and Ada perform about the same as C, but lack C's many
           | footguns.
           | 
           | > indexing `arr[i]` can't use 64-bit memory addressing modes
           | 
           | What do you mean here?
        
             | remexre wrote:
             | Typically, the assembly instruction that would do the
             | read in arr[i] can do something like:
             | 
             |     x = *(y + z);
             | 
             | where y and z are both 64-bit integers. If I had
             | 
             |     int arr[1000];
             |     initialize(&arr);
             |     int i = read_int();
             |     int x = arr[i];
             |     print(x);
             | 
             | then to get x I'd need to do something like:
             | 
             |     tmp = i * 4;
             |     tmp1 = (uint64_t)tmp;
             |     x = *(arr + tmp1);
             | 
             | which, since i is signed, can't just be a cheap shift,
             | and then needs to be upcast to a uint64_t (which is
             | cheap, at least).
        
         | ajross wrote:
         | > There is a huge mismatch between the assumptions of the C
         | spec and actual machine code.
         | 
         | Right, which is why the kind of UB pedantry in the linked
         | article is hurting and not helping. Cranky old man perspective
         | here:
         | 
         | Folks: the fact that compilers will routinely exploit edge
         | cases in undefined behavior in the language specification to
         | miscompile obvious idiomatic code is a _terrible bug in the
         | compilers_. Period. And we should address that by fixing the
         | compilers, potentially by amending the spec if feasible.
         | 
         | But instead the community all want to look smart by showing
         | how much they understand about "UB" with blog posts and
         | (worse) drive-by submissions to open source projects (with
         | passive-aggressive sneers about code quality), so nothing
         | gets better.
         | 
         | Seriously: don't tell people to shift and mask. Don't
         | pontificate over compiler flags. Stop the masturbatory use of
         | ubsan (though the tool itself is great). And start submitting
         | bugs against the toolchain to get this fixed.
        
           | wnoise wrote:
           | I read this, and go "yes, yes, yes", and then "NO!".
           | 
           | Shifts and ORs really are the sanest and simplest way to
           | express "assembling an integer from bytes". Masking is _a_
           | way to deal with the current C spec, which has silly
           | promotion rules. Unsigned everything is more fundamental
           | than signed.
        
           | jart wrote:
           | I agree, but the language of the standard very
           | unambiguously lets them do it. Quoth X3.159-1988:
           | 
           |     * Undefined behavior --- behavior, upon use of a
           |       nonportable or erroneous program construct, of
           |       erroneous data, or of indeterminately-valued objects,
           |       for which the Standard imposes no requirements.
           |       Permissible undefined behavior ranges from ignoring
           |       the situation completely with unpredictable results,
           |       to behaving during translation or program execution
           |       in a documented manner characteristic of the
           |       environment (with or without the issuance of a
           |       diagnostic message), to terminating a translation or
           |       execution (with the issuance of a diagnostic
           |       message).
           | 
           | In the past compilers "behaved during translation or program
           | execution in a documented manner characteristic of the
           | environment" and now they've decided to "ignore the situation
           | completely with unpredictable results". So yes, what gcc
           | and clang are doing is hostile and dangerous, but it's
           | legal: https://justine.lol/undefined.png So let's fix our
           | code. The blog post is intended to help people do that.
        
             | userbinator wrote:
             | _So let's fix our code._
             | 
             | No; I say we force the _compiler writers_ to fix their
             | idiotic assumptions instead of bending over backwards to
             | please what's essentially a tiny minority. There are a
             | lot more programmers who are not compiler writers.
             | 
             | The standard is really a _minimum bar_ to meet, and
             | what's not defined by it is left to the discretion of the
             | implementers, who should be doing their best to follow
             | the "spirit of C", which ultimately means behaving
             | sanely. "But the standard allows it" should never be a
             | valid argument --- the standard allows a lot of other
             | things, not all of which make sense.
             | 
             | A related rant by Linus Torvalds:
             | https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129
        
         | cbmuser wrote:
         | > One of the few reasons I ever even reached to C is the
         | ability to slurp in data and reinterpret it as a struct, or the
         | ability to reason in which registers things will show up and
         | mix in some `asm` with my C.
         | 
         | Which results in undefined behavior according to the C ISO
         | standard.
         | 
         | Quote:
         | 
         | "2 All declarations that refer to the same object or function
         | shall have compatible type; otherwise, the behavior is
         | undefined."
         | 
         | From: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
         | 6.2.7
        
           | [deleted]
        
           | jstanley wrote:
           | Exactly.
        
           | innocenat wrote:
           | How? I mean, doesn't GP mean this?
           | 
           |     struct whatever p;
           |     fread(&p, sizeof(p), 1, fp);
        
           | tsimionescu wrote:
           | It should be perfectly fine to do this:
           | 
           |     union reinterpret {
           |         char raw[100];
           |         struct myStruct interpreted;
           |     } example;
           | 
           |     read(fd, example.raw, sizeof(example.raw));
           |     struct myStruct dest = example.interpreted;
           | 
           | This is standard-compliant C code, and it is a common way
           | of reading IP addresses from packets, for example.
        
             | saagarjha wrote:
             | (It should be noted that this is not valid C++ code.)
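             | 
             | (A sketch of a variant that is well-defined in both
             | languages, assuming the same hypothetical myStruct: copy
             | out of the raw buffer with memcpy instead of punning
             | through the union.)
             | 
             |     #include <string.h>
             | 
             |     struct myStruct { int a, b; };  /* placeholder */
             | 
             |     struct myStruct parse(const char raw[100]) {
             |       struct myStruct s;
             |       /* Well-defined in C and C++ alike. */
             |       memcpy(&s, raw, sizeof(s));
             |       return s;
             |     }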
        
         | nine_k wrote:
         | I suspect you might like C--.
         | 
         | https://en.m.wikipedia.org/wiki/C--
        
         | froh wrote:
         | You could instead simply use hton/ntoh and trust the library
         | to properly do The Right Thing (tm).
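         | 
         | Roughly like this (a sketch with helper names of my own;
         | htonl/ntohl are declared in <arpa/inet.h> on POSIX systems):
         | 
         |     #include <arpa/inet.h>
         |     #include <stdint.h>
         |     #include <string.h>
         | 
         |     /* Write a 32-bit length field in network byte order. */
         |     void put_len(unsigned char out[4], uint32_t len) {
         |       uint32_t be = htonl(len);      /* host -> big-endian */
         |       memcpy(out, &be, sizeof(be));  /* no alignment worries */
         |     }
         | 
         |     uint32_t get_len(const unsigned char in[4]) {
         |       uint32_t be;
         |       memcpy(&be, in, sizeof(be));
         |       return ntohl(be);              /* big-endian -> host */
         |     }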
        
         | nly wrote:
         | > I think there should really be a dialect of C(++) where the
         | machine model is exactly the physical machine.
         | 
         | Sounds great, until you have to rewrite all your software to go
         | from x86-64 to ARM
        
           | pjmlp wrote:
           | Quite common when coding games back in the 8 and 16 bit days.
           | :)
           | 
           | However, for the case at hand, it would suffice to just
           | write the key routines in assembly, not everything.
        
         | pm215 wrote:
         | So in your 'machine model is the physical machine' flavour,
         | should "I cast an unaligned byte-array pointer to int32_t *
         | and deref" on SPARC (a) do a bunch of
         | byte-load-and-shift-and-OR or (b) emit a simple word load
         | which segfaults? If the former, it's not what the physical
         | machine does, and if the latter, then you still need to write
         | the code as "some portable other thing". Which is to say that
         | the spec's UB here is in service of "allow the compiler to
         | just emit a word load when you write *(int32_t *)p".
         | 
         | What I think the language is missing is a way to clearly write
         | "this might be unaligned and/or wrong endianness, handle that".
         | (Sometimes compilers provide intrinsics for this sort of gap,
         | as they do with popcount and count-leading-zeroes; sometimes
         | they recognize common open-coded idioms. But proper
         | standardised support would be nicer.)
        
           | jart wrote:
           | Endianness doesn't matter though, for the reasons Rob Pike
           | explained. For example, the bits inside each byte probably
           | have an endianness inside the CPU, but they're not
           | addressable, so no one thinks about that. The brilliance of
           | Rob Pike's recommendation is that it allows our code to be
           | byte order agnostic for the same reason our code is already
           | bit order agnostic.
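           | 
           | Concretely, a minimal sketch of that style (helper names
           | mine): the code only ever names byte positions in the
           | stream, so it behaves identically on little- and big-endian
           | hosts:
           | 
           |     #include <stdint.h>
           | 
           |     void put_be32(unsigned char *p, uint32_t v) {
           |       p[0] = v >> 24; p[1] = v >> 16;
           |       p[2] = v >> 8;  p[3] = v;
           |     }
           | 
           |     uint32_t get_be32(const unsigned char *p) {
           |       return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           |              (uint32_t)p[2] << 8  | (uint32_t)p[3];
           |     }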
           | 
           | I agree about bsf/bsr/popcnt. I wish ASCII had more
           | punctuation marks because those operations are as fundamental
           | as xor/and/or/shl/shr/sar.
        
         | klodolph wrote:
         | You don't have to mask and shift. You can memcpy and then byte
         | swap in a function. It will get inlined as mov/bswap.
         | 
         | Practically speaking, common compilers have intrinsics for
         | bswap. The memcpy function can be thought of as an intrinsic
         | for unaligned load/store.
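         | 
         | A sketch of that idiom (assuming GCC/Clang for the builtin;
         | the helper name is mine):
         | 
         |     #include <stdint.h>
         |     #include <string.h>
         | 
         |     /* Read a big-endian u32 on a little-endian host. */
         |     static inline uint32_t load_be32(const void *p) {
         |       uint32_t v;
         |       memcpy(&v, p, sizeof(v));     /* unaligned load, no UB */
         |       return __builtin_bswap32(v);  /* inlines to mov+bswap */
         |     }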
        
           | BeeOnRope wrote:
           | How do you detect if a byte swap is needed? I.e. whether
           | the (fixed) wire endianness matches the current platform
           | endianness?
        
             | edflsafoiewq wrote:
             | I.e., how do you know the target's endianness? C++20
             | added std::endian. Otherwise you can use a macro like
             | this one from SDL:
             | 
             | https://github.com/libsdl-
             | org/SDL/blob/9dc97afa7190aca5bdf92...
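             | 
             | The core of such a macro, as a sketch (using GCC/Clang
             | predefined macros; SDL's real version has more
             | fallbacks):
             | 
             |     #if defined(__BYTE_ORDER__) && \
             |         __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
             |     #define MY_BIG_ENDIAN 1  /* name is mine */
             |     #else
             |     #define MY_BIG_ENDIAN 0
             |     #endif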
        
               | hermitdev wrote:
               | There have been CPU architectures where the endianness at
               | compile time isn't necessarily sufficient. I forget
               | which, maybe it was DEC Alpha, where the CPU could flip
               | back and forth? I can't recall if it was a "choose at
               | boot" or a per process change.
        
               | magicalhippo wrote:
               | ARM allows dynamic changing of endianness[1].
               | 
               | [1]:
               | https://developer.arm.com/documentation/dui0489/h/arm-
               | and-th...
        
           | user-the-name wrote:
           | When do you byte swap?
        
       | themulticaster wrote:
       | The first example in the article is flawed (or at least
       | misleading).
       | 
       | 1) They define a char array (which defaults to signed char, as
       | mentioned in the post), including the value 0x80 which can't be
       | represented in char, resulting in a compiler warning (e.g. in GCC
       | 11.1).
       | 
       | The mentioned reason against using unsigned char (that shifting
       | 128 left by 24 places results in UB) is also misleading: I could
       | not reproduce the UB when changing the array to unsigned char.
       | Perhaps the author meant leaving the array defined as signed
       | char, but casting the signed chars to unsigned before shifting.
       | That indeed results in UB, but I don't see why you would define
       | the array as signed in the first place.
       | 
       | 2) The cause for the undefined behavior isn't the bswap_32;
       | rather, it's that they try reading a uint32_t value from a char
       | array where b[0] is not aligned on a word boundary.
       | 
       | There is no need at all to redefine bswap. The simple solution
       | would be to use an unsigned char array instead of a char array
       | and just read the values byte-wise.
       | 
       | Of course C has its footguns and warts and so on, but there is no
       | need to dramatize it this much in my opinion.
       | 
       | I've prepared a Godbolt example to better explain the arguments
       | mentioned above: https://godbolt.org/z/Y1EWK6e17
       | 
       | Edit: To add to point 2) above: Another way to avoid the UB (in
       | this specific case) would be to add __attribute__ ((aligned (4)))
       | to the definition of b. In that case, even reading the array as a
       | single uint32_t works as expected since the access is aligned to
       | a word boundary.
       | 
       | Obviously, you can't expect any random (unsigned char) pointer to
       | be aligned on a word boundary. Therefore, it is still necessary
       | to read the uint32_t byte by byte.
        
         | cygx wrote:
         | _> The mentioned reason against using unsigned char (that
         | shifting 128 left by 24 places results in UB) is also
         | misleading_
         | 
         | No, that reasoning is correct. Integer promotions are
         | performed on the operands of a shift expression, meaning the
         | left operand will be promoted to signed int even if it starts
         | out as unsigned char. Trying to shift a byte value with the
         | highest bit set left by 24 results in a value not
         | representable as signed int, leading to UB.
        
           | themulticaster wrote:
           | Thanks, I just noticed a small mistake in my example (I
           | don't trigger the UB because I access b[0] containing 0x80
           | without shifting; I meant to do it the other way around).
           | 
           | Still, adding an explicit cast to the left operand seems to
           | be enough to avoid this, e.g.:
           | 
           |     uint32_t x = ((uint32_t)b[0]) << 24;
           | 
           | In summary, I think my point that using unsigned char would
           | be appropriate in this case still stands.
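           | 
           | In full, both variants as a sketch (function names mine,
           | assuming 32-bit int):
           | 
           |     #include <stdint.h>
           | 
           |     uint32_t hi_byte_ub(const unsigned char *b) {
           |       return b[0] << 24;  /* b[0] promotes to signed int:
           |                              UB when b[0] >= 0x80 */
           |     }
           | 
           |     uint32_t hi_byte_ok(const unsigned char *b) {
           |       return (uint32_t)b[0] << 24;  /* unsigned shift: OK */
           |     }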
        
             | cygx wrote:
             | _> Still, adding an explicit cast to the left operand seems
             | to be enough to avoid this_
             | 
             | Indeed. See my other comment,
             | https://news.ycombinator.com/item?id=27086482
        
       | commandersaki wrote:
       | It wasn't clear to me: what was the undefined behaviour in the
       | naive approach?
        
         | cygx wrote:
         | Violation of the effective typing rules ('strict aliasing') and
         | a potential violation of alignment requirements of your
         | platform.
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2021-05-08 23:00 UTC)