[HN Gopher] The Byte Order Fiasco ___________________________________________________________________ The Byte Order Fiasco Author : cassepipe Score : 269 points Date : 2021-05-08 11:00 UTC (12 hours ago) (HTM) web link (justine.lol) (TXT) w3m dump (justine.lol) | londons_explore wrote: | Remember how we used to have machines with a 7 bit byte? And | everything was written to handle either 6, 7, or 8 bit bytes? | | And now we've settled on all machines being 8 bit bytes, and | programmers no longer have to worry about such details? | | Is it time to do the same for big endian machines? Is it time to | accept that all machines that matter are little endian, and the | extra effort keeping everything portable to big endian is no | longer worth the mental effort? | bumbada wrote: | What happens is that all machines that matter are little endian | but the network always works in big endian. | londons_explore wrote: | We'll have to keep it as a quirk of history... | | A bit like the electron has a negative charge... | PixelOfDeath wrote: | They had a 50/50 chance of getting the technical | electricity direction right... and they fucked it up! | bombcar wrote: | Isn't big endian a bit more natural considered at the bit | level? The bits start from highest to lowest on a serial | connection. | ben509 wrote: | Big-endian is natural when you're comparing numbers, which | is probably why people represent numbers in a big-endian | fashion. | | Little-endian is natural with casts because the address | doesn't change, and it's the order in which addition takes | place. | kangalioo wrote: | I feel like big endian is more _intuitive_ because that's | what our number notation has evolved to be. | | But more _natural_ is little endian because, well, it's | just more straightforward to have the digits' magnitude | be in ascending order (2^0, 2^1, 2^2, 2^3...) instead of | putting it in reverse. | | Plus you encounter fewer roadblocks in practice with | little endian (e.g.
address changes with casts) which is | often a sign of good natural design | CydeWeys wrote: | I'm curious how you're defining "natural", and if you | think ISO-8601 is the reverse of "natural" too. | | All human number systems I've ever seen write numbers out | as big endian (yes, even Roman numerals), so I'm really | struggling to see how that wouldn't be considered | natural. | ByteJockey wrote: | It seems like it would be a more natural way of representing | the number when communicating with a human. | | But that's not what we're doing here, so it's not | entirely relevant. | pabs3 wrote: | IBM is going to be pretty annoyed when your code doesn't work | on their mainframes. | jart wrote: | In my experience IBM does the right thing and sends patches | rather than asking us to fix their problems for them, and I | respect them for that reason, even if it's a tiny burden to | review those changes. | | However endianness isn't just about supporting IBM. Modern | compilers will literally break your code if you alias memory | using a type wider than char. It's illegal per the standard. | In the past compilers would simply not care and say, oh the | architecture permits unaligned reads so we'll just let you do | that. Not anymore. Modern GCC and Clang force your code to | conform to the abstract standard definition rather than the | local architecture definition. | | It's also worth noting that people think the x86 architecture | permits unaligned reads but that's not entirely true. For | example, you can't do unaligned read-ahead on C strings, | because in extremely rare cases you might cross a page | boundary that isn't defined and trigger a segfault. | froh wrote: | yes IBM provided asm for s390 hton ntoh, and "all we had to | do" for mainframe Linux was patch x86 only packages to use | hton ntoh when they persisted binary data.
for the kernel | IBM did it on their own, contributing mainline, for | userland suse did it, grabbing some patches from Japanese | turbolinux, and then red hat grabbed the patches from turbo | and suse, and together we got them mainline lol. and PPC | then just piggybacked on top of that effort. | einpoklum wrote: | > Is it time to accept that all machines that matter are little | endian. | | Well, no, because it's not the case. SPARC is big-endian, and so are a | bunch of IBM processors. ARM processors are mostly bi-endian. | | > Is it time to do the same for big endian machines? | | No. Not just because of their prevalence, but because there | isn't a compelling reason why everything should be little- | endian. | genmon wrote: | That reminds me of a project to interface with vending | machines. (We built a bookshop in a vending machine that would | tweet whenever it sold an item, with automated stock | management.) | | Vending machines have an internal protocol a little like I2C. | We created a custom peripheral to bridge the machine to the | web, based on a Raspberry Pi. | | The protocol was defined by Coca Cola Japan in 1975 (in order | to have optionality in their supply chain). It's still in use | today. But because it was designed in Japan, with a need for | wide characters, it assumes 9 bit bytes. | | We couldn't find any way to get a Raspberry Pi to speak 9 bit | bytes. The eventual solution was a custom shield that would | read the bits, and reserialise to 8 bit bytes for the Pi to | understand. And vice versa. | | 9 bit bytes. I grew up knowing that bytes had variable length, | but this was the first time I encountered it in the wild. This | was 2015. | raverbashing wrote: | Well you could bit bang and the 9 bits wouldn't be an issue. | (Even if you had a tiny PIC microcontroller just to do that) | | This is best solved as close to the device in question as | possible, and in the simplest way possible. | ghoward wrote: | Sorry, dumb question: what is bit banging?
| gspr wrote: | The practice of using software to literally toggle (or | read) individual pins with the correct software- | controlled timing in order to communicate with some | hardware. | | To transmit a bit pattern 10010010 over a single pin | channel, for example, you'd literally set the pin high, | sleep for some predetermined amount of time, set it | low, sleep, set it low, sleep, set it high, etc. | thristian wrote: | In order to exchange data over a serial connection, the | ones and zeroes have to be sent with exact timing, so the | receiver can reliably tell where one bit ends and the | next begins. Because of this, the hardware that's doing | the communication can't do anything else at the same | time. And since the actual mechanics of the process are | simple and straightforward, most computers with a serial | connection have special serial-interface hardware (a | Universal Asynchronous Receiver/Transmitter, or UART) to | take care of it - the CPU gives the UART some data, then | returns to more productive pursuits while the UART works | away. | | But sometimes you can't use a UART: maybe you're working | on a tiny embedded computer without one, or maybe you | need to speak a weird 9-bit protocol a standard UART | doesn't understand. In that case, you can make the CPU | pump the serial line directly. It's inefficient (there's | probably more interesting work the CPU could be doing) | and it can be difficult to make the CPU pause for | _exactly_ the right amount of time (CPUs are normally | designed to run as fast or as efficiently as possible, | nothing in between), but it's possible and sometimes | it's all you've got. That's bit-banging. | londons_explore wrote: | Consider being a teacher. That's a good explanation. | stefan_ wrote: | The irony is that while a tiny PIC can do bit banging | easily, the mighty Pi will struggle with it. | hazeii wrote: | I'm familiar with both, and have Pis bit-banging at | 8MHz.
It's not hard-realtime like a PIC though (where | I've bitbanged a resistor D2A hung off a dsPIC33 to | 17.734475MHz). It's an improvement over the years, but | surprisingly little since bit-banging 4MHz Z80s more | than 4 decades ago, where resolution was 1 T state | (250ns). | londons_explore wrote: | The 9 bit serial OP mentioned likely doesn't have a | separate clock line, so it is hard realtime and timing | matters a _lot_, and I doubt the Pi could reliably do | anything over 1 kHz baud with bit banging. You could do | much better if you didn't run Linux. | BlueTemplar wrote: | We really should have moved to 32 bit bytes when moving to 64 | bit words. Would have simplified Unicode considerably. | wongarsu wrote: | People were holding off on transitioning because pointers | use twice as much space in x64. If bytes had quadrupled in | space with x64 we would still be using 32 bit software | everywhere | BlueTemplar wrote: | Well, obviously it would have delayed the transition. | However you can only go so far with 4GB-limited memory. | | And do you have examples of still widely used 8-bit sized | data formats? | jart wrote: | RGB and Y'CbCr | Narishma wrote: | You can go very far with just 4GB of memory, especially | when not using wasteful software. | owl57 wrote: | I assume you wrote this comment in UTF-8 over HTTP | (ASCII-based) and TLS (lots of uint8 fields). | jart wrote: | Use Erlang. It has 32-bit char. | toast0 wrote: | Not really. Strings are a list of integers [1], integers | are signed and fill a system word, but there's also 4 | bits of type information. So you can have a 28-bit signed | integer char on a 32-bit system or a signed 60-bit | integer. | | However, since Unicode is limited to 21 bits by UTF-16 | encoding, a Unicode code point will fit in a small | integer. | | [1] unless you use binaries, which is often a better | choice.
| spacechild1 wrote: | You know, bytes are not only about text, they are also used | to represent _binary_ data... | | Not to mention that bytes have nothing to do with Unicode. | Unicode codepoints can be encoded in many different ways: | UTF8, UTF16, UTF32, etc. | ChrisSD wrote: | Not really. Unicode is a variable width abstract encoding; | a single character can be made up of multiple code points. | | For Unicode, 32-bit bytes would be an incredibly wasteful | in-memory encoding. | BlueTemplar wrote: | One byte = one "character" makes for much easier | programming. | | Text generally uses a small fraction of memory and | storage these days. | cygx wrote: | Not all user-perceived characters can be represented as a | single Unicode codepoint. Hence, Unicode text encodings | (almost[1]) always have to be treated as variable length, | even UTF-32. | | [1] at runtime, you could dynamically assign 'virtual' | codepoints to grapheme clusters and get a fixed-length | encoding for strings that way | jart wrote: | Even the individual Unicode codepoints themselves are | variable width if we consider that things like CJK and | emoji take up >1 monospace cell. | lanstin wrote: | Every time I see one of these threads, my gratitude to | only do backend grows. Human behavior is too complex, let | the webdevs handle UI, and human languages are too | complex, not sure what speciality handles that. Give me | out of order packets and parsing code that skips a | character if the packet length lines up just so any day. | | I am thankful that almost all the Unicode text I see is | rendered properly now, farewell the little boxes. Good | job lots of people. | jart wrote: | I think we really have the iPhone jailbreakers to thank | for that. U.S. developers were allergic to, almost offended | by, anything that wasn't ASCII. And then someone released | an app that unlocked the emoji icons that Apple had | originally intended only for Japan.
Emoji are defined in | the astral planes so almost nothing at the time was | capable of understanding them, yet they were so irresistible | that developers worldwide who would otherwise have done | nothing to address their cultural biases immediately | fixed everything overnight to have them. So thanks to | cartoons, we now have a more inclusive world. | londons_explore wrote: | I'm pretty sure Unicode was pretty widespread before the | iPhone/emoji popularity. | cygx wrote: | There's supporting Unicode, and 'supporting' Unicode. If | you're only dealing with western languages, it's easy to | fall into the trap of only 'supporting' Unicode. Proper | emoji handling will put things like grapheme clusters and | zero-width joiners on your map. | kortex wrote: | > One byte = one "character" makes for much easier | programming. | | Only if you are naively operating in the Anglosphere / | world where the most complex thing you have to handle is | larger character sets. In reality, there's ligatures, | diacritics, combining characters, RTL, nbsp, locales, and | emoji (with skin tones!). Not to mention legacy encodings. | | And no, it does not use a "small fraction of memory and | storage" in a huge range of applications, to the point | where some regions have transcoding proxies still. | AnIdiotOnTheNet wrote: | This just doesn't seem right. Granted, I don't know much | about your use case, but Raspberry Pis are powerful | computing devices and I find it difficult to believe there | was no way to handle this without additional hardware. | DeRock wrote: | I'm not familiar with the "vending machine" protocol he's | talking about, but it's entirely reasonable that it has | certain timing requirements. Usually the way you interface | with these is by having a dedicated HW block to talk the | protocol, or by bit banging.
The former wouldn't be | supported on RPi because it's obscure, the latter requires | tight GPIO timing control that is difficult to guarantee on | a non-real-time system like the RPi usually runs. | [deleted] | DonHopkins wrote: | We used to have machines with arbitrarily sized bytes, and 36 | bit words! | | http://pdp10.nocrew.org/docs/instruction-set/Byte.html | | >In the PDP-10 a "byte" is some number of contiguous bits | within one word. A byte pointer is a quantity (which occupies a | whole word) which describes the location of a byte. There are | three parts to the description of a byte: the word (i.e., | address) in which the byte occurs, the position of the byte | within the word, and the length of the byte. | | >A byte pointer has the following format:
 |      000000 000011 1 1 1111 112222222222333333
 |      012345 678901 2 3 4567 890123456789012345
 |      _________________________________________
 |     |      |      | | |    |                  |
 |     | POS  | SIZE |U|I| X  |        Y         |
 |     |______|______|_|_|____|__________________|
 | | >POS is the byte position: the number of bits from the right | end of the byte to the right end of the word. SIZE is the byte | size in bits. | | >The U field is ignored by the byte instructions. | | >The I, X and Y fields are used, just as in an instruction, to | compute an effective address which specifies the location of | the word containing the byte. | | "If you're not playing with 36 bits, you're not playing with a | full DEC!" -DIGEX (Doug Humphrey) | | http://otc.umd.edu/staff/humphrey | [deleted] | stkdump wrote: | Historical and obscure machines aside, there are a few things | modern C++ code should take for granted, because even new systems | will probably not bother breaking them: Text is encoded in UTF-8. | Negative integers are two's complement. Float is 32 bit IEEE 754, | double and long double are 64 bit IEEE 754. Char is 8 bit, short | is 16 bit, int is 32 bit, long long is 64 bit. | pabs3 wrote: | I wonder if those macros work with middle-endian systems.
| froh wrote: | no. but hton(3)/ntoh(3) from inet.h do. | dataflow wrote: | Is this a joke or am I just unaware of any systems out there | that are "middle-endian"..?! | mannschott wrote: | Sadly not a joke, but thankfully quite obscure: | https://en.wikipedia.org/wiki/Endianness#Middle-endian | hvdijk wrote: | There are no current middle-endian systems but they used to | exist. The PDP-11 is the most famous one. The macros would | work on all systems, but as only very old systems are middle- | endian, they also have old compilers so may not be able to | optimise it as well. | ttt0 wrote: | https://twitter.com/m13253/status/1371615680068526081 | | Would it hurt anyone to define this undefined behavior and do | exactly what the source code says? | MauranKilom wrote: | Not sure what you think the source code "says". I mean, I know | what you want it to mean, but just because integer wrapping is | intuitive to you doesn't imply that that is what the code | means. C++ abstract machine and all. | | But to answer the actual question: For C++20, integer types | were revisited. It is now (finally) guaranteed that signed | integers are two's complement, along with a list of other | changes. See http://www.open- | std.org/jtc1/sc22/wg21/docs/papers/2018/p090... also for how | the committee voted on the individual issues. | | Note in particular: | | > The main change between [P0907r0] and the subsequent revision | is to maintain undefined behavior when signed integer overflow | occurs, instead of defining wrapping behavior. This direction | was motivated by: | | > - Performance concerns, whereby defining the behavior | prevents optimizers from assuming that overflow never occurs; | | > - Implementation leeway for tools such as sanitizers; | | > - Data from Google suggesting that over 90% of all overflow | is a bug, and defining wrapping behavior would not have solved | the bug. 
| | So yes, the committee very recently revisited this specific | issue, and re-affirmed that signed integer overflow should be | UB. | ttt0 wrote: | I hadn't noticed the signed integer overflow, which does | indeed complicate things, and I thought it was just the | infinite loop UB. | | > Data from Google suggesting that over 90% of all overflow | is a bug, and defining wrapping behavior would not have | solved the bug. | | Of _all_ overflow? Including unsigned integers where the | behavior is defined? | aliceryhl wrote: | That 90% of all overflows are bugs doesn't surprise me at | all, even if you include unsigned integers. | nly wrote: | This is why, in 2021, the mantra that C is a good language for | these low level byte twiddling tasks needs to die. Dealing with | alignment and endianness properly requires a language that allows | you to build abstractions. | | The following is perfectly well defined in C++, despite looking | like almost the same as the original unsafe C:
 |         #include <boost/endian.hpp>
 |         #include <cstdio>
 |         using namespace boost::endian;
 |
 |         unsigned char b[5] = {0x80,0x01,0x02,0x03,0x04};
 |
 |         int main() {
 |             uint32_t x = *((big_uint32_t*)(b+1));
 |             printf("%08x\n", x);
 |         }
 | | Note that I deliberately misaligned the pointer by adding 1. | | https://gcc.godbolt.org/z/5416oefjx | | [Edit] Fun twist: the above code doesn't work if the | intermediate variable x is removed, because printf itself is not | type safe, so no type conversion (which is when the bswap is | deferred to) happens. In pure C++ when using a type safe | formatting function (like fmt or iostreams) this wouldn't happen. | printf will let you throw any garbage into it. tl;dr outside | embedded use cases writing C in 2021 is fucking nuts. | IgorPartola wrote: | As a very minor counterpoint: I like C because frankly it's | fun. I wouldn't start a web browser or maybe even an operating | system in it today, but as a language for messing around I find | it rewarding.
I also think it is incredibly instructive in a | lot of ways. I am not a C++ developer but ANSI C has a special | place in my heart. | | Also, I will say that when it comes to programming Arduinos and | ESP8266/ESP32 chips, I still find that C is my go to despite | things like Alia, MicroPython, etc. I think it's possible that | once Zig supports those devices fully that I might move over. | But in the meantime I guess I'll keep minding my off by one | errors. | themulticaster wrote: | This has nothing to do with C++ because your example only hides | the real issue occurring in the blog post example: The | unaligned read on the array. Try adding something like | printf("%08x\n", *((uint32_t*)(b))); | | to your example and you'll see that it produces UB as well. The | reason there is no UB with big_uint32_t probably is that that | struct/class/whatever it is probably redefines its | dereferencing operator to perform byte-wise reads. | | Godbolt example: https://gcc.godbolt.org/z/seWrb5cz7 | nly wrote: | I fail to see your point. The point of my post is that the | abstractions you can build in C++ are as easy to use and as | efficient as doing things the wrong, unsafe way...so there's | no reason not to do things in a safe, correct way. | | Obviously if you write C and compile it as C++ you still end | up with UB, because C++ aims for extreme levels of | compatibility with C. | themulticaster wrote: | Sorry for being unclear. My point is that the example in | the blog post does two things, a) it reads an unaligned | address causing UB and b) it performs byte-order swapping. | The post then goes on about avoiding UB in part b), but all | the time the UB was caused by the unaligned access in a). 
| | Of course your example solves both a) and b) by using | big_uint32_t, and I agree that this is an interesting | abstraction provided by Boost, but I think the takeaway | "use C++ for low-level byte fiddling" is slightly | misleading: Say I was a novice C++ programmer, saw your | example of how C++ improves this but at the same time don't | know that big_uint32_t solves the hassle of reading a word | from an unaligned address for me. Now I use your pattern in | my byte-fiddling code, but then I need to read a word in | host endianness. What do I do? Right, I remember the HN | post and write *((uint32_t*)(b+1)) (without the big_, | because I don't need that!). And then I unintentionally | introduced UB. In other words, big_uint32_t is a little | "magic" in this case, as it suggests a similarity to | uint32_t which does not actually exist. | | To be honest, I don't think the byte-wise reading is in any | way inappropriate in this case: If you're trying to read a | word _in non-native byte order from an unaligned access_ , | it is perfectly fine to be very explicit about what you're | doing in my opinion. There also is nothing unsafe about | doing this as long as you follow certain guidelines, as | mentioned elsewhere in this thread. | nly wrote: | Sure, the only correct way to read an unaligned value in | to an aligned data type in both C or C++ is via memcpy. | | I still think being able to define a type that models | what you're doing is incredibly valuable because as long | as you don't step outside your type system you get so | much for free. | sgtnoodle wrote: | You could also mask and shift the value byte-wise just | like with an endian swap. Depending on the destination | and how aggressive the compiler optimizes memcpy or not, | it could even produce more optimal code, perhaps by | working in registers more. | | Conceptual consistency is a good thing, but there is a | generally higher cognitive load to using C++ over C. 
I've | used both C++ and C professionally, and I've gone deeper | with type safety and metaprogramming than most folk. I've | mostly used C for the last few years, and I don't feel | like I'm missing anything. It's still possible to write | hard-to-misuse code by coming up with abstractions that | play to the language's strengths. | | Operator overloading in particular is something I've | refined my opinion on over the years. My current thought | is that it's best not to use operators in | user/application defined APIs, and should be reserved for | implementing language defined "standard" APIs like the | STL. Instead, it's better to use functions with names | that unambiguously describe their purpose. | foldr wrote: | What are the advantages of this over a simple function with the | following signature? uint32_t | read_big_uint32(char *bytes); | | Having a big_uint32_t type seems wrong to me conceptually. You | should either deal with sequences of bytes with a defined | endianness or with native 32-bit integers of indeterminate | endianness (assuming that your code is intended to be endian | neutral). Having some kind of halfway house just confuses | things. | nly wrote: | The library provides those functions too, but I don't see how | having an arithmetic type with well defined size, endiannness | and alignment is a bad thing. | | If you're defining a struct to mirror a data structure from a | device, protocol or file format then the language / type | system should let you define the properties of the fields, | not necessarily force you to introduce a parsing/decoding | stage which could be more easily bypassed. | lanstin wrote: | It is no longer arithmetic if there is an endianness. Some | things are numbers and some things are sequences of bytes. | Arithmetic only works on the former. 
| mfost wrote: | I'd say, putting multiple of those types into a struct that | then perfectly describes the memory layout of each byte of | data in memory/network packet in a reliable and user friendly | way to manipulate for the coder. | foldr wrote: | I see. That does seem helpful once you consider how these | types compose, rather than thinking about a one-off | conversion. However, I think it would be cleaner to have a | library that auto-generated a parser for a given struct | paired with an endianness specification, rather than baking | the endianness into the types. (Probably this could be | achieved by template metaprogramming too.) | themulticaster wrote: | I agree, but a little nitpick: A sequence of bytes does not | have a defined endianness. Only groups of more than one byte | (i.e. half words, words, double words or whatever you want to | call them) have an endianness. | | In practice, most projects (e.g. the Linux kernel or the | socket interface) differentiate between host (indeterminate) | byte order and a specific byte order (e.g. network byte | order/big endian). | jeffreygoesto wrote: | Wouldn't that cast be UB because it is type punning? | professoretc wrote: | char* is allowed to alias other pointer types. | jeffreygoesto wrote: | Hm. Afaik, you are always allowed to convert _to_ a char, | but _from_ is not ok in general. See e.g. [0] | | [0] https://gist.github.com/shafik/848ae25ee209f698763cffee | 272a5... | jstimpfle wrote: | I find you missed the point of the post and the issues | described in it. | | In my estimation, libraries like boost are way too big and way | too clever and they create more problems than they solve. Also, | they don't make me happy. | | You're overfocusing on a "problem" that is almost completely | irrelevant for most of programming. Big endian is rare to be | found (almost no hardware to be found, but some file formats | and networking APIs have big-endian data in them).
Where you | still meet it, you don't do endianness conversions willy-nilly. | You have only a few lines in a huge project that should be | concerned with it. Similar situation for dealing with aligned | reads. | | So, with boost you end up with a huge slow-compiling dependency | to solve a problem using obscure implicit mechanisms that | almost no-one understands or can even spot (I would never have | guessed that your line above seems to handle misalignment or | byte swapping). | | This approach is typical for a large group of C++ programmers, | who seem to like to optimize for short code snippets, | cleverness, and/or pedantry. | | The actual issue described in the post was the UB that is easy | to hit when doing bit shifting, caused by the implicit | conversions that are defined in C. While this is definitely an | unhappy situation, it's easy enough to avoid this using plain C | syntax (cast expression to unsigned before shifting), using not | more code than the boost-type cast in your above code. | | The fact that the UB is so easy to hit doesn't call for | excessive abstraction, but simply a revisit of some of the UB | defined in C, and how compiler writers exploit it. | | (Anecdata: I've written a fair share of C code, while not | compression or encryption algorithms, and personally I'm not | sure I've ever hit one of the evil cases of UB. I've hit | Segmentation faults or had Out-of-bounds accesses, sure, but | personally I've never seen the language or compilers "haunt | me".) | jart wrote: | Do you use UBSAN and ASAN? When you write unit tests do you | feed numbers like 0x80000000 into your algorithm? When you | allocate test memory have you considered doing it with | mmap(4096) and putting the data at the _end_ of the map? (Or | better yet, double it and use mprotect). Those are some good | examples of torture tests if you 're in the mood to feel | haunted. 
| [deleted] | sdenton4 wrote: | Every day I spend futzing around with endianness is a day I'm | not solving 'real' problems. These things are a distraction | and a complete waste of developer time: It should be solved | 'once' and only worried about by people specifically looking | to improve on the existing solution. If it can't be handled | by a library call, there's something really broken in the | language. | | (imo, both c and cpp are mainly advocated by people suffering | from stockholm syndrome.) | raphlinus wrote: | I agree with the bulk of this post. | | Re the anecdata at the end. Have you ever run your code | through the sanitizers? I have. CVE-2016-2414 is one of my | battle scars, and I consider myself a pretty good programmer | who is aware of security implications. | jstimpfle wrote: | Very little, quite frankly. I've used valgrind in the past, | and found very few problems. I just ran | -fsanitize=undefined for the first time on one of my | current projects, which is an embedded network service of | 8KLOC, and with a quick test covering probably 50% of the | codepaths by doing network requests, no UB was detected (I | made sure the sanitizer works in my build by introducing a | (1<<31) expression). | | Admittedly I'm not the type of person who spends his time | fuzzing his own projects, so my statement was just to say | that the kind of bugs that I hit by just testing my | software casually are almost all of the very trivial kind - | I've never experienced the feeling that the compiler | "betrayed" me and introduced an obscure bug for something | that looks like correct code. | | I can't immediately see the problem in your CVE here [0], | was that some kind of betrayal by compiler situation? Seems | like strange things could happen if (end - start) | underflows. | | [0] https://android.googlesource.com/platform/frameworks/mi | nikin... 
| raphlinus wrote: | This one wasn't specifically "betrayal by compiler," but | it was a confusion between signed and unsigned quantities | for a size field, which is very similar to the UB | exhibited in OP. | | Also, the fact that you can't see the problem is actually | evidence of how insidious these problems are :) | | The rules for this are arcane, and, while the solution | suggested in OP is correct, it skates close to the edge, | in that there are many similar idioms that are not ok. In | particular, (p[1] << 8) & 0xff00, which is code I've | written, is potentially UB (hence "mask, and then shift" | as a mantra). I'd be surprised if anyone other than jart | or someone who's been part of the C or C++ standards | process can say why. | [deleted] | vlovich123 wrote: | Raph, clearly you're just not as good a programmer as you | think you are. | raphlinus wrote: | Why thank you Vitali. Coming from you, that is high | praise indeed. | vladharbuz wrote: | Correct me if I'm wrong, but your example is just using a | library to do the same task, rather than illustrating any | difference between C and C++. If you want to pull boost in to | do this, that's great, but that hardly seems like a fair | comparison to the OP, since instead of implementing code to | solve this problem yourself you're just importing someone | else's code. | nly wrote: | No, the fact that this can be done in a library and looks | like a native language feature demonstrates the power of C++ | as a language. | | This example is demonstrating: | | - First class treatment of user (or library) defined types | | - Operator overloading | | - The fact that it produces fast machine code. Try changing | big_uint32_t to regular uint32_t to see how this changes. | When you use the later ubsan will introduce a trap for | runtime checks, but it doesn't need to in this case. 
| simias wrote: | Operator overloading is a mixed blessing though, it can be | very convenient but it's also very good at obfuscating | what's going on. | | For instance I'm not familiar with this boost library so | I'd have a lot of trouble piecing out what your snippet | does, especially since there's no explicit function call | besides the printf. | | Personally if we're going the OOP route I'd much prefer | something like Rust's `var.to_be()`, `var.to_le` etc... At | least it's very explicit. | | My hot take is that operator overloading should only ever | be used for mathematical operators (multiplying vectors | etc...), everything else is almost invariably a bad idea. | pwdisswordfish8 wrote: | Ironically, it was proposed not so long ago to deprecate | to_be/to_le in favour of to_be_bytes/to_le_bytes, since | the former conflate abstract values with bit | representations. | nly wrote: | That's fine if whatever type 'var' happens to be is NOT | usable as an arithmetic type, otherwise you can easily | just forget to call .to_le() or .to_native(), or | whatever, and end up with a bug. I don't know Rust, so | don't know if this is the case? | | Boost.Endian actually lets you pick between arithmetic | and buffer types. | | 'big_uint32_buf_t' is a buffer type that requires you to | call .value() or do a conversion to an integral type. It | does not support arithmetic operations. | | 'big_uint32_t' is an arithmetic type, and supports all | the arithmetic operators. | | There are also variants of both endian suffixed '_at' for | when you know you have aligned access. | raphlinus wrote: | The idiomatic way to do this in Rust is to use functions | like .to_le_bytes(), so you have the u32 (or whatever) on | one end and raw bytes (something like [u8; 4]) on the | other. It can get slightly tedious if you're doing it by | hand, but it's impossible to accidentally forget. 
If | you're doing this kind of thing at scale, like dealing | with TrueType fonts (another bastion of big-endian), it's | common to reach for derive macros, which automate a great | deal of the tedium. | nly wrote: | Who decides what methods to add to the bytes | type/abstraction? | | If I have a 3 byte big endian integer can I access it | easily in Rust without resorting to shifts? | | In C++ I could probably create a fairly convincing | big_uint24_t type and use it in a packed struct and there | would be no inconsistencies with how it's used with | respect to the more common varieties. | raphlinus wrote: | In Rust, [u8; N] and &[u8] are both primitive types, and | not abstractions. It's possible to create an abstraction | around either (the former even more so now with const | generics), but that's not necessary. It's also possible | to use "extension traits" to add methods, even to | existing and built-in types[1]. | | I'm not sure about a 3 byte big endian integer. I mean, | that's going to compile down to some combination of | shifting and masking operations anyway, isn't it? I | suspect that if you have some oddball binary format that | needs this, it will be possible to write some code to | marshal it, that compiles down to the best possible asm. | Godbolt is your friend here :) | | [1]: https://rust-lang.github.io/rfcs/0445-extension- | trait-conven... | nly wrote: | I agree then that in Rust you could make something | consistent. | | I think there's no need for explicit shifts. You need to | memcpy anyway to deal with alignment issues, so you may | as well just copy into the last 3 bytes of a | zero-initialized, big endian, 32bit uint. | | https://gcc.godbolt.org/z/jEnsW8WfE | raphlinus wrote: | That's just constant folding. Here's what it looks like | when you actually need to go to memory: | | https://gcc.godbolt.org/z/9qGqh6M1E | | And I think we're on the same page, it should be possible | to get similar results in Rust.
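 | (Editor's note: for the 3-byte case the pair discusses, a portable
 | sketch that avoids both the memcpy dance and any unaligned access;
 | read24be is my name, not an API from the thread.)

```c
#include <stdint.h>

/* Read a 3-byte big-endian integer byte by byte. Like the godbolt
   examples above, this compiles down to a few loads, shifts, and ORs;
   no wide pointer cast is involved, so alignment never matters. */
static uint32_t read24be(const unsigned char *p) {
  return (uint32_t)p[0] << 16 | (uint32_t)p[1] << 8 | p[2];
}
```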
| cbmuser wrote: | You are still casting one pointer type into another, which | can result in unaligned access. | | If you need to change byte orders, you should use a library | to achieve that. | nly wrote: | Boost.Endian is the library here and this code is safe | because the big_uint32_t type has an alignment | requirement of 1 byte. | | This is why ubsan is silent and not even injecting a | check into the compiled code. | | You can check the alignment constraints with | static_assert (something else you can't do in standard | C): https://gcc.godbolt.org/z/KTcf9ax6r | kevin_thibedeau wrote: | C11 has static_assert: | https://gcc.godbolt.org/z/E3bGc95o3 | | It also has _Generic() so you can roll up a family of | endianness conversion functions and safely change types | without blowing up somewhere else with a hardcoded | conversion routine. | Brian_K_White wrote: | It demonstrates that C++ is even less safe. | 0x000000E2 wrote: | By the same token, I think most uses for C++ these days are | nuts. If you're doing a greenfield project, 90% of the time it's | better to use Rust. | | C++ has a multitude of its own pitfalls. Some of the C | programmer hate for C++ is justified. After all, it's just C | with a pre-processing stage in the end. | | There are good reasons why many C projects never considered C++ | but are already integrating the nascent Rust. I always hated | low level programming until Rust made it just as easy and | productive as high level stuff. | jart wrote: | C is perfect for these problems. I like teaching the endian | serialization problem because it broaches so many of the topics | that are key to understanding C/C++ in general. Even if we | choose to spend the majority of our time plumbing together | functions written by better men, it's nice to understand how | the language is defined so we could write those functions, even | if we don't need to.
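 | (Editor's note: kevin_thibedeau's _Generic idea can be sketched as
 | below; bswap16/bswap32, to_be16/to_be32, and TO_BE are all
 | hypothetical names of mine, not a standard API.)

```c
#include <stdint.h>
#include <string.h>

static inline uint16_t bswap16(uint16_t x) {
  return (uint16_t)(x << 8 | x >> 8);
}

static inline uint32_t bswap32(uint32_t x) {
  return x << 24 | (x & 0xff00) << 8 | (x >> 8 & 0xff00) | x >> 24;
}

/* Detect host endianness; compilers fold this down to a constant. */
static inline int host_is_little(void) {
  union { uint16_t v; unsigned char b[2]; } u = { 1 };
  return u.b[0] == 1;
}

static inline uint16_t to_be16(uint16_t x) { return host_is_little() ? bswap16(x) : x; }
static inline uint32_t to_be32(uint32_t x) { return host_is_little() ? bswap32(x) : x; }

/* C11 _Generic picks the converter from the operand's type, so
   changing a field from uint16_t to uint32_t can't silently leave a
   hardcoded 16-bit swap behind. */
#define TO_BE(x) _Generic((x), uint16_t: to_be16, uint32_t: to_be32)(x)
```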
| nly wrote: | For sure, it's a good way to teach that C is insufficient to | deal with even the simplest of tasks. Unfortunately teaching | has a bad habit of becoming practice, no matter how good the | intention. | | With regard to teaching C++ specifically I tend to agree with | this talk: | | CppCon 2015 - Kate Gregory "Stop Teaching C": | https://www.youtube.com/watch?v=YnWhqhNdYyk | jart wrote: | One of her slides was titled "Stop teaching pointers!" too. | My VP back at my old job snapped at me once because I got | too excited about the pointer abstractions provided by | modern C++. Ever since that day I try to take a more | rational approach to writing native code where I consider | what it looks like in binary, and I've configured my Emacs | so it can do what clang.godbolt.org does in a single | keystroke. | nly wrote: | For the record, she's not really saying people shouldn't | learn this low level stuff... just that 'intro to C++' | shouldn't be teaching this stuff _first_ | | The biggest problem with C++ in industry is that people | tend to write "C/C++" when it deserves to be recognized | as a language in its own right. | jart wrote: | One does not simply introduce C++. It's the most insanely | hardcore language there is. I wouldn't have stood any | chance understanding it had it not been for my gentle | introduction with C for several years. | SAI_Peregrinus wrote: | C++ makes Rust look easy to learn. | pjmlp wrote: | Really? | | Apparently the first year students at my university | didn't have any issue going from Standard Pascal to C++, | in the mid-90's. | | Proper C++ was taught using our string, vector and | collection classes, given that we were still a couple of | years away from ISO C++ being fully defined. | | C style programming with low level tricks was only | introduced later as an advanced topic. | | Apparently thousands of students managed to get going the | remaining 5 years of the degree.
| BenjiWiebe wrote: | C++ in the mid 90s was a lot simpler than C++ now. | pjmlp wrote: | No one obliges you to write C++20 with SFINAE template | meta-programming, using classes with CTAD constructors. | | Just like no Python newbie is able to master Python 3.9 | full language set, standard library, numpy, pandas, | django,... | jart wrote: | Well there's a reason universities switched to Java when | teaching algorithms and containers after the 90's. C++ is | a weaker abstraction that encourages the kind of | curiosity that's going to cause a student's brain to melt | the moment they try to figure out how things work and | encounter the sorts of demons the coursework hasn't | prepared them to face. If I was going to teach it, I'd | start with octal machine codes and work my way up. | https://justine.lol/blinkenlights/realmode.html Sort of | like if I were to teach TypeScript then I'd start with | JavaScript. My approach to native development probably | has more in common with web development than it does with | modern c++ practices to be honest, and that's something I | talk about in one of my famous hacks: https://github.com/ | jart/cosmopolitan/blob/4577f7fe11e5d8ef0a... | pjmlp wrote: | US universities maybe, there isn't much Java on my former | university learning plan. | | The only subjects that went full into Java were | distributed computing and compiler design. | | And during the last 20 years they already went back into | their decision. | | I should note that languages like Prolog, ML and | Smalltalk were part of the learning subjects as well. | | Assembly was part of electronic subjects where design of | a pseudo CPU was also part of the themes. So we had our | own pseudo Assembly, x86 and MIPS. | jcelerier wrote: | > Well there's a reason universities switched to Java | when teaching algorithms and containers after the 90's | | Where ? 
I learned algorithms in C and C++ (and also a bit | in Caml and LISP) and I was in university 2011-2014. | ta988 wrote: | Yes, this is the curse of knowledge: people that know C++ | through their exposure to it for decades are usually unable to | bring any newcomer to it. | microtherion wrote: | Yes, there is some value in using C for teaching these | concepts. But the problem I see is that, once taught, many | people will then continue to use C and their hand written | byte swapping functions, instead of moving on to languages | with better abstraction facilities and/or availing themselves | of the (as you point out) many available library | implementations of this functionality. | ok123456 wrote: | Or just use the functions in <arpa/inet.h> to convert from host | to network byteorder? | froh wrote: | this! use hton/ntoh and be happy. | | nitpick: the 64bit versions are not fully available yet, | htonll, ntohll | Animats wrote: | Rust gets this right. These primitives are available for all the | numeric types.
 |
 |     u32::from_le_bytes(bytes)  // u32 from 4 bytes, little endian
 |     u32::from_be_bytes(bytes)  // u32 from 4 bytes, big endian
 |     u32::to_le_bytes(num)      // u32 to 4 bytes, little endian
 |     u32::to_be_bytes(num)      // u32 to 4 bytes, big endian
 |
 | This was very useful to me recently as I had to write the | marshaling and un-marshaling for a game networking format with | hundreds of messages. With primitives like this, you can see | what's going on. | infradig wrote: | There are equivalent functions in C too. The point of the | article is about not using them. How you would implement the | above functions in Rust would be more pertinent. | froh wrote: | isn't the point to be careful when implementing them? so the | compiler detects the intention to byteswap? | | when we ported little endian x86 Linux to the big endian | mainframe we sprinkled hton/ntoh all over the place, happily | so.
they are the way to go and they should be implemented | properly, not be replaced by a homegrown version. | | all that said, I'm surprised 64bit htonll and ntohll are not | standard yet. anybody knows why? | thechao wrote: | Blech. I learned to program (around '99) by implementing | the crusty old FCS1.0 format, which allows for aggressively | weird wire formats. Our machine was a PDP-11/72 with its | head sawzalled off and custom wire wrap boards dropped in. | The "native" format (coming from analog) was 2143 order as | a 36b packet. The bits were [8,0:7] (using verilog | notation). However, sprinkled randomly in the binary header | were chunks of 7- and 8- bit ANSI (packed) and some mutant | knockoff 6-bit EBCDIC. | | The original listing was written by "Jennifer -- please | call me if you have troubles", an undergraduate from MIT. | It was hand-assembled machine code, in a neat hand in a big | blue binder. That code ran non-stop except for a few | hurricanes from 1988 until 2008; bug-free as far as I could | tell. Jennifer last-name-unknown, you were my idol & my | demon! | | I swore off programming for nearly a year after that. | Negitivefrags wrote: | Unless you are planning on running your game on a mainframe, | just don't bother with endianness for the networking. | | Big endian is dead for game developers. | | Copy entire arrays of structs onto the wire without fear! | | (Just #pragma pack them first) | musicale wrote: | > game on a mainframe | | Maybe your program isn't a game. | | Maybe you have to deal a server that uses Power, or an | embedded system that uses PowerPC (or ARM or MIPS in big- | endian mode). | | Maybe you're running on an older architecture (SPARC, | PowerPC, 68K.) | | Maybe you have to deal with a pre-defined data format (e.g. | TCP/IP packet headers) that uses big-endian byte ordering for | some of its components. | Aeolun wrote: | That's theoretically possible. But I'd be very interested | in why. 
Especially if you are doing anything involving | networking. | einpoklum wrote: | This is valid code in C++20:
 |
 |     if constexpr (std::endian::native == std::endian::big) {
 |         std::cout << "big-endian" << '\n';
 |     } else if constexpr (std::endian::native == std::endian::little) {
 |         std::cout << "little-endian" << '\n';
 |     } else {
 |         std::cout << "mixed-endian" << '\n';
 |     }
 |
 | Doesn't solve everything, but it's saner even if what you're | writing is C-style low-level code. | st_goliath wrote: | FWIW there is a <sys/endian.h> on various BSDs that contains | "beXXtoh", "leXXtoh", "htobeXX", "htoleXX" where XX is a number | of bits (16, 32, 64). | | That header is also available on Linux, but glibc (and compatible | libraries) named it <endian.h> instead. | | See: man 3 endian (https://linux.die.net/man/3/endian) | | Of course it gets a bit hairier if the code is also supposed to | run on other systems. | | MacOS has OSSwapHostToLittleIntXX, OSSwapLittleToHostIntXX, | OSSwapHostToBigIntXX and OSSwapBigToHostIntXX in | <libkern/OSByteOrder.h>. | | I'm not sure if Windows has something similar, or if it even | supports running on big endian machines (if you know, please | tell). | | My solution for achieving some portability currently entails | cobbling together a "compat.h" header that defines macros for the | MacOS functions and including the right headers. Something like | this: | | https://github.com/AgentD/squashfs-tools-ng/blob/master/incl... | | This is usually my go-to solution for working with low level | on-disk or on-the-wire binary data structures that demand a specific | endianness. In C I use "load/store" style functions that memcpy | the data from a buffer into a struct instance and do the endian | swapping (or reverse for the store). The copying is also | necessary because the struct in the buffer may not have proper | alignment. | | Technically, the giant macro of doom in the article takes care of | all of this as well.
But unlike the article, I would very much | not recommend hacking up your own stuff if there are systems | libraries readily available that take care of doing the same | thing in an efficient manner. | | In C++ code, all of this can of course be neatly stowed away in a | special class with overloaded operators that transparently takes | care of everything and "decays" into a single integer and exactly | the above code after compilation, but is IMO somewhat cleaner to | read and adds much needed type safety. | anticristi wrote: | Indeed, I don't get the article. It's like writing "C is hard | because here is how hard it is to implement memcpy using SIMD | correctly." | | Please don't do that. Use battle-tested low-level routines. | Unless your USP is "our software swaps bytes faster than the | competition", you should not spend brain power on that. | nwallin wrote: | Windows/MSVC has _byteswap_ushort(), _byteswap_ulong(), | _byteswap_uint64(). (note that unsigned long is 32 bits on | Windows) It's ugly but it works. | | Boost provides boost::endian which allows converting between | native and big or little, which just does the right thing on | all architectures and compilers and compiles down to a no-op or | bswap instruction. It's much better than writing | (and testing!) your own giant pile of macros and ifdefs to detect | the compiler/architecture/OS, include the correct includes, and | perform the correct conversions in the correct places. | [deleted] | tjoff wrote: | At least historically Windows has had big-endian versions, as | both SPARC and Itanium use big endian. | electroly wrote: | Itanium can be configured to run in either endianness (it's | "bi-endian"). Windows on Itanium always ran in little-endian | mode and did not support big-endian mode. The same was true | of PowerPC. Windows never ran in big-endian mode on any | architecture.
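 | (Editor's note: st_goliath's "load" pattern a few comments up,
 | decoding field by field out of a raw buffer while swapping, might
 | look like the sketch below; struct hdr and its layout are a made-up
 | example format, not something from the thread.)

```c
#include <stdint.h>

/* A made-up on-disk header: a 32-bit magic and a 16-bit version,
   both stored big-endian. */
struct hdr {
  uint32_t magic;
  uint16_t version;
};

static uint32_t load32be(const unsigned char *p) {
  return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
         (uint32_t)p[2] << 8 | p[3];
}

static uint16_t load16be(const unsigned char *p) {
  return (uint16_t)(p[0] << 8 | p[1]);
}

/* "Load" style decoder: copying field by field out of the buffer
   sidesteps both alignment and strict-aliasing problems, and the
   endian swap happens in exactly one place. */
static void hdr_load(struct hdr *h, const unsigned char *buf) {
  h->magic   = load32be(buf);
  h->version = load16be(buf + 4);
}
```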
| Sebb767 wrote: | In case anyone else wonders how the code in the linked tweet [0] | would format your hard drive, it's the missing return on f1. | Therefore, f1 is empty as well (no ret) and calling it will | result in f2 being run. The commented out code is irrelevant. | | EDIT: Reading the bug report [1], the actual cause for the | missing ret is that the for loop will overflow, which is UB and | causes clang to not emit any code for the function. | | [0] https://twitter.com/m13253/status/1371615680068526081 | | [1] https://bugs.llvm.org/show_bug.cgi?id=49599 | vlmutolo wrote: | > If you program in C long enough, stuff like this becomes second | nature, and it starts to almost feel inappropriate to even have | macros like the above, since it might be more appropriately | inlined into the specific code. Since there have simply been too | many APIs introduced over the years for solving this problem. To | name a few for 32-bit byte swapping alone: bswap_32, htobe32, | htole32, be32toh, le32toh, ntohl, and htonl which all have pretty | much the same meaning. | | > Now you don't need to use those APIs because you know the | secret. | | This sentiment seems problematic. The solution shouldn't be "we | just have to educate the masses of C programmers on how to | properly deal with endianness". That will never happen. | | The solution should be "It's in the standard library. Go look | there and don't think too hard." C is sufficiently low-level, and | endianness problems sufficiently common, that I would expect that | kind of routine to be available. | lanstin wrote: | The point is that keeping the distinction clear in your head | between numeric semantics and sequence-of-octets semantics | makes the problem universally tractable. You have a data | structure with a numeric value. Here you have a sequence | of octets described by some protocol formalism, BNF in the old | days.
The mapping from one to the other occurs in the math | between octets and numeric values and the various network | protocols for representing numbers. There are many more choices | than just big endian or little endian. Could be ASN.1 infinite | precision ints. Could be 32 bit IEEE floats or 64 bit IEEE | floats. The distinction is universal between language semantics | and external representations. | | This is why people that memcpy structs right into the buf get | such derision, even if it's faster and written for a | mono-implementation of a language semantics. It is sloppy thought | made manifest. | pjmlp wrote: | Typical C culture, you would also expect that by now something | like SDS would be part of the standard as well. | | https://github.com/antirez/sds | saagarjha wrote: | Adding API that introduces an entirely new string model that | is incompatible with the rest of the standard library seems | like a nonstarter. | secondcoming wrote: | Isn't the 'modern' solution to memcpy into a temp and swap the | bytes in that? C++ has added/will add std::launder and std::bless | to deal with this issue. | lanstin wrote: | No, it is to read a byte at a time and turn it into the | semantic value for the data structure you are filling in. Like | read 128 and then 1 and set the variable to 32769. If you are the | author of protobufs then you may run profiling and write the | best assembly etc., but otherwise no, don't do it. | loeg wrote: | > Isn't the 'modern' solution to memcpy into a temp and swap | the bytes in that? | | Or just use the endian.h / sys/endian.h routines, which do the | right thing (be32dec / be32enc / whatever). memcpy+swap is | fine, and easier to get right than the author's giant | expressions, but you might as well use the named routines that | do exactly what you want already.
It makes sense to | provide some C implementation for portability's sake but any | sizeable reordering cries out for a handtuned, processor | specific, approach (and the non-sizeable probably doesn't require | high speed). I would expect any SIMD instruction set to include a | shuffle. | phkahler wrote: | It can also be a good idea to swap recursively. First swap the | upper and lower half, then swap the upper and lower quarters | (bytes for a 32bit) which can be done with only 2 masks. Then | if its 64bit value swap alternate bytes, again with only 2 | masks. This can be extended all the way to full bit reverse in | 3 more lines each with 2 masks and shifts. | [deleted] | ipython wrote: | It's not every day you can write a blog post that calls out rob | pike... ;) | jart wrote: | Author here. I'm improving upon Rob Pike's outstanding work. | Standing on the shoulders of a giant. | ipython wrote: | Totally agree. My comment was made in jest. Mad kudos to you | as you clearly possess talent and humility that's in short | supply today. | bigbillheck wrote: | I just use ntohl/htonl like a civilized person. | | (Yes, the article mentions those, but they've been standard for | decades). | froh wrote: | what's the best practice for 64bit values these days? is htonll | ntohll widely available yet? | amelius wrote: | Byte order is one of the great unnecessary historical fuck ups in | computing. | | A similar one is that signedness of char is machine dependent. | It's typically signed on Intel and unsigned on ARM. | | Sigh! | mytailorisrich wrote: | I don't think it's a fuck up, rather I think it was | unavoidable: Both ways are equally valid and when the time came | to make the decision, some people decided one way, some people | decided the other way. 
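 | (Editor's note: phkahler's halves-then-bytes swap above, sketched in
 | C; swap32 is my name for it.)

```c
#include <stdint.h>

/* Swap recursively: first exchange the 16-bit halves, then exchange
   the bytes within each half using two masks. Three more lines of the
   same shape (nibbles, bit pairs, single bits) would give a full
   32-bit bit reversal. */
static uint32_t swap32(uint32_t x) {
  x = x << 16 | x >> 16;                              /* swap halves */
  x = (x & 0x00ff00ff) << 8 | (x >> 8 & 0x00ff00ff);  /* swap bytes  */
  return x;
}
```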
| amelius wrote: | By the way, mathematicians also have their fuck ups: | | https://tauday.com/tau-manifesto | 8jy89hui wrote: | For anyone curious or who is still attached to pi, here is a | response to the tau manifesto: | | https://blog.wolfram.com/2015/06/28/2-pi-or-not-2-pi/ | joppy wrote: | Why is it an issue any more than say, order of fields in a | struct is an issue? In one case you read bytes off the disk by | doing ((b[0] << 8) | b[1]) (or equivalent), with the order | reversed the other way around. Any application-level (say, not | a compiler, debugger, etc) program should not even need to know | the native byte order, it should only need to know the encoding | that the file it's trying to read used. | zabzonk wrote: | > order of fields in a struct | | This is defined in C to be the order the fields are declared | in. | occamrazor wrote: | But the padding rules between fields are a mess. | ta_ca wrote: | the greatest of all is lisp not being the most mainstream | language, and we can only blame the lisp companies for this | fiasco. in an ideal world we all would be using a lisp with | parametric polymorphism. from highest level abstractions to | machine level, all in one language. | ta_ca wrote: | i hope these downvotes are due to my failure at english or | the comment being off-topic (or both). if not, can i just | replace lisp with rust and be friends again? | [deleted] | bregma wrote: | And which is the correct byte ordering, pray tell? | bonzini wrote: | Little endian has the advantage that you can read the low | bits of data without having to adjust the address. So you can | for example do long addition in memory order rather than | having to go backwards, or (with an appropriate | representation such as ULEB128) in one pass without knowing | the size. | js8 wrote: | Maybe I am biased working on mainframes, but I would | personally take big endian over little endian. 
The reason | is when reading a hex dump, I can easily read the binary | integers from left to right. | bonzini wrote: | That's the only thing that BE has over LE. | | But for example bitmaps in BE are a huge source of bugs, | as readers and writers need to agree on the size to use | for memory operations. | | "SIMD in a word" (e.g. doing strlen or strcmp with 32- or | 64-bit memory accesses) might have mostly fallen out of | fashion these days, but it's also more efficient in LE. | wongarsu wrote: | Big and little endian are named after the never-ending "holy" | war in Gulliver's Travels over how to open eggs. So we were | always of the opinion that it doesn't really matter. But I | open my eggs on the little end. | CountHackulus wrote: | Middle-endian is the only correct answer. It's a tradeoff | between both little-endian and big-endian. The PDP-11 got it | right. | ben509 wrote: | Yup, we're all waiting for the rest of the world to catch | up to MM/DD/YYYY. | rwmj wrote: | Big Endian of course :-) However the one which has won is | Little Endian. Even IBM admitted this when it switched the | default in POWER 7 to little endian. s390x is the only | significant architecture that is still big endian. | kstenerud wrote: | Big endian is easier for humans to read when looking at a | memory dump, but little endian has many useful features in | binary encoding schemes due to the low byte being first. | | I used to like big endian more, but after deep investigation | I now prefer little endian for any encoding schemes. | bombcar wrote: | Couldn't encoding systems be redone with emphasis on the | high-order bits? Or is the assumption that the values are | clustered in the low bits? | amelius wrote: | I think the fundamental problem is that if you start a | computation using the N most significant bits and then | incrementally add more bits, e.g. N+M bits total, then | your first N bits might change as a result. | | E.g.
decimal example:
 |
 |     1.00/1.00   = 1.00
 |     1.000/1.001 = 0.999000999000...
 |
 | (adding one more bit changes the first bits of the | outcome) | kstenerud wrote: | You can put emphasis on high order bits, but that makes | decoding more complex. With little endian the decoder | builds low to high, which is MUCH easier to deal with, | especially on spillover. | | For example, with ULEB128 [1], you just read 7 bits at a | time, going higher and higher up the value you're | reconstituting. If the value grows too big and you need | to spill over to the next (such as with big integer | implementations), you just fill the last bits of the old | value, then put the remainder bits in the next value and | continue on. | | With a big endian encoding method (i.e. VLQ used in MIDI | format), you start from the high bits and work your way | down, which is fine until your value spills over. Because | you only have the high bits decoded at the time of the | spillover, you now have to start shifting bits along each | of your already decoded big integer portions until you | finally decode the lowest bit. This of course gets | progressively slower as the bits and your big integer | portions pile up. | | Encoding is easier too, since you don't need to check if | for example a uint64 integer value can be encoded in 1, | 2, 3, 4, 5, 6, 7 or 8 bytes. Just encode the low 8 bits, | shift the source right by 8, repeat, until the source | value is 0. Then backtrack to the as-yet-blank encoded | length field in your message and stuff in how many bytes | you encoded. You just got the length calculation for | free. Use a scheme where you only encode up to 60 bit | values, place the length field in the low 4 bits, and | Robert's your father's brother! | | For data that is right-heavy (i.e. the fully formed data | always has real data on the right side and blank filler | on the left - such as uint32 value 8 is actually | 0x00000008), you want a little endian scheme.
For data | that is left-heavy, you want a big endian scheme. Since | most of the data we deal with is right-heavy, little | endian is the way to go. | | You can see how this has influenced my encoding design in | [2] [3] [4]. | | [1] https://en.wikipedia.org/wiki/LEB128 | | [2] https://github.com/kstenerud/concise- | encoding/blob/master/cb... | | [3] https://github.com/kstenerud/compact- | float/blob/master/compa... | | [4] https://github.com/kstenerud/compact- | time/blob/master/compac... | pantalaimon wrote: | The good thing is that Big Endian is pretty much irrelevant | these days. Of all the historically Big Endian architectures, | s390x is indeed the only one left that has not switched to | little endian. | globular-toast wrote: | Even if all CPUs were little-endian, big-endian would exist | almost everywhere _except_ CPUs, including in your head. | Unless you're some odd person that actually thinks in | little-endian. | chrisseaton wrote: | > The good thing is that Big Endian is pretty much irrelevant | these days. | | This is nonsense - many file formats are big endian. | benjohnson wrote: | With a bonus of some being EBCDIC too. | lanstin wrote: | This is true. | erk__ wrote: | As was discussed in a subthread yesterday [0], ARM does | support big endian; though it is not used as much anymore, it | is still there. | | POWER also still uses big endian, though recently little | endian POWER has gotten more popular. | | [0]: https://news.ycombinator.com/item?id=27075419 | akvadrako wrote: | Network protocols still mostly use "Network Byte Order", i.e. | big endian. | lanstin wrote: | Or text. Or handled by generated code like protobuf. | tssva wrote: | Network byte order is big endian so it is far from being | pretty much irrelevant these days.
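 | (Editor's note: kstenerud's ULEB128 walkthrough above can be
 | sketched as below; the function names are mine, and there is no
 | bounds checking, so treat it as a sketch rather than a hardened
 | decoder.)

```c
#include <stdint.h>
#include <stddef.h>

/* Encode: emit 7 bits at a time, low bits first; the top bit of each
   byte flags "more to come". The byte count falls out for free. */
static size_t uleb128_encode(uint64_t v, unsigned char *out) {
  size_t n = 0;
  do {
    unsigned char b = v & 0x7f;
    v >>= 7;
    out[n++] = b | (v ? 0x80 : 0);
  } while (v);
  return n;
}

/* Decode: because the bits arrive in ascending order, the value is
   built low-to-high with a single running shift, which is why the
   little-endian scheme stays cheap even on spillover. */
static uint64_t uleb128_decode(const unsigned char *in, size_t *len) {
  uint64_t v = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char b;
  do {
    b = in[i++];
    v |= (uint64_t)(b & 0x7f) << shift;
    shift += 7;
  } while (b & 0x80);
  if (len) *len = i;
  return v;
}
```

 | Encoding 624485, the classic LEB128 example, yields the three bytes
 | 0xE5 0x8E 0x26: lowest 7 bits first.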
| BenoitEssiambre wrote: | Also, this might be irrelevant at the cpu level, but within | a byte, bits are usually displayed most significant bit | first, so with little endian you end up with bit order: | | 7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 | | instead of | | 15 to 0 | | This is because little endian is not how humans write | numbers. For consistency with little endianness we would | have to switch to writing "one hundred and twenty three" as | | 321 | froh wrote: | that's why little endian == broken endian | | said a friend who also quips: "never trust a computer you | can lift" | LightMachine wrote: | Exactly. This is so infuriating. Whoever let little- | endian win made a huge disfavor for humanity. | jart wrote: | Blame the people who failed to localize the right-to-left | convention when arabic numerals were adopted. It's one of | those things like pi vs. tau or jacobin weights and | measurements vs. planck units. Tradition isn't always | correct. John von Neumann understood that when he | designed modern architecture and muh hex dump is not an | argument. | kstenerud wrote: | The only benefit to big endian is that it's easier for | humans to read in a hex dump. Little endian on the other | hand has many tricks available to it for building | encoding schemes that are efficient on the decoder side. | tom_mellior wrote: | Could you elaborate on these tricks? This sounds | interesting. | | The only thing I'm aware of that's neat in little endian | is that if you want the low byte (or word or whatever | suffix) of a number stored at address a, then you can | simply read a byte from exactly that address. Even if you | don't know the size of the original number. | kstenerud wrote: | I've posted in some other replies, but a few: | | - Long addition is possible across very large integers by | just adding the bytes and keeping track of the carry. 
| | - Encoding variable sized integers is possible through an | easy algorithm: set aside space in the encoded data for | the size, then encode the low bits of the value, shift, | repeat until value = 0. When done, store the number of | bytes you wrote to the earlier length field. The length | calculation comes for free. | | - Decoding unaligned bits into big integers is easy | because you just store the leftover bits in the next | value of the bigint array and keep going. With big | endian, you're going high bits to low bits, so once you | pass to more than one element in the bigint array, you | have to start shifting across multiple elements for every | piece you decode from then on. | | - Storing bit-encoded length fields into structs becomes | trivial since it's always in the low bit, and you can | just incrementally build the value low-to-high using the | previously decoded length field. Super easy and quick | decoding, without having to prepare specific sized | destinations. | mafuy wrote: | Correct me if I'm wrong, but were the now common numbers | not imported in the same order from Arabic, which writes | right to left? So numbers were invented in little endian, | and we just forgot to translate their order. | dahart wrote: | Good question, I just did a little digging to see if I | could find out. It sounds like old Arabic did indeed use | little endian in writing and speaking, but modern Arabic | does not. However, place values weren't invented in | Arabic, Wikipedia says that occurred in Mesopotamia, | which spoke primarily Sumerian and was written in | Cuneiform - where the direction was left to right. | | https://en.wikipedia.org/wiki/Number#First_use_of_numbers | | https://en.wikipedia.org/wiki/Mesopotamia | | https://en.wikipedia.org/wiki/Cuneiform | gpanders wrote: | It might not be how humans _write_ numbers but it is | consistent with how we think about numbers in a base | system. 
| | 123 = 3x10^0 + 2x10^1 + 1x10^2
|
| So if you were to go and label each digit in 123 with the
| power of 10 it represents, you end up with little endian
| ordering (e.g. the 3 has index 0 and the 1 has index 2).
| This is why little endian has always made more sense to
| me, personally.
| dahart wrote:
| I always think about _values_ in big endian, largest
| digit first. Scientific notation, for example, since
| often we only care about the first few digits.
|
| I sometimes think about _arithmetic_ in little endian,
| since addition always starts with the least significant
| digit, due to the right-to-left dependency of carrying.
|
| Except lately I've been doing large additions big-endian
| style left-to-right, allowing intermediate "digits" with
| a value greater than 9, and doing the carry pass
| separately after the digit addition pass. It feels easier
| to me to think about addition this way, even though it's
| a less efficient notation.
|
| Long division and modulus are also big-endian operations.
| My favorite CS trick was learning how you can compute any
| arbitrarily sized number mod 7 in your head as fast as
| people are reading the digits of the number, from left to
| right. If you did it little-endian you'd have to remember
| the entire number, but in big endian you can forget each
| digit as soon as you use it.
| BenoitEssiambre wrote:
| I don't know, when we write in general, we tend to write
| the most significant stuff first so you lose less
| information if you stop early. Even numbers we truncate:
| "twelve million" instead of something like "twelve
| million, zero thousand, zero hundred and zero".
| lanstin wrote:
| Next you are going to want little endian polynomials, and
| that is just too far. Also, the advantage of big endian
| is it naturally extends to decimals/negative exponents
| where the later on things are less important. X squared
| plus x plus three minus one over x plus one over x
| squared etc.
|
| Loss of big endian chips saddens me like the loss of
| underscores in var names in Go Lang. The homogeneity is
| worth something, thanks intel and camelCase, but the old
| order that passes away and is no more had the beauty of a
| new world.
| occamrazor wrote:
| In German _ein hundert drei und zwanzig_, literally _one
| hundred three and twenty_. The hardest part is telephone
| numbers, which are usually given in blocks of two digits.
| lanstin wrote:
| Well that would be hard for me to learn. I always find
| the small numbers between like 10 and 100 or 1000 the
| hardest for me to remember in languages I am trying to
| learn a bit of.
| mrlonglong wrote:
| In an ideal world which endian format would one go for?
| tempodox wrote:
| I for one would go for big-endian, simply because reading
| memory dumps and byte blocks in assembly or elsewhere works
| without mental byte-swapping arithmetic for multi-byte
| entities.
|
| Just out of curiosity, I would be interested in learning why so
| many CPUs today are little-endian. Is it because it is cheaper
| / more efficient for processor implementations or is it because
| "the others do it, so we do it the same way"?
| anticristi wrote:
| My brain is trained to read little-endian in memory dumps.
| It's no different than the German "fünf-und-zwanzig" (five
| and twenty). :))
| bombcar wrote:
| https://stackoverflow.com/questions/5185551/why-is-x86-littl...
|
| It simplifies certain instructions internally. Practically
| everything is little endian because x86 won.
|
| > And if you think about a serial machine, you have to
| process all the addresses and data one-bit at a time, and the
| rational way to do that is: low-bit to high-bit because
| that's the way that carry would propagate. So it means that
| [in] the jump instruction itself, the way the 14-bit address
| would be put in a serial machine is bit-backwards, as you
| look at it, because that's the way you'd want to process it.
| Well, we were gonna build a byte-parallel machine, not
| bit-serial and our compromise (in the spirit of the customer and
| just for him), we put the bytes in backwards. We put the low-
| byte [first] and then the high-byte. This has since been
| dubbed "Little Endian" format and it's sort of contrary to
| what you'd think would be natural. Well, we did it for
| Datapoint. As you'll see, they never did use the [8008] chip
| and so it was in some sense "a mistake", but that [Little
| Endian format] has lived on to the 8080 and 8086 and [is] one
| of the marks of this family.
| mrlonglong wrote:
| And does middle endian even exist?
| FabHK wrote:
| US date format: 12/31/2021
| bitwize wrote:
| Little endian. There is no extant big-endian CPU that matters.
| mrlonglong wrote:
| I did say in an ideal world.
| bitwize wrote:
| Hint: The reason why it's called "endianness" comes from
| the novel _Gulliver's Travels_, in which the neighboring
| nations of Lilliput and Blefuscu went to bitter, bloody war
| over which end to break your eggs from: the big end or the
| little end. The warring factions were also known as Big-
| Endians and Little-Endians, and each thought themselves
| superior to the dirty heathens on the other side. If one
| side were objectively correct, if there were an inherent
| advantage to breaking your egg from one side or the other,
| would there be a war at all?
| dragonwriter wrote:
| > if there were an inherent advantage to breaking your
| egg from one side or the other, would there be a war at
| all?
|
| Fascism vs. not-fascism, Stalinist Communism vs. Western
| Capitalism, Islamism vs. liberal democracy... I'm not
| sure "the existence of war around a divide in ideas
| proves that neither side's ideas are correct" is a
| particularly comfortable maxim to consider the
| ramifications of.
| enqk wrote:
| https://fgiesen.wordpress.com/2014/10/25/little-endian-vs-bi...
| marcosdumay wrote:
| Why would one choose the memory representation of the number
| based on the advantages of the internal ALU wiring?
|
| Of all those reasons, the only one I can make sense of is the
| "I can't transparently widen fields after the fact!", and
| that one is way too niche to explain anything.
| enqk wrote:
| I don't understand? Why not make the memory representation
| sympathetic with the operations you're going to do on it?
| It's the raison d'etre of computers to compute and to do it
| fast.
|
| Another example: memory representation of pixels in GPUs
| which are swizzled to make computations efficient
| marcosdumay wrote:
| > I don't understand? Why not make the memory
| representation sympathetic with the operations you're
| going to do on it?
|
| There's no reason to, as there's no reason not to. It's
| basically irrelevant.
|
| If carry propagation is so important, why can't you just
| mirror your transistors and operate on the same wires,
| but in the opposite order? Well, you can, and it's
| trivial. (And, by the way, carry propagation isn't that
| important. High-performance ALUs propagate the carry only
| through blocks, which can appear anywhere. And the wiring
| of those isn't even planar, so how you arrange them isn't
| a showstopper.)
| cygx wrote:
| _> So the solution is simple right? Let's just use unsigned char
| instead. Sadly no. Because unsigned char in C expressions gets
| type promoted to the signed type int._
|
| If you do use _unsigned char_, an alternative to masking would
| be performing the cast to _uint32_t_ before instead of after the
| shift.
|
| _edit:_ For reference, this is what it would look like when
| implemented as a function instead of a macro:
|
|     static inline uint32_t read32be(const uint8_t *p) {
|         return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
|                (uint32_t)p[2] << 8  | (uint32_t)p[3];
|     }
| jcadam wrote:
| A while back I was on a project to port a satellite simulator
| from SPARC/Solaris to RHEL/x64.
The compressed telemetry stream
| that came from the satellite needed to be in big endian (and
| that's what the ground station software expected), and the
| simulator needed to mimic the behavior.
|
| This was not a problem for the old SPARC system, which naturally
| put everything in the correct order without any fuss, but one of
| the biggest sticking points in porting over to x64 was having to
| now manually pack all of that binary data. Using Ada (what
| else!), of course.
| metiscus wrote:
| If memory serves correctly, Ada 2012 and beyond has language-
| level support for this. I was working on porting some code from
| an aviation platform to run on PC and it was all in Ada 2005 so
| we didn't have the benefit of that available.
| jcadam wrote:
| Same here, Ada2005 for the port. The simulator was originally
| written in Ada95. Part of what made it even less fun was the
| data was highly packed and individual fields crossed byte
| boundaries (these 5 bits are X, the next 4 bits are Y, etc.)
| :(
| bombcar wrote:
| Given enough memory it may be worth treating the whole
| stream internally as a bitstream.
| onox wrote:
| Couldn't you add the Bit_Order and Scalar_Storage_Order
| attributes (or aspects in Ada 2012) to your records/arrays?
| Or did Scalar_Storage_Order not exist at the time?
| the_real_sparky wrote:
| This problem is its own special horror in CAN bus data. Between
| endianness and sign it's a nightmare of en/decoding possibilities
| and the associated mistakes that come with that.
| rwmj wrote:
| TIFF is another one. The only endian-switchable image format
| that I'm aware of.
|
| Fun fact: CD-ROM superblocks have both-endian fields. Each
| integer is stored twice in big and little endian format. I
| assume this was to allow underpowered 80s hardware which didn't
| have enough resource to do byte swapping.
| gumby wrote:
| In her first sentence, the phrase "the C / C++ programming
| language" is no longer correct: C++20 requires two's complement
| signed integers.
|
| C++20 is quite new so I would assume that very few people know
| this yet.
|
| C and C++ obviously differ a lot, but by that phrase she clearly
| means "the part where the two languages overlap". The C++
| committee has been willing to break C compatibility in a few ways
| (not every valid C program is a valid C++ program), and this has
| been true for a while.
| loeg wrote:
| It hasn't been true since C99, at least -- C++ didn't adopt C99
| designated initializers.
| hctaw wrote:
| What chips can be targeted by C compilers today that don't use
| 2's complement?
| gumby wrote:
| I haven't seen a one's complement machine in decades but at
| the time C was standardized there were still quite a few
| (afaik none had a single-chip CPU, to get to your question).
| But since they existed, the language definition didn't
| require it and some optimizations were technically UB.
|
| The C++ committee decided that everyone had figured this out
| by now and so made this breaking change.
| klyrs wrote:
| "the c/c++ language" exists insofar as you can import this c
| code into your c++, and this is something that c++ programmers
| need to know how to do, so they'd better learn enough of the
| differences between c and c++ or they'll be stumped when they
| crack open somebody else's old code.
| mitchs wrote:
| Or just cast the pointer to uint##_t and use be##toh and htobe##
| from <endian.h>? I think this is making a mountain out of a
| molehill. I've spent tons of time doing wire (de)serialization in C
| for network protocols and endian swaps are far from the most
| pressing issue I see. The big problem imo is the unsafe practices
| around buffer handling allowing buffer overruns.
| amluto wrote:
| Why mask and then shift instead of casting to the correct type
| and then shifting, like this:
|
|     (uint32_t)x[0] << 24 | ...
|
| Of course, this requires that x[0] be unsigned.
| syockit wrote:
| If this is for deserialisation then it's okay for x[0] to be
| signed. You just need to recast the result as int32_t (or
| simply assign to an int32_t variable without any cast) and it
| is not UB.
| baby wrote:
| I suggest this article:
| https://www.cryptologie.net/article/474/bits-and-bytes-order...
| (shameless plug)
| tails4e wrote:
| Ubsan should default on. If people don't like it, then they
| should be made to turn it off with a switch, so at least it's more
| likely to be run than not run. Could save a huge amount of time
| debugging when compilers or architectures change. Without it, I'd
| say many a programmer would be caught by these subtleties in the
| standard. Coming from a HW background (Verilog) I'd more
| naturally default to masking and shifting when building up larger
| variables from smaller ones, but I can imagine many would not.
| patrakov wrote:
| There was a blog post and a FOSDEM presentation by (misguided)
| Gentoo developers a few years ago, and it was retracted,
| because sanitizers add their own exploitable vulnerabilities
| due to the way they work.
|
| https://blog.hboeck.de/archives/879-Safer-use-of-C-code-runn...
|
| https://www.openwall.com/lists/oss-security/2016/02/17/9
| jart wrote:
| Sanitizers have the ability to bring Rust-like safety
| assurances to all the C/C++ code that exists. The fact that
| existing ASAN runtimes weren't designed for setuid binaries
| shouldn't dissuade us from pursuing those benefits. We just
| need a production-worthy runtime that does fewer things. For
| example, here's the ASAN runtime that's used for the redbean
| web server:
| https://github.com/jart/cosmopolitan/blob/master/libc/intrin...
| pornel wrote:
| Run-time detection and heuristics on a language that is
| hard to analyze (e.g.
due to weak aliasing, useless const,
| ad-hoc ownership and thread-safety rules) aren't in the
| same ballpark as compile-time safety guaranteed by
| construction, and an entire modern ecosystem centered
| around safety. Rust can use LLVM sanitizers in addition to
| its own checks, so that's not even a trade-off.
| tails4e wrote:
| Sorry for my ignorance, but surely some UB being used for
| optimization by the compiler is compile time only. This is
| the part that should default on. Runtime detection is a
| different thing entirely, but compile time is a no brainer.
| MauranKilom wrote:
| > Ubsan should default on
|
| > Could save a huge amount of time debugging when compilers
| or architectures change.
|
| I'm assuming we come from very different backgrounds, but it's
| not clear to me how switching compilers or _architectures_ is
| so common that hardening code against it _by default_ is
| appropriate. I would think that switching compilers or
| architectures is generally done very deliberately, so
| instrumenting code with UBsan _for that transition_ would be
| the right thing to do?
| toast0 wrote:
| Changing compilers is a pretty regular thing IMHO; I use the
| compiler that comes with the OS and let's assume a yearly OS
| release cycle. Most of those will contain at least some
| changes to the compiler.
|
| I don't really want to have to take that yearly update to go
| through and review (and presumably fix) all the UB that has
| managed to sneak in over the year. It would be better to have
| avoided putting it in.
| tails4e wrote:
| Changing gcc version could cause your code with undefined
| behaviour to change. If you rely on UB, whether you know you
| are or not, you are in for a bad time. Ubsan at least lets you
| know if your code is robust, or a ticking time bomb...
| jedisct1 wrote:
| Sanitizers may introduce side channels. This is an issue for
| crypto code.
| rwmj wrote:
| If you can assume GCC or Clang then __builtin_bswap{16,32,64}
| functions are provided which will be considerably more efficient,
| less error-prone, and easier to use than anything you can
| homebrew.
| dataflow wrote:
| _byteswap_{ushort,ulong,uint64} for MSVC. Together with yours
| on x86 these should take care of the three major compilers.
| st_goliath wrote:
| Well, yes. The only thing missing is knowing if you have to
| swap or not, if you don't want to assume your code will run on
| little endian systems exclusively.
|
| Or, on Linux and BSD systems at least, you can use the
| <endian.h> or <sys/endian.h> functions
| (https://linux.die.net/man/3/endian) and rely on the libc
| implementation to do the system/compiler detection for you and
| use an appropriate compiler builtin inside of an inline
| function instead of bothering to hack something together in
| your own code.
|
| The article mentions those functions at the bottom, but
| strangely still recommends hacking up your own macros.
| jart wrote:
| That's not true. If you write the byte swap in ANSI C using the
| gigantic mask+shift expression it'll optimize down to the bswap
| instruction under both GCC and Clang, as the blog post points
| out.
| rwmj wrote:
| Assuming the macros or your giant expression are correct. But
| you might as well use the compiler intrinsics which you
| _know_ are both correct and the most efficient possible, and
| get on with your life.
| jart wrote:
| Sorry, I'd rather place my faith in arithmetic than in
| someone's API, provided the compiler is smart enough to
| understand the arithmetic and optimize accordingly.
| loeg wrote:
| "Someone" here is the same compiler you're trusting to
| optimize your giant arithmetic expression of the same
| idea. Your statement is internally inconsistent.
| lanstin wrote: | There is a value to keeping it completely clear in your | head the difference between a value with arithmetic | semantics vs a value with octets in a stream semantics. | That thinking will work in all contexts, while the | compiler knowledge is limited. The thinking will help you | write correct ways to encode data in the URL or into a | file being uploaded that your code generates for discord | or whatever, in Python, without knowledge of the true | endianness of the system the code is running on. | [deleted] | borman wrote: | Funny that compilers (e.g. clang: | https://github.com/llvm/llvm- | project/blob/b04148f77713c92ee5... ) might be able to do that | only because someone on the compiler team has hand-coded a | bswap expression detector. | bombcar wrote: | Given it can be done with careful code AND many processors have | a single instruction to do it I'm surprised it hasn't been | added to the C standard. | savant2 wrote: | The article explicitly shows that the provided macros are very | efficient with a modern compiler. You can check on godbolt.org | that they emit the same code. | | Though the article only mentions bswap64 and mentioning | __builtin_bswap64 would be a nice addition. | fanf2 wrote: | But then you have to #ifdef the endianness of the target | architecture. If you do it the right way as Russ Cox and | Justine Tunney say, then your code can serialize and | deserialize correctly regardless of the platform endianness. | chrisseaton wrote: | __builtin_bswap does exactly the same thing as the macros. | russdill wrote: | The fallacy in the article is that anyone should code these | functions. There's plenty of public domain libraries that do | this correctly. | | https://github.com/rustyrussell/ccan/blob/master/ccan/endian... | nly wrote: | My favourite builtins are the overflow checked integer | operations: | | https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins... 
| captainmuon wrote:
| It is a ridiculous feature of modern C that you have to write the
| super verbose "mask and shift" code, which then gets compiled to
| a simple `mov` and maybe a `bswap`. Whereas the direct equivalent
| in C, an assignment with a (type changing) cast, is illegal.
| There is a huge mismatch between the assumptions of the C spec
| and actual machine code.
|
| One of the few reasons I ever even reached for C is the ability to
| slurp in data and reinterpret it as a struct, or the ability to
| reason in which registers things will show up and mix in some
| `asm` with my C.
|
| I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine. That doesn't mean
| the compiler can't do optimizations, but it shouldn't do things
| like prove code is UB and fold everything to a no-op. (Like when
| you defensively compare a pointer to NULL that according to spec
| must not be NULL, but practically could be...)
|
| `-fno-strict-overflow -fno-strict-aliasing -fno-delete-null-
| pointer-checks` gets you halfway there, but it would really only
| be viable if you had a blessed `-std=high-level-assembler` or
| `-std=friendly-c` flag.
| MrBuddyCasino wrote:
| > There is a huge mismatch between the assumptions of the C
| spec and actual machine code.
|
| People like to say "C is close to the metal". Really not true
| at all anymore.
| goldenkey wrote:
| Actually, it is true - which is why endianness is a problem in
| the first place. ASM code is different when written for
| little endian vs big endian. Access patterns are positively
| offset instead of negatively.
|
| A language that does the same things regardless of endianness
| would not have pointer arithmetic. That is not ASM and not C.
| pjmlp wrote:
| It does, macro assemblers, especially those with PC and Amiga
| roots.
|
| Which, given its heritage, is what PDP-11 C used to be;
| after all, BCPL's origin was as the minimal language required to
| bootstrap CPL, nothing else.
|
| Actually, I think TI has a macro assembler with a C-like
| syntax, just cannot recall the name any longer.
| simias wrote:
| > _Whereas the direct equivalent in C, an assignment with a
| (type changing) cast, is illegal._
|
| I don't understand what you mean by that. The direct equivalent
| of what? Endianness is not part of the type system in C so I'm
| not sure I follow.
|
| > _I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine._
|
| Linus agrees with you here, and I disagree with both of you.
| _Some_ UBs could certainly be relaxed, but as a rule I want my
| code to be portable and for the compiler to have enough leeway
| to correctly optimize my code for different targets without
| having to tweak my code.
|
| I want strict aliasing and I want the compiler to delete
| extraneous NULL pointer checks. Strict overflow I'm willing to
| concede; at the very least the standard should mandate wrap-on-
| overflow even for signed integers IMO.
| lanstin wrote:
| I am sympathetic, but portability was more important in the
| past and gets less important each year. I used to write code
| strictly keeping the difference between numeric types and
| sequences of bytes in mind, hoping to one day run on an Alpha
| or a Tandem or something, but it has been a long time since I
| have written code that runs on non-(Intel AMD or le ARM)
| mhh__ wrote:
| D's machine model does actually assume the hardware, and using
| the compile time metaprogramming you can pretty much do
| whatever you want when it comes to bit twiddling - whether that
| means assembly, flags etc.
| pornel wrote:
| Of course nobody wants C to backstab them with UB, but at the
| same time programmers want compilers to generate optimal code.
| That's the market pressure that forces optimizers to be so
| aggressive. If you can accept less optimized code, why aren't
| you using tcc?
|
| The idea of C that "just" does a straightforward machine
| translation breaks down almost immediately. For example, you'd
| want `int` to just overflow instead of being UB. But then it
| turns out indexing `arr[i]` can't use 64-bit memory addressing
| modes, because they don't overflow like a 32-bit int does. With
| UB it doesn't matter, but a "straightforward C" would emit
| unnecessary separate 32-bit mul/shift instructions.
|
| https://gist.github.com/rygorous/e0f055bfb74e3d5f0af20690759...
| MaxBarraclough wrote:
| > nobody wants C to backstab them with UB, but at the same
| time programmers want compilers to generate optimal code
|
| The value of compiler optimization isn't the same thing as
| the value of having extensive undefined behaviour in a
| programming language.
|
| Rust and Ada perform about the same as C, but lack C's many
| footguns.
|
| > indexing `arr[i]` can't use 64-bit memory addressing modes
|
| What do you mean here?
| remexre wrote:
| Typically, the assembly instruction that would do the read
| in arr[i] can do something like:
|
|     x = *(y + z);
|
| where y and z are both 64-bit integers. If I had
|
|     int arr[1000];
|     initialize(&arr);
|     int i = read_int();
|     int x = arr[i];
|     print(x);
|
| then to get x I'd need to do something like
|
|     tmp = i * 4;
|     tmp1 = (uint64_t)tmp;
|     x = *(arr + tmp1);
|
| Which, since i is signed, can't just be a cheap shift, and
| then needs to be upcast to a uint64_t (which is cheap, at
| least).
| ajross wrote:
| > There is a huge mismatch between the assumptions of the C
| spec and actual machine code.
|
| Right, which is why the kind of UB pedantry in the linked
| article is hurting and not helping.
Cranky old man perspective
| here:
|
| Folks: the fact that compilers will routinely exploit edge
| cases in undefined behavior in the language specification to
| miscompile obvious idiomatic code is a _terrible bug in the
| compilers_. Period. And we should address that by fixing the
| compilers, potentially by amending the spec if feasible.
|
| But instead the community wants to all look smart by showing
| how much they understand about "UB" with blog posts and (worse)
| drive-by submissions to open source projects (with passive-
| aggressive sneers about code quality), so nothing gets better.
|
| Seriously: don't tell people to shift and mask. Don't
| pontificate over compiler flags. Stop the masturbatory use of
| ubsan (though the tool itself is great). And start submitting
| bugs against the toolchain to get this fixed.
| wnoise wrote:
| I read this, and go "yes, yes, yes", and then "NO!".
|
| Shifts and ors really is the sanest and simplest way to
| express "assembling an integer from bytes". Masking is _a_
| way to deal with the current C spec, which has silly promotion
| rules. Unsigned everything is more fundamental than signed.
| jart wrote:
| I agree, but the language of the standard very unambiguously lets
| them do it. Quoth X3.159-1988:
|
|     Undefined behavior --- behavior, upon use of a nonportable
|     or erroneous program construct, of erroneous data, or of
|     indeterminately-valued objects, for which the Standard
|     imposes no requirements. Permissible undefined behavior
|     ranges from ignoring the situation completely with
|     unpredictable results, to behaving during translation or
|     program execution in a documented manner characteristic of
|     the environment (with or without the issuance of a
|     diagnostic message), to terminating a translation or
|     execution (with the issuance of a diagnostic message).
|
| In the past compilers "behaved during translation or program
| execution in a documented manner characteristic of the
| environment" and now they've decided to "ignore the situation
| completely with unpredictable results". So yes, what gcc and
| clang are doing is hostile and dangerous, but it's legal.
| https://justine.lol/undefined.png So let's fix our code. The
| blog post is intended to help people do that.
| userbinator wrote:
| _So let's fix our code._
|
| No; I say we force the _compiler writers_ to fix their
| idiotic assumptions instead of bending over backwards to
| please what's essentially a tiny minority. There are a lot
| more programmers who are not compiler writers.
|
| The standard is really a _minimum bar_ to meet, and what's
| not defined by it is left to the discretion of the
| implementers, who should be doing their best to follow the
| "spirit of C", which ultimately means behaving sanely. "But
| the standard allows it" should never be a valid argument
| --- the standard allows a lot of other things, not all of
| which make sense.
|
| A related rant by Linus Torvalds:
| https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129
| cbmuser wrote:
| > One of the few reasons I ever even reached for C is the
| ability to slurp in data and reinterpret it as a struct, or the
| ability to reason in which registers things will show up and
| mix in some `asm` with my C.
|
| Which results in undefined behavior according to the C ISO
| standard.
|
| Quote:
|
| "2 All declarations that refer to the same object or function
| shall have compatible type; otherwise, the behavior is
| undefined."
|
| From: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
| 6.2.7
| [deleted]
| jstanley wrote:
| Exactly.
| innocenat wrote:
| How? I mean, doesn't GP mean this?
    struct whatever p;
    fread(&p, sizeof(p), 1, fp);
| tsimionescu wrote:
| It should be perfectly fine to do this:
|
|     union reinterpret {
|         char raw[100];
|         struct myStruct interpreted;
|     } example;
|     read(fd, example.raw, sizeof example.raw);
|     struct myStruct dest = example.interpreted;
|
| This is standard-compliant C code, and it is a common way of
| reading IP addresses from packets, for example.
| saagarjha wrote:
| (It should be noted that this is not valid C++ code.)
| nine_k wrote:
| I suspect you might like C--.
|
| https://en.m.wikipedia.org/wiki/C--
| froh wrote:
| you could instead simply use hton/ntoh and trust the library
| properly does The Right Thing tm
| nly wrote:
| > I think there should really be a dialect of C(++) where the
| machine model is exactly the physical machine.
|
| Sounds great, until you have to rewrite all your software to go
| from x86-64 to ARM
| pjmlp wrote:
| Quite common when coding games back in the 8 and 16 bit days.
| :)
|
| However for the case in hand, it would suffice to just write
| the key routines in Assembly, not everything.
| pm215 wrote:
| So in your 'machine model is the physical machine' flavour,
| should "I cast an unaligned pointer to a byte array to int32_t
| and deref" on SPARC (a) do a bunch of byte-load-and-shift-and-
| OR or (b) emit a simple word load which segfaults? If the
| former, it's not what the physical machine does, and if the
| latter, then you still need to write the code as "some portable
| other thing". Which is to say that the spec's UB here is in
| service of "allow the compiler to just emit a word load when
| you write *(int32_t *)p".
|
| What I think the language is missing is a way to clearly write
| "this might be unaligned and/or wrong endianness, handle that".
| (Sometimes compilers provide intrinsics for this sort of gap,
| as they do with popcount and count-leading-zeroes; sometimes
| they recognize common open-coded idioms. But proper
| standardised support would be nicer.)
| jart wrote:
| Endianness doesn't matter though, for the reasons Rob Pike
| explained. For example, the bits inside each byte arguably
| have an endianness inside the CPU, but they're not
| addressable, so no one thinks about that. The brilliance of
| Rob Pike's recommendation is that it allows our code to be
| byte order agnostic for the same reasons our code is already
| bit order agnostic.
|
| I agree about bsf/bsr/popcnt. I wish ASCII had more
| punctuation marks because those operations are as fundamental
| as xor/and/or/shl/shr/sar.
| klodolph wrote:
| You don't have to mask and shift. You can memcpy and then byte
| swap in a function. It will get inlined as mov/bswap.
|
| Practically speaking, common compilers have intrinsics for
| bswap. The memcpy function can be thought of as an intrinsic
| for unaligned load/store.
| BeeOnRope wrote:
| How do you detect if a byte swap is needed? I.e. whether the
| (fixed) wire endianness matches the current platform
| endianness?
| edflsafoiewq wrote:
| I.e. how do you know the target's endianness? C++20 added
| std::endian. Otherwise you can use a macro like this one
| from SDL
|
| https://github.com/libsdl-org/SDL/blob/9dc97afa7190aca5bdf92...
| hermitdev wrote:
| There have been CPU architectures where knowing the
| endianness at compile time isn't necessarily sufficient. I
| forget which, maybe it was DEC Alpha, where the CPU could
| flip back and forth? I can't recall if it was a "choose at
| boot" or a per-process change.
| magicalhippo wrote:
| ARM allows dynamic changing of endianness[1].
|
| [1]:
| https://developer.arm.com/documentation/dui0489/h/arm-and-th...
| user-the-name wrote:
| When do you byte swap?
| themulticaster wrote:
| The first example in the article is flawed (or at least
| misleading).
|
| 1) They define a char array (which defaults to signed char, as
| mentioned in the post), including the value 0x80 which can't be
| represented in char, resulting in a compiler warning (e.g. in GCC
| 11.1).
| |
| The mentioned reason against using unsigned char (that shifting
| 128 left by 24 places results in UB) is also misleading: I could
| not reproduce the UB when changing the array to unsigned char.
| Perhaps the author meant leaving the array defined as signed
| char, but casting the signed chars to unsigned before shifting.
| That indeed results in UB, but I don't see why you would define
| the array as signed in the first place.
|
| 2) The cause of the undefined behavior isn't the bswap_32;
| rather, it's that they try reading a uint32_t value from a
| char array where b[0] is not aligned on a word boundary.
|
| There is no need at all to redefine bswap. The simple solution
| would be to use an unsigned char array instead of a char array
| and just read the values byte-wise.
|
| Of course C has its footguns and warts and so on, but there is no
| need to dramatize it this much in my opinion.
|
| I've prepared a Godbolt example to better explain the arguments
| mentioned above: https://godbolt.org/z/Y1EWK6e17
|
| Edit: To add to point 2) above: Another way to avoid the UB (in
| this specific case) would be to add __attribute__ ((aligned (4)))
| to the definition of b. In that case, even reading the array as a
| single uint32_t works as expected since the access is aligned to
| a word boundary.
|
| Obviously, you can't expect any random (unsigned char) pointer to
| be aligned on a word boundary. Therefore, it is still necessary
| to read the uint32_t byte by byte.
| cygx wrote:
| _> The mentioned reason against using unsigned char (that
| shifting 128 left by 24 places results in UB) is also
| misleading_
|
| No, that reasoning is correct. Integer promotions are performed
| on the operands of a shift expression, meaning the left operand
| will be promoted to signed int even if it starts out as
| unsigned char. Trying to shift a byte value with the highest bit
| set by 24 will result in a value not representable as signed
| int, leading to UB.
| themulticaster wrote:
| Thanks, I just noticed a small mistake in my example (I don't
| trigger the UB because I access b[0] containing 0x80 without
| shifting; I meant to do it the other way around).
|
| Still, adding an explicit cast to the left operand seems to
| be enough to avoid this, e.g.:
|     uint32_t x = ((uint32_t)b[0]) << 24;
|
| In summary, I think my point that using unsigned char would
| be appropriate in this case still stands.
| cygx wrote:
| _> Still, adding an explicit cast to the left operand seems
| to be enough to avoid this_
|
| Indeed. See my other comment:
| https://news.ycombinator.com/item?id=27086482
| commandersaki wrote:
| It wasn't clear to me: what was the undefined behaviour in the
| naive approach?
| cygx wrote:
| Violation of the effective typing rules ('strict aliasing') and
| a potential violation of the alignment requirements of your
| platform.
| [deleted]
___________________________________________________________________
(page generated 2021-05-08 23:00 UTC)