commented:
"Only called in case of an error, so performance is not important."
Ah yeah, don’t love this. An obvious example of where this fails is when you’re dealing with untrusted or simply arbitrary inputs, where errors might be exactly what we expect. Or you might be dealing with corrupted data in a memory dump or something, and trying to parse out whatever you can find. The error path shouldn’t negatively impact the success path, that’s fine, but the error path should be fast too.
Lovely post. Great information, super interesting, just A+ love it.

commented:
"Only called in case of an error, so performance is not important."
For context: this is a comment in serde’s source code, not the words of the author of this blog post.

commented:
Article author here, glad to answer any questions!

commented:
Thanks both for your work and for the explanation!

commented:
Very nice! wrt:
"This, of course, only works on little-endian machines. On big-endian machines, c has to be bytereversed."
You can skip the byteswap by fiddling with the clz/ctz instead, to find the most significant non-zero byte instead of the least significant byte. Good luck finding a big-endian machine to test it on, tho!

commented:
This is actually really funny because that’s what I first thought of. In fact, that was the original implementation. And then, while writing the post, I realized this is actually wrong, and dtolnay and I had to release a hotfix: https://github.com/serde-rs/json/pull/1173
The reason? Subtraction can overflow, so you can only trust the lowest bit set.

commented:
Oh right, now I see you actually explained that in the sentence immediately before the one I quoted, d’oh! The tolower hacks I have worked on had to prevent borrows or carries from propagating between bytes. It’s cool that search-style SWAR tricks can save an op or two by letting the borrows fly.

commented:
Fabulous article and great work!
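
For readers following the clz/ctz exchange, here is a minimal Rust sketch of why only the lowest set bit can be trusted. It is not the serde_json code: it assumes the classic "find a zero byte" SWAR test as a stand-in for the real search, and the function names are made up for illustration. A borrow out of a real zero byte can falsely flag the next more significant byte when that byte is 0x01, so reading the mask from the top gives the wrong answer.

```rust
const LO: u64 = 0x0101_0101_0101_0101;
const HI: u64 = 0x8080_8080_8080_8080;

/// Classic SWAR zero-byte test. The lowest flagged byte is always a real
/// zero byte; higher flags may be false positives caused by borrows.
fn zero_byte_mask(v: u64) -> u64 {
    v.wrapping_sub(LO) & !v & HI
}

/// Little-endian load: the first byte in memory is the least significant,
/// so the lowest set bit of the mask is the first match.
fn first_zero_le(chunk: [u8; 8]) -> Option<usize> {
    let mask = zero_byte_mask(u64::from_le_bytes(chunk));
    if mask == 0 { None } else { Some(mask.trailing_zeros() as usize / 8) }
}

/// Tempting shortcut: load big-endian and count leading zeros instead of
/// byte-swapping. This trusts the highest set bit, which is not reliable.
fn first_zero_be_buggy(chunk: [u8; 8]) -> Option<usize> {
    let mask = zero_byte_mask(u64::from_be_bytes(chunk));
    if mask == 0 { None } else { Some(mask.leading_zeros() as usize / 8) }
}

/// Byte-swap first, then read the lowest set bit as in the LE case.
fn first_zero_be_fixed(chunk: [u8; 8]) -> Option<usize> {
    let mask = zero_byte_mask(u64::from_be_bytes(chunk).swap_bytes());
    if mask == 0 { None } else { Some(mask.trailing_zeros() as usize / 8) }
}

fn main() {
    // The 0x01 at index 1 sits just before the real zero byte at index 2.
    let chunk = [b'a', 0x01, 0x00, b'b', b'c', b'd', b'e', b'f'];
    println!("le:       {:?}", first_zero_le(chunk));       // Some(2)
    println!("be buggy: {:?}", first_zero_be_buggy(chunk)); // Some(1), wrong
    println!("be fixed: {:?}", first_zero_be_fixed(chunk)); // Some(2)
}
```

In the big-endian load, the 0x01 byte is more significant than the zero byte, so it absorbs the borrow and gets flagged too. The byte-swapped version pays for one extra instruction but only ever reads the lowest flag.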