commented:

Only called in case of an error, so performance is not important.

Ah yeah, don’t love this. An obvious example of where this fails is
when you’re dealing with untrusted or otherwise arbitrary inputs, where
errors might be exactly what we expect. Or when you’re dealing with
corrupted data in a memory dump or something, and trying to parse out
whatever you can find. Saying the error path shouldn’t negatively impact
the success path is fine, but the error path should be fast too.

Lovely post. Great information, super interesting, just A+ love it.

  commented:

Only called in case of an error, so performance is not important.

For context: This is a comment in serde’s source code, not the words
of the author of this blog post.

  commented:
Article author here, glad to answer any questions!

  commented:
Thanks both for your work and for the explanation!

  commented:
Very nice!
wrt:

This, of course, only works on little-endian machines. On big-endian
machines, c has to be bytereversed.

You can skip the byteswap by fiddling with the clz/ctz instead, to
find the most significant non-zero byte instead of the least
significant one. Good luck finding a big-endian machine to test it on,
tho!

  commented:
This is actually really funny because that’s what I first thought of.
In fact, that was the original implementation. And then, while writing
the post, I realized this is actually wrong and dtolnay and I had to
release a hotfix: https://github.com/serde-rs/json/pull/1173
The reason? The subtraction can borrow from one byte into the next
higher one, so you can only trust the lowest set bit.
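
To make the borrow problem concrete, here is a minimal Rust sketch. It is not the serde_json code: it uses the classic "find a zero byte" mask as a stand-in, and the names `zero_byte_mask`, `first_zero_le` and `first_zero_be_clz` are made up for illustration. The point is only that a borrow travels upward, so flags above the lowest one can be false positives, and counting from the most significant end picks those up.

```rust
const ONES: u64 = 0x0101_0101_0101_0101;
const HIGHS: u64 = 0x8080_8080_8080_8080;

/// Sets the top bit of every byte that is zero. The lowest set bit is exact;
/// higher ones may be false positives caused by a borrow out of a zero byte.
fn zero_byte_mask(v: u64) -> u64 {
    v.wrapping_sub(ONES) & !v & HIGHS
}

/// Little-endian load: the first byte in memory is the least significant one,
/// so the lowest set bit (the trustworthy one) is the first match.
fn first_zero_le(chunk: [u8; 8]) -> Option<usize> {
    let mask = zero_byte_mask(u64::from_le_bytes(chunk));
    if mask == 0 {
        None
    } else {
        Some(mask.trailing_zeros() as usize / 8)
    }
}

/// The tempting shortcut: load big-endian and count leading zeros instead of
/// byte-swapping. This is BUGGY, because the highest set bit may be a false
/// positive created by a borrow from the byte below it.
fn first_zero_be_clz(chunk: [u8; 8]) -> Option<usize> {
    let mask = zero_byte_mask(u64::from_be_bytes(chunk));
    if mask == 0 {
        None
    } else {
        Some(mask.leading_zeros() as usize / 8)
    }
}
```

With the input `[1, 0, 2, 3, 4, 5, 6, 7]`, `first_zero_le` returns `Some(1)`, while `first_zero_be_clz` returns `Some(0)`: the zero at index 1 borrows into the `0x01` sitting above it in the big-endian word, which then gets flagged as well.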

  commented:
Oh right, now I see you actually explained that in the sentence
immediately before the one I quoted, d’oh!
The tolower hacks I have worked on had to prevent borrows or carries
from propagating between bytes. It’s cool that search-style SWAR
tricks can save an op or two by letting the borrows fly.
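
For contrast, here is roughly what a carry-free SWAR tolower can look like in Rust. This is a sketch of the general technique, not the commenter's actual code; `to_lower_swar` and the constant names are hypothetical. Because every byte has to come out right, the high bit of each byte is masked off before the additions so that nothing can carry into a neighbour.

```rust
const ONES: u64 = 0x0101_0101_0101_0101;
const HIGHS: u64 = 0x8080_8080_8080_8080;
const LOW7: u64 = 0x7F7F_7F7F_7F7F_7F7F;

// Per-byte addends: adding them to a 7-bit value sets bit 7 exactly when the
// value is > 'Z' (first) or >= 'A' (second), and never carries out of the byte.
const GT_Z: u64 = (0x7F - b'Z') as u64 * ONES;
const GE_A: u64 = (0x80 - b'A') as u64 * ONES;

/// Lowercases the ASCII letters in eight bytes at once; other bytes,
/// including non-ASCII ones, are left unchanged.
fn to_lower_swar(chunk: [u8; 8]) -> [u8; 8] {
    let v = u64::from_le_bytes(chunk);
    // Clear bit 7 of every byte so the additions below stay inside each byte.
    let heptets = v & LOW7;
    let is_gt_z = heptets + GT_Z;
    let is_ge_a = heptets + GE_A;
    // Bit 7 is set only for bytes that were ASCII to begin with.
    let is_ascii = !v & HIGHS;
    // Bit 7 set where 'A' <= byte <= 'Z' and the byte is ASCII.
    let is_upper = is_ascii & (is_ge_a ^ is_gt_z);
    // Move the flag from bit 7 (0x80) to bit 5 (0x20), the ASCII case bit.
    (v | (is_upper >> 2)).to_le_bytes()
}
```

The search case gets away with fewer operations because it only needs the first match, so garbage above the lowest flag does not matter; here the extra masking is the price of needing all eight lanes to be correct.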

  commented:
Fabulous article and great work!