hngopher.com

       [HN Gopher] (Don't) crank up the warnings to 11
       ___________________________________________________________________
        
       (Don't) crank up the warnings to 11
        
       Author : jjgreen
       Score  : 47 points
       Date   : 2023-03-15 21:48 UTC (1 hours ago)
        
 (HTM) web link (lemire.me)
 (TXT) w3m dump (lemire.me)
        
       | ok123456 wrote:
       | Compiler generated warnings are generally pretty high quality.
       | Dynamic analysis tools generate high quality warnings. Static
       | analysis tools that work on a significant amount of the AST seem
       | to generally give good enough warnings.
       | 
       | It's the shallow pattern based, heuristic tools seem to generate
       | a lot of noise.
        
       | teddyh wrote:
       | > _However, I fear that it is part of a larger trend where people
       | come to rely more or more on overbearing static analysis to judge
       | code quality. The more warnings, the better, they think._
       | 
       | Don't get me started on security scanners which basically report
       | all "open" ports as a security issue.
        
       | 2h wrote:
       | Or, maybe stop doing pointer arithmetic? We have had some pretty
       | good new languages appear in the last 7 years. I know people are
       | averse to change, me included. But you may want to ask yourself
       | if continuing to use a 50 year old language is the right answer.
        
         | Snetry wrote:
         | Oh how correct you are.
         | 
         | Let us throw away all the existing code written in languages
         | made in a different age and use languages that are completely
         | binary incompatible just because we can.
        
         | eklitzke wrote:
         | This example isn't doing pointer arithmetic.
        
           | jeffbee wrote:
           | Yes, it is.                 const __m512i *ptr1 = (const
           | __m512i*)container1->words;
           | 
           | Then later:                 __m512i r1 =
           | _mm512_loadu_si512(ptr1+i);
        
         | zaptheimpaler wrote:
         | This is low level code written in a library to make full use of
         | AVX-512.. which is inherently a processor native feature that
         | has to be written in a language close to assembly because the
         | processor only understands machine code.. there is no way to
         | avoid assembly or the pointer arithmetic. Someone has to do the
         | pointer arithmetic so the higher level languages and libraries
         | don't have to.
        
         | loup-vaillant wrote:
         | Well, then the proper response from Github would be something
         | like:                 WARNING: HIGH-TO-CRITICAL SEVERITY:
         | You are using C. C is not memory safe, and likely       to
         | expose your users to critical vulnerabilities.
         | 
         | Not sure it would fly.
        
       | [deleted]
        
       | ChrisMarshallNY wrote:
       | _> Training young programmers to avoid non-issues may make them
       | less productive._
       | 
       | I've found that training _good habits_ , pays off in spades.
       | 
       | Once a technique becomes _habit_ , its overhead basically drops
       | to 0.
       | 
       | May take bit to get there, but once it's ingrained, it is cost-
       | free.
       | 
       | An example is when I trained myself to switch from Whitesmiths
       | indenting, to K&R indenting (when I switched from C++, to Swift).
       | 
       | It took a few weeks of discomfort, but it is totally ingrained,
       | these days.
        
       | adastra22 wrote:
       | I have turned off all GitHub scanner notifications. There is just
       | too much noise and my time is limited.
        
       | bragr wrote:
       | I don't know anything about the code quality of the author, but
       | in my experience, the ones that complain about static checking
       | and other guard rails the most, are the ones that need it the
       | most, because in their hubris of thinking they won't make those
       | kinds of errors, they make the most egregious forms of those
       | errors. e.g. smart enough take pot shots about the possible
       | downsides of the process, but not smart enough (or self aware
       | enough) to see their own fallibility.
        
         | jjgreen wrote:
         | There's a link to Professor Lemire's bio on the linked article:
         | https://lemire.me/en/
        
           | [deleted]
        
           | [deleted]
        
         | Someone wrote:
         | * * *
        
         | stusmall wrote:
         | I remember an argument with an engineer about static code
         | analysis years ago. Without a trace of irony, he honestly said
         | "I know it's high quality code. I wrote it. What could a
         | program tell me about it that I don't already know?" You can
         | guess how much of a nightmare his code was to work with.
         | 
         | This isn't a comment on the original article or the author at
         | all. Just my experience with a certain type of engineer.
        
       | omoikane wrote:
       | > Modifying code to fix a non-issue has a non-zero chance of
       | introducing a real bug
       | 
       | One example of such bug, where fixing a purify warning about
       | uninitialized memory crippled OpenSSL:
       | 
       | https://www.schneier.com/blog/archives/2008/05/random_number...
       | 
       | via: https://xkcd.com/424/
       | 
       | I think it's generally preferred to enable most warnings but have
       | some annotations to silence false positives, preferably with
       | sufficient comments accompanying those annotations to help the
       | code reviewers.
        
       | jeffbee wrote:
       | The PR in question seems to be
       | https://github.com/RoaringBitmap/CRoaring/pull/446
       | 
       | I agree that the GitHub analyses are worse than useless, but I
       | also think it highlights the friction between intrinsics and the
       | program in which they appear. There probably is a better model.
        
       | zX41ZdbW wrote:
       | The order should be as follows:
       | 
       | - automated testing;
       | 
       | - -Wall -Wextra, and -Weverything with some exceptions;
       | 
       | - address, memory, thread, and undefined sanitizers;
       | 
       | - fuzzing (at least some sort of), coverage-guided fuzzing with
       | sanitizers;
       | 
       | - static analysis, starting with clang-tidy;
       | 
       | - other static analyzers, such as Coverity, CodeQL - have the
       | least priority.
       | 
       | An example of how everything of the above, everywhere, all at
       | once is applied in public can be seen in the ClickHouse
       | repository: https://github.com/ClickHouse/ClickHouse in the per-
       | commit checks. But keep in mind that the checks are really heavy
       | - run for multiple hours, and we have to apply kludges to support
       | GitHub Actions with spot instances...
       | 
       | Also the dashboard can be checked here:
       | https://aretestsgreenyet.com/
        
         | wahern wrote:
         | C and C++ developers who have been around long enough
         | understand why clang _specifically_ says _not_ to regularly use
         | -Weverything.
         | 
         | Historically even -Wextra has resulted in spurious warnings for
         | perfectly valid and clean code, as most if not all -Wextra
         | warnings are _heuristics_. -Weverything is also all heuristics,
         | but even more volatile and debatable. Any worthwhile
         | -Weverything heuristic will eventually move to -Wextra.
         | 
         | So, sure, use -Weverything on occasion to see if there's
         | anything interesting to take note of. But if you include it by
         | default you're inviting fatigue and unnecessary rewrites, which
         | may be just as likely to result in the introduction of bugs as
         | to fix any bugs.
        
         | tmtvl wrote:
         | I like adding -pedantic to -Wall and -Wextra, although getting
         | other people's software to compile with that is a nightmare.
        
       | aidenn0 wrote:
       | Right thinking people can also disagree on what is a false-
       | positive. I remember a dialog between two people about one such
       | thing. The summary is:
       | 
       | 1. No improper behavior occurs currently because of the item
       | flagged
       | 
       | 2. However, reasonable changes to other parts of the code _could_
       | cause the problem to occur
       | 
       | 3. There is an obvious way of rewriting that avoids #2.
       | 
       | One person argued it was a false positive because of #1, while
       | the other argued it was a true positive because #2 and #3
       | combined means you definitely want to change the code...
        
       | jgrahamc wrote:
       | I prefer maximum warnings and use of #pragma to disable a warning
       | at a specific point when I know the warning is useless/wrong.
       | That way I can get to a totally clean build with zero warnings
       | output. The last significant piece of C I wrote:
       | # Turn on absolutely all warnings and turn them into errors
       | CFLAGS += -g -Wall -Wextra -Wno-unused-parameter -Werror
       | 
       | https://github.com/cloudflare/keyless/blob/master/Makefile#L...
       | 
       | The reason I do this is if I leave some warnings in I can get
       | complacent "oh, there's always a few warnings" or "that's
       | expected" and then miss something. So, I like all the warnings
       | and warnings treated as errors.
        
         | Snetry wrote:
         | well, clearly not "maximum warnings" since you are missing a
         | lot of them. -pedantic -Wunreachable-code -Wcast-qual -Wswitch-
         | default -Wconversion -Wshadow -Wstrict-aliasing -Winit-self
         | -Wcast-align -Wpointer-arith -Wmissing-declarations -Wmissing-
         | include-dirs -Wno-unused-parameter -Wuninitialized and thats
         | not even all that are useful.
         | 
         | Also, don't use -Werror if you can Warnings will change based
         | on compiler, compiler version and a countless number of other
         | factors that make working with code that turns them into errors
         | a pain.
        
         | loup-vaillant wrote:
         | Man, you should try Clang's `-Weverything` it's... well you
         | know what, I'll try that on my stupidly high-quality
         | cryptographic code:                 $ make "CC=clang -std=c99"
         | "CFLAGS=-O3 -Weverything"       clang -std=c99 -O3 -Weverything
         | -I src -I src/optional -fPIC -c -o lib/monocypher.o
         | src/monocypher.c       src/monocypher.c:262:6: warning: mixing
         | declarations and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]                       u8
         | tmp[64];                          ^
         | src/monocypher.c:231:9: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               u32    pool[16];
         | ^       src/monocypher.c:337:12: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               const u64 s0 =
         | ctx->h[0] + (u64)s[0]; // s0 <= 1_fffffffe
         | ^       In file included from src/monocypher.c:54:
         | src/monocypher.h:289:9: warning: padding size of
         | 'crypto_poly1305_ctx' with 4 bytes to alignment boundary
         | [-Wpadded]       typedef struct {               ^
         | src/monocypher.c:410:9: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               size_t nb_blocks = message_size >> 4;
         | ^       src/monocypher.c:437:6: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               u64 c = 5;
         | ^       src/monocypher.c:498:6: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               u64 v0 =
         | ctx->hash[0];  u64 v8  = iv[0];                   ^
         | src/monocypher.c:621:10: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]                       size_t nb_words = message_size
         | >> 3;                              ^
         | src/monocypher.c:600:9: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               size_t nb_blocks = message_size >> 7;
         | ^       src/monocypher.c:640:9: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               size_t hash_size
         | = MIN(ctx->hash_size, 64);                      ^
         | src/monocypher.c:786:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]                       u8 hash_area[1024];
         | ^       src/monocypher.c:874:10: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]
         | u32 next_slice   = ((slice + 1) % 4) * segment_size;
         | ^       src/monocypher.c:801:6: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               int constant_time
         | = config.algorithm != CRYPTO_ARGON2_D;                   ^
         | src/monocypher.c:1170:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               i32 q = (19 * t[9] + (((i32) 1) <<
         | 24)) >> 25;                   ^       src/monocypher.c:1337:5:
         | warning: mixing declarations and code is incompatible with
         | standards before C99 [-Wdeclaration-after-statement]
         | u8 isodd = s[0] & 1;                  ^
         | src/monocypher.c:1349:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               int isdifferent = crypto_verify32(fs,
         | gs);                   ^       src/monocypher.c:1433:7:
         | warning: mixing declarations and code is incompatible with
         | standards before C99 [-Wdeclaration-after-statement]
         | i32 *quartic = t1;                    ^
         | src/monocypher.c:1500:5: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               fe x2, z2, x3, z3, t0, t1;
         | ^       src/monocypher.c:1650:6: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               u64 carry = 1;
         | ^       src/monocypher.c:1676:6: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               u32 B[8];
         | load32_le_buf(B, b, 8);                   ^
         | src/monocypher.c:1756:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               int is_square = invsqrt(h->X, h->X);
         | ^       src/monocypher.c:1956:8: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]
         | int lsb = v & (~v + 1); // smallest bit of v
         | ^       src/monocypher.c:2015:7: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]                       int
         | h_digit = slide_step(&h_slide, P_W_WIDTH, i, h);
         | ^       src/monocypher.c:1994:12: warning: mixing declarations
         | and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]               ge_cached
         | lutA[P_W_SIZE];                         ^
         | src/monocypher.c:2185:5: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               fe tmp_a, tmp_b;  // temporaries for
         | addition                  ^       src/monocypher.c:2540:5:
         | warning: mixing declarations and code is incompatible with
         | standards before C99 [-Wdeclaration-after-statement]
         | fe t1, t2;                  ^       src/monocypher.c:2641:6:
         | warning: mixing declarations and code is incompatible with
         | standards before C99 [-Wdeclaration-after-statement]
         | int is_square = invsqrt(t1, t1);                   ^
         | src/monocypher.c:2696:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               int is_square = invsqrt(t3, t3); // t3
         | = sqrt(-1 / non_square * u * (u+A))                   ^
         | src/monocypher.c:2775:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               u32 t[16] = {0};                   ^
         | src/monocypher.c:2815:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               u32 m_scl[8];                   ^
         | src/monocypher.c:2870:22: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               crypto_poly1305_ctx poly_ctx;
         | // auto wiped...                                   ^
         | src/monocypher.c:2925:6: warning: mixing declarations and code
         | is incompatible with standards before C99 [-Wdeclaration-after-
         | statement]               int mismatch = crypto_verify16(mac,
         | real_mac);                   ^       src/monocypher.c:2953:6:
         | warning: mixing declarations and code is incompatible with
         | standards before C99 [-Wdeclaration-after-statement]
         | int mismatch = crypto_aead_read(&ctx, plain_text, mac, ad,
         | ad_size,                   ^       33 warnings generated.
         | clang -std=c99 -O3 -Weverything -I src -I src/optional -fPIC -c
         | -o lib/monocypher-ed25519.o src/optional/monocypher-ed25519.c
         | src/optional/monocypher-ed25519.c:165:8: warning: mixing
         | declarations and code is incompatible with standards before C99
         | [-Wdeclaration-after-statement]
         | u64 in = K[i16 + j] + ctx->input[j];
         | ^       src/optional/monocypher-ed25519.c:158:9: warning:
         | mixing declarations and code is incompatible with standards
         | before C99 [-Wdeclaration-after-statement]               size_t
         | i16 = 0;                      ^       2 warnings generated.
         | ar cr lib/libmonocypher.a lib/monocypher.o lib/monocypher-
         | ed25519.o       clang -std=c99 -O3 -Weverything  -shared
         | -Wl,-soname,libmonocypher.so.4 -o lib/libmonocypher.so.4
         | lib/monocypher.o lib/monocypher-ed25519.o       ln -sf
         | `basename lib/libmonocypher.so.4` lib/libmonocypher.so
         | doc/doc_gen.sh
         | 
         | ---
         | 
         | Well...
         | 
         | Yeah.
         | 
         |  _That_ is cranking up warnings up to 11. Now this is C99 we
         | 're talking about, how about removing that obviously useless
         | `-Wdeclaration-after-statement`?                 $ make
         | "CC=clang -std=c99" "CFLAGS=-O3 -Weverything -Wno-declaration-
         | after-statement"       clang -std=c99 -O3 -Weverything -Wno-
         | declaration-after-statement -I src -I src/optional -fPIC -c -o
         | lib/monocypher.o src/monocypher.c       In file included from
         | src/monocypher.c:54:       src/monocypher.h:289:9: warning:
         | padding size of 'crypto_poly1305_ctx' with 4 bytes to alignment
         | boundary [-Wpadded]       typedef struct {             ^
         | 1 warning generated.
         | 
         | Ah, this one's may be real. Let's see this struct:
         | typedef struct {        // Do not rely on the size or contents
         | of this type,        // for they may change without notice.
         | uint8_t  c[16];  // chunk of the message        size_t   c_idx;
         | // How many bytes are there in the chunk.        uint32_t r
         | [4]; // constant multiplier (from the secret key)
         | uint32_t pad[4]; // random number added at the end (from the
         | secret key)        uint32_t h  [5]; // accumulated hash       }
         | crypto_poly1305_ctx;
         | 
         | Ah, I see: we end by an odd number of 32-bit words, and in the
         | 64-bit architecture I'm compiling this on, there will be 4
         | bytes of padding. But then I have two problems: first, who
         | cares? Second, I also target 32-bit architectures on which
         | there will _not_ be any padding, and adding it manually would
         | be wasteful.
         | 
         | There's the `#pragma` to silence the warning of course, but
         | it's ugly, and I'm not sure it is strictly portable (unless the
         | standard says compilers should ignore the `#pragma` if they
         | don't understand it, but I confess haven't checked).
        
       | remexre wrote:
       | Iunno, for this example, isn't this actually UB according to
       | standard C? I don't think it'd be that objectionable to put e.g.
       | a semantic comment saying "yeah no this is actually not UB
       | because we check in ./configure that your compiler is one known
       | to not do bad things," * _if*_ such a thing were standardized
       | across linting tools.
       | 
       | Of course, that's a heckuva if.
        
         | david2ndaccount wrote:
         | This is not standard C, this is already in the land of compiler
         | extensions and platform specific code. The behavior is defined
         | by the compiler and no compiler should treat this as UB.
        
           | remexre wrote:
           | Is the code actually compiled with -fno-strict-aliasing
           | (which IME is rare for userspace code, but I don't work with
           | numerical code very much), or are you saying this is because
           | of the types involved in the cast? I didn't see anything in
           | [0] about a change in pointer semantics from using the vector
           | extensions in the compilation unit?
           | 
           | [0]: https://gcc.gnu.org/onlinedocs/gcc/Vector-
           | Extensions.html#Ve...
        
           | deathanatos wrote:
           | It is using compiler intrinsics, I grant, and taking
           | advantage of IB is yes, well-defined given an implementation
           | (as we have here) ... but it seems like it's still C?
           | 
           | Here, I'm not 100% certain this code isn't UB: we type-pun a
           | __m512i and a uint64_t ... I'm not certain that's not UB. If
           | I'm forced into writing C, I try to avoid doing that at all
           | costs ... I'd rather not encounter nasal demons.
           | 
           | The warning also seems a bit merited, given that each
           | "increment" is over more than a single object on the
           | underlying array. (But, of course, that is the intent. But
           | I'm just not sure the static analyzer can see it.)
        
             | cpeterso wrote:
             | Plus, a pointer to __m512i might require different
             | alignment than a pointer to unsigned long.
             | 
             | This code received a pointer to unsigned long from
             | somewhere that might have cared about the data it pointed
             | to, so his code should acknowledge that (with type casts or
             | assertions) or make a __m512i or void* pointer part of this
             | function's signature, making the type change part of the
             | contract with the caller.
        
         | eklitzke wrote:
         | This is not UB.
         | 
         | In this case where you're using the wrong pointer type the
         | worst thing that can happen is you can have aliasing issues.
         | For example, if I have an int* and a double*, the C compiler is
         | free to assume that the pointers can never alias each other so
         | it can optimize loads and stores assuming that an update to the
         | int pointer will never affect the double pointer. In general
         | this can cause problems if the pointers actually do alias each
         | other, but if there's no aliasing occurring then nothing bad
         | will happen.
        
           | remexre wrote:
           | My understanding was that the UB was defined such that:
           | int i = 0;         float f = 1.0f;         void* p;
           | if(rand() & 1)           p = &i;         else           p =
           | &f;                  int j = *(int*)p;         assert(i ==
           | j);
           | 
           | could legally be compiled to a no-op (or a single call to
           | rand() with an ignored result, I suppose), despite there
           | being no "weird" pointer stores, since the ill-typed-at-
           | runtime load "can't happen," so the compiler can reason that
           | (int*)p must equal &i, and hence j = i.
           | 
           | EDIT: And, in fact, gcc does this:
           | https://godbolt.org/z/WM33vzzdo
           | 
           | Not sure if that applies to vector loads, and the gcc docs
           | don't specify, though.
        
       ___________________________________________________________________
       (page generated 2023-03-15 23:00 UTC)