[HN Gopher] (Don't) crank up the warnings to 11 ___________________________________________________________________ (Don't) crank up the warnings to 11 Author : jjgreen Score : 47 points Date : 2023-03-15 21:48 UTC (1 hours ago) (HTM) web link (lemire.me) (TXT) w3m dump (lemire.me) | ok123456 wrote: | Compiler generated warnings are generally pretty high quality. | Dynamic analysis tools generate high quality warnings. Static | analysis tools that work on a significant amount of the AST seem | to generally give good enough warnings. | | It's the shallow pattern based, heuristic tools seem to generate | a lot of noise. | teddyh wrote: | > _However, I fear that it is part of a larger trend where people | come to rely more or more on overbearing static analysis to judge | code quality. The more warnings, the better, they think._ | | Don't get me started on security scanners which basically report | all "open" ports as a security issue. | 2h wrote: | Or, maybe stop doing pointer arithmetic? We have had some pretty | good new languages appear in the last 7 years. I know people are | averse to change, me included. But you may want to ask yourself | if continuing to use a 50 year old language is the right answer. | Snetry wrote: | Oh how correct you are. | | Let us throw away all the existing code written in languages | made in a different age and use languages that are completely | binary incompatible just because we can. | eklitzke wrote: | This example isn't doing pointer arithmetic. | jeffbee wrote: | Yes, it is. const __m512i *ptr1 = (const | __m512i*)container1->words; | | Then later: __m512i r1 = | _mm512_loadu_si512(ptr1+i); | zaptheimpaler wrote: | This is low level code written in a library to make full use of | AVX-512.. which is inherently a processor native feature that | has to be written in a language close to assembly because the | processor only understands machine code.. there is no way to | avoid assembly or the pointer arithmetic. Someone has to do the | pointer arithmetic so the higher level languages and libraries | don't have to. | loup-vaillant wrote: | Well, then the proper response from Github would be something | like: WARNING: HIGH-TO-CRITICAL SEVERITY: | You are using C. C is not memory safe, and likely to | expose your users to critical vulnerabilities. | | Not sure it would fly. | [deleted] | ChrisMarshallNY wrote: | _> Training young programmers to avoid non-issues may make them | less productive._ | | I've found that training _good habits_ , pays off in spades. | | Once a technique becomes _habit_ , its overhead basically drops | to 0. | | May take bit to get there, but once it's ingrained, it is cost- | free. | | An example is when I trained myself to switch from Whitesmiths | indenting, to K&R indenting (when I switched from C++, to Swift). | | It took a few weeks of discomfort, but it is totally ingrained, | these days. | adastra22 wrote: | I have turned off all GitHub scanner notifications. There is just | too much noise and my time is limited. | bragr wrote: | I don't know anything about the code quality of the author, but | in my experience, the ones that complain about static checking | and other guard rails the most, are the ones that need it the | most, because in their hubris of thinking they won't make those | kinds of errors, they make the most egregious forms of those | errors. e.g. smart enough take pot shots about the possible | downsides of the process, but not smart enough (or self aware | enough) to see their own fallibility. | jjgreen wrote: | There's a link to Professor Lemire's bio on the linked article: | https://lemire.me/en/ | [deleted] | [deleted] | Someone wrote: | * * * | stusmall wrote: | I remember an argument with an engineer about static code | analysis years ago. Without a trace of irony, he honestly said | "I know it's high quality code. I wrote it. What could a | program tell me about it that I don't already know?" You can | guess how much of a nightmare his code was to work with. | | This isn't a comment on the original article or the author at | all. Just my experience with a certain type of engineer. | omoikane wrote: | > Modifying code to fix a non-issue has a non-zero chance of | introducing a real bug | | One example of such bug, where fixing a purify warning about | uninitialized memory crippled OpenSSL: | | https://www.schneier.com/blog/archives/2008/05/random_number... | | via: https://xkcd.com/424/ | | I think it's generally preferred to enable most warnings but have | some annotations to silence false positives, preferably with | sufficient comments accompanying those annotations to help the | code reviewers. | jeffbee wrote: | The PR in question seems to be | https://github.com/RoaringBitmap/CRoaring/pull/446 | | I agree that the GitHub analyses are worse than useless, but I | also think it highlights the friction between intrinsics and the | program in which they appear. There probably is a better model. | zX41ZdbW wrote: | The order should be as follows: | | - automated testing; | | - -Wall -Wextra, and -Weverything with some exceptions; | | - address, memory, thread, and undefined sanitizers; | | - fuzzing (at least some sort of), coverage-guided fuzzing with | sanitizers; | | - static analysis, starting with clang-tidy; | | - other static analyzers, such as Coverity, CodeQL - have the | least priority. | | An example of how everything of the above, everywhere, all at | once is applied in public can be seen in the ClickHouse | repository: https://github.com/ClickHouse/ClickHouse in the per- | commit checks. But keep in mind that the checks are really heavy | - run for multiple hours, and we have to apply kludges to support | GitHub Actions with spot instances... | | Also the dashboard can be checked here: | https://aretestsgreenyet.com/ | wahern wrote: | C and C++ developers who have been around long enough | understand why clang _specifically_ says _not_ to regularly use | -Weverything. | | Historically even -Wextra has resulted in spurious warnings for | perfectly valid and clean code, as most if not all -Wextra | warnings are _heuristics_. -Weverything is also all heuristics, | but even more volatile and debatable. Any worthwhile | -Weverything heuristic will eventually move to -Wextra. | | So, sure, use -Weverything on occasion to see if there's | anything interesting to take note of. But if you include it by | default you're inviting fatigue and unnecessary rewrites, which | may be just as likely to result in the introduction of bugs as | to fix any bugs. | tmtvl wrote: | I like adding -pedantic to -Wall and -Wextra, although getting | other people's software to compile with that is a nightmare. | aidenn0 wrote: | Right thinking people can also disagree on what is a false- | positive. I remember a dialog between two people about one such | thing. The summary is: | | 1. No improper behavior occurs currently because of the item | flagged | | 2. However, reasonable changes to other parts of the code _could_ | cause the problem to occur | | 3. There is an obvious way of rewriting that avoids #2. | | One person argued it was a false positive because of #1, while | the other argued it was a true positive because #2 and #3 | combined means you definitely want to change the code... | jgrahamc wrote: | I prefer maximum warnings and use of #pragma to disable a warning | at a specific point when I know the warning is useless/wrong. | That way I can get to a totally clean build with zero warnings | output. The last significant piece of C I wrote: | # Turn on absolutely all warnings and turn them into errors | CFLAGS += -g -Wall -Wextra -Wno-unused-parameter -Werror | | https://github.com/cloudflare/keyless/blob/master/Makefile#L... | | The reason I do this is if I leave some warnings in I can get | complacent "oh, there's always a few warnings" or "that's | expected" and then miss something. So, I like all the warnings | and warnings treated as errors. | Snetry wrote: | well, clearly not "maximum warnings" since you are missing a | lot of them. -pedantic -Wunreachable-code -Wcast-qual -Wswitch- | default -Wconversion -Wshadow -Wstrict-aliasing -Winit-self | -Wcast-align -Wpointer-arith -Wmissing-declarations -Wmissing- | include-dirs -Wno-unused-parameter -Wuninitialized and thats | not even all that are useful. | | Also, don't use -Werror if you can Warnings will change based | on compiler, compiler version and a countless number of other | factors that make working with code that turns them into errors | a pain. | loup-vaillant wrote: | Man, you should try Clang's `-Weverything` it's... well you | know what, I'll try that on my stupidly high-quality | cryptographic code: $ make "CC=clang -std=c99" | "CFLAGS=-O3 -Weverything" clang -std=c99 -O3 -Weverything | -I src -I src/optional -fPIC -c -o lib/monocypher.o | src/monocypher.c src/monocypher.c:262:6: warning: mixing | declarations and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] u8 | tmp[64]; ^ | src/monocypher.c:231:9: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] u32 pool[16]; | ^ src/monocypher.c:337:12: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] const u64 s0 = | ctx->h[0] + (u64)s[0]; // s0 <= 1_fffffffe | ^ In file included from src/monocypher.c:54: | src/monocypher.h:289:9: warning: padding size of | 'crypto_poly1305_ctx' with 4 bytes to alignment boundary | [-Wpadded] typedef struct { ^ | src/monocypher.c:410:9: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] size_t nb_blocks = message_size >> 4; | ^ src/monocypher.c:437:6: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] u64 c = 5; | ^ src/monocypher.c:498:6: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] u64 v0 = | ctx->hash[0]; u64 v8 = iv[0]; ^ | src/monocypher.c:621:10: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] size_t nb_words = message_size | >> 3; ^ | src/monocypher.c:600:9: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] size_t nb_blocks = message_size >> 7; | ^ src/monocypher.c:640:9: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] size_t hash_size | = MIN(ctx->hash_size, 64); ^ | src/monocypher.c:786:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] u8 hash_area[1024]; | ^ src/monocypher.c:874:10: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] | u32 next_slice = ((slice + 1) % 4) * segment_size; | ^ src/monocypher.c:801:6: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] int constant_time | = config.algorithm != CRYPTO_ARGON2_D; ^ | src/monocypher.c:1170:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] i32 q = (19 * t[9] + (((i32) 1) << | 24)) >> 25; ^ src/monocypher.c:1337:5: | warning: mixing declarations and code is incompatible with | standards before C99 [-Wdeclaration-after-statement] | u8 isodd = s[0] & 1; ^ | src/monocypher.c:1349:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] int isdifferent = crypto_verify32(fs, | gs); ^ src/monocypher.c:1433:7: | warning: mixing declarations and code is incompatible with | standards before C99 [-Wdeclaration-after-statement] | i32 *quartic = t1; ^ | src/monocypher.c:1500:5: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] fe x2, z2, x3, z3, t0, t1; | ^ src/monocypher.c:1650:6: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] u64 carry = 1; | ^ src/monocypher.c:1676:6: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] u32 B[8]; | load32_le_buf(B, b, 8); ^ | src/monocypher.c:1756:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] int is_square = invsqrt(h->X, h->X); | ^ src/monocypher.c:1956:8: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] | int lsb = v & (~v + 1); // smallest bit of v | ^ src/monocypher.c:2015:7: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] int | h_digit = slide_step(&h_slide, P_W_WIDTH, i, h); | ^ src/monocypher.c:1994:12: warning: mixing declarations | and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] ge_cached | lutA[P_W_SIZE]; ^ | src/monocypher.c:2185:5: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] fe tmp_a, tmp_b; // temporaries for | addition ^ src/monocypher.c:2540:5: | warning: mixing declarations and code is incompatible with | standards before C99 [-Wdeclaration-after-statement] | fe t1, t2; ^ src/monocypher.c:2641:6: | warning: mixing declarations and code is incompatible with | standards before C99 [-Wdeclaration-after-statement] | int is_square = invsqrt(t1, t1); ^ | src/monocypher.c:2696:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] int is_square = invsqrt(t3, t3); // t3 | = sqrt(-1 / non_square * u * (u+A)) ^ | src/monocypher.c:2775:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] u32 t[16] = {0}; ^ | src/monocypher.c:2815:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] u32 m_scl[8]; ^ | src/monocypher.c:2870:22: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] crypto_poly1305_ctx poly_ctx; | // auto wiped... ^ | src/monocypher.c:2925:6: warning: mixing declarations and code | is incompatible with standards before C99 [-Wdeclaration-after- | statement] int mismatch = crypto_verify16(mac, | real_mac); ^ src/monocypher.c:2953:6: | warning: mixing declarations and code is incompatible with | standards before C99 [-Wdeclaration-after-statement] | int mismatch = crypto_aead_read(&ctx, plain_text, mac, ad, | ad_size, ^ 33 warnings generated. | clang -std=c99 -O3 -Weverything -I src -I src/optional -fPIC -c | -o lib/monocypher-ed25519.o src/optional/monocypher-ed25519.c | src/optional/monocypher-ed25519.c:165:8: warning: mixing | declarations and code is incompatible with standards before C99 | [-Wdeclaration-after-statement] | u64 in = K[i16 + j] + ctx->input[j]; | ^ src/optional/monocypher-ed25519.c:158:9: warning: | mixing declarations and code is incompatible with standards | before C99 [-Wdeclaration-after-statement] size_t | i16 = 0; ^ 2 warnings generated. | ar cr lib/libmonocypher.a lib/monocypher.o lib/monocypher- | ed25519.o clang -std=c99 -O3 -Weverything -shared | -Wl,-soname,libmonocypher.so.4 -o lib/libmonocypher.so.4 | lib/monocypher.o lib/monocypher-ed25519.o ln -sf | `basename lib/libmonocypher.so.4` lib/libmonocypher.so | doc/doc_gen.sh | | --- | | Well... | | Yeah. | | _That_ is cranking up warnings up to 11. Now this is C99 we | 're talking about, how about removing that obviously useless | `-Wdeclaration-after-statement`? $ make | "CC=clang -std=c99" "CFLAGS=-O3 -Weverything -Wno-declaration- | after-statement" clang -std=c99 -O3 -Weverything -Wno- | declaration-after-statement -I src -I src/optional -fPIC -c -o | lib/monocypher.o src/monocypher.c In file included from | src/monocypher.c:54: src/monocypher.h:289:9: warning: | padding size of 'crypto_poly1305_ctx' with 4 bytes to alignment | boundary [-Wpadded] typedef struct { ^ | 1 warning generated. | | Ah, this one's may be real. Let's see this struct: | typedef struct { // Do not rely on the size or contents | of this type, // for they may change without notice. | uint8_t c[16]; // chunk of the message size_t c_idx; | // How many bytes are there in the chunk. uint32_t r | [4]; // constant multiplier (from the secret key) | uint32_t pad[4]; // random number added at the end (from the | secret key) uint32_t h [5]; // accumulated hash } | crypto_poly1305_ctx; | | Ah, I see: we end by an odd number of 32-bit words, and in the | 64-bit architecture I'm compiling this on, there will be 4 | bytes of padding. But then I have two problems: first, who | cares? Second, I also target 32-bit architectures on which | there will _not_ be any padding, and adding it manually would | be wasteful. | | There's the `#pragma` to silence the warning of course, but | it's ugly, and I'm not sure it is strictly portable (unless the | standard says compilers should ignore the `#pragma` if they | don't understand it, but I confess haven't checked). | remexre wrote: | Iunno, for this example, isn't this actually UB according to | standard C? I don't think it'd be that objectionable to put e.g. | a semantic comment saying "yeah no this is actually not UB | because we check in ./configure that your compiler is one known | to not do bad things," * _if*_ such a thing were standardized | across linting tools. | | Of course, that's a heckuva if. | david2ndaccount wrote: | This is not standard C, this is already in the land of compiler | extensions and platform specific code. The behavior is defined | by the compiler and no compiler should treat this as UB. | remexre wrote: | Is the code actually compiled with -fno-strict-aliasing | (which IME is rare for userspace code, but I don't work with | numerical code very much), or are you saying this is because | of the types involved in the cast? I didn't see anything in | [0] about a change in pointer semantics from using the vector | extensions in the compilation unit? | | [0]: https://gcc.gnu.org/onlinedocs/gcc/Vector- | Extensions.html#Ve... | deathanatos wrote: | It is using compiler intrinsics, I grant, and taking | advantage of IB is yes, well-defined given an implementation | (as we have here) ... but it seems like it's still C? | | Here, I'm not 100% certain this code isn't UB: we type-pun a | __m512i and a uint64_t ... I'm not certain that's not UB. If | I'm forced into writing C, I try to avoid doing that at all | costs ... I'd rather not encounter nasal demons. | | The warning also seems a bit merited, given that each | "increment" is over more than a single object on the | underlying array. (But, of course, that is the intent. But | I'm just not sure the static analyzer can see it.) | cpeterso wrote: | Plus, a pointer to __m512i might require different | alignment than a pointer to unsigned long. | | This code received a pointer to unsigned long from | somewhere that might have cared about the data it pointed | to, so his code should acknowledge that (with type casts or | assertions) or make a __m512i or void* pointer part of this | function's signature, making the type change part of the | contract with the caller. | eklitzke wrote: | This is not UB. | | In this case where you're using the wrong pointer type the | worst thing that can happen is you can have aliasing issues. | For example, if I have an int* and a double*, the C compiler is | free to assume that the pointers can never alias each other so | it can optimize loads and stores assuming that an update to the | int pointer will never affect the double pointer. In general | this can cause problems if the pointers actually do alias each | other, but if there's no aliasing occurring then nothing bad | will happen. | remexre wrote: | My understanding was that the UB was defined such that: | int i = 0; float f = 1.0f; void* p; | if(rand() & 1) p = &i; else p = | &f; int j = *(int*)p; assert(i == | j); | | could legally be compiled to a no-op (or a single call to | rand() with an ignored result, I suppose), despite there | being no "weird" pointer stores, since the ill-typed-at- | runtime load "can't happen," so the compiler can reason that | (int*)p must equal &i, and hence j = i. | | EDIT: And, in fact, gcc does this: | https://godbolt.org/z/WM33vzzdo | | Not sure if that applies to vector loads, and the gcc docs | don't specify, though. ___________________________________________________________________ (page generated 2023-03-15 23:00 UTC)