[HN Gopher] Static Analysis in GCC 10 ___________________________________________________________________ Static Analysis in GCC 10 Author : fanf2 Score : 279 points Date : 2020-03-28 14:42 UTC (8 hours ago) (HTM) web link (developers.redhat.com) (TXT) w3m dump (developers.redhat.com) | thekhatribharat wrote: | I believe _type sytems /type theory_ is likely going to be the | most popular method for _formal verification of programs_ (aka | _static analysis_ ). And of course, there's a limit to what you | can _prove_ about programs (ref: Rice 's theorem). | jfkebwjsbx wrote: | Static analysis is not formal verification. | thekhatribharat wrote: | Sure, things like enforcing style guides, etc. can be seen as | _lightweight_ formal verification. | irundebian wrote: | Yes, but you wrote "formal verification of programs (aka | static analysis)". Formal verification is not also known as | static analysis. | exdsq wrote: | It's part of the field of formal methods. | brundolf wrote: | I have to wonder if Rust is putting pressure on C/++ to have more | static analysis (while at the same time blazing trails in what's | possible in that space, and what's possible in terms of error | message helpfulness). I think it's a great idea to start baking | these things into the compiler, even if it will never be 100% | free of false-negatives because of the limitations of what the | language can express and guarantee. Still seems like a great way | to eliminate a lot of common problems, as a default across the | ecosystem instead of as an extra step. | rurban wrote: | It's clang, not rust. | | And clang's analyzer has different UI concept via web, which is | far superior. And for the screen valgrind has a far superior | solution. I don't see the advantage of gcc's analyzer yet. Far | too verbose. and the most important errors, like wrong | optimizer decisions based on their interpretation of UB code | are still silenced. | TwoBit wrote: | I'd love to see a compiler warn me that it's doing something | potentially unexpected due to UB considerations. | masklinn wrote: | That is intrinsically limited because UB are runtime issues | (otherwise they'd be compiler errors) and the compiler | simply assumes they don't happen, plus many of the "UB- | related" optimisations (in the sense that these | optimisations change the behaviour of non-well-formed | programs) are surfaced by other optimisations (largely | inlining). | saagarjha wrote: | Luckily, compiler authors have gotten the message and are | starting to improve on this situation slowly. | ali_m wrote: | clang has UBSan, which adds runtime checks for detecting | various kinds of undefined behaviour: | https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html | codys wrote: | gcc also has ubsan since gcc version 4.9. Many of the | sanitizers are shared by the clang & gcc projects. Some | functionality, like emitting errors on unsigned overflow, | is confined to clang. | Tempest1981 wrote: | Yes, I tried the ReSharper Inspector tool, and it uses clang | to find all sorts of code issues, with a friendly UI. | thekhatribharat wrote: | I think you mean "false-positives". Most static analysis tools | are "conservative": will throw more warnings than they can | prove. But assuming they're "sound" (and don't "give up" | because of time/space constraints), there'd be no "false- | negatives". | UncleMeat wrote: | Vanishingly few (maybe even zero) industrial static analysis | tools are sound. FPs are too common, especially for something | like C++ where common code patterns can only be soundly | handled by just forcing everything to Top. Good luck with | your sound interprocedural pointer analysis as soon as | somebody makes an indirect call. | thekhatribharat wrote: | I think you mean industrial static analysis tools aren't | _complete_. _Sound_ means no FNs, _Complete_ means no FPs. | UncleMeat wrote: | No I meant what I said. Virtually no tool is sound (zero | false negatives, lots of false positives) because in | practice false positives just make people stop using | tools when they are too common. | mgerdts wrote: | At my previous job, we used an internally developed | static analysis tool. While false positives were a | problem, the team responsible for the tool took bug | reports and fixed these issues. There was a way to | silence false positives as well as a mode to only see new | (potential) problems. Once code was clean, the false | positive rate was low enough to not really be a problem. | | The bar of zero false negatives is not useful for me when | the alternative is to rely solely human code analysis and | tests. | | One thing I learned from using this tool is that static | analyzers need to be able to configured to know about | side effects of functions that are in libraries. For | example, if a library function will free memory, it is | important that there's a way to tell the analyzer about | this behavior to avoid false positives about leaks. | | Proper design, careful coding, static analysis, code | review, and various forms of testing all have roles to | play in delivering reliable code. | UncleMeat wrote: | But you can't fix all false positives with bug reports. | Its _fundamental_ if you want soundness. And for lots of | properties (like I said, pointer analysis in C++) the | only scalable and sound techniques basically havoc as | soon as interesting things start to happen. The community | even has a term (soundiness) to describe how everybody | throws out certain kinds of soundness (indirect calls | through function pointers, reflection in java, tons of | features in javascript, etc). | | Requiring the user to add piles of annotations still | sucks, even if you are smart and do something like | abductive inference to identify minimally required | annotations. Even something as simple as the checker | framework in Java has trouble getting people to write | correct annotations and has a few corner cases (what is | the nullness property of a field in code reachable from a | constructor, for example). Nobody does this stuff | correctly for something like dataflow annotations on | methods. | | I've got a PhD in this and develop a static analysis tool | for my job. I've _never_ seen a productionized, fully | sound tool for general purpose static analysis of general | programs that solves problems more difficult that | slightly augmented type checking. | thekhatribharat wrote: | So if industrial static analysis tools aren't _sound_ , | which means they produce _false negatives_ (ie. they can | pass programs with logical errors they 're meant to | detect), how do you use them to assure code quality? | pjmlp wrote: | Internet and ramping project costs due to CVE fixes are what is | putting pressure into adopting static analysis and hardware | memory tagging, not Rust. | | https://msrc-blog.microsoft.com/2019/07/16/a-proactive-appro... | | https://security.googleblog.com/2019/08/adopting-arm-memory-... | | Lint exists since 1979. | saagarjha wrote: | Hardware memory tagging is not a panacea, it just catches | egregious memory errors. | pjmlp wrote: | Indeed, yet it does a better job than the mythical | developer that never gets pointer arithmetic wrong. | [deleted] | ape4 wrote: | You don't want that option every time since its slower. But I | wonder if there would be a smart way to run it occasionally, like | an option to -fanalyzer every 10th time or when the size of a | source file changes a lot, etc. | brundolf wrote: | That seems out of scope for the compiler itself. Dev | tools/workflows could easily set up something like this as | needed. | saagarjha wrote: | Perhaps as part of your CI? | [deleted] | archi42 wrote: | Oh, that's really nice. Though, as a user one should remember: | The approach described here gives up at some point. So it doesn't | prove the absence of a bug class (e.g. double free), but it finds | some instances. Which is already a very good thing, and hugely | non-trivial. | | The problem with "not giving up at some point" is the | computational complexity: Analyzing big code bases takes half an | eternity (days), while using huge amounts of memory (>128GB). And | once you enter the "least-defined state", you either throw lots | of false positives (which gives the users a hard time) or you | need to "give up" (and hence potentially miss bugs). | | Disclaimer: I work for a company that builds static analysis | tools. I don't see this as competition, though. Our tools are | used in industries where "safety-critical" is _really_ important | - so the "giving up"-part of the analysis is no option for us, | and solely relying only on GCC isn't an option for our customers | either ;-) | s3cur3 wrote: | I'd be interested to know which company (if you're free to | say). IIUC, this is very different from the way Coverity (the | biggest commercial player I'm aware of in this space) operates. | yitchelle wrote: | Interestingly, different industry seems to have its own | leading tool even though they all check for the same type of | risks or errors. Q-AC is extremely prevalent in the | Automotive Industry, Polyspace's Code Prover is also quite | popular as well. | a1369209993 wrote: | Per Rice's theorem, it's _not_ giving up that 's not a option; | it's just a question of whether you have false positives or | false negatives. (To be fair, for safety-critical code, | insisting on only false positives (ie treating anything you | give up on as a positive) is a pretty resonable choice.) | nullc wrote: | A tool like this could be sound but incomplete. E.g. return | true, false, or idunno. | progval wrote: | I'm guessing that by "giving up", GP meants having a false | negative | saagarjha wrote: | Nice, this looks pretty cool! It seems a bit like Clang's static | analyzer: https://clang-analyzer.llvm.org/ | mshockwave wrote: | Just a side note that the equivalent solution in LLVM/Clang is | Clang Static Analyzer: https://clang-analyzer.llvm.org . But it's | an external tool rather than integrating into clang | marta_morena_23 wrote: | "My thinking here is that it's best to catch problems as early as | possible as the code is written, using the compiler the code is | written in as part of the compile-edit-debug cycle, rather than | having static analysis as an extra tool "on the side" (perhaps | proprietary)." | | Really? Your thinking? I mean seriously, why do you have to start | off a good article and feature with a sentence like this? This is | not your thinking, and if it was, you have been living under a | rock for the past decades... Why the need to make it sound as if | this decade long idea was yours? | raegis wrote: | Let's all try to be nice to each other. Everybody's stuck at | home and frustrated right now. | glouwbug wrote: | New users typically think this is Reddit. Best we can do is | encourage civil discourse | glouwbug wrote: | It probably was their idea, but you are right, it was the idea | others before their idea. | | Difference is, their idea came to fruition, and their idea has | contributed to what is arguably the world's most important | piece of software | yjftsjthsd-h wrote: | Are they only allowed to think thoughts that nobody else has | had? | dang wrote: | Could you please not post in the flamewar style to HN? It's not | what this site is for, and if you'll read | https://news.ycombinator.com/newsguidelines.html and | https://news.ycombinator.com/newswelcome.html, you'll see that | we're actively trying for a different sort of discussion. | google234123 wrote: | How far behind LLVM/Clang is this? | jchw wrote: | > As of GCC 10, that option text is now a clickable hyperlink | (again, assuming a sufficiently capable terminal) | | Seems like mostly only GNOME Terminal and iTerm2. Here's some | that apparently won't work: | | - Konsole | | - Kitty | | - LXDE Terminal | | - MATE Terminal | | - hyper | | - Windows Terminal | | - ConEmu | | - PuTTY | | ... so it's kind of weird to suggest this is an accepted | standard. Especially since some of the discussions in feature | requests suggest they will likely not implement it due to | security concerns or otherwise. | guerrilla wrote: | how does on produce such a link in output? (not that i really | want this bloat and increased attack surface) | jchw wrote: | There's a fair bit of documentation here. I'd try it but I | don't have any supported terminals on any of my machines... | (usually I use kitty, but I have a couple machines with KDE | and Konsole.) | | https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3c. | .. | bonzini wrote: | Escape sequences. It's similar to the "tell the emulator | about the current directory" feature that is used to open new | windows on the current directory. | labawi wrote: | Does the CWD feature actually use terminal escape | sequences? | | I always assumed the emulator accessed the working | directory of the child process (as in /proc/$PID/cwd). On | my terminal the CWD feature only seems to work for the | topmost shell, and symlinks get resolved. | | EDIT: Linked bug report mentions OCS7 (presumably an escape | sequence), as one of the ways to track CWD. | jbk wrote: | Konsole had security concerns about this feature: | https://bugs.kde.org/show_bug.cgi?id=379294 | zzo38computer wrote: | Another that probably won't work is xterm (which I currently | use). (And if I implement my own terminal emulator, I probably | won't implement it either, as I would probably mainly | implementing the features of xterm and of DEC terminals.) | cowsandmilk wrote: | I don't see anything suggesting it is a standard. It seems like | it is an enhancement that some terminals support. | | It isn't like this feature degrades support for any other | terminal, just improves life for some users. Why complain? | jchw wrote: | Because it can _break_ terminals that don't. Nothing forces | you to put enough context in your link to be able to recover | the URL. Unsuspecting developers who only target iTerm2 will | put a link in for "Click here to complete OAuth2 flow" or | something and it will be broken... | | In this case it isn't the worst problem ever. but i think | support for this should probably be removed from terminal | emulators unless a quorum of popular terminal emulators | support it. | | (Addendum: do developers actually do things that only work in | virtually a single terminal emulator? Yeah. A lot of Node.JS | software printed status using emoji that broke horridly on | everything but macOS. It mostly works elsewhere now, and it | was purely a visual problem, but I would assume many didn't | even realize what their app would've done in most terminal | emulators. The same with hyperlinks would be a disaster.) | pwdisswordfish2 wrote: | Someone should create a database of various terminal | emulators and the features they support, so that | applications would be able to look up what features are | available. | | We shall call it the 'terminal capabilities library'. | jchw wrote: | I dislike this strategy because then everyone pretends to | be xterm and we're back to square one. | yjftsjthsd-h wrote: | Yeah, it feels like we're just doing user agent vs | feature detection all over again. And the xterm thing is | a real concern, since I've read that lots of terminals | have indeed pretended to be xterm, leading to programs | having to try to detect the real terminal identity | because they set TERM=xterm but don't have 100% of | xterm's features... | jchw wrote: | Yep... this is exactly the problem. In fact sometimes you | are recommended to go and override $TERM in order to fix | a bug. Worse, is apps that do "work" but miss features. | Like it is often recommended to set your term to | xterm-256color to get high color in terminal apps. It's | not a huge deal, but it is simultaneously not a situation | I like or think should be encouraged. | | FWIW I am quite confident you can do better progressive | feature detection in terminals, but until that somehow | gains traction we're going to get a lot of "hey, how come | i cant click the link?" and the answer is they are using | KDE or Windows. | yjftsjthsd-h wrote: | Yup. I use alacritty, but if I SSH I frequently have to | force TERM=xterm or something because nothing knows about | that. But it's fine when I SSH from in tmux... because I | have tmux force TERM=screen-256color to get better color | support:) | saagarjha wrote: | Don't those terminals have hyperlink detection anyways? | What's the need to use special escape sequences in this | case? | jchw wrote: | I dunno, but people are getting quite mad about me | suggesting this feature should be removed, so I guess it | is truly mandatory that our terminal emulators do this. | Shortlinks are not good enough, phishing be damned? | VadimPR wrote: | What telnet text are they using? The only hyperlinking in | telnet I know of is MXP, http://www.zuggsoft.com/zmud/mxp.htm. | wyldfire wrote: | Previously submitted as | https://news.ycombinator.com/item?id=22708586 | leni536 wrote: | It looks great and useful. I suspect that this only works within | a single translation unit and can't work between separate | translation units. But maybe it could work together with lto, | that would be pretty awesome. | | Some of the worst lifetime issues are 3rd party library calls | with unclear ownership semantics and static analyzers are just as | clueless as you are. The function signature doesn't help you out | in this regard (in C). My recent "favorite" is libzip's | zip_open_from_source that conditionally takes ownership of its | zip_source_t* argument. | | https://libzip.org/documentation/zip_open_from_source.html | | https://libzip.org/documentation/zip_source_free.html | 3fe9a03ccd14ca5 wrote: | These types of tools go a really long way in improving the | reliability and safety of C code. | | Hats off to the redhat team for putting in the effort on this. | Their blog posts have been really interesting. It's definitely | changing my perception of what redhat really does. | jabedude wrote: | I've always had a positive impression of redhat, but I was | recently blown away with their dedication to upstreaming | contributions across different open source projects. I was | investigating a new Linux kernel feature that redhat | contributed and saw that the same developer opened pull | requests that added support for the new kernel feature in three | major open source projects. And one of the projects took over a | year to accept the changes, but he was persistent in reaching | out, making requested changes, etc. It really shows the passion | at the company to share their contributions. | Vogtinator wrote: | It doesn't have to be passion - having something upstreamed | has a lot of other benefits as well. | WalterBright wrote: | Double-free's can be tracked by doing data flow analysis on the | function. This is how D does it in its nascent implementation of | an Ownership/Borrowing system. It can be done without DFA if | getting it only 90% right and having numerous false positives is | acceptable. | | I've used many static checkers in the past, and the rate of false | positives was high enough to dissuade me from using them. This is | why D uses DFA to catch 100% of the positives with 0% negatives. | I knew this could be done because the compilers were using DFA in | the optimization pass. | | In order to get the tracking to work, one cannot just track | things for a function named "free". After all, a common thing to | do is write one's own custom storage allocators, and the compiler | won't know what they are. Hence, there has to be some mechanism | to tell the compiler when a pointer parameter to a function is | getting "consumed" by the caller, and when it is just "loaned" to | the caller (hence the nomenclature of an Ownership/Borrowing | system). | | One of the difficulties to overcome with D in doing this is there | are several complex semantic constructs that needed to be | deconstructed into their component pointer operations. I noticed | that Rust simplified this problem by simplifying the language :-) | | But once done, it works, and works satisfyingly well. | | Note that none of this is a criticism of what GCC 10 does, | because the article gives insufficient detail to draw any | informed conclusions. But I do see this as part of a general | trend that people are sick and tired of memory safety bugs in | programming languages, and it's good to see progress on all | fronts here. | chrisseaton wrote: | > This is why D uses DFA to catch 100% of the positives with 0% | negatives. | | For example if we have a graph data structure and we are | visiting it and freeing nodes, how can your static analysis | know that nodes will not be visited twice by my logic and freed | twice? | WalterBright wrote: | Because the DFA follows edges, not logic. If free appears | twice along any sequence of edges, the compiler rejects it. | chrisseaton wrote: | I think that's why you're getting the response you are in | this thread - if it really did '100% of the positives with | 0% negatives', it'd _have_ to consider the logic. There 's | an impossibility proof for that which has already been | cited in the thread. It's usually illustrated using a | function that solves the halting problem rather than the | Goldbach conjecture. | | If the compiler rejects things that have free appear twice | on branching control flow paths but aren't actually broken | due to dynamic control flow (like your looping example | given elsewhere) then that's a false positive, isn't it? | | And people thought '0% negatives' was meaning 'no false | positives'. | WalterBright wrote: | I see your point, and it's a valid criticism of my | statement. I meant it more in the sense that the static | checkers I have used would use hackish ad-hoc heuristics | which did not work very well, and gets constantly tweaked | by the implementor. | | The D implementation, in contrast, uses a well-defined | rule outlined above, and it works predictably and | reliably (absent implementation bugs, rather than ad hoc | rule shortcomings). | | Another way of putting it is that the _rules of the | language_ are changed to make 0% /100% results workable. | The compiler never says "there may be a problem here". It | passes or is rejected. If it is passes, it is memory | safe. | anchpop wrote: | That approach is a really good point in the design space | IMO. It is also the tradeoff that people have to make | when designing type systems. Many programs are correct | but rejected by their language's type system, but if the | language is well designed it was probably a really weird | and unintuitive program to begin with, so it's okay to | reject it anyway. | vanderZwan wrote: | > _I noticed that Rust simplified this problem by simplifying | the language :-)_ | | That's a bit of an open-ended qualifier. Would you say that is | a positive simplification (an increase in elegance) or one that | has a trade-off in reducing its expressive power? | WalterBright wrote: | I mentioned it as another way of solving the problem. Which | way you prefer is a matter of professional taste, something | I'd really prefer not diverging into in this thread. | anchpop wrote: | > This is why, for D, I was determined to use DFA to catch 100% | of the positives with 0% negatives. I knew this could be done | because my compilers were using DFA in the optimization pass. | | Is this really true? I thought this was impossible due to | Rice's theorem | WalterBright wrote: | Fortunately, I am unaware that it was impossible and did it | anyway :-) | | But it is possible I made a mistake. | | It is also true that for it to work, one has to change the | way one writes code, like Rust does. This is why D requires | and @live attribute be added to functions to enable the | checking for just those functions, so it doesn't break every | program out there. It will enable incremental use of the | checking at the user's option. | nullc wrote: | You're probably using a different definition of 100% than | any impossibility proof would use. | | Consider some code: | | --- | | a=malloc(1); | | needfree=true; | | if (hashfn(first_factor(huge_static_rsanum1))&1){needfree=f | alse;free(a);} | | if (hashfn(first_factor(huge_static_rsanum2))&1){needfree=f | alse;free(a);} | | if(needfree)free(a); | | --- | | The decision if this has a double free or not depends on | the factorizations of two huge difficult to factor | constants. It either double-frees or not depending on those | constants. | | Surely your software cannot decide that... | | What you probably mean is something like "100% on real | programs rather than contrived cases". Of course, in that | case, your definition of 'real programs' is the catch. :P | | Sometimes things that seem like they should always work | except on contrived junk like the above example actually | run into limitations in practice because macros and machine | code generation produce ... well ... contrived junk from | time to time. | jjnoakes wrote: | I personally would want code like this flagged by static | analysis. I would fix the code by not calling free in the | second 'if' body if needfree was already false (just like | the call at the end of the function is guarded). | nullc wrote: | You might want it flagged-- I would--, but it would still | be a false positive if it did so (assuming the constants | didn't result in a double-free). :) | [deleted] | WalterBright wrote: | > Surely your software cannot decide that... | | The D implementation would reject such code. The DFA | assumes all control paths are executed. For example, | if (c) free(p); *p = 3; | | is rejected as use-after-free. if (c) | free(p); if (!c) *p = 3; | | is also rejected as use-after-free. If the DFA is done | properly, you will not be able to trick it. | UncleMeat wrote: | Then that doesn't mean "0% of the negatives". | Ygg2 wrote: | Ok. And if you call into non-D code, like C or assembly? | | I think this is just moving potential problem under rug, | which is best it can do, given Rice Theorem | WalterBright wrote: | Calling non-D functions means the compiler cannot check | them, and will rely on the user having correctly | specified and annotated the function signature. | | I.e. you can have "unsafe" annotated code which the | compiler doesn't check, which is indeed where the dirty | deeds get swept. For example, I doubt someone could write | a guaranteed safe implementation of malloc(). | | The idea, naturally, is to specifically mark such | sections so that your crack QA team knows what to focus | on, and to minimize the amount of such code. | | The trick is to make the safe sections expressive enough | that it is practical to write the bulk of the code in | safe sections, thereby minimizing the "threat surface". | anchpop wrote: | Imagine this pseudocode: void main() { | check_goldbach(); // iterates over every integer > 2 | // to see if it is an exception to | // the goldbach conjecture, and | // returns if so use_after_free(); } | | This program may be memory safe. It's possible that there | is no exception to the goldbach conjecture, so | check_goldbach will never halt, so use_after_free will | never be called. It also may not be memory safe. If the | goldbach conjecture is not true, check_goldbach will | eventually return and then we'll call use_after_free. | | You say your compiler accepts all programs which are memory | safe, and rejects all that aren't. If so, you can use it to | prove or disprove the goldbach conjecture. Since proving or | disproving that conjecture comes with a one million dollar | prize, it is surprising that you can solve it with your | method | | edit: I didn't mean to disparage WatlerBright's work, I | only was taking issue with the idea that any method of | static analysis can accept 100% of correct programs and | reject 100% incorrect programs. This is an unavoidable | consequence of the halting problem and being aware of it | allows you to decide where you want the tradeoff to be. it | sounds like his implementation has no false negatives at | the cost of a few false positives that are rare enough for | them not to really be an issue, which is still really cool | WalterBright wrote: | > You say your compiler accepts all programs which are | memory safe, and rejects all that aren't. | | It does it by implementing O/B rules, which force changes | in the way one writes code. This may sidestep the | goldbach conjecture, which I don't understand. | funglewest wrote: | > the goldbach conjecture, which I don't understand | | Clearly. | dang wrote: | Please stop making accounts to break the site guidelines | with. | | https://news.ycombinator.com/newsguidelines.html | jjnoakes wrote: | It is possible that the compiler simply rejects any | definition of a function that has a use after free, which | would both uphold the compiler's claim and also not prove | or disprove the goldbach conjecture. | anchpop wrote: | If that is the case then his approach has false | negatives, because a function that is never called cannot | introduce memory unsafety but would incorrectly be | flagged as unsafe | WalterBright wrote: | Which it does, if such use is on any control flow path | after the free(). For example, bool | first = true; while (f()) { if (first) | free(p); first = false; } | | is rejected as double-free. | [deleted] | bjoli wrote: | having your static analyser solving million dollar | problems seems like an unnecessary high bar. | | how is your comment, under any reasonable interpretation | of what Walter wrote, not pointless pedantry? | anchpop wrote: | What Walter meant in his comment was that his static | analyzer has no false positives (it never allows a memory | unsafe program to compile). He's provided a few simpler | examples in this thread of cases where his analyzer | produces false negatives. I knew that was going to be the | case because no program can flag 100% of negatives | without also flagging a few positives by accident (as a | consequence of Rice's theorem). I just meant to | demonstrate that, but I don't know anything about D so I | wanted to pick something that I was certain his analyzer | would fail one way or the other. | | That doesn't mean that his analyzer isn't useful. I find | rust extremely useful because it catches almost me | whenever I accidentally write memory unsafe code without | burdening me too much otherwise. But almost every rust | coder has run into a case where they wrote some code that | was safe, but not allowed. That's just a fact of life and | it serves no one to make incorrect claims about one's | software | ufo wrote: | I wonder... Has anyone here tried using gcc with the -fanalizer | option? Did it find any bugs that you did not know about? | [deleted] | 6gvONxR4sf7o wrote: | I'm super happy to see more static tools to prevent or at least | find buggy code. | mynegation wrote: | I, at one time, worked on a tool, commercial and external to the | compiler, that did this (among other things). Probably the most | intellectually challenging job I have ever had. I am happy static | analysis makes inroads into mainstream! | | Few takeaways from that time: inter procedural matters: if your | function reallocated a pointer passed as an argument, you want to | treat it as 'free' regarding this argument, and conversely, if | your function returns a newly allocated memory, you want to mark | it as such, and so on. There is also a trade off between the | breadth of the analysis and the human ability to comprehend it, | author mentions 110 node path in the article. | | The subject of my unfinished PhD thesis and something I hope also | picks up is the combination of static and dynamic analysis, used | iteratively. If your static analysis flags a suspicious path but | does not have the means to figure out if it is true or not, | instrument it and leave it to the dynamic analysis to run through | it (the idea here that total instrumentation a la valgrind is | detrimental to performance so you will get some gain from | selective instrumentation). Conversely, dynamic analysis may | provide some hints as to where static analysis should be applied | at a greater depth and provide automatic annotation of functions | with regard to their behaviour and - possibly - invariants, that | help with the state explosion. | DyslexicAtheist wrote: | ca 2000 - 2004 I had the luck to work on a massive C/C++code | base building base station / telecoms infrastructure. We had | several hundred engineers contributing code with all kind of | different philosophies (many grey beards who did C/C++ for | decades). | | Running Flexelint was part of our CI chain and also part of the | internal coding standard (e.g. definition of done). There was | no other time in my life where I learned so much about secure | coding as I did back then. Biggest challenge was agreeing about | false positives and we had 1 guy in the team who maintained the | official wiki document on when a lint warning needed to go on a | whitelist with an agreed description of why. The initial | overhead to become _lint-clean_ got a lot of push-back but | thanks to management support we got there and you could really | see how things stayed at that level even after years. | | It felt at times bureaucratic or like yak shaving, but in | retrospect linting was what kept the code base at a quality I | haven't seen ever again since. It also ensured everyone was on | the same page. Taking linting seriously required a small | learning curve though and lead to some discussions here or | there. These discussions were really valuable since we got to | really learn from another too. | | When I left and went to another project it felt like a step | back where we were chasing the same old bugs due to bad coding | practices and it was a major step back in my career as a dev. I | miss those days. | | Really love that this becomes part of gcc. | neilv wrote: | Kudos for lint-clean, and for high-quality C/C++ programming. | One of my own stories of that (coincidentally, from when I | worked on dev tools for aerospace/datacomm): | https://news.ycombinator.com/item?id=21158546 | olivierduval wrote: | "using the compiler the code is written in as part of the | compile-edit-debug cycle, rather than having static analysis as | an extra tool "on the side" (perhaps proprietary)" | | Mmmm... and why not have an external tool, part of the GCC family | but with a proper API, to allow to use ANY TOOL instead of | bloating GCC with one more tool that won't be usable on other | compilers and will need specific maintainers, althought this | field is already really complex and need a lot of different | knowledges ? | | For example, it could be based on intermediate code so better | than just source-code or machine-code analysis... | olivierduval wrote: | Just to be more specific: why not use the "UNIX philosophy" | with a compiler to compile (translate to Intermediate | Representation), an optimizer to optimize, an assembler to | produce machine code from IR (with allocation registry) and so | on...? | UncleMeat wrote: | Because GCC is explicitly designed to be a tangled mess (as | opposed to Clang/LLVM), in part because it makes it harder | for groups with different beliefs about FLOSS code to | repurpose it. | | Its a choice that has caused them to cede a lot of territory | to Clang/LLVM. | ndesaulniers wrote: | I think the major mistake was FSF refusing the objective c | front end from Apple. | pjmlp wrote: | For starters, NeXT wasn't even considering giving | anything back to GCC. | aw1621107 wrote: | IIRC in the past GCC (or maybe just some of the maintainers?) | has been resistant to that kind of architecture because it | makes it easier to write proprietary | extensions/plugins/backends/frontends/etc. to GCC. I'm not | sure whether that has changed. | | LLVM was designed to be a lot more modular, so it fits these | use cases a lot better. In your example, you have the front | ends (C, C++, Objective-C, Fortran frontends in LLVM proper, | Rust, Zig, etc. as third-party front-ends) compiling to LLVM | IR, opt for optimization, and different backends to generate | machine code from IR. ___________________________________________________________________ (page generated 2020-03-28 23:00 UTC)