[HN Gopher] Tell HN: C Experts Panel - Ask us anything about C
       ___________________________________________________________________
        
       Tell HN: C Experts Panel - Ask us anything about C
        
       Hi HN,  We are members of the C Standard Committee and associated C
       experts, who have collaborated on a new book called Effective C,
       which was discussed recently here:
       https://news.ycombinator.com/item?id=22716068. After that thread,
       dang invited me to do an AMA and I invited my colleagues so we
       upgraded it to an AUA. Ask us about C programming, the C Standard
       or C standardization, undefined behavior, and anything C-related!
       The book is still forthcoming, but it's available for pre-order and
       early access from No Starch Press:
       https://nostarch.com/Effective_C.  Here's who we are:  rseacord -
       Robert C. Seacord is a Technical Director at NCC Group, and author
       of the new book by No Starch Press "Effective C: An Introduction to
       Professional C Programming" and C Standards Committee (WG14)
       Expert.  AaronBallman - Aaron Ballman is a compiler frontend
       engineer for GrammaTech, Inc. and works primarily on the static
       analysis tool, CodeSonar. He is also a frontend maintainer for
       Clang, a popular open source compiler for C, C++, and other
       languages. Aaron is an expert for the JTC1/SC22/WG14 C programming
       language and JTC1/SC22/WG21 C++ programming language standards
       committees and is a chapter author for Effective C.  msebor -
       Martin Sebor is Principal Engineer at Red Hat and expert for the
       JTC1/SC22/WG14 C programming language and JTC1/SC22/WG21 C++
       programming language standards committees and the official
       Technical Reviewer for Effective C.  DougGwyn - Douglas Gwyn is
       Emeritus at US Army Research Laboratory and Member Emeritus for the
       JTC1/SC22/WG14 C programming language and a major contributor to
       Effective C.  pascal_cuoq - Pascal Cuoq is the Chief Scientist at
       TrustInSoft and co-inventor of the Frama-C technology. Pascal was a
       reviewer for Effective C and author of a foreword part.  NickDunn -
       Nick Dunn is a Principal Security Consultant at NCC Group, ethical
       hacker, software security tester, code reviewer, and major
       contributor to Effective C.  Fire away with your questions and
       comments about C!
        
       Author : rseacord
       Score  : 604 points
       Date   : 2020-04-14 13:03 UTC (9 hours ago)
        
       | papermachete wrote:
       | How and why will C combat Rust?
        
         | pascal_cuoq wrote:
         | In my opinion, the two languages are going to co-exist for a
         | long time. C has billions of lines of legacy software written
         | in it... In recent news, COBOL developers were sought after in
         | order to update existing COBOL software, so the same thing will
         | happen with C, perhaps to the end of humanity (I have become
         | pessimistic as to humanity's future).
         | 
         | There are pieces of software that should be given priority for
         | a rewrite in Rust, but most of C software is never going to be
         | rewritten, because there is simply too much of it.
         | 
         | Therefore, even if C did not have any advantage of its own over
         | Rust, there would still be legacy software to maintain and to
         | extend.
         | 
         | The advantages of C include that sometimes, an embedded
         | processor with a proprietary instruction set is provided by the
         | chipmaker with its own C compiler, which is the only compiler
         | supporting the instruction set; that C is still currently used
         | to write the runtimes of higher-level languages (I'm familiar
         | with OCaml, but it isn't too much of a stretch to imagine that
         | the runtimes of Python, Haskell,... are also written in C).
        
           | cesarb wrote:
           | > In my opinion, the two languages are going to co-exist for
           | a long time.
           | 
           | It goes deeper than that, in a couple of places Rust depends
           | on the C standard: the fixed-layout `#[repr(C)]` structs
           | (without that attribute, the compiler is free to reorder the
           | struct fields; with that attribute, it's laid out the way C
           | would do it), and the `extern "C"` function call ABI. The way
           | to call any other language from Rust, or Rust from any other
           | language, is to go through `extern "C"` functions passing
           | `#[repr(C)]` structs. So even if the C language dies one day,
           | parts of it will live in Rust forever (or as long as the Rust
           | language lives).
        
           | sramsay wrote:
           | There's tons of legacy C around, we have to maintain it, it's
           | not ideal unless you're on some niche platform, lots of stuff
           | should probably be written in a better language . . .
           | 
           | I sincerely hope this is not the general attitude of the
           | standards committee. Some of us actually _prefer_ C, and
           | would like to see the language continue to flourish.
        
             | pascal_cuoq wrote:
             | Note that among the C experts participating in this AMA, I
             | am not one who is in the standardization committee. At
             | 14:59 EDT, just before the AMA was posted, we were joking
             | between ourselves about me having to post this disclaimer
             | but I guess there was a hidden truth in the joke.
        
         | rseacord wrote:
         | C is a pretty well established language, so this question
         | should probably be asked the other way around. C was primarily
         | designed to compete with FORTRAN.
        
         | ape4 wrote:
         | Rust has a package manager while C and C++ don't (as far as I
         | know). This alone makes Rust more attractive for some projects. I
         | hope C and C++ get one.
        
       | clarry wrote:
       | Open up WG14 mailing list for non-members?
       | 
       | It's hard to appreciate what's going on at WG14 (or take part)
       | when you can see the results only from afar, with none of the
       | surrounding discussion.
       | 
       | I recently read Jens Gustedt's blog on C2x where he casually
       | recommended this as a way to get involved: "The best is to get
       | involved in the standard's process by adhering to your national
       | standards body, come to the WG14 meetings and/or subscribing to
       | the committee's mailing list."
       | 
       | Afaict (from browsing the wg14 site), the mailing list and its
       | archives are not open to access.
       | 
       | https://webcache.googleusercontent.com/search?q=cache:TnEGL4...
       | 
       | EDIT: In general, how is one supposed to approach wg14 with ideas
       | or need for clarification on the standard's wording /
       | interpretation?
        
         | AaronBallman wrote:
         | > In general, how is one supposed to approach wg14 with ideas
         | or need for clarification on the standard's wording /
         | interpretation?
         | 
         | I'm currently working on an update to the committee website to
         | clarify exactly this sort of thing! Unfortunately, the update
         | is not live yet, but it should hopefully be up Soon(tm).
         | 
         | Currently, the approach for clarifications and ideas both
         | require you to find someone on the committee to ask the
         | question or champion your proposal for you. We hope to improve
         | this process as part of this website update to make it easier
         | for community collaboration.
        
         | DougGwyn wrote:
         | In general, the committee accepts what we used to call "defect
         | reports" (now something like "requests for improvement"),
         | assigns them "WG14 series" sequence numbers, and upon requests
         | for "floor time" schedules meeting discussions. Occasional
         | votes are taken, which might trigger modifications to the draft
         | standard. At some point, the committee decides that the updated
         | draft standard is ready for public review, and the various
         | national representatives deal with review comments. All this
         | starts with proposal documents in "WG14 series" form.
        
         | ori_b wrote:
         | Agreed. I would like to get involved, but I don't see any
         | reasonable way for me to do that as an individual.
        
       | aray wrote:
       | When do you think we will get an update to C11 or more recent
       | version of C to MISRA? Do you all have any influence on "Safety
       | Critical C" standards?
        
         | AaronBallman wrote:
         | The MISRA committee is a separate organization from the C
         | standards committee, but there is overlap between the two
         | groups and an official liaison process for the committees to
         | collaborate. So there's a bit of bidirectional influence
         | between the two groups.
         | 
         | I am not on the MISRA committee, but I believe they talk a bit
         | about their public roadmap in this video:
         | https://vimeo.com/190304951
        
       | shric wrote:
       | Is there a rule that any new proposals must already be a feature
       | in an existing major implementation?
        
         | beefhash wrote:
         | (Not one of the OPs:) Wasn't C11 Annex K, the notoriously
         | failed bounds-checking interfaces, an example of not having an
         | existing implementation?
        
           | AaronBallman wrote:
           | Annex K had an existing implementation from Microsoft. It
           | wasn't a fully conforming implementation when C11 shipped,
           | however (the specification drifted apart from the initial
           | implementation).
        
         | AaronBallman wrote:
         | Yes, the C2x charter has this requirement:
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2086.htm
        
           | shric wrote:
           | Thanks, so from "Only those features that have a history and
           | are in common use by a commercial implementation should be
           | considered", this precludes stuff that may only exist in
           | clang, gcc, glibc, etc.? If so, why?
        
             | parenthesis wrote:
             | You could interpret that as "in common use by a
             | commercial[ly used] implementation".
        
             | AaronBallman wrote:
             | I wouldn't read into "commercial" there, I think we meant
             | "production-quality" instead. (We should fix that!)
             | 
             | Basically, we prefer seeing features that real users have
             | used as opposed to an experimental branch of a compiler
             | that doesn't have usage experience. Knowing it can be
             | implemented is one thing, but knowing users want to use it
             | is more compelling.
        
       | hellofunk wrote:
       | I really like the relative simplicity of C compared to C++ and
       | recently wrote a project in C, but eventually rewrote it in C++
       | for just a few seemingly trivial reasons that nonetheless were
       | important time savers. I'd love to know if the C standard, as can
       | run on GPUs also, will ever evolve to offer:
       | 
       | 1) namespaces, so function names don't need to be 30 characters
       | to avoid naming collision
       | 
       | 2) guaranteed copy elision or RVO -- provides greater confidence
       | for common idioms and expressivity compared to passing out
       | parameters
        
       | AvImd wrote:
       | Is there a possibility there will be introduced a new rule saying
       | "if the compiler detects an UB it should abort the compilation
       | instead of breaking the code in the most incomprehensible way
       | possible"?
       | 
       | Right now it's just scary to start a new project in C. It would
       | be really great if there was more emphasis on correctness of the
       | produced code instead of the insane optimizations.
        
         | DougGwyn wrote:
         | Try using "lint" or other code checkers.
        
         | Someone wrote:
         |     int i;
         |     [...]
         |     i += 1;
         | 
         | potentially is undefined behavior; _i_ could overflow.
         | 
         | Compilers nowadays are fairly good at warning about _definite_
         | undefined behavior.
         | 
         | I don't think anybody would be happy with a compiler that
         | aborted on all _potential_ undefined behavior. That would
         | (almost) be equivalent to banning the use of all signed ints.
        
         | AaronBallman wrote:
         | It would be _wonderful_ (IMO) if we could get to that point,
         | but that would leave implementations with too great of a burden
         | because many forms of UB can only be caught at runtime (without
         | a considerable number of false positives). Generally, the C
         | committee makes things a  "constraint violation" (aka, we would
         | like implementations to err) whenever something can be caught
         | at compile time, and we leave the undefined behavior hammer for
         | scenarios where there is not a reasonable alternative.
         | 
         | Thankfully, there are a lot of tools to help developers catch
         | UB these days (UBSan, static analyzers, valgrind, etc). I would
         | recommend using those tools whenever starting a new project in
         | C (or C++, for that matter).
        
         | klodolph wrote:
         | This can only be done at compile time in very specific cases.
         | The huge problem here is the compiler has no way of knowing
         | which cases of undefined behavior are _bugs in the program_ and
         | which cases of undefined behavior are just examples of
         | unreachable code. If the compiler aborted compilation when it
         | detected undefined behavior, you'd be getting a lot of false
         | positives for unreachable code, and you'd need to solve that
         | problem (figuring out how to generate sensible errors and
         | suppress them). This is not even remotely easy.
         | 
         | If you are concerned about safety there are ways to achieve
         | that, like using MISRA C, formally verifying your C, or by
         | writing another language like Rust.
        
           | kzrdude wrote:
           | Good point, but could it not be required that the unreachable
           | code would be annotated to be unreachable? It could even have
           | a (development only) assertion in the location.
        
             | klodolph wrote:
             | That would be an immense undertaking. It's not really just
             | that some statement or expression is unreachable (we have
             | __builtin_unreachable() in GCC for stuff like that) but
             | that certain states are unreachable.
             | 
             | For example,
             | 
             |     int buffer_len(struct buffer *buf) {
             |         return buf->end - buf->start;
             |     }
             | 
             | There are at least three states that trigger undefined
             | behavior: buf is not a valid pointer, buf->end - buf->start
             | doesn't fit in int, and buf->end and buf->start don't point
             | to the same object.
             | 
             | I'm not sure how you would annotate this. At the function
             | call site, you would somehow need to show that buf is a
             | valid pointer, and that start/end point to same object and
             | the difference fits in an int. It would start looking more
             | like Coq or Agda than C.
             | 
             | Honestly, I think if you really want this kind of safety,
             | your options are to use formal methods or switch to a
             | different language.
             | 
             | There's also this weird assumption here that the compiler
             | detects undefined behavior in your program and then mangles
             | it. It's really the opposite--the compiler assumes that
             | there is no undefined behavior in your program, and
             | optimizes accordingly. In practice you can turn
             | optimizations off and get something much closer to the
             | "machine model" of C (which doesn't really exist anyway)
             | but most people hate it because their code is too slow.
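To make the three undefined states above concrete, here is a minimal sketch of a checked variant. The layout of struct buffer and the name buffer_len_checked are assumptions (the thread shows neither). Only two of the three conditions can be tested at run time, and only partly: a null check does not prove a pointer is valid, and nothing in the function can confirm that start and end point into the same object.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed layout; the thread does not show the definition of struct buffer. */
struct buffer {
    char *start;
    char *end;
};

/* Hypothetical checked variant: returns false instead of invoking undefined
   behavior for the cases that can actually be detected at run time. */
bool buffer_len_checked(const struct buffer *buf, int *len)
{
    /* Catches null pointers only; other invalid pointers go undetected. */
    if (buf == NULL || buf->start == NULL || buf->end == NULL)
        return false;

    /* Precondition that cannot be tested here: start and end must point into
       (or one past the end of) the same array, otherwise the subtraction
       below is itself undefined. */
    ptrdiff_t diff = buf->end - buf->start;

    /* Reject lengths that are negative or do not fit in an int. */
    if (diff < 0 || diff > INT_MAX)
        return false;

    *len = (int)diff;
    return true;
}
```

The same-object requirement stays a documented precondition, which is the part that would need annotations or formal methods to enforce.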
        
               | kzrdude wrote:
               | Thanks, so it's definitely easier said than done! Good
               | explanation.
        
           | AvImd wrote:
           | > If the compiler aborted compilation when it detected
           | undefined behavior, you'd be getting a lot of false positives
           | for unreachable code
           | 
           | Could you please provide an example of this?
        
             | toasted_flakes wrote:
             | Overflow of signed integers is undefined.
             | int add(int a, int b) { return a + b; }
             | 
             | Unless the compiler can prove that `add` is never called
             | with a and b values resulting in an overflow, this code can
             | lead to UB, and, under your rules, the compilation aborts.
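For comparison, a minimal sketch of an addition that never overflows, at the cost of explicit range checks on every call. The name checked_add is hypothetical and not from the thread; the checks are written so that they cannot overflow themselves.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true only when the
   sum is representable as an int. */
bool checked_add(int a, int b, int *out)
{
    /* INT_MAX - b cannot overflow when b > 0, and INT_MIN - b cannot
       overflow when b < 0, so both comparisons are well defined. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;

    *out = a + b;
    return true;
}
```

A compiler cannot insert these checks on its own without changing the cost of every signed addition, which is roughly the trade-off being discussed here.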
        
         | msebor wrote:
         | Some implementations have been making a lot of effort to do
         | just that. GCC in particular has been adding these types of
         | checks (either as warnings or sanitizers) in recent years and
         | although there is still much to improve I'd like to think we
         | have made good progress.
         | 
         | Adding a rule requiring implementations to error out in cases
         | of undefined behavior would be hard to specify in the standard.
         | It could (and in my view should) be done by providing non-
         | normative encouragement as "Recommended Practice."
        
       | mcguire wrote:
       | Any chance of getting something like Frama-C officially blessed?
        
       | Jahak wrote:
       | Tell me where I can get the C89 standard for free (pdf or other
       | formats)
        
         | pascal_cuoq wrote:
         | The last time I needed it, archive.org had a link to a PDF of
         | it.
         | 
         | I couldn't find that again in one minute, but here is the text
         | version:
         | http://web.archive.org/web/20030222051144/http://home.earthl...
        
           | Jahak wrote:
           | Thanks
        
       | grok22 wrote:
       | Things I would like C to have:
       | 
       | - stricter type-checks on typedef types (useful when passing
       |   function parameters)
       | - gcc's 'warn_unused_result' attribute for functions (ensure error
       |   returns are checked)
       | - on-entry/on-exit qualifiers for functions (to do things like make
       |   sure you lock/unlock semaphores for instance before entry/exit of
       |   function)
       | - D language's 'scope' feature (better handling of error path)
       | - loops in the c pre-processor! (better code-gen)
       | 
       | Any chance any of this is on the radar for the next-gen C
       | standard? Some of these are just ergonomics, but the first two
       | might have saved me some grief a few times.
        
         | _kst_ wrote:
         | typedef, in spite of the name, doesn't create a new type. It
         | only creates a new name for an existing type. Changing that
         | would break existing code.
         | 
         | I wouldn't mind seeing a new feature that _does_ define a new
         | type (one that's identical to, but incompatible with, an
         | existing type), but we can't call it "typedef".
         | 
         | In a sense that feature already exists. You can define a
         | structure with a single member of an existing type. But you
         | have to refer to the member by name to do anything with it.
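A minimal sketch of the single-member structure approach described above. The type names meters and seconds and the function speed are illustrative assumptions, not from the thread.

```c
/* Two hypothetical unit types. Both wrap a double, but because each is its
   own structure type they are incompatible with each other and with double. */
typedef struct { double value; } meters;
typedef struct { double value; } seconds;

/* Passing the arguments in the wrong order is a compile-time error here,
   whereas with plain typedefs of double it would compile silently. */
double speed(meters distance, seconds elapsed)
{
    /* The cost mentioned above: the member must be named explicitly. */
    return distance.value / elapsed.value;
}
```

A call such as speed(d, t) with d of type meters and t of type seconds compiles, while speed(t, d) is rejected.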
        
       | arunc wrote:
       | What do you think about D language's mode to work as a better C
       | alternative[0]? It seems to even do printf format validation. Can
       | this be the future of C?
       | 
       | [0] https://dlang.org/spec/betterc.html
        
       | begriffs wrote:
       | Has there been a survey to determine what percentage of known
       | compilers support each C version, like C89, C99, C11? I've been
       | sticking to C99 because I assumed later versions won't be widely
       | adopted for a long time to come. Is this accurate?
        
         | DougGwyn wrote:
         | There is a Web page I saw a few days ago that does that,
         | probably findable by grepping Wikipedia. Unfortunately I forget
         | its URL.
        
       | emilfihlman wrote:
       | Will you ever add / have you considered adding sane formatting
       | options for fixed length variables in printf? Say %u32 or %s64 ?
       | 
       | Have you considered adding access to structure members by index
       | or by string name? Have you considered dynamic structures?
        
         | rmind wrote:
         | Just FYI -- there are macros for the fixed-length types, e.g.:
         | printf("U32: %" PRIu23 ", U64: " PRId64, (uint32_t)1,
         | (int64_t)2);
         | 
         | Perhaps not as handy as %u32 or %s64, but it's here.
        
           | emilfihlman wrote:
           | Yeah, and the issue is with those macros exactly. It makes
           | writing code on them really damn annoying and it relies on C
           | constant string concatenation, breaks the flow quite a lot.
        
             | _kst_ wrote:
             | Which is why I usually convert to intmax_t or uintmax_t, or
             | to some type that I know is wide enough:
             | 
             |     uint64_t foo = ...;
             |     printf("foo = %ju\n", (uintmax_t)foo);
             |     /* OR */
             |     printf("foo = %llu\n", (unsigned long long)foo);
        
           | michaelt wrote:
           | I think what emilfihlman means is those macros are hard to
           | remember and clumsy to use - which you might agree with when
           | I point out you made two mistakes in two usages :-p
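For reference, a sketch of the call from rmind's comment with the two slips corrected: PRIu32 rather than PRIu23, and a % before PRId64. The helper name format_fixed and the "I64" label are illustrative assumptions; the sketch formats into a buffer with snprintf so the result can be inspected.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: formats one uint32_t and one int64_t into buf.
   Each PRI macro expands to a string literal holding the conversion
   specifier, so it still needs its own leading '%'. */
int format_fixed(char *buf, size_t size, uint32_t u, int64_t d)
{
    return snprintf(buf, size, "U32: %" PRIu32 ", I64: %" PRId64, u, d);
}
```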
        
           | [deleted]
        
         | AaronBallman wrote:
         | > Will you ever add / have you considered adding sane
         | formatting options for fixed length variables in printf? Say
         | %u32 or %s64 ?
         | 
         | I'm not certain about the historical answer to this, but I do
         | know that we're currently considering a proposal to introduce
         | an exact bit-width integer type '_ExtInt(N)' to the language,
         | and how to handle format specifiers for it is part of those
         | discussions, so we are considering some changes in this area.
         | 
         | > Have you considered adding access to structure members by
         | index or by string name? Have you considered dynamic
         | structures?
         | 
         | I don't recall seeing any such proposals. I'm not familiar with
         | the term "dynamic structures", what do you have in mind there?
        
           | emilfihlman wrote:
           | >and how to handle format specifiers for it is part of those
           | discussions, so we are considering some changes in this area.
           | 
           | Please, please, please pick short and descriptive format
           | specifiers, like %[suf]\d+, ie
           | 
           |     s64 v=somenumber;
           |     printf("%s64\n", v);
           | 
           | _ExtInt(N) and PRIx64 etc look absolutely horrid. u?int\d+_t
           | are also really bad, it would be great to have just [suf]\d+
           | as types, where \d+ is 8, 16, 32, 64 for [us] and 32 and 64
           | for f.
           | 
           | >what do you have in mind there?
           | 
           | Say like VLAs but structures with members that are
           | dynamically defined and used.
        
             | AaronBallman wrote:
             | > Please, please, please pick short and descriptive format
             | specifiers, like %[su]\d+, ie
             | 
             | That's my personal preference as well. Using the PRI macros
             | always makes me feel sad.
             | 
             | > Say like VLAs but structures with members that are
             | dynamically defined and used.
             | 
             | Ah, no, I don't recall any proposals along those lines.
             | It's an interesting idea, and I'd be curious what the
             | runtime performance characteristics would be vs what kind
             | of new coding patterns would emerge that you couldn't do
             | previously though!
        
       | bumblebritches5 wrote:
       | Hey guys,
       | 
       | How likely would the standard be to accept a proposal to add
       | compile time reflection to the preprocessor, or even adopt C++'s
       | constexpr?
       | 
       | My use case is creating a global array in a header from static
       | compound literals in multiple source files at compile time, and
       | outside of some crazy clang-tblgen type solution, or very
       | platform specific linker hacks, it's completely unsupported by C.
        
       | dhhwrongagain wrote:
       | Is memset(malloc(0), 0, 0) undefined behavior?
        
         | DougGwyn wrote:
         | Let's assume the types have been corrected. malloc((size_t)0)
         | behavior is defined by the implementation; there are two
         | choices: (a) always returns a null pointer; or (b) acts like
         | malloc((size_t)1) which can allocate or fail, and if it
         | allocates then the program shall not try to reference anything
         | through the returned non-null pointer. Now, memset itself is
         | required (among other things) to be given as its first argument
         | a valid pointer to a byte array. In particular, it shall not be
         | a null pointer. Tracking through the conformance requirements,
         | if the malloc call returns a null pointer then the behavior is
         | undefined. Thus, you should not program like this.
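A minimal sketch of the defensive form implied by this answer: test the result of malloc before handing it to memset, so memset never receives a null pointer, even for a zero-byte request. The name alloc_zeroed is hypothetical; in real code calloc already covers this use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates n bytes and zeroes them.
   Returns NULL when the allocation fails, or when the implementation
   chooses to return a null pointer for a zero-size request. */
void *alloc_zeroed(size_t n)
{
    void *p = malloc(n);

    /* memset is only reached with a non-null pointer; with n == 0 it
       writes nothing. */
    if (p != NULL)
        memset(p, 0, n);

    return p;
}
```

The caller frees the result as usual; free accepts a null pointer, so the zero-size case needs no special handling there.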
        
       | hsivonen wrote:
       | Does the committee have plans to deprecate (as in: give compiler
       | license to complain such that compiler developers can appeal to
       | the standard when users complain back) locale-sensitive functions
       | like isdigit, which is useless for processing protocol syntax,
       | because it is locale-sensitive, and useless for processing
       | natural-language text, because it examines only one UTF-8 code
       | unit?
        
         | DougGwyn wrote:
         | isdigit is likely to remain, because much existing code does
         | use it (perhaps in different contexts from the one you cited).
         | If you need a different function specification to do something
         | different, it could be added in a future release, but that
         | doesn't mean that we need to force programmers to change their
         | existing code.
        
           | hsivonen wrote:
           | Does there exist a use case in portable code such that use of
           | isdigit is not a bug?
           | 
           | How does the committee view non-portable existing code
           | generally when considering changes?
        
             | DougGwyn wrote:
             | Code can be non-portable for various reasons, not all of
             | them bad. I just grepped a recent release of DWB and found
             | about 100 uses of isdigit, most of which were not input
             | from random text but rather were used internally, such as
             | "register" names (limited to a specified range). Other
             | packages are likely to have similar usage patterns. I
             | really don't want to have to edit that code just for
             | aesthetics.
        
           | _kst_ wrote:
           | What about giving isdigit and friends defined behavior for
           | any argument value that's within the range of any of char,
           | signed char, or unsigned char?
           | 
           | The background (I know Doug knows this): isdigit() takes an
           | argument of type int, which is required to be either within
           | the range of unsigned char, or have the value EOF (required
           | to be negative, typically -1).
           | 
           | The problem: plain char is often signed, typically with a
           | range of -128..+127. You might have a negative char value in
           | a string -- but passing any negative value other than EOF to
           | isdigit() has undefined behavior. Thus to use isdigit()
           | safely on arbitrary data, you have to cast the argument to
           | unsigned char:
           | 
           |     if (isdigit((unsigned char)s[i])) ...
           | 
           | A lot of C programmers aren't aware of this and will pass
           | arbitrary char values to isdigit() and friends -- which works
           | fine most of the time, but risks going kaboom.
           | 
           | Changing this could raise issues if -1 is a valid character
           | value and also the value of EOF, but practically speaking -1
           | or 0xff will almost never be a digit in any real-world
           | character set. (It's ÿ in Unicode and Latin-1, which might
           | cause problems for islower and isalnum.)
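A minimal sketch of the cast described above, applied to a whole string. The function name count_digits is an assumption for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters in s that isdigit accepts.
   The cast keeps the argument within the range of unsigned char, so bytes
   that are negative as plain char never reach isdigit as negative ints. */
size_t count_digits(const char *s)
{
    size_t n = 0;

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (isdigit((unsigned char)s[i]))
            n++;
    }
    return n;
}
```

Without the cast, a byte such as 0x80 in the input would be passed as a negative value on platforms where plain char is signed.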
        
       | bonzini wrote:
       | Is there any reason to keep the undefined behavior for shifts of
       | negative numbers, instead of making it implementation defined?
       | Most compilers (for twos-complement architectures at least) are
       | not using that latitude, and I would also guess that most
       | programs that are written for twos-complement arithmetic are
       | likewise not expecting undefined behavior for non-overflowing left
       | shifts of negative numbers. Thanks!
        
         | DougGwyn wrote:
         | "Implementation-defined" is a nuisance, because then you need
         | to add code for all the variations, which also requires a set
         | of standard macros, etc. It is easier and less trouble-prone to
         | just avoid using the currently undefined behavior.
        
       | hyc_symas wrote:
       | The standard string library is still pretty bad. This would have
       | been a much better addition for safe strcpy.
       | 
       | Safe strcpy
       | 
       |     char *stecpy(char *d, const char *s, const char *e)
       |     {
       |         while (d < e && *s)
       |             *d++ = *s++;
       |         if (d < e)
       |             *d = '\0';
       |         return d;
       |     }
       | 
       |     main() {
       |         char buf[64];
       |         char *ptr, *end = buf+sizeof(buf);
       | 
       |         ptr = stecpy(buf, "hello", end);
       |         ptr = stecpy(ptr, " world", end);
       |     }
       | 
       | Existing solutions are still error-prone, requiring continual
       | recalculation of buffer len after each use in a long sequence,
       | when the only thing that matters is where the buffer ends, which
       | is effectively a constant across multiple calls.
       | 
       | What are the chances of getting something like this added to the
       | standard library?
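One property of stecpy as posted is worth spelling out: when the destination fills up, it returns the end pointer and writes no terminator. Below is a sketch of a caller that detects this. The name build_greeting and its truncation policy are assumptions for illustration, not part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* stecpy as posted above: copies s to d, never writing at or past e. */
char *stecpy(char *d, const char *s, const char *e)
{
    while (d < e && *s)
        *d++ = *s++;
    if (d < e)
        *d = '\0';
    return d;
}

/* Hypothetical caller: builds "hello world" in buf and reports whether the
   whole string, including its terminator, fit in size bytes. */
bool build_greeting(char *buf, size_t size)
{
    if (size == 0)
        return false;

    /* The end pointer is computed once and reused across calls. */
    char *end = buf + size;
    char *p = stecpy(buf, "hello", end);
    p = stecpy(p, " world", end);

    /* p == end means the buffer filled up before a terminator was written;
       this sketch truncates by overwriting the last byte. */
    if (p == end) {
        end[-1] = '\0';
        return false;
    }
    return true;
}
```

With a 12-byte buffer the full string fits; with an 8-byte buffer the result is "hello w" and the function returns false.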
        
         | pascal_cuoq wrote:
         | For what it's worth, I personally like this approach, because
         | there are some cases in which it requires less arithmetic in
         | order to be used correctly. And it lends itself better to some
         | forms of static analysis, for similar reasons, in the following
         | sense:
         | 
         | There is the problem of detecting that the function overflows
         | despite being a "safe" function. And there is the problem of
         | precisely predicting what happens after the call, because there
         | might be an undefined behavior in that part of the execution.
         | When writing to, say, a member of a struct, you pass the
         | address of the next member and the analyzer can safely assume
         | that that member and the following ones are not modified. With
         | a function that receives a length, the analyzer has to detect
         | that if the pointer passed points 5 bytes before the end of the
         | destination, the accompanying size is 5, if the pointer points
         | 4 bytes before the end the accompanying size is 4, etc.
         | 
         | This is a much more difficult problem, and as soon as the
         | analyzer fails to capture this information, it appears that the
         | safe function a) might not be called safely and b) might
         | overwrite the following members of the struct.
         | 
         | a) is a false positive, and b) generally implies tons of false
         | positives in the remainder of the analysis.
         | 
         | (In this discussion I assume that you want to allow a call to a
         | memory function to access several members of a struct. You can
         | also choose to forbid this, but then you run into a different
         | problem, which is that C programs do this on purpose more often
         | than you'd think.)
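         | 
         | A sketch of the struct-member case described above, assuming
         | the stecpy posted earlier in the thread. struct record and
         | set_name are hypothetical names. The end pointer is computed
         | once from the member itself, so it is visible at the call
         | site that tag cannot be written.
         | 
```c
#include <assert.h>
#include <stddef.h>

/* stecpy as posted earlier in the thread: copy s into [d, e), stop at e. */
static char *stecpy(char *d, const char *s, const char *e)
{
    while (d < e && *s)
        *d++ = *s++;
    if (d < e)
        *d = '\0';
    return d;
}

/* Hypothetical record type, used only for illustration. */
struct record {
    char name[8];
    char tag[4];
};

/* Fill r->name. The end pointer is where the member stops, so nothing
   at or after r->tag can be modified by this call. */
static char *set_name(struct record *r, const char *s)
{
    assert(r != NULL && s != NULL);
    const char *end = r->name + sizeof r->name;
    return stecpy(r->name, s, end);
}
```
         | 
         | A return value equal to the end pointer means the member was
         | filled completely and holds no terminator, which is how
         | stecpy as posted behaves.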
        
         | doublesCs wrote:
         | What's wrong with:
         | 
         |     *p += sprintf(*p, "hello");
         |     *p += sprintf(*p, "world");
        
           | ftvy wrote:
           | It looks like you'd be dereferencing the pointer p, but you'd
           | also need to make sure that what p points to has enough
           | memory.
        
           | pjscott wrote:
           | That could lead to buffer overflow.
        
             | doublesCs wrote:
             | When I wrote that, I had in mind the observation about
             | continued recalculation of buffer len. My suggestion has no
             | such thing. It looks so good that I imagine this was
             | probably how it was intended to be used. With that in mind,
             | isn't it the user's job to know the size of the buffers
             | he's using? Doesn't expecting that the function know about
             | buffer size go against the single responsibility principle?
             | 
             | I'm new to C, in case you couldn't tell.
        
               | clarry wrote:
               | > With that in mind, isn't it the user's job to know the
               | size of the buffers he's using?
               | 
               | Yes. The user knows the size of his buffer, and then
               | passes that knowledge on to the string constructing
               | functions so that they do not overflow the buffer.
               | 
               | > Doesn't expecting that the function know about buffer
               | size go against the single responsibility principle?
               | 
               | What's single responsibility again? "Execute this one
               | assembly instruction"?
               | 
               | What you want from standard library functions is,
               | usually, "construct a string into this buffer (whose size
               | is N)."
        
               | pascal_cuoq wrote:
               | The problem in practice is that you do not write "hello"
               | and "world" to the destination buffer. You write data
               | that is computed more or less directly from user inputs.
               | Often a malicious user.
               | 
               | So the user only needs to find a way to make the data
               | longer than the developer expected. This may be very
               | simple: the developer may have written a screensaver to
               | accept 20 characters for a password, because who has a
               | longer password than this? Everyone knows that only the
               | first 8 characters matter anyway. (This may have been
               | literally true a long time ago, I think, although it's
               | terrible design. Anyway only 8 characters of hash were
               | stored, so in a sense characters after the first 8 did
               | not buy you as much security as the first 8, even if it
               | was not literally true.)
               | 
               | And this is how there were screensavers that, when you
               | input ~500 characters into the password field, would
               | simply crash and leave the applications they were hiding
               | visible and ready for user input. This is an actual
               | security bug that has happened in actual Unix
               | screensavers. The screensavers were written in C.
               | 
               | And long story short, we have been having the exact same
               | problem approximately once a week for the last 25 years.
               | Many people agree that it is urgent to finally fix this,
               | especially as the consequences are getting worse and
               | worse as computers are more connected.
               | 
               | One solution that some favor is functions that make it
               | easier not to overflow buffers because you tell them the
               | size of the buffer instead of trying to guess in advance
               | how much is enough for all possible data that may be
               | written in the buffer. This is the thing being discussed
               | in this thread. The function sprintf is not a contender
               | in this discussion. The function snprintf could be, if
               | used wisely, but it is a bit unwieldy and the OP's
               | proposal has a specific advantage: you compute the end
               | pointer only once, because this is the invariant.
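               | 
               | One way snprintf might be wrapped to get the same
               | invariant, sketched under the assumption that silent
               | truncation is acceptable. sepprintf is a hypothetical
               | name; the remaining size is derived from the end pointer
               | inside the helper, so callers never recompute it.
               | 
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not a standard function): format into [d, e) and
   return the new write position. On truncation the result is clamped to
   e - 1, which then holds the terminator, so later calls add nothing. */
static char *sepprintf(char *d, char *e, const char *fmt, ...)
{
    assert(d <= e);
    if (d == e)
        return d;                          /* zero-sized buffer: nothing to do */
    size_t room = (size_t)(e - d);         /* computed here, not by the caller */

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(d, room, fmt, ap);
    va_end(ap);

    if (n < 0) {                           /* encoding error: leave an empty string */
        *d = '\0';
        return d;
    }
    if ((size_t)n >= room)                 /* truncated: terminator sits at e[-1] */
        return e - 1;
    return d + n;
}
```
               | 
               | Calls chain as p = sepprintf(p, end, ...); once the
               | buffer is full, p stays at end - 1.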
        
           | wahern wrote:
           | Perhaps you meant snprintf. But snprintf can fail on
           | allocation failure, fail if the buffer size is > INT_MAX, and
           | in general isn't very light weight--last time I checked
           | glibc, snprintf was a thin wrapper around the printf
           | machinery and is not for the faint of heart--e.g.
           | initializing a proxy FILE object, lots of malloc interspersed
           | with attempts to avoid malloc by using alloca.
           | 
           | It can also fail on bad format specifiers--not directly
           | relevant here except that it forces snprintf to have a
           | signed return value, and mixing signed (the return value) and
           | unsigned (the size limit parameter) types is usually bad
           | hygiene, especially in interfaces intended to obviate buffer
           | overflows.
        
           | spc476 wrote:
           | Well, that should be `snprintf()` to start with, but even
           | with that, there are issues. The return type of `snprintf()`
           | is `int`, so it can return a negative value if there was some
           | error, so you have to check for that case. That out of the
           | way, a positive return value is (and I'm quoting from the man
           | page on my system) "[i]f the output was truncated due to this
           | limit then the return value is the number of characters which
           | would have been written to the final string if enough space
           | had been available." So to safely use `snprintf()` the code
           | would look something like:
           | 
           |     int size = snprintf(NULL,0,"some format string blah blah ...");
           |     if (size < 0) error();
           |     if (size == INT_MAX)
           |       error(); // because we need one more byte to store the NUL byte
           |     size++;
           |     char *p = malloc(size);
           |     if (p == NULL)
           |       error();
           |     int newsize = snprintf(p,size,"some format string blah blah ...");
           |     if (newsize < 0) error();
           |     if (newsize >= size)
           |     {
           |       // ... um ... we still got truncated?
           |     }
           | 
           | Yes, using NULL with `snprintf()` if the size is 0 is allowed
           | by C99 (I just checked the spec).
           | 
           | One thing I've noticed about the C standard library is that
           | it seems averse to functions allocating memory (outside of
           | `malloc()`, `calloc()` and `realloc()`). I wonder if this has
           | something to do with embedded systems?
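           | 
           | The two-pass pattern above can be folded into one helper.
           | This is a sketch, not a vetted implementation; format_alloc
           | is a hypothetical name (some C libraries offer a similar
           | non-standard asprintf). Sizes are computed in size_t, on the
           | assumption that size_t can hold INT_MAX + 1.
           | 
```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: measure, allocate, then format.
   Returns a malloc'd string the caller must free, or NULL on error. */
static char *format_alloc(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                      /* each pass needs its own va_list */
    int n = vsnprintf(NULL, 0, fmt, ap);   /* length without the terminator */
    va_end(ap);

    char *p = NULL;
    if (n >= 0 && (p = malloc((size_t)n + 1)) != NULL) {
        int m = vsnprintf(p, (size_t)n + 1, fmt, ap2);
        if (m != n) {                      /* second pass disagreed with the first */
            free(p);
            p = NULL;
        }
    }
    va_end(ap2);
    return p;
}
```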
        
         | msebor wrote:
         | There are many improved versions of string APIs out there, too
         | many in fact to choose from, and most suffer from one flaw or
         | another, depending on one's point of view. Most of my recent
         | proposals to incorporate some that do solve some of the most
         | glaring problems and that have been widely available for a
         | decade or more and are even parts of other standards (POSIX)
         | have been rejected by the committee. I think only memccpy,
         | strdup, and strndup were added for C2X. (See
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm for
         | an overview.)
        
           | AceJohnny2 wrote:
           | > _Most of my recent proposals [...] have been rejected by
           | the committee._
           | 
           | Does anyone have insight on why?
        
       | [deleted]
        
       | clarry wrote:
       | 1. Are there any plans for standardizing empty initializer lists?
       | 
       |     struct foo { int a; void *p; };
       | 
       |     struct foo f = {0}; // legal C, f.p initialized like a static variable
       |     struct foo f = {};  // not legal but supported by gcc
       | 
       | To me it would make sense that there is no need to specify a
       | value for any of the members that are intended to be initialized
       | exactly like static variables (and the first member is not
       | special so I shouldn't have to explicitly assign a zero?).
       | However the syntax currently demands at least one initializer.
       | 
       | --
       | 
       | 2. I recall seeing a proposal for allowing declarations after
       | case labels:
       | 
       |     switch (foo) {
       |     case 1:
       |         int var;
       |         // ...
       |     }
       | 
       | This is currently not allowed and you'd have to wrap the lines
       | after case in braces, or insert a semicolon after the case label.
       | Is this making it to c2x?
       | 
       | --
       | 
       | 3. I've run into some recent controversy w.r.t. having multiple
       | functions called main (and this has come up in production code).
       | In particular, I ran into a program that has a static main()
       | function (with parameters that are not void or int and
       | char *[]), which is not intended to be _the_ main function that
       | is the program's entry point.
       | 
       | gcc warns about this because the parameters disagree with what's
       | prescribed for the program entry point. It's not clear to me
       | whether this is intended to be legal or not.
       | 
       | --
       | 
       | 4. Looking at the requirements for main brings up another
       | question: it says how main should be defined (no static or extern
       | keyword). However, the definition could be preceded by a static
       | declaration, which then affects the definition that follows:
       | 
       |  _If the declaration of an identifier for a function has no
       | storage-class specifier, its linkage is determined exactly as if
       | it were declared with the storage-class specifier extern._
       | 
       |  _For an identifier declared with the storage-class specifier
       | extern in a scope in which a prior declaration of that identifier
       | is visible, if the prior declaration specifies internal or
       | external linkage, the linkage of the identifier at the later
       | declaration is the same as the linkage specified at the prior
       | declaration._
       | 
       | Therefore, it is possible to have a main function with internal
       | linkage and a definition that exactly matches the one given in
       | the spec:
       | 
       |     static int main(int, char *[]);
       |     int main(int argc, char *argv[]) { /* ... */ }
       | 
       | As one might guess, this program doesn't make it through the
       | linker when compiled with gcc. Is this supposed to be legal?
       | Should the spec perhaps require main to have external linkage,
       | and then allow other functions called main with internal linkage
       | (and parameters that do not match what is required of the
       | external one)?
       | 
       | EDIT: ---
       | 
       | Are the fixes w.r.t. reserved identifiers going to make it in
       | c2x? Can I finally have a function called toilet() without
       | undefined behavior?
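       | 
       | For reference, a small sketch of what is valid today for points
       | 1 and 2: {0} initialization, and a braced block after a case
       | label. make_foo and classify are hypothetical names.
       | 
```c
#include <assert.h>
#include <stddef.h>

struct foo { int a; void *p; };

/* {0} names only the first member; the remaining members are
   initialized as if they had static storage duration. */
static struct foo make_foo(void)
{
    struct foo f = {0};
    assert(f.a == 0 && f.p == NULL);
    return f;
}

/* Wrapping the statements after the label in braces gives the
   declaration a block to live in. */
static int classify(int x)
{
    switch (x) {
    case 1: {
        int var = x * 10;
        return var;
    }
    default:
        return -1;
    }
}
```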
        
       | potiuper wrote:
       | Any plans to add semantics for exceptional situations such as
       | divide by zero and dereferencing a null pointer?
       | https://blog.regehr.org/archives/232
       | 
       | Or incorporating features from this 14 item list?
       | https://blog.regehr.org/archives/1180
       | 
       | As it appears these have failed:
       | https://blog.regehr.org/archives/1287
        
         | DougGwyn wrote:
         | The problem is that if the checks are always performed, the
         | object code is significantly slowed down. If all computers
         | supported the checking in hardware, then we could do it. You
         | don't really want the current C approach (signal) to trigger
         | except in an emergency, because there is no way to insert
         | cleanup/retry/etc. recovery code via a signal handler.
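         | 
         | Absent language-level semantics, the check can be written
         | explicitly where it matters. A minimal sketch; checked_div is
         | a hypothetical helper and covers only int division.
         | 
```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: divide only when the result is defined.
   Returns false for division by zero and for INT_MIN / -1, the two
   cases in which `/` on int has undefined behavior. */
static bool checked_div(int num, int den, int *out)
{
    assert(out != NULL);
    if (den == 0 || (num == INT_MIN && den == -1))
        return false;
    *out = num / den;
    return true;
}
```
         | 
         | The cost is two comparisons per call, so one option is to use
         | such a helper only where the operands are untrusted.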
        
         | rseacord wrote:
         | I don't know of any plans to add semantics for divide-by-zero
         | or dereferencing a null pointer. I'm guessing this is not
         | viable because there are no agreed-upon semantics among
         | different implementations.
         | 
         | Making C friendlier is always a good idea, and I think the
         | committee is (slowly) working towards this goal. I would have
         | to examine these papers by John Regehr in more detail. Looking
         | quickly at his proposals, I can see why he couldn't find
         | consensus for these ideas, as some of them do appear
         | controversial.
         | 
         | An example of a friendly dialect of C is C0
         | (C-naught) from CMU. I don't think I'm exaggerating when I say
         | that this language has not "caught on".
        
       | rurban wrote:
       | 1. When will we get proper strings in the stdlib?
       | 
       | 2. When will we get the Secure Annex K extensions?
       | 
       | 3. When will we get mandatory warnings when the compiler decides
       | to throw away statements it thinks it doesn't need? Like memset
       | or assignments. Compilers are getting worse and worse, and
       | certainly not better.
       | 
       | ad 1) Strings are Unicode nowadays, not ASCII. Nobody uses wchar
       | but Microsoft. Everybody else is using utf8, but there's nothing
       | in the standard. Not even search functions with proper casing
       | rules and normalization. Searching for strings should be basic
       | enough.
       | 
       | 2. The usual glibc answer is just bollocks. You either do
       | compile-time bounds checks or you don't. But when you don't, you
       | have to do it at runtime. So it's either the compiler's job or
       | the stdlib's job. But certainly not the user's.
        
         | rseacord wrote:
         | For (2) I guess it depends. Annex K is obviously already a part
         | of the standard so it depends on the implementation. There is a
         | push to eliminate Annex K altogether from the C Standard. If
         | this push fails, it may be the case that more libraries will
         | add support for this optional feature of the language. In the
         | meanwhile, there is the Open Watcom compiler implementation
         | [1], the Safe C Library [2], and Slibc [3].
         | 
         | [1] Watcom C Library Reference Version 1.8. Open Watcom. 2008.
         | ftp://ftp.openwatcom.org/manuals/current/clib.pdf
         | 
         | [2] Safe C Library -- A full implementation of Annex K
         | https://github.com/rurban/safeclib/
         | 
         | [3] slibc https://code.google.com/archive/p/slibc/
        
         | rseacord wrote:
         | For (3) mandatory warnings the closest thing is probably
         | ISO/IEC TS 17961:2013. The purpose of ISO/IEC TS 17961 is to
         | establish a baseline set of requirements for analyzers,
         | including static analysis tools and C language compilers, to be
         | applied by vendors that wish to diagnose insecure code beyond
         | the requirements of the language standard. All rules are meant
         | to be enforceable by static analysis. The criterion for
         | selecting these rules is that analyzers that implement these
         | rules must be able to effectively discover secure coding errors
         | without generating excessive false positives.
        
         | rseacord wrote:
         | Going to try to answer these separately. For (1) if you mean
         | strings that are primitive types, my guess is never. We had an
         | hour-long discussion on this topic at a London meeting where we
         | were discussing new features for C11, and my takeaway was that this
         | would never happen because it would require a significant
         | change to the memory model for the language.
        
           | rurban wrote:
           | For the u8 type sure. Nobody needs a new type.
           | 
           | But at least wcsnorm and wcsfc, as I implemented them in
           | safeclib, are required. Not even coreutils, grep, awk, ...
           | can search unicode strings.
           | 
           | And u8 library variants of str* and wcs* are definitely
           | needed, maybe just with uchar* not char*.
        
             | DougGwyn wrote:
             | Why would the utilities not handle unicode searching?
             | Unicode characters match properly, the null terminator
             | works the same, and non-ANSI codes are just one or more
             | random 8-bit values which can be compared, copied, etc.
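             | 
             | A sketch of where byte-wise matching works and where it
             | does not, which may be the gap rurban is pointing at: the
             | same visible character can have more than one encoding.
             | The helper contains is hypothetical; the byte sequences
             | are the UTF-8 forms of U+00E9 and of U+0065 followed by
             | U+0301.
             | 
```c
#include <assert.h>
#include <string.h>

/* "e with acute accent" in two canonically equivalent spellings. */
static const char E_ACUTE_NFC[] = "\xC3\xA9";   /* U+00E9, precomposed */
static const char E_ACUTE_NFD[] = "e\xCC\x81";  /* U+0065 U+0301, decomposed */

/* Byte-wise search: returns 1 if needle occurs in haystack, else 0. */
static int contains(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    return strstr(haystack, needle) != NULL;
}
```
             | 
             | Text stored in the precomposed form is found by a
             | precomposed pattern, but the decomposed spelling of the
             | same word does not match until both sides are normalized.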
        
       | [deleted]
        
       | dboon wrote:
       | What are two or three C codebases that are elegantly and cleanly
       | written, and that every mid-level C programmer should read for
       | sake of knowledge?
        
         | pascal_cuoq wrote:
         | I would recommend musl, although the style is a bit
         | idiosyncratic in places: https://www.musl-libc.org
         | 
         | Mbed TLS, since I have it in mind from another thread, is also
         | a pretty clean C library for the problem it tries to solve;
         | it's a testament to its design that we (TrustInSoft, who had
         | not participated in its development) were able to verify that
         | some uses of the library were free of Undefined Behavior:
         | https://tls.mbed.org
        
           | uasm wrote:
           | > "I would recommend musl, although the style is a bit
           | idiosyncratic in places: https://www.musl-libc.org"
           | 
           | Opened a random part of musl out of sheer boredom. Here's
           | what I see:
           | 
           | https://git.musl-libc.org/cgit/musl/tree/include/aio.h
           | 
           | A bunch of return codes #defined like so (see
           | https://git.musl-libc.org/cgit/musl/tree/src/aio/aio.c):
           | 
           |     #define AIO_CANCELED 0
           |     #define AIO_NOTCANCELED 1
           |     #define AIO_ALLDONE 2
           | 
           |     #define LIO_READ 0
           |     #define LIO_WRITE 1
           |     #define LIO_NOP 2
           | 
           |     #define LIO_WAIT 0
           |     #define LIO_NOWAIT 1
           | 
           | Why weren't they using an enum instead? I wouldn't sign off
           | on this code (and I don't think it lives up to best
           | practices).
        
             | pdw wrote:
             | musl is implementing POSIX. POSIX requires those constants
             | to be preprocessor defines. (Generally, musl assumes the
             | reader is quite familiar with the C and POSIX standards,
             | which makes sense since it's a libc implementation.)
        
       | rvp-x wrote:
       | A lot of you seem to be working on commercial solutions to C's
       | insecurity. Does this feel like a conflict of interest to you?
        
         | pascal_cuoq wrote:
         | I have been told in this very AMA that I lacked enthusiasm
         | about C (and the gratuitous insecurity of the language when we
         | know that a well-designed type system and a few runtime checks
         | solve the problem entirely is indeed the reason for my
         | perceived lack of enthusiasm):
         | https://news.ycombinator.com/item?id=22865912
         | 
         | I hope that this perceived lack of enthusiasm means I am
         | handling the conflict of interest honorably.
        
         | rseacord wrote:
         | Good question, but not at all! I've been working as hard as I
         | can for the past 15 years to improve C Language security as
         | have other security-minded members of the committee. Generally
         | speaking, we are in the minority as performance is still the
         | major driver for the language. Any security solution that
         | introduces > 5% overhead, for example, is a nonstarter. I think
         | we all understand that our jobs are completely safe no matter
         | what security improvements we can get adopted.
         | 
         | The committee works a lot like lobbying. A minority of people with a
         | large financial interest in the technology (such as compiler
         | writers) have undue influence because they participate in the
         | process. I always encourage C language users to take a more
         | active role, but they usually don't. Cisco is an example of a
         | user community that actively takes part in C standardization.
        
           | pjmlp wrote:
           | I guess this is why vendors like Apple, Oracle, ARM and
           | Google end up going the hardware memory tagging route
           | instead.
        
       | WalterBright wrote:
       | I wrote about a simple addition to C that could eliminate most
       | buffer overflows:
       | 
       | https://www.digitalmars.com/articles/C-biggest-mistake.html
       | 
       | I.e. offering a way that arrays won't automatically decay to
       | pointers when passed as a function parameter.
        
         | quelsolaar wrote:
         | Arrays are pointers. If they aren't pointers then you need to
         | copy the data when you are giving an array as a function
         | parameter. That's a lot slower. Being able to prepare a set of
         | data in an array and then giving a pointer to a function is
         | very useful. You could add a second type of array on top of
         | what you have in C that includes more stuff, but if that's what
         | you want you can implement that yourself with a struct.
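         | 
         | A sketch of the do-it-yourself struct mentioned above,
         | assuming int elements. struct int_slice, SLICE_OF and
         | slice_get are hypothetical names; SLICE_OF must be given a
         | real array, not a pointer.
         | 
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the length travels with the data,
   and passing it by value copies two words, not the elements. */
struct int_slice {
    int   *ptr;
    size_t len;
};

/* Build a slice while the array's size is still known to the compiler. */
#define SLICE_OF(a) ((struct int_slice){ (a), sizeof(a) / sizeof((a)[0]) })

/* Bounds-checked read: returns 0 and stores the element on success,
   -1 if the index is out of range. */
static int slice_get(struct int_slice s, size_t i, int *out)
{
    assert(out != NULL);
    if (i >= s.len)
        return -1;
    *out = s.ptr[i];
    return 0;
}
```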
        
           | napsy wrote:
           | An array is not a pointer. These are completely different
           | data types. For example, you can't apply pointer arithmetic
           | to arrays without casting them to pointers.
        
             | WalterBright wrote:
             | That's right. They are converted to pointers when passed to
             | a function, even if the function declares the parameter as
             | an array.
        
               | napsy wrote:
               | They're not converted but can be implicitly casted to
               | pointer types.
        
               | _kst_ wrote:
               | No, they're converted. There is no such thing as an
               | "implicit cast". And it's not specific to arguments in
               | function calls.
               | 
               | Array types and pointer types are distinct.
               | 
               | An expression of array type is, in most but not all
               | contexts, implicitly converted (really more of a compile-
               | time adjustment) to an expression of pointer type that
               | yields the address of the 0th element of the array
               | object. The exceptions are when the array expression is
               | the operand of a unary & (address-of) or sizeof operator,
               | or when it's a string literal in an initializer used to
               | initialize an array (sub)object. (The N1570 draft
               | incorrectly lists _Alignof as another exception. In fact,
               | _Alignof can only take a parenthesized type name as its
               | operand.)
               | 
               | If you do:
               | 
               |     int arr[10];
               |     some_func(arr);
               | 
               | then arr is "converted" to the equivalent of &arr[0] --
               | not because it's an argument in a function call, but
               | because it's not in one of the three contexts listed
               | above in which the conversion doesn't take place.
               | 
               | Another rule that causes confusion here is that if you
               | define a function parameter with an array type, it's
               | treated as a pointer parameter. For example, these
               | declarations are exactly equivalent:
               | 
               |     void func(int arr[]);
               |     void func(int arr[42]); // the 42 is quietly ignored
               |     void func(int *arr);
               | 
               | Suggested reading: http://www.c-faq.com/, particularly
               | section 6, "Arrays and Pointers".
               | 
               | A conversion converts a value of one type to another type
               | (possibly the same one). The term "cast" refers only to
               | an explicit conversion, one specified by a cast operator
               | (a parenthesized type name preceding the expression to be
               | converted, like "(double)42"). An implicit conversion is
               | one that isn't specified by a cast operator.
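               | 
               | The rules above can be checked mechanically. A small
               | sketch using C11 _Generic; the function names are
               | hypothetical.
               | 
```c
#include <assert.h>

/* Declared with an array type, but the parameter is adjusted to
   `int *`, so the pointer association is the one selected. */
static int param_is_pointer(int arr[10])
{
    return _Generic(arr, int *: 1, default: 0);
}

static void array_contexts(void)
{
    int arr[10] = {0};
    int *first = arr;           /* array expression converted to &arr[0] */
    int (*whole)[10] = &arr;    /* operand of unary &: no conversion */

    assert(first == &arr[0]);
    assert(sizeof arr == 10 * sizeof(int));   /* operand of sizeof: no conversion */
    assert(sizeof *whole == sizeof arr);      /* &arr points to the whole array */
    assert(param_is_pointer(arr));
}
```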
        
             | JoeAltmaier wrote:
             | Sure you can. int aFoo[]; has many legal array operations
             | possible:
             | 
             |     *(aFoo+3)
             | 
             | should work fine and return the 4th int in the array.
        
             | quelsolaar wrote:
             | they are accessed using pointer arithmetic, if you wanted
             | them to contain length data, you would need a different
             | access pattern. I think one of the great features of C is
               | that it doesn't do anything under the hood, it's all
             | explicit. If you want to bounds check, then do it.
        
               | WalterBright wrote:
               | > they are accessed using pointer arithmetic
               | 
               | Not always. Consider:
               | 
               |     int a[3];
               |     a[1] = 2;
               | 
               | This is not using pointer arithmetic. Dump the generated
               | code if you don't believe me :-)
        
               | quelsolaar wrote:
               | It's still pointer arithmetic; it's just done at compile time
               | rather than at execution. Still, you deserve style points
               | :-)
        
       | seamyb88 wrote:
       | Thoughts on Gnome glib, gobject, vala etc?
       | 
       | I tend to use glib for my (academic) code for pretending C is a
       | high-level language. It also seems to make up for implementation-
       | dependent functions in C and many portability issues. Also, IMO,
       | vala > C++.
       | 
       | My question is, really, are there any other tools for high-level
       | C programming and do you know of any disadvantages of the Gnome
       | stack?
        
       | tayistay wrote:
       | To what extent does compiler complexity factor into your thinking
       | about the evolution of C?
       | 
       | Thanks for this!
        
         | AaronBallman wrote:
         | When the committee considers proposals, we do consider the
         | implementation burden of the proposal as part of the feature.
         | If parts of the proposal would be an undue burden for an
         | implementation, the committee may request modifications to the
         | proposal, or justification as to why the burden is necessary.
        
           | tayistay wrote:
           | Thanks. Do you have an example of a proposal that the
           | committee considered an undue burden for an implementation
           | but was otherwise sound?
        
             | AaronBallman wrote:
             | Not off the top of my head, but as an example along similar
             | lines, when talking about whether we could realistically
             | specify twos complement integer representations for C2x, we
             | had to determine whether this would require an
             | implementation to emulate twos complement in order to
             | continue to support C. Such emulation might have been too
             | much of a burden for an implementation's users to bear for
             | performance reasons and could have been a reason to not
             | progress the proposal.
        
       | floatms wrote:
       | 1. How likely are named constants of any types to be included in
       | C2x? I'm referring to the idea of making register const values be
       | usable in constant expressions.
       | 
       | 2. Is there, or was there ever a proposal to make struct types
       | without a tag be structurally typed? This would not break
       | backwards compatibility as far as I can see, and would make these
       | types much more useful as ad-hoc bags of data. Small example:
       | 
       |     struct {size_t size; void *data;} data = get_data();
       |     int hash = hash_data(data);
       | 
       | I believe there was at least one proposal about error handling
       | that more or less relied on the above to be valid semantically.
       | 
       | 3. Is there any interest in making the variadic function
       | interface a bit nicer to use? I would like to bring back an old
       | feature and have an intrinsic to extract a pointer from the
       | variadic parameter list, so that we can iterate over it ourselves
       | (or even index directly).
       | 
       |     void *arg_ptr = va_ptr(last);
       | 
       | More out there would be a parameter that would be implicitly
       | passed to a variadic function to indicate the number of
       | arguments.
       | 
       |     void variadic(..., va_size count) { }
       | 
       |     variadic(10, 20, 30); // count would be three
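       | 
       | For comparison, a sketch of what is possible today: the count is
       | passed explicitly, and a macro can compute it at the call site.
       | sum_ints and SUM_INTS are hypothetical names, and the macro
       | assumes every argument is an int.
       | 
```c
#include <assert.h>
#include <stdarg.h>

/* Today's idiom: the caller passes the count, because <stdarg.h>
   offers no way to ask how many arguments were supplied. */
static int sum_ints(int count, ...)
{
    assert(count >= 0);
    va_list ap;
    int total = 0;
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Counts the arguments by sizing an unevaluated int[] compound literal,
   so call sites need not repeat the count. Requires at least one argument. */
#define SUM_INTS(...) \
    sum_ints((int)(sizeof (int[]){ __VA_ARGS__ } / sizeof(int)), __VA_ARGS__)
```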
        
         | pascal_cuoq wrote:
         | 3. would have to be a new mechanism for variadic functions,
         | that would have to be distinguished in header files from the
         | old mechanism with which it is incompatible. So this proposal
         | would imply some new keyword or syntax. I am not in the
         | committee, but I don't think this is going to happen. The
         | improvement is way too incremental to force a new syntax.
         | 
         | (The committee is fine with incremental improvements, but new
         | syntax needs to have strong motivation behind it, much stronger
         | than this.)
        
           | floatms wrote:
           | Yes, I know that this is the most disruptive out of the
           | three. The implicit parameter more so than the va_ptr()
           | intrinsic (in my opinion), but I understand that changes like
           | these are not very well motivated (except for a slightly
           | nicer developer experience).
        
         | uecker wrote:
         | (disclaimer: also a WG14 member)
         | 
         | 1. I want this too.
         | 
         | 2. Here is my proposal:
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2366.pdf
         | 
         | 3. Yes, variadic functions should be improved.
        
         | msebor wrote:
         | I'd expect a proposal for (1) to be well received. The only
         | proposal I recall that deals with (2) is
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2067.pdf. I
         | think it's still being discussed. (3) is highly unlikely if it
         | involved ABI changes. Even if it could be done without such
         | changes, unless there is a precedent for it in an existing
         | compiler (and preferably more), it would likely be a tough sell.
        
           | floatms wrote:
           | Is the linked proposal really dealing with unnamed struct
           | types? I skimmed it and it seems like it is dealing with
           | named constants. Also, is there a proposal for (1) currently,
           | or is someone planning on writing one? Regarding (3), yes,
           | this one was mostly wishful thinking.
        
       | oldiob wrote:
       | Is the committee planning on working on the preprocessor? I don't
       | see any reason for not boosting it. It's time for C to have real
       | meta-programming. Would be nice to have local macros that are
       | scoped.
       | 
       | On another note:
       | 
       | - Official support for __attribute__
       | 
       | - void pointers should offset the same size as char pointers.
       | 
       | - typeof (can't stress this one enough)
       | 
       | - __VA_OPT__
       | 
       | - inline assembly
       | 
       | - range designated initializer for arrays
       | 
       | - some GCC/Clang builtins
       | 
       | - for-loop once (Same as for loop, but doesn't loop)
       | 
       | Finally, stop putting C++ craps into C.
        
         | jparkie wrote:
         | +1 for Modern Metaprogramming.
         | 
         | I know some people are against metaprogramming because they
         | believe the abstractions hide the intrinsic of how the
         | underlying code will execute, but I would love to write
         | substantial tests in C without relying on FFI to Python or C++
         | to perform property-based testing, complex fuzzing, and
         | whatever. I feel metaprogramming would be a huge boon for C
         | tooling and developer productivity.
        
           | oldiob wrote:
           | In my point of view, there's a difference between abstraction
           | created by the language, e.g. lambdas or virtual table in
           | C++, and abstraction created by the programmers via the CPP.
           | 
           | The former is compiler dependent and you cannot know how it's
           | implemented. The latter is simple text substitution and
           | you're the one implementing it. I often find myself creating
           | small embedded languages in CPP for making abstraction, and I
           | know exactly what C code it's going to generate and thus the
           | penalty if there's any.
           | 
           | People that are afraid of the preprocessor simply don't
           | understand how powerful it is in good hands.
        
       | blocks_plz wrote:
       | Thanks for the AMA
       | 
       | 1. Will the Apple's Blocks extension, which allows creation of
       | Closures and Lambda functions, be included in C2X?
       | 
       | 2. Are there any plans to improve the _Generic interface (to make
       | it easy to switch on multiple arguments, etc.)?
        
         | AaronBallman wrote:
         | > 1. Will the Apple's Blocks extension, which allows creation
         | of Closures and Lambda functions, be included in C2X?
         | 
         | We haven't seen a proposal to add them to C2x, yet. However,
         | there has been some interest within the committee regarding the
         | idea, so I think such a proposal could have some support.
         | 
         | > 2. Are there any plans to improve the _Generic interface (to
         | make it easy to switch on multiple arguments, etc.)?
         | 
         | I haven't seen any such plans, but there is some awareness that
         | _Generic can be hard to use, especially as you try to compose
         | generic operations together.
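
To illustrate why switching on several arguments is awkward, here is a small
sketch of the usual workaround, nesting one _Generic inside another. The enum
and the PAIR_KIND macro are hypothetical names, and only int and double are
handled; each extra type multiplies the number of associations.

```c
/* Hypothetical result codes for the four supported type combinations. */
enum pair_kind { INT_INT, INT_DOUBLE, DOUBLE_INT, DOUBLE_DOUBLE };

/* Dispatch on the type of (a) first, then on the type of (b).
   Every inner selection must list every type (b) may have, because all
   branches of the outer selection are still type-checked. */
#define PAIR_KIND(a, b)                                   \
    _Generic((a),                                         \
        int:    _Generic((b), int:    INT_INT,            \
                              double: INT_DOUBLE),        \
        double: _Generic((b), int:    DOUBLE_INT,         \
                              double: DOUBLE_DOUBLE))
```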
        
         | yvdriess wrote:
         | +1 for the first point. Every major compiler can do the lambda-
         | lifting transformation, either because of C++ lambda or OpenMP
         | support. It's frustrating doing this manually while knowing the
         | compiler supports it internally, but does not expose it
         | natively.
        
       | beefhash wrote:
       | C has been making strides towards complete Unicode support. I've
       | been having trouble following along though: Am I correct in
       | assuming that there's no _actual_ multi-byte UTF-8 to UTF-32 Rune
       | function and the best approximation depends on whatever wchar_t
       | is? How would I best handle pure Unicode input and output
       | scenarios on a  "hostile" OS whose native character encoding is
       | some EBCDIC abomination or a Windows codepage?
        
         | loeg wrote:
         | Probably link libicu rather than rely on libc.
        
           | rurban wrote:
           | libicu is a 40MB mess where you need only 5Kb of it. Only
           | case folding and one normalization is needed, with tiny
           | tables.
           | 
           | Additionally the used UNICODE_MAJOR and _MINOR are needed.
           | They are always years behind, and you never know which tables
           | versions are implemented.
        
         | moonchild wrote:
         | Converting arrays of utf8-encoded char to arrays of
         | utf32-encoded 'rune' would probably not do what you want. That
         | still leaves e.g. combining diacritical marks as separate from
         | the characters they modify. If you care about breaking up text
         | into codepoints, you probably also care about that sort of
         | thing. The base unit of unicode is the extended grapheme
         | cluster. In order to actually convert text into extended
         | grapheme clusters, however, you need to have a database that
         | tells you what kind of codepoint each codepoint is. Since c is
         | standardized less frequently than unicode, any kind of unicode
         | or utf support from the specification would quickly get out of
         | date.
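
For the narrower task of turning UTF-8 bytes into code points without relying
on wchar_t or the locale, a hand-written decoder is one option. This is a
minimal sketch with a hypothetical name (utf8_decode); it yields code points
only and, as noted above, does nothing about grapheme clusters or
normalization.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one UTF-8 sequence from s (at most n bytes) into *out.
   Returns the number of bytes consumed, or 0 if the input is empty,
   truncated, overlong, a surrogate, or above U+10FFFF. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    size_t len;
    uint32_t cp, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {                    /* 0xxxxxxx: ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {   /* 110xxxxx: 2 bytes */
        len = 2; cp = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {   /* 1110xxxx: 3 bytes */
        len = 3; cp = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {   /* 11110xxx: 4 bytes */
        len = 4; cp = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;                         /* stray continuation or invalid lead */
    }
    if (n < len)
        return 0;
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)        /* must be 10xxxxxx */
            return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *out = cp;
    return len;
}
```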
        
       | jasonhansel wrote:
       | Can/should the C language be extended to better support vector
       | processors and GPGPU?
        
       | hedora wrote:
       | I frequently rely on reading and writing uninitialized struct
       | padding in code that compare-and-swaps the underlying struct
       | representation with some (up to 128bit) integer.
       | 
       | I could use a union type, but that adds extra memory operations,
       | and is finicky.
       | 
       | Is there a better way?
        
       | parenthesis wrote:
       | Could we have variadic macros with zero arguments in the
       | standard? I'm not using any compiler that doesn't allow it.
        
         | pascal_cuoq wrote:
         | The C standard description does not allow a function that does
         | not have at least one normal argument before the variadic
         | arguments.
         | 
         | Conceptually, something must indicate to the function how many
         | arguments it is supposed to request next, and with what types.
         | Yes, you could write a function where this information is
         | passed through a static-lifetime variable, but in practice the
         | first mandatory argument is almost always used for that anyway.
        
           | david2ndaccount wrote:
           | You're replying to a comment about macros, not about
           | functions.
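
On the macro side, C99 and C11 require at least one argument to match the
`...` of a variadic macro. A common portable workaround, sketched here with
hypothetical names (format_into, FORMAT_INTO), is to fold the last named
parameter into the variadic part so that it can never be empty.

```c
#include <stdarg.h>
#include <stdio.h>

/* Formats into buf like snprintf and returns what vsnprintf returns. */
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return written;
}

/* The format string is part of __VA_ARGS__, so an invocation with no extra
   arguments still supplies one argument for the `...`. */
#define FORMAT_INTO(buf, size, ...) format_into((buf), (size), __VA_ARGS__)
```

Both FORMAT_INTO(buf, sizeof buf, "hello") and
FORMAT_INTO(buf, sizeof buf, "%d-%d", 1, 2) are then valid invocations.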
        
       | emilfihlman wrote:
       | Have you considered adding multiplexing capability to the
       | standard? It would be great to have a directly portable one.
        
         | DougGwyn wrote:
         | We would need a specific proposal and assurance that nearly all
         | computers can efficiently provide that service. It is more
         | likely in the POSIX standard.
        
           | emilfihlman wrote:
           | Though it's interesting that threads were added to the
           | standard. Perhaps though they filled a niche that wasn't as
           | well filled as select/poll/epoll/kqueue/etc had already since
           | pthread api is perhaps harder.
        
             | DougGwyn wrote:
             | I thought it would be best to standardize just a single
             | thread, which should be the basic unit to be embedded in a
             | good parallel-processing model. However, others prevailed.
        
       | om42 wrote:
       | Not particular to the C language, but what are your opinions on
       | build systems, particularly for the embedded space? There's a
       | couple vendor specific embedded IDEs and toolchains and having to
       | glue together make/cmake files to support all of them can be a
       | pain.
        
         | msebor wrote:
         | Robert's upcoming book has a survey of a few popular IDEs.
        
       | asimpletune wrote:
       | Why is shifting by a negative amount undefined?
        
         | kps wrote:
         | Because people want `c = a << b` to compile into `shl c, a, b`
         | and C89 made the giant mistake of calling it 'undefined'
         | instead of 'implementation-defined, possibly fatal'.
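
Code that cannot rule out a bad shift count can check it before shifting. A
minimal sketch, with checked_shl as a hypothetical name, assuming unsigned int
has no padding bits so that its width is sizeof * CHAR_BIT:

```c
#include <limits.h>
#include <stdbool.h>

/* Stores value << amount in *result and returns true, or returns false
   without shifting when amount is negative or not less than the width of
   unsigned int (both cases are undefined behavior for a plain <<). */
bool checked_shl(unsigned int value, int amount, unsigned int *result)
{
    if (amount < 0 || amount >= (int)(sizeof value * CHAR_BIT))
        return false;
    *result = value << amount;
    return true;
}
```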
        
       | hawski wrote:
       | What do you think about Zig language [0] and if you have any
       | opinions on it, what distinguishing features would you like to
       | see adopted in the C world?
       | 
       | [0] https://ziglang.org/
        
       | radford-neal wrote:
       | The syntax used in the following function definition is said to
       | be obsolescent in C11:
       | 
       | int f (a, n) int n; int a[n][n]; { return a[n-1][n-1]; }
       | 
       | How could one define this function without using the obsolete
       | syntax?
        
         | AaronBallman wrote:
         | You couldn't in that parameter order. However, you could do
         | this:
         | 
         |     int f(size_t n, int a[n][n]) { return a[n-1][n-1]; }
         | 
         | (https://godbolt.org/z/DV9c-C)
         | 
         | Btw, that definition was obsolescent in C89 too.
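
A short sketch of how the prototype-style definition is called. The function f
is the one from the comment above; call_f is a hypothetical caller added for
illustration. It assumes an implementation that supports variably modified
parameter types, which C11 makes optional.

```c
#include <stddef.h>

/* n must be declared before a, because the type of a depends on n. */
int f(size_t n, int a[n][n]) { return a[n-1][n-1]; }

/* Hypothetical caller: passes a 3x3 matrix and gets its last element. */
int call_f(void)
{
    int m[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    return f(3, m);
}
```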
        
           | radford-neal wrote:
           | Well, yes. But putting the array argument(s) first is the
           | more natural order, in my opinion. And it is surely odd that
           | only one order is allowed in this context, when otherwise C
           | is happy with changing the order of parameters to be whatever
           | you like.
           | 
           | Plus, of course, there may be existing code using such
           | functions, with parameters in the order that would become
           | impossible if this syntax were disallowed.
        
       | Daemon404 wrote:
       | What has been the rationale or hindrance for not adding
       | locale-independent versions of various stdlib functions?
       | 
       | Practically every second C codebase on earth has their own
       | implementations of these at some point, and it remains a huge
       | problem for e.g. writers of libraries, where you don't know
       | how/where your library will be used.
        
         | msebor wrote:
         | First, there needs to be a proposal for adding a feature (I'm
         | not aware of one having been submitted recently). Second, any
         | non-trivial proposed feature needs to have some existing user
         | experience behind it. For libraries that typically means
         | implementations shipping with operating systems or compilers
         | (but successful third party libraries might also be
         | considered). Finally, it also needs to appeal to people on the
         | committee; that can be quite challenging as well. Many
         | proposals that meet the first two criteria die because they
         | simply don't get enough support within the committee.
        
           | Daemon404 wrote:
           | Sounds mostly like the issue is nobody has bothered to submit
           | a proposal for it then? (There is _so_ much in-the-wild
           | experience and code dealing with this issue, I cannot imagine
           | the second point being problematic.)
           | 
           | On the third point, I have trouble thinking of any technical
           | objections to such proposal.
        
         | rwmj wrote:
         | To clarify, do you mean functions like c_isalpha (part of
         | Gnulib) which is like isalpha but only matches 7 bit ASCII
         | characters?
        
           | Daemon404 wrote:
           | An easy (and problematic) example is decimal separators
           | (radix characters) being parsed or written differently based
           | on locale.
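
To make the radix-character problem concrete, here is a toy sketch of the kind
of helper libraries end up writing: a parser that always treats '.' as the
decimal separator, whatever setlocale() says. The name parse_decimal_c is
hypothetical, and this is not a replacement for strtod: it handles no
exponents, no overflow, and makes no claim of correct rounding for long
inputs.

```c
#include <stdbool.h>

/* Parses [+-]digits[.digits] with '.' as the radix character, independent
   of the current locale. Returns false on empty or trailing input. */
bool parse_decimal_c(const char *s, double *out)
{
    double sign = 1.0, value = 0.0, scale = 1.0;
    bool any_digit = false;

    if (*s == '+' || *s == '-') {
        if (*s == '-')
            sign = -1.0;
        s++;
    }
    for (; *s >= '0' && *s <= '9'; s++) {      /* integer part */
        value = value * 10.0 + (*s - '0');
        any_digit = true;
    }
    if (*s == '.') {
        s++;
        for (; *s >= '0' && *s <= '9'; s++) {  /* fractional part */
            value = value * 10.0 + (*s - '0');
            scale *= 10.0;
            any_digit = true;
        }
    }
    if (!any_digit || *s != '\0')
        return false;
    *out = sign * value / scale;
    return true;
}
```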
        
       | loeg wrote:
       | Have any of you looked at the CHERI hardware architecture and fat
       | capability pointers, broadly?
        
       | Uptrenda wrote:
       | What would you say to people who claim that writing "secure C
       | code" is impossible [not me but I'm curious what you all think]?
        
         | AaronBallman wrote:
         | I'd ask them if they really meant "impossible" or just "harder
         | than I wish it was".
         | 
         | I've typically found that the tradeoffs between security,
         | performance, and implementation efforts are usually more to
         | blame for why writing secure C code is a challenge. There are a
         | ton of tools out there to help with writing secure code
         | (compiler diagnostics, secure coding standards, static
         | analyzers, fuzzers, sanitizers, etc), but you need to use all
         | the tools at your disposal (instead of only a single source of
         | security) which adds implementation cost and sometimes runtime
         | overhead that needs to be balanced against shipping a product.
         | 
         | This isn't to suggest that the language itself doesn't have
         | sharp edges that would be nice to smooth over, though!
        
       | mesaframe wrote:
       | How to become a compiler engineer if you don't have a degree in
       | CS?
        
       | axelf4 wrote:
       | In C89 is there a portable way to figure out the alignment
       | requirement for a struct, to be able to, say, store it after the
       | NUL terminator in the same allocation as a C string?
        
         | DougGwyn wrote:
         | I'm not sure what your requirement is. Usually things work out
         | if you're careful not to assume any specific value for
         | alignment etc. It may mean a few unused bytes here and there,
         | but keeping things simple and portable often pays off.
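
One C89-compatible technique that is often used for this is to measure the
offset of the struct inside a probe struct that starts with a char. The sketch
below uses hypothetical names (struct payload, PAYLOAD_ALIGN, align_up,
payload_offset_after). The measured offset is always a multiple of the real
alignment requirement, so it is safe to use, though an implementation could in
principle make it larger than strictly necessary.

```c
#include <stddef.h>
#include <string.h>

struct payload { long id; double weight; };   /* hypothetical example type */

/* The member p must be suitably aligned, so its offset after a single char
   is a multiple of the alignment of struct payload. */
struct payload_align_probe { char c; struct payload p; };
#define PAYLOAD_ALIGN offsetof(struct payload_align_probe, p)

/* Rounds offset up to the next multiple of align (align must be nonzero). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Offset at which a struct payload may be stored after the string s and its
   NUL terminator, assuming the buffer itself came from malloc. */
size_t payload_offset_after(const char *s)
{
    return align_up(strlen(s) + 1, PAYLOAD_ALIGN);
}
```

The allocation size would then be payload_offset_after(s) plus
sizeof(struct payload).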
        
           | quelsolaar wrote:
           | Being able to know your alignments is VERY important for a
           | lot of network implementations. They are all defined by the
           | ABIs, but it's very annoying that the standard keeps thinking
           | that alignment is unknowable, when in fact it's impossible to
           | implement an ABI without defining it. One of the reasons I
           | stick to C89.
        
             | DougGwyn wrote:
             | Note that the ABIs cover endianness as well as value range
             | and/or object widths. In general, one needs to have
             | explicit marshaling and unmarshaling functions to map from
             | network octet array and C internal data representation.
             | Failure to get this right is (or used to be) a common bug
             | for code developed and tested on too few architectures.
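
A minimal sketch of what explicit marshaling looks like for a single 32-bit
field in network (big-endian) byte order. The function names are hypothetical.
Because the code works on values with shifts rather than on the in-memory
representation, it behaves the same on little-endian and big-endian hosts and
makes no alignment assumptions about the buffer.

```c
#include <stdint.h>

/* Writes v into out[0..3], most significant octet first. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Reads a big-endian 32-bit value from in[0..3]. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```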
        
               | quelsolaar wrote:
               | Sure, it won't be portable between any architectures, but
               | a lot of times you know you will be on a little endian
               | platform where types are aligned to their sizeofs. That
               | covers a lot of ground and the performance gains you get
               | from optimizing with this in mind is significant. There
               | is value in C being able to be portable, but there is
               | also a huge value in being able to write non-portable
               | code that takes advantage of what you know about the
               | platform. C needs to acknowledge that that is a
               | legitimate use case.
        
       | sgawlik wrote:
       | When you're looking at an unfamiliar C code base for the first
       | time, how do you approach it? Which files do you look for? Which
       | tools do you open up immediately?
        
         | rseacord wrote:
         | This depends a bunch on what your goals are. There are no
         | specially named files, so looking for a particular filename is
         | not particularly useful. It is sometimes informative to find
         | the file containing the main, but not always.
         | 
         | My job at NCC Group involves a lot of code reviews, so
         | frequently the files that are of interest to me are the ones
         | that contain the most defects. I typically identify these by
         | compiling with compiler warnings turned up and warning
         | suppression turned down. I'll frequently also make use of
         | static and dynamic analysis, including the GCC and Clang
         | sanitizers.
        
         | jhallenworld wrote:
         | cscope can help
        
           | clarry wrote:
           | Is there a vim-style cscope interface for emacs? I hate that
           | xcscope brings up its own persistent buffers (replacing other
           | buffers that I had deliberately placed on the screen). Vim,
           | conveniently, just pops up the cscope interface when I need
           | to enter some input, and then hides it away. Also I don't
           | think xcscope works with evil's tag stack whereas in vim, I
           | believe, you can just return to where you were with ^T,
           | whether using ctags or cscope.
        
           | DougGwyn wrote:
           | Yes, I have found it helpful. One nice feature is that it
           | uses a character-terminal interface, not a platform-specific
           | GUI.
        
         | DougGwyn wrote:
         | It all depends on how organized previous workers were, and what
         | your goal is for a modification of the source text. Often,
         | headers (dot-h files) document the data structures and
         | interfaces.
        
         | loeg wrote:
         | I start with generating tags.
         | 
         |     exctags --exclude=TAGS --exclude=TAGS.NEW --append -R \
         |         -f TAGS.NEW --sort=yes && mv TAGS.NEW TAGS
         | 
         | My editor (vim) has native support for quickly jumping from a
         | use to definition via this TAGS index. History is preserved
         | (i.e., there is a "back" button), so you can quickly dive
         | through 5 layers of API and back out to understand where a
         | value went. It is quite useful for starting with what you know
         | and following it to the surprising behavior, without executing
         | the code.
        
       | mey wrote:
       | This is a subjective question. From the array of tools in your
       | belt, when do you personally/professionally reach for C, or maybe
       | more interestingly, when do you _not_ reach for C?
        
         | DougGwyn wrote:
         | Since I do almost all my software development in a Unix
         | environment, usually I check the toolbox to see if there is
         | already a program that has nearly the functionality I want, and
         | if so then I cobble together a shell script. Sometimes (as with
         | the Sudoku solver) it will be necessary to build a new
         | component, and for that I usually use C since I am comfortable
         | and experienced with it. (Also, if coded in Standard C, odds
         | are that I can install it on whatever platform I need, with
         | little or no adaptation.)
        
       | RandNOx wrote:
       | - Which differences between the C abstract machine and actual
       | modern CPUs/hardware have proven most difficult to deal with in
       | the language?
       | 
       | - Are you planning any addition regarding modeling of how modern
       | CPUs work (e.g. pipelines, branches, speculative execution, cache
       | lines, etc)?
       | 
       | PS: Thank you for doing this!
        
         | AaronBallman wrote:
         | > - Which differences between the C abstract machine and actual
         | modern CPUs/hardware have proven most difficult to deal with in
         | the language?
         | 
         | For me, I think it's 'volatile' because, by its nature, you
         | can't describe what it means in the abstract machine very well.
         | For instance, consider a proposal to add something like a
         | "secure clear" function for clearing out sensitive data. The
         | natural inclination is to pretend that data is volatile so the
         | optimizer won't dead-code strip your secure clear function
         | call, but that leaves questions about things like cache lines,
         | distributed memory, etc.
         | 
         | > - Are you planning any addition regarding modeling of how
         | modern CPUs work (e.g. pipelines, branches, speculative
         | execution, cache lines, etc)?
         | 
         | Maybe? ;-) We tend to talk about features at a higher level of
         | abstraction than the hardware because hardware changes at such
         | a rapid pace compared to the standards process. So we largely
         | leave hardware-specific considerations as a matter of QoI for
         | implementers.
         | 
         | However, that doesn't mean we wouldn't consider proposals for
         | more concrete things like a defensive attribute to help
         | mitigate speculative execution attacks.
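
The "pretend the data is volatile" approach mentioned above is typically
written as follows. This is only a sketch with a hypothetical name
(secure_clear). Compilers in practice keep the stores, but the standard does
not clearly promise that for an object that was not itself declared volatile,
and nothing here addresses copies left in registers, caches, or elsewhere in
memory.

```c
#include <stddef.h>

/* Overwrites n bytes at p with zeros through a volatile-qualified pointer,
   so the optimizer is discouraged from removing the stores as dead code. */
void secure_clear(void *p, size_t n)
{
    volatile unsigned char *vp = p;

    while (n--)
        *vp++ = 0;
}
```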
        
       | 7532yahoogmail wrote:
       | pascal_cuoq - Pascal Cuoq is the Chief Scientist at TrustInSoft
       | and co-inventor of the Frama-C technology
       | 
       | This looks to be a hell of a good tool chain. I've been playing
       | with it as of yesterday.
        
       | jpizza wrote:
       | Hello,
       | 
       | First off thank you so much for taking the time to answer
       | questions.
       | 
       | As a new programmer starting with C, I am trying to learn how to
       | go from beginner to intermediate. Any recommendations for
       | projects to help learn C?
       | 
       | It is difficult for me to find projects that I see as "valuable",
       | for lack of a better term.
       | 
       | Thank you!
        
         | DougGwyn wrote:
         | One possibility is to modify some existing program to include
         | an additional new feature. You should soon develop a sense for
         | what works well versus what causes problems.
        
       | emreiyican wrote:
       | I know this opinion is unpopular and contradicts a core value
       | of the C standardization committee but I personally think at some
       | point, C standard should abandon supporting the legacy codebase.
       | I think bool and stdint definitions should be available as part
       | of the standard feature set and shouldn't need including their
       | respective headers. These and some other features are available
       | at the core of every modern language but C, and C has to provide
       | them via other means. Is the sentiment of discontinuing legacy
       | support shared within the committee, by any proportion?
        
         | xyzzy2020 wrote:
         | Can't upvote enough. I think these changes could also be made
         | in a way that can be mechanically-translatable.
         | 
         | For example: removing the register keyword, always requiring a
         | return statement, etc etc.
         | 
         | A lot of changes can be made that will make static analysis
         | easier.
         | 
         | There will always be people with 50 year old code bases that
         | will never change (and some c89 compiler will always be there
         | for them), but the language is pervasive enough that it
         | deserves progressive changes to make it (even) simpler and
         | safer and slightly more high level.
        
         | loeg wrote:
         | I'd love it if we could do away with all the headers.
         | 
         | Just #include <stdc.h> and be done with it. No need to remember
         | stdio, stdint, stdbool, limits, assert, signal.h, etc, etc.
         | 
         | This new header comes with a guarantee that use of identifiers
         | in the standard-reserved namespace will break your code.
         | Perhaps compilers could even enforce this preemptively.
        
           | DougGwyn wrote:
           | You can easily create your own stdc.h include file. Something
           | similar was done on Plan 9.
           | 
           | Note that by including the content of all the headers, you're
           | increasing the chance for collisions with application
           | identifiers. You might consider that more of a benefit than a
           | drawback.
        
         | AaronBallman wrote:
         | We've started doing some things in this area, but I don't think
         | the committee would abandon legacy code bases entirely.
         | Instead, we try to make a migration path for code bases.
         | 
         | For instance, we added the '_Bool' data type and require you to
         | include <stdbool.h> to spell it 'bool' instead and to get
         | 'true' and 'false' identifiers. This was done to not impact
         | existing code bases that had their own bool/true/false
         | implementation with those spellings. Now that "enough" time has
         | passed for legacy code bases to update, we're looking into
         | making these "first-class" features of the language and not
         | requiring <stdbool.h> to be included to use them. We're doing
         | the same for things like _Static_assert vs static_assert, etc
         | for the same reason.
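
A small sketch of why the migration path matters: a real _Bool does not behave
like the int-based bool that many older code bases defined for themselves. The
typedef legacy_bool and both function names are hypothetical.

```c
#include <assert.h>    /* static_assert spelling of _Static_assert (C11) */
#include <stdbool.h>   /* bool, true, false spellings (C99) */

/* bool from <stdbool.h> is only another spelling of the _Bool type. */
static_assert(sizeof(bool) == sizeof(_Bool), "bool is a spelling of _Bool");

typedef int legacy_bool;   /* hypothetical pre-C99 project typedef */

/* Conversion to _Bool yields 1 for any nonzero value, including 0.5. */
bool to_bool(double d) { return d; }

/* Conversion to int truncates toward zero, so 0.5 becomes 0. */
legacy_bool to_legacy_bool(double d) { return (legacy_bool)d; }
```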
        
       | tayistay wrote:
       | I'm no C expert, but my two wishes for C would be:
       | 
       | - Basic type inference to reduce keystrokes, and prevent ripples
       | when changing types. (like auto in C++)
       | 
       | - Equality operators defined for structs. Perhaps even
       | lexicographical comparison, if I'm dreaming.
       | 
       | Any thoughts on either of those?
        
       | cyber1 wrote:
       | Ken Thompson, Rob Pike, Brian Kernighan, Russ Cox, Robert
       | Griesemer are guys who created Unix, B, C, Go, Utf-8, etc. Maybe
       | it will be useful to invite these guys(one of them) in the C
       | Standards Committee for help to improve and design new language
       | features?
        
         | rseacord wrote:
         | I think a lot of these dudes are retired. A lot of good C
         | people like P.J. Plauger, John Benito, and Clark Nelson have
         | all retired recently. Anyway, they are all invited back. As an
         | incentive, we typically have free coffee and snacks at most of
         | the meetings. :)
        
       | Tronic2 wrote:
       | char effectively behaves as a signed type, making it unsuitable
       | for binary operations (e.g. UTF-8 manipulation). I/O functions
       | deal with char pointers, so using unsigned type like uint8_t
       | requires casting back and forth. Is there any way out of this
       | problem, and am I already breaking the aliasing rules with that
       | cast?
        
         | emilfihlman wrote:
         | There are no aliasing differences between uint8_t and char as
         | far as I know.
        
           | hsivonen wrote:
           | In practice not. In theory, it's implementation-defined
           | whether there are differences.
        
             | emilfihlman wrote:
             | At least from what I've heard that's because stdint values
             | are optional.
             | 
             | 6.2.5p17 The three types char, signed char, and unsigned
             | char are collectively called the character types. The
             | implementation shall define char to have the same range,
             | representation, and behavior as either signed char or
             | unsigned char. 48)
             | 
             | and
             | 
             | 5.2.4.2.1 says that width of char, signed char and unsigned
             | char are the same (8).
        
               | radford-neal wrote:
               | I don't think it's anything to do with uint8_t being
               | optional. It's because a char might have more than 8
               | bits.
        
         | msebor wrote:
         | Casting between the three character types is safe and doesn't
         | violate aliasing rules. In addition, objects of all types can
         | be accessed by lvalues of any of the three character types
         | (though unsigned char is recommended), so there's no problem
         | there either.
         | 
         | I/O functions that take a plain char* are designed to
         | interoperate with char arrays and strings, so passing in
         | unsigned or signed char is a sign that they aren't being used
         | as intended. (Functions that traffic in binary data like
         | fread/fwrite should take void*).
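
A sketch of the pattern described above: keep the char * interface that the
I/O and string functions expect, and view the bytes through unsigned char only
where bit manipulation is needed. The function name utf8_count is
hypothetical, and it assumes the input is already valid UTF-8.

```c
#include <stddef.h>

/* Counts code points in a NUL-terminated UTF-8 string by counting the bytes
   that are not continuation bytes (10xxxxxx). Reading through unsigned char
   avoids sign-extension surprises when plain char is signed. */
size_t utf8_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    for (; *p != 0; p++) {
        if ((*p & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```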
        
       | packetlost wrote:
       | I'm about a mid-level experienced developer, and have been
       | attempting to learn C via a few side projects. I come from mostly
       | Python and Go, which both have very robust standard libraries, so
       | I was quite surprised to find that string parsing is _very_
       | poorly supported in C. Is there a reason that very common string
       | parsing cases are missing from the C stdlib?
        
       | WFHRenaissance wrote:
       | A bit off topic, but what are your views on Golang? I'm leaving
       | this pretty open-ended, but I'm curious how you see it
       | interacting with the C/C++ ecosystem in the future.
        
       | tridentboy wrote:
       | Know it's not exactly related to what you do. But do you have
       | some recommendations of books/online classes to learn C?
        
       | Koshkin wrote:
       | Why not keep C a simple little language with fast compile times
       | and delegate all "enhancements" (such as 'cleanup') to C++?
        
       | nchelluri wrote:
       | Hello, just a quick note; I wanted to buy the book so I went to
       | the website and when I picked my country as Canada it started
       | giving me a strange list of provinces (definitely not Canadian)
       | so I abandoned the process for now.
        
         | billpollock wrote:
         | I've asked our Operations Manager to look into this issue.
         | Thanks for bringing this to our attention. We'll get it sorted
         | out. Please email info@nostarch.com so that they can help
         | troubleshoot.
        
         | rseacord wrote:
         | I'll pass this on to the publisher....
        
       | jfmc wrote:
       | C is a great low-level language to write the engines/runtimes of
       | other languages.
        
       | rudchenkos wrote:
       | Are any concurrency primitives planned for introduction in future
       | C revisions?
        
         | AaronBallman wrote:
         | We currently have not seen papers proposing to add new
         | concurrency primitives for C2x, but we have been actively
         | working on the concurrency object model and would welcome
         | proposals for new primitives or concurrency-related fixes.
         | 
         | One goal is to re-unify C with the concurrency object model
         | used by C++ to make std::atomic<T> and _Atomic(T) be ABI
         | compatible as intended in C11. Some small fixes in this area
         | are the removal of ATOMIC_VAR_INIT, clarifying whether library
         | functions can use thread_local storage for internal state, and
         | things along those lines. However, we expect there to be more
         | efforts in this area as we progress the standard.
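A minimal sketch of a C11 atomic object that is initialized without ATOMIC_VAR_INIT, assuming a hosted implementation that provides <stdatomic.h> and a compiler (such as GCC or Clang) that accepts plain initialization of atomics. The names id_counter and next_id are hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical shared counter; plain initialization, no ATOMIC_VAR_INIT. */
static atomic_uint id_counter = 0;

/* Returns a fresh id. The read-modify-write is a single atomic operation,
 * so concurrent callers never observe the same value. */
unsigned next_id(void)
{
    unsigned id = atomic_fetch_add(&id_counter, 1u);
    assert(id != UINT_MAX); /* sketch only: treat exhausting the id space as a bug */
    return id;
}
```

Two consecutive calls from one thread return consecutive values; with several threads the values are still distinct, but their order is unspecified.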
        
       | jpfr wrote:
       | C11 has seen new features, such as Generic Selection. Is the
       | current language standardization converging (just adding
       | clarifications, removing the surface for undefined behavior,
       | etc.) or is C still growing with new features?
       | 
       | In other words, will the C standard be effectively "done" at some
       | time in the future?
        
         | msebor wrote:
         | Fixing minor bugs or inconsistencies and reducing the number
         | and kinds of instances of undefined behavior are some of the
         | efforts keeping the C committee busy.
         | 
         | Reviewing proposals to incorporate features supported by common
         | implementations is another.
         | 
         | Aligning with other standards (e.g., floating point) and
         | improving compatibility with others (C++) is yet another.
         | 
         | In general, when an ISO standard is done it essentially becomes
         | dead. So for the C standard to continue to be active (on ISO's
         | books) it needs to evolve.
        
           | ken wrote:
           | It's interesting to hear the standardization perspective,
           | because it's pretty much the opposite of my perspective as a
           | user.
           | 
           | I see the classic path of any programming language --
           | regardless of standardization -- is to continuously add
           | features until it's so big and complex that nobody wants to
           | deal with it any more. Then it's replaced by a newer, simpler
           | language that takes the important bits and drops the
           | unnecessary complexities. At that point, everybody sees that
           | the older language was barking up the wrong tree, and they
           | stop wasting time on it.
           | 
           | It's not the cessation of language change that _causes_
           | language death -- that's merely a symptom. You can't keep a
           | language alive simply by changing it every year. Some people
           | sure have tried.
           | 
           | Alternatively, until it's evolved so much that there is so
           | much diversity of implementation that simply knowing a
           | library is written in "language X" doesn't tell me much about
           | how it's written, or whether I can use it in my program which
           | is also written in "language X".
           | 
           | Then again, C is the exception to every rule, so maybe we can
           | keep piling on features indefinitely, and people will have to
           | use it (even if they don't like it), for the same reason they
           | started using it decades ago (even if we didn't like it).
        
         | rseacord wrote:
         | I would say no, that we are still adding new features. Aaron
         | Ballman was responsible for adding attributes to C2x (he
         | can tell you more). We're also looking at an #embed feature to
         | incorporate binaries the way that #include incorporates text.
        
         | rseacord wrote:
         | A full list of proposals to WG14 can be found here:
         | 
         | http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...
         | 
         | These papers are usually quite interesting.
        
       | ancarda wrote:
       | As a C newbie, will there ever be "safe" C, i.e. no undefined
       | behavior and help with writing code that has fewer memory-related
       | crashes/bugs? For comparison, Rust has the `unsafe { }' block
       | which lets you mark regions of code as being able to do funky
       | stuff. Could we get the opposite for C, i.e. `safe { }' and for
       | an entire file, `#pragma safe'?
       | 
       | I have a love-hate relationship with C - I like it for small
       | projects, but anything serious I really need to write it in a
       | more safe language. I think GCC has some flags that can help, and
       | I've been using tools like splint, but something baked into the
       | standard would be amazing.
        
         | sramsay wrote:
         | I'm pretty happy with C as it is, but I will admit to being
         | surprised that a "minimalistic Rust" hasn't risen to
         | prominence.
         | 
         | I guess what I mean by that is a language that has Rust's
         | hyperactive, strongly opinionated compiler, borrow checker, no
         | NULL, immutable by default, etc, but in a language that is no
         | more syntactically ambitious than C89. I would be way more into
         | a language like that than Rust.
         | 
         | A language that sort of _feels_ like Go, but can actually be
         | used for low-level systems programming.
        
           | Leherenn wrote:
           | I think it's going to arrive, but some time is needed to see
           | what works in Rust or not. D is going this way as well, so
           | should provide another data point.
        
       | modeless wrote:
       | Can you do anything to push Microsoft to implement recent C
       | standards? Their failure to fully implement even C99 in Visual
       | Studio is holding the language back.
        
         | AaronBallman wrote:
         | Not really -- vendors are free to ignore newer releases of the
         | standard that do not meet their customers' needs and the
         | committee can't do much about it.
         | 
         | However, as a user, you can help apply pressure on the vendor
         | to support newer standards. For instance, with Microsoft, you
         | could support this feedback request:
         | https://developercommunity.visualstudio.com/idea/387315/add-...
        
         | DougGwyn wrote:
         | There is little that the C Standards group can do about it. One
         | idea is to write C Standard conformance into contracts. When
         | I was in the government we often did that, but it still wasn't
         | enough clout.
        
       | overfl0w wrote:
       | Can memory safety be ensured in the C programming language? By
       | static analysis at compile time for example?
        
         | [deleted]
        
         | pascal_cuoq wrote:
         | It is possible to guarantee that a C program does not have any
         | undefined behavior, which includes all the memory errors that
         | are often also security vulnerabilities.
         | 
         | "Static analysis" may be the wrong name to classify the tools
         | that work in that area, because "static analysis" is usually
         | used for purely automatic tools, whereas the tools used to
         | guarantee the absence of undefined behaviors are not entirely
         | automatic except for the simplest of programs.
         | 
         | Results of a static analyzer are often characterized in terms
         | of "false positives" and "false negatives". It is a possible
         | design choice to make an analyzer with no false negatives. It
         | is absolutely not impossible! (Some people think it is
         | fundamentally impossible because it sounds like a computer
         | science theorem, but it isn't one. The theorem would apply if
         | one intended to make an analyzer with no false positives and no
         | false negatives--and if computers were Turing machines.)
         | 
         | Analyzers designed to have no false negatives are called
         | "sound". In practice, this kind of analyzer may prove that a
         | simple program is free of Undefined Behavior if the program is
         | a simple example of 100 lines, but for a more realistic
         | software component of at least a few thousand lines, the result
         | will be obtained after a collaborative human-analyzer process
         | (in which the analyzer catches reasoning errors made by the human,
         | so the result is still better than what you can get with code
         | reviews alone).
         | 
         | Here is what the result of this collaborative human-analyzer
         | process may look like for a library as cleanly designed and
         | self-contained as Mbed TLS (formerly PolarSSL):
         | https://trust-in-soft.com/polarSSL_demo.pdf?
        
       | emilfihlman wrote:
       | Why isn't there a binary prefix in the standard? Like 0b0111010?
        
       | [deleted]
        
       | ocithrowaway wrote:
       | A couple of (I hope easy) requests - 1. Can we add separators in
       | constants (C++ does 0xFFFF'FFFF'FFFF'FFFF any other reasonable
       | scheme is fine too?)
       | 
       | 2. I think many compilers already do this, but can the static
       | initialization rules be relaxed a bit?
       | 
       |     static const int a = 0;
       |     static const int b = a; /* This is not standard C afaik. */
       | 
       | Thank you, CodeandC
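Until the rules are relaxed, one portable way to get the effect asked for in (2) is to name the value with an enumeration constant, which is an integer constant expression and so is allowed in a static initializer. This is a sketch under that assumption; A_VALUE and sum_ab are hypothetical names.

```c
#include <assert.h>

/* An enumeration constant is an integer constant expression, unlike a
 * const-qualified object, so it may appear in static initializers. */
enum { A_VALUE = 0 };

static const int a = A_VALUE;
static const int b = A_VALUE; /* portable stand-in for `static const int b = a;` */

static_assert(A_VALUE == 0, "enumeration constants also work in static_assert");

/* Small accessor so the two objects are actually used. */
int sum_ab(void)
{
    return a + b;
}
```

The enum approach only covers values that fit in an int; for other types a #define with a literal plays the same role.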
        
         | rightbyte wrote:
         | A binary literal would be nice too. Doing masks for embedded
         | systems makes my head hurt sometimes. "Cpp compatibility" etc
         | etc could be the excuse to implement it.
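Without binary literals in the standard, masks for hardware registers are usually built from bit positions instead. The helpers below are a hedged sketch of that pattern; bit32, field_mask32 and field_get32 are hypothetical names, not part of any standard header.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with only bit n set (n in 0..31). */
static inline uint32_t bit32(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << n;
}

/* Mask covering bits lo..hi inclusive (0 <= lo <= hi <= 31). */
static inline uint32_t field_mask32(unsigned hi, unsigned lo)
{
    assert(lo <= hi && hi < 32);
    unsigned width = hi - lo + 1;
    /* Shifting a 32-bit value by 32 is undefined, so handle the full width
     * as a special case. */
    uint32_t ones = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return ones << lo;
}

/* Extract the field stored in bits lo..hi of a register value. */
static inline uint32_t field_get32(uint32_t reg, unsigned hi, unsigned lo)
{
    return (reg & field_mask32(hi, lo)) >> lo;
}
```

For example, field_get32(0x3A, 6, 4) picks the three bits that would be written 011 in the binary form 0111010.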
        
         | msebor wrote:
         | WG14 in general looks favorably at proposals to align C more
         | closely with C++ (within the overall spirit of the language)
         | and I'd expect (1) would be viewed in that light.
         | 
         | I'd also say there is consensus that (2) would be beneficial.
         | There are some good ideas in
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2067.pdf although I
         | don't think repurposing the register keyword for it was very popular.
         | Not just because it wouldn't be compatible with C++ which
         | deprecated register some time ago, but also because it's novel
         | with no implementation or user experience behind it. My
         | impression is that this is waiting for a new proposal.
        
       | MaxBarraclough wrote:
       | Does the following code fragment cause undefined behaviour?
       | 
       |     unsigned int x;
       |     x -= x;
       | 
       | There's a lengthy StackOverflow thread where various C language-
       | lawyers disagree on what the spec has to say about trap values,
       | and under what circumstances reading an uninitialised variable
       | causes UB. I'd appreciate an authoritative answer. Thanks for
       | dropping by on HN!
       | 
       | https://stackoverflow.com/q/11962457/
        
         | pascal_cuoq wrote:
         | This example is clearly UB.
         | 
         | You could argue that it suddenly becomes less UB if you take
         | the address of x:
         | 
         |     unsigned int x;
         |     &x;
         |     x -= x;
         | 
         | I'm not sure if this will add anything to the discussion on SO,
         | but if you allow programs to do this, then after applying
         | modern optimizing C compilers, you may end with multiplications
         | by 2 that produce odd results, or uninitialized char variables
         | that contain 500:
         | http://blog.frama-c.com/index.php?post/2013/03/13/indetermin...
         | 
         | So the short answer is that, for all intents and purposes, you
         | should consider use of uninitialized variables as UB, because C
         | compilers already do. (There exists somewhere a document
         | clarifying what C compilers can and cannot do with
         | indeterminate values. A search for "wobbly values" might turn
         | it up. Anyway, you do not want to have wobbly values in your C
         | programs any more than you want it to have undefined behavior.)
        
           | MaxBarraclough wrote:
           | Interesting link, thanks. So then:
           | 
           | * Under C90, reading an uninitialized local was explicitly
           | listed as UB.
           | 
           | * Under C99, if you weren't using a character type, it was
           | still essentially UB, by way of trap values. (I don't think
           | the particulars of the target hardware platform are
           | relevant.)
           | 
           | * C11 reintroduced UB even for some cases involving character
           | types. We were already invoking UB under C99, so we know
           | we're still invoking UB under C11.
           | 
           | > You could argue that it suddenly becomes less UB if you
           | take the address of x
           | 
           | I don't think so. As we're not using a character type, I
           | don't think taking its address would change anything. This
           | aligns with what msebor said.
           | 
           | Lastly, from the article:
           | 
           | > No, GCC is still acting as if j *= 2; was undefined.
           | 
           | I think GCC's behaviour is legal here. The target platform
           | may have no trap values, but I don't see that GCC is
           | prohibited from behaving as if there _are_. It would be legal
           | (albeit bizarre) for it to generate code for a completely
           | different ISA, and to bundle an emulator. If the spec says
           | you've opened the door to UB, then unless your compiler
           | documentation says otherwise, it's permitted to generate code
           | that goes haywire, no?
        
         | msebor wrote:
         | Yes, it's undefined. It involves a read of an uninitialized
         | local variable. Except for the special case of unsigned char,
         | any uninitialized read is undefined.
        
           | emilfihlman wrote:
           | >Except for the special case of unsigned char, any
           | uninitialized read is undefined.
           | 
           | Could you expand on this?
        
             | loeg wrote:
             | I'm guessing you were asking about this part rather than UB
             | in general:
             | 
             | > Except for the special case of unsigned char,
             | 
             | The SO article makes the bizarre claim that because
             | 
             | (1) an unsigned char, per the standard, cannot have any
             | padding bits, it therefore cannot have a trap
             | representation. And
             | 
             | (2) if it cannot have a trap representation, the use of an
             | uninitialized value isn't undefined.
             | 
             | I'm willing to buy (1) but I don't remember (2) being
             | required for UB. I think (2) is the step that is harder to
             | follow intuitively. Admittedly, I have not read that part
             | of the standard closely in some time.
        
             | msebor wrote:
             | An object of any type, initialized or not, can be read by
             | an lvalue of unsigned char (or any character type). That
             | lets functions like memcpy (either the standard one or a
             | hand-rolled loop) copy arbitrary chunks of memory.
             | 
             | There's some debate about the effects of reading an
             | uninitialized local variable of unsigned char (like whether
             | the same value must be read each time, or whether it's okay
             | for each read to yield a different value).
             | 
             | This special exemption doesn't extend to any other types,
             | regardless of whether or not they have padding bits or trap
             | representations that could cause the read to trap. Few
             | types do, yet the behavior of uninitialized reads in
             | existing implementations is demonstrably undefined
             | (inconsistent or contradictory to invariants expressed in
             | the code of a test case), so any subtleties one might
             | derive from the text of the standard must be viewed in that
             | light.
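The hand-rolled copy loop mentioned above can be sketched as follows: every access goes through an lvalue of type unsigned char, which is permitted for an object of any type. The name copy_bytes is hypothetical, and the sketch assumes the two regions do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Copies n bytes from src to dst, one unsigned char at a time.
 * Behaves like memcpy for non-overlapping regions. */
void *copy_bytes(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (d != NULL && s != NULL));
    for (size_t i = 0; i < n; i++)
        d[i] = s[i]; /* each read and write uses an unsigned char lvalue */
    return dst;
}
```

Because only character-type lvalues are used, the loop can copy a struct that contains padding or members that were never assigned.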
        
               | MaxBarraclough wrote:
               | Thanks for your answers. A related question: this article
               | [0] appears to single out _memcpy_ and _memmove_ as being
               | special regarding effective type. Is it accurate? It
               | seems to be at odds with your suggestion that there's
               | nothing stopping me writing my own memcpy provided I'm
               | careful to use the right types.
               | 
               | [0] https://en.cppreference.com/w/c/language/object#Effective_ty...
        
               | msebor wrote:
               | memcpy and memmove aren't special. The part that
               | discusses the copying of allocated objects is 6.5, p6,
               | quoted below:
               | 
               | The effective type of an object for an access to its
               | stored value is the declared type of the object, if any.
               | If a value is stored into an object having no declared
               | type through an lvalue having a type that is not a
               | character type, then the type of the lvalue becomes the
               | effective type of the object for that access and for
               | subsequent accesses that do not modify the stored value.
               | If a value is copied into an object having no declared
               | type using memcpy or memmove, or is copied as an array of
               | character type, then the effective type of the modified
               | object for that access and for subsequent accesses that
               | do not modify the value is the effective type of the
               | object from which the value is copied, if it has one. For
               | all other accesses to an object having no declared type,
               | the effective type of the object is simply the type of
               | the lvalue used for the access.
        
               | MaxBarraclough wrote:
               | I see, so in short the article is failing to reflect this
               | excerpt: _or is copied as an array of character type_.
               | Thanks again.
        
               | AaronBallman wrote:
               | I think that may be inaccurate -- IIRC, in C, you can do
               | type punning via a union but not memcpy, and in C++ you
               | can do type punning via memcpy but not a union and this
               | incompatibility drives me nuts because it makes inline
               | functions in a header file shared between C and C++
               | really messy. (Moral of the story: don't pun types.)
        
               | pascal_cuoq wrote:
               | The C standard also allows using memcpy for type punning:
               | 
               |     If a value is copied into an object having no declared
               |     type using memcpy or memmove, or is copied as an array of
               |     character type, then the effective type of the modified
               |     object for that access and for subsequent accesses that
               |     do not modify the value is the effective type of the
               |     object from which the value is copied, if it has one
               | 
               | Simply memcpy into a variable (as opposed to dynamically
               | allocated memory).
               | 
               | https://port70.net/~nsz/c/c11/n1570.html#6.5p6
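A minimal sketch of punning through memcpy into a declared variable, as described above. It assumes float and uint32_t have the same size (checked at compile time); float_bits and float_from_bits are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Returns the object representation of f as an unsigned 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t u;                /* declared type is uint32_t, and stays so */
    memcpy(&u, &f, sizeof u);  /* copies the bytes; no float lvalue touches u */
    return u;
}

/* The inverse: rebuilds a float from its object representation. */
float float_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

On an implementation that uses IEEE 754 binary32 for float, float_bits(1.0f) is 0x3F800000; the round trip through float_from_bits holds regardless of the representation.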
        
               | AaronBallman wrote:
               | I must be remembering incorrectly then, thank you!
        
             | rseacord wrote:
             | Uninitialized Reads
             | https://queue.acm.org/detail.cfm?id=3041020
        
           | [deleted]
        
       | ux wrote:
       | Is there any plan to deal with the locale fiasco at some point?
       | 
       | Some hints on what I'm referring to can be found here:
       | https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f02...
       | 
       | Unrelated, but I also miss a binary constant notation (such as
       | 0b10101)
        
         | eqvinox wrote:
         | I haven't read most of that rant, but a thread-local
         | setlocale() would be a godsend. Not sure if that's ISO C or
         | POSIX though.
        
           | wahern wrote:
           | POSIX has added _l variants taking a locale_t argument to all
           | the relevant string functions. I can see how per-thread state
           | would be convenient, but it's not a comprehensive solution.
           | With the _l variants you can write your own wrappers that
           | pass a per-thread locale_t object.
        
         | r12477 wrote:
         | For binary constant notation, I have incorporated the following
         | macro into my projects:
         | https://gist.github.com/61131/009961b781f387ed1474ffaf19e375...
        
         | nickysielicki wrote:
         | In the same vein, I really like being able to use underscores
         | in binary and hex literals to denote subfields in hardware
         | registers.
         | 
         | 0xDEADB_EEF
         | 
         | 0b1_010_110111001001
         | 
         | etc.
        
           | jhallenworld wrote:
           | Should take Verilog binary construction syntax, like {
           | 12'd12, 16'hffee, 3'b101 } (or something similar that would
           | fit with C's syntax).
        
             | dirtydroog wrote:
             | Maybe not.
        
               | jhallenworld wrote:
               | Why not? If you have to combine bit fields now, it's a
               | mess of shifting and masking.
        
         | pascal_cuoq wrote:
         | Many C compilers offer, as an extension, the very binary
         | constant notation that you miss, as anyone who has worked on
         | the front-end of a C static analyzer would tell you.
        
           | ux wrote:
           | Yes I'm aware. But we can agree it would be welcome in the
           | standard, wouldn't it?
        
             | pascal_cuoq wrote:
             | Yes, if only so that we (as a category) do not have to
             | discover it exists when already facing C programs that use
             | it.
        
         | OnACoffeeBreak wrote:
         | I know that we're not voting, but I miss a binary literal very
         | much. I would also like a literal digit separator to improve
         | readability. Verilog Hardware Description Language does that
         | with an underscore [1]. For example, 0xad_beef to improve
         | readability of a hex literal, and 0b011_1010 to improve
         | readability of a binary literal.
         | 
         | 1: http://verilog.renerta.com/mobile/source/vrg00020.htm
        
           | jfkebwjsbx wrote:
           | If they pick this up, they will likely use C++'s
           | syntax/rules.
        
       | magicbanana wrote:
       | Is there a chance to ever see C++-template-like features appear
       | in C?
       | 
       | For instance, a lot of redundant code (or ugly macro business)
       | could be neatly replaced by function templates. Even just
       | template functions with only POD values allowed would be a great
       | readability improvement.
        
         | MiKom wrote:
         | It's already there. It's called C++ templates
        
       | pantalaimon wrote:
       | Will C eventually get something like C++'s constexpr?
        
         | AaronBallman wrote:
         | C has some basic support for constant expressions already, but
         | there has not yet been a proposal to bring 'constexpr' over
         | from C++. Personally, I would _love_ this feature to be in C!
        
           | loeg wrote:
           | You and me both!
        
         | dktoao wrote:
         | This is really the only thing that I really want from C++. It
         | would be amazing if this could make the cut for a future spec.
         | 
         | EDIT: I work on embedded systems, where C is king, and it seems
         | like I spend an inordinate amount of time working with code
         | generators that build simple tables. All of which could go away
         | with this feature.
        
       | pornel wrote:
       | Would you consider adding a built-in way to safely multiply two
       | numbers?
       | 
       | Numeric overflows in things like calculation of buffer sizes can
       | lead to vulnerabilities.
       | 
       | Signed overflow is UB, and due to integer promotion signs creep
       | in unexpected places.
       | 
       | It's not trivial to check if overflow happened due to UB rules. A
       | naive check can make things even worse by "proving" the opposite
       | to the optimizer.
       | 
       | And all of that is to read one bit that CPUs have readily
       | available.
        
         | DougGwyn wrote:
         | There are a lot of arithmetic conditions for which C could
         | generate special code. There are div_t-related functions for
         | the other direction. I for one would like a good way to obtain,
         | using some Standard C coding pattern, fast "carry" for
         | multiple-precision integer arithmetic.
         | 
         | Several places in support functions, I have coded unusually to
         | avoid wrap-around etc. I bet you could devise something like
         | that for (unsigned) multiplication.
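One way to code an unsigned multiplication so that it never wraps is to test against the maximum with a division before multiplying. The sketch below does this for size_t, the usual type for buffer sizes; checked_mul_size is a hypothetical name, not a standard function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stores a * b in *out and returns true, or returns false (leaving *out
 * untouched) when the mathematical product would exceed SIZE_MAX. */
bool checked_mul_size(size_t a, size_t b, size_t *out)
{
    assert(out != NULL);
    /* a * b > SIZE_MAX  <=>  b > SIZE_MAX / a  (for a != 0), and the
     * division itself can never overflow. */
    if (a != 0 && b > SIZE_MAX / a)
        return false;
    *out = a * b;
    return true;
}
```

A typical use is guarding an allocation: call checked_mul_size(count, sizeof(elem), &bytes) and only pass bytes to malloc when it returns true. GCC and Clang also provide __builtin_mul_overflow as a compiler extension that reports the same condition.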
        
       | freemind wrote:
       | 1. What is the easiest way to build cross-platform (native) GUI
       | with C?
       | 
       | 2. Why is it harder to find LGPL-licensed libraries for accessing
       | Windows directories over the network (like jcifs or pysmb), and
       | libraries overall, when you need to keep most of the source closed
       | to sell small software packages to businesses?
       | 
       | 3. If you needed to combo C with another language to do
       | everything you need to do forever and never look back what other
       | language would that be?
        
       | DougGwyn wrote:
       | Back from lunch. Any West Coasters?
        
       | hsivonen wrote:
       | Does the committee have any plans to document the rationale for
       | each kind of Undefined Behavior?
       | 
       | Does the committee have any plans to make NULL pointer arguments
       | to memcpy non-UB when the size argument is 0?
        
         | AaronBallman wrote:
         | > Does the committee have any plans to document the rationale
         | for each kind of Undefined Behavior?
         | 
         | In the C99 timeframe, we had a rationale document that was
         | separately maintained. My understanding (this predates my
         | joining the committee) is that this was prohibitively labor-
         | intensive and so we stopped doing it for C11. I don't know of
         | any plans to start doing this again, even in a limited sense
         | for justifying UB. That said, we do spend time considering
         | whether an aspect of a proposal requires UB or not, so the
         | rationale exists in the proposals and committee minutes.
         | 
         | > Does the committee have any plans to make NULL pointer
         | arguments to memcpy non-UB when the size argument is 0?
         | 
         | I have not seen such a proposal, and suspect that
         | implementations may be concerned about losing their
         | optimization opportunities from such a change. (Personally, I'd
         | be okay losing those optimization opportunities as this does
         | not seem like a situation where UB is necessary.)
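While the rule stands, callers that may hold a null pointer together with a zero length can avoid the undefined behavior with a small wrapper. This is a sketch only; memcpy_zero_ok is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Like memcpy, but a zero-length copy is a no-op even when dst or src is
 * a null pointer, so memcpy itself is never called with invalid pointers. */
void *memcpy_zero_ok(void *dst, const void *src, size_t n)
{
    if (n == 0)
        return dst;
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

This matters for code that represents an empty buffer as a null pointer with length 0 and copies it without a special case.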
        
       | natch wrote:
       | Can you please repeat this AMA at a later date and at a time of
       | day when people on the west coast of the USA are awake?
       | Alternatively, please keep it going for a few hours if you would
       | be able to be so generous with your time! Thank you for doing
       | this!
       | 
       | Do you also answer questions about the standard libraries? This
       | is not so much a C question as a library question:
       | 
       | I'm wondering if Apple's Grand Central Dispatch ever made it into
       | a more integrated role in C's libraries, or if it will forever
       | remain an outside add-on. And whether there is anything else at
       | that level (level in the sense of high versus low level) in the
       | standard libraries that plays such a role, that I should read up
       | on instead of GCD.
        
         | AaronBallman wrote:
         | > Alternatively, please keep it going for a few hours if you
         | would be able to be so generous with your time!
         | 
         | We're remaining active while there are still people asking
         | questions, so the west coast folks should hopefully have the
         | chance to ask what they'd like.
         | 
         | > Do you also answer questions about the standard libraries?
         | 
         | Sure!
         | 
         | > I'm wondering if Apple's Grand Central Dispatch ever made it
         | into a more integrated role in C's libraries, or if it will
         | forever remain an outside add-on.
         | 
         | GCD has not been adopted into C yet, and I don't believe it's
         | even been proposed to do so by anyone (or an alternative to
         | GCD, either).
         | 
         | It would be an interesting proposal to see fleshed out for the
         | committee, and there is a lot of implementation experience with
         | the feature, so I think the committee would consider it more
         | carefully than an inventive proposal with no real-world field
         | experience.
        
           | wahern wrote:
           | GCD relies on Blocks (closures) for ergonomics, and Blocks
           | have been proposed to WG14, for example N1451:
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1451.pdf
        
       | DougGwyn wrote:
       | Some simple instructions about how to use a thread for
       | conversation would be appreciated. Thanks!
        
         | rseacord wrote:
         | Nothing to it! Just hit the reply button on comments you want
         | to respond to. You can also upvote anything you like by
         | clicking on the up arrow to the left of the comment.
        
           | DougGwyn wrote:
           | Okay, is there a starting thread for today's C Experts panel?
           | I miss the old net newsgroups.
        
             | dang wrote:
             | The thread is
             | https://news.ycombinator.com/item?id=22865357, which is the
             | page you've been posting to. It's now listed on the front
             | page of the forum, https://news.ycombinator.com/, which is
             | a list of the stories people have upvoted today.
             | 
             | You're not the only person who misses the old newsgroups!
             | The format that Hacker News uses is one that became sort of
             | standard on the web in the early 2000s. It works
             | differently than usenet did, but you get threaded comments
             | in the sense that replies are nested under the posts
             | they're replying to.
        
         | pascal_cuoq wrote:
         | There are very few formatting options when writing posts,
         | for better or for worse: https://news.ycombinator.com/formatdoc
        
       | stwcx wrote:
       | One feature of C which I do not use often is enums. Support for
       | constants beyond the range of an int is not portable. I also
       | try to avoid putting enums inside structs, because there is no
       | portable way to enforce the size or the alignment of the enum's
       | base type.
       | 
       | Will this be addressed in future revisions of the C standard?
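One common workaround for the struct-layout concern above, sketched under the assumption that the enumerators fit in 8 bits: store the value in a fixed-width integer field and convert at the boundary. The names `color`, `pixel`, and `pixel_color` are invented for illustration.

```c
#include <stdint.h>

enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

struct pixel {
    uint8_t color;   /* holds an enum color value; the field width is fixed */
    uint8_t alpha;
};

/* Fails to compile if the implementation pads the struct unexpectedly. */
_Static_assert(sizeof(struct pixel) == 2, "unexpected padding in struct pixel");

/* Converts the stored integer back to the enum type for callers. */
static enum color pixel_color(const struct pixel *p)
{
    return (enum color)p->color;
}
```

The cost is that the compiler no longer sees the field as an enum, so range checking of stored values is left to the programmer.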
        
       | iamed2 wrote:
       | What's an example of a codebase where _Generic has had a notable
       | positive impact?
        
         | AaronBallman wrote:
         | Not necessarily a code base, but _Generic is what makes
         | <tgmath.h> implementable for the type-generic math functions.
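A minimal sketch of the kind of dispatch `_Generic` allows, in the spirit of the type-generic math macros. The helper functions and the `my_abs` macro are hypothetical and are not taken from any real <tgmath.h> implementation.

```c
/* Hypothetical helpers: one absolute-value function per argument type. */
static int    abs_int(int v)       { return v < 0 ? -v : v; }
static float  abs_float(float v)   { return v < 0 ? -v : v; }
static double abs_double(double v) { return v < 0 ? -v : v; }

/* Selects a function from the static type of x, then calls it with x. */
#define my_abs(x) _Generic((x), \
    int:    abs_int,            \
    float:  abs_float,          \
    double: abs_double)(x)
```

The selection happens at compile time, so `my_abs(-1.0f)` has type `float` and `my_abs(-1.0)` has type `double`; an argument of any other type is a compile error because the macro has no `default` association.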
        
       | teleonorax wrote:
       | What's up with `strlcpy` and `strlcat`? Are they getting
       | standardized?
        
         | AaronBallman wrote:
         | We've been considering proposals to add common POSIX APIs into
         | C, but I don't believe we've seen a proposal for strlcpy or
         | strlcat yet. I recall we agreed to add strdup to C given its
         | wide availability and usage.
        
           | DougGwyn wrote:
           | There are deficiencies in almost all proposals. Two new
           | functions which avoid the problems are supposed to be
           | published in C202x: strcasecmp and strncasecmp, added in
           | header strings.h (note: not string.h).
        
           | sramsay wrote:
           | strdup seems like a perfect example of "standardizing
           | existing practice." And it has never struck me as running
           | against the spirit of C.
        
             | DougGwyn wrote:
             | In fact I proposed strdup on a few occasions, but it wasn't
             | adopted. It seems that they didn't like for standard
             | library functions to use malloc. POSIX.1 specifies strdup.
        
         | rseacord wrote:
         | No one has proposed making these standard. I doubt they would
         | gain much support as they are similar to the Annex K Bounds
         | Checked Interface functions strcpy_s and strcat_s but not quite
         | as good IMHO.
        
           | teleonorax wrote:
           | > similar to the Annex K Bounds Checked Interface functions
           | strcpy_s and strcat_s but not quite as good IMHO.
           | 
           | Err... I thought Annex K is deprecated and dead? Whereas
           | strl* seem very much alive, some compilers even give a
           | "strcpy/strncpy is unsafe, use strlcpy instead" warning.
        
             | AaronBallman wrote:
             | FWIW, Annex K is not currently deprecated.
        
               | eqvinox wrote:
               | It's not commonly available though, e.g. on Linux/BSD
               | systems...
        
               | AaronBallman wrote:
               | Correct -- it would be nice if the glibc maintainers
               | would reconsider their opinion of supporting the optional
               | Annex K functionality. There is definitely user demand
               | for the feature.
        
               | eqvinox wrote:
               | > _rseacord 22 minutes ago [-]_
               | 
               | > _The C Committee has taken two votes on this, and in
               | each case, the committee has been equally divided.
               | Without a consensus to change the standard, the status
               | quo wins._
               | 
               | The fact that it has only survived on status quo is a
               | pretty crass hint that things aren't well with Annex K.
        
               | beefhash wrote:
               | _And_ every BSD out there. _And_ whatever it is that
               | macOS does. Microsoft looks to be the outlier to me.
        
               | loeg wrote:
               | Microsoft does not even implement Annex K.
               | 
               | > Microsoft Visual Studio implements an early version of
               | the APIs. However, the implementation is incomplete and
               | conforms neither to C11 nor to the original TR 24731-1.
               | 
               | > As a result of the numerous deviations from the
               | specification the Microsoft implementation cannot be
               | considered conforming or portable.
               | 
               | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm
        
               | loeg wrote:
               | It should be.
        
           | loeg wrote:
           | Well, I am informally proposing making those standard :-).
           | 
           | IMO they're a lot more ergonomic than the Annex K functions,
           | and do the thing most programmers think the strncat/strncpy
           | functions do (admittedly, not part of ISO C).
           | 
           | Annex K should be forgotten as the mistake it is and we can
           | move on with existing real-world interfaces instead of
           | inventing features from whole-cloth. I thought that was
           | generally the C standard operating practice.
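To make the ergonomic difference concrete, here is a sketch contrasting `strncpy` with a stand-in that has strlcpy-like semantics. `my_strlcpy` and `strncpy_leaves_unterminated` are names made up for this example; the stand-in is not the BSD implementation, only an approximation of its documented behaviour (always terminate, return the source length).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy-like semantics: copies at most
   size-1 bytes, always terminates when size > 0, and returns
   strlen(src) so the caller can detect truncation. */
static size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Returns 1 if strncpy left the buffer without a terminating null byte. */
static int strncpy_leaves_unterminated(void)
{
    char buf[4] = { 'x', 'x', 'x', 'x' };
    strncpy(buf, "abcdef", sizeof buf);   /* source is longer than buf */
    return memchr(buf, '\0', sizeof buf) == NULL;
}
```

A caller can test for truncation with `my_strlcpy(dst, src, sizeof dst) >= sizeof dst`.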
        
           | rseacord wrote:
           | There were a number of recent proposals to adopt various
           | POSIX functions by Martin Sebor into C including:
           | N2353 2019/03/17 Sebor, Add strdup and strndup to C2X
           | N2352 2019/03/17 Sebor, Add stpcpy, and stpncpy to C2X
           | N2351 2019/03/17 Sebor, Add strnlen to C2X
           | 
           | He is lurking on this thread as well. These proposals can all
           | be found in the document log at
           | http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...
        
             | rseacord wrote:
             | The results (from the minutes
             | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2377.pdf)
             | 
             | 6.33 Sebor, Add strnlen to C2X [N 2351] Result: No
             | consensus on putting N2351 into C2X.
             | 
             | 6.34 Sebor, Add stpcpy, and stpncpy to C2X [N 2352] Result:
             | No consensus to put N2352 into C2X.
             | 
             | 6.35 Sebor, Add strdup and strndup to C2X [N 2353] Result:
             | N2353 be put into C2X. The committee wants a proposal for
             | the wide character versions of any POSIX functions voted in
             | this meeting.
        
             | rmind wrote:
             | There have been some disagreements on strlcpy/strlcat (BSD
             | vs glibc crowd), although by now the debate has died off
             | and these functions are pretty widely used. Also, while
             | here, it would be lovely to have strchrnul() included.
        
               | loeg wrote:
               | glibc still refuses to add the functions because they are
               | not required by a standard.
        
       | orsenthil wrote:
       | Why is the learning curve for C still so high?
       | 
       | * Why can't the learning curve be solved using tools?
       | 
       | * Why don't we actively promote more higher-level languages
       | which are implemented in C (by fewer people)?
        
         | NickDunn wrote:
         | I think that C provides fewer layers of abstraction than other
         | languages. This requires the programmer to deal with memory
         | management, treat strings as an array of characters, and other
         | things that the majority of high-level languages conceptualise,
         | so that the human mind deals with it more easily. This provides
         | advantages and disadvantages, as it requires more thought and
         | understanding to write the code but also allows the use of low-
         | level features. The lack of tools to solve any learning issues
         | is probably down to the programmer needing the right conceptual
         | understanding and the requirements placed on anyone using the
         | various features of the language.
        
         | Tronic2 wrote:
         | Syntax of pointers. Easy to use high level languages make
         | extensive use of pointers (i.e. all their variables are
         | actually pointers) but beginners cope with them because no
         | stars or ampersands are required, with the help of GC. Of
         | course they'll get bitten soon and often because it is too easy
         | to create copies of pointers rather than copies of full data
         | structures, and without understanding pointers it's hard to
         | grasp why that happens.
        
         | bumblebritches5 wrote:
         | I taught myself C from just reading code and trying to
         | contribute to a few projects right out of high school, no
         | books, no school.
         | 
         | So I don't think C has a very high learning curve, C++ on the
         | other hand...
        
         | pantalaimon wrote:
         | Do you find the learning curve for C to be high? I find it
         | quite the opposite. It's a simple language with only a few
         | concepts to learn, once you got those, that's it. There might
         | be some preprocessor tricks you'll pick up later, but the base
         | language and library is pretty comprehensive IMHO.
        
           | throw_m239339 wrote:
           | > It's a simple language with only a few concepts to learn
           | 
           | I mean by that logic, Assembly could be deemed even simpler,
           | yet writing OR reading programs in Assembly is absolutely not
           | simple at all.
           | 
           | At the end of the day, one has to write programs that solve
           | (complicated) problems, and learning how to do that in C is
           | difficult, thus the learning curve is deemed higher when it
           | comes to writing professional C.
           | 
           | I can guarantee you that writing professional Go or Java and
           | writing correct programs in both takes way less effort than
           | with C, for use cases that would make Go or Java viable.
        
             | quelsolaar wrote:
             | Modern assembly languages have huge instruction sets,
             | which make them hard to learn, but the concept is still
             | easy to learn.
        
               | DougGwyn wrote:
               | Many antique computers are simulated by SIMH. If you have
               | the corresponding software, you can operate on your
               | desktop a simulated computer's software development
               | system. For example, DEC VAX (VMS or Unix) has a
               | relatively simple and sane assembly language.
        
               | quelsolaar wrote:
               | I think, learning a tiny bit of assembler, even if in an
               | emulator, is very valuable to teach the basics.
        
           | orsenthil wrote:
           | C is indeed a very small language. But the expressive power
           | of C for real-world problems brings a huge learning curve in
           | terms of organization, tracking, and understanding.
        
           | darepublic wrote:
           | Coming from python/js I found it to be high. Mostly because
           | of the memory management/ making sure I call free correctly
           | etc. In many cases where I would plow ahead in programming,
           | with C I had to stop and would feel dread. A lifesaver for me
           | was using C/C++ Repl environments where I could quickly
           | prototype or sanity check things I was doing.
        
             | pantalaimon wrote:
             | The trick is to just not use `malloc()` and `free()` unless
             | absolutely necessary ;)
        
               | throw_m239339 wrote:
               | > The trick is to just not use `malloc()` and `free()`
               | unless absolutely necessary ;)
               | 
               | The problem is that often C programmers have to deal with
               | API and libraries they didn't write themselves to solve
               | their problems, thus are forced to use constructors and
               | destructors even when they don't want to.
        
       | rwmj wrote:
       | Not a question, a request: Please make __attribute__((cleanup))
       | or the equivalent feature part of the next C standard.
       | 
       | It's used by a lot of current software in Linux, notably systemd
       | and glib2. It solves a major headache with C error handling
       | elegantly. Most compilers already support it internally (since
       | it's required by C++). It has predictable effects, and no impact
       | on performance when not used. It cannot be implemented without
       | help from the compiler.
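For readers who have not used the extension, a minimal sketch of `__attribute__((cleanup))` as supported by GCC and Clang. The handler `free_charp`, the function `use_buffer`, and the `frees_seen` counter are invented for this example; the counter exists only to make the effect observable.

```c
#include <stdlib.h>

static int frees_seen = 0;   /* counts handler invocations */

/* Cleanup handler: receives a pointer to the annotated variable. */
static void free_charp(char **p)
{
    free(*p);                /* free(NULL) is a no-op, so failure paths are safe */
    frees_seen++;
}

static int use_buffer(int fail_early)
{
    __attribute__((cleanup(free_charp))) char *buf = malloc(64);
    if (buf == NULL || fail_early)
        return -1;           /* the handler runs on this return path ... */
    buf[0] = '\0';
    return 0;                /* ... and on this one */
}
```

The handler runs whenever `buf` goes out of scope, so no `goto out` ladder or per-path `free` call is needed.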
        
         | rseacord wrote:
         | My idea was to add something like the GoLang defer statement to
         | C (as a function with some special compiler magic). The
         | following is an example of how such a function could be used to
         | cleanup allocated resources regardless of how a function
         | returned:                 int do_something(void) {         FILE
         | *file1, *file2;         object_t *obj;         file1 =
         | fopen("a_file", "w");         if (file1 == NULL) {
         | return -1;         }         defer(fclose, file1);
         | file2 = fopen("another_file", "w");         if (file2 == NULL)
         | {           return -1;         }         defer(fclose, file2);
         | obj = malloc(sizeof(object_t));         if (obj == NULL) {
         | return -1;         }         // Operate on allocated resources
         | // Clean up everything         free(obj);  // this could be
         | deferred too, I suppose, for symmetry                 return 0;
         | }
        
           | eqvinox wrote:
           | Cleanup on function return is not enough, it needs to be
           | scope exit. We're using this for privilege raising/dropping
           | (example posted above) and also mutex acquisition/release.
           | Both of these really "want" it on the scope level.
        
           | rwmj wrote:
           | Golang gets this wrong. It should be scope-level not
           | function-level (or perhaps there should be two different
           | types, but I have never personally had a need for a function-
           | level cleanup).
           | 
           | Edit: Also please review how attribute cleanup is used by
           | existing C code before jumping into proposals. If something
           | is added to C2x which is inconsistent with what existing code
           | is already doing widely, then it's no help to anyone.
        
             | rseacord wrote:
             | Yes, we have discussed adding this feature at scope level.
             | A not entirely serious proposal was to implement it as
             | follows:
             | 
             |     #define DEFER(a, b, c)  \
             |       for (bool _flag = true; _flag; _flag = false) \
             |         for (a; _flag && (b); c, _flag = false)
             | 
             |     int fun() {
             |       DEFER(FILE *f1 = fopen(...), (NULL != f1), mfclose(f1)) {
             |         DEFER(FILE *f2 = fopen(...), (NULL != f2), mfclose(f2)) {
             |           DEFER(FILE *f3 = fopen(...), (NULL != f3), mfclose(f3)) {
             |             ... do something ...
             |           }
             |         }
             |       }
             |     }
             | 
             | We are also looking at the attribute cleanup. Sounds like
             | you should be involved in developing this proposal?
        
               | rwmj wrote:
               | Yes, I'll ask around in Red Hat too, see if we can get
               | some help with this.
        
         | AaronBallman wrote:
         | Funny you should mention that, as that feature has come up
         | recently in mailing list discussions. We have not seen an
         | actual proposal for adopting it yet, but features with similar
         | semantics are being discussed as a possible idea (no promises).
         | 
         | FWIW, I don't think it would wind up being spelled with
         | attribute syntax because we would likely want programmers to
         | have a guarantee that the cleanup will happen (and attributes
         | can be ignored by the implementation).
        
           | rwmj wrote:
           | I believe the last proposal was in 2008 (ignore the
           | try..finally stuff here):
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1298.pdf
           | 
           | So I guess it needs someone to take that and update it, also
           | to pull up a full list of current Linux software which is
           | using this feature (which as I say these days is a surprising
           | amount).
        
             | eqvinox wrote:
             | Here's our usage:
             | https://github.com/FRRouting/frr/blob/master/lib/privs.h#L14...
             | 
             |     #define frr_with_privs(privs)                              \
             |       for (struct zebra_privs_t *_once = NULL,                 \
             |                *_privs __attribute__(                          \
             |                    (unused, cleanup(_zprivs_lower))) =         \
             |                    _zprivs_raise(privs, __func__);             \
             |            _once == NULL; _once = (void *)1)
             | 
             | This gives us a block construct that guarantees elevated
             | privileges are dropped when the block is done:
             | 
             |     frr_with_privs(privs) {
             |       ... whatever ...
             |       break;  /* exit block, drop privileges */
             |       return; /* return, drop privileges */
             |     }
        
               | rwmj wrote:
               | We have a nice macro for acquiring locks that only
               | applies to the scope:
               | 
               | https://github.com/libguestfs/nbdkit/blob/e58d28d65bfea3af36...
               | 
               | You end up with code like this:
               | 
               | https://github.com/libguestfs/nbdkit/blob/e58d28d65bfea3af36...
               | 
               | It's so useful to be able to be sure the lock is released
               | on all return paths. Also because it's scope-level you
               | can scope your locks tightly to where they are needed.
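The linked nbdkit macro is not reproduced here. As a rough sketch of the same idea, the following uses a toy lock instead of a real mutex so that it has no threading dependency. Every name in it (`toy_lock`, `SCOPED_LOCK`, and the helpers) is hypothetical, and it assumes the GCC/Clang cleanup attribute.

```c
/* Toy lock for illustration only; a real program would use a mutex. */
struct toy_lock { int held; };

static void toy_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_release(struct toy_lock *l) { l->held = 0; }

/* Acquires the lock and returns it so it can initialise the guard. */
static struct toy_lock *toy_lock_guard(struct toy_lock *l)
{
    toy_acquire(l);
    return l;
}

/* Cleanup handler: called with the address of the guard variable. */
static void toy_unlock_guard(struct toy_lock **guard)
{
    if (*guard)
        toy_release(*guard);
}

/* Holds the lock until the end of the enclosing scope. */
#define SCOPED_LOCK(l) \
    __attribute__((unused, cleanup(toy_unlock_guard))) \
    struct toy_lock *scoped_guard_ = toy_lock_guard(l)

/* Returns 10 when the lock was held inside the block and released after it. */
static int held_inside_then_outside(struct toy_lock *l)
{
    int inside;
    {
        SCOPED_LOCK(l);
        inside = l->held;           /* lock is held within this block */
    }                               /* guard leaves scope: lock released */
    return inside * 10 + l->held;
}
```

Because the release is tied to the block rather than the function, the critical section can be as narrow as a single pair of braces.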
        
             | loeg wrote:
             | We use it extensively in our proprietary codebases as well,
             | FWIW. Not real open data for me to point to, but: a few
             | million lines of C, and a handful of billion USD in
             | revenue. If that helps weigh in on "yes, please standardize
             | this common practice."
        
           | eqvinox wrote:
           | Hopefully it'd at least be syntactically similar, so we can
           | have an
           | 
           |     #ifdef __STDC_CLEANUP__
           |     #define my_cleanup(func) stdc_cleanup(func)
           |     #else
           |     #define my_cleanup(func) __attribute__((cleanup(func)))
           |     #endif
           | 
           | i.e. it would require that it at least goes in the same
           | places as an attribute.
        
       | neop1x wrote:
       | In my opinion C is good as it is. C++ is a terribly complicated
       | mess, always has been, and adding more and more "modern"
       | functionality isn't helping it much. There are great standard
       | functions, e.g. for strings in C, whereas it is often very
       | inconvenient or complicated to do simple things like uppercasing
       | a string in C++. I always ended up basically using C with just
       | basic OOP functionality from C++. But I am not writing in C/C++
       | daily so my opinion is not very important...
        
       | eska wrote:
       | There's a compiler attribute in GCC to promise that a function is
       | pure, i.e. free from side effects and only uses its inputs.
       | 
       | This is useful for parallel computations, optimizations and
       | readability, e.g.
       | 
       |     sum += f(2);
       |     sum += f(2);
       | 
       | can be optimized to
       | 
       |     x = f(2);
       |     sum += x;
       |     sum += x;
       | 
       | Would the current motto of the consortium forbid adding a feature
       | such as marking a function as pure, that would not just promise,
       | but also enforce that no side effects are caused (only local
       | reads/writes, only pure functions may be called), and no inputs
       | except for the function arguments are used?
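For reference, GCC and Clang provide two related function attributes, and neither is checked by the compiler. `const` promises that the result depends only on the arguments, while `pure` additionally allows reading (but not writing) memory. A small sketch, where `f`, `scaled`, `table_scale`, and `sum_twice` are names chosen for this example:

```c
/* const: result depends only on the argument values. */
__attribute__((const)) static int f(int x) { return x * x + 1; }

static int table_scale = 3;

/* pure: no side effects, but may read global state. */
__attribute__((pure)) static int scaled(int x) { return x * table_scale; }

static int sum_twice(void)
{
    int sum = 0;
    sum += f(2);   /* the compiler is allowed to evaluate f(2) once ... */
    sum += f(2);   /* ... and reuse the value here */
    return sum;
}
```

The attributes are promises from the programmer, which is the gap the question is asking the standard to close.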
        
         | oh_sigh wrote:
         | sum = 2*f(2) seems nicer than having sum= twice.
         | 
         | If you were enforcing this with the compiler, you would also
         | need something that would suppress the enforcing, because the
         | millions of pre-existing functions would probably not get an
         | updated attribute marking it as pure. And once you do that, the
         | compiler can't really trust anything that function does,
         | because it may actually be calling a non-pure function.
        
         | pascal_cuoq wrote:
         | If you wrote down your proposal, which the C committee member
         | Robert Seacord is encouraging you to do here:
         | https://news.ycombinator.com/item?id=22870210 , you would have
         | to think carefully about functions that are pure according to
         | your definition (free from side effects and only uses its
         | inputs) but do not terminate for some inputs.
         | 
         | There is at least one incorrect optimization present in Clang
         | because of this (function that has no side-effects detected as
         | pure, and call to that function omitted from a caller on this
         | basis, when in fact the function may not terminate).
        
           | temac wrote:
           | I thought the compiler was free to pretend loops without side
           | effects always terminate, and in that sense it is already a
           | "correct" optimization? Or is it only for C++, I'm not sure?
        
             | pascal_cuoq wrote:
             | That may be the case in C++, but in C infinite loops are
             | allowed as long as the controlling condition is a constant
             | expression (making it clear that the developer intends an
             | infinite loop). These infinite loops without side-effects
             | are even useful from time to time in embedded software, so
             | it was natural for the committee to allow them:
             | https://port70.net/~nsz/c/c11/n1570.html#6.8.5p6
             | 
             | And you now have all the details of the Clang bug, by the
             | way: write an infinite loop without side-effects in a C
             | function, then call the function from another C function,
             | without using its result.
        
         | kazinator wrote:
         | No enforcing! This is useful even when it's, strictly speaking,
         | a lie.
         | 
         | Suppose I want to add some debug tracing into f():
         | 
         |     f.c: 42: f entered
         |     f.c: 43: returning 2
         | 
         | that's a side effect, right? But now the pure attribute tells a
         | lie. Never mind though; I don't care that some calls to f are
         | "wrongly" optimized away; I want the tracing for the ones that
         | aren't.
         | 
         | In C++ there are similar situations involving temporary
         | objects: there is a freedom to elide temporary objects even if
         | the constructors and destructors have effects.
         | 
         | Even a perfectly pure function can have a side effect, namely
         | this one: triggering a debugger to stop on a breakpoint set in
         | that function!
         | 
         | If a call to f(2) is elided from some code, then that code will
         | no longer hit the breakpoint set on f.
         | 
         | Side effect is all P.O.V. based: to declare something to be
         | effect-free in a conventional digital machine, you have to
         | first categorize certain effects as not counting.
        
           | gbear605 wrote:
           | Just offer a -Wpure flag for checking if functions are pure.
           | That way production/test releases can check while you can
           | still use it for debugging.
           | 
           | Also, the problem with eliding breakpoints already exists
           | afaik, since the compilers already check for pure functions.
        
       | BeeOnRope wrote:
       | When deciding on the behavior of some operation that maps to
       | hardware [1], how do you weigh the existing hardware behaviors?
       | 
       | For example, if all past, current and contemplated hardware
       | behaves in the same way, I assume that the standard will simply
       | enshrine this behavior.
       | 
       | However, what if 99% of hardware behaves one way and 1% another?
       | Do you set the behavior to "undefined" to accommodate the 1%? At
       | what point do you decide that the minority is too small and
       | you'll enshrine the majority behavior even though it
       | disadvantages minority hardware?
       | 
       | ---
       | 
       | [1] Famous examples include things like bit shift and integer
       | overflow behavior.
        
         | rseacord wrote:
         | I would say that the committee does pay attention to hardware
         | variations, even when there are no examples of existing
         | hardware that implement a feature (for example, a trap
         | representation for integers other than _Bool). Some of the
         | thinking is that "if it was ever implemented in hardware, it
         | could be again". I'm not crazy about this thinking, and I
         | largely think that language features for which there are no
         | existing hardware implementations should be eliminated and then
         | brought back if needed. However, the C Committee is much
         | smaller than the C++ committee so there is a labor shortage.
         | More people getting involved would certainly help.
         | 
         | We have dropped support for sign and magnitude and one's
         | complement architectures from C2x (a decision Doug Gwyn does
         | not agree with). There was some concern that Unisys may still
         | use a one's complement architecture, but that this may only be
         | in emulation nowadays.
        
           | rseacord wrote:
           | Some examples of hardware variation (since you mentioned
           | shifting and overflow):
           | 
           | - when signed integer overflow or division by zero occurs, a
           | division instruction traps on x86, while it silently produces
           | an undefined result on PowerPC;
           | 
           | - left-shifting a 32-bit one by 32 bits yields 0 on ARM and
           | PowerPC, but 1 on x86;
           | 
           | - left-shifting a 32-bit one by 64 bits yields 0 on ARM, but 1
           | on x86 and PowerPC.
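Because a shift count greater than or equal to the operand width is undefined behavior for the built-in operator, portable code that needs one particular result has to ask for it explicitly. A minimal sketch, where `shl32` is a hypothetical helper that picks "oversized shifts yield 0":

```c
#include <stdint.h>

/* Left shift with a defined result for every count: counts of 32 or
   more yield 0 instead of invoking undefined behavior. */
static uint32_t shl32(uint32_t value, unsigned count)
{
    return count < 32 ? value << count : 0;
}
```

The extra comparison is the per-operation overhead the standard avoids imposing on every shift.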
        
             | BeeOnRope wrote:
             | On x86 it's actually mixed: scalar shifts behave as you
             | describe, but vectorised logical shifts flush to zero when
             | the shift amount is greater than the element size!
             | 
             | So x86 actually has both behaviors in one box (three
             | behaviors if you count the 32-bit and 64-bit scalar things
             | you mentioned separately).
             | 
             | This is an example of where UB for simple operations
             | actually helps even on a single hardware platform: it
             | allows efficient vectorization.
        
         | loeg wrote:
         | A good example might be 1's complement signed integers. They
         | were dead weight in the standard for a long time.
        
           | BeeOnRope wrote:
           | Yes, but that is a slightly different question: how long do
           | you keep something in the standard after all the relevant
           | hardware has disappeared, e.g., is there a framework for
           | periodically re-evaluating decisions in light of the changing
           | hardware landscape?
           | 
           | My question was more about when behavior is being defined for
           | the first time, which admittedly doesn't happen that often
           | (but it could apply, e.g., when things like the fixed-width
           | integer types, uintX_t and friends, were introduced).
        
             | DougGwyn wrote:
             | Original standard feature specifications were not meant to
             | obtain a 1-to-1 map from C onto hardware, but we used
             | practical experience to judge what overhead was acceptable
             | for the kinds of processors we had seen or thought were
             | reasonable choices that the architects might make in the
             | not too distant future. If a frequently-executed action had
             | to (for example) check for a special condition every time,
             | the overhead might increase by several percent, depending
             | on the instruction set architecture. So quite often we
             | argued that "if the programmer wants to test for that
             | condition, he can do so, but typically it is a waste of
             | cycles". There are a lot of such trade-offs; maybe we
             | should write a paper or book on this topic.
        
       | oreally wrote:
       | About time someone advocated for code in lower level styles of
       | programming. Hope it goes well!
       | 
       | Anyway, here's some questions:
       | 
       | - What kind of programs would you say C is a good fit for?
       | 
       | - There is some catching up to do for C. Is there a roadmap for C
       | improvement, or even a recommendation of C++ things that fit
       | somewhat in the style/philosophy of C? For example, I'd recommend
       | not using the C++ smart pointers stuff, while still using C++
       | threads and lambdas.
       | 
       | Also, you should include programmers from other fields in your
       | committee. Game (engine) developers, HFT programmers are used to
       | lower level styles of coding and align with your perspective.
        
       | 0xDEEPFAC wrote:
       | Dear god, is the precedence of the "&" operator ever going to be
       | fixed?
        
         | rseacord wrote:
         | I can't imagine it will ever be changed, since this would be a
         | breaking change to the language.
        
           | 0xDEEPFAC wrote:
           | I disagree that this would be a "breaking" change as many
           | people have already resorted to using extra () and such a
           | change might actually "fix" broken code which makes the
           | reasonable assumption that things like == have a higher-
           | order.
           | 
           | https://ericlippert.com/2020/02/27/hundred-year-mistakes/
           | 
           | int x = 0, y = 1, z = 0;
           | 
           | int r = (x & y) == z; // 1
           | 
           | int s = x & (y == z); // 0
           | 
           | int t = x & y == z; // 0 UGH
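The same trap in its most common form, a flag test, sketched with a made-up `FLAG_READY` mask and two hypothetical helper functions. Compilers typically warn about the first version.

```c
#define FLAG_READY 0x4u

/* Intended: "is the READY bit clear?" Actually parsed as
   flags & (FLAG_READY == 0), which is flags & 0, so always 0. */
static int ready_clear_wrong(unsigned flags)
{
    return flags & FLAG_READY == 0;
}

/* Parenthesised version: masks first, then compares. */
static int ready_clear_right(unsigned flags)
{
    return (flags & FLAG_READY) == 0;
}
```

With `flags == 0` the first function reports that the bit is not clear, while the second gives the intended answer.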
        
             | DougGwyn wrote:
             | If you're using parentheses, as has been recommended for
             | decades, there is no problem. Otherwise, it is likely that
             | such a change would adversely impact previously working
             | code. There just isn't a pressing need to change it.
        
               | 0xDEEPFAC wrote:
               | Besides the fact that it's unintuitive and could lead to
               | low-level or hard-to-find bugs?
               | 
               | It seems to me that C would benefit greatly from ironing
               | out its many inconsistencies, and this is exactly the
               | kind of thing people expect in new revisions of the
               | language.
               | 
               | Also, I don't see how it would impact previously working
               | code when compilers already do things like allow
               | selection between versions of the language a la C99, C2x,
               | etc. Users could just avoid the new version if they don't
               | feel like changing.
        
               | DougGwyn wrote:
               | I don't think most users of C want things changing
               | underfoot. Keeping track of all the version combinations
               | is infeasible, especially when you consider that an app
               | and its library packages are likely to have been
               | developed and tested for a variety of environments. To
               | the extent that existing correct code has to be scanned
               | and revised when a new compiler release comes out, one of
               | the primary goals of standardization has failed.
        
               | 0xDEEPFAC wrote:
               | I disagree with your view of standardization - as
               | restricting changes to be additions to the runtime seems
               | pointless as users could easily use other (often more
               | optimized) libraries.
               | 
               | But, I do see the benefit of having a language "frozen in
               | time" which never really changes and can be mastered
               | painlessly without having to refresh on new versions.
               | Perhaps C is special/sacred in this regard.
        
       | [deleted]
        
       | watergatorman wrote:
       | Some random thoughts:
       | 
       | I appreciate the original simplicity of K & R, "The C Programming
       | Language", 2nd Edition, and the relatively simple semantics of
       | ANSI C89/ISO C90 compared to C99 and later.
       | 
       | You don't need complex parsing methods for ANSI C89/ISO C90 and
       | you do not need the "lexer hack" to handle the typedef-name
       | versus other "ordinary identifier" ambiguity.
       | 
       | A surprising number of colleges still teach K & R 2nd Edition C.
       | 
       | Whenever someone brags about using recursive-descent parsing
       | methods, I always ask, are they using predictive, top-down
       | parsing, or back-tracking?
       | 
       | I hope C never loses sight of its roots nor morphs into C++
       | under the guise of creating a common subset, but which is really
       | a disguised superset of C and C++.
       | 
       | Please prevent the ever increasing demand for new features from
       | overwhelming C's simplicity so it can no longer be parsed with
       | simple methods.
        
       | zabana wrote:
       | Is it worth it to learn C in 2020? Will it still be a prominent
       | language for systems programming in the future?
        
         | rseacord wrote:
         | C also has renewed interest around IoT programming and mobile
         | devices
        
         | wolf550e wrote:
         | I believe C will continue to be used as lingua franca after no
         | one uses it to write software, and we're decades from even that
         | point.
         | 
         | You need to know enough C to interface with the OS, and enough
         | C to talk about memory layout, memory management, dynamic
         | libraries, ABI, etc.
         | 
         | Most higher language runtimes need C, even with a self hosting
         | compiler. Not being able to work on the C parts is limiting.
         | 
         | You also need to know enough assembly to be able to understand
         | what the compiler did with your own code, even if you never
         | write assembly yourself. Not being able to compare the
         | disassembly to the high level language to understand why it
         | doesn't work (or is an order of magnitude slower than expected)
         | is limiting.
        
         | stephencanon wrote:
         | Yes.
         | 
         | - Languages like Rust will gain more mindshare over the next
         | decade, and be used in more and more new projects, but there
         | are billions of lines of existing code in C, and those aren't
         | going away.
         | 
         | - Hardware architects, for better or worse, largely think about
         | software in terms of [a somewhat dated and idealized mental
         | model of] C. So if you want to be able to converse with
         | architects (which anyone doing systems programming should want
         | to do), you need to have some basic fluency with C.
        
       | wcarey wrote:
       | I'm teaching C to high schoolers as their first language, which
       | is quite the adventure. Do you have any good advice or resources
       | on how to introduce the way C treats the function stack and heap
       | allocated memory? Most of my students struggle (naturally) with
       | making sense of function scoped identifiers and pass-by-value
       | semantics.
        
         | pascal_cuoq wrote:
         | This service has been designed to try out small self-contained
         | C examples online (in a manner reminiscent of Compiler
         | Explorer):
         | 
         | https://taas.trust-in-soft.com/tsnippet/
         | 
         | One advantage is that it identifies a LOT of undefined
         | behaviors during execution for which traditional compilation
         | and execution only give puzzling results.
         | 
         | One drawback is that some of the undefined behaviors it
         | identifies are obscure, and for others the message may be
         | unusual. For instance, using a standard function without
         | including the appropriate header may result in a warning about
         | the mismatch between the type in the header and the type of the
         | arguments the standard function was applied to after argument
         | promotion.
         | 
         | Overall, you may still find it useful for teaching.
        
           | wcarey wrote:
           | Thanks! Definitely an interesting tool. Two of my students
           | are fascinated by the idea of undefined behavior right now
           | (having run into it in practice; the idea that off-by-one
           | errors sometimes crash their program and sometimes behave
           | "normally" is really odd to them), so I'll point them at this
           | to play with.
        
         | imglorp wrote:
         | Curious what were the requirements to select C as a first high
         | school language over many other choices? I imagine there's a
         | balance of practicality (after the class), and then the usual
         | questions about tooling, sharp edges, and ease of learning.
        
           | wcarey wrote:
           | It's a three year rotation: Python, C (Unix), C (Arduino). My
           | goal with the class is to teach ideas that will stand the
           | test of time. C (and Unix) certainly fit that bill.
           | 
           | Happily, the tooling is the easiest part. Every student has a
           | Raspberry Pi running Debian, no mouse, no window server, and
           | no extraneous software. You can spool kids up on a nano-based
           | C toolchain in one class period with remarkably few sharp
           | edges. There's even some fun accidental learning the first
           | time they nano their executable file.
        
         | jayp1418 wrote:
         | Have you given the Ada language a thought? Also, there are a
         | lot of competitions your students can take part in:
         | https://www.makewithada.org/
        
           | wcarey wrote:
           | I haven't - is there a good tool chain you'd recommend I
           | check out? What's the enduring idea in Ada?
        
         | DougGwyn wrote:
         | Everybody seems to draw pictures of the raw memory (word-
         | oriented) data.
        
           | wcarey wrote:
           | I've been doing the same! It certainly helps for strings.
           | Pointer block diagrams (like K&R use) seem to help too.
           | Mostly what melts their brains is the idea that an identifier
           | can be "in two places at once" - for example, you can have a
           | variable x declared in some scope and a function one of whose
           | arguments is named x, and those are two different things.
        
             | DougGwyn wrote:
             | Try explaining the concept of "scope", starting with nested
             | blocks. It does require some practice. I suggest not
             | unnecessarily reusing identifiers associated with different
             | objects.
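
A small program can make the "two different things named x" point concrete. The sketch below is only an illustration; the function names are made up and are not taken from the thread or from the book. It shows a parameter that is a copy of its argument, a pointer parameter that reaches the caller's object, and a nested block whose x hides the outer x.

```c
#include <stdio.h>

/* The parameter x is its own object: it receives a copy of the argument. */
void set_to_zero(int x) {
    x = 0;                                   /* changes only the copy */
    printf("inside set_to_zero: x = %d\n", x);
}

/* To change the caller's object, pass its address instead. */
void set_to_zero_via_pointer(int *x) {
    *x = 0;                                  /* writes through to the caller's object */
}

/* Nested blocks: the inner x hides the outer x until the block ends. */
int shadow_demo(void) {
    int x = 1;
    {
        int x = 2;                           /* a different object with the same name */
        x += 10;                             /* modifies the inner x only */
        printf("inner x = %d\n", x);
    }
    return x;                                /* the outer x was never touched */
}
```

Calling set_to_zero(x) leaves the caller's x unchanged, while set_to_zero_via_pointer(&x) sets it to 0. Drawing one box per object, rather than one box per name, tends to match what the code does.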
        
               | wcarey wrote:
               | Thanks!
        
       | quelsolaar wrote:
       | A few proposals:
       | 
       | Why not mandate a warning every time the compiler detects and
       | makes use of UB? It would solve SO many issues. If you are
       | looking to improve security of C programs, then letting the user
       | know what the compiler does should be number one.
       | 
       | Converting as many UBs as possible to platform-specific behavior
       | would also be a big help.
       | 
       | I would love to see native vector types. It's time. Vector types
       | are now more common in hardware than float was when it was
       | included in the C spec. Time to make it a native type. Hoping the
       | compiler does the vectorization for you is not good enough.
       | 
       | Allow for more than one break.
       | 
       |     for(i = 0; i < n; i++)
       |         for(j = 0; j < n; j++)
       |             if(array[i][j] == x)
       |                 break break;
       | 
       | is equal to:
       | 
       |     for(i = 0; i < n; i++)
       |         for(j = 0; j < n; j++)
       |             if(array[i][j] == x)
       |                 goto found;
       |     found:
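
For reference, the goto form already works in standard C today. The following is a minimal sketch of it as a complete function; find_in_grid and the fixed size N are assumptions made for the example, not part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

#define N 3  /* grid size chosen for the example */

/* Search an N-by-N grid for x; on success store the position and return true. */
bool find_in_grid(int grid[N][N], int x, size_t *row, size_t *col) {
    size_t i, j;
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            if (grid[i][j] == x)
                goto found;     /* leaves both loops at once */
    return false;               /* both loops ran to completion: no match */

found:
    *row = i;
    *col = j;
    return true;
}
```

Keeping the search in its own function also allows a plain return from the inner loop, which is another common way to get the effect of a multi-level break.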
        
         | clarry wrote:
         | > Why not mandate a warning every time the compiler detects and
         | makes use of UB? It would solve SO many issues.
         | 
         | Because that's hardly ever what happens, except when it
         | actually does, and compilers do an increasingly good job of
         | issuing diagnostics in that case. If you actually mandated it,
         | no compiler today would come close to being standards
         | compliant. This comes close to making the language
         | unimplementable.
         | 
         | The most common issue with UB and optimizations is not that
         | "compiler detects UB and does something with it," it's that
         | compiler analyzes and optimizes code _with the assumption that
         | UB doesn't actually happen._ It doesn't know whether it does
         | (and in general, it is impossible to tell whether it would
         | happen -- it's something that might or might not happen at run
         | time, and proving it one way or another amounts to solving the
         | halting problem), it just assumes it doesn't.
         | 
         | And if one mandated compilers to report every time they make an
         | optimization that is valid under the assumption that the
         | program is well behaved, then you would never finish reading
         | compiler output. Or you would turn off optimizations.
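
A commonly cited shape of this situation is a pointer that is dereferenced before it is checked. The sketch below uses hypothetical function names; whether a given compiler actually drops the check depends on the compiler and its flags, so treat the comments as a description of what the standard permits, not of what any particular build does.

```c
#include <stddef.h>

/* The dereference comes first. If p were null, behavior would already be
 * undefined at that point, so an optimizer is permitted to assume p is
 * non-null afterwards and may treat the later check as always false. */
int read_then_check(int *p) {
    int value = *p;
    if (p == NULL)
        return -1;
    return value;
}

/* Checking before the dereference leaves nothing to assume:
 * the null case is handled by well-defined code. */
int check_then_read(int *p) {
    if (p == NULL)
        return -1;
    return *p;
}
```

In neither function does the compiler "detect UB". It only reasons about calls that stay within defined behavior, which is why there is usually nothing specific for it to warn about.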
        
           | quelsolaar wrote:
           | They need to do better than silently removing NULL checks.
           | You can read all about Linus's rants on this. Every time the
           | compiler breaks things they blame the C standard for letting
           | them do whatever. That's what's wrong with C today. The C
           | standard hasn't put its foot down.
        
             | clarry wrote:
             | I want my compiler to remove redundant checks (without any
             | noise), and that is why I pass it an optimization flag. If
             | you don't want such optimizations, then maybe you should
             | not ask the compiler to make them.
        
               | quelsolaar wrote:
               | This attitude is terrible! It's an attitude that says that
               | unless you know exactly every pitfall in the language by
               | heart you have no place writing code. I guess you don't
               | use a debugger either because you never write bugs, right?
               | And you think that every piece of software that helps the
               | user is for noobs, right?
               | 
               | There is an endless list of bugs that have been produced
               | by very competent C programmers, because the compiler has
               | silently removed things for some very shaky reasons.
        
               | clarry wrote:
               | Huh? I just want performant code. That's why I write C,
               | and that's why I use an optimizing compiler, and that's
               | why I ask my compiler to optimize.
               | 
               | I also want to write code that is reasonably generic.
               | Thus, it will have checks and branches that cover
               | important corner cases; they are required for
               | completeness and correctness. But very often, all of
               | these checks turn out to be redundant in a specific
               | context, and an optimizing compiler can figure it out,
               | and eliminate these checks for me.
               | 
               | So I don't manually need to go and write two or three
               | versions of each function like do_foo and
               | assume_x_is_not_null_and_do_foo and
               | assume_y_is_less_than_int_max_minus_sizeof_z_and_do_foo
               | and make damn sure not to call the wrong one.
               | 
               | I just write one version, with the right checks in place,
               | and if after macro expansion, inlining, range analysis,
               | common subexpression elimination, and other inference
               | from context, with C's semantics at hand, the compiler
               | can figure out that some of these checks are redundant,
               | then it will optimize them out.
               | 
               | I ask for it, and I'm glad compiler developers deliver
               | it. You don't need to ask for it. Just turn off these
               | optimizations (or, rather, don't enable them) if you
               | prefer slow and redundant code.
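
The "one generic version with the checks in place" idea can be sketched as follows. The names are invented for the example, and whether the check is actually eliminated depends on the compiler and optimization level.

```c
#include <stddef.h>

/* Generic helper: the NULL check is required for callers that may pass NULL. */
static inline int value_or(const int *p, int fallback) {
    if (p == NULL)
        return fallback;
    return *p;
}

/* The argument is the address of a local object, which can never be NULL.
 * After inlining, the compiler can see the check is redundant in this context. */
int caller_with_local(void) {
    int local = 5;
    return value_or(&local, -1);
}

/* Nothing is known about the argument here, so the check has to stay. */
int caller_with_unknown(const int *maybe_null) {
    return value_or(maybe_null, -1);
}
```

Only one definition of value_or is written and maintained; which copies of the check survive is decided per call site by the optimizer.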
        
       | [deleted]
        
       | faehnrich wrote:
       | I've been waiting for a book on C from No Starch Press, so I'm
       | really excited for this one.
       | 
       | This might not be too deep a question on the C language in
       | regards to this book, but I've been wondering, why did you decide
       | to have an eldritch horror as the book's cover?
        
         | rseacord wrote:
         | It's a longish story, but people do seem to like the cover. We
         | started equating the idea of C == Sea, so we had some early
         | drawings of the robot riding various undersea creatures
         | including a giant squid. I thought that looked overly phallic,
         | so I suggested the robot ride Cthulhu instead, an unofficial
         | mascot of NCC Group.
        
           | faehnrich wrote:
           | I like how Cthulhu is shown as kind of a guide for the robot.
           | 
           | The C==Sea brings to mind the book Expert C Programming: Deep
           | C Secrets.
        
             | beardedwizard wrote:
             | Deep C Secrets, a classic.
        
         | cptnapalm wrote:
         | Wait, they put Cthulhu on the cover of a programming book? I'm
         | buying it.
        
       | douglascorrea wrote:
       | I'm trying to learn C during these quarantine times. I'm looking
       | for good beginner-friendly open-source projects to learn from.
       | Can you please suggest some repositories to look into?
        
         | woodrowbarlow wrote:
         | (obviously, i'm not one of the panel members, just chiming in.)
         | 
         | if you're interested in looking at how C can be used in
         | embedded realtime operating systems, i recommend diving into:
         | 
         | https://github.com/ARMmbed/littlefs
         | 
         | (i'm not affiliated.)
         | 
         | it's a lean, logging flash filesystem implementation and i
         | recommend it because the research, rationales, documentation,
         | organization, codebase, test harness, and public API ergonomics
         | all impressed me a lot. it was written for the mbed OS, but it
         | is so well designed that i could integrate it into any realtime
         | OS without too much trouble. and the documentation is thorough
         | enough that after skimming the wikipedia article for
         | filesystems, and maybe an article on how flash chips read and
         | write data, you'll be able to work your way through it. i
         | learned a lot by reading through that repository.
        
       | begriffs wrote:
       | What does the presence or absence of __STDC_ISO_10646__ indicate
       | exactly? I found this part of the C99 spec obscure.
       | 
       | For instance, the macOS clang environment does not define this
       | symbol. Is their implementation of wchar_t or <wctype.h> lacking
       | some aspect of Unicode support?
        
         | AaronBallman wrote:
         | If that macro is defined, then wchar_t is able to represent
         | every character from the Unicode required character set with
         | the same value as the short identifier of that character. Which
         | version of Unicode is supported is determined by the date value
         | the macro expands to.
         | 
         | Clang defines that macro for some targets (like the Cloud ABI
         | target), but not others. I'm not certain why the macro is not
         | defined for macOS though (it might be worth a bug report to
         | LLVM, as this could be a simple oversight).
        
           | begriffs wrote:
           | Would the following be a correct way to determine whether
           | there's a problem?
           | 
           | * First call setlocale(LC_CTYPE, "en_US.UTF-8")
           | 
           | * Next feed the UTF-8 string representation of every Unicode
           | codepoint one at a time to mbstowcs() and ensure that the
           | output for each is a wchar_t string of length one
           | 
           | * If all input codepoints numerically match the output
           | wchar_t UTF-32 code units, then the implementation is
           | officially good, and should define __STDC_ISO_10646__?
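
The three steps above could be sketched roughly as below. This is an illustration under assumptions: utf8_encode and count_mismatches are names invented here, the locale name is passed in by the caller and may not exist on a given system, and U+0000 is skipped because mbstowcs treats it as the string terminator.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Encode one Unicode scalar value as a NUL-terminated UTF-8 string.
 * Returns the number of bytes (1 to 4), or 0 for surrogates and values
 * above U+10FFFF, which have no UTF-8 encoding. */
size_t utf8_encode(unsigned long cp, char out[5]) {
    size_t n;
    if (cp < 0x80UL) {
        out[0] = (char)cp;
        n = 1;
    } else if (cp < 0x800UL) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000UL) {
        if (cp >= 0xD800UL && cp <= 0xDFFFUL)
            return 0;
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp <= 0x10FFFFUL) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return 0;
    }
    out[n] = '\0';
    return n;
}

/* Count code points whose wchar_t value differs from the code point,
 * or that do not convert to exactly one wchar_t.
 * Returns -1 if the requested locale is not available.
 * Note: this changes the program's LC_CTYPE locale. */
long count_mismatches(const char *locale_name) {
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    long bad = 0;
    for (unsigned long cp = 1; cp <= 0x10FFFFUL; cp++) {
        char utf8[5];
        wchar_t wide[2];
        if (utf8_encode(cp, utf8) == 0)
            continue;  /* surrogates are skipped */
        if (mbstowcs(wide, utf8, 2) != 1 || (unsigned long)wide[0] != cp)
            bad++;
    }
    return bad;
}
```

A call such as count_mismatches("en_US.UTF-8") returning 0 would be consistent with the property the macro describes, though it is an experiment on one implementation rather than a conformance proof.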
        
             | AaronBallman wrote:
             | I think this is correct, assuming that locale is supported
             | by the implementation and wchar_t is wide enough, but I am
             | by no means an expert on character encodings.
        
             | rseacord wrote:
             | Should work provided your wchar_t type is at least 21 bits
             | wide.
        
       | SaxonRobber wrote:
       | can we get compile time constant variables? something cleaner
       | than enums and defines
        
       | jcranmer wrote:
       | Are there any plans to add support for multiple register return
       | values to C?
        
         | rseacord wrote:
         | None that I'm aware of.
        
       | Javantea_ wrote:
       | Do you think that static analysis is a valuable tool for security
       | research? Do you recommend static analysis software to a single
       | developer with a limited budget or an amateur?
        
         | msebor wrote:
         | Yes, both :) There are a few in public domain that might be
         | helpful to experiment with. Clang has had a static analyzer for
         | a while and GCC 10 adds one as well (and the maintainer is
         | looking for help with implementing checkers so that's a good
         | way to gain experience with writing one).
        
       | knz42 wrote:
       | What is the story behind the removal of VLAs from C99 in later
       | revisions?
        
         | AaronBallman wrote:
         | VLAs are still present in C17 and have not been removed. They
         | are, however, an optional feature with a truly weird (IMHO)
         | feature testing macro. If '__STDC_NO_VLA__' is defined to 1,
         | then the implementation does not support VLAs.
         | 
         | IIRC, this macro was added to C11 along with a batch of other
         | "these are optional" macros for atomics, complex, threads, etc.
         | However, I don't recall whether C99 adopted the features as
         | optional features and missed the feature testing macro, or if
         | they were required features in C99 that we made optional in
         | C11.
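
As a rough illustration of how the feature test macro is used, the sketch below picks between a VLA and heap storage. The function sum_indices is a made-up example; it assumes a C11 or later compiler, and a real program would also want to bound n before putting a VLA on the stack.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 0 + 1 + ... + (n - 1), using scratch storage only to show both paths.
 * Returns -1 if heap allocation fails on the non-VLA path. */
long sum_indices(size_t n) {
    if (n == 0)
        return 0;                       /* a zero-length VLA would be undefined */
#if defined(__STDC_NO_VLA__) && __STDC_NO_VLA__ == 1
    long *scratch = malloc(n * sizeof *scratch);   /* no VLA support */
    if (scratch == NULL)
        return -1;
#else
    long scratch[n];                    /* VLA: size chosen at run time */
#endif
    long total = 0;
    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)i;
    for (size_t i = 0; i < n; i++)
        total += scratch[i];
#if defined(__STDC_NO_VLA__) && __STDC_NO_VLA__ == 1
    free(scratch);
#endif
    return total;
}
```

The "weird" part is visible in the condition: support is signalled by the absence of a macro, so the test has to be phrased negatively.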
        
           | stephencanon wrote:
           | Complex and VLA were required by C99, but made optional in
           | C11. The others were new in C11.
        
         | wuxb wrote:
         | I can't live without it.
        
         | rseacord wrote:
         | So I spend a possibly unreasonable amount of time and page
         | space discussing VLAs in the Effective C book. I understand
         | there are some problems with them, but for what it is worth, I
         | really like the feature, particularly when used in function
         | prototype scope.
        
           | jedbrown wrote:
           | I usually don't let them leak into public interfaces, and
           | don't allocate VLAs, but really like VLA pointers for
           | multi-dimensional array processing such as [*]:
           | 
           |     double (*a)[N][P] = (double (*)[N][P])a_flat;
           |     for (i=0; i<M; i++)
           |       for (j=0; j<N; j++)
           |         for (k=0; k<P; k++)
           |           a[i][j][k] = f(i, j, k);
           | 
           | The alternative would be
           | 
           |     a_flat[(i*N+j)*P+k] = f(i, j, k);
           | 
           | which is a lot more error-prone. I understand that some
           | implementations (MSVC) declined to implement VLAs, but I
           | really wish that at least VLA-pointers could have remained a
           | mandatory part of C11 and later standards.
           | 
           | [*] Has there been any discussion of adding GCC's "typeof" to
           | the standard?
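
A self-contained version of this pattern might look like the sketch below. It assumes a compiler with VLA support; fill_3d and at_flat are names invented for the example, and the stored value i*100 + j*10 + k stands in for the f(i, j, k) of the comment.

```c
#include <stddef.h>

/* View a flat buffer of m*n*p doubles as a[m][n][p] through a pointer to a
 * variably modified array type; no VLA object is allocated. */
void fill_3d(size_t m, size_t n, size_t p, double *a_flat) {
    double (*a)[n][p] = (double (*)[n][p])a_flat;
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < p; k++)
                a[i][j][k] = (double)(i * 100 + j * 10 + k);
}

/* The same element addressed by hand: row-major order means the stride of
 * the first index is n*p and the stride of the second index is p. */
double at_flat(size_t n, size_t p, const double *a_flat,
               size_t i, size_t j, size_t k) {
    return a_flat[(i * n + j) * p + k];
}
```

Note that the outermost extent m does not appear in the manual index at all, which is one of the places where hand-written index arithmetic tends to go wrong.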
        
         | alerighi wrote:
         | They did not remove them, but made them optional.
         | 
         | It is a controversial feature that can produce bugs, and it is
         | banned in a lot of projects (a famous one being the Linux
         | kernel).
        
         | DougGwyn wrote:
         | What removal? C11 section 6.7.6.2 specifies the semantics.
        
           | cesarb wrote:
           | What the parent comment probably meant is that support for
           | VLA was required in C99, but is no longer required in C11, so
           | while code written for C99 could use VLAs without any special
           | consideration, code written for C11 cannot depend on VLAs
           | since it might not be present in all compilers.
        
       | cperciva wrote:
       | When will C gain a mechanism for "do not leave this sensitive
       | information laying around after this function returns"? We have
       | memset_s but that doesn't help when the compiler copies data into
       | registers or onto the stack.
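
For context, the memory-only wiping that portable code can do today is often written along the lines of the sketch below, for implementations without memset_s. The name wipe_buffer is an assumption for the example. As the question points out, this only clears the bytes it is pointed at; it does nothing about copies the compiler may have placed in registers or spilled elsewhere on the stack.

```c
#include <stddef.h>

/* Overwrite len bytes at buf with zeros through a volatile-qualified pointer,
 * a common idiom intended to keep the stores from being optimized away. */
void wipe_buffer(void *buf, size_t len) {
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```

Typical use is wipe_buffer(key, sizeof key) just before the buffer goes out of scope.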
        
         | pascal_cuoq wrote:
         | This is an entire language extension, as you note. The last
         | time various people interested in this were in the same room
         | (it was in January 2020 in a workgroup called HACS), what
         | emerged was that the Rust people would try to add the "secret"
         | keyword to the language first, since their language is still
         | more agile than C, while the LLVM people would prepare LLVM for
         | the arrival of at least one front-end that understands secret
         | data.
         | 
         | Is this enough to answer your question? I can look up the names
         | of the people that were involved and communicate them privately
         | if you are further interested.
        
           | loeg wrote:
           | (Not OP) I would appreciate any references you can provide.
           | An LLVM __attribute__((secret)) would be a great place to
           | start.
        
             | pascal_cuoq wrote:
             | Unfortunately I am out of useful information:
             | 
             | https://news.ycombinator.com/item?id=22868999
             | 
             | I hope someone will provide the next link.
        
           | stephencanon wrote:
           | Also worth noting that a language extension may not be
           | sufficient for all cases. E.g. the OS stores register state
           | on a context switch; do you also need a flag for the system
           | to zero any memory used for this purpose following the state
           | restore, or is it OK to trust that it won't leak through some
           | mechanism? For some applications, there may be contractual or
           | regulatory requirements to have an erasing mechanism for
           | copies like this as well.
        
             | cperciva wrote:
             | I want to use this in the OS kernel too. ;-)
        
           | cperciva wrote:
           | Thanks for the update. I was encouraging some of the people
           | who were going to be at HACS to address this but I hadn't
           | heard the latest progress. Unfortunately I couldn't be there
           | myself.
        
             | pascal_cuoq wrote:
             | If I remember correctly, Chandler was the one writing down
             | the draft for LLVM developers to comment on LLVM-side.
             | Unfortunately, if you Google his name and the relevant
             | keywords, the results are full of his work on speculative
             | load hardening.
             | 
             | Someone who read the LLVM mailing-list attentively should
             | have seen it and may have a link.
        
       | dpipemazo wrote:
       | One of my favorite features recently while developing C for
       | embedded systems has been the --wrap linker flag that allows me
       | to effectively test code that interacts with hardware without
       | modifying the source.
       | 
       | By passing -Wl,--wrap=some_function at link time with test code
       | we can then define
       | 
       |     __wrap_some_function
       | 
       | that will be called instead of some_function. Within
       | __wrap_some_function one can also call __real_some_function which
       | will resolve to the original version if you still want to call
       | the original one. This is especially useful if trying to observe
       | certain function calls in tests that interact with hardware.
       | 
       | Do you have any other recommendations/preferences to help with
       | unit-testing C code?
        
       | freemind wrote:
       | Do you think object oriented languages are better than C to
       | develop GUI-based cross-platform programs?
       | 
       | The licenses of the majority of third-party libraries available
       | for C are GPL. Do you think this makes it harder to reuse code
       | in software you sell?
        
       | oscoder wrote:
       | A more chill question for you - What's your favourite string
       | library?
        
       | ebg13 wrote:
       | How accurate, relevant, and useful today is http://c-faq.com ?
        
       | lemaudit wrote:
       | Hi,
       | 
       | Do you think Annex K of C11 will be widely adopted by programmers
       | or unused? Why aren't people adopting it?
       | 
       | Do you see the use of any analysis tools that are particularly
       | effective for finding memory safety issues?
       | 
       | C++ added in smart pointers to its specification. Are there any
       | plans to do something similar in future C specifications?
       | 
       | Thanks!
        
         | AaronBallman wrote:
         | > Do you think Annex K of C11 will be widely adopted by
         | programmers or unused? Why aren't people adopting it?
         | 
         | So far, it's not been widely adopted. Part of the issue is that
         | there are specification issues relating to threads and the
         | constraint handlers, and part of the issue is that popular libc
         | implementations have actively resisted implementing the annex.
         | 
         | That said, I field questions about Annex K on a regular basis
         | and there are a few implementations in the wild, so there is
         | user interest in the functionality.
         | 
         | > Do you see the use of any analysis tools that are
         | particularly effective for finding memory safety issues?
         | 
         | <biased opinion>I think CodeSonar does a great job at finding
         | memory safety issues, but I work for the company that makes
         | this tool.</biased opinion>
         | 
         | I've also had good luck with the memory and address sanitizers
         | (https://github.com/google/sanitizers) and tools like valgrind.
         | 
         | > C++ added in smart pointers to its specification. Are there
         | any plans to do something similar in future C specifications?
         | 
         | We currently don't have any proposals for adding smart pointers
         | to C. Given that C does not have constructors or destructors,
         | we would have to devise some new mechanism to implement or
         | replace RAII in C, which would be one major hurdle to overcome
         | for smart pointers.
        
           | hedora wrote:
           | I've had good luck (in C++) replacing the underlying memory
           | allocator with one that tracks leaks by allocation type
           | (which is fast enough for production use).
           | 
           | This can be done in C, but the calling code has to spell
           | malloc and free differently.
           | 
           | In debug mode, configuring malloc to poison (and add fences)
           | on allocation and free finds most of the remaining things.
           | 
           | These techniques tend to have much lower runtime overhead
           | than valgrind (2-digit percentages vs 5-10x), so they can be
           | left on throughout testing and partially enabled in
           | production.
           | 
           | They find >90% of the memory bugs that I write (assuming
           | valgrind finds 100%). YMMV.
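
A very reduced sketch of the "spell malloc and free differently" approach is shown below. It is not the allocator described in the comment: the names are invented, it keeps a single global count instead of per-type statistics, it poisons only on allocation, and it is not thread-safe.

```c
#include <stdlib.h>
#include <string.h>

/* Number of blocks handed out by tracked_malloc and not yet freed. */
static size_t live_allocations;

/* Allocate size bytes, fill them with a recognizable poison pattern so that
 * reads of uninitialized memory stand out, and count the block as live. */
void *tracked_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {
        memset(p, 0xA5, size);
        live_allocations++;
    }
    return p;
}

/* Release a block obtained from tracked_malloc; NULL is ignored like free(). */
void tracked_free(void *p) {
    if (p == NULL)
        return;
    live_allocations--;
    free(p);
}

/* A non-zero value at program exit indicates leaked blocks. */
size_t tracked_live(void) {
    return live_allocations;
}
```

Poisoning on free and adding fences around each block would require storing the block size in a small header, which is omitted here to keep the sketch short.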
        
           | jcelerier wrote:
           | > We currently don't have any proposals for adding smart
           | pointers to C. Given that C does not have constructors or
           | destructors, we would have to devise some new mechanism to
           | implement or replace RAII in C, which would be one major
           | hurdle to overcome for smart pointers.
           | 
           | why would you _have_ to devise a new mechanism and not borrow
           | one from one of the thousand other mechanisms already
           | existing in PL literature for this?
        
         | loeg wrote:
         | Annex K isn't being adopted because it's unergonomic and
         | doesn't solve the problem it purports to. Even the proposer
         | (Microsoft) does not actually implement Annex K as specified in
         | the ISO standard.
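
For readers who have not seen the interfaces, the sketch below shows how portable code typically has to approach Annex K: request the declarations with __STDC_WANT_LIB_EXT1__, test __STDC_LIB_EXT1__, and carry a fallback because many C libraries do not provide the annex. The wrapper copy_string is a hypothetical helper, and on the Annex K path the behavior on failure also depends on the installed runtime-constraint handler.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* must come before the first #include */
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes including the NUL.
 * Returns 0 on success and 1 if the arguments are unusable or src does not fit. */
int copy_string(char *dst, size_t dst_size, const char *src) {
#ifdef __STDC_LIB_EXT1__
    /* Annex K is available: strcpy_s reports failure with a nonzero value. */
    return strcpy_s(dst, dst_size, src) != 0;
#else
    /* Fallback for libraries without Annex K. */
    if (dst == NULL || dst_size == 0 || src == NULL)
        return 1;
    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';             /* leave an empty string on failure */
        return 1;
    }
    memcpy(dst, src, len + 1);
    return 0;
#endif
}
```

The need for this kind of wrapper in portable code is one practical reason adoption has been limited.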
        
           | rseacord wrote:
           | Microsoft originally implemented the Annex K Bounds checked
           | interfaces (e.g., the *_s functions) back in the 1990s in
           | response to well-publicized vulnerabilities. They proposed
           | standardization to the C Standards committee. The committee
           | made many changes to the proposal, possibly going too far
           | away from the original implementation. During this time, I
           | would say that Microsoft was very deferential to the wishes
           | of the committee.
           | 
           | By the time the ISO/IEC TR 24731-1:2007 was released, and
           | then later Annex K added to the C Standard, Microsoft had to
           | decide if they wanted to change the interfaces to conform to
           | the changed standard and re-implement their code bases. They
           | presumably decided that they did not, which I think is a
           | defensible decision.
           | 
           | As to unergonomic, examples please?
        
           | rurban wrote:
           | Wrong. Many implemented them, Microsoft as first, followed by
           | Cisco, Watcom, Embarcadero, Huawei and Android. Widely used
           | in Windows, Embedded and phones.
           | 
           | Microsoft just changed one bit of the proposal, but no one
           | followed them there. Currently it's the most widely used and
           | worst implemented. I tested all of them.
           | 
           | It solves the bounds checking problem better than
           | _FORTIFY_SOURCE, ASAN and valgrind, because it does the
           | checks always, at compile-time or run-time, independent of
           | the optimizer, the used intrinsics, where valgrind fails, and
           | is much faster than ASAN. Also faster than glibc btw.
        
       | jmckinley wrote:
       | It is 2020. You are looking at a series of projects your company
       | has teed up. All are greenfield efforts - no legacy. What would
       | be the attributes of a project that would have you recommend C as
       | the programming language?
        
         | quelsolaar wrote:
         | Anything high performance: game engine, scientific computation,
         | deep packet inspection, image analysis, machine learning,
         | rendering engines, high frequency trading.... The list is long!
        
         | joefourier wrote:
         | For anything embedded you have practically no choice but to use
         | C (or assembly). Same goes for a lot of systems programming,
         | e.g. writing Linux drivers.
        
       | commandersaki wrote:
       | A few years ago I came across this article Pointers Are More
       | Abstract Than You Might Expect In C [1].
       | 
       | I followed the article which attempted to interpret the C
       | standard and come to a conclusion. The conclusion is:
       | 
       | > The takeaway message is that pointer arithmetic is only defined
       | for pointers pointing into array objects or one past the last
       | element. Comparing pointers for equality is defined if both
       | pointers are derived from the same (multidimensional) array
       | object. Thus, if two pointers point to different array objects,
       | then these array objects must be subaggregates of the same
       | multidimensional array object in order to compare them. Otherwise
       | this leads to undefined behavior.
       | 
       | Based on the above, I arrived at the conclusion after reading
       | this that comparing two distinct malloc()'d pointers for equality
       | itself is undefined behaviour since malloc() is likely to return
       | pointers to distinct objects that are not part of a sub-aggregate
       | object.
       | 
       | I know this is incorrect, but I don't know why I'm wrong.
       | 
       | [1]: https://stefansf.de/post/pointers-are-more-abstract-than-
       | you...
        
         | pascal_cuoq wrote:
         | The only thing that is not defined is comparing a pointer one-
         | past-the-end to a pointer to the very beginning of a toplevel
         | object. Apart from this rule, pointers of course do not need to
         | be derived from the same object in order to be compared with ==
         | and !=.
         | 
         | &a + 1 == &b is unspecified: it may produce 0 or 1, and it may
         | not produce the same result if you evaluate it several times.
         | 
         | Similarly, if both the char pointers p and q were obtained with
         | malloc(10), after they have been tested for NULL, all these
         | operations are valid:
         | 
         |     p == q (false)
         |     p + 1 == q (false)
         |     p + 1 == q + 1 (false)
         |     p + 10 == q + 1 (false)
         | 
         | Only p+10 == q and p == q+10 are unspecified (of the
         | comparisons that can be built without invoking UB during the
         | pointer arithmetic itself).
         | 
         | I have no idea what led that person to (apparently) write that
         | &a==&b is undefined. This is plain wrong. I do not see any
         | ambiguity in the relevant clause
         | (https://port70.net/~nsz/c/c11/n1570.html#6.5.9p6 ). Yes, the
         | standard is in English and natural languages are ambiguous, but
         | you might as well claim that a+b is undefined because the
         | standard does not define what the word "sum" means
         | (https://port70.net/~nsz/c/c11/n1570.html#6.5.6p5 ).
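         | 
         | (A small sketch of the malloc(10) comparisons listed above,
         | written as a function so it can be checked. The function name
         | is hypothetical. The two unspecified comparisons are left out
         | on purpose.)
         | 
```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if every well-defined comparison from the list above
 * comes out false, as expected for two distinct live allocations. */
bool distinct_allocations_compare_unequal(void)
{
    char *p = malloc(10);
    char *q = malloc(10);
    bool ok = false;

    if (p != NULL && q != NULL) {
        ok = !(p == q)
          && !(p + 1 == q)
          && !(p + 1 == q + 1)
          && !(p + 10 == q + 1);
        /* Deliberately not evaluated: p + 10 == q and p == q + 10,
         * whose results are unspecified. */
    }

    free(p);   /* free(NULL) is a no-op, so this is safe on failure too */
    free(q);
    return ok;
}
```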
        
           | azinman2 wrote:
           | Why is this undefined if it's all just pointers to addresses
           | in memory, regardless if the memory is valid for that object
           | or not?
        
             | pascal_cuoq wrote:
             | Here is an example I have at hand that shows that when you
             | are using an optimizing compiler, there is no such thing as
             | "just pointers to addresses in memory". There are plenty
             | more examples, but I do not have the other ones at hand.
             | 
             | https://gcc.godbolt.org/z/Budx3n
        
             | joosters wrote:
             | I would guess that it is because it gives some freedom to
             | the compiler. e.g. If you have two pointers 'foo' and 'bar'
             | that point to two separate structures (e.g. two arrays of
             | ints), the compiler can always assume that the pointers,
             | even with some adds/subtracts, will never 'collide', i.e.
             | foo will never == bar, regardless of their relative memory
             | positions.
        
           | cormacrelf wrote:
           | That's quite precise, can you give a sense of why it's useful
           | to have? Does it translate as "you can never know whether two
           | mallocs are adjacent, so don't even try merging them"?
        
             | pascal_cuoq wrote:
             | One concrete reason why "unspecified" means "anything and
             | not always the same thing" is to enable the maximum of
             | optimizations.
             | 
             | Write a function c that compares pointers in a compilation
             | unit, and in another compilation unit, define:
             | 
             |     int a, b;
             |     X1 = (&a == &b + 1);
             |     X2 = c(&a, &b + 1);
             | 
             | The compiler can optimize the computation of X1 on the
             | basis that comparing an offset of &a to an offset of &b
             | will always:
             | 
             | - be false
             | - or invoke undefined behavior
             | - or be unspecified
             | 
             | But the optimization will not apply to the computation of
             | X2, so the two variables X1 and X2 can receive different
             | values when you execute this example, although they appear
             | to compute the same thing.
        
               | cormacrelf wrote:
               | I get why unspecified means that and it's good to know
               | what the limit is for applying an optimisation, but I was
               | asking about why the specific comparison of "one past the
               | end" with the beginning of another being unspecified
               | would be useful. It's cool you can optimise it out, but
               | what does a compiler gain from being able to do that?
               | 
               | Imagine a standard stated that > and < character
               | comparisons involving '%' were unspecified. Why would
               | this be good? It wouldn't, so it's not in any standard.
               | But specifically it wouldn't because (a) nobody writes ch
               | < '%', and (b) if they did, compilers couldn't make
               | programs any faster, more portable, etc, because of its
               | inclusion.
               | 
               | I guessed above that this is kinda like having hashmaps
               | iterate in a random order: compilers do spooky things
               | when you try to check whether two allocas/mallocs are
               | adjacent, so don't do it. Is that accurate? Or does it
               | mean that compilers can move things around on the stack
               | if they want, without worrying about updating the
               | registers or locations that store the pointers, i.e. this
               | is mainly to make compilers easier to write? If it's
               | that, I imagine I would want some other pointer
               | comparisons on the list. The reason it's in there is what
               | I wanted you to shed some light on.
        
               | pascal_cuoq wrote:
               | Oh, that was your question. In this case, the reason why
               | &a + 1 == &b is unspecified is that:
               | 
               | - it's generally false--there is no reason for b to be
               | just after a in memory, so these two addresses compare
               | different.
               | 
               | - it is sometimes true: when addresses are implemented as
               | integers, and compilers use exactly sizeof(T) bytes to
               | represent an object of type T, and do not waste precious
               | integers by leaving gaps between objects, and == between
               | pointers is implemented as the assembly instruction that
               | compares integers, sometimes that instruction produces
               | true for &a + 1 == &b, because b was placed just after a
               | in memory.
               | 
               | In short, &a + 1 == &b was made unspecified so that
               | compilers could implement pointer == by the integer
               | equality instruction, and could place objects in memory
               | without having to leave gaps between them. Anything more
               | specific (such as "&a + 1 == &b is always false") would
               | have forced compilers to take additional measures against
               | providing the wrong answer.
        
         | _kst_ wrote:
         | Pointer _equality_ (the == and != operators) is well defined
         | for any pointers (of the same type) to any objects.
         | 
         | Relational operators (< <= > >=) on pointers have undefined
         | behavior unless both pointers point to elements of the same
         | array object or just past the end of it. A single non-array
         | object is treated as a 1-element array for this purpose.
         | 
         | (That's for object pointers. Function pointers can be compared
         | for equality, but relational operators on function pointers are
         | invalid.)
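         | 
         | (A short sketch of the distinction above. Both function names
         | are hypothetical. count_between uses < and so requires both
         | pointers to come from the same array; points_into uses only
         | ==, which is fine for unrelated objects too.)
         | 
```c
#include <stdbool.h>
#include <stddef.h>

/* Relational comparison: the caller must pass two pointers into (or one
 * past the end of) the same array, with lo not after hi. */
size_t count_between(const int *lo, const int *hi)
{
    size_t n = 0;
    for (const int *p = lo; p < hi; p++)
        n++;
    return n;
}

/* Equality comparison only: p may point to any int object at all. */
bool points_into(const int *buf, size_t len, const int *p)
{
    for (size_t i = 0; i < len; i++) {
        if (p == &buf[i])
            return true;
    }
    return false;
}
```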
        
       | graycat wrote:
       | (1) Explain just how malloc() and free() work _under the covers_
       | and the implications for multi-threading, _memory leaks_ ,
       | virtual memory paging, etc.
       | 
       | Maybe also cover some means, algorithms, and code for reporting
       | on the _state_ , status, etc. of the memory use by malloc() and
       | free().
       | 
       | By the way, I know and have known well for longer than most C
       | programmers have lived JUST what the _heap_ data structure, as
       | used in  "heap sort", is. But what is the meaning of "the heap"
       | in C programming language documentation?
       | 
       | (2) Cover in overwhelmingly fine detail the "stack" and the
       | chuckhole in the road, _stack overflow_.
       | 
       | (3) Where to get a reliable, reasonable package of code for
       | handling character strings -- what I saw and worked with in C is
       | not reasonable.
       | 
       | (4) From the C programming I did, it looks like a large C program
       | for significant work involves some hundreds, maybe tens of
       | thousands, of _includes_ , _inserts_ , whatever, and what a
       | linkage editor would call _external references_. There must
       | somewhere be some tools to help a programmer make sense of all
       | those includes and references, the resulting memory maps, issues
       | of locality of reference, _word boundary alignment_ , etc.
       | 
       | (5) How can C exploit a processor with 64 bit addressing and main
       | memory in the tens of gigabytes and maybe terabytes?
       | 
       | (6) How can C support, i.e., exploit, integers and IEEE floating
       | point in 64 and/or 128 bit lengths?
       | 
       | (7) How to handle exceptional conditions with, say, non-local
       | gotos and without danger of memory leaks?
       | 
       | (8) Sorry, but far and away my favorite programming language long
       | has been and remains PL/I, especially for its scope of names
       | rules, handling of aggregates with external scope, its _data
       | structures_ , and its exceptional conditional handling with non-
       | local gotos and freeing _automatic_ storage and, thus, avoiding
       | _memory leaks_. Of course I can't use PL/I now, but the problems
       | PL/I solved are still with us, also when writing C code. So, how
       | to solve these problems with C code?
       | 
       | (9) For C++, please explain how that works _under the covers_.
       | E.g., some years ago it appeared the C++ was defined as only a
       | source code pre-processor to C. Is this still the case? If so,
       | then explaining C++ _under the covers_ should be feasible and
       | valuable.
        
         | zokier wrote:
         | > But what is the meaning of "the heap" in C programming
         | language documentation?
         | 
         | The C language standard does not contain the word "heap"
         | anywhere; as far as C is considered, there is no "heap" in
         | particular.
        
         | DougGwyn wrote:
         | It has been many years since a C++-to-C preprocessor has been
         | commonplace. There's just too much new stuff in recent C++ to
         | map it all easily into straight C.
        
         | mesarvagya wrote:
         | (7) Exactly. Please add how to free memory in standard way if
         | there's an exception, and how not to use GoTo in such cases.
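         | 
         | (One common goto-free idiom, sketched under assumptions:
         | initialize every pointer to NULL, do the work only if all
         | allocations succeeded, and release everything at a single
         | exit, relying on free(NULL) being a no-op. The function
         | sum_of_squares is a made-up example, and it does no overflow
         | checking on the arithmetic.)
         | 
```c
#include <stddef.h>
#include <stdlib.h>

/* Returns 0 on success and stores the result in *out, or -1 if an
 * allocation failed. Temporaries are released on every path. */
int sum_of_squares(const int *in, size_t n, long *out)
{
    int status = -1;
    long *squares = NULL;
    long *running = NULL;

    if (n == 0) {
        *out = 0;
        return 0;
    }

    squares = calloc(n, sizeof *squares);
    if (squares != NULL)
        running = calloc(n, sizeof *running);

    if (squares != NULL && running != NULL) {
        for (size_t i = 0; i < n; i++)
            squares[i] = (long)in[i] * in[i];
        running[0] = squares[0];
        for (size_t i = 1; i < n; i++)
            running[i] = running[i - 1] + squares[i];
        *out = running[n - 1];
        status = 0;
    }

    /* Single cleanup point: free(NULL) does nothing, so no branching. */
    free(running);
    free(squares);
    return status;
}
```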
        
         | aw1621107 wrote:
         | > Explain just how malloc() and free() work under the covers
         | and the implications for multi-threading, memory leaks, virtual
         | memory paging, etc.
         | 
         | > Maybe also cover some means, algorithms, and code for
         | reporting on the state, status, etc. of the memory use by
         | malloc() and free().
         | 
         | Strictly speaking, these are implementation details that the C
         | standard leaves unspecified. If you want to know how the memory
         | allocation functions work or methods for inspecting the state
         | of the heap you'll need to look at a specific implementation
         | (e.g., glibc, musl, jemalloc, etc.) since the details can vary
         | wildly between implementations.
         | 
         | > Cover in overwhelmingly fine detail the "stack" and the
         | chuckhole in the road, stack overflow.
         | 
         | Both these are not really specific to C, and there should be a
         | lot of resources you can find that explain these concepts ([0],
         | [1] for some example general explanations). Did you have more
         | specific questions in mind?
         | 
         | > How can C exploit a processor with 64 bit addressing and main
         | memory in the tens of gigabytes and maybe terabytes?
         | 
         | > How can C support, i.e., exploit, integers and IEEE floating
         | point in 64 and/or 128 bit lengths?
         | 
         | I think pointer/integer sizes are implementation details. C
         | specifies pointer behavior and minimum integer sizes (and
         | optional fixed-width types), but the precise widths are chosen
         | by the implementation. In the case of floating-point, the sizes
         | are specified by IEEE 754 widths.
         | 
         | In other words, you don't really need to do anything special as
         | long as you pick the appropriate types as defined by your
         | implementation.
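         | 
         | (A sketch of "pick the appropriate types", assuming an
         | implementation that provides the optional int64_t. The
         | function names are hypothetical. size_t is used for byte
         | counts so that large buffers work wherever the platform
         | supports them.)
         | 
```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of the exact-width type int64_t. */
size_t int64_bits(void)
{
    return sizeof(int64_t) * CHAR_BIT;
}

/* Width in bits of an object pointer on this implementation. */
size_t pointer_bits(void)
{
    return sizeof(void *) * CHAR_BIT;
}

/* Bytes needed for n doubles, or 0 if that does not fit in size_t. */
size_t buffer_bytes(size_t n)
{
    if (n > SIZE_MAX / sizeof(double))
        return 0;
    return n * sizeof(double);
}
```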
         | 
         | > For C++, please explain how that works under the covers.
         | E.g., some years ago it appeared the C++ was defined as only a
         | source code pre-processor to C. Is this still the case?
         | 
         | As far as I know no (production-quality?) C++ compiler has been
         | implemented as a source-level preprocessor for basically the
         | entirety of C++'s existence [2]. The very first "compiler" for
         | C++ was Cpre, back when C++ was still the C dialect "C with
         | classes" (around October 1979), and that was indeed a
         | preprocessor. That was replaced by the Cfront front end around
         | 1982-1983, about when "C with classes" started gaining new
         | features and got a new name. Cfront is a proper compiler front
         | end that output C code, and I think from that point on C++
         | compilers used "standard" compiler tech.
         | 
         | [0]: https://stackoverflow.com/questions/79923/what-and-where-
         | are...
         | [1]: https://en.wikipedia.org/wiki/Stack_overflow
         | [2]: http://www.stroustrup.com/hopl2.pdf
        
           | graycat wrote:
           | Thanks.
           | 
           | > Did you have more specific questions in mind?
           | 
           | On stack overflow, my understanding was that one could
           | encounter that fatal condition from a suddenly too deep _call stack_ ,
           | that is, too many calls without a return. So, if the "stack"
           | is a, say, finite resource, then the programmer should know
           | in the code how much of that resource is being used and act
           | accordingly.
           | 
           | For a preprocessor for C++, IIRC at one point the
           | definition of C++ was in terms of a preprocessor -- I was
           | just thinking of the definition, that is, get a more explicit
           | definition of C++. I've always understood that always or
           | nearly so C++ implementation was usual _compilation_. The
           | issue is that at least at one time it seemed difficult to be
           | precise about C++ semantics, that is, what the code would do
           | and how it would do it. Maybe now C++ is beautifully
           | documented.
        
         | DougGwyn wrote:
         | (1) There are several implementations; most are based on
         | Knuth's "boundary tag" algorithms. As to "heap", a stack has
         | one accessible end, a heap is essentially random-accessible.
         | Nothing to do with the heap data structure.
         | 
         | (2) Stack overflow can occur even early within a program. I've
         | campaigned for a requirement that such overflows be caught and
         | integrated into a standard exception handler, to no avail.
         | 
         | (3) Why not code your own, so there won't be arguments about
         | it.
         | 
         | (4) There are lots of tools for program development, but it's
         | not standardized by WG14.
         | 
         | (5) Use wider integer types.
         | 
         | (6) Use wider floating representations.
         | 
         | (7) Standard C doesn't specify such a facility, but it has
         | occasionally been suggested.
         | 
         | (8) There were a lot of books, e.g. on structured system
         | analysis, during the 1970s trying to apply lessons learned. C
         | isn't special in that regard, as many of the big problems don't
         | involve syntax.
         | 
         | (9) C++ is now a big language and it takes a lot of work to
         | master its internals.
        
       | jfim wrote:
       | Out of curiosity, if there was anything you could change about C,
       | and not have to worry about breaking existing code or any other
       | practical concern, what would it be, and why?
        
       | mlvljr wrote:
       | How much UB does your own code contain, folks (and what practices
       | do you follow to avoid it)?
       | 
       | Cheers from the shadowland :)
        
       | AvImd wrote:
       | What is your vision of C, its future and its past? What was it
       | supposed to become and did it become that thing? What is it now?
       | What will it evolve into in the near and far future?
        
         | msebor wrote:
         | The C charter and the C committee's job is to standardize
         | existing practice. That means codifying features that emerge as
         | successful in multiple implementations (compilers or
         | libraries), and that are in the overall spirit of the language.
        
       | kazinator wrote:
       | CAN I HAZ UNNAMED UNUSED PARAM
       | 
       |     void callback(int x, void *) // VOID STAR UNUZED, SO ANON
       |     {
       |         foo(x);
       |     }
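       | 
       | (For reference, a sketch of the idiom that works in C99 and
       | C11: name the parameter and discard it with a cast to void.
       | C23 allows omitting the name in a function definition. The
       | callback_fn typedef and the run, foo and last helpers are
       | hypothetical scaffolding.)
       | 
```c
#include <stddef.h>

/* Hypothetical callback signature imposed by some API. */
typedef void (*callback_fn)(int x, void *ctx);

static int last_seen;

static void foo(int x)
{
    last_seen = x;
}

/* The second parameter exists only to match callback_fn. */
void callback(int x, void *ctx)
{
    (void)ctx;   /* explicitly unused; silences unused-parameter warnings */
    foo(x);
}

/* Stand-in for the API that invokes the callback. */
void run(callback_fn cb, int x)
{
    cb(x, NULL);
}

int last(void)
{
    return last_seen;
}
```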
        
         | DougGwyn wrote:
         | Why is there a second argument which is not used?
        
           | wnoise wrote:
           | To match an API where it is sometimes used.
        
       | networkimprov wrote:
       | Has there been consideration of async/await semantics?
        
       | rand0mstring wrote:
       | will we ever see compile time programming in C like constexpr in
       | C++?
        
       | gautamcgoel wrote:
       | Curious what the committee members think of the new competitors
       | to C, e.g. Go, Rust, and Zig. Any comments?
        
         | VWWHFSfQ wrote:
         | go isnt a competitor to c
        
           | pjmlp wrote:
           | F-Secure apparently thinks otherwise,
           | 
           | https://www.f-secure.com/en/consulting/foundry/usb-armory
           | 
           | As does Google,
           | 
           | https://github.com/google/gvisor
           | 
           | https://github.com/google/gapid
           | 
           | Naturally if one is talking about specific use cases like
           | IoT with a couple of KBs, MISRA-C, or UNIX kernels, then yes
           | Go is not a competitor.
        
       | rmind wrote:
       | A lot of C programmers prefer to keep structures within the C source
       | file ("module"), as a poor man's encapsulation. For example:
       | 
       | component.h:
       | 
       |     struct obj;
       |     typedef struct obj obj_t;
       | 
       |     obj_t *obj_create(void);
       |     // .. the rest of the API
       | 
       | component.c:
       | 
       |     struct obj {
       |         int status;
       |         // .. whatever else
       |     };
       | 
       |     obj_t *
       |     obj_create(void)
       |     {
       |         return calloc(1, sizeof(obj_t));
       |     }
       | 
       | However, as the component grows in complexity, it often becomes
       | necessary to separate out some of the functionality (in order to
       | re-abstract and reduce the complexity) into another file or
       | files, which also operate on "struct obj". So, we move the
       | structure into a header file under #ifdef __COMPONENT_PRIVATE
       | (and/or component_impl.h) and sprinkle #define
       | __COMPONENT_PRIVATE in the component source files. It's a poor
       | man's "namespaces".
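       | 
       | (A single-file sketch of that layout, with comments marking
       | where each piece would live. component_impl.h is the private
       | header mentioned above; obj_destroy, obj_status and
       | component_status.c are hypothetical additions for
       | illustration.)
       | 
```c
#include <stdlib.h>

/* ---- component.h (public): callers see only an opaque type ---- */
typedef struct obj obj_t;

obj_t *obj_create(void);
void obj_destroy(obj_t *o);
int obj_status(const obj_t *o);

/* ---- component_impl.h (private): included only by the component's
 *      own source files, never by callers ---- */
struct obj {
    int status;
};

/* ---- component.c ---- */
obj_t *obj_create(void)
{
    return calloc(1, sizeof(obj_t));
}

void obj_destroy(obj_t *o)
{
    free(o);
}

/* ---- component_status.c: a second file that needs the layout ---- */
int obj_status(const obj_t *o)
{
    return o->status;
}
```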
       | 
       | Basically, this boils down to the lack of
       | namespaces/packages/modules in C. Are you aware of any existing
       | compiler extensions (as a precedent or work in that direction)
       | which could provide a better solution and, perhaps, one day end
       | up in the C standard?
       | 
       | P.S. And if C ever grows such a feature, I really hope it will
       | _NOT_ be the C++ 'namespace' (amongst many other depressing
       | things in C++). :)
        
         | msebor wrote:
         | The ELF visibility attributes solve part of the problem at
         | the binary level (by hiding private library APIs from the
         | application). The rest should be doable by structuring the
         | project sources and headers in a suitable way.
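         | 
         | (A sketch of the visibility attribute as spelled by GCC and
         | Clang on ELF targets. This is a compiler extension, not
         | ISO C. The COMPONENT_* macros and both functions are
         | hypothetical names.)
         | 
```c
/* GCC/Clang extension; only meaningful when building a shared object. */
#define COMPONENT_LOCAL __attribute__((visibility("hidden")))
#define COMPONENT_API   __attribute__((visibility("default")))

/* Shared between the library's own source files, but not exported
 * from the resulting shared library. */
COMPONENT_LOCAL int component_helper(int x)
{
    return x * 2;
}

/* Part of the public API: exported as usual. */
COMPONENT_API int component_entry(int x)
{
    return component_helper(x) + 1;
}
```
         | 
         | Building with -fvisibility=hidden makes hidden the default,
         | so only the functions marked as API need an annotation.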
        
           | loeg wrote:
           | ELF is very much not part of the C standard.
        
         | pascal_cuoq wrote:
         | I am sorry I do not have an answer to your question. It's a
         | very valid one and I would be interested in any pointer to an
         | answer.
         | 
         | What I _can_ say while we are on the subject, is that I have
         | seen C code (most often C code that started its life in the
         | 1990s, to be fair) that instead of showing an abstract struct
         | in the public interface, showed a different struct definition.
         | 
         | Please don't do this. Yes, when compiling nowadays, eventually
         | every compilation unit ends up as object files passed to a
         | linker that doesn't know about types, but this is undefined
         | behavior. It makes it difficult to find undefined behavior in
         | the rest of the code because there is a big undefined behavior
         | right in the middle of it.
        
           | rmind wrote:
           | I assume you mean something like that:
           | 
           |     struct obj_impl {
           |         // real members
           |         ...
           |     };
           | 
           | In public API header:
           | 
           |     struct obj {
           |         // -- where N is the size of obj_impl
           |         unsigned char _private[N];
           |     };
           | 
           | I have seen such code too. It is also potentially error-
           | prone. Certainly not advocating for it.
        
           | beefhash wrote:
           | Wait, doesn't this mean that the BSD sockets API is
           | inherently dependent on UB, casting different socket types to
           | each other and sometimes only using the first few members, or
           | am I misunderstanding you?
        
             | pascal_cuoq wrote:
             | Yes and no.
             | 
             | The thing I am describing is when you link a compilation
             | unit using:
             | 
             |     struct internal_state { int dummy; } state;
             | 
             | with another compilation unit that defined the same state
             | differently:
             | 
             |     struct internal_state {
             |         int actual_meaningful_member_1;
             |         unsigned long actual_meaningful_member_2;
             |     } state;
             | 
             | As far as I know, BSD sockets do not do this. Zlib was doing
             | this (https://github.com/pascal-cuoq/zlib-
             | fork/blob/a52f0241f72433... ), but I have had the privilege
             | of discussing this with Mark Adler, and I think the no-
             | longer-necessary hack was removed from Zlib.
             | 
             | BSD sockets probably have a different kind of UB, related
             | to so-called "strict aliasing" rules, unless they have been
             | carefully audited and revised since the carefree times in
             | which they were written. I am going to have to let you read
             | this article for details (example st1, page 5):
             | https://trust-in-soft.com/wp-
             | content/uploads/2017/01/vmcai.p...
        
               | loeg wrote:
               | BSD sockets are weird in that the first struct's
               | (sockaddr) size wasn't big enough, so APIs all take a
               | nominal pointer to sockaddr but may require larger
               | storage (sockaddr_storage) depending on the actual
               | address.
               | 
               |     /*
               |      * Structure used by kernel to store most
               |      * addresses.
               |      */
               |     struct sockaddr {
               |         unsigned char   sa_len;         /* total length */
               |         sa_family_t     sa_family;      /* address family */
               |         char            sa_data[14];    /* actually longer; address value */
               |     };
               | 
               |     /*
               |      * RFC 2553: protocol-independent placeholder for socket addresses
               |      */
               |     #define _SS_MAXSIZE     128U
               |     #define _SS_ALIGNSIZE   (sizeof(__int64_t))
               |     #define _SS_PAD1SIZE    (_SS_ALIGNSIZE - sizeof(unsigned char) - \
               |                                 sizeof(sa_family_t))
               |     #define _SS_PAD2SIZE    (_SS_MAXSIZE - sizeof(unsigned char) - \
               |                                 sizeof(sa_family_t) - _SS_PAD1SIZE - _SS_ALIGNSIZE)
               | 
               |     struct sockaddr_storage {
               |         unsigned char   ss_len;         /* address length */
               |         sa_family_t     ss_family;      /* address family */
               |         char            __ss_pad1[_SS_PAD1SIZE];
               |         __int64_t       __ss_align;     /* force desired struct alignment */
               |         char            __ss_pad2[_SS_PAD2SIZE];
               |     };
        
               | wahern wrote:
               | struct sockaddr_storage is insufficient as well. A Unix
               | domain socket path can be longer than `sizeof ((struct
               | sockaddr_un){ 0}).sun_path`. That's a major reason why
               | all the socket APIs take a separate socklen_t argument.
               | Most people just assume that a domain socket path is
               | limited to a relatively short string, but it's not
               | (except possibly Minix, IIRC).
        
               | asveikau wrote:
               | > A Unix domain socket path can be longer than `sizeof
               | ((struct sockaddr_un){ 0}).sun_path`
               | 
               | Hm, I didn't realize this, or if I knew this I had
               | forgotten. It makes sense because sun_path is usually
               | pretty small, I believe 108 chars is the most common
               | choice, and typically file paths are allowed to be much
               | longer.
               | 
               | Do you have a citation for this behavior? I can't seem to
               | find it, though I'm not looking very hard.
               | 
               | I guess you are right that any syscall taking a struct
               | sockaddr * also has a length passed to it... Some systems
               | have sa_len inside struct sockaddr to indicate length,
               | but IIRC linux does not. I've often thought that length
               | parameter was sort of redundant, because (1) some
               | platforms have sa_len, and (2) even without that, you
               | should be able to derive length from family. But your
               | Unix domain socket example breaks (2). Without being able
               | to do that, I start to imagine that the kernel would need
               | to probe for NUL chars terminating the C string anytime
               | it inspects a struct sockaddr_un, rather than block-
               | copying the expected size of the structure -- that would
               | be needlessly complicated.
        
               | wahern wrote:
               | So I just reran some tests on my existing VMs and it
               | turns out I remembered wrong. Here's the actual break
               | down:
               | 
               | * Solaris 11.4: .sun_path: 108; bind/connect path
               | maximum: 1023. Length seems to be same as open.
               | Interestingly, open path maximum seems to be 1023 (judged
               | by trying ls -l /path/to/sock), although I always thought
               | it was unbounded on Solaris.
               | 
               | * MacOS 10.14: .sun_path: 104, bind/connect path maximum:
               | 253. Length can be bigger than .sun_path but less than
               | open path limit.
               | 
               | * NetBSD 8.0: .sun_path: 104, bind/connect path maximum:
               | 253. Same as MacOS.
               | 
               | * FreeBSD 12.0: .sun_path: 104, bind/connect path
               | maximum: 104.
               | 
               | * OpenBSD 6.6: .sun_path: 104, bind/connect path maximum:
               | 103 (104 - 1).
               | 
               | * Linux 5.4: .sun_path: 108, bind/connect path maximum:
               | 108.
               | 
               | * AIX 7.1: .sun_path: 1023, bind/connect path maximum:
               | 1023. Yes, .sun_path is statically sized to 1023! And
               | like Solaris, open path maximum seems to be 1023 (as
               | judged by trying ls -l /path/to/socket). Thanks to Polar
               | Home, polarhome.com, for the free AIX shell account.
               | 
               | Note that all the above lengths are _exclusive_ of NUL,
               | and the passed socklen_t argument did not include a NUL
               | terminator.
               | 
               | For posterity: on all these systems you can still create
               | sockets with long paths, you just have to chdir or use
               | bindat/connectat if available. My test code confirmed as
               | much. And AFAICT getsockname/getpeername will only return
               | the .sun_path path (if anything) used to bind or connect,
               | but that's a more complex topic (see
               | https://github.com/wahern/cqueues/blob/e3af1f63/PORTING.md#g...)
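
The lengths above were measured with a socklen_t that excludes the NUL terminator. A minimal sketch of how such a length is commonly computed follows, assuming a POSIX system with <sys/un.h>. The helper name un_addr_len is hypothetical, and it only handles paths that fit in .sun_path; whether the kernel accepts a path that fills .sun_path completely, or a longer one, varies by system as the list above shows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill *sa for `path` and return the length to pass
 * to bind()/connect(), or 0 if the path does not fit in sun_path.
 * The returned length excludes the terminating NUL. */
socklen_t un_addr_len(struct sockaddr_un *sa, const char *path)
{
    size_t n = strlen(path);

    assert(sa != NULL);
    if (n == 0 || n > sizeof sa->sun_path)
        return 0;

    memset(sa, 0, sizeof *sa);
    sa->sun_family = AF_UNIX;
    memcpy(sa->sun_path, path, n);  /* no NUL is stored when n fills sun_path */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + n);
}
```

The result would be passed as the third argument of bind() or connect(), alongside a pointer to the filled structure cast to struct sockaddr *.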
        
               | asveikau wrote:
               | Linux also has the unusual extension of: if sun_path[0]
               | is NUL, the path is not a filesystem path and the rest of
               | the name buffer is an ID. I don't remember if that can
               | have embedded NULs in that ID. I believe so.
        
               | haberman wrote:
               | I'm curious what exactly makes this undefined behavior.
               | 
               | And in particular, what about something like this?
               | 
               |     struct Foo {
               |     #ifdef __cplusplus
               |       int bar() const { return bar_; }
               |      private:
               |     #endif
               |       int bar_;
               |     };
               | 
               | Or, taking this a step further:
               | 
               |     struct _Foo;
               |     typedef struct _Foo Foo;
               | 
               |     // In C "struct _Foo" is never defined.
               |     int Foo_bar(const Foo* foo) { return *(const int*)foo; }
               |     void Foo_setbar(Foo* foo, int bar) { *(int*)foo = bar; }
               |     Foo* Foo_new() { return (Foo*)malloc(sizeof(int)); }
               | 
               |     #ifdef __cplusplus
               |     struct _Foo {
               |       void set_bar(int bar) { bar_ = bar; }
               |       int bar() const { return bar_; }
               |      private:
               |       int bar_;
               |     };
               |     #endif
               | 
               | The above isn't ideal but it does provide encapsulation
               | in a way that doesn't seem to violate strict aliasing
               | (the memory location is consistently read/written as
               | "int").
        
               | pascal_cuoq wrote:
               | I think this is plenty ok. For one thing, if a struct has
               | a member of type T, it's ok to access it through a
               | pointer to T (and also the address of the struct is
               | guaranteed to be identical to the address of the first
               | member). For another, you are using dynamically allocated
               | memory, so the only thing that matters is the type of the
               | pointer when the access is finally made. It doesn't
               | matter that it was a Foo* before, if what you dereference
               | is an int*.
               | 
               | This is different from pretending that the address of a
               | struct s { int a; double b; } is the address of a struct
               | t { int a; long long c; } and accessing it through a
               | pointer to that. If you do that, C compilers will (given
               | the opportunity) assume that the write-through-a-pointer-
               | to-struct-t does not modify any object of type "struct
               | s". This is what the example st1 in the article
               | illustrates.
               | 
               | The latter is what I suspect plenty of socket
               | implementations still do (because there are several types
               | of sockets, represented by different struct types with a
               | common prefix). It is possible to revise them carefully
               | so that they do not break the rules, but I doubt this
               | work has been done.
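
The first point above (a pointer to a struct, suitably converted, points to its first member) can be sketched as follows. The type and function names are hypothetical and chosen only for illustration; this is not code from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example type: the first member is an int. */
struct foo { int bar; double other; };

/* Read the first member through an int pointer. This is permitted because
 * the address of a struct is the address of its first member, and the
 * object accessed really is an int. */
int foo_bar(const struct foo *f)
{
    assert(f != NULL);
    return *(const int *)f;
}

/* Write the first member the same way; the other members are untouched. */
void foo_set_bar(struct foo *f, int v)
{
    assert(f != NULL);
    *(int *)f = v;
}
```

Contrast this with casting a struct s pointer to a struct t pointer and writing through it, which is the case the comment warns about.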
        
             | loeg wrote:
             | Yeah, the BSD socket API is kind of terrible like that. You
             | could consider it an unspecified union type, or use
             | memcpy() exclusively to access it safely.
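
A minimal sketch of the memcpy() approach, using two hypothetical struct types that share a leading member in the style of the sockaddr family. The names are invented for illustration and are not the real socket types.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical address types with a common initial member. */
struct addr_a { unsigned short family; int port; };
struct addr_b { unsigned short family; char path[16]; };

/* Read the family tag from a buffer that holds either type, without
 * dereferencing a pointer to the "wrong" struct type. memcpy() copies
 * bytes, so no lvalue of a mismatched struct type is ever formed. */
unsigned short addr_family(const void *addr)
{
    unsigned short fam;

    assert(addr != NULL);
    memcpy(&fam, addr, sizeof fam);
    return fam;
}
```

Compilers typically turn a small fixed-size memcpy() like this into a plain load, so the approach usually costs nothing at run time.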
        
             | emilfihlman wrote:
             | Yeah, it depends on a well-agreed convention, but one that
             | is UB according to the standard.
        
       | cyber1 wrote:
       | How closely does the C Standard Committee work with Linux kernel
       | developers? Does Linux kernel development influence the C standard?
        
         | AaronBallman wrote:
         | There's not an official collaboration between the committee and
         | the kernel developers (that I'm aware of), but we do have
         | people on the committee who need to support Linux kernel
         | development (such as GCC maintainers), so there is some level
         | of indirect influence there.
        
       | clarry wrote:
       | Why can't I have flexible array members in union? Consider this:
       | 
       |     struct foo {
       |         enum { t_char, t_int, t_ptr, /* .. */ } type;
       |         int count;
       | 
       |         union {
       |             char c[];
       |             int i[];
       |             void *p[];
       |             /* .. */
       |         };
       |     };
       | 
       | This isn't allowed, since flexible array members are only allowed
       | in structs (but the union here is exactly where you'd put a
       | flexible array member if you had only one type to deal with).
       | 
       | Furthermore, you can't work around this by wrapping the union's
       | members in structs, because a struct with a flexible array member
       | must have more than one named member:
       | 
       |     struct foo {
       |         enum { t_char, t_int, t_ptr } type;
       |         int count;
       | 
       |         union { /* not allowed! */
       |             struct { char c[]; };
       |             struct { int i[]; };
       |             struct { void *p[]; };
       |         };
       |     };
       | 
       | But it's all fine if we either add a useless dummy variable or
       | move some prior member (such as _count_ ) into these structs:
       | 
       |     struct foo {
       |         enum { t_char, t_int, t_ptr } type;
       |         int count;
       | 
       |         union { /* this works but is silly and redundant */
       |             struct { int dumb1; char c[]; };
       |             struct { int dumb2; int i[]; };
       |             struct { int dumb3; void *p[]; };
       |         };
       |     };
       | 
       | Of course, you could have the last member be
       | 
       |     union { char c; int i; void *p; } u[];
       | 
       | but then each element of u is as large as the largest possible
       | member which is wasteful, and u can't be passed to any function
       | that expects to get a normal, tightly packed array of one
       | specific type.
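
One further workaround, sketched here under assumptions and not taken from the thread, is a single byte-typed flexible array member aligned for any element type, viewed through a typed pointer once the tag is known. The names foo_new and foo_ints are hypothetical. The sketch relies on the storage coming from malloc(), so that the element type is established by the accesses themselves; it does not check the size computation for overflow.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

enum foo_type { t_char, t_int, t_ptr };

struct foo {
    enum foo_type type;
    int count;
    /* One flexible array member of bytes, aligned for any element type. */
    alignas(max_align_t) unsigned char data[];
};

/* Hypothetical constructor: room for `count` elements of `elem_size` bytes. */
struct foo *foo_new(enum foo_type type, int count, size_t elem_size)
{
    assert(count >= 0);
    struct foo *f = malloc(sizeof *f + (size_t)count * elem_size);
    if (f != NULL) {
        f->type = type;
        f->count = count;
    }
    return f;
}

/* View the payload as a tightly packed int array (only valid for t_int). */
int *foo_ints(struct foo *f)
{
    assert(f != NULL && f->type == t_int);
    return (int *)f->data;
}
```

The pointer returned by foo_ints() can be handed to any function that expects a plain int array, which is the property the union-of-elements layout lacks.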
        
       | psherbet wrote:
       | I love how small of a language C is and get concerned when people
       | recommend adding feature x,y and z.
       | 
       | What's the plan for C over the next 5 - 10 years?
        
         | DougGwyn wrote:
         | There is no grand goal that I know of. I wish more importance
         | were being placed on keeping existing well-written code
         | working, which includes continued support for what might be
         | considered near-obsolete. If one wanted to design a new (not
         | fully compatible) language, that could have lofty goals; just
         | don't call it "C".
        
       | ativzzz wrote:
       | Other than these experts, what kind of companies do C developers
       | work at? What does the compensation look like compared to doing
       | web development?
        
         | pascal_cuoq wrote:
         | I do not actually develop in C (other than short examples to
         | feed the C analyzer that I work on, which is not written in C)
         | but our customers do employ plenty of C developers. These
         | customers are developing embedded software that reads inputs
         | from sensors, processes them, and sends the final results of the
         | computations to actuators, in fields such as IoT, aeronautics,
         | rail, space, nuclear energy production, autonomous
         | transportation, ...
         | 
         | The list is very much biased by the sort of analyzer we
         | provide. There are certainly plenty of non-embedded codebases
         | in C and of developers paid to maintain and extend them, it's
         | just that we currently do not work with them as much.
         | 
         | I do not know about whether the compensation is better or worse
         | than for other technologies.
        
       | baybal2 wrote:
       | Hello, I coded in C as a high schooler. Now, 16 years later, I
       | have to code C again semiprofessionally after a very long break.
       | 
       | Big question: how does somebody self-taught start programming in C
       | at a high professional level? Is there a way to cut corners,
       | without having to go through 10+ years of trial and error to gain
       | experience?
       | 
       | Anything for somebody ready to sit, study, and practice for a few
       | hours a day?
        
         | Nemerie wrote:
         | There was a nice discussion recently
         | https://news.ycombinator.com/item?id=22519876
        
       | rseacord wrote:
       | Many of your remaining questions have devolved into "When will I
       | see my favorite feature xyz appear in the C Standard?" The answer
       | in most cases is "that depends on how long it takes you to submit
       | a proposal". Take a look at
       | http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...
       | for previous proposals and review the minutes to see which
       | proposals have been adopted. In general, the committee is not
       | going to adopt proposals for which there is insufficient
       | existing practice or that haven't been fully thought out. There
       | are cases where people have
       | come to a single meeting with a well-considered proposal that was
       | adopted into the C Standard. I wrote about one such case here:
       | https://www.linkedin.com/pulse/alignment-requirements-memory...
       | Alternatively, you can approach someone on the committee and ask
       | us to champion a proposal for you. It is likely that we'll agree
       | or at least provide you with feedback on your proposal.
        
       | billfruit wrote:
       | I do find that C is difficult to use for large programs. Are
       | there any thoughts of introducing features like namespaces?
       | 
       | Another thing that is very cumbersome to do in C is object
       | creation; creating instantiable objects is possible but very
       | cumbersome. Is there some feature in the thought process to deal
       | with it? To make it clear, in C we can create a data structure
       | like a stack or a queue easily. But if the program needs 10
       | stacks, there is presently no simple way of achieving it.
        
         | DougGwyn wrote:
         | In BRL's MUVES project, we used a 2-character prefix indicating
         | category. E.g., all the external identifiers for our fancy
         | memory allocator began with "Mm", where Mm.h documented the
         | interface for the Mm package only.
         | 
         | To minimize the external identifiers, one could make just the
         | name of a container structure the sole entry access handle,
         | with structure members pointing to the functions. Then use it
         | like:
         | 
         |     #include <Mm.h>
         |     if ((new = Mm.allo(size)) == NULL)
         |       Er.abort("out of memory");
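
A minimal sketch of the container-structure idea described above: one external identifier whose members point to the package's functions. Only the Mm.allo usage appears in the comment; the struct name Mm_interface, the release member, and the malloc-based bodies are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical interface for an "Mm" package. In a real package this
 * struct declaration and `extern const struct Mm_interface Mm;` would
 * live in Mm.h. */
struct Mm_interface {
    void *(*allo)(size_t size);
    void  (*release)(void *ptr);
};

/* The implementations have internal linkage, so they add no external
 * identifiers of their own. */
static void *mm_allo(size_t size)
{
    assert(size > 0);
    return malloc(size);
}

static void mm_release(void *ptr)
{
    free(ptr);
}

/* The sole external identifier exported by the package. */
const struct Mm_interface Mm = { mm_allo, mm_release };
```

Callers then write Mm.allo(size) and Mm.release(p), so the package contributes a single name to the global namespace.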
        
           | jfkebwjsbx wrote:
           | Tip: you can use four leading spaces to write code.
           | 
           |     Like this
        
             | [deleted]
        
             | steveklabnik wrote:
             | You only need two!
             | 
             |   like this
        
               | sgt wrote:
               | I did not know that, after spending years on HN.
               | 
               |   while(1) fork();
        
               | steveklabnik wrote:
               | https://news.ycombinator.com/formatdoc
        
               | DougGwyn wrote:
               | I tried, but two spaces yielded what you saw.
        
               | dang wrote:
               | Huh, it also needed an extra line break before the first
               | line of code. I didn't realize that! I've fixed it now.
        
               | steveklabnik wrote:
               | I didn't realize that either, but it's described in
               | formatdoc as such. So if you changed that behavior,
               | probably should change the docs too.
        
               | dang wrote:
               | I didn't change the behavior - I just added a newline.
               | Sorry that wasn't clear.
        
               | DougGwyn wrote:
               | You should be commended for the fast customer service!
        
       | beefhash wrote:
       | Now that C2x plans to make two's complement the only sign
       | representation, is there any reason why signed overflow has to
       | continue being undefined behavior?
       | 
       | On a slightly more personal note: What are some undefined
       | behaviors that you would like to turn into defined behavior, but
       | can't change for whatever reasons that be?
        
         | msebor wrote:
         | Some instances of undefined behavior at translation time can
         | effectively be avoided in practice by tightening up
         | requirements on implementations to diagnose them. But strictly
         | speaking, because the standard allows compilers to continue to
         | chug along even after an error and emit object code with
         | arbitrary semantics, turning even such straightforward
         | instances into constraint violations (i.e., diagnosable errors)
         | doesn't prevent UB.
         | 
         | It might seem like defining the semantics for signed overflow
         | would be helpful but it turns out it's not, either from a
         | security view or for efficiency. In general, defining the
         | behavior in cases that commonly harbor bugs is not necessarily
         | a good way to fix them.
        
         | klodolph wrote:
         | Just going to inject that this impacts a bunch of random
         | optimizations and benchmarks. Just to fabricate an example:
         | 
         |     for (int i = 0; i < N; i += 2) {
         |         //
         |     }
         | 
         | Reasonably common idea but the compiler is allowed to assume
         | the loop terminates precisely because signed overflow is
         | undefined.
         | 
         | I'm not trying to argue that signed overflow is the right tool
         | for the job here for expressing ideas like "this loop will
         | terminate", but making signed overflow defined behavior will
         | impact the performance of numerics libraries that are currently
         | written in C.
         | 
         | From my personal experience, having numbers wrap around is not
         | necessarily "better" than having the behavior undefined, and
         | I've had to chase down all sorts of bugs with wraparound in the
         | past. What I'd personally like is four different ways to use
         | integers: wrap on overflow, undefined overflow, error on
         | overflow, and saturating arithmetic. They all have their places
         | and it's unfortunate that it's not really explicit which one
         | you are using at a given site.
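
The four behaviors listed above can be made explicit at each call site with small helpers. This is a sketch in portable C with hypothetical function names; the fourth flavor, undefined on overflow, is simply the built-in `a + b`. The wrapping version relies on the conversion of an out-of-range unsigned value back to int, which is implementation-defined before C23; GCC and Clang document it as reduction modulo 2^N.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Wrap on overflow: do the arithmetic in unsigned, where wraparound is
 * defined, then convert back. */
int add_wrap(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Saturate: clamp to INT_MAX or INT_MIN instead of overflowing. */
int add_sat(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return INT_MAX;
    if (b < 0 && a < INT_MIN - b) return INT_MIN;
    return a + b;
}

/* Error on overflow: report failure to the caller and leave *out alone. */
bool add_checked(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The range checks are done before the addition, so none of the helpers ever evaluates a signed expression that overflows.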
        
           | nickodell wrote:
           | Under C11, the compiler is still allowed to assume
           | termination of a loop if the controlling expression is non-
           | constant and a few other conditions are met.
           | 
           | https://stackoverflow.com/a/16436479/530160
        
           | alerighi wrote:
           | The compiler assumes that the loop will always terminate and
           | that assumption is wrong, because in reality there is the
           | possibility that the loop will not terminate, since the
           | hardware WILL overflow.
           | 
           | So it's not the best solution. If we want this behaviour for
           | optimizations (which to me are not worth it, given the risk of
           | potentially critical bugs), we must make that behaviour
           | explicit, not implicit: it is the programmer who has to say
           | to the compiler, I guarantee you that this operation will
           | never overflow, if it does it's my fault.
           | 
           | We can agree that having a number that wraps around is not a
           | particularly good choice. But unless we convince Intel in
           | some way that this is bad and make the CPU trap on an
           | overflow, so we can catch that bug, this is the behaviour
           | that we have because it is the behaviour of the hardware.
        
             | coliveira wrote:
             | > I guarantee you that this operation will never overflow,
             | if it does it's my fault.
             | 
             | This is exactly what every C programmer does, all the time.
        
             | klodolph wrote:
             | > The compiler assumes that the loop will always terminate
             | and that assumption is wrong, because in reality there is
             | the possibility that the loop will not terminate, since the
             | hardware WILL overflow.
             | 
             | The language is not a model of hardware, nor should it be.
             | If you want to write to the hardware, the only option
             | continues to be assembly.
        
           | iainmerrick wrote:
           | _the compiler is allowed to assume the loop terminates
           | precisely because signed overflow is undefined._
           | 
           | Just to be sure I understand the fine details of this -- what
           | would the impact be if the compiler assumed (correctly) that
           | the loop might not terminate? What optimization would that
           | prevent?
        
             | klodolph wrote:
             | > ...what would the impact be if the compiler assumed
             | (correctly) that the loop might not terminate?
             | 
             | Loaded question--the compiler is absolutely correct here.
             | There are two viewpoints where the compiler is correct.
             | First, from the C standard perspective, the compiler
             | implements the standard correctly. Second, if we have a
             | real human look at this code and interpret the programmer's
             | "intent", it is most reasonable to assume that overflow
             | does not happen (or is not intentional).
             | 
             | The only case which fails is where N = INT_MAX. No other
             | case invokes undefined behavior.
             | 
             | Here is an example you can compile for yourself to see the
             | different optimizations which occur:
             | 
             |     typedef int length;
             |     int sum_diff(int *arr, length n) {
             |         int sum = 0;
             |         for (length i = 0; i < n; i++) {
             |             sum += arr[2*i+1] - arr[2*i];
             |         }
             |         return sum;
             |     }
             | 
             | At -O2, GCC 9.2 (the compiler I happened to use for
             | testing) will use pointer arithmetic, compiling it as
             | something like the following:
             | 
             |     int sum_diff(int *arr, length n) {
             |         int sum = 0;
             |         int *ptr = arr;
             |         int *end = arr + 2 * n;  /* the loop reads 2*n elements */
             |         while (ptr < end) {
             |             sum += ptr[1] - ptr[0];
             |             ptr += 2;
             |         }
             |         return sum;
             |     }
             | 
             | At -O3, GCC 9.2 will emit SSE instructions. You can see
             | this yourself with Godbolt.
             | 
             | Now, try replacing "int" with "unsigned". Neither of these
             | optimizations happens any more. You get neither
             | autovectorization nor pointer arithmetic. You get the
             | original loop, compiled in the most dumb way possible.
             | 
             | I wouldn't read into the exact example here too closely. It
             | is true that you can often figure out a way to get the
             | optimizations back and still use unsigned types. However,
             | it is a bit easier if you work with signed types in the
             | first place.
             | 
             | Speaking as someone who does some numerics work in C, there
             | is something of a "black art" to getting good numerics
             | performance. One easy trick is to switch to Fortran. No
             | joke! Fortran is actually really good at this stuff. If you
             | are going to stick with C, you want to figure out how to
             | communicate to the compiler some facts about your program
             | that are obvious to you, but not obvious to the compiler.
             | This requires a combination of understanding the compiler
             | builtins (like __builtin_assume_aligned, or
             | __builtin_unreachable), knowledge of aliasing (like use of
             | the "restrict" keyword), and knowledge of undefined
             | behavior.
             | 
             | If you _need_ good performance out of some tight inner
             | loop, the easiest way to get there is to communicate to the
             | compiler the "obvious" facts about the state of your
             | program and check to see if the compiler did the right
             | thing. If the compiler did the right thing, then you're
             | done, and you don't need to use vector intrinsics, rewrite
             | your code in a less readable way, or switch to assembly.
             | 
             | (Sometimes the compiler can't do the right thing, so go
             | ahead and use intrinsics or write assembly. But the
             | compiler is pretty good and you can get it to do the right
             | thing _most_ of the time.)
        
             | joosters wrote:
             | If the compiler knows that the loop will terminate in 'x'
             | iterations, it can do things like hoist some arithmetic out
             | of the loop. The simplest example would be if the code
             | inside the loop contained a line like 'counter++'. Instead
             | of executing 'x' ADD instructions, the binary can just do
             | one 'counter += x' add at the end.
        
               | iainmerrick wrote:
               | What I'm driving at is, if the loop really doesn't
               | terminate, it would still be safe to do that optimization
               | because the incorrectly-optimized code would never be
               | executed.
               | 
               | I guess that doesn't necessarily help in the "+=2" case,
               | where you probably want the optimizer to do a "result +=
               | x/2".
               | 
               | In general, I'd greatly prefer to work with a compiler
               | that detected the potential infinite loop and flagged it
               | as an error.
        
         | tlb wrote:
         | Another approach would be a standard library of arithmetic
         | routines that signal overflow.
         | 
         | If people used them while parsing binary inputs that would
         | prevent a lot of security bugs.
         | 
         | The fact that this question exists and is full of wrong answers
         | suggests a language solution is needed:
         | https://stackoverflow.com/questions/1815367/catch-and-comput...
        
           | colanderman wrote:
           | You can enable this in GCC on a compilation unit basis with
           | `-fsanitize=signed-integer-overflow`. In combination with
           | `-fsanitize-undefined-trap-on-error`, the checks are quite
           | cheap (on x86, usually just a `jo` to a `ud2` instruction).
           | 
           | (Note that while `-ftrapv` would seem equivalent, I've found
           | it to be less reliable, particularly with compile-time
           | checking.)
        
             | corndoge wrote:
             | And clang!
        
           | asveikau wrote:
           | Microsoft in particular has a simple approach to this with
           | things like DWordMult().
           | 
           |     if (FAILED(DWordMult(a, b, &product)))
           |     {
           |        // handle error
           |     }
        
             | stephencanon wrote:
             | Clang and GCC's approach for these operations is even nicer
             | FWIW (__builtin_[add/sub/mul]_overflow(a, b, &c)), which
             | allow arbitrary heterogeneous integer types for a, b, and c
             | and do the right thing.
             | 
             | I know there's recently been some movement towards
             | standardizing something in this direction, but I don't know
             | what the status of that work is. Probably one of the folks
             | doing the AUA can update.
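
A short sketch of the heterogeneous builtins mentioned above, assuming GCC or Clang. The wrapper names are hypothetical. The builtins compute the mathematically exact result of the two operands, store it converted to the type of the third argument, and return true when that exact result does not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* count * elem_size into a size_t, with three different integer types.
 * Returns false if the exact product does not fit in size_t, which
 * includes any negative product. */
bool checked_array_size(int32_t count, uint16_t elem_size, size_t *out)
{
    assert(out != NULL);
    return !__builtin_mul_overflow(count, elem_size, out);
}

/* a + b into a uint8_t; no manual choice of a common type is needed. */
bool checked_sum_u8(int a, long b, uint8_t *out)
{
    assert(out != NULL);
    return !__builtin_add_overflow(a, b, out);
}
```

Because the check is made against the destination type, the caller does not have to pick a single common type and insert conversions by hand, which is the source of the bugs described for the older homogeneous builtins.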
        
               | AaronBallman wrote:
               | We've been discussing a paper on this
               | (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2466.pdf)
               | at recent meetings and it's been fairly well-received each
               | time, but not adopted for C2x as of yet.
        
               | stephencanon wrote:
               | It feels like it would be a real shame to standardize
               | something that gives up the power of the Clang/GCC
               | heterogeneous checked operations. We added them in Clang
               | precisely because the original homogeneous operations
               | (__builtin_smull_overflow, etc) led to very substantial
               | correctness bugs when users had to pick a single common
               | type for the operation and add conversions. Standardizing
               | homogeneous operations would be worse than not addressing
               | the problem at all, IMO. There's a better solution, and
               | it's already implemented in two compilers, so why
               | wouldn't we use it?
               | 
               | The generic heterogeneous operations also avoid the
               | identifier blowup. The only real argument against them
               | that I see is that they are not easily implementable in C
               | itself, but that's nothing new for the standard library
               | (and should be a non-goal, in my not-a-committee-member
               | opinion).
               | 
               | Obviously, I'm not privy to the committee discussions
               | around this, so there may be good reasons for the choice,
               | but it worries me a lot to see that document.
        
               | wklieber wrote:
               | >the original homogeneous operations
               | (__builtin_smull_overflow, etc) led to very substantial
               | correctness bugs when users had to pick a single common
               | type for the operation and add conversions.
               | 
               | Hi Stephen, thank you for bringing this to our attention.
               | David Svoboda and I are now working to revise the
               | proposal to add a supplemental proposal to support
               | operations on heterogeneous types. We are leaning toward
               | proposing a three-argument syntax, where the 3rd argument
               | specifies the return type, like:
               | ckd_add(a, b, T)
               | 
               | where _a_ and _b_ are integer values and _T_ is an
               | integer type, in addition to the two-argument form
               | ckd_add(a, b)
               | 
               | (Or maybe the two-argument and three-argument forms
               | should have different names, to make it easier to
               | implement.)
        
               | stephencanon wrote:
               | Glad to hear it, looking forward to seeing what you come
               | up with! The question becomes, once you have the
               | heterogeneous operations, is there any reason to keep the
               | others around (my experience is that they simply become a
               | distraction / attractive nuisance, and we're better off
               | without them, but there may be use cases I haven't
               | thought of that justify their inclusion).
        
               | wklieber wrote:
               | When David and I are done revising the proposal, we would
               | like to send you a copy. If you would be interested in
               | reviewing, can you please let us know how to get in touch
               | with you? David and I can be reached at
               | {svoboda,weklieber} @ cert.org.
               | 
               | >once you have the heterogeneous operations, is there any
               | reason to keep the others around
               | 
               | The two-argument form is shorter, but perhaps that isn't
               | a strong enough reason to keep it. Also, requiring a
               | redundant 3rd argument can provide an opportunity for
               | mistakes to happen if it gets out of sync with the type
               | of first two arguments.
               | 
               | As for the non-generic functions (e.g., ckd_int_add,
               | ckd_ulong_add, etc.), we are considering removing them in
               | favor of having only the generic function-like macros.
        
           | rseacord wrote:
           | Take a look at N2466 2020/02/09 Svoboda, Towards Integer
           | Safety which has some support in the committee:
           | 
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2466.pdf
           | 
           | (signal is a strong word... maybe indicate?)
        
         | cataphract wrote:
         | Signed overflow being undefined behavior allows optimizations
         | that wouldn't otherwise be possible
         | 
         | Quoting
         | http://blog.llvm.org/2011/05/what-every-c-programmer-should-...
         | 
         | > This behavior enables certain classes of optimizations that
         | are important for some code. For example, knowing that
         | INT_MAX+1 is undefined allows optimizing "X+1 > X" to "true".
         | Knowing the multiplication "cannot" overflow (because doing so
         | would be undefined) allows optimizing "X*2/2" to "X". While
         | these may seem trivial, these sorts of things are commonly
         | exposed by inlining and macro expansion. A more important
         | optimization that this allows is for "<=" loops like this:
         | 
         | > for (i = 0; i <= N; ++i) { ... }
         | 
         | > In this loop, the compiler can assume that the loop will
         | iterate exactly N+1 times if "i" is undefined on overflow,
         | which allows a broad range of loop optimizations to kick in. On
         | the other hand, if the variable is defined to wrap around on
         | overflow, then the compiler must assume that the loop is
         | possibly infinite (which happens if N is INT_MAX) - which then
         | disables these important loop optimizations. This particularly
         | affects 64-bit platforms since so much code uses "int" as
         | induction variables.
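         | 
         | As an illustration of the corner case in that quote, here is a
         | minimal sketch of an inclusive loop that stays well defined
         | even when N is INT_MAX, because it leaves the loop before the
         | increment could overflow. The function name sum_inclusive is
         | hypothetical, and the body just adds up the indices as a
         | stand-in for real work.

```c
/* Hypothetical example: visits i = 0..n inclusive without ever evaluating
   ++i when i == n, so no signed overflow can occur even for n == INT_MAX. */
long long sum_inclusive(int n)
{
    long long total = 0;
    if (n < 0) {
        return 0;           /* empty range */
    }
    for (int i = 0; ; ++i) {
        total += i;
        if (i == n) {
            break;          /* exit before ++i could overflow */
        }
    }
    return total;
}
```

         | The price is a less conventional loop shape, which is part of
         | why the "<=" form is so common in practice.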
        
           | rbultje wrote:
           | > for (i = 0; i <= N; ++i) { ... }
           | 
           | The worst thing is that people take it as acceptable that
           | this loop is going to operate differently upon overflow (e.g.
           | assume N is TYPE_MAX) depending on whether i or N are signed
           | vs. unsigned.
        
             | JoeAltmaier wrote:
             | Is this a real concern, beyond 'experts panel' esoteric
             | discussion? Do folks really put a number into an int, that
             | is sometimes going to need to be exactly TYPE_MAX but no
             | larger?
             | 
             | I've gone a lifetime programming, and this kind of stuff
             | never, ever matters one iota.
        
               | sdegutis wrote:
               | The very few times I've ever put in a check like that, I
               | always do something like i < MAX_INT - 5 just to be sure,
               | because I'm never confident that I intuitively understand
               | off-by-one errors.
        
               | btilly wrote:
               | Yes, people really do care about overflow. Because it
               | gets used in security checks, and if they don't
               | understand the behavior then their security checks don't
               | do what they expected.
               | 
               | https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475 shows
               | someone going hyperbolic over the issue. The technical
               | arguments favor the GCC maintainers. However I prefer the
               | position of the person going hyperbolic.
        
               | JoeAltmaier wrote:
               | That example was not 'overflow'. It was 'off by one'?
               | That seems uninteresting, outside as you say the security
               | issue where somebody might take advantage of it.
        
               | btilly wrote:
               | That example absolutely was overflow. The bug is,
               | "assert(int+100 > int) optimized away".
               | 
               | GCC has the behavior that overflowing a signed integer
               | gives you a negative one. But an if test that TESTS for
               | that is optimized away!
               | 
               | The reason is that overflow is undefined behavior, and
               | therefore they are within their rights to do anything
               | that they want. So they actually overflow in the fastest
               | way possible, and optimize code on the assumption that
               | overflow can't happen.
               | 
               | The fact that almost no programmers have a mental model
               | of the language that reconciles these two facts is an
               | excellent reason to say that very few programmers should
               | write in C. Because the compiler really is out to get
               | you.
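               | 
               | A minimal sketch of a check the compiler is not entitled
               | to remove: compare against the limit before adding, so the
               | overflowing addition is never evaluated. The helper name
               | add100_would_overflow is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when a + 100 would exceed INT_MAX.
   The comparison happens before any addition, and INT_MAX - 100 cannot
   overflow, so every operation here has defined behavior. */
bool add100_would_overflow(int a)
{
    return a > INT_MAX - 100;
}
```

               | Unlike assert(a + 100 > a), this never relies on what
               | happens after an overflow.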
        
               | JoeAltmaier wrote:
               | Sure. Sorry, I was ambiguous. The earlier example of ++i
               | in a loop I was thinking of. Anyway, yes, overflow for
               | small ints is a real thing.
        
           | gwd wrote:
           | So in a corner case where you have a loop that iterates over
           | all integer values (when does this ever happen?) you can
           | optimize your loop. As a consequence, signed integer
           | arithmetic is very difficult to write while avoiding UB, even
           | for skilled practitioners. Do you think that's a useful
           | trade-off, and do you think anything can be done for those of
           | us who think it's not?
        
             | zodiac wrote:
             | No, the optimizations referred to include those that will
             | make the program faster when N=100.
        
             | buckminster wrote:
             | N is a variable. It might be INT_MAX so the compiler cannot
             | optimise the loop for _any_ value of N. Unless you make
             | this UB.
        
             | andrepd wrote:
             | No, it's exactly the opposite. Without UB the compiler must
             | assume that the corner case may arise at any time. Knowing
             | it is UB we can assert `n+1 > n`, which without UB would be
             | true for all `n` except INT_MAX. Standardising wrap-on-
             | overflow would mean you can now handle that corner case
             | safely, at the cost of missed optimisations on everything
             | else.
        
               | vermilingua wrote:
               | I hadn't understood the utility of undefined behaviour
               | until reading this, thank you.
        
               | rbultje wrote:
               | I/we understand the optimization, and I'm sure you
               | understand the problem it brings to common procedures
               | such as DSP routines that multiply signed coefficients
               | from e.g. video or audio bitstreams:
               | 
               | for (int i = 0; i < 64; i++) result[i] = inputA[i] *
               | inputB[i];
               | 
               | If inputA[i] * inputB[i] overflowed, why are my credit
               | card details at risk? The question is: can we come up
               | with an alternate behaviour that incorporates both
               | advantages of the i<=N optimization, as well as leave my
               | credit card details safe if the multiplication in the
               | inner loop overflowed? Is there a middle road?
        
               | qppo wrote:
               | Another problem is that there's no way to define it,
               | because in that example the "proper" way to overflow is
               | with saturating arithmetic, and in other cases the
               | "proper" overflow is to wrap. Even on CPUs/DSPs that
               | support saturating integer arithmetic in hardware, you
               | either need to use vendor intrinsics or control the
               | status registers yourself.
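               | 
               | Without intrinsics, saturation can be sketched in portable
               | C by widening first and clamping afterwards. This assumes
               | the optional int32_t and int64_t types exist on the
               | target; sat_mul32 is a hypothetical name, and a real DSP
               | routine would more likely use the vendor's instructions.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two 32-bit values in 64 bits, then clamp
   the result to the int32_t range instead of overflowing. */
int32_t sat_mul32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;  /* product always fits in 64 bits */
    if (wide > INT32_MAX) {
        return INT32_MAX;
    }
    if (wide < INT32_MIN) {
        return INT32_MIN;
    }
    return (int32_t)wide;
}
```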
        
               | jononor wrote:
               | One could allow the overflow behavior to be specified,
               | for example on the scope level. Idk, with a #pragma ?
               | #pragma integer-overflow-saturate
        
               | gwd wrote:
               | I'd almost rather have a separate "ubsigned" type which
               | has undefined behavior on overflow. By default, integers
               | behave predictably. When people really need that extra 1%
               | performance boost, they can use ubsigned just in the
               | cases where it matters.
        
               | qppo wrote:
               | I don't know if I agree. Overflow is like uninitialized
               | memory, it's a bug almost 100% of the time, and cases
               | where it is tolerated or intended to occur are the
               | exception.
               | 
               | I'd rather have a special type with defined behavior.
               | That's actually what a lot of shops do anyways, and there
               | are some niche compilers that support types with defined
               | overflow (ADI's fractional types on their Blackfin tool
               | chain, for example). It's just annoying to do in C, this
               | is one of those cases where operator overloading in C++
               | is really beneficial.
        
               | gwd wrote:
               | > I don't know if I agree. Overflow is like uninitialized
               | memory, it's a bug almost 100% of the time, and cases
               | where it is tolerated or intended to occur are the
               | exception.
               | 
               | Right, but I think the problem is that UB means
               | _literally anything_ can happen and be conformant to the
               | spec. If you do an integer overflow, and as a result the
               | program formats your hard drive, then it is acting within
               | the C spec.
               | 
               | Now compiler writers don't usually format your hard drive
               | when you trigger UB, but they often do things like remove
               | input sanitation or other sorts of safety checks. It's
               | one thing if as a result of overflow, the number in your
               | variable isn't what you thought it was going to be. It's
               | completely different if suddenly safety checks get tossed
               | out the window.
               | 
               | When you handle unsanitized input in C on a security
               | boundary, you must literally treat the compiler as a
               | "lawful evil" accomplice to the attackers: you must
               | assume that the compiler will follow the spec to the
               | letter, but will look for any excuse to open up a gaping
               | security hole. It's incredibly stressful if you know that
               | fact, and incredibly dangerous if you don't.
        
               | thayne wrote:
               | Have you considered adding intrinsic functions for
               | arithmetic operations that _do_ have defined behavior on
               | overflow. Such as the overflowing_* functions in rust?
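               | 
               | To make the idea concrete, here is a rough sketch of such
               | a function written in today's C. It does the addition in
               | unsigned arithmetic, where wraparound is defined, and maps
               | the result back by hand. The name wrapping_add_i32 is
               | hypothetical, and it assumes int32_t and uint32_t exist.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores the two's-complement wrapped sum of a and b
   in *out and returns true if the mathematical sum did not fit in int32_t. */
bool wrapping_add_i32(int32_t a, int32_t b, int32_t *out)
{
    /* Unsigned addition wraps modulo 2^32 by definition. */
    uint32_t u = (uint32_t)a + (uint32_t)b;

    /* Map the bit pattern back to int32_t without an out-of-range
       conversion. */
    int32_t r = (u <= (uint32_t)INT32_MAX)
                    ? (int32_t)u
                    : -(int32_t)(UINT32_MAX - u) - 1;

    *out = r;
    /* Overflow happened iff both operands share a sign and the result
       has the other sign. */
    return ((a >= 0) == (b >= 0)) && ((r >= 0) != (a >= 0));
}
```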
        
         | _kst_ wrote:
         | > Now that C2x plans to make two's complement the only sign
         | representation, is there any reason why signed overflow has to
         | continue being undefined behavior?
         | 
         | I presume you'd want signed overflow to have the usual
         | 2's-complement wraparound behavior.
         | 
         | One problem with that is that a compiler (probably) couldn't
         | warn about overflows that are actually errors.
         | 
         | For example:
         | 
         |     int n = INT_MAX;
         |     /* ... */
         |     n++;
         | 
         | With integer overflow having undefined behavior, if the
         | compiler can determine that the value of n is INT_MAX it can
         | warn about the overflow. If it were defined to yield INT_MIN,
         | then the compiler would have to assume that the wraparound was
         | what the programmer intended.
         | 
         | A compiler _could_ have an option to warn about detected
         | overflow/wraparound even if it's well defined. But really, how
         | often do you _want_ wraparound for signed types? In the code
         | above, is there any sense in which INT_MIN is the "right"
         | answer for any typical problem domain?
        
           | enriquto wrote:
           | > In the code above, is there any sense in which INT_MIN is
           | the "right" answer for any typical problem domain?
           | 
           | There is no answer other than INT_MIN that would be right
           | and make sense, i.e. one for which the natural properties of
           | the + operator (associativity, commutativity) are respected.
           | Thus, for want of another possibility, INT_MIN is precisely
           | _the_ right answer to your code.
           | 
           | I read your code and it seems to me very clear that INT_MIN
           | is exactly what the programmer intended.
        
         | colanderman wrote:
         | Beside optimization (as others have pointed out), disallowing
         | wrapping of signed values has the important safety benefit that
         | it permits run-time (and compile-time) _detection_ of
         | arithmetic overflow (e.g. via -fsanitize=signed-integer-
         | overflow). If signed arithmetic were defined to wrap, you could
         | not enable such checks without potentially breaking existing
         | correct code.
        
         | wyldfire wrote:
         | Could we instead just have standard-defined integer types which
         | saturate or trap on overflow?
         | 
         | Sometimes you're writing code where it really, really matters
         | and you're more than willing to spend the extra cycles for
         | every add/mul/etc. Having these new types as a portable idiom
         | would help.
        
           | rseacord wrote:
           | There was a proposal for a checked integer type that you
           | might want to look at:
           | 
           | N2466 2020/02/09 Svoboda, Towards Integer Safety
           | 
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2466.pdf
           | 
           | The committee asked the proposers for further work on this
           | effort.
           | 
           | Integer types that saturate are an interesting idea. Because
           | signed integer overflow is undefined behavior,
           | implementations are not prohibited from implementing
           | saturation or trapping on overflow.
        
             | hyc_symas wrote:
             | Eh? I thought that would only be "legal" if it was
             | specified to be implementation-defined behavior. Which
             | would, frankly, be perfectly good. But since it is
             | specified as _undefined_ behavior, programmers are
             | forbidden to use it, and compilers assume it doesn't
             | happen/doesn't exist.
             | 
             | The entire notion that "since this is undefined behavior it
             | does not exist" is the biggest fallacy in modern compilers.
        
               | DougGwyn wrote:
               | The rule is: If you want your program to conform to the C
               | Standard, then (among other things) your program must not
               | cause any case of undefined behavior. Thus, if you can
               | arrange so that instances of UB will not occur, it
               | doesn't matter that identical code under different
               | circumstances could fail to conform. The safest thing is
               | to make sure that UB cannot be triggered _under any
               | circumstances_ ; that is, defensive programming.
        
         | rseacord wrote:
         | Maybe someone else can respond to this as well, but I feel like
         | the primary reason signed overflow is still undefined behavior
         | is because so many optimizations depend upon the undefined
         | nature of signed integer overflow. My advice has always been to
         | use unsigned integer types when possible.
         | 
         | Personally, I would like to get rid of many of the trap
         | representations (e.g., for integers) because there is no
         | existing hardware in many cases that supports them and it gives
         | implementers the idea that uninitialized reads are undefined
         | behavior.
         | 
         | On the other hand, I just wrote a proposal to WG14 to make
         | zero-byte reallocations undefined behavior that was unanimously
         | accepted for C2x.
        
       | [deleted]
        
       | BeeOnRope wrote:
       | When deciding on standardized behavior for C operations or data
       | representation that may favor some hardware over others [1], who
       | argues the side of the various hardware vendors, if they have no
       | members on the standardization committee?
       | 
       | Is it fair to assume that hardware-related decisions occur in an
       | environment where members who are sponsored by vendors argue
       | their employer's case, rather than a neutral one?
       | 
       | ---
       | 
       | [1] E.g., because some hardware's behavior may more naturally
       | implement the operation.
        
         | AaronBallman wrote:
         | > When deciding on standardized behavior for C operations or
         | data representation that may favor some hardware over others
         | [1], who argues the side of the various hardware vendors, if
         | they have no members on the standardization committee?
         | 
         | The C committee has a number of implementation vendors on it
         | (GCC, Clang, IBM, Intel, sdcc, etc) and these folks do a good
         | job of speaking up about the hardware they have to support (and
         | in some cases, they're also the hardware vendor). If needed, we
         | will also research hardware from vendors who have no active
         | representation on the committee, but this is usually for more
         | broad changes like "can we require 2's complement?".
         | 
         | > Is it fair to assume that hardware-related decisions occur in
         | an environment where members who are sponsored by vendors argue
         | their employer's case, rather than a neutral one?
         | 
         | In my experience, the committee members typically do a good job
         | of differentiating between "this is my opinion" and "this is my
         | employer's opinion" during discussions where that matters.
         | However, at the end of the day, each committee member is there
         | representing some constituency (whether it's themselves or
         | their company) and votes their own conscience.
        
           | BeeOnRope wrote:
           | Thanks for your quick and honest answer.
        
       | ori_b wrote:
       | What do you think of a variant on this?
       | 
       | https://blog.regehr.org/archives/1180
        
         | dang wrote:
         | pascal_cuoq cowrote it. Maybe we should ask him if his views
         | have changed since then.
         | 
         | Btw, there was a thread about it at the time:
         | https://news.ycombinator.com/item?id=8233484.
        
           | pascal_cuoq wrote:
           | Thanks Dan, I missed this question in the heat of the moment.
        
         | pascal_cuoq wrote:
         | I still want to write at least one sequel to that post, on the
         | theme "Alright, can we make a Friendly C Compiler by disabling
         | the annoying optimizations, then?".
         | 
         | Obviously the people who want a Friendly C Compiler do not want
         | to disable _all_ optimizations. This would be easy to do, but
         | these users do not want the stupid 1+2+16 expressions in their
         | C programs, generated through macro-expansion, to be compiled
         | to two additions with each intermediate result making a round-
         | trip through memory.
         | 
         | So the question is: can we get a Friendly C Compiler by
         | enabling only the Friendly optimizations in an unfriendly
         | compiler?
         | 
         | And for the answer to that, I had to write an entire other blog
         | post as preparation, to show that there are some assumptions an
         | optimizing compiler can make:
         | 
         | - that may be used in one or several optimizations, but the
         | compiler authors did not really keep track of where they were
         | used,
         | 
         | - that cannot be disabled and that the compiler maintainers
         | will not consider having an option to disable,
         | 
         | - and that are definitely unfriendly.
         | 
         | Here is the URL of the blog post that I had to write in
         | preparation for the upcoming blog post about getting ourselves
         | a Friendly C Compiler:
         | https://trust-in-soft.com/blog/2020/04/06/gcc-always-assumes...
         | I recommend you take a look, I think it is interesting in
         | itself.
         | 
         | You will have guessed that I'm not optimistic about the
         | approach. We can try to maintain a list of friendly
         | optimizations for ourselves, though, even if the compiler
         | developers are not helping. This might still be less work than
         | maintaining a C compiler.
        
           | ori_b wrote:
           | > _Here is the URL of the blog post that I had to write in
           | preparation for the upcoming blog post about getting
           | ourselves a Friendly C Compiler:
           | https://trust-in-soft.com/blog/2020/04/06/gcc-always-assumes...
           | I recommend you take a look, I think it is interesting in
           | itself._
           | 
           | So, it's definitely interesting -- I think a lot of odd stuff
           | you can do should probably be undefined. Eliminating pointer
           | accesses after a null check sounds A-ok to me, because your
           | program should never dereference null.
           | 
           | Another interesting thought is requiring more of these things
           | that lead to miscompilation to produce compile time
           | diagnostics.
        
       | 0x09 wrote:
       | Not about the language exactly, so maybe not fair game, but: how
       | did you all find yourselves joining ISO? And maybe more
       | generally, what's the path for someone like a regular old
       | software engineer to come to participate in the standardization
       | process for something as significant and ubiquitous as the C
       | programming language?
        
         | AaronBallman wrote:
         | Great question!
         | 
         | Joining the committee requires you to be a member of your
         | country's national body group (in the US, that's INCITS) and
         | attend at least some percentage of the official committee
         | meetings, and that's about it. So membership is not difficult,
         | but it can be expensive. Many committee members are sponsored
         | by their employers for this reason, but there's no requirement
         | that you represent a company.
         | 
         | I joined the committees because I have a personal desire to
         | reduce the amount of time it takes developers to find the bugs
         | in their code, and one great way to reduce that is to design
         | features to make it harder to write the bugs in the first place,
         | or to turn unbounded undefined behavior into something more
         | manageable. Others join because they have specific features
         | they want to see adopted or want to lend their domain expertise
         | in some area to the committee.
        
           | johannes1234321 wrote:
           | Related to that: C++ standards body seems to be quite open
           | allowing non-members to participate (outside official votes,
           | while respecting them when looking for consensus) is it just
           | due to my limited observation or is the C group less open?
           | Any plans in that regard?
        
             | msebor wrote:
             | Most of us on the committee would like to see more
             | participation from other experts. The committee's mailing
             | list should be open even to non-members. Attendance by non-
             | members at meetings might require an informal invitation (I
             | imagine a heads up to the convener should do it).
        
               | DougGwyn wrote:
               | I think that's right. These days, much of the discussion
               | occurs through study subgroups (like the floating-point
               | guys) and the committee e-mailing list.
        
             | AaronBallman wrote:
             | I would love to see more open interactions between the
             | broader C community and the WG14 committee. One of the
             | plans I am currently working on is an update to the
             | committee's webpage to at least make it more obvious as to
             | how you can get involved. The page isn't ready to go live
             | yet, but will hopefully be in shape soon.
        
       | Lucasoato wrote:
       | Is there any new programming language that you particularly love?
       | Do you like the way programming is evolving?
        
         | pascal_cuoq wrote:
         | As a member of the development team for a C static analyzer, I
         | use OCaml, which is also my favorite programming language, but
         | that is because I'm from the generation in which it was the new
         | thing (I learnt it when it had the same level of maturity as
         | Rust, at a time when Rust didn't exist). It helps that it's
         | perfect for writing compilers and static analyzers.
         | 
         | There are a lot of problems that seem a good match for Rust,
         | and Rust is first in my list of programming languages I will
         | never find the time to learn but wish I could.
        
           | artursapek wrote:
           | Why won't you ever find time? It should only take a good 20
           | hours of reading and playing with code before you start to
           | grok it.
        
             | rseacord wrote:
             | I spent the early part of my career bragging about how many
             | programming languages I knew, and the later part of my
             | career complaining about how I don't know any of them well
             | enough.
        
               | artursapek wrote:
               | I certainly wouldn't go for quantity there, but if you
               | really want to learn Rust you should. It brings some
               | groundbreaking new ideas to programming and is more than
               | "just another language".
        
       | hsivonen wrote:
       | What's the current committee thinking on providing locale-
       | independent conversions from potentially-invalid UTF-8 to valid
       | UTF-8, from potentially-invalid UTF-8 to valid UTF-16, and from
       | potentially-invalid UTF-16 to valid UTF-8 (i.e. replacing ill-
       | formed sequences with the REPLACEMENT CHARACTER)?
        
         | DougGwyn wrote:
         | If you changed UTF-16 to UTF-32 or UCS-4 I'd support it. I
         | think there are already implementations that use the
         | replacement character for all "impossible" codes.
        
       | brainzap wrote:
       | How do you join three float values into a comma separated string,
       | and then split it again?
        
         | emilfihlman wrote:
         | Not sure what you mean but would
         | 
         |     s8 buf[enoughspace];
         |     snprintf(buf, sizeof(buf), "%f,%f,%f", your, three, values);
         |     sscanf(buf, "%f,%f,%f", &your, &three, &values);
         | 
         | do the job?
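         | 
         | Spelled out as a compilable sketch, with a few assumptions of
         | my own: the helper names join3 and split3 are hypothetical,
         | the buffer is plain char, and the format is "%.9g" rather than
         | "%f" so that an IEEE single-precision float survives the round
         | trip. It also assumes the default "C" locale, where the decimal
         | separator is '.' and does not collide with the commas.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "a,b,c" into buf; returns false if the
   output was truncated or snprintf reported an error. */
bool join3(char *buf, size_t size, float a, float b, float c)
{
    /* float arguments are promoted to double for the variadic call. */
    int n = snprintf(buf, size, "%.9g,%.9g,%.9g", a, b, c);
    return n >= 0 && (size_t)n < size;
}

/* Hypothetical helper: parses "a,b,c" back into three floats; returns
   true only if all three fields were converted. */
bool split3(const char *buf, float *a, float *b, float *c)
{
    return sscanf(buf, "%f,%f,%f", a, b, c) == 3;
}
```

         | Checking both return values matters: snprintf reports
         | truncation through its result, and sscanf reports how many
         | fields it actually converted.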
        
       | jpfr wrote:
       | Quite a few new languages generate C code for the "backend" of
       | their compiler. For example ATS and the ZZ language.
       | 
       | This helps bringing these languages to embedded targets with
       | closed toolchains (with an existing C compiler).
       | 
       | Will there be developments to use a subset of C as a "portable
       | assembly" in a standard way? Like there is WebAssembly for
       | JavaScript.
        
         | msebor wrote:
         | That doesn't seem likely. There have been no proposals for
         | anything like it and there is a general resistance to
         | subsetting either C or C++ (the exception being making support
         | for new features optional).
        
       | Tronic2 wrote:
       | Why is the struct tm* returned by localtime() not thread-local
       | like errno and other similar variables are (at least in
       | implementations)? Do you have any plans to improve calendar
       | support for practical uses?
        
         | pascal_cuoq wrote:
         | Both questions would get better answers if they were asked to a
         | panel of experts on POSIX (which could include members of the
         | POSIX standardization committee).
         | 
         | For the first one, I can attempt a guess: maybe it was feared
         | that making the result of localtime thread-local would break
         | some programs? You could build such a program on purpose,
         | although I am not clear how frequently one would write one by
         | accident.
         | 
         | Anyway, localtime_r is the function that one should use if one
         | is concerned by thread-safety. A more likely answer is that no
         | Unix implementation bothered to fix localtime because the
         | proper fix was for programs to call localtime_r.
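         | 
         | For reference, a minimal sketch of the localtime_r pattern on a
         | POSIX system. The wrapper name format_local_time and the
         | timestamp format are hypothetical choices; the point is only
         | that the struct tm lives in storage owned by the caller instead
         | of a shared static object.

```c
#define _POSIX_C_SOURCE 200809L  /* request POSIX declarations such as localtime_r */
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: formats t as local time into out.
   Returns false if the conversion fails or the buffer is too small. */
bool format_local_time(time_t t, char *out, size_t size)
{
    struct tm tm_buf;   /* per-call storage, so concurrent callers do not collide */
    if (localtime_r(&t, &tm_buf) == NULL) {
        return false;
    }
    return strftime(out, size, "%Y-%m-%d %H:%M:%S", &tm_buf) != 0;
}
```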
        
       | emilfihlman wrote:
       | Thank you for taking time to take questions!
       | 
       | Have you ever considered or will you consider deprecating char,
       | int, long, (s)size_t, float, double, etc. in favour of specific
       | length types?
       | 
       | Will you ever add / have you considered adding [su]\d+ and f\d+
       | as synonyms for those mentioned in stdint.h?
       | 
       | Since char is signed on most platforms, arm eabi being an
       | exception and even there it's really just a matter of compile
       | time flags, will you ever just drop char from being able to be
       | either and just say it's signed, as int is also signed?
       | 
       | Will you ever define / have you considered defining signed
       | overflow behaviour?
        
         | rseacord wrote:
         | I don't think we'll ever deprecate char, int, long, float,
         | double, or size_t. ssize_t is not part of the C Standard, and
         | hopefully never will be as it is a bit of an abomination. The
         | main driver behind the evolution of the C Standard is not to
         | break existing code written in C, because the world largely
         | runs on C programs.
         | 
         | C does provide fixed width types like uint8_t, uint16_t,
         | uint32_t, and uint64_t. These are optional types because they
         | can't be implemented on implementations that don't have the
         | appropriate word sizes. We also have required types such as
         | 
         | uint_least8_t, uint_least16_t, uint_least32_t, and
         | uint_least64_t.
        
           | emilfihlman wrote:
           | >The main driver behind the evolution of the C Standard is
           | not to break existing code written in C, because the world
           | largely runs on C programs.
           | 
           | If not deprecate, then at least make fixed width types as
           | equivalent members to them, ie all char based apis should
           | accept s8 (typedef signed char s8) and all int based apis
           | should accept s32.
        
             | rseacord wrote:
             | Well, there are a number of problems with this proposal. For
             | example, if your implementation defines int as a 16-bit
             | type (which is permitted by the standard) and you pass
             | an int32_t, the value you pass may be truncated if it is
             | outside of the range of the narrower type. When
             | programming, it is best to match the type of the API of the
             | function you are calling for portability.
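             | 
             | When a fixed-width value does have to cross into an
             | int-based API, a range check before narrowing avoids the
             | silent change of value. A minimal sketch, assuming int32_t
             | is available; the helper name int32_to_int is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts v to int only if the value is
   representable. On platforms where int has 32 or more bits the check
   always passes; on a 16-bit int platform it rejects large values. */
bool int32_to_int(int32_t v, int *out)
{
    if (v < INT_MIN || v > INT_MAX) {
        return false;   /* would not fit in int on this implementation */
    }
    *out = (int)v;
    return true;
}
```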
        
           | [deleted]
        
       | nicoburns wrote:
       | Are there any plans to "clean up C"? A lot of effort has been put
       | into alternative languages, which are great, but there is still a
       | lot of momentum with C, and it seems that a lot of improvements
       | could be made in a backwards-compatible way and without
       | introducing much in the way of complexity. For example:
       | 
       | - Locking down some categories of "undefined behaviour" to be
       | "implementation defined" instead.
       | 
       | - Proper array support (which passes around the length along with
       | the data pointer).
       | 
       | - Some kind of module system, that allows code to be imported
       | without the possibility of name collisions.
        
         | metalforever wrote:
         | What does clean up c mean?
        
         | ori_b wrote:
         | > - Some kind of module system, that allows code to be imported
         | without the possibility of name collisions.
         | 
         | That doesn't particularly need modules -- just some form of
         | namespace foo { }
        
         | dooglius wrote:
         | You can very easily make a struct consisting of a pointer and
         | length. Is adding such a thing to the standard really a big
         | deal? Personally, I don't see a problem with passing two
         | arguments.
        
           | Someone1234 wrote:
           | - In your example there's no guarantee that the length will
           | be accurate, or that the data hasn't been modified
           | independently elsewhere in the program.
           | 
           | - In other words you've created a fantastic shoe-gun. One
           | update line missed (either length or data, or data re-used
           | outside the struct) and your "simple" struct is a huge
           | headache, including potential security vulnerabilities.
           | 
           | - Re-implementing a common error prone thing is exactly what
           | language improvements should target.
        
             | rwmj wrote:
             | I mean, this is C so "fantastic shoe-gun" is part of the
             | territory. But in C you can wrap this vector struct in an
             | abstract data type to try to prevent callers from breaking
             | invariants.
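             | 
             | A minimal sketch of that idea. The int_vec names are
             | hypothetical. In a real project only the typedef and the
             | prototypes would go in the header, with the struct
             | definition kept in one .c file; here everything is in one
             | block so it is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct int_vec int_vec;   /* opaque to callers */

struct int_vec {                  /* would live in the .c file */
    size_t len;
    int *data;
};

/* Allocates a zero-filled vector; returns NULL on failure. */
static int_vec *int_vec_new(size_t len)
{
    int_vec *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->data = calloc(len ? len : 1, sizeof *v->data);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    v->len = len;
    return v;
}

static size_t int_vec_len(const int_vec *v)
{
    assert(v != NULL);
    return v->len;
}

/* Bounds-checked accessors: length and data can only change together. */
static bool int_vec_set(int_vec *v, size_t i, int value)
{
    if (i >= v->len)
        return false;
    v->data[i] = value;
    return true;
}

static bool int_vec_get(const int_vec *v, size_t i, int *out)
{
    if (i >= v->len)
        return false;
    *out = v->data[i];
    return true;
}

static void int_vec_free(int_vec *v)
{
    if (v != NULL) {
        free(v->data);
        free(v);
    }
}
```

             | Callers never see len or data directly, so they cannot
             | update one without the other.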
        
             | dooglius wrote:
             | >In your example there's no guarantee that the length will
             | be accurate, or that the data hasn't been modified
             | independently elsewhere in the program.
             | 
             | And having a special data-and-length type would make these
             | guarantees... how? You're ultimately going to need to be
             | able to create these objects from bare data and length
             | somehow, so it's a case of garbage-in-garbage-out.
        
               | gbear605 wrote:
               | Declaring it with a custom struct:
               | 
               |     int raw_arr[4] = {0,0,0,0};
               |     struct SmartArray arr;
               |     arr.length = 4;
               |     arr.val = raw_arr;
               |     some_function(arr);
               | 
               | Smart declaration with custom type: (assume that they'll
               | come up with a good syntax)
               | 
               |     smart_int_arr arr[4] = {0,0,0,0};
               |     some_function(arr);
               | 
               | With the custom struct, it requires the number `4` to be
               | typed twice manually, while in the second it only needs a
               | single input.
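               | 
               | For comparison, a sketch of how today's C can avoid the
               | repeated `4` with a macro, assuming C99 compound literals.
               | The SmartArray layout and the SMART_ARRAY macro are
               | hypothetical, and the macro only works on true arrays,
               | not on pointers.

```c
#include <assert.h>
#include <stddef.h>

struct SmartArray {       /* layout assumed for illustration */
    size_t length;
    int *val;
};

/* Derives the element count from the array type itself.
   Passing a pointer instead of an array gives a wrong length. */
#define SMART_ARRAY(a) \
    ((struct SmartArray){ sizeof(a) / sizeof((a)[0]), (a) })

/* Example consumer: sums every element described by arr. */
static long sum(struct SmartArray arr)
{
    long total = 0;

    assert(arr.val != NULL || arr.length == 0);
    for (size_t i = 0; i < arr.length; i++)
        total += arr.val[i];
    return total;
}
```

               | The length is still only as trustworthy as the call site,
               | which is the gap a language-level type would close.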
        
             | FpUser wrote:
             | In Delphi/FreePascal there are dynamic arrays (strings
             | included) that are in fact fat pointers that hide inside
             | more info than just length. They are all opaque types and
             | work just fine with automatic lifecycle control and COW and
             | whatnot.
        
         | msebor wrote:
         | There are "projects" underway to clean up the spec where it's
         | viewed as either buggy, inconsistent, or underspecified. The
         | atomics and threads sections are a couple of examples.
         | 
         | There are efforts to define the behavior in cases where
         | implementations have converged or died out (e.g., two's
         | complement, shifting into the sign bit).
         | 
         | There have been no proposals to add new array types and it
         | doesn't seem likely at the core language level. C's charter is
         | to standardize existing practice (as opposed to invent new
         | features), and no such feature has emerged in practice. Same
         | for modules. (C++ takes a very different approach.)
        
           | bear8642 wrote:
           | >clean up the spec
           | 
           | Would this involve further specification of bitfields? Feel
           | implementation defined nature of bitfields limits potential
        
           | DougGwyn wrote:
           | Actually there was no need to disenfranchise non-twos-
           | complement architectures. Now that SIMH has a CDC-1700
           | emulation, I had planned on producing a C system for it as an
           | example for students who have never seen such a model.
        
           | rkangel wrote:
           | > C's charter is to standardize existing practice (as opposed
           | to invent new features)
           | 
           | Passing a pair of arguments (pointer and a length) is surely
           | one of the more universal conventions among C programmers?
        
             | cperciva wrote:
             | When they say "existing practice" they mean things already
             | implemented in compilers -- not existing practice among
             | developers.
        
               | apotheon wrote:
               | This seems like a poor way to establish criteria for
               | standardization. It essentially encourages non-standard
               | practice and discourages portable code by saying that to
               | improve the language standard we have to have mutually
               | incompatible implementations.
               | 
               | It has been said that design patterns (not just in the
               | GOF sense of the term) are language design smells,
               | implying that when very common patterns emerge it is a de
               | facto popular-uprising call for reform. That, to me, is a
               | more ideal criterion for updating a language standard,
               | but practiced conservatively to avoid too much movement
               | too fast or too much language growth.
               | 
               | On the other hand, I think you might be close to what
               | they meant by "existing practice". I'm just disappointed
               | to find that seems like the probable case (though I think
               | it might also include some convergent evolutionary
               | library innovations by OS devs as well as language
               | features by compiler devs).
        
               | cperciva wrote:
               | One of the principles for the C language is that you
               | should be able to use C on pretty much any platform out
               | there. This is one of the reasons that other languages
               | are often written in C.
               | 
               | In order to uphold that principle, it's important that
               | the standard consider not just "is this useful" but "is
               | this going to be reasonably straightforward for compiler
               | authors to add". Seeing that people have already
               | implemented a feature helps C to avoid landing in the
               | "useful feature which nobody can use because it's not
               | widely available" trap. (For example, C99 made the
               | mistake of adding floating-point complex types in
               | <complex.h> -- but these ended up not being widely
               | implemented, so C11 backed that out and made them an
               | optional feature.)
        
               | jschwartzi wrote:
               | What is your definition of "portable"? Are you using that
               | term to mean "code I write for one platform can run
               | without modification on other platforms" or "the language
               | I use for one platform works on other platforms"?
               | 
               | I think when you get down to the level of C you're
               | looking at the latter much more than the former. C is
               | really more of a platform-agnostic assembler. It's not a
               | design smell to have conventions within the group of
               | language users that are de-facto language rules. For
               | reference, see all the PEP rules about whitespace around
               | different language constructs. These are not enforced.
               | 
               | The whole point of writing a C program is to be close to
               | the addressable resources of the platform, so you'd
               | probably want to expose those low-level constructs unless
               | there's a compelling reason not to. Eliminating an
               | argument from a function by hiding it in a data structure
               | is not that compelling to me since I can just do that on
               | my own. And then I can also pass other information such
               | as the platform's mutex or semaphore representation in the
               | same data structure if I need to.
               | 
               | By the way, that convenient length+pointer array requires
               | new language constructs for looping that are effectively
               | syntactic sugar around the for loop. Or you need a way to
               | access the members of the structure. And syntactic sugar
               | constrains how you can use the construct. So I'm not sure
               | that it adds anything to the language that isn't already
               | there. And the fact that length+pointer is such a common
               | construct indicates that most people don't have any
               | issues with it at all once they learn the language.
        
           | nabla9 wrote:
           | > no such feature has emerged in practice
           | 
           | Arrays with length constantly emerge among C users and
           | libraries. They are just all incompatible because without
           | standardization there is no convergence.
        
             | rseacord wrote:
             | Sounds like a good use of standardization. If there is
             | existing implementation practice, please go ahead and
             | submit a proposal. I would be happy to champion such a
             | proposal if you can't attend in person.
        
               | nabla9 wrote:
               | It was an observation, not a suggestion.
               | 
               | When the language standardization body has not managed to
               | add arrays with length in 48 years, I don't think it
               | should be added at this point. The culture is backward
               | looking and incompatible with modern needs and people
               | involved are old and incompatible with the future (no
               | offense, so am I).
               | 
               | C standardization effort should focus on finishing the
               | language, not developing it to match modern world. I have
               | programmed with C over 20 years, since I was a teenager.
               | It has long been the system programming language I'm
               | most familiar with. For the last 10 years I have never
               | written an executable. Just short callable functions from
               | other languages. Python, Java, Common Lisp, Matlab, and
               | 'horror of horrors' C++.
               | 
               | I think Standard C can live the next 50 years in gradual
               | decline as a portable assembler called from other
               | languages and as a compilation target.
               | 
               | If I would propose new extension to C language, I would
               | propose completely new language that can be optionally
               | compiled into C and works side by side with old C code.
        
               | apotheon wrote:
               | > If I would propose new extension to C language, I would
               | propose completely new language that can be optionally
               | compiled into C and works side by side with old C code.
               | 
               | There are a few somewhat popular languages that fit that
               | description already, and none of them are suitable
               | replacements for C (as far as I've seen). That's not to
               | say there couldn't be a suitable replacement -- just that
               | nobody in a position to do something about it wants the
               | suitable replacement enough for it to have emerged,
               | apparently.
               | 
               | I suspect the first really suitable complete replacement
               | for C would be something like what Checked C [1] tried to
               | be, but a little more ambitious and willing to include
               | wholly new (but perhaps backward-compatible) features
               | (like some of those you've proposed) implemented in an
               | interestingly new enough way to warrant a whole new
               | compile-to-C implementation. Something like that could
               | greatly improve the use cases where a true C replacement
               | would be most appreciated, and still fit "naturally" into
               | environments where C is already the implementation
               | language of choice via a piecemeal replacement strategy
               | where the first step is just using the new language's
               | compiler as the project compiler front end's drop-in
               | replacement (without having to make any changes to the
               | code at all for this first step).
               | 
               | 1: https://www.microsoft.com/en-us/research/project/checked-c/
        
               | xtian wrote:
               | Sounds like you are describing Zig. https://ziglang.org
        
             | simias wrote:
             | I think the problem is that C is simply ill-suited for
             | these "high level" constructs. The best you're likely to
             | get is an ad-hoc special library like for wchar_t and
             | wcslen and friends. Do we really want that?
             | 
             | I'd argue that a linked list might make a better candidate
             | for inclusion, because I've seen the kernel's list.h or
             | similar implementations in many projects and that stuff
             | is trickier to get right than stuffing a pointer and a
             | size_t in a struct.
        
             | ATsch wrote:
             | typedef struct {uint8_t *data; size_t len;} ByteBuf; is the
             | first line of code I write in a C project.
        
               | kkdwivedi wrote:
               | Another option is a struct with a FAM at the end.
               | 
               |     typedef struct {
               |         size_t len;
               |         uint8_t data[];
               |     } ByteBuf;
               | 
               | Then, allocation becomes
               | 
               |     ByteBuf *b = malloc(sizeof(*b) + sizeof(uint8_t) * array_size);
               |     b->len = array_size;
               | 
               | and data is no longer a pointer.
        
               | ATsch wrote:
               | Well, your ByteBuf is still a pointer. You also now need
               | to dereference it to get the length. It also can't be
               | passed by value, since it's very big. You can also not
               | have multiple ByteBufs pointing at subsections of the
               | same region of memory.
               | 
               | Thing is, you rarely want to share just a buffer anyway.
               | You probably have additional state, locks, etc. So what I
               | do is embed my ByteBuf directly into another structure,
               | which then owns it completely:
               | 
               |     typedef struct {
               |         ...
               |         ByteBuf mybuffer;
               |         ...
               |     } SomeThing;
               | 
               | So we end up with the same amount of pointers (1), but
               | with some unique advantages.
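               | 
               | The point above about several ByteBufs viewing subsections
               | of one region can be sketched like this. bytebuf_slice is
               | a hypothetical helper, and the views it returns do not own
               | the memory they point at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t *data; size_t len; } ByteBuf;

/* Returns a non-owning view of buf starting at `start`, at most
   `len` bytes long. The length is clamped to what is available;
   an out-of-range start yields an empty view with data == NULL. */
static ByteBuf bytebuf_slice(ByteBuf buf, size_t start, size_t len)
{
    ByteBuf out = { NULL, 0 };

    if (buf.data == NULL || start > buf.len)
        return out;
    if (len > buf.len - start)
        len = buf.len - start;
    out.data = buf.data + start;
    out.len = len;
    assert(out.data + out.len <= buf.data + buf.len);
    return out;
}
```

               | Because the struct is two words, slices are cheap to pass
               | by value; the owner of the underlying storage must outlive
               | every view.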
        
               | saagarjha wrote:
               | sizeof(ByteBuf) == sizeof(size_t), and you _can_ pass it
               | by value; I just don't think you can do anything useful
               | with it because it'll chop off the data.
        
               | kkdwivedi wrote:
               | Right, totally depends on what you're doing. My example
               | is not a good fit for intrusive use cases.
        
               | kevin_thibedeau wrote:
               | This will be an alignment problem on any platform with data
               | types larger than size_t. You'd need an
               | alignas(max_align_t) on the struct. At which point some
               | people are going to be unhappy about the wasteful padding
               | on a memory constrained target.
        
               | enriquto wrote:
               | That's a really bizarre layout for your struct. Why don't
               | you put the length first?
        
               | ATsch wrote:
               | I'm not sure if it matters. It might be better for some
               | technical reason, such as speeding up double
               | dereferences, because you don't need to add anything to
               | get to the pointer. But to be honest I just copied it out
               | of existing code.
        
               | saagarjha wrote:
               | Most platforms have instructions for dereferencing with a
               | displacement.
        
               | twic wrote:
               | Why would it matter? The bytes aren't inline, this is
               | just a struct with two word-sized fields.
               | 
               | A possible tiny advantage for this layout is that a
               | pointer to this struct can be used as a pointer to a
               | pointer-to-bytes, without having to adjust it. Although
               | i'm not sure that's not undefined behaviour.
        
           | scythe wrote:
           | >There have been no proposals to add new array types and it
           | doesn't seem likely at the core language level.
           | 
           | One alternative to adding types is to allow enforcing
           | consistency in some structs with the trailing array:
           | 
           |     struct my_obj {
           |         const size_t n;
           |         //other variables
           |         char text[n];
           |     };
           | 
           | where for simplicity you might only allow the first member to
           | act as a length (and it must of course be constant). The
           | point is that then the initializer:
           | 
           |     struct my_obj b = {.n = 5};
           | 
           | should produce an object of the right size. For heap
           | allocation you could use something like:
           | 
           |     void * vmalloc(size_t base, size_t var, size_t cnt) {
           |         void *ret = malloc(base + var * cnt);
           |         if (!ret) return ret;
           |         * (size_t *) ret = cnt;
           |         return ret;
           |     }
        
             | jschwartzi wrote:
             | I would love this.
        
         | 7532yahoogmail wrote:
         | I agree which brought me into looking at Zig. A future version
         | of C might disallow macros, preprocessor, disallow circular
         | libraries, include a module system, but allow importing legacy
         | libs like Zig. Also something like llvm so we can automatically
         | do static analysis, transforms would be great.
        
         | rseacord wrote:
         | I think we are always looking at ways to "clean up C" but that
         | this has to be done very carefully not to break existing code.
         | For example, the committee recently voted to remove support for
         | function definitions with identifier lists from C2x
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf At
         | least one vendor was not very happy with this decision.
         | 
         | Undefined behaviors tend to be undefined for a reason and
         | shouldn't be thought of as defects in the standard. In my years
         | on the committee, I have always argued to define as much
         | behavior as possible and to define undefined behaviors as
         | narrowly as possible.
         | 
         | We also had a recent discussion about adding additional name
         | spaces (when discussing reserved identifiers), but it didn't
         | gain much traction.
        
           | bumblebritches5 wrote:
           | Looks like that proposal is dropping support for K&R function
           | declarations, is that right?
        
             | rseacord wrote:
             | yes, that is correct.
        
           | tropo wrote:
           | C has strayed very far from the original intent because
           | compiler authors prioritized benchmark results at the expense
           | of real-world use cases. This bad trend needs to be reversed.
           | 
           | Consider signed integer overflow.
           | 
           | The intent wasn't that the compiler could generate nonsense
           | code if the programmer overflowed an integer. The intent was
           | that the programmer could determine what would happen by
           | reading the hardware manual. You'd wrap around if the
           | hardware naturally would do so. On some other hardware you
           | might get saturation or an exception.
           | 
           | In other words, all modern computers should wrap. That
           | includes x86, ARM, Power, Alpha, Itanium, SPARC, and just
           | about everything else. I don't believe you can even buy non-
           | wrapping hardware with a C99 or newer compiler. Since this is
           | likely to remain true, there is no longer any justification
           | for retaining undefined behavior that is getting abused to
           | the detriment of C users.
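           | 
           | Until the standard changes, a program that wants a defined
           | result has to ask for it explicitly. A minimal sketch with
           | hypothetical helper names: one function refuses to
           | overflow, the other wraps by doing the arithmetic in
           | unsigned. Converting the out-of-range result back to int is
           | implementation-defined rather than undefined; mainstream
           | compilers wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds a and b only if the mathematical result fits in int.
   Returns false, leaving *sum untouched, when it would overflow. */
static bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* Unsigned arithmetic is defined to wrap modulo 2^N, so the
   addition itself can never be undefined behavior. */
static int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

           | GCC and Clang also offer the -fwrapv option, which makes
           | plain signed overflow wrap for a whole translation unit.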
        
           | 3JPLW wrote:
           | Does it concern you how aggressively compiler teams are
           | exploiting UB?
        
             | Spivak wrote:
             | You do have to understand that compiler teams aren't saying
             | something like "this triggers UB, quick just replace it
             | with noop." It's just something that naturally happens when
             | you need to reason about code.
             | 
             | For example, consider a very simple statement.
             | 
             |     let array[10];
             |     let i = some_function();
             |     print(array[i]);
             | 
             | The function might not even be known to the compiler at
             | compilation time if it was from a DLL or something.
             | 
             | But the compiler is like "hey! you used the result of this
             | function as an index for this array! i must be in the range
             | [0, 10)! I can use that information!"
        
               | gwd wrote:
               | > But the compiler is like "hey! you used the result of
               | this function as an index for this array! i must be in
               | the range [0, 10)! I can use that information!"
               | 
               | As a developer who has seen lots of developers (including
               | himself) make really dumb mistakes, this seems like a
               | very strange statement.
               | 
               | Imagine if you hired a security guard to stand outside
               | your house. One day, he sees you leave the house and
               | forget to lock the door. So he reasons, "Oh, nothing
               | important inside the house today -- guess I can take the
               | day off", and walks off. That's what a lot of these "I
               | can infer X must be true" reasonings sound like to me:
               | they assume that developers don't make mistakes; and that
               | all unwanted behavior is exactly the same.
               | 
               | So suppose we have code that does this:
               | 
               |     int array[10];
               |     int i = some_function();
               |     /* Lots of stuff */
               |     if ( i > 10 ) {
               |       return -EINVAL;
               |     }
               |     array[i] = newval;
               | 
               | And then someone decides to add some optional debug
               | logging, and forgets that `i` hasn't been sanitized yet:
               | 
               |     int array[10];
               |     int i = some_function();
               |     logf("old value: %d\n", array[i]);
               |     /* Lots of stuff */
               |     if ( i > 10 ) {
               |       return -EINVAL;
               |     }
               |     array[i] = newval;
               | 
               | Now _reading_ `array[i]` if `i` > 10 is certainly UB;
               | but in a lot of cases, it will be harmless; and in the
               | worst case it will crash with a segfault.
               | 
               | But suppose a clever compiler says, "We've accessed
               | array[i], so I can infer that i < 10, and get rid of the
               | check entirely!" Now we've changed an out-of-bounds
               | _read_ into an out-of-bounds _write_, which has changed
               | the worst case from a DoS into a privilege escalation!
               | 
               | I don't know whether anything like this has ever
               | happened, but 1) it's certainly the kind of thing allowed
               | by the spec, 2) it makes C a much more dangerous language
               | to deal with.
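               | 
               | One way to keep the check from being optimized away is to
               | make sure no access can precede it. A sketch, assuming a
               | POSIX-style EINVAL from <errno.h>; update_slot and
               | ARRAY_LEN are hypothetical names. Note that the check
               | also rejects i == 10 and negative values, which the
               | `i > 10` test above lets through.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_LEN 10

static int array[ARRAY_LEN];

/* Validates the index once, before any read or write, so the
   compiler has no earlier access from which to infer the range.
   If oldval is non-NULL, the previous value is reported through it
   (standing in for the debug logging). */
static int update_slot(int i, int newval, int *oldval)
{
    assert(sizeof array / sizeof array[0] == ARRAY_LEN);
    if (i < 0 || i >= ARRAY_LEN)
        return -EINVAL;
    if (oldval != NULL)
        *oldval = array[i];   /* the read now sits after the check */
    array[i] = newval;
    return 0;
}
```

               | Keeping validation and access inside one small function
               | makes it harder for a later edit to slip a use in front
               | of the check.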
        
               | asveikau wrote:
               | > in a lot of cases, it will be harmless; and in the
               | worst case it will crash with a segfault.
               | 
               | I am not sure if a segfault is always the worst case. It
               | could be by some coincidence that array[i] contains some
               | confidential information [maybe part of a private key? 32
               | bits of the user's password?] and you've now written it
               | to a log file.
               | 
               | I know it's hard to imagine a mis-read of ~32 bits would
               | have bad consequences of that sort, but it's not out of
               | the question.
        
               | btilly wrote:
               | Per https://lwn.net/Articles/575563/, Debian at one point
               | found that 40% of the C/C++ programs that they have are
               | vulnerable to known categories of undefined behavior like
               | this which can open up a variety of security holes.
               | 
               | This has been accepted as what to expect from C. All
               | compiler authors think it is OK. People who are aware of
               | the problem are overwhelmed at the size of it and there
               | is no chance of fixing it any time soon.
               | 
               | The fact that this has come to be seen as normal and OK
               | is an example of _Normalization of Deviance_. See
               | http://lmcontheline.blogspot.com/2013/01/the-normalization-o...
               | for a description of what I mean. And deviance will
               | continue to be normalized right until
               | someone writes an automated program that walks through
               | projects, finds the surprising undefined behavior, and
               | tries to come up with exploits. After project after
               | project gets security holes, perhaps the C language
               | committee will realize that this really _ISN'T_ okay.
               | 
               | And the people who already migrated to Rust will be
               | laughing their asses off in the corner.
        
               | msebor wrote:
               | This is a good example. Let me flesh it out a bit more to
               | illustrate a specific instance of this problem:
               | 
               |     int a[2][2];
               |     int f (int i, int j)
               |     {
               |         int t = a[1][j];
               |         a[0][i] = 0;          // cannot change a[1]
               |         return a[1][j] - t;   // can be folded to zero
               |     }
               | 
               | The language says that elements of the matrix a must only
               | be accessed by indices that are valid for each bound, so
               | compilers can and some do optimize code based on that
               | requirement (see https://godbolt.org/z/spSF8e).
               | 
               | But when a program breaks that requirement (say, by
               | calling f(2, 0)) the function will likely return an
               | unexpected value.
        
               | Spivak wrote:
               | But I don't know what you want to happen in this case? If
               | you actually call f(2,0) then the program makes no sense.
               | How can you have an expected value for a function call
               | that violates its preconditions?
        
             | rseacord wrote:
             | I would say that there is a lot of concern in the committee
             | about how compilers are optimizing based on pointer
             | providence. There has been a study group looking at this.
             | It now appears that they are likely to publish their
             | proposal as a Technical Report.
        
               | _kst_ wrote:
               | "based on pointer providence"
               | 
               | I think you meant "provenance" (mentioning it for the
               | sake of anyone who wants to search for it).
        
               | rseacord wrote:
               | Yes, my mistake--I was thinking of Rhode Island. I wrote
               | a short bit about this at
               | https://www.nccgroup.trust/us/about-us/newsroom-and-events/b...
               | if anyone is interested.
        
               | revertts wrote:
               | What's the best way to keep an eye out for that TR?
               | Periodically checking
               | http://www.open-std.org/jtc1/sc22/wg14/ ?
               | 
               | I can't ever tell if I'm looking in the right place. :)
        
               | AaronBallman wrote:
               | If you're interested in the final TR, I would imagine
               | we'd list it on that page you linked. If you're
               | interested in following the drafts before it is
               | published, you'd find them on
               | http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log...
               | (A draft has yet to be posted, though, so you won't find
               | one there yet.)
        
             | msebor wrote:
             | This is a common misconception (or poor way of phrasing it,
             | sorry). Compiler implementers don't go looking for
             | instances of undefined behavior in a program with the goal
             | of optimizing it in some way. There is little value in
             | optimizing invalid code. The opposite is the case.
             | 
             | But we must write code that relies on the same rules and
             | requirements that programs are held to (and vice versa).
             | When either party breaks those rules, either accidentally
             | or deliberately, bad things happen.
             | 
             | What sometimes happens is that code written years or
             | decades ago that relies on the absence of an explicit
             | guarantee in the language suddenly stops working because a
             | compiler change depends on the assumption that code doesn't
             | rely on the absence of the guarantee. That can happen as a
             | result of improving optimizations, which is often but not
             | necessarily always motivated by improving the efficiency of
             | programs. Better analysis can also help find bugs in code
             | or avoid issuing warnings for safe code.
        
               | ori_b wrote:
               | There are rules and requirements documented in the spec,
               | and there are de-facto rules and requirements that
               | programs expect. Not only that, but when they _do_
               | exploit these rules, often the code generated is
               | obviously incorrect, and could have been flagged at
               | compile time.
               | 
               | Right now, it seems like compiler vendors are playing a
               | game of chicken with their users.
        
               | cwzwarich wrote:
               | > This is a common misconception (or poor way of phrasing
               | it, sorry). Compiler implementers don't go looking for
               | instances of undefined behavior in a program with the
               | goal of optimizing it in some way. There is little value
               | in optimizing invalid code. The opposite is the case.
               | 
               | Compilers do deliberately look to optimize loops with
               | signed counters by exploiting UB to assume that they will
               | never wrap.
        
               | Leherenn wrote:
               | Well yes, they assume they never wrap because that is not
               | allowed by the language, by definition. UB are the
               | results of broken preconditions at the language level.
        
               | qznc wrote:
               | I'd say both statements are correct.
               | 
               | Compiler implementers are happy when they don't have to
               | care about some edge case because then the code is
               | simpler. Thus, only for unsigned counters there is the
               | extra logic to compile them correctly.
               | 
               | That is my interpretation of "The opposite is the case".
               | Writing a compiler is easier with lots of undefined
               | behavior.
        
           | ximeng wrote:
           | Why would a vendor be unhappy about that? They have a large
           | library using this deprecated syntax? Or many customers? It
           | seems like a relatively easy fix to existing code.
        
             | AaronBallman wrote:
             | The usual argument is: once you've verified some piece of
             | code is correct, changing it (even when there should be no
             | functional change in the semantics) carries risk. Some
             | customers have C89-era code that compiles in C17 mode and
             | they don't want to change that code because of these risks
             | (perhaps the cost of testing is prohibitively expensive,
             | there may be contractual obligations that kick in when
             | changing that code, etc).
        
               | rmind wrote:
               | Well, one argument is that the vendors should not compile
               | C89 code as C17. If you write C89, then stick with
               | -std=c89 (or upgrade to the latest officially compatible
               | revision).
               | 
               | It makes sense to preserve language compatibility within
               | several language revisions, gradually sunsetting some
               | features, but why do that for eternity? Gradual de-
               | supporting would push the problem to the compilers, but
               | while it is no fun supporting, let's say, C89 and a
               | hypothetical incompatible language C3X, this is where the
               | effort should go (after all, companies with the old
               | codebases can stick with older compilers). There is
               | great value in paving the way for more fundamental C
               | language simplifications and cleanups.
        
               | apotheon wrote:
               | These are all good points, and I don't see a legitimate,
               | technical reason to avoid deprecating and eliminating
               | identifier list syntax in new C standards (but then, I'm
               | not as much of an expert as some people, so I might be
               | missing something important).
               | 
               | That having been said, a compiler _vendor_ has, almost
               | _by definition_ as its _first priority_, an undeniable
               | interest in keeping customers happy while, at the same
               | time, ensuring strong reasons to see value in a version
               | upgrade. When dealing with corporate enterprise
               | customers, that often means offering new features without
               | deprecating old features, because the customers want the
               | new features but don't want to have to rewrite _anything_
               | just because of a compiler upgrade.
               | 
               | They'll want C17 (and C32, for that matter) hot new
               | features, but they will not want to pay a developer to
               | "rewrite code that already works" (in the view of middle
               | managers).
               | 
               | That's why I think they'd most likely complain. Their
               | concerns about removing identifier lists likely have
               | _nothing at all_ to do with good technical sense.
               | Ideally, if you don't want to rewrite your rickety old
               | bit-rotting shit code, you should just continue compiling
               | it with an old compiler, and if you want new language
               | features you should use them in new language standard
               | code, period, but business (for pathological, perhaps,
               | but not really upstream-curable reasons) doesn't
               | generally work that way.
        
               | ximeng wrote:
               | One alternative at that point is to just ignore the fact
               | that the deprecated feature is now removed and continue
               | supporting it in your compiler. Maybe you hide standards
               | compliance behind a flag. Annoying and more overhead, but
               | saves your clients from spending dollars on upgrading
               | their obsolete code.
        
         | pcr910303 wrote:
         | Or... deprecating unsafe or not-well-designed (but this is a
         | bit subjective) ideas. Like... deprecating locales. (For why
         | locales aren't well-designed ideas: https://github.com/mpv-
         | player/mpv/commit/1e70e82baa9193f6f02...)
        
         | cesarb wrote:
         | > Proper array support (which passes around the length along
         | with the data pointer).
         | 
         | I second this one. One of the best things from Rust is its "fat
         | pointers", which combine a (pointer, length) or a (pointer,
         | vtable) pair as a single unit. When you pass an array or string
         | slice to a function, under the covers the Rust compiler passes
         | a pair of arguments, but to the programmer they act as if they
         | were a single thing (so there's no risk of mixing up lengths
         | from different slices).
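         | 
         | A minimal C sketch of the (pointer, length) idea, not Rust's
         | actual mechanism. The names int_slice, SLICE_OF, slice_get,
         | and slice_sum are hypothetical, invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the data pointer and its element count
   travel together as one value. */
typedef struct {
    const int *data;
    size_t len;
} int_slice;

/* Build a slice from a real array (not a pointer). The length is taken
   from the array type, so it cannot be mixed up with another array's. */
#define SLICE_OF(arr) ((int_slice){ (arr), sizeof (arr) / sizeof (arr)[0] })

/* Bounds-checked read: stores element i in *out and returns 1,
   or returns 0 when i is out of range. */
static int slice_get(int_slice s, size_t i, int *out)
{
    assert(out != NULL);
    if (i >= s.len)
        return 0;
    *out = s.data[i];
    return 1;
}

/* Sum all elements; no separate length argument is needed. */
static long slice_sum(int_slice s)
{
    long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.data[i];
    return total;
}
```

         | Unlike a language-level fat pointer, nothing here stops a
         | caller from filling in the struct with a wrong length by hand.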
        
           | loeg wrote:
           | Fat pointers in C would involve an ABI break for existing
           | code, in that uintptr_t and uintmax_t would probably need to
           | double in size.
        
             | rkangel wrote:
             | It would presumably involve a new type that didn't exist in
             | the current ABI. Those pointers would stay the same, and
             | the new (twice as big) pointers would be used for the array
             | feature.
        
               | professoretc wrote:
               | The point of uintptr_t is that it's an integer type to
               | which _any_ pointer type can be cast. If you introduce a
               | new class of pointers which are not compatible with
               | uintptr_t, then suddenly you have pointers which are not
               | pointers.
        
               | loeg wrote:
               | Ditto uintmax_t. We do not want a uintmax2_t.
        
               | _kst_ wrote:
               | No, uintptr_t is an integer type to which any _object_
               | pointer type can be converted without loss of
               | information. (Strictly speaking, the guarantee is for
               | conversion to and from void*.) And if an implementation
               | doesn't have a sufficiently wide integer type, it won't
               | define uintptr_t. (Likewise for intptr_t the signed
               | equivalent.)
               | 
               | There's no guarantee that a function pointer type can be
               | converted to uintptr_t without loss of information.
               | 
               | C currently has two kinds of pointer types: object
               | pointer types and function pointer types. "Fat pointers"
               | could be a third. And since a fat pointer would
               | internally be similar to a structure, converting it to or
               | from an integer doesn't make a whole lot of sense. (If
               | you want to examine the representation, you can use
               | memcpy to copy it to an array of unsigned char.)
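               | 
               | A rough sketch of both points, assuming an
               | implementation that provides uintptr_t. struct fat_ptr
               | and the function names are hypothetical stand-ins, not
               | a proposed language feature:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round-trip an object pointer through uintptr_t by way of void *.
   The standard guarantees the recovered void * compares equal. */
static int *via_uintptr(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;
    int *q = (void *)bits;
    assert(q == p);
    return q;
}

/* Hypothetical fat pointer, modeled as an ordinary struct. */
struct fat_ptr {
    int *data;
    size_t len;
};

/* Examine the representation by copying it into unsigned char storage,
   then copy it back to show that no information was lost. */
static int bytes_roundtrip_ok(struct fat_ptr fp)
{
    unsigned char raw[sizeof fp];
    struct fat_ptr copy;

    memcpy(raw, &fp, sizeof fp);
    memcpy(&copy, raw, sizeof copy);
    return copy.data == fp.data && copy.len == fp.len;
}
```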
        
               | kazinator wrote:
               | You would be shocked by this language called C++ which is
               | highly compatible with C and has "pointer to member"
               | types that don't fit into a uintptr_t.
               | 
               | (Spoiler: no, there is no uintptr2_t).
        
               | kazinator wrote:
               | On a given platform, the fat pointer type could have an
               | easily defined ABI expressible in C90 declarations (whose
               | ABI is then deducible accordingly).
               | 
               | For instance, complex double numbers can have an ABI
               | which says that they look like struct { double re, im; };
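               | 
               | As a sketch of that layout claim: the C standard gives
               | a complex type the same representation as a two-element
               | array of the real type, real part first. struct cpair
               | and to_pair are hypothetical names, and the code assumes
               | <complex.h> is available:

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

/* Hypothetical struct view of a complex number. */
struct cpair { double re, im; };

/* A complex type is laid out like an array of two reals. */
static_assert(sizeof(double complex) == 2 * sizeof(double),
              "double complex is laid out as double[2]");

/* Copy the two halves out through the array representation. */
static struct cpair to_pair(double complex z)
{
    double parts[2];
    struct cpair p;

    memcpy(parts, &z, sizeof parts);
    p.re = parts[0];   /* real part comes first */
    p.im = parts[1];   /* imaginary part second */
    return p;
}
```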
        
             | cesarb wrote:
             | Existing code would be using normal pointers, not fat
             | pointers, so there would be no ABI break. New code using
             | fat pointers would know that they fit into a _pair_ of
             | uintptr_t, so the size of uintptr_t would not need to
             | change either.
        
               | loeg wrote:
               | I don't think we want a uintptr_t and uintptr2_t.
        
               | monocasa wrote:
               | IDK, it's not like it'd be an auto_ptr situation where
               | you just don't use uintptr_t anymore and call the other
               | one uintptr2_t. There's different enough semantics that
               | they both still make sense.
               | 
               | Like, as someone who does real, real dirty stuff in Rust,
               | usize as a uintptr equivalent gets used still even though
               | fat pointers are about as well supported as you can
               | imagine.
        
           | kazinator wrote:
           | The C family has already evolved in this direction decades
           | ago. Have you heard of C++ (Cee Plus Plus)?
           | 
           | It is production-ready; if you want a dialect of C with
           | arrays that know their length, you can use C++. If you wanted
           | a dialect of C in 1993 with arrays that know their length for
           | use in a production app you could also have used C++ then.
           | 
           | The problem with all these "can we add X to C" is that there
           | is always an implicit "... but please let us not add Y, Z and
           | W, because that would start to turn C into C++, which we all
           | agree that we definitely don't want or need."
           | 
           | The kicker is that _everyone wants a different X_.
           | 
           | Elsewhere in this thread, I noticed someone is asking for
           | _namespace { }_ and so it goes.
           | 
           | C++ _is_ the result --- is that version of the C language ---
           | where most of the crazy "can you add this to C" proposals
           | have converged and materialized. "Yes" was said to a lot of
           | proposals over the years. C++ users had to accept features
           | they don't like that other people wanted, and had to learn
           | them so they could understand C++ programs in the wild, not
           | just their own programs.
        
             | apotheon wrote:
             | C++ introduces a shit-ton of stuff that one often doesn't
             | want, and even Bjarne Stroustrup (who many contend has
             | never seen a language feature he didn't want) has been a
             | little alarmed at the sheer mass of cruft being crammed
             | into recent updates to the standard. I know many C++ people
             | think C++ is pure improvement over C in all contexts and
             | manners, but it's not. It's different, and there are
             | features implemented in C++ and not in C that could be
             | added to C without damaging C's particular areas of
             | greatest value, and many other features in C++ that would
             | be pretty bad for some of C's most important use cases.
             | 
             | C shouldn't turn into C++, or even C++ Lite(tm), but it
             | shouldn't remain strictly unchanging for all eternity,
             | either. It should just always strive to be a better C,
             | conservatively, because its niche is one where conservative
             | advancement is important.
             | 
             | Some way to adopt programming practices that guarantee
             | consistent management of array and pointer length -- not
             | just write code to check it, but actually _guarantee_ it --
             | would, I think, perfectly fit the needs of conservative
             | advancement suitable to C's most important niche(s). It
             | may not take the form of a Rust-like "fat pointer". It may
             | just be the ability to tell the compiler to enforce a
             | particular constraint for relationships between specific
             | struct fields/members (as someone else in this discussion
             | suggested), in a backward-compatible manner such that the
             | exact same code would compile in an older-standard compiler
             | -- a _very_ conservative approach that should, in fact,
             | solve the problem as well as "fat pointers".
             | 
             | There are ways to get the actually important upgrades
             | without recreating C++.
        
               | kazinator wrote:
               | > _C++ introduces a shit-ton of stuff that one often
               | doesn't want_
               | 
               | The point in my comment is that every single item in C++
               | was wanted and championed by _someone_, exactly like all
               | the talk about adding this and that to C.
               | 
               | > _C shouldn't turn into C++_
               | 
               | Well, C _did_ turn into C++. The entity that gave forth
               | C++ is C.
               | 
               | Analogy: when we say "apes turned into humans", we don't
               | mean that apes don't exist any more or are not continuing
               | to evolve.
               | 
               | Since C++ is here, there is no need for C to turn into
               | another C++ _again_.
               | 
               | A good way to have a C++ with fewer features would be to
               | trim from C++ rather than add to C.
        
             | twic wrote:
             | > if you want a dialect of C with arrays that know their
             | length, you can use C++
             | 
             | C++ doesn't have arrays which know their length.
        
               | zokier wrote:
               | What's std::array then?
               | 
               | > combines the performance and accessibility of a C-style
               | array with the benefits of a standard container, such as
               | knowing its own size
               | 
               | https://en.cppreference.com/w/cpp/container/array
        
               | kevin_thibedeau wrote:
               | They're objects that mostly behave like arrays. You can't
               | index element two of std::array foo as 1[foo] since it
               | isn't an actual C array.
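               | 
               | For a built-in C array the reversed form does work,
               | because a[i] is defined as *(a + i) and the addition
               | commutes. A small sketch (second_element is a
               | hypothetical name):

```c
#include <assert.h>

/* Returns element 1 of a local array using the reversed index syntax. */
static int second_element(void)
{
    int foo[3] = { 10, 20, 30 };

    /* Both spellings designate the same object. */
    assert(&foo[1] == &1[foo]);
    return 1[foo];
}
```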
        
               | kazinator wrote:
               | C++ has features in its syntax so that you can write
               | objects that behave like arrays: support [] indexing via
               | operator [], and can be passed around (according to
               | whatever ownership discipline you want: duplication,
               | reference counting). C++ provides such objects in its
               | standard library, such as: std::basic_string<T> and
               | std::vector<T>. There is a newer std::array also.
        
       | pjmlp wrote:
       | Microsoft's "Checked C" seems to be the last attempt to fix C
       | security flaws.
       | 
       | From the outside, after Annex K adoption failure, WG14 doesn't
       | seem to be willing to make C safer in any way.
       | 
       | Are there any plans to take efforts like Checked C in
       | consideration regarding the future of ISO C?
        
       | Bambo wrote:
       | What is your favourite design pattern?
        
       | wpietri wrote:
       | As experts, where do you see C going? In particular, given the
       | many languages now out there built on decades of learnings from
       | C, where will C have unique strengths? What projects starting
       | today and hoping to run for 20 years should definitely pick C?
        
         | rseacord wrote:
         | I don't really see C going anywhere. It's not going away, and
         | it's not going to evolve into Java. It's going to remain
         | especially useful for memory constrained and performance
         | critical applications such as IoT and embedded.
        
           | wpietri wrote:
           | That sounds reasonable, but the resource-constrained space
           | seems to me to be an ever-shrinking share of the field. So is
           | it fair to say you see C becoming a specialist niche language
           | going forward?
        
       | rand0mstring wrote:
       | is there no way to make C "memory-safe" during compilation?
        
         | zzzcpan wrote:
         | There are a bunch of research projects that did just that. And
         | even just compiling with address sanitizer makes it "memory-
         | safe" to a significant degree.
        
           | rand0mstring wrote:
           | can you link any to check out?
        
       | rbultje wrote:
       | I'd love your opinion on the abundance of "undefined behaviour"
       | (as opposed to implementation-defined, or some new incantation
       | such as "unknown result in variable but system is safe") for
       | relatively trivial things such as signed (but not unsigned)
       | integer overflows. I've heard that this is to allow for non-twos-
       | complement implementations. However, in practice, you notice that
       | most people use ugly workarounds which lead to ugly code that
       | (because of e.g. casting to unsigned and allowing the same
       | overflow to happen anyway) only work correctly on twos-complement
       | anyway. Is this intended to be addressed in the future in some
       | way?
        
         | stephencanon wrote:
         | > (because of e.g. casting to unsigned and allowing the same
         | overflow to happen anyway) only work correctly on twos-
         | complement anyway
         | 
         | Unsigned arithmetic never overflows, and guarantees
         | two's-complement behavior, because unsigned arithmetic is
         | always carried out modulo 2^n:
         | 
         | > A computation involving unsigned operands can never overflow,
         | because a result that cannot be represented by the resulting
         | unsigned integer type is reduced modulo the number that is one
         | greater than the largest value that can be represented by the
         | resulting type. (6.2.5, Types)
         | 
         | Doing the computation in unsigned always does the "right
         | thing"; the thing that one needs to be careful of with this
         | approach is the conversion of the final result back to the
         | desired signed type (which is very easy to get subtly wrong).
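         | 
         | A minimal sketch of the modulo-2^n rule quoted above;
         | wrapping_add and wrapping_sub are hypothetical helper names:

```c
#include <assert.h>
#include <limits.h>

/* Unsigned results are reduced modulo UINT_MAX + 1, so this holds on
   every conforming implementation. */
static_assert(UINT_MAX + 1u == 0u, "unsigned arithmetic wraps");

/* Well defined for all inputs: the sum wraps instead of overflowing. */
static unsigned wrapping_add(unsigned a, unsigned b)
{
    return a + b;
}

/* Likewise for subtraction: 0u - 1u yields UINT_MAX. */
static unsigned wrapping_sub(unsigned a, unsigned b)
{
    return a - b;
}
```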
        
           | jimktrains2 wrote:
           | Interesting. I guess most/many arch's overflow flag is set
           | when the sign bit changes and the carry flag when the result
           | rolls over the word size.
           | 
           | I think most people colloquially call going A + 1 = B where B
           | < A an overflow. Interesting. I knew they're different
           | things, but never really thought about my word choice.
        
           | rini17 wrote:
           | And are there standard primitives to do this correctly
           | (signed-unsigned-signed conversion) that never invoke
           | undefined behavior?
        
             | stephencanon wrote:
             | Signed to unsigned conversion is fully defined (and does
             | the two's complement thing):
             | 
             | > Otherwise, if the new type is unsigned, the value is
             | converted by repeatedly adding or subtracting one more than
             | the maximum value that can be represented in the new type
             | until the value is in the range of the new type (6.3.1.3
             | Signed and unsigned integers)
             | 
             | Unsigned to signed is the hard direction. If the result
             | would be positive (i.e. in range for the signed type), then
             | it just works, but if it would be negative, the result is
             | implementation-defined (but note: _not_ undefined). You can
             | further work around this with various constructs that are
             | ugly and verbose, but fully defined and compilers are able
             | to optimize away. For example, `x <= INT_MAX ? (int)x :
             | (int)(x + INT_MIN) + INT_MIN` works if int has a twos-
             | complement representation (finally guaranteed in C2x, and
             | already guaranteed well before then for the intN_t types),
             | and is optimized away entirely by most compilers.
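             | 
             | Wrapped into functions, the two directions might look
             | like this. to_unsigned and to_signed are hypothetical
             | names, and the sketch assumes a two's-complement int
             | with UINT_MAX == 2 * INT_MAX + 1:

```c
#include <assert.h>
#include <limits.h>

/* State the assumption the unsigned-to-signed helper relies on. */
static_assert(UINT_MAX / 2 == INT_MAX, "int and unsigned share width");

/* Signed to unsigned: fully defined, reduces modulo UINT_MAX + 1. */
static unsigned to_unsigned(int v)
{
    return (unsigned)v;
}

/* Unsigned to signed without the implementation-defined out-of-range
   conversion, using the expression quoted in the comment above. */
static int to_signed(unsigned x)
{
    return x <= INT_MAX ? (int)x : (int)(x + INT_MIN) + INT_MIN;
}
```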
        
           | _kst_ wrote:
           | A quibble on wording: Unsigned overflow is not "twos-
           | complement". It gives you the same bit patterns that typical
           | two's-complement overflow gives you, but strictly speaking
           | two's-complement is a representation for _signed_ values.
        
           | shawnz wrote:
           | Wrapping around the modulus to me is an "overflow", although
           | maybe the spec doesn't use the word that way
        
             | GuB-42 wrote:
             | There is also a difference in x86 assembly, and probably
             | others.
             | 
             | For unsigned operations the carry flag is used, and for
             | signed operations, the overflow flag is used.
        
               | kwillets wrote:
               | Most compilers will translate unsigned (x + y < x) to CF
               | usage.
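               | 
               | The idiom in question, sketched as a function
               | (add_wraps is a hypothetical name). Whether a given
               | compiler maps it to the carry flag is not something
               | this code can show:

```c
#include <assert.h>
#include <stddef.h>

/* Adds x and y with wraparound, stores the result in *sum, and returns
   1 if the mathematical sum did not fit in unsigned, 0 otherwise. */
static int add_wraps(unsigned x, unsigned y, unsigned *sum)
{
    assert(sum != NULL);
    *sum = x + y;
    /* A wrapped sum is always smaller than either operand. */
    return *sum < x;
}
```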
        
             | _kst_ wrote:
             | Right, there are (at least) two ways to describe this.
             | 
             | One is that unsigned arithmetic can overflow, and the
             | behavior on overflow is defined to wrap around.
             | 
             | Another is to say that unsigned arithmetic cannot overflow
             | because the result wraps around.
             | 
             | Both correctly describe the way it works; they just use the
             | word "overflow" in different ways.
             | 
             | The C standard chooses the second way of describing it.
        
       | a-bit-of-code wrote:
       | Any chance that we could have an STL equivalent in C. Of course,
       | templating and other features being absent it won't be as generic
       | as CPP. However, having even something close to STL will help in
       | the long run. Thanks!
        
         | rseacord wrote:
         | There is always a chance. We would need to see a proposal based
         | on experience with an existing implementation.
        
       | polishdude20 wrote:
       | What is your favorite language other than C and why?
        
         | pascal_cuoq wrote:
         | I answered a similar question in another thread:
         | https://news.ycombinator.com/item?id=22866242
        
       | ender1235 wrote:
       | Hi I took an amazing course in college that focused heavily on C.
       | Do you have any recent examples of small side projects you've
       | worked on using C?
        
         | DougGwyn wrote:
         | How about a Sudoku solver? Send me a request via e-mail.
        
           | dang wrote:
           | Doug, the email address in your account is private by
           | default, but you can make it public by putting it in the
           | About field of your profile at
           | https://news.ycombinator.com/user?id=DougGwyn.
           | 
           | ender1235, if you don't see an email address there, email
           | hn@ycombinator.com and I'll put you in touch.
        
             | DougGwyn wrote:
             | Okay, check my About text. I'll soon remove it, to avoid
             | getting a lot of spam.
        
       | tzs wrote:
       | If an old timer who used to be good with C wanted to use C again,
       | would they have to learn a whole bunch of weird new stuff or
       | could they pretty much use it like they did back in the stone age
       | (i.e., the 20th century)?
       | 
       | Back in the '80s and '90s I was pretty good at C. I don't think
       | there was anything about the language or the compilers then that
       | I did not understand. I used C to write real time multitasking
       | kernels for embedded systems, device drivers and kernel
       | extensions for Unix, Windows, Mac, Netware, and OS/2. I did a
       | Unix port from swapping hardware to paging hardware, rewriting
       | the processes and memory subsystems. I tricked a friend into
       | writing a C compiler. I could hold my own with the language
       | lawyers on comp.lang.c.
       | 
       | Somewhere in there I started using C++, but only as a C with more
       | flexible strings, constructors, destructors, and "for (int i =
       | ...)", and later added STL containers to that.
       | 
       | Sometime in the 2000s, I ended up spending more and more time on
       | smaller programs that were mostly processing text, and Perl
       | became my main tool. Also I ended up spending a lot of time
       | helping out less experienced people at work who were doing
       | things in PHP, or JavaScript, or Java. My C and C++ trickled to
       | nothing.
       | 
       | I've occasionally looked at modern C++, but it is so different
       | from what I was doing back in '90s or even early '00s I sometimes
       | have to double check that I'm actually looking at C++ code.
       | 
       | Is modern C like that, or is it still at its core the same
       | language I used to know well?
        
         | AaronBallman wrote:
         | I'd put it this way -- as someone who writes both C and C++ and
         | has for a long while, I find that the difference between "best
         | practice" C89 and C17 code is not as wide as the difference
         | between "best practice" C++98 and C++17 code. However, this is
         | subjective and may be specific to what kinds of projects I work
         | on, so YMMV.
        
         | msebor wrote:
         | C17 doesn't look much different than C89. If you are used to
         | K&R C there may be some adjustment but I would expect it to be
         | manageable.
         | 
         | What might perhaps be more challenging is adjusting to the
         | changes in compilers. They tend to optimize code more
         | aggressively and so writing code that closely follows the rules
         | of the language (rather than making assumptions about the
         | underlying hardware, even valid ones) is more important today
         | than it was back in the 80's.
        
           | rmind wrote:
           | Given the above, it is worth pointing out that the compilers
           | are also much much better in verification and useful
           | warnings/errors. Back in the (very old) days, there was a
           | motivation to cut down PCC (Portable C Compiler) and give
           | birth to Lint as a separate application (because cutting the
           | compilation time was a greater priority). The current trends
           | are completely the opposite: compilers are getting
           | increasingly more powerful built-in static analyzers and
           | sanitizers by default.
           | 
           | I think the lack of powerful tools in 1990s-2000s contributed
           | to the thought by some that C is 'difficult' in terms of
           | safety. However, things have moved on.
        
             | pjmlp wrote:
             | As additional info,
             | 
             | > Although the first edition of K&R described most of the
             | rules that brought C's type structure to its present form,
             | many programs written in the older, more relaxed style
             | persisted, and so did compilers that tolerated it. To
             | encourage people to pay more attention to the official
             | language rules, to detect legal but suspicious
             | constructions, and to help find interface mismatches
             | undetectable with simple mechanisms for separate
             | compilation, Steve Johnson adapted his pcc compiler to
             | produce lint [Johnson 79b], which scanned a set of files
             | and remarked on dubious constructions.
             | 
             | -- https://www.bell-labs.com/usr/dmr/www/chist.html
        
         | x0re4x wrote:
         | Take a look at "Modern C" :)
         | 
         | https://gforge.inria.fr/frs/download.php/latestfile/5298/Mod...
         | 
         | (Homepage: https://modernc.gforge.inria.fr/ )
        
         | DougGwyn wrote:
         | The main editing needed to bring "old C" source code up to
         | snuff using a "modern C" compiler is to make sure that the
         | standard header-defined types are used. No more assuming that a
         | lot of things are, by default, int type. A second, related
         | editing pass is to make sure all functions are declared as
         | prototypes, no longer K&R style; K&R style is slated to be
         | deprecated by the next version of the Standard. (There are some
         | rare uses for non-prototyped functions, but evidently the
         | committee thinks there is more benefit in forcing prototypes.)
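         | 
         | A sketch of what that second editing pass might look like.
         | percent_of is a hypothetical function; the old-style form
         | appears only inside a comment:

```c
#include <assert.h>

/*
 * Old (K&R) style, shown only as a comment. The return type and both
 * parameters are implicitly int, and callers get no argument checking:
 *
 *     percent_of(value, percent)
 *     {
 *         return value * percent / 100;
 *     }
 */

/* Prototype style: every type is explicit, so the compiler can check
   each call against the declaration. */
int percent_of(int value, int percent);

int percent_of(int value, int percent)
{
    assert(percent >= 0 && percent <= 100);  /* illustrative precondition */
    return value * percent / 100;
}
```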
        
         | adrianmonk wrote:
         | I'm sort of in the same boat, although I didn't do as much C.
         | (And my interest in getting back into it is more hypothetical.)
         | 
         | Aside from understanding how the language itself has changed,
         | maybe something else to put on the list is how to apply more
         | modern programming practices in C.
         | 
         | In the 90s, I don't think I ever saw C code with unit tests.
         | Any kind of automated testing was pretty rare. I've become
         | convinced that testing in some form is a good thing. If I were
         | going back to C, I'd want to understand the best way to go
         | about that.
         | 
         | People also didn't care (or know) much about security back
         | then. C has some obvious pitfalls (buffer overflows, etc.), and
         | it is pretty important to know good ways to minimize risk. I'd
         | want to understand best practices and techniques for this.
         | 
         | Also, back then build tools were very simple, and some of them
         | were not my favorite things to use (Imake, I'm looking at you).
         | Build tools have advanced a lot since then. Features like
         | reliable, deterministic incremental builds exist now. Some
         | things could be less tedious to configure and maintain. There
         | are probably best practices and preferred choices in build
         | tools, but what exactly they are is another thing I'd want to
         | know.
         | 
         | These are probably not questions that necessarily need an
         | answer from people whose expertise is the language itself,
         | though, so I guess this is a tangent.
        
       | pcr910303 wrote:
       | As there are a lot of C-masters lurking in this thread:
       | 
       | How can one process unicode (UTF-8) properly in C? As a CJK
       | person, I wish there was a robust solution. Are there any
       | standardized ways or proposals? (Using wchar doesn't count.)
        
         | begriffs wrote:
         | Your best bet is probably to use a library like ICU.
         | 
         | Here are examples of working with unicode in C:
         | https://begriffs.com/posts/2019-05-23-unicode-icu.html
        
         | pascal_cuoq wrote:
         | As a reviewer for Robert's upcoming C book "Effective C", I
         | thought that this aspect was better covered than in existing
         | manuals for learning C.
         | 
         | However, the book only describes the available standard
         | functions, so even doing better than other manuals, everything
         | it has to say on this subject fits in one chapter and feels
         | underpowered.
        
         | Tronic2 wrote:
         | Ignore all character support in the standard library and handle
         | UTF-8 as opaque binary buffers. If you need complex string
         | algorithms, decode into UCS-4 (UTF-32). You'll find short
         | encoding and decoding functions on StackOverflow. For case-
         | insensitive comparisons and sorting, use an external library
         | that knows the latest Unicode standard.
        
           | barbegal wrote:
           | Except that not all binary data is valid UTF-8 so you also
           | need functions that check if a binary buffer is valid UTF-8.
        
             | Tronic2 wrote:
             | The decoding phase will do that, if needed. Also note that
             | in many cases you must process it as opaque binary, even
             | though it _should be_ valid UTF-8. This is in particular
             | with filenames on POSIX systems because otherwise you could
             | not access any files that happen to have invalid UTF-8 in
             | their names.
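
A rough sketch of the decoding step mentioned above, showing how decoding one code point can double as validation. The name `utf8_decode` and its return convention are assumptions made for illustration; this is not code from any library named in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from s[0..len).
 * Returns the number of bytes consumed (1 to 4), or 0 if the
 * input is empty, truncated, or not well-formed UTF-8. */
static size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (len == 0)
        return 0;

    unsigned char b0 = s[0];
    size_t n;          /* total length of the sequence */
    uint32_t cp;       /* code point being assembled */
    uint32_t min;      /* smallest value allowed for this length */

    if (b0 < 0x80) {
        *out = b0;
        return 1;
    } else if ((b0 & 0xE0) == 0xC0) {
        n = 2; cp = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
        n = 3; cp = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
        n = 4; cp = b0 & 0x07; min = 0x10000;
    } else {
        return 0;      /* stray continuation byte or invalid lead byte */
    }

    if (len < n)
        return 0;      /* truncated sequence */

    for (size_t i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;  /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (cp < min)                        return 0;  /* overlong encoding */
    if (cp > 0x10FFFF)                   return 0;  /* beyond Unicode range */
    if (cp >= 0xD800 && cp <= 0xDFFF)    return 0;  /* surrogate */

    *out = cp;
    return n;
}
```

Calling this in a loop and advancing by the return value yields UTF-32 code points; a return of 0 marks the buffer as invalid, at which point the caller can fall back to treating it as opaque bytes.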
        
         | ori_b wrote:
         | https://bitbucket.org/knight666/utf8rewind/src/default/
        
         | DougGwyn wrote:
         | UTF-8 encoding works "as is" based on byte strings (char[]).
         | The latest versions of the draft standard provide somewhat more
         | support.
         | 
         | I recommend heading toward a future where only UTF-8 encoding
         | is used for multibyte characters and UCS-2 or similar for
         | wchar_t. There is no need to support several different
         | encodings.
        
           | [deleted]
        
           | rseacord wrote:
           | Aaron Ballman even got a u8 character prefix added to C2x:
           | 
           | N2198 2018/01/02 Ballman, Adding the u8 character prefix
           | 
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2198.pdf
        
           | ori_b wrote:
           | UCS-2 is a bad choice -- it fails to represent most unicode
           | characters. If you meant UTF-16, that's also a bad choice,
           | because UTF-16 is _also_ a variable width encoding, forcing
           | programmers to use some form of "extra-wide char".
           | 
           | I'm of the opinion that wchar_t should become an alias for
           | char32_t.
        
             | DougGwyn wrote:
             | Yes, I meant the 31-bit code point value (more than 16,
             | anyway). It is the most useful width for doing things with
             | wide characters.
        
         | loeg wrote:
         | What sort of processing do you want to do?
        
         | oreally wrote:
         | Check this: http://utf8everywhere.org/
         | 
         | Basically store the text as char arrays, and convert them when
         | needed. Meanwhile, you could use this single file header:
         | https://github.com/RandyGaul/cute_headers/blob/master/cute_u...
        
       | loeg wrote:
       | Has Annex K been axed yet, and if not, why not?
        
         | rseacord wrote:
         | It has not. The C Committee has taken two votes on this, and in
         | each case, the committee has been equally divided. Without a
         | consensus to change the standard, the status quo wins.
         | 
         | Sounds like you don't care for Annex K. What don't you like
         | about it?
        
           | loeg wrote:
           | I think my complaints are summed up nicely in some of your
           | coauthors' report:
           | 
           | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm
           | 
           | (1) runtime constraint handler callbacks are a terrible API.
           | 
           | (2) The additional boilerplate doesn't buy us anything -- the
           | user can still specify the wrong size.
           | 
           | (3) The Annex invents a feature out of whole cloth, rather
           | than standardizing existing practices. There are no
           | performant real-world implementations that anyone uses.
           | Microsoft's similar functionality is non-standard.
        
       | commandersaki wrote:
       | Will Effective C cover the strict aliasing rule and also why the
       | BSD sockets API seems to get away with it (e.g. (sockaddr *)
       | &sockaddr_in)?
        
         | DougGwyn wrote:
         | I thought we had fixed the BSD socket aliasing a long time ago?
        
         | AaronBallman wrote:
         | I don't think the book covers strict aliasing, at least not in
         | detail.
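
For readers unfamiliar with the rule being asked about, here is a small sketch of the general issue (not of the sockets case specifically). Reading an object through a pointer to an incompatible type is undefined behavior, while copying its bytes with memcpy is well defined. The helper name `float_bits` is hypothetical, and the example assumes `float` is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* The cast below would violate the aliasing rules:
 *
 *     uint32_t bits = *(uint32_t *)&f;    // undefined behavior
 *
 * Copying the bytes is the portable alternative; compilers
 * typically optimize the memcpy into a plain register move. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

On an IEEE 754 platform, `float_bits(1.0f)` yields 0x3F800000.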
        
       | cyber1 wrote:
       | I think C has been an exceptionally good language for a long
       | time, but the world is changing and maybe C must evolve with
       | new trends and new research in programming languages.
       | 
       | In my view C and C++ are now almost different languages, with
       | different programming philosophies, different futures, and
       | different language designs.
       | 
       | It will be sad if "modern" C++ almost replaces C. Many C++
       | developers use "Orthodoxy C++"
       | https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b, and this
       | shows that people would be more comfortable with C plus some
       | really useful features (namespaces, generics, etc.), but not
       | modern C++. I very often hear from my colleagues and from many
       | other people who work with C++ how terrible modern C++ is
       | (https://aras-p.info/blog/2018/12/28/Modern-C-Lamentations/,
       | https://www.youtube.com/watch?v=9-_TLTdLGtc) and how good it
       | would be to have a new C with some extra features. Maybe it is
       | time to start thinking about evolving C, for example:
       | 
       |   - Generics. Something like generics in Zig, Odin, Rust, etc.
       |   - AST macros. For example Rust or Lisp macros, etc.
       |   - Lambda
       |   - Defer statement
       |   - Namespaces
       | 
       | What do you think?
       | 
       | https://ziglang.org/documentation/master/#Generic-Data-Struc...
       | 
       | https://odin-lang.org/docs/overview/#parametric-polymorphism
       | 
       | https://doc.rust-lang.org/rust-by-example/generics.html
        
       | eqvinox wrote:
       | What's the best way to deal with "transitive const-ness", i.e.
       | utility functions that operate on pointers and where the return
       | type should technically get const from the argument?
       | 
       | (strchr is the most obvious, but in general most search/lookup
       | type functions are like this...)
       | 
       | Add to clarify: the current prototype for strchr is
       | 
       |     char *strchr(const char *s, int c);
       | 
       | Which just drops the "const", so you might end up writing to
       | read-only memory without any warning. Ideally there'd be
       | something like:
       | 
       |     maybe_const_out char *strchr(maybe_const_in char *s, int c);
       | 
       | So the return value gets const from the input argument. Maybe
       | this can be done with _Generic? That kinda seems like the
       | "cannonball at sparrows" approach though :/ (Also you'd need to
       | change the official strchr() definition...)
        
         | DSMan195276 wrote:
         | The straight-forward approach is just two functions, one with
         | `const` and one without (You can make one of them `static
         | inline` around the other and do some casting to avoid
         | implementing the same thing twice).
         | 
         | With that, selecting the correct function via `_Generic` should
         | be possible (`_Generic` is a bit fiddly, but matching on `const
         | char * ` and `char * ` should work just fine for this), and for
         | the most part this is actually an/the intended use case for
         | `_Generic` - it's basically the same as the type-generic math
         | functions, more or less.
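
A minimal sketch of the two-functions-plus-`_Generic` approach described above. The names `find_c`, `find_m`, and `find_char` are hypothetical; both wrappers simply forward to the standard strchr, and the macro picks one based on whether the argument points to const char.

```c
#include <assert.h>
#include <string.h>

/* const in, const out */
static inline const char *find_c(const char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

/* mutable in, mutable out */
static inline char *find_m(char *s, int c)
{
    assert(s != NULL);
    return strchr(s, c);
}

/* Select the wrapper from the static type of s, so the result
 * keeps the const-ness of the argument. */
#define find_char(s, c) _Generic((s),        \
        const char *: find_c,                \
        char *:       find_m)((s), (c))
```

With this in place, assigning `find_char(p, 'x')` to a plain `char *` when `p` is a `const char *` draws a diagnostic about the discarded qualifier, which plain strchr does not. Any other pointer type fails to compile because there is no matching association.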
        
         | msebor wrote:
         | The committee has reviewed a proposal (document N2360) for
         | const-correct string functions.
         | 
         | But making function signatures const-correct solves only a
         | small part of the problem. A new API can only be used in new
         | code, and casts can remove the constness from pointers leaving
         | open the possibility that poorly written code will
         | inadvertently change the const object. An attempt to change a
         | global variable declared const will in all likelihood crash,
         | but changing a local const can cause much more subtle bugs.
         | 
         | In my view, a more complete solution must include improving the
         | detection of these types of bugs in compilers and other static
         | and even dynamic analyzers, even without requiring code
         | changes. It's not any more difficult to do than detecting out
         | of bounds accesses. (In full generality it cannot be done just
         | by relying on const; some other annotation is necessary to
         | specify that a function that takes a const pointer doesn't cast
         | the constness away and modify the object regardless.)
        
         | DougGwyn wrote:
         | Many uses of strchr do write via a pointer derived from a
         | non-const declaration. When we introduced the const qualifier,
         | it was noted that it actually declares read-only access, not
         | unchangeability. The alternative was tried experimentally and
         | the consequent "const poisoning" got in the way.
        
           | coliveira wrote:
           | I believe C is doing the right thing. Const as immutability
           | is a kludge to force the language to operate at the level of
           | data structure/API design, something that it cannot do
           | properly.
        
             | moonchild wrote:
             | Have you ever used a high-level statically-typed language,
             | e.g. haskell?
        
         | pascal_cuoq wrote:
         | Speaking as someone who is not in the committee but has
         | observed trends since 2003 or so, I would say that solving this
         | problem is way beyond the scope of evolutions that will make it
         | in C2a or even the next one.
         | 
         | There are plenty of programming languages that distinguish
         | strongly between mutable and immutable references, and that
         | have the parametric polymorphism to let functions that can use
         | both kinds return the same thing you passed to them, though. C
         | will simply just never be one of them.
        
         | pwdisswordfish2 wrote:
         | One proposal solved this by doing exactly that:
         | 
         | http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2068.pdf
        
         | _kst_ wrote:
         | strchr() is one of several C library functions that have this
         | issue.
         | 
         | C++ solved this by overloading strchr():
         | 
         |     const char *strchr(const char *s, int c);
         |     char *strchr(char *s, int c);
         | 
         | C of course doesn't have overloading.
         | 
         | One solution could have been to define two functions with
         | different names, perhaps "strchr" and "strcchr". The time to do
         | that would have been 1989, when the original ANSI C standard
         | was published.
         | 
         | I suppose a future C standard could leave strchr() as it is
         | (necessary to avoid breaking existing code) and add two new
         | functions.
        
       | JoshTriplett wrote:
       | What are the chances of typeof, or statement expressions, finding
       | their way into the C standard? They're already widely
       | implemented.
        
         | msebor wrote:
         | Several of us discussed typeof and I'd expect a proposal for a
         | feature along these lines to be well received. (I recall
         | someone even saying they're working on one but that shouldn't
         | stop anyone from submitting one of their own.)
        
           | JoshTriplett wrote:
           | I'm glad to hear that.
           | 
           | What about statement expressions? They're quite useful, and
           | supported by multiple independent compilers.
        
             | msebor wrote:
             | I'm not aware of recent proposals for those but we have
             | discussed ideas along those lines (closures: N2030, C++
             | lambdas, Apple Blocks: N1451, and I think there was one
             | from Cilk). I think there was interest but not enough
             | support for the details and likely also concerns from
             | implementers.
        
       | rseacord wrote:
       | So what do people think about having a feature in the C language
       | akin to the defer statement in GoLang?
       | 
       | The GoLang defer statement defers the execution of a function
       | until the surrounding function returns. The deferred call's
       | arguments are evaluated immediately, but the function call is not
       | executed until the surrounding function returns. It looks like an
       | interesting mechanism for cleaning up resources.
        
         | [deleted]
        
         | bokwoon wrote:
         | How about deferring until the surrounding block scope ends? In
         | Go you can get around the limitation of defer only executing at
         | the end of a function by wrapping any arbitrary section of code
         | inside an immediately executed anonymous function. But in C I'm
         | not sure that's possible so maybe one could declare a new block
         | scope instead to control when defer kicks in.
        
         | NickDunn wrote:
         | It could be very useful for cleaning resources. I've never used
         | GoLang, but can see how that could be useful in various
         | circumstances. As we're talking about C, I suspect a feature
         | like that, with the potential to make things safer, would also
         | enable the unwary to shoot themselves in the foot more easily.
        
         | majke wrote:
         | I personally don't like golang's defer. For me it obscures the
         | flow of the program. For example when I acquire a lock, I like
         | to see where exactly it's released.
         | 
         | For me "defer" only makes sense in the context of exceptions,
         | basically as an equivalent to "finally". This is a slippery
         | slope though, since golang's exceptions are, for a reason,
         | rudimentary.
        
         | smasher164 wrote:
         | I would love to see defer in the language. It helps keep
         | cleanup code close to the resource that is acquired.
         | 
         | Would the proposed defer statement apply to loops as well? How
         | would one implement such defers without dynamic allocation?
        
         | pascal_cuoq wrote:
         | It sounds like the __attribute__((cleanup(...))) already
         | offered by GCC is similar to this. I probably won't have time
         | to investigate the differences while the AMA is ongoing though.
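
A small sketch of the GCC extension mentioned above, which Clang also accepts. It is not standard C, and the names `free_charp`, `use_buffer`, and `cleanup_calls` are made up for this example. One visible difference from Go's defer is that the handler runs when the variable leaves its enclosing block scope, not when the whole function returns.

```c
#include <assert.h>
#include <stdlib.h>

static int cleanup_calls = 0;   /* counter kept only for demonstration */

/* A cleanup handler receives a pointer to the annotated variable. */
static void free_charp(char **p)
{
    free(*p);
    *p = NULL;
    cleanup_calls++;
}

static int use_buffer(size_t n)
{
    assert(n > 0);

    /* free_charp(&buf) runs automatically when buf goes out of scope,
     * on every path out of this block. */
    char *buf __attribute__((cleanup(free_charp))) = malloc(n);
    if (buf == NULL)
        return -1;      /* handler still runs; free(NULL) is a no-op */

    buf[0] = '\0';
    return 0;           /* handler runs as the function exits */
}
```

Because the handler is tied to a variable declaration, a loop body that declares such a variable gets one cleanup per iteration with no dynamic bookkeeping.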
        
       | LHopital wrote:
       | I'll pass! Thanks though.
        
         | dang wrote:
         | Please don't do this here.
        
         | [deleted]
        
       | jboschpons wrote:
       | Hola!
        
       | gnachman wrote:
       | Since 1999, a lot of undefined behavior has been added to the
       | language to improve compilers' ability to optimize. For example,
       | pointer aliasing rules. How have you measured the benefit?
        
       ___________________________________________________________________
       (page generated 2020-04-14 23:00 UTC)