[HN Gopher] C's Biggest Mistake (2009)
___________________________________________________________________
C's Biggest Mistake (2009)

Author : todsacerdoti
Score  : 46 points
Date   : 2020-09-12 18:29 UTC (4 hours ago)

(HTM) web link (digitalmars.com)
(TXT) w3m dump (digitalmars.com)

| david2ndaccount wrote:
| In C you can declare pointers to arrays; the syntax is just somewhat strange. You can even declare a pointer to a variable-sized array with C99, e.g.:
|
|     void foo(size_t length, char (*x)[length]){
|         size_t size = sizeof(*x);
|         assert(size == length);
|         printf("sizeof(*x): %zu\n", sizeof(*x));
|     }
| dependenttypes wrote:
| Here is a post from 2014 from someone who is on the C standard committee. https://gustedt.wordpress.com/2014/09/08/dont-use-fake-matri...
| WalterBright wrote:
| Author here. I'll be blunt and repeat a prediction I made 3 years ago or so:
|
| C is finished if it doesn't address the buffer overflow problem, and this proposal is a simple, easy, backwards compatible way to do it. It is simply too expensive to deal with buffer overflow bugs anymore.
|
| This one addition will revolutionize C programming like adding function prototypes did.
| enriquto wrote:
| The real troubles are undefined behavior and aliasing. Buffer overflows are just a well-known gimmick of the language that is more or less controllable with some discipline. Aliasing is hell. You cannot even use a global variable safely!
| xxpor wrote:
| Isn't that what asan/ubsan is for?
|
| Granted, it's not static analysis, but it should catch most aliasing-related errors, no?
| fizixer wrote:
| I'm fully in the camp of C plus powerful analysis tools, plus a high-level language (Python or Scheme).
| nsajko wrote:
| No. There has been some effort in that direction: somebody proposed a Clang "type sanitizer" patch, but it wasn't merged.
| Ar-Curunir wrote:
| only if your tests exercise that code path
| eps wrote:
| I'm not sure whom this proposal is aimed at exactly.
|
| Any production-quality C code will already use a (pointer + count) combo when passing arrays to a function, which is something that will still be needed under your proposal because the vast majority of arrays are dynamically sized. So unless _all_ arrays in C are given the fat pointer treatment, I don't really see how what you suggest would make much of a difference. That is, if fat pointers are made a first-class language construct, then, yes, that can be useful... though I disagree that if it's not done, it will cause the demise of C.
| leetcrew wrote:
| pointer + size does not really fix anything, as you are relying on the programmer to correctly keep track of the size. I'm not even sure what alternative this improves upon. even more error-prone null value marking the end? praying the array will be big enough (looking at you, gets!)?
|
| unless you have a team of incredibly diligent coders, people are going to read past the end of bare arrays over and over again. one specific mistake I keep seeing is where people misinterpret the meaning of a variable named `size`. is it the number of elements or the size in bytes? who knows, but it's probably UB either way if you're wrong.
| true_religion wrote:
| Would you just wrap the pointer and size in a struct, then only iterate the array via a library of functions that check the size first?
|
| I don't code C full time, but it's what I have always done when needing to use C via FFI to get a speedup in a dynamic language.
| mhh__ wrote:
| Any production-quality C code?
|
| Any or some? I'm not sure if I've seen that in the wild.
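A minimal sketch of the approach true_religion describes above: a struct that carries the pointer and the element count together, accessed only through helpers that check the index first. The names `int_slice`, `slice_get`, and `slice_set` are hypothetical and chosen for illustration; this is not the syntax proposed in the article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": the data pointer travels with its element count. */
typedef struct {
    int   *ptr;
    size_t len;   /* number of elements, not bytes */
} int_slice;

/* Reads element i into *out; returns false instead of reading out of bounds. */
bool slice_get(const int_slice *s, size_t i, int *out) {
    assert(s != NULL && out != NULL);
    if (i >= s->len) {
        return false;
    }
    *out = s->ptr[i];
    return true;
}

/* Writes value to element i; returns false instead of writing out of bounds. */
bool slice_set(int_slice *s, size_t i, int value) {
    assert(s != NULL);
    if (i >= s->len) {
        return false;
    }
    s->ptr[i] = value;
    return true;
}
```

Given `int buf[4]` and `int_slice s = { buf, 4 }`, a call such as `slice_get(&s, 4, &v)` returns false rather than reading past the end. As leetcrew points out, the stored length is still only as trustworthy as the code that filled it in.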
| baby wrote:
| I don't see a future where C survives, not only because of memory corruption bugs (although that's a pretty big one), but also for usability: the lack of package manager, common build system, good documentation, good standard library, etc. are just too much to compete with any modern systems language.
| edoceo wrote:
| Those are features which make C flexible on mainstream platforms and also usable on so many other platforms where other languages just don't/won't work.
| pjmlp wrote:
| Those features are not unique to C, they are just cargo-culted as such.
| dependenttypes wrote:
| > the lack of package manager
|
| Just use nix or even apt. Both of them are MUCH better when compared to trash like npm or cargo which do not even check for signatures.
|
| > common build system
|
| Such as make? There is also Ninja/Meson if you prefer.
| pjmlp wrote:
| Unfortunately until we get rid of UNIX/POSIX clones, C will be kept around.
|
| So not in my lifetime.
| fortran77 wrote:
| C will survive, if just for embedded/systems programming where you need a "portable assembly language" that can run on the simplest CPUs.
| mhh__ wrote:
| That's because of sunk cost rather than design.
|
| Thanks to LLVM and GCC you can happily write embedded code in a higher level language, but the vendors don't bother supporting it because a lot of embedded coding isn't really what we would call software (no tests etc.)
| ajsnigrutin wrote:
| > I don't see a future where C survives
|
| I've been seeing those exact words for decades now, and C is still going strong. Every few years a new language comes along, someone writes something in it that was written in C before, someone might even write a basic OS in it, and after a few years, that language is almost forgotten, a new one is here, and again, someone is writing something in it, but in the end, we still use C for the things we used it for 10, 20, for some, even 30 years ago.
| blogant wrote:
| > lack of package manager, common build system, good documentation.
|
| This is where C is superior to virtually every other language. It has K&R to start with [1], a wealth of examples to progress from there, man pages, autotools, cmake, static and shared libraries.
|
| > good standard library.
|
| It should have hash tables at least, but it isn't bad.
|
| [1] Which is still the best language book ever written (yes, it has some anti-patterns, you unlearn them quickly).
| non-entity wrote:
| As much as I dislike C
|
| > There are only two kinds of languages: the ones people complain about and the ones nobody uses.
|
| This unfortunately seems to mostly hold true.
| unsatchmo wrote:
| Why not both? There are a few languages that nobody uses and also everybody seems to complain about.
| samatman wrote:
| SQLite alone has a support contract through 2050.
|
| C survives.
| rumanator wrote:
| > the lack of package manager
|
| What do you call Linux distros' package managers then? I mean, in distributions like Debian you can even download a package's source code with apt-get.
| forrestthewoods wrote:
| > I don't see a future where C survives
|
| if C dies then what replaces it?
| nicoburns wrote:
| Perhaps a combination of a language like Zig (a 1:1 replacement for situations where you really do want a lot of manual low-level control) and higher-level languages like Rust eating into more and more of the use cases.
| humanrebar wrote:
| I don't see C being in much worse shape than C++ with respect to build system and package manager. It's slow going, but progress seems to be happening there.
|
| Are you saying both are doomed? Or is there some scenario where C++ survives without C?
| vlovich123 wrote:
| I think both are, long term (think FORTRAN where it's not particularly popular but a lot of existing code is maintained and not rewritten).
|
| C++ is actually in a slightly better spot, ironically, because it's harder to integrate with. If you have a C program you can pretty easily start replacing parts with Rust. You can't do the same with C++ which insulates it better in that sense.
| ryl00 wrote:
| Reports of Fortran's death (latest standard 2018) are greatly exaggerated (much like C). It's receded to a niche, but it's still a very important niche (numerical, HPC). Hopefully, the development of a new Fortran front end for LLVM (from PGI/Nvidia?) pans out, as this would fill a gap in LLVM's offerings, and provide more competition for ifort and gfortran.
| vlovich123 wrote:
| You're proving my point. FORTRAN is a niche language. C++ is still mainstream. It will recede but not completely disappear.
| baby wrote:
| I don't see a great future for C++ either
| Upvoter33 wrote:
| "C is finished if it doesn't address the buffer overflow problem"
|
| You should keep making this prediction ... one day you might be right! :)
| WalterBright wrote:
| I suspect C has been steadily losing ground since I made it.
| rumanator wrote:
| C has been "losing ground" not because of random pet peeves of those who never wrote a line of code in C but because since C's last standard update there have been other programming languages that offer developers something of value so that the trade-off between using C or any alternative starts to make technical sense.
|
| It also helps that C's standardization proceeds in ways that feel somewhere between sabotage and utter neglect.
|
| Meanwhile, C is still the absolute best binary interop language devised by mankind.
| teej wrote:
| C code is being replaced by Rust fast. The only limit is how quickly programmers can become good at Rust. It's already happening.
| viraptor wrote:
| I love the move to d/rust/zig/nim/... but there are other issues too.
| Ecosystem of libraries, stabilisation of common patterns (futures and Tokio issues are still out there), platform compatibilities, industry support for moving away from known solutions, and many other issues. Even if we all suddenly knew Rust perfectly tomorrow, there are other issues in the way.
| rightbyte wrote:
| The proposal is just syntactic sugar for a size argument. It doesn't add or solve anything really.
| WalterBright wrote:
| In my experience with language design, a little bit of syntactic sugar can have transformative results.
|
| C's function prototypes, syntactic sugar added circa 1990, were transformative for C programming.
| Koshkin wrote:
| I wonder if a better idea (in principle) would be to have some kind of hardware implementation, sort of like a finer-grained memory segmentation.
| mhh__ wrote:
| It allows automatic bounds checking, i.e. I don't need to point out how many bugs that could fix.
|
| If you're worried about performance, test it and turn it off.
| skywhopper wrote:
| It's not a "mistake". This article is complaining about a misinterpretation of C's functionality. Arrays are not "real" data structures in C: there's no such thing. The array-ish syntax that's available is just some syntactic sugar on top of pointers. You could say that having the sugar at all is a mistake. Or that C is incomplete without first-class array types. This is a cute hack, but at this point (far more than 10 years ago) it's probably better to move on to Rust if you don't like this aspect of C, rather than proposing to hack the language.
| chadcmulligan wrote:
| Niklaus Wirth would concur - that was similar to his argument for Pascal - strings contain a size.
| SamReidHughes wrote:
| The biggest mistake to me feels like implicit integer conversions. That's where C feels like it's really out to get you.
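One common instance of the implicit integer conversions SamReidHughes mentions, as a small sketch: comparing a signed value with an unsigned one converts the signed operand to unsigned first. The function names `naive_less`, `safe_less`, and `check_conversions` are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The comparison C performs implicitly: a is converted to unsigned int
   before comparing, so a negative a becomes a very large value. */
bool naive_less(int a, unsigned int b) {
    return a < b;
}

/* Handles the sign explicitly before comparing as unsigned. */
bool safe_less(int a, unsigned int b) {
    return a < 0 || (unsigned int)a < b;
}

/* Hypothetical self-check illustrating the difference. */
void check_conversions(void) {
    assert(!naive_less(-1, 1u));  /* -1 becomes UINT_MAX, which is not < 1 */
    assert(safe_less(-1, 1u));
    assert(naive_less(0, 1u) && safe_less(0, 1u));
}
```

GCC and Clang can report the first comparison when `-Wsign-compare` is enabled, but the conversion itself is ordinary, well-defined C.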
| leetcrew wrote:
| on a somewhat related note, I've always wished for something like `explicit` that prevents assigning different typedefs for the same underlying type to each other. like suppose I have two types, WorldVec (vector in worldspace) and ViewVec (vector in view/screenspace). under the hood they are both typedefs for float[3], so I can freely assign them back and forth. but any vector operation that mixes the types would almost always be a bug, since they are in different spaces. would be cool to get this functionality out of the humble typedef.
| ncmncm wrote:
| Agree. And they have leaked out to C++ where they have been very hard to fix, and even, to some degree, to Rust.
| forrestthewoods wrote:
| How have they leaked into Rust? I thought Rust had no implicit conversions?
| steveklabnik wrote:
| There are a small number of coercions, but we do not do them around numeric types, it's true. Not sure what your parent is referring to.
| ncmncm wrote:
| It does, however, have integer overflow, in release mode. So if you do code a conversion, you can end up with a value different from the source.
| rowanG077 wrote:
| I don't feel it's so bad. You have a specific flag to tell the compiler to show warnings if you have any.
| pvg wrote:
| Previously:
|
| https://news.ycombinator.com/item?id=17585357
|
| https://news.ycombinator.com/item?id=1014533
| bumblebritches5 wrote:
| is errno; global state is incompatible with multithreading
| tus88 wrote:
| > Conflating pointers with arrays.
|
| AND Strings.
|
| FTFY.
| yarrel wrote:
| C doesn't have strings. ;-)
| Snarwin wrote:
| More precisely, it doesn't have a string _type_.
| Something1234 wrote:
| Stupid question, but how do I access the size of an array using this fancy new declaration if it were to be added? It doesn't seem like any sugar is there to provide "range based for loops."
|
| Wait, I would just use `sizeof`, but then I'm still doing pointer math?
| WalterBright wrote:
| A macro can be added to access the length property.
| franciscop wrote:
| This was probably the most confusing thing about C when I first started learning programming back in the day. When you call a function you pass the value, except with arrays, where it gets converted to a pointer. It was explained to me back then that the reason is that copying the whole array was not efficient, so it was better to pass the reference.
| quelsolaar wrote:
| I think a better way to think about it is to say that when you type:
|
| int a[10];
|
| you allocate 10 integers and "a" is the pointer to the first one of them.
|
| Arrays are just memory, just like what you get when calling malloc, and memory is accessed using pointers in C.
| rightbyte wrote:
| That mindset doesn't cover sizeof(a) properly.
| ktpsns wrote:
| I don't think it is a mistake in language design. In the 90s, memory was a scarce resource, and it still is in the microprocessor world, where "only" a few kilobytes of RAM are available. There are performance critical paths where passing a size_t is just unnecessary.
|
| The actual mistake is for the user not to pass a size_t. This is one kind of "premature optimization". We can safely say the language design doesn't encourage the user to write safe code, and successor languages do that.
|
| Don't get me wrong -- I'm just trying to make the point that C itself is not to blame. It's people using computers who write the million dollar bugs.
| bobbyi_settv wrote:
| > There are performance critical paths where passing a size_t is just unnecessary
|
| You would still be able to declare your function as taking a pointer (instead of an array, which in this world would be a far pointer) if you need to.
|
| He's saying to deprecate char[] as a parameter type, not char _
| WalterBright wrote:
| That's right, nothing is taken away from the user with my proposal.
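The point franciscop and rightbyte raise above (an array keeps its size at the declaration site but becomes a bare pointer once passed to a function) can be seen with `sizeof`. This is a sketch in today's C, not the article's proposal: `ARRAY_LEN` is the usual existing idiom rather than the macro WalterBright refers to, and `size_seen_by_callee` and `check_decay` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array; must not be applied to a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* A parameter written as `int a[10]` is adjusted to `int *a`, so the
   callee only ever sees a pointer. Declared as a pointer here to make
   that explicit. */
size_t size_seen_by_callee(const int *a) {
    return sizeof a;   /* size of a pointer, not of the caller's array */
}

/* Hypothetical self-check showing both views of the same array. */
void check_decay(void) {
    int a[10] = {0};
    assert(sizeof a == 10 * sizeof(int));             /* whole array */
    assert(ARRAY_LEN(a) == 10);
    assert(size_seen_by_callee(a) == sizeof(int *));  /* decayed to pointer */
}
```

This is why existing C code passes the count as a separate argument: inside the callee there is nothing left to apply `sizeof` to.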
| antiquark wrote:
| An existing alternative is to put an array in a struct:
|
|     struct string123 { char data[123]; };
|
| Then create functions that use pointers to these string123 structs.
| david2ndaccount wrote:
| If you want a pointer to a fixed size array, just use one, e.g.
|
|     char (*data)[123]; // syntax is somewhat awkward
| mav3rick wrote:
| You can typedef that
| WalterBright wrote:
| That does work, except for:
|
| 1. variable length buffers
|
| 2. every other piece of code you want to interface with uses `char*`
| tgb wrote:
| Interestingly, I assume you meant to end your post with "pointer to char" not "char" itself, but asterisk is the italics formatting character on HN so it italicized it. But the funny thing is that it's italicized the "reply" button (as well as an empty i-tag after "char").
| WalterBright wrote:
| The #1 undetected bug problem with C programs is buffer overflows. Experience shows it is extremely difficult to verify that arbitrary C code doesn't have buffer overflows in it. Assistance from the core language design can improve things a great deal.
|
| D allows passing both raw pointers as parameters and pointer/length pairs. It's up to the user to choose. In practice, people have simply moved away from using raw pointers into buffers.
|
| As for performance, in C to determine the length of a string one uses strlen(). Over and over and over again on the same string. This can be a major performance problem, even not considering the memory cache effects. When I look at speeding up C code, often the first nugget of gold is reviewing all the explicit and implicit uses of strlen(). (Implicit uses are functions like strcat()). It's also the first place I look for bugs when reviewing C code - anytime you see a sequence of strlen, strcat, strcpy, it's often broken (typically in neglecting somewhere to account for the extra 0 byte).
| Gibbon1 wrote:
| All of this I agree with.
| In a better world 'arrays' would have been added in the 1980s. The arguments about memory limitations are spurious since if you're writing good code you always pass a pointer and the length. Always, no exceptions.
|
| Yeah, and all the string functions should have been marked as deprecated with C89 and fully deprecated with C99.
| nmarks100 wrote:
| I don't agree that in the days of valgrind, asan etc. the #1 issue is buffer overflows.
|
| #1 and #2 are integer overflows and aliasing mistakes.
| WalterBright wrote:
| valgrind is a marvelous tool, but it only detects actual buffer overflows, not vulnerability to buffer overflows.
| mhh__ wrote:
| You must pass a size_t somewhere, surely? Otherwise you have no idea how long the array is - this is about doing it properly rather than relying on yourself at 9AM to get it right every time.
| quelsolaar wrote:
| Personally I like it the way it is. If you want to copy an array when making a function call you can define a struct with an array in it, and pass the structure.
|
| If C did pass array lengths it still wouldn't matter since C doesn't (and in my opinion shouldn't) check for overflows.
| dependenttypes wrote:
| > since C doesn't ... check for overflows
|
| Because it's a language, not an implementation. An implementation is free to do so (and there are such implementations after all).
| quelsolaar wrote:
| That's correct! C doesn't require checking for overflows, but it also doesn't forbid implementations from doing so. Both are features.
| Koshkin wrote:
| I don't think it is possible, not without changing some parts of C's specification. At the very least you'd need to be able to somehow encode the length of the buffer in the pointer to it. (There is no semantic difference between a pointer to a simple, fixed-length variable and a pointer to an array.)
| bawolff wrote:
| Which is what the article proposed.
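WalterBright's earlier remark about strlen/strcat/strcpy sequences (repeated length scans, and the forgotten extra 0 byte) can be illustrated with a small sketch that measures each string once and carries the lengths along. The function name `concat` is hypothetical, and this is one possible way to write it under those assumptions, not code from the article.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated concatenation of a and b, or NULL if
   allocation fails. The caller frees the result. */
char *concat(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t la = strlen(a);            /* each length measured exactly once */
    size_t lb = strlen(b);
    char *out = malloc(la + lb + 1);  /* +1 for the terminating 0 byte */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, la);
    memcpy(out + la, b, lb + 1);      /* lb + 1 copies b's terminator too */
    return out;
}
```

A strcpy followed by strcat would rescan the destination to find its end; keeping `la` around avoids that scan and makes the `+ 1` for the terminator visible in exactly one place.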
___________________________________________________________________ (page generated 2020-09-12 23:00 UTC)