[HN Gopher] How safe is Zig?
       ___________________________________________________________________
        
       How safe is Zig?
        
       Author : orf
       Score  : 177 points
       Date   : 2022-06-23 15:19 UTC (7 hours ago)
        
 (HTM) web link (www.scattered-thoughts.net)
 (TXT) w3m dump (www.scattered-thoughts.net)
        
       | nwellnhof wrote:
       | UBSan has a -fsanitize-minimal-runtime flag which is supposedly
       | suitable for production:
       | 
       | https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#...
       | 
       | So it seems that null-pointer dereferences and integer overflows
       | can be checked at runtime in C. Besides, there should be
       | production-ready C compilers that offer bounds checking.
        
         | pjmlp wrote:
          | There should be, but there aren't. GCC had a couple of
          | extensions on a branch about 20 years ago that never got
          | merged.
         | 
         | The best is to use C++ instead, with bounds checked library
         | types.
        
           | uecker wrote:
           | You can already get some bounds checking, although more work
           | is needed:
           | 
           | https://godbolt.org/z/abx7KE44z
        
             | pjmlp wrote:
             | Yeah, indeed. Thanks for sharing it.
        
       | pjmlp wrote:
        | This is why, for me, Zig is mostly a Modula-2 with C syntax
        | with regard to safety.
       | 
        | All the runtime tooling it offers has existed for C and C++
        | for at least 30 years, going back to tools like Purify (1992).
        
       | lmh wrote:
       | Question for Zig experts:
       | 
       | Is it possible, in principle, to use comptime to obtain Rust-like
       | safety? If this was a library, could it be extended to provide
       | even stronger guarantees at compile time, as in a dependent type
       | system used for formal verification?
       | 
       | Of course, this does not preclude a similar approach in Rust or
       | C++ or other languages; but comptime's simplicity and generality
       | seem like they might be beneficial here.
        
         | avgcorrection wrote:
         | Why would the mere existence of some static-eval capability
         | give you that affordance?
         | 
         | Researchers have been working on these three things for
         | decades. Yes, "comptime" isn't some Zig invention but a
         | somewhat limited (and anachronistic to a degree) version of
         | what researchers have added to research versions of ML and
         | Ocaml. So can it implement all the static language goodies of
          | Rust _and_ give you dependent types? Sure, why not? After
          | all, computer scientists never had the idea that you could
          | evaluate values and types at compile time. All those
          | research papers about static programming language design
          | will wither on the vine now that people can just use the
          | simplicity and generality of `comptime` to prove programs
          | correct.
        
         | anonymoushn wrote:
         | It is possible in principle to write a Rust compiler in
         | comptime Zig, but the real answer is "no."
        
         | kristoff_it wrote:
          | Somebody implemented part of it in the past, but it was
          | based on the ability to observe the order of execution of
          | comptime blocks, which is going to be removed from the
          | language (and probably already has been).
         | 
         | https://github.com/DutchGhost/zorrow
         | 
          | It's not a complete solution, among other things because it
          | only works if you opt in to accessing variables through it;
          | the language has no way of forcing you to.
        
         | ptato wrote:
         | Not an expert by any means, but my gut says that it would be
         | very cumbersome and not practical for general use.
        
         | pron wrote:
         | Not as it is (it would require mutating the type's "state"),
         | but hypothetically, comptime could be made to support even more
         | programmable types. But could doesn't mean should. Zig values
         | language simplicity and explicitness above many other things.
        
       | dleslie wrote:
       | And here is the table with Nim added; though potentially many
       | GC'd languages would be similar to Nim:
       | 
       | https://uploads.peterme.net/nimsafe.html
       | 
       | Edit: noteworthy addendum: the ARC/ORC features have been
       | released, so the footnote is now moot.
        
         | 3a2d29 wrote:
          | Seeing Nim's "danger" mode made me think: shouldn't Rust's
          | `unsafe` be added too?
          | 
          | It seems inaccurate to display Rust as safe without
          | including the very thing that actually allows memory bugs
          | to appear in public crates.
        
         | jewpfko wrote:
         | Thanks! I'd love to see a Dlang BetterC column too
        
           | Snarwin wrote:
           | Here's a version with D included:
           | 
           | https://gist.github.com/pbackus/0e9c9d0c83cd7d3a46365c054129.
           | ..
           | 
           | The only difference in BetterC is that you lose access to the
           | GC, so you have to use RC if you want safe heap allocation.
        
         | IshKebab wrote:
         | I don't know why Rust gets "runtime" and Nim gets "compile
         | time" for type confusion?
        
           | shirleyquirk wrote:
           | yes, for tagged unions specifically, (which the linked post
           | refers to for that row) Nim raises an exception at runtime
           | when trying to access the wrong field, (or trying to change
           | the discriminant)
        
       | ArrayBoundCheck wrote:
        | I like Zig, but this is taking a page out of Rust's book and
        | exaggerating the unsafety of C and C++.
        | 
        | clang and gcc will both tell you at runtime if you go out of
        | bounds, have an integer overflow, use after free, etc. You
        | need to turn on the sanitizers. You can't have them all on at
        | the same time, because the code would be unnecessarily slow
        | (e.g. having ThreadSanitizer on in a single-threaded app is
        | pointless).
        
         | lijogdfljk wrote:
         | What is the cause of all those notorious C bugs then?
        
           | CodeSgt wrote:
           | > at runtime
        
           | [deleted]
        
         | kubanczyk wrote:
         | Whoa the username checks out perfectly.
        
           | ArrayBoundCheck wrote:
            | Haha, yes. I love knowing I'm in bounds, but unfortunately
            | saying anything about C++ that isn't a criticism is out of
            | bounds, and my comment got downvoted enough that I don't
            | feel like saying more.
        
         | masklinn wrote:
         | > clang and gcc will both tell you at runtime if you go out of
         | bounds [...] You can't have them all on at the same time
         | because code will be unnecessarily slow
         | 
          | Yeah, so clang and gcc don't actually tell you at runtime
          | if you go out of bounds. How many programs ship production
          | binaries with ASan or UBSan enabled, to say nothing of MSan
          | or TSan?
         | 
          | Also, you can't have them all on at the same time because
          | they're not necessarily compatible with one another[0]; you
          | literally can't run with both ASan and MSan, or ASan and
          | TSan.
         | 
         | [0] https://github.com/google/sanitizers/issues/1039
        
           | pjmlp wrote:
            | Quite a few subsystems on Android do, but that is about
            | it.
           | 
           | https://source.android.com/devices/tech/debug/hwasan
        
         | woodruffw wrote:
         | Neither Clang nor GCC has perfect bounds or lifetime analysis,
         | since the language semantics forbid it: it's perfectly legal at
         | compile time to address at some offset into a supplied pointer,
         | because the compiler has no way of knowing that the memory
          | there _isn't_ owned and initialized.
         | 
          | Sanitizers are great; I _love_ sanitizers. But you can't
          | run them in production without a significant performance
          | hit, and production is where they're needed most. I don't
          | believe this post blows that problem out of proportion, and
          | it is correct in noting that we can solve it without
          | runtime instrumentation and overhead.
        
           | AlotOfReading wrote:
            | State-of-the-art sanitizing is pretty consistently in the
            | <50% overhead range (e.g. SanRazor), with things like
            | UBSan coming in under 10%. If you can't afford even that,
            | tools like ASAP have been around for about seven years
            | now, making overhead arbitrarily low by trading off
            | increased false negatives in hot codepaths.
           | 
           | Yes, the "just-enable-the-compiler-flags" approach can be
           | expensive, but the tools exist to allow most people to be
           | sanitizing most of the time. Devs simply don't know what's
           | available to them.
        
             | woodruffw wrote:
             | I'd consider even 10% to be a significant performance hit.
             | People scream bloody murder when CPU-level mitigations
             | cause even 1-2% regressions. The marginal cost of
             | mitigations when memory safe code can run without them is
             | infinite.
             | 
             | But let's say, for the sake of argument, that I can
             | tolerate programs that run twice as long in production.
             | This doesn't improve much:
             | 
              | * I'm not going to be deploying SoTA sanitizers
              | (SanRazor is currently a research artifact; it's not
              | available in mainline LLVM as far as I can tell).
             | 
              | * No sanitizer that I know of _guarantees_ that
              | execution corresponds to memory safety. ASan famously
              | won't detect reads of uninitialized memory (MSan will,
              | but you can't use both at the same time), and it
              | similarly won't detect layout-adjacent overreads/writes.
             | 
             | That's a lot of words to say that I think sanitizers are
             | great, but they're not a meaningful alternative to actual
             | memory safety. Not when I can have my cake and eat it too.
        
               | AlotOfReading wrote:
               | I think we basically agree. Hypothetically ideal memory
               | safety is strictly better, but sanitizers are better than
                | nothing for code using fundamentally unsafe
                | languages. My personal experience is that people are
                | dissuaded from sanitizer usage more by hypothetical
                | (and manageable) issues like overhead than by real
                | implementation problems.
        
               | KerrAvon wrote:
               | If you can afford a 10-50% across-the-board performance
               | reduction, why would you not use a higher-level, actually
               | safe language like Ruby or Python? Remember that the
               | context of this article is Zig vs other languages, so the
               | assumption is you're writing new code.
        
               | slowking2 wrote:
               | Python is usually a lot more than a 50% reduction in
               | performance. Sometimes you need better performance but
               | not the best performance.
        
               | AlotOfReading wrote:
               | I work in real time, often safety critical environments.
               | High level interpreted languages aren't particularly
               | useful there. The typical options are C/C++, hardware
               | (e.g. FPGAs), or something more obscure like Ada/Spark.
               | 
               | But in general, sanitizers are also something you can do
               | to _legacy code_ to bring it closer to safety and you can
                | turn them off for production if you absolutely,
                | definitely need those last few percent (which few
                | people do). It's hard to overstate how valuable all
                | of that is. A big part of the appeal of Zig is its
                | interoperability with C and the ability to introduce
                | it gradually. Compare
               | to the horrible contortions you have to do with CFFI to
               | call Python from C.
        
               | anonymoushn wrote:
               | For Ruby or Python I think you'll be paying more than 90%
        
               | anonymoushn wrote:
               | > People scream bloody murder when CPU-level mitigations
               | cause even 1-2% regressions
               | 
               | For a particular simulation on a particular Cascade Lake
               | chip, mitigations collectively cause it to run about 30%
               | slower. So I won't scream about 1%, but that's a lot of
               | 1%s.
        
               | ArrayBoundCheck wrote:
               | > I'd consider even 10% to be a significant performance
               | hit. People scream bloody murder when CPU-level
               | mitigations cause even 1-2% regressions. The marginal
               | cost of mitigations when memory safe code can run without
               | them is infinite.
               | 
                | What people? And in my experience, Rust has always
                | been much higher than a 2% regression.
        
           | com2kid wrote:
           | > it's perfectly legal at compile time to address at some
           | offset into a supplied pointer, because the compiler has no
           | way of knowing that the memory there isn't owned and
           | initialized.
           | 
            | In embedded land, everything is a flat memory map, odds
            | are malloc isn't used at all, and memory is possibly 0'd
            | on boot.
           | 
           | It is perfectly valid to just start walking all over memory.
           | You have a bunch of #defines with known memory addresses in
           | them and you can just index from there.
           | 
           | Fun fact: Microsoft Band writes crash dumps to a known
           | location in SRAM and because SRAM doesn't instantly lose its
           | contents on reboot, after a crash the runtime checks for
           | crash dump data at that known address and if present would
           | upload the crash dump to servers for analysis and then 0 out
           | that memory.[1]
           | 
           | Embedded rocks!
           | 
           | [1] There is a bit more to it to ensure we aren't just
           | reading random data after along power off, but I wasn't part
           | of the design, I just benefited from a 256KB RAM wearable
           | having crash dumps that we could download debugging symbols
           | for.
        
         | xedrac wrote:
         | > So I'm not covering tools like AddressSanitizer that are
         | intended for testing and are not recommended for production
         | use.
         | 
         | How is it an exaggeration when he explicitly called this out?
        
           | ArrayBoundCheck wrote:
            | ASan isn't just "for testing". A lot of people went
            | straight to the chart (like me) and it reeks of bullshit:
            | double free is the same as use after free, null-pointer
            | dereference is essentially the same as type confusion
            | (since a nullable pointer is confused with a non-null
            | pointer), invalid stack read/write is the same as array
            | out of bounds (or invalid pointers), etc.
            | 
            | I've also never heard of a data race existing without a
            | race condition existing. That's a pointless metric, like
            | many of the ones above.
        
         | hyperpape wrote:
         | Can you explain why, in spite of the fact that (according to
         | you) C & C++ aren't that unsafe, critical projects like
         | Chromium can't get this right?
         | https://twitter.com/pcwalton/status/1539112080590217217
         | 
         | Is the Project Zero team just too lazy to remind Chromium to
         | use sanitizers?
        
           | uecker wrote:
            | I think the big question is whether two teams writing
            | software on a fixed budget, one using Rust and one using
            | C with modern tools and best practices, would end up
            | with a safer product. I think this is not clear at all.
        
             | pcwalton wrote:
             | People have done just that with, for example, Firefox
             | components and found that yes, Rust gives you a safer
             | product.
        
               | uecker wrote:
               | Do you have a pointer? I know they rewrote Firefox
               | components, but I am not aware of a real study with a 1:1
               | comparison.
        
             | uecker wrote:
             | (Ok, I should read the text before sending.)
        
           | jerf wrote:
           | While I'm generally in favor of the proposition that C++ is
           | an intrinsically dangerous language, pointing at one of the
           | largest possible projects that uses it isn't the best
           | argument. If I pushed a button and magically for free Chrome
           | was suddenly in 100% pure immaculate Rust, I'm sure it would
           | still have many issues and problems that few other projects
           | would have, just due to its sheer scale. I would still
           | consider it an open question/problem as to whether Rust can
           | scale up to that size and still be something that humans can
           | modify. I could make a solid case that the difficulty of
           | working in Rust would very accurately reflect a true and
           | essential difficulty of working at that scale in general, but
           | it could still be a problem.
           | 
            | (Also, Rust defenders, please note I'm not saying Rust
            | _can't_ work at that scale. I'm just saying, it's a very
            | big scale
           | and I think it's an open problem. My personal opinion and gut
           | say yes, it shouldn't be any worse than it has to be because
           | of the sheer size (that is, the essential complexity is
           | pretty significant no matter what you do), but I don't _know_
           | that.)
        
             | hyperpape wrote:
             | You're right that Chromium* is a very difficult task, but I
             | disagree with the conclusion you draw. I think Chromium is
             | one of the best examples we can consider.
             | 
             | There would absolutely be issues, including security
             | issues. But there is also very good evidence that the
             | issues that are most exploited in browsers and operating
             | systems relate to memory safety. Alex Gaynor's piece that
             | the author linked is good on this point.
             | 
             | While securing Chromium is huge and a difficult task, it
             | and consumer operating systems are crucial for individual
             | security. Until browsers and consumer operating systems are
             | secure, individuals ranging from persecuted political
             | dissidents to Jeff Bezos won't be secure.
             | 
             | * Actually not sure why I said Chromium rather than Chrome.
             | Nothing hangs on the distinction, afaict.
        
           | ArrayBoundCheck wrote:
            | Considering how much I got downvoted, no, I don't want to
            | comment more about this. But I'll let you ponder why,
            | while using Rust, you could sometimes get a use after free:
           | https://cve.mitre.org/cgi-
           | bin/cvename.cgi?name=CVE-2021-4572...
        
             | hyperpape wrote:
             | Here's the commit: https://github.com/jeromefroe/lru-
             | rs/pull/121/commits/416a2d....
             | 
             | I don't think this does much for your initial claim. Take
             | the most generous reading you can--Rust isn't any better at
             | preventing UAF than C/C++. That doesn't make safe C/C++ a
             | thing, it means that Rust isn't an appropriate solution.
        
               | alfiedotwtf wrote:
               | > Rust isn't any better at preventing UAF than C/C++
               | 
               | Maybe I'm missing something here?
        
               | ArrayBoundCheck wrote:
                | You missed the point, just like the author did when
                | he disqualified all the C++ tools.
                | 
                | Writing unsafe code and removing tools "because
                | production" gets you unsafe code, as shown in that
                | Rust CVE.
        
               | XelNika wrote:
                | With Zig and Rust you have to explicitly opt out with
                | `ReleaseFast` and `unsafe` respectively, and that
                | makes a big difference. Rust has the added safety
                | that you cannot (to my knowledge at least) gain
                | performance by opting out with a flag at compile
                | time; it has to be done with optimized `unsafe`
                | blocks directly in the code.
               | 
               | Lazy C++ is unsafe, lazy Zig is safe-ish, lazy Rust is
               | safe. Given how lazy most programmers are, I consider
               | that a strong argument against C++.
        
               | ArrayBoundCheck wrote:
               | You didn't seem to click the commit the guy linked with
               | the rust code https://github.com/jeromefroe/lru-
               | rs/pull/121/commits/416a2d...
               | 
                | It has nothing to do with opting out. Neither Zig,
                | Rust, nor any other language saves you when you write
                | incorrect unsafe code. My original point is that
                | disqualifying C tools is misleading and that
                | everything suffers from incorrect unsafe code.
        
               | Arnavion wrote:
               | >It has nothing to do with opting out.
               | 
               | It does. The original code compiled because the borrow is
               | computed using `unsafe`. That `unsafe` is the opt-out.
               | 
               | >Zig, Rust and no language saves you when you write
               | incorrect unsafe code. My original point is disqualifying
               | c tools is misleading and everything suffers from
               | incorrect unsafe code
               | 
               | And the other people's point is that if one language
               | defaults to writing unsafe code and the other language
               | requires opting out of safety to write unsafe code, then
               | the second language has merit over the first.
        
         | wyldfire wrote:
          | One interesting distinction is that it sounds as if, for
          | Zig, this is a language feature and not a toolchain
          | feature. Although if there's only one toolchain for Zig,
          | maybe that's a distinction without a difference. At least
          | it's not opt-in; that's really nice. Believe it or not,
          | there are lots of people who write and debug C/C++ code
          | who don't know about sanitizers, or who know about them
          | and never decide to use them.
        
           | throwawaymaths wrote:
           | I think it would be interesting to see zig move towards
           | annotation-based compile time lifetime checking plugin
           | (ideally in-toolchain, but alternatively as a library). You
           | could choose to turn it on selectively for security-critical
           | pathways, turn it off for "trust me" functions, or, do it on
           | "not every recompilation", as desired.
        
           | pjmlp wrote:
            | The irony being that lint has existed since 1979, and
            | just using a static analyser would already be a big
            | improvement in some code bases.
        
       | woodruffw wrote:
       | This was a great read, with an important point: there's always a
       | tradeoff to be made, and we can make it (e.g. never freeing
       | memory to obtain temporal memory safety without static lifetime
       | checking).
       | 
       | One thought:
       | 
       | > Never calling free (practical for many embedded programs, some
       | command-line utilities, compilers etc)
       | 
        | This works well for compilers and embedded systems, but
        | please don't do it in command-line tools that are meant to
        | be scripted against! It would be very frustrating (and a
        | violation of the pipeline spirit) to have a tool that works
        | well for `N` independent lines of input but not `N + 1`
        | lines.
        
         | avgcorrection wrote:
         | > This was a great read, with an important point: there's
         | always a tradeoff to be made, and we can make it (e.g. never
         | freeing memory to obtain temporal memory safety without static
         | lifetime checking).
         | 
         | I.e. we can choose to risk running out of memory? I don't
         | understand how this is a viable strategy unless you know you
         | only will process a certain input size.
        
         | samatman wrote:
         | There are some old-hand approaches to this which work out fine.
         | 
         | An example would be a generous rolling buffer, with enough room
          | for the data you're working on. Most tools that work on a
          | stream of data don't require much memory; they're either
          | doing a peephole transformation or building up data with
          | filtration and aggregation, or some combination of the two.
         | 
          | You can't have a use-after-free bug if you never call free;
          | treating the OS as your garbage collector for memory (not
          | for other resources, please) is fine.
        
           | woodruffw wrote:
           | Yeah, those are the approaches that I've used (back when I
           | wrote more user tools in C). I wonder how those techniques
           | translate to a language like Zig, where I'd expect the naive
           | approach to be to allocate a new string for each line/datum
           | (which would then never truly be freed, under this model.)
        
         | anonymoushn wrote:
         | I've been writing a toy `wordcount` recently, and it seems like
         | if I wanted to support inputs much larger than the ~5GB file
         | I'm testing against, or inputs that contain a lot more unique
         | strings per input file size, I would need to realloc, but I
         | would not need to free.
        
           | woodruffw wrote:
           | Is that `wordcount` in Zig? My understanding (which could be
           | wrong) is that reallocation in Zig would leave the old buffer
           | "alive" (from the allocator's perspective) if it couldn't be
           | expanded, meaning that you'd eventually OOM if a large enough
           | contiguous region couldn't be found.
        
             | anonymoushn wrote:
             | It's in zig but I just call mmap twice at startup to get
             | one slab of memory for the whole file plus all the space
             | I'll need. I am not sure whether Zig's
             | GeneralPurposeAllocator or PageAllocator currently use
             | mremap or not, but I do know that when realloc is not
             | implemented by a particular allocator, the Allocator
             | interface provides it as alloc + memcpy + free. So I think
             | I would not OOM. In safe builds when using
             | GeneralPurposeAllocator, it might be possible to exhaust
             | the address space by repeatedly allocating and freeing
              | memory, but I wouldn't expect to run into this by
              | accident.
        
               | woodruffw wrote:
               | That's interesting, thanks for the explanation!
        
               | dundarious wrote:
                | They don't (at least the GPA's default backing
                | allocator is the page_allocator, which doesn't).
                | https://github.com/ziglang/zig/blob/master/lib/std/heap.zig
        
       | avgcorrection wrote:
        | A meta point to make here, but I don't quite understand the
        | pushback that Rust has gotten. How often does a language come
        | around that flat-out eliminates certain errors statically,
        | and at the same time manages to stay in that low-level-
        | capable pocket? _And_ doesn't require a PhD (or heck, a
        | scholarly stipend) to use? Honestly, that might be a once-in-
        | a-lifetime kind of thing.
       | 
       | But not requiring a PhD (hyperbole) is not enough: it should be
       | Simple as well.
       | 
       | But unfortunately Rust is ( _mamma mia_ ) Complex and only
       | pointy-haired Scala type architects are supposed to gravitate
       | towards it.
       | 
        | But think of what the distinction between no-found-bugs
        | (testing) and no-possible-bugs (for a certain class of bugs)
        | buys you: you don't ever have to think about those kinds of
        | things, as long as you trust the compiler and the unsafe
        | code that you rely on.
       | 
       | Again, I could understand if someone thought that this safety was
       | not worth it if people had to prove their code safe in some
       | esoteric metalanguage. And if the alternatives were fantastic.
       | But what are people willing to give up this safety for? A whole
       | bunch of new languages which range from improved-C to high-level
       | languages with low-level capabilities. And none of them seem to
        | give alternative iron-clad guarantees. In fact, one of their
        | _selling points_ is mere optionality: you can have some
        | safety, and/or you can turn it off in release. So you get
        | runtime checks which you might (culturally or technically)
        | be encouraged to turn off when you actually want your code
        | to run out in the wild, where users give all sorts of
        | unexpected input (not just your "asdfg" test input) and get
        | your program into weird states that you didn't have time to
        | even think of. (Of course, Rust does the same thing with
        | certain non-memory-safety checks, like integer overflow.)
        
         | the__alchemist wrote:
         | This is a concise summary of why I'm betting on Rust as the
         | future of performant and embedded computing. You or I could
         | poke holes in it for quite some time. Yet, I imagine the holes
         | would be smaller and less numerous than in any other language
         | capable in these domains.
         | 
         | I think some of the push back is from domains where Rust isn't
         | uniquely suited. Eg, You see a lot of complexity in Rust for
         | server backends; eg async and traits. So, someone not used to
         | Rust may see these, and assume Rust is overly complex. In these
         | domains, there are alternatives that can stand toe-to-toe with
         | it. In lower-level domains, it's not clear there are.
        
           | cogman10 wrote:
            | > I think some of the push back is from domains where
            | Rust isn't uniquely suited. E.g., you see a lot of
            | complexity in Rust for server backends, such as async and
            | traits. So someone not used to Rust may see these and
            | assume Rust is overly complex. In these domains, there
            | are alternatives that can stand toe-to-toe with it. In
            | lower-level domains, it's not clear there are.
           | 
           | The big win for rust in these domains is startup time, memory
           | usage, and distributable size.
           | 
           | It may be that these things outweigh the easier programming
           | of go or java.
           | 
            | Now if you have a big long-running server with lots of
            | hardware at your disposal, then rust doesn't make a whole
            | lot of sense. However, if you want something like an aws
            | lambda or rapid up/down scaling based on load, rust might
            | start to look a lot more tempting.
        
         | dilap wrote:
         | What Rust does is incredibly cool and impressive.
         | 
         | But as someone that's dabbled a bit in both Zig and Rust, I
         | think there's a lot of incidental complexity in Rust.
         | 
         | For example, despite having used them and read the docs, I'm
         | still not exactly sure how namespaces work in Rust. It takes
         | 30s to understand exactly what is going on in Zig.
        
         | kristoff_it wrote:
         | > Of course Rust does the same thing with certain non-memory-
         | safety bug checks like integer overflow.
         | 
         | The problem with getting lost too much in the ironclad
         | certainties of Rust is that you start forgetting that
         | simplicity ( _papa pia_ ) protects you from other problems. You
         | can get certain programs in pretty messed up states with an
         | unwanted wrap around.
         | 
         | Programming is hard. Rust is cool, very cool, but it's not a
         | universal silver bullet.
        
           | avgcorrection wrote:
           | Nothing Is Perfect is a common refrain and non-argument.
           | 
            | If option A has 20 defects and option B has a superset of
            | 25 defects, then option A is better--the fact that option
            | A has defects at all is completely beside the point with
            | regards to relative measurements.
        
             | coldtea wrote:
             | > _If option A has 20 defects and option B has the superset
             | of 25 defects then option A is better_
             | 
             | Only if "defect count" is what you care for.
             | 
             | What if you don't give a fuck about defect count, but
             | prefer simplicity to explore/experiment quickly, ease of
             | use, time to market, and so on?
        
             | Karrot_Kream wrote:
             | But if Option A has 20 defects and takes a lot of effort to
             | go down to 15 defects, yet Option B has 25 defects and
             | offers a quick path to go down to 10 defects, then which
             | option is superior? You can't take this in isolation. The
             | cognitive load of Rust takes a lot of defects out of the
             | picture completely, but going off the beaten path in Rust
             | takes a lot of design and patience.
             | 
             | People have been fighting this fight forever. Should we use
             | static types which make it slower to iterate or dynamic
             | types that help converge on error-free behavior with less
             | programmer intervention? The tradeoffs have become clearer
             | over the years but the decision remains as nuanced as ever.
             | And as the decision space remains nuanced, I'm excited
             | about languages exploring other areas of the design space
             | like Zig or Nim.
        
               | avgcorrection wrote:
               | > But if Option A has 20 defects and takes a lot of
               | effort to go down to 15 defects, yet Option B has 25
               | defects and offers a quick path to go down to 10 defects,
               | then which option is superior?
               | 
               | Yes. If you change the entire premise of my example then
               | things are indeed different.
               | 
               | Rust eliminates some defects entirely. Most other low-
               | level languages do not. You would have to use a language
               | like ATS to even compete.
               | 
               | That's where the five-less-defects thing comes from.
               | 
                | Go down to ten defects? What are you talking about?
        
             | kristoff_it wrote:
             | Zig keeps overflow checks in the main release mode
             | (ReleaseSafe), Rust defines ints as naturally wrapping in
             | release. This means that Rust is not a strict superset of
             | Zig in terms of safety, if you want to go down that route.
             | 
             | I personally am not interested at all in abstract
             | discussions about sets of errors. Reality is much more
             | complicated, each error needs to be evaluated with regards
             | to the probability of causing it and the associated cost.
             | Both things vary wildly depending on the project at hand.
        
               | avgcorrection wrote:
               | > This means that Rust is not a strict superset of Zig in
               | terms of safety, if you want to go down that route.
               | 
               | Fair.
               | 
               | > I personally am not interested at all in abstract
               | discussions about sets of errors.
               | 
               | Abstract? Handwaving "no silver bullet" is even more
               | abstract (non-specific).
        
       | einpoklum wrote:
       | One-liner summary: Zig has run-time protection against out-of-
       | bounds heap access and integer overflow, and partial run-time
       | protection against null pointer dereferencing and type mixup (via
       | optionals and tagged unions); and nothing else.
        
       | tptacek wrote:
       | "Temporal" and "spatial" is a good way to break this down, but it
       | might be helpful to know the subtext that, among the temporal
       | vulnerabilities, UAF and, to an extent, type confusion are the
       | big scary ones.
       | 
       | Race conditions are a big ugly can of worms whose exploitability
       | could probably be the basis for a long, tedious debate.
       | 
       | When people talk about Zig being unsafe, they're mostly reacting
       | to the fact that UAFs are still viable in it.
        
         | jorangreef wrote:
         | I see your UAF and raise you a bleed!
         | 
         | As you know, buffer bleeds like Heartbleed and Cloudbleed can
         | happen even in a memory safe language, they're hard to defend
         | against (padding is everywhere in most formats!), easier to
         | pull off than a UAF, often remotely accessible, difficult to
         | detect, remain latent for a long time, and the impact is
         | devastating. All your RAM are belong to us.
         | 
         | For me, this can of worms is the one that sits on top of the
         | dusty shelf, it gets the least attention, and memory safe
         | languages can be all the more vulnerable as they lull one into
         | a false sense of safety.
        
           | kaba0 wrote:
            | Would that work in the case of Java, for example? It
            | nulls every field as per the specification (at least
            | observably), so unless someone manually writes some byte
            | mangling, I don't see how it would work out.
        
           | tptacek wrote:
           | Has an exploitable buffer bleed (I'm happy with this
           | coinage!) happened in any recent memory safe codebase?
        
             | jorangreef wrote:
             | I worked on a static analysis tool to detect bleeds in
             | outgoing email attachments, looking for non-zero padding in
             | the ZIP file format.
             | 
             | It caught different banking/investment systems written in
             | memory safe languages leaking server RAM. You could
             | sometimes see the whole intranet web page, that the teller
             | or broker used to generate and send the statement, leaking
             | through.
             | 
             | Bleeds terrify me, no matter the language. The thing with
             | bleeds is that they're as simple as a buffer underflow, or
             | forgetting to zero padding. Not even the borrow checker can
             | provide safety against that.
        
               | raphlinus wrote:
               | I am skeptical until I see the details, and strongly
               | suspect you are dealing with a "safe-ish" language rather
               | than one which has Rust-level guarantees. Uninitialized
               | memory reads are undefined behavior in basically all
               | memory models in the C tradition. In Rust it is not
               | possible to make a reference to a slice containing
               | uninitialized memory without unsafe (and the rules around
               | this have tightened relatively recently, see
               | MaybeUninit).
               | 
               | I say this as someone who is doing a lot of unsafe for
               | graphics programming - I want to be able to pass a buffer
               | to a shader without necessarily having zeroed out all the
               | memory, in the common case I'm only using some of that
               | buffer to store the scene data etc. I have a safe-ish
               | abstraction for this (BufWriter in piet-gpu, for the
               | curious), but it's still possible for unsafe shaders to
               | do bad things.
        
               | ghusbands wrote:
               | I would imagine that the scenario is simply reuse of some
               | buffer without clearing it, maybe in an attempt to save
               | on allocations. It can happen across so many (even safe)
               | languages. It doesn't matter what guarantees you have
               | around uninitialised memory if you're reusing an
               | initialised buffer yourself.
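A minimal sketch of the buffer-reuse scenario described above
(hypothetical code, not from any of the incidents mentioned; the
`respond` function and messages are invented for illustration). The
program is fully memory-safe, yet stale bytes from the previous
request leak into the reply:

```rust
// Hypothetical sketch: a fully memory-safe "bleed". `respond` reuses a
// scratch buffer between requests and, by mistake, returns the whole
// buffer instead of only the bytes belonging to the current message.
fn respond(buf: &mut Vec<u8>, msg: &[u8]) -> Vec<u8> {
    buf.resize(64, 0); // grows with zeroes, but does NOT clear old data
    buf[..msg.len()].copy_from_slice(msg);
    // Bug: the reply should be buf[..msg.len()].to_vec(); returning
    // the full buffer ships whatever the previous request left behind.
    buf.clone()
}

fn main() {
    let mut buf = Vec::new();
    respond(&mut buf, b"secret session token: abc123");
    let reply = respond(&mut buf, b"hello");
    // The tail of the reply still contains the earlier secret -- no
    // unsafe code, no uninitialized memory, just a logic error.
    assert!(String::from_utf8_lossy(&reply).contains("token"));
}
```

No language-level guarantee about uninitialized memory helps here,
because every byte involved was initialized -- just by the wrong
request.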
        
               | jorangreef wrote:
               | Hackers exploit any avenue (and usually come in through
               | the basement!), regardless of how skeptical we might be
               | that they won't. They don't need the details, they'll
               | figure it out. You give them a scrap and they'll get the
               | rest. It's a different way of thinking that we're not
               | used to, and don't understand unless we're exposed to it
               | first-hand, e.g. through red-teaming.
               | 
               | For example, another way to think of this is that you
               | have a buffer of initialized memory, containing a view
               | onto some piece of data, from which you serve a subset to
               | the user, but you get the format of the subset wrong, so
               | that parts of the view leak through. That's a bleed.
               | 
               | Depending on the context, the bleed may be enough or it
               | might be less severe, but the slightest semantic gap can
               | be chained and built up into something major. Even if it
               | takes 6 chained hoops to jump through, that's a low bar
               | for a determined attacker.
        
               | woodruffw wrote:
               | > For example, another way to think of this is that you
               | have a buffer of initialized memory (no unsafe),
               | containing a view onto some piece of data, from which you
               | serve a subset to the user, but you get the format of the
               | subset wrong, so that parts of the view leak through.
               | That's a bleed.
               | 
               | If there's full initialization then this is just a logic
               | error, no? Apart from some kind of capability typing over
               | ranges of bytes (not very ergonomic), this would be a
               | very difficult subtype of "bleed" to statically describe,
               | much less prevent.
        
               | jorangreef wrote:
               | Yes, exactly. That's what I was driving at. It's just a
               | logic error, that leaks sensitive information, by virtue
               | of leaking the wrong information. File formats in
               | particular can make this difficult to get right. For
               | example, the ZIP file format (that I have at least some
               | experience with bleeds in) has at least 9 different
               | places where a bleed might happen, and this can depend on
               | things like: whether files are added incrementally to the
               | archive, the type of string encoding used for file names
               | in the archive etc.
        
               | woodruffw wrote:
               | Makes sense! My colleagues work on some research[1]
               | that's intended to be the counterpart to this:
               | identifying which subset of a format parser is actually
               | activated by a corpus of inputs, and automatically
               | generating a subset parser that only accepts those
               | inputs.
               | 
               | I think you mentioned WUFFS before the edit; I find that
               | approach very promising!
               | 
               | [1]: https://www.darpa.mil/program/safe-documents
        
               | jorangreef wrote:
               | Thanks! Yes, I did mention WUFFS before the edit, but
               | then figured I could make it a bit more detailed. WUFFS
               | is great.
               | 
               | The SafeDocs program and approach looks incredible.
               | Installing tools like this at border gateways for SMTP
               | servers, or as a front line defense before vulnerable AV
               | engine parsers (as Pure is intended to be used), could
               | make such a massive dent against malware and zero days.
        
               | raphlinus wrote:
               | Thanks for the explanation. I would consider that type of
               | logic error more or less impossible to defend at the
               | language level, but I can see how analysis tools can be
               | helpful.
        
               | tptacek wrote:
               | You have my attention!
        
               | jorangreef wrote:
               | Wow, that's saying something!
               | 
               | The tool is called Pure [1]. It was originally written in
               | JavaScript and open-sourced, then rewritten for Microsoft
               | in C at their request for performance (running sandboxed)
               | after it also detected David Fifield's "A Better Zip
               | Bomb" as a zero day.
               | 
               | I'd love to rewrite it in Zig to benefit from the checked
               | arithmetic, explicit control flow and spatial safety--
               | there are no temporal issues for this domain since it's
               | all run-to-completion single-threaded.
               | 
               | Got to admit I'm a little embarrassed it's still in C!
               | 
               | [1] https://github.com/ronomon/pure
        
       | dkersten wrote:
       | I'm not sure I understand the value of an allocator that doesn't
       | reuse allocations, as a bug prevention thing. Is it just for
       | performance? (Since it's never reused, allocation can simply
       | be incrementing an offset by the size of the allocation.)
       | Because
       | beyond that, you can get the same benefit in C by simply never
       | calling free on the memory you want to "protect" against use-
       | after-free.
        
         | anonymoushn wrote:
         | The allocations are freed and the addresses are never reused.
         | So heap use-after-frees are segfaults.
        
         | kaba0 wrote:
          | I believe it is only for performance, as malloc will have
          | to find a place for the allocation, while it is only a
          | pointer bump for a certain kind of allocator.
        
       | AndyKelley wrote:
       | I have one trick up my sleeve for memory safety of locals. I'm
       | looking forward to experimenting with it during an upcoming
       | release cycle of Zig. However, this release cycle (0.10.0) is all
       | about polishing the self-hosted compiler and shipping it. I'll be
       | sure to make a blog post about it exploring the tradeoffs - it
       | won't be a silver bullet - and I'm sure it will be a lively
       | discussion. The idea is (1) escape analysis and (2) in safe
       | builds, secretly heap-allocate possibly-escaped locals with a
       | hardened allocator and then free the locals at the end of their
       | declared scope.
        
         | skullt wrote:
         | Does that not contradict the Zig principle of no hidden
         | allocations?
        
           | kristoff_it wrote:
           | I don't know the precise details of what Andrew has in mind
           | but the compiler can know how much memory is required for
           | this kind of operation at compile time. This is different
           | from normal heap allocation where you only know how much
           | memory is needed at the last minute.
           | 
           | At least in simple cases, this means that the memory for
           | escaped variables could be allocated all at once at the
           | beginning of the program not too differently to how the
           | program allocates memory for the stack.
        
             | messe wrote:
             | Static allocation at the beginning of the program like that
             | can only work for single threaded programs with non-
             | recursive functions though, right?
             | 
             | I'd hazard a guess that the implementation will rely on
             | use-after-free faulting, meaning that the use of any
             | escaped variable will fault rather than corrupting the
             | stack.
        
         | remexre wrote:
         | Could this be integrated into the LLVM SafeStack pass? (I don't
         | know how related Zig still is to LLVM, or if your thing would
         | be implemented there.)
        
       | LAC-Tech wrote:
       | Safe enough. You can use `std.testing.allocator` and it will
       | report leaks etc in your test cases.
       | 
       | What rust does sounds like a good idea in theory. In practice it
       | rejects too many valid programs, over-complicates the language,
       | and makes me feel like a circus animal being trained to jump
        | through hoops. Zig's solution is hands down better for actually
       | getting work done, plus it's so dead simple to use arena
       | allocation and fixed buffers that you're likely allocating a lot
       | less in the first place.
       | 
       | Rust tries to make allocation implicit, leaving you confused when
       | it detects an error. Zig makes memory management explicit but
       | gives you amazing tools to deal with it - I have a much clearer
       | mental model in my head of what goes on.
       | 
       | Full disclaimer, I'm pretty bad at systems programming. Zig is
       | the only one I've used where I didn't feel like memory management
       | was a massive headache.
        
       | afdbcreid wrote:
        | Can compilers really never call `free()`?
        | 
        | A simple compiler probably can. The most complex probably
        | cannot (I don't want to imagine a Rust compiler without
        | freeing memory: it has 7 layers of lowering (source
        | code->tokens->ast->HIR->THIR->MIR->monomorphized MIR,
        | excluding the final LLVM IR) and also allocates a lot while
        | type-checking or borrow-checking).
       | 
       | What is most interesting to me is the average compiler. Does
       | somebody have statistics on the average amount compilers allocate
       | and free?
        
         | com2kid wrote:
          | > Can compilers really never call `free()`?
         | 
         | I worked on, one of the many, Microsoft compiler teams, though
         | as a software engineer in test not directly on the compiler
         | itself, and I believe the lead dev told me they don't free any
         | memory, though I could be misremembering since it was my first
         | job out of college.
         | 
          | Remember, C compilers often work one file at a time (and a
          | LOLWTF # of includes); the majority of work goes into
          | making a single output file, and then you are done. Freeing
          | memory would just take time; better to just hand it all
          | back to the OS.
         | 
         | Also compilers are obsessed with correctness, generating
         | incorrect code is to be avoided at all costs. Dealing with
         | memory management is just one more place where things can go
         | wrong. So why bother?
         | 
         | I do remember running out of memory using link time code gen
         | though, back when everything was 32bit.
         | 
         | Related, I miss the insane dedication to quality that team had.
         | Every single bug had a regression test created for it. We had
         | regression tests 10-15 years old that would find a bug that
         | would have otherwise slipped through. It was a great way to
         | start my career off, just sad I haven't seen testing done at
         | that level since then!
        
           | kaba0 wrote:
           | Bootstrapping aside, a compiler written in a GCd language
           | would make perfect sense. It really doesn't have any reason
           | to go lower level than that (other than of course, if one
           | wants to bootstrap it in the same language that happens to be
           | a low-level one)
        
             | com2kid wrote:
             | There is no reason to free memory. Your process is going to
             | hard exit after a set workload.
             | 
             | If you wrote a compiler in a GCd language, you'd want to
             | disable the collector because that just takes time, and
             | compilers are slow enough as it is!
        
               | kaba0 wrote:
               | A good GC will not really increase the execution time at
               | all -- they turn on only after a significant "headroom"
               | of allocations. For short runs they will hardly do any
               | work.
               | 
               | Also, most of the work will be done in parallel, and I
               | really wouldn't put aside that a generational GC's
               | improved cache effect (moving still used objects close)
               | might even improve performance (all other things being
               | equal, but they are never of course). All in all, do not
               | assume that just because a runtime has a GC it will
               | necessarily be slower, that's a myth.
        
         | IshKebab wrote:
         | I presume he means that compilers _could_ be written to never
         | call `free()`. I 'm sure that most of them are not written like
         | that, though they do tend to be very leaky and just `exit()` at
         | the end rather than clean everything up neatly (partly because
         | it's faster).
        
         | woodruffw wrote:
         | LLVM uses a mixed strategy: there's both RAII _and_ lots of
         | globally allocated context that only gets destroyed at program
         | cleanup. I believe GCC is the same.
         | 
         | Rustc is written entirely in Rust, so I would assume that it
         | doesn't do that.
        
           | notriddle wrote:
           | The headline feature of rustc memory management is the use of
           | arenas: https://github.com/rust-
           | lang/rust/blob/10f4ce324baf7cfb7ce2b...                   //!
           | The arena, a fast but limited type of allocator.         //!
           | //! Arenas are a type of allocator that destroy the objects
           | within, all at         //! once, once the arena itself is
           | destroyed. They do not support deallocation         //! of
           | individual objects while the arena itself is still alive. The
           | benefit         //! of an arena is very fast allocation; just
           | a pointer bump.
           | 
           | The other thing (not specifically mentioned in this comment,
           | but mentioned elsewhere, and important to understanding why
           | it work the way it does) is that if everything in the arena
           | gets freed at once, it implies that you can soundly treat
           | everything in the arena as having exactly the same lifetime.
           | 
            | You can see an example of how every ty::Ty<'tcx> in rustc
            | winds up with the same lifetime, and an entry point for
            | understanding it more, here in the dev guide:
            | https://rustc-dev-guide.rust-lang.org/memory.html
           | 
           | However, arena allocation doesn't cover all of the dynamic
           | allocation in rustc. Rustc uses a mixed strategy: there's
           | both RAII and lots of arena allocated context that only gets
           | destroyed at the end of a particular phase.
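A minimal sketch of the pointer-bump idea that doc comment describes
(an illustration only, not rustc's actual arena; the `Arena` type here
is invented): allocation is just a push, every reference shares the
arena's lifetime, and everything is destroyed at once when the arena
is dropped:

```rust
use std::cell::RefCell;

// Toy arena: owns every allocation and drops them all together.
struct Arena<T> {
    items: RefCell<Vec<Box<T>>>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: RefCell::new(Vec::new()) }
    }

    // Returns a reference tied to the arena's lifetime, so every
    // allocation can soundly be treated as living equally long.
    fn alloc(&self, value: T) -> &T {
        let boxed = Box::new(value);
        let ptr: *const T = &*boxed;
        self.items.borrow_mut().push(boxed);
        // SAFETY: the Box's heap cell has a stable address and is not
        // dropped until the arena itself is dropped, so the reference
        // stays valid for the arena's lifetime.
        unsafe { &*ptr }
    }
}

fn main() {
    let arena = Arena::new();
    let a = arena.alloc(40);
    let b = arena.alloc(2);
    // Both references share the arena's lifetime; there is no way to
    // free `a` while `b` is still alive -- they go away together.
    println!("{}", a + b);
}
```

rustc's real arenas additionally use chunked bump allocation for speed;
the lifetime story ("everything in the arena lives equally long") is
the part this sketch reproduces.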
        
             | woodruffw wrote:
             | Yep -- arenas compose very nicely with lifetimes, and
             | basically accomplish the same thing as global allocation
             | (in effect, a 'static arena) but with more control.
        
             | zRedShift wrote:
              | For further reading, I recommend Niko's latest blog
              | post, tangentially related to rustc internals (and
              | arena allocation):
              | https://smallcultfollowing.com/babysteps/blog/2022/06/15/wha...
        
         | TazeTSchnitzel wrote:
          | > Can compilers really never call `free()`?
         | 
          | If a compiler has to be run multiple times in the same process,
          | it may use an arena allocator to track all memory, so you can
          | free it all in one go once you're done with compilation.
         | Delaying all freeing until the end effectively eliminates
         | temporal memory issues.
        
         | MaxBarraclough wrote:
         | I imagine plenty of compilers do call _free_ , but here's a
         | 2013 article by Walter Bright on modifying the dmd compiler to
         | never free, and to use a simple pointer-bump allocator (rather
         | than a proper malloc) resulting in a tremendous improvement in
         | performance. [0] (I can't speak to how many layers dmd has, or
         | had at the time.)
         | 
         | The never-free pattern isn't just for compilers of course, it's
         | also been used in missile-guidance code.
         | 
         | [0]
         | https://web.archive.org/web/20190126213344/https://www.drdob...
        
       | verdagon wrote:
       | A lot of embedded devices and safety-critical software don't
       | even use a heap, instead using pre-allocated chunks of memory
       | whose size is calculated beforehand. It's memory safe, and has
       | much more deterministic execution time.
       | 
       | This is also a popular approach in games, especially ones with
       | entity-component-system architectures.
       | 
       | I'm excited about Zig for these use cases especially, it can be a
       | much easier approach with much less complexity than using a
       | borrow checker.
        
         | jorangreef wrote:
         | This is almost what we do for TigerBeetle, a new distributed
         | database being written in Zig. All memory is statically
         | allocated at startup [1]. Thereafter, there are zero calls to
          | malloc() or free(). We run a single-threaded control plane
          | for a simple concurrency model, and because we use
          | io_uring, multithreaded I/O is less of a necessary evil
          | than it used to be.
         | 
         | I find that the design is more memory efficient because of
         | these constraints, for example, our new storage engine can
         | address 100 TiB of storage using only 1 GiB of RAM. Latency is
         | predictable and gloriously smooth, and the system overall is
         | much simpler and fun to program.
         | 
         | [1] "Let's Remix Distributed Database Design"
         | https://www.youtube.com/channel/UC3TlyQ3h6lC_jSWust2leGg
        
           | infamouscow wrote:
           | > Latency is predictable and gloriously smooth, and the
           | system overall is much simpler and fun to program.
           | 
           | This has also been my experience building a database in Zig.
           | It's such a joy.
        
         | snicker7 wrote:
         | How exactly is pre-allocation safer? If you would ever like to
         | re-use chunks of memory, then wouldn't you still encounter
         | "use-after-free" bugs?
        
           | nine_k wrote:
           | No; every chunk is for single, pre-determined use.
           | 
           | Imagine all variables in your program declared as static.
           | This includes all buffers (with indexes instead of pointers),
           | all nested structures, etc.
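A small sketch of this style in Rust (the `Pool`/`Conn` types are
hypothetical, and embedded code would more likely be C or Zig, but the
shape is the same): a fixed array sized at build time, indices instead
of pointers, and no allocator at all:

```rust
// Everything-static style: objects live in a fixed-size array decided
// at build time, and cross-references are indices, not pointers.
// There is no malloc and no free, so use-after-free cannot occur and
// worst-case memory use is known exactly.
const MAX_CONNS: usize = 4;

#[derive(Clone, Copy, Default)]
struct Conn {
    in_use: bool,
    bytes_sent: u64,
}

struct Pool {
    conns: [Conn; MAX_CONNS],
}

impl Pool {
    // Hand out the index of a free slot; exhaustion is an explicit,
    // handleable None instead of a runtime out-of-memory surprise.
    fn acquire(&mut self) -> Option<usize> {
        let idx = self.conns.iter().position(|c| !c.in_use)?;
        self.conns[idx] = Conn { in_use: true, bytes_sent: 0 };
        Some(idx)
    }

    fn release(&mut self, idx: usize) {
        self.conns[idx].in_use = false;
    }
}

fn main() {
    let mut pool = Pool { conns: [Conn::default(); MAX_CONNS] };
    let c = pool.acquire().expect("capacity exceeded");
    pool.conns[c].bytes_sent += 128;
    assert_eq!(pool.conns[c].bytes_sent, 128);
    pool.release(c); // the slot is recycled, never returned to a heap
}
```

As the surrounding comments note, this rules out out-of-memory and
classic use-after-free, but not logic errors from a stale index.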
        
           | bsder wrote:
           | Normally you do this on embedded so that you know _exactly_
           | what your memory consumption is. You never have to worry
           | about Out of Memory and you never have to worry about Use
           | After Free since there is no free. That memory is yours for
           | eternity and what you do with it is up to you.
           | 
           | It doesn't, however, prevent you from accidentally scribbling
           | over your own memory (buffer overflow, for example) or from
           | scribbling over someone else's memory.
        
           | verdagon wrote:
           | The approach can reuse old elements for new instances of the
           | same type, so to speak. Since the types are the same, any
           | use-after-free becomes a plain ol' logic error. We use this
           | approach in Rust a lot, with Vecs.
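A tiny Rust sketch of that point (illustrative names): recycling a
slot for a new value of the same type turns a stale index into a logic
error rather than undefined behavior:

```rust
fn main() {
    // A pool of user slots; indices stand in for pointers.
    let mut users = vec!["alice", "bob"];
    let bob_idx = 1; // someone holds on to this index...

    // ...while bob's slot is "freed" and recycled for a new user.
    users[bob_idx] = "carol";

    // The stale index still yields a valid &str of the same type: no
    // crash, no undefined behavior, just the wrong user. A dangling
    // pointer here would instead be exploitable memory unsafety.
    assert_eq!(users[bob_idx], "carol");
}
```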
        
             | olig15 wrote:
             | But if you have a structure that contains offsets into
             | another buffer somewhere, or an index, whatever - the wrong
             | value here could be just as bad as a use-after-free. I
                | don't see how this is any safer. If you use memory
                | after free from a malloc, with any luck you'll hit a
                | page fault, and your app will crash. If you have an
                | index/pointer to another structure, you could still
                | end up reading past the end of that structure into
                | the unknown.
        
               | verdagon wrote:
               | That's just a logic error, and not memory unsafety which
               | might risk UB or vulnerabilities. The type system
               | enforces that if we use-after-"free" (remember, we're not
               | free'ing or malloc'ing), we just get a different instance
               | of the same type, which is memory-safe.
               | 
                | You do bring up a valid broader concern. Ironically,
                | this is a reason that GC'd systems can sometimes be
                | better for privacy than Ada or Rust, which use a lot
                | more Vec+indexes. An index into a Vec<UserAccount> is
                | riskier than a Java List<UserAccount>; a Java
                | reference can never suddenly point to another user
                | account like an index could.
                | 
                | But that aside, we're talking about memory safety;
                | array-centric approaches in Zig and Rust can be
                | appropriate for a lot of use cases.
        
               | pjmlp wrote:
                | In high-integrity computing that is very much a
                | safety issue, if that logic error causes someone to
                | die due to corrupt data, like using the wrong
                | radiation value.
        
               | [deleted]
        
               | tialaramex wrote:
                | But Java has exactly the same behaviour: the typical
                | List in Java is ArrayList, which sure enough has an
                | indexed get() method.
               | 
               | There seems to be no practical difference here. Rust can
               | do a reference to UserAccount, and Java can do an index
               | into an ArrayList of UserAccounts. Or vice versa. As you
               | wish.
        
             | kaba0 wrote:
              | These are 1000 times worse than even a segfault. These
              | are the bugs you won't notice until they crop up at a
              | wildly different place, and you will have a very hard
              | time tracking them back to their origin (slightly
              | easier in Rust, as you only have to revalidate the
              | unsafe parts, but it will still suck).
        
         | pcwalton wrote:
         | Even in this environment, you can still have dangling pointers
         | to freed stack frames. There's no way around having a proper
         | lifetime system, or a GC, if you want memory safety.
        
           | infamouscow wrote:
           | > Even in this environment, you can still have dangling
           | pointers to freed stack frames.
           | 
           | How frequently does this happen in real software? I learned
           | not to return pointers to stack allocated variables when I
           | was 12 years old.
           | 
           | > There's no way around having a proper lifetime system, or a
           | GC, if you want memory safety.
           | 
           | If you're building an HTTP caching program where you know the
           | expiration times of objects, a Rust-style borrow-checker or
           | garbage collector is not helping anyone.
        
             | Arnavion wrote:
             | >I learned not to return pointers to stack allocated
             | variables when I was 12 years old.
             | 
             | So, if you slip while walking today, does that mean you
             | didn't learn to walk when you were one year old?
        
           | im3w1l wrote:
            | Well, if we get rid of not just the heap but the stack
            | too, turning all variables into global ones, then it
            | will be safe.
            | 
            | This means we lose thread safety and functions become
            | non-reentrant (but easy to prove safe: make sure the
            | graph of A-calls-B is acyclic).
        
           | verdagon wrote:
           | Yep, or generational references [0] which also protect
           | against that kind of thing ;)
           | 
           | The array-centric approach is indeed more applicable at the
           | high levels of the program.
           | 
           | Sometimes I wonder if a language could use an array-centric
           | approach at the high levels, and then an arena-based approach
           | for all temporary memory. Elucent experimented with something
           | like this for Basil once [1] which was fascinating.
           | 
           | [0] https://verdagon.dev/blog/generational-references
           | 
           | [1] https://degaz.io/blog/632020/post.html
        
             | com2kid wrote:
             | > Yep, or generational references [0] which also protect
             | against that kind of thing ;)
             | 
             | First off, thank you for posting all your great articles on
             | Vale!
             | 
             | Second off, I just read the generational references blog
             | post for the 3rd time and now it makes complete sense, like
             | stupid obvious why did I have problems understanding this
             | before sense. (PS: The link to the benchmarks is dead :( )
             | 
             | I hope some of the novel ideas in Vale make it out to the
             | programming language world at large!
        
               | verdagon wrote:
               | Thank you! I just fixed the link, thanks for letting me
               | know! And if any of my articles are ever confusing, feel
               | welcome to swing by the discord or file an issue =)
               | 
               | I'm pretty excited about all the memory safety advances
               | languages have made in the last few years. Zig is doing
               | some really interesting things (see Andrew's thread
               | above), D's new static analysis for zero-cost memory
               | safety hit the front page yesterday, we're currently
               | prototyping Vale's region borrow checker, and it feels
               | like the space is really exploding. Good time to be
               | alive!
        
         | brundolf wrote:
         | Rust's borrow checker would be much calmer in these scenarios
         | too, wouldn't it? If there are no lifetimes, there are no
         | lifetime errors
        
           | thecatster wrote:
           | Rust is definitely different (and calmer imho) on bare metal.
           | That said (as much of a Rust fanboy I am), I also enjoy Zig.
        
           | the__alchemist wrote:
            | Yep! We've entered a grey area, where some Rust embedded
            | libs are expanding the definitions of memory safety, and
            | what the borrow checker should evaluate, beyond what you
            | might guess. E.g., structs that represent peripherals are
            | now checked for ownership, the intent being to prevent
            | race conditions, and traits are used to enforce pin
            | configuration.
        
       | ajross wrote:
       | > In practice, it doesn't seem that any level of testing is
       | sufficient to prevent vulnerabilities due to memory safety in
       | large programs. So I'm not covering tools like AddressSanitizer
       | that are intended for testing and are not recommended for
       | production use.
       | 
       | I closed the window right there. Digs like this (the "not
       | recommended" bit is a link to a now famous bomb thrown by
       | Szabolcs on the oss-sec list, not to any kind of industry
       | consensus piece) tell me that the author is grinding an axe and
       | not taking the subject seriously.
       | 
       | Security is a spectrum. There are no silver bullets. It's OK to
       | say something like "Rust is better than Zig+ASan because", it's
       | quite another to refuse to even treat the comparison and pretend
       | that hardening tools don't exist.
       | 
       | This is fundamentally a strawman. The author wants to argue
       | against a crippled toolchain that is easier to beat instead
       | of one that gets used in practice.
        
         | klyrs wrote:
         | As a Zig fan, I disagree. I think it's really important to
         | examine the toolchain that beginners are going to use.
         | 
         | > I'm also focusing on software as it is typically shipped,
         | ignoring eg bounds checking compilers like tcc or quarantining
         | allocators like hardened_malloc which are rarely used because
         | of the performance overhead.
         | 
          | To advertise that Zig is perfectly safe because things like
         | ASan exist would be misleading, because that's not what users
         | get out of the box. Zig is up-front and honest about the
         | tradeoffs between safety and performance, and this evaluation
         | of Zig doesn't give any surprises if you're familiar with how
         | Zig describes itself.
        
           | ajross wrote:
            | > To advertise that Zig is perfectly safe because things
           | ASan exist would be misleading
           | 
           | Exactly! And for the same reason. You frame your comparison
           | within the bounds of techniques that are used in practice.
           | You don't refuse to compare a tool ahead of time,
           | _especially_ when doing so reinforces your priors.
           | 
           | To be blunt: ASan is great. ASan finds bugs. Everyone should
           | use ASan. Everyone should advocate for ASan. But doing that
           | cuts against the point the author is making (which is
           | basically the same maximalist Rust screed we've all heard
           | again and again), so... he skipped it. That's not good faith
           | comparison, it's spin.
        
             | KerrAvon wrote:
             | ASAN doesn't add memory safety to the base language. It
             | catches problems during testing, assuming those problems
             | occur during the testing run (they don't always! ASAN is
             | not a panacea!). It's perfectly fair to rule it out of
             | bounds for this sort of comparison.
        
       | anonymoushn wrote:
       | I would like Zig to do more to protect users from dangling stack
       | pointers somehow. I am almost entirely done writing such bugs,
       | but I catch them in code review frequently, and I recently moved
       | these lines out of main() into some subroutine:
       | var fba = std.heap.FixedBufferAllocator.init(slice_for_fba);
       | gpa = fba.allocator();
       | 
       | slice_for_fba is a heap-allocated byte slice. gpa is a global.
       | fba was local to main(), which coincidentally made it live as
       | long as gpa, but then it was local to some setup subroutine
       | called by main(). gpa contains an internal pointer to fba, so you
       | run into trouble pretty quickly when you try allocating memory
       | using a pointer to whatever is on that part of the stack later,
       | instead of your FixedBufferAllocator.
       | 
       | Many of the dangling stack pointers I've caught in code review
       | don't really look like the above. Instead, they're dangling
       | pointers that are intended to be internal pointers, so they would
       | be avoided if we had non-movable/non-copyable types. I'm not sure
       | such types are worth the trouble otherwise though. Personally,
       | I've just stopped making structs that use internal pointers. In a
       | typical case, instead of having an internal array and a slice
       | into the array, a struct can have an internal heap-allocated
        | slice and another slice into that slice. Like I said, I'd
        | like these thorns to be less thorny somehow.
        
         | alphazino wrote:
         | > so they would be avoided if we had non-movable/non-copyable
         | types.
         | 
         | There is a proposal for this that was accepted a while ago[0].
         | However, the devs have been focused on the self-hosted compiler
         | recently, so they're behind on actually implementing accepted
         | proposals.
         | 
         | [0] https://github.com/ziglang/zig/issues/7769
        
         | 10000truths wrote:
         | Alternatively, use offset values instead of internal pointers.
         | Now your structs are trivially relocatable, and you can use
         | smaller integer types instead of pointers, which allows you to
         | more easily catch overflow errors.
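A short Rust sketch of the offset-instead-of-internal-pointer idea above (the `Message` type and its fields are hypothetical): the struct holds no pointers into itself, only a `u32` offset and length, so moving or copying it can never create a dangling internal reference.

```rust
// Hypothetical relocatable struct: an offset + length pair replaces
// an internal pointer/slice into the owned buffer.
struct Message {
    buf: Vec<u8>,
    // u32 offsets instead of pointers: smaller, relocatable, and
    // overflow is easier to catch than with a full-width pointer.
    body_off: u32,
    body_len: u32,
}

impl Message {
    fn body(&self) -> &[u8] {
        let start = self.body_off as usize;
        let end = start + self.body_len as usize;
        &self.buf[start..end] // bounds-checked slice on every access
    }
}

fn main() {
    let msg = Message {
        buf: b"HDR:hello".to_vec(),
        body_off: 4,
        body_len: 5,
    };
    // Moving the struct cannot invalidate anything, because the
    // "internal reference" is recomputed from the offset each time.
    let moved = msg;
    assert_eq!(moved.body(), b"hello");
}
```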
        
           | anonymoushn wrote:
           | This is a good idea, but native support for slices tempts one
           | to stray from the path.
        
         | throwawaymaths wrote:
         | This. I believe it is in the works, but postponed to finish up
         | self-hosted.
         | 
         | https://github.com/ziglang/zig/issues/2301
        
       | belter wrote:
       | 1 year ago, 274 comments.
       | 
       | "How Safe Is Zig?": https://news.ycombinator.com/item?id=26537693
        
       ___________________________________________________________________
       (page generated 2022-06-23 23:00 UTC)