[HN Gopher] Zig's multi-sequence for loops
___________________________________________________________________
Zig's multi-sequence for loops
Author : kristoff_it
Score : 213 points
Date : 2023-02-27 13:55 UTC (9 hours ago)
(HTM) web link (kristoff.it)
(TXT) w3m dump (kristoff.it)
| AnIdiotOnTheNet wrote:
| Ranges for for loops were a long time coming. Some C programmers
| just completely lost their shit at having to use a while loop for
| some reason. It was such a common complaint that we invented
| stuff like the following as a joke, which of course became
| something people actually used because internet gonna internet:
|     for(@as([10]void, undefined)) |_, idx| { _ = idx; }
|
| Which, for those unfamiliar with Zig, has `for` iterate over an
| 'array' of 0-bit values and capture the index while throwing out
| the value (which would always be 0 because that's the only value
| a 0-bit type can represent).
|
| The implementation of this proposal brings with it the additional
| benefit of making index capturing less mysterious.
| gonzus wrote:
| I suspect the real reason behind "some C programmers losing
| their shit" was being forced to introduce a new variable into
| the outer scope, or to use extra braces around your loop --
| both are distasteful hacks, according to some.
| Decabytes wrote:
| I've been curious about Zig. I find its cross-compilation story
| using zig cc interesting. I like its focus on simplicity instead
| of debugging your knowledge of the language. On the surface it
| looks like a better C that isn't as complicated as Rust.
|
| I'll admit, though, that the syntax is a little off-putting. But
| that is a minor complaint. I know it's not 1.0 and there is still
| lots to do, but I'm curious whether they will do more for memory
| safety. With companies trying to avoid starting new code in
| memory-unsafe languages if they don't have to, I wonder if that
| will hurt Zig adoption.
Right now it just seems like their approach is: reduce
| Undefined Behavior as much as possible, make the "correct" way of
| programming the easiest, and have array bounds checks on by
| default. Will this be enough to make the language "memory safe"
| enough?
| throwawaymaths wrote:
| If zig includes tags or annotations (there are a few proposals
| in the issue tracker) and surfaces this information at an
| exportable ZIR level, it seems likely that data provenance
| tracking (this includes memory safety, and file descriptor /
| socket closing, thread spawn/despawn, etc.) could be
| checked by a static analysis system. If zig supports compiler
| hooks, then it could conceivably halt compilation unless these
| invariants are satisfied.
|
| I'm not convinced that this _needs_ to be in the type system.
|
| Nonetheless it's not there yet.
| matu3ba wrote:
| > If zig supports compiler hooks, then it could conceivably
| halt compilation unless these invariants are satisfied.
|
| Can you sketch out why hooks are necessary on the proposals,
| like 14656? So far I have not read a justification of what
| advantage coupling it to the compiler provides, in terms of
| performance gains on concrete examples. Afaik, Rust has
| lifetime checks decoupled to parallelize compilation, and I
| have not seen research or a list of the possible
| optimizations with synthetic benchmark results.
|
| > I'm not convinced that this needs to be in the type system.
|
| If you want to prevent static analysis semantics becoming an
| incompatibility mess like C due to 1. unknown and 2.
| untrackable tag semantics, then you have to provide at
| least an AST to ensure annotations have known + unique
| semantics, as I explain in
| https://github.com/ziglang/zig/issues/14656#issuecomment-143....
I would say that this is a primitive quasi-type check decoupled
| from main compilation and could be kept very flexible for
| research experiments by having a URI + multihash (so basically
| what Zig uses as a package).
|
| More concretely: stuff to look out for is RefinedC and
| related projects that give C more strict semantics + how to
| compose those things outside the regular Zig type system.
| throwawaymaths wrote:
| The only advantage to coupling to the compiler is that it
| gives a tighter feedback loop for the developer, versus,
| say, statically checking as part of CI.
| rwbt wrote:
| I would suggest you check out Odin[0]. It's very similar to
| Zig but has much better ergonomics, and it's probably the
| closest to a 'Better C' replacement in my experience. It does
| array bounds checking by default (which you can turn off if
| you choose to).
|
| [0] - https://odin-lang.org
| brundolf wrote:
| I would guess there will always be a niche for languages that
| make C's overall set of trade-offs. Rust will shrink that
| niche, but it'll still be there. I see Zig as targeting that
| niche specifically: "we can strictly improve on C in a whole
| bunch of ways, without changing the fundamental bargain".
| throwawaymaths wrote:
| There is a market for statically checked C, as evidenced by
| the existence of something like seL4 (though to be fair it
| technically checks assembly code).
|
| It seems like statically checked Zig has a chance to be
| strictly easier to implement than statically checked
| C, and that, I think, is something to shoot for, especially
| since it could require very little on the part of the Zig
| developers proper.
| Y_Y wrote:
| Is this different from good old `zip`?
| Yoric wrote:
| Apparently, it's zip, except with UB if the sizes don't match.
| helloworld23443 wrote:
| Hilarious. I was evaluating Zig, so I took a look at Bun,
| probably the best-known Zig project. Multiple issues
| related to segfaults.
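For readers who haven't seen the feature the zip comparison above refers to, here is a minimal sketch of the multi-sequence `for` syntax from the article (function and values invented for illustration; in Debug/ReleaseSafe a length mismatch panics, in ReleaseFast it is undefined behavior):

```zig
const std = @import("std");

// Sum two slices pairwise using the multi-sequence for loop.
// All sequences must have the same length.
fn pairwiseSum(a: []const u32, b: []const u32, out: []u32) void {
    for (a, b, out) |x, y, *o| {
        o.* = x + y;
    }
}

test "multi-sequence for with an index range" {
    var out: [3]u32 = undefined;
    pairwiseSum(&.{ 1, 2, 3 }, &.{ 10, 20, 30 }, &out);
    try std.testing.expectEqualSlices(u32, &.{ 11, 22, 33 }, &out);

    // `0..` pairs each element with its index, replacing the old
    // two-capture `for (xs) |x, i|` form.
    for (out, 0..) |v, i| {
        try std.testing.expect(v == (i + 1) * 11);
    }
}
```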
| saagarjha wrote:
| Those are safe depending on what kind they are.
| kristoff_it wrote:
| Take a look at TigerBeetle, which is also written in Zig; if
| you find segfaults there, they even give you money :^)
|
| https://github.com/tigerbeetledb/tigerbeetle
| rosetremiere wrote:
| It's hard to understand what tigerbeetle is about. Can
| anyone ELI5 it for me? As far as I can tell, it's some
| kind of a library/system geared at distributed
| transactions? But is it a blockchain, a db, a program?
| (I did look at the website)
| eatonphil wrote:
| Hey thanks for the feedback! We've got concrete code
| samples in the README as well [0] that might be clearer?
|
| It's a distributed database for tracking accounts and
| transfers of amounts of "thing"s between accounts
| (currency is one example of a "thing"). You might also be
| interested in our FAQ on why someone would want this [1].
|
| [0] https://github.com/tigerbeetledb/tigerbeetle#quickstart
|
| [1] https://docs.tigerbeetle.com/FAQ/#why-would-i-want-a-
| dedicat...
| rosetremiere wrote:
| The FAQ helped, thanks! So, an example of typical use
| would be, say, as the internal ledger for a company like
| (transfer)wise, with lots of money moving around between
| accounts? But I understand it's meant to be used
| internally to an entity, with all nodes in your system
| trusted, and not as a means of dealing with transactions from
| one party to another, right?
| jorangreef wrote:
| Great to hear! Joran from TigerBeetle here.
|
| Yes, exactly. You can think of TigerBeetle as your
| internal ledger database, where perhaps in the past you
| might have had to DIY your own ledger with 10 KLOC around
| SQL.
|
| And to add to what Phil said, you can also use
| TigerBeetle to track transactions with other parties,
| since we validate all user data in the transaction--there
| are only a handful of fields when it comes to double-entry
| and two-phase transfers between entities running
| different tech stacks.
| | The TigerBeetle account/transfer format is meant to be | simple to parse, and if you can find user data that would | break our state machine, then it's a bug. | | Happy to answer more questions! | eatonphil wrote: | Yes that's a good example! | | And you can model external accounts that have their own | confirmation process using our two-phase transfer | support. | | https://docs.tigerbeetle.com/FAQ#what-is-two-phase-commit | ngrilly wrote: | It's a distributed database for financial transactions, | using double entry accounting, written in Zig, and with a | very innovative design: | | - LMAX inspired | | - Static memory allocation | | - Zero copy with Direct I/O | | - Zero syscalls with io_uring | | - Zero deserialization | | - Storage fault tolerance | | - Viewstamped Replication consensus protocol | | - Flexible Quorums | | - Deterministic simulation like FoundationDB | Yoric wrote: | Zero deserialization? That sounds rather scary. This | means absolute trust in data read from disk or received | from other nodes? | rom-antics wrote: | What is the threat model you're worried about? If an | attacker can write data to your disk or authenticate to | your cluster, aren't you already screwed? | Yoric wrote: | Yes, these are exactly my threats. | | First, because I'm a strong believer in defense-in-depth. | Secondly because both disk corruption and network packet | corruption happen. Alarmingly often, in fact, if you're | operating at large scale. | jorangreef wrote: | Ours too! | | For example, our deterministic simulation testing does | storage fault corruption up to the theoretical limit of f | according to our consensus protocol. | | Details in our other reply to you. | jorangreef wrote: | Great question! Joran from TigerBeetle here. | "This means absolute trust in data read from disk or | received from other nodes?" | | TigerBeetle places zero trust in data read from the disk | or network. In fact, we're a little more paranoid here | than most. 
|
| For example, where most databases will have a network
| fault model, TigerBeetle also has a storage fault model
| (https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...).
|
| This means that we fully expect the disk to be what we
| call "near-Byzantine", i.e. to cause bitrot, or to
| misdirect or silently ignore read/write I/O, or to simply
| have faulty hardware or firmware.
|
| Where Jepsen will break most databases with network fault
| injection, we test TigerBeetle with high levels of
| storage faults on the read/write path, probably beyond
| what most systems, or write-ahead log designs, or even
| consensus protocols such as RAFT (cf. "Protocol-Aware
| Recovery for Consensus-Based Storage" and its analysis of
| LogCabin), can handle.
|
| For example, most implementations of RAFT and Paxos can
| fail badly if your disk loses a prepare, because then the
| stable-storage guarantees that the proofs for these
| protocols assume are undermined. Instead, TigerBeetle
| runs Viewstamped Replication, along with UW-Madison's
| CTRL protocol (Corruption-Tolerant Replication), and we
| test our consensus protocol's correctness in the face of
| unreliable stable storage, using deterministic simulation
| testing (a la FoundationDB).
|
| Finally, in terms of network fault model, we do end-to-end
| cryptographic checksumming, because we don't trust
| TCP checksums with their limited guarantees.
|
| So this is all at the physical storage and network
| layers.
|
| "Zero deserialization? That sounds rather scary."
|
| At the wire protocol layer, we:
| * assume a non-Byzantine fault model (that consensus nodes
|   are not malicious),
| * run with runtime bounds-checking (and checked arithmetic!)
|   enabled as a fail-safe, plus
| * protocol-level checks to ignore invalid data, and
| * we only work with fixed-size structs.
|
| At the application layer, we:
| * have a simple data model (account and transfer structs), and
| * validate all fields for semantic errors so that we don't
|   process bad data; for example, here's how we validate
|   transfers between accounts:
|   https://github.com/tigerbeetledb/tigerbeetle/blob/d2bd4a6fc240aefe046251382102b9b4f5384b05/src/state_machine.zig#L867-L952
|
| No matter the deserialization format you use, you always
| need to validate user data.
|
| In our experience, zero-deserialization using fixed-size
| structs, the way we do in TigerBeetle, is simpler than
| variable-length formats, which can be more complicated
| (imagine a JSON codec), if not more scary.
| Yoric wrote:
| > Where Jepsen will break most databases with network
| fault injection, we test TigerBeetle with high levels of
| storage faults on the read/write path, probably beyond
| what most systems, or write-ahead log designs, or even
| consensus protocols such as RAFT (cf. "Protocol-Aware
| Recovery for Consensus-Based Storage" and its analysis of
| LogCabin), can handle.
|
| Oh, nice one. Whenever I speak with people who work on
| "high reliability" code, they seldom even use fuzz-testing
| or chaos-testing, which is... well, unsatisfying.
|
| Also, what do you mean by "storage fault"? Is this
| simulating/injecting silent data corruption, or
| simulating/injecting an error code when writing the data
| to disk?
|
| > validate all fields for semantic errors so that we
| don't process bad data
|
| Ahah, so no deserialization doesn't mean no validation.
| Gotcha!
|
| > In our experience, zero-deserialization using fixed-size
| structs, the way we do in TigerBeetle, is simpler than
| variable-length formats, which can be more complicated
| (imagine a JSON codec), if not more scary.
|
| That makes sense, thanks. And yeah, JSON has lots of
| warts.
|
| Not sure what you mean by variable length.
Are you speaking of JSON-style "I have no idea how much data I'll
| need to read before I can start parsing it" or entropy
| coding-style "look ma, I'm somehow encoding 17 bits in
| 3.68 bits"?
| jorangreef wrote:
| > Also, what do you mean by "storage fault"? Is this
| simulating/injecting silent data corruption or
| simulating/injecting an error code when writing the data
| to disk?
|
| Exactly! We focus more on bitrot/misdirection in our
| simulation testing. We use Antithesis' simulation testing
| for the latter. We've also tried to design I/O syscall
| errors away where possible. For example, using O_DSYNC
| instead of fsync(), so that we can tie errors to I/Os.
|
| > Ahah, so no deserialization doesn't mean no validation.
| Gotcha!
|
| Well said--they're orthogonal.
|
| > Not sure what you mean by variable length. Are you
| speaking of JSON-style "I have no idea how much data I'll
| need to read before I can start parsing it"
|
| Yes, and also where this is internal to the data
| structure being read, e.g. both variable-length message
| bodies and variable-length fields.
|
| There's perhaps also an interesting example of how
| variable-length message bodies can actually go wrong,
| which we give in the design decisions for our wire
| protocol, of why we have two checksums, one over the
| header and another over the body (instead of one
| checksum over both!):
| https://github.com/tigerbeetledb/tigerbeetle/blob/main/docs/...
| Yoric wrote:
| Alright, I'm officially convinced that you've thought
| this out!
|
| So, how's the experience of implementing this in Zig?
| jorangreef wrote:
| Thanks! I hope so! [:raised_hands]
|
| And we're always learning.
|
| But Zig is the charm. TigerBeetle wouldn't be what it is
| without it. Comptime has been a game-changer for us, and
| the shared philosophy around explicitness and memory
| efficiency has made everything easier. It's like working
| with the grain--the std lib is pleasant.
I've also learned so much from the community.
|
| My own personal experience has been that I think Andrew
| has made a truly stunning number of successively
| brilliant design decisions. I can't fault any. It's all
| the little things together--seeing this level of
| conceptual integrity in a language is such a joy.
| ngrilly wrote:
| I can't stop being amazed by TigerBeetle's design and
| engineering.
| jorangreef wrote:
| Thank you Nicolas!
| [deleted]
| laserbeam wrote:
| It's a special-purpose DB. No relation to blockchains.
| speed_spread wrote:
| Well, Zig would not be such a good C replacement if it did
| not also allow segfaulting all over the place.
| jeroenhd wrote:
| Looking at the bugs themselves, I don't think any low-level
| language would've caught those. C/C++ would've crashed as
| well (hopefully, at least; many of these problems would be
| UB and the compiler might just ignore the problem or patch
| out the offending code) and Rust would've panicked. There
| are a few cases where Rust wouldn't have allowed the code
| to panic, but the surrounding code would be pretty
| unreadable in safe Rust (without stacking types like
| Box+RefCell+Rc and clone()ing a bunch), so it's hard to
| compare the two.
|
| The advantage of Rust would be a nice and readable stack
| trace to the crashing method, but a core dump would've
| included even more information for the person debugging the
| binary, so I think it ends up quite even.
| jjnoakes wrote:
| A panic (deterministic, guaranteed, immediate, and
| worst-case a DoS) is an order of magnitude better than memory
| corruption (non-deterministic, not guaranteed,
| eventual-if-at-all, and worst-case RCE).
| cormacrelf wrote:
| I don't know what's going on in this thread where
| encountering UB has somehow been morphed into some kind
| of guaranteed immediate core dump that's basically better
| than panicking anyway. Yes, people are talking about
| segfaults. But it's memory corruption.
Maybe you get a crash at some point, maybe you do not.
|
| A reminder for all who have forgotten: UB is the one
| that can email your local council and submit a request to
| bulldoze the house you're in. It is not a free core dump.
| sfink wrote:
| You appear to have particularly vengeful nasal demons.
| AndyKelley wrote:
| A segmentation fault is well-defined behavior. If you look
| at Jarred's comment nearby he reveals that the pointers
| in question are special pointers, e.g. 0x0, 0x1, 0x2,
| etc.
|
| It is 100% well-defined behavior to dereference these
| pointers. It always segfaults, which as Jarred mentioned
| is a lot like a panic.
|
| Rust evangelists need to be careful because in their zeal
| they have started to cause subtle errors in the general
| knowledge of how computers work in young people's minds.
| Ironically, it's a form of memory corruption.
| adwn wrote:
| > _If you look at Jarred's comment nearby he reveals
| that the pointers in question are special pointers, e.g.
| 0x0, 0x1, 0x2, etc._
|
| Is that guaranteed by the language semantics, or could it
| possibly change at some point in the future? If it's the
| latter, then yes, it is very much Undefined Behavior, and
| not guaranteed to segfault before opening the door for
| potential exploits.
| rom-antics wrote:
| I can buy that dereferencing null is a special case, but
| why is 0x2 special? Is 0x20 also special? What about
| 0x20000? Are the invalid non-null pointer values listed
| in a reference somewhere? If 0x2 is an invalid pointer,
| what do I do if my microcontroller has a hardware
| register at 0x2?
| jeroenhd wrote:
| On many platforms, the zero page is set up so access to
| it will always segfault. This isn't a language guarantee,
| but it's a guarantee in most modern operating systems
| (Linux, FreeBSD, Windows). This is set up for pointers
| all the way up to the end of the first page.
|
| On Windows and Linux this is the first 4KiB, so the range
| 0x0000 up to 0x1000, unless large pages are on (then it's
| even more).
|
| On macOS on x64 this is the entire 4GiB memory space,
| probably a method to help developers port their 32-bit
| software to x64. I don't know what the zero page size on
| ARM is.
|
| If your microcontroller doesn't have this guarantee, you
| can't make use of this feature.
| rom-antics wrote:
| That's a guarantee on the level of the hardware/OS, but
| hardware semantics are not the same as language/compiler
| semantics. Even if according to the _source code_ you're
| dereferencing a pointer value 0x0 or 0x2, that doesn't
| mean the compiler-emitted machine code will end up
| telling the hardware to do the same.
|
| Remember this gem?
|
| https://kristerw.blogspot.com/2017/09/why-undefined-behavior...
|
| Once you trigger UB, all bets are off and your code could
| do anything. A segfault just means you spun the roulette
| wheel, bet it all on red, and got lucky your house wasn't
| bulldozed.
|
| Zig also uses LLVM under the hood, right? So it's subject
| to these same semantics. An LLVM pointer value cannot
| legally contain arbitrary non-null non-pointer integers
| such as 0x2. That's a dead giveaway of UB. And I doubt
| the emitted Zig code safety-checks every pointer
| dereference for a value less than 0x1000 before
| performing the dereference.
| kristoff_it wrote:
| > An LLVM pointer value cannot legally contain arbitrary
| non-null non-pointer integers such as 0x2.
|
| 0x2 is a perfectly valid pointer value, it just happens
| to never be a good _virtual memory_ address on modern
| systems where virtual memory is set up by the usual OSs,
| hence the fact that you can rely on it segfaulting.
| jeroenhd wrote:
| The semantics are actually operating system and even
| compiler flag dependent. On macOS you can choose the size
| of your zero page during build. The numbers I've listed
| are just the defaults.
|
| Zig UB is not C UB. There is an entire language built on
| top of it. Just because something behaves a certain way
| in C doesn't mean the same thing is true in Zig. Zig is
| no longer a code generator for C; it switched to a
| self-hosted compiler a while back. In fact, the language
| is rapidly progressing to the point where LLVM is a mere
| optional dependency.
|
| I don't know the semantics around LLVM pointers. I don't
| see why 0x2 would be invalid; there are plenty of
| platforms programmed in C(++) that have a flat memory
| model. It would be quite painful to have a
| microcontroller where you can't send data to the output
| pin because LLVM decided that 2 is invalid (but 0 isn't).
| I've never seen LLVM complain about invalid
| dereferencing, though; it always ends up doing what the
| compiler tells it to do as far as I can tell.
|
| Zig pointers can definitely cause UB, but most Zig code
| shouldn't need them. Slices are actually bounds-checked
| and should probably be preferred in most cases of pointer
| arithmetic. Simple pointers can't be incremented or
| decremented, so you need to manually go through @intToPtr
| if you want to do real pointer arithmetic, which is quite
| unusable.
|
| I haven't used Zig much, so I don't know how many Zig
| semantics are copies of C semantics and how many are
| translated by the Zig frontend. However, "this is a
| bad/undefined thing in C so it must be a bad/undefined
| thing in Zig" is simply not true.
| rom-antics wrote:
| I know Zig is not C; that's why I specifically mentioned
| LLVM. It's fine if Zig has different opinions about UB
| than LLVM does, but in that case ReleaseSafe builds
| should not use LLVM, not even optionally. If Zig says
| some operation is defined, but LLVM says it's undefined,
| well, LLVM is the one optimizing code, so it's LLVM's
| invariants that matter.
Right now it looks like Zig is playing fast and loose with
| correctness, shoving everything through LLVM but not
| respecting LLVM's invariants. And hey, if something is
| observed to segfault under some conditions today on the
| current version of LLVM, we'll just say segfaults are
| guaranteed. It's disappointing to see.
| AndyKelley wrote:
| A lot of people have the same misunderstanding as you.
|
| LLVM has rules about what is legal and what is not legal.
| If you follow the rules, you get well-defined behavior.
| It's the same thing in C. You could compile a safe
| language to C, and as long as you follow the rules of
| avoiding UB in C, everything is groovy.
|
| Likewise, this is how Zig and other languages such as
| Rust use LLVM. They play by the rules, and get rewarded
| with well-defined behavior.
| rom-antics wrote:
| Isn't one of the LLVM rules that pointers must be valid and
| have valid provenance in order to be dereferenced? If
| 0x2 ends up in a pointer that is dereferenced (or 0x0 in
| a nonnull pointer), has that rule not been broken? And if
| the rule is broken, does that not trigger undefined
| behavior?
| avgcorrection wrote:
| > On many platforms, the zero page is set up so access to
| it will always segfault. This isn't a language guarantee,
| but it's a guarantee in most modern operating systems
| (Linux, FreeBSD, Windows). This is set up for pointers
| all the way up to the end of the first page.
|
| Then I guess it could be a language guarantee if Zig only
| supports/targets those platforms. However, considering
| how low-level Zig is, I doubt that _that_ is the case.
| jeroenhd wrote:
| First of all: Zig is not C. The rules for undefined
| behaviour can be found here:
| https://ziglang.org/documentation/master/#Undefined-Behavior
|
| TL;DR: Zig injects checks and aborts the program at
| runtime unless you specify that you wish to ignore the
| problem.
This can be done explicitly within the code or by compiling
| under a build mode that ignores checks (unless specified
| manually).
|
| Programs compiled as Debug and ReleaseSafe will terminate
| at runtime if UB is triggered. Compiling for ReleaseSmall
| and ReleaseFast will cause traditional C-style UB. If you
| care about your program doing what it's supposed to do,
| you use ReleaseSafe. Doing Release[Fast|Small] will do
| something similar to -O3 in other compilers, which will
| often change behaviour.
|
| Note, however, that you can compile your code under "just
| allow UB and see what happens" mode but still benefit
| from checked UB by setting @setRuntimeSafety(true); this
| will introduce the assertions despite the unsafe build
| modes you may specify.
|
| It's like introducing a C++ compiler flag* telling the
| compiler "ignore exceptions and just continue". You know
| you're in for a bad time the moment you specify it, but
| it makes your program blazingly fast because it greatly
| reduces the amount of code to generate/checks to execute.
|
| The main advantage of checked UB is that well-tested code
| can make use of the unchecked nature of these features
| for speed without having lengthy check code blocks that
| need to be wrapped in debug #ifdefs or similar. Assuming
| you run test builds with checks enabled (and why
| wouldn't you), you'd catch these problems in your build
| pipeline.
|
| This is different from the normal way of working with C
| and friends, where UB remains in debug/-O1 builds but
| just acts a little differently. Some compilers will
| insert breakpoints, others will ignore the problem like
| in release mode; nobody knows what will happen, and your
| compiler can't detect this problem for you.
|
| * note that -fno-exceptions exists, but that aborts the
| program rather than letting it continue.
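The per-block @setRuntimeSafety escape hatch described above can be sketched like this (functions and values invented for illustration; which checks actually fire depends on the build mode chosen):

```zig
const std = @import("std");

// Even when the surrounding build mode is ReleaseFast, the block
// containing @setRuntimeSafety(true) keeps its safety checks, so
// an out-of-bounds index here panics instead of reading garbage.
fn checkedGet(items: []const u32, i: usize) u32 {
    @setRuntimeSafety(true);
    return items[i];
}

// Conversely, a hot block can opt out of checks in a safe build.
// The wrapping operator `+%` is defined in every mode anyway.
fn uncheckedAdd(a: u32, b: u32) u32 {
    @setRuntimeSafety(false);
    return a +% b;
}

test "safety toggles per block" {
    const xs = [_]u32{ 5, 6, 7 };
    try std.testing.expectEqual(@as(u32, 6), checkedGet(&xs, 1));
    try std.testing.expectEqual(@as(u32, 0), uncheckedAdd(std.math.maxInt(u32), 1));
}
```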
| Jarred wrote:
| Most of what manifests as a segfault in Bun has been due
| to assuming a JSValue is a heap-allocated value when it is
| (the JavaScript representation of) "null", "undefined",
| "true", "false", etc. These are invalid pointers: the
| operating system signals that the memory access was
| invalid, Bun runs the signal handler, prints some
| metadata, and exits. This is a lot like a panic.
| naikrovek wrote:
| safety-checked UB; an important distinction. I assume.
| Yoric wrote:
| Could you elaborate? What's safety-checked UB?
| codethief wrote:
| https://ziglang.org/documentation/master/#Undefined-Behavior
| Yoric wrote:
| Thanks.
|
| So, if I read this correctly, barring the simple cases
| that Zig can detect at compile time, this means that
| whether it's UB (in the C++ definition of the term)
| depends on the flags specified by the author of the
| library and the person who compiles the final binary.
|
| That's definitely much better than C++ UB. Still a bit
| scary, though.
| planede wrote:
| Not very. It really just means that "there is a sanitized
| mode for building", which already exists in many C and C++
| compilers (for language UB) and standard libraries (for
| library UB).
| naikrovek wrote:
| I think of this case as an assert() that both lists are
| the same length. If they're not, I want to know about
| that via a crash.
| oconnor663 wrote:
| I think Zig's ReleaseSafe mode is intended to be suitable
| for production, which IIUC isn't really the case with
| ASan, UBSan, and friends. Those have some performance
| problems and also some attack-surface problems.
| planede wrote:
| OK, but is "safe" in ReleaseSafe any kind of guarantee,
| or is it just safer than ReleaseFast?
|
| I can enable lightweight assertions in libstdc++ and
| libc++ and it makes C++ safer, but not in any way "safe".
| There are some flags that can be enabled to trap on some
| language UB too, without bringing in the heavyweight
| sanitizers.
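To make "safety-checked UB" concrete, a small sketch of one such operation, integer overflow (behavior as documented: a panic in Debug/ReleaseSafe, genuine UB in ReleaseFast/ReleaseSmall):

```zig
const std = @import("std");

test "checked vs. explicitly wrapping arithmetic" {
    var a: u8 = 255;
    // `a += 1` here would be safety-checked illegal behavior:
    // Debug and ReleaseSafe builds panic with "integer overflow",
    // while ReleaseFast/ReleaseSmall treat it as UB.
    //
    // The wrapping operator opts out of the check and is defined
    // in every build mode:
    a +%= 1;
    try std.testing.expectEqual(@as(u8, 0), a);
}
```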
| oconnor663 wrote:
| Last time I checked (more than a year ago) there were
| major open questions about what could be guaranteed. My
| impression was that you could expect e.g. all array
| accesses to be bounds-checked, but that use-after-free
| and dangling pointers were still issues, especially if
| you use the C allocator.
| [deleted]
| randyrand wrote:
| Is it UB?
|
| I thought it was defined as an OOB access in non-safe modes,
| or a panic in safe modes.
| Laremere wrote:
| As much as a concept can be translated from one programming
| language to the next, they're conceptually pretty much
| identical. However, for Zig there are two important differences:
|
| 1. It's simpler syntax than reaching for a zip function. I
| personally like this design because the conceptual load is
| pretty low, as it feels like a natural extension of simple for
| loops. E.g. you could teach someone simple for loops, then later
| go "hey, you could do this the whole time!"
|
| 2. Zig doesn't have support for custom iterators. Zip is doable
| using the existing metaprogramming features, but it's not as
| simple. Support for iterators also likely violates Zig's `No
| hidden control flow.` maxim. Plus I imagine it's a lot easier
| for the compiler to perform optimizations this way.
|
| Both points combined are related to Zig's design goal of being
| good at writing code that can run fast on modern CPU
| architectures. Being able to easily loop over multiple arrays
| is a good step toward making that practical.
| kps wrote:
| > No hidden control flow.
|
| Off-topic, but... I very much like this idea, but I think Zig
| shares one wart with languages like C++: it's impossible to
| tell syntactically whether `f()` is a direct or indirect
| call. An indirect branch is a _conditional_ branch, where the
| condition can be arbitrarily far away in space and time, and
| it's invisible. Dynamic control flow can be tricky to reason
| about even when you know it's there.
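A minimal sketch of the explicit-iterator convention the custom-iterators point above refers to (`CountDown` is an invented type; standard-library iterators follow the same optional-returning `next()` shape):

```zig
const std = @import("std");

// No language-level iterator protocol: just a struct with a
// next() method returning an optional, called explicitly.
const CountDown = struct {
    n: usize,

    fn next(self: *CountDown) ?usize {
        if (self.n == 0) return null;
        self.n -= 1;
        return self.n;
    }
};

test "while-with-optional drives the iterator, no hidden calls" {
    var it = CountDown{ .n = 3 };
    var sum: usize = 0;
    // The loop binds the payload while next() returns non-null.
    while (it.next()) |v| {
        sum += v;
    }
    try std.testing.expectEqual(@as(usize, 3), sum); // 2 + 1 + 0
}
```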
| rom-antics wrote:
| iirc there was a proposal to have Zig use `funcptr.()`
| syntax (with a dot) for indirect calls, but it was
| rejected.
| Jayschwa wrote:
| `while` with optionals enables some iterator-like behavior.
| https://ziglang.org/documentation/master/#while-with-Optiona...
| malcolmstill wrote:
| To expand on the parent and grandparent:
|
| Laremere is correct in that there is no "magic" built-in
| understanding of iterators in the language, i.e. calling a
| `.next()` method under the hood without you explicitly having
| to call it. That _would_ violate the no-hidden-control-flow
| maxim.
|
| However, as Jayschwa points out, Zig's `while` loop will
| bind the result of its expression (in its own block scope)
| if it is non-null and otherwise exit the loop. This gives
| you essentially the same as a for loop that has some
| language-level knowledge of the iterator pattern, except
| there is no hidden control flow (I have to explicitly call
| `next`).
|
| And indeed the Zig standard library is replete with
| iterators (and in most of the Zig code I write I will
| write iterators for my own collections). For example,
| `mem.split` returns an iterator:
|
|     var it = mem.split(...);
|     // it.next() returns null after we run out of
|     // split text and the while loop exits
|     while (it.next()) |substr| {
|         // In here we have a non-null substr
|     }
|
| > Plus I imagine it's a lot easier for the compiler to
| perform optimizations this way.
|
| That's an interesting point: does Zig miss out on some
| optimisation possibilities with iterators, given they are
| not a language-level construct? I don't know.
| jeroenhd wrote:
| It's zip with an arbitrary amount of array parameters. Quite
| useful for the purposes described in the article.
|
| I do wonder about performance, though, as multiple array
| dereferences may not be served as well by the L1 cache as a
| well-packed struct might be.
| | An L1 cache line is often 64 bytes long, enough to fit one of | the "monster" example structs but never two. Performance in | real life scenarios may actually increase if these structs are | padded with an additional 16 bytes so none of the structs are | on a cache line boundary. | masklinn wrote: | > It's zip with an arbitrary amount of array parameters. | | So... zip? | | Python's does that, and for most other languages you could use | overloading, basic macros, or traits trickery to get there if | you really wanted to support unreasonable widths (IME you | almost never need more than 3, and combining two zip/2 works | fine then). | klyrs wrote: | Zig's "zip" is purely syntactical, where Python's zip is a | generator. This is significant both in terms of performance | (Zig wins) and flexibility (Python wins). | | Unlike Python, you can't pass a zip generator around. It's | just a for-loop. While zig loops are expressions, they only | return a single value. | masklinn wrote: | > This is significant both in terms of performance (Zig | wins) | | Does it, actually? Does Zig's built-in pseudo-zip | outperform Rust's? Or C++23's? | | > It's just a for-loop. | | Except it's not "just" a for loop, it's a weird special | case for a for loop. And one which is actively dangerous | too. | klyrs wrote: | Please check the context, I was comparing to Python. And | please chill out. It's only "dangerous" when you decide | to run it that way. | masklinn wrote: | > Please check the context, I was comparing to Python. | | Ah yes, that's very honest, no shit zig is faster than | Python, what's the next one, thrustscc may be faster than | polio-johnny down the road? | klyrs wrote: | Hi, no need to be so abrasive about it. | | > Please respond to the strongest plausible | interpretation of what someone says, not a weaker one | that's easier to criticize. Assume good faith. | | Python's zip function returns a generator.
A generator in | zig would look like a function pointer with a closure. If | zig were to implement the Python-style "zip" function, | constructing the closure, and iterating over the | generator, would be significantly slower than the naked | "just a for loop" that we see in TFA. And that's not even | considering the tuple construction (oops, now we need an | allocator) & unpacking. | | Ergo, the zig-style "syntactical zip" is higher | performance than the Python-style "functional zip". Even | when you cut through the baseline performance differences | between the languages. | masklinn wrote: | > Hi, no need to be so abrasive about it. | | That's just projection. | | > Please respond to the strongest plausible | interpretation of what someone says, not a weaker one | that's easier to criticize. Assume good faith. | | I'll get right on that as soon as you extend the same | courtesy, which you have refused to do at every | opportunity. | judofyr wrote: | > Except it's not "just" a for loop, it's a weird special | case for a for loop. | | You could also argue that a for loop which can only | iterate over a _single_ sequence is a special case of a | multi-sequence for loop. | | > And one which is actively dangerous too. | | In safe builds (ReleaseSafe, Debug) it will cause a | controlled panic if the sequences are not of the same | size. Most likely it's a logical bug if you iterate over | two sequences of different sizes. In ReleaseFast the | compiler will make assumptions to improve performance. If | it's very important for your code you can force a certain | code block to always have runtime safety. Yes, there are | trade-offs, but I don't feel it's _unreasonable_. | jeroenhd wrote: | There is no single "zip". Java's .zip() will work on two | sources, as will C#'s Zip(). Haskell's zip is no different, | only accepting two parameters. I don't know any language | other than Python that shares Python's iterator zip() | implementation. 
| | In implementation, Python's zip will return a generator | that is iterated over using the iterator functionality, | while Zig's .zip is compiled as a loop. Python's iteration | may be turned into a loop, it may be interpreted, or it may | be turned into some other kind of bytecode, who knows. The | standard cpython implementation is much more complex, | though: https://github.com/python/cpython/blob/main/Python/ | bltinmodu... | | Concatenating zip()s is an unnecessarily complex solution, | both in terms of syntax and in code generated. In Python | this may not matter because it's a relatively slow | programming language in general (the language often being | "glue between fast C methods"), but in Zig this can easily | become untenable. | | I also disagree that you don't need more than 3. As the | article states, if you leverage struct-of-arrays rather | than array-of-structs you can use this to "deconstruct" | objects without paying the memory usage penalty of struct | padding. The 15% wasted RAM in this example is relatively | small compared to some real use scenarios; something as | common as a 3D vector will often have a whopping 25% space | waste. | | Other languages allow this as well (and often using such | iterations are much faster than zip()ing lists together) | but the lack of guarantees and repetitive syntax becomes a | pain. | masklinn wrote: | > There is no single "zip". | | Which means you can implement yours to fit your needs. | | > I don't know any language other than Python that shares | Python's iterator zip() implementation. | | https://docs.rs/itertools/latest/itertools/macro.izip.html | | > In implementation | | Which is hardly relevant. Python's entire implementation | has aims, means, and purpose with no relation to Zig's. | | > I also disagree that you don't need more than 3. | | Which is not what I wrote.
| | > As the article states, if you leverage struct-of-arrays | rather than array-of-structs you can use this to | "deconstruct" objects without paying the memory usage | penalty of struct padding. | | Sure? And the article uses an example with 3 values. | | > The 15% wasted RAM in this example is relatively small | compared to some real use scenarios; something as common | as a 3D vector will often have a whopping 25% space | waste. | | It also could hardly be less relevant: it's an issue in | an AoS structure because all your objects have that | overhead, therefore that's your total overhead. | | Here it's 15 or 25% padding _in a single value within a | stackframe_. You're probably wasting more stackframe | space due to the compiler not bothering reusing | temporally dead locations. | | And that's if the compiler reifies the tuple instead of | eliding the entire thing. | | > Other languages allow this as well | | OK? | | > (and often using such iterations are much faster than | zip()ing lists together) | | Until they are not. | jeroenhd wrote: | > Which means you can implement yours to fit your needs. | | Which this doesn't, as zip is an expression and multi- | sequence loops aren't. | | > https://docs.rs/itertools/latest/itertools/macro.izip.html | | External libraries aren't part of a language. | | > Which is not what I wrote. | | I admit, I read over the "almost" in "you almost never | need more than 3". | | > It also could hardly be less relevant: it's an issue in | an AoS structure because all your objects have that | overhead, therefore that's your total overhead. | | > Here it's 15 or 25% padding in a single value within a | stackframe. You're probably wasting more stackframe space | due to the compiler not bothering reusing temporally dead | locations. | | That's not true: arrays are byte-addressable so inside an | array the alignment can be shorter.
An array of 121 | 33-byte values is 3993 bytes in size, an array of 121 | usizes is 968 bytes in size, and assuming enums resolve | to 32-bit values an array of 121 enums is 484 bytes | in size. There is no overhead here. | | This has advantages and disadvantages. Unaligned access | is slower in general but in many cases an unaligned | array can be faster because of how many of its entries | can be loaded into the CPU cache. There's no definite | advantage here in terms of CPU performance, but in terms | of RAM usage there is. | | > Until they are not. | | When does a for loop ever become faster than a generator? | The values being mapped over are already evaluated, there | is no lazy loading+early stopping to take advantage of | the generator. | senkora wrote: | Nit: It should be "Air Nomads" instead of "Wind Nomads". | | (I know this doesn't matter but I figured the author would | appreciate the heads up!) | kristoff_it wrote: | thanks, fixed, I started by thinking about the last example | (the one about pokemons) and then it stuck | planede wrote: | In C++23 with zip it looks something like: for | (const auto& [x, y] : std::views::zip(a, b)) { /* ... */ | } | | A notable difference is that the ranges don't have to match in | size, the loop will run until the end of the shorter range is | reached. | | If it is required for optimization to not check for reaching the | end of one of the ranges then it can be achieved with something | like: for (const auto& [x, y] : | std::views::zip(a, std::ranges::subrange(std::ranges::begin(b), | std::unreachable_sentinel))) { /*...*/ } | | I guess it's hard enough to bump into this by accident.
| WalterBright wrote: | D has std.zip: | | https://dlang.org/phobos/std_range.html#zip | | Sorting two arrays in parallel: import | std.algorithm.sorting : sort; int[] a = [ 1, 2, 3 | ]; string[] b = [ "a", "c", "b" ]; zip(a, | b).sort!((t1, t2) => t1[0] > t2[0]); writeln(a); | // [3, 2, 1] // b is sorted according to a's sorting | writeln(b); // ["b", "c", "a"] | edflsafoiewq wrote: | In Common Lisp it's (loop for x in a | ; on each iteration, steps x to next element of a for y | in b ; same thing do ...) | | This is like the Zig in that it's a hard-coded feature of the | looping construct instead of being a general combinator like | C++/Rust, but I think it's neat that by allowing multiple | clauses, zipping falls out completely for free. | Someone wrote: | > This is like the Zig in that it's a hard-coded feature of | the looping construct | | I think lisp's _loop_ isn't hard-coded in the language, but | defined in the standard library. See for example the _loop_ | implementation at | https://github.com/sbcl/sbcl/blob/master/src/code/loop.lisp | (about 2,000 lines because _loop_ is a monster /very flexible | (pick whatever you prefer. I would pick both)) | edflsafoiewq wrote: | LOOP isn't hard-coded into the language, but the possible | clauses are hard-coded into LOOP. This is in contrast to | ITERATE, which is an extensible CL looping macro, or the | generator/iterator style popular in C++/Rust/Python/etc. | netr0ute wrote: | As someone who is using a lot of C++20, I can't wait to use | this feature when C++23 is finally ready :) | noobermin wrote: | God, C++ even today is still horribly verbose. I get that being | "explicit" is being "better" but there are limits. Even after | auto removed a lot of verbosity there still is a lot there just | to get it to do exactly what you want. | nikbackm wrote: | What is verbose about that? (The first case, not the second, | special case) | | Seems hard to make it any shorter. 
Well, I guess you could | remove "const" if you want. | jeroenhd wrote: | Here's how it looks when you write readable C++: | for (const auto &[x, y]: zip(a, subrange(begin(b), | unreachable_sentinel))) { /* Do something with x | and y */ } | | I'm not sure why C++ programmers don't like using `using`, | it's as if Java programmers insist on typing out | java.util.ArrayList every time because you may have an | ArrayList of your own in the future. | | So much C++ code can become readable by adding `using | namespace std::something`. | twic wrote: | C++ doesn't have a lot of namespacing below 'std', so if | you go around using everything you need, you end up with a | lot of short, possibly conflicting names in scope. | | Java avoids this somewhat because functions are always in a | class, so there is a little bit of extra namespacing, even | if you've imported the class. | winrid wrote: | You can still import a fully qualified static method on a | class: | | import someclass.doThing; | | Or maybe that's what you meant by somewhat :) | cmovq wrote: | The reason is header files. If you do `using namespace | std::something` in one file and it gets included in | another, the other file now has std::something in the | global namespace which it may not have expected. | | Other languages like Java have imports scoped to a single | file, so this is not a problem. | titzer wrote: | The word "antiquated" comes to mind, but it's worse than | that. An age-old self-created fractal hell of dumb | problems created by a _simplistic_ view of code reuse | based on the _simple_ mechanism of text inclusion that | constantly restrains future evolution. It's antediluvian | and just plain backwards. | twic wrote: | We are at least getting modules real soon now: | https://en.cppreference.com/w/cpp/language/modules | duped wrote: | The only compiler that seems to care about implementing | modules is MSVC. It'll be "real soon" when GCC stops | crashing.
| xigoi wrote: | We've been getting them real soon for... about 5 years | now? | tragomaskhalos wrote: | Using in a header file => complete no-no. | | Using in a cpp file => absolutely fine. | jcelerier wrote: | > Using in a cpp file => absolutely fine. | | definitely not as soon as you want to do unity/jumbo | builds (which are in my experience the n°1 thing to do | to get fast CI builds) | jeroenhd wrote: | I see, but that's only a problem in header files, isn't | it? Most code will end up in .cpp files which don't get | included (usually). | | It makes sense to use std::vector in a .h(pp) file, but | in the .cpp you should be able to `using namespace std`, | right? | chongli wrote: | _Most code will end up in .cpp files which don't get | included (usually)._ | | Not if a lot of your code is in the form of templates. | You have to put those in headers. | dcow wrote: | At this point, just use a modern language already d= | pdntspa wrote: | I don't do C++ but if it's anything like my Java IDE most of | that stuff pops up in autocomplete and you just tab through | it all | codethief wrote: | While this solves one of the issues I've had with Zig, it doesn't | seem very flexible. I would love to do the same thing for (tuples | of) variable-length arrays, arrays of different lengths etc. Now | my implementation will still look completely different in | slightly different situations. | | Yes, a flexible solution (a zip function) would probably need | iterators but why would introducing them be such a big problem? | (FWIW, I know that one can emulate looping over iterators with | while() and optionals[0] but it feels a bit dirty.) | | More generally, my biggest gripe with Zig has been a lack of | expressiveness: Things like deep equality checks between structs, | arrays and optionals; looping over different kinds of containers; | ... should just work(tm).
Sure, I understand that Zig doesn't | want hidden control flow but OTOH explicit control flow | everywhere often just gets in the way of readability and of | implementing business logic. I usually follow the approach | "Implement first, optimize later" but with Zig the implementation | will look completely different depending on which optimization or | data structure I choose, so I need to think about optimizations | from the start if I don't want to rewrite everything later. I | should mention, though, I'm very used to Python these days and | haven't written C or C++ in ages, so maybe that sort of culture | shock is somewhat expected. | | Anyway, I'm still excited about the language and my impression is | that Andrew Kelley is very open to new suggestions and new ideas, | so things will certainly still change in one way or another till | v1.0. | | [0]: https://news.ycombinator.com/item?id=34958051 | AndyKelley wrote: | There is no "just work" for deep equality. The standard library | would have to make decisions on behalf of the application that | it has no business making. | | I can tell you right now that while there will be many upcoming | language changes, none of them will be comfy to Python | programmers. Zig is very much an imperative language. Or | perhaps think of it as a declarative DSL for outputting machine | code. | _a_a_a_ wrote: | > a declarative DSL for outputting machine code | | and thanks for the laugh | codethief wrote: | Thanks for your message, Andrew! | | Just to be clear, I didn't mean any offense and maybe my | critique wasn't as well-balanced as it could have been. So | far, coding in Zig has been a fun ride, despite the | occasional obstacles I've run into! | | > There is no "just work" for deep equality | | Say I have two variables A and B of the same struct type. | Each points to a finite region in memory of the same size. | Why can't I just compare these two regions using `A == B` to | make sure they are equal? 
Yes, one can obviously come up with | alternative definitions of what equality might mean for | structs (only compare certain subfields etc.) but wouldn't | the aforementioned definition be a good default that would | work in the majority of cases? | | Alternatively, there is `std.testing.expectEqualDeep()` which | walks through all fields but as far as I know there is no | equivalent for production code(?) | | > none of them will be comfy to Python programmers | | Oh I think the multi-sequence for loops feature already makes | Zig more comfy! :) | | Just to be clear: I wouldn't want Zig to be another Python. | While I like Python from a developer experience POV, it is | dictionaries and magic methods all the way down and often | unnecessarily slow and complex. | | I still think one could find a good balance between DX and | full low-level control, though. One could e.g. have one | convenient way to express a certain problem that gives you | medium control over performance (e.g. the `==` example above) | and one or more fine-tuned, but possibly less concise ways of | expressing the problem that provide full control but require | more lines of code. In the struct equality example the latter | would mean defining some kind of `eql()` function that would | be optimized to the struct type (e.g. compare certain fields | first as they are more likely to differ etc.). Would this | violate the Zen of Zig?[0] | | > Only one obvious way to do things. | | After all, there is also | | > Favor reading code over writing code | | Right now, at least, I often run into situations where I | don't know of any obvious way to solve my problem. Then I end | up writing lengthy code to tell Zig what I want and end up | with code that's so-so on the fun-to-read scale. | | [0]: https://ziglang.org/documentation/master/#Zen | munificent wrote: | _> Why can 't I just compare these two regions using `A == | B` to make sure they are equal?_ | | Why is shallow equality useful? 
| | You could have `A == B` be true, but then as soon as you | wrap pointers to them in C and D, now `C == D` is false. | shagie wrote: | It gets into even more fun if A has a pointer to C, which | has a pointer to B, and B has a pointer to D, which has a | pointer to A. | [deleted] | Existenceblinks wrote: | Probably a good signal for potential O(n^2) when reading the | code. | | EDIT: Nope. I was wrong. This is not a list comprehension. | tuukkah wrote: | Right, it's not an ordinary list comprehension. It's a parallel | list comprehension though: zip as bs = [(a,b) | a <- as | b <- bs] | | https://downloads.haskell.org/ghc/latest/docs/users_guide/ex... | loeg wrote: | Why do you say so? It's equivalent to iterating indexes from 0 | to N-1 of two (or more) lists with N elements and providing | syntax sugar for those lists' elements at that index. This is | O(N). | Existenceblinks wrote: | True. I misread it, it's a zip pattern. I thought it was a | fancy list comprehension. | dan00 wrote: | I think the naming of the 'else' branch in the loop could be more | telling, like using the name 'finally' or 'finish'. | Someone wrote: | I agree, but 'finally' or 'finish' IMO aren't good choices | because that code doesn't always execute. | | I think I would go for something expressing 'default', but | would first look at existing code to see how common this is, | and if I decided I wanted this feature, look hard for | alternative syntax. const match: ?usize = for | (text, 0..) |x, idx| { if (x == needle) break idx; | } | | could return an optional int, for example. If so, you would get | a 'null' for free, and if you didn't want a null, you could | tack on a _.getOrElse(NOT_FOUND)_. | | I guess they picked this because Python has it, too.
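For reference, the Python loop-`else` mentioned here behaves as follows (`find` is a made-up helper for illustration, not from the thread):

```python
# Python's for/else: the else suite runs only when the loop
# finishes without hitting `break`.
def find(text, needle):
    for idx, ch in enumerate(text):
        if ch == needle:
            break
    else:  # no break: the needle was never found
        return None
    return idx

print(find("hello", "l"))  # 2
print(find("hello", "z"))  # None
```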
| https://docs.python.org/3/tutorial/controlflow.html#break- | an...: | | _"Loop statements may have an else clause; it is executed when | the loop terminates through exhaustion of the iterable (with | for) or when the condition becomes false (with while), but not | when the loop is terminated by a break statement."_ | masklinn wrote: | `finally` hints at a very different behaviour because in most | languages' context a finally clause is executed whether an | exception is raised or not. | puffoflogic wrote: | Sorry, but using new syntax to accomplish something other | languages have as library code is not clever. | | When reading zig code you have to stop and think, "wait does this | syntax mean zip or direct product?" But when expressed as a | _function called zip_, the meaning is clear. | | (Obligatory reminder that zig devs think that sometimes running | code inside `if (false)` is a minor bug of no consequence, and | after all what are the _real_ motives of anyone pointing it out, | eh?) | Kukumber wrote: | Kinda nice to have, D can do it as well: import | std; void main() { int[] a = [1, | 2, 3]; string[] b = ["a", "b", "c"]; | foreach (e1, e2; zip(a, b)) { | writeln(e1, ":", e2); } } 1:a | 2:b 3:c | hota_mazi wrote: | Question for Zig experts: for (elems) |x| { | std.debug.print("{} ", .{x}); } | | Why is the .{x} necessary here? What happens if I just write "x"? | Laremere wrote: | Zig doesn't have variable length args. The .{} syntax is for an | anonymous struct with no field names (which are named tuples in | Zig). Print takes the struct's type info at compile time to | check the validity of the statement, and also produce optimal | code. This is implemented entirely within Zig's standard | functionality that's available to all users. | | So, if you just type X, you're getting an error about it not | being a struct. That's unless X is a struct with one field, | where it'll just print that field.
I find Zig meta-programming | to actually be fairly readable, here's the function that does | the formatting: | https://github.com/ziglang/zig/blob/master/lib/std/fmt.zig | anonymoushn wrote: | `std.debug.print` and similar fmt-like functions take 2 | arguments. The first argument is a format string, and the | second argument is a tuple. I think a tuple is an anonymous | struct whose members are named 0, 1, 2, etc., but I'm not | completely sure on this. If you just write "x", it won't work, | since you needed to pass a tuple containing 1 thing, and x | probably isn't a tuple containing 1 thing. | quic5 wrote: | The print function is implemented in the std library[1] not the | compiler and Zig does not have varargs | | [1] | https://github.com/ziglang/zig/blob/f6c934677315665c140151b8... | AnIdiotOnTheNet wrote: | Zig doesn't have varargs anymore. Instead, it has anonymous | structs/arrays/tuples. The second argument to `print` here is | expected to be a list of the values referenced by the `{}` | placeholders in the string in the first argument. | | `.{a, b, c}` is the syntax for an anonymous struct/array/tuple, | and a single element still needs to be wrapped in it. | noobermin wrote: | >In the multi-sequence for loop version it's only necessary to | test once at the beginning of the loop that the two arrays have | equal size, instead of having 2 assertions run every loop | iteration. | | So, I'm assuming zig generally cannot be used with multi-threaded | code? Can the underlying arrays not be modified during the whole | loop execution? | int_19h wrote: | Arrays can be modified, but their size is a part of their type, | just like C. | | For slices, length is only known at runtime, but it's immutable | once the slice is created, so there's no issue there, either. | [deleted] | masklinn wrote: | That's broken in every language, except for the few which just | don't allow doing it. So I'm not quite sure what the question | is about. 
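On masklinn's point that mutating a sequence out from under an active loop is broken more or less everywhere: Python at least turns one common case into a checked error, as a small illustration:

```python
# Growing a dict while iterating over it is detected at runtime:
d = {"a": 1, "b": 2}
caught = None
try:
    for k in d:
        d["c"] = 3  # resize during iteration
except RuntimeError as e:
    caught = str(e)
print("caught:", caught)  # "dictionary changed size during iteration"
```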
| spullara wrote: | I'm not sure zip is used enough to add it to the language but | since they are also using it for tracking the index maybe that is | the primary use case. | kristoff_it wrote: | The example of using SoA memory layout is not there just as a | random example. We hope for Zig developers to employ DOD | principles whenever appropriate, which is not going to be that | rare in a low-level programming language like Zig. | | Andrew has a full talk about how the Zig compiler benefits | tremendously from DOD: | | https://vimeo.com/649009599?embedded=true&source=video_title... | spullara wrote: | Honestly working with them in a column oriented way makes a | lot of sense. I wonder though if that should just be handled | at the struct level? i.e. ask for row vs column layout. | conaclos wrote: | Is there other rationale behind this `for` and `while` syntax? | Why not: for x in elms {} for x in | elms, i in 0.. {} | | In my view, this seems simpler to read and understand. In | particular, the iteration variable is close to the iterated | structure. In the Zig proposal you have to think about the | position of the iterated structure and the position of the | iteration variable. for (elms, 0..) |x, i| {} | ^^^ ^ Ok it is in second position | | I still don't get why `||`. Is this a lambda? Can I write | something like: fn func(x: i8, i: i8) void {} | for (elms, 0..) func | | In the same vein I do not understand the rationale behind the | `while` syntax. Why `:`? while (condition) : (i | += 1) { | messe wrote: | Because it's consistent with the rest of Zig's capture syntax: | if (foo()) |result| { while (it.next()) |elem| { | bar() catch |err| ... | kazinator wrote: | Here is Awk with C99 preprocessing: cppawk! | | Loop macro for parallel/nested iteration featuring a (user- | extensible!) 
vocabulary of clauses: $ cppawk ' | #include <cons.h> #include <iter.h> BEGIN { | loop (list(iter0, item, list("alpha", "charlie", "bravo")), | list(iter1, ltr, list("a", "b", "c")), range(i, 1, | 3)) { print item, ltr, i } }' | alpha a 1 charlie b 2 bravo c 3 | | This is a tiny shell script plus a collection of header files in | a small directory structure. It requires an Awk such as GNU Awk, | and the GNU C preprocessor. | | Preprocessed programs can be captured, to run on systems that | don't have the cppawk script or a preprocessor, and with less | startup overhead. | | https://www.kylheku.com/cgit/cppawk/about/ | stephc_int13 wrote: | I have absolutely no problem with the good old C-style for loop | syntax. | | I think that a separate foreach or for-each loop, made for built- | in or extended containers could be a nice addition. | | Not seeing much value there. | titzer wrote: | Regarding the lengths must match: | | > (i.e. you will get a panic in safe release modes) | | Should I take that to mean there is an unsafe release mode | without the bounds check? But UB is mentioned too. Is there UB!? | | It's 2023; I think we can afford a single branch to avoid UB, | even in release mode. | conradev wrote: | it is one of the build modes: | https://ziglang.org/documentation/master/#Build-Mode | titzer wrote: | Interesting, thanks for the link. | | I think that disabling safety checks is a thing you should | only do if you are studying the cost of safety checks (i.e. a | compiler switch only available to compiler engineers). IMHO, | the _whole point_ of safety checks is to find the bugs that | are in your program[1]. Crashing it safely with an exact | source stack trace is the nice way of both motivating you to | fix it and also _helping_ you fix it. | | [1] And there _are_ bugs in your program. Right now. Bugs. | In. Your. Program. Running without safety checks is like | closing your eyes and rolling the dice.
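One datapoint for titzer's argument: in an always-checked language, an out-of-bounds access is a defined, catchable error rather than UB (Python shown for illustration; Zig's safe modes panic analogously):

```python
# An out-of-bounds read in a checked language is a recoverable error:
xs = [1, 2, 3]
err = None
try:
    xs[5]
except IndexError as e:
    err = str(e)
print("checked:", err)  # "list index out of range"
```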
| quic5 wrote: | There's also the compromise of only disabling safety checks | per block e.g. in your hot loop with `@setRuntimeSafety`[1] | where you are confident that they aren't needed. | | [1] | https://ziglang.org/documentation/0.10.1/#setRuntimeSafety | laserbeam wrote: | Most software should probably release with safety checks | on. Certain software shouldn't (i.e. games). Toolchains | like zig give you that option and respect that you can | decide what's most appropriate for whatever you ship. | | Arguing that safety checks should always be enabled doesn't | really make sense. Context matters. | kristoff_it wrote: | > Should I take that to mean there is an unsafe release mode | without the bounds check? | | Yes, it's called ReleaseFast. | | > It's 2023; I think we can afford a single branch to avoid UB, | even in release mode. | | Zig has 3 release modes: ReleaseSafe, ReleaseFast, | ReleaseSmall. If you want the safety checks, just use the | first. | | It might even be $currentYear, but many of the latest AAA games | still don't always run at more than 60fps on my fairly powerful | machine and I sincerely hope they were built with all the | optimizations enabled. | Maken wrote: | That's probably because of the DRM. | AnIdiotOnTheNet wrote: | And 60fps is the low end in an era where many monitors are | 120-240hz. 4ms/frame is a pretty tight budget. | titzer wrote: | > I sincerely hope they were built with all the optimizations | enabled. | | Sure, but I don't agree that disabling safety checks is an | "optimization". It is a regression in functionality that is | betting on nothing going wrong. | | Bounds checks do not cost much[1]. Maybe if a bounds check | disables vectorization[2]. | | [1] https://blog.readyset.io/bounds-checks/ | | [2] https://github.com/matklad/bounds-check-cost | | There are a lot of techniques to remove bounds checks, e.g. | in counted loops [3][4]. 
| | [3] https://ieeexplore.ieee.org/document/5381765 | | [4] https://en.wikipedia.org/wiki/Bounds-checking_elimination | AnIdiotOnTheNet wrote: | If that's what you believe, you are free to enable them as | the programmer. Programmers who disagree are likewise free | not to. | | This can even be decided on a scope-by-scope basis if so | desired. | titzer wrote: | If it's a program you wrote to run on your own hardware, | feel free. But in reality most programmers write programs | for other people's computers, or just write programs | because it's fun or pays well, and then their work gets | integrated into a larger whole at a much later date, and | then it's run in contexts the original author never | imagined, long after they move on. Safety checks catch | the base-level logic bugs that would otherwise cause | programs to go silently wrong and misbehave in complex | and inscrutable ways. Disabling them is not just living | dangerously, it's a moral hazard; the programmer doesn't | suffer the consequences, users do. It's not your program | or computer at risk, but someone else's. I don't know how | as a profession we're so cavalier with shipping exposed | whirling knives, but we are. | verdagon wrote: | If we go so far as to say that using anything unsafe is | dangerous and a "moral hazard" then we would also have to | disqualify Rust, C#, and any other language that allows | unsafe escape hatches (especially in dependencies). | AnIdiotOnTheNet wrote: | > the programmer doesn't suffer the consequences, users | do. | | The same is true of poorly performing programs. My | computer's resources are not the programmers' to waste, | yet they routinely do waste it to save themselves | time[0]. | | > I don't know how as a profession we're so cavalier with | shipping exposed whirling knives, but we are. | | That's a separate problem from not handcuffing | programmers and forcing them into safety checks. Why | should Zig force this?
| | Like, I just don't even get what you're complaining about | here. The default build mode _and_ the recommended | release one insert the check. Checks can additionally be | enabled and disabled on a scope-by-scope basis. What | exactly do you want? Just eliminate ReleaseFast as an | option and give people more reasons to go back to | footgun-laden C because it'll be the only way to | eliminate a bounds check in a tight loop hot spot? | | [0] Yes, I know this isn't due to safety checks in the | vast majority of circumstances, that's not the point. I | have nothing against safety checks, my problem is with | the mentality that it should not be possible to disable | them. Even Rust has `unsafe`. | adamrezich wrote: | the mere naming of the keyword `unsafe` has been a wholly | unintentional disaster for programming in general as more | and more people use Rust, because | "safe"/"safety"/"unsafe" are sort of emotionally-loaded | words in English, and it's led people to build mental | heuristics about the pros and cons of "safe" and "unsafe" | code which may be subtly incorrect. the language feature | itself is completely reasonable of course, given the | design decisions of the language, but as Andy said | elsewhere in this comments thread: | | > Rust evangelists need to be careful because in their | zeal they have started to cause subtle errors in the | general knowledge of how computers work in young people's | minds. Ironically it's a form of memory corruption. | | I'm not even a zig user or fan or anything and I don't | have any real opinion about Rust, either, except for | completely agreeing with this analysis based on how I've | seen Rust evangelists talk online. I'm not sure what the | solution to this is, but it seems like it's just going to | get worse over time as Rust becomes more popular and | gains market share. | Gene_Parmesan wrote: | So isn't it on the programmer to ensure the safety checks | are enabled if appropriate?
I agree with the gist of your | statement, I'm just not sure how this is the | responsibility of the language itself. It ships with the | option to build via a safe mode. I don't think it's a | moral imperative of the language designer to ship without | an unsafe mode. Even rust has unsafe blocks. | | In most engineering professions, it's the engineer's | responsibility to ensure appropriate levels of safety, | not the CAD software used to build the blueprints. But | every situation doesn't have the same level of safety | required; backyard sheds don't have the same needs as | skyscrapers. | titzer wrote: | Most engineering disciplines are considerably more | regulated than software development, and for good reason; | bridges and skyscrapers falling down can kill people. | Even electrical engineering and device manufacturing have | to fit in with standards that address shock hazard and | EMF interference. | | I actually _do_ think it is the responsibility of the | language and runtime system to ensure some base-level | safety of programs. The one constant over the years is | that programmers keep making mistakes. No matter how much | they keep yelling "trust us", they (we) just keep | screwing up. That's not to pillory us programmers. It's | just the facts that everyone screws up. In some sense, | engineering is putting processes and procedures and | checks in place that move human fallibility out of the | critical load-bearing situations so that a simple whoops | or memory slip doesn't kill people or ruin things. | krona wrote: | Without bounds checks: game crashes, core dump. | | With bounds checks: game crashes, meaningless error message | given to the user. | | What am I missing? | gaganyaan wrote: | The meaningless error message can be entered into Google | and the user can find a thread about how to fix their | specific problem instead of wading through endless | threads of similar-but-unrelated problems. 
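For reference, the error message in question is reasonably specific in Zig. A minimal sketch (the array contents and names here are illustrative, assuming a Debug or ReleaseSafe build on a 0.10/0.11-era compiler):

```zig
const std = @import("std");

pub fn main() void {
    var items = [_]u8{ 1, 2, 3 };
    var i: usize = 5; // runtime value, so the check can't be elided
    // Debug/ReleaseSafe: panics with an "index out of bounds" error
    // and a stack trace pointing at this line.
    // ReleaseFast/ReleaseSmall: undefined behavior instead.
    std.debug.print("{}\n", .{items[i]});
}
```

The panic identifies the exact access that went wrong, which is the searchable breadcrumb being described above.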
| ekimekim wrote: | > With bounds checks: game crashes, meaningless error | message given to the user. | | Without bounds checks: I join a multiplayer lobby, and | the next thing I know my computer is part of a botnet. | | This isn't an imaginary fear, it has happened many times. | Some examples from a brief search: | https://gridinsoft.com/blogs/rce-vulnerability-in-gta- | online... https://www.polygon.com/22898895/dark-souls- | pvp-exploit-mult... https://security.gerhardt.link/RCE- | in-Factorio/ | | I am not claiming all of these would've been prevented by | bounds checking arrays, or even memory safety in general. | The point is that security is not optional just because | it's a game. | dxhdr wrote: | Now suppose your game runs in a WASM sandbox and re-run | those scenarios. What do you gain from bounds checks? | | I'm not suggesting that shipping without bounds checks is | wise or leads to a better product. However I do think | with /some games/ security is basically not a concern. | Arnavion wrote: | Heartbleed still happens inside a sandbox, because it's | the sandboxed memory that leaks. For multiplayer games | specifically, that can be a client auth key that can be | used to impersonate you. | adgjlsfhk1 wrote: | the bounds check can sometimes catch the error before it | corrupts your save. | titzer wrote: | > Without bounds checks: game crashes, core dump. | | I think it's more like (assuming it does actually go out | of bounds at some point): | | 30% chance of core dump right away | | 20% chance of core dump at some point after errant write | | 40% chance it never crashes in testing | | 5% chance it doesn't crash the first year after shipping | | 5% chance it never crashes | | With an explicit bounds check, all of these scenarios | result in a crash at the exact location where the program | first violated safety[1].
The developer gets a source-level | crash and doesn't spend the first 20 minutes just | trying to figure out what the crash dump even means. | | [1] Hopefully with a complete stacktrace, maybe even the | index and length values! | | It's time we recognized that _all_ our tooling should be | designed to help us programmers who _do have bugs in our | program_. Like, this crashing part is the normal part | that all the tools should help deal with. | nordsieck wrote: | > What am I missing? | | Sometimes without bounds checking you get an exploit | instead of a crash. | kristoff_it wrote: | btw note that you're arguing this point in the thread of | a blog post about a feature that is all about maintaining | safety while not paying for it at runtime. There's an | entire section dedicated to explaining this point. | tuukkah wrote: | > _The new multi-sequence syntax allows you to loop over two or | more arrays or slices at the same time_ | | In Haskell, this is called a parallel list comprehension: | [x+y | x <- xs | y <- ys] | | In a normal list comprehension, you have a single pipe; in a | parallel one you have as many pipes as lists you are | zipping. | https://downloads.haskell.org/ghc/latest/docs/users_guide/ex... | masklinn wrote: | No, your list comprehension is a product (it iterates ys for | every x). The feature here is zip. | tuukkah wrote: | I'll quote from the documentation link I referenced: | | > _For example, the following zips together two lists:_ | [ (x, y) | x <- xs | y <- ys ] | | That's precisely the difference between a normal list | comprehension (one pipe) and a parallel list comprehension | (multiple pipes).
| | For clarity, here's your normal list comprehension (with one | pipe) that produces all the combinations instead: | [ (x, y) | x <- xs, y <- ys ] | | And here's the full example from the article converted to | Haskell and producing the exact same output: | {-# LANGUAGE ParallelListComp #-} import | Control.Monad (mapM) elems = [ "water", "earth", | "fire", "air" ] nats = [ "tribes", "kingdom", "nation", | "nomads" ] main = mapM putStrLn [ show | idx ++ " - " ++ e ++ " " ++ n | e <- elems | | n <- nats | idx <- [0..] ] | | EDIT: I suppose an explicit zip with an anonymous function | looks more idiomatic though: main = forM | (zip3 elems nats [0..]) $ \(e, n, idx) -> putStrLn | (show idx ++ " - " ++ e ++ " " ++ n) | | EDIT2: Best of both worlds with the list monad? | main = mapM putStrLn $ do (e, n, idx) <- zip3 elems | nats [0..] [ show idx ++ " - " ++ e ++ " " ++ n ] | arethuza wrote: | I was fond of the Common Lisp loop macro that handled iterating | over multiple things quite nicely: | | https://lispcookbook.github.io/cl-cookbook/iteration.html#lo... | | Edit: 27 years since I was paid to write Lisp.... | masklinn wrote: | This is very strange, because it looks like a comprehension | (https://wiki.haskell.org/List_comprehension), which would be a | product iteration. | | Most languages have a function called zip or something similar | (https://hackage.haskell.org/package/base-4.17.0.0/docs/Prelu.. | .) which handles pairing sequences, to be composed upstream of | the iteration proper. | tuukkah wrote: | It's a _parallel_ list comprehension, as linked from the wiki | page you referenced: https://downloads.haskell.org/ghc/9.4.4/ | docs/users_guide/ext... | MrBuddyCasino wrote: | I must say this looks pleasant, coming from Kotlin. Also ranges | seem to work very similarly. | | Not sure how I feel about the UB - is it really necessary to | optimise away a single length check per loop (not iteration)?
Zig's default build mode is Debug, | and ReleaseSafe is recommended if you don't require extreme | performance. Both modes will insert the check. | | Safety checks can also be enabled or disabled on a | scope-by-scope basis if desired. | carterschonwald wrote: | `zipWith` is a great iteration API to have available. | andrewstuart wrote: | Off topic, but I was weighing up trying Zig last night for a | project. | | No doubt Zig has changed a lot and is better than it was only a | year or two ago. | | Is anyone here willing to say if they have experienced success | and satisfaction using Zig? I'm wanting to do some C library | interfacing. | blameitonme wrote: | Hey I'm just a student and can't even think to build stuff of | complexity most of the guys here make rn, but I made a json | parser in zig and it was fun. | marmada wrote: | I really like that for loops can be expressions. It seems obvious | in hindsight, but hindsight is always 20/20 :) | masklinn wrote: | > It seems obvious in hindsight | | It's not, because most languages don't have an `else` clause in | their for loop (and in my experience with Python that clause is | quite confusing so its use is not common). | | And a for loop can be executed 0 times, so without a mechanism | for a fallback it might not have a value _to_ yield. | Someone wrote: | > And a for loop can be executed 0 times, so without a | mechanism for a fallback it might not have a value to yield. | | I would think that and the similar case where no iteration | hits _break_ are solvable by having a _for_ loop return an | optional type. | avgcorrection wrote: | Special-casing (same-length) zip and iteration+count might make | sense for an imperative language which doesn't want to go down | the rabbit hole of implementing efficient, lazy iterators.
It | doesn't make sense in a language where you want the flexibility | of switching between (as in: compiling to) serial loops and | parallel code, but it makes sense for a language which leans more | towards what-you-see-is-what-you-get rather than | sufficiently-smart-compiler. | noobermin wrote: | Tbh, there are limits to how "wysiwyg" compilation can be for | any language that has for loops. For example, | any "for" loop can be a "while" loop in asm; the one | optimization is that you can use the index registers as long as the | number of arrays is less than the number of index registers you | have. If it is more, which the language does not constrain of | course, you just go back to a loop with memory locations for | pointers. But of course, in that case, you _must_ have a | "smart compiler" that can decide which case it is and thus | compile to the right code. | | That said, this likely will be an esoteric case on most modern | machines (like x86_64 has 16 regs that can be used for indexes) | and I doubt people want to use this for like avr. | nemo1618 wrote: | This is a gripe I have about Go -- a very minor gripe, to be | sure, but it's still there.
If you want to iterate over two | arrays/slices that have the same length, you have to choose | between: for i := 0; i < n; i++ { | fn(foo[i], bar[i]) } for i := range foo { | fn(foo[i], bar[i]) } for i := range bar { | fn(foo[i], bar[i]) } for i, x := range foo { | fn(x, bar[i]) } for i, y := range bar { | fn(foo[i], y) } | | But none of these are satisfactory; what I _really_ want to write | is: for _, (x, y) := range (foo, bar) { | fn(x, y) } | maxmcd wrote: | This is pretty ugly and adds the overhead of a function | callback, but just for fun: func multiLoop[X, | Y any](x []X, y []Y, cb func(i int, x X, y Y)) { if | len(x) != len(y) { panic("invalid slice | lengths") } for i := 0; i < len(x); i++ | { cb(i, x[i], y[i]) } } | func foo() { multiLoop([]int{1, 2, 3}, | []string{"a", "b", "c"}, func(i int, x int, y string) { | fmt.Println(i, x, y) }) } | masklinn wrote: | FWIW this is often called `zipWith`, or sometimes just `map` | (some `map` implementations can take a variable number of | sequences to map over). | rcme wrote: | This kind of syntactic sugar used to appeal to me, but now I | think it's a pretty weird feature to add to a language. Using zip | / enumerate primitives feels a lot more flexible. | cryptonector wrote: | To me this looks a lot like closure syntax w/ non-local exits. | Seems quite reasonable for a functional programming language. | moomin wrote: | I think it matters what your target use cases are. This makes | me think quite a few people are running ECS systems. | cgh wrote: | How cache-friendly are zip/enumerate implementations? Zig is | influenced by the ideas behind Data Oriented Design, mentioned | in the article (and a buried lede, if you ask me). Explicit for | loops like this are generally cache-friendly and ideal for eg | game programming, as shown in the structs of arrays example.
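The parallel-array pattern described above maps directly onto Zig's new syntax. A minimal sketch (the array names and values are illustrative, assuming the 0.11-era multi-sequence for loop from the article):

```zig
const std = @import("std");

pub fn main() void {
    // Parallel arrays (structs-of-arrays style), iterated in lockstep.
    const xs = [_]f32{ 1, 2, 3 };
    const ys = [_]f32{ 10, 20, 30 };
    var dot: f32 = 0;
    // One length assertion up front, then the sequences advance together;
    // `0..` captures the index without a separate counter variable.
    for (xs, ys, 0..) |x, y, i| {
        std.debug.print("{d}: {d}\n", .{ i, x * y });
        dot += x * y;
    }
    std.debug.print("dot = {d}\n", .{dot});
}
```

Because both accesses share one bounds assertion, the loop body can use unchecked indexing, which is what keeps it cache- and codegen-friendly.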
| [deleted] | steveklabnik wrote: | I tossed together a simple function using enumerate | https://godbolt.org/z/PKsEdKvKK | | You get the same exact asm as the manual loop. | | Of course, the idiom recognition seems to kick in, in both | cases, as there's no actual loop here. I tossed in a +sum, | which makes that fail, so you get some loops, check it out: | | https://godbolt.org/z/1ddf5ded7 | | They are one instruction different in length, which is kind | of amusing to me. Some small differences. | cgh wrote: | Thanks, that's exactly what I was asking. I don't write | Rust so it's informative to see this. | steveklabnik wrote: | Any time. | vore wrote: | As cache-friendly as advancing two pointers and a bounds | check. | [deleted] | pmontra wrote: | I don't know how common working with ranges is in Zig. Ruby | would iterate on multiple ranges by converting them to arrays | one = (1..3) seven = (7..10) (one.to_a + | seven.to_a).each {|n| puts n} | | I suppose that if it was common they would have added a + | method to Range. Actually I think it's possible to implement | it with a refinement on the Range class. | | Yup, it works. First time I ever used refinements. | module JoinRanges refine Range do def | +(other) self.to_a + other.to_a end | end end using JoinRanges one = (1..3) | seven = (7..10) (one + seven).each {|n| puts n} | kdmccormick wrote: | This is different. You are concatenating the arrays, whereas | the article & discussion are about zipping arrays. | laserbeam wrote: | Depends on what you mean by flexible. If you want to use them | outside of loops then they could cause magic data copies under | the hood. Zig really hates hidden control | flow/allocations/copies. Within the for syntax it's pretty | straightforward what gets assigned to what variables and how | copies can be avoided. | | Doing things like `a = @zip(some_list, some_other_list)` can be | reasoned about in multiple ways, some of which involve silently | calling malloc.
It's particularly unclear what could be done | with `a` afterwards. Zig hates that kind of ambiguity and is | happy to err away from flexibility at times. | brundolf wrote: | Rust also hates hidden allocations, and its iterator system | can do all of this without them | | Although- thinking about it, that may rely on the borrow | checker (move semantics specifically) | gpanders wrote: | >Rust also hates hidden allocations | | Does it? Rust seems happy to allocate silently all the | time. let x = String::new("hi"); | let y = vec![]; | | Do either of these allocate? As the writer or reader of | this code, how do I know if either of these statements | result in a heap allocation, or if the data is strictly on | the stack? | | Zig's requirement of explicitly passing around an Allocator | type removes any ambiguity completely. | steveklabnik wrote: | (You're forgetting a "new" in the string example) | gpanders wrote: | Thanks, I fixed it :) | brundolf wrote: | Sure, any arbitrary function (or macro) logic can | allocate. It's more a philosophy, not something that's | language-enforced[0] in Rust- if you're creating a | mutable, variable-size data structure like a String or a | Vec or a HashMap you're not going to be very surprised | that it allocates at some point (though technically zero- | length Vecs don't allocate on construction, they wait | until an item is added) | | But closures don't require allocation, iterators don't | require allocation, async doesn't require allocation. | Copy semantics also don't allow allocation- implicit | copies can only happen for data structures that are | bitwise-copyable, which is enforced by the compiler. 
For | copy-with-allocation you have to implement the Clone | trait, and then invoke it explicitly with the .clone() | method | | But the original context was a question of philosophy, so | I was only speaking to Rust's overall philosophy | | [0] Technically I think if you're using no_std you won't | have access to any standard constructs that allocate | (which obviously will prevent their use at compile-time), | though I believe you're still allowed to eg. call out to | foreign functions manually that would allocate. And of | course, this still isn't as granular as Zig's allocation- | control. | [deleted] | masklinn wrote: | That's nothing special though, `zip` just takes an item | from each iterator, packs them into a tuple, and yields | that. It has no weird bounds or requirements or anything: | https://doc.rust-lang.org/std/iter/fn.zip.html | | The impl of the default `next` is: fn | next(&mut self) -> Option<(A::Item, B::Item)> { | let x = self.a.next()?; let y = self.b.next()?; | Some((x, y)) } | | So completely straightforward. | defen wrote: | Can that zip more than two iterators? And does it perform | a bounds check on each call to `a.next()` and `b.next()`? | brundolf wrote: | It stops when one of the two iterators ends | | It can't zip more than two per se, but you could zip the | result of the first zip into a third and get ((item1, | item2), item3). You could then map these if you wanted, | to flatten them into a single tuple .map(|((item1, | item2), item3)| (item1, item2, item3)) | | Of course there's a trade-off here between ergonomics and | generality | masklinn wrote: | > It can't zip more than two per se, but you could zip | the result of the first zip into a third and get ((item1, | item2), item3). 
You could then map these if you wanted, | to flatten them into a single tuple .map(|((item1, | item2), item3)| (item1, item2, item3)) | | FWIW that's more or less what `itertools::izip!` does for | you, it just chains `zip`s then "splats" them using a | `map`. | defen wrote: | > It stops when one of the two iterators ends | | Right; my question is, suppose you're iterating over two | slice iterators - won't each call to `a.next()` and | `b.next()` have to check whether that sub-iterator is | done? One of the benefits of the Zig approach is that you | can iterate over an arbitrary number of slices and do one | check before entering the loop, followed by the compiler | emitting unchecked index access in the loop. So it | basically compiles down to the equivalent of a C `for` | loop. | masklinn wrote: | Rust's zip has a specialisation for iterators with a | trusted length. Such as slice iterators. | | `zip` yields exactly the same assembly as a loop over the | index range with an unsafe item access: | https://godbolt.org/z/7ebfxbhxc | brundolf wrote: | Interesting, are "trusted-length" iterators something | that might ever make it into userspace? Maybe as const | generics? | masklinn wrote: | It's already in userspace, though nightly (and unsafe, | obviously), so whether it'll be stabilised, and in what | form, is an open question: https://doc.rust- | lang.org/std/iter/trait.TrustedLen.html | the8472 wrote: | For Zip it's TrustedRandomAccess[0] instead of | TrustedLen. Imo the most radioactively unsafe trait in | the standard library and will likely never be stabilized | in its current form. | | [0] https://github.com/rust- | lang/rust/blob/f540a25745e03cfe9eac7... | kristoff_it wrote: | You have to pass -O though, the point of Zig's for loop | syntax is to get fast compile times and good performance | also in debug mode :^) | defen wrote: | That's cool. 
At the same time though, it almost feels | like a distinction without a difference in some ways - | Zig has a special built-in syntax; Rust doesn't use | special syntax, but it does use complex special-cased | unsafe code in the stdlib in order to implement a safe + | performant API. | masklinn wrote: | On the other hand, the "special cased unsafe code" is | applicable to more than just zip, more than just the one | array type, and is available in userland (though | currently unstable so nightly only, both to implement it | on a bespoke type and to rely on it). | brundolf wrote: | Rust's is built on top of (and exposed to) Iterators, | which are a very general concept that can be rooted in | all kinds of data structures, composed in all kinds of | ways, and collected/processed in all kinds of ways (i.e. | the user's code might not even contain an actual loop). | The code continues to work in many situations, even where | the optimization doesn't apply | | You trade some special-case syntax and ergonomics for | that generality, but it is very general even if not all | of it is optimized in the same way | brundolf wrote: | I was thinking about the fact that whatever you're | iterating over has to be copied around throughout the | process. Rust can guarantee that eg. deep-copies (clones) | of allocated structs will never happen implicitly, if | your iterator owns the values being iterated. But in | languages where copying can trigger allocations, this | could be a problem | | I don't actually know whether that applies to Zig though | hryx wrote: | In general Zig foregoes syntactic sugar and requires | implementing higher-level APIs by composing primitives. But a | new language feature is a candidate when it solves a use case | that can't otherwise be solved, or opens up a path to more | efficient code. 
| | Loris' blog post points out that the new for loops address the | latter: | | > In the multi-sequence for loop version it's only necessary to | test once at the beginning of the loop that the two arrays have | equal size, instead of having 2 assertions run every loop | iteration. The multi-sequence for loop syntax helps convey | intention more clearly to the compiler, which in turn lets it | generate more efficient code. | | It also builds on existing properties of slices/arrays, rather | than adding a new "enumerate primitive". | travisgriggs wrote: | This is my take as well. The older and more travelled I get the | more I disdain these kinds of things. Your language syntax | should do whatever the "thing" is that your language model is | all about. Syntactic sugar should be for the things you do | LOTS. | | I watch language after language add sugar to maintain the | appeal of their product, one niche group or application at a | time. It turns into a death by a thousand cuts, or by a | thousand sugar cubes. Most languages start out simple and | appealing and understandable; an increasingly short amount of | time later, they've layered on "helper" after "helper" to the | point it takes a bit of expertise to consume the language | effectively. | | I dream of a world where we'd measure languages by the | complexity of their ASTs rather than their popularity on a | TIOBE or StackOverflow index. | AndyKelley wrote: | Arguably this change to the zig language is overall a | simplification because the loop index capture is no longer a | special case. | travisgriggs wrote: | Could be. I think you're more the expert here than me? :D | | To me, the following is a bit of syntactic sugar that I | think is the kind of transcendental "go big/basic with it" | that I hint at. | | Some time ago, I worked in a language that had this idea | that any composable block of code could be captured as 0-N | statements between the characters [ and ].
They thought | they were being clever and called it a "block of code". | Which I thought was cool, because it looked like a block. | Pedants called it a BlockClosure. If you wanted to pass | parameters to one of these, they used a colon-denoted list. | So a two arg block might look like | | [:a :b | <code goes here> ] | | So yay, pass a closure to a service, and it "captures" the | values by invoking said closure with arguments. | | And then the authors thought, okay, enough sugar for a few | days, let's just use this. I mean really really really use | this. | | You can use a two arg block like that for a zip function of | course, but why limit it to iteration? Use it in the | standard library to implement the "for each" function. | Which when you looked at was just that "how dare they not | have a for syntax" while implementation. But because it | wasn't embedded in the syntax, you could copy/paste/modify | to come up with a filter iteration. Or a reduce. Or a map. | Or all kinds of interesting compositions | "selectAndCollectAndReject" with 3 closures. | | And why stop there? They decided, "let's just do boolean | logic with these block things too". So whereas most | languages have special syntax for conditionals (and once | they start, they're in competition with their peers to keep | adding more and more of them (do while, case, if, if with N | elses, on and on)). But they just wrote it like | | <condition> ifTrue: [trueBlock] ifFalse: [falseBlock] | | Sure they optimized it, but from a linguistic point of | view, it was the same thing as above. No new sugar was | needed. | | Whereas many languages have added sugar for optionals | (usually involving ?s), this language, 20 years ago, was | doing it with closures already. Someone noticed they could | implement the following family of "functions" | | ifNil: [nilBlock] | | ifNil: [nilBlock] notNil: [:notNilValue | notNilBlock] | | ifNotNil: [:notNilValue | notNilBlock] | | Sure, not as terse as ?
(which some endeavoured to deal | with), but the language semantics didn't have to change | each time there was a new thing to do. | | I'm sure there's a Lisper out there that can write their | analog to the above. Because it too was one of these | "do much with little" languages. | duped wrote: | "Sugar" is implemented by converting an AST into itself, so | it wouldn't change its "complexity" at all. | Bekwnn wrote: | Working on low level performance sensitive code in games, | this is something I see in code LOTS. | | As mentioned in the article, data oriented design runs into | the pattern of wanting to iterate over parallel arrays of | data frequently. | Terretta wrote: | coming from total unawareness of zig: in the for (1..5) | construct, these integer ranges consistently not including the | upper limit element when lists do include the last element, seems | surprising. i guess it's a range boundary (1 TO 5), not a list (1 | THROUGH 5), but the other behavior feels like a list, so it feels | like 5 should be in. | throwawaymaths wrote: | It's consistent with (some) other languages; for example in | ruby .. includes the last item and ... excludes it. | cmoski wrote: | I completely agree with you on the madness of not including the | upper limit. However, I don't see how the phrase "one to five" | would not include five. "Rate this film on a scale of one to | five" does not mean four is the highest rating. | | It translates to "increment from one, stopping before you get | to five". Ridiculous. | kzrdude wrote: | It's not ridiculous, "1 to 5" translates into it starts with | 1 and ends with 5, and both versions are ambiguous on the | point of including the endpoint or not. In a programming | context, it seems "clear" that it's ambiguous or down to | convention. | andrewstuart wrote: | Hang on, I was reading last night that Zig has no for loop? That | you have to use while.... is this not correct?
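Picking up the range-boundary subthread above: the upper bound is exclusive because ranges mirror Zig's slice syntax. A minimal sketch (illustrative names, assuming the 0.11-era syntax):

```zig
const std = @import("std");

pub fn main() void {
    const elems = [_]u8{ 10, 20, 30, 40, 50, 60 };
    // Ranges share their meaning with slicing: elems[1..5] selects
    // elements 1, 2, 3, 4, so for (1..5) visits indices 1, 2, 3, 4.
    for (1..5) |i| {
        std.debug.print("{d} ", .{elems[i]});
    }
}
```

Keeping the loop range and the slice range consistent means `for (0..a.len)` always stays in bounds, which is arguably the design rationale for the exclusive upper limit.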
| messe wrote: | It has no "for (init; cmp; step)" type loop, and instead you | had to use: var i: usize = 0; while (i | < sz) : (i += 1) { ... } | | Meaning that i would leak into the enclosing scope. | | It did have a foreach-style for loop, as seen in the article | though. | kzrdude wrote: | Looks like they have the most important part in place, the | increment before the next iteration. | tialaramex wrote: | > Ranges can only exist as an argument to a for loop. This means | that you can't store them in variables | | I am confident this is a mistake. Every time you make a new kind | of "thing" in your language somebody will want to do all the same | stuff with it that they did with the other things, such as | integers, ie in this case store a range in a variable. Ideally | you'd just always be able to do that, see Lisp, but it can get | very unwieldy, thus this is a reason to avoid making new kinds of | thing so the issue doesn't arise. | | C++ chooses to actually do the heavy lifting here, which is why | std::format (and its inspiration fmt::format) was such an | enormous undertaking -- C++ can express the idea of a function | which takes a variable number of arguments and yet all those | arguments are independently type checked at compile time, not via | compiler magic but just as a normal feature of the language. This | is an enormous labour, and because they don't have any way to fix | syntax issues the resulting problems accumulate forever in their | language so I cannot recommend it as a course to other languages. | It's like the Pyramids, do not build giant stone tombs for your | leaders, this is a bad idea and your society should not copy it - | however, the ancient Egyptians already did build giant stone | tombs and they're pretty awesome to look at. | | Anyway, Rust chose to make its half-open range type | std::ops::Range an actual type which you can store in a variable, | pass to functions, modify etc. as well as using it in a for loop.
| Obviously don't copy Rust here exactly, for one thing Range | should probably be IntoIterator, not an Iterator itself if they | had it to do over, but you will wish this was an ordinary type in your | language, so, just do it now. let a = 0..4; // | The Range starting at zero and (non-inclusively) ending at four. | masklinn wrote: | The problem is that zig's designers apparently don't want to | introduce an iterator abstraction, hence the frankenstein-ing | of the for loop instead. | | Though in fairness getting an iterator abstraction to the same | efficiency as a for loop requires pretty brutal optimisations, | frankensteining your for loop, a lot less so. | matu3ba wrote: | Yes, this makes debug builds bloated and slow. | throwawaymaths wrote: | I don't know how ranges are implemented now (and I'm too lazy | to check right now) but it's entirely possible zig's ranges | could wind up as comptime-only values. | | Then you _could_ pass them around, but only at comptime, | which will achieve many of the things you expect. | | There's also nothing stopping you from creating an iterator | interface in userland. | xigoi wrote: | > Though in fairness getting an iterator abstraction to the same | efficiency as a for loop requires pretty brutal optimisations | | How about Nim's inline iterators? ___________________________________________________________________ (page generated 2023-02-27 23:01 UTC)