[HN Gopher] Hobby x86 kernel written with Zig ___________________________________________________________________ Hobby x86 kernel written with Zig Author : netgusto Score : 310 points Date : 2020-01-06 08:54 UTC (14 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | FpUser wrote: | I really wish Zig picks up and goes mainstream. I also wish it | gets a little (not too much) in the way of higher-level features, | something along the lines of simple OOP. | | I would even start using it now for some smaller projects that | are not vital, but I was stalled on Zig not being able to compile | some C (it claims to do that, and it does, but not in my cases). | Sure, I could do an import, but I would prefer the Zig feature to | work. | vips7L wrote: | Here is a microkernel written in Zig: | https://github.com/AndreaOrru/zen | knebulae wrote: | I tried to do this with Zig about 13 months ago. It was not where | it needed to be at that time; the biggest impediments were its | rudimentary handling of C pointers to one vs. pointers to many | (which has long since been fixed), and its metaprogramming | issues (lack of a macro language or pre-processor) that made OS | development tedious. I have not revisited it as much as I would | have liked simply because I chose to step back a bit on | implementation and focus on theory. | | I'm pulling for Andrew. He busts his rear-end, livestreams, and | is generally a good dude. Zig has a TON of potential. | kburman wrote: | I'm also trying to do something very similar. I can use it as a | reference. Thanks for sharing! | archsurface wrote: | The hello world examples for master and 0.5.0 seem quite | different - any trouble keeping up with the changes? | jackhalford wrote: | Migration from 0.4 to 0.5 was OK. AFAICT the big change | for 0.6 syntax-wise is the drop of varargs in favour of tuples. 
| I'll take some time to do the switch when 0.6 hits but I'm | confident it won't take long because of how concise the | language is. | AndyKelley wrote: | 0.6.0 is scheduled for April 13, 2020. Let's make sure your | project builds successfully and correctly with master branch | 1-3 weeks before that, and I'll prioritize any bugs before | the release that affect you. | | (This goes for all zig community members with active | projects) | rumanator wrote: | Very interesting. Keep up the good work! | jackhalford wrote: | Hi, author here! I've just finished writing the pre-emptive | multitasking [0] (only round robin though, nothing fancy). | | I'm currently writing an ATA driver [1]; the idea is to implement | ext2. | | I used to do this in Rust but I switched to zig for | maintainability and readability over Rust. It seems that with | `comptime` I'm able to make a lot of things optimal. | | Overall I have to say kernel programming is _hard_ but very | rewarding when it finally works, it's really demystifying | computers for me! | | [0] https://wiki.osdev.org/Brendan%27s_Multi-tasking_Tutorial | | [1] https://wiki.osdev.org/IDE | rehemiau wrote: | Looks awesome! Planning to do a similar thing in Jai once it | comes out | jackhalford wrote: | I don't follow Jai, but maybe Jonathan Blow should try zig, | it's a great fit for game development and I remember him | mentioning that he wants a language that is fun to write in. | [deleted] | hippyhippo wrote: | zig is following a pretty different path from jai afaik: | zig prefers everything explicit and is fairly verbose, whereas | jai has many implicit things and also has macros or | something similar. Anyway, more will be seen when it is | released. | andi999 wrote: | Maybe you can enlighten me, but from my view it seems the | biggest difference is that zig exists and jai doesn't | (yet?); at least for practical purposes. 
| gameswithgo wrote: | Blow is pretty familiar with Zig, there has been some | collaboration between the two creators. | adamrezich wrote: | There's still zero evidence that the compiler or language | has ever been or ever will be called "Jai," aside from the | fact that it is the tenuous file extension being used in | development streams. | rehemiau wrote: | It's the best short name we can use right now so it's the | one we should use IMO. If it gets a different name, we'll | start calling it differently, I don't mind. | guggle wrote: | > I used to do this in Rust but I switched to zig for | maintainability and readability over Rust. | | Can you expand on this? I'm asking out of curiosity because I | want to learn a "system" programming language (for whatever | definition there is to this term). So far I briefly tried Rust | and Nim and found the former more difficult to read. I know | nothing about Zig, how would you place it between these two? | jackhalford wrote: | In general you'll find that zig is easier to read than Rust | (see the first version of this project in Rust [0]) because | it's a simpler language. For kernel programming this is even | more the case: | | * zig has native support for arbitrary-sized integers. In | Rust I used to do bitshifts; now I just have a packed struct | of u3/u5/u7 or whatever (see `src/pci/pci.zig`). Of course Rust | has a bitflags crate but I didn't find it handy; this is a | case where native support vs. library support makes a world of | difference. | | * zig has native support for the freestanding target. I used to | have to build with Xargo for cross compiling to a custom | target; I was also forced into `#![no_std]`, and some features I | was using forced me onto nightly Rust. In zig I have a | simple `build.zig` and a simple `linker.ld` script, it just | works. | | * zig has nicer pointer handling. A lot of kernel programming | is stuff that Rust considers unsafe anyway. 
It's not uncommon | to have to write lines like `unsafe { *(ptr as *const u8) }` | to deref a pointer in Rust, which is a pain because this kind | of thing happens all of the time. Also you have to play | around with mut a lot like this: `unsafe { &mut *(address as | *mut _) }`. It just felt wrong a lot of the time, where in zig | you have either `const` or `var` and that's the end of it. | | * zig is really fun to write! this is something that comes up | often in the community, after years of C it's just very | refreshing. | | Some things zig is missing: | | * Package manager, this is coming soon. [1] | | * Missing documentation for inline assembly (I think this | part is going to get overhauled, as Andrew Kelley is writing | an assembler in zig atm [2]). | | I don't know Nim, but I believe it has a garbage collector so | it could be tricky to use for kernel programming. | | [0] https://github.com/jzck/kernel-rs | | [1] https://github.com/ziglang/zig/issues/943 | | [2] https://www.youtube.com/watch?v=iWRrkuFCYXQ | sov wrote: | Your point about unsafe pointer handling in Rust is | specifically what dissuaded us from using it in an upcoming | project. It really feels bad prepending all of the code | that you actually care about being safe with `unsafe`. | guggle wrote: | Thank you. | | > I don't know Nim, but I believe it has a garbage | collector so it could be tricky to use for kernel | programming. | | You're right. Still good for libraries though (or apps, but | that may be outside of "system"). | joshbaptiste wrote: | Nim's GC is optional | Touche wrote: | How can it be optional if there is lots of code that | assumes you are using GC? For example, as far as I can | tell the stdlib doesn't do its own allocations. Does this | mean you can't use the stdlib with GC disabled? Or am I | missing something here? | alehander42 wrote: | overall if you write an os you need to write your | malloc/free... and then you should be able (i think..) 
to | use many of the gc-s anyway and more of stdlib. | | But you can also think of Nim as macro-able higher level | C and write it like that (but indeed you probably still | need this minimal allocation support) | qmmmur wrote: | A friend of mine wrote a DSL for audio using Nim with GC | off. | | https://github.com/vitreo12/omni | | I don't know enough to comment but it may be useful to | look at things in the wild. This project also heavily | relies on calling in C to interface with environments and | the SuperCollider scsynth. | alehander42 wrote: | you're right, it is confusing, but it is optional: some | toy kernels already work in nim , and with latest work on | memory, you should be able to use most of the language | for kernel development ! not the perfect language for | that yet though, but i hope we should see more nim os | examples | Touche wrote: | Are you saying that the GC is optional? If you don't use | it how do you allocate/free memory? | guggle wrote: | Didn't test this but some clues: https://nim- | lang.org/docs/gc.html | | Then I guess you would use new/dealloc | mratsim wrote: | You call malloc/free, or if working with GPU | cudaMalloc/free. You can write your own memory pool or | object pools, you can use destructors, and even implement | your own reference counting scheme. | | This is what I use for my own multithreading runtime in | Nim and the memory subsystem makes it faster and more | robust than any runtime (including OpenMP and Intel TBB) | that I've been benchmarking against, see memory subsystem | details here: | https://github.com/mratsim/weave/tree/master/weave/memory | | Example of atomic refcounting in this PR here: https://gi | thub.com/mratsim/weave/blob/025387510/weave/dataty... | | Also one important thing, Nim current GC is based on TLSF | (http://www.gii.upv.es/tlsf/) which is a memory | reclamation scheme for real-time system, it provides | provably bounded O(1) allocations. 
You can tune Nim GC | with max-pauses for latency critical applications. | Touche wrote: | Does the standard library use malloc/free or does it | depend on the GC? This is the part that's puzzling to me, | if the stdlib depends on GC then it's harder to say that | GC is optional. Technically optional but not super | practical. | nimmer wrote: | The majority of stdlib modules do not depend on GC. | | Also, the new ARC memory manager replaces GC and can run | in a kernel. | mratsim wrote: | No that's not true. | | As soon as you use sequences or strings or async you | depend on the GC. | | You can however compile with gc:destructors or gc:arc so | that those are managed by RAII. | guggle wrote: | more here: https://nim-lang.org/araq/destructors.html | alehander42 wrote: | overall i think for small os-es you can easily write a | micro libc core in C or even in Nim where you define | malloc/free etc and just use them directly as in C and | zig. | | otherwise you should be able to use something like | destructors eventually | pjmlp wrote: | > I don't know Nim, but I believe it has a garbage | collector so it could be tricky to use for kernel | programming. | | Have a look at Project Oberon, Singularity, Interlisp-D, | Smalltalk, Modula-3 Topaz, Blue Bottle (AOS), CosmosOS, | lilith (Crystal variant) given that their source is | available. | celticmusic wrote: | I looked over zig when it was first announced and really | liked the idea, it was just too fresh for me to work with. | | But I've always had a mental note to go back once it got a | bit more stable because I really do like the ideas behind | it. | | It sounds to me like it's definitely getting there and | that's awesome. | dom96 wrote: | You can certainly use Nim for kernels. I created a proof | of concept long ago, and things are only improving with ARC | fast becoming a great alternative to Nim's GC. 
| | https://github.com/dom96/nimkernel | simias wrote: | It's true that for bare metal programming Rust is quite far | from 1.0. You can't even do inline assembly in stable | rust... | | Arbitrary-sized integers do sound very convenient, although | I guess the counterargument would be that they simply can't | map to machine types so you'll have to have magic behind | the scenes which may not make complete sense in low level | languages. Still, C has bitfields in structs which are sort | of the same thing so why not. | | Pointer handling however I'm not sold on. I'd argue that | the very cumbersome pointer handling in Rust is a feature, | not a bug. It's explicitly discouraged, and for good | reasons. Map your pointer into a clean and safe reference, | slice or wrapper object ASAP and leave raw pointers to the | messy details of C FFIs and ultra low-level code. If you | end up having to mess with raw pointers everywhere in your | codebase you're doing it wrong, as far as Rust is | concerned. | | I think this is reasonable even for kernel code. After all | typically you wouldn't dereference register pointers | directly in C code because of volatility issues and making | it obvious that you're accessing hardware and not RAM | (readl/writel or similar), so having a safe wrapper adds no | overhead in my experience. | jackhalford wrote: | OK, maybe I was using pointers wrong in Rust. And I must | confess that I never really understood lifetimes either, | however | | * I don't think that low level code needs to be "messy", | and | | * I don't think that adding an abstraction on top solves | anything complexity wise. | | For memory paging I used to have a whole library in Rust | that handled a hierarchy of page tables with very abstract | pagetable classes and interfaces [0]. Now I just have 80 | lines of zig that handles everything [1]. I much prefer | the latter. | | > I think this is reasonable even for kernel code. 
| | It's not, for some of the reasons that Linus doesn't want | C++ code in the kernel [2] | | [0] https://github.com/gz/rust-x86/blob/28d5839933973e0b6 | 39ef354... | | [1] https://git.sr.ht/~jzck/kernel/tree/master/src/arch/x | 86/pagi... | | [2] https://yarchive.net/comp/linux/c++.html | nickpsecurity wrote: | You might find it helpful to look at Redox OS and Tock to | see how they're handling kernel development in Rust. May | or may not be easier than what you described. | | https://www.redox-os.org/ | | https://www.tockos.org/ | CDSlice wrote: | The Rust code you wrote seems to be extremely heavy on | boilerplate. A couple of macros could have drastically | reduced the line count. | | I don't think what Linus wrote in 2004 on C++ has much | relevance for 2020 Rust, especially for a new kernel that | doesn't have a ton of contributors. Most of his | complaints seem to be on how C++ abstractions can make it | hard to review unknown code and how 2004 C++ code bases | often had extremely messy code. Rust IMO doesn't suffer | from those problems near as much as 2004 C++ did. | | Also, Google is using Rust for part of their new Fuchsia | kernel so at least they think Rust has something good to | offer kernel dev. | roblabla wrote: | Small nitpick: Google isn't using Rust for the Fuchsia | kernel. The kernel, Zircon, is written entirely in C and | is based on Little Kernel. Google is using Rust for some | userspace drivers (Fuchsia being a micro-kernel). | loeg wrote: | > The kernel, Zircon, is written entirely in C | | No, it's written in C++[1]. They're very different | languages. | | That said, Google seems to think C++ has something to | offer to all kinds of development. It seems to work for | them, but they are a heavily C++ shop. I wouldn't read | too much out of their use of C++ anywhere; it'd be more | surprising and interesting to see them use anything else. | | [1]: https://fuchsia.googlesource.com/fuchsia/+/master/zi | rcon/ker... 
| zozbot234 wrote: | I don't think bitfields are even that portable in C? You | end up with a weird mix of bit-endianness and | byte-endianness issues. The best approach AFAICT, if you need | to model a binary format with arbitrary-sized integers, | is to keep the binary structure opaque and have a | function to "unpack" it to a record of machine-native | types. Modern compilers can easily optimize the resulting | code into shifts+bit operations, and it is a reasonably | foolproof approach. | pcwalton wrote: | The problem with C-style bitfield values/arbitrary-sized | integers is that they don't have addresses, so you can't | take references to them. This means that you need special | cases everywhere, and it increases the complexity of the | language a lot. In Rust, integers often have methods that | take self by reference, which demands an actual address. | For these reasons, in Rust I think it's better that | arbitrarily-sized integers aren't first class. They're | never really first class anyway. | AndyKelley wrote: | In Zig they do; here is an example: | https://clbin.com/HlJ0X | | Output: | `Test [1/1] test "bit field access"... *align(:3:1) u3` | `All 1 tests passed.` | | Zig's pointer type has optional metadata to describe | alignment, sub-byte offset, and SIMD vector element | index. This generally makes things just work, and you'll | get a compile error if you expected an unadorned aligned | pointer. If I were to swap | `fn deref(ptr: var) u3 {` for `fn deref(ptr: *u3) u3 {` | then I'd get: | `test.zig:13:23: error: expected type '*u3', found '*align(:3:1) u3'` | `assert(deref(&data.b) == 2);` | simias wrote: | How do you deal with atomicity though? 
A big deal in Rust | is that you can only have one mutable reference to an | object at any given time which prevents all sorts of | aliasing issues, but here if you have two mutable fat | pointers, one to a and one to b, and you attempt to | write them without synchronization you're going to have a | big problem, won't you? | | This is the main risk with bitfields in C, it makes reads | and writes that appear to be independent actually access | the same memory cell behind the scenes. I'm not convinced | by the idea of adding even more magic on top of it to be | honest. It is true that packing and unpacking bitfields | is annoying and requires some boilerplate, but in the end | that _is_ how the hardware does it anyway. | AndyKelley wrote: | The builtin functions for atomic operations provided by | the language do not allow unaligned pointers, so it would | be a compile error. | rapsey wrote: | On the other hand safe Rust code will be memory safe. | woodrowbarlow wrote: | not if you have to keep using unsafe blocks. | cies wrote: | As an ancestor in this thread pointed out: it's | hard/impossible to go without unsafe in a kernel. | | Proof (and they have a policy to only use unsafe when not | otherwise possible): | | https://github.com/redox-os/kernel/search?q=unsafe&unscoped_... | Cyph0n wrote: | I was expecting way more "unsafe" uses than that! | | I see two advantages to explicitly marking code as | unsafe: | | (1) You are very clearly marking the areas of code that | are most suspect | | (2) You can still have safe code outside of those | "unsafe" blocks | woodrowbarlow wrote: | yes, that's the point i'm making. | | saying "if you write your kernel in rust, it will be | memory safe" is a little goofy, because writing kernels | in rust requires you to drop into unsafe blocks way too | often. 
| bluejekyll wrote: | Not as often as you would think; Philipp Oppermann has | shown this: https://os.phil-opp.com/ | | Unsafe is definitely used, but it does not mean what I | think is being implied, which is that it makes the | software unsafe. | rapsey wrote: | The amount of unsafe blocks is pretty small compared to | the rest of the code. Which means those parts can be | reviewed more extensively. If anything your argument is | pretty goofy. | DSMan195276 wrote: | > The amount of unsafe blocks is pretty small compared | to the rest of the code. Which means those parts can be | reviewed more extensively. If anything your argument is | pretty goofy. | | I've talked about this at length before, but just because | you have 'less' `unsafe` code does not mean your code is | any better off. Very few operations are actually | `unsafe`, and you're still not allowed to break any of | the (poorly or not documented) constraints that Rust | requires even in `unsafe` code. | | Point being, unless you can make "safe" abstractions | around your `unsafe` code that cannot be broken (which is | largely what you do in any language anyway, when you | can...), the distinction between "safe" and "unsafe" is | thin, because the parts of the code you declare `unsafe` | are not the parts that are actually going to have the | bugs. And because the constraints you are required to | fulfill in `unsafe` code are unclear, there's little way | to guarantee your code isn't broken in some subtle way | the optimizer might jump on (either now, or in a later | release). | | And to be clear, I _like_ Rust, but `unsafe` is poorly | thought-out IMO. | zozbot234 wrote: | If you _aren 't_ making a safe abstraction around your | unsafe code, you're supposed to mark the surrounding | function "unsafe" as well, and document its expected | constraints in a 'Safety' comment block. 
In fact, you're | supposed to do the same for _any_ piece of code that | cannot provide safety guarantees about its use of | 'unsafe' features. 'Plain' unsafe{ } blocks should only | be used as part of building a safe abstraction. | DSMan195276 wrote: | > If you aren't making a safe abstraction around your | unsafe code, you're supposed to mark the surrounding | function "unsafe" as well, and document its expected | constraints in a 'Safety' comment block. | | Sure, but now we're back to the issue that it's unclear | what constraints `unsafe` code actually has to hold, | meaning ensuring your abstraction is safe is just about | impossible to do with certainty (And OS kernel code is | definitely going to hit on a lot of those corner cases). | You may think you have a safe interface, only to find out | later you're relying on internal details that aren't | guaranteed to stay the same between compiler versions or | during optimizations. | | With that said, while I agree with what you're proposing | about marking surrounding code "unsafe", it leads to lots | of strange cases and most people get it wrong or will | even disagree with this proposal completely. For example, | it can lead to cases where you mark a function `unsafe` | even though it contains nothing but "safe" code. And at | that point, it's up to you to determine if something is | actually "safe" or "unsafe", meaning the markings of | "safe" and "unsafe" just become arbitrary choices based | on what you think "unsafe" means, rather than marking | something that actually does one of the "unsafe" | operations. | | One of the best examples is pointer arithmetic. It's | explicitly a safe operation, but it's also the first | things most people would identify as being "unsafe", even | though it is dereferencing that is the "unsafe" | operation. Ex. 
You could easily write | slice::from_raw_parts without using any `unsafe` code at | all, it just puts a pointer and length together into a | structure (it doesn't even need to do arithmetic!). It's | only marked `unsafe` because it can break _other pieces | of "safe" code_ that it doesn't use, but it itself is | 100% safe. You could just as easily argue the "other | code" should be the `unsafe` code, since it's what will | actually break if you use it incorrectly. | | Perhaps the biggest annoyance I have is that the official | Rust documentation pretty much goes against the idea you | presented, saying | | > People are fallible, and mistakes will happen, but by | requiring these four unsafe operations to be inside | blocks annotated with unsafe you'll know that any errors | related to memory safety _must be within an unsafe | block_. Keep unsafe blocks small; you'll be thankful | later when you investigate memory bugs. | | Which is just incorrect - memory safety issues are likely | to be due to the "safe" code surrounding your unsafe code | - when you're writing C code, the bug isn't that you | dereferenced NULL or an OOB pointer, the bug is the code | that gave you that pointer in the first place, and in | Rust that code is likely to all be "safe". Point being, | most of the advice on keeping your unsafe blocks small | just leads to people making silly APIs that can be broken | via the safe API they wrap (even in weird ways, like | calling a method containing only safe code), and | unfortunately there are lots of subtle ways you can | unintentionally break your Rust code that most people | aren't going to have any idea about. | swsieber wrote: | It doesn't mean your code is better off, but it makes | auditing your code a lot easier. Rust gives you lots of | good tools so that you can fence in your unsafe and | restrict it to a module (source code file), such that if | you audit the module containing the unsafe, you should be | good to go. 
| phendrenad2 wrote: | What I find is that, while Rust unquestionably improves | the security of software, the argument that you can | minimize unsafe code and thoroughly inspect it doesn't | tell the full story, because in my experience people end | up writing general-purpose unsafe blocks, and while they | are controlled by the "safe" outside, the sequence of | actions the "safe" code tells the "unsafe" code to | execute can lead to subtle bugs. | celticmusic wrote: | yep, 100% | | I personally think the Rust community has missed the | biggest advantage of unsafe, and that's tooling. Imagine | an IDE that could highlight and warn when touching state | that is accessed in unsafe blocks, or when state being | accessed in unsafe blocks isn't being properly tested. | | Or tools that kick off notification for a required code | review because code that touches state used by an unsafe | block got changed. | | To me, these sorts of things will have a far greater | effect on security and stability than just the existence | of the unsafe block in code. | steveklabnik wrote: | Those tools exist or are being worked on! Servo (last I | checked) uses a "unsafe code was added or modified in | this PR, please review extra carefully" bot, and Miri is | on its way to becoming a fantastic tool. | celticmusic wrote: | That's good to know. | | The Rust community can be a bit... fanatical, let's say. | I've seen so many people argue that the unsafe keyword by | itself makes rust so much safer than alternatives, when | really safe code can cause unsafe code to explode. So you | end up writing modules to protect the unsafe code, which | is what you do in any other language as well. | | Which means the unsafe keyword is valuable, just not | nearly as valuable as I've seen a lot of people claim. | | Now the unsafe keyword combined with the sort of tooling | I've mentioned? That to me is a _killer_ combination. 
| Unsafe isn't a silver bullet, it still requires work, | but it enables tools to make that work immensely easier | to deal with. | Cyph0n wrote: | But isn't the most important step having a compiler- | enforced keyword for code that seems to be doing unsafe | stuff? | | The rest are just plugins and tools that look for said | blocks and "do stuff". I don't see why these tools should | be part of the language. | celticmusic wrote: | I never said they should be part of the language, I said | the rust community over-emphasizes the "safe" part of | having the unsafe keyword and neglected the part that's | actually useful in keeping things safe. | | Another poster responded to me stating they are starting | to work on those tools, so it looks like the rust | community is starting to come around. | | It's just that I've seen a lot of people tout the safety | of rust as if the unsafe keyword was a panacea that | automatically prevented problems. That attitude is really | what I was responding to. | Cyph0n wrote: | Ah, gotcha. Yes, improved tooling in this area would add | a lot of value to the feature. | pjmlp wrote: | As proven by the Oberon and Xerox PARC workstations, the | amount of unsafe code is quite minimal. | | Source code is available. | celticmusic wrote: | The value is in knowing where the unsafe operations are. | | It could be argued that's not a huge value gain since | you're so far down the stack, but that's the value of the | unsafe keyword. | gameswithgo wrote: | the million dollar question is what % of your code has to | be unsafe in a kernel. At some %, it ends up not being | worth the trouble. But if the % can be kept relatively | low (say 5% or less) then you get a lot of value out of | minimizing the surface area of the dangerous code. | nickpsecurity wrote: | Unsafe blocks don't mean it's unsafe: just that the | language's safety scheme can't prove anything about that | particular block. 
You prove each one safe externally, with | Rust's safety handling everything else (i.e. the majority of | the code). | | You can also go further like I did here: | | https://news.ycombinator.com/item?id=21840431 | | You're still worrying about a smaller portion of your | code. | nimmer wrote: | Same for many other languages... as far as memory safety | goes, because there are performance tradeoffs in a kernel. | pron wrote: | It's Zig's goal to make it easy to write safe code, it's | just that its approach to safety is very different from | Rust's. | UncleOxidant wrote: | > * zig has native support for arbitrary sized integers | | That's really cool. | bogomipz wrote: | >"zig has native support for arbitrary sized integers." | | I apologize if this is a naive question. Might you or | someone else elaborate on where arbitrary sized integers | are used or necessary in kernel programming? Don't they all | ultimately need to get padded out for the CPU registers to | work with them anyway? | eli_gottlieb wrote: | Can Zig's enums and tagged unions be used to write | algebraic data types? | netgusto wrote: | Bouncing off the recently featured HN thread "Hello World": | https://drewdevault.com/2020/01/04/Slow.html (and the HN | thread: https://news.ycombinator.com/item?id=21954886) | | According to this resource, Zig produces code that is very | close to the hand written assembly for the simple use case of | outputting "hello world" on stdout. | | This is to be taken with a grain of salt though, as of | course, caring about the assembly output is a spectacular | case of premature optimization. I guess it tells a bit about | the goal of Zig as a low-level programming language though. | jessermeyer wrote: | For a language that _competes_ with C, caring about how | semantics map to hardware instructions is absolutely within | the domain of concern. 
It may not be a top priority, but if | you _ignore_ it too long, you'll make uninformed high level | decisions that prevent entire classes of important low | level optimizations without extensive re-thinking. So just | do the thinking upfront and save yourself and your | community the hassle. | nine_k wrote: | Assembly output is important when you are trying to | understand an exploit, or to make sure none can be | produced. | | Can be important for kernel stuff. | jackhalford wrote: | > caring about the assembly output is a spectacular case of | premature optimization | | I don't agree! at least for operating systems. | | Consider that some operating system code may be run a | million times per second, on a million different machines | (e.g: block system IO on linux). We very much want our | assembly to be pristine in this case. | | I also like the idea of "optimality" brought forward by | zig. In the post you link, there's an ideal hello_world in | asm; can we have a higher level language that doesn't | sacrifice this ideal achieved by assembly? | | Consider that some network cards are now capable of | 400 Gbps, and modern OSes are not capable of handling these | line rates. I strongly believe the bottleneck should be in | the hardware; if the software can't max out your hardware | then your software has failed. | | I like that zig focuses on optimality. | antpls wrote: | > Consider that some operating system code may be run a | million times per second, on a million different machines | (e.g: block system IO on linux). We very much want our | assembly to be pristine in this case. | | I would believe that most of the time of a processor is | dedicated to running userspace code, not OS code (relative | to the work that must be done). At the end of the day, | the OS is "only" a scheduler to share hardware resources | between unrelated tasks. | | > I like that zig focuses on optimality. | | Optimality isn't required in many real world businesses. 
| Optimality is often a tradeoff: with optimality you lose | flexibility. An optimal program written for a SISD instruction | set isn't optimal anymore when SIMD instructions are | introduced. Your "optimal" asm program is still optimal | on x86, but also totally obsolete because it cannot use | the latest NEON instructions from ARM or 64-bit | instructions. | klyrs wrote: | > ... caring about the assembly output is a spectacular | case of premature optimization. | | This is a hobgoblin. Assembly output of a compiler is the | very baseline of performance. It requires no effort on | the part of the programmer (aside from learning a language). | | Now, if you're skimming the assembly after every | compilation and, say, fuzzing your implementation to coax | the compiler to emit the best possible assembly... that's | probably premature. If you're writing inline assembly | before doing a higher level implementation, that's probably | premature. | | But choosing a language on the basis of its | performance:effort ratio is downright pragmatic. | apta wrote: | I find Zig's approach to generics to be very cool. Types are | a first class citizen, so you basically get generics out of | the box. Combined with Zig's `comptime`, this makes for writing | very cool code. It solves a lot of things you'd need macros | in other languages to do. | jackhalford wrote: | This exactly: the language doesn't know about generics, but | having first class types + comptime allows you to do | generics. Writing C feels like a chore now. | simias wrote: | I'm curious too, I'm quite familiar with Rust and never | written any Zig in my life, so I went digging through the | source and I find the syntax remarkably similar for the most | part. The only thing that stood out is that apparently you | can drop the braces for single-line `if` bodies like in C, | whereas Rust makes them always mandatory, but I'm firmly on | Rust's side on this one.
| | The part where Rust can get really messy is when you involve | generic programming and traits, especially when lifetimes are | involved, but I couldn't find any obviously generic code in | this codebase. | | EDIT: after digging a bit deeper into Zig it looks like it | uses C++-style duck typing for metaprogramming instead of a | trait-based approach like Rust? It definitely removes some | overhead to writing generic code but I'm not sure about | readability and maintainability... But I'm going to stop here | because I'm about to go full fanboy for Rust. | swiley wrote: | If you want more syntax weirdness: tab characters are | illegal. | simias wrote: | I don't mind opinionated coding style (my Rust | integration scripts all enforce that stock "rust clippy" | doesn't return any error) and I do think that using tabs | for indentation simply doesn't work in practice | regardless of how great they are in theory, because hardly | anybody uses them correctly (including the vast majority | of code editors by default). It might be strange to bake | it straight into the compiler but I don't mind it. | | That being said it does make it even weirder to allow | braceless ifs IMO, but that's bikeshedding. | clarry wrote: | > I do think that using tabs for indentation simply | doesn't work in practice | | It has worked just fine for decades for projects with | millions of lines of code. | | It might not work for babby's first patch and how do I | configure an IDE?? | simias wrote: | Depends on your definition of "fine". The Linux kernel | works fine indenting with tabs, but it mandates 8-space | tabs and a lot of code ends up looking borderline | unreadable if you use anything else, meaning that the | only thing they get by using tabs is slightly smaller | file sizes. Besides, if you want to enforce the 80-column | limit you have to standardize a tab width anyway. | | It's simply not worth it IMO.
The pros are simply too | small and inconsequential to justify not using spaces on | new projects. Although now that automatic code formatters | are becoming ubiquitous it really doesn't matter what you | use in your editor, I suppose. | [deleted] | chungus_khan wrote: | > including the vast majority of code editors by default | | Which ones exactly? Not really a problem I've encountered | often except when someone tries to mix both spaces and | tabs, and in general editors are built with the existence | of this very common character in mind. People tend to hit | the tab key to indent anyway, and one tab char meaning | one level of indent is perfectly intuitive and allows | users to individually configure how large they want their | indents to be. | simias wrote: | Many editors (including the venerable Vim and Emacs) | indent _and_ align with tabs by default, which means that | if you want your code to look right you standardize tab | width, which in turn removes one of the only (meager) | advantages tabs have over spaces: configuring the | indentation width to match your personal taste. | | >Not really a problem I've encountered often except when | someone tries to mix both spaces and tabs | | Mixing spaces and tabs for indentation and alignment _is_ | how it should be done if you want things to remain | aligned when you change the tab width. | Arnavion wrote: | >Mixing spaces and tabs for indentation and alignment _is_ | how it should be done if you want things to remain | aligned when you change the tab width. | | How it should be done is to never write code that needs | alignment if you're using tabs for indentation. | woodrowbarlow wrote: | the problem i encounter is when you try to break a long | line into multiple lines. if you want to use tabs and | align the continuation, you _should_ be mixing spaces and | tabs. | | for example ('-' is tab, '.'
is space): | --function_with_lots_of_arguments(arg1, arg2, arg3, | --................................arg4, arg5, arg6); | | it can be done, but a lot of editors get it wrong and it | requires paying attention to the whitespace. | | of course, another style would be to just indent the | continuation twice, without aligning it. (i personally | prefer to align continuations.) | Mawr wrote: | Some food for thought: https://youtu.be/ZsHMHukIlJY?t=633 | | For example, this is the best way to define functions | with argument lists long enough not to fit on a single | line: fn doThing( argument1, | argument2, argument3, ) { | <code> } | | Visually clear, doesn't have any spaces/tabs issues, | produces minimal diffs when adding/removing/renaming | arguments. | cgh wrote: | Really? I might have to take a look at this language | then. Making tabs illegal whitespace might very well be a | proxy for other good design decisions. | dnautics wrote: | It's linted out iirc. Is it really strange to mandate | that code look explicit and have no ambiguous whitespace? | BubRoss wrote: | Even weirder than that, it intentionally fails on \r\n | newline, so windows text files straight up don't work by | default. | Bekwnn wrote: | Because the language is attempting to have a standardized | formatter. I'm all for it, personally. | wtetzner wrote: | I'm assuming it's only illegal as whitespace? In other | words, they're allowed in string literals, right? | rhodysurf wrote: | Correct | dnautics wrote: | It uses "comptime" which, loosely, is a compile time | version of the language. I would say it's a huge | improvement over C in that quite frankly string | interpolation for a preprocessor is terrible. I can't | compare to rust since my brain can't parse rust syntax; too | many bells and whistles. | pron wrote: | I wouldn't call Zig's comptime "C++-style." Unlike Rust, | there's very little in Zig that is borrowed from C++. 
Zig's | error reporting and comptime make it easy to write | arbitrary compile-time checks, so Zig uses a single | construct and keyword, comptime, to replace all special | instances of partial evaluation: generics, concepts/traits, | value templates, macros and constexprs. | | The main difference between Zig and Rust is a _huge_ | disparity in language complexity. Zig is a language that | can be fully learned in a day or two. Rust has the same | philosophy of "zero-cost abstraction" as C++, i.e. | spending a lot of complexity budget to make a | low-abstraction language appear as if it has high abstraction. | Zig, like C, does not try to give the illusion of | abstraction. | | There is also the difference in their approach to safety, | but that's a complicated subject that ultimately boils down | to an empirical question -- which approach is safer? -- | which we don't have the requisite data to answer. | pcwalton wrote: | > There is also the difference in their approach to | safety, but that's a complicated subject that ultimately | boils down to an empirical question -- which approach is | safer? -- which we don't have the requisite data to | answer. | | We do have the requisite data to answer whether | preventing use-after-free is better than not preventing | it. | | You can argue (not successfully, in my opinion) that it's | not worth the loss in productivity to prevent UAF and | other memory safety issues, but it's impossible to argue | that not _trying_ to prevent UAF is somehow _safer_. | abainbridge wrote: | > but it's impossible to argue that not trying to prevent | UAF is somehow safer. | | It might be possible. Someone might prove that preventing | UAF necessarily increases complexity somewhere else. Like | one of those laws of thermodynamics, but for software. | pron wrote: | > We do have the requisite data to answer whether | preventing use-after-free is better than not preventing | it. | | I expect Zig will prevent use-after-free.
It will be | sound for safe code and unsound for unsafe code (by | turning this on only in debug mode for testing). | | > but it's impossible to argue that not trying to prevent | UAF is somehow safer. | | First, see above. Second, it is not only possible but | even reasonable to argue that not trying to completely | eliminate a certain error is very much safer. The reason | is that soundness has a non-trivial cost which can very | often be traded for an unsound reduction in a larger | class of bugs. As an example, instead of soundly | eliminating bugs of kind A, reducing bugs of kinds A, B | and C -- for a similar cost -- may well be safer. | | There has been little evidence to settle whether sound | elimination of bugs results in more correctness than | unsound reduction of bugs or vice-versa, and it's a | subject of debate in software correctness research. | pcwalton wrote: | It's not interesting to talk about a system that might or | might not exist in the future. (There is a word for | that--"vaporware".) The point is that Zig doesn't even | try to prevent UAF now, so you can't say that it's safer | than languages that do prevent the problem. | | > As an example, instead of soundly eliminating bugs of | kind A, reducing bugs of kinds A, B and C -- for a | similar cost -- may well be safer. | | Hasn't this essentially been what C++ has been trying for | memory safety for decades, without success? The C++ | approach has been "smart pointers are good enough, and | they prevent several other problems too", and the | experience of web browsers (among others) has pretty much | definitively shown: no, they really aren't. For memory | safety, I would not bet on this approach. | pron wrote: | > The point is that Zig doesn't even try to prevent UAF | now | | I wouldn't say Zig exists at all right now, but just as | it strives to one day be production-ready, it strives to | prevent use-after-free. Safety is a stated goal for the | language. 
| | > Hasn't this essentially been what C++ has been trying | for memory safety for decades, without success? | | No. I'm talking about a mechanism that can detect various | errors at runtime, and is turned on or off for various | pieces of code and/or for all code at various stages of | development. Rust, BTW, doesn't entirely guarantee | memory-safety, either, when any unproven unsafe code is | used, and even when it isn't (e.g., have you proven | LLVM's correctness?). We always make some compromises on | soundness; the question is where the sweet-spots are. | | Software correctness is one area where there are no easy | answers and very few obvious ones. | pcwalton wrote: | > No. I'm talking about a mechanism that can detect | various errors at runtime, and is turned on or off for | various pieces of code and/or for all code at various | stages of development | | What you are describing exists: ASan. We have a pretty | good answer to the question "is ASan sufficient to | prevent memory safety problems in practice": "no, not | really". | | > Rust, BTW, doesn't entirely guarantee memory-safety, | either, when any unproven unsafe code is used, and even | when it isn't (e.g., have you proven LLVM's | correctness?). We always make some compromises on | soundness; the question is where the sweet-spots are. | | Empirically, Rust's approach has resulted in far fewer | memory safety problems than previous approaches like | smart pointers and ASan, with only garbage collectors | (and restrictive languages with no allocation at all) | having similar success in practice. Notice that the | working approaches have something important in common: a | strong system that, given certain assumptions, guarantees | the lack of memory safety problems. Even though those | assumptions are never quite satisfied in practice, | empirically having those theoretical guarantees seems | important. 
It separates systems that drastically reduce | safety problems, such as Rust and GC languages, from | those that do so less well, such as ASan and smart | pointers. This is why I'm so skeptical of just piling on | more mitigations: they're helpful, but we've been piling | on mitigations for decades and UAF (for instance) is | still as big a problem as ever. | pron wrote: | > We have a pretty good answer to the question "is ASan | sufficient to prevent memory safety problems in practice" | | That is not the question we're interested in answering, | and elimination of all memory errors is no one's ultimate | goal, certainly not at any cost. By definition, unsound | techniques will let some errors through. The question is | which approach leads to an overall safer program for a | given effort, and soundness (of properties of interest) | _always_ comes at a cost. | | (also, Zig catches various overflow errors better than | ASan) | | > Empirically, Rust's approach has resulted in far fewer | memory safety problems than previous approaches like | smart pointers | | I don't doubt that, and if minimization of memory errors | was programmers' primary concern (even in the scope of | program correctness or even just security), there would | be little doubt that Rust's approach is better. | | As someone who currently mostly programs in C++, lack of | memory safety barely makes my top three concerns. My #1 | problem with C++ is that the language is far too complex | for (my) comfort, where by "complex" I mean it requires too | much effort to read and write. That, and build times, | have a bigger impact on the correctness of the programs I | write than the lack of sound memory safety. Would I be | happier if, for a similar cost, I could eliminate all | memory safety errors? Sure, which is why, if C++ and Rust | were the only low-level languages in existence, I'd | rather people used Rust.
But I would be happier still if | I could solve the first two, and also get some better | safety as a cherry-on-top. Memory safety is similarly not | the main, and certainly not the only, reason I use | languages that have a (tracing) GC when I use them. | pron wrote: | P.S. | | > Notice that the working approaches have something | important in common: a strong system that, given certain | assumptions, guarantees the lack of memory safety | problems. | | That's a _very_ good point and I'm not arguing against | it. It's just that even if it's true -- and I'm more than | willing to concede that it is -- it still doesn't answer | the question, which is: what is the best approach to | achieving a required level of correctness? | | The "soundness" approach says, let's guarantee, with some | caveats, certain technical correctness properties that we | can guarantee at some reasonable cost. The problem is | that that cost is still not zero, and my hypothesis is | that it's not negligible. My personal perspective is that | Rust might be sacrificing too much for that, but that's | not even what has disappointed me with Rust the most. I | think -- and I could be wrong -- that Rust sacrifices | more than it has to just to achieve that soundness, by | also paying for "zero-cost abstractions," which, for my | taste, is repeating C++'s biggest mistake, namely | sacrificing complexity for the _appearance_ of high-level | abstraction that may look convincing when you _read_ the | finished code (perhaps more convincing in Rust than in | C++), but falls apart when you try to change it. Once you | try to change the code you're faced with the reality | that low-level languages have low abstraction; i.e. they | expose their technical details, whether it's through code | -- as in C and Zig -- or through the type system, as in | Rust. Zig says, since the abstraction in low-level | languages is low anyway (i.e.
we cannot really hide | technical details) there is little reason to pay in | complexity for so-called zero-cost abstractions. | | Language simplicity goes a long way, even as far as | _sound formal verification_ is concerned. For example, | there are existing sound static analysis tools that can | guarantee no UB for C -- but not the complete C++, AFAIK | -- with relatively little effort. It's not yet clear to | me whether Zig, with its comptime, is simple enough for | that, though. | | It is my great interest in software correctness, together | with my personal aesthetic preferences, that has made me | dislike language complexity so much and made me a | believer in "when in doubt -- leave it out." | mntmoss wrote: | I would like to chime in by noting that Rust's mature | form is borne of a very specific scenario, which is the | web browser, software so complex that the "zero-cost" | element is more like "actually possible to optimize" in | practice. And a great deal of that complexity is | accidental in some form, a result of accreted layers. | | And in that respect, it's not really the kind of software | anyone needs to aspire to; aspiring to write programs | simple enough that Zig will do the job is much more | palatable. | pron wrote: | I don't take issue with the "zero-cost" part -- Zig and | C, like every low-level language, have that -- but with | the non-abstraction-"abstraction" part, which is rather | unique to C++ and Rust. Rust has become a modern take on | C++, and I'm not sure it _had_ to be that for the sake of | safety; I think it became that because of what you said: | it was designed to replace C++ in a certain application | with certain requirements. It's probably an improvement | over C++, but, having never been a big fan of C++, it's | not what _I_ want from a modern systems programming | language. It seems to me that Rust tries to answer the | question "how can we make C++ better?"
while Zig tries | to answer the question "how can we make systems | programming better?" | | Of course, Zig has an unfair advantage here in that it is | not production-ready yet, and so it's not really "out | there," and doesn't have to carry the burden of any real | software (there's very little software that Rust carries, | but it's still much more than Zig). I admit that when | Rust was at that state I had the same hopes for Rust as I | do now for Zig, so Zig might yet disappoint. | pcwalton wrote: | > I think -- and I could be wrong -- that Rust sacrifices | more than it has to just to achieve that soundness, by | also paying for "zero-cost abstractions," which, for my | taste, is repeating C++'s biggest mistake, namely | sacrificing complexity for the appearance of high-level | abstraction that may look convincing when you read the | finished code (perhaps more convincing in Rust than in | C++), but falls apart when you try to change it. | | The argument here seems to be that there can be no | real abstraction in low-level languages, so there's no | point providing language features for abstraction. The | premise seems clearly false to me, because even C has | plenty of abstraction. Functions are abstractions. | Private symbols are abstractions. Even local variables | are abstractions (over the stack vs. registers). | | People often argue that Rust is too complicated for its | goal of memory safety. It's easy to say that, but it's a | lot harder to list specific features that Rust has that | shouldn't be there. In fact, as far as I'm concerned Rust | is an exercise in _minimal_ language design, as the | development of Rust from 0.6-1.0 makes clear (features | were being thrown out left and right). Most of the | features that look like they're there solely to support | "zero-cost abstractions"--traits, for example--are really | needed to achieve memory safety too.
For instance, Deref | is central to the concept of smart pointers, and, without | smart pointers, users would have to manually write | Arc/Rc/Box in unsafe code every time they wanted to | heap-allocate something. | | > Language simplicity goes a long way, even as far as | sound formal verification is concerned. For example, | there are existing sound static analysis tools that can | guarantee no UB for C -- but not the complete C++, AFAIK | -- with relatively little effort. It's not yet clear to | me whether Zig, with its comptime, is simple enough for | that, though. | | The most important static analyzers used in industry | today are Clang's sanitizers, which work on both C and | C++. The most important such sanitizers actually work at | the LLVM level, which means they work on Rust as well | [1]! The days of having to write a compiler frontend for | static analysis are long gone. We have excellent shared | compiler infrastructure that makes it easy to write | instrumentation that targets many low-level languages at | once. (Even in the world of C, this is necessary. Plain | old C99 is an increasingly marginal language, because the | really important code, such as the Windows and Linux kernels, | is written in compiler-specific dialects of C, which | means that a static analysis tool that isn't integrated | with some popular compiler infrastructure will have | limited usefulness anyway.) | | > It is my great interest in software correctness, | together with my personal aesthetic preferences, that has | made me dislike language complexity so much and made me a | believer in "when in doubt -- leave it out." | | Again: easy to say, harder to specify specific Rust | features you think should be removed. | | [1]: https://github.com/japaric/rust-san | pron wrote: | > The argument here seems to be that there can be no | real abstraction in low-level languages, so there's no | point providing language features for abstraction.
| | My argument is that low-level languages allow for _low_ | abstraction, i.e. there's little that they can abstract | over, where by abstraction I mean hide internal | implementation details in a way that when they change the | consumer of the construct, or "abstraction", does not | need to change; if it does, then the construct is not an | abstraction. | | > People often argue that Rust is too complicated for its | goal of memory safety. | | I didn't know people often say that. I said it, and I'm | not at all sure that's the case. I think that Rust pays | far too heavy a price in complexity. It's too heavy for | my taste whether or not it's all necessary for sound | memory safety, but if it isn't, all the more the shame. | | > Most of the features that look like they're there | solely to support "zero-cost abstractions"--traits, for | example--are really needed to achieve memory safety too. | | OK, so I'll take your word for it and not say that again. | | > The most important static analyzers used in industry | today are Clang's sanitizers | | I'm talking about sound static analysis tools, like | Trust-in-Soft, that can _guarantee_ no UB in C code. I | think that particular tool might support some subset of | C++, but not all of it. The sanitizers you mention use | concrete interpretation (aka "dynamic") and are, | therefore, usually unsound. Sound static analysis | requires abstract interpretation, of which type checking | is a special case. Just as you can't make all of Rust's | guarantees by running Rust's type-checker on LLVM | bytecode, so too you cannot run today's most powerful | static analysis tools -- that are already strong enough | to absolutely guarantee no UB in C at little cost -- on | LLVM bytecode; they require a higher-level language. | Don't know about tomorrow's tools. | | > Again: easy to say, harder to specify specific Rust | features you think should be removed. | | I accept your claim.
In general, I don't like to isolate | language features; it's the gestalt that matters, and | it's possible that once Rust committed to sound memory | safety everything else followed. But let me just ask: are | macros absolutely essential? | apta wrote: | > As someone who currently mostly programs in C++, lack | of memory safety barely makes my top three concerns. | | Isn't having a UB-free JVM a noteworthy goal though? | Especially if it gets used in life-critical systems such | as avionics or autonomous cars. | pron wrote: | UB-freeness is not a goal in-and-of-itself. It's | shorthand for a certain kind of technical (i.e. | non-functional) correctness, which, in turn, is related in | some ways to functional correctness, and it's improving | functional correctness (and I include security here) | that's the goal. Is the most effective way to achieve | that by working to completely eliminate undefined | behavior? I'm not at all sure. | eeZah7Ux wrote: | > Rust, BTW, doesn't entirely guarantee memory-safety, | either | | > We always make some compromises on soundness; the | question is where the sweet-spots are. | | Excellent point. There are complex tradeoffs and the | "rust is safe" slogan is just a slogan. | Touche wrote: | Just for clarity for anyone reading, the Zig author does | not claim that Zig is safe and has in fact said that it is | unsafe. Could change in the future, but there's no denial | about what it is today. | pron wrote: | There is a difference between safe _code_, which is the | goal, and a safe _language_ (that's a statement the | language makes on sound safety guarantees). Using a safe | language is definitely _one_ way to write safe code, but | it is not necessarily always the best way, and it's | certainly not the only way. Zig is not meant to ever be a | safe language, but it is very much intended to be a | language that helps write safe code.
That is what I meant | when I said that the two languages have a very different | approach to safety. | gw wrote: | comptime does not replace macros though...there is no way | to manipulate AST in zig, nor will there ever be, | according to the author. | pron wrote: | Well, it is intentionally weaker than macros (and I agree | with Zig's designer that that's a very good thing, though | it is a matter of taste), but it does replace many of the | cases where in Rust you'd have to use macros (or the | preprocessor in C/C++). So it replaces macros everywhere | where it deems their usage reasonable. | gw wrote: | It is a legit design choice, but it does detract from | your comment about language complexity. Not having AST | macros inherently adds complexity to a language by | requiring features to be built into the compiler rather | than be implemented as libraries. | pron wrote: | Those are different kinds of complexity. You're talking | about the effort required by the implementor of the | compiler. I'm talking about the effort required by the | programmer using the language. | gw wrote: | Then you aren't talking about complexity (an objective | quality), you are talking about difficulty to read (a | subjective quality relative to the reader). There is no | doubt that macros can make a given piece of code harder | to read, if the reader is unfamiliar with the macro being | used. Complexity describes how intertwined different | pieces of something are internally, which has nothing to | do with a given vantage point. | pron wrote: | No, that's just how Rich Hickey describes complexity; | it's hardly a universal definition. For example, in | computer science, the complexity of a task is often a | measure of the effort, in time or memory, required to | perform it. | gw wrote: | English is certainly not free of ambiguity, but in the | past i've seen you heavily emphasize precision in word | choice, so it's surprising to see you de-emphasize it | here. 
Nobody is a final arbiter of definitions, but the | distinction i'm making is not a trivial one, and Hickey | isn't the only one to have made it. Even thinking about | it colloquially, how often do you follow the word | "complex" with an infinitive verb describing an action? A | rube goldberg machine is complex...and hard to build! | pron wrote: | I'm not deemphasizing it, I'm just saying that we're | talking about different meanings of "complexity" here and | there is no well-accepted definition. My "complexity" | refers to the effort required by the programmer when | understanding programs written in the language. | gw wrote: | Fair enough, ron. I won't belabor it further. I'll just | leave this: Long ago, after coming across a very useful | distinction between the words "practical" and | "pragmatic," i intentionally changed my usage of those | words as a result. Not because a charismatic person told | me to, but because it was useful. If a distinction is | useful to make, start making it, my man! | AnIdiotOnTheNet wrote: | Which is the right call really. It is valuable to have | the code you're looking at actually be what it appears to | be. | mratsim wrote: | I don't agree. | | For many domains, being able to implement a | domain-specific language with a set of expressive rules for the | domain is an incredible productivity boost and also | prevents many mistakes because you cannot represent them. | | Not being able to manipulate the AST means that you are | restricted on the embedded DSLs that you can provide. And | embedded DSLs encompass code generation for: | | - state machines | | - parsing grammars (PEGs for example) | | - shader languages | | - numerical computing (i.e. having more math-like syntax | to manipulate indices) | | - deep learning (neural network DSL) | | - HTML templates / generators | | - monad composition | | There is a reason most people are not building in | Assembly anymore; there is a right level of abstraction | for every domain.
A language that provides building blocks | for domain experts to build up to the right level of | abstraction is very valuable. | [deleted] | pron wrote: | The question is, as always, at what cost? | | Low-level languages (aka "systems programming" languages) | already suffer from various constraints that increase | their accidental complexity. Is it really necessary to | complicate _those particular languages_ further to | support embedded DSLs? | | I don't think there's a universal right or wrong answer | here, but there is certainly a big question. | gw wrote: | Ironically, the lack of macros leads to an explosion of | ad-hoc, extra-language DSLs. Look at rules engines, for | example. In Drools (java), you have to write rules with a | special language, DRL. Meanwhile in Clara (clojure) you | write your rules in clojure. Macros simplify languages, | they don't complicate them. | simias wrote: | It's always a trade-off and placing the cursor correctly | is tricky. Things like macros, operator overloading, | metaprogramming, virtual calls, exceptions or even | function pointers are effectively "obfuscating" code by | having non-obvious side effects if you don't have the | full context. On the other hand if you push the idea too | far in the other direction you end up with basically | assembly, where you have an ultra-explicit sequence of | instructions to execute. | | It's very easy to come up with examples of terrible abuse | of these features that lead to bad code (like for | instance if somebody was insane enough to overload the | binary shift operator << to, I don't know, write to an | output or something) but it also gives a lot of power to | write concise and expressive code. | curtisf wrote: | You can't manipulate ASTs (i.e., Zig code), but nothing | stops you from parsing string literals however you want. | | For example, I wrote a PEG-like parser combinator library | in Zig.
Using it currently [looks like this](https://github.com/CurtisFenner/zsmol/blob/87de4c77dd8543011...). | However, as a library, I ^could provide a function that | looks like
|
|     pub const UnionDefinition = comb.fromLiteral(
|         \\ _: KeyUnion
|         \\ union_name: TypeIden
|         \\ generics: Generics?
|         \\ implements: Implements?
|         \\ _: PuncCurlyOpen
|         \\ fields: Field*
|         \\ members: FunctionDef*
|         \\ _: PuncCurlyClose
|     );
|
| etc. But I find reading the code as it is good enough | for now that I didn't want to spend time implementing | such a library. | | [^]: Being able to create brand-new types at `comptime` | isn't [yet | implemented](https://github.com/ziglang/zig/issues/383), | so this can't _quite_ be done yet, though you could fake | it with `get`/`set` methods instead of real fields. | sobeston wrote: | Traits are a part of the stdlib (under std.meta) as opposed | to the language. This is unlikely to change. | nicoburns wrote: | Zig is lower-level than either (it's more like C than C++: | expect to manually call 'free' on allocated values). It has a | lot of nice improvements over C however, like proper arrays, | no null, a module system, and proper compile-time evaluation | (no need for the hacky preprocessor). | rvz wrote: | I found the module system for C interop quite innovative | and was surprised to see that Rust didn't even have such a | thing. If you compare bindgen to this approach, bindgen | just seems like double work and harder-to-maintain | autogenerated bindings. | | I have seen both Zig and Swift use the module system very | well despite several devs strangely saying it's somewhat | complicated, but when used for low-level development this | makes Zig an interesting language to use for language | bindings. | dnautics wrote: | I wrote a library for FFI (NIFs) in Elixir using Zig, and | a long-term goal is for it to be easier to use Zig as an | intermediary for C library FFI than it is to use C. | | Already importing BLAS into Elixir with it is crazy easy.
| zambal wrote: | That sounds interesting. Do you have a link to a Hex | package or repo? | dnautics wrote: | https://hexdocs.pm/zigler/Zigler.html | blackhaz wrote: | May I ask why people invent new languages rather than | trying to extend or improve existing ones, like C? | matheusmoreira wrote: | Existing languages can't be meaningfully improved without | breaking compatibility with all existing software. Python | 3 is a better language in every way, but it took many | years before it saw significant adoption. | AndyKelley wrote: | Here is a talk I gave that directly addresses this | question: | | https://www.youtube.com/watch?v=Gv2I7qTux7g | | (the title & abstract in the YouTube description is the | one I gave for the RFP; I ended up going in a slightly | different direction from it once I actually made the | talk) | wolfgke wrote: | > May I ask why people invent new languages rather than | trying to extend or improve existing ones, like C? | | Because extending or improving a language nearly always | means breaking compatibility. Nearly every language is | "stuck in a local optimum". To get out of it, you have to | kill assumptions of the language that lead to this local | optimum. | | Also, it takes a lot more work to convince other people to | adopt my changes than to write an implementation of a | language. Add to this the fact that being a good | programmer and being a good politician are rather | independent skills. | | Finally, there do exist issues that cannot be fixed, for | example that the official ISO standard of the C language | is not freely available (I am aware that there exist | drafts on the internet). | jeltz wrote: | Yeah, why did people not fix ALGOL instead of writing new | ones like C? | Ryckes wrote: | Many of the features of new languages are their | restrictions. How would you restrict C in a new version | to e.g. remove null pointers in favor of optional types?
| alfiedotwtf wrote: | A great feature I find with Rust not being a superset of an | unsafe language is that the only option for you is to | write safe code (unless you explicitly opt for `unsafe`). | | Compare this to merely extending a language, e.g. adding | smart pointers - you could always (mistakenly) implicitly | fall back onto bad coding practices. | simias wrote: | You could argue that it's what these languages do; the C | heritage is very strong in every one of them. | | The problem is: do you want to maintain full backcompat | or are you willing to break things to do it cleanly? If | you do the former you end up with something like C++, | which retains almost complete compatibility with C (but | then you have a lot of baggage to carry around, which may | get in the way at times), or you're willing to break | things and then why not take the opportunity to improve | the syntax and get rid of the cruft? | | C is ancient now; its type system and its many quirks are | quite far from the state of the art of language design. A | modern language wouldn't have to mess with the nonsense | that are, for instance, C arrays (including C++, which | can't outright remove them but does everything it can to | render them obsolete with std::vector and std::array). | | Another big problem with C interop for newer languages | is that while C itself is relatively small and easy, the C | preprocessor isn't. That's usually where the friction is, | because if you want to maintain compatibility with C | macros you have no choice but to implement the language | syntax wholesale. | rumanator wrote: | > If you do the former you end up with something like C++ | | That's a pretty awesome place to be, as C++ is one of the | most successful programming languages in history. | | Perhaps there are lessons there. | | > or you're willing to break things and then why not take | the opportunity to improve the syntax and get rid of the | cruft?
| | Breaking things just for the hell of it is not much of a | tech argument. | | C is already pretty light, and already gave rise to | successful programming languages such as C++ and | Objective-C. Unless you find a compelling reason to break | backward compatibility in a very specific way, I don't | see how that argument makes any sense. | | > Another big problem with C interop for newer languages | is that while C itself is relatively small and easy, the C | preprocessor isn't. | | Arguably, the preprocessor is orthogonal to the | programming language itself. I fail to see how that's | relevant. | simias wrote: | C++ is hugely successful, that's true, but is there a | need for another C++-style language? C++ already feels | like 10 languages under a trenchcoat anyway; whatever | your style, you'll probably find a subset of it you'll | like. I think it showed how powerful "C-with-classes-and-the-kitchen-sink" | can be, and also the limits of the | concept. | | C is light, but it does have some things worth breaking, | IMO. Type inference is something I dearly miss when I | write C these days (and I do that a lot). C didn't have | any generic programming for a long time (if you don't | count macro soup, that is); now it has some very limited | support, but it still looks like banging rocks together | compared to more modern languages. | | C's unsafety is legendary, and segfaults are a common problem | even for experienced programmers. Rust's lifetimes make | them impossible by design for safe code. | | You may not like that of course, but those are all good | reasons for experimenting with other paradigms. | | > Arguably, the preprocessor is orthogonal to the | programming language itself. I fail to see how that's | relevant. | | Arguably it is; practically it very much isn't. | pjmlp wrote: | C was already ancient when it came to be.
| | In retrospect, the aversion of the Go designers to current | practices in other programming languages is quite similar | to the aversion they showed back when designing C, versus the | other systems programming languages that were being | developed since 1961, like ESPOL, NEWP, PL/I, PL/S, | BLISS. | pjmlp wrote: | Multiple attempts have been made to fix C's security | issues, but the community at large tends to refuse to | adopt them, so the only way forward is to create other | languages for the same domain. | | Note that Objective-C and C++ are extensions of the C | language, and both started as pre-processors that would | generate C code. | | Also, C wasn't the first in its domain; it just got lucky | that UNIX got widespread adoption and then found its way | outside UNIX, just like JavaScript eventually found a way | outside the browser. | marcoperaza wrote: | > _I'm asking out of curiosity because I want to learn a | "system" programming language_ | | If you mean that you don't know any right now, just learn C. | This isn't web programming, where everyone is always hopping | to the latest fads. C is the lingua franca, the sine qua non | of systems programming. It will be a long time before that | changes. The C machine model is what everyone is working with | anyway. C is small, despite some devious and fun corners of | it, and mostly just exposes you to that model and way of | interacting with the machine. | | You can't even appreciate the new systems languages unless | you have a firm grasp of C and systems programming. They are | all a response to C, addenda and proposed evolutions to it. | guggle wrote: | Thank you. I sure tried C too (also C++ for GUI | programming); though I wouldn't say I "know" it (which to | me would imply at the very least one significant real-world | experience with it), I do understand why some projects try | to modernize "system" programming. I just want to evaluate | alternatives, but I may very well go for C in the end...
| baq wrote: | C is portable assembly; I'd recommend learning it | together with the assembly for the platform you're | learning on, so you can actually understand the stack, | heap, calling conventions, etc. | mratsim wrote: | C is portable assembly until it isn't: | | - no access to the carry/overflow flags in registers, making it | a chore to write bigint libraries | | - no way to guarantee emitting add-with-carry or sub-with-borrow, | even though those are simple instructions and some | architectures (the 6502) don't even provide a normal add/sub | | - need to drop down to assembly to implement resumable | functions/fibers to avoid the syscall costs of | ucontext/setjmp/longjmp | | - no access to hardware counters like RDTSC | | - no way to get the CPU frequency in a portable and | reliable way | | - no portable thread-ID function, and those that are | available (pthread_self, and Windows') rely on expensive | syscalls | | - no way to do CPU feature detection (SSE, AVX, ARM Neon, | ...) | | - no way to control emission of common intrinsics like | popcount, bit-scan-reverse, or lowest-bit isolation in a | portable way | | The portable assembly narrative breaks down rapidly when | you actually need specific code to be emitted. | sigjuice wrote: | _need to drop down to assembly to implement resumable | functions/fibers to avoid the syscall costs of | ucontext/setjmp/longjmp_ | | Which syscalls would be involved in setjmp/longjmp? | mratsim wrote: | See http://www.1024cores.net/home/lock-free-algorithms/tricks/fi... | on ucontext. | | And for longjmp, Apple's source for example: | https://opensource.apple.com/source/Libc/Libc-262/i386/sys/s.... It does | seem like glibc's impl doesn't involve syscalls: | https://groups.google.com/forum/#!topic/comp.unix.programmer... | baq wrote: | Precisely my point - learn both together, as they | complement each other, instead of letting C abstract the | underlying computer away.
| tastyminerals wrote: | And somehow you forgot about D. | matheusmoreira wrote: | The Road to Zig 1.0 explains the essence of the language: | | https://youtu.be/Gv2I7qTux7g | | C, but with the problems fixed. | pjmlp wrote: | It doesn't fix use-after-free or double free(). | anon767 wrote: | Are you planning to write up a tutorial about this? | jackhalford wrote: | I'm more on the reading end of tutorials right now. If I come | up with something original that doesn't have an osdev page, | I'll contribute to the wiki. | ajxs wrote: | Great work! I'm just behind you, working on my own alternate-language | x86 kernel in Ada: https://github.com/ajxs/cxos | | Admittedly I don't know much about Zig, but it's good to see | people investigating languages other than C. I'm still not | convinced of Rust's merits in this area; we'll see how this | develops with time. | joshbaptiste wrote: | Author, if you decide to build more of this kernel, any thoughts | on providing live screencasts of the implementation like Andrew | Kelley (Zig) and Andreas Kling (SerenityOS) on YouTube and/or | Twitch? I never realized how effective it is for me to watch | others go through the mental process of coding/debugging. | jackhalford wrote: | Hi, that's a great idea, it would be a good exercise for me | especially. I always enjoy watching Andy's live Zig coding on | YouTube. | steeve wrote: | Great work! | bambataa wrote: | This looks really cool, well done. Do you mind sharing the | resources you've used so far? I see many, many tutorials on | OSDev... | jackhalford wrote: | OSDev can be hit or miss. All of the bootstrapping was done | following this [0] tutorial for Rust, which translates easily | to Zig. For the more advanced parts you can find useful ones | in [1] or use the OSDev search bar. For even more advanced | topics, though, you'll find that there are no tutorials, only | a few open-source implementations to take inspiration from!
| | ps: I've updated my readme with a few references | | [0] https://os.phil-opp.com/ | | [1] https://wiki.osdev.org/Tutorials | vkaku wrote: | I'd rather you expose a FUSE interface first than expose | ext2.... Just a totally random suggestion. | fortran77 wrote: | Zig is so much better than Rust for these sorts of tasks. | Readable and maintainable. The Rust astroturf-brigade tries | hard to make it fit any situation. I'm glad you have provided a | substantial concrete counterexample. ___________________________________________________________________ (page generated 2020-01-06 23:00 UTC)