[HN Gopher] Starting Over - A FOSS proposal for a new type of OS... ___________________________________________________________________ Starting Over - A FOSS proposal for a new type of OS for a new type of computer Author : gjvc Score : 113 points Date : 2022-03-26 14:08 UTC (8 hours ago) (HTM) web link (archive.fosdem.org) (TXT) w3m dump (archive.fosdem.org) | BruceEel wrote: | Alas, my MacBook Pro computer is so obsolete the video won't | play! | raybb wrote: | Their website seems to be struggling a bit with the video. | | Here's a mirror that will only last 24 hours or 100 downloads. | https://wormhole.app/kr6xX#tHs1gcAJfyBgN5ti0tBpxA | | Alternatively, here's a p2p mirror that will stay up as long as | someone has the page open: | https://instant.io/#8587ef24d016aff8c87c6de186e1e8584270a4c7 | BruceEel wrote: | Thank you! | orev wrote: | The original PalmOS did this from the beginning, with the main | difference being that memory was all RAM instead of non-volatile. | Apps existed in one part of memory, and data in another. Data | storage didn't use files, but databases and memory segments. | Anyone trying to solve this problem today could learn from that | system. | | And yes, if your battery died, you lost everything. Palm devices | were designed to be synced to PCs at least daily, so you'd just | restore by syncing again. | evgen wrote: | Or learn from a better version of that same idea that was | embodied in the Newton. Able to be coded using prototypes from | top to bottom so that every app (including ones from Apple that | were embedded in the system) was hookable and modifiable, data | 'soups' that were persistent in nvram, and easy cross- | application data sharing. | zozbot234 wrote: | Except that memory segments are already files in all but name. | The two notions were unified already in MULTICS, and planning | for "single-level storage" makes that unification explicit. 
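To make the "memory segments are already files in all but name" point concrete, here is a minimal sketch using Python's standard `mmap` module. The file name `segment.bin`, the 4 KiB size, and the `bump_counter` helper are assumptions made up for illustration; nothing here comes from the talk or from MULTICS itself. It only shows that a value written through a memory mapping is still there the next time the same file is mapped.

```python
import mmap
import os
import struct
import tempfile

SEGMENT_SIZE = 4096  # assumed size of the illustrative "segment"

# Hypothetical backing file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "segment.bin")

# Create the zero-filled segment on disk.
with open(path, "wb") as f:
    f.write(b"\x00" * SEGMENT_SIZE)


def bump_counter(segment_path):
    """Map the file into memory and increment a 64-bit counter at offset 0."""
    with open(segment_path, "r+b") as f:
        with mmap.mmap(f.fileno(), SEGMENT_SIZE) as seg:
            (value,) = struct.unpack_from("<Q", seg, 0)
            struct.pack_into("<Q", seg, 0, value + 1)
            seg.flush()  # ask the OS to write the dirty page back to the file
            return value + 1


# Each call opens a brand-new mapping, yet the counter keeps its value.
first = bump_counter(path)
second = bump_counter(path)
print(first, second)
```

The program never calls `write()` on the file after creating it; the counter survives only because the mapping and the file are the same bytes.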
| Someone wrote: | Earlier HN discussion at | https://news.ycombinator.com/item?id=26066762 (119 comments) | josephg wrote: | Personally I'm horrified by the vision of 18 different Electron | apps, each using gigabytes of storage space even when they're not | running. | | But I think there's always some essential state that we'll want | to persist. And I agree that modern filesystems are way too | janky. It's hard to store anything safely because there's no | transactional support. It's hard to share edits between computers | because files are totally opaque to the operating system. So my | changes will clobber your changes. And the save-load loop is | inefficient. Why do I have to write out my entire file again | after making a single character change? | | But the core promise of persistent values is important. Why can't | I just bless a variable in my program (and give it a name) and | have it persist between instantiations? Like, if I give x some | persistent identity, then set x = 5, it should still be 5 when I | restart my computer, reload my webpage, or open a second web | browser. | | I'm not sure if persistent memory is the answer. My take is that | I think we need to start using CRDTs more deeply in our operating | systems. | | A well-written operation-based CRDT (Conflict-free Replicated Data | Type) is an append-only log of patches, which together create a | mutable value. Treating it as an append-only log makes it easy to | store on disk or replicate over the network. And treating it like | a value makes it easy to program against. The log of changes | grows over time, but you don't have to store all of it. (Or | really, any of it). You have a tunable knob of how much history | you want to keep around. More history data means you can support | branches, merging and time travel. Less history makes it smaller. | (And the data you need to keep for merging is really tiny in | practice, btw.) 
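The "append-only log of patches that together form a value" idea can be sketched in a few lines of Python. This is a deliberately simple last-writer-wins map, not the text-editing CRDTs the comment has in mind, and the names `Op`, `Replica`, `assign`, `merge`, and `value` are invented for illustration. It assumes each operation is identified by a Lamport clock plus a replica name, and it keeps the full history rather than implementing the "tunable knob" for trimming it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Op:
    """One patch in the log; identity and ordering use (clock, replica) only."""
    clock: int                               # Lamport timestamp
    replica: str                             # tie-breaker, makes the order total
    key: str = field(compare=False)
    value: object = field(compare=False)


@dataclass
class Replica:
    name: str
    clock: int = 0
    log: set = field(default_factory=set)    # append-only: ops are never removed

    def assign(self, key, value):
        """Record a local write as a new operation."""
        self.clock += 1
        self.log.add(Op(self.clock, self.name, key, value))

    def merge(self, other):
        """Union the two logs; duplicates collapse because ops are hashable."""
        self.log |= other.log
        self.clock = max(self.clock, other.clock)

    def value(self):
        """Replay the log in its total order to get the current value."""
        state = {}
        for op in sorted(self.log):
            state[op.key] = op.value
        return state


# Two replicas write x concurrently, then exchange logs in either order.
a, b = Replica("a"), Replica("b")
a.assign("x", 5)
b.assign("x", 7)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Both replicas end up with the same value because they replay the same set of operations in the same order; here the tie between the two clock-1 writes is broken by replica name, so `b`'s write wins. Storing or shipping the value is just storing or shipping the log.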
| | With CRDTs we can make magic variables which work like that if we | want. And unlike persistent memory, CRDTs can share their value | with other processes, other devices and other users. | | Rather than sysfs, procfs, etc, I want all the OS's internal | state exposed as live variables I can just subscribe to. And I | want my own data (documents, etc) to work exactly the same way. | For some data sets (like my source code), I want to keep a full | change history. And for other data sets (my CPU temperature) that | should be something I explicitly opt in to. | | But regardless, all of this data should be able to be shared with | a flick of the mouse. | bryan_w wrote: | Interesting idea, but I think he's a little too hung up about | programming languages. I think there's space for new operating | systems and that doesn't necessarily preclude the use of any | programming language. | | We could have OSes that don't use files, that save state and are | able to resume in seconds, and we can do all of that in C (I | don't think we would want to but we could). Also there doesn't | exist a world where grandma (or even the Kool kids today) is | going to learn Smalltalk in order to use an OS. | | The main hurdle I see with all of this is hardware initialization | -- how to handle, from cold boot, the fact that your network card | has no idea which frames have been sent and which have been | received. | | The first step I see in all of this is creating an operating | system where the kernel has no concept of being booted. One thing | people struggle to wrap their heads around with Optane is "what | do kernel upgrades look like" which would be messy in existing | OSes. | jka wrote: | > Also there doesn't exist a world where grandma (or even the | Kool kids today) is going to learn Smalltalk in order to use an | OS. 
| | I'd agree that well-designed operating systems shouldn't | require anyone to understand programming to begin to use them, | but equally I'd argue that anyone should be able to look at the | code that's running on their computer and inspect it. It's one | way (often a very effective way) to learn about software in the | first place. | | The technology landscape will no doubt look very different in | another twenty years, and while it's relatively well-understood | that anyone can contribute to FOSS projects regardless of age | or experience, it's worth improving that opportunity when | possible (both technically and socially). | | There's some kind of analogy with web development here; the | ability to view source for webpages and tinker with the | HTML/CSS/JS easily is an educational pathway. | | > The first step I see in all of this is creating an operating | system where the kernel has no concept of being booted. | | If both the operating system and network card are safe to | initialize from cold at any point (and with robust software and | protocol design, they should be), then even with peer-to-peer | commodity memory as a storage medium, statelessness should be | perfectly achievable, I think? | | Slightly off-topic: you may be interested in Linux's kexec[1] | functionality (it provides the ability for a running kernel to | load and run another kernel). | | [1] - https://wiki.archlinux.org/title/Kexec | zozbot234 wrote: | A single-level store based OS uses even _more_ files than | current OSes like Windows or Linux. Because every memory | mapping is notionally persisted to disk and is thus a file. | Whereas currently you have "anonymous" mappings that are | understood as ephemeral, with no "files" in storage backing | them. | andi999 wrote: | I think the next step should be focused on security. 
This | captures part of it nicely: https://xkcd.com/1200/ | | I mean any user space program can harm all (same) user space data | and since often there is only one user in a system this is fatal. | fuzzfactor wrote: | Well I've been doing computerized gas analysis since dirt was | rocks and only the rare laboratories with mainframes could do it. | | Before the first benchtop data systems came along you had to | evaluate the curves from the graph paper yourself and then | calculate numerical results by hand. A single-point report like | CO2 content alone is quite easy & quick manually, but when you | have dozens of hydrocarbons on the same graph that all need to be | calculated in parallel it was not so quick and more prone to | error. | | Programmable calculators could then be used to save time on some | of the purely numerical work for the high-data runs, and this is | about like what would become the final part of the built-in | workflow for these early application-specific computerized gas | workstations. | | Offices didn't have PCs yet, just typewriters, copiers, filing | cabinets and simple calculators. | | For laboratories the built-in gas report printers were | leapfrogging technology with each generation, the inkjet first | appeared in a HP model for the instrument lab before they began | to apply it to office equipment. | | The evolution has been interesting. | | One particular series of late 1970's design benchtop data system | was multi-user and multi-tasking way before Windows. Like others | it took in the raw analog signal (still do but they're "black | box" interfaces connected to PC's for further calculation now) | from the benchtop gas analyzer in real time and produced digital | data from the curves on the graph. This electronically | accomplished that first manual procedure of measuring the curve | to begin with. | | From that point on it's all calculation. 
| | Each gas run was autosaved in memory as a file having an | automatically-assigned file number, no more than 3 digits. There | was no way anybody had enough memory to record the entire analog | signal at the time so this was just the key geometric features in | digital form. | | There was no file system. | | Today on PCs the entire analog gas signal is recorded to HDD in | digital format routinely. And people still lose files all the | time, don't ask me how I know. | | But for that antique data system there was no storage other than | memory absolutely required, many bare-bones consoles were issued | without the optional micro-cassette tape drive. The tape was | quite slow and best for long-term storage or disaster recovery. | An optional COM port became available years later which was | faster, but there was no workflow change for most users. A great | deal of fairly advanced work could be accomplished by many | without need for any storage device I/O over a period of years. | Just a one-page printout about each gas destined for the physical | filing cabinet. | | You never were supposed to turn the console off. Except whenever | you wanted to! The major use case is 24/7 so there were expensive | battery backup units to protect from short- or long-term power | failures, which only preserved the memory. If the power failed | the battery kept everything the same and you only lost a gas run | if it was in progress and had not been saved in memory. | | But you could also power down the console from the main switch | any time and its high-power-consuming hardware, processor, and | power-supply would go cold. Like when you're done for the day, | the gas runs are over, and you were finished sitting at the | console. The memory alone would still be powered by the battery | backup unit without depleting the battery at all unless the | backup unit itself lost power. 
Under backup power you could also | change any circuit board other than the dedicated memory PCB, | including the main power supply, and it would power back up with | the memory intact. | | The OS was immutable and in ROM where it should be. | | Also the default OS performance upon cold power-up was adequate | for many simple gas tests without need for user programs or | reference data files to be loaded beforehand. For many operators | a few minutes of manual parameter entry gave full disaster | recovery. So those guys didn't even need the expensive batteries, | tapes or COM port. It was basically application-specific enough | for the simple stuff already. | | You could then extend that with optional user programs to more | complex automation & calculations but only up to the limitations | of memory. | | User programs ended up as file numbers too so you were your own | file system. | | And to just get a truly worthwhile file listing, you had to write | your own program. | | Glad there were only 3 digits. | | These things did achieve nearly zero show-stoppers per year on | average, unlike modern PC systems which are in the handfuls at | minimum and dozens for many operators on average. | | There was no booting, the OS ran from ROM and the memory was | accessed as needed from there. Optional storage devices were only | needed for user items, the OS did not require any files from | storage, and only one file present by default in memory. Simply | its own memory work area. | clintonwoo wrote: | If the computer has non-volatile memory then how do you turn it | off and on again to fix the bugs in software? Hope we don't have | to say rest in peace to this beautiful piece of troubleshooting | procedure | Someone wrote: | The same way you do it now. Most computers already have non- | volatile memory, typically a tiny bit for such things as boot | parameters and an enormous amount in the form of disk storage. 
| | If, today, your disk is corrupt, you tell your OS to boot in | single-user mode and run _fsck_. If that doesn't work, you fall | back a level and reformat a disk. | | I don't see why a device where all storage is directly | addressable would be different. | e2le wrote: | Some kind of overlayfs type solution? With a non-writeable | "default" lower state and a writeable upper state where the | changes occur, removing the upper state would effectively reset | the system. | clintonwoo wrote: | Fair call, sounds like an immutable OS like CoreOS where | updating it requires a reboot. That's possibly all that's | required! | a9h74j wrote: | IMHO something was lost when hard drives no longer had | their own write-protect switch (remembering back to DEC | rack-mount spinning drives). | IshKebab wrote: | Same way you fix errors with persistent caches. You clear them. | oneplane wrote: | As wonderful as ideas are, change from technology only becomes | change in availability. Or, in simpler terms: it's not always | about what is possible, and quite often it's about what is needed | instead. | 2OEH8eoCRo0 wrote: | This is great. Ties together a lot of ideas that I've had but | either can't properly express or lack the knowledge to execute | on. | LichenStone wrote: | This reminds me of some Bret Victor talks. | bumblebritches5 wrote: | welklkjlerg wrote: | PaulDavisThe1st wrote: | 1) The premise laid out at the link is very interesting, and one | I have wondered about for more than a decade without reaching any | conclusions. So much code in a modern kernel is predicated on the | idea that if the data is on "disk" it will be slow and we should | do something else while it arrives. What does kernel design look | like when that's just not true anymore? | | 2) Then I went to the slides, and found what appeared to be a | talk about programming languages and desktop environments. 
| | 3) Filesystems are often thought of as ways of finding data on | some storage medium, and that's not wrong. But _files_ are also a | way to _organize_ data, largely independently of the technical | details of the filesystem itself. | gjvc wrote: | This talk is a wonderful 50-minute summary of computing over the | last ~50 years with a couple of ideas for future directions, and | I recommend every serious computerist watch it. | karmakaze wrote: | We're talking about losing the constraint of filesystems that | will burden future development. | | At the same time we're clinging to programming using empirical | thinking that came from the first machine language programs ever | written. We should be at a point that we can think and write more | declaratively even when executed sequentially. The number of cores | has continued to go up and any language that doesn't parallelize | well isn't worth investing in for the long term. | tux1968 wrote: | IMHO, the ideas about non-volatile memory are a distraction and | not a very fundamental or interesting detail of future systems. | But the core proposal of combining https://squeak.org/ on top of | https://oberon.org is really quite an exciting idea and could be | a lot of FOSS fun. | jll29 wrote: | As he says, Squeak already runs on bare metal, so why have a | version with Oberon below? He himself admits in the talk that a | system as he proposes would have two languages rather than one | (i.e., go against the spirit of Smalltalk systems and LISP | machines that he praises earlier in his talk). | | There are Smalltalk ports on top of LuaJIT and Oberon ports on | top of LuaJIT, so there is at least one reality where the two | languages sit side-by-side (rather than on top of each other as | the talk proposes); see Michael Engel's talk on Vimeo for more | details. | Animats wrote: | _When a computer's permanent storage is all right there in the | processors' memory map, there is no need for disk controllers or | filesystems. 
It's all just RAM._ | | And errors are forever. | | There were once LISP, Smalltalk, and Forth environments where you | saved the entire state of the system, rather than using files. | This was not a good thing. Only one person could really work on | something. If you messed up the image, undoing the mess was hard. | Revision control? What's that? | | Progress has been made by learning how to minimize and discard | state. What makes the web backend application industry go is that | almost all the state is in the database. Database systems are | well debugged and reliable today. This allows for mediocre | quality in web apps. | | Containers are another example of discarding state. Don't | sysadmin, just flush and reload. So is the transition from | object-oriented to functional programming. | | No, making memory persistent is not the answer. | | We do need to be thinking about new OS designs, because the | hardware world is changing. | | - Lots of RAM. Swapping out to disk, even SSD, is so last-cen. | The moment that happens performance degrades so badly you may as | well restart. | | - Lots of CPUs. There are now 128-core CPUs. NVidia just | announced a 144-core ARM CPU. The OS and hardware need to be | really serious about interprocess communication. It's an | afterthought in Unix/Linux. Needs to be at least as good as QNX. | Probably better, and with hardware support. | | - Different kinds of compute units. GPUs need to be first-class | players in the OS. There are going to be more special purpose | devices, for machine learning. | | - SSD is fast enough that buffering SSD in RAM is mostly | unnecessary. | jandrewrogers wrote: | > Lots of RAM. Swapping out to disk, even SSD, is so last-cen. | The moment that happens performance degrades so badly you may | as well restart. | | RAM is the new disk. 
These days we try to make codes live out | of CPU cache as much as possible, scheduling the RAM access the | way we used to schedule disk access, which is effective -- plus | ça change, plus c'est la même chose. | | > SSD is fast enough that buffering SSD in RAM is mostly | unnecessary. | | This is not even approximately true in practice for many | applications. Striping an array of NVMe storage devices is | _still_ about an order of magnitude less bandwidth than RAM. | Storage I/O is also high latency and highly concurrent; if | your storage I/O is zero-copy then it implies locking up | considerable amounts of RAM just to schedule the I/O | transactions safely, never mind caching. Modern database | kernels require clever I/O schedulers and large amounts of RAM | to keep throughput from degrading to the throughput of even | very fast storage hardware. And this assumes your storage code | is aware of the peculiarities of SSD (inconvenient mix of block | sizes, no overwrite, garbage collection, etc). | | That said, you obliquely raise a valid point. Buffering storage | in a cache usually doesn't work well once the storage:cache | ratios become high enough, and the 1000:1 ratios seen in modern | servers are well past that threshold. In these cases, you still | need a lot of RAM but you end up redeploying it to other parts | of the kernel architecture where it can be more effective than | a buffer cache for storage. | zasdffaa wrote: | > SSD is fast enough that buffering SSD in RAM is mostly | unnecessary. | | A very strong claim, which I'd contest depending on what you're | caching. If it's application code then probably but if you've | lots of data which you're expecting to re-read such as with a | database with many gigs, I really disagree with that. If an SSD | had say a 1 gig/sec bandwidth for reading that's about 22 times | slower than my arthritic 7-year-old dual-channel RAM bandwidth[1]. 
| A more modern higher spec server will have more channels (and | hopefully more modern, faster ram). | | Can you elaborate a bit on your thoughts? | | [1] and the path to ram may not have as many page faults -> | kernel switches as reading from an SSD, but I may be wrong and | it may not matter | kragen wrote: | https://www.club386.com/silicon-motion-pcie-5-0-nvme-ssds- | pr... says 14 GB/s. | | And of course the sum of bandwidths within a set of NVDIMMs | can easily be much higher than that. | a9h74j wrote: | > GPUs need to be first-class players in the OS | | I've been starting to wonder, taking a crazy-stupid extreme, | whether a GPU unit should be in charge of booting. Perhaps you | boot into a graphical interpreter (like a boot into BASIC of | old) which can then implement a menu (or have "autoexec.bas") | to select a heavyweight OS to boot into. | kragen wrote: | Multiuser systems with orthogonal persistence, like the single- | user Smalltalk and image-based Lisps you mention, include | EUMEL, L3, KeyKOS, OS/400 (iSeries), and I think Multics. (The | Forths I know of were never image-based, so I can't comment on | them.) It is definitely not the case that "only one person | could really work on something [at a time]" on EUMEL, L3, | KeyKOS, and OS/400. Maybe only one person could hack on the TCB | at a time, but EUMEL, L3, KeyKOS, and OS/400 had extremely tiny | TCBs, for that reason among others. | | Having a single-level store doesn't save you from minimizing | and discarding state. It just (mostly) decouples discarding | state from power failures, making it intentional rather than | accidental. In today's battery-powered environments, this is | probably a better idea than ever. KeyKOS and EUMEL still had | named files, containing sequences of bytes, in directories. | They just weren't coupled to the disk device drivers. OS/400 | has files, too. Smalltalk and Lisp environments were more | radical in this sense. 
| | Also, Smalltalk has had version control within its images for | decades. | | Stephen White and Pavel Curtis's MOO also had transparent | persistence, didn't have files, and supported massively | multiuser access; possibly a MOO server has supported more | concurrent users than any AS/400, and certainly more than any | EUMEL, L3, or KeyKOS system ever did. (LambdaMOO has 48 users | online right now according to http://mudstats.com/Browse but | had many more at its peak.) But, while MOO was programmable, it | never supported high-performance or high-reliability computing | as those other systems did. You could think of it as being | written in a groupware scripting DSL, like a centralized Lotus | Notes. The most complex MOO software I know of was a Gopher | client, a "Gopher slate" that multiple people could use | simultaneously. It used cooperative multitasking with timeouts | that would abort long-running event handlers without rolling | back their incomplete effects. | | I agree, though, that a single-level store on its own doesn't | solve the problems OSes are facing. But it might be useful. We | have many layers of software on top of everything that accesses | persistent data being built on the assumption that anything | persistent is necessarily slow enough to cover millions of | nanoseconds of processing. That greatly reduces the benefits of | the enormous improvement SSDs represent over spinning rust. | | I'd say that buffering SSD in RAM goes beyond "mostly | unnecessary": it needs to be done in hardware (as with NVDIMMs) | to not _hurt_ performance in many cases. FRAM is even more | extreme on this axis. | | Another interesting question to me is migration. If I'm editing | a Jupyter notebook on my laptop, it'd be nice to move it onto | my cellphone and keep editing it when I travel, and it'd be | nice to move the computations onto heavy servers when I'm | connected to the internet. 
Lacking this, we seem to be moving | to warehouse-scale computing: everybody runs all their | notebooks on Google App Engine so they can use TPUs and access | them from anywhere. | | Related to migration is UI remoting. Why can't I use two | cellphones together to access the same app, or two cellphones | and a mouse as I/O devices to the same music synthesizer | (running on my laptop)? Accelerometers and multitouch could be | useful for a lot of things. Maybe ROS (the Willow Garage one, | not the BBS software) has the right approach here? | | Also, why is everything so goddamn slow and unreliable? Ivan | Sutherland had a VR wireframe cube running on a head-mounted | display more than 50 years ago. Why are these supercomputers in | our pockets still too slow to run VR with low enough latency | that people don't throw up? Why does it take multiple minutes | to start up again if its battery runs down and I plug it in? | Why does the alarm app on my cellphone refuse to launch when | its Flash fills up? Why does my laptop become unresponsive and | require rebooting when a web page uses too much RAM? | | _shakes fist at cloud_ | zozbot234 wrote: | > It just (mostly) decouples discarding state from power | failures, making it intentional rather than accidental. | | Just wondering, but how is an OS supposed to achieve this | without fsync()ing all writes to memory and being slow as | dog$h!t? Having a "single level" of storage means it becomes | harder to tell what might be purely ephemeral, and can thus | be disregarded altogether if RAM or disk caches are lost in a | failure. You might think you can do this via new "volatile | segments" API's but then you're just reintroducing the | volatile storage that you were trying to get rid of. | marginalia_nu wrote: | Eh, from hands on experience with Optane drives I'm very | skeptical. They're still very far removed from RAM, even if they | are good compared to SSDs. 
Like maybe next to RAM in the 1980s | they seem about equivalent. Modern RAM has absurd bandwidth. ___________________________________________________________________ (page generated 2022-03-26 23:00 UTC)