[HN Gopher] Starting Over - A FOSS proposal for a new type of OS...
       ___________________________________________________________________
        
       Starting Over - A FOSS proposal for a new type of OS for a new type
       of computer
        
       Author : gjvc
       Score  : 113 points
       Date   : 2022-03-26 14:08 UTC (8 hours ago)
        
 (HTM) web link (archive.fosdem.org)
 (TXT) w3m dump (archive.fosdem.org)
        
       | BruceEel wrote:
       | Alas, my Macbook Pro computer is so obsolete the video won't
       | play!
        
         | raybb wrote:
         | Their website seems to be struggling a bit with the video.
         | 
         | Here's a mirror that will only last 24 hours or 100 downloads.
         | https://wormhole.app/kr6xX#tHs1gcAJfyBgN5ti0tBpxA
         | 
         | Alternatively, here's a p2p mirror that will stay up as long as
         | someone has the page open:
         | https://instant.io/#8587ef24d016aff8c87c6de186e1e8584270a4c7
        
           | BruceEel wrote:
           | Thank you!
        
       | orev wrote:
       | The original PalmOS did this from the beginning, with the main
       | difference being that memory was all RAM instead of non-volatile.
       | Apps existed in one part of memory, and data in another. Data
       | storage didn't use files, but databases and memory segments.
       | Anyone trying to solve this problem today could learn from that
       | system.
       | 
       | And yes, if your battery died, you lost everything. Palm devices
       | were designed to be synced to PCs at least daily, so you'd just
       | restore by syncing again.
        
         | evgen wrote:
         | Or learn from a better version of that same idea that was
         | embodied in the Newton. Able to be coded using prototypes from
         | top to bottom so that every app (including ones from Apple that
         | were embedded in the system) was hookable and modifiable, data
         | 'soups' that were persistent in nvram, and easy cross-
         | application data sharing.
        
         | zozbot234 wrote:
         | Except that memory segments are already files in all but name.
         | The two notions were unified already in MULTICS, and planning
         | for "single-level storage" makes that unification explicit.
        
       | Someone wrote:
       | Earlier HN discussion at
       | https://news.ycombinator.com/item?id=26066762 (119 comments)
        
       | josephg wrote:
       | Personally I'm horrified by the vision of 18 different electron
       | apps, each using gigabytes of storage space even when they're not
       | running.
       | 
       | But I think there's always some essential state that we'll want
       | to persist. And I agree that modern filesystems are way too
       | janky. It's hard to store anything safely because there's no
       | transactional support. It's hard to share edits between computers
       | because files are totally opaque to the operating system. So my
       | changes will clobber your changes. And the save-load loop is
       | inefficient. Why do I have to write out my entire file again
       | after making a single character change?
       | 
       | But the core promise of persistent values is important. Why can't
       | I just bless a variable in my program (and give it a name) and
       | have it persist between instantiations? Like, if I give x some
       | persistent identity, then set x = 5, it should still be 5 when I
       | restart my computer, reload my webpage, or open a second web
       | browser.
       | 
       | I'm not sure if persistent memory is the answer. My take is that
       | I think we need to start using CRDTs more deeply in our operating
       | systems.
       | 
       | A well written operation based CRDT (Conflict-free Replicated Data
       | Type) is an append-only log of patches, which together create a
       | mutable value. Treating it as an append-only log makes it easy to
       | store on disk or replicate over the network. And treating it like
       | a value makes it easy to program against. The log of changes
       | grows over time, but you don't have to store all of it. (Or
       | really, any of it). You have a tunable knob of how much history
       | you want to keep around. More history data means you can support
       | branches, merging and time travel. Less history makes it smaller.
       | (And the data you need to keep for merging is really tiny in
       | practice, btw.)
       | 
       | With CRDTs we can make magic variables which work like that if we
       | want. And unlike persistent memory, CRDTs can share their value
       | with other processes, other devices and other users.
       | 
       | Rather than sysfs, procfs, etc, I want all the OS's internal
       | state exposed as live variables I can just subscribe to. And I
       | want my own data (documents, etc) to work exactly the same way.
       | For some data sets (like my source code), I want to keep a full
       | change history. And for other data sets (my CPU temperature) that
       | should be something I explicitly opt in to.
       | 
       | But regardless, all of this data should be able to be shared with
       | a flick of the mouse.
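
The append-only log of patches that josephg describes can be sketched in a few lines. This is a minimal illustration, not his design: it implements only a last-writer-wins key/value map, and the names `Op` and `Replica` are hypothetical. Real operation-based CRDTs for text or rich documents need considerably more machinery.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Op:
    """One patch in the log. Ordering uses (clock, replica, key) only."""
    clock: int        # Lamport clock of the edit
    replica: str      # tie-breaker between concurrent edits
    key: str
    value: object = field(compare=False)


class Replica:
    def __init__(self, name):
        self.name = name
        self.clock = 0
        self.log = []      # append-only: ops are never edited or removed
        self.seen = set()  # ops already present in the log

    def _append(self, op):
        if op not in self.seen:
            self.seen.add(op)
            self.log.append(op)
            self.clock = max(self.clock, op.clock)

    def set(self, key, value):
        """Record a local edit as a new op at the end of the log."""
        self._append(Op(self.clock + 1, self.name, key, value))

    def merge(self, other):
        """Pull in every op the other replica has that we lack."""
        for op in other.log:
            self._append(op)

    def value(self):
        """Replay the log in a deterministic order to get the current value."""
        state = {}
        for op in sorted(self.log):
            state[op.key] = op.value
        return state


if __name__ == "__main__":
    laptop, phone = Replica("laptop"), Replica("phone")
    laptop.set("x", 5)
    phone.set("x", 7)   # concurrent edit to the same key
    laptop.merge(phone)
    phone.merge(laptop)
    print(laptop.value() == phone.value())
```

Because the value is always derived by replaying the same set of ops in the same order, both replicas converge regardless of which one merges first. Writing `log` to disk or shipping it over the network is what would make the variable "persistent" or "shared" in the sense of the comment.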
        
       | bryan_w wrote:
       | Interesting idea, but I think he's a little too hung up about
       | programming languages. I think there's space for new operating
       | systems and that doesn't necessarily preclude the use of any
       | programming language.
       | 
       | We could have OSes that don't use files, that save state and are
       | able to resume in seconds, and we can do all of that in C (I
       | don't think we would want to but we could). Also there doesn't
       | exist a world where grandma (or even the Kool kids today) is
       | going to learn smalltalk in order to use an OS.
       | 
       | The main hurdle I see with all of this is hardware initialization
       | -- how to handle, from cold boot, the fact that your network card
       | has no idea which frames have been sent and which have been
       | received.
       | 
       | The first step I see in all of this is creating an operating
       | system where the kernel has no concept of being booted. One thing
       | people struggle to wrap their head around with optane is "what
       | do kernel upgrades look like" which would be messy in existing
       | OSes.
        
         | jka wrote:
         | > Also there doesn't exist a world where grandma (or even the
         | Kool kids today) is going to learn smalltalk in order to use an
         | OS.
         | 
         | I'd agree that well-designed operating systems shouldn't
         | require anyone to understand programming to begin to use them,
         | but equally I'd argue that anyone should be able to look at the
         | code that's running on their computer and inspect it. It's one
         | way (often a very effective way) to learn about software in the
         | first place.
         | 
         | The technology landscape will no doubt look very different in
         | another twenty years, and while it's relatively well-understood
         | that anyone can contribute to FOSS projects regardless of age
         | or experience, it's worth improving that opportunity when
         | possible (both technically and socially).
         | 
         | There's some kind of analogy with web development here; the
         | ability to view source for webpages and tinker with the
         | HTML/CSS/JS easily is an educational pathway.
         | 
         | > The first step I see in all of this is creating an operating
         | system where the kernel has no concept of being booted.
         | 
         | If both the operating system and network card are safe to
         | initialize from cold at any point (and with robust software and
         | protocol design, they should be), then even with peer-to-peer
         | commodity memory as a storage medium, statelessness should be
         | perfectly achievable, I think?
         | 
         | Slightly off-topic: you may be interested in Linux's kexec[1]
         | functionality (it provides the ability for a running kernel to
         | load and run another kernel).
         | 
         | [1] - https://wiki.archlinux.org/title/Kexec
        
         | zozbot234 wrote:
         | A single-level store based OS uses even _more_ files than
         | current OS's like Windows or Linux. Because every memory
         | mapping is notionally persisted to disk and is thus a file.
         | Whereas currently you have "anonymous" mappings that are
         | understood as ephemeral, with no "files" in storage backing
         | them.
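
The distinction zozbot234 draws between anonymous and file-backed mappings can be seen directly with Python's standard `mmap` module. This is a small sketch on a conventional OS, not a single-level store; the helper names and the `segment.bin` file name are made up for illustration.

```python
import mmap
import os
import tempfile

SIZE = mmap.PAGESIZE


def scratch_in_anonymous_mapping(data: bytes) -> bytes:
    """Write to an anonymous mapping: no file in storage backs it."""
    with mmap.mmap(-1, SIZE) as anon:
        anon[: len(data)] = data
        return anon[: len(data)]  # the mapping's contents vanish on close


def store_in_file_mapping(path: str, data: bytes) -> None:
    """Write through a file-backed mapping: the file keeps the bytes."""
    with open(path, "wb") as f:
        f.truncate(SIZE)  # the mapping needs a non-empty file behind it
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as mapped:
        mapped[: len(data)] = data
        mapped.flush()  # ask the OS to write dirty pages back to the file


def read_back(path: str, n: int) -> bytes:
    with open(path, "rb") as f:
        return f.read(n)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "segment.bin")
    scratch_in_anonymous_mapping(b"ephemeral")
    store_in_file_mapping(path, b"persistent")
    print(read_back(path, 10))
```

In a single-level store, the second kind of mapping would be the only kind, which is the sense in which such a system has more "files", not fewer.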
        
       | andi999 wrote:
       | I think the next step should be focused on security. This
       | captures part of it nicely: https://xkcd.com/1200/
       | 
       | I mean any user space program can harm all (same) user space data
       | and since often there is only one user in a system this is fatal.
        
       | fuzzfactor wrote:
       | Well I've been doing computerized gas analysis since dirt was
       | rocks and only the rare laboratories with mainframes could do it.
       | 
       | Before the first benchtop data systems came along you had to
       | evaluate the curves from the graph paper yourself and then
       | calculate numerical results by hand. A single-point report like
       | CO2 content alone is quite easy & quick manually, but when you
       | have dozens of hydrocarbons on the same graph that all need to be
       | calculated in parallel it was not so quick and more prone to
       | error.
       | 
       | Programmable calculators could then be used to save time on some
       | of the purely numerical work for the high-data runs, and this is
       | about like what would become the final part of the built-in
       | workflow for these early application-specific computerized gas
       | workstations.
       | 
       | Offices didn't have PCs yet, just typewriters, copiers, filing
       | cabinets and simple calculators.
       | 
       | For laboratories the built-in gas report printers were
       | leapfrogging technology with each generation, the inkjet first
       | appeared in a HP model for the instrument lab before they began
       | to apply it to office equipment.
       | 
       | The evolution has been interesting.
       | 
       | One particular series of late 1970's design benchtop data system
       | was multi-user and multi-tasking way before Windows. Like others
       | it took in the raw analog signal (still do but they're "black
       | box" interfaces connected to PC's for further calculation now)
       | from the benchtop gas analyzer in real time and produced digital
       | data from the curves on the graph. This electronically
       | accomplished that first manual procedure of measuring the curve
       | to begin with.
       | 
       | From that point on it's all calculation.
       | 
       | Each gas run was autosaved in memory as a file having an
       | automatically-assigned file number, no more than 3 digits. There
       | was no way anybody had enough memory to record the entire analog
       | signal at the time so this was just the key geometric features in
       | digital form.
       | 
       | There was no file system.
       | 
       | Today on PCs the entire analog gas signal is recorded to HDD in
       | digital format routinely. And people still lose files all the
       | time, don't ask me how I know.
       | 
       | But for that antique data system there was no storage other than
       | memory absolutely required, many bare-bones consoles were issued
       | without the optional micro-cassette tape drive. The tape was
       | quite slow and best for long-term storage or disaster recovery.
       | An optional COM port became available years later which was
       | faster, but there was no workflow change for most users. A great
       | deal of fairly advanced work could be accomplished by many
       | without need for any storage device I/O over a period of years.
       | Just a one-page printout about each gas destined for the physical
       | filing cabinet.
       | 
       | You never were supposed to turn the console off. Except whenever
       | you wanted to! The major use case is 24/7 so there were expensive
       | battery backup units to protect from short- or long-term power
       | failures, which only preserved the memory. If the power failed
       | the battery kept everything the same and you only lost a gas run
       | if it was in progress and had not been saved in memory.
       | 
       | But you could also power down the console from the main switch
       | any time and its high-power-consuming hardware, processor, and
       | power-supply would go cold. Like when you're done for the day,
       | the gas runs are over, and you were finished sitting at the
       | console. The memory alone would still be powered by the battery
       | backup unit without depleting the battery at all unless the
       | backup unit itself lost power. Under backup power you could also
       | change any circuit board other than the dedicated memory PCB,
       | including the main power supply, and it would power back up with
       | the memory intact.
       | 
       | The OS was immutable and in ROM where it should be.
       | 
       | Also the default OS performance upon cold power-up was adequate
       | for many simple gas tests without need for user programs or
       | reference data files to be loaded beforehand. For many operators
       | a few minutes of manual parameter entry gave full disaster
       | recovery. So those guys didn't even need the expensive batteries,
       | tapes or COM port. It was basically application-specific enough
       | for the simple stuff already.
       | 
       | You could then extend that with optional user programs to more
       | complex automation & calculations but only up to the limitations
       | of memory.
       | 
       | User programs ended up as file numbers too so you were your own
       | file system.
       | 
       | And to just get a truly worthwhile file listing, you had to write
       | your own program.
       | 
       | Glad there were only 3 digits.
       | 
       | These things did achieve nearly zero show-stoppers per year on
       | average, unlike modern PC systems which are in the handfuls at
       | minimum and dozens for many operators on average.
       | 
       | There was no booting, the OS ran from ROM and the memory was
       | accessed as needed from there. Optional storage devices were only
       | needed for user items, the OS did not require any files from
       | storage, and only one file present by default in memory. Simply
       | its own memory work area.
        
       | clintonwoo wrote:
       | If the computer has non volatile memory then how do you turn it
       | off and on again to fix the bugs in software? Hope we don't have
       | to say rest in peace to this beautiful piece of troubleshooting
       | procedure
        
         | Someone wrote:
         | The same way you do it now. Most computers already have non-
         | volatile memory, typically a tiny bit for such things as boot
         | parameters and an enormous amount in the form of disk storage.
         | 
         | If, today, your disk is corrupt, you tell your OS to boot in
         | single-user mode and run _fsck_. If that doesn't work, you fall
         | back a level and reformat a disk.
         | 
         | I don't see why a device where all storage is directly
         | addressable would be different.
        
         | e2le wrote:
         | Some kind of overlayfs type solution? With a non-writeable
         | "default" lower state and a writeable upper state where the
         | changes occur, removing the upper state would effectively reset
         | the system.
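
The layering e2le suggests can be modelled with two mappings: a read-only lower layer and a writable upper layer that shadows it. The sketch below uses `collections.ChainMap` purely as an analogy; the class name `OverlayState` and the example keys are hypothetical, and real overlayfs also handles deletions (whiteouts), which this omits.

```python
from collections import ChainMap
from types import MappingProxyType


class OverlayState:
    """Read-only defaults below, writable changes on top."""

    def __init__(self, defaults):
        self._lower = MappingProxyType(dict(defaults))  # cannot be written
        self._upper = {}                                # all changes land here
        self._view = ChainMap(self._upper, self._lower)

    def get(self, key):
        # Lookups check the upper layer first, then fall through.
        return self._view[key]

    def set(self, key, value):
        self._upper[key] = value

    def reset(self):
        """Dropping the upper layer is the 'turn it off and on again'."""
        self._upper.clear()


if __name__ == "__main__":
    state = OverlayState({"hostname": "default", "theme": "light"})
    state.set("theme", "dark")
    print(state.get("theme"))
    state.reset()
    print(state.get("theme"))
```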
        
           | clintonwoo wrote:
           | Fair call, sounds like an immutable OS like CoreOS where
           | updating it requires a reboot. That's possibly all that's
           | required!
        
             | a9h74j wrote:
             | IMHO something was lost when hard drives no longer had
             | their own write-protect switch (remembering back to DEC
             | rack-mount spinning drives).
        
         | IshKebab wrote:
         | Same way you fix errors with persistent caches. You clear them.
        
       | oneplane wrote:
       | As wonderful as ideas are, change from technology only becomes
       | change in availability. Or, in simpler terms: it's not always
       | about what is possible, and quite often it's about what is needed
       | instead.
        
       | 2OEH8eoCRo0 wrote:
       | This is great. Ties together a lot of ideas that I've had but
       | either can't properly express or lack the knowledge to execute
       | on.
        
       | LichenStone wrote:
       | This reminds me of some Bret Victor talks.
        
       | PaulDavisThe1st wrote:
       | 1) The premise laid out at the link is very interesting, and one
       | I have wondered about for more than a decade without reaching any
       | conclusions. So much code in a modern kernel is predicated on the
       | idea that if the data is on "disk" it will be slow and we should
       | do something else while it arrives. What does kernel design look
       | like when that's just not true anymore?
       | 
       | 2) Then I went to the slides, and found what appeared to be a
       | talk about programming languages and desktop environments.
       | 
       | 3) Filesystems are often thought of as ways of finding data on
       | some storage medium, and that's not wrong. But _files_ are also a
       | way to _organize_ data, largely independently of the technical
       | details of the filesystem itself.
        
       | gjvc wrote:
       | This talk is a wonderful 50-minute summary of computing over the
       | last ~50 years with a couple of ideas for future directions, and
       | I recommend every serious computerist watch it.
        
       | karmakaze wrote:
       | We're talking about losing the constraint of filesystems that
       | will burden future development.
       | 
       | At the same time we're clinging to programming using imperative
       | thinking that came from the first machine language programs ever
       | written. We should be at a point that we can think and write more
       | declaratively even when executed sequentially. The number of cores
       | has continued to go up and any language that doesn't parallelize
       | well isn't worth investing in for the long term.
        
       | tux1968 wrote:
       | IMHO, the ideas about non-volatile memory are a distraction and
       | not a very fundamental or interesting detail of future systems.
       | But the core proposal of combining https://squeak.org/ on top of
       | https://oberon.org is really quite an exciting idea and could be
       | a lot of FOSS fun.
        
         | jll29 wrote:
         | As he says, Squeak already runs on bare metal, so why have a
         | version with Oberon below? He himself admits in the talk that a
         | system as he proposes would have two languages rather than one
         | (i.e., go against the spirit of SmallTalk systems and LISP
         | machines that he praises earlier in his talk).
         | 
         | There are Smalltalk ports on top of LuaJIT and Oberon ports on
         | top of LuaJIT, so there is at least one reality where the two
         | languages sit side-by-side (rather than on top of each other as
         | the talk proposes); see Michael Engel's talk on Vimeo for more
         | details.
        
       | Animats wrote:
       | _When a computer's permanent storage is all right there in the
       | processors' memory map, there is no need for disk controllers or
       | filesystems. It's all just RAM._
       | 
       | And errors are forever.
       | 
       | There were once LISP, Smalltalk, and Forth environments where you
       | saved the entire state of the system, rather than using files.
       | This was not a good thing. Only one person could really work on
       | something. If you messed up the image, undoing the mess was hard.
       | Revision control? What's that?
       | 
       | Progress has been made by learning how to minimize and discard
       | state. What makes the web backend application industry go is that
       | almost all the state is in the database. Database systems are
       | well debugged and reliable today. This allows for mediocre
       | quality in web apps.
       | 
       | Containers are another example of discarding state. Don't
       | sysadmin, just flush and reload. So is the transition from
       | object-oriented to functional programming.
       | 
       | No, making memory persistent is not the answer.
       | 
       | We do need to be thinking about new OS designs, because the
       | hardware world is changing.
       | 
       | - Lots of RAM. Swapping out to disk, even SSD, is so last-cen.
       | The moment that happens performance degrades so badly you may as
       | well restart.
       | 
       | - Lots of CPUs. There are now 128-core CPUs. NVidia just
       | announced a 144-core ARM CPU. The OS and hardware need to be
       | really serious about interprocess communication. It's an
       | afterthought in Unix/Linux. Needs to be at least as good as QNX.
       | Probably better, and with hardware support.
       | 
       | - Different kinds of compute units. GPUs need to be first-class
       | players in the OS. There are going to be more special purpose
       | devices, for machine learning.
       | 
       | - SSD is fast enough that buffering SSD in RAM is mostly
       | unnecessary.
        
         | jandrewrogers wrote:
         | > Lots of RAM. Swapping out to disk, even SSD, is so last-cen.
         | The moment that happens performance degrades so badly you may
         | as well restart.
         | 
         | RAM is the new disk. These days we try to make codes live out
         | of CPU cache as much as possible, scheduling the RAM access the
         | way we used to schedule disk access, which is effective -- plus
         | ca change, plus c'est la meme chose.
         | 
         | > SSD is fast enough that buffering SSD in RAM is mostly
         | unnecessary.
         | 
         | This is not even approximately true in practice for many
         | applications. Striping an array of NVMe storage devices is
         | _still_ about an order of magnitude less bandwidth than RAM.
         | Storage I/O is also high latency and highly concurrent; if
         | your storage I/O is zero-copy then it implies locking up
         | considerable amounts of RAM just to schedule the I/O
         | transactions safely, never mind caching. Modern database
         | kernels require clever I/O schedulers and large amounts of RAM
         | to keep throughput from degrading to the throughput of even
         | very fast storage hardware. And this assumes your storage code
         | is aware of the peculiarities of SSD (inconvenient mix of block
         | sizes, no overwrite, garbage collection, etc).
         | 
         | That said, you obliquely raise a valid point. Buffering storage
         | in a cache usually doesn't work well once the storage:cache
         | ratios become high enough, and the 1000:1 ratios seen in modern
         | servers are well past that threshold. In these cases, you still
         | need a lot of RAM but you end up redeploying it to other parts
         | of the kernel architecture where it can be more effective than
         | a buffer cache for storage.
        
         | zasdffaa wrote:
         | > SSD is fast enough that buffering SSD in RAM is mostly
         | unnecessary.
         | 
         | A very strong claim, which I'd contest depending on what you're
         | caching. If it's application code then probably but if you've
         | lots of data which you're expecting to re-read such as with a
         | database with many gigs, I really disagree with that. If an SSD
         | had say a 1 gig/sec bandwidth for reading that's about 22 times
         | slower than my arthritic 7-year-old dual-channel RAM bandwidth[1].
         | A more modern higher spec server will have more channels (and
         | hopefully more modern, faster ram).
         | 
         | Can you elaborate a bit on your thoughts?
         | 
         | [1] and the path to ram may not have as many page faults ->
         | kernel switches as reading from an SSD, but I may be wrong and
         | it may not matter
        
           | kragen wrote:
           | https://www.club386.com/silicon-motion-pcie-5-0-nvme-ssds-
           | pr... says 14 GB/s.
           | 
           | And of course the sum of bandwidths within a set of NVDIMMs
           | can easily be much higher than that.
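
To put the figures in this sub-thread side by side: zasdffaa assumes a 1 GB/s SSD and RAM about 22 times faster, and kragen cites 14 GB/s for a PCIe 5.0 SSD. The arithmetic below estimates how long one sequential pass over a dataset would take at each rate. The 64 GB dataset size is an assumption chosen for illustration, and the model ignores latency, queueing, and page-fault overhead entirely.

```python
def scan_seconds(dataset_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time for one sequential read of the dataset at a given bandwidth."""
    return dataset_gb / bandwidth_gb_per_s


# Bandwidths quoted in the thread (GB/s); RAM is 22 x the 1 GB/s SSD figure.
BANDWIDTHS = {
    "SSD at 1 GB/s": 1.0,
    "PCIe 5.0 SSD at 14 GB/s": 14.0,
    "dual-channel RAM at ~22 GB/s": 22.0,
}

DATASET_GB = 64.0  # assumed size, not from the thread

if __name__ == "__main__":
    for name, bandwidth in BANDWIDTHS.items():
        print(f"{name}: {scan_seconds(DATASET_GB, bandwidth):.1f} s")
```

On these numbers the newest SSD closes most of the bandwidth gap to a seven-year-old desktop, which is consistent with both commenters: caching re-read data in RAM still helps, but by a much smaller factor than it used to.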
        
         | a9h74j wrote:
         | > GPUs need to be first-class players in the OS
         | 
         | I've been starting to wonder, taking a crazy-stupid extreme,
         | whether a GPU unit should be in charge of booting. Perhaps you
         | boot into a graphical interpeter (like a boot into BASIC of
         | old) which can then implement a menu (or have "autoexec.bas")
         | to select a heavyweight OS to boot into.
        
         | kragen wrote:
         | Multiuser systems with orthogonal persistence, like the single-
         | user Smalltalk and image-based Lisps you mention, include
         | EUMEL, L3, KeyKOS, OS/400 (iSeries), and I think Multics. (The
         | Forths I know of were never image-based, so I can't comment on
         | them.) It is definitely not the case that "only one person
         | could really work on something [at a time]" on EUMEL, L3,
         | KeyKOS, and OS/400. Maybe only one person could hack on the TCB
         | at a time, but EUMEL, L3, KeyKOS, and OS/400 had extremely tiny
         | TCBs, for that reason among others.
         | 
         | Having a single-level store doesn't save you from minimizing
         | and discarding state. It just (mostly) decouples discarding
         | state from power failures, making it intentional rather than
         | accidental. In today's battery-powered environments, this is
         | probably a better idea than ever. KeyKOS and EUMEL still had
         | named files, containing sequences of bytes, in directories.
         | They just weren't coupled to the disk device drivers. OS/400
         | has files, too. Smalltalk and Lisp environments were more
         | radical in this sense.
         | 
         | Also, Smalltalk has had version control within its images for
         | decades.
         | 
         | Stephen White and Pavel Curtis's MOO also had transparent
         | persistence, didn't have files, and supported massively
         | multiuser access; possibly a MOO server has supported more
         | concurrent users than any AS/400, and certainly more than any
         | EUMEL, L3, or KeyKOS system ever did. (LambdaMOO has 48 users
         | online right now according to http://mudstats.com/Browse but
         | had many more at its peak.) But, while MOO was programmable, it
         | never supported high-performance or high-reliability computing
         | as those other systems did. You could think of it as being
         | written in a groupware scripting DSL, like a centralized Lotus
         | Notes. The most complex MOO software I know of was a Gopher
         | client, a "Gopher slate" that multiple people could use
         | simultaneously. It used cooperative multitasking with timeouts
         | that would abort long-running event handlers without rolling
         | back their incomplete effects.
         | 
         | I agree, though, that a single-level store on its own doesn't
         | solve the problems OSes are facing. But it might be useful. We
         | have many layers of software on top of everything that accesses
         | persistent data being built on the assumption that anything
         | persistent is necessarily slow enough to cover millions of
         | nanoseconds of processing. That greatly reduces the benefits of
         | the enormous improvement SSDs represent over spinning rust.
         | 
         | I'd say that buffering SSD in RAM goes beyond "mostly
         | unnecessary": it needs to be done in hardware (as with NVDIMMs)
         | to not _hurt_ performance in many cases. FRAM is even more
         | extreme on this axis.
         | 
         | Another interesting question to me is migration. If I'm editing
         | a Jupyter notebook on my laptop, it'd be nice to move it onto
         | my cellphone and keep editing it when I travel, and it'd be
         | nice to move the computations onto heavy servers when I'm
         | connected to the internet. Lacking this, we seem to be moving
         | to warehouse-scale computing: everybody runs all their
         | notebooks on Google App Engine so they can use TPUs and access
         | them from anywhere.
         | 
         | Related to migration is UI remoting. Why can't I use two
         | cellphones together to access the same app, or two cellphones
         | and a mouse as I/O devices to the same music synthesizer
         | (running on my laptop)? Accelerometers and multitouch could be
         | useful for a lot of things. Maybe ROS (the Willow Garage one,
         | not the BBS software) has the right approach here?
         | 
         | Also, why is everything so goddamn slow and unreliable? Ivan
         | Sutherland had a VR wireframe cube running on a head-mounted
         | display more than 50 years ago. Why are these supercomputers in
         | our pockets still too slow to run VR with low enough latency
         | that people don't throw up? Why does it take multiple minutes
         | to start up again if its battery runs down and I plug it in?
         | Why does the alarm app on my cellphone refuse to launch when
         | its Flash fills up? Why does my laptop become unresponsive and
         | require rebooting when a web page uses too much RAM?
         | 
         |  _shakes fist at cloud_
        
           | zozbot234 wrote:
           | > It just (mostly) decouples discarding state from power
           | failures, making it intentional rather than accidental.
           | 
           | Just wondering, but how is an OS supposed to achieve this
           | without fsync()ing all writes to memory and being slow as
           | dog$h!t? Having a "single level" of storage means it becomes
           | harder to tell what might be purely ephemeral, and can thus
           | be disregarded altogether if RAM or disk caches are lost in a
           | failure. You might think you can do this via new "volatile
           | segments" API's but then you're just reintroducing the
           | volatile storage that you were trying to get rid of.
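
The cost zozbot234 is worried about is easy to observe on a conventional OS. The sketch below compares ordinary buffered writes with writes that call `os.fsync()` before returning. The helper names, file count, and payload size are arbitrary choices for illustration, and the timings depend heavily on the storage device and filesystem.

```python
import os
import tempfile
import time


def write_buffered(path, data):
    """Return as soon as the data is in the OS page cache."""
    with open(path, "wb") as f:
        f.write(data)


def write_durable(path, data):
    """Return only after the OS reports the data is on stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # push the OS cache to the device


def time_writes(writer, count=200, payload=b"x" * 4096):
    """Total seconds to write `count` small files with the given writer."""
    directory = tempfile.mkdtemp()
    start = time.perf_counter()
    for i in range(count):
        writer(os.path.join(directory, f"{i}.bin"), payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"buffered: {time_writes(write_buffered):.4f} s")
    print(f"durable:  {time_writes(write_durable):.4f} s")
```

A single-level store has to get the durability of the second function at something closer to the cost of the first, which is why the reply asks how the system would know what is safe to leave unflushed.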
        
       | marginalia_nu wrote:
       | Eh, from hands on experience with Optane drives I'm very
       | skeptical. They're still very far removed from RAM, even if they
       | are good compared to SSDs. Like maybe next to RAM in the 1980s
       | they seem about equivalent. Modern RAM has absurd bandwidth.
        
       ___________________________________________________________________
       (page generated 2022-03-26 23:00 UTC)