[HN Gopher] What if mass storage were free? (1980)
       ___________________________________________________________________
        
       What if mass storage were free? (1980)
        
       Author : dwenzek
       Score  : 44 points
       Date   : 2023-02-15 12:56 UTC (10 hours ago)
        
 (HTM) web link (dl.acm.org)
 (TXT) w3m dump (dl.acm.org)
        
       | tpmx wrote:
       | We now (2023) live in a time where storing years of text and even
       | audio is essentially free. Storing years of video is still
       | actually costly.
       | 
       | Btw: You need about 12 TB for a 1 year video stream at 3 Mbit/s,
       | so it's certainly doable, but it's not cheap.
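
The 12 TB figure above can be checked with a few lines of arithmetic. This is a rough sketch that assumes decimal units (1 Mbit = 10^6 bits, 1 TB = 10^12 bytes), a constant bitrate, and a 365-day year; `stream_bytes` is a name made up for illustration.

```python
def stream_bytes(bitrate_mbit_per_s: float, days: float = 365.0) -> float:
    """Bytes needed to store a constant-bitrate stream for `days` days."""
    bits_per_second = bitrate_mbit_per_s * 1_000_000  # decimal megabits
    seconds = days * 24 * 60 * 60
    return bits_per_second * seconds / 8  # 8 bits per byte


terabytes = stream_bytes(3) / 1e12  # decimal terabytes
print(f"{terabytes:.1f} TB")  # 11.8 TB
```

So "about 12 TB" holds under these assumptions; a variable-bitrate codec or binary units (TiB) would shift the number somewhat.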
        
       | Hooray_Darakian wrote:
       | > Optical discs promise to come one to two orders of magnitude
       | closer to the limiting case of free mass storage than ever
       | before. Other features of optical discs include improved
       | reliability and a single technology for both on-line and archival
       | storage with a long shelf life. Because of these features and
       | because of (not in spite of) their non-deletion limitation, it is
       | argued that optical discs fit the requirements of database
       | systems better than magnetic discs and tapes.
       | 
       | Wild view from where we sit today, but CDs were ~700MB in 1982.
       | Seagate launched a 5MB hard drive in 1980 so.... not entirely
       | absurd to think that `just don't delete things` could be the way
       | of the future. We sorta adopted `just don't delete things` anyway
       | though not with respect to RDBMS systems.
       | 
       | Thanks for sharing!
        
         | PaulDavisThe1st wrote:
         | 1988: Schlumberger Cambridge Research takes possession of a new
         | 1MB drive to be added to its VAXcluster. The drive is the size
         | of ... a small refrigerator. It was quite a day!
        
           | WalterBright wrote:
           | At Aph we had 10Mb drives in 1978.
        
           | gruturo wrote:
           | In Winter 1987 I had an Amstrad PC1640HDMD with a 20MB MFM
           | hard disk. I opened the case many times, the disk was just a
           | 5.25 inch device, certainly not like 20 small refrigerators,
           | or even one.
           | 
           | Bonus: I got an RLL controller and turned it into a 30MB hard
           | disk! Couldn't believe it. But getting the interleaving right
           | was time consuming..
        
           | berkut wrote:
           | I suspect that was really 1 GB?
           | 
           | 3.5" HDs of over 20 MB were around in IBM PCs at the
           | time.
        
           | ta1243 wrote:
           | Did you really mean Nineteen-Eighty-Eight?
           | 
           | In PC Magazine from July 1988 there is an advert for a 15MHz
           | XT for $575 with an optional 30MB Seagate ST238 5.25" scsi
           | hard drive inside for an extra $295 [0]
           | 
           | The price hasn't dropped much since, it's now $206 for the
           | drive [1]
           | 
           | [0] https://archive.org/details/PC-
           | Mag-1988-07-01/page/22/mode/2...
           | 
           | [1] https://www.amazon.com/ST238R-Seagate-3600RPM-Internal-
           | Drive...
        
         | joering2 wrote:
         | I recall the 10MB ad more.
         | 
         | https://i.pinimg.com/originals/0d/b3/b6/0db3b67dcdd2edbedd5c...
         | 
         | Classic!
        
         | ocal5 wrote:
         | Isn't that the way of _Glacier_?
        
       | bob1029 wrote:
       | If mass storage were free, then everything would be append-only
       | by default. There would be no excuse to not do this.
       | 
       | A major benefit of append-only is that your writes are always
       | ideal for whatever storage medium. Especially magnetic or tape.
       | Combine append-only with batching of transactions (i.e. across
       | 1-10 milliseconds at a time), and you can write multiple txns per
       | disk I/O operation (assuming txn size < storage block size).
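
A minimal sketch of the batching idea described above, with some assumptions: records are length-prefixed, a batch is flushed when it would outgrow one block (or on an explicit `flush()`) rather than on a 1-10 ms timer, and the class name `AppendOnlyLog` and the 4096-byte default block size are made up for illustration.

```python
import os
import tempfile


class AppendOnlyLog:
    """Buffers records and appends each batch with a single write call."""

    def __init__(self, path: str, block_size: int = 4096):
        self.path = path
        self.block_size = block_size  # assumed storage block size
        self.pending = []             # framed records waiting for the next flush
        self.pending_bytes = 0
        self.writes = 0               # number of write calls issued so far

    def append(self, record: bytes) -> None:
        # Frame each record with a 4-byte big-endian length prefix.
        framed = len(record).to_bytes(4, "big") + record
        # Flush first if this record would push the batch past one block.
        if self.pending and self.pending_bytes + len(framed) > self.block_size:
            self.flush()
        self.pending.append(framed)
        self.pending_bytes += len(framed)

    def flush(self) -> None:
        if not self.pending:
            return
        # Mode "ab" only ever adds to the end of the file.
        with open(self.path, "ab") as f:
            f.write(b"".join(self.pending))
            f.flush()
            os.fsync(f.fileno())
        self.writes += 1
        self.pending.clear()
        self.pending_bytes = 0

    def read_all(self) -> list:
        """Return every flushed record, in the order it was appended."""
        with open(self.path, "rb") as f:
            data = f.read()
        records, pos = [], 0
        while pos < len(data):
            size = int.from_bytes(data[pos:pos + 4], "big")
            records.append(data[pos + 4:pos + 4 + size])
            pos += 4 + size
        return records


with tempfile.TemporaryDirectory() as tmp:
    log = AppendOnlyLog(os.path.join(tmp, "txn.log"))
    for i in range(3):
        log.append(f"txn-{i}".encode())
    log.flush()
    print(len(log.read_all()), "txns in", log.writes, "write(s)")
```

A real system would more likely flush on a timer or when a commit is requested; the size trigger here just keeps the sketch deterministic.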
        
         | kragen wrote:
         | what if you accidentally wrote your private key, photos of your
         | nude boyfriend, or evidence of a crime to your mass storage
        
           | sharkjacobs wrote:
           | Then you would delete it
           | 
           | > everything would be append-only by default
           | 
           | doesn't mean everything is permanently etched into stone or
           | written on the blockchain, it just means that "by default"
           | everything you write is written to a new block[1], instead of
           | having to free up old blocks to reuse and keep track of which
           | blocks of storage are available
           | 
           | [1]"block" is just meant as a generic unit of storage, I'm
           | not trying to say anything about actual drive blocks and
           | implementation details
        
             | sn0wf1re wrote:
             | Even if it is etched in stone (or laser blasted into
             | crystal), nothing a chisel and a hammer can't undo. :)
        
           | bitwize wrote:
           | The blocks with those would be scribbled over and
           | invalidated.
           | 
           | Do you know why ASCII 0x7f is DEL? Paper tape is write-once.
           | To indicate a deleted character, it was conventional to punch
           | holes in all bit positions -- 0x7f on a 7-bit punch.
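
The paper-tape point above can be sketched in a few lines: punching can only add holes, so overwriting behaves like a bitwise OR, and OR-ing any 7-bit code with 0x7f gives 0x7f. The helper names (`overpunch`, `rub_out`, `read_tape`) are made up for illustration.

```python
DEL = 0x7F  # all seven holes punched


def overpunch(existing: int, holes: int) -> int:
    """Punching can only add holes, so the result is a bitwise OR."""
    return (existing | holes) & 0x7F


def rub_out(tape: list, index: int) -> None:
    """'Delete' one character by punching out every bit position."""
    tape[index] = overpunch(tape[index], DEL)


def read_tape(tape: list) -> str:
    """Readers skip DEL, so rubbed-out characters vanish from the text."""
    return "".join(chr(code) for code in tape if code != DEL)


tape = [ord(c) for c in "HELXLO"]
rub_out(tape, 3)  # the stray X
print(read_tape(tape))  # HELLO
```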
        
       | pclmulqdq wrote:
       | It's interesting that we have almost started to live in this
       | world. I have a half-written blog post on this phenomenon but I
       | guess I'm 45 years too late.
       | 
       | Interestingly, Google and Facebook seem to have basically done it
       | right with their exascale filesystems. The same with object
       | stores.
        
       | fmajid wrote:
       | Leo Szilard's solution to the problem of Maxwell's Demon was that
       | the act of deleting data the demon must perform is the
       | thermodynamically limiting factor. Deleting selective data
       | efficiently is in fact one of the greatest challenges in large
       | production databases, and in an era of increasing privacy
       | restrictions like GDPR's right to deletion, an increasing
       | challenge for database operators.
        
       | MisterTea wrote:
       | Plan 9 implemented this concept in the worm cached file server,
       | one of the on-disk file systems used in Plan 9. The idea was you
       | have a disk based cache and a WORM (write once, read many) dump
       | consisting of optical juke boxes. Writes to the fs are stored in
       | the cache until the fs is dumped to worm, manually or on a
       | schedule (hard-coded to do this at 2am every night.)
       | http://man.9front.org/4/cwfs
       | 
       | The idea was to reduce the cost of storage by removing long-term
       | data from costly hard disks and storing it on cheap magneto-
       | optical disks which, like CDs, could be stored in an automated
       | jukebox. Write all the data you want to the cache, then commit
       | to worm. As the worm fills, you just buy another disk and put it
       | in the jukebox. The history(1) command then gives you a file's
       | history as a set of paths you can bind over another path to use
       | an old version of a file instead of copying it. It's really a
       | file system for programmers.
       | http://doc.cat-v.org/plan_9/4th_edition/papers/fs/
       | 
       | This idea was expanded on with Venti/Fossil which allows you to
       | build file systems from arbitrary venti data sets.
       | http://doc.cat-v.org/plan_9/4th_edition/papers/venti/
        
       | usrusr wrote:
       | Reminds me of the various "what if all memory was non-volatile"
       | that made the rounds when Intel Optane entered the stage. A bit
       | like the inverse of this, but the caveats might turn out similar:
       | in one case you'd still want a well-defined resettable area, in
       | the other case you'd still want to avoid having to deal with
       | arbitrarily long addresses which would at some point become as
       | bad as seek times even if hypothetically seek times in the
       | stricter sense did not exist.
        
       | dwenzek wrote:
       | I just found this quite old paper and it came as a surprise to me
       | to discover that the idea of append-only storage is not 20 years
       | old but more than 40!
       | 
       | The older work I was aware of is on "The design and
       | implementation of a log-structured file system" (1)
       | 
       | So it is with pleasure that I learned that these ideas were
       | around in the 80s:
       | 
       | - Deletion considered harmful
       | 
       | - A non-deletion strategy using timestamps
       | 
       | - The importance of accessing past data
       | 
       | - A non-deletion strategy can improve both integrity and
       | reliability
       | 
       | (1) https://dl.acm.org/doi/10.1145/146941.146943
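
One way to picture the "non-deletion strategy using timestamps" bullet above is a store where every write adds a timestamped version and a delete is just another version. This is only a sketch under that reading, not the paper's design; `TemporalStore`, `TOMBSTONE`, and the integer timestamps are made up for illustration.

```python
import bisect

TOMBSTONE = object()  # marks "deleted as of this timestamp"


class TemporalStore:
    """Keeps every version of every key; nothing is overwritten."""

    def __init__(self):
        self._times = {}   # key -> sorted list of timestamps
        self._values = {}  # key -> values aligned with _times[key]

    def put(self, key, value, ts):
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        i = bisect.bisect_right(times, ts)  # keep versions sorted by time
        times.insert(i, ts)
        values.insert(i, value)

    def delete(self, key, ts):
        # A delete adds a version too, so the past stays readable.
        self.put(key, TOMBSTONE, ts)

    def get(self, key, as_of=float("inf")):
        """Return the value visible at time `as_of` (latest by default)."""
        times = self._times.get(key, [])
        i = bisect.bisect_right(times, as_of)
        if i == 0:
            return None  # key did not exist yet
        value = self._values[key][i - 1]
        return None if value is TOMBSTONE else value


store = TemporalStore()
store.put("balance", 100, ts=1)
store.put("balance", 70, ts=2)
store.delete("balance", ts=3)
print(store.get("balance", as_of=1), store.get("balance", as_of=2), store.get("balance"))
# 100 70 None
```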
        
         | mouse_ wrote:
         | I'm fairly certain data and records have been sewn into
         | tapestries for thousands of years.
        
         | kragen wrote:
         | the idea of append-only storage is surely older than pacioli
        
         | 082349872349872 wrote:
         | Paper-based accounting was append-only, so I think the idea's
         | always been there but was uneconomic in machine readable media
         | for a long time.
         | 
         | (in particular, "new master = old master + updates" card/tape
         | jobs were in principle append-only but --due to finite number
         | of tapes-- in practice overwriting)
        
           | cperciva wrote:
           | Not at all! Medieval documents were routinely washed and
           | reused.
        
             | dmurray wrote:
             | And wax tablets, slates, etc. Pedantically, I suppose
             | neither these nor parchment is really paper.
        
             | anamexis wrote:
             | I don't think that was their point. If you went to a bank
             | in the 1940s and made a withdrawal, they wouldn't pull your
             | account slip, erase the balance, and write in the new one.
             | They would add a new line to the ledger noting a new
             | balance. This is by design.
        
               | WalterBright wrote:
               | Double-entry bookkeeping was a great advance.
        
               | ezekiel68 wrote:
               | And -- in keeping with the flow of this thread -- a
               | complete luxury until the implements
               | (paper/pens/tape/disk/silicon) would become abundant and
               | ubiquitous.
        
             | itsmartapuntocm wrote:
             | More recently, movie studios regularly destroyed old film
             | to make space for new ones, causing old pictures,
             | especially silent era ones, to be lost forever.
             | 
             | Even NASA wiped the original Apollo 11 tapes to reuse them.
        
             | zwirbl wrote:
             | A reused manuscript page is called a palimpsest and often
             | the scraped off text can be recovered. Some of Archimedes'
             | writings actually survived this way
             | https://en.wikipedia.org/wiki/Archimedes_Palimpsest
        
               | unwind wrote:
               | Obligatory shout-out to the novella of the same name, by
               | HN user (and, obviously, Actual Author) Charles Stross.
               | 
               | [1]: https://en.wikipedia.org/wiki/Palimpsest_(novella)
        
         | refset wrote:
         | The topic of history in databases seems to go most of
         | the way back to the beginning of the field. I'm still hoping a
         | copy of "Bubenko (1977) The Temporal Dimension in Information
         | Modelling" turns up on the web eventually as I'd love to read
         | it.
         | 
         | The 1980 paper you linked is touched on briefly at the
         | beginning of this Strange Loop talk on "Light and Adaptive
         | Indexing for Immutable Databases (2022)":
         | https://www.youtube.com/watch?v=Px-7TlceM5A
        
         | codemac wrote:
         | Sadly 1992 is 31 years ago. The authors pushed for log
         | structured filesystems in an earlier paper in 1988 : Beating
         | the I/O Bottleneck
         | https://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5760.html .
         | It was inspiration for many storage appliances, NetApp probably
         | being a very strong example.
         | 
         | Though many were thinking about these ideas in the 88-92
         | timeframe, as Tape storage systems are roughly speaking append
         | only, so lots of the ideas of a logged filesystem are around
         | the increased random read from disk drives.
        
         | oakwhiz wrote:
         | A non-deletion strategy should consider including an encryption
         | and key management strategy to enable retroactive secure
         | deletion without impacting availability, reliability, and
         | performance. This seems to be missing from a lot of systems
         | that deal with personal information.
        
           | spelunker wrote:
           | also called crypto shredding! We had this issue trying to
           | square GDPR-type things with an append-only store.
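
A rough sketch of crypto shredding as described above: records in the append-only store are encrypted with a per-subject key, and "deleting" a subject means discarding only that key. Loud assumptions: the SHAKE-256 keystream XOR is a toy stand-in so the example runs with the standard library alone (a real system would use an authenticated cipher such as AES-GCM and a proper key management service), and `ShreddableLog` and its methods are made up for illustration.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only: XOR with a SHAKE-256 keystream."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class ShreddableLog:
    def __init__(self):
        self.records = []  # append-only: (subject, key_id, nonce, ciphertext)
        self.keys = {}     # mutable key store: key_id -> key
        self.current = {}  # subject -> key_id currently in use

    def append(self, subject: str, plaintext: bytes) -> None:
        key_id = self.current.get(subject)
        if key_id is None:
            # First write for this subject (or first since a shred): new key.
            key_id = secrets.token_hex(8)
            self.keys[key_id] = secrets.token_bytes(32)
            self.current[subject] = key_id
        nonce = secrets.token_bytes(16)  # fresh nonce per record
        ciphertext = _keystream_xor(self.keys[key_id], nonce, plaintext)
        self.records.append((subject, key_id, nonce, ciphertext))

    def read(self, subject: str) -> list:
        """Decrypt the subject's records whose key still exists."""
        return [
            _keystream_xor(self.keys[key_id], nonce, ciphertext)
            for s, key_id, nonce, ciphertext in self.records
            if s == subject and key_id in self.keys
        ]

    def shred(self, subject: str) -> None:
        """Discard the key; the ciphertext stays in the log but is unreadable."""
        key_id = self.current.pop(subject, None)
        self.keys.pop(key_id, None)


log = ShreddableLog()
log.append("alice", b"alice@example.com")
log.append("bob", b"bob@example.com")
log.shred("alice")
print(log.read("alice"), log.read("bob"), len(log.records))
# [] [b'bob@example.com'] 2
```

The log itself is never rewritten; only the small key store needs ordinary delete semantics, which is the trade-off the comments above are pointing at.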
        
       ___________________________________________________________________
       (page generated 2023-02-15 23:01 UTC)