[HN Gopher] Pstore: Ruby Built-In Hash Persistence
       ___________________________________________________________________
        
       Pstore: Ruby Built-In Hash Persistence
        
       Author : hstaab
       Score  : 84 points
       Date   : 2022-09-20 10:22 UTC (2 days ago)
        
 (HTM) web link (github.com)
        
       | aartav wrote:
       | PStore has been built into Ruby's stdlib for as long as Ruby
       | has existed, so _over_ 20 years.
        
         | mperham wrote:
         | I'm assuming it pre-dates Rubygems because it really should be
         | a gem. I can't speak for Japan but few people in the Western
         | world seem to use it.
        
           | brunno wrote:
           | There was a time when some stuff was being extracted
           | (removed) from Ruby core and turned into gems, and I really
           | thought PStore and YAML::Store were going to be among those,
           | but no, they decided to keep them in core. So maybe there are
           | some important enough use cases that justify them staying.
           | 
           | Or maybe it would be a hard task that didn't justify the
           | effort.
        
             | byroot wrote:
             | Many parts of the stdlib are slowly being gemified; that's
             | the case for `pstore` too, which is why it has its own repo.
             | 
             | It's now no longer technically stdlib, but a "default gem",
             | a gem that is installed by default with ruby, see:
             | https://stdgems.org/
             | 
             | For a few years now, every version has removed one or two
             | rarely used default gems. The Ruby core team just doesn't
             | like big breaking changes.
        
         | tyingq wrote:
         | Pstore also uses Marshal behind the scenes, so I assume it has
         | the same caveats you see in other comments on this thread.
        
       | kayodelycaon wrote:
       | Note, this is a wrapper around Ruby's Marshal class.
        
         | thunderbong wrote:
         | Mentioned in the linked article.
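
To make the Marshal point above concrete, here is a minimal sketch of a PStore round trip. The file name and the keys are made up for illustration, and reading the file back with Marshal.load assumes the current on-disk format (a Marshal dump of the store's root hash), which is an implementation detail rather than a documented guarantee.

```ruby
require "pstore"
require "tmpdir"

# Writes two values through PStore, then reads them back two ways:
# through a read-only transaction and straight from the file via Marshal.
def pstore_roundtrip(path)
  store = PStore.new(path)

  # All reads and writes must happen inside a transaction block.
  store.transaction do
    store[:languages] = %w[ruby crystal]
    store[:visits] = 1
  end

  from_pstore = store.transaction(true) { store[:languages] } # read-only

  # Assumption: the file holds Marshal.dump of the root hash.
  raw_table = Marshal.load(File.binread(path))

  [from_pstore, raw_table]
end

Dir.mktmpdir do |dir|
  from_pstore, raw_table = pstore_roundtrip(File.join(dir, "demo.pstore"))
  p from_pstore
  p raw_table
end
```

Because the file is plain Marshal output, everything said elsewhere in the thread about Marshal and untrusted input applies to a PStore file as well.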
        
       | fny wrote:
       | I do a lot of ML and AI work nowadays... I miss Ruby a lot,
       | especially its culture around ergonomics.
        
         | cdiamand wrote:
         | There have been some interesting ML gems rolled out in the past
         | few years:
         | 
         | https://ankane.org/new-ml-gems
         | 
         | Any thoughts on what the Ruby community would need to build in
         | order for it to become an attractive tool for AI work?
        
           | fny wrote:
           | A huge cultural shift. People in scientific computing speak
           | Python and R.
           | 
           | Something would need to happen that makes Ruby far more
           | attractive. Say performance parity with Crystal or Nim.
        
             | mattnewton wrote:
             | I think it's more than that, Julia exists and adoption is
             | still slow. Lua and torch were plenty fast and they were
             | still replaced by pytorch. I think to compete with python
             | you need at least a fraction of the de-facto corporate
             | sponsorship for python in the ML space.
        
           | waffle_ss wrote:
           | As a primarily Ruby dev I'd prefer the AI/ML ecosystem not be
           | split-brained between two languages that are semantically 90%
           | the same thing. Just learn Python and integrate the models
           | into your Rails (or whatever) apps.
        
           | mattnewton wrote:
           | My guess is some kind of corporate sponsorship. Someone with
           | deep pockets to maintain it, encourage new apis keeping up
           | with the latest papers, and make sure it works out of the box
           | with the accelerator people want to use this month.
        
             | pmontra wrote:
             | The web framework part is basically sponsored by 37signals
             | https://37signals.com/32
             | 
             | Maybe that's why Ruby is best known for Ruby on Rails.
        
         | brightball wrote:
         | It keeps me coming back.
         | 
         | Ruby itself is just such an enabler.
        
         | prescriptivist wrote:
         | I recently had the need to build an internal system that
         | distributed workloads across many workers via a client/server
         | model. I did the proof-of-concept using druby [1] and it turned
         | out to be so simple and stable that we just ran with it. It'd
         | been years since I had used that library, and instinctively I
         | assumed we'd get the prototype out and then rebuild it as some
         | sort of web service behind a high-concurrency web server, but
         | druby just worked!
         | 
         | [1] https://github.com/ruby/drb
        
           | thunderbong wrote:
           | drb is awesome. I've had the good fortune to be able to use
           | it once. The simplicity of it compared to anything else is
           | amazing.
        
         | green_on_black wrote:
         | Have you tried Scala?
        
           | fny wrote:
           | It's more of a cultural thing. People tend to write Ruby in a
           | literate fashion and think critically about their APIs. Scala
           | devs get a little over their skis sometimes playing with
           | language features.
        
       | why-el wrote:
       | Interesting. Transactionality is implemented via a regular thread
       | lock, which means that in a concurrent Rails app where this library
       | is used in a hot path you might suffer some contention. It's best
       | used for marshaling data in non-hot paths such as standalone
       | scripts or app startup. I only say this because it's quite
       | different from what you'd expect of transactions in the SQL sense.
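
The locking behaviour described above can be sketched roughly as follows. The method names, the key names and the thread counts are made up for illustration. The second argument to PStore.new is its thread_safe flag; with it set, transactions from different threads are serialized on an in-process lock, so every transaction in the hot path waits its turn and rewrites the file.

```ruby
require "pstore"
require "tmpdir"

# Increments a counter from several threads and returns the final value.
def count_hits(path, threads: 4, per_thread: 25)
  store = PStore.new(path, true) # thread_safe = true
  store.transaction { store[:hits] = 0 }

  workers = Array.new(threads) do
    Thread.new do
      per_thread.times do
        # Each increment is a full transaction: lock, load, update, write.
        store.transaction { store[:hits] += 1 }
      end
    end
  end
  workers.each(&:join)

  store.transaction(true) { store[:hits] } # read-only transaction
end

# An exception raised inside the block discards that transaction's changes.
def failed_write_is_discarded?(path)
  store = PStore.new(path)
  store.transaction { store[:status] = "ok" }

  begin
    store.transaction do
      store[:status] = "half-written"
      raise "simulated failure"
    end
  rescue RuntimeError
    # The failed transaction was never written to disk.
  end

  store.transaction(true) { store[:status] } == "ok"
end

Dir.mktmpdir do |dir|
  puts count_hits(File.join(dir, "hits.pstore"))
  puts failed_write_is_discarded?(File.join(dir, "status.pstore"))
end
```

This is all-or-nothing at the level of the whole file, which is a much coarser guarantee than row-level transactions in an SQL database.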
        
       | mrinterweb wrote:
       | I would think this would have limited usefulness for most web
       | applications as the latest trend for web apps is to think of the
       | deployed code as ephemeral, and local files are not something
       | devs often rely on. I guess if you're mounting block storage or
       | some other virtual file system that would be another thing. For
       | non-web applications, this could be a simplistic replacement for
       | what people often use sqlite for. The readme doesn't talk much
       | about concurrent access to the store other than the transactions,
       | so concurrent operations may also be a limitation.
        
       | 3pt14159 wrote:
       | Don't use this. Marshal has too many issues. If you really need
       | persistence and can't use something like Postgres, use the Ox gem
       | instead. It's more reliable between versions of Ruby and easier
       | to parse from other languages if you ever have to.
        
         | jrochkind1 wrote:
         | > too many issues
         | 
         | Such as?
        
           | woodruffw wrote:
           | Marshal is Ruby's version of pickle in Python: it serializes
           | arbitrary objects, which means that correct deserialization
           | requires arbitrary code execution.
           | 
           | This is bad enough on its own, but it also makes pivoting a
           | file read/write primitive into code execution much easier.
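
A harmless sketch of why loading Marshal data means running code: Marshal.load calls hooks such as marshal_load on whichever class the byte stream names. Probe is a made-up class that only counts how often its hook runs; a real attack would rely on classes already loaded in the target process.

```ruby
# Hypothetical class used only for this demonstration.
class Probe
  class << self
    attr_accessor :load_calls
  end
  self.load_calls = 0

  # Controls what Marshal.dump stores for this object.
  def marshal_dump
    "state"
  end

  # Marshal.load calls this on a freshly allocated instance, so whoever
  # controls the bytes decides which class's hook gets to run.
  def marshal_load(_state)
    self.class.load_calls += 1
  end
end

blob = Marshal.dump(Probe.new)
Marshal.load(blob) # runs Probe#marshal_load as a side effect of loading
puts Probe.load_calls
```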
        
             | Rafert wrote:
             | https://github.com/ruby/psych defaults to only loading
             | permitted classes since 4.0 so that seems less of a concern
             | now?
        
               | jrochkind1 wrote:
               | `psych`, used for YAML, is a different thing than
               | Marshal. pstore uses Marshal:
               | https://ruby-doc.org/core-2.6.3/Marshal.html. I don't
               | believe psych will be involved with pstore.
               | 
               | I'm honestly not sure, though, how much I should be
               | worried about the fact that someone who has write access
               | to my database can maybe escalate that to an arbitrary
               | code execution if I use pstore. Literally not sure. Write
               | access to my DB seems pretty disastrous already...
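
For contrast with Marshal, the permitted-classes behaviour of Psych mentioned above looks roughly like this. Widget and the YAML document are made up for illustration. The sketch calls YAML.safe_load explicitly, so it does not depend on which Psych version changed the default for YAML.load.

```ruby
require "yaml"

# Hypothetical class used only for this demonstration.
class Widget
  attr_accessor :name
end

DOC = "--- !ruby/object:Widget\nname: gear\n"

# Returns true when safe_load refuses to instantiate the tagged class.
def safe_load_rejects?(doc)
  YAML.safe_load(doc)
  false
rescue Psych::DisallowedClass
  true
end

# Explicitly allowing the class lets the same document load.
def load_permitted(doc)
  YAML.safe_load(doc, permitted_classes: [Widget])
end

puts safe_load_rejects?(DOC)
puts load_permitted(DOC).name
```

Marshal.load has no equivalent allow-list, which is why the Psych change does not help a store that writes Marshal data.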
        
             | solarkraft wrote:
             | Pickle is fine (in a pinch). It's not meant for untrusted
             | data.
        
               | woodruffw wrote:
               | Anything is fine when the data is trusted. The problem is
               | that the data is almost never actually trusted :-)
        
         | e12e wrote:
         | > use the Ox gem
         | 
         | The main thing is that it's part of the standard library. If
         | you import a gem anyway, often you'd be well off with sqlite.
         | 
         | As for storage format, there's also:
         | 
         | https://ruby-doc.org/stdlib-3.1.2/libdoc/yaml/rdoc/YAML/Stor...
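
A minimal sketch of YAML::Store, assuming it is available as in the Ruby versions discussed here. It has the same transaction API as PStore, but the file it writes is plain YAML. The file name and keys are made up for illustration.

```ruby
require "yaml/store"
require "tmpdir"

# Writes two values through YAML::Store and returns the raw file contents.
def yaml_store_demo(path)
  store = YAML::Store.new(path)

  store.transaction do
    store["title"] = "Great post!"
    store["tags"] = %w[ruby yaml]
  end

  File.read(path) # human-readable YAML rather than Marshal bytes
end

Dir.mktmpdir do |dir|
  puts yaml_store_demo(File.join(dir, "demo.yml"))
end
```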
        
           | brunno wrote:
           | I love the simplicity of YAML::Store. It was introduced in
           | Ruby 1.8, almost 20 years ago
           | (https://github.com/ruby/ruby/commit/55f4dc4c9a5345c28d0da750...).
           | 
           | I even created a little gem when I was starting with Ruby, 10
           | years ago, that was a very thin wrapper around it so that I
           | could play around using an ActiveRecord like syntax
           | (https://github.com/brunnogomes/active_yaml). I used it in some
           | pet projects so I could do stuff like:
           | 
           |     p = Post.new
           |     p.title = "Great post!"
           |     p.body = "Lorem ipsum..."
           |     p.save
           | 
           |     Post.all
           |     # => [#<Post:0x895bb38 @title="Great post!", @body="Lorem ipsum...", @id=1>]
           | 
           |     Post.find(1)
           |     # => #<Post:0x954bc69 @title="Great post!", @body="Lorem ipsum...", @id=1>
           | 
           |     Post.where(author: 'Brunno', visibility: 'public')
           |     # => [#<Post:0x895bb38 @author="Brunno", @visibility="public", @id=1>,
           |     #     #<Post:0x457pa36 @author="Brunno", @visibility="public", @id=2>]
           | 
           | And have access to the data directly in the YAML files.
           | 
           | Good times!
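
A thin wrapper of this kind can be sketched in a few dozen lines. This is not the active_yaml implementation; TinyPost, its attributes and the storage layout (integer id mapped to a plain hash) are all assumptions made for illustration.

```ruby
require "yaml/store"
require "tmpdir"

# Hypothetical record class backed by a YAML::Store.
class TinyPost
  attr_accessor :id, :title, :body

  class << self
    attr_accessor :store # a YAML::Store assigned by the caller

    def all
      store.transaction(true) do
        store.roots.map { |id| build(id, store[id]) }
      end
    end

    def find(id)
      attrs = store.transaction(true) { store[id] }
      attrs && build(id, attrs)
    end

    private

    # Rebuilds a record from the plain hash kept in the YAML file.
    def build(id, attrs)
      post = new
      post.id = id
      post.title = attrs["title"]
      post.body = attrs["body"]
      post
    end
  end

  # Assigns the next free integer id on first save, then writes a hash.
  def save
    self.class.store.transaction do |s|
      self.id ||= (s.roots.max || 0) + 1
      s[id] = { "title" => title, "body" => body }
    end
    self
  end
end

Dir.mktmpdir do |dir|
  TinyPost.store = YAML::Store.new(File.join(dir, "posts.yml"))
  post = TinyPost.new
  post.title = "Great post!"
  post.body = "Lorem ipsum..."
  post.save
  puts TinyPost.find(post.id).title
end
```

Storing plain hashes rather than the objects themselves keeps the YAML file readable and avoids having to permit custom classes when the store loads it back.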
        
             | 3pt14159 wrote:
             | The problem with YAML is that meaningful whitespace means
             | that the size grows quickly for highly nested documents. I
             | don't love XML, but there is a reason I recommended Ox.
             | I've used it for real projects and it never fell over like
             | so many of the alternatives I've tried where databases were
             | not in the cards.
        
       ___________________________________________________________________
       (page generated 2022-09-22 23:01 UTC)