[HN Gopher] Packrat Parsing: Simple, Powerful, Lazy, Linear Time...
       ___________________________________________________________________
        
       Packrat Parsing: Simple, Powerful, Lazy, Linear Time [pdf]
        
       Author : gjvc
       Score  : 41 points
       Date   : 2020-11-15 18:24 UTC (4 hours ago)
        
 (HTM) web link (pdos.csail.mit.edu)
 (TXT) w3m dump (pdos.csail.mit.edu)
        
       | arnarbi wrote:
       | Tag title with 2002? It's a great paper.
        
         | gjvc wrote:
         | dammit. missed the edit window. will do so in future; thank you
         | for the reminder.
        
       | coderzach wrote:
       | Guido van Rossum's series on PEG parsing
       | https://medium.com/@gvanrossum_83706/peg-parsing-series-de5d...
       | 
       | Proposal to use PEG parser in python
       | https://www.python.org/dev/peps/pep-0617/
       | 
       | Python PEG grammar
       | https://github.com/python/cpython/blob/3.9/Grammar/python.gr...
        
       | amboo7 wrote:
       | Latest developments: https://arxiv.org/abs/2005.06444
        
       | chrisseaton wrote:
       | Packrat Parsing: Simple, Powerful, Lazy, Linear Time, Unlimited
       | Memory
       | 
       | But I still think it's a pretty good idea.
        
         | stormbrew wrote:
         | Yes, this. It's really common for naive implementations of
         | packrat parsing to work just fantastic on short input strings
         | (aka benchmarks) and then take literal minutes of "linear time"
         | to parse, say, a 1kbyte source code file. Particularly if you
         | (as is relatively common) do it in a scripting language with a
         | naive memory management policy where creating and connecting
         | vast numbers of small objects can be a performance problem.
         | 
         | It turns out that, in practice, it's better to be selective
         | about where you apply the packrat mechanism.
        
         | pmiller2 wrote:
         | Any parsing algorithm uses "unlimited memory" if you make the
         | input big enough.
         | 
         | But, seriously, it's called "packrat parsing," because it saves
         | all the things. You have to expect it's going to use a bit of
         | memory.
        
         | josephg wrote:
         | Memory usage is linear too. It's bounded by length of input *
         | number of rules in the grammar.
        
           | chrisseaton wrote:
           | For example 1 GB per character is linear... but it's still a
           | hell of a lot.
           | 
           | The mathematical function doesn't matter as much as the
           | practical overhead.
           | 
           | I mean... the clue's in the name, right?
           | 
           | (I worked on PEG and Packrat for my MEng.)
        
             | centimeter wrote:
             | But packrat parsers don't require nearly that much space in
             | practice, right? Most languages I've worked with will have
             | a relatively small number of production rules, requiring
             | only a small number of results at each parsing position.
        
               | chrisseaton wrote:
               | I think in practice they're pretty chunky.
        
         | unquietcode wrote:
         | I still use it for everything. Memory at that scale just isn't
         | an issue anymore. Your grammar would have to be huge and your
         | data structures highly unoptimized for it to matter all that
         | much.
        
       | tgv wrote:
       | For LL and LR languages. Not for general CFGs. The abstract also
       | mentions other classes, but isn't specific, probably referring to
       | other deterministic classes of CFG.
        
       ___________________________________________________________________
       (page generated 2020-11-15 23:00 UTC)