[HN Gopher] Packrat Parsing: Simple, Powerful, Lazy, Linear Time... ___________________________________________________________________ Packrat Parsing: Simple, Powerful, Lazy, Linear Time [pdf] Author : gjvc Score : 41 points Date : 2020-11-15 18:24 UTC (4 hours ago) (HTM) web link (pdos.csail.mit.edu) (TXT) w3m dump (pdos.csail.mit.edu) | arnarbi wrote: | Tag title with 2002? It's a great paper. | gjvc wrote: | dammit. missed the edit window. will do so in future; thank you | for the reminder. | coderzach wrote: | Guido van Rossum's series on PEG parsing | https://medium.com/@gvanrossum_83706/peg-parsing-series-de5d... | | Proposal to use PEG parser in python | https://www.python.org/dev/peps/pep-0617/ | | Python PEG grammar | https://github.com/python/cpython/blob/3.9/Grammar/python.gr... | amboo7 wrote: | Latest developments: https://arxiv.org/abs/2005.06444 | chrisseaton wrote: | Packrat Parsing: Simple, Powerful, Lazy, Linear Time, Unlimited | Memory | | But I still think it's a pretty good idea. | stormbrew wrote: | Yes, this. It's really common for naive implementations of | packrat parsing to work just fantastic on short input strings | (aka benchmarks) and then take literal minutes of "linear time" | to parse, say, a 1kbyte source code file. Particularly if you | (as is relatively common) do it in a scripting language with a | naive memory management policy where creating and connecting | vast numbers of small objects can be a performance problem. | | It turns out that, in practice, it's better to be selective | about where you apply the packrat mechanism. | pmiller2 wrote: | Any parsing algorithm uses "unlimited memory" if you make the | input big enough. | | But, seriously, it's called "packrat parsing," because it saves | all the things. You have to expect it's going to use a bit of | memory. | josephg wrote: | Memory usage is linear too. It's bounded by length of input * | number of rules in the grammar. | chrisseaton wrote: | For example 1 GB per character is linear... but it's still a | hell of a lot. | | The mathematical function doesn't matter as much as the | practical overhead. | | I mean... the clue's in the name, right? | | (I worked on PEG and Packrat for my MEng.) | centimeter wrote: | But packrat parsers don't require nearly that much space in | practice, right? Most languages I've worked with will have | a relatively small number of production rules, requiring | only a small number of results at each parsing position. | chrisseaton wrote: | I think in practice they're pretty chunky. | unquietcode wrote: | I still use it for everything. Memory at that scale just isn't | an issue anymore. Your grammar would have to be huge and your | data structures highly unoptimized for it to matter all that | much. | tgv wrote: | For LL and LR languages. Not for general CFGs. The abstract also | mentions other classes, but isn't specific, probably referring to | other deterministic classes of CFG. ___________________________________________________________________ (page generated 2020-11-15 23:00 UTC)