[HN Gopher] Deep symbolic regression for recurrent sequences
       ___________________________________________________________________
        
       Deep symbolic regression for recurrent sequences
        
       Author : ShamelessC
       Score  : 29 points
       Date   : 2022-01-30 21:15 UTC (1 hour ago)
        
 (HTM) web link (recur-env.eba-rm3fchmn.us-east-2.elasticbeanstalk.com)
 (TXT) w3m dump (recur-env.eba-rm3fchmn.us-east-2.elasticbeanstalk.com)
        
       | not2b wrote:
       | Doesn't do well with 2,3,5,7,11,13. Perhaps they can figure out
       | how to incorporate OEIS or a similar resource in addition to just
       | a simple curve fit.
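        | 
        | A lookup like that would be easy to bolt on. A minimal sketch
        | of querying OEIS's JSON search endpoint (field names per the
        | public API; worth checking against a live response):
        | 
        |     import requests
        | 
        |     def oeis_lookup(terms):
        |         # search OEIS for sequences matching the given prefix
        |         q = ",".join(str(t) for t in terms)
        |         r = requests.get("https://oeis.org/search",
        |                          params={"q": q, "fmt": "json"})
        |         r.raise_for_status()
        |         return [(f"A{s['number']:06d}", s["name"])
        |                 for s in r.json().get("results") or []]
        | 
        |     # the primes should rank near the top
        |     print(oeis_lookup([2, 3, 5, 7, 11, 13])[:3])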
        
         | somebodythere wrote:
          | That sequence is out of domain, I think.
        
       | civilized wrote:
       | I changed the 13 in the Fibonacci sequence to 14 just to see what
       | it would do.
       | 
        | It added the term "quotient(u_{n-1}, n)" to the Fibonacci
       | recursion, and claimed the resulting formula fit the data
       | perfectly.
       | 
       | What?
        
         | eutectic wrote:
         | Seems right to me.
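          | 
          | Quick check, assuming quotient(a, b) means floor division
          | and n starts at 1:
          | 
          |     # u_n = u_{n-1} + u_{n-2} + quotient(u_{n-1}, n)
          |     u = [1, 1]
          |     for n in range(3, 8):
          |         u.append(u[-1] + u[-2] + u[-1] // n)
          |     print(u)  # [1, 1, 2, 3, 5, 8, 14] -- matches the edited data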
        
         | nextaccountic wrote:
         | It found another formula that worked, what's the problem with
         | that?
        
       | fjfaase wrote:
       | If I enter the sixteen terms of the sequence A003742 it does not
       | find the exact recurrence equation (with six terms). About twenty
       | years ago, I wrote a program that did find this recurrence
       | equation just from the sequence. If I enter these numbers at
       | http://www.ryanhmckenna.com/2015/06/automatically-finding-re...
       | it gives the exact recurrence equation within a few seconds.
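        | 
        | For reference, the classic approach is a small linear-algebra
        | exercise. A minimal sketch (not my old program), assuming an
        | integer-coefficient linear recurrence of known maximum order:
        | 
        |     import numpy as np
        | 
        |     def find_recurrence(seq, k):
        |         # fit u_n = c_1*u_{n-1} + ... + c_k*u_{n-k}
        |         A = np.array([seq[n - k:n][::-1]
        |                       for n in range(k, len(seq))], dtype=float)
        |         b = np.array(seq[k:], dtype=float)
        |         c, *_ = np.linalg.lstsq(A, b, rcond=None)
        |         c = np.round(c)
        |         # accept only if the rounded coefficients are exact
        |         return c if np.allclose(A @ c, b) else None
        | 
        |     print(find_recurrence([1, 1, 2, 3, 5, 8, 13, 21], 2))  # [1. 1.]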
        
         | ShamelessC wrote:
         | Sort of a shallow dismissal, wouldn't you say? These models are
         | for research purposes and the authors aren't claiming to have
         | solved this to the degree of a hand-crafted program. In fact,
         | they probably used a program very similar to the one linked in
         | order to create the dataset.
        
        | rurban wrote:
        | Fantastic. This could be used for superior compressors, for
        | example, or to break random number generators.
        
       | muds wrote:
       | Hmmm. I can't get the model to recognize a damped sinusoidal wave
       | (10, 0, -10, 0, -5, 0, 5, ...). Does the model have the capacity
       | to express such a function?
       | 
       | An equation is available here:
       | https://en.wikipedia.org/wiki/Damping
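        | 
        | In principle it should be expressible: a damped sinusoid
        | u_n = A * r^n * cos(w*n) satisfies the two-term linear
        | recurrence u_n = 2*r*cos(w)*u_{n-1} - r^2*u_{n-2}, so the
        | needed operators seem to be there. A quick illustration
        | (made-up parameters):
        | 
        |     import math
        | 
        |     A, r, w = 10.0, 0.9, math.pi / 2
        |     u = [A, A * r * math.cos(w)]
        |     for n in range(2, 8):
        |         u.append(2 * r * math.cos(w) * u[-1] - r ** 2 * u[-2])
        |     closed = [A * r ** n * math.cos(w * n) for n in range(8)]
        |     print(max(abs(a - b) for a, b in zip(u, closed)))  # ~0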
       | 
       | Pretty neat otherwise!! I especially love the interface. I wonder
       | if there is a plug and play framework for deploying pytorch
       | models on a website.
       | 
       | EDIT: They seem to be using https://streamlit.io . Seems like a
       | neat tool.
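        | 
        | The pattern is pretty plug-and-play. A minimal sketch, assuming
        | a TorchScript model saved as model.pt (hypothetical name), run
        | with "streamlit run app.py":
        | 
        |     import streamlit as st
        |     import torch
        | 
        |     @st.cache_resource  # load the model once per server process
        |     def load_model():
        |         return torch.jit.load("model.pt").eval()
        | 
        |     model = load_model()
        |     raw = st.text_input("Sequence (comma-separated)",
        |                         "1, 1, 2, 3, 5, 8")
        |     terms = torch.tensor([[float(x) for x in raw.split(",")]])
        |     with torch.no_grad():
        |         st.write("Model output:", model(terms))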
        
         | danuker wrote:
          | Looks like it fails quite hard for "natural" sequences. I
          | input the Bitcoin price and got a ridiculous zigzag around a
          | constant instead of, say, a power-law fit.
        
        | sytelus wrote:
        | I don't think people here realize how amazing this is. You
        | provide very little data and the machine fits a closed-form
        | recurrence relation to it. Very little data plus clean, simple
        | expressions is the key. It's just a matter of adding more ops
        | (sinusoids, primes, etc.) to make it recognize more and more
        | complex expressions.
        
       | ShamelessC wrote:
        | To clarify, since some people have mismatched expectations:
        | the authors address the viability of real-world applications
        | in the paper:
       | 
       | "One may ask to what extent our model can be used for real-world
       | applications, such as time-series forecasting. Although the
       | robustness to noise is an encouraging step in this direction, we
       | believe our model is not directly adapted to such applications,
       | for two reasons. First, because real-world data generally cannot
       | be described by neat mathematical equations, in which case
       | numeric approaches will outperform symbolic approaches. Second,
       | even in cases where the sequence is described by a formula, this
       | formula will often con- tain complex prefactors. While our method
       | is capable of approximating prefactors rather remarkably, this
       | adds a layer of difficulty to the original problem, as the model
       | sometimes needs to use many operators to approximate a single
       | prefactor (see Table 2). However, one easy way to solve this
       | issue is to extend the vocabulary of the decoder, enabling it to
       | build prefactors more easily, or use a separate solver to fit the
        | prefactors as done in [other] approaches. We leave this for
        | future work."
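        | 
        | The "separate solver" idea is straightforward. An illustrative
        | sketch (not the paper's implementation): keep the predicted
        | skeleton fixed and fit its constants numerically.
        | 
        |     import numpy as np
        |     from scipy.optimize import least_squares
        | 
        |     # toy data generated by u_n = 2*u_{n-1} + 1
        |     seq = np.array([3.0, 7.0, 15.0, 31.0, 63.0])
        | 
        |     def residuals(p):
        |         a, b = p  # skeleton: u_n = a*u_{n-1} + b
        |         return a * seq[:-1] + b - seq[1:]
        | 
        |     print(least_squares(residuals, x0=[1.0, 0.0]).x)  # ~[2. 1.]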
        
       | ShamelessC wrote:
       | tl;dr - They model operations (e.g. multiply, add, cosine, sqrt)
       | as tokens, and use a transformer to predict a recursive function
       | from (incomplete) integer sequences (and float sequences,
       | separately).
       | 
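        | A minimal sketch of the tokenization idea (illustrative
        | vocabulary, not the authors' actual token set): the expression
        | tree is flattened to prefix order, and each operator/operand
        | becomes one token id.
        | 
        |     # Fibonacci: u_n = add(u_{n-1}, u_{n-2}) in prefix order
        |     expr = ["add", "u_{n-1}", "u_{n-2}"]
        |     vocab = {t: i for i, t in enumerate(
        |         ["add", "mul", "cos", "sqrt", "n", "u_{n-1}", "u_{n-2}"])}
        |     print([vocab[t] for t in expr])  # [0, 5, 6]
        | 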
        | This is not dissimilar to e.g. DALL-E, which models images as
        | NLP-style tokens and uses a transformer to predict image tokens
        | from text tokens; here, operations play the role of image
        | tokens, and the lists of ints/floats play the role of the text.
       | 
       | This demo lets you interact with the model. Pretty cool stuff.
       | Refreshing to see symbolic problems being tackled with deep
       | learning.
       | 
       | Yannic Kilcher video with one of the authors:
       | https://youtu.be/1HEdXwEYrGM
       | 
       | arxiv: https://arxiv.org/abs/2201.04600
        
       ___________________________________________________________________
       (page generated 2022-01-30 23:00 UTC)