[HN Gopher] Show HN: Exp. Smoothing is 32% more accurate and 100...
       ___________________________________________________________________
        
       Show HN: Exp. Smoothing is 32% more accurate and 100x faster than
       Neural-Prophet
        
       We benchmarked on more than 55K series and show that ETS improves
       MAPE and sMAPE forecast accuracy by 32% and 19%, respectively,
       while requiring 104x less computation than NeuralProphet.  We hope
       this exercise helps the forecasting community avoid adopting yet
       another overpromising and unproven forecasting method.
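 
       For reference, a minimal sketch of the two error metrics in
       Python (assuming the usual definitions; sMAPE has several
       variants, and the 0-200% symmetric form is shown here):
 
           import numpy as np
 
           def mape(y, yhat):
               # Mean absolute percentage error, in percent.
               y, yhat = np.asarray(y, float), np.asarray(yhat, float)
               return 100.0 * np.mean(np.abs(y - yhat) / np.abs(y))
 
           def smape(y, yhat):
               # Symmetric MAPE (0-200% variant), in percent.
               y, yhat = np.asarray(y, float), np.asarray(yhat, float)
               return 100.0 * np.mean(2.0 * np.abs(y - yhat)
                                      / (np.abs(y) + np.abs(yhat)))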
        
       Author : maxmc
       Score  : 115 points
       Date   : 2022-08-17 19:33 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | ren_engineer wrote:
        | What's the consensus on machine learning vs more classical
        | methods for time series forecasting? I know a hybrid model won
        | the M4 competition in 2018; obviously in this case classical
        | still beats AI/ML.
       | 
       | https://en.wikipedia.org/wiki/Makridakis_Competitions
        
         | rich_sasha wrote:
          | I think it depends massively on what you mean by "time
          | series". If it is really an ARMA model you're looking at,
          | then ML can only bring noise to the problem. If it is a
          | complex, large system that happens to be indexed by time, ML
          | may well be better.
          | 
          | AFAIK Prophet had a more modest scope than "be-all and
          | end-all of TS modelling"; rather, it aims to be a decent
          | model for everything. It might indeed be excellent at that...
        
         | dylanjcastillo wrote:
         | In the M5 competition[1], most winning solutions used LightGBM.
         | So ML beat classical.
         | 
         | Just a couple of the winning solutions used DL.
         | 
         | [1]
         | https://www.sciencedirect.com/science/article/pii/S016920702...
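          | 
          | The typical pattern (a sketch of the common approach, not
          | the winners' exact code) is to turn the series into a
          | supervised table of lag features and fit a gradient-boosted
          | regressor:
          | 
          |     import pandas as pd
          |     from lightgbm import LGBMRegressor
          | 
          |     def make_lags(y, n_lags=28):
          |         # One column per lag: X[t] = (y[t-1], ..., y[t-n]).
          |         cols = {f"lag_{k}": y.shift(k)
          |                 for k in range(1, n_lags + 1)}
          |         return pd.concat(cols, axis=1).dropna()
          | 
          |     # "sales.csv" is a hypothetical single-column series.
          |     y = pd.read_csv("sales.csv", index_col=0,
          |                     parse_dates=True).squeeze("columns")
          |     X = make_lags(y)
          |     model = LGBMRegressor(n_estimators=500, learning_rate=0.05)
          |     model.fit(X, y.loc[X.index])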
        
       | gillesjacobs wrote:
        | This wouldn't pass peer review if it were a paper. Major
        | issues:
        | 
        | - No fair hyperparameter tuning for NeuralProphet (a sketch of
        | a minimal sweep follows below). They mention multiple times
        | that they used default hyperparams or ad-hoc example
        | hyperparams.
        | 
        | - ETS outperforming on 3 of 4 benchmark datasets (on one of
        | which they didn't finish training) is not strong evidence of
        | all-round robustness. Benchmarks like SuperGLUE for NLP combine
        | 10 completely different tasks with more subtasks to assess
        | language model performance. And even SuperGLUE is not
        | uncontroversial.
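        | 
        | On the first point, a fair comparison would at least sweep the
        | obvious knobs. A minimal sketch, assuming the standard
        | NeuralProphet ds/y dataframe API (parameter names and the
        | "yhat1" output column may differ by version, and the file name
        | is hypothetical):
        | 
        |     from itertools import product
        |     import pandas as pd
        |     from neuralprophet import NeuralProphet
        | 
        |     # Daily series with 'ds' and 'y' columns.
        |     df = pd.read_csv("energy.csv", parse_dates=["ds"])
        |     train, valid = df.iloc[:-365], df.iloc[-365:]
        | 
        |     best = (float("inf"), None)
        |     for n_lags, lr in product([0, 7, 30], [0.01, 0.1, 1.0]):
        |         m = NeuralProphet(n_lags=n_lags, learning_rate=lr,
        |                           epochs=50)
        |         m.fit(train, freq="D")
        |         # Glosses over the history rows needed when n_lags > 0.
        |         pred = m.predict(valid)
        |         mae = (pred["yhat1"] - valid["y"]).abs().mean()
        |         if mae < best[0]:
        |             best = (mae, dict(n_lags=n_lags, learning_rate=lr))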
        
         | gillesjacobs wrote:
          | While the results don't convincingly prove superiority, ETS
          | does seem like a good candidate as a first go-to in practical
          | applications. "In practice, practice and theory are the same.
          | In theory, they are not."
        
           | variaga wrote:
           | In _theory_ practice and theory are the same. In _practice_
           | they are not.
        
             | gillesjacobs wrote:
             | HN pedantry ruins the fun of wordplay yet again.
        
               | anon_123g987 wrote:
               | In theory, his version is right. In practice, yours.
        
       | Imnimo wrote:
       | In the original Prophet paper
       | (https://peerj.com/preprints/3190.pdf) they claim that Prophet
       | outperforms ETS (see Figure 7, for example). And in the
       | NeuralProphet paper, they claim that it outperforms Prophet (but
       | do not, as far as I can see, compare directly to ETS). Here we
       | see ETS outperforms NeuralProphet.
       | 
       | Presumably this apparent non-transitivity is because of
       | differences in each evaluation. If we fix the evaluation to the
       | method used here, is it still the case that NeuralProphet
       | outperforms Prophet (and therefore the claim that Prophet
       | outperforms ETS is not correct)? Or is it that NeuralProphet does
       | not outperform Prophet, but Prophet does outperform ETS?
        
       | beernet wrote:
       | As usual in ML, the appropriate solution depends on the problem
       | and context.
       | 
        | ML (particularly DL) tends to outperform "classical"
        | statistical time series forecasting when the data is (strongly)
        | nonlinear, high-dimensional and large. The opposite holds as
        | well.
        | 
        | It is also important to note that accuracy is not the only
        | relevant metric in practical applications. Explainability is of
        | particular interest in time series forecasting: it is good to
        | know whether your sales are going to increase or decrease, but
        | it is even more valuable to know which input variables are
        | likely to account for that change. Hence, a "simple" model with
        | inferior forecasting accuracy might be preferred to a stronger
        | estimator if it can give insight into not only what will
        | happen, but also why.
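        | 
        | That inspectability is part of ETS's appeal: the fitted
        | smoothing parameters and components are directly readable. A
        | sketch with statsmodels (the file name is hypothetical):
        | 
        |     import pandas as pd
        |     from statsmodels.tsa.holtwinters import ExponentialSmoothing
        | 
        |     y = pd.read_csv("sales.csv", index_col=0,
        |                     parse_dates=True).squeeze("columns")
        | 
        |     fit = ExponentialSmoothing(y, trend="add", seasonal="add",
        |                                seasonal_periods=7).fit()
        | 
        |     # alpha: how heavily the level weights recent observations
        |     print(fit.params["smoothing_level"])
        |     # gamma: how quickly the seasonal pattern adapts
        |     print(fit.params["smoothing_seasonal"])
        |     print(fit.forecast(14))  # two weeks ahead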
        
         | tomwphillips wrote:
         | > ML (particularly DL) tends to outperform "classical"
         | statistical time series forecasting when the data is (strongly)
         | nonlinear, highly dimensional and large.
         | 
         | This claim about forecasting with DL comes up a lot, but I've
         | seen little evidence to back it up.
         | 
         | Personally, I've never managed to have the same success others
         | apparently have with DL time series forecasting.
        
           | beernet wrote:
            | It's true simply because large ANNs have higher capacity,
            | which is great for large, nonlinear data but less so for
            | small datasets or simple functions.
            | 
            | In any case, Transformers are eating ML right now and I'm
            | actually surprised there's no "GPT-3 for time series" yet.
            | It's technically the same problem as language modeling
            | (that is, multi-step prediction of numerics); however,
            | there is comparatively little human-generated data for
            | self-supervised training of a time series forecasting
            | model. Another reason might be that the expected
            | applications and potential of such a pre-trained model
            | aren't as glamorous as generating language.
        
             | time_to_smile wrote:
             | > It's technically the same problem as language modeling
             | 
              | You're thinking of modeling event sequences, which is
              | not, strictly speaking, the same as time series modeling.
              | 
              | Plenty of people do use LSTMs to model event sequences,
              | using the hidden state of the model as a vector
              | representation of a process's current location while
              | walking a graph (e.g. a user's journey through a mobile
              | app, or navigation by following links on the web).
              | 
              | Time series is different because the ticks of timed
              | events arrive at consistent intervals and are themselves
              | part of the problem being modeled. In general, time
              | series models have been distinct from sequence models.
              | 
              | The reason there's no GPT-3 for general sequences is the
              | lack of data. Typically the vocabulary of events is much
              | smaller than that of natural languages, and the corpus
              | of sequences is much smaller.
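              | 
              | To make the distinction concrete, the event-sequence
              | version looks roughly like this (a PyTorch sketch;
              | vocabulary and sizes are made up):
              | 
              |     import torch
              |     import torch.nn as nn
              | 
              |     # Hypothetical app: each screen/action has an id.
              |     VOCAB, EMB, HID = 500, 32, 64
              |     embed = nn.Embedding(VOCAB, EMB)
              |     lstm = nn.LSTM(EMB, HID, batch_first=True)
              | 
              |     # One user journey: a sequence of event ids.
              |     events = torch.tensor([[3, 17, 42, 7]])
              |     out, (h_n, c_n) = lstm(embed(events))
              | 
              |     # Final hidden state: a fixed-size vector that
              |     # summarizes where the user is in their journey.
              |     journey_vec = h_n[-1]  # shape (1, HID)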
        
         | time_to_smile wrote:
          | A larger problem is that time series modeling is particularly
          | resistant to black-box approaches, since a lot of information
          | is encoded in the model itself.
         | 
          | Take even a simple moving average model on daily
          | observations. Consider stock ticker data (where there are no
          | weekends) and web traffic data (where there is an observation
          | every day). The stock ticker data should be smoothed with a
          | 5-day window and the web traffic with a 7-day window, to help
          | reduce the impact of weekly effects (which probably shouldn't
          | exist in the stock market anyway).
         | 
          | It's possible that in either of these cases you might find a
          | moving average that performs better on some chosen metric,
          | say 4 or 8 days. However, neither of these alternatives makes
          | any sense as a window if we're trying to remove a day-of-week
          | effect, and unless you can come up with a justifiable
          | explanation, smoothing over arbitrary windows should be
          | avoided.
         | 
          | If you let a black box optimize even a simple moving average,
          | you would be skipping some very essential introspection into
          | what your model is actually claiming.
         | 
          | Not to mention that we can often do more than just prediction
          | with these intentional model tunings (for example, the
          | day-of-week effect can be explicitly differenced out of the
          | data to measure exactly how much sales should increase on a
          | Saturday).
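          | 
          | In pandas, both the deliberate window choice and the
          | explicit day-of-week measurement are short (a sketch;
          | assumes a daily DatetimeIndex and a hypothetical file name):
          | 
          |     import pandas as pd
          | 
          |     traffic = pd.read_csv("traffic.csv", index_col=0,
          |                           parse_dates=True).squeeze("columns")
          | 
          |     # 7-day window: cancels the day-of-week effect.
          |     smoothed = traffic.rolling(7).mean()
          | 
          |     # Or measure the effect explicitly instead of smoothing
          |     # it away: mean deviation from the weekly level, per day.
          |     dow_effect = ((traffic - smoothed)
          |                   .groupby(traffic.index.dayofweek).mean())
          |     print(dow_effect)  # e.g. Saturday's lift over average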
        
       | rich_sasha wrote:
       | Hmm, wow. When I saw the headline, I assumed they used like one
       | dataset or something similarly limiting.
       | 
       | I'd need to dig out the original paper, but I would be surprised
       | if the original didn't compare to basic benchmark methods. But
       | from memory, I never saw such a comparison (until now).
        
       | cercatrova wrote:
       | Can someone explain this? I don't know what the context is for
       | this Show HN.
        
         | IshKebab wrote:
         | They're time series prediction methods. E.g. they mention
         | electricity usage forecasting - given historical data, what
         | will the usage be in 1 hour?
         | 
          | Facebook's Prophet is quite popular in this space, I
          | understand. No idea about the other two.
        
       | mkl wrote:
        | A minor language error: "this model does not outperform
        | classical statistical methods _neither_ in accuracy _nor_
        | speed." should say "either" and "or".
        
         | ISV_Damocles wrote:
         | https://dictionary.cambridge.org/grammar/british-grammar/nei...
        
           | mkl wrote:
           | Nothing there seems to contradict me. The problem in the
           | linked page is that "neither ... nor" is used after "not",
           | which makes it a double negative.
        
         | lightedman wrote:
         | The "Not-Neither-Nor" sequence is typical, even with regards to
         | American English, versus British English (the Queen's English.)
         | In either case, both are technically-correct.
        
           | bo1024 wrote:
           | As part of a double negative?
        
             | kevin_thibedeau wrote:
             | English is nothing if not inconsistent.
        
       ___________________________________________________________________
       (page generated 2022-08-17 23:00 UTC)