[HN Gopher] Show HN: Gorse - An Out-of-the-box open-source Recom...
       ___________________________________________________________________
        
       Show HN: Gorse - An Out-of-the-box open-source Recommender System
        
       Author : zhenghaoz
       Score  : 156 points
       Date   : 2021-07-17 15:07 UTC (7 hours ago)
        
 (HTM) web link (gorse.io)
 (TXT) w3m dump (gorse.io)
        
       | jedwhite wrote:
       | Great project!
       | 
       | Are you thinking about offering this as a service as well? It
       | feels like you could have a lot of interest with a run-it-
       | yourself or we-do-it-for-you-as-a-service model. I couldn't see
       | that after a quick look :)
        
         | zhenghaoz wrote:
         | Good idea! But before that, we have to take a look at GDPR :D
        
           | jedwhite wrote:
           | I feel you :D
           | 
           | It might be worth sharing the project with some privacy-
           | oriented forums to get some feedback and ideas on what a good
           | privacy model would look like for it. Good luck! :)
        
         | gingerlime wrote:
         | I use recombee[0] and it seems pretty nice, albeit not open
         | source. Not affiliated in any way. Not even a paying customer,
         | just using the free plan for a personal project.
         | 
         | [0] https://www.recombee.com/
        
       | scoresmoke wrote:
       | Great job! Gorse reminds me of the now-retired Apache
       | PredictionIO recommendation engine:
       | https://predictionio.apache.org/templates/recommendation/qui....
       | Have you evaluated why it took off and why it was later shut
       | down, so that your project can avoid its mistakes?
        
         | Sloppy wrote:
         | Apache PredictionIO was designed for data scientists (I was a
         | committer so am quite familiar with design decisions). It was
         | built to allow new "engines" to be implemented for any
         | arbitrary ML/AI algorithm. These came in "templates" of various
         | flavors.
         | 
         | However it turned out that only the "Universal Recommender"
         | template (that I wrote) had real demand and virtually no one
         | was developing or using other "engines".
         | 
         | As designed it was hard to deploy and was not multi-tenant. I
         | could go on about the shortcomings but suffice it to say that
         | after I did the Universal Recommender and saw the PIO
         | shortcomings, we (ActionML, my consulting company) decided to
         | write a from-scratch ML server, called Harness, to solve the
         | PIO issues and act as the host for continuing work on the
         | Universal Recommender.
         | 
         | https://actionml.com/docs https://github.com/actionml/harness
         | 
         | So the best part of PIO (The Universal Recommender) lives on
         | with a clean new design and modern architecture.
        
       | abeppu wrote:
       | I wonder whether a really good experience can be built by
       | abstracting out recommendations to this degree. I think in the
       | context of an actual product, there would be other considerations
       | specific to the particular domain, which is partly why
       | organizations often eventually grow up teams around
       | recommendations or discovery.
       | 
       | For example:
       | 
       | - Perhaps there are relationships or similarities among items
       | which are known prior to any feedback (e.g. different options of
       | the "same" product are represented as distinct items in your
       | catalog). Recommending to the user many closely related items may
       | be a poor experience in some contexts.
       | 
       | - Perhaps you have other relevant actions, and care about more
       | than maximizing the probability of a click. E.g. purchases,
       | shares, etc. In an ecommerce context, you may want to recommend
       | items at a range of price points; showing only very expensive
       | items may get you clicks but not purchases.
       | 
       | - This has a concept of whether an item was "read" (which I
       | interpret as an impression, or an opportunity to be clicked). But
       | not all impressions are equal. Perhaps you have other knowledge
       | about the context in which something was displayed. Was it "read"
       | in the context of a search? Was it 8th in a list of 10 items?
        
         | zhenghaoz wrote:
         | Your consideration is absolutely right. The abstractions in
         | Gorse do lose a lot of information. Prior knowledge and
         | contextual information are important for further improving
         | recommender systems.
        
           | barefeg wrote:
           | Could these issues be solved by providing connectors to
           | different data stores so Gorse can use multiple signals in a
           | configurable way?
        
             | zhenghaoz wrote:
             | The problem is how to utilize these signals. The annoying
             | thing is that we can't solve these issues if we are not in
             | this situation ourselves. So feedback from Gorse users is
             | important.
        
           | antman wrote:
           | So are there any plans for context aware recommenders?
        
         | Sloppy wrote:
         | The Universal Recommender (part of the Harness project) allows
         | contextual and multimodal behavioral information to be used in
         | recs. https://github.com/actionml/harness
        
       | KaoruAoiShiho wrote:
       | Does this handle real time recommendations as described here:
       | https://eugeneyan.com/writing/real-time-recommendations/
        
       | Reubend wrote:
       | I think this looks very good, and I love seeing that it has the
       | capability to automatically select the recommendation algorithm
       | with the best performance.
       | 
       | I think there are two major areas where this can be improved. A
       | quick read through the docs doesn't show much info about the
       | algorithms themselves, and from what I can see, it doesn't look
       | like you can customize the hyperparameters of the algorithms
       | (correct me if I'm wrong). You have to just wait for the
       | automated hyperparameter search to finish. I think this is mostly
       | very good but the ability to freeze them would be even better to
       | save compute time in subsequent retrainings. The other thing I
       | notice is that there's no system to detect when the offline model
       | is too out of date and trigger retraining. It would be nice to
       | have a way of automating the retraining based on performance
       | evaluations rather than on a clock schedule.
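
The metric-driven retraining suggested above could look roughly like the sketch below. Nothing here is Gorse's API: `should_retrain`, the baseline score, and the 10% tolerance are hypothetical names and values chosen only for illustration.

```python
from typing import Sequence


def should_retrain(recent_scores: Sequence[float],
                   baseline: float,
                   tolerance: float = 0.1) -> bool:
    """Return True when the recent average evaluation score has dropped
    more than `tolerance` (a fraction) below the post-training baseline.

    Hypothetical helper, not part of Gorse.
    """
    if not recent_scores:
        return False  # no evaluation data yet, so nothing to act on
    recent_avg = sum(recent_scores) / len(recent_scores)
    return recent_avg < baseline * (1.0 - tolerance)
```

A scheduler could call this after each evaluation run and start training only when it returns True, instead of retraining on a fixed clock.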
        
         | zhenghaoz wrote:
         | Thanks for your advice
        
       | opheliate wrote:
       | Really interesting project, thank you for sharing! Looking at the
       | documentation [0], it would appear that negative feedback for an
       | item can only be given by giving feedback that the user has read
       | the item, and not then providing positive feedback. Is that the
       | case? Or is there a way to provide explicitly negative feedback
       | that I'm not seeing?
       | 
       | 0: https://docs.gorse.io/ch01-02-recommend.html
        
         | zhenghaoz wrote:
         | There is no explicit negative feedback yet. When an item is
         | seen by a user, the read event is recorded. If the user likes
         | the item, positive feedback is recorded. It seems a natural
         | way to track a user's preferences. I will try to add explicit
         | negative feedback if someone really needs it.
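
The scheme described above (a read event with no later positive feedback counts as an implicit negative) can be sketched as follows. This is an illustration, not Gorse's code: the tuple layout, the `implicit_negatives` function, and the `"like"` feedback type are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Assumed record layout: (feedback_type, user_id, item_id)
Feedback = Tuple[str, str, str]


def implicit_negatives(events: Iterable[Feedback],
                       positive_types: Set[str]) -> Dict[str, Set[str]]:
    """For each user, return the items that were read but never
    received any positive feedback."""
    read = defaultdict(set)
    liked = defaultdict(set)
    for ftype, user, item in events:
        if ftype == "read":
            read[user].add(item)
        elif ftype in positive_types:
            liked[user].add(item)
    # Set difference: seen items minus positively rated items.
    return {user: items - liked[user] for user, items in read.items()}
```

For example, a user who read items `a` and `b` but liked only `a` ends up with `b` as an implicit negative.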
        
           | genewitch wrote:
           | "ignored by 84% of visitors" is a metric
        
         | Sloppy wrote:
         | If you start with an algorithm that accepts any type of user
         | behavior (is multimodal) then using user-dislikes is easy.
         | 
         | We can easily do this with the Universal Recommender as shown
         | in this article (reprinted from the IBM dev blog)
         | https://actionml.com/blog/making_dislikes_predict_likes
        
       | vector_spaces wrote:
       | A challenge with naive recommender systems in an ecommerce
       | setting is that they tend to recommend the most popular items on
       | your site rather than surfacing the long tail of SKUs, which
       | doesn't generally add much value (you could achieve a similar
       | effect by showing users recommendations from a hardcoded list of
       | top items rather than leaning on a model). If Gorse is generic
       | and unaware of the domain, how does it avoid falling into this
       | trap?
        
         | Sloppy wrote:
         | Not sure this is true but in any case, the Universal
         | Recommender uses a technique based on the TF-IDF algorithm of
         | search to get long-tail recs. This de-weights popularity so
         | that relevancy is more important.
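
As a rough illustration of IDF-style popularity de-weighting, the sketch below scales each item's raw score by a weight that shrinks as more users interact with the item. It is not the Universal Recommender's implementation; the function names, the smoothing formula, and the sample numbers are assumptions.

```python
import math
from typing import Dict, List


def idf_weights(interactions_per_item: Dict[str, int],
                n_users: int) -> Dict[str, float]:
    """IDF-style weight: the more users interacted with an item,
    the smaller its weight (smoothed to avoid division by zero)."""
    return {item: math.log((1 + n_users) / (1 + count)) + 1.0
            for item, count in interactions_per_item.items()}


def rerank(raw_scores: Dict[str, float],
           weights: Dict[str, float]) -> List[str]:
    """Order items by raw score multiplied by their popularity weight."""
    return sorted(raw_scores,
                  key=lambda item: raw_scores[item] * weights.get(item, 1.0),
                  reverse=True)
```

With 1000 users, an item touched by 900 of them gets a weight near 1, while an item touched by 9 gets a weight above 5, so a niche item with a moderate raw score can outrank a bestseller.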
        
         | contravariant wrote:
         | Now you've made me curious. If the point is to recommend stuff
         | that people are most likely to buy next then recommending the
         | most popular items is likely correct. So what metric do you
         | have in mind that leads you to the conclusion that recommender
         | engines _shouldn't_ recommend the most popular articles?
        
           | draugadrotten wrote:
           | Seller satisfaction will be the highest if the user buys the
           | item. Basing the recommender on sales will make it a seller-
           | focused recommender.
           | 
           | If the recommender somehow can be based on buyer satisfaction
           | - it would be a system that focuses on the buyer. It would
           | then perhaps not please the seller.
           | 
           | Taking it one step further, if a buyer is VERY satisfied, the
           | buyer is likely to return for MORE purchases, ensuring the
           | satisfaction of BOTH seller and buyer with the first
           | recommendation. This is the ultimate recommender, measuring
           | not first-purchase buyer/seller satisfaction, but rather
           | measuring "so satisfied with the first purchase that the
           | buyer returned for a second purchase".
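
One way to measure the "returned for a second purchase" idea is sketched below: for each item that was some user's first purchase, compute the fraction of those users who bought anything again. The function name and the input layout (chronologically ordered `(user_id, item_id)` pairs) are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def repeat_purchase_rate(purchases: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """purchases: (user_id, item_id) pairs in chronological order.

    Returns, per first-purchased item, the share of its first-time
    buyers who made at least one later purchase."""
    first_item = {}
    returned = set()
    for user, item in purchases:
        if user in first_item:
            returned.add(user)       # any later purchase counts as a return
        else:
            first_item[user] = item  # remember what the user bought first
    buyers = defaultdict(int)
    repeaters = defaultdict(int)
    for user, item in first_item.items():
        buyers[item] += 1
        if user in returned:
            repeaters[item] += 1
    return {item: repeaters[item] / buyers[item] for item in buyers}
```

A recommender tuned on this number would favor items whose buyers come back, rather than items that merely sell once.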
        
       | KaoruAoiShiho wrote:
       | What do you think about using a graph DB like neo4j, redis-graph
       | or the Chinese competitor Nebula Graph?
        
       | andyxor wrote:
       | cool project, i was curious which algorithms are used for
       | recommendations/ranking, couldn't find it in the docs, so had to
       | go through the code, maybe add a section on algorithms and
       | models?
       | 
       | it seems that one algo used is "Bayesian Personalized Ranking"
       | based on a paper from 2012 https://arxiv.org/pdf/1205.2618.pdf,
       | here is a related blog post
       | https://towardsdatascience.com/recommender-system-using-baye...
       | 
       | another one is Weighted Regularized Matrix Factorization/ALS (by
       | the Netflix Prize winner team): http://yifanhu.net/PUB/cf.pdf ,
       | also see https://www.ethanrosenthal.com/2016/10/19/implicit-mf-
       | part-1...
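
For readers unfamiliar with Bayesian Personalized Ranking, a single stochastic gradient step of its pairwise objective can be sketched as below. This is a minimal illustration of the technique from the linked paper, not Gorse's code; the function name, learning rate, and regularization value are assumptions.

```python
import math
from typing import List


def bpr_step(p_u: List[float], q_i: List[float], q_j: List[float],
             lr: float = 0.05, reg: float = 0.01) -> float:
    """One gradient-ascent step on ln(sigmoid(x_uij)) with L2 regularization,
    where x_uij = p_u . (q_i - q_j), item i is preferred over item j.

    Updates the three factor vectors in place and returns x_uij as it
    was before the update."""
    x_uij = sum(pu * (qi - qj) for pu, qi, qj in zip(p_u, q_i, q_j))
    g = 1.0 / (1.0 + math.exp(x_uij))  # equals 1 - sigmoid(x_uij)
    for f in range(len(p_u)):
        pu, qi, qj = p_u[f], q_i[f], q_j[f]  # use pre-update values
        p_u[f] += lr * (g * (qi - qj) - reg * pu)
        q_i[f] += lr * (g * pu - reg * qi)
        q_j[f] += lr * (-g * pu - reg * qj)
    return x_uij
```

Repeating the step over sampled (user, liked item, other item) triples pushes the score of the liked item above the other one, which is why the method works with implicit feedback only.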
        
       ___________________________________________________________________
       (page generated 2021-07-17 23:00 UTC)