[HN Gopher] Real World Recommendation System
       ___________________________________________________________________
        
       Real World Recommendation System
        
       Author : nikhilgarg28
       Score  : 254 points
       Date   : 2022-04-11 16:44 UTC (6 hours ago)
        
 (HTM) web link (blog.fennel.ai)
 (TXT) w3m dump (blog.fennel.ai)
        
       | kixiQu wrote:
       | And then all of it is thrown away and they show ads instead. :)
        
       | greesil wrote:
       | MANGA
        
         | ehsankia wrote:
         | Google -> Alphabet, then add in Microsoft, Tesla and NVIDIA.
         | 
         | MANTAMAN
         | 
         | https://streetsharks.fandom.com/wiki/Mantaman
        
       | vikingcaffiene wrote:
       | Gentle reminder to anyone reading this that your problems are
       | probably not FAANG problems. If you architect your system trying
       | to solve problems you don't have, you are gonna have a bad time.
        
         | samstave wrote:
         | Wow, this is something that has been a floater-in-mind for
          | decades.
         | 
         | I'll top it off with an interview at Twitter with the Eng MGR
         | ~2009-ish?
         | 
         | --
         | 
          | Him: _So tell me how you would do things differently here
          | at twitter based on your experience?_
         | 
          | ME: "_Well, I have no idea what your internal processes
          | are, or architecture, or problems, so my previous
          | experience wouldn't be relevant._"
         | 
         | I'd go for the best option that suits goals.
         | 
          | [This was my literal response to the question, which I
          | thought was a trap, but I answered honestly -- as a
          | previous mgr of teams, the "well, we did it at my last
          | company as such" answer was a red flag]
          | 
          | Don't reply this way. <--
         | 
         | Here was his statement:
         | 
         | This is a literal quote from a hiring manager for
         | DevOps/Engineering at Twitter:
         | 
          |  _"Thank god! We have hired so many people from FB, where
          | that was their only job out of school, and no other
          | experience, and the biggest thing they told me was "well -
          | the way we did this at FB was... X"_
         | 
         | --
         | 
         | His biggest concern was engineering-culture-creep...
        
           | HWR_14 wrote:
           | Wow. That amazes me that anyone would answer that question
           | without knowing anything about the problem space and
           | implemented solutions.
           | 
           | Wait, I got it, I would rewrite everything as AWS Lambdas.
           | That's the right answer! Screw your (almost certainly SQL)
           | DB, let's move it all to DynamoDB too.
        
           | samhw wrote:
           | > Wow, this is something that has been a floater-in-mind for
           | decades
           | 
           | Have you literally never come across the "you're not
           | Google!!!" trope before now, during the whole ~decade leading
           | up to this very day? Gosh I envy you.
           | 
           | (Also, I am reaaally struggling to understand that story. Who
           | is speaking? It sounds like a story within a story within a
           | story. I can just about piece together the gist, but I'm very
           | confused by all the formatting and nested quotes.)
        
         | jeffbee wrote:
         | "And note that you don't even have to be at FAANG scale to run
         | into this problem - even if you have a small inventory (say few
         | thousand items) and a few dozen features, you'd still run into
         | this problem. "
         | 
         | -TFA
        
           | vikingcaffiene wrote:
           | Fair enough. I still think people should read stuff like this
           | with a healthy measure of skepticism.
        
       | habibur wrote:
       | > a machine learning model is trained that takes in all these
       | dozens of features and spits out a score (details on how such a
       | model is trained to be covered in the next post).
       | 
        | This part was the one I was interested in, as most of the
        | rest is obvious.
        
         | arkj wrote:
         | Looks like FAANG in the title is just to get your attention.
         | Details are missing.
        
         | nikhilgarg28 wrote:
         | (Disclaimer: I'm the author of the post)
         | 
         | Good feedback, noted. Will get the next post focused on
         | training within the next couple of days.
        
           | 1minusp wrote:
           | Also, how is this article different or more informative
           | compared to others that deal with the challenges of model
           | deployment/management at scale?
        
       | voz_ wrote:
       | This is shallow and generic almost to the point of uselessness. I
       | am having trouble understanding who the target audience is.
        
       | priansh wrote:
        | The main issue with deploying these systems right now is the
        | technical overhead to build them out. Existing solutions are
       | either paid and require you to share your valuable data, or open
       | source but either abandoned (rip Crab) or inextensible (most rely
       | on their own DB or postgres).
       | 
       | I'd love to see a lightweight, flexible recommendation system at
       | a low level, specifically the scoring portion. There are a few
        | flexible ones (Apache has one), but none are lightweight;
        | they require massive servers (or often clusters). They also
        | can't be bundled into frontend applications, which makes it
        | difficult for
       | privacy-centric, own-your-data applications to compete with paid,
       | we-own-your-data-and-will-exploit-it applications.
        
         | orasis wrote:
         | I think we've done a pretty good job on the scoring side with a
         | fast and simple to use API that runs in-process:
         | https://improve.ai
        
       | KaiserPro wrote:
       | > As a result, primary databases (e.g. MySQL, Mongo etc.) almost
       | never work
       | 
       | I mean it does. As far as I'm aware Facebook's ad platform is
       | mostly backed by hundreds of thousands of Mysql instances.
       | 
       | But more importantly this post really doesn't describe issues of
       | scale.
       | 
       | Sure it has the stages of recommendation, that might or might not
       | be correct, but it doesn't describe how all of those processes
       | are scheduled, coordinated and communicate.
       | 
       | Stuff at scale is normally a result of tradeoffs, sure you can
       | use a ML model to increase a retention metric by 5% but it costs
       | an extra 350ms to generate and will quadruple the load on the
       | backend during certain events.
       | 
        | What about the message passing? Is that one monolith making
        | the recommendation (cuts down on latency, kids!) or
        | microservices? What happens if the message doesn't arrive,
        | do you have a retry? What have you done to stop retry
        | storms?
        | 
        | Did you bound your queue properly?
       | 
        | None of this is covered, and, my friends, that is 90% of the
        | "architecture at scale" that matters.
        | 
        | Normally stuff at scale is "no clever shit", followed by
        | "fine, you can have that clever shit, just document it
        | clearly. Oh, you've left", which descends into "god this is
        | scary and exotic", finally leading to "let's spend half a
        | billion making a new one with all the same mistakes."
        
         | judge2020 wrote:
          | > As far as I'm aware Facebook's ad platform is mostly
          | backed by hundreds of thousands of Mysql instances.
         | 
         | Same for YouTube itself
         | https://www.mysql.com/customers/view/?id=750 and they use
         | Vitess for horizontal scaling: https://vitess.io/
        
           | emptysea wrote:
           | YouTube has since migrated to Spanner, there's a podcast
           | episode with one of the Vitess creators that covers the
           | politics of the switch
        
         | efsavage wrote:
         | > mostly backed by hundreds of thousands of Mysql instances
         | 
          | Kind of. It's part of the recipe, but one thing you find
          | at these large tech companies (I've worked at FB and GOOG)
          | is that they have the resources to bend even large/standard
          | projects like MySQL to their will, while ideally preserving
          | the good ideas that made them popular in the first place.
          | There are wrappers/layers/modifications/etc that eventually
          | evolve to subsume the original software, such that it acts
          | more like a library than a standalone service/application.
          | So, for example,
         | while your data might eventually sit in a MySQL table, you'll
         | never know, and likely didn't write anything specific to MySQL
         | (or even SQL) to get there.
        
           | samhw wrote:
           | I mean, this post from a year ago makes it sound _not that
            | non-standard_: https://engineering.fb.com/2021/07/22/data-
           | infrastructure/my...
           | 
           | What you're describing sounds like you mean something on the
           | level of Cockroach, talking the Postgres wire protocol but
           | implemented entirely independently underneath (which came
           | indirectly out of Google). Facebook's MySQL deployment sounds
           | more like a heavily-patched-but-basically-MySQL installation.
           | I think Facebook is overanalogised to Google sometimes, as an
           | engineering org.
           | 
           | (Admittedly I haven't worked at either whereas you have -
            | though I have at another FAANG fwiw - but am basing this
           | impression partly on what I hear from friends & partly on
           | plain old stuff I read on the internet.)
        
         | xico wrote:
         | Meta is relatively open (and open source) in how they handle
         | stuff, including ranking, scoring and filtering described in
         | the original article, but also fast inverted indexes and
         | approximate nearest neighbors in high-dimensional spaces. See,
         | for instance, Unicorn [1,2] or (at a lower level) FAISS [3].
         | 
         | [1]
         | http://people.csail.mit.edu/matei/courses/2015/6.S897/readin...
         | 
         | [2] https://dl.acm.org/doi/pdf/10.1145/3394486.3403305
         | 
         | [3] https://faiss.ai/
        
         | whimsicalism wrote:
         | I disagree - this seems quite clearly to address issues of
         | scale, going into multiple-pass ranking, etc. etc.
        
       | dinobones wrote:
       | How FAANG actually builds their recommendation systems:
       | 
       | Millions of cores of compute, exabyte scale custom data stores.
       | Good recommendations are expensive. If you try to build a similar
       | system on AWS, you will spend a fortune.
       | 
        | Most recommender models just use co-occurrence as a seed;
        | this can actually work pretty well on its own. If you want
        | to get fancy, then build up a vectorized form of the
        | document with something like an autoencoder, then use some
        | approximate nearest neighbors to find documents close by.
        | 95% of the compute and storage is just spent on calculating
        | co-occurrence though.
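The co-occurrence seed described above can be sketched in a few lines. This is a toy illustration, not any company's actual pipeline; the item names and the `cooccurrence_recs` helper are invented for the example:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_recs(histories, item, k=3):
    """Recommend the k items most often consumed alongside `item`.

    `histories` is a list of per-user item sets. As the comment says,
    most of the work is simply counting pairs of items that show up
    in the same history.
    """
    counts = defaultdict(int)
    for history in histories:
        for a, b in combinations(sorted(history), 2):
            counts[(a, b)] += 1

    def pair_count(other):
        return counts.get(tuple(sorted((item, other))), 0)

    candidates = {i for h in histories for i in h if i != item}
    return sorted(candidates, key=pair_count, reverse=True)[:k]

histories = [
    {"inception", "interstellar", "tenet"},
    {"inception", "interstellar"},
    {"inception", "tenet"},
    {"up", "coco"},
]
print(cooccurrence_recs(histories, "inception", k=2))
```

At real scale the pair counting is sharded across many machines, which is where the "95% of the compute" goes.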
        
         | TheRealDunkirk wrote:
         | > Millions of cores of compute, exabyte scale custom data
         | stores. Good recommendations are expensive. If you try to build
         | a similar system on AWS, you will spend a fortune.
         | 
          | And then it will be gamed, and become as useless as every
          | other recommendation system already out there.
        
           | samhw wrote:
           | Also, 'millions of cores' is a ludicrously shitty, zero-clue
           | answer. It's like asking how Eminem makes music, and saying
           | 'millions of pills'. Like, yes, that's an input, but you're
           | missing _the entire method of creation, of converting the
           | crude inputs into the outputs_.
           | 
           | For my money - and, for what little it's worth, I work in
           | this field - I think most of the impressive feats of data
           | science attributed to 'machine learning' are really just a
           | function of now having hardware capacity so insanely great
           | that we're able to 'make the map the size of the territory',
           | so to speak. These models are essentially overfitting
           | machines, but that's OK when (a) it's an interpolation
           | problem and (b) your model can just memorise the entire input
           | space (and deal with any inaccuracies by regularisation,
           | oversampling, tweaking parameters till you get the right
           | answers on the validation set, then talking about how 'double
           | descent' is a miracle of mathematics, etc).
           | 
           | Don't get me wrong, neural nets are obviously not rubbish.
           | They are a very good method for non-convex, non-
           | differentiable optimisation problems, especially
           | interpolation. (And I'm grateful for the hype cycle that's
           | let me buy up cheap TPUs from Google and hack on their
           | instruction set to code up linear algebraic ops, but for way
           | more efficient optimisation methods, and also in Rust, lol.)
           | It's just a far more nuanced story than "this method we
           | discovered and hyped up for a decade in the 80s suddenly
           | became the key to AGI".
        
       | nixpulvis wrote:
       | These steps read to me like: first we filter, then we filter,
       | then we filter; all of this being done based on some various
       | orders of the data.
       | 
       | The devil's in the details, which are surely domain specific and
       | hopefully not too morally questionable.
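The filter-then-filter-then-filter shape is easy to make concrete. A minimal sketch, with invented stage names (`retrieve`, `filter_eligible`, `rank`) and toy data standing in for whatever domain-specific logic the real systems use:

```python
def retrieve(inventory, query, limit=100):
    """Stage 1: cheap candidate generation - the first 'filter'."""
    return [item for item in inventory if query in item["tags"]][:limit]

def filter_eligible(candidates, user):
    """Stage 2: business-rule filtering - the second 'filter'."""
    return [c for c in candidates if c["id"] not in user["seen"]]

def rank(candidates, user, k=3):
    """Stage 3: score and keep only the top k - the final 'filter'."""
    def score(c):
        return len(set(c["tags"]) & set(user["interests"]))
    return sorted(candidates, key=score, reverse=True)[:k]

inventory = [
    {"id": 1, "tags": ["ml", "infra"]},
    {"id": 2, "tags": ["ml"]},
    {"id": 3, "tags": ["ml", "systems"]},
]
user = {"seen": {2}, "interests": ["systems"]}
recs = rank(filter_eligible(retrieve(inventory, "ml"), user), user)
print([r["id"] for r in recs])  # [3, 1]
```

Each stage gets more expensive per item and sees fewer items, which is the whole trick of multi-pass ranking.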
        
       | fmakunbound wrote:
       | With all of this technology applied, I am still disappointed by
       | Netflix's recommendations - to the point of just giving up and
       | doing something else.
        
         | liveoneggs wrote:
         | I was actually pretty impressed the other day when searching
         | for "shiloh" (which they didn't have) because it showed a bunch
         | of "related" queries to other dog movies (they also didn't
         | have). The available search results were a little lacking
         | though.
        
         | foldr wrote:
         | In some ways it seems like a classic case of trying to solve
         | the wrong problem because the wrong problem potentially has a
         | technical solution. The real problem is making lots of
         | interesting content for people to watch. If you can solve that
         | problem then a simple system of categories is perfectly
         | sufficient for people to discover content. But that's not a
         | technical problem, and all those engineers have to be given
         | something to do.
        
         | chuckcode wrote:
         | Do you think part of this is that Netflix has assumed zero
         | effort from user model? My experience has been that Netflix
         | does an ok job of recommendations, but fails at overall
         | discovery experience. There is no way for me to drive or view
         | content from different angles easily. I end up googling for
         | expert opinions or hitting up rotten tomatoes to get better
         | reviews. Netflix knows a ton about me and their content, but
         | seems to do a poor job of making their content
         | browseable/discoverable overall. I do like their "more like
         | this" feature where I can see similar titles.
        
           | imilk wrote:
           | Google TV has the best content discovery I've come across so
           | far. Recommendations across most streaming services based on
           | overall similar movies, different slices of the genre, and
           | movies with similar directors/cast members. Plus as soon as
           | you select another movie, you can see all the same "similar"
           | recommendations for that movie.
        
           | invalidOrTaken wrote:
           | >Do you think part of this is that Netflix has assumed zero
           | effort from user model?
           | 
           | Talking w/a friend who works at Netflix, it sounds like this
           | is a warranted assumption. The way he told it, they were
           | tearing their hair out at one point b/c users wouldn't put
           | much into it.
        
             | samhw wrote:
             | What I don't understand about their response is: _why not
             | make it configurable?_ Admittedly this is my philosophy for
             | almost every product I work on -  "make it maximally
             | configurable, but make the defaults maximally sane" - but
             | I'm baffled every time I hear someone talking about this
             | 'dilemma'.
             | 
             | You just keep your simple interface, but allow the power
             | users to, say, click through to a particular menu and
             | change their setting - the setting in this case being ~"let
             | me provide feedback / configure how recommendations work".
             | For that kind of user, finding a 'cheat code' is actually a
             | gratifying product experience anyway.
        
               | aleksiy123 wrote:
                | I think it's because the complexity of allowing
                | configurability isn't always worth it. Verifying it
                | works for all configurations becomes exponentially
                | harder.
                | 
                | I believe it can also have performance implications,
                | especially for things like recommender systems where
                | you depend a lot on caching, precomputation and
                | training.
        
               | invalidOrTaken wrote:
               | I don't disagree!
        
         | nonameiguess wrote:
         | Rotten Tomatoes works fine as a recommendation system. It lists
         | all of the new content coming out in a given week. I just read
         | that every week, file down to what looks interesting based on
         | the premise, and read a few reviews. I can usually tell pretty
         | easily what I'll like. No need for in-app recommendations from
         | any specific streaming service at all. Good old-fashioned human
         | expert curators.
        
         | edmundsauto wrote:
         | This indicates that the problem is difficult to solve at scale
         | and customized per person. Maybe the issue is with our
         | expectations - I find other people are pretty bad at
         | recommending things for me as well.
        
           | buescher wrote:
           | Maybe. Recommendation systems definitely seem to get worse as
           | they scale. Amazon's was incredible circa 2000. Pandora seems
           | to be getting worse and more repetitive. Netflix kept getting
           | better and better until they ended their contest and since
           | then they seem to have only become worse.
        
         | jeffbee wrote:
         | Maybe you're just disappointed with Netflix's inventory, not
         | their recommendations.
        
           | colinmhayes wrote:
           | I think it's both. I'm usually able to find decent stuff by
           | searching "best on netflix" with some modifiers, but I almost
           | never find new stuff I like by scrolling on netflix.
        
       | werber wrote:
        | Tangent, but I was recently thinking about how FAANG is now
        | MAANG, and the definition of mange (from a google search,
        | lol): mange /manj/, noun: a skin disease of mammals caused
        | by parasitic mites and occasionally communicable to humans.
        | It typically causes severe itching, hair loss, and the
        | formation of scabs and lesions. "foxes that get mange die in
        | three or four months"
       | 
       | I find it oddly poetic, but, this is my last day of magic.
        
       | nitinagg wrote:
       | What's going wrong with Google search's recommendations every
       | day?
        
         | ultra_nick wrote:
         | Garbage data in. Garbage data out.
        
           | samhw wrote:
            | What? They have absolutely _tremendous_ data, the envy of
            | any data scientist on the planet. I don't understand how
            | you could possibly describe their user data as garbage in
            | any conceivable way. Even search result click-and-query
            | data _alone_ - leaving out Android, Chrome, Cloud, and
            | everything else - is a stupendously valuable, priceless
            | asset.
           | 
           | If you call that garbage, what on earth - or, for that
           | matter, off it - is _not_ garbage!?
        
       | lysecret wrote:
        | Interesting post. One thing to note: this seems to be about
        | "on request" ranking, e.g. googling something and needing
        | the recommended content within 500ms.
        | 
        | However, a lot of use cases are time-insensitive rankings,
        | like recommending content on Netflix, Spotify etc.
        | (Spotify's Discover Weekly even has a one-week (!) request
        | time :D).
        | 
        | In which case you can just run your ranking offline and
        | store the recs in your DB, and it's much, much easier.
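A sketch of the time-insensitive case, assuming a toy tag-overlap scorer and a SQLite table (`recs`) invented for the example; the point is only that serving reduces to a single key lookup:

```python
import json
import sqlite3

def score(user, item):
    # Stand-in for the real ranking model; any offline scorer works here.
    return len(set(user["likes"]) & set(item["tags"]))

def precompute(users, items, conn, top_n=2):
    """Batch job: rank everything offline, store only the final lists."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recs (user_id TEXT PRIMARY KEY, items TEXT)")
    for user in users:
        ranked = sorted(items, key=lambda it: score(user, it), reverse=True)
        top = [it["id"] for it in ranked[:top_n]]
        conn.execute("INSERT OR REPLACE INTO recs VALUES (?, ?)",
                     (user["id"], json.dumps(top)))
    conn.commit()

def get_recs(conn, user_id):
    """Serving path: a primary-key lookup, no model in the loop."""
    row = conn.execute(
        "SELECT items FROM recs WHERE user_id = ?", (user_id,)).fetchone()
    return json.loads(row[0]) if row else []

conn = sqlite3.connect(":memory:")
users = [{"id": "u1", "likes": ["jazz", "piano"]}]
items = [{"id": "a", "tags": ["jazz"]}, {"id": "b", "tags": ["rock"]},
         {"id": "c", "tags": ["jazz", "piano"]}]
precompute(users, items, conn)
print(get_recs(conn, "u1"))  # ['c', 'a']
```

The latency budget moves entirely into the batch job, which can then take minutes (or, as with Discover Weekly, a week).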
        
         | troiskaer wrote:
         | This is pretty much what both Netflix and Spotify do. I would
         | argue that there isn't a canonical recommendations stack that
         | FAANG is converging towards, and that's a direct corollary of
         | differing business requirements and organizational structure.
        
       | endisneigh wrote:
        | Is there any recommendation system people are actually happy
        | with? They all seem to suck in my experience.
        
         | chudi wrote:
          | All feeds are recommendation systems: instagram, facebook,
          | twitter, tiktok, youtube - every single one is a
          | recommendation system.
        
           | notriddle wrote:
           | Technically, yes, but when they're talking about this sort of
           | thing, they mean "personal recommendation system" or
           | "content-based recommendation system."
           | 
           | For example, the HN front page is a recommendation system if
           | you literally mean system-that-recommends-web-pages-to-look-
           | at. But it's not personalized; every visitor sees the same
           | front page. This fundamentally makes it a different sort of
           | thing.
        
             | chudi wrote:
                | If you have 10000 posts that you have to sort in
                | some way and the user is only going to see 20 of
                | them, the sorting is the recommendation system.
                | People are just used to thinking of products, movies
                | and songs, but on those platforms the users are the
                | products.
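Concretely, that framing is just scoring plus a top-k selection. The `personal_score` weighting below is a made-up stand-in for a real per-user model:

```python
import heapq
import random

random.seed(0)
posts = [{"id": i, "engagement": random.random()} for i in range(10_000)]

def personal_score(user_weight, post):
    # Hypothetical per-user weighting: any personalization of the sort
    # key turns a plain feed sort into a recommendation system.
    return user_weight * post["engagement"]

# The user only ever sees 20 of the 10000 posts, so the selection *is*
# the recommendation; a top-k pick avoids sorting the whole feed.
feed = heapq.nlargest(20, posts, key=lambda p: personal_score(1.5, p))
print(len(feed))  # 20
```

`heapq.nlargest` runs in O(n log k) rather than O(n log n), which matters once "n" is every post a user could see.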
        
               | notriddle wrote:
               | This hardly seems like a reasonable way to characterize
               | Netflix, which has a personal recommendation system,
               | especially compared to HN, which is ad supported yet
               | gives the same recommendations to everyone.
        
         | charcircuit wrote:
         | YouTube, TikTok, and Twitter all work well for me.
        
         | mrfox321 wrote:
         | TikTok
        
           | colesantiago wrote:
           | Why TikTok in particular? What is the engineering story
           | behind TikTok's recommendation system? How did they get it
           | right?
        
             | keewee7 wrote:
                | TikTok seems to be learning from what the user is
                | actually watching and for how long, and not just from
                | the user's "Like"/"Not Interested In" actions.
                | However, it still seems to learn from the "Not
                | Interested In" action more than any other platform
                | does.
        
               | pedrosorio wrote:
               | This is a pretty misinformed take when it's publicly
               | known that YouTube was already doing this (learn from
               | what the user is watching and for how long) the year
               | Bytedance was founded (2012):
               | 
               | https://blog.youtube/news-and-events/youtube-now-why-we-
               | focu...
        
       | hallqv wrote:
       | Anyone have recommendations (no pun) for more in depth resources
       | on the subject (large scale recommendation systems)?
        
         | whiplash451 wrote:
         | The RecSys conference proceedings might help
        
         | lmc wrote:
         | Much of the field seems to be fixated on throwing massive
         | compute resources at models with results that can neither be
         | evaluated nor reproduced.
         | 
         | "the Recommender Systems research community is facing a crisis
         | where a significant number of papers present results that
         | contribute little to collective knowledge [...] often because
         | the research lacks the [...] evaluation to be properly judged
         | and, hence, to provide meaningful contributions"
         | 
         | https://doi.org/10.1145%2F2532508.2532513
         | 
         | More here...
         | https://en.wikipedia.org/wiki/Recommender_system#Reproducibi...
        
           | whimsicalism wrote:
           | By "the field", you surely mean the academic field. In the
           | industry, we run controlled experiments to validate all the
           | time.
           | 
           | Recommender systems is one of the few areas in ML where
           | almost all of the knowledge is contained in industry, not
           | academia.
        
             | lmc wrote:
             | That was my thinking - anything of value is product-
             | specific and behind closed doors. It's not my field, but
             | something I see come up from time to time that seems
             | weirdly over-represented in ML articles.
        
               | samhw wrote:
               | I work on these systems, and if anything my only
               | complaint about the field is the propensity to solve
                | every optimisation problem with ML. I have seen
                | people use it to solve textbook-grade linear, and
                | even merely differentiable, optimisation problems.
               | 
               | And the reason it happens despite the 'invisible hand'
               | etc is because _it still works, it just happens to be
                | horrendously inefficient_. I think that's the main
                | area
               | of inefficiency in the industry: not in getting the job
               | done, nor even arguably in accuracy - at least not
               | severely - but in overcomplicating the solution[0]
               | because we've formed a cargo cult around one particular
               | method of optimisation, beyond all nuance.
               | 
               | [0] I mean 'overcomplicating' in absolute terms. Of
               | course the very crux of my point is that, from the data
                | scientist's perspective, it's _not_ overcomplicated -
                | it's _less_ complicated than using e.g. ILP precisely
               | because we have made libraries like TensorFlow so
               | incredibly easy and tempting to use.
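As a toy example of the point: a one-parameter least-squares fit has a one-line closed-form solution, but gradient descent on the same convex loss also gets there, just with a few thousand iterations of extra work. The numbers are invented:

```python
# A textbook-grade problem: fit y = w * x by least squares.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]

# Closed form: w = sum(x*y) / sum(x*x) - one line, exact.
w_exact = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# The ML-flavoured route: gradient descent on the same convex loss.
# It converges to the same answer, just with far more work.
w = 0.0
lr = 0.01
for _ in range(2000):
    grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys))
    w -= lr * grad

print(abs(w - w_exact) < 1e-9)  # True
```

Both routes "work"; the difference is the cost, which is exactly the inefficiency being described.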
        
       | ocrow wrote:
       | These recommendation systems take control away from individuals
       | over what content they see and replace that choice with black box
       | algorithms that don't explain why you are seeing the content that
       | you are or what other content was excluded. All of the companies
       | who have deployed these content selection algorithms could have
       | also given you manual choice over the content that you see, but
       | chose instead to let the algorithm solely determine the content
       | of your feed, either removing the manual option entirely or
       | burying it so thoroughly that no one bothers to use it.
       | 
       | These algorithms are not benign. They make choices about what
       | information you consume, whose opinions you read, what movies you
        | watch, what products you are exposed to, even which
        | politicians' messages you hear.
       | 
       | When people complain about the takeover of algorithms, they don't
       | mean databases or web interfaces. They mean this: content
       | selection or preference algorithms.
       | 
       | We should be deeply suspicious. We should demand greater
       | accountability. We should require that the algorithms explain
        | themselves and offer alternatives. We should build better
        | systems that give control back to the users in meaningful
        | ways.
       | 
       | If software engineering is indeed a profession, our professional
       | responsibilities include tempering the damaging effects of
       | content selection algorithms.
        
         | KaiserPro wrote:
          | Did you know how a newspaper used to choose which articles
          | it wanted to run?
         | 
         | Do you know how a TV channel decides to schedule stories?
         | 
          | Humans, it's all humans, looking at the metrics and
          | steering stuff that feeds that metric.
         | 
          | Content filters are dumb and easy to understand.
          | Seriously, open up a fresh account at FB, instagram,
          | twitter or tiktok.
         | 
         | First it'll try and get a list of people you already know.
         | Don't give it that.
         | 
          | Then it'll give you a bunch of super popular but
          | clickbaity influencers to follow. Why? Because they are
          | the things that drive attention.
         | 
          | If you follow those defaults, you'll get a view of what's
          | shallow and popular: spam, tits, dicks and money.
         | 
          | If you find a subject leader, for example an independent
          | tool maker, cook, pattern maker, or builder, then most of
          | your feed will be full of those subjects, save for about
          | 10% random shit that's there to expand your subject range
          | (mostly tits, dicks, spam or money).
         | 
         | What you'll see is stuff related to what you like and stare at.
         | 
          | And that's the problem: they are dumb mirrors. That's why
          | you don't let kids play with them. That's why you don't
          | let people with eating disorders go on them, and that's
          | why mental health care needs to be more accessible,
          | because sometimes holding up a mirror to your dark desires
          | is corrosive.
         | 
          | Could filter designers do more? Fuck yeah, but we also
          | have to be aware that filters are a great whipping boy for
          | other, more powerful things.
        
       | oofbey wrote:
       | Off-topic, but how did Netflix manage to get itself inserted into
       | the FAANG acronym anyway? Their impact on the tech industry is
        | trivial compared to all the others. Sure, if you just take
        | out the N the acronym is offensive, but we could have said
        | "GAFA", or "FAAMG" would be more accurate, to include
        | Microsoft in their place.
        
         | vincentmarle wrote:
         | There was a point in time when FAANG offered the best
         | compensation packages for engineers (Netflix was one of them) -
         | so that's where the term originated from but while it's
         | outdated in many respects (Microsoft is not included, Facebook
         | is now Meta, Google is now Alphabet etc etc) it's still sticky
         | for some reason.
        
           | whimsicalism wrote:
           | Microsoft in 2022 does not compensate as well as any of
           | those. Microsoft in 2021 only out-compensated Amazon.
        
           | hbn wrote:
           | > Facebook is now Meta, Google is now Alphabet
           | 
           | Eh, the new parent company names aren't really what people
           | know them as still. I don't think most people are even aware
           | that Google has a parent company.
           | 
           | I have a friend who works at Google, and that's what we say.
           | I don't think he or anyone else would ever say he works at
           | Alphabet.
        
             | oofbey wrote:
             | Yeah, Meta will likely stick because Zuckerberg and crew
             | are actively trying to run away from the dumpster fire they
             | lit with Facebook.
             | 
             | But Google is still Google, and probably always will be.
             | Just like Youtube is still Google, and Waymo is still
             | Google.
        
         | [deleted]
        
         | TacticalCoder wrote:
         | > "FAAMG" would be more accurate to include Microsoft in their
         | place
         | 
         | In Europe you nearly always see "GNAFAM", which includes
         | Microsoft too. It's certainly weird to exclude MSFT, worth at
         | times more than Amazon+Meta+Netflix combined.
        
         | dljsjr wrote:
         | The phrase originated w/ Jim Cramer; it refers to the 5 best-
         | performing tech stocks (or what were the best-performing at the
         | time). Nothing to do with their impact on the field from a
         | technical perspective, just a business perspective.
        
         | cordite wrote:
         | Netflix has contributed a lot to Java micro services, see
         | Eureka and Hystrix.
        
           | troiskaer wrote:
           | as well as to ML - Netflix Prize
           | (https://en.wikipedia.org/wiki/Netflix_Prize) and Metaflow
           | (https://github.com/Netflix/metaflow)
        
             | oofbey wrote:
             | No question they've done some things that have had some
             | impact on others in the industry. But none of them are
             | particularly important. It's all relative. Companies like
             | Twitter, Uber, and Airbnb have all released open source
             | projects or figured out how to solve hard problems in ways
             | that others have emulated.
             | 
             | But for every other one of the FAA(N)G companies, I can
             | barely work a day as a developer without touching every one
             | of their technologies. Yeah, Netflix got into ML years
             | before most, but the netflix prize exists as a distant
             | cautionary memory, and as an ML professional, I'd literally
             | never heard of metaflow before. Just sayin'.
        
               | troiskaer wrote:
               | > But none of them are particularly important
               | 
               | Nowhere was the argument made that somehow Netflix was
               | more influential than Twitter/Uber/AirBnB, but your
               | counter-argument that somehow it's less influential
               | because you haven't heard of/used some projects directly
               | holds no ground.
        
               | samhw wrote:
               | > your counter-argument that somehow it's less
               | influential because you haven't heard of/used some
               | projects directly holds no ground
               | 
               | Oh come on, they are indisputably right that Microsoft,
               | Twitter, Uber, Airbnb, hell, even Cloudflare are more
               | technically influential than Netflix is.
               | 
               | Apple and Google would make _anyone's_ top 5, that's his
               | point. No argument about it. Their products collectively
               | dominate anyone's life, along with MSFT. Netflix is
               | _maybe_ in your top 10, top 20 for sure, but it's not up
               | there as one of the few 'platforms that everyone's lives
               | are built on' techcos.
               | 
               | (Like, Netflix vs Microsoft? Seriously? For that matter,
               | Amazon probably wouldn't be in my top 5 either, and not
               | only because it's not mainly a tech company. I s'pose it
               | depends how you define 'Amazon', and if you include AWS.
               | But for Netflix there's just no argument that they win a
               | spot there.)
        
               | troiskaer wrote:
               | What's your argument for Twitter/Uber/AirBnB being
               | indisputably more technologically influential than
               | Netflix? And let's please talk facts rather than
               | opinions.
        
         | jedberg wrote:
         | FAANG was created by the TV personality Jim Cramer to talk
         | about high growth tech stocks. At the time Netflix was doubling
         | every year. It was based purely on finance.
         | 
         | It's now been taken over by the tech industry to be shorthand
         | for places that are highly selective in their hiring and tend
         | to work on cutting edge tech at scale.
         | 
         | That being said, the impact of Netflix on tech is pretty big.
         | They pioneered using the cloud to run at massive scale.
        
           | oofbey wrote:
           | > They pioneered using the cloud to run at massive scale.
           | 
           | Which is to say they were AWS's biggest early customer?
           | Doesn't really seem like Netflix should get the credit for
           | that one.
        
             | jedberg wrote:
             | It was a lot more than that. They developed systems and
             | techniques that even Amazon adopted and are still adopting
             | to this day. They also created a ton of open source tools
             | to help other people use the cloud:
             | 
             | https://netflix.github.io
             | 
             | Netflix tech even spawned a company to sell their open
             | source tools:
             | 
             | https://www.armory.io
             | 
             | And they codified the entire practice of Chaos Engineering:
             | 
             | https://en.wikipedia.org/wiki/Chaos_engineering
        
               | hetspookjee wrote:
               | That stockphoto on the front page of armory.io manages to
               | trigger al kinds of spammy website triggers for me.
        
             | [deleted]
        
             | patmorgan23 wrote:
             | They were pretty influential in refining the microservices
             | architecture
        
           | samhw wrote:
           | > FAANG was created by the TV personality Jim Cramer to talk
           | about high growth tech stocks. At the time Netflix was
           | doubling every year. It was based purely on finance.
           | 
           | That, and FAAG had less of a ring to it.
           | 
           |  _Edit: Dammit, the GP made the same observation. Oh well, I
           | 'm keeping it._
        
             | jedberg wrote:
             | If Netflix hadn't been growing so fast and hadn't been
             | included, Cramer probably would have gone with GAAF. :)
        
         | tempest_ wrote:
         | All the cool kids say GAMMA now.
        
           | aczerepinski wrote:
           | What is the G?
        
             | oofbey wrote:
             | Those who used to do no evil, but gave up on the idea as
             | not profitable enough.
        
           | oofbey wrote:
           | Not MAGMA?
        
             | ystad wrote:
             | Too hot I say :)
        
             | svachalek wrote:
             | Now I can't get Dr Evil saying MAGMA out of my mind.
        
         | errantmind wrote:
         | I think the acronym gained prominence before Microsoft's recent
         | 'commitment' to open source. Netflix also seemed to be doing
         | really interesting things scaling out 'disruption' to video
         | delivery at the time. It stuck.
        
         | yukinon wrote:
         | FAANG was never about impact on tech industry. Otherwise, MSFT
         | would be part of FAANG. Instead, it's directly related to (1)
         | stock price and (2) compensation.
        
       | BubbleRings wrote:
       | Want to dive into all this stuff but can't find a starting
       | point? Start with reading my patent!
       | 
       | I was smart enough to see what collaborative filtering (CF) could
       | be early on, and to file a patent that issued. I wasn't smart
       | enough to make it a complicated patent, or to choose the right
       | partners so I could have success with it.
       | 
       | But the patent makes a good way to learn how to get from "what
       | are your desert island 5 favorite music recordings?" over to
       | "here is a list of other music you might like". Basic CF, which
       | is at the core of a lot of this stuff. Enjoy!:
       | 
       | https://whiteis.com/whiteis/SE/
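For readers who want a concrete picture before reading further, here is a minimal sketch of the generic "your favorites → other things you might like" idea of basic collaborative filtering that the comment describes. This is not the patented method, just textbook user-overlap CF, and all the users and items below are invented:

```python
from collections import Counter

# Hypothetical "desert island favorites" for a handful of users.
favorites = {
    "alice": {"kind_of_blue", "a_love_supreme", "giant_steps"},
    "bob":   {"kind_of_blue", "giant_steps", "bitches_brew"},
    "carol": {"a_love_supreme", "bitches_brew", "maiden_voyage"},
}

def recommend(user, favorites, top_n=3):
    """Recommend items liked by users whose favorites overlap with ours."""
    mine = favorites[user]
    scores = Counter()
    for other, theirs in favorites.items():
        if other == user:
            continue
        overlap = len(mine & theirs)   # crude similarity: shared favorites
        for item in theirs - mine:     # only items the user hasn't listed
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("alice", favorites))   # -> ['bitches_brew', 'maiden_voyage']
```

Real systems replace the overlap count with cosine similarity or learned embeddings, but the shape of the computation is the same.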
        
       | siskiyou wrote:
       | All I know is that Facebook's recommendation systems always show
       | me things that I hate to see. I suppose they may "work" at scale,
       | but at an individual level it's an epic failure.
        
         | samstave wrote:
         | FB needs an Ad-Rev-Share-Model with ALL of its users...
         | 
         | Imagine if FB were to pay a fraction% of how your data was used
         | and paid you for it...
         | 
         | It may be a small amount, but in super 4th world countries, it
         | could effect change in their lives...
         | 
         | Now imagine that this becomes big... and it works well.
         | 
         | Now imagine that the populace is aware of the hand of god above
         | them just pressing keys to affect land masses (yes I am
         | referring to the game from the 80s)
         | 
         | but this catalyzes them into union building...
         | 
         | So when the people realize their metrics are the product to
         | feed consumerism for capitalistic profits, and decide to
         | organize, what happens?
         | 
         | Is FB going to need a military force to protect their DCs?
         | 
         | ---
         | 
         | With "Zuck Bucks" (I still am not sure if true)
         | 
         | This makes this ultimate "company store"
         | 
         | Tokens?
         | 
         | So how get?
         | 
         | How EARN? (What service on FB GENERATES '$ZB'?)
         | 
         | How spend?
         | 
         | WHAT GET? (NFTs?, Goods? Services?)?
         | 
         | The entire fucking model of EVERYTHING FB DOES is to MAP
         | SENTIMENT!
         | 
         | Sentiment is the tie btwn INTENT and SENTIMENTAL VALUE
         | 
         | The idea is to map interest with emotional drivers which make
         | someone buy _(spend resources their time and effort went into
         | building up a store-of)_...
         | 
         | ---
         | 
         | So map out your emotional response over N topics and forums..
         | Eval your documented online comments, NLP the fuck out of that,
         | see what your demos are, and build this profile to you....
         | 
         | THEN THEN THEN THEN
         | 
         | Offer an "earnable" (i.e. Grindable by farms and bots alike) --
         | "Zuck Buck" which is a TOKEN (etymology that fucking word for
         | yourself)
         | 
         | of value...
         | 
         | Meaning, zero INTRINSIC value, zero accountability (managed by
         | a central Zuck Bank <-- Yeah fuck that)
         | 
         | And the value both determined AND available to you via neither
         | INTRINSIC CONTROL nor VALUE.
         | 
         | ---
         | 
         | FB Bots Galore.
        
           | ParanoidShroom wrote:
            | > With "Zuck Bucks" (I still am not sure if true)
            | 
            | I expected more from this place than to believe every
            | clickbait FB news story. Of all the UX people and tons of
            | money they throw into research... Yes, the best option
            | was... "Zuck bucks". Don't get played ffs
        
             | samstave wrote:
             | You >quoted with no commentary.. so your point is :: TROLL?
        
               | samhw wrote:
               | So are you just making the punctuation up as you go, or
               | what?
        
           | imilk wrote:
           | Like many NFT/crypto posts, I have absolutely no idea whether
           | this is serious or a parody.
        
       | greatpostman wrote:
       | I've built one of these at FAANG. Generally the different parts
       | of the system are owned by separate teams that interact through
       | apis and ingest systems. Usually there's a mix of online and
       | offline calculations, where features are stored in a NoSQL DB and
       | some simple model runs in a Tomcat server at inference time, or
       | the offline result is just retrieved. Almost everything is
       | precomputed.
       | 
       | We had an api layer where another team runs inference on their
       | model as new user data comes in, then streams it to our api which
       | onboards the data.
       | 
       | On top of this, you have extensive A/B testing systems.
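A toy version of the pattern described above: features are precomputed offline into a key-value store, and only a lightweight model runs at request time. The store, feature names, and weights below are all invented for illustration:

```python
# Stand-in for a NoSQL feature store: user_id -> precomputed features,
# refreshed by an offline batch job rather than computed per request.
feature_store = {
    "u1": {"watch_time": 120.0, "likes": 8.0},
    "u2": {"watch_time": 5.0, "likes": 1.0},
}

# A deliberately simple linear model, cheap enough to run at inference time.
weights = {"watch_time": 0.01, "likes": 0.1}

def score(user_id):
    """Online path: one feature lookup plus a dot product."""
    feats = feature_store.get(user_id, {})
    return sum(weights[k] * feats.get(k, 0.0) for k in weights)

print(score("u1"))  # 0.01*120 + 0.1*8 = 2.0
```

The point of the design is that the expensive work (feature computation, model training) happens offline; the request path is a lookup and a few multiplications.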
        
         | splonk wrote:
         | I have as well, and your comment matches my experience more
         | than the article does. Different teams own different systems,
         | and there's basically no intersection between "things that
         | require a ton of data/computation" and "things that must be
         | computed online".
        
           | oofbey wrote:
           | Yep. The author, as a peddler of recommendations solutions,
           | has an incentive to convince people that this problem is very
           | complicated, and they should hire a consultant.
           | 
           | In practice, good old Matrix Factorization works really well.
           | Can you beat it with a huge team and tons of GPU hours to
           | train fancy neural nets? Probably. Can you set up a nightly
           | MF job on a single big machine and serve results quickly?
           | Sure can.
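A hedged illustration of the "nightly MF job on a single machine" approach: factorizing a toy interaction matrix with plain full-batch gradient descent. The matrix, latent dimension, learning rate, and iteration count are all made up; a production job would use an ALS/implicit-feedback library and far more data, but the idea is the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-item rating matrix (rows: users, cols: items); 0 means "unseen".
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)
observed = R > 0

k = 2                              # number of latent factors
U = 0.1 * rng.standard_normal((R.shape[0], k))   # user factors
V = 0.1 * rng.standard_normal((R.shape[1], k))   # item factors

lr, reg = 0.02, 0.01               # learning rate and L2 regularization
for _ in range(5000):              # gradient descent on observed entries only
    err = (R - U @ V.T) * observed
    U += lr * (err @ V - reg * U)
    V += lr * (err.T @ U - reg * V)

pred = U @ V.T                     # predictions, including the unseen cells
rmse = np.sqrt(np.mean((pred - R)[observed] ** 2))
```

The zeros in `R` get filled in with predicted scores, which is exactly the recommendation signal; rerunning this nightly and caching `pred` (or per-user top-K) is the "single big machine" setup the comment describes.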
        
         | lysecret wrote:
         | Yeah, same here. What NoSQL DB did you use for these lookups?
         | I'm currently using Postgres for it, but it seems a bit like a
         | waste, even though the array field is nice for feature vectors.
        
           | jenny91 wrote:
           | Presumably they mean internal stuff like google bigtable or
           | equivalent. (Though some version of that is now on gcp).
        
       | nickdothutton wrote:
       | If you can possibly precompute it, precompute it.
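In the spirit of that advice, a toy sketch of the precompute-then-serve pattern: a batch step ranks everything offline, and the online path is just a dictionary lookup. The user IDs, items, and scores are invented:

```python
# Offline batch step: compute top-K recommendations per user once
# (e.g. nightly) and write them to a cache or key-value store.
raw_scores = {
    "u1": {"item_a": 0.9, "item_b": 0.4, "item_c": 0.7},
    "u2": {"item_a": 0.1, "item_b": 0.8, "item_c": 0.2},
}

K = 2
precomputed = {
    user: sorted(scores, key=scores.get, reverse=True)[:K]
    for user, scores in raw_scores.items()
}

# Online path: no model inference at request time, just a lookup.
def serve(user_id):
    return precomputed.get(user_id, [])

print(serve("u1"))  # -> ['item_a', 'item_c']
```

The trade-off is freshness: precomputed results can be hours stale, which is fine for many feeds but not for, say, session-based recommendations.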
        
       | [deleted]
        
       | rexreed wrote:
       | Isn't this obvious list-building promotion for a company (Fennel)
       | that sells recommendation systems?
       | 
       | "Fennel AI: Building and deploying real world recommendation
       | systems in production Launched 18 hours ago"
       | 
       | Caveat reader.
        
         | [deleted]
        
         | warent wrote:
         | Nothing wrong with some content marketing. They provide value
         | to people in return for getting exposure to their brand. A
         | simple, healthy quid pro quo.
        
         | imilk wrote:
         | I'll never understand why people think this is a valid
         | criticism of an article, rather than pointing out an issue they
         | have with the actual content of the article. There's nothing
         | inherently wrong with a company sharing info about the space
         | they operate in. In fact, it should be encouraged as long as
         | what they share is useful.
        
           | notafraudster wrote:
           | It's a short-hand for the treatment of the subject being
           | pretty shallow and non-descript, which seems to apply to this
           | article exactly. I read this and didn't learn anything.
        
             | ZephyrBlu wrote:
             | Do you work on recommendations or something similar as part
             | of your job? I don't and I found the article interesting.
        
             | imilk wrote:
             | Saying the article is "pretty shallow and non-descript" is
             | much shorter and more useful than what they posted.
        
               | notafraudster wrote:
               | Right, but then it starts a meta-conversation about why
               | the article got posted, or even written. It doesn't have
               | the down-the-rabbit-hole trait of an individual project
               | of passion, or the sort of authoritative voice of a
               | conference talk or even a Netflix blog post, it doesn't
               | really speak to specific actionable technologies so it's
               | not the kind of onboarding a Towards Data Science post
               | would be. And that meta conversation inevitably leads to,
               | oh, it's a marketing funnel. So just saying "this is
               | content marketing" I think is a shibboleth for the entire
               | conversation that starts with "pretty shallow and non-
               | descript".
               | 
               | Of course I didn't write the original comment and there's
               | something to say for flag-and-move-on or whatever, and
               | other people did enjoy it. I'm just saying I understand
               | the impulse to short-circuit the entire tedious
               | conversation!
        
               | HWR_14 wrote:
               | It provides more information. The argument is that it's
               | shallow and non-descript because it's an ad. I don't know
               | if I believe that here. It's a blurry line with sponsored
               | content.
        
       ___________________________________________________________________
       (page generated 2022-04-11 23:00 UTC)