[HN Gopher] Yann LeCun on his start in AI and recent self-supervised learning research
       ___________________________________________________________________
        
       Yann LeCun on his start in AI and recent self-supervised learning
       research
        
       Author : andreyk
       Score  : 79 points
       Date   : 2021-08-05 17:03 UTC (5 hours ago)
        
 (HTM) web link (thegradientpub.substack.com)
 (TXT) w3m dump (thegradientpub.substack.com)
        
       | mooseburger wrote:
       | LeCun is interesting. The way he reasons about AI X-risk makes
       | him seem, not to mince words, remarkably obtuse. He's an actual
       | example of a mind so specialized that it has lost the capacity
       | for lateral thinking.
        
       | andreyk wrote:
       | Hey, I am an editor at The Gradient and host of this interview!
       | As this is only episode 6, the podcast domain is pretty new to
       | us^, so we would definitely welcome feedback on question choice
       | or my style as an interviewer. We tried focusing much more on
       | research than other interviews out there, such as Lex Fridman's;
       | I'd be curious to hear whether you think that worked well.
       | 
       | ^(we've existed as a digital publication focused on AI for way
       | longer, see thegradient.pub if you're curious.)
        
         | joe_the_user wrote:
         | You don't seem to include a transcript here. Seems like a
         | serious flaw (I personally prefer transcripts to audio but it's
         | actually an accessibility issue for some people).
        
           | andreyk wrote:
           | Thanks! We'll work on that
        
         | antimora wrote:
         | I couldn't see this episode in pocketcasts. Is it a technical
         | delay, or does it usually become available on other platforms
         | later?
        
           | andreyk wrote:
           | Yeah, it usually takes a little while to appear on other
           | platforms, annoyingly.
        
       | danmaz74 wrote:
       | One very interesting thing mentioned in the interview is how
       | much Facebook relies on deep learning right now; specifically,
       | how hate speech detection went from 75% manual to something like
       | 2.5% manual, and how manual review of false negatives enabled
       | this improvement.
       | 
       | What I'm wondering about is false positive detection, which
       | wasn't mentioned, and how much of this incredible decrease in
       | false negatives came at the expense of an increase in false
       | positives.
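       | 
       | For concreteness, the trade-off is just confusion-matrix
       | arithmetic (a toy sketch; all numbers below are invented):
       | 
       |   def rates(tp, fp, fn, tn):
       |       precision = tp / (tp + fp)
       |       recall = tp / (tp + fn)
       |       return precision, recall
       |   
       |   # Conservative tuning: misses a lot, rarely flags clean posts.
       |   print(rates(tp=750, fp=50, fn=250, tn=8950))   # P=0.94, R=0.75
       |   # Tuned to catch nearly everything: recall up, precision down.
       |   print(rates(tp=980, fp=400, fn=20, tn=8600))   # P=0.71, R=0.98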
        
         | joe_the_user wrote:
         | Anecdotally, FB's hate speech detector is pathetic. I have
         | lots of friends who have run afoul of it for trivial things.
         | It seems no more coherent than a bunch of regular expressions.
         | 
         | I had a post in a group get flagged by the algorithm for
         | something like "your politics shouldn't involve saying '[bad
         | word]' about [protected group]".
         | 
         | I suspect the problem is that, for a system that just defaults
         | to such crudeness, nothing ever registers as wrong. So that's
         | what happens.
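         | 
         | A filter this crude (a made-up sketch, not FB's actual
         | system) flags my post and a genuinely hateful one identically:
         | 
         |   import re
         |   
         |   # Crude keyword matching that ignores context entirely.
         |   BAD = re.compile(r"badword", re.IGNORECASE)
         |   
         |   posts = [
         |       "badword [protected group]!",       # actual hate
         |       "your politics shouldn't involve "  # condemning it
         |       "saying 'badword' about [protected group]",
         |   ]
         |   for post in posts:
         |       print(bool(BAD.search(post)))  # True for both
         |   # The pattern can't tell use from mention.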
        
           | civilized wrote:
           | I was once asked to use machine learning to make a record
           | linkage system for some crappy dataset. I got no
           | requirements, of course, so I set it up to have a reasonable
           | balance of precision and recall. After all, the point of
           | asking for an ML system must be to allow fuzzy matches that a
           | simple exact matching system would miss, right?
           | 
           | But my boss apparently got complaints about bad matches, so
           | he changed it to allow exact matches only.
           | 
           | The machine learning system ended up being a Rube Goldberg
           | machine for linking people based on exact name match.
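           | 
           | Roughly the kind of knob involved (a minimal sketch with
           | made-up names and thresholds; the real system had more to
           | it):
           | 
           |   from difflib import SequenceMatcher
           |   
           |   def sim(a, b):
           |       """Cheap fuzzy score in [0, 1] between two names."""
           |       return SequenceMatcher(None, a.lower(),
           |                              b.lower()).ratio()
           |   
           |   # One threshold trades precision for recall.
           |   T = 0.85   # lower -> more fuzzy matches, more complaints
           |   # T = 1.0  # what my boss effectively chose: exact only
           |   
           |   for a, b in [("Jon Smith", "John Smith"),
           |                ("Jon Smith", "Joan Smyth")]:
           |       print(a, b, sim(a, b) >= T)   # True, then False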
        
       | IdiocyInAction wrote:
       | Is there a summary or transcript available?
        
         | andreyk wrote:
         | There is a rough (AI-generated) transcript here:
         | https://app.trint.com/public/7ad490c6-bf70-41ea-bb43-24c3264...
         | 
         | We hope to produce polished transcripts in the future, but have
         | yet to figure out the best way.
         | 
         | A quick summary is:
         | 
         | * First ~15 minutes are intro + discussion of Yann's early
         | days of research in the 80s.
         | 
         | * Minutes ~15-45 cover several notable recent works in self-
         | supervised learning for computer vision (including SimCLR,
         | SwAV, SEER, SimSiam, Barlow Twins).
         | 
         | * Final ~15 minutes are discussion of empirical vs. theoretical
         | research in AI, how AI is used at Facebook, and whether there
         | will be another AI winter.
        
           | scribu wrote:
           | Thanks for that. It's hard to make out acronyms like "SWAV"
           | or "SEER" from the audio, if you're not already familiar with
           | them.
        
           | personjerry wrote:
           | The best way is to have someone listen to the audio and type
           | it out.
        
             | andreyk wrote:
             | True, I should say the best way that does not involve us
             | transcribing it ourselves. In any case, we'll work on
             | this!
        
               | jazzyjackson wrote:
               | I wonder if you could post the transcript to a git repo
               | and allow corrections via pull request. Auto-captioning
               | is a great first step to get phrases set to time-codes,
               | and then open it up to the community for corrections and
               | translations.
        
               | shoo wrote:
               | Accepting PRs would have the downside of generating
               | additional work for the repo's maintainers, who would
               | have to review them.
        
               | jazzyjackson wrote:
               | Still, it beats paying by the minute to actually hire
               | someone.
        
             | somerandomness wrote:
             | ASR works pretty well these days
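             | 
             | E.g., with the off-the-shelf speech_recognition package
             | (a minimal sketch; the file name is invented):
             | 
             |   import speech_recognition as sr  # pip install SpeechRecognition
             |   
             |   r = sr.Recognizer()
             |   # "episode6.wav" is a stand-in for the real audio file.
             |   with sr.AudioFile("episode6.wav") as source:
             |       audio = r.record(source)
             |   # Free web API; decent accuracy, not a polished transcript.
             |   print(r.recognize_google(audio))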
        
           | gnicholas wrote:
           | I've heard great things about Descript. It's not free (aside
           | from a limited trial), but apparently it makes it really easy
           | to get good transcripts, and it lets you clean up the audio
           | as well.
        
       | ok2938 wrote:
       | I am no one. I have the greatest respect for all Turing award
       | winners.
       | 
       | But one thing I am wary of is that LeCun - while special and
       | excellent - is, like many others, working at a place where "AI"
       | is already used to drive engagement - that is just the nature of
       | things if you are in the engagement business. And your "AI"
       | will gladly help you in all kinds of subtle ways. What is also
       | convenient is that it's uncharted territory now, so you can roam
       | freely - and engage the heck out of your audience.
       | 
       | And LeCun - as a "neutral" scientist - is just doing his
       | part.
       | 
       | Why can't he not work at FB? Money? Data?
        
         | rchaud wrote:
         | Because Facebook makes the most money, and probably offered him
         | the most.
         | 
         | Same reason why back in the day, a lot of people got electrical
         | engineering degrees but went into software development or
         | finance. The skills were transferable, and the pay was a lot
         | higher.
        
         | throwawaygh wrote:
         | _> Why can't he not work at FB? Money?_
         | 
         | IDK, I get it. Grad school sucks. Post-docs suck. Pre-tenure
         | sucks. Post-tenure isn't any better. For that entire period of
         | time you are working on de facto _fixed term contracts_. Which
         | is extremely uncommon among salaried engineers, and those that
         | take these sorts of contingent employment contracts are
         | typically paid quite well. It's like 10-15 years of low pay,
         | "will I have a job next year?" stress, and moving your family
         | around all the time (or, more commonly, just not starting a
         | family).
         | 
         | And not even for good pay. These days, even after a decade or
         | more of experience, you're making less than your undergrads.
         | Half as much or even less in some cases.
         | 
         | So, your undergrad once-peers start retiring -- or at least
         | thinking about it -- around the time that you're finally
         | transitioning from de facto fixed-term positions to something
         | resembling a normal employment contract, but, again, for a
         | third to a fifth of what you'd be making in industry at that
         | point in your career.
         | 
         | So, yeah, people say fuck it and cash in on
         | influence/engagement/reputation where they can. The only real
         | alternative is the public sector paying researchers better, but
         | that's never going to happen.
        
           | [deleted]
        
           | jstx1 wrote:
           | Those problems might be real but they aren't really relevant
           | to LeCun - it's not like his only options are academia and
           | Facebook.
        
         | andreyk wrote:
         | IMO it's similar to why Hinton works for Google - this gives
         | him huge resources (data, compute, money to pay researchers) to
         | do research with, unlike anything to be found in academia.
         | Perhaps this is a naive view, but this is a guy who spent
         | decades pushing for a direction in AI that was not popular but
         | which he really believed in, so it seems natural he would want
         | to accept resources to further research in that direction. Of
         | course, he's also been public about his view that Facebook
         | does more good than bad for the world.
         | 
         | Also, TBH I doubt he has much to do with the AI used for
         | engagement optimization, his specialty is in other topics and
         | he seems to be focused on the work of Facebook AI Research
         | (which openly engages with academia and so on). And to be fair
         | he is also still a professor at NYU and has students there.
        
       | 908B64B197 wrote:
       | A better format than a random Twitter thread where a mob tries
       | to cancel him [0]. You might recognize one name that got really
       | famous not long ago!
       | 
       | [0] https://syncedreview.com/2020/06/30/yann-lecun-quits-
       | twitter...
        
         | malshe wrote:
         | Thanks for sharing this article. Can someone knowledgeable
         | about this issue explain why this is not a data issue? I have
         | read people claiming that ML researchers may bring their own
         | biases into the models but I haven't seen any concrete example
         | of that. Even in the Twitter exchange in this article, Gebru
         | doesn't explain how this is not just data bias. She just throws
         | a lot of insults at LeCun but anyone can do that, right? I
         | would have loved to see her explanation as she is the expert in
         | this area.
        
           | visarga wrote:
           | Well, technically, the way you choose the algorithm and set
           | the hyper-parameters can influence accuracy in a non-uniform
           | way over the distribution of data, introducing additional
           | bias. The training process also introduces bias: optimizer,
           | batch size, learning rate and duration.
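           | 
           | As a toy illustration (everything below is invented, no real
           | data): a single global decision threshold, itself just a
           | design choice, can give two subgroups different false
           | positive rates.
           | 
           |   import numpy as np
           |   
           |   rng = np.random.default_rng(0)
           |   n = 10_000
           |   group = rng.integers(0, 2, n)   # two subgroups
           |   label = rng.integers(0, 2, n)   # balanced ground truth
           |   # Hypothetical model scores: group 1's negatives sit a
           |   # bit closer to the decision boundary.
           |   score = (label * 1.0 + rng.normal(0, 1, n)
           |            + 0.3 * group * (1 - label))
           |   
           |   t = 0.5  # one shared threshold, a pure design choice
           |   pred = score > t
           |   for g in (0, 1):
           |       neg = (group == g) & (label == 0)
           |       print(g, (pred & neg).sum() / neg.sum())
           |   # Different false positive rates per group; changing t
           |   # changes the disparity too.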
        
         | horrified wrote:
         | Horrible story :-( And once again the grievance strategy was
         | successful.
        
           | mrtranscendence wrote:
           | I don't know how folks can be aware of how the exchange went
           | down and say that it was a "successful" "grievance strategy".
           | LeCun wasn't necessarily in the right here, and it wasn't
           | only Gebru's twitter followers going on the offensive.
        
             | 908B64B197 wrote:
             | > LeCun wasn't necessarily in the right here
             | 
             | And yet, he was.
             | 
             | Gebru couldn't know that, because despite all her claims,
             | she's not technical.
        
               | spoonjim wrote:
               | Gebru is not a Woz-level wizard like LeCun but someone
               | who worked at Apple as an engineer and did a PhD with
               | Fei-Fei Li cannot be dismissed as "not technical."
        
             | horrified wrote:
             | Well, LeCun quit Twitter, so it is "one down". That is what
             | I meant by successful. And Gebru's "arguments" weren't even
             | arguments, just "whatever you say is wrong because you are
             | white and don't recognise our special grievances".
             | 
             | I personally agree with his point that there is a
             | difference between a research project and a commercial
             | product. No actual harm was done when the AI completed
             | Obama's image into a white person. You could just laugh
             | about it and move on.
        
               | [deleted]
        
               | frozenport wrote:
               | Not to mention Obama is 50% white.
               | 
               | (picture of his parents)
               | https://static.politico.com/dims4/default/553152c/2147483647...
        
               | andreyk wrote:
               | Not to disagree, but a couple of FYIs:
               | 
               | * LeCun did not really quit Twitter; he's still active
               | there and has been for a while - though I guess he did
               | leave temporarily when all this happened.
               | 
               | * Many researchers agreed with Gebru's opposition to
               | LeCun's original point - see tweets by Charles Isbell,
               | yoavgo, and Dirk Hovy embedded here
               | https://thegradient.pub/pulse-lessons/ under 'On the
               | Source of Bias in Machine Learning Systems' (warning - it
               | takes a while to load). There was a civil back-and-forth
               | between him and these other researchers, as you can see
               | in that post, so it was a point worth discussing. Gebru
               | mostly did not participate in this beyond her initial
               | tweets, as far as I remember.
               | 
               | * LeCun got into more heat when he posted a long set of
               | tweets to Gebru which to many seemed like he was
               | lecturing her on her own subject of expertise, aka
               | 'mansplaining'. I am sure many would see that as
               | nonsense, but afaik many people making that point was
               | what caused him to quit Twitter.
        
               | horrified wrote:
               | Thanks for the further background information. I have to
               | say it doesn't really make it better for me. The "angry
               | people" are of course correct that you can also create
               | bias in ways other than through data sets. But are they
               | implying that people routinely introduce such biases
               | to uphold discrimination? That seems like a very serious
               | and offensive claim to make, and not very helpful either.
               | 
               | The whole way of thinking about these issues is backwards
               | in my opinion. I would think that usually, when you train
               | some algorithm, you tune and experiment until it roughly
               | does what you want it to do. I don't think anybody starts
               | out by saying "let's use the L2 loss function so that
               | everybody comes out white". They'll start with some loss
               | function, and if the results are not as good as they
               | hope, they'll try another one. In fact the usual approach
               | leads back to issues with the data set, because that is
               | what people test and tweak their algorithms with. If the
               | dataset doesn't contain "problematic" cases, they won't
               | be detected.
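               | 
               | (For what it's worth, the textbook mechanism people point
               | to is real enough - a toy sketch, unrelated to any actual
               | system: with a 70/30 mixed population, the constant
               | prediction minimizing L2 loss is the mean, pulled toward
               | the majority; the L1-optimal constant is the median,
               | which lands inside the majority cluster outright. Same
               | data, different loss, different skew.)
               | 
               |   import numpy as np
               |   
               |   rng = np.random.default_rng(0)
               |   # 70% of targets near 1.0, 30% near -1.0 (invented).
               |   targets = np.where(rng.random(100_000) < 0.7,
               |                      rng.normal(1.0, 0.1, 100_000),
               |                      rng.normal(-1.0, 0.1, 100_000))
               |   print(targets.mean())      # ~0.4, L2-optimal constant
               |   print(np.median(targets))  # ~1.0, L1-optimal constant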
               | 
               | But overall, such misclassifications are simply "bugs"
               | that should get a ticket and be fixed, not trigger huge
               | debates. I think it is toxic to try to frame everything
               | as an issue of race.
        
         | IfOnlyYouKnew wrote:
         | > You might recognize one name
         | 
         | Yes, it's terrible when people are subject to all sorts of
         | personal attacks based on snark and innuendo.
        
           | elefanten wrote:
           | This is indecipherable to me. Who are you taking a shot at?
           | Is this comment pro-snark or anti-snark? Sarcastic or
           | straight? Who knows.
        
       ___________________________________________________________________
       (page generated 2021-08-05 23:00 UTC)