[HN Gopher] Sharing learnings about our image cropping algorithm
       ___________________________________________________________________
        
       Sharing learnings about our image cropping algorithm
        
       Author : dsr12
       Score  : 112 points
       Date   : 2021-05-19 18:10 UTC (4 hours ago)
        
 (HTM) web link (blog.twitter.com)
 (TXT) w3m dump (blog.twitter.com)
        
       | gipp wrote:
       | I'm more forgiving about corporate jargon than most. A lot of it
       | really does help optimize communication for the situations you
       | encounter in corporate work.
       | 
       | But "learnings" is literally, exactly, just a synonym for
       | "lessons." Can we not?
        
         | alanbernstein wrote:
         | I disagree, "Sharing lessons..." would mean "here is an
         | educational resource that we have created, as teachers, for an
         | audience of students". I think "Lessons learned..." is closer
         | to what you mean to suggest, and "Learnings..." is more concise
         | (this is from Twitter, after all).
        
           | robotresearcher wrote:
           | Google 'lessons from' and you'll see that one usage is
           | synonymous with 'learnings from'. The suggested completions
           | are perfectly idiomatic.
        
         | bloak wrote:
         | Also the "sharing". Is the article about "sharing"? Perhaps a
         | normal human would have used the title "An evaluation of our
         | image cropping algorithm".
        
           | dylan604 wrote:
           | Maybe there's a bias to sharing since it is a social media
           | platform where sharing is the thing?
        
         | kzrdude wrote:
         | "Can we not?" is a pretty modern colloquialism. Is it better or
         | worse than learnings?
        
         | NovemberWhiskey wrote:
         | It's a neologism, or possibly the resurrection of a long unused
         | form - I don't know exactly how we came about it but I agree
         | completely that one of the meanings of "lesson" is "a thing
         | which has been learned".
         | 
         | In my experience, there's a tendency toward folksiness in
         | certain varieties of corpspeak that causes rejection of
         | "formal-sounding" terms and repurposing of "plainer" forms to
         | create new words, hence lessons = learnings, protege = mentee,
         | and so on.
        
           | jameshart wrote:
           | I don't know why people object to the word 'learnings'. Do
           | you also dislike the word 'teachings'?
        
             | NovemberWhiskey wrote:
             | I don't think the fact that some words can be formed that
             | way is a good guide to whether other formations are words
             | or not.
             | 
             | Do you think "cookings" is a word? It would be "things that
             | have been cooked" I suppose, on that basis. I would say if
             | it's a word it's one I've never heard before.
             | 
             | On the other hand, "cuttings" is clearly a word in common
             | usage in horticulture.
        
               | jameshart wrote:
               | My point is more that it sits in roughly the same
               | semantic space as 'lessons' and 'learnings', yet there
               | seems to be room for nuanced differences of meaning. Yet
               | I don't see people object to 'teachings' on the basis
               | that we already have the word 'lessons'.
        
       | rwmj wrote:
       | Bias aside, the saliency algorithm doesn't work well either. This
       | Twitter feed (SFW) https://twitter.com/punhubonline often shows
       | the punchlines in the preview, spoiling the joke.
        
         | alanbernstein wrote:
         | Asking for an image analysis algorithm to both read and
         | understand jokes might be expecting a bit much...
        
       | Areibman wrote:
       | "We began testing a new way to display standard aspect ratio
       | photos... without the saliency algorithm crop. The goal of this
       | was to give people more control over how their images appear
       | while also improving the experience of people seeing the images
       | in their timeline. After getting positive feedback on this
       | experience, we launched this feature to everyone."
       | 
       | So the solution all along was to give users the ability to crop
       | their own photos. Why wasn't this the original way of doing
       | things?
       | 
       | Instead of forcing a complicated algorithm into the Twitter
       | experience, it seems to me that the solution all along was just
       | to let users do what they do best-- make tweets for themselves.
       | This incident strikes me as a major failing of AI: We are so
       | eager to shoehorn AI/ML into our products that we lose sight of
       | what actually makes users happy.
        
         | ggggtez wrote:
         | > Why wasn't this the original way of doing things?
         | 
         | Someone wanted to do a feature so they could get promoted.
         | Probably with some mumbo jumbo about how it reduces the number
         | of clicks to create a tweet and thus increases revenue.
        
         | cblconfederate wrote:
         | Next they'll tell us that chronological mode is better! (it is
         | for me in any case)
        
       | grenoire wrote:
       | Imagine the world where images aren't algorithmically cropped.
       | It's easy if you try. No AI below us, above us only thumbnails.
        
         | goldenkey wrote:
         | Something something, John Lennon, crop out Ringo.
        
       | [deleted]
        
       | codeulike wrote:
       | Some context: They don't mention it directly but I think this
       | refers back to this thread last September
       | 
       | https://twitter.com/colinmadland/status/1307111816250748933
       | 
       | (Note the thread displays differently now because Twitter have
       | changed their cropping algorithm)
       | 
       | Originally @colinmadland was trying to post examples of how Zoom's
       | virtual background had removed his black colleague's head. However,
       | when he posted the side-by-side images (with heads) on Twitter,
       | Twitter always cropped out his colleague and just showed him,
       | even if he horizontally swapped the image. So, while trying to
       | talk about an apparently racist algorithm in Zoom, he was
       | scuppered by an apparently racist algorithm in Twitter.
       | 
       | It was widely covered in the press at the time
       | https://www.theguardian.com/technology/2020/sep/21/twitter-a...
        
         | SiempreViernes wrote:
         | Here's an example that still works:
         | https://twitter.com/bascule/status/1307440596668182528?s=20
        
       | boulos wrote:
       | /rant but I feel like talking about percentage points of
       | difference is always hard for humans. For example:
       | 
       | > In comparisons of men and women, there was an 8% difference
       | from demographic parity in favor of women.
       | 
       | would have been clearer (and more correct) as "an 8 percentage-
       | point difference from demographic parity". That 8 pp difference
       | though is a 16% "relative" difference (58/50), or more starkly
       | "The algorithm chose the woman almost 40% more often" (58/42 =>
       | 1.38). That said, the diagram in the post [1] is much easier for
       | humans to parse and say "wow, that looks pretty far off!".
       | 
       | tl;dr: A number like 8% sounds like "no big deal", but 8
       | percentage points (on each side) is a big deal!
       | 
       | [1] https://cdn.cms-twdigitalassets.com/content/dam/blog-
       | twitter...
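       | 
       | A small sketch of the arithmetic above, assuming a two-way split
       | where one group is chosen 58% of the time. The function name
       | parity_gap is hypothetical and only illustrates the three ways
       | of stating the same gap:

```python
def parity_gap(share_a):
    """Describe a two-way split three ways, given group A's share (0..1)."""
    share_b = 1.0 - share_a
    # Absolute gap from a 50/50 split, in percentage points.
    points = (share_a - 0.5) * 100
    # Relative gap: how far A's share sits above the 50% parity share.
    relative = (share_a / 0.5 - 1.0) * 100
    # How many times as often A is chosen compared with B.
    ratio = share_a / share_b
    return points, relative, ratio


points, relative, ratio = parity_gap(0.58)
print(f"{points:.0f} pp from parity, {relative:.0f}% relative, ratio {ratio:.2f}")
```

       | For a 58/42 split this gives the 8 percentage points, the 16%
       | relative difference and the roughly 1.38 ratio quoted above.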
        
       | cratermoon wrote:
       | News flash: ML sucks. Most of the time.
        
       | LinAGKar wrote:
       | How about letting people disable this cropping crap altogether
        
         | tjkrusinski wrote:
         | People abuse tall aspect ratio images to take up more space in
         | the UI to get more attention on their tweets.
        
           | refactor_master wrote:
           | Just letterbox them.
        
       | cmckn wrote:
       | So, I can choose to see only un-cropped images on my TL, and the
       | author can see a preview of the algorithm's crop before they
       | tweet -- but a glaring omission is simply exposing a crop tool to
       | the author. The model works by choosing a point on which to
       | center the crop. Why can't you give users a UI to do the same?
       | "Tap a focal point in the image, or let our robot decide!"
       | 
       | The blog post mentions several times how ML might not be the
       | right choice for cropping; but their conclusion was...to keep
       | using ML for cropping. I hope someone got a nice bonus for
       | building the model!
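       | 
       | A rough sketch of the "tap a focal point" idea, assuming pixel
       | coordinates and a fixed preview size. The function crop_box and
       | its parameters are hypothetical, not anything from Twitter; the
       | focal point could equally come from a user's tap or a model:

```python
def crop_box(width, height, focal_x, focal_y, crop_w, crop_h):
    """Return (left, top, right, bottom) for a crop_w x crop_h window.

    The window is centred on the focal point, then shifted as needed so
    that it stays inside a width x height image.
    """
    # The window can never be larger than the image itself.
    crop_w = min(crop_w, width)
    crop_h = min(crop_h, height)
    # Centre the window on the focal point.
    left = focal_x - crop_w // 2
    top = focal_y - crop_h // 2
    # Clamp so the window does not run past any edge.
    left = max(0, min(left, width - crop_w))
    top = max(0, min(top, height - crop_h))
    return left, top, left + crop_w, top + crop_h


# A focal point near the top-right corner gets a window pushed inward.
print(crop_box(1000, 800, 950, 50, 400, 200))
```

       | The same function serves both paths: only the source of
       | (focal_x, focal_y) changes.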
        
         | [deleted]
        
         | jawns wrote:
         | I can't really see any down side, besides maybe a little bit of
         | developer time, to allowing users to see a preview of the crop
         | and optionally override. It's done all the time in other
         | places.
        
           | mattacular wrote:
           | It's probably a bit harder at Twitter's unique scale. They
           | have an incredibly high throughput of new posts and a large
           | portion of these posts include between 1 and 4 images that need
           | cropping.
        
             | kixiQu wrote:
             | I dunno, that just means it's a pain for the _user_, not
             | for Twitter
        
             | bavell wrote:
             | Isn't this just a UI change and wiring that up to the
             | backend? The cropping happens either way, it might actually
             | be faster considering that if the user provides the crop
             | info, Twitter doesn't have to burn CPU cycles on figuring
             | out the "optimal" crop dimensions.
        
             | Spivak wrote:
             | /s? I mean we're talking about a client-side feature here.
             | If phpbb forums can do per-image cropping I think Twitter
             | can manage.
        
         | nightpool wrote:
         | > but their conclusion was...to keep using ML for cropping
         | 
         | My takeaway from the article was that their conclusion was to
         | remove cropping from the product, starting incrementally on
         | iOS. (I got cropping removed on Android as well recently). That
         | seems like the opposite of "keep using ML for cropping"?
        
       | [deleted]
        
       | jedberg wrote:
       | Image cropping algorithms are hard. When we made our first one
       | for reddit, it used this algorithm:
       | 
       | Find the larger dimension of the image. Remove either the first
       | or last row/column of pixels, based on which had less entropy.
       | Keep repeating until the image was a square.
       | 
       | The most notable "bias" of this algorithm was the male gaze
       | problem identified in the article. Women's breasts tended to have
       | more entropy than their face, so the algorithm focused on that
       | since it was optimized for entropy. To solve the problem, we
       | added software that allowed the user to choose their thumbnail,
       | but not a lot of users used it or even realized they could.
       | 
       | I assume they've since upgraded it to use more AI with actual
       | face detection and so on, but at the time, doing face detection
       | on every image was computationally infeasible.
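       | 
       | A minimal sketch of the procedure described above, assuming a
       | grayscale image stored as a list of rows and Shannon entropy of
       | the pixel-value histogram as the measure. The names entropy and
       | crop_to_square are made up for illustration, and this is not
       | reddit's actual code:

```python
import math
from collections import Counter


def entropy(pixels):
    """Shannon entropy (in bits) of the histogram of pixel values."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def crop_to_square(image):
    """Trim the lower-entropy edge of the longer dimension until square.

    `image` is a non-empty list of equal-length rows of grayscale values.
    Ties are resolved by dropping the last row or column.
    """
    rows = [list(r) for r in image]
    while len(rows) != len(rows[0]):
        if len(rows) > len(rows[0]):
            # Taller than wide: compare the top and bottom rows.
            if entropy(rows[0]) < entropy(rows[-1]):
                rows.pop(0)
            else:
                rows.pop()
        else:
            # Wider than tall: compare the first and last columns.
            first = [r[0] for r in rows]
            last = [r[-1] for r in rows]
            if entropy(first) < entropy(last):
                rows = [r[1:] for r in rows]
            else:
                rows = [r[:-1] for r in rows]
    return rows


# The flat top and bottom rows are trimmed; the varied middle survives.
print(crop_to_square([[0, 0], [1, 2], [3, 4], [5, 5]]))
```

       | Nothing in the loop knows what the pixels depict, which is why
       | it keeps whichever region happens to be "busiest".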
        
         | amelius wrote:
         | > Women's breasts tended to have more entropy than their face
         | 
         | Aha, perhaps that's the problem then.
        
         | mvzvm wrote:
         | Reddit's image cropping algorithm is hilariously bad. As is
         | their video player, and their ads, and their ranking, and their
         | messaging tools...
        
         | cblconfederate wrote:
         | Suffice to say the Twitter algorithm fails badly with NSFW
         | images (where often the focus is ... not face)
        
         | erichahn wrote:
         | How is entropy related to "male gaze"? This approach seems to
         | be unsupervised, I don't see the problem.
        
           | colllectorof wrote:
           | I see that you didn't get the memo. If an algorithm or a
           | mathematical definition is made by a white man, it's _by
           | definition_ racist and sexist _because_ it was made by a
           | white man. Entropy was defined by Claude Shannon who was a
           | white man, so entropy is racist and sexist, because it
           | absorbed all of Shannon's biases.
        
             | oceanghost wrote:
             | I honestly can't tell if you're joking or not.
        
               | bigfudge wrote:
               | Sadly he's too angry to make the joke funny. As always
               | there probably is a grain of truth to the idea that
               | postmodern feminist critiques occasionally disappear up
               | their own fundamental, and so there's a joke to be made
               | here. But this isn't it!
        
           | opsy2 wrote:
           | How is entropy defined in this context?
           | 
           | Clearly there is human-derived input in the system
           | (otherwise... What's the point just crop randomly)
        
             | TheGallopedHigh wrote:
             | Randomness in pixel values
        
             | jedberg wrote:
             | Here is the code:
             | 
             | https://github.com/reddit-
             | archive/reddit/blob/753b17407e9a9d...
             | 
             | But in short, it's a histogram of the values of the pixels.
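             | 
             | Assuming "entropy" here means the Shannon entropy of that
             | histogram (an assumption, since the link above is truncated),
             | with h_i the count of pixels in bin i, it would be:

```latex
p_i = \frac{h_i}{\sum_j h_j},
\qquad
H = -\sum_{i \,:\, p_i > 0} p_i \log_2 p_i
```

             | H is 0 for a flat, single-valued region and largest when the
             | pixel values are spread evenly across the bins.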
        
               | bavell wrote:
               | "Entropy" in this context also left me wondering. Perhaps
               | "variance" or "deviation from the mean"?
               | 
               | Thanks for the insights!
        
           | SiempreViernes wrote:
           | I don't think the claim is that the behaviour is _caused_ by
           | "male gaze", but rather that the _outcome_ of always focusing
           | the cropping around any visible cleavage is functionally
           | identical.
        
           | skavi wrote:
           | Whether or not it's unsupervised, whether or not it's sexist,
           | it seems that a thumbnail focusing on a person's face rather
           | than their breasts is typically going to be more desirable.
           | Depending on context, of course.
        
         | fuzzythinker wrote:
         | Breasts shouldn't have more entropy than face. Perhaps the
         | reason is due to the breasts being in the middle of the
         | picture, so the face gets compared to bottom rows more
         | frequently?
        
       | [deleted]
        
       | Nick87633 wrote:
       | How about not calling this the male gaze? If we're trying to
       | remove bias FFS.
        
         | krapp wrote:
         | The article says the algorithm was trained on eye tracking
         | data, and if the predominance of data came from men then
         | referring to the male gaze as a possible source of unintended
         | bias seems valid.
        
       | natpat wrote:
       | > One of our conclusions is that not everything on Twitter is a
       | good candidate for an algorithm, and in this case, how to crop an
       | image is a decision best made by people.
       | 
       | This seems like it should have been a foregone conclusion. What
       | was the driving force in the first place to think cropping images
       | with an AI model was desirable? Seems like ML was a solution
       | looking for a problem here, and I'm glad they've realised that.
        
         | [deleted]
        
         | klodolph wrote:
         | It seems obvious in retrospect. Calling it a foregone
         | conclusion is too harsh.
         | 
         | Twitter crops photos to fit their preview formats. It seems
         | like an obvious improvement to show people's faces when
         | cropping, etc.
        
           | the_gastropod wrote:
           | Right but... we've been cropping images in web applications
           | since... y'know, pretty much ever. Using ML to do this was
           | always pretty ridiculous overkill. Give the users an image
           | cropper, and be done with it.
        
             | klodolph wrote:
             | I can't see why this is overkill. You're eliminating a step
             | from the image posting process, and making it so users
             | don't have to crop an image twice (once for the full image,
             | and a second time for the preview). That makes sense when
             | you're writing a CMS or blogging platform like Wordpress,
             | but for Twitter it adds some friction.
             | 
             | So, previously, the preview was just cropped in the center.
             | But this made some images look funny, since people's faces
             | would get chopped off.
             | 
             | Coming up with a _workable_ solution to this with ML is not
             | especially hard. You can get things like face detection off
             | the shelf, maybe just tell your autocropper, "crop closer
             | to the face" and have a demo within a couple days (and then
             | much more effort to productionize it). From there, you can
             | start introducing ML models to improve on your basic face
             | detection. (I'm not counting face detection as ML.)
             | 
             | This is not a case where some massive ML model is being
             | brought in to save two seconds of your time. This is a very
             | natural and obvious application of ML, at a company which
             | already does ML at scale, in a way that sounds like it has
             | a good chance at improving the appearance of the site
             | without introducing additional friction.
             | 
             | Instagram gets around this by encouraging everyone to take
             | square photos.
        
       | amelius wrote:
       | I fear this is still horribly incomplete. E.g. if a picture shows
       | brand A next to brand B, which brand will be cropped?
        
         | e-clinton wrote:
         | Didn't sound like the model had that info, so I'd imagine it
         | would select one based on physical attributes of the brand
         | logo, not the "brand" itself.
        
       ___________________________________________________________________
       (page generated 2021-05-19 23:00 UTC)