[HN Gopher] Sharing learnings about our image cropping algorithm ___________________________________________________________________ Sharing learnings about our image cropping algorithm Author : dsr12 Score : 112 points Date : 2021-05-19 18:10 UTC (4 hours ago) (HTM) web link (blog.twitter.com) (TXT) w3m dump (blog.twitter.com) | gipp wrote: | I'm more forgiving about corporate jargon than most. A lot of it | really does help optimize communication for the situations you | encounter in corporate work. | | But "learnings" is literally, exactly, just a synonym for | "lessons." Can we not? | alanbernstein wrote: | I disagree, "Sharing lessons..." would mean "here is an | educational resource that we have created, as teachers, for an | audience of students". I think "Lessons learned..." is closer | to what you mean to suggest, and "Learnings..." is more concise | (this is from Twitter, after all). | robotresearcher wrote: | Google 'lessons from' and you'll see that one usage is | synonymous with 'learnings from'. The suggested completions | are perfectly idiomatic. | bloak wrote: | Also the "sharing". Is the article about "sharing"? Perhaps a | normal human would have used the title "An evaluation of our | image cropping algorithm". | dylan604 wrote: | Maybe there's a bias to sharing since it is a social media | platform where sharing is the thing? | kzrdude wrote: | "Can we not?" is a pretty modern colloquialism. Is it better or | worse than learnings? | NovemberWhiskey wrote: | It's a neologism, or possibly the resurrection of a long unused | form - I don't know exactly how it came about - but I agree | completely that one of the meanings of "lesson" is "a thing | which has been learned". | | In my experience, there's a tendency toward folksiness in | certain varieties of corpspeak that causes rejection of | "formal-sounding" terms and repurposing of "plainer" forms to | create new words, hence lessons = learnings, protege = mentee, | and so on.
| jameshart wrote: | I don't know why people object to the word 'learnings'. Do | you also dislike the word 'teachings'? | NovemberWhiskey wrote: | I don't think the fact that some words can be formed that | way is a good guide to whether other formations are words | or not. | | Do you think "cookings" is a word? It would be "things that | have been cooked" I suppose, on that basis. I would say if | it's a word it's one I've never heard before. | | On the other hand, "cuttings" is clearly a word in common | usage in horticulture. | jameshart wrote: | My point is more that 'teachings' sits in roughly the same | semantic space as 'lessons' and 'learnings', yet there | seems to be room for nuanced differences of meaning. Yet | I don't see people object to 'teachings' on the basis | that we already have the word 'lessons'. | rwmj wrote: | Bias aside, the saliency algorithm doesn't work well either. This | Twitter feed (SFW) https://twitter.com/punhubonline often shows | the punchlines in the preview, spoiling the joke. | alanbernstein wrote: | Asking for an image analysis algorithm to both read and | understand jokes might be expecting a bit much... | Areibman wrote: | "We began testing a new way to display standard aspect ratio | photos... without the saliency algorithm crop. The goal of this | was to give people more control over how their images appear | while also improving the experience of people seeing the images | in their timeline. After getting positive feedback on this | experience, we launched this feature to everyone." | | So the solution all along was to give users the ability to crop | their own photos. Why wasn't this the original way of doing | things? | | Instead of forcing a complicated algorithm into the Twitter | experience, it seems to me that the solution all along was just | to let users do what they do best-- make tweets for themselves.
| This incident strikes me as a major failing of AI: We are so | eager to shoehorn AI/ML into our products that we lose sight of | what actually makes users happy. | ggggtez wrote: | > Why wasn't this the original way of doing things? | | Someone wanted to do a feature so they could get promoted. | Probably with some mumbo jumbo about how it reduces the number | of clicks to create a tweet and thus increases revenue. | cblconfederate wrote: | Next they'll tell us that chronological mode is better! (it is | for me in any case) | grenoire wrote: | Imagine the world where images aren't algorithmically cropped. | It's easy if you try. No AI below us, above us only thumbnails. | goldenkey wrote: | Something something, John Lennon, crop out Ringo. | [deleted] | codeulike wrote: | Some context: They don't mention it directly but I think this | refers back to this thread last September | | https://twitter.com/colinmadland/status/1307111816250748933 | | (Note the thread displays differently now because Twitter have | changed their cropping algorithm) | | Originally @colinmadland was trying to post examples of how Zoom's | virtual background had removed his black colleague's head, however | when he posted the side-by-side images (with heads) on Twitter, | Twitter always cropped out his colleague and just showed him, | even if he horizontally swapped the image. So, while trying to | talk about an apparently racist algorithm in Zoom, he was | scuppered by an apparently racist algorithm in Twitter. | | It was widely covered in the press at the time | https://www.theguardian.com/technology/2020/sep/21/twitter-a... | SiempreViernes wrote: | Here's an example that still works: | https://twitter.com/bascule/status/1307440596668182528?s=20 | boulos wrote: | /rant but I feel like talking about percentage points of | difference is always hard for humans. For example: | | > In comparisons of men and women, there was an 8% difference | from demographic parity in favor of women.
| | would have been clearer (and more correct) as "an 8 percentage-point difference from demographic parity". That 8 pp difference | though is a 16% "relative" difference (58/50), or more starkly | "The algorithm chose the woman almost 40% more often" (58/42 => | 1.38). That said, the diagram in the post [1] is much easier for | humans to parse and say "wow, that looks pretty far off!". | | tl;dr: A number like 8% sounds like "no big deal", but 8 | percentage points (on each side) is a big deal! | | [1] https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter... | cratermoon wrote: | News flash: ML sucks. Most of the time. | LinAGKar wrote: | How about letting people disable this cropping crap altogether | tjkrusinski wrote: | People abuse tall aspect ratio images to take up more space in | the UI to get more attention on their tweets. | refactor_master wrote: | Just letterbox them. | cmckn wrote: | So, I can choose to see only un-cropped images on my TL, and the | author can see a preview of the algorithm's crop before they | tweet -- but a glaring omission is simply exposing a crop tool to | the author. The model works by choosing a point on which to | center the crop. Why can't you give users a UI to do the same? | "Tap a focal point in the image, or let our robot decide!" | | The blog post mentions several times how ML might not be the | right choice for cropping; but their conclusion was...to keep | using ML for cropping. I hope someone got a nice bonus for | building the model! | [deleted] | jawns wrote: | I can't really see any downside, besides maybe a little bit of | developer time, to allowing users to see a preview of the crop | and optionally override. It's done all the time in other | places. | mattacular wrote: | It's probably a bit harder at Twitter's unique scale. They | have an incredibly high throughput of new posts and a large | portion of these posts include 1-4 images that need | cropping.
| kixiQu wrote: | I dunno, that just means it's a pain for the _user_, not | for Twitter | bavell wrote: | Isn't this just a UI change and wiring that up to the | backend? The cropping happens either way, it might actually | be faster considering that if the user provides the crop | info, Twitter doesn't have to burn CPU cycles on figuring | out the "optimal" crop dimensions. | Spivak wrote: | /s? I mean we're talking about a client-side feature here. | If phpBB forums can do per-image cropping I think Twitter | can manage. | nightpool wrote: | > but their conclusion was...to keep using ML for cropping | | My takeaway from the article was that their conclusion was to | remove cropping from the product, starting incrementally on | iOS. (I got cropping removed on Android as well recently). That | seems like the opposite of "keep using ML for cropping"? | [deleted] | jedberg wrote: | Image cropping algorithms are hard. When we made our first one | for reddit, it used this algorithm: | | Find the larger dimension of the image. Remove either the first | or last row/column of pixels, based on which had less entropy. | Keep repeating until the image was a square. | | The most notable "bias" of this algorithm was the male gaze | problem identified in the article. Women's breasts tended to have | more entropy than their face, so the algorithm focused on that | since it was optimized for entropy. To solve the problem, we | added software that allowed the user to choose their thumbnail, | but not a lot of users used it or even realized they could. | | I assume they've since upgraded it to use more AI with actual | face detection and so on, but at the time, doing face detection | on every image was computationally infeasible. | amelius wrote: | > Women's breasts tended to have more entropy than their face | | Aha, perhaps that's the problem then. | mvzvm wrote: | Reddit's image cropping algorithm is hilariously bad.
As is | their video player, and their ads, and their ranking, and their | messaging tools... | cblconfederate wrote: | Suffice to say the Twitter algorithm fails badly with NSFW | images (where often the focus is ... not face) | erichahn wrote: | How is entropy related to "male gaze"? This approach seems to | be unsupervised, I don't see the problem. | colllectorof wrote: | I see that you didn't get the memo. If an algorithm or a | mathematical definition is made by a white man, it's _by | definition_ racist and sexist _because_ it was made by a | white man. Entropy was defined by Claude Shannon who was a | white man, so entropy is racist and sexist, because it | absorbed all of Shannon's biases. | oceanghost wrote: | I honestly can't tell if you're joking or not. | bigfudge wrote: | Sadly he's too angry to make the joke funny. As always | there probably is a grain of truth to the idea that | postmodern feminist critiques occasionally disappear up | their own fundament, and so there's a joke to be made | here. But this isn't it! | opsy2 wrote: | How is entropy defined in this context? | | Clearly there is human-derived input in the system | (otherwise... what's the point, just crop randomly?) | TheGallopedHigh wrote: | Randomness in pixel values | jedberg wrote: | Here is the code: | | https://github.com/reddit-archive/reddit/blob/753b17407e9a9d... | | But in short, it's a histogram of the values of the pixels. | bavell wrote: | "Entropy" in this context also left me wondering. Perhaps | "variance" or "deviation from the mean"? | | Thanks for the insights! | SiempreViernes wrote: | I don't think the claim is that the behaviour is _caused_ by | "male gaze", but rather that the _outcome_ of always focusing | the cropping around any visible cleavage is functionally | identical.
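The entropy-based squaring crop that jedberg describes can be sketched in a few lines of NumPy. This is a reconstruction of the idea only, not reddit's actual code (the linked repo is the authoritative version, and it trims in larger slices for speed); the function names here are mine.

```python
import numpy as np

def entropy(arr):
    """Shannon entropy of the pixel-value histogram
    ("it's a histogram of the values of the pixels")."""
    hist, _ = np.histogram(arr, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # 0 * log(0) is defined as 0
    return -(p * np.log2(p)).sum()

def square_crop(img):
    """Repeatedly drop the lower-entropy edge row/column along the
    longer dimension until the (H, W) grayscale image is square."""
    img = np.asarray(img)
    while img.shape[0] != img.shape[1]:
        if img.shape[0] > img.shape[1]:            # too tall: drop a row
            if entropy(img[0]) < entropy(img[-1]):
                img = img[1:]
            else:
                img = img[:-1]
        else:                                      # too wide: drop a column
            if entropy(img[:, 0]) < entropy(img[:, -1]):
                img = img[:, 1:]
            else:
                img = img[:, :-1]
    return img
```

Nothing here is supervised or face-aware: the crop simply converges on whichever region has the busiest histograms, which is exactly how the "male gaze" outcome discussed above can emerge from an unsupervised heuristic.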
| skavi wrote: | Whether or not it's unsupervised, whether or not it's sexist, | it seems that a thumbnail focusing on a person's face rather | than their breasts is typically going to be more desirable. | Depending on context, of course. | fuzzythinker wrote: | Breasts shouldn't have more entropy than the face. Perhaps the | reason is due to the breasts being in the middle of the | picture, so the face gets compared to bottom rows more | frequently? | [deleted] | Nick87633 wrote: | How about not calling this the male gaze? If we're trying to | remove bias FFS. | krapp wrote: | The article says the algorithm was trained on eye tracking | data, and if the predominance of data came from men then | referring to the male gaze as a possible source of unintended | bias seems valid. | natpat wrote: | > One of our conclusions is that not everything on Twitter is a | good candidate for an algorithm, and in this case, how to crop an | image is a decision best made by people. | | This seems like it should have been a foregone conclusion. What | was the driving force in the first place to think cropping images | with an AI model was desirable? Seems like ML was a solution | looking for a problem here, and I'm glad they've realised that. | [deleted] | klodolph wrote: | It seems obvious in retrospect. Calling it a foregone | conclusion is too harsh. | | Twitter crops photos to fit their preview formats. It seems | like an obvious improvement to show people's faces when | cropping, etc. | the_gastropod wrote: | Right but... we've been cropping images in web applications | since... y'know, pretty much ever. Using ML to do this was | always pretty ridiculous overkill. Give the users an image | cropper, and be done with it. | klodolph wrote: | I can't see why this is overkill. You're eliminating a step | from the image posting process, and making it so users | don't have to crop an image twice (once for the full image, | and a second time for the preview).
Manual cropping makes sense when | you're writing a CMS or blogging platform like WordPress, | but for Twitter it adds some friction. | | So, previously, the preview was just cropped in the center. | But this made some images look funny, since people's faces | would get chopped off. | | Coming up with a _workable_ solution to this with ML is not | especially hard. You can get things like face detection off | the shelf, maybe just tell your autocropper, "crop closer | to the face" and have a demo within a couple of days (and then | much more effort to productionize it). From there, you can | start introducing ML models to improve on your basic face | detection. (I'm not counting face detection as ML.) | | This is not a case where some massive ML model is being | brought in to save two seconds of your time. This is a very | natural and obvious application of ML, at a company which | already does ML at scale, in a way that sounds like it has | a good chance at improving the appearance of the site | without introducing additional friction. | | Instagram gets around this by encouraging everyone to take | square photos. | amelius wrote: | I fear this is still horribly incomplete. E.g. if a picture shows | brand A next to brand B, which brand will be cropped? | e-clinton wrote: | Didn't sound like the model had that info, so I'd imagine it | would select one based on physical attributes of the brand | logo, not the "brand" itself. ___________________________________________________________________ (page generated 2021-05-19 23:00 UTC)