[HN Gopher] Twitter showed us its algorithm - what does it tell us?
___________________________________________________________________
Twitter showed us its algorithm - what does it tell us?
Author : randomwalker
Score : 83 points
Date : 2023-04-11 00:39 UTC (22 hours ago)
(HTM) web link (knightcolumbia.org)
(TXT) w3m dump (knightcolumbia.org)
| mountainofdeath wrote:
| Given what has been open sourced so far, it makes sense that
| content that is likely to be controversial, or content that
| generates neutral to negative engagement would have a smaller
| probability of being displayed.
|
| I suspected this and told my far-right/left wing acquaintances,
| that no, Twitter (and Facebook too) isn't suppressing you, your
| content is just a net-negative from the platform's perspective.
| The platform is in the business of keeping the bulk of its users
| and advertisers happy.
| akira2501 wrote:
| When people talk about suppression do they mean that their own
| tweets are suppressed, or the tweets of people they follow are?
| Or tweets from news organizations are, depending on their
| content?
| whimsicalism wrote:
| The evidence is abundant for active suppression of certain
| political views beyond just predicted engagement. No clue about
| Twitter's algo, but Facebook certainly does this.
| dekhn wrote:
| Can you, uh, share that abundant evidence?
| whimsicalism wrote:
| Clear-cut example #1, which you might consider "flipping
| switches" more than part of the algo, is the suppression of
| posting about Hunter Biden/laptop on Twitter.
|
| Instagram also downweights posts about Biden passing 1994
| crime bill. [0]
|
| [0]:
| https://twitter.com/perma___ben/status/1339293381625864195
| dekhn wrote:
| Do you have something substantive? I.e., evidence that
| there is a systemic, large-scale policy being applied?
| Anyway, it looks like they outsource their fact checking
| and applied a fact check.
| The fact check is extensively documented here:
| https://www.usatoday.com/story/news/factcheck/2020/07/03/fac...
|
| I'm a bit tired of the "aggrieved people claiming
| censorship" over small potatoes that often turns out to
| be "the company applied their policy uniformly"
| paulpauper wrote:
| Tweets with hashtags and links do worse.
|
| It would seem anti-woke tweets do very well. I see such tweets a
| lot when logged out.
|
| Replies from unverified accounts are automatically collapsed.
| Only twitter blue accounts get to post replies in comments and
| not have those comments be collapsed.
| They can also post replies to their own tweets without the
| replies being collapsed.
| simplotek wrote:
| I was disappointed by this article and how it omitted the fact
| that Twitter hardcoded how references to Russia's invasion of
| Ukraine should be downranked.
|
| https://news.ycombinator.com/item?id=35410841
| theteapot wrote:
| That HN article is "[flagged]". Reading the comments I think
| it's because there is no analysis of what "UkraineCrisisTopic"
| actually means or does? Author seems to just grep code base for
| "Ukraine" then draw what conclusion fits narrative.
| ROTMetro wrote:
| Versus the push that we should draw the conclusion that it's
| a nothing burger? It definitely highlights that Twitter is
| happy to categorize and treat Ukrainian content differently
| which is insightful to know.
| matthew9219 wrote:
| [dead]
| paulpauper wrote:
| If I had to guess it's because it gets poor engagement and
| would crowd out other topics due to the popularity of the
| topic.
| simplotek wrote:
| If that was remotely true then why wouldn't the topic be
| handled at the training level? Instead, Twitter is censoring
| references to Russia's invasion of Ukraine through the same
| mechanism used to kill DMCA violations. This is not an
| approach motivated by "engagement".
| xwdv wrote:
| Training is a waste of time when you know exactly what you
| want to block.
| hathawsh wrote:
| HN did the same thing with Bitcoin when it was soaring in
| value. I remember the front page felt like it was 90%
| Bitcoin. I was grateful that HN added an exception,
| bringing the discussion back into balance.
| GenerocUsername wrote:
| I don't know when this dataset was published from, but there
| was def a time when Ukraine news was beyond saturation.
| madeofpalk wrote:
| It's the sort of thing I would expect on a highly opinionated
| Hacker News (iirc like how posts about Apple have a penalty
| applied to them to counteract massive usual interest in
| them), but less so on something more general audience like
| Twitter.
|
| I'm not really looking to twitter to say "Actually, we've all
| heard a bit too much about Tennis". I don't want Twitter's
| timeline to have an editorial voice.
| luckylion wrote:
| If I recall correctly, Twitter has explicitly said before
| that they do rank topics very differently because otherwise
| Justin Bieber would be trending all day, every day.
| whimsicalism wrote:
| All social media algo timelines have an editorial voice.
| Otherwise, you would exclusively be seeing engagement bait.
| simplotek wrote:
| > All social media algo timelines have an editorial
| voice.
|
| Sorry, that's a bullshit excuse. Russia's invasion of
| Ukraine was hardcoded to be downranked like DMCA
| violations and high toxicity content.
|
| This has zero to do with "editorial voice" or other
| bullshit excuse. This was a blatant attempt to smother
| any reference to what Russia is doing to Ukraine.
| whimsicalism wrote:
| I was simply responding to the comment that was made.
| phailhaus wrote:
| Because if he's going to claim to be the internet's town
| square, he can't pick and choose topics that he's personally
| 'tired of'.
| [deleted]
| btilly wrote:
| Anyone who is interested should also read
| https://twitter.com/aakashg0/status/1641976869460275201 for a
| rather different take.
|
| It particularly interested me that Twitter under Musk is trying
| NOT to discuss Ukraine, and PENALIZES people who attempt to
| interact with those outside of their general political circle. I
| can give arguments for why they should do both, but I think both
| are ultimately bad ideas.
| simplotek wrote:
| > It particularly interested me that Twitter under Musk is
| trying NOT to discuss Ukraine, and PENALIZES people who attempt
| to interact with those outside of their general political
| circle.
|
| While Musk's Twitter explicitly censors references to Russia's
| genocide of Ukraine, Musk himself feigns ignorance and false
| indignation accusing the "western press" of insisting "on
| pushing such a lopsided view of the conflict".
|
| https://twitter.com/VsimPohuy/status/1645699649003569152?t=v...
| hathawsh wrote:
| From your link (thanks!):
|
| > 9. Making up words or misspelling hurts
|
| > Words that are identified as "unknown language" are given
| 0.01, which is a huge penalty.
|
| Does that mean if I tweet about coding and use identifiers like
| "setUserName", which is not an English word, the tweet gets a
| huge penalty? If so, that's disappointing.
| strken wrote:
| That jumped out at me as a possible misreading of the code.
| Is it detecting the language of the whole tweet, or just a
| word as the author claims?
|
| Demoting a tweet that's entirely unidentifiable as any human
| language seems fair enough.
| giraffe_lady wrote:
| Man if someone asked me to build a system to merely
| identify whether a unicode string is human language or not
| I would flatly refuse.
| There are thousands of spoken
| languages, many of them with no standard written form, some
| that are transcribed into multiple different writing
| systems, some with no writing tradition at all and with
| only ad-hoc transliteration unique to each user and use.
|
| Even being 90% confident would be a massive undertaking,
| and "speakers of this language may/may not use the
| internet" feels like high stakes for getting it wrong.
|
| It seems a little niche but I'm sure a few times a year
| some far out town gets connected and suddenly there are
| speakers of a previously unknown-to-the-internet language
| newly online.
| iudqnolq wrote:
| Note that the metric here is "is the tweet in one of the
| languages spoken by the user". This hypothetically allows
| more nuanced implementations than you contemplate.
|
| For example, they could have a language "unrecognized"
| and assume everyone speaks it.
|
| I broadly find this useful: I see tweets in other
| languages when they're retweeted by people I follow, and
| about half the time I machine translate them. But I don't
| want my whole feed to be that.
| thrashh wrote:
| Well if someone asked me to do that, I would suggest that
| it'd be based off their recent tweet history and not just
| one tweet. And I would make my case in the meeting.
|
| Second, it's already been done so my next suggestion
| would be to look at what all the computational linguistics
| majors have been up to.
| giraffe_lady wrote:
| Yeah that struck me too. I can see the reasons why you'd want
| it but the collateral damage on that must be huge.
|
| For example do they check for the common but nonstandard
| transliteration systems Arabic speakers use? There have to be
| similar systems in other languages that don't use the Roman
| or Cyrillic alphabets too right?
|
| Or for that matter what about languages twitter simply isn't
| aware of?
| There are thousands with native speakers after all,
| does this make it basically impossible for them to
| organically use twitter together?
| Mezzie wrote:
| Also RIP the Conlang community on Twitter...
| stingraycharles wrote:
| The actual code comment doesn't mention "words" but rather if
| the "tweet language" isn't in one of the user's
| "understandable" languages. As such, I assume your example is
| perfectly fine (would be extremely surprising if it wasn't).
|
| Whether "user" means the reader or the author, I don't know;
| I assume the reader, as that would make the most sense.
|
| https://twitter.com/aakashg0/status/1641976943141699584?s=61...
| simonw wrote:
| I was not at all impressed with the analysis in that thread. It
| makes a bunch of assumptions that don't feel very thorough to
| me, but announces them as if they are unimpeachable facts.
|
| Biggest example is this one:
|
| "9. Making up words or misspelling hurts - Words that are
| identified as "unknown language" are given 0.01, which is a
| huge penalty."
|
| The code in the screenshot for that looks like this:
| // Boost (demotion) if the tweet language is not one of user's
| // understandable languages, nor interface language.
| optional double unknownLanguageBoost = 0.01
|
| That doesn't match the description of "Making up words or
| misspelling hurts" at all!
| LegitShady wrote:
| input youtube thumbnail of cat in the hat enraged "DR SEUSS
| CANCELLED?! TWITTER WON'T COMMENT!" ragebait youtuber.
| treis wrote:
| The systemic racism people are going to go wild about this
| randomwalker wrote:
| OP here. Unfortunately this thread is mostly misinformation.
| There were a bunch of viral threads from the growth hacker /
| influencer crowd, including this one, within hours of the code
| release with a very superficial understanding of the code (and
| how recsys work in general). That's partly what motivated me to
| write this article.
|
| See here for a rebuttal of the main tweet in that thread (near
| the bottom of the article).
| https://solomonmg.github.io/post/twitter-the-algorithm/
| ROTMetro wrote:
| If this is for their Crisis Misinformation Policy why only
| one specific callout and specifically directed to Ukraine?
| Seems like a generous assumption to make on your part that
| it's a nothing burger. The takeaway we should go with is that
| we now know that internally they are willing to
| programmatically segment out Ukraine related topics. The
| question to me that this new knowledge should lead to is why
| have a policy of segmenting this? (not to immediately jump to
| 'nothing burger' or as you put it in the above post
| 'misinformation').
| wunderland wrote:
| It's unclear if it penalizes discussion of Ukraine equally
| though.
|
| There have been many stories that have come to light in the
| last few months. Merkel and Macron admitting the Minsk
| agreements were used to buy time for CIA and British to arm
| rebels since 2014 was a big story. Large amounts of money the US
| has supplied Ukraine and lack of oversight to where this is
| going (the total US aid now surpasses Russia's entire military
| budget per year). But this same poster (aakashg0) claims these
| stories have been suppressed, even though they would be counter
| to dominant narrative in western media.
|
| I think algorithmic moderation on a particular topic is hard;
| you still need someone in there boosting the stories you want
| people to read and downplaying the stories you don't.
| stefan_ wrote:
| Tell us more, who are the "rebels" in this story and what
| arms did Merkel send?
|
| (Is this what news in the PRC feels like?)
| wunderland wrote:
| The rebels are right-wing paramilitary groups.
| And Germany
| didn't send any weapons during 2014-2022, but she said in a
| Der Spiegel interview from Oct 2022 that during the Minsk
| negotiations, it became clear that the US' objective was to
| buy time to secretly arm Ukraine (which is newsworthy
| because this would imply a violation of the Minsk
| agreement).
| stefan_ wrote:
| So in a year the US has sent more than Russia's yearly
| defence budget, yet Minsk (which one, even?) was needed
| to secretly (what was secret?) arm the Ukraine over 8
| years? Who are the "right-wing paramilitary groups" and
| if they are Ukraine, since this is who you are alleging
| is being armed, why are they rebels if they are
| government-aligned?
| wunderland wrote:
| This is all very easy stuff for you to verify for
| yourself and wasn't the original point of my top comment
| (which was that these stories are hard to suppress
| without manual effort-- although apparently many
| Americans are unaware).
|
| But to be clear, the US was funding Ukrainian rebel
| groups (right-wing paramilitary organizations) 2014-2022
| but through clandestine means. This is much more
| difficult to do without the support of congress because
| the support has to be indirect -- the funding has to be
| off-the-books -- because this was a violation of the
| Minsk agreement.
|
| Since 2022, the floodgates have opened and the US is now
| openly sending money and weapons systems, now totaling
| over $100B since the Russian invasion. The Russian
| Defence budget is estimated to be $70-80B per year.
| Fauntleroy wrote:
| The gigantic key factor in all this that you're leaving
| out is that Ukraine is defending itself against a full-on
| invasion by a hostile neighbor.
| matthewdgreen wrote:
| I mean, the fact that the Ukraine re-armed itself after
| Russia invaded their territory isn't news, is it? I think it
| was reported on pretty substantially.
| And a good thing too
| since they were invaded a second time, this time with a
| strike towards their capital. I sort of assumed that was
| obvious public knowledge and don't understand why people are
| making it into a "story."
| jawns wrote:
| I would assume that if "probability that other users will
| positively engage with a tweet" is the primary determiner of
| reach, then the more you can help Twitter accurately predict that
| probability, the better, because otherwise the default
| probability is likely no higher than middle-of-the-pack.
|
| If that assumption holds, then I would guess this type of
| algorithm favors consistency of content. In other words, someone
| who picks a certain topic and consistently tweets only about that
| topic is going to be easier to form predictions around versus
| someone whose tweets show much more variety, in topics, styles,
| etc.
|
| What that might mean, from a "gaming the system" point of view,
| is that if you're a person who intends to primarily tweet about
| two or three disparate things, you might be better off creating a
| separate account for each, rather than a single account where
| engagement is harder to predict.
| gregbander wrote:
| As the article calls out, the code is right there. Post your
| results of tests, not knee jerk conjecture. Wrong opinions are
| a dime a dozen.
| whimsicalism wrote:
| It's probably less this and more "if you only talk about one
| topic, then when we show your posts to similar users, they are
| more likely to like it"
| just_boost_it wrote:
| That doesn't really make sense because most big accounts tweet
| about a range of topics. There are pretty well-established ways
| for estimating the probability based on how the range of topics
| you might tweet about would match with the range of topics a
| user likes. That means you have to try and figure out what your
| base likes to see, and be like that.
| Tweeting about only a
| single topic means that you're only targeting people who are
| likely to like tweets from accounts that tweet about that one
| topic.
| teruakohatu wrote:
| > That doesn't really make sense because most big twitter
| accounts tweet about a range of topics.
|
| People engage with celebrities on everything. If an A list
| celeb announces they enjoy a slice of lemon in hot water,
| twenty news articles will be published around the world.
| delecti wrote:
| Incidentally, those same conditions seem to apply to other
| sites. Lots of Youtubers have multiple channels (a main
| channel, a livestream channel, a shorts channel).
| netcraft wrote:
| I've thought many times over the years that I would love to
| be able to subscribe to a particular playlist or "show" from
| a channel. There are several channels that I want to see their
| main stuff, but not their side content. Or a particular game
| from a let's-play-er, but not their other games.
| delecti wrote:
| Surprisingly Youtube did have that functionality, though it
| was removed quite a while ago. I specifically remember
| being able to subscribe to "Is It A Good Idea To Microwave
| This?" (in the late 2000s time range), without also
| subscribing to the other videos on the channel.
| suddenclarity wrote:
| Pitch meetings and Ars Technica's interviews about old games
| come to mind. Fortunately, pitch meetings got his own
| channel last year.
| thrashh wrote:
| I don't think it's only helpful to the algorithm. It's also
| helpful for me as a subscriber.
|
| If I want to watch episodes of Breaking Bad, I don't want you
| to randomly throw in episodes of M*A*S*H (even though both
| are good).
___________________________________________________________________
(page generated 2023-04-11 23:00 UTC)