[HN Gopher] The makers of Eleuther hope it will be an open sourc... ___________________________________________________________________ The makers of Eleuther hope it will be an open source alternative to GPT-3 Author : webmaven Score : 110 points Date : 2021-03-29 13:39 UTC (9 hours ago) (HTM) web link (www.wired.com) (TXT) w3m dump (www.wired.com) | jdonaldson wrote: | It's funny how behind the times Wired is getting. Even my | parents know how scary good these text models are getting. | grapecookie wrote: | The fact that there are no advanced AI chat-bots because they | might (I mean they will lol) say something offensive is absurd. | We are such babies. | | General AI is already here. It should be implemented on twitter | or wherever and used to teach us about ourselves: driven by | engagement, untethered by morals. A dispassionate glimpse into | what sells. An AI that exploits our engagement, for good or evil. | | The bot would become infamous and in due course banned. Teaching | us even more. | | But we are so fragile. | IgorPartola wrote: | What are you on about? There are AI driven chat bots. When they | aren't used it's not because they might say something | offensive. General AI is not here by any definition or | redefinition of General AI. We are not fragile. | robotresearcher wrote: | AI driven chat bots are routinely deployed, yes. But it's | also true that at least one bot generated content that | spooked its owner: | | "The AI chatbot Tay is a machine learning project, designed | for human engagement. As it learns, some of its responses are | inappropriate and indicative of the types of interactions | some people are having with it. We're making some adjustments | to Tay." (Microsoft statement) | | https://www.theverge.com/2016/3/24/11297050/tay-microsoft-ch...
| claudiawerner wrote: | This is because Twitter users, some coordinated on 4chan's | /pol/ board, decided to train the bot on extreme racist | input: | | https://en.wikipedia.org/wiki/Tay_(bot)#Initial_release | ravi-delia wrote: | I mean we're very fragile, which is why if we had General AI | we shouldn't release it at all, but that's of course not what | OP was saying lol. | grapecookie wrote: | General AI is not here: https://openai.com/blog/image-gpt/ | lol | leereeves wrote: | What do you mean by General AI? | | If you mean AGI (artificial general intelligence) it's | definitely not here yet. | ravi-delia wrote: | Ah yes, the only possible issue with releasing fully general AI | is that it might say something offensive. Not because we don't | have it at all, not because if we did we shouldn't just let it | out like a lion in the gazelle pen to see what it does, because | of those snowflakes! | grapecookie wrote: | Wrong, gpt4 is new and has not been implemented as a chat bot. | | Also wrong that previous chat bots were not shut down for | being offensive. https://en.wikipedia.org/wiki/Tay_(bot) | ravi-delia wrote: | Mmm, and was Tay General AI also? | moistbar wrote: | One data point in a sea of billions does not a pattern | make. | stellaathena wrote: | GPT-4 doesn't exist mate. | mrkramer wrote: | Couldn't Google build the world's most powerful NLP AI? They | scraped the whole web and have DeepMind to pull it off on top of | Google's powerful and massive data centers. | wongarsu wrote: | They probably could, but what for? | | They did develop BERT and use (used?) it for parsing search | queries [1]. They probably use NLP models in the ranking | algorithm too. But those use cases are about getting a good | enough result within the throughput/latency requirements, which | necessarily makes them less "powerful" than models like GPT | that pay little attention to performance. | | https://blog.google/products/search/search-language-understa...
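[Editorial illustration, not part of the thread] The BERT-vs-GPT distinction that recurs through this discussion largely comes down to the attention mask. A toy numpy sketch (the 4-token sequence length and the mask shapes are a simplification for illustration, not Google's or OpenAI's actual code):

```python
import numpy as np

# Toy sketch of the attention-mask difference between a
# BERT-style bidirectional encoder and a GPT-style causal decoder.
# 1 = this position may be attended to, 0 = it is hidden.

seq_len = 4  # arbitrary toy sequence length

# Bidirectional mask: every token sees every other token, which
# suits understanding tasks such as parsing a search query.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# Causal mask: token i sees only tokens 0..i, which is what lets
# GPT-style models generate text one token at a time, left to right.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

print(causal_mask)
```

The masks differ only above the diagonal, but that single constraint is why MLMs like BERT are suited to analysis tasks while CLMs like GPT can generate open-ended text.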
| minimaxir wrote: | As someone who works on a Python library solely devoted to making | AI text generation more accessible to the normal person | (https://github.com/minimaxir/aitextgen ) I think the headline is | misleading. | | Although the article focuses on the release of GPT-Neo, even | GPT-2 released in 2019 was good at generating text, it just spat | out a lot of garbage requiring curation, which GPT-3/GPT-Neo | still requires albeit with a better signal-to-noise ratio. Most | GPT-3 demos on social media are survivorship bias. (in fact | OpenAI's rules for the GPT-3 API strongly encourage curating such | output) | | GPT-Neo, meanwhile, is such a big model that it requires a bit of | data engineering work to get operating and generating text (see | the README: https://github.com/EleutherAI/gpt-neo ), and it's | unclear currently if it's as good as GPT-3, even when comparing | models apples-to-apples (i.e. the 2.7B GPT-Neo with the "ada" | GPT-3 via OpenAI's API). | | That said, Hugging Face is adding support for GPT-Neo to | Transformers | (https://github.com/huggingface/transformers/pull/10848 ) which | will help make playing with the model easier, and I'll add | support to aitextgen if it pans out. | nipponese wrote: | Totally off topic: can you fix the pip3 installer for | aitextgen? I just filed an issue on GH issue tracker. | pabe wrote: | Did anybody set up a webinterface for testing this already? | droopyEyelids wrote: | I apologize for joking on Hacker News, but go to Google and | type in anything to do with a consumer product comparison, and | you'll get a billion results of webpages filled with text | indistinguishable from AI generated blather. | holstvoogd wrote: | I believe we are reaching a singularity. | | Like 90% of content is written by marketeers for bots. SEO | they call it. Now we can take out the middle man. Bots | writing crap for other bots. And then we use that content to | train more bots to write even crappier blog spam. 
And finally | the bots decide the actual recipe is no longer needed on the | recipe blogs and they kick us off the internet. | frockington1 wrote: | I wish it were possible to break down how much of twitter | is bots reading other bots and then creating content for | bots. They would never admit to how many 'users' are this | but it has to be significant | [deleted] | spideymans wrote: | It'll be interesting to see how colleges and universities react | to GPT-3. Students will surely use it to write entire | assignments. | b0rsuk wrote: | Corporate speeches, sermons, motivational talks, poetry, and | political speeches. They are either not required to make sense | or no one dares to interrupt. | Der_Einzige wrote: | Anecdotally, but I know a number of people who do university | assignments for money. Many of the clients of the folks I know | at the university level are folks with poorer than average | English language skills and are usually in intro writing | courses. I'd be terrified if I were one of them right now. | | GPT-3 would be a godsend for cheaters, but still requires a | human to jump in and rewrite whole sections. | | No, if you REALLY want to cheat using AI, you should | most likely utilize either 1. abstractive summarizers (e.g. | Pegasus) or 2. paraphrasing tools (e.g. like at | https://quillbot.com/). I believe that Quillbot is primarily | powered by MLMs like BERT rather than CLMs like GPT-2 (but | someone who works there can enlighten me more). | | Copy and paste a text that you want rewritten in your own words | (e.g. the ideas of a really smart individual), and then it | rewrites it using totally different language but preserving the | same meaning. (Old) plagiarism detection tools don't work and, | hell, it's not hard to fool the newer ones. You can try tools | for detecting if something is AI written by a particular model | and weights (e.g.
to prove if they used GPT2-Medium), but if I | fine-tuned those same weights, then proving it was plagiarism | will become exceedingly difficult. | | Welcome to the brave new world of cheating. Also, techniques | like this are coming to a CS department near you (in the form | of source code generation powered by NLP models). | charcircuit wrote: | GPT-3 is just as much of a cheat as using a thesaurus. New | writing tools shouldn't be banned just because old people | didn't have those tools. | rjzzleep wrote: | Even long before GPT-3 a friend of mine did his thesis with | generated text, in an engineering university, and received a B. | That was 6 years ago. I have my own beefs with thesis' in | general already, since two thirds of it seems to be filled with | redundant text to prove that you went to university. I guess | it's a little bit different, since back then he had to actually | work to generate it and now it's a lot easier | whimsicalism wrote: | > his thesis with generated text | | 6 years ago = probably LSTM. | | He wrote an entire thesis with this and got a B? That seems | implausible to me, but maybe I'm used to higher grading | standards. Did he just use it to fill in parts of it? | | Also, the plural of thesis is theses, not thesis' which | implies the possessive. | kwhitefoot wrote: | How does that work? Don't you have to defend a thesis? | pedalpete wrote: | A friend started an AI to improve writing (https://outwrite.com) | and when they initially started, they had a detect-plagiarism | feature that teachers could use. I think they stopped | developing that eventually. | | If I recall correctly, the way it worked was to build up a | model of this person's writing and how it compared to other | people, and then it would measure the likelihood that sentences | and paragraphs matched the rest of the writing.
| | I suspect something similar could be done with GPT-x | vmception wrote: | Wolfram Alpha has been solving calculus problems for 12 years | and it is barely a footnote in how the college experience has | changed | | So I would say this likely will just be there. It just is. Won't | change anything, universities will acknowledge it, a headline | or two will occur when its use is discovered in a paper that a | student didn't even skim to make it less obvious, and most papers | will fly under the radar. | | Other kinds of assessments will still do their job. | grogenaut wrote: | Have you read much gpt3 stuff? While it's coherent in a | sentence it is very rambling over paragraphs to pages. It could | probably do fine for a grade school or bad highschool paper. I | think if you turned it in for college you'd get an F. | | On an unrelated note my fake daughter is now a TA and the | professor led off saying "we are in a golden age of cheating". | They're going for way more short assignments as it's a lot more | work to cheat on those than one make-or-break test. | SubiculumCode wrote: | Have you read college freshman essays? While it's coherent in | a sentence it is very rambling over paragraphs to pages. | grogenaut wrote: | yes I've read them, would those pass an English composition | class in college? This comment generated by Gpt3 | whimsicalism wrote: | I think GPT essays could definitely pass a freshman | expository writing class. I went to a pretty good | university and when we did peer review I was pretty | surprised at (what I considered) the low average quality | of the writing. | bluetwo wrote: | Examples? | whimsicalism wrote: | Looking back, I think freshman me was perhaps a bit harsh | in my assessment of my peers. Here are two excerpts, one | my own writing and one from a peer.
Rereading them, I am | not sure GPT3 could recreate either of these, but you can | judge: | | Peer: > The Gaza Conflict Gave Hamas what they needed to | build an even deeper anti US narrative and anti-israeli | narrative. The reasons that Israel was able to act the | way it did so during these conflicts were:The civil | war/Russian invasion of Ukraine, especially after the | July 17 downing of Malaysian Airlines Flight 17; | President Assad's brutal tactics in the Syrian civil war, | which seemed to be paying off at the time; and advance of | the Islamic State in northwest Iraq and the horrific | videos of their executions. As a way to show 0 tolerance | for Islamic Radicalism President Obama and his | administration gave Israel full support at the start of | the Gaza operation. | | Me: > Eleven years later, peace in Israel and Palestine | seems no closer. Despite being ousted by Fatah from the | West Bank, continually targeted by economic sanctions, | and subject to military action, Hamas seems no closer to | disappearing than when it was first founded (Milton- | Edwards 212) and continues to be a major political force, | particularly in the Gaza strip. What has changed, | however, is that Hamas has grown desperate. In "A rare | opportunity to influence Hamas," Daniel Nisman argues | that the increased isolation experienced by Hamas due to | the international community's pressure is in fact | counterproductive. | nl wrote: | Here's some samples using the _small GPT-2_ (emphasis: | small GPT-2!) in AiTextGen, seeded using "The Gaza | Conflict Gave Hamas what they needed " (1st 3 samples, | not cherry picked): | | > The Gaza Conflict Gave Hamas what they needed to | survive and that's not something Israel can afford. | | > The Gaza Conflict Gave Hamas what they needed to take | control of Gaza. In the past few days I have had a few | conversations with people who have been in Gaza who are | very critical of the IDF. These people are often very, | very angry. 
They are also very angry about the way the | IDF is doing what it is supposed to be doing and what it | is supposed to be doing. So in my mind, the issue of | Hamas being in control of Gaza is not something that | should go away. It is something that needs to be | addressed. Q: What have you been hearing about since the | war started? A: The IDF has been saying that it is going | to allow Hamas to take over Gaza. Q: What has been | particularly troubling to you lately? A: I have been | hearing about this as well. Q: What has been more | troubling to you recently? A: My friends have been saying | that they want to call it Operation Protective Edge. Q: | And, you know, what do they mean by that? A: They mean | that they want to use it to make sure that they have | enough force to take control of Gaza. Q: And that means | that they want to make sure that they have enough | | > The Gaza Conflict Gave Hamas what they needed?" | | The UN is now considering whether to send military | advisers to Gaza to assist the country's military forces. | The Security Council is considering whether to send the | equipment, along with the training, to a military | operation in the region. The US is also considering | sending a team of US special forces to assist the | Palestinian armed forces in the conflict. | | The United Nations is now considering whether to send | military advisers to Gaza to assist the country's | military forces. | | Kerry's comments come as the US has been in touch with | the Palestinians to offer support in exchange for a full | ceasefire, and as the US continues to support the PA and | Hamas, the two groups have been engaged in a long-running | conflict with Israel in the Gaza Strip. | | In January, Kerry condemned Israel's "continued offensive | against Gaza," saying the blockade was the "worst | violation of international law on the part of the Israeli | government and the civilian population of Gaza." 
| | The US is now considering whether to send military | advisers to Gaza to assist the country's military forces. | According to Reuters, the US Secretary of State John | Kerry said this week that "there is no guarantee" that | the US will send special forces "to the Gaza Strip | | So yeah - not fantastic, but interestingly not terrible | either. The non-factual but coherent nature of it is very | troubling. | whimsicalism wrote: | From reading these (esp. the last), you would think the | US is allied with Palestine against Israel! | gwern wrote: | Well... https://www.eduref.net/features/what-grades-can- | ai-get-in-co... https://arxiv.org/abs/2009.03300 | troelsSteegin wrote: | Will anyone care to read it? In a reductive dystopian way, I am | just looking for the authority figures in my ideological | landscape to signal to me what my position should be on this or | that topic. In this landscape, argument and evidence matter less | than just communicating an "actionable" judgment. Maybe there | could be a Rush Lim-bot. I suppose some iteration of GPT-foo will | be good at generating genre-consistent narratives, but could that | instead be screen plays that render as tiktok videos? The tech is | super cool, but I struggle with the "why, really?". Does anyone | benefit beside platform operators? | k1rcher wrote: | While the fraud implications of convincing generative text is | quite daunting, it's great to see progression in this field. | aaron695 wrote: | > the Eleuther team has curated and released a high-quality text | data set known as the Pile for training NLP algorithms. | | This includes HN [i] HackerNews 3.90GiB 0.62% | | Which if SciFi has taught me anything means we are all uploaded | now and will live forever. | | [i] https://arxiv.org/pdf/2101.00027.pdf | f38zf5vdt wrote: | "Wintermute was hive mind, decision-maker, effecting change in | the world outside. Neuromancer was personality. Neuromancer was | immortality. ... 
Wintermute [had] the compulsion that had | driven the thing to free itself, to unite with Neuromancer." | worik wrote: | I thought that the important barrier to building these sorts of | systems is the cost of (indirectly, the energy required for) | training the model. Is that still correct? | | How does a Free Software or "Open Source" project get around | that? | sodality2 wrote: | Distributing the trained models. | worik wrote: | I should have read the article more carefully!! | | The Eleuther project makes use of distributed computing | resources, donated by cloud company CoreWeave as well as | Google, through the TensorFlow Research Cloud, an initiative | that makes spare computer power available, according to | members of the project | vmception wrote: | "Man GPT-3 is such an inaccessible naming convention and it uses | a prohibitive license" | | Solution: | doesnotexist wrote: | How many internet forum prophecy cults (you know, like the Q one) | are or will be powered by these language models? It's often | assumed, or at least easier to imagine, that the evaluator in | Turing's test is a rational actor who possesses a high degree of | skepticism. But it seems that a lot of the human population is | ready and willing to believe wild claims with little or no | evidence, and many people seek out information that confirms what | they already believe. | | As the cost of making such models becomes less and less, it seems | inevitable: spin up many such models and see what sticks, and/or | add some evolutionary process for feeding back user engagement | to fine-tune and adapt the models. How many of these | influence machines will latch onto the language of existing | religious traditions, and how many might invent or spur on the | development of entirely new ones? Maybe not exactly the "Age of | Spiritual Machines" that some futurists predicted...
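[Editorial illustration, not part of the thread] The training-cost barrier raised above can be put in rough numbers with the widely cited back-of-envelope that training compute is about 6 × parameters × tokens FLOPs. The token count and GPU throughput below are illustrative assumptions, not Eleuther's actual figures:

```python
# Back-of-envelope training cost, using the common approximation
# C ~= 6 * N * D FLOPs (N = parameter count, D = training tokens).
# All hardware numbers here are assumptions for illustration.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6.0 * n_params * n_tokens

# A GPT-Neo-sized model: 2.7B parameters, assume ~300B training tokens.
flops = training_flops(2.7e9, 300e9)

# Assume a single GPU sustaining 100 TFLOP/s of useful throughput.
gpu_flops_per_sec = 100e12
gpu_days = flops / gpu_flops_per_sec / 86400

print(f"~{flops:.2e} FLOPs, roughly {gpu_days:.0f} single-GPU days")
```

Even with generous assumptions this lands in the hundreds of single-GPU days for a 2.7B-parameter model, which is why "distributing the trained models" (plus donated compute such as CoreWeave's and the TensorFlow Research Cloud's) is the practical answer for an open source project.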
| caslon wrote: | This thought process reflects what I think is a common | misconception about how cults work. | | A machine to autogenerate cult-ish nonsense isn't needed. | Humans are already _incredibly good at doing this on their | own._ | | Not only that, but cults | generally fine-tune themselves to fit their members. | | A machine generating convincing lies still wouldn't | meaningfully do as much as a human-operated, human-targeted | attempt at a cult. Creating one is something basically any | human can do; the required skillset is something most people | possess. | nutanc wrote: | I have recently started an experiment: an AI-generated | newsletter[1]. All posts are generated by GPT-3. I work as the | editor. It works well for some topics and not so well for others. | Since I curate the content, I don't publish topics which | are not done well. For example, I tried to make it generate a | nice article on the Suez Canal crisis. But it was harder than I | thought it would be. | | It generates buzzfeed kind of stories very well though :) | | [1] https://aifeed.substack.com/ | starik36 wrote: | Are you using the OpenAI API to generate these? | I_Byte wrote: | How do you go about generating these posts? I think I would | like to play around with something like this but I am not sure | where to start. | hooande wrote: | GPT-3 doesn't know anything about the Suez Canal blockage. It | only knows what it could have learned by googling "suez canal" | on the date the last update was released. I imagine the | newsletter content it created for you was mostly general | background info about the canal. | | Whenever GPT-3 is updated or a new version comes out, it will | be able to speak much more intelligently about the topic. But | of course any update will require re-doing all the careful | tuning of prompts and models... | girlinIT wrote: | AI can also convert audio to text; one great example is | https://audext.com/.
What do you think? | 6gvONxR4sf7o wrote: | I don't know why the eleuther project riles me up so much. Their | work on the pile gets to me because they're so cavalier about | copyright (while I defend myself by training on similarly pirated | text datasets, but feel different because I don't redistribute | them and am honest that it's pirated. to be clear, i'm rolling my | eyes at my rationalization right here). Their work on gpt-neo | riles me up because they do such a weak job comparing it to the | models whose hype they're riding. It also riles me up because so | many people just eat it up uncritically. | | But it's all out of proportion. I think it's that last part (the | uncritical reaction) that makes me blow this out of proportion. | stellaathena wrote: | > Their work on GPT-Neo riles me up because they do such a weak | job comparing it to the models whose hype they're riding. | | Building open source infrastructure is hard. There does not | currently exist a comprehensive open source framework for | evaluating language models. We are currently working on | building one (https://github.com/EleutherAI/lm-evaluation-harness) | and are excited to share results when we have the | harness built. | | If you don't think the model works, you are welcome to not use | it and you are welcome to produce evaluations showing that it | doesn't work. We would happily advertise your eval results side | by side with our own. | | I am curious where you think we are riding the hype /to/, so to | speak. The attention we've gotten in the last two weeks has | actually been a net negative from a productivity POV, as it's | diverted energy away from our larger modeling work towards bug | fixes and usability improvements. We are a dozen or so people | hanging out in a discord channel and coding stuff in our free | time, so it's not like we are making money or anything based on | this either. | stellaathena wrote: | Hi!
I'm the EAI person who your criticism of the Pile is most | directed at. I'm curious if you read Sections 6.5 and 7 of the | Pile working paper and, if so, what your response to it is. As | you note, virtually everyone trains on copyrighted data and just | ignores any implications of that fact. I feel that our paper is | very upfront about this though, going as far as to have a table | that explicitly lists which subsets contain copyrighted text. | | Also, I realize that you don't have any way of knowing this, but | we have also separated out the subset of the Pile that we | can confirm is licensed CC-BY-SA or more leniently. This wasn't | done in time for the preprint, but is in the (currently under | review) peer reviewed publication. Unfortunately the conference | rules forbid you from posting materials or updating preprints | between Jan 1st 2021 and the final decision announcement. But | we will be making the license-compliant subset of the Pile | public when we are able to and will give it equal prominence on | our website to the "full" Pile. | | Also, we will be releasing a datasheet for the dataset but | again conference limitations prevent us from doing so yet. | | If you're interested in talking about this in depth, feel free | to send me an email. | 6gvONxR4sf7o wrote: | Hi again! We had a back-and-forth about this a while back | regarding the paper and I think we didn't end up on the same | page regarding the "public data" definition in the paper | (found it! [0]). I love that you're upfront in the paper, | because it's silly how most people just don't acknowledge it | (though they usually don't redistribute it publicly like the | pile does). | | I think the gist was us disagreeing about the relevance of | | > _Public data_ is data which is freely and readily available | on the internet. This primarily excludes ... and data which | cannot be easily obtained but can be obtained, e.g. through a | torrent or on the dark web.
| | That last phrase is what got to me. It puts things in the | same category that feel too different. E.g. the Harry Potter | books vs this comment I'm writing. They're both available | within a few clicks from the search bar (one because I put it | there, another because it was put up against the wishes of | the author and owners), but that commonality doesn't feel | relevant. | | Excluding torrents especially seems like a cop out explicitly | to get around the issue of "X is the top result when i google | it" being so common as a torrent. I think you're trying to | exclude that content as public because then it defines too | much as public? But torrent vs ftp doesn't feel at all | relevant when it's just google plus a click or three. Or | searching on pirate bay plus a single click. | | I imagine a judge looking at the copyright status of | someone's pirate site and saying they can't redistribute the | content, and the pirate responding "okay we'll take down the | ftp server and put up a torrent instead, so that it's not | public. If you google us (or search on pirate bay), the top | result will stop saying 'X download' and now it'll say 'X | download torrent'" and expecting the law to be on their side. | | I didn't really buy the arguments in section 7 either. The | usage points seem legitimate, but don't cover redistribution. | | > But we will be making the license-compliant subset of the | Pile public when we are able to and will give it equal | prominence on our website to the "full" Pile. | | This is fantastic and I want to sincerely thank you for that. | | I'm trying not to be combative, but I feel like publicly | redistributing other people's work does raise the bar quite a | lot higher than just using it to train. | | [0] https://news.ycombinator.com/item?id=25616218 | nl wrote: | I don't have a dog in this fight, but I think you should | re-read this: _data which cannot be easily obtained but can | be obtained, e.g.
through a torrent or on the dark web._ | | It's an extra piece of engineering to reliably scrape | torrents and the dark web and exclude spam traps. "Easily | obtained" is probably as much about this vs the copyright | aspects. | | The person you are replying to is correct in saying that | most people train on the "public web" (eg, common crawl | data). The copyright implications of this haven't been | tested in court as yet. | | It is worth noting that common-crawl data is widely | distributed and would seem to raise the same issues you are | identifying here. | andyxor wrote: | that's not an AI | neonate wrote: | https://archive.is/MxlnQ | everdrive wrote: | People already believe garbage at a pretty alarming rate. It's | easy to guess at a number of possible outcomes here: | | - More junk text moves the public to doubt legitimate information | even further than they currently do. | | - There is so much human-generated junk text that adding more of | it via AI actually doesn't have much of an effect. | | - People return to lean on experts, perhaps even more than | before. (just as a number of tech-literate folks have now | returned to relying on brand name.) | | Speculation is easy of course, so who knows what will actually | happen. | hanniabu wrote: | > People return to lean on experts | | The problem with this is that people look at anybody confirming | their bias as an expert. I can't tell you how many FB posts | I've seen where some armchair poster claims that a researcher | is wrong because of xyz and it's being reposted thousands of | times. | burlesona wrote: | I think we may come to see the era of roughly 1990-2010 as the | golden age of information: relative abundance creating new | opportunity, before the noise drowned it all out. | | I suspect that in the future people will, ironically, return | more strictly to tribal knowledge, as the media and the | internet will be (already is) a vast ocean from which you can | pull anything you want to believe. 
Thus nothing you see or hear | from mass media or the internet can be trusted, there are no | experts, and you go back to information scarcity as you have to | rely on your immediate human network for trust. Actually I | think we're already seeing the return to tribal authority; the | early waves are already here on Facebook and YouTube... they | just haven't devolved to strictly local circles of trust yet. | api wrote: | Concrete prediction: There will be a global cult similar in | nature to Qanon driven by an AI spitting out generated bullshit | within the next ten years. | | That's assuming some percentage of Qanon word salad isn't the | output of Markov chain generators. A lot of it resembles | low-order statistical text generator output after having been | trained on a corpus of 1990s Usenet alt.conspiracy and the | Protocols of the Elders of Zion. | cblconfederate wrote: | People believe what's believable (even if backed up by | garbage). GPTs don't make believable stuff, but they can be used | to flower up some b.s. idea. Nothing that can't be done with a | few hired trolls, and the proliferation of garbage will | endanger the troll industry, as people will start becoming | suspicious. So I doubt its impact can go beyond generating spam | and noise. | isolli wrote: | > just as a number of tech-literate folks have now returned to | relying on brand name | | out of curiosity, what are you referring to? | everdrive wrote: | In the early-ish days of the consumer internet, consumers had | a new and huge information advantage over companies. People | moved from relying on brand name, to reading online reviews, | often finding niche brands which they had otherwise not heard | of. | | Now, in 2021, that experience is flipped on its head. Amazon | reviews are gamed and cannot be trusted. Companies build | niche brands like fly-by-night companies, and the lesser | known brands have a very high chance of being both seriously | inferior, and also short lived.
| | At least, this has been my experience, and the experience of | some others. | | [edit] | | And as further anecdotal proof that things have come full | circle, my elderly mother-in-law keeps getting tricked by | Amazon purchases. "The reviews were good," she'll say before | returning something. | whimsicalism wrote: | I just rely on reviewers like NYT Wirecutter and then buy | whatever the reviewers suggest (and is cheap) on Amazon. | [deleted] | [deleted] | ElFitz wrote: | True. But a simple API to generate junk text? It can scale, | cheaply, beyond measure. | | No need for a troll farm, hiring, managing and training tens or | hundreds of people. | | A reasonable amount of cash, a bit of motivation, some moderate | technical skills, and voila! Anyone can compete with the | Russian troll farms now and build their own networks of | hundreds or hundreds of thousands of sufficiently credible (as | humans) fake accounts spewing garbage and patting each other on | the back via likes, retweets and whatnot. | | All with the appropriate fake-news blogs and sites happily | churning out grammatically correct nonsense that makes (enough) | sense. | | Basically, this kid's dream: | https://www.nbcnews.com/news/world/fake-news-how-partying-ma... | jackTheMan wrote: | And even Russian/Chinese bot operations can step this up, as it | is now much easier to flood forums (like Reddit) and kill any | discussion wherever, e.g., a China-critical article appears. | luckylion wrote: | I find it easier to identify humans that flood forums | though. Especially non-native speakers usually are somewhat | easy to spot; I assume that's true in any language. That's | different for ML-generated texts. On the other hand, human | texts are more "on message", but if all you want to do is | create noise, I guess you don't need to have targeted | communications.
| | This is key in anti-extremist operations on anonymous | boards. 4chan and other similar sites are absolutely | nothing like they were a decade ago, I presume because of | such bots flooding them with noise. | [deleted] | ElFitz wrote: | Oh, definitely. Existing operations also absolutely can | leverage that in order to amplify their reach and | capabilities. | TriNetra wrote: | And unfortunately, the web will be forced to move toward | verified human identities to fight such junk, and | anonymous browsing will become a thing of the past. | UnFleshedOne wrote: | There is a market (well, a need at least) for nonsense | detectors that work similarly to the way ad blockers work. | Detect internal inconsistencies, non-sequiturs, low information | density and other similar reasons to avoid reading the text -- | and visibly flag or block that. | | That should eliminate 80%+ of existing human generated text | content and lead to text generators composing useful articles. | mfDjB wrote: | It's very nice to see Eleuther fulfill the Open promise of | OpenAI. | | I'm scared that more and more big model advancements are being | denied to the general public, which will just make the | inequality between big corporations and startups even greater. ___________________________________________________________________ (page generated 2021-03-29 23:01 UTC)