[HN Gopher] Meta is inviting researchers to pick apart the flaws... ___________________________________________________________________ Meta is inviting researchers to pick apart the flaws in its version of GPT-3 Author : mgl Score : 223 points Date : 2022-06-27 16:41 UTC (6 hours ago) (HTM) web link (www.technologyreview.com) (TXT) w3m dump (www.technologyreview.com) | bribri wrote: | So happy more stuff like this is open. Kudos to Meta | westoncb wrote: | This makes for some pretty excellent counter-marketing against | OpenAI: | | "so Meta's GPT-3 is open?" | | "correct" | | "and the original is not?" | | "correct" | | "and the original is made by 'OpenAI'?" | | "correct" | | "hmm" | slowhadoken wrote: | Is this an advertisement for developers to work on Facebook's AI | for free, or am I being cynical? | charcircuit wrote: | No, it means that researchers can now have access to Facebook's | large language model. No one is forcing the researchers to do | research using it. | rexpop wrote: | Of course it is! That's the premise of every open-source | initiative, too. It's not too cynical, it's plain economics. | Pretty sure it's the explicit purpose, too. | | No one really thinks open-source sponsorships are charity, do | they? | abrax3141 wrote: | Yet another confabulation generator with pretty good grammar. | sudden_dystopia wrote: | Didn't we already learn our "free" lesson from this company? | mgraczyk wrote: | Really concerning to me that people find the Luddite argument so | persuasive and that it gets so much play in the press. The crux | of the argument from the outside "ethicists" quoted in the | article is something like: | | "This new piece of technology might be dangerous and we don't | fully understand it, so we should not poke at it or allow people | to study it." | | Maybe there's just something about my personality that is deeply | at odds with this sentiment, but it's also about the lack of | testable predictions coming from people like this. Their position | could be taken about literally anything with the same logical | justification. It's a political and emotional stance masquerading | as a technical or scientific process. | drcode wrote: | The remaining great apes remain alive primarily due to our pity | for their plight - they were once the highest-IQ species in the | world, but no more. | | Probably, we will be in the same situation relatively soon, and | there is little reason to expect AI systems to have the | same pity. | | Sorry I can't set up a double-blind, testable, peer-reviewed | study to help convince you of this. | jonas21 wrote: | It's much easier to sit on the sidelines and come up with | reasons why you shouldn't do something new than it is to | actually do it. | | Some people have figured this out and built careers on it. This | wouldn't be a problem, except that this opposition eventually | becomes their professional identity - they derive prestige from | being the person who is fighting against whatever. So even | after researchers address their concerns, they have to ignore | the progress or move the goalposts so they can keep on opposing | it. | eszaq wrote: | This doesn't seem like a bad thing to me. In the same way we | have public defenders who professionally defend scoundrels, | it seems good to have people who professionally critique new | technologies. | | I'm old enough to remember the naive optimism around the | internet in the 2000s.
"The long tail", "cognitive surplus", | "one laptop per child", Creative Commons, the Arab "Spring", | breathless Youtube videos about how social media is gonna | revolutionize society for the better, etc. Hardly anyone | forecasted clickbait, Trump tweets, revenge porn, or social | media shaming. If we had a few professional critics who were | incentivized to pour cold water on the whole deal, or at | least scan the horizon for potential problems, maybe things | would've turned out better. | amelius wrote: | One day, these people will be right! | | (And then we know one solution to the Fermi paradox.) | armchairhacker wrote: | My understanding is that when companies say "we aren't | releasing this / limited access / restrictions / for ethical | reasons" they really mean "we aren't releasing this because a) | it's expensive to run these models, b) it was even more | expensive to create them and we might be able to profit, and c) | maybe it's bad for our ethics, which affects our funding and | relations, and also, ethics." | godmode2019 wrote: | Its a business play, | | They are asking to be regulated because they have finished | writing their models. | | With regulation it will be harder for up and coming models to | gain traction. | | Its getting so much coverage because its paid press, I read | about it in my newspaper BEFORE tech YouTube and here. | rglover wrote: | Read Ted Kaczynski's (yes, that one) "Industrial Society and | Its Future" with a neutral mind and you will start to | understand why it's compelling. | guelo wrote: | The attitude that I have trouble understanding is "A company | spent millions of dollars researching and developing a new | technology, they must make it available to me or else they are | evil." | ipaddr wrote: | Spent millions on tech that could be a net negative for | society. Keeping details secret makes people think they are | evil because that's how coverups happen. | trention wrote: | Here is one prediction: Because of language models, the amount | of fake news online will increase by an order of magnitude (at | least) before this decade ends. And there is no interpretation | of that development as anything else but a net negative. | | Another more broad prediction: In a decade, the overall | influence of language models on our society will be universally | seen as a net negative. | radford-neal wrote: | "Because of language models, the amount of fake news online | will increase by an order of magnitude (at least) before this | decade ends. And there is no interpretation of that | development as anything else but a net negative." | | That's not at all clear. You're assuming people will continue | to give credence to random stuff they read. But once fake AI- | generated content is common, people will surely become less | trusting. The end result could easily be that fewer people | than before believe fake news is real. Presumably, fewer | people will believe real news too, but the result could still | be net positive. | mortenjorck wrote: | I keep seeing this prediction, but have yet to see a | convincing argument as to how this content is supposed to | circumvent existing trust networks. | | News outlets are nothing without a track record. People trust | names they recognize. 
You can spin up as many CMS instances, | domains, and social media profiles for fake news as you want, | but without a history shared with its core audience, all the | language models in the world aren't going to convince anyone | but the most credulous when the content is coming from | unfamiliar sources. | brian_cloutier wrote: | What is the issue with fake news? | | Language models can now pass as human in many situations, but | there are already billions of humans capable of writing fake | news; this isn't a new capability. | | We have already created mechanisms for deciding which voices | to trust, and no matter how good language models get, they will | not be able to prevent you from visiting economist.com | Viliam1234 wrote: | > there are already billions of humans capable of writing | fake news | | You have to pay them, and most of them are not very good at | writing. Even with a big budget, you get a limited number | of good articles per day. | | If you can make writing fake news 100x cheaper, and then | just throw everything at social networks and let people | sort out the most viral stuff, that can change the game. | | Also, computers can be faster. If something new happens | today and a hundred articles are written about it, a computer | can quickly process them and generate a hundred more articles | on the same topic faster than a group of humans could. (Many | humans can do the writing in parallel, but each of them | individually needs to read the things they want to react | to.) | Vetch wrote: | The vast majority of generators of fake news today are | Content Mills, Copy Paste Mills and general SEO | spammers. Political misinformation is the other big | generator. The economics of it, and not "ethical" | gatekeeping, is what will affect their output. Realistically, | normal people don't have the ability to coordinate hordes | of proxied IPs to attack social networks requiring | account sign-ups and custom feeds. | | The value of exercising critical thinking, checking | trusted curated sources, information hygiene, recognizing | and avoiding manipulation tactics in news and ads will | have to go up. The internet is already almost entirely | unreliable without requiring any fancy AI. The listed | skills will be necessary regardless of any increase in | politically manipulative content, advertisements or | product misrepresentations. | jmathai wrote: | Your argument holds true in theory but does not always work | in practice. | | The issue many people have with fake news is that it's a | tool that can sway public opinion without any basis in | facts. I'm not sure, by your response, if you find that to | be problematic or not. | | I think we've recently found that people _haven't_ decided | which voices to trust and can be led to believe things | placed in front of them. Paired with the ability to spread | that information - there is significant impact on society. | | That's the reason some people have issues with fake news, | from my experience. | | Also, getting a computer to do something will always scale | several orders of magnitude more than having billions of | people do it. | numpad0 wrote: | I kind of agree - a lot of "fake" news believers seem to | be actively seeking contrarian views purely for the | sake of it, with indoctrination as an incentive offered for | the labor of reading, rather than harm unto themselves. In that | sense, the factual accuracy - the "fake" part - doesn't seem | to be the point, and the volume of text that NN generators | enable can be less of an issue.
| texaslonghorn5 wrote: | I think the issue could be volume (and also that the vast | majority of humans aren't actively exercising their ability | to write fake news at present). Also that language models | might be far more convincing. | NoMAD76 wrote: | Fake news and AI-generated news have been kinda all over the place | for a good amount of time now. It's faster and cheaper to have AI | write news from a press release. | | My prediction is that in the next 10y we will really struggle | to determine between fake-people and real-human. There will | be an explosion of fake identities posting more and more | human-like content. | | But I'm not Nostradamus so I could be very very off here. | shon wrote: | You're probably right but it will be an arms race with ever | more sophisticated mitigation techniques being deployed to | filter. | | I'd say Neal Stephenson has a pretty good take on what this | might look like in his recent book Fall, wherein everyone | has a "feed" and those that are more savvy/wealthy have | better editors (AI, etc.) of their feed. | NoMAD76 wrote: | It's all about having the right tools, but I wonder how | long can we "beat the machine" :) | jerf wrote: | I'm not particularly convinced by the Dead Internet Theory | [1] as of 2022, in the sense that it is completely true | right now. But I am convinced it is building around us, and | even now, the correct frame for the question isn't | _whether_ it is true, but _how true_ it is. There are too | many entities with too much reason to build it for it not | to be happening. And the nature of it is such that it doesn't | need to be one entity doing it for one unified reason; | dozens, hundreds can all be participating & fighting with | each other on different levels, and the sum total of all | that puts it together that much more quickly. | | You know, the PGP web of trust idea may yet take off all | these decades later, not because we need a web of trust to | send 100.0000000% safely encrypted messages to each other | to protect from governments, but because we need a web of | trust just to know _who the real humans are_. | | [1]: https://forum.agoraroad.com/index.php?threads/dead-internet-... | agar wrote: | I love the idea but hate that I'm so cynical about the | likely outcome. | datadata wrote: | Curious to what degree, and how, you think the web of trust | idea could help here? Assume you could use it to prove | whether an article was signed or not by a real person. I | think this would solve the problem of articles being | published with a false author attribution. However, it | would not work to prevent actual people from publishing | AI-written articles using their own identity. It would | also not (directly) do anything to establish if the facts | in an article are correct or not. | jerf wrote: | Specifically, a web of trust designed for exactly that | assertion: this person is a real human. Signatures here | serve to assert which identity something came from. | | There would be some of the usual web of trust problems, | e.g., trying to explain to Joe Q. Public that you only | sign for people you _know_, beyond doubt, are human. | Preferably in person. Many other problems, too. | | I guess you could say my thought here isn't that this | would solve the problems. The problems at this point are | somewhat well known. What has been missing is any benefit | significant enough to motivate us to get past those | problems. If there is, it's obviously many years away. | Wouldn't exactly suggest building a startup around this | idea right now, if you get my drift. We still need to go | through a phase of the problem getting larger before we | even get to the phase where people start to realize this | is a problem and start demanding that people online prove | they are actually people, and goodness knows "I'm a | human" is merely the lowest of low bars itself, not the | solution to all trust problems.
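A minimal sketch of the attestation jerf describes - one person signing a "this key belongs to a real human" claim that others can verify. It assumes Python's `cryptography` package; the claim format and names are illustrative, not part of any existing web-of-trust tool:

    # Illustrative only: an endorser signs a claim that a given key
    # belongs to a verified human; anyone holding the endorser's
    # public key can check that claim later.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey,
    )

    endorser = Ed25519PrivateKey.generate()
    claim = b"human: subject-key-fingerprint-1234 (verified in person)"
    signature = endorser.sign(claim)

    try:
        endorser.public_key().verify(signature, claim)
        print("endorsement checks out")
    except InvalidSignature:
        print("forged or corrupted endorsement")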
If you really want to learn something credible at | all, you'll really have to be more specific about your | sources than "the internet". | ninkendo wrote: | Yeah, that was gonna be my contrarian HN hot-take too. | Basically if it becomes _really_ obvious some day that | basically all online news publication is noise written by | computers, maybe people will stop actually trusting it so | much? | mgraczyk wrote: | Alternative hypothesis for which I have at least as much | evidence: | | Because of large language models, detecting fake news becomes | trivial and cheap. Building and doing inference on language | models is too expensive for most attackers, so they give up. | Only well-financed state actors are capable of disseminating | fake news, and they are no better at it than they are today | because content generation is not a bottleneck. | notahacker wrote: | Cost per word of fake news is already very low though, and | humans are much better at tailoring it to the appropriate | audience and agenda (and not just restating stuff that's | already out there that might actually be true) | | GPT-type models are much better suited to low-effort | blogspam, and whilst that's not a _good_ thing, they produce | better blogspam than existing blogspamming techniques. I | think we underestimate how bad the Internet already is, and | at worst text generated by AI is simply going to reflect | that. | NoMAD76 wrote: | It's about being 1st to publish it. It is mainly used during | live press conferences. No human can snap a photo (at | least), write a short update in a dedicated article, and so | on... all in 1s. | | Been there (as an independent press member, years ago) - you | simply cannot beat that. | tqi wrote: | Personally, I've never understood why first to publish | matters. As far as I can tell, the only people who care | are other journalists, who seem to think that any story | about a breaking news item MUST credit the person who | wrote about it first (see: ESPN's NBA and NFL | "insiders"). | redtexture wrote: | The unpersuasive argument is "you (second-to-publish | person) copied my stupendous idea to get your derivative | result". | notahacker wrote: | First to publish matters, but GPT-3 is neither necessary | nor sufficient to achieve that. If you're producing | _fake_ news related to a press conference, speed of | content generation is entirely unimportant because you | don't have to wait for the press conference to start to | write the fake article/update/tweet. If you care about | fidelity to the message of the press conference, I don't | see many situations in which a human who has anticipated | the likely message(s) of the conference and therefore has | pre-drafted paragraphs about "striking a conciliatory | tone" and "confirmed the reversal of the policy" ready to | select and combine with a choice quote or two as soon as | they're uttered isn't significantly better than (and as | quick as) GPT-type models prompted by the speech-to-text of the | press conference. Sure, more reliable publications will | want to stick more validation and _at least wait for the | conference to finish before summarising its message_ | steps in the way, but those apply to bots as well as | journalists (or not, for the publications that prioritise | being first over being right). | NoMAD76 wrote: | You have a solid point, but I wasn't talking about | summarizing or excerpting from a press release (those are | handed out as press kits beforehand anyway, with NDA | agreements and so on).
| | Real human journalists have a delay of about 1m before | making a short tweet. Funny (or not), something similar | was in the "live update" article page in less than 10s. | Including photo(s). I was at quite a lot of tech | conferences/live events and earned a decent living then | as an independent tech journalist (but then I got bored, | and really it was a 1-man show). | | Another personal observation (from the field): that was not | happening prior to 2010-2012, the years we all got Siri, | Cortana... | | You can connect the dots and dashes. | phphphphp wrote: | History has shown that humans are terrible judges of the | outcomes of our behaviour; your belief that we can confidently | understand the risks of anything through experimentation might | work in theory but hasn't been borne out in practice. | | Extremists exist at both ends of the spectrum and serve to | balance each other out: without people positing the worst-case | scenarios, the people positing the best-case scenarios would | run full steam ahead without any consideration for what could | happen. | | Perhaps if the proponents of (various flavours of) AI were | doing careful experimentation and iteratively working towards a | better understanding, then maybe the loud voices against it | would be less valuable, but as we've seen through the last 20 | years, progress in technology is being made without a second | thought for the consequences -- and what Facebook are doing | here is a bare minimum, so it's reasonable for opponents to be | somewhat cynical about the long-term consequences. | skybrian wrote: | It seems like there is a difference between "let's release it | for researchers to investigate" and "let's release it for the | general public to use and abuse, including all the script | kiddies and malware authors and 4chan trolls and spammer sites | out there." | | Unfortunately, that difference can only be maintained through | some kind of gatekeeping. | | I like to try out the new algorithms, but I'm mostly just | playing, and I don't see how they make it available to me | without letting any random troll use it. | adamsmith143 wrote: | That's not the argument at all. Rather it's that the technology | is progressing so fast and could become dangerous far faster | than we can make it safe. Therefore it's worth seriously | thinking about the risks that scenario poses. Stopping research | or further work completely is a potential solution, but given | the monetary investments involved it's extremely unlikely it | will be implemented. | | There are lots of very serious people seriously looking at | these issues, and to dismiss them as simple Luddites is frankly | insulting. | mgraczyk wrote: | But nobody is failing to take the risks seriously. The people | who actually work on these models understand the risks far | better than the outside ethicists. I work on ML in research. | Reading their work is like listening to a drunk relative at | Thanksgiving ranting about Bill Gates putting nanobots in | vaccines. It's completely uninformed pseudoscience that comes | from a place of strong political bias. | | For example, Timnit's "parrots" paper confused training with | inference and GPUs with TPUs, making specific quantitative | estimates that were off by orders of magnitude. If she had | talked to a single person working on large language models, | she would have recognized the error. But these people work in | a bubble where facts don't matter and identity politics is | everything.
| trention wrote: | There are enough people criticizing both current language | models and the overall "quest" towards AGI who come from a | non-political (unless you subscribe to an Aristotelian | everything-is-politics) perspective. I personally don't | think any of the companies with significant AI research is | actually doing anything meaningful in terms of safety. | Also, from their public "utterings", it's quite clear to me | that both Altman and Hassabis (not to mention LeCun) don't | actually care about safety or consequences. | mgraczyk wrote: | > I personally don't think any of the companies with | significant AI research is actually doing anything | meaningful in terms of safety | | I assume this is just speculation on your part? Do you | have any reason to make that claim? I personally know | multiple people doing this full time at large tech | companies. | | I can give you some examples of serious safety-oriented | criticism of large language models to contrast with what | plays out in the press and amongst "ethicists". | | It's well understood that one can generate so-called | "adversarial examples" for image classifiers. These | adversarial examples can be chosen so that to a human | they look like thing A, but the model classifies them as | thing B with high probability. Methods of finding these | adversaries are well understood. Methods of preventing | them from being problematic are less developed but | rapidly advancing. | | For language models, the situation is much worse. I don't | know of any effective way to prevent a large language | model from being searched for adversarial inputs, | trivially. That is, an attacker could find inputs from | large curated input spaces that cause the model to output | a specific, desired sequence. For example, an attacker | with access to the model weights could probably find an | innocuous-looking input that causes the model to output | "kill yourself". | | Is this a risk that AI researchers are aware of? Yes, of | course. But the difference between AI researchers and | "ethicists" is that AI researchers understand the | implications of the risk and will work on mitigations. | "Ethicists" do not care about mitigating risk, and they | don't care that the people who build the models already | understand them and are comfortable with them. | adamsmith143 wrote: | >I can give you some examples of serious safety-oriented | criticism of large language models to contrast with what | plays out in the press and amongst "ethicists". | | To clarify, I think the poster above was talking about the | AI Alignment/Control Problem and not the specific | failure modes of particular models - LLMs, CNNs, etc. Very | few people at OpenAI or DeepMind for example are | seriously engaging with Alignment. Paul Christiano at | least acknowledges the problem but seems to think there | will be available solutions in time to avert serious | consequences, which may or may not be the case. The folks | at MIRI certainly don't seem optimistic. | trention wrote: | >I personally know multiple people doing this full time | at large tech companies. | | The failure mode of internal "ethical" control at private | enterprises is well-known and has already played out (at | least) once when we tried to regulate medical experiments | in the 2 decades after WW2. I personally consider the | current AI safety positions to be just blatant | whitewashing. The Lemoine fiasco is a particularly | hilarious case in point, combining both a) a person who was | utterly incompetent and biased for that position and b) a | total failure of leadership to adequately engage with an | issue (or even admit it's possible in principle). At this | point, AI safety is roughly as useful as tobacco lobbying | (exaggerated for effect).
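For readers unfamiliar with the image-classifier adversarial examples mgraczyk describes above, here is a minimal sketch of the classic Fast Gradient Sign Method. This is illustrative only; `model`, `image`, and `eps` are placeholders, and the thread does not name any particular attack:

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, true_label, eps=0.03):
        # Perturb the image in the direction that increases the
        # classifier's loss; to a human it still looks like the
        # original, but the model's prediction can flip.
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), true_label)
        loss.backward()
        adversarial = image + eps * image.grad.sign()
        return adversarial.clamp(0.0, 1.0).detach()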
You'll have automatic con | artistry at scale that can incorporate personalized fake | audio and video on demand depicting any supporting detail | it needs. It'll be like assigning each person a con artist | who's also supported by a staff of content producers. | mgraczyk wrote: | I guess, but on the flip side there are potentially | transformative positive applications, some that we already | know about and some yet to be discovered. Fundamentally some | people are more optimistic and risk-loving when it comes to new | technology. I believe the "good" will overwhelmingly | outweigh the "bad" that you're pointing out. I think it | mostly comes down to personality. | api wrote: | I can think of some positive applications. The thing that | makes me cynical here is that all the evil applications | seem like they're going to be far more profitable in | terms of either money or power. | | This would be a continuation of what's happened to the | Internet in the last 10-15 years. The Internet is amazing | and has tons of incredibly positive uses, but all the | money is in mass surveillance, addictive "engagement- | maximizing" stuff, and gambling and scams. | dr-detroit wrote: | humanistbot wrote: | > "ethicists" | | > It's a political and emotional stance masquerading as a | technical or scientific process. | | I don't think you understand what ethics is. | 4oh9do wrote: | What does "inviting" mean? It sounds like it means "Facebook | wants free labor instead of paying for formal and expensive | security audits". | whoisjuan wrote: | Meta/Facebook has given the world React, PyTorch, GraphQL, | Jest, and other fantastic technologies, and you are just | boiling down their open-source efforts to "Facebook wanting | free labor." | | Not everything in tech is a sinister capitalistic plot. Open | Source and Open Research are truly among the best ways | to accelerate technology advancements, in particular software | technology advancements. | bravura wrote: | That access form was... refreshing. | | Here's why this matters to me, an independent researcher who | wants to start publishing again. | | In 2008, Ronan Collobert and Jason Weston published some work | that made neural network training of word vector representations | really fast. But only ML people read that paper. Yoshua Bengio | and Lev-Arie Ratinov and I plugged old-school cluster-based as | well as fancy-but-uncool-and-icky neural network word | representations into a variety of NLP models. It worked awesome. | Before "transformers go brrrrrr", our paper told the NLP | community, basically, that self-supervised learning and neural | networks go "brrrrrrr". People finally started paying attention | in the language world, ML stopped being treated with suspicion | and the field moved rapidly, our paper racked up 2700 cites and | an ACL 10 Year "Test Of Time" award, and here we are. | | I don't work in a big research lab but I still publish. I pay for | my GPUs the old-fashioned way. You know, out of pocket. | | It took me _ages_ to get access to GPT-3. Ilya was a colleague of | mine, so I messaged him on fb, but no dice. Why? I know I could pull | other strings through my network but, like, really? Is this where | we are right now? | | All I'm saying is: It's nice to fill out a form asking for my | intended use and my previous related publications, as a means | of gatekeeping. The access process feels more transparent and | principled. Or maybe I'm just being grouchy.
| O__________O wrote: | Link to the ACL 10 Year "Test Of Time" award paper mentioned | above: | | Word Representations: A Simple and General Method for Semi- | Supervised Learning | | https://aclanthology.org/P10-1040/ | | (PDF link) | | https://aclanthology.org/P10-1040.pdf | jackblemming wrote: | Yandex and Facebook are both more open than OpenAI? And the world | isn't ending because large language models were released? | Shocking. | jacooper wrote: | OpenAI is basically open in name only. | dqpb wrote: | It's basic run-of-the-mill gaslighting. | thesiniot wrote: | It reminds me of that time the US Air Force designed an "open | source jet engine". | | https://www.wpafb.af.mil/News/Article- | Display/Article/201113... | | Their definition of "open source" turned out to be: "the | government owns the source IP instead of some defense | contractor. No, you can't see it." | | In fairness, I'm impressed that they even got that far. How | do you think the defense contractor lobbyists responded to | that program? | enlyth wrote: | I guess "Closed source pay-as-you-go AI" didn't have quite the | same ring to it | tiborsaas wrote: | Maybe they meant Opening up AI :) | bobkazamakis wrote: | OpenWallet | Judgmentality wrote: | You dare question the gatekeepers of our future AI overlords? | [deleted] | Tepix wrote: | May 3rd, which is why Yandex's 100B model release is not | mentioned. | ArrayBoundCheck wrote: | Why is Facebook using GPT-3? To generate fake content it wants to | push out? | makz wrote: | Beta testing for free? | option wrote: | 175B is for research only and, as far as I understand, their ToU | does not allow commercial usage. | | Currently, the largest LLM that is both free and commercially | usable (Apache 2.0) is the 100B YaLM from Yandex (Russia's copy of | Google). However, they did not publish any details on their | training data. | plegresl wrote: | The 176B-parameter BLOOM model should be available soon: | https://bigscience.notion.site/BLOOM-BigScience-176B-Model-a... | option wrote: | yes, looking forward to it, especially because it is going to | be multilingual by design | charcircuit wrote: | >However, they did not publish any details on their training | data. | | Yes they did. It's in the README. | option wrote: | All I can see is "1.7 TB of online texts, books, and | countless other sources in both English and Russian." | | If there are more details, can you please share a link? | | I am worried that "other sources" may contain Yandex.news, | which is a cesspool of anti-West and anti-Ukraine propaganda | 533474 wrote: | The Pile dataset is used for the English language | remram wrote: | Direct link: https://github.com/yandex/YaLM-100B/blob/main/README.md#trai... | timmg wrote: | Dumb question: does 175B parameters mean the number of bytes | (or floats?) in the model? Does that also mean you need the | whole model in memory to do inference (in practice)? | | If so, not many machines have that much RAM. Makes it hard to | "play" with. | lostmsu wrote: | float16s or bfloat16s, so 2x that in bytes for storage | | You can infer using DeepSpeed. | annadane wrote: | Oh really? Now you invite researchers instead of shutting down | legitimate projects to investigate your algorithms? | blip54321 wrote: | On the ethics front: | | * Yandex released everything as fully open | | * Facebook released open with restrictions | | * OpenAI is completely non-transparent, and to add insult to | injury, is trying to sell my own code back to me. | | It seems like OpenAI has outlived its founding purpose, and is | now a get-rich-quick scheme. | | What I really want is a way to run these on a normal GPU, not one | with 200GB of RAM. I'm okay with sloooow execution.
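To put numbers on the RAM problem raised above, here is a back-of-envelope sketch using the 2-bytes-per-fp16-parameter figure from lostmsu's reply (weights only; activations and other overhead are ignored):

    # 175B parameters at 2 bytes each (fp16/bf16), weights only.
    params = 175e9
    weight_bytes = params * 2
    print(f"{weight_bytes / 1e9:,.0f} GB")  # ~350 GB

    # Inference also needs room for activations on top of this,
    # which is why a single consumer GPU (tens of GB at best)
    # cannot hold the model; hence the offloading tricks
    # discussed below.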
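One common shape for the automated detection tools leereeves and shon mention above is a perplexity screen: score text under a language model and flag passages that are statistically too predictable. A minimal sketch, assuming the `transformers` and `torch` packages and the public GPT-2 checkpoint; the cutoff is illustrative, and heuristics like this are easy to evade:

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def perplexity(text: str) -> float:
        # Mean negative log-likelihood of the text under GPT-2,
        # exponentiated; machine-generated text often scores low.
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss
        return float(torch.exp(loss))

    SUSPICIOUS = 25.0  # illustrative cutoff; needs real tuning
    if perplexity("Sample text to screen.") < SUSPICIOUS:
        print("flag for human review")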
| [deleted] | YetAnotherNick wrote: | I don't understand why OpenAI has so many restrictions on its | API. Aren't things like erotic writing, unlabelled marketing, | etc. good money for them, with minimal chance of litigation? Is | it for PR? | bpodgursky wrote: | It's because it was genuinely founded as an organization | worried about misaligned AI. | dmix wrote: | The critique is that the _type_ of ethics they concern | themselves with is borderline moral-panic/Victorian era. | Not the Laws of Robotics kind of stuff. | | Maybe it's my personality, but I get the impression, since AI | is rather limited in 2022, that all the paid AI ethicists | spend 90% of their time on bullshit problems because there | aren't many real threats. And these get amplified because | the news is always looking for a FUD angle with every AI | story. | | The priority seems to be protecting random people's feelings | from hypothetical scenarios they invent, when IRL they are | releasing research tools on a long-term R&D timeline... | GPT-3 isn't a consumer product they are releasing. It's a | baby step on a long road to something way bigger. Crippling | that progress because of some hyper-sensitivity to people | who get offended easily seems ridiculous to me. | c7DJTLrn wrote: | Also, it's pointless. OpenAI might be a leader right now | but it won't be forever. It can't control a technology. | It's like restricting fire because it can burn down | houses... yeah it can, but good luck with that, all we | need is some friction or flint. As time goes on that | flint will become easier to find. | | If OpenAI wants to concern itself with the ethics of | machine learning, why not develop tools to fight misuse? | rm_-rf_slash wrote: | There are more than enough unaddressed ethics issues in | ML/DS, from racial bias in criminal sentencing to de- | anonymization of weights, to keep ethicists busy without | needing Skynet. | [deleted] | option wrote: | On the ethics front, Yandex should provide more details on the | data they've used. | [deleted] | anothernewdude wrote: | How much are they paying for this service? | jimsmart wrote: | Link to the original blog post by Meta (3 May 2022) | | https://ai.facebook.com/blog/democratizing-access-to-large-s... | nharada wrote: | The logbook is awesome: | https://github.com/facebookresearch/metaseq/blob/main/projec... | | This is the true secret sauce -- all the tricks on how to get | these things to train properly that aren't really published. | gxqoz wrote: | Any particular highlights from this? | fny wrote: | Honestly, no. | amelius wrote: | This all makes me wonder: how reproducible is the final output | model? | dekhn wrote: | the details of that specific model they ended up with? | Irreproducible, unless the system was carefully designed and | every detail required to do a fully reproducible computation | was recorded and replayed. But they could easily produce a | bunch of models that all sort of end up in roughly the same | place and perform the same, ideally reducing the number of | things they needed to change ad hoc during the training. | screye wrote: | Not this one, but Google's PaLM (which is 4x OPT-175B) trains | semi-deterministically. | | These kinds of large transformers can be relatively | reproducible in results and benchmarks. However, making them | converge to the exact same parameter set might not be a | reasonable expectation. | zubspace wrote: | I have no knowledge of such things, but it seems they run CUDA | jobs on about 150 nodes?
| | But why do they have so many problems keeping this cluster | stable? Network failures? Bad GPUs? Bad drivers? Bad software? | | Running fixmycloud and going after all those cryptic errors | every day seems like a nightmare to me... | sp527 wrote: | This really does read like 'DevOps: Nightmare Edition' | | > CSP fat fingered and deleted our entire cluster when trying | to replenish our buffer nodes | | Yikes | dang wrote: | Related: | | _100 Pages of raw notes released with the language model | OPT-175_ - https://news.ycombinator.com/item?id=31260665 - May | 2022 (26 comments) | axg11 wrote: | It really is great that Meta released the notes/logbook. Credit | where credit is due. Very few other academic or industry labs | release materials like this, especially when the reality is so | messy. | | Some interesting takeaways: | | - Meta aren't using any software for scientific logbooks, just | prepending to a document | | - So many hardware/cluster issues. | | - Hot-swapping algorithms is common and likely underreported | (in this case activation functions and optimization method) | | - A well-resourced team didn't solve enough issues to fully | utilize compute resources until >50% of the total time into the | project | domenicrosati wrote: | I wonder what software would be good for a logbook like | this... I just use Google Docs for these kinds of things. | Sure, wandb and Jupyter notebooks are good, but they are not so | good for notes, ideas, and documentation | [deleted] ___________________________________________________________________ (page generated 2022-06-27 23:00 UTC)