[HN Gopher] GPT-3 is no longer the only game in town ___________________________________________________________________ GPT-3 is no longer the only game in town Author : sebg Score : 217 points Date : 2021-11-07 14:53 UTC (8 hours ago) (HTM) web link (lastweekin.ai) (TXT) w3m dump (lastweekin.ai) | supperburg wrote: | I have supported my beliefs on this topic in these threads to the | point of exhausting myself. The tools that we use to find these | agents are the underpinning of AGI, it's coming way faster than | even most people here appreciate, and this development is | intrinsically against the interest of human beings. Please stop | and think, please. | Simon321 wrote: | I argue it's very much in the interest of human beings. It has | been since we first picked up a rock and used it as a hammer. | It's the ultimate tool and has the potential to bring unseen | prosperity. | supperburg wrote: | It won't. You're wrong. This is the perfect illustration. You | think a rock is good, therefore AI is good. You're just | unbelievably wrong. | [deleted] | eunos wrote: | Number of parameters aside, I am really surprised that we haven't | yet reached hundreds of TB of training data. Especially since the | Chinese model used less than 10 TB of data. | visarga wrote: | The GPT-3 family is still too expensive to use, too big to fit in | memory on a single machine. Prices need to come down before large- | scale adoption, or someone needs to invent the chip to hold it | (cheaply). | | The most exciting part about it is showing us there is a path | forward by scaling and prompting, but you can still do much | better with a smaller model and a bit of training data (which can | come from the expensive GPT-3 as well). | | What I expect from the next generation: multi-modality, larger | context, using retrieval to augment input data with fresh | information, tuned to solve thousands of tasks with supervised | data so it can generalize on new tasks better, and some efficient | way to keep it up to date and fine-tune it. On the data part - | more data, more languages - a lot of work. | worik wrote: | The underlying methods seem impractical. GPT-n are an existence | proof - it is possible to make parrot-like software that | generates realistic text. But using these methods it is not | practical. | | Maybe that is a good thing, maybe a bad thing, but unless there | is a breakthrough in methods this is a dead end. Impressive | though. | WithinReason wrote: | https://copilot.github.com/ | mrbukkake wrote: | Can anyone tell me what the value of GPT-3 actually is other than | generating meaningless prose? What would a business use it for? | phone8675309 wrote: | It's good for the university-industrial-business complex - | people writing papers about a model they can't even run | themselves. It practically prints money in journal articles, | travel per diem, and conference honoraria, not even counting | the per-API call rates. | DeathArrow wrote: | >What would a business use it for | | If you think about business uses, you can actually get advice | from Jerome Powell, simulated by GPT-3. | | If someone uses GPT-3 to simulate Warren Buffett, he can extract | even more value. | | https://www.institutionalinvestor.com/article/b1tktmhcfdyqsk... | mrbukkake wrote: | I can't tell whether or not this article is parody... 
is this | a new kind of turing test | vitus wrote: | Somehow I don't think that this is quite how Jerome Powell | would respond in an interview: | | > Interviewer: How do you think a global pandemic would | impact the financial system? | | > Mr. Powell: A pandemic will have impact on the economy. It | will have an impact on the financial system. | | > Interviewer: What would the impact be? | | > Mr. Powell: If a major pandemic occurred, the economy would | be affected, and the financial system would be affected. | | Yes, GPT-3 can provide a convincing chatbot, but it shouldn't | be confused for domain expertise. | robbedpeter wrote: | This is a bad example. Here's an output from gpt-j-6b by | EleutherAI: Prompt( Jerome Powell was interviewed about the | impact of a pandemic on the economy. | | > > Interviewer: What would the impact be? > Mr. Powell: If | a major pandemic occurred) | | Output( it could be very disruptive. > > Interviewer: How | disruptive? > Mr. Powell: Well, it could be disruptive in a | number of ways. > > Interviewer: How so? > Mr. Powell: | Well, one of the first things that would be disrupted would | be the supply chain. ) | | Using prompts well makes a huge difference. | | If you parse the generated output, classify it, then | develop a decision tree that uses further prompts to refine | the response, you can get more sophisticated, valuable | responses. | | The output in the parent is comparable to an off-the-cuff | interview response. If you emulate a deeper thought | process, you can get more meaningful output, and if you use | the right prompts, you can access the semantic networks in | the model related to your domain of interest. | notahacker wrote: | I think the "bad example" is actually the good one, | because it's a reminder that actually you're not getting | business advice from someone with Warren Buffet or Jerome | Powell's understanding of the economy, you're getting | text generated by analysing patterns in other not- | necessarily-applicable text. If you start forcing it in | very specific directions you start getting text that | summarises the commentary in the corpus, but most of that | commentary doesn't come from Warren Buffet or Jerome | Powell and isn't applicable to the future you're asking | it about... | sva_ wrote: | _> Interviewer: Are you in favor of a carbon tax? | | > Mr. Powell: I don't want to get into the details of taxes. | | > Interviewer: Are you in favor of a cap and trade system? | | > Mr. Powell: I don't want to get into the details of a cap | and trade system. | | > Interviewer: How do you think a global pandemic would | impact the financial system? | | > Mr. Powell: A pandemic will have impact on the economy. It | will have an impact on the financial system. | | > Interviewer: What would the impact be? | | > Mr. Powell: If a major pandemic occurred, the economy would | be affected, and the financial system would be affected._ | | Maybe I'm a bit harsh on GPT-3, but I'm not nearly as | fascinated by this kind of output as the author. | teaearlgraycold wrote: | It does pretty well at transforming text into a person's | style of talking. So you could have it re-write any | sentence to sound like a Trump tweet. | renewiltord wrote: | I, too, that sounded like Eliza. Anyway, it looks like | that's a small excerpt from the conversation. | benatkin wrote: | It looks like the dialogue is only on the human end. The | chatbot is treating each question as the first. I think | it sounds a lot like Biden. 
I prefer that to Trump, but | don't like either sort of conversation! | CamperBob2 wrote: | GPT-3 and similar ML/AI projects may have many interesting and | valuable commercial applications, not all of which are readily | apparent at this stage of the game. For instance, it could be | used to insert advertisements for herbal Viagra at | https://www.geth3r3a1N0W.com into otherwise-apropros comments | on message boards, preferably near the end once it's too late | to stop reading. | | Life online is about to become very annoying. | hubraumhugo wrote: | At https://reviewr.ai we're using GPT-3 to summarize product | reviews into simple bullet-point lists. Here's an example with | backpack reviews: https://baqpa.com | staticautomatic wrote: | Did you test it against extractive summarizers? | hubraumhugo wrote: | We experimented with BERT summarization, but the results | weren't too good. Do you have any resources or experiences | in this area? | moffkalast wrote: | That sounds like BERT alright. | cma wrote: | How do you avoid libel? | kingcharles wrote: | Are you confusing libel with something else? Can you | extrapolate what you mean here? Are you saying that they | will be liable for libel (!) if they publish a negative | summary of a product? | cma wrote: | If they mischaracterize a positive review into a negative | summary based on factual mistakes they know the system | makes at a high rate, I would think they would be liable | for libel right? | teaearlgraycold wrote: | I work for a company that re-sells GPT-3 to small business | owners. We help them generate product descriptions in bulk, | Google ads, Facebook ads, Instagram captions, etc. | crubier wrote: | Have you heard of GitHub copilot ? It's based on GPT3 and I can | tell you one thing: it does not generate meaningless prose (90% | of the time) | inglor wrote: | This - it is tremendously valuable to me and I use it all the | time at work. | skybrian wrote: | What do you use it for? | inglor wrote: | Coding, I actually had to forbid it today in a course I | teach because it solves all the exercises :) (given unit | tests with titles students needed to fill those tests in) | singlow wrote: | Isn't that just because others have stored solutions to | these problems in GitHub? | iamcurious wrote: | That is my question too. Is it a fancier autocomplete? Or | does it reason about code? | PeterisP wrote: | In some sense you could think of as a fancy autocomplete | which uses not only code but also comments as input, | looks up previous solutions for the same problem but | (mostly) appropriately replaces the variable names to | those that you are using. | robbedpeter wrote: | It reasons over the semantic network between tokens, in a | feedforward inference pass over the 2k(ish) words or | tokens of the prompt. Sometimes that reasoning is | superficial and amounts to probabilistic linear | relationships, but it can go deeply abstract depending on | training material, runtime/inference parameters, and | context of the prompt. | inglor wrote: | Probably, also I'm sure 99%+ of the code I author isn't | groundbreaking and someone did it before. | worldsayshi wrote: | But what about the potential for intellectual property | problems? | trothamel wrote: | That's beside the point, which is that the output copilot | produces is useful. | worldsayshi wrote: | I don't see how that's besides the point. How can it be | that useful if the output is a such legal mystery? 
| | I'd love to use it but not when there's such a risk of | compromising the code base. | amelius wrote: | How many % of the time does it produce code that compiles? | crubier wrote: | In my experience 95% of the time. And 80% of the time it | outputs code which is better than I would have done myself | on a first approach (it thinks of corner cases, adds | meaningful comments, etc.). It's impressive. | bidirectional wrote: | From my anecdotal experience, the vast majority of the time | (90+%). | amelius wrote: | Interesting. Is there any constraint built into the model | that makes this possible? E.g. grammar, or semantics of | the language? Or is it all based on deep learning only? | crubier wrote: | Deep learning only, I believe. But a real good one. | emteycz wrote: | The overwhelming majority. Whatever used to take me an hour | or two is now a 10-minute task. | ilteris wrote: | I am so confused. Is there a tutorial explaining how you | are using it in the IDE, whatever it is? I use VSCode, curious | if it can be applied. Thanks | crubier wrote: | It works very well with VSCode. It has an integration. It | shows differently than normal autocomplete, it shows just | like Gmail autocomplete (grayed-out text suggestion, and | press tab to actually autocomplete). Sometimes the | suggestion is just a couple tokens long, sometimes it's | an entire page of correct code. | | Nice trick: write a comment describing quickly what your | code will do ("// order an item on click") and enjoy the | complete suggested implementation! | | Other nice trick: write the code yourself, and then just | before your code, start a comment saying "// this code" | and let Copilot finish the sentence with a judgement | about your code like "// this code does not work in case | x is negative". Pretty fun! | icelancer wrote: | Interesting second use case; I use comments like this | already as typical practice and I agree Copilot fills in | the gaps quite well - never thought to do it in | reverse... will give that a shot today. | emteycz wrote: | I also like to do synthesis from example code (@example | doccomment) and synthesis from tests. | icelancer wrote: | I was exceptionally skeptical about it, but it's been very | useful for me and I'm only using it for minor tasks, like | automatically writing loops to pull out data from arrays, | merge them, sort information, make cURL calls and process | data, etc. | | Simply leading the horse to water is enough in something | like PHP: | | // instantiate cURL event from API URL, POST vars to it | using key as variable name, store output in JSON array and | pretty print to screen | | Usually results in code that is 95-100% of the way done. | nradov wrote: | The fact that GPT3 works at all for coding indicates that our | programming languages are too low level and force a lot of | redundancy (low entropy). From a programmer productivity | optimization perspective it should be impossible to reliably | predict the next statement. Of course there might be trade-offs. | Some of that redundancy might be helping maintenance | programmers to understand the code. | hans1729 wrote: | >From a programmer productivity optimization perspective it | should be impossible to reliably predict the next statement | | Why? 99.9% of programming being done is composition of | trivial logical propositions, in some semantic context. 
The | things we implement are trivial, unless you're thinking | about symbolic proofs, etc. | tshaddox wrote: | I think that's precisely the problem the parent commenter | is describing. | alephaleph wrote: | That would only follow if we were trying to optimize code | for brevity, and I have no clue why that would be your top | priority. | mpoteat wrote: | I have indeed seen codebases where it seems like the | programmer was being charged per source code byte. Full | of single letter methods and such - it takes a large | confusion of ideas to motivate such a philosophy. | nradov wrote: | Not at all. Brevity (or verbosity) is largely orthogonal | to level of entropy or redundancy. In principle it ought | to be possible to code at a higher level of abstraction | while still using understandable names and control flow | constructs. | mpoteat wrote: | Indeed, in the limit of maximal abstraction, i.e. semantic | compression, code becomes unreadable by humans in practice. | We can see this in code golf competitions. | dharmaturtle wrote: | Let me rephrase: | | > The fact that GPT3 works at all for English indicates | that English is too low level and forces a lot of | redundancy (low entropy). | | I don't think the goal is to compress information/language | and maximize "surprise". | Traster wrote: | I think this is kind of true, but also kind of not true. | Programming, like all writing, is the physical | manifestation of a much more complex mental process. By the | time that I _know_ what I want to write, the hard work is | done. In that way, you can think of co-pilot as a way of | increasing the WPM of an average coder. But the WPM isn't | the bit that matters. In fact, almost the only thing that | matters is the bits you won't predict. | pharmakom wrote: | Code is easier to write than read and maintain, so how useful | is something that generates pages of 90% correct code? | ALittleLight wrote: | It's not useful if you use it to auto complete pages of | code. It is useful to see it propose lines, read, and | accept its proposals. Sometimes it just saves you a second | of typing. Sometimes it makes a suggestion that causes you | to update what you wanted to do. Sometimes it proposes | useless stuff. On the whole, I really like it and think | it's a boon to productivity. | ghoomketu wrote: | Yes, I'm used to it now, but the first time it started doing its | thing, I wanted to stop and clap for how jaw-dropping and | amazing this technology is. | | I was a JetBrains fan but this thing takes productivity to a | whole new level. I really don't think I can go back to my | normal programming without it anymore. | kuschku wrote: | Luckily, there's a JetBrains addon for it. | inglor wrote: | Someone at work showed me copilot works on WebStorm today | (I also use VSCode). | supperburg wrote: | That's like if an alien took Mozart as a specimen and then | disregarded the human race because this human, while making | interesting sounds, does nothing of value. You have to look at | the bigger picture. | lysecret wrote: | Hey, for a long time I was also very sceptical. However, I can | refer you to this paper for a really cool application: | https://www.youtube.com/watch?v=kP-dXK9JEhY. They basically | use clever GPT-3 prompting to create a dataset, which you then train | another model on. Besides, you can prompt these models to get | (depending on the use case) really good few-shot performance. | And finally, GitHub Copilot is another pretty neat application. 
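A small aside on the comment-driven prompting that crubier and icelancer describe above: you lead with a natural-language comment and let Copilot propose the implementation. The sketch below shows the pattern in Python rather than PHP; the function name, URL, and parameters are invented for illustration, and the body is only typical of the kind of completion Copilot tends to suggest, not actual Copilot output.

    import json
    import requests

    # POST the given variables to the API URL, parse the JSON response,
    # and pretty-print it to the screen
    def post_and_pretty_print(api_url, post_vars):
        response = requests.post(api_url, data=post_vars)
        response.raise_for_status()
        print(json.dumps(response.json(), indent=2, sort_keys=True))

    post_and_pretty_print("https://api.example.com/items", {"query": "backpack", "page": 1})

In practice the leading comment is the prompt: the more specific it is about inputs, outputs, and edge cases, the closer the first suggestion lands.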
| micro_cam wrote: | Actually using this class of models (larger transformer-based | language models) to generate text is, to me, the least | interesting use case. | | They can also all be adapted and fine-tuned for other tasks in | content classification, search, discovery, etc. Think facial | recognition for topics. Want to mine a whole social network for | anywhere people are talking about _______ even indirectly, with | a very low false negative rate? You want to fine-tune a | transformer model. | | BERT tends to get used for this more because it is freely | available, established, and not too expensive to fine-tune, but I | suspect this is what Microsoft licensing GPT-3 is all about. | warning26 wrote: | GPT-3 is fairly effective at summarization, so that's one | potential business use case: | | https://sdtimes.com/monitor/using-gpt-3-for-root-cause-incid... | Tijdreiziger wrote: | https://replika.ai/ | amelius wrote: | I hope that one day it will allow me to write down my thoughts | in bullet-list form, and it will then produce beautiful prose | from it. | | Of course this will be another blow for journalists, who rely | on this skill for their income. | DeathArrow wrote: | I played with GPT-3 by giving it long news stories. It actually | replied with more meaningful titles than the journalists | themselves used. | rm_-rf_slash wrote: | Perhaps GPT-3 was optimizing to deliver information while | news sites these days optimize titles to get clicks. | ailef wrote: | You can prompt GPT-3 in ways that make it perform various tasks | such as text classification, information extraction, etc... | Basically you can force that "meaningless prose" into answers | to your questions. | | You can use this instead of having to train a custom model for | every specific task. | DeathArrow wrote: | Chatbots are one use. I think you might use it for customer | support. | | One example of a GPT-3 powered chatbot: | https://www.quickchat.ai/emerson | jszymborski wrote: | While the generation is fun and even suitable for some use | cases, I'm particularly interested in its ability to take _in_ | language and use it for downstream tasks. | | A good example is DALL-E[0]. Now, what's interesting to me is | the emerging idea of "prompt engineering" where once you spend | long enough with a model, you're able to ask it for some pretty | specific results. | | This gives us a foothold in creating interfaces whereby you can | query things using natural language. It's not going to replace | things like SQL tomorrow (or maybe ever?) but it certainly is | promising. | | [0] https://openai.com/blog/dall-e/ | 13415 wrote: | Automatic generation of positive fake customer reviews on | Amazon, landing pages about topics that redirect to attack and | ad sites, fake "journalism" with auto-generated articles mixed | with genuine press releases and viral marketing content, | generating fake user profiles and automated karma farming on | social media sites, etc. etc. | phone8675309 wrote: | > fake "journalism" with auto-generated articles mixed with | genuine press releases and viral marketing content | | How would you tell the difference from the real thing these | days? | DeathArrow wrote: | The state of journalism is so poor, I'd rather take some | AI-generated articles instead. | akelly wrote: | https://copy.ai/ | teaearlgraycold wrote: | Hey, I work there! To be honest it's still very much a | prototype. We have big plans for the next few months. 
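To make ailef's point about prompting concrete, here is a minimal sketch of few-shot classification against the OpenAI completion API as it existed at the time (openai.Completion.create with a named engine). The reviews, labels, and parameter choices are invented for illustration and are assumptions, not a recommendation; the same pattern handles extraction by swapping in question/answer examples.

    import os
    import openai  # the 2021-era OpenAI Python client

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # A few labelled examples followed by the text to classify;
    # the model completes the missing label.
    prompt = "\n".join([
        "Classify the sentiment of each product review as Positive or Negative.",
        "",
        "Review: The zippers broke after two weeks.",
        "Sentiment: Negative",
        "",
        "Review: Fits my 16-inch laptop perfectly and the straps are comfortable.",
        "Sentiment: Positive",
        "",
        "Review: Looks nice, but the seams started fraying on the first trip.",
        "Sentiment:",
    ])

    completion = openai.Completion.create(
        engine="davinci",  # base GPT-3 engine exposed by the paid API at the time
        prompt=prompt,
        max_tokens=3,
        temperature=0,     # deterministic: we only want the label
    )
    print(completion.choices[0].text.strip())  # e.g. "Negative"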
| mark_l_watson wrote: | You can try it yourself - apply for a free API license from | OpenAI. If you like to use Common Lisp or Clojure then I have | examples in two of my books (you can download for free by | setting the price to zero): https://leanpub.com/u/markwatson | pyb wrote: | I know of some credible developers who were struggling to get | access, so YMMV | mark_l_watson wrote: | It took me over a month, so put in your request. Worth the | effort! | [deleted] | moffkalast wrote: | I put in a request months ago, I think they're not approving | people anymore. | mark_l_watson wrote: | You might try signing up directly for a paid, non-free | account, if that is possible to do. I was using a free | account, then switched to paying them. Individual API calls | are very inexpensive. | warning26 wrote: | Neat to see more models getting closer, though it appears only | one so far has exceeded GPT-3's 175B parameters. | | That said, what I'm really curious about is how those other models | stack up against GPT-3 in terms of performance -- does anyone | know of any comparisons? | sillysaurusx wrote: | I'm surprised that no one has answered for three hours! | | The answer is at https://github.com/kingoflolz/mesh-transformer-jax | | It has detailed comparisons and a full breakdown of the | performance, courtesy of Eleuther. | 6gvONxR4sf7o wrote: | I was so frustrated when that was first announced because it | didn't include those metrics, and everyone ate it up like the | models were equivalent. | 6gvONxR4sf7o wrote: | Whenever I've seen language modeling metrics, GPT-3's largest | model has been at the top. If you see a writeup that doesn't | include accuracy-type metrics, you're reading a sales pitch, | not an honest comparison. | machiaweliczny wrote: | There's Wu Dao 2.0 and Google has 2 models with 1T+ params. | atty wrote: | For clarity, I believe these are all mixture-of-experts | models, where each input only sparsely activates some subset | of the full model. This is why they were able to make | such a big jump over the "dense" GPT-3. Not really an | apples-to-apples comparison. | pyb wrote: | +1, does the new generation match or exceed GPT-3 in terms of | relevance? Is there a way for a non-AI-researcher to | understand how the benchmarks measure this? Bigger does not | mean better. | GhettoComputers wrote: | >However, the ability of people to build upon GPT-3 was hampered | by one major factor: it was not publicly released. Instead, | OpenAI opted to commercialize it and only provide access to it | via a paid API. This made sense given OpenAI's for profit nature, | but went against the common practice of AI researchers releasing | AI models for others to build upon. So, since last year multiple | organizations have worked towards creating their own version of | GPT-3, and as I'll go over in this article at this point roughly | half a dozen such gigantic GPT-3 esque models have been | developed. | | Seems like aside from Eleuther.ai you can't use the models freely | either, correct me if I'm wrong. | andreyk wrote: | I believe you are correct, at least for GPT-3 scaled things. | Hopefully that'll change with time, though. | rg111 wrote: | The future is not as dark as it seems because of the rat race of | megacorps. | | You can use reduced versions of language models with extremely | good results. | | I was involved in training the first-ever GPT-2 for the Bengali | language, but with 117 million parameters. 
| | It took a month's effort (training + writing code + setup) and | about $6k in TPU cost, but Google Cloud covered it. | | Anyway, it is surprisingly good. We fine-tuned the model for | several downstream tasks and we were shocked when we saw the | quality of generated text. | | I fine-tuned this model to write Bengali poems with a dataset of | just about 2k poems and ran the training for 20 minutes in GPU | instance of Colab Pro. | | I was really blown away by the quality. | | The main training was done in JAX, and it is much faster and | seamless than PyTorch XLA, much _better_ than TensorFlow in every | way. | | So, my point is, although everyone is talking hundreds of | billions of parameters and millions in training cost, you can | still derive practical value from language models, and that too, | at a low cost. | amelius wrote: | > The future is not as dark as it seems because of the rat race | of megacorps. | | Just wait until NVidia comes with a "Neural AppStore" and | corresponding restrictions. Then wait until the other GPU | manufacturers follow suit. | rg111 wrote: | Much of the work done is fully open source and are liberally | licensed. | | DeepMind and OpenAI have a bad rep in this regard. | | But a lot is available for free (as in beer _and_ speech). | | And most of the research papers are released in arXiv. It's | very refreshing. | | The bottleneck is not the knowledge or code, but the compute. | People are fighting this in innovative ways. | | I have been an inactive part of Neuropark that first demoed | collaborative training. A bunch of folks (some of them close | to laypeople) ran their free Colab instances and trained a | huge model. You can even utilize a swarm of GT1030s or | something like that. | | Also, if you have shown signs of success, you are very likely | to have people willing to sponsor your compute needs, case in | point- Eluether AI. | | The situation is far from ideal with this megacorps rat race | [0], and NLP research being more and more inaccessible, but | it is not completely dark. | | [0]: I, along with many respected figures tend to think that | this scaling up stuff approach is not even _useful_. We can | write good prose with GPT-3 nowadays, that are, for all | intents and purposes, indistinguishable from text written by | humans. But we are far, far away from true _understanding_. | These models don 't really _understand_ anything and are not | even "AI", so to speak. | | The Transformer architecture, the backbone of all these | approaches- is too brute-force-y for my taste to be | considered something that can mimic or, further- _be_ | intelligent. | cmrajan wrote: | Good to know. We're trying to attempt something similar[1] but | for Tamil. I'm also surprised how well the OSS language model & | library AI4Bharat [2] performs for NLP tasks against SoTA | systems. Is there a way to contact you? [1] | https://vpt.ai/posts/about-us/ [2] | https://ai4bharat.org/projects/ | rg111 wrote: | Among a master's degree, a consultancy gig, personal research | and study, and finding unis abroad- I am living a hectic | life. | | I don't see how I can be of help. | | But I can talk. Leave me something through which I can reach | you. And I will reach you within a week. | xyproto wrote: | I think companies should be banned from having "Open" in their | names. | evergrande wrote: | OpenAI takes the Orwellian cake. | Tenoke wrote: | I hear a lot of low effort takes about OpenAI but how exactly | is providing your service via a paid API the "Orwellian | cake"? 
Is this really the most (or even at all) Orwellian | practice for you? | leoxv wrote: | https://en.wikipedia.org/wiki/Doublespeak | TulliusCicero wrote: | I think it's more the contrast where they claim, via their | name, to be open, but actually aren't. | | If their name was ProfitableAI, there'd probably be fewer | complaints. | c7DJTLrn wrote: | "Open"AI but you can only use it how we want you to and no, you | can't run it yourself. | moffkalast wrote: | The only thing open in OpenAI is your wallet. | moffkalast wrote: | > most recently NVIDIA and Microsoft teamed up to create the 530 | billion parameter model Megatron-Turing NLG | | Get it, cause it's a generative transformer? Hah | DeathArrow wrote: | People were blaming cryptocurrency miners for the prices of | GPUs, when in fact it was the AI researchers who bought all the | GPUs. :D | | I wonder: what if somebody designed an electronic currency awarded | as payment for general GPU computations instead of just computing | hashes? You pay some $ to train your model and the miner gets | some coins. | | Everyone is happy, electricity is not wasted, and the GPUs get | used for a reasonable purpose. | nathias wrote: | Yes, this is an old idea (which I really like) but it hasn't | really taken off yet. GridCoin was one example, where you | solved BOINC problems; or RLC, which is for more general | computation. | rewq4321 wrote: | The problem is that, currently, large ML models need to be | trained on clusters of tightly-connected GPUs/accelerators. | So it's kinda useless having a bunch of GPUs spread all over | the world with huge latency and low bandwidth between them. | That may change though - there are people working on it: | https://github.com/learning-at-home/hivemind | Kiro wrote: | It hasn't taken off because it doesn't work. PoW only works | for things that are hard to calculate but easy to verify. Any | meaningful result is equally hard to verify. | snovv_crash wrote: | It's easy to verify ML training - inference on a test set | has lower error than it did before. | | Training an NN is much slower than inference (1000x at | least) because it has to calculate all of the gradients. | petters wrote: | > Any meaningful result is equally hard to verify. | | This is very much not true. A central class in complexity | theory is NP, whose problems are hard to answer but easy to verify | if the answer is yes. | | E.g., is there a path visiting all nodes in this graph of | length less than 243000? Hard to answer, but easy to check | any proposed answer. | PeterisP wrote: | The current way of training is efficient when compute is | located in a single place and is colocated with large | quantities of training data. Distributing small parts of | computation to remote computers is theoretically possible (and | an active direction of research) but currently not preferable | nor widely used; you really need very high bandwidth between | all the nodes to constantly synchronize the | hundreds-of-gigabytes-sized weights they're iterating on and the | resulting gradients. | bckr wrote: | This may not be true in the future. There is some work being | done on distributed neural net training. I can't recall the | name of the technique at the moment, but a paper came out in | the last year showing results comparable with backprop that | only required local communication of information (whatever | this technique's alternative to gradients is). | evergrande wrote: | First the electricity morality police came for crypto and I | said nothing. 
| | Then they came for AI, video games, HN... | Nextgrid wrote: | My understanding is that proof-of-work is intentionally | wasteful; the objective is to make 51% attacks (where a single | entity controls at least 51% of the global hashrate) infeasible | by attaching a cost to the mining process. | | Making the mining process produce useful output that can be | resold nullifies the purpose as it means an attacker can now | mine "for free" as a byproduct of doing general-purpose | computations (as opposed to tying up dedicated hardware), | lowering the barrier for a 51% attack dramatically. | magikabula wrote: | If everyone offers GPUs, it's the same game. If I buy more | GPUs I will get more money, so the average payment for a person | with a single GPU or a small bunch of GPUs will be low. | | And second, the principles of electronic currency are different | from gold/money. That's why crypto uses GPUs ;) | qwertox wrote: | If I were to run GPT-3 on my 70000 browser bookmarks, what kind | of insights could I get from that? | | Only by analyzing the page title (from the bookmark, not by | re-fetching the URL) and possibly also the domain name. | supermatt wrote: | GPT-3 is a text generator, so I doubt you would get anything of | use. You can't even supply such a large input to GPT-3. | teaearlgraycold wrote: | GPT-3 is also a classifier and data-extractor. | | You could give it a couple dozen bookmarks with example | classifications and then feed it a new bookmark and ask GPT-3 | what category the page belongs in. Repeat for the entire data | set. | | For data extraction you could ask questions about the titles. | Maybe have it list all machine-learning model names that | appear in the set of bookmark titles. | keithalewis wrote: | print gpt.submit_request("Give me insights") | | >>> You are spending way too much time browsing. | air7 wrote: | So is there any one of them that I could play around with? | sillysaurusx wrote: | https://6b.eleuther.ai | lazylion2 wrote: | AI21 Labs' 178B parameter model | | https://studio.ai21.com/ | ComodoHacker wrote: | Are we heading to the (distant) future where to make progress in | any field you _have_ to spend big $$$ to train a model? | jowday wrote: | That's not even distant - most of the self-supervised vision | and language models at the bleeding edge of the field require | huge compute budgets to train. | iamcurious wrote: | We are already there. Machine learning is the flavor of A.I. | that keeps business barriers of entry high. If we had invested | in symbolic A.I., things would be different. A similar thing | happens with programming language flavors. PHP lowers barriers | of entry so it is discredited by the incumbents. | j45 wrote: | Your point about incumbents not wanting it to be easier to | create beginners in a language or technology is very | understated. | | Excluding participation in having the time and resources | available to overcome the initial inertia required to become | productive is a form of opportunity and earning segregation. | | Despite having a background in your tech, there is little | more satisfying than people experiencing putting tech to | work for them, rather than the other way around or being | dependent on others. | [deleted] | lostdog wrote: | The difference between ML and symbolic AI is that ML works | and symbolic AI doesn't. At my job, dropping the | computational load of our ML models is heavily invested in, | and every success is celebrated. 
Everybody wants it to be | easier and cheaper to train high quality models, but some | things are still intrinsically hard. | CodeGlitch wrote: | > The difference between ML and symbolic AI is that ML | works and symbolic AI doesn't. | | IBM managed to beat Garry Kasparov using symbolic AI, did | they not? So in what way does it not work? | lostdog wrote: | > IBM managed to beat Garry Kasparov using symbolic AI, | did they not? So in what way does it not work? | | Ok, I should be clearer. ML approaches are way way better | than symbolic approaches. Given almost any problem, it is | much much easier to make an ML approach work than any | symbolic approach. | | Yes, chess was first solved symbolically, but it's since | been solved by ML better and more easily, to the point | that Stockfish now incorporates neural nets [1]. ML has | also given extremely high levels of performance on Go, | Starcraft, DoTA, and on protein folding, image | recognition, text processing, speech recognition, and | pretty much everything else. | | I would challenge you to name any (non-simple) problem | where traditional AI methods are still state of the art. | | [1] https://stockfishchess.org/blog/2020/stockfish-12/ | goodside wrote: | "I would challenge you to name any (non-simple) problem | where traditional AI methods are still state of the art." | | Lossless file compression. As far as I know none of the | algorithms in widespread use are neural-based, despite | the fact that compression is clearly a rich statistical | modeling problem, at least on par with GPT-3-style | language understanding in difficulty. There are published | attempts to solve the problem with neural networks, but | they simply don't work well enough to date. Modern | solutions also still use old-fashioned AI ingredients | like compiled dictionaries of common natural-language | words -- any other domain where nat-lang dictionaries are | useful has been conquered by neural solutions, e.g. | spelling and grammar checkers. | _game_of_life wrote: | I'm far from an expert in this subject but doesn't this | ranking of large text compression algorithms with NNCP | coming first suggest that neural-nets are pretty great at | compression? | | http://mattmahoney.net/dc/text.html | | https://bellard.org/nncp/ | | I don't see examples of high performing symbolic AI based | compression algorithms anywhere, but again I am very | ignorant, do you have examples? | CodeGlitch wrote: | Thanks for clearing that up, I do agree that ML-based AI | has surpassed symbolic approaches in every field. | adgjlsfhk1 wrote: | they didn't. that was just alpha beta search with some | custom hardware to speed it up. also at this point, both | of the strongest chess ai (stockfish and lc0) are using | neural networks and are roughly 1000 elo above where | deep blue was (and most of that is from software, not | hardware) | shmageggy wrote: | > _just alpha beta search_ | | I will cling to these goal posts every time. Search was | and still is AI, unless you think Russell and Norvig | should have named the field's foundational textbook | something other than "Artificial Intelligence: A Modern | Approach" | PeterisP wrote: | 1. There's a world of problems (such as "perception-related" | e.g. vision and NLP) which we tried to solve for | decades with symbolic AI and got worse results than what | nowadays first-year students can do as homework with | ML; | | 2. 
For your example of chess, for some time now ML | engines are pretty much untouchable by engines based on | pre-ML methods. | CodeGlitch wrote: | Yes I agree with all your points - I was however | responding to the point being made that symbolic AI | "wasn't useful"...which in the past it was. Perhaps in | the future some new method or breakthrough will mean it | becomes useful once again? | panabee wrote: | this is a great point. | | much like deep learning was invented decades ago but | didn't become feasible until technology caught up, could | the same be true for symbolic AI? | | i.e., is the ceiling for symbolic AI technical and | transient or fundamental and permanent? | PeterisP wrote: | My feeling is that even in our own thinking symbols are | used mostly to communicate our (inherently non-symbolic) | thoughts to others or record them; i.e. they are a | solution to a bandwidth-limited transfer of information | while the actual thinking process happens with concepts | that have more similarity to collections of vague | parameters and associations which can be compressed to | symbols only imperfectly with losses. | | From that perspective, I don't see how symbolic AI would | be competitive but there would be a role for symbolic AI | in designing systems that can be comprehensible for | humans, but perhaps just as a distillation/compression | output from a non-symbolic system. I.e. have a strong | "black box" ML system that learns to solve a task, and | then have it construct a symbolic system that solves that | task worse, but in an explainable way. | iamcurious wrote: | >The difference between ML and symbolic AI is that ML works | and symbolic AI doesn't. | | There was a point when it was the other way around, this is | not static but the result of resources being poured. The | data heavy, computational heavy, black box style of ML | gives power to large business over small business. So it's | seen as a safer bet than symbolic A.I. This in turn makes | it work better, which makes it an even safer bet. Notice | that startups dream of being big business so they still | pick ML. | | Also notice that in some domains ML is still behind | symbolic A.I., for instance a lot of robotics and | autonomous vehicles. | DonHopkins wrote: | PHP wasn't discredited by the incumbents. It was discredited | by its creator. | | "I'm not a real programmer. I throw together things until it | works then I move on. The real programmers will say Yeah it | works but you're leaking memory everywhere. Perhaps we should | fix that. I'll just restart Apache every 10 requests." | -Rasmus Lerdorf | | "I was really, really bad at writing parsers. I still am | really bad at writing parsers." -Rasmus Lerdorf | | "We have things like protected properties. We have abstract | methods. We have all this stuff that your computer science | teacher told you you should be using. I don't care about this | crap at all." -Rasmus Lerdorf | iamcurious wrote: | To most programmers that doesn't discredit PHP at all. He | cares about a working product, much like 90% of | programmers, who don't have the privilige to worry about | theory. They just need an ecommerce, or blog or whatever, | running asap. To use a pg's analogy, they are there to | paint not to worry about painting chemistry. | | The incumbents do discredit PHP though. For instance, | facebook was built on PHP, and still runs on it. They used | the language of personal home pages to give every person on | the planet a personal home page. 
Nevertheless, once they | succeeded they forked PHP with a new name and isolated devs | culturally. | YetAnotherNick wrote: | Training models is getting cheaper. GPT-3 is one of the very few | examples where it is so expensive. In the end it all | depends on the size of the data you have, so that you can scale up | the model without overfitting. And internet text is one of the | only datasets that is this big. | minimaxir wrote: | Fortunately, costs for training superlarge models are coming | down rapidly thanks to TPUs (which was the approach used to | train GPT-J 6B) and DeepSpeed improvements. | Nextgrid wrote: | Are there any TPUs that can be purchased off-the-shelf and | then owned, like you can do with a CPU or GPU? Or are you just | limited to paying rent to cloud providers and ultimately being | at their mercy when it comes to pricing, ToS, etc? | 6gvONxR4sf7o wrote: | No, but you probably aren't going to buy an A100 either, so | it's a moot point. ___________________________________________________________________ (page generated 2021-11-07 23:00 UTC)