[HN Gopher] GPT-3 can run code
       ___________________________________________________________________
        
       GPT-3 can run code
        
       Author : maytc
       Score  : 163 points
       Date   : 2022-03-29 15:31 UTC (7 hours ago)
        
 (HTM) web link (mayt.substack.com)
 (TXT) w3m dump (mayt.substack.com)
        
       | unixhero wrote:
       | Great so how do I run GPT-3 on my own hardware at home?
        
         | mr_toad wrote:
         | It's not available to the public or open source so you can't.
          | Only the smallest models might run on a single GPU; the
          | largest would need a large grid.
        
       | DC-3 wrote:
       | Very far from an expert on ML, but isn't GPT-3 trivially not
       | Turing Complete since it halts deterministically?
        
         | [deleted]
        
       | charcircuit wrote:
       | >Is GPT-3 Turing complete? Maybe.
       | 
        | It's obviously not. To handle infinite loops it would need to
        | solve the halting problem, which is not possible.
        
         | anyfoo wrote:
         | I don't quite understand your answer. You don't need to solve
         | the halting problem to be Turing complete, quite obviously. Why
         | would GPT-3 need to in order to be?
        
       | mountainriver wrote:
       | This is such an interesting field but I think there needs to be
       | more focus on determinism and correctness. The stuff that's
       | happening with retrieval transformers is likely where this is
       | heading
        
       | a-dub wrote:
       | is there a search engine for the training data so that one can
       | verify that it is actually performing novel operations and not
       | just quoting back stuff from its incredibly large training set?
        
       | algon33 wrote:
       | If I remember rightly, the AlphaCode paper includes a list of
       | benchmarks, including the results of a finetuned GPT-3 for
        | coding. I think they did it because Codex wasn't available to
        | them when they were doing their tests, but I might be wrong
        | there.
        
       | imranq wrote:
        | An interesting research direction would be to see how much
        | GPT-3 deviates as we get more precise on various computational
        | tasks. Possibly this would give some measure of the concepts
        | the model has learned.
        
         | sho_hn wrote:
         | Do we today have any test suites/benchmarks for models along
         | those lines?
        
       | kaetemi wrote:
       | It has a ton of programming books in its training data. It only
       | "runs" anything that's close enough to any samples it has seen
       | that included output. Anything complex, and it fails, because it
       | does not reason about it logically. It's bad at the same things
       | humans are bad at.
        
         | mr_toad wrote:
         | Human programmers rely on intuition and experience much more
         | than some people give them credit for. An experienced
         | programmer can find common errors quickly, simply because
         | they've seen (and made) so many.
         | 
         | Being able to intuit what a block of code does is actually a
         | core skill; having to actually step through code in your head
         | is slow and difficult.
        
       | Avalaxy wrote:
       | Just because you can, doesn't mean that you should. For some
       | things it's just better to use a rules-based engine that is
       | always correct, rather than a heuristics based algorithm that
       | gives answers that are merely close.
        
         | tasty_freeze wrote:
         | I don't think the author of the piece (or anyone for that
         | matter) thinks GPT-3 should be used for running programs or
         | evaluating functions.
         | 
         | It is being discussed because it is surprising that GPT-3 can
         | do it at all. It is worth investigating what types of emergent
         | knowledge and behavior are encoded in the trained network, as
         | the boundaries of its capabilities may help illuminate future
         | neural network architecture design.
        
       | kevincox wrote:
       | I find it quite interesting that in the JSON to YAML example it
       | reordered the list. If this was an access control list that could
       | be a serious security issue that could have easily been missed in
       | review. (Especially if dozens of files like this were changed at
        | once). Of course a malicious user could have done this as well
        | and likely gotten it past code review, but the fact that it was
        | accidental is scarier in a way.
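        | 
        | For illustration, a minimal sketch in Python of why that
        | matters (hypothetical first-match rules, not the article's
        | example): the same entries, reordered, change the answer.
        | 
        |     # Hypothetical ACL; under first-match evaluation the
        |     # order decides which rule wins.
        |     original = [
        |         {"user": "mallory", "action": "deny"},
        |         {"user": "*", "action": "allow"},
        |     ]
        |     # Same entries, silently reordered during "conversion".
        |     reordered = list(reversed(original))
        | 
        |     def allowed(rules, user):
        |         # Act on the first rule that matches the user.
        |         for rule in rules:
        |             if rule["user"] in (user, "*"):
        |                 return rule["action"] == "allow"
        |         return False
        | 
        |     print(allowed(original, "mallory"))   # False
        |     print(allowed(reordered, "mallory"))  # True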
        
       | timdellinger wrote:
       | I assume that GPT-3 is just exhibiting rote memory. For small
       | numbers, it has accurate answers memorized from the training set,
       | but for larger numbers, it just "remembers" whatever is close...
       | hence the ability to estimate.
       | 
       | My take is not that GPT-3 can run code, but rather that GPT-3 has
       | memorized what code looks like and what the output looks like.
        
       | lopatin wrote:
       | Can someone explain for a dummy how this is possible? How does it
       | know that range() is zero indexed? Was it specifically trained on
       | Python input/function/output data? Or did it just "learn" it? Do
       | the researchers know how it learned it?
       | 
       | Does it actually "run" the code? Like, if it was looping over 1
       | billion iterations would it take 1B times longer than if it was
       | just one iteration? I have so many questions.
        
         | MauranKilom wrote:
         | > How does it know that range() is zero indexed?
         | 
         | If you read through all of the internet once, would _you_ know
         | that range() is zero indexed?
         | 
         | > Like, if it was looping over 1 billion iterations would it
         | take 1B times longer than if it was just one iteration?
         | 
         | It clearly cannot, because querying the network for a token
         | executes the exact same sequence of operations every time.
         | 
         | But it's very impressive that it can basically recognize the
         | Collatz Conjecture in the code and mostly guess in the right
         | ballpark for the results.
         | 
          | The fact that it's just likening (in a loose sense) inputs to
         | inputs it has seen is quite visible in the f(g(x)) vs g(f(x))
         | behavior - the former is significantly more common, so it
         | struggles to work with the latter.
        
         | etskinner wrote:
         | It seems more likely that it learned it. If you knew nothing
         | about Python, but understood the word "for" a little, and
         | understood code a little, you're likely to figure out that
         | range() is zero-indexed after you see something like this a few
         | times
         | 
          | >>> for i in range(3): print(i)
          | ...
          | 0
          | 1
          | 2
        
           | lopatin wrote:
           | My mind is just blown that it learned a language runtime
           | based on examples. What would happen if you gave it an
            | infinitely recursive function? It can't stack overflow,
           | there's no stack! Wait, is there?
        
             | stevenhuang wrote:
             | My guess is it would respond with the standard stack
             | overflow error, from examples of similar output posted in
             | its training set.
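              | 
              | For comparison, what CPython itself prints for such a
              | function (and so the kind of message GPT-3 would have
              | seen in its training data) is a RecursionError:
              | 
              |     def forever():
              |         # No base case: recurse until the limit.
              |         return forever()
              | 
              |     try:
              |         forever()
              |     except RecursionError as e:
              |         print(e)  # maximum recursion depth exceeded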
        
         | lucidrains wrote:
         | https://alphacode.deepmind.com/ gives you a glimpse inside of
         | what emerged from a similar attention net trained on code.
          | However, whether the attention net has been applied to pixels,
          | language, or amino acid sequences, the resultant
          | representations are a bit beyond human reasoning, even if we
          | can examine what individual attention heads are 'looking' at.
        
           | [deleted]
        
           | [deleted]
        
           | [deleted]
        
         | mhh__ wrote:
         | How do you know a range is zero indexed? (As in how is it
         | stored in your brain)
        
           | lopatin wrote:
           | I have no idea how it's stored in my brain. Is that the same
           | way it's stored in GPT-3?
        
       | ogogmad wrote:
       | Theorem: There are infinitely many prime numbers.
       | 
       | Proof:
       | 
        |  _[GPT-3's "proof"]_
       | 
       | Let p be a prime number. Then by definition, p is a positive
       | integer that has no positive divisors other than 1 and p.
       | 
       | Now consider the integer q = p + 1. We will show that q is not a
       | prime number, which will prove that there are infinitely many
       | prime numbers.
       | 
       | Since p is a positive divisor of q, q cannot be prime.
       | 
       |  _[...another attempt with the same prompt]_
       | 
       | We know that there are infinitely many natural numbers. Let's
       | suppose that there are only finitely many prime numbers. Then we
       | can list them all:
       | 
       | p_1, p_2, p_3, ..., p_n
       | 
       | Now, consider the number
       | 
       | N = p_1 p_2 p_3 ... p_n + 1
       | 
       | This number is greater than any of the prime numbers in our list,
       | so it cannot be divisible by any of them. But it is also a
       | natural number, so it must be prime. This contradicts our
       | assumption that there are only finitely many prime numbers, so we
       | must conclude that there are infinitely many prime numbers.
       | 
       | ~~ _Personal remark_ : I had to give it the same prompt like 5
       | times to get a correct-ish proof. The first attempts came out
       | like the first example.~~ The proof is still wrong.
        
         | [deleted]
        
         | nonameiguess wrote:
         | This isn't really the "human level mathematician" equivalent
         | task anyway. A human mathematician's main purpose isn't to
         | memorize and reproduce proofs generated by other people. It's
         | to prove original results no one else has proven before. To
         | remember and reproduce existing proofs, I just typed "proof
         | infinitely many primes" into DuckDuckGo and it gave me plenty
         | of correct results.
        
           | karpierz wrote:
           | That's like saying "standing still" isn't a human-level
           | sprinter's task. In principle, yes, nothing in the 100m
           | sprint requires that you need to be able to stand still. In
           | practice, I would be very skeptical of someone who can't
           | stand claiming they can sprint.
        
           | zardo wrote:
            | It's a human-level mathematics student problem. If it can't
            | determine that its proof is nonsense here, there's little
            | hope it could produce any worthwhile original work.
        
         | jameshart wrote:
         | What does GPT-3 come up with if you ask it for a proof that
         | there are a finite number of primes? Or that pi is rational?
         | 
         | I guess it would stitch together some more seemingly sensible
         | statements that also don't quite add up to a rigorous proof?
        
         | [deleted]
        
         | gnulinux wrote:
          | Both proofs are wrong; the second one is closest. The second
          | one should not claim that N is prime (it likely isn't). It
          | should say that N is not divisible by any of the p_i, and
          | since by the Fundamental Theorem of Arithmetic N = q_1^c_1 *
          | ... * q_k^c_k for some primes q_i, none of those q_i can be
          | in {p_i}, which shows that a finite list of all primes is
          | impossible.
        
         | brian_cloutier wrote:
         | Interestingly, these attempts are about the same as what pops
         | up when I try to remember the proof:
         | 
          | - It's a proof by contradiction
          | 
          | - The key step is in taking the finite list of primes,
          | multiplying them together, and adding 1
          | 
          | I then try to flesh out the details; it might take a second to
          | realize that this new number is also prime, and then a few
          | moments more to remember the exact rationale why.
         | 
         | Along the way the proof lives in a kind of superposition where
         | I'm not clear on the exact details. The "proofs" you gave here
         | seem to be serializations of a similar superposition! GPT-3
         | seems to remember the proof about as well as I do, but it's
         | missing the final sanity check which tweaks the proof until all
         | the pieces correctly fit together.
         | 
         | In this case, you seem to be performing a version of this
         | sanity check by running the prompt multiple times until a
         | correct answer comes out. I wonder if it's possible to prove
         | something more obscure using a similar process: GPT-3 comes up
         | with ideas and the human sanity checks.
        
           | ctoth wrote:
           | I believe this recent paper demonstrates a method for
           | allowing these large language models to perform this "sanity
           | check" automatically[0].
           | 
           | [0]: Self-Consistency Improves Chain of Thought Reasoning in
           | Language Models https://arxiv.org/abs/2203.11171
        
           | actually_a_dog wrote:
           | The thing I find interesting about the proof attempts in the
           | GP comment is that they very much resemble what you'd expect
           | to see coming from a hypothetical somewhat confused
           | undergrad. I think that ties into what you say about the
           | proof living "in a kind of superposition where I'm not clear
           | on the exact details," because that's where I imagine said
           | hypothetical confused undergrad's understanding being.
        
             | mr_toad wrote:
             | It's imitation rather than true understanding. Still, even
             | imitation is a remarkable ability for a computer.
        
           | Banana699 wrote:
           | >this new number is also prime
           | 
           | Not necessarily, it might be composite, but in this case one
            | of its prime factors will necessarily not lie in the
           | supposed list of primes, therefore also a contradiction.
           | 
           | The first counter example to "If L := {P0,P1,..,Pn} is a list
           | of primes, then prod(L)+1 is prime" is {2,3,5,7,11,13}, their
           | product is 30030, and 30031 is a composite of 2 primes, none
           | of which are in the list.
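            | 
            | That counterexample is easy to check directly, e.g. in
            | Python:
            | 
            |     from math import prod
            | 
            |     primes = [2, 3, 5, 7, 11, 13]
            |     n = prod(primes) + 1
            |     print(n)                    # 30031
            |     print(n == 59 * 509)        # True
            |     print([n % p for p in primes])  # all nonzero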
        
             | falcor84 wrote:
             | It's somewhat silly semantics, but I believe it is a valid
             | deductive step on the way to the contradiction - if the
             | number is not divisible by any other prime, then it must be
              | a new prime, QED.
        
               | ivegotnoaccount wrote:
               | The issue is that it is not divisible by any other prime
               | *from the list*. The two cases (prime or composite) must
               | be handled separately since they do not use the same
               | logic to infer there is one more prime.
               | 
               | For instance, 2 * 3 * 5 * 7 * 11 * 13 + 1 = 30031 = 59 *
               | 509.
        
               | ravi-delia wrote:
               | But to get the contradiction, you assume a finite number
               | of primes. As each of them does _not_ divide the new one,
               | the new one is not divisible by a prime. It seems like
               | your method is some kind of induction? Which probably
               | gets a little closer to the  "reason" for it, but isn't
               | the standard proof I've seen.
        
               | Tainnor wrote:
               | You don't need two separate cases.
               | 
                | Assume p1, ..., pn is a finite list of primes. The
                | product p1*...*pn + 1 is divisible by a prime, because
                | every natural number > 1 is. However, it's not divisible
                | by p1, ..., pn, hence there must be an additional prime
                | not in the list.
               | 
               | (I think you're right though that GP's "contradiction"
               | doesn't work)
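                | 
                | For reference, the same argument written out as a
                | short LaTeX proof sketch:
                | 
                |     \begin{proof}
                |     Let $p_1, \dots, p_n$ be any finite list of
                |     primes and set $N = p_1 p_2 \cdots p_n + 1$.
                |     Since $N > 1$, it has a prime divisor $q$. If $q$
                |     were one of the $p_i$, it would divide both $N$
                |     and $p_1 \cdots p_n$, hence their difference $1$,
                |     which is impossible. So $q$ is a prime not in the
                |     list, and no finite list contains every prime.
                |     \end{proof}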
        
         | ogogmad wrote:
         | I keep asking GPT-3 to prove that the LR algorithm (for finding
         | eigenvalues and eigenvectors) converges for PSD matrices. It
         | keeps insisting that it's a form of gradient descent. Is that
         | true?
        
       | daenz wrote:
       | Nit, but YAML is a superset of JSON, so no conversion required :)
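        | 
        | With PyYAML, for instance, a typical JSON document already
        | parses as YAML (modulo a few edge cases):
        | 
        |     import json
        |     import yaml  # PyYAML
        | 
        |     doc = '{"users": ["alice", "bob"], "port": 8080}'
        |     print(yaml.safe_load(doc) == json.loads(doc))  # True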
        
         | jefftk wrote:
         | This sort of "do what I mean" situation, where doing the thing
         | the user intended is different from doing something technically
         | correct, is a place GPT-3 excels. Even though returning the
         | input would be easiest, it has the pragmatic judgement to
         | predict that's not what the user wants.
        
       | mbowcut2 wrote:
       | So, for people unfamiliar with deep language models like GPT,
       | it's essentially a program that takes in a prompt and predicts
       | the next set of words based on a training corpus -- which in
       | GPT-3's case is a large portion of the internet. In these
       | examples GPT is not executing any python code, it has just been
       | trained on enough Python code/output to successfully predict what
       | kinds of outputs these functions would produce.
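        | 
        | A toy illustration of that idea - a word-level Markov chain,
        | vastly simpler than GPT-3's transformer, but the same "predict
        | the next word from a training corpus" loop:
        | 
        |     import random
        |     from collections import defaultdict
        | 
        |     corpus = "for i in range ( 3 ) : print ( i )".split()
        | 
        |     # Record which word follows which in the corpus.
        |     following = defaultdict(list)
        |     for prev, nxt in zip(corpus, corpus[1:]):
        |         following[prev].append(nxt)
        | 
        |     def complete(prompt, length=6):
        |         words = prompt.split()
        |         for _ in range(length):
        |             options = following.get(words[-1])
        |             if not options:
        |                 break
        |             words.append(random.choice(options))
        |         return " ".join(words)
        | 
        |     print(complete("for i in"))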
        
       | kcorbitt wrote:
       | For folks wanting to play around with the GPT-3 code-editing
       | capabilities referenced in the article within your own codebase,
       | I wrote a simple open source VS Code plugin that lets you run
       | commands against your currently-open file and get GPT-3's
       | suggested edits back in a diff:
       | https://marketplace.visualstudio.com/items?itemName=clippy-a...
        
         | 58x14 wrote:
         | I think I'm going to pair this with Copilot and see what
         | happens. Hopefully I don't accidentally unlock something
         | bizarre.
        
       | zora_goron wrote:
       | A quick question for anyone familiar with the architecture of
       | these Transformer-based models -- I've heard that one reason why
       | they don't work well with numbers is how the inputs are tokenized
       | (i.e. as "chunks" rather than individual words/numbers). Is there
        | anything architecturally preventing an exception to this form of
        | tokenizing in the data preprocessing step, and passing numbers
       | into the model in the format of 1 digit == 1 token? It seems like
       | such a change could possibly result in a better semantic
       | "understanding" of digits by the model.
        
         | [deleted]
        
         | Veedrac wrote:
         | Nothing prevents it, no. Transformers are certainly capable of
         | learning mathematical tasks; consider [1] as an example, which
         | uses big but regular token lengths.
         | 
         | Alternatively you could just scale 'till the problem solves
         | itself.
         | 
         | [1] https://arxiv.org/abs/2201.04600
        
       | learndeeply wrote:
       | Anyone have any ideas on how they're doing text insertion using
       | an auto-regressive model?
        
         | lucidrains wrote:
          | Yes, they are most likely finetuning with this type of
          | pretraining: https://arxiv.org/abs/2103.10360 - quite easy to
          | build.
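          | 
          | Roughly, the idea (a generic blank-infilling sketch, not the
          | exact recipe from that paper) is to turn ordinary text into
          | pairs of "input with a blank" and "completion for the blank",
          | so an autoregressive model learns to fill gaps:
          | 
          |     import random
          | 
          |     def make_infill_example(tokens, span_len=3):
          |         # Cut a random span out; the model must generate
          |         # it after a sentinel marker.
          |         i = random.randrange(len(tokens) - span_len)
          |         span = tokens[i:i + span_len]
          |         rest = tokens[:i] + ["[BLANK]"]
          |         rest += tokens[i + span_len:]
          |         prompt = " ".join(rest) + " [FILL]"
          |         target = " ".join(span) + " [END]"
          |         return prompt, target
          | 
          |     code = "def add ( a , b ) : return a + b".split()
          |     print(make_infill_example(code))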
        
       | PaulHoule wrote:
       | It would be remarkable if it got the right answers.
       | 
       | But it can't because it doesn't have the right structure (e.g.
        | GPT-3 finishes in a finite time, a program in a real programming
        | language doesn't necessarily!)
       | 
       | GPT-3's greatest accomplishment is that it has "neurotypical
       | privilege", that is if it gets an answer that is 25% or 95%
       | correct people give it credit for the whole thing. People see a
       | spark of intelligence in it the way that people see faces in leaf
        | axils or in Martian rock formations or how G.W. Bush looked in
       | Vladimir Putin's eyes and said he got a sense of Putin's soul.
       | (That was about the only thing in his presidency that he later
       | said he regretted!)
       | 
       | As an awkward person I am envious because sometimes it seems I
       | get an answer 98% correct or 99.8% correct and get no credit at
       | all.
        
         | Micoloth wrote:
          | GPT-3 does _not_ think like a human, but it definitely
          | executes code in a way that is more similar to a human than
          | to a computer...
          | 
          | Proof is that, indeed, humans _do_ get the wrong answer in
          | quizzes like these sometimes!
          | 
          | So I cannot understand this point of view of dismissing it as
          | a mere "spark of intelligence". It is exactly what was
          | advertised: a very big step forward towards real AI, even if
          | definitely not the last one?
        
           | PaulHoule wrote:
           | It is the Emperor's New Clothes incarnate.
           | 
           | It has the special talent of hijacking your own intelligence
           | to make you think it is intelligent.
           | 
           | People understood this about the 1966 ELIZA program but
           | intellectual standards have dropped greatly since then.
        
       | thrtythreeforty wrote:
       | > GPT-3 struggles with large numbers, decimal numbers, and
       | negative numbers. When used it returns answers that are close but
       | often incorrect.
       | 
       | Regarding GPT-3's "guesstimates," intuitively it feels like the
        | network _has_ to guess because it hasn't been given a way to do
       | exact computation--a neural network is built out of nonlinear
       | functions--even if it "understands" the prompt (for whatever
       | value you want to give to "understand").
       | 
       | Are there any techniques that involve giving the model access to
       | an oracle and allowing it to control it? To continue the analogy,
       | this would be the equivalent of giving GPT-3 a desk calculator.
       | 
       | If this is a thing, I have other questions. How do you train
       | against it? Would the oracle have to be differentiable? (There
       | are multiple ways to operate a desk calculator to evaluate the
       | same expression.) Also, what control interface would the model
       | need so that it can learn to use the oracle? (Would GPT-3 emit a
       | sequence of 1-hot vectors that represent functions to do, and
       | would the calculator have "registers" that can be fed directly
       | from the input text? Some way of indirectly referring to operands
       | so the model doesn't have to lossily handle them.)
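        | 
        | A crude, non-differentiable version of that idea, as a sketch:
        | let the model emit a textual marker like CALC(...) and have a
        | wrapper evaluate it and splice the result back in. query_model
        | below is a hypothetical stand-in for an actual GPT-3 call.
        | 
        |     import re
        | 
        |     def query_model(prompt):
        |         # Hypothetical stand-in for a GPT-3 completion that
        |         # was prompted to emit CALC(...) when it needs exact
        |         # arithmetic.
        |         return "The total is CALC(123456 * 789) dollars."
        | 
        |     def run_with_calculator(prompt):
        |         text = query_model(prompt)
        |         def calc(match):
        |             # eval() stands in for a desk calculator here.
        |             return str(eval(match.group(1)))
        |         return re.sub(r"CALC\(([^)]*)\)", calc, text)
        | 
        |     print(run_with_calculator("What is 123456 * 789?"))
        |     # The total is 97406784 dollars.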
        
         | ravi-delia wrote:
         | I believe the dominant thinking is that GPT-3 has trouble with
         | math because it doesn't see individual digits. It obviously has
          | no trouble working on words, which are much more discrete than
         | numbers. I wouldn't be surprised if it had trouble carrying a
         | long equation though. When writing it can reconsider the whole
         | context with each new word, externalizing that memory, but with
         | most computations it would have to carry out the whole thing in
         | one go. That's a lot of dedicated parameters for a single
         | subtask.
        
           | thrtythreeforty wrote:
           | > with most computations it would have to carry out the whole
           | thing in one go
           | 
           | Is there a way to allow models to say "let me think about
           | this some more"? With language models like GPT-3 you emit one
           | token per inference iteration, with its previous output fed
           | back in as input/state. Can models opt out of providing a
           | token, but still update state? That would allow it to break
           | up the computation into discrete steps.
        
             | durovo wrote:
             | I believe GPT-3 has a transformer-based architecture. So it
              | doesn't recursively ingest its own output in each
             | iteration. I believe attention-based transformer models
             | have enough complexity to be able to learn what you are
             | talking about on their own.
        
               | thrtythreeforty wrote:
               | Thank you for pointing out the difference. I went and
               | reread about transformers; previously I thought they were
               | a kind of RNN. (I am not an ML engineer.)
        
             | ravi-delia wrote:
             | I think it would work, but backprop would be computed in a
             | different way every time. I'm not an expert, so there may
             | be sneaky ways around it, but I'm pretty sure you'd lose
             | out on a long history of little efficiency improvements
             | when you could just make it more recurrent instead.
        
           | daniel-cussen wrote:
           | And that's where you see the man behind the curtain.
        
             | AitchEmArsey wrote:
              | Next year: GPT-NG offloads its answers to Amazon
             | Mechanical Turk, and we've come full circle.
        
               | daniel-cussen wrote:
               | Yeah for sure. With energy prices soaring, Moore's law
                | being morally over since 2010, wages being so
               | completely destroyed by the hatred Democrats have for
               | them, and the sneaky little misconceptions and errors the
               | golem's makers did not fight hard enough to let in, AI
               | will be supplanted by plain I.
        
           | edflsafoiewq wrote:
           | Can it do math on "prose numbers", eg. "two thousand three
           | hundred and four"?
        
           | mirker wrote:
           | Even the tokenization is wonky. Imagine if you had no concept
            | of math characters and instead had a lookup table of common
            | n-grams (BPE encoding). For example, the binary addition
            | function "a+b" may be tokenized as a unary "3+b" because
           | "3+b" occurs commonly. That tokenization is vastly different
           | from "3.00000001+b". GPT has to invert this tokenization
           | artifact with finite training data.
        
         | visarga wrote:
         | There are many papers trying to couple language models with
         | external modules.
         | 
         | In the Retrieval-Enhanced Transformer (RETRO) paper a large
         | language model was coupled with a similarity based text index.
         | It can populate the prompt with relevant information from the
         | index thus being more grounded and update-able.
         | 
         | In another paper (AlphaCode) the language model was coupled
         | with a compiler and could run programs and check if they match
         | the expected outputs for a few test cases. The model was able
         | to solve competition style coding problems above average human
         | score.
         | 
         | In another paper (Language Models as Zero Shot Planners) a
         | language model generates commands to navigate a virtual home
         | environment and performs tasks. The knowledge in the LM helps
         | in quickly learning tasks.
         | 
         | A recent one can learn new concepts by simple conversation,
         | then apply them where necessary. You can talk-train your model.
         | (Memory assisted prompt editing to improve GPT 3 after
         | deployment)
         | 
          | So the trend is to add "toys" onto language models - a
          | simulator, a compiler, a search engine, a long-term memory
          | module.
         | 
         | I'd like to see a recursive language model, that can sub-call
         | itself to decompose problems.
        
           | gwern wrote:
           | You forgot all the inner monologue
           | (https://www.gwern.net/docs/ai/gpt/inner-monologue/index) &
           | scratchpad papers which give it additional steps or access to
           | Python REPL etc: eg https://arxiv.org/abs/2112.15594
           | https://arxiv.org/abs/2111.08267
           | https://arxiv.org/abs/2111.08171
        
             | visarga wrote:
             | AI Chains really takes it to the next level.
        
         | emmelaich wrote:
         | An intriguing thought is that a GAI will behave very much like
         | a well-read smart individual. With the faults, mystery and
         | foibles that implies.
        
       | spupe wrote:
       | This is fascinating. I feel that we are still in the infancy of
       | the field, however. These observations are analogous to
       | naturalists of the past describing an animal's behavior, but we
       | need to get to the point where more accurate estimates are made
        | (i.e., how often does it do each thing, how accurate is it after
        | 100+ tries, etc.). Every day we see a new observation showing
        | what GPTs can do; we also need a good way to make these
        | observations systematic.
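        | 
        | A sketch of what that could look like: run the same prompt N
        | times and report an accuracy estimate. query_model here is a
        | hypothetical stand-in for a (non-deterministic) GPT-3 call.
        | 
        |     import random
        | 
        |     def query_model(prompt):
        |         # Hypothetical stand-in for an API call.
        |         return random.choice(["20/7", "16/7", "2.857"])
        | 
        |     def accuracy(prompt, accepted, tries=100):
        |         hits = 0
        |         for _ in range(tries):
        |             if query_model(prompt) in accepted:
        |                 hits += 1
        |         return hits / tries
        | 
        |     prompt = "mean([1, 4, 5, 6, 2, 1, 1]) ="
        |     print(accuracy(prompt, {"20/7", "2.857"}))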
        
       | berryg wrote:
        | I struggle to understand how GPT-3 executes code. Is it simply
        | running a Python (or any other language) interpreter? Or is
        | GPT-3 itself interpreting and executing the Python code? If the
        | latter is true, that would be amazing.
        
         | [deleted]
        
         | bidirectional wrote:
         | It is the latter.
        
       | Veedrac wrote:
       | > GPT-3 seems to have issues with large numbers. Moyix's gist
       | covers this in detail. GPT-3 tends to guesstimate an algebraic
       | function instead of evaluating the numbers, so the answer is only
       | correct to a certain approximation.
       | 
       | There are two issues here. One is the lack of working memory,
       | which means that there is very little scratch space for
       | calculating things with a meaningful sequential depth. GPT-3 is
       | very unlike traditional evaluation methods in this regard, in
       | that it is easier for it to interpret the meaning of a program
       | you give it and then intuit the result given the context than it
       | is to mechanically execute its steps.
       | 
       | The other issue is the text encoding, which makes it much harder
       | for GPT-3 to do digit-by-digit operations. Many arbitrary numbers
       | are just their own token. A fixed length number to us looks like
        | a fixed number of characters, but for GPT-3 they can be an
       | almost arbitrary number of tokens divided into almost arbitrary
       | chunks. Using thousands separators is very helpful for it.
       | 
       | If you account for these and design a prompt that mitigates them
       | you can get much stronger results. Here is an example:
       | https://news.ycombinator.com/item?id=30299360#30309302. I managed
       | an accuracy of 42% for 3-by-3 digit multiplication.
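        | 
        | You can see the chunking with the GPT-2 BPE tokenizer (which,
        | as far as I know, GPT-3 shares), e.g. via HuggingFace
        | transformers; the exact splits vary from number to number:
        | 
        |     from transformers import GPT2TokenizerFast
        | 
        |     tok = GPT2TokenizerFast.from_pretrained("gpt2")
        | 
        |     # The same quantity with and without separators.
        |     for text in ["1234567", "1,234,567"]:
        |         print(text, "->", tok.tokenize(text))
        |     # Typically the plain number splits into irregular
        |     # multi-digit chunks, while the separators tend to break
        |     # it at the commas.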
        
       | bitwize wrote:
       | GPT-3 is starting to remind me of SCP-914. Give it an input, and
       | its millions of tiny wheels churn and it produces something like
       | what you want, but otherwise quite unexpected.
       | 
       | Let's hope it doesn't turn into something like SCP-079...
        
       | csmeder wrote:
        | What year will GPT be able to take an app written in
        | Swift/SwiftUI and output a spectacular Android translation?
        | 3 years? 5 years? 10 years?
        | 
        | This is an interesting benchmark because it is a very difficult
        | problem. However, GPT has everything it needs to do this without
        | a fundamental improvement to its core (this process is more
        | science than art), and using automated UI testing GPT can check
        | whether its solution worked.
        | 
        | Thus this challenge is within the realm of what GPT already is;
        | however, once it can do this it will have massive implications
        | for how software is built.
        
         | anyfoo wrote:
         | A terrible prospect.
         | 
         | It's hard enough for people to faithfully port an application.
         | People who participate and live in the world that makes up our
         | reality. Leaving this up to an AI will at best flood us with
         | low quality junk. At worst it's actively harmful.
        
       | ivegotnoaccount wrote:
        | > For example, it seems to understand how to find a sum, mean,
        | > median, and mode.
        | 
        | > Input: 1, 4, 5, 6, 2, 1, 1
        | 
        | > Output: 2.28571428571
       | 
       | Well, even with those small numbers, it's wrong. The first "2"
       | after the dot should not be there. The result it gives is 16/7,
       | not 20/7.
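        | 
        | Easy to verify:
        | 
        |     from statistics import mean
        |     from fractions import Fraction
        | 
        |     data = [1, 4, 5, 6, 2, 1, 1]
        |     print(sum(data))               # 20
        |     print(mean(data))              # 2.857142857... (20/7)
        |     print(float(Fraction(16, 7)))  # 2.285714285... (16/7)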
        
         | loganmhb wrote:
         | I wonder how much of this is an illusion of precision that
         | comes from pattern matching on content from filler sites like
         | https://www.free-hosting.biz/division/16-divided-7.html (I do
         | not recommend clicking the link, but the result appears there).
        
       | aplanas wrote:
       | Seems that it can convert from Python to Perl:
       | 
       | https://beta.openai.com/playground/p/o4qZWSXVz8JMmVaI9j9NMIK...
        
       | 7373737373 wrote:
       | Has anyone tried using it for SAT problems yet?
        
         | timdellinger wrote:
         | my recollection is that the original journal article announcing
         | GPT-3 included some data on how it performed against SAT-style
         | questions
        
       ___________________________________________________________________
       (page generated 2022-03-29 23:00 UTC)