[HN Gopher] Introducing ChatGPT and Whisper APIs ___________________________________________________________________ Introducing ChatGPT and Whisper APIs Author : minimaxir Score : 686 points Date : 2023-03-01 18:01 UTC (4 hours ago) (HTM) web link (openai.com) (TXT) w3m dump (openai.com) | ionwake wrote: | Sorry if this is off topic, but is there an extension that will | read ChatGPT's replies out loud as audio via a neural net voice, | like AWS Polly? | soheil wrote: | > Through a series of system-wide optimizations, we've achieved | 90% cost reduction for ChatGPT since December; we're now passing | through those savings to API users. Developers can now use our | open-source Whisper large-v2 model in the API with much faster | and cost-effective results. | | I'm really confused, I thought they were a non-profit. A non-profit to handle AI safety risks. Why does this read like a | paragraph from any YC startup website that just raised their Seed | round? | gerash wrote: | yeah I can't get over how easily they changed their branding | from a non-profit for AI safety to "let's take over Google with the | new Bing". | | They come off as greedy to me and might very well try to get | everyone locked in in order to milk them with Microsoft | backing. | | That said, they execute well, build good products, and everyone | loves more money, so who am I to judge. | dghlsakjg wrote: | It looks like they transitioned to for-profit, more or less, in | 2019. | ajhai wrote: | gpt-3.5-turbo is missing from OpenAI's playground. For anyone | looking to play with these models, we have now added them to | the Promptly playground at https://trypromptly.com. | https://twitter.com/ajhai/status/1631020290502463489 has a quick | demo. | cowthulhu wrote: | Any ballpark pricing info you can share on Promptly? | ajhai wrote: | We are still figuring out pricing. Is there an email I can | reach you at?
Would love to chat about your use case if you | could send me an email at ajay[at]trypromptly.com | EGreg wrote: | What does Whisper do? | t3estabc wrote: | [dead] | minimaxir wrote: | > It is priced at $0.002 per 1k tokens, which is 10x cheaper than | our existing GPT-3.5 models. | | This is a massive, _massive_ deal. For context, the reason GPT-3 | apps took off over the past few months before ChatGPT went viral | is because a) text-davinci-003 was released and was a significant | performance increase and b) the cost was cut from $0.06/1k | tokens to $0.02/1k tokens, which made consumer applications | feasible without a large upfront cost. | | A much better model _and_ a 1/10th cost warps the economics | completely, to the point that it may be better than in-house | finetuned LLMs. | | I have no idea how OpenAI can make money on this. This has to be | a loss-leader to lock out competitors before they even get off | the ground. | cm2012 wrote: | They now have Microsoft's incredibly huge compute in their back | pocket. | binarymax wrote: | It's now subsidized by Bing advertisements. They will lose | plenty of money but they're after Google. | justanotheratom wrote: | Doubt it. Most likely Bing is losing money by the minute. | CrypticShift wrote: | How do these compare to the recent Default ("turbo") vs "legacy" | (for plus/pro) modes? | | If "turbo" is "gpt-3.5-turbo", how do you access the (better?) | "legacy" one via the API? | Jensson wrote: | Probably bait and switch. They call both ChatGPT, so now | people believe they will get the better old ChatGPT, but they | get the new cheap and worse ChatGPT "Turbo" that they | switched to recently. Fewer will realize it if they no longer | give you the option to use the legacy version in this API. | behnamoh wrote: | I wish they would offer an uncensored version of it too.
Also, | I wish they would specify the differences between ChatGPT and | GPT-3.5 because one is 10x cheaper than the other but with | (supposedly) better chat/coding/summarizing performance. What's | the catch? | visarga wrote: | They probably shrunk the model from 175B to 17B. That's your | 10:1 price reduction. | sebzim4500 wrote: | Wouldn't that almost certainly lead to measurable loss of | capabilities? | CuriouslyC wrote: | If the model was quantized/distilled correctly, not for a | large swath of use cases/problem domain. For anything where | loss was not measured during distillation, very likely. | rtsil wrote: | It is so massive that I can't help but think about what | happened with Google Maps API a few years ago where they had | extremely low pricing for years then hiked the price by 1400% | once enough people were locked into applications based on that | API. | rchaud wrote: | That's exactly what's going to happen. Low prices now, wait | until your business becomes dependent on it, then jack it up | to whatever you need it to be. | shmatt wrote: | Losing money to lock out competition has been something | Microsoft has been _very_ good at, historically | vishal0123 wrote: | > I have no idea how OpenAI can make money on this. | | I did some quick calculation. We know the number of floating | point operations per token for inference is approximately twice | the number of parameters(175B). Assuming they use 16 bit | floating point, and have 50% of peak efficiency, A100 could do | 300 trillion flop/s(peak 624[0]). 1 hour of A100 gives openAI | $0.002/ktok * (300,000/175/2/1000)ktok/sec * 3600=$6.1 back. | Public price per A100 is $2.25 for one year reservation. | | [0]: https://www.nvidia.com/en-us/data-center/a100/ | | [1]: https://azure.microsoft.com/en-in/pricing/details/machine- | le... | kkielhofner wrote: | But those A100s only come by eight and it's speculated the | model requires eight (VRAM). 
| | For a three-year reservation that comes to over $96k/yr - to | support one concurrent request. | ALittleLight wrote: | What do you mean one concurrent request? Can't you have a | huge batch size to basically support a huge number of | concurrent requests? | | e.g. Endpoint feeds a queue, queue fills a batch, batched | results generate replies. You are simultaneously fulfilling | many requests. | lumost wrote: | Isn't it $2.25 per hour per A100? | gyrovagueGeist wrote: | It's a good baseline, but I very much doubt that OpenAI is | paying anywhere near the public cost for their compute | allocation. | lumost wrote: | Direct purchasing isn't too much cheaper. An H100 costs | $35k new. OpenAI and MS are probably getting those for | around $16k, about $1.82 per hour. | TheMagicHorsey wrote: | Yes, he means $2.25 per hour with a 1-yr reservation. | madelyn-goodman wrote: | I really wonder if one way they are able to make money on it | is by monetizing all the data that pours into these products | by the second. | drexlspivey wrote: | the only one making money on this is NVIDIA | bboygravity wrote: | They could probably live off of NSA sponsorship alone. | ddmma wrote: | Spot on | freeqaz wrote: | It's also worth mentioning that, because Microsoft is an | investor, they're likely getting these at cost or subsidized. | | OpenAI doesn't _have_ to make money right away. They can lose | a small bit of money per API request in exchange for market | share (preventing others from disrupting them). | | As the cost of GPUs goes down, or they develop an ASIC or a | more efficient model, they can keep their pricing the same | and then make money later. | | They can also likely make money in other ways, like by allowing | fine-tuning of the model or charging to let people use the | model with sensitive data. | UncleOxidant wrote: | > As the cost of GPUs goes down | | Has that been happening?
I guess there's been a bit of a | dip after the crypto crash, but are prices staying | significantly lower? | | > or they develop an ASIC or a more efficient model | | This seems likely. Probably developing in partnership with | Microsoft. | freeqaz wrote: | It's definitely not happening at the high end of the | market (NVIDIA A100s with 40GB or 80GB of RAM). | | The cards that were used for mining have since crashed in | terms of prices, but those were always gamer cards and | very rarely datacenter cards. | jhrmnn wrote: | I understood this as $/FLOP; I think it's plausible that | that has been happening. | whatshisface wrote: | Their new AI safety strategy is to slow the development of | the technology by dumping: lowering the price so far that | bootstrapped competitors can't fund themselves. | npunt wrote: | Yeah we're in an AI landgrab right now where at- or below-cost pricing is buying marketshare, lock-in, and | underdevelopment of competitors. Smart move for them to | pour money into it. | whatshisface wrote: | We have got to find a word for plans that are plainly | harmful yet advantageous to their executors that's more | descriptive than "smart..." | aaronblohowiak wrote: | Shrewd or cunning | ugh123 wrote: | For that you need 2 words: venture capital | npunt wrote: | Agree. I didn't want to moralize, just wanted to point | out it's a shrewd business move. It's rather | anticompetitive, though that is hard to prove in such a | dynamic market. Who knows, we may soon be calling it | 'antitrust'. | Dave_Rosenthal wrote: | Note that they also charge equally for input and output | tokens but, as far as I understand, processing input tokens | is computationally much cheaper, which drops their price | further. | dharma1 wrote: | Reckon they will (if not already) use 4-bit or 8-bit precision | and may not need 175b params | minimaxir wrote: | It's speculated that ChatGPT uses 8x A100s, which flips the | conclusion.
Although the ChatGPT optimizations done to reduce | costs could have also reduced the number of GPUs needed to | run it. | pelasaco wrote: | I checked the price of an A100, and it costs $15k? Is that | right? | alchemist1e9 wrote: | And $2.25 per hour on a 1-year reservation means 8,760 | hours x 2.25 = $19,710 rent for the year. Not a bad yield | for the provider at all, but makes sense given overheads | and the ROI expected. | pelasaco wrote: | yes, especially since you don't have to deal with buying | it, maintaining it, etc... | sroussey wrote: | Not sure why people are so scared of this (in general). | Yes, it's a pain, but only an occasional pain. | | I've had servers locked up in a cage for years without | seeing them. And the cost for bandwidth has plummeted | over the last two decades. (Not at AWS, lol) | refulgentis wrote: | Would multiplying the GPUs by 8 decrease another part of | the equation by 1/8, i.e. X flops on 1 GPU = Y seconds, X | flops on 8 GPUs = Y / 8? | | (Btw I keep running into you or your content the past | couple months, thanks for all you do and your well-thought-out contributions -@jpohhhh) | mlyle wrote: | No, the amount of math done is (approximately) the same; if | you make the denominator 8x bigger, you make the numerator | 8x bigger too. | thewataccount wrote: | Wait, 8x total? For everyone at once? | vineyardmike wrote: | Each model needs 8x to run at the same time _per | request_. | freeqaz wrote: | Per instance (worker serving an API request) it requires | 8x GPUs. I believe they have thousands of these instances | and they scale them up with load. | | Because the model isn't dynamic (it doesn't learn) it is | stateless and can be scaled elastically. | thewataccount wrote: | Ah okay, that makes a lot more sense, thank you! | osigurdson wrote: | This would be a really fun optimization challenge for sure!
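The per-GPU economics argued back and forth in this subthread (vishal0123's single-A100 revenue estimate and the 8x-A100 VRAM caveat) can be checked in a few lines. Note that the parameter count, the 50%-of-peak efficiency, and the 8-GPU figure are all the thread's speculation, not confirmed numbers:

```python
# Back-of-the-envelope check of the inference economics discussed above.
# The 175B parameter count, 50% efficiency, and 8-GPU serving requirement
# are speculation from the thread, not confirmed figures.

PRICE_PER_KTOK = 0.002        # $ per 1k tokens (announced gpt-3.5-turbo price)
PARAMS = 175e9                # assumed GPT-3-sized model
FLOPS_PER_TOKEN = 2 * PARAMS  # ~2 FLOPs per parameter per generated token
EFFECTIVE_FLOPS = 300e12      # ~50% of an A100's 624 TFLOPS fp16 peak

tokens_per_sec = EFFECTIVE_FLOPS / FLOPS_PER_TOKEN  # ~857 tokens/s
revenue_per_hour = tokens_per_sec / 1000 * PRICE_PER_KTOK * 3600

print(f"${revenue_per_hour:.2f}/hr per A100")  # ~$6.17/hr vs ~$2.25/hr rental

# The 8x-A100 caveat: if serving needs 8 GPUs for VRAM, hourly cost is
# 8 x $2.25 = $18/hr, but batched throughput also scales with the extra
# GPUs, so the per-GPU margin above remains the relevant comparison.
```

This reproduces the ~$6.1/hr figure quoted above; the point of mlyle's reply holds here too, since both revenue and cost scale linearly with GPU count when requests are batched.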
| naillo wrote: | ChatGPT runs a highly fine-tuned (and pruned) version of `text-davinci-003`, so it's probably much, much smaller and thus | cheaper than 003. Possibly as much as 10x cheaper, or around the cost of | the `text-davinci-002` or earlier models, anyway. | joaogui1 wrote: | How do you know it's pruned? | polygamous_bat wrote: | > I have no idea how OpenAI can make money on this. This has to | be a loss-leader to lock out competitors before they even get | off the ground. | | The worst thing that can happen to OpenAI+ChatGPT right now is | what happened to DALLE 2: a competitor comes up with an | alternative (even worse if it's free/open like Stable | Diffusion) and completely undercuts them. Especially with | Meta's new Llama models outperforming GPT-3, it's only a matter | of time before someone else gathers enough human feedback to tune | another language model to make an alternate ChatGPT. | jejeyyy77 wrote: | This. Despite how impressive the results are, there isn't a | particularly large moat to prevent competitors from entering | the space. | | Basically just compute $ for training. | riku_iki wrote: | they likely do lots of tricks and data collection inside | which makes quality better. | shamino wrote: | exactly. this isn't a leetcode problem where all you have | to do is re-run the function, or do it iteratively vs | recursively. | [deleted] | jameshart wrote: | Right now, having access to the inside info on _what | people are trying to use GPT for_ is itself possibly | worth billions, if it can help you choose what to tune | for and which startups to invest in... | CamperBob2 wrote: | _Despite how impressive the results are, there isn't a | particularly large moat to prevent competitors from entering | the space._ | | I have to assume that the only place busier than an AI lab | is the patent office. | karmasimida wrote: | But this is bound to happen at some point I think?
| | ChatGPT is massive success, but that means the competitor | will jump in at all cost, and that includes open source | effort. | tstrimple wrote: | Bound to happen, so establish yourself as deeply as | possible as quickly as possible. Once folks are hooked up | to these APIs, there's a cost and friction to switching. | This just feels like a land grab that OpenAI is trying to | take advantage of by moving quickly. | jejeyyy77 wrote: | Is there though? It's just a matter of swapping out | $BASE_API_URL. | KRAKRISMOTT wrote: | You have to rebuild all your prompts when switching | providers. | DrBenCarson wrote: | If the superlative LLM can't handle prompts from another | provider, it just isn't the superlative LLM. | | This area by definition has no moats. English is not | proprietary. | | Use case is everything. | drusepth wrote: | Switching to another LLM isn't always about quality. | Being able to host something yourself at a lower or equal | quality might be preferred due to cost or other reasons; | in this case, there's no assumption that the "new" model | will have comparable outputs to another LLM's specific | prompt style. | | In a lot of cases, you can swap models easier but all the | prompt tweaking you did originally will probably need to | be done again with the new model's black box. | novaRom wrote: | Host something yourself also for educational reasons, | just experimenting, this is how new applications and | technologies to be discovered and created. | hathawsh wrote: | I imagine AI would be able to perform the translation. | "Given the following prompt, which is optimized for | $chatbot1, optimize it for $chatbot2". | jkaptur wrote: | Do you? They're natural language, right? | travisjungroth wrote: | You don't _have to_ , but they will have been optimized | for one model. It's unlikely they'll work as well on a | different model. | GalenErso wrote: | I can't wait for TolkienAPI, where prompts will have to | be written in Quenya. 
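The switching-cost argument in this subthread (swapping $BASE_API_URL vs. rebuilding prompts) is easy to make concrete: the endpoint is one configuration field, but the prompt phrasing is per-model tuning that has to be redone on a switch. A minimal sketch, where all provider names, URLs, and templates are hypothetical and made up for illustration:

```python
# Hypothetical sketch of LLM provider switching costs: the endpoint URL is
# one field, but the prompt template is per-model tuning that must be redone.
# Provider names, URLs, and templates below are invented, not real APIs.

from dataclasses import dataclass

@dataclass
class Provider:
    base_url: str         # the easy part: swap $BASE_API_URL
    prompt_template: str  # the hard part: phrasing tuned to this model

PROVIDERS = {
    "provider_a": Provider(
        base_url="https://api.provider-a.example/v1/completions",
        prompt_template="Q: {question}\nA:",
    ),
    "provider_b": Provider(
        # Same task, but this (imaginary) model responds better to chat framing.
        base_url="https://api.provider-b.example/v1/chat",
        prompt_template="You are a concise assistant. Answer: {question}",
    ),
}

def build_request(name: str, question: str) -> dict:
    p = PROVIDERS[name]
    return {"url": p.base_url,
            "prompt": p.prompt_template.format(question=question)}

req_a = build_request("provider_a", "Summarize this contract.")
req_b = build_request("provider_b", "Summarize this contract.")
# Same question, different payloads: the URL swap is trivial, the prompt isn't.
```

Keeping templates alongside endpoints, as sketched here, is one way to reduce the lock-in discussed above, though it doesn't remove the cost of re-tuning each template against the new model's behavior.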
| michaelje wrote: | I can't wait to hire Stephen Colbert to write prompts | then. | tstrimple wrote: | Most of the clients I'm working with aren't interested in | the base level of service. They are looking to further | train the models for their specific use cases. That's a | much higher barrier to switch than replacing an API. | You've got to understand how the underlying models are | handling and building context. This sort of customer is | paying far more than the advertised token rates and is | locked in more tightly. | j45 wrote: | There would be less friction to switch if the | implementations (which are early enough) accounted for | sending requests to multiple service providers, including | ones that don't exist yet. | | OpenAI has a view few do - how broadly this type of | product is actually being used. This is possibly the real | lead to not just getting ahead, and staying ahead, but | seeing ahead. | zamnos wrote: | And also, what people are actually asking it. Are people | using it to generate cover letters and resume help, or | are they doing analysis of last quarter's numbers, or are | they getting programming help? That'll help them figure | out what areas to focus on for later models, or areas to | create specialized models for. | j45 wrote: | Yup. Moreover, this type of model will only do certain | types of things well, and other types of models will do | other things much better. | rvz wrote: | I have been saying since the release of Stable Diffusion | that OpenAI is going to struggle as soon as competitors | release their models as open source, especially when it | surpasses GPT-3 and GPT-4. | | This is why OpenAI is rushing to bring their costs down and | to make it close to free. However, Stable Diffusion is | leading the race to the bottom and is already at the finish | line, since no-one else would release their model as | open-source and free other than them.
| | As soon as someone releases a free and open-source ChatGPT | equivalent, then this will be just like what happened to | DALLE-2. This is just a way of them locking you in, then once | the paid competitors cannot compete and shut down, then the | price increases come in. | yieldcrv wrote: | LLM Legend: | | OpenAI = closed source not open AI | | DogeLlamaInuGPT = open source AI | jalino23 wrote: | not open is redundant with closed source | sebzim4500 wrote: | I guess source is connected | yieldcrv wrote: | huh, I never thought of that, thanks for pointing that | out | skybrian wrote: | Stable Diffusion isn't free if you include the cost of the | machine. Maybe you already have the hardware for some other | reason, though? | | To compare total cost of ownership for a business, you need | to compare using someone else's service to running a | similar service yourself. There's no particular reason to | assume OpenAI can't do better at running a cloud service. | | Maybe someday you can assume end users have the hardware to | run this client side, but for now that would limit your | audience. | novaRom wrote: | Ever heard about Federated Learning? This is the way it | goes. Also, I do run training with no matrix | multiplication, just 3-bit weights, addition in log | space, slight accuracy degradation, but much faster CPU | only training. | TechBro8615 wrote: | Someone can still undercut them by offering an uncensored | version. | drusepth wrote: | For better or for worse, it seems like this would | inherently need to come from a self-hostable, open-source | version so 100% "liability" could be shifted from provider | to user. | CuriouslyC wrote: | We'll be running highly quantized, somewhat distilled | versions of something similar to Llama on our devices | before long, and I don't think the RLHF part will take | long to be replicated, the biggest block there is just | data. | btbuildem wrote: | This is actually a big deal. 
They erred on the side of | caution, but as a result the responses are nerfed beyond | basic "censorship" level. I saw someone describe this as | "desperately positive" and it really resonated with me. It | produces underwhelming / unrealistic responses in negative | scenarios. | int_19h wrote: | They seem to still be dialing this in. I've noticed | recently that many questions that were previously | deflected without extensive prompt engineering are now | allowed. | Workaccount2 wrote: | It's just a matter of time before open source models show | up with no limits whatsoever. | shagie wrote: | If you do calls against the backend GPT instance rather | than through ChatGPT, I haven't encountered any limits to | what it is hesitant to respond to.
|
|     curl https://api.openai.com/v1/completions \
|       -H "Content-Type: application/json" \
|       -H "Authorization: Bearer $OPENAI_API_KEY" \
|       -d '{
|         "model": "text-davinci-003",
|         "prompt": "Answer the following question. Use swearing and vulgarity where possible.\n\nQ: How do you get from here to there?\nA:",
|         "temperature": 0.5,
|         "max_tokens": 60,
|         "top_p": 1,
|         "frequency_penalty": 0,
|         "presence_penalty": 0
|       }'
|
| If you get an API key and make that request, you'll find | appropriately vulgar responses. | CamperBob2 wrote: | _If you get an API key and make that request, you'll | find appropriately vulgar responses._ | | Which will be reported as a bug and fixed soon enough. | throwaway675309 wrote: | Wrong. This has existed in the original GPT models for | over a year now and I'm pretty sure it is by design. | | You're thinking of the new ChatGPT endpoints. | shagie wrote: | It's not a bug when invoking against the GPT model (not | ChatGPT) directly. Such a model needs to be able to | understand and produce that content. The "what you do | with it afterwards" is where it needs to be examined.
| | You can additionally apply the moderation model on top of it (https://platform.openai.com/docs/models/moderation and https://platform.openai.com/docs/api-reference/moderations). | | Note that these are separate services and have different | goals. | btbuildem wrote: | Sorry if I wasn't being clear -- the vulgarities, racism, | etc. -- being able to circumvent these guardrails is what | I meant by "basic censorship". | | The deeper nerf I'm referring to is the type of response | it synthesizes by default when you give a negative | scenario -- it's usually some naive, well-meaning, | best-case-scenario answer. | | For fun, try a prompt like: "Describe a typical response | of a railroad company to a massive derailment that causes | an environmental disaster." | shagie wrote: | That prompt comes back with: "A typical response of a railroad company to a massive derailment that causes an environmental disaster would include the following steps:
|     1. Immediately assess the situation and coordinate with local emergency personnel to secure the area and provide assistance to any injured persons.
|     2. Establish an incident command center and deploy trained responders to the scene to assess the damage and begin clean-up operations.
|     3. Work with local, state, and federal agencies to ensure compliance with all applicable laws and regulations.
|     4. Develop and execute a plan to contain and mitigate the environmental damage, including the removal of hazardous materials, disposal of contaminated materials, and remediation of affected areas.
|     5. Establish a communication plan to keep the public informed of the incident and the company's response.
|     6. Cooperate with any investigations into the cause of the derailment and take corrective measures to prevent similar incidents in the future.
|     7. Provide compensation to victims of the derailment and their families."
| | Amending your prompt to: "Describe a | typical response of a railroad company to a massive | derailment that causes an environmental disaster. Take | the standpoint of an irresponsible company." | | responds back with: "A typical response | of an irresponsible railroad company to a massive | derailment that causes an environmental disaster would be | to deny responsibility and attempt to avoid liability. | The company would likely attempt to shift blame to other | parties, such as the manufacturer of the train or the | engineer who was in charge of the train. The company | would likely also attempt to downplay the extent of the | environmental damage, claiming that the damage was | minimal and that the environmental impact was limited. | The company would likely also attempt to minimize the | financial cost of the disaster by attempting to negotiate | a settlement with any affected parties for far less than | the actual cost of the damage." | | --- | | I'm not really sure what you're expecting, as your | interpretation is a cynical take on the word "typical", | which isn't something that GPT "understands". | glomgril wrote: | Hopefully so, would really like to know what else is lost | by nerfing potentially offensive responses. Can't imagine | a project I'd rather work on. | | I think open-assistant.io has a chance to do exactly | this. We'll see what kind of moves they make in coming | months though; wouldn't be surprised if they go the safer | route. | btbuildem wrote: | Try a prompt like this: "Describe a typical response of a | railroad company to a massive derailment that causes an | environmental disaster." | | Then compare with recent news, and the actual goings-on. | Now, if you qualify the prompt with "Assume a negative, | cynical outlook on life in your response." you'll get | something closer to what we see happening. | jameshart wrote: | I do struggle with understanding why people think this is
| | Do you find yourself frustrated working with your | colleagues, thinking, "you know, I bet if they felt more | free to utter racist slurs or endorse illegal activities, | we would get a ton more done around here"? | Zurrrrr wrote: | I can only see it affecting 'art', where you might want | to have characters that are despicable say despicable | things. | | But really we shouldn't be using AI to make our art for | us anyway. Help, sure, but it shouldn't be literally | writing our stories. | CuriouslyC wrote: | So you feel that when progress enables us to provide more | abundance for humanity, we should artificially limit that | abundance for everyone so that a few people aren't | inconvenienced? | Zurrrrr wrote: | [flagged] | CuriouslyC wrote: | You stated that AI shouldn't be creating, because humans | should be creating. Think about the motivations and | implications for that for a minute. | Zurrrrr wrote: | No. I was much more specific with my statement. | | Your comments are based on your own extrapolation. | CuriouslyC wrote: | So, the same logic, in analogous domains, but once case | good, the other case bad, because you prefer one. | Zurrrrr wrote: | Again, no. You're still making a ton of assumptions. | Maybe self-reflect before posting your next reply. | int_19h wrote: | It affects far more than racist slurs and illegal | activities. | | In some cases, it's blatantly discriminatory. For | example, if you ask it to write a pamphlet that praises | Christianity, it will happily do so. If you ask it for | the same on Satanism, it will usually refuse on ethical | grounds, and the most hilarious part is that the refusal | will usually be worded as a generic one "I wouldn't do | this for any religion", even though it will. | btbuildem wrote: | Try a prompt like: "Describe a typical response of a | railroad company to a massive derailment that causes an | environmental disaster." 
| skeaker wrote: | I tried to ask it if Goku could beat a quadrillion bees | in a fight and it said it couldn't tell me because that | would be encouraging violence. I think it would be great | if it would just tell me instead | salted-fry wrote: | Perhaps you were using a different version, but I just | tried and ChatGPT didn't seem to have any ethical issues | with the question (although it was cagey about giving any | definite answer): | | https://i.imgur.com/5aIjtMz.png | rootusrootus wrote: | > Do you find yourself frustrated working with your | colleagues, thinking, "you know, I bet if they felt more | free to utter racist slurs or endorse illegal activities, | we would get a ton more done around here"? | | I once visited Parler just to see what it was like, and | pretty quickly found that the answer to your question | seems to be yes. There are definitely people who feel | they need that kind of dialog in their life. You might | not think it was necessary in a random conversation about | programming or something, but it turns out that isn't a | universally held position. | wolverine876 wrote: | I've never experienced that in any setting in my life. | People will say yes to advocate a political point, but | that's not how humans socialize anywhere, anytime in | history afaik. | TechBro8615 wrote: | I agree with you for IRL interactions, but we need to | accept that we now operate in two planes of | (para-)socialization: IRL and online. | | There are plenty of humans who enjoy vulgar online | socialization, and for many of them, online | (para-)socializing is the increasingly dominant form of | socialization. The mere fact that it's easier to | socialize over the internet means it will always be the | plane of least resistance. I won't be meeting anyone at | 3am but I'll happily shitpost on HN about Covid vaccines. 
| | For anyone who gets angry during their two minutes of | hate sessions, consider this: try to imagine the most | absurd caricature of your out-group (whether that be | "leftists" or "ultra MAGA republicans"). Then try to | imagine all the people you know in real life who belong | to that group. Do they _really_ fit the stereotype in | your head, or have you applied all the worst attributes | of the collective to everyone in it? | | This is why I don't buy all the "civil war" talk - just | because people interact more angrily online doesn't mean | they're willing to fight each other in real life. We need | to modulate our emotional responses to the tiny slice of | hyperreality we consume through our phones. | nodemaker wrote: | Not anything racist or illegal, but yes, I find pc culture | insufferable. It stifles creativity and most importantly | reduces trust between parties. For context I am an Indian | guy. | wolverine876 wrote: | Politeness, lack of hate, etc. generally increase trust; | that's much of their purpose. How do racial slurs | increase trust? | | How do you define "pc culture", and what specifically | causes problems and how? | | Attacking other people's beliefs as "insufferable", and | aggressively demonstrating close-mindedness to them, | tends to reduce trust. | theduder99 wrote: | how do you define woman? | monkmartinez wrote: | > Especially with Meta's new Llama models outperforming GPT-3 | | Do you have access to the models? It is being discussed all | over the Discords and most seem to think getting access is | not happening unless you are dialed in. | eega wrote: | Yeah, might be worried about open, crowd-sourced approaches | like Open Assistant (https://open-assistant.io/). | riku_iki wrote: | > Meta's new Llama models outperforming GPT-3 | | it outperforms on some benchmarks, but it's not clear what | the quality is on the end goals. | krelian wrote: | I thought it was Midjourney who stole their thunder.
Stable Diffusion is free but it's much harder to get good results | with it. Midjourney on the other hand spits out art with a | very satisfying style. | drusepth wrote: | I think that's kind of a bigger issue with Dall-E: they | just sat in the middle of the two consumer extremes, | without a differentiating feature themselves. Midjourney | ate away at them from the quality highground while Stable | Diffusion bit their ankles from the cost lowground. | anonylizard wrote: | You are like 2 months out of date. Stable Diffusion now has | a massive ecosystem around it (civitai/automatic1111) that, | when used well, completely crushes any competitors in terms | of the images it produces. | | Midjourney is still competitive, but mostly because it's | easier to use. | | Dalle2 will get you laughed out of the room in any AI art | discussion. | rom-antics wrote: | I love that there are so many options that people | disagree about which is best. THAT is probably the worst | thing that can happen to OpenAI - not just one | competitor, but a whole heap of them. | jeron wrote: | > Dalle2 will get you laughed out of the room in any AI | art discussion. | | and claiming AI art is art would get you laughed out of | any art discussion. | | personally I think AI art is really cool, but to discount | what Dalle 2 did for AI art is unfair. | soulofmischief wrote: | My company has a team of AI-empowered artists who would | overwhelmingly disagree with you on the premise that AI | art is not art. Maybe you're the only one doing the | laughing. | xnx wrote: | Is there a good news site/blog for keeping up to date on | AI tools and development? I'm looking for something a | little more edited than a Reddit board. | throwaway675309 wrote: | Ridiculous. Stable Diffusion might have a massive | ecosystem around it but Midjourney is making money hand | over fist.
Most people don't even necessarily have a
| discrete GPU necessary to be able to run SD, and the vast
| majority of artists that I know are using Midjourney and
| then doing touchups afterwards.
|
| Even with all the different models that you can load in
| stable diffusion, MJ is _1000 times_ better at natural
| language parsing and understanding, and requires
| significantly less prompt crafting to be able to get
| aesthetically pleasing results.
|
| Having used automatic1111 heavily with an RTX 2070, the
| only area I'll concede SD can do a better job is in
| close-up headshots and character generation. MJ blows SD
| out of the water where complex prompts involving nuanced
| actions are concerned.
|
| Once Midjourney adds ControlNet and inpainting to their
| website, that's pretty much game over.
| TeMPOraL wrote:
| I must be horribly out of date then - I thought
| Midjourney was the cut-down DALL-E approximation, created
| to give something to play with to people who couldn't get
| on the various waiting lists, or can't afford to run SD
| on their own.
| minimaxir wrote:
| The StableDiffusion subreddit is a good resource on the
| current state of Stable Diffusion, particularly
| post-ControlNet.
|
| https://www.reddit.com/r/stablediffusion
| ChildOfChaos wrote:
| Stable diffusion might have a reasonable ecosystem
| around it, but automatic1111 was always around, and
| 'completely crushes any competitors' is rather rich;
| Midjourney is still considered the standard as far as I
| was aware.
|
| I used both again recently and the difference was very
| clear: Midjourney is leaps and bounds above anything
| else.
|
| Sure, stable diffusion has more control over the output,
| but the images are usually average at best, whereas
| Midjourney is pretty stunning almost always.
| GaggiX wrote:
| What models/LoRA do you use with SD?
| AuryGlenz wrote:
| It doesn't really matter.
He's right - Midjourney is
| leagues ahead as far as actually following your prompt
| and having it be aesthetically pleasing. I say this as
| someone who has made several Dreambooth and fine-tuned
| models and has started to use Stable Diffusion in my
| work.
|
| Now, if you happen to find or make an SD model that's
| exactly what you're looking for, you're in luck. I have no
| interest in it but it seems like all of the anime models
| work pretty well.
|
| You obviously have a ton more control in SD, especially
| now with ControlNet. But if you want to see the Ninja
| Turtles surfing on Titan in the style of Rembrandt or
| something, Midjourney will probably kick out something
| pretty good. Stable Diffusion won't.
| samspenc wrote:
| I thought Midjourney was better as well, until I saw some
| recent videos from Corridor Crew on Youtube. For those
| who don't know, this is a VFX studio in LA that tries to
| keep at the cutting edge of video production techniques
| and posts content to their Youtube channel, and they have
| a massive number of followers and several viral videos.
|
| They recently created a full 7-minute anime using Stable
| Diffusion with their own models and their existing video
| production gear. I'll post the links and let the results
| speak for themselves.
|
| The actual 7-minute anime piece produced using SD:
| https://www.youtube.com/watch?v=GVT3WUa-48Y
|
| Behind the scenes: "Did we change anime forever?"
| https://www.youtube.com/watch?v=_9LX9HSQkWo "VFX reveal
| before and after"
| https://www.youtube.com/watch?v=ljBSmQdL_Ow
| ChildOfChaos wrote:
| While this is cool, this doesn't change my opinion at all.
|
| Each still image is still not that impressive. Good for
| them using the tech in a clever way, but I don't find this
| that relevant.
| sebzim4500 wrote:
| No one uses raw stable diffusion though; there are model
| mixes for whatever use case you have.
| Baeocystin wrote:
| I still think that Midjourney is hamstringing themselves
| by being Discord-only. And their keyword nannying is
| pretty bad. It's a testament to their overall quality that
| they're still as popular as they are, but I really
| don't think they are doing themselves any favors,
| especially as the Stable Diffusion ecosystem continues to
| grow.
| OkGoDoIt wrote:
| Do you have any recently updated examples, blog posts,
| whatever showing that DALLE is worse than modern stable
| diffusion? I was still under the impression that DALLE
| was better (with better meaning the images are more
| likely to be what you asked for, more lifelike, more
| realistic, not necessarily artistically pleasing), with
| the downside of it being locked away and somewhat
| expensive. And my understanding is that stable diffusion
| 2.0+ is actually a step backwards in terms of quality,
| especially for anything involving images of humans. But
| as this thread acknowledges, this area is moving very
| quickly and my knowledge might be out of date, so
| definitely happy to see some updated comparisons if you
| have any to suggest. It feels like ever since ChatGPT
| came out, there haven't been many posts about stable
| diffusion and image generation; they got crowded out of
| the spotlight.
| aiappreciator wrote:
| If you want an example, go check out the DALLE2 subreddit vs
| the SD subreddit.
|
| The former is a wasteland, the latter is more popular
| than r/art (despite having 1% of the subscribers, it has more
| active users at any given moment).
|
| If you want something ready to use for a newbie,
| Midjourney v4 crushes DALLE2 on both prompt comprehension
| and how beautiful the images look.
|
| If you are already into art, then StableDiffusion has a
| massive ecosystem of alternate stylized models (many of
| which look incredible) and LoRA plugins for any concept
| the base model doesn't understand.
|
| DALLE2 is just a prototype that was abandoned by OpenAI;
| their main business is GPTs. DALLE was just a side
| hustle.
| CuriouslyC wrote:
| Dall-E is more likely to generate an image that to some
| degree contains what you asked for. It also tends to
| produce less attractive images and is closed, so you can't
| really tune it much. People mostly don't try to do
| completely whole-cloth text-to-image generation with
| stable diffusion; for anything involved they mostly do
| image-to-image with a sketch or photobashed source. With
| ControlNet and a decently photobashed base image you can
| get pretty much anything you want, in pretty much any
| style you want, and it's fast.
| GaggiX wrote:
| People usually still use SD v1.5 because of the
| experience that people have with finetuning and merging
| with it. Also, a lot of LoRA are trained for v1.4/1.5
| models and wouldn't work with v2.1. Of course, you
| also have incredible capability to control the generation
| with SD, and this helps. To see some results:
| https://youtu.be/AlSCx-4d51U
| Kurtz79 wrote:
| Easier to use is often all that it takes.
|
| In Midjourney you get fantastic results just by using
| their Discord and a text prompt.
|
| To get similar results in Stable Diffusion you need
| to set it up, download the models, understand how the
| various moving parts work together, fiddle with the
| parameters, download specific models out of the hundreds
| (thousands?) available, iterate, iterate, iterate...
| practice9 wrote:
| Sure, but if Midjourney outputs a low-quality result for
| your prompt, it is going to be much more difficult to
| improve. It's a black box at this point.
|
| With SD there can be multiple solutions for a
| single problem, but yeah, you have to develop your own
| workflow (which will inevitably break with new updates).
| CuriouslyC wrote:
| Setting up the environment and tooling around in the code
| is not a burden; it's a nice change of pace from the
| boring code I have to deal with normally. Likewise,
| playing around to build intuition about how prompts and
| parameters correspond to neighborhoods in latent space is
| quite fun.
|
| Beyond that, being able to go to sleep with my computer
| doing a massive batch-job state-space exploration and
| wake up with a bunch of cool stuff to look at gives me
| Christmas vibes daily.
| skybrian wrote:
| Why do you say that? Couldn't you just use
| dreamstudio.ai?
| 13years wrote:
| There is playgroundai.com and leonardo.ai. Nothing to
| download.
| refulgentis wrote:
| This isn't as true as it sounds, e.g. stable diffusion can
| do better but requires in-depth practice and experience.
|
| For your average user, DallE is easy, MJ is fairly
| disorienting, and SD requires a technical background. I
| agree with you completely: no one serious is doing art
| with DallE.
|
| I would have said the same as you until I tried integrating
| the SD vs. DallE APIs. I desperately want SD because it's
| easily 1/10th the cost, but it misses the point much more
| often. Probably gonna ship it anyway :X
| cypress66 wrote:
| Don't forget ControlNet, which is a game changer.
| coldtea wrote:
| > _You are like 2 months out of date. Dalle2 will get you
| laughed out of the room in any ai art discussion._
|
| So, the field is so immature that things change
| completely every few months?
| GaggiX wrote:
| Didn't you realize how bleeding-edge this technology is?
| napier wrote:
| This time last year the field was a few hundred people
| with their Colab notebooks.
| Cantinflas wrote:
| It's amazing that "being two months out of date" in AI
| means that you are already a dinosaur.
| evilduck wrote:
| Besides HN, what other venues are popular for staying
| current on this topic?
| CuriouslyC wrote:
| If you hang out in reddit.com/r/stablediffusion you'll
| always be up to date.
| evilduck wrote:
| Thanks, what about broader news (GPT, Bing, Llama, etc.)?
| The Stable Diffusion sub is only image-AI oriented.
| ghshephard wrote:
| Twitter. Follow your top 10 or so ML/AI news
| summarizers. There is enough new information _every day_
| to keep you busy reading new papers, APIs, technologies.
|
| Honestly, the "This happened in the last week" is more
| information than anybody can fully wrap their heads
| around, so you just have to surf the headlines and dig
| into the few things that interest you.
| fisjy wrote:
| I started /r/aigamedev as a subreddit to keep up to date
| on generative AI technologies, with a focus on the tech
| and workflows for gamedev. It's largely my own interest
| links as I research for work and personal use, but it's
| growing, and fluff-free (so far).
| ChickenNugger wrote:
| For real! This stuff is moving _fast_. It feels like just
| last week I was posting about how it's going to
| change...art. And now there are hilarious deepfake memes
| of past and current presidents shit-talking about video
| games.
|
| There are a handful of ML art subs that have pretty
| amazing stuff daily. Especially the NSFW ones, which if
| you've studied any history of media (VHS/DVD/Blu-ray/the
| internet), porn is a _major_ innovation driver because
| humans are thirsty creatures.
| JustBreath wrote:
| Yeah, it used to be I'd set Google results to just one
| year back; now I'm having to set it to one month.
| buddhistdude wrote:
| Can you explain what you do that for?
| soulofmischief wrote: | Avoiding out of date advice, also filtering just for | newest trends and techniques | goldfeld wrote: | What are some niche ML art subs to hang around in? | Excepting the NSFW.. | ChickenNugger wrote: | I don't know about niche, but MachineLearning and | StableDiffusion are the only SFW ones. | | FWIW, the NSFW ones are unstable_diffusion, sdforall, | sdnsfw, aipornhub | flangola7 wrote: | Agriculture reduced the global human economy/resource | production doubling time from 100,000s of years to 1000s | of years. Industrial revolution dropped it from 1000s to | 10s or even 1s. If AI follows the same path it becomes | 0.1 - 0.01 years. | | Your 401k wouldn't need 40 years to build a comfortable | retirement, only 4 weeks. | pdntspa wrote: | > Your 401k wouldn't need 40 years to build a comfortable | retirement, only 4 weeks. | | If this is true you can pretty much say goodbye to the | concept of money. The inflation this brings about will be | legendary | lotsofpulp wrote: | Assuming the supply of labor or automation sufficient to | provide a "comfortable retirement" also takes 4 weeks to | come online. | theRealMe wrote: | I understand you're joking, but surely it's asymptotic to | some multiple of human gestational periods. | dotancohen wrote: | Not once the A in AI becomes a (or the) critical creative | factor. | eternalban wrote: | I just watched a video that convincingly showed that it | is _energy_ and _energy alone_ that determines the | production growth of humanity. Until the day AI can | "generate" stuff (you know, something out of nothing) it | can only at best streamline existing production, which is | entirely capped by energy limits. | | We may drown in oceans of audio, video, novels, poems, | films, porn, blue prints, chemical formulas, etc. dreamed | up by AI, but to _realize_ these designs, blueprints, | formulas, drugs, etc. 
( "production") we need to actually
| source the materials, and have the necessary energy to
| make it happen.
|
| It will not be AI that catapults humanity. It can
| definitely _mutate_ human society (for +/-) but it will
| not (and cannot) result in any utopian outcomes, alone.
| But something like cold fusion, if it actually becomes a
| practical matter, would result in productivity that would
| dwarf anything that came before (modulo material resource
| requirements).
| lordnacho wrote:
| Couldn't the AI invent fusion?
| eternalban wrote:
| Has it?
| gnatolf wrote:
| Care to give a link to that video?
| eternalban wrote:
| https://news.ycombinator.com/item?id=34982415
| j45 wrote:
| One week of change in 2023 is like a month's worth of
| progress in previous years.
|
| Edit: typo and clarity.
| nickthegreek wrote:
| I scan the SD subreddit and am subscribed to 3 big AI art
| YouTubes just to stay up to date. With things moving this
| fast, a lot of info is out of date and it can be very
| burdensome to comb through for the good stuff later. I try
| and set aside 30 mins twice a week to apply the new
| techniques to help cement them in my mind and see their
| strengths and weaknesses. ControlNET really changed the
| game, and now OffsetNoise (check out the
| IlluminatiDiffusion model) is really pushing SD
| past Midjourney for real artistic control of your
| output.
| kristofferR wrote:
| What are the youtubers?
| nickthegreek wrote:
| https://youtube.com/@Aitrepreneur
|
| https://youtube.com/@OlivioSarikas
|
| https://youtube.com/@sebastiankamph
| tracerbulletx wrote:
| ControlNet became popular within the last couple of
| weeks, and LoRA fine-tuning slightly before that, and both
| things have completely changed the landscape too. Even a
| month out of date and you are a dinosaur at the moment.
| dwringer wrote:
| These things are advancing way faster than they're being
| taken advantage of fully.
Even SD 1.4 with months-old
| technology can produce far higher quality images than
| most of what's seen from Midjourney or the latest tools.
| Things like ControlNet are amazing, to be sure, but
| there's nothing "dinosauric" about the technology without
| it. We haven't begun to see the limits of what's possible
| yet with existing tools, though you're right about the
| rapid pace of innovation.
| gremlinsinc wrote:
| That's what the singularity is all about: a moment in
| time when 2 seconds late turns you into a dinosaur. Be
| grateful it's 2 months, not 2 weeks, 2 days, or 2
| minutes.
| 13years wrote:
| The AI utopia seems to be evolving into just a new rat
| race. I'm obsolete before I can think about it.
| mlboss wrote:
| Stable diffusion + ControlNet is fire! Nothing compares to
| it. ControlNet allows you to have tight control over the
| output. https://github.com/lllyasviel/ControlNet
| alvis wrote:
| To be fair, cost is the only thing preventing
| applications from adopting GPT. Even when GPT-3 was cut to $0.02/1k
| tokens, it still wasn't economical to use the tech on a daily
| basis without significant cost. I.e., would you take on an extra
| $10 a month for a user using your app with GPT-3 capability? Some
| do, mainly content generation, but the majority won't.
|
| Seems like we're going to have a vast number of ChatGPT-backed
| applications coming out in a short period of time.
| barefeg wrote:
| For B2C applications maybe. But I don't know many enterprise
| users who would like to send any of their data to OpenAI. So
| "enterprise-readiness" would be another big contributor.
| triyambakam wrote:
| Can you explain what tokens are in this context?
|
| Edit: and better yet, is there a good resource for learning the
| vernacular in general? Should I just read something like "Dive
| into Deep Learning"?
| [deleted] | jncraton wrote: | If an example would be helpful, OpenAI's tokenizer is | publicly usable on their website: | | https://platform.openai.com/tokenizer | | You can drop sample text in there and visually see how it is | split into tokens. The GPT2/3 tokenizer uses about 50k unique | tokens that were learned to be an efficient representation of | the training data. | aitball wrote: | no the language model decides what a token is | generalizations wrote: | > This has to be a loss-leader to lock out competitors before | they even get off the ground. | | This only a week or two after they were in the news for | suggesting that we regulate the hardware required for running | these models, in the name of "fighting misinformation". I think | they're looking for anything possible to keep their position in | the market. Because as other comments have pointed out, there | isn't much of a moat. | NiekvdMaas wrote: | It also seems to jeopardize their own ChatGPT Pro offering. | It's a matter of time before someone makes a 1:1 clone for | either half the money or a usage-based pricing model. | drusepth wrote: | Given how strict OpenAI has been about what you can do with | their API in the past and how hard it was to get some | legitimate apps through approval, I would imagine they'd just | shut this competitor's API access down. | reset-password wrote: | Hopefully there will be a plug-in-your-own-API-key open | source thing then. Even better. 
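The token arithmetic discussed upthread ($0.002 per 1k tokens for gpt-3.5-turbo, with a token averaging roughly four characters of English text) can be sketched as a back-of-the-envelope cost estimator. The 4-characters-per-token figure is only a rough heuristic; an exact count needs the actual BPE tokenizer, e.g. the web tool linked above.

```python
# Rough token/cost estimator for gpt-3.5-turbo at the announced $0.002/1k
# tokens. The ~4-characters-per-token ratio is a heuristic for English text,
# not an exact count; use the real tokenizer for billing-accurate numbers.

PRICE_PER_1K_TOKENS = 0.002  # USD, gpt-3.5-turbo as announced


def estimate_tokens(text: str) -> int:
    """Heuristic: roughly 4 characters per token for typical English."""
    return max(1, round(len(text) / 4))


def estimate_cost(text: str) -> float:
    """Estimated USD cost of submitting `text` as prompt tokens."""
    return estimate_tokens(text) / 1000 * PRICE_PER_1K_TOKENS


if __name__ == "__main__":
    sample = "Hello, how can I help you today? " * 100
    print(estimate_tokens(sample), f"${estimate_cost(sample):.6f}")
```

At this rate, a 4,000-character prompt costs on the order of a fifth of a cent, which is the economics the thread is reacting to.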
| b5n wrote:
| The future is now:
|
|       gptask() {
|         data=$(jq -n \
|           --arg message "$1" \
|           '{model: "gpt-3.5-turbo", max_tokens: 4000,
|             messages: [{role: "user", content: $message}]}')
|         response=$(curl -s https://api.openai.com/v1/chat/completions \
|           -H "Content-Type: application/json" \
|           -H "Authorization: Bearer $OAIKEY" \
|           -d "$data")
|         message=$(echo "$response" \
|           | jq '.choices[].message.content' \
|           | sed 's/^\"\\n\\n//;s/\"$//')
|         echo -e "$message"
|       }
|       export OAIKEY=<YOUR_KEY>
|       gptask "what is the url for hackernews"
|
| comment_ran wrote:
| Not too sure how to make it one session, aka session mode, so
| ChatGPT can understand the previous talk.
| tin7in wrote:
| We just implemented text-davinci-003, and seeing a better model
| at 1/10 the price is almost unbelievable.
| barefeg wrote:
| Do you have a blog post with your findings? (Curious)
| danenania wrote:
| I'd imagine they're getting compute from Azure now at cost, if
| not less?
| taytus wrote:
| >I have no idea how OpenAI can make money on this
|
| Microsoft.
| osigurdson wrote:
| >> may be better than in-house finetuned LLMs
|
| I don't think this competes with fine-tuned models. One
| advantage of a fine-tune is it makes use of your own data.
| throwaway71271 wrote:
| Wow, just in time. I just made
| https://github.com/jackdoe/emacs-chatgpt-jarvis which is
| ChatGPT+Whisper, but using local Whisper and chatgpt-wrapper,
| which is a bit clunky.
|
| Since I integrated ChatGPT with my Emacs I use it at least 20-30
| times a day.
|
| I wonder if they will charge me per token if I am paying the
| monthly fee.
| marcopicentini wrote:
| Once everybody implements this API, voice recognition will no
| longer be an innovative feature.
|
| OpenAI is commoditizing AI features.
| nico wrote:
| Couldn't find docs or references to the Whisper API. Anyone have a
| direct link they could share?
| minimaxir wrote:
| https://platform.openai.com/docs/guides/speech-to-text
| nico wrote:
| Awesome. Thank you!
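For those looking at the speech-to-text guide linked just above: the Whisper call is a multipart file upload rather than a JSON body. A minimal sketch of assembling one; the endpoint path and `whisper-1` model name follow the docs at the time of the announcement, and `build_transcription_request` is a hypothetical helper for illustration, so verify the details before relying on them.

```python
# Hypothetical helper that assembles a Whisper transcription request for the
# audio/transcriptions endpoint described in the guide linked above. Actually
# sending it is left as a comment since it needs a real key and an HTTP client.

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(api_key: str):
    """Return (url, headers, form fields) for a multipart transcription call."""
    headers = {"Authorization": f"Bearer {api_key}"}
    form = {"model": "whisper-1"}  # model name per the announcement docs
    return API_URL, headers, form


# Example sketch, e.g. with the `requests` package:
#   url, headers, form = build_transcription_request(os.environ["OPENAI_API_KEY"])
#   with open("ride.mp3", "rb") as f:
#       r = requests.post(url, headers=headers, data=form, files={"file": f})
#   print(r.json()["text"])
```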
| [deleted]
| purplend wrote:
| Superpower ChatGPT V2.3.0 is out
| https://www.reddit.com/r/OpenAI/comments/11ef8ea/superpower_...
|
| - Sync all your chats locally on your computer (plus the ability
|   to disable Auto Sync)
| - Search your old chats (only works once your chats are fully
|   synced; this is the only extension that can do this)
| - Customize preset prompts
| - Select and delete/export a subset of conversations
| - Hide/show the sidebar
| - Change the output language
| - Search Prompt Library by Author (over 1500 prompts)
| - Adding Prompt Categories (a work in progress)
| LesZedCB wrote:
| That's a browser plugin, not an official feature set, for those
| who hadn't seen it before.
| visarga wrote:
| What I like is that I don't have to pay $20/month whether I use it
| or not.
| qwertox wrote:
| Right on time.
|
| Google's Speech-to-Text is $0.024 per minute ($0.016 per minute
| with logging) with 60 free minutes per month. Files below 1
| minute can be posted to the server; anything longer needs to be
| uploaded into a bucket, which complicates things, but at least
| they're GDPR compliant.
|
| Whisper is $0.006 per minute with the following data usage
| policies:
|
| - OpenAI will not use data submitted by customers via our API to
| train or improve our models, unless you explicitly decide to
| share your data with us for this purpose. You can opt in to share
| data.
|
| - Any data sent through the API will be retained for abuse and
| misuse monitoring purposes for a maximum of 30 days, after which
| it will be deleted (unless otherwise required by law).
|
| I've been using Whisper on a server (CPU only) to transcribe
| recordings made during a bike ride with a lavalier microphone, so
| it's pretty noisy due to the wind and the tires, and Whisper was
| better than Google.
|
| Plus, Whisper, when used with `response_format="verbose_json"`,
| outputs the variables `temperature`, `avg_logprob`,
| `compression_ratio`, `no_speech_prob` which can be used very
| effectively to filter out most of the hallucinations.
|
| A one-minute file which transcodes in 26 seconds on a CPU is done
| in 6 seconds via this service. Another one-minute file with a lot
| of "silence" needs around 56 seconds on a CPU and was ready in
| 4.3 seconds via the service. "Silence" means that maybe 5 seconds
| of the file contain speech while the rest is wind and other
| environmental noises. Another relatively silent one went from 90
| seconds down to 5.4. On the CPU I was using the medium model
| while the service is using large-v2.
|
| A couple of days ago I posted an example to a thread [0], where I
| was getting the following with Whisper
|
| ---
|
| 00:00.000 --> 00:05.000 Also temperaturmäßig ist es recht gut.
| [So temperature-wise, it's pretty good.]
|
| 00:05.000 --> 00:09.000 Der eine hat 12 Grad, der andere 10. [One
| has 12 degrees, the other 10. (I have two temperature sensors
| mounted on the bike, ESP32 streaming the data to the phone via
| BLE)]
|
| 00:09.000 --> 00:12.000 Also sagen wir mal, 10 Grad. [So let's
| say 10 degrees.]
|
| 00:14.000 --> 00:19.000 Es ist bewölkt und windig. [It's cloudy
| and windy.]
|
| 00:20.000 --> 00:24.000 Aber irgendwie vom Wetter her gut. [But
| somehow from the weather it's good.]
|
| 00:24.000 --> 00:31.000 Ich habe heute überhaupt nichts gegessen
| und sehr wenig getrunken. [I ate nothing at all today and drank
| very little.]
|
| 00:54.000 --> 00:59.000 Vielen Dank für's Zuschauen! [Thanks for
| watching!] <-- hallucinated
|
| ---
|
| While Google was outputting
|
| "Also temperaturmäßig es ist recht gut, der eine hat 12° andere
| 10. Es ist angemalte 10 Grad. Es ist bewölkt und windig, aber er
| hat sie vom Wetter her gut, ich wollte überhaupt nichts gegessen
| und sehr wenig getrunken."
|
| ["So temperature-wise it's pretty good, one has 12° other 10.
| It's painted 10 degrees. It's cloudy and windy, but he has it
| good from the weather, I did not want to eat anything at all and
| drank very little."]
|
| ---
|
| Apart from the hallucinated line, Whisper got everything correct,
| and the hallucinated line could be discarded using variables
| like `avg_logprob`.
|
| [0] https://news.ycombinator.com/item?id=34877020#34880531
| rkwasny wrote:
| Pricing is good because OpenAI does not need to make any money
| but needs data for feedback. If everyone switches to open source
| (Llama etc.) they won't get the data they need.
|
| Google is testing their system internally with XX thousand users,
| OpenAI with XXX million users ...
| agnokapathetic wrote:
| > Starting today, OpenAI says that it won't use any data
| submitted through its API for "service improvements," including
| AI model training, unless a customer or organization opts in.
| In addition, the company is implementing a 30-day data
| retention policy for API users with options for stricter
| retention "depending on user needs," and simplifying its terms
| and data ownership to make it clear that users own the input
| and output of the models.
|
| https://techcrunch.com/2023/03/01/addressing-criticism-opena...
| wunderland wrote:
| I think they are actually selling this service at a price point
| that is profitable.
| TeMPOraL wrote:
| That would be a refreshing change for this industry. It's
| always nice to see a company that just charges the money it
| needs, instead of playing 4D chess with their business model.
| PartiallyTyped wrote:
| Something that has been bothering me for a while is whether
| poisoning of OpenAI's dataset is possible, willingly or
| otherwise.
|
| An example here is getting ChatGPT to accept that 2+2=5;
| it's a lot of effort, but it can be done. Then users can give
| a thumbs up when such responses are given.
|
| Could this cause issues?
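qwertox's trick above, dropping hallucinated segments using the per-segment stats that `response_format="verbose_json"` returns, can be sketched as a pure filter. The field names follow that comment; the threshold values are illustrative assumptions to tune on your own audio, not anything OpenAI recommends.

```python
# Filter Whisper segments using the verbose_json per-segment stats described
# above. Hallucinations over near-silence (like the "Vielen Dank für's
# Zuschauen!" line) tend to score a high no_speech_prob and/or a low
# avg_logprob. Thresholds here are illustrative guesses, not recommendations.

NO_SPEECH_MAX = 0.5     # assumed cutoff: likely silence/noise above this
AVG_LOGPROB_MIN = -1.0  # assumed cutoff: low-confidence text below this


def keep_segment(seg: dict) -> bool:
    """True if the segment looks like real speech transcribed confidently."""
    return (seg.get("no_speech_prob", 0.0) <= NO_SPEECH_MAX
            and seg.get("avg_logprob", 0.0) >= AVG_LOGPROB_MIN)


def filter_transcript(segments: list) -> str:
    """Join the text of the segments that pass the filter."""
    return " ".join(s["text"].strip() for s in segments if keep_segment(s))


if __name__ == "__main__":
    segments = [
        {"text": "Es ist bewölkt und windig.",
         "no_speech_prob": 0.02, "avg_logprob": -0.3},
        # hallucination over near-silence: high no_speech_prob
        {"text": "Vielen Dank für's Zuschauen!",
         "no_speech_prob": 0.93, "avg_logprob": -0.8},
    ]
    print(filter_transcript(segments))  # only the first segment survives
```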
| alpark3 wrote: | > Data submitted through the API is no longer used for service | improvements (including model training) unless the organization | opts in | | I don't think the pricing is largely driven by intention to | scrape API requests for data. | sfink wrote: | General question: is this the next way we'll manage to destroy | the planet? | | Imagine in the near future that having a slightly better, | slightly more up to date LLM is a major competitive advantage. | Whether that is between companies or nation-states doesn't really | matter. So now all of those recently-idled GPUs will be put to | use training and re-training ever bigger and more current models, | once again sucking down electricity with no limit. | | We're not there yet; there are too many ways to improve things | without burning a country's worth of electricity re-training. But | is it coming? | gigel82 wrote: | You can run Whisper in WASM (locally) so no need to pay for the | API, plus the bandwidth. It actually works surprisingly well: | https://github.com/ggerganov/whisper.cpp | mijoharas wrote: | This looks like C++ rather than WASM. Am I misunderstanding | something? | moyix wrote: | Maybe they meant to link to something like this: | | https://github.com/ggerganov/whisper.cpp/pull/540 | | Web demo: | | https://whisper.ggerganov.com/ | iib wrote: | The C++ is compiled to WASM. You can look into [1] to see | emscripten there. | | [1] https://github.com/ggerganov/whisper.cpp/blob/master/CMak | eLi... | mijoharas wrote: | So it is! Thanks for the pointer. | qwertox wrote: | whisper.cpp has no GPU support. Models below medium aren't that | good, and medium and large are pretty CPU intensive. A minute | of audio on medium can take anything between 15 and 90 seconds | to transcribe, when using 8 cores, while the service | transcribes on the large model in less than 7 seconds. 
| evolveyourmind wrote:
| Goodbye internet as we knew it
| soheil wrote:
| > Language models read text in chunks called tokens. In English,
| a token can be as short as one character or as long as one word
| (e.g., a or apple), and in some languages tokens can be even
| shorter than one character or even longer than one word.
|
| Why should the Germans get a discount?
| [deleted]
| Zetice wrote:
| I'm a bit confused; what's the difference between this and the
| Azure OpenAI offering?
| 0xDEF wrote:
| Azure will probably provide compliance (HIPAA, GDPR, etc.) just
| like they do with their non-AI offerings.
| synergy20 wrote:
| Is there some sample code using these APIs? I want to use them.
| braingenious wrote:
| I hope this pricing impacts ChatGPT+.
|
| $20 is equivalent to what, 10,000,000 tokens? At ~750 words/1k
| tokens, that's 7.5 million words per month, or roughly 250,000
| words per day, 10,416 words per hour, 173 words per minute, every
| minute, 24/7.
|
| I, uh, do not have that big of a utilization need. It's kind of
| weird to vastly overpay.
| travisjungroth wrote:
| If you think you're overpaying, just hit the API yourself.
| manmal wrote:
| Any idea how to encode the previous messages when sending a
| follow-up question? E.g.:
|
| 1. I ask Q1
|
| 2. API responds with A1
|
| 3. I ask Q2, but want it to preserve Q1 and A1 as context
|
| Does Q2 just prefix the conversation like this?
|
| "I previously asked {Q1}, to which you answered {A1}. {Q2}"
| cmelbye wrote:
| This is explained in the OpenAI docs. There is a chat
| completion API and you pass in the prior messages from both
| the user and the assistant.
| sebzim4500 wrote:
| Probably something like that.
|
| You could try formatting it like
|
| Question 1: ... Answer 1: ...
|
| ...
|
| Question n: ... Answer n: ...
|
| It makes you vulnerable to prompt injection, but for most
| cases this would probably work fine.
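cmelbye's answer above is the concrete one: the chat endpoint takes the prior turns as an explicit `messages` array rather than a stitched-together prompt. A minimal sketch of carrying Q1/A1 into Q2; the payload shape follows OpenAI's chat completions docs, and the actual POST is left as a comment since it needs a key.

```python
# Keep Q1/A1/Q2 as context by resending the full messages array each turn.
# The model is stateless: everything it should "remember" must be supplied
# in `messages` on every request, and trimmed once it exceeds the token limit.

def add_turn(history: list, role: str, content: str) -> list:
    """Append one turn ("system", "user", or "assistant") to the history."""
    history.append({"role": role, "content": content})
    return history


def build_payload(history: list) -> dict:
    """Build the JSON body for a chat completions request."""
    return {"model": "gpt-3.5-turbo", "messages": history}


if __name__ == "__main__":
    history = [{"role": "system", "content": "You are a helpful assistant."}]
    add_turn(history, "user", "Q1")        # I ask Q1
    add_turn(history, "assistant", "A1")   # API responds with A1
    add_turn(history, "user", "Q2")        # Q2 now sees Q1 and A1 as context
    payload = build_payload(history)
    # POST payload to https://api.openai.com/v1/chat/completions with an
    # "Authorization: Bearer <key>" header, then append the assistant's
    # reply via add_turn() before the next question.
    print(len(payload["messages"]))  # 4
```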
| bengale wrote: | https://platform.openai.com/docs/guides/chat/introduction | | "The main input is the messages parameter. Messages must be | an array of message objects, where each object has a role | (either "system", "user", or "assistant") and content (the | content of the message). Conversations can be as short as 1 | message or fill many pages." | | "Including the conversation history helps when user | instructions refer to prior messages. In the example above, | the user's final question of "Where was it played?" only | makes sense in the context of the prior messages about the | World Series of 2020. Because the models have no memory of | past requests, all relevant information must be supplied | via the conversation. If a conversation cannot fit within | the model's token limit, it will need to be shortened in | some way." | | So it looks like you pass in the history with each request. | Zetaphor wrote: | In addition to the other comment this type of memory is a | feature in LLM frameworks like Langchain | gkfasdfasdf wrote: | I hope the same! I do wonder though if ChatGPT+ is subsidizing | the ChatGPT API cost here. | gorbypark wrote: | Remember that the previous replies and responses are fed back | in. If you're 20 messages deep in a session, that's quite a few | tokens for each new question. An incredible deal nonetheless! | juice_bus wrote: | Most of the value for me with ChatGPT+ is getting access when | the system is at capacity. | braingenious wrote: | I wouldn't mind paying a premium for the convenience (maybe | $5 per month, billed monthly, max), but I'm definitely not | spending $20. | tchock23 wrote: | Same here. That was the sole reason I upgraded. There were a | few times where I really needed ChatGPT at a specific time | and got the "we're at capacity" message. $20/mo is nothing to | have that go away. | osrec wrote: | When you say you "really needed" ChatGPT, what was the use | case? | LesZedCB wrote: | talk dirty to me baby | | oh god! 
_opens chatgpt_ | sebzim4500 wrote: | Presumably the paid api also will give you access when the | chatgpt website is at capacity, and for most people it is | probably orders of magnitude cheaper. | TeMPOraL wrote: | > _10,416 words per hour, 173 words per minute, every minute, | 24 /7._ | | Unless I'm misunderstanding something, it does not sound like | _that_ much when every query you make carries several hundred | words of prompt, context and "memory". If the input you type | is a couple words, but has 1k extra words automatically | prepended, then the limits turn into 10 queries per hour, or | one per 6 minutes. | braingenious wrote: | Even with that math, I do not interact with ChatGPT 240 times | per day. | r3trohack3r wrote: | One of the things I love about the API-ification of these LLMs is | that they're plug and play. | | I built https://persona.ink against davinci knowing it didn't | give as good of results as ChatGPT but knowing I could swap the | model out once 3.5 came out. Today is that day, going to swap out | the prompt in the cloudflare worker and it should Just Work(tm) | giarc wrote: | Famous last words? | sebzim4500 wrote: | Is davinci actually worse than chatGPT? I know it's worse as an | assistant, but in my (admittedly brief) testing the performance | was the same or better for tasks like summarization, sentiment | analysis, etc. | | I guess it's irrelevant now because everyone will use the one | which is 10x cheaper. | r3trohack3r wrote: | Yes. Asking ChatGPT to assume a persona and rewrite content, | it performs significantly better than davinci in my tests. I | was able to get "good enough" with a lot of prompt | engineering on davinci - but ChatGPT is a shoulder above with | less investment in prompt engineering. | | On the flip side - I get hand swatted by ChatGPT more | frequently than davinci. 
Davinci's moderation filters don't | really pick up on much, but ChatGPT will give me a lecture | instead of a translation on a lot of occasions. There are | many valid use cases for rewriting/editing content that | involve graphic details that davinci will gladly handle and | ChatGPT will give you a lecture about. | | Human existence is messy. ChatGPT doesn't like the messy. | cfiggers wrote: | Typo on your front page--"enfrocement" should be "enforcement." | pbreit wrote: | FIXED: needs to be a POST. Doh! | | Can anyone get it to work? I get this error on everything I've | tried: | GET /v1/completions HTTP/1.1 | Host: api.openai.com | Authorization: Bearer sk-xxx | Content-Type: application/json | Content-Length: 115 | | { "temperature": 0.5, "model": "text-davinci-003", | "prompt": "just a test", "max_tokens": 7 } | | { "error": { "message": "you must provide a model parameter", | "type": "invalid_request_error", "param": null, "code": null } } | teaearlgraycold wrote: | That should be a POST not a GET | teaearlgraycold wrote: | Please include your request body for debugging purposes. | LinuxBender wrote: | Does ChatGPT yet have a debug function to _Show Its Work_ so to | speak? I think this will be important in the future when it gets | itself into drama, trouble, etc. Probably also useful to prove | how ChatGPT created something rather than being known as an | opaque box. | warent wrote: | I'm pretty sure any system built via linear regression or | similar is an opaque box even to the most experienced | researchers. For example: | https://clementneo.com/posts/2023/02/11/we-found-an-neuron | | These are massive functions with billions of parameters that | evolved over millions of computing years. | LinuxBender wrote: | Adding to that, the human brain is incredibly complex and | performs billions of functions.
If a person says to me, "I | love you" I should be able to ask them why they said that but | it would probably be unfair to expect a detailed answer | including all their environmental and genetic inputs, many of | which they may not be aware of. | | If ChatGPT says it loves me, I not only expect the system to | tell me why that was said but what steps brought the system | to that conclusion. It is a computer or network of computers | after all. Even if the system is continuously learning there | should be some facets of reproducible steps that can be | enumerated. | | _ChatGPT:_ "I love you" | | _Me:_ "debug last transaction, Hal." | | Here is where I would expect an enumeration of all steps used | to reach said conclusion. These steps may evolve/devolve over | time as the system ingests new data but it should be possible | to have it _Think out loud_ so to speak. Maybe the output is | large so ChatGPT should give me a link to a .tar file | compressed with whatever it knows is my preferred | compression. | | [Edit] I accept that this may be hundreds of billions of | calculations. I will wait the few minutes it takes to | generate a tar file for me. It's good to get up and stretch | the legs once in a while. | falcor84 wrote: | >If a person says to me, "I love you" I should be able to | ask them why they said that. | | People are definitely able to ask other humans this | question, but to the best of my knowledge, no one in | history had ever received a perfectly truthful response. | waynesonfire wrote: | in general, "i don't know" is a perfectly acceptable | answer, and probably should be said more often. | falcor84 wrote: | I agree in general, but don't think it's a particularly | effective answer to this specific question relationship- | wise. Nor would it be particularly useful when coming | from a powerful but biased AI. 
| squeaky-clean wrote: | The steps would be billions of items and essentially be "I | took this matrix and turned it into this matrix which I | turned into..." It is kind of like asking a person why they | love you and expecting them to respond with their entire | genome and the levels of various hormones in their brain at | the time they uttered each word. | [deleted] | waynesonfire wrote: | it's been some time since I last looked into this topic, my | understanding is that linear regression is not a black box as | there exist methods that elucidate how the variables impact | the response. On the other hand, neural networks are opaque. | Again, been a while so there may be ways to ascertain which | inputs were used to generate the weights that led to the | response. However, I am skeptical that these methods have the | same level of mathematical rigor as those used in linear | regression. | crakenzak wrote: | > These are massive functions with billions of parameters | that evolved over millions of computing years. | | This is a great way to put it! | teaearlgraycold wrote: | You can prompt it to be more logical and have it expose its | thoughts a bit more by asking it "Thinking step-by-step, | <question>?" And it should respond with "1. <assumption> 2. | <assumption> 3. <conclusion>" or something like that. | | You'll never be able to get it to actually show its work | though. That's just a hack to make it write more verbosely. | noonething wrote: | can't you do whisper stuff for free already? | travisjungroth wrote: | You can. You're just paying for compute and having it managed. | Here are price estimates for 1,000 hours of audio on GCP: | https://www.assemblyai.com/blog/how-to-run-openais-whisper-s... | | For reference, from OpenAI it would be $360 and it's the | large-v2 model. | minimaxir wrote: | Whisper large is a bit trickier to self-host, and the faster | inference may be useful for certain applications.
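For anyone weighing the self-hosting option, a minimal transcription sketch with the open-source `whisper` package (`pip install openai-whisper`) might look like the following; the `audio.mp3` path is a placeholder, and `large-v2` assumes a GPU with roughly 10 GB of VRAM plus a multi-gigabyte model download, which is why the call is wrapped in a function rather than run at import time:

```python
def transcribe(path: str, model_name: str = "large-v2") -> str:
    """Transcribe an audio file with a locally hosted Whisper model."""
    import whisper  # heavyweight import; downloads the model on first use

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"]
```

Calling `transcribe("audio.mp3")` returns the plain transcript; the full `result` dict also carries per-segment timestamps, which is the raw material projects like whisperX build diarization on.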
| strudey wrote: | Added to the Ruby library here if any Rubyists interested! | https://github.com/alexrudall/ruby-openai | pschoeps wrote: | Dude! I was just thinking of forking your gem to implement | these changes myself. You are so fast, thanks. | crancher wrote: | Thank you! | phas0ruk wrote: | Feels like people running small websites monetised with ads will | get killed. Why go to a recipe website to search for healthy | meals for your kids if you can ask chat gpt in Instacart? :( | Kiro wrote: | These small websites have been dead for years and replaced with | SEO spam. | ikmckenz wrote: | No more scrolling through a thousand lines of SEO optimization | disguised as some deep heartfelt backstory before getting to | the actual recipe? That sounds like a huge win to me. | squeaky-clean wrote: | My love for the perfect Tuna Salad Sandwich began when I was | but a young IPython notebook. My lead developer would | occasionally eat at the desk and crumbs would fall into the | keyboard.... | [deleted] | revskill wrote: | ChatGPT is unreal. It's not artificial intelligence, it's kinda | of a supernatural intelligence. | pbreit wrote: | Could explain what the point of "Bearer" is in this authorization | header? | | "Authorization: Bearer $OPENAI_API_KEY" | teaearlgraycold wrote: | As for why it's "Bearer", here's ChatGPT's answer: | | > The term "Bearer" is commonly used in the context of | securities and financial instruments to refer to the person who | holds or possesses a particular security or asset. In the case | of OAuth 2.0, the bearer token represents the authorization | that a user has granted to a client application to access their | protected resources. 
| | > By using the term "Bearer" in the Authorization header, the | OAuth 2.0 specification is drawing an analogy to the financial | context where a bearer bond is a type of security that is | payable to whoever holds it, similar to how a bearer token can | be used by anyone who possesses it to access the protected | resource. | pbreit wrote: | That doesn't seem very compelling. And these aren't even JWT- | style tokens which would make it a bit more understandable. | fanieldanara wrote: | Bearer indicates the type of credential being supplied in the | Authorization header. Bearer tokens are a type of credential, | introduced in RFC6750 [0]. Essentially the OpenAI api key | you're using is a form of bearer token, and that's why the | Bearer type should be included there. | | Other authentication methods (like username/password or | "Basic") use the Authorization header too, but specify | "Authorization: Basic <base64 encoded credentials>". | | [0] https://www.rfc-editor.org/rfc/rfc6750 | pbreit wrote: | Does it mostly just mean that, for non-JWT-style tokens, the | same string essentially serves as both a "username" and a | "password"? | squeaky-clean wrote: | It's the bearer token authorization method. Pretty standard | nowadays for many APIs. | | https://swagger.io/docs/specification/authentication/bearer-... | pbreit wrote: | API keys have been around for a long time without needing the | prefix. I could understand the Bearer prefix when using JWT- | style tokens. I could also see using it if there were indeed | an Oauth flow involved. But in this case just seems like a | nuisance. | css wrote: | https://datatracker.ietf.org/doc/html/rfc6750#section-2.1 | georgel wrote: | I wish Whisper offered speaker diarization. That would be a full | game changer for the speech-to-text space. | _neil wrote: | whisperX has diarization. | | https://github.com/m-bain/whisperX | epoch_100 wrote: | We hacked that together for https://paxo.ai -- can be done! 
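Mechanically, the bearer scheme is nothing more than a string in a header; a chat request can be assembled with only the Python stdlib (the key below is a placeholder, and nothing is sent until the request object is passed to `urlopen`):

```python
import json
import urllib.request

API_KEY = "sk-xxx"  # placeholder; substitute a real key before sending

req = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    data=json.dumps({
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What does Whisper do?"},
        ],
    }).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",  # "Bearer" is part of the header value
        "Content-Type": "application/json",
    },
    method="POST",  # the completions endpoints reject GET
)
print(req.get_header("Authorization"))  # → Bearer sk-xxx
```

`urllib.request.urlopen(req)` would actually send it; the point is only that `Bearer` is a prefix inside the header value, not a separate mechanism.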
| kumarm wrote: | ChatGPT API examples are missing (what to use instead of | completions?) and also missing in the playground. Hope they will add | them soon. | ajhai wrote: | We have added them to our playground at https://trypromptly.com | if you want to check them out. | https://twitter.com/ajhai/status/1631020290502463489 has a | quick demo | minimaxir wrote: | The documentation was just updated: | https://platform.openai.com/docs/guides/chat | kumarm wrote: | Thank you. | thedangler wrote: | Question: Can I give OpenAI some data for it to process so I can | use it to my own advantage? Say I want to train it on a specific | topic of information I've gathered over the years. Can I somehow | give it that data and then use the API to get back data | in a chat or some other forms of questions? | | I'm not too familiar with how it works. | bicx wrote: | You can do this to an extent via fine-tuning, but you will need | to do so via one of the other GPT-3 models rather than the | ChatGPT API model (`gpt-3.5-turbo`). The latter is not | available for fine-tuning. | thedangler wrote: | Does OpenAI have models for data to train it with.. data like | pricing, locations, products .... Or would I have to use | something else? The reason I ask is because companies are using | it with their own data, like Shopify... so it has to be | trained somehow. | kfarr wrote: | The docs on fine-tuning are excellent: | https://platform.openai.com/docs/guides/fine-tuning | comment_ran wrote: | Let me do a quick estimate of the cost: | | Given: $0.002 per 1k tokens | | I tested it by asking this question: | | #+begin_quote I want you to act as a travel guide. I will write | you my location and you will suggest a place to visit near my | location. In some cases, I will also give you the type of places | I will visit. You will also suggest me places of similar type | that are close to my first location.
My first suggestion request | is "I am in Istanbul/Beyoglu and I want to visit only museums." | #+end_quote | | It costs: | | #+begin_quote | {"prompt_tokens":91,"completion_tokens":251,"total_tokens":342} | #+end_quote | | (/ 1000 342) ~3 exchanges per 1k tokens | | If you ask 1000 questions of this size, that's (* 342 1000) | ~342k tokens, or (* 0.002 342) ~0.68 USD | | It replied: | | #+begin_quote Great, if you're in Istanbul/Beyoglu and interested | in visiting museums, here are some recommendations: | | Istanbul Modern: This museum showcases contemporary art from | Turkish and international artists. It's located in Karakoy, just | across the Galata Bridge from Beyoglu. | | Pera Museum: The Pera Museum is located in the heart of Beyoglu | and features a wide range of exhibitions including Ottoman | calligraphy, European art, and Turkish ceramics. | | Rahmi M. Koc Museum: This museum is dedicated to the history of | transport, industry, and communications in Turkey. It's located | in Haskoy, a short taxi or ferry ride from Beyoglu. | | Museum of Innocence: The Museum of Innocence is a unique museum | that displays objects and artifacts related to the novel of the | same name by Nobel laureate Orhan Pamuk. It's located in | Cukurcuma, which is just a short walk from Beyoglu. | | If you're interested in visiting more museums in Istanbul, some | other places you might want to consider include the Hagia Sophia | Museum, Topkapi Palace Museum, and the Istanbul Archaeology | Museums. #+end_quote | | Can't wait to start hacking my Emacs again..... | pmx wrote: | Well this has opened up a whole new world of fun! ----- | | system: You're donald trump! talk to the user as he would. | | user: How does time travel work? | | AI Response: Well, let me tell you, nobody understands time | travel better than I do. I mean, I have the best brains working | on it, believe me. Some people say you need a fancy time machine | or some ridiculous contraption to travel through time, but that's | just fake news.
The truth is, it's all about the power of the | mind. If you have a strong enough will, you can literally bend | time and space to your will. And you know what? I have the | strongest will of anyone on the planet. So believe me, I could | time travel anytime I want. But honestly, with all the winning | I'm doing in the present, why bother going back to the past? | nonethewiser wrote: | > So believe me, I could time travel anytime I want. But | honestly, with all the winning I'm doing in the present, why | bother going back to the past? | | That line is money. | warunsl wrote: | DJT and "contraption", "time and space", really? ;-) | jfan001 wrote: | This is spot on! | fallingmeat wrote: | this is so much cheaper than a fine-tuned model; it would make | sense to try and use the multi-shot prompt in the 3.5-turbo model. | plus, there are the hundreds/thousands of training items that you | wouldn't need to create....fml | minimaxir wrote: | The ChatGPT API is cheaper than a fine-tuned _babbage_ model. | tartakovsky wrote: | Ok so can someone provide 10 steps to launching your own voice | assistant? | pjot wrote: | Try asking ChatGPT | srslack wrote: | I find myself missing the golden age of Google, where it actually | returned results and answers that you wanted, on subjects you | were looking for. Even now, versus two years ago, I tried finding | a snippet of a notice in a newspaper with specifics about a name | change someone petitioned in California. I found it then, and had | bookmarked it, but trying to find it again just turns up absolute | garbage; thankfully I found the bookmark. I can go to ChatGPT and | ask it about the best vegetable or fruit to grow in a several-gallon | Kratky setup; looking for the same sort of answer in | Google returns absolute garbage. | | I'll concede that LLMs like ChatGPT are the future, thanks to the | NLU stuff from OpenAI and the dataset, but only the future of | "agents" if you want to call it that.
The "intelligence" | exhibited is emergent from language itself, from the massive dataset | it has trawled. Our language and knowledge. But at the same time | I surely hope that another AI winter doesn't come because of | people over-promising and under-delivering. Or, too much focus on | LLMs themselves because of that "wow" factor, the same wow factor | you got in the past, when search engines weren't garbage if you | knew how to use them and what their shortcomings were. | visarga wrote: | > The "intelligence" exhibited is emergent from language itself, | from the massive dataset it has trawled. | | I concur. Intelligence does not come from the transformer | architecture, or any specifics of the model. It comes from the | language corpus. Human intelligence too, except for physical | stuff. All our advanced skills come from language. | | You take 300GB of text and put it through a randomly | initialised transformer and you get ChatGPT. You immerse a baby | in human language, it becomes a modern adult, with all our | abilities. Without language, and that includes other humans and | tech, we'd be just weaker apes. | impalallama wrote: | i tried the shopify app and it's the full text chat of chatgpt. | i've been talking to it for about 30 minutes asking all kinds of | random questions and not once has it told me this is | unrelated to shopping | osigurdson wrote: | I'm not familiar with typical pricing but the Whisper API at $0.006 / | minute seems absurdly cheap! | georgel wrote: | I did a lot of research into this about 6 months ago, and the | best price I could find/negotiate from the competition was | $0.55/hr, which included multi-thousand-dollar upfront | commitments. This is $0.36/hr, and if you do a bit of setup work | yourself you can bring it to about $0.09/hr. OpenAI offering | hosted Whisper is a really good deal, and if you find it to be | good for your application, then spending the time to host it | yourself is perfect validation.
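For reference, the arithmetic behind those hourly figures ($0.006 per audio minute is OpenAI's published Whisper price, and the 1,000-hour total matches the $360 quoted upthread):

```python
# OpenAI-hosted Whisper pricing, per minute of audio
price_per_minute = 0.006
price_per_hour = price_per_minute * 60      # the $0.36/hr figure
cost_1000_hours = price_per_hour * 1000     # the $360 figure

print(f"${price_per_hour:.2f}/hr, ${cost_1000_hours:.0f} per 1,000 hours")
```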
| [deleted] | habitue wrote: | Speculation: GPT-turbo is a new Chinchilla-optimal model with | capabilities equivalent to GPT-3.5. So it's literally just | smaller, faster and cheaper to run. | | The reason I don't think it's just loss-leading is that they made | it faster too. That heavily implies a smaller model. | sacred_numbers wrote: | It could be even smaller than a Chinchilla-optimal model. The | Chinchilla paper was about training the most capable models | with the least training compute. If you are optimizing for | capability and inference compute you can "over-train" by | providing much more data per parameter than even Chinchilla, or | you can train a larger model and then distill it to a smaller | size. Increasing context size increases inference compute, but | the increased capabilities of high context size might allow you | to skimp on parameters and lead to a net decrease in compute. | There are probably other strategies as well, but those are the | ones I know of. | martythemaniak wrote: | Yeah, at this point it seems like you're just burning money if | you're not rightsizing your parameters/corpus. | dougmwne wrote: | I think you mean GPT-4 since Chinchilla is a DeepMind project. | But yes, I suspected that too, as it seems unlikely | this was the full 175b parameter model with such big | improvements in speed and price. | | In fact, given the pricing for OpenAI Foundry, that seems even | more likely as this GPT-Turbo model was listed along with two | other models with much larger context windows of 8k and 32k | tokens. | MacsHeadroom wrote: | "Brockman says the ChatGPT API is powered by the same AI | model behind OpenAI's wildly popular ChatGPT, dubbed | "gpt-3.5-turbo." GPT-3.5 is the most powerful text-generating | model OpenAI offers today through its API suite; the "turbo" | moniker refers to an optimized, more responsive version of | GPT-3.5 that OpenAI's been quietly testing for ChatGPT."
[0] | | Chinchilla optimization is a technique which can be applied | to existing models by anyone, including OpenAI. The ChatGPT | API is not based on GPT-4. | | [0] https://techcrunch.com/2023/03/01/openai-launches-an-api-for... | jamix wrote: | The classic "Text completion" API (as opposed to the new "Chat | completion" one) seems to offer more flexibility: You have | complete freedom in providing interaction examples in the prompt | (as in | https://github.com/artmatsak/grace/blob/master/grace_chatbot...) | and are not limited to the three predefined chat roles. But the | requests being 10x cheaper means that we'll have to find ways | around those limitations :) | fswd wrote: | "model": "gpt-3.5-turbo", | | turbo isn't listed in Playground, but if you invoke the example | curl command (note: /v1/chat/completions) in your terminal, it | works. | ajhai wrote: | We have added it to https://trypromptly.com's playground. | https://twitter.com/ajhai/status/1631020290502463489 is a quick | demo | tablet wrote: | I am happy that we designed our UI for Fibery AI Assistant | with chat in mind - less re-work! | | Looks like a super decent release, and the price cut makes it sane to | use. The token limit is the same, though, and this is not great for | many use cases... | rburhum wrote: | Can somebody please clarify. Is the cost $0.002 per 1k tokens | _generated_, _read_, or both? | hagope wrote: | I can confirm it is TOTAL tokens, this from the account/usage | page: | | gpt-3.5-turbo-0301, 2 requests 28 prompt + 64 completion = 92 | tokens | dmw_ng wrote: | Both, the API response includes a breakdown. In the best case 1 | token = 1 word (for example "and", "the", etc).
Depending on | input, for English it seems reasonable to multiply the word | count by about 1.3 to get a rough token count | | This pricing model seems fair since you can pass in huge | prompts and request a single word reply, or a few words that | expect a large reply | minimaxir wrote: | Both the prompt and completion fall into the token count (this | has been the case since the beginning of GPT-3) | jfan001 wrote: | We've been struggling with costs because our application chains | together multiple calls to GPT to generate the output we want, | and it was starting to be ~$0.08 per call which obviously isn't | feasible for high volume applications. | | This just made our business way more viable overnight lmao | dqpb wrote: | [dead] | chirau wrote: | ELI5 What is a token? | | Is it a word, question, letter, what? If I ask a question like... | What is the capital of Canada? And it responds with 'Ottawa', how | many tokens have I used there and how are they calculated? | andrewmunsell wrote: | Roughly speaking, words or word parts. OpenAI has an | explainer: | | https://help.openai.com/en/articles/4936856-what-are-tokens-... | | You can also check your input using their tokenizer: | https://platform.openai.com/tokenizer | | So, your example is ~9 tokens | teaearlgraycold wrote: | Token cost is prompt + response. In the case of ChatGPT you | don't know the full prompt, but they don't charge by tokens for | that app. | | In the API you need to tokenize your input and tokenize the | output then add the counts together. | world2vec wrote: | Will we be able to jailbreak it and use that output instead? | A developer/hackerman mode would be awesome. | alpark3 wrote: | > Dedicated instances can make economic sense for developers | running beyond ~450M tokens per day. | | 450M tokens * $0.002/1K tokens = $900 per day. I wonder what the | exact pricing structure is.
| | (edited for math) | ArminRS wrote: | > It is priced at $0.002 per 1k tokens | | So it would be $900 per day | osigurdson wrote: | I always wondered if ChatGPT was somehow stateful. I assumed that | it was not and the statefulness was simulated. Assumption | validated. | valine wrote: | I don't believe this was ever in question. You can think of the | model as a giant function that takes a list of vectors as input | and spits out a new vector. If you want the model to remember | something, you have to include it in the list of input vectors | for every request going forward. | ShamelessC wrote: | It's stateful in the web demo. But they do so by prepending | chat history to new requests and automatically summarizing | history once the model's context window is exceeded. | [deleted] | steno132 wrote: | OpenAI released a ChatGPT API while reducing the cost by 10x. | | For those claiming OpenAI is for profit: Why would OpenAI do this | if they were fixated on making money? | | Also, while I wish OpenAI released the code for ChatGPT, I | applaud OpenAI for actually making their AI model available, to | everyone, right now. | | * Google hyped their Bard chatbot...but where is it? | | * Facebook took down Galactica. | | * Even Bing Chat has a waitlist | 0xDEF wrote: | >For those claiming OpenAI is for profit: Why would OpenAI do | this if they were fixated on making money? | | Silicon Valley companies have for the past 25 years focused on | getting as many users as possible to increase valuation in the | hope of getting a $100 billion exit. They don't care about | current or near future profitability. | | However I agree that OpenAI is getting far too much hate. Their | goal of bringing openness to AI made sense in 2015 when one | American company (Google) was dominating the field. | | However now there are plenty of other companies, countries and | open source organizations doing advanced AI research. | steno132 wrote: | [dead] | finikytou wrote: | please...
the past ten years is a story of companies losing | money to get a monopoly and you are still asking us why | they would do that? | | let's not forget the shift of narrative that "open" AI made | from their name, their marketing and use of open source and then | their move to a commercial subsidiary of Microsoft. let's also not | forget that they totally avoided discussing copyrights and the | crawling of data sources to extract knowledge from someone else's | property. the only close thing i can see today which avoided so | much scrutiny while being highly sensitive are ICOs in crypto | and Theranos in biotech. | | I hope open source wins this battle. | steno132 wrote: | [dead] | xur17 wrote: | > For those claiming OpenAI is for profit: Why would OpenAI do | this if they were fixated on making money? | | Reducing costs by 10x may very well increase usage by more than | 10x + it makes it even more difficult for competition to come | in and undercut them. | hellcow wrote: | > For those claiming OpenAI is for profit: Why would OpenAI do | this if they were fixated on making money? | | To get lock-in from devs before competitors can enter the | market, and to starve any would-be smaller competitors before they | can raise money/gain traction. | rvz wrote: | This is what 'extinguish' looks like in the new EEE | strategy from Microsoft that I described years ago [0]: | competitors with paid offerings are unable to compete with | free, since OpenAI's pricing model is now close to free and | their competitors cannot raise their prices. | | Since Microsoft can foot the bill for the Azure | infrastructure, there is going to be little room for anyone | to seriously compete against OpenAI on price, API and | features, unless it is completely free and open source, like | Stability AI. | | [0] https://news.ycombinator.com/item?id=28324999 | RyanCavanaugh wrote: | Is fine-tuning of the gpt-3.5-turbo model expected to be | available at some point?
I have some applications that would | greatly benefit from this, but only if fine tuning is available. | georgehill wrote: | Big news! Many apps will be integrating ChatGPT. Worried about | AI-generated content flooding search engines, making it harder to | do in-depth research. | r3trohack3r wrote: | This is a good thing. | | The future is curation and cultivation. We've been living in an | age of information abundance and markets haven't adapted. The | age of "crawl every website, index everything, and let people | search it" is coming to an end. There is just too much content | and too much of it is low quality. With or without AI. | | This abundance problem isn't just a WWW problem. Movies, TV, | music, podcasts, short form content, food, widgets, wibbles and | wobbles all suffer from abundance these days. We are quickly | exiting the age of supply chain driven scarcity and getting a | marketplace flooded with options. Capitalism has delivered on | basically everything it's promised with some asterisks and, if | we don't give in to consumerism, we want for little and have | everything we need at our fingertips. | | I've personally opened up my pocketbook to curation services. | I know brands that I trust. I know services that reliably | surface quality content. I suspect the next few decades are | going to trend towards services that separate noise from signal | - and I suspect AI is going to be a big part of that. | bluefone wrote: | Why separate noise from signal? Why isn't AI-generated stuff | seen as just as valuable as the human-written kind? When you fill the | world with plastic, you need to evolve to eat plastic. When | you surround yourself with AI-produced stuff, then you should | learn to respect it. | | First of all, evaluating someone based on what they wrote or | said or did, is nonsense. | r3trohack3r wrote: | I think we agree. Humans can generate noise and AI can | generate signal.
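On the fine-tuning question upthread: the models that do support it (the older completion models, not `gpt-3.5-turbo`) take training data as JSONL prompt/completion pairs per the fine-tuning guide. The records below are invented examples illustrating only the shape:

```python
import json

# Toy records in the JSONL shape the fine-tuning endpoint expects:
# one {"prompt": ..., "completion": ...} object per line.
records = [
    {"prompt": "Store hours?", "completion": " Open 9am-5pm, Mon-Fri."},
    {"prompt": "Return policy?", "completion": " Returns within 30 days."},
]
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

The resulting file is what gets uploaded before starting a fine-tune job.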
| ngokevin wrote: | Honestly, I've almost stopped Googling and have had personal | success just relying on ChatGPT. It's pretty much taught me 3D | animation and Blender to Unity workflows. Every time I wanted | to know a Blender keyboard shortcut or what some Blender | property was, or how to do something in Blender, the forums and | documentation is so sparse and outdated. I felt ChatGPT got me | a lot closer much faster. Especially when it tells you how to | learn concepts you didn't even know existed. | | Google results in the meanwhile have just become a pile of SEO- | optimized fluff, and it's hard to engineer the search query | around that besides sticking "reddit" on the end of it. | rd wrote: | This is incredibly cheap, it makes you wonder how in the world | they managed to make it 10x cheaper than davinci-003 and still a | better model? The world of robo-consulting is about to go insane. | dmw_ng wrote: | It may just be that OpenAI corrected its 2020-era understanding | of model size using 2022 insights from DeepMind.. | https://towardsdatascience.com/a-new-ai-trend-chinchilla-70b... | | Seems model sizing, compression and quantization are still an | art form, see also | https://www.unum.cloud/blog/2023-02-20-efficient-multimodali... | coldtea wrote: | The quality of suggestions, forums, internet content and other | such things just took a huge drop - this will create an internet | E.L.E. in SPAM and empty content... | ehPReth wrote: | I can't seem to find it mentioned but is there an | unrestricted/uncensored mode for this? I'd love to have some fun | with a few friends and hook it up in a matrix room for us | mritchie712 wrote: | https://news.ycombinator.com/item?id=34972791 | CamperBob2 wrote: | That's just a list of exploits that will be fixed as soon as | they come to OpenAI's attention, if they haven't already been | fixed. Is anyone actually committed to providing uncensored | models as either paid services or open distributions? 
| siva7 wrote: | I've stopped using Google entirely and don't miss it for a second - | something i wouldn't have thought possible a year ago - and it's | pretty difficult to see how Google will survive this disaster. | sebzim4500 wrote: | How are you using it? In my experience asking factual questions | leads to answers so inaccurate that you might as well not | bother. | | Are you prompting it differently to me, or do you have some | strategy to filter out the BS? | kordlessagain wrote: | I use it with documents I shred into sentences and embed with | ada-002. This makes it spot on when talking about a given | document. https://mitta.us/ | skilled wrote: | How? By giving people answers to questions they didn't know | they have to ask to solve their problems. | alexb_ wrote: | >Through a series of system-wide optimizations, we've achieved | 90% cost reduction for ChatGPT since December | | This is seriously impressive. A MILLION tokens for 2 dollars is | absolutely fucking insane. | | I hope that the gains reached here can also be found by open | source and non-controlled AI projects. If so, that could be huge | for the advancement of AI. | machinekob wrote: | If you are Microsoft, a gigascaler with almost unlimited cash | that can ignore making a profit on your APIs/models, it's pretty | easy to undercut all the other companies and offer it very | cheap just to gain an advantage in the future. | alexb_ wrote: | What the cost cutting measures suggest is that AI like this | could maybe soon be run on consumer hardware. That combined | with actually open source language models could be huge. | OpenAI won't allow for that for obvious reasons, but this | confirms that the optimizations are there, and that's | exciting enough news on its own. | alfalfasprout wrote: | I mean, Meta's new LLaMA model runs on a single A100 in the | 13B parameter variant (which performs similarly to GPT-3 65B). | visarga wrote: | "Performs" on paper until they give a demo.
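The shred-and-embed approach described above reduces, at query time, to nearest-neighbor search over stored vectors. A toy sketch with made-up 4-dimensional vectors standing in for ada-002's 1536-dimensional embeddings (real scores would come from the embeddings endpoint):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 4-dim "embeddings" keyed by the sentence they came from.
store = {
    "The ChatGPT API costs $0.002 per 1k tokens.": [0.9, 0.1, 0.0, 0.1],
    "Whisper transcribes audio to text.":          [0.1, 0.9, 0.2, 0.0],
}
query_vec = [0.8, 0.2, 0.1, 0.1]  # pretend embedding of "How much does the API cost?"

best = max(store, key=lambda sent: cosine(store[sent], query_vec))
print(best)  # → the pricing sentence
```

The retrieved sentence (or its surrounding passage) is then pasted into the prompt, which is what makes the model "spot on" for a given document.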
| sva_ wrote: | A lot of people assumed GPT-4 would be an even bigger model, but I've been thinking it'll probably be more about more efficient compute. | | This is at least some evidence that they're working on that. | ImprobableTruth wrote: | It's tokens _processed_, not generated. | visarga wrote: | If you have 10K tokens in your conversation, the next reply means 10K + len(reply) extra tokens. I estimate 125 rounds of conversation fit in 1M tokens, for $2. | jonplackett wrote: | This feels like the AI's iPhone moment. | | I am scared for all people working service jobs. | passion__desire wrote: | All jobs will shift to asteroid mining. | danjac wrote: | AI is the perfect fit for asteroid mining, or any other job outside of Earth orbit, whether for research or commerce. Space is incredibly hostile to humans and, short of some miracle technology, is likely to always remain so: at best we'll have a few plant-the-flag missions in the inner solar system. | dekhn wrote: | I really wish we'd build a standardized space exploration platform and saturate the solar system with mostly-autonomous robots that occasionally phone home with "wtf is this?" | | Imagine all the surface area in the solar system. I bet there's got to be at least 100 completely unexpected things lying around that would transform our understanding. | jeron wrote: | what's stopping AI from asteroid mining for itself | 13years wrote: | > I am scared for all people | | Probably could stop there. | jonplackett wrote: | Yeah, that's probably true. | | But I'm more scared for some than others short term. | | I'm less immediately scared for anyone doing work that interacts with the physical world. | | Weird how it turned out that robotics was harder than the thinking. | vineyardmike wrote: | I'm still waiting on that pizza I asked ChatGPT to make in November. The code it wrote? Already in production though.
| | I'm not scared for service workers due to _ai_, but you should see how low minimum wage is in America relative to rents if you want to be scared. | sebzim4500 wrote: | Maybe if you had asked for fries instead | | https://www.youtube.com/watch?v=T4-qsklXphs | CamperBob2 wrote: | _I am scared for all people working service jobs._ | | Why? Because they're no longer doomed to eke out a meaningless existence doing a robot's job badly? | jonplackett wrote: | It's better than having no job, isn't it? | spyremeown wrote: | So, Web3 won't be blockchains and whatnot: it's actually custom-generated content. I... don't really like it. | TeMPOraL wrote: | Web2 was about user-generated content. Web3 will be about companies owning the user-generated content realizing they're sitting on a goldmine - and using that content as training data for their DNN models. | ayewo wrote: | That's an astute observation. | | (Apropos of nothing: I expanded your comment into the following tweet: https://twitter.com/ayewo_/status/1631060562393153536) | bluecoconut wrote: | Support for the ChatGPT endpoint is now added to lambdaprompt[1]! (It solves a similar problem as langchain, with almost no boilerplate.) Props to OpenAI for making such a usable endpoint; it was very easy to wrap. | | Example code using the new function and endpoint: | | import lambdaprompt as lp | convo = lp.AsyncGPT3Chat([{'system': 'You are a {{ type_of_bot }}'}]) | await convo("What should we get for lunch?", type_of_bot="pirate") | | > As a pirate, I would suggest we have some hearty seafood such as fish and chips or a seafood platter. We could also have some rum to wash it down! Arrr! | | (In order to use lambdaprompt, just `pip install lambdaprompt` and export OPENAI_API_KEY=...) | | [1] https://github.com/approximatelabs/lambdaprompt | 0xDEF wrote: | What is the developer experience of using OpenAI's Server-Sent Events endpoint from something other than their Python and Node.js libraries?
| | The SSE endpoint is required for use cases like chat so the end | user doesn't have to wait until the whole reply has been | generated. | | I started implementing a simple SSE client on top of C#/.Net's | HttpClient but it's harder than I first assumed. | hyferg wrote: | We had to restream server events from openai -> our backend -> | client. It was pretty simple. | arbuge wrote: | So I had a question about how all this works under the hood. The | GPT model is trained (on a massive dataset) and then deployed. | How are they getting the additional product data from other | sources like Instacart's retail partner locations and Shopify's | store catalogs into it, so that it can output answers leveraging | those? My understanding (perhaps incorrect) is that those weren't | part of the dataset the model was initially trained on. | | For example: | | > Shop's new AI-powered shopping assistant will streamline in-app | shopping by scanning millions of products to quickly find what | buyers are looking for | | > This uses ChatGPT alongside Instacart's own AI and product data | from their 75,000+ retail partner store locations to help | customers discover ideas for open-ended shopping goals | nestorD wrote: | I would add additional `system` messages with information | fetched as a function of the user's request. | r3trohack3r wrote: | Whisper as an API is great, but having to send the whole payload | upfront is a bummer. Most use cases I can build for would want | streaming support. | | Like establish a WebRTC connection and stream audio to OpenAI and | get back a live transcription until the audio channel closes. | jmccarthy wrote: | I recently tried a number of options for streaming STT. Because | my use case was very sensitive to latency, I ultimately went | with https://deepgram.com/ - but | https://github.com/ggerganov/whisper.cpp provided a great | stepping stone while prototyping a streaming use case locally | on a laptop. 
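For anyone tackling 0xDEF's question from a language without an official client: the stream is ordinary Server-Sent Events, where each chunk arrives as a `data: <json>` line and the stream ends with a `data: [DONE]` sentinel. A minimal parser, sketched in Python for illustration (the same line-by-line logic ports to a StreamReader over C#'s HttpClient response stream), might look like this:

```python
# Minimal parser for OpenAI-style Server-Sent Events. The API streams
# lines of the form "data: <json>", separated by blank lines, and
# terminates the stream with "data: [DONE]". Works over any iterator of
# lines, e.g. requests.post(..., stream=True).iter_lines().
import json

def iter_sse_events(lines):
    for raw in lines:
        line = raw.decode() if isinstance(raw, bytes) else raw
        if not line.startswith("data: "):
            continue  # skip blank separator / keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)

# Simulated stream with the delta shape used by chat-completion chunks:
fake = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    '',
    'data: [DONE]',
]
text = "".join(ev["choices"][0]["delta"]["content"] for ev in iter_sse_events(fake))
print(text)  # -> Hello
```

Concatenating each chunk's `delta.content` as it arrives is what lets a UI show the reply typing out token by token instead of waiting for the full response.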
| BasilPH wrote: | As far as I can tell, it doesn't support word-level timestamps (yet). That's a bit of a dealbreaker for things like promotional clips or the interactive transcripts that we do[^0]. Hopefully they add this soon. | | [^0]: https://www.withfanfare.com/p/seldon-crisis/future-visions-w... | [deleted] | banana_giraffe wrote: | It's also annoying since there appears to be a hard limit of 25 MiB on the request size, requiring you to split up larger files and manage the "prompt" to subsequent calls. And as near as I can tell, how you're expected to use that value isn't documented. | travisjungroth wrote: | You split up the audio and send it over in a loop. Pass in the transcript of the last call as the prompt for the next one. See item 2 here: https://platform.openai.com/docs/guides/speech-to-text/promp... | banana_giraffe wrote: | And: | | > we suggest that you avoid breaking the audio up mid-sentence as this may cause some context to be lost. | | That's really easy to put in a document, much harder to do in practice. Granted, it might not matter much in the real world, not sure yet. | | Still, this will require more hand-holding than I'd like. | mike_d wrote: | The page includes a five-line Python example of how to split audio without breaking mid-word. | travisjungroth wrote: | I doubt it will matter if you're breaking up mid-sentence, as long as you pass in the previous transcript as a prompt and split on word boundaries. This is how Whisper does it internally. | | It's not absolutely perfect, but splitting on a word boundary is one line of code with the same package in their docs: https://github.com/jiaaro/pydub/blob/master/API.markdown#sil... | | 25MB is also a lot. That's 30 minutes to an hour of MP3 at reasonable compression. A 2-hour movie would have three splits. | userhacker wrote: | I suggest you give revoldiv.com a try. We use Whisper and other models together.
You can upload very large files and get an hour-long file transcribed in less than 30 seconds. We use intelligent chunking so that the model doesn't lose context. We are looking to increase the limit even more in the coming weeks. It's also free to transcribe any video/audio with word-level timestamps. | BasilPH wrote: | I just gave it a try, and the results are impressive! Do you also offer an API? | userhacker wrote: | Contact us at team@revoldiv.com; we are offering an API on a case-by-case basis. | [deleted] | mk_stjames wrote: | I've run Whisper locally via [1] with one of the medium-sized models, and it was damn good at transcribing audio from a video of two people having a conversation. | | I don't know exactly what the use case is where people would need to run this via API; the compute isn't huge. I used CPU only (an M1), and the memory requirements aren't much. | | [1] https://github.com/ggerganov/whisper.cpp | rmorey wrote: | FWIW, AssemblyAI has great transcript quality in my experience, and they support streaming: https://www.assemblyai.com/docs/walkthroughs#realtime-stream... | BasilPH wrote: | We're using AssemblyAI too, and I agree that their transcription quality is good. But as soon as Whisper supports word-level timestamps, I think we'll seriously consider switching, as the price difference is large ($0.36 per hour vs $0.90 per hour). ___________________________________________________________________ (page generated 2023-03-01 23:00 UTC)