[HN Gopher] OpenAI is too cheap to beat ___________________________________________________________________ OpenAI is too cheap to beat Author : cgwu Score : 162 points Date : 2023-10-12 18:16 UTC (2 hours ago) (HTM) web link (generatingconversation.substack.com) (TXT) w3m dump (generatingconversation.substack.com)
| eurekin wrote: | Didn't see batching taken into the equation, might skew things a bit
| sidnb13 wrote: | Yep, batching is a feature I really wish the OpenAI API had. | That and the ability to intelligently cache frequently used | prompts. Much easier to achieve this with a hosted OS model, so | I guess it's a speed + customizability/cost tradeoff for the | time being.
| advaith08 wrote: | imo they dont have batching because they pack sequences | before passing through the model. so a single sequence in a | batch on OpenAI might have requests from multiple customers | in it
| jonplackett wrote: | Is this a reflection of OpenAI's massive scale making it so cheap | for them? | | Or is it the deal with Microsoft for cloud services making it | cheap? | | Or are they just operating at a massive loss to kill off other | competition? | | Or something else?
| 4death4 wrote: | Probably all three: | | 1) They hire top talent to make their models as efficient as | possible. | | 2) They have a sweetheart deal with MS. | | 3) They're better funded than everyone else and bringing in | substantial revenue.
| smachiz wrote: | deleted
| ryduh wrote: | Is this a guess or is it informed by facts?
| sebzim4500 wrote: | Are you just suggesting this as an option or do you have | evidence that it is true?
| ugjka wrote: | They are also trying to lobby the government for AI | "regulation" in order to limit any competitor's ability to achieve | OpenAI's level
| wkat4242 wrote: | They basically are MS by now. Everyone at Microsoft I work | with literally calls it an 'acquisition'. Even though they | only own a share. It's pretty clear what their plans are.
| SkyMarshal wrote: | Probably the first two, plus first-mover brand recognition. | Millions of $20 monthly subs for GPT4 add up. | | They might also be operating at a loss afaik, but I suspect | they're one of the few that can break even just based on scale, | brand recognition, and economics.
| michaelbuckbee wrote: | $20/mo subs which are also the lead-in to unlocking paid | API access.
| sarchertech wrote: | I haven't heard any evidence that they have millions of Plus | subscribers. | | I've seen 100 to 200 million active users, but nothing about | paid users from them. The surveys I saw when doing a quick | google search reported much less than 1% of users paying.
| SkyMarshal wrote: | Yeah I don't know what the actual subscription numbers are, | would be surprised if OpenAI is publishing that info.
| ShadowBanThis01 wrote: | They're mining the gullible for phone numbers, among other | things.
| vsreekanti wrote: | Probably some combination of all the above! I think 1 and 2 are | interlinked though -- the cheaper they can be, the more they | build that moat. They might be eating the cost on these APIs | too, but unlike the Uber/Lyft war, it'll be way stickier.
| te_chris wrote: | There's also just the benefits of being in market, at scale and | being exposed to the full problem space of serving and | maintaining services that use these models. It's one thing to | train and release an OSS model, it's another to put it into | production and run all the ops around it.
| iliane5 wrote: | I think it's mostly the scale.
Once you have a consistent user | base and tons of GPUs, batching inference/training across your | cluster allows you to process requests much faster and for a | lower marginal cost. | ilaksh wrote: | I think the weird thing about this is that it's completely true | right now but in X months it may be totally outdated advice. | | For example, efforts like OpenMOE | https://github.com/XueFuzhao/OpenMoE or similar will probably | eventually lead to very competitive performance and cost- | effectiveness for open source models. At least in terms of | competing with GPT-3.5 for many applications. | | Also see https://laion.ai/ | | I also believe that within say 1-3 years there will be a | different type of training approach that does not require such | large datasets or manual human feedback. | sidnb13 wrote: | > I also believe that within say 1-3 years there will be a | different type of training approach that does not require such | large datasets or manual human feedback. | | I guess if we ignore pretraining, don't sample-efficient fine- | tuning on carefully curated instruction datasets sort of | achieve this? LIMA and OpenOrca show some really promising | results to date. | sharemywin wrote: | distilbert was trained from Bert. there might be an angle | using another model to train the model especially if your | trying to get something to run locally. | nico wrote: | > I also believe that within say 1-3 years there will be a | different type of training approach that does not require such | large datasets or manual human feedback | | This makes a lot of sense. A small model that "knows" enough | English and a couple of programming languages should be enough | for it to replace something like copilot, or use plug-ins or do | RAG on a substantially larger dataset | | The issue right now is that to get a model that can do those | things, the current algorithms still need massive amounts of | data, way more than what the final user needs | Dwedit wrote: | Abbreviate Mix of Experts as "MoE" and the Anime fans | immediately start rushing in... | daft_pink wrote: | I'm confused don't a100s cost 10,000 to buy? Why would you pay | 166k per year to rent? | sidnb13 wrote: | I would assume the datacenter and infra needed would also | contribute a sizeable chunk to the costs when you consider | upkeep to run it 24/7 | latchkey wrote: | For the same reason people use AWS. | | Spending the capex/opex to run a cluster of compute isn't easy | or cheap. It isn't just the cost of the GPU, but the cost of | everything else around it that isn't just monetary. | etothepii wrote: | This could be an interesting comparison. My experience with | AWS is that it was super easy and cheap to start on. By the | time we _could_ use whole servers we were using so much AWS | orchestration that it 's going to be put off until we are at | least $1M ARR, and probably til we are at $5M. | | Make adoption easy, give a free base tier but charge more | could be a very effective model to get start ups stuck on | you. It even probably makes adoption by small teams in big | companies possible that can then grow ... | dekhn wrote: | How much does an A100 consume in power a year (in dollar | costs)? How much does it cost to hire and retain datacenter | techs? How long does it take to expand your fleet after a user | says "we're gonna need more A100s?" How many discounts can you | get as a premier customer? | | Answer these questions, and the equation shifts a bunch! | shrubble wrote: | Not really. 
| | A full rack with 16 amps usable power and some bandwidth is | $400/month in Kansas City, MO. That is enough to power 5x | A100s 24x7, so 10k plus $80 per month each, amortized, of | course many more A100s would drop the price. | | Once installed in the rack ($250 1 time cost) you shouldn't | need to touch it. So 10k plus $1250 per A100, per year | including power. You can put 2 or 3 A100s per cheapo Celeron | based CPU with motherboards. | | Of course if doing very bursty work then it may well make | sense to rent... | akomtu wrote: | And how many A100s do you need to do something meaningful | with LLMs? | shrubble wrote: | The funding has to come from somewhere, right? You either | pay up front and save money over time, or pay as you go | and pay more... | dekhn wrote: | Did you also include the network required to make the A100s | talk to each other? Both the datacenter network (so the | CPUs can load data) and the fabric (so the A100s can talk?) | | You also left out the data tech costs- probably at least | $50K/individual-year in KC (although I guess I'd just work | for free ribs). | | If you're putting A100s into celeron motherboards... I | don't know what to say. You're not saving money by putting | a ferrari engine in a prius. | latchkey wrote: | $50m GPU capex (which is A LOT) is about 2-3MW of power, it | isn't that much. | | The problem though is that getting 2-3MW of power in the US | is increasingly difficult and you're going to pay a lot more | for it since the cheap stuff is already taken. | | Even more distressing is that if you're going to build new | data center space, you can't get the rest of the stuff in the | supply chain... backup gennies, transformers, cooling towers, | etc... | amluto wrote: | Those are 8x A100 systems. | joefourier wrote: | AWS is extremely overpriced for nearly every service. I don't | know why anyone else outside of startups with VC money to burn | or bigcos that need the "no one ever got fired for buying IBM" | guarantee would use them. You're better off with Lambdalabs or | others which charge only $1.1/h per A100. | | Also that is a 8xA100 system as others have noted, but it is | the 40GB one which can be found on eBay for as low as $3k if | you go with the SXM4 one (although the price of supporting | components may vary) or $5k for the PCI-e version. | wg0 wrote: | There are only two services that are dirt cheap and way too | reliable, useful.That's S3 and SQS. Rest can get very | expensive very soon. | | You can build a lot of stuff on top of these two. | ommpto wrote: | Even for S3 while the storage is dirt cheap they still have | exorbitant bandwidth pricing. | charcircuit wrote: | S3 is not dirt cheap. Bandwidth is ludicrously expensive. | charlesischuck wrote: | You pay for the system not the gpu with AWS. | | It's absolutely worth the money when you look at the whole | picture. Also lambda labs never has availability. I actually | can schedule a distributed cluster on AWS. | AndroTux wrote: | > It's absolutely worth the money when you look at the | whole picture. | | That highly depends on many things. If you run a business | with a relatively steady load that doesn't need to scale | quickly multiple times per day, AWS is definitely not for | you. Take Let's Encrypt[1] as an example. Just because | cloud is the hype doesn't mean it's always worth it. | | Edit: Or a personal experience: I had a customer that | insisted on building their website on AWS. 
They weren't | expecting high traffic loads and didn't need high | availability, so I suggested to just use a VPS for $50 a | month. They wanted to go the AWS route. Now their website | is super scalable with all the cool buzzwords and it costs | them $400 a month to run. Great! And in addition, the whole | setup is way more complex to maintain since it's built on | AWS instead of just a simple website with a database and | some cache. | | [1] https://news.ycombinator.com/item?id=37536103 | nharada wrote: | Sometimes I need 512 GPUs for 3 days. | charlesischuck wrote: | A top end gpu now to make you competitive cost 20-50k per gpu. | | To train a top model you need hundreds of them in a very | advanced datacenter. | | You can't just plug gpus into standard systems and train, | everything is custom. | | The technical talent required for these systems is rare to say | the least. The technical talent to make a model is also rare. | | I trained a few foundation models with images, and I would | NEVER buy any of them. These guys are on a wildly different | scale than basically everyone. | SkyMarshal wrote: | I think OpenAI may eventually have to go upmarket, as basic "good | enough" AI becomes increasingly viable and cheap/free on consumer | level devices, supplied by FOSS models and apps. | | Apple may be leading the way here, with Apple Silicon | prioritizing AI processing and built into all their devices. | These capabilities are free (or at least don't require an extra | sub), and just used to sell more hardware. | | OpenAI is clearly going to compete in that market with its | upcoming smart phone or device [1]. But what revenue model can | OpenAI use to compete with Apple's and not get undercut by it? I | suppose hardware + free GPT3.5, and optional subscription to GPT4 | (or whatever their highest end version is). Maybe that will be | competitive. | | I also wonder what mobile OS OpenAI will choose. Probably not | Android, otherwise they would have partnered with Google. A | revamped and updated Microsoft mobile OS maybe, given their MS | partnership? Or something new and bespoke? I could imagine Johnny | Ive demanding something new, purpose-built, and designed from | scratch for a new AI-oriented UI/UX paradigm. | | A market for increasingly sophisticated AI that can only be done | in huge GPU datacenters will exist, and that's probably where the | margins will be for a long time. I think that's what OpenAI, | Microsoft, Google, and the others will be increasingly competing | for. | | [1]:https://www.reuters.com/technology/openai-jony-ive-talks- | rai... | vsreekanti wrote: | Yep, we agree that the obvious direction of innovation for OSS | models is smaller and cheaper, likely at roughly the same | quality: https://generatingconversation.substack.com/p/open- | source-ll... | smcleod wrote: | Also more privacy respecting, and more customisable / | flexible. | mensetmanusman wrote: | Please Apple let me replace worthless Siri with ChatGPT on my | iPhone. | | Would completely change how I use the device. | bitcurious wrote: | If you have the new iPhone with the action button, you can | set a shortcut to ask questions of ChatGPT. It's not as fluid | as Siri, and can't control anything, but still much more | useful. | CamperBob2 wrote: | Just yesterday, while driving: "Read last message." | | Siri: "Sorry. Dictation service is unavailable at the | moment." | | It's past time for excuses. High-level people at Apple need | to be fired over this. Hello? Tim? Do your job. Hello? 
| Anybody home...?
| freedomben wrote: | Nobody is switching away from Apple over this, so | ultimately Tim _is_ doing his job. Under his watch Apple | has become the de facto choice for entire generations. | Between vendor lock-in/walled gardens and societal/cultural | pressures (don't want to be a green bubble!), they have one | of the stickiest user bases there are.
| mensetmanusman wrote: | True, but that doesn't mean we shouldn't complain. | | My hope is that the upcoming eu rulings allow competition | here. I.e. force Apple to get out of the way of making | their hardware better with better software.
| CamperBob2 wrote: | Stop excusing shitty work from trillion-dollar companies. | It makes the world a worse place.
| smoldesu wrote: | I think it's shitty and has no excuse, but the parent is | right. Apple has no incentive to respond to their users | since all roads lead to first-party Rome. It's why stuff | like the Digital Markets Act is more needed than some | people claim. | | You know what would get Apple to fix this? Forced | competition. You know what Apple spends their trillions | preventing?
| layer8 wrote: | Apple is ramping up spending in that area: | https://www.macrumors.com/2023/09/06/apple-conversational- | ai... | | It'll probably take a while though.
| grahamplace wrote: | > OpenAI is clearly going to compete in that market with its | upcoming phone | | What phone are you referring to? A quick google didn't seem to | pull up anything related to OpenAI launching a hardware | product?
| BudaDude wrote: | They are most likely referring to this in collaboration with | Jony Ive: | | https://www.yahoo.com/entertainment/openai-jony-ive-talks- | ra...
| SkyMarshal wrote: | Yes that one.
| jimkoen wrote: | > OpenAI is clearly going to compete in that market with its | upcoming phone. | | Excuse me, I'm not an english native, you mean like a smart | phone? Or do you mean some sort of other new business | direction? Where did you get the info that they're planning to | launch a phone?
| MillionOClock wrote: | I believe there have been rumors that OpenAI was working with | Jony Ive to create a wearable device, but it was unclear | whether it would be a phone or something else.
| SkyMarshal wrote: | Yes a smartphone, /corrected. It's a recent announcement: | | https://www.nytimes.com/2023/09/28/technology/openai- | apple-s...
| sharemywin wrote: | It's not really a phone. they mention ambient computing.
| SkyMarshal wrote: | Oh, smart device then.
| layer8 wrote: | https://www.reuters.com/technology/openai-jony-ive-talks- | rai...
| layer8 wrote: | Where are you getting the confidence that Apple will be able to | catch up to OpenAI's GPT? "Apple's built-in AI capabilities" | are very weak so far.
| filterfiber wrote: | Not OP, | | In my experience apple's ML on iphones is seamless. Tap and | hold on your dog in a picture and it'll cut out the | background, your photos are all sorted automatically | including by person (and I think by pet). | | OCR is seamless - you just select text in images as if it was | real text. | | I totally understand these aren't comparable to LLMs - rumor | has it apple is working on an llm - if their execution is | anything like their current ML execution it'll be glorious.
| | (Siri objectively sucks although I'm not sure it's fair to | compare siri to an LLM as AFAIK siri does not do text | prediction but is instead a traditional "manually crafted | workflow" type of thing that just uses S2T to navigate)
| blackoil wrote: | >OCR is seamless | | Wasn't that solved about a decade ago? Does anyone suck at | that?
| filterfiber wrote: | > Does anyone suck at that? | | Does android even have native OCR? Last I checked | everything required an OCR app of varying quality | (including windows/linux). | | On ios/macos you can literally just click on a picture | and select the text in it as if it wasn't a picture. I | know for sure on iOS you don't even open an app to do it; | in just about any picture you can select it. | | Last I checked the Opensource OCR tools were decent but | behind the closed source stuff as well. | | Random google result of OCR on android (could be | outdated) - https://www.reddit.com/r/androidapps/comments | /10te5et/why_oc...
| smoldesu wrote: | > Does android even have native OCR? | | Tesseract? https://github.com/tesseract-ocr/tesseract
| SkyMarshal wrote: | I'm not saying they will on the high-end, but maybe on the | low end. Apple's strategy is to embed local AI in all their | devices. Local AI will never be as capable as AI running in | massive GPU datacenters, but if it can get to a point that | it's "good enough" for most average users, that may be enough | for Apple to undercut the low end of the market.
| freedomben wrote: | > _Local AI will never be as capable as AI running in | massive GPU datacenters_ | | I'm not sure this is true, even in the short term. For some | things yes, that's definitely true. But for other things | that are real-time or near real-time where network latency | would be unacceptable, we're already there. For example, | Google's Pixel 8 launch includes real-time audio | processing/enhancing which is made possible by their new | Tensor chip. | | I'm no fan of Apple, but I think they're on the right path | with local AI. It may even be possible that the tendency of | other device makers to put AI in the cloud might give Apple | a much better user experience, unless Google can start | thinking local-first which kind of goes against their | grain.
| SkyMarshal wrote: | _> But for other things that are real-time or near real- | time where network latency would be unacceptable, we're | already there._ | | Agreed. Something else I wonder is if local AI in mobile | devices might be better able to learn from its real-time | interactions with the physical world than datacenter- | based AI. | | It's walking around in the world with a human with all | its various sensors recording in real-time (unless | disabled) - mic, camera, GPS/location, LiDAR, barometer, | gyro, accelerometer, proximity, ambient light, etc. Then | the human uses it to interact with the world too in | various ways. | | All that data can of course be quickly sent to a | datacenter too, and integrated into the core system | there, so maybe not. But I'm curious about this | difference and wonder what advantages local AI might | eventually confer.
| sharemywin wrote: | I wonder if, by sending the embeddings or some higher- | level compressed latent vector across to the cloud, you | couldn't get the best of both worlds. | | GPS, phone orientation, last 5 apps you were in, etc. --> | embedding | | you might even have like "what time is it?" compressed as | its own embedding.
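A minimal sketch of that hybrid local/cloud idea, assuming a small on-device encoder (sentence-transformers' all-MiniLM-L6-v2) and a purely hypothetical cloud endpoint that accepts the compressed vector instead of the raw sensor data; the endpoint URL and payload shape below are illustrative, not a real API.

    # Sketch: compress local context into an embedding on-device, then send only
    # the vector to a (hypothetical) cloud model instead of the raw data.
    import requests
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small model, plausible on phone-class hardware

    local_context = "GPS: 37.77,-122.41; orientation: portrait; last apps: Maps, Mail, Safari"
    vector = encoder.encode(local_context).tolist()    # 384-dim float list, not the raw sensor data

    resp = requests.post(
        "https://example-cloud-model.invalid/v1/context",  # hypothetical endpoint for illustration
        json={"embedding": vector, "query": "what time is it?"},
        timeout=5,
    )
    print(resp.json())

The appeal is that only a few hundred floats leave the device rather than the raw context; whether a remote model can actually make good use of an embedding it didn't produce itself is exactly the open question in the comment above.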
| huevosabio wrote: | OpenAI will make its money on enterprise deals for finetuning | their latest and greatest on corporate data. They already | have these big enterprise deals and I think that's where the | money is. | | They will keep pricing the off-the-shelf AI at-cost to keep | competitors at bay. | | As for competitors, Anthropic is the most similar to OpenAI | both in capabilities and business model. I am not sure what | Google is up to, since historically their focus has been on | using AI to enhance their products rather than making it a | product. The "dark horses" here are Stability and Mistral, which | are both OSS and European and will try to make that their edge | as they give the models for _free_ but to institutional clients | that are more sensitive to which models are being used and where | the data is being handled. | | Amazon and Apple are probably catching up. Apple likely thinks | that all of this just makes their own hardware more attractive. | It's not clear to me what Meta's end goal is.
| tmpz22 wrote: | > I think OpenAI may eventually have to go upmarket | | Let me introduce you to the VC business model. Get comical | amounts of money. Charge peanuts for an initial product. Build | a moat once you trap enough businesses inside it. Jack up | prices.
| sharemywin wrote: | don't forget the sneaky TOS changes you have to agree to
| robertlagrant wrote: | OpenAI'd better hope no one else does it too, if that's all | it takes.
| latchkey wrote: | I just paid the $20 for a month to try it out. In my super | limited experience, GPT-4 is actually impressive and worth the | money.
| smileysteve wrote: | I've spent the last few weeks comparing Google Duet with Chat | GPT 3.5, and Chat GPT seems years ahead.
| a_wild_dandan wrote: | The value I get for that $20/month is astonishing. It's by far | the best discretionary subscription I've ever had. | | That scares me. I hate moats and actively want out. Running the | uncensored 70B parameter Llama 2 model on my MacBook is great, | but it's just not a competitive enough general intelligence to | entirely substitute for GPT-4 yet. I think our community will | get there, but the surrounding water is deepening, and I'm | nervous...
| sharemywin wrote: | tentatively called "Claude-Next" -- that is 10 times more | capable than today's most powerful AI, according to a 2023 | investor deck TechCrunch obtained earlier this year. | | this is the thing that scares me. | | when do these models stop getting smarter? or at least slow | down?
| minimaxir wrote: | When the ChatGPT API was released 7 months ago, I posted a | controversial blog post that the API was so cheap, it made other | text-generating AI obsolete: | https://news.ycombinator.com/item?id=35110998 | | 7 months later, nothing's changed, surprisingly. Even open-source | models are still tricky to make more cost-effective, despite the | many inference optimizations since. Anthropic Claude is closer to | price and quality effectiveness now, but there's no reason to | switch.
| cainxinth wrote: | These are still early days. All the major players are willing | to lose billions to be top of mind with consumers in an | emerging market. | | Either there will be some major technological breakthrough that | lowers their costs, or they will all eventually start raising | prices.
| Eumenes wrote: | "too cheap to beat" sounds anti-competitive and monopolistic.
| Large LLM providers are not dissimilar to industrial operations | at scale - it requires a lot of infrastructure and the more you | buy/rent, the cheaper it gets. Early bird gets the worm I guess.
| stevenae wrote: | Not sure I understand your comment, but generally you have to | prove anti-competitiveness /beyond/ too cheap to beat (unless | it is a proven loss-leader which, viz all big tech companies, | seems very hard to prove)
| Havoc wrote: | Yep. Building a project that needs some LLMs. I'm very much of | the self-hosting mindset so will try DIY, but it's very obviously | the wrong choice by any reasonable metric. | | OpenAI will murder my solution by quality, by availability, by | reliability and by scalability...all for the price of a coffee. | | It's a personal project though & partly intended for learning | purposes so there is scope for accepting trainwreck level | tradeoffs. | | No idea how commercial projects are justifying this though.
| nine_k wrote: | One small caveat: OpenAI gets to see all your prompts, and all | the responses. | | Sometimes this can be unacceptable. Law, medicine, finance, | all of them would prefer a self-hosted, private GPT.
| kevlened wrote: | Their data retention policy on their APIs is 30 days, and | it's not used for training [0]. In addition, qualifying use | cases (likely the ones you mentioned) qualify for zero data | retention for most endpoints. | | [0] - https://platform.openai.com/docs/models/how-we-use- | your-data
| nine_k wrote: | In sensitive cases you do not think about the normal | policy, you think about the worst case. You just can't | afford a leak. Your local installation may be much better | protected than a public service, by technology and by | policy.
| BoorishBears wrote: | For years people have essentially made a living off FUD | like "ignore the literal legal agreement and imagine all | the worst case scenarios!!!" to justify absolutely | farcical on-premise deployments of a lot of software, but | AI is starting to ruin the grift. | | There _are_ some cases where you really can't afford to | send Microsoft data for their OpenAI offering... but | there are a lot more where some figurehead solidified | their power by insisting the company build less secure | versions of public offerings instead of letting their | "gold" go to a 3rd party provider. | | As AI starts to appear as a competitive advantage, with | the SOTA of self-hosted models lagging so ridiculously far | behind, you're seeing that work less and less. Take | Harvey.ai for example: it's a frankly non-functional | product and still manages to spook top law firms with | tech policies that have been entrenched for decades into | paying money, despite being OpenAI based, on the simple | chance they might get outcompeted otherwise.
| littlestymaar wrote: | > and it's not used for training [0]. | | It's "not be used to train or improve OpenAI models", which | doesn't mean it's not used to get knowledge about your | prompts, your business use case. In fact, the wording of | the policy is loose enough that they could train a policy model | on it (just not the LLM itself).
| Der_Einzige wrote: | A lot of tools for constrained generation, creativity, and | related tasks rely on | manipulating the entire log probability distribution.
OpenAI | won't expose this information and is therefore shockingly | uncompetitive on things like poetry generation
| fulafel wrote: | This focuses on compute capacity, but wouldn't algorithmic | improvements be much more important in bang for the buck at this | stage? There's so much low-hanging fruit, as evidenced by the | constant stream of news about getting better results with less | hardware.
| debacle wrote: | Open source always wins, in the end. This is a fluff piece.
| downWidOutaFite wrote: | Where's the open source web search that is beating Google?
| serjester wrote: | I think this is underappreciated. I run a "talk-to-your-files" | website with 5ish K MRR and a pretty generous free tier. My | OpenAI costs have not exceeded $200 / mo. People talk about using | smaller, cheaper models but unless you have strong data security | requirements you're burdening yourself with serious maintenance | work and using objectively worse models to save pennies. This | doesn't even consider OpenAI continuously lowering their prices. | | I've talked to a good amount of businesses and 90% of custom use | cases would also have negligible AI costs. In my opinion, unless | you're in a super regulated industry or doing genuinely cutting | edge stuff, you should probably just be using the best that's | available (OpenAI).
| vsreekanti wrote: | I completely agree -- open-source models and custom deployments | just can't compete with the cost and efficiency here. The only | exception here is _if_ open-source models can get way smaller | and faster than they are now while maintaining existing | quality. That will make private deployments and custom fine- | tuning way more likely.
| SkyMarshal wrote: | Or FOSS models remain the same size and speed, but hardware | for running them, especially locally, steadily improves till | the AI is "good enough" for a large enough segment of the | market.
| hobs wrote: | How do you deal with the fact that Azure et al are not | appearing to sell anyone additional capacity?
| jejeyyy77 wrote: | how do ur customers feel about you uploading potentially | confidential documents to a 3rd party?
| CDSlice wrote: | If they are confidential they probably shouldn't be uploaded | to any website, no matter if it calls out to OpenAI or does | all the processing on their own servers.
| yunohn wrote: | It's simple really, lots of businesses share data with 3rd | parties to enable various services. OpenAI provides a service | contract claiming they do not mine/reshare/etc the data | shared via their API. As the SaaS provider, you just need to | call it out in your user service agreement.
| euazOn wrote: | Just curious, could you briefly mention some of the custom use | cases with negligible AI costs? Thanks
| cyode wrote: | Are any OpenAI-powered flows available to public, logged-out | user traffic? I've worried (maybe irrationally) about doing | this in a personal project and then dealing with malicious | actors and getting stuck with a big bill.
| Bukhmanizer wrote: | The bleeding obvious is that OpenAI is doing what most tech | companies for the last 20 years have done. Offer the product | for dirt cheap to kill off competition, then extract as much | value from your users as possible by either mining data or | hiking the price. | | I don't understand how people are surprised by this anymore. | | So yeah, it's the best option right now, when the company is | burning through cash, but they're planning on getting that | money back from you _eventually_.
| jaredklewis wrote: | > Offer the product for dirt cheap to kill off competition, | then extract as much value from your users as possible by | either mining data or hiking the price. | | Genuine question, what are some examples of companies in that | "hiking the price" camp? | | I can think of tons of tech companies that sold or sell stuff | at a loss for growth, but I'm struggling to find examples where | the companies then are able to turn dominant market share | into higher prices. | | To be clear, I'm definitely not implying they are not out | there, just looking for examples.
| loganfrederick wrote: | Uber, Netflix and the online content streaming services. | These are probably the most prominent examples from this | recent 2010s era.
| spacebanana7 wrote: | The Google Maps API price hike of 2018 [1] is a relevant | example. | | [1] https://kobedigital.com/google-maps-api-changes
| beezlebroxxxxxx wrote: | Uber is probably the biggest pure example. When I was in | uni when they first spread, Uber's entire business model | was to flood the market with hilariously low prices and steep | discounts. People overnight started using them like crazy. | They were practically giving away their product. Now, | they're as expensive, if not sometimes more expensive, than | any other taxi or ridesharing service in my area. | | One thing I'll add is that it's not always that this ends | with higher prices in an absolute sense, but that the tech | company is able to essentially cut the knees out of their | competitors until they're a shell of their former selves. | Then when the prices go "up", they're in a way a return to | the "norm", only they have a larger and dominant market | share because of their crazy pricing in the early stages.
| wkat4242 wrote: | Yeah I kinda wonder why people even use them anymore. | I've long gone back to real taxis because they're cheaper | and I don't have to book them, I can just grab one on the | street. Much more efficient than waiting and slowly | watching my driver edge his way to me from 3 kilometers | away.
| jdminhbg wrote: | The number of places where you can reliably walk out onto | the street and hail a taxi is pretty small. Everywhere | else, the relevant decision is whether calling a | dispatcher or using a taxi company's app is | faster/cheaper/more reliable than Uber/Lyft.
| mikpanko wrote: | - Uber/Lyft increased prices significantly (and partially | transitioned it into longer wait times) since they got into | profitability mode | | - Google is showing more and more ads over time to power | high revenue growth YoY | | - Unity has just tried to increase its prices
| jaredklewis wrote: | I think Google fits more in the "extract as much value | from your users" bucket than the price-hiking one. | | Uber/Lyft did raise prices, but interestingly (at least | to me), if the strategy was to smother the | competition with low prices, it didn't seem to work. | | Unity is interesting too, though I'm not sure it would | make a good poster child for this playbook. It raised | prices but seems to be suffering for it.
| HillRat wrote: | Everyone's in "show your profits" mode, as befitting a | mature market with smaller growth potential relative to | the last few decades. Some of what we're talking about | here is just what happens when a company tries to use | investment capital to build a moat but fails (the | Uber/Lyft issue you mentioned -- there's no obvious moat | to ride-hailing, as with many software and app domains).
| My theory is that, going forward, we're going to see a | much lower ceiling on revenue coupled with lots of | competition in the market as VC investments cool off and | companies can't spend their way into ephemeral market | dominance. | | As for Unity, they're certainly dealing with a bunch of | underperforming PE and IPO-enabled M&A on the one hand | (really should have considered that AppLovin offer, | folks), but also just a failure to extract reasonable | income from their flagship product on the other; I don't | think their problems come from raising prices _per se_ | (game devs pay for a lot already, an engine fee is | nothing new to them) as much as how they chose to do it | and the original pricing model they tried to force on | their clients. What they chose to do and the way they | handled it wasn't just bad, it was "HBS case study bad."
| dboreham wrote: | VMWare, Docker.
| zarzavat wrote: | OpenAI doesn't own transformers, they didn't even invent | them. They just have the best one at this particular time. | They have no moat. | | At some point, someone else will make a competitive model; if | it's Facebook then it might even be open source, and the | industry will see price competition _downwards_.
| strangemonad wrote: | This argument has always felt to me like saying "google has | no moat in search, they just happen to currently have the | best page rank. Nothing is stopping yahoo from creating a | better one"
| jdminhbg wrote: | Google has a flywheel where its dominant position in | search results in more users, whose data refines the | search algorithm over time. The question is whether | OpenAI has a similar thing going, or whether they just | have done the best job of training a model against a | static dataset so far. If they're able to incorporate | customer usage to improve their models, that's a moat | against competitors. If not, it's just a battle between | groups of researchers and server farms to see who is best | this week or next.
| zarzavat wrote: | It's a different situation computationally. Transformers | are asymmetric: hard to train but easy to run. | | There is no such thing as an open source Google because | Google's value is in its vast data centers. Search is | hard to train and hard to run. | | GPT4 is not that big. It's about 220B parameters, if you | believe geohot, or perhaps more if you don't. | | _One_ hard drive.
| shihab wrote: | My understanding is that Google search is a lot more than | just Pagerank (MapReduce for example). They had lots of | heuristics, data, machine learning before anyone else, | etc. | | Whereas the underlying algorithms behind all these GPTs | so far are broadly the same. Yes, OpenAI does probably have | better data, model finetuning and other engineering | techniques now, but I don't feel it's anything special | that'll allow them to differentiate themselves from | competitors in the long run. | | (If the data collected from current LLM users proves very | valuable in improving the model, that's different. I | personally think that's not the case now but who knows).
| YetAnotherNick wrote: | The difference between OpenAI and the next best model seems to | be increasing and not decreasing. Maybe Google's Gemini | could be competitive, but I don't believe open source will | match OpenAI's capability ever. | | Also OpenAI gets a significant discount on compute due to | favourable deals from Nvidia and Microsoft. And they could | design their servers better for their homogeneous needs. They | are already working on an AI chip.
| goosinmouse wrote: | Are you using 3.5 turbo? Its always funny when i test a new fun | chatbot or something and see my API usage 10x just from a | single GPT 4 API call. Although i only usually have a $2 bill | every month from openAI. | littlestymaar wrote: | > you should probably just be using the best that's available | (OpenAI). | | Sure, if you want to let a monopoly have all the added value | while you get to keep the rest you can do that. | | Just make sure you're never successful enough to inspire them | though, otherwise you're dead the next minute. Oops. | zzbn00 wrote: | p4d.24xlarge spot price is $8.2 / hour in US East 1 at the | moment... | charlesischuck wrote: | Good luck getting that lol | tester756 wrote: | >iPhone of artificial intelligence | | It feels like the biggest investor bait of this year | | Will it beat ARM IPO? | lossolo wrote: | It's also worth noting that if you build your business on using | OpenAI's LLM or Anthropic etc, then, in the majority of cases | I've seen so far (no fine tuning etc), your competitor is just | one prompt away from replicating your business. | beauHD wrote: | I signed up for OpenAI's ChatGPT tool, and entered a query, like | 'What does the notation 1e100 mean?' (just to try it out). And | then when displaying the output it would start outputting the | reply in a slow way, like, it was dripfeeded to me, and I was | like: 'what? surely this could be faster?' | | Maybe I'm missing something crucial here, but why does it | dripfeed answers like this? Does it have to think really hard | about the meaning of 1e100? Why can't it just spit it out | instantly without such a delay/drip, like with the near-instant | Wolfram Alpha? | baby wrote: | You can but it'll take longer. So one way to get faster answers | is to stream the response as it is generated. And in GPT-based | apps the response is generated token by token (~4chars), hence | what you're seeing. | maccam912 wrote: | Its a result of how these transformer models work. It's pretty | quick for the amount of work it does, but it's not looking up | anything, it's generating it a token a time. | notRobot wrote: | Under the hood, GPT works by predicting the next token when | provided with an input sequence of words. At each step a single | word is generated taking into consideration all the previous | words. | | https://ai.stackexchange.com/questions/38923/why-does-chatgp... | swatcoder wrote: | The non-technical way to think about it is that ChatGPT "thinks | out loud" and can _only_ "think out loud". | | Future products would be able to hide some of that, but for | now, that's what the ChatGPT / Bing Assistant product does. | codedokode wrote: | Because it needs to do billions of arithmetic operations to | generate a reply. Replying to questions is not an easy task. | iambateman wrote: | This is _the_ playbook for big, fast scaling companies...Uber | subsidized every ride for _a decade_ before finally charging | market price, just to make sure that Uber was the only option | which made sense. | | While it's nice to consume the cheap stuff, it is not good for | healthy markets. | matteoraso wrote: | It's not even just the cost of finetuning. The API pricing is so | low, you literally can't save money by buying a GPU and running | your own LLM, no matter how many tokens you generate. It's an | incredible moat for OpenAI, but something they can't provide is | an LLM that doesn't talk like an annoying HR manager, which is | the real use case for self-hosting. 
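A rough back-of-envelope version of that API-vs-own-GPU comparison, using the GPT-3.5 output price quoted later in this thread ($2 per million tokens) and the A100 purchase and colo figures mentioned above; the lifetime, throughput, and utilization numbers are assumptions for illustration, not measurements.

    # Back-of-envelope: API cost vs. amortized self-hosted GPU cost per million tokens.
    # All self-hosted figures below are assumptions for illustration only.

    API_PRICE_PER_M_TOKENS = 2.00      # GPT-3.5 output price quoted in the thread ($/1M tokens)

    GPU_PRICE = 10_000                 # A100 purchase price mentioned in the thread ($)
    GPU_LIFETIME_YEARS = 3             # assumed depreciation period
    HOSTING_PER_YEAR = 1_250           # per-A100 colo/power estimate from the thread ($/year)
    TOKENS_PER_SECOND = 50             # assumed sustained throughput for a mid-sized model
    UTILIZATION = 0.30                 # assumed fraction of time the GPU is actually busy

    yearly_cost = GPU_PRICE / GPU_LIFETIME_YEARS + HOSTING_PER_YEAR
    yearly_tokens = TOKENS_PER_SECOND * UTILIZATION * 3600 * 24 * 365
    self_hosted_per_m = yearly_cost / (yearly_tokens / 1_000_000)

    print(f"self-hosted: ${self_hosted_per_m:.2f} per 1M tokens")   # ~ $9.70 with these inputs
    print(f"API:         ${API_PRICE_PER_M_TOKENS:.2f} per 1M tokens")

Under these assumptions the API wins comfortably; push utilization and throughput high enough and self-hosting can come out ahead, which is exactly where the disagreement in this thread sits.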
| rosywoozlechan wrote: | The service quality sucks. You're getting what you pay for. We | switched to Azure OpenAI APIs because of all the service quality | issues.
| layer8 wrote: | Isn't OpenAI too cheap to be sustainable, and currently living | off Microsoft's $10B investment?
| xnx wrote: | Nothing in that article convinces me the situation couldn't | change entirely in any given month. Google Gemini could be more | capable. Any number of new players (AWS, Microsoft, Apple) could | enter the market in a serious way. The head start OpenAI has in | usage data is small and probably eclipsed by the clickstream and | data stores that Google and Microsoft have access to. I see no | durable advantage for OpenAI.
| freedomben wrote: | Gemini very well might be the biggest threat to OpenAI. ChatGPT | has first-mover advantage so has a decent moat, but the number | of people willing to pay $20 per month for something worse[1] | than they get for free with google.com is going to dwindle. I'd | be very worried if I were them. | | [1]: That knowledge cutoff and the terrible UX of browsing the web are | brutal compared to the experience of Bard
| appplication wrote: | The premise of this is flawed. OpenAI is cheap because it has to | be right now. They need to establish market dominance quickly, | before competitors slide in. The winner of this horse race is not | going to be the company with the best performing AI, it's going | to be the one who does the best job at creating an outstanding | UX, ubiquitous presence, entrenching users, and building | competitive moats that are not feature-differentiated, because at | best even cutting-edge features are only 6-12 months ahead of | the competition cloning or beating them. | | This is Uber/AirBnB/WeWork/literally every VC-subsidized hungry- | hungry-hippos market grab all over again. If you're falling in | love because the prices are so low, that is ephemeral at best and | is not a moat. Someone try calling an Uber in SF today and tell | me how much that costs you and how much worse the experience is | vs 2017. | | OpenAI is the undisputed future of AI... for timescales of 6 months | and less. They are still extremely vulnerable to complete | disruption and as likely to be the next MySpace as they are | Facebook.
| shaburn wrote: | Your Uber/AirBnB/WeWork all have physical base units with | ascending costs due to inflation and theoretical economies of | scale. | | AI models have some GPU constraints but could easily reach a | state where the cost to operate falls and becomes relatively | trivial with almost no lower bound, for most use cases. | | You are correct there is a race for market share. The crux in | this case will be keeping it. Easy come, easy go. Models often | make the worst business model.
| monocasa wrote: | Probably why Altman has been talking so much about how | dangerous it is and how regulations are needed. No natural | moat, so building a regulatory one.
| blackoil wrote: | This point is discussed in the article. The title is not aimed at | Google/Meta; they'll invest all the billions that they have to. | | It is aimed at the consumers of these models: is there even a point | to training your own or experimenting with OSS?
| hendersoon wrote: | Sure, open models often require much less hardware than | chatGPT3.5 and offer ballpark (and constantly improving) | performance and accuracy. ChatGPT3.5 scores 85 in ARC and the | huggingface leaderboard is up to 77. | | If you need chatGPT4-quality responses they aren't close yet, | but it'll happen.
| toddmorey wrote: | Just heard Steve today from Builder.io who did an impressive | launch of Figma -> code powered by AI. | | They trained a custom model for this. Better accuracy, sure, | but I was a little surprised to watch how much faster it is | than GPT4. | | Based on their testing, they've become believers in domain | specific smaller models, especially for performance. | ldjkfkdsjnv wrote: | Completely wrong, the best AI will win. There is insane demand | for better models. | datadrivenangel wrote: | There is insane demand for good enough models at extremely | good prices. | | Better beyond a certain point is unlikely to be competitive | with the cheaper models. | oceanplexian wrote: | Yep, quality over quantity. The difference between 99.9% | accurate and 99.999% accurate can be ridiculously valuable in | so many real world applications where people would apply | LLMs. | gbmatt wrote: | Only Big Tech (Microsoft,Google,Facebook) can crawl the web | at scale because they own the major content companies and | they severly throttle the competition's crawlers, and | sometimes outright block them. I'm not saying it's impossible | to get around, but it is certainly very difficult, and you | could be thrown in prison for violating the CFAA. | PaulHoule wrote: | I'm not sure if training on a vast amount of content is | really necessary in the sense that linguistic competence | and knowledge can probably be separated to some extent. | That is, the "ChatGPT" paradigm leads to systems that just | confabulate and "makes shit up" and making something | radically more accurate means going to something retrieval- | based or knowledge graph-based. | | In that case you might be able to get linguistic competence | with a much smaller model that you end up training with a | smaller, cleaner, and probably partially synthetic data | set. | wkat4242 wrote: | The improvements seem to be leveling off already. GPT-4 isn't | really worth the extra price to me. It's not that much | better. | | What I would really want though is an uncensored LLM. OpenAI | is basically unusable now, most of its replies are like "I'm | only a dumb AI and my lawyers don't want me to answer your | question". Yes I work in cyber. But it's pretty insane now. | bugglebeetle wrote: | GPT-4, correctly prompted, is head and shoulders above | everything for coding. All the text generation stuff and | NLP tasks, it's a toss-up. | jrockway wrote: | I haven't played with the self-hosted LLMs at all yet, but | back when Stable Diffusion was brand new I had a ton of fun | creating images that lawyers wouldn't want you to create. | ("Abraham Lincoln and Donald Trump riding a battle | elephant." It's just so much funnier with living people!) I | imagine that Llama-2 and friends offer a similar | experience. | PaulHoule wrote: | Depends how you define quality. This paper reflects my own | experience | | https://arxiv.org/abs/2305.08377 | | and shows how LLM technology has a lot more to offer than | "ChatGPT". The real takeaway is that by training LLMs with | real training data (even with a "less powerful" model) you | can get an error rate more than 10x less than you get with | the "zero shot" model of asking ChatGPT to answer a question | for you the same way that Mickey Mouse asked the broom to | clean up for him in _Fantasia._ The "few-shot" approach of | supplying a few examples in the attention window was a little | better but not much. 
| | The problem isn't something that will go away with a more | powerful model because the problem has a lot to do with the | intrinsic fuzziness of language. | | People who are waiting for an exponentially more expensive | ChatGPT-5 to save them will be pushing a bubble around under | a rug endlessly while the grinds who formulate well-defined | problems and make training sets will actually cross the | finish line. | | Remember that Moore's Law is over in the sense that | transistors are not getting cheaper generation after | generation, that is why the NVIDIA 40xx series is such a | disappointment to most people. LLMs have some possibility of | getting cheaper from a software perspective as we understand | how they work and hardware can be better optimized to make | the most of those transistors, but the driving force of the | semiconductor revolution is spent unless people find some | entirely different way to build chips. | | But... people really want to be like Mickey in _Fantasia_ and | hope the grinds are going to make magic for them. | sbierwagen wrote: | > Remember that Moore's Law is over in the sense that | transistors are not getting cheaper generation after | generation, that is why the NVIDIA 40xx series is such a | disappointment to most people. | | Huh? The NVIDIA H100 has twice the FLOPS of the A100 on a | smaller die. How is that not Moore's law? | mg wrote: | I don't think Uber and AirBnB are good comparisons. | | Both are B2C and have network effects. | paul7986 wrote: | The PI iPhone app has a solid UX and even better UX if Apple | (bought it) integrated into Siri. | kcorbitt wrote: | Eh, OpenAI is too cheap to beat at their own game. | | But there are a ton of use-cases where a 1 to 7B parameter fine- | tuned model will be faster, cheaper and easier to deploy than a | prompted or fine-tuned GPT-3.5-sized model. | | In fact, it might be a strong statement but I'd argue that _most_ | current use-cases for (non-fine-tuned) GPT-3.5 fit in that | bucket. | | (Disclaimer: currently building https://openpipe.ai; making it | trivial for product engineers to replace OpenAI prompts with | their own fine-tuned models.) | kristjansson wrote: | This article might have a point about the data flywheel, but it's | lost in the confused economics in the second half. Why would we | expect to hire one engineer per p4.24x instance? Why do we think | OpenAI needs a whole p4.24x to run fine tuning? Why do we ignore | the higher costs on the inference side for fine-tuned models? Why | do we think OpenAI spends _any_ money on racking-and-stacking | GPUs rather than just take them at (hyperscaler) cost from Azure? | oceanplexian wrote: | Has anyone actually used GPT4? It's not "cheap". | | It was roughly $150 for me to build a small dataset with a few | thousand quarter-page chunks of text for a data project using | GPT4. GPT3 is substantially cheaper but it would hallucinate 30% | of the time; honestly a nice fine-tune of LlaMA is on-par with | GPT3 and after the sunk cost all it costs is a few $0.01 in | electricity to generate the same sized dataset. | slowhadoken wrote: | It's insanely expensive to run and operate "AI". 
Meredith | Whittaker's talk on AI is very insightful | https://www.youtube.com/watch?v=amNriUZNP8w
| slowhadoken wrote: | Thanks to traumatized $2 an hour Kenyan labor, yeah | https://time.com/6247678/openai-chatgpt-kenya-workers/
| pimpampum wrote: | Classic anti-competition strategy, sell below cost and burn money | until the competition is out, then sell higher than you could have | ever sold with competition.
| BrunoJo wrote: | We just started a service offering different open source models with | an OpenAI-compatible API [1]. The pricing isn't final and we | haven't officially launched yet but you should be able to save at | least 75% compared to GPT 3.5. | | [1] https://lemonfox.ai/
| Meegul wrote: | Are you doing this profitably? If so, does that entail owning | your own hardware or renting from cheaper services such as | Lambda?
| slowhadoken wrote: | none of it is cheap, "AI" is insanely expensive. Meredith Whittaker | talks about it in this interview | https://www.youtube.com/watch?v=amNriUZNP8w She's the president | of the Signal Foundation.
| AJRF wrote: | I read this and think "That won't last long". | | The pricing is too good to be true when you think about it | rationally. If they raise prices they seem much, much less | attractive than using AWS or Azure. | | Amazon seem to have a much better business built around their | Bedrock offering. And all their other tools are available there, | like SageMaker, EC2, integration with MLFlow, etc, etc. | | I guess the same goes for Azure; if you are already using it it's | much easier to just stick with whatever they are offering for LLM | Ops. | | OpenAI offering just models doesn't seem like it can last | forever, and to compete with AWS or Azure at the enterprise level | they need to build all the things Amazon/MS have built. | | The other side of that coin seems much more realistic.
| DominikPeters wrote: | > While per-token inference costs for fine-tuned GPT-3.5 is 10x | more expensive than GPT-3.5 it is still 10x cheaper than GPT-4! | | Not quite accurate; finetuned 3.5 is only about 4x cheaper than GPT-4. | Cost per million output tokens from https://openai.com/pricing: | | $2 - GPT-3.5 | $16 - finetuned GPT-3.5 | $60 - GPT-4
___________________________________________________________________ (page generated 2023-10-12 21:00 UTC)