[HN Gopher] OpenAI is too cheap to beat
       ___________________________________________________________________
        
       OpenAI is too cheap to beat
        
       Author : cgwu
       Score  : 162 points
       Date   : 2023-10-12 18:16 UTC (2 hours ago)
        
 (HTM) web link (generatingconversation.substack.com)
 (TXT) w3m dump (generatingconversation.substack.com)
        
       | eurekin wrote:
        | Didn't see batching factored into the equation; that might skew
        | the numbers a bit.
        
         | sidnb13 wrote:
          | Yep, batching is a feature I really wish the OpenAI API had.
          | That, and the ability to intelligently cache frequently used
          | prompts. Both are much easier to achieve with a self-hosted
          | OSS model, so I guess it's a speed + customizability/cost
          | tradeoff for the time being.
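          | 
          | A minimal sketch of that kind of prompt cache (exact-match
          | only; call_llm here is a hypothetical stand-in for whatever
          | completion call you use):
          | 
          |   import hashlib
          | 
          |   cache: dict[str, str] = {}
          | 
          |   def cached_generate(prompt: str, call_llm) -> str:
          |       key = hashlib.sha256(prompt.encode()).hexdigest()
          |       if key not in cache:
          |           # only pay for prompts we haven't seen before
          |           cache[key] = call_llm(prompt)
          |       return cache[key]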
        
           | advaith08 wrote:
            | IMO they don't have batching because they pack sequences
            | before passing them through the model, so a single sequence
            | in a batch on OpenAI might contain requests from multiple
            | customers.
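            | 
            | Roughly, packing concatenates many short requests into one
            | fixed-length sequence instead of padding each request out
            | to full length. A minimal sketch (not OpenAI's actual
            | implementation):
            | 
            |   def pack_requests(token_lists, max_len, sep_id):
            |       # greedily concatenate tokenized requests into
            |       # fixed-length sequences instead of padding each one
            |       packed, current = [], []
            |       for tokens in token_lists:
            |           if current and len(current) + len(tokens) + 1 > max_len:
            |               packed.append(current)
            |               current = []
            |           current += tokens + [sep_id]  # request boundary
            |       if current:
            |           packed.append(current)
            |       return packed
            | 
            |   # three short requests share one 16-token sequence
            |   print(pack_requests([[1, 2, 3], [4, 5], [6, 7]],
            |                       max_len=16, sep_id=0))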
        
       | jonplackett wrote:
       | Is this a reflection of OpenAI's massive scale making it so cheap
       | for them?
       | 
       | Or is it the deal with Microsoft for cloud services making it
       | cheap?
       | 
       | Or are they just operating at a massive loss to kill off other
       | competition?
       | 
       | Or something else?
        
         | 4death4 wrote:
         | Probably all three:
         | 
          | 1) They hire top talent to make their models as efficient as
          | possible.
         | 
         | 2) They have a sweetheart deal with MS.
         | 
         | 3) They're better funded than everyone else and bringing in
         | substantial revenue.
        
           | smachiz wrote:
           | deleted
        
             | ryduh wrote:
             | Is this a guess or is it informed by facts?
        
             | sebzim4500 wrote:
              | Are you just suggesting this as an option or do you have
              | evidence that it is true?
        
           | ugjka wrote:
            | They are also trying to lobby the government for AI
            | "regulation" in order to limit any competitor's ability to
            | reach OpenAI's level.
        
           | wkat4242 wrote:
            | They basically are MS by now. Everyone at Microsoft I work
            | with literally calls it an 'acquisition', even though they
            | only own a share. It's pretty clear what their plans are.
        
         | SkyMarshal wrote:
         | Probably the first two, plus first-mover brand recognition.
         | Millions of $20 monthly subs for GPT4 add up.
         | 
         | They might also be operating at a loss afaik, but I suspect
         | they're one of the few that can break even just based on scale,
         | brand recognition, and economics.
        
           | michaelbuckbee wrote:
            | The $20/mo sub is also the lead-in to unlocking paid API
            | access.
        
           | sarchertech wrote:
           | I haven't heard any evidence that they have millions of Plus
           | subscribers.
           | 
           | I've seen 100 to 200 million active users, but nothing about
           | paid users from them. The surveys I saw when doing a quick
           | google search reported much less than 1% of users paying.
        
             | SkyMarshal wrote:
             | Yeah I don't know what the actual subscription numbers are,
             | would be surprised if OpenAI is publishing that info.
        
         | ShadowBanThis01 wrote:
         | They're mining the gullible for phone numbers, among other
         | things.
        
         | vsreekanti wrote:
         | Probably some combination of all the above! I think 1 and 2 are
         | interlinked though -- the cheaper they can be, the more they
         | build that moat. They might be eating the cost on these APIs
         | too, but unlike the Uber/Lyft war, it'll be way stickier.
        
         | te_chris wrote:
          | There are also just the benefits of being in the market, at
          | scale, and being exposed to the full problem space of serving
          | and maintaining services that use these models. It's one
          | thing to train and release an OSS model; it's another to put
          | it into production and run all the ops around it.
        
         | iliane5 wrote:
         | I think it's mostly the scale. Once you have a consistent user
         | base and tons of GPUs, batching inference/training across your
         | cluster allows you to process requests much faster and for a
         | lower marginal cost.
        
       | ilaksh wrote:
       | I think the weird thing about this is that it's completely true
       | right now but in X months it may be totally outdated advice.
       | 
       | For example, efforts like OpenMOE
       | https://github.com/XueFuzhao/OpenMoE or similar will probably
       | eventually lead to very competitive performance and cost-
       | effectiveness for open source models. At least in terms of
       | competing with GPT-3.5 for many applications.
       | 
       | Also see https://laion.ai/
       | 
       | I also believe that within say 1-3 years there will be a
       | different type of training approach that does not require such
       | large datasets or manual human feedback.
        
         | sidnb13 wrote:
         | > I also believe that within say 1-3 years there will be a
         | different type of training approach that does not require such
         | large datasets or manual human feedback.
         | 
          | I guess if we ignore pretraining, doesn't sample-efficient
          | fine-tuning on carefully curated instruction datasets sort of
          | achieve this? LIMA and OpenOrca show some really promising
          | results to date.
        
           | sharemywin wrote:
            | DistilBERT was trained from BERT. There might be an angle
            | in using another model to train your model, especially if
            | you're trying to get something to run locally.
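            | 
            | For what it's worth, a minimal sketch of that kind of
            | distillation loss (PyTorch; model and data loading omitted,
            | and the temperature is a tunable assumption):
            | 
            |   import torch.nn.functional as F
            | 
            |   def distillation_loss(student_logits, teacher_logits,
            |                         temperature=2.0):
            |       # KL divergence between softened teacher and student
            |       # predictions, DistilBERT-style
            |       t = temperature
            |       s = F.log_softmax(student_logits / t, dim=-1)
            |       p = F.softmax(teacher_logits / t, dim=-1)
            |       # t**2 keeps gradient scale comparable across temps
            |       return F.kl_div(s, p, reduction="batchmean") * t * t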
        
         | nico wrote:
         | > I also believe that within say 1-3 years there will be a
         | different type of training approach that does not require such
         | large datasets or manual human feedback
         | 
         | This makes a lot of sense. A small model that "knows" enough
         | English and a couple of programming languages should be enough
         | for it to replace something like copilot, or use plug-ins or do
         | RAG on a substantially larger dataset
         | 
         | The issue right now is that to get a model that can do those
         | things, the current algorithms still need massive amounts of
         | data, way more than what the final user needs
        
         | Dwedit wrote:
          | Abbreviate Mixture of Experts as "MoE" and the anime fans
          | immediately start rushing in...
        
       | daft_pink wrote:
        | I'm confused: don't A100s cost $10,000 to buy? Why would you
        | pay $166k per year to rent?
        
         | sidnb13 wrote:
         | I would assume the datacenter and infra needed would also
         | contribute a sizeable chunk to the costs when you consider
         | upkeep to run it 24/7
        
         | latchkey wrote:
         | For the same reason people use AWS.
         | 
         | Spending the capex/opex to run a cluster of compute isn't easy
         | or cheap. It isn't just the cost of the GPU, but the cost of
         | everything else around it that isn't just monetary.
        
           | etothepii wrote:
            | This could be an interesting comparison. My experience with
            | AWS is that it was super easy and cheap to start on. By the
            | time we _could_ use whole servers, we were using so much
            | AWS orchestration that moving off is going to be put off
            | until we are at least at $1M ARR, and probably until we're
            | at $5M.
            | 
            | Making adoption easy and giving a free base tier but
            | charging more later can be a very effective model for
            | getting startups stuck on you. It probably even makes
            | adoption by small teams in big companies possible, which
            | can then grow...
        
         | dekhn wrote:
         | How much does an A100 consume in power a year (in dollar
         | costs)? How much does it cost to hire and retain datacenter
         | techs? How long does it take to expand your fleet after a user
         | says "we're gonna need more A100s?" How many discounts can you
         | get as a premier customer?
         | 
         | Answer these questions, and the equation shifts a bunch!
        
           | shrubble wrote:
           | Not really.
           | 
           | A full rack with 16 amps usable power and some bandwidth is
           | $400/month in Kansas City, MO. That is enough to power 5x
           | A100s 24x7, so 10k plus $80 per month each, amortized, of
           | course many more A100s would drop the price.
           | 
           | Once installed in the rack ($250 1 time cost) you shouldn't
           | need to touch it. So 10k plus $1250 per A100, per year
           | including power. You can put 2 or 3 A100s per cheapo Celeron
           | based CPU with motherboards.
           | 
           | Of course if doing very bursty work then it may well make
           | sense to rent...
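            | 
            | Back-of-the-envelope with those numbers (a sketch: 3-year
            | amortization is my assumption, and the ~$1.1/hr cloud A100
            | rate is the one quoted elsewhere in this thread):
            | 
            |   # colo: amortized GPU price plus a share of rack power
            |   gpu_price = 10_000          # USD, used 40GB A100
            |   power_rack = 400 / 5 * 12   # $400/mo rack across 5 GPUs
            |   colo_per_year = gpu_price / 3 + power_rack   # ~$4,293
            | 
            |   # cloud: rent the same GPU around the clock
            |   cloud_per_year = 1.1 * 24 * 365              # ~$9,636
            |   print(colo_per_year, cloud_per_year)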
        
             | akomtu wrote:
             | And how many A100s do you need to do something meaningful
             | with LLMs?
        
               | shrubble wrote:
               | The funding has to come from somewhere, right? You either
               | pay up front and save money over time, or pay as you go
               | and pay more...
        
             | dekhn wrote:
             | Did you also include the network required to make the A100s
             | talk to each other? Both the datacenter network (so the
             | CPUs can load data) and the fabric (so the A100s can talk?)
             | 
              | You also left out the datacenter tech costs -- probably
              | at least $50K/individual-year in KC (although I guess I'd
              | just work for free ribs).
              | 
              | If you're putting A100s into Celeron motherboards... I
              | don't know what to say. You're not saving money by
              | putting a Ferrari engine in a Prius.
        
           | latchkey wrote:
            | $50m of GPU capex (which is A LOT) is about 2-3MW of power;
            | it isn't that much.
           | 
           | The problem though is that getting 2-3MW of power in the US
           | is increasingly difficult and you're going to pay a lot more
           | for it since the cheap stuff is already taken.
           | 
           | Even more distressing is that if you're going to build new
           | data center space, you can't get the rest of the stuff in the
           | supply chain... backup gennies, transformers, cooling towers,
           | etc...
        
         | amluto wrote:
         | Those are 8x A100 systems.
        
         | joefourier wrote:
          | AWS is extremely overpriced for nearly every service. I don't
          | know why anyone outside of startups with VC money to burn or
          | bigcos that need the "no one ever got fired for buying IBM"
          | guarantee would use them. You're better off with Lambda Labs
          | or others which charge only $1.1/h per A100.
          | 
          | Also, that is an 8xA100 system as others have noted, but it
          | is the 40GB one, which can be found on eBay for as low as $3k
          | if you go with the SXM4 version (although the price of
          | supporting components may vary) or $5k for the PCIe version.
        
           | wg0 wrote:
            | There are only two services that are dirt cheap and very
            | reliable and useful: S3 and SQS. The rest can get very
            | expensive very quickly.
           | 
           | You can build a lot of stuff on top of these two.
        
             | ommpto wrote:
              | Even for S3, while the storage is dirt cheap, they still
              | have exorbitant bandwidth pricing.
        
             | charcircuit wrote:
             | S3 is not dirt cheap. Bandwidth is ludicrously expensive.
        
           | charlesischuck wrote:
            | With AWS you pay for the whole system, not just the GPU.
            | 
            | It's absolutely worth the money when you look at the whole
            | picture. Also, Lambda Labs never has availability; I
            | actually can schedule a distributed cluster on AWS.
        
             | AndroTux wrote:
             | > It's absolutely worth the money when you look at the
             | whole picture.
             | 
             | That highly depends on many things. If you run a business
             | with a relatively steady load that doesn't need to scale
             | quickly multiple times per day, AWS is definitely not for
             | you. Take Let's Encrypt[1] as an example. Just because
             | cloud is the hype doesn't mean it's always worth it.
             | 
             | Edit: Or a personal experience: I had a customer that
             | insisted on building their website on AWS. They weren't
             | expecting high traffic loads and didn't need high
             | availability, so I suggested to just use a VPS for $50 a
             | month. They wanted to go the AWS route. Now their website
             | is super scalable with all the cool buzzwords and it costs
             | them $400 a month to run. Great! And in addition, the whole
             | setup is way more complex to maintain since it's built on
             | AWS instead of just a simple website with a database and
             | some cache.
             | 
             | [1] https://news.ycombinator.com/item?id=37536103
        
         | nharada wrote:
         | Sometimes I need 512 GPUs for 3 days.
        
         | charlesischuck wrote:
          | A top-end GPU that makes you competitive now costs $20-50k
          | per GPU.
          | 
          | To train a top model you need hundreds of them in a very
          | advanced datacenter.
          | 
          | You can't just plug GPUs into standard systems and train;
          | everything is custom.
          | 
          | The technical talent required for these systems is rare, to
          | say the least. The technical talent to make a model is also
          | rare.
          | 
          | I trained a few foundation models on images, and I would
          | NEVER buy any of them. These guys are on a wildly different
          | scale than basically everyone.
        
       | SkyMarshal wrote:
       | I think OpenAI may eventually have to go upmarket, as basic "good
       | enough" AI becomes increasingly viable and cheap/free on consumer
       | level devices, supplied by FOSS models and apps.
       | 
       | Apple may be leading the way here, with Apple Silicon
       | prioritizing AI processing and built into all their devices.
       | These capabilities are free (or at least don't require an extra
       | sub), and just used to sell more hardware.
       | 
       | OpenAI is clearly going to compete in that market with its
       | upcoming smart phone or device [1]. But what revenue model can
       | OpenAI use to compete with Apple's and not get undercut by it? I
       | suppose hardware + free GPT3.5, and optional subscription to GPT4
       | (or whatever their highest end version is). Maybe that will be
       | competitive.
       | 
       | I also wonder what mobile OS OpenAI will choose. Probably not
       | Android, otherwise they would have partnered with Google. A
       | revamped and updated Microsoft mobile OS maybe, given their MS
        | partnership? Or something new and bespoke? I could imagine Jony
        | Ive demanding something new, purpose-built, and designed from
       | scratch for a new AI-oriented UI/UX paradigm.
       | 
       | A market for increasingly sophisticated AI that can only be done
       | in huge GPU datacenters will exist, and that's probably where the
       | margins will be for a long time. I think that's what OpenAI,
       | Microsoft, Google, and the others will be increasingly competing
       | for.
       | 
       | [1]:https://www.reuters.com/technology/openai-jony-ive-talks-
       | rai...
        
         | vsreekanti wrote:
         | Yep, we agree that the obvious direction of innovation for OSS
         | models is smaller and cheaper, likely at roughly the same
         | quality: https://generatingconversation.substack.com/p/open-
         | source-ll...
        
           | smcleod wrote:
           | Also more privacy respecting, and more customisable /
           | flexible.
        
         | mensetmanusman wrote:
         | Please Apple let me replace worthless Siri with ChatGPT on my
         | iPhone.
         | 
         | Would completely change how I use the device.
        
           | bitcurious wrote:
           | If you have the new iPhone with the action button, you can
           | set a shortcut to ask questions of ChatGPT. It's not as fluid
           | as Siri, and can't control anything, but still much more
           | useful.
        
           | CamperBob2 wrote:
           | Just yesterday, while driving: "Read last message."
           | 
           | Siri: "Sorry. Dictation service is unavailable at the
           | moment."
           | 
           | It's past time for excuses. High-level people at Apple need
           | to be fired over this. Hello? Tim? Do your job. Hello?
           | Anybody home...?
        
             | freedomben wrote:
              | Nobody is switching away from Apple over this, so
              | ultimately Tim _is_ doing his job. Under his watch Apple
              | has become the de facto choice for entire generations.
              | Between vendor lock-in/walled gardens and societal/cultural
              | pressures (don't want to be a green bubble!), they have one
              | of the stickiest user bases there are.
        
               | mensetmanusman wrote:
                | True, but that doesn't mean we shouldn't complain.
                | 
                | My hope is that the upcoming EU rulings allow
                | competition here, i.e. force Apple to stop standing in
                | the way of better software making their hardware
                | better.
        
               | CamperBob2 wrote:
               | Stop excusing shitty work from trillion-dollar companies.
               | It makes the world a worse place.
        
               | smoldesu wrote:
               | I think it's shitty and has no excuse, but the parent is
               | right. Apple has no incentive to respond to their users
               | since all roads lead to first-party Rome. It's why stuff
                | like the Digital Markets Act is more needed than some
               | people claim.
               | 
               | You know what would get Apple to fix this? Forced
               | competition. You know what Apple spends their trillions
               | preventing?
        
             | layer8 wrote:
             | Apple is ramping up spending in that area:
             | https://www.macrumors.com/2023/09/06/apple-conversational-
             | ai...
             | 
             | It'll probably take a while though.
        
         | grahamplace wrote:
         | > OpenAI is clearly going to compete in that market with its
         | upcoming phone
         | 
         | What phone are you referring to? A quick google didn't seem to
         | pull up anything related to OpenAI launching a hardware
         | product?
        
           | BudaDude wrote:
           | They are most likely referring to this in collaboration with
           | Jony Ive:
           | 
           | https://www.yahoo.com/entertainment/openai-jony-ive-talks-
           | ra...
        
             | SkyMarshal wrote:
             | Yes that one.
        
         | jimkoen wrote:
         | > OpenAI is clearly going to compete in that market with its
         | upcoming phone.
         | 
          | Excuse me, I'm not a native English speaker; do you mean like
          | a smartphone? Or do you mean some sort of other new business
          | direction? Where did you get the info that they're planning
          | to launch a phone?
        
           | MillionOClock wrote:
            | I believe there have been rumors that OpenAI was working
            | with Jony Ive to create a wearable device, but it was
            | unclear whether it would be a phone or something else.
        
           | SkyMarshal wrote:
           | Yes a smartphone, /corrected. It's a recent announcement:
           | 
           | https://www.nytimes.com/2023/09/28/technology/openai-
           | apple-s...
        
             | sharemywin wrote:
              | It's not really a phone; they mention ambient computing.
        
               | SkyMarshal wrote:
               | Oh, smart device then.
        
           | layer8 wrote:
           | https://www.reuters.com/technology/openai-jony-ive-talks-
           | rai...
        
         | layer8 wrote:
          | Where do you get the confidence that Apple will be able to
          | catch up to OpenAI's GPT? "Apple's built-in AI capabilities"
          | are very weak so far.
        
           | filterfiber wrote:
           | Not OP,
           | 
            | In my experience Apple's ML on iPhones is seamless. Tap and
            | hold on your dog in a picture and it'll cut out the
            | background; your photos are all sorted automatically,
            | including by person (and I think by pet).
            | 
            | OCR is seamless - you just select text in images as if it
            | were real text.
            | 
            | I totally understand these aren't comparable to LLMs -
            | rumor has it Apple is working on an LLM - and if their
            | execution is anything like their current ML execution it'll
            | be glorious.
            | 
            | (Siri objectively sucks, although I'm not sure it's fair to
            | compare Siri to an LLM, as AFAIK Siri does not do text
            | prediction but is instead a traditional "manually crafted
            | workflow" type of thing that just uses S2T to navigate.)
        
             | blackoil wrote:
             | >OCR is seamless
             | 
             | Wasn't that solved about a decade ago. Does anyone suck at
             | that?
        
               | filterfiber wrote:
               | > Does anyone suck at that?
               | 
                | Does Android even have native OCR? Last I checked,
                | everything required an OCR app of varying quality
                | (including on Windows/Linux).
                | 
                | On iOS/macOS you can literally just click on a picture
                | and select the text in it as if it weren't a picture. I
                | know for sure on iOS you don't even open an app to do
                | it; in any picture you can just select the text.
                | 
                | Last I checked, the open-source OCR tools were decent
                | but behind the closed-source stuff as well.
                | 
                | Random Google result of OCR on Android (could be
               | outdated) - https://www.reddit.com/r/androidapps/comments
               | /10te5et/why_oc...
        
               | smoldesu wrote:
               | > Does android even have native OCR?
               | 
               | Tesseract? https://github.com/tesseract-ocr/tesseract
        
           | SkyMarshal wrote:
           | I'm not saying they will on the high-end, but maybe on the
           | low end. Apple's strategy is to embed local AI in all their
           | devices. Local AI will never be as capable as AI running in
           | massive GPU datacenters, but if it can get to a point that
           | it's "good enough" for most average users, that may be enough
           | for Apple to undercut the low end of the market.
        
             | freedomben wrote:
             | > _Local AI will never be as capable as AI running in
             | massive GPU datacenters_
             | 
             | I'm not sure this is true, even in the short term. For some
             | things yes, that's definitely true. But for other things
             | that are real-time or near real-time where network latency
             | would be unacceptable, we're already there. For example,
             | Google's Pixel 8 launch includes real-time audio
             | processing/enhancing which is made possible by their new
             | Tensor chip.
             | 
             | I'm no fan of Apple, but I think they're on the right path
             | with local AI. It may even be possible that the tendency of
             | other device makers to put AI in the cloud might give Apple
             | a much better user experience, unless Google can start
             | thinking local-first which kind of goes against their
             | grain.
        
               | SkyMarshal wrote:
                | > _But for other things that are real-time or near
                | real-time where network latency would be unacceptable,
                | we're already there._
               | 
               | Agreed. Something else I wonder is if local AI in mobile
               | devices might be better able to learn from its real-time
               | interactions with the physical world than datacenter-
               | based AI.
               | 
               | It's walking around in the world with a human with all
               | its various sensors recording in real-time (unless
               | disabled) - mic, camera, GPS/location, LiDAR, barometer,
               | gyro, accelerometer, proximity, ambient light, etc. Then
               | the human uses it to interact with the world too in
               | various ways.
               | 
               | All that data can of course be quickly sent to a
               | datacenter too, and integrated into the core system
               | there, so maybe not. But I'm curious about this
               | difference and wonder what advantages local AI might
               | eventually confer.
        
               | sharemywin wrote:
                | I wonder if you could send the embeddings or some
                | higher-level compressed latent vector to the cloud and
                | get the best of both worlds.
                | 
                | GPS, phone orientation, last 5 apps you were in, etc.
                | --> embedding
                | 
                | You might even have something like "what time is it?"
                | compressed as its own embedding.
        
         | huevosabio wrote:
          | OpenAI will make its money on enterprise deals for
          | fine-tuning their latest and greatest on corporate data. They
          | are already making these big enterprise deals, and I think
          | that's where the money is.
          | 
          | They will keep pricing the off-the-shelf AI at cost to keep
          | competitors at bay.
          | 
          | As for competitors, Anthropic is the most similar to OpenAI
          | both in capabilities and business model. I am not sure what
          | Google is up to, since historically their focus has been on
          | using AI to enhance their products rather than making it a
          | product. The "dark horses" here are Stability and Mistral,
          | which are both OSS and European and will try to make that
          | their edge: they give the models away for _free_ but sell to
          | institutional clients that are more sensitive to which models
          | are being used and where the data is being handled.
         | 
         | Amazon and Apple are probably catching up. Apple likely thinks
         | that all of this just makes their own hardware more attractive.
         | It's not clear to me what Meta's end goal is.
        
         | tmpz22 wrote:
         | > I think OpenAI may eventually have to go upmarket
         | 
         | Let me introduce you to the VC business model. Get comical
         | amounts of money. Charge peanuts for an initial product. Build
         | a moat once you trap enough businesses inside it. Jack up
         | prices.
        
           | sharemywin wrote:
           | don't forget the sneaky TOS changes you have to agree to
        
           | robertlagrant wrote:
           | OpenAI'd better hope no one else does it too, if that's all
           | it takes.
        
       | latchkey wrote:
       | I just paid the $20 for a month to try it out. In my super
       | limited experience, GPT-4 is actually impressive and worth the
       | money.
        
         | smileysteve wrote:
          | I've spent the last few weeks comparing Google Duet with
          | ChatGPT 3.5, and ChatGPT seems years ahead.
        
         | a_wild_dandan wrote:
         | The value I get for that $20/month is astonishing. It's by far
         | the best discretionary subscription I've ever had.
         | 
         | That scares me. I hate moats and actively want out. Running the
         | uncensored 70B parameter Llama 2 model on my MacBook is great,
         | but it's just not a competitive enough general intelligence to
         | entirely substitute for GPT-4 yet. I think our community will
         | get there, but the surrounding water is deepening, and I'm
         | nervous...
        
           | sharemywin wrote:
            | > tentatively called "Claude-Next" -- that is 10 times more
            | capable than today's most powerful AI, according to a 2023
            | investor deck TechCrunch obtained earlier this year.
            | 
            | This is the thing that scares me.
            | 
            | When do these models stop getting smarter? Or at least slow
            | down?
        
       | minimaxir wrote:
       | When the ChatGPT API was released 7 months ago, I posted a
       | controversial blog post that the API was so cheap, it made other
       | text-generating AI obsolete:
       | https://news.ycombinator.com/item?id=35110998
       | 
        | Surprisingly, 7 months later nothing's changed. Even open-source
        | models are still tricky to make more cost-effective, despite the
        | many inference optimizations since then. Anthropic Claude is
        | closer in price and quality now, but there's no reason to
        | switch.
        
         | cainxinth wrote:
         | These are still early days. All the major players are willing
         | to lose billions to be top of mind with consumers in an
         | emerging market.
         | 
         | Either there will be some major technological breakthrough that
         | lowers their costs, or they will all eventually start raising
         | prices.
        
       | Eumenes wrote:
       | "too cheap to beat" sounds anti-competitive and monopolistic.
        | Large LLM providers are not dissimilar to industrial operations
        | at scale - it requires a lot of infrastructure, and the more
        | you buy/rent, the cheaper it gets. Early bird gets the worm, I
        | guess.
        
         | stevenae wrote:
         | Not sure I understand your comment, but generally you have to
         | prove anti-competitiveness /beyond/ too cheap to beat (unless
         | it is a proven loss-leader which, viz all big tech companies,
         | seems very hard to prove)
        
       | Havoc wrote:
       | Yep. Building a project that needs some LLMs. I'm very much of
       | the self-hosting mindset so will try DIY, but it's very obviously
       | the wrong choice by any reasonable metric.
       | 
       | OpenAI will murder my solution by quality, by availability, by
       | reliability and by scalability...all for the price of a coffee.
       | 
       | It's a personal project though & partly intended for learning
       | purposes so there is scope for accepting trainwreck level
       | tradeoffs.
       | 
       | No idea how commercial projects are justifying this though.
        
         | nine_k wrote:
         | One small caveat: OpenAI gets to see all your prompts, and all
         | the responses.
         | 
          | Sometimes this can be unacceptable. Law, medicine, finance --
          | all of them would prefer a self-hosted, private GPT.
        
           | kevlened wrote:
            | Their data retention policy on their APIs is 30 days, and
            | the data is not used for training [0]. In addition,
            | qualifying use cases (likely the ones you mentioned) are
            | eligible for zero data retention on most endpoints.
           | 
           | [0] - https://platform.openai.com/docs/models/how-we-use-
           | your-data
        
             | nine_k wrote:
             | In sensitive cases you do not think about the normal
             | policy, you think about the worst case. You just can't
             | afford a leak. Your local installation may be much better
             | protected than a public service, by technology and by
             | policy.
        
               | BoorishBears wrote:
               | For years people have essentially made a living off FUD
               | like "ignore the literal legal agreement and imagine all
               | the worst case scenarios!!!" to justify absolutely
               | farcical on-premise deployments of a lot of software, but
               | AI is starting to ruin the grift.
               | 
                | There _are_ some cases where you really can't afford to
                | send Microsoft data for their OpenAI offering... but
               | there are a lot more where some figurehead solidified
               | their power by insisting the company build less secure
               | versions of public offerings instead of letting their
               | "gold" go to a 3rd party provider.
               | 
               | As AI starts to appear as a competitive advantage, and
               | the SOTA of self-hosted lagging so ridiculously far
               | behind, you're seeing that work less and less. Take
                | Harvey.ai for example: it's a frankly non-functional
                | product, and it still manages to spook top law firms
                | with tech policies that have been entrenched for
                | decades into paying money, despite being OpenAI-based,
                | on the simple chance they might get outcompeted
                | otherwise.
        
             | littlestymaar wrote:
             | > and it's not used for training [0].
             | 
              | The policy says the data will "not be used to train or
              | improve OpenAI models", which doesn't mean it's not used
              | to gain knowledge about your prompts and your business
              | use case. In fact, the wording of the policy is loose
              | enough that they could train a policy model on it (just
              | not the LLM itself).
        
         | Der_Einzige wrote:
          | A lot of tools for constrained generation, creativity, and
          | related tasks rely on manipulating the entire log-probability
          | distribution. OpenAI won't expose this information and is
          | therefore shockingly uncompetitive on things like poetry
          | generation.
        
       | fulafel wrote:
        | This focuses on compute capacity, but wouldn't algorithmic
        | improvements be much more important in bang for the buck at
        | this stage? There's so much low-hanging fruit, as evidenced by
        | the constant stream of news about getting better results with
        | less hardware.
        
       | debacle wrote:
       | Open source always wins, in the end. This is a fluff piece.
        
         | downWidOutaFite wrote:
         | Where's the open source web search that is beating Google?
        
       | serjester wrote:
        | I think this is under-appreciated. I run a "talk-to-your-files"
        | website with ~$5K MRR and a pretty generous free tier. My
        | OpenAI costs have not exceeded $200/mo. People talk about using
        | smaller, cheaper models, but unless you have strong data
        | security requirements you're burdening yourself with serious
        | maintenance work and using objectively worse models to save
        | pennies. This doesn't even consider OpenAI continuously
        | lowering their prices.
        | 
        | I've talked to a good number of businesses, and 90% of custom
        | use cases would also have negligible AI costs. In my opinion,
        | unless you're in a super regulated industry or doing genuinely
        | cutting-edge stuff, you should probably just be using the best
        | that's available (OpenAI).
        
         | vsreekanti wrote:
         | I completely agree -- open-source models and custom deployments
         | just can't compete with the cost and efficiency here. The only
         | exception here is _if_ open-source models can get way smaller
         | and faster than they are now while maintaining existing
         | quality. That will make private deployments and custom fine-
         | tuning way more likely.
        
           | SkyMarshal wrote:
           | Or FOSS models remain the same size and speed, but hardware
           | for running them, especially locally, steadily improves till
           | the AI is "good enough" for a large enough segment of the
           | market.
        
         | hobs wrote:
          | How do you deal with the fact that Azure et al. don't appear
          | to be selling anyone additional capacity?
        
         | jejeyyy77 wrote:
          | How do your customers feel about you uploading potentially
          | confidential documents to a 3rd party?
        
           | CDSlice wrote:
           | If they are confidential they probably shouldn't be uploaded
           | to any website no matter if it calls out to OpenAI or does
           | all the processing on their own servers.
        
           | yunohn wrote:
            | It's simple really: lots of businesses share data with 3rd
            | parties to enable various services. OpenAI provides a
            | service contract claiming they do not mine/reshare/etc. the
            | data shared via their API. As the SaaS provider, you just
            | need to call it out in your user service agreement.
        
         | euazOn wrote:
         | Just curious, could you briefly mention some of the custom use
         | cases with negligible AI costs? Thanks
        
         | cyode wrote:
         | Are any OpenAI powered flows available to public, logged-out
         | user traffic? I've worried (maybe irrationally) about doing
         | this in a personal project and then dealing with malicious
         | actors and getting stuck with a big bill.
        
         | Bukhmanizer wrote:
         | The bleeding obvious is that OpenAI is doing what most tech
         | companies for the last 20 years have done. Offer the product
         | for dirt cheap to kill off competition, then extract as much
         | value from your users as possible by either mining data or
         | hiking the price.
         | 
         | I don't understand how people are surprised by this anymore.
         | 
         | So yeah, it's the best option right now, when the company is
         | burning through cash, but they're planning on getting that
         | money back from you _eventually_.
        
           | jaredklewis wrote:
           | > Offer the product for dirt cheap to kill off competition,
           | then extract as much value from your users as possible by
           | either mining data or hiking the price.
           | 
           | Genuine question, what are some examples of companies in that
           | "hiking the price" camp?
           | 
            | I can think of tons of tech companies that sold or sell
            | stuff at a loss for growth, but I'm struggling to find
            | examples where the companies are then able to turn dominant
            | market share into higher prices.
           | 
           | To be clear, I'm definitely not implying they are not out
           | there, just looking for examples.
        
             | loganfrederick wrote:
             | Uber, Netflix and the online content streaming services.
             | These are probably the most prominent examples from this
             | recent 2010s era.
        
             | spacebanana7 wrote:
             | The Google Maps API price hike of 2018 [1] is a relevant
             | example.
             | 
             | [1] https://kobedigital.com/google-maps-api-changes
        
             | beezlebroxxxxxx wrote:
              | Uber is probably the biggest pure example. When I was in
              | uni, when they first spread, Uber's entire business model
              | was to flood the market with hilariously low prices and
              | steep discounts. People overnight started using them like
              | crazy. They were practically giving away their product.
              | Now, they're as expensive as, if not sometimes more
              | expensive than, any other taxi or ridesharing service in
              | my area.
              | 
              | One thing I'll add is that it's not always that this ends
              | with higher prices in an absolute sense, but that the
              | tech company is able to essentially cut the knees out
              | from under their competitors until they're a shell of
              | their former selves. Then when the prices go "up",
              | they're in a way a return to the "norm", only now they
              | have a larger and dominant market share because of their
              | crazy pricing in the early stages.
        
               | wkat4242 wrote:
                | Yeah, I kinda wonder why people even use them anymore.
                | I've long gone back to real taxis because they're
                | cheaper and I don't have to book them; I can just grab
                | one on the street. Much more efficient than slowly
                | watching my driver edge his way toward me from 3
                | kilometers away.
        
               | jdminhbg wrote:
               | The number of places where you can reliably walk out onto
               | the street and hail a taxi is pretty small. Everywhere
               | else, the relevant decision is whether calling a
               | dispatcher or using a taxi company's app is
               | faster/cheaper/more reliable than Uber/Lyft.
        
             | mikpanko wrote:
              | - Uber/Lyft increased prices significantly (and partially
              | converted the increase into longer wait times) since they
              | got into profitability mode
             | 
             | - Google is showing more and more ads over time to power
             | high revenue growth YoY
             | 
             | - Unity has just tried to increase its prices
        
               | jaredklewis wrote:
                | I think Google fits more in the "extract as much value
                | from your users" bucket than the price-hiking one.
               | 
                | Uber/Lyft did raise prices, but what's interesting (at
                | least to me) is that if the strategy was to smother the
                | competition with low prices, it didn't seem to work.
               | 
               | Unity is interesting too, though I'm not sure it would
               | make a good poster child for this playbook. It raised
               | prices but seems to be suffering for it.
        
               | HillRat wrote:
               | Everyone's in "show your profits" mode, as befitting a
               | mature market with smaller growth potential relative to
               | the last few decades. Some of what we're talking about
               | here is just what happens when a company tries to use
               | investment capital to build a moat but fails (the
               | Uber/Lyft issue you mentioned -- there's no obvious moat
               | to ride-hailing, as with many software and app domains).
               | My theory is that, going forward, we're going to see a
               | much lower ceiling on revenue coupled with lots of
               | competition in the market as VC investments cool off and
               | companies can't spend their way into ephemeral market
               | dominance.
               | 
               | As for Unity, they're certainly dealing with a bunch of
               | underperforming PE and IPO-enabled M&A on the one hand
               | (really should have considered that AppLovin offer,
               | folks), but also just a failure to extract reasonable
               | income from their flagship product on the other; I don't
               | think their problems come from raising prices _per se_
               | (game devs pay for a lot already, an engine fee is
               | nothing new to them) as much as how they chose to do it
               | and the original pricing model they tried to force on
               | their clients. What they chose to do and the way they
                | handled it wasn't just bad, it was "HBS case study bad."
        
             | dboreham wrote:
             | VMWare, Docker.
        
           | zarzavat wrote:
           | OpenAI doesn't own transformers, they didn't even invent
           | them. They just have the best one at this particular time.
           | They have no moat.
           | 
           | At some point, someone else will make a competitive model, if
           | it's Facebook then it might even be open source, and the
           | industry will see price competition _downwards_.
        
             | strangemonad wrote:
              | This argument has always felt to me like saying "Google
              | has no moat in search, they just happen to currently have
              | the best PageRank. Nothing is stopping Yahoo from
              | creating a better one."
        
               | jdminhbg wrote:
               | Google has a flywheel where its dominant position in
               | search results in more users, whose data refines the
               | search algorithm over time. The question is whether
               | OpenAI has a similar thing going, or whether they just
               | have done the best job of training a model against a
               | static dataset so far. If they're able to incorporate
               | customer usage to improve their models, that's a moat
               | against competitors. If not, it's just a battle between
               | groups of researchers and server farms to see who is best
               | this week or next.
        
               | zarzavat wrote:
               | It's a different situation computationally. Transformers
               | are asymmetric: hard to train but easy to run.
               | 
               | There is no such thing as an open source Google because
               | Google's value is in its vast data centers. Search is
               | hard to train and hard to run.
               | 
               | GPT4 is not that big. It's about 220B parameters, if you
               | believe geohot, or perhaps more if you don't.
               | 
               |  _One_ hard drive.
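                | 
                | Quick arithmetic on that (a sketch; 220B is the
                | speculation above, and the byte sizes are the usual
                | fp16/int4 assumptions):
                | 
                |   params = 220e9
                |   fp16_gb = params * 2 / 1e9    # ~440 GB at 2 B/param
                |   int4_gb = params * 0.5 / 1e9  # ~110 GB quantized
                |   print(fp16_gb, int4_gb)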
        
               | shihab wrote:
                | My understanding is that Google search is a lot more
                | than just PageRank (MapReduce, for example). They had
                | lots of heuristics, data, machine learning before
                | anyone else, etc.
                | 
                | Whereas the underlying algorithms behind all these GPTs
                | so far are broadly the same. Yes, OpenAI probably does
                | have better data, model fine-tuning and other
                | engineering techniques now, but I don't feel it's
                | anything special that'll allow them to differentiate
                | themselves from competitors in the long run.
                | 
                | (If the data collected from current LLM users for
                | improving the model proves very valuable, that's
                | different. I personally think that's not the case now,
                | but who knows.)
        
             | YetAnotherNick wrote:
              | The difference between OpenAI and the next best model
              | seems to be increasing, not decreasing. Maybe Google's
              | Gemini could be competitive, but I don't believe open
              | source will ever match OpenAI's capability.
              | 
              | Also, OpenAI gets a significant discount on compute due
              | to favourable deals with Nvidia and Microsoft. And they
              | could design their servers better for their homogeneous
              | needs; they are already working on an AI chip.
        
         | goosinmouse wrote:
          | Are you using 3.5 turbo? It's always funny when I test a new
          | fun chatbot or something and see my API usage 10x just from a
          | single GPT-4 API call. Although I usually only have a $2 bill
          | every month from OpenAI.
        
         | littlestymaar wrote:
         | > you should probably just be using the best that's available
         | (OpenAI).
         | 
         | Sure, if you want to let a monopoly have all the added value
         | while you get to keep the rest you can do that.
         | 
         | Just make sure you're never successful enough to inspire them
         | though, otherwise you're dead the next minute. Oops.
        
       | zzbn00 wrote:
       | p4d.24xlarge spot price is $8.2 / hour in US East 1 at the
       | moment...
        
         | charlesischuck wrote:
         | Good luck getting that lol
        
       | tester756 wrote:
       | >iPhone of artificial intelligence
       | 
        | It feels like the biggest investor bait of this year.
        | 
        | Will it beat the ARM IPO?
        
       | lossolo wrote:
       | It's also worth noting that if you build your business on using
       | OpenAI's LLM or Anthropic etc, then, in the majority of cases
       | I've seen so far (no fine tuning etc), your competitor is just
       | one prompt away from replicating your business.
        
       | beauHD wrote:
        | I signed up for OpenAI's ChatGPT tool and entered a query like
        | 'What does the notation 1e100 mean?' (just to try it out). And
        | then when displaying the output it would start outputting the
        | reply in a slow way, like it was being drip-fed to me, and I
        | was like: 'what? surely this could be faster?'
        | 
        | Maybe I'm missing something crucial here, but why does it
        | drip-feed answers like this? Does it have to think really hard
        | about the meaning of 1e100? Why can't it just spit it out
        | instantly without such a delay/drip, like the near-instant
        | Wolfram Alpha?
        
         | baby wrote:
          | It could, but then you'd wait longer before seeing anything.
          | So one way to get faster (perceived) answers is to stream the
          | response as it is generated. And in GPT-based apps the
          | response is generated token by token (~4 chars), hence what
          | you're seeing.
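          | 
          | A minimal sketch of consuming that stream with the openai
          | Python package (the 0.x-era client is assumed here; model and
          | prompt are just placeholders):
          | 
          |   import openai  # pip install openai
          | 
          |   resp = openai.ChatCompletion.create(
          |       model="gpt-3.5-turbo",
          |       messages=[{"role": "user",
          |                  "content": "What does 1e100 mean?"}],
          |       stream=True,  # yield chunks as tokens are generated
          |   )
          |   for chunk in resp:
          |       delta = chunk["choices"][0]["delta"]
          |       print(delta.get("content", ""), end="", flush=True)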
        
         | maccam912 wrote:
          | It's a result of how these transformer models work. It's
          | pretty quick for the amount of work it does, but it's not
          | looking anything up; it's generating the reply a token at a
          | time.
        
         | notRobot wrote:
         | Under the hood, GPT works by predicting the next token when
         | provided with an input sequence of words. At each step a single
         | word is generated taking into consideration all the previous
         | words.
         | 
         | https://ai.stackexchange.com/questions/38923/why-does-chatgp...
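          | 
          | To make the loop concrete, a minimal greedy-decoding sketch
          | with a small open model (GPT-2 via HuggingFace transformers;
          | ChatGPT's internals obviously aren't public):
          | 
          |   import torch
          |   from transformers import AutoModelForCausalLM, AutoTokenizer
          | 
          |   tok = AutoTokenizer.from_pretrained("gpt2")
          |   model = AutoModelForCausalLM.from_pretrained("gpt2")
          |   ids = tok("1e100 means", return_tensors="pt").input_ids
          | 
          |   for _ in range(20):  # one forward pass per new token
          |       logits = model(ids).logits[:, -1, :]  # next-token scores
          |       next_id = logits.argmax(dim=-1, keepdim=True)  # greedy
          |       ids = torch.cat([ids, next_id], dim=-1)
          |       print(tok.decode(next_id[0]), end="", flush=True)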
        
         | swatcoder wrote:
         | The non-technical way to think about it is that ChatGPT "thinks
         | out loud" and can _only_ "think out loud".
         | 
         | Future products would be able to hide some of that, but for
         | now, that's what the ChatGPT / Bing Assistant product does.
        
         | codedokode wrote:
         | Because it needs to do billions of arithmetic operations to
         | generate a reply. Replying to questions is not an easy task.
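          | 
          | As a rough order of magnitude (a sketch; the ~2 FLOPs per
          | parameter per generated token rule of thumb and the 175B
          | model size are assumptions, since OpenAI doesn't publish
          | model details):
          | 
          |   params = 175e9
          |   flops_per_token = 2 * params  # ~3.5e11 FLOPs per token
          |   print(flops_per_token)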
        
       | iambateman wrote:
        | This is _the_ playbook for big, fast-scaling companies... Uber
        | subsidized every ride for _a decade_ before finally charging
        | market price, just to make sure that Uber was the only option
        | which made sense.
       | 
       | While it's nice to consume the cheap stuff, it is not good for
       | healthy markets.
        
       | matteoraso wrote:
       | It's not even just the cost of finetuning. The API pricing is so
       | low, you literally can't save money by buying a GPU and running
       | your own LLM, no matter how many tokens you generate. It's an
       | incredible moat for OpenAI, but something they can't provide is
       | an LLM that doesn't talk like an annoying HR manager, which is
       | the real use case for self-hosting.
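        | 
        | Rough numbers to illustrate (a sketch; the output price is
        | roughly GPT-3.5-turbo's list price at the time, and the A100
        | rate and single-stream throughput are assumptions):
        | 
        |   # cost to generate 1M output tokens
        |   api_cost = 1_000_000 / 1000 * 0.002   # ~$2 via the API
        | 
        |   gpu_per_hour = 1.10      # rented A100, rate quoted upthread
        |   tokens_per_second = 30   # assumed single-stream throughput
        |   hours = 1_000_000 / tokens_per_second / 3600
        |   gpu_cost = hours * gpu_per_hour       # ~$10
        |   print(api_cost, gpu_cost)
        | 
        | Heavy batching changes the GPU math, which is exactly the point
        | of the batching comments upthread.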
        
       | rosywoozlechan wrote:
        | The service quality sucks. You're getting what you pay for. We
        | switched to the Azure OpenAI APIs because of all the service
        | quality issues.
        
       | layer8 wrote:
       | Isn't OpenAI too cheap to be sustainable, and currently living
       | off Microsoft's $10B investment?
        
       | xnx wrote:
       | Nothing in that article convinces me the situation couldn't
       | change entirely in any given month. Google Gemini could be more
       | capable. Any number of new players (AWS, Microsoft, Apple) could
       | enter the market in a serious way. The head-start OpenAI has in
       | usage data is small and probably eclipsed by the clickstream and
       | data stores that Google and Microsoft have access to. I see no
       | durable advantage for OpenAI.
        
         | freedomben wrote:
          | Gemini very well might be the biggest threat to OpenAI.
          | ChatGPT has first-mover advantage and so a decent moat, but
          | the number of people willing to pay $20 per month for
          | something worse[1] than what they get for free with
          | google.com is going to dwindle. I'd be very worried if I were
          | them.
          | 
          | [1]: The knowledge cutoff and terrible UX of browsing the web
          | are brutal compared to the experience of Bard.
        
       | appplication wrote:
        | The premise of this is flawed. OpenAI is cheap because it has
        | to be right now. They need to establish market dominance
        | quickly, before competitors slide in. The winner of this horse
        | race is not going to be the company with the best-performing
        | AI; it's going to be the one that does the best job of creating
        | an outstanding UX, building a ubiquitous presence, entrenching
        | users, and building competitive moats that are not
        | feature-differentiated, because at best even cutting-edge
        | features are only 6-12 months ahead of the competition cloning
        | or beating them.
       | 
       | This is Uber/AirBnB/Wework/literally every VC subsidized hungry-
       | hungry-hippos market grab all over again. If you're falling in
       | love because the prices are so low, that is ephemeral at best and
       | is not a moat. Someone try calling an Uber in SF today and tell
       | me how much that costs you and how much worse the experience is
       | vs 2017.
       | 
       | OpenAI is the undisputed future of AI... for timescales 6 months
       | and less. They are still extremely vulnerable to complete
       | disruption and as likely to be the next MySpace as they are
       | Facebook.
        
         | shaburn wrote:
          | Your Uber/AirBnB/WeWork examples all have physical base units
          | with ascending costs due to inflation and theoretical
          | economies of scale.
          | 
          | AI models have some GPU constraints, but could easily reach a
          | state where the cost to operate falls and becomes relatively
          | trivial, with almost no lower bound, for most use cases.
          | 
          | You are correct that there is a race for market share. The
          | crux in this case will be keeping it. Easy come, easy go.
          | Models often make the worst business model.
        
           | monocasa wrote:
           | Probably why Altman has been talking so much about how
           | dangerous it is and how regulations are needed. No natural
           | moat, so building a regulatory one.
        
         | blackoil wrote:
         | This point is discussed in the article. The title isn't aimed
         | at Google/Meta; they'll invest all the billions they have to.
         | 
         | It's aimed at the consumers of these models: is there even a
         | point in training your own or experimenting with OSS?
        
           | hendersoon wrote:
           | Sure, open models often require much less hardware than
           | GPT-3.5 and offer ballpark (and constantly improving)
           | performance and accuracy. GPT-3.5 scores 85 on ARC and the
           | top of the Hugging Face leaderboard is up to 77.
           | 
           | If you need GPT-4-quality responses they aren't close yet,
           | but it'll happen.
        
         | toddmorey wrote:
         | Just heard Steve from Builder.io today, who did an impressive
         | launch of Figma -> code powered by AI.
         | 
         | They trained a custom model for this. Better accuracy, sure,
         | but I was a little surprised to see how much faster it is than
         | GPT-4.
         | 
         | Based on their testing, they've become believers in domain
         | specific smaller models, especially for performance.
        
         | ldjkfkdsjnv wrote:
         | Completely wrong, the best AI will win. There is insane demand
         | for better models.
        
           | datadrivenangel wrote:
           | There is insane demand for good enough models at extremely
           | good prices.
           | 
           | Better beyond a certain point is unlikely to be competitive
           | with the cheaper models.
        
           | oceanplexian wrote:
           | Yep, quality over quantity. The difference between 99.9%
           | accurate and 99.999% accurate can be ridiculously valuable in
           | so many of the real-world applications where people would use
           | LLMs.
        
           | gbmatt wrote:
           | Only Big Tech (Microsoft, Google, Facebook) can crawl the web
           | at scale, because they own the major content companies and
           | severely throttle the competition's crawlers, sometimes
           | outright blocking them. I'm not saying it's impossible to get
           | around, but it is certainly very difficult, and you could be
           | thrown in prison for violating the CFAA.
        
             | PaulHoule wrote:
             | I'm not sure that training on a vast amount of content is
             | really necessary, in the sense that linguistic competence
             | and knowledge can probably be separated to some extent.
             | That is, the "ChatGPT" paradigm leads to systems that just
             | confabulate and "make shit up", and making something
             | radically more accurate means going to something retrieval-
             | based or knowledge-graph-based.
             | 
             | In that case you might be able to get linguistic competence
             | with a much smaller model that you end up training on a
             | smaller, cleaner, and probably partially synthetic data
             | set.
        
           | wkat4242 wrote:
           | The improvements seem to be leveling off already. GPT-4 isn't
           | really worth the extra price to me. It's not that much
           | better.
           | 
           | What I would really want, though, is an uncensored LLM.
           | OpenAI is basically unusable now; most of its replies are
           | like "I'm only a dumb AI and my lawyers don't want me to
           | answer your question". Yes, I work in cyber, but it's pretty
           | insane.
        
             | bugglebeetle wrote:
             | GPT-4, correctly prompted, is head and shoulders above
             | everything else for coding. For all the text generation and
             | NLP tasks, it's a toss-up.
        
             | jrockway wrote:
             | I haven't played with the self-hosted LLMs at all yet, but
             | back when Stable Diffusion was brand new I had a ton of fun
             | creating images that lawyers wouldn't want you to create.
             | ("Abraham Lincoln and Donald Trump riding a battle
             | elephant." It's just so much funnier with living people!) I
             | imagine that Llama-2 and friends offer a similar
             | experience.
        
           | PaulHoule wrote:
           | Depends how you define quality. This paper reflects my own
           | experience
           | 
           | https://arxiv.org/abs/2305.08377
           | 
           | and shows how LLM technology has a lot more to offer than
           | "ChatGPT". The real takeaway is that by training LLMs with
           | real training data (even with a "less powerful" model) you
           | can get an error rate more than 10x less than you get with
           | the "zero shot" model of asking ChatGPT to answer a question
           | for you the same way that Mickey Mouse asked the broom to
           | clean up for him in _Fantasia._ The  "few-shot" approach of
           | supplying a few examples in the attention window was a little
           | better but not much.
           | 
           | The problem isn't something that will go away with a more
           | powerful model because the problem has a lot to do with the
           | intrinsic fuzziness of language.
           | 
           | People who are waiting for an exponentially more expensive
           | ChatGPT-5 to save them will be pushing a bubble around under
           | a rug endlessly while the grinds who formulate well-defined
           | problems and make training sets will actually cross the
           | finish line.
           | 
           | Remember that Moore's Law is over, in the sense that
           | transistors are not getting cheaper generation after
           | generation; that is why the NVIDIA 40xx series is such a
           | disappointment to most people. LLMs have some possibility of
           | getting cheaper from a software perspective as we understand
           | how they work and hardware can be better optimized to make
           | the most of those transistors, but the driving force of the
           | semiconductor revolution is spent unless people find some
           | entirely different way to build chips.
           | 
           | But... people really want to be like Mickey in _Fantasia_ and
           | hope the grinds are going to make magic for them.
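           | 
           | As a minimal sketch of the "few-shot" setup being contrasted
           | with real training data (the task, labels, and examples below
           | are made up for illustration):
           | 
           |     import openai  # v0.28-style client
           | 
           |     openai.api_key = "YOUR_KEY"
           | 
           |     few_shot = [
           |         {"role": "system", "content":
           |          "Label each review positive or negative."},
           |         {"role": "user", "content": "Great battery."},
           |         {"role": "assistant", "content": "positive"},
           |         {"role": "user", "content": "Broke in a day."},
           |         {"role": "assistant", "content": "negative"},
           |         {"role": "user", "content": "Exactly right."},
           |     ]
           | 
           |     resp = openai.ChatCompletion.create(
           |         model="gpt-3.5-turbo",
           |         messages=few_shot,
           |         temperature=0,
           |     )
           |     print(resp["choices"][0]["message"]["content"])
           | 
           | The fine-tuning alternative instead bakes thousands of such
           | labeled pairs into the weights of a (possibly much smaller)
           | model, which is the setup behind the >10x error reduction
           | described above.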
        
             | sbierwagen wrote:
             | > Remember that Moore's Law is over in the sense that
             | transistors are not getting cheaper generation after
             | generation, that is why the NVIDIA 40xx series is such a
             | disappointment to most people.
             | 
             | Huh? The NVIDIA H100 has twice the FLOPS of the A100 on a
             | smaller die. How is that not Moore's law?
        
         | mg wrote:
         | I don't think Uber and AirBnB are good comparisons.
         | 
         | Both are B2C and have network effects.
        
         | paul7986 wrote:
         | The Pi iPhone app has a solid UX, and it would be an even
         | better UX if Apple bought it and integrated it into Siri.
        
       | kcorbitt wrote:
       | Eh, OpenAI is too cheap to beat at their own game.
       | 
       | But there are a ton of use-cases where a 1 to 7B parameter fine-
       | tuned model will be faster, cheaper and easier to deploy than a
       | prompted or fine-tuned GPT-3.5-sized model.
       | 
       | In fact, it might be a strong statement but I'd argue that _most_
       | current use-cases for (non-fine-tuned) GPT-3.5 fit in that
       | bucket.
       | 
       | (Disclaimer: currently building https://openpipe.ai; making it
       | trivial for product engineers to replace OpenAI prompts with
       | their own fine-tuned models.)
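       | 
       | For a sense of what that looks like in practice, a minimal
       | sketch of serving a small fine-tuned model locally with Hugging
       | Face transformers (the checkpoint name is a placeholder, not a
       | real model):
       | 
       |     from transformers import pipeline
       | 
       |     # Placeholder checkpoint -- substitute your own fine-tune.
       |     generator = pipeline(
       |         "text-generation",
       |         model="my-org/my-finetuned-1b",
       |     )
       | 
       |     out = generator(
       |         "Summarize: The meeting was moved to Friday ...",
       |         max_new_tokens=64,
       |     )
       |     print(out[0]["generated_text"])
       | 
       | No network hop and no per-token bill, though you do take on the
       | serving and scaling work yourself.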
        
       | kristjansson wrote:
       | This article might have a point about the data flywheel, but it's
       | lost in the confused economics of the second half. Why would we
       | expect to hire one engineer per p4.24x instance? Why do we think
       | OpenAI needs a whole p4.24x to run fine-tuning? Why do we ignore
       | the higher costs on the inference side for fine-tuned models? Why
       | do we think OpenAI spends _any_ money on racking-and-stacking
       | GPUs rather than just taking them at (hyperscaler) cost from
       | Azure?
        
       | oceanplexian wrote:
       | Has anyone actually used GPT-4? It's not "cheap".
       | 
       | It was roughly $150 for me to build a small dataset with a few
       | thousand quarter-page chunks of text for a data project using
       | GPT-4. GPT-3 is substantially cheaper, but it would hallucinate
       | 30% of the time; honestly, a nice fine-tune of LLaMA is on par
       | with GPT-3, and after the sunk cost it only takes a few cents of
       | electricity to generate the same-sized dataset.
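       | 
       | For context, a rough sketch of where a number like $150 comes
       | from at the October 2023 GPT-4 (8K) prices; the chunk count and
       | token sizes below are guesses, not the actual workload:
       | 
       |     prompt_price = 0.03 / 1000   # $ per prompt token
       |     output_price = 0.06 / 1000   # $ per completion token
       | 
       |     n_chunks = 3000              # "a few thousand"
       |     prompt_toks = 800            # instructions + source text
       |     output_toks = 400            # roughly a quarter page
       | 
       |     total = n_chunks * (prompt_toks * prompt_price
       |                         + output_toks * output_price)
       |     print(f"~${total:.0f}")      # ~$144
       | 
       | A self-hosted fine-tune re-running the same job only pays for
       | electricity, which is the trade-off being described.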
        
         | slowhadoken wrote:
         | It's insanely expensive to run and operate "AI". Meredith
         | Whittaker's talk on AI is very insightful:
         | https://www.youtube.com/watch?v=amNriUZNP8w
        
       | slowhadoken wrote:
       | Thanks to traumatized $2 an hour Kenyan labor, yeah
       | https://time.com/6247678/openai-chatgpt-kenya-workers/
        
       | pimpampum wrote:
       | Classic anti-competitive strategy: sell below cost and burn money
       | until the competition is out, then sell higher than you could
       | ever have sold with competition.
        
       | BrunoJo wrote:
       | We just started a service that runs different open-source models
       | behind an OpenAI-compatible API [1]. The pricing isn't final and
       | we haven't officially launched yet, but you should be able to
       | save at least 75% compared to GPT-3.5.
       | 
       | [1] https://lemonfox.ai/
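       | 
       | For anyone unfamiliar, "OpenAI-compatible" typically just means
       | pointing the existing OpenAI client at a different base URL; a
       | minimal sketch (the base URL and model name below are
       | placeholders, not lemonfox.ai's actual values):
       | 
       |     import openai  # v0.28-style client
       | 
       |     openai.api_key = "YOUR_KEY"
       |     openai.api_base = "https://api.example.com/v1"
       | 
       |     resp = openai.ChatCompletion.create(
       |         model="some-open-model",  # provider-specific name
       |         messages=[{"role": "user", "content": "Hello!"}],
       |     )
       |     print(resp["choices"][0]["message"]["content"])
       | 
       | Existing code written against the OpenAI API can then switch
       | over with a configuration change rather than a code change.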
        
         | Meegul wrote:
         | Are you doing this profitably? If so, does that entail owning
         | your own hardware or renting from cheaper services such as
         | Lambda?
        
       | slowhadoken wrote:
       | None of it is cheap; "AI" is insanely expensive. Meredith
       | Whittaker talks about it in this interview:
       | https://www.youtube.com/watch?v=amNriUZNP8w She's the president
       | of the Signal Foundation.
        
       | AJRF wrote:
       | I read this and think "That won't last long".
       | 
       | The pricing is too good to be true when you think about it
       | rationally. If they raise prices, they seem much, much less
       | attractive than using AWS or Azure.
       | 
       | Amazon seems to have a much better business built around their
       | Bedrock offering, and all their other tools are available there,
       | like SageMaker, EC2, integration with MLFlow, etc.
       | 
       | I guess the same goes for Azure: if you are already using it,
       | it's much easier to just stick with whatever they are offering
       | for LLM Ops.
       | 
       | OpenAI offering just models doesn't seem like it can last
       | forever, and to compete with AWS or Azure at the enterprise
       | level they need to build all the things Amazon/MS have built.
       | 
       | The other side of that coin seems much more realistic.
        
       | DominikPeters wrote:
       | > While per-token inference costs for fine-tuned GPT-3.5 is 10x
       | more expensive than GPT-3.5 it is still 10x cheaper than GPT-4!
       | 
       | Not quite accurate; fine-tuned 3.5 is only about 4x cheaper than
       | GPT-4 ($60 / $16 = 3.75x). Cost per million output tokens, from
       | https://openai.com/pricing:
       | 
       |   $2  - GPT-3.5
       |   $16 - fine-tuned GPT-3.5
       |   $60 - GPT-4
        
       ___________________________________________________________________
       (page generated 2023-10-12 21:00 UTC)