[HN Gopher] Azure ChatGPT: Private and secure ChatGPT for intern...
       ___________________________________________________________________
        
       Azure ChatGPT: Private and secure ChatGPT for internal enterprise
       use
        
       Author : taubek
       Score  : 345 points
       Date   : 2023-08-13 18:35 UTC (4 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | jmorgan wrote:
        | This appears to be a web frontend with authentication for Azure's
        | OpenAI API, which is a great choice if you can't use ChatGPT or
        | its API at work.
        | 
        | If you're looking to try the "open" models like Llama 2 (or its
        | uncensored version, Llama 2 Uncensored), check out
        | https://github.com/jmorganca/ollama or some of the lower-level
        | runners like llama.cpp (which powers the aforementioned project
        | I'm working on) or Candle, the new project by Hugging Face.
        | 
        | What are folks' takes on this vs Llama 2, which was recently
        | released by Facebook Research? While I haven't tested it
        | extensively, the 70B model is supposed to rival ChatGPT 3.5 in
        | most areas, and there are now some new fine-tuned versions that
        | excel at specific tasks like coding (the 'codeup' model) or the
        | new WizardMath (https://github.com/nlpxucan/WizardLM), which
        | claims to outperform ChatGPT 3.5 on grade-school math problems.
        
       | littlestymaar wrote:
       | "private and secure" from the company that let contractor listen
       | to your private Teams conversation for data labeling purpose, and
       | monitor your activity on your own computer with their OS...
        
         | svaha1728 wrote:
         | Move fast and break things, including basic security. Why
         | anyone trusts Azure that all these prompts won't eventually be
         | leaked is beyond me. No one goes broke trusting Azure, but I'd
         | love it if someone was held responsible.
         | 
         | https://www.schneier.com/blog/archives/2023/08/microsoft-sig...
        
         | tharwan wrote:
         | Huh. I missed this one. Got a link?
        
           | dijital wrote:
           | At a guess it's this story:
           | https://www.vice.com/en/article/xweqbq/microsoft-
           | contractors...
        
             | littlestymaar wrote:
             | Ah yes it was Skype and not Teams, my bad.
        
       | alpinemeadow wrote:
        | We've had this at IKEA for a while now. Not impressed, but funny to
       | read the hallucinations.
        
         | BoorishBears wrote:
         | I'd expect a company like IKEA to have the expertise to create
         | interfaces specific to their workflows so hallucinations aren't
         | an issue.
         | 
         | Imo if you're making an open ended chat interface for a
         | business, you're doing it wrong.
        
       | singingfish wrote:
       | I was looking through our server logs the other day and spotted
       | the openai bot going through our stuff ... however a decent bit
       | of our content is now augmented by GPT ...
        
       | Havoc wrote:
        | How does this work in terms of utilization? The isolation
        | presumably means buying GPU capacity and only using a fraction
        | of it?
        
         | asabla wrote:
          | Basically you get N tokens per minute (or per second; I can
          | check tomorrow if you're really interested) per deployment. So
          | if you outgrow one deployment, you just deploy another one
          | (with the associated costs, of course).
          | 
          | One deployment = a deployed model which you can query.
          | 
          | On top of that, depending on the model you're using, you also
          | pay a usage cost billed per 1,000 tokens.
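          | 
          | To make the "deployment" idea concrete, here's a minimal
          | sketch of querying one Azure OpenAI chat deployment over its
          | REST endpoint (the resource name, deployment name and
          | api-version below are placeholders, not anything from this
          | repo):
          | 
          |     import requests
          | 
          |     # Placeholders: use your own resource/deployment/key.
          |     RESOURCE = "my-company-openai"
          |     DEPLOYMENT = "gpt-35-turbo"
          |     API_KEY = "..."
          | 
          |     base = f"https://{RESOURCE}.openai.azure.com"
          |     url = (base + f"/openai/deployments/{DEPLOYMENT}"
          |            "/chat/completions?api-version=2023-05-15")
          |     resp = requests.post(
          |         url,
          |         headers={"api-key": API_KEY},
          |         json={"messages": [
          |             {"role": "user", "content": "Hello"}]},
          |     )
          |     data = resp.json()
          |     print(data["choices"][0]["message"]["content"])
          |     # "usage" is what per-1K-token billing is based on.
          |     print(data["usage"])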
        
       | refulgentis wrote:
        | Crappy clone of the ChatGPT frontend: half missing, half a
        | direct copy. It implies overly broad claims of insecurity and
        | lack of privacy that are only narrowly true, i.e. for _Chat_GPT.
       | 
       | Really surprised to see this aggressive of language 1) written
       | down 2) on Github. I'd be pretty pissed if I was OpenAI,
       | regardless of the $10B.
        
         | sebzim4500 wrote:
         | I think OpenAI is entirely on board with the idea that OpenAI
         | sells to consumers and Azure/Microsoft sells the same product
         | to enterprise.
         | 
         | That's how it's been working for months, and if OpenAI objected
         | they would have done something about it.
        
         | jeremyjh wrote:
         | I have no doubt OpenAI is on board. This is just bringing more
         | paid users to their platform because it still uses their API.
        
       | justinlloyd wrote:
        | Interesting release, though still lacking a few features I've had
        | to resort to building myself, such as code summary, code base
       | architecture summary, and conversation history summary. ChatGPT
       | (the web UI) now has the ability to execute code, and make
       | function callbacks, but I prefer running that code locally,
       | especially if I am debugging. This latter part, conversation
       | history summary, is something that ChatGPT web UI does reasonably
       | well, giving it a long history, but a sentiment extraction and
       | salient detail extraction before summarizing is immensely useful
       | for remembering details in the distant past. I've been building
       | on top of the GPT4 model and tinkering with multi-model (gpt4 +
       | davinci) usage too, though I am finding with the MoE that Davinci
       | isn't as important. Fine tuning has been helpful for specific
       | code bases too.
       | 
       | If I had the time I'd like to play with an MoE of Llama2, as a
       | compare and contrast, but that ain't gonna happen anytime soon.
        
       | atlgator wrote:
       | Is this a full, standalone deployment including GPT-3 (or
       | whatever version) or just a secured frontend that sends data to
       | GPT hosted outside the enterprise zone?
       | 
       | Edit: Uses Azure OpenAI as the backend
        
       | leerob wrote:
       | This is awesome to see, feels heavily inspired (in a good way) by
       | the version we made at Vercel[1]. Same tech stack: Next.js,
       | NextAuth, Tailwind, Shadcn UI, Vercel AI SDK, etc.
       | 
       | I'd expect this trend of managed ChatGPT clones to continue. You
       | can own the stack end to end, and even swap out OpenAI for a
       | different LLM (or your own model trained on internal company
       | data) fairly easily.
       | 
       | [1]: https://vercel.com/templates/next.js/nextjs-ai-chatbot
        
       | Xenoamorphous wrote:
       | Darn I just spent a week or so working on a ChatGPT clone that
       | used Azure ChatGPT API due to the privacy aspect. Wasted effort I
       | guess.
        
         | saliagato wrote:
         | This is exactly the same
        
           | ddmma wrote:
           | Welcome to the club :)
        
       | EGreg wrote:
       | We just have to trust them and take their word for it? Or what?
       | 
       | https://azure.microsoft.com/en-us/explore/trusted-cloud/priv...
       | 
       | https://azure.microsoft.com/en-us/blog/3-reasons-why-azure-s...
       | 
       | I guess I would trust them, since they're big and they make these
       | promises and other big companies use them.
        
       | xeckr wrote:
       | No better than the API.
        
       | PaulWaldman wrote:
        | The only users who would likely care about this derive far more
        | value than the $20/month of OpenAI's direct offering, so why
        | doesn't OpenAI market this service, but with chat history, for
        | something like $200/month?
        
         | unnouinceput wrote:
          | OpenAI IS Microsoft. Don't get tangled in the web of creating
          | different entities when they are all part of the same pyramid.
          | Also, GitHub IS Microsoft too!!
        
           | nixgeek wrote:
           | GitHub was acquired by Microsoft, and they are no longer
           | legally separate entities.
           | 
           | Microsoft is an investor in OpenAI, but does not own it, and
           | they are legally separate companies. OpenAI is _not_
           | Microsoft and it is factually incorrect to claim that OpenAI
           | _is_ Microsoft.
           | 
           | [1] https://blogs.microsoft.com/blog/2023/01/23/microsoftando
           | pen...
        
             | Scoundreller wrote:
             | But saying they're just an investor isn't quite doing the
             | arrangement the justice it deserves. There seems to be a
             | lot of strings attached to that investment.
             | 
             | It's not just a straight trade of dollars for shares, but
             | many further contractual obligations.
        
               | nixgeek wrote:
               | I understand that perception but "seems to be a lot of
               | strings" is all that is publicly known. None of those
               | further obligations seem to have been disclosed. Without
               | that disclosure it's a bit of a conspiracy theory?
               | 
               | Thus, it could very well be OpenAI has taken dollars, is
               | commercially selling its technology to Microsoft on terms
               | which aren't special, and sama and the OpenAI executive
                | team and board have _independently_ concluded that
               | engaging in the partnership is a stellar way to grow
               | their OpenAI brand, business and valuation?
        
         | semitones wrote:
         | That's a laughable price for an enterprise subscription.
         | 
          | And the reason is that it's not enough for OpenAI to "say"
          | that they're "not going to use your data" - you need a cloud
          | deployment where you can control network boundaries to _prove_
          | that your data isn't going anywhere it isn't supposed to.
        
           | agildehaus wrote:
           | Unless you're physically controlling the network boundaries,
           | how are you proving that on any cloud service?
        
       | aantix wrote:
       | I don't understand - chat with a file?
       | 
       | I want to chat and ask about an entire body of knowledge - wiki
       | pages, git commit diffs/messages, jira tasks.
        
       | croes wrote:
        | Yeah sure, I totally trust you after the Storm-0558 disaster
        
       | robbomacrae wrote:
       | This is potentially a huge deal. Companies are concerned using
       | ChatGPT might violate data privacy policies if someone puts in
       | user data or invalidate trade secrets protections if someone
       | uploads sections of code. I suspect many companies have been
       | waiting for an enterprise version.
        
         | tbrownaw wrote:
         | This is a web UI that talks to a (separate) Azure OpenAI
         | resource that you can deploy into your subscription as a SaaS
         | instance.
        
           | hackernewds wrote:
           | So how is it any different
        
         | [deleted]
        
         | judge2020 wrote:
         | I imagine most companies serious about this created their own
         | wrappers around the API or contracted it out, likely using
         | private Azure GPUs.
        
           | Normal_gaussian wrote:
           | Most companies are either not tech companies, or do not have
           | the knowledge to manage such a project within reasonable cost
           | bounds.
        
             | jmathai wrote:
             | Most companies are trying to figure out exactly what
             | generative AI is and how to use it in their business. Given
             | how new this is - I doubt any large company has done much
             | besides ban the public ChatGPT. So this is probably very
             | relevant for them.
        
       | bouke wrote:
       | How is this different from the other OpenAI GUI? Why another one
       | by Microsoft? https://github.com/microsoft/sample-app-aoai-
       | chatGPT.
        
         | wodenokoto wrote:
         | There's at least two more. There's also
         | https://github.com/Azure-Samples/azure-search-openai-demo
         | 
         | And you can deploy a chat bot from within the Azure playground
         | which runs on another codebase.
        
         | pamelafox wrote:
         | This is an internal ChatGPT, whereas that sample is ChatGPT
         | constrained to internal search results (using RAG approach).
         | Source: I help maintain the RAG samples.
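          | 
          | For anyone unfamiliar with the distinction, the RAG pattern is
          | roughly: retrieve relevant internal documents first, then
          | ground the model's answer in them. A generic sketch of the
          | idea (not the sample's actual code; search() and chat() are
          | placeholders for e.g. Azure Cognitive Search and Azure OpenAI
          | calls):
          | 
          |     def search(query: str, top: int = 3) -> list[str]:
          |         ...  # return top matching internal documents
          | 
          |     def chat(messages: list[dict]) -> str:
          |         ...  # call a chat completion API
          | 
          |     def answer(question: str) -> str:
          |         docs = search(question)
          |         context = "\n\n".join(docs)
          |         messages = [
          |             {"role": "system",
          |              "content": "Answer ONLY from these sources:\n"
          |                         + context},
          |             {"role": "user", "content": question},
          |         ]
          |         return chat(messages)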
        
         | FrenchDevRemote wrote:
         | i'm pretty sure it's a part of it
        
         | colonwqbang wrote:
         | Bigger companies are cautious about using GPT-style products
         | due to data security concerns. But most big companies trust
         | Microsoft more or less blindly.
         | 
         | Now that Microsoft has an official "enterprise" version out,
         | the floodgates are open. They stand to make a killing.
        
         | pjmlp wrote:
         | I bet there are plenty of OKR/KPIs now tied to AI at Microsoft.
        
       | PoignardAzur wrote:
       | > _However, ChatGPT risks exposing confidential intellectual
       | property. One option is to block corporate access to ChatGPT, but
       | people always find workarounds_
       | 
       | Pretty bold thing to say to your potential clients. "You can
       | always tell your employees not to use our product, but they won't
       | listen to you."
        
       | coldblues wrote:
       | Pretty sure Azure has a moderation endpoint enabled by default
       | that makes using the OpenAI API an awful experience.
        
       | Ecstatify wrote:
       | Our company is pushing everyone to use a similar offering. Most
        | of the company is doing low-value work ... still using Excel
        | sheets even though we have a custom ERP. Now we're seeing people
        | who couldn't write a coherent email before write three-page
        | emails. The illusion
       | of being productive by doing more work even though it has zero
       | impact on the bottom line. It's insane how inefficient
       | organisations are. No doubt we'll have some KPI soon about using
       | the tool.
        
         | simmerup wrote:
         | If anything it's less productive because people have to parse
         | all that nonsense.
         | 
         | I was gobsmacked to hear a friend say that their work guidance
         | is to use ChatGPT to write letters to external clients for
         | example. I know for sure I'd be insulted if someone sent me
         | paragraphs of text to read created from a sentence long prompt.
         | I'd rather have the prompt, my time is valuable as well.
        
           | mritchie712 wrote:
           | ahhhh, but they're pasting the 3 page email into ChatGPT
           | ("summarize this"). The future is here.
        
             | ilyt wrote:
             | Wouldn't be surprised if that was next Outlook feature.
             | 
             | Cue someone making some horrible error because some crucial
             | information didn't survive ChatGPT->ChatGPT round-trip
        
               | ddmma wrote:
                | Actually this was in an Azure hackathon some time ago:
               | https://devpost.com/software/amabot
        
               | mritchie712 wrote:
               | it's already here...
               | https://blogs.microsoft.com/blog/2023/03/16/introducing-
               | micr...
        
             | kossTKR wrote:
             | Yeah that's one of the insane things that will happen.
             | 
             | Very soon everyone will in effect "hide" behind an agent
             | that will take all kinds of decisions on one's behalf.
             | Everything from writing e-mails to proposals but also to
             | sue someone, make financial decisions, and be a filter that
             | transforms everything going in or out.
             | 
             | I can't imagine this world really. How the hell are people
             | going to compete or stand out? Doesn't it seem that what
              | little meritocracy existed will soon drown in noise?
        
               | simmerup wrote:
               | I was scared about organizations doing this and losing
               | their connection to the humans they serve.
               | 
               | The realization that individuals will also have this
               | barrier to the world is even scarier.
               | 
               | If it goes that way we could be looking at a change to
               | society on the level of social media, again. Mad.
        
           | voiper1 wrote:
           | I write emails and put it into chatgpt and ask it to make it
           | more concise or point out issues. No utility in asking
           | chatgpt to needlessly expand the text...
        
           | kenjackson wrote:
            | I think the more common case is to have a handful of bullet
            | points and some notes and ask ChatGPT to put them into a
            | coherent letter for an external customer with the goal of
            | XYZ. I've done similar things and it is a huge timesaver. I
            | still have to edit it, but it gives me a start that's
            | probably on par with what a junior engineer would write as a
            | first draft.
        
           | klabb3 wrote:
           | Exactly right. If you increase entropy you need energy to
            | reduce it back. It'd be _more_ valuable to take the crap that
            | humans have put together incoherently and summarize it.
           | (Perhaps someone should put a GPT on the other end in order
           | to read it)
           | 
           | I honestly don't know why we're so obsessed with having LLMs
           | generate crap. Especially when they're very capable of
           | reducing, simplifying. Imagine penetrating legal texts,
           | political bills, obtuse technical writing, academic papers
           | and making sense of those quickly. Much more useful imo.
        
           | throw__away7391 wrote:
           | You'll just have people reversing it into a summary on the
           | other end, kind of like a "text" chat where both sides are
           | using text-to-speech and speech-to-text instead of having a
           | phone call.
        
           | skepticATX wrote:
            | The number of otherwise very smart people who completely lose
           | the ability to think critically when it comes to "AI" is
           | really interesting to me.
           | 
           | I'm not anti-AI; I've recommended that we use it at work a
           | few times _where it made sense and was backed by evidence
            | /benchmarks_. But for essentially any problem that comes up
           | someone will try to solve it with ChatGPT, even if it
           | demonstrably can't do the job. And these are not business
           | folks, these are engineering leaders who absolutely have the
           | capability to understand this technology.
        
         | mritchie712 wrote:
         | What ERP are you using?
         | 
         | We've found some early success selling to companies with older
         | "long-tail" ERP's. I've been finding a new one every day.
        
           | Ecstatify wrote:
           | It's a proprietary ERP completely custom. Think it was
           | deployed through an acquisition. The problem isn't the ERP
           | it's the business. "We want custom processes" but hire the
           | cheapest developers possible to maintain the ERP and then
           | complain about bugs. "We're agile(tm)" ... but have the same
           | inefficient processes for the last 3 years. Cargo cult org,
            | the CEO was talking about Black Swans during COVID ... even
           | though Nassim Taleb explicitly said COVID wasn't a black swan
           | event.
        
         | [deleted]
        
         | amluto wrote:
         | I've learned that the most important writing skill is to figure
         | out what you're trying to say -- this is a rather important
         | prerequisite to writing well.
         | 
         | Naively asking a chatbot to write for you does not help with
         | this at all.
         | 
         | It would be interesting to try to prompt ChatGPT to ask
         | questions to try to figure out what the user is trying to write
         | and then to write it.
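          | 
          | One cheap way to try that is to pin the behaviour in the
          | system prompt. A rough sketch with the 0.27.x-era openai
          | Python client (the prompt wording is just an example):
          | 
          |     import openai
          | 
          |     openai.api_key = "..."
          | 
          |     SYSTEM = (
          |         "Before writing anything, interview me: ask one "
          |         "question at a time until you know the audience, "
          |         "the goal, and the key points. Only then draft it."
          |     )
          | 
          |     resp = openai.ChatCompletion.create(
          |         model="gpt-3.5-turbo",
          |         messages=[
          |             {"role": "system", "content": SYSTEM},
          |             {"role": "user",
          |              "content": "I need to email my team."},
          |         ],
          |     )
          |     print(resp["choices"][0]["message"]["content"])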
        
       | paxys wrote:
       | Would it be too much to mention somewhere in the README what this
       | repo actually contains? Just docs? Deployment files? Some
       | application (which does..something)? The model itself?
        
         | Xenoamorphous wrote:
         | The repo contains the UI code, not the model or anything else
         | around ChatGPT, it just uses Azure's ChatGPT API which doesn't
         | share data with OpenAI.
        
           | paxys wrote:
           | So basically - what you really need to do to run Azure
           | ChatGPT is go and click some buttons in the Azure portal.
           | This repo is a sample UI that you could possibly use to talk
           | to that instance, but really you will probably always build
           | your own or embed it directly into your products.
           | 
           | So calling the repo "azurechatgpt" is misleading. It should
           | really be "sample-chatgpt-api-frontend" or something of that
           | sort.
        
             | saliagato wrote:
             | Yes exactly
        
             | laurels-marts wrote:
              | Correct. It offers front-end scaffolding for your
              | enterprise ChatGPT app: Next/NextAuth/Tailwind etc.,
              | deployed on Azure App Service and hooked into Azure
              | Cosmos DB and Azure OpenAI (the actual model).
        
           | [deleted]
        
       | padolsey wrote:
       | I'm confused. If this is just a front-end for the OpenAI API then
       | how does it remove the data privacy concern? Your data still ends
       | up with Azure/OpenAI, right? It doesn't stay localized to your
       | instance; it's not your GPU running the transformations. You have
       | no way of knowing whether your data is being used to train
       | models. If customer data is sensitive, I'm pretty sure running a
       | 70B llama (or similar) on a bunch of A100s is the only way?
        
         | dbish wrote:
          | Azure is hosting and operating the service themselves rather
          | than OpenAI doing it, with all the security requirements that
          | come with that. I assume this comes with different data and
          | access restrictions as well, and the ability to run in secured
          | instances (with nothing sent to OpenAI the company).
          | 
          | Most companies already use the cloud for their data,
          | processing, etc. and aren't running anything major locally,
          | let alone ML models; this is putting trust in the cloud they
          | already use.
        
           | nmstoker wrote:
           | Yes, this was my understanding.
        
           | padolsey wrote:
           | Ah that's fair. But it is my impression that the bulk of
           | privacy/confidentiality concerns (e.g. law/health/..) would
           | require "end to end" data safety. Not sure if I'm making
           | sense. I guess microsoft is somehow more trustworthy than
           | openai themselves...
           | 
           | EDIT: what you say about existing cloud customers being able
           | to extend their trust to this new thing makes sense, thanks.
        
             | PoignardAzur wrote:
              | Right. If I were a European company worried about, say,
             | industrial espionage, this wouldn't be nearly enough to
             | reassure me.
        
       | jrm4 wrote:
       | "Private and secure"
       | 
       | From _Microsoft_?
       | 
       | Ha.
        
       | gdiamos wrote:
       | we wrote a blog post about why companies do this here:
       | https://www.lamini.ai/blog/specialize-llms-to-private-data-d...
       | 
       | Here are a few:
       | 
       | Data privacy
       | 
       | Ownership of IP
       | 
       | Control over ops
       | 
       | The table in the blog lists the top 10 reasons why companies do
       | this based on about 50 customer interviews.
        
       | H8crilA wrote:
       | What's the practical difference between this and OpenAI API?
       | 
       | All I can see is the same product but offered by a larger
       | organization. I.e. they're more likely to get the security
       | details right, and you can potentially win more in a lawsuit
       | should things go bad.
        
         | ebiester wrote:
         | Compliance and customer trust. Azure can sign a BAA, for
          | example. If you are building LLM capability on top of your
         | SaaS, your customers want assurances about their data.
        
         | jeffschofield wrote:
          | A few months ago my team moved to Azure for capacity reasons.
          | We were constantly dealing with 429 errors and couldn't get in
          | touch with OpenAI, while Azure offered more instances.
          | 
          | Eventually we got more from OpenAI, so we load balance across
          | both. The only difference is that the 3.5-turbo model on Azure
          | is outdated.
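          | 
          | The load balancing doesn't need to be fancy; the basic pattern
          | is "pick one, back off and retry on 429". A rough sketch (the
          | backend callables and error type are placeholders, not real
          | client code):
          | 
          |     import random, time
          | 
          |     class RateLimitError(Exception):
          |         """Stand-in for the provider's 429 error."""
          | 
          |     def complete(messages, backends, retries=3):
          |         # backends: callables that each hit one
          |         # provider (one for Azure OpenAI, one for
          |         # OpenAI) and raise RateLimitError on a 429.
          |         for attempt in range(retries):
          |             backend = random.choice(backends)
          |             try:
          |                 return backend(messages)
          |             except RateLimitError:
          |                 time.sleep(2 ** attempt)  # back off
          |         raise RuntimeError("all backends throttled")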
        
       | ajhai wrote:
       | A lot of companies are already using projects like chatbot-ui
       | with Azure's OpenAI for similar local deployments. Given this is
       | as close to local ChatGPT as any other project can get, this is a
       | huge deal for all those enterprises looking to maintain control
       | over their data.
       | 
       | Shameless plug: Given the sensitivity of the data involved, we
       | believe most companies prefer locally installed solutions to
       | cloud based ones at least in the initial days. To this end, we
       | just open sourced LLMStack
       | (https://github.com/TryPromptly/LLMStack) that we have been
        | working on for a few months now. LLMStack is a platform to build
        | LLM apps and chatbots by chaining multiple LLMs and connecting
        | them to the user's data. A quick demo at
       | https://www.youtube.com/watch?v=-JeSavSy7GI. Still early days for
       | the project and there are still a few kinks to iron out but we
       | are very excited for it.
        
         | toomuchtodo wrote:
         | Can you plug this together with tools like api2ai to create
         | natural language defined workflow automations that interact
         | with external APIs?
        
           | cosbgn wrote:
           | You can use unfetch.com to make API calls via LLMs and build
           | automations. (I'm building it)
        
           | ajhai wrote:
           | There is a generic HTTP API processor that can be used to
           | call APIs as part of the app flow which should help invoke
           | tools. Currently working on improving documentation so it is
           | easy to get started with the project. We also have some
           | features planned around function calling that should make it
           | easy to natively integrate tools into the app flows.
        
         | bhanu423 wrote:
         | Interesting project - was trying it out, found an issue in
         | building the image - have opened an issue on github - please
         | take a look. Also do you have plan to support llama over openai
         | models.
        
           | ajhai wrote:
           | Thanks for the issue. Will take a look. In the meantime, you
           | can try the registry image with `cp .env.prod .env && docker
           | compose up`
           | 
           | > Also do you have plan to support llama over openai models.
           | 
           | Yes, we plan to support llama etc. We currently have support
           | for models from OpenAI, Azure, Google's Vertex AI, Stability
           | and a few others.
        
         | gdiamos wrote:
         | I find it interesting to see how competitive this space got so
         | quickly.
         | 
         | How do these stacks differentiate?
        
           | scrum-treats wrote:
           | Quality and depth of particular types of training data is one
           | difference. Another difference is inference tracking
           | mechanisms within and between single-turn interactions (e.g.,
           | what does the human user "mean" with their prompt, what is
           | the "correct" response, and how best can I return the
           | "correct" response for this context; how much information do
           | I cache from the previous turns, and how much if any of it is
           | relevant to this current turn interaction).
        
       | extr wrote:
       | One thing I still don't understand is what _is_ the ChatGPT front
       | end exactly? I've used other "conversational" implementations
       | built with the API and they never work quite as well, it's
       | obvious that you run out of context after a few conversation
       | turns. Is ChatGPT doing some embedding lookup inside the
       | conversation thread to make the context feel infinite? I've
       | noticed anecdotally it definitely isn't infinite, but it's pretty
       | good at remembering details from much earlier. Are they using
       | other 1st party tricks to help it as well?
        
         | SOLAR_FIELDS wrote:
         | They definitely do some proprietary running summarization to
          | rebuild the context with each chat. Probably a RAG-like
          | approach that has had a lot of attention and work put into it.
        
           | extr wrote:
           | This is effectively my question. I assume there is some magic
           | going on. But how many engineering hours worth of magic,
           | approximately? There is a lot of speculation around GPT-4
           | being MoE and whatnot. But very little speculation about the
           | magic of the ChatGPT front end specifically that makes it
           | feel so fluid.
        
             | BoorishBears wrote:
             | That's mostly because there's very little value in deep
             | speculation there.
             | 
             | It's not particularly more fluid than anything you couldn't
             | whip up yourself (and the repo linked proves that) but
             | there's also not much value in trying to compete with
             | ChatGPT's frontend.
             | 
             | For most products ChatGPT's frontend is the minimal level
              | of acceptable performance that you need to beat, not a
              | maximal one really worth exploring.
        
               | extr wrote:
               | What front end is better than ChatGPT? Is the OP
               | implementation doing running summarization or in-convo
               | embedding lookup?
        
         | simonbutt wrote:
         | Logic for chatgpt's "infinite context" summarisation is in
         | https://github.com/microsoft/azurechatgpt/blob/main/src/feat...
        
           | furyofantares wrote:
           | That doesn't really look right to me, it looks like that's
           | for responding regarding uploaded documents. Also I don't
           | think I'd expect this repo to have anything to do with the
           | actual ChatGPT front-end. I highly doubt the official ChatGPT
           | front-end uses langchain, for example.
        
           | qwertox wrote:
           | I don't see anything related to an infinite context in there.
           | There's only a reference to a server-side `summary` variable
           | which suggests that there is a summary of previous posts
           | which will get sent along with the question for context, as
           | is to be expected. Nothing suggests an infinite context.
        
         | MaxLeiter wrote:
          | It uses a sliding context window. Older tokens are dropped as
          | new ones stream in.
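          | 
          | A naive version of that trimming is easy to sketch. This
          | assumes a token_count() helper (e.g. tiktoken) is passed in,
          | keeps the system prompt, and drops the oldest turns first:
          | 
          |     def trim(messages, budget, token_count):
          |         # messages[0] is the system prompt; keep it,
          |         # then drop the oldest turns until we fit.
          |         system, rest = messages[0], list(messages[1:])
          |         used = token_count(system["content"])
          |         kept = []
          |         for msg in reversed(rest):  # newest first
          |             cost = token_count(msg["content"])
          |             if used + cost > budget:
          |                 break
          |             kept.append(msg)
          |             used += cost
          |         return [system] + list(reversed(kept))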
        
           | extr wrote:
           | I don't believe that's the whole story. Other conversational
           | implementations use sliding context windows and it's very
           | noticable as context drops off. Whereas ChatGPT seems to
           | retain the "gist" of the conversation much longer.
        
             | lsaferite wrote:
             | I mean, I explicitly have the LLM summarize content that's
             | about to fall out of the window as a form of pre-emptive
             | token compression. I'd expect maybe they do something
             | similar.
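              | 
              | Something along these lines; summarize() stands in
              | for another LLM call and the sizes are made up:
              | 
              |     def compress(history, summary, max_msgs,
              |                  summarize):
              |         # Before old turns fall out of the
              |         # window, fold them into a running
              |         # summary instead of just dropping them.
              |         if len(history) <= max_msgs:
              |             return history, summary
              |         old = history[:-max_msgs]
              |         recent = history[-max_msgs:]
              |         summary = summarize(summary, old)
              |         return recent, summary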
        
               | kuchenbecker wrote:
               | I feel like we're describing short vs long term memory.
        
         | shubb wrote:
          | This is one of the things that makes me uncomfortable about
          | proprietary LLMs.
          | 
          | They get task performance by doing a lot more than just feeding
          | a prompt straight to an LLM, and then we compare their
          | performance to raw local options.
         | 
         | The problem is, as this secret sauce changes, your use case
         | performance is also going to vary in ways that are impossible
         | for you to fix. What if it can do math this month and next
         | month the hidden component that recognizes math problems and
         | feeds them to a real calculator is removed? Now your use case
         | is broken.
         | 
         | Feels like building on sand.
        
           | BoorishBears wrote:
           | I'm not sure you realize how proprietary LLMs are being built
           | on.
           | 
           | No one is doing secret math in the backend people are
           | building on. The OpenAI API allows you to call functions now,
           | but even that is just a formalized way of passing tokens into
           | the "raw LLM".
           | 
           | All the features in the comment you replied to only apply to
            | the _web interface_, and here you're being given an open
           | interface you can introspect.
        
             | edgyquant wrote:
             | It was a contrived example to make a point, one that seems
             | to have flown over your head.
        
               | BoorishBears wrote:
               | No it was a bad (straight up wrong) example because you
               | don't understand how people are building applications on
               | proprietary LLMs.
               | 
               | If you did you'd also know what evals are.
        
       | albert_e wrote:
        | Is there a way to run this on AWS instead?
        | 
        | We were looking to explore Llama 2 for internal use.
        
         | villgax wrote:
         | Have your engineers set this up internally
         | https://huggingface.co/spaces/huggingface-projects/llama-2-7...
        
           | speedgoose wrote:
            | You can't really replace GPT-4 with Llama 2 7B.
        
         | froggychairs wrote:
          | OpenAI models are exclusive to Azure. Llama 2 should have an
          | AWS option, I believe?
        
         | axpy906 wrote:
         | Use SageMaker: https://www.philschmid.de/sagemaker-llama-llm
        
         | gdiamos wrote:
          | We can run Llama 2 on an AWS VM if you have enough GPUs:
         | https://lamini.ai/
         | 
         | Install in 10 minutes.
         | 
         | Make sure you have enough GPU memory to fit your llama model if
         | you want good perf
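          | 
          | A rough rule of thumb for sizing (weights only; the KV cache
          | and activations need headroom on top, and quantization
          | changes the picture):
          | 
          |     # Back-of-the-envelope VRAM for model weights alone.
          |     def weight_gb(params_billion, bytes_per_param=2):
          |         # default 2 bytes/param = fp16
          |         return params_billion * bytes_per_param
          | 
          |     print(weight_gb(7))        # ~14 GB, Llama 2 7B fp16
          |     print(weight_gb(70))       # ~140 GB, Llama 2 70B fp16
          |     print(weight_gb(70, 0.5))  # ~35 GB at 4-bit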
        
         | braydenm wrote:
         | Amazon Bedrock makes Claude 2 available, as well as some other
         | models.
        
         | klysm wrote:
          | Msft spent a lot of money to ensure that was not an option
          | with ChatGPT.
        
       | gdiamos wrote:
       | Can you fine tune it?
        
         | jensen2k wrote:
         | Yes! You can.
        
           | gdiamos wrote:
           | Is it the same api as the public OpenAI
        
           | saliagato wrote:
           | How?
        
       | Y_Y wrote:
       | So the public access one isn't private and secure?
        
         | jrflowers wrote:
         | No
         | 
         | Edit: yes
        
           | stavros wrote:
           | I just love this comment.
        
         | jensen2k wrote:
          | Another thing is that using ChatGPT might put European
          | companies in violation of the GDPR - Azure OpenAI Services are
          | available on European servers.
        
         | froggychairs wrote:
         | I believe it's implying the free ChatGPT collects data and this
         | one doesn't.
        
         | nwoli wrote:
         | I thought sama said they don't use data going through the api
         | for training. Guess we can't trust that statement
        
           | jumploops wrote:
           | That is correct, they do not use the data going through the
           | API for training, but they do use the data from the web and
           | mobile interfaces (unless you explicitly turn it off).
        
             | quickthrower2 wrote:
             | "We don't water down your beer".
             | 
             | Oh nice!
             | 
             | "But that is lager"
        
         | zardo wrote:
         | Unless you have an NDA with Open AI, you are giving them
         | whatever you put in that prompt.
        
           | ElFitz wrote:
           | Also, at some point some users ended up with other users'
           | chat history [0]. So they've proven to be a bit weak on that
           | side.
           | 
           | [0]: https://www.theverge.com/2023/3/21/23649806/chatgpt-
           | chat-his...
        
         | candiddevmike wrote:
         | > However, ChatGPT risks exposing confidential intellectual
         | property.
         | 
         | I don't remember seeing this disclaimer on the ChatGPT website,
         | gee maybe OpenAI should add this so folks stop using it.
        
           | sebzim4500 wrote:
           | It's pretty clear in the FAQ to be fair.
        
           | cmarschner wrote:
           | If you use ChatGPT through the app or website they can use
           | the data for training, unless you turn it off.
           | https://help.openai.com/en/articles/5722486-how-your-data-
           | is...
        
         | theusus wrote:
         | [dead]
        
         | theptip wrote:
         | The concern is that ChatGPT is training on your chats (by
         | default, you can opt out but you lose chat history last I
         | checked).
         | 
         | So in general enterprises cannot allow internal users to paste
         | private code into ChatGPT, for example.
        
           | Buttons840 wrote:
            | As an example of this: I found that GPT-4 wouldn't agree
            | with me that C(A) = C(AA^T) until I explained the proof. A
            | few weeks later it would agree in new chats and would
            | explain it using the same proof I did, presented the same
            | way.
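            | 
            | (For reference, the standard argument in the same notation:
            | C(AA^T) is contained in C(A) since AA^T x = A(A^T x). For
            | the other direction, if AA^T y = 0 then y^T AA^T y =
            | ||A^T y||^2 = 0, so A^T y = 0; thus the null space of AA^T
            | is contained in that of A^T, and taking orthogonal
            | complements gives C(A) contained in C(AA^T).)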
        
             | samrolken wrote:
             | I've found that the behavior of ChatGPT can vary widely
             | from session to session. The recent information about GPT4
             | being a "mixture of experts" might also be relevant.
             | 
             | Do we know that it wouldn't have varied in its answer by
             | just as much, if you had tried in a new session at the same
             | time?
        
               | quickthrower2 wrote:
               | There is randomness even at t=0, there was another HN
               | submission about that
        
             | simmerup wrote:
             | Kind of implies that OpenAI are lying and using customer
             | input to train their models
        
             | behnamoh wrote:
             | This is kinda creepy. But at the same time, _how_ do they
             | do that? I thought the training of these models stopped in
              | September 2021/2022. So how do they do these incremental
             | trainings?
        
               | infinityio wrote:
               | The exact phrase they previously used on the homepage was
               | "Limited knowledge of world and events after 2021" - so
               | maybe as a finetune?
        
               | behnamoh wrote:
               | but doesn't finetuning result in forgetting previous
               | knowledge? it seems that finetuning is most usable to
               | train "structures" not new knowledge. am i missing
               | something?
        
       | mark_l_watson wrote:
       | This seems like such an obvious thing to do.
       | 
       | I see the use of general purpose LLMs like ChatGPT, but smaller
       | fine tuned models will probably end up being more useful for
       | deployed applications in most companies. Off topic, but I was
       | experimenting with LLongMA-2-7b-16K today, running it very
       | inexpensively in the cloud, and given about 12K of context text
       | it really performed well. This is an easy model to deploy. 7B
       | parameter models can be useful.
        
         | stavros wrote:
         | Is there an easy way to play with these models, as someone who
         | hasn't deployed them? I can download/compile llama.cpp, but I
         | don't know which models to get/where to put them/how to run
         | them, so if someone knows about some automated downloader along
         | with some list of "best models", that would be very helpful.
        
       | TuringNYC wrote:
       | Curious if anyone has done a side-by-side analysis of this
       | offering vs just running LLaMA?
       | 
       | I'm currently running a side-by-side comparison/evaluation of
       | MSFT GPT via Cognitive Services vs LLaMA[7B/13B/70B] and
       | intrigued by the possibility of a truly air-gapped offering not
        | limited by external compute capacity (nor by metered fees
        | racking up).
       | 
       | Any reads on comparisons would be nice to see.
       | 
       | (yes, I realize we'll _eventually_ run into the same scaling
        | issues w/r/t GPUs)
        
         | tikkun wrote:
         | I did one. I took a few dozen prompts from my ChatGPT history
         | and ran them through a few LLMs.
         | 
         | GPT-4, Bard and Claude 2 came out on top.
         | 
         | Llama 2 70b chat scored similarly to GPT-3.5, though GPT-3.5
         | still seemed to perform a bit better overall.
         | 
         | My personal takeaway is I'm going to continue using GPT-4 for
         | everything where the cost and response time are workable.
         | 
         | Related: A belief I have is that LLM benchmarks are all too
         | research oriented. That made sense when LLMs were in the lab.
         | It doesn't make sense now that LLMs have tens of millions of
         | DAUs -- i.e. ChatGPT. The biggest use cases for LLMs so far are
         | chat assistants and programming assistants. We need benchmarks
          | that are based on the way people use LLMs in chatbots and the
          | types of questions real users actually ask LLM products, not
          | hypothetical benchmarks and random academic tests.
        
           | Q6T46nT668w6i3m wrote:
           | I don't know what you mean by "too research oriented." A
           | common complaint in LLM research is the poor quality of
           | evaluation metrics. There's no consensus. Everyone wants new
           | benchmarks but designing useful metrics is very much an open
           | problem.
        
             | p1esk wrote:
             | I think he wants to limit evaluations to the most frequent
             | question types seen in the real world.
        
           | register wrote:
           | How did you measure the performance?
        
           | TillE wrote:
           | I think tests like "can this LLM pass an English literature
           | exam it's never seen before" are probably useful, but yeah
           | there's a lot of silly stuff like math tests.
           | 
           | I suppose the question is where are they most commercially
           | viable. I've found them fantastic for creative brainstorming,
           | but that's sort of hard to test and maybe not a huge market.
        
             | TuringNYC wrote:
             | >> I suppose the question is where are they most
             | commercially viable.
             | 
             | Fair point, though I'm not aiming to start a competing LLM
             | SaaS service, rather i'm evaluating swapping out the TCO of
             | Azure Cognitive Service OpenAI for the TCO of dedicated
             | cloud compute running my own LLM -- _to serve my own LLM
             | calls currently being sent to a metered service (Azure
             | Cognitive Service OpenAI)_
             | 
             | Evaluation points would be: output quality; meter vs fixed
             | breakeven points; latency; cost of human labor to
             | maintain/upgrade
             | 
             | in most cases, i'd outsource and not think about it. _BUT_
              | we're currently in some strange economics where the costs
             | are off the charts for some services
        
         | robertnishihara wrote:
         | We (at Anyscale) have benchmarked GPT-4 versus the Llama-2
         | suite of models on a few problems: functional representation,
         | SQL generation, grade-school math question answering.
         | 
         | GPT-4 wins by a lot out of the box. However, surprisingly,
         | fine-tuning makes a huge difference and allows the 7B Llama-2
         | model to outperform GPT-4 on some (but not all) problems.
         | 
         | This is really great news for open models as many applications
         | will benefit from smaller, faster, and cheaper fine-tuned
         | models rather than a single large, slow, general-purpose model
         | (Llama-2-7B is something like 2% of the size of GPT-4).
         | 
         | GPT-4 continues to outperform even the fine-tuned 70B model on
         | grade-school math question answering, likely due to the data
         | Llama-2 was trained on (more data for fine-tuning helps here).
         | 
         | https://www.anyscale.com/blog/fine-tuning-llama-2-a-comprehe...
        
         | FrenchDevRemote wrote:
         | chatgpt is obviously a LOT better, llama doesn't even
         | understand some prompts
         | 
         | and since LLMs aren't even that good to begin with, it's
         | obvious you want the SOTA to do anything useful unless maybe
         | you're finetuning
        
           | londons_explore wrote:
           | openai offers finetuning too. And it's pretty cheap to do
           | considering.
        
           | baobabKoodaa wrote:
           | > and since LLMs aren't even that good to begin with, it's
           | obvious you want the SOTA to do anything useful unless maybe
           | you're finetuning
           | 
           | This is overkill. First of all, ChatGPT isn't even the SOTA,
           | so if you "want SOTA to do anything useful", then this
           | ChatGPT offering would be as useless as LLaMA according to
           | you. Second, there are many individual tasks where even those
           | subpar LLaMA models are useful - even without finetuning.
        
             | FrenchDevRemote wrote:
              | it's the SOTA for chat (prove me wrong), and you can
              | always use the API directly
              | 
              | even for simple tasks they're less reliable and need more
              | prompt engineering
        
               | baobabKoodaa wrote:
               | > it's the SOTA for chat(prove me wrong)
               | 
               | GPT-4 beats ChatGPT on all benchmarks. You can easily
               | google these.
        
               | Kiro wrote:
               | I tried and got nothing useful. What's the difference
               | between GPT-4 and ChatGPT Plus using GPT-4?
        
               | stavros wrote:
               | The distinction between GPT-4 and ChatGPT is blurry, as
               | ChatGPT is a chat frontend for a GPT model, and you can
               | use GPT-4 with ChatGPT. The parent probably means ChatGPT
               | with GPT-4.
        
         | [deleted]
        
       | villgax wrote:
        | Yeah right, just for the three-letter agencies to have a
        | backdoor. Hard pass on something that cannot be deterministic
        | with a seed.
        
       ___________________________________________________________________
       (page generated 2023-08-13 23:00 UTC)