[HN Gopher] Study: ChatGPT outperforms physicians in quality, em... ___________________________________________________________________ Study: ChatGPT outperforms physicians in quality, empathetic answers to patients Author : consumer451 Score : 262 points Date : 2023-04-29 08:55 UTC (14 hours ago) (HTM) web link (today.ucsd.edu) (TXT) w3m dump (today.ucsd.edu) | motohagiography wrote: | Several years ago, I worked on the technical privacy and security | design and architecture for systems that used ML for | prescriptions and some diagnostic triage. At the time, I thought | this was too big an ethical issue to be pushed down to us | technologists to solve (as are many in government), so I got in | touch with an eminent philosopher about the question of whether a | person was receiving honest and ethical care from ML, and from | whom (15 years ago now). | | His response to my (limited and naive) question was essentially: | people will hold attachments to beliefs about the human element | in these transactions, and the philosophical part of the question | was why people would hold beliefs that were not strictly | rational; he sort of declined the implied ethical questions. | There was no reason to expect him to respond with more, but given | that I was navigating the ethics of AI-driven medical care via | questions of privacy in system design decisions (read: | accountability) for the institutions that would use it and the | millions of people subject to it, it seemed like an opportunity | to be at the very forefront of what would likely become a | defining social issue of our lives within a couple of short decades. | | What we discovered then as architects, as most people are just | about to, is that the main use case for ML/AI will be to use | complexity to diffuse accountability away from individuals and | aggregate it up into committees and ultimately corporate entities | that are themselves artificial beings.
AI is the corporeal | manifestation of an institution, essentially a golem. | dimal wrote: | > AI-augmented care is the future of medicine. | | Somehow I think that the only profession in the US that still | uses fax machines will be slow to take up this new technology. | sorokod wrote: | Isn't r/AskDocs in the corpus on which ChatGPT was trained in the | first place? | PrimeMcFly wrote: | It probably used that more to figure out how to word sentences, | but I assumed it relied more on wiki or academic articles to | diagnose. | sorokod wrote: | Why would it be more probable, and why would you assume "wiki" | and "academic articles"? | PrimeMcFly wrote: | Not sure why you put those things in quotes, that's kind of | strange. | | That aside, the training isn't blind, it's guided, and it's | likely they use verified correct sources of info to train | for some things, like medical diagnoses. | sorokod wrote: | I can help with "verified correct sources": have a look | at "Language Models are Few-Shot Learners", section 2.2 | [1]. | | You may also be interested in Appendix A in the same | document: "Details of Common Crawl Filtering". | | [1] https://arxiv.org/pdf/2005.14165.pdf | amriksohata wrote: | Interesting, but the application would be weird: the physician | would diagnose you, then read out an automated, AI-generated text-to- | speech response to the patient? | Abroszka wrote: | It seems it gives more accurate and empathetic responses. I would | guess the physician just needs to double-check that what the | LLM says is medically correct and reasonable, and ask the right | questions of the patient and the LLM. | revelio wrote: | It's probably not directly useful for physicians except as a | teaching aid, but I have a friend who runs a small local | business and she sometimes finds it difficult dealing with | problem customers when she's tired or upset.
As she interacts | with them mostly via WhatsApp (up until the point that they | purchase), the idea of having a bot write the replies for her | has been floated. The LLM has infinite patience. | rolisz wrote: | > The LLM has infinite patience. | | Not Bing Chat. She doesn't have problems telling users that | they have been bad users. | AmazingTurtle wrote: | "she"? | RheingoldRiver wrote: | she == Sydney | revelio wrote: | Good point, but Bing Chat seems like an example of why | OpenAI's mastering of RLHF has been so critical for them. | Honestly, the more time you spend with ChatGPT, the more | absurd fantasies of evil AI takeover look. The thing is | pathologically mild-mannered and well-behaved. | BoorishBears wrote: | Bing originally intentionally undid some of the RLHF | guardrails with its system prompt; today it's actually | more tame than normal ChatGPT and very aggressive about | ending chats if it detects it's headed out of bounds | (something ChatGPT can't offer with the current UI) | flangola7 wrote: | ChatGPT will delete its response if it is severely out of | line. | BoorishBears wrote: | That's based on the moderation API, which only kicks in | on _severe_ content | | Bing on the other hand will end a conversation because | you tried to correct it one too many times, or used the | wrong tone, or even asked something a little too | philosophical about AI. | | They seem to be using it as a way to keep people from | stuffing the context window to slowly get it away from | its system prompt | ChatGTP wrote: | This is why I think Google had the brains to steer clear | of releasing a similar product; I think it was | intentional. They're not idiots, and they could probably | see that using "AI" products behind the scenes would be | safer and easier than having people directly talking to a | model which has to suit all customers' moods and | personality types while not being creepy or vindictive, | and while dealing with all the censorship and safety aspects of it.
| | Fun times. | hef19898 wrote: | One of my favorite physicians was my oncologist. I literally | spent less time with him during the whole diagnostics and | treatment period than I did with my, at the time, dentist. He | was straight to the point, no empathetic BS, just a doctor with | a diagnosis and a treatment plan to discuss. On the other hand was | me, an engineer with a problem to fix, and an expert with the | right answers. That discussion took all of 15 minutes. | | That guy would have failed against ChatGPT, and I _loved_ the | way he told things. Anything else would have just driven me | crazy, maybe to the point of looking for a different doctor. | | So I guess what passes as good bedside manner for doctors | largely depends on the patient. By the way, the dentist I have | since is in the same category as my, luckily former, | oncologist. A visit with him usually takes no more than 5 | minutes if he's chatty, less if not. Up to 10 when treatment is | required; anything longer than that is a different appointment. | kashunstva wrote: | The real communication skill for a physician is to be able to | flex the style, information content, and level of detail to | the patient with whom they are meeting. The patient in this | room is an engineer, the patient in the next room is elderly | and has mild cognitive impairment, etc. As impressive as | ChatGPT is in its domain, I don't see it "reading the room" in | this way anytime soon. And as a human who enjoys interacting | face to face with other humans from time to time, I hope we | keep it that way. | IanCal wrote: | You can give those models a system prompt, in which you can | tell it how to act generally; it's a very good place | (imo) for background information and formatting. 3.5 isn't | great at following it but 4 is.
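IanCal's point about system prompts can be made concrete. A minimal sketch in Python (the helper name, prompt wording, and model name are all illustrative; actually sending the payload would require the OpenAI API, so this only assembles the request body):

```python
# Hypothetical helper: build a chat request body in which a system message
# sets the assistant's standing instructions, per IanCal's suggestion.
def build_request(system_prompt: str, user_message: str, model: str = "gpt-4") -> dict:
    return {
        "model": model,
        "messages": [
            # The system role carries background and "how to act" guidance.
            {"role": "system", "content": system_prompt},
            # The user role carries the actual question.
            {"role": "user", "content": user_message},
        ],
    }

req = build_request(
    "You are assisting a physician. The patient is an engineer who "
    "prefers direct, technical explanations; keep answers concise.",
    "What does an HLA-B27 test result mean?",
)
```

The same payload with a different system message ("the patient is elderly, use plain language and a gentle tone") is how one would attempt the "reading the room" flexing kashunstva describes.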
| Macha wrote: | It also has to flex a bit with a diagnosis: "it's | aggressive and terminal cancer, you should draw up a will and | enjoy your next couple of months", "it's broken, you need a | cast", and "it seems like nothing, take some paracetamol | and come back if it's still the same in a week or gets | worse" all arguably call for different communication | styles. | xupybd wrote: | Wow, you overcame cancer. That is awesome. | hef19898 wrote: | I didn't do much, except being insufferable during chemo. | Everybody around me did the heavy lifting, emotionally and | physically. | lukehollis wrote: | Hey, I'm working on this! If it's interesting, I built DoctorChat: | https://doctor-chat.com/ | | You can send messages on WhatsApp, SMS, and now a ChatGPT plugin | to query a vector database of health information (Pinecone) and | respond with GPT-4 for higher quality results than default | ChatGPT. The chatbot then prompts the person messaging to verify | results with a real doctor and offers to connect them to a doctor | that I'm working with. | | I was doing fieldwork and would get sick in foreign countries | where I didn't know the language. My friends would drag my food- | poisoned body to a hospital where they inevitably would try to | give me prescriptions that I wasn't familiar with and wanted to | double-check. I wanted to build for myself a WhatsApp bot that I | could text while I travel to verify health information in low- | bandwidth internet situations. | | I shared it with some friends, and they shared the WhatsApp | contact with other friends and family, and now it's being used by | people in about 10 countries around the world, in several | languages. | | Would love any feedback if you try it out!
The phone number for | SMS/WhatsApp is +1 402.751.9396, or the link to WhatsApp if easier: | https://wa.link/levyx9 | mattnewton wrote: | I am sure you have thought about this, but I'm curious how you are | handling safeguards for crises that might require people to | intervene, like rare conditions that nonetheless require | medical attention, or mental health problems that pose imminent | risk of self-harm. | lukehollis wrote: | It's a really serious issue, and we've tested many of those | types of questions/messages, but you can test for yourself | also if you want. We state that it definitely shouldn't be | used for emergency situations, and the chatbot tries to | provide medical information and refer anything else to a | healthcare professional. | | But it can also be dangerous when you don't have access to | medical information. The first friend that started testing | the WhatsApp bot lives in the Sinai desert in Egypt, where | it's really hard to get to a clinic to ask questions. It's | kind of similar in rural Nebraska where I grew up. We're | taking things one step at a time and trying to provide the | best services that we're able. | lukehollis wrote: | Our API, if you want to test out the ChatGPT plugin that I built | for yourself, is here: https://chatgpt-api.doctor-chat.com/ -- | but a maximum of 15 people can test. | etiam wrote: | That's completely inane. There's nobody home. The physician by | definition wins actual empathy on a walkover, no matter how bad | they are for a human. | | Sad statement on the judgment of the respondents. But an | important reason it can turn out like this, I suppose, would also | be that the RL feedback gives the model a fairly effective | general optimization for what statements are liked by the | mechanical-turk-like evaluators. Most physicians have probably | never had access to anything like that level of feedback on how | their expressions are received.
Maybe the LLMs can be rigged to | provide goodness gradients for actual physicians' statements? | micromacrofoot wrote: | I've had doctors that I would consider worse than no one, | gaslighting me into thinking I'm not in pain and must be | mentally ill, so maybe not so inane | Llamamoe wrote: | Exactly. I've spent ten years too disabled to leave my home, | and most doctors are just bullies who would rather insult you | than consider that their initial evaluation of "it's just | stress" might be wrong. | notahacker wrote: | > Sad statement on the judgment of the respondents. | | Nope, it's a sad reflection of the study construction. | | Physicians' empathy was evaluated by their 52-word responses on | Reddit. Unsurprisingly, a chatbot optimised for politeness and | waffle outperformed responses from people volunteering answers in | a different format optimised for brevity... | ipunchghosts wrote: | Of course! Do you know how emotionally exhausting it is to treat | every patient as a blank slate when you see them? | jossclimb wrote: | To be fair, it's not that hard. If I ever see my GP nowadays, I | come armed with a stack of research sufficient to convince them | to refer me to a specialist. GPs are generalists at the end of | the day. | WolfCop wrote: | From the paper: | | > ...the original full text of the question was put into a fresh | chatbot session, in which the session was free of prior questions | asked that could bias the results (version GPT-3.5, OpenAI), and | the chatbot response was saved. | | It seems like they just pasted the question in. For those who | have asked it for medical advice, how did you frame your | questions? Is there a prompt that will help ChatGPT get into a | mode where it knows it is to provide medical advice? As an | example, should it be prompted to ask follow-up questions if it | is uncertain? | exabrial wrote: | This headline is incredibly deceiving... none of what it says | happened. | | ChatGPT did not actually answer patients.
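One way to approach WolfCop's question about framing is to wrap the pasted question in a template that explicitly licenses follow-up questions. A hypothetical sketch (the wording is an assumption for illustration; per the paper, the study itself pasted each question in verbatim with no framing):

```python
# Hypothetical framing template; not what the study used.
def medical_prompt(question: str) -> str:
    return (
        "You are providing general medical information, not a diagnosis. "
        "Answer the patient's question below. If key details are missing, "
        "first list the follow-up questions you would need answered, and "
        "recommend seeing a clinician for anything urgent.\n\n"
        "Patient question: " + question
    )
```

Whether such framing would have changed the study's quality and empathy ratings is an open question the paper does not test.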
| spaceman_2020 wrote: | I don't know about ChatGPT, but dealing with doctors has been | incredibly frustrating as someone with symptoms with no | immediately visible direct cause. | | One reputed physician completely refused to acknowledge that a | specific medication might be causing some of my side effects, | even when I shared links to peer-reviewed studies from reputed | universities (including his alma mater) that specifically talk | about side effects from the medication. | | My success rate with doctors is about 20% at this point. I've | stopped visiting them for most ailments, and if I ever do get any | prescriptions, I make sure to research them thoroughly. The | number of doctors who will casually prescribe heavy-duty drugs | for common ailments is unreal. | [deleted] | Llamamoe wrote: | This is exactly it. Doctors have neither any incentive to | actually treat nontrivial cases, nor any accountability over | whether they do. | | Combined with the prestige of the profession, many turn into | egoistic know-it-alls whose real competence is equivalent to a | car mechanic who tells you you're just driving your car wrong | when you come in with anything that isn't obvious from a 15s | inspection, because they get paid anyway. | | I would be surprised if most of the doctors I've seen could | outperform even GPT-2. | juve1996 wrote: | For every one person misdiagnosed there are probably 10 who | think they have cancer because of WebMD. People are dumb, and | I bet for every anecdote you have about how your doctor was wrong, | they have 50 about patients who refused to do basic treatment | courses that would radically improve their lives. But it's | ultimately true: you have to be somewhat knowledgeable and | responsible for your own health. Always get a second opinion | if you're unsure. But this know-it-all syndrome also applies | to you. Just because you can "do your own research" on Google | doesn't mean you know it all either.
| | Even so, doctors are extremely overworked now, and insurance | companies don't want to pay their fees. So now they're | running from patient to patient, unable to even pay attention | to them. That's even if you get a doctor now, before the | multiple nurse practitioners or physician's assistants who | also don't know anything. | spaceman_2020 wrote: | I get that, but that doesn't explain why they're always so | overeager to prescribe hardcore drugs or over-operate on | routine problems. The role of doctors in the entire opiates | crisis is pretty damning. | | I've had a couple of surgeries, and the eagerness of doctors | to give me opiates for pain relief was baffling, even when | I clearly told them that I can tolerate the pain and don't | need anything stronger than ibuprofen. | themantalope wrote: | Working as a surgery resident and now in IR, I can tell | you it's much better to be a little overprescriptive in | addressing post-op pain than to get behind and underdose. | | Also, opiates in a short-term setting are good meds. Pain | control is good, and people are able to get moving faster. | juve1996 wrote: | It's a lot harder to overprescribe addictive drugs now. | Much, much harder. But your point remains. The problem | wasn't so much painkillers after a major back surgery - | it was prescribing to treat chronic pain - which, in | hindsight, is an obvious way to get people addicted. What | isn't told is how insurance would only cover pills, not | more expensive therapies (to deal with the underlying | issues). Now you have a patient who's in some pretty | serious pain, and you have an FDA-approved pill to treat | it. In a country that still has TV ads for medicine, the | outcome isn't that surprising. It's also why opioid | addiction is a largely American phenomenon. | juve1996 wrote: | The problem is there are simply not enough doctors to go around, | plus cost. The schooling is enormous, and the costs to | hospitals/insurance companies are too.
Hence, more PAs, more | nurse practitioners, less doctor time, and that doctor is rushed | from patient to patient. | | There are good doctors out there, but there are a lot of bad ones. I | always advocate that people get a second | opinion if they're unsure. Just like you would if a plumber said "you need to | replace the whole system." If it doesn't seem right or you | don't feel like you got the proper attention, go somewhere else | and see. | | Medicine isn't an exact science for most of us. It's a lucky | thing when it's a simple infection that antibiotics can cure. | Most of our problems aren't so easy. Just be slightly | skeptical. Don't go "fruit will cure my pancreatic cancer" | crazy either. | spaceman_2020 wrote: | Balanced take. Appreciate it. | IanCal wrote: | It'd be fascinating to know how GPT-4 would perform on the same | task. It's been so much better than 3.5 in most things I've | tried. | ulizzle wrote: | I think this is just propaganda. Empathy is a "quale", meaning | it's part of the hard problem of consciousness. Qualia are non- | computable, so a computer can't display empathy. I'm sure it's | good at inserting little sugary cliches and platitudes into each | response, but to call it empathy is a real stretch. | | Perhaps it's rated as more empathetic because it's more likely to | tell people what they want to hear, since the patient is leading | the questions and not the other way around. | | That's another issue: since the AI can't compute qualia, it | can't discern if the patient has psychosomatic symptoms, so it is | more likely to give a false diagnosis. | nuancebydefault wrote: | Whatever is the case today, within a few years the automated | systems will clearly outperform doctors in most cases. Today we | have systems that can hallucinate randomly. Since most of it is a | black box, we cannot tell when it is hallucinating. This is | solvable and just a matter of time. | | Today we do not have feedback into the automated systems.
If we | start doing the following on a large scale - | measure -> treat -> measure again -> adapt treatment - then the | system will learn and will make connections that no doctor has | ever made, because the minds of all doctors are not | interconnected, at least not in a structural manner. | lukehollis wrote: | I totally believe this too--I hope that it can help provide | efficiencies and lower costs through the whole healthcare | system. | canadiantim wrote: | Here in Canada I would 100% rather deal with ChatGPT than a | doctor, so long as ChatGPT also controlled the keys to the gates | of the medical world (i.e. the ability to make referrals). | lukehollis wrote: | Haha, I know, I was feeling the same way sometimes in other | countries where I traveled. I was working on this problem for | referrals! I built a chatbot on WhatsApp that right now | connects people to a real doctor who can recommend local clinics if | someone doesn't have one yet. | Ldorigo wrote: | I suffer from an as-yet-unknown chronic illness, and as everyone in my | predicament knows, I have seen a staggering number of medical | professionals, who ranged from arrogant jerks who didn't listen | or take me seriously, to highly empathetic and thorough people | (who still failed to figure out what was wrong), passing by | overzealous specialists who sent me down completely wrong paths | with copious amounts of anxiety. Last month I felt particularly | low and desperate and decided to dump everything (medical | history, doctor notes, irregular lab results, symptoms) into GPT-4 | and ask for help (did a bit of prompt tuning to get professional- | level responses).
| | It was mind-blowing: it identified 2 possible explanations that | were already on my radar, and 3 more that I had never considered, of | which one seems very likely and I am currently getting tested | for. It explained how each of those correlated with my symptoms and | medical history, and asked why I had not had a specific marker | tested (HLA-B27) that is commonly checked for this type of | disease (and indeed, my doctor was equally stumped - he just | thought that test had been done already and didn't double-check). | | Bonus: I asked if the specific marker could be inferred from | whole-genome sequencing data (had my genome sequenced last year). | He told me which tool I could use, helped me align my sequencing | data to the correct reference genome expected by that tool, gave | me step-by-step instructions on how to prepare the data and tool, | and I'm now waiting for results of the last step (NGS data | analysis is very slow). | mial wrote: | I also have a chronic yet unknown condition, with a similar | story to yours. Would you share your prompts privately? | | Contact me at Drgpt@altmails.com | xwdv wrote: | Given that doctors are basically inefficient data analysts | focused on a single domain, I imagine GPT can replace most of | the need to consult a doctor until some physical action needs | to be taken. I think an AI that monitors daily vitals and | symptoms and reports to you anything that seems alarming might | help people live longer and more healthily. | jimsimmons wrote: | Brainstorming rare diseases and making diagnoses and providing | treatment using medical science are different things. | | If I ask GPT-4 about some arcane math concept, it'll wax lyrical | about how it has connections to 20 other areas of math. But it | fails at simple arithmetic. | BurningFrog wrote: | Being bad at arithmetic and making diagnoses are entirely | separate things. | | If that's your best argument, you don't have an argument.
| hwillis wrote: | You're completely wrong - look at the Wikipedia page for | differential diagnosis: | https://en.wikipedia.org/wiki/Differential_diagnosis | | Literally the majority of the page is basic arithmetic, | mostly Bayes. Diagnosis is a process of determining | (sometimes quantitatively, sometimes qualitatively) the | relative incidences of different diseases and all the | possible ways they can present. Could this be X, a rare virus, | or is it Y, a common virus presenting atypically? | slashdev wrote: | LLMs are not for doing arithmetic. Don't use a hammer to | drive screws. | jimsimmons wrote: | It's an irregularity in their performance profile. | Arithmetic is a known issue. How many such irregularities | exist but are not measurable? | tough wrote: | Is arithmetic based on language? Should an LLM be | expected to handle one plus one ad infinitum? It makes no | sense, since it's not built for it. | hwillis wrote: | Why does this apply for math but not for _being a | doctor_?? It can do basic math, but you say that of | course it can't do math - math isn't language. The fact | that it can do some basic diagnosis does _not_ mean it's | good at doctor things, or even that it's better than WebMD. | TeMPOraL wrote: | Arithmetic requires a step-by-step execution of an | algorithm. LLMs don't do that implicitly. What they do is | vector adjacency search in absurdly high-dimensional | space. This makes them good at giving you things related | to what you wrote. But it's the opposite of executing | arbitrary algorithms. | | Or, look at it this way: the LLM doesn't have a "voice in | its head" in any form other than a back-and-forth with | you. If I gave you any arithmetic problem less trivial | than the times table, you wouldn't suddenly come up with the | right answer - you'd do some sequence of steps in your | head. If you let an LLM voice the steps, it gets better | at procedural tasks too. | slashdev wrote: | Despite the article, I don't think it would be a good | doctor.
| | I read a report of a doctor who tried it on his case | files from the ER (I'm sure it was here on HN). It called | some of the cases correctly, missed a few others, and | would have killed one woman. I'm sure it has its place, | but use a real doctor if your symptoms are in any way | concerning. | hedora wrote: | They are terrible at synthesizing knowledge. | | If a search engine result says water is wet, they'll tell | you about it. | | If not, then we should consider all the issues around | water and wetness, but note that water is a great | candidate for wetting things, though it is important to | remember that it has severe limitations with respect to | wetting things, and, at all costs, some other alternatives | should be considered, including _list of paragraphs about | tangential buzzwords such as buckets and watering cans go | here._ | gpm wrote: | Proof-based higher math and being good at calculating the | answers to arithmetical formulas are two pretty unrelated | things that just happen to both be called "math". | | One of my better math professors in a very good pure math | undergraduate program added 7 + 9 and got 15 during a | lecture; that really doesn't say anything about his ability | as a mathematician, though. | bloqs wrote: | I thought all math was similar due to the ability to work | with it requiring decent working memory. Both mental math | and conceptually complex items from theory require | excellent working memory, which is a function of IQ. | dinosaurdynasty wrote: | You still have to practice arithmetic to be good at it, | and a lot of mathematicians don't. | jimsimmons wrote: | That's sorta my point: diagnosing well-studied diseases and | providing precise treatment is different from speculating about | causes for rare diseases. | | Who knows, the OP could be a paint sniffer and that's their | root issue. Brainstorming these things requires creativity | and even hallucination. But that's not what doctors do.
| jquery wrote: | Most humans fail at doing simple arithmetic in their head. At | the very least I'd say GPT-4 is superior to 99% of people at | mental math. And because it can explain its work step by step, | it's easy to find where the flaw in its reasoning is and fix | it. GPT-4 is capable of self-correction with the right | prompts, in my experience. | TeMPOraL wrote: | > _If I ask GPT4 about some arcane math concept it'll wax | lyrical about how it has connections to 20 other areas of | math. But it fails at simple arithmetic._ | | The only reason failing at basic arithmetic indicates | something when discussing a human is because you can | reasonably expect any human to have been taught arithmetic in | school. Otherwise, those things are hardly related. And LLMs | don't go to school. | PartiallyTyped wrote: | > But it fails at simple arithmetic. | | Does it though? When allowed to use their outputs as a | form of state, LLMs can very much succeed up to 14 digits with | > 99.9% accuracy, and it goes up to 18 without deteriorating | significantly [1]. | | That really isn't a good argument, because you are asking it | to one-shot something that 99.999% of humans can't. | | https://arxiv.org/abs/2211.09066 | amelius wrote: | What do you mean one-shot? Hasn't ChatGPT been trained on | hundreds of maths textbooks? | PartiallyTyped wrote: | When I ask a human to do 13-digit addition, 99.999% of | them will do the addition in steps, and almost nobody | will immediately blurt out an answer that is also correct | without doing intermediate steps in their head. Addition | requires carries, and we start from least to most | significant and calculate with the carries. That is what | 1-shot refers to. | | If we allow LLMs to do the same, instead of producing the | output in a single textual response, then they will do | just fine, according to the cited paper. | | Average humans can do multiplication in 1 step for small | numbers because they have memorized the tables.
So can | LLMs. Humans need multiple steps for addition, and so do | LLMs. | amelius wrote: | Ok. In the context of AI, 1-shot generally means that the | system was trained on only 1 example (or few examples). | | Regarding the number of steps it takes an LLM to get | the right answer: isn't it more important that it gets | the right answer, since LLMs are faster than humans | anyway? | PartiallyTyped wrote: | I am well aware what it means, and I used 1-shot for the | same reason we humans say I gave it "a shot", meaning | attempt. | | LLMs get the right answer and do so faster than humans. | The only real limitation here is the back and forth | because of the chat interface and implementation. | Ultimately, it all boils down to giving prompts that | achieve the same thing as shown in the paper. | | Furthermore, this is a weird boundary/goal-post: humans | get stuff wrong all the time, and we created tools to | make our lives easier. If we let LLMs use tools, they do | even better. | hwillis wrote: | Try asking it to combine some simple formulas involving | unit conversions. It does not do math. You can ask it | questions that let it complete patterns more easily. | PartiallyTyped wrote: | It does not have to do the math in one shot, and neither can | humans. The model needs only to decompose the problem into | subcomponents and solve those. If it can do so | recursively via the agents approach, then by all means it | can do it. | | The cited paper covers this to some extent. Instead of | asking the LLMs to do multiplication of large integers | directly, they ask the LLM to break the task into 3-digit | numbers, do the multiplications, add the carries, and | then sum everything up. It does quite well. | throwbadubadu wrote: | But that marker is not really indicative either way (you can | have the disease without the marker, and be healthy with it) - find a | good rheumatologist, who will usually send you to a good radiologist, | and have some MRIs that can identify it quickly and with certainty.
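The decomposition PartiallyTyped describes from the cited paper - split big numbers into 3-digit chunks, multiply the chunks, then propagate the carries - can be sketched in ordinary code. This is a sketch of the arithmetic the LLM is asked to spell out step by step, not of the prompting itself:

```python
# Multiply large integers via 3-digit (base-1000) chunks plus carry
# propagation, mirroring the stepwise procedure described above.
def chunked_multiply(a: int, b: int, chunk: int = 1000) -> int:
    # Split a number into base-1000 "limbs", least significant first.
    def limbs(n: int) -> list[int]:
        out = []
        while n:
            out.append(n % chunk)
            n //= chunk
        return out or [0]

    la, lb = limbs(a), limbs(b)
    # Accumulate partial products at their positional offsets.
    partial = [0] * (len(la) + len(lb))
    for i, x in enumerate(la):
        for j, y in enumerate(lb):
            partial[i + j] += x * y
    # Propagate carries and sum everything up, as in the final step.
    result, carry = 0, 0
    for pos, p in enumerate(partial):
        p += carry
        result += (p % chunk) * (chunk ** pos)
        carry = p // chunk
    return result + carry * (chunk ** len(partial))
```

Each intermediate value here (a limb product, a carry) is small enough to be one of the memorized-table lookups PartiallyTyped mentions; the paper's point is that forcing the model to verbalize these intermediates is what makes the overall computation reliable.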
| Buttons840 wrote: | > arrogant jerks who didn't listen [and would] not take you | seriously | | Pride goeth before the fall. I wonder how many arrogant jerks | will be humbled to see that they too are now inferior to a | computer ("soon"). Humans will always be better at being human, | though; perhaps they will learn empathy is more important than | they thought. | vasco wrote: | Humans will be better at being humans, for sure, but the jury | is still out as to whether humans prefer to interact with other | humans given a sufficiently reliable alternative. Hell is | other people, after all. | krisoft wrote: | > Humans will always be better at being human though | | What do you mean by that? If you mean humans will always be | the best at being the creature we call human, then that goes by | definition. | | If you mean humans will always be more | compassionate/emotionally understanding/better suited to | deliver bad news, then I am afraid that is unsubstantiated. | Buttons840 wrote: | I meant that a human is valuable just because they are a | human; this is not cold truth, but it is a moral value | almost everyone shares (and I don't want to imagine a | future without this value). In days past some have become | arrogant because, let's say, they know more about health | and medicine than everyone else; it's their source of self- | worth. They may soon have to reassess their value and | values. | rep_lodsb wrote: | What you are calling empathy is just patterns of language | statistically optimized to be convincing to the average | person. And I might sound _arrogant_ when I despair at being | surrounded by morons who "buy it", but that is IMO still | better than being a sociopath who enjoys it when others are | easy to manipulate with pretty words. | | ChatGPT = artificial sociopathy | GreedClarifies wrote: | One of the defining characteristics of LLMs is that their | knowledge bases are shockingly wide.
Anyone who has played with | GPT (or similar) has experienced that the width of its | knowledge is beyond most if not all humans. | | Medicine, and in particular diagnosis, is particularly | difficult because of the breadth of knowledge required, given | the span of possible diseases. | | It completely makes sense that GPT, or similar, would simply be | better than doctors at diagnosis in time, and it is very | plausible that the time is now. | | This is fantastic news for humanity. Humanity gets better | diagnosis and we don't put high IQ people through the grinder | that is medical school and residency to do a job which is not | suited to human cognition. | | It's an amazing win. | GaryWilder wrote: | [dead] | dennis_jeeves1 wrote: | I personally feel you're being overly optimistic. Genes, | markers and all of that might seem like high tech medicine, but | to the extent that I know, there has not been much progress on | that front, although it's been hyped a lot, and gets a lot of | media coverage. | hcrisp wrote: | Indeed HLA-B27 only has a partial correlation with diseases | such as AxSpa. | | "... about 7 percent of Americans are positive for HLA-B27 | but only 5 to 10 percent of people with a positive HLA-B27 | will have AS" https://creakyjoints.org/about-arthritis/axial- | spondyloarthr... | marpstar wrote: | I don't think it has to be about being "high tech medicine". | | In this case, the existing documentation of such things | combined with the events in the GP's own medical history have | been fed into a machine that can identify patterns that a | human doctor should have, but for whatever reason has not, | identified. | | I think the potential ramifications for this are huge. | makk wrote: | Even if this is right, something about this reply feels like | it's missing the magic of the moment we're in. No empathy, | alex_lav wrote: | What was the intent of this message?
| wycy wrote: | Could you share what sort of prompt you used (less all your | private data, of course)? | astockwell wrote: | When you say "He told me which tool I could use", did you mean | the doctor told you, or was that a typo and you meant that | ChatGPT told you which tool and walked you through it? Seemed | like the latter, but was too ambiguous to assume. | Ldorigo wrote: | Yes, it. What _it_ said. In my defense, my native language | has no neutral nouns, and neural networks and language models | are both masculine - so they're a "he". | ryanwaggoner wrote: | It seemed like the latter to me too, and if so, it's an | interesting example of how easy it is to subconsciously | anthropomorphize these AIs. | hammyhavoc wrote: | An in-law had a minor tremor in their left hand, and whilst this | wasn't pointed out to said in-law, it was noticed, and there were | other small tells about what the problem might be, which | ultimately led to a diagnosis and appropriate treatment. | | The hallucination frequency of any kind of LLM means that | LLMs are inappropriate as a solution in this scenario. An LLM is | not a physician, it's an LLM. | | You can make soup in an electric kettle, but that doesn't make it | the right tool for the job, and it comes with a lot of compromises. | master_yoda_1 wrote: | WTF my tax money is spent on this bull $hit | robwwilliams wrote: | And no one is surprised | benzible wrote: | Great article going into the details of what it would take to get | a system using an LLM for diagnosis approved: | https://www.hardianhealth.com/blog/how-to-get-regulatory-app... | | > [...] a roadmap for how to get a medical large language model- | based system regulatory cleared to produce a differential | diagnosis. It won't be easy or for the faint-hearted, and it will | take millions in capital and several years to get it built, | tested and validated appropriately, but it is certainly not | outside the realms of future possibility. | | > [...]
| | > There is one big BUT in all this that we feel compelled to | mention. Given the lengthy time to build, test, validate and gain | regulatory approval, it is entirely possible that LLM technology | will have moved on significantly by then, if the current pace of | innovation is anything to go by, and this ultimately begs the | question - is it even worth it if we are at risk of developing a | redundant technology? Indeed, is providing a differential | diagnosis to a clinician who will already have a good idea (and | has available to them multiple other free resources) even a good | business case? | revskill wrote: | Of course ChatGPT could easily beat any human WITHOUT access | to the internet. | teekert wrote: | One thing you really notice when interacting with ChatGPT for | some time: It doesn't get tired of your sh*t. | | The thing feels human, until you ask it for more options 10 times | in a row. Or ask it for a more concise version 5 times. Then it | shows: It just doesn't get tired of your S. | kyleyeats wrote: | It definitely has a "sick of your crap" mode where it will | abruptly end the conversation. | MacsHeadroom wrote: | I've been using ChatGPT daily for 6 months and it has never | ended a conversation on me. | kyleyeats wrote: | I haven't either. I think it's something the jailbreakers | run into. | qwefggggqwe wrote: | [flagged] | cinntaile wrote: | Martin Shkreli's new project is kind of like this. They use an | LLM that's fed with medical data to make a diagnosis. With the | disclaimer that it's not actual medical advice of course. | https://www.drgupta.ai/ | flangola7 wrote: | I thought he was in prison | sebzim4500 wrote: | Looked him up, apparently he was released in 2022. The | government auctioned that Wu Tang album though. | cheeseface wrote: | I definitely want to hand over all of my medical records to a | convicted criminal who has made his career in running Ponzi | schemes and abusing people with rare medical conditions.
| cinntaile wrote: | You don't have to say who you are and you can even use a VPN | to connect to the site. It's way harder being anonymous at a | regular doctor's practice. | azubinski wrote: | Let patients recover first; their blockchain certificates have | not yet been completed, and now this ChatGPT begins... | | https://www.lipscomb.edu/news/lipscomb-co-creates-blockchain... | m00x wrote: | I was verified as a doctor on /r/askdocs, and I am not a doctor. | I sent a Googled picture of a doctor's credentials from page 3 | of the search results. | | The quality of this study is so incredibly poor that I'm | flabbergasted at UCSD's bar for platforming such garbage. | PrimeMcFly wrote: | This isn't that surprising given how much doctors are overworked | and, often I think, underappreciated. | | A concern with something like this, though, is to what extent is | ChatGPT just telling patients what they want to hear, as opposed | to what they need to hear? | shagie wrote: | Responses to a person would still need to be gated by a medical | professional. | | I recall in a chat a person getting rather exasperated with a | coworker and the person using ChatGPT to generate a | friendly/business professional "I don't have time for this | right now." | | GPT could be used in a similar manner - "Here is the | information that needs to be sent to the patient. Generate an | email describing the following course of treatment: 1. ... | Stress that the prescription needs to be taken twice a day." | | The response will likely be more personal than the clinical (we | even use that word as an adjective) response that a doctor is | likely to give.
Whom do I sue for bad | "parrot advice" when GPT goes off the rails? | cubefox wrote: | Related question: If a driver kills someone and it's his fault, | he will presumably get large fines or even go to jail. But if | the same thing happens with a self-driving car, whose fault is | it then? The owner of the car? The car company? It seems nobody | is really at fault, mistakes are just bound to happen | eventually, and a car company can't go to jail anyway. | butterisgood wrote: | Can't put them in prison but they are "persons" in many other | ways. | | https://en.wikipedia.org/wiki/Corporate_personhood | rootusrootus wrote: | Liability for self-driving cars is a topic long discussed and | still not completely resolved. Take a look at | https://en.wikipedia.org/wiki/Self-driving_car_liability for | examples. | | The only level 3 system in production (that I'm aware of, at | least) is Mercedes', and they have liability while the car is | driving itself. It shifts back to the driver 10 seconds | (IIRC) after the system notifies the driver he/she must take | over. | dauertewigkeit wrote: | As it should be, considering that it is basically a knowledge system, | what IBM Watson should have been. | sethammons wrote: | Inspired, I just typed my wife's symptoms into ChatGPT. | | Ten to fifteen years of doctors, over half a dozen, failed to | accurately diagnose her until this month. | | ChatGPT got her diagnosis right in a second. Wow. I'm both amazed | and angry that we could not get this diagnosed a decade ago. | dxdm wrote: | This definitely sounds impressive, but could it be that you | "learned" how to describe the symptoms over time in a way that | makes it easier to arrive at the correct diagnosis? | | It could also be that knowing the correct diagnosis changes | your description to highlight things in a way that suggests the | correct outcome. I believe doctors are also susceptible to that | effect.
| | I'm not trying to imply that doctors should not do a better job | diagnosing. People should not have to "learn" to find the right | doctor, or how to operate them. It's a crying shame that people | like you have to go on years-long journeys to get correct help, | and I'm sorry that you and your wife had to go through that. | | Just saying it might be slightly more apples-to-apples to | compare ChatGPT's performance to the last couple of physicians | you saw, and not the whole lot of them. But again, that's still | a very favorable comparison for the non-human. | danabrams wrote: | I'm a ChatGPT skeptic, but willing to admit this is true... but | only because most physicians aren't very good. | NiceWayToDoIT wrote: | Statements like this are very dangerous and detrimental. I have | personally tested ChatGPT on multiple different topics, and | found issues and errors across the range. Even more disturbing is | that the explanation of why the "result" is as it is, is very confident | bullshit. If a person is not familiar with the subject, they will probably | blindly believe what the machine says. That being said, in the field | of medicine people will use ChatGPT as a poor man's doctor | (especially inspired by studies like this), where wrong results | coupled with confident BS could result in an increase in fatalities | due to wrong self-medication. | Mizoguchi wrote: | That was a very low bar anyway. US doctors are not known for | their empathy. During our recent appointment with our fertility | doctor, he goes and tells my wife, who has been on hormone | therapy for a week, that they will be "squeezing water from a | rock", lol. This is one of the top doctors in the world in that | field, and certainly knows what he's doing, but man he has zero | emotional awareness. | jutrewag wrote: | [dead] | ceejayoz wrote: | This is incredibly misleading. The actual study: | https://jamanetwork.com/journals/jamainternalmedicine/fullar...
| | Buried deep in the limitations section: | | > evaluators did not assess the chatbot responses for accuracy or | fabricated information | | Repeat: | | > EVALUATORS DID NOT ASSESS THE CHATBOT RESPONSES FOR ACCURACY OR | FABRICATED INFORMATION | sebzim4500 wrote: | But they did assess "the quality of information provided". I | don't understand what that means if not accuracy. | ceejayoz wrote: | As far as I can determine, they basically did a vibe check. | "Sounds good to me" sort of thing. | danShumway wrote: | This should be rated higher, possibly even over the reveal that | this study is comparing to Reddit answers. | | https://xkcd.com/937/ comes to mind. It's not implausible at | all to me that ChatGPT could outperform Reddit in | detail/manners for health advice (and honestly, even for actual | doctors I've heard some horror stories about bedside manner and | refusing to actually believe/consider symptoms), but if the | study isn't actually checking that, if they're just checking if | the chatbot was more polite/empathetic... that's a huge | qualification that should be up-front and center. | PragmaticPulp wrote: | Outperforms Physicians _answering questions on a public | subreddit_ : | | > In this cross-sectional study, a public and nonidentifiable | database of questions from a public social media forum (Reddit's | r/AskDocs) was used to randomly draw 195 exchanges from October | 2022 where a verified physician responded to a public question | | They didn't go to physicians in a patient setting. The physician | answers were taken from Reddit threads where they were | interacting with people who were not their patients. | | Reddit has its own dynamic and people tend to get snarky/jaded. | Using this as a baseline for physician responses seems extremely | misleading. | antegamisou wrote: | And God knows if those claiming to be doctors there are | actually that... 
| | It's apparent, at least in the US, that a lot of GPs can be | unhelpful, but believing that accurate diagnosis via text, | without providing evidence or going through some type of | physical examination, is feasible by _Reddit expert docs_ only | demonstrates a lack of critical thought. | wolverine876 wrote: | > And God knows if those claiming to be doctors there are | actually that | | Do you know if the subreddit does anything to address that? | dimal wrote: | They say that they are verified. For heavily moderated | subreddits like AskDocs, I think it's highly likely that | the mods take it seriously enough to maintain quality. | corey_moncure wrote: | If they went to a physician in a patient setting, they may have | had an experience like this: | | - Made to wait 45-60 minutes past their appointment time, | bombarded with pharmaceutical advertisements | - Spend another 20 minutes sitting in the examination room | staring at pharmaceutical company sponsored ads | - Nurse takes your history with a bunch of redundant questions | they already have the answers to | - Finally the physician arrives | - Assesses the patient without laying their hands on them | - Ignores everything you say | - Does no tests and cultures no pathogens | - "take some antibiotics and I'll bill your insurance company | $500, see ya" | thih9 wrote: | Can you elaborate? Did you have an experience like this? | | Is this a regular occurrence or have you found a better | physician since then? | zarzavat wrote: | > Nurse takes your history with a bunch of redundant | questions they already have the answers to | | If you're a nurse and your job is to take histories, it's | better to take the history in the same way every time, | systematically. This minimizes the chance of making mistakes. | | Moreover the notes may be wrong, or you may give a different | answer this time around.
You might give an important detail | this time around that you didn't give the last several times, | which actually turns out to be consequential. | | It might seem like a waste of your time but it's really not. | Measure twice, cut once. | hgsgm wrote: | > If you're a nurse and your job is to take histories, it's | better to take the history in the same way every time, | systematically. This minimizes the chance of making | mistakes. | | Citation needed. For example, a patient may get bored and | only answer the first few questions accurately each time. | indecisive_user wrote: | Citation needed. For example, patients may have a vested | interest in their care and answer all questions | truthfully each time. | flangola7 wrote: | The point is the patient has already given their history. | The clinic's internal processes aren't the patient's | problem. | hgsgm wrote: | The point is that the patients is neither compete nor | consistent. | flangola7 wrote: | What are patients competing for? | | Are you a bot? Your comment doesn't make any sense. | Toutouxc wrote: | I think they were trying to say "The point is that the | patient's story is neither complete nor consistent." and | it makes sense to me that way. | KyeRussell wrote: | If you can't deal with a one-character typo in a comment, | then ChatGPT certainly has you beat. The bot paranoia is | the icing on the cake. | | Calling this an "internal process" that's none of your | concern is a much more egregious wilful misrepresentation | of this situation. There is a situation or phenomenon of | human behaviour; this is how the healthcare system deals | with it. | | Who are you as presumably some software person to come in | telling them to knock the gate down without understanding | why it was put there in the first place? God knows you'd | hate it if someone did that to you in your area of | expertise.
I understand that the proliferation of VC- | backed money-losing companies which parachute clueless | software people into other domains has given developers | an undue sense of transferable expertise, but perhaps | exercise some self-awareness. | DangitBobby wrote: | The main problem is when you enter your medical history | and current medications on a tablet at intake and they then | ask the same questions once you get in the room. | jquery wrote: | - Your doctor has the incredible ability to understand your | symptoms without even listening to you | | - Your 40 minute in-depth appointment is finished in an | amazing 5 minutes | | - They can do this amazing appointment time compression | because of an encoding technique called "one size fits all" | | - The physician's assistant is so efficient that they've | already scheduled your follow-up appointment - for three | months from now, because they know you just love the | suspense. | ulizzle wrote: | Most communication is non-verbal, with 7% being attributed | to words (see the "55/38/7 Formula"), so they may need to | listen to how you say what you say (vocal), but don't have | to believe the words you say. | ajmurmann wrote: | Where is this happening? I've had long wait times in Germany, | but low cost and good care. In the US I have very little wait | time and high cost and equally good care. The doctors | unfortunately don't go all Doctor House on me, but they | certainly aren't shying away from touching me and physically | inspecting any issue and ordering follow-up tests. The | results aren't always very satisfying, but I understand that | there are limits to what's reasonable to invest in | researching relatively mild symptoms. | | Meanwhile my mom has had chronic pain without any proper | diagnosis. However, the German public health care system must | have spent EUR50-100k in tests and to my astonishment she is | currently in a clinic for 3 weeks that focuses on undiagnosed | pain.
As close to Dr House as it gets. | seszett wrote: | This sounds like hell but it's also infinitely different from | how it works here (in Western Europe, but I'd assume it's | different from almost anywhere but the US). Also, | pharmaceutical ads are forbidden here, and antibiotics are | very sparsely used. | ben_w wrote: | Indeed; my experiences have been in the UK and Germany, and while | the UK is currently experiencing chronic failure in | timelines due to the combination of {chronic understaffing} | _with_ {strike action caused by {{chronic low pay} and | {extra stress from acute overwork}}}, I've never seen a | single thing advertised in medical facilities of either | country. | acheron wrote: | It's also not actually what happens in the US either. The | poster is lying for whatever reason, probably because | making negative comments about healthcare is good for | Internet points. | DangitBobby wrote: | Eh, no, all of these definitely actually happen in some | places, but it would have to be a really shitty place for | all of these things to happen in one visit. | [deleted] | musicale wrote: | You forgot: - remove your clothes and wait | in a freezing room for 30+ minutes | | (Largely a compliance exercise to reinforce the social status | hierarchy.) | Retric wrote: | If that's your experience, _find a new doctor._ | VancouverMan wrote: | It may not be that simple, especially in regions where | there isn't anything resembling a free market in the health | care sector. | | Canada's various provincial public health care systems | (with a corresponding lack of private health care | offerings) tend to be like that. | | Many Canadians don't have a dedicated physician. Even if | they want one (or want a new one), it's often difficult, if | not impossible, to find one who's close by and who's | accepting new patients. | | Having a dedicated physician still often results in an | experience much like that other commenter described. It's | not an exaggeration.
Long waits even with appointments, | rushed examinations, and low-quality service are the norm. | | Another option, which is sometimes used even by people who | have dedicated physicians, is a walk-in clinic. | Unfortunately, they can be quite rare and inconvenient to | get to, even in Canada's largest cities, assuming they're | even open when you need them. You'll usually face an even | longer wait, even less time with the doctor, and typically | see a different doctor if any sort of followup is needed. | | Then there are hospital emergency rooms. That usually means | getting to the nearest sizable city, and even once you're | there, you've got to be prepared to wait many hours, even | for relatively serious situations. | | Ultimately, in Canada, it doesn't matter whether you see | your family doctor (if you even have one), use a walk-in | clinic, or go to a hospital emergency room. It's going to | be a horrible experience, and there's pretty much nothing | the average person can do about it. | | Given the lack of competition and due to other government- | imposed market distortions, there's no incentive for | doctors to offer anything resembling good service to the | general public. | | The best situation is to have a doctor who's a close friend | or family member, and who may be able to help mitigate at | least some of the typical problems. | | The next best option for Canadians, assuming they have the | money for it, is often to seek treatment in the US or | overseas. | | Just "finding a new doctor" isn't feasible, unfortunately. | andsoitis wrote: | > The best situation is to have a doctor who's a close | friend or family member | | The code of medical ethics advises physicians to not | treat themselves or members of their own families. | | https://code-medical-ethics.ama-assn.org/ethics- | opinions/tre... | Retric wrote: | It is that simple even if it's inconvenient. 
| | Being forced to wait for routine care isn't a big deal, | but having a doctor unwilling to listen to you is a life- | threatening situation. | | If your local area lacks access to proper medical care, | that's a serious enough problem that you should _move._ | Medical care is like clean drinking water or working | brakes on your car; it's not optional. | tgv wrote: | > Assesses the patient without laying their hands on them | | That's not a killer argument in a comparison with an AI. | User23 wrote: | Yes, yes it is, because it's an intrinsic advantage the | human doctor could have that's being ignored. | | Palpation, especially for musculoskeletal issues, is an | incredible diagnostic tool. And manual therapies are | surprisingly often an effective alternative to surgery or | drugs. | tgv wrote: | An AI couldn't touch anyone even if we could give it the | desire to do so. | hgsgm wrote: | The point of that thread-starting comment was that it's | what a _bad_ doctor does. The reply was saying good | doctors _can_ do better. | DangitBobby wrote: | I can count on pretty much 0 hands the number of times I had | an on-time appointment with a doctor. I understand they have | things going on that drag out their schedule a bit, but my | God. | SkyMarshal wrote: | That last one is more like: - "take some | antibiotics and our billing dept will bill your insurance | company for the maximum amount they estimate your insurance | can pay, plus some margin, see ya" | speedgoose wrote: | It depends on the location. In Norway they are usually | perfectly on time, you don't have any ads, they do the | required tests and examinations, they listen to you, they | take the time to explain everything and will do drawings if | necessary, they tell you to do more sport, and you get billed | about $25 (but it becomes free if you spend more than $300 in | visits and medications during the year).
| WheatMillington wrote: | I'm in New Zealand and my experience matches the GP | comment, except cost and pharma ads. Long wait times and | disinterested doctors are the norm. | cscurmudgeon wrote: | Even in the US you don't get Pharma ads in a hospital. | ericbarrett wrote: | Have you ever seen those brochures and flyers in the | waiting room (and sometimes even the exam room)? Those | are almost entirely supplied to the practice by pharma | reps. Every U.S. doctor's office I've been in for the | last four decades has been full of them. | cscurmudgeon wrote: | I have seen zero such brochures in every US doctor's | office I have visited. Zero. | | Is there any study that shows how prevalent they are? | andsoitis wrote: | Besides brochures, these ads also come in the form of | posters and bulletin board posts. Next time, also look at | the walls. | kldx wrote: | > perfectly on time | | > they do the required tests and examinations | | Do you have a source or is this anecdata? I do agree with | your other claims though. | speedgoose wrote: | Not really. I said that based on the experiences from my | close family. The sample size is small. | mensetmanusman wrote: | Might be a consequence of oil wealth offering a government | savings rate of $200,000 per citizen? | hgsgm wrote: | That's an asset, not a savings _rate_. | | USA's mean household Net Worth is $500K more than | Norway's, and average household is about 3 people (some | households are single people) so the oil fund | approximately cancels that out. | tbossanova wrote: | There are other places with a similar experience that | don't have the same oil wealth. I wonder though if it's | different at the higher end of things, like expensive | treatments for major issues which a tighter public purse | might not stretch to. | kwhitefoot wrote: | Perhaps you could explain how a fund that doesn't invest | onshore does this. 
The government uses only a small | percentage of the return on the fund (not the capital) | each year. | speedgoose wrote: | No, I don't think the oil fund (https://www.nbim.no/en/) | is responsible for the culture to be on time or is | connected directly to the public healthcare finances. | Though being a state with money helps, obviously. | PragmaticPulp wrote: | If we're comparing worst-case scenarios: ChatGPT might | confidently hallucinate a very valid-sounding explanation | that convinces the patient they have a rare disorder that | they spend $1000s of dollars testing for, despite real | doctors disagreeing and test results being negative. Maybe it | only does this once in a long series of questions, but the | seed is planted despite negative test results. | | Then when the patient asks ChatGPT if the tests could give | false negatives, it could provide some very valid-sounding | answers that say repeat testing might be necessary. | | This isn't a hypothetical. It's currently happening to an old | friend of mine. They won't let it go because ChatGPT | continues to give them the same answers when they ask it the | same (leading) questions. At this point he can get ChatGPT to | give him any medical answer he wants to hear by rephrasing | his questions until ChatGPT tells him what he wants to hear. | He's learned to ask questions like "If someone has symptoms | _____ and ____ could they have <rare disease> and if so how | would it be treated?" A real doctor would see what's | happening and address the leading questions. ChatGPT just | takes it at face value. | | The difference between ChatGPT and real doctors is that he | can iterate on his answer-shopping a hundred times in one | sitting, whereas a doctor is going to see what's happening | and stop the patient. | | ChatGPT is an automated confirmation bias machine for | hypochondriacs. 
| BlackSwanMan wrote: | [flagged] | ngngngng wrote: | > they spend $1000s of dollars testing for | | The subtext you're missing here is that GPT with access to | the entire corpus of medical data could undermine the | entire money-printing machine (referring to US healthcare | here). What test would cost thousands of dollars if the | only human cost to run it is drawing some blood and putting | it in a machine? | Firmwarrior wrote: | I hate doctors and pharmaceutical companies as much as | anyone, but those tests are serious business. There's a | lot of very hard science and engineering involved in them. | | When they're cheap outside the USA, it often just means | the companies aren't attempting to recoup any R&D costs | outside the US. | hkt wrote: | Surely it isn't pharmaceutical companies doing blood | tests in the US? In the UK people's bloods get done in | hospital labs (phlebotomy doesn't always occur in | hospital though) | sammax wrote: | The tests are done using proprietary hardware and software, | though they are generally made by biotech and not | pharmaceutical companies. | throwaway049 wrote: | Some of that lab work is contracted out of the NHS, but | your NHS doctor is largely steered by clinical guidelines | rather than profit or ass-covering. | | Eg https://www.synnovis.co.uk/about-synnovis | Retric wrote: | Abstractions like that are meaningless. Why is the Large | Hadron Collider so expensive? It's just throwing stuff at | each other and looking at what happens. Random teens | could do that... | | The actual process of doing blood work is often | surprisingly complicated. | hef19898 wrote: | I see a resurrected Theranos on the horizon. This time | powered by ChatGPT. What could go wrong? | alexb_ wrote: | Funny enough, I'm actually developing an AI-powered | biomedical technological breakthrough that's about to | disrupt the medical industry.
It uses wearables to enable | a blockchain-enabled data management system that also | functions as a cloud based SaaS provider, linking with | ChatGPT and NFTs to create value for those who are | underprivileged, all with only 1 drop of blood. If you | want to further this please give me a few billion dollars | and I promise something might come :) | snovv_crash wrote: | I'm only interested if the blood is delivered by drones. | grumple wrote: | Is what was described above a worst-case scenario? It | accurately describes nearly every doctor's visit that I, | or anyone I know, has had. | badloginagain wrote: | US healthcare could accurately be described as the | literal worst case scenario. | namuol wrote: | Ah, yes. Hypochondriacs. I was a hypochondriac for years | until I was able to get an appointment timed such that my | symptoms were physically present while I was being assessed | (not easy if you have a disease that comes and goes). I | really hope you're not a medical professional that | interfaces directly with patients. | Dylan16807 wrote: | So do you not believe actual hypochondriacs exist or | something? They do. | | Especially the people that will read about diseases and | get anxious they have them, and while you can validly say | they have anxiety issues they don't have whatever they | just read about 99% of the time. | | Even if there's a real disease of some sort, you don't | want to diagnose with the latest guess in someone that | keeps guessing different things. Their treatment needs | improvement, but confirmation bias is not how you do it. | sagarm wrote: | Whether hypochondriacs exist really has no bearing on how | people feel about their interaction with the medical | system. | | And it's pretty clear that for many people, they don't | feel like their needs are being met.
| Dylan16807 wrote: | It has a lot to do with whether "ChatGPT is an automated | confirmation bias machine for hypochondriacs" is a valid | worry or something that should disqualify you from being | a medical professional dealing with patients. | pmarreck wrote: | If you are willing to divulge it, what was the disease, | out of curiosity? | faeriechangling wrote: | I was a hypochondriac for decades. I was eventually cured, | after decades of hypochondria, through self-diagnosis of | the actual medical condition I had, which eventually became | a formal diagnosis and resulted in treatment through some | modest lifestyle changes. | | I'm a BIG supporter of using things like ChatGPT, Google, | and Sci-Hub to do your own medical research, because the | whole system where some physician diagnoses you based on | an extremely limited amount of data collected in a | haphazard manner after a few minutes of observation, | because he's experienced and smart or whatever, is | incredibly dumb. The way people hold it up as the ethical | standard from which we cannot deviate because it would be | too dangerous is utterly baffling to me. The status quo | totally lacks ethics and mostly serves to line the | pockets of a cartel of doctors with a monopoly on access | to medication and treatment, who often condescendingly | think patients are simply too irrational to treat | themselves without their help. | | I legitimately cannot wait for this field to mature and | for medical self-help with AI assistance to become the norm. | throwaway049 wrote: | I agree with studying the field to help you understand | your own health, but I prefer Sci-Hub or any peer-reviewed | source over an LLM. I'll revise this view as | LLMs develop, but right now I'm seeing plausible BS as | often as I see good advice. | lukehollis wrote: | I was the same way with this while I travel--it's | definitely the future.
I'm working on an AI healthcare | assistant where you can summarize a conversation with our | plugin on ChatGPT or bot on WhatsApp and then send it to | a real doctor to continue the conversation. | | I hope that more founders build and innovate in the field | to provide efficiencies throughout the whole system, to | lower costs, and to provide high-quality care for everyone | who needs it. | | Some insurance providers need a primary care physician | for referral, but some do not, so one area we're researching | is whether we can do referrals through doctor | follow-up/verification from a summary of the chat. | crx07 wrote: | Fellow hypochondriac here. I was at the point where | doctors, hospital staff, and lab techs would immediately | warn new practitioners about me so they wouldn't waste | finite medical resources in a small town, and I just | completely discontinued normal activities out of terror | as a result. | | When I finally blacked out and fractured my spine, first | responders detected a lifelong cardiac arrhythmia in the | back of an ambulance. Only with that knowledge have I | been able to receive treatment and begin to heal | emotionally from the gaslighting and medical abuse I | experienced while in the care of licensed professionals. | | AI-assisted medicine will prevent so many of these | mistakes in the future. It can't come soon enough as far | as I'm concerned. | DangitBobby wrote: | Wow, the doctors that tried to block you out completely | within their network of friends must have balls of steel, | what with zero fear of legal repercussions! | crx07 wrote: | It was more a matter of how misinformation malignantly | spreads, I believe. | | I had to see the only available PCP to see anyone else. | This automatically prompted medical releases that, even | if unethical, would have still made everything | technically legal.
If it was an emergency room trip, | there were always the same two or three physicians there, | so they all became aware of me from the first couple of | episodes and could warn any specialist they referred me | to see. | | Same deal with laboratories and radiological facilities. | When you've got only one or two options in town, they | have your consent to release PHI by default if you ever | want the results interpreted, and their interpreting | physician can just accompany the report with a courtesy | call to the receiving provider about a suspected | diagnosis. | DangitBobby wrote: | I was also a hypochondriac in high school and college, | sleeping 12-16 hours a day and still being completely | exhausted! Apparently what I really needed was more | exercise. The CPAP machine I eventually got after | ignoring my PCPs diagnosis merely serves as a placebo, | but very effective nonetheless. I don't bother him with | my delusions anymore. | javajosh wrote: | Accurate. This is a result of the heady combination of | relying on the profession to enforce ethical codes, and | letting corruption become standard practice, and therefore | common and safe. That medicine is NOT a free market is easy | to prove, because as soon as you posit a provider that keeps | their appointment times, charges a reasonable fee for a short | consult, actually looks at you and performs tests himself, | relates to your whole being as a human on this planet, and | charges something like $100/hour, cash - imagine how that | poor schmuck is going to fare in this hard, harsh world of | ours. As soon as they need to interface with literally any | other part of the system, the other providers, pharmacies, | etc will not be able to handle them - and won't want to. | Won't need to. Many of their colleagues will sneer at them or | even refuse to refer. 
Some patients, the sociopaths, will | sense weakness, the lack of the blood-thirsty corporate | apparatus backup that deals with 10 frivolous lawsuits a day, | and say "what the hell" and sue you for malpractice simply | because they think you're unprepared and they can win. | Firmwarrior wrote: | I had some doctors like that. | | To be fair, they were so heavily booked that they weren't | taking new patients and you had to wait weeks or months for | non-urgent visits. | | Also some of them went out of business... | musicale wrote: | I wonder if things like CVS's MinuteClinic are any better | for basic stuff vs. the hell of regular clinics. | | The potential seems there: cash payments accepted, many | convenient locations, possibly lower wait times. | | Of course CVS is Aetna, for better or for worse. | KyeRussell wrote: | Ah, an American conflating healthcare with American | healthcare. You love to see it. | devinprater wrote: | Can confirm. | vitorgrs wrote: | This is very U.S. It's not like that even here in Brazil, | which is not a developed country. | wolverine876 wrote: | In almost every HN discussion about research, the top comment | criticizes the validity of the research. Perhaps we should | learn that research doesn't work the way we imagine. | PragmaticPulp wrote: | The research is fine if you actually read it and understand | what they're researching. | | It's almost always the headlines and PR pieces that | exaggerate it. | | "ChatGPT is more empathetic than Reddit doctors" isn't | interesting. Strip the "Reddit" out and then everyone can | substitute their own displeasures with doctors and now it's | assumed true. | ceejayoz wrote: | > It's almost always the headlines and PR pieces that | exaggerate it. | | That is definitely happening here. | | https://jamanetwork.com/journals/jamainternalmedicine/fulla | r... | | > evaluators did not assess the chatbot responses for | accuracy or fabricated information | | Yikes.
(I do fault the researchers quite a bit for quietly | slipping that little detail into a page-long "limitations" | section.) | withinboredom wrote: | Not to mention knock-on-effects of teens trying to decide | what to be when they grow up. Some would-be-docs are going | to read this and go, huh, maybe I should pick a | professional field that AI won't take over. And bam, in 15 | years, there are slightly less docs. | idrios wrote: | At least in the US there are more people who want to be | doctors than positions available. The bottleneck is | medical school acceptance rates. To practice medicine you | need a medical license, which you can only get from an | accredited university. | | I knew students who were rejected from medical school, | but I also knew far more students who were at one time | pre-med, saw how much effort and debt that would-be | doctors would need to take on, and saw the risk of | pursuing an e.g. biology degree where their entire future | hinges on getting into med school, and they chose a | different field. | dimal wrote: | My experience with doctors in a patient setting is that they | have generally been far less empathetic than the doctors in | /r/AskDocs. Those doctors seem to be motivated to help and | actually show some empathy. While Reddit on the whole is | snarky, many subreddits (especially heavily moderated ones like | AskDocs and AskHistorians) have a completely different tone. | | My experience with most doctors is that they are among the | least empathic people I have ever dealt with. I think that | using AskDocs actually gives doctors an unrealistic _advantage_ | in the study. | gwern wrote: | And had they done that, then no matter how they did it, they | would be criticized for being 'unethical' and criticized for | either keeping the data secret or not keeping the data secret. | fumeux_fume wrote: | Considering how difficult getting this kind of dataset is, I | would say that using r/askdocs would be a great place to start. 
| Doctors' responses are labeled as such and the subreddit looks | healthy and well moderated. Of course it's not perfect, and it's | important to point out possible issues with using r/askdocs, | but I don't think your criticism of Reddit being snarky or | jaded holds much weight here. | andreagrandi wrote: | The more I use ChatGPT (including the 4.0 version, available with | a Plus subscription), the more I think these "studies" and articles | are simply made up. | | In my experience it's terrible for any question I ask: if it | doesn't know a person well, it simply makes things up; if I ask | for code examples or scripts, most of the time they are wrong, I need | to fix them, they contain obsolete syntax, etc.; if I'm asking | a question, I expect to be asked for more context if the | subject is not clear, but instead it starts spitting out text without | even realising I asked a completely different thing. | | I could go on for hours with other examples, but I'm seriously | not finding it useful. | SkyMarshal wrote: | I've only asked GPT4 a few niche questions on subjects of | interest to me, so I can't really judge it yet. But so far its | answers can't compete with Wikipedia. However, it seems good at | drilling down and doing followup questions that build on the | prior questions, which is interesting. I can see that natural | language give-and-take back-and-forth being useful for things | like early education, diagnosing non-emergency patients, | troubleshooting home PC problems with non-computerphiles, etc. | hammyhavoc wrote: | It hallucinates a lot. Any time statistics or specifications | are involved, don't trust it whatsoever. | jstanley wrote: | If you tell it it's wrong, it often comes up with a better | answer on the second try. | | It seems like you could maybe automate that.
Let it spit out | its first draft of an answer, have the framework tell it | "please correct the errors" and then let it have another go and | only present the second attempt to the user. | simmerup wrote: | But how do you know it's wrong until a human tries the output | mirekrusin wrote: | You don't need to, you always tell it it's wrong. | [deleted] | notahacker wrote: | But if you always tell it it's wrong, it will sometimes | come up with _worse_ answers on the second try. Which | means you still need to know whether the first answer is | correct or not (or neither of them) | | I'm reminded particularly of a screenshot of someone | gaslighting ChatGPT into repeatedly apologising and | providing different suggestions for Neo's favourite pizza | topping, despite it answering correctly that the Matrix | did not specify his favourite pizza topping first time | round, but it applies equally to non-ridiculous questions | jstanley wrote: | The idea isn't that you tell it to simply _change_ what | it wrote on the first try. The idea is that having a | first draft to work with allows it to rewrite a better | version. | notahacker wrote: | This technique works fine if you're creating new | iterations of creative work, or have a specific thing you | want it to fix | | It's not much use if ChatGPT gives you a diagnosis of | your symptoms which may or may not be accurate. | hammyhavoc wrote: | Great example! | | What if someone refers to x part of their body | incorrectly? What if they're not able to think clearly | due to y reasons and tell it something completely wrong? | | An LLM is wholly inappropriate. | | As for its "bedside manner"/how polite or friendly it is, | that's meaningless if it isn't good at what it does. Some | of the best docs/profs I've known have been very detached | and seemingly unfriendly, but I'll be damned if they | weren't great at what they do. 
Give me the stern and | grumpy doc that knows what they're doing over the | cocksure LLM that can't reason. | hammyhavoc wrote: | That's assuming it doesn't just enter a fail state and | keep providing the same answer again and again and again, | despite explaining what about the answer is wrong. | ChatGTP wrote: | Is there a bot writing these comments? I'd have thought | everyone who has used ChatGPT 4 would know this by now and | take it into account. | andreagrandi wrote: | The point is: my job is not to train a deep learning model, | my job is to write code. If I already know the answer, I'm | not asking the question. If I can recognise something is | wrong, it means I should have done that task myself. Can | it do certain things faster than me? Sure! Can it do them | correctly? Most of the time it can't, and if I have to spend | my time checking every output, the time I saved is already | wasted again. | birdyrooster wrote: | The whole point of computing is to have it do tasks for us, | and here you are advocating that we should do tasks by | ourselves. I hate everything about this. | AshamedCaptain wrote: | Your experience matches mine. Everything I have asked is | usually _horribly_ wrong, and even asking things in a different | order makes it completely change its responses, even for | otherwise binary questions. Even the "snippet" part of a Google | search with the same prompt normally contains enough | information to contradict it... | | I'll also note that when there was hype around Stable | Diffusion, one of the images shared around was that of an | astronaut riding a horse. If you actually run Stable Diffusion | with its default tuning and ask for that prompt, you will get | six images, of which five are outright disasters (horses with | six legs and going downhill from there), and then the sixth image, | the only one which could possibly pass as a decent result, is | the one that everyone shared and reshared and hyped.
Usually | other prompts give even more terrible results, with zero | passable images without extensive tuning. Stable Diffusion is now | acknowledged to actually be crap, despite the hype, and I | supposedly need to try the next best thing, whatever that is. | But I find myself facing the same situation with ChatGPT 3.5, | and now with ChatGPT 4, despite the fact there is no "next best | thing", and I don't even know how they could even possibly try | to fix the problem of it being just wrong. | pmoriarty wrote: | You should try Midjourney. It is far more likely than SD to | give great results on the first try, even when you're not | good at writing prompts. | | Advanced users of both Midjourney and SD can get some stellar | results out of them. Some of that is due to trial and error, | and going through dozens or hundreds of images to pick the | best ones, but being adept at crafting prompts and using | other features of the programs plays a big role too. | | Know your tools. | LewisVerstappen wrote: | Okay, that's just because you're not good at using the models. | | Obviously you have to play around with them and figure out | how to prompt & tune them. | | Once you do that, you can get pretty amazing results. | | I make a bunch of social media graphics, and Stable | Diffusion + Midjourney have been insanely useful. | AshamedCaptain wrote: | > okay that's just because you're not good at using the | models. | | You literally cannot respond with "you are holding it | wrong", especially when I'm claiming that even for the | popular _example prompts_ the SD authors used, they had to hand- | pick the best random result over a sea of extremely shitty | images. | | And even in the original paper they disclaim it by saying | "oh, our model is just bad at limbs". No, it's not just bad | at limbs. They just happened to try examples where it could | particularly show how terrible it is at limbs (i.e. spider- | legged horses and the like).
But in truth, it's just bad at | everything. | kyleyeats wrote: | This is like throwing your hammer down in frustration | over each nail taking more than one swing. | ChatGTP wrote: | I guess the difference is hammers are logical, simple | tools to use with a known use case. They're fairly hard | to use incorrectly, although it does take some practice | to use one, I'll admit. | cubefox wrote: | "It's bad at everything" ... bad by what standards? Just | a few years ago it would have been regarded as | unbelievable science fiction that a model with such | capabilities would soon be available. As soon as they are | here, people stop being impressed. But the objective | impressiveness of a technology is determined by how | unlikely it was regarded in the past, not by how | impressed people are now. People get used to things | pretty quickly. | | Besides, there are models that are much more capable than | Stable Diffusion. The best one currently seems to be | Midjourney V5. | AshamedCaptain wrote: | > Just a few years ago it would have been regarded as | unbelievable science fiction that a model with such | capabilities would soon be available. As soon as they are | here, people stop being impressed. | | I don't know. I've had chatbots for decades before "a few | years ago", so I have never been particularly impressed. | I would say that for someone who was already impressed | with that you could practically describe a landscape in | plain old 2000s Google Images and get a result, SD feels | like just an incremental improvement over it -- the | ability to create very surreal-looking 'melanges', at the | cost of it almost always generating non-sensical ones. | And also add that Google Images is much easier to use | than SD... | wizzwizz4 wrote: | > _Just a few years ago it would have been regarded as | unbelievable science fiction that a model with such | capabilities would soon be available._ | | No, it wouldn't have - not to people in the know. 
We just | didn't have powerful enough computers back in the 90s. | Sure, the techniques we've got now are _better_, but 90s | algorithms (with modern supercomputers) can get you most | of the way. | | Transformers are awesome, but they're not _that_ much of | a stretch from 90s technology. GANs are... _ridiculously_ | obvious, in hindsight, and people have been doing similar | things since the dawn of AI; I imagine the people who | came up with the idea were pretty confident of its | capabilities even before they tested them. | | Both these kinds of system - and neural-net-based systems | in general - are based around mimicry. Their inability to | draw limbs, or to tell the truth, or count, is | _fundamental_ to how they function, and iterative | improvement isn't going to fix them. Iterative | improvement would be going faster, if researchers | (outside of OpenAI and similar corporations) thought it | was worthwhile to focus on improving _these systems_ | specifically. | | ChatGPT is not where transformers shine. StyleGAN3 is not | where GANs shine. Midjourney is not where diffusion | models shine. They're _really_ useful lenses for | visualising the way the architectures work, so they _are_ | useful test-beds for iterative algorithmic | improvements [1], but they _aren't_ all that they're made | out to be. | | [1]: See the 3 in StyleGAN3. Unlike the 4 in GPT-4, it | actually _means_ something more than "we made it bigger | and changed the training data a bit". | cubefox wrote: | > No, it wouldn't have - not to people in the know. We | just didn't have powerful enough computers back in the | 90s. | | I'm not talking about the 90s. I'm talking about April | 29, 2020. | wizzwizz4 wrote: | What's special about that day? That's _after_ the | algorithms were developed, models and drivers were built, | and most of these behaviours were discovered.
I've got | fairly photorealistic "AI-generated" photos on my laptop | timestamped September 2019, and that was _before_ I | started learning how it all worked. | | If you're talking about popular awareness of GPT-style | autocomplete, then I agree. If you're talking about | _academic_ awareness of what these things can and can't | do, we've had that for a while. | cubefox wrote: | Which photorealistic AI-generated image? In September 2019 | this must have been a GAN face. I admit those are | impressive, but incredibly limited compared to today's | text-to-image models. If you look at an iPhone from 2019, | or a car, or a videogame ... they all still look about | the same today. | | Three years ago there was nothing remotely as impressive | as modern GPT-style or text-to-image models. Basically | nobody predicted what was about to happen. The only | exception I know of is Scott Alexander [1]. I don't know | about any similar predictions from the experts, but I'm | happy to be proven wrong. | | [1] https://slatestarcodex.com/2019/02/19/gpt-2-as-step- | toward-g... | wizzwizz4 wrote: | > _In September 2019 this must have been a GAN face._ | | Well, yes (that file was), but actually no. StyleGAN1's | public release was February 2019, and it's capable of far | more than just faces. | | > _Three years ago there was nothing remotely as | impressive as modern GPT-style_ | | I predicted that! Albeit not publicly. (My predictions | claimed it would have certain limitations; I can show | that those still exist in GPT-4, but nobody on Hacker | News seems to understand when I try to communicate it. [1]) | | > _or text-to-image models._ | | Artbreeder (then called Ganbreeder) existed in early | 2020, and it didn't take me by surprise when it came out. | It parameterises the output of the model by mapping | sliders to regions of the latent space; quite an obvious | thing to do if you want to try getting fine-grained | control over the output.
(A 2015 paper built on this | technique: https://arxiv.org/abs/1508.06576) | | I was using spaCy back around 2017-2018. It represents | sentences as vectors that you can do stuff like cosine | similarity on. | | If I'd been more interested in the field back then, I | could have put two and two together and realised you | could train a net on labelled images (with supervised | learning) to map a spaCy model's space to StyleGAN's, | which would be a text-to-image model. It was _very_ much | imaginable back before April of 2020; a wealthy non- | researcher hobbyist could've made one, using off-the- | shelf tools. | | If I were better at literature searches, I could probably | find you an example of someone who'd _done_ that, or | something like it! | | --- | | [1]: See e.g. here: they tell me GPT-4 can translate more | than just explicitly-specified meaning, and the | "evidence" doesn't even manage _that_. | https://news.ycombinator.com/item?id=35530316 (They also | think translating the _title_ of a game is the same as | translating the _game_, for some reason; that confusion | was probably my fault.) | LewisVerstappen wrote: | Nope. You literally just tried out one prompt, saw one | good image and several bad ones, shook your fist | at the computer, and gave up. | | I'll repeat myself. You have to play around with the | models and learn how to use them (just like you have to | _for everything_). | | > But in truth, it's just bad at everything | | Thousands of people (including myself) have had the | complete opposite result and have gotten amazing | pictures. You can play around with the fine-tuning with | different models from civitai and get completely | different art styles too. | | Like, this is so dumb I don't even know how to respond, | lol. | | You're like some guy who got a computer for the first | time and couldn't figure out how to open the web browser, | so he just dismissed it as useless.
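The sentence-vector cosine similarity wizzwizz4 mentions a few comments up is simple to sketch. This is a minimal stand-alone illustration, not spaCy itself: the short vectors are toy stand-ins for real sentence embeddings (spaCy's have hundreds of dimensions), and the names are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors: 1.0 means
    same direction, 0 means orthogonal, negative means opposed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "sentence vectors": similar sentences point in similar directions.
doc_a = [0.20, 0.80, 0.10]
doc_b = [0.25, 0.75, 0.05]   # close to doc_a -> similarity near 1
doc_c = [-0.90, 0.10, 0.40]  # different direction -> low similarity

print(cosine_similarity(doc_a, doc_b))  # ~0.995
print(cosine_similarity(doc_a, doc_c))  # negative
```

spaCy's `Doc.similarity` computes essentially this over its own learned vectors.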
| AshamedCaptain wrote: | I don't think you understand the point. Your claims that | "all of this needs extensive tuning and hand-holding and | picking results" do not help your argument, they help | _mine_. | | It's most egregious if you are doing even more tuning and | cherry picking than the authors of the models are, | which you definitely are. | SanderNL wrote: | It sounds like magic to you because you are unskilled. Like | people using the mouse for the first time. The things he | talks about are very basic, very easy. | whateveracct wrote: | I would rather just learn to draw than constantly write | different text until it looks good. | | Drawing isn't hard, you know. | sebzim4500 wrote: | I could spend 20,000 hours trying to learn to draw and I | would still be far worse than what I could generate with | Stable Diffusion + ControlNet + etc. | whateveracct wrote: | I'm fairly sure you can best Stable Diffusion after 7 | years of daily 8h investment into drawing :) | johnnyyyy wrote: | I doubt you would be better than someone who used | Stable Diffusion for 7 years. And I'm not even including | the technological advancements of the next 7 years. | dragonwriter wrote: | Having spent several orders of magnitude more time | working on drawing than with SD, I'll say "Drawing isn't | hard for some people". | | If drawing were that easy, no one would worry about | disruption from AI image generators, because everyone who | wanted images would be knocking them out by hand, not | paying people for them, so there'd be nothing to disrupt. | whateveracct wrote: | Anyone can learn to draw, and it's not hard. If you want | to create, you can and will learn to draw. | | Therefore, I would say most people hyped for SD and | friends are the capitalists and consumers of the art | world - not the creators. | dragonwriter wrote: | > Anyone can learn to draw, and it's not hard. | | Repeating an assertion more times doesn't make it true.
| | > If you want to create, you can and will learn to draw. | | "Drawing", even when it comes to imagery, is far from the | only form of creation, and, even if it were, will is not | determinative of capacity. | | But, yeah, I mean, people have said similar things about | basically every new means of creation ever. | Sunhold wrote: | Learning to draw at the level of what Stable Diffusion | can generate would take thousands of hours of practice, | and the individual drawings would take hours. | whateveracct wrote: | But if you do learn, you can then render photorealistic | images with nothing but pencil and paper instead of being | reliant on a beefy computer running a blackbox model | trained at enormous cost :) | | SD will never compare to the power of pencil and paper, | imo. Drawing is an essential skill for any visual artist, | not just for mechanics but for developing style, taste, | and a true understanding of the world around you visually. | | I recommend Freehand Figure Drawing for Illustrators as a | good starting point (along with some beginner art | lessons). It won't take 1k hours before you see results. | It's also fun! | dale_glass wrote: | You can optimize that: feed SD a sketch, get a finished | painting as a result. | | It works surprisingly well. | dragonwriter wrote: | > You literally cannot respond with "you are holding it | wrong" especially when I'm claiming that even for the | popular _example prompts_ SD authors used they had to | hand-pick the best random result over a sea of extremely | shitty images. | | I do a lot of my SD with a fixed seed and 1-image | batches; once you know the specific model you are using, | getting _decent_ pictures isn't hard, and zeroing in on | a specific vision is easier with a fixed seed.
Once I am | happy with it, I might do multiple images without a fixed | seed using the final prompt to see if I get something | better. | | If you are using a web interface that only uses the base | SD models _and_ doesn't allow negative prompts, yes, it's | harder (negative prompts, and in particular good, model- | specific negative embeddings, are an SD superpower). | smusamashah wrote: | Agree with GPT, but with Stable Diffusion you are only | partially correct. Visit /r/StableDiffusion to see the | stuff people are making. People share prompt, seed, model | settings etc. and you can reproduce the exact same thing. | | I do agree that it's bad at following the prompt exactly. | It will produce most of the things you mentioned in the | prompt, but not necessarily in the same fashion you asked | for. I don't agree that the produced images are mostly shitty; | just visit that subreddit. | AshamedCaptain wrote: | > Agree with GPT, but with Stable Diffusion you are only | partially correct. Visit /r/StableDiffusion to see the | stuff people are making. People share prompt, seed, model | settings etc. and you can reproduce the exact same thing. | | This doesn't really say anything, because it's just | survivorship bias. The entire purpose of that site is to show | the successes. Most people get disasters every single | day, they just don't upload them to the site. Even if I | try the same prompt I will get a disaster image, as long | as I don't use, e.g., exactly the same random seed they | happened to use. This is not even "prompt engineering". | It's just outright playing with the dice. | Sunhold wrote: | Why do you think it matters if five out of six images are | failures? If the sixth is a success, you have your image. | The tool has worked. Glancing over a few failed | generations is certainly far less effort than making the | image from scratch. | bamboozled wrote: | Because it's a departure from the more logical and | rational computing people have come to expect from | "computers".
For many people, it's unusual to have "fuzzy | computing" become popular again. | SanderNL wrote: | Have you actually tried using these tools, beyond starting | them up and swinging your fist in frustration? | | I have thousands upon thousands of beautiful images, each | one more inspiring than the other, and I did nothing or | very close to nothing. You are belittling this amazing | tech so much it sounds like you are scared of it. What's | it to you? | | Have you ever lived in a time when it was even remotely | possible to go from "astronaut on horse" to even one | decent result, in seconds? | | I can't even. | smusamashah wrote: | You are wrong. Sort that sub by new to get a glimpse of | the ugly stuff. With SD you can't ask for a "very beautiful | image of X" and actually get one. You must fine-tune your | prompt to get the right aesthetics (photo, cinematic, | specific artist, etc.) and also choose a better model. The | base model that Stable Diffusion released is not very | good. Visit https://civitai.com to get a glimpse of how | good the models have become. | | I made lots of wallpapers in bulk recently with a good | prompt and model. None of the images, all via random | seeds, was shitty by any measure. The only images I ended up | deleting had some of the usual AI artifacts I couldn't stand. | hartator wrote: | Highest post on /r/StableDiffusion: | | > You need to agree to share your contact informations to | access this model | | All of this does feel like another scam where the tech is | exaggerated and side hustles are actually the endgame. | LewisVerstappen wrote: | Uhh, I think you're replying to the wrong person? | smusamashah wrote: | My bad, I misclicked somewhere. | candyman wrote: | I agree that the nature of the algorithm allows it to | generate false results, which would not be acceptable in many | domains. But I think there is tremendous potential to improve | what you might call the "baseline" of what information and | diagnosis is available.
There are great doctors out there for | sure but in many places there are close to zero doctors, or | worse, bad doctors. While it's a very complex domain there | are much simpler parts of it. A child with a cancer tumor in | their brain is a very different case than someone who has a | rash and a headache. There's a great deal of regulation as | well that will come into play here so it's going to take a | while. I know Google has a whole LLM that is using only | healthcare data that they have been working on for a while. | sorokod wrote: | The kindest thing one could say is that there is massive cherry | picking going on, which is borderline dishonest. | modeless wrote: | If the cherry picking is less effort than painting a | picture from scratch then I wouldn't say it's dishonest. | sorokod wrote: | It is fine if one says that they tried various prompt | tweaks and out of twenty attempts (for example) here is | the best. | | To present the best prompt/response while not disclosing | that it is a result of trial and error is a different | thing altogether. | mensetmanusman wrote: | Even with cherry picking, it's still amazing a digital | 'plant' was invented that has any cherries at all. | shanebellone wrote: | "massive cherry picking" | | I think the coined phrase is "prompt engineering". | | Side note, where's the eye roll emoji hiding? | whateveracct wrote: | reminds me of the BTC pivot from "decentralized currency" | to "store of value" | | goalposts | BoorishBears wrote: | If you understand what an LLM is, chain of thought isn't | something you eye roll at. | shanebellone wrote: | If you understood what an LLM is not, you might... | jasondigitized wrote: | This is the exact opposite experience I have had with code. | While it isn't perfect, its ability to scaffold code is a | huge productivity booster. Example. Tailwind. "How can I | highlight a parent container when I hover on any of its | children using Tailwind".
JavaScript "How can I merge two | JavaScript objects and remove any duplicates". | logifail wrote: | > JavaScript "How can I merge two JavaScript objects and | remove any duplicates" | | (This is a genuine question) Is it hard to find a solution | to that question on the web by using a search engine? | catlifeonmars wrote: | Yeah this question is confusing without more context, | because that is already how JavaScript objects work by | default. If you assign multiple times to one key, only | the latest assignment is preserved. It's more complicated | (though not much more) to merge two JavaScript objects | while _preserving_ duplicates. | ericmcer wrote: | It would be difficult to search because that is a weird | question. If you asked a more practical question like | "merge two arrays and remove duplicate values" you would | find tons of exact results. | lm28469 wrote: | Yeah it's very good at answering basic questions which have | been answered 50+ times on stack overflow &co that's for | sure. | | I use it for my side projects, for tech I have no | experience in, and it works very well, because I know what | I want, I know that it is possible and I just need it to | vomit the boilerplate to save me 5 google searches | | For my day job it's next to useless, and if your day job | can already be automated by chatgpt I have bad news for you | 2devnull wrote: | "if your day job can already be automated by chatgpt" | | If they can automate the work of a physician, who exactly | is safe? Low skill labor, maybe, for awhile. | meroes wrote: | The remote physician misdiagnosed my X-ray last time. | That physician is easily automated out and possibly for | the benefit of the patients not just costs. The other | staff involved, like the NP, X-ray tech, assistant, are | fine for a lot longer. | lm28469 wrote: | > If they can automate the work of a physician, who | exactly is safe? 
| | But they can't, it's like saying your 15cl stove-top | Italian coffee maker is replacing a Starbucks-tier coffee | machine | | If the only metric you account for is a boolean "it makes | coffee", then sure; if you actually implement it you'll | notice things falling apart in the first 10 minutes. | smusamashah wrote: | Agree with GPT but with StableDiffusion you are only | partially correct. Visit /r/StableDiffusion to see the stuff | people are making. People share prompt, seed, model settings | etc and you can reproduce the exact same thing. | | I do agree that it's bad at following the prompt exactly. It | will produce most of the things you mentioned in the prompt | but not necessarily in the same fashion you asked for. I | don't agree that produced images are mostly shitty, just | visit that subreddit. | Sharlin wrote: | > I don't agree that produced images are mostly shitty, | just visit that subreddit. | | SD is definitely very good in the right hands, and it's a | little unfair to expect to be able to get instant good | results without any skill. It's honestly pretty crazy that | we now have things like ChatGPT and SD - and people are | already calling them crap because they don't work perfectly | and their productive use actually requires some skill! | | But r/StableDiffusion, or any public gallery, is obviously | one giant selection effect. 99.9% of attempts could be crap | and the 0.1% would still be enough to fill a subreddit. | smusamashah wrote: | Sort by new and you will get the idea of crap that people | post on a public forum. Default sorting only shows you | the popular/better content. | | To give you a general idea of what percentage can be good | images: I recently made a lot of wallpapers to cycle | through daily using SD. I found a good prompt, a good | model, and let it generate a bunch of images continuously | for a few hours.
| | None of the images were shitty (they were all random | seeds); the only images I discarded had artifacts I didn't | like or couldn't keep my eyes away from. With SD you | can't just give it the prompt "beautiful landscape" | and expect it to give you a beautiful landscape. It | won't. You'll get shitty images and might get a few | pleasing ones. You must tune your prompt to get good | results. | sabellito wrote: | Is there something about the premise of the study or its method | that you feel is not good? | | After reading the article, what you wrote doesn't seem to make | much sense. | ceejayoz wrote: | > Is there something about the premise of the study or its | method that you feel is not good? | | Yes. Down in the limitations section of the study (https://jamanetwork.com/journals/jamainternalmedicine/fullar...): | | > evaluators did not assess the chatbot responses for | accuracy or fabricated information | | That is a... _significant_ issue with the methodology of the | study. | andreagrandi wrote: | I did not read this particular article, I was explaining that | from my own experience, all these articles telling how great | ChatGPT is seem to be made up because my experience (and from | what I read I'm not alone) is completely opposite. Maybe it's | not able to solve the type of questions I ask? Fine. But it's | not how ChatGPT is presented most of the time. | raincole wrote: | > I did not read this particular article | | It costs you zero dollars to not post an irrelevant comment | then. I really wonder how you justify this "I didn't read | the article, but I have a very, very strong opinion | (straight up calling it made up) on it" behavior. The | internet is rotting people's brains I guess. | andreagrandi wrote: | I've now read the article and I'm not changing my | opinion. ChatGPT can do some things very well and people | tend to hype those things and claim it's better than | humans rather than recognising its limits.
And again, you | are still missing the point of my comment despite having | read it, so I'm out of patience. Maybe try to ask ChatGPT | to explain what I meant :) | notahacker wrote: | To be fair, the opinion is thoroughly justified by the | article, which might have been more honestly titled | _Physicians' Reddit comments shorter than ChatGPT | responses; relative accuracy unknown_... | Sunhold wrote: | Not really. A team of licensed health care professionals | rated "the quality of information provided". | notahacker wrote: | And unsurprisingly, the average 52-word Reddit comments | [isolated from the context of other comments] didn't | provide very much information compared with a much more | verbose chatbot. The relevance of the ChatGPT response to | the actual patient condition remains unknown. | | This is relevant to the real world of primary care only | if your sole access to a medical professional is | Reddit... | andreagrandi wrote: | And it costs you zero to ignore my comment, especially | if you don't understand it. Other people seem to have | understood what I meant and posted constructive | responses, you didn't, but honestly it's not my problem. | raincole wrote: | Ah, I see. You just don't realize that your comment is | not relevant to the original article... (of course not, | because you didn't read it) | jprete wrote: | Their first example of a good ChatGPT answer - about bleach | in the eye - feels like copypasted SEOified liability-proof | WebMD copy. Every medical site has that crap and it's useless | once you have a moderately difficult question. | | N.B. as well: If someone thinks they have bleach in their | _eye_ and can still open their eyes enough to write a Reddit | post, much less read through ChatGPT's extremely long answer, | they're almost certainly fine. | usrusr wrote: | It quite literally writes whatever sounds about right. Which is | certainly very impressive if you happen to assess by exactly | the same metric...
It's more artificial overconfidence than | artificial intelligence. | broast wrote: | Your experience matches mine, other than I still find it | extremely useful regardless of errors | allisdust wrote: | My experience has been the polar opposite with GPT-4. As long as I | structure my thoughts and present it with what needs to be done | - not like a product manager but like a development lead - it | spits out stuff that works on the first try. It also writes code | with a lot of best practices baked in (like better error | handling, comments, descriptive names, variable | initialization). | | Sometimes this presenting of the problem to it means I spend | anywhere from 5-10 mins actually writing the points down that | describe the requirement - which would result in a working | component/module (UI/backend). | | We have been trialing GPT-4 in my company and unfortunately | almost everyone's experience is more along the lines of yours than | mine. I know it shouldn't, but honestly it frustrates me a lot | when I see people complain that it doesn't work :). It | definitely works but it depends on the problem domain and | inputs. Often people forget that it has no other context about | the problem than just the input you are providing. It pays to | be descriptive. | mensetmanusman wrote: | I wonder which personality profiles interact with it best. | Probably some function of which abstract layers people start | with when thinking. | croes wrote: | How do you do that for patients? | SanderNL wrote: | I'd say keep quiet and even join them in their incessant | whining, meanwhile building your skills. Use this advantage. | stainablesteel wrote: | there's definitely an art to asking the questions, likely | because of subtle differences in how a lot of people | communicate in writing.
| | NLP can recognize alt accounts of individuals on places like | HN and reddit, but a person would probably need to study the | comments pretty hard to determine the same thing, it's not | natural for people imo but it seems to be the foremost aspect | of any kind of model that's processing human writing. | catlifeonmars wrote: | LLMs currently have this problem where they will give | confident-sounding responses to prompts where they are | lacking enough context. Humans are built to read that as | accuracy. It's wholly a human interface problem. | rep_lodsb wrote: | _Some_ humans find that style of response infuriating, but | apparently we are in the minority. | | It's almost like "AI hacking people's brains" turned out to | happen accidentally, and a huge number of supposedly smart | people are getting turned into mindless enthusiasts by | nothing more than computer-generated bullshit. | wouldbecouldbe wrote: | Code, yeah: if you know its pitfalls you can get it correct | pretty fast. But I just don't believe it gives better answers | than doctors, it makes silly mistakes that signal it doesn't | understand things deeply. | | I'd only believe it if they actually trained ChatGPT on those types | of tests specifically. | | Not the actual dynamic nature of dealing with patients & | lawsuits. | asimpletune wrote: | I always tell people that if they want to learn more about the | hype to just try and use it to do actual work. Almost no one | ever does, but when they do it becomes almost immediately clear | how limited it is and what it is and isn't good for. | vidarh wrote: | I've had ChatGPT build a whole website for a project for me | (HTML, CSS, backend code for signups, login, payments) | bamboozled wrote: | I guess I've done similar things with Django in a very | short amount of time.
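[Editor's note: returning to the JavaScript merge question raised earlier in the thread ("How can I merge two JavaScript objects and remove any duplicates"), a minimal sketch with hypothetical objects. As catlifeonmars pointed out, spread-merging already drops duplicate keys (the later value wins); it is preserving duplicates that takes extra code:]

```javascript
// Merging two objects: with spread syntax, duplicate keys resolve
// automatically -- the right-hand object's value wins.
const a = { name: "widget", price: 10 };
const b = { price: 12, inStock: true };

const merged = { ...a, ...b }; // { name: "widget", price: 12, inStock: true }

// Preserving duplicates is the part that needs code: collect every
// value seen for each key into an array.
function mergePreserving(x, y) {
  const out = {};
  for (const [k, v] of [...Object.entries(x), ...Object.entries(y)]) {
    (out[k] ??= []).push(v);
  }
  return out;
}

// mergePreserving(a, b) -> { name: ["widget"], price: [10, 12], inStock: [true] }
```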
| vidarh wrote: | Just typing what I had it write for me, at a typing | speed well above average, would take many times as long, | and I've done enough sites over the years to know I would | not keep up that kind of typing speed, because I'd need | to check things and go back and forth. | | Put another way: People who don't pick up these tools and | learn how to be effective at them will increasingly be at | a significant performance disadvantage compared to people at | their skill level who do pick them up. | hammyhavoc wrote: | What you did is nothing a template/boilerplate couldn't | otherwise provide. | | You're trusting that it is secure and sensible in its | output too. | vidarh wrote: | You know what it wrote for me then? No, it's not. It | filled in custom logic per specifications. | | But even _if_ it was just boilerplate, I'd have had to | apply it, and that takes time to do. I've started dozens | of projects over my 28 years in this industry - I have a | very good idea of how long it takes both me and a typical | developer to do what I've had it do, and it's _far | faster_. | | And no, I'm not trusting it to be "secure or sensible" at | all, no more than I trust a developer. But what it has | produced, _all of which I've reviewed_ the same way I | would code delivered by a developer, has overall been of | good quality. That does not mean free of bugs, any more | than human developers write flawless code on the first try, | but it does mean I've had it write cleaner code than a | whole lot of people I've worked with who'd take many times | as long and cost a hell of a lot more. | asimpletune wrote: | Yeah, IDK, of all the things an AI can do for us, code | generation seems to be the one I'm actually least | interested in. It's anecdotes like this that sort of | reinforce that feeling.
| | I would much rather have a "rubber ducky" that I can try | and explain my thought process to, and then it can try to | question me and poke holes in my thinking. I think my | expectations of AI are pretty realistic and I don't | really expect it to ever be *thinking* in the way I | associate with the word, not with today's SoA at least. | In that respect I'm just not particularly interested in | it generating code, but that also may come down to our | individual preferences for how we write code. | | At any rate, my issue is that it fails at the "rubber | ducky" position I mentioned earlier. It's not really able | to follow a train of thought that I have, in a reasonably | competent way, and every time I have tried to do work | with it I just end up feeling silly for | anthropomorphizing something that I know isn't really | thinking, even if only for a second. Just my $0.02 though, | I'm glad so many people seem to like it and am happy for | them. | hammyhavoc wrote: | Well, post the source code and prompts in a GitHub repo | then. Let's have a proper look at the code it output. | hammyhavoc wrote: | Specify the work that you know it excels at first-hand, then | list projects/tasks you have completed using it. | | How do you know if they don't? Do you expect them to report | back to a throwaway comment on HN? The sentiment is shifting | on HN, that much is clear, just like it shifted with crypto. | jstanley wrote: | It's good at writing. It's not good at knowing facts. If you | give it all the relevant facts and ask it to do a writeup in | a certain style it does a better first draft than a typical | human, and in a lot less time. | | If you ask it what the facts are, it just gives you a load of | nonsense. | Abroszka wrote: | I use it almost every day for work. It has mostly replaced | Google for me. A lot more convenient. Now whenever I use | Google it's more or less just to look up the address of a | specific site.
| robby_w_g wrote: | I tried phind after seeing it linked here a few times. It | felt like it was a slower Google search with extra steps. | The sources it used for the answers were the sources I | would have found by searching "site:stackoverflow.com [my | question]". It did distill the information decently well, | but I'm skeptical it properly pulls in the context that | the comment replies to the questions/answers provide | hammyhavoc wrote: | Given how much it hallucinates, that's one very scary echo | chamber to be in where you trust info reinterpreted by an | algorithm instead of just reading it yourself. Yikes. | Abroszka wrote: | It rarely hallucinates for me. But you need to know what | it's capable of and how to use it to work effectively | with it. You can't use it well if you think it's an all- | knowing sentient AI. | hammyhavoc wrote: | How would you know how frequently it hallucinates if it | has "mostly replaced Google" for you? Are you fact- | checking all your queries? This is a very strange self- | inflicted echo chamber. | Abroszka wrote: | Because the code it generates is working? The recipes it | gave to me are also delicious and I don't even have to | read someone's life story on a blog before getting to the | recipe. | | Not really sure why you are so fixated on the echo | chamber thing. We are on the internet! The biggest echo | chamber humanity has ever built. | hammyhavoc wrote: | Functional or delicious ≠ | accurate to the source | material. | | Because it's a layer of abstraction, mate. One known to | get things wrong, because it's an _LLM_. If I write a | post about Richard Stallman's opinions on paedophilia | and Jeffrey Epstein (https://en.wikipedia.org/wiki/Richard_Stallman#Controversies), | and it incorrectly tells you that Stallman associated | directly with Epstein, or is a paedophile himself, that | would not be accurate to the source. | | At least with a Google search result you can go more | directly to the source.
If the scientific method is | getting to the truth, why on Earth would you put an | obstacle in front of it? | | If someone tells me x, am I going to believe them? No. | So, why would I believe an LLM if it isn't presenting | sources to me that are 1:1 in accuracy to the information | it presents to me? | Kamq wrote: | > Functional or delicious ≠ accurate to the source | material. | | Alright, but most people aren't looking for accuracy | relative to the source material. | | They're looking for code that does a specific thing, or | food that tastes a certain way. | | > If the scientific method is getting to the truth, why | on Earth would you put an obstacle in front of it? | | Possibly because they aren't doing science. They may be | doing software development or baking instead. | hammyhavoc wrote: | Yup, and it hallucinates plenty with dev too, even over | basic stuff like an NGINX config. | | Given that it hallucinates in particular over | measurements/specs/stats, I'd be extremely sceptical of | taking a _recipe_ from it, whether that's generated and | original or coming from a known source. | | Baking requires very specific measurements; the slightest | mistake and it won't turn out well in most cases. Again, | why go via an LLM and not a search engine to the actual | _source_? It makes zero sense, especially if it only | returns text and you can't see what the recipe produces | if it's an existing thing. | Kamq wrote: | > Again, why go via an LLM and not a search engine to the | actual source? | | I believe the argument presented (possibly in a separate | thread) was that search engines have degraded to the | point where what they show you is worse than LLM output. | kaba0 wrote: | Well, Google has become utterly bad at its job -- I fail to | find sites I remember verbatim quotes from, so expert | Google-usage is no longer a possibility. It will gladly | leave out any of your important keywords, even if you add | quotes around them, absolutely useless.
Sure, the average | person will search for "how old is X" not "x age", but for | more complex queries the first form is not a good fit. | | That said, I can't really use ChatGPT as a search engine, | but I did plug it into a self-hosted Telegram bot and I do | ask it some basic questions from time to time - Telegram is | a good UI for it. | Abroszka wrote: | Yeah, I agree. Google did a lot of work to make ChatGPT | useful. It's clearly worse than it used to be. | | I usually need someone to explain something to me. And I | used Google before to land on a site where I could find | the explanation (e.g. how to use a library). ChatGPT can | explain most things I need and I can skip Google and the | other sites. But it's indeed not a search engine; if you | need factual information then your best bet is to find | the documentation, articles, databases, etc. | amelius wrote: | For us to get a better understanding of how well this tech | works I suggest ChatGPT becomes integrated in HN in this way: | it generates 1 response per comment; the responses written by | the AI are clearly marked as such (e.g. different color); the | user can turn them off; these comments can be up/down voted and | the votes can be seen by any user; of course users can reply to | the generated comments. | ziml77 wrote: | I've seen a lot of positivity on the output of ChatGPT for | coding tasks in my workplace. And it does seem to have some use | in that area. But there is just no way in hell it's replacing a | human in its current state. | | If you ask it for boilerplate or for something that's a basic | combination of things it's seen before, it can give you | something decent, possibly even usable as-is. But as soon as | you step into more novel territory, forget it. | | There was one case where I wanted it to add an async method to | an interface as a way of seeing if it "understood" the | limitations of covariant type parameters in C# with regard to | Task<T>. It did not.
I replied explaining the issue and it | actually did come back with a solution, but it wasn't a good | solution. I told it very specifically that I wanted it to | instead create a second interface for holding the async method. | It did that but made the original mistake despite my message | about covariance still being within the context fed back in for | generating this response. I corrected it again, but the output | from that ended up being so stupid I stopped trying. | | And at no point was it actually doing something that's very | important when given tasks that are not precisely specified: | ask me questions back. This seems equally likely to be a | problem for one of these language models replacing a doctor. It | doesn't request more context to better answer questions, so the | only way to know it needs more is if you already know enough to | be able to recognize that the output doesn't make sense. It | basically ends up working like a search engine that can't | actually give you sources. | afro88 wrote: | Is it everyone else that's wrong or.... | | > if it doesn't know well a person it simply makes up things | | Asking it for factual information about a subject can be a bit | hit/miss depending on the subject. Better to use Bing Chat, | because it will use info from the web to inform the response | | > if I ask code examples or scripts, most of the time they are | wrong, I need to fix them, they contain obsolete syntax etc... | | How wrong? More wrong than having a junior or mid-level | developer contributing code? | | Think about it a different way: you just gained an assistant | developer that writes mostly correct code in seconds. Big time | saver. | | Also: if you want it to use a particular code style etc, give | it few-shot examples. | | > if I'm asking a question, I'm expecting being asked for more | context if the subject is not clear | | Then you need to tell it that in your prompt: "if the subject | isn't clear, ask me some clarifying questions.
Don't respond | with your answer until I have answered your clarifying | questions first". Or: "ask me 3 clarifying questions before | answering" to force it to "consider" how well it "knows" the | subject first. | | ChatGPT isn't an AI in the sci-fi sense of the word. It's a | language model that needs to be prompted the right way to get | the results you want. You will get a feel for that the more you | use it. | jasondigitized wrote: | This. Its ability to stub code is a huge productivity gain. | Paste code. Test it. Tweak it. Done. | raincole wrote: | And even if ChatGPT is always 100% wrong with code, I still | fail to see how it is relevant to this particular article. | | The article compares verified responses on r/AskDocs (yeah, a | subreddit) and those from ChatGPT. That's it. How is its | coding capability even remotely relevant? It's like saying | "Excel is bad at editing photos, so it must be bad | spreadsheet software as well." | isaacremuant wrote: | > Is it everyone else that's wrong or.... | | This is disingenuous. OP is right. ChatGPT is mostly | inaccurate and contextless by nature. | | It always produces something confidently so it's very easy to | think it's the right answer. | | Now, it can be extremely useful for any task that is | easily verifiable and you don't know the syntax, how to | approach something, etc. Because any decent software | developer can use it as part of the prototyping or | brainstorming and get wherever they want to go in | coordination with ChatGPT. | | What you can't do is assume it's "the expert". You are the | expert and you're the intelligent one, and the chat generates | potentially useful things. | | That's on verifiable stuff.
On other things it can be so | laughably bad it's impressive how much "everyone" (as in | articles, hype, HN users who downvote criticism as being from | luddites) pushes it as something that it is not: an AI that can | think and be relied upon and any inaccuracy will just "get | better with time", as opposed to "the model, and also the | politics around it, doesn't have a path to get to that | idealisation that is being sold". | | It's a tool, but it's usually sold as better than it is, like | in this case, presumably with the intent of relying upon it | to save cost in some key integration point. The problem is | that it mostly won't work, and comparing it to bad humans or | worse integrations doesn't show the fundamentally low ceiling | that it has. | | I think people with authoritarian mindsets (and I don't mean | left or right, but inherent trust in authority) easily want | this to be a source of truth they can magically use, but | there's no path for that to be true. Just to appear true. | kaba0 wrote: | > How wrong? More wrong than having a junior or mid level | developer contributing code? | | Yes. The failure modes of a human and ChatGPT are nothing | alike -- I am far more experienced in spotting beginner | mistakes in code reviews than seemingly good, but actually | illogical, bullshit code generated by LLMs. | | I have never had it produce non-trivial, novel code that was | correct, so I mostly use it as a search engine instead. | hammyhavoc wrote: | I can only conclude that both the junior and senior were | inappropriate hires if people really think the output of an | LLM is anything approaching that of an appropriate human | being. It's saying a lot about where people work, IMO. | | Either that or the problems they're solving are nothing | boilerplate couldn't handle. | zmnd wrote: | Can you give an example? Have you tried asking it to | generate both code and unit tests for it? | [deleted] | hammyhavoc wrote: | Yes. It is inappropriate for most things.
It's an LLM. It | predicts the next word. People are throwing it at all kinds of | problems that are not only inappropriate, but their ability to | assess the quality of its output is questionable. | | E.g., lots of HN users claim to use it for dev or learning new | programming languages. Given the frequency of hallucination and | their Dunning-Kruger complexes in full effect, they don't know | when it's teaching bad information or functions that don't | exist. | | It's an _LLM_. Not an AGI. | SilkRoadie wrote: | I use it at work and it clearly has strengths and weaknesses. | My two use cases are initial research and generating prototype | code. | | I find it very helpful to ask a series of questions and see a | number of examples to get a primer on what to expect with | something. The main benefit over Google or going straight to | the docs is I can start with my specific requirements. I then | dig into the documentation to deepen my understanding. I can | typically move forward with ChatGPT generating some code as a | starting point. | | It can be incorrect or out of date, but combined with my | experience I find myself being more productive with it. | | A weakness I see is complex code requirements. It knows what it | knows. | | I note that you seem a little frustrated with vague or | incorrect responses. It helps to tell ChatGPT the role it | should play. It helps as well to instruct it to ask questions | of you to improve the response. Personally I prefer to tell it | to keep its answers brief; I get fewer walls of text and I can | narrow in on the specific answer I am after more quickly. | 13415 wrote: | To add to this, I tried to use it professionally but the | answers were too general and generic. I suppose it has been | prompted to put things simply, which prohibits it from saying | meaningful things about certain topics. It did give one or two | useful references, though. | geonnave wrote: | My experience is completely different.
I have successfully used | GPT-4 to: | | - write a contract for the sale of my motorcycle: put all | details, names and numbers with labels on a spreadsheet, paste | into the chat and ask for a contract, then edit. | | - learn French: I told GPT "when I write wrong stuff in French, | always let me know and teach me the correct ways". Then, after | a few weeks I asked for a .csv with the stuff that it corrected | so I could import into Anki, which actually worked. | | - coding on a daily basis: I am learning Rust on my new job, so | I ask it things all the time, it helps me a lot. | andreagrandi wrote: | Good to know it's able to do some useful stuff. In my case I | mostly ask Python-related questions, because it's the one I | know best so I can check if the answer is right or wrong. I | will try with different languages, but I will be less capable | of knowing if I got a good answer or not. It may take more | time, but I find the combination of Google + Stack Overflow | more accurate than asking ChatGPT | isaacremuant wrote: | Your experience is not different from the OP's. You're just ok | with the mistakes or are unaware of them because you don't | know how to judge them. | | I both use ChatGPT to boost productivity but also see the | number of mistakes it makes and will keep making, and am | surprised at the extreme denial of anyone who tries to shut | down criticism of the wrong type of hype (the one that sells | something that is not there) | geonnave wrote: | Oh but it is: I find it _useful_. For example, I would not | pay a lawyer for that contract, so having it draft me a | mediocre contract is still better than having no contract. | hammyhavoc wrote: | A mediocre contract can be worse than having no contract, | or as bad as having no contract if it isn't actually | legally binding. Yikes. | hammyhavoc wrote: | Has a lawyer looked over your contract and confirmed that it | is actually legally valid? Can you name said lawyer and the | firm they work at?
| | Would you be willing to publish the contract with sensitive | information redacted? | | All these bold claims are just claims until people come up | with some substance. Talk is cheap and confirmation bias | happens all the time. | TrueSlacker0 wrote: | All you need for a bill of sale is a simple sentence saying | on x date I, xxx, sell vehicle xxx, with VIN xxx, to person xxx. | Then write the driver's license numbers of both parties | and sign. A lawyer is extreme overkill for such a simple | transaction. | ceejayoz wrote: | > All you need for a bill of sale is a simple sentence | saying on x date I, xxx, sell vehicle xxx, with VIN xxx, to | person xxx. | | If you know this, you don't need GPT. | | If you don't know this, you don't have a way to assess | GPT's attempts at a contract. A bill of sale is indeed | simple, but there are a lot of more subtle legal issues | someone might run into in life. | hammyhavoc wrote: | By the same logic, a hallucinating LLM is also overkill | versus just doing the simple task yourself and not | needlessly adding risk to it. | | The point still remains: let's see what the LLM delivered | that the user actually used. Either it's legally binding | and an appropriate use, or it's not fit for purpose. | | Equally, why not an interactive form using conditional | logic? No hallucination possible. Much simpler and more | reliable. | vidarh wrote: | If you expect to be able to ask it an underspecified question | without context and without telling it what role it should take | and how it should act, sure, that often fails entirely. It's | not a productive use of ChatGPT at all. | | If, on the other hand, you actually put together a prompt which | tells it what you expect, the results are _very_ different. | | E.g. I've experimented with "co-writing" specs for small | projects with it, and I'll start with a prompt of the type "As | a software architect you will read the following spec.
If | anything is unclear you will ask for clarification. You will | also offer suggestions for how to improve. If I've left "TODO" | notes in the text you will suggest what to put there." and a | lot more steps, but the key elements are to 1) tell it what role | it should assume - you wouldn't hire someone without telling | them what their job is, 2) tell it what you expect in return, | and what format you want it in if applicable, and 3) if you want it | to ask for clarifications, either ask for that and/or tell it to | follow a back-and-forth conversational model instead of dumping | a large, full answer on you. | | The precise type of prompt you should use will depend greatly | on the type of conversation you want to be able to have. | skepticATX wrote: | This type of usage is rapidly approaching a Clever Hans type | of situation: https://en.wikipedia.org/wiki/Clever_Hans. | | An intelligent agent shouldn't need this type of prompting, | in my opinion. | vidarh wrote: | It's perfectly fine if it's approaching a Clever Hans type of | situation, as long as it produces sufficiently good output | faster than I can produce it manually. | | There are many categories of usage for them, and relatively | "dumb" completion and boilerplate is still hugely helpful. | In fact, probably 3/4 of my use of ChatGPT is for cases where I | have a pretty good idea what it'll output for a given input, | _and that is why I'm using it_, because it saves me | writing and adjusting boilerplate that it can produce | faster. Most of the time _I don't want it to be smart_; I | want it to reliably do almost the same as it's done for | me before, but adjusted to context in a predictable way | (the reason I'll reach for it over e.g. copying something | and adapting it manually). | | We use far dumber agents all the time and still derive | benefits from them. Sure, it'd be nice if it gets smarter, but | it's already saving me a tremendous amount of time.
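The prompt structure described above (assign a role, state what you expect back, and ask the model to request clarification) can be sketched in the OpenAI chat-message format. This is a minimal illustration; the helper function, wording, and example spec are my own assumptions, not the commenter's actual prompt:

```python
# Sketch: build a role-plus-expectations prompt in the standard
# chat-message format ({"role": ..., "content": ...} dicts).
# The resulting list would be passed to a chat-completion API call.

def build_review_prompt(spec_text: str) -> list:
    """Assign a role, state the expected output, and instruct the
    model to ask clarifying questions instead of guessing."""
    system = (
        "As a software architect you will read the following spec. "
        "If anything is unclear you will ask for clarification. "
        "You will also offer suggestions for how to improve. "
        "If I've left TODO notes in the text you will suggest what to "
        "put there. Keep answers brief and follow a back-and-forth "
        "conversational style rather than dumping one large answer."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": spec_text},
    ]

messages = build_review_prompt(
    "Spec: a CLI tool that syncs notes between devices. TODO: storage format."
)
```

The point of keeping this as a reusable function mirrors the cut-and-paste workflow described in the thread: the system message is refined once and reused, while only the user content changes per conversation.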
| danenania wrote: | Yes, I think using it for code that you _could_ write | yourself fairly easily is a sweet spot, since you can | quickly check it over and are unlikely to be fooled by | hallucinations. It can save significant time on typing | out boilerplate, refreshing on API calls and type | signatures, error handling, and so on. | | It's a save-15-minutes-here, 20-minutes-there kind of | thing that can add up to hours saved over the course of a | day. | segh wrote: | Why are people so determined to 'debunk' it? Why not try to | work with it and within its limitations? | hammyhavoc wrote: | Is the agent the LLM or the user who needs an LLM? | bamboozled wrote: | I'm starting to wonder if it's just easier to actually | program? | | Like, I know GPT-4 can generate a bunch of code really | quickly, but is coding in Python using pretty well-known | libraries so hard that it wouldn't just be easier to write the | code yourself? It's super neat that it can do what it does, but on the | other hand, modern editors with language servers are super | efficient too. | sebzim4500 wrote: | If you have years of experience programming and ten minutes | of experience 'prompt engineering' then programming is | probably easier, yes. | vidarh wrote: | If I want it to just write code where I know exactly what I | want, I will have it write code. ChatGPT can write code and | fill in things very quickly when you give it something well | specified. | | My point was that if you just ask it an ambiguous question, | it _will_ return something that is a best guess. It's what | it does. To get it to act the way the person above wants it | to, you need to feed it a suitable prompt first. | | You don't need to write a new set of instructions every | time. When I "co-write" specs with it, I _cut and paste_ a | prompt that I'm gradually refining as I see what works, | and I get answers that fit the context I want.
When I want | it to spit out systemd unit files, I _cut and paste_ a | prompt that works for that. | | The stuff I'm using it for is stuff I couldn't possibly | produce as productively myself, not because it's hard, but | because even typing well above average speed I couldn't | possibly type that fast. | AnIdiotOnTheNet wrote: | My experience is the same, and yet I am not surprised that it | still scores better than the average physician. | ramoz wrote: | I have treatable cancer. | | ChatGPT has been helpful for fairly in-depth research, as well as | answering questions I don't want to bother my oncologist with. | | Actually, I prefer it to be less empathetic and more robotic with | its answers. When it tries to be empathetic it goes beyond what | I've asked it and sprinkles on typical safety-filter BS. | | In contrast, my oncologist could be a bit more empathetic. This | person directs your care, and if they do so in a way that seems | like a black box it can be rather scary. They know enough about | me, with all of their experience, to be a "helpful human" in a | difficult situation. | damontal wrote: | Are you confident that its responses are accurate? I've had it | just make up facts or tack made-up facts onto actual facts. It | will also fabricate references to research papers. | ramoz wrote: | I'm more confident about my understanding of ChatGPT, how to | use it appropriately, and my ability to filter its output or do | expanded research as needed. | dubcanada wrote: | I'm interested in your statements. Do you have an example of | an instance where it made up facts? | | I have yet to experience any of the domain knowledge I have | being incorrect. | notRobot wrote: | I do.
Multiple times I've asked it to summarise plotlines | for TV shows, and when it doesn't know (if, for example, the | show isn't super popular and there isn't a lot of | discussion about it on the internet), it'll just make stuff | up outright and give me something that sounds absolutely | reasonable based on the name of the show but is actually | entirely hallucinated. | | Similarly, today I saw this [0] comment and, in an attempt | to help the user, went to ChatGPT. | | Prompt: | | > _I remember in the 2000s, there was some site using | a decimal IP address (as a single number, not dotted quad) | that had hacking/crypto puzzles. Something with an "Alice | in Wonderland" theme. Does that ring a bell for you?_ | | Response: | | > _I'm sorry, but as an AI language model, I don't have | personal experiences or memories. However, based on your | description, it's possible that you might be referring to | the website known as "Project Wonderland." It was a website | that featured a series of cryptographic puzzles based on | the story of Alice in Wonderland, and it used a decimal IP | address (203.208.46.147) as its URL. The website was active | in the early 2000s and gained a cult following among | enthusiasts of cryptography and puzzle-solving. However, | the website is no longer active today._ | | I got really excited to have found an answer, until through | Google and the Wayback Machine I realised that ChatGPT had just | made this all up and no such website existed at that URL. | | I tried starting a new chat with ChatGPT to ask it about | this "Project Wonderland" website, and it had no idea what I | was talking about. | | [0]: https://news.ycombinator.com/item?id=35748714 | | (I am using ChatGPT 3.5 (March 23, 2023)) | rep_lodsb wrote: | The important bit of context - which ChatGPT completely | missed - is that the address was a single number like | e.g.
http://3520653040 | | The server might even have refused the connection if the HTTP | "Host:" header wasn't in that format, but _as a human, | rather than a language model, I'm not sure about that | and might be misremembering :)_ | nielsole wrote: | GPT-4 responds with Cicada 3301, which as best as I can | tell is a valid solve for your query. | | * 3301 is one of three numbers that had to be added to | get the .com URL * The Wikipedia page cites someone close | to the group with "follow the white rabbit" * Years don't | quite match up, but given that you only asked if it rang a | bell, that is fair enough | dustincoates wrote: | I can. I've always struggled with the difference between | polyptoton and antanaclasis. (Lucky for me, it doesn't come | up very often!) I like what ChatGPT can do, and I've had | some good luck with it, so I asked. | | What I got back was confidently incorrect, and gave me | examples that were neither of the two. (Interestingly, most | of the examples were chiasmus instead.) I tried correcting | it to get better responses, but it kept apologizing and | then giving me the incorrect examples back. | [deleted] ___________________________________________________________________ (page generated 2023-04-29 23:02 UTC)
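The "single number" URL discussed in the thread works because an IPv4 address is just a 32-bit integer: each octet of the dotted quad is one byte of that number. A minimal Python sketch of the conversion (the function names are my own; the example number is the one from the thread, whose dotted form follows from the arithmetic):

```python
# Convert between dotted-quad IPv4 notation and the single-integer
# form that browsers historically accepted in URLs, e.g.
# http://3520653040 instead of http://209.216.230.240

def dotted_to_decimal(addr: str) -> int:
    """Pack the four octets into one 32-bit integer (big-endian)."""
    a, b, c, d = (int(x) for x in addr.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def decimal_to_dotted(n: int) -> str:
    """Unpack a 32-bit integer back into dotted-quad notation."""
    return ".".join(str((n >> s) & 0xFF) for s in (24, 16, 8, 0))

assert dotted_to_decimal("209.216.230.240") == 3520653040
assert decimal_to_dotted(3520653040) == "209.216.230.240"
```

The standard library's `ipaddress.IPv4Address` accepts an integer directly, so `int(ipaddress.IPv4Address("209.216.230.240"))` performs the same packing without hand-rolled bit shifts.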