[HN Gopher] Meditron: A suite of open-source medical Large Langu...
       ___________________________________________________________________
        
       Meditron: A suite of open-source medical Large Language Models
        
       Author : birriel
       Score  : 58 points
       Date   : 2023-11-28 19:01 UTC (3 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | gardenfelder wrote:
       | Paper: https://arxiv.org/abs/2311.16079
        
       | 2023throwawayy wrote:
       | This is the only part of AI that actually terrifies me.
       | 
       | I've run into people on this very site who use LLMs as a doctor,
       | asking it medical questions and following its advice.
       | 
       | The same LLMs that hallucinate court cases when asked about law.
       | 
       | The same LLMs that can't perform basic arithmetic in a reliable
       | fashion.
       | 
        | The same LLMs that can't maintain internally consistent logic.
       | 
        | People are following the medical "advice" that comes out of these
        | things. It will lead to deaths, make no mistake.
        
         | rolisz wrote:
         | Following the advice of chatgpt without double checking? Bad
         | idea.
         | 
         | Using ChatGPT as a starting point? Sounds really good to me,
         | been there, done that.
        
           | twayt wrote:
            | Yeah, I think this is the most reasonable take.
           | 
           | You can always check information before believing or acting
           | on it.
           | 
            | However, it's often super difficult even to get started and
            | to know what you should be reading more about.
        
         | leetharris wrote:
         | The reality is that the majority of things people want to go to
         | the doctor for are not serious.
         | 
         | If this can help with that, I am all for it.
        
         | bilsbie wrote:
          | On the contrary, modern medicine terrifies me. Something like
         | this might be our only hope.
        
           | geek_at wrote:
            | ChatGPT and that Amazon Healthcare thing will be more
            | efficient than the US healthcare system, which is kind of
            | crazy.
        
           | firebot wrote:
            | It should. Most medicine is just extracting plant chemicals,
            | then modifying and concentrating them, so that what nature
            | provided can be patented.
        
         | bilsbie wrote:
         | Wait until you hear about search engines ...
        
         | techwizrd wrote:
          | I used to work on a healthcare AI chatbot startup before
          | modern LLMs, back when models like BERT were state of the
          | art. We were definitely worried about the accuracy and
          | reliability of the medical advice even then, and we had
          | clinicians working closely with us to make sure the dialog
          | trees were trustworthy. I work in aerospace medicine and
          | aviation safety now, and I constantly encounter inadvisable
          | use of LLMs and a lack of effective evaluation methods
          | (especially for domain-specific LLMs).
         | 
         | I appreciate the advisory notice in the README and the
         | recommendation against using this in settings that may impact
         | people. I sincerely hope that it's used ethically and
         | responsibly.
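          | 
          | For illustration, the dialog trees were, in spirit, close to
          | this kind of structure (a minimal sketch in Python with a
          | made-up triage flow; an assumption on my part, not our
          | actual code):
          | 
          |     # Each node asks one question and maps answers to the
          |     # next node or a terminal outcome; the idea is that
          |     # clinicians can review every edge before it ships.
          |     TREE = {
          |         "start": {"question": "Do you have chest pain?",
          |                   "answers": {"yes": "urgent", "no": "fever"}},
          |         "fever": {"question": "Fever above 39 C?",
          |                   "answers": {"yes": "see_gp",
          |                               "no": "self_care"}},
          |         "urgent": {"outcome": "Call emergency services."},
          |         "see_gp": {"outcome": "Book a GP appointment."},
          |         "self_care": {"outcome": "Rest and monitor symptoms."},
          |     }
          | 
          |     def run(node="start"):
          |         while "outcome" not in TREE[node]:
          |             ans = input(TREE[node]["question"] + " (yes/no) ")
          |             # Re-ask the same question on unrecognized input.
          |             node = TREE[node]["answers"].get(
          |                 ans.strip().lower(), node)
          |         print(TREE[node]["outcome"])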
        
         | ryandvm wrote:
          | Sure, but we already have 250,000 deaths PER YEAR in the US
          | due to medical errors
          | (https://pubmed.ncbi.nlm.nih.gov/28186008/).
         | 
         | I don't think people should trust LLMs completely, but let's be
         | real, they shouldn't trust humans completely either.
        
           | blipmusic wrote:
           | Isn't that whataboutism at its best? Those two things are
           | completely unrelated.
        
             | mannyv wrote:
             | No, it's showing that the risk of errors exists even
             | without AI.
             | 
             | AI doesn't necessarily make that risk higher or lower a
             | priori.
             | 
              | Plus, if you knew how much of current medical practice
              | exists without evidence, you wouldn't be worrying about AI.
        
               | blipmusic wrote:
                | Maybe it's ok to worry about both? Not trusting
                | "arbitrary thing A" does not logically make "arbitrary
                | thing B" more trustworthy. I do realise that these
                | models aim to (incrementally) represent collective
                | knowledge and may get there in the future. But if you
                | worry about A, why not worry about B, which is based
                | on A?
        
             | robertlagrant wrote:
             | It's not whataboutism at its best, no. Just as with self-
             | driving cars, medical AIs don't have to be perfect, or even
             | to cause zero deaths. They just have to improve the current
             | situation.
        
         | davidjade wrote:
         | Here's a recent (yesterday) example of a benefit though.
         | 
          | I tried unsuccessfully to search for an ECG analysis term
          | (EAR, or "EA Run") using Google, DDG, etc. There was no magic
          | set of quoting, search terms, etc. that could surface what
          | those terms meant. "Ear" is simply too common a word.
          | 
          | ChatGPT, however, took the context of my question (an ECG
          | analysis) and led me right away to what EAR meant.
          | 
          | I wasn't seeking medical advice though, just a better search
          | engine with context. So there are clearly benefits here too.
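          | 
          | For illustration, the same context-first query is easy to
          | script. A minimal sketch, assuming the openai Python client
          | (v1 interface) and an API key in the environment; not what I
          | actually typed into the chat UI:
          | 
          |     from openai import OpenAI
          | 
          |     client = OpenAI()  # reads OPENAI_API_KEY from the env
          | 
          |     # Supplying the ECG context up front keeps the rare
          |     # abbreviation from being drowned out by the common word.
          |     resp = client.chat.completions.create(
          |         model="gpt-4",
          |         messages=[
          |             {"role": "system",
          |              "content": "You explain ECG/Holter report terms."},
          |             {"role": "user",
          |              "content": "On an ECG analysis report, what do "
          |                         "the abbreviations 'EAR' and 'EA Run' "
          |                         "stand for?"},
          |         ],
          |     )
          |     print(resp.choices[0].message.content)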
        
           | nhinck2 wrote:
           | Ectopic Atrial Rhythm?
        
         | BrandoElFollito wrote:
          | On the other hand, your MD is going to look for the obvious,
          | the statistically relevant, or the currently prominent
          | disease.
          | 
          | But if they were presented with a 99% probability of flu and
          | a 1% probability of wazalla, along with the fact that testing
          | for wazalla just means pinching your ear, you might actually
          | get correctly diagnosed sometimes.
          | 
          | It is not that MDs are incompetent; it is just that when
          | wazalla was briefly mentioned during their studies, they
          | happened to be in the toilet and missed it. Flu was mentioned
          | 76 times because it is common.
         | 
          | Disclaimer: I know medicine from "House, MD", but I also
          | witnessed a miraculous diagnosis of my father just because
          | his MD happened to read an obscure article.
          | 
          | (For the story: he was diagnosed with a worm-induced illness
          | that turned up once or twice a year in France in the 80s. The
          | worm was from a beach in Brazil, and my dad had never
          | travelled to the Americas. He was kindly asked to provide a
          | blood sample to help research in France, which he did.
          | Finally, the drug to cure him was available in one pharmacy
          | in Paris and one in Lyon. We expected a hefty cost (though it
          | is all covered in France); it cost 5 francs or so. But my
          | brother and I were told to keep an eye on him, as he might
          | become delusional and try to jump out the window. The poor
          | man could hardly blink before we were on him :) Ah, and the
          | pills were 2 cm wide and looked like they were for an
          | elephant. And he had 5 or so to swallow.)
        
         | firebot wrote:
         | What's to be terrified about? Humans also hallucinate. Doctors
         | are terrible at their jobs.
        
       | 094459 wrote:
        | Is this open source? It says the model is under the Llama
        | license, which is NOT open source.
        
       | firebot wrote:
        | I like that this is a pun on Metatron.
        
       | vessenes wrote:
        | Very brief summary of the paper: there aren't any new technical
        | ideas here, just a 70B model finetuned on curated medical
        | papers, with self-consistency chain-of-thought (CoT) sampling
        | at inference time.
        | 
        | Results at 70B: better than GPT-3.5, better than non-finetuned
        | Llama, worse than GPT-4.
        | 
        | The 70B model gets a human passing score on MedQA (passing: 60,
        | Meditron: 64.4, GPT-3.5: 47, GPT-4: 78.6).
        | 
        | TLDR: Interesting, not crazy revolutionary, almost certainly
        | needs more training. Stick with GPT-4 for your free,
        | unlicensed, dangerous AI doctor needs.
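        | 
        | For the curious, self-consistency CoT sampling boils down to
        | sampling several reasoning chains at a nonzero temperature and
        | majority-voting the final answer. A minimal sketch in Python,
        | assuming a hypothetical generate(prompt, temperature) function
        | that returns the model's text; not the paper's actual code:
        | 
        |     from collections import Counter
        |     import re
        | 
        |     def self_consistency(question, generate, k=5,
        |                          temperature=0.8):
        |         # Sample k chain-of-thought completions and majority-
        |         # vote the final multiple-choice letter (MedQA-style,
        |         # A-E answer options).
        |         prompt = question + "\nLet's think step by step."
        |         votes = []
        |         for _ in range(k):
        |             text = generate(prompt, temperature=temperature)
        |             m = re.search(r"answer is \(?([A-E])\)?", text,
        |                           re.IGNORECASE)
        |             if m:
        |                 votes.append(m.group(1).upper())
        |         # The most common final answer wins; None if no chain
        |         # produced a parsable answer.
        |         if not votes:
        |             return None
        |         return Counter(votes).most_common(1)[0][0]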
        
       ___________________________________________________________________
       (page generated 2023-11-28 23:00 UTC)