[HN Gopher] The Secret Sauce behind 100K context window in LLMs:...
       ___________________________________________________________________
        
       The Secret Sauce behind 100K context window in LLMs: all tricks in
       one place
        
       Author : T-A
       Score  : 73 points
       Date   : 2023-06-17 21:40 UTC (1 hour ago)
        
 (HTM) web link (blog.gopenai.com)
 (TXT) w3m dump (blog.gopenai.com)
        
       | treprinum wrote:
       | Not training full attention might score nicely in benchmarks, but
       | humans will instantly notice that the whole spectrum is not
       | represented. What you are proposing basically gets rid of
       | infrequent combinations, but those happen in the real world and
       | will be missing from whatever your LLM produces.
        
       | upthestake_s wrote:
       | Unfortunately, these "scientific" discoveries are where I get off
       | the merry-go-round.
       | 
       | This is not computer science; it's applied math/statistics, and
       | it's uninteresting in the same way (but with more powerful
       | applications) that big data is... And it never will be computer
       | science.
       | 
       | So when the AI powers that be need me to debug or build anything
       | on top of this, I don't want to hear nonsense excuses about why it
       | doesn't work and what I'm doing wrong.
       | 
       | Q: "Why is this Hadoop query taking 3 days to complete?????"
       | Answer: "I don't care and never will".
       | 
       | I will not ask why AI does or does not work; I simply don't care.
       | 
       | ML and AI will not be something I adopt until I retire, besides
       | asking them to generate some boilerplate.
       | 
       | I wish the "boy geniuses of nonsense they don't understand" all
       | the best.
        
       | version_five wrote:
       | https://archive.md/bw2cN
       | 
       | (It's a Medium page that doesn't load for me)
        
         | knodi123 wrote:
         | whereas archive.md returns
         | "ERR_SSL_VERSION_OR_CIPHER_MISMATCH"!
         | 
         | Sometimes I wish there was a way to tell our browsers "I really
         | don't care about SSL on this page, honestly, and I'm qualified
         | to tell when it matters."
        
           | version_five wrote:
           | Hmmm.. hopefully between the two of them most can read it.
           | The archive works for me.
        
           | james-revisoai wrote:
           | As far as I know, Firefox still allows this for any expired
           | certificate which at least has correct domain details and
           | authority (e.g. it once worked, which some dev should
           | validate).
           | 
           | An SSL version or cipher mismatch can have other causes. For
           | example, the server might be responding with an HTML page
           | over HTTP that your browser is interpreting as HTTPS, or vice
           | versa, such as when the developers run HTTP for local dev and
           | HTTPS for prod and something gets confused.
        
           | londons_explore wrote:
           | I wish the browser would just load the page without cookies
           | whenever that happens (i.e., automatically switch to
           | incognito mode for just that tab whenever security can't be
           | guaranteed).
           | 
           | Also, perhaps disable keyboard entry so you can't type a
           | password in without acknowledging that you probably aren't
           | visiting the site you think you are.
        
             | atherton33 wrote:
             | There's probably heightened risk of having an unpatched
             | vulnerability exploited if you keep processing the payload
             | past the point where you suspect a bad actor is on the
             | other end.
        
           | sam_bristow wrote:
           | I believe you can type "thisisunsafe" on the SSL error page
           | in Chrome to bypass any warnings.
        
       | flakiness wrote:
       | The primary source is the linked Twitter thread. I wonder how
       | credible this source is. (I'm not familiar with the norms of the
       | ML community; they seem to be more Twitter-heavy than other
       | parts of tech.)
        
         | Lerc wrote:
         | I only gave it a quick skim, but it seems to match what I have
         | learned so far. That said, I'm also learning from things that
         | people said online, so there remains the possibility of common
         | misconceptions.
         | 
         | The ALiBi stuff just makes sense to me. I don't understand why
         | sinusoidal positional encoding was used initially. I assume
         | there were good reasons for it, but I haven't seen an
         | explanation (pointers to one appreciated).
        
         | ShamelessC wrote:
         | Can you clarify what you're referring to?
        
       ___________________________________________________________________
       (page generated 2023-06-17 23:00 UTC)