[HN Gopher] Hutter Prize for compressing human knowledge
       ___________________________________________________________________
        
       Hutter Prize for compressing human knowledge
        
       Author : kelseyfrog
       Score  : 45 points
       Date   : 2023-09-13 22:03 UTC (56 minutes ago)
        
 (HTM) web link (prize.hutter1.net)
 (TXT) w3m dump (prize.hutter1.net)
        
       | TheRealPomax wrote:
       | Q: Why do you restrict to a single CPU core and exclude GPUs?
       | A: The primary intention is to limit compute and memory to some
       | generally available amount in a transparent, easy, fair, and
       | measurable way. 100 hours on one i7 core with 10GB RAM seems to
       | get sufficiently close to this ideal
       | 
       | Sorry, who are these people that don't have a GPU? Even laptops
       | have GPUs. Why would you spend 100 hours on an "i7" (which
       | generation? 4790K or six times faster 13700k?) CPU when you can
       | achieve orders of magnitude better performance on a consumer GPU
       | that literally everyone has access to?
        
         | lucb1e wrote:
         | Note that the competition is close to 20 years old
         | 
         | ...though I also had a GPU in 2006, so idk. Then again, you
         | need to define _something_ as reference hardware and it
         | doesn't really matter what it is. Better compression should win out
         | over less-good compression no matter if you run both on a
         | 100-core system or a 1-core system, I think?
        
           | TheRealPomax wrote:
           | In the category "then update your FAQ, you've had many, many
           | years to do so" =D
           | 
           | (not to change the rules, but to explain why the rules
           | _haven't_ changed. Level playing fields are a worthwhile
           | pursuit)
        
         | caseyavila wrote:
         | I do think it's interesting that recent submissions use nearly
         | the entire 50 hours. I wonder how much better people could do
         | if faster hardware was allowed.
        
       | dang wrote:
       | Related. Others?
       | 
       |  _Saurabh Kumar's fast-cmix wins EUR5187 Hutter Prize Award_ -
       | https://news.ycombinator.com/item?id=36839446 - July 2023 (1
       | comment)
       | 
       |  _Hutter Prize Submission 2021a: STARLIT and cmix (2021)_ -
       | https://news.ycombinator.com/item?id=36745104 - July 2023 (1
       | comment)
       | 
       |  _Hutter Prize Entry: Saurabh Kumar's "Fast Cmix" Starts 30 Day
       | Comment Period_ - https://news.ycombinator.com/item?id=36154813 -
       | June 2023 (5 comments)
       | 
       |  _Hutter Prize_ - https://news.ycombinator.com/item?id=33046194 -
       | Oct 2022 (3 comments)
       | 
       |  _Hutter Prize_ - https://news.ycombinator.com/item?id=26562212 -
       | March 2021 (48 comments)
       | 
       |  _500'000EUR Prize for Compressing Human Knowledge_ -
       | https://news.ycombinator.com/item?id=22431251 - Feb 2020 (1
       | comment)
       | 
       |  _Hutter Prize expanded by a factor of 10_ -
       | https://news.ycombinator.com/item?id=22388359 - Feb 2020 (2
       | comments)
       | 
       |  _Hutter Prize: up to 50k EUR for the best compression algorithm_
       | - https://news.ycombinator.com/item?id=21903594 - Dec 2019 (2
       | comments)
       | 
       |  _Hutter Prize: Compress a 100MB file to less than the current
       | record of 16 MB_ - https://news.ycombinator.com/item?id=20669827
       | - Aug 2019 (101 comments)
       | 
       |  _New Hutter Prize submission - 8 years since previous winner_ -
       | https://news.ycombinator.com/item?id=14478373 - June 2017 (1
       | comment)
       | 
       |  _Hutter Prize for Compressing Human Knowledge_ -
       | https://news.ycombinator.com/item?id=7405129 - March 2014 (24
       | comments)
       | 
       |  _Build a human-level AI by compressing Wikipedia_ -
       | https://news.ycombinator.com/item?id=143704 - March 2008 (4
       | comments)
        
       | slashdev wrote:
       | I think the mistake here is to require lossless compression.
       | 
       | Humans and LLMs only do lossy compression. I think lossy
       | compression might be more critical to intelligence. The ability
       | to forget, change your synapses or weights, is crucial to being
       | able to adapt to change.
        
         | version_five wrote:
         | Yeah it makes no sense to say it's inspired by intelligence and
         | then require lossless which is definitionally rote work and not
         | intelligent.
        
           | whimsicalism wrote:
           | Not true, a smart model could be really good at lossy
           | compression and then you only have to store a small delta to
           | make it lossless.
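
The "lossy model plus small delta" idea above can be sketched in a few lines. This is only a toy illustration: the "model" here is just lowercasing (a stand-in assumption, not anything a Hutter Prize entry does), and the function names are hypothetical. It also assumes the lossy version has the same length as the original, which holds for ASCII text.

```python
def lossy_encode(text: str) -> str:
    """Toy lossy step (an assumption for illustration): drop capitalization."""
    return text.lower()


def make_delta(original: str, lossy: str) -> list[tuple[int, str]]:
    """Record only the positions where the lossy version got it wrong."""
    return [(i, a) for i, (a, b) in enumerate(zip(original, lossy)) if a != b]


def restore(lossy: str, delta: list[tuple[int, str]]) -> str:
    """Apply the corrections to recover the original text exactly."""
    chars = list(lossy)
    for i, ch in delta:
        chars[i] = ch
    return "".join(chars)


original = "The Hutter Prize uses Wikipedia"
lossy = lossy_encode(original)
delta = make_delta(original, lossy)

# The lossy text plus the delta is lossless, and the delta is small
# whenever the lossy guess is mostly right.
print(restore(lossy, delta) == original, len(delta))
```

The better the lossy guess, the fewer corrections the delta has to carry, which is the point being made in the comment.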
        
             | ClassyJacket wrote:
             | I'm no mathematician but I don't believe this is true.
             | Lossless information encoding requires _all_ the original
             | information to be present.
        
               | AnotherGoodName wrote:
               | Arithmetic coding allows you to make a prediction and
               | only provide bits for correction.
               | 
               | Have the de-compressor predict the next data based on the
               | outcome so far (a statistical prediction of next data
               | will be lossy as it won't always be correct). If the
               | prediction is correct you need to spend very little to
               | confirm that. If it's incorrect you'll need to spend data
               | to correct it. Arithmetic coding is the best way to make
               | this work.
               | 
               | It's also been used by all winning entries of the Hutter
               | prize so far.
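
The cost of "confirming" versus "correcting" a prediction can be made concrete. With an ideal entropy coder, a symbol the model assigned probability p costs about -log2(p) bits, and an arithmetic coder gets close to that total. The sketch below only computes that ideal cost; it is not an actual arithmetic coder, and the adaptive order-0 byte model (with add-one smoothing) is an illustrative assumption, far simpler than what the winning entries use.

```python
import math
from collections import Counter


def code_length_bits(p: float) -> float:
    """Ideal cost, in bits, of a symbol that was predicted with probability p."""
    return -math.log2(p)


def adaptive_cost_bits(data: bytes) -> float:
    """Total ideal cost of `data` under a simple adaptive order-0 byte model.

    Encoder and decoder can both maintain the same counts, so the model
    itself never needs to be transmitted.
    """
    counts = Counter()
    total = 0.0
    for seen, byte in enumerate(data):
        # Add-one smoothed probability from the bytes seen so far.
        p = (counts[byte] + 1) / (seen + 256)
        total += code_length_bits(p)
        counts[byte] += 1
    return total


# A confident, correct prediction is nearly free; a surprise is expensive.
print(code_length_bits(0.99), code_length_bits(0.01))

# Highly predictable input costs far less than 8 bits per byte.
print(adaptive_cost_bits(b"a" * 1000), 8 * 1000)
```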
        
               | glitchc wrote:
               | Or at least reproducible. It could still be compressed.
        
               | vladf wrote:
               | What
        
             | AnotherGoodName wrote:
             | That's literally arithmetic coding which is used by all
             | winning entries in the above so far.
        
         | sytelus wrote:
         | Humans can do lossy or lossless. There are plenty of people who
         | can recite the entire Bible or Koran flawlessly.
        
           | kadoban wrote:
           | That's true, but it seems unlikely that that's a particularly
           | important part of intelligence. The vast majority of people
           | do _not_ do that type of memorization, are they still
           | intelligent?
        
           | anonylizard wrote:
           | Many can recite the Koran flawlessly; it's short, and memorizing
           | it through rote repetition is heavily encouraged in education.
           | 
           | Far fewer can recite the Bible; it's many times longer.
           | 
           | LLMs can also recite the Bible and Koran flawlessly, given
           | how frequently the texts appear in their training material.
        
           | TheRealPomax wrote:
           | This is more the equivalent of asking humans to create an
           | exact copy of the text, typesetting and all, including the
           | publishing information, page numbers, and exact linebreaks.
           | Not just recite the text, which would be a lossy encoding of
           | the original.
           | 
           | Humans are _terrible_ at lossless encoding of information; it's
           | what we invented machines for =D
        
           | Supply5411 wrote:
           | And there are humans that can jump 8ft in the air. Doesn't
           | mean it's correct to say that "humans can jump 8ft in the
           | air." Very few people are regurgitating verbatim information.
        
         | mik1998 wrote:
         | Lossy text compression has little utility.
        
           | JumpCrisscross wrote:
           | > _Lossy text compression has little utility_
           | 
           | You're describing every book you've ever read and learned
           | from.
        
       | TheAlchemist wrote:
       | I mean, come on man. For some reason, the nerd in me sees this
       | and immediately adds it to my 'I really need to do this' list.
       | 
       | Just memories of old times doing some similar (albeit probably less
       | challenging) competitions on TopCoder almost a decade
       | ago, and also the curiosity to see how I would manage it now,
       | with experience. Given that the current scores are also very far
       | from what they estimate the lower bound to be, this is really
       | interesting! The prize is however very misleading - per their
       | own FAQ - the total possible payout is ~223k euros.
       | 
       | Definitely not thanking you for the hours I will put into this!
        
       | omoikane wrote:
       | 500000 EUR is the prize pool. Each winner has to gain at least 1%
       | improvement over previous record to claim a prize that is
       | proportional to the improvement. Getting the full 500000 EUR
       | prize requires a 100% improvement (i.e. compressing 1GB to zero
       | bytes).
        
         | lainga wrote:
         | Ah... I had professors who graded like that
        
         | phobotics wrote:
         | Does it or does it just require 1% improvement over the last
         | winner? As opposed to a static additional 1% improvement vs the
         | initial best "score".
        
           | omoikane wrote:
           | It's 1% over the last winner. The latest winner has a total
           | size of 114156155, compared to previous winner of 115352938.
           | The payout was 500000 * (1 - 114156155 / 115352938) = 5187
           | 
           | (see table near "Baseline Enwik9 and Previous Records
           | Enwik8")
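
The payout rule described above can be written out as a small function. This is a minimal sketch based only on what the thread states (a 500000 EUR pool, a payout proportional to the relative improvement, and a 1% minimum improvement over the previous record); the function and constant names are hypothetical, and the official rules may have details not captured here.

```python
PRIZE_POOL_EUR = 500_000   # pool size as stated in the thread
MIN_IMPROVEMENT = 0.01     # at least 1% better than the previous record


def improvement(new_size: int, previous_size: int) -> float:
    """Relative size reduction compared to the previous record."""
    return 1 - new_size / previous_size


def payout_eur(new_size: int, previous_size: int,
               pool: float = PRIZE_POOL_EUR) -> float:
    """Payout proportional to the improvement, or 0 if below the threshold."""
    gain = improvement(new_size, previous_size)
    if gain < MIN_IMPROVEMENT:
        return 0.0
    return pool * gain


# Figures quoted in the thread: latest record vs. previous record.
print(int(payout_eur(114156155, 115352938)))  # 5187
```

Under this rule the full pool is only paid out for `new_size == 0`, which is the "compressing 1GB to zero bytes" case mentioned earlier in the thread.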
        
         | bigyikes wrote:
         | Probably if you succeed at this, 500,000 will be worthless to
         | you
        
           | sytelus wrote:
           | Why? How does this improvement translate to more financial
           | gains?
        
             | Eduard wrote:
             | because with that knowledge, you will be able to decompress
             | 0 dollars to infinite dollars which the storage mafia will
             | pay you for not publishing your breakthrough in making them
             | obsolete.
        
       ___________________________________________________________________
       (page generated 2023-09-13 23:00 UTC)