[HN Gopher] Hutter Prize for compressing human knowledge
___________________________________________________________________

Hutter Prize for compressing human knowledge

Author : kelseyfrog
Score  : 45 points
Date   : 2023-09-13 22:03 UTC (56 minutes ago)

(HTM) web link (prize.hutter1.net)
(TXT) w3m dump (prize.hutter1.net)

| TheRealPomax wrote:
| Q: Why do you restrict to a single CPU core and exclude GPUs?
| A: The primary intention is to limit compute and memory to some
| generally available amount in a transparent, easy, fair, and
| measurable way. 100 hours on one i7 core with 10GB RAM seems to
| get sufficiently close to this ideal
|
| Sorry, who are these people that don't have a GPU? Even laptops
| have GPUs. Why would you spend 100 hours on an "i7" (which
| generation? 4790K, or the six times faster 13700K?) CPU when you
| can achieve orders of magnitude better performance on a consumer
| GPU that literally everyone has access to?

  | lucb1e wrote:
  | Note that the competition is close to 20 years old
  |
  | ...though I also had a GPU in 2006, so idk. Then again, you
  | need to define _something_ as reference hardware and it
  | doesn't really matter what it is. Better compression should
  | win out over less-good compression no matter if you run both
  | on a 100-core system or a 1-core system, I think?

    | TheRealPomax wrote:
    | In the category "then update your FAQ, you've had many,
    | many years to do so" =D
    |
    | (not to change the rules, but to explain why the rules
    | _haven't_ changed. Level playing fields are a worthwhile
    | pursuit)

  | caseyavila wrote:
  | I do think it's interesting that recent submissions use
  | nearly the entire 50 hours. I wonder how much better people
  | could do if faster hardware were allowed.

| dang wrote:
| Related. Others?
|
| _Saurabh Kumar's fast-cmix wins EUR5187 Hutter Prize Award_ -
| https://news.ycombinator.com/item?id=36839446 - July 2023 (1
| comment)
|
| _Hutter Prize Submission 2021a: STARLIT and cmix (2021)_ -
| https://news.ycombinator.com/item?id=36745104 - July 2023 (1
| comment)
|
| _Hutter Prize Entry: Saurabh Kumar's "Fast Cmix" Starts 30 Day
| Comment Period_ - https://news.ycombinator.com/item?id=36154813 -
| June 2023 (5 comments)
|
| _Hutter Prize_ - https://news.ycombinator.com/item?id=33046194 -
| Oct 2022 (3 comments)
|
| _Hutter Prize_ - https://news.ycombinator.com/item?id=26562212 -
| March 2021 (48 comments)
|
| _500'000EUR Prize for Compressing Human Knowledge_ -
| https://news.ycombinator.com/item?id=22431251 - Feb 2020 (1
| comment)
|
| _Hutter Prize expanded by a factor of 10_ -
| https://news.ycombinator.com/item?id=22388359 - Feb 2020 (2
| comments)
|
| _Hutter Prize: up to 50k EUR for the best compression algorithm_
| - https://news.ycombinator.com/item?id=21903594 - Dec 2019 (2
| comments)
|
| _Hutter Prize: Compress a 100MB file to less than the current
| record of 16 MB_ - https://news.ycombinator.com/item?id=20669827
| - Aug 2019 (101 comments)
|
| _New Hutter Prize submission - 8 years since previous winner_ -
| https://news.ycombinator.com/item?id=14478373 - June 2017 (1
| comment)
|
| _Hutter Prize for Compressing Human Knowledge_ -
| https://news.ycombinator.com/item?id=7405129 - March 2014 (24
| comments)
|
| _Build a human-level AI by compressing Wikipedia_ -
| https://news.ycombinator.com/item?id=143704 - March 2008 (4
| comments)

| slashdev wrote:
| I think the mistake here is to require lossless compression.
|
| Humans and LLMs only do lossy compression. I think lossy
| compression might be more critical to intelligence. The ability
| to forget, to change your synapses or weights, is crucial to
| being able to adapt to change.
  | version_five wrote:
  | Yeah, it makes no sense to say it's inspired by intelligence
  | and then require lossless, which is definitionally rote work
  | and not intelligent.

    | whimsicalism wrote:
    | Not true, a smart model could be really good at lossy
    | compression and then you only have to store a small delta
    | to make it lossless.

      | ClassyJacket wrote:
      | I'm no mathematician but I don't believe this is true.
      | Lossless information encoding requires _all_ the original
      | information to be present.

        | AnotherGoodName wrote:
        | Arithmetic coding allows you to make a prediction and
        | only provide bits for correction.
        |
        | Have the decompressor predict the next data based on
        | the outcome so far (a statistical prediction of the
        | next data will be lossy, as it won't always be
        | correct). If the prediction is correct, you need to
        | spend very little to confirm that. If it's incorrect,
        | you'll need to spend data to correct it. Arithmetic
        | coding is the best way to make this work.
        |
        | It's also been used by all winning entries of the
        | Hutter Prize so far.

        | glitchc wrote:
        | Or at least reproducible. It could still be compressed.

      | vladf wrote:
      | What

        | AnotherGoodName wrote:
        | That's literally arithmetic coding, which is used by
        | all winning entries in the above so far.

  | sytelus wrote:
  | Humans can do lossy or lossless. There are plenty of people
  | who can recite the entire Bible or Koran flawlessly.

    | kadoban wrote:
    | That's true, but it seems unlikely that that's a
    | particularly important part of intelligence. The vast
    | majority of people do _not_ do that type of memorization;
    | are they still intelligent?

    | anonylizard wrote:
    | Many can recite the Koran flawlessly; it's short, and
    | reciting it is heavily encouraged in education through rote
    | repetition.
    |
    | Far fewer can recite the Bible; it's many times longer.
    |
    | LLMs can also recite the Bible and Koran flawlessly, given
    | how frequently the text appears in their training material.
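
The "lossy predictor plus a small delta" idea raised by whimsicalism and AnotherGoodName can be illustrated with a minimal sketch. This is not arithmetic coding and not the method of any Hutter Prize entry: it stores a whole symbol for every wrong guess, whereas an arithmetic coder would spend a fractional number of bits depending on how probable the model thought each symbol was. `Order1Predictor`, `encode`, and `decode` are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict


class Order1Predictor:
    """Guess the next character from what has followed the previous one so far."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # previous char -> counts of followers
        self.prev = ""

    def predict(self):
        followers = self.counts[self.prev]
        if not followers:
            return None  # nothing seen yet in this context, so no guess
        return followers.most_common(1)[0][0]

    def update(self, actual):
        self.counts[self.prev][actual] += 1
        self.prev = actual


def encode(text):
    """Emit None where the model guessed right, else the actual character."""
    model = Order1Predictor()
    corrections = []
    for ch in text:
        corrections.append(None if model.predict() == ch else ch)
        model.update(ch)
    return corrections


def decode(corrections):
    """Rebuild the text by running the same model and applying the corrections."""
    model = Order1Predictor()
    out = []
    for c in corrections:
        ch = model.predict() if c is None else c
        out.append(ch)
        model.update(ch)
    return "".join(out)


if __name__ == "__main__":
    text = "abababababab"
    corrections = encode(text)
    assert decode(corrections) == text  # the round trip is lossless
    print(sum(c is not None for c in corrections), "corrections for", len(text), "chars")
```

The predictor alone is lossy (it guesses wrong some of the time), but because the decoder runs the identical model, the stream of corrections is enough to recover the input exactly. The better the model, the fewer corrections remain.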
    | TheRealPomax wrote:
    | This is more the equivalent of asking humans to create an
    | exact copy of the text, typesetting and all, including the
    | publishing information, page numbers, and exact linebreaks.
    | Not just recite the text, which would be a lossy encoding
    | of the original.
    |
    | Humans are _terrible_ at lossless encoding of information,
    | it's what we invented machines for =D

    | Supply5411 wrote:
    | And there are humans that can jump 8ft in the air. Doesn't
    | mean it's correct to say that "humans can jump 8ft in the
    | air." Very few people are regurgitating verbatim
    | information.

  | mik1998 wrote:
  | Lossy text compression has little utility.

    | JumpCrisscross wrote:
    | > _Lossy text compression has little utility_
    |
    | You're describing every book you've ever read and learned
    | from.

| TheAlchemist wrote:
| I mean, come on man. For some reason, the nerd in me sees this
| and immediately adds it to my 'I really need to do this' list.
|
| Just memories of old times doing some similar (albeit probably
| less challenging) competitions on TopCoder almost a decade ago,
| and also the curiosity to see how I would manage it now, with
| experience. Given that the current scores are also very far
| from what they estimate the lower bound to be, this is really
| interesting! The prize is however very misleading: per their
| own FAQ, the total possible payout is ~223k euros.
|
| Definitely not thanking you for the hours I will put into this!

| omoikane wrote:
| 500000 EUR is the prize pool. Each winner has to gain at least
| a 1% improvement over the previous record to claim a prize that
| is proportional to the improvement. Getting the full 500000 EUR
| prize requires a 100% improvement (i.e. compressing 1GB to zero
| bytes).

  | lainga wrote:
  | Ah... I had professors who graded like that

  | phobotics wrote:
  | Does it, or does it just require a 1% improvement over the
  | last winner? As opposed to a static additional 1% improvement
  | vs the initial best "score".
    | omoikane wrote:
    | It's 1% over the last winner. The latest winner has a total
    | size of 114156155, compared to the previous winner's
    | 115352938. The payout was
    | 500000 * (1 - 114156155 / 115352938) = 5187
    |
    | (see table near "Baseline Enwik9 and Previous Records
    | Enwik8")

  | bigyikes wrote:
  | Probably if you succeed at this, 500,000 will be worthless to
  | you

    | sytelus wrote:
    | Why? How does this improvement translate to more financial
    | gains?

      | Eduard wrote:
      | because with that knowledge, you will be able to
      | decompress 0 dollars into infinite dollars, which the
      | storage mafia will pay you for not publishing your
      | breakthrough that makes them obsolete.
___________________________________________________________________
(page generated 2023-09-13 23:00 UTC)
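
The payout arithmetic quoted by omoikane can be checked with a short sketch. It assumes only what the thread states: a 500000 EUR pool, a minimum 1% improvement over the previous record, and a payout proportional to the improvement. The function name `payout_eur` and its parameters are hypothetical, and the official rules may round or pay out differently.

```python
PRIZE_POOL_EUR = 500_000   # pool size as stated in the thread
MIN_IMPROVEMENT = 0.01     # 1% threshold as stated in the thread


def payout_eur(previous_size, new_size,
               pool=PRIZE_POOL_EUR, threshold=MIN_IMPROVEMENT):
    """Return the payout for shrinking the record from previous_size to new_size bytes."""
    improvement = 1 - new_size / previous_size
    if improvement < threshold:
        return 0.0  # below the minimum improvement, nothing is paid
    return pool * improvement


if __name__ == "__main__":
    # Sizes quoted in the thread for the previous and latest winners.
    print(int(payout_eur(115352938, 114156155)))  # 5187
```

Under this rule, claiming the whole pool would need `new_size == 0`, which is the point omoikane makes about compressing 1GB to zero bytes.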