[HN Gopher] How to fit any dataset with a single parameter ___________________________________________________________________ How to fit any dataset with a single parameter Author : tambourine_man Score : 114 points Date : 2021-09-29 18:50 UTC (4 hours ago) (HTM) web link (arxiv.org) (TXT) w3m dump (arxiv.org) | axegon_ wrote: | Hypothetically you could achieve something similar using this[1]. | Note the word "hypothetically". | | [1] https://github.com/cakenggt/Library-Of-Pybel | caterama wrote: | This is beautiful. The real numbers are uncountable, so seems | intuitive that could approximate any other space with them... | basically a hash function. But this one has an inverse! It's so | cool that someone actually implemented this. | Sharlin wrote: | You don't need uncountability. Any nontrivial interval of | _rational_ numbers contains every finite string that exists, in | fact infinite copies of each! | blauditore wrote: | It's not about uncountability, but simply that their space is | infinite. The same would work with (unbounded) integer numbers. | bflesch wrote: | This sounds like it could explain many ML models ;) | darawk wrote: | I've thought about this for a while. I didn't go to the trouble | of actually inventing the function, but just the fact that it | ought to be possible in principle is sufficient to change your | perspective on some things. Most notably, parameter count based | information criteria. | | AIC, BIC, et al. need to be reformulated for each model in which | they're used, and not all parameters can be treated as fungible. | shmageggy wrote: | Prior work: "One parameter is always enough", Piantadosi (2018) | | https://www.google.com/search?q=one%20parameter%20piantadosi | | Haven't read the posted article but sounds like the same idea and | motivation | ww520 wrote: | Let me guess before reading the article. Infinite precision float | point number? | aappleby wrote: | very silly. :) | cs702 wrote: | Fun read. But if we allow ourselves as much precision as we need, | we don't even need a parameter. Any _constant_ that is a _normal | number_ should suffice. Such constants already contain every | possible sequence of digits you could muster -- i.e., they | already contain every possible dataset. | | EDIT: I replaced "transcendental" with "normal" after reading | Scarblac's comment below: | https://news.ycombinator.com/item?id=28699622 -- many important | transcendental numbers, including p (Pi), are thought to be (but | have not been proven to be!) normal. | KerrickStaley wrote: | It's not true that any transcendental number would work. A | transcendental number is a number which isn't the root of any | polynomial with rational coefficients. This property doesn't | imply that it contains all number sequences. | | It hasn't even been proven that pi contains all number | sequences. | | See https://math.stackexchange.com/questions/216343/does-pi- | cont... for more details. | cs702 wrote: | You're _right_. I edited my comment. Thank you for pointing | it out! | creddit wrote: | This actually isn't known to be true afaict: | https://www.askamathematician.com/2009/11/since-pi-is-infini... | cs702 wrote: | You're _right_. I edited my comment. Thank you for pointing | it out! | Scarblac wrote: | I don't believe just being transcendental is sufficient for | that property. | | E.g. I can't imagine that pi with all the '1' digits replaced | by '2' isn't transcendental, but it clearly doesn't contain | every sequence of digits. | | I think you mean normal numbers. | cs702 wrote: | You're _right_ : normal, not transcendental. | | IIRC, most real numbers are normal, and many important | transcendental numbers are thought to be (but have not been | proven to be) normal. | | I edited my comment. Thank you for the correction! | gnramires wrote: | Right, but any constant doesn't "encode" this information. To | use a normal constant to encode information, you need to encode | the location of the substring of interest. Generally, the | location of the substring of interest needs to require as many | bits as the substring itself (unless there's an intimate | relationship between the number and the substrings of | interest). So arguably it's the location that encodes the data, | the number is irrelevant (and why not encode the data directly | into this location?). | civilized wrote: | Too easy. Now fit any dataset using a substring of digits from | the binary expansion of pi | zwieback wrote: | more digits than I care for | idiotsecant wrote: | First, this is a fun implementation and I love it. | | Second, you could as easily embed an infinite size dataset into | an infinitely long binary string and say that you've reproduced | your dataset with a 'single' parameter! That's sort of what this | is doing, with some extra steps. | hnlmorg wrote: | This reminds me of the fun thought experiment of using | /dev/random as storage. Given an infinite amount of time you'll | find every file you need. | laumars wrote: | Your second point is what I first assumed this paper was doing | before I read it. | cobookman wrote: | Up until ARG_MAX is reached, or you run out of RAM that is... | https://serverfault.com/a/844666 | [deleted] | asimjalis wrote: | The word "single parameter" is doing a lot of work in this | statement. | llimllib wrote: | intentionally: | | > In addition to casting doubts on the validity of parameter- | counting methods and highlighting the importance of complexity | bounds based on Occam's razor such as minimum description | length (that trade off goodness-of-fit with expressive power), | we hope that fa may also serve as entertainment for curious | data scientists | air7 wrote: | This reminds me of a joke idea I read somewhere: You can encode | the entire Encyclopaedia Britannica using a single mark on a | simple stick! | | Just encode the text as a ascii codes after the decimal dot of a | zero. (0.656168.. etc). Then just mark that ratio of the sticks | length and you're done... | alittlesalami wrote: | This is Arithmetic Coding: | https://en.wikipedia.org/wiki/Arithmetic_coding | ohazi wrote: | You jest, but this is almost exactly how arithmetic coding | works. | | If you arrange your symbols and contexts carefully, you can | even use this as a technique for progressive or lossy | compression -- i.e. the more accurately you specify the ratio, | the higher fidelity your result. | thehappypm wrote: | You can do it easier than that -- just make a stick of length | .65168 meters. (Same idea, just the "full 1 meter stick" is | defined elsewhere, not by the length of the stick). | tshaddox wrote: | That's way harder, because you have to cut a stick to write | and have a meter ruler to read. | an1sotropy wrote: | I remember reading about this as a kid in one of Martin | Gardner's books, either "aha!" or "aha! Gotcha" | gh02t wrote: | And of course in a footnote of the documentation for this | encoding "Sufficiently precise measurement of this mark left as | an exercise to the reader." | mjburgess wrote: | I don't see this as a joke, but a radical and important point. | | Reality, in being geometrical, is infinitely informationally | dense (with a discrete conception of information). | | This distinction between geometrical space and time, and | discrete algorithmic computability is unbridgeable. | | And hence there is an extemely firm footing on which to reject: | AI, brain scanning readers, teleporters, etc and most sci-fi | computationalim. | | Almost nothing can be simulated, as in, realised by a merely | computational system. | joshgrib wrote: | I just wanted to add that reality being a non-discrete | geometric thing is still an assumption - we don't know for | sure that it isn't discrete and a lot of quantum stuff points | more toward a discrete reality than a continuous one. | | So assuming continuous space/time and discrete information | then I'd agree, but as far as we know space/time aren't | continuous, but just appear that way to us. It doesn't seem | like we know for sure that it is discrete, but at the least | I'd say it's solid evidence that stuff like brain scanning is | definitely in the realm of possibility. | | This answer from the Physics StackExchange nicely covers how | time/space could appear continuous to us even if they are in | fact discrete at lower levels. Also some interesting | discussion in the other answers | https://physics.stackexchange.com/a/35676 | mjburgess wrote: | I think for my purposes defining continuous = unmeasurably | discrete produces the same results. | | Ie., there is an irreducible geometrical continuity in the | sense that no discontinuity can ever appear. The state | density is maximal. | | via this route we reporduce the same point: | computationalism/simulation'ism' is then just the thesis | that computers qua measurably discrete systems can realise | dense unmeasurable discrete systems. | | This can be shown to be impossible with much the same | argument: spatial and temporal geometrical properties | obtain in virtue of dense discreteness; and fail to obtain | at measureable levels. | | The key property of continuity is its irreducibility to | measurably discrete systems. That irreducibility isn't, | however , limited to continuity . | | Wolfram makes this point about the failures of reductionism | in a perfectly discrete context, ie., that no CA can | compute a CA whose complexity is greater than it can | summarise. | | I prefer to press a continuous angle: our best theories of | all of reality are continuous and geometrical . That energy | levels are discrete in bounded quantum systems has almost | nothing to do with the indispensability of contintuous | mathemtatjcs in every known physical system -- including | that very bounded wavefn | whatshisface wrote: | It's computational again when you allow the computation to be | precise enough for any practical purpose, which is not | infinitely precise. | wallacoloo wrote: | These are weird conclusions. Any attempt to measure "reality" | gives some amount of uncertainty. The only way for this to | lead to the relatively stable experience you perceive is if | those small variations in measurement lead to relatively | small differences in perception. In which case, you can | truncate the resolution of your simulation and still get | plausible results. | | I assure you there are plenty of groups out there simulating | systems that operate with similar densities (but lower | volume) to the brain. | mgraczyk wrote: | I wouldn't say that the distinction is "unbridgeable". Keep | in mind that we send packets over plenty of "geometrical" and | seemingly infinitely dense analog channels all the time, like | electricity over a copper wire or EM waves over the air. | | The "mark on a stick" channel has a capacity like any other | channel. If you're sending just one symbol, you could easily | calculate the information capacity given a desired | probability of bit-error. | | Assuming you can put the mark in exactly the right spot, you | can model the "noise" as a distribution over the values that | the reader will measure. If you model this as `mark + zero- | mean normal distribution` with a known variance, then your | stick is just an AWGN channel. | tbabb wrote: | I think there are at least two things wrong with this take: | | One is that I don't think it follows from the premise that | the continuity of the physical world precludes AI, brain | scanning, etc. Even if the physical world were continuous | (likely not, see below), an arbitrary degree of approximation | could be attained, in principle. At the _very least_ I would | not call the footing "firm". | | The second is that the universe is very likely not continuous | anyway. The Beckenstein bound[1] puts an upper limit on the | number of bits of information a region of space may contain. | If the ruler tickmark were either measured or localized to | the precision required to encode the information, the | information density would cause it to collapse into a black | hole. This would happen once your measurement needs to be | about as precise as a Planck length, which would allow you to | encode about 115 bits of information with your tickmark. | | (This in of itself is independent of the fact that you would | need to construct the ruler out of objects that the universe | permits; your ruler tickmark would need to be made of and | measured with discrete fundamental particles, which by their | very nature are quantized). | | [1] https://en.wikipedia.org/wiki/Bekenstein_bound | oever wrote: | Or use a larger stick. Every doubling of the stick length | gives another bit of information. A stick of one light-year | length would add about 53 bits. | tbabb wrote: | The main point I'm making is that the information density | of the physical world is not unlimited as GP suggests. | | But I think that example just shows how few bits you can | really get out the exercise! | [deleted] | dlivingston wrote: | Even if space and time were continuous (which things like the | Planck length would discredit), there are still discrete | objects in that continuum. | | Elementary particles, for example, are discrete. You could | argue that they have continuous effects vis a vis the EM | field and spatial positioning, but ensemble effects usually | render that irrelevant at large enough scales. | tsimionescu wrote: | I will note that even in QM, both space and time are | considered continuous - the Planck length is just a | smallest measurable distance, but nothing in QM currently | assumes that particles must be separated by an integer | multiple of the Planck length (unlike spins, for example). | | I believe there are some theories of quantum gravity that | do rely on the idea that space-time is quantized in integer | multiples of Planck's length, but these are far from | definitive theories. | | A much more relevant limit in terms of possible information | density is Heisenberg's uncertainty principle, which | essentially puts a limit on the maximum possible precision | for any measurement. | yongjik wrote: | In theory, pi has infinite digits. You could publish a book | of a trillion digits of pi, and you have barely scratched the | surface: in fact you published a precisely 0.00000% of all | digits of pi. | | In practice, you "only" need ~42 digits of pi to draw a | circle spanning the entire known universe (diameter of 8.8 * | 10^26 m) and it will deviate from the ideal circle by less | than the size of a proton (0.8 * 10^-15 m). | | Having a theoretically infinite precision does not mean that | it makes a measurable difference. | Kranar wrote: | Every number has infinite digits or can be made to have | infinite digits for a given representation, but that's not | the same as a number having an infinite amount of | information. PI represents a finite amount of information, | as opposed to say a number like Chaitin's constant which | represents an infinite and irreducible amount of | information: | | https://en.wikipedia.org/wiki/Chaitin%27s_constant | tsimionescu wrote: | This is not just a false idea, but an obviously false one, | contradicted by all the laws of physics. If you were right, | then any finite volume would contain an infinite amount of | information, which would mean it has infinite entropy, | temperature, and energy. | | Also, by the same logic you apply to space, you could say | that time is infinitely divisible, so you could create a | computer which finishes an infinite amount of steps in a | finite amount of time. | mjburgess wrote: | information here, ie log of a probability, is a continuous | notion -- it is real-valued, as not least, log is a | transcendental fn -- | | i specifically said with a /discrete/ conception , geometry | is infinitely dense (of discrete states) | tsimionescu wrote: | In all physical theories, any finite system has a finite | number of _distinguishable_ states. So it is not | infinitely informationally dense, especially when working | with discrete bits of information. | | Not to mention, the finer the distinctions between two | states of a system, the more energy you need to | distinguish them. So, the less impact these differences | can have, unless the system is extraordinarily energetic | (and even then, you end up in fundamental limits of | energy per volume, like the Schwarzschild radius). | | So again, there is no sense in which a finite part of the | universe is universally dense. | | Even worse for your argument, all currently known laws of | physics use computable functions (the randomness in QM | notwithstanding). So, by definition, all known laws of | physics can be simulated by an ideal Turing machine | (again, give or take some randomness in QM, depending on | the interpretation you chose to believe in and on how you | chose to simulate the QM system). | rnhmjoj wrote: | There's an interesting paper[1] that argues that real | numbers aren't physical, for the precise reasons you | stated. That is, only the subset of real numbers that | contain a finite amount of information (like constructive | real numbers) are physically meaningful. | | A consequence of this is that classical physics is not | really deterministic: this is because, in general (ie. | including chaotic systems), the evolution of a system | depends on a set of initial conditions that are specified | by full real numbers, impossible to measure with finite | precision. So, the use of real numbers is hiding the | indeterminism in the initial conditions, much like the | function in this article encodes a dataset in a single | parameter. | | [1]: https://arxiv.org/abs/1803.06824 | tsimionescu wrote: | I think the precise definition of what exactly is | physically meaningful is more interesting a bit, because | pi is just as 'real' as 1 in some sense. You can't any | more precisely measure 1m than pi meters. A perfect | square doesn't exist any more or less precisely than a | perfect circle. | | Basically, the universe can just as well be approximated | in two different ways. One, it's all straight lines, and | anything that looks like a circle/curve is actually a | piece of a veeery many sided polygon if you look closely | enough at it. Two, it's all curves of some curvature, and | anything that looks like a straight line is actually a | segment of a veeery large circle. The first perspective | corresponds to taking the rational numbers as physical. | The second one corresponds to taking pi as physical (and | then it's unclear whether e is also physical, or which | other transcendental numbers are, but 1 is definitely un- | physical then). | | Probably the constructive framework is the best way to | express this concept, just starting from whatever | constant we chose to define as the unit. | Ginden wrote: | Existence of analog reality is quite problematic, because it | opens can of worms - eg. hypercomputation is possible. | gnramires wrote: | I used to think there must be something 'special' to brains | to distinguish us from computers. There isn't. Brains encode | finite amounts of information (quantum mechanics seems to | imply bounded local information). We are a huge information | network ourselves -- that's what consciousness is (with some | added bits like self-identity and various particulars | structures that dictate the _character_ of our experience). | | But that doesn't mean brains aren't special -- it means | brains are special _and_ computers are special. Even more: it | seems to imply computers, AI, etc. can be as special as | ourselves, sentient, and perhaps even more special in ways we | haven 't realized yet. | | It's difficult to even imagine a physical theory with | unbounded local information. It seems to open the possibility | to crazy things like hypercomputation, which do not seem very | well defined. (For example: every 1-1/n seconds, (n>1) from | now, flip a switch ON/OFF. At what state will the switch be | at t>1s? An at exactly t=1s?) | | Note: while information and information flow itself bounded | (hence no hypercomputation), I don't know of any obvious | objections to continuous time. (I'm not sure the continuity | of time has any profound implications) | mjburgess wrote: | the issue isn't brains | | no algorithm running on a cpu can move a muscle -- | | it is precisely that movement is a spatiotemporal property | which means no turing machine can realise it | | movement isn't a symbolic operation | tsimionescu wrote: | This is such a strange claim to make, I can't even tell | what you could have meant. We are surrounded by | simplistic mechanical computations moving muscles , and | have been for more than a hundred years, since the | industrial revolution at least. | gnramires wrote: | What do you suppose those computers are doing, then? ;) | | https://www.youtube.com/watch?v=tF4DML7FIWk | | https://www.youtube.com/watch?v=tfb6aEUMC04&t=115s | idiotsecant wrote: | Reality isn't quite infinitely dense though, it's hard to | imagine how you could encode data geometrically over a | distance smaller than the planck length, for example. Reality | just has a very, very fine resolution. | Rapzid wrote: | A quick Google shows 6.25e+34 plank lengths in a meter. So, | seems impossible to measure and make the mark. | | Edit: As a fraction of appended ascii codes. | petercooper wrote: | And if the mark were instead the removal of a single atom | in a 40cm length, the resolution would be even poorer.. | able to encode a single 30 bit number by my very rough | calculation. | dnautics wrote: | > And hence there is an extemely firm footing on which to | reject: AI | | This is like saying it's impossible to build a water pump | without solving the quantum mechanical interactions that | govern water flow. | [deleted] | WithinReason wrote: | How do you know reality is not computational? | meowphius wrote: | That is from the novel "Hard-Boiled Wonderland and the End of | the World" by Haruki Murakami[0]. | | [0]: https://everything2.com/title/Encyclopedia+on+a+toothpick | joiguru wrote: | One theoretical objection to this idea is that distances | measured have an accuracy limit of Planck length (1.616e-35 | meters). So if your number needs more precision then "just | mark" step can't be done. | pveierland wrote: | My shot at the calculation for fun :-) | | Stick encoding with graphite resolution (0.335 * 10^-9 meter) | [1]: "Uti" (31 bits -> 3 UTF8 characters) | | Stick encoding with Planck resolution (1.616255 * 10^-35 meter) | [2]: "Utility ought " (115 bits -> 14 UTF8 characters) | | Complete first sentence: "Utility ought to be the principal | intention of every publication." [3] | | It appears that this storage scheme may not be suited towards | the safekeeping of literature. | | [1] | https://www.wolframalpha.com/input/?i=floor%28floor%28log_2%... | | [2] | https://www.wolframalpha.com/input/?i=floor%28floor%28log_2%... | | [3] https://digital.nls.uk/encyclopaedia- | britannica/archive/1441... | Nition wrote: | "Simply" scale up the stick to your desired minimum level of | precision! | michaelcampbell wrote: | This reminds me of the "worst way to ask a user for a telephone | number" UI, after one UI forced the user to select each digit in | a 0-9 dropdown. | | The specific one I'm thinking of is just spit out a scrollable | numeric string of Pi, and make the user scroll until their phone | number was the digits of Pi that matched it. | | My (7 digit) phone number occurs after digit position 9_000_000, | and occurs 21 times in the first 200_000_000 digits. | albertzeyer wrote: | I think it is still not proven that Pi contains all possible | strings. | | https://math.stackexchange.com/questions/216343/does-pi-cont... | throwawaytemp27 wrote: | I wonder if you gave away your phone number there? Or how many | 7 digit numbers match those criteria? | a-dub wrote: | this made me smile. :) | | oh, uncountable infinities. | kzrdude wrote: | How to have fun with mpmath :) | jldugger wrote: | >The purpose of this paper is to show that all the samples of any | arbitrary dataset X can be reproduced by a simple differentiable | equation: fa(x) = sin^2 (2^xt arcsin [?]a) | | So... use a PRNG? | andi999 wrote: | Doesn't this exclude negative numbers? ___________________________________________________________________ (page generated 2021-09-29 23:00 UTC)