[HN Gopher] How to fit any dataset with a single parameter
       ___________________________________________________________________
        
       How to fit any dataset with a single parameter
        
       Author : tambourine_man
       Score  : 114 points
       Date   : 2021-09-29 18:50 UTC (4 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | axegon_ wrote:
       | Hypothetically you could achieve something similar using this[1].
       | Note the word "hypothetically".
       | 
       | [1] https://github.com/cakenggt/Library-Of-Pybel
        
       | caterama wrote:
       | This is beautiful. The real numbers are uncountable, so seems
       | intuitive that could approximate any other space with them...
       | basically a hash function. But this one has an inverse! It's so
       | cool that someone actually implemented this.
        
         | Sharlin wrote:
         | You don't need uncountability. Any nontrivial interval of
         | _rational_ numbers contains every finite string that exists, in
         | fact infinite copies of each!
        
         | blauditore wrote:
         | It's not about uncountability, but simply that their space is
         | infinite. The same would work with (unbounded) integer numbers.
        
       | bflesch wrote:
       | This sounds like it could explain many ML models ;)
        
       | darawk wrote:
       | I've thought about this for a while. I didn't go to the trouble
       | of actually inventing the function, but just the fact that it
       | ought to be possible in principle is sufficient to change your
       | perspective on some things. Most notably, parameter count based
       | information criteria.
       | 
       | AIC, BIC, et al. need to be reformulated for each model in which
       | they're used, and not all parameters can be treated as fungible.
        
       | shmageggy wrote:
       | Prior work: "One parameter is always enough", Piantadosi (2018)
       | 
       | https://www.google.com/search?q=one%20parameter%20piantadosi
       | 
       | Haven't read the posted article but sounds like the same idea and
       | motivation
        
       | ww520 wrote:
       | Let me guess before reading the article. Infinite precision float
       | point number?
        
       | aappleby wrote:
       | very silly. :)
        
       | cs702 wrote:
       | Fun read. But if we allow ourselves as much precision as we need,
       | we don't even need a parameter. Any _constant_ that is a _normal
       | number_ should suffice. Such constants already contain every
       | possible sequence of digits you could muster -- i.e., they
       | already contain every possible dataset.
       | 
       | EDIT: I replaced "transcendental" with "normal" after reading
       | Scarblac's comment below:
       | https://news.ycombinator.com/item?id=28699622 -- many important
       | transcendental numbers, including p (Pi), are thought to be (but
       | have not been proven to be!) normal.
        
         | KerrickStaley wrote:
         | It's not true that any transcendental number would work. A
         | transcendental number is a number which isn't the root of any
         | polynomial with rational coefficients. This property doesn't
         | imply that it contains all number sequences.
         | 
         | It hasn't even been proven that pi contains all number
         | sequences.
         | 
         | See https://math.stackexchange.com/questions/216343/does-pi-
         | cont... for more details.
        
           | cs702 wrote:
           | You're _right_. I edited my comment. Thank you for pointing
           | it out!
        
         | creddit wrote:
         | This actually isn't known to be true afaict:
         | https://www.askamathematician.com/2009/11/since-pi-is-infini...
        
           | cs702 wrote:
           | You're _right_. I edited my comment. Thank you for pointing
           | it out!
        
         | Scarblac wrote:
         | I don't believe just being transcendental is sufficient for
         | that property.
         | 
         | E.g. I can't imagine that pi with all the '1' digits replaced
         | by '2' isn't transcendental, but it clearly doesn't contain
         | every sequence of digits.
         | 
         | I think you mean normal numbers.
        
           | cs702 wrote:
           | You're _right_ : normal, not transcendental.
           | 
           | IIRC, most real numbers are normal, and many important
           | transcendental numbers are thought to be (but have not been
           | proven to be) normal.
           | 
           | I edited my comment. Thank you for the correction!
        
         | gnramires wrote:
         | Right, but any constant doesn't "encode" this information. To
         | use a normal constant to encode information, you need to encode
         | the location of the substring of interest. Generally, the
         | location of the substring of interest needs to require as many
         | bits as the substring itself (unless there's an intimate
         | relationship between the number and the substrings of
         | interest). So arguably it's the location that encodes the data,
         | the number is irrelevant (and why not encode the data directly
         | into this location?).
        
       | civilized wrote:
       | Too easy. Now fit any dataset using a substring of digits from
       | the binary expansion of pi
        
       | zwieback wrote:
       | more digits than I care for
        
       | idiotsecant wrote:
       | First, this is a fun implementation and I love it.
       | 
       | Second, you could as easily embed an infinite size dataset into
       | an infinitely long binary string and say that you've reproduced
       | your dataset with a 'single' parameter! That's sort of what this
       | is doing, with some extra steps.
        
         | hnlmorg wrote:
         | This reminds me of the fun thought experiment of using
         | /dev/random as storage. Given an infinite amount of time you'll
         | find every file you need.
        
         | laumars wrote:
         | Your second point is what I first assumed this paper was doing
         | before I read it.
        
         | cobookman wrote:
         | Up until ARG_MAX is reached, or you run out of RAM that is...
         | https://serverfault.com/a/844666
        
       | [deleted]
        
       | asimjalis wrote:
       | The word "single parameter" is doing a lot of work in this
       | statement.
        
         | llimllib wrote:
         | intentionally:
         | 
         | > In addition to casting doubts on the validity of parameter-
         | counting methods and highlighting the importance of complexity
         | bounds based on Occam's razor such as minimum description
         | length (that trade off goodness-of-fit with expressive power),
         | we hope that fa may also serve as entertainment for curious
         | data scientists
        
       | air7 wrote:
       | This reminds me of a joke idea I read somewhere: You can encode
       | the entire Encyclopaedia Britannica using a single mark on a
       | simple stick!
       | 
       | Just encode the text as a ascii codes after the decimal dot of a
       | zero. (0.656168.. etc). Then just mark that ratio of the sticks
       | length and you're done...
        
         | alittlesalami wrote:
         | This is Arithmetic Coding:
         | https://en.wikipedia.org/wiki/Arithmetic_coding
        
         | ohazi wrote:
         | You jest, but this is almost exactly how arithmetic coding
         | works.
         | 
         | If you arrange your symbols and contexts carefully, you can
         | even use this as a technique for progressive or lossy
         | compression -- i.e. the more accurately you specify the ratio,
         | the higher fidelity your result.
        
         | thehappypm wrote:
         | You can do it easier than that -- just make a stick of length
         | .65168 meters. (Same idea, just the "full 1 meter stick" is
         | defined elsewhere, not by the length of the stick).
        
           | tshaddox wrote:
           | That's way harder, because you have to cut a stick to write
           | and have a meter ruler to read.
        
         | an1sotropy wrote:
         | I remember reading about this as a kid in one of Martin
         | Gardner's books, either "aha!" or "aha! Gotcha"
        
         | gh02t wrote:
         | And of course in a footnote of the documentation for this
         | encoding "Sufficiently precise measurement of this mark left as
         | an exercise to the reader."
        
         | mjburgess wrote:
         | I don't see this as a joke, but a radical and important point.
         | 
         | Reality, in being geometrical, is infinitely informationally
         | dense (with a discrete conception of information).
         | 
         | This distinction between geometrical space and time, and
         | discrete algorithmic computability is unbridgeable.
         | 
         | And hence there is an extemely firm footing on which to reject:
         | AI, brain scanning readers, teleporters, etc and most sci-fi
         | computationalim.
         | 
         | Almost nothing can be simulated, as in, realised by a merely
         | computational system.
        
           | joshgrib wrote:
           | I just wanted to add that reality being a non-discrete
           | geometric thing is still an assumption - we don't know for
           | sure that it isn't discrete and a lot of quantum stuff points
           | more toward a discrete reality than a continuous one.
           | 
           | So assuming continuous space/time and discrete information
           | then I'd agree, but as far as we know space/time aren't
           | continuous, but just appear that way to us. It doesn't seem
           | like we know for sure that it is discrete, but at the least
           | I'd say it's solid evidence that stuff like brain scanning is
           | definitely in the realm of possibility.
           | 
           | This answer from the Physics StackExchange nicely covers how
           | time/space could appear continuous to us even if they are in
           | fact discrete at lower levels. Also some interesting
           | discussion in the other answers
           | https://physics.stackexchange.com/a/35676
        
             | mjburgess wrote:
             | I think for my purposes defining continuous = unmeasurably
             | discrete produces the same results.
             | 
             | Ie., there is an irreducible geometrical continuity in the
             | sense that no discontinuity can ever appear. The state
             | density is maximal.
             | 
             | via this route we reporduce the same point:
             | computationalism/simulation'ism' is then just the thesis
             | that computers qua measurably discrete systems can realise
             | dense unmeasurable discrete systems.
             | 
             | This can be shown to be impossible with much the same
             | argument: spatial and temporal geometrical properties
             | obtain in virtue of dense discreteness; and fail to obtain
             | at measureable levels.
             | 
             | The key property of continuity is its irreducibility to
             | measurably discrete systems. That irreducibility isn't,
             | however , limited to continuity .
             | 
             | Wolfram makes this point about the failures of reductionism
             | in a perfectly discrete context, ie., that no CA can
             | compute a CA whose complexity is greater than it can
             | summarise.
             | 
             | I prefer to press a continuous angle: our best theories of
             | all of reality are continuous and geometrical . That energy
             | levels are discrete in bounded quantum systems has almost
             | nothing to do with the indispensability of contintuous
             | mathemtatjcs in every known physical system -- including
             | that very bounded wavefn
        
           | whatshisface wrote:
           | It's computational again when you allow the computation to be
           | precise enough for any practical purpose, which is not
           | infinitely precise.
        
           | wallacoloo wrote:
           | These are weird conclusions. Any attempt to measure "reality"
           | gives some amount of uncertainty. The only way for this to
           | lead to the relatively stable experience you perceive is if
           | those small variations in measurement lead to relatively
           | small differences in perception. In which case, you can
           | truncate the resolution of your simulation and still get
           | plausible results.
           | 
           | I assure you there are plenty of groups out there simulating
           | systems that operate with similar densities (but lower
           | volume) to the brain.
        
           | mgraczyk wrote:
           | I wouldn't say that the distinction is "unbridgeable". Keep
           | in mind that we send packets over plenty of "geometrical" and
           | seemingly infinitely dense analog channels all the time, like
           | electricity over a copper wire or EM waves over the air.
           | 
           | The "mark on a stick" channel has a capacity like any other
           | channel. If you're sending just one symbol, you could easily
           | calculate the information capacity given a desired
           | probability of bit-error.
           | 
           | Assuming you can put the mark in exactly the right spot, you
           | can model the "noise" as a distribution over the values that
           | the reader will measure. If you model this as `mark + zero-
           | mean normal distribution` with a known variance, then your
           | stick is just an AWGN channel.
        
           | tbabb wrote:
           | I think there are at least two things wrong with this take:
           | 
           | One is that I don't think it follows from the premise that
           | the continuity of the physical world precludes AI, brain
           | scanning, etc. Even if the physical world were continuous
           | (likely not, see below), an arbitrary degree of approximation
           | could be attained, in principle. At the _very least_ I would
           | not call the footing  "firm".
           | 
           | The second is that the universe is very likely not continuous
           | anyway. The Beckenstein bound[1] puts an upper limit on the
           | number of bits of information a region of space may contain.
           | If the ruler tickmark were either measured or localized to
           | the precision required to encode the information, the
           | information density would cause it to collapse into a black
           | hole. This would happen once your measurement needs to be
           | about as precise as a Planck length, which would allow you to
           | encode about 115 bits of information with your tickmark.
           | 
           | (This in of itself is independent of the fact that you would
           | need to construct the ruler out of objects that the universe
           | permits; your ruler tickmark would need to be made of and
           | measured with discrete fundamental particles, which by their
           | very nature are quantized).
           | 
           | [1] https://en.wikipedia.org/wiki/Bekenstein_bound
        
             | oever wrote:
             | Or use a larger stick. Every doubling of the stick length
             | gives another bit of information. A stick of one light-year
             | length would add about 53 bits.
        
               | tbabb wrote:
               | The main point I'm making is that the information density
               | of the physical world is not unlimited as GP suggests.
               | 
               | But I think that example just shows how few bits you can
               | really get out the exercise!
        
           | [deleted]
        
           | dlivingston wrote:
           | Even if space and time were continuous (which things like the
           | Planck length would discredit), there are still discrete
           | objects in that continuum.
           | 
           | Elementary particles, for example, are discrete. You could
           | argue that they have continuous effects vis a vis the EM
           | field and spatial positioning, but ensemble effects usually
           | render that irrelevant at large enough scales.
        
             | tsimionescu wrote:
             | I will note that even in QM, both space and time are
             | considered continuous - the Planck length is just a
             | smallest measurable distance, but nothing in QM currently
             | assumes that particles must be separated by an integer
             | multiple of the Planck length (unlike spins, for example).
             | 
             | I believe there are some theories of quantum gravity that
             | do rely on the idea that space-time is quantized in integer
             | multiples of Planck's length, but these are far from
             | definitive theories.
             | 
             | A much more relevant limit in terms of possible information
             | density is Heisenberg's uncertainty principle, which
             | essentially puts a limit on the maximum possible precision
             | for any measurement.
        
           | yongjik wrote:
           | In theory, pi has infinite digits. You could publish a book
           | of a trillion digits of pi, and you have barely scratched the
           | surface: in fact you published a precisely 0.00000% of all
           | digits of pi.
           | 
           | In practice, you "only" need ~42 digits of pi to draw a
           | circle spanning the entire known universe (diameter of 8.8 *
           | 10^26 m) and it will deviate from the ideal circle by less
           | than the size of a proton (0.8 * 10^-15 m).
           | 
           | Having a theoretically infinite precision does not mean that
           | it makes a measurable difference.
        
             | Kranar wrote:
             | Every number has infinite digits or can be made to have
             | infinite digits for a given representation, but that's not
             | the same as a number having an infinite amount of
             | information. PI represents a finite amount of information,
             | as opposed to say a number like Chaitin's constant which
             | represents an infinite and irreducible amount of
             | information:
             | 
             | https://en.wikipedia.org/wiki/Chaitin%27s_constant
        
           | tsimionescu wrote:
           | This is not just a false idea, but an obviously false one,
           | contradicted by all the laws of physics. If you were right,
           | then any finite volume would contain an infinite amount of
           | information, which would mean it has infinite entropy,
           | temperature, and energy.
           | 
           | Also, by the same logic you apply to space, you could say
           | that time is infinitely divisible, so you could create a
           | computer which finishes an infinite amount of steps in a
           | finite amount of time.
        
             | mjburgess wrote:
             | information here, ie log of a probability, is a continuous
             | notion -- it is real-valued, as not least, log is a
             | transcendental fn --
             | 
             | i specifically said with a /discrete/ conception , geometry
             | is infinitely dense (of discrete states)
        
               | tsimionescu wrote:
               | In all physical theories, any finite system has a finite
               | number of _distinguishable_ states. So it is not
               | infinitely informationally dense, especially when working
               | with discrete bits of information.
               | 
               | Not to mention, the finer the distinctions between two
               | states of a system, the more energy you need to
               | distinguish them. So, the less impact these differences
               | can have, unless the system is extraordinarily energetic
               | (and even then, you end up in fundamental limits of
               | energy per volume, like the Schwarzschild radius).
               | 
               | So again, there is no sense in which a finite part of the
               | universe is universally dense.
               | 
               | Even worse for your argument, all currently known laws of
               | physics use computable functions (the randomness in QM
               | notwithstanding). So, by definition, all known laws of
               | physics can be simulated by an ideal Turing machine
               | (again, give or take some randomness in QM, depending on
               | the interpretation you chose to believe in and on how you
               | chose to simulate the QM system).
        
             | rnhmjoj wrote:
             | There's an interesting paper[1] that argues that real
             | numbers aren't physical, for the precise reasons you
             | stated. That is, only the subset of real numbers that
             | contain a finite amount of information (like constructive
             | real numbers) are physically meaningful.
             | 
             | A consequence of this is that classical physics is not
             | really deterministic: this is because, in general (ie.
             | including chaotic systems), the evolution of a system
             | depends on a set of initial conditions that are specified
             | by full real numbers, impossible to measure with finite
             | precision. So, the use of real numbers is hiding the
             | indeterminism in the initial conditions, much like the
             | function in this article encodes a dataset in a single
             | parameter.
             | 
             | [1]: https://arxiv.org/abs/1803.06824
        
               | tsimionescu wrote:
               | I think the precise definition of what exactly is
               | physically meaningful is more interesting a bit, because
               | pi is just as 'real' as 1 in some sense. You can't any
               | more precisely measure 1m than pi meters. A perfect
               | square doesn't exist any more or less precisely than a
               | perfect circle.
               | 
               | Basically, the universe can just as well be approximated
               | in two different ways. One, it's all straight lines, and
               | anything that looks like a circle/curve is actually a
               | piece of a veeery many sided polygon if you look closely
               | enough at it. Two, it's all curves of some curvature, and
               | anything that looks like a straight line is actually a
               | segment of a veeery large circle. The first perspective
               | corresponds to taking the rational numbers as physical.
               | The second one corresponds to taking pi as physical (and
               | then it's unclear whether e is also physical, or which
               | other transcendental numbers are, but 1 is definitely un-
               | physical then).
               | 
               | Probably the constructive framework is the best way to
               | express this concept, just starting from whatever
               | constant we chose to define as the unit.
        
           | Ginden wrote:
           | Existence of analog reality is quite problematic, because it
           | opens can of worms - eg. hypercomputation is possible.
        
           | gnramires wrote:
           | I used to think there must be something 'special' to brains
           | to distinguish us from computers. There isn't. Brains encode
           | finite amounts of information (quantum mechanics seems to
           | imply bounded local information). We are a huge information
           | network ourselves -- that's what consciousness is (with some
           | added bits like self-identity and various particulars
           | structures that dictate the _character_ of our experience).
           | 
           | But that doesn't mean brains aren't special -- it means
           | brains are special _and_ computers are special. Even more: it
           | seems to imply computers, AI, etc. can be as special as
           | ourselves, sentient, and perhaps even more special in ways we
           | haven 't realized yet.
           | 
           | It's difficult to even imagine a physical theory with
           | unbounded local information. It seems to open the possibility
           | to crazy things like hypercomputation, which do not seem very
           | well defined. (For example: every 1-1/n seconds, (n>1) from
           | now, flip a switch ON/OFF. At what state will the switch be
           | at t>1s? An at exactly t=1s?)
           | 
           | Note: while information and information flow itself bounded
           | (hence no hypercomputation), I don't know of any obvious
           | objections to continuous time. (I'm not sure the continuity
           | of time has any profound implications)
        
             | mjburgess wrote:
             | the issue isn't brains
             | 
             | no algorithm running on a cpu can move a muscle --
             | 
             | it is precisely that movement is a spatiotemporal property
             | which means no turing machine can realise it
             | 
             | movement isn't a symbolic operation
        
               | tsimionescu wrote:
               | This is such a strange claim to make, I can't even tell
               | what you could have meant. We are surrounded by
               | simplistic mechanical computations moving muscles , and
               | have been for more than a hundred years, since the
               | industrial revolution at least.
        
               | gnramires wrote:
               | What do you suppose those computers are doing, then? ;)
               | 
               | https://www.youtube.com/watch?v=tF4DML7FIWk
               | 
               | https://www.youtube.com/watch?v=tfb6aEUMC04&t=115s
        
           | idiotsecant wrote:
           | Reality isn't quite infinitely dense though, it's hard to
           | imagine how you could encode data geometrically over a
           | distance smaller than the planck length, for example. Reality
           | just has a very, very fine resolution.
        
             | Rapzid wrote:
             | A quick Google shows 6.25e+34 plank lengths in a meter. So,
             | seems impossible to measure and make the mark.
             | 
             | Edit: As a fraction of appended ascii codes.
        
               | petercooper wrote:
               | And if the mark were instead the removal of a single atom
               | in a 40cm length, the resolution would be even poorer..
               | able to encode a single 30 bit number by my very rough
               | calculation.
        
           | dnautics wrote:
           | > And hence there is an extemely firm footing on which to
           | reject: AI
           | 
           | This is like saying it's impossible to build a water pump
           | without solving the quantum mechanical interactions that
           | govern water flow.
        
           | [deleted]
        
           | WithinReason wrote:
           | How do you know reality is not computational?
        
         | meowphius wrote:
         | That is from the novel "Hard-Boiled Wonderland and the End of
         | the World" by Haruki Murakami[0].
         | 
         | [0]: https://everything2.com/title/Encyclopedia+on+a+toothpick
        
         | joiguru wrote:
         | One theoretical objection to this idea is that distances
         | measured have an accuracy limit of Planck length (1.616e-35
         | meters). So if your number needs more precision then "just
         | mark" step can't be done.
        
         | pveierland wrote:
         | My shot at the calculation for fun :-)
         | 
         | Stick encoding with graphite resolution (0.335 * 10^-9 meter)
         | [1]: "Uti" (31 bits -> 3 UTF8 characters)
         | 
         | Stick encoding with Planck resolution (1.616255 * 10^-35 meter)
         | [2]: "Utility ought " (115 bits -> 14 UTF8 characters)
         | 
         | Complete first sentence: "Utility ought to be the principal
         | intention of every publication." [3]
         | 
         | It appears that this storage scheme may not be suited towards
         | the safekeeping of literature.
         | 
         | [1]
         | https://www.wolframalpha.com/input/?i=floor%28floor%28log_2%...
         | 
         | [2]
         | https://www.wolframalpha.com/input/?i=floor%28floor%28log_2%...
         | 
         | [3] https://digital.nls.uk/encyclopaedia-
         | britannica/archive/1441...
        
           | Nition wrote:
           | "Simply" scale up the stick to your desired minimum level of
           | precision!
        
       | michaelcampbell wrote:
       | This reminds me of the "worst way to ask a user for a telephone
       | number" UI, after one UI forced the user to select each digit in
       | a 0-9 dropdown.
       | 
       | The specific one I'm thinking of is just spit out a scrollable
       | numeric string of Pi, and make the user scroll until their phone
       | number was the digits of Pi that matched it.
       | 
       | My (7 digit) phone number occurs after digit position 9_000_000,
       | and occurs 21 times in the first 200_000_000 digits.
        
         | albertzeyer wrote:
         | I think it is still not proven that Pi contains all possible
         | strings.
         | 
         | https://math.stackexchange.com/questions/216343/does-pi-cont...
        
         | throwawaytemp27 wrote:
         | I wonder if you gave away your phone number there? Or how many
         | 7 digit numbers match those criteria?
        
       | a-dub wrote:
       | this made me smile. :)
       | 
       | oh, uncountable infinities.
        
       | kzrdude wrote:
       | How to have fun with mpmath :)
        
       | jldugger wrote:
       | >The purpose of this paper is to show that all the samples of any
       | arbitrary dataset X can be reproduced by a simple differentiable
       | equation: fa(x) = sin^2 (2^xt arcsin [?]a)
       | 
       | So... use a PRNG?
        
         | andi999 wrote:
         | Doesn't this exclude negative numbers?
        
       ___________________________________________________________________
       (page generated 2021-09-29 23:00 UTC)