PREDICTIVE COMPRESSION

In my last ideas post I mentioned that I'd been thinking about alternative data compression and transmission schemes. What I meant by that was that I had written one of my pages of handwritten notes on the topic, without much consideration of prior work. Checking now, I see that my basic idea is of course well established in the very actively researched field of data compression. This is good though - it is a significant part of the current PAQ compressors, on which the winners (well err, winner) of the Hutter Prize have based their work, and that dataset of English Wikipedia content should be reasonably similar to Gopher content (plus or minus some markup and ASCII-art). That means it might be applicable to transmission over low-bandwidth links, such as for my idea of broadcasting Gopherspace.

gopher://gopherpedia.com:70/0/PAQ
http://prize.hutter1.net/

That said, it's always a little disappointing to find that you haven't really thought of anything new. I will provide a brief summary of my idea anyway, ending with one alternative approach that perhaps could lead to something of potential.

Basically you have an AI system process the preceding parts (maybe also following parts) of the data stream and propose a selection of probable sequences to follow it. The depth of context around the part to be guessed obviously helps it narrow down the most likely outcomes. Each of these outcomes is assigned a value within a pre-defined scale according to its probability, so if an identical AI system is used for both compression and decompression, only the value assigned to the correct outcome needs to be transmitted - the decompressor simply matches that value to the corresponding outcome, which it calculated itself from the data already received. If the likelihood of the correct outcome is too low for it to fall within the range of values on the scale, it has to be sent using more conventional compression methods instead.

Obviously this is computationally expensive (slow) compared with conventional compression schemes (the PAQ compressors confirm this), but in the application I have in mind the main thing is keeping the size small enough to be practical for transmission over a slow radio link.

The only thing I'd add that doesn't factor into the implementations I've read about online (and for good reason, really) concerns the probability scale itself. Conventionally this would be a list of integers that correspond to bit values in digital communication. But what if you represent it within an analogue frequency or voltage range? Then granularity becomes a question of resolution rather than simply byte size. An analogue scale could theoretically represent the entire range of probabilities if the measurement resolution were fine enough to distinguish every single possible outcome within that range. In practice it won't be, but perhaps there's some advantage to the concept if, over a more restricted range, similar probable outcomes are grouped together. For example, say you're measuring voltage within a range that represents probable outcomes in the data stream. The range is only accurately measurable under close to ideal signal conditions - very little noise. If the signal is degraded by noise, the value received is incorrect, but because it is still nearby on the scale, the result is still similar to the correct outcome.
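Before going any further down the analogue path, here is a minimal sketch in Python of the basic digital version of the scheme. Everything in it is a stand-in of my own choosing: the "AI system" is just an order-2 character frequency model built from a shared sample text (a far cry from the mixed context models PAQ actually uses), and the "scale" is simply the rank of the correct character among the model's ordered guesses. Both ends run the identical model, so only the ranks need to cross the link.

from collections import Counter, defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def build_model(sample):
    """Count which characters follow each two-character context."""
    model = defaultdict(Counter)
    for i in range(len(sample) - 2):
        model[sample[i:i + 2]][sample[i + 2]] += 1
    return model

def ranked_guesses(model, context):
    """Candidate next characters, most probable first, according to the model."""
    counts = model[context[-2:]]
    seen = [c for c, _ in counts.most_common()]
    # Fall back to a fixed order for characters the model has never seen here.
    return seen + [c for c in ALPHABET if c not in seen]

def compress(model, text):
    """Emit the rank of each character among the model's guesses for it."""
    ranks = []
    for i in range(2, len(text)):
        guesses = ranked_guesses(model, text[:i])
        ranks.append(guesses.index(text[i]))
    return text[:2], ranks        # the first two characters seed the context

def decompress(model, seed, ranks):
    """Rebuild the text by asking the identical model for the same guesses."""
    text = seed
    for r in ranks:
        text += ranked_guesses(model, text)[r]
    return text

sample = "the quick brown fox jumps over the lazy dog and the cat sat on the mat "
model = build_model(sample)
seed, ranks = compress(model, "the cat sat on the mat")
print(ranks)                              # mostly zeros and other small numbers
print(decompress(model, seed, ranks))     # round-trips to the original text

Because the guesses are ordered from most to least likely, a value knocked one step along the scale by noise would still decode to one of the model's next-best guesses rather than to something arbitrary - which is exactly the property the analogue scale above would be trying to exploit.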
This is in contrast to digital communications, where an incorrectly received bit can cause a wildly different position within the scale to be received (and the value is often discarded entirely after comparing against a CRC). That said, digital communications basically operate this same system with only two outcomes within their range, so they can cope with much more noise while retaining signal integrity. But that is only a consequence of throwing away extra resolution in the received analogue signal that _could_ represent more information.

It is confusing to mix the analogue and digital domains - that argument could continue on to dynamic number bases chosen according to signal quality: base 2 when the signal is poor and only two states can be reliably detected, base 10 when it's good enough to distinguish ten points within the voltage or frequency range (which I guess is an alternative to simply increasing the transmission speed in base 2 until the signal quality drops to your limit).

But what if it's possible to stay within the analogue domain? Picture a form of analogue AI computer capable of generating the analogue voltage value that corresponds to the correct outcome within the voltage range representing the scale of probable outcomes, and use that for compression. Then you have another analogue computer able to reverse that process for the decompression. Or else, for decompression, you use one identical to that used for compression, but its "correct outcome" inputs are scanned through the entire range of probabilities on their own analogue input scale until its output value matches the one generated by the compression computer, at which point the momentary state of its inputs represents the answer (there's a crude digital sketch of this arrangement at the end of the post).

I can't help but think that last arrangement sniffs a little of quantum computing and qubits "collapsing" to a particular state. I'm not sure how to tie the analogue and quantum domains together so that an analogue voltage value within a range would equate directly to a qubit, but perhaps it's close to something.

I can't draw this to much of a conclusion, unfortunately. An analogue AI computer at a scale useful for this application is bound to be impractical by conventional means, and my equating of analogue electronics to quantum computing may well be complete nonsense. So while these thoughts suggest to me some machine that can process an analogue data stream with greater resolution and data throughput than conventional digital techniques, it may well be completely wrong and backward-looking. On the other hand, it's interesting to consider that the simplification of analogue values down to just one of two states might represent just one stage of computer development; one that leads towards data representations within infinite scales of possibility, and calculation using physical behaviours themselves rather than the layered abstractions of digital electronics.

- The Free Thinker, 2021.
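P.S. Here's that crude digital sketch of the "scan through the range until the outputs match" arrangement, in Python. Everything in it is a stand-in of my own: predict() plays the part of the analogue AI computer but is really just a hash, so its output values carry no real probability information and nothing is genuinely being compressed. The point is only the inversion mechanics - the compressor transmits the shared model's output value for the correct outcome, and the decompressor scans every candidate through the identical model until it reproduces (or lands closest to) that value.

import hashlib

OUTCOMES = "abcdefghijklmnopqrstuvwxyz "   # the full range of possible outcomes

def predict(context, outcome):
    """Deterministic forward model shared by both ends: maps a (context,
    candidate outcome) pair to a value in [0, 1) - the 'voltage' on the scale."""
    digest = hashlib.sha256((context + outcome).encode()).digest()
    return int.from_bytes(digest[:4], "big") / 2**32

def compress_symbol(context, symbol):
    """The compressor outputs the model's value for the correct outcome."""
    return predict(context, symbol)

def decompress_symbol(context, received_value):
    """Scan every candidate outcome through the same model and keep the one
    whose output lands closest to the received value (closest rather than
    exactly equal, to tolerate a little noise on the link)."""
    return min(OUTCOMES, key=lambda c: abs(predict(context, c) - received_value))

context = "the ca"
value = compress_symbol(context, "t")      # this single value is what gets sent
print(decompress_symbol(context, value))   # scanning the candidates recovers "t"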