[HN Gopher] Show HN: Store the proof of a webpage saved with Sin...
       ___________________________________________________________________
        
       Show HN: Store the proof of a webpage saved with SingleFile in
       Bitcoin
        
       Author : gildas
       Score  : 73 points
       Date   : 2020-01-06 17:06 UTC (5 hours ago)
        
 (HTM) web link (blog.woleet.io)
 (TXT) w3m dump (blog.woleet.io)
        
       | mathiasrw wrote:
       | A more complete way is to put all of the website data on the
       | blockchain including metadata like https://etched.page does.
       | Other files are added in a way that you can see how the page
       | looked at the time. The user selects which files from the document
       | are kept (.css is normally a good idea).
       | 
       | The website, and the metadata about when it was stored including
       | a signature from etched.page can be unpacked directly from chain.
        
         | maxfan8 wrote:
         | What's wrong with just a hash of all the relevant files? You
         | can then store the actual files conventionally and provide them
         | upon request. The security guarantees are still there (you
         | implicitly trust SHA256's resistance against collision attacks
         | if you use Bitcoin).
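A minimal sketch of the "hash of all the relevant files" idea, assuming the saved page is available as a mapping of file names to bytes. The helper names (sha256_hex, manifest_digest) and the sample files are hypothetical and are not taken from SingleFile or Woleet. Only the final digest would need to be timestamped; the files themselves can be stored conventionally.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def manifest_digest(files: Dict[str, bytes]) -> str:
    """Single digest covering every file (hypothetical helper).

    Builds one "<digest>  <name>" line per file, sorted by name so the
    result does not depend on insertion order, then hashes the manifest.
    """
    lines = [f"{sha256_hex(content)}  {name}"
             for name, content in sorted(files.items())]
    return sha256_hex("\n".join(lines).encode("utf-8"))


# Illustrative data only; a real page would have many more resources.
saved_page = {
    "index.html": b"<html><body>Hello</body></html>",
    "style.css": b"body { color: black; }",
}
digest = manifest_digest(saved_page)
print(digest)
```

Anyone who is later given the files can recompute the digest and compare it with the timestamped value. Changing a single byte in any file yields a different digest.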
        
       | sareum wrote:
       | neat, this is really handy!
        
       | crazypython wrote:
       | I use DEVONThink, and it saves stuff in Web Archive (WARC) format
       | when I want it to. Like Evernote clipper, but better.
       | 
       | I also use httrack to download files offline, which I can then
       | have DEVONThink index. Bam, offline search engine!
        
       | m-p-3 wrote:
       | Can we alternatively get the ability to upload that to IPFS?
        
       | gill3s wrote:
       | I use it every day to track bad takes / scams and prove them
       | later just in case authors deny their words afterwards. The
       | Internet has a good memory, Bitcoin makes it immutable :)
        
       | kichik wrote:
       | Why not just timestamp it with an RFC 3161 timestamp server?
       | Bitcoin sounds like a bit of an overkill here.
        
         | XnoiVeX wrote:
         | Or a combined hash with the NIST randomness beacon.
         | https://beacon.nist.gov/home
        
           | maxfan8 wrote:
           | Actually, uploading to Bitcoin is tantamount to using the
           | NIST randomness beacon since the NIST randomness beacon is
           | injected into Bitcoin:
           | https://github.com/opentimestamps/nist-inject
        
         | eli wrote:
         | Or, for most people, just tweet a hash
        
       | verdverm wrote:
       | Suspicious new accounts commenting
        
         | kick wrote:
         | Interestingly, at least one deleted their comment after it was
         | flagged, just in case any reader sees this and thinks "Wait,
         | there's only two suspicious new accounts commenting!"
        
           | verdverm wrote:
           | HN will delete them too, I think. Both are now gone
        
             | kick wrote:
             | No, only one is. Turn showdead on.
        
       | zemnmez wrote:
       | this is a proof that a user viewed a webpage rather than a proof
       | that a webpage existed with given contents, which would need many
       | users to find consensus on the content of a webpage.
       | 
       | as a result of this model, there's no way to verify the content
       | is 'correct', since the user can arbitrarily modify it before
       | submitting
       | 
       | so my understanding of this is that cryptographically it can be
       | used to say that a user submitted some content at a given time.
       | 
       | what's the use case for this?
        
         | maxfan8 wrote:
         | Prior art in patent, trademark, or copyright cases.
        
       | realty_geek wrote:
       | Is there a simple solution to allow people to timestamp a static
       | webpage like Patrick McKenzie does here:
       | 
       | https://twitter.com/patio11/status/958494488061595649
        
       | fiatjaf wrote:
       | Why not use opentimestamps.org for this extension instead?
        
         | maxfan8 wrote:
         | Yeah, having an open, easy to verify standard is part of what
         | makes a "proof" a good proof. No need to reinvent the wheel
         | here.
        
       | drdeca wrote:
       | If webpages supported tls-N, then this seems like it could be
       | cool, but as is, I don't see what this does beyond what
       | originstamp (and similar services) provide. The tls notary thing
       | others have mentioned here sounds cool, and I hadn't heard of it.
       | The interactivity allowing for a proof even without the server
       | supporting something like tls-n is impressive, if I am understanding
       | correctly!
        
       | kwantam wrote:
       | Since there's little technical detail it's hard to be certain,
       | but I doubt this is useful as a proof in the way one might wish
       | (that is, proving that a web server delivered content X at date
       | Y). The reason is, it does not appear that anything prevents the
       | user from modifying the web page and then generating a "proof"
       | about the modified version.
       | 
       | TLSNotary (tlsnotary.org) is an example of a project that
       | attempts to use a (modified) TLS connection for non-repudiation
       | (which is roughly the property that we would want here), but it
       | requires a trusted third party to act as the notary.
       | 
       | It's possible this project is taking a similar approach (which
       | would be fine, for those who trust the trusted third party). But
       | given the lack of technical detail, and reading between the
       | lines, I don't see a reason to believe this is the case.
       | 
       | (Happy to be wrong, though! Maybe a more detailed description
       | would help us understand what's going on.)
        
         | gildas wrote:
         | Basically, you're right. I found the idea interesting despite
         | the fact that the solution isn't "perfect". For your
         | information, the code used to upload the proof is here [1]. A
         | SHA256 of the content of the saved page is generated (i.e. the
         | "hash" parameter of the "anchor" function) and is sent to the API
         | of Woleet.
         | 
         | [1] https://github.com/gildas-lormeau/SingleFile/blob/master/ext...
        
         | jmeyer2k wrote:
         | > I doubt this is useful as a proof in the way one might wish
         | (that is, proving that a web server delivered content X at date
         | Y). The reason is, it does not appear that anything prevents
         | the user from modifying the web page and then generating a
         | "proof" about the modified version.
         | 
         | The point of this service is to prove data _existed_ at a
         | certain time. It can't prove the authenticity of the data.
        
           | attilakun wrote:
           | The linked post says something else:
           | 
           | > It is therefore quite natural that this collaboration was
           | born, allowing all users of the extension to retrieve a
           | neutral, irrefutable and usable evidence worldwide, including
           | in court.
        
             | allovernow wrote:
             | This is potentially very useful for IP. It's a great way to
             | provide some proof that you had an idea before anyone else.
        
         | gregable wrote:
         | Another possible solution to what you want to solve is to use a
         | signed exchange signature:
         | https://wicg.github.io/webpackage/draft-yasskin-http-origin-...
         | 
         | The publisher server must support it, but this results in the
         | document being signed with the publisher's certificate. This
         | won't "date" the signature, but combined with a blockchain
         | solution like this could prove that a web server delivered
         | content X at date Y.
        
         | gill3s wrote:
         | hi, I'm woleet's ceo. I can give you all the details you want.
           | To explain it simply, for each SingleFile export we "anchor"
           | the hash in Bitcoin. It means each hash is linked to one
         | particular bitcoin transaction. Feel free to ask any question
         | I'll be happy to answer
        
           | hanniabu wrote:
           | This sounds extremely expensive due to the cost of bitcoin
           | transactions. What ZeroNet is doing seems much more feasible.
           | https://zeronet.io/
        
             | gill3s wrote:
             | We use layer 2 technology: the main idea of woleet is to
             | stamp many hashes (possibly millions) in one bitcoin
             | transaction. Our service has been running for years and
             | we produce thousands of proofs daily.
        
               | this_was_posted wrote:
               | what do you mean by layer 2 technology? If you're
               | talking about the data link layer of the OSI model I am
               | not sure how that applies here..
        
               | Ohn0 wrote:
               | "Layer 2 refers to a secondary framework or protocol that
               | is built on top of an existing blockchain system."
               | 
               | from https://www.binance.vision/glossary/layer-2
        
               | exdsq wrote:
               | Layer 2 is essentially an application layer abstracted on
               | top of a blockchain. Among other things it is a way to
               | allow for better scalability.
        
             | Paul-ish wrote:
             | If they are using Merkle trees to aggregate pages, they can
             | have many page proofs in one transaction.
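A rough sketch of the Merkle-tree aggregation mentioned above, assuming SHA-256 and a simple "duplicate the last node on odd levels" rule. The function names and the proof format are illustrative assumptions; this is not Woleet's or OpenTimestamps' actual proof format. Only the root would go into a Bitcoin transaction, while each user keeps a short proof for their own page hash.

```python
import hashlib
from typing import List, Tuple

ProofStep = Tuple[bytes, bool]  # (sibling hash, sibling is on the right)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: List[bytes],
                          index: int) -> Tuple[bytes, List[ProofStep]]:
    """Return (root, proof) for leaves[index]; leaves must be non-empty."""
    level = list(leaves)
    proof: List[ProofStep] = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: pad odd levels
        sibling = index ^ 1          # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: List[ProofStep], root: bytes) -> bool:
    """Recompute the path from the leaf up to the root."""
    node = leaf
    for sibling, sibling_on_right in proof:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root


# Five hypothetical page hashes aggregated under a single root.
page_hashes = [h(f"page {i}".encode()) for i in range(5)]
root, proof = merkle_root_and_proof(page_hashes, 3)
print(root.hex(), len(proof), verify(page_hashes[3], proof, root))
```

The proof grows with the logarithm of the number of leaves, which is why a very large number of hashes can share one transaction.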
        
           | la_fayette wrote:
           | Actually, I also cannot find any information on how this is
           | actually achieved by your service. As OP mentioned, if "just"
           | a browser extension is used to create a hash of the html
           | page, one could use dev tools to modify the dom inside the
           | browser and then create the hash...
        
           | mpeg wrote:
           | I think kwantam's point is that you are merely storing a hash
           | of the resulting file, it proves it existed at a certain
           | datetime, but it doesn't guarantee it wasn't modified (which
           | TLSNotary does, albeit with a trusted third party required)
           | 
           | This wasn't clear from the link, as there was very little
           | technical information provided.
        
             | gill3s wrote:
             | I get it, and no, nothing proves you didn't modify it.
             | Maybe a solution is to create some kind of "witness
             | community" stamping the same page at the same time. It will
             | have different hashes each time and the evidence could be
             | stronger in the end
        
               | JoshuaDavid wrote:
               | If the website uses SSL, would it be possible to prove
               | that the server signed the particular sequence of bytes
               | you received? That doesn't prove that _nobody_ modified
               | the data but it does prove that anyone who did was able
               | to sign things with a key that nobody else should have
               | access to.
        
           | Znafon wrote:
           | How do you prove that the user did not modify the page before
           | computing the hash?
           | 
           | Do you have documentation of how this is done?
        
             | gill3s wrote:
             | this tool does not guarantee that you've not modified the
             | export before you stamp it. The main protection is the
             | timestamp and the fact that the hash is calculated by the
             | extension itself. This proof just guarantees that this
             | particular file existed at this date. I personally believe
             | that even if it's not a silver bullet, the certain date is
             | the main protection it provides. If you want to make some
             | fraud with a bitcoin timestamp, you need a proper timing
             | and preparation. In conclusion it just makes things harder
        
               | lilyball wrote:
               | Being able to prove that a certain webpage exists
               | _locally_ at a given date is rather useless. The only
               | utility there is if the page contains confidential
               | information and you want to prove that you had that
               | information at that time, but you could do that just as
               | easily without saving a whole web page in the process.
        
               | bscphil wrote:
               | Yeah, ironically the whole point of SingleFile for me is
               | that I can locally edit webpages to strip out ads and
               | other crap so I can send them to friends or family who
               | might not have ad blockers.
        
             | rmtech wrote:
             | There are some, many even, instances where the kind of
             | fraudulent modifications you might want to do could not
             | realistically be done at the time the page is saved, but
             | could be done at a later date.
             | 
             | So for that, it's useful.
        
           | alfonsodev wrote:
           | I think the OP is thinking of a scenario where, if the proof
           | is generated locally, it just proves the existence of that
           | file, not that the file was public on the internet. You could
           | use a proxied network (or just the hosts file?) to fool the
           | browser extension. I'm not implying this is the case, but it
           | would be great if you could explain how it is implemented.
        
         | gdm85 wrote:
         | > I doubt this is useful as a proof in the way one might wish
         | (that is, proving that a web server delivered content X at date
         | Y). The reason is, it does not appear that anything prevents
         | the user from modifying the web page and then generating a
         | "proof" about the modified version.
         | 
         | Thanks for saying it so clearly, this is exactly the first
         | thing I thought (and it reduces the value of the functionality
         | by heaps, since everything can be forged).
        
       | marcinjachymiak wrote:
       | There's already a better service that timestamps files in
       | Bitcoin. It also uses blockchain space efficiently using servers
       | that aggregate data that must be timestamped into a single
       | Bitcoin transaction. You just need to publish a Merkle root and
       | hold onto your Merkle proof.
       | 
       | https://opentimestamps.org/
       | https://petertodd.org/2016/opentimestamps-announcement
        
       | joshspankit wrote:
       | I believe this technique of storing hashes on a blockchain is how
       | public figures should be inoculating against the upcoming risk
       | posed by deepfakes.
       | 
       | If a video surfaces that's faked from an existing hashed one,
       | that's a _very_ easy proof.
        
         | maxfan8 wrote:
         | How would this work? We'd still need something that can detect
         | similar/probably deepfaked content (a good cryptographic hash
         | has random distribution).
        
           | RobLach wrote:
           | Furthermore, common actions such as re-encoding a video for
           | streaming will generate a copy whose content looks identical
           | but whose hash is completely different. That creates an
           | ecosystem of real and deepfaked content with the same kind
           | of noise that deepfaked content thrives in.
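A small illustration of why a cryptographic hash cannot detect "similar" content, assuming SHA-256. The byte strings are stand-ins, not real video data: any byte-level change, such as a re-encode, produces an unrelated digest.

```python
import hashlib

# Hypothetical stand-ins for an original file and a re-encoded copy.
original = b"frame data of the original video"
reencoded = original + b"\x00"  # any byte-level change will do

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(reencoded).digest()

# Count how many of the 256 output bits differ between the two digests.
differing_bits = sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))
print(differing_bits, "of 256 bits differ")
```

For a well-behaved hash, roughly half of the output bits are expected to flip, so the two digests reveal nothing about how close the inputs were. Matching visually similar videos would need a different tool, such as a perceptual hash.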
        
         | exdsq wrote:
         | Solid idea actually! Can you think of a way to wrap this as a
         | product?
        
       | dqv wrote:
       | Nice. I'll give it a try when I get to my primary PC.
       | 
       | Journalists have a bad habit of linking to tweets which are often
       | ephemeral because accounts are deleted, tweets are deleted, or
       | accounts go private.
       | 
       | Another problem is where publishers themselves change the open
       | graph meta (or whatever it's called) after a tweet has been
       | published. One memorable example (for me) is where Washington
       | Post changed the image on an article about Alexandria Ocasio
       | Cortez's Jewish heritage depicting her with her hands clasped
       | similar to the Happy Merchant meme[0]. Obviously they realized
       | the resemblance enough to change the image, but didn't comment on
       | it. If you look at the original tweet[1] now, you can see the
       | replies look completely out of context because they changed it.
       | 
       | [0]:https://knowyourmeme.com/memes/happy-merchant
       | [1]:https://twitter.com/washingtonpost/status/107212454556018278...
        
       | RileyJames wrote:
       | Great solution. The previous blockchain enabled solution for this
       | problem that I'd found was https://tlsnotary.org/.
       | 
       | A few questions: how does notarising the tls handshake vs the
       | entire document differ in terms of "proof"?
       | 
       | Is one form of proof better than the other? Or do they prove
       | something different?
        
       | [deleted]
        
       | [deleted]
        
       | onyb wrote:
       | At my previous job, in a legal-tech company, we used Woleet to
       | build a copyright protection product for intellectual property.
       | However, I believe IPFS [1] is a superior solution for proof-of-
       | existence, compared to timestamping on Bitcoin.
       | 
       | With Woleet, you must keep the original payload (file + personal
       | identification) that was timestamped, for eternity. In the event
       | of a copyright violation, you must be able to prove in front of a
       | judge that the hash of the file in your possession is indeed what
       | exists on the Bitcoin blockchain.
       | 
       | With IPFS, you only need to save the hash of the payload (or a
       | human-readable name, with IPNS [2]), to convince the judge that
       | you authored the original file at a certain point in time.
       | Additionally, IPFS has version control. This means that if you
       | want to prove to a court that some revision to the T&Cs of your
       | product was made before a certain date, it makes more sense to
       | use IPFS.
       | 
       | [1] https://ipfs.io [2] https://docs.ipfs.io/guides/concepts/ipns
        
         | jmeyer2k wrote:
         | You can't prove a file existed before a certain date with IPFS
         | like you can with Bitcoin.
        
           | capableweb wrote:
           | Yes, if I understand IPFS correctly, you can. Since IPFS
           | works as a content addressed system, if you embed the date,
           | send the document to the judge (the hash which is based on
           | the content), don't show it until a later point, you can
           | prove the document is the same as you sent, even without
           | revealing the content until later.
           | 
           | IPFS doesn't seem to have anything about "version control" as
           | onyb mentioned.
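The "send the hash now, reveal the content later" scheme described above is a hash commitment. A minimal sketch, assuming SHA-256 and a random 32-byte nonce; the nonce and the function names are additions for illustration and do not come from IPFS:

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(document: bytes) -> Tuple[str, bytes]:
    """Return (commitment, nonce). Hand over the commitment, keep the rest.

    The fixed-length random nonce (an assumption of this sketch) stops a
    recipient from guessing a short or predictable document by brute force.
    """
    nonce = secrets.token_bytes(32)
    commitment = hashlib.sha256(nonce + document).hexdigest()
    return commitment, nonce


def reveal_ok(commitment: str, nonce: bytes, document: bytes) -> bool:
    """Check a revealed (nonce, document) pair against the commitment."""
    expected = hashlib.sha256(nonce + document).hexdigest()
    return hmac.compare_digest(expected, commitment)


doc = b"Terms and conditions, dated 2020-01-06"
commitment, nonce = commit(doc)
print(commitment, reveal_ok(commitment, nonce, doc))
```

Note that the date written inside the document is only self-asserted. The commitment shows that the document existed no later than the moment the recipient received the hash, so the time evidence comes from the recipient (or from a timestamping service), not from content addressing itself.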
        
             | bluesign wrote:
             | How would you embed the date?
        
         | bluesign wrote:
         | I am confused, how can you prove the date and ownership? Does
         | IPNS have some kind of timestamp?
        
       ___________________________________________________________________
       (page generated 2020-01-06 23:00 UTC)