[HN Gopher] The first chosen-prefix collision for SHA-1
       ___________________________________________________________________
        
       The first chosen-prefix collision for SHA-1
        
       Author : ynezz
       Score  : 702 points
       Date   : 2020-01-07 12:34 UTC (10 hours ago)
        
 (HTM) web link (sha-mbles.github.io)
 (TXT) w3m dump (sha-mbles.github.io)
        
       | bjornsing wrote:
       | > We note that classical collisions and chosen-prefix collisions
       | do not threaten all usages of SHA-1. In particular, HMAC-SHA-1
       | seems relatively safe, and preimage resistance (aka ability to
       | invert the hash function) of SHA-1 remains unbroken as of today.
       | 
       | Nice to see this bit of intellectual honesty. Would be even nicer
       | if they had explained what that means in terms of PGP keys.
        
         | dward wrote:
         | HMACs do not require collision resistance from the underlying
         | hash to provide secure message authentication. HMAC-MD5 is
         | still considered "secure", although that doesn't mean you
         | should use it.
         | 
         | http://cseweb.ucsd.edu/~mihir/papers/hmac-new.pdf
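The distinction dward draws shows up in how HMAC is used in practice: its security argument rests on PRF-style properties of the compression function, not on collision resistance of the hash. A minimal sketch with Python's standard library (the key and message are made-up illustration values):

```python
import hashlib
import hmac

# HMAC's security rests on pseudorandomness of the hash's compression
# function, not on collision resistance -- which is why HMAC-SHA-1 is
# not broken by a collision attack on SHA-1. (New designs should still
# prefer SHA-256 or better.)
key = b"shared secret key"        # made-up values for illustration
msg = b"message to authenticate"

tag_sha1 = hmac.new(key, msg, hashlib.sha1).hexdigest()
tag_sha256 = hmac.new(key, msg, hashlib.sha256).hexdigest()

# Verification recomputes the tag and compares in constant time.
ok = hmac.compare_digest(tag_sha1, hmac.new(key, msg, hashlib.sha1).hexdigest())
```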
        
         | _notreallyme_ wrote:
         | It means if someone you want to impersonate uses the Web Of
         | Trust, i.e. their key is signed by other people whose keys have
         | been signed the same way, you can generate a GPG key for which
         | all of these signatures are still valid.
         | 
         | For example, if an attacker gains access to a victim email
         | account, they could send to their contacts a "trusted" key (as
         | explained above) and then use it to send signed documents to
         | the victim's contacts.
         | 
         | This would defeat an adversary "paranoid" enough to check a key
         | signature, but not paranoid enough to obtain a clear
        | explanation/confirmation of why the key changed...
        
           | bjornsing wrote:
           | > It means if someone you want to impersonate uses the Web Of
           | Trust, i.e. their key is signed by other people whose keys
           | have been signed the same way, you can generate a GPG key for
           | which all of these signatures are still valid.
           | 
           | No...
           | 
           | > For example, if an attacker gains access to a victim email
           | account, they could send to their contacts a "trusted" key
           | (as explained above) and then use it to send signed documents
           | to the victim's contacts.
           | 
           | Ok... But in this scenario the attacker has the victim's new
           | private key, so they don't need to create a collision (using
           | OP). They can just use the new private key to sign the
           | documents. Right?
        
             | _notreallyme_ wrote:
             | > No...
             | 
              | Why?
             | 
             | > in this scenario the attacker has the victim's new
             | private key
             | 
              | You don't want to keep your private key in cleartext on
              | your email provider's servers, do you?
        
               | makomk wrote:
               | This allows you to take two messages and append some data
               | to both of them which causes the modified versions to
               | have the same SHA-1 hash - but you need to modify both
               | messages, and in order to use this in an attack you need
               | to set up a scenario where the SHA-1 hash of one of your
               | modified messages is trusted for some purpose. Creating a
               | message with the same hash as another, existing message
               | requires a second-preimage attack which is much harder
               | and not feasible for any cryptographic hash that's
               | currently in use.
        
           | im3w1l wrote:
           | > It means if someone you want to impersonate uses the Web Of
           | Trust, i.e. their key is signed by other people whose keys
           | have been signed the same way, you can generate a GPG key for
           | which all of these signatures are still valid.
           | 
           | I'm not really knowledgeable about the implementation details
           | of GPG. Mind explaining how this follows?
        
           | whatshisface wrote:
           | > _This would defeat an adversary "paranoid" enough to check
           | a key signature, but not paranoid enough to obtain a clear
            | explanation/confirmation of why the key changed..._
           | 
           | Thereby turning the signal intelligence problem into a human
           | intelligence problem.
        
       | nneonneo wrote:
       | So to be clear about what this is (because the website doesn't
        | quite clarify): this attack lets you pick two different
        | prefixes P1, P2, then compute some pseudorandom data C1, C2
       | such that SHA1(P1+C1) = SHA1(P2+C2). The length extension
       | property of SHA1 (and MD5) means that now SHA1(P1+C1+X) =
       | SHA1(P2+C2+X) for any X.
       | 
       | A similar attack (which requires only a few hours on modest
       | hardware nowadays) has been known for a long time for MD5, but
       | this is the first time it's been demonstrated for SHA-1.
       | 
        | The previous attack, called SHAttered (https://shattered.io), was
        | a classical collision; that is, they chose a single prefix P and
       | found different C1, C2 such that SHA1(P+C1) = SHA1(P+C2). This
       | can also be length extended, so that SHA1(P+C1+X) = SHA1(P+C2+X).
       | However, this attack is more limited because there is little to
       | no control over the pseudorandom C1 and C2 (the only differing
       | parts of the messages).
       | 
        | With a chosen-prefix collision, though, things are way worse. Now
       | you can create two documents that are arbitrarily different, pad
       | them to the same length, and tack on some extra blocks to make
       | them collide.
       | 
       | Luckily, the first collision should have already warned people to
       | get off of SHA1. It's no longer safe to use for many
       | applications. (Note, generally for basic integrity operations it
       | might be OK since there's no preimage attack, but I'd still be a
       | bit wary myself).
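The length-extension property nneonneo describes follows from SHA-1's Merkle-Damgård structure: once two equal-length messages drive the hash into the same internal state, any common suffix keeps the states (and thus the digests) identical. A toy sketch with a deliberately weak 8-bit compression function (purely illustrative; real SHA-1 uses a 160-bit state, 512-bit blocks, and the colliding-prefix step costs roughly 2^63 SHA-1 computations):

```python
from itertools import product

def toy_hash(msg: bytes, state: int = 0x5A) -> int:
    """Toy Merkle-Damgard hash: one byte per 'block', 8-bit state."""
    for block in msg:
        state = (state * 31 + block) % 256  # weak compression function
    return state

# Brute-force a colliding pair of equal-length prefixes -- the step
# that is astronomically expensive for real SHA-1.
seen = {}
p1 = p2 = None
for cand in product(range(256), repeat=2):
    m = bytes(cand)
    h = toy_hash(m)
    if h in seen:
        p1, p2 = seen[h], m
        break
    seen[h] = m

# The colliding prefixes leave the hash in the same internal state,
# so ANY common suffix X preserves the collision.
x = b"identical trailing data"
same = toy_hash(p1 + x) == toy_hash(p2 + x)
```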
        
         | tptacek wrote:
          | Squeamish Ossifrage's answer to this Crypto Stack Exchange
         | question does a predictably excellent job of putting these
         | attacks in context.
         | 
         | https://crypto.stackexchange.com/questions/60640/does-shatte...
        
           | ReidZB wrote:
           | Agreed. It's a great loss to the community that they have
           | decided to step away from Stack Exchange:
           | https://crypto.meta.stackexchange.com/questions/1361/im-
           | leav...
        
             | gowld wrote:
             | * that the owners of Stack Exchange drove them and others
             | away.
        
         | RickHull wrote:
          | Actually, the website does cover this in the "Q&A" section:
         | 
         | > What is a chosen-prefix collision?
         | 
         | > A classical collision (or identical-prefix collision) for a
         | hash function H is simply two messages M and M' that lead to
         | the same hash output: H(M) = H(M'). Even though this security
         | notion is fundamental in cryptography, exploiting a classical
         | collision for attacks in practice is difficult.
         | 
         | > A chosen-prefix collision is a more constrained (and much
         | more difficult to obtain) type of collision, where two message
         | prefixes P and P' are first given as challenge to the
         | adversary, and his goal is then to compute two messages M and
         | M' such that H(P || M) = H(P' || M'), where || denotes
         | concatenation.
         | 
         | > With such an ability, the attacker can obtain a collision
         | even though prefixes can be chosen arbitrarily (and thus
         | potentially contain some meaningful information). This is
         | particularly impactful when the hash function is used in a
         | digital signature scheme, one of the most common usage of a
         | hash function.
        
           | nneonneo wrote:
           | The confusing thing is that the team behind SHAttered _did_
           | choose a prefix - they just had to choose a common prefix
           | rather than two different prefixes (choosing a common prefix
            | simply changes the initial state used by SHA-1). So I'm
           | hoping that my post clarifies this point a bit.
           | 
           | The SHAttered attack (classical collision) _can_ be used in
           | practice - for example, my project
           | https://github.com/nneonneo/sha1collider exploits their
           | collision to turn any two PDFs into documents with identical
           | SHA-1 hashes.
        
             | gowld wrote:
             | Is any practical application safe from Shattered but not
             | Shambles?
        
         | dickjocke wrote:
         | Can you give a specific example of the danger here? I
         | understand the principle behind the attack (kinda).
         | 
          | I just don't understand what danger being able to pad two
          | documents to make them collide actually poses.
         | 
         | edit: My guess is that it can be abused to make something that
         | I believe to be library X actually be library Y when I download
          | it from the internet. Let's say I want to download something,
         | and I check the signature provided. Assuming the attacker is
         | able to send me the wrong library via a MITM attack, how can
         | this prefix collision work? It seems that the original library
         | AND the original signature on the library's website have not
         | been altered, so their efforts to use this and make them match
         | are impossible. And it seems like if they can alter the
         | signature on the website and stuff, then all bets are off--why
         | not just send the malicious library at that point?
        
           | femto113 wrote:
           | It's a step on the same path that led to being able to spoof
           | a CA for MD5 signed certs.
           | 
           | https://www.google.com/amp/s/techcrunch.com/2008/12/30/md5-c.
           | ..
        
             | ThePowerOfFuet wrote:
             | Please don't feed the cancer which is AMP.
             | 
             | https://techcrunch.com/2008/12/30/md5-collision-creates-
             | rogu...
        
               | sofaofthedamned wrote:
               | Please don't overegg the issue with AMP by comparing it
               | to cancer.
        
               | andrewstuart2 wrote:
               | Don't make cancer anything it's not, either. Cancer is
               | just growth of abnormal cells, unconstrained, to the
               | point that it causes harm to the host.
               | 
               | That said, maybe AMP is more like a virus. A non-living
               | organism that spreads by infecting living organisms and
               | repurposing them to replicate itself instead of
               | sustaining the organism they were a part of.
               | 
               | The more sites adopt AMP, the more everyone else says
               | "well I guess we have to now." Seems pretty viral.
        
           | nneonneo wrote:
           | Suppose your system uses SHA-1 hashes for codesigning
           | verification (e.g. to load a system driver). I create an
           | innocent-looking device driver and convince a signing
           | authority to sign it. However, secretly I've created a
           | malicious driver (e.g. a rootkit) which collides with my
           | innocent one. Now, I can load the malicious one on your
           | machine - which the signing authority has never seen - using
           | the signing certificate of the legitimate one.
           | 
           | This might sound far-fetched; after all, you'd need to
           | convince a signing authority to sign the code. But this is
           | pretty much exactly how Apple's Gatekeeper verification
           | works: your software is submitted to them, and they do some
           | security checks and notarize your bundle
           | (https://developer.apple.com/developer-id/), and I'm sure
           | there's many more such examples out there.
        
             | gowld wrote:
             | > you'd need to convince a signing authority to sign the
             | code.
             | 
             | Not necessarily. You can also self-sign or just announce
             | the hash, let the public inspect and test your good driver
             | for a year, and then ship the bad driver to people who only
              | check the hash.
        
             | dickjocke wrote:
             | Hey OP,
             | 
              | I'm not much of a math guy, and I'm getting hung up on this
             | part:
             | 
             | SHA1(P1+C1+X) = SHA1(P2+C2+X) for any X.
             | 
             | The example above seems like SHA1(GOOD_DRIVER) ==
             | SHA1(BAD_DRIVER+C2+X) somehow.
             | 
              | How do the C1 and X get appended to the signature of the
              | good driver?
        
               | cjm42 wrote:
               | The good driver consists of P1+C1+X. That's what gets
               | sent to the signing authority. They verify it doesn't do
               | anything bad and return a signature listing
               | SHA1(P1+C1+X). But that signature is also valid for the
               | malicious driver P2+C2+X.
        
               | nneonneo wrote:
               | Most executable file formats (including drivers) put the
               | code first followed by the data. So you could construct
               | your drivers thusly:
               | 
               | GOOD_DRIVER = P1 (good code and some data) + C1 (data) +
               | X (more data)
               | 
               | BAD_DRIVER = P2 (bad code and some data) + C2 (data) + X
               | (more data)
               | 
               | You'd disguise the random-looking block of C1 data in the
               | middle of the good driver as e.g. a cryptographic key to
               | avoid suspicion. The "more data" part couldn't be
               | modified in the bad driver, but since you can arbitrarily
               | modify P2 this wouldn't be a severe restriction.
        
               | dickjocke wrote:
                | Thank you OP and everyone. I have that shaky initial
                | understanding, but it makes much more sense now.
        
               | air7 wrote:
                | He said "innocent-_looking_ device driver".
               | 
               | The good driver is actually padded so that it can be
                | later replaced with the bad driver. I.e., this doesn't
               | allow the attacker to replace _any_ driver, but only one
               | they prepared in advance to look innocent but have the
               | right structure.
        
               | dickjocke wrote:
               | that makes a lot more sense, thank you
        
           | Kalium wrote:
           | The core of it is that it means a malicious document will
           | pass the check of authenticity when you are using the
            | _genuine signature_. Someone could tamper with a mirror and
            | infect you that way. Absolutely no tampering with the
            | signature would be required, which is what makes this
            | dangerous. Basically, it can be used to send entirely fake
            | data that you can't tell is fake.
           | 
           | Or someone could say that you have your legally binding
           | signature attached to a contract with hash ABC, but then
           | present a different contract with hash ABC but very different
           | terms.
           | 
           | Those are the end state. This is a major step closer to that
            | end state. Neither of those scenarios is the case _today_,
           | but they're now close enough that it's a matter of time. And
           | likely not a lot of time.
           | 
           | Time to abandon SHA1.
        
       | GhettoMaestro wrote:
        | Lol, and my engineering team called me paranoid when I made us
        | go from SHA1 to SHA256 (it was trivial for us). Guess I'm having
        | the last chuckle now.
        
       | rustybolt wrote:
       | > By renting a GPU cluster online, the entire chosen-prefix
       | collision attack on SHA-1 costed us about 75k USD.
       | 
        | So they just decided to try their attack and spend two years'
        | worth of salary on it?? That's crazy.
        
         | oppositelock wrote:
         | That's dirt cheap for a government actor, and we can be sure
         | that the big governments have been doing this sort of attack
         | for years.
        
           | rustybolt wrote:
           | It's a lot of money for an academic researcher.
        
             | progval wrote:
             | Researchers in applied fields other than CS spend this kind
             | of money on a regular basis. eg.
             | https://www.quora.com/What-are-the-costs-for-lab-rat-
             | testing...
        
             | CydeWeys wrote:
             | Is it? Multi-million grants are common in academia. A lot
             | of research is expensive. When I worked in a lab the
             | materials alone for a single day's experiment would
             | frequently run into the thousands. E.g. the total human RNA
             | samples we used cost many thousands of dollars per
              | _milligram_. Admittedly this is a different field, but
              | it's still academia.
        
               | klmr wrote:
               | > _Multi-million grants are common in academia._
               | 
               | Not in the way you explain. Multi-million dollar grants
               | are usually awarded over multiple years, and pay for
               | multiple researchers' salaries, as well as other
               | resources, some of which are expected to outlast the
               | project (e.g. microscopes, or hardware for databases).
               | Burning 75k on a single experiment (which is effectively
               | what was done here) is rare.
               | 
               | Note that this is true even for current hot topics such
               | as biomedical (e.g. cancer) research, for which vast
               | chunks of the federal budget have been allocated in
               | multiple countries. Obtaining similar sums in less sexy
               | fields is much harder. And even in biomedical research,
               | multi-million dollar grants are considered _large_. Most
                | grants are much smaller; they just don't get talked about
               | as much.
               | 
               | Since you mention human RNA samples I assume you know
               | this. You mention the per-milligram cost but this is
               | pretty misleading if you mean to imply that "milligram"
               | is somehow little, because it isn't: yes, the samples are
               | tiny (<=1 ug of RNA is more typical than milligrams!),
               | but so what? It's not like we need more.
        
         | exikyut wrote:
         | No. It's incredibly generous and selfless. They threw that
         | money away _just_ to incontrovertibly prove a huge point to the
         | entire security community (and all developers, at the end of
         | the day).
         | 
         | Like... chaotic good. Really really good.
        
         | martpie wrote:
         | You can see their emails at the bottom: those are
          | university/research institute domains. You can be sure
         | _they_ did not actually spend that themselves.
        
           | rustybolt wrote:
           | I'm just amazed that they were willing to take the risk that
           | there was a bug in the code and they wouldn't find a
           | collision.
        
             | CiPHPerCoder wrote:
             | This is _cheap_!
             | 
             | Breaking MD5 for the Flame malware probably set the NSA
             | back millions when you sum the person-hours and CPU time
             | together.
        
             | basilgohar wrote:
             | It was a sound premise to investigate. EFF spent much more
             | to demonstrate the weakness of DES decades ago, and it
             | highlighted the need for stronger crypto and the fact that
             | nation states would be more than able to break commonly-
             | used crypto at the time.
        
         | n-gauge wrote:
          | As GPUs get better, the cost will come down. Was it a massive
          | cluster of RTX 2060s or something?
        
           | aidenn0 wrote:
           | 900 GTX-1060s.
        
       | silasdavis wrote:
       | Who paid for this?
        
         | jessant wrote:
         | Judging by the paper[1], I would say any or all of Inria,
         | Nanyang Technological University, and Temasek Laboratories.
         | 
         | [1] https://eprint.iacr.org/2020/014.pdf
        
       | emilfihlman wrote:
       | >Can I try it out for myself? Since our attack on SHA-1 has
        | practical implications, in order to make sure proper
       | countermeasures have been pushed we will wait for some time
       | before releasing source code that allows to generate SHA-1
       | chosen-prefix collisions.
       | 
        | Sigh. Again with this idiocy. Any adversary with the financial
        | means to launch this attack also has the capability to write the
        | exploit themselves.
        
         | DarkWiiPlayer wrote:
         | It does make sense.
         | 
         | Targets worth attacking at a high financial cost will most
         | likely be the first to take measures against this attack.
         | 
         | The kind of target that takes a longer time to switch most
         | likely isn't worth attacking unless it's a very cheap and fast
         | operation.
         | 
         | And the longer you spend developing an exploit, the less viable
         | the attack will become.
        
         | lm28469 wrote:
         | > All instances where the adversary is capable of launching
         | this attack financially mean they also have the capability to
         | write the exploit themselves.
         | 
          | Iran will eventually create a nuclear bomb, so why don't we
          | give them one now? It's the same thing, isn't it?
        
       | whatshisface wrote:
       | > _A countermeasure has been implemented in commit edc36f5,
       | included in GnuPG version 2.2.18 (released on the 25th of
       | November 2019): SHA-1-based identity signatures created after
       | 2019-01-19 are now considered invalid._
       | 
       | Since SHA-1 was always possible to break, and since NSA probably
       | gets access to big computers and sophisticated techniques before
       | researchers, why doesn't this invalidate every SHA-1 signature
       | ever made and not just ones from last year?
        
         | Boulth wrote:
         | Actually it's even worse than that: signature creation time is
         | added by the signer so it's totally under control of the
         | attacker. IMHO all SHA-1 based signatures should be ignored.
        
           | im3w1l wrote:
           | The signer is assumed to be trustworthy. That's why we care
           | for their signature in the first place. It's the sign-ee that
           | is presumed to be the attacker.
        
           | NieDzejkob wrote:
           | In this case, signature creation time isn't under control of
           | the attacker. The attack scenario being considered is that
           | Mallory can convince Alice to sign a key K1, provided by
           | Mallory, such that it looks like Alice signed K2. The party
           | creating the signature is honest here.
        
         | pornel wrote:
         | No, because this is a collision attack (without control over
         | the hash value), not a preimage attack (where you match an
         | existing hash).
         | 
         | We know how to make _pairs_ of new files that collide with each
          | other, but there's no known way of creating a file that
         | collides with something specific that existed before.
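The cost gap pornel points to can be illustrated on a truncated hash (a toy stand-in; real SHA-1 has 160 output bits). For an n-bit output, a birthday-style collision search costs about 2^(n/2) hashes, while inverting one specific value costs about 2^n:

```python
import hashlib
from itertools import count

def trunc_sha1(data: bytes) -> str:
    """SHA-1 truncated to 24 bits -- a toy target, NOT real security."""
    return hashlib.sha1(data).hexdigest()[:6]

# Birthday search: hash arbitrary distinct messages until ANY two
# collide. Expected cost is only ~2^12 hashes for a 24-bit output.
seen = {}
m1 = m2 = None
for i in count():
    m = str(i).encode()
    h = trunc_sha1(m)
    if h in seen:
        m1, m2 = seen[h], m
        break
    seen[h] = m

# A (second-)preimage against one SPECIFIC hash value would instead
# take ~2^24 attempts here -- and ~2^160 for full SHA-1, which is why
# matching a pre-existing file's hash remains out of reach.
```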
        
       | edwintorok wrote:
       | > security level 2 (defined as 112-bit security) in the latest
       | release (Debian Buster); this already prevents dangerous usage of
       | SHA-1
       | 
       | FWIW this doesn't apply to Fedora currently, because it has a
       | patch that re-enables SHA-1 in security level 2 in non-FIPS mode:
       | https://src.fedoraproject.org/rpms/openssl/blob/master/f/ope...
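For reference, the Debian behavior described above is driven by the system-wide openssl.cnf; a sketch of the relevant stanza (section names and file layout vary between distributions and OpenSSL versions, so treat this as illustrative rather than a verbatim Debian file):

```ini
# /etc/ssl/openssl.cnf -- illustrative sketch
openssl_conf = default_conf

[default_conf]
ssl_conf = ssl_sect

[ssl_sect]
system_default = system_default_sect

[system_default_sect]
MinProtocol = TLSv1.2
# SECLEVEL=2 means 112-bit security; SHA-1-signed certificates are
# rejected at this level
CipherString = DEFAULT@SECLEVEL=2
```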
        
       | noja wrote:
       | Is a collision impossible with two hashes, each using a different
       | algorithm?
        
         | femto113 wrote:
         | Not impossible, but assuming there's not a mathematical flaw
          | that affects both algorithms, the difficulty is roughly the
         | product of the difficulty of finding a collision in each. AFAIK
         | no one has come up with a joint collision for MD5+SHA1 despite
         | collisions in each being practical for several years.
        
           | noja wrote:
           | So should we do that, as well as keep finding new algorithms?
        
             | femto113 wrote:
              | Concatenating two hashes is an algorithm. Mathematically,
              | concatenating two 128-bit hashes is not any stronger than a
              | single 256-bit hash (and is likely weaker), but if it's all
              | you have (or all you can afford to compute), two weak
              | hashes are definitely much better than one.
        
               | noja wrote:
                | Why is concatenating two 128-bit hashes (each with a
                | different algorithm) not stronger than a single-algorithm
                | 256-bit hash?
        
               | femto113 wrote:
               | One reason is that it is theoretically possible to use
               | memory instead of computation to attack the combined
               | hashes by pre-generating a large number of collisions
               | under one algorithm and then simply checking those using
               | the other one, which means you don't need to do both
               | algorithms for every check. Can't say for sure if that
               | works out cheaper in terms of money but if the memory is
               | available it could definitely save a lot of time.
               | 
               | Another reason is that for any given hash its theoretical
               | maximum strength against any attack will be less than or
               | equal to its bit length, but the practical strength
               | always trends lower over time as attacks are found, and
               | having two algorithms to attack increases the chances of
               | finding flaws.
        
         | pferde wrote:
          | That is what Gentoo Linux's package manager does - it checks
          | checksums of downloaded files computed with several different
          | hashes (as well as the file size).
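A sketch of that kind of multi-digest check with Python's standard library (the manifest structure here is made up for illustration, not Gentoo's actual Manifest syntax; Gentoo currently records BLAKE2B and SHA512 digests):

```python
import hashlib

# Multi-digest verification: a download passes only if its size and
# EVERY recorded digest match, so forging it would require a
# simultaneous collision in all algorithms at once.
data = b"example distfile contents\n"   # stand-in for a downloaded file
manifest = {                            # stand-in for a Manifest entry
    "size": len(data),
    "sha512": hashlib.sha512(data).hexdigest(),
    "blake2b": hashlib.blake2b(data).hexdigest(),
}

def verify(blob: bytes, entry: dict) -> bool:
    return (
        len(blob) == entry["size"]
        and hashlib.sha512(blob).hexdigest() == entry["sha512"]
        and hashlib.blake2b(blob).hexdigest() == entry["blake2b"]
    )
```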
        
       | LennyWhiteJr wrote:
       | The root certificate authority for my company's Active Directory
       | is signed using a sha1 hash. What are the practical implications
       | of this chosen collision?
       | 
       | How do I convince my IT department to update our CA to sha256?
        
       | HashThis wrote:
       | Quick, everyone bend over and cover your hashes
        
       | jVinc wrote:
       | So how would someone go about gaining more than 45k USD in profit
       | from a single case of using the chosen-prefix collision? Not
        | being snarky here; I am honestly curious. I'd guess that
       | even in situations where you somehow get a signed e-mail sent off
       | spoofing a CEO saying "Please pay these guys 50k$" the actual
       | payout seems unlikely and that puts the attacker 45k in the red.
       | But maybe there are some obvious avenues of abuse that I'm
       | missing, or is this more a case of "In a decade it will become
       | economical to abuse this for profit"?
        
         | DaiPlusPlus wrote:
         | Bitcoin uses SHA-256, but I wouldn't put it past some of the
         | devs of some of the Altcoins and Shitcoins to use SHA-1 for
         | either - or both of - their Proof-of-Work or Blockchain
         | integrity hashing algorithms.
         | 
         | So if I understand today's news correctly, you could use this
          | to break blockchain integrity and offer up alternative "valid"
         | historical blocks (but not cheat at proof-of-work). You would
         | still need to convince a quorum of network nodes to use your
         | fake historical blocks - I imagine this might be doable on
         | lesser-used coins that still have some trades - you could
         | probably combine this with a few pump-and-dump trades too
         | (without costing you anything as the coins you pump would be
         | stolen).
        
         | biggestdecision wrote:
         | At the current time this is much more likely to be abused for
         | political/intelligence reasons, than for profitable criminal
         | reasons. Sneaking a malicious commit with the same signature
         | into a git repository for example.
        
         | racingmars wrote:
         | I'm familiar with an organization that lost about $750,000
         | because someone spoofed an email from the CEO to the CFO asking
         | to wire money to an account. The CFO fell for it. AFAIK, the
         | money was never recovered (nor was the CFO fired... it was all
         | just chalked up to 'the cost of doing business').
         | 
         | That was with NO crypto/signature spoofing involved... if the
         | CFO has now been trained to not act on large dollar amount
         | requests from the CEO without at least checking a digital
         | signature... perhaps the CFO would be _more_ likely to fall for
         | it now since he has been  "trained" that cryptographic
         | signatures are a sign of authenticity?
        
           | jVinc wrote:
           | $750,000 is a lot of money, but seeing as some people will
           | act on emails without signature, and that you would
           | effectively have to invest that amount up front in order to
           | attempt this attack on 15 individuals, and then hope that at
            | least one falls for it just to break even, I can't really
           | see this being a viable attack vector. Maybe if the cost goes
           | down significantly to the $100-$1000 range it might be
           | something you would see in the wild.
        
       | tambourine_man wrote:
       | This kind of thing always brings me down a bit. It's not
        | rational, but it does get to me.
       | 
        | I mean, I truly admire these folks' skills; the math involved is
       | obviously remarkable.
       | 
       | But I think the feeling is related to not being able to rely on
        | anything in our field. It's hard to justify going to the trouble
        | of
       | encrypting your backup. 10 years from now, it might be as good as
       | plain text.
       | 
       | It's not security only, nothing seems to work in the long term.
       | Imagine an engineer receiving a call at midnight about his bridge
       | because gravity changed during daylight saving in a leap year.
       | That's our field.
        
         | flingo wrote:
         | It was always broken like this. These guys just figured that
         | out.
         | 
         | Now we know. It's better to know.
        
         | mlurp wrote:
         | This can really apply to almost everything though, since if I
          | understand you, it seems to be about progress in the face of
          | impermanence. The history of science is basically a constant
         | process of things kind of working, then breaking. But overall
         | it goes forward and things are better off for it. In contrast,
         | I find it kind of inspiring.
        
         | tomphoolery wrote:
         | The thing to remember about cryptography is that as long as
         | it's based around computational power, it can always be broken
         | at some point. Especially if we're building exponentially more
         | powerful computers every N years.
        
           | dorgo wrote:
           | Assuming the universe is infinite in some sense. Otherwise
           | the exponential growth has to stop at some point. So base
           | your cryptography around computational power beyond physical
           | limits and you are safe.
        
         | ChrisCinelli wrote:
         | Even if you could make it secure for all you life time, this is
          | always at play: https://xkcd.com/538/ - In general, probably
          | nobody gives a dime about our backups, but if they did they
          | would find a way to get the data before it is encrypted or
          | while we are decrypting the backup.
        
         | suixo wrote:
         | Last part of this comment made me laugh, trying to imagine
         | someone shouting over Slack "PATCH YOUR BRIDGE, NOW!!"
        
           | wbl wrote:
           | https://en.m.wikipedia.org/wiki/Citigroup_Center It has been
           | done once after they found a bug.
        
         | pathseeker wrote:
          | All of this stuff is so new that we'd expect to see churn. Many
         | people involved in the discovery that asymmetric crypto was
         | even possible (Diffie, Hellman, Rivest, etc) are still alive.
         | If the people that first invented bridges were still alive we'd
         | be running into all kinds of failures in bridge building
         | techniques. Bridges have been built in the last 100 years that
         | the wind blew down!
        
         | gist wrote:
         | The way I see it, this is what is bad:
         | 
         | > We have tried to contact the authors of affected software
         | before announcing this attack, but due to limited resources, we
         | could not notify everyone.
         | 
          | Many in the 'security industrial complex' act as if they are
          | doing god's work with their 'research', but from a common-sense,
          | man/woman-on-the-street, layman's point of view that is not
          | what appears to be going on at all.
         | 
          | What they do is self-serving and a detriment. Then they try to
          | justify it in some way as being good when it's not. It's like
          | going around and compiling a list of doors in a neighborhood
          | that are open, attempting to contact everyone but not reaching
          | some people, and then saying 'hey, look at what we did for
          | you'. Meanwhile, if the list were not compiled at all, almost
          | all people would do just fine.
         | 
         | > But I think the feeling is related to not being able to rely
         | on anything in our field. Hard to justify going to the trouble
         | of encrypting your backup. 10 years from now, it might be as
         | good as plain text.
         | 
          | Figure out the number of issues there would be if there were
          | not such open disclosure and an entire industry surrounding
          | breaking things, vs. the same being done. That is the issue.
         | 
         | > Imagine an engineer receiving a call at midnight about his
         | bridge because gravity changed during daylight saving in a leap
         | year. That's our field.
         | 
          | No, because nobody is actively trying (or able) to change
          | gravity.
        
           | jodrellblank wrote:
           | The idea behind your complaint is " _if we tell good people
           | about risks, then bad people will know about them. If we keep
           | them secret from good people, then bad people won 't find out
           | about them_". Or, " _if we don 't make a list of open doors,
           | then bad people can't find which doors are open_". Which is
           | .. not true. Bad people will already be making their own list
           | of open doors, and sneaking through them without being
           | noticed, over and over again.
           | 
           | > all almost all people would do just fine
           | 
           | Here's an ongoing ransom attack in the UK, started on New
           | Year's Eve: https://www.bbc.com/news/business-51017852 The
           | ransom is in the millions of dollars range, Travelex websites
           | in 30 countries down for a week, Sainsbury's money exchange
           | websites down for the same time, and " _Dates of birth,
           | credit card information and social security numbers [of
           | customers] are all in their possession, they say._ "
           | 
           | Not looking at problems doesn't stop problems from existing.
        
             | gist wrote:
             | > "if we tell good people about risks, then bad people will
             | know about them. If we keep them secret from good people,
             | then bad people won't find out about them". Or, "if we
             | don't make a list of open doors, then bad people can't find
             | which doors are open".
             | 
              | You are not recognizing the nuance, which is typically the
              | case with people who support practically any and all
              | disclosure and think it's good, plain and simple, with
              | almost no downside at all (not the case).
             | 
             | In particular this: "then bad people won't find out about
             | them"
             | 
              | My point is that disclosure makes it simpler for more bad
              | people to find, learn, and inflict damage. It's unclear to
              | me how you could think that isn't what has been and is
              | happening. If someone publishes, say, how to create a card
              | skimmer at a gas station, then more people (who are bad
              | actors) will have what they need to give it a try. Without
              | the disclosure there will still be people who have figured
              | it out, and people who will put in the effort to figure it
              | out, but you must realize vastly fewer will do that,
              | right? The disclosure removes a great deal of friction.
              | 
              | The amount of effort and friction is far lower, and the
              | number of 'bad people' who can be actors is many
              | magnitudes larger (I would argue), as a result of
              | disclosure.
        
               | jodrellblank wrote:
               | What I think you're missing is that the disclosure could
               | come from a bad person who doesn't care about any of your
               | arguments. It's like I'm saying "banks should invest in
               | vaults to protect against theft" and you're saying "that
               | costs money and disruption of building work, what if you
               | just don't talk about where the money is kept". I agree
               | that if people didn't steal the money, that would be
               | nice. But most of the people who are talking about where
               | banks keep money with a view to stealing it, aren't going
               | to shut up in order to keep the money safe, because _they
               | don 't care about keeping the money safe_. So if us bank
               | customers stop talking about it, a) that doesn't keep it
               | quiet, and b) our money gets stolen.
               | 
               | It would be a nice world if we could tell companies about
               | flaws and they fixed them, and nothing went public, but
               | instead we tell companies with "responsible disclosure"
               | and they ignore it, don't spend any effort on it, act
               | incompetently leaving it with first line support people
               | who don't understand it and have no path to escalate it,
               | have no security contacts available for reports, cover it
               | up or deny it or try to silence it with NDA style
               | agreements, prioritise shareholder profit or public image
               | over it, and generally behave irresponsibly in all
               | possible ways that avoid them having to deal with it,
               | with very few companies excepted.
               | 
               | In light of that, public disclosure with all its risks,
               | actually does kick companies into taking action, and
               | closing risk vectors for good. Like companies who say "we
               | put customers first!" but it takes a complaining public
               | Twitter thread for them to even respond at all. Telling
               | people to not take it to Twitter ignores the fact that
               | there's no other way which seems to actually work.
               | 
               | Give an alternative which also gets problems fixed, and
               | I'll be much more in favour of it.
        
           | SAI_Peregrinus wrote:
           | We typically assume that organizations like the NSA, FSB,
           | Mossad, MSS, and the like already know of such attacks.
        
             | gist wrote:
              | The percentage of people impacted by the NSA et al is
              | exceedingly small compared to the pain and impact on
              | everyday citizens and companies from disclosures. Not all
              | disclosures, and for sure there is an upside, but it's out
              | of control and has been for a long time.
             | 
             | The government (in the US) does not have the resources to
             | go after everyone who commits a crime and that would assume
             | they are actually scooping up info and know of the crimes
             | (they aren't and they don't). They don't even have the
             | resources to audit tax returns (other than a very small
             | percentage). This idea that you are being watched all the
              | time is fantasy, at least in the US. Other countries?
              | Unfortunate when that's the case, but that does not mean
              | that, as a US citizen, I can't view it as detrimental to
              | me that this type of security disclosure makes it so easy
              | for hackers to do a better job (and it does; nobody is
              | going to dispute that, right?).
        
               | jessaustin wrote:
               | _The government (in the US) does not have the resources
               | to go after everyone..._
               | 
               | In point of fact, since Snowden's revelations we know not
               | only that the USA state actually does have the resources
               | to monitor everyone, but also that it does so.
               | Furthermore, the state is not a monolith. While it may
               | not be in the interest of the state as a whole to
               | capriciously victimize individual humans, it is often in
               | the interest of particular officers and organizations
               | that comprise the state to do so. Cf. "parallel
               | construction".
        
         | WorldMaker wrote:
         | There's an MC Frontalot song called "Secrets from the Future"
         | and the refrain is "You can't hide secrets from the future."
         | It's something of a useful mantra to remind oneself that if
         | "the future" is a part of your threat model, yes your
         | encryption likely isn't enough because on a long enough
         | timescale it is likely "the future" will crack it.
         | 
         | As with any other security issue, the question is "what is your
         | threat model?" You can still justify encrypting your backup
         | today if your threat model includes today's actors, however
         | much you worry about "the future".
         | 
         | > 10 years from now, it might be as good as plain text.
         | 
         | Or 10 years from now it might be the next Linear A tablets to
         | confuse cryptoarcheologists, unreadable and untranslatable and
         | entirely foreign. If "the future" is in your threat model,
         | don't forget the other fun forms of entropy beyond encryption
         | being cracked such as encodings changing, file formats falling
         | out of service/compatibility, "common knowledge" about slang or
         | memes lost to the ages leaving things indecipherable, and so on
         | and so forth. Most of those things are probably unlikely on
         | only a 10 year time horizon, but you never can tell with "the
         | future".
        
           | MaxBarraclough wrote:
           | > You can't hide secrets from the future
           | 
           | With the usual disclaimer about one-time pads.
           | 
           | > other fun forms of entropy beyond encryption being cracked
           | such as encodings changing, file formats falling out of
           | service/compatibility, "common knowledge" about slang or
           | memes lost to the ages leaving things indecipherable
           | 
           | I wonder what the digital archaeologists of the future will
           | make of today's programming languages.
           | 
           | (I was going to point out how far we've come since the
           | languages of yesteryear, but we still have such horror-shows
           | as Perl and Bash.)
        
           | xapata wrote:
           | > Or 10 years from now it might be the next Linear A tablets
           | 
           | At some point in the future, after the universe has expanded
           | to the extent that other galaxies are moving away from ours
           | faster than the speed of light, someone might read our
           | astronomy papers and wonder whether they can believe
           | something that cannot be verified.
        
           | CobrastanJorji wrote:
           | The point about archeologists is a good one because it speaks
           | to motive. In general, we should be very supportive of the
           | efforts of distant historians who want to understand what
           | humanity used to be like. We should not WANT to hide secrets
           | from a sufficiently far future. I can't think of any secret
           | that deserves to be hidden from them for any reason besides
           | perhaps modesty.
        
             | WorldMaker wrote:
             | Relatedly, that is a part of why I love the term
             | "cryptoarcheology" in general, as a reminder that digital
             | spaces will need archeologists too.
             | 
              | There's its somewhat shortened form "cryptarch", generally
              | used more as a title ("cryptoarcheologist"), which was used
             | in places in the Quantum Thief trilogy of books and is most
             | probably already burned into your brain if you have played
             | any of the Destiny series of videogames (and I presume was
             | heavily influenced by Quantum Thief).
        
               | moyix wrote:
               | Hmm, I thought cryptarch was just crypt + arch with the
               | arch part meaning "leader" (i.e. this sense
               | https://www.etymonline.com/word/arch-), not
               | archaeologist. Is there something about this in the
               | Quantum Thief trilogy I've forgotten?
        
               | WorldMaker wrote:
               | It's been a bit since I read it, but I recall the meaning
               | overlap/double meaning between leader -arch (such as
               | plutarch) and arch- from archaeo- was directly
               | intentional word play in the book trilogy, and yes that
               | leader -arch meaning does add spice to the neologism.
               | 
               | (I don't think Destiny does much with the playful dual
               | meaning, though. Certainly the cryptarchs in Destiny have
               | never yet been meaningful leaders.)
        
             | jessant wrote:
             | What is considered sufficiently far future may change with
             | life extension technology.
        
               | thekyle wrote:
               | Well I would assume that as long as you live you'll keep
               | updating your crypto as new tech comes out. That way the
               | clock only starts ticking when you die.
        
           | SheepSlapper wrote:
           | "You can't hide secrets from the future with math. You can
           | try, but I bet that in the future they laugh at the half-
           | assed schemes and algorithms amassed to enforce cryptographs
           | in the past."
           | 
           | Didn't expect to see Front on HN today, what a pleasant
           | surprise.
        
             | b4n4n4p4nd4 wrote:
             | _coughcaesarciphercough_
        
           | toppy wrote:
            | This biblical prophecy from Luke 12:3 is so true: " What
           | you have said in the dark will be heard in the daylight, and
           | what you have whispered in the ear in the inner rooms will be
           | proclaimed from the roofs."
        
             | shrimpx wrote:
              | I wonder what the intended meaning is. I presume it has to
              | do with an eventual urge to confess.
        
               | toyg wrote:
               | It's basically "don't try to get fresh with God". Which
               | conveniently translates into a call to humility and
               | honesty towards its earthly representatives.
        
               | PhasmaFelis wrote:
               | In context, it appears to be saying "you can't keep
               | secrets from God."
        
         | comicjk wrote:
         | Bridge engineers don't have to fear the progress of science
         | working against them, but computer security is not alone here.
         | Consider designing body armor or military aircraft and hoping
         | that the state of the art will stay the same! An adversary who
         | can use the progress of science against you is always
         | dangerous. Computer security has been rather lucky so far: the
         | asymmetry between hashing and cracking a hash, for example, is
         | much more favorable to the defense than the balance between
         | bulletproof vests and bullets.
        
           | WorldMaker wrote:
           | On a long enough time scale Bridge Engineers still have to
           | worry about decay, entropy, collisions. Some of that can even
           | be attributed to the "progress of science", as bridges have
           | collapsed because earlier assumptions were invalidated by
           | larger trucks, for instance.
           | 
           | An issue so far for computer security is less that decay and
           | entropy happen, but that they happen _so fast_ that the
           | timescales are decades or even years rather than lifetimes.
        
             | candu wrote:
             | This is a great point in general: sometimes the problem
             | isn't an adversary using knowledge in your field against
             | you, it's the unintended consequences of progress / changes
             | in adjacent fields.
             | 
             | It also underscores that, when dealing with things like
             | passwords, it's helpful to be able to seamlessly upgrade to
             | more robust methods down the line, e.g. "if this password
             | hash was generated using the old method, check the login
             | using that method but rehash and update the database using
             | the new method after authentication."
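[Editor's note] The upgrade-on-login flow described above can be sketched roughly as follows. This is a toy illustration only; the record layout, scheme names, and iteration count are invented for the example, and legacy unsalted SHA-1 is swapped out for salted PBKDF2-HMAC-SHA-256 as the assumed "new method":

```python
import hashlib
import hmac
import os

def legacy_hash(password: bytes) -> str:
    # The old, weak method (unsalted SHA-1) we want to migrate away from.
    return hashlib.sha1(password).hexdigest()

def modern_hash(password: bytes, salt: bytes) -> str:
    # The new method: salted PBKDF2-HMAC-SHA-256 with many iterations.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000).hex()

def login(db: dict, user: str, password: bytes) -> bool:
    record = db[user]
    if record["scheme"] == "legacy":
        if not hmac.compare_digest(legacy_hash(password), record["hash"]):
            return False
        # Password verified against the old hash: transparently rehash
        # and update the stored record, as the comment above describes.
        salt = os.urandom(16)
        db[user] = {"scheme": "pbkdf2", "salt": salt.hex(),
                    "hash": modern_hash(password, salt)}
        return True
    salt = bytes.fromhex(record["salt"])
    return hmac.compare_digest(modern_hash(password, salt), record["hash"])
```

The key point is that the upgrade happens at the only moment the plaintext password is available: a successful login.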
        
           | jtmarmon wrote:
           | The problem is that the expectations of software are much
           | more akin to the expectations of a bridge than those of a
          | bulletproof vest.
        
         | [deleted]
        
         | wyday wrote:
         | >> "Hard to justify going to the trouble of encrypting your
         | backup."
         | 
         | Huh? If you're "encrypting" using SHA, I've got some bad news
         | about those backups of yours.
        
           | LadyCailin wrote:
           | If you use SHA-256 to encrypt your backup, then I just need
           | to steal your backup and wait 20 years, until that is
           | cracked, and then I can decrypt your backup, even though
           | today you're using the "correct" encryption.
        
             | riquito wrote:
              | The GP was likely hinting at SHA-1 being a hashing
              | function, not an encryption function, so just applying
              | sha* wouldn't produce a working backup.
        
               | SAI_Peregrinus wrote:
               | To be excessively pedantic you can encrypt securely (but
               | slowly for the SHA series) with a hash function of
               | sufficient capacity by running the hash function in CTR
               | mode. You turn it into a stream cipher. Ideally you also
               | MAC the ciphertext, nonce, and other associated data.
                | That's pretty easy with such a hash function (either
               | use HMAC or a MAC mode of the hash function if
               | supported).
               | 
               | Salsa20 & ChaCha20 cores are hash functions (though not
               | collision-resistant compression functions since it's not
               | needed for their design goal and would be slower) run in
               | CTR mode.[1]
               | 
               | [1] https://cr.yp.to/salsa20.html
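[Editor's note] The construction described in this comment, a hash run in CTR mode as a stream cipher with encrypt-then-MAC, can be sketched as below. A toy illustration only, not production cryptography: it is slow, and a real design would at least derive separate keys for the cipher and the MAC rather than reuse one key as done here:

```python
import hashlib
import hmac

TAG_LEN = 32  # SHA-256 output size

def hash_ctr_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # CTR mode: hash key || nonce || counter to produce keystream blocks.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # XOR plaintext with the keystream, then MAC the nonce and
    # ciphertext (encrypt-then-MAC).
    ks = hash_ctr_keystream(key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, ks))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return ct + tag

def decrypt(key: bytes, nonce: bytes, blob: bytes) -> bytes:
    ct, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    ks = hash_ctr_keystream(key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, ks))
```

Because CTR mode just XORs a keystream, the same keystream function serves both directions; only the MAC check distinguishes decryption.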
        
               | caspper69 wrote:
               | > To be excessively pedantic
               | 
               | This is the best, most delicious, type of pedantry
               | friend!
        
               | gpm wrote:
               | It probably will if your data is less than 128 bytes, and
               | you're willing to wait a few decades to decrypt it.
        
               | ksangeelee wrote:
               | You might be able to find bytes that result in your hash,
               | but they probably won't be the same bytes you 'backed
               | up'.
        
               | gpm wrote:
               | If the data is shorter than the hash shouldn't it be the
               | same data I backed up with reasonably high probability?
        
               | dorgo wrote:
                | I guess you get infinitely many results which all have
                | the same hash, and one (or more) of them will be shorter
                | than the hash.
        
               | pathseeker wrote:
               | No. http://matt.might.net/articles/counting-hash-
               | collisions/
        
               | monktastic1 wrote:
               | Can you explain the relevance? If I put N items randomly
               | into >> N buckets the chance of there being a second item
               | in a _particular_ bucket is small (as opposed to there
               | merely _being_ a bucket with two items, as in the
               | birthday  "paradox").
        
               | DuskStar wrote:
               | That doesn't apply here, since the birthday paradox is
               | about the _existence_ of a collision, not that any
               | particular sequence collides.
               | 
               | Most people in the room will still have unique birthdays
               | even if one pair share theirs.
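[Editor's note] The distinction drawn above is easy to check numerically: among 23 people the chance that *some* pair shares a birthday is about 50%, while the chance that one *particular* person shares theirs is only about 6%. A quick illustration (not from the thread):

```python
from math import prod

def p_any_shared_birthday(n: int, days: int = 365) -> float:
    # Probability that *some* pair among n people shares a birthday
    # (the birthday-paradox case): 1 minus the all-distinct probability.
    return 1 - prod((days - i) / days for i in range(n))

def p_my_birthday_shared(n: int, days: int = 365) -> float:
    # Probability that one *particular* person shares a birthday with
    # any of the other n-1 people.
    return 1 - ((days - 1) / days) ** (n - 1)
```

The gap between the two probabilities (roughly 0.51 vs. 0.06 at n=23) is exactly why collision attacks are so much cheaper than second-preimage attacks.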
        
               | [deleted]
        
               | jodrellblank wrote:
               | If you're planning to brute-force count through 2^(128x8)
               | possible bit inputs, it will be quite a few decades
               | indeed. And you'll need a few spare solar systems to
               | annihilate to get enough energy to drive your counting
               | engine through that many states.
               | 
               | https://security.stackexchange.com/a/6149/1427
        
               | gpm wrote:
               | The idea is to wait for a preimage attack on sha, not
               | brute force it.
        
               | im3w1l wrote:
                | As an aside, SHA-1's output is smaller than 128 bytes
                | (it is 20 bytes).
                | 
                | From my numerical experiments (I hope I didn't mess
                | up...) using the random oracle model, the probability
                | that a given input is collision-free is 99.6% if the
                | input is one byte shorter than the hash, 1/e if the
                | input is the same size as the hash, and 6.6e-112 if the
                | input is one byte longer than the hash.
                | 
                | And this holds basically irrespective of hash size.
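[Editor's note] These figures check out under the random-oracle approximation: for inputs d bytes longer than an n-bit digest there are about 2^(n+8d) candidate inputs, each hitting a given digest with probability 2^-n, so the collision-free probability is about exp(-2^(8d)), independent of n. A quick sanity check (the function is illustrative, not from the thread):

```python
import math

def p_collision_free(delta_bytes: int) -> float:
    # Random-oracle model: ~2^(n + 8*delta) inputs of that length each
    # hit our n-bit digest with probability 2^-n, so the chance that
    # none of them collides is approximately exp(-2^(8*delta)).
    return math.exp(-2.0 ** (8 * delta_bytes))
```

Evaluating at delta = -1, 0, +1 reproduces the 99.6%, 1/e, and 6.6e-112 figures quoted above.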
        
           | tambourine_man wrote:
            | I'm referring to not being able to rely on encryption in
            | the long term.
        
             | vbrandl wrote:
             | Hashing is a separate problem from encryption. There is no
             | proof that one way functions (the idea behind hashing) even
             | exist (by proving this, you would actually prove P!=NP,
              | IIRC). Encryption has a slightly better track record
              | against being broken. AES still holds its promise and is also
             | secure against quantum computing (you might want longer
             | keys, but that's it).
             | 
             | And if you want really, provably unbreakable encryption,
              | there is still OTP. But then you'd need a key that is as
             | long as the data you want to encrypt.
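[Editor's note] The one-time pad mentioned above is only a few lines: XOR the message with a truly random key of equal length, and never reuse the key. A toy sketch (key storage and single-use discipline are the hard parts in practice):

```python
import os

def otp_encrypt(plaintext: bytes):
    # The pad must be truly random, as long as the message, and used once.
    key = os.urandom(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext

def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # XOR is its own inverse, so decryption is the same operation.
    return bytes(c ^ k for c, k in zip(ciphertext, key))
```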
        
               | blattimwind wrote:
               | The best known attack against AES reduces attack
               | complexity by about two bits over brute force. Given the
               | history of block ciphers, the idea that AES might not be
               | broken in this life time is not uncommon.
        
           | harikb wrote:
            | I think GP was talking about the general nature of
            | "previously assumed to be unbreakable" methods being broken.
            | Not sure if he was implying using a checksum for encryption.
        
             | PeterisP wrote:
             | What do you mean by "previously assumed to be unbreakable"
             | ? SHA-1 has been known to be unsafe for a dozen years, we
             | just went from "assumed to be breakable" to "yep,
             | definitely breakable, here's how one exact attack will
             | work".
        
               | rakoo wrote:
                | But backups have existed for more than a dozen years.
                | And SHA-1's replacements today, SHA-256 and SHA-3, will
                | also be broken if you wait long enough.
        
               | jessaustin wrote:
               | I can see why backups might be needed for a dozen years,
               | and I can see why encrypted backups might be needed, but
               | outside plainly fake requirements like those of "national
               | security" why would encrypted backups be needed for a
               | dozen years? Aren't we throwing everything sensitive away
               | after seven years? After that isn't it mostly about
               | preserving history? Even things like balance sheets that
               | might be sensitive today will be too out-of-date to be
               | sensitive a dozen years from now.
        
         | nitwit005 wrote:
         | > Hard to justify going to the trouble of encrypting your
         | backup. 10 years from now, it might be as good as plain text.
         | 
         | Realistically, any archive is going to have to re-record data
         | to stay ahead of age and equipment becoming obsolete. I've
         | heard the lifetime for the physical media of tape backups is in
         | the 10-30 year range.
         | 
         | Updating the encryption isn't a big deal once you're already
         | rewriting everything.
        
         | harikb wrote:
          | At least in our case the problem can be solved by
          | re-encryption. Yes, we have to keep up to date with
          | developments before everything is completely broken, but it is
          | not as bad as discovering a bridge needs to be rebuilt.
          | Speaking of which, it is common to have to retrofit for
          | earthquakes, which probably wasn't the rule when they built
          | the original bridge.
        
           | hansvm wrote:
           | That depends on your threat model. A patient attacker can
           | store all of your encrypted messages till some future point
           | where decryption is economically feasible. Since they already
           | have the weakly encrypted data, encrypting it again doesn't
           | solve anything.
        
         | baddox wrote:
         | > Hard to justify going to the trouble of encrypting your
         | backup. 10 years from now, it might be as good as plain text.
         | 
         | When has that happened? Public key cryptography and symmetric
         | key cryptography are still doing fine as far as I'm aware, and
         | the latter doesn't even seem to be vulnerable to quantum
         | computing.
         | 
         | Moreover, SHA-1 has been considered insecure for, what, at
         | least 10 years? The fact that a cryptographic hash function has
         | been widely considered insecure and widely recommended for
         | deprecation a decade before a proof of concept even emerges is,
         | to me, something to feel very good about.
        
           | rakoo wrote:
           | "Public/symmetric key cryptography" is just the name of the
           | practice, of course it's doing fine. What's not doing fine is
           | picking a particular set of ciphers/hash functions/signature
            | schemes and expecting them not to fail in 40 years.
        
             | xmprt wrote:
             | Encrypting is still better than not encrypting and if you
             | care about keeping your data private then you can take
             | additional measures to ensure that. Nothing will last
             | forever but that's not a good reason to be nihilistic.
        
               | mattr47 wrote:
                | It's like locking my front door. I choose to lock the
                | door to prevent random acts of burglary. But that lock
                | will not stop someone determined from BnE-ing my house.
        
               | gowld wrote:
                | DES encryption will stop someone from BnE-ing your data.
               | 
               | They might BnE your key though, if you aren't careful
               | enough.
        
             | xoa wrote:
             | Incorrect, particular sets of ciphers are themselves doing
             | just fine. There is no hint of well studied symmetric-key
             | algorithms (like AES, which is in fact now over 20 years
             | old) with long known sufficient key lengths "failing" (in
             | terms of cold ciphertext being cracked) in 40 years, or 400
             | years for that matter. Even with a fully scalable general
             | quantum computer, the best known improvement is Grover's
             | Algorithm, which results in roughly a quadratic improvement
             | (so a 128-bit key cracked in 2^64 iterations, 256-bit in
             | 2^128, etc). This obviously is trivially defeated by
             | doubling the key length, and AES-256 has been available
             | from the beginning and has been becoming more standard for
             | a while as well.
             | 
             | Now, there are various kinds of hot attacks, such as side
             | channel leaks, that always present a serious challenge in
             | many threat scenarios. And it's certainly possible to have
             | bad implementations of a solid algorithm. But for AES,
             | implementations are very mature, and side-channel attacks
             | do not apply to data at rest. So in the context of "will my
             | encrypted ciphertext with all my deepest secrets [sitting
             | on a drive/in the cloud/wherever] be crackable in 40 years"
             | no, symmetric-key crypto with 256-bit keys is dependable at
             | this point.
             | 
             | A false sense of defeatism is itself plenty dangerous to
             | security.
        
               | ReidZB wrote:
               | DES is a good example here: it was designed ~45 years ago
               | and it's still theoretically surprisingly strong when you
               | compare it with other areas of cryptography. By that, I
               | mean it's held up very well to cryptanalysis that
               | attempts to find faster-than-brute-force techniques, with
               | the best attack taking 2^43 time (vs. the 2^56 brute-
               | force time).
               | 
               | To be clear, 2^56 is trivially brute-force-able today,
               | but that can be mitigated with constructs like 3DES,
               | which are still (barely) secure despite being based on a
               | 45-year-old cipher.
               | 
               | (So no one mistakes my intent, there are many reasons to
               | prefer AES over DES. I just wanted to provide it as an
               | example, especially since it happened to line up with the
               | 40-year timeframe.)
        
             | wbl wrote:
             | Triple DES remains as secure today as in 1975.
        
         | saurik wrote:
         | FWIW, a lot of very smart people seem to think that we actually
         | do have symmetric encryption in a state where it "works" and is
         | "understood" and that AES is unlikely to be "broken". What we
         | don't really feel so great about is asymmetric encryption, but
         | at least that feels like something P != NP might imply can be
         | done... hashing algorithms just have this problem where if you
         | glance at the information theory you would have guessed they
         | couldn't be possible until someone shows you one that seems to
         | work.
        
           | fanf2 wrote:
           | Yes: for specific examples of this confidence see JP
           | Aumasson's paper https://eprint.iacr.org/2019/1492 "Too much
           | crypto"
        
           | im3w1l wrote:
           | AES may be fine, but what about the block cipher modes?
        
           | mehrdadn wrote:
           | > something P != NP might imply can be done... hashing
           | algorithms just have this problem where if you glance at the
           | information theory you would have guessed they couldn't be
           | possible until someone shows you one that seems to work
           | 
           | I'm confused what you mean by this? Why does the info theory
           | suggest hashing wouldn't be possible? Also, you can easily
           | derive a secure hash function from a secure symmetric cipher
           | and vice-versa, so why would one seem to be possible but not
           | the other?
        
             | saurik wrote:
             | 1) The premise of a hash function is that you are going to
             | take a large set of inputs and map it to a small set of
             | outputs such that you don't find collisions, with the use
             | case that the result is somehow so unique that the input
             | file can be henceforth named by its hash and looked up in
             | some giant dictionary of everything we have so far ever
             | bothered to hash.
             | 
             | The hash is this tiny bit of information, and somehow is
             | expected to be sufficient to uniquely describe some
             | arbitrarily large amount of information? That just flies in
             | the face of the naive expectation. The only argument that
             | it is even sort of reasonable to expect would be the idea
             | that while there are an insane number of files very similar
             | but not quite your file, very few of them are "interesting"
             | to a human in any way, and so the set of "interesting"
             | files might be super small... but it isn't like we designed
             | the hash functions to know that.
             | 
             | The reality that it seemingly can be done well enough that
             | no one notices most of the time is fascinating and
             | surprising--the kind of thing that inspires awe and wonder
             | at the universe if true or calls into question our hubris
             | if it isn't--not something obvious in the way the
             | existence of symmetric ciphers is, nor something
             | reasonable to expect in the way asymmetric ciphers are:
             | to the extent that reality lets us pull this one off, we
             | should be thankful.
             | 
             | 2) If one can truly "easily" derive a "secure" hash
             | function from a secure symmetric cipher, then why don't we
             | ever have secure hash functions, despite seemingly having
             | secure symmetric ciphers? If this is so easy, I suggest you
             | do it and then wait 30 years and see if someone figures out
             | how to break it (and we can see it AES is still holding
             | strong).
        
               | mehrdadn wrote:
               | 1) Oh I see what you mean now. Yeah I guess it depends on
               | your intuition.
               | 
               | 2) I mean, I'm not sure how correct your premise that
               | symmetric ciphers are more secure than hash functions is,
               | but it literally is something that is done. You can read
               | more about it in [1], including the possible pitfalls.
               | The transformation should provide more than enough
               | intuition to see why both are equally plausible, which
               | was the point of my reply. Whether or not it's best to
               | actually implement them that way in practice is a
               | separate question which I'm not trying to answer here.
               | 
               | [1] https://crypto.stackexchange.com/a/6476
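The cipher-to-hash direction discussed in [1] is commonly done with the Davies-Meyer construction inside a Merkle-Damgård loop: each message block keys the cipher, the chaining value is the plaintext, and the output is xored back in. A toy sketch in Python, using XTEA as the block cipher and my own padding choices (illustration only, not a real hash):

```python
import struct

MASK32 = 0xFFFFFFFF
DELTA = 0x9E3779B9

def xtea_encrypt(v0, v1, key, rounds=32):
    """XTEA block cipher: 64-bit block (v0, v1), 128-bit key (4 x u32)."""
    s = 0
    for _ in range(rounds):
        v0 = (v0 + ((((v1 << 4) ^ (v1 >> 5)) + v1) ^ (s + key[s & 3]))) & MASK32
        s = (s + DELTA) & MASK32
        v1 = (v1 + ((((v0 << 4) ^ (v0 >> 5)) + v0) ^ (s + key[(s >> 11) & 3]))) & MASK32
    return v0, v1

def toy_hash(msg: bytes) -> bytes:
    """Davies-Meyer: H' = E_m(H) xor H, chained Merkle-Damgard style.
    Padding: 0x80, zeros, then the 8-byte message length (as in MD/SHA)."""
    padded = msg + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % 16)
    padded += struct.pack(">Q", len(msg))
    h0, h1 = 0x67452301, 0xEFCDAB89              # arbitrary fixed IV
    for i in range(0, len(padded), 16):
        key = struct.unpack(">4I", padded[i:i + 16])  # message block keys the cipher
        e0, e1 = xtea_encrypt(h0, h1, key)
        h0, h1 = e0 ^ h0, e1 ^ h1                # the feed-forward xor
    return struct.pack(">2I", h0, h1)
```

The feed-forward xor is the important bit: it is what makes the compression step one-way even though the underlying cipher is invertible.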
        
               | silasdavis wrote:
               | >The hash is this tiny bit of information, and somehow is
               | expected to be sufficient to uniquely describe some
               | arbitrarily large amount of information
               | 
                | It doesn't really describe the information, it just
                | indexes each piece. For that it has a very large key
                | space: a 128-bit key could index 10^38 objects. That's
                | a lot of slots. Provided your hash has sufficient
                | mixing, you will need a lot of shots to hit anything.
               | 
               | >If one can truly "easily" derive a "secure" hash
               | function
               | 
                | Semantic security has a precise definition. We just
                | have to make empirical assumptions about whether an
                | input cipher is semantically secure. The hash is only
                | secure if the cipher was, but the derivation is easy.
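The slot-counting above is the birthday bound in disguise: for an n-bit hash, accidental collisions become likely not after 2^n inputs but around 2^(n/2). A quick stdlib check (function name is mine):

```python
import math

def birthday_collision_prob(n_bits: int, k: float) -> float:
    """Approximate probability of at least one collision among k random
    values hashed into 2^n_bits slots: p ~= 1 - exp(-k^2 / 2^(n_bits+1))."""
    return -math.expm1(-(k * k) / 2.0 ** (n_bits + 1))

print(f"{2**128:.2e} slots in a 128-bit space")
# Collisions only get likely around sqrt(#slots) = 2^64 random inputs:
print(birthday_collision_prob(128, 2.0 ** 64))   # ~0.39
print(birthday_collision_prob(128, 2.0 ** 20))   # negligibly small
```

This is also why a 160-bit hash like SHA-1 only ever promised ~80 bits of collision resistance, and why the attacks here (2^63 and below) matter.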
        
         | hannob wrote:
         | > Hard to justify going to the trouble of encrypting your
         | backup. 10 years from now, it might be as good as plain text.
         | 
         | That is an absurd statement. Every backup you encrypted 10
         | years ago with then up-to-date security is still well
         | encrypted. The SHA-1 break has been coming over a timespan of
         | 15 years, and it still doesn't threaten past usage; only
         | active attacks are really relevant.
        
         | Izmaki wrote:
         | No, the bridge collapsed because it was never touched again
         | after initial deployment, for 10 years. How are the buildings
         | doing in Chernobyl? Don't neglect your data if you want to
         | keep it safe. :P
        
           | tambourine_man wrote:
           | A counterexample is the Roman aqueducts
        
           | app4soft wrote:
           | Dude, the Chernobyl reactor was, by design, practically a
           | nuclear bomb.
           | 
           | P.S. My father (still alive) was one of thousands of
           | ordinary liquidators of the Chernobyl disaster from May to
           | July 1986. Many of his coworkers who lived with him in the
           | same tent are already dead.
        
           | spease wrote:
           | > How are buildings doing in Chernobyl?
           | 
           | Not great, but not terrible?
        
       | femto113 wrote:
       | While a meaningful accomplishment, suggesting the algorithm is
       | in a "shambles" seems hyperbolic to me. For one thing there's a
       | non-trivial practical leap between computing two colliding
       | documents and forging a collision with an existing one, and for
       | another this was only modestly better than a pure brute-force
       | attack. If anything I'm somewhat reassured by the idea that it
       | still costs $40,000+ of GPU time to pull something like this
       | off, while doing the same with MD5 is feasible on a mobile
       | phone.
        
       | jwilk wrote:
       | Link to the GnuPG commit:
       | 
       | https://dev.gnupg.org/rGedc36f59fcfc
        
       | RcouF1uZ4gsC wrote:
       | Does this affect Git? I believe it uses SHA-1 for commits. Is it
       | possible to use this attack to add malicious code to a git
       | repository without changing the hashes for the commits?
        
         | LennyWhiteJr wrote:
         | Potentially. GitHub, at least, already deployed collision
         | detection after SHAttered was published.
         | 
         | https://github.blog/2017-03-20-sha-1-collision-detection-on-...
        
       | jlokier wrote:
       | Just a curiosity, since people are talking about Git still using
       | SHA-1 (despite work on SHA-256 since 2017).
       | 
       | I see that Git doesn't actually use SHA-1 any more, it uses
       | "hardened SHA-1":
       | https://stackoverflow.com/questions/10434326/hash-collision-...
        
         | rst wrote:
         | Well, according to that reference, it's hardened against a
         | specific, previously known attack. Do you have any information
         | on whether that also protects against the different, new attack
         | which was just published?
        
           | jlokier wrote:
           | I was wondering the same thing, and hoping someone else would
           | answer that.
        
             | joeyh wrote:
             | Hardened sha1 does detect this new attack. Easy to test:
             | Check their pair of files into a git repo and see that they
             | have different checksums, while sha1sum(1) generates the
             | same for both.
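For anyone who wants to see why that test behaves as it does: git never hashes the raw file, it hashes a `blob <size>\0` header plus the contents (and, since the SHAttered response, runs the collision-detecting SHA-1 over that input). A sketch of the object-ID computation using plain hashlib, which reproduces git's IDs but not the collision detection:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute a git blob object ID: SHA-1 over 'blob <len>\\0' + content.
    (Real git additionally runs Stevens' sha1collisiondetection here.)"""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Matches `git hash-object` on an empty file:
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Note the header is itself a small defense: prepending the same length prefix to both files changes the internal state before the colliding blocks are processed, so files that collide under plain sha1sum(1) don't automatically collide as git blobs; an attacker has to target the blob encoding directly.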
        
           | tialaramex wrote:
           | Not so much the specific attack, as the broad class of
           | attacks. I think this new work is in that same broad class
           | but I am not a mathematician.
           | 
           | The idea in Marc Stevens' anti-collision work is that some
           | inputs are "disturbance vectors" which do unusual things to
           | the SHA-1 internals, and we want to detect those and handle
           | that case specially since there is almost no chance of that
           | happening by accident. It has a list of such vectors found
           | during his research.
           | 
           | This paper doesn't talk about "disturbance vectors" but it
           | does discuss ideas like "Boomerangs" which I think ends up
           | being similar - I just don't understand the mathematics
           | enough to know whether that means "the same" or not.
        
         | FactolSarin wrote:
         | How would an attack on a git repo work? You create a repo with
         | identical hashes but different content and next time the user
         | clones from scratch they get your modified version?
        
           | flatiron wrote:
           | yeah my thoughts about git are similar. look at the two
           | messages they have as an example:
           | 
           | Key is part of a collision! It's a
           | trap!yE'NsbK#dhW]{1gKmCx's/vr| -pJO_,1$)uB1qXv#U)9ESU;p~0G:Y 
           | [?]bBIjFranweom3&t'lB_!h5M([,[?]QMK#|o5pv|i,+yYp[?]D7_Rf\'GUZ
           | , _[?]dvAYAugV=Lk8_E_ 2 +nolBtxXoQt
           | &+?Y3LP:'Qt(,u[?]WJm:A"M6<|B4kVvtA=M+m%Sha j5N|EMA\Ed-
           | s&@u@:a?pq^Xf0U?R}
           | 
           | and
           | 
           | Practical SHA-1 chosen-prefix
           | collision!'lka}vbI3,*W]A+gK}Cxs/v&r| }-hRJO_ rO;bzC ,1&uRP-
           | MXrU3aO;pr0:sY'2 l&r7#(A{oNyCJ_W,8
           | @bHBYeepFr2a8#&t+n_15q(_,?QMW#hzYMgVV=L,kO0E*N
           | +oc@BpXod&?+?[{3LvP&'U t ( WJIm\:A"6>>|SB(k;Vvt^A=Y
           | ;om%j-|cUAAET&@o@:La3psH^eXf0QJm [?]d
           | 
           | they have the same sha1sum, but in all practicality it's
           | nonsense, since both messages are pure trash. you couldn't
           | have malicious C code with the same hash as non-malicious C
           | code in this example
        
             | saalweachter wrote:
             | Isn't that like incredibly simple?
             | 
             | Dump your garbage string behind a // or inside an #if 0,
             | restrict the garbage string character set to characters
             | which will not disturb that, and your compiler will whistle
             | while it works.
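That point is easy to demonstrate in miniature: two sources that differ only inside a comment behave identically but hash differently (Python standing in for C here; with C you'd use `//` or `#if 0` as described above):

```python
import hashlib

# Identical programs except for garbage hidden in a trailing comment.
src_a = "def answer():\n    return 42\n# AAAA-garbage-AAAA\n"
src_b = "def answer():\n    return 42\n# BBBB-garbage-BBBB\n"

def run(src: str) -> int:
    namespace = {}
    exec(compile(src, "<src>", "exec"), namespace)  # comment bytes never execute
    return namespace["answer"]()

assert run(src_a) == run(src_b) == 42  # same observable behavior
assert (hashlib.sha1(src_a.encode()).hexdigest()
        != hashlib.sha1(src_b.encode()).hexdigest())  # different hashes
```

The caveat pathseeker raises still applies, though: the collision blocks have to land where the attack construction permits, and their byte set must avoid anything that would terminate the comment.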
        
               | pathseeker wrote:
               | Depends on if the chosen prefix attack allows the content
               | to appear arbitrarily in the middle of the byte stream
               | like that.
        
               | flatiron wrote:
                | anyone checking diffs, or working on the file, etc.,
                | would notice that. it wouldn't survive long
        
               | munk-a wrote:
               | I think active projects would detect this fine - but what
                | if that commit was pushed to left-pad and everyone
                | ended up
               | pulling it to local because it's a dependency of a
               | dependency of a dependency in NPM?
               | 
               | Or what if it's a really obscure library for parsing
               | like... pyramidal jpeg2000s, are the library consumers
               | going to be checking the source? Heck, most people
               | already don't check download checksums unless their
               | downloader does it automatically.
        
               | saalweachter wrote:
               | Hmmm, does the garbage string actually have to survive
               | long?
               | 
               | If there's a followup CL to "delete a garbage string that
               | accidentally made it into the repo", which doesn't
               | actually fix whatever else was added, would that get you
               | anywhere?
        
           | munk-a wrote:
           | If you could push up a commit that computed to the same
           | hash as the last tagged release in a repo... I'm not
           | certain, but the tag might end up referencing the new
           | object? Certain versions of git (e.g. git for Windows) may
           | also react in different ways.
           | 
           | In theory you might get people building software packages
           | for distros to build your malicious version; you might also
           | just temporarily break the ability for anyone to check out
           | that version (basically a denial of service for anyone
           | building it?), but the time window would be weird.
        
           | thenewnewguy wrote:
           | You'd probably be most successful modifying the original repo
           | - either by being the creator of the software or gaining
           | their trust. However, it would have to be a rather powerful
           | SHA1 attack for the commit to still be valid syntax, hard to
           | detect, and make a meaningful malicious change.
        
       | nvartolomei wrote:
       | I assume a lot of work (read: money) was put into these
       | collision attacks, rather than them being discovered by
       | accident. I'm wondering who is sponsoring this work and for
       | what purpose? The argument about proving that an algorithm is
       | broken and working on better cryptography wouldn't suffice in
       | this case, as issues were shown before that. Here the purpose
       | was just to make the attack cheaper?
        
         | tptacek wrote:
         | That is not how cryptographic research works, at all.
        
       | mehrdadn wrote:
       | > SHA-1 has been broken for 15 years, so there is no good reason
       | to use this hash function in modern security software.
       | 
       | Why are cryptographers always exaggerating things and so out of
       | touch with reality? The first actual collision was like 3 years
       | ago. It's not like the world has been on fire in the meantime,
       | and it's not like SHA-1 is broken for every single possible usage
       | even now. And why the nonsense with "no good reason"? Obviously
       | performance is one significant consideration for the unbroken use
       | cases. Do they think painting a different reality than the one we
       | live in somehow makes their case more compelling?
        
         | ApolloFortyNine wrote:
         | For anything requiring fast hashing today, is there a reason
         | why BLAKE2 [1] wouldn't be chosen? It seems to be faster than
         | SHA-1, and its predecessor BLAKE was a SHA-3 finalist. As
         | collisions become easier to create, I would think it'd cause
         | problems in many even non-security use cases. I'm pretty sure
         | GitHub had to blacklist the two PDFs that originally had the
         | same hash.
         | 
         | [1] https://blake2.net/
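For what it's worth, BLAKE2 has shipped in Python's hashlib since 3.6, so trying it takes a few lines; it also supports keyed (MAC) mode natively, no HMAC wrapper needed:

```python
import hashlib

data = b"hello world"

# Plain hashing; digest_size is tunable, up to 64 bytes for blake2b.
print(hashlib.blake2b(data, digest_size=32).hexdigest())

# Keyed mode: a built-in MAC.
print(hashlib.blake2b(data, digest_size=32, key=b"secret key").hexdigest())

# The hash being deprecated in this thread, for comparison:
print(hashlib.sha1(data).hexdigest())
# -> 2aae6c35c94fcfb415dbe95f408b9ce91ee846ed
```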
        
         | ewidar wrote:
         | As stated, they are talking about "modern security software".
         | 
         | So yeah, if you are building/improving software that has a
         | clear focus on security, you should use a secure hash. Seems
         | only natural to me.
        
         | lmilcin wrote:
         | The difference is that that was "a" collision (an identical-
         | prefix collision), which wasn't very helpful if you wanted to
         | forge a document.
         | 
         | This is a chosen-prefix collision. That means you can select
         | the beginning of each document, which in many cases is
         | enough.
        
         | PeterisP wrote:
         | Past experience, and documents that have since been
         | declassified, show that serious attackers (e.g. the NSA) are
         | at least a decade ahead of what's publicly known in
         | cryptography; i.e. pretty much every time relevant
         | groundbreaking math was published, it turned out the
         | classified cryptographers had already known it for a long,
         | long time.
         | 
         | So if this attack is developed today, then you should assume
         | that the NSA has been able to execute it for at least ten
         | years already whenever it suited them, including mass
         | surveillance of random not-that-important people. The same
         | applies to the collisions - the first _published_ collision
         | was 3 years ago, but we should assume that's definitely not
         | the first; I mean, only a minority of the world's
         | cryptography researchers participate in the public, open,
         | academic community; the majority of them are employed with a
         | condition that they won't get to publish anything important.
         | And since we've known for the last ten years that such
         | attacks were possible, there's no good reason why SHA-1
         | would have been considered safe.
        
           | Skunkleton wrote:
           | > So if this attack is developed today, then you should
           | assume that NSA has been able to execute this attack for at
           | least ten years already whenever it suited them, including
            | mass surveillance of random not-that-important people.
           | 
           | They might be 10 years ahead on the algorithms side, but they
           | aren't 10 years ahead on the hardware side. Also, spending
           | 45k today gets you a single collision. That is hardly going
           | to be useful for mass surveillance.
        
             | velosol wrote:
              | They may have been, though - there's talk of purpose-
              | built, liquid-helium-cooled 8-bit processors running at
              | tens of gigahertz (in 2002). The National Cryptologic
              | Museum next door to NSA
             | HQ is a fascinating place and has some very cool displays
             | to include a large disk changer and a piece of a cooling
             | system. There's a PDF at [1] but the discussion at [2] I
             | think might give a better idea.
             | 
             | [1]: https://www.nitrd.gov/pubs/nsa/sta.pdf [2]:
             | https://it.slashdot.org/comments.pl?sid=485458&cid=22736288
        
             | nickpsecurity wrote:
              | Other declassified documents of the past showed they
              | were years ahead on hardware, too. The NRE costs got so
              | high that commodity hardware became preferable in the
              | general case. That said, they can still throw money at
              | ASICs, FPGAs, semi-custom versions of Intel/AMD CPUs,
              | maybe the same for GPUs, and so on.
             | 
             | They have access to better and more hardware than most
             | threat actors.
        
           | dweekly wrote:
           | Are there materials that show a _ten_ year head start?
        
             | ReidZB wrote:
             | I think there's skepticism in the community that NSA et al
             | actually have such a far head-start now.
             | 
             | That said, historically speaking, back in the early 70s DES
             | was being designed. The NSA made some unjustified changes
             | to its S-boxes. At the time, there were allegations that
             | they had made them intentionally weaker. (Or so I've read;
             | I wasn't born yet.) In the early 90s, differential
             | cryptanalysis was discovered for the first time, and it
             | turns out that DES was already resistant to it (unlike
             | other block ciphers at the time): in fact, the NSA already
             | knew about differential cryptanalysis, 20 years ahead of
             | the general public, and intentionally strengthened DES.
             | (Also, IBM discovered it, too, but kept it quiet at the
             | NSA's request.)
        
             | PeterisP wrote:
              | For an early example, differential cryptanalysis
              | methods were 'discovered' in the late 1980s by Biham
              | and Shamir; but it later turned out that resistance to
              | it was pushed as a design consideration back in 1974
              | when DES was designed, so the NSA knew of it _at least_
              | then - that's more than a decade.
              | 
              | We know that British intelligence (who don't have as
              | many resources as the NSA) had developed the RSA
              | equivalent something like 5 years before
              | Rivest/Shamir/Adleman got to it; and we still have no
              | idea how far the NSA was with that math at the time -
              | all we have is circumstantial evidence such as travel
              | reports of NSA representatives going to cryptography
              | conferences and being satisfied that absolutely no math
              | that was new (to them), or even potentially leading to
              | something new, was being revealed there.
              | 
              | We also have the NSA suddenly changing recommendations
              | to use/stop using certain cryptosystems in ways that
              | still don't make sufficient sense - e.g. the 2015 turn
              | away from 'Suite B' ECC _may_ have been due to some
              | quantum discovery, as is claimed, or some other
              | weakness being found, but it's been five years and we
              | (as far as I understand) still don't know as much as
              | they did back in 2015, so they're more than 5 years
              | ahead. But to know whether the current advantage is ten
              | years, or more, or less, we'll have to wait a
              | generation or so; it takes a long time for truth to
              | leak.
        
               | whatshisface wrote:
               | How many mathematicians are working for NSA? How many
               | public research cryptographers are there?
        
               | PeterisP wrote:
                | One thing is that proper public cryptographers are
                | very rare; there's only a handful of effective teams
                | - e.g. MIT has Rivest and associates, and there are a
                | bunch of other places, but most universities,
                | including quite serious ones, don't have anyone doing
                | serious cryptographic research. Cryptocurrencies
                | caused a recent boom, but it's a niche with separate,
                | specific goals that doesn't advance the rest of the
                | field much.
               | 
                | It's hard to give good numbers; we'd have to look at
                | the Snowden leaks and others, but I haven't done much
                | about that. Here's an earlier HN comment
                | https://news.ycombinator.com/item?id=6338094 that
                | estimates 600 proper mathematical researchers working
                | on crypto, and it seems quite plausible to me that it
                | would be more research power than the entire public
                | academia - especially because in many countries that
                | do take this field seriously (e.g. China, Russia,
                | Iran) there's no real public research in crypto
                | happening, because that's classified by default. I
                | mean, pretty much all academic research happens
                | through targeted grants by governments, and who other
                | than the department of defense (or similar
                | organizations in other countries) would be funding
                | cryptographic research?
               | 
                | Also, I'll quote Bruce Schneier (2013, https://www.schneier.com/essays/archives/2013/09/how_advance...)
                | regarding
               | their budget - "According to the black budget summary,
               | 35,000 people and $11 billion annually are part of the
               | Department of Defense-wide Consolidated Cryptologic
               | Program. Of that, 4 percent--or $440 million--goes to
               | 'Research and Technology.' That's an enormous amount of
               | money; probably more than everyone else on the planet
               | spends on cryptography research put together."
        
               | thesabreslicer wrote:
               | This is a great summary, thank you!
        
               | ljhsiung wrote:
               | This is fascinating; are there any readings/sources for
               | similar events, i.e. governments "predicting" academia?
               | 
               | I am aware of the 2015 plan to "transition soon(tm)", but
               | that's because I was alive 5 years ago. Other earlier
               | events would be super cool to read up on.
        
         | shilch wrote:
         | Like computer scientists, they think in binary: either it's
         | secure, or it's not. In reality there's a spectrum where you
         | also have "good enough".
        
           | rectang wrote:
           | The security strategy with the most practical utility for
           | most software engineers working on most projects is
           | _defense in depth_: multiple layers, each assumed to be
           | breakable.
           | 
           | It's a striking contrast with the stark mathematical
           | language deployed by cryptographers, on whose work we rely.
           | 
           | If we differentiate between the two fields of software
           | engineering and cryptography, it's easier to be generous in
           | our appreciation for the different goals and mental models.
        
           | mbowcutt wrote:
           | "good enough" relies on a threat model. Cryptography
           | researchers work in the abstract - without a threat model you
           | must consider cases where your attacker has unlimited
           | resources.
           | 
           | It's good enough for you and me, but research isn't meant to
           | be practical, imo
        
             | Ar-Curunir wrote:
             | What. The first thing any security paper defines is the
             | assumed threat model. People design all kinds of schemes
             | for different threat models.
             | 
             | The point with assuming conservative threat models for key
             | primitives like hash functions is that the threat model can
             | change rapidly even within the same application, and
             | attackers only get stronger. So you err on the side of
             | caution, and don't rely on luck to keep safe.
        
           | [deleted]
        
           | brink wrote:
           | Most people think in yes/no logic. Unfortunately binary is a
           | horrible oversimplification of a very analog reality and
           | results in many of the world's problems that we're in now.
           | Because we tend to think of a binary of yes/no, we often end
           | up flying from one ditch on the side of the road to the
           | other.
        
           | btilly wrote:
           | But in many contexts, "good enough" is more a question of
           | perception than reality.
           | 
           | These statements serve to shift the Overton window of
           | perception, and therefore help improve the odds that people
           | are't thinking "good enough" when they are broken.
        
         | espadrine wrote:
         | Cryptographers rely on precise definitions to make their
         | assessments.
         | 
         | In particular, a primitive makes a number of useful promises.
         | Any achievement that describes a way in which a promise is not
         | kept makes that primitive _broken_ , regardless of whether that
         | achievement is theoretical or practical.
         | 
         | (They often talk about "theoretically broken" or "practically
         | broken" to distinguish whether it was actually done.)
         | 
         | > _it's not like SHA-1 is broken for every single possible
         | usage_
         | 
         | True, but it is extremely easy to believe your usage is
         | unbroken and be wrong. Besides, often, primitives that are
         | broken for one usage eventually break for others.
         | 
         | > _why the nonsense with "no good reason"? Obviously
         | performance is one significant consideration for the unbroken
         | use cases_
         | 
         | There are many much more robust primitives with better
         | performance nowadays, such as BLAKE2, SHA-512/256, or
         | KangarooTwelve.
        
           | staticassertion wrote:
           | Is SHA1 not still faster than 256/512? And Blake2 barely
           | faster? I suspect with some hardware SHA1 is still going to
           | beat all of those out - am I wrong? I would love to learn
           | more.
        
             | __s wrote:
             | CRC is faster than SHA1
        
             | tptacek wrote:
             | https://bench.cr.yp.to/results-hash.html
             | 
             | Regardless, no competent designer is going to use SHA1.
        
               | staticassertion wrote:
               | Thank you, that's an excellent chart.
        
         | remus wrote:
         | Depreciating things takes a long time. It could be 10 years
         | before 'that guy' in your office gets it in to his head that
         | SHA-1 isn't appropriate where security is concerned. Thus you
         | want to make it unambiguous that now is a good time to move to
         | something else so that when it is thoroughly broken in 10 years
         | time you're minimising the number of people who have yet to
         | switch.
        
           | tialaramex wrote:
           | Depreciating things takes however long your accountants and
           | the tax laws in your country allow :)
           | 
           | Deprecating things does take a long time, and the only
           | practical thing to do about that is to get ahead of the game
           | as you say.
        
         | hinkley wrote:
         | Ten years ago I was being pressured into not adding SHA-256
         | support to a new system because SHA-1 would allow them to use
         | a stack of old crypto cards some idiot bought and hadn't been
         | able to put to use, so they were just collecting dust in a
         | closet somewhere. (They supported at most 1024-bit RSA keys,
         | which were also considered broken at the time, or close to it
         | when the cards were new.)
         | 
         | This wasn't for some throw-away project. Big, old company. This
         | may be the closest I've come by far to writing software that
         | will still be used after I'm dead (gets a little easier every
         | year, though). If I gave them SHA-1 they'd still be using it
         | for sure.
         | 
         | I refused, and the fact that people were calling MD5 very
         | broken and SHA-1 broken helped.
         | 
         | (A completely different faction was trying to get me to jump
         | straight to SHA-512. I said that was probably overkill, and yes
         | I will implement it but we're using SHA-256 as a default. Then
         | a couple years later it turns out SHA-256 is more resilient
         | than 512 anyway. But what a schizophrenic place.)
        
         | Ar-Curunir wrote:
         | Because by shouting loudly now, we can maybe get changes into
         | libraries within the next 5 years, before these attacks are
         | commonplace
        
         | hannob wrote:
         | Performance is the worst argument you could choose.
         | 
         | a) There's hardly an application where hash performance
         | matters. These things are fast.
         | 
         | b) Precisely because people may still complain about
         | performance, cryptographers invented a hash function that is
         | faster than all the old choices like MD5/SHA-1 and still
         | secure (it's called BLAKE2).
        
       | 0x0 wrote:
       | Q: Does this make it even more urgent for git to move to a
       | different hash?
        
         | rom1v wrote:
         | What happens if you actually get a SHA-1 collision in git?
         | 
         | -> https://stackoverflow.com/questions/9392365/how-would-git-
         | ha...
         | 
         | (This does not answer your question, but is still interesting.)
        
         | LeonM wrote:
         | A: (from the article)
         | 
         | SHA-1 has been broken for 15 years, so there is no good reason
         | to use this hash function in modern security software. Attacks
         | only get better over time, and the goal of the cryptanalysis
         | effort is to warn users so that they can deprecate algorithms
         | before the attacks get practical. We actually expect our attack
         | to cost just a couple thousand USD in a few years.
        
           | pathseeker wrote:
           | That's not a relevant answer. Git's use-case is very specific
           | and it's quite possible that this attack won't be relevant.
           | It needs analysis.
           | 
           | >no good reason to use this hash function in modern security
           | software
           | 
           | This argument conveniently ignores the cost to switching
           | existing software (i.e. it's completely detached from
           | reality).
        
         | EGreg wrote:
         | It may, because now an attacker can replace code with arbitrary
         | other valid code as long as developers are willing to ignore
         | the long weird random comment at the end ;-)
         | 
         | I'd say many developers will not care, and many compilers
         | will not care either.
         | 
         | So yeah, Linus' main deterrent reason (code won't compile)
         | doesn't apply anymore.
         | 
         |  _HOWEVER!_
         | 
         | 1. A chosen-prefix attack still needs to compute TWO suffixes
         | m1 and m2 so that _h(a1+m1) = h(a2+m2)_. This does NOT mean
         | that, given _a1_ and _a2_, you can find a single _m2_ so that
         | _h(a1) = h(a2+m2)_. So ONLY THE ORIGINAL AUTHOR OF THE
         | COMMIT could spoof their own commit, by preparing in advance
         | and attaching a long and weird comment at the end. And you
         | could build tools to watch out for such commits in the first
         | place.
         | 
         | 2. If git had used HMAC based on SHA1 then it would have been
         | fine, even after this attack has become feasible.
         | 
         | 3. Furthermore, it is likely still kinda fine because Merkle
         | Trees have nodes referencing previous nodes. You'd have to
         | spoof every historical node as well, to push malicious code.
         | BitTorrent also requires computers to supply an entire merkle
         | branch when serving file chunks.
         | 
         | Maybe someone can elaborate on this.
        
           | rkangel wrote:
           | If you look at this 2017 email from Linus
           | (https://marc.info/?l=git&m=148787047422954), he discusses
           | how git also encodes length. That would mean that you need a
           | collision of the same length _and_ the right functionality,
           | so you can't just append data.
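           | (A sketch of what git actually hashes; the helper name is
           | made up, but this is why the size field is in the picture:)

```python
import hashlib

def git_blob_sha1(content: bytes) -> str:
    # Git hashes "<type> <size>\0" + content, not the raw file bytes,
    # so a colliding object must also carry a matching size field.
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()
```

           | This matches what `git hash-object` prints for the same
           | content.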
        
             | toyg wrote:
             | Now that you can arbitrarily produce collisions, the second
             | step is easy enough for a skilled and well-funded attacker.
        
         | rkangel wrote:
         | It adds to the weight of the argument, but there isn't a big
         | issue. This article (https://www.zdnet.com/article/linus-
         | torvalds-on-sha-1-and-gi...) and the linked email
         | (https://marc.info/?l=git&m=148787047422954) both seem to still
         | apply.
        
           | groovybits wrote:
           | Further details as to why Torvalds is not concerned:
           | 
           | From the email...
           | 
           | "I haven't seen the attack yet, but git doesn't actually just
           | hash the data, it does prepend a type/length field to it.
           | That usually tends to make collision attacks much harder,
           | because you either have to make the resulting size the same
           | too, or you have to be able to also edit the size field in
           | the header."
           | 
           | [...]
           | 
           | "I haven't seen the attack details, but I bet
           | 
           | (a) the fact that we have a separate size encoding makes it
           | much harder to do on git objects in the first place
           | 
           | (b) we can probably easily add some extra sanity checks to
           | the opaque data we do have, to make it much harder to do the
           | hiding of random data that these attacks pretty much always
           | depend on."
        
             | tedunangst wrote:
             | What if somebody makes an attack where they can choose the
             | size and then find a collision?
        
               | banana_giraffe wrote:
               | Like the two files on the linked page?
        
               | Skunkleton wrote:
               | The two files on the linked page were both full of junk
               | data. I suspect that those files being of the same length
               | isn't the norm.
        
               | aidenn0 wrote:
               | I would bet that a fixed size M' is part of the attack.
        
             | est31 wrote:
             | That first quote is misleading. git's special hashing
             | scheme doesn't make the attack "much harder". First, there
             | is no difference in length even in the original shattered
             | collision:
             | 
             | $ curl -s https://shattered.io/static/shattered-1.pdf | wc -c
             | 422435
             | 
             | $ curl -s https://shattered.io/static/shattered-2.pdf | wc -c
             | 422435
             | 
             | Second, the length is already being hashed into the content
             | during computation of a SHA-1 hash. Look up Merkle-Damgard
             | construction: https://en.wikipedia.org/wiki/Merkle%E2%80%93
             | Damg%C3%A5rd_co...
             | 
             | There _is_ benefit in storing the length at the prefix as
             | well, as you can avoid length extension attacks, but
             | that's not making attacks "much harder".
        
               | tny wrote:
               | But even if the lengths are same, the resulting SHA1 will
               | be different since you prefix the length before hashing
        
               | est31 wrote:
               | The shattered prefix was chosen as well, see my other
               | comment in the thread:
               | https://news.ycombinator.com/item?id=21980759
               | 
               | The only thing that prefixing the length makes difficult
               | is using the same prefix multiple times: you basically
               | have to make up your mind about the type and length
               | before mounting the shattered attack. Also, the prefix
               | means you have to do your own shattered attack and can't
               | use the PDFs that google provided as proof of their
               | project's success. Price tag for that seems to be 11k.
               | 
               | [1]: https://github.com/cr-marcstevens/sha1collisiondetection
        
               | toyg wrote:
               | Yeah, that quote doesn't exactly make me confident about
               | Linus's understanding of this particular issue.
        
               | DarkWiiPlayer wrote:
               | It's not about it being the same length, but the length
               | of the data being part of the hashed data, which, Linus
               | assumes, will likely make it more difficult to find a
               | collision. He even says at the beginning that he hasn't
               | had a look at the attack yet and is just making an
               | assumption.
        
               | est31 wrote:
               | > It's not about it being the same length, but the length
               | of the data being part of the hashed data
               | 
               | As I tried to point out, the length is already part of
               | what the SHA-1 function hashes:
               | 
               | https://tools.ietf.org/html/rfc3174#section-4
               | 
               |     As a summary, a "1" followed by m "0"s followed by a
               |     64-bit integer are appended to the end of the message
               |     to produce a padded message of length 512 * n.  The
               |     64-bit integer is the length of the original message.
               |     The padded message is then processed by the SHA-1 as
               |     n 512-bit blocks.
               | 
               | Now, storing the length as a prefix does give you
               | advantages: you can't mount a length extension attack,
               | which limits your ability to exploit one shattered
               | attack, e.g. the pdfs released by google, for different
               | files/types of files. But it doesn't make mounting a
               | novel shattered attack "much harder" as Linus claims.
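               | (A sketch of that RFC 3174 padding rule; the helper name
               | is made up:)

```python
import struct

def sha1_padding(message_len_bytes: int) -> bytes:
    # A 0x80 byte (the "1" bit), then zeros, then the original length
    # in *bits* as a big-endian 64-bit integer, so the padded message
    # is a multiple of 64 bytes (512 bits).
    zeros = (55 - message_len_bytes) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", message_len_bytes * 8)
```

               | So the length is always hashed, just at the end rather
               | than at the front.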
        
               | magicalhippo wrote:
               | > But it doesn't make mounting a novel shattered attack
               | "much harder" as Linus claims.
               | 
               | From what I understood the core of Linus' argument[1] is
               | that it's very hard to make a "bad" variant of the code
               | which has the same length _and_ the same hash while still
               | looking like sane code. For random data files, sure those
               | are more at risk.
               | 
               | [1]: https://marc.info/?l=git&m=148787287624049&w=2
        
               | wbl wrote:
               | A good thing no one checks binary assets into source
               | control
        
               | runeks wrote:
               | The point is that the hashed data must follow a specific
               | data format, and can't just be arbitrary data. This means
               | that the collision data MUST contain the length at some
               | specific offset in the data, which makes it harder to
               | find a collision.
               | 
               | The more restrictive the serialization format of the
               | hashed data, the harder it is to find a collision that's
               | valid in the given application context.
        
               | est31 wrote:
               | Yeah, the data must start with a specific prefix, but can
               | otherwise contain whatever it wants. Anyway, even the
               | shattered attack, which this paper says costs 11k to
               | execute today, had a PDF-specific prefix (the shown part
               | is the same in both files):
               | 
               | $ curl -s https://shattered.io/static/shattered-1.pdf | hexdump -n 512 -C
               | 00000000  25 50 44 46 2d 31 2e 33  0a 25 e2 e3 cf d3 0a 0a  |%PDF-1.3.%......|
               | 00000010  0a 31 20 30 20 6f 62 6a  0a 3c 3c 2f 57 69 64 74  |.1 0 obj.<</Widt|
               | 00000020  68 20 32 20 30 20 52 2f  48 65 69 67 68 74 20 33  |h 2 0 R/Height 3|
               | 00000030  20 30 20 52 2f 54 79 70  65 20 34 20 30 20 52 2f  | 0 R/Type 4 0 R/|
               | 00000040  53 75 62 74 79 70 65 20  35 20 30 20 52 2f 46 69  |Subtype 5 0 R/Fi|
               | 00000050  6c 74 65 72 20 36 20 30  20 52 2f 43 6f 6c 6f 72  |lter 6 0 R/Color|
               | 00000060  53 70 61 63 65 20 37 20  30 20 52 2f 4c 65 6e 67  |Space 7 0 R/Leng|
               | 00000070  74 68 20 38 20 30 20 52  2f 42 69 74 73 50 65 72  |th 8 0 R/BitsPer|
               | 00000080  43 6f 6d 70 6f 6e 65 6e  74 20 38 3e 3e 0a 73 74  |Component 8>>.st|
               | 00000090  72 65 61 6d 0a ff d8 ff  fe 00 24 53 48 41 2d 31  |ream......$SHA-1|
               | 000000a0  20 69 73 20 64 65 61 64  21 21 21 21 21 85 2f ec  | is dead!!!!!./.|
               | 
               | The shattered attack was about a so-called "identical
               | prefix" collision, while the shambles paper's collision
               | was a "chosen prefix" one. You can choose it in both
               | cases, but in the "chosen prefix" one both colliding
               | prefixes can be entirely different (and can be as long as
               | you want btw, the attack doesn't cost more if the prefix
               | is 4 KB vs 4 GB), while in the "identical prefix" case it
               | has to be identical.
        
               | toyg wrote:
               | _> the harder it is to find a collision that's valid in
               | the given application context_
               | 
               | In the double-digit thousands of dollars, an attack that
               | gets 10x or 100x harder is still cheap for state actors.
               | 
               | Assuming the NSA is at least a year or two ahead of the
               | field, git should now accelerate its migration process.
        
             | joeyh wrote:
             | "(b)" is kind of amusing.. It's been known since 2011 that
             | collision generating garbage material can be put after a
             | NUL in a git commit message and hidden from git log, git
             | show, etc. Still not fixed.
             | 
             | With this chosen-prefix attack, they chose two prefixes and
             | generated collisions by appending some data. So your two
             | prefixes just need to be "tree {GOOD,BAD}\nauthor
             | foo\n\nmerge me\0"
             | 
             | The only thing preventing injecting a backdoor into a pull
             | request now seems to be git's use of hardened sha1.
        
             | wyldfire wrote:
             | I think those are pretty practical approaches.
             | 
             | But it sounds as if the cost of changing the hash algorithm
             | is high. What are the impacts of this change? How many
             | things would break if git just changed the algorithm with
             | each new release? Does git assume that the hash algorithm
             | is statically given to be SHA-1 or are there qualifiers on
             | which algorithm is enabled/permitted/configured?
        
               | paulddraper wrote:
               | After making the actual code change, the biggest problem
               | is breaking compatibility with decades of tools in the
               | ecosystem that rely on historically consistent SHA-1
               | hashes.
               | 
               | Git is moving to a flexible hash though. [1]
               | 
               | [1] https://stackoverflow.com/questions/28159071/why-
               | doesnt-git-...
        
               | toyg wrote:
               | Maybe it's time for a version 3 that breaks a bit of
               | compatibility.
               | 
               | The Python community would freak out, lol.
        
               | simias wrote:
               | The cost is very high but it's only getting higher with
               | time. People have known that SHA-1 was weak and
               | deprecated for much of git's existence. Doing the switch
               | in 2010 would've been painful, doing it now would be
               | orders of magnitude more so and I doubt it'll get any
               | easier in 2030 unless some other SCM manages to overtake
               | git in popularity which seems unlikely at this point.
               | 
               | Unless Linus really believes that git will be fine using
               | SHA-1 for decades to come I don't think it's very
               | responsible to keep kicking the ball down the road
               | waiting for the inevitable day when a viable proof of
               | concept attack on git will be published and people will
               | have to emergency-patch everything.
        
             | simias wrote:
             | The fact that this attack is chosen prefix does weaken the
             | first argument though, you may now find a collision even
             | accounting for any prefixed git "header". The rest is still
             | completely valid though.
             | 
             | I still feel like they really should've taken this problem
             | more seriously and earlier. The more we wait the more
             | painful the migration will be when the day comes to move to
             | a different hash function, because everybody knows that'll
             | happen sooner or later. Two years ago we had a collision,
             | now we have chosen prefix, how much longer until somebody
             | actually manages to make a git object collision?
             | 
             | And keep in mind that public research is probably several
             | years behind top secret state agency capabilities. Let's
             | stop looking for excuses every time SHA-1 takes a hit and
             | rip the bandaid already. It's going to be messy and painful
             | but it has to be done.
        
               | [deleted]
        
             | bjornsing wrote:
             | > you have to be able to also edit the size field in the
             | header."
             | 
             | As I read the OP [1] a chosen-prefix collision attack such
             | as this allows you to "edit the size field in the header".
             | Or am I missing something?
             | 
             | 1. "A chosen-prefix collision is a more constrained (and
             | much more difficult to obtain) type of collision, where two
             | message prefixes P and P' are first given as challenge to
             | the adversary, and his goal is then to compute two messages
             | M and M' such that H(P || M) = H(P' || M'), where ||
             | denotes concatenation."
             | 
             | EDIT: On second thought I was missing something: the
             | adversary is further constrained in the git case because it
             | must find M and M' of correct length (specified in P and
             | P'). Linus is right (as usual), this probably makes it much
             | harder.
        
               | bjornsing wrote:
               | A few emails forward in the thread Linus explains though
               | why we don't need to worry much about this attack in
               | practice: https://marc.info/?l=git&m=148787287624049&w=2
               | 
               | This argument sounds sound to me.
        
         | mikepurvis wrote:
         | There is a migration path to SHA-256, see a good summary here:
         | https://stackoverflow.com/a/47838703/109517
         | 
         | See a previous discussion here, regarding Linus's position on
         | this in 2017: https://news.ycombinator.com/item?id=13719368
        
       | [deleted]
        
       | kibwen wrote:
       | Out of curiosity, can anyone explain in layman's terms the
       | differences in design that make SHA-1's successors immune to the
       | known attacks against SHA-1? Ultimately was this the result of an
       | apparent flaw in SHA-1 that only became obvious in retrospect, or
       | was it something totally unforeseeable?
        
         | knorker wrote:
         | Partly this: the number of bits.
         | 
         | This attack costs almost 2^64 operations, and SHA-1 is 160
         | bits. All else being equal (a big, big if), the same attack
         | against SHA-256 would cost about 2^102 operations, meaning
         | roughly 362703572709 times more expensive, or about $16,321
         | trillion USD.
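         | (The arithmetic behind those figures, under the stated "all
         | else equal" assumption, with a ~$45k base cost for the SHA-1
         | attack plugged in:)

```python
# Scale the ~2^64 attack cost by digest size. Big assumption: real
# attack costs do not scale this simply across hash functions.
attack_bits = 64                      # this attack: ~2^64 SHA-1 calls
scaled = attack_bits * 256 / 160      # -> 102.4 "bits" against SHA-256
factor = 2 ** (scaled - attack_bits)  # ~3.6e11 times more expensive
cost_usd = 45_000 * factor            # assumed ~$45k for the SHA-1 attack
print(f"{scaled:.1f} bits, {factor:.4g}x, ${cost_usd:.3g}")
```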
        
         | Zaak wrote:
         | SHA-2 is based on similar techniques to those in SHA-1, which
         | prompted the SHA-3 competition when weaknesses in SHA-1 were
         | first discovered (as they could conceivably have been present
         | in SHA-2 as well). As it turns out, SHA-2 appears to be
         | resistant to the attacks found thus far.
         | 
         | SHA-3 (originally named Keccak) is built on an entirely
         | different foundation (called a sponge function), so it is
         | unlikely that any attack against SHA-1 will be relevant to
         | SHA-3. However, sponge functions are a relatively new idea, and
         | weaknesses in the basic principles could conceivably be found
         | in the future, as could weaknesses in the Keccak algorithm
         | specifically.
        
       | [deleted]
        
       | newscracker wrote:
       | General questions:
       | 
       | (edit: these are indeed general questions, not just about SHA1)
       | 
       | Has anyone else been worried about data deduplication done by
       | storage and/or backup systems, considering that they usually use
       | hashes to detect data blocks that are "the same" (without
       | additional metadata) and avoid storing those "duplicate data
       | blocks" again? Doesn't this seem far worse when you also consider
       | that systems like Dropbox deduplicate data across all their users
       | (expanding the footprint for collisions)? Are there any research
       | papers/articles/investigations about this?
        
         | jbverschoor wrote:
         | Yep, I'm afraid of that. I have no idea how they are mitigating
         | this, and I doubt they'll ever disclose.
        
           | onei wrote:
           | From what I recall when the first collision was proved it was
           | with a PDF which is easier to craft a collision from. I also
           | seem to recall that the collisions were from hashes of the
           | whole file.
           | 
           | The dedupe engine written where I work chunks a file and
           | hashes those chunks meaning it's somewhat harder to craft
           | collisions (I forget where the chunking boundaries are, but
           | it's within a range iirc). The hashing algorithm was SHA-1
            | last I checked, but I've never heard even company folklore of
           | corrupted backups caused by hash collisions. I get the
           | feeling that it's near impossible in practical terms given
           | the size of the string being hashed. Having said that, hubris
           | is the downfall of programmers everywhere, so I wouldn't bet
           | all my money on it.
        
         | upofadown wrote:
         | SHA1 isn't broken for hashes.
        
           | mlyle wrote:
           | SHA1 is vulnerable to preimage attacks in reduced round
           | variants. The findings keep steadily improving.
           | 
           | https://en.wikipedia.org/wiki/Preimage_attack
           | 
           | This means if a storage system _just_ uses SHA1 to detect
           | duplication, you can abuse the ability to create a collision
           | to possibly do bad things to the storage system.
        
             | knorker wrote:
             | Yes, but for the specific concerns listed it should not be
             | a problem. That is, if you upload to dropbox and check the
             | sha1 when it comes back, yes you did get the same data
             | back.
             | 
             | And your data can't be stolen by a hash-to-data oracle
             | either, unless the evil attacker constructed _your_ secret
             | data for you.
             | 
             | So it depends on your threat model. Yes there are practical
             | concerns, but not the ones listed.
        
               | mlyle wrote:
               | I disagree. Deduplicating filesystems often depend on
               | hash equivalency meaning data equivalency. They may or
               | may not have modes to validate all data before
               | deduplication, but these suck for performance and are
               | often turned off.
               | 
               | E.g. I make two things that hash to the same thing. One
               | is a contract where I'm obligated to pay back a loan.
               | Another is some meaningless document. I give them to a
               | counterparty who puts them in their filesystem, which
               | later scrubs and deduplicates data. Since they hash the
               | same, the filesystem removes the contract and leaves my
               | meaningless document (or never bothers to store the
               | contract because it already exists, if deduping online,
               | etc).
               | 
               | Note that this is a _chosen prefix_ collision, which is
               | much more demanding (and more useful!) than finding a
               | collision in general. And this leaves aside that SHA1 is
               | looking increasingly vulnerable to preimage attacks which
               | further broaden the attack scenarios.
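               | (A toy sketch of that failure mode; the class is
               | hypothetical, not any real product:)

```python
import hashlib

class NaiveDedupStore:
    """Content-addressed store that trusts SHA-1 equality alone."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # The flaw: if two *different* documents collide, the first
        # write wins and the second is silently never stored.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]
```

               | With a real chosen-prefix collision, putting the contract
               | and the junk document would return the same key, and one
               | of the two would silently vanish.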
        
               | im3w1l wrote:
               | This doesn't sound crazy farfetched. I bet a lot of
               | people have files in their dropbox that were created by
               | someone else.
        
             | upofadown wrote:
             | >SHA1 is vulnerable to preimage attacks.
             | 
             | From your linked article:
             | 
             | >All currently known practical or almost-practical attacks
             | on MD5 and SHA-1 are collision attacks. In general, a
             | collision attack is easier to mount than a preimage attack,
             | as it is not restricted by any set value (any two values
             | can be used to collide).
        
           | fanf2 wrote:
           | SHA-1 (and SHA-256 and many other Merkle-Damgard hash
           | functions) have the property that once two equal-length
           | messages collide, appending the same suffix to both keeps
           | them colliding. That is how the 2017 shattered attack works:
           | find a pair of short prefixes that work as PDF headers and
           | that have colliding hashes, then append a common suffix such
           | that the different headers display different parts of the
           | suffix. Once the collision has been found you can easily
           | make your own colliding PDFs - https://alf.nu/SHA1
           | 
           | This breaks anything that dedups on just the SHA-1 hashes of
           | raw files.
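           | (The Merkle-Damgard reason the common-suffix trick works:
           | the final hash depends only on the internal state after the
           | prefix. Below, copy() stands in for two colliding prefixes
           | reaching the same state, since we obviously can't compute a
           | real SHA-1 collision inline:)

```python
import hashlib

state_a = hashlib.sha1(b"pretend this is colliding prefix A")
state_b = state_a.copy()  # simulate prefix B reaching the same state

common_suffix = b"rest of the PDF shared by both files"
state_a.update(common_suffix)
state_b.update(common_suffix)
assert state_a.hexdigest() == state_b.hexdigest()
```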
        
         | MichaelMoser123 wrote:
         | are there any systems that do sha-1 for dedup? I am only aware
         | of sha-256.
        
           | munificent wrote:
           | Git?
        
             | [deleted]
        
           | fanf2 wrote:
           | Subversion's dedup was broken by the http://shattered.io/
           | SHA-1 collision in 2017:
           | https://medium.com/@hyrum/shattered-subversion-7edea4ba289c
        
           | [deleted]
        
           | rakoo wrote:
           | bup does. Even worse: rsync relies on md5 for finding
           | duplicate blocks.
        
         | kemitche wrote:
         | It depends on how the deduplication is configured, and the risk
         | tolerance of the organization running it, I suppose.
         | 
         | ZFS, for example, has a dedup setting that forces the file
         | system to do a byte-for-byte verification of any deduped
         | data:
         | https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/g...
        
       | umvi wrote:
       | Is "a Shambles" British or something? I've always heard it as "in
       | Shambles"
        
         | cpach wrote:
         | AFAICT "a shambles" is correct usage. See
         | https://brians.wsu.edu/2016/05/24/in-shambles-a-shambles/
        
           | StuffedParrot wrote:
           | ...for traditional use, maybe, but any editor worth their
           | salt will stop you from doing so. At least in America. Many
           | words have changed their meaning and use over the last 500
           | years.
        
           | OGWhales wrote:
           | Neat, but don't know if I can switch because of how it rolls
           | off the tongue.
        
             | umvi wrote:
             | It got easier for me once I learned that "shambles" is a
             | noun that is a synonym of "slaughterhouse". Once I learned
             | that, I emphasized _shambles_ slightly differently as I
             | rolled the phrase off my tongue.
             | 
             | "SHA-1 is a _slaughterhouse_ "
             | 
             | "SHA-1 is a _shambles_ "
        
               | OGWhales wrote:
               | Yeah I got that from the link, I had no idea before.
               | However, I feel like if it was "is a shamble" it would
               | make more sense to me than "is a shambles".
        
           | umanwizard wrote:
           | Downvoted. There's no ISO standard for English, since
           | language is a social phenomenon. "Correct" language is
           | whatever the community of speakers uses in practice, not what
           | someone claims is correct in a blog post (or even a book).
        
             | simias wrote:
             | And the linked article showed you that, in practice, "a
             | shambles" is perfectly correct English. It is therefore
             | correct in a descriptivist sense.
             | 
             | Beyond that prefixing your comment by "downvoted" is
             | frankly silly and only serves to derail the conversation
             | IMO.
        
               | StuffedParrot wrote:
               | > And the linked article showed you that, in practice, "a
               | shambles" is perfectly correct English.
               | 
               | I'd think "in practice" would mean evidence of
               | contemporary usage.
        
               | umanwizard wrote:
               | "a shambles" is not used (or quite rarely used) in North
               | America, in my experience. Which was what the OP was
               | asking about.
               | 
               | You are of course right that "a shambles" is perfectly
               | fine British English, but that's beside the point.
        
               | kps wrote:
               | I'm in North America and have never heard 'in shambles',
               | without 'a'. If I did, I'd probably take it as uneducated
               | sociolect.
        
               | plopz wrote:
               | NYC here, never have heard it with an 'a'. It's always
               | just 'in shambles'.
        
               | umvi wrote:
               | Fascinating. I'm guessing it's a regional thing where
               | certain parts of the US say "in shambles" vs. "a
               | shambles" vs. "in a shambles".
               | 
               | Not sure how it would be measured, but I think a
               | /r/dataisbeautiful map would be interesting to see.
        
         | dafman wrote:
         | Yes, at least where I live in the south of the UK. Other
         | phrases you might hear are "What a shambles", "That is an
         | absolute shambles" or "Omnishambles"
        
           | jen20 wrote:
           | "Omnishambles" has a fun etymology - it comes from the
           | political satire "The Thick of It":
           | https://en.wikipedia.org/wiki/Omnishambles
        
         | no_flags wrote:
         | "Far be the thought of this from Henry's heart, To make a
         | shambles of the parliament-house!"
         | 
         | -Shakespeare (Henry VI, Part 3)
        
           | tialaramex wrote:
           | Note that in Shakespeare's time an audience would directly
           | have understood "make a shambles" here to mean violent
           | bloodshed in the parliament building because a "shambles"
           | would have been a common term for somewhere you killed
           | animals to produce meat. Which is exactly what is being
           | described here. Henry VI part 3 is not a play about a polite
           | disagreement settled over lunch...
           | 
           | Today "shambles" remains in the common vocabulary, but only
           | in the sense of things being messy or disorganised; most
           | English readers today probably wouldn't think of literal
           | violence, and I suppose the people who named this website
           | were likewise thinking of the modern usage.
        
         | ancorevard wrote:
         | It's a word for a meat market, or a butcher's shop, in other
         | words - a bloody mess.
        
         | pjc50 wrote:
         | Yes; in addition to the other comments, there is also _the_
         | shambles: https://en.wikipedia.org/wiki/The_Shambles
         | 
         | (nowadays devoid of meat products but a nice picturesque place
         | to visit if you can stand crowds. Try the bookshop)
        
           | claudiawerner wrote:
           | I live in York and I haven't actually gone into any shops in
           | the Shambles...
        
       | ebg13 wrote:
       | Quick question about the "What should I do" section. It says "
       | _use instead SHA-256_ ". Isn't SHA-512 both better and faster on
       | modern hardware?
        
         | Spooky23 wrote:
         | SHA-256 is on the approved FIPS lists and is faster on 32-bit
         | operating systems, which were surprisingly common in enterprise
         | environments until recently.
         | 
         | People tend to be conservative in making changes for stuff like
         | this, and don't do so until forced.
        
         | akvadrako wrote:
         | I believe so also. Specifically, SHA-512/256 seems to be a
         | better choice by almost every metric except 32-bit hashing
         | speed.
        
         | [deleted]
        
         | LeonM wrote:
         | Depends on your definition of 'better'. Theoretically SHA-512
         | is harder to brute force than SHA-256, but 256 bits is already
         | extremely strong, so really there is no practical safety
         | benefit of SHA-512 over SHA-256.
         | 
         | On 64-bit capable processors SHA-512 has a slight performance
         | gain over SHA-256, but only on larger inputs. However, the
         | digest of SHA-512 is twice the size, so what you gain in
         | processing time, you lose in storage.
        
           | quotemstr wrote:
           | You can truncate the SHA-512 digest down to 256 bits though.
           | Cryptographic hash functions mix well and don't suffer from
           | the truncation (except to the extent that the truncated
           | digest is shorter, of course). You don't necessarily need
           | more storage (after hashing completes) just because you're
           | using a new hash function.
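
           A minimal sketch of this truncation with Python's standard
           `hashlib` (note the caveat in the comments: the standardized
           SHA-512/256 additionally uses a different initial value):

```python
import hashlib

msg = b"example input"
full = hashlib.sha512(msg).digest()   # 64-byte digest
trunc = full[:32]                     # keep only the first 256 bits

assert len(trunc) == 32
# Note: the standardized SHA-512/256 uses a *different* initial value,
# so this plain truncation is not bit-identical to it -- but it is a
# perfectly valid 256-bit digest with no extra storage cost.
assert trunc != hashlib.sha256(msg).digest()
```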
        
           | jlokier wrote:
           | > but 256 bits is already extremely strong
           | 
           | The strength of hashes like SHA-256 doesn't just come from
           | the number of output bits.
           | 
           | The 256 bits there is relevant for brute force attacks, but
           | not more sophisticated attacks that take into account the
           | internal structure of the hash algorithm, and in some cases
           | "weak" values.
           | 
           | SHA-512 performs more "rounds" of computation than SHA-256.
           | 
           | Although it's impossible to compare two different hashes on
           | rounds alone, in general a large number of rounds of the same
           | type of hash decreases the likelihood of non-brute-force
           | attacks finding a collision.
           | 
           | If you look at the literature for attacks on hashes, they
           | will often say they could do it for a certain number of
           | rounds, and that number increases over time as new methods
           | are discovered.
           | 
           | The number of rounds in the hash design is chosen with this
           | in mind, trying to balance being more than sufficient for
           | future attacks yet not too slow.
        
             | tptacek wrote:
             | Cryptographic engineers do not in fact think you should use
             | SHA-2-512 so that you can maximize the number of times the
             | round function is applied.
        
               | jlokier wrote:
               | I'm not sure what that sentence means, can you rephrase
               | it?
        
               | tptacek wrote:
               | The analysis you have provided for why SHA-512 is
               | superior to SHA-256 is faulty.
        
         | CiPHPerCoder wrote:
         | SHA-256 and SHA-512 are both in the same family (SHA-2).
         | 
         | Latacora says to use SHA-2. If you can get away with it,
         | SHA-512/256 instead of SHA-256. But they're all SHA-2 family
         | hash functions.
         | 
         | https://latacora.micro.blog/2018/04/03/cryptographic-right-a...
         | 
         | No need to bikeshed this. But if you must: SHA-512/256 >
         | SHA-384 > SHA-512 = SHA-256
         | 
         | If you're wondering, "Why is SHA-384 better than SHA-512 and
         | SHA-256?" the answer is the same reason why SHA-512/256 is the
         | most preferred option:
         | https://blog.skullsecurity.org/2012/everything-you-need-to-k...
         | 
         | Additionally, the Intel SHA extensions target SHA-1 and SHA-256
         | (but not SHA-512), which makes SHA-256 faster than SHA-512 on
         | newer processors.
         | 
         | Isn't crypto fun?
        
           | fireflash38 wrote:
           | I read the blog, but it doesn't really expand on _why_ SHA224
           | /384 aren't vulnerable to that attack, can you explain (or
           | link to some place that does?)
        
             | tptacek wrote:
             | A length-extension attack works by taking the output of a
             | hash and using it as the starting point for a new hash; you
             | can do this even for hashes of messages you haven't seen,
             | minting new related hashes. Truncated SHA-2 hashes don't
             | output the whole hash, and so you aren't given enough
             | information to start a new related hash.
        
             | CiPHPerCoder wrote:
             | Because they're already truncated.
             | 
             | SHA-384 is SHA-512 with a different IV (which doesn't
             | affect LEAs) truncated to 384 bits (which gives you 128
             | bits of resistance against LEAs).
             | 
             | SHA-224 is the same story but with SHA-256 instead (and
             | only 32 bits of LEA resistance).
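
             The H(k || m) length-extension attack described above can
             be sketched with a toy Merkle-Damgard hash whose digest is
             its full internal state (the `compress`, `pad`, and
             `toy_hash` names are illustrative, not a real API):

```python
BLOCK = 8

def compress(state, block):
    # toy compression function (insecure; illustration only)
    return (state * 1000003 + int.from_bytes(block, "big")) % (1 << 64)

def pad(length):
    # toy padding: zero-fill to a block boundary (real MD padding also
    # encodes the message length, which the attacker must reproduce)
    return b"\x00" * (-length % BLOCK)

def toy_hash(msg, state=0xA5A5A5A5):
    msg += pad(len(msg))
    for i in range(0, len(msg), BLOCK):
        state = compress(state, msg[i:i + BLOCK])
    return state  # the digest IS the full internal state

secret = b"k3y"                 # unknown to the attacker
m = b"amount=5"
mac = toy_hash(secret + m)      # naive MAC: H(secret || message)

# The attacker knows mac and len(secret + m), but not secret: resuming
# from the published state forges a MAC for an extended message.
glue = pad(len(secret + m))
ext = b"&amount=9999"
forged = toy_hash(ext, state=mac)
assert forged == toy_hash(secret + m + glue + ext)
```

             Truncated variants like SHA-384 and SHA-512/256 withhold
             part of the final state, so there is nothing to resume from.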
        
           | colanderman wrote:
           | I'm super confused. Are SHA-256 and SHA256 _different_ , and
           | if so, why in the world would this be considered a sane
           | naming scheme?
           | 
           | If not, I completely do not understand the inequation you
           | wrote, which seemingly lists SHA-256 (and -512) multiple
           | times.
        
             | timdumol wrote:
             | You're probably confused by "SHA-512/256", which does not
             | mean SHA-512 or 256, but rather a truncated version of
             | SHA-512: https://en.wikipedia.org/wiki/SHA-2 in the third
             | paragraph.
        
               | colanderman wrote:
               | Ah! Makes sense now, thanks.
        
               | Ajedi32 wrote:
               | So why would a truncated version of SHA-512 be better
               | than SHA-512? And why is SHA-512 = SHA-256?
        
               | CiPHPerCoder wrote:
               | Truncated hash functions are not vulnerable to length-
               | extension attacks.
               | 
               | Length-extension attacks are relevant when you design a
               | MAC by passing a secret and then a message to a hash
               | function, where only the message is known.
               | 
               | Truncating the hash (which is what SHA-512/256 and
               | SHA-384 do to SHA-512) removes the ability to grab an
               | existing hash H(k || m) (where k is unknown and m might
               | be known) and append junk because a truncated hash does
               | not contain sufficient information to recover the full
               | state of the hash function in order to append new blocks.
        
               | p1mrx wrote:
               | Why do SHA-512/160 and SHA-512/128 not exist? They could
               | be useful as drop-in replacements for SHA1 and MD5.
        
               | tedunangst wrote:
               | You can truncate a hash anywhere you like. But 128 bits
               | is considered too short now.
        
               | SAI_Peregrinus wrote:
               | Because 224 bits is considered the minimum safe output
               | length for a general purpose hash function. So they'd be
               | drop-in replacements but still wouldn't be safe. Safer
               | than MD5/SHA1, but not actually safe.
               | 
               | So rather than push off getting people to make things
               | actually safe by providing a footgun NIST just didn't do
               | that.
        
               | p1mrx wrote:
               | > 224 bits is considered the minimum safe output length
               | for a general purpose hash function.
               | 
               | Considered by whom?
        
               | CiPHPerCoder wrote:
               | Truncating a hash function to 224 bits puts it at the
               | 112-bit security level, which is roughly equivalent to
               | 2048-bit RSA under today's understanding of the costs of
               | distributed cracking attacks.
               | 
               | There are a lot of standards organizations all over the
               | world with various recommendations.
               | https://www.keylength.com collates quite a few of them.
               | Pick the one most closely relevant for your jurisdiction.
               | 
               | Most of them recommend 2048-bit RSA as their minimum for
               | asymmetric security, and AES-128 / SHA-256 as their
               | minimum for symmetric security. This is a [112, 128]-bit
               | security lower bound.
               | 
               | Truncating a hash to 160 bits yields 80-bit security,
               | which is insufficient. 128 bits (64-bit security) is out
               | of the question.
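
               The arithmetic behind these security levels is the
               generic birthday bound: collision resistance of an ideal
               n-bit hash is about n/2 bits (the helper name below is
               illustrative):

```python
def collision_security(n_bits):
    """Approximate bits of work for a generic birthday-bound
    collision attack against an ideal n-bit hash."""
    return n_bits // 2

for n in (128, 160, 224, 256):
    print(f"{n}-bit digest -> ~2^{collision_security(n)} collision work")
```

               Hence 224-bit output gives 112-bit collision security
               (the floor most standards bodies accept), while 160-bit
               output gives only 80 bits.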
        
               | [deleted]
        
             | dlgeek wrote:
             | SHA-512/256 is a truncated SHA-512 that's shortened to 256
             | bits, it's a separate standard.
        
             | [deleted]
        
             | faceplanted wrote:
             | "SHA-512/256" is a single entity
        
             | CiPHPerCoder wrote:
             | I've edited my parent comment to consistently use hyphens.
             | 
              | There are six SHA-2 family hash functions:
              | 
              |   * SHA-224
              |   * SHA-256
              |   * SHA-384
              |   * SHA-512
              |   * SHA-512/224
              |   * SHA-512/256
             | 
             | Hope that helps. (I know it's still confusing.)
        
         | oconnor663 wrote:
         | SHA-512 provides a higher security level (256 bits), but
         | there's not much practical value in a security level higher
         | than 128 bits. I think "better" is too strong a word here.
         | 
         | SHA-512 is faster than SHA-256 in software on 64-bit machines.
         | That's a more important difference than the security level.
         | However, there are two major caveats to consider: 1) Hash
         | function performance is more likely to matter on cheap
         | (non-64-bit) hardware where everything is slow, than on fancy
         | hardware where everything is fast. 2) Some x86 and ARM chips
         | have hardware accelerated implementations of SHA-256, but not
         | of SHA-512.
        
         | GlitchMr wrote:
         | SHA-256 is shorter, and there isn't much of a difference
         | between SHA-256 and SHA-512 (they both use the SHA-2
         | algorithm).
        
         | [deleted]
        
         | woadwarrior01 wrote:
         | FWIW, the instructions in Intel SHA extensions (which are also
         | supported on modern AMD processors from Ryzen onwards)
         | only support SHA-1 and SHA-256.
        
       | notlukesky wrote:
       | > Responsible Disclosure
       | 
       | We have tried to contact the authors of affected software before
       | announcing this attack, but due to limited resources, we could
       | not notify everyone.
       | 
       | Is there a list of affected software out there?
        
         | CiPHPerCoder wrote:
         | https://github.com/search?q=sha1&type=Code
         | https://github.com/search?q=sha-1&type=Code
        
       | eerrt wrote:
       | The full paper is https://eprint.iacr.org/2020/014.pdf if anyone
       | is interested
        
       ___________________________________________________________________
       (page generated 2020-01-07 23:00 UTC)