[HN Gopher] The first chosen-prefix collision for SHA-1 ___________________________________________________________________ The first chosen-prefix collision for SHA-1 Author : ynezz Score : 702 points Date : 2020-01-07 12:34 UTC (10 hours ago) (HTM) web link (sha-mbles.github.io) (TXT) w3m dump (sha-mbles.github.io) | bjornsing wrote: | > We note that classical collisions and chosen-prefix collisions | do not threaten all usages of SHA-1. In particular, HMAC-SHA-1 | seems relatively safe, and preimage resistance (aka ability to | invert the hash function) of SHA-1 remains unbroken as of today. | | Nice to see this bit of intellectual honesty. Would be even nicer | if they had explained what that means in terms of PGP keys. | dward wrote: | HMACs do not require collision resistance from the underlying | hash to provide secure message authentication. HMAC-MD5 is | still considered "secure", although that doesn't mean you | should use it. | | http://cseweb.ucsd.edu/~mihir/papers/hmac-new.pdf | _notreallyme_ wrote: | It means if someone you want to impersonate uses the Web of | Trust, i.e. their key is signed by other people whose keys have | been signed the same way, you can generate a GPG key for which | all of these signatures are still valid. | | For example, if an attacker gains access to a victim's email | account, they could send their contacts a "trusted" key (as | explained above) and then use it to send signed documents to | the victim's contacts. | | This would defeat an adversary "paranoid" enough to check a key | signature, but not paranoid enough to obtain a clear | explanation/confirmation of why the key changed... | bjornsing wrote: | > It means if someone you want to impersonate uses the Web of | Trust, i.e. their key is signed by other people whose keys | have been signed the same way, you can generate a GPG key for | which all of these signatures are still valid. | | No...
| | > For example, if an attacker gains access to a victim email | account, they could send to their contacts a "trusted" key | (as explained above) and then use it to send signed documents | to the victim's contacts. | | Ok... But in this scenario the attacker has the victim's new | private key, so they don't need to create a collision (using | OP). They can just use the new private key to sign the | documents. Right? | _notreallyme_ wrote: | > No... | | Why ? | | > in this scenario the attacker has the victim's new | private key | | You don't want to keep your private key in cleartext on | your email provider servers, do you ? | makomk wrote: | This allows you to take two messages and append some data | to both of them which causes the modified versions to | have the same SHA-1 hash - but you need to modify both | messages, and in order to use this in an attack you need | to set up a scenario where the SHA-1 hash of one of your | modified messages is trusted for some purpose. Creating a | message with the same hash as another, existing message | requires a second-preimage attack which is much harder | and not feasible for any cryptographic hash that's | currently in use. | im3w1l wrote: | > It means if someone you want to impersonate uses the Web Of | Trust, i.e. their key is signed by other people whose keys | have been signed the same way, you can generate a GPG key for | which all of these signatures are still valid. | | I'm not really knowledgeable about the implementation details | of GPG. Mind explaining how this follows? | whatshisface wrote: | > _This would defeat an adversary "paranoid" enough to check | a key signature, but not paranoid enough to obtain a clear | explaination/confirmation of why the key changed..._ | | Thereby turning the signal intelligence problem into a human | intelligence problem. 
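dward's point above, that HMAC does not rely on the collision resistance of its underlying hash, can be illustrated with a short sketch (the key and messages below are made up for illustration): an attacker who finds public SHA-1 collisions still cannot compute a single valid tag without the secret key.

```python
import hashlib
import hmac

# HMAC-SHA-1's security rests on the keyed compression function acting
# as a PRF, not on SHA-1's collision resistance, which is why the
# authors call HMAC-SHA-1 "relatively safe" despite the collision.
key = b"shared-secret-key"  # illustrative value

tag = hmac.new(key, b"transfer 5 EUR to Bob", hashlib.sha1).hexdigest()

# The receiver recomputes the keyed tag and compares in constant time.
ok = hmac.compare_digest(
    tag, hmac.new(key, b"transfer 5 EUR to Bob", hashlib.sha1).hexdigest())
tampered = hmac.compare_digest(
    tag, hmac.new(key, b"transfer 5 EUR to Eve", hashlib.sha1).hexdigest())
print(ok, tampered)
```

None of this means new designs should pick SHA-1; it only means the collision results do not directly break HMAC-SHA-1, matching the authors' quoted caveat.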
| nneonneo wrote: | So to be clear about what this is (because the website doesn't | quite clarify): this collision lets you pick two different | prefixes P1, P2, and then compute some pseudorandom data C1, C2 | such that SHA1(P1+C1) = SHA1(P2+C2). The length extension | property of SHA1 (and MD5) means that now SHA1(P1+C1+X) = | SHA1(P2+C2+X) for any X. | | A similar attack (which requires only a few hours on modest | hardware nowadays) has been known for a long time for MD5, but | this is the first time it's been demonstrated for SHA-1. | | The previous attack, called Shattered (https://shattered.io), was | a regular collision: they chose a single prefix P and | found different C1, C2 such that SHA1(P+C1) = SHA1(P+C2). This | can also be length extended, so that SHA1(P+C1+X) = SHA1(P+C2+X). | However, that attack is more limited because there is little to | no control over the pseudorandom C1 and C2 (the only differing | parts of the messages). | | With a chosen-prefix collision, though, things are way worse. Now | you can create two documents that are arbitrarily different, pad | them to the same length, and tack on some extra blocks to make | them collide. | | Luckily, the first collision should have already warned people to | get off of SHA-1. It's no longer safe to use for many | applications. (Note: generally for basic integrity operations it | might be OK since there's no preimage attack, but I'd still be a | bit wary myself.) | tptacek wrote: | Squeamish Ossifrage's answer to this Crypto Stack Exchange | question does a predictably excellent job of putting these | attacks in context. | | https://crypto.stackexchange.com/questions/60640/does-shatte... | ReidZB wrote: | Agreed. It's a great loss to the community that they have | decided to step away from Stack Exchange: | https://crypto.meta.stackexchange.com/questions/1361/im-leav... | gowld wrote: | * that the owners of Stack Exchange drove them and others | away.
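nneonneo's two properties (a collision from chosen prefixes P1, P2 via computed blocks C1, C2, and its survival under any common suffix X) can be reproduced end to end on a deliberately weakened hash. The 16-bit toy_hash below is invented for illustration only, but it shares SHA-1's Merkle-Damgard chaining, and because the output is the final internal state, equal hashes imply equal state, which is exactly why appending the same X preserves the collision:

```python
import hashlib

BLOCK = 4  # toy block size in bytes

def compress(state: int, block: bytes) -> int:
    # Toy 16-bit compression function (a deliberately insecure stand-in).
    digest = hashlib.md5(state.to_bytes(2, "big") + block).digest()
    return int.from_bytes(digest[:2], "big")

def toy_hash(msg: bytes) -> int:
    # Merkle-Damgard chaining, structurally like SHA-1 (padding omitted;
    # callers pass whole blocks). The output IS the final chaining state.
    state = 0x1234  # fixed IV
    for i in range(0, len(msg), BLOCK):
        state = compress(state, msg[i:i + BLOCK])
    return state

# Chosen-prefix collision by brute force: two DIFFERENT prefixes, then
# search for per-prefix blocks C1, C2 that bring the states together.
P1, P2 = b"GOODGOOD", b"EVILEVIL"
table = {}
for c in range(2 ** 16):
    block = c.to_bytes(BLOCK, "big")
    table[toy_hash(P1 + block)] = block
for c in range(2 ** 16):
    C2 = c.to_bytes(BLOCK, "big")
    if toy_hash(P2 + C2) in table:
        C1 = table[toy_hash(P2 + C2)]
        break

assert toy_hash(P1 + C1) == toy_hash(P2 + C2)
# Length extension: equal state means ANY common suffix keeps them equal.
X = b"ARBITRARY SUFFIX"  # 16 bytes = whole blocks in this toy
assert toy_hash(P1 + C1 + X) == toy_hash(P2 + C2 + X)
```

Real SHA-1 has a 160-bit state, so this brute-force search is hopeless there; Shambles instead spends its roughly 2^63.4 work on differential paths that steer the two states together, but the consequence once the states match is the same.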
| RickHull wrote: | Actually, the website does cover this in the '''Q&A''' section: | | > What is a chosen-prefix collision? | | > A classical collision (or identical-prefix collision) for a | hash function H is simply two messages M and M' that lead to | the same hash output: H(M) = H(M'). Even though this security | notion is fundamental in cryptography, exploiting a classical | collision for attacks in practice is difficult. | | > A chosen-prefix collision is a more constrained (and much | more difficult to obtain) type of collision, where two message | prefixes P and P' are first given as challenge to the | adversary, and his goal is then to compute two messages M and | M' such that H(P || M) = H(P' || M'), where || denotes | concatenation. | | > With such an ability, the attacker can obtain a collision | even though prefixes can be chosen arbitrarily (and thus | potentially contain some meaningful information). This is | particularly impactful when the hash function is used in a | digital signature scheme, one of the most common usage of a | hash function. | nneonneo wrote: | The confusing thing is that the team behind SHAttered _did_ | choose a prefix - they just had to choose a common prefix | rather than two different prefixes (choosing a common prefix | simply changes the initial state used by SHA-1). So I 'm | hoping that my post clarifies this point a bit. | | The SHAttered attack (classical collision) _can_ be used in | practice - for example, my project | https://github.com/nneonneo/sha1collider exploits their | collision to turn any two PDFs into documents with identical | SHA-1 hashes. | gowld wrote: | Is any practical application safe from Shattered but not | Shambles? | dickjocke wrote: | Can you give a specific example of the danger here? I | understand the principle behind the attack (kinda). | | I just don't understand what danger being able to pad two | documents to make them collide poses? 
| | edit: My guess is that it can be abused to make something that | I believe to be library X actually be library Y when I download | it from the internet. Let's say I want to download something, | and I check the signature provided. Assuming the attacker is | able to send me the wrong library via a MITM attack, how can | this prefix collision work? It seems that the original library | AND the original signature on the library's website have not | been altered, so their efforts to use this and make them match | are impossible. And it seems like if they can alter the | signature on the website and stuff, then all bets are off--why | not just send the malicious library at that point? | femto113 wrote: | It's a step on the same path that led to being able to spoof | a CA for MD5-signed certs. | | https://www.google.com/amp/s/techcrunch.com/2008/12/30/md5-c... | ThePowerOfFuet wrote: | Please don't feed the cancer which is AMP. | | https://techcrunch.com/2008/12/30/md5-collision-creates-rogu... | sofaofthedamned wrote: | Please don't overegg the issue with AMP by comparing it | to cancer. | andrewstuart2 wrote: | Don't make cancer anything it's not, either. Cancer is | just growth of abnormal cells, unconstrained, to the | point that it causes harm to the host. | | That said, maybe AMP is more like a virus: a non-living | agent that spreads by infecting living organisms and | repurposing them to replicate itself instead of | sustaining the organism they were a part of. | | The more sites adopt AMP, the more everyone else says | "well I guess we have to now." Seems pretty viral. | nneonneo wrote: | Suppose your system uses SHA-1 hashes for codesigning | verification (e.g. to load a system driver). I create an | innocent-looking device driver and convince a signing | authority to sign it. However, secretly I've created a | malicious driver (e.g. a rootkit) which collides with my | innocent one.
Now, I can load the malicious one on your | machine - which the signing authority has never seen - using | the signing certificate of the legitimate one. | | This might sound far-fetched; after all, you'd need to | convince a signing authority to sign the code. But this is | pretty much exactly how Apple's Gatekeeper verification | works: your software is submitted to them, and they do some | security checks and notarize your bundle | (https://developer.apple.com/developer-id/), and I'm sure | there are many more such examples out there. | gowld wrote: | > you'd need to convince a signing authority to sign the | code. | | Not necessarily. You can also self-sign or just announce | the hash, let the public inspect and test your good driver | for a year, and then ship the bad driver to people who only | check the hash. | dickjocke wrote: | Hey OP, | | I'm not much of a math guy; I'm getting hung up on this | part: | | SHA1(P1+C1+X) = SHA1(P2+C2+X) for any X. | | The example above seems like SHA1(GOOD_DRIVER) == | SHA1(BAD_DRIVER+C2+X) somehow. | | How do the C1 and X get appended to the signature of the | good driver? | cjm42 wrote: | The good driver consists of P1+C1+X. That's what gets | sent to the signing authority. They verify it doesn't do | anything bad and return a signature listing | SHA1(P1+C1+X). But that signature is also valid for the | malicious driver P2+C2+X. | nneonneo wrote: | Most executable file formats (including drivers) put the | code first, followed by the data. So you could construct | your drivers thusly: | | GOOD_DRIVER = P1 (good code and some data) + C1 (data) + | X (more data) | | BAD_DRIVER = P2 (bad code and some data) + C2 (data) + X | (more data) | | You'd disguise the random-looking block of C1 data in the | middle of the good driver as e.g. a cryptographic key to | avoid suspicion. The "more data" part couldn't be | modified in the bad driver, but since you can arbitrarily | modify P2 this wouldn't be a severe restriction.
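The flow cjm42 describes (the authority signs SHA1(P1+C1+X); the signature then transfers to P2+C2+X) can be sketched with a hash weakened until brute-force collisions are cheap. Everything here is illustrative: a 24-bit truncation of SHA-1 stands in for broken SHA-1, and an HMAC over the hash stands in for the authority's signature.

```python
import hashlib
import hmac
import itertools

def weak_hash(data: bytes) -> bytes:
    # Stand-in for a broken hash: SHA-1 truncated to 24 bits, so a
    # birthday search finds collisions in roughly 2^12 attempts.
    return hashlib.sha1(data).digest()[:3]

def sign(key: bytes, data: bytes) -> bytes:
    # The "authority" signs only the hash of the file it inspected.
    return hmac.new(key, weak_hash(data), hashlib.sha256).digest()

# Birthday-search a good/bad pair of "drivers" with the same weak hash.
seen = {}
good = bad = None
for n in itertools.count():
    for prefix in (b"GOOD DRIVER ", b"EVIL DRIVER "):
        candidate = prefix + str(n).encode()
        h = weak_hash(candidate)
        if h in seen and not seen[h].startswith(prefix):
            a, b = seen[h], candidate
            good, bad = (a, b) if a.startswith(b"GOOD") else (b, a)
            break
        seen.setdefault(h, candidate)
    if good is not None:
        break

key = b"authority-signing-key"
cert = sign(key, good)  # the authority only ever saw the good driver...
# ...but the signature verifies for the colliding bad driver as well.
assert hmac.compare_digest(cert, sign(key, bad))
```

A real attack would shape the colliding pair as P1+C1+X / P2+C2+X exactly as described above; here the colliding bytes are just counter suffixes, which is enough to show that one signature validates two different files.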
| dickjocke wrote: | thank you OP and everyone. I have that shaky initial | understanding, but it makes much more sense now. | air7 wrote: | He said "innocent-_looking_ device driver". | | The good driver is actually padded so that it can be | later replaced with the bad driver. I.e. this doesn't | allow the attacker to replace _any_ driver, only one | they prepared in advance to look innocent but have the | right structure. | dickjocke wrote: | that makes a lot more sense, thank you | Kalium wrote: | The core of it is that a malicious document will | pass the check of authenticity when you are using the | _genuine signature_. Someone could tamper with a mirror and | infect you that way. Absolutely no tampering with the | signature would be required, which is what makes this | dangerous. Basically, it can be used to send entirely fake | data that you can't tell is fake. | | Or someone could say that you have your legally binding | signature attached to a contract with hash ABC, but then | present a different contract with hash ABC but very different | terms. | | Those are the end state. This is a major step closer to that | end state. Neither of those scenarios is the case _today_, but | they're now close enough that it's a matter of time. And | likely not a lot of time. | | Time to abandon SHA-1. | GhettoMaestro wrote: | Lol and my engineering team called me paranoid when I made us go | from SHA-1 to SHA-256 (it was trivial for us). Guess I'm having | the chuckle now. | rustybolt wrote: | > By renting a GPU cluster online, the entire chosen-prefix | collision attack on SHA-1 cost us about 75k USD. | | So they just decided to try their attack and spend two years' | worth of salary on it?? That's crazy. | oppositelock wrote: | That's dirt cheap for a government actor, and we can be sure | that the big governments have been doing this sort of attack | for years. | rustybolt wrote: | It's a lot of money for an academic researcher.
| progval wrote: | Researchers in applied fields other than CS spend this kind | of money on a regular basis. eg. | https://www.quora.com/What-are-the-costs-for-lab-rat- | testing... | CydeWeys wrote: | Is it? Multi-million grants are common in academia. A lot | of research is expensive. When I worked in a lab the | materials alone for a single day's experiment would | frequently run into the thousands. E.g. the total human RNA | samples we used cost many thousands of dollars per | _milligram_. Admittedly this is a different field, but it | 's still academia. | klmr wrote: | > _Multi-million grants are common in academia._ | | Not in the way you explain. Multi-million dollar grants | are usually awarded over multiple years, and pay for | multiple researchers' salaries, as well as other | resources, some of which are expected to outlast the | project (e.g. microscopes, or hardware for databases). | Burning 75k on a single experiment (which is effectively | what was done here) is rare. | | Note that this is true even for current hot topics such | as biomedical (e.g. cancer) research, for which vast | chunks of the federal budget have been allocated in | multiple countries. Obtaining similar sums in less sexy | fields is much harder. And even in biomedical research, | multi-million dollar grants are considered _large_. Most | grants are much smaller, they just don't get talked about | as much. | | Since you mention human RNA samples I assume you know | this. You mention the per-milligram cost but this is | pretty misleading if you mean to imply that "milligram" | is somehow little, because it isn't: yes, the samples are | tiny (<=1 ug of RNA is more typical than milligrams!), | but so what? It's not like we need more. | exikyut wrote: | No. It's incredibly generous and selfless. They threw that | money away _just_ to incontrovertibly prove a huge point to the | entire security community (and all developers, at the end of | the day). | | Like... chaotic good. 
Really really good. | martpie wrote: | You can see their emails at the bottom: those are | university/research-institute domains. You can be sure | _they_ did not actually spend that themselves. | rustybolt wrote: | I'm just amazed that they were willing to take the risk that | there was a bug in the code and they wouldn't find a | collision. | CiPHPerCoder wrote: | This is _cheap_! | | Breaking MD5 for the Flame malware probably set the NSA | back millions when you sum the person-hours and CPU time | together. | basilgohar wrote: | It was a sound premise to investigate. EFF spent much more | to demonstrate the weakness of DES decades ago, and it | highlighted the need for stronger crypto and the fact that | nation states would be more than able to break commonly-used | crypto at the time. | n-gauge wrote: | As GPUs get better, the cost will come down. Was it a massive | cluster of RTX 2060s or something? | aidenn0 wrote: | 900 GTX 1060s. | silasdavis wrote: | Who paid for this? | jessant wrote: | Judging by the paper[1], I would say any or all of Inria, | Nanyang Technological University, and Temasek Laboratories. | | [1] https://eprint.iacr.org/2020/014.pdf | emilfihlman wrote: | > Can I try it out for myself? Since our attack on SHA-1 has | practical implications, in order to make sure proper | countermeasures have been pushed we will wait for some time | before releasing source code that allows to generate SHA-1 | chosen-prefix collisions. | | Sigh. Again with this idiocy. All instances where the adversary | is capable of launching this attack financially mean they also | have the capability to write the exploit themselves. | DarkWiiPlayer wrote: | It does make sense. | | Targets worth attacking at a high financial cost will most | likely be the first to take measures against this attack. | | The kind of target that takes a longer time to switch most | likely isn't worth attacking unless it's a very cheap and fast | operation.
| | And the longer you spend developing an exploit, the less viable | the attack will become. | lm28469 wrote: | > All instances where the adversary is capable of launching | this attack financially mean they also have the capability to | write the exploit themselves. | | Iran will eventually create a nuclear bomb; why don't we give | them one now? It's the same thing, isn't it? | whatshisface wrote: | > _A countermeasure has been implemented in commit edc36f5, | included in GnuPG version 2.2.18 (released on the 25th of | November 2019): SHA-1-based identity signatures created after | 2019-01-19 are now considered invalid._ | | Since SHA-1 was always possible to break, and since NSA probably | gets access to big computers and sophisticated techniques before | researchers, why doesn't this invalidate every SHA-1 signature | ever made and not just ones from last year? | Boulth wrote: | Actually it's even worse than that: signature creation time is | added by the signer, so it's totally under the control of the | attacker. IMHO all SHA-1 based signatures should be ignored. | im3w1l wrote: | The signer is assumed to be trustworthy. That's why we care | for their signature in the first place. It's the sign-ee that | is presumed to be the attacker. | NieDzejkob wrote: | In this case, signature creation time isn't under the control of | the attacker. The attack scenario being considered is that | Mallory can convince Alice to sign a key K1, provided by | Mallory, such that it looks like Alice signed K2. The party | creating the signature is honest here. | pornel wrote: | No, because this is a collision attack (without control over | the hash value), not a preimage attack (where you match an | existing hash). | | We know how to make _pairs_ of new files that collide with each | other, but there's no known way of creating a file that | collides with something specific that existed before.
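pornel's collision-vs-preimage distinction is a concrete cost gap you can measure on a truncated hash (a 16-bit cut of SHA-1, used here purely for illustration): a birthday-style collision among fresh messages needs on the order of 2^8 attempts, while matching one specific pre-existing digest, which is what forging an already-published hash would require, needs on the order of 2^16.

```python
import hashlib
import itertools

def h16(data: bytes) -> bytes:
    # SHA-1 truncated to 16 bits: weak enough to attack exhaustively.
    return hashlib.sha1(data).digest()[:2]

# Collision: find ANY two distinct inputs with equal hash (~2^8 tries
# expected by the birthday bound).
seen, collision_tries = {}, 0
for n in itertools.count():
    collision_tries += 1
    d = h16(b"msg-%d" % n)
    if d in seen:
        m1, m2 = seen[d], b"msg-%d" % n
        break
    seen[d] = b"msg-%d" % n

# (Second-)preimage: match the hash of one specific existing message
# (~2^16 tries expected). This is the attack that existing, unmodified
# files would need, and it remains infeasible for real SHA-1.
target = h16(b"the existing signed document")
for preimage_tries in itertools.count(1):
    forgery = b"forgery-%d" % preimage_tries
    if h16(forgery) == target:
        break

print(collision_tries, preimage_tries)
```

Scaled up to SHA-1's 160 bits, the same gap is about 2^80 for a generic collision (around 2^63.4 with Shambles' shortcuts) versus about 2^160 for a preimage, which is why files that already existed before an attack are not at risk.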
| edwintorok wrote: | > security level 2 (defined as 112-bit security) in the latest | release (Debian Buster); this already prevents dangerous usage of | SHA-1 | | FWIW this doesn't apply to Fedora currently, because it has a | patch that re-enables SHA-1 in security level 2 in non-FIPS mode: | https://src.fedoraproject.org/rpms/openssl/blob/master/f/ope... | noja wrote: | Is a collision impossible with two hashes, each using a different | algorithm? | femto113 wrote: | Not impossible, but assuming there's not a mathematical flaw | that affects both algorithms, the difficulty is roughly the | product of the difficulty of finding a collision in each. AFAIK | no one has come up with a joint collision for MD5+SHA1 despite | collisions in each being practical for several years. | noja wrote: | So should we do that, as well as keep finding new algorithms? | femto113 wrote: | Concatenating two hashes is an algorithm. Mathematically, | concatenating two 128-bit hashes is not any stronger than a | single 256-bit hash (and is likely weaker), but if it's all | you have (or all you can afford to compute), two weak hashes | are definitely much better than one. | noja wrote: | Why is concatenating two 128-bit hashes (each with a | different algorithm) not stronger than a single-algorithm | 256-bit hash? | femto113 wrote: | One reason is that it is theoretically possible to use | memory instead of computation to attack the combined | hashes by pre-generating a large number of collisions | under one algorithm and then simply checking those using | the other one, which means you don't need to do both | algorithms for every check. Can't say for sure if that | works out cheaper in terms of money, but if the memory is | available it could definitely save a lot of time.
| | Another reason is that for any given hash its theoretical | maximum strength against any attack will be less than or | equal to its bit length, but the practical strength | always trends lower over time as attacks are found, and | having two algorithms to attack increases the chances of | finding flaws. | pferde wrote: | That is what Gentoo Linux' package manager does - it checks | checksums of downloaded files made by several different hashes | (as well as file size). | LennyWhiteJr wrote: | The root certificate authority for my company's Active Directory | is signed using a sha1 hash. What are the practical implications | of this chosen collision? | | How do I convince my IT department to update our CA to sha256? | HashThis wrote: | Quick, everyone bend over and cover your hashes | jVinc wrote: | So how would someone go about gaining more than 45k USD in profit | from a single case of using the chosen-prefix collision? Not | being candid here, I am honestly curious here. I'd guess that | even in situations where you somehow get a signed e-mail sent off | spoofing a CEO saying "Please pay these guys 50k$" the actual | payout seems unlikely and that puts the attacker 45k in the red. | But maybe there are some obvious avenues of abuse that I'm | missing, or is this more a case of "In a decade it will become | economical to abuse this for profit"? | DaiPlusPlus wrote: | Bitcoin uses SHA-256, but I wouldn't put it past some of the | devs of some of the Altcoins and Shitcoins to use SHA-1 for | either - or both of - their Proof-of-Work or Blockchain | integrity hashing algorithms. | | So if I understand today's news correctly, you could use this | to break blockchain integrity and offer-up alternative "valid" | historical blocks (but not cheat at proof-of-work). 
You would | still need to convince a quorum of network nodes to use your | fake historical blocks - I imagine this might be doable on | lesser-used coins that still have some trades - you could | probably combine this with a few pump-and-dump trades too | (without costing you anything as the coins you pump would be | stolen). | biggestdecision wrote: | At the current time this is much more likely to be abused for | political/intelligence reasons, than for profitable criminal | reasons. Sneaking a malicious commit with the same signature | into a git repository for example. | racingmars wrote: | I'm familiar with an organization that lost about $750,000 | because someone spoofed an email from the CEO to the CFO asking | to wire money to an account. The CFO fell for it. AFAIK, the | money was never recovered (nor was the CFO fired... it was all | just chalked up to 'the cost of doing business'). | | That was with NO crypto/signature spoofing involved... if the | CFO has now been trained to not act on large dollar amount | requests from the CEO without at least checking a digital | signature... perhaps the CFO would be _more_ likely to fall for | it now since he has been "trained" that cryptographic | signatures are a sign of authenticity? | jVinc wrote: | $750,000 is a lot of money, but seeing as some people will | act on emails without signature, and that you would | effectively have to invest that amount up front in order to | attempt this attack on 15 individuals, and then hope that at | least one falls for it just to make it even, I can't really | see this being a viable attack vector. Maybe if the cost goes | down significantly to the $100-$1000 range it might be | something you would see in the wild. | tambourine_man wrote: | This kind of thing always brings me down a bit. It's not | rational, but it does. | | I mean I truly admire these folks skills, the math involved is | obviously remarkable. 
| | But I think the feeling is related to not being able to rely on | anything in our field. Hard to justify going to the trouble of | encrypting your backup. 10 years from now, it might be as good as | plain text. | | It's not security only; nothing seems to work in the long term. | Imagine an engineer receiving a call at midnight about his bridge | because gravity changed during daylight saving in a leap year. | That's our field. | flingo wrote: | It was always broken like this. These guys just figured that | out. | | Now we know. It's better to know. | mlurp wrote: | This can really apply to almost everything though, since if I | understand you, it seems to be about progress in the face of | impermanence. The history of science is basically a constant | process of things kind of working, then breaking. But overall | it goes forward and things are better off for it. In contrast, | I find it kind of inspiring. | tomphoolery wrote: | The thing to remember about cryptography is that as long as | it's based around computational power, it can always be broken | at some point. Especially if we're building exponentially more | powerful computers every N years. | dorgo wrote: | Assuming the universe is infinite in some sense. Otherwise | the exponential growth has to stop at some point. So base | your cryptography around computational power beyond physical | limits and you are safe. | ChrisCinelli wrote: | Even if you could make it secure for all your lifetime, this is | always at play: https://xkcd.com/538/ - In general, probably | nobody gives a dime about our backups, but if they did they would | find a way to get the data before it's encrypted or while we are | decrypting the backup. | suixo wrote: | The last part of this comment made me laugh, trying to imagine | someone shouting over Slack "PATCH YOUR BRIDGE, NOW!!" | wbl wrote: | https://en.m.wikipedia.org/wiki/Citigroup_Center It has been | done once, after they found a bug.
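Circling back to the earlier subthread about checking two different hashes (which, as pferde notes, Gentoo's package manager effectively does by verifying several digests plus the file size): a minimal sketch of such a combined check. The function names and manifest shape here are invented for illustration.

```python
import hashlib

def combined_digest(data: bytes) -> str:
    # Concatenate two independent digests (128-bit MD5 + 160-bit SHA-1).
    # Joux-style multicollisions make this weaker than an ideal 288-bit
    # hash, but breaking it still requires colliding both functions at
    # once, which has no public demonstration.
    return hashlib.md5(data).hexdigest() + hashlib.sha1(data).hexdigest()

def verify_download(data: bytes, expected: str, expected_size: int) -> bool:
    # Also check the size, mirroring package managers' manifests.
    return len(data) == expected_size and combined_digest(data) == expected

payload = b"some package contents"
manifest = (combined_digest(payload), len(payload))
print(verify_download(payload, *manifest))
print(verify_download(payload + b"!", *manifest))
```

For new designs a single modern hash (SHA-256 or stronger) is the right answer; the combined check is a mitigation for systems stuck with legacy hashes.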
| pathseeker wrote: | All of this stuff is so new that we'd expect to see churn. Many | people involved in the discovery that asymmetric crypto was | even possible (Diffie, Hellman, Rivest, etc.) are still alive. | If the people that first invented bridges were still alive, we'd | be running into all kinds of failures in bridge-building | techniques. Bridges have been built in the last 100 years that | the wind blew down! | gist wrote: | The way I see it, this is what is bad: | | > We have tried to contact the authors of affected software | before announcing this attack, but due to limited resources, we | could not notify everyone. | | Many in the 'security industrial complex' act as if they are | doing god's work by their 'research', but from a common-sense, | man/woman-on-the-street, layman point of view that is not | what appears to be going on at all. | | What they do is self-serving and a detriment. Then they try to | justify it in some way as being good when it's not good. It's | like going around and compiling a list of doors in a | neighborhood that are open, attempting to contact everyone but | not reaching some people, and then saying 'hey, look at what we | did for you'. Meanwhile, if a list were not compiled at all, | almost all people would do just fine. | | > But I think the feeling is related to not being able to rely | on anything in our field. Hard to justify going to the trouble | of encrypting your backup. 10 years from now, it might be as | good as plain text. | | Figure out the number of issues there would be if there were not | such open disclosure and an entire industry surrounding | breaking things vs. the same not being done. That is the issue. | | > Imagine an engineer receiving a call at midnight about his | bridge because gravity changed during daylight saving in a leap | year. That's our field. | | No, because nobody is actively trying to change gravity (or is | able to).
| jodrellblank wrote: | The idea behind your complaint is "_if we tell good people | about risks, then bad people will know about them. If we keep | them secret from good people, then bad people won't find out | about them_". Or, "_if we don't make a list of open doors, | then bad people can't find which doors are open_". Which is... | not true. Bad people will already be making their own list | of open doors, and sneaking through them without being | noticed, over and over again. | | > almost all people would do just fine | | Here's an ongoing ransomware attack in the UK, started on New | Year's Eve: https://www.bbc.com/news/business-51017852 The | ransom is in the millions-of-dollars range, Travelex websites | in 30 countries down for a week, Sainsbury's money exchange | websites down for the same time, and "_Dates of birth, | credit card information and social security numbers [of | customers] are all in their possession, they say._" | | Not looking at problems doesn't stop problems from existing. | gist wrote: | > "if we tell good people about risks, then bad people will | know about them. If we keep them secret from good people, | then bad people won't find out about them". Or, "if we | don't make a list of open doors, then bad people can't find | which doors are open". | | You are not recognizing the nuance, which is typically the | case with people who support practically any and all | disclosure and think it's good, plain and simple. With | almost no downside at all (not the case). | | In particular this: "then bad people won't find out about | them" | | My point is that disclosure makes it simpler for more bad | people to find and learn and inflict damage. Unclear | to me how you could think that isn't what has been and is | happening. If someone publishes, say, how to create a card | skimmer at a gas station, then more people (who are bad | actors) will have what they need to give it a try.
If | not there will be people who have figured it out and people | who will put the effort into figuring it out but you must | realize vastly less will do that, right? The disclosure | removes a great deal of friction. | | The amount of effort and friction and the amount of 'bad | people' that can be actors is many magnitudes larger (I | would argue) as a result of disclosure. | jodrellblank wrote: | What I think you're missing is that the disclosure could | come from a bad person who doesn't care about any of your | arguments. It's like I'm saying "banks should invest in | vaults to protect against theft" and you're saying "that | costs money and disruption of building work, what if you | just don't talk about where the money is kept". I agree | that if people didn't steal the money, that would be | nice. But most of the people who are talking about where | banks keep money with a view to stealing it, aren't going | to shut up in order to keep the money safe, because _they | don 't care about keeping the money safe_. So if us bank | customers stop talking about it, a) that doesn't keep it | quiet, and b) our money gets stolen. | | It would be a nice world if we could tell companies about | flaws and they fixed them, and nothing went public, but | instead we tell companies with "responsible disclosure" | and they ignore it, don't spend any effort on it, act | incompetently leaving it with first line support people | who don't understand it and have no path to escalate it, | have no security contacts available for reports, cover it | up or deny it or try to silence it with NDA style | agreements, prioritise shareholder profit or public image | over it, and generally behave irresponsibly in all | possible ways that avoid them having to deal with it, | with very few companies excepted. | | In light of that, public disclosure with all its risks, | actually does kick companies into taking action, and | closing risk vectors for good. 
Like companies who say "we
| put customers first!" but it takes a complaining public
| Twitter thread for them to even respond at all. Telling
| people to not take it to Twitter ignores the fact that
| there's no other way which seems to actually work.
|
| Give an alternative which also gets problems fixed, and
| I'll be much more in favour of it.
| SAI_Peregrinus wrote:
| We typically assume that organizations like the NSA, FSB,
| Mossad, MSS, and the like already know of such attacks.
| gist wrote:
| The percentage of people that are impacted by the NSA et al.
| is exceedingly small compared to the pain and impact
| inflicted on everyday citizens and companies by disclosures.
| Not all disclosures, of course; some are for sure an upside;
| but it's out of control and has been for a long time.
|
| The government (in the US) does not have the resources to
| go after everyone who commits a crime, and that would assume
| they are actually scooping up info and know of the crimes
| (they aren't and they don't). They don't even have the
| resources to audit tax returns (other than a very small
| percentage). This idea that you are being watched all the
| time is fantasy. In the US. Other countries? Unfortunate
| when that's the case, but that does not mean as a US citizen
| I can't view it as detrimental to me that this type of
| security disclosure makes it so easy for hackers to do a
| better job (and it does; nobody is going to dispute that
| fact, right?).
| jessaustin wrote:
| _The government (in the US) does not have the resources
| to go after everyone..._
|
| In point of fact, since Snowden's revelations we know not
| only that the USA state actually does have the resources
| to monitor everyone, but also that it does so.
| Furthermore, the state is not a monolith. While it may
| not be in the interest of the state as a whole to
| capriciously victimize individual humans, it is often in
| the interest of particular officers and organizations
| that comprise the state to do so. Cf.
"parallel | construction". | WorldMaker wrote: | There's an MC Frontalot song called "Secrets from the Future" | and the refrain is "You can't hide secrets from the future." | It's something of a useful mantra to remind oneself that if | "the future" is a part of your threat model, yes your | encryption likely isn't enough because on a long enough | timescale it is likely "the future" will crack it. | | As with any other security issue, the question is "what is your | threat model?" You can still justify encrypting your backup | today if your threat model includes today's actors, however | much you worry about "the future". | | > 10 years from now, it might be as good as plain text. | | Or 10 years from now it might be the next Linear A tablets to | confuse cryptoarcheologists, unreadable and untranslatable and | entirely foreign. If "the future" is in your threat model, | don't forget the other fun forms of entropy beyond encryption | being cracked such as encodings changing, file formats falling | out of service/compatibility, "common knowledge" about slang or | memes lost to the ages leaving things indecipherable, and so on | and so forth. Most of those things are probably unlikely on | only a 10 year time horizon, but you never can tell with "the | future". | MaxBarraclough wrote: | > You can't hide secrets from the future | | With the usual disclaimer about one-time pads. | | > other fun forms of entropy beyond encryption being cracked | such as encodings changing, file formats falling out of | service/compatibility, "common knowledge" about slang or | memes lost to the ages leaving things indecipherable | | I wonder what the digital archaeologists of the future will | make of today's programming languages. | | (I was going to point out how far we've come since the | languages of yesteryear, but we still have such horror-shows | as Perl and Bash.) 
| xapata wrote:
| > Or 10 years from now it might be the next Linear A tablets
|
| At some point in the future, after the universe has expanded
| to the extent that other galaxies are moving away from ours
| faster than the speed of light, someone might read our
| astronomy papers and wonder whether they can believe
| something that cannot be verified.
| CobrastanJorji wrote:
| The point about archeologists is a good one because it speaks
| to motive. In general, we should be very supportive of the
| efforts of distant historians who want to understand what
| humanity used to be like. We should not WANT to hide secrets
| from a sufficiently far future. I can't think of any secret
| that deserves to be hidden from them for any reason besides
| perhaps modesty.
| WorldMaker wrote:
| Relatedly, that is a part of why I love the term
| "cryptoarcheology" in general, as a reminder that digital
| spaces will need archeologists too.
|
| There's its somewhat shortened form "cryptarch", generally
| more used as a title ("cryptoarcheologist"), which was used
| in places in the Quantum Thief trilogy of books and is most
| probably already burned into your brain if you have played
| any of the Destiny series of videogames (and I presume was
| heavily influenced by Quantum Thief).
| moyix wrote:
| Hmm, I thought cryptarch was just crypt + arch with the
| arch part meaning "leader" (i.e. this sense
| https://www.etymonline.com/word/arch-), not
| archaeologist. Is there something about this in the
| Quantum Thief trilogy I've forgotten?
| WorldMaker wrote:
| It's been a bit since I read it, but I recall the meaning
| overlap/double meaning between leader -arch (such as
| plutarch) and arch- from archaeo- was directly
| intentional word play in the book trilogy, and yes that
| leader -arch meaning does add spice to the neologism.
|
| (I don't think Destiny does much with the playful dual
| meaning, though.
Certainly the cryptarchs in Destiny have
| never yet been meaningful leaders.)
| jessant wrote:
| What is considered sufficiently far future may change with
| life extension technology.
| thekyle wrote:
| Well I would assume that as long as you live you'll keep
| updating your crypto as new tech comes out. That way the
| clock only starts ticking when you die.
| SheepSlapper wrote:
| "You can't hide secrets from the future with math. You can
| try, but I bet that in the future they laugh at the
| half-assed schemes and algorithms amassed to enforce
| cryptographs in the past."
|
| Didn't expect to see Front on HN today, what a pleasant
| surprise.
| b4n4n4p4nd4 wrote:
| _coughcaesarciphercough_
| toppy wrote:
| This biblical prophecy from Luke 12:3 is so true: "What
| you have said in the dark will be heard in the daylight, and
| what you have whispered in the ear in the inner rooms will be
| proclaimed from the roofs."
| shrimpx wrote:
| I wonder what the intended meaning is. I presume it has to
| do with a presumed eventual urge to confess.
| toyg wrote:
| It's basically "don't try to get fresh with God". Which
| conveniently translates into a call to humility and
| honesty towards its earthly representatives.
| PhasmaFelis wrote:
| In context, it appears to be saying "you can't keep
| secrets from God."
| comicjk wrote:
| Bridge engineers don't have to fear the progress of science
| working against them, but computer security is not alone here.
| Consider designing body armor or military aircraft and hoping
| that the state of the art will stay the same! An adversary who
| can use the progress of science against you is always
| dangerous. Computer security has been rather lucky so far: the
| asymmetry between hashing and cracking a hash, for example, is
| much more favorable to the defense than the balance between
| bulletproof vests and bullets.
| WorldMaker wrote: | On a long enough time scale Bridge Engineers still have to | worry about decay, entropy, collisions. Some of that can even | be attributed to the "progress of science", as bridges have | collapsed because earlier assumptions were invalidated by | larger trucks, for instance. | | An issue so far for computer security is less that decay and | entropy happen, but that they happen _so fast_ that the | timescales are decades or even years rather than lifetimes. | candu wrote: | This is a great point in general: sometimes the problem | isn't an adversary using knowledge in your field against | you, it's the unintended consequences of progress / changes | in adjacent fields. | | It also underscores that, when dealing with things like | passwords, it's helpful to be able to seamlessly upgrade to | more robust methods down the line, e.g. "if this password | hash was generated using the old method, check the login | using that method but rehash and update the database using | the new method after authentication." | jtmarmon wrote: | The problem is that the expectations of software are much | more akin to the expectations of a bridge than those of a | bulletproof vest | [deleted] | wyday wrote: | >> "Hard to justify going to the trouble of encrypting your | backup." | | Huh? If you're "encrypting" using SHA, I've got some bad news | about those backups of yours. | LadyCailin wrote: | If you use SHA-256 to encrypt your backup, then I just need | to steal your backup and wait 20 years, until that is | cracked, and then I can decrypt your backup, even though | today you're using the "correct" encryption. 
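candu's upgrade-on-login idea above can be sketched concretely. The record format and helper names below are invented for illustration, and the choice of unsalted SHA-1 as the legacy "old method" and stdlib PBKDF2-HMAC-SHA256 as the "new method" is an assumption, not anyone's actual stack:

```python
import hashlib
import hmac
import secrets

# Hypothetical record formats (assumptions for this sketch):
#   legacy: bare hex SHA-1 of the password (unsalted, the "old method")
#   new:    "pbkdf2$<salt-hex>$<derived-key-hex>"
ITERATIONS = 200_000

def new_hash(password: str) -> str:
    """Hash a password with the "new method": salted PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)
    dk = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"pbkdf2${salt.hex()}${dk.hex()}"

def verify_and_upgrade(password: str, stored: str) -> tuple[bool, str]:
    """Check a login; if the stored record uses the old method and the
    password matches, return an upgraded record to write back."""
    if stored.startswith("pbkdf2$"):
        _, salt_hex, dk_hex = stored.split("$")
        cand = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                   bytes.fromhex(salt_hex), ITERATIONS)
        return hmac.compare_digest(cand.hex(), dk_hex), stored
    # Legacy record: verify with the old scheme, then rehash on success.
    ok = hmac.compare_digest(hashlib.sha1(password.encode()).hexdigest(), stored)
    return ok, new_hash(password) if ok else stored
```

On every successful legacy login the caller writes the returned record back, so the database converges to the new scheme over time without forcing a mass password reset.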
| riquito wrote:
| The GP was likely hinting at SHA-1 being a hashing
| function, not an encryption function, so just applying sha*
| wouldn't produce a working backup
| SAI_Peregrinus wrote:
| To be excessively pedantic you can encrypt securely (but
| slowly for the SHA series) with a hash function of
| sufficient capacity by running the hash function in CTR
| mode. You turn it into a stream cipher. Ideally you also
| MAC the ciphertext, nonce, and other associated data.
| That is pretty easy with such a hash function (either
| use HMAC or a MAC mode of the hash function if
| supported).
|
| Salsa20 & ChaCha20 cores are hash functions (though not
| collision-resistant compression functions since it's not
| needed for their design goal and would be slower) run in
| CTR mode.[1]
|
| [1] https://cr.yp.to/salsa20.html
| caspper69 wrote:
| > To be excessively pedantic
|
| This is the best, most delicious, type of pedantry,
| friend!
| gpm wrote:
| It probably will if your data is less than 128 bytes, and
| you're willing to wait a few decades to decrypt it.
| ksangeelee wrote:
| You might be able to find bytes that result in your hash,
| but they probably won't be the same bytes you 'backed
| up'.
| gpm wrote:
| If the data is shorter than the hash shouldn't it be the
| same data I backed up with reasonably high probability?
| dorgo wrote:
| I guess you get (infinitely?) many results which all have
| the same hash and one (or more) of them will be shorter
| than the hash.
| pathseeker wrote:
| No. http://matt.might.net/articles/counting-hash-collisions/
| monktastic1 wrote:
| Can you explain the relevance? If I put N items randomly
| into >> N buckets the chance of there being a second item
| in a _particular_ bucket is small (as opposed to there
| merely _being_ a bucket with two items, as in the
| birthday "paradox").
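SAI_Peregrinus's construction above (a hash run in CTR mode becomes a stream cipher) can be sketched in a few lines. This is an illustration of the idea only, with SHA-256 standing in as the hash; it is not a vetted design, and as the comment notes a real scheme would also MAC the nonce and ciphertext:

```python
import hashlib

def hash_ctr(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing with keystream blocks
    SHA-256(key || nonce || counter), i.e. CTR mode over a hash."""
    out = bytearray()
    for i in range(0, len(data), 32):            # 32 = SHA-256 digest size
        counter = (i // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

# XOR with the same keystream decrypts:
ct = hash_ctr(b"k" * 32, b"n" * 16, b"attack at dawn")
assert hash_ctr(b"k" * 32, b"n" * 16, ct) == b"attack at dawn"
```

Note that, as with any stream cipher, reusing a (key, nonce) pair across messages leaks the XOR of the plaintexts.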
| DuskStar wrote:
| That doesn't apply here, since the birthday paradox is
| about the _existence_ of a collision, not that any
| particular sequence collides.
|
| Most people in the room will still have unique birthdays
| even if one pair share theirs.
| [deleted]
| jodrellblank wrote:
| If you're planning to brute-force count through 2^(128x8)
| possible bit inputs, it will be quite a few decades
| indeed. And you'll need a few spare solar systems to
| annihilate to get enough energy to drive your counting
| engine through that many states.
|
| https://security.stackexchange.com/a/6149/1427
| gpm wrote:
| The idea is to wait for a preimage attack on sha, not
| brute force it.
| im3w1l wrote:
| As an aside, sha-1 is smaller than 128 bytes.
|
| From my numerical experiments (I hope I didn't mess
| up...) using the random oracle model, the probability
| that a given key is collision-free is 99.6% if the input
| is one byte shorter than the hash, 1/e if the input is the
| same size as the hash, and 6.6e-112 if the input is one
| byte longer than the hash.
|
| And this holds basically irrespective of key size.
| tambourine_man wrote:
| I'm referring to not being able to rely on encryption in the
| long term.
| vbrandl wrote:
| Hashing is a separate problem from encryption. There is no
| proof that one way functions (the idea behind hashing) even
| exist (by proving this, you would actually prove P!=NP,
| IIRC). Encryption has a slightly better track record of
| not being broken. AES still holds its promise and is also
| secure against quantum computing (you might want longer
| keys, but that's it).
|
| And if you want really, provably unbreakable encryption,
| there is still OTP. But then you'd need a key that is as
| long as the data you want to encrypt.
| blattimwind wrote:
| The best known attack against AES reduces attack
| complexity by about two bits over brute force. Given the
| history of block ciphers, the idea that AES might not be
| broken in this lifetime is not uncommon.
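im3w1l's numbers above check out under the stated random-oracle model. A quick sketch (assuming SHA-1's 160-bit output behaves like a uniformly random function, and approximating the count of "other inputs of the same length" as 256^bytes) reproduces them:

```python
import math

def collision_free_prob(input_bytes: int, hash_bits: int = 160) -> float:
    """Probability that a given input of `input_bytes` bytes collides with
    none of the other inputs of that length, in the random-oracle model:
    (1 - 2**-hash_bits)**N ~= exp(-N / 2**hash_bits), N = 256**input_bytes."""
    n_inputs = 256.0 ** input_bytes
    return math.exp(-n_inputs / 2.0 ** hash_bits)

# SHA-1 digests are 160 bits = 20 bytes:
print(collision_free_prob(19))   # one byte shorter than the digest: ~0.996
print(collision_free_prob(20))   # same size as the digest: ~1/e
print(collision_free_prob(21))   # one byte longer: ~6.6e-112
```

The "irrespective of key size" observation also drops out of the formula: what matters is only the ratio of the input count to the output space.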
| harikb wrote:
| I think GP was talking about the general nature of "previously
| assumed to be unbreakable" methods being broken. Not sure if
| he was implying using a checksum for encryption as well
| PeterisP wrote:
| What do you mean by "previously assumed to be unbreakable"?
| SHA-1 has been known to be unsafe for a dozen years, we
| just went from "assumed to be breakable" to "yep,
| definitely breakable, here's how one exact attack will
| work".
| rakoo wrote:
| But backups have existed for more than a dozen years. And
| its replacements today, SHA-256 and SHA-3, will also be
| broken if you wait long enough.
| jessaustin wrote:
| I can see why backups might be needed for a dozen years,
| and I can see why encrypted backups might be needed, but
| outside plainly fake requirements like those of "national
| security" why would encrypted backups be needed for a
| dozen years? Aren't we throwing everything sensitive away
| after seven years? After that isn't it mostly about
| preserving history? Even things like balance sheets that
| might be sensitive today will be too out-of-date to be
| sensitive a dozen years from now.
| nitwit005 wrote:
| > Hard to justify going to the trouble of encrypting your
| backup. 10 years from now, it might be as good as plain text.
|
| Realistically, any archive is going to have to re-record data
| to stay ahead of age and equipment becoming obsolete. I've
| heard the lifetime for the physical media of tape backups is in
| the 10-30 year range.
|
| Updating the encryption isn't a big deal once you're already
| rewriting everything.
| harikb wrote:
| At least in our case the problem can be solved by re-encryption.
| Yes, we have to keep up to date with developments before
| everything is completely broken, but it is not as bad as
| discovering a bridge needs to be rebuilt.
Speaking of which, it
| is common to have to retrofit for earthquakes, which probably
| wasn't the rule when they built the original bridge
| hansvm wrote:
| That depends on your threat model. A patient attacker can
| store all of your encrypted messages till some future point
| where decryption is economically feasible. Since they already
| have the weakly encrypted data, encrypting it again doesn't
| solve anything.
| baddox wrote:
| > Hard to justify going to the trouble of encrypting your
| backup. 10 years from now, it might be as good as plain text.
|
| When has that happened? Public key cryptography and symmetric
| key cryptography are still doing fine as far as I'm aware, and
| the latter doesn't even seem to be vulnerable to quantum
| computing.
|
| Moreover, SHA-1 has been considered insecure for, what, at
| least 10 years? The fact that a cryptographic hash function has
| been widely considered insecure and widely recommended for
| deprecation a decade before a proof of concept even emerges is,
| to me, something to feel very good about.
| rakoo wrote:
| "Public/symmetric key cryptography" is just the name of the
| practice; of course it's doing fine. What's not doing fine is
| picking a particular set of ciphers/hash functions/signature
| scheme and expecting to not fail in 40 years.
| xmprt wrote:
| Encrypting is still better than not encrypting and if you
| care about keeping your data private then you can take
| additional measures to ensure that. Nothing will last
| forever but that's not a good reason to be nihilistic.
| mattr47 wrote:
| It's like locking my front door. I chose to lock the door
| to prevent random acts of burglary. But that lock will
| not stop someone determined from BnE-ing my house.
| gowld wrote:
| DES encryption will stop someone from BnE-ing your data.
|
| They might BnE your key, though, if you aren't careful
| enough.
| xoa wrote:
| Incorrect, particular sets of ciphers are themselves doing
| just fine.
There is no hint of well studied symmetric-key | algorithms (like AES, which is in fact now over 20 years | old) with long known sufficient key lengths "failing" (in | terms of cold ciphertext being cracked) in 40 years, or 400 | years for that matter. Even with a fully scalable general | quantum computer, the best known improvement is Grover's | Algorithm, which results in roughly a quadratic improvement | (so a 128-bit key cracked in 2^64 iterations, 256-bit in | 2^128, etc). This obviously is trivially defeated by | doubling the key length, and AES-256 has been available | from the beginning and has been becoming more standard for | a while as well. | | Now, there are various kinds of hot attacks, such as side | channel leaks, that always present a serious challenge in | many threat scenarios. And it's certainly possible to have | bad implementations of a solid algorithm. But for AES | implementations are very mature, and side channel attacks | do not apply to data at rest. So in the context of "will my | encrypted ciphertext with all my deepest secrets [sitting | on a drive/in the cloud/wherever] be crackable in 40 years" | no, symmetric-key crypto with 256-bit keys is dependable at | this point. | | A false sense of defeatism is itself plenty dangerous to | security. | ReidZB wrote: | DES is a good example here: it was designed ~45 years ago | and it's still theoretically surprisingly strong when you | compare it with other areas of cryptography. By that, I | mean it's held up very well to cryptanalysis that | attempts to find faster-than-brute-force techniques, with | the best attack taking 2^43 time (vs. the 2^56 brute- | force time). | | To be clear, 2^56 is trivially brute-force-able today, | but that can be mitigated with constructs like 3DES, | which are still (barely) secure despite being based on a | 45-year-old cipher. | | (So no one mistakes my intent, there are many reasons to | prefer AES over DES. 
I just wanted to provide it as an | example, especially since it happened to line up with the | 40-year timeframe.) | wbl wrote: | Triple DES remains as secure today as in 1975. | saurik wrote: | FWIW, a lot of very smart people seem to think that we actually | do have symmetric encryption in a state where it "works" and is | "understood" and that AES is unlikely to be "broken". What we | don't really feel so great about is asymmetric encryption, but | at least that feels like something P != NP might imply can be | done... hashing algorithms just have this problem where if you | glance at the information theory you would have guessed they | couldn't be possible until someone shows you one that seems to | work. | fanf2 wrote: | Yes: for specific examples of this confidence see JP | Aumasson's paper https://eprint.iacr.org/2019/1492 "Too much | crypto" | im3w1l wrote: | AES may be fine, but what about the block cipher modes? | mehrdadn wrote: | > something P != NP might imply can be done... hashing | algorithms just have this problem where if you glance at the | information theory you would have guessed they couldn't be | possible until someone shows you one that seems to work | | I'm confused what you mean by this? Why does the info theory | suggest hashing wouldn't be possible? Also, you can easily | derive a secure hash function from a secure symmetric cipher | and vice-versa, so why would one seem to be possible but not | the other? | saurik wrote: | 1) The premise of a hash function is that you are going to | take a large set of inputs and map it to a small set of | outputs such that you don't find collisions, with the use | case that the result is somehow so unique that the input | file can be henceforth named by its hash and looked up in | some giant dictionary of everything we have so far ever | bothered to hash. 
|
| The hash is this tiny bit of information, and somehow is
| expected to be sufficient to uniquely describe some
| arbitrarily large amount of information? That just flies in
| the face of the naive expectation. The only argument that
| it is even sort of reasonable to expect would be the idea
| that while there are an insane number of files very similar
| but not quite your file, very few of them are "interesting"
| to a human in any way, and so the set of "interesting"
| files might be super small... but it isn't like we designed
| the hash functions to know that.
|
| The reality that it seemingly can be done well enough that
| no one notices most of the time is fascinating and
| surprising--the kind of thing that inspires awe and wonder
| at the universe if true or calls into question our hubris
| if it isn't--not something obvious in the way the existence
| of symmetric ciphers is, nor something reasonable to expect
| in the way asymmetric ciphers are: to the extent to which
| reality lets us pull this one off we should be thankful.
|
| 2) If one can truly "easily" derive a "secure" hash
| function from a secure symmetric cipher, then why don't we
| ever have secure hash functions, despite seemingly having
| secure symmetric ciphers? If this is so easy, I suggest you
| do it and then wait 30 years and see if someone figures out
| how to break it (and we can see that AES is still holding
| strong).
| mehrdadn wrote:
| 1) Oh I see what you mean now. Yeah I guess it depends on
| your intuition.
|
| 2) I mean, I'm not sure how correct your premise that
| symmetric ciphers are more secure than hash functions is,
| but it literally is something that is done. You can read
| more about it in [1], including the possible pitfalls.
| The transformation should provide more than enough
| intuition to see why both are equally plausible, which
| was the point of my reply.
Whether or not it's best to
| actually implement them that way in practice is a separate
| question which I'm not trying to answer here.
|
| [1] https://crypto.stackexchange.com/a/6476
| silasdavis wrote:
| >The hash is this tiny bit of information, and somehow is
| expected to be sufficient to uniquely describe some
| arbitrarily large amount of information
|
| It doesn't really describe the information, it just
| indexes each piece. For that, it is a very large key
| space. A 128-bit key could index 10^38 objects. That's a
| lot of slots. Provided your hash has sufficient mixing,
| you will need a lot of shots to hit anything.
|
| >If one can truly "easily" derive a "secure" hash
| function
|
| Semantic security has a precise definition. We just have
| to make empirical assumptions about whether an input
| cipher is semantically secure. The hash is only secure if
| the cipher was, but the derivation is easy.
| hannob wrote:
| > Hard to justify going to the trouble of encrypting your
| backup. 10 years from now, it might be as good as plain text.
|
| That is an absurd statement. Every backup you encrypted 10
| years ago with then up-to-date security is still well
| encrypted. The SHA-1 break came slowly, over a timespan of 15
| years. It still doesn't threaten past usage; only active
| attacks are really relevant.
| Izmaki wrote:
| No, the bridge collapsed because it was never touched again
| after initial deployment, for 10 years. How are buildings doing
| in Chernobyl? Don't neglect your data, if you want to keep it
| safe always. :P
| tambourine_man wrote:
| A counterexample is Roman aqueducts
| app4soft wrote:
| Dude, the Chernobyl reactor was by design a nuclear bomb.
|
| P.S. My father (still alive) was one of thousands of common
| liquidators of the Chernobyl disaster from May to July 1986.
| Many of his coworkers who lived with him in the same tent are
| already dead.
| spease wrote:
| > How are buildings doing in Chernobyl?
|
| Not great, but not terrible?
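The cipher-to-hash direction that mehrdadn and silasdavis discuss above (and that the linked crypto.stackexchange answer covers) is classically done with a Davies-Meyer compression function inside Merkle-Damgard chaining. Here is a toy sketch of that construction; the ARX "cipher" below is a deliberately insecure stand-in invented for illustration, where a real design would use something like AES:

```python
MASK64 = (1 << 64) - 1

def _toy_block_cipher(key: int, block: int) -> int:
    """Placeholder 64-bit "cipher": 16 ARX rounds keyed by `key`.
    Stands in for a real block cipher; NOT secure."""
    x, k = block, key
    for r in range(16):
        k = (k * 0x9E3779B97F4A7C15 + r) & MASK64   # cheap round-key schedule
        x = (x + k) & MASK64
        x ^= ((x << 13) & MASK64) | (x >> 51)       # mix in a rotation
        x = (x * 0xBF58476D1CE4E5B9 + 1) & MASK64
    return x

def toy_hash(msg: bytes) -> int:
    """Merkle-Damgard chaining with the Davies-Meyer compression function:
    H_i = E(m_i, H_{i-1}) XOR H_{i-1}, each message block keying the cipher."""
    data = msg + b"\x80"
    data += b"\x00" * (-(len(data) + 8) % 8)        # pad to an 8-byte boundary
    data += len(msg).to_bytes(8, "big")             # length strengthening
    h = 0x0123456789ABCDEF                          # arbitrary fixed IV
    for i in range(0, len(data), 8):
        m = int.from_bytes(data[i:i + 8], "big")
        h = _toy_block_cipher(m, h) ^ h
    return h
```

The feed-forward XOR is what makes the step one-way even though the cipher itself is invertible when the key is known, which is the intuition behind "a secure hash from a secure cipher".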
| femto113 wrote: | While a meaningful accomplishment, suggesting the algorithm is in | a "shambles" seems hyperbolic to me. For one thing there's a non- | trivial practical leap between formulating two colliding | identities and forging an existing one, and for another this was | only modestly better than a pure brute force attack. If anything | I'm somewhat reassured by the idea that it still costs $40,000+ | of GPU time to pull something like this off while doing the same | with MD5 is feasible on a mobile phone. | jwilk wrote: | Link to the GnuPG commit: | | https://dev.gnupg.org/rGedc36f59fcfc | RcouF1uZ4gsC wrote: | Does this affect Git? I believe it uses SHA-1 for commits. Is it | possible to use this attack to add malicious code to a git | repository without changing the hashes for the commits? | LennyWhiteJr wrote: | Potentially. GitHub at least already started collision | detection after Shattered was published. | | https://github.blog/2017-03-20-sha-1-collision-detection-on-... | jlokier wrote: | Just a curiosity, since people are talking about Git still using | SHA-1 (despite work on SHA-256 since 2017). | | I see that Git doesn't actually use SHA-1 any more, it uses | "hardened SHA-1": | https://stackoverflow.com/questions/10434326/hash-collision-... | rst wrote: | Well, according to that reference, it's hardened against a | specific, previously known attack. Do you have any information | on whether that also protects against the different, new attack | which was just published? | jlokier wrote: | I was wondering the same thing, and hoping someone else would | answer that. | joeyh wrote: | Hardened sha1 does detect this new attack. Easy to test: | Check their pair of files into a git repo and see that they | have different checksums, while sha1sum(1) generates the | same for both. | tialaramex wrote: | Not so much the specific attack, as the broad class of | attacks. I think this new work is in that same broad class | but I am not a mathematician. 
|
| The idea in Marc Stevens' anti-collision work is that some
| inputs are "disturbance vectors" which do unusual things to
| the SHA-1 internals, and we want to detect those and handle
| that case specially since there is almost no chance of that
| happening by accident. It has a list of such vectors found
| during his research.
|
| This paper doesn't talk about "disturbance vectors" but it
| does discuss ideas like "Boomerangs" which I think ends up
| being similar - I just don't understand the mathematics
| enough to know whether that means "the same" or not.
| FactolSarin wrote:
| How would an attack on a git repo work? You create a repo with
| identical hashes but different content and next time the user
| clones from scratch they get your modified version?
| flatiron wrote:
| yeah my thoughts about git are similar. look at the two
| messages they have as an example:
|
| Key is part of a collision! It's a
| trap!yE'NsbK#dhW]{1gKmCx's/vr| -pJO_,1$)uB1qXv#U)9ESU;p~0G:Y
| [?]bBIjFranweom3&t'lB_!h5M([,[?]QMK#|o5pv|i,+yYp[?]D7_Rf\'GUZ
| , _[?]dvAYAugV=Lk8_E_ 2 +nolBtxXoQt
| &+?Y3LP:'Qt(,u[?]WJm:A"M6<|B4kVvtA=M+m%Sha j5N|EMA\Ed-
| s&@u@:a?pq^Xf0U?R}
|
| and
|
| Practical SHA-1 chosen-prefix
| collision!'lka}vbI3,*W]A+gK}Cxs/v&r| }-hRJO_ rO;bzC ,1&uRP-
| MXrU3aO;pr0:sY'2 l&r7#(A{oNyCJ_W,8
| @bHBYeepFr2a8#&t+n_15q(_,?QMW#hzYMgVV=L,kO0E*N
| +oc@BpXod&?+?[{3LvP&'U t ( WJIm\:A"6>>|SB(k;Vvt^A=Y
| ;om%j-|cUAAET&@o@:La3psH^eXf0QJm [?]d
|
| they have the same sha1sum, but in all practicality it's
| nonsense since both messages are pure trash. you couldn't
| have malicious C code that would have the same hash as
| non-malicious C code in this example
| saalweachter wrote:
| Isn't that like incredibly simple?
|
| Dump your garbage string behind a // or inside an #if 0,
| restrict the garbage string character set to characters
| which will not disturb that, and your compiler will whistle
| while it works.
| pathseeker wrote: | Depends on if the chosen prefix attack allows the content | to appear arbitrarily in the middle of the byte stream | like that. | flatiron wrote: | anyone checking diffs would notice that, or working on | the file, etc. it wouldn't survive long | munk-a wrote: | I think active projects would detect this fine - but what | if that commit was pushed to lpad and everyone ended up | pulling it to local because it's a dependency of a | dependency of a dependency in NPM? | | Or what if it's a really obscure library for parsing | like... pyramidal jpeg2000s, are the library consumers | going to be checking the source? Heck, most people | already don't check download checksums unless their | downloader does it automatically. | saalweachter wrote: | Hmmm, does the garbage string actually have to survive | long? | | If there's a followup CL to "delete a garbage string that | accidentally made it into the repo", which doesn't | actually fix whatever else was added, would that get you | anywhere? | munk-a wrote: | If you could push up a commit that computed to the same hash | of the last tagged release in a repo... I'm not certain, the | tag might end up referencing the new object? Certain versions | of git (i.e. maybe git for windows) may also react in | different manners. | | In theory you might get people building software packages for | distros to build your malicious version, you may also just | temporarily shut down the ability for anyone to check out the | version (basically denial of service for making?) but the | time window would be weird. | thenewnewguy wrote: | You'd probably be most successful modifying the original repo | - either by being the creator of the software or gaining | their trust. However, it would have to be a rather powerful | SHA1 attack for the commit to still be valid syntax, hard to | detect, and make a meaningful malicious change. 
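For background on what a git "hash" actually covers in the attacks discussed above: a blob's object id is the SHA-1 of a small header plus the file content, so a colliding pair of files would need to collide with that header included. A minimal sketch of the documented scheme (note that modern git actually computes this with "hardened SHA-1", which additionally flags inputs exhibiting the known collision patterns):

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object id of a git blob: SHA-1 over "blob <size>\\0" + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# Matches `git hash-object` for the same bytes:
print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
```

Commits and trees are hashed the same way with "commit"/"tree" headers, which is why a single colliding blob propagates up into identical tree and commit ids.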
| nvartolomei wrote:
| I assume there was a lot of work (read money) put into those
| collision attacks rather than them being discovered by
| accident. I'm wondering who is sponsoring this work and for
| what purpose? The argument about proving that an algorithm is
| broken and working on better cryptography wouldn't suffice in
| this case, as issues were shown before that. Here the purpose
| was to make the attack cheaper?
| tptacek wrote:
| That is not how cryptographic research works, at all.
| mehrdadn wrote:
| > SHA-1 has been broken for 15 years, so there is no good reason
| to use this hash function in modern security software.
|
| Why are cryptographers always exaggerating things and so out of
| touch with reality? The first actual collision was like 3 years
| ago. It's not like the world has been on fire in the meantime,
| and it's not like SHA-1 is broken for every single possible usage
| even now. And why the nonsense with "no good reason"? Obviously
| performance is one significant consideration for the unbroken use
| cases. Do they think painting a different reality than the one we
| live in somehow makes their case more compelling?
| ApolloFortyNine wrote:
| For anything requiring fast hashing performance today, is there
| a reason why Blake2 [1] wouldn't be chosen? It seems to be
| faster than SHA-1 and uses a similar algorithm to SHA-3. As
| collisions become easier to create I would think it'd cause a
| problem in many even insecure use cases. I'm pretty sure GitHub
| had to blacklist those two PDFs that originally had the same
| hash.
|
| [1] https://blake2.net/
| ewidar wrote:
| As stated, they are talking about "modern security software".
|
| So yeah, if you are building/improving software that has a
| clear focus on security, you should use a secure hash. Seems
| only natural to me.
| lmilcin wrote:
| The difference is that it was "a" collision. Now, that wasn't
| very helpful if you wanted to forge a document.
|
| This is a chosen-prefix collision.
This means you can select a
| beginning of a document, which in many cases is enough.
| PeterisP wrote:
| Past experience and documents that ceased being classified
| show that serious attackers (e.g. NSA) are at least a decade
| ahead of what's publicly known in cryptography; i.e. we know
| that pretty much always when new relevant groundbreaking math
| was published, the classified cryptographers had known that for
| a long, long time already.
|
| So if this attack is developed today, then you should assume
| that NSA has been able to execute this attack for at least ten
| years already whenever it suited them, including mass
| surveillance of random not-that-important people. The same
| applies for the collisions - the first _published_ collision
| was 3 years ago, but we should assume that that's definitely
| not the first; I mean, only a minority of the world's
| cryptography researchers participate in the public, open,
| academic community; the majority of them are employed with a
| condition that they won't get to publish anything important.
| And since we have known for the last ten years that such
| attacks were possible, there has been no good reason to
| consider SHA-1 safe.
| Skunkleton wrote:
| > So if this attack is developed today, then you should
| assume that NSA has been able to execute this attack for at
| least ten years already whenever it suited them, including
| mass surveillance of random not-that-important people.
|
| They might be 10 years ahead on the algorithms side, but they
| aren't 10 years ahead on the hardware side. Also, spending
| 45k today gets you a single collision. That is hardly going
| to be useful for mass surveillance.
| velosol wrote:
| They may have been though - there's talk of liquid-helium-
| cooled, tens-of-gigahertz (8-bit) processors being purpose-
| built (in 2002).
The National Cryptologic Museum next door to NSA
| HQ is a fascinating place and has some very cool displays,
| including a large disk changer and a piece of a cooling
| system. There's a PDF at [1] but the discussion at [2] I
| think might give a better idea.
|
| [1]: https://www.nitrd.gov/pubs/nsa/sta.pdf [2]:
| https://it.slashdot.org/comments.pl?sid=485458&cid=22736288
| nickpsecurity wrote:
| Other declassified documents of the past showed they were
| years ahead on hardware, too. The NRE costs got so high
| that commodity hardware got preferable in the general case.
| That said, they can still throw money at ASIC's, FPGA's,
| semi-custom versions of Intel/AMD CPU's, maybe same for
| GPU's, and so on.
|
| They have access to better and more hardware than most
| threat actors.
| dweekly wrote:
| Are there materials that show a _ten_ year head start?
| ReidZB wrote:
| I think there's skepticism in the community that NSA et al
| actually have such a far head-start now.
|
| That said, historically speaking, back in the early 70s DES
| was being designed. The NSA made some unjustified changes
| to its S-boxes. At the time, there were allegations that
| they had made them intentionally weaker. (Or so I've read;
| I wasn't born yet.) In the early 90s, differential
| cryptanalysis was discovered for the first time, and it
| turns out that DES was already resistant to it (unlike
| other block ciphers at the time): in fact, the NSA already
| knew about differential cryptanalysis, 20 years ahead of
| the general public, and intentionally strengthened DES.
| (Also, IBM discovered it, too, but kept it quiet at the
| NSA's request.)
| PeterisP wrote:
| For an early example, differential cryptanalysis methods
| were 'discovered' in the late 1980s by Biham and Shamir; but
| it later turned out that resistance to it was pushed as a
| design consideration already back in 1974 when DES was
| designed, so NSA knew of it _at least_ then; that's more
| than a decade.
|
| We know that British intelligence (who don't have as many
| resources as NSA) had developed the RSA equivalent
| something like 5 years before Rivest/Shamir/Adleman got to
| it; and we still have no idea how far NSA was with that
| math at the time - all we have is circumstantial evidence
| such as travel reports of NSA representatives going to
| cryptography conferences and being satisfied that
| absolutely no math that's new (to them) or even potentially
| leading to something new was being revealed there.
|
| We also have NSA suddenly changing recommendations to
| use/stop using certain cryptosystems in ways that still
| don't make complete sense - e.g. the 2015 turning away from
| 'suite B' ECC _may_ have been due to some quantum discovery
| as is claimed, or some other weakness being found, but it's
| been five years and we (as far as I understand) still
| don't know as much as they did back in 2015, so they're
| more than 5 years ahead. But to know whether the current
| advantage is ten years or more or less, we'll have to wait
| a generation or so; it takes a long time for truth to leak.
| whatshisface wrote:
| How many mathematicians are working for NSA? How many
| public research cryptographers are there?
| PeterisP wrote:
| One thing is that proper public cryptographers are very
| rare, there's a handful of effective teams - e.g. MIT has
| Rivest and associates, there are a bunch of other places,
| but most universities, including quite serious ones,
| don't have anyone doing reasonable cryptographic
| research. Cryptocurrencies caused a recent boom, but it's
| a niche with separate, specific goals that doesn't
| advance the rest of the field much.
|
| It's hard to give good numbers, we'd have to look at
| Snowden leaks and others, but I haven't looked into that
| much.
Here's an earlier HN comment
| https://news.ycombinator.com/item?id=6338094 that
| estimates 600 proper mathematical researchers working on
| crypto, and it seems quite plausible to me that it would
| be more research power than the entire public academia -
| especially because in many countries that take this
| field seriously (e.g. China, Russia, Iran) there's no
| real public research in crypto happening because that's
| classified by default. I mean, pretty much all academic
| research happens through targeted grants by governments,
| and who other than a department of defense (or similar
| organizations in other countries) would be funding
| cryptographic research?
|
| Also, I'll quote Bruce Schneier (2013,
| https://www.schneier.com/essays/archives/2013/09/how_advance...)
| regarding their budget - "According to the black budget summary,
| 35,000 people and $11 billion annually are part of the
| Department of Defense-wide Consolidated Cryptologic
| Program. Of that, 4 percent--or $440 million--goes to
| 'Research and Technology.' That's an enormous amount of
| money; probably more than everyone else on the planet
| spends on cryptography research put together."
| thesabreslicer wrote:
| This is a great summary, thank you!
| ljhsiung wrote:
| This is fascinating; are there any readings/sources for
| similar events, i.e. governments "predicting" academia?
|
| I am aware of the 2015 plan to "transition soon(tm)", but
| that's because I was alive 5 years ago. Other earlier
| events would be super cool to read up on.
| shilch wrote:
| Like computer scientists, they think in binary: Either it's
| secure, or it's not. In reality there's a spectrum where you
| also have "good enough".
| rectang wrote:
| The security strategy with the most practical utility for
| most software engineers working on most projects is _defense
| in depth_: multiple layers, each assumed to be breakable.
|
| It's a striking contrast with the stark mathematical language
| deployed by cryptographers, on whose work we rely.
|
| If we differentiate between the two fields of software
| engineering and cryptography, it's easier to be generous in
| our appreciation for the different goals and mental models.
| mbowcutt wrote:
| "good enough" relies on a threat model. Cryptography
| researchers work in the abstract - without a threat model you
| must consider cases where your attacker has unlimited
| resources.
|
| It's good enough for you and me, but research isn't meant to
| be practical, imo
| Ar-Curunir wrote:
| What. The first thing any security paper defines is the
| assumed threat model. People design all kinds of schemes
| for different threat models.
|
| The point with assuming conservative threat models for key
| primitives like hash functions is that the threat model can
| change rapidly even within the same application, and
| attackers only get stronger. So you err on the side of
| caution, and don't rely on luck to keep safe.
| [deleted]
| brink wrote:
| Most people think in yes/no logic. Unfortunately binary is a
| horrible oversimplification of a very analog reality and
| results in many of the problems the world is in now.
| Because we tend to think of a binary of yes/no, we often end
| up flying from one ditch on the side of the road to the
| other.
| btilly wrote:
| But in many contexts, "good enough" is more a question of
| perception than reality.
|
| These statements serve to shift the Overton window of
| perception, and therefore help improve the odds that people
| aren't thinking "good enough" when things are actually broken.
| espadrine wrote:
| Cryptographers rely on precise definitions to make their
| assessments.
|
| In particular, a primitive makes a number of useful promises.
| Any achievement that describes a way in which a promise is not
| kept makes that primitive _broken_, regardless of whether that
| achievement is theoretical or practical.
|
| (They often talk about "theoretically broken" or "practically
| broken" to distinguish whether it was actually done.)
|
| > _it's not like SHA-1 is broken for every single possible
| usage_
|
| True, but it is extremely easy to believe your usage is
| unbroken and be wrong. Besides, often, primitives that are
| broken for one usage eventually break for others.
|
| > _why the nonsense with "no good reason"? Obviously
| performance is one significant consideration for the unbroken
| use cases_
|
| There are many much more robust primitives with better
| performance nowadays, such as BLAKE2, SHA-512/256, or
| KangarooTwelve.
| staticassertion wrote:
| Is SHA1 not still faster than 256/512? And Blake2 barely
| faster? I suspect with some hardware SHA1 is still going to
| beat all of those out - am I wrong? I would love to learn
| more.
| __s wrote:
| CRC is faster than SHA1
| tptacek wrote:
| https://bench.cr.yp.to/results-hash.html
|
| Regardless, no competent designer is going to use SHA1.
| staticassertion wrote:
| Thank you, that's an excellent chart.
| remus wrote:
| Depreciating things takes a long time. It could be 10 years
| before 'that guy' in your office gets it into his head that
| SHA-1 isn't appropriate where security is concerned. Thus you
| want to make it unambiguous that now is a good time to move to
| something else so that when it is thoroughly broken in 10 years
| time you're minimising the number of people who have yet to
| switch.
| tialaramex wrote:
| Depreciating things takes however long your accountants and
| the tax laws in your country allow :)
|
| Deprecating things does take a long time, and the only
| practical thing to do about that is to get ahead of the game
| as you say.
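The relative speeds being debated here are easy to spot-check locally with Python's hashlib. This is a rough sketch, not a proper benchmark: absolute numbers depend on the CPU, the OpenSSL build, and whether hardware SHA extensions are in play, which is exactly why the bench.cr.yp.to tables are the better reference.

```python
import hashlib
import time

def throughput_mb_s(name: str, data: bytes) -> float:
    """Time one update()+digest() pass over `data` and return MB/s."""
    h = hashlib.new(name)
    start = time.perf_counter()
    h.update(data)
    h.digest()
    return len(data) / (time.perf_counter() - start) / 1e6

# 64 MiB of zeros; results vary a lot by machine. With SHA-NI,
# sha1/sha256 can win; in pure software blake2b usually does.
data = b"\x00" * (64 * 1024 * 1024)
for name in ("sha1", "sha256", "sha512", "blake2b"):
    print(f"{name:8s} {throughput_mb_s(name, data):8.1f} MB/s")
```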
| hinkley wrote: | Ten years ago I was being pressured into not adding SHA-256 | support to a new system because SHA-1 would allow them to use a | stack of old crypto cards some idiot bought and hadn't been | able to put to use, so they were just collecting dust in a | closet somewhere. (They had a max of 1k RSA keys, which also | were considered broken at the time, close to it when the cards | were new). | | This wasn't for some throw-away project. Big, old company. This | may be the closest I've come by far to writing software that | will still be used after I'm dead (gets a little easier every | year, though). If I gave them SHA-1 they'd still be using it | for sure. | | I refused, and the fact that people were calling MD5 very | broken and SHA-1 broken helped. | | (A completely different faction was trying to get me to jump | straight to SHA-512. I said that was probably overkill, and yes | I will implement it but we're using SHA-256 as a default. Then | a couple years later it turns out SHA-256 is more resilient | than 512 anyway. But what a schizophrenic place.) | Ar-Curunir wrote: | Because by shouting loudly now, we can maybe get changes into | libraries within the next 5 years, before these attacks are | commonplace | hannob wrote: | Performance is the worst argument you could choose. | | a) There's hardly an application where hash-performance | matters. These things are fast. | | b) For precisely that reason that people may still complain | about performance cryptographers invented a hash function that | is faster than all the old choices like md5/sha1 and still | secure (it's called blake2). | 0x0 wrote: | Q: Does this make it even more urgent for git to move to a | different hash? | rom1v wrote: | What happens if you actually get a SHA-1 collision in git? | | -> https://stackoverflow.com/questions/9392365/how-would-git- | ha... | | (This does not answer your question, but is still interesting.) 
| LeonM wrote:
| A: (from the article)
|
| SHA-1 has been broken for 15 years, so there is no good reason
| to use this hash function in modern security software. Attacks
| only get better over time, and the goal of the cryptanalysis
| effort is to warn users so that they can deprecate algorithms
| before the attacks get practical. We actually expect our attack
| to cost just a couple thousand USD in a few years.
| pathseeker wrote:
| That's not a relevant answer. Git's use-case is very specific
| and it's quite possible that this attack won't be relevant.
| It needs analysis.
|
| >no good reason to use this hash function in modern security
| software
|
| This argument conveniently ignores the cost of switching
| existing software (i.e. it's completely detached from
| reality).
| EGreg wrote:
| It may, because now an attacker can replace code with arbitrary
| other valid code as long as developers are willing to ignore
| the long weird random comment at the end ;-)
|
| I'm gonna say many developers will not care, and many
| compilers will not care either.
|
| So yeah, Linus' main deterrent reason (code won't compile)
| doesn't apply anymore.
|
| _HOWEVER!_
|
| 1. A chosen-prefix attack still needs to compute TWO suffixes
| m1 and m2 so that _h(a1+m1) = h(a2+m2)_. This does NOT mean
| that given _a1_ and _a2_ you can find a single _m2_ so that
| _h(a1) = h(a2+m2)_. So ONLY THE ORIGINAL AUTHOR OF THE
| COMMIT could spoof their own commit, by preparing in advance
| and attaching a long and weird comment at the end. And you
| could build tools to watch out for such commits in the first
| place
|
| 2. If git had used HMAC based on SHA1 then it would have been
| fine, even after this attack has become feasible.
|
| 3. Furthermore, it is likely still kinda fine because Merkle
| Trees have nodes referencing previous nodes. You'd have to
| spoof every historical node as well, to push malicious code.
| BitTorrent also requires computers to supply an entire merkle
| branch when serving file chunks.
|
| Maybe someone can elaborate on this.
| rkangel wrote:
| If you look in this 2017
| (https://marc.info/?l=git&m=148787047422954) email from
| Linus, he discusses how git also encodes length. That would
| mean that you need a collision of the same length _and_ the
| right functionality, so you can't just append data.
| toyg wrote:
| Now that you can arbitrarily produce collisions, the second
| step is easy enough for a skilled and well-funded attacker.
| rkangel wrote:
| It adds to the weight of the argument, but there isn't a big
| issue. This article (https://www.zdnet.com/article/linus-
| torvalds-on-sha-1-and-gi...) and the linked email
| (https://marc.info/?l=git&m=148787047422954) both seem to still
| apply.
| groovybits wrote:
| Further details as to why Torvalds is not concerned:
|
| From the email...
|
| "I haven't seen the attack yet, but git doesn't actually just
| hash the data, it does prepend a type/length field to it.
| That usually tends to make collision attacks much harder,
| because you either have to make the resulting size the same
| too, or you have to be able to also edit the size field in
| the header."
|
| [...]
|
| "I haven't seen the attack details, but I bet
|
| (a) the fact that we have a separate size encoding makes it
| much harder to do on git objects in the first place
|
| (b) we can probably easily add some extra sanity checks to
| the opaque data we do have, to make it much harder to do the
| hiding of random data that these attacks pretty much always
| depend on."
| tedunangst wrote:
| What if somebody makes an attack where they can choose the
| size and then find a collision?
| banana_giraffe wrote:
| Like the two files on the linked page?
| Skunkleton wrote:
| The two files on the linked page were both full of junk
| data. I suspect that those files being of the same length
| isn't the norm.
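For reference, the type/length header Linus mentions is how git names objects: the SHA-1 is computed over a "<type> <size>\0" preamble plus the content, not over the raw file bytes. A minimal sketch (the function name is ours, not git's):

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Compute an object ID the way git does: SHA-1 over a
    '<type> <size>\\0' header followed by the content."""
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

# The header binds the hash to both the object type and the exact
# byte length, so a colliding replacement object must reproduce the
# same size field too - it differs from a plain SHA-1 of the bytes.
oid = git_object_id("blob", b"hello\n")
assert oid != hashlib.sha1(b"hello\n").hexdigest()
```

For a quick sanity check on a real repository, this should reproduce the output of `git hash-object <file>` for blob contents.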
| aidenn0 wrote:
| I would bet that a fixed size M' is part of the attack.
| est31 wrote:
| That first quote is misleading. git's special hashing
| scheme doesn't make the attack "much harder". First, there
| is no difference in length in the original shattered
| collision already:
|
| $ curl -s https://shattered.io/static/shattered-1.pdf | wc -c
| 422435
| $ curl -s https://shattered.io/static/shattered-2.pdf | wc -c
| 422435
|
| Second, the length is already being hashed into the content
| during computation of a SHA-1 hash. Look up the Merkle-Damgard
| construction:
| https://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_co...
|
| There _is_ benefit in storing the length at the prefix as
| well, as you can avoid length extension attacks, but that's
| not making attacks "much harder".
| tny wrote:
| But even if the lengths are the same, the resulting SHA1 will
| be different since you prefix the length before hashing
| est31 wrote:
| The shattered prefix was chosen as well, see my other
| comment in the thread:
| https://news.ycombinator.com/item?id=21980759
|
| The only thing that prefixing the length makes difficult
| is using the same prefix multiple times: you basically
| have to make up your mind about the type and length
| before mounting the shattered attack. Also, the prefix
| means you have to do your own shattered attack and can't
| use the PDFs that google provided as proof of their
| project's success. Price tag for that seems to be 11k.
|
| [1]: https://github.com/cr-marcstevens/sha1collisiondetection
| toyg wrote:
| Yeah, that quote doesn't exactly make me confident about
| Linus's understanding of this particular issue.
| DarkWiiPlayer wrote:
| It's not about it being the same length, but the length
| of the data being part of the hashed data, which, Linus
| assumes, will likely make it more difficult to find a
| collision. He even says at the beginning that he hasn't
| had a look at the attack yet and is just making an
| assumption.
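est31's point, that the message length is already folded into SHA-1 itself via its final padding block, can be checked by constructing the padding directly. A sketch of the RFC 3174 scheme (the helper name is ours):

```python
import struct

def sha1_padding(msg_len: int) -> bytes:
    """Padding SHA-1 appends per RFC 3174: a 0x80 byte, zero bytes
    until the length is 56 mod 64, then the original message length
    in *bits* as a 64-bit big-endian integer."""
    zeros = (56 - (msg_len + 1)) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack(">Q", msg_len * 8)

padded = b"abc" + sha1_padding(3)
assert len(padded) % 64 == 0                   # whole 512-bit blocks
assert padded.endswith(struct.pack(">Q", 24))  # 3 bytes = 24 bits
```

So two messages of different lengths get different final blocks, which is why git's length prefix adds a constraint on where a collision can live, not a fundamentally new kind of protection.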
| est31 wrote:
| > It's not about it being the same length, but the length
| of the data being part of the hashed data
|
| As I tried to point out, the length is already part of
| what the SHA-1 function hashes:
|
| https://tools.ietf.org/html/rfc3174#section-4
|
| As a summary, a "1" followed by m "0"s followed by a
| 64-bit integer are appended to the end of the message
| to produce a padded message of length 512 * n. The
| 64-bit integer is the length of the original message.
| The padded message is then processed by the SHA-1 as
| n 512-bit blocks.
|
| Now, storing the length as a prefix does give you
| advantages: you can't mount a length extension attack,
| which limits your ability to exploit one shattered
| attack, e.g. the pdfs released by google, for different
| files/types of files. But it doesn't make mounting a
| novel shattered attack "much harder" as Linus claims.
| magicalhippo wrote:
| > But it doesn't make mounting a novel shattered attack
| "much harder" as Linus claims.
|
| From what I understood the core of Linus' argument[1] is
| that it's very hard to make a "bad" variant of the code
| which has the same length _and_ the same hash while still
| looking like sane code. For random data files, sure those
| are more at risk.
|
| [1]: https://marc.info/?l=git&m=148787287624049&w=2
| wbl wrote:
| A good thing no one checks binary assets into source
| control
| runeks wrote:
| The point is that the hashed data must follow a specific
| data format, and can't just be arbitrary data. This means
| that the collision data MUST contain the length at some
| specific offset in the data, which makes it harder to
| find a collision.
|
| The more restrictive the serialization format of the
| hashed data, the harder it is to find a collision that's
| valid in the given application context.
| est31 wrote:
| Yeah the data must start with a specific prefix, but can
Anyways, even the | shattered attack, which this paper says costs 11k to | execute today, had a pdf specific prefix (the shown part | is the same in both files): $ curl -s | https://shattered.io/static/shattered-1.pdf | hexdump -n | 512 -C 00000000 25 50 44 46 2d 31 2e 33 0a 25 | e2 e3 cf d3 0a 0a |%PDF-1.3.%......| 00000010 | 0a 31 20 30 20 6f 62 6a 0a 3c 3c 2f 57 69 64 74 |.1 0 | obj.<</Widt| 00000020 68 20 32 20 30 20 52 2f | 48 65 69 67 68 74 20 33 |h 2 0 R/Height 3| | 00000030 20 30 20 52 2f 54 79 70 65 20 34 20 30 20 52 | 2f | 0 R/Type 4 0 R/| 00000040 53 75 62 74 79 | 70 65 20 35 20 30 20 52 2f 46 69 |Subtype 5 0 R/Fi| | 00000050 6c 74 65 72 20 36 20 30 20 52 2f 43 6f 6c 6f | 72 |lter 6 0 R/Color| 00000060 53 70 61 63 65 | 20 37 20 30 20 52 2f 4c 65 6e 67 |Space 7 0 R/Leng| | 00000070 74 68 20 38 20 30 20 52 2f 42 69 74 73 50 65 | 72 |th 8 0 R/BitsPer| 00000080 43 6f 6d 70 6f | 6e 65 6e 74 20 38 3e 3e 0a 73 74 |Component 8>>.st| | 00000090 72 65 61 6d 0a ff d8 ff fe 00 24 53 48 41 2d | 31 |ream......$SHA-1| 000000a0 20 69 73 20 64 | 65 61 64 21 21 21 21 21 85 2f ec | is dead!!!!!./.| | | The shattered attack was about a so-called "identical | prefix" collision, while the shambles paper's collision | was a "chosen prefix" one. You can choose it in both | cases, but in the "chosen prefix" one both colliding | prefixes can be entirely different (and can be as long as | you want btw, the attack doesn't cost more if the prefix | is 4 KB vs 4 GB), while in the "identical prefix" case it | has to be identical. | toyg wrote: | _> the harder it is to find a collision that's valid in | the given application context_ | | In the double-digit thousands of dollars, an attack that | gets 10x or 100x harder is still cheap for state actors. | | Assuming the NSA is at least a year or two ahead of the | field, git should now accelerate its migration process. | joeyh wrote: | "(b)" is kind of amusing.. 
It's been known since 2011 that | collision generating garbage material can be put after a | NUL in a git commit message and hidden from git log, git | show, etc. Still not fixed. | | With this chosen-prefix attack, they chose two prefixes and | generated collisions by appending some data. So your two | prefixes just need to be "tree {GOOD,BAD}\nauthor | foo\n\nmerge me\0" | | The only thing preventing injecting a backdoor into a pull | request now seems to be git's use of hardened sha1. | wyldfire wrote: | I think those are pretty practical approaches. | | But it sounds as if the cost of changing the hash algorithm | is high. What are the impacts of this change? How many | things would break if git just changed the algorithm with | each new release? Does git assume that the hash algorithm | is statically given to be SHA-1 or are there qualifiers on | which algorithm is enabled/permitted/configured? | paulddraper wrote: | After making the actual code change, the biggest problem | is breaking compatibility with decades of tools in the | ecosystem that rely on historically consistent SHA-1 | hashes. | | Git is moving to a flexible hash though. [1] | | [1] https://stackoverflow.com/questions/28159071/why- | doesnt-git-... | toyg wrote: | Maybe it's time for a version 3 that breaks a bit of | compatibility. | | The Python community would freak out, lol. | simias wrote: | The cost is very high but it's only getting higher with | time. People have known that SHA-1 was weak and | deprecated for much of git's existence. Doing the switch | in 2010 would've been painful, doing it now would be | orders of magnitude more so and I doubt it'll get any | easier in 2030 unless some other SCM manages to overtake | git in popularity which seems unlikely at this point. 
| | Unless Linus really believes that git will be fine using | SHA-1 for decades to come I don't think it's very | responsible to keep kicking the ball down the road | waiting for the inevitable day when a viable proof of | concept attack on git will be published and people will | have to emergency-patch everything. | simias wrote: | The fact that this attack is chosen prefix does weaken the | first argument though, you may now find a collision even | accounting for any prefixed git "header". The rest is still | completely valid though. | | I still feel like they really should've taken this problem | more seriously and earlier. The more we wait the more | painful the migration will be when the day comes to move to | a different hash function, because everybody knows that'll | happen sooner or later. Two years ago we had a collision, | now we have chosen prefix, how much longer until somebody | actually manages to make a git object collision? | | And keep in mind that public research is probably several | years behind top secret state agency capabilities. Let's | stop looking for excuses every time SHA-1 takes a hit and | rip the bandaid already. It's going to be messy and painful | but it has to be done. | [deleted] | bjornsing wrote: | > you have to be able to also edit the size field in the | header." | | As I read the OP [1] a chosen-prefix collision attack such | as this allows you to "edit the size field in the header". | Or am I missing something? | | 1. "A chosen-prefix collision is a more constrained (and | much more difficult to obtain) type of collision, where two | message prefixes P and P' are first given as challenge to | the adversary, and his goal is then to compute two messages | M and M' such that H(P || M) = H(P' || M'), where || | denotes concatenation." | | EDIT: On second thought I was missing something: the | adversary is further constrained in the git case because it | must find M and M' of correct length (specified in P and | P'). 
Linus is right (as usual), this probably makes it much | harder. | bjornsing wrote: | A few emails forward in the thread Linus explains though | why we don't need to worry much about this attack in | practice: https://marc.info/?l=git&m=148787287624049&w=2 | | This argument sounds sound to me. | mikepurvis wrote: | There is a migration path to SHA-256, see a good summary here: | https://stackoverflow.com/a/47838703/109517 | | See a previous discussion here, regarding Linus's position on | this in 2017: https://news.ycombinator.com/item?id=13719368 | [deleted] | kibwen wrote: | Out of curiosity, can anyone explain in layman's terms the | differences in design that make SHA-1's successors immune to the | known attacks against SHA-1? Ultimately was this the result of an | apparent flaw in SHA-1 that only became obvious in retrospect, or | was it something totally unforeseeable? | knorker wrote: | Partly this: Number of bits. | | This attack is almost 2^64, and SHA-1 is 160 bits. All else | being equal (big big if) that means sha256 is 102 bits, meaning | 362703572709.30493 times more expensive. Or about $16321 | trillion USD. | Zaak wrote: | SHA-2 is based on similar techniques to those in SHA-1, which | prompted the SHA-3 competition when weaknesses in SHA-1 were | first discovered (as they could conceivably have been present | in SHA-2 as well). As it turns out, SHA-2 appears to be | resistant to the attacks found thus far. | | SHA-3 (originally named Keccak) is built on an entirely | different foundation (called a sponge function), so it is | unlikely that any attack against SHA-1 will be relevant to | SHA-3. However, sponge functions are a relatively new idea, and | weaknesses in the basic principles could conceivably be found | in the future, as could weaknesses in the Keccak algorithm | specifically. 
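knorker's bit-count arithmetic can be made concrete with a toy birthday search: on an n-bit hash, random collisions show up after roughly 2^(n/2) attempts, which is why widening the digest squares the attacker's work. A sketch using a deliberately truncated SHA-256 (the truncation is ours, purely for demonstration):

```python
import hashlib
import itertools

def find_truncated_collision(bits: int):
    """Brute-force a collision on the first `bits` bits of SHA-256.
    By the birthday bound this takes about 2**(bits/2) attempts."""
    assert bits % 8 == 0
    nbytes = bits // 8
    seen = {}
    for i in itertools.count():
        msg = str(i).encode()
        tag = hashlib.sha256(msg).digest()[:nbytes]
        if tag in seen:             # all msgs are distinct, so any
            return seen[tag], msg   # repeated tag is a collision
        seen[tag] = msg

# A 32-bit "hash" falls to a birthday search in ~2**16 tries;
# every extra bit of output adds half a bit of attack cost.
m1, m2 = find_truncated_collision(32)
assert m1 != m2
assert hashlib.sha256(m1).digest()[:4] == hashlib.sha256(m2).digest()[:4]
```

The shambles result is notable precisely because it beats this generic bound for SHA-1: a chosen-prefix collision for far less work than the naive 2^80 birthday cost.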
| [deleted]
| newscracker wrote:
| General questions:
|
| (edit: these are indeed general questions, not just about SHA1)
|
| Has anyone else been worried about data deduplication done by
| storage and/or backup systems, considering that they usually use
| hashes to detect data blocks that are "the same" (without
| additional metadata) and avoid storing those "duplicate data
| blocks" again? Doesn't this seem far worse when you also consider
| that systems like Dropbox deduplicate data across all their users
| (expanding the footprint for collisions)? Are there any research
| papers/articles/investigations about this?
| jbverschoor wrote:
| Yep, I'm afraid of that. I have no idea how they are mitigating
| this, and I doubt they'll ever disclose.
| onei wrote:
| From what I recall, when the first collision was proved it was
| with a PDF, which is easier to craft a collision from. I also
| seem to recall that the collisions were from hashes of the
| whole file.
|
| The dedupe engine written where I work chunks a file and
| hashes those chunks, meaning it's somewhat harder to craft
| collisions (I forget where the chunking boundaries are, but
| it's within a range iirc). The hashing algorithm was SHA-1
| last I checked, but I've never heard even company folklore of
| corrupted backups caused by hash collisions. I get the
| feeling that it's near impossible in practical terms given
| the size of the string being hashed. Having said that, hubris
| is the downfall of programmers everywhere, so I wouldn't bet
| all my money on it.
| upofadown wrote:
| SHA1 isn't broken for hashes.
| mlyle wrote:
| SHA1 is vulnerable to preimage attacks in reduced round
| variants. The findings keep steadily improving.
|
| https://en.wikipedia.org/wiki/Preimage_attack
|
| This means if a storage system _just_ uses SHA1 to detect
| duplication, you can abuse the ability to create a collision
| to possibly do bad things to the storage system.
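The dedup worry comes down to whether a store treats hash equality as byte equality. A toy sketch (class and method names are ours) contrasting trust-the-hash dedup with the byte-for-byte verification some filesystems offer as an option:

```python
import hashlib

class DedupStore:
    """Toy content-addressed store. With verify=False it trusts the
    hash, so two colliding inputs would silently deduplicate into one
    stored copy. With verify=True it compares bytes first, so a
    collision merely costs a little extra space."""

    def __init__(self, verify: bool = True):
        self.blocks = {}   # digest -> list of distinct byte strings
        self.verify = verify

    def put(self, data: bytes) -> tuple:
        key = hashlib.sha1(data).digest()
        bucket = self.blocks.setdefault(key, [])
        if not self.verify:
            if not bucket:
                bucket.append(data)
            return (key, 0)           # may alias colliding data
        for i, existing in enumerate(bucket):
            if existing == data:      # byte check defeats collisions
                return (key, i)
        bucket.append(data)
        return (key, len(bucket) - 1)

    def get(self, ref: tuple) -> bytes:
        key, i = ref
        return self.blocks[key][i]

store = DedupStore(verify=True)
r1 = store.put(b"contract: you owe me $1000")
r2 = store.put(b"contract: you owe me $1000")
assert r1 == r2    # true duplicates still dedupe to one copy
```

In the verify=False configuration, a chosen-prefix pair like the one in the article would make `put` hand back a reference to the wrong document, which is exactly the contract-swap scenario described in the thread.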
| knorker wrote: | Yes, but for the specific concerns listed it should not be | a problem. That is, if you upload to dropbox and check the | sha1 when it comes back, yes you did get the same data | back. | | And your data can't be stolen by a hash-to-data oracle | either, unless the evil attacker constructed _your_ secret | data for you. | | So it depends on your threat model. Yes there are practical | concerns, but not the ones listed. | mlyle wrote: | I disagree. Deduplicating filesystems often depend on | hash equivalency meaning data equivalency. They may or | may not have modes to validate all data before | deduplication, but these suck for performance and are | often turned off. | | E.g. I make two things that hash to the same thing. One | is a contract where I'm obligated to pay back a loan. | Another is some meaningless document. I give them to a | counterparty who puts them in their filesystem, which | later scrubs and deduplicates data. Since they hash the | same, the filesystem removes the contract and leaves my | meaningless document (or never bothers to store the | contract because it already exists, if deduping online, | etc). | | Note that this is a _chosen prefix_ collision, which is | much more demanding (and more useful!) than finding a | collision in general. And this leaves aside that SHA1 is | looking increasingly vulnerable to preimage attacks which | further broaden the attack scenarios. | im3w1l wrote: | This doesn't sound crazy farfetched. I bet a lot of | people have files in their dropbox that were created by | someone else. | upofadown wrote: | >SHA1 is vulnerable to preimage attacks. | | From your linked article: | | >All currently known practical or almost-practical attacks | on MD5 and SHA-1 are collision attacks. In general, a | collision attack is easier to mount than a preimage attack, | as it is not restricted by any set value (any two values | can be used to collide). 
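The collision scenarios in this thread rest on a structural property of Merkle-Damgard hashes (MD5, SHA-1, SHA-2): once two messages collide, appending the same suffix to both preserves the collision, because the chaining state is already equal. A toy construction with a deliberately tiny 16-bit state (everything here is illustrative, not a real design) makes that visible:

```python
import hashlib
import itertools

BLOCK = 4  # toy 4-byte message blocks

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function with a 2-byte state: weak on purpose
    so a collision is findable by brute force."""
    return hashlib.sha256(state + block).digest()[:2]

def toy_md(msg: bytes, iv: bytes = b"\x00\x00") -> bytes:
    """Merkle-Damgard chaining over fixed-size blocks (length padding
    omitted for clarity; real designs append the message length)."""
    assert len(msg) % BLOCK == 0
    state = iv
    for i in range(0, len(msg), BLOCK):
        state = compress(state, msg[i:i + BLOCK])
    return state

# Birthday-search a single-block collision: ~2**8 tries for 16 bits.
seen = {}
for i in itertools.count():
    block = i.to_bytes(BLOCK, "big")
    s = compress(b"\x00\x00", block)
    if s in seen:
        b1, b2 = seen[s], block
        break
    seen[s] = block

# Equal chaining state after the colliding block means any common
# suffix keeps the two messages colliding.
suffix = b"same tail, any length..."
assert b1 != b2
assert toy_md(b1 + suffix) == toy_md(b2 + suffix)
```

This common-suffix behavior is what let the shattered authors ship two complete PDFs from one colliding block pair, and it is why dedup systems cannot assume a collision stays confined to short junk inputs.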
| fanf2 wrote:
| SHA-1 (and SHA-256 and many other hash functions) are iterated
| Merkle-Damgard constructions, so two colliding messages still
| collide after any common suffix is appended. That property is
| how the 2017 shattered attack works: find a pair of short
| prefixes that work as PDF headers and that have colliding
| hashes; then append a common suffix such that the different
| headers display different parts of the suffixes. Once the
| collision has been found you can easily make your own
| colliding PDFs - https://alf.nu/SHA1
| 
| This breaks anything that dedups on just the SHA-1 hashes of
| raw files.
| MichaelMoser123 wrote:
| Are there any systems that do SHA-1 for dedup? I am only aware
| of SHA-256.
| munificent wrote:
| Git?
| [deleted]
| fanf2 wrote:
| Subversion's dedup was broken by the http://shattered.io/
| SHA-1 collision in 2017 https://medium.com/@hyrum/shattered-
| subversion-7edea4ba289c
| [deleted]
| rakoo wrote:
| bup does. Even worse: rsync relies on MD5 for finding
| duplicate blocks.
| kemitche wrote:
| It depends on how the deduplication is configured, and the risk
| tolerance of the organization running it, I suppose.
| 
| ZFS, for example, has a dedup setting option that forces the
| file system to do a byte-for-byte verification for any deduped
| data:
| https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/g...
| umvi wrote:
| Is "a Shambles" British or something? I've always heard it as
| "in Shambles"
| cpach wrote:
| AFAICT "a shambles" is correct usage. See
| https://brians.wsu.edu/2016/05/24/in-shambles-a-shambles/
| StuffedParrot wrote:
| ...for traditional use, maybe, but any editor worth their
| salt will stop you from doing so. At least in America. Many
| words have changed their meaning and use over the last 500
| years.
| OGWhales wrote:
| Neat, but don't know if I can switch because of how it rolls
| off the tongue.
| umvi wrote:
| It got easier for me once I learned that "shambles" is a
| noun that is a synonym of "slaughterhouse".
Once I learned
| that, I emphasized _shambles_ slightly differently as I
| rolled the phrase off my tongue.
| 
| "SHA-1 is a _slaughterhouse_ "
| 
| "SHA-1 is a _shambles_ "
| OGWhales wrote:
| Yeah, I got that from the link; I had no idea before.
| However, I feel like if it was "is a shamble" it would
| make more sense to me than "is a shambles".
| umanwizard wrote:
| Downvoted. There's no ISO standard for English, since
| language is a social phenomenon. "Correct" language is
| whatever the community of speakers uses in practice, not what
| someone claims is correct in a blog post (or even a book).
| simias wrote:
| And the linked article showed you that, in practice, "a
| shambles" is perfectly correct English. It is therefore
| correct in a descriptivist sense.
| 
| Beyond that, prefixing your comment with "downvoted" is
| frankly silly and only serves to derail the conversation
| IMO.
| StuffedParrot wrote:
| > And the linked article showed you that, in practice, "a
| shambles" is perfectly correct English.
| 
| I'd think "in practice" would mean evidence of
| contemporary usage.
| umanwizard wrote:
| "a shambles" is not used (or quite rarely used) in North
| America, in my experience. Which was what the OP was
| asking about.
| 
| You are of course right that "a shambles" is perfectly
| fine British English, but that's beside the point.
| kps wrote:
| I'm in North America and have never heard 'in shambles'
| without 'a'. If I did, I'd probably take it as uneducated
| sociolect.
| plopz wrote:
| NYC here, never have heard it with an 'a'. It's always
| just 'in shambles'.
| umvi wrote:
| Fascinating. I'm guessing it's a regional thing where
| certain parts of the US say "in shambles" vs. "a
| shambles" vs. "in a shambles".
| 
| Not sure how it would be measured, but I think a
| /r/dataisbeautiful map would be interesting to see.
| dafman wrote:
| Yes, at least where I live in the south of the UK.
Other
| phrases you might hear are "What a shambles", "That is an
| absolute shambles" or "Omnishambles".
| jen20 wrote:
| "Omnishambles" has a fun etymology - it comes from the
| political satire "The Thick of It":
| https://en.wikipedia.org/wiki/Omnishambles
| no_flags wrote:
| "Far be the thought of this from Henry's heart, To make a
| shambles of the parliament-house!"
| 
| -Shakespeare (Henry VI, Part 3)
| tialaramex wrote:
| Note that in Shakespeare's time an audience would directly
| have understood "make a shambles" here to mean violent
| bloodshed in the parliament building, because a "shambles"
| would have been a common term for somewhere you killed
| animals to produce meat. Which is exactly what is being
| described here. Henry VI part 3 is not a play about a polite
| disagreement settled over lunch...
| ancorevard wrote:
| It's a word for a meat market, or a butcher's shop, in other
| words - a bloody mess.
| pjc50 wrote:
| Yes; in addition to the other comments, there is also _the_
| shambles: https://en.wikipedia.org/wiki/The_Shambles
| 
| (nowadays devoid of meat products but a nice picturesque place
| to visit if you can stand crowds. Try the bookshop)
| claudiawerner wrote:
| I live in York and I haven't actually gone into any shops in
| the Shambles...
| ebg13 wrote:
| Quick question about the "What should I do" section. It says
| "_use instead SHA-256_". Isn't SHA-512 both better and faster
| on modern hardware?
| Spooky23 wrote:
| SHA-256 is on the approved FIPS lists and is faster on 32-bit
| operating systems, which were surprisingly common in enterprise
| environments until recently.
| 
| People tend to be conservative in making changes for stuff like
| this, and don't do so until forced.
| akvadrako wrote:
| I believe so also. Specifically, SHA-512/256 seems to be a
| better choice by almost every metric except 32-bit hashing
| speed.
| [deleted]
| LeonM wrote:
| Depends on your definition of 'better'. Theoretically SHA-512
| is harder to brute force than SHA-256, but 256 bits is already
| extremely strong, so really there is no practical safety
| benefit of SHA-512 over SHA-256.
| 
| On 64-bit capable processors SHA-512 has a slight performance
| gain over SHA-256, but only on larger inputs. However, the
| digest of SHA-512 is twice the size, so what you gain in
| processing time, you lose in storage.
| quotemstr wrote:
| You can truncate the SHA-512 digest down to 256 bits though.
| Cryptographic hash functions mix well and don't suffer from
| the truncation (except to the extent that the truncated
| digest is shorter, of course). You don't necessarily need
| more storage (after hashing completes) just because you're
| using a new hash function.
| jlokier wrote:
| > but 256 bits is already extremely strong
| 
| The strength of hashes like SHA-256 doesn't just come from
| the number of output bits.
| 
| The 256 bits there is relevant for brute force attacks, but
| not more sophisticated attacks that take into account the
| internal structure of the hash algorithm, and in some cases
| "weak" values.
| 
| SHA-512 performs more "rounds" of computation than SHA-256.
| 
| Although it's impossible to compare two different hashes on
| rounds alone, in general a larger number of rounds of the
| same type of hash decreases the likelihood of non-brute-force
| attacks finding a collision.
| 
| If you look at the literature for attacks on hashes, they
| will often say they could do it for a certain number of
| rounds, and that number increases over time as new methods
| are discovered.
| 
| The number of rounds in the hash design is chosen with this
| in mind, trying to balance being more than sufficient for
| future attacks yet not too slow.
| tptacek wrote:
| Cryptographic engineers do not in fact think you should use
| SHA-2-512 so that you can maximize the number of times the
| round function is applied.
| jlokier wrote:
| I'm not sure what that sentence means, can you rephrase
| it?
| tptacek wrote:
| The analysis you have provided for why SHA-512 is
| superior to SHA-256 is faulty.
| CiPHPerCoder wrote:
| SHA-256 and SHA-512 are both in the same family (SHA-2).
| 
| Latacora says to use SHA-2. If you can get away with it,
| SHA-512/256 instead of SHA-256. But they're all SHA-2 family
| hash functions.
| 
| https://latacora.micro.blog/2018/04/03/cryptographic-right-a...
| 
| No need to bikeshed this. But if you must: SHA-512/256 >
| SHA-384 > SHA-512 = SHA-256
| 
| If you're wondering, "Why is SHA-384 better than SHA-512 and
| SHA-256?" the answer is the same reason why SHA-512/256 is the
| most preferred option:
| https://blog.skullsecurity.org/2012/everything-you-need-to-k...
| 
| Additionally, the Intel SHA extensions target SHA-1 and
| SHA-256 (but not SHA-512), which makes SHA-256 faster than
| SHA-512 on newer processors.
| 
| Isn't crypto fun?
| fireflash38 wrote:
| I read the blog, but it doesn't really expand on _why_
| SHA-224/384 aren't vulnerable to that attack; can you explain
| (or link to some place that does)?
| tptacek wrote:
| A length-extension attack works by taking the output of a
| hash and using it as the starting point for a new hash; you
| can do this even for hashes of messages you haven't seen,
| minting new related hashes. Truncated SHA-2 hashes don't
| output the whole hash, and so you aren't given enough
| information to start a new related hash.
| CiPHPerCoder wrote:
| Because they're already truncated.
| | SHA-384 is SHA-512 with a different IV (which doesn't | affect LEAs) truncated to 384 bits (which gives you 128 | bits of resistance against LEAs). | | SHA-224 is the same story but with SHA-256 instead (and | only 32 bits of LEA resistance). | colanderman wrote: | I'm super confused. Are SHA-256 and SHA256 _different_ , and | if so, why in the world would this be considered a sane | naming scheme? | | If not, I completely do not understand the inequation you | wrote, which seemingly lists SHA-256 (and -512) multiple | times. | timdumol wrote: | You're probably confused by "SHA-512/256", which does not | mean SHA-512 or 256, but rather a truncated version of | SHA-512: https://en.wikipedia.org/wiki/SHA-2 in the third | paragraph. | colanderman wrote: | Ah! Makes sense now, thanks. | Ajedi32 wrote: | So why would a truncated version of SHA-512 be better | than SHA-512? And why is SHA-512 = SHA-256? | CiPHPerCoder wrote: | Truncated hash functions are not vulnerable to length- | extension attacks. | | Length-extension attacks are relevant when you design a | MAC by passing a secret and then a message to a hash | function, where only the message is known. | | Truncating the hash (which is what SHA-512/256 and | SHA-384 do to SHA-512) removes the ability to grab an | existing hash H(k || m) (where k is unknown and m might | be known) and append junk because a truncated hash does | not contain sufficient information to recover the full | state of the hash function in order to append new blocks. | p1mrx wrote: | Why do SHA-512/160 and SHA-512/128 not exist? They could | be useful as drop-in replacements for SHA1 and MD5. | tedunangst wrote: | You can truncate a hash anywhere you like. But 128 bits | is considered too short now. | SAI_Peregrinus wrote: | Because 224 bits is considered the minimum safe output | length for a general purpose hash function. So they'd be | drop-in replacements but still wouldn't be safe. Safer | than MD5/SHA1, but not actually safe. 
| 
| So rather than provide a footgun that lets people put off
| making things actually safe, NIST just didn't do that.
| p1mrx wrote:
| > 224 bits is considered the minimum safe output length
| for a general purpose hash function.
| 
| Considered by whom?
| CiPHPerCoder wrote:
| Truncating a hash function to 224 bits puts it at the
| 112-bit security level, which is roughly equivalent to
| 2048-bit RSA under today's understanding of the costs of
| distributed cracking attacks.
| 
| There are a lot of standards organizations all over the
| world with various recommendations.
| https://www.keylength.com collates quite a few of them.
| Pick the one most closely relevant for your jurisdiction.
| 
| Most of them recommend 2048-bit RSA as their minimum for
| asymmetric security, and AES-128 / SHA-256 as their
| minimum for symmetric security. This is a [112, 128]-bit
| security lower bound.
| 
| Truncating a hash to 160 bits yields 80-bit security,
| which is insufficient. 128 bits (64-bit security) is out
| of the question.
| [deleted]
| dlgeek wrote:
| SHA-512/256 is a truncated SHA-512 that's shortened to 256
| bits; it's a separate standard.
| [deleted]
| faceplanted wrote:
| "SHA-512/256" is a single entity
| CiPHPerCoder wrote:
| I've edited my parent comment to consistently use hyphens.
| 
| There are six SHA-2 family hash functions:
| * SHA-224
| * SHA-256
| * SHA-384
| * SHA-512
| * SHA-512/224
| * SHA-512/256
| 
| Hope that helps. (I know it's still confusing.)
| oconnor663 wrote:
| SHA-512 provides a higher security level (256 bits), but
| there's not much practical value in a security level higher
| than 128 bits. I think "better" is too strong a word here.
| 
| SHA-512 is faster than SHA-256 in software on 64-bit machines.
| That's a more important difference than the security level.
| However, there are two major caveats to consider: 1) Hash | function performance is more likely to matter on cheap | (non-64-bit) hardware where everything is slow, than on fancy | hardware where everything is fast. 2) Some x86 and ARM chips | have hardware accelerated implementations of SHA-256, but not | of SHA-512. | GlitchMr wrote: | SHA-256 is shorter, and there isn't much of a difference | between SHA-256 and SHA-512 (they are both using SHA-2 | algorithm). | [deleted] | woadwarrior01 wrote: | FWIW, the instructions in Intel SHA extensions (which are also | supported on modern AMD processors starting from Ryzen onwards) | only support SHA-1 and SHA-256. | notlukesky wrote: | > Responsible Disclosure | | We have tried to contact the authors of affected software before | announcing this attack, but due to limited resources, we could | not notify everyone. | | Is there a list of affected software out there? | CiPHPerCoder wrote: | https://github.com/search?q=sha1&type=Code | https://github.com/search?q=sha-1&type=Code | eerrt wrote: | The full paper is https://eprint.iacr.org/2020/014.pdf if anyone | is interested ___________________________________________________________________ (page generated 2020-01-07 23:00 UTC)
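A note on the structure several comments in this thread rely on: SHA-1, like the rest of the Merkle-Damgard family, hashes its input as a stream of blocks, so the digest of a concatenation can be computed incrementally. This is the property that lets a colliding pair of prefixes be extended with any common suffix while still colliding. The streaming behaviour itself (though of course not a collision) is easy to confirm with Python's hashlib:

```python
import hashlib

prefix, suffix = b"chosen prefix", b"common suffix"

# One-shot digest of the whole message:
whole = hashlib.sha1(prefix + suffix).hexdigest()

# Incremental hashing passes through the same internal state:
h = hashlib.sha1(prefix)
h.update(suffix)
assert h.hexdigest() == whole
# If two different prefixes drove the internal state to the same
# value (a collision), appending the same suffix to both would
# keep the final digests equal as well.
```

The byte strings here are arbitrary placeholders; the equality holds for any split of any message, which is exactly why chunked/streaed hashing and collision-suffix extension both work.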