
Short Chosen-Prefix Collisions for MD5 and the Creation of a Rogue CA Certificate. Marc Stevens, Alexander Sotirov, Jacob Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik and Benne de Weger. Presentation: Saffi Keisari. Content: MD5 Collision Overview. MD5 Overview.

## Presentation Saffi Keisari


**Short Chosen-Prefix Collisions for MD5 and the Creation of a Rogue CA Certificate**
Marc Stevens, Alexander Sotirov, Jacob Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik and Benne de Weger
Presentation: Saffi Keisari

**Content**
• MD5 Collision Overview
• MD5 Overview
• MD5 Collision History
• MD5 Short Chosen Prefix Collision
• SSL Overview
• Attack on Certification Authorities
• Collision Improvements
• Conclusions
• Progress of Collision Attacks

**Hashes and Message Digest**
• A hash is also called a message digest
• One-way function: d = h(m), but there is no h′ with h′(d) = m
• Cannot find the message given a digest
• Cannot find m1 ≠ m2 where d1 = d2
• Arbitrary-length message to fixed-length digest
• Randomness
  • any output bit is ‘1’ half the time
  • each output has 50% ‘1’ bits

**Using Hash for Authentication**
• Alice to Bob: challenge rA
• Bob to Alice: MD(KAB|rA)
• Bob to Alice: rB
• Alice to Bob: MD(KAB|rB)
• Only need to compare MD results

**MD5: Message Digest Version 5**
• Input: message; output: 128-bit digest
• Until recently the most widely used hash algorithm
• In recent times it has raised both brute-force and cryptanalytic concerns
• Specified as Internet standard RFC 1321

**MD5 Overview**
• 1. Padding: Pad the message with first a ‘1’ bit, then as many ‘0’ bits as needed until the resulting bit length equals 448 mod 512, and finally the bit length of the original message as a 64-bit little-endian integer. The total bit length of the padded message is 512N for a positive integer N.
• 2. Partitioning: The padded message is partitioned into N consecutive 512-bit blocks $M_1, M_2, \ldots, M_N$.
• 3. Processing: MD5 goes through N + 1 states $IHV_i$, for $0 \le i \le N$, called the intermediate hash values. Each intermediate hash value $IHV_i$ consists of four 32-bit words $a_i, b_i, c_i, d_i$. For i = 0 these are initialized to fixed public values:
  $IHV_0 = (a_0, b_0, c_0, d_0) = (67452301_{16}, \mathrm{EFCDAB89}_{16}, \mathrm{98BADCFE}_{16}, 10325476_{16})$,
  and for $i = 1, 2, \ldots, N$ the intermediate hash value $IHV_i$ is computed using the MD5 compression function described in detail below:
  $IHV_i = \mathrm{MD5Compress}(IHV_{i-1}, M_i)$.
• 4. Output: The resulting hash value is the last intermediate hash value $IHV_N$, expressed as the concatenation of the bytes of the four words $a_N, b_N, c_N, d_N$ in little-endian order, each byte usually shown as 2 hexadecimal digits. E.g. in this manner $IHV_0$ is expressed as the hexadecimal string 0123456789ABCDEFFEDCBA9876543210.

**MD5 Process**
• As many stages as the number of 512-bit blocks in the final padded message
• Digest: four 32-bit words, MD = A|B|C|D
• Every message block contains sixteen 32-bit words: m0|m1|m2…|m15
• Digest MD0 is initialized to A=01234567, B=89abcdef, C=fedcba98, D=76543210 (bytes listed low-order first)
• Every stage consists of 4 passes over the message block, each modifying MD
• Each block: 4 rounds; each round: 16 steps

**Processing of Block mi - 4 Passes**
Functions and random numbers:
• $F(x,y,z) = (x \wedge y) \vee (\neg x \wedge z)$, for $0 \le t < 16$
• $G(x,y,z) = (x \wedge z) \vee (y \wedge \neg z)$, for $16 \le t < 32$
• $H(x,y,z) = x \oplus y \oplus z$, for $32 \le t < 48$
• $I(x,y,z) = y \oplus (x \vee \neg z)$, for $48 \le t < 64$
The four passes:
• ABCD = fF(ABCD, mi, T[1..16])
• ABCD = fG(ABCD, mi, T[17..32])
• ABCD = fH(ABCD, mi, T[33..48])
• ABCD = fI(ABCD, mi, T[49..64])
[Figure: block mi and MDi go through the four passes; the result is added word by word to MDi to give MDi+1]

**Different Passes...**
• Input:
  • $m_t$: a 32-bit word from the message, with a different shift every round
  • $T_i = \lfloor 2^{32} \cdot |\sin(i)| \rfloor$, $1 \le i \le 64$: provides a randomized set of 32-bit patterns, which eliminates any regularities in the input data
  • ABCD: current MD
• Output:
  • ABCD: new MD

**MD5 Compression Function**
• Each round has 16 steps of the form: a = b+((a+g(b,c,d)+X[k]+T[i])<<<s)
• a, b, c, d refer to the 4 words of the buffer, but are used in varying permutations
• Note that each step updates only 1 word of the buffer
• After 16 steps each word has been updated 4 times
• g(b,c,d) is a different nonlinear function in each round (F, G, H, I)

**Collisions for MD5**
[Figure: MD5( ) = MD5( ) for two different inputs]
• 2004: First collision for MD5 [Wang, Yu]:
• Two 128-byte messages with
same MD5 hash value
• Identical-prefix collision attack
• Messages differ only in 128 consecutive ‘random’ bytes
• Bytes before or after may not differ
• Currently: < 1 second on a single PC core
• Same MD5 hash value ⇒ same signature

**Collisions for MD5 2004**
Due to the iterative structure of MD5 and to the fact that IHV0 can have any 128-bit value, such collisions can be combined into larger inputs. Namely, for any given prefix P and any given suffix S, a pair of "collision blocks" {C,C'} can be computed such that MD5(P||C||S) = MD5(P||C'||S). We use the term "collision block" for a specially crafted bit string that is inserted into another bit string to achieve a collision. One collision block may consist of several input blocks, even including partial input blocks. The collision blocks of [WY] consist of precisely two consecutive input blocks.

**Collisions for MD5 2004**
Their attack is based on a combined additive and XOR differential method. Using this differential, they constructed 2 differential paths for the compression function of MD5, which are to be used consecutively to generate a collision of MD5 itself. Their differential paths describe precisely how differences between the two pairs (IHV, B) and (IHV′, B′), each an intermediate hash value and an accompanying message block, propagate through the compression function. They describe the integer difference (−1, 0 or +1) in every bit of the intermediate working states $Q_t$, and even specific values for some bits. Using a collision-finding algorithm they search for a collision consisting of two consecutive pairs of blocks (B0, B′0) and (B1, B′1) satisfying the 2 differential paths, starting from an arbitrary $\widehat{IHV} = \widehat{IHV}'$.
Therefore the attack can be used to create two messages M and M′ with the same hash that differ only slightly in two subsequent blocks, as shown in the following outline, where $\widehat{IHV} = IHV_k$ for some k:
[Figure: outline of the two colliding messages]

**Chosen-Prefix Collisions 2006**
[Figure: MD5( ) = MD5( ) for two different inputs]
• Chosen-prefix collision (CPC) attack [Stevens, Lenstra, de Weger]
• New, stronger type of collision
• Choose two arbitrary files (same length)
• Make them collide by appending 716 ‘random’ bytes
• Currently: 1 day on a quad-core PC with only 588 bytes
• Example: colliding certificates with different identities
• MD5 is harmful for digital signatures

**Chosen-Prefix Collisions**
• MD5 compression: (IHV, M) vs (IHV′, M′)
• Analyze the propagation of differences
• Choose δM = M′ − M that achieves (partial) elimination of δIHV at the end
• Construct a set of equations (sufficient conditions)
• Solve the set of equations to obtain actual M, M′
• Repeat until δIHV = 0

**Birthday Problem**
• How many people do you need so that the probability of two of them sharing a birthday is > 50%?
• Random sample of n birthdays (input) taken from k possible values (365, output)
• $k^n$ total number of possibilities
• $(k)_n = k(k-1)\cdots(k-n+1)$ possibilities without a duplicate birthday
• Probability of no repetition: $p = (k)_n / k^n \approx 1 - n(n-1)/(2k)$
• For k = 365, the minimum is n = 23
• There are $n(n-1)/2$ pairs, and each pair has probability 1/k of having the same output
• $n(n-1)/(2k) > 50\% \Rightarrow n > k^{1/2}$

**Birthday Problem - Our Case**
• Let $f : V \to V$ be a deterministic function; a collision is a pair of different points x and y such that f(x) = f(y)
• After approximately $\sqrt{\pi |V| / 2}$ iterations one may expect to have encountered a collision
• Let p be the probability that a birthday collision satisfies additional conditions that cannot be captured by V or f
• In our case $C_{tr} = \sqrt{\pi |V| / (2p)}$, where V is the search space and p is the probability above

**Chosen-Prefix Collisions**
• Not all δIHVs can be eliminated
• First perform a birthday search
• Find δIHVs of a specific form, e.g. δIHV = (0, x, x, y)
• Extend the search to lower the number of near-collision blocks
• Appends 64 to 96 bits to the prefixes
• Then iteratively eliminate differences in δIHV
• Until δIHV = (0, 0, 0, 0)
• Vlastimil Klima, Finding MD5 collisions on a notebook PC using multi-message modifications, Cryptology ePrint Archive, Report 2005/102, 2005, http://eprint.iacr.org/2005/102.
• Vlastimil Klima, Tunnels in hash functions: MD5 collisions within a minute, Cryptology ePrint Archive, Report 2006/105, 2006, http://eprint.iacr.org/2006/105.

**Chosen-Prefix Collisions**
• Thus, due to the iterative structure of MD5, the chosen-prefix collision construction method is able to produce, on input of any pair of chosen prefixes {P,P'} and any suffix S, a pair of collision blocks {C,C'} such that MD5(P||C||S) = MD5(P'||C'||S).
• To be more precise, for given {P,P'} the collision blocks are constructed as follows. First P, P' are padded with bit strings A, A' of any useful size and contents to achieve equal lengths of P||A and P'||A'. Then, using a "birthdaying" step, bit strings B, B' are produced such that the resulting P||A||B and P'||A'||B' have equal length, being a multiple of 512, and such that the resulting IHVs at this point have a prespecified structure. This enables the construction of a pair of "near collision blocks" {NC,NC'} that gives rise to a collision. Each near collision block consists of a number of 512-bit input blocks. Thus, with C = A||B||NC and C' = A'||B'||NC', the resulting IHVs are identical. Consequently, MD5(P||C) = MD5(P'||C'), and thus also for any suffix S it is the case that MD5(P||C||S) = MD5(P'||C'||S).

**2006 Example Colliding Certificates**
[Figure: side-by-side layout of the two colliding certificates, for “Arjen K. Lenstra” and “Marc Stevens”. Labels: serial number and validity period (set by the CA); chosen prefix (different); real cert RSA key, 8192 bits; collision bits (computed); identical bytes (copied from real cert); X.509 extensions; valid signature.]

**Collision Constraints**
• In our target application, generating a rogue CA certificate, we have to deal with two hard limits:
• Because the CA that is supposed to sign our (legitimate) certificate does not accept certification requests for RSA moduli larger than 2048 bits, each of our suffixes S and S′ and their common appendage T must fit in 2048 bits. This implies that we can use at most 3 near-collision blocks (each block is 512 bits).
• Furthermore, to reliably predict the serial number, the entire construction must be performed within a few days.

**Partial differential paths**
• w: a larger value allows elimination of more differences in δIHV per near-collision block
• But it increases the cost of constructing each near-collision block by a factor of roughly $2^{2w}$
• Due to this blow-up factor of $2^{2w}$, only the values 2, 3, 4 and 5 are of interest
• The new paths vary the carry propagations in the last 3 steps and the boolean function difference in the last step
• This change affects the working state only in the difference $\delta Q_{64}$

**Complexity, memory requirement**
• The first chosen-prefix collision example from [#1] used a 96-bit birthday search space V with $|V| = 2^{96}$ to find a δIHV = (δa, δb, δc, δd) with δa = 0, δb = δc = δd. This search can be continued until a birthday collision is found that requires a sufficiently small number of near-collision blocks, which leads to a trade-off between the birthday search and the number of blocks.
• If one would aim for just 3 near-collision blocks, one expects $2^{57.33}$ MD5 compressions for the 96-bit birthday search, which would take about 50 days on 215 PlayStation 3 game consoles.
• By leaving δb free, we get an improved 64-bit search space (cf. [#2]).
• In the resulting birthday collisions, the differences in δb compared to δc were handled by the differential path from [#2, section 7.4], which corresponds to $\delta Q_{64} = \pm 2^{q} \mp 2^{q+21 \bmod 32}$
• #1 http://www.win.tue.nl/hashclash/
• #2 http://www.win.tue.nl/hashclash/Nostradamus/
r: number of near-collision blocks

**Complexity, memory requirement**
• We can do a (64 + k)-bit search similar to the one above, but with $\delta b = \delta c \bmod 2^{k}$.
• Since δb does not introduce new differences compared to δc in the lower k bits, the average number of near-collision blocks may be reduced, in particular when taking advantage of our new family of differential paths, while incurring a higher birthdaying cost.
• For any targeted number of near-collision blocks, this leads to a trade-off between the birthdaying cost and space requirements (unless the number of blocks is at least 6, since then 241MB suffices for the plausible choice w = 2)
r: number of near-collision blocks

**Complexity, memory requirement**
• (64 + k)-bit search with $\delta b = \delta c \bmod 2^{k}$ (r: number of near-collision blocks)
• The overall chosen-prefix collision construction takes on average less than a day on the cluster of PS3s
• With 150 MB per PS3 (w = 5, k = 8, M = 30 GB), it takes a factor √2 more time than expected

**Collision Improvements**
• Allow extra bit differences in the last step
• Eliminate more IHV differences per block
• Decreases the average number of collision bytes required
• Increases collision search complexity: $O(2^{2w})$
• w arbitrary bit differences

**Collision Improvements**
• Birthday search for δIHV = (δa, δb, δc, δd) of the form δa = 0, δd = δc
• Short CPC: very high memory requirements
• New trade-off: $\delta b = \delta c \bmod 2^{k}$, $0 \le k \le 32$
• Trade memory vs complexity; w = 5: $\times 2^{10}$ vs $\times 2^{9}$

**Collision Improvements**
• Rogue CA construction (< 2048 bits)
• Cluster of 215 PlayStation 3s, performing like 8600 PC cores
• Complexity $2^{50}$ using 30 GB:
  • 1 day on the cluster
• Complexity $2^{48.2}$ using a few TB:
  • 1 day on 20 PS3s and 1 PC
  • 1 day on 8 NVIDIA GeForce GTX280s
  • 1 day on Amazon EC2 at a cost of $2,000
• Normal CPC
  • Complexity approx. $2^{39}$ (< 1 day on a quad-core PC)

**Overview SSL**
• Web deployment:
  • Web servers
  • Email servers (POP3, IMAP)
  • Many other services (IRC, SSL VPN, etc.)
• Very good at preventing eavesdropping
  • Asymmetric key exchange (RSA)
  • Symmetric crypto for data encryption
• Man-in-the-middle attacks
  • Prevented by establishing a chain of trust from the website's digital certificate to a trusted certificate authority

**Certification Authorities (CAs)**
• Website digital certificates must be signed by a trusted Certificate Authority
• Browsers ship with a list of trusted CAs
  • Firefox 3 includes 135 trusted CA certs
• CAs’ responsibilities:
  • Verify the identity of the requestor
  • Verify domain ownership for SSL certs
  • Revoke bad certificates

**Obtaining certificates**
• 1. User generates a private key
• 2. User creates a Certificate Signing Request (CSR) containing
  • user identity
  • domain name
  • public key
• 3. CA processes the CSR
  • validates user identity
  • validates domain ownership
  • signs and returns the certificate
• 4.
User installs the private key and certificate on a web server

**Attack on Certification Authorities**
• We were able to create a sub-CA signed by a known trusted CA (RapidSSL)
  • Not known by default to major web browsers
  • But trusted, as it is signed by a known CA
  • Same effect as subverting a known trusted CA
• Possible because one particular commercial CA
  • used MD5 to create signatures (MD5 has been known to have significant weaknesses since 2004)
  • had weaknesses in its procedures

**Attack on Certification Authorities**
• In July 2008 we set out to check whether the colliding certificates attack could be applied to a real Certification Authority.
• The first step was to identify the CAs that still used MD5.
• Unfortunately it is not possible to determine the hash function a CA uses from the CA certificate.
• We had to look at (website) certificates issued by the CAs instead.
• Over the course of a week we spidered the web and collected more than 100,000 SSL certificates, of which about 30,000 were signed by CAs trusted by Firefox.
• There were six CAs that had issued certificates signed with MD5 in 2008:

**Attack on Certification Authorities**
• RapidSSL: C=US, O=Equifax Secure Inc., CN=Equifax Secure Global eBusiness CA-1
• FreeSSL (free trial certificates offered by RapidSSL): C=US, ST=UT, L=Salt Lake City, O=The USERTRUST Network, OU=http://www.usertrust.com, CN=UTN-USERFirst-Network Applications
• TC TrustCenter AG: C=DE, ST=Hamburg, L=Hamburg, O=TC TrustCenter for Security in Data Networks GmbH, OU=TC TrustCenter Class 3 CA/emailAddress=certificate@trustcenter.de
• RSA Data Security: C=US, O=RSA Data Security, Inc., OU=Secure Server Certification Authority
• Thawte: C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting cc, OU=Certification Services Division, CN=Thawte Premium Server CA/emailAddress=premium-server@thawte.com
• verisign.co.jp: O=VeriSign Trust Network, OU=VeriSign, Inc., OU=VeriSign International Server CA - Class 3, OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD.(c)97 VeriSign
• Out of the 30,000 certificates that we collected, about 9,000 were signed using MD5, and 97% of those were issued by RapidSSL.
• It was quite surprising that so many CAs were still using MD5, even though MD5 has been known to be insecure since the first collisions were presented in 2004 and these CAs had ignored all previous warnings by the cryptographic community.

**Creating a sub-CA**
[Figure: layout of the real certificate next to the rogue CA certificate. Labels: serial number, validity period, real cert domain name; chosen prefix (different); rogue CA RSA key; rogue CA X.509 extensions, with the CA bit set; real cert RSA key (max 2048 bits); Netscape Comment Extension (contents ignored by browsers); collision bits (computed); X.509 extensions; identical bytes (copied from real cert); valid signature.]

**Obstacles**
• Predicting the serial number and validity period
• Total computation < a few days
• Max 204 collision bytes instead of 716
  • Limit imposed by the CA, RapidSSL
  • Greatly increases computational time
  • 17 months on 1000 PC cores

**Predictions**
• RapidSSL uses a fully automated system
• Certificate issued exactly 6 seconds after clicking
• Validity period one year plus one day + 6 sec
• RapidSSL uses sequential serial numbers:
  • Nov 3 07:44:08 2008 GMT 643006
  • Nov 3 07:45:02 2008 GMT 643007
  • Nov 3 07:46:02 2008 GMT 643008
  • Nov 3 07:47:03 2008 GMT 643009
  • Nov 3 07:48:02 2008 GMT 643010
  • Nov 3 07:49:02 2008 GMT 643011
  • Nov 3 07:50:02 2008 GMT 643012
  • Nov 3 07:51:12 2008 GMT 643013
  • Nov 3 07:51:29 2008 GMT 643014
  • Nov 3 07:52:02 2008 GMT ?

**Predictions**
Estimate: 800-1000 certificates per weekend
Procedure:
• Get the serial number S on Friday
• Predict the value for time T on Sunday to be S+1000
• Generate the collision bytes
• Shortly before time T buy enough certs to increment the counter to S+999
• Send the colliding request at time T and get serial number S+1000

**Result**
• Success at the 4th attempt
• Generated CA signature for the real cert is also valid for the rogue CA cert
• Explicit safeguards:
• Validity period
limited to August 2004
• Private key remains secret
• Major browsers and affected CAs were informed in advance
• They responded quickly and adequately
• MD5 abandoned by CAs hours after the public presentation

**Single block CPC**
• Birthday search for a δIHV that can be reduced to 0 with a single near-collision block
• New approach:
  • New fastest near-collision attack (complexity $2^{15}$)
  • Allow an extra factor $2^{26}$ in collision-finding complexity
  • Results in a set of $2^{23.3}$ usable δIHVs of the form $\delta a = -2^{5}$, $\delta d = -2^{5} + 2^{25}$, $\delta c = -2^{5} \bmod 2^{20}$
• Total complexity: approx. $2^{53.2}$
• Example single block CPC in the paper

**CA Certificate**
• byte 0 - 3: header
• byte 4 - 8: version number (3, default value)
• byte 9 - 13: serial number (643015, set by CA and predicted by us)
• byte 14 - 28: signature algorithm ("md5withRSAEncryption", set by CA)
• byte 29 - 120: issuer Distinguished Name (CA default value)
• byte 121 - 152: validity ("from 3 Nov. 2008 7:52:02 to 4 Nov. 2009 7:52:02", set by CA and predicted by us)
• byte 153 - 440: subject Distinguished Name (Country = "US", Organisation = "i.broke.the.internet.and.all.i.got.was.this.t-shirt.phreedom.org", Organisational Unit (3 fields set by the CA), Common Name = "i.broke.the.internet.and.all.i.got.was.this.t-shirt.phreedom.org"; Country, Organisation and Common Name set by us in the CSR)
• byte 441 - 734: subject public key info:
  • byte 441 - 444: header
  • byte 445 - 459: public key algorithm ("RSAEncryption", set by us in the CSR)
  • byte 460 - 468: headers
  • byte 469 - 729: RSA modulus (2048 bit value, set by us in the CSR)
  • byte 730 - 734: RSA public exponent (65537, set by us in the CSR)
• byte 735 - 926: extensions:
  • byte 735 - 740: headers
  • byte 741 - 756: key usage ("digital signature, nonrepudiation, key encipherment, data encipherment", set by CA)
  • byte 757 - 881: subject key identifier, crl distribution points, authority key identifier (uninteresting fields, set by CA)
  • byte 882 - 912: extended key usage ("server authentication, client authentication", set by CA)
  • byte 913 - 926: basic constraints ("CA = FALSE, Path Length = None", set by CA)

**Rogue CA Certificate**
• byte 0 - 3: header
• byte 4 - 8: version number (3, default value)
• byte 9 - 11: serial number (65, arbitrary choice)
• byte 12 - 26: signature algorithm ("md5withRSAEncryption", set by us in the CSR)
• byte 27 - 118: issuer Distinguished Name (CA default value)
• byte 119 - 150: validity ("from 31 Jul. 2004 0:00:00 to 2 Sep. 2004 0:00:00", set by CA and predicted by us)
• byte 153 - 212: subject Distinguished Name (Common Name = "MD5 Collisions Inc. (http://www.phreedom.org/md5)")
• byte 213 - 374: subject public key info:
  • byte 213 - 215: header
  • byte 216 - 230: public key algorithm ("RSAEncryption", set by us in the CSR)
  • byte 231 - 237: headers
  • byte 238 - 369: RSA modulus (1024 bit value, set by us in the CSR)
  • byte 370 - 374: RSA public exponent (65537, set by us in the CSR)
• byte 375 - 926: extensions:
  • byte 375 - 378: headers
  • byte 379 - 395: key usage ("digital signature, nonrepudiation, certificate signing, offline CRL signing, CRL signing")
  • byte 396 - 412: basic constraints ("CA = TRUE, Path Length = None")
  • byte 413 - 476: subject key identifier, authority key identifier (uninteresting fields)
  • byte 477 - 926: tumor (Netscape extension)

**Conclusion**
• Collision attacks on MD5 form a real threat
• Hard to replace broken crypto primitives
  • MD5 was still used by major CAs 4 years after the first collision attacks
• Crypto primitives can be broken overnight
  • What to do when e.g. SHA-1 really falls, say yesterday?
  • How to make replacement of primitives easier?
• Source code implementation released: http://code.google.com/p/hashclash (support for CELL/PS3 & CUDA)
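
The padding rule and the little-endian output convention from the MD5 Overview slide can be checked with a few lines of Python. This is only a minimal sketch: the helper names `md5_pad` and `ihv_to_hex` are made up for this illustration and do not come from the talk or from any library, and the compression function itself is not implemented.

```python
import struct

# Initial state IHV0 as four 32-bit words (a0, b0, c0, d0), as listed on the slide.
IHV0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def md5_pad(message: bytes) -> bytes:
    """Append MD5 padding to a byte string (hypothetical helper name)."""
    bit_len = (8 * len(message)) % (1 << 64)
    padded = message + b"\x80"  # a '1' bit followed by seven '0' bits
    # Add '0' bytes until the length is 56 mod 64 bytes, i.e. 448 mod 512 bits.
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Finish with the original bit length as a 64-bit little-endian integer.
    padded += struct.pack("<Q", bit_len)
    return padded


def ihv_to_hex(ihv) -> str:
    """Serialize four 32-bit words in little-endian byte order as a hex string."""
    return struct.pack("<4I", *ihv).hex().upper()


if __name__ == "__main__":
    for n in (0, 55, 56, 64):
        print(n, "message bytes ->", len(md5_pad(b"a" * n)) // 64, "block(s)")
    print(ihv_to_hex(IHV0))
```

Serializing IHV0 this way reproduces the string 0123456789ABCDEFFEDCBA9876543210 quoted on the slide, and a 56-byte message already needs a second block because the 64-bit length field no longer fits in the first one.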

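
The two estimates on the Birthday Problem slides can also be evaluated numerically. The sketch below assumes an ideal uniform random function; the function names are hypothetical, and the second helper only evaluates the formula $\sqrt{\pi |V| / (2p)}$ as stated on the slide (returned as a base-2 logarithm), so it says nothing about the constant factors of a real birthday search.

```python
import math


def prob_no_repeat(n: int, k: int) -> float:
    """Probability that n uniform samples from k values are all distinct: (k)_n / k^n."""
    p = 1.0
    for i in range(n):
        p *= (k - i) / k
    return p


def min_samples(k: int, target: float = 0.5) -> int:
    """Smallest n for which the probability of a repeated value exceeds target."""
    n = 1
    while 1.0 - prob_no_repeat(n, k) <= target:
        n += 1
    return n


def log2_birthday_cost(space_bits: float, p: float = 1.0) -> float:
    """log2 of sqrt(pi * |V| / (2 * p)) for a search space of size |V| = 2**space_bits."""
    return 0.5 * (math.log2(math.pi / 2) + space_bits - math.log2(p))


if __name__ == "__main__":
    print(min_samples(365))
    print(round(log2_birthday_cost(64), 2))
    print(round(log2_birthday_cost(96), 2))
```

With p = 1 the formula gives roughly $2^{32.3}$ iterations for a 64-bit space and $2^{48.3}$ for a 96-bit space; the larger figures quoted in the slides correspond to p < 1, since only a fraction of birthday collisions satisfy the extra conditions.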