How to Break MD5 and Other Hash Functions
MD5 is one of the most widely used cryptographic hash functions nowadays. It was designed in 1992 as an improvement of MD4, and its security was widely studied since then by several authors. The best known result so far was a semi free-start collision, in which the initial value of the hash function is replaced by a non-standard value, which is the result of the attack. In this paper we present a new powerful attack on MD5 which allows us to find collisions efficiently. We used this attack to find collisions of MD5 in about 15 minutes up to an hour computation time. The attack is a differential attack, which unlike most differential attacks, does not use the exclusive-or as a measure of difference, but instead uses modular integer subtraction as the measure. We call this kind of differential a modular differential. An application of this attack to MD4 can find a collision in less than a fraction of a second. This attack is also applicable to other hash functions, such as RIPEMD and HAVAL.
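To illustrate the distinction the abstract draws between an exclusive-or difference and a modular difference, here is a minimal sketch on 32-bit words. The helper names `xor_difference` and `modular_difference` are illustrative and are not taken from the paper.

```python
MASK32 = 0xFFFFFFFF  # MD5 operates on 32-bit words


def xor_difference(x: int, y: int) -> int:
    """Bitwise (exclusive-or) difference between two 32-bit words."""
    return (x ^ y) & MASK32


def modular_difference(x: int, y: int) -> int:
    """Difference y - x taken modulo 2^32 (integer subtraction with wraparound)."""
    return (y - x) & MASK32


# Three word pairs that share the same modular difference (+1) but have
# different XOR differences, because the carry propagates differently.
pairs = [(0x00000001, 0x00000002), (0x00000002, 0x00000003), (0xFFFFFFFF, 0x00000000)]
for x, y in pairs:
    print(f"x={x:08x} y={y:08x} "
          f"modular={modular_difference(x, y):08x} xor={xor_difference(x, y):08x}")
```

The same modular difference can correspond to many XOR differences, which is why the choice of difference measure changes which message pairs a differential path describes.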
- Research Article
2
- 10.2498/cit.1002181
- Jan 1, 2013
- Journal of Computing and Information Technology
Cryptographic hash functions are important cryptographic techniques and are widely used in many cryptographic applications and protocols. All hash functions based on the MD4 design, such as MD5, SHA-0, SHA-1 and RIPEMD-160, are built on the Merkle-Damgård iterative method. Recent differential and generic attacks against these popular hash functions have shown weaknesses of both specific hash functions and their underlying Merkle-Damgård construction. In this paper we propose a hash function which follows the design principle of SHA-1 and is based on the dither construction. Its compression function takes three inputs and generates a single 160-bit output. The extra input to the compression function is generated by a fast pseudo-random function. The dither construction shows strong resistance against major generic and other cryptanalytic attacks. The security of the proposed hash function against generic attacks, differential attacks, birthday attacks and statistical attacks is analyzed in detail. It is compared exhaustively with SHA-1 because hash functions from SHA-2 and SHA-3 are of higher bit length and known to be more secure than SHA-1. It is shown that the proposed hash function has high sensitivity to an input message and is secure against different cryptanalytic attacks.
- Research Article
1
- 10.1007/s10559-021-00352-y
- Mar 1, 2021
- Cybernetics and Systems Analysis
In the paper, we construct security estimations of Poseidon hash function against non-binary linear and differential attacks. We adduce the general parameters for the Poseidon hash function that allow using this hash function in recurrent SNARK-proofs based on MNT-4 and MNT-6 triplets. We also analyse how to choose S-boxes for such function for this choice to be optimal from the point of view of the number of constraints and security. We show how many full rounds are sufficient to guarantee security of such hash function against non-binary linear and differential attacks. We also calculate the number of constraints per bit achieved in the proposed realizations and demonstrate a considerable gain as compared to the Pedersen hash function.
- Research Article
4
- 10.2478/s13537-014-0204-7
- Jan 1, 2014
- Open Computer Science
Cryptographic hash functions are important cryptographic techniques and are widely used in many cryptographic applications and protocols. All hash functions based on the MD4 design, such as MD5, SHA-1, RIPEMD-160 and FORK-256, are built on the Merkle-Damgård iterative method. Recent differential and generic attacks against these popular hash functions have shown weaknesses of both specific hash functions and their underlying Merkle-Damgård construction. In this paper we propose a hash function that follows the design principle of NewFORK-256 and is based on the HAIFA construction. Its compression function takes three inputs and generates a single 256-bit output. The extra input to the compression function is a 64-bit counter (the number of bits hashed so far). The HAIFA construction shows strong resistance against major generic and other cryptanalytic attacks. The security of the proposed hash function against generic attacks, differential attacks, birthday attacks and statistical attacks is analyzed in detail. It is shown that the proposed hash function has high sensitivity to an input message and is secure against different cryptanalytic attacks.
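The three-input compression function described in this abstract (chaining value, message block, 64-bit counter of bits hashed so far) can be sketched as follows. This is only an illustration of the iteration structure: SHA-256 from Python's `hashlib` stands in for the compression function, and the block size, padding rule, all-zero IV, and the names `compress` and `haifa_like_hash` are assumptions rather than details of the proposed design or of the HAIFA specification.

```python
import hashlib

BLOCK_BYTES = 64  # assumed block size for this sketch


def compress(chaining: bytes, block: bytes, bits_so_far: int) -> bytes:
    """Stand-in compression function with three inputs and a 256-bit output."""
    counter = bits_so_far.to_bytes(8, "big")  # 64-bit counter
    return hashlib.sha256(chaining + block + counter).digest()


def haifa_like_hash(message: bytes, iv: bytes = bytes(32)) -> bytes:
    """Iterate the compression function, feeding the bit counter to each call."""
    bit_len = len(message) * 8
    # Simplified padding (an assumption): 0x80, zero bytes, 64-bit message length.
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK_BYTES)
    padded += bit_len.to_bytes(8, "big")

    state = iv
    for offset in range(0, len(padded), BLOCK_BYTES):
        block = padded[offset:offset + BLOCK_BYTES]
        # Number of message bits covered up to and including this block.
        bits_so_far = min(bit_len, (offset + BLOCK_BYTES) * 8)
        state = compress(state, block, bits_so_far)
    return state


print(haifa_like_hash(b"abc").hex())
```

Because the counter differs from block to block, identical message blocks at different positions are compressed under different inputs, which is the property that distinguishes this structure from a plain iterated construction.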
- Book Chapter
186
- 10.1007/978-3-642-01001-9_8
- Jan 1, 2009
In this paper, we present the first cryptographic preimage attack on the full MD5 hash function. This attack, with a complexity of \(2^{116.9}\), generates a pseudo-preimage of MD5 and, with a complexity of \(2^{123.4}\), generates a preimage of MD5. The memory complexity of the attack is \(2^{45} \times 11\) words. Our attack is based on splice-and-cut and local-collision techniques that have been applied to step-reduced MD5 and other hash functions. We first generalize and improve these techniques so that they can be more efficiently applied to many hash functions whose message expansions are a permutation of message-word order in each round. We then apply these techniques to MD5 and optimize the attack by considering the details of MD5 structure.
- Research Article
148
- 10.1007/s00530-013-0314-4
- Mar 24, 2013
- Multimedia Systems
In this paper, a novel algorithm for image encryption based on hash function is proposed. In our algorithm, a 512-bit long external secret key is used as the input value of the salsa20 hash function. First of all, the hash function is modified to generate a key stream which is more suitable for image encryption. Then the final encryption key stream is produced by correlating the key stream and plaintext resulting in both key sensitivity and plaintext sensitivity. This scheme can achieve high sensitivity, high complexity, and high security through only two rounds of diffusion process. In the first round of diffusion process, an original image is partitioned horizontally to an array which consists of 1,024 sections of size 8 × 8. In the second round, the same operation is applied vertically to the transpose of the obtained array. The main idea of the algorithm is to use the average of image data for encryption. To encrypt each section, the average of other sections is employed. The algorithm uses different averages when encrypting different input images (even with the same sequence based on hash function). This, in turn, will significantly increase the resistance of the cryptosystem against known/chosen-plaintext and differential attacks. It is demonstrated that the 2D correlation coefficients (CC), peak signal-to-noise ratio (PSNR), encryption quality (EQ), entropy, mean absolute error (MAE) and decryption quality can satisfy security and performance requirements (CC 204.8, entropy >7.9974 and MAE >79.35). The number of pixel change rate (NPCR) analysis has revealed that when only one pixel of the plain-image is modified, almost all of the cipher pixels will change (NPCR >99.6125 %) and the unified average changing intensity is high (UACI >33.458 %). Moreover, our proposed algorithm is very sensitive with respect to small changes (e.g., modification of only one bit) in the external secret key (NPCR >99.65 %, UACI >33.55 %). 
It is shown that this algorithm yields better security performance in comparison to the results obtained from other algorithms.
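The NPCR and UACI figures quoted in this abstract are commonly computed as below. This is a sketch using the usual definitions for 8-bit grayscale images (percentage of differing pixels, and mean absolute difference normalised by 255); the function name `npcr_uaci` and the nested-list image representation are assumptions, not the authors' code.

```python
def npcr_uaci(cipher_a, cipher_b):
    """Return (NPCR %, UACI %) for two equally sized 8-bit grayscale images.

    Each image is a list of rows, each row a list of integers in 0..255.
    """
    flat_a = [pixel for row in cipher_a for pixel in row]
    flat_b = [pixel for row in cipher_b for pixel in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and of equal size")

    count = len(flat_a)
    changed = sum(a != b for a, b in zip(flat_a, flat_b))
    total_abs_diff = sum(abs(a - b) for a, b in zip(flat_a, flat_b))

    npcr = 100.0 * changed / count
    uaci = 100.0 * total_abs_diff / (255 * count)
    return npcr, uaci


# Toy example: one of four pixels differs, by the maximum amount.
print(npcr_uaci([[0, 0], [0, 0]], [[255, 0], [0, 0]]))  # (25.0, 25.0)
```

In a sensitivity test, `cipher_a` and `cipher_b` would be the encryptions of two plain images (or of one image under two keys) that differ in a single pixel or bit.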
- Research Article
- 10.48084/etasr.12601
- Oct 6, 2025
- Engineering, Technology & Applied Science Research
A hash function is a mathematical model that maps inputs of arbitrary size to unique outputs of a fixed length in bits. Hash functions are highly useful and appear in almost all information security applications. In addition to information security applications, it can also serve as index data in hash tables, aiding in the detection of duplicate data for fingerprinting or uniquely identifying files, as well as for checksums to identify data corruption. This research introduces an innovative 256-bit hash function that utilizes a chaotic substitution box using a non-linear logistic map. Unlike MD5 or SHA-family hash functions, which rely on modular arithmetic, logical operations, and bitwise shifts for diffusion and non-linearity, the proposed method incorporates a chaotic substitution box to introduce an additional nonlinear transformation layer and high diffusion. The avalanche rate, statistical analysis, pre-image resistance, second pre-image, collision resistance, and performance are examined to evaluate the cryptographic strength and the performance of the proposed method.
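One common way to derive a substitution box from a logistic map is to iterate the map and rank the resulting values. The sketch below shows that generic idea only; it is not the construction used in the paper, and the parameters `x0`, `r`, and `burn_in` as well as the name `logistic_sbox` are illustrative assumptions.

```python
def logistic_sbox(x0: float = 0.3141, r: float = 3.99, burn_in: int = 1000):
    """Build an 8-bit S-box (a permutation of 0..255) from a logistic map.

    The map x -> r * x * (1 - x) is iterated; the rank order of 256
    consecutive samples defines the permutation.
    """
    x = x0
    for _ in range(burn_in):  # discard the transient part of the orbit
        x = r * x * (1.0 - x)

    samples = []
    for index in range(256):
        x = r * x * (1.0 - x)
        samples.append((x, index))

    samples.sort()  # order the indices by their chaotic sample value
    return [index for _, index in samples]


sbox = logistic_sbox()
print(sbox[:8])
```

A slightly different `x0` or `r` yields a different permutation, which is the sensitivity such designs rely on; whether a given box has good cryptographic properties has to be checked separately.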
- Research Article
- 10.55632/pwvas.v95i2.993
- Apr 18, 2023
- Proceedings of the West Virginia Academy of Science
JOHNNA SMITH, Dept of Mathematics, Shepherd University, Shepherdstown, WV, 25443, and DONALD MILLS, Dept of Computer Sciences, Mathematics, and Engineering, Shepherd University, Shepherdstown, WV, 25443. Analysis of basic cryptographic concepts and recent open problems in hash function security. 
 
The objectives of this study are to show an understanding of cryptographic concepts as well as to highlight recent open problems involving hash function security. The method of study included reading the first five chapters of Cryptography: Theory and Practice by Stinson and Paterson as well as a recent paper that outlined open problems in hash function security. Written reports were then delivered on the information learned, which included selected proofs and solved examples. The opening report introduces the basic elements of cryptography: cryptosystems, cryptographic tools, message integrity, protocols, and security approaches. Chapter 2 of “Cryptography” describes various types of ciphers including Shift, Substitution, Affine, Vigenère, Hill, Permutation, and Stream Ciphers, as well as how to cryptanalyze them. The third report focuses on the One-time Pad, entropy, perfect security, and cryptographic security, specifically unconditional security, as introduced by Claude Shannon in his work on information theory. The fourth report discusses block and stream ciphers, including substitution-permutation networks, attacks such as linear and differential cryptanalysis, and modes of operation. The fifth report discusses basic concepts of hash functions and message authentication, including iterated hash functions, the sponge construction, and unconditionally secure MACs. Using the information learned from the previous reports, current problems in hash functions were then researched. In conclusion, open problems in hash function security include collision resistance, preimage resistance, and resistance to length extension attacks. The project was sponsored by the NSF S-STEM Grant (DUE-2130267).
- Research Article
1
- 10.1051/itmconf/20182100011
- Jan 1, 2018
- ITM Web of Conferences
The following article presents the results on the impact of encryption algorithms and the cryptographic hash function on the QoS (Quality of Service) transmission in a computer network. A network model supporting data encryption using the AES algorithm and the MD5 and SHA hash functions used in VPN tunnels was designed and tested. The influence of different data length on the quality of transmission in a secured network was studied. The measurements and tests of networks were performed according to two methodologies ITU-T Y.1564 and RFC 2544. The impact of the data encryption mechanism on bandwidth, data loss and maximum delays was examined. The secured network tests were performed with different combinations of encryption algorithms and hash functions of the VPN tunnel in the ESP (Encapsulating Security Payload) transport mode.
- Research Article
7
- 10.31341/jios.41.2.9
- Dec 14, 2017
- Journal of information and organizational sciences
A cryptographic hash function is an important cryptographic tool in the field of information security. The design of the most widely used hash functions, such as MD5 and SHA-1, is based on iterating a compression function using the Merkle-Damgård construction with a constant initialization vector. The Merkle-Damgård construction showed that the security of a hash function depends on the security of its compression function. Several attacks on hash functions based on the Merkle-Damgård construction motivated researchers to propose different cryptographic constructions to enhance the security of hash functions against differential and generic attacks. The cryptographic community had been looking for replacements for these weak hash functions and proposed new hash functions based on different variants of the Merkle-Damgård construction. As a result of an open competition, NIST announced Keccak as the SHA-3 standard. This paper provides a review of cryptographic hash functions, their security requirements and different design methods for compression functions.
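The iteration this abstract describes, a compression function applied block by block starting from a constant initialization vector, can be sketched as follows. The compression function here is a truncated SHA-256 used purely as a placeholder, and the block size, 128-bit state, all-zero IV, and the names `md_pad`, `compress` and `md_hash` are assumptions for illustration.

```python
import hashlib

BLOCK = 64  # assumed block size in bytes


def md_pad(message: bytes) -> bytes:
    """Merkle-Damgård strengthening: 0x80, zero bytes, 64-bit message bit length."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK)
    return padded + (len(message) * 8).to_bytes(8, "big")


def compress(state: bytes, block: bytes) -> bytes:
    """Placeholder compression function: 128-bit state in, 128-bit state out."""
    return hashlib.sha256(state + block).digest()[:16]


def md_hash(message: bytes, iv: bytes = bytes(16)) -> bytes:
    """Chain the compression function over the padded message blocks."""
    state = iv
    padded = md_pad(message)
    for offset in range(0, len(padded), BLOCK):
        state = compress(state, padded[offset:offset + BLOCK])
    return state  # the final chaining value is the digest


print(md_hash(b"abc").hex())
```

Since the digest is simply the last chaining value, any weakness of `compress` carries over to the whole hash, which is the dependency the abstract refers to.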
- Research Article
1
- 10.5075/epfl-thesis-5333
- Jan 1, 2012
Cryptographic hash functions are used in many cryptographic applications, and the design of provably secure hash functions (relative to various security notions) is an active area of research. Most of the currently existing hash functions use the Merkle-Damgård paradigm, where by appropriate iteration the hash function inherits its collision and preimage resistance from the underlying compression function. Compression functions can either be constructed from scratch or be built using well-known cryptographic primitives such as a blockcipher. One classic type of primitive-based compression functions is single-block-length: it contains designs that have an output size matching the output length \(n\) of the underlying primitive. The single-block-length setting is well-understood. Yet even for the optimally secure constructions, the (time) complexity of collision- and preimage-finding attacks is at most \(2^{n/2}\), respectively \(2^{n}\); when \(n = 128\) (e.g., Advanced Encryption Standard) the resulting bounds have been deemed unacceptable for current practice. As a remedy, multi-block-length primitive-based compression functions, which output more than \(n\) bits, have been proposed. This output expansion is typically achieved by calling the primitive multiple times and then combining the resulting primitive outputs in some clever way. In this thesis, we study the collision and preimage resistance of certain types of multi-call multi-block-length primitive-based compression (and the corresponding Merkle-Damgård iterated hash) functions. Our contribution is three-fold. First, we provide a novel framework for blockcipher-based compression functions that compress \(3n\) bits to \(2n\) bits and that use two calls to a \(2n\)-bit key blockcipher with block-length \(n\). We restrict ourselves to two parallel calls and analyze the sufficient conditions to obtain close-to-optimal collision resistance, either in the compression function or in the Merkle-Damgård iteration.
Second, we present a new compression function \(h\colon \{0,1\}^{3n} \to \{0,1\}^{2n}\); it uses two parallel calls to an ideal primitive (public random function) from \(2n\) to \(n\) bits. This is similar to MDC-2 or the recently proposed MJH by Lee and Stam (CT-RSA'11). However, unlike these constructions, already in the compression function we achieve that an adversary limited (asymptotically in \(n\)) to \(O(2^{2n(1-\delta)/3})\) queries (for any \(\delta > 0\)) has a disappearing advantage to find collisions. This is the first construction of this type offering collision resistance beyond \(2^{n/2}\) queries. Our final contribution is the (re)analysis of the preimage and collision resistance of the Knudsen-Preneel compression functions in the setting of public random functions. Knudsen-Preneel compression functions utilize an \([r,k,d]\) linear error-correcting code over \(\mathbb{F}_{2^e}\) (for \(e > 1\)) to build a compression function from underlying blockciphers operating in the Davies-Meyer mode. Knudsen and Preneel show, in the complexity-theoretic setting, that finding collisions takes time at least \(2^{(d-1)n/2}\). Preimage resistance, however, is conjectured to be the square of the collision resistance. Our results show that both the collision resistance proof and the preimage resistance conjecture of Knudsen and Preneel are incorrect: with the exception of two of the proposed parameters, the Knudsen-Preneel compression functions do not achieve the security level they were designed for.
- Book Chapter
9
- 10.1007/978-3-540-71039-4_28
- Feb 10, 2008
In 1989–1990, two new hash functions were presented, Snefru and MD4. Snefru was soon broken by the newly introduced differential cryptanalysis, while MD4 remained unbroken for several more years. As a result, newer functions based on MD4, e.g., MD5 and SHA-1, became the de-facto and international standards. Following recent techniques of differential cryptanalysis for hash function, today we know that MD4 is even weaker than Snefru. In this paper we apply recent differential cryptanalysis techniques to Snefru, and devise new techniques that improve the attacks on Snefru further, including using generic attacks with differential cryptanalysis, and using virtual messages with second preimage attacks for finding preimages. Our results reduce the memory requirements of prior attacks to a negligible memory, and present a preimage of 2-pass Snefru. Finally, some observations on the padding schemes of Snefru and MD4 are discussed.
- Research Article
1
- 10.1049/iet-ifs.2012.0035
- Sep 1, 2013
- IET Information Security
In this study the authors propose a new multivariate hash function built on the HAsh Iterative FrAmework (HAIFA), which they call the hash function quadratic polynomials multiplying linear polynomials (QML). The new hash function is made of cubic polynomials which are the products of quadratic polynomials and linear polynomials. The authors design the quadratic-polynomial part of the compression function based on the centre map of the Matsumoto-Imai (MI) multivariate public key cryptosystem. The hash function QML keeps the three cryptographic properties and is immune to the pre-image attack, second pre-image attack, collision attack, differential attack and algebraic attack. The required memory storage is about 50% of that of a hash function built from cubic polynomials with random coefficients. On the avalanche effect, the authors find by experiment that about one half of the output bits differ when one input bit is changed randomly. The one-round diffusion of the hash function QML is twice that of Blake. The authors also simplify the matrices of the new hash function, analyse the rationality and show comparable data. Finally, the authors give advice on the parameters of the new hash function and summarise the paper.
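The avalanche experiment mentioned in this abstract (flip one random input bit, count how many output bits change) can be sketched as follows. QML itself is not available as a library, so SHA-256 from `hashlib` is used as a stand-in; the function name `avalanche_fraction`, the trial count and the fixed seed are assumptions.

```python
import hashlib
import random


def avalanche_fraction(message: bytes, trials: int = 200, seed: int = 0) -> float:
    """Average fraction of digest bits that change when one input bit is flipped."""
    if not message:
        raise ValueError("message must be non-empty")
    rng = random.Random(seed)
    base = int.from_bytes(hashlib.sha256(message).digest(), "big")

    changed_bits = 0
    for _ in range(trials):
        position = rng.randrange(len(message) * 8)  # pick a random input bit
        mutated = bytearray(message)
        mutated[position // 8] ^= 1 << (position % 8)  # flip that bit
        digest = int.from_bytes(hashlib.sha256(bytes(mutated)).digest(), "big")
        changed_bits += bin(base ^ digest).count("1")

    return changed_bits / (trials * 256)  # SHA-256 has 256 output bits


print(round(avalanche_fraction(b"hash function test message"), 3))
```

A value close to 0.5 is what the abstract means by "about one half of the output bits are different".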
- Research Article
4
- 10.1007/s10207-012-0156-7
- Feb 11, 2012
- International Journal of Information Security
In 2007, the US National Institute for Standards and Technology (NIST) announced a call for the design of a new cryptographic hash algorithm in response to vulnerabilities like differential attacks identified in existing hash functions, such as MD5 and SHA-1. NIST received many submissions, 51 of which got accepted to the first round. 14 candidates were left in the second round, out of which five candidates have been recently chosen for the final round. An important criterion in the selection process is the SHA-3 hash function security. We identify two important classes of security arguments for the new designs: (1) the possible reductions of the hash function security to the security of its underlying building blocks and (2) arguments against differential attack on building blocks. In this paper, we compare the state of the art provable security reductions for the second round candidates and review arguments and bounds against classes of differential attacks. We discuss all the SHA-3 candidates at a high functional level, analyze, and summarize the security reduction results and bounds against differential attacks. Additionally, we generalize the well-known proof of collision resistance preservation, such that all SHA-3 candidates with a suffix-free padding are covered.
- Book Chapter
61
- 10.1007/978-3-030-45724-2_9
- Jan 1, 2020
In this paper we spot light on dedicated quantum collision attacks on concrete hash functions, which has not received much attention so far. In the classical setting, the generic complexity to find collisions of an n-bit hash function is \(O(2^{n/2})\), thus classical collision attacks based on differential cryptanalysis such as rebound attacks build differential trails with probability higher than \(2^{-n/2}\). By the same analogy, generic quantum algorithms such as the BHT algorithm find collisions with complexity \(O(2^{n/3})\). With quantum algorithms, a pair of messages satisfying a differential trail with probability p can be generated with complexity \(p^{-1/2}\). Hence, in the quantum setting, some differential trails with probability up to \(2^{-2n/3}\) that cannot be exploited in the classical setting may be exploited to mount a collision attack in the quantum setting. In particular, the number of attacked rounds may increase. In this paper, we attack two international hash function standards: AES-MMO and Whirlpool. For AES-MMO, we present a 7-round differential trail with probability \(2^{-80}\) and use it to find collisions with a quantum version of the rebound attack, while only 6 rounds can be attacked in the classical setting. For Whirlpool, we mount a collision attack based on a 6-round differential trail from a classical rebound distinguisher with a complexity higher than the birthday bound. This improves the best classical attack on 5 rounds by 1. We also show that those trails are optimal in our approach. Our results have two important implications. First, there seems to exist a common belief that classically secure hash functions will remain secure against quantum adversaries. Indeed, several second-round candidates in the NIST post-quantum competition use existing hash functions, say SHA-3, as quantum secure ones. Our results disprove this common belief. 
Second, our observation suggests that differential trail search should not stop with probability \(2^{-n/2}\) but should consider up to \(2^{-2n/3}\). Hence it deserves to revisit the previous differential trail search activities.
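The thresholds quoted in this abstract follow from a short comparison of costs, ignoring constant factors and any other cost of a concrete attack. Classically, a differential trail with probability \(p\) needs about \(p^{-1}\) trials, so it beats the generic \(O(2^{n/2})\) bound only when

\[
p^{-1} < 2^{n/2} \iff p > 2^{-n/2}.
\]

In the quantum setting the abstract gives a cost of \(p^{-1/2}\) for the trail and \(O(2^{n/3})\) for the generic BHT algorithm, hence

\[
p^{-1/2} < 2^{n/3} \iff p > 2^{-2n/3}.
\]

As a rough check with the AES-MMO figures from the abstract, assuming \(n = 128\): the trail probability \(2^{-80}\) lies below the classical threshold \(2^{-64}\) but above the quantum threshold \(2^{-2 \cdot 128/3} \approx 2^{-85.3}\), and \(p^{-1/2} = 2^{40}\) is smaller than \(2^{128/3} \approx 2^{42.7}\).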
- Research Article
40
- 10.1109/access.2020.2989050
- Jan 1, 2020
- IEEE Access
MD5 is a one-way cryptographic function used in various fields for maintaining data integrity. The application of a hash function can provide much protection and privacy and subsequently reduce data usage. Most users are familiar with validating electronic documents based on a hash function, such as the MD5 algorithm and other hash functions, to demonstrate data integrity. The current MD5 algorithm has many weaknesses, mainly its failures against varying types of attacks, such as brute force attacks, rainbow table attacks, and Christmas attacks. Therefore, the method proposed in this paper enhances the MD5 algorithm by adding a dynamic variable length and a high efficiency that simulates the highest security available. Whereas the logistic system was used to encode ribonucleic acid (RNA) by generating a random matrix based on a new key that was created using the initial permutation (IP) tables used in the Data Encryption Standard (DES) with the linear-feedback shift register (LFSR), this work proposes several structures to improve the MD5 hash function. The experimental results demonstrate its high resistance to hackers while maintaining a suitable duration. This paper discusses the design of a confident hash algorithm. This algorithm has characteristics that enable it to succeed in the field of digital authentication and data integrity.