Cryptographic Hash
A cryptographic hash algorithm produces a fixed-length string of bits from an input plaintext that can be of any length.
- aka keyless cryptography or hash function
- produces a digital fingerprint as output
- output value is called a hash or checksum or digest or tag
- messages encrypted this way are called message digests
- doesn’t use a key
- properties of a hash function:
- different inputs are unlikely to produce the same output (collision)
- one bit change in input produces a completely different hash
- a one or more bit change in input changes the output in about half of its bits
- it is impossible to recover the plaintext data from the digest (one-way)
- different inputs are unlikely to produce the same output (collision)
How It Works
- hashes can be thought of as fingerprints because they’re unique identifiers of a message
- hashes of similar messages look completely different
- can’t be used used to discover the contents of the original message
- but can use to see of the original message has been changed
- send the hash with a file or message, then receiver can use the same hash function to see if the hashes are the same
Collision is a matching hash for two different sets of data.
- theoretically possible to engineer this, but difficult and usually only happens when using a broken algorithm
- Message-Digest algorithm 5 (MD5) and SHA-1 have been attacked in this fashion
- when a collision occurs, you generally stop using the algorithm

Properties of Hash Functions
- should have the following properties so that security or performance is not compromised:
- Determinism
- the same input always results in the same output
- Efficiency
- computation of the hash value should be low
- Security
- should be practically irreversible (i.e., one way) in order to be attack resistant
- Determinism
Preimage Resistance
Preimage resistance property means given a hash value
- aka one-way property
- finding the original message from a message digest is computationally infeasible
- protects against an attacker who has a hash and wants to find the associated data
Second preimage resistance property means given an input
- aka weak collision resistance
- finding a second input that has the same digest as any other specified input is computationally infeasible
- i.e. it is not feasible to modify a message without changing its hash value
- protects against an attacker who has an input value and its hash and wants to substitute a forged value to the original input value
- preserves data integrity
Collision Resistance
Collision resistance property means it should be computationally infeasible to find any two inputs that result in the same hash.
- aka strong collision resistance
- if a function is collision resistant, it is also second preimage resistant
- prevents creating two distinct data with the same hash
Avalanche Effect
Avalanche effect property means that a small change to a message should result in a digest that is significantly different and uncorrelated with the digest of the original message.
Uses
- to prove integrity
- for password storage and authentication solutions
- anonymization of user (subject) IDs and credentials
- Error detection and integrity checking for files, messages, or data blocks
- Efficient table or database lookup
- compressing or collapsing a large set of possible values into a smaller but unique set of lookup key values makes the search space much smaller
- Secure message digests or hashes
- digital signatures
- message authentication codes (MAC)
Integrity Verification Using a Hash Function
Hash Algorithms
- Secure Hash Algorithm (SHA)
- Message Digest Algorithm 5 (MD5)
- SHA-2
- SHA-3
- MD2
- MD4
- RACE
