When you download an operating system image, install a package, or receive a critical file, two questions matter: did it arrive complete and uncorrupted, and is it the same file the publisher actually released? A single tool answers both — the cryptographic hash. It condenses a file of any size into a short, fixed-length fingerprint that changes completely if even one bit of the file changes.
Hashing underpins file verification, password storage, digital signatures, malware tracking, and data deduplication. This guide explains what a hash is, the properties that make it useful, which algorithms to trust, and how to verify a file yourself with the File Inspector or the Checksum Calculator.
What a hash actually is
A hash function takes an input of any length — a word, a document, a 4 GB disk image — and produces a fixed-size output called the hash, digest, or checksum. SHA-256, for example, always outputs 256 bits, written as 64 hexadecimal characters, no matter whether you feed it one byte or a gigabyte.
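The fixed-size property is easy to check for yourself. The following is a minimal sketch using Python's standard `hashlib` module; the sample inputs are arbitrary and chosen only for illustration.

```python
import hashlib

# Inputs of very different sizes, chosen only for illustration.
inputs = [b"", b"a", b"hello world", b"x" * 1_000_000]

# Map each input length to the hex digest SHA-256 produces for it.
digests = {len(data): hashlib.sha256(data).hexdigest() for data in inputs}

for size, digest in digests.items():
    # Every digest is 64 hex characters (256 bits), whatever the input length.
    print(f"{size:>9} bytes -> {len(digest)} hex chars: {digest[:16]}...")
```

Whether the input is empty or a megabyte long, the digest length stays at 64 characters; only its contents change.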
The defining feature is sensitivity. Change a single character in a document and re-hash it, and the result is not slightly different — it is entirely different, with roughly half the output bits flipping. This is the avalanche effect, and it is what makes a hash a trustworthy fingerprint: identical hashes mean identical data, and any difference at all produces a visibly different value.
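To see the avalanche effect in numbers, the sketch below hashes two sentences that differ by one character and counts how many of the 256 output bits differ. The helper name `differing_bits` and the sample sentences are illustrative choices, not part of any tool mentioned here.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"  # one character changed

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(altered).digest()

changed = differing_bits(d1, d2)
print(f"{changed} of {len(d1) * 8} bits differ")
```

The count should land close to 128, which is half of 256, rather than the handful of bits that a one-character edit might suggest.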
The properties of a good cryptographic hash
Not every hash is suitable for security. A cryptographic hash function is designed to have four properties:
- Deterministic — the same input always yields the same output, on any machine, forever.
- Fast to compute — hashing a small file takes milliseconds, and even a multi-gigabyte image typically takes only seconds.
- One-way (pre-image resistant) — given a hash, it is computationally infeasible to find an input that produces it. You cannot “decrypt” a hash.
- Collision resistant — it is infeasible to find two different inputs that produce the same hash.
Collision resistance is the property that, when it fails, breaks an algorithm for security use — which is exactly what happened to MD5 and SHA-1.
The common algorithms
Several hash and checksum algorithms are in everyday use, with very different strengths:
| Algorithm | Output size | Status | Use for |
|---|---|---|---|
| CRC32 | 32-bit | Not cryptographic | Accidental-error detection only |
| MD5 | 128-bit | Broken | Legacy checksums, never security |
| SHA-1 | 160-bit | Broken | Legacy only, being retired |
| SHA-256 | 256-bit | Secure | The modern default |
| SHA-512 | 512-bit | Secure | Higher margin, 64-bit systems |
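The output sizes in the table can be confirmed with Python's standard library. This is a rough sketch: the input bytes are a placeholder, and the `usedforsecurity=False` flag (Python 3.9 or later) is passed only so that MD5 and SHA-1 remain available on builds that restrict them.

```python
import hashlib
import zlib

data = b"example release contents"  # placeholder input

# CRC32 comes from zlib and returns a 32-bit unsigned integer.
crc = zlib.crc32(data)
print(f"{'CRC32':<7} {32:>3} bits  {crc:08x}")

# The cryptographic hashes come from hashlib; digest_size is reported in bytes.
sizes = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data, usedforsecurity=False)
    sizes[name] = h.digest_size * 8
    print(f"{name.upper():<7} {sizes[name]:>3} bits  {h.hexdigest()}")
```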
CRC32 is a checksum, not a cryptographic hash. It is excellent at catching random transmission errors (and is built into ZIP and PNG for that purpose) but trivial to forge, so it must never be used to prove authenticity.
Verifying a download
The classic use of hashing is confirming a download. A publisher computes the SHA-256 of their release and posts it on their website. After downloading, you compute the hash of your copy and compare the two strings:
- They match — your file is byte-for-byte identical to the original. It downloaded completely and was not altered in transit.
- They differ — something is wrong. The download may have been truncated or corrupted, or, in the worst case, replaced with a malicious version.
This is why open-source projects and OS vendors publish checksums next to their downloads. It turns “I hope this is the real installer” into a one-line check anyone can perform — and it is the same mechanism behind the file hashes that antivirus and threat-intelligence services use to recognise known-good and known-bad files.
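The check described in this section can be sketched in a few lines of Python. The function names `sha256_of_file` and `matches_published` are hypothetical, and the temporary file containing `abc` stands in for a real download; its published value here is the standard SHA-256 test vector for that input.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def matches_published(path: Path, published_hex: str) -> bool:
    """Compare the computed digest with the publisher's value, ignoring case and whitespace."""
    return sha256_of_file(path) == published_hex.strip().lower()

# Demonstration with a stand-in file instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "release.bin"
    sample.write_bytes(b"abc")
    published = "BA7816BF8F01CFEA414140DE5DAE2223B00361A396177A9CB410FF61F20015AD\n"
    ok = matches_published(sample, published)
    tampered = matches_published(sample, "0" * 64)

print("match" if ok else "MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size, and normalising case and whitespace avoids false mismatches when a published value is pasted with a trailing newline or in upper case.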
Why MD5 and SHA-1 are no longer safe
For integrity against accidental corruption, MD5 is still perfectly fine — random damage will not coincidentally produce a matching digest. The problem is deliberate attack. Researchers demonstrated practical collisions: they could construct two different files with the same MD5 (and later the same SHA-1) hash on purpose. That destroys the security guarantee, because an attacker could prepare a harmless file and a malicious one sharing a digest, get the harmless one signed or approved, then swap in the malicious twin. SHA-256 has no known practical collisions, which is why it is the modern baseline for anything security-related.
Hashes, checksums, HMACs, and signatures
These related terms are easy to confuse:
- A checksum (like CRC32) detects accidental errors — cheap, not secure.
- A cryptographic hash (like SHA-256) detects accidental and deliberate changes — but anyone can recompute it, so on its own it only proves a file matches a value you already trust.
- An HMAC mixes a secret key into the hash, so only someone with the key can produce or verify it — proving the data came from a trusted party.
- A digital signature uses public-key cryptography to hash a file and sign the hash, proving both integrity and authorship to anyone, without a shared secret.
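The difference between a plain hash and an HMAC can be illustrated with Python's standard `hmac` module. This is a minimal sketch: the key, the message, and the helper names `make_tag` and `verify_tag` are made up for the example, and a real key would be random and kept secret.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared-secret-for-illustration-only"
message = b"contents of release-1.0.tar.gz"
tag = make_tag(key, message)

print(verify_tag(key, message, tag))             # same key, same message
print(verify_tag(key, message + b"!", tag))      # message altered
print(verify_tag(b"some-other-key", message, tag))  # wrong key
```

Anyone can recompute a SHA-256 digest, but only a holder of the key can produce a tag that verifies, which is what ties the data to a trusted party.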
A note on password hashing
Storing passwords is a special case where fast general-purpose hashes are exactly the wrong choice. Because SHA-256 is so fast, an attacker who steals a database can try billions of guesses per second. Password storage instead uses deliberately slow, salted algorithms — bcrypt, scrypt, or Argon2 — that add a unique random salt per password and are tuned to be expensive to compute. The File Inspector’s hash identifier recognises these formats when it finds them in a file, which is a strong signal you are looking at credential storage.
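As a rough sketch of salted, deliberately slow password hashing, the example below uses scrypt from Python's standard `hashlib` (available when Python is built against a recent OpenSSL). The cost parameters and the helper names `hash_password` and `check_password` are illustrative assumptions, not tuning advice for any particular system.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; real deployments tune these to their hardware.
N, R, P = 2**14, 8, 1

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using a fresh random salt for each password."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
    return salt, key

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
    return hmac.compare_digest(candidate, expected)

salt1, key1 = hash_password("correct horse battery staple")
salt2, key2 = hash_password("correct horse battery staple")

print(check_password("correct horse battery staple", salt1, key1))
print(check_password("wrong guess", salt1, key1))
print(key1 == key2)  # same password, different salts
```

Because each password gets its own salt, two users with the same password end up with different stored values, and every guess costs the attacker a full scrypt computation.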
Verify a file yourself
Drop any file into the File Inspector and open the Hashes tab to see its SHA-256, SHA-1, and SHA-512 computed instantly in your browser via the Web Crypto API — nothing is uploaded. For a focused, side-by-side comparison against a published value, use the Checksum Calculator. Make the comparison a habit for anything you download and run, and you close one of the easiest doors an attacker has.
Frequently asked questions
What is a file hash?
A fixed-length fingerprint computed from a file’s contents. The same file always produces the same hash, and changing even one bit produces a completely different hash, so in practice it acts as a unique identifier for that exact data.
Is hashing the same as encryption?
No. Encryption is reversible with a key; hashing is one-way with no key and no way back. A hash proves integrity and identity, but it cannot be decoded to recover the original file.
Why are MD5 and SHA-1 considered broken?
Researchers can deliberately construct two different files with the same MD5 or SHA-1 hash (a collision), so those algorithms can no longer prove a file is authentic against a determined attacker. Use SHA-256 or stronger for security.
How do I verify a download with a checksum?
Compute the file’s SHA-256 and compare it character-for-character to the value the publisher posted. If they match, the file is byte-for-byte identical to the original; if not, it was corrupted or tampered with.