FuzzPrint - Similarity & Import Fingerprints

A cryptographic hash answers one question: are these two files byte-for-byte identical? It cannot tell you that two files are almost the same - a recompiled build, a patched binary, a packed variant, or a trojanized copy of a known tool. FuzzPrint adds similarity fingerprints to every binary HashWatch tracks, so you can measure how close an unknown file is to a known-good one.

Status: rolling out. Fingerprint fields populate as software is re-ingested; not every record carries every fingerprint yet.


What it captures

FuzzPrint records up to three fingerprints per binary, computed in-stream during ingestion - the file is never written to disk or retained (the same stream-and-discard guarantee as the primary hashes):

Field | Algorithm | Good for
hash_ssdeep | ssdeep (context-triggered piecewise hashing) | Detecting near-duplicate files and small modifications; comparing two ssdeep hashes gives a match score from 0 to 100, where higher = more alike.
hash_tlsh | TLSH (locality-sensitive hash) | Robust similarity at scale; a distance score where smaller = more alike.
imphash | Import hash (PE import table) | Clustering Windows executables that share the same imported API surface - a classic malware-family pivot.

imphash is present only for PE files (EXE, DLL, and similar); hash_ssdeep / hash_tlsh apply to any binary. All three are nullable: a field is null when the fingerprint does not apply to the format or has not yet been computed for the record.
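To make the imphash row concrete, here is a minimal sketch of the widely used import-hash convention: lowercase each imported library and function name, strip a .dll/.ocx/.sys extension from the library, join the "library.function" pairs with commas in import-table order, and take the MD5. Whether HashWatch computes imphash in exactly this way is an assumption. The function name imphash_from_imports is hypothetical, and ordinal-only imports are left out for brevity.

```python
import hashlib

# Library extensions stripped in the common imphash convention.
_LIB_EXTENSIONS = ("dll", "ocx", "sys")


def imphash_from_imports(imports):
    """Compute an imphash-style MD5 from (library, function) name pairs.

    `imports` must be in import-table order: the hash is order-sensitive.
    Simplification (assumption): ordinal-only imports are not handled.
    """
    parts = []
    for library, function in imports:
        lib = library.lower()
        stem, dot, ext = lib.rpartition(".")
        if dot and ext in _LIB_EXTENSIONS:
            lib = stem  # "kernel32.dll" -> "kernel32"
        parts.append(f"{lib}.{function.lower()}")
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()


# Illustrative import list, not taken from a real binary.
example = [("KERNEL32.dll", "CreateFileW"), ("USER32.dll", "MessageBoxW")]
print(imphash_from_imports(example))
```

Because only the names and their order feed the hash, two executables with different code but the same import table share an imphash, which is what makes it useful as a clustering key.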


Where it appears

Public dashboard

Each tracked binary exposes its fingerprints next to its exact hashes (copy-to-clipboard, same as SHA-256).

API

The fingerprint fields are returned on hash records:

{
  "executable_name": "chrome-win64.msi",
  "hash_sha256": "…",
  "hash_ssdeep": "3072:Ab2…:9Qr…",
  "hash_tlsh": "T1A2B3…",
  "imphash": "f34d5e…"
}
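Since every fingerprint field is nullable, client code should not assume all three are present. The sketch below filters a record down to the fingerprints it actually carries. The helper name available_fingerprints and the sample values are hypothetical placeholders, not real fingerprints or part of any HashWatch client.

```python
import json

FINGERPRINT_FIELDS = ("hash_ssdeep", "hash_tlsh", "imphash")


def available_fingerprints(record):
    """Return only the fingerprint fields that are present and non-empty."""
    return {
        field: record[field]
        for field in FINGERPRINT_FIELDS
        if record.get(field)  # skips missing keys, null, and ""
    }


# Placeholder payload shaped like the record above; values are stand-ins.
sample = json.loads(
    '{"executable_name": "example.bin", "hash_sha256": "<sha256>",'
    ' "hash_ssdeep": "<ssdeep>", "hash_tlsh": "<tlsh>", "imphash": null}'
)
print(available_fingerprints(sample))
```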

Similarity matching, where you submit your own fingerprint and get the closest known-good records back, is a Teams-tier intel capability. The fingerprints on the records themselves are public, like the SigDiff signer fields.


How to use it

  • Triage an unknown binary. Compute its ssdeep/TLSH and compare against the known-good FuzzPrint for that software. High similarity (a high ssdeep score or a low TLSH distance) with a different SHA-256 means “almost the genuine file” - exactly the shape of a tampered or trojanized build.
  • Cluster by import surface. Pivot on imphash in your EDR/SIEM to group executables that import the same APIs - a fast way to find relatives of a sample.
  • Pair with the exact hash. An exact match proves identity; a near match flags a relative worth a closer look.
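The workflow above can be sketched as two small helpers: one that classifies an unknown file against a single known-good record, and one that groups records by imphash. This is a sketch under assumptions. The cut-off values are hypothetical and need tuning against your own data, the function names are illustrative, and the ssdeep score and TLSH distance are expected to come from whatever ssdeep/TLSH tooling you already use.

```python
from collections import defaultdict

# Hypothetical cut-offs (assumptions, not HashWatch defaults).
SSDEEP_MIN_SCORE = 80    # ssdeep match score: 0-100, higher = more alike
TLSH_MAX_DISTANCE = 50   # TLSH distance: 0 upward, lower = more alike


def triage(sha256_matches, ssdeep_score=None, tlsh_distance=None):
    """Classify an unknown file against one known-good record."""
    if sha256_matches:
        return "identical"  # an exact match proves identity
    near_by_ssdeep = ssdeep_score is not None and ssdeep_score >= SSDEEP_MIN_SCORE
    near_by_tlsh = tlsh_distance is not None and tlsh_distance <= TLSH_MAX_DISTANCE
    if near_by_ssdeep or near_by_tlsh:
        return "near-match"  # a relative worth a closer look
    return "no-match"


def cluster_by_imphash(records):
    """Group executable names by imphash, skipping records without one."""
    clusters = defaultdict(list)
    for record in records:
        imphash = record.get("imphash")
        if imphash:
            clusters[imphash].append(record["executable_name"])
    return dict(clusters)
```

Treating either signal as sufficient keeps the check usable on records that carry only one of the two similarity fingerprints; requiring both would be stricter but would skip partially populated records.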

FuzzPrint measures resemblance; SigDiff tells you who signed it. A file that resembles a known tool but is signed by a different identity is a strong tampering signal.