
Glossary

Blocking (record linkage)

A record linkage technique that partitions records into smaller subsets (blocks) and compares only records within the same block, which keeps the matching problem computationally tractable on large datasets.

Naive record linkage compares every record in dataset A against every record in dataset B. That is |A| × |B| comparisons, so two datasets of one million records each would need 10^12 comparisons, which is impractical at scale. Blocking is the structural fix: choose one or more "blocking keys" such that records in the same block are plausibly matchable, and skip cross-block comparisons entirely.
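To make the idea concrete, here is a minimal Python sketch of single-key blocking. The record layout (`id`, `date`, `amount`), the sample data, and the function names are assumptions made for illustration; they are not taken from any particular engine.

```python
from collections import defaultdict

def blocking_key(record):
    # Hypothetical key for illustration: exact date plus exact amount.
    return (record["date"], record["amount"])

def candidate_pairs(a_records, b_records, key=blocking_key):
    """Yield (a, b) pairs whose blocking keys are equal."""
    # Index dataset B by key so each record in A only looks inside its own block.
    blocks = defaultdict(list)
    for b in b_records:
        blocks[key(b)].append(b)
    for a in a_records:
        for b in blocks.get(key(a), []):
            yield a, b

ledger = [
    {"id": "A1", "date": "2024-03-01", "amount": 1250},
    {"id": "A2", "date": "2024-03-01", "amount": 990},
    {"id": "A3", "date": "2024-03-02", "amount": 1250},
]
bank = [
    {"id": "B1", "date": "2024-03-01", "amount": 1250},
    {"id": "B2", "date": "2024-03-02", "amount": 1250},
    {"id": "B3", "date": "2024-03-05", "amount": 400},
]

pairs = [(a["id"], b["id"]) for a, b in candidate_pairs(ledger, bank)]
naive_count = len(ledger) * len(bank)  # every A record against every B record

print(pairs)        # candidate pairs produced by blocking
print(naive_count)  # comparisons the naive approach would make
```

On this toy data, blocking yields two candidate pairs where the naive approach would make nine comparisons. Record A2 finds no candidate at all, which is the recall cost of a strict key.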

Block-key choice is the central design decision. A strict key (full date + amount) creates many small blocks with high precision, but it misses any true match in which either key field contains an error. A relaxed key (date prefix + amount rounded to the nearest hundred) creates fewer, larger blocks with higher recall, at the cost of more comparisons per block. Multi-pass blocking, which runs several blocking strategies and takes the union of their candidate sets, is a common compromise.
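The trade-off between strict and relaxed keys, and the multi-pass union, can be sketched as follows. This is an illustration under stated assumptions: the year-month prefix, the integer amounts, the sample records, and all function names are hypothetical choices, not a prescribed scheme.

```python
from collections import defaultdict

def strict_key(r):
    # Full date plus exact amount.
    return (r["date"], r["amount"])

def relaxed_key(r):
    # Assumed relaxation: year-month prefix plus amount rounded to the nearest hundred.
    # Integer arithmetic avoids the banker's rounding that round(x, -2) applies.
    return (r["date"][:7], (r["amount"] + 50) // 100 * 100)

def pairs_for_key(a_records, b_records, key):
    """Return the set of (a_id, b_id) pairs that share a block under `key`."""
    blocks = defaultdict(list)
    for b in b_records:
        blocks[key(b)].append(b["id"])
    return {(a["id"], b_id) for a in a_records for b_id in blocks.get(key(a), [])}

def multi_pass(a_records, b_records, keys):
    """Union the candidate sets produced by each blocking key."""
    candidates = set()
    for key in keys:
        candidates |= pairs_for_key(a_records, b_records, key)
    return candidates

ledger = [
    {"id": "A1", "date": "2024-03-01", "amount": 1250},
    {"id": "A2", "date": "2024-03-04", "amount": 980},
]
bank = [
    {"id": "B1", "date": "2024-03-01", "amount": 1250},  # exact counterpart of A1
    {"id": "B2", "date": "2024-03-05", "amount": 1010},  # A2 with date and amount drift
    {"id": "B3", "date": "2024-04-01", "amount": 1250},  # different month
    {"id": "B4", "date": "2024-03-20", "amount": 1270},  # same relaxed block as A1
]

strict = pairs_for_key(ledger, bank, strict_key)
relaxed = pairs_for_key(ledger, bank, relaxed_key)
union = multi_pass(ledger, bank, [strict_key, relaxed_key])
print(sorted(strict), sorted(relaxed), sorted(union))
```

Here the strict pass finds only the A1/B1 pair, while the relaxed pass also recovers A2/B2 and pulls in the extra A1/B4 comparison. Rounding still splits amounts that straddle a boundary (1249 and 1251 land in different blocks), which is one reason to combine several passes.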

Reconciliation engines often cascade blocking passes: an exact-key pass first, a relaxed-key pass second, and a locality-sensitive hashing (LSH) pass third. Each pass catches candidates the previous one missed; the union is then scored with the Fellegi-Sunter model or another probabilistic framework. This is the structure of ReconPe's ACRE blocking stage.
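As an illustration of the scoring step, the sketch below computes a basic Fellegi-Sunter match weight for candidate pairs that have already survived blocking. The field names and the m/u probabilities are made-up values for the example; in practice they are estimated from data, and this is not a description of how ACRE scores candidates.

```python
import math

# Hypothetical per-field parameters (illustrative values only):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELD_PARAMS = {
    "date": (0.9, 0.05),
    "amount": (0.95, 0.01),
    "reference": (0.8, 0.001),
}

def match_weight(a, b, params=FIELD_PARAMS):
    """Sum of log2 likelihood ratios over fields, using exact agreement."""
    total = 0.0
    for field, (m, u) in params.items():
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement is evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement is evidence against
    return total

a = {"date": "2024-03-01", "amount": 1250, "reference": "INV-1001"}
b_same = {"date": "2024-03-01", "amount": 1250, "reference": "INV-1001"}
b_partial = {"date": "2024-03-01", "amount": 1250, "reference": "INV-1O01"}
b_diff = {"date": "2024-03-09", "amount": 400, "reference": "INV-2040"}

w_same = match_weight(a, b_same)
w_partial = match_weight(a, b_partial)
w_diff = match_weight(a, b_diff)
print(w_same, w_partial, w_diff)
```

Pairs are then typically classified by comparing the weight against upper and lower thresholds, with the band in between sent for manual review. Exact field agreement is the simplest comparison; graded similarity levels are a common refinement.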
