🚀 Launching Soon: Get 1 Month of Growth Tier Free by signing up for Early Access

Glossary

Levenshtein distance

A string-similarity metric that counts the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into another.

Levenshtein distance, named for Vladimir Levenshtein who introduced it in 1965, measures the edit distance between two strings. The string 'kitten' and 'sitting' have a Levenshtein distance of 3: substitute k→s, substitute e→i, insert g.

In reconciliation, Levenshtein distance is the standard primitive for fuzzy matching of identifiers: order IDs that may have a missing leading zero, vendor names that may differ in spacing or casing, UTRs that may be truncated. A typical use is to compute distance between candidate field values and accept the match if the distance is below a threshold proportional to string length.

Naive Levenshtein is O(n*m) in time and space, but the standard two-row dynamic-programming optimisation reduces space to O(min(n,m)). For very large reference sets, Levenshtein is usually combined with LSH or a trie-based prefilter to avoid computing distance against every candidate.

Put this into practice

See how ReconPe handles levenshtein distance on your real settlement data. Free tier, no card required.

Start reconciling