Why VLOOKUP breaks at 2,000 orders a month
A 3am WhatsApp from a Flipkart seller: 'the file crashed again.' Riya opens it — 4,800 rows, OrderID duplicated six times per order. Direct matching dies at this scale. Set matching is the answer.

“One order. Many rows. VLOOKUP returns one.”
Transcript›
Of course it did. Look at this.
VLOOKUP assumes one match per key. Marketplace settlements never do.

“The six rows are not duplicates. They're the truth.”
Transcript›
Each row is a different economic event. VLOOKUP picked the first. The other five became invisible.
Direct matching dies the moment cardinality stops being 1:1. That happens at every marketplace.

“What to do instead.”
Transcript›
Set matching. Not row matching. The bank credit matches the sum of the six rows, not row 1.
This is what subset-sum matching does. It's not a feature — it's the only honest way to reconcile a marketplace.
The longer take
Almost every serious reconciliation in the first month of a finance team's existence starts with VLOOKUP on order ID. It works. It keeps working until order volume reaches somewhere between 500 and 2,000 per month. And then, quietly, across a dozen small failure modes, it stops working. The failures are not obvious because VLOOKUP still returns values — it just returns the wrong ones, or misses rows that should match.
The Flipkart case in the panels is the canonical example. A single order generates between two and five settlement rows: the initial capture, a commission deduction, a fixed-fee deduction, a shipping-fee deduction, and a COD remittance line (which arrives 8 days later than the order). VLOOKUP on OrderID picks the first row and silently drops the rest. The seller sees a 'matched' file and assumes the reconciliation is done. The actual reconciliation — proving that the bank credit equals the sum of the order capture minus all fees plus any returns — hasn't started.
The mathematical shape of the problem is that the source-to-target cardinality stopped being 1:1. In the simple case (one order, one bank credit, one ledger entry), every direct-matching tool works. In the real case — split settlements, aggregated payouts, fee breakdowns, return cycles, COD remittance windows — the relationship between rows is many-to-many. Tools that assume 1:1 produce silent errors and a high-confidence wrong answer, which is worse than no answer.
The architectural step that addresses this is set matching, often implemented as subset-sum search. Sum candidate source rows for a given counterparty + date window, find the subset whose sum matches the target bank credit within a configurable tolerance (TDS withholding, GST adjustments, rounding), and return the subset as the match with a confidence score. This is computationally bounded in practice because the counterparty + date partitioning prunes the search space — you don't search across all 4,800 rows, you search across the 12 rows for that seller in that week.
The practical signal that a finance team has outgrown direct matching is usually one of four symptoms. The first is review time: if a single reconciliation cycle takes more than four hours of analyst time for a team doing under 10,000 monthly orders, the tool is failing. The second is exception-rate drift: if exceptions as a percentage of rows have been creeping up over successive cycles, the tool isn't adapting. The third is variance leakage: if the monthly 'round-off adjustment' journal has been growing, the tool is silently dropping matches and the team is compensating manually. The fourth is audit friction: external auditors asking for field-level rationale and the tool producing only a binary matched/unmatched flag.