EP02·4 May 2026matchingmarketplaceflipkartmanga

Why VLOOKUP breaks at 2,000 orders a month

ByAmit Mishra·Founder, ReconPe·Narrated by Riya Bhattacharya, CA

A 3am WhatsApp from a Flipkart seller: 'the file crashed again.' Riya opens it — 4,800 rows, OrderID duplicated six times per order. Direct matching dies at this scale. Set matching is the answer.

Riya at her desk late at night under warm desk-lamp light, eyes narrowed on a laptop screen displaying a Flipkart settlement CSV with 4,800 rows. The OrderID OD123456789 repeats across multiple rows with different transaction types (Order, Fee, Return, COD, TDS). A WhatsApp notification at top-right reads: 3:00 AM — Flipkart Seller (Client) — 'the file crashed again'. A sticky note on the right reminds: VLOOKUP assumes one match per key. Marketplace settlements never do. — “One order. Many rows. VLOOKUP returns one.”
Transcript›
Dialogue — Riya
Of course it did. Look at this.
Side monologue
VLOOKUP assumes one match per key. Marketplace settlements never do.

Riya pointing at the laptop screen. The screen now breaks the OrderID OD123456789 into its six settlement rows reflecting Flipkart's four-fee structure: Order Capture ₹1,499, Commission Fee -₹150, Fixed Fee -₹40, Collection Fee -₹20, Shipping Fee -₹60, COD Remittance ₹1,229 at T+8. A math strip below the table verifies: 1499 - 150 - 40 - 20 - 60 = ₹1,229. A side caption explains Flipkart's four-fee audit axes and links to /flipkart-reconciliation/. — “The six rows are not duplicates. They're the truth.”
Transcript›
Dialogue — Riya
Each row is a different economic event. VLOOKUP picked the first. The other five became invisible.
Side monologue
Direct matching dies the moment cardinality stops being 1:1. That happens at every marketplace.

Riya with a slight smile of resolution. The laptop screen shows a clean set-matching diagram: SOURCE (Flipkart settlement, 6 source rows) flows into SETTLEMENT TOTAL (1 sum) which then matches BANK CREDIT (1 match). The six settlement rows group together and reconcile to a single ₹1,229 bank credit, with a MATCHED checkmark and HIGH confidence score. A side caption reads: This is what subset-sum matching does. It's not a feature — it's the only honest way to reconcile a marketplace. — “What to do instead.”
Transcript›
Dialogue — Riya
Set matching. Not row matching. The bank credit matches the sum of the six rows, not row 1.
Side monologue
This is what subset-sum matching does. It's not a feature — it's the only honest way to reconcile a marketplace.

The longer take

Almost every serious reconciliation in the first month of a finance team's existence starts with VLOOKUP on order ID. It works. It keeps working until order volume reaches somewhere between 500 and 2,000 per month. And then, quietly, across a dozen small failure modes, it stops working. The failures are not obvious because VLOOKUP still returns values — it just returns the wrong ones, or misses rows that should match.

The Flipkart case in the panels is the canonical example. A single order generates between two and five settlement rows: the initial capture, a commission deduction, a fixed-fee deduction, a shipping-fee deduction, and a COD remittance line (which arrives 8 days later than the order). VLOOKUP on OrderID picks the first row and silently drops the rest. The seller sees a 'matched' file and assumes the reconciliation is done. The actual reconciliation — proving that the bank credit equals the sum of the order capture minus all fees plus any returns — hasn't started.

The mathematical shape of the problem is that the source-to-target cardinality stopped being 1:1. In the simple case (one order, one bank credit, one ledger entry), every direct-matching tool works. In the real case — split settlements, aggregated payouts, fee breakdowns, return cycles, COD remittance windows — the relationship between rows is many-to-many. Tools that assume 1:1 produce silent errors and a high-confidence wrong answer, which is worse than no answer.

The architectural step that addresses this is set matching, often implemented as subset-sum search. Sum candidate source rows for a given counterparty + date window, find the subset whose sum matches the target bank credit within a configurable tolerance (TDS withholding, GST adjustments, rounding), and return the subset as the match with a confidence score. This is computationally bounded in practice because the counterparty + date partitioning prunes the search space — you don't search across all 4,800 rows, you search across the 12 rows for that seller in that week.

The practical signal that a finance team has outgrown direct matching is usually one of four symptoms. The first is review time: if a single reconciliation cycle takes more than four hours of analyst time for a team doing under 10,000 monthly orders, the tool is failing. The second is exception-rate drift: if exceptions as a percentage of rows have been creeping up over successive cycles, the tool isn't adapting. The third is variance leakage: if the monthly 'round-off adjustment' journal has been growing, the tool is silently dropping matches and the team is compensating manually. The fourth is audit friction: external auditors asking for field-level rationale and the tool producing only a binary matched/unmatched flag.

Sub-ledger to GL, in seconds

Software does matching. People do judgement.

Marketplaces

Compare

Learn

Why VLOOKUP breaks at 2,000 orders a month

The longer take

Continue the series

Subset-sum: matching ₹25 lakh to twelve invoices

What a settlement actually settles

What is reconciliation, really?

New episode every Monday.