🎁 Early Access Open: Get 1 Month of Growth Tier Free when you sign up

Learn
EP02·matchingmarketplaceflipkartmanga

Why VLOOKUP breaks at 2,000 orders a month

ByAmit Mishra·Founder, ReconPe·Narrated by Riya Bhattacharya, CA

A 3am WhatsApp from a Flipkart seller: 'the file crashed again.' Riya opens it — 4,800 rows, OrderID duplicated six times per order. Direct matching dies at this scale. Set matching is the answer.

1. SETUPPanel 1 of 3
Riya at her desk late at night under warm desk-lamp light, eyes narrowed on a laptop screen displaying a Flipkart settlement CSV with 4,800 rows. The OrderID OD123456789 repeats across multiple rows with different transaction types (Order, Fee, Return, COD, TDS). A WhatsApp notification at top-right reads: 3:00 AM — Flipkart Seller (Client) — 'the file crashed again'. A sticky note on the right reminds: VLOOKUP assumes one match per key. Marketplace settlements never do.

One order. Many rows. VLOOKUP returns one.

Transcript
Dialogue — Riya

Of course it did. Look at this.

Side monologue

VLOOKUP assumes one match per key. Marketplace settlements never do.

2. REINFORCEPanel 2 of 3
Riya pointing at the laptop screen. The screen now breaks the OrderID OD123456789 into its six settlement rows reflecting Flipkart's four-fee structure: Order Capture ₹1,499, Commission Fee -₹150, Fixed Fee -₹40, Collection Fee -₹20, Shipping Fee -₹60, COD Remittance ₹1,229 at T+8. A math strip below the table verifies: 1499 - 150 - 40 - 20 - 60 = ₹1,229. A side caption explains Flipkart's four-fee audit axes and links to /flipkart-reconciliation/.

The six rows are not duplicates. They're the truth.

Transcript
Dialogue — Riya

Each row is a different economic event. VLOOKUP picked the first. The other five became invisible.

Side monologue

Direct matching dies the moment cardinality stops being 1:1. That happens at every marketplace.

3. TURNAROUNDPanel 3 of 3
Riya with a slight smile of resolution. The laptop screen shows a clean set-matching diagram: SOURCE (Flipkart settlement, 6 source rows) flows into SETTLEMENT TOTAL (1 sum) which then matches BANK CREDIT (1 match). The six settlement rows group together and reconcile to a single ₹1,229 bank credit, with a MATCHED checkmark and HIGH confidence score. A side caption reads: This is what subset-sum matching does. It's not a feature — it's the only honest way to reconcile a marketplace.

What to do instead.

Transcript
Dialogue — Riya

Set matching. Not row matching. The bank credit matches the sum of the six rows, not row 1.

Side monologue

This is what subset-sum matching does. It's not a feature — it's the only honest way to reconcile a marketplace.

The longer take

Almost every serious reconciliation in the first month of a finance team's existence starts with VLOOKUP on order ID. It works. It keeps working until order volume reaches somewhere between 500 and 2,000 per month. And then, quietly, across a dozen small failure modes, it stops working. The failures are not obvious because VLOOKUP still returns values — it just returns the wrong ones, or misses rows that should match.

The Flipkart case in the panels is the canonical example. A single order generates between two and five settlement rows: the initial capture, a commission deduction, a fixed-fee deduction, a shipping-fee deduction, and a COD remittance line (which arrives 8 days later than the order). VLOOKUP on OrderID picks the first row and silently drops the rest. The seller sees a 'matched' file and assumes the reconciliation is done. The actual reconciliation — proving that the bank credit equals the sum of the order capture minus all fees plus any returns — hasn't started.

The mathematical shape of the problem is that the source-to-target cardinality stopped being 1:1. In the simple case (one order, one bank credit, one ledger entry), every direct-matching tool works. In the real case — split settlements, aggregated payouts, fee breakdowns, return cycles, COD remittance windows — the relationship between rows is many-to-many. Tools that assume 1:1 produce silent errors and a high-confidence wrong answer, which is worse than no answer.

The architectural step that addresses this is set matching, often implemented as subset-sum search. Sum candidate source rows for a given counterparty + date window, find the subset whose sum matches the target bank credit within a configurable tolerance (TDS withholding, GST adjustments, rounding), and return the subset as the match with a confidence score. This is computationally bounded in practice because the counterparty + date partitioning prunes the search space — you don't search across all 4,800 rows, you search across the 12 rows for that seller in that week.

The practical signal that a finance team has outgrown direct matching is usually one of four symptoms. The first is review time: if a single reconciliation cycle takes more than four hours of analyst time for a team doing under 10,000 monthly orders, the tool is failing. The second is exception-rate drift: if exceptions as a percentage of rows have been creeping up over successive cycles, the tool isn't adapting. The third is variance leakage: if the monthly 'round-off adjustment' journal has been growing, the tool is silently dropping matches and the team is compensating manually. The fourth is audit friction: external auditors asking for field-level rationale and the tool producing only a binary matched/unmatched flag.

New episode every Monday.

Riya teaches reconciliation in three panels a week. Free, ungated, no newsletter sign-up gate.

See all episodes