Simple Invoice Processing

AP invoice OCR + GSTIN verification + tax-math checks — invoice processing automation in under 30 seconds per document.

Live demo: drop your own invoices and see Cadel extract every field and validate the math in seconds.

The Problem

Manual invoice processing breaks at scale. A controller at a mid-market company processes invoices from three to five distinct vendor categories every cycle, and each one carries its own failure mode that pure-OCR or template-based tools miss.

Three to five vendor categories per cycle

Domestic GST-registered suppliers with multi-rate line items, inter-state vendors needing IGST vs CGST+SGST routing, and international vendors whose EUR or USD invoices carry no GSTIN and no Indian tax fields at all.

GSTIN state-code cross-check, every invoice

The AP clerk reads the vendor GSTIN, extracts the state code (first 2 digits), and compares it against the declared place of supply to determine the correct tax type — line by line, manually.

Tax math on every line

Verify line_amount × tax_rate ÷ 100 per line, then aggregate. A single Acme Corp bill with five line items at four rates (18%, 12%, 5%, 0%) means five multiplications plus a summation, and every step is a chance for a keying error that lands in GSTR-2B two months later.

International invoices break the ruleset

A EUR-denominated invoice from a US vendor to a German buyer carries no GSTIN, no place of supply, and no tax-rate fields. A GST-only validation either crashes on null fields or surfaces false-positive errors an AP clerk must clear one by one.

80–100

The scale-break point for most mid-market finance teams. Beyond ~100 invoices per month per AP resource, the error rate on manual invoice data extraction, math verification and GSTIN state-code checks rises faster than headcount can compensate — and the downstream cost (ITC mismatches, duplicate payments, vendor disputes) becomes material.

Why It Matters: Regulatory Context

Every tax invoice from a registered Indian supplier sits inside four overlapping CGST / IGST rules. Booking the wrong tax type or missing a required field creates a reconciliation mismatch in GSTR-2B that the AP team must fix before filing GSTR-3B — or face a demand notice.

Rule 46 · CGST Rules 2017

Required invoice fields

Every tax invoice must carry supplier GSTIN, recipient GSTIN, place of supply, HSN/SAC code, and a separate tax amount per rate slab. Missing any field disqualifies the invoice from claiming Input Tax Credit.

Section 5 · IGST Act 2017

Inter-state vs intra-state tax routing

If supplier state code matches place of supply state code → intra-state → CGST + SGST. If they differ → inter-state → IGST. Booking the wrong type triggers a GSTR-2B reconciliation mismatch the AP team must correct before filing GSTR-3B.

Section 16(2)(aa) · CGST Act 2017

ITC capped at supplier’s filed return

Input Tax Credit can only be claimed up to the amount the supplier has reflected in their GSTR-1. An over-claimed or mis-booked ITC must be reversed — often with interest under Section 50.

Rule 37A · CGST Rules 2017

GSTR-2B discrepancy resolution

Any auto-populated discrepancy between the buyer’s claimed credit and the supplier’s filed GSTR-1 must be addressed within the same return period — otherwise it surfaces as a demand notice under Section 73 or 74.

For controllers managing 150 vendor invoices a month across domestic and international suppliers — PDFs with embedded text, scanned images with embossed seals, German Rechnung layouts under the EU VAT Directive 2006/112/EC — reconciling formats manually means the data that enters the ERP is only as accurate as the clerk’s reading of the source document.

What This Workflow Automates

Eight deterministic steps that turn AP invoice OCR into structured, validated, ERP-ready data. The full sequence runs in under 30 seconds per document and produces a structured JSON output with a discrete validation_results array that an AP controller can audit line by line. It is the same pattern mid-market finance teams use to feed downstream three-way match.

01

Document ingestion & format detection

Accepts PDF invoices — digitally generated or scanned — and identifies the document type as Invoice, routing it to the right extraction schema whether the source is a GST-registered domestic supplier or a foreign vendor with no Indian tax fields.

02

Structured header extraction

For each invoice, extracts invoice_number, invoice_date, vendor_name, vendor_gstin, vendor_address, customer_name, customer_gstin, customer_address, place_of_supply, currency, subtotal, tax_amount and total_amount.

03

Line item extraction

For every line on the invoice, line item extraction captures item_code (HSN / SAC where present), description, quantity, unit_price, line_amount, tax_rate and tax_amount as a structured array — preserving mixed-rate line items as distinct records.
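The header and line-item fields from steps 02 and 03 can be pictured as a pair of typed records. This is an illustrative sketch only: the class names `Invoice` and `LineItem`, the use of `Decimal` for amounts, and `None` as the marker for an absent value are assumptions, while the field names are the ones listed above.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    """One invoice line; None means the value was not found on the document."""
    item_code: Optional[str] = None       # HSN / SAC where present
    description: Optional[str] = None
    quantity: Optional[Decimal] = None
    unit_price: Optional[Decimal] = None
    line_amount: Optional[Decimal] = None
    tax_rate: Optional[Decimal] = None    # percent, e.g. Decimal("18")
    tax_amount: Optional[Decimal] = None


@dataclass
class Invoice:
    """Header fields plus the line-item array."""
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None
    vendor_name: Optional[str] = None
    vendor_gstin: Optional[str] = None
    vendor_address: Optional[str] = None
    customer_name: Optional[str] = None
    customer_gstin: Optional[str] = None
    customer_address: Optional[str] = None
    place_of_supply: Optional[str] = None
    currency: Optional[str] = None
    subtotal: Optional[Decimal] = None
    tax_amount: Optional[Decimal] = None
    total_amount: Optional[Decimal] = None
    line_items: List[LineItem] = field(default_factory=list)


# Hypothetical two-rate invoice: each rate stays on its own record.
invoice = Invoice(
    invoice_number="INV-001",
    currency="INR",
    line_items=[
        LineItem(line_amount=Decimal("1000"), tax_rate=Decimal("18")),
        LineItem(line_amount=Decimal("500"), tax_rate=Decimal("0")),
    ],
)
```

Keeping `tax_rate=Decimal("0")` distinct from `tax_rate=None` is what lets a zero-rated line be told apart from a line whose rate was never printed.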

04

Line-level tax math check

For each line with a non-null tax_rate and line_amount, computes the expected tax as line_amount × tax_rate ÷ 100 and compares it to the extracted tax_amount. Any discrepancy raises a line-level validation exception.
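A minimal sketch of this line-level check, under stated assumptions: the function name `check_line_tax`, the result dictionary shape, rounding to two decimals with half-up, and skipping the check when `tax_amount` itself is missing are all illustrative choices, not confirmed behaviour of the workflow.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def check_line_tax(
    line_amount: Optional[Decimal],
    tax_rate: Optional[Decimal],
    tax_amount: Optional[Decimal],
) -> dict:
    """Compare the extracted tax_amount with line_amount × tax_rate ÷ 100."""
    if line_amount is None or tax_rate is None or tax_amount is None:
        # Assumption: a missing input skips the check instead of failing it.
        return {"check": "line_tax_math", "status": "skipped", "expected": None}
    expected = (line_amount * tax_rate / Decimal(100)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP  # assumed rounding rule
    )
    status = "pass" if tax_amount == expected else "fail"
    return {"check": "line_tax_math", "status": status, "expected": expected}


# Figures from the inter-state sample invoice: ₹2,45,000 at 18%.
result = check_line_tax(Decimal("245000"), Decimal("18"), Decimal("44100"))
```

Using `Decimal` rather than floating point keeps the comparison exact, so a one-paisa discrepancy is reported rather than lost in rounding noise.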

05

Subtotal vs line-sum check

Sums all extracted line_amount values and compares the result to the header-level subtotal field. Flags the invoice if the difference is non-zero — catching the common OCR or vendor-keying error where the printed subtotal silently differs from the line math.
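The subtotal comparison can be sketched in a few lines. The function name `check_subtotal` and the result shape are hypothetical, and skipping the check when any amount is missing is an assumption.

```python
from decimal import Decimal
from typing import List, Optional


def check_subtotal(
    line_amounts: List[Optional[Decimal]], subtotal: Optional[Decimal]
) -> dict:
    """Flag the invoice when the header subtotal differs from the line sum."""
    if subtotal is None or any(amount is None for amount in line_amounts):
        # Assumption: incomplete inputs skip the check rather than fail it.
        return {"check": "subtotal_vs_line_sum", "status": "skipped", "difference": None}
    line_sum = sum(line_amounts, Decimal(0))
    difference = subtotal - line_sum
    status = "pass" if difference == 0 else "fail"
    return {"check": "subtotal_vs_line_sum", "status": status, "difference": difference}


# Hypothetical amounts: the printed subtotal is 0.50 short of the line sum.
mismatch = check_subtotal([Decimal("100.50"), Decimal("200.00")], Decimal("300.00"))
```

Returning the signed difference alongside the status gives the reviewer the size and direction of the mismatch without re-adding the lines.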

06

GSTIN verification & state-code routing

Where both vendor_gstin and customer_gstin are present, GSTIN verification extracts the supplier state code from the first two digits of the vendor GSTIN and compares it to the state code in the declared place_of_supply — surfacing any inter-state vs intra-state mismatch before tax-type booking.
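The state-code comparison reduces to a two-character prefix check. In the sketch below, the function name `route_tax_type` and the result keys are hypothetical, the GSTIN is a made-up placeholder, and `place_of_supply_code` is assumed to have been normalised to a two-digit state code already.

```python
def route_tax_type(vendor_gstin: str, place_of_supply_code: str) -> dict:
    """Derive the supplier state from the GSTIN prefix and compare it
    with the declared place-of-supply state code."""
    supplier_state_code = vendor_gstin[:2]  # first two digits = state code
    intra_state = supplier_state_code == place_of_supply_code
    return {
        "supplier_state_code": supplier_state_code,
        "place_of_supply_code": place_of_supply_code,
        "supply_type": "intra-state" if intra_state else "inter-state",
        "expected_tax": "CGST+SGST" if intra_state else "IGST",
    }


# Placeholder GSTIN for a Maharashtra (27) vendor billing Karnataka (29).
routing = route_tax_type("27ABCDE1234F1Z5", "29")
```

Both codes are returned as separate fields so the AP team can see what was compared, not just the conclusion.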

07

International invoice handling

Where both GSTINs are null — as with EUR or USD foreign-vendor invoices — bypasses every GST-specific validation and extracts only the fields that exist (currency, line items, subtotal, total) without raising false-positive errors on absent tax fields.
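One way to express the bypass is to choose the list of checks from the two GSTIN fields before any validation runs. The check names, their grouping into general and GST-specific sets, and the `needs_review` mode for an invoice with only one GSTIN (a case the text above does not cover) are all assumptions made for this sketch.

```python
from typing import Optional

# Hypothetical check names; the grouping is an assumption.
GENERAL_CHECKS = ["line_tax_math", "subtotal_vs_line_sum"]
GST_CHECKS = ["gstin_state_code_routing", "hsn_sac_presence"]


def select_checks(vendor_gstin: Optional[str], customer_gstin: Optional[str]) -> dict:
    """Pick which validations to run based on GSTIN presence."""
    has_vendor = bool(vendor_gstin)
    has_customer = bool(customer_gstin)
    if has_vendor and has_customer:
        return {"mode": "gst", "checks": GENERAL_CHECKS + GST_CHECKS}
    if not has_vendor and not has_customer:
        # Both null: GST-specific checks are never scheduled, so they
        # cannot raise false positives on absent tax fields.
        return {"mode": "international", "checks": list(GENERAL_CHECKS)}
    # Exactly one GSTIN present: assumed to need a human decision.
    return {"mode": "needs_review", "checks": list(GENERAL_CHECKS)}


plan = select_checks(None, None)
```

Deciding up front, rather than letting each GST check fail on a null field, is what keeps the exception queue free of format-driven noise.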

08

Unreadable document flagging

Where OCR yields no extractable structured fields (e.g., a dense, seal-stamped scanned invoice), returns an empty extracted_fields object and records the file as requiring manual review — preventing a silent null-record from passing downstream to the ERP.
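A small sketch of this guard, assuming the extraction arrives as a dictionary. The function name `route_extraction` and the routing labels are hypothetical.

```python
def route_extraction(extracted_fields: dict) -> dict:
    """Send extractions with no usable values to manual review."""

    def is_empty(value) -> bool:
        # A zero is a real value (e.g. tax_rate=0), so it is not empty.
        return value is None or value == "" or value == [] or value == {}

    usable = {k: v for k, v in extracted_fields.items() if not is_empty(v)}
    if not usable:
        return {"extracted_fields": {}, "routing": "manual_review"}
    return {"extracted_fields": extracted_fields, "routing": "continue"}


# A scan where every field came back blank.
outcome = route_extraction({"invoice_number": None, "line_items": [], "total_amount": ""})
```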

Edge Cases We Simulate

A battery of synthetic test scenarios that exercise every failure mode we have seen in real-world invoice data. Each scenario produces a deterministic outcome an auditor or controller can verify in seconds.

Mixed GST Rate Lines

What's wrong: A single invoice carries line items at 18%, 12%, 5% and 0% (zero-rated export), which makes manual tax-sum verification error-prone.
Expected outcome: Each line's tax is independently verified against its stated rate and line amount; aggregate tax is cross-checked against the invoice-level tax field. Any mismatch is flagged.

Inter-State Supply Mismatch

What's wrong: Vendor GSTIN state code and place of supply disagree, e.g. a Maharashtra (state 27) vendor billing place-of-supply Karnataka (state 29), and that difference decides IGST vs CGST+SGST.
Expected outcome: Workflow surfaces supplier state code (from GSTIN) and declared place of supply as separate flagged fields, allowing the AP team to confirm the correct tax type before booking.

Null Tax Fields — International Invoice

What's wrong: A foreign-vendor invoice (e.g. a EUR invoice, US → Germany) carries no GST, no GSTIN and no tax-rate fields, which breaks a GST-only ruleset.
Expected outcome: Workflow detects absent GSTIN on both sides and skips GST-specific validations. It extracts only currency, line items, subtotal and total, with no false positives.

Subtotal vs Line-Item Sum

What's wrong: The printed subtotal does not equal the arithmetic sum of individual line amounts, a common OCR or vendor-keying error that passes unnoticed in manual review.
Expected outcome: Workflow sums all extracted line_amount values and compares the result to the extracted subtotal, raising a validation exception when the difference is non-zero.

Unreadable / Scanned Invoice

What's wrong: A dense, seal-stamped or low-resolution scanned invoice (e.g. 25 rows with an embossed company seal) yields no extractable structured fields after OCR.
Expected outcome: Workflow returns an empty extracted_fields object and routes the file for manual review, preventing a silent null-record from passing to the ERP.

Duplicate Invoice Detection

What's wrong: The same invoice number is submitted a second time within the processing batch or against the existing vendor ledger, a common cause of duplicate payments.
Expected outcome: Workflow flags the invoice_number as a potential duplicate when it matches a previously processed record for the same vendor_gstin or vendor_name, and requires explicit approver clearance.
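The duplicate scenario can be sketched as a set lookup on a vendor-plus-invoice-number key. The function names, the case and whitespace normalisation, and the shape of the flag record are assumptions for illustration; the fallback from vendor_gstin to vendor_name follows the expected outcome described above.

```python
from typing import Iterable, List, Set, Tuple

Key = Tuple[str, str]


def duplicate_key(invoice: dict) -> Key:
    """Vendor identity (GSTIN, else name) plus invoice number, normalised."""
    vendor = invoice.get("vendor_gstin") or invoice.get("vendor_name") or ""
    number = invoice.get("invoice_number") or ""
    return (vendor.strip().casefold(), number.strip().casefold())


def flag_duplicates(batch: List[dict], ledger_keys: Iterable[Key]) -> List[dict]:
    """Flag invoices already seen in the ledger or earlier in the batch."""
    seen: Set[Key] = set(ledger_keys)
    flagged = []
    for invoice in batch:
        key = duplicate_key(invoice)
        if not key[1]:
            continue  # no invoice number extracted, nothing to compare
        if key in seen:
            flagged.append({
                "invoice_number": invoice.get("invoice_number"),
                "flag": "potential_duplicate",
                "cleared_by_approver": False,  # needs explicit clearance
            })
        seen.add(key)
    return flagged


# Placeholder GSTIN and vendor name; the third invoice repeats the first.
ledger = {duplicate_key({"vendor_gstin": "27ABCDE1234F1Z5", "invoice_number": "INV-001"})}
batch = [
    {"vendor_gstin": None, "vendor_name": "Example GmbH", "invoice_number": "R-77"},
    {"vendor_gstin": "27ABCDE1234F1Z5", "invoice_number": "inv-001 "},
    {"vendor_gstin": None, "vendor_name": "example gmbh", "invoice_number": "R-77"},
]
flags = flag_duplicates(batch, ledger)
```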

Sample Files & Results

Four seeded invoices — each one engineered to exercise a different failure mode. Three extract cleanly. One is deliberately unreadable, to prove the workflow surfaces it for manual review instead of posting bad data to the ERP.

Inter-state INR · GST tax invoice
Extracted

Acme Corp → Acme Corp

Acme Corp LLP (Maharashtra, GSTIN 27AAAFZ…) → Acme Corp Pvt Ltd (Karnataka, GSTIN 29AAACB…)
Total: ₹ 2.89 L (incl. IGST)
Lines: 3 (all extracted)
Tax: IGST 18% (inter-state)

Supplier state code (27) vs place of supply (state 29, the customer's state) correctly routed as inter-state IGST, not CGST+SGST. Per-line tax math (₹2,45,000 × 18% = ₹44,100) validated against header total.

International · null-GSTIN bypass
Extracted

Acme Corp → Schneider GmbH

Acme Corp (Rochester, NY · 🇺🇸) → Schneider Technologie GmbH (Berlin · 🇩🇪)
Total: € 300 (SaaS subscription)
GSTIN: null (both sides)
GST errors: 0 (false positives)

Vendor address, customer address, subscription period and total extracted from a bilingual DE/EN document. The null-GSTIN bypass skips intra-state/inter-state and HSN validations instead of generating exceptions an AP clerk would have to clear.

Mixed GST rates · intra-state CGST+SGST
Extracted

Acme Corp → Acme Corp

Acme Corp Pvt Ltd (Bengaluru, GSTIN 29AABCT…) → Acme Corp Pvt Ltd (same state · CGST+SGST)
Total: ₹ 1.65 L (5 lines)
Tax slabs: 4 (18% · 12% · 5% · 0%)
SAC codes: 5 (998313…998399)

Per-line tax math verified at 4 different rates. The zero-rated export documentation line (SAC 998399, ₹15,000 @ 0%) extracts cleanly with tax_rate=0 and is not flagged as a missing-field error.

Scanned image PDF · OCR fails
Manual review

Dense seal-stamped scan

High-row-count, low-DPI scan · 25 line rows · embossed seal overlay obscures text. Intentionally engineered to fail OCR.
Extracted fields: { } (empty)
Posted to ERP: 0 (nothing forwarded)
Routing: Manual (exception queue)

The workflow returns an empty extraction and routes the file for review — preventing the silent zero-value posting (or duplicate payment if the vendor resubmits the same invoice) that would happen if a partial extraction reached the ERP unchecked.

Why Automation Wins Here

For a mid-market AP team processing 100 domestic and international invoices per month, this AP automation software replaces an estimated 12–15 hours of manual field extraction, GSTIN verification and tax-math checking with a process that runs in under 30 seconds per document.

12–15 hrs
Saved per month on a 100-invoice AP workload
< 30 s
Per-document processing time, including math validation
100%
Line-level tax math verified, not just header totals
0
False-positive GST exceptions on international invoices

Math errors caught upstream

Computing line_amount × tax_rate ÷ 100 per line catches arithmetic discrepancies invisible to a clerk reading the printed total — reducing ITC booking errors that would require reversal under Rule 37A of the CGST Rules.

Quiet exception queue

Null-GSTIN detection eliminates the false-positive exceptions that an AP clerk would otherwise have to clear on every international invoice — keeping the queue limited to genuine anomalies, not format-driven noise.

Audit-ready, every invoice

Every processed invoice produces a structured JSON artifact (extracted fields + validation_results array) directly attachable to the AP voucher, which is a more reproducible evidence trail than a manually annotated printout under ICAI SA 500 standards. The output also feeds three-way matching and downstream GSTIN validation flows without further transformation.
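As a rough illustration of such an artifact, the sketch below bundles extracted fields and validation results into one JSON string. The function name `build_artifact`, the two top-level keys and the choice to serialise `Decimal` values as strings are assumptions, not the product's documented output format.

```python
import json
from decimal import Decimal


def build_artifact(extracted_fields: dict, validation_results: list) -> str:
    """Serialise one invoice's extraction and checks as a JSON document."""
    artifact = {
        "extracted_fields": extracted_fields,
        "validation_results": validation_results,
    }
    # Decimal is not JSON-serialisable by default; str() keeps exact digits.
    return json.dumps(artifact, default=str, indent=2, sort_keys=True)


# Hypothetical invoice number; amount taken from the inter-state sample.
artifact_json = build_artifact(
    {"invoice_number": "INV-001", "subtotal": Decimal("245000.00")},
    [{"check": "line_tax_math", "status": "pass"}],
)
```

Sorting keys makes two runs over the same invoice produce byte-identical output, which helps when the artifact is attached to a voucher as evidence.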

Frequently Asked Questions

The questions accountants and finance controllers ask most often before deploying invoice automation.

Which Indian tax regulations does this workflow's validation logic cover?

The workflow checks GSTIN format compliance per the alphanumeric structure mandated under Rule 10 of the CGST Rules, 2017, and verifies that the tax arithmetic on each line is consistent with the tax rate stated on that line. It also extracts the place of supply field to help determine whether IGST (Section 5 of the IGST Act) or CGST+SGST (Section 9 of the CGST Act) should apply, a distinction that directly affects Section 16(2)(aa) input tax credit eligibility.
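A structural GSTIN check can be sketched with a regular expression. The pattern below encodes the commonly cited 15-character layout (two-digit state code, ten-character PAN, entity code, the letter Z, one check character). It is an assumption about how such a check might look, it does not verify the check character, and the GSTIN used is a made-up placeholder.

```python
import re
from typing import Optional

# State code (2 digits) + PAN (5 letters, 4 digits, 1 letter)
# + entity code + "Z" + check character. Checksum is NOT validated.
GSTIN_PATTERN = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def has_gstin_structure(value: Optional[str]) -> bool:
    """Return True when the value matches the 15-character GSTIN layout."""
    if value is None:
        return False
    return GSTIN_PATTERN.fullmatch(value.strip().upper()) is not None


looks_valid = has_gstin_structure("27ABCDE1234F1Z5")
```

A structural match only shows the string is well formed; confirming that the registration exists would need a lookup against an external source, which this sketch does not attempt.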

Does this workflow handle international (non-GST) invoices, and can it process multiple currencies?

Yes. When neither vendor nor customer GSTIN is present, the workflow automatically bypasses GST validations and extracts the core commercial fields — currency, line items, subtotal, and total — without raising false errors. The demo data includes a EUR-denominated invoice between a US vendor and a German buyer, confirming that currency codes other than INR are captured as-is. Multi-currency conversion to a functional currency must be handled downstream in the ERP per the applicable standard (IAS 21 or ASC 830).

How does the workflow integrate with Tally, NetSuite, SAP, or other mid-market ERPs?

Cadel outputs a structured JSON object for each invoice containing all extracted and validated fields, which can be mapped to the chart-of-accounts and vendor master of any ERP through a standard API or CSV export. No custom ERP connector is required at the extraction stage; the validated payload is designed to slot into the AP entry screen of Tally Prime, NetSuite Bill, or SAP FB60 with field-level mapping configured once at onboarding.

What audit trail does Cadel maintain for processed invoices?

Every extraction run stores the original document, the raw OCR output, the structured field extraction, and the full list of validation results — including any exceptions raised — as an immutable, timestamped record. This supports the documentation requirements under ICAI SA 230 (Audit Documentation) for external auditors and gives internal audit teams a line-by-line evidence chain from source PDF to ERP posting without relying on email threads or manual logs.

How does the workflow handle invoices with missing or null fields, such as quantity or unit price not printed on the document?

Fields that cannot be extracted are recorded as null rather than defaulting to zero, preventing silent arithmetic errors in downstream calculations. Validation rules that depend on a null field — for example, a unit-price-times-quantity check — are skipped with an explicit note in the validation results, so the reviewer knows exactly which fields require manual entry before the invoice is posted.
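The skip-with-note behaviour can be sketched for the unit-price-times-quantity example. The function name `check_quantity_price`, the check label and the note wording are hypothetical.

```python
from decimal import Decimal
from typing import Optional


def check_quantity_price(
    quantity: Optional[Decimal],
    unit_price: Optional[Decimal],
    line_amount: Optional[Decimal],
) -> dict:
    """Verify quantity × unit_price == line_amount, or skip with a note."""
    inputs = {"quantity": quantity, "unit_price": unit_price, "line_amount": line_amount}
    missing = [name for name, value in inputs.items() if value is None]
    if missing:
        # Name the null fields so the reviewer knows what to key in.
        return {
            "check": "quantity_times_unit_price",
            "status": "skipped",
            "note": "null field(s): " + ", ".join(missing),
        }
    status = "pass" if quantity * unit_price == line_amount else "fail"
    return {"check": "quantity_times_unit_price", "status": status, "note": None}


# Unit price not printed on the document, so it stays None rather than 0.
skipped = check_quantity_price(Decimal("3"), None, Decimal("300"))
```

Had the missing unit price defaulted to zero, the same call would have reported a failure against a perfectly good line.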

Can this workflow detect duplicate invoices before they reach the payment run?

The workflow compares the extracted invoice_number and vendor_gstin (or vendor_name for international vendors) against previously processed records in the same batch and against the vendor ledger. A match raises a duplicate-payment risk flag that must be explicitly cleared by an approver — consistent with the duplicate-payment control objectives described in COSO Internal Control — Integrated Framework and testable under ICAI SA 240 fraud-risk procedures.

Does Simple Invoice Processing handle the full three-way match (Invoice + PO + GRN)?

This workflow handles the invoice extraction and validation layer only — it is the upstream foundation for three-way match, not the match engine itself. The structured JSON it produces (with normalized line_amount, tax_rate, vendor_gstin and invoice_number fields) is exactly the shape Cadel's separate N-Way Reconciliation workflow consumes to compare invoice lines against PO and GRN lines, surface quantity, price and term variances, and flag any mismatches before payment is released. For teams whose three-way matching control is currently driven by spreadsheets, deploying invoice processing automation first removes the largest source of garbage-in errors that downstream matching has to clean up.

How does this AP invoice OCR engine differ from generic OCR or template-based invoice processing software?

Generic OCR reads pixels and returns text. Template-based invoice processing software reads pixels, locates fields against a per-vendor template, and breaks the moment a vendor changes their layout. Cadel's AP invoice OCR is schema-driven, not template-driven: the workflow extracts a fixed set of structured fields (invoice number, GSTIN, line items, tax rates, totals) using LLM-grounded extraction with a domain-specific GST invoice schema, runs deterministic math validation on every line, and produces an auditable validation_results array. It works on first-seen vendors with no template maintenance, on bilingual layouts (e.g. German Rechnung, US commercial invoice), and on scanned PDFs — without breaking when a vendor moves their tax field to a different position.