KYC Verification

Aadhaar + PAN + GST + cheque + sanctions + PEP: KYC verification in about 45 seconds per bundle.

Live demo: upload an ID and an address proof. Cadel extracts every field and runs sanctions and PEP checks.

The Problem

Manual customer KYC for a single applicant bundle takes 35–45 minutes: classifying 9 document types, keying fields into a checklist, comparing the applicant’s name across four documents, and validating the GSTIN-PAN substring rule. At 50 bundles/month that’s ~30 hours on classification and field extraction alone, before any cross-reference work begins.

9 doc types per bundle, one checklist

Aadhaar, PAN, GST REG-06, Excise Licence CL-7 or CL-9, cancelled cheque, KYC Top Sheet, Group Syndicate Form and NOC are classified by hand into a spreadsheet checklist. CL-7 and CL-9 are visually identical except for one premises-type clause, so misclassifications are routine.

GSTIN verification & PAN substring rule

Rule 10 of the CGST Rules requires characters 3–12 of the GSTIN to equal the PAN. A single position mismatch invalidates the GSTIN — but is invisible to an analyst reading the document. Manual GSTIN verification at scale produces silent ITC-claim risk.

Cheque IFSC missing on older leaves

Older cheque leaves carry only the MICR band and no printed IFSC — leaving the bank account uncorroborated against the RBI MICR-IFSC master. Beneficiary disbursements proceed to unverified accounts unless the analyst manually resolves the MICR.

Name reconciliation across 4 documents

Aadhaar carries the full legal name. PAN may show initials. Cheque has the beneficiary chain. CL-7/CL-9 has the licence holder. Transliteration variations, initial expansion, and surname tokenisation mean string-equality fails — and a 0.85 similarity threshold has to be applied manually.

~30 hrs

Per month spent on classification and field extraction alone for a 50-bundle compliance workload — before any cross-reference work begins. The consequence of a missed exception isn’t theoretical: an unflagged GSTIN-PAN mismatch or a beneficiary cheque whose name doesn’t reconcile against the licence holder triggers recoverable disbursement, Section 12AA suspension proceedings under the Karnataka Excise Act, and an internal audit flag.

Why It Matters: Regulatory Context

KYC sits at the intersection of four mandatory regulatory regimes — PMLA, RBI Master Direction, CGST Rules, and the relevant state licensing act. Each one carries enforceable consequences for unverified beneficiaries.

PMLA 2002 · PEP & UBO checks

Anti-money-laundering identity proof

The Prevention of Money Laundering Act 2002 requires every regulated entity to verify customer identity, screen against the OFAC and EU sanctions lists, and check for politically-exposed-person status. The audit trail must persist for 5 years after relationship closure.

RBI Master Direction on KYC (2016)

Risk-based KYC tiers

RBI’s Master Direction on KYC mandates risk-based customer due diligence with periodic re-verification. Banks and NBFCs face penalties for onboarding beneficiaries without Aadhaar-PAN linkage and beneficial-owner identification, both of which are required at account opening, not later.

Rule 10 · CGST Rules 2017

GSTIN format & PAN linkage

Rule 10 of the CGST Rules requires characters 3–12 of the 15-character GSTIN to equal the supplier’s PAN. A mismatch invalidates the GSTIN for ITC purposes, and GSTIN verification at the document level catches what Tally Prime and SAP do not at master-record import.

State Licensing Acts

Beneficiary-name matching for disbursements

State-level licensing acts (e.g. Karnataka Excise Act 1965 — Section 12AA suspension) require that the cheque beneficiary name match the licence holder for every OWNED outlet. Mismatched beneficiaries trigger recoverable disbursements and internal audit flags.

What This Workflow Automates

Seven deterministic steps that classify, extract, cross-reference and issue a verdict on a KYC bundle in ~45 seconds. Every output field traces to its source document and validation rule.

01

Bundle ingestion & doc classification

Ingests the applicant bundle and classifies each file into one of nine registered types — using the premises-type clause to disambiguate CL-7 from CL-9 instead of the layout.

02

Structured field extraction

12-digit Aadhaar number + masked-Aadhaar flag, 10-character PAN, 15-character GSTIN, licence register number and excise year, MICR line, account number, IFSC, beneficiary name, and the outlet table from the Group Syndicate Form.

03

GSTIN verification (Rule 10 substring)

Validates that characters 3–12 of the GSTIN equal the PAN. On mismatch, the workflow logs the exact character positions that diverge and issues a FAIL on the GST document — preventing downstream ITC claim risk.
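A minimal sketch of the substring check described above, in Python. The function and class names are hypothetical and the sample GSTIN values are made up around the page's placeholder PAN; it shows only the position-by-position comparison, not the workflow's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class GstinPanResult:
    verdict: str              # "PASS" or "FAIL"
    mismatch_positions: list  # 1-based GSTIN character positions that diverge


def check_gstin_pan(gstin: str, pan: str) -> GstinPanResult:
    """Compare characters 3-12 of the GSTIN (1-based) with the 10-character PAN."""
    gstin, pan = gstin.strip().upper(), pan.strip().upper()
    if len(gstin) != 15 or len(pan) != 10:
        # Wrong length: report every PAN position as unverifiable.
        return GstinPanResult("FAIL", list(range(3, 13)))
    embedded = gstin[2:12]  # zero-based slice covering 1-based positions 3..12
    positions = [i + 3 for i, (g, p) in enumerate(zip(embedded, pan)) if g != p]
    return GstinPanResult("PASS" if not positions else "FAIL", positions)


# Placeholder values for illustration only.
print(check_gstin_pan("29ACMEA1234B1Z5", "ACMEA1234B"))
print(check_gstin_pan("29ACMEA1284B1Z5", "ACMEA1234B"))
```

Logging 1-based positions keeps the output aligned with how an analyst counts characters on the printed certificate.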

04

Cheque MICR → IFSC resolution

For older cheque leaves with no printed IFSC, resolves the MICR code against the RBI MICR-IFSC master. Marks the IFSC field FAIL if resolution does not return a single valid bank-branch record.
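The resolution rule can be sketched as a lookup that passes only when exactly one branch record comes back. The in-memory table below is a hypothetical stand-in for the RBI MICR-IFSC master (its real format is not described here), and every code in it is a made-up placeholder.

```python
import re

# Hypothetical extract of the MICR-IFSC master: MICR code -> branch records.
# All codes are invented placeholders, not real bank identifiers.
MICR_MASTER = {
    "560000001": [{"ifsc": "DEMO0000001", "branch": "Sample Branch A"}],
    "560000002": [
        {"ifsc": "DEMO0000002", "branch": "Sample Branch B"},
        {"ifsc": "DEMO0000003", "branch": "Sample Branch C"},
    ],
}


def resolve_ifsc_from_micr(micr_code, master=MICR_MASTER):
    """Return (status, ifsc) for a cheque leaf that has no printed IFSC."""
    if not micr_code or not re.fullmatch(r"\d{9}", micr_code):
        return "FAIL", None   # a MICR code is 9 digits; anything else is unusable
    records = master.get(micr_code, [])
    if len(records) != 1:
        return "FAIL", None   # no match and ambiguous matches both fail
    return "PASS", records[0]["ifsc"]


print(resolve_ifsc_from_micr("560000001"))  # single record
print(resolve_ifsc_from_micr("560000002"))  # ambiguous
print(resolve_ifsc_from_micr("999999999"))  # unknown
```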

05

Name cross-reference across 4 docs

Compares the applicant name across Aadhaar, PAN, cheque and the CL-7/CL-9 licence using initial expansion, surname tokenisation and transliteration-tolerant matching. Records a similarity score; a score below 0.85 routes to NEEDS_REVIEW with the character-level diff retained.
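One way to approximate this kind of matching is token-level comparison with Python's standard `difflib`. The sketch below is an assumption about how such a score could be built, not the workflow's scoring function: it treats a single-letter token as an initial, compares other tokens by character ratio, and applies the 0.85 threshold from the step description.

```python
import re
from difflib import SequenceMatcher

THRESHOLD = 0.85  # threshold named in the workflow description


def tokens(name):
    """Upper-case, drop punctuation and digits, split into tokens."""
    return re.sub(r"[^A-Za-z ]", " ", name).upper().split()


def token_score(a, b):
    """1.0 when one token is an initial of the other, else a character ratio."""
    if len(a) == 1 or len(b) == 1:
        return 1.0 if a[0] == b[0] else 0.0
    return SequenceMatcher(None, a, b).ratio()


def name_similarity(left, right):
    """Greedily pair each token of the shorter name with its best counterpart."""
    short, long_ = sorted((tokens(left), tokens(right)), key=len)
    if not short:
        return 0.0
    remaining = list(long_)
    total = 0.0
    for tok in short:
        best = max(remaining, key=lambda cand: token_score(tok, cand))
        total += token_score(tok, best)
        remaining.remove(best)
    return total / len(long_)  # unmatched extra tokens lower the score


def route(left, right):
    score = name_similarity(left, right)
    return ("PASS" if score >= THRESHOLD else "NEEDS_REVIEW"), round(score, 3)


print(route("ACME APPLICANT", "A APPLICANT"))     # initial expansion
print(route("ACME APPLICANT", "ACME APLICANT"))   # spelling variation
print(route("ACME APPLICANT", "OTHER OPERATOR"))  # different person
```

Dividing by the longer token count is a deliberate choice in this sketch: a name with extra unmatched tokens scores lower and is more likely to reach an analyst.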

06

Group Syndicate Form reconciliation

Reconciles each outlet row against its CL-7 or CL-9 licence holder, tags the row as OWNED or LEASED, and applies the beneficiary-name match only to OWNED rows — preventing false fails on leased premises.
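The OWNED/LEASED rule can be sketched as follows. The row fields (`outlet`, `licence_holder`, `ownership`) are hypothetical, the ownership tag is assumed to be available on each row already, and a normalised equality check stands in for the fuzzy name match.

```python
def normalise(name):
    """Upper-case and collapse whitespace; a stand-in for the fuzzy name match."""
    return " ".join(name.upper().split())


def reconcile_outlets(outlets, beneficiary_name):
    """Apply the beneficiary-name match only to rows tagged OWNED."""
    results = []
    for row in outlets:
        if row["ownership"] == "LEASED":
            verdict = "PASS"  # beneficiary match is not required on leased premises
        elif normalise(row["licence_holder"]) == normalise(beneficiary_name):
            verdict = "PASS"
        else:
            verdict = "FAIL"
        results.append(
            {"outlet": row["outlet"], "ownership": row["ownership"], "verdict": verdict}
        )
    return results


# Illustrative rows only.
outlets = [
    {"outlet": "Outlet 1", "licence_holder": "ACME APPLICANT", "ownership": "OWNED"},
    {"outlet": "Outlet 2", "licence_holder": "OTHER OPERATOR", "ownership": "LEASED"},
    {"outlet": "Outlet 3", "licence_holder": "OTHER OPERATOR", "ownership": "OWNED"},
]
for result in reconcile_outlets(outlets, "Acme  Applicant"):
    print(result)
```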

07

Per-doc & per-bundle verdict

Issues PASS / FAIL / NEEDS_REVIEW at both document and bundle level. Writes a structured JSON verdict file plus a human-readable verification report listing every field, its source document, its validation rule and its outcome.

Edge Cases We Simulate

The workflow ships with a battery of synthetic test scenarios that exercise every failure mode we have seen in real-world data. Each scenario produces a deterministic outcome that an auditor or controller can verify in seconds.

CL-7 Misclassified As CL-9

What's wrong: Both Karnataka Excise licence forms share layout, header, and signature block; the only distinguishing text is 'Hotel and Boarding House' versus 'Refreshment Rooms (Bars)'.
Expected outcome: Classifier reads the premises-type clause and tags the document as EXCISE_LICENCE_CL7 or EXCISE_LICENCE_CL9 accordingly; mismatch with the licence type declared in the KYC Top Sheet raises a NEEDS_REVIEW flag.
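A content-based disambiguation of this kind could look like the sketch below. It assumes the two clauses quoted above map to CL-7 and CL-9 respectively and that the OCR text is available as a plain string; the function name and return shape are hypothetical.

```python
# Assumed clause-to-type mapping, taken from the order the clauses are quoted in.
CLAUSE_TO_TYPE = {
    "hotel and boarding house": "EXCISE_LICENCE_CL7",
    "refreshment rooms (bars)": "EXCISE_LICENCE_CL9",
}


def classify_excise_licence(text, declared_type=None):
    """Tag the licence from its premises-type clause, ignoring layout."""
    normalised = " ".join(text.lower().split())
    hits = {t for clause, t in CLAUSE_TO_TYPE.items() if clause in normalised}
    if len(hits) != 1:
        # Neither clause, or both clauses, found: leave it to an analyst.
        return {"doc_type": None, "status": "NEEDS_REVIEW"}
    doc_type = hits.pop()
    mismatch = declared_type is not None and declared_type != doc_type
    return {"doc_type": doc_type, "status": "NEEDS_REVIEW" if mismatch else "PASS"}


sample = "Licence for the sale of liquor in\nRefreshment Rooms (Bars)"
print(classify_excise_licence(sample, declared_type="EXCISE_LICENCE_CL9"))
print(classify_excise_licence(sample, declared_type="EXCISE_LICENCE_CL7"))
```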

Name Variance Across Documents

What's wrong: Applicant name on Aadhaar reads 'ACME APPLICANT', on PAN 'ACME APPLICANT', and on the bank cheque 'A C APPLICANT', which would fail a strict equality check.
Expected outcome: Workflow applies a normalised match (initial expansion, surname tokenisation, transliteration tolerance) and records the match score; scores below 0.85 are routed to NEEDS_REVIEW with the diff highlighted.

Cancelled Cheque Without IFSC

What's wrong: Older cheque leaves carry only the MICR band and no printed IFSC, so account verification cannot rely on the cheque alone.
Expected outcome: MICR is parsed and resolved to IFSC against the RBI MICR-IFSC master; if resolution fails, the cheque is marked FAIL on the IFSC field and the bundle moves to NEEDS_REVIEW.

GSTIN PAN Mismatch

What's wrong: Characters 3–12 of the 15-character GSTIN must equal the applicant's PAN; data-entry errors or use of a related entity's GST often break this rule.
Expected outcome: Workflow extracts both fields, performs the substring equality check defined in the CGST Rules, and issues FAIL on the GST document with the exact character positions of the mismatch logged.

Group Syndicate With Mixed Owned and Leased Outlets

What's wrong: A Group Syndicate Form lists multiple outlets where some licences are held by the syndicate beneficiary and others by leased operators; bank account beneficiary may not match every outlet's licence holder.
Expected outcome: Each outlet row is reconciled against its CL-7/CL-9 licence holder and the syndicate beneficiary; ownership type is tagged OWNED or LEASED and only OWNED rows are required to match the beneficiary bank account.

Aadhaar Masked Or Partially Redacted

What's wrong: UIDAI permits masked Aadhaar where the first 8 digits are replaced by 'XXXX XXXX'; full 12-digit validation cannot run on masked copies.
Expected outcome: Workflow detects masking, validates the visible 4 digits and the QR code signature where present, and records the verdict as NEEDS_REVIEW pending offline Aadhaar XML or DigiLocker verification.
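Masking detection on the extracted number can be sketched with two regular expressions. This covers only the detection step; the QR signature check is out of scope, and the function name and return shape are hypothetical.

```python
import re

MASKED = re.compile(r"[Xx]{4}\s?[Xx]{4}\s?(\d{4})")  # first 8 digits replaced by X
FULL = re.compile(r"\d{4}\s?\d{4}\s?\d{4}")          # all 12 digits visible


def inspect_aadhaar(raw):
    """Flag masked copies; leave the verdict open for full numbers."""
    value = raw.strip()
    masked = MASKED.fullmatch(value)
    if masked:
        return {"masked": True, "visible_digits": masked.group(1),
                "verdict": "NEEDS_REVIEW"}
    if FULL.fullmatch(value):
        # verdict None here means full 12-digit validation still has to run.
        return {"masked": False, "number": re.sub(r"\s", "", value), "verdict": None}
    return {"masked": False, "verdict": "FAIL"}  # not a recognisable Aadhaar number


print(inspect_aadhaar("XXXX XXXX 1234"))
print(inspect_aadhaar("1234 5678 9012"))  # placeholder digits, not a real number
```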

Sample Documents

Seeded sample files used to demonstrate this workflow. Each one exercises a specific scenario or failure mode.

Aadhaar Card
aadhaar_front_back.pdf

Demonstrates 12-digit Aadhaar parsing, QR signature check, and address extraction across front and back pages.

PAN Card
pan_card.pdf

Validates the 10-character PAN format (sample value ACMEA1234B) and cross-references the 4th character against entity type.

Bank Cheque
cancelled_cheque.pdf

Cancelled cheque used to extract account number, IFSC, MICR, and account-holder name for beneficiary verification.

GST Registration Certificate
gst_reg_06.pdf

Form GST REG-06 with 15-character GSTIN, used for PAN-GSTIN substring check and trade-name match.

Excise Licence CL-9
excise_cl9_bar.pdf

Karnataka Department of Excise Form CL-9 for a bar/restaurant; tests CL-7 vs CL-9 classifier disambiguation.

Group Syndicate Form
group_syndicate_form.pdf

Multi-outlet syndicate filing with beneficiary bank table; tests owned-vs-leased reconciliation logic.

Why Automation Wins Here

A 35–45 minute per-bundle manual review collapses to under a minute of automated processing plus targeted analyst attention only on NEEDS_REVIEW items. At 50 bundles/month the compliance team recovers ~28 hours and eliminates the four most common error classes that drive disbursement clawbacks.

45 s
Per-bundle classification + extraction + 14 cross-doc validation rules
47
Structured fields extracted per representative bundle
9
Document types classified deterministically (incl. CL-7 vs CL-9)
~28 hrs
Recovered per month on a 50-bundle compliance workload

GSTIN verification at the document level

The CGST Rule 10 substring identity (characters 3–12 of the GSTIN equal the PAN, i.e. GSTIN[2:12] == PAN in zero-based slice notation) is checked on every document, with the exact character positions of any divergence logged. This catches the silent ITC-claim risk that Tally Prime and SAP miss at master-record import.

CL-7 vs CL-9 disambiguated by content

The two Karnataka Excise forms share header, layout and signature block. The workflow reads the premises-type clause directly and flags any divergence from the licence type declared in the KYC Top Sheet, routing to NEEDS_REVIEW rather than passing silently.

PMLA audit trail, every onboarding

Verdict JSON, field-level extraction report and cross-reference log written as a single artifact bundle keyed to register number + excise year — satisfying Rule 9 of the PMLA Maintenance of Records Rules 2005 without manual evidence assembly.

Frequently Asked Questions

The questions accountants and finance controllers ask most often before deploying this workflow.

Which regulatory framework does this workflow satisfy?

The verification logic is mapped to the RBI Master Direction — Know Your Customer (KYC), 2016 (updated), the Prevention of Money Laundering Act, 2002 and PMLA Rules 2005, and the customer-due-diligence checks required under Section 12 of PMLA. Document-level checks (PAN format, GSTIN PAN-link, Aadhaar QR signature) follow the issuing authority's published specifications.

Does the workflow perform live Aadhaar e-KYC or only document parsing?

By default the workflow parses scanned or PDF Aadhaar copies and validates the QR code signature offline using UIDAI's public key. Live Aadhaar e-KYC and OTP-based authentication require a UIDAI AUA/KUA licence and can be wired in through a registered authentication agency; the workflow exposes the hook but does not bypass licensing requirements.

How is the PASS / FAIL / NEEDS_REVIEW verdict computed?

Each document produces field-level checks (format, issuer signature, expiry) and each cross-document rule (PAN ↔ GSTIN, name match across Aadhaar/PAN/cheque, licence holder ↔ bank beneficiary) produces a boolean. A bundle returns PASS only when every mandatory check passes, FAIL when any deterministic check fails (e.g., invalid PAN checksum), and NEEDS_REVIEW when checks are indeterminate (masked Aadhaar, fuzzy name match below threshold).
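The roll-up described in this answer can be sketched as a small precedence rule. The check structure (`rule`, `mandatory`, `outcome`) is hypothetical, and the ordering, FAIL before NEEDS_REVIEW before PASS, is an assumption drawn from the wording above.

```python
def bundle_verdict(checks):
    """Roll per-check outcomes up to one bundle verdict.

    Each check is a dict with 'rule', 'mandatory' (bool) and 'outcome'
    ('PASS', 'FAIL' or 'NEEDS_REVIEW'). Non-mandatory checks are ignored.
    """
    outcomes = [c["outcome"] for c in checks if c["mandatory"]]
    if "FAIL" in outcomes:
        return "FAIL"          # any deterministic failure fails the bundle
    if "NEEDS_REVIEW" in outcomes:
        return "NEEDS_REVIEW"  # indeterminate checks go to an analyst
    return "PASS"              # every mandatory check passed


# Illustrative check list.
checks = [
    {"rule": "gstin_pan_substring", "mandatory": True, "outcome": "PASS"},
    {"rule": "name_match", "mandatory": True, "outcome": "NEEDS_REVIEW"},
    {"rule": "trade_name_match", "mandatory": False, "outcome": "FAIL"},
]
print(bundle_verdict(checks))
```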

Can it handle multi-outlet group filings such as KSBCL syndicates?

Yes. The Group Syndicate Form is parsed into a list of outlets, each tagged OWNED or LEASED, and reconciled individually against its CL-7 or CL-9 licence and the syndicate's beneficiary bank account. The verdict is reported per outlet and rolled up at the syndicate level.

What audit trail is produced for each verification?

Every run stores the original document hash (SHA-256), the OCR/extraction output, the rule set version, the per-rule pass/fail with the exact field values compared, and the final verdict with timestamp and operator ID. The trail is exportable as a signed PDF workpaper that aligns with ICAI SA 230 Audit Documentation requirements.
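As an illustration of what one trail entry might contain, the sketch below assembles the listed fields into a JSON-serialisable record using Python's standard `hashlib`. The field names and sample values are hypothetical, and PDF signing is not shown.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(document_bytes, extraction, rule_results, verdict,
                       operator_id, ruleset_version):
    """Assemble one audit-trail entry for a verified document."""
    return {
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "extraction": extraction,
        "ruleset_version": ruleset_version,
        "rule_results": rule_results,  # per-rule outcome plus the values compared
        "verdict": verdict,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "operator_id": operator_id,
    }


# Placeholder inputs for illustration only.
record = build_audit_record(
    document_bytes=b"%PDF-1.7 placeholder bytes",
    extraction={"pan": "ACMEA1234B"},
    rule_results=[{"rule": "gstin_pan_substring",
                   "compared": ["ACMEA1234B", "ACMEA1234B"],
                   "outcome": "PASS"}],
    verdict="PASS",
    operator_id="analyst-01",
    ruleset_version="example-1",
)
print(json.dumps(record, indent=2))
```

Hashing the original bytes rather than the extracted text lets a reviewer confirm later that the stored file is the one that was verified.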

How does it integrate with our existing onboarding or ERP system?

The workflow exposes a REST API and webhook callbacks; document bundles can be pushed from a customer-onboarding portal, Tally, SAP, or NetSuite, and verdicts are returned as structured JSON. Vendor master records in the ERP can be auto-flagged as KYC_VERIFIED or KYC_HOLD based on the verdict.