KYC Verification
KYC Verification ingests an applicant's identity, banking, and licensing document bundle, extracts each field, cross-references them, and issues a deterministic PASS, FAIL, or NEEDS_REVIEW verdict.
The Problem
A KSBCL licensee onboarding bundle for a single Karnataka liquor outlet draws on nine registered document types: Aadhaar, PAN, GST REG-06, Excise Licence CL-7, Excise Licence CL-9, cancelled cheque, KYC Top Sheet, Group Syndicate Form, and No Objection Certificate. A mid-market compliance team running 50–500 licensee verifications a quarter spends roughly 35–45 minutes per bundle manually classifying each PDF, keying fields into a checklist, comparing the applicant's name across four documents, and validating the GSTIN-PAN substring rule defined in Rule 10 of the CGST Rules, 2017.
The error surface is wide. CL-7 and CL-9 forms are visually identical except for the premises-type clause, so analysts frequently tag a CL-9 (Refreshment Rooms / Bars) as a CL-7 (Hotel and Boarding House) and approve a bundle whose declared licence type contradicts the document on file. Older cheque leaves carry only the MICR band and no printed IFSC, which leaves the bank account uncorroborated. Aadhaar copies arrive masked under the UIDAI masked-Aadhaar guideline, and the analyst has no deterministic way to record whether digits are hidden because of masking or because of a poor scan.
At 50 bundles a month the team spends ~30 hours on classification and field extraction alone, before any cross-reference work begins. The consequence of a missed exception is not theoretical: under the Karnataka Excise Act, 1965 and the KSBCL beneficiary-payment SOP, paying a licensee whose bank beneficiary name does not match the licence holder triggers a recoverable disbursement and an internal audit flag.
Why It Matters: Context
KYC for liquor licensees in Karnataka sits at the intersection of the Prevention of Money Laundering Act, 2002, RBI's Master Direction on KYC (2016, as amended), and the Karnataka Excise (Sale of Indian and Foreign Liquors) Rules. The compliance officer has to confirm identity (Aadhaar, PAN), tax registration (GSTIN with the Rule 10 substring rule), licence validity (CL-7 or CL-9 in the correct excise year), and the bank account beneficiary chain (cheque IFSC resolved against the RBI MICR-IFSC master, beneficiary name matched to licence holder).
Mid-market KSBCL distributors and group syndicates rarely have a dedicated KYC team. One or two analysts in the controller's office handle onboarding alongside vendor master, GST filing, and TDS reconciliation. There is no enterprise identity-verification suite in place; checks are performed against scanned PDFs in a shared drive with a spreadsheet checklist.
A single missed GSTIN-PAN mismatch, an unflagged CL-9 sold as CL-7, or a beneficiary cheque whose name does not reconcile against the licence holder of an OWNED outlet in a Group Syndicate Form is enough to trigger a Section 12AA suspension proceeding under the Karnataka Excise Act and a clawback of disbursed amounts.
What This Workflow Automates
- Ingests the licensee's document bundle and classifies each file into one of nine registered types — Aadhar Card, PAN Card, GST Registration, Excise Licence CL-7, Excise Licence CL-9, Bank Cheque, KYC Top Sheet, Group Syndicate Form, or No Objection Certificate — using the premises-type clause to disambiguate CL-7 from CL-9.
- Extracts structured fields per document: 12-digit Aadhaar number and masked-Aadhaar flag, 10-character PAN, 15-character GSTIN, licence register number and excise year, MICR line, account number, IFSC, beneficiary name, and the outlet table from the Group Syndicate Form.
- Validates the CGST Rule 10 substring identity: characters 3–12 of the GSTIN must equal the PAN; on mismatch the workflow logs the exact character positions that diverge and issues a FAIL on the GST document.
- Resolves the cheque MICR code against the RBI MICR-IFSC master when the IFSC is not printed on the leaf, and marks the IFSC field FAIL if resolution does not return a single valid bank-branch record.
- Cross-references the applicant name across Aadhaar, PAN, and the cheque using initial expansion, surname tokenisation, and transliteration-tolerant matching; records a similarity score and routes any score below 0.85 to NEEDS_REVIEW with the character-level diff retained.
- For Group Syndicate Forms, reconciles each outlet row against its CL-7 or CL-9 licence holder, tags the row as OWNED or LEASED, and applies the beneficiary-name match only to OWNED rows.
- Issues a per-document and per-bundle verdict — PASS, FAIL, or NEEDS_REVIEW — and writes a structured JSON verdict file plus a human-readable verification report listing every field, its source document, its validation rule, and its outcome.
All of this happens in roughly 45 seconds with deterministic outputs every controller can audit.
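The GSTIN-PAN substring check described above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual rule engine: the function names, the rule identifier `GSTIN_PAN_SUBSTRING`, and the result keys are assumptions made for the example.

```python
def gstin_pan_mismatches(gstin: str, pan: str) -> list:
    """Return the 1-indexed GSTIN positions (3..12) where the embedded PAN differs."""
    gstin, pan = gstin.strip().upper(), pan.strip().upper()
    if len(gstin) != 15 or len(pan) != 10:
        raise ValueError("expected a 15-character GSTIN and a 10-character PAN")
    embedded = gstin[2:12]  # characters 3-12 of the GSTIN, counted from 1
    return [i + 3 for i, (g, p) in enumerate(zip(embedded, pan)) if g != p]


def gst_document_verdict(gstin: str, pan: str) -> dict:
    """Build a FAIL/PASS result that keeps the diverging positions for the log."""
    positions = gstin_pan_mismatches(gstin, pan)
    return {
        "rule": "GSTIN_PAN_SUBSTRING",  # hypothetical rule identifier
        "outcome": "PASS" if not positions else "FAIL",
        "mismatch_positions": positions,
    }
```

Calling `gst_document_verdict("29ABCDE1234F1Z5", "ABCDE1234G")` with these made-up values reports a mismatch at GSTIN position 12, which is the kind of character-level detail the verdict file is described as retaining.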
Edge Cases We Simulate
The workflow ships with a battery of synthetic test scenarios that exercise every failure mode we have seen in real-world data. Each scenario produces a deterministic outcome that an auditor or controller can verify in seconds.
| Scenario | What's wrong | Expected outcome |
|---|---|---|
| CL-7 Misclassified As CL-9 | Both Karnataka Excise licence forms share layout, header, and signature block; the only distinguishing text is 'Hotel and Boarding House' versus 'Refreshment Rooms (Bars)'. | Classifier reads the premises-type clause and tags the document as EXCISE_LICENCE_CL7 or EXCISE_LICENCE_CL9 accordingly; mismatch with the licence type declared in the KYC Top Sheet raises a NEEDS_REVIEW flag. |
| Name Variance Across Documents | Applicant name on Aadhaar reads 'RAJESH KUMAR S', on PAN 'RAJESH KUMAR SHETTY', and on the bank cheque 'R K SHETTY', which would fail a strict equality check. | Workflow applies a normalised match (initial expansion, surname tokenisation, transliteration tolerance) and records the match score; scores below 0.85 are routed to NEEDS_REVIEW with the diff highlighted. |
| Cancelled Cheque Without IFSC | Older cheque leaves carry only the MICR band and no printed IFSC, so account verification cannot rely on the cheque alone. | MICR is parsed and resolved to IFSC against the RBI MICR-IFSC master; if resolution fails, the cheque is marked FAIL on the IFSC field and the bundle moves to NEEDS_REVIEW. |
| GSTIN PAN Mismatch | Characters 3–12 of the 15-character GSTIN must equal the applicant's PAN; data-entry errors or use of a related entity's GST often break this rule. | Workflow extracts both fields, performs the substring equality check defined in the CGST Rules, and issues FAIL on the GST document with the exact character positions of the mismatch logged. |
| Group Syndicate With Mixed Owned and Leased Outlets | A Group Syndicate Form lists multiple outlets where some licences are held by the syndicate beneficiary and others by leased operators; bank account beneficiary may not match every outlet's licence holder. | Each outlet row is reconciled against its CL-7/CL-9 licence holder and the syndicate beneficiary; ownership type is tagged OWNED or LEASED and only OWNED rows are required to match the beneficiary bank account. |
| Aadhaar Masked Or Partially Redacted | UIDAI permits masked Aadhaar where the first 8 digits are replaced by 'XXXX XXXX'; full 12-digit validation cannot run on masked copies. | Workflow detects masking, validates the visible 4 digits and the QR code signature where present, and records the verdict as NEEDS_REVIEW pending offline Aadhaar XML or DigiLocker verification. |
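The name-variance scenario in the table can be illustrated with a small token matcher. This is a sketch under stated assumptions: the scoring formula (matched tokens divided by the longer token count) and the function names are invented for the example, and transliteration tolerance is not modelled. Only the 0.85 threshold comes from the text.

```python
import re

REVIEW_THRESHOLD = 0.85  # threshold stated in the scenario table


def tokens(name: str) -> list:
    """Upper-case the name and split it into alphabetic tokens."""
    return re.findall(r"[A-Z]+", name.upper())


def token_matches(a: str, b: str) -> bool:
    """Tokens match if equal, or if one is a single-letter initial of the other."""
    if a == b:
        return True
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    return len(short) == 1 and long_.startswith(short)


def name_score(left: str, right: str) -> float:
    """Greedy one-to-one token matching; the score formula is an assumption."""
    a, b = tokens(left), tokens(right)
    if not a or not b:
        return 0.0
    unused = list(b)
    matched = 0
    for tok in a:
        for cand in unused:
            if token_matches(tok, cand):
                unused.remove(cand)
                matched += 1
                break
    return matched / max(len(a), len(b))


def name_outcome(left: str, right: str) -> str:
    return "PASS" if name_score(left, right) >= REVIEW_THRESHOLD else "NEEDS_REVIEW"
```

Under this toy scoring, the three spellings in the scenario ('RAJESH KUMAR S', 'RAJESH KUMAR SHETTY', 'R K SHETTY') match one another, while a different given name such as 'SURESH KUMAR SHETTY' falls below the threshold and is routed to NEEDS_REVIEW.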
Sample Documents
Download or inspect the seeded sample files used to demonstrate this workflow:
| File | Document type | Notes |
|---|---|---|
| aadhaar_front_back.pdf | Aadhar Card | Demonstrates 12-digit Aadhaar parsing, QR signature check, and address extraction across front and back pages. |
| pan_card.pdf | PAN Card | Validates 10-character PAN format AAAAA9999A and cross-references the 4th character against entity type. |
| cancelled_cheque.pdf | Bank Cheque | Cancelled cheque used to extract account number, IFSC, MICR, and account-holder name for beneficiary verification. |
| gst_reg_06.pdf | GST Registration Certificate | Form GST REG-06 with 15-character GSTIN, used for PAN-GSTIN substring check and trade-name match. |
| excise_cl9_bar.pdf | Excise Licence CL-9 | Karnataka Department of Excise Form CL-9 for a bar/restaurant; tests CL-7 vs CL-9 classifier disambiguation. |
| group_syndicate_form.pdf | Group Syndicate Form | Multi-outlet syndicate filing with beneficiary bank table; tests owned-vs-leased reconciliation logic. |
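The PAN check noted for pan_card.pdf (format AAAAA9999A plus the 4th-character entity type) might look like the sketch below. The function name, result keys, and the decision to FAIL on an unrecognised 4th character are assumptions for illustration; the holder-type letters follow the commonly published PAN structure.

```python
import re
from typing import Optional

PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

# 4th character of a PAN encodes the holder type.
PAN_HOLDER_TYPES = {
    "P": "Individual",
    "C": "Company",
    "H": "Hindu Undivided Family",
    "F": "Firm",
    "A": "Association of Persons",
    "T": "Trust",
    "B": "Body of Individuals",
    "L": "Local Authority",
    "J": "Artificial Juridical Person",
    "G": "Government",
}


def check_pan(pan: str, declared_type: Optional[str] = None) -> dict:
    """Validate the PAN format and, if given, compare the holder type to the declared one."""
    pan = pan.strip().upper()
    format_ok = PAN_PATTERN.fullmatch(pan) is not None
    holder_type = PAN_HOLDER_TYPES.get(pan[3]) if format_ok else None
    type_ok = holder_type is not None and (
        declared_type is None or holder_type == declared_type
    )
    return {
        "pan": pan,
        "format_ok": format_ok,
        "holder_type": holder_type,
        "outcome": "PASS" if format_ok and type_ok else "FAIL",
    }
```

For example, `check_pan("ABCPE1234F", declared_type="Firm")` passes the format check but fails overall, because the 4th character `P` denotes an individual.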
Sample Results
On a representative KSBCL bundle, the workflow classifies all nine documents, extracts 47 fields, and runs 14 cross-document validation rules in under a minute. The Rule 10 GSTIN-PAN substring check, the MICR-to-IFSC resolution, the Aadhaar masking detector, and the licence-holder-to-beneficiary reconciliation each produce a discrete PASS, FAIL, or NEEDS_REVIEW outcome that is written to the verdict JSON with the exact rule identifier and source field reference.
One exception class the workflow consistently catches is the CL-7 / CL-9 misclassification. Because the two Karnataka Excise forms share header, layout, and signature block, manual review routinely tags a Refreshment Rooms (Bars) licence as a Hotel and Boarding House licence; the workflow reads the premises-type clause directly, flags the divergence from the licence type declared in the KYC Top Sheet, and routes the bundle to NEEDS_REVIEW rather than letting it pass.
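A rough sketch of the premises-type check follows. It assumes the licence text has already been extracted (for example by OCR) and uses simple phrase patterns; the function names and result keys are hypothetical, while the two type labels are the ones used in the scenario table.

```python
import re
from typing import Optional

# Phrases that distinguish the two forms, per the scenario description.
PREMISES_CLAUSES = {
    "EXCISE_LICENCE_CL7": re.compile(r"hotel\s+and\s+boarding\s+house", re.IGNORECASE),
    "EXCISE_LICENCE_CL9": re.compile(r"refreshment\s+rooms?", re.IGNORECASE),
}


def classify_licence(text: str) -> Optional[str]:
    """Return the licence label if exactly one premises clause is found, else None."""
    hits = [label for label, pattern in PREMISES_CLAUSES.items() if pattern.search(text)]
    return hits[0] if len(hits) == 1 else None


def licence_type_check(text: str, declared: str) -> dict:
    """Compare the detected licence type with the type declared in the KYC Top Sheet."""
    detected = classify_licence(text)
    outcome = "PASS" if detected == declared else "NEEDS_REVIEW"
    return {"detected": detected, "declared": declared, "outcome": outcome}
```

An ambiguous document (neither clause, or both) yields `None` here, so it also lands in NEEDS_REVIEW rather than being guessed.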
Why Automation Wins Here
A 35–45 minute per-bundle manual review collapses to under a minute of automated processing plus targeted analyst attention only on NEEDS_REVIEW items. At 50 bundles a month the compliance team recovers roughly 28 hours, and the deterministic rule engine eliminates the four most common error classes — CL-7/CL-9 confusion, GSTIN-PAN substring breaks, missing-IFSC cheques, and unverified masked Aadhaar — that account for the majority of disbursement clawbacks under the KSBCL beneficiary SOP.
The verdict JSON, the field-level extraction report, and the cross-reference log are written as a single artifact bundle keyed to the licensee's register number and excise year. The controller drops the bundle directly into the audit file as evidence of KYC review under Rule 9 of the PMLA Maintenance of Records Rules, 2005, with every rule outcome traceable to its source document and field.
Frequently Asked Questions
The verification logic is mapped to the RBI Master Direction — Know Your Customer (KYC), 2016 (updated), the Prevention of Money Laundering Act, 2002 and PMLA Rules 2005, and the customer-due-diligence checks required under Section 12 of PMLA. Document-level checks (PAN format, GSTIN PAN-link, Aadhaar QR signature) follow the issuing authority's published specifications.
By default the workflow parses scanned or PDF Aadhaar copies and validates the QR code signature offline using UIDAI's public key. Live Aadhaar e-KYC and OTP-based authentication require a UIDAI AUA/KUA licence and can be wired in through a registered authentication agency; the workflow exposes the hook but does not bypass licensing requirements.
Each document produces field-level checks (format, issuer signature, expiry) and each cross-document rule (PAN ↔ GSTIN, name match across Aadhaar/PAN/cheque, licence holder ↔ bank beneficiary) produces a boolean. A bundle returns PASS only when every mandatory check passes, FAIL when any deterministic check fails (e.g., invalid PAN checksum), and NEEDS_REVIEW when checks are indeterminate (masked Aadhaar, fuzzy name match below threshold).
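The roll-up rule in this answer can be expressed as a small precedence function. This sketch assumes FAIL takes precedence over NEEDS_REVIEW at bundle level, which is one reading of the description above; the function name and the choice to reject an empty list are illustrative.

```python
VALID_OUTCOMES = {"PASS", "FAIL", "NEEDS_REVIEW"}


def bundle_verdict(outcomes: list) -> str:
    """Roll per-check outcomes up to one verdict: FAIL > NEEDS_REVIEW > PASS."""
    if not outcomes:
        # No checks ran, so there is nothing to justify a PASS.
        raise ValueError("a bundle needs at least one check outcome")
    unknown = set(outcomes) - VALID_OUTCOMES
    if unknown:
        raise ValueError(f"unknown outcomes: {sorted(unknown)}")
    if "FAIL" in outcomes:
        return "FAIL"
    if "NEEDS_REVIEW" in outcomes:
        return "NEEDS_REVIEW"
    return "PASS"
```

With this ordering, a bundle whose checks are `["PASS", "NEEDS_REVIEW", "PASS"]` is held for review, and a single FAIL anywhere fails the bundle.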
Multi-outlet syndicates are supported. The Group Syndicate Form is parsed into a list of outlets, each tagged OWNED or LEASED, and reconciled individually against its CL-7 or CL-9 licence and the syndicate's beneficiary bank account. The verdict is reported per outlet and rolled up at the syndicate level.
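A simplified sketch of the per-outlet reconciliation is shown below. It assumes each row already carries its OWNED or LEASED tag, uses exact comparison after whitespace and case normalisation instead of the fuzzy name match, and treats an OWNED mismatch as FAIL; the `OutletRow` type, field names, and result keys are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OutletRow:
    outlet: str
    licence_holder: str
    ownership: str  # "OWNED" or "LEASED"


def normalise(name: str) -> str:
    return " ".join(name.upper().split())


def reconcile_syndicate(rows: list, beneficiary: str) -> dict:
    """Apply the beneficiary-name match to OWNED rows only, then roll up."""
    results = []
    for row in rows:
        if row.ownership == "OWNED":
            matched = normalise(row.licence_holder) == normalise(beneficiary)
            outcome, required = ("PASS" if matched else "FAIL"), True
        elif row.ownership == "LEASED":
            outcome, required = "PASS", False  # beneficiary match not required
        else:
            raise ValueError(f"unknown ownership tag: {row.ownership!r}")
        results.append(
            {"outlet": row.outlet, "match_required": required, "outcome": outcome}
        )
    rolled_up = "FAIL" if any(r["outcome"] == "FAIL" for r in results) else "PASS"
    return {"outlets": results, "syndicate_outcome": rolled_up}
```

A LEASED outlet whose licence holder differs from the beneficiary does not affect the syndicate outcome in this sketch, whereas the same difference on an OWNED row does.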
Every run stores the original document hash (SHA-256), the OCR/extraction output, the rule set version, the per-rule pass/fail with the exact field values compared, and the final verdict with timestamp and operator ID. The trail is exportable as a signed PDF workpaper that aligns with ICAI SA 230 Audit Documentation requirements.
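As an illustration of what one such audit entry could contain, the sketch below builds a record with Python's standard library. The field names and the function signature are assumptions; the signed PDF export is not covered.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(document_bytes: bytes, rule_set_version: str,
                 rule_results: list, verdict: str, operator_id: str) -> dict:
    """Assemble one audit-trail entry for a verified document."""
    return {
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "rule_set_version": rule_set_version,
        "rule_results": rule_results,  # per-rule outcome with the compared values
        "verdict": verdict,
        "operator_id": operator_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def to_json(record: dict) -> str:
    """Serialise the record with stable key order so reruns are easy to diff."""
    return json.dumps(record, indent=2, sort_keys=True)
```

Hashing the original bytes rather than the extracted text means a reviewer can later confirm that the file in the audit folder is the one the rules were run against.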
The workflow exposes a REST API and webhook callbacks; document bundles can be pushed from a customer-onboarding portal, Tally, SAP, or NetSuite, and verdicts are returned as structured JSON. Vendor master records in the ERP can be auto-flagged as KYC_VERIFIED or KYC_HOLD based on the verdict.
This workflow is deployed and live in our demo environment. Upload your own documents to see it in action.