Form GST REG-06 OCR + GSTIN structure check + base-36 Luhn checksum: GST certificate validation in under 30 seconds per document.
Vendor onboarding and AP teams in mid-market companies collect Form GST REG-06 certificates from hundreds of suppliers every year — and manual processing of those PDFs creates compounding risk at every step.
Manually keying GSTIN, legal name, trade name, constitution type, principal address, date of liability and approving authority from each Form GST REG-06 PDF takes 5–10 minutes per certificate. Across a vendor base of 200–500 active suppliers, that is roughly 17–83 person-hours per cycle (200 × 5 minutes at the low end, 500 × 10 minutes at the high end), with no built-in verification step.
The GSTN encodes a base-36 Luhn check character at position 15 of every GSTIN. A certificate where someone altered the GSTIN — swapping a digit or transposing two characters — will still appear 15 characters long with a valid state code and Z at position 14. Only the checksum computation reveals the alteration, which a manual review cannot perform reliably at scale.
Under Section 16(2)(aa) of the CGST Act, 2017 (inserted by the Finance Act, 2021), Input Tax Credit is admissible only when the supplier’s outward supply is reflected in GSTR-2B. An invalid or fabricated GSTIN on a supplier certificate is an early indicator that ITC claimed against that supplier may be denied during scrutiny. CBIC Circular No. 183/15/2022-GST confirms that buyers bear due-diligence responsibility.
At scale — 50 or more new vendor certificates per month — teams resort to spot-checking, and the exception queue grows invisibly until a GST audit or annual vendor verification exercise forces a full review. Constitution mismatches (e.g., extracting “Private Limited Company” when the certificate reads “Limited Liability Partnership”) affect TDS applicability under Sections 194C vs. 194J of the Income Tax Act and go undetected.
Estimated ITC exposure per annum in a mid-market firm with 300 active vendors if even 2–3% of GSTINs on file carry an undetected checksum error or are subsequently cancelled. Section 50 of the CGST Act mandates interest on reversed ITC — compounding the cost of a vendor master that was never validated at onboarding.
Form GST REG-06 sits at the intersection of four overlapping rules — each one creating a specific obligation the vendor onboarding controller must satisfy before a supplier enters the master register.
Form GST REG-06 is the Certificate of Registration issued under Rule 10(1) of the CGST Rules, 2017. It is the only government-prescribed format that carries a GSTIN, constitution of business, principal place of business address, date of liability and period of validity in a single structured document — making it the canonical source for vendor master data.
Input Tax Credit is admissible only up to the credit available in the buyer’s auto-populated GSTR-2B. A supplier with a forged or corrupted GSTIN will not file valid GSTR-1 returns under that GSTIN, meaning every ITC claim associated with that vendor is at risk of reversal — plus interest under Section 50 — during any future scrutiny or audit.
Circular No. 183/15/2022-GST clarifies that a buyer cannot claim ignorance of a supplier’s registration status or the authenticity of its GSTIN. Controllers who accept certificates without programmatic verification are exposed to demand notices under Sections 73 and 74 of the CGST Act if ITC is later found inadmissible.
Under ICAI’s Standard on Auditing SA 505 (External Confirmations), auditors may independently circularise GSTIN validity as part of an indirect tax audit. A vendor master holding unvalidated or checksum-failed GSTINs will trigger qualified findings. Cadel’s per-certificate audit trail — with computed vs. extracted check characters — constitutes the documentary evidence SA 505 requires the controller to retain.
Seven deterministic passes from raw Form GST REG-06 PDF to a validated, vendor-master-ready record, in under 30 seconds per certificate, with a binary pass/fail result for each of four checks that every controller can audit without re-running the computation.
Accepts PDF uploads through the Cadel inbox. The classifier identifies each file as a Cert document type by matching the Form GST REG-06 header, sub-header (“Certificate of Registration under the CGST Act, 2017”), and the presence of a GSTIN field — routing only valid certificate PDFs to the extraction pipeline.
The OCR and extraction layer isolates nine named fields from each certificate: gstin, legal_name, trade_name, constitution, principal_address, date_of_liability, valid_from, registration_type and approving_authority. Fields not present in the PDF are recorded as null, not silently omitted.
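The nine-field record can be sketched as a small Python dataclass. Only the field names come from the list above; the class name `CertRecord`, the helper `from_extracted` and the sample values are hypothetical. The sketch illustrates the null-handling rule: every field is always present, and anything missing from the OCR output is stored as `None` rather than dropped.

```python
from dataclasses import dataclass, fields, asdict
from typing import Optional


@dataclass
class CertRecord:
    """One Form GST REG-06 certificate; field names follow the list above."""
    gstin: Optional[str] = None
    legal_name: Optional[str] = None
    trade_name: Optional[str] = None
    constitution: Optional[str] = None
    principal_address: Optional[str] = None
    date_of_liability: Optional[str] = None
    valid_from: Optional[str] = None
    registration_type: Optional[str] = None
    approving_authority: Optional[str] = None


def from_extracted(raw: dict) -> CertRecord:
    """Build a record from a (hypothetical) OCR output dict.

    Keys absent from `raw` stay None, so a missing field is recorded
    explicitly instead of being silently omitted. Unknown keys are ignored.
    """
    known = {f.name for f in fields(CertRecord)}
    return CertRecord(**{k: v for k, v in raw.items() if k in known})


# Sample values are illustrative only, not taken from the demo certificates.
record = from_extracted({"gstin": "27AAPFU0939F1ZV", "legal_name": "Example Traders LLP"})
row = asdict(record)  # all nine keys present; unextracted ones are None
```

Keeping the nulls explicit means a downstream export always has the same columns, whichever fields a given PDF happened to yield.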
Confirms that the extracted GSTIN is exactly 15 characters long, that positions 1–2 are a recognised two-digit numeric state code (01 through 38), that positions 3–12 follow the PAN-derived alphanumeric pattern, and that position 14 is the letter Z. Any structural violation fires a FAIL immediately — before the checksum step is even attempted.
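A minimal sketch of this structural check in Python follows. The regular expression is an assumption based on the commonly cited GSTIN layout (PAN as five letters, four digits, one letter; entity character 1–9 or A–Z); the function name and failure messages are hypothetical and not taken from Cadel.

```python
import re
from typing import Tuple

# Assumed layout: 2-digit state code, PAN (5 letters, 4 digits, 1 letter),
# one entity character, the literal Z, then one base-36 check character.
GSTIN_RE = re.compile(r"(\d{2})[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def check_structure(gstin: str) -> Tuple[bool, str]:
    """Return (passed, reason). Intended to run before any checksum work."""
    if len(gstin) != 15:
        return False, f"length is {len(gstin)}, expected 15"
    match = GSTIN_RE.fullmatch(gstin)
    if match is None:
        if gstin[13] != "Z":
            return False, "position 14 is not Z"
        return False, "state code or PAN pattern malformed"
    if not 1 <= int(match.group(1)) <= 38:
        return False, f"state code {match.group(1)} outside 01-38"
    return True, "ok"


# Sample GSTIN widely used in public examples; not one of the demo certificates.
print(check_structure("27AAPFU0939F1ZV"))
print(check_structure("27AAPFU0939F1YV"))
```

Returning a reason string alongside the boolean lets the exception queue show why a certificate failed without re-running the check.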
Applies the GSTN-specified base-36 variant of the Luhn algorithm over positions 1–14 of the GSTIN and compares the computed check character to the extracted character at position 15. A mismatch — such as extracted O vs. computed 4 — fires a FAIL with the expected value shown inline, giving the controller the exact evidence needed to reject the certificate.
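The checksum step can be sketched as below. The text above does not spell out the weights, so this follows the commonly published form of the GSTIN check-character computation. The alternating 1, 2 weighting from the left and the quotient-plus-remainder fold in base 36 are assumptions, and the function names are hypothetical. The input is assumed to have passed the structural check already, so it contains only uppercase base-36 characters.

```python
from typing import Tuple

CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # base-36 digits, 0..35


def check_char(body: str) -> str:
    """Compute the check character for the first 14 GSTIN characters.

    Assumed weighting: factors alternate 1, 2, 1, 2, ... from the left,
    each product is folded as quotient + remainder in base 36, and the
    check value is whatever brings the total to a multiple of 36.
    """
    if len(body) != 14:
        raise ValueError("expected the first 14 characters of a GSTIN")
    total = 0
    for i, ch in enumerate(body):
        product = CHARSET.index(ch) * (1 if i % 2 == 0 else 2)
        total += product // 36 + product % 36
    return CHARSET[(36 - total % 36) % 36]


def check_checksum(gstin: str) -> Tuple[bool, str, str]:
    """Return (passed, extracted, computed) so a FAIL can show both values."""
    extracted, computed = gstin[14], check_char(gstin[:14])
    return extracted == computed, extracted, computed


# Sample GSTIN widely used in public examples; not one of the demo certificates.
print(check_checksum("27AAPFU0939F1ZV"))
# Same GSTIN with two adjacent PAN digits transposed (0939 -> 9039).
print(check_checksum("27AAPFU9039F1ZV"))
```

Under these assumptions the first call passes. The second is still 15 characters with a valid state code and Z at position 14, but it fails because the computed check character no longer matches the extracted one.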
Confirms that the legal_name field is non-null and non-empty after extraction. The legal name must match the PAN-linked entity name in the supplier master; a missing or blank field indicates OCR failure or a corrupted PDF, and the certificate is routed to the exception queue for manual review regardless of the GSTIN result.
Confirms that principal_address is non-null and non-empty. The principal place of business address is required for state-specific compliance obligations and for matching the supplier in the e-invoicing IRP system. A blank address — from a partially printed or truncated PDF — fires a FAIL independently of the GSTIN checks.
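The two presence checks are simple enough to sketch together. Treating a whitespace-only string as empty is an assumption, and the function names are hypothetical; the field names come from the extraction step.

```python
from typing import Optional


def is_present(value: Optional[str]) -> bool:
    """True when a field is non-null and not blank after trimming whitespace."""
    return value is not None and value.strip() != ""


def check_presence(record: dict) -> dict:
    """Evaluate both presence checks independently of the GSTIN result."""
    return {
        "legal_name": is_present(record.get("legal_name")),
        "principal_address": is_present(record.get("principal_address")),
    }


# A truncated PDF might yield a name but only whitespace for the address.
print(check_presence({"legal_name": "Example Traders LLP", "principal_address": "  "}))
```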
Each certificate receives a Valid badge when all four checks pass or an Invalid badge when any check fails; failed records are routed to the exception queue. All extracted fields and validation outcomes across the full batch are written to a structured Excel export — one row per certificate, one validation-result column per check — ready to import into Tally Prime, Zoho Books, SAP or any ERP vendor master.
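A rough sketch of the badge rule and the one-row-per-certificate export follows. CSV from the standard library stands in for the Excel file to keep the example dependency-free. The column names, function names and sample rows are hypothetical.

```python
import csv
import io

CHECKS = ["structure", "checksum", "legal_name", "principal_address"]


def badge(results: dict) -> str:
    """Valid only when every one of the four checks passed."""
    return "Valid" if all(results[c] for c in CHECKS) else "Invalid"


def export_rows(batch: list) -> str:
    """Write one row per certificate and one PASS/FAIL column per check.

    `batch` is a list of (extracted_fields, check_results) pairs.
    """
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["gstin", "legal_name"] + [f"check_{c}" for c in CHECKS] + ["badge"])
    for extracted, results in batch:
        writer.writerow(
            [extracted.get("gstin"), extracted.get("legal_name")]
            + ["PASS" if results[c] else "FAIL" for c in CHECKS]
            + [badge(results)]
        )
    return out.getvalue()


def exception_queue(batch: list) -> list:
    """Failed records only, in their original order."""
    return [item for item in batch if badge(item[1]) == "Invalid"]


ok = ({"gstin": "27AAPFU0939F1ZV", "legal_name": "A"}, dict.fromkeys(CHECKS, True))
bad = ({"gstin": "27AAPFU9039F1ZV", "legal_name": "B"},
       {**dict.fromkeys(CHECKS, True), "checksum": False})
print(export_rows([ok, bad]))
```

Deriving the badge from the per-check columns, rather than storing it separately, keeps the export self-consistent: a reviewer can always see which check produced an Invalid result.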
Five synthetic test scenarios that exercise every failure mode observed in real-world certificate batches. Each scenario produces a deterministic outcome an auditor or controller can verify in seconds.
29ACMEA1234B1Z5: all 15 positions structurally correct, base-36 Luhn check character at position 15 resolves to the expected value.
[…] 27ACMEA1234B1Z5. The constitution field reads “Limited Liability Partnership”, a value that differs from the common “Private Limited Company” and must be extracted verbatim without normalisation errors that would misclassify TDS obligations.
[…] Z at position 14, but the final check character O does not match the base-36 Luhn computed value of 4, indicating the GSTIN was manually altered after issuance.
[…] O, computed 4). Inbox badge: Invalid. Record routed to exception queue with computed vs. extracted values shown inline.
[…] 0 as the letter O in the GSTIN body. The 15-character count is preserved and the state code is valid, so a format-only check reports PASS. Only the checksum detects the substitution.
Three seeded Form GST REG-06 certificates, each engineered to exercise a specific validation scenario. Two pass cleanly across all four checks. One triggers the checksum FAIL that a format-only review would have missed.
29ACMEA1234B1Z5 · Karnataka (state 29)
Demonstrates a clean end-to-end pass on all four validations including the base-36 Luhn checksum. Approving authority details (Superintendent of Central Tax, Bengaluru West Commissionerate) extracted into a structured field alongside the principal address.
27ACMEA1234B1Z5 · Maharashtra (state 27)
Demonstrates multi-state support and verbatim extraction of the “Limited Liability Partnership” constitution value. The LLP constitution affects TDS rate applicability under Section 194C vs. 194J of the Income Tax Act; a normalisation error here would misclassify the deduction category for all future payments to this vendor.
The GSTIN passes every visual and structural test — 15 characters, numeric state code 06, Z at position 14 — which is exactly what a manual review would have confirmed and cleared. The base-36 Luhn computation over positions 1–14 produces the expected check character 4, not the extracted character O. Without automated checksum validation, this certificate would have entered the vendor master undetected.
Running the workflow against all three certificates produced the following outcomes. Certificates cert_001_karnataka_valid.pdf (GSTIN 29ACMEA1234B1Z5, Karnataka, Private Limited Company) and cert_002_maharashtra_llp.pdf (GSTIN 27ACMEA1234B1Z5, Maharashtra, LLP) each passed all four validations: GSTIN structure, base-36 Luhn checksum, legal name presence, and principal address presence. Across the two clean certificates, 8 of 8 validation checks resolved to PASS, all nine structured fields were populated in each record, and the constitution values — “Private Limited Company” and “Limited Liability Partnership” respectively — were extracted verbatim without normalisation.
Certificate cert_003_tampered_invalid.pdf (Haryana, state 06) demonstrated the checksum validation’s practical value. The GSTIN passed the structural check — 15 characters, state code 06, Z at position 14 — precisely what a visual review would have confirmed and cleared. However, the base-36 Luhn computation over positions 1–14 produced the expected check character 4, not the extracted character O. The workflow fired a FAIL on the checksum check, set the inbox badge to Invalid, and routed the record to the exception queue. The Excel export logged the extracted character, the computed expected character, and the timestamp of the validation — the exact evidence chain required under ICAI SA 230 (Audit Documentation) for a controller’s working paper file.
For a vendor onboarding team processing 50–300 Form GST REG-06 certificates per month, automated GSTIN validation replaces up to an estimated 25–50 person-hours of manual field transcription per onboarding cycle (300 certificates at 5–10 minutes each) with a deterministic four-check pipeline that runs in under 30 seconds per certificate, and catches the one class of error that no manual process can reliably detect at scale.
Applying the GSTN base-36 Luhn algorithm deterministically over every certificate — not just format-checking the 15-character length — catches forged, OCR-corrupted and copy-paste-transposed GSTINs that pass all structural rules. These are the GSTINs that would later surface as ITC reversal demands under Section 50 of the CGST Act if admitted undetected.
Each certificate run produces a structured Excel artifact — extracted fields, per-check validation results, computed vs. extracted check characters, timestamp — that meets the documentation standard under ICAI SA 230 (Audit Documentation) and satisfies the due-diligence evidence requirement established by Circular No. 183/15/2022-GST. Attachable directly to the vendor onboarding working paper without further preparation.
The structured Excel export maps directly to vendor master fields in Tally Prime, Zoho Books, SAP Business One, and Oracle NetSuite — GSTIN, legal name, constitution, registration type and principal address all populated in the correct columns. Constitution extracted verbatim (not normalised) preserves the TDS rate distinction between “Private Limited Company” and “Limited Liability Partnership” without downstream correction.
The questions compliance controllers, vendor onboarding managers and internal auditors ask before deploying GST certificate validation automation.
The GSTIN format is specified by the Goods and Services Tax Network (GSTN) under the Central Goods and Services Tax Act, 2017 (CGST Act). The 15-character structure — two-digit state code, ten-character PAN, entity number, Z at position 14, and a base-36 Luhn check character at position 15 — is defined in the GSTN technical specification for taxpayer registration. The checksum algorithm is a base-36 variant of the Luhn algorithm and Cadel implements it deterministically: there is no heuristic or probabilistic element. Form GST REG-06 is the prescribed certificate format under Rule 10(1) of the CGST Rules, 2017.
Yes. A GSTIN can have exactly 15 characters, a valid numeric state code, and Z at position 14 — satisfying all structural rules — while carrying an incorrect check character at position 15. This is precisely the scenario demonstrated by cert_003_tampered_invalid.pdf, where the extracted character is O but the computed value is 4. Manual alteration of a GSTIN on a PDF, OCR misreads of ambiguous characters (e.g., 0 vs O), and copy-paste errors in vendor master data all produce structurally valid but arithmetically invalid GSTINs. A format-only check cannot catch these errors; only the checksum computation can.
Cadel produces a structured Excel export containing all extracted fields — GSTIN, legal name, trade name, constitution, principal address, registration type, date of liability, validity dates and approving authority — along with a per-certificate validation status column. This file can be imported directly into Tally Prime’s ledger creation screen, Zoho Books’ contact master, or any ERP vendor master via the standard CSV/Excel import. For companies running SAP Business One or Oracle NetSuite, the same export maps to vendor master fields without transformation.
Each uploaded PDF is assigned a unique document ID, and the workflow records the extracted field values, the raw OCR output, the four validation outcomes (structure, checksum, legal name, principal address), and a timestamp for each step. Validation failures are logged with the computed vs. extracted check character so that an internal auditor or a GST practitioner can confirm the finding without re-running the calculation. The exception queue preserves all failed records in their original state alongside the failure reason, satisfying the documentation requirements under ICAI SA 230 (Audit Documentation) for evidence of third-party credential verification.
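One way to picture such an audit entry is the sketch below. The class name, field names and JSON layout are illustrative assumptions, not Cadel's actual schema; the sample check characters are likewise made up for the example.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional


def _now() -> str:
    """UTC timestamp in ISO 8601 form."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class AuditEntry:
    """Per-certificate trail: raw OCR, extracted fields, timestamped outcomes."""
    raw_ocr: str
    extracted: dict
    document_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    steps: list = field(default_factory=list)

    def log(self, check: str, passed: bool,
            extracted_char: Optional[str] = None,
            computed_char: Optional[str] = None) -> None:
        """Append one timestamped validation outcome."""
        self.steps.append({
            "check": check,
            "result": "PASS" if passed else "FAIL",
            "extracted_char": extracted_char,
            "computed_char": computed_char,
            "at": _now(),
        })


entry = AuditEntry(raw_ocr="raw OCR text placeholder",
                   extracted={"gstin": "27AAPFU9039F1ZV"})
entry.log("structure", True)
entry.log("checksum", False, extracted_char="V", computed_char="M")
print(json.dumps(asdict(entry), indent=2))
```

Storing the extracted and computed characters next to the FAIL is what lets a reviewer confirm the finding from the log alone.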
Yes. The constitution field is extracted as a free-text string directly from the certificate, so values such as “Proprietorship”, “Partnership Firm”, “Limited Liability Partnership”, “Private Limited Company”, “Public Limited Company”, and “HUF” are all captured verbatim without normalisation. This matters because the constitution type determines TDS rate applicability: under Section 194C of the Income Tax Act, payments to an individual or HUF attract 1% TDS while payments to any other person, including a company, firm or LLP, attract 2%. A misclassification caused by silent normalisation would persist in every payment run for that vendor.
Yes. The state code occupies characters 1–2 of the GSTIN and ranges from 01 (Jammu & Kashmir) to 38 (Ladakh), covering all 28 states and 8 union territories recognised under the CGST Act. The workflow validates the numeric format of the state code as part of the structural check but does not restrict processing to any specific state. The three demo certificates cover Karnataka (29), Maharashtra (27) and Haryana (06), illustrating multi-state operation.
The checksum workflow runs fully offline using the deterministic GSTN base-36 Luhn algorithm — no portal lookup is required or performed during extraction. This makes the process instantaneous and auditable: the computed expected check character is logged alongside the extracted character for every certificate. For live GSTN portal verification (to confirm the GSTIN is currently active and not cancelled), the validated GSTIN list produced by this workflow can be fed into a separate online verification step via the GSTN taxpayer search API, which is a distinct process outside the scope of this certificate extraction workflow.
The checksum validation confirms that the GSTIN was structurally correct and arithmetically valid at the time of its issuance — it cannot detect subsequent cancellation or suspension, which requires a live GSTN portal query under the CGST Act’s registration management provisions. The workflow is designed as a fast, offline first-pass filter: it eliminates forged or corrupted certificates immediately (which a portal query cannot prioritise), and the validated GSTIN list can then be submitted for portal status checks in bulk. This two-stage approach is consistent with the due-diligence standard described in Circular No. 183/15/2022-GST.