Forensic PDF API

Built for fraud ops teams in lending, insurance, and compliance

HTPBE? is a forensic PDF API. Every request runs the same byte-level analysis a fraud analyst would perform manually — cross-reference table, incremental updates, producer field, font subset prefixes, signature chain, image streams — and returns a structured verdict with named markers. One POST, under 3 seconds, no UI. Built for fraud ops at lenders, insurance carriers, and document-heavy back offices. Self-serve API key at signup, no sales call, free test keys on all plans. From $15/mo.

~3 sec per document
60 checks across forensic layers
From $15 per month
1,500+ docs / month on Pro

The problem

Modern document fraud is invisible to visual review

In a growing class of document fraud, someone opens a genuine PDF, edits a balance, a date, or a beneficiary, and re-saves it. Visually nothing changes: the document passes pixel-level review, layout review, and KYC.

Structural PDF analysis reads the layers rendering engines never expose: revision history, object structure, signature coverage maps. That is where edits leave fingerprints they cannot wipe.

Common tampering patterns

  • Modified balances or totals after export
  • Swapped IBAN or beneficiary on invoices
  • Post-signature edits on contracts
  • Backdated issue and modification dates
  • Fabricated documents from consumer PDF tools

What this looks like

What the Forensic PDF API Analyzes

Six real fraud mechanics we catch at the structural PDF layer.

01

Cross-reference table and incremental updates

Every edit saved as an incremental update appends a new cross-reference section to the end of the PDF. The forensic PDF API counts these revision layers and reports how many times the file was re-saved. A bank statement carrying three incremental update sections was re-saved three times after the bank generated it.
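
As a rough illustration of the idea (not the API's actual implementation), the revision count can be approximated by scanning the raw bytes for trailer markers. The function names below are hypothetical, and the sketch assumes a non-linearized PDF: linearized files can carry two trailers without ever having been edited.

```python
import re

# Each saved revision normally ends with: startxref <offset> %%EOF
TRAILER = re.compile(rb"startxref\s+\d+\s+%%EOF")


def count_xref_sections(pdf_bytes: bytes) -> int:
    """Count trailer markers in the raw file bytes."""
    return len(TRAILER.findall(pdf_bytes))


def count_incremental_updates(pdf_bytes: bytes) -> int:
    """Revisions beyond the original save (assumes a non-linearized file)."""
    return max(count_xref_sections(pdf_bytes) - 1, 0)


# Synthetic fragments standing in for a real file, for illustration only.
original = b"%PDF-1.7\n1 0 obj\n<<>>\nendobj\nxref\n0 2\ntrailer\n<<>>\nstartxref\n30\n%%EOF\n"
update = b"1 0 obj\n<< /Edited true >>\nendobj\nxref\n1 1\ntrailer\n<< /Prev 30 >>\nstartxref\n120\n%%EOF\n"

print(count_incremental_updates(original))           # original only
print(count_incremental_updates(original + update))  # one appended revision
```

A byte scan like this can be fooled by marker text inside uncompressed streams, so a production check would walk the `/Prev` chain from the last trailer instead.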

02

Producer and creator field forensics

The API maintains a database of hundreds of PDF tool signatures — institutional generators (banking portals, payroll engines, IRS e-file), consumer editors (iLovePDF, Smallpdf, PDF24), and online forgery tools. When the producer field on a document that should reflect institutional software matches a known editing tool, the API returns the tool name and a high-confidence marker.
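
A minimal sketch of the producer lookup, assuming the `/Producer` entry is stored as an unencrypted literal string in the document information dictionary. The three-entry list and the function names are illustrative stand-ins, not the API's signature database.

```python
import re
from typing import Optional

# Illustrative subset only; the real signature database is not public.
KNOWN_EDITORS = ("ilovepdf", "smallpdf", "pdf24")

# Literal string after /Producer, allowing backslash-escaped characters.
PRODUCER = re.compile(rb"/Producer\s*\(((?:\\.|[^\\)])*)\)")


def extract_producer(pdf_bytes: bytes) -> Optional[str]:
    """Return the first literal-string /Producer value, or None if absent."""
    match = PRODUCER.search(pdf_bytes)
    return match.group(1).decode("latin-1") if match else None


def match_known_editor(producer: Optional[str]) -> Optional[str]:
    """Return the matching editor name, or None when nothing matches."""
    if producer is None:
        return None
    lowered = producer.lower()
    for name in KNOWN_EDITORS:
        if name in lowered:
            return name
    return None


sample = b"<< /Producer (iLovePDF) /Creator (Banking Portal) >>"
print(match_known_editor(extract_producer(sample)))
```

Producer values stored as hex strings, in UTF-16, in XMP metadata, or inside compressed object streams would need a real PDF parser rather than a regular expression.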

03

Signature chain analysis

The API detects whether a document was digitally signed, whether the signature is still valid, and whether modifications were applied after signing. Post-signature edits and signature removal both return certain-confidence markers — these are the strongest tamper signals in PDF forensics.
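
One structural part of this check can be sketched from the signature's `/ByteRange` array, which lists the byte spans the signature digest covers. If the covered range ends before the end of the file, bytes were appended after signing. The function name is hypothetical, and the sketch does not verify the cryptographic signature itself.

```python
import re
from typing import List

# /ByteRange [start1 length1 start2 length2]
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def uncovered_trailing_bytes(pdf_bytes: bytes) -> List[int]:
    """For each signature, count the bytes that follow its covered range."""
    trailing = []
    for match in BYTE_RANGE.finditer(pdf_bytes):
        _start1, _len1, start2, len2 = (int(group) for group in match.groups())
        covered_end = start2 + len2
        trailing.append(len(pdf_bytes) - covered_end)
    return trailing


# Synthetic 50-byte "file" whose signature covers everything up to byte 50.
signed = b"/ByteRange [0 10 20 30]".ljust(50)
print(uncovered_trailing_bytes(signed))               # nothing after signing
print(uncovered_trailing_bytes(signed + b"x" * 7))    # 7 bytes appended later
```

In a document with several signatures, earlier signatures legitimately leave later revisions uncovered, so only the final signature's trailing count is a tamper signal on its own.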

04

Font subset prefix divergence

When a PDF is composed from multiple source files or edited across independent sessions, the font subset prefixes diverge between pages. This is invisible to a reader but is a clean structural signal of composite assembly. The forensic PDF API surfaces this divergence as a named marker.
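
A simplified version of this signal can be sketched by collecting subset-tagged font names, which take the form of six uppercase letters, a plus sign, and the base font name. The sketch below groups prefixes per base font across the whole file rather than per page, and the function names are hypothetical. Several prefixes for one font can also occur in legitimate files, so treat the output as a lead rather than a verdict.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Matches e.g. /BaseFont /ABCDEF+Helvetica
SUBSET_NAME = re.compile(
    rb"/(?:BaseFont|FontName)\s*/([A-Z]{6})\+([^\s/<>\[\]()]+)"
)


def subset_prefixes_by_font(pdf_bytes: bytes) -> Dict[str, Set[str]]:
    """Map each base font name to the set of subset prefixes seen for it."""
    prefixes: Dict[str, Set[str]] = defaultdict(set)
    for prefix, base in SUBSET_NAME.findall(pdf_bytes):
        prefixes[base.decode("latin-1")].add(prefix.decode("ascii"))
    return dict(prefixes)


def divergent_fonts(pdf_bytes: bytes) -> Dict[str, List[str]]:
    """Fonts that appear under more than one subset prefix."""
    return {
        font: sorted(found)
        for font, found in subset_prefixes_by_font(pdf_bytes).items()
        if len(found) > 1
    }


sample = (
    b"<< /BaseFont /ABCDEF+Helvetica >> "
    b"<< /BaseFont /ABCDEF+Helvetica >> "
    b"<< /BaseFont /QWERTY+Helvetica >> "
    b"<< /BaseFont /ZZZZZZ+Times-Roman >>"
)
print(divergent_fonts(sample))
```

Font dictionaries stored inside compressed object streams are invisible to a raw byte scan, so a fuller check would decompress those streams first.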

05

Image stream and JPEG-level anomalies

Tamper attempts often involve re-encoding raster regions of a PDF. The API inspects JPEG quantization tables and APP markers, flags non-genuine compression patterns, and detects when raster overlays were stamped onto an otherwise digitally-generated document.
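
To make the JPEG-level inspection concrete, here is a hedged sketch that walks the header segments of a JPEG stream (as extracted from a PDF image object using the DCTDecode filter) and collects its quantization tables and APP markers. The function names are hypothetical, and comparing the tables against known encoder profiles is left out.

```python
import struct
from typing import Dict, Iterator, List, Tuple


def jpeg_segments(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (marker, payload) for header segments up to start-of-scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker == 0xDA:  # start of scan: compressed data follows
            return
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def summarize_jpeg(data: bytes) -> Tuple[Dict[int, bytes], List[str]]:
    """Return quantization tables keyed by table id, plus APPn marker names."""
    tables: Dict[int, bytes] = {}
    app_markers: List[str] = []
    for marker, payload in jpeg_segments(data):
        if marker == 0xDB:  # DQT segment, may hold several tables
            i = 0
            while i < len(payload):
                precision, table_id = payload[i] >> 4, payload[i] & 0x0F
                size = 128 if precision else 64
                tables[table_id] = payload[i + 1:i + 1 + size]
                i += 1 + size
        elif 0xE0 <= marker <= 0xEF:
            app_markers.append("APP%d" % (marker - 0xE0))
    return tables, app_markers


# Synthetic header: SOI, APP0 (JFIF), one 8-bit DQT table, SOS.
sample = (
    b"\xff\xd8"
    b"\xff\xe0\x00\x07JFIF\x00"
    b"\xff\xdb\x00\x43\x00" + bytes(range(1, 65)) +
    b"\xff\xda\x00\x02"
)
tables, apps = summarize_jpeg(sample)
print(sorted(tables), apps)
```

Two images in the same document with different quantization tables or APP markers suggest they passed through different encoders, which is the kind of inconsistency the paragraph above describes.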

06

Date and metadata consistency

Creation, modification, and metadata-update timestamps are cross-checked. A document where modification follows creation by days, weeks, or months — when the document type should be a single-session institutional export — returns a named marker explaining the gap.
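
The timestamp comparison can be sketched as follows, assuming both values use the standard PDF date syntax (`D:YYYYMMDDHHmmSS` with an optional timezone suffix). The function names are hypothetical, and a missing timezone is treated as UTC for simplicity.

```python
import re
from datetime import datetime, timedelta, timezone

PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+\-])(?:(\d{2})'?(\d{2})?'?)?)?"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a timezone-aware datetime."""
    match = PDF_DATE.match(value)
    if not match:
        raise ValueError("not a PDF date: %r" % value)
    defaults = (0, 1, 1, 0, 0, 0)  # month and day default to 1
    year, month, day, hour, minute, second = (
        int(group) if group else default
        for group, default in zip(match.groups()[:6], defaults)
    )
    sign, off_hours, off_minutes = match.group(7), match.group(8), match.group(9)
    offset = timedelta(hours=int(off_hours or 0), minutes=int(off_minutes or 0))
    if sign == "-":
        offset = -offset
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))


def modification_gap(creation: str, modification: str) -> timedelta:
    """Time elapsed between CreationDate and ModDate."""
    return parse_pdf_date(modification) - parse_pdf_date(creation)


gap = modification_gap("D:20240105093000+01'00'", "D:20240320110000Z")
print(gap.days)  # whole days between creation and modification
```

What counts as a suspicious gap depends on the document type; a threshold would be a policy choice layered on top of this calculation.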

60 forensic checks per document
~3 sec median analysis time, end to end
From $15 per month, self-serve, no sales call

The detection gap

KYC platforms check the document. HTPBE? checks the file.

Two different checks — both matter.

KYC & identity platforms

Plaid · Persona · Alloy · Jumio

  • Is this a real bank statement template?
  • Does the account number match the identity?
  • Is the document format consistent with the issuing bank?

Detects fake documents. Does not detect edited real documents.

HTPBE? tamper detection API

Structural PDF integrity

  • Was this specific PDF file modified after it was generated?
  • Do metadata timestamps match the file structure?
  • Were digital signatures valid at the time of signing?

Catches edits invisible to visual review and template checks.

Results in under 3 seconds · 30 to 1,500+ documents/month · From $15/mo

What HTPBE? checks

Detection capabilities

Deterministic structural signals. No probabilistic scores, no model training.

Producer signature mismatch

The PDF claims to come from one tool but the binary structure points to another. The first signal of post-export editing.

Incremental update trail

Every save after the original creates an incremental update. Long chains mean multiple editing sessions on the same file.

Multiple xref tables

Each incrementally saved editing session adds a new cross-reference section. Genuine institutional PDFs typically have one; tampered PDFs often have several.

Modification timestamp gap

An untouched single-session export typically has matching CreationDate and ModDate. A gap of months between them is a high-confidence forgery signal.

Digital signature validation

When a digital signature exists, we verify the coverage map. Modifications after signing return certain-confidence verdicts.

Font and object consistency

Edited text introduces new font subsets or objects with origin patterns inconsistent with the rest of the document.

Share with engineering

Wire this into your intake pipeline in under a day

Two API calls — one POST to submit the PDF, one GET to retrieve the verdict. Forward this page to your engineering team; the full API reference, quotas, and copy-paste examples in cURL, JavaScript, Python, PHP, Go, and Ruby are one click away.

Pricing

Self-serve plans, no sales call

All plans include the same forensic checks. Pick the quota that matches your monthly document volume.

Starter (manual) · $15/mo · 30 checks/mo · Manual spot-checks and integration testing

Growth (most common) · $149/mo · 350 checks/mo · Active document processing pipelines

Pro (high volume) · $499/mo · 1,500 checks/mo · High-volume automation and API integrations

Enterprise (unlimited, on-premise available): see full pricing at htpbe.tech/pricing.

API key on signup. Free test environment on every plan. No card required.

Customer Stories

Teams that stopped document fraud

Compliance, finance, and risk teams use HTPBE? to catch manipulated PDFs before they become costly mistakes.

Caught an invoice where the total had been changed by less than a thousand dollars. Without this I would have approved it without a second look.

Sarah M.

AP Manager

United States

We had three applicants in the same week with bank statements that looked completely fine. Two of them were flagged as modified. You simply cannot see this by reading the document — it is in the file structure.

Lars V.

Risk Analyst, Online Lending

Netherlands

Salary slips were coming with altered figures. We identified two problematic files before the placement was finalised.

Priya K.

HR Operations Lead

India

Since we started checking documents this way, we stopped two applications early in the process that would have been very difficult to reverse later.

Julien R.

Fraud Analyst, Fintech

France

Some applicants were sending PDFs that looked authentic but had been edited in ways not visible to the eye. We now ask for checked originals when something is flagged. Already saved us from a few bad decisions.

Marta S.

Compliance Coordinator

Spain

One invoice was caught because there was a mismatch between the document dates and structure. That particular case would have cost us significantly.

Tariq A.

Finance Manager

United Arab Emirates

FAQ

Frequently asked questions

How is this different from OCR-based document validation?

OCR reads pixels and tells you what the document says. The forensic PDF API reads the file bytes and tells you whether the document is what it claims to be. The two are complementary: OCR extracts the income figure, the forensic API confirms the file carrying that figure was not edited after the bank issued it.

What documents can the API analyze?

Any standard PDF up to 10 MB. The forensic PDF API works on bank statements, invoices, pay stubs, tax returns, contracts, certificates, insurance documents, and any other PDF-format file. Password-protected PDFs and image files (JPEG, PNG) are not supported.

Do I need to send the original file?

No. The forensic PDF API analyzes the structural layer of a single PDF in isolation — no original-file comparison required. Markers and verdict are derived from the file itself: its xref structure, producer field, signature chain, font subsets, and metadata.

How fast is the response?

Under 3 seconds for most documents. The API is synchronous — the response returns once analysis is complete. There is no polling, webhook, or asynchronous job to track.

What is the pricing?

Starter is $15/month with 30 live requests included. Growth, Pro, and Enterprise tiers are listed at htpbe.tech/pricing. Test API keys are free and unlimited on every plan, including the free tier — use them to build the integration before paying.

Secure your workflow

Create your account — API key on signup, free test environment on every plan.
From $15/mo. No sales call. Cancel any time.