Unknown-Vendor Detection

[Screenshot: Unknown-Vendor Detection in production, from the 📋 ERP demo]

Production anchor: First-time vendor triggers the no-prediction-yet path; visible flag at PO entry, caught before processing, not after.

The problem

Unknown vendors are the slowest fraud and shadow-spend pattern to surface in a typical AP system. A new vendor is added; one or two invoices go through; nobody notices for months, until the quarterly vendor review surfaces an unfamiliar name. By then there's a vendor-master entry, several posted invoices, and possibly a small expense pattern that bypassed procurement.

The signal is available at entry time. A vendor with no prior history in the database is, by definition, unknown to the prediction system. The query returns low confidence; the system says "no prediction yet — first-time vendor." That should be a flag at PO entry, not a discovery at quarterly review.

How it works

_predict on the PO entry returns a calibrated probability per prediction target (account, cost center, approver). When the vendor has no prior history, all predictions return low confidence — typically below 30%. The application reads this as the unknown-vendor signal and flags the PO for manual review.

The mechanism composes with the broader anomaly pattern. Inverse-prediction anomaly catches misrouted invoices from known vendors; unknown-vendor anomaly catches invoices from vendors with no history at all. Together with amount-spike detection, they form the anomaly trio; all three surface from the same _predict operator and the same calibration.

{
  "from": "invoices",
  "where": {
    "vendor": "NewVendor LLC",
    "category": "office_supplies",
    "amount": 1240.00
  },
  "predict": "gl_code",
  "select": ["$p", "$why"],
  "limit": 1
}

// $p < 0.30 across all predictions → flag as unknown-vendor.
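The flagging rule in the comment above can be sketched on the application side roughly as follows. This is a minimal illustration, not the demo's actual code: it assumes each _predict response carries a "hits" list whose entries include "$p", and the target names cost_center and approver, along with the helper names top_probability and is_unknown_vendor, are hypothetical.

```python
from typing import Any, Mapping

# From the text: $p < 0.30 on every prediction target means "unknown vendor".
UNKNOWN_VENDOR_THRESHOLD = 0.30


def top_probability(response: Mapping[str, Any]) -> float:
    """Return the highest $p in a response, or 0.0 when there are no hits.

    Assumption: the response has a "hits" list and each hit carries "$p".
    """
    hits = response.get("hits", [])
    return max((hit["$p"] for hit in hits), default=0.0)


def is_unknown_vendor(
    responses: Mapping[str, Mapping[str, Any]],
    threshold: float = UNKNOWN_VENDOR_THRESHOLD,
) -> bool:
    """Flag the PO only when every prediction target falls below the threshold."""
    return all(top_probability(r) < threshold for r in responses.values())


# Hypothetical responses for one PO, keyed by prediction target.
responses = {
    "gl_code": {"hits": [{"$p": 0.12}]},
    "cost_center": {"hits": [{"$p": 0.08}]},
    "approver": {"hits": [{"$p": 0.21}]},
}

if is_unknown_vendor(responses):
    print("no prediction yet: first-time vendor, route to manual review")
```

Requiring every target to fall below the threshold keeps a vendor with one confident prediction out of the unknown-vendor queue; whether that is the right rule for a given deployment is an application decision.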

For the full architecture, see the technology overview. For the broader narrative across multiple use cases, read The Predictive Application.

See it live

This use case runs in the 📋 ERP demo today. Click through to the live application and inspect the queries that produce the result. Source is on GitHub under Apache 2.0.

Open the live demo →

Frequently asked

How is this different from a procurement-policy vendor-master rule?

Procurement policy enforces "new vendors require approval before first invoice." Unknown-vendor detection enforces "first invoice from any vendor surfaces for review, regardless of whether the vendor was pre-approved." The two layers compose; the unknown-vendor detection catches the edge cases where the policy was bypassed.

Does this work in multi-tenant deployments?

Yes. Conditional probability is scoped per tenant via customer_id. A vendor known to one tenant is treated as unknown to another tenant the first time it appears there. Multi-tenant accounting deployments at accounting.aito.ai run this pattern across 255 tenants.
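As a rough sketch of what tenant scoping could look like in the query body, the helper below adds customer_id to the where clause of the query shown earlier. It assumes customer_id is a column on the invoices table; the function name build_predict_query and the tenant identifiers are hypothetical.

```python
from typing import Any, Dict


def build_predict_query(
    customer_id: str, vendor: str, category: str, amount: float, target: str
) -> Dict[str, Any]:
    """Build a _predict query body whose evidence is limited to one tenant."""
    return {
        "from": "invoices",
        "where": {
            "customer_id": customer_id,  # tenant scope (assumed column name)
            "vendor": vendor,
            "category": category,
            "amount": amount,
        },
        "predict": target,
        "select": ["$p", "$why"],
        "limit": 1,
    }


# The same vendor queried for two hypothetical tenants: the bodies differ
# only in customer_id, so each tenant's history is evaluated separately.
query_a = build_predict_query("tenant-a", "NewVendor LLC", "office_supplies", 1240.00, "gl_code")
query_b = build_predict_query("tenant-b", "NewVendor LLC", "office_supplies", 1240.00, "gl_code")
```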

What threshold defines "unknown" vs "low-confidence-known"?

Typical threshold: top-1 prediction confidence < 30% across the prediction targets. At or above 30%, the vendor is treated as known-but-uncertain (predictions surface as suggestions). Below 30%, the vendor is treated as unknown (manual review required).

Can the detection be tuned per category or amount?

Yes. Use stricter thresholds for high-amount or high-stakes categories and looser ones for routine spend. "Unknown-vendor + amount > €5,000" is the strictest cell; "unknown-vendor + amount < €100" can be looser. This is application-level configuration on top of the prediction.
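One way such configuration might look in application code is sketched below. Only the 30% default and the €5,000 and €100 cut-offs come from the text; the 0.50 and 0.15 values, the professional_services category, and the function names are illustrative assumptions.

```python
DEFAULT_THRESHOLD = 0.30  # typical threshold from the text
STRICT_THRESHOLD = 0.50   # assumed value for the strictest cell
LOOSE_THRESHOLD = 0.15    # assumed value for routine low-value spend

# Hypothetical placeholder; replace with the categories that matter locally.
HIGH_STAKES_CATEGORIES = frozenset({"professional_services"})


def review_threshold(amount: float, category: str) -> float:
    """Pick the confidence threshold for a PO from its amount (EUR) and category."""
    if amount > 5000 or category in HIGH_STAKES_CATEGORIES:
        return STRICT_THRESHOLD
    if amount < 100:
        return LOOSE_THRESHOLD
    return DEFAULT_THRESHOLD


def needs_manual_review(top_p: float, amount: float, category: str) -> bool:
    """True when the top-1 confidence falls below the threshold for this PO."""
    return top_p < review_threshold(amount, category)


# The same confidence gets different treatment depending on the amount.
print(needs_manual_review(0.40, 6200.00, "office_supplies"))  # True
print(needs_manual_review(0.40, 80.00, "office_supplies"))    # False
```

Here "stricter" means a higher confidence bar before a vendor counts as known, so more POs in that cell go to manual review.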

Does the system learn from approved unknown-vendor cases?

Yes. Once the vendor's first invoice is approved and posted, the data enters the index. The next invoice from the same vendor has 1 observation in history; by the third, it's typically known with usable confidence. The transition from unknown to known is automatic.

Related use cases

Inverse Prediction Anomaly

Amount-Spike Detection

PO Queue Routing

GL Coding
