Problem–Solution: Slow, Risky Supplier Contract Review, Solved with AI
The problem: expensive risk hiding on page 14
Procurement teams at mining, manufacturing, and FMCG companies sign hundreds of supplier contracts a year: heavy-equipment supply deals, service agreements, distribution agreements, leases. In every contract, real risk hides in the clauses few people read to the end: auto-renewal that silently locks you in for another term, uncapped liability, take-or-pay minimum purchase commitments, exclusivity that kills your sourcing options, and one-sided termination rights.
Reviewing every contract by hand is slow, costly, and inconsistent, so many companies review only a sample or skip review entirely. The bad clause then surfaces only when it hurts: a contract that has already auto-renewed, a penalty that was never negotiated, an unlimited liability exposure. For companies with high-value, long-tenor contracts, a single missed clause can cost real money.
Why it's solvable now
What changed is the quality of large language models (LLMs). The latest models read tangled legal language well enough to recognize a clause even when it is worded differently from the textbook version, which plain keyword matching cannot do. And crucially for procurement, the model can also explain why a clause is risky in business language, not just point to where it is.
To prove it objectively, we tested on CUAD (the Contract Understanding Atticus Dataset), a public benchmark of 510 real commercial contracts with 13,000+ expert lawyer labels. It is a widely used benchmark for evaluating AI contract review, and because it is public, the result can be verified independently rather than taken as a marketing claim.
What we proved
We built a working proof-of-concept: upload a contract PDF, and the AI detects the 10 clause types most material to a buyer, then shows the risk level, the exact supporting text, and the business rationale for each.
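The output described above can be pictured as one record per detected clause. What follows is a minimal sketch, not the proof-of-concept's actual code: the `ClauseFinding` type, its field names, the `LOW` risk level, the `quote_is_verbatim` helper, and the sample contract text are all hypothetical. It illustrates one simple check that an "exact supporting text" field makes possible, namely confirming that the quoted text really appears in the contract.

```python
from dataclasses import dataclass
from enum import Enum


class RiskLevel(Enum):
    # HIGH and MEDIUM appear in the demo; LOW is an assumed third level.
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class ClauseFinding:
    """Hypothetical record for one flagged clause."""
    clause_type: str   # e.g. "auto_renewal" (illustrative label)
    risk: RiskLevel
    quote: str         # supporting text copied from the contract
    rationale: str     # business-language explanation


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase, so PDF line breaks do not break matching."""
    return " ".join(text.split()).lower()


def quote_is_verbatim(finding: ClauseFinding, contract_text: str) -> bool:
    """Return True if the quoted text occurs in the contract (ignoring case and spacing)."""
    return normalize(finding.quote) in normalize(contract_text)


# Invented sample text, with a line break in the middle of the clause.
contract_text = """This Agreement shall automatically renew for successive
one-year terms unless either party gives ninety days written notice."""

finding = ClauseFinding(
    clause_type="auto_renewal",
    risk=RiskLevel.HIGH,
    quote="shall automatically renew for successive one-year terms",
    rationale="Locks the buyer into another term unless notice is given in time.",
)

print(quote_is_verbatim(finding, contract_text))  # prints True
```

A check like this is one way to keep a reviewer's trust: a flag whose quote cannot be found in the document can be routed for manual inspection instead of being shown as evidence.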
Tested against CUAD gold labels on 25 held-out contracts (10 clause types per contract, 250 decisions in total), the deepseek-v4-pro engine achieved:
- F1: 0.863 (precision 0.946, recall 0.793)
- Accuracy: 88.8%
- Only 5 false positives out of 250 decisions
In other words: when the system flags a clause, it is almost always right (94.6% precision). That matters for triage, because legal teams aren't buried in false alarms. The weaker side is recall (0.793): the system still misses some clauses, particularly those that often hide in long appendices, such as a liability cap. Improving that recall is the next item on our roadmap, and we would rather report the gap than an invented number.
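The reported figures can be cross-checked against the standard definitions of precision, recall, F1, and accuracy. The sketch below is illustrative only. The source states 5 false positives and 250 decisions; the other three counts (88 true positives, 23 false negatives, 134 true negatives) are back-solved here from the published rates and are an assumption, not numbers taken from the evaluation output.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)   # of all flags raised, how many were correct
    recall = tp / (tp + fn)      # of all real clauses, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# fp=5 and the total of 250 come from the text; tp, fn, tn are inferred (assumption).
metrics = classification_metrics(tp=88, fp=5, fn=23, tn=134)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# precision: 0.946
# recall: 0.793
# f1: 0.863
# accuracy: 0.888
```

Under these counts, the 28 wrong decisions split into 5 false alarms and 23 misses, which is why the improvement effort targets recall rather than precision.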
[Figure: Contract PDF upload interface]
In the demo we uploaded a Master Supply Agreement, and within seconds the system flagged all 10 clause types, including 5 rated high risk and 4 rated medium, each with a quote and an explanation.
[Figure: Risk summary and full list of flagged clauses]
[Figure: Clause detail with quote and risk level]
What this means for you
Imagine review work that used to take hours per contract becoming a triage pass that takes seconds. For operators in mining, manufacturing, and FMCG, the impact is concrete:
- Faster: incoming contracts are pre-screened; legal focuses on the clauses that actually carry risk.
- Cheaper and safer: auto-renewal, uncapped liability, and minimum commitments are caught before signing, not after a loss.
- More consistent: every contract is assessed against the same, auditable standard.
This isn't about replacing your legal team — it's about giving them radar, so human attention goes where it matters most.
Let's talk
At RubyThalib AI we build measurable, proven AI solutions for your business context — including procurement contract review like the above, tailored to your contract types and document language. If your procurement or legal team wants to cut review time and close risk gaps, let's talk at rubythalib.ai.