ARCHER EVOLV · Business Case
Case 01 · Key-Date Inference for Regulatory Content

Verified dates,
not confident guesses.

Determining the publication, effective, and comment-close dates of regulatory documents, proposed rules, guidelines, and internal policies. A head-to-head on a production run of 55 documents: Archer Evolv's tuned extraction, knowledge base, and Expert-in-the-Loop (EITL) methodology against a raw LLM.

Production Run: 03·20·26
Documents: 55
Jurisdictions: 6
Adjudicated by: EITL

56%
Raw-LLM failure rate
31 of 55 wrong or unanswered

35%
Of "high-confidence" answers that were factually wrong

~80×
Faster per request once a date is verified & persisted by Evolv

01 — The stakes

A date is not metadata. It's a deadline.

In risk and compliance, the date attached to a regulatory document is the trigger for everything downstream. The effective date sets when an obligation becomes binding. The comment-close date is a hard window that, once missed, cannot be reopened. The publication date anchors version control, supersession, and audit trails. A wrong date doesn't produce a slightly-off answer — it produces a missed filing, an out-of-date control, or an obligation tracked against the wrong calendar.

That is why a "usually right" model is the dangerous case. An answer that is wrong but plausible and confident flows silently into a compliance calendar and is only discovered when the deadline has already passed.

Publication date

Version & supersession

Anchors which version of a rule governs, what it replaced, and the audit trail regulators expect. Wrong here corrupts the system of record.

Effective date

When the clock starts

Determines when an obligation becomes binding and when controls must be live. An error means operating out of compliance without knowing it.

Comment-close date

A window that won't reopen

The fixed deadline to influence a proposed rule. Miss it and the only remedy is litigation or living with the outcome.

02 — Accuracy

Same 55 documents. Different truth.

Every document was run through the raw-LLM determination mechanism and independently adjudicated by an Expert-in-the-Loop. The raw model was correct on fewer than half. Evolv, applying source-specific extraction configs and a tuned knowledge base, holds error below 5%.

Raw LLM: 56.4% error
Correct: 24 (43.6%)
Wrong but returned an answer: 14 (25.5%)
Failed / timed out (no answer): 17 (30.9%)

Archer Evolv: < 5% error
Verified correct: >95%
Routed to EITL review: <5%
High-confidence → used directly (auto)
Low-confidence → EITL review (caught, not shipped)
Result persisted & reusable (once)

7/20

Confidence is not a safety net.

Of the 20 answers the raw LLM rated high confidence, 7 were flatly wrong — a 35% false-assurance rate. You cannot filter risk by trusting the model's own confidence score; the failures it hides are precisely the ones a reviewer would have waved through.
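
The headline percentages follow directly from the adjudicated counts. A minimal sketch of the arithmetic is below; the variable names and the `pct` helper are illustrative and not part of Evolv.

```python
# Counts reported for the 55-document production run.
TOTAL = 55
correct, incorrect, failed = 24, 14, 17
high_conf_total, high_conf_wrong = 20, 7

# The three outcome buckets must account for every document.
assert correct + incorrect + failed == TOTAL


def pct(part: int, whole: int) -> float:
    """Percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


rates = {
    "correct": pct(correct, TOTAL),
    "wrong_but_answered": pct(incorrect, TOTAL),
    "failed_or_timed_out": pct(failed, TOTAL),
    "error_total": pct(incorrect + failed, TOTAL),
    "false_assurance": pct(high_conf_wrong, high_conf_total),
}

for name, value in rates.items():
    print(f"{name}: {value}%")
```

Note that the 35% figure uses the 20 high-confidence answers as its denominator, not all 55 documents.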

03 — The dangerous errors

What "wrong" actually looked like.

These aren't near-misses. The raw model defaulted to tidy, plausible dates — the first of a month, the first of a year — while the true date sat in a statutory citation it never retrieved. Several confident answers were off by years or decades. The correct date in every case is traceable to a specific legal reference.

| Source | LLM said | Conf. | Actual | Off by | Authority for the correct date |
|---|---|---|---|---|---|
| DE · Gen. Assembly | 1996-02-02 | high | 2024-06-25 | ~28 yrs | 84 Del. Laws, c. 277, §§ 2, 3 (latest amendment approved) |
| MT · Sec. of State | 2024-10-01 | medium | 2007-03-22 | ~17 yrs | Sec. 1, Ch. 38, L. 2007 (amendment approved) |
| UT · Sec. of State | 2025-12-06 | high | 2025-10-14 | ~2 mo | Ch. 17, 2025 Special Session 1 (signed by governor) |
| DE · Gen. Assembly | 2023-08-03 | high | 2025-06-30 | ~2 yrs | 85 Del. Laws, c. 44, § 1 (latest amendment approved) |
| CA · regulatory | 2007-08-01 | medium | 2014-08-13 | ~7 yrs | CA Regulatory Notice Register (register filing) |
| DE · Gen. Assembly | 2024-07-01 | medium | 2026-01-30 | ~1.5 yrs | 85 Del. Laws, c. 233, § 10 (latest amendment approved) |

Each correction was supplied by EITL adjudication and traces to a source-specific authority — exactly the signal Evolv's extraction configs are tuned to find and that retrieval confirms.
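
The "Off by" column can be reproduced from the two date columns. The sketch below is one way to do it, assuming a 365.25-day year for the conversion; the `rows` list and `years_off` helper are illustrative names, and the dates are copied from the table above.

```python
from datetime import date

# (source, date the raw LLM returned, date confirmed by EITL adjudication)
rows = [
    ("DE · Gen. Assembly", date(1996, 2, 2), date(2024, 6, 25)),
    ("MT · Sec. of State", date(2024, 10, 1), date(2007, 3, 22)),
    ("UT · Sec. of State", date(2025, 12, 6), date(2025, 10, 14)),
    ("DE · Gen. Assembly", date(2023, 8, 3), date(2025, 6, 30)),
    ("CA · regulatory", date(2007, 8, 1), date(2014, 8, 13)),
    ("DE · Gen. Assembly", date(2024, 7, 1), date(2026, 1, 30)),
]


def days_off(llm_said: date, actual: date) -> int:
    """Absolute gap in days between the model's date and the true date."""
    return abs((actual - llm_said).days)


def years_off(llm_said: date, actual: date) -> float:
    """The same gap expressed in years (assumes 365.25 days per year)."""
    return days_off(llm_said, actual) / 365.25


for source, llm_said, actual in rows:
    print(f"{source}: {days_off(llm_said, actual)} days "
          f"(~{years_off(llm_said, actual):.1f} yrs)")
```

The gap is taken as an absolute value because the raw model erred in both directions: some answers were too early, others too late.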

04 — Speed & Cost

The raw model pays every time. Evolv pays once.

Per request, the raw LLM averaged ~4 seconds against a 5-second timeout; Evolv serves a verified, persisted date in ~0.05 seconds — about 80× faster. But the real divergence is repetition: when an agent or analyst asks for the same document's date again, the raw model re-computes from scratch — re-incurring latency, inference cost, and a fresh, non-deterministic chance of being wrong. Evolv answers from cache: compute once, verify once, serve forever.

[Interactive calculator: inference calls per month, compute time per month, and expected wrong answers served per month *, compared for the raw LLM (recompute each time) and Archer Evolv (compute once, cache).]

* Wrong-answer estimate applies the measured rates to answers actually served: raw LLM 25.5% (incorrect, returned as if valid) on every recompute; Evolv <5%, caught at ingestion and routed to review rather than shipped. Latency assumptions: raw 4.0s/call, Evolv 0.05s/cache read. Inference is incurred once per document at ingestion for Evolv. Figures are illustrative and scale with the sliders.
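
The comparison can be worked through with the stated latency and error figures. This is a rough sketch only: the two inputs (`documents`, `lookups_per_document`) are hypothetical stand-ins for the calculator's sliders, the function names are illustrative, and Evolv's one-time ingestion compute is left out because the source gives no figure for it.

```python
from dataclasses import dataclass

RAW_LATENCY_S = 4.0          # from the source: raw LLM, seconds per call
CACHE_READ_S = 0.05          # from the source: Evolv, seconds per cache read
RAW_WRONG_SERVED = 14 / 55   # from the source: 25.5% wrong but returned


@dataclass(frozen=True)
class MonthlyLoad:
    documents: int             # hypothetical input: distinct documents
    lookups_per_document: int  # hypothetical input: asks per document

    @property
    def lookups(self) -> int:
        return self.documents * self.lookups_per_document


def raw_llm_month(load: MonthlyLoad) -> dict:
    """Every lookup is a fresh inference call with a fresh chance of error."""
    return {
        "inference_calls": load.lookups,
        "compute_seconds": load.lookups * RAW_LATENCY_S,
        "expected_wrong_served": load.lookups * RAW_WRONG_SERVED,
    }


def evolv_month(load: MonthlyLoad) -> dict:
    """One inference per document at ingestion; every lookup is a cache read.

    Assumes all documents are ingested within the month.
    """
    return {
        "inference_calls": load.documents,
        "cache_read_seconds": load.lookups * CACHE_READ_S,
    }


load = MonthlyLoad(documents=55, lookups_per_document=10)
raw = raw_llm_month(load)
evolv = evolv_month(load)
print(raw)
print(evolv)
```

With these example inputs, the raw model makes ten times as many inference calls as Evolv, and the gap widens linearly as repeat lookups grow.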

05 — How Evolv does it

A tuned pipeline, governed by experts.

Evolv manages reusable, content-source-specific collections of extraction-configs and KB models, each linked to one or many sources. When a new document enters the infer_document_key_date pipeline, the specialized AI operator applies the matching config to drive its determination strategy.

Step 1 · Extract
Pull search signals from the document
Evolv extraction + tuned LLM identify the best in-document clues: title, citation, agency, chapter, register reference, effective date, source-specific metadata.
Step 2 · Retrieve
Find supporting evidence via those signals
The MCP server routes to the CAI App API / Advanced Search — or the pipeline calls Archer Advanced Search directly — across millions of pre-curated documents.
Step 3 · Infer
Determine the date from retrieved evidence
The LLM combines the retrieved authority with the original document to determine the date and return structured evidence + a confidence score.
Step 4 · Persist
Write the verified result onto the document
The answer, evidence, and confidence are stored — reusable on every future lookup at cache speed.
High confidence?
Yes
Use the inferred date
No
Route to EITL review
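
The four steps and the confidence gate can be sketched as follows. This is a simplified illustration, not Evolv's implementation: only the name `infer_document_key_date` comes from the text above, while the class, the injected callables, the status strings, and the 0.9 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.9  # hypothetical cut-off; the source gives no value


@dataclass
class KeyDateResult:
    document_id: str
    key_date: Optional[date]
    evidence: list[str]
    confidence: float
    status: str  # "verified" or "needs_eitl_review" (illustrative labels)


class KeyDatePipeline:
    def __init__(self, extract: Callable, retrieve: Callable, infer: Callable):
        self._extract = extract
        self._retrieve = retrieve
        self._infer = infer
        self._store: dict[str, KeyDateResult] = {}  # stands in for persistence
        self.inference_calls = 0

    def infer_document_key_date(self, document_id: str, text: str) -> KeyDateResult:
        # Repeat lookups are served from the persisted result.
        cached = self._store.get(document_id)
        if cached is not None:
            return cached

        signals = self._extract(text)                        # Step 1: Extract
        evidence = self._retrieve(signals)                   # Step 2: Retrieve
        key_date, confidence = self._infer(text, evidence)   # Step 3: Infer
        self.inference_calls += 1

        status = ("verified" if confidence >= CONFIDENCE_THRESHOLD
                  else "needs_eitl_review")
        result = KeyDateResult(document_id, key_date, evidence, confidence, status)
        self._store[document_id] = result                    # Step 4: Persist
        return result


# Stand-in operators so the sketch runs; real ones would call the tuned
# extraction config, the search service, and the LLM.
def extract(text):
    return {"citation": "85 Del. Laws, c. 44, § 1"}

def retrieve(signals):
    return [f"authority for {signals['citation']}"]

def infer(text, evidence):
    return date(2025, 6, 30), 0.97

pipeline = KeyDatePipeline(extract, retrieve, infer)
first = pipeline.infer_document_key_date("doc-1", "sample text")
second = pipeline.infer_document_key_date("doc-1", "sample text")
print(first.status, pipeline.inference_calls)
```

The second lookup returns the stored result without another inference call, which is the "compute once" behaviour described in section 04. A low-confidence result is stored with a review status here so that it is flagged rather than used directly.
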
The EITL governance loop

Tuned, tested, validated — before production.

  1. EITL creates & manages source-specific extraction-configs and KB models in the Data Admin Tool.
  2. Tests each config against sample / training data sets before release.
  3. Validates accuracy, then activates the config for a source or spider.
  4. Low-confidence results return to EITL — feeding corrections back into the config.
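
Steps 2 and 3 of the loop amount to an accuracy gate on each config before release. A minimal sketch of such a gate is below, assuming a config can be represented as a callable that maps a document to a date; the function names and the 0.95 release bar are hypothetical and not taken from the Data Admin Tool.

```python
from datetime import date
from typing import Callable, Sequence

ACTIVATION_ACCURACY = 0.95  # hypothetical release bar


def validate_config(predict: Callable[[str], date],
                    labeled_samples: Sequence[tuple[str, date]]) -> float:
    """Accuracy of a candidate config on EITL-labeled sample documents."""
    if not labeled_samples:
        raise ValueError("at least one labeled sample is required")
    hits = sum(1 for doc, expected in labeled_samples if predict(doc) == expected)
    return hits / len(labeled_samples)


def may_activate(predict: Callable[[str], date],
                 labeled_samples: Sequence[tuple[str, date]],
                 bar: float = ACTIVATION_ACCURACY) -> bool:
    """Allow activation only when measured accuracy meets the bar."""
    return validate_config(predict, labeled_samples) >= bar


# Toy sample set and candidate config, for illustration only.
samples = [
    ("doc-a", date(2025, 6, 30)),
    ("doc-b", date(2024, 6, 25)),
]
answers = dict(samples)
candidate = answers.__getitem__  # a "config" that gets every sample right

print(validate_config(candidate, samples), may_activate(candidate, samples))
```

Corrections returned from low-confidence reviews (step 4) would extend the labeled sample set, so each later validation run tests the config against the cases it previously missed.
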
06 — The bottom line

One approach guesses. The other is accountable.

Accuracy

Raw LLM: 44% correct, 56% wrong or failed, and wrong on 35% of the answers it rated high confidence. Evolv: >95% verified, with the residual caught and routed to review, never silently shipped.

Speed

~4s per raw request against a 5s timeout, recomputed on every ask. Evolv serves verified dates from persistence in ~0.05s — roughly 80× faster on repeat.

Cost

Raw pays inference, latency, and fresh error risk on every lookup. Evolv computes once at ingestion; every subsequent answer is near-free and already verified.

For a function where a wrong date is a missed deadline, the raw LLM's defining failure isn't that it's wrong — it's that it's confidently wrong, repeatedly, and at full cost each time. Evolv replaces that with a verified, cached, expert-governed answer.