Changelog
Version history for parser, metrics engine, dictionary packs, and scoring model.
1.4.0 (2026-06-24)
Python SDK configuration for tunable scoring and reproducibility metadata.
What shipped
| Area | Change |
|---|---|
| Python SDK | |
| Versions output | |
| Docs / examples | Pipeline and versioning docs updated; full-coverage example uses cyber incident language; score-catalog clarifies v2-only components on 10-K vs 8-K fixtures |
Default scores are unchanged when no custom config is passed. Custom weights are tracked via analytics_config_id, not a new scoring_model_version.
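One way to picture how custom weights can be tracked by an ID while the scoring model version stays fixed is a content hash over the weights, as in the minimal sketch below. The `config_id` and `run_metadata` helpers, the hashing scheme, and the example weights are assumptions for illustration only; how the SDK actually derives `analytics_config_id` is not described in this changelog.

```python
import hashlib
import json
from typing import Dict, Mapping, Optional

# The default model version named in this changelog.
SCORING_MODEL_VERSION = "deterministic_scoring_v2"


def config_id(weights: Optional[Mapping[str, float]]) -> Optional[str]:
    """Hypothetical: derive a stable ID from custom weights (None = defaults)."""
    if weights is None:
        return None
    # Sort keys so equal configs hash identically regardless of key order.
    canonical = json.dumps(dict(weights), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def run_metadata(weights: Optional[Mapping[str, float]] = None) -> Dict[str, Optional[str]]:
    """Reproducibility metadata: the model version never changes with weights."""
    return {
        "scoring_model_version": SCORING_MODEL_VERSION,
        "analytics_config_id": config_id(weights),
    }


# Illustrative weights only; the real weight schema is not shown here.
custom = {"boilerplate_risk_score": 0.2, "specificity_quality_score": 0.8}
default_meta = run_metadata()
custom_meta = run_metadata(custom)
```

In this sketch both runs report the same `scoring_model_version`, and only the custom run carries a non-empty `analytics_config_id`.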
1.3.0 (2026-06-24)
Package release consolidating artifact bumps and default-surface updates.
What shipped
| Area | Change |
|---|---|
| Dictionaries / metrics | |
| Scoring default | |
| Validation data | Reports and baselines removed from the public repo (see |
| Docs | Evidence, scope, HTTP/MCP guides, versioning pins, and glossary aligned to current artifact versions |
| Release tooling | Version sync test, HTTP endpoints doc drift check, hardened PyPI publish workflow |
Current artifact defaults
| Artifact | Version |
|---|---|
| Parser | |
| Metrics engine | |
| Dictionary | |
| Scoring (default) | |
| Scoring (legacy) | |
Breaking behavior note: scores may differ from 1.2.0 for callers who did not pin scoring_model_version=deterministic_scoring_v1. Pin package and scoring versions for reproducibility — see Versioning and Reproducibility.
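A minimal sketch of what pinning can look like on the caller side follows. The `pinned_params` helper and the `package_version` field are hypothetical; only the two scoring model version strings come from this changelog.

```python
from typing import Dict

# Version strings named in this changelog.
KNOWN_SCORING_VERSIONS = ("deterministic_scoring_v1", "deterministic_scoring_v2")


def pinned_params(scoring_model_version: str, package_version: str) -> Dict[str, str]:
    """Hypothetical helper: build explicit pins to send and store with results."""
    if scoring_model_version not in KNOWN_SCORING_VERSIONS:
        raise ValueError(f"unknown scoring model version: {scoring_model_version!r}")
    return {
        "scoring_model_version": scoring_model_version,
        "package_version": package_version,
    }


# A caller who wants to stay on the legacy model after upgrading pins it explicitly
# instead of relying on the package default.
legacy = pinned_params("deterministic_scoring_v1", "1.3.0")
```

Storing both values next to every score makes it possible to tell later whether two results were produced under the same model and package release.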
built_in_dictionaries_v3 / text_metrics_v3 (2026-06-23)
Shipped v3 dictionary enrichment and package reorganization.
What shipped
| Area | Change |
|---|---|
| Dictionary package | Split monolith into |
| Flag precision | Added |
| Legal phrases | Added |
| Modal metrics | Added |
| Topic tuning | Tightened broad topics ( |
| MD&A density | Added v3 phrase candidates for uncertainty, demand, margin, and liquidity packs |
| Tooling | Added |
Version bumps
| Artifact | v2 | v3 |
|---|---|---|
| | | |
| | | |
| | | unchanged at ship time (v2 default in 1.3.0) |
deterministic_scoring_v2 (2026-06-22)
Introduced SCORING_MODEL_VERSION_V2 / deterministic_scoring_v2. Shipped as opt-in in 1.2.0; default on all surfaces in 1.3.0.
What shipped
| Component | Change |
|---|---|
| | Form-aware percentile calibration for Item 1A tone ratios ( |
| | Multi-section evidence model; flag-only paths |
| | MD&A-first evidence with Item 1A fallback; flag-only paths |
| | Section-attributed controls diff + evidence-based flags |
| Confidence (v2 path) | |
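To illustrate what form-aware percentile calibration means in general terms, the sketch below ranks a tone ratio against a baseline drawn from filings of the same form type. The function names, the baseline layout, and the toy numbers are all assumptions made for this example; they are not the package's calibration data or its implementation.

```python
from bisect import bisect_right
from typing import Dict, List


def percentile_rank(value: float, baseline: List[float]) -> float:
    """Fraction of baseline observations less than or equal to value."""
    if not baseline:
        raise ValueError("baseline must not be empty")
    ordered = sorted(baseline)
    return bisect_right(ordered, value) / len(ordered)


def form_aware_percentile(value: float, form: str, baselines: Dict[str, List[float]]) -> float:
    """Rank the value against the baseline for its own form type only."""
    return percentile_rank(value, baselines[form])


# Toy baselines (made-up numbers): the same raw ratio is typical for one
# form type and extreme for another.
TOY_BASELINES = {
    "10-K": [0.01, 0.02, 0.03, 0.04],
    "8-K": [0.001, 0.002, 0.003, 0.004],
}

rank_10k = form_aware_percentile(0.02, "10-K", TOY_BASELINES)  # 0.5
rank_8k = form_aware_percentile(0.02, "8-K", TOY_BASELINES)    # 1.0
```

The point of the design is that a raw ratio is only interpreted relative to comparable filings, so one threshold does not have to fit every form.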
Entry points (as of 1.3.0)
- v2 (default): score_filing_html(), score_for_model(), HTTP matrix/panel, MCP scoring tools
- v1 (legacy): score_deterministic(); HTTP/MCP via scoring_model_version=deterministic_scoring_v1
Artifact versions at v2 ship (2026-06-22)
| Artifact | Version |
|---|---|
| Parser | |
| Metrics engine | |
| Dictionary | |
| Scoring (default at ship) | |
| Scoring (new) | |
Public empirical evidence (v2): on 478 S&P 500 FY2025 Item 1A sections, company-specificity correlates ρ ≈ 0.87 with an independent NER-based specificity measure — see What This Does and Does Not Claim.
v2-only components (smoke-validated; not all validated at SP500 scale)
Available via score_deterministic_v2() or default score_for_model() on HTTP matrix/panel and MCP scoring tools:
- static_disclosure_quality_score, static_disclosure_risk_score, disclosure_change_risk_score (score product split)
- cybersecurity_incident_risk_score, event_materiality_score (excluded from v1 headline weights)
- disclosure_change_score_v2 on section diffs (v1 disclosure_change_score unchanged)
- Sector/form baselines via baselines.py + calibration.py
1.2.0 (2026-06-23)
Evidence: v2 specificity construct validity on 478 S&P 500 FY2025 Item 1A sections (ρ ≈ 0.87 vs NER) — see What This Does and Does Not Claim.
1.1.0 (2026-06-22)
- Breaking: removed view from /disclosure-matrix and panel /disclosure-matrix request/response (deterministic scoring only).
- Fix: disclosure_quality_score is correct when boilerplate_risk_score is 0.0 (no longer treated as missing).
- Internal: unified confidence_score via score_deterministic; removed unused llm_confidences parameter.
- Deprecation intent: disclosure-alpha-mcp (legacy shim to the analyst bundle) remains for backward compatibility; prefer disclosure-alpha-mcp-analyst or disclosure-alpha-mcp-builder for new deployments. No removal planned in 1.1.x.
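The fix for a `boilerplate_risk_score` of `0.0` is an instance of a common Python pitfall: a truthiness check treats a legitimate zero as a missing value. The sketch below shows that class of bug and the usual remedy. The changelog does not say what the actual cause was, and the formula used here is a placeholder, not the package's scoring logic.

```python
from typing import Optional


def quality_truthiness_check(specificity: float, boilerplate_risk: Optional[float]) -> Optional[float]:
    """Buggy pattern: `not 0.0` is True, so a real zero looks like 'missing'."""
    if not boilerplate_risk:
        return None
    return specificity * (1.0 - boilerplate_risk)  # placeholder formula


def quality_none_check(specificity: float, boilerplate_risk: Optional[float]) -> Optional[float]:
    """Fixed pattern: only None means 'missing'; 0.0 is a valid score."""
    if boilerplate_risk is None:
        return None
    return specificity * (1.0 - boilerplate_risk)  # placeholder formula


buggy = quality_truthiness_check(0.8, 0.0)  # None: the zero was dropped
fixed = quality_none_check(0.8, 0.0)        # 0.8: the zero is used
```

Comparing against `None` explicitly keeps "no value" and "value of zero" distinct, which matters for any score whose valid range includes `0.0`.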
Score catalog cleanup (2026-06-22)
Public docs and examples aligned with the deterministic scoring surface:
- Removed dead fields from documentation and generated fixtures: business_model_fragility_score, cybersecurity_risk_score, hidden_risk_score.
- Ten computed components: nine headline-weighted scores plus the supplementary specificity_quality_score (canonical list: Score Catalog).
- Doc scope cleanup: removed composite/OSS product-scope notes from public pages; renamed score catalog page to Score Catalog.
built_in_dictionaries_v2 / text_metrics_v2 (2026-06-21)
Shipped the built-in dictionary enrichment documented in Metrics Engine.
Dictionary additions
| Pack | Count (approx.) | Notes |
|---|---|---|
| | 42 | Fraud, insolvency, impairment, outage terms |
| | 30 | Contingency, fluctuation, exposure terms |
| | 26 | Arbitration, antitrust, indemnification terms |
| | 28 | Covenant, lien, forbearance terms |
| Modal tiers | 18 | |
| | 20 | Safe-harbor and generic risk language |
| | 21 topics | Investable risk clusters for diff engine |
| | 13 flags | SEC/PCAOB/FASB-grounded event phrases |
| | 4 packs | MD&A uncertainty, demand, margin, liquidity density |
v2 flag phrase additions: material weaknesses in internal control over financial reporting, plans are intended to mitigate, no longer expects, incident response, systems outage.
TERM_PACK_METADATA now documents all shipped packs (negative, uncertainty, litigious, constraining, modal, boilerplate, topics, severity, flags, mdna_density, geography, segment).
Matching behavior (metrics engine)
- Boilerplate: each phrase counted at most once per sentence.
- Topics: word-boundary phrase matching (no substring false positives); removed standalone competitive from competition topic.
- Severity: topic intensity uses severity words within ±10 tokens of a topic hit only.
- Shared helpers live in disclosure_alpha.text_matching.
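The three matching rules above can be sketched as follows. This is an illustration of the described behavior, not the code in `disclosure_alpha.text_matching`: the function names, the tokenizer, and the example phrases are assumptions.

```python
import re
from typing import Iterable, List


def tokenize(text: str) -> List[str]:
    """Simplified tokenizer (an assumption): lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_boilerplate(sentences: Iterable[str], phrases: Iterable[str]) -> int:
    """Each phrase contributes at most 1 per sentence, however often it repeats."""
    phrase_list = [p.lower() for p in phrases]
    return sum(1 for s in sentences for p in phrase_list if p in s.lower())


def has_phrase(text: str, phrase: str) -> bool:
    """Word-boundary match, so 'lien' does not fire inside 'client'."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE) is not None


def severity_near_topic(text: str, topic: str, severity_words: Iterable[str], window: int = 10) -> int:
    """Count severity tokens within `window` tokens of any topic hit."""
    tokens, topic_tokens = tokenize(text), tokenize(topic)
    n = len(topic_tokens)
    if n == 0:
        return 0
    severity = {w.lower() for w in severity_words}
    # Start index of every position where the full topic phrase occurs.
    hits = [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == topic_tokens]
    return sum(
        1
        for j, tok in enumerate(tokens)
        if tok in severity and any(i - window <= j <= i + n - 1 + window for i in hits)
    )


repeated = count_boilerplate(
    ["Results may differ materially and may differ materially again.", "Results may differ materially."],
    ["may differ materially"],
)  # 2: one per sentence, not 3
near = severity_near_topic("severe " + "x " * 9 + "supply chain", "supply chain", ["severe"])   # 1
far = severity_near_topic("severe " + "x " * 10 + "supply chain", "supply chain", ["severe"])   # 0
```

Restricting severity words to a window around a topic hit keeps an unrelated strong word elsewhere in the section from inflating that topic's intensity.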
Version bumps
| Artifact | v1 | v2 |
|---|---|---|
| | | |
| | | |
| | unchanged | |
Empirical evidence (S&P 500 FY2025 Item 1A, v2)
On 478 sections, company-specificity correlates ρ ≈ 0.87 with an independent NER-based specificity measure — see What This Does and Does Not Claim.
Out of scope (deferred)
- External Loughran–McDonald loader
- Package split of dictionaries.py
- Flag suppressions (no material weakness) pending false-positive review
built_in_dictionaries_v1 / text_metrics_v1
Initial license-safe built-in lists and deterministic text metrics engine.