Changelog

Version history for parser, metrics engine, dictionary packs, and scoring model.

1.4.0 (2026-06-24)

Python SDK configuration for tunable scoring and reproducibility metadata.

What shipped

Area

Change

Python SDK

PipelineConfig and ScoringConfig on score_filing_html(), score_filing_ticker(), score_for_model(), and score_panel_tickers() — tune component_weights, flag_boost_points, flag_evidence_score, and v2 calibration_context without forking the parser

Versions output

versions.analytics_config_id in pipeline, MCP taxonomy, and panel batch responses (builtin_default when unset)

Docs / examples

Pipeline and versioning docs updated; full-coverage example uses cyber incident language; score-catalog clarifies v2-only components on 10-K vs 8-K fixtures

Default scores are unchanged when no custom config is passed. Custom weights are tracked via analytics_config_id, not a new scoring_model_version.

1.3.0 (2026-06-24)

Package release consolidating artifact bumps and default-surface updates.

What shipped

Area

Change

Dictionaries / metrics

built_in_dictionaries_v3 and text_metrics_v3 — dictionary package split, flag suppressions, legal phrases, modal tiers, topic tuning

Scoring default

deterministic_scoring_v2 is now the default on CLI, HTTP, MCP, and score_filing_html() / score_for_model(); legacy deterministic_scoring_v1 remains opt-in

Validation data

Reports and baselines removed from the public repo (see INTERNAL_VALIDATION.md on the internal branch)

Docs

Evidence, scope, HTTP/MCP guides, versioning pins, and glossary aligned to current artifact versions

Release tooling

Version sync test, HTTP endpoints doc drift check, hardened PyPI publish workflow

Current artifact defaults

Artifact

Version

Parser

section_extractor_v1

Metrics engine

text_metrics_v3

Dictionary

built_in_dictionaries_v3

Scoring (default)

deterministic_scoring_v2

Scoring (legacy)

deterministic_scoring_v1

Breaking behavior note: scores may differ from 1.2.0 for callers who did not pin scoring_model_version=deterministic_scoring_v1. Pin package and scoring versions for reproducibility — see Versioning and Reproducibility.

built_in_dictionaries_v3 / text_metrics_v3 (2026-06-23)

Shipped v3 dictionary enrichment and package reorganization.

What shipped

Area

Change

Dictionary package

Split monolith into src/disclosure_alpha/dictionaries/ modules (base.py, sentiment.py, phrases.py, topics.py, flags.py) with backward-compatible disclosure_alpha.dictionaries exports

Flag precision

Added FLAG_SUPPRESSIONS with sentence-scoped suppression logic in detect_section_flags()

Legal phrases

Added LEGAL_REGULATORY_PHRASES and emitted legal_regulatory_phrase_ratio

Modal metrics

Added weak_modal_word_ratio, moderate_modal_word_ratio, strong_modal_word_ratio while preserving modal_word_ratio

Topic tuning

Tightened broad topics (climate, labor) and added high-value phrases like net interest margin / semiconductor inventory

MD&A density

Added v3 phrase candidates for uncertainty, demand, margin, and liquidity packs

Tooling

Added scripts/mine_dictionary_candidates.py for corpus-driven candidate mining

Version bumps

Artifact

v2

v3

DICTIONARY_VERSION

built_in_dictionaries_v2

built_in_dictionaries_v3

METRICS_ENGINE_VERSION

text_metrics_v2

text_metrics_v3

SCORING_MODEL_VERSION

deterministic_scoring_v1

unchanged at ship time (v2 default in 1.3.0)

deterministic_scoring_v2 (2026-06-22)

Introduced SCORING_MODEL_VERSION_V2 / deterministic_scoring_v2. Shipped as opt-in in 1.2.0; default on all surfaces in 1.3.0.

What shipped

Component

Change

risk_factor_intensity_score

Form-aware percentile calibration for Item 1A tone ratios (calibration.py)

legal_regulatory_risk_score

Multi-section evidence model; flag-only paths

liquidity_stress_score

MD&A-first evidence with Item 1A fallback; flag-only paths

internal_controls_risk_score

Section-attributed controls diff + evidence-based flags

Confidence (v2 path)

compute_confidence_detailed() with explicit penalties

Entry points (as of 1.3.0)

  • v2 (default): score_filing_html(), score_for_model(), HTTP matrix/panel, MCP scoring tools

  • v1 (legacy): score_deterministic(); HTTP/MCP via scoring_model_version=deterministic_scoring_v1

Artifact versions at v2 ship (2026-06-22)

Artifact

Version

Parser

section_extractor_v1

Metrics engine

text_metrics_v2

Dictionary

built_in_dictionaries_v2

Scoring (default at ship)

deterministic_scoring_v1

Scoring (new)

deterministic_scoring_v2

Public empirical evidence (v2): on 478 S&P 500 FY2025 Item 1A sections, company-specificity correlates ρ ≈ 0.87 with an independent NER-based specificity measure — see What This Does and Does Not Claim.

v2-only components (smoke-validated; not all validated at SP500 scale)

Available via score_deterministic_v2() or default score_for_model() on HTTP matrix/panel and MCP scoring tools:

  • static_disclosure_quality_score, static_disclosure_risk_score, disclosure_change_risk_score (score product split)

  • cybersecurity_incident_risk_score, event_materiality_score (excluded from v1 headline weights)

  • disclosure_change_score_v2 on section diffs (v1 disclosure_change_score unchanged)

  • Sector/form baselines via baselines.py + calibration.py

1.2.0 (2026-06-23)

1.1.0 (2026-06-22)

  • Breaking: removed view from /disclosure-matrix and panel /disclosure-matrix request/response (deterministic scoring only).

  • Fix: disclosure_quality_score is correct when boilerplate_risk_score is 0.0 (no longer treated as missing).

  • Internal: unified confidence_score via score_deterministic; removed unused llm_confidences parameter.

  • Deprecation intent: disclosure-alpha-mcp (legacy shim to the analyst bundle) remains for backward compatibility; prefer disclosure-alpha-mcp-analyst or disclosure-alpha-mcp-builder for new deployments. No removal planned in 1.1.x.

Score catalog cleanup (2026-06-22)

Public docs and examples aligned with the deterministic scoring surface:

  • Removed dead fields from documentation and generated fixtures: business_model_fragility_score, cybersecurity_risk_score, hidden_risk_score.

  • Ten computed components — nine headline-weighted scores plus supplementary specificity_quality_score; canonical list: Score Catalog.

  • Doc scope cleanup — removed composite/OSS product-scope notes from public pages; renamed score catalog page to Score Catalog.

built_in_dictionaries_v2 / text_metrics_v2 (2026-06-21)

Shipped the built-in dictionary enrichment documented in Metrics Engine.

Dictionary additions

Pack

Count (approx.)

Notes

NEGATIVE_WORDS

42

Fraud, insolvency, impairment, outage terms

UNCERTAINTY_WORDS

30

Contingency, fluctuation, exposure terms

LITIGIOUS_WORDS

26

Arbitration, antitrust, indemnification terms

CONSTRAINING_WORDS

28

Covenant, lien, forbearance terms

Modal tiers

18

WEAK_MODAL_WORDS, MODERATE_MODAL_WORDS, STRONG_MODAL_WORDS

BOILERPLATE_PHRASES

20

Safe-harbor and generic risk language

TOPIC_KEYWORDS

21 topics

Investable risk clusters for diff engine

FLAG_PATTERNS

13 flags

SEC/PCAOB/FASB-grounded event phrases

MDNA_DENSITY_TERMS

4 packs

MD&A uncertainty, demand, margin, liquidity density

v2 flag phrase additions: material weaknesses in internal control over financial reporting, plans are intended to mitigate, no longer expects, incident response, systems outage.

TERM_PACK_METADATA now documents all shipped packs (negative, uncertainty, litigious, constraining, modal, boilerplate, topics, severity, flags, mdna_density, geography, segment).

Matching behavior (metrics engine)

  • Boilerplate: each phrase counted at most once per sentence.

  • Topics: word-boundary phrase matching (no substring false positives); removed standalone competitive from competition topic.

  • Severity: topic intensity uses severity words within ±10 tokens of a topic hit only.

  • Shared helpers live in disclosure_alpha.text_matching.

Version bumps

Artifact

v1

v2

DICTIONARY_VERSION

built_in_dictionaries_v1

built_in_dictionaries_v2

METRICS_ENGINE_VERSION

text_metrics_v1

text_metrics_v2

SCORING_MODEL_VERSION

unchanged

deterministic_scoring_v1

Empirical evidence (S&P 500 FY2025 Item 1A, v2)

On 478 sections, company-specificity correlates ρ ≈ 0.87 with an independent NER-based specificity measure — see What This Does and Does Not Claim.

Out of scope (deferred)

  • External Loughran–McDonald loader

  • Package split of dictionaries.py

  • Flag suppressions (no material weakness) pending false-positive review

built_in_dictionaries_v1 / text_metrics_v1

Initial license-safe built-in lists and deterministic text metrics engine.