Understanding Scores

Audience: Anyone interpreting CLI, Python, or HTTP score JSON for the first time.

Before you start: Skim Core Concepts for pipeline vocabulary.

Summary

How to read Disclosure Alpha’s 0–100 disclosure risk scores, component fields, and coverage signals.

Default scoring model

Examples on this page and committed fixtures use deterministic_scoring_v2. CLI, HTTP, and MCP default to v2. Pass deterministic_scoring_v1 to opt into the legacy scale; see Versioning and Reproducibility. Do not compare v1 and v2 numeric levels without re-scoring.

Higher / lower means

Field	Higher (toward 100)	Lower (toward 0)
`overall_disclosure_risk_score`	More concern from weighted language/change signals	Less concern
`boilerplate_risk_score`	More vague / boilerplate language	More specific language
`specificity_quality_score`	Better specificity (directionally opposite of most risk scores)	Weaker specificity
`disclosure_change_score`	Larger year-over-year section change	Smaller change
`score_coverage_ratio`	More headline components computed (this ratio runs 0–1, so higher means toward 1.0)	More gaps (null components)

Version and evidence context: Versioning and Reproducibility, Evidence and Validation, What This Does and Does Not Claim.

In plain terms

Disclosure Alpha compares filing language patterns and year-over-year section changes to produce reproducible risk scores, with no LLM required. The headline number is a weighted blend of nine component scores; specificity_quality_score is also returned but excluded from the headline weights. Sections that fail to extract, or a missing prior filing, show up as null components and a lower score_coverage_ratio.

Problem framing

You want to compare a company’s disclosure language against its prior filing — or against a peer screen — without hand-reading every risk-factor paragraph. Disclosure Alpha extracts Item 1A, MD&A, and other sections, runs deterministic text metrics and diffs, and returns a single JSON object you can sort, filter, or wire into dashboards.
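As a rough illustration of the sort-and-filter use, the sketch below ranks a few score objects by headline score and skips low-coverage results. The labels, the 0.75 coverage threshold, and the rank_by_risk helper are hypothetical and not part of Disclosure Alpha. Only the field names overall_disclosure_risk_score and score_coverage_ratio come from the score JSON.

```python
from typing import Any


def rank_by_risk(
    results: dict[str, dict[str, Any]], min_coverage: float = 0.75
) -> list[tuple[str, float]]:
    """Return (label, overall score) pairs, highest headline score first.

    Results below min_coverage, or with no headline score, are skipped
    rather than treated as zero risk.
    """
    ranked = []
    for label, scores in results.items():
        overall = scores.get("overall_disclosure_risk_score")
        coverage = scores.get("score_coverage_ratio", 0.0)
        if overall is None or coverage < min_coverage:
            continue
        ranked.append((label, overall))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical labels and illustrative numbers.
screen = {
    "AAA": {"overall_disclosure_risk_score": 33.16, "score_coverage_ratio": 0.7778},
    "BBB": {"overall_disclosure_risk_score": 32.30, "score_coverage_ratio": 1.0},
    "CCC": {"overall_disclosure_risk_score": 61.00, "score_coverage_ratio": 0.4444},
}
print(rank_by_risk(screen))
```

CCC is dropped despite its high score because fewer than half of its headline components were computed. Whether to drop or merely flag such results is a choice for the consumer.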

Score anatomy

flowchart TB
  html["Filing HTML"]
  sections["Section extraction"]
  metrics["Text metrics + flags"]
  diffs["Section diffs vs prior"]
  components["Ten computed scores (9 headline)"]
  overall["overall_disclosure_risk_score"]

  html --> sections
  sections --> metrics
  sections --> diffs
  metrics --> components
  diffs --> components
  components --> overall

flowchart TB
  ingest["Ingest (HTML or EDGAR)"]
  extract["extract_sections_from_html()"]
  metrics["compute_section_metrics()"]
  aggregate["aggregate_deterministic_matrix()"]
  output["ScoreResult JSON"]

  ingest --> extract
  extract --> metrics
  metrics --> aggregate
  aggregate --> output

  subgraph deterministic ["Deterministic stage"]
    metrics
  end

Text equivalent:

ingest (HTML or EDGAR)
    ↓
extract sections (Item 1A, MD&A, …)
    ↓
deterministic stage
  • text metrics (tone, boilerplate, specificity, …)
  • boolean risk flags
  • section diffs vs prior comparable filing
    ↓
aggregate
  • 9 weighted component scores (0–100)
  • overall disclosure risk score + confidence

Reading a response

The sample below comes from a minimal synthetic 10-K with no prior filing. Section text is trimmed in the committed fixture; full structure: score-minimal-10k.json.

    "extraction_confs": [
      0.35,
      0.35
    ],
    "diff_confs": [
      0.2,
      0.2
    ],
    "extraction_warnings": [],
    "required_sections_present": true,
    "has_prior": false
  },
  "scores": {
    "overall_disclosure_risk_score": 33.159052,
    "score_coverage_ratio": 0.7778,
    "confidence_score": 0.3426,
    "missing_components": [
      "disclosure_change_score",
      "event_severity_score"
    ],
    "components": {
      "risk_factor_intensity_score": 60.0,
      "disclosure_change_score": null,
      "mdna_uncertainty_score": 26.726221,
      "legal_regulatory_risk_score": 37.6724,
      "liquidity_stress_score": 4.030477,
      "boilerplate_risk_score": 42.528733,
      "internal_controls_risk_score": 3.4483,

Headline fields

overall_disclosure_risk_score (~33 here): weighted mean of present headline components. On the scale below, this filing falls in the moderate band.
score_coverage_ratio (0.78): seven of nine headline components computed. Names in missing_components were not computed.
confidence_score (0.34): lower here because extraction confidence is weak on a tiny synthetic filing and there is no prior for change diffs.
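The coverage arithmetic can be checked by hand. The sketch below is an assumption about how the ratio relates to null components, not the library's implementation: it counts non-null headline components and rounds to four decimals to match the sample. The nine names come from the component guide on this page. The coverage helper is hypothetical, and the tone_negativity_score value is a placeholder because the sample excerpt is truncated before that field.

```python
from typing import Optional

# The nine headline components listed in the component guide.
HEADLINE_COMPONENTS = [
    "risk_factor_intensity_score",
    "disclosure_change_score",
    "mdna_uncertainty_score",
    "legal_regulatory_risk_score",
    "liquidity_stress_score",
    "boilerplate_risk_score",
    "internal_controls_risk_score",
    "event_severity_score",
    "tone_negativity_score",
]


def coverage(components: dict[str, Optional[float]]) -> tuple[float, list[str]]:
    """Return (ratio of computed headline components, names that are null)."""
    missing = [name for name in HEADLINE_COMPONENTS if components.get(name) is None]
    present = len(HEADLINE_COMPONENTS) - len(missing)
    return round(present / len(HEADLINE_COMPONENTS), 4), missing


sample = {
    "risk_factor_intensity_score": 60.0,
    "disclosure_change_score": None,  # no prior filing
    "mdna_uncertainty_score": 26.726221,
    "legal_regulatory_risk_score": 37.6724,
    "liquidity_stress_score": 4.030477,
    "boilerplate_risk_score": 42.528733,
    "internal_controls_risk_score": 3.4483,
    "event_severity_score": None,  # no prior filing
    "tone_negativity_score": 5.0,  # placeholder: not shown in the excerpt
}
ratio, missing = coverage(sample)
print(ratio, missing)
```

Seven of nine present gives 7 / 9 ≈ 0.7778, and the two null names match the sample's missing_components list.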

Top components in this example

legal_regulatory_risk_score (37.7): litigious tone in Item 1A plus an investigation flag.
boilerplate_risk_score (42.5): moderate vague-language signal relative to other components.
mdna_uncertainty_score (26.7): uncertainty language and margin-pressure density in MD&A.

disclosure_change_score and event_severity_score are null because no prior filing was supplied. Null means the component is missing, not that the change was zero.

When a prior filing is available, the scores block looks like this (abbreviated):

{
  "overall_disclosure_risk_score": 33.10187,
  "score_coverage_ratio": 1.0,
  "confidence_score": 0.6729,
  "missing_components": [],
  "components": {
    "risk_factor_intensity_score": 56.2975,
    "disclosure_change_score": 38.6268,
    "mdna_uncertainty_score": 26.726221,
    "legal_regulatory_risk_score": 30.317069,
    "liquidity_stress_score": 4.030477,
    "boilerplate_risk_score": 42.528733,
    "internal_controls_risk_score": 3.4483,
    "event_severity_score": 45.19,
    "specificity_quality_score": 36.2069,
    "tone_negativity_score": 5.2956,
    "cybersecurity_incident_risk_score": null,
    "event_materiality_score": null
  },
  "aggregates": {
    "disclosure_quality_score": 36.2069,
    "disclosure_deterioration_score": 38.6268,
    "static_disclosure_quality_score": 36.2069,
    "static_disclosure_risk_score": 18.7244,
    "disclosure_change_risk_score": 41.9084
  }
}

Here disclosure_change_score and event_severity_score are populated, and score_coverage_ratio rises to 1.0, because a prior filing is available alongside MD&A.

Full coverage (all nine headline components) with prior + Item 1A + MD&A:

{
  "overall_disclosure_risk_score": 32.298828,
  "score_coverage_ratio": 1.0,
  "confidence_score": 0.6705,
  "missing_components": [],
  "components": {
    "risk_factor_intensity_score": 45.6425,
    "disclosure_change_score": 37.763145,
    "mdna_uncertainty_score": 26.726221,
    "legal_regulatory_risk_score": 31.335654,
    "liquidity_stress_score": 3.246773,
    "boilerplate_risk_score": 58.666667,
    "internal_controls_risk_score": 0.0,
    "event_severity_score": 47.57,
    "specificity_quality_score": 12.0,
    "tone_negativity_score": 3.57145,
    "cybersecurity_incident_risk_score": 65.0,
    "event_materiality_score": null
  },
  "aggregates": {
    "disclosure_quality_score": 12.0,
    "disclosure_deterioration_score": 37.763145,
    "static_disclosure_quality_score": 12.0,
    "static_disclosure_risk_score": 20.591127,
    "disclosure_change_risk_score": 42.666573
  }
}

This fixture uses Item 1A incident language, so cybersecurity_incident_risk_score is populated. event_materiality_score is null because the example is 10-K only — that field needs extracted 8-K event sections (see Score Catalog).

Component guide

Nine deterministic components feed the headline overall_disclosure_risk_score (weights in methodology/aggregation). specificity_quality_score is also returned but excluded from headline weights; see Score scale for its inversion.

Plain English	JSON field	Primary section(s)
Risk-factor tone & volatility	`risk_factor_intensity_score`	Item 1A
Year-over-year disclosure change	`disclosure_change_score`	Item 1A, MD&A
MD&A uncertainty & demand stress	`mdna_uncertainty_score`	Item 7 (10-K) / Item 2 (10-Q)
Legal & regulatory risk language	`legal_regulatory_risk_score`	Item 1A (+ flags)
Liquidity & covenant stress	`liquidity_stress_score`	MD&A (+ flags)
Boilerplate & vague risk language	`boilerplate_risk_score`	Item 1A
Internal controls weakness signals	`internal_controls_risk_score`	Controls disclosure + Item 1A
Material event severity (diff-only)	`event_severity_score`	Item 1A
Cross-section negative tone	`tone_negativity_score`	Item 1A + MD&A

Score scale

All component scores use 0–100. Higher values mean more disclosure risk or deterioration, except for specificity_quality_score (higher = better specificity).

Range	Interpretation
0–25	Low concern
26–50	Moderate
51–75	Elevated
76–100	High
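The table can be turned into a small lookup. This is a minimal sketch, not part of the package: concern_band is a hypothetical helper. It assumes that fractional scores between the integer bounds (for example 25.5) belong to the higher band and that a null component is reported as not computed. It does not apply to specificity_quality_score, which runs in the opposite direction.

```python
from typing import Optional


def concern_band(score: Optional[float]) -> str:
    """Map a 0-100 risk score to the interpretation bands in the table."""
    if score is None:
        return "Not computed"  # null component: missing, not zero
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score <= 25:
        return "Low concern"
    if score <= 50:
        return "Moderate"
    if score <= 75:
        return "Elevated"
    return "High"


# Component values taken from the sample response on this page.
for name, value in [
    ("risk_factor_intensity_score", 60.0),
    ("boilerplate_risk_score", 42.528733),
    ("liquidity_stress_score", 4.030477),
    ("disclosure_change_score", None),
]:
    print(name, concern_band(value))
```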

Specificity inversion: Most components rise when language gets worse. specificity_quality_score is the opposite — a higher value means the filing is more specific (numbers, named entities, concrete detail). It is returned in components but is not part of the headline overall_disclosure_risk_score weights.

Low coverage and null components

When required sections fail to extract or there is no prior comparable filing:

Affected components appear as null in components (never substituted with zero).
missing_components lists component names that could not be computed.
score_coverage_ratio drops; the headline score renormalizes over present components only.
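The sketch below shows why renormalizing over present components differs from substituting zero for a null. It is an illustration under assumptions, not the library's aggregation code: the three weights are hypothetical (the real ones are in the Score Catalog), and renormalized_headline is a made-up helper.

```python
from typing import Optional


def renormalized_headline(
    components: dict[str, Optional[float]], weights: dict[str, float]
) -> Optional[float]:
    """Weighted mean over non-null components, with weights rescaled to sum to 1."""
    present = {n: w for n, w in weights.items() if components.get(n) is not None}
    total = sum(present.values())
    if total == 0:
        return None  # nothing computed, so there is no headline score
    return sum(components[n] * w for n, w in present.items()) / total


# Hypothetical weights for a three-component toy example.
weights = {
    "risk_factor_intensity_score": 0.5,
    "disclosure_change_score": 0.3,
    "boilerplate_risk_score": 0.2,
}
components = {
    "risk_factor_intensity_score": 60.0,
    "disclosure_change_score": None,  # no prior filing
    "boilerplate_risk_score": 20.0,
}

renormalized = renormalized_headline(components, weights)
zero_filled = sum((components[n] or 0.0) * w for n, w in weights.items())
print(round(renormalized, 2), round(zero_filled, 2))
```

In this toy case the renormalized mean is (60 × 0.5 + 20 × 0.2) / 0.7 ≈ 48.57, while zero-filling would give 34.0. Treating a missing change score as zero would understate the headline number.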

See FAQ and Troubleshooting for help with low coverage and null change scores.

Core Concepts — pipeline vocabulary
Versioning and Reproducibility — artifact versions and v1 → v2 migration
Deterministic Scoring Overview — full specification
Score Catalog — component catalog and weights
Evidence and Validation — empirical validation
What This Does and Does Not Claim — scope and limits
Glossary — terms and artifact versions