Understanding Scores
Audience: Anyone interpreting CLI, Python, or HTTP score JSON for the first time.
Before you start: Skim Core Concepts for pipeline vocabulary.
Summary
How to read Disclosure Alpha’s 0–100 disclosure risk scores, component fields, and coverage signals.
Default scoring model
Examples on this page and committed fixtures use deterministic_scoring_v2. CLI, HTTP, and MCP default to v2. Pass deterministic_scoring_v1 to opt into the legacy scale; see Versioning and Reproducibility. Do not compare v1 and v2 numeric levels without re-scoring.
Higher / lower means
| Field | Higher (toward 100) | Lower (toward 0) |
|---|---|---|
| overall_disclosure_risk_score | More concern from weighted language/change signals | Less concern |
| boilerplate_risk_score | More vague / boilerplate language | More specific language |
| specificity_quality_score | Better specificity (directionally opposite of most risk scores) | Weaker specificity |
| disclosure_change_score | Larger year-over-year section change | Smaller change |
| score_coverage_ratio | More headline components computed | More gaps (null components) |
Version and evidence context: Versioning and Reproducibility, Evidence and Validation, What This Does and Does Not Claim.
In plain terms
Disclosure Alpha compares filing language patterns and year-over-year section changes to produce reproducible risk scores, with no LLM required. The headline number is a weighted blend of nine component scores; specificity_quality_score is also returned but carries no headline weight. Sections that fail to extract, or a missing prior filing, show up as null components and a lower score_coverage_ratio.
Problem framing
You want to compare a company’s disclosure language against its prior filing — or against a peer screen — without hand-reading every risk-factor paragraph. Disclosure Alpha extracts Item 1A, MD&A, and other sections, runs deterministic text metrics and diffs, and returns a single JSON object you can sort, filter, or wire into dashboards.
Score anatomy
flowchart TB
html["Filing HTML"]
sections["Section extraction"]
metrics["Text metrics + flags"]
diffs["Section diffs vs prior"]
components["Ten computed scores (9 headline)"]
overall["overall_disclosure_risk_score"]
html --> sections
sections --> metrics
sections --> diffs
metrics --> components
diffs --> components
components --> overall
flowchart TB
ingest["Ingest (HTML or EDGAR)"]
extract["extract_sections_from_html()"]
metrics["compute_section_metrics()"]
aggregate["aggregate_deterministic_matrix()"]
output["ScoreResult JSON"]
ingest --> extract
extract --> metrics
metrics --> aggregate
aggregate --> output
subgraph deterministic ["Deterministic stage"]
metrics
end
Text equivalent:
ingest (HTML or EDGAR)
↓
extract sections (Item 1A, MD&A, …)
↓
deterministic stage
• text metrics (tone, boilerplate, specificity, …)
• boolean risk flags
• section diffs vs prior comparable filing
↓
aggregate
• 9 weighted component scores (0–100)
• overall disclosure risk score + confidence
Reading a response
The sample below comes from a minimal synthetic 10-K with no prior filing. Section text is trimmed in the committed fixture; full structure: score-minimal-10k.json.
"extraction_confs": [
0.35,
0.35
],
"diff_confs": [
0.2,
0.2
],
"extraction_warnings": [],
"required_sections_present": true,
"has_prior": false
},
"scores": {
"overall_disclosure_risk_score": 33.159052,
"score_coverage_ratio": 0.7778,
"confidence_score": 0.3426,
"missing_components": [
"disclosure_change_score",
"event_severity_score"
],
"components": {
"risk_factor_intensity_score": 60.0,
"disclosure_change_score": null,
"mdna_uncertainty_score": 26.726221,
"legal_regulatory_risk_score": 37.6724,
"liquidity_stress_score": 4.030477,
"boilerplate_risk_score": 42.528733,
"internal_controls_risk_score": 3.4483,
Headline fields
- overall_disclosure_risk_score (~33 here): weighted mean of present headline components. On the scale below, this filing is moderate.
- score_coverage_ratio (0.78): seven of nine headline components computed. Names in missing_components were not computed.
- confidence_score (0.34): lower here because extraction confidence is weak on a tiny synthetic filing and there is no prior for change diffs.
Top components in this example
- legal_regulatory_risk_score (37.7): litigious tone in Item 1A plus an investigation flag.
- boilerplate_risk_score (42.5): moderate vague-language signal relative to other components.
- mdna_uncertainty_score (26.7): uncertainty language and margin-pressure density in MD&A.
disclosure_change_score and event_severity_score are null because no prior filing was supplied — that means missing, not zero change.
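As a minimal sketch of the null handling described above, the snippet below splits a parsed components block into computed and missing entries. It assumes the response has already been loaded with json.loads(), so JSON null arrives as None; the dictionary is only the subset of the no-prior sample shown on this page, and split_components is a hypothetical helper, not part of the Disclosure Alpha API.

```python
# Subset of the "components" block from the no-prior sample on this page.
# After json.loads(), JSON null becomes Python None.
components = {
    "risk_factor_intensity_score": 60.0,
    "disclosure_change_score": None,
    "mdna_uncertainty_score": 26.726221,
    "legal_regulatory_risk_score": 37.6724,
    "liquidity_stress_score": 4.030477,
    "boilerplate_risk_score": 42.528733,
    "internal_controls_risk_score": 3.4483,
    "event_severity_score": None,
}


def split_components(components):
    """Hypothetical helper: separate computed scores from null ones.

    Returns (computed, missing), where computed maps name -> float and
    missing is a sorted list of names whose value is None.
    """
    computed = {name: value for name, value in components.items() if value is not None}
    missing = sorted(name for name, value in components.items() if value is None)
    return computed, missing


computed, missing = split_components(components)

for name, value in computed.items():
    print(f"{name}: {value:.1f}")
for name in missing:
    # Null means "not computed"; it must not be read as a score of zero.
    print(f"{name}: not computed")
```

Keeping None distinct from 0.0 matters downstream: a sort or average that coerces missing values to zero would make a filing with no prior look like one with no change.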
When a prior filing is available, the scores block looks like this (abbreviated):
{
"overall_disclosure_risk_score": 33.10187,
"score_coverage_ratio": 1.0,
"confidence_score": 0.6729,
"missing_components": [],
"components": {
"risk_factor_intensity_score": 56.2975,
"disclosure_change_score": 38.6268,
"mdna_uncertainty_score": 26.726221,
"legal_regulatory_risk_score": 30.317069,
"liquidity_stress_score": 4.030477,
"boilerplate_risk_score": 42.528733,
"internal_controls_risk_score": 3.4483,
"event_severity_score": 45.19,
"specificity_quality_score": 36.2069,
"tone_negativity_score": 5.2956,
"cybersecurity_incident_risk_score": null,
"event_materiality_score": null
},
"aggregates": {
"disclosure_quality_score": 36.2069,
"disclosure_deterioration_score": 38.6268,
"static_disclosure_quality_score": 36.2069,
"static_disclosure_risk_score": 18.7244,
"disclosure_change_risk_score": 41.9084
}
}
Here disclosure_change_score and event_severity_score are populated, and score_coverage_ratio rises to 1.0 because both MD&A and a prior filing are available.
Full coverage (all nine headline components) with prior + Item 1A + MD&A:
{
"overall_disclosure_risk_score": 32.298828,
"score_coverage_ratio": 1.0,
"confidence_score": 0.6705,
"missing_components": [],
"components": {
"risk_factor_intensity_score": 45.6425,
"disclosure_change_score": 37.763145,
"mdna_uncertainty_score": 26.726221,
"legal_regulatory_risk_score": 31.335654,
"liquidity_stress_score": 3.246773,
"boilerplate_risk_score": 58.666667,
"internal_controls_risk_score": 0.0,
"event_severity_score": 47.57,
"specificity_quality_score": 12.0,
"tone_negativity_score": 3.57145,
"cybersecurity_incident_risk_score": 65.0,
"event_materiality_score": null
},
"aggregates": {
"disclosure_quality_score": 12.0,
"disclosure_deterioration_score": 37.763145,
"static_disclosure_quality_score": 12.0,
"static_disclosure_risk_score": 20.591127,
"disclosure_change_risk_score": 42.666573
}
}
This fixture uses Item 1A incident language, so cybersecurity_incident_risk_score is populated. event_materiality_score is null because the example is 10-K only — that field needs extracted 8-K event sections (see Score Catalog).
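The relationship between headline components and score_coverage_ratio can be sketched as follows. This is an illustration under two assumptions: the list of nine headline names is inferred from this page (every returned component except specificity_quality_score, cybersecurity_incident_risk_score, and event_materiality_score), and coverage is taken to be a simple count of non-null headline components divided by nine, which matches the 0.7778 and 1.0 values in the samples but is not confirmed as the library's exact formula.

```python
# Headline component names, inferred from this page (assumption).
HEADLINE_COMPONENTS = (
    "risk_factor_intensity_score",
    "disclosure_change_score",
    "mdna_uncertainty_score",
    "legal_regulatory_risk_score",
    "liquidity_stress_score",
    "boilerplate_risk_score",
    "internal_controls_risk_score",
    "event_severity_score",
    "tone_negativity_score",
)

# "components" block from the full-coverage sample above (null -> None).
components = {
    "risk_factor_intensity_score": 45.6425,
    "disclosure_change_score": 37.763145,
    "mdna_uncertainty_score": 26.726221,
    "legal_regulatory_risk_score": 31.335654,
    "liquidity_stress_score": 3.246773,
    "boilerplate_risk_score": 58.666667,
    "internal_controls_risk_score": 0.0,
    "event_severity_score": 47.57,
    "specificity_quality_score": 12.0,
    "tone_negativity_score": 3.57145,
    "cybersecurity_incident_risk_score": 65.0,
    "event_materiality_score": None,
}


def coverage_ratio(components, headline=HEADLINE_COMPONENTS):
    """Share of headline components that are present (not None)."""
    present = [name for name in headline if components.get(name) is not None]
    return len(present) / len(headline)


ratio = coverage_ratio(components)

# Components that are returned but sit outside the headline set.
extras = sorted(set(components) - set(HEADLINE_COMPONENTS))

print(f"coverage: {ratio:.4f}")
print("non-headline components:", extras)
```

Note that internal_controls_risk_score is 0.0 here and still counts as present, while event_materiality_score is None and does not affect coverage because it is not a headline component.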
Component guide
Nine deterministic components feed the headline overall_disclosure_risk_score (weights in methodology/aggregation). specificity_quality_score is also returned but excluded from headline weights; see Score scale for its inversion.
| Plain English | JSON field | Primary section(s) |
|---|---|---|
| Risk-factor tone & volatility | risk_factor_intensity_score | Item 1A |
| Year-over-year disclosure change | disclosure_change_score | Item 1A, MD&A |
| MD&A uncertainty & demand stress | mdna_uncertainty_score | Item 7 (10-K) / Item 2 (10-Q) |
| Legal & regulatory risk language | legal_regulatory_risk_score | Item 1A (+ flags) |
| Liquidity & covenant stress | liquidity_stress_score | MD&A (+ flags) |
| Boilerplate & vague risk language | boilerplate_risk_score | Item 1A |
| Internal controls weakness signals | internal_controls_risk_score | Controls disclosure + Item 1A |
| Material event severity (diff-only) | event_severity_score | Item 1A |
| Cross-section negative tone | tone_negativity_score | Item 1A + MD&A |
Score scale
All component scores use 0–100. Higher values mean more disclosure risk or deterioration, except for specificity_quality_score (higher = better specificity).
| Range | Interpretation |
|---|---|
| 0–25 | Low concern |
| 26–50 | Moderate |
| 51–75 | Elevated |
| 76–100 | High |
Specificity inversion: Most components rise when language gets worse. specificity_quality_score is the opposite — a higher value means the filing is more specific (numbers, named entities, concrete detail). It is returned in components but is not part of the headline overall_disclosure_risk_score weights.
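The scale above can be turned into a small lookup, sketched here. The function name interpret_score is hypothetical. The table lists integer ranges while scores are floats, and the page does not say how a value such as 25.4 is banded, so this sketch assumes each band's upper bound is inclusive and anything above it falls into the next band. The labels apply to risk-direction scores; for specificity_quality_score a higher value is the better outcome, so the "concern" wording does not carry over.

```python
def interpret_score(score):
    """Map a 0-100 risk-direction score to the bands in the scale table.

    Assumption: upper bounds (25, 50, 75) are inclusive, so 25.4 lands in
    "Moderate". None is reported as not computed rather than as zero.
    """
    if score is None:
        return "Not computed"
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score <= 25:
        return "Low concern"
    if score <= 50:
        return "Moderate"
    if score <= 75:
        return "Elevated"
    return "High"


# Values taken from the no-prior sample on this page.
for name, value in [
    ("overall_disclosure_risk_score", 33.159052),
    ("risk_factor_intensity_score", 60.0),
    ("liquidity_stress_score", 4.030477),
    ("disclosure_change_score", None),
]:
    print(f"{name}: {interpret_score(value)}")
```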
Low coverage and null components
When required sections fail to extract or there is no prior comparable filing:
- Affected components appear as null in components (never substituted with zero).
- missing_components lists component names that could not be computed.
- score_coverage_ratio drops; the headline score renormalizes over present components only.
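Renormalizing over present components can be sketched as a weighted mean whose denominator includes only the weights of non-null components. The weights below are placeholders chosen for illustration, not Disclosure Alpha's actual weights (those are documented in the Score Catalog), and renormalized_mean is a hypothetical helper; only three components are used to keep the arithmetic visible.

```python
def renormalized_mean(components, weights):
    """Weighted mean over components that are present (not None).

    Weights of missing components are dropped from both numerator and
    denominator, so the remaining weights are effectively rescaled to sum
    to 1. Returns None when nothing is present.
    """
    present = {
        name: components[name]
        for name in weights
        if components.get(name) is not None
    }
    total_weight = sum(weights[name] for name in present)
    if total_weight == 0:
        return None
    return sum(weights[name] * value for name, value in present.items()) / total_weight


# Placeholder weights (assumption, for illustration only).
weights = {
    "risk_factor_intensity_score": 0.5,
    "disclosure_change_score": 0.3,
    "boilerplate_risk_score": 0.2,
}

# No prior filing: the change score is null, not zero.
components = {
    "risk_factor_intensity_score": 60.0,
    "disclosure_change_score": None,
    "boilerplate_risk_score": 42.528733,
}

renormalized = renormalized_mean(components, weights)

# For contrast: substituting zero for the missing score pulls the result down.
zero_filled = {name: (value or 0.0) for name, value in components.items()}
naive = renormalized_mean(zero_filled, weights)

print(f"renormalized over present components: {renormalized:.2f}")
print(f"if null were treated as zero:         {naive:.2f}")
```

The gap between the two results is why null components are never replaced with zero: doing so would understate risk for every filing that lacks a prior comparable.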
See FAQ and Troubleshooting for help with low coverage and null change scores.
Related
Core Concepts — pipeline vocabulary
Versioning and Reproducibility — artifact versions and v1 → v2 migration
Deterministic Scoring Overview — full specification
Score Catalog — component catalog and weights
Evidence and Validation — empirical validation
What This Does and Does Not Claim — scope and limits
Glossary — terms and artifact versions