Aggregation¶
What this page answers: How section-level signals become nine component scores and the headline overall_disclosure_risk_score.
Inputs |
Section metrics, diffs, flags, language deltas, MD&A densities |
Outputs |
|
Default version |
|
Opt-in version |
|
In plain terms¶
Aggregation blends section-level metrics, diffs, flags, and language deltas into nine filing-level component scores (0–100), then computes the weighted headline overall_disclosure_risk_score, coverage, and confidence. This is the last stage before JSON reaches CLI, Python, or HTTP clients.
When you’ll see this¶
CLI / Python: final
scoresblock indisclosure-alpha scoreorscore_filing_html()outputHTTP:
GET /v1/company/{ticker}/disclosure-matrixand panel POST responsesFields you read:
overall_disclosure_risk_score,components,score_coverage_ratio,missing_components,confidence_scoreAudit: request
include=provenanceon the matrix route for per-component input breakdowns
Implemented in deterministic_scoring.py:
v1:
aggregate_deterministic_matrix()— default pipeline pathv2:
aggregate_deterministic_matrix_v2()— calls v1, then overrides selected components
Python SDK tuning: pass config=ScoringConfig(...) to aggregate_deterministic_matrix() / aggregate_deterministic_matrix_v2(), or config=PipelineConfig(...) to score_filing_html() and related pipeline helpers. Tunable fields include component_weights, flag_boost_points, flag_evidence_score, and v2 calibration_context. Per-component blend weights inside each score remain fixed for this release.
v1 full specification (`deterministic_scoring_v1`)
Combines section metrics, diffs, flags, language deltas, and MD&A densities into filing-level component scores and an overall disclosure risk score. This is what CLI, HTTP, and MCP return today.
Inputs¶
aggregate_deterministic_matrix(
section_metrics: dict[str, dict[str, float]],
section_diffs: dict[str, float | None],
section_flags: dict[str, dict[str, bool]] | None = None,
language_deltas: dict[str, dict[str, float]] | None = None,
section_densities: dict[str, dict[str, float]] | None = None,
) → DeterministicAggregationResult
Section keys include item_1a_risk_factors, item_7_mdna / item_2_mdna, item_9a_controls / item_4_controls.
Helper: blend_scores¶
Weighted average over non-null inputs; weights renormalized. Returns None if all inputs are null.
Flag boost¶
_flag_boost(flags, names) adds +15.0 when any named flag is true (merged across sections). Result capped at 100 after addition.
Component blends¶
Tone ratios are scaled × 100 before blending. Diff scores are already 0–100.
risk_factor_intensity_score¶
blend(negative×100, uncertainty×100, diff_1a; weights 0.375, 0.375, 0.25)
disclosure_change_score¶
blend(diff_1a, diff_mdna; weights 0.6, 0.4)
+ 0.1 × avg(positive uncertainty_language_delta) # when present
mdna_uncertainty_score¶
blend(
uncertainty×100, modal×100, readability,
uncertainty_term_density, demand_softness_density, margin_pressure_density;
weights 0.40, 0.35, 0.25, 0.10, 0.05, 0.05
)
+ flag_boost(guidance_withdrawal_flag)
legal_regulatory_risk_score¶
blend(litigious×100, legal_language_delta; weights 0.70, 0.30)
+ flag_boost(investigation_flag, material_legal_proceeding_flag)
liquidity_stress_score¶
blend(constraining×100, liquidity_constraint_density; weights 0.50, 0.35)
+ flag_boost(going_concern_flag, covenant_breach_flag)
boilerplate_risk_score¶
blend(boilerplate×100, 100−numeric_specificity, 100−company_specificity; equal weights)
internal_controls_risk_score¶
blend(diff_controls, constraining×100; weights 0.6, 0.4)
+ flag_boost(material_weakness_flag, restatement_flag, ineffective_controls_flag)
event_severity_score¶
diff_1a # single input
specificity_quality_score¶
blend(numeric_specificity, company_specificity; weights 0.5, 0.5)
Computed and returned in components but not in headline weights.
tone_negativity_score¶
blend(negative×100 from 1A, uncertainty×100 from MD&A; weights 0.5, 0.5)
Headline score¶
Weights (COMPONENT_WEIGHTS — nine headline components):
Component |
Weight |
|---|---|
|
0.20 |
|
0.15 |
|
0.15 |
|
0.10 |
|
0.10 |
|
0.10 |
|
0.05 |
|
0.05 |
|
0.05 |
overall = Σ (weight_i × component_i) / Σ (weight_i for present components)
clamp(overall, 0, 100)
Coverage and confidence¶
coverage = |{headline components with non-null values}| / 9
missing_components = [names where value is null]
Initial confidence from coverage: clamp(0.3, 0.95, 0.5 + coverage × 0.4).
score_deterministic() in pipeline.py then refines confidence via compute_overall_confidence() using extraction confidences and average diff confidence.
Derived aggregates¶
Field |
Source |
|---|---|
|
|
|
|
Provenance¶
Each component records a DeterministicComponentProvenance entry:
{
"score_name": "liquidity_stress_score",
"value": 58.0,
"inputs": {
"constraining_word_ratio": 0.024,
"liquidity_constraint_density": 12.4,
"going_concern_flag": true,
"flag_boost": 15.0
},
"source": "deterministic"
}
Available in CLI/ Python JSON (scores.provenance) and HTTP matrix responses when include=provenance.
Output shape¶
ScoreResult.to_dict() (CLI / Python) returns:
{
"scores": {
"overall_disclosure_risk_score": 38.5,
"score_coverage_ratio": 0.8889,
"confidence_score": 0.82,
"missing_components": ["internal_controls_risk_score"],
"components": { "...": 42.0 },
"aggregates": { "...": null },
"provenance": [ "..."]
},
"versions": {
"parser_version": "section_extractor_v1",
"metrics_engine_version": "text_metrics_v2",
"scoring_model_version": "deterministic_scoring_v1"
}
}
Related (v1)¶
v2 specification (`deterministic_scoring_v2`, default)
aggregate_deterministic_matrix_v2() starts from the v1 result, then replaces four components and recomputes the headline. Entry points: score_deterministic_v2(), score_for_model() (default), HTTP matrix/panel, and MCP scoring tools.
Unchanged from v1¶
disclosure_change_score, mdna_uncertainty_score, boilerplate_risk_score, event_severity_score, tone_negativity_score, specificity_quality_score, COMPONENT_WEIGHTS, and conditional-null rules for those components.
risk_factor_intensity_score (v2)¶
When Item 1A is present, tone inputs pass through calibrate_metric() (calibration.py) using built-in 10-K percentile tables. Calibrated values are percentile ranks (0–100), not ratio × 100. Diff weight unchanged (0.25).
blend(calibrated_negative, calibrated_uncertainty, diff_1a; weights 0.375, 0.375, 0.25)
Provenance records raw_value, calibrated_value, calibration_reference, and calibration_status per tone metric.
Evidence-based components (v2)¶
Legal, liquidity, and internal-controls scores use ScoreEvidence + blend_evidence() instead of v1’s blend_scores() + _flag_boost(+15).
Serious flags (
investigation_flag,going_concern_flag,material_weakness_flag, etc.) contribute evidence at 65.0 with weight 0.5 and areasoncode (e.g.serious_flag,liquidity_fallback).Multiple evidence rows renormalize weights over non-null inputs (same pattern as
blend_scores).
legal_regulatory_risk_score (v2)¶
evidence: litigious×100 per section (item_1a_risk_factors weight 0.5; legal-proceedings sections 0.35)
+ legal_language_delta (weight 0.30)
+ legal flags as evidence (weight 0.5 each)
→ blend_evidence
Can be non-null from flags or legal-proceedings sections even when Item 1A tone metrics are missing.
liquidity_stress_score (v2)¶
evidence: MD&A constraining×100 (0.50) + liquidity_constraint_density (0.35)
+ Item 1A constraining×100 fallback (0.25, reason liquidity_fallback)
+ liquidity flags as evidence (0.5 each)
→ blend_evidence
internal_controls_risk_score (v2)¶
evidence: controls-section diff (0.6) + Item 1A constraining×100 (0.4)
+ serious IC flags as evidence (0.5 each)
→ blend_evidence
Controls diff resolves item_9a_controls or item_4_controls with section attribution in provenance.
Confidence (v2 path)¶
score_deterministic_v2() sets confidence via compute_confidence_detailed(): mean of extraction confidence, coverage, and diff confidence, minus penalties for missing required sections, extraction warnings, no prior filing, and low coverage. v1 uses the same underlying penalty model through compute_overall_confidence() after P1-2.
Provenance source tag¶
v2-overridden components set source: "deterministic_v2" in provenance entries.
Score product split and event components (v2)¶
aggregate_deterministic_matrix_v2() also emits supplementary aggregates and event scores (not blended into the v1 headline weights):
static_disclosure_quality_score,static_disclosure_risk_score,disclosure_change_risk_scorecybersecurity_incident_risk_score,event_materiality_score
Section diffs may include disclosure_change_score_v2 (sentence-aligned); v1 disclosure_change_score is unchanged. Baseline lookup: baselines.lookup_baseline() with form/sector/year fallback.
Comparability¶
Do not compare numeric levels between v1 and v2 without re-scoring. See What This Does and Does Not Claim for v2 empirical evidence.
Related (v2)¶
Versioning and Reproducibility — migration and pinning
Score Catalog — component list and weights