Core Concepts
Audience: New users who want a mental model before scoring.
Before you start: None. This is the first conceptual stop after install.
Summary
This page explains what Disclosure Alpha does and how scores flow from filing HTML to JSON, without repeating the full score walkthrough.
In plain terms
Disclosure Alpha reads SEC filing HTML, then analyzes language patterns and year-over-year section changes to produce reproducible disclosure risk scores. The output is JSON with an overall score, ten computed component scores (nine headline-weighted plus specificity_quality_score), and coverage signals. No LLM is required.
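To make the "nine weighted plus one" split concrete, here is a minimal sketch of how nine weighted components and one unweighted component might sit together in a result. It assumes the overall score is a weighted mean, which this page does not confirm. Every component name except specificity_quality_score, every weight, and the helper overall_score are hypothetical placeholders, not the tool's actual schema.

```python
import json

# Hypothetical component scores on the 0-100 scale. Only
# specificity_quality_score is a name taken from this page; the
# component_N names are placeholders.
components = {f"component_{i}": float(10 * i) for i in range(1, 10)}
components["specificity_quality_score"] = 55.0

# Hypothetical equal weights for the nine headline components.
# specificity_quality_score is reported but carries no weight here.
weights = {
    name: 1.0 for name in components if name != "specificity_quality_score"
}


def overall_score(components, weights):
    """Weighted mean of the weighted components (an assumed formula)."""
    total_weight = sum(weights.values())
    return sum(components[name] * w for name, w in weights.items()) / total_weight


result = {
    "overall": overall_score(components, weights),
    "components": components,
}
print(json.dumps(result, indent=2))
```

In this sketch the tenth score appears in the output next to the others but does not move the overall number.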
Pipeline
flowchart TB
ingest["Ingest (HTML or EDGAR)"]
extract["extract_sections_from_html()"]
metrics["compute_section_metrics()"]
aggregate["aggregate_deterministic_matrix()"]
output["ScoreResult JSON"]
ingest --> extract
extract --> metrics
metrics --> aggregate
aggregate --> output
subgraph deterministic ["Deterministic stage"]
metrics
end
Text equivalent:
ingest (HTML or EDGAR)
↓
extract sections (Item 1A, MD&A, …)
↓
deterministic stage
• text metrics (tone, boilerplate, specificity, …)
• boolean risk flags
• section diffs vs prior comparable filing
↓
aggregate
• 9 weighted component scores (0–100)
• overall disclosure risk score + confidence
See Deterministic Scoring Overview for component families and prior-filing rules.
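The stages above can be sketched as a chain of pure functions. The code below is a toy illustration under stated assumptions, not the project's implementation: extract_sections, section_metrics, and aggregate are hypothetical stand-ins that echo the stage names but not the real signatures, the hedge-word list is invented, and the "score" is a simple hedge-word share. It only shows why the flow is reproducible: the same HTML in always yields the same JSON out.

```python
import difflib
import json
import re

# Toy stand-ins for the pipeline stages. They mirror the stage order on
# this page but NOT the real function signatures or scoring logic.

HEDGE_WORDS = {"may", "might", "could", "possibly"}  # hypothetical word list


def extract_sections(html):
    """Strip tags, then split the text on 'Item N.' style headings."""
    text = re.sub(r"<[^>]+>", " ", html)
    parts = re.split(r"(Item\s+\d+[A-Z]?)\.", text)
    # With one capture group, re.split yields [preamble, heading, body, ...].
    return {
        " ".join(parts[i].split()): " ".join(parts[i + 1].split())
        for i in range(1, len(parts) - 1, 2)
    }


def section_metrics(current, prior=None):
    """Per-section toy metrics: hedge-word share and change vs prior."""
    metrics = {}
    for name, body in current.items():
        words = re.findall(r"[a-z']+", body.lower())
        hedge_share = sum(w in HEDGE_WORDS for w in words) / max(len(words), 1)
        row = {"hedge_share": round(hedge_share, 3)}
        if prior and name in prior:
            similarity = difflib.SequenceMatcher(None, prior[name], body).ratio()
            row["change_vs_prior"] = round(1.0 - similarity, 3)
        metrics[name] = row
    return metrics


def aggregate(metrics):
    """Collapse the per-section rows into one 0-100 toy score."""
    shares = [row["hedge_share"] for row in metrics.values()]
    overall = round(100 * sum(shares) / max(len(shares), 1), 1)
    return {"overall": overall, "sections": metrics}


prior_html = "<h2>Item 1A. Risk Factors</h2><p>Sales may decline.</p>"
current_html = (
    "<h2>Item 1A. Risk Factors</h2>"
    "<p>Sales may decline and costs could rise.</p>"
)

result = aggregate(
    section_metrics(extract_sections(current_html), extract_sections(prior_html))
)
print(json.dumps(result, indent=2))
```

Because no stage uses randomness, network calls, or a model, rerunning the chain on the same two filings produces an identical result, which is the property the deterministic stage is named for.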
Scores and components
Component names, plain-English meanings, and the 0–100 scale are documented in one place: Understanding Scores.
Evidence & limitations
Scores are research tools, not trading signals. See Evidence and Validation for validation numbers and What This Does and Does Not Claim for scope limits.
Related
Understanding Scores — read a score response