Core Concepts

Audience: New users who want a mental model before scoring.

Before you start: None. This is the first conceptual stop after install.

Summary

This page explains what Disclosure Alpha does and how scores flow from filing HTML to JSON, without repeating the full score walkthrough.

In plain terms

Disclosure Alpha reads SEC filing HTML, measures language patterns, and compares sections year over year to produce reproducible disclosure risk scores. The output is JSON with an overall score, ten computed component scores (nine headline-weighted plus specificity_quality_score), and coverage signals. No LLM is required.
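As a rough illustration of consuming that output, the sketch below parses a made-up payload and separates the headline-weighted components from specificity_quality_score. Every key and value here is a hypothetical placeholder except the name specificity_quality_score. The real field names and the full list of nine headline components are documented in Understanding Scores.

```python
import json

# Hypothetical payload for illustration only. Key names other than
# specificity_quality_score, and all numbers, are assumptions rather than
# the documented ScoreResult schema. Only two placeholder headline
# components are shown instead of nine.
raw = """
{
  "overall_score": 42.0,
  "components": {
    "example_component_a": 55.0,
    "example_component_b": 30.0,
    "specificity_quality_score": 61.0
  },
  "coverage": {"example_signal": true}
}
"""

result = json.loads(raw)
components = result["components"]

# Components that feed the headline score versus the one reported alongside them.
headline = {
    name: value
    for name, value in components.items()
    if name != "specificity_quality_score"
}
specificity = components["specificity_quality_score"]

print("overall:", result["overall_score"])
print("headline components:", sorted(headline))
print("specificity_quality_score:", specificity)
```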

Pipeline

flowchart TB
  ingest["Ingest (HTML or EDGAR)"]
  extract["extract_sections_from_html()"]
  metrics["compute_section_metrics()"]
  aggregate["aggregate_deterministic_matrix()"]
  output["ScoreResult JSON"]

  ingest --> extract
  extract --> metrics
  metrics --> aggregate
  aggregate --> output

  subgraph deterministic ["Deterministic stage"]
    metrics
  end

Text equivalent:

ingest (HTML or EDGAR)
    ↓
extract sections (Item 1A, MD&A, …)
    ↓
deterministic stage
  • text metrics (tone, boilerplate, specificity, …)
  • boolean risk flags
  • section diffs vs prior comparable filing
    ↓
aggregate
  • 9 headline-weighted component scores (0–100)
  • overall disclosure risk score + confidence
    ↓
ScoreResult JSON

See Deterministic Scoring Overview for component families and prior-filing rules.
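The stages above can be sketched as a toy pipeline. This is a minimal illustration of the extract, metrics, and aggregate flow under stated assumptions, not Disclosure Alpha's implementation: the toy_ functions, the hedge-word list, the metric names, and the equal weights are all hypothetical stand-ins, and the real signatures of extract_sections_from_html(), compute_section_metrics(), and aggregate_deterministic_matrix() may differ.

```python
import difflib
import json
import re
from typing import Optional

HEDGE_WORDS = {"may", "might", "could"}  # illustrative word list (assumption)


def toy_extract_sections(html: str) -> dict:
    """Strip tags, then split the text at 'Item <number>.' headings."""
    text = re.sub(r"<[^>]+>", " ", html)
    text = re.sub(r"\s+", " ", text).strip()
    # One capture group, so the result is [preamble, name1, body1, name2, body2, ...]
    parts = re.split(r"(Item \d+[A-Z]?)\.", text)
    return {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}


def toy_section_metrics(text: str, prior_text: Optional[str]) -> dict:
    """One text metric, one boolean flag, and one diff against the prior section."""
    words = re.findall(r"[a-z']+", text.lower())
    hedge_share = sum(w in HEDGE_WORDS for w in words) / max(len(words), 1)
    change = None  # no comparable prior section
    if prior_text is not None:
        # 0.0 means identical to the prior section, 1.0 means fully rewritten.
        change = 1.0 - difflib.SequenceMatcher(None, prior_text, text).ratio()
    return {
        "hedge_share": hedge_share,
        "mentions_numbers": bool(re.search(r"\d", text)),
        "change_vs_prior": change,
    }


def toy_aggregate(per_section: dict, weights: dict) -> dict:
    """Average each metric across sections, scale to 0-100, then take a weighted mean."""
    components = {}
    for name in weights:
        values = [m[name] for m in per_section.values() if m[name] is not None]
        components[name] = round(100.0 * sum(values) / len(values), 2) if values else None
    scored = {k: v for k, v in components.items() if v is not None}
    total_weight = sum(weights[k] for k in scored)
    overall = None
    if scored:
        overall = round(sum(weights[k] * v for k, v in scored.items()) / total_weight, 2)
    flags = {name: m["mentions_numbers"] for name, m in per_section.items()}
    return {"overall_score": overall, "components": components, "flags": flags}


def toy_score(html: str, prior_html: Optional[str] = None) -> str:
    sections = toy_extract_sections(html)
    prior = toy_extract_sections(prior_html) if prior_html else {}
    per_section = {
        name: toy_section_metrics(body, prior.get(name))
        for name, body in sections.items()
    }
    weights = {"hedge_share": 0.5, "change_vs_prior": 0.5}  # hypothetical weights
    return json.dumps(toy_aggregate(per_section, weights), sort_keys=True)


current = "<h2>Item 1A.</h2><p>Risks may change.</p><h2>Item 7.</h2><p>Revenue grew 4%.</p>"
previous = "<h2>Item 1A.</h2><p>Risks may change.</p><h2>Item 7.</h2><p>Revenue grew 2%.</p>"
print(toy_score(current, previous))
```

Because every step is a pure function of the input text, scoring the same pair of filings twice yields byte-identical JSON, which is the sense in which the scores are reproducible. When no prior filing is supplied, the diff-based component is reported as null and drops out of the weighted mean instead of being guessed.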

Scores and components

Component names, plain-English meanings, and the 0–100 scale are documented in one place: Understanding Scores.

Evidence & limitations

Scores are research tools, not trading signals. See Evidence and Validation for validation numbers and What This Does and Does Not Claim for scope limits.