Quickstart: Python API¶
Score a filing from Python in a few lines.
Audience: Notebook and application developers.
Before you start: Installation; SEC EDGAR Setup for ticker helpers.
Summary¶
Import score_filing_html or score_filing_ticker and read .scores from the result object.
Score local HTML¶
Goal: Score HTML you already have — no EDGAR fetch.
from disclosure_alpha import score_filing_html
with open("filing.html", encoding="utf-8") as f:
    result = score_filing_html(f.read(), "10-K")
print(result.scores.overall_disclosure_risk_score)
print(list(result.to_dict().keys()))
score_filing_html returns a structured result object; result.to_dict() is the full JSON-serializable dict (sections, metrics, scores, versions).
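Because the dict is JSON-serializable, it can be passed straight to the standard library's json module. The following is a minimal sketch, not part of the SDK: to_json is a hypothetical helper, and the example dict is a stand-in for result.to_dict() that assumes the top-level keys are named as listed above.

```python
import json


def to_json(result_dict: dict) -> str:
    """Serialize a result dict to pretty-printed JSON text."""
    return json.dumps(result_dict, indent=2, sort_keys=True)


# Stand-in for result.to_dict(); real output carries many more fields.
example = {
    "sections": {},
    "metrics": {},
    "scores": {"overall_disclosure_risk_score": 33.159052},
    "versions": {},
}

text = to_json(example)

# Round-trip to confirm nothing is lost in serialization.
restored = json.loads(text)
print(restored["scores"]["overall_disclosure_risk_score"])
```

With the SDK installed, the same helper would be called as to_json(result.to_dict()).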
Sample output¶
Key fields from the committed minimal 10-K fixture:
"extraction_confs": [
0.35,
0.35
],
"diff_confs": [
0.2,
0.2
],
"extraction_warnings": [],
"required_sections_present": true,
"has_prior": false
},
"scores": {
"overall_disclosure_risk_score": 33.159052,
"score_coverage_ratio": 0.7778,
"confidence_score": 0.3426,
"missing_components": [
"disclosure_change_score",
"event_severity_score"
],
"components": {
"risk_factor_intensity_score": 60.0,
How to read it¶
result.scores.overall_disclosure_risk_score: same headline field as the CLI JSON
result.to_dict(): full structure including sections, metrics, and versions
Pass prior_html= to populate change-related components
If something looks wrong¶
A null disclosure_change_score is expected when no prior HTML is supplied. For anything else, see FAQ and Troubleshooting.
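One way to detect this case in code is to inspect the missing_components list shown in the sample output. This is a sketch under assumptions: missing_components_of is a hypothetical helper, and the example dict only mimics the "scores" portion of result.to_dict().

```python
def missing_components_of(result_dict: dict) -> list[str]:
    """Return the score components the result reports as missing."""
    return list(result_dict.get("scores", {}).get("missing_components", []))


# Stand-in for result.to_dict(), using the values from the sample output.
example = {
    "scores": {
        "missing_components": [
            "disclosure_change_score",
            "event_severity_score",
        ]
    }
}

missing = missing_components_of(example)
if "disclosure_change_score" in missing:
    # Expected when the filing was scored without prior HTML.
    print("No change score: pass prior_html= to populate change-related components")
```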
Score by ticker¶
Goal: Let the SDK fetch and score from EDGAR.
import os
os.environ["SEC_USER_AGENT"] = "YourName your@email.com"
from disclosure_alpha import score_filing_ticker
result = score_filing_ticker("AAPL", 2025, form_type="10-K")
print(result.scores.overall_disclosure_risk_score)
How to read it¶
Prior filing is resolved automatically for diffs when available
Check result.scores.score_coverage_ratio before comparing across tickers
Use result.to_dict()["scores"]["components"] for component-level analysis
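The coverage check can be sketched as a filter applied before any cross-ticker comparison. Everything here is illustrative: comparable_scores is a hypothetical helper, the 0.75 threshold is an arbitrary assumption rather than a documented recommendation, and the tickers and values are placeholders shaped like the "scores" block of result.to_dict().

```python
def comparable_scores(results: dict[str, dict], min_coverage: float = 0.75) -> dict[str, float]:
    """Keep headline scores only for results whose coverage meets the threshold."""
    kept = {}
    for ticker, result_dict in results.items():
        scores = result_dict["scores"]
        if scores["score_coverage_ratio"] >= min_coverage:
            kept[ticker] = scores["overall_disclosure_risk_score"]
    return kept


# Placeholder inputs; in practice each value would be result.to_dict().
example_results = {
    "AAA": {"scores": {"score_coverage_ratio": 0.7778,
                       "overall_disclosure_risk_score": 33.159052}},
    "BBB": {"scores": {"score_coverage_ratio": 0.5,
                       "overall_disclosure_risk_score": 41.0}},
}

# BBB falls below the assumed threshold and is dropped.
print(comparable_scores(example_results))
```

Filtering first avoids ranking a score built from few components against one built from most of them.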
If something looks wrong¶
See FAQ and Troubleshooting for EDGAR and coverage issues.
Lower-level pipeline¶
Goal: Control extraction, metrics, and aggregation separately.
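from disclosure_alpha import extract_sections_from_html, compute_section_metrics, score_deterministic
# Load the filing HTML (same local file as in the first example)
with open("filing.html", encoding="utf-8") as f:
    html = f.read()
sections = extract_sections_from_html(html, form_type="10-K")  # extraction
metrics = compute_section_metrics(sections)  # metrics
scores = score_deterministic(metrics)  # aggregation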
Use this when you need intermediate metrics without filing-level aggregation shortcuts.
Related¶
Understanding Scores — interpret score JSON
Python SDK Guide — SDK walkthrough