diff_engine
Use when: You have current and prior section text and need change scores, topic lists, or language deltas — typically as part of a prior-filing comparison workflow.
Start here
- compute_section_diff(): full diff result, including disclosure_change_score and language_deltas
- SectionDiffResult: typed output with similarities, topics, and confidence
- lexical_similarity(): TF-IDF cosine similarity helper
Prior text is required for meaningful change scores. When prior_text is None, disclosure_change_score is None. See FAQ and Troubleshooting.
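For readers unfamiliar with the term, TF-IDF cosine similarity can be sketched in pure Python as below. This is an illustration of the general technique, not the library's implementation: the function names tokenize and tfidf_cosine, the regex tokenizer, and the smoothed IDF weighting are all assumptions made for this sketch.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_cosine(current_text: str, prior_text: str) -> float:
    """Cosine similarity between TF-IDF vectors built from just these two texts."""
    docs = [Counter(tokenize(current_text)), Counter(tokenize(prior_text))]
    vocab = set(docs[0]) | set(docs[1])
    n_docs = len(docs)

    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {
        term: math.log((1 + n_docs) / (1 + sum(1 for d in docs if term in d))) + 1.0
        for term in vocab
    }

    # Weight raw term counts by IDF to get sparse TF-IDF vectors.
    vecs = [{term: count * idf[term] for term, count in d.items()} for d in docs]

    dot = sum(weight * vecs[1].get(term, 0.0) for term, weight in vecs[0].items())
    norms = [math.sqrt(sum(w * w for w in v.values())) for v in vecs]
    if norms[0] == 0.0 or norms[1] == 0.0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norms[0] * norms[1])
```

Identical texts score approximately 1.0 and texts with no shared tokens score 0.0, so a low value indicates heavy lexical rewriting between filings.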
Example
from disclosure_alpha.diff_engine import compute_section_diff

# All arguments are keyword-only; the section identifiers are optional.
diff = compute_section_diff(
    current_text="We may face litigation and regulatory investigation.",
    prior_text="We operate in a competitive market.",
    current_section_id="item_1a_risk_factors",
)
print(diff.disclosure_change_score, diff.new_topics)
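Because disclosure_change_score is typed float | None, callers will likely want to guard against the missing-prior case before formatting or comparing the score. A minimal sketch, where describe_change is a hypothetical helper and not part of disclosure_alpha:

```python
from typing import Optional


def describe_change(score: Optional[float]) -> str:
    """Format a change score, treating None as 'no comparison available'."""
    # Compare against None explicitly: a score of 0.0 is falsy but still valid.
    if score is None:
        return "no prior text; change score unavailable"
    return f"disclosure_change_score={score:.3f}"
```

In practice this would be called as describe_change(diff.disclosure_change_score). The explicit `is None` check keeps a legitimate score of 0.0 from being reported as missing.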
Full API
- class disclosure_alpha.diff_engine.SectionDiffResult(current_section_id: str | None = None, prior_section_id: str | None = None, lexical_similarity: float | None = None, semantic_similarity: float | None = None, length_change_pct: float | None = None, new_topics: list[str] = <factory>, removed_topics: list[str] = <factory>, intensified_topics: list[str] = <factory>, disclosure_change_score: float | None = None, disclosure_change_score_v2: float | None = None, diff_summary: str = '', confidence_score: float = 0.0, language_deltas: dict[str, float] = <factory>, added_sentence_count: int = 0, removed_sentence_count: int = 0, changed_numeric_count: int = 0, added_risk_language_score: float | None = None, diff_evidence: dict[str, typing.Any] = <factory>)
  Bases: object
  - current_section_id : str | None = None
  - prior_section_id : str | None = None
  - lexical_similarity : float | None = None
  - semantic_similarity : float | None = None
  - length_change_pct : float | None = None
  - new_topics : list[str]
  - removed_topics : list[str]
  - intensified_topics : list[str]
  - disclosure_change_score : float | None = None
  - disclosure_change_score_v2 : float | None = None
  - diff_summary : str = ''
  - confidence_score : float = 0.0
  - language_deltas : dict[str, float]
  - added_sentence_count : int = 0
  - removed_sentence_count : int = 0
  - changed_numeric_count : int = 0
  - added_risk_language_score : float | None = None
  - diff_evidence : dict[str, Any]
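The `<factory>` defaults in the class signature are how Python dataclasses render fields declared with default_factory, which suggests each SectionDiffResult instance gets its own fresh list or dict. The sketch below illustrates that behavior with a hypothetical stand-in class, DiffResultSketch, that copies only a few of the documented fields; it is not the library's class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DiffResultSketch:
    """Hypothetical stand-in mirroring a subset of the documented fields."""

    lexical_similarity: Optional[float] = None
    new_topics: List[str] = field(default_factory=list)  # shown as <factory>
    language_deltas: Dict[str, float] = field(default_factory=dict)  # shown as <factory>
    confidence_score: float = 0.0


first = DiffResultSketch()
second = DiffResultSketch()

# Mutating one instance's list does not leak into another instance,
# because default_factory builds a new list per instance.
first.new_topics.append("litigation")
```

After this runs, first.new_topics holds one entry while second.new_topics is still empty, and the scalar fields keep their documented defaults.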
- disclosure_alpha.diff_engine.compute_section_diff(*, current_text: str, prior_text: str | None, current_section_id: str | None = None, prior_section_id: str | None = None) -> SectionDiffResult
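The bare `*` at the start of the signature makes every parameter keyword-only, and prior_text has no default, so it must be passed explicitly even when it is None. The sketch below demonstrates both rules with a hypothetical stub, compute_section_diff_stub, that mirrors only the documented parameter list and returns a plain dict rather than a SectionDiffResult.

```python
from typing import Optional


def compute_section_diff_stub(
    *,
    current_text: str,
    prior_text: Optional[str],
    current_section_id: Optional[str] = None,
    prior_section_id: Optional[str] = None,
) -> dict:
    """Hypothetical stand-in that only mirrors the documented parameter list."""
    return {
        "has_prior": prior_text is not None,
        "current_section_id": current_section_id,
        "prior_section_id": prior_section_id,
    }


# Keyword call: accepted. prior_text is passed explicitly as None.
ok = compute_section_diff_stub(current_text="Current text.", prior_text=None)

# Positional call: rejected, because every parameter follows the bare `*`.
try:
    compute_section_diff_stub("Current text.", None)
    positional_error = None
except TypeError as exc:
    positional_error = exc

# Omitting prior_text: also rejected, because it has no default value.
try:
    compute_section_diff_stub(current_text="Current text.")
    missing_error = None
except TypeError as exc:
    missing_error = exc
```

Both failing calls raise TypeError at call time, before any diffing would start.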