File & Step
ReportBase
Bases: BaseModel, ABC, ReviewHelpers
Abstract base for report units (steps/files/batches).
Note:
Subclasses must implement the ok property before they can be instantiated.
Provides a common lifecycle (start → {succeed|fail|skip} → end),
structured messaging (notes, warnings, errors), metadata,
and human-in-the-loop (HITL) review flags. Subclasses must define ok to indicate
success when end infers a terminal status.
Attributes:
- status (Status) – Current lifecycle status (defaults to PENDING).
- percent (int) – Progress percentage 0..100 (informational; not enforced).
- started_at (datetime | None) – UTC timestamp when processing started.
- finished_at (datetime | None) – UTC timestamp when processing finalized.
- notes (list[str]) – Freeform narrative messages intended for UI display.
- errors (list[str]) – Fatal issues; presence typically implies failure.
- warnings (list[str]) – Non-fatal issues worth surfacing to users.
- metadata (dict[str, Any]) – Arbitrary structured context for search/analytics/UI.
- review (ReviewFlag) – Human-in-the-loop flag (flagged + optional reason).
- report_version (str) – Schema version written to JSON artifacts.
- defer_start (bool) – Initialization flag to skip the .begin call upon construction.
- duration_ms (float | None) – Elapsed time in milliseconds (property).
- failed (bool) – True if the unit has failed (property).
- pending (bool) – True if the unit is pending (property).
- running (bool) – True if the unit is running (property).
- skipped (bool) – True if the unit was skipped (property).
- succeeded (bool) – True if the unit has succeeded (property).
See Also
- StepReport: Unit of work inside a file/batch.
- FileReport: Ordered collection of steps for a single file.
Methods:
- clear_review – Clear the HITL review flag.
- end – Finalize the unit if not already terminal.
- error – Append an error message (does not change status).
- fail – Finalize as failed (FAILED).
- model_post_init
- note – Append a user-facing note.
- request_review – Set the HITL review flag.
- skip – Finalize as skipped (SKIPPED).
- start – Mark the unit as running and stamp started_at if missing.
- succeed – Finalize successfully (SUCCEEDED) and set percent=100.
- warn – Append a non-fatal warning.
defer_start
class-attribute, instance-attribute
duration_ms
property
Elapsed time in milliseconds.
Returns None if timing cannot be determined (e.g., no started_at). Uses finished_at when present; otherwise uses the current time to reflect in-flight duration. The result is clamped at >= 0 and rounded to 3 decimals.
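The timing rules above can be sketched as a standalone function. This is a minimal illustration and not the library's implementation: the name elapsed_ms and its now parameter are hypothetical, and timezone-aware UTC datetimes are assumed.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def elapsed_ms(
    started_at: Optional[datetime],
    finished_at: Optional[datetime] = None,
    now: Optional[datetime] = None,
) -> Optional[float]:
    """Hypothetical helper mirroring the documented duration_ms rules."""
    if started_at is None:
        # Timing cannot be determined without a start timestamp.
        return None
    # Prefer finished_at; otherwise measure in-flight duration against "now".
    end = finished_at or now or datetime.now(timezone.utc)
    delta_ms = (end - started_at).total_seconds() * 1000.0
    # Clamp at zero (guards against clock skew) and round to 3 decimals.
    return round(max(delta_ms, 0.0), 3)


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
finished = elapsed_ms(t0, t0 + timedelta(milliseconds=1500))
skewed = elapsed_ms(t0, t0 - timedelta(seconds=1))
unknown = elapsed_ms(None)
```

Passing now explicitly keeps the sketch deterministic in tests; a real property would presumably read the clock itself.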
metadata
class-attribute, instance-attribute
ok
abstractmethod, property
Truthiness of success when inferring status.
Subclasses define the success condition used by end
when the unit is not already in a terminal status. Typical
implementations derive this from fields like errors,
checks (for steps), or child statuses (for files/batches).
Returns:
- bool – True if the unit should be considered successful.
requires_human_review
property
Whether the unit is flagged for human review.
Returns:
- bool – True if review.flagged is set, else False.
clear_review
Clear the HITL review flag.
end
Finalize the unit if not already terminal.
If already terminal (SUCCEEDED, FAILED, SKIPPED), this stamps
finished_at if missing and returns. Otherwise, it infers success
from ok (for example, a roll-up of step outcomes) and calls succeed
or fail accordingly.
Returns:
- ReportBase – Self.
Source code in src/pipeline_watcher/core.py
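The finalization rule described for end can be sketched with a small stand-in class. This is an illustration under stated assumptions, not the real ReportBase: UnitSketch and ErrorFreeUnit are hypothetical names, the Status string values are invented, and a dataclass is used instead of a Pydantic model to keep the block dependency-free.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(str, Enum):
    # Member names follow the documentation; the string values are assumptions.
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    SKIPPED = "skipped"


TERMINAL = {Status.SUCCEEDED, Status.FAILED, Status.SKIPPED}


@dataclass
class UnitSketch(ABC):
    """Hypothetical stand-in for a report unit; not the real ReportBase."""

    status: Status = Status.PENDING
    percent: int = 0
    errors: List[str] = field(default_factory=list)
    finished_at: Optional[datetime] = None

    @property
    @abstractmethod
    def ok(self) -> bool:
        """Success condition consulted by end()."""

    def succeed(self) -> "UnitSketch":
        self.status, self.percent = Status.SUCCEEDED, 100
        self.finished_at = datetime.now(timezone.utc)
        return self

    def fail(self, message: Optional[str] = None) -> "UnitSketch":
        if message:
            self.errors.append(message)
        self.status = Status.FAILED
        self.finished_at = datetime.now(timezone.utc)
        return self

    def end(self) -> "UnitSketch":
        if self.status in TERMINAL:
            # Already terminal: only stamp finished_at if it is missing.
            if self.finished_at is None:
                self.finished_at = datetime.now(timezone.utc)
            return self
        # Not terminal yet: infer the outcome from ok.
        return self.succeed() if self.ok else self.fail()


@dataclass
class ErrorFreeUnit(UnitSketch):
    """Example subclass: successful when no errors were recorded."""

    @property
    def ok(self) -> bool:
        return not self.errors


clean = ErrorFreeUnit().end()
broken = ErrorFreeUnit(errors=["bad header"]).end()
skipped = ErrorFreeUnit(status=Status.SKIPPED).end()
```

Here clean ends as SUCCEEDED, broken ends as FAILED because ok is false, and skipped keeps its terminal status and only gains a finished_at stamp.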
error
error(msg: str) -> 'ReportBase'
Append an error message (does not change status).
Note
fail is the preferred way to record a failure. For design consistency, error only records text; use fail to set the terminal status to FAILED.
Parameters:
- msg (str) – Message to append to errors.
Returns:
- ReportBase – Self.
fail
fail(message: Optional[str] = None) -> 'ReportBase'
Finalize as failed (FAILED).
Parameters:
- message (str, default: None) – Error text to append to errors.
Returns:
- ReportBase – Self.
model_post_init
note
note(msg: str) -> 'ReportBase'
Append a user-facing note.
Parameters:
- msg (str) – Message to append to notes.
Returns:
- ReportBase – Self.
request_review
request_review(reason: str | None = None) -> HasReviewTV
Set the HITL review flag.
Parameters:
- reason (str, default: None) – Short UI-visible reason for requesting review.
Returns:
- ReportBase – Self.
skip
skip(reason: Optional[str] = None) -> 'ReportBase'
Finalize as skipped (SKIPPED).
Parameters:
- reason (str, default: None) – Rationale appended to notes as "Skipped: {reason}".
Returns:
- ReportBase – Self.
start
Mark the unit as running and stamp started_at if missing.
Returns:
- ReportBase – Self (for fluent chaining).
succeed
Finalize successfully (SUCCEEDED) and set percent=100.
Also stamps finished_at to the current time.
Returns:
- ReportBase – Self.
warn
warn(msg: str) -> 'ReportBase'
Append a non-fatal warning.
Parameters:
- msg (str) – Message to append to warnings.
Returns:
- ReportBase – Self.
FileReport
Bases: ReportBase
Per-file processing timeline composed of ordered steps.
Aggregates StepReport items and provides
helpers for appending terminal steps (success/failed/skipped) and requesting
human-in-the-loop (HITL) review.
Attributes:
FileReport-specific:
- path (Path) – Source path or URI for display/debugging.
- file_id (str or None) – Stable identifier for the file (preferably unique within a batch).
- steps (list[StepReport]) – Ordered step sequence for this file.
- n_steps (int) – Number of steps in the process (used to compute percent).
- label (str) – Convenience label derived from path (basename).
- name (str) – Convenience name derived from path (basename).
- mime_type (str or None) – Guessed MIME type based on the path extension.
- size_bytes (int or None) – Best-effort file size in bytes.
- requires_human_review (bool) – Whether this file requires human review (computed).
- human_review_reason (str or None) – Human-readable summary of why review is required (computed).
Inherited lifecycle fields (see ReportBase):
- status (Status) – Lifecycle status (see ReportBase).
- percent (int) – Progress percentage (see ReportBase).
- started_at (datetime | None) – When processing started (see ReportBase).
- finished_at (datetime | None) – When processing finished (see ReportBase).
- notes (list[str]) – Narrative messages (see ReportBase).
- errors (list[str]) – Fatal issues (see ReportBase).
- warnings (list[str]) – Non-fatal issues (see ReportBase).
- metadata (dict[str, Any]) – Arbitrary structured context (see ReportBase).
- review (ReviewFlag) – File-level HITL flag (see ReportBase).
- report_version (str) – Schema version (see ReportBase).
Notes
See ReportBase for full semantics of
lifecycle fields and methods (start, end, fail, etc.).
Methods:
- _coerce_path
- _flagged_steps – Return steps that have requested human review.
- _make_unique_step_id – Generate a slugified, unique step id based on a label.
- _recompute_percent – Recompute percent as the fraction of completed steps.
- add_completed_step – Create a SUCCEEDED step and append it (chainable).
- add_failed_step – Create a FAILED step and append it (chainable).
- add_review_step – Create a step that requests HITL review, then append it.
- add_skipped_step – Create a SKIPPED step and append it (chainable).
- append_step – Finalize and append a step; recompute aggregate percent.
- begin – Construct and mark the file report as running.
- clear_review – Clear the HITL review flag.
- end – Finalize the unit if not already terminal.
- error – Append an error message (does not change status).
- fail – Finalize as failed (FAILED).
- last_step – Return the most recently appended step or None if empty.
- model_post_init
- note – Append a user-facing note.
- request_review – Set the HITL review flag.
- skip – Finalize as skipped (SKIPPED).
- start – Mark the unit as running and stamp started_at if missing.
- succeed – Finalize successfully (SUCCEEDED) and set percent=100.
- warn – Append a non-fatal warning.
_pipeline
class-attribute, instance-attribute
human_review_reason
property
Compact human-readable reason summarizing review needs (computed).
Combines file-level reason (if present) with a roll-up of flagged steps. Shows up to five step names and a “+N more” suffix if necessary.
Returns:
- str or None – Summary text or None if no review is requested.
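One way such a summary could be assembled is sketched below. The function summarize_review, its parameters, the "; " separator, and the exact wording of the output are all assumptions made for illustration; only the five-name cap and the "+N more" suffix come from the description above.

```python
from typing import List, Optional


def summarize_review(
    file_reason: Optional[str],
    flagged_step_labels: List[str],
    file_flagged: bool = False,
    max_names: int = 5,
) -> Optional[str]:
    """Hypothetical roll-up; the separator and wording are assumptions."""
    if not file_flagged and not flagged_step_labels:
        return None  # no review requested anywhere
    parts: List[str] = []
    if file_reason:
        parts.append(file_reason)
    if flagged_step_labels:
        # Show at most max_names step names, then a "+N more" suffix.
        shown = ", ".join(flagged_step_labels[:max_names])
        extra = len(flagged_step_labels) - max_names
        suffix = f" +{extra} more" if extra > 0 else ""
        parts.append(f"Steps requiring review: {shown}{suffix}")
    return "; ".join(parts) or "Review requested"


summary = summarize_review(
    "Low OCR confidence",
    [f"step-{i}" for i in range(1, 8)],
    file_flagged=True,
)
none_needed = summarize_review(None, [])
```

With seven flagged steps, summary lists the first five names followed by "+2 more", and none_needed is None because nothing is flagged.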
ok
property
Truthiness of success used by end.
Returns:
- bool – False if status is FAILED or any errors exist. True if status is SUCCEEDED. Otherwise, all(step.ok for step in steps) (or True if no steps).
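The three-branch rule above can be written out as a small pure function. This is a sketch for illustration: file_ok is a hypothetical name, and the status is passed as a plain string rather than the library's Status type.

```python
from typing import Iterable, Sequence


def file_ok(status: str, errors: Sequence[str], step_oks: Iterable[bool]) -> bool:
    """Hypothetical mirror of the documented rule; status is a plain string here."""
    if status == "FAILED" or errors:
        # An explicit failure or any recorded error wins over everything else.
        return False
    if status == "SUCCEEDED":
        return True
    # all() of an empty iterable is True, which covers the "no steps" case.
    return all(step_oks)


no_steps = file_ok("RUNNING", [], [])
one_bad_step = file_ok("RUNNING", [], [True, False])
late_error = file_ok("SUCCEEDED", ["checksum mismatch"], [True])
```

Note the ordering: because errors are checked first, a unit marked SUCCEEDED that still carries an error message is reported as not ok in this sketch.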
requires_human_review
property
Whether this file needs human review (computed).
True if the file itself is flagged or any contained step is flagged.
Returns:
- bool – True if review is required; otherwise False.
size_bytes
property
Best-effort determination of the file's size in bytes. Avoids raising on missing or inaccessible paths.
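A best-effort lookup of this kind is commonly written as a guarded stat call. The sketch below assumes that approach; best_effort_size is a hypothetical helper and may differ from how the property is actually implemented.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def best_effort_size(path: Union[str, Path]) -> Optional[int]:
    """Hypothetical helper: return the size in bytes, or None if unavailable."""
    try:
        return Path(path).stat().st_size
    except (OSError, ValueError):
        # Missing file, permission problem, or a malformed path (e.g. embedded NUL).
        return None


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"12345")
    present = best_effort_size(sample)
    missing = best_effort_size(Path(tmp) / "does-not-exist.bin")
```

Returning None instead of raising lets report serialization proceed even when the source file has been moved or deleted after processing.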
_coerce_path
classmethod
_flagged_steps
_flagged_steps() -> List[StepReport]
Return steps that have requested human review.
Returns:
- list[StepReport] – Subset of steps whose review.flagged is True.
_make_unique_step_id
_make_unique_step_id(label: str) -> str
Generate a slugified, unique step id based on a label.
If the slug already exists among current steps, appends -2, -3,
etc., until unique.
Parameters:
- label (str) – Human-readable label to slugify.
Returns:
- str – Unique step identifier.
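The suffixing scheme described above can be sketched as follows. The slugification rule (lowercase, non-alphanumeric runs collapsed to hyphens), the "step" fallback, and both function names are assumptions; only the -2, -3 suffix behavior is taken from the description.

```python
import re
from typing import Iterable


def slugify(label: str) -> str:
    """Lowercase the label and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return slug or "step"  # fallback name is an assumption


def make_unique_step_id(label: str, existing_ids: Iterable[str]) -> str:
    """Return the slug, or the slug with the first free -N suffix (N >= 2)."""
    taken = set(existing_ids)
    base = slugify(label)
    if base not in taken:
        return base
    n = 2
    while f"{base}-{n}" in taken:
        n += 1
    return f"{base}-{n}"


first = make_unique_step_id("Extract text (OCR)", [])
second = make_unique_step_id("Extract text (OCR)", [first])
third = make_unique_step_id("Extract text (OCR)", [first, second])
```

Unlike the documented method, this sketch takes the existing ids as an argument so it can run without a FileReport instance.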
_recompute_percent
Recompute percent as the fraction of completed steps.
add_completed_step
add_completed_step(label: str, *, id: str | None = None, note: str | None = None, metadata: dict | None = None) -> 'FileReport'
Create a SUCCEEDED step and append it (chainable).
Parameters:
- label (str) – Step label for UI.
- id (str, default: None) – Explicit step id; if omitted, a unique id is derived from label.
- note (str | None, default: None) – Optional note appended to the step.
- metadata (dict, default: None) – Metadata merged into the step.
Returns:
- FileReport – Self.
add_failed_step
add_failed_step(label: str, *, id: str | None = None, reason: str | None = None, metadata: dict | None = None) -> 'FileReport'
Create a FAILED step and append it (chainable).
Parameters:
- label (str) – Step label.
- id (str, default: None) – Explicit id; if omitted, derived from label.
- reason (str, default: None) – Failure reason recorded on the step.
- metadata (dict, default: None) – Metadata merged into the step.
Returns:
- FileReport – Self.
add_review_step
add_review_step(label: str, *, id: str | None = None, reason: str | None = None, metadata: dict | None = None, mark_success: bool = True) -> 'FileReport'
Create a step that requests HITL review, then append it.
By default the step is marked SUCCEEDED (common pattern: “passed
but needs review”). The file-level review flag will be set when
appended if the file isn’t already flagged.
Parameters:
- label (str) – Step label.
- id (str, default: None) – Explicit id; if omitted, derived from label.
- reason (str, default: None) – UI-visible reason for review.
- metadata (dict, default: None) – Extra context for reviewers.
- mark_success (bool, default: True) – If True, mark the step SUCCEEDED; otherwise leave status as-is.
Returns:
- FileReport – Self.
add_skipped_step
add_skipped_step(label: str, *, id: str | None = None, reason: str | None = None, metadata: dict | None = None) -> 'FileReport'
Create a SKIPPED step and append it (chainable).
Parameters:
- label (str) – Step label.
- id (str, default: None) – Explicit id; if omitted, derived from label.
- reason (str, default: None) – Skip rationale (added to notes).
- metadata (dict, default: None) – Metadata merged into the step.
Returns:
- FileReport – Self.
append_step
append_step(step: StepReport, max_steps: int = 10000) -> 'FileReport'
Finalize and append a step; recompute aggregate percent.
The step is finalized via StepReport.end, appended to steps, and the file percent is updated as the arithmetic mean of child step percents. If the step requests HITL review and the file is not already flagged, the file's review is set.
Parameters:
- step (StepReport) – Step to finalize and append.
- max_steps (int, default: 10000) – The maximum number of steps allowed.
Returns:
- FileReport – Self (chainable).
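The append flow described above (finalize, append, recompute percent, propagate the review flag) might look roughly like the sketch below. StepSketch and FileSketch are hypothetical stand-ins, not the real classes. Raising ValueError at the step cap and copying the step's reason to the file are both assumptions, since the documentation does not state either behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepSketch:
    label: str
    percent: int = 0
    review_flagged: bool = False
    review_reason: Optional[str] = None

    def end(self) -> "StepSketch":
        # Stand-in for StepReport.end; a no-op in this sketch.
        return self


@dataclass
class FileSketch:
    steps: List[StepSketch] = field(default_factory=list)
    percent: int = 0
    review_flagged: bool = False
    review_reason: Optional[str] = None

    def append_step(self, step: StepSketch, max_steps: int = 10000) -> "FileSketch":
        if len(self.steps) >= max_steps:
            # Assumption: exceeding the cap raises; the real behavior is not documented here.
            raise ValueError(f"step limit of {max_steps} reached")
        self.steps.append(step.end())
        # Arithmetic mean of child step percents, as described above.
        self.percent = round(sum(s.percent for s in self.steps) / len(self.steps))
        if step.review_flagged and not self.review_flagged:
            # Propagate the first review request to the file level.
            self.review_flagged = True
            self.review_reason = step.review_reason
        return self


report = FileSketch()
report.append_step(StepSketch("parse", percent=100))
report.append_step(
    StepSketch("ocr", percent=50, review_flagged=True, review_reason="low confidence")
)
```

After the two appends, report.percent is the mean of 100 and 50, and the file carries the review flag raised by the second step.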
begin
classmethod
begin(path: Path | str, file_id: str | None = None, n_steps: int = 1, metadata: dict | None = None) -> 'FileReport'
Construct and mark the file report as running.
Parameters:
- path (Path | str) – The path to the file.
- file_id (str | None, default: None) – Stable file identifier.
- n_steps (int, default: 1) – Number of steps in the process.
- metadata (dict | None, default: None) – Dictionary of metadata about the file.
Returns:
- FileReport – Started file report (status=RUNNING).
last_step
last_step() -> StepReport | None
Return the most recently appended step or None if empty.
Returns:
- StepReport or None – Last step in steps, if any.
StepReport
Bases: ReportBase
Single unit of work within a file or batch.
A step succeeds if it is explicitly marked SUCCEEDED or, when not
terminal, if all recorded checks pass and no errors are present.
Aggregates validation checks and lifecycle metadata inherited from
ReportBase.
Attributes:
StepReport-specific:
- label (str) – Human-readable label for UI display.
- id (str or None) – Machine-friendly identifier (e.g., "parse", "analyze").
- checks (list[Check]) – Recorded boolean validations for this step.
Inherited lifecycle fields (see ReportBase):
- status (Status) – Lifecycle status (see ReportBase).
- percent (int) – Progress percentage (see ReportBase).
- started_at (datetime | None) – When processing started (see ReportBase).
- finished_at (datetime | None) – When processing finished (see ReportBase).
- notes (list[str]) – Narrative messages (see ReportBase).
- errors (list[str]) – Fatal issues (see ReportBase).
- warnings (list[str]) – Non-fatal issues (see ReportBase).
- metadata (dict[str, Any]) – Arbitrary structured context (see ReportBase).
- review (ReviewFlag) – HITL review flag (see ReportBase).
- report_version (str) – Schema version (see ReportBase).
- duration_ms (float or None) – Computed elapsed time in milliseconds.
Notes
- id may be omitted; containers such as FileReport often assign a unique ID.
- The auto-generated initializer accepts the same fields as attributes.
- Lifecycle semantics (start, end, succeed, fail, skip) are defined in ReportBase.
Examples:
>>> st = StepReport.begin("Extract text (OCR)")
>>> st.add_check("ocr_quality>=0.9", ok=True)
>>> st.end() # end is typically called by FileReport.append_step
>>> st.succeeded
True
>>> st.terminal
True
>>> st.duration_ms  # illustrative; the actual value varies per run
314.0
Methods:
- _default_id
- add_check – Record a boolean validation result.
- begin – Construct and mark the step as started.
- clear_review – Clear the HITL review flag.
- end – Finalize the unit if not already terminal.
- error – Append an error message (does not change status).
- fail – Finalize as failed (FAILED).
- model_post_init
- note – Append a user-facing note.
- request_review – Set the HITL review flag.
- skip – Finalize as skipped (SKIPPED).
- start – Mark the unit as running and stamp started_at if missing.
- succeed – Finalize successfully (SUCCEEDED) and set percent=100.
- warn – Append a non-fatal warning.
ok
property
Reviews checks and surfaces errors.
Returns:
- bool – False if status is FAILED or any errors exist. True if status is SUCCEEDED. Otherwise, all(check.ok for check in checks) (or True if no checks).
_default_id
add_check
Record a boolean validation result.
Parameters:
- name (str) – Check identifier (e.g., "manifest_present").
- ok (bool) – Outcome of the check.
- detail (str, default: None) – Additional context for UI/debugging.
Examples:
>>> st = StepReport.begin("validate")
>>> st.add_check("ids_unique", ok=False, detail="3 duplicates")
>>> st.ok
False
begin
classmethod
Construct and mark the step as started.
Returns:
- StepReport – Started step report (status=RUNNING).