Embargo

Real-world ML pipelines have latency: data arrives late, ETL jobs run on schedules, and features take time to compute. The embargo parameter models this lag.

Usage

import timefence

# `transactions` is assumed to be a source object defined earlier.
rolling_spend = timefence.Feature(
    source=transactions,
    sql="SELECT ...",
    embargo="1d",  # feature becomes available 1 day after the event
)

How it works

With embargo="1d", a feature recorded at 2024-03-15 10:00 is eligible only for labels timestamped after 2024-03-16 10:00. This prevents optimistic leakage from features that would not yet be available in production.

The full temporal constraint becomes:

feature_time < label_time - embargo
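
As an illustration only, the constraint can be written as a plain Python predicate over standard-library datetimes. The function name `is_eligible` is hypothetical and is not part of timefence; the sketch simply evaluates the inequality as written, so a label exactly one embargo after the feature is excluded.

```python
from datetime import datetime, timedelta


def is_eligible(feature_time: datetime, label_time: datetime, embargo: timedelta) -> bool:
    """Evaluate feature_time < label_time - embargo (hypothetical helper)."""
    return feature_time < label_time - embargo


embargo = timedelta(days=1)
feature_time = datetime(2024, 3, 15, 10, 0)

# Label less than one day after the feature: rejected.
print(is_eligible(feature_time, datetime(2024, 3, 16, 9, 59), embargo))   # False
# Label exactly one day after: rejected, because the inequality is strict.
print(is_eligible(feature_time, datetime(2024, 3, 16, 10, 0), embargo))   # False
# Label more than one day after the feature: accepted.
print(is_eligible(feature_time, datetime(2024, 3, 16, 10, 1), embargo))   # True
```

With a zero embargo (`timedelta(0)`) the predicate reduces to `feature_time < label_time`.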

When to use embargo

Scenario	Recommended Embargo
Real-time features	`"0d"` (no embargo)
Daily ETL pipeline	`"1d"`
Weekly batch features	`"7d"`
Monthly aggregates	`"30d"`

Tip

When in doubt, set embargo to match your production pipeline's worst-case latency. It's better to be conservative (larger embargo) than to train on features that wouldn't actually be available at prediction time.
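
One possible way to estimate worst-case latency is to measure the lag between event time and arrival time on historical records and take the maximum. The sketch below is an assumption about how you might do that, not a timefence feature: `suggest_embargo` is a hypothetical helper, and it assumes an hour-based string such as "26h" is acceptable to the duration parser.

```python
import math
from datetime import timedelta


def suggest_embargo(lags: list[timedelta]) -> str:
    """Round the largest observed lag up to whole hours (hypothetical helper)."""
    worst = max(lags)  # raises ValueError if no lags were observed
    hours = math.ceil(worst / timedelta(hours=1))
    return f"{hours}h"


# Illustrative lags between an event and the moment its feature row landed.
observed_lags = [
    timedelta(hours=3),
    timedelta(hours=25, minutes=30),
    timedelta(hours=12),
]

print(suggest_embargo(observed_lags))  # 26h
```

Rounding up rather than to the nearest hour keeps the estimate on the conservative side.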

Duration format

Accepted formats: "30d", "1d12h", "6h", "30m", "15s".
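
To make the examples concrete, here is a minimal sketch that converts such strings into `datetime.timedelta` values. It is not timefence's parser: `parse_duration` is a hypothetical name, the reading of `d`, `h`, `m`, `s` as days, hours, minutes, and seconds is an assumption based on the examples above, and the sketch accepts units in any order, which the real specification may not.

```python
import re
from datetime import timedelta

# Assumed unit meanings, inferred from the examples in this section.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "1d12h" into a timedelta (hypothetical helper)."""
    # The whole string must be one or more <integer><unit> groups.
    if not re.fullmatch(r"(?:\d+[dhms])+", text):
        raise ValueError(f"invalid duration: {text!r}")
    total = timedelta()
    for value, unit in re.findall(r"(\d+)([dhms])", text):
        total += timedelta(**{_UNITS[unit]: int(value)})
    return total


print(parse_duration("1d12h"))  # 1 day, 12:00:00
print(parse_duration("30m"))    # 0:30:00
```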

See Duration Format for the full specification.