events_countdown

Computes previous and next occurrence of special events.

Usage

Compute the previous and next occurrences of events.

Given a date column (date_col) and its format (date_format), the calculator generates two columns per event:

  • the number of days until the next occurrence of the event

  • the number of days since the last occurrence of the event

The following events are supported: epiphanie, chandeleur, saint_valentin, mardi_gras, paques, pentecote, fete_meres, fete_musique, halloween, black_friday, cyber_monday, noel, nouvel_an, nouvel_an_chinois, aid, ramadan, roch_ha_chanah, yom_kippour, soukkot, hannoukkah, pourim, pessah, chavouot.

It will compute those columns for all the events at the same time.
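
The two quantities can be sketched in plain Python for a fixed-date event. This is an illustration only, not the calculator's implementation: the helper names days_until and days_since are hypothetical, the event is assumed to fall on January 1 every year (as for nouvel_an), and a date that falls on the event itself is assumed to count as 0 in both directions. The calculator's own event calendar and boundary handling may differ, and movable events such as paques or ramadan would need a per-year lookup that is not shown here.

```python
from datetime import date


def days_until(day: date, month: int, dom: int) -> int:
    """Days from `day` to the next occurrence of a fixed-date event (hypothetical helper)."""
    candidate = date(day.year, month, dom)
    if candidate < day:
        # This year's occurrence has already passed: use next year's.
        candidate = date(day.year + 1, month, dom)
    return (candidate - day).days


def days_since(day: date, month: int, dom: int) -> int:
    """Days from the last occurrence of a fixed-date event to `day` (hypothetical helper)."""
    candidate = date(day.year, month, dom)
    if candidate > day:
        # This year's occurrence has not happened yet: use last year's.
        candidate = date(day.year - 1, month, dom)
    return (day - candidate).days


receipt = date(2023, 11, 20)
print(days_until(receipt, 1, 1))  # 42 (until 2024-01-01)
print(days_since(receipt, 1, 1))  # 323 (since 2023-01-01)
```

Outside the event day itself, the two values add up to the length of the interval between two consecutive occurrences (365 or 366 days for a yearly fixed-date event).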

This calculator can be used with the following method:

events_countdown

Examples:

  • Compute the number of days until or since New Year.

  • Compute the number of days until Black Friday, Chinese New Year and Christmas.


Main Parameters

The bold options represent the default values when the parameters are optional.

  • input_columns List of columns used as input by the calculator, that is, the columns used to fill the output columns.

  • output_columns_prefix Prefix to use for the output columns, as this calculator adds several.

  • global (true, false) Whether this calculator should be run before the data is split for cross-validation during training.

  • steps [optional] (training, prediction, postprocessing) List of pipeline steps in which the columns from this calculator are added to the data. Note that when training is listed, the calculator is actually applied during preprocessing.

  • store_in_model [optional] (true, false) Whether the columns computed by this calculator should be stored in the model so that they are not recalculated during prediction. This is only relevant if the calculated columns are added in both training and prediction. If this parameter is omitted, the values are not stored in the model. The following parameters only make sense if this parameter is set to true.

  • stored_columns [required if store_in_model is true] List indicating the columns to be stored among the output_columns.

  • stored_keys [required if store_in_model is true] List indicating the columns to use for identifying the correct values to join on the data for prediction among the stored values (logically, they are to be chosen from the input_columns).
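
The relationship between stored_columns and stored_keys can be pictured as a keyed lookup. The sketch below is an assumption about the general idea, not a description of how the platform stores or joins values internally; build_store and join_stored are hypothetical helpers, and the rows reuse receipt_date and event_noel_in purely for illustration.

```python
def build_store(rows, stored_keys, stored_columns):
    """Keep the stored_columns of each training row, indexed by its stored_keys values."""
    store = {}
    for row in rows:
        key = tuple(row[k] for k in stored_keys)
        store[key] = {c: row[c] for c in stored_columns}
    return store


def join_stored(rows, stored_keys, store):
    """Attach stored values to prediction rows that share the same key."""
    joined = []
    for row in rows:
        key = tuple(row[k] for k in stored_keys)
        # Rows whose key was never stored simply receive no extra columns.
        joined.append({**row, **store.get(key, {})})
    return joined


training_rows = [
    {"receipt_date": "2023-11-20", "event_noel_in": 34},
    {"receipt_date": "2023-11-25", "event_noel_in": 29},
]
store = build_store(training_rows, ["receipt_date"], ["event_noel_in"])

prediction_rows = [{"receipt_date": "2023-11-25"}]
print(join_stored(prediction_rows, ["receipt_date"], store))
# [{'receipt_date': '2023-11-25', 'event_noel_in': 29}]
```

This also suggests why stored_keys are normally chosen among the input_columns: the key has to be available on the prediction data before the calculated columns exist.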


Specific Parameters

  • countdown_type [optional] (in, ago) List of countdown columns to create: in (days until the next occurrence) and/or ago (days since the last occurrence). By default, both are added.

  • date_format [optional] Format of the dates in the input column. Defaults to %Y-%m-%d.
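
The default value %Y-%m-%d looks like a strptime-style format string. Assuming date_format follows the same directives as Python's datetime module (an assumption, not something this page states), the effect of the parameter can be sketched as follows; parse_date is a hypothetical helper.

```python
from datetime import datetime

DEFAULT_DATE_FORMAT = "%Y-%m-%d"  # default value given for date_format


def parse_date(value: str, date_format: str = DEFAULT_DATE_FORMAT):
    """Parse a date string with a strptime-style format (hypothetical helper)."""
    return datetime.strptime(value, date_format).date()


print(parse_date("2023-11-20"))              # 2023-11-20
print(parse_date("20/11/2023", "%d/%m/%Y"))  # 2023-11-20
```

A value that does not match the format raises a ValueError in this sketch; how the calculator itself reacts to unparseable dates is not documented here.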


Examples

  1. Given a dataset of daily sales that includes the sale date (receipt_date), the user wants the number of days until special dates such as Christmas or Black Friday.

    calculated_cols:
      special_event:
        method: events_countdown
        params:
          countdown_type:
            - in
        input_columns:
          - receipt_date
        output_columns_prefix: event

    Output:

    receipt_date  event_black_friday_in  event_noel_in
    2023-11-20    9                      34
    2023-11-25    4                      29
    2023-12-01    364                    23

  2. Same context, but now the user wants the number of days since those special dates.

    calculated_cols:
      special_event:
        method: events_countdown
        params:
          countdown_type:
            - ago
        input_columns:
          - receipt_date
        output_columns_prefix: event

    Output:

    receipt_date  event_black_friday_ago  event_noel_ago
    2023-11-20    356                     331
    2023-11-25    361                     336
    2023-12-01    2                       342

  3. Alternatively, the user wants both the number of days until and the number of days since each event from the same calculator. Since countdown_type is omitted, both are added by default.

    calculated_cols:
      special_event:
        method: events_countdown
        params:
        input_columns:
          - receipt_date
        output_columns_prefix: event
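
No output is shown for this last configuration. Judging from the column names in the first two examples (event_black_friday_in, event_noel_ago), the generated columns presumably follow the pattern prefix_event_type. The sketch below enumerates the names under that assumption; output_columns is a hypothetical helper and the column order is a guess.

```python
EVENTS = [
    "epiphanie", "chandeleur", "saint_valentin", "mardi_gras", "paques",
    "pentecote", "fete_meres", "fete_musique", "halloween", "black_friday",
    "cyber_monday", "noel", "nouvel_an", "nouvel_an_chinois", "aid",
    "ramadan", "roch_ha_chanah", "yom_kippour", "soukkot", "hannoukkah",
    "pourim", "pessah", "chavouot",
]


def output_columns(prefix, countdown_types=("in", "ago")):
    """List the assumed column names: <prefix>_<event>_<countdown_type>."""
    return [
        f"{prefix}_{event}_{kind}"
        for event in EVENTS
        for kind in countdown_types
    ]


columns = output_columns("event")
print(len(columns))  # 46
print(columns[:2])   # ['event_epiphanie_in', 'event_epiphanie_ago']
```

With both countdown types enabled, the 23 events listed on this page would yield 46 columns under this naming assumption.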
