weather

Gets weather description from date and GPS coordinates.

Usage

Retrieves weather information for a given GPS coordinate and date. Weather forecasts are available for the next two weeks.
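To make the two-week limit concrete, here is a minimal sketch of a horizon check. It is only an illustration: the function name `within_forecast_horizon` is hypothetical, and it assumes the window is counted as 14 calendar days from the day the calculator runs, which this page does not state explicitly.

```python
from datetime import date, timedelta

# Assumption: "two weeks" means 14 calendar days counted from the run date.
FORECAST_HORIZON = timedelta(days=14)


def within_forecast_horizon(target: date, today: date) -> bool:
    """Return True if `target` is no later than two weeks after `today`."""
    return target <= today + FORECAST_HORIZON


run_date = date(2022, 9, 23)
print(within_forecast_horizon(date(2022, 10, 1), run_date))   # 8 days ahead
print(within_forecast_horizon(date(2022, 10, 15), run_date))  # 22 days ahead
```

Dates for which this check fails would have no weather values and need a fallback.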

This calculator can be used with the following method:

weather

Examples:

  • Get weather information to better characterize the environment and refine sales predictions.

  • In retail, rainy days are often correlated with increased sales.


Main Parameters

The bold options represent the default values when the parameters are optional.

  • input_columns List of columns used as input by the calculator: GPS coordinates and date (daily).

  • output_columns List of columns added by the calculator: tempmax, tempmin, temp, feelslikemax, feelslikemin, feelslike, dew, humidity, precip, precipprob, precipcover, preciptype, snow, snowdepth, windgust, windspeed, winddir, pressure, cloudcover, visibility, solarradiation, solarenergy, uvindex, moonphase, severerisk

  • global (true, false) Whether this calculator should be run before the data is split for cross-validation during training.

  • steps [optional] (training, prediction, postprocessing) List of steps in a pipeline where columns from this calculator are added to the data. Note that when the training option is listed, the calculator is actually added during preprocessing.

  • store_in_model [optional] (true, false) Whether the columns calculated by this calculator should be stored in the model, to avoid recalculating them during prediction. This is only relevant if the calculated columns are added to both training and prediction. Without this parameter, the values are not stored in the model. The following parameters only make sense if this parameter is set to true.

  • stored_columns [required if store_in_model is true] List indicating the columns to be stored among the output_columns.

  • stored_keys [required if store_in_model is true] List indicating the columns to use for identifying the correct values to join on the data for prediction among the stored values (logically, they are to be chosen from the input_columns).


Specific Parameters

  • gps_coordinates : The column that contains the GPS coordinates in the format “latitude,longitude”, for instance 48.8647,2.3490

  • date_col: The column that contains the date

  • date_format [optional]: The format of the dates in the date column. Default value: %Y-%m-%d
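As an illustration of the two input formats, the sketch below parses a “latitude,longitude” string and a date string using Python's standard library. The helper names `parse_gps` and `parse_day` are hypothetical and are not part of the calculator; the sketch only assumes that date_format follows the usual strftime/strptime directives, as the default %Y-%m-%d suggests.

```python
from datetime import date, datetime

DEFAULT_DATE_FORMAT = "%Y-%m-%d"  # default value documented for date_format


def parse_gps(value: str) -> tuple[float, float]:
    """Split a "latitude,longitude" string into a (lat, lon) pair of floats."""
    lat_text, lon_text = value.split(",")
    lat, lon = float(lat_text), float(lon_text)
    # Reject values that cannot be valid coordinates.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {value!r}")
    return lat, lon


def parse_day(value: str, date_format: str = DEFAULT_DATE_FORMAT) -> date:
    """Parse a daily date string with the given strptime format."""
    return datetime.strptime(value, date_format).date()


print(parse_gps("48.8647,2.3490"))
print(parse_day("2021-07-03"))
print(parse_day("03/07/2021", "%d/%m/%Y"))  # a non-default format
```

Checking a few values this way before running the pipeline can catch swapped coordinates or a date column that does not match the declared format.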


Examples

  1. We want to get weather information for different points of sale in different cities. This information can describe the environment more precisely.

    In this example:

    • pos_gps : GPS coordinates of the different points of sale

    • receipt_date : daily dates

calculated_cols:
  temperature_precipitation_calculator_feat:
      method: weather
      input_columns:
        - pos_gps
        - receipt_date
      output_columns:
        - tempmax
        - tempmin
        - temp
        - precip
        - precipcover
        - cloudcover
        - uvindex
      params:
        gps_coordinate: pos_gps
        date_col: receipt_date
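
output:

| receipt_date | pos_id        | pos_gps                | tempmax | tempmin | temp | precip | precipcover | cloudcover | uvindex |
|--------------|---------------|------------------------|---------|---------|------|--------|-------------|------------|---------|
| 2021-07-03   | 3701092636753 | 40.9784275,-74.122508  | 19.3    | 15.6    | 17.4 | 6.385  | 41.67       | 100.0      | 3.0     |
| 2022-09-23   | 3701092637040 | 42.4792134,-70.9048013 | 15.1    | 8.9     | 11.6 | 0.0    | 0.0         | 18.6       | 8.0     |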

If you need to make forecasts with a time horizon longer than two weeks, it is useful to combine this calculator with others in order to obtain the last available value, which can then be used in the prediction set.

Example: we want to use the last available value of precipcover as the default when the weather information is not available.

We combine 3 calculators:

  • weather (get needed weather information per date)

  • aggregate_val_group_by_key with last as aggregation (get the last available value for a column precipcover for each pos_id)

  • case_na (if weather information is not available, blank fields are replaced with the last available value: precipcover and last_precipcover are combined into the final column precipcover_combined)

calculated_cols:
# ---------------------------------------------- #
# ------------------ WEATHER ------------------- #
# ---------------------------------------------- #

  temperature_precipitation_calculator_feat:
      method: weather
      input_columns:
        - pos_gps
        - receipt_date
      output_columns:
        - precipcover
      params:
        gps_coordinate: pos_gps
        date_col: receipt_date
# ---------------------------------------------- #
# --------- LAST AVAILABLE INFORMATION --------- #
# ---------------------------------------------- #
  last_precipcover_feat:
      method: aggregate_val_group_by_key
      input_columns:
        - pos_id
        - precipcover
      output_columns:
        - last_precipcover
      store_in_model: true
      stored_keys:
        - pos_id
      stored_columns:
        - last_precipcover
      params:
        aggregation: last
        val:
          - precipcover
        group_by:
          - pos_id
# ---------------------------------------------- #
# COMBINE WEATHER INFORMATION AND LAST AVAILABLE #
# ---------------------------------------------- #

  case_na_precipcover_feat:
      input_columns:
        - precipcover
        - last_precipcover
      output_columns:
        - precipcover_combined
      method: case_na
      params:
        priority:
          - precipcover
          - last_precipcover

output:

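| receipt_date | pos_id        | pos_gps               | precipcover | last_precipcover | precipcover_combined |
|--------------|---------------|-----------------------|-------------|------------------|----------------------|
| 2021-07-03   | 3701092636753 | 40.9784275,-74.122508 | 41.67       | 35.34            | 41.67                |
| 2021-07-04   | 3701092636753 | 40.9784275,-74.122508 | 45.5        | 35.34            | 45.5                 |
| 2021-07-05   | 3701092636753 | 40.9784275,-74.122508 | 40          | 35.34            | 40                   |
| 2021-07-06   | 3701092636753 | 40.9784275,-74.122508 | 35.34       | 35.34            | 35.34                |
| 2021-07-07   | 3701092636753 | 40.9784275,-74.122508 | null        | 35.34            | 35.34                |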
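The logic of this three-calculator combination can be sketched in plain Python. This is only an illustration of the idea, not the calculators' implementation: the function name `add_last_and_combined` is hypothetical, and the sketch assumes that rows are sorted by date and that the last aggregation keeps the most recent non-null value per pos_id.

```python
# Weather values per day for one point of sale; None marks a missing value.
rows = [
    {"receipt_date": "2021-07-03", "pos_id": "3701092636753", "precipcover": 41.67},
    {"receipt_date": "2021-07-04", "pos_id": "3701092636753", "precipcover": 45.5},
    {"receipt_date": "2021-07-05", "pos_id": "3701092636753", "precipcover": 40},
    {"receipt_date": "2021-07-06", "pos_id": "3701092636753", "precipcover": 35.34},
    {"receipt_date": "2021-07-07", "pos_id": "3701092636753", "precipcover": None},
]


def add_last_and_combined(rows):
    """Add last_precipcover and precipcover_combined to each row."""
    # Step 1: last available (non-null) precipcover per pos_id.
    # Assumption: rows are already sorted by receipt_date.
    last_by_pos = {}
    for row in rows:
        if row["precipcover"] is not None:
            last_by_pos[row["pos_id"]] = row["precipcover"]

    # Step 2: prefer the day's own value, fall back to the last available one.
    result = []
    for row in rows:
        last = last_by_pos.get(row["pos_id"])
        own = row["precipcover"]
        combined = own if own is not None else last
        result.append({**row, "last_precipcover": last, "precipcover_combined": combined})
    return result


for row in add_last_and_combined(rows):
    print(row["receipt_date"], row["last_precipcover"], row["precipcover_combined"])
```

With this input, only the 2021-07-07 row falls back to the last available value.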
