Identifying and preparing data
This section describes how to create features from existing datasets.
Defines all columns to be calculated. It details the formulae and operations used to derive new data points from existing columns. The subsequent analysis and prediction stages rely on these derived columns.
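As an illustration of deriving new columns from existing ones, here is a minimal sketch in plain Python. It does not reproduce the tool's own configuration syntax; the column names (`units_sold`, `unit_price`, `revenue`, `is_bulk`) and the helper `add_calculated_columns` are hypothetical.

```python
# Hypothetical input rows; the column names are illustrative only.
rows = [
    {"units_sold": 10, "unit_price": 2.5},
    {"units_sold": 4, "unit_price": 3.0},
]

# Hypothetical mapping: new column name -> formula computing it from a row.
calculated_columns = {
    "revenue": lambda r: r["units_sold"] * r["unit_price"],
    "is_bulk": lambda r: r["units_sold"] >= 10,
}


def add_calculated_columns(rows, formulas):
    """Return new rows with every calculated column appended."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay unchanged
        for name, formula in formulas.items():
            new_row[name] = formula(new_row)
        result.append(new_row)
    return result


enriched = add_calculated_columns(rows, calculated_columns)
```

Because each formula receives the row as built so far, a later formula in the mapping could also use a column calculated by an earlier one.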
Specifies the data type of each input column involved in preprocessing. Define these types accurately so that the data is handled and operated on correctly. Supported data types are np.int32, np.int64, np.float32, str, and bool. Explicit types also help optimize data processing and keep columns compatible with analytical functions.
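The sketch below shows one way such a column-to-type mapping could be applied with NumPy. The mapping `column_types`, the sample data, and the helper `cast_columns` are assumptions made for illustration, not the tool's actual interface.

```python
import numpy as np

# Hypothetical column -> dtype mapping using the supported types listed above.
column_types = {
    "store_id": np.int32,
    "units_sold": np.int64,
    "unit_price": np.float32,
    "product_name": str,
    "on_promotion": bool,
}

# Raw values as they might arrive from a text file: everything is a string.
raw_columns = {
    "store_id": ["1", "2"],
    "units_sold": ["10", "4"],
    "unit_price": ["2.5", "3.0"],
    "product_name": ["apple", "pear"],
    "on_promotion": ["1", "0"],
}


def cast_columns(raw, types):
    """Convert each raw column to a NumPy array of its declared type."""
    typed = {}
    for name, values in raw.items():
        dtype = types[name]
        if dtype is bool:
            # bool("0") is True in Python, so parse the flag as an int first.
            typed[name] = np.array([bool(int(v)) for v in values], dtype=bool)
        else:
            typed[name] = np.array(values).astype(dtype)
    return typed


typed = cast_columns(raw_columns, column_types)
```

The boolean branch is handled separately because any non-empty string is truthy in Python, so a direct conversion of `"0"` would yield `True`.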
Identifies the column containing date information. This column is crucial for sorting the dataset in chronological order to ensure consistent time-based splitting and for applying various temporal controls during data analysis. Proper identification and handling of the date column are vital for maintaining the integrity of time-series data.
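To show why the date column matters for time-based splitting, here is a minimal sketch that sorts rows chronologically and then splits them so that every training row precedes every holdout row. The column name `order_date`, the ISO date format, the 75% split, and the helper `chronological_split` are all assumptions for illustration.

```python
from datetime import date

# Hypothetical rows, deliberately out of order.
rows = [
    {"order_date": "2024-03-01", "units_sold": 7},
    {"order_date": "2024-01-15", "units_sold": 3},
    {"order_date": "2024-02-10", "units_sold": 5},
    {"order_date": "2024-04-20", "units_sold": 9},
]


def chronological_split(rows, date_column, train_fraction=0.75):
    """Sort rows by date, then cut once so no future row leaks into training."""
    ordered = sorted(rows, key=lambda r: date.fromisoformat(r[date_column]))
    cutoff = int(len(ordered) * train_fraction)
    return ordered[:cutoff], ordered[cutoff:]


train, holdout = chronological_split(rows, "order_date")
```

Sorting before splitting makes the split reproducible regardless of the order in which the rows were loaded.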
Specifies whether normalization is applied to the dataset during preprocessing. Normalization adjusts the scale of data values, which is particularly important for models sensitive to the magnitude of input features and can improve the model's performance and stability.
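The text above does not say which normalization method is used. As one common possibility, the sketch below applies z-score standardization (subtract the mean, divide by the standard deviation); the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Compute the mean and population standard deviation of a column."""
    mu = mean(values)
    sigma = pstdev(values)
    # Guard against constant columns, which would cause division by zero.
    return mu, (sigma if sigma > 0 else 1.0)


def standardize(values, mu, sigma):
    """Rescale values to zero mean and unit variance using fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Hypothetical training column.
train_prices = [2.0, 4.0, 6.0, 8.0]
mu, sigma = fit_standardizer(train_prices)
scaled = standardize(train_prices, mu, sigma)
```

Fitting the statistics on the training data and reusing them for later data keeps both on the same scale.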
Details specific operations that are applied during the preprocessing stage. This may include tasks such as data cleaning, outlier removal, feature scaling, or transformation techniques essential for preparing the data for effective model training and prediction.
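A minimal sketch of running a list of preprocessing operations in order might look like the following. The two operations (`drop_missing`, `clip_column`), the `operations` list format, and the clipping bounds are assumptions chosen for illustration, not operations the tool is known to provide.

```python
def drop_missing(rows, column):
    """Data cleaning: discard rows where the column has no value."""
    return [r for r in rows if r.get(column) is not None]


def clip_column(rows, column, lower, upper):
    """Outlier handling: limit values to the range [lower, upper]."""
    return [{**r, column: min(max(r[column], lower), upper)} for r in rows]


# Hypothetical ordered list of (function, keyword arguments) steps.
operations = [
    (drop_missing, {"column": "units_sold"}),
    (clip_column, {"column": "units_sold", "lower": 0, "upper": 100}),
]


def run_operations(rows, operations):
    """Apply each operation in sequence, feeding the result to the next one."""
    for func, kwargs in operations:
        rows = func(rows, **kwargs)
    return rows


raw = [
    {"units_sold": 12},
    {"units_sold": None},
    {"units_sold": 5000},
    {"units_sold": -3},
]
cleaned = run_operations(raw, operations)
```

Order matters here: clipping before dropping missing values would fail on the `None` entry, so the cleaning step runs first.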
Lists the columns that determine the granularity of the predictions. This setting defines the level of detail at which the forecast is generated, such as daily, weekly, or monthly forecasts. Understanding and setting the prediction resolution is crucial for aligning the outputs with business needs and operational planning.
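To make the idea of resolution columns concrete, the sketch below aggregates a target value to the level defined by a list of columns. The column names (`store_id`, `month`, `units_sold`) and the helper `aggregate_to_resolution` are hypothetical, and summing is only one possible aggregation.

```python
from collections import defaultdict

# Hypothetical rows at a finer level of detail than the forecast requires.
rows = [
    {"store_id": 1, "month": "2024-01", "units_sold": 10},
    {"store_id": 1, "month": "2024-01", "units_sold": 5},
    {"store_id": 1, "month": "2024-02", "units_sold": 7},
    {"store_id": 2, "month": "2024-01", "units_sold": 3},
]


def aggregate_to_resolution(rows, resolution_columns, target):
    """Sum the target over every distinct combination of resolution columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[c] for c in resolution_columns)
        totals[key] += row[target]
    return dict(totals)


# One value per store and month, then a coarser view with one value per month.
per_store_month = aggregate_to_resolution(rows, ["store_id", "month"], "units_sold")
per_month = aggregate_to_resolution(rows, ["month"], "units_sold")
```

Adding or removing a column from the list changes how many distinct series are forecast: here, three store-month combinations versus two months.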