
# Getting started

## 1. **Overview** <a href="#overview" id="overview"></a>

Let’s walk through a simple example, using the Verteego platform to generate predictions.

**You will need&#x20;**<mark style="color:blue;">**:**</mark>&#x20;

* <mark style="color:blue;">**Training file**</mark>: contains the data points on which the model is trained
* <mark style="color:blue;">**Model configuration**</mark>: a YAML file (created directly in Verteego) that defines the predictive model
* <mark style="color:blue;">**Prediction file**</mark>: contains the points for which you want predictions

{% hint style="info" %}
Here, we will walk through an example:\
Let’s say we want to **predict sales for a given item, point-of-sales, and date**.
{% endhint %}

We will need to:

1. Add the relevant train and predict datasets
2. Set up our configuration file
3. Launch a forecast pipeline run
4. Analyse the results
5. Add calculated features
6. Add features from external sources

## **2. A simple example** <a href="#a-simple-example" id="a-simple-example"></a>

**Context:** Let’s say we have two products (`item_id`), <mark style="color:red;">`123456`</mark> and <mark style="color:red;">`987654`</mark>. They’re sold in two different points-of-sales, with ids (`pos_id`) <mark style="color:red;">`50`</mark> and <mark style="color:red;">`100`</mark>.&#x20;

**Goal:** For each product and shop, we have the total quantity sold per day from 2019 to 2023, and we would like to predict sales for 2024.&#x20;

**For the sake of this introductory example, the data is relatively simple:**&#x20;

* Within a season, the quantity sold is the same every day for a given shop.
* Each year has a low-season value and a high-season value (spring-summer or fall-winter).&#x20;
* There’s a bit of growth every year.

This is a graph of the data we have at our disposal:

<figure><img src="/files/LIDGGpoj0dVKsYd1LFfd" alt=""><figcaption></figcaption></figure>

**Let’s use the Verteego platform to predict sales for 2024**. \
In this instance, we can easily extrapolate the values for 2024, assuming all trends remain the same. This extrapolation is shown below as dots. We’ll be able to compare our predictions with it.

<figure><img src="/files/dk9w9PBT9L32wYknVAO5" alt=""><figcaption></figcaption></figure>

## 3. First Pipeline <a href="#initial-configuration" id="initial-configuration"></a>

### <mark style="color:blue;">**3.1. Creating train/test datasets to train a model**</mark> <a href="#creating-train-test-datasets-to-train-a-model" id="creating-train-test-datasets-to-train-a-model"></a>

To start building a model for our scenario, **we will split our existing data into train and test datasets**.&#x20;

A model will be built on **data from 2019 to 2022** inclusive, i.e. the <mark style="color:red;">**“train dataset”**</mark>, and will **be tested on 2023**, the <mark style="color:red;">**“test dataset”**</mark>.&#x20;

{% hint style="danger" %}
Train and test datasets <mark style="color:red;">**should have the same columns**</mark>. The only tolerated difference is that your test dataset might not contain the column to predict. In that case, you will not get scores, because the predictions cannot be compared with what actually happened.
{% endhint %}
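
The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not something Verteego requires you to run: the in-memory rows, their quantities, and the `split_by_year` helper are all hypothetical, and in practice you would simply prepare two CSV files.

```python
from datetime import date

# Hypothetical rows mirroring the tutorial's columns (quantities are made up).
rows = [
    {"pos_id": 50, "item_id": 123456, "sales_date": "2019-01-01", "qty_sold": 15},
    {"pos_id": 50, "item_id": 123456, "sales_date": "2022-12-31", "qty_sold": 30},
    {"pos_id": 50, "item_id": 123456, "sales_date": "2023-01-01", "qty_sold": 32},
]


def split_by_year(rows, first_test_year):
    """Rows dated before `first_test_year` go to train, the rest to test."""
    train, test = [], []
    for row in rows:
        year = date.fromisoformat(row["sales_date"]).year
        (train if year < first_test_year else test).append(row)
    return train, test


train_rows, test_rows = split_by_year(rows, first_test_year=2023)

# Both sets should expose the same columns
# (the test set is allowed to omit the column to predict).
assert set(train_rows[0]) == set(test_rows[0])
print(len(train_rows), len(test_rows))
```

Splitting on the date rather than at random keeps every test row later than every train row, which matches how the model will be used to forecast.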

We will be able to **compare** our **predictions** **to the real sales**, and therefore **evaluate the quality** of our model.&#x20;

Here is a graph of the period of time we’ll use for training and testing. The vertical black line marks the split between our train and test data.

<figure><img src="/files/1kqFigT5ALEaoh3K0887" alt=""><figcaption></figcaption></figure>

{% hint style="info" %}
In real life, we usually cannot compare our predictions to anything, since they concern the future, but here we’ll be able to compare them to our easy extrapolation and evaluate the performance of the model.&#x20;

In real life, once a final model is chosen, we would retrain it on all of our data, 2023 included, and use this new model to predict 2024.&#x20;
{% endhint %}

### <mark style="color:blue;">**3.2. Adding datasets to the platform**</mark> <a href="#adding-datasets-to-the-platform" id="adding-datasets-to-the-platform"></a>

Verteego supports a number of Connectors, which you can use to create datasources and datasets.&#x20;

You can also upload CSV files *(less than 50 MB).*&#x20;

For our example, we will use two CSV files:

{% file src="/files/RT6aCCPDA6tB3BwRnXCw" %}

{% file src="/files/JzKsVOz0MZ90RjY0yIjm" %}

{% hint style="info" %} <mark style="color:blue;">How can you upload the data?</mark>&#x20;

Go to the Data -> Datasets section and add a dataset (select the “*Upload file*” option).&#x20;
{% endhint %}

<figure><img src="/files/PaB7MhmzVbkMWJeWeAse" alt=""><figcaption><p>Import CSV files in Verteego</p></figcaption></figure>

Here is a sample of what they contain:

| pos\_id | item\_id | sales\_date | qty\_sold |
| ------- | -------- | ----------- | --------- |
| 50      | 123456   | 2019-01-01  | 15        |
| 50      | 123456   | 2019-01-02  | 15        |
| 50      | 123456   | 2019-01-03  | 15        |

Verteego will validate your dataset after upload. This may take a few minutes. Once your dataset’s status has changed to <mark style="color:green;">**"Valid"**</mark>, it is correctly formatted and ready to be used.

If you click on your dataset, you can see an overview (number of rows, etc.), as well as quite a bit of detail on each variable in the **Variables** tab.

<figure><img src="/files/ymbnB6WINSa3kDxEJE79" alt=""><figcaption><p>Check Variables tab</p></figcaption></figure>

### <mark style="color:blue;">**3.3. Creating the first configuration**</mark> <a href="#creating-the-first-configuration" id="creating-the-first-configuration"></a>

To create your first model, you need to create a <mark style="color:red;">**Pipeline**</mark> to experiment in.&#x20;

In the Pipelines section, create a new Forecast pipeline, and set its initial configuration to the YAML example below.&#x20;

For our example, this configuration is very simple and the only features our XGBoost model has are the `item_id` and the `pos_id`.&#x20;

That is, the only variables we will feed into our model are the IDs of the items and of the points-of-sales.

```yaml
# Define the column that you want to predict
column_to_predict: qty_sold

# Define the temporal column
date_col: sales_date

# Columns which are present in your train and test sets
# (except the column_to_predict)
# If these columns are the same as those listed in prediction_resolution,
# then input_prediction_columns is not required

input_prediction_columns:
- sales_date
- item_id
- pos_id

# Define the type of each column (np.float32, np.int64 and str are available)
cols_type:
  pos_id: np.int64
  item_id: np.int64
  sales_date: str
  qty_sold: np.int64

# What is your prediction resolution
prediction_resolution:
- sales_date
- item_id
- pos_id

#   -------------------------------------------------------
#   ------------------------ ALGORITHM --------------------
#   -------------------------------------------------------

algo_name: xgboost

#   -------------------------------------------------------
#   ------------------------- FEATURES --------------------
#   -------------------------------------------------------
features:
  categorical_columns:
  - pos_id
  - item_id
```

<figure><img src="/files/yUuvTrQU7VurMByb8K7i" alt=""><figcaption><p>How to set up a pipeline</p></figcaption></figure>

The **configuration tells** Verteego which **features** you want to use and which **algorithms you want** to train and score with.

You can find more details on the various sections in the [Configuration page](/verteego-doc/pipelines/forecasting-pipelines/configuration.md).

### <mark style="color:blue;">**3.4. Launching our first pipeline run**</mark> <a href="#v1-launching-our-first-pipeline-run" id="v1-launching-our-first-pipeline-run"></a>

Once you’ve configured your pipeline, and your Train and Test datasets are valid, you are ready to launch a pipeline run.

* In Pipelines -> Runs, click on “Run”. You can name your pipeline run.
* Specify which dataset to use for training (“Train Dataset”).&#x20;
* Specify which dataset to use for predicting and testing (“Test Dataset”).&#x20;
* Hit “Create”, and the pipeline will launch automatically. It will run several steps, described in more detail in the [Concepts](https://www.notion.so/Concepts-d61276850eb5422e86b9e84183b8fa01?pvs=21).

<figure><img src="/files/dqYWsllspJe4vylj5jcw" alt=""><figcaption><p>How to launch a pipeline run</p></figcaption></figure>

### <mark style="color:blue;">3.5. Analyzing the first model's results</mark>

Once the pipeline has finished running, you can look at the different metrics calculated on training and on predictions (score).

#### **3.5.1. Looking at Training details** <a href="#looking-at-training-details" id="looking-at-training-details"></a>

You can look at a specific Training step, by clicking on Pipelines -> Trainings, then on the training of your choice.&#x20;

**The Training page summarizes:**&#x20;

* **Parameters** of this model
* **Metrics** of the model (on the training set).&#x20;
* **Feature importance**, which you might find useful in figuring out whether a feature adds signal or noise.\
  *In our example, item\_id is the most useful feature.*

<figure><img src="/files/ObrpKZN8OHf8h6HsNw4P" alt=""><figcaption><p>Training</p></figcaption></figure>

#### **3.5.2. Analyzing the scores** <a href="#analyzing-the-scores" id="analyzing-the-scores"></a>

You can head over to **Pipelines** -> **Scores** to see the scores of your model.&#x20;

***A set of standard metrics is calculated:***

* <mark style="color:red;">**MAPE**</mark> -> (Mean Absolute Percentage Error) measures the **accuracy of a forecast system as a percentage**. It’s the average of the ratio between the absolute error and the actual observation.

  **WARNING** ⇒ The lower the actual observation is, the higher the MAPE could be!
* <mark style="color:red;">**MAE**</mark> -> (Mean Absolute Error) measures the **average magnitude of the errors** in a set of predictions, without considering their direction. It’s the average over the test sample of the absolute differences between prediction and actual observation.
* <mark style="color:red;">**RMSE**</mark> -> (Root Mean Squared Error) is a quadratic scoring rule that measures the **average magnitude of the error.** It’s the square root of the average of squared differences between prediction and actual observation.
* <mark style="color:red;">**R2**</mark> -> is a measure of the goodness of fit of a model. In regression, the R2 coefficient of determination is a statistical measure of **how well the regression predictions approximate the real data points**. An R2 of 1 indicates that the regression predictions perfectly fit the data.
* <mark style="color:red;">**MSE**</mark> -> (Mean Squared Error) is the average squared difference between the observed values and the values predicted by the model.

{% hint style="info" %}
You can refer to [Performance analysis and ML model improvement](/verteego-doc/best-practices/performance-analysis-and-ml-model-improvement.md) to see how to interpret different metrics
{% endhint %}
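
The metric definitions above can be written out as a short Python sketch. It follows the textbook formulas and is not Verteego's own implementation; the `regression_metrics` helper and the sample values are made up for illustration, and MAPE is returned here as a fraction rather than a percentage.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAPE, MAE, MSE, RMSE and R2 for two equal-length sequences."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # MAPE divides by the actual value: it is undefined when an actual is 0
    # and grows quickly when actuals are small.
    mape = sum(abs(e) / abs(a) for a, e in zip(actual, errors)) / n

    # R2 compares the model's squared error to that of predicting the mean.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot

    return {"mape": mape, "mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up values, for illustration only.
metrics = regression_metrics(actual=[100, 200, 300], predicted=[110, 190, 330])
print(metrics)
```

MAE and RMSE are expressed in the unit of the predicted column (here, quantities sold), whereas MAPE and R2 are unitless, which makes them easier to compare across items with very different sales volumes.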

They’re not all displayed by default. You can change which columns you see by clicking on the icon on the top right of the table.

This will allow you to compare your pipeline runs to one another. You can choose to show only the Score metrics (calculated on your Test set), or only the Train metrics, or both.

**In our example**

In Pipeline -> Scores, we can see we’ve got two new results:&#x20;

* PIPELINENAME\_<mark style="color:blue;">**score\_on\_prediction**</mark>
* PIPELINENAME\_<mark style="color:blue;">**score\_on\_postprocessing**</mark>

For now, we can ignore the <mark style="color:blue;">**score\_on\_postprocessing**</mark> *(useful when you have postprocessed your raw results, for example by rounding predicted quantities or replacing negative values with 0).*&#x20;

**Looking at the&#x20;**<mark style="color:blue;">**score\_on\_prediction**</mark>**&#x20;results:**

* We see a **Mean Absolute Error** (MAE) **of 94**. \
  This is the average error over our predictions, which is not very good considering the average qty\_sold over the test data is 259 (an error of roughly 36% of the average).&#x20;
* Our **coefficient of determination**, R2, sits at **0.58**, which leaves room for improvement, seeing as the theoretical best R2 is 1.&#x20;

For a more in-depth coverage of metrics and model evaluation in general, please see [Performance analysis and ML model improvement](/verteego-doc/best-practices/performance-analysis-and-ml-model-improvement.md).

#### **3.5.3. Getting our predictions** <a href="#getting-our-predictions" id="getting-our-predictions"></a>

You can click on Pipeline -> Predictions, then on the prediction of your pipeline. In the top-right, you can then choose to download the prediction file (as a CSV), or export it to an existing DataSource.

Here is a plot of our first predictions, shown as dashed lines. We can see that the model essentially predicted an average value over our training dataset for each item and pos.

<figure><img src="/files/CJgxKrpsV7giHtvTWqpL" alt=""><figcaption></figcaption></figure>

## **4. V2 of Pipeline - Adding calculated features** <a href="#v2-adding-calculated-features" id="v2-adding-calculated-features"></a>

One of the **powerful features** of Verteego is its [**Calculators**](/verteego-doc/pipelines/forecasting-pipelines/calculators.md).&#x20;

Calculators can be used to **generate additional features** with only a few lines of configuration.&#x20;

For example:

* Calculate **average quantities** at different levels *(*[*aggregate\_val\_group\_by\_key*](/verteego-doc/pipelines/forecasting-pipelines/calculators/mathematic/aggregate_val_group_by_key.md)*,* [*hierarchical\_aggregate*](/verteego-doc/pipelines/forecasting-pipelines/calculators/mathematic/hierarchical_aggregate.md)*)*&#x20;
* Automatically extract [**seasonal patterns**](/verteego-doc/pipelines/forecasting-pipelines/calculators/temporal/seasonality.md)
* [**Encode** your categorical features](/verteego-doc/pipelines/forecasting-pipelines/calculators/machine-learning/glmm_encoder.md)
* Perform [**PCA**](/verteego-doc/pipelines/forecasting-pipelines/calculators/machine-learning/pca.md)
* Generate [**clusters**](/verteego-doc/pipelines/forecasting-pipelines/calculators/machine-learning/clustering.md)
* Use [**TSFresh**](/verteego-doc/pipelines/forecasting-pipelines/calculators/temporal/tsfresh.md) to generate hundreds of time-related features
* Get [**weather information**](/verteego-doc/pipelines/forecasting-pipelines/calculators/external-source/weather.md) using GPS coordinates&#x20;
* and a lot more

### <mark style="color:blue;">**4.1. Adding date attributes**</mark>

Let’s use a simple calculator on our example.

So far, we’re not using our `sales_date` feature, and we are not capturing seasonality at all. We can add **date attributes** that could prove relevant, such as the **month**. Let’s add the [**date\_attributes**](/verteego-doc/pipelines/forecasting-pipelines/calculators/temporal/date_attributes.md) calculator to our **`calculated_cols`** block, like so:

```yaml
calculated_cols:
  date_attributes:
    method: date_attributes
    input_columns:
    - sales_date
    output_columns:
    - month
```

Let’s not forget to also **make `month` available** to our model, as a **categorical feature**, by updating the features block of our configuration:

```yaml
features:
  categorical_columns:
  - pos_id
  - item_id
  - month
```

<figure><img src="/files/LDgwaEHxy2Miw3rtIDcB" alt=""><figcaption><p>Create a calculator in Configuration</p></figcaption></figure>

### <mark style="color:blue;">**4.2. Analyzing metrics**</mark>

If we launch a new pipeline run with this updated configuration, we get improved scores, not only on Training but also on Prediction.&#x20;

**Metrics on Training:**&#x20;

* **MAE** from 69 to 26
* **R2** from 0.69 to 0.96

**Metrics on Prediction:**&#x20;

* **MAE** from 93 to 66
* **R2** from 0.56 to 0.86.&#x20;

The results improve markedly. Let’s visualize them. We can see that the model **is capturing the seasonal pattern**, but it **is failing to capture the growth**, and essentially returns an average per month over our train set.

<figure><img src="/files/r1SDZBERpK7eS1oNB6iu" alt=""><figcaption></figcaption></figure>

## 5. V3 of Pipeline - Capturing growth <a href="#v3-capturing-growth" id="v3-capturing-growth"></a>

Our model is **capturing seasonality**, thanks to our **`month`** feature, but it is failing to capture growth. This is due to our use of **XGBoost**, which does **not inherently extrapolate**, though various techniques can be used to capture that growth.
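
To see why a tree-based model does not extrapolate, here is a toy Python sketch. It does not use XGBoost: `fit_stump` is a hypothetical one-split tree that predicts the average of each side of a threshold, and `fit_line` is an ordinary least-squares line. The data points are made up.

```python
from statistics import mean


def fit_stump(xs, ys, threshold):
    """One-split 'tree': predict the mean target on each side of the threshold."""
    left = mean(y for x, y in zip(xs, ys) if x < threshold)
    right = mean(y for x, y in zip(xs, ys) if x >= threshold)
    return lambda x: left if x < threshold else right


def fit_line(xs, ys):
    """Ordinary least-squares line through the points."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


# Made-up, steadily growing series.
xs = [0, 1, 2, 3]
ys = [10, 20, 30, 40]

stump = fit_stump(xs, ys, threshold=2)
line = fit_line(xs, ys)

# Far beyond the training range, the stump repeats its last leaf value,
# while the line keeps following the trend.
print(stump(10), line(10))
```

However many splits a tree has, any input beyond the training range falls into an outermost leaf, so the prediction stays at a level already seen during training.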

### <mark style="color:blue;">5.1. Switching to LightGBM model (growth capturing)</mark> <a href="#switching-to-lightgbm-model" id="switching-to-lightgbm-model"></a>

Let’s switch our model to **LightGBM**, which, with the `linear_tree` option enabled, can **capture linear relationships between numerical variables**. Let’s update our model configuration as follows:

```yaml
algo_name: lightgbm
algorithm_parameters:
  objective: regression
  linear_tree: true
  min_data_in_leaf: 2
```

Let’s add an **`ordinal`** output to our **`date_attributes`** calculator, which will output a number for each date that can then **be used in a linear equation**. This is our update to the date\_attributes calculator:

```yaml
  date_attributes:
    method: date_attributes
    input_columns:
    - sales_date
    output_columns:
    - month
    - ordinal
```
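
As a rough illustration of what these two outputs represent, here is a Python sketch. It is an assumption, not Verteego's implementation: the `date_features` helper is hypothetical, and `ordinal` is modelled with Python's `date.toordinal()`, which may differ from the exact number the calculator produces. What matters is that the value increases by one for each day.

```python
from datetime import date


def date_features(sales_date):
    """Derive a month and an ordinal day number from an ISO date string."""
    d = date.fromisoformat(sales_date)
    # Assumption: `ordinal` is a steadily increasing day count,
    # modelled here with Python's proleptic Gregorian ordinal.
    return {"month": d.month, "ordinal": d.toordinal()}


first = date_features("2019-01-01")
second = date_features("2019-01-02")
print(first, second)
```

Unlike `month`, which cycles every year, such a day count never repeats, which is what lets a linear term follow a long-term trend.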

And finally, let’s add that ordinal to the numerical features for our model:

```yaml
features:
  categorical_columns:
  - pos_id
  - item_id
  - month
  numerical_columns:
  - ordinal
```

<figure><img src="/files/hl0y4XA5opT8n9wrQOcH" alt=""><figcaption></figcaption></figure>

### <mark style="color:blue;">**5.2. Analyzing metrics**</mark>

#### In our example <a href="#in-our-example-2" id="in-our-example-2"></a>

Running with this new configuration, we **get improved results**: an MAE of 21 and an R2 of 0.98!&#x20;

We’ve managed to **improve these predictions** and **capture some of the growth**, but the fit is not quite right. This is because the quantity sold is not linear in the sales date: it is constant within each 6-month season, and it increases linearly from one season to the next season of the same type (high or low).

*Here is a visualization of our predictions:*

<figure><img src="/files/WYRGwkmc0LOLEdf2Mq61" alt=""><figcaption><p>Model with LightGBM</p></figcaption></figure>

## **6. V4 of Pipeline - Modeling high and low seasons** <a href="#v4-modeling-high-and-low-seasons" id="v4-modeling-high-and-low-seasons"></a>

In order to **capture the true linear** relationship in our data between **high/low seasons** and the sales, we need to **identify the seasons**.&#x20;

Let’s use:&#x20;

* A **calculator to differentiate between high season and low season**
* An **external dataset** to assign a number to each season. That way, LightGBM should capture the linear relationship between season\_nb and qty\_sold, whether in high season or low season.

### <mark style="color:blue;">6.1. Tagging high season and low season</mark>

#### **`is_summer` feature**

Let’s update our configuration to add a true/false feature for whether we’re in summer season. We can add the following calculator to our `calculated_cols` section:

```yaml
calculated_cols:
  [...]
  is_summer_feat:
    method: mathematical_expression
    input_columns:
    - month
    output_columns:
    - is_summer
    params:
      expression: (month <= 9) * (month >= 4)
```

This will make available a new feature called `is_summer` that is true during the months from April to September inclusive.
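
The expression relies on multiplying two comparisons, which acts as a logical AND. Here is a quick Python check of that arithmetic. It assumes the expression is evaluated with the same semantics as Python's; the `is_summer` function is only a stand-in for the calculator.

```python
def is_summer(month):
    """Evaluate (month <= 9) * (month >= 4) with Python semantics."""
    # Each comparison yields a boolean; their product is 1 only
    # when both conditions hold, and 0 otherwise.
    return (month <= 9) * (month >= 4)


summer_months = [m for m in range(1, 13) if is_summer(m)]
print(summer_months)  # [4, 5, 6, 7, 8, 9]
```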

### <mark style="color:blue;">6.2. Importing external data</mark>

#### **`season_nb` feature**

For a numerical identifier for our seasons, let’s create a new "season" dataset with this file:

{% file src="/files/9IcdJ2RVgsSPQVOQBSQG" %}

Its content looks like this:

<table><thead><tr><th width="200">start_date</th><th width="213">end_date</th><th>season_nb</th></tr></thead><tbody><tr><td>2018-10-01</td><td>2019-04-01</td><td>1</td></tr><tr><td>2019-04-01</td><td>2019-10-01</td><td>2</td></tr><tr><td>2019-10-01</td><td>2020-04-01</td><td>3</td></tr><tr><td>2020-04-01</td><td>2020-10-01</td><td>4</td></tr><tr><td>…</td><td>…</td><td>…</td></tr></tbody></table>

Once the dataset is valid, let’s add it as a feature in our configuration, via a [get\_from\_dataset](/verteego-doc/pipelines/forecasting-pipelines/calculators/external-source/get_from_dataset.md) calculator.

```yaml
calculated_cols:
  [...]
  season_number:
    method: get_from_dataset
    input_columns:
    - sales_date
    output_columns:
    - season_nb
    params:
      file_name: Seasons
      join_options:
        sales_date:
          greater_than_or_equal: start_date
          lesser_than: end_date
```
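
The join options above describe a range lookup: each `sales_date` takes the `season_nb` of the row whose `start_date` is on or before it and whose `end_date` is strictly after it. Here is a minimal Python sketch of that logic, using the first rows of the season dataset shown above. The `season_nb_for` helper is hypothetical, and returning `None` when no row matches is an assumption rather than documented behavior.

```python
from datetime import date

# First rows of the season dataset: (start_date, end_date, season_nb).
SEASONS = [
    (date(2018, 10, 1), date(2019, 4, 1), 1),
    (date(2019, 4, 1), date(2019, 10, 1), 2),
    (date(2019, 10, 1), date(2020, 4, 1), 3),
    (date(2020, 4, 1), date(2020, 10, 1), 4),
]


def season_nb_for(sales_date):
    """Return the season whose [start_date, end_date) range contains the date."""
    d = date.fromisoformat(sales_date)
    for start_date, end_date, season_nb in SEASONS:
        # greater_than_or_equal: start_date, lesser_than: end_date
        if start_date <= d < end_date:
            return season_nb
    return None  # assumption: no matching row yields no value


print(season_nb_for("2019-03-31"), season_nb_for("2019-04-01"))  # 1 2
```

Because the upper bound is exclusive, a boundary date such as 2019-04-01 matches exactly one season, even though it appears as the `end_date` of one row and the `start_date` of the next.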

**Making the features available to the model**

Let’s use our new `is_summer` and `season_nb` features in our model, via the features section:

```yaml
features:
  categorical_columns:
  - pos_id
  - item_id
  - month
  - is_summer
  numerical_columns:
  - ordinal
  - season_nb
```

## 7. Final predictions <a href="#final-predictions" id="final-predictions"></a>

### <mark style="color:blue;">7.1. Final prediction 2023</mark>

Running another pipeline run with our updated configuration, we get an MAE of 17 and an R2 of 0.98. We have managed to capture the seasonality as well as the growth of our training dataset. See the final visualization here:&#x20;

<figure><img src="/files/cU3AR29WOtG7jGaL3W3w" alt=""><figcaption></figcaption></figure>

We are now quite satisfied with our model. As a reminder, this is our pipeline configuration now:

#### Final configuration <a href="#final-configuration" id="final-configuration"></a>

```yaml
# Define the column that you want to predict
column_to_predict: qty_sold

# Define the temporal column
date_col: sales_date

# Columns which are present in your train and test sets
# (except the column_to_predict)
# If these columns are the same as those listed in prediction_resolution,
# then input_prediction_columns is not required

input_prediction_columns:
- sales_date
- item_id
- pos_id

# Define the type of each column (np.float32, np.int64 and str are available)
cols_type:
  pos_id: np.int64
  item_id: np.int64
  sales_date: str
  qty_sold: np.int64

# What is your prediction resolution
prediction_resolution:
- sales_date
- item_id
- pos_id

#   -------------------------------------------------------
#   ------------------------ ALGORITHM --------------------
#   -------------------------------------------------------

algo_name: lightgbm
algorithm_parameters:
  objective: regression
  linear_tree: true
  min_data_in_leaf: 2

#   -------------------------------------------------------
#   ------------------ CALCULATED COLUMNS -----------------
#   -------------------------------------------------------

calculated_cols:
  date_attributes:
    method: date_attributes
    input_columns:
    - sales_date
    output_columns:
    - month
    - ordinal

  is_summer_feat:
    method: mathematical_expression
    input_columns:
    - month
    output_columns:
    - is_summer
    params:
      expression: (month <= 9) * (month >= 4)

  season_number:
    method: get_from_dataset
    input_columns:
    - sales_date
    output_columns:
    - season_nb
    params:
      file_name: Seasons
      join_options:
        sales_date:
          greater_than_or_equal: start_date
          lesser_than: end_date
#   -------------------------------------------------------
#   ------------------------- FEATURES --------------------
#   -------------------------------------------------------
features:
  categorical_columns:
  - pos_id
  - item_id
  - month
  - is_summer
  numerical_columns:
  - ordinal
  - season_nb
```

### <mark style="color:blue;">7.2. 2024 predictions</mark> <a href="#id-2024-predictions" id="id-2024-predictions"></a>

Let’s retrain it on the full data at our disposal, with sales ranging from 2019 to 2023, and predict our true unknown data, for 2024. Our new train and predict datasets for this exercise are:

{% file src="/files/RWwA6FwY0KIYehCBsv29" %}

{% file src="/files/M0NIauhD8MKFRddb3Tqh" %}

After validating these datasets, we'll initiate a new pipeline run.&#x20;

We'll **compare our predictions** with what a **basic extrapolation** would anticipate, observing that our predictions align reasonably well with expected outcomes.&#x20;

However, it's important to note that **our current dataset exhibits straightforward linear growth and simple step behavior**.&#x20;

Real-world data typically presents more complexity, especially when forecasting for numerous products across numerous points of sale. In such cases, visual inspection of predictions becomes challenging, and we must rely more on model metrics for evaluation.

## 8. Summary <a href="#summary" id="summary"></a>

**Throughout this tutorial, we've acquired the skills:**&#x20;

* to incorporate datasets
* to set up a fundamental pipeline
* to experiment with diverse models
* to integrate additional features using calculators
* to navigate the iterative process
* to refine our approach

**Leveraging the comprehensive metrics provided by Verteego**, we tracked our progress and identified instances where performance enhancements were achieved, leading us to a model that met our satisfaction.&#x20;

Ultimately, we successfully generated predictions for 2024 that align with our existing knowledge and expectations.

## **9. What’s next** <a href="#whats-next" id="whats-next"></a>

Right now, we are using a LightGBM model with few settings. We might want to adjust its parameters, for instance:

```yaml
algorithm_parameters:
  objective: regression
  linear_tree: true
  max_depth: 3
  n_estimators: 125
  min_data_in_leaf: 2
  max_cat_threshold: 32
```

We could also configure different [Objectives](https://doc.verteego.com/engine/engine/objectives.html), or use [Hyper-Parameter Tuning](https://www.notion.so/8da3323eb7bc40f9944954a0830c2fb5?pvs=21) to find the best parameters for us.

We can also try different models, or a Meta-model: for more info and a full list of available models, see the [Model page](https://doc.verteego.com/engine/engine/model.html).

Happy modeling!


