Datasets



Understanding Datasets in Verteego

A Dataset in Verteego is linked to exactly one Datasource. When a dataset is first created, it is assigned the status "created". Once Verteego has processed the data and made it available within the platform, the status changes to either "valid" or "invalid". Note that only datasets with a "valid" status can be used by other resources in Verteego.
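The status lifecycle above can be sketched as a tiny state machine. This is illustrative only: the status names "created", "valid", and "invalid" come from this page, but the class and method names are hypothetical and not part of the Verteego API.

```python
# Illustrative sketch of the dataset status lifecycle described above.
# Only the status names come from the documentation; Dataset, process,
# and usable are hypothetical names, not Verteego API.

class Dataset:
    def __init__(self, datasource: str):
        self.datasource = datasource  # a Dataset is linked to exactly one Datasource
        self.status = "created"       # initial status on creation

    def process(self, validation_ok: bool) -> None:
        # Verteego processes the data and flips the status once,
        # to "valid" on success or "invalid" on failure.
        if self.status != "created":
            raise RuntimeError("dataset already processed")
        self.status = "valid" if validation_ok else "invalid"

    @property
    def usable(self) -> bool:
        # Only "valid" datasets can be used by other Verteego resources.
        return self.status == "valid"
```

For example, a dataset that fails validation ends up "invalid" and stays unusable until it is recreated.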

Example 1: Creating a Dataset from Google BigQuery (BQ)

To create a dataset from Google BigQuery, follow these steps:

  • Navigate to the 'Datasets' page within the platform.

  • Click on 'New' to begin importing a new dataset.

  • Select the appropriate Datasource, which in this case would be your configured Google BigQuery connection.

  • Configure the necessary parameters, such as the Dataset and Table you wish to import.

  • Click on 'Create'. Your newly created dataset will then appear in the list of datasets within your project, assuming it validates correctly.

Example 2: Creating a Dataset from a CSV File

  • Go to the 'Data > Datasets' section of your project dashboard.

  • Select 'New' to initiate a new dataset creation.

  • Choose the 'Upload file' option as your Datasource and click on "Choose a file" to select the CSV file you wish to upload.

  • After selecting your file, click on 'Create'. Your dataset will then be processed and, upon successful validation, added to your project's dataset list.


The CSV separator must be a comma.
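If your export uses a different delimiter (semicolons are common in European locales), you can rewrite it as comma-separated text with Python's standard `csv` module before uploading. A minimal sketch; the function name and the default source delimiter are assumptions, not Verteego requirements:

```python
import csv
import io

def to_comma_separated(text: str, src_delimiter: str = ";") -> str:
    """Rewrite delimited text (e.g. a semicolon-separated export)
    as comma-separated CSV, the separator Verteego expects."""
    rows = csv.reader(io.StringIO(text), delimiter=src_delimiter)
    out = io.StringIO()
    # csv.writer uses a comma as the delimiter by default and quotes
    # any field that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

For example, `to_comma_separated("a;b;c\n1;2;3\n")` returns `"a,b,c\n1,2,3\n"`, which can then be uploaded as a valid CSV datasource.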