URL connector specification
If your datasource is not one that is natively supported by the Verteego Platform, you can provide an API through which Verteego can import data.
Overview
In the Verteego platform, you would configure such a dataset with a 'URL' datasource, where you can specify the URL of your API.
In your tools, you would provide an API that can respond to two calls:
a creation request, which triggers the creation of a dataset on your end. It can return the dataset content or, if dataset creation takes a while, a pending status.
a fetch request, which also returns either a pending status or the dataset content.
The content can be sent to us in one of three formats: a CSV file, a Parquet file, or a zlib-compressed CSV file. Uncompressed CSV is not recommended for large volumes, as timeouts could occur.
Creation request
Your API will first receive a creation request, with the following payload:
These parameters allow the dataset creation to be tied to the particular Verteego context of your run, but you might not need them.
runtime_parameters is a dictionary of key-value pairs. The runtime parameters and their values can be configured within an app_run; for instance, they could be your start_date, an internal_run_id, or any other parameter specific to an app_run that you wish to communicate. It can be left empty.
We expect a response to this call with status 200 or 201; it can be a pending response.
Fetch request
If the dataset build takes a bit of time on your end, you can return a pending response to the initial creation request. We will then send you fetch requests every 60 seconds. The payload of the fetch request is similar to the creation one:
You are no longer provided with the runtime parameters. You do, however, know which dataset we are asking about.
The response of the fetch request follows the same format as the response of the creation request, detailed below.
Response options
Your response to either the creation or the fetch request can be of different content-types:
application/json
text/csv
text/html
The response code must be 200 or 201; otherwise, the dataset import will be marked as failed.
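As a quick self-check for your own API tests, the rule above can be sketched as a small predicate. This is only an illustration, not Verteego's actual validation code: the function name is hypothetical, and tolerating parameters such as a charset on the Content-Type header is an assumption.

```python
# Sketch of the acceptance rule described above (hypothetical helper,
# not Verteego's implementation).
ACCEPTED_STATUS_CODES = {200, 201}
ACCEPTED_CONTENT_TYPES = {"application/json", "text/csv", "text/html"}


def is_acceptable_response(status_code: int, content_type: str) -> bool:
    """Return True if the status code and content type match the lists above."""
    # Assumption: parameters such as "; charset=utf-8" are ignored and the
    # media type is compared case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return (
        status_code in ACCEPTED_STATUS_CODES
        and media_type in ACCEPTED_CONTENT_TYPES
    )
```

A check like this could run in your API's test suite against both the creation and the fetch endpoints.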
JSON response
If your response is JSON, we expect one of the following formats.
If the dataset creation is still ongoing:
If the dataset creation failed, you can communicate this back to Verteego as follows:
error_detail is optional but could be helpful to report back to your users.
If the dataset creation succeeded, you can return:
If your format is Parquet, you can encode the file's bytes directly in base64. If your format is CSV, you can simply encode the CSV content. The encoded string is returned in file_content. This approach is not recommended for large files.
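A minimal sketch of producing the string for file_content is shown here. It assumes that CSV content is base64-encoded in the same way as Parquet bytes and that the CSV text is UTF-8; the helper name is hypothetical, and the other fields of the JSON response are not shown.

```python
import base64


def encode_file_content(raw: bytes) -> str:
    """Base64-encode file bytes into an ASCII string that fits in a JSON field."""
    return base64.b64encode(raw).decode("ascii")


# Example with a tiny CSV payload (assumption: UTF-8 text).
csv_text = "id,value\n1,10\n2,20\n"
file_content = encode_file_content(csv_text.encode("utf-8"))

# Round trip, to confirm that the encoding is lossless.
decoded = base64.b64decode(file_content).decode("utf-8")
```

For a Parquet file, the same helper would be called with the file's raw bytes, for example those read from disk in binary mode.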
CSV response
The content of the response can be the CSV string. This is not recommended for large files.
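One way to build such a CSV string is with Python's standard csv module, which handles quoting for you. The helper name, the sample columns, and the choice of "\n" as line terminator are illustrative assumptions rather than requirements stated here.

```python
import csv
import io


def rows_to_csv(header, rows) -> str:
    """Serialise a header and rows to a CSV string, quoting fields when needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Body and header for a text/csv response (sample data is made up).
body = rows_to_csv(["id", "label"], [[1, "plain"], [2, "needs, quoting"]])
headers = {"Content-Type": "text/csv"}
```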
HTML response
With an HTML response, you can choose to return either a zlib-compressed CSV file or a Parquet file. The content of the response is the raw bytes; if the content-encoding is zlib, we will decompress it and parse the CSV.
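The zlib-compressed CSV variant could be assembled roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the CSV text is taken to be UTF-8, and the receiver-side function only illustrates the decompress-then-parse step described above; it is not Verteego's code.

```python
import zlib


def build_compressed_csv_response(csv_text: str):
    """Return (status, headers, body) for a zlib-compressed CSV payload."""
    body = zlib.compress(csv_text.encode("utf-8"))
    headers = {"Content-Type": "text/html", "Content-Encoding": "zlib"}
    return 200, headers, body


def read_csv_body(headers: dict, body: bytes) -> str:
    """Illustrative receiver side: decompress when content-encoding is zlib."""
    if headers.get("Content-Encoding", "").lower() == "zlib":
        body = zlib.decompress(body)
    return body.decode("utf-8")


status, headers, body = build_compressed_csv_response("id,value\n1,10\n")
restored = read_csv_body(headers, body)
```

Running the round trip locally before deployment is a cheap way to confirm that the bytes you send decompress back to the CSV you intended.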