Creating changesets in a dataset
Important
Amazon FinSpace Dataset Browser will be discontinued on November 29, 2024. Starting November 29, 2023, FinSpace will no longer accept the creation of new Dataset Browser environments. Customers using Amazon FinSpace with Managed Kdb Insights
Data files are added to datasets and tracked as a changeset. A changeset is created in a dataset when one or more data files are ingested in a single operation. All changesets in a dataset are preserved unless a dataset itself is deleted. A changeset is created with a unique identifier and a system timestamp is assigned to it at the time of creation.
A changeset is created as one of the following types:
- Append – The new changeset is treated as an addition to the end of the previously ingested changesets. For example, the addition of a new daily file.
- Replace – The new changeset is treated as a replacement for all previously ingested changesets in the dataset. The prior changesets are not deleted, but they are not considered when a view is created.
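The difference between the two types can be illustrated with a minimal sketch. It is not FinSpace code: the `Changeset` class, the `changesets_for_view` function, and the `cs-N` identifiers are hypothetical names used only to model the rule described above, under the assumption that a view is built from the most recent Replace changeset plus any Append changesets created after it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Changeset:
    changeset_id: str     # hypothetical identifier format
    change_type: str      # "APPEND" or "REPLACE"
    created_at: datetime  # system timestamp assigned at creation


def changesets_for_view(history: List[Changeset]) -> List[Changeset]:
    """Return the changesets that would contribute to a view.

    Changesets older than the most recent REPLACE are ignored, not deleted.
    The REPLACE itself and every later APPEND are kept.
    """
    ordered = sorted(history, key=lambda c: c.created_at)
    start = 0
    for index, changeset in enumerate(ordered):
        if changeset.change_type == "REPLACE":
            start = index
    return ordered[start:]


def _day(day: int) -> datetime:
    # Helper for building example timestamps.
    return datetime(2023, 1, day, tzinfo=timezone.utc)


history = [
    Changeset("cs-1", "APPEND", _day(1)),
    Changeset("cs-2", "APPEND", _day(2)),
    Changeset("cs-3", "REPLACE", _day(3)),
    Changeset("cs-4", "APPEND", _day(4)),
]

view_ids = [c.changeset_id for c in changesets_for_view(history)]
print(view_ids)       # ['cs-3', 'cs-4']
print(len(history))   # 4, the earlier changesets are still preserved
```

In this model, `cs-1` and `cs-2` remain in the dataset history but do not feed the view, because `cs-3` replaced them.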
Replace data
To create a changeset of type Replace
- From the homepage, search for the dataset in which you want to replace data.
- Choose the dataset name to view the dataset details page.
- Choose the All Data Views tab.
- Scroll down and choose Replace Data.
- Choose Select CSV File to select and upload a file from your desktop.
- After the file is uploaded, choose the input format for the ingested data from the following options:
  - Delimiter – Specifies the delimiter character. The default value is Comma.
  - Escape Character – Specifies the character to use for escaping. The default value is None.
  - Quotes – Specifies the character to use for quoting. The default value is Double Quotes (").
  - Multiline Records – Specifies whether a single record can span multiple lines. This option is disabled by default. Enable it if any record in the file spans multiple lines.
  - Treat First Line As Header – Specifies whether to treat the first line as a header. This option is disabled by default.
  - Skip First Data Line – Specifies whether to skip the first data line. This option is disabled by default.
- Choose Replace Data.
- After the file upload is complete, a new entry for a changeset of type Replace appears in the Dataset Update History table with a Pending status. After the status changes to Available, you can create a data view that includes the new changeset.
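To check which input format options a file needs before uploading it, it can help to parse a sample locally. The following is a rough sketch using Python's standard `csv` module as a stand-in. It is not the FinSpace parser, and the `parse_csv` function and its parameter names are hypothetical. It also assumes that Skip First Data Line drops the first row after the header, if a header is present.

```python
import csv
import io
from typing import Optional


def parse_csv(
    text: str,
    delimiter: str = ",",                 # Delimiter (default: comma)
    escape_char: Optional[str] = None,    # Escape Character (default: none)
    quote_char: str = '"',                # Quotes (default: double quotes)
    first_line_is_header: bool = False,   # Treat First Line As Header
    skip_first_data_line: bool = False,   # Skip First Data Line
) -> dict:
    # newline="" leaves line endings untouched so that the csv module can
    # read a quoted field that spans several lines as one record.
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        escapechar=escape_char,
        quotechar=quote_char,
    )
    rows = list(reader)
    header = rows.pop(0) if first_line_is_header and rows else None
    if skip_first_data_line and rows:
        rows.pop(0)
    return {"header": header, "rows": rows}


# A pipe-delimited sample with a header and one multiline record.
sample = 'symbol|note\nAMZN|"line one\nline two"\nMSFT|plain\n'

result = parse_csv(sample, delimiter="|", first_line_is_header=True)
print(result["header"])  # ['symbol', 'note']
print(result["rows"])    # [['AMZN', 'line one\nline two'], ['MSFT', 'plain']]
```

Note that the `csv` module always accepts quoted multiline fields, whereas the console treats Multiline Records as an explicit option, so a file that parses here may still need that option enabled.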
Append data
To create a changeset of type Append
- From the homepage, search for the dataset to which you want to append data.
- Choose the dataset name to view the dataset details page.
- Choose the All Data Views tab.
- Scroll down and choose Append Data.
- Choose Select CSV File to select and upload a file from your desktop.
- After the file is uploaded, choose the input format for the ingested data from the following options:
  - Delimiter – Specifies the delimiter character. The default value is Comma.
  - Escape Character – Specifies the character to use for escaping. The default value is None.
  - Quotes – Specifies the character to use for quoting. The default value is Double Quotes (").
  - Multiline Records – Specifies whether a single record can span multiple lines. This option is disabled by default. Enable it if any record in the file spans multiple lines.
  - Treat First Line As Header – Specifies whether to treat the first line as a header. This option is disabled by default.
  - Skip First Data Line – Specifies whether to skip the first data line. This option is disabled by default.
- Choose Append Data.
- After the file upload is complete, a new entry for a changeset of type Append appears in the Dataset Update History table with a Pending status. After the status changes to Available, you can create a data view that includes the new changeset.
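The console steps above also have a programmatic counterpart. The sketch below shows how an Append or Replace request might be assembled for the `create_changeset` operation of the boto3 `finspace-data` client. Treat it as an assumption-laden illustration: the parameter names (`datasetId`, `changeType`, `sourceParams`, `formatParams`, `clientToken`) should be verified against the current API reference, the source file is assumed to already be in an S3 location rather than uploaded from a desktop, and the helper names, dataset ID, and bucket path are hypothetical.

```python
import uuid
from typing import Dict


def build_changeset_request(
    dataset_id: str,
    s3_path: str,
    change_type: str = "APPEND",
    with_header: bool = False,
    separator: str = ",",
) -> Dict:
    """Assemble keyword arguments for a create_changeset call (assumed shape)."""
    if change_type not in ("APPEND", "REPLACE"):
        raise ValueError(f"unsupported change type: {change_type}")
    return {
        "datasetId": dataset_id,
        "changeType": change_type,
        "sourceParams": {"s3SourcePath": s3_path},
        "formatParams": {
            "formatType": "CSV",
            "withHeader": "true" if with_header else "false",
            "separator": separator,
        },
        # Idempotency token so that a retried request is not applied twice.
        "clientToken": str(uuid.uuid4()),
    }


def submit_changeset(request: Dict) -> str:
    """Send the request. Requires boto3 and AWS credentials; not run here."""
    import boto3  # imported lazily so the builder works without the AWS SDK

    client = boto3.client("finspace-data")
    response = client.create_changeset(**request)
    return response["changesetId"]


# Hypothetical dataset ID and S3 path, for illustration only.
request = build_changeset_request(
    "example-dataset-id", "s3://example-bucket/daily.csv", with_header=True
)
print(request["changeType"])    # APPEND
print(request["formatParams"])
```

Keeping the request builder separate from the network call makes it possible to inspect or log the exact parameters before anything is submitted.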