

For similar capabilities to Amazon Timestream for LiveAnalytics, consider Amazon Timestream for InfluxDB. It offers simplified data ingestion and single-digit millisecond query response times for real-time analytics. Learn more [here](https://docs.aws.amazon.com/timestream/latest/developerguide/timestream-for-influxdb.html).

# Using batch load in Timestream for LiveAnalytics
<a name="batch-load"></a>

With *batch load* for Amazon Timestream for LiveAnalytics, you can ingest CSV files stored in Amazon S3 into Timestream in batches. With this new functionality, you can have your data in Timestream for LiveAnalytics without having to rely on other tools or write custom code. You can use batch load for backfilling data with flexible wait times, such as data that isn't immediately required for querying or analysis. 

You can create batch load tasks by using the AWS Management Console, the AWS CLI, and the AWS SDKs. For more information, see [Using batch load with the console](batch-load-using-console.md), [Using batch load with the AWS CLI](batch-load-using-cli.md), and [Using batch load with the AWS SDKs](batch-load-using-sdk.md).

In addition to batch load, you can write multiple records at the same time with the WriteRecords API operation. For guidance about which to use, see [Choosing between the WriteRecords API operation and batch load](writes.writes-or-batch-load.md).

**Topics**
+ [Batch load concepts in Timestream](batch-load-concepts.md)
+ [Batch load prerequisites](batch-load-prerequisites.md)
+ [Batch load best practices](batch-load-best-practices.md)
+ [Preparing a batch load data file](batch-load-preparing-data-file.md)
+ [Data model mappings for batch load](batch-load-data-model-mappings.md)
+ [Using batch load with the console](batch-load-using-console.md)
+ [Using batch load with the AWS CLI](batch-load-using-cli.md)
+ [Using batch load with the AWS SDKs](batch-load-using-sdk.md)
+ [Using batch load error reports](batch-load-using-error-reports.md)

# Batch load concepts in Timestream
<a name="batch-load-concepts"></a>

Review the following concepts to better understand batch load functionality. 

**Batch load task** – The task that defines your source data and destination in Amazon Timestream. You specify additional configuration such as the data model when you create the batch load task. You can create batch load tasks through the AWS Management Console, the AWS CLI, and the AWS SDKs. 

**Import destination** – The destination database and table in Timestream. For information about creating databases and tables, see [Create a database](console_timestream.md#console_timestream.db.using-console) and [Create a table](console_timestream.md#console_timestream.table.using-console).

**Data source** – The source CSV file that is stored in an S3 bucket. For information about preparing the data file, see [Preparing a batch load data file](batch-load-preparing-data-file.md). For information about S3 pricing, see [Amazon S3 pricing](https://aws.amazon.com/s3/pricing/).

**Batch load error report** – A report that stores information about the errors of a batch load task. You define the S3 location for batch load error reports as part of a batch load task. For information about the contents of these reports, see [Using batch load error reports](batch-load-using-error-reports.md).

**Data model mapping** – A batch load mapping of time, dimensions, and measures from a data source in an S3 location to a target Timestream for LiveAnalytics table. For more information, see [Data model mappings for batch load](batch-load-data-model-mappings.md).

# Batch load prerequisites
<a name="batch-load-prerequisites"></a>

This is a list of prerequisites for using batch load. For best practices, see [Batch load best practices](batch-load-best-practices.md).
+ Batch load source data is stored in Amazon S3 in CSV format with headers.
+ For each Amazon S3 source bucket, you must have the following permissions in an attached policy:

  ```
  "s3:GetObject",
  "s3:GetBucketAcl"
  "s3:ListBucket"
  ```

  Similarly, for each Amazon S3 output bucket where reports are written, you must have the following permissions in an attached policy:

  ```
  "s3:PutObject",
  "s3:GetBucketAcl"
  ```

  For example:

------
#### [ JSON ]


  ```
  {
      "Version":"2012-10-17",		 	 	 
      "Statement": [
          {
              "Action": [
                  "s3:GetObject",
                  "s3:GetBucketAcl",
                  "s3:ListBucket"
              ],
              "Resource": [
                  "arn:aws:s3:::amzn-s3-demo-source-bucket1\u201d",
                  "arn:aws:s3:::amzn-s3-demo-source-bucket2\u201d"
              ],
              "Effect": "Allow"
          },
          {
              "Action": [
                  "s3:PutObject",
                  "s3:GetBucketAcl"
              ],
              "Resource": [
                  "arn:aws:s3:::amzn-s3-demo-destination-bucket\u201d"
              ],
              "Effect": "Allow"
          }
      ]
  }
  ```

------
+ Timestream for LiveAnalytics parses the CSV by mapping information that's provided in the data model to CSV headers. The data must have a column that represents the timestamp, at least one dimension column, and at least one measure column.
+ The S3 buckets used with batch load must be in the same Region and the same account as the Timestream for LiveAnalytics table that is used in batch load.
+ The `timestamp` column must be a long data type that represents the time since the Unix epoch. For example, the timestamp `2021-03-25T08:45:21Z` would be represented as `1616661921`. Timestream supports seconds, milliseconds, microseconds, and nanoseconds for the timestamp precision. When using the query language, you can convert between formats with functions such as `to_unixtime`. For more information, see [Date / time functions](date-time-functions.md).
+ Timestream supports the string data type for dimension values. It supports long, double, string, and boolean data types for measure columns.

For batch load limits and quotas, see [Batch load](ts-limits.md#limits.batch-load).
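As noted above, the `timestamp` column must hold the time since the Unix epoch in the precision you choose. A quick Python sketch of the conversion (the epoch value matches the example above):

```python
from datetime import datetime, timezone

# Parse the example ISO 8601 timestamp and convert it to Unix epoch values.
ts = datetime.strptime("2021-03-25T08:45:21Z", "%Y-%m-%dT%H:%M:%SZ")
ts = ts.replace(tzinfo=timezone.utc)

epoch_seconds = int(ts.timestamp())   # use with SECONDS precision
epoch_millis = epoch_seconds * 1000   # use with MILLISECONDS precision

print(epoch_seconds)  # 1616661921
```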

# Batch load best practices
<a name="batch-load-best-practices"></a>

Batch load delivers the best throughput when you adhere to the following conditions and recommendations:

1. CSV files submitted for ingestion are small, specifically with a file size of 100 MB–1 GB, to improve parallelism and speed of ingestion.

1. Avoid ingesting data into the same table at the same time (for example, by using the WriteRecords API operation or a scheduled query) while a batch load is in progress. This might lead to throttling, and the batch load task might fail.

1. Do not add, modify, or remove files from the S3 bucket used in batch load while the batch load task is running.

1. Do not delete tables, or revoke permissions from tables or from the source or report S3 buckets, that have scheduled or in-progress batch load tasks.

1. When ingesting data with a high cardinality set of dimension values, follow guidance at [Recommendations for partitioning multi-measure records](data-modeling.md#data-modeling-multi-measure-partitioning).

1. Make sure you test the data for correctness by submitting a small file. You will be charged for any data submitted to batch load regardless of correctness. For more information about pricing, see [Amazon Timestream pricing](https://aws.amazon.com/timestream/pricing/).

1. Do not resume a batch load task unless `ActiveMagneticStorePartitions` is below 250; otherwise, the job might be throttled and fail. Avoid submitting multiple jobs at the same time for the same database to help keep this number low.

The following are console best practices:

1. Use the [visual builder](batch-load-using-console.md#batch-load-using-visual-builder) only for simpler data modeling that uses a single measure name for multi-measure records.

1. For more complex data modeling, use JSON. For example, use JSON when you have multiple measure names with multi-measure records. 

For additional Timestream for LiveAnalytics best practices, see [Best practices](best-practices.md).

# Preparing a batch load data file
<a name="batch-load-preparing-data-file"></a>

A source data file contains delimiter-separated values. The term comma-separated values (CSV) is used generically; valid column separators include commas and pipes. Records are separated by new lines. Files must be stored in Amazon S3, and you specify the location of the source data when you create a batch load task. A file contains headers. One column represents the timestamp. At least one other column represents a measure.

The S3 buckets used with batch load must be in the same Region as the Timestream for LiveAnalytics table that is used in batch load. Don't add or remove files from the S3 bucket used in batch load after the batch load task has been submitted. For information about working with S3 buckets, see [Getting started with Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/GetStartedWithS3.html).

**Note**  
CSV files that are generated by some applications such as Excel might contain a byte order mark (BOM) that conflicts with the expected encoding. Timestream for LiveAnalytics batch load tasks that reference a CSV file with a BOM throw an error when they're processed programmatically. To avoid this, you can remove the BOM, which is an invisible character.  
For example, you can save the file from an application such as Notepad++ that lets you specify a new encoding. You can also use a programmatic option that reads the first line, removes the BOM from it, and writes the corrected line back to the file.  
When saving from Excel, there are multiple CSV options. Saving with a different CSV option might prevent the described issue. But you should check the result because a change in encoding can affect some characters.
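A programmatic option along these lines can be sketched in Python (`strip_bom` is a hypothetical helper name used for illustration):

```python
import codecs

def strip_bom(path):
    """Remove a UTF-8 byte order mark (BOM) from a file, if one is present."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        with open(path, "wb") as f:
            f.write(data[len(codecs.BOM_UTF8):])
```

You could run `strip_bom("sample.csv")` before uploading the file to S3; a file without a BOM is left unchanged.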

## CSV format parameters
<a name="batch-load-data-file-options"></a>

You use escape characters when you're representing a value that is otherwise reserved by the format parameters. For example, if the quote character is a double quote, to represent a double quote in the data, place the escape character before the double quote.

For information about when to specify these when creating a batch load task, see [Create a batch load task](batch-load-using-console.md#console_timestream.create-batch-load.using-console).


| Parameter | Options | 
| --- | --- | 
| Column separator | (Comma (',') \| Pipe ('\|') \| Semicolon (';') \| Tab ('\t') \| Blank space (' ')) | 
| Escape character | none | 
| Quote character | Console: (Double quote (") \| Single quote (')) | 
| Null value | Blank space (' ') | 
| Trim white space | Console: (No \| Yes) | 

# Data model mappings for batch load
<a name="batch-load-data-model-mappings"></a>

The following discusses the schema for data model mappings and gives an example.

## Data model mappings schema
<a name="batch-load-data-model-mappings-schema"></a>

The `CreateBatchLoadTask` request syntax and a `BatchLoadTaskDescription` object returned by a call to `DescribeBatchLoadTask` include a `DataModelConfiguration` object that includes the `DataModel` for batch loading. The `DataModel` defines mappings from source data that's stored in CSV format in an S3 location to a target Timestream for LiveAnalytics database and table. 

The `TimeColumn` field indicates the source data's location for the value to be mapped to the destination table's `time` column in Timestream for LiveAnalytics. The `TimeUnit` specifies the unit for the `TimeColumn`, and can be one of `MILLISECONDS`, `SECONDS`, `MICROSECONDS`, or `NANOSECONDS`. There are also mappings for dimensions and measures. Dimension mappings are composed of source columns and target fields. 

For more information, see [DimensionMapping](https://docs.aws.amazon.com/timestream/latest/developerguide/API_DimensionMapping). The mappings for measures have two options, `MixedMeasureMappings` and `MultiMeasureMappings`.

To summarize, a `DataModel` contains mappings from a data source in an S3 location to a target Timestream for LiveAnalytics table for the following.
+ Time
+ Dimensions
+ Measures

If possible, we recommend that you map measure data to multi-measure records in Timestream for LiveAnalytics. For information about the benefits of multi-measure records, see [Multi-measure records](writes.md#writes.writing-data-multi-measure). 

If multiple measures in the source data are stored in one row, you can map those multiple measures to multi-measure records in Timestream for LiveAnalytics using `MultiMeasureMappings`. If there are values that must map to a single-measure record, you can use `MixedMeasureMappings`. 

`MixedMeasureMappings` and `MultiMeasureMappings` both include `MultiMeasureAttributeMappings`. Multi-measure records are supported regardless of whether single-measure records are needed.

If only multi-measure target records are needed in Timestream for LiveAnalytics, you can define measure mappings in the following structure.

```
CreateBatchLoadTask
    MeasureNameColumn
    MultiMeasureMappings
        TargetMultiMeasureName
        MultiMeasureAttributeMappings array
```

**Note**  
We recommend using `MultiMeasureMappings` whenever possible.

If single-measure target records are needed in Timestream for LiveAnalytics, you can define measure mappings in the following structure.

```
CreateBatchLoadTask
    MeasureNameColumn
    MixedMeasureMappings array
        MixedMeasureMapping
            MeasureName
            MeasureValueType
            SourceColumn
            TargetMeasureName
            MultiMeasureAttributeMappings array
```

When you use `MultiMeasureMappings`, the `MultiMeasureAttributeMappings` array is always required. When you use the `MixedMeasureMappings` array, if the `MeasureValueType` is `MULTI` for a given `MixedMeasureMapping`, `MultiMeasureAttributeMappings` is required for that `MixedMeasureMapping`. Otherwise, `MeasureValueType` indicates the measure type for the single-measure record.

Either way, there is an array of `MultiMeasureAttributeMapping` available. You define the mappings to multi-measure records in each `MultiMeasureAttributeMapping` as follows:

`SourceColumn`  
The column in the source data that is located in Amazon S3.

`TargetMultiMeasureAttributeName`  
The name of the target multi-measure name in the destination table. This input is required when `MeasureNameColumn` is not provided. If `MeasureNameColumn` is provided, the value from that column is used as the multi-measure name.

`MeasureValueType`  
One of `DOUBLE`, `BIGINT`, `BOOLEAN`, `VARCHAR`, or `TIMESTAMP`.

## Data model mappings with `MultiMeasureMappings` example
<a name="batch-load-data-model-mappings-example-multi"></a>

This example demonstrates mapping to multi-measure records, which is the preferred approach. Multi-measure records store each measure value in a dedicated column. You can download a sample CSV at [sample CSV](samples/batch-load-sample-file.csv.zip). The sample has the following headings to map to a target column in a Timestream for LiveAnalytics table.
+ `time`
+ `measure_name`
+ `region`
+ `location`
+ `hostname`
+ `memory_utilization`
+ `cpu_utilization`

Identify the `time` and `measure_name` columns in the CSV file. In this case these map directly to the Timestream for LiveAnalytics table columns of the same names.
+ `time` maps to `time`
+ `measure_name` maps to `measure_name` (or your chosen value)

When using the API, you specify `time` in the `TimeColumn` field and a supported time unit value such as `MILLISECONDS` in the `TimeUnit` field. These correspond to **Source column name** and **Timestamp time input** in the console. You can group or partition records using `measure_name`, which is defined with the `MeasureNameColumn` key.

In the sample, `region`, `location`, and `hostname` are dimensions. Dimensions are mapped in an array of `DimensionMapping` objects.

For measures, the value of `TargetMultiMeasureAttributeName` becomes a column in the Timestream for LiveAnalytics table. You can keep the same name, as in this example, or specify a new one. `MeasureValueType` is one of `DOUBLE`, `BIGINT`, `BOOLEAN`, `VARCHAR`, or `TIMESTAMP`. 

```
{
  "TimeColumn": "time",
  "TimeUnit": "MILLISECONDS",
  "DimensionMappings": [
    {
      "SourceColumn": "region",
      "DestinationColumn": "region"
    },
    {
      "SourceColumn": "location",
      "DestinationColumn": "location"
    },
    {
      "SourceColumn": "hostname",
      "DestinationColumn": "hostname"
    }
  ],
  "MeasureNameColumn": "measure_name",
  "MultiMeasureMappings": {
    "MultiMeasureAttributeMappings": [
      {
        "SourceColumn": "memory_utilization",
        "TargetMultiMeasureAttributeName": "memory_utilization",
        "MeasureValueType": "DOUBLE"
      },
      {
        "SourceColumn": "cpu_utilization",
        "TargetMultiMeasureAttributeName": "cpu_utilization",
        "MeasureValueType": "DOUBLE"
      }
    ]
  }
}
```

![\[Visual builder interface showing column mappings for timestream data attributes and types.\]](http://docs.aws.amazon.com/timestream/latest/developerguide/images/column-mapping.jpg)


## Data model mappings with `MixedMeasureMappings` example
<a name="batch-load-data-model-mappings-example-mixed"></a>

We recommend that you only use this approach when you need to map to single-measure records in Timestream for LiveAnalytics.

# Using batch load with the console
<a name="batch-load-using-console"></a>

Following are steps for using batch load with the AWS Management Console. You can download a sample CSV at [sample CSV](samples/batch-load-sample-file.csv.zip).

**Topics**
+ [Access batch load](#console_timestream.access-batch-load.using-console)
+ [Create a batch load task](#console_timestream.create-batch-load.using-console)
+ [Resume a batch load task](#console_timestream.resume-batch-load.using-console)
+ [Using the visual builder](#batch-load-using-visual-builder)

## Access batch load
<a name="console_timestream.access-batch-load.using-console"></a>

Follow these steps to access batch load using the AWS Management Console.

1. Open the [Amazon Timestream console](https://console.aws.amazon.com/timestream).

1. In the navigation pane, choose **Management Tools**, and then choose **Batch load tasks**.

1. From here, you can view the list of batch load tasks and drill into a given task for more details. You can also create and resume tasks.

## Create a batch load task
<a name="console_timestream.create-batch-load.using-console"></a>

Follow these steps to create a batch load task using the AWS Management Console.

1. Open the [Amazon Timestream console](https://console.aws.amazon.com/timestream).

1. In the navigation pane, choose **Management Tools**, and then choose **Batch load tasks**.

1. Choose **Create batch load task**.

1. In **Import destination**, choose the following.
   + **Target database** – Select the name of the database created in [Create a database](console_timestream.md#console_timestream.db.using-console).
   + **Target table** – Select the name of the table created in [Create a table](console_timestream.md#console_timestream.table.using-console).

   If necessary, you can add a table from this panel with the **Create new table** button.

1. From **Data source S3 location** in **Data source**, select the S3 bucket where the source data is stored. Use the **Browse S3** button to view S3 resources that the active AWS account has access to, or enter the S3 location URL. The data source must be located in the same Region as the destination table.

1. In **File format settings** (expandable section), you can use the default settings to parse input data. You can also choose **Advanced settings**. From there you can choose **CSV format parameters**, and select parameters to parse input data. For information about these parameters, see [CSV format parameters](batch-load-preparing-data-file.md#batch-load-data-file-options).

1. From **Configure data model mapping**, configure the data model. For additional data model guidance, see [Data model mappings for batch load](batch-load-data-model-mappings.md).
   + From **Data model mapping**, choose **Mapping configuration input**, and choose one of the following.
     + **Visual builder** – To map data visually, choose **TargetMultiMeasureName** or **MeasureNameColumn**. Then from **Visual builder**, map the columns.

       Visual builder automatically detects and loads the source column headers from the data source file when a single CSV file is selected as the data source. Choose the attribute and data type to create your mapping.

       For information about using the visual builder, see [Using the visual builder](#batch-load-using-visual-builder).
     + **JSON editor** – A freeform JSON editor for configuring your data model. Choose this option if you're familiar with Timestream for LiveAnalytics and want to build advanced data model mappings.
     + **JSON file from S3** – Select a JSON model file you have stored in S3. Choose this option if you've already configured a data model and want to reuse it for additional batch loads.

1. From **Error logs S3 location** in **Error log report**, select the S3 location that will be used to report errors. For information about how to use this report, see [Using batch load error reports](batch-load-using-error-reports.md).

1. For **Encryption key type**, choose one of the following.
   + **Amazon S3-managed key (SSE-S3)** – An encryption key that Amazon S3 creates, manages, and uses for you.
   + **AWS KMS key (SSE-KMS)** – An encryption key protected by AWS Key Management Service (AWS KMS).

1. Choose **Next**.

1. On the **Review and create page**, review the settings and edit as necessary.
**Note**  
You can't change batch load task settings after the task has been created. Task completion times will vary based on the amount of data being imported.

1. Choose **Create batch load task**.

## Resume a batch load task
<a name="console_timestream.resume-batch-load.using-console"></a>

When you select a batch load task with a status of "Progress stopped" that is still resumable, you are prompted to resume the task. There is also a banner with a **Resume task** button when you view the details for those tasks. Resumable tasks have a "resume by" date. After that date passes, tasks cannot be resumed.

## Using the visual builder
<a name="batch-load-using-visual-builder"></a>

You can use the visual builder to map source data columns from one or more CSV files stored in an S3 bucket to destination columns in a Timestream for LiveAnalytics table.

**Note**  
Your role will need the `SelectObjectContent` permission for the file. Without this, you will need to add and delete columns manually.

### Auto load source columns mode
<a name="batch-load-using-visual-builder-auto-load"></a>

Timestream for LiveAnalytics can automatically scan the source CSV file for column names if you specify one bucket only. When there are no existing mappings, you can choose **Import source columns**.

1. With the **Visual builder** option selected from the **Mapping configuration input settings**, set the Timestamp time input. `Milliseconds` is the default setting.

1. Choose the **Load source columns** button to import the column headers found in the source data file. The table will be populated with the source column header names from the data source file.

1. Choose the **Target table column name**, **Timestream attribute type**, and **Data type** for each source column.

   For details about these columns and possible values, see [Mapping fields](#batch-load-using-visual-builder-mapping-fields).

1. Use the drag-to-fill feature to set the value for multiple columns at once.

### Manually add source columns
<a name="batch-load-using-visual-builder-manually-add"></a>

If you're using a bucket or CSV prefix and not a single CSV, you can add and delete column mappings from the visual editor with the **Add column mapping** and **Delete column mapping** buttons. There is also a button to reset mappings.

### Mapping fields
<a name="batch-load-using-visual-builder-mapping-fields"></a>
+ **Source column name** – The name of a column in the source file that represents a measure to import. Timestream for LiveAnalytics can populate this value automatically when you use **Import source columns**.
+ **Target table column name** – Optional input that indicates the column name for the measure in the target table.
+ **Timestream attribute type** – The attribute type of the data in the specified source column such as `DIMENSION`.
  + **TIMESTAMP** – Specifies when a measure was collected.
  + **MULTI** – Multiple measures are represented.
  + **DIMENSION** – Time series metadata.
  + **MEASURE_NAME** – For single-measure records, this is the measure name.
+ **Data type** – The type of Timestream column, such as `BOOLEAN`.
  + **BIGINT** – A 64-bit integer.
  + **BOOLEAN** – The two truth values of logic—true and false.
  + **DOUBLE** – 64-bit variable-precision number.
  + **TIMESTAMP** – An instance in time that uses nanosecond precision time in UTC, and tracks the time since the Unix epoch.

# Using batch load with the AWS CLI
<a name="batch-load-using-cli"></a>

**Setup**

To start using batch load, go through the following steps.

1. Install the AWS CLI using the instructions at [Accessing Amazon Timestream for LiveAnalytics using the AWS CLI](Tools.CLI.md).

1. Run the following command to verify that the Timestream CLI commands have been updated. Verify that `create-batch-load-task` is in the list.

   `aws timestream-write help`

1. Prepare a data source using the instructions at [Preparing a batch load data file](batch-load-preparing-data-file.md).

1. Create a database and table using the instructions at [Accessing Amazon Timestream for LiveAnalytics using the AWS CLI](Tools.CLI.md).

1. Create an S3 bucket for report output. The bucket must be in the same Region. For more information about buckets, see [Creating, configuring, and working with Amazon S3 buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-buckets-s3.html).

1. Create a batch load task. For steps, see [Create a batch load task](#batch-load-using-cli-create-task).

1. Confirm the status of the task. For steps, see [Describe batch load task](#batch-load-using-cli-describe-task).

## Create a batch load task
<a name="batch-load-using-cli-create-task"></a>

You can create a batch load task with the `create-batch-load-task` command. When you create a batch load task using the CLI, you can use a JSON parameter, `cli-input-json`, which lets you aggregate the parameters into a single JSON fragment. You can also break those details apart using several other parameters including `data-model-configuration`, `data-source-configuration`, `report-configuration`, `target-database-name`, and `target-table-name`.

For an example, see [Create batch load task example](#batch-load-using-cli-example).

## Describe batch load task
<a name="batch-load-using-cli-describe-task"></a>

You can retrieve a batch load task description as follows.

```
aws timestream-write describe-batch-load-task --task-id <value>
```

Following is an example response.

```
{
    "BatchLoadTaskDescription": {
        "TaskId": "<TaskId>",
        "DataSourceConfiguration": {
            "DataSourceS3Configuration": {
                "BucketName": "test-batch-load-west-2",
                "ObjectKeyPrefix": "sample.csv"
            },
            "CsvConfiguration": {},
            "DataFormat": "CSV"
        },
        "ProgressReport": {
            "RecordsProcessed": 2,
            "RecordsIngested": 0,
            "FileParseFailures": 0,
            "RecordIngestionFailures": 2,
            "FileFailures": 0,
            "BytesIngested": 119
        },
        "ReportConfiguration": {
            "ReportS3Configuration": {
                "BucketName": "test-batch-load-west-2",
                "ObjectKeyPrefix": "<ObjectKeyPrefix>",
                "EncryptionOption": "SSE_S3"
            }
        },
        "DataModelConfiguration": {
            "DataModel": {
                "TimeColumn": "timestamp",
                "TimeUnit": "SECONDS",
                "DimensionMappings": [
                    {
                        "SourceColumn": "vehicle",
                        "DestinationColumn": "vehicle"
                    },
                    {
                        "SourceColumn": "registration",
                        "DestinationColumn": "license"
                    }
                ],
                "MultiMeasureMappings": {
                    "TargetMultiMeasureName": "test",
                    "MultiMeasureAttributeMappings": [
                        {
                            "SourceColumn": "wgt",
                            "TargetMultiMeasureAttributeName": "weight",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "SourceColumn": "spd",
                            "TargetMultiMeasureAttributeName": "speed",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "SourceColumn": "fuel",
                            "TargetMultiMeasureAttributeName": "fuel",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "SourceColumn": "miles",
                            "TargetMultiMeasureAttributeName": "miles",
                            "MeasureValueType": "DOUBLE"
                        }
                    ]
                }
            }
        },
        "TargetDatabaseName": "BatchLoadExampleDatabase",
        "TargetTableName": "BatchLoadExampleTable",
        "TaskStatus": "FAILED",
        "RecordVersion": 1,
        "CreationTime": 1677167593.266,
        "LastUpdatedTime": 1677167602.38
    }
}
```

## List batch load tasks
<a name="batch-load-using-cli-list-tasks"></a>

You can list batch load tasks as follows.

```
aws timestream-write list-batch-load-tasks
```

An output appears as follows.

```
{
    "BatchLoadTasks": [
        {
            "TaskId": "<TaskId>",
            "TaskStatus": "FAILED",
            "DatabaseName": "BatchLoadExampleDatabase",
            "TableName": "BatchLoadExampleTable",
            "CreationTime": 1677167593.266,
            "LastUpdatedTime": 1677167602.38
        }
    ]
}
```

## Resume batch load task
<a name="batch-load-using-cli-resume-task"></a>

You can resume a batch load task as follows.

```
aws timestream-write resume-batch-load-task --task-id <value>
```

A response can indicate success or contain error information.

## Create batch load task example
<a name="batch-load-using-cli-example"></a>

**Example**  

1. Create a Timestream for LiveAnalytics database named `BatchLoad` and a table named `BatchLoadTest`. Verify and, if necessary, adjust the values for `MemoryStoreRetentionPeriodInHours` and `MagneticStoreRetentionPeriodInDays`.

   ```
   aws timestream-write create-database --database-name BatchLoad

   aws timestream-write create-table --database-name BatchLoad \
   --table-name BatchLoadTest \
   --retention-properties "{\"MemoryStoreRetentionPeriodInHours\": 12, \"MagneticStoreRetentionPeriodInDays\": 100}"
   ```

1. Using the console, create an S3 bucket and copy the `sample.csv` file to that location. You can download a sample CSV at [sample CSV](samples/batch-load-sample-file.csv.zip).

1. Using the console, create an S3 bucket for Timestream for LiveAnalytics to write a report to if the batch load task completes with errors.

1. Create a batch load task. Make sure to replace *INPUT_BUCKET* and *REPORT_BUCKET* with the buckets that you created in the preceding steps.

   ```
   aws timestream-write create-batch-load-task \
   --data-model-configuration "{\
               \"DataModel\": {\
                 \"TimeColumn\": \"timestamp\",\
                 \"TimeUnit\": \"SECONDS\",\
                 \"DimensionMappings\": [\
                   {\
                     \"SourceColumn\": \"vehicle\"\
                   },\
                   {\
                     \"SourceColumn\": \"registration\",\
                     \"DestinationColumn\": \"license\"\
                   }\
                 ],\
                 \"MultiMeasureMappings\": {\
                   \"TargetMultiMeasureName\": \"mva_measure_name\",\
                   \"MultiMeasureAttributeMappings\": [\
                     {\
                       \"SourceColumn\": \"wgt\",\
                       \"TargetMultiMeasureAttributeName\": \"weight\",\
                       \"MeasureValueType\": \"DOUBLE\"\
                     },\
                     {\
                       \"SourceColumn\": \"spd\",\
                       \"TargetMultiMeasureAttributeName\": \"speed\",\
                       \"MeasureValueType\": \"DOUBLE\"\
                     },\
                     {\
                       \"SourceColumn\": \"fuel_consumption\",\
                       \"TargetMultiMeasureAttributeName\": \"fuel\",\
                       \"MeasureValueType\": \"DOUBLE\"\
                     },\
                     {\
                       \"SourceColumn\": \"miles\",\
                       \"MeasureValueType\": \"BIGINT\"\
                     }\
                   ]\
                 }\
               }\
             }" \
   --data-source-configuration "{\
               \"DataSourceS3Configuration\": {\
                 \"BucketName\": \"$INPUT_BUCKET\",\
                 \"ObjectKeyPrefix\": \"$INPUT_OBJECT_KEY_PREFIX\"
               },\
               \"DataFormat\": \"CSV\"\
             }" \
   --report-configuration "{\
               \"ReportS3Configuration\": {\
                 \"BucketName\": \"$REPORT_BUCKET\",\
                 \"EncryptionOption\": \"SSE_S3\"\
               }\
             }" \
   --target-database-name BatchLoad \
   --target-table-name BatchLoadTest
   ```

   The preceding command returns the following output.

   ```
   {
       "TaskId": "TaskId "
   }
   ```

1. Check on the progress of the task. Make sure to replace *\$TASK\_ID* with the task ID that was returned in the preceding step.

   ```
   aws timestream-write describe-batch-load-task --task-id $TASK_ID 
   ```
**Example output**  

```
{
    "BatchLoadTaskDescription": {
        "ProgressReport": {
            "BytesIngested": 1024,
            "RecordsIngested": 2,
            "FileFailures": 0,
            "RecordIngestionFailures": 0,
            "RecordsProcessed": 2,
            "FileParseFailures": 0
        },
        "DataModelConfiguration": {
            "DataModel": {
                "DimensionMappings": [
                    {
                        "SourceColumn": "vehicle",
                        "DestinationColumn": "vehicle"
                    },
                    {
                        "SourceColumn": "registration",
                        "DestinationColumn": "license"
                    }
                ],
                "TimeUnit": "SECONDS",
                "TimeColumn": "timestamp",
                "MultiMeasureMappings": {
                    "MultiMeasureAttributeMappings": [
                        {
                            "TargetMultiMeasureAttributeName": "weight",
                            "SourceColumn": "wgt",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "TargetMultiMeasureAttributeName": "speed",
                            "SourceColumn": "spd",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "TargetMultiMeasureAttributeName": "fuel",
                            "SourceColumn": "fuel_consumption",
                            "MeasureValueType": "DOUBLE"
                        },
                        {
                            "TargetMultiMeasureAttributeName": "miles",
                            "SourceColumn": "miles",
                            "MeasureValueType": "DOUBLE"
                        }
                    ],
                    "TargetMultiMeasureName": "mva_measure_name"
                }
            }
        },
        "TargetDatabaseName": "BatchLoad",
        "CreationTime": 1672960381.735,
        "TaskStatus": "SUCCEEDED",
        "RecordVersion": 1,
        "TaskId": "TaskId ",
        "TargetTableName": "BatchLoadTest",
        "ReportConfiguration": {
            "ReportS3Configuration": {
                "EncryptionOption": "SSE_S3",
                "ObjectKeyPrefix": "ObjectKeyPrefix ",
                "BucketName": "amzn-s3-demo-bucket"
            }
        },
        "DataSourceConfiguration": {
            "DataSourceS3Configuration": {
                "ObjectKeyPrefix": "sample.csv",
                "BucketName": "amzn-s3-demo-source-bucket"
            },
            "DataFormat": "CSV",
            "CsvConfiguration": {}
        },
        "LastUpdatedTime": 1672960387.334
    }
}
```
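
Escaping the data model JSON inline, as in the earlier `create-batch-load-task` command, is error-prone. As an alternative sketch, you can keep the model in a local file and pass it with the AWS CLI's standard `file://` convention; the file name `data-model.json` is an assumption.

```shell
# Write the data model to a file instead of escaping it on the command line.
cat > data-model.json <<'EOF'
{
    "DataModel": {
        "TimeColumn": "timestamp",
        "TimeUnit": "SECONDS",
        "DimensionMappings": [
            {"SourceColumn": "vehicle"},
            {"SourceColumn": "registration", "DestinationColumn": "license"}
        ],
        "MultiMeasureMappings": {
            "TargetMultiMeasureName": "mva_measure_name",
            "MultiMeasureAttributeMappings": [
                {"SourceColumn": "wgt", "TargetMultiMeasureAttributeName": "weight", "MeasureValueType": "DOUBLE"},
                {"SourceColumn": "spd", "TargetMultiMeasureAttributeName": "speed", "MeasureValueType": "DOUBLE"},
                {"SourceColumn": "fuel_consumption", "TargetMultiMeasureAttributeName": "fuel", "MeasureValueType": "DOUBLE"},
                {"SourceColumn": "miles", "MeasureValueType": "BIGINT"}
            ]
        }
    }
}
EOF

# Confirm the file is valid JSON before handing it to the CLI.
python3 -m json.tool data-model.json > /dev/null && echo "valid"
```

You can then pass `--data-model-configuration file://data-model.json` in place of the inline string.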
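
Batch load tasks run asynchronously, so scripts typically poll `describe-batch-load-task` until the status is terminal. The helper below is a sketch: it takes the name of any command that prints the current status, so you can plug in the real CLI call or, as here, a stub (`stub_status` is hypothetical).

```shell
# Poll a status command until it reports a terminal batch load state.
# "$1" is any command that prints the current TaskStatus on stdout.
wait_for_terminal_status() {
    while true; do
        status=$("$1")
        case "$status" in
            SUCCEEDED|FAILED|PROGRESS_STOPPED)
                echo "$status"
                return 0
                ;;
        esac
        sleep 10   # avoid hammering the API between checks
    done
}

# Example with a stub that always reports success. In a real script you would
# use a command such as:
#   aws timestream-write describe-batch-load-task --task-id "$TASK_ID" \
#     --query 'BatchLoadTaskDescription.TaskStatus' --output text
stub_status() { echo "SUCCEEDED"; }
wait_for_terminal_status stub_status
```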

# Using batch load with the AWS SDKs
<a name="batch-load-using-sdk"></a>

For examples of how to create, describe, and list batch load tasks with the AWS SDKs, see [Create batch load task](code-samples.create-batch-load.md), [Describe batch load task](code-samples.describe-batch-load.md), [List batch load tasks](code-samples.list-batch-load-tasks.md), and [Resume batch load task](code-samples.resume-batch-load-task.md).

# Using batch load error reports
<a name="batch-load-using-error-reports"></a>

Batch load tasks have one of the following status values:
+ `CREATED` (**Created**) – Task is created.
+ `IN_PROGRESS` (**In progress**) – Task is in progress.
+ `FAILED` (**Failed**) – Task has completed, but one or more errors were detected.
+ `SUCCEEDED` (**Completed**) – Task has completed with no errors.
+ `PROGRESS_STOPPED` (**Progress stopped**) – Task has stopped but not completed. You can attempt to resume the task.
+ `PENDING_RESUME` (**Pending resume**) – The task is waiting to resume.

When there are errors, an error report is created in the S3 bucket that is specified in the task's report configuration. Errors are categorized as `taskErrors` or `fileErrors` in separate arrays. The following is an example error report.

```
{
    "taskId": "9367BE28418C5EF902676482220B631C",
    "taskErrors": [],
    "fileErrors": [
        {
            "fileName": "example.csv",
            "errors": [
                {
                    "reason": "The record timestamp is outside the time range of the data ingestion window.",
                    "lineRanges": [
                        [
                            2,
                            3
                        ]
                    ]
                }
            ]
        }
    ]
}
```
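
To act on a report, you typically need the failing files and line ranges. The following sketch summarizes a report read from a local file named `error-report.json` (an assumption; in practice you would first download the report from the report bucket, for example with `aws s3 cp`).

```shell
# Save the example error report locally (in practice, download it from the
# report S3 bucket first).
cat > error-report.json <<'EOF'
{
    "taskId": "9367BE28418C5EF902676482220B631C",
    "taskErrors": [],
    "fileErrors": [
        {
            "fileName": "example.csv",
            "errors": [
                {
                    "reason": "The record timestamp is outside the time range of the data ingestion window.",
                    "lineRanges": [[2, 3]]
                }
            ]
        }
    ]
}
EOF

# Print one line per error: file name, affected line range, and reason.
python3 - <<'EOF'
import json

with open("error-report.json") as f:
    report = json.load(f)

for file_error in report["fileErrors"]:
    for error in file_error["errors"]:
        for start, end in error["lineRanges"]:
            print(f'{file_error["fileName"]} lines {start}-{end}: {error["reason"]}')
EOF
```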