Tutorial: Loading data into Amazon Keyspaces using cqlsh
This tutorial guides you through the process of migrating data from Apache Cassandra to Amazon Keyspaces using the cqlsh COPY FROM command.
The cqlsh COPY FROM command is useful for quickly uploading small datasets to Amazon Keyspaces for academic or test purposes.
For more information about how to migrate production workloads, see Offline migration process: Apache Cassandra to Amazon Keyspaces.
In this tutorial, you'll complete the following steps:
- Prerequisites – Set up an AWS account with credentials, create a JKS trust store file for the certificate, and configure cqlsh to connect to Amazon Keyspaces.
- Create source CSV and target table – Prepare a CSV file as the source data and create the target keyspace and table in Amazon Keyspaces.
- Prepare the data – Randomize the data in the CSV file and analyze it to determine the average and maximum row sizes.
- Set throughput capacity – Calculate the required write capacity units (WCUs) based on the data size and desired load time, and configure the table's provisioned capacity.
- Configure cqlsh parameters – Determine optimal values for cqlsh COPY FROM parameters like INGESTRATE, NUMPROCESSES, MAXBATCHSIZE, and CHUNKSIZE to distribute the workload evenly.
- Run the cqlsh COPY FROM command – Run the cqlsh COPY FROM command to upload the data from the CSV file to the Amazon Keyspaces table, and monitor the progress.
- Troubleshooting – Resolve common issues like invalid requests, parser errors, capacity errors, and cqlsh errors during the data upload process.
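
The data preparation and throughput steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own tooling. The function names (shuffle_rows, row_size_stats, required_wcus) are hypothetical. The sketch assumes the CSV file has a header row, approximates row size as the sum of the UTF-8 byte lengths of the field values (the row size that Amazon Keyspaces meters may include additional overhead, so treat the result as a lower bound), and assumes one WCU covers one write per second for a row of up to 1 KB.

```python
import csv
import io
import math
import random


def shuffle_rows(csv_text, seed=None):
    """Return CSV text with the header kept first and the data rows shuffled."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [row for row in reader if row]  # drop blank lines
    random.Random(seed).shuffle(rows)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


def row_size_stats(csv_text):
    """Return (average, maximum) row size in bytes, ignoring the header.

    Assumption: size is approximated as the UTF-8 length of the field values.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    sizes = [sum(len(field.encode("utf-8")) for field in row)
             for row in reader if row]
    if not sizes:
        raise ValueError("CSV contains no data rows")
    return sum(sizes) / len(sizes), max(sizes)


def required_wcus(row_count, avg_row_bytes, load_seconds):
    """Estimate the WCUs per second needed to write row_count rows in load_seconds.

    Assumption: each write consumes one WCU per started 1 KB of row data.
    """
    wcus_per_row = max(1, math.ceil(avg_row_bytes / 1024))
    return math.ceil(row_count * wcus_per_row / load_seconds)
```

For example, under these assumptions, loading 1,000,000 rows that average 1,500 bytes in one hour gives required_wcus(1_000_000, 1500, 3600) == 556, because each row consumes 2 WCUs. Shuffling before the upload is intended to spread writes across partitions rather than sending them in key order.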
Topics
- Prerequisites: Steps to complete before you can upload data using cqlsh COPY FROM
- Step 1: Create the source CSV file and a target table for the data upload
- Step 2: Prepare the source data for a successful data upload
- Step 3: Set throughput capacity for the table
- Step 4: Configure cqlsh COPY FROM settings
- Step 5: Run the cqlsh COPY FROM command to upload data from the CSV file to the target table
- Troubleshooting