Redshift Bulk Loader

Description

The Redshift Bulk Loader transform loads data from Apache Hop to AWS Redshift using the COPY command.

Tip: when loading Parquet files, make sure your target Redshift table uses column types that are compatible with the Parquet data types, e.g. int8 instead of int4.
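For example, a minimal target table using Parquet-compatible column types might look like the sketch below (table and column names are illustrative):

    CREATE TABLE public.sales (
        sale_id   INT8,        -- int8 (BIGINT) matches Parquet 64-bit integers
        amount    FLOAT8,      -- float8 (DOUBLE PRECISION) matches Parquet doubles
        sale_date DATE,
        region    VARCHAR(50)
    );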

Supported engines

Hop Engine: supported
Spark: ?
Flink: ?
Dataflow: ?

General Options

Option
Description

Transform name

Name of the transform.

Connection

Name of the database connection on which the target table resides.

Target schema

The name of the target schema to write data to.

Target table

The name of the target table to write data to.

AWS Authentication

Choose which authentication method to use with the COPY command. Supported options are AWS Credentials and IAM Role (see the COPY sketch after this table).

Use AWS system variables

(AWS Credentials only) Pick up the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values from your operating system's environment variables.

AWS_ACCESS_KEY_ID

(if AWS Credentials is selected and Use AWS system variables is unchecked) Specify a value or variable for your AWS_ACCESS_KEY_ID.

AWS_SECRET_ACCESS_KEY

(if AWS Credentials is selected and Use AWS system variables is unchecked) Specify a value or variable for your AWS_SECRET_ACCESS_KEY.

IAM Role

(if IAM Role is selected) Specify the IAM Role to use, with the syntax arn:aws:iam::<aws-account-id>:role/<role-name>.

Truncate table

Truncate the target table before loading data.

Truncate on first row

Truncate the target table before loading data, but only when a first data row is received. The table is not truncated when the pipeline processes an empty stream (0 rows).

Specify database fields

Check this option to explicitly map the incoming stream fields to the target table's columns (see Database fields below).
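Under the hood, the transform issues a Redshift COPY command using the selected authentication method. The two variants look roughly like the sketch below; bucket, table, and role names are illustrative, and the exact statement Hop generates may differ:

    -- AWS Credentials authentication
    COPY public.sales
    FROM 's3://my-bucket/hop/sales.csv'
    CREDENTIALS 'aws_access_key_id=<AWS_ACCESS_KEY_ID>;aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>'
    CSV;

    -- IAM Role authentication
    COPY public.sales
    FROM 's3://my-bucket/hop/sales.csv'
    IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
    CSV;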

Main Options

Option
Description

Stream to S3 CSV

Write the current pipeline stream to a CSV file in an S3 bucket before performing the COPY load.

Load from existing file

Do not stream the contents of the current pipeline, but perform the COPY load from a pre-existing file in S3. Supported formats are CSV (comma-delimited) and Parquet (see the sketch after this table).

Copy into Redshift from existing file

The path to the file in S3 to COPY the data from.

Database fields

Map the current stream fields to the Redshift table’s columns.
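When loading from a pre-existing file, the COPY statement points directly at that file and declares its format. A sketch for a Parquet file (paths and names are illustrative):

    COPY public.sales
    FROM 's3://my-bucket/exports/sales.parquet'
    IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
    FORMAT AS PARQUET;

For CSV files, replace FORMAT AS PARQUET with CSV.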
