How to Import a CSV in Redshift
Before you can import a CSV into Redshift, you need a table to load it into. If a table with the right structure already exists, duplicating it can save you the work of writing out the column definitions.
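For example, you could either define the table by hand or copy an existing table's structure with the LIKE clause (the table and column names below are placeholders, not from your schema):

-- Define the target table to match the CSV's columns
CREATE TABLE table_name (
    col1 INTEGER,
    col2 VARCHAR(100),
    col3 DATE,
    col4 DECIMAL(10,2)
);

-- Or duplicate an existing table's structure (without its data)
CREATE TABLE table_name (LIKE existing_table);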
The easiest way to load a CSV into Redshift is to first upload the file to an Amazon S3 bucket; the Redshift documentation explains how to do this. You can then use the COPY command to tell Redshift to pull the file from S3 and load it into your table:
COPY table_name
FROM 's3://<your-bucket-name>/load/file_name.csv'
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
CSV;
Note that you need the CSV keyword at the end; otherwise Redshift assumes the file uses the pipe character (|) as the delimiter.
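If your file uses some other delimiter, you can also state it explicitly with the DELIMITER option instead of relying on the default (the tab-delimited file here is just an illustrative example):

-- Load a tab-delimited file by naming the delimiter explicitly
COPY table_name
FROM 's3://<your-bucket-name>/load/file_name.txt'
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
DELIMITER '\t';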
If your CSV file's columns are arranged differently than the table's, or the file does not contain all of the table's columns, you need to specify a column list:
COPY table_name (col1, col2, col3, col4)
FROM 's3://<your-bucket-name>/load/file_name.csv'
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
CSV;
Finally, if the file has header rows that should be ignored, you can specify the number of lines to skip:
-- Ignore the first line
COPY table_name (col1, col2, col3, col4)
FROM 's3://<your-bucket-name>/load/file_name.csv'
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
CSV
IGNOREHEADER 1;
For a complete list of COPY options, please refer to the Redshift COPY Command Specification.