Extracting data from Redshift is nearly as common as loading data in. Sometimes the results of heavy computations done in Redshift are needed by serving systems; other times a large export is needed for analysis in Excel or other tools. The UNLOAD command extracts data from Redshift to S3 in formats such as delimited or fixed-width.
UNLOAD ('select * from my_table')
TO 's3://bucket_name/path/to/my_filename_prefix'
WITH CREDENTIALS 'aws_access_key_id=<my_access_key>;aws_secret_access_key=<my_secret_key>'
MANIFEST GZIP ALLOWOVERWRITE ESCAPE
NULL AS '\\N';
Although it is quite easy to extract data from Redshift to S3 buckets in delimited or fixed-width formats, there is no direct way to export the data in JSON format.
In this blog I explain a workaround for extracting data in JSON format. UNLOAD dumps the output of a SELECT query, so with a few CASE expressions the query can be made to emit one JSON document per row, which UNLOAD then writes to S3.
SELECT '{ ' ||
       CASE WHEN "column1" IS NULL THEN '' ELSE '"column1": "' || "column1" || '"' END ||
       CASE WHEN "column2" IS NULL THEN ''
            ELSE CASE WHEN "column1" IS NULL THEN '' ELSE ', ' END || '"column2": "' || "column2" || '"' END ||
       ' }'
FROM target_table;
By putting this query inside UNLOAD we can dump the data to S3 in JSON format.
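To see what the CASE expressions produce for each row, here is a small Python sketch (illustration only, not Redshift code) that mirrors the same string concatenation and NULL handling for the two hypothetical columns used above.

```python
# Mimics the CASE-based SQL expression: each non-NULL column contributes
# a quoted key/value pair, NULL columns are skipped entirely.
def row_to_json_text(column1, column2):
    """Build the same string the SQL expression produces for one row."""
    parts = []
    if column1 is not None:
        parts.append('"column1": "%s"' % column1)
    if column2 is not None:
        parts.append('"column2": "%s"' % column2)
    return '{ ' + ', '.join(parts) + ' }'

print(row_to_json_text("a", "b"))   # { "column1": "a", "column2": "b" }
print(row_to_json_text(None, "b"))  # { "column2": "b" }
```

Note that string values containing double quotes or backslashes would still need escaping to yield strictly valid JSON; the UNLOAD ESCAPE option helps with this on the Redshift side.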
UNLOAD ('SELECT \'{ \' ||
       CASE WHEN "column1" IS NULL THEN \'\' ELSE \'"column1": "\' || "column1" || \'"\' END ||
       CASE WHEN "column2" IS NULL THEN \'\'
            ELSE CASE WHEN "column1" IS NULL THEN \'\' ELSE \', \' END || \'"column2": "\' || "column2" || \'"\' END ||
       \' }\'
FROM target_table')
TO 's3://bucket_name/path/to/my_filename_prefix'
WITH CREDENTIALS 'aws_access_key_id=<my_access_key>;aws_secret_access_key=<my_secret_key>'
MANIFEST GZIP ALLOWOVERWRITE ESCAPE
NULL AS '\\N';
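Writing the CASE expressions by hand gets tedious for wide tables. As a rough sketch (the helper name and structure are my own, not from this post), the inner SELECT can be generated programmatically from a column list, assuming the first column is never NULL (e.g. a primary key) so the comma placement stays valid:

```python
# Hypothetical helper: builds the CASE-based JSON SELECT for an arbitrary
# column list. Assumes columns[0] is never NULL so commas stay well-placed.
def build_json_select(table, columns):
    first, rest = columns[0], columns[1:]
    # First column: always emitted, no leading comma.
    parts = ['\'"{0}": "\' || "{0}" || \'"\''.format(first)]
    for col in rest:
        # Later columns: skipped entirely when NULL, leading comma otherwise.
        parts.append(
            'CASE WHEN "{0}" IS NULL THEN \'\' '
            'ELSE \', "{0}": "\' || "{0}" || \'"\' END'.format(col)
        )
    return "SELECT '{{ ' || {0} || ' }}' FROM {1}".format(
        " || ".join(parts), table)

print(build_json_select("target_table", ["column1", "column2"]))
```

The generated string can then be pasted (with its single quotes escaped) into the UNLOAD statement shown above.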