Until recently Amazon Redshift was not supporting external tables. But, you can create external tables in Amazon Redshift using Redshift Spectrum. External tables allow you to access S3 flat files as a regular Redshift table. You can join the Redshift external table with a database tables such as permanent or temporary table to get required information. You can also perform a complex transformation involving various tables including external tables. The External tables are usually used to build the data lake where you access the raw data file which is stored in the form of file and perform join with existing tables. You can even export a Redshift table by using external table in AWS Redshift.
External Tables in Amazon Redshift
In an Amazon Redshift, you can use external tables to access flat file from S3 as regular table. You can create a new external table in the specified schema. All external tables in Redshift must be created in an external schema. Once an external table is available, you can query it as if it is regular tables.
The following steps allow you to create external tables in Amazon Redshift:
- Create an Redshift External Schema
- Create External Table in an External Schema
- Access Redshift External Table
Create an Redshift External Schema
Amazon Redshift provides commands to create an external schema. You can use following Redshift command to create external schema within the current database. Following SQL code creates an external schema spectrum_schema.
create external schema spectrum_schema from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::1234567890:role/AzRedshiftRole'
create external database if not exists;
After successful execution of the above command, external schema is visible in the list of the current database schema. Please remember to replace correct IAM role information.
Create External Table in an External Schema
You can create a new external table in the specified external schema using CREATE EXTERNAL TABLE command.
Following SQL code creates an external table ext_users in spectrum_schema external schema.
create external table spectrum_schema_vs.ext_users (
user_id int,
SSN varchar,
first_name varchar,
last_name varchar,
city varchar,
state varchar,
email varchar,
phone varchar,
like_sports boolean,
like_theatre boolean,
like_concerts boolean,
like_jazz boolean,
like_classical boolean,
like_opera boolean,
like_rock boolean,
like_vegas boolean,
like_broadway boolean,
like_musicals boolean)
row format delimited
fields terminated by ','
stored as textfile
location 's3://raw_data/data/';
Upon successful execution of the above command, the external table will be visible in the list of the tables within external schema.
Access Redshift External Table
You can access it as if a Redshift database regular table. You can join this external table with your regular Redshift tables to perform complex calculations.
For example, display the top 10 records from the flat file using external table.
Export Redshift Tables Using External Tables
You can unload the Amazon Redshift tables to S3 bucket by using CREATE EXTERNAL TABE AS command.
For example, following command will export the unique_users table to an external S3 location. This method is different from COPY command or UNLOAD command.
CREATE EXTERNAL TABLE spectrum_schema_vs.unique_users_exp
row format delimited
fields terminated by ','
stored as textfile
location 's3://raw_data/data/unload/'
AS SELECT distinct * FROM public.unique_users;
The redshift table will be unloaded to S3 location.
Related Articles,
- How to Handle NULL in Redshift? – Functions
- What are Amazon Redshift Distribution Styles?
- Best Methods to Compare Two Tables in SQL
- Working with Redshift Regular Expression Functions
- How to Handle NULL in Snowflake? Functions
- How to Handle Error in Snowflake Procedures and Functions?
- Redshift NVL and NVL2 Functions – Syntax and Examples
- Set Operators in Redshift: UNION, EXCEPT/MINUS and INTERSECT
- Redshift WHERE Clause Multiple Columns Support
- How to use Redshift Primary key Constraint? Its Syntax
Hope this helps 🙂