How to Use the Redshift Primary Key Constraint and Its Syntax

Amazon Redshift is one of the earliest cloud data warehouses. It is based on PostgreSQL, so the two share many features, which helps users who want to migrate to Redshift. One such feature is the primary key constraint. Redshift accepts the common table constraints: primary key, foreign key, unique key and not null. You can define a primary key during table creation, that is, in the table DDL, or you can use the ALTER TABLE command to add a primary key constraint to an existing table. In this article, we will check how to create a Redshift primary key constraint and its syntax.

What is Primary Key Constraint?

In general, a primary key is a column, or a set of columns, that uniquely identifies a record in a table. None of the columns that are part of the primary key can contain a null value, and the primary key value must be unique within the entire table. A table can have only one primary key. A primary key value should not change over time.

In relational databases such as Oracle, sequences are usually used to populate primary key columns. A sequence or identity column returns a unique value each time it is referenced.
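
Redshift does not provide sequences, but it does support IDENTITY columns, which can play the same role for a surrogate primary key. The following is a minimal sketch; the table name identity_demo is hypothetical and is not used elsewhere in this article.

```sql
-- Hypothetical table for illustration only.
create table identity_demo (
  id bigint identity(1, 1) NOT NULL PRIMARY KEY  -- seed 1, step 1
 ,col2 date
 ,col3 varchar(60)
);

-- Leave the identity column out of the column list;
-- Redshift generates its value for each inserted row.
insert into identity_demo (col2, col3) values (current_date, 'test1');
insert into identity_demo (col2, col3) values (current_date, 'test2');

select id, col2, col3 from identity_demo order by id;
```

The generated values are unique, but they are not guaranteed to be consecutive, so do not rely on gap-free numbering.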

Does Amazon Redshift Support Primary key Constraint?

Amazon Redshift does allow you to define a primary key constraint on tables. However, the primary key is informational only; Redshift does not enforce it. The query optimizer uses the constraint to generate more efficient query plans.

The primary key, foreign key and unique key constraints are all informational only. They are not enforced when you load data into a table on which these constraints are defined.

Redshift Primary key Constraint Syntax

There are different methods that you can use to add a primary key constraint to a table.

Column level PRIMARY KEY – Add the primary key constraint to a column definition while creating the table.
Table level PRIMARY KEY – Add the primary key constraint as a separate clause after the column definitions while creating the table.
Alter Table to Add PRIMARY KEY – You can also use the ALTER TABLE command to add a primary key constraint after the table is created.

You can specify the primary key constraint when creating a table either at the column level or at the table level:

Column level primary key:

create table PrimaryKey_demo (
  col1 int NOT NULL PRIMARY KEY
 ,col2 date
 ,col3 varchar(60)
) distkey(col1)
compound sortkey(col1, col2);

Table level primary key:

create table PrimaryKey_demo2 (
  col1 int NOT NULL
 ,col2 date
 ,col3 varchar(60)
 ,primary key(col1)
) distkey(col1)
compound sortkey(col1, col2);
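
The table level form is also the one to use when the key spans more than one column, because a column level clause can name only a single column. A minimal sketch, assuming a hypothetical table PrimaryKey_demo3 whose key is the combination of col1 and col2:

```sql
-- Hypothetical table: the primary key is the pair (col1, col2).
create table PrimaryKey_demo3 (
  col1 int NOT NULL
 ,col2 date NOT NULL
 ,col3 varchar(60)
 ,primary key(col1, col2)
) distkey(col1)
compound sortkey(col1, col2);
```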

How Amazon Redshift Uses Primary key?

Primary key, foreign key and unique key constraints are informational only. They should be declared if your ETL process or some other process in your application enforces their integrity. Primary keys and foreign keys are used as planning hints, and the optimizer uses these hints to generate an optimal query execution plan.

For example, the Redshift query planner uses primary and foreign keys in certain statistical computations. It uses them to infer uniqueness and referential relationships that affect subquery decorrelation techniques. By doing this, it can order large numbers of joins and eliminate redundant joins.

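AWS recommends that you do not define key constraints for your tables if you doubt their validity. The planner trusts the declared constraints, so a query can return incorrect results when the data violates them. On the other hand, you should always declare primary and foreign keys and uniqueness constraints when you know that they are valid.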
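
Because Redshift will not reject duplicates for you, the load process has to keep the key valid. One possible pattern, shown here only as a sketch, is to load each batch into a staging table and insert just the keys that the target does not already hold. The staging table name stage_PrimaryKey_demo and the rule "keep the row with the latest col2 per key" are assumptions made for illustration.

```sql
-- Hypothetical staging table with the same columns as PrimaryKey_demo.
create temp table stage_PrimaryKey_demo (like PrimaryKey_demo);

-- ... load the incoming batch into stage_PrimaryKey_demo here ...

-- Insert only keys that are missing from the target,
-- keeping one row per key from the batch.
insert into PrimaryKey_demo (col1, col2, col3)
select col1, col2, col3
from (
  select s.col1, s.col2, s.col3,
         row_number() over (partition by s.col1 order by s.col2 desc) as rn
  from stage_PrimaryKey_demo s
  left join PrimaryKey_demo t on t.col1 = s.col1
  where t.col1 is null      -- anti-join: key not yet in the target
) as batch
where rn = 1;               -- one row per key within the batch
```

If existing keys should be updated instead of skipped, the same staging table can feed a delete followed by an insert inside one transaction.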

Moreover, it is easy to migrate a database that has constraints defined on its tables to Amazon Redshift. With minimal changes, you can reuse the same DDL to create the tables on Redshift.

Amazon Redshift does enforce NOT NULL column constraints.
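
A quick way to see the difference is to try a null in the key column. This sketch assumes the PrimaryKey_demo table from the column level example exists.

```sql
-- col1 is declared NOT NULL, so this insert is expected to fail.
insert into PrimaryKey_demo values (NULL, current_date, 'test_null');

-- col2 and col3 are nullable, so this insert is expected to succeed.
insert into PrimaryKey_demo values (2, NULL, NULL);
```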

Amazon Redshift Does Not Enforce Primary Key

Now, let us test whether the primary key is enforced. Redshift lets you insert duplicate values into a table that has a primary key.

Insert a duplicate value into a table that has a primary key defined

insert into PrimaryKey_demo values (1, current_date, 'test1'); -- First row
insert into PrimaryKey_demo values (1, current_date, 'test1'); -- Duplicate key value

Both statements execute successfully despite the primary key constraint on the table. This shows that Amazon Redshift does not enforce primary key constraints.
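
Since the duplicates are accepted silently, it is worth checking a table before you rely on its declared key. A minimal sketch of such a check against the PrimaryKey_demo table:

```sql
-- List every key value that appears more than once.
select col1, count(*) as row_count
from PrimaryKey_demo
group by col1
having count(*) > 1;
```

If the two inserts above ran, the query is expected to report col1 = 1 with a row_count of 2. An empty result means that no duplicate key values were found.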

Alter Redshift Table to add Primary Key Constraint Syntax

You can add a primary key constraint to an existing Redshift table by using the ALTER TABLE command.

The following example shows the syntax:

ALTER TABLE PrimaryKey_demo1 ADD CONSTRAINT PK_1 PRIMARY KEY (col1);
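
The table PrimaryKey_demo1 is not created anywhere earlier in this article, so the sketch below assumes it is a table without a primary key. It also shows how the named constraint can be dropped again.

```sql
-- Assumption: PrimaryKey_demo1 starts out without a primary key.
create table PrimaryKey_demo1 (
  col1 int NOT NULL
 ,col2 date
 ,col3 varchar(60)
);

-- Add the named primary key constraint.
ALTER TABLE PrimaryKey_demo1 ADD CONSTRAINT PK_1 PRIMARY KEY (col1);

-- Drop it again by name if it is no longer wanted.
ALTER TABLE PrimaryKey_demo1 DROP CONSTRAINT PK_1;
```

Because the constraint is informational only, do not expect adding it to validate the rows that are already in the table. Check for duplicates yourself before declaring the key.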