Set Operators in Redshift: UNION, EXCEPT/MINUS and INTERSECT

Like many other cloud data warehouses, Amazon Redshift supports set operators such as UNION, EXCEPT/MINUS and INTERSECT to combine similar data sets from two or more SELECT statements. Here, "similar" means that the data types of the result sets must match; otherwise, you have to explicitly cast the data when using Redshift set operators. Amazon Redshift supports the following types of set operators: UNION [DISTINCT] and UNION ALL, INTERSECT [DISTINCT], and EXCEPT [DISTINCT] or MINUS [DISTINCT]…
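
The behaviour of these operators can be sketched locally. The example below is only an illustration, not Redshift itself: it assumes Python's built-in sqlite3 module as a stand-in engine, and the tables t1 and t2 are hypothetical. SQLite accepts UNION, UNION ALL, INTERSECT and EXCEPT, but not the MINUS spelling.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not Redshift.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER);
    INSERT INTO t1 VALUES (1), (2), (2), (3);
    INSERT INTO t2 VALUES (2), (3), (4);
""")

def run(sql):
    """Run a query and return the first column of every row, sorted."""
    return sorted(row[0] for row in conn.execute(sql))

# UNION removes duplicates; UNION ALL keeps them.
union_rows = run("SELECT id FROM t1 UNION SELECT id FROM t2")
union_all_rows = run("SELECT id FROM t1 UNION ALL SELECT id FROM t2")

# INTERSECT keeps the values present in both result sets.
intersect_rows = run("SELECT id FROM t1 INTERSECT SELECT id FROM t2")

# EXCEPT keeps the values of the first result set that are absent from the second.
except_rows = run("SELECT id FROM t1 EXCEPT SELECT id FROM t2")

print(union_rows, union_all_rows, intersect_rows, except_rows)
```

Both SELECT statements return a single integer column, so no explicit cast is needed in this sketch.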

Working with Redshift Regular Expression Functions

The regular expression functions in Amazon Redshift let you identify precise patterns of characters in a given string. You can use these functions to validate input data, for example, to check whether an input value is an integer. Regular expressions can also help you extract specific strings or characters from the input. In the previous post, we discussed Redshift NVL and NVL2 functions to deal with NULL values. In this article, we will check how to use Redshift RegEx functions in…
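
The integer validation and extraction ideas can be sketched outside the database. The example below assumes Python's re module rather than the Redshift functions, so the pattern syntax may differ from what Redshift accepts; the helper names is_integer and extract_digits are hypothetical.

```python
import re

# Optional sign followed by one or more digits.
INTEGER_PATTERN = re.compile(r"[+-]?[0-9]+")

def is_integer(value: str) -> bool:
    """Return True only if the whole string is an optionally signed integer."""
    return INTEGER_PATTERN.fullmatch(value) is not None

def extract_digits(value: str) -> str:
    """Return all digit characters found in the string, in order."""
    return "".join(re.findall(r"[0-9]+", value))

print(is_integer("123"), is_integer("12a"))
print(extract_digits("ab12cd34"))
```

Using fullmatch rather than search makes the check reject strings that merely contain an integer somewhere inside them.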

Redshift NVL and NVL2 Functions – Syntax and Examples

In my other post, we discussed how to handle NULL values in Redshift using NULL handling functions. In this article, we will check a couple of NULL handling functions, Redshift NVL and NVL2, in detail with syntax, usage and a few examples. Similar to many other relational databases and data warehouse appliances, Redshift supports NVL and NVL2. These functions are mainly used to handle null values in Redshift tables. For example, replace NULL values with any readable…

How to Handle NULL in Redshift? – Functions

A NULL value in a relational database is a special marker in SQL that indicates a data value does not exist in the database table. In other words, it is a placeholder for values that are missing or unknown. Almost all relational databases and big data frameworks support functions to handle null values. In this article, we will check how to handle NULL in Amazon Redshift, along with Redshift NULL handling functions, their usage and some examples. Similar…
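
As a rough illustration of NULL handling in general, the sketch below uses Python's built-in sqlite3 module as a stand-in engine (an assumption; it is not Redshift) and a hypothetical customers table. It relies only on the standard COALESCE function and the IS NULL predicate.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not Redshift.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, phone TEXT);
    INSERT INTO customers VALUES ('Ann', '555-0100'), ('Bob', NULL);
""")

# COALESCE returns its first non-NULL argument, so a missing phone
# number is replaced with a readable placeholder.
rows = conn.execute(
    "SELECT name, COALESCE(phone, 'unknown') FROM customers ORDER BY name"
).fetchall()

# NULL never compares equal to anything, so missing values are found
# with IS NULL rather than with '='.
missing = conn.execute(
    "SELECT name FROM customers WHERE phone IS NULL"
).fetchall()

print(rows)
print(missing)
```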

Working with External Tables in Amazon Redshift

Until recently, Amazon Redshift did not support external tables. Now you can create external tables in Amazon Redshift using Redshift Spectrum. External tables allow you to access flat files in S3 as if they were regular Redshift tables. You can join a Redshift external table with database tables, such as permanent or temporary tables, to get the required information. You can also perform complex transformations involving various tables, including external tables. External tables are usually used to build a data lake where you access the raw data files that are stored…

How to Handle Error in Snowflake Procedures and Functions?

The Snowflake cloud data warehouse supports stored procedures and user defined functions to help migration from other relational databases such as Oracle and Teradata. You can write both stored procedures and user defined functions using JavaScript APIs. JavaScript supports error handling using try/catch blocks. Snowflake also provides built-in functions such as try_cast to handle errors during type conversion. In this article, we will check how to handle errors in Snowflake procedures and functions. The Snowflake cloud database…

How to Replace Spark DataFrame Column Value? – Scala and PySpark

Similar to relational database tables, a DataFrame in Spark is a dataset organized into named columns. A Spark DataFrame consists of columns and rows. When you work with multiple data sources, you may receive data with unwanted values, such as junk characters, in your Spark DataFrames. In this article, we will check how to replace such junk values in a Spark DataFrame column. We will also check methods to replace values in Spark DataFrames. It is a very common requirement to cleanse the source…

How to Find String in Spark DataFrame? – Scala and PySpark

As a data engineer, you get to work on many different datasets and databases. It is a common requirement to enrich the input data by filtering out unwanted data, or to search for a specific string within the data or within a Spark DataFrame if you are working on Apache Spark. For example, you may need to identify an unwanted or junk string within a dataset. In this article, we will check how to find a string in a Spark DataFrame with various methods. We shall see the different methods to find a string in a given data…

Best Methods to Compare Two Tables in SQL

It is a very common requirement to compare two or more tables in SQL. You will get this requirement in your day-to-day tasks. For example, you may need to check whether tables on the same server match or whether there are any discrepancies between the tables' data. The following are some of the common methods that you can use to compare two tables in any SQL database. You can use most of these methods on big data technologies such as Apache Hive and Apache Spark. Compare Two Tables in SQL using MINUS, Compare Two Tables in…
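
A minimal sketch of the MINUS-style comparison follows. It assumes Python's built-in sqlite3 module as a stand-in engine and two hypothetical tables, table_a and table_b; SQLite spells the operator EXCEPT rather than MINUS. Running the difference in both directions shows the rows that exist on only one side.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for any SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (id INTEGER, name TEXT);
    INSERT INTO table_a VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table_b VALUES (1, 'a'), (2, 'x'), (3, 'c');
""")

# Rows present in table_a but missing from table_b.
only_in_a = conn.execute(
    "SELECT id, name FROM table_a EXCEPT SELECT id, name FROM table_b"
).fetchall()

# Rows present in table_b but missing from table_a.
only_in_b = conn.execute(
    "SELECT id, name FROM table_b EXCEPT SELECT id, name FROM table_a"
).fetchall()

# The tables hold the same set of distinct rows only if both differences are empty.
tables_match = not only_in_a and not only_in_b

print(only_in_a, only_in_b, tables_match)
```

Because EXCEPT works on distinct rows, this check does not detect differences in how many times a duplicate row appears.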

How to Handle NULL in Snowflake? Functions

Snowflake is one of the leading cloud databases that you can use to create a cloud data warehouse. It is built for the cloud and supports almost all features that are available in current on-premises or cloud databases. In my other article, we saw why you should learn the Snowflake database. In this article, we will check how to handle NULL values in a Snowflake cloud database. We will also check the null handling functions available in Snowflake. A NULL value in any relational database is a…
