How to Find String in Spark DataFrame? – Scala and PySpark

As a data engineer, you work with many different datasets and databases. A common requirement is to clean the input data by filtering out unwanted records, or to search for a specific string within a dataset or, if you are working on Apache Spark, within a Spark DataFrame. For example, you may need to identify unwanted or junk strings within a dataset. In this article, we will check how to find a string in a Spark DataFrame using various methods. We shall see the different methods to find a string in a given data…


Best Methods to Compare Two Tables in SQL

Comparing two or more tables is a very common requirement in SQL, and it comes up in day-to-day tasks. For example, you may need to check whether tables on the same server match or whether there are any discrepancies between their data. The following are some of the common methods that you can use to compare two tables in most SQL databases. You can use most of these methods on big data technologies such as Apache Hive and Apache Spark. Compare Two Tables in SQL using MINUS; Compare Two Tables in…
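As a rough illustration of the MINUS approach named above, the sketch below compares two small tables using Python's built-in `sqlite3` module. It rests on several assumptions that are not from the article: SQLite spells the set-difference operator `EXCEPT` rather than `MINUS`, and the table names `orders_a` and `orders_b`, their columns, and the helper `table_diff` are all hypothetical names chosen for the example.

```python
import sqlite3


def table_diff(conn, left, right):
    """Return rows present in `left` but missing from `right`."""
    # Hypothetical helper: table names are interpolated directly,
    # so only pass trusted identifiers.
    # SQLite uses EXCEPT where some databases use MINUS.
    query = f"SELECT * FROM {left} EXCEPT SELECT * FROM {right}"
    return conn.execute(query).fetchall()


# In-memory database with two illustrative tables (made-up data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_a (id INTEGER, item TEXT);
    CREATE TABLE orders_b (id INTEGER, item TEXT);
    INSERT INTO orders_a VALUES (1, 'pen'), (2, 'ink'), (3, 'pad');
    INSERT INTO orders_b VALUES (1, 'pen'), (2, 'ink'), (4, 'cap');
""")

# Run the difference in both directions: one direction alone
# would miss rows that exist only in the other table.
only_in_a = table_diff(conn, "orders_a", "orders_b")
only_in_b = table_diff(conn, "orders_b", "orders_a")
tables_match = not only_in_a and not only_in_b

print("Only in orders_a:", only_in_a)
print("Only in orders_b:", only_in_b)
print("Tables match:", tables_match)
```

The two tables are treated as matching only when both differences are empty. Note that `EXCEPT` works on distinct rows, so this check does not detect differences in how many times a duplicate row appears.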
