Tableau Tutorial Tutorial

JOIN

Introduction

In the real-world scenarios, we need to Combine data from multiple tables. To combine data from multiple tables, we need to create Joins between tables. In this article, we will first create a data source connection to an Excel file. Then we will create Joins between all the available tables in the excel file.

Sample Data File

For this article, we need to download the following Excel file. The following excel file contains data for the famous Cartoon Characters in the 1900s. All the data are separated between multiple excel worksheets. In the excel sheet, we have four different worksheets. Following are the different worksheets with available columns
FirstName: This worksheet contains 5 Columns, FirstName, Gender, DOB, ID, and CID. Here DOB means Date of Birth, ID is the unique Identifier for each cartoon character whose name is available in the FirstName Excel worksheet. The CID is the unique Identifier for the Creator’s name available in the Creators worksheet.
LastName: This worksheet contains two columns, LastName, and an ID. Again, the ID column is the unique Identifier for each cartoon character whose last name is available in the LastName Excel worksheet. Also, the ID available in FirstName and the LastName worksheet belongs to the same cartoon character. It means we assigned ID 2 for Mickey Mouse in FirstName the same ID is also assigned in the LastName. 
Creators: This worksheet contains two columns, CreatorID, and Creators. 
Voice By: This worksheet contains two columns, ID and Voice By. ID is the unique Identifier for Each cartoon character available in our List, Voice By contains person name who gave voice to our favorite Cartoon Characters.
For the tutorial, please download the Excel file.

FileDownload
Cartoon Characters Join Sample Excel FileLogin to Download

Once the file is downloaded, next we need to create the data source connection for our Excel File.
Following are the steps to create a Data Source connection for the Excel.

  • Open the start page. 
  • On the Connection Pane, select “Microsoft Excel” option under the “To a File” section.
  • In the open dialog box, select the Excel file for which you need to create a data source connection.

 

If you noticed this time after creating the data source connection, we have four tables on the left sidebar. It is based on four different sheets available in our Sample Excel file.

What we need to do next. Next, we need to start adding the tables one by one. Also, we need to create a proper join between all the tables, so we can get the correct data. 

First, we need to find the correct First Name and the Last Name for our cartoon characters. For this, let’s first add the FirstName table from the sidebar. To add a table, we have two options. Either we can select the table and then drag and drop it to the right. Or we can double click on the table name. We added the FirstName.

Note: For the better understanding we hide the extra columns from the output. 
Next, we need to add the LastName table. When we added the LastName sheet Tableau is smart enough to create the right join condition between the FirstName and the LastName excel worksheet. How it is possible? It's very simple Tableau uses the same column names for creating the Join. The ID column is available in both the tables or worksheet.

 

Note: Ignore ID (LastName). The column name is the same, but it is way Tableau representing the same column name available in two separate tables or worksheet. 
Following image shows the error when we tried the same example, but this time we have renamed the ID column in one of the worksheets.
In such scenarios, we need to need to define the Join condition manually.

We successfully added the FirstName and the LastName table. I hope you can recognize the character's name. If you think your favorite Cartoon Character is not available in the above list, please let us know in the comment section. I would love to know your favorite cartoon character name. Next, we will learn about Joins and Types of Join.

What is JOIN?

A Join is a mechanism to create a relationship between two or more tables based on the matching columns. The number of matching columns may be one or more it depends on the data available in all the tables.
In our above example, the FirstName and the LastName table is joined with ID Column.

Why Join is Required?

 As you know our First Name and the Last Name information is available in two different worksheets or table. Without a valid join condition, we cannot get the correct name for the following cartoon character.

 

Thank You PixaBay.com for the lovely free image.

When we created a valid Join condition, we have the correct names for all the cartoon characters. 

So why Join is required? A join condition is required to retrieve the correct information from one or more database tables. We will learn more about this in our future articles.

Types of Join

If you already know about Database, you already saw the following image.

 

Following are the different types of Join

  1. INNER JOIN
  2. LEFT OUTER JOIN or LEFT JOIN
  3. RIGHT OUTER JOIN or RIGHT JOIN
  4. FULL OUTER JOIN or FULL JOIN

If we assume our FirstName is the Left table, and the LastName is the right table, the following is the meaning of the different Join types

  1. Inner Join: It means returns only matching records in both the table. In our FirstName table, we have 15 records. In the LastName table again we have 15 records. Out of these 15 records we only have 14 matched records.
  2. Left Outer Join: It means returns only matching records from the right tables and all the records from the left table. In our scenario only, matching records from the LastName tables and all the records from the FirstName table. We have 14 matching records and one extra record from the FirstName.
  3. Right Outer Join: It is a just reverse of the Left Outer Join. It means returns only matching records from the left tables and all the records from the right table. In our scenario only, matching records from the FirstName tables and all the records from the LastName table. We have 14 matching records and one extra record of the LastName.
  4. Full Outer Join: It means all the records from both the tables. So, we have 14 matching records between the FirstName and the LastName table. And one not matching record in both FirstName and the LastName table.

So how to implement it in Tableau?

When we added the second table LastName by default Tableau created an inner join between the FirstName and the LastName table. You can confirm it with the image between the two tables. You can also click on the image to check more details. Following is the record for the inner Join.

To change the Join condition, you need to click on the inner join icon and the select the required Join type. Let change it to left outer join. The record for Scrooge Mc Duck is only available in the FirstName table.

Similarly, the following image is for Right Outer Join and the Full Outer Join. Check the number of records and please use the comment section and let us know which cartoon character is missing from the FirstName table. Hint for you it is not Scooby Doo.

Join Condition

Normally we used the “Equals To” operator for the Join condition. We can use the following operators for the Join Condition.
Equals To (=): Value for the joining columns must match. Example ID = ID.
Not Equals To (<>): Value for the joining columns must not match. Example ID <> ID.
Less Than (<): Value for the joining columns from the left table must be smaller than the value in the right table. Example ID < ID.
Less Than Equal To (<=): Value for the joining columns from the left table must be smaller or equal to the value in the right table. Example ID <= ID.
Greater Than (>): Value for the joining columns from the left table must be greater than the value in the right table. Example ID > ID.
Greater Than Equal To (>=): Value for the joining columns from the left table must be greater or equal to the value in the right table. Example ID >= ID.

Cross Join or Cartesian Product

What exactly is this? A cross join or the Cartesian Product is the special join. In the cross join each record from the left product will join with each record in the right table.
I never got any scenario where we need to apply a cross join between two or more tables. Normally, Cross Join is resource hungry. But Tableau manages it efficiently. For the completeness of this article we need to demonstrate How to apply a Cross Join in Tableau.

How to create a Cross Join in Tableau?

To create a Cross Join in the Join condition, add the calculation 1 = 1. It is not mandatory you need to add 1=1. You need to create a condition which is always true. So, you can say 1 <> 2. It will also work.

 

In this article, we only created the Joins between the two tables. For you add the remaining two tables and create the appropriate Joins for the remaining tables.
We are done with the Join in Tableaus. In our next article, we will learn about the Calculated Columns.

 

...

About The Author

Hello, I am Mayank Sanghvi I am primarily a BI developer with experience in MSBI and Cognos BI Suit. Also having interest in various other technologies such as Android, Java, C#, ASP .NET and other. I enjoy learning new technologies and share my experience with others.

0 Comments
Leave A Comment

Please login to post your valuable comments.

add