For creating reports and to perform an analysis of business data we need to combine data between multiple tables. In this article, we will learn How to Create Joins in Data Modules?
This article helps us to understand the types of Join and How to combine data to get the required data for reporting.
In this article, we will use the sample Excel files which are available in this article. The sample Excel file contains information for the Cartoon Character in the 1900s. The information is available in multiple excel files. The data is split into multiple sheets to demonstrate How to create a Join and retrieve the relevant data.
You may ask, Why Multiple different Excel Workbooks? We can create a single Excel file with multiple sheets?
At the time of writing this article, Multi Sheet Excel file is not supported by IBM Cognos Analytics. Because of this limitation, we can’t use an Excel file in which data is split between multiple worksheets.
Download the following Excel Files.
|Creators||Login to Download|
|First Name||Login to Download|
|Last Name||Login to Download|
|Voice By||Login to Download|
If you already downloaded all the files upload all the files one by one. After the upload selects all the column and then click on the OK button for all the files. If everything is fine, you can see all the files on the Welcome page.
Next, we will need to create a new Data Module. We are on the start page so click on the new button and then select the Data Module option. It will ask for the files which we need to use select all the files which we uploaded and then select Done. Next, on the new module page, we need to add all the files from the source.
We are almost done with the initial setup. It’s time to understand about Joins.
A Join is a mechanism to create a relationship between two or more tables based on the matching columns. The number of matching columns may be one or more it depends on the data available in all the tables.
In our above example, the FirstName and the LastName table is joined with ID Column.
As you know our First Name and the Last Name information is available in two different worksheets or table. Without a valid join condition, we cannot get the correct name for the following cartoon character.
Thank You PixaBay.com for the lovely free image.
When we created a valid Join condition, we have the correct names for all the cartoon characters.
So why Join is required? A join condition is required to retrieve the correct information from one or more database tables. Here we are using multiple Excel files. So, you can say one or more Excel files. We will learn more about this in our future articles.
If you already know about Database, you already saw the following image.
Following are the different types of Join
If we assume our FirstName Excel file is the Left table, and the LastName Excel File is the right table, then the following is the meaning of the different Join types
Enough theory, Let’s see how to implement all this in our Data Module. The first thing I want to do is rename the table names. Why? Because we are not able to see the table name properly. I am going to remove the extra information available in the tables from our Data Module. Next, to create a join open the relationship tab. First, create a Join Between FirstName and the LastName. Right click on the FirstName and select create Relationship. For the second table select the LastName. From both the tables select the ID column and click on the “Match selected columns” button. By default, a 1 to many Inner Join is created. Click on the OK button to save the relationship.
To test the output, click on the Try It button and add the ID, FirstName, and the LastName column. We are getting the right output.
Next, we need to change the relationship between the FirstName and the LastName table for this we will right click on the relationship between the FirstName and the LastName table and then select the Edit Relationship. On the editing relationship, you find the existing relation condition. We need to change the Join type from INNER JOIN to LEFT OUTER JOIN. For this, we need to click on the gauge icon available at the leftmost bottom corner. Click on the gauge icon and select the join type from INNER JOIN to LEFT OUTER JOIN. Also, we need to change the Cardinality. Why? The cardinality between FirstName and LastName is to be 1 to 1. For each FirstName, we have a matching LastName. Change the Relationship type and the cardinality. Once done, click on the OK button to save the relationship. Again, click on Try it to test the data. For the left Join, we are getting all the records from the Left table and only matching records from the right table. The cartoon character Scrooge is only available in the FirstName table. In case of missing records, it displays a null or blank value. For Scrooge the Last Name Mc Duck is missing. Refer to the image for more details.
Similarly, if you change the Join type to Right Join, then we will get only the matching records from the left table (FirstName) and all the records from the Right table (LastName). In the last name, we have an extra record with the Last Name Doo. The matching record for this Last Name is not available in the FirstName table. Can you guess the Cartoon Character First Name? Hint it is not Scooby Doo.
Finally, the last type full outer join it means to get the data from both the tables.
What exactly is this? A cross join or the Cartesian Product is the special join. In the cross join each record from the left product will join with each record in the right table.
I never got any scenario where we need to apply a cross join between two or more tables. Normally, Cross Join is resource hungry. But Tableau manages it efficiently. For the completeness of this article, we need to demonstrate How to apply a Cross Join in Tableau.
It is very simple. If you will not create a valid join, then by default Cognos use Cross Join. But in IBM Cognos Analytics trial on the cloud we will not able to execute the report. TO simulate the same, you can create a calculated column named “CrossJoin” with an express 1 in both the tables. Changes its property to an Identifier and finally create a Join on the “CrosJoin” column.
Please login to post your valuable comments.