Here, we will explore how to combine multiple datasets into one or two datasets, ensuring they share the same data definitions but contain different calculated fields. The primary reason for merging these datasets is to improve performance. When dealing with numerous datasets containing large amounts of data, Tableau can take a significant amount of time to load and move charts. For instance, we spent three hours just moving three charts to create the dashboards. By consolidating the datasets, we can enhance efficiency and streamline the dashboard creation process.
Following the successful completion of the sepsis project, a significant obstacle emerged during the data consolidation phase. We encountered a challenge in merging datasets contributed by nearly eight different groups. Each group used various methods to create their charts, ranging from simple to complex. Their approach to create complex chart included:
Using unions on the datasets.
Connecting to Excel sheets for Paths variable.
Copying and pasting data points to create the Path .
We used a dataset that was 153.58MB in size, containing 1.05 million rows and 43 columns. To iIllustrate the process we combined ten datasets into two consolidated datasets: one tailored for simple charts and another for complex visualizations. This approach will help us manage the data more efficiently and reduce the load time for creating and editing dashboards.
Here we planned to have two datasets for merging keeping them as base datasets.
One dataset for simple charts- Dataset.csv
Second dataset has a union in it for complex charts– Dataset.csv(7)
In our data source we have 10 dataset, some have simple charts and few have complex charts.
Steps to merge simple charts:
Note: To avoid naming conflict or wrong calculation field after merging we followed naming conventions to distinguish calculated fields for their respective charts. For ex: Group2_sankey_path, radial_path, or dendo_bin, Group1_sankey_bin
Dataset.csv(8) has simple chart so we are going to merge it into Dataset.csv()
Note:Blue tick in the above picture represents the current datasource for that chart after merge this blue ticks go to Dataset.csv once its changed then we have to close this data source.
As discussed above,we are keeping Dataset.csv for simple charts.
Step 1: Right click Dataset(8) and select Replace Data Source.
Step 2: Select the Current : Dataset(8) and Replacement:Dataset and check Replace current worksheet only.Then OK
Blue tick now changes to Dataset.csv to represent the current datasource for that chart
Step 3: After that we need to close the Dataset.csv(8).
After closing Dataset.csv(8) will be removed from datasource .In below picture Dataset(8)is removed.
Repeat the whole process for each dataset used for simple charts.
Steps to merge dataset for complex charts.
Merging to a dataset which have a union in that dataset.
Case 1: Union to the dataset.
Step 1: Rename Calculation Fields/Bins. Follow naming conventions to distinguish calculated fields for their respective charts.
For ex: Group2_sankey_path, radial_path, or dendo_bin, Group1_sankey_bin.
Step 2 : Modify the Path Calculation Field.Open your respective Path calculated field give ur table name = ‘ 'Dataset.csv’.
Use the following logic:
IF [Table Name] = 'Dataset.csv' Then 1 else 360 end
(Adjust 360 value to suit your chart's specific needs (e.g., 270 for radial, 49 for sankey, etc.))
For example:
Step 3: Do the merging operation as given above in the “Steps to merge simple charts”
Step 4: Rectify Red-Colored Fields
Some fields may appear in red,try to resolve any issues with them .For example you have to back track the calculated column which is showing a red color field. Sometimes,Tableau assigns different names for common columns to distinguish the datasets for eg, you will see [Patientid] as [Patientid 1] or [MAP] as [MAP 1] in which case removing 1 will resolve the issue.
Step5: If you cannot resolve the issues/errors, and since all calculated fields that are necessary for your complex chart are copied in the merged Dataset,recreating the chart is an easy solution.
Once the merging is successfull, close the old dataset.
Case 2: Connecting to Excel sheets for Paths variable.
If you have created a special chart using a spreadsheet tool like Microsoft Excel or Google Sheets, then follow these steps.
Step 1:Rename Calculation Fields/Bins
Follow naming convention to distinguish calculated fields for their respective charts
For ex: Group2_sankey_path, radial_path, or dendo_bin, Group1_sankey_bin.
Step 2:Create the calculated field for Path using the following formula:
IIF([Table Name]="Dataset.csv", 1, 360)
Adjust 360 based on your chart's requirements (e.g., 270 for radial, 49 for sankey, etc.).
NOTE: You will get a calculation is not valid error, merging to the Dataset(7) will solve that issue.
Step 3: Create a Bin for the Path Variable.
Step 4: Do the merging operation as given above in the “Steps to merge simple charts”.
Step 5: Rectify Red-Colored Fields.
Some fields may appear in red;try to resolve any issues with them .For example you have to back track the calculated column which is showing a red color field. Sometimes,Tableau assigns different names for common columns to distinguish the datasets for eg, you will see [Patientid] as [Patientid 1] or [MAP] as [MAP 1] in which case removing 1 will resolve the issue.
Step 6: If you cannot resolve the issues/errors, and Since all calculated fields that are necessary for your complex chart are copied in the merged Dataset,recreating the chart is an easy solution.
Once the merging is successfull , close the old dataset.
Case 3: Copying and pasting data points to create the Path.
Follow the same steps as in Case 2. Once the merging is successfull, close the old dataset. Conclusion: After following the outlined steps, we successfully merged our 10 datasets into two distinct datasets: one for simple charts and another for complex charts. This consolidation significantly streamlined the creation of our dashboard and storyboard. Consequently, we completed our sepsis project presentation smoothly, without any disruptions such as loading delays or difficulties transitioning between dashboards. I hope this approach proves equally beneficial for you when merging datasets for your own projects.
Commenti