PREPARING DATA FOR TABLEAU
METHODS TO PREPARE DATA FOR TABLEAU
It is important to format data before it can be analyzed using Tableau; this helps to save time and prevent errors.
Tableau offers the following tools to help prep data for analysis:
• A dataset is typically made up of a collection of tables related by specific fields or columns.
• The Joining method is used to combine the related data in those common fields.
• A Join results in a virtual table that is typically extended horizontally by adding columns.
Shown here is the analysis of data on product sales with two files:
The product ID field serves as the primary key to join the data from the two sets.
TYPES OF JOINS:
JOIN FROM DATA BASE:
Tableau facilitates creating joins in two ways:
1. Single database join: Joining tables from the same database requires only a single connection in the data source.
2. Cross database join:
· Cross-database Joins require setting up a multi-connection data source by creating a new connection to each database.
· Multi-connection data sources are helpful when different internal systems are used.
· Blending is a method of combining related data from multiple sources in a single view in order to analyze it.
· There is always one primary data source, while the rest become secondary data sources.
Splitting data from one field into multiple columns is used often in data preparation.
A string field can be split automatically based on a common separator that Tableau detects (space or underscore).
This split can be used to automatically separate a field’s value into a maximum of ten new fields, depending on the type of data connection.
The custom split can also separate a string field into a maximum of ten new fields based on a separator within the original field.
You can choose to split the values at:
· The first n occurrences of the separator
· The last n occurrences of the separator
· All the occurrences of the separator
· When preparing data for analysis, a list of fields is sometimes more useful than the data preview.
· The Metadata Grid view in Tableau allows you to quickly perform actions, such as rename, hide, and others, on multiple fields with a single command.
Data is often not organized as a typical data set: field names along the columns and members along the rows.
The Pivot function in Tableau allows you to select the columns you want to manipulate and format them into a typical data set ready for analysis.
· Data often also resides in multiple, separate files and may need to be combined into a “master file.”
· Tableau’s “Union” feature helps you assemble data from multiple small files into one large file.
This function automatically “cleans” your data and preps it for analysis.
Examples of items that need to be cleaned prior to analysis:
· Merged cells
· Blank rows or columns
Tableau Prep Builder provides a modern approach to data preparation, making it easier and faster to combine, shape, and clean data for analysis within Tableau. By providing a visual and direct path to prep your data, you can get your hands on quality data in just a few clicks.