In the world of data visualization, Tableau stands out as a powerful tool for uncovering insights and making data-driven decisions. One of the key techniques you can use in Tableau is correlation analysis. This method helps you understand the relationship between two or more variables, which is crucial for making informed decisions in various fields, including healthcare, finance, and marketing.
What is Correlation?
Correlation is a statistical measure of the extent to which two variables move together. The correlation coefficient, usually denoted r, ranges from -1 to +1.
Types of correlation:
Positive Correlation:
A positive correlation means the linear relationship has a positive slope: the two variables move in the same direction, so when one increases the other tends to increase, and when one decreases the other tends to decrease.
Negative Correlation:
A negative correlation means the linear relationship has a negative slope: the variables move in opposite directions, so one tends to decrease as the other increases.
No Correlation:
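No correlation means the variables show no linear relationship: knowing that one variable is high or low tells you nothing about whether the other is high or low. The variables may still be related in a non-linear way.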
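The three cases above can be illustrated with a small Python sketch. It computes the Pearson correlation coefficient by hand using only the standard library. The function name pearson_r and all of the numbers are made up for illustration; they do not come from the health care data used later in this article.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative, made-up values.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # increases as x increases
falling = [10, 8, 6, 4, 2]    # decreases as x increases
unrelated = [2, 1, 0, 1, 2]   # symmetric dip: no linear trend

r_pos = pearson_r(x, rising)
r_neg = pearson_r(x, falling)
r_none = pearson_r(x, unrelated)

print("positive:", round(r_pos, 3))
print("negative:", round(r_neg, 3))
print("none:", round(r_none, 3))
```

The last series is clearly related to x (it falls and then rises), yet its coefficient is essentially zero, which is a reminder that r only captures linear relationships.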
How is Correlation measured?
The correlation coefficient, denoted r, quantifies the strength and direction of the linear relationship. It is unit-free and ranges from -1 to +1. Statistical significance is indicated by a p-value, so a correlation is typically reported with two numbers: r and p.
The closer r is to zero, the weaker the linear relationship.
Positive r values indicate a positive correlation.
Negative r values indicate a negative correlation.
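What is a p-value?
The p-value is a probability used in hypothesis testing. For a correlation, it is the probability of observing a coefficient at least as far from zero as the one in your sample if the two variables were in fact uncorrelated. A small p-value (0.05 is a common threshold) suggests the observed correlation is unlikely to be due to chance alone.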
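To make the idea of a p-value concrete, here is a minimal Python sketch that estimates one with a permutation test: it shuffles one variable many times and counts how often the shuffled data produce a correlation at least as strong as the observed one. This technique was chosen here only because it needs no statistics library; it is not necessarily how Tableau or any other tool computes p. The function names, the number of permutations, the seed, and the data are all assumptions made for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation estimate of the p-value for Pearson's r."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)          # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement itself, so p is never exactly 0.
    return (hits + 1) / (n_permutations + 1)

# Illustrative, made-up measurements with a strong upward trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.1, 9.0, 9.7]

r = pearson_r(xs, ys)
p = permutation_p_value(xs, ys)
print("r =", round(r, 3), "p =", round(p, 4))
```

With data this strongly related, almost no shuffle matches the observed correlation, so the estimated p is very small.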
How to create a correlation chart in Tableau using a scatter plot
In Tableau, the correlation coefficient can be computed with either of two functions:
CORR(): an aggregate function that returns the Pearson correlation coefficient of two expressions.
WINDOW_CORR(): a table calculation that returns the Pearson correlation coefficient of two expressions within a window.
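The difference between a single correlation over all rows and a correlation "within a window" can be sketched in Python. This is only an illustration of the idea, not Tableau's implementation: the function window_corr, its start and end offsets relative to the current row, the clipping of the window at the edges of the data, and the sample values are all assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson r, or None when it is undefined (too few points or no variation)."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)

def window_corr(xs, ys, start, end):
    """For each row i, correlate rows i+start .. i+end, clipped to the data."""
    n = len(xs)
    results = []
    for i in range(n):
        lo = max(0, i + start)
        hi = min(n - 1, i + end)
        results.append(pearson_r(xs[lo:hi + 1], ys[lo:hi + 1]))
    return results

# Illustrative, made-up values: y rises with x at first, then falls.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 5, 4, 3]

overall = pearson_r(xs, ys)               # one value for all rows
rolling = window_corr(xs, ys, -2, 0)      # current row plus the two before it

print("overall:", round(overall, 3))
print("rolling:", rolling)
```

The overall coefficient hides the change in direction, while the windowed values move from strongly positive to strongly negative as the window slides across the rows.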
As an example, we will use sample health care data to create a correlation chart for two variables, Bilirubin total and Bilirubin direct.
First, use the CORR() function to compute the Pearson correlation coefficient for Bilirubin total and Bilirubin direct. Create a calculated field named Correlation Coefficient with a formula of the form CORR([Bilirubin total], [Bilirubin direct]), using the field names as they appear in your data source.
Next, add this calculated field (Correlation Coefficient) to Label on the Marks card.
Second, build the scatter plot. Drag Bilirubin total and Bilirubin direct to Rows and Columns, and select Circle as the mark type on the Marks card. Drag Bilirubin total and Bilirubin direct to Color and Size, then drag the calculated field (Correlation Coefficient) to Label and Detail.
Finally, add a trend line: open the Analytics tab, select Trend Line under Model, and choose the trend line type you need.
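Notice that, for a linear trend line, the square of the calculated field (Correlation Coefficient) equals the R-squared value reported for the trend line. Equivalently, the absolute value of r is the square root of R-squared, and the sign of r matches the slope of the line.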
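The link between r and R-squared can be checked outside Tableau with a short Python sketch. It fits an ordinary least-squares line, computes R-squared from the residuals, and compares it with the square of Pearson's r. The function names and the data are made up for illustration and are not the Bilirubin values from the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def linear_fit_r_squared(xs, ys):
    """R-squared of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Illustrative, made-up values.
xs = [1, 2, 3, 4, 5, 6]
rising = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]
falling = [6.0, 5.2, 3.9, 3.1, 1.8, 1.2]

r_up = pearson_r(xs, rising)
r2_up = linear_fit_r_squared(xs, rising)
r_down = pearson_r(xs, falling)
r2_down = linear_fit_r_squared(xs, falling)

print("rising:  r^2 =", round(r_up ** 2, 6), " R-squared =", round(r2_up, 6))
print("falling: r^2 =", round(r_down ** 2, 6), " R-squared =", round(r2_down, 6))
```

For the falling series r is negative even though R-squared is positive, which is why the square root of R-squared gives only the magnitude of r.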
Purpose and uses of Correlation charts:
Used to summarize large data sets and to provide input for more advanced analyses.
Used to identify patterns in data and to make predictions and decisions based on them.
Conclusion
Correlation analysis in Tableau is a powerful technique for uncovering relationships between variables. By leveraging Tableau’s built-in functions and visualization capabilities, you can gain deeper insights and make more informed decisions. Whether you’re working in healthcare, finance, or marketing, understanding correlations can help you optimize strategies and achieve better outcomes.