Scatter plots are an essential tool for visualizing relationships between two continuous variables. They are particularly useful when exploring patterns, trends, and correlations. In this guide, we will walk through the steps of creating a scatter plot in Tableau, explain when and why to use them, and dive into the different types of correlations.
What is a Scatter Plot?
A scatter plot visually displays the relationship between two continuous measures. It uses dots to represent data points, where:
X-axis: Represents one measure (e.g., Sales).
Y-axis: Represents another measure (e.g., Profit).
Dots: Represent individual data points based on these two measures.
Scatter plots help us identify potential correlations between the variables, showing how the two measures move together. A correlation on its own does not establish that one variable causes the other.
When to Use Scatter Plots?
Scatter plots are ideal when:
You want to observe the relationship between two numerical measures (e.g., Sales and Profit).
You need to detect any patterns, outliers, or correlations.
You aim to identify clusters or groupings of data points.
Scatter plots require at least two continuous measures, but you can add more details by categorizing your data (e.g., by Product Category or Sub-category).
Example: Creating a Scatter Plot in Tableau
Objective: Visualize the relationship between Sales and Profit for different product categories and sub-categories.
Steps:
Place "Profit" on Columns:
Drag the Profit measure to the Columns shelf. This places Profit on the horizontal (X) axis.
Place "Sales" on Rows:
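Drag the Sales measure to the Rows shelf. This places Sales on the vertical (Y) axis. Because Tableau aggregates measures by default, the view shows a single mark at this point.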
Add "Category" to Color:
Drag the Category dimension to Color on the Marks card. This will differentiate the dots by product category.
Add "Sub-category" to Detail:
Drag the Sub-category dimension to Detail on the Marks card. This will break down the scatter plot by sub-category, showing each one as an individual point.
Add Trend Lines:
Go to the top menu and select Analysis → Trend Lines → Show Trend Lines. Tableau will add a trend line to the plot, helping to indicate the relationship between the two variables. With Category on Color, Tableau draws a separate trend line for each category by default.
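The steps above can be mirrored outside Tableau to see what the view computes. The following is a minimal Python sketch, not Tableau's own implementation: it assumes a linear trend line is an ordinary least-squares fit, and the rows, the function names aggregate_by_sub_category and linear_trend, and all numbers are hypothetical. It sums Sales and Profit per sub-category (one mark each, as Detail does) and fits a straight line through the marks.

```python
from collections import defaultdict

# Hypothetical order rows: (category, sub_category, sales, profit).
rows = [
    ("Furniture", "Chairs", 300.0, 40.0),
    ("Furniture", "Chairs", 200.0, 20.0),
    ("Furniture", "Tables", 400.0, -30.0),
    ("Technology", "Phones", 500.0, 90.0),
    ("Technology", "Phones", 100.0, 10.0),
    ("Office Supplies", "Paper", 100.0, 40.0),
]


def aggregate_by_sub_category(rows):
    """Sum sales and profit per sub-category: one scatter mark each."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for _category, sub_category, sales, profit in rows:
        totals[sub_category][0] += sales
        totals[sub_category][1] += profit
    return {name: (s, p) for name, (s, p) in totals.items()}


def linear_trend(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


marks = aggregate_by_sub_category(rows)

# The steps put Profit on Columns (x) and Sales on Rows (y).
points = [(profit, sales) for sales, profit in marks.values()]
slope, intercept = linear_trend(points)

for name, (sales, profit) in marks.items():
    print(f"{name}: sales={sales}, profit={profit}")
print(f"trend: sales = {slope:.3f} * profit + {intercept:.3f}")
```

Swapping the two values in each point puts Sales on the x-axis instead, which changes the slope of the fitted line but not the direction of the relationship.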
Understanding Correlations in Scatter Plots
Once the scatter plot is generated, we can observe the correlation between Sales and Profit by analyzing the trend line and the distribution of data points.
Types of Correlations:
Positive Correlation:
Definition: As one variable increases, the other variable also increases.
In the Scatter Plot: The points form an upward slope from left to right. A positive trend line shows that higher Sales are associated with higher Profits.
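Key Indicator: The more tightly the points cluster around an upward-sloping straight line, the stronger the positive correlation. Points that fall exactly on such a line have a correlation coefficient of +1. The steepness of the line does not matter, because it depends on the units and axis scales rather than on the strength of the relationship.
Example: If sub-categories with higher sales also tend to have higher profit, Sales and Profit are positively correlated.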
Negative Correlation:
Definition: As one variable increases, the other variable decreases.
In the Scatter Plot: The points form a downward slope from left to right. A negative trend line shows that higher Sales are associated with lower Profits.
Example: In some cases, you might observe that higher Sales come with lower Profit, indicating a negative relationship.
Zero or No Correlation:
Definition: No clear relationship between the two variables.
In the Scatter Plot: The points will appear scattered with no distinct pattern, and the trend line will be flat or near flat. This lack of pattern indicates that changes in one variable do not linearly predict changes in the other.
Key Indicator: A trend line that is essentially horizontal, sloping neither upward nor downward, suggests zero correlation. If every point falls exactly on a horizontal or vertical line, one of the variables is constant and the correlation coefficient is undefined rather than zero.
Example: If there is no consistent relationship between Sales and Profit, the scatter plot will show points scattered randomly, and the trend line will reflect this lack of correlation.
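The three correlation types can be checked numerically with the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below is a plain Python illustration with made-up numbers; the function name pearson_r and the sample lists are assumptions for this example. It also shows that a steep and a shallow upward line both give a coefficient of +1, so the slope does not measure the strength of the correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant variable has no variance, so r cannot be computed.
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sales values and four made-up profit patterns.
sales = [100.0, 200.0, 300.0, 400.0, 500.0]
steep_rise = [0.5 * s for s in sales]           # exact upward line, steep
shallow_rise = [0.05 * s + 10 for s in sales]   # exact upward line, shallow
falling = [100 - 0.1 * s for s in sales]        # exact downward line
no_pattern = [30.0, 10.0, 20.0, 10.0, 30.0]     # no linear relationship

for label, profit in [
    ("steep rise", steep_rise),
    ("shallow rise", shallow_rise),
    ("falling", falling),
    ("no pattern", no_pattern),
]:
    print(f"{label}: r = {pearson_r(sales, profit):.2f}")
```

Real data rarely sits exactly on a line, so in practice the coefficient falls somewhere between these extremes.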
Interpreting Scatter Plots
Scatter plots allow us to visually interpret data. A positive or negative trend can help guide decision-making, especially when you want to:
Increase profitability: If you see a positive correlation, focusing on high-sales sub-categories might boost profits.
Improve underperforming categories: A negative correlation might indicate areas where high sales are not translating into higher profits, suggesting inefficiencies.
Outliers in scatter plots are also critical for identifying specific sub-categories that don’t follow the general trend. These points may require further investigation.
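One simple way to make "doesn't follow the general trend" concrete is to measure how far each point sits from the trend line. The sketch below is an illustration under stated assumptions, not a Tableau feature: it fits a least-squares line, computes each point's residual (actual minus fitted value), and reports the point with the largest absolute residual. The function name residuals_from_trend, the sub-category totals, and the choice of residual size as the outlier criterion are all hypothetical.

```python
def residuals_from_trend(points):
    """Map each name to y minus the least-squares fitted y.

    points: {name: (x, y)}
    """
    xs = [x for x, _ in points.values()]
    ys = [y for _, y in points.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        name: y - (slope * x + intercept)
        for name, (x, y) in points.items()
    }


# Hypothetical (sales, profit) totals per sub-category.
# Here Sales is x and Profit is y; swap them to match another layout.
sub_categories = {
    "Chairs": (100.0, 10.0),
    "Phones": (200.0, 20.0),
    "Tables": (300.0, -40.0),  # sells well but loses money
    "Paper": (400.0, 40.0),
    "Binders": (500.0, 50.0),
}

residuals = residuals_from_trend(sub_categories)
most_unusual = max(residuals, key=lambda name: abs(residuals[name]))

for name, value in residuals.items():
    print(f"{name}: residual = {value:+.1f}")
print("Furthest from the trend line:", most_unusual)
```

A single extreme point also pulls the fitted line toward itself, so a large residual is a prompt for closer inspection rather than proof that something is wrong.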
Key Considerations for Using Scatter Plots:
Two Continuous Measures: Scatter plots are best suited for visualizing the relationship between two continuous variables, such as Sales and Profit. If you're comparing only dimensions (e.g., product categories), another chart type may be more appropriate.
Categorical Groupings: You can add categorical dimensions, like Category or Sub-category, to Color or Detail to differentiate the points further. This helps in identifying patterns across different groups.
Trend Lines: Adding trend lines enhances the scatter plot by clearly showing whether the relationship is positive, negative, or nonexistent.
Scatter plots are a powerful tool for exploring the relationships between continuous measures. By visualizing how two variables interact and detecting trends or outliers, you can uncover valuable insights into your data. For sales and profitability analysis, a scatter plot in Tableau is a quick and effective way to identify which categories or sub-categories are performing well and which need attention.
With clear, actionable data visualizations, scatter plots enable data-driven decisions that maximize impact across your business.