Let us discuss the what, why, when, and how of a box plot.

A box-and-whisker plot, often called a box plot, shows the distribution of values along an axis. It visualizes five values for the selected column(s) of a dataset, known as the five-number summary.

Let us understand the Five number summary in detail.

The five-number summary consists of the minimum value, the first quartile (Q1), the median (second quartile, Q2), the third quartile (Q3), and the maximum value.

Image by Author

The Median (50th percentile, Q2) is the value separating the higher half of a data sample from the lower half. In other words, it is the “middle” value of a data set.

First quartile (Q1, 25th percentile): The value below which 25% of the data falls. It is roughly the median of the lower half of the dataset.

Third quartile (Q3, 75th percentile): The value below which 75% of the data falls. It is roughly the median of the upper half of the dataset.

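Minimum value: The smallest value in the dataset. When the box plot flags outliers, the lower whisker instead stops at the smallest data point that is no lower than Q1 - 1.5*IQR, where IQR is the interquartile range, Q3 - Q1.

Maximum value: The largest value in the dataset. When the box plot flags outliers, the upper whisker instead stops at the largest data point that is no higher than Q3 + 1.5*IQR.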

Interquartile range (IQR): The difference Q3 - Q1. The middle 50% of values fall within this range, and Tableau draws a box around it. In Tableau, the edges of the box at Q1 and Q3 are called the lower and upper hinges.
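
To make these definitions concrete, here is a minimal Python sketch that computes the five-number summary and the interquartile range. The discount values are hypothetical, and the function name `five_number_summary` is only illustrative. The quartiles are interpolated with the standard library's "inclusive" method, which is an assumption; Tableau and other tools may compute them slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the full population;
    # other tools may interpolate quartiles differently.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical discount percentages, not taken from Sample - Superstore.
discounts = [0, 0, 10, 20, 20, 20, 30, 40, 80]

minimum, q1, median, q3, maximum = five_number_summary(discounts)
iqr = q3 - q1  # width of the box

print(minimum, q1, median, q3, maximum, iqr)
```

For this sample the box would span 10 to 30 with the median line at 20.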

Whiskers are the lines extending from both sides of the box, which is where the name box-and-whisker plot comes from. They typically extend to the most extreme data points that lie within 1.5 times the interquartile range of the box edges. Points beyond the whiskers are considered outliers.
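
The 1.5 * IQR rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Tableau's own implementation: the helper name `whiskers_and_outliers`, the multiplier parameter `k`, and the sample values are all hypothetical.

```python
import statistics

def whiskers_and_outliers(values, k=1.5):
    """Find whisker ends and outliers using the k * IQR rule."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # boundary below the box
    upper_fence = q3 + k * iqr  # boundary above the box
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }

# Hypothetical discount percentages.
result = whiskers_and_outliers([0, 0, 10, 20, 20, 20, 30, 40, 80])
print(result)
```

Here the fences fall at -20 and 60, so the whiskers end at 0 and 40, and 80 is reported as an outlier.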

When to use a box plot

A box plot is particularly useful for finding:

The key values such as the minimum, median, maximum, etc.

Existence of outliers and their values

Skewness and its direction

Whether the data is symmetrical

How tightly the data is grouped
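
Skewness and symmetry can also be estimated numerically from the same quartiles that the box displays. The sketch below uses Bowley's quartile skewness, a measure chosen here for illustration; it is not something Tableau reports. The function name and sample data are hypothetical.

```python
import statistics

def quartile_skewness(values):
    """Bowley's quartile skewness: compares the two halves of the box."""
    q1, median, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        return 0.0  # the box has no width, so there is nothing to compare
    # Positive when the upper half of the box is longer (right skew),
    # negative when the lower half is longer (left skew).
    return ((q3 - median) - (median - q1)) / iqr

right_skewed = quartile_skewness([1, 2, 3, 4, 10, 20, 40])
symmetric = quartile_skewness([1, 2, 3, 4, 5])
print(right_skewed, symmetric)
```

A value near zero means the median sits in the middle of the box. The further the value moves toward 1 or -1, the more lopsided the box is.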

How to make a Box plot

Let us make a Box plot that shows Discount by region and product Category.

1. Connect to the Sample - Superstore data source.

2. Drag Category to Columns and Discount to Rows.

Image by Author

Tableau creates a bar chart by default.

3. Now drag Region to Columns. A bar chart forms, as shown in the image.

Image by Author

4. Next, click Show Me on the toolbar and select the box-and-whisker plot type.

5. Tableau displays a box plot. Notice that there are very few marks in each box plot, and that Region has moved from the Columns shelf to the Marks card.

Image by Author

6. Drag Region back to Columns. The horizontal lines flatten because the data is aggregated in the current view, so each box plot is based on a single mark.

Image by Author

7. Now disaggregate the data by selecting Analysis > Aggregate Measures to clear the check mark.

Image by Author

Instead of a single mark we can see a range of marks.

Image by Author
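
To see why disaggregation matters, the following Python sketch compares the two cases on a handful of hypothetical rows (not real Sample - Superstore data). The function `marks_per_pane` is an illustrative stand-in for what the view does: with aggregation, each Category and Region combination collapses to a single summed value, which leaves nothing for a box plot to summarize.

```python
from collections import defaultdict

# Hypothetical rows: (category, region, discount).
rows = [
    ("Furniture", "East", 0.0),
    ("Furniture", "East", 0.2),
    ("Furniture", "East", 0.3),
    ("Furniture", "West", 0.0),
    ("Furniture", "West", 0.15),
    ("Technology", "East", 0.0),
    ("Technology", "East", 0.2),
    ("Technology", "East", 0.4),
]

def marks_per_pane(rows, aggregate):
    """Group discounts by (category, region); optionally sum each group."""
    panes = defaultdict(list)
    for category, region, discount in rows:
        panes[(category, region)].append(discount)
    if aggregate:
        # Like SUM(Discount): every pane collapses to a single mark.
        return {key: [sum(values)] for key, values in panes.items()}
    return dict(panes)

aggregated = marks_per_pane(rows, aggregate=True)
disaggregated = marks_per_pane(rows, aggregate=False)

for key in disaggregated:
    print(key, len(aggregated[key]), "mark vs", len(disaggregated[key]), "marks")
```

With one mark per pane, the quartiles, median, and whiskers all coincide, which is the flattened line seen in step 6.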

8. Click the Swap button to swap the axes.

The box plot now runs horizontally.

Image by Author

9. Right-click the bottom axis and select Edit Reference Line, then choose a color from the Fill drop-down list.

Image by Author

10. The box plot now appears in the selected color.

Image by Author

By hovering over each box plot, we can see whether there are outliers and how the Discount is distributed for each Category across the Regions.